RDM Pipeline
The Robust Decision Making (RDM) pipeline is the core of OSeMOSYS-RDM, enabling systematic exploration of uncertainty in energy system models.
What is RDM?
Robust Decision Making is a decision support methodology that:
Explores a wide range of plausible futures
Identifies strategies that perform well across many scenarios
Helps decision-makers understand vulnerabilities
Supports adaptive policy design
Pipeline Stages
1. Base Future Generation
The base future (Future 0) establishes the reference scenario.
python scripts/run_base_future.py
What happens:
Reads scenario from src/workflow/0_Scenarios/
Extracts model structure to B1_Model_Structure.xlsx
Solves the optimization problem
Exports results to CSV format
2. Uncertainty Sampling
Latin Hypercube Sampling (LHS) generates parameter combinations.
Why LHS?
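Ensures even coverage of the full range of every uncertain parameter
Typically needs fewer samples than simple random sampling to cover the uncertainty space equally well
Stratifies each dimension so that every interval of a parameter's range is sampled exactly once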
Configuration:
# In Interface_RDM.xlsx, Setup sheet:
Number_of_Runs: 100 # Number of futures to generate
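As a rough illustration of the stratification idea, the sketch below draws a Latin Hypercube sample using only the Python standard library. The function name latin_hypercube and its bounds argument are hypothetical and are not taken from the OSeMOSYS-RDM code base; the sampler the pipeline actually uses may differ.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Return n_samples points, stratifying each dimension into n_samples bins.

    bounds is a list of (low, high) pairs, one per uncertain parameter.
    Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # Draw exactly one point inside each of the n_samples equal-width strata.
        unit_points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(unit_points)
        columns.append([low + u * (high - low) for u in unit_points])
    # Transpose: one row per future, one column per parameter.
    return [list(row) for row in zip(*columns)]


# Five futures for two multipliers, e.g. ranges 0.7-1.5 and 0.5-2.0.
futures = latin_hypercube(5, [(0.7, 1.5), (0.5, 2.0)], seed=42)
for future_id, sample in enumerate(futures, start=1):
    print(future_id, [round(value, 3) for value in sample])
```

Each column contains exactly one value per stratum, which is the property that distinguishes LHS from plain random sampling.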
3. Future Generation and Solving
Each future is created by modifying baseline parameters.
# Pseudocode of the process
for future_id in range(1, N + 1):
    # Generate parameter modifications from LHS sample
    modifications = apply_uncertainties(baseline, lhs_sample[future_id])
    # Create scenario file and keep its path for the next steps
    scenario_file = create_scenario_file(modifications, future_id)
    # Preprocess data file (add sets, compute CRF/PvAnnuity)
    preprocess_data(scenario_file)
    # Solve optimization
    solve(scenario_file, solver)
    # Export results
    export_results(future_id)
Automatic Data Preprocessing
Before each future is solved, the data file is automatically pre-processed by preprocess_data.py. This step:
Parses bracket-format parameters (OutputActivityRatio, InputActivityRatio, EmissionActivityRatio, etc.)
Calculates CapitalRecoveryFactor and PvAnnuity for each technology
Adds preprocessed sets (MODExTECHNOLOGYperFUELout, MODExTECHNOLOGYperFUELin, MODEperTECHNOLOGY, etc.)
This is required for OSeMOSYS model formulation v5.4 and reduces matrix generation time.
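To make the two computed parameters concrete, here is a minimal sketch of how they can be calculated, assuming the definitions commonly used in recent OSeMOSYS formulations and a strictly positive discount rate. The function names are hypothetical, and preprocess_data.py may organise the calculation differently (for example with technology-specific discount rates).

```python
def capital_recovery_factor(discount_rate, operational_life):
    """CRF = (1 - (1 + r)^-1) / (1 - (1 + r)^-life); assumed definition."""
    growth = 1.0 + discount_rate
    return (1.0 - growth ** -1) / (1.0 - growth ** -operational_life)


def pv_annuity(discount_rate, operational_life):
    """PvAnnuity = (1 - (1 + r)^-life) * (1 + r) / r; assumed definition."""
    growth = 1.0 + discount_rate
    return (1.0 - growth ** -operational_life) * growth / discount_rate


# Example: 5 % discount rate and a 25-year operational life.
crf = capital_recovery_factor(0.05, 25)
pva = pv_annuity(0.05, 25)
print(round(crf, 6), round(pva, 6))
```

With the same discount rate in both functions, the two quantities are reciprocals of each other, which is a convenient sanity check on a preprocessed data file.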
EV UDC Sign Correction
When using User-Defined Constraints (UDC) to model EV penetration caps, LHS perturbations can flip coefficient signs, causing infeasibility. The workflow includes automatic post-perturbation sign correction:
Scans UDCMultiplierTotalCapacity, UDCMultiplierNewCapacity, and UDCMultiplierActivity
Ensures conventional technology coefficients (e.g., diesel/gasoline) remain negative
Ensures electric technology coefficients remain positive
Clamps flipped values to a small epsilon (0.001)
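The correction logic listed above can be sketched as follows. This is an illustrative approximation, not the workflow's actual code: the function name, the dictionary layout, and the technology names are hypothetical, and it assumes that a flipped conventional coefficient is clamped to -0.001 and a flipped electric one to +0.001.

```python
EPSILON = 0.001  # clamp magnitude mentioned in the list above


def correct_udc_signs(coefficients, conventional_substrings, electric_substring):
    """Return a copy of coefficients with flipped signs clamped to +/-EPSILON.

    coefficients maps technology name -> UDC multiplier for one constraint.
    Hypothetical helper; matching is done by substring, as in the Setup sheet.
    """
    # Mirror the documented behaviour: skip the correction if a field is empty.
    if not conventional_substrings or not electric_substring:
        return dict(coefficients)

    corrected = {}
    for technology, value in coefficients.items():
        is_conventional = any(s in technology for s in conventional_substrings)
        is_electric = electric_substring in technology
        if is_conventional and value > 0:
            corrected[technology] = -EPSILON  # must stay negative
        elif is_electric and value < 0:
            corrected[technology] = EPSILON  # must stay positive
        else:
            corrected[technology] = value
    return corrected


# Hypothetical technology names and perturbed coefficients.
perturbed = {"TRNCARDSL": 0.2, "TRNCARGSL": -0.9, "TRNCARELE": -0.1}
print(correct_udc_signs(perturbed, ["DSL", "GSL"], "ELE"))
```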
This is configured via three fields in Interface_RDM.xlsx → Setup sheet:
| Field | Description | Example |
|---|---|---|
| | Semicolon-separated substrings for conventional technologies | |
| | Substring for electric technologies | |
| | Semicolon-separated EV penetration UDC names | |
If any of these fields is empty, the correction is skipped. Correction logs are saved to Experimental_Platform/Logs/UDC_Corrections/.
4. Result Aggregation
Results are consolidated into unified datasets across two stages:
rdm_experiment stage: generates OSEMOSYS_{Region}_Energy_Input.csv immediately after all futures are solved
postprocess stage: generates OSEMOSYS_{Region}_Energy_Output.csv
src/Results/
├── OSEMOSYS_{Region}_Energy_Input.csv # Generated in rdm_experiment
├── OSEMOSYS_{Region}_Energy_Output.csv # Generated in postprocess
└── *.parquet # Efficient intermediate storage
Configuring Uncertainties
Uncertainty Table Structure
Define uncertainties in Interface_RDM.xlsx → Uncertainty_Table sheet:
| Column | Description | Example |
|---|---|---|
| X_Num | Unique ID | 1, 2, 3… |
| X_Category | Grouping category | “Fuel Costs” |
| Min_Value | Lower bound | 0.8 |
| Max_Value | Upper bound | 1.2 |
| X_Mathematical_Type | Variation method | “Time_Series” |
Mathematical Types
Time_Series
Interpolates from current value to a modified final value:
Current trajectory: 2025: 100 → 2050: 200
With multiplier 1.2: 2025: 100 → 2050: 240
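One way to picture this transformation is a multiplier that ramps from 1.0 at the uncertainty start year to the sampled value at the final year. The sketch below assumes a linear ramp for the intermediate years; that detail, and the function name, are assumptions for illustration and may not match how the pipeline interpolates.

```python
def apply_final_value_multiplier(values, initial_year, multiplier):
    """Scale a yearly trajectory so the final year is multiplied by `multiplier`.

    values maps year -> original value. Years up to initial_year are unchanged;
    later years get a factor that grows linearly from 1.0 to `multiplier`.
    Hypothetical helper, assuming a linear ramp of the multiplier.
    """
    final_year = max(values)
    span = final_year - initial_year
    modified = {}
    for year in sorted(values):
        if year <= initial_year:
            modified[year] = values[year]
        else:
            factor = 1.0 + (multiplier - 1.0) * (year - initial_year) / span
            modified[year] = values[year] * factor
    return modified


trajectory = {2025: 100.0, 2030: 120.0, 2050: 200.0}
print(apply_final_value_multiplier(trajectory, initial_year=2025, multiplier=1.2))
```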
Constant
Maintains constant value from the uncertainty start year:
Original: 2025: 100 → 2030: 100 → 2050: 100
With initial year 2030: 2025: 100 → 2030: 100 → 2050: 100
Linear
Linear interpolation to final value:
Original: 2025: 100 → 2050: 200
Modified: 2025: 100 → 2050: 240 (linear path)
Logistic
S-curve (sigmoid) trajectory for technology adoption:
Slow adoption at start, accelerating in middle, saturating at end
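A minimal sketch of such an S-curve is shown below. The steepness parameter, the midpoint at the centre of the horizon, and the function name are assumptions chosen for illustration; the curve the pipeline actually fits may be parameterised differently.

```python
import math


def logistic_path(start_value, final_value, years, steepness=10.0):
    """Return year -> value along a sigmoid from start_value to final_value.

    years must be sorted and contain at least two distinct years.
    The raw sigmoid is rescaled so both endpoints are hit exactly.
    """
    first, last = years[0], years[-1]

    def raw(t):
        # Standard logistic function centred on the middle of the horizon.
        return 1.0 / (1.0 + math.exp(-steepness * (t - 0.5)))

    low, high = raw(0.0), raw(1.0)
    path = {}
    for year in years:
        t = (year - first) / (last - first)      # position in [0, 1]
        share = (raw(t) - low) / (high - low)    # rescaled to [0, 1]
        path[year] = start_value + share * (final_value - start_value)
    return path


for year, value in logistic_path(100.0, 240.0, list(range(2025, 2051, 5))).items():
    print(year, round(value, 1))
```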
Timeslices_Curve
Switches between predefined demand curves:
Selects a different curve from shape_of_demand.xlsx based on uncertainty sampling
Important
For Timeslices_Curve type:
Curves must be predefined in the file shape_of_demand.xlsx
The parameter Explored_Parameter_of_X must be set to Change_Curve
This allows exploring different demand profile shapes across futures
Example: Fuel Cost Uncertainty
X_Num: 1
X_Category: "Fuel Costs"
X_Plain_English_Description: "Natural gas price uncertainty"
X_Mathematical_Type: "Time_Series"
Explored_Parameter_of_X: "Final_Value"
Min_Value: 0.7
Max_Value: 1.5
Involved_Scenarios: "Scenario1"
Involved_First_Sets_in_Osemosys: "NATGAS"
Exact_Parameters_Involved_in_Osemosys: "VariableCost"
Initial_Year_of_Uncertainty: 2025
Example: Technology Capacity Uncertainty
X_Num: 2
X_Category: "Technology Limits"
X_Plain_English_Description: "Solar PV maximum capacity"
X_Mathematical_Type: "Time_Series"
Explored_Parameter_of_X: "Final_Value"
Min_Value: 0.5
Max_Value: 2.0
Involved_Scenarios: "Scenario1"
Involved_First_Sets_in_Osemosys: "PWRSOL001 ; PWRSOL002"
Exact_Parameters_Involved_in_Osemosys: "TotalAnnualMaxCapacity"
Initial_Year_of_Uncertainty: 2025
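Before launching a large experiment it can help to sanity-check each row of the uncertainty table. The sketch below is a hypothetical helper, not part of the pipeline: it assumes a row has already been loaded into a dictionary keyed by the column names used above, and it only checks the mathematical type, the bounds, and the semicolon-separated set list.

```python
# Mathematical types described in this section.
KNOWN_TYPES = {"Time_Series", "Constant", "Linear", "Logistic", "Timeslices_Curve"}


def split_sets(field):
    """Split a semicolon-separated field such as 'PWRSOL001 ; PWRSOL002'."""
    return [item.strip() for item in field.split(";") if item.strip()]


def check_uncertainty(row):
    """Return a list of human-readable problems found in one table row."""
    problems = []
    if row["X_Mathematical_Type"] not in KNOWN_TYPES:
        problems.append(f"unknown type: {row['X_Mathematical_Type']}")
    if row["Min_Value"] > row["Max_Value"]:
        problems.append("Min_Value is greater than Max_Value")
    if not split_sets(row["Involved_First_Sets_in_Osemosys"]):
        problems.append("no sets listed in Involved_First_Sets_in_Osemosys")
    return problems


row = {
    "X_Num": 2,
    "X_Mathematical_Type": "Time_Series",
    "Min_Value": 0.5,
    "Max_Value": 2.0,
    "Involved_First_Sets_in_Osemosys": "PWRSOL001 ; PWRSOL002",
}
print(split_sets(row["Involved_First_Sets_in_Osemosys"]), check_uncertainty(row))
```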
Running the RDM Pipeline
Quick Start
python run.py rdm
Monitoring Progress
The pipeline provides progress updates:
======================================================================
🔬 RDM Pipeline (Robust Decision Making)
======================================================================
Stages: base_future → rdm_experiment → postprocess
======================================================================
🔄 Executing RDM Pipeline...
----------------------------------------------------------------------
Step 1 finished
Step 2 finished
...
# This is future: 1 and scenario Scenario1
# This is future: 2 and scenario Scenario1
...
----------------------------------------------------------------------
✅ RDM Pipeline completed in 15m 32s!
Parallel Execution
RDM experiments are parallelized for efficiency. Each future launches its own system process (solver), so the number of futures you can run simultaneously depends on your CPU threads, RAM, and the solver you are using.
Configuration
# In Interface_RDM.xlsx, Setup sheet:
Parallel_Use: 10 # Futures processed simultaneously
Threads_CPLEX_Gurobi: 4 # Threads per solve (CPLEX/Gurobi only)
Time_CBC: 3600 # Max solve time in seconds (CBC only)
Calculating Parallel_Use Based on Your Machine
The key resource is the number of logical CPU threads available. You can check this with:
Windows: Task Manager → Performance → CPU → “Logical processors”
Linux: nproc or lscpu
macOS: sysctl -n hw.logicalcpu
You must always reserve threads for the operating system and background processes (typically 2–4 threads). The formula depends on whether your solver uses single or multiple threads:
Single-thread solvers (CBC, GLPK)
These solvers use exactly 1 thread per future:
Parallel_Use = Total_CPU_Threads - Reserved_Threads
Example: A machine with 16 threads, reserving 4:
Parallel_Use = 16 - 4 = 12
Multi-thread solvers (CPLEX, Gurobi)
These solvers use multiple threads per future (configured via Threads_CPLEX_Gurobi):
Parallel_Use = floor((Total_CPU_Threads - Reserved_Threads) / Threads_CPLEX_Gurobi)
Example: A machine with 16 threads, reserving 4, with Threads_CPLEX_Gurobi = 4:
Parallel_Use = (16 - 4) / 4 = 3
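Both formulas can be wrapped in one small helper, sketched below. The function name is hypothetical, and the choice to round down and to never return less than 1 is an assumption made so that the result is always a usable setting.

```python
import os


def max_parallel_use(total_threads, reserved_threads, threads_per_solve=1):
    """Largest Parallel_Use that does not over-subscribe the CPU.

    threads_per_solve is 1 for CBC/GLPK and Threads_CPLEX_Gurobi otherwise.
    Integer division rounds down; the result is never lower than 1.
    """
    usable = total_threads - reserved_threads
    return max(1, usable // threads_per_solve)


# Detect logical threads on this machine (os.cpu_count() may return None).
detected = os.cpu_count() or 1
print("CBC/GLPK:", max_parallel_use(detected, reserved_threads=2))
print("CPLEX/Gurobi, 4 threads per solve:", max_parallel_use(detected, 2, 4))
```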
Summary Table
| Machine Threads | Reserved | Solver | Threads per Solve | Max Parallel_Use |
|---|---|---|---|---|
| 8 | 2 | CBC/GLPK | 1 | 6 |
| 8 | 2 | CPLEX/Gurobi | 2 | 3 |
| 16 | 4 | CBC/GLPK | 1 | 12 |
| 16 | 4 | CPLEX/Gurobi | 4 | 3 |
| 32 | 4 | CBC/GLPK | 1 | 28 |
| 32 | 4 | CPLEX/Gurobi | 4 | 7 |
| 64 | 4 | CBC/GLPK | 1 | 60 |
| 64 | 4 | CPLEX/Gurobi | 8 | 7 |
Warning
Do not exceed your machine’s capacity. If Parallel_Use × Threads_per_Solve exceeds the available CPU threads, the operating system will over-subscribe the CPU. This causes heavy context switching, degrades solver performance significantly (each individual solve takes much longer), and can make the machine unresponsive — potentially requiring a forced restart. Always leave threads free for the OS and other processes.
Memory Considerations
Besides CPU threads, each parallel future also consumes RAM. A rough guide:
| Parallel_Use | RAM Needed | Speed |
|---|---|---|
| 1 | ~4 GB | Slowest |
| 5 | ~8 GB | Moderate |
| 10 | ~16 GB | Fast |
| 20 | ~32 GB | Fastest |
Note
RAM usage depends heavily on model size (number of technologies, time slices, years). For large models, monitor memory usage during the first batch and adjust Parallel_Use accordingly.
Output Structure
Per-Future Outputs
Each future generates:
Experimental_Platform/Futures/Scenario1/Scenario1_1/
├── Scenario1_1.txt # Modified scenario file
├── Scenario1_1_Input.parquet # Input parameters
├── Scenario1_1_Output.parquet # Solution outputs
└── Scenario1_1_Output.sol # Raw solver output
Aggregated Outputs
After pipeline completion:
src/Results/
├── OSEMOSYS_{Region}_Energy_Input.csv (from rdm_experiment stage)
│ └── Columns: Strategy, Future.ID, YEAR, Parameter, Value, ...
├── OSEMOSYS_{Region}_Energy_Output.csv (from postprocess stage)
│ └── Columns: Strategy, Future.ID, YEAR, TECHNOLOGY, Value, ...
└── *.parquet (efficient intermediate storage)
Solution Status Report
After all futures are solved, the pipeline automatically generates a solution status report at Results/solution_status.txt. This file summarizes whether each future reached an optimal solution, was infeasible, or ended with another status.
Report Format
Solution Status Summary
========================================
Total futures: 101
Optimal: 98
Infeasible: 3
----------------------------------------
Scenario1_0: optimal
Scenario1_1: optimal
Scenario1_2: infeasible
...
The report covers both the base future (Scenario*_0) and all RDM futures. Supported solver output formats are CPLEX (XML), Gurobi, CBC, and GLPK.
What Happens Internally
After all futures finish solving, check_sol_status.py scans the .sol files in each future’s directory.
It extracts the solution status from solver-specific output formats.
It writes the summary to Results/solution_status.txt.
It deletes the .sol and .lp files to free disk space (these files can be very large for big models).
Note
If a future is infeasible, its output Parquet files will not contain valid results. Check the solution status report to identify and exclude infeasible futures from downstream analysis.
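To build the exclusion list, the report can be parsed with a few lines of Python. The sketch below assumes the file follows the layout shown under Report Format (summary counts, a dashed separator, then one "name: status" line per future); the function names are hypothetical.

```python
def parse_solution_status(report_text):
    """Return {future_name: status} from the per-future part of the report."""
    statuses = {}
    in_details = False
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("---"):
            # Per-future lines start after the dashed separator.
            in_details = True
            continue
        if not in_details or ":" not in line:
            continue
        name, status = line.split(":", 1)
        statuses[name.strip()] = status.strip()
    return statuses


def non_optimal_futures(statuses):
    """Names of futures whose status is anything other than 'optimal'."""
    return sorted(name for name, status in statuses.items() if status != "optimal")


sample_report = """Solution Status Summary
========================================
Total futures: 3
Optimal: 2
Infeasible: 1
----------------------------------------
Scenario1_0: optimal
Scenario1_1: optimal
Scenario1_2: infeasible
"""
print(non_optimal_futures(parse_solution_status(sample_report)))
```

The resulting names can then be used to drop the corresponding rows from the aggregated CSV files before analysis.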
Best Practices
1. Start Small
# Begin with 10-20 futures to test configuration
Number_of_Runs: 20
2. Validate Base Future
Before running many futures:
Check base future results manually
Verify model solves correctly
Confirm outputs make sense
3. Use Appropriate Ranges
# Too narrow: misses important outcomes
Min_Value: 0.99, Max_Value: 1.01 # ❌
# Too wide: includes implausible futures
Min_Value: 0.1, Max_Value: 10.0 # ❌
# Reasonable range
Min_Value: 0.7, Max_Value: 1.3 # ✅
4. Document Your Choices
Keep notes on:
Why specific ranges were chosen
Data sources for uncertainty bounds
Assumptions made