Data Tables Settings¶
PowerGenome loads data from tables (CSV or Parquet files, or tables in a SQLite or DuckDB database) configured in the settings file. Table configurations support row filtering, column selection, and scenario-based data loading through the DataManager.
Table Configuration Patterns¶
Simple Configuration¶
Point to a filename in data_location:
generation_table: generators.csv
fuel_prices_table: fuel_prices.parquet
demand_table: hourly_demand.csv
Files are loaded from the folder specified in data_location.
Advanced Configuration¶
Use dictionary format for filtering and column selection:
generation_table:
  table_name: generators.parquet
  scenario: high_retirements
  filters:
    - - [operating_year, '<=', 2030]
      - [retirement_year, '>', 2030]
  columns: [plant_id, technology, capacity_mw, heat_rate_mmbtu_mwh, region]
Parameters:
- table_name: Filename or database table
- scenario: Quick filter for scenario column
- filters: DNF (Disjunctive Normal Form) filter logic
- columns: Limit columns loaded (for large files)
Filter Syntax¶
DNF (Disjunctive Normal Form)¶
Filters use nested lists representing (A AND B) OR (C AND D) logic:
filters:
  - - [column1, operator, value1]
    - [column2, operator, value2]  # AND with above
  - - [column3, operator, value3]  # OR with first group
Operators:
- = or ==: Equality
- !=: Not equal
- >: Greater than
- >=: Greater than or equal
- <: Less than
- <=: Less than or equal
- in: Value in list
- not in: Value not in list
Filter Examples¶
Simple filter (single condition):
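As an illustration (the table, column, and value names are placeholders, not required names), a single condition is one inner list inside one group:

```yaml
generation_table:
  table_name: generators.parquet
  filters:
    - - [region, '=', 'CA_N']  # one group containing one condition
```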
AND filter (both conditions must be true):
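A sketch with illustrative values: conditions listed inside the same group must all hold.

```yaml
generation_table:
  table_name: generators.parquet
  filters:
    - - [region, '=', 'CA_N']
      - [capacity_mw, '>', 100]  # same group as the line above, so AND
```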
OR filter (either condition can be true):
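A sketch with illustrative values: each new top-level group is an alternative.

```yaml
generation_table:
  table_name: generators.parquet
  filters:
    - - [region, '=', 'CA_N']
    - - [region, '=', 'CA_S']  # separate group, so OR with the first
```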
Complex filter ((A AND B) OR C):
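A sketch combining both forms. The technology value is a made-up placeholder; use whatever values appear in your own data.

```yaml
generation_table:
  table_name: generators.parquet
  filters:
    - - [technology, '=', 'solar']   # A (placeholder value)
      - [capacity_mw, '>', 100]      # AND B
    - - [region, '=', 'CA_N']        # OR C
```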
List membership:
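A sketch using the in operator with a list value; the region names are illustrative.

```yaml
generation_table:
  table_name: generators.parquet
  filters:
    - - [region, 'in', ['CA_N', 'CA_S']]  # keep rows whose region is in the list
```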
Scenario Convenience Filter¶
The scenario parameter is shorthand for a scenario column filter:
# These are equivalent:
generation_table:
  table_name: generators.parquet
  scenario: baseline

generation_table:
  table_name: generators.parquet
  filters:
    - - [scenario, '=', 'baseline']
Standard Table Mappings¶
PowerGenome expects certain table names for core data:
| Setting Parameter | Standard Name | Purpose |
|---|---|---|
| generation_table | generation | Existing generators |
| demand_table | demand | Hourly demand profiles |
| fuel_prices_table | fuel_prices | Fuel cost time series |
| transmission_table | transmission | Network constraints |
| plant_region_table | plant_region | Plant location mapping |
| capacity_limit_spur_table | capacity_limit_spur | New-build capacity limits |
| dg_capacity_table | dg_capacity | Distributed gen capacity |
| dg_profiles_table | dg_profiles | Distributed gen profiles |
Core Data Tables¶
Generation Table¶
Setting: generation_table
Purpose: Existing power plant data
Required columns:
- plant_id or plant_id_eia: Unique plant identifier
- technology: Technology type
- capacity_mw: Nameplate capacity
- region: Geographic region
Optional columns:
- heat_rate_mmbtu_mwh: Thermal efficiency
- operating_year: Year plant started
- retirement_year: Planned retirement
- fixed_o_m_mw: Fixed O&M costs
- variable_o_m_mwh: Variable O&M costs
- minimum_load_mw: Minimum stable operation
- fuel: Fuel type
Example:
generation_table:
  table_name: generators.parquet
  filters:
    - - [operating_year, '<=', 2030]
      - [capacity_mw, '>', 10]  # Exclude tiny plants
Demand Table¶
Setting: demand_table
Purpose: Hourly electricity demand in tidy format
Required columns:
- region: Model region name
- time_index: Hour index (1 to number of hours)
- load_mw: Demand in MW
Optional columns:
- year: Model year (for multi-year data)
- scenario: Scenario identifier
- weather_year: Weather data year
Format: Tidy/long format with one row per region-time observation.
Example:
demand_table:
  table_name: demand_timeseries.parquet
  scenario: high_ev
  filters:
    - - [year, '=', 2030]
Example demand CSV (tidy format):
time_index,weather_year,region,load_mw,year
1,2012,CA_N,15234.5,2030
2,2012,CA_N,14123.2,2030
3,2012,CA_N,13890.4,2030
1,2012,CA_S,12450.8,2030
2,2012,CA_S,11234.5,2030
...
Fuel Prices Table¶
Setting: fuel_prices_table
Purpose: Fuel cost projections
Required columns:
- fuel: Fuel name (coal, naturalgas, distillate, uranium, etc.)
- region: Region name
- year: Calendar year
- price: Price ($/MMBtu)
Optional columns:
- scenario: Price scenario
- data_year: Data vintage year
- dollar_year: USD year (for inflation adjustment)
- month: Monthly prices (if seasonal)
Regional Coverage
The fuel price table must include prices for all base regions in your model. For example, if you have an aggregated region CA composed of base regions CA_N and CA_S, your fuel price table must contain separate rows for CA_N and CA_S. PowerGenome will automatically calculate the average price for CA from its constituent base regions.
Example:
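A possible configuration, sketched from the options described earlier on this page. The filename, scenario value, and year are illustrative assumptions.

```yaml
fuel_prices_table:
  table_name: fuel_prices.parquet   # illustrative filename
  scenario: reference               # shorthand for a filter on the scenario column
  filters:
    - - [year, '=', 2030]
```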
Example fuel_prices.csv:
fuel,region,year,price,scenario,dollar_year
coal,CA_N,2030,2.5,reference,2024
coal,CA_S,2030,2.6,reference,2024
naturalgas,CA_N,2030,4.2,reference,2024
naturalgas,CA_S,2030,4.3,reference,2024
uranium,CA_N,2030,0.8,reference,2024
uranium,CA_S,2030,0.8,reference,2024
Transmission Table¶
Setting: transmission_table
Purpose: Inter-regional transmission constraints
Required columns:
- transmission_path_name: Line identifier
- start_region: Origin region
- dest_region: Destination region
- transmission_line_mw: Existing capacity
Optional columns:
- max_transmission_mw: Maximum capacity
- distance_miles: Line length
- line_loss_pct: Transmission losses
Example:
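A minimal sketch that loads only the required columns listed above. The filename is an assumption; the column names come from this section.

```yaml
transmission_table:
  table_name: transmission_network.parquet  # illustrative filename
  columns:
    - transmission_path_name
    - start_region
    - dest_region
    - transmission_line_mw
```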
Distributed Generation Tables¶
DG Capacity Table¶
Setting: dg_capacity_table
Purpose: Distributed generation capacity by region/year
Required columns:
- region: Model region
- year: Model year
- capacity_mw: DG capacity
Optional columns:
- scenario: DG adoption scenario
- technology: DG technology type
Example:
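One way this could be configured, assuming the table carries the optional scenario column. The filename and scenario value are placeholders.

```yaml
dg_capacity_table:
  table_name: distributed_generation.parquet  # illustrative filename
  scenario: high_solar                        # placeholder scenario name
  filters:
    - - [year, '=', 2030]
```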
DG Profiles Table¶
Setting: dg_profiles_table
Purpose: Hourly generation profiles for DG
Required columns:
- region: Model region
- hour: Hour of year
- cf: Capacity factor (0-1)
Optional columns:
- weather_year: Weather data vintage
- technology: DG type
Example:
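A hedged sketch, assuming the table includes the optional weather_year column. The filename and year are placeholders.

```yaml
dg_profiles_table:
  table_name: dg_profiles.parquet   # illustrative filename
  filters:
    - - [weather_year, '=', 2012]   # only valid if the optional column is present
```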
Renewable Resource Tables¶
Resource Group Profiles¶
Setting: RESOURCE_GROUP_PROFILES (path)
Purpose: Hourly generation profiles for renewable clusters
File naming: {technology}_profiles_{suffix}.csv or .parquet
Format: Tidy format with columns:
- weather_year: Weather data year
- time_index: Hour index (1 to number of hours)
- site_id: Resource cluster/site identifier
- value: Capacity factor (0.0 to 1.0)
Example:
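A sketch of the setting itself. The folder path is a hypothetical placeholder for wherever your profile files are stored.

```yaml
# Hypothetical folder path; it should contain the {technology}_profiles_{suffix} files.
RESOURCE_GROUP_PROFILES: /path/to/resource_profiles
```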
Files in folder:
landbasedwind_profiles_class3.parquet
utilitypv_profiles_class1.parquet
offshorewind_profiles_class1.parquet
Example profile CSV (tidy format):
weather_year,time_index,site_id,value
2012,1,3244,0.45
2012,2,3244,0.42
2012,3,3244,0.48
2012,1,3818,0.52
2012,2,3818,0.48
...
Policy and Cost Tables¶
Emission Policies¶
Setting: emission_policies_fn
Purpose: RPS, CES, carbon constraints by case/region/year
Required columns:
- case_id: Case identifier
- year: Model year
- region: Model region (or "all")
Policy columns (example):
- RPS: Renewable portfolio standard (fraction)
- CES: Clean energy standard (fraction)
- CO2_cap: CO2 emissions limit (tonnes)
Example:
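A minimal sketch, assuming the setting takes a filename. The filename shown is hypothetical.

```yaml
# Hypothetical filename; the file holds one row per case_id, year, and region.
emission_policies_fn: emission_policies.csv
```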
Cost Multipliers¶
Setting: cost_multiplier_fn
Purpose: Regional construction cost adjustments
Required columns:
- region: Region name
- technology: Technology name
- multiplier: Cost multiplier (1.2 = 20% increase)
Example:
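A minimal sketch, assuming the setting takes a filename. The filename shown is hypothetical.

```yaml
# Hypothetical filename; each row pairs a region and technology with a multiplier.
cost_multiplier_fn: regional_cost_multipliers.csv
```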
Capacity Limits and Spur Lines¶
Setting: capacity_limit_spur_fn
Purpose: New-build capacity limits and connection costs
Required columns:
- region: Model region
- technology: Technology name
- max_capacity: Maximum capacity (MW)
- spur_miles: Spur line distance
Optional columns:
- cluster: Resource cluster ID
- spur_capex_mw_mile: Spur line cost
Example:
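A minimal sketch using the setting name given in this section and assuming it takes a filename. The filename shown is hypothetical.

```yaml
# Hypothetical filename; rows give max_capacity and spur_miles per region and technology.
capacity_limit_spur_fn: capacity_limits_spur.csv
```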
Data File Formats¶
CSV Files¶
Standard CSV format with header row:
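For instance, a table setting that points at a CSV file could look like this (filename illustrative):

```yaml
generation_table: generators.csv  # first row of the file holds the column names
```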
Parquet Files¶
Binary columnar format (more efficient for large datasets):
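For instance, the same kind of setting pointing at a Parquet file (filename illustrative):

```yaml
generation_table: generators.parquet
```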
PowerGenome automatically detects and loads Parquet files.
Database Tables¶
Database tables (SQLite or DuckDB format):
data_location: /path/to/database.duckdb
generation_table:
  table_name: existing_generators  # Table name in database
Both SQLite (.db, .sqlite) and DuckDB (.duckdb) databases are supported:
# SQLite database
data_location: /path/to/powergenome_data.db
# DuckDB database
data_location: /path/to/powergenome_data.duckdb
Data Location¶
data_location¶
Type: String (path)
Required: Yes
Example: "/path/to/data"
Root directory or database file for all data tables.
Folder structure:
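A sketch of a folder-based configuration, with filenames chosen to match the paths listed in this section:

```yaml
data_location: /Users/me/powergenome_data
generation_table: generators.csv
demand_table: demand.parquet
fuel_prices_table: fuel_prices.csv
```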
Files loaded from:
/Users/me/powergenome_data/generators.csv
/Users/me/powergenome_data/demand.parquet
/Users/me/powergenome_data/fuel_prices.csv
Database file:
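A sketch of the database variant. The database filename and the table name are hypothetical.

```yaml
data_location: /Users/me/powergenome_data/powergenome.duckdb  # hypothetical database file
generation_table:
  table_name: generation  # name of a table inside the database
```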
Tables loaded from database.
Column Selection¶
Limit columns loaded to reduce memory usage:
generation_table:
  table_name: generators_full.parquet
  columns:
    - plant_id
    - technology
    - capacity_mw
    - heat_rate_mmbtu_mwh
    - region
  filters:
    - - [region, 'in', ['CA_N', 'CA_S']]
Only specified columns are loaded. Useful for large datasets with many unused columns.
Example Configurations¶
Minimal Configuration¶
data_location: /data/powergenome
generation_table: generators.csv
demand_table: demand.parquet
fuel_prices_table: fuel_prices.csv
Advanced Multi-Scenario¶
data_location: /data/powergenome_v2
generation_table:
  table_name: generators_all_scenarios.parquet
  scenario: baseline
  filters:
    - - [operating_year, '<=', 2030]
  columns:
    - plant_id
    - technology
    - capacity_mw
    - heat_rate_mmbtu_mwh
    - region
    - operating_year
demand_table:
  table_name: demand_timeseries.parquet
  scenario: high_ev
  filters:
    - - [year, '=', 2030]
      - [weather_year, 'in', [2012, 2013, 2014]]  # AND with the year condition
fuel_prices_table:
  table_name: fuel_costs.parquet
  scenario: reference
  filters:
    - - [year, '>=', 2025]
      - [year, '<=', 2050]  # AND: keeps 2025 through 2050
dg_capacity_table:
  table_name: distributed_generation.parquet
  scenario: high_solar
  filters:
    - - [technology, '=', 'rooftop_pv']
transmission_table: transmission_network.csv
Troubleshooting¶
Table Not Found¶
Error: Table 'generation' not found in data_location
Solutions:
- Check that the data_location path is correct
- Verify that the filename matches table_name
- Ensure the file has the correct extension (.csv, .parquet)
- For databases, verify that the table exists with list_tables()
Column Missing¶
Error: Column 'capacity_mw' not found in table
Solutions:
- Check column names in source data (case-sensitive)
- Verify data schema matches expectations
- PowerGenome converts column names to snake_case automatically
Filter Errors¶
Error: Invalid filter syntax
Solutions:
- Check that the filter uses the proper DNF nested-list format
- Verify that operators are quoted: '=' not =
- Ensure values match column data types (string vs. numeric)
Related Settings¶
- Model Definition: data_location path
- Existing Generators: generation_table schema
- Demand: demand_table, dg_capacity_table, dg_profiles_table
- Fuels: fuel_prices_table
- Transmission: transmission_table