6. GridPath Data Toolkit¶
The GridPath Data Toolkit provides functionality to create GridPath scenario inputs from raw data. Users may provide their own data and use the Toolkit to convert it to the GridPath CSV input format for use in building a GridPath database. The Toolkit also includes functionality to download raw data from PUDL and from the GridPath RA Toolkit.
6.1. Obtaining Raw Data¶
6.1.1. PUDL¶
The Public Utility Data Liberation (PUDL) project collates publicly available data from a range of sources, and puts the data into a single database after cleaning, standardizing, and cross-linking the various datasets.
Download Datasets¶
To download data from PUDL, use the gridpath_get_pudl_data
command.
This will download the pudl.sqlite database as well as the RA Toolkit
wind and solar profiles Parquet file, and the EIA930 hourly interchange
data Parquet file. See the --help menu for options and defaults, e.g., download
location, the Zenodo record number for each dataset, skipping datasets, etc.
Convert to GridPath Raw Format¶
GridPath can currently utilize a subset of the downloaded PUDL data, including:
Form EIA-860: generator-level specific information about existing and planned generators
Form EIA-930: hourly operating data about the high-voltage bulk electric power grid in the Lower 48 states collected from the electricity balancing authorities (BAs) that operate the grid
EIA AEO Table 54 (Electric Power Projections by Electricity Market Module Region): fuel price forecasts
GridPath RA Toolkit variable generation profiles created for the 2026 Western RA Study: these include hourly wind profiles by WECC BA based on assumed 2026 wind buildout for weather years 2007-2014 and hourly solar profiles by WECC BA based on assumed 2026 buildout for weather years 1998-2019; see the study for how profiles were created and note the study was conducted in 2022.
First, the data must be converted to the GridPath raw data CSV format. For this
purpose, use the gridpath_pudl_to_gridpath_raw
command.
This will query the PUDL database and process the Parquet files downloaded in the previous step in order to create the following files in the user-specified raw data directory.
pudl_eia860_generators.csv
pudl_eia930_hourly_interchange.csv
pudl_eiaaeo_fuel_prices.csv
pudl_ra_toolkit_var_profiles.csv
For options, including the download and raw data directories as well as query filters, see the --help menu. By default, we currently use 2024-01-01 as the EIA860 report date and “western_electricity_coordinating_council” as the EIA AEO electricity market to get data for.
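Before moving on to the database steps, it can help to confirm that the conversion produced all four files. The following is a minimal sketch of such a check, not part of the Toolkit itself; the function name missing_raw_files and the placeholder directory are assumptions made for illustration, while the file names are the ones listed above.

```python
from pathlib import Path

# File names listed in the documentation above.
EXPECTED_RAW_FILES = (
    "pudl_eia860_generators.csv",
    "pudl_eia930_hourly_interchange.csv",
    "pudl_eiaaeo_fuel_prices.csv",
    "pudl_ra_toolkit_var_profiles.csv",
)


def missing_raw_files(raw_data_directory):
    """Return the expected raw CSV names not found in raw_data_directory.

    Hypothetical helper for illustration; not a GridPath function.
    """
    raw_dir = Path(raw_data_directory)
    return [name for name in EXPECTED_RAW_FILES if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    # Replace the placeholder with your own raw data directory.
    for name in missing_raw_files("PATH/TO/RAW/DATA/DIRECTORY"):
        print(f"missing: {name}")
```

An empty result means all four files are present; it says nothing about whether their contents are complete.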
6.1.2. GridPath RA Toolkit¶
The GridPath RA Toolkit datasets were developed to support the 2026 Western US case resource adequacy study.
6.2. Using the GridPath Data Toolkit¶
The various functionalities available in the GridPath Data Toolkit can be
accessed via the gridpath_run_data_toolkit
command. See the --help
menu for the available individual Toolkit steps. You may run individual steps
only or list the steps you want to run with their respective arguments in a
settings file you can point to with the --settings_csv
argument.
Descriptions of the individual steps available in the Toolkit are below.
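The usage lines in the subsections that follow all share the same shape: the gridpath_run_data_toolkit command, a --single_step name, and a --settings_csv path. If you prefer to drive several steps from a script, a thin wrapper along these lines may be convenient. This is a sketch only: the helper names toolkit_command and run_steps are hypothetical, and it assumes that the same settings file holds the arguments for every step you list.

```python
import subprocess


def toolkit_command(step, settings_csv):
    """Build the argument list for running one Data Toolkit step."""
    return [
        "gridpath_run_data_toolkit",
        "--single_step", step,
        "--settings_csv", str(settings_csv),
    ]


def run_steps(steps, settings_csv, dry_run=True):
    """Build the commands for the steps in order and return them.

    With dry_run=False the commands are also executed one after another.
    """
    commands = [toolkit_command(step, settings_csv) for step in steps]
    if not dry_run:
        for command in commands:
            # check=True stops the sequence at the first failing step.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    planned = run_steps(
        ["eia930_load_zone_input_csvs", "eia860_to_project_portfolio_input_csvs"],
        "PATH/TO/SETTINGS/CSV",
    )
    for command in planned:
        print(" ".join(command))
```

The dry-run default prints the commands without executing them, which makes it easy to review the step order before anything is written to disk.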
6.2.1. Building the Raw Data Database¶
The first step in using the GridPath Data Toolkit is to create a raw data database. You may do so with the following command:
>>> gridpath_run_data_toolkit --single_step create_database --database PATH/TO/RAW/DB --db_schema ./raw_data_db_schema.sql --omit_data
6.2.2. Loading Raw Data¶
Load data into the GridPath raw data database. See the documentation of each GridPath Data Toolkit module for data prerequisites. Use the files_to_import.csv file to tell GridPath which CSV files should be loaded into which database table.
6.2.3. Load Zone Inputs¶
EIA 930 BAs¶
Create GridPath load_zone inputs (load_zone_scenario_id) based on BAs in Form EIA 930.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia930_load_zone_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
This script requires that the Form EIA 930 hourly interchange data have been loaded and that a region has been defined for each BA in the user_defined_baa_key table (in order to filter BAs if needed). It assumes the following raw input database tables have been populated:
raw_data_eia930_hourly_interchange
user_defined_baa_key
Settings¶
database
output_directory
load_zone_scenario_id
load_zone_scenario_name
allow_overgeneration
overgeneration_penalty_per_mw
allow_unserved_energy
unserved_energy_penalty_per_mwh
max_unserved_load_penalty_per_mw
export_penalty_cost_per_mwh
unserved_energy_stats_threshold_mw
6.2.4. Temporal Inputs¶
Monte Carlo Weather Iteration Draws¶
The Monte Carlo approach employed in the GridPath RA Toolkit study synthesizes multiple years of plausible hourly load, wind availability, solar availability, and temperature-driven thermal derate data over which the system operations can be simulated. Synthetic days are built by combining load, wind, solar, and temperature derate shapes from different but similar days in the historical record. For a detailed description of the methodology, see Appendix B of the report available at https://gridlab.org/wp-content/uploads/2022/10/GridLab_RA-Toolkit-Report-10-12-22.pdf.
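The idea of combining shapes from "different but similar days" can be illustrated with a deliberately simplified sketch. It assumes that historical days have already been grouped into weather bins and that, for each synthetic day, one historical day is drawn independently per variable from the same bin. The function name, bin labels, and dates below are hypothetical, and the actual binning and draw logic are those described in Appendix B of the report, not this code.

```python
import random


def draw_synthetic_days(days_by_bin, bin_sequence, seed=0):
    """For each bin in bin_sequence, draw one historical day per variable.

    days_by_bin maps a weather bin label to a dict of
    {variable name: list of historical day labels in that bin}.
    """
    rng = random.Random(seed)  # seeded so that draws are reproducible
    synthetic = []
    for weather_bin in bin_sequence:
        candidates = days_by_bin[weather_bin]
        synthetic.append(
            {variable: rng.choice(days) for variable, days in candidates.items()}
        )
    return synthetic


if __name__ == "__main__":
    # Hypothetical bins; each variable may cover different historical years.
    days_by_bin = {
        "hot_weekday": {
            "load": ["2010-07-15", "2012-08-02"],
            "wind": ["2009-07-20", "2013-08-11"],
            "solar": ["2005-07-03", "2018-08-21"],
        },
        "mild_weekday": {
            "load": ["2011-04-12", "2013-05-07"],
            "wind": ["2008-04-22", "2014-05-13"],
            "solar": ["2001-04-09", "2016-05-18"],
        },
    }
    for day in draw_synthetic_days(days_by_bin, ["hot_weekday", "mild_weekday"], seed=1):
        print(day)
```

Because each variable is drawn separately, a synthetic day can pair load from one year with wind and solar from others, which is what lets the approach use records of different lengths.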
Usage¶
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_load_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
user_defined_weather_bins
user_defined_data_availability
user_defined_monte_carlo_timeseries
Settings¶
database
weather_bins_id
weather_draws_id
weather_draws_seed
n_iterations
study_year
timeseries_iteration_draw_initial_seed
Temporal Scenarios¶
This is a very basic module that copies over the base CSVs created by the user
and calls the temporal iterations method to create the iterations.csv file if
needed. The location of the base CSVs and the iterations description CSV are
specified in a settings file you can point to with the --csv_path
argument.
Usage¶
>>> gridpath_run_data_toolkit --single_step create_temporal_scenarios --settings_csv PATH/TO/SETTINGS/CSV
Settings¶
csv_path
6.2.5. Load Inputs¶
Sync Loads¶
Create GridPath sync load profile inputs.
Usage¶
>>> gridpath_run_data_toolkit --single_step create_sync_load_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_system_load
user_defined_load_zone_units
Settings¶
database
output_directory
load_scenario_id
load_scenario_name
overwrite
Monte Carlo Loads¶
Create GridPath Monte Carlo load profile inputs. Before running this module,
you will need to create weather draws with the create_monte_carlo_draws
module (see Monte Carlo Weather Iteration Draws).
Usage¶
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_load_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_system_load
user_defined_load_zone_units
aux_weather_iterations (see the
create_monte_carlo_draws
step for how to create synthetic weather years and populate this table)
Settings¶
database
output_directory
load_scenario_id
load_scenario_name
stage_id
overwrite
weather_bins_id
weather_draws_id
6.2.6. Project Inputs¶
Form EIA 860 Project Portfolios¶
This module creates project portfolios from EIA 860 data.
The project capacity_types will be based on the data in the user_defined_eia_gridpath_key table.
Wind, solar, and hydro are aggregated to the BA level.
Note
Hybrid projects are currently not treated separately by this module. Their renewable generation components are lumped with wind/solar, and the storage components show up as individual units.
Project portfolios are created based on the data from a particular report date. The user selects the region (determines subset of generators to use) and the study date (determines which generators are operational, i.e., after their online date and before their retirement date in the EIA data.)
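The selection rule and the BA-level aggregation can be sketched as follows. This is an illustration under stated assumptions rather than the module's actual query: the record fields (ba, capacity_mw, online_date, retirement_date) are hypothetical names, a missing retirement date is taken to mean no planned retirement, and the choice to treat the online date as inclusive and the retirement date as exclusive is an assumption.

```python
from collections import defaultdict
from datetime import date


def operational_capacity_by_ba(generators, study_date):
    """Sum capacity by BA for generators operational on study_date."""
    totals = defaultdict(float)
    for gen in generators:
        online = gen["online_date"]
        retired = gen["retirement_date"]  # None: no planned retirement
        if online <= study_date and (retired is None or study_date < retired):
            totals[gen["ba"]] += gen["capacity_mw"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical wind units in two BAs.
    wind_units = [
        {"ba": "BA_1", "capacity_mw": 100.0,
         "online_date": date(2010, 1, 1), "retirement_date": None},
        {"ba": "BA_1", "capacity_mw": 50.0,
         "online_date": date(2030, 1, 1), "retirement_date": None},
        {"ba": "BA_2", "capacity_mw": 20.0,
         "online_date": date(2000, 1, 1), "retirement_date": date(2020, 1, 1)},
    ]
    print(operational_capacity_by_ba(wind_units, date(2026, 1, 1)))
```

The sketch covers only the technologies that are aggregated to the BA level; other units remain individual projects.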
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_portfolio_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
user_defined_baa_key
Settings¶
database
output_directory
study_year
region
project_portfolio_scenario_id
project_portfolio_scenario_name
- TODO: disaggregate the hybrids out of the wind/solar project and combine
with their battery components
Form EIA 860 Project Load Zones¶
This module creates project load zone input CSVs for an EIA860-based project portfolio based on the user-defined mapping in the user_defined_eia_gridpath_key table.
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_load_zone_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
Settings¶
database
output_directory
study_year
region
project_load_zone_scenario_id
project_load_zone_scenario_name
Form EIA 860 Project Availability¶
Create availability type CSV for an EIA860-based project portfolio. Availability types are set to ‘exogenous’ for all projects with no exogenous profiles specified (i.e., always available).
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_availability_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
Settings¶
database
output_directory
study_year
region
project_availability_scenario_id
project_availability_scenario_name
Availability Iteration Inputs¶
Run unit outage simulation and create availability iteration inputs.
Usage¶
>>> gridpath_run_data_toolkit --single_step create_availability_iteration_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_unit_availability_params
raw_data_var_project_units
Settings¶
database
output_directory
project_availability_scenario_id
project_availability_scenario_name
overwrite
n_parallel_projects
Weather Derates (Sync)¶
Create GridPath sync weather iteration availability inputs.
Usage¶
>>> gridpath_run_data_toolkit --single_step create_sync_gen_weather_derate_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_unit_availability_weather_derates
raw_data_unit_availability_params
Settings¶
database
output_directory
exogenous_availability_weather_scenario_id
exogenous_availability_weather_scenario_name
overwrite
n_parallel_projects
Weather Derates (Monte Carlo)¶
Create GridPath Monte Carlo weather iteration availability inputs. Before
running this module, you will need to create weather draws with the
create_monte_carlo_draws
module (see Monte Carlo Weather Iteration Draws).
Usage¶
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_weather_derate_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_unit_availability_params
raw_data_unit_availability_weather_derates
aux_weather_iterations (see the
create_monte_carlo_draws
step for how to create synthetic weather years and populate this table)
Settings¶
database
output_directory
exogenous_availability_weather_scenario_id
exogenous_availability_weather_scenario_name
overwrite
n_parallel_projects
weather_bins_id
weather_draws_id
Form EIA 860 Project Capacity¶
Create specified capacity CSV for an EIA860-based project portfolio.
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_specified_capacity_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
Settings¶
database
output_directory
study_year
region
project_specified_capacity_scenario_id
project_specified_capacity_scenario_name
Form EIA 860 Projects – Create CSV with Fixed Costs Set to Zero¶
Create fixed cost CSV for an EIA860-based project portfolio with fixed costs set to zero, as fixed cost data are not available at this time. The CSV must still be created because fixed costs are currently a required GridPath input.
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_fixed_cost_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
Settings¶
database
output_directory
study_year
region
project_fixed_cost_scenario_id
project_fixed_cost_scenario_name
Form EIA 860 Projects User-Defined Operating Characteristics¶
Create opchar CSV for an EIA860-based project portfolio. Note that most operating characteristics are user-defined in the user_defined_eia_gridpath_key table and will take default values until more detailed data are available.
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_opchar_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
Settings¶
database
output_directory
study_year
region
project_operational_chars_scenario_id
project_operational_chars_scenario_name
project_fuel_scenario_id
variable_generator_profile_scenario_id
hydro_operational_chars_scenario_id
Form EIA 860 Project Fuels¶
Create project fuels CSV for an EIA860-based project portfolio.
Note
Some fuel regions in the EIA AEO are more disaggregated than the BAs in Form EIA 860 (e.g., the CA South and North regions in the AEO versus the CISO BA in Form EIA 860). This module can currently assign only one fuel region to each BA. If you need the extra resolution, you will need to modify it.
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_fuel_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
user_defined_baa_key
Settings¶
database
output_directory
study_year
region
project_fuel_scenario_id
project_fuel_scenario_name
Form EIA 860 Project Heat Rates (User-Defined by Tech)¶
Create project heat rate CSV for an EIA860-based project portfolio.
Note
Heat rates are user-specified and generic by technology. If you need more granular heat rates by, say, project, you would need to modify this module.
Note
The query in this module is consistent with the project selection
from eia860_to_project_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia860_to_project_heat_rate_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia860_generators
user_defined_eia_gridpath_key
user_defined_heat_rate_curve
Settings¶
database
output_directory
study_year
region
project_hr_scenario_id
project_hr_scenario_name
Variable Gen Profiles (Sync)¶
Create GridPath sync variable generation profile inputs.
Usage¶
>>> gridpath_run_data_toolkit --single_step create_sync_var_gen_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_project_variable_profiles
raw_data_var_project_units
Settings¶
database
output_directory
variable_generator_profile_scenario_id
variable_generator_profile_scenario_name
overwrite
n_parallel_projects
Variable Gen Profiles (Monte Carlo)¶
Create GridPath Monte Carlo variable generation profile inputs. Before running
this module, you will need to create weather draws with the
create_monte_carlo_draws
module (see Monte Carlo Weather Iteration Draws).
Usage¶
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_var_gen_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_project_variable_profiles
raw_data_var_project_units
aux_weather_iterations (see the
create_monte_carlo_draws
step for how to create synthetic weather years and populate this table)
Settings¶
database
output_directory
variable_generator_profile_scenario_id
variable_generator_profile_scenario_name
overwrite
n_parallel_projects
weather_bins_id
weather_draws_id
Hydro Gen Inputs¶
Create hydro iteration input CSVs from year/month data.
Usage¶
>>> gridpath_run_data_toolkit --single_step create_hydro_iteration_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_project_hydro_opchars_by_year_month
raw_data_hydro_years
user_defined_balancing_type_horizons
Settings¶
database
output_directory
hydro_operational_chars_scenario_id
hydro_operational_chars_scenario_name
overwrite
n_parallel_projects
6.2.7. Fuel Inputs¶
EIA AEO Fuel Chars (User-Defined)¶
Create GridPath fuel chars inputs (fuel_scenario_id) for fuels in the EIA AEO. The fuel characteristics are user-defined.
Usage¶
>>> gridpath_run_data_toolkit --single_step eiaaeo_to_fuel_chars_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
This module assumes the following raw input database tables have been populated:
raw_data_eiaaeo_fuel_prices
user_defined_eia_gridpath_key
user_defined_generic_fuel_intensities
user_defined_eiaaeo_region_key
Settings¶
database
output_directory
model_case
report_year
fuel_scenario_id
fuel_scenario_name
EIA AEO Fuel Prices¶
Create GridPath fuel price inputs (fuel_scenario_id) based on the EIA AEO.
Warning
The user is responsible for ensuring that all prices and costs in their model are in a consistent real currency year.
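As a reminder of what bringing a price into a consistent currency year involves, a price quoted in one year's dollars can be rescaled with the ratio of a price index between the two years. The sketch below is generic arithmetic, not something the Toolkit does for you; the function name and the index values are hypothetical, and you would substitute the index series appropriate to your study.

```python
def convert_dollar_year(price, index_from_year, index_to_year):
    """Rescale a price from one dollar year to another using a price index.

    index_from_year is the index value for the year the price is quoted in;
    index_to_year is the index value for the target dollar year.
    """
    return price * index_to_year / index_from_year


if __name__ == "__main__":
    # Hypothetical values: a fuel price of 4.00 per MMBtu and index values
    # of 100.0 (source year) and 110.0 (target year).
    print(convert_dollar_year(4.00, 100.0, 110.0))
```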
Usage¶
>>> gridpath_run_data_toolkit --single_step eiaaeo_fuel_price_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
This module assumes the following raw input database tables have been populated:
raw_data_eiaaeo_fuel_prices
user_defined_eiaaeo_region_key
Settings¶
database
output_directory
model_case
report_year
fuel_price_id
6.2.8. Transmission Inputs¶
Form EIA 930 Transmission Portfolio¶
This module creates a transmission line portfolio input CSV for an EIA930-based transmission portfolio. The transmission capacity type is set to “tx_spec” for all lines.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_portfolio_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia930_hourly_interchange
Settings¶
database
output_directory
region
transmission_portfolio_scenario_id
transmission_portfolio_scenario_name
Form EIA 930 Transmission Load Zones¶
Create load zone input CSV for an EIA930-based transmission portfolio.
Note
The query in this module is consistent with the transmission selection
from eia930_to_transmission_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_load_zone_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia930_hourly_interchange
Settings¶
database
output_directory
region
transmission_load_zone_scenario_id
transmission_load_zone_scenario_name
Form EIA 930 Transmission Availability¶
Create availability type CSV for an EIA930-based transmission portfolio. Availability types are set to ‘exogenous’ for all transmission lines with no exogenous profiles specified (i.e., always available).
Note
The query in this module is consistent with the transmission selection
from eia930_to_transmission_portfolio_input_csvs
.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_availability_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia930_hourly_interchange
user_defined_baa_key
Settings¶
database
output_directory
region
transmission_availability_scenario_id
transmission_availability_scenario_name
Form EIA 930 Transmission Capacity¶
Create specified capacity CSV for an EIA930-based transmission portfolio.
Note
The query in this module is consistent with the transmission selection
from eia930_to_transmission_portfolio_input_csvs
.
Warning
Only minimal, manual data cleaning has been conducted on this dataset. More robust processing is required for usability past the demo stage.
Usage¶
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_specified_capacity_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia930_hourly_interchange
Settings¶
database
output_directory
study_year
region
transmission_specified_capacity_scenario_id
transmission_specified_capacity_scenario_name
Form EIA 930 Transmission Opchar¶
This module creates transmission opchar input CSV for an EIA930-based transmission portfolio. The transmission operational type is set to “tx_simple” and the losses are set to 2% by default.
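To give a sense of scale for the default loss setting, the arithmetic below assumes that the loss factor is applied as a simple proportion of the power sent on a line. That is an illustrative reading of the 2% default; how the “tx_simple” operational type applies losses inside GridPath is not described in this section, and the function name is hypothetical.

```python
def delivered_mw(sent_mw, loss_factor=0.02):
    """Power arriving at the receiving end under a proportional loss factor."""
    return sent_mw * (1.0 - loss_factor)


if __name__ == "__main__":
    # With the default factor, 2% of the power sent is treated as lost.
    print(delivered_mw(100.0))
```

A different value can be supplied through the tx_simple_loss_factor setting listed below.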
Usage¶
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_opchar_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites¶
- This module assumes the following raw input database tables have been populated:
raw_data_eia930_hourly_interchange
Settings¶
database
output_directory
tx_simple_loss_factor
region
transmission_operational_chars_scenario_id
transmission_operational_chars_scenario_name