6. GridPath Data Toolkit

The GridPath Data Toolkit provides functionality to create GridPath scenario inputs from raw data. Users may provide their own data and use the Toolkit to convert it to the GridPath CSV input format for use in building a GridPath database. The Toolkit also includes functionality to download raw data from PUDL and from the GridPath RA Toolkit.

6.1. Obtaining Raw Data

6.1.1. PUDL

The Public Utility Data Liberation (PUDL) project collates publicly available data from a range of sources, and puts the data into a single database after cleaning, standardizing, and cross-linking the various datasets.

Download Datasets

To download data from PUDL, use the gridpath_get_pudl_data command. This will download the pudl.sqlite database as well as the RA Toolkit wind and solar profiles Parquet file and the EIA930 hourly interchange data Parquet file. See the --help menu for options and defaults, e.g., the download location, the Zenodo record number for each dataset, and which datasets to skip.

Convert to GridPath Raw Format

GridPath can currently utilize a subset of the downloaded PUDL data, including:

  • Form EIA-860: generator-level specific information about existing and planned generators

  • Form EIA-930: hourly operating data about the high-voltage bulk electric power grid in the Lower 48 states collected from the electricity balancing authorities (BAs) that operate the grid

  • EIA AEO Table 54 (Electric Power Projections by Electricity Market Module Region): fuel price forecasts

  • GridPath RA Toolkit variable generation profiles created for the 2026 Western RA Study: these include hourly wind profiles by WECC BA based on assumed 2026 wind buildout for weather years 2007-2014 and hourly solar profiles by WECC BA based on assumed 2026 buildout for weather years 1998-2019; see the study for how profiles were created and note the study was conducted in 2022.

First, the data must be converted to the GridPath raw data CSV format. To do so, use the gridpath_pudl_to_gridpath_raw command.

This will query the PUDL database and process the Parquet files downloaded in the previous step in order to create the following files in the user-specified raw data directory.

  • pudl_eia860_generators.csv

  • pudl_eia930_hourly_interchange.csv

  • pudl_eiaaeo_fuel_prices.csv

  • pudl_ra_toolkit_var_profiles.csv

For options, including the download and raw data directories as well as query filters, see the --help menu. By default, we currently use 2024-01-01 as the EIA860 report date and “western_electricity_coordinating_council” as the EIA AEO electricity market for which to get data.
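After the conversion step, it can be useful to confirm that the four CSV files listed above were written. The following is a minimal sketch of such a check; the helper name missing_raw_files and the ./raw_data path are illustrative assumptions and are not part of GridPath.

```python
from pathlib import Path

# File names taken from the list of conversion outputs above.
EXPECTED_RAW_FILES = (
    "pudl_eia860_generators.csv",
    "pudl_eia930_hourly_interchange.csv",
    "pudl_eiaaeo_fuel_prices.csv",
    "pudl_ra_toolkit_var_profiles.csv",
)


def missing_raw_files(raw_data_dir):
    """Return the expected raw CSV names that are not present in raw_data_dir."""
    raw_dir = Path(raw_data_dir)
    return [name for name in EXPECTED_RAW_FILES if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    # "./raw_data" is a placeholder for the user-specified raw data directory.
    missing = missing_raw_files("./raw_data")
    print("Missing files:", missing if missing else "none")
```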

6.1.2. GridPath RA Toolkit

The GridPath RA Toolkit datasets were developed to support the 2026 Western US case resource adequacy study.

6.2. Using the GridPath Data Toolkit

The various functionalities available in the GridPath Data Toolkit can be accessed via the gridpath_run_data_toolkit command. See the --help menu for the available individual Toolkit steps. You may run individual steps only or list the steps you want to run with their respective arguments in a settings file you can point to with the --settings_csv argument. Descriptions of the individual steps available in the Toolkit are below.
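When several steps are run from a script, the command line for each step can be assembled programmatically. The sketch below only builds the argument list using the --single_step and --settings_csv arguments described above; the helper name build_toolkit_command is an assumption made for illustration, and nothing is executed.

```python
import shlex


def build_toolkit_command(step, settings_csv=None, extra_args=()):
    """Build the argument list for running one Data Toolkit step."""
    cmd = ["gridpath_run_data_toolkit", "--single_step", step]
    if settings_csv is not None:
        cmd += ["--settings_csv", str(settings_csv)]
    # Any further arguments (e.g., --database for create_database) are passed through.
    cmd += list(extra_args)
    return cmd


if __name__ == "__main__":
    cmd = build_toolkit_command(
        "eia930_load_zone_input_csvs", "PATH/TO/SETTINGS/CSV"
    )
    # Print the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the paths point at real files.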

6.2.1. Building the Raw Data Database

The first step in using the GridPath Data Toolkit is to create a raw data database. You may do so with the following command:

>>> gridpath_run_data_toolkit --single_step create_database --database PATH/TO/RAW/DB --db_schema ./raw_data_db_schema.sql --omit_data

6.2.2. Loading Raw Data

Load data into the GridPath raw data database. See the documentation of each GridPath Data Toolkit module for data prerequisites. Use the files_to_import.csv file to tell GridPath which CSV files should be loaded into which database table.

6.2.3. Load Zone Inputs

EIA 930 BAs

Create GridPath load_zone inputs (load_zone_scenario_id) based on BAs in Form EIA 930.

Usage
>>> gridpath_run_data_toolkit --single_step eia930_load_zone_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites

This module requires that the Form EIA 930 hourly interchange data have been loaded and that a region has been defined for each BA in the user_defined_baa_key table (so that BAs can be filtered if needed). It assumes the following raw input database tables have been populated:

  • raw_data_eia930_hourly_interchange

  • user_defined_baa_key
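Before running a step, you may want to verify that its prerequisite tables exist and contain rows. The following sketch assumes the raw data database is a SQLite file; the helper name unpopulated_tables is hypothetical and not part of the Toolkit.

```python
import sqlite3


def unpopulated_tables(conn, tables):
    """Return the tables that are missing from the database or have no rows."""
    problems = []
    for table in tables:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if exists is None:
            problems.append(table)
            continue
        # Table names cannot be bound as parameters, so the identifier is quoted.
        has_row = conn.execute(f'SELECT 1 FROM "{table}" LIMIT 1').fetchone()
        if has_row is None:
            problems.append(table)
    return problems


if __name__ == "__main__":
    # An in-memory database stands in for the raw data database here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_defined_baa_key (baa TEXT)")
    print(
        unpopulated_tables(
            conn, ["raw_data_eia930_hourly_interchange", "user_defined_baa_key"]
        )
    )
```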

Settings
  • database

  • output_directory

  • load_zone_scenario_id

  • load_zone_scenario_name

  • allow_overgeneration

  • overgeneration_penalty_per_mw

  • allow_unserved_energy

  • unserved_energy_penalty_per_mwh

  • max_unserved_load_penalty_per_mw

  • export_penalty_cost_per_mwh

  • unserved_energy_stats_threshold_mw
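The settings listed above are supplied through the settings file. The sketch below writes such a file with Python's csv module. The three-column layout (script, setting, value) is an assumption made for illustration, as are the helper names and the example values; check an example settings file shipped with GridPath for the actual layout before relying on it.

```python
import csv

# Assumed header; the real settings file layout may differ.
SETTINGS_HEADER = ["script", "setting", "value"]


def settings_rows(step, settings):
    """Return one [step, setting, value] row per setting, preceded by the header."""
    rows = [SETTINGS_HEADER]
    for name, value in settings.items():
        rows.append([step, name, str(value)])
    return rows


def write_settings_csv(path, step, settings):
    """Write the settings for one Toolkit step to a CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(settings_rows(step, settings))


if __name__ == "__main__":
    # Placeholder values for a few of the settings listed above.
    example = {
        "database": "PATH/TO/RAW/DB",
        "output_directory": "PATH/TO/OUTPUT/DIR",
        "load_zone_scenario_id": 1,
        "load_zone_scenario_name": "example_load_zones",
    }
    for row in settings_rows("eia930_load_zone_input_csvs", example):
        print(row)
```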

6.2.4. Temporal Inputs

Monte Carlo Weather Iteration Draws

The Monte Carlo approach employed in the GridPath RA Toolkit study synthesizes multiple years of plausible hourly load, wind availability, solar availability, and temperature-driven thermal derate data over which the system operations can be simulated. Synthetic days are built by combining load, wind, solar, and temperature derate shapes from different but similar days in the historical record. For a detailed description of the methodology, see Appendix B of the report available at https://gridlab.org/wp-content/uploads/2022/10/GridLab_RA-Toolkit-Report-10-12-22.pdf.
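The idea of combining shapes from different but similar days can be illustrated with a deliberately simplified sketch. It assumes that "similar" means "assigned to the same weather bin", draws one bin per synthetic day, and then independently draws a historical day from that bin for each variable. This is not the Toolkit's actual algorithm (see Appendix B of the report for that); the function names and data layout are assumptions.

```python
import random
from collections import defaultdict


def group_days_by_bin(day_bins):
    """Map each weather bin to the sorted list of historical days assigned to it."""
    by_bin = defaultdict(list)
    for day, weather_bin in day_bins.items():
        by_bin[weather_bin].append(day)
    return {b: sorted(days) for b, days in by_bin.items()}


def draw_synthetic_days(
    day_bins, n_days, seed, variables=("load", "wind", "solar", "derate")
):
    """Draw n_days synthetic days; each variable's shape comes from a
    (possibly different) historical day in the same weather bin."""
    rng = random.Random(seed)  # seeded so that draws are reproducible
    by_bin = group_days_by_bin(day_bins)
    bins = sorted(by_bin)
    draws = []
    for _ in range(n_days):
        weather_bin = rng.choice(bins)
        draw = {"bin": weather_bin}
        for variable in variables:
            draw[variable] = rng.choice(by_bin[weather_bin])
        draws.append(draw)
    return draws


if __name__ == "__main__":
    # Hypothetical bin assignments for four historical days.
    day_bins = {
        "2007-01-01": "cold",
        "2007-01-02": "cold",
        "2007-07-01": "hot",
        "2007-07-02": "hot",
    }
    for draw in draw_synthetic_days(day_bins, n_days=3, seed=0):
        print(draw)
```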

Usage
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_load_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • user_defined_weather_bins

  • user_defined_data_availability

  • user_defined_monte_carlo_timeseries

Settings
  • database

  • weather_bins_id

  • weather_draws_id

  • weather_draws_seed

  • n_iterations

  • study_year

  • timeseries_iteration_draw_initial_seed

Temporal Scenarios

This is a very basic module that copies over the base CSVs created by the user and calls the temporal iterations method to create the iterations.csv file if needed. The location of the base CSVs and the iterations description CSV are specified in a settings file you can point to with the --csv_path argument.

Usage
>>> gridpath_run_data_toolkit --single_step create_temporal_scenarios --settings_csv PATH/TO/SETTINGS/CSV
Settings
  • csv_path

6.2.5. Load Inputs

Sync Loads

Create GridPath sync load profile inputs.

Usage
>>> gridpath_run_data_toolkit --single_step create_sync_load_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_system_load

  • user_defined_load_zone_units

Settings
  • database

  • output_directory

  • load_scenario_id

  • load_scenario_name

  • overwrite

Monte Carlo Loads

Create GridPath Monte Carlo load profile inputs. Before running this module, you will need to create weather draws with the create_monte_carlo_draws module (see Monte Carlo Weather Iteration Draws).

Usage
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_load_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_system_load

  • user_defined_load_zone_units

  • aux_weather_iterations (see the create_monte_carlo_draws step for how to create synthetic weather years and populate this table)

Settings
  • database

  • output_directory

  • load_scenario_id

  • load_scenario_name

  • stage_id

  • overwrite

  • weather_bins_id

  • weather_draws_id

6.2.6. Project Inputs

Form EIA 860 Project Portfolios

This module creates project portfolios from EIA 860 data.

The project capacity_types will be based on the data in the user_defined_eia_gridpath_key table.

Wind, solar, and hydro are aggregated to the BA level.

Note

Hybrid projects are currently not treated separately by this module. Their renewable generation components are lumped with wind/solar, and the storage components show up as individual units.

Project portfolios are created based on the data from a particular report date. The user selects the region (which determines the subset of generators to use) and the study date (which determines which generators are operational, i.e., after their online date and before their retirement date in the EIA data).
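The selection and aggregation logic described above can be sketched as follows. The record fields (generator_id, technology, ba, capacity_mw, online_date, retirement_date), the project naming pattern, and the treatment of dates that fall exactly on the study date are all assumptions made for illustration; the module's actual query may differ.

```python
from collections import defaultdict
from datetime import date

# Technologies that are aggregated to the BA level.
AGGREGATED_TECHS = {"wind", "solar", "hydro"}


def is_operational(gen, study_date):
    """True if the generator is online and not yet retired on study_date."""
    online = gen["online_date"]
    retirement = gen["retirement_date"]
    if online is not None and online > study_date:
        return False
    if retirement is not None and retirement <= study_date:
        return False
    return True


def build_portfolio(generators, study_date):
    """Return project -> capacity (MW) for generators operational on study_date."""
    capacity = defaultdict(float)
    for gen in generators:
        if not is_operational(gen, study_date):
            continue
        if gen["technology"] in AGGREGATED_TECHS:
            # One aggregate project per technology and BA (hypothetical naming).
            project = f'{gen["technology"]}_{gen["ba"]}'
        else:
            project = gen["generator_id"]
        capacity[project] += gen["capacity_mw"]
    return dict(capacity)


if __name__ == "__main__":
    generators = [
        {"generator_id": "wind_1", "technology": "wind", "ba": "BA_A",
         "capacity_mw": 100.0, "online_date": date(2015, 1, 1),
         "retirement_date": None},
        {"generator_id": "wind_2", "technology": "wind", "ba": "BA_A",
         "capacity_mw": 50.0, "online_date": date(2020, 1, 1),
         "retirement_date": None},
    ]
    print(build_portfolio(generators, date(2026, 1, 1)))
```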

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_portfolio_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

  • user_defined_baa_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_portfolio_scenario_id

  • project_portfolio_scenario_name

TODO: disaggregate the hybrids out of the wind/solar project and combine with their battery components

Form EIA 860 Project Load Zones

This module creates project load zone input CSVs for an EIA860-based project portfolio based on the user-defined mapping in the user_defined_eia_gridpath_key table.

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_load_zone_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_load_zone_scenario_id

  • project_load_zone_scenario_name

Form EIA 860 Project Availability

Create an availability type CSV for an EIA860-based project portfolio. Availability types are set to ‘exogenous’ for all projects, with no exogenous profiles specified (i.e., always available).

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_availability_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_availability_scenario_id

  • project_availability_scenario_name

Availability Iteration Inputs

Run unit outage simulation and create availability iteration inputs.

Usage
>>> gridpath_run_data_toolkit --single_step create_availability_iteration_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_unit_availability_params

  • raw_data_var_project_units

Settings
  • database

  • output_directory

  • project_availability_scenario_id

  • project_availability_scenario_name

  • overwrite

  • n_parallel_projects

Weather Derates (Sync)

Create GridPath sync weather iteration availability inputs.

Usage
>>> gridpath_run_data_toolkit --single_step create_sync_gen_weather_derate_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_unit_availability_weather_derates

  • raw_data_unit_availability_params

Settings
  • database

  • output_directory

  • exogenous_availability_weather_scenario_id

  • exogenous_availability_weather_scenario_name

  • overwrite

  • n_parallel_projects

Weather Derates (Monte Carlo)

Create GridPath Monte Carlo weather iteration availability inputs. Before running this module, you will need to create weather draws with the create_monte_carlo_draws module (see Monte Carlo Weather Iteration Draws).

Usage
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_weather_derate_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_unit_availability_params

  • raw_data_unit_availability_weather_derates

  • aux_weather_iterations (see the create_monte_carlo_draws step for how to create synthetic weather years and populate this table)

Settings
  • database

  • output_directory

  • exogenous_availability_weather_scenario_id

  • exogenous_availability_weather_scenario_name

  • overwrite

  • n_parallel_projects

  • weather_bins_id

  • weather_draws_id

Form EIA 860 Project Capacity

Create a specified capacity CSV for an EIA860-based project portfolio.

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_specified_capacity_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_specified_capacity_scenario_id

  • project_specified_capacity_scenario_name

Form EIA 860 Projects – Create CSV with Fixed Costs Set to Zero

Create a fixed cost CSV for an EIA860-based project portfolio with fixed costs set to zero, as fixed cost data are not available at this time. The CSV must be created because fixed costs are currently a required GridPath input.

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_fixed_cost_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_fixed_cost_scenario_id

  • project_fixed_cost_scenario_name

Form EIA 860 Projects User-Defined Operating Characteristics

Create an opchar CSV for an EIA860-based project portfolio. Note that most operating characteristics are user-defined in the user_defined_eia_gridpath_key table and will take default values until more detailed data are available.

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_opchar_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_operational_chars_scenario_id

  • project_operational_chars_scenario_name

  • project_fuel_scenario_id

  • variable_generator_profile_scenario_id

  • hydro_operational_chars_scenario_id

Form EIA 860 Project Fuels

Create a project fuels CSV for an EIA860-based project portfolio.

Note

Some fuel regions in the EIA AEO are more disaggregated than the BAs in Form EIA 860 (e.g., the CA South and North regions in the AEO versus the CISO BA in Form EIA 860). This module can currently assign only one fuel region to each BA. If you need the extra resolution, you will need to modify the module.

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_fuel_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

  • user_defined_baa_key

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_fuel_scenario_id

  • project_fuel_scenario_name

Form EIA 860 Project Heat Rates (User-Defined by Tech)

Create a project heat rate CSV for an EIA860-based project portfolio.

Note

Heat rates are user-specified and generic by technology. If you need more granular heat rates by, say, project, you would need to modify this module.

Note

The query in this module is consistent with the project selection from eia860_to_project_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia860_to_project_heat_rate_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia860_generators

  • user_defined_eia_gridpath_key

  • user_defined_heat_rate_curve

Settings
  • database

  • output_directory

  • study_year

  • region

  • project_hr_scenario_id

  • project_hr_scenario_name

Variable Gen Profiles (Sync)

Create GridPath sync variable generation profile inputs.

Usage
>>> gridpath_run_data_toolkit --single_step create_sync_var_gen_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_project_variable_profiles

  • raw_data_var_project_units

Settings
  • database

  • output_directory

  • variable_generator_profile_scenario_id

  • variable_generator_profile_scenario_name

  • overwrite

  • n_parallel_projects

Variable Gen Profiles (Monte Carlo)

Create GridPath Monte Carlo variable generation profile inputs. Before running this module, you will need to create weather draws with the create_monte_carlo_draws module (see Monte Carlo Weather Iteration Draws).

Usage
>>> gridpath_run_data_toolkit --single_step create_monte_carlo_var_gen_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_project_variable_profiles

  • raw_data_var_project_units

  • aux_weather_iterations (see the create_monte_carlo_draws step for how to create synthetic weather years and populate this table)

Settings
  • database

  • output_directory

  • variable_generator_profile_scenario_id

  • variable_generator_profile_scenario_name

  • overwrite

  • n_parallel_projects

  • weather_bins_id

  • weather_draws_id

Hydro Gen Inputs

Create hydro iteration input CSVs from year/month data.

Usage
>>> gridpath_run_data_toolkit --single_step create_hydro_iteration_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_project_hydro_opchars_by_year_month

  • raw_data_hydro_years

  • user_defined_balancing_type_horizons

Settings
  • database

  • output_directory

  • hydro_operational_chars_scenario_id

  • hydro_operational_chars_scenario_name

  • overwrite

  • n_parallel_projects

6.2.7. Fuel Inputs

EIA AEO Fuel Chars (User-Defined)

Create GridPath fuel chars inputs (fuel_scenario_id) for fuels in the EIA AEO. The fuel characteristics are user-defined.

Usage
>>> gridpath_run_data_toolkit --single_step eiaaeo_to_fuel_chars_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites

This module assumes the following raw input database tables have been populated:

  • raw_data_eiaaeo_fuel_prices

  • user_defined_eia_gridpath_key

  • user_defined_generic_fuel_intensities

  • user_defined_eiaaeo_region_key

Settings
  • database

  • output_directory

  • model_case

  • report_year

  • fuel_scenario_id

  • fuel_scenario_name

EIA AEO Fuel Prices

Create GridPath fuel price inputs (fuel_scenario_id) based on the EIA AEO.

Warning

The user is responsible for ensuring that all prices and costs in their model are in a consistent real currency year.
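As an illustration of what putting prices in a consistent real currency year involves, the sketch below rescales a price with a price index. The function name and the index values are hypothetical placeholders and are not taken from any published series; use an appropriate deflator for your own data.

```python
def to_real_price(price, index_in_price_year, index_in_target_year):
    """Convert a price stated in one currency year to the target real currency
    year by scaling with the ratio of price index values."""
    return price * index_in_target_year / index_in_price_year


if __name__ == "__main__":
    # Hypothetical index values: 110.0 in the price's year, 100.0 in the target year.
    print(to_real_price(5.50, index_in_price_year=110.0, index_in_target_year=100.0))
```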

Usage
>>> gridpath_run_data_toolkit --single_step eiaaeo_fuel_price_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites

This module assumes the following raw input database tables have been populated:

  • raw_data_eiaaeo_fuel_prices

  • user_defined_eiaaeo_region_key

Settings
  • database

  • output_directory

  • model_case

  • report_year

  • fuel_price_id

6.2.8. Transmission Inputs

Form EIA 930 Transmission Portfolio

This module creates a transmission line portfolio input CSV for an EIA930-based transmission portfolio. The transmission capacity type is set to “tx_spec” for all lines.

Usage
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_portfolio_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia930_hourly_interchange

Settings
  • database

  • output_directory

  • region

  • transmission_portfolio_scenario_id

  • transmission_portfolio_scenario_name

Form EIA 930 Transmission Load Zones

Create a load zone input CSV for an EIA930-based transmission portfolio.

Note

The query in this module is consistent with the transmission selection from eia930_to_transmission_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_load_zone_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia930_hourly_interchange

Settings
  • database

  • output_directory

  • region

  • transmission_load_zone_scenario_id

  • transmission_load_zone_scenario_name

Form EIA 930 Transmission Availability

Create an availability type CSV for an EIA930-based transmission portfolio. Availability types are set to ‘exogenous’ for all transmission lines, with no exogenous profiles specified (i.e., always available).

Note

The query in this module is consistent with the transmission selection from eia930_to_transmission_portfolio_input_csvs.

Usage
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_availability_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia930_hourly_interchange

  • user_defined_baa_key

Settings
  • database

  • output_directory

  • region

  • transmission_availability_scenario_id

  • transmission_availability_scenario_name

Form EIA 930 Transmission Capacity

Create a specified capacity CSV for an EIA930-based transmission portfolio.

Note

The query in this module is consistent with the transmission selection from eia930_to_transmission_portfolio_input_csvs.

Warning

Only minimal, manual data cleaning has been conducted on this dataset. More robust processing is required for usability past the demo stage.

Usage
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_specified_capacity_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia930_hourly_interchange

Settings
  • database

  • output_directory

  • study_year

  • region

  • transmission_specified_capacity_scenario_id

  • transmission_specified_capacity_scenario_name

Form EIA 930 Transmission Opchar

This module creates a transmission opchar input CSV for an EIA930-based transmission portfolio. The transmission operational type is set to “tx_simple” and the losses are set to 2% by default.
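To give a sense of scale for the default 2% loss factor, the sketch below applies a proportional loss to a flow. It shows one common reading of a simple loss factor (losses proportional to the power sent) and is not necessarily how GridPath's “tx_simple” operational type accounts for losses; the function name is an assumption.

```python
DEFAULT_LOSS_FACTOR = 0.02  # the 2% default mentioned above


def delivered_mw(sent_mw, loss_factor=DEFAULT_LOSS_FACTOR):
    """Power arriving at the receiving end when losses are proportional to flow."""
    if not 0.0 <= loss_factor < 1.0:
        raise ValueError("loss_factor must be in [0, 1)")
    return sent_mw * (1.0 - loss_factor)


if __name__ == "__main__":
    # Power delivered for 100 MW sent with the default loss factor.
    print(delivered_mw(100.0))
```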

Usage
>>> gridpath_run_data_toolkit --single_step eia930_to_transmission_opchar_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
Input prerequisites
This module assumes the following raw input database tables have been populated:
  • raw_data_eia930_hourly_interchange

Settings
  • database

  • output_directory

  • tx_simple_loss_factor

  • region

  • transmission_operational_chars_scenario_id

  • transmission_operational_chars_scenario_name