Streamlit dashboard for exploring U.S. electricity demand data from the U.S. Energy Information Administration (EIA).
- EIA Open Data API: `electricity/rto` datasets
- Fuel type page endpoint: `daily-fuel-type-data`
- Region page endpoint: `daily-region-data`
- API docs: https://www.eia.gov/opendata/documentation.php
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Ensure `pandera` is available (already included in `requirements.txt`):

  ```bash
  pip install "pandera[pandas]"
  ```
- Configure Streamlit secrets for BigQuery access:

  ```toml
  # .streamlit/secrets.toml
  EIA_API_KEY = "your_eia_api_key"

  [gcp_service_account]
  type = "service_account"
  project_id = "sipa-adv-c-purple-flamingo"
  private_key_id = "..."
  private_key = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
  client_email = "streamlit@sipa-adv-c-purple-flamingo.iam.gserviceaccount.com"
  client_id = "..."
  token_uri = "https://oauth2.googleapis.com/token"

  [bigquery]
  project_id = "sipa-adv-c-purple-flamingo"
  dataset_id = "eia_data"
  fuel_table_id = "daily_fuel_main"
  ```
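  In the app, these secrets can be turned into a BigQuery client with the standard Streamlit pattern shown below (a sketch; the repo's actual helper may differ):

  ```python
  import streamlit as st
  from google.cloud import bigquery
  from google.oauth2 import service_account

  # Build credentials from the [gcp_service_account] secrets block.
  credentials = service_account.Credentials.from_service_account_info(
      st.secrets["gcp_service_account"]
  )

  # Point the client at the project configured under [bigquery].
  client = bigquery.Client(
      credentials=credentials,
      project=st.secrets["bigquery"]["project_id"],
  )
  ```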
- If you will run the load script locally, authenticate your user account first:

  ```bash
  gcloud auth application-default login
  export EIA_API_KEY="your_eia_api_key"
  python load_daily_eia_to_bigquery.py
  ```
The load script defaults to the fuel dataset for this lab. To load the region table instead:

```bash
EIA_DATA_SOURCE=region python load_daily_eia_to_bigquery.py
```
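A minimal sketch of how that switch can look inside the script; only `EIA_DATA_SOURCE` comes from the command above, and the route/table mapping is illustrative:

```python
import os

# Choose the dataset to load; defaults to the fuel dataset for this lab.
data_source = os.environ.get("EIA_DATA_SOURCE", "fuel")

# Illustrative route and table lookup keyed by data source.
ROUTES = {
    "fuel": ("electricity/rto/daily-fuel-type-data/data/", "daily_fuel_main"),
    "region": ("electricity/rto/daily-region-data/data/", "daily_region_main"),
}
route, table_id = ROUTES[data_source]
```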
To support both app pages from BigQuery, include both table names in secrets:

```toml
[bigquery]
project_id = "sipa-adv-c-purple-flamingo"
dataset_id = "eia_data"
fuel_table_id = "daily_fuel_main"
region_table_id = "daily_region_main"
```
The repo already ignores `.streamlit/secrets.toml`, so the service account key will not be committed.
Run the app:

```bash
streamlit run main_page.py
```

This opens a two-page app (wiring sketched below):

- Fuel type demand view (`app.py`)
- Region demand view (`region.py`)
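One way `main_page.py` can wire those pages together is Streamlit's `st.navigation` API; this is a sketch under that assumption (the repo may instead use the `pages/` directory convention), with illustrative titles:

```python
import streamlit as st

# Register both analytical pages and run whichever one is selected.
pg = st.navigation(
    [
        st.Page("app.py", title="Fuel type demand"),
        st.Page("region.py", title="Region demand"),
    ]
)
pg.run()
```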
Both analytical pages now read from BigQuery with date filters pushed into SQL so the app only loads the records needed for the selected window.
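A minimal sketch of that pushdown, assuming a DATE-typed `period` column and the table names from the secrets above; query parameters keep the user-selected dates out of the SQL string:

```python
from google.cloud import bigquery

def load_window(client: bigquery.Client, table: str, start: str, end: str):
    # The WHERE clause runs in BigQuery, so only rows inside the
    # selected window ever reach the app.
    query = f"""
        SELECT period, value
        FROM `{table}`
        WHERE period BETWEEN @start AND @end
        ORDER BY period
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start", "DATE", start),
            bigquery.ScalarQueryParameter("end", "DATE", end),
        ]
    )
    return client.query(query, job_config=job_config).to_dataframe()
```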
See LAB_10.md for the write-up covering:
- which loading strategy each dataset uses
- why both datasets now live in BigQuery
- the performance changes made to keep page loads fast
Run the tests:

```bash
pytest -q
```

`load_daily_eia_to_bigquery.py` verifies each load by printing:

- total row count
- minimum and maximum `period`
- latest `loaded_at` timestamp
That gives you a quick way to confirm the refresh worked as intended after upload.
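The summary can come from a single aggregate query; this is a hedged sketch using the column names above (the script's actual query may differ):

```python
from google.cloud import bigquery

def summarize_load(client: bigquery.Client, table: str) -> None:
    # One aggregate pass: row count, period range, newest load stamp.
    query = f"""
        SELECT
            COUNT(*) AS row_count,
            MIN(period) AS min_period,
            MAX(period) AS max_period,
            MAX(loaded_at) AS latest_loaded_at
        FROM `{table}`
    """
    row = next(iter(client.query(query).result()))
    print(f"rows loaded: {row.row_count}")
    print(f"period range: {row.min_period} .. {row.max_period}")
    print(f"latest loaded_at: {row.latest_loaded_at}")
```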
Validation lives in `schemas.py` and is applied in both pages at two stages (a minimal schema sketch follows this list):

- Raw API payload validation:
  - Checks required columns exist for each page.
  - Allows extra columns (`strict=False`) so API field additions do not break the app.
- Parsed data validation:
  - Enforces `period` as datetime-like and `value` as numeric-coercible.
  - Invalid rows are dropped for required plotting columns.
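In the spirit of `schemas.py`, a pandera sketch of the parsed-data stage; the column set and dtypes reflect the bullets above, everything else is illustrative:

```python
import pandas as pd
import pandera as pa

# Required plotting columns only; extra API fields pass through
# because strict=False, and values are coerced to the target dtypes.
parsed_schema = pa.DataFrameSchema(
    {
        "period": pa.Column(pa.DateTime, coerce=True),
        "value": pa.Column(float, coerce=True, nullable=True),
    },
    strict=False,
)

def validate_parsed(df: pd.DataFrame) -> pd.DataFrame:
    # lazy=True collects every failure instead of stopping at the first.
    return parsed_schema.validate(df, lazy=True)
```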
Failure behavior is non-blocking by default (sketched after this list):

- The app shows `st.warning(...)` with a short validation summary.
- It continues with cleaned data when possible.
- If no usable rows remain after cleaning, the page stops with a warning.
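A sketch of that non-blocking pattern, reusing the hypothetical `validate_parsed` helper above; the warning text and row cleanup are illustrative:

```python
import pandas as pd
import pandera as pa
import streamlit as st

def validate_or_warn(df: pd.DataFrame) -> pd.DataFrame:
    # Degrade gracefully: warn, keep the clean rows, and only stop
    # the page when nothing usable remains.
    try:
        return validate_parsed(df)
    except pa.errors.SchemaErrors as err:
        st.warning(
            f"Validation issues: {len(err.failure_cases)} failing checks; "
            "continuing with cleaned data."
        )
        bad_rows = err.failure_cases["index"].dropna().unique()
        cleaned = df.drop(index=bad_rows)
        if cleaned.empty:
            st.warning("No usable rows remain after cleaning.")
            st.stop()
        return cleaned
```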
- No data returned:
  - Check the API key and date range.
- Validation warnings:
  - Usually indicate API schema drift or dirty rows in the selected date window.
  - The app will continue when possible; inspect the warnings to see which fields/rows were dropped.
- Empty chart after warnings:
  - All rows were filtered out by required-column checks, parsing, or the timezone filter.
The project investigates grid behavior under stress using EIA demand and generation data, with optional future integration of NOAA weather data for event-based analysis.