
Commit 4491c6f

Authored by Irene Garousi-Nejad, Anthony Castronova, and Danielle Tijerina-Kreuzer

Merge CUAHSI/cssi_evaluation to hydroframe/cssi_evaluation (#17)
* added the nwm snow comparison notebook
* added documentation for snow metrics
* uploaded the correct documentation files
* removed the wrong document
* added snow.utils module to the evaluation framework. This contains two functions: computing water year, and performing a same-day swe comparison.
* added different day comparison function to snow utils
* formatting
* added snow melt functions to the snow utils module. Split the snow example notebook into a data collection notebook and a single site comparison notebook
* cleaned single site snow comparison notebook by removing unused imports and code
* completed setting up virtual environment on Verde with micromamba, added steps in getting_started.md
* created a new 01 snow notebook for HydroData data retrieval
* changes to package versions for importing packages and snow __init__ for circular import in notebook 02
* added html/css styling in nwm_utils plot_sites_within_domain function for popup formatting
* changed text box formatting, clarified language around 'model outputs'
* remove notebook output
* formatting for comparison_plots() function in nwm_utils.py
* added timeseries/scatter plot figure description text
* Reorganized and added sections for notebook 02; Plotting and description changes for notebook 02 sections 4-5
* cleaned up section 5 of 02 notebook (pre stats section 6 work)
* 02 snow notebook outline for stats section, added multi-site stats summary and plots, condon diagrams
* changed 'relative' bias and 'mean' bias descriptors, fixed the accumulation/ablation compute_stats
* Created a third notebook for the last part of snow comparison (across a watershed)
* add .DS_store and .ipynb_checkpoints to gitignore
* create 01_data_collection_HydroData.ipynb notebook file
* Added getting shapefile from WBD HUC dataset; added Pandas functionality to plot_sites_within_domain() function
* added geojson file to domain_data for CCSS stations. Still have problems using getCCSSData() on Verde
* Added geojson CCSS file for use on Verde. Confirmed that notebook 02 works on Verde up to the 'Summary Metrics at Multiple Sites' section
* wrote loop for accessing CONUS1 SWE with the get_gridded_data function & first plotting
* nearly final Hydrodata notebook; finalized section outline; added hydrodata_utils; removed most of the Cali files
* Final 01 HydroData SWE Notebook cleanup, update mapping function in nwm_utils, readded domain shapefiles
* removed old commented-out mapping function
* deleted old 01 Hydrodata notebook
* Revert "Cam 965 test snow notebooks and revise snow 02 notebook"
* replaced the DonPedroDam shapefiles that were accidentally removed
* renamed 02 snow notebook
* PF evaluation for snow 02 notebook complete, EXCEPT for stats section
* changed some NWM wording in a markdown box
* create new 03 snow notebook for ParFlow/Hydrodata
* added specific model and output folders for PF outputs
* added coordinate grab for PF-CONUS grid cells to 01, 03 modifications
* nearly complete ParFlow snow notebook 03; some modifications for NWM snow notebook 03; some package and function updates
* added nwm example
* added evaluation to nwm notebook
* moved files to new directories
* Created new folders for reorganizing files
* Added all necessary python files, init files; Added a General README
* added header descriptions in empty files
* Migrated most functions to their appropriate files for reorg. Not including model specific and some observational functions.
* Checked package imports for all new files, except model specific ones
* created 'collect_observations' directory in /examples and moved the original 01_data_collection.ipynb notebooks. Renamed these for PF/HydroData and NWM
* edited files for missing imports requirements
* Modify dataCollectionHydrodata_parflow.ipynb for module reorg. Can confirm notebook works.
* Modify dataCollection_nwm.ipynb for module reorg. Can confirm notebook works.
* final changes before PR
* Moved the root-level readme file to the docs folder for now. Some of the content is specific to Parflow. We can re-use the content of this file later when the reorganization of the github repository is finalized.
* Minor updates to the readme file.
* Add a readme to describe both the workflow and the entire repository
* Add an initial workflow chart for the hackathon activities
* Update the environment setup flow to include a one-time setup step that supports both the core library configuration for the evaluation workflow/package and the runtime dependencies.
* updated instruction to install deps
* tested running notebooks with the new env set-up, data collection works fine. NWM and Parflow notebooks (#2) should be updated. They still point to cssi_evaluation.snow which no longer exists in our reorg branch
* removed backup files b/c the updated env set-up workflow works fine
* revised deps based on Amy's feedback

---------

Co-authored-by: Anthony Castronova <castronova.anthony@gmail.com>
Co-authored-by: danielletijerina <dtt2@princeton.edu>
Co-authored-by: Danielle Tijerina-Kreuzer <danielletk@Danielles-CUAHSI-MacBook.local>
Co-authored-by: danielletijerina <dtijerina@cuahsi.org>
Co-authored-by: Irene Garousi-Nejad <igarousi@Irene-CUAHSI.local>
1 parent 34f8ca8 commit 4491c6f

54 files changed

Lines changed: 9476 additions & 159 deletions

README.md

Lines changed: 176 additions & 38 deletions
@@ -1,38 +1,176 @@

# cssi_evaluation

Code used to compare ParFlow simulated output to real-world observations.

Please see `example_workflow.ipynb` for an example of how the functions in this module are intended to be used with each other. Note that the input in this example is a mask generated from a HUC (list) using `subsettools`. This is one method of generating a mask, but the workflow will work with any mask and accompanying bounds within either the conus1 or conus2 domain. This workflow is restricted to comparisons within the conus1 or conus2 domains. If you wish to use the entire conus1 or conus2 domain, set `ij_bounds = None` and use `utils.get_conus_mask("conus2")` to obtain the mask for the full conus2 grid (or likewise for conus1).

This module contains three distinct steps, which will be linked up in the final workflow:
1. Gather site-level observations for a requested domain
2. Extract and format output from ParFlow grid cells that match up with site locations
3. Calculate metrics and produce plots to compare outputs from (1) and (2)

Supported variables:
- 'streamflow' (USGS) ('hourly' or 'daily')
- 'water_table_depth' (USGS) ('hourly' or 'daily')
- 'swe' (SNOTEL) ('daily')
- 'latent_heat' (AmeriFlux) ('hourly')

Supported metrics:
- 'r2': Correlation of determination
- 'spearman_rho': Spearman's rank correlation coefficient
- 'mse': Mean Squared Error
- 'rmse': Root Mean Squared Error
- 'bias': bias
- 'percent_bias': percent bias
- 'abs_rel_bias': absolute relative bias
- 'total_difference': total difference (ParFlow minus observations)
- 'pearson_r': Pearson's R
- 'nse': Nash-Sutcliffe Efficiency
- 'kge': Kling-Gupta Efficiency
- 'bias_from_r': bias from R ([equation 16](https://www.nature.com/articles/srep19401))
- 'condon': Condon category (low/high bias, poor/good shape)

Supported plots (see `plots.py` for full API details):
- `plot_obs_locations()`: Given observation metadata, plot site locations within a mask. Sites are color-coded by site type, if multiple types of sites are present.
- `plot_time_series()`: Plot ParFlow time series against observation time series; one plot per site.
- `plot_compare_scatter()`: Plot a single scatterplot comparing the average values for all sites for ParFlow vs. observations.
- `plot_metric_map()`: Plot sites within a mask, colored by their value on a given comparison metric; one plot per metric.
- `plot_condon_diagram()`: Plot Condon diagram comparing absolute relative bias to Spearman's rho.
# Hydrologic Model Evaluation Framework

This repository contains a hydrologic model evaluation framework for comparing modeled outputs against observations and reference datasets at national scale. The goal is not only to help users find data, but to provide a structured, reproducible workflow for model evaluation and benchmarking across hydrologic models, variables, and spatial scales.

This effort builds on earlier NSF-supported work from the HydroGEN and HydroFrame projects, which focused on building and serving large-scale hydrologic modeling capabilities. In that work, it became clear that community-scale options for evaluating national and regional model outputs remain limited. `HydroData` was initially developed to meet internal project needs for data access, but broader engagement showed a clear need for a framework that connects those data resources to evaluation workflows. This repository, `cssi_evaluation`, addresses that need.

## What this framework does

The framework brings together several pieces that operate as one workflow:

1. Model-specific adapters that extract and **format model outputs**.
2. Data-access tools that **retrieve observations** and reference datasets.
3. Shared evaluation utilities for **preprocessing, statistics, and plotting**.
4. **Variable-specific diagnostics** for process-aware evaluation of particular hydrologic variables.

The operational framework lives under [`src/cssi_evaluation/`](src/cssi_evaluation). The top-level [`docs/`](docs), [`examples/`](examples), and [`tests/`](tests) directories support the framework, but they are not themselves workflow stages.

## Current scope

We have started developing this framework as a **Python** package to facilitate evaluation of modeled results against real-world data. At present, the repository reflects two main modeling paths:

- ParFlow-oriented evaluation, especially for ParFlow-CONUS2.1
- National Water Model (NWM) workflows that are being organized alongside the same framework structure

Future development is expected to add both additional site-based datasets and gridded remote-sensing datasets.

HydroData-related packages such as `hf_hydrodata` and `subsettools` are external dependencies used by this framework. They are not part of the source tree in this repository, but they are important data-access inputs, especially for the ParFlow-oriented workflow. Using those packages keeps the acquisition of comparison datasets reproducible in code.

## How users interact with the framework

Users primarily interact with the package through Python functions and example notebooks. A typical workflow lets a user:

- set up an environment with the required packages
- define a domain of interest by HUC, latitude/longitude bounding box, or upstream drainage area
- define a time range for evaluation
- select one or more observational variables to compare against model output

The package includes shared statistical metrics such as RMSE, MSE, Pearson correlation, Spearman rank correlation, Nash-Sutcliffe Efficiency, Kling-Gupta Efficiency, R-squared, bias, percent bias, absolute relative bias, total difference, and Condon category. It also includes plotting utilities for site-level time series and mapped summaries of evaluation metrics across sites.
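As a sketch of what a few of these shared metrics compute, bias, NSE, and KGE can be written in a handful of NumPy lines. The function names here are illustrative, not necessarily the package's own API:

```python
import numpy as np

def bias(sim, obs):
    # Mean difference, model minus observations
    return float(np.mean(np.asarray(sim) - np.asarray(obs)))

def nse(sim, obs):
    # Nash-Sutcliffe Efficiency: 1 is a perfect match,
    # 0 means the model is no better than the observed mean
    sim, obs = np.asarray(sim), np.asarray(obs)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(sim, obs):
    # Kling-Gupta Efficiency from correlation (r), variability ratio (alpha),
    # and bias ratio (beta); 1 is a perfect match
    sim, obs = np.asarray(sim), np.asarray(obs)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([1.0, 2.0, 3.0, 4.0])
print(nse(obs, obs), kge(obs, obs))  # a perfect match scores 1.0 on both
```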
## General metrics and variable-specific diagnostics

The repository intentionally separates model-agnostic evaluation tools from diagnostics that are specific to a variable or process.

- General metrics apply broadly across time series and model types.
- Variable-specific diagnostics are evaluation methods designed around the scientific behavior of a particular hydrologic variable.

For example, continuous time-series metrics are useful but may not fully capture seasonal snow behavior. Snow evaluation often requires targeted diagnostics such as:

- peak SWE same-day comparison
- peak SWE different-day comparison
- melt timing comparison

These diagnostics complement general statistical metrics by focusing on process-relevant behaviors rather than only full-series agreement.
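A minimal sketch of the peak-SWE ideas, assuming two daily SWE series aligned on the same dates (this is illustrative, not the package's implementation; melt-timing comparison extends the same idea to melt-out dates):

```python
import numpy as np

def peak_swe_diagnostics(model_swe, obs_swe):
    """Compare peak SWE magnitude and timing for one site-year."""
    model_swe, obs_swe = np.asarray(model_swe), np.asarray(obs_swe)
    m_day = int(np.argmax(model_swe))  # day index of modeled peak
    o_day = int(np.argmax(obs_swe))    # day index of observed peak
    return {
        # Same-day comparison: model value on the observed peak day
        "same_day_error": model_swe[o_day] - obs_swe[o_day],
        # Different-day comparison: each series at its own peak
        "peak_error": model_swe[m_day] - obs_swe[o_day],
        # Timing offset in days between modeled and observed peaks
        "peak_day_offset": m_day - o_day,
    }

obs = np.array([0, 5, 10, 20, 15, 5, 0], dtype=float)
mod = np.array([0, 4, 12, 18, 22, 8, 0], dtype=float)
print(peak_swe_diagnostics(mod, obs))
```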
## Framework workflow

The framework standardizes modeled and observational data before applying shared utilities and variable-specific diagnostic functions. Dashed arrows indicate repository locations, not workflow direction. Blue boxes represent core workflow components, green circles show outputs generated by the workflow, and yellow boxes denote workflow sections.

```mermaid
flowchart LR
    subgraph A[Model Output Sources]
        direction LR
        A1[ParFlow outputs]
        A2[NWM outputs]
        A3[Future model<br/>outputs]
    end

    subgraph B[Observation Data Sources]
        direction LR
        BS[" "]
        B1[HydroData-related<br/>packages]
        B2[External observation<br/>sources]
        B3[Other reference<br/>datasets]
    end

    E([Model-Specific Adapters])
    F([Data Access Layer])
    I((Standardized evaluation-ready data))

    subgraph C[General Evaluation Utilities]
        C1[General statistical metrics]
        C2[Plots and summaries]
    end

    subgraph D[Process-aware Evaluation Utilities]
        direction LR
        DS[" "]
        D1[Streamflow diagnostics]
        D2[Snow diagnostics]
        D3[Additional variables]
    end

    G([Shared Evaluation Utils])
    H([Variable-Specific Diagnostics])

    A --> E
    B --> F
    E --> I
    F --> I
    I --> C
    I --> D
    C --> G
    D --> H
    G --> J((Evaluation Results))
    H --> J

    subgraph P[Repository Paths]
        direction LR
        P1[src/cssi_evaluation/models/]
        P2[src/cssi_evaluation/external_data_access/]
        P3[src/cssi_evaluation/utils/]
        P4[src/cssi_evaluation/variables/]
    end

    E -.-> P1
    F -.-> P2
    G -.-> P3
    H -.-> P4

    classDef sourcegroup fill:#d9d9d9,stroke:#7a7a7a,color:#111;
    classDef source fill:#efe8db,stroke:#8a6b2f,color:#111;
    classDef framework fill:#bcdcf5,stroke:#1f5f99,color:#111;
    classDef results fill:#dff5df,stroke:#3c8c3c,color:#111;
    classDef repo fill:#7a7a7a,stroke:#4d4d4d,color:#111;
    classDef path fill:#b3b3b3,stroke:#4d4d4d,color:#111;
    classDef spacer fill:transparent,stroke:transparent,color:transparent;

    class A,B,C,D sourcegroup;
    class A1,A2,A3,B1,B2,B3,C1,C2,C3,D1,D2,D3 source;
    class E,F,G,H framework;
    class I,J results;
    class BS,DS spacer;

    linkStyle 10,11,12,13 stroke:#333333,stroke-width:2px,stroke-dasharray:6 4;
```
## Repository structure

```text
cssi_evaluation/
├── src/cssi_evaluation/          # Core framework code
│   ├── models/                   # Model-specific adapters
│   ├── external_data_access/     # Observation and reference-data access helpers
│   ├── utils/                    # Shared evaluation utilities
│   ├── variables/                # Variable-specific diagnostics
│   └── example_workflow.ipynb    # Package-level example workflow
├── examples/                     # Example notebooks and supporting assets
├── docs/                         # Project documentation and notes
├── tests/                        # Package tests
├── pyproject.toml                # Package metadata and dependencies
└── README.md
```

## What is and is not part of the workflow

The framework logic is the code under [`src/cssi_evaluation/`](src/cssi_evaluation).

The following directories are important, but they are not part of the workflow implementation itself:

- [`docs/`](docs) for narrative documentation and project references
- [`examples/`](examples) for demonstration notebooks and usage examples
- [`tests/`](tests) for validation and regression testing

## Outlook

The framework is intended to grow by adding:

- new model adapters
- new observation and reference-data pathways
- new variable-specific diagnostics
- clearer workflows for users bringing their own model outputs, including both physics-based and ML-based models

The long-term aim is to reduce the barrier to reproducible hydrologic model evaluation while keeping the code structure aligned with the scientific workflow.

cssi_env.yml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@

name: cssi_evaluation
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - gdal
  - geopandas=0.12.2
  - rasterio=1.3.6
  - fiona
  - rioxarray=0.15.1
  - pyproj
  - libtiff=4.5.0
  - holoviews=1.19.0
  - geoviews=1.11.0
  - param=2.0.1
  - nodejs
  - zarr=2.13.3
  - s3fs=2023.6.0
  - pip:
      - "-e .[notebooks]"
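Assuming a conda-compatible tool is available, the one-time setup with this file would look roughly as follows (micromamba, mentioned in the commit history, works the same way with `micromamba` in place of `conda`; the final import check assumes the installed package exposes a `cssi_evaluation` module):

```shell
# One-time setup: build the pinned environment; the pip section of the
# yml also installs the package in editable mode with notebook extras.
conda env create -f cssi_env.yml
conda activate cssi_evaluation

# Quick sanity check that the package imports
python -c "import cssi_evaluation"
```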

docs/2026_modeling_hackathon.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@

This workflow chart outlines the structure of the 2026 Modeling Hackathon. It shows how participants move from environment setup into the ParFlow and National Water Model (NWM) evaluation paths, and highlights how additional model workflows can be integrated into the same framework in future hackathons.

```mermaid
flowchart TD
    A[Participant enters workshop] --> B[Set up environment]
    B --> B1[HydroData login]
    B --> B2[GitHub access]
    B --> B3[Python environment]
    B --> B4[Test imports or Jupyter example]

    B4 --> C{Which model path?}

    C --> D[Day 1: ParFlow path]
    C --> E[Day 2: NWM path]
    C --> F[Future model path]

    D --> D1[Access observations through HydroData]
    D1 --> D2[Read ParFlow outputs with model-specific tools]
    D2 --> D3[Apply shared utilities]
    D3 --> D4[Run snow or streamflow evaluation]
    D4 --> D5[Review plots and metrics]

    E --> E1[Access reference data from external links]
    E1 --> E2[Read NWM outputs with model-specific tools]
    E2 --> E3[Apply shared utilities]
    E3 --> E4[Run variable-specific evaluation]
    E4 --> E5[Review plots and metrics]

    F --> F1[Create a new model-specific adapter]
    F1 --> F2[Match model outputs to framework format]
    F2 --> F3[Connect observations or reference data]
    F3 --> F4[Reuse shared metrics and plotting]
    F4 --> F5[Add variable-specific methods as needed]

    D5 --> G[Collect feedback]
    E5 --> G
    F5 --> G

    G --> G1[Challenges and incompatibilities]
    G --> G2[Integration opportunities]
    G --> G3[Needed data sources]
    G --> G4[Needed variables and metrics]

    classDef start fill:#d9edf7,stroke:#31708f,color:#111;
    classDef day1 fill:#e7f4e4,stroke:#3d7a3a,color:#111;
    classDef day2 fill:#fff1cc,stroke:#9b7a19,color:#111;
    classDef future fill:#f4e1f5,stroke:#8d4a8f,color:#111;
    classDef feedback fill:#f9e0e6,stroke:#9c3758,color:#111;

    class A,B,B1,B2,B3,B4,C start;
    class D,D1,D2,D3,D4,D5 day1;
    class E,E1,E2,E3,E4,E5 day2;
    class F,F1,F2,F3,F4,F5 future;
    class G,G1,G2,G3,G4 feedback;
```

docs/Parflow-related-readme.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@

# cssi_evaluation

Code used to compare ParFlow simulated output to real-world observations.

Please see `example_workflow.ipynb` for an example of how the functions in this module are intended to be used with each other. Note that the input in this example is a mask generated from a HUC (list) using `subsettools`. This is one method of generating a mask, but the workflow will work with any mask and accompanying bounds within either the conus1 or conus2 domain. This workflow is restricted to comparisons within the conus1 or conus2 domains. If you wish to use the entire conus1 or conus2 domain, set `ij_bounds = None` and use `utils.get_conus_mask("conus2")` to obtain the mask for the full conus2 grid (or likewise for conus1).

This module contains three distinct steps, which will be linked up in the final workflow:
1. Gather site-level observations for a requested domain
2. Extract and format output from ParFlow grid cells that match up with site locations
3. Calculate metrics and produce plots to compare outputs from (1) and (2)
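The shape of these three steps can be illustrated end-to-end on synthetic data. The site IDs, column names, and inline bias metric below are purely illustrative; in the real workflow step 1 pulls observations via HydroData tools and step 2 reads matched ParFlow grid cells:

```python
import pandas as pd

# Step 1: site-level observations for a requested domain (synthetic here)
obs = pd.DataFrame({
    "site_id": ["A", "B", "C"],
    "obs_mean": [10.0, 20.0, 30.0],
})

# Step 2: matched model output at the same site locations (synthetic values)
model = pd.DataFrame({
    "site_id": ["A", "B", "C"],
    "model_mean": [12.0, 18.0, 33.0],
})

# Step 3: join on site and compute a per-site comparison metric
merged = obs.merge(model, on="site_id")
merged["bias"] = merged["model_mean"] - merged["obs_mean"]
print(merged)
```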
Supported variables:
- 'streamflow' (USGS) ('hourly' or 'daily')
- 'water_table_depth' (USGS) ('hourly' or 'daily')
- 'swe' (SNOTEL) ('daily')
- 'latent_heat' (AmeriFlux) ('hourly')

Supported metrics:
- 'r2': Coefficient of determination
- 'spearman_rho': Spearman's rank correlation coefficient
- 'mse': Mean Squared Error
- 'rmse': Root Mean Squared Error
- 'bias': bias
- 'percent_bias': percent bias
- 'abs_rel_bias': absolute relative bias
- 'total_difference': total difference (ParFlow minus observations)
- 'pearson_r': Pearson's R
- 'nse': Nash-Sutcliffe Efficiency
- 'kge': Kling-Gupta Efficiency
- 'bias_from_r': bias from R ([equation 16](https://www.nature.com/articles/srep19401))
- 'condon': Condon category (low/high bias, poor/good shape)

Supported plots (see `plots.py` for full API details):
- `plot_obs_locations()`: Given observation metadata, plot site locations within a mask. Sites are color-coded by site type, if multiple types of sites are present.
- `plot_time_series()`: Plot ParFlow time series against observation time series; one plot per site.
- `plot_compare_scatter()`: Plot a single scatterplot comparing the average values for all sites for ParFlow vs. observations.
- `plot_metric_map()`: Plot sites within a mask, colored by their value on a given comparison metric; one plot per metric.
- `plot_condon_diagram()`: Plot a Condon diagram comparing absolute relative bias to Spearman's rho.
