From 1ae9037a747fe7d4cdbdb3e59b1f1109fd5cc0cb Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Thu, 16 Apr 2026 09:24:46 -0400 Subject: [PATCH 1/8] Repurpose netcdf.rst for input only description Adds both user and developer level documentation. Need to still fix references to this document. --- doc/netcdf.rst | 411 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 403 insertions(+), 8 deletions(-) diff --git a/doc/netcdf.rst b/doc/netcdf.rst index fdcb9af..bfd16d6 100644 --- a/doc/netcdf.rst +++ b/doc/netcdf.rst @@ -1,12 +1,407 @@ -:orphan: +.. toctree:: + :maxdepth: 2 -.. _netcdf: + user_guide + developer_reference + descriptor_format + unit_contract + coordinate_normalization + technical_debt + adding_met_field + test_structure + next_steps -.. seealso:: :ref:`topo_netcdf`. +GeoClaw NetCDF Input System +=========================== -========================== -Using NetCDF output -========================== +This document covers the NetCDF input pipeline introduced in the +``refactor-netcdf-support`` PR. It has two sections: a user guide for +scientists who want to use NetCDF files as input, and a developer +reference for those working on the implementation. -NetCDF output is not currently supported in Clawpack. This is not a suitable -format for AMR style data. +-------------- + +User Guide +---------- + +Topography from NetCDF +~~~~~~~~~~~~~~~~~~~~~~ + +GeoClaw can read topography/bathymetry directly from a NetCDF file +(``topo_type=4``). The file must contain a 2D elevation variable on a +regular latitude/longitude grid. Common sources include GEBCO, ETOPO, +NOAA coastal DEMs, and any file produced by standard GIS tools. + +What GeoClaw handles automatically +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- Longitude conventions: both [-180, 180] and [0, 360] are detected and + normalized at runtime. You do not need to preprocess the file. +- Latitude ordering: both S-to-N and N-to-S are handled correctly. +- Dimension ordering: ``(lat, lon)`` and ``(lon, lat)`` are both + supported. +- Fill values: ``_FillValue`` and ``missing_value`` attributes are + resolved automatically. If a fill value is found within your + simulation domain, GeoClaw will abort with an error -- this is + intentional, as missing bathymetry is a silent correctness hazard. + +What your file must provide +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +- A 2D variable containing elevation in **meters** (positive up, + negative for ocean). If your file uses feet or another unit, use + ``CFNormalizer`` to convert before running (see below). +- 1D coordinate variables for latitude and longitude with recognizable + names (``lat``/``latitude``/``y`` and ``lon``/``longitude``/``x`` are + all detected). Curvilinear (2D coordinate) grids are not currently + supported for topography. + +Registering a NetCDF topo file in setrun.py +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + from clawpack.geoclaw.netcdf_utils import TopoInterrogator + meta = TopoInterrogator('bathy.nc', var_name='z', + crop_bounds=(-100, -60, 15, 35)).interrogate_topo() + rundata.topo_data.topofiles.append([4, 'bathy.nc', meta]) + +That is the only change required in most cases. GeoClaw's Python layer +interrogates the file when you run ``setrun.py`` and writes the +necessary descriptor information into ``topo.data`` automatically. +Variable and coordinate names are auto-detected from CF attributes where +possible; pass ``var_name``, ``lon_name``, or ``lat_name`` explicitly to +``TopoInterrogator`` if auto-detection fails. + +Domain subsetting (crop) +^^^^^^^^^^^^^^^^^^^^^^^^ + +If your NetCDF file covers a larger area than your simulation domain +(common with global or regional datasets), you can crop at read time +without creating a smaller file: + +.. code:: python + + topo_data.topofiles.append([4, 'gebco_global.nc', + {'crop_bounds': [-100, -80, 20, 35]}]) + +Only the subset is read into memory at runtime. The full file is never +loaded. + +Checking CF compliance +^^^^^^^^^^^^^^^^^^^^^^ + +If you are unsure whether your file will be read correctly, the +``CFNormalizer`` utility can inspect and repair common issues: + +.. code:: python + + from clawpack.geoclaw.netcdf_utils import CFNormalizer + + cf = CFNormalizer('path/to/bathymetry.nc') + cf.report() # prints any issues found + cf.normalize('path/to/bathymetry_cf.nc') # writes a corrected copy + +``CFNormalizer`` adds missing ``standard_name``, ``axis``, and ``units`` +attributes, and resolves ``_FillValue``/``missing_value`` conflicts. It +does not resample or reproject. + +-------------- + +Storm surge met forcing from NetCDF +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For storm surge simulations using full gridded met forcing (wind and +pressure fields), GeoClaw can read directly from a NetCDF file. This +replaces the need to convert to OWI ASCII format. + +Supported source formats: + +- **ERA5** (ECMWF reanalysis): detected automatically from CF attributes +- **NWS13** (OWI NetCDF): detected automatically +- **Generic CF-compliant NetCDF**: requires variable name mapping if + names are non-standard + +Not yet supported: raw WRF output (requires preprocessing due to +curvilinear grid and string-encoded time axis). + +Required variables and units +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +GeoClaw expects the following in the NetCDF file **after any unit +conversion** (conversion happens automatically during Python +preprocessing): + +============================= ================================ +Variable Unit +============================= ================================ +Wind (u-component, eastward) m/s +Wind (v-component, northward) m/s +Surface pressure Pa +Time seconds from user-defined offset +============================= ================================ + +If your file uses hPa/mbar for pressure or knots for wind, +``MetInterrogator`` will convert automatically. + +Registering a NetCDF storm file +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In ``setrun.py``: + +.. code:: python + + surge_data.storm_specification_type = 'data' + surge_data.storm_file = 'isaac.storm' + +Then create the storm descriptor file, *e.g.* ``isaac.storm``, using the +Python API: + +.. code:: python + + from clawpack.geoclaw.surge.storm import Storm + import numpy as np + + storm = Storm() + storm.time_offset = np.datetime64('2012-08-29') + storm.file_format = 'netcdf' + storm.file_paths = ['path/to/forcing.nc'] + storm.write('isaac.storm', file_format='data') + +If your variable or dimension names are non-standard, provide a mapping: + +.. code:: python + + storm.write('isaac.storm', file_format='data', + dim_mapping={'t': 'valid_time'}, + var_mapping={'wind_u': 'u10', 'wind_v': 'v10', + 'pressure': 'msl'}) + +Time handling +^^^^^^^^^^^^^ + +GeoClaw works in seconds from a user-defined offset (typically landfall +or storm genesis). All CF time decoding -- including calendar handling +and unit conversion from hours/days -- is done in Python. The Fortran +runtime sees only seconds from offset. + +Set the offset when constructing the storm: + +.. code:: python + + storm.time_offset = np.datetime64('2012-08-29T00:00') # landfall time + +-------------- + +Developer Reference +------------------- + +Architecture overview +~~~~~~~~~~~~~~~~~~~~~ + +The system has a strict Python/Fortran split: + +**Python** handles: file interrogation, CF attribute parsing, coordinate +convention detection, fill value resolution, unit conversion, time +decoding, crop bound validation, and descriptor writing. + +**Fortran** handles: opening the NetCDF file at runtime using +information from the descriptor, index arithmetic for coordinate +normalization (no data copies), domain subsetting via +``start``/``count`` arguments to ``nf90_get_var``, and time-slice reads +for met forcing. + +Fortran assumes the unit contract from ``GEOCLAW_NETCDF_UNITS`` in +``units.py`` without checking. Python enforces it. + +Class hierarchy (``netcdf_utils.py``) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + NetCDFInterrogator + - open file (xarray, Dask-lazy) + - discover coordinate variables by name heuristics + CF standard_name + - detect lon convention, lat order, dim order + - resolve fill value (_FillValue wins over missing_value per CF spec) + - validate crop bounds against file extent + - output: NetCDFDescriptor dataclass + + TopoInterrogator(NetCDFInterrogator) + - detect fill values within crop region (hard error) + - verify and convert units to contract + - no multi-file coverage logic (Fortran handles compositing) + + MetInterrogator(NetCDFInterrogator) + - check wind_u, wind_v, pressure present and on same grid/time axis + - convert units to contract + - decode CF time to seconds from offset + - detect ensemble/member dimensions (hard error if non-singleton) + - partial domain coverage is allowed (Fortran fills edges) + + DescriptorWriter + - topo: writes key=value lines for inline inclusion in topo.data, using a + blank line to terminate the Fortran read loop + - met: writes &file_info namelist block + repeated &variable_info + blocks for *.storm file body + + CFNormalizer + - adds/fixes CF attributes without modifying data + - idempotent + +Descriptor format +~~~~~~~~~~~~~~~~~ + +Topo (lines in topo.data after topo_type) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + var_name = z + lon_name = longitude + lat_name = latitude + lon_convention = 180 + lat_order = S_to_N + dim_order = lat,lon + fill_value = -9999.0 + fill_action = abort + crop_bounds = -100.0 -80.0 20.0 35.0 + +``fill_action = abort`` is the only supported value for topography. +``crop_bounds`` is omitted if no crop is specified. + +Met forcing (body of \*.storm after format header) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + # format: netcdf + &file_info + source_file = /path/to/forcing.nc + lon_name = longitude + lat_name = latitude + time_name = valid_time + dim_order = time,lat,lon + lon_convention = 360 + lat_order = S_to_N + fill_value = -9999.0 + fill_action = warn + time_offset = 0.0 + crop_bounds = -100.0 -80.0 20.0 35.0 + / + &variable_info var_name=u10 geoclaw_role=wind_u / + &variable_info var_name=v10 geoclaw_role=wind_v / + &variable_info var_name=msl geoclaw_role=pressure / + +The ``&variable_info`` blocks use a manually parsed key=value format +(not true repeated Fortran namelists) for compiler portability. Fortran +reads them in a loop until EOF. Adding new roles (precipitation, +friction) requires no change to the Fortran parser. + +Unit contract +~~~~~~~~~~~~~ + +Defined in ``units.py``: + +.. code:: python + + GEOCLAW_NETCDF_UNITS = { + "topo": "m", + "wind_u": "m/s", + "wind_v": "m/s", + "pressure": "Pa", + "time": "s", + } + +All conversion happens in Python before the descriptor is written. +Fortran trusts the descriptor and never checks units. If you add a new +``geoclaw_role``, add its contract unit here first. + +Fortran coordinate normalization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Fortran applies coordinate normalization via index arithmetic only -- no +arrays are duplicated. The descriptor provides ``lon_convention`` and +``lat_order``; the reader uses these to compute correct indices when +calling ``nf90_get_var``. For crop bounds, ``start`` and ``count`` are +computed from the coordinate arrays at runtime. + +All ``nf90_*`` calls go through a checked interface in +``topo_module.f90`` that prints a meaningful diagnostic and stops +cleanly on failure. This is important for SLURM batch jobs where a +silent Fortran ``STOP`` produces no useful output. + +Known technical debt +~~~~~~~~~~~~~~~~~~~~ + +- ``util.get_netcdf_names`` and the coordinate/variable discovery logic + in ``NetCDFInterrogator`` are parallel implementations. They should + be consolidated -- ``NetCDFInterrogator`` should be the single source + and ``util.get_netcdf_names`` should delegate to it. This is deferred + to the surge module refactor. +- ``Storm.write(file_format="data")`` does not yet call + ``DescriptorWriter`` directly for NetCDF storm entries. Currently the + tests write descriptors manually. The integration should happen as + part of the surge refactor. +- WRF raw output requires ``MetPreprocessor`` for string-time decoding + (``Times`` character array, not a numeric CF time variable) and + curvilinear grid handling (``XLAT``/``XLONG`` are 2D time-varying + arrays). A skip-marked test stub documents the gap in + ``test_storm.py``. + +Adding a new met forcing field (e.g. precipitation) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +1. Add the role and unit to ``GEOCLAW_NETCDF_UNITS`` in ``units.py`` +2. Add detection logic to ``MetInterrogator`` for the new variable +3. ``DescriptorWriter`` requires no change -- new ``&variable_info`` + blocks are written automatically for any role the interrogator + returns +4. Add the Fortran reader logic in the storm NetCDF module to consume + the new ``geoclaw_role`` value +5. Add unit tests in ``tests/netcdf/test_met_interrogator.py`` +6. Add a regression test in the appropriate storm surge example + +Test structure +~~~~~~~~~~~~~~ + +:: + + tests/netcdf/ unit tests for netcdf_utils.py + conftest.py in-memory NetCDF fixtures + test_base_interrogator.py + test_topo_interrogator.py + test_met_interrogator.py + test_descriptor_writer.py + test_cf_normalizer.py + + tests/test_storm.py extended for ERA5 and NWS13 variants + + examples/tsunami/bowl-slosh-netcdf/ coordinate variant regression tests + examples/tsunami/chile2010/ topotools write path + coord variants + examples/storm-surge/isaac/ ERA5 and NWS13 met forcing regression + +All test NetCDF files are generated in-memory or in ``tmp_path`` -- no +binary files are committed to the repository. + +-------------- + +Next steps +---------- + +- **Surge module refactor**: consolidate ``util.get_netcdf_names`` into + ``NetCDFInterrogator``, wire ``Storm.write`` to use + ``DescriptorWriter``, clean up parallel discovery paths in + ``storm.py`` +- **WRF support**: implement ``MetPreprocessor`` to handle string-time + axis and curvilinear grid; the skip-marked test stub in + ``test_storm.py`` documents the expected interface +- **Precipitation field**: reserved in ``GEOCLAW_NETCDF_UNITS``, add + ``MetInterrogator`` detection and Fortran consumer +- **Friction field**: same pattern as precipitation +- **CLI tool**: expose ``CFNormalizer`` as a command-line utility for + users who want to check or repair their NetCDF files before running + GeoClaw +- **Duplicate code audit**: other places in the GeoClaw codebase that + do ad-hoc NetCDF variable discovery should be identified and migrated + to ``NetCDFInterrogator`` From d28de7035dbe971e49b43b43824bfdc25a29da04 Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Thu, 16 Apr 2026 17:07:51 -0400 Subject: [PATCH 2/8] Add docs for new netcdf input for GeoClaw --- doc/netcdf.rst | 33 +++++++++++++++++++++++++-------- doc/topo.rst | 48 +++++++++++++++++++++++------------------------- 2 files changed, 48 insertions(+), 33 deletions(-) diff --git a/doc/netcdf.rst b/doc/netcdf.rst index bfd16d6..7ef3bb6 100644 --- a/doc/netcdf.rst +++ b/doc/netcdf.rst @@ -11,12 +11,14 @@ test_structure next_steps +.. _netcdf_input: + GeoClaw NetCDF Input System =========================== This document covers the NetCDF input pipeline introduced in the -``refactor-netcdf-support`` PR. It has two sections: a user guide for -scientists who want to use NetCDF files as input, and a developer +``refactor-netcdf-support`` PR (VERSION ?5.15?). It has two sections: a user +guide for scientists who want to use NetCDF files as input, and a developer reference for those working on the implementation. -------------- @@ -59,6 +61,20 @@ What your file must provide Registering a NetCDF topo file in setrun.py ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +If your file meets the above requirements and either has the variable `z` or +only one 2D variable in the file, you can simply use the following in +``setrun.py``: + +.. code:: python + + rundata.topo_data.topofiles.append([4, 'bathy.nc']) + +This matches with what is expected for backwards compatibility. + +If your file meets the above requirements and has a non-standard variable name +or you want to specify crop bounds you can use the Python API to interrogate the +file and write a descriptor: + .. code:: python from clawpack.geoclaw.netcdf_utils import TopoInterrogator @@ -67,11 +83,12 @@ Registering a NetCDF topo file in setrun.py rundata.topo_data.topofiles.append([4, 'bathy.nc', meta]) That is the only change required in most cases. GeoClaw's Python layer -interrogates the file when you run ``setrun.py`` and writes the -necessary descriptor information into ``topo.data`` automatically. -Variable and coordinate names are auto-detected from CF attributes where -possible; pass ``var_name``, ``lon_name``, or ``lat_name`` explicitly to -``TopoInterrogator`` if auto-detection fails. +interrogates the file when you run ``setrun.py`` and writes the necessary +descriptor information into ``topo.data`` automatically. Variable and coordinate +names are auto-detected from CF attributes where possible; pass ``var_name``, +``lon_name``, or ``lat_name`` explicitly to ``TopoInterrogator`` if +auto-detection fails. This also should handle multiple different coordinate +layouts. Domain subsetting (crop) ^^^^^^^^^^^^^^^^^^^^^^^^ @@ -86,7 +103,7 @@ without creating a smaller file: {'crop_bounds': [-100, -80, 20, 35]}]) Only the subset is read into memory at runtime. The full file is never -loaded. +loaded. Note this is the same as using the Checking CF compliance ^^^^^^^^^^^^^^^^^^^^^^ diff --git a/doc/topo.rst b/doc/topo.rst index 42d4853..22a15b8 100644 --- a/doc/topo.rst +++ b/doc/topo.rst @@ -12,12 +12,13 @@ Topography data - :ref:`topo_order` - :ref:`tsunamidata` - :ref:`dtopo` + - :ref:`g_input` The :ref:`geoclaw` software for flow over topography requires at least one topo file to be input, see :ref:`setrun_geoclaw`. -Currently topo files are restricted to three possible formats as ASCII files, or -NetCDF files are also allowed. +Currently topo files are restricted to three possible formats as ASCII files, +and netCDF files. In the descriptions below it is assumed that the topo file gives the elevation of the topography (relative to some reference level) as a value of @@ -169,14 +170,8 @@ The recognized topotypes are: **topotype = 4** - This file type is not ASCII but rather in a NetCDF4 format supported by the - `CF MetaData conventions (v. 1.6) `_. Files - that conform to this standard can be read in by GeoClaw. The `topotools` - module also has support for reading and writing (including therefore - conversion) of these types of bathymetry files (see :ref:`topo_netcdf` - below). To use this functionality - you will need to add *-DNETCDF* to the *FFLAGS* variable either by the - command line or in the Makefile. + Read in netCDF formatted topography data. See :ref:`netcdf_input` for + NetCDF topography format information. The Fortran code will recognize headers for *topotype* 2 or 3 that have the labels first and then the parameter values. @@ -214,21 +209,24 @@ Several on-line databases are available for topograpy, see Some Python tools for working with topography files are available, see :ref:`topotools`. -.. _topo_netcdf: - -NetCDF format -^^^^^^^^^^^^^ - -Topofiles can be read in netCDF format, either from local `.nc` files or -from some online databases that provide netCDF servers, e.g. the -`NOAA THREDDS server `_. -Use the -`topotools.read_netcdf `_ -function. Note that this also allows reading in only a subset of the data, -both limiting the extent and the resolution, e.g. by sampling every other -point (by setting `coarsen=2`). This is particularly useful if you only want -a subset of a huge online netCDF file (e.g. coastal DEMs at 1/3 arcsecond -resolution are typically several gigabytes). +.. _noaa_thredds: + +NOAA THREDDS server +^^^^^^^^^^^^^^^^^^^^^ + +The `NOAA THREDDS server +`_ +can be used to access a variety of topography data sets, including the etopo1 +global data set at 1 arcminute resolution and the etopo2 global data set at 2 +arcminute resolution. These are available in netCDF format and can be read +directly into GeoClaw. As a convenience, you can use the `topotools.read_netcdf +`_ function. Note +that this also allows reading in only a subset of the data, both limiting the +extent and the resolution, e.g. by sampling every other point (by setting +`coarsen=2`). This is particularly useful if you only want a subset of a huge +online netCDF file (e.g. coastal DEMs at 1/3 arcsecond resolution are typically +several gigabytes). See :ref:`netcdf_input` for more details on working with +netCDF files. The dictionary `topotools.remote_topo_urls` contains some useful URLs for etopo and a few other NOAA THREDDS datasets. This allows reading etopo From 68acd0a45fe72d4b73d9934093aec06e33979ea7 Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Thu, 16 Apr 2026 17:08:06 -0400 Subject: [PATCH 3/8] Remove mention of NetCDF output for AMR --- doc/output_styles.rst | 13 ------------- 1 file changed, 13 deletions(-) diff --git a/doc/output_styles.rst b/doc/output_styles.rst index fefb5f8..ac6cae8 100644 --- a/doc/output_styles.rst +++ b/doc/output_styles.rst @@ -134,8 +134,6 @@ Each patch has a unique `grid_number` that usually isn't needed for visualization purposes. - - .. _output_binary: Raw binary output data formats @@ -163,15 +161,6 @@ Unlike the ASCII data files, the binary output files contain ghost cells as well as the interior cells (since a contiguous block of memory is dumped for each patch with a single `write` statement). - -.. _output_netcdf: - -NetCDF output data format ------------------------------- - -NetCDF output is not currently supported in Clawpack. This is not a suitable -format for AMR style data. - .. _output_aux: Output of aux arrays @@ -185,5 +174,3 @@ have the same format as the `q` data, as specifed by `output_format`. Set `output_aux_onlyonce` to `True` if the `aux` arrays do not change with time and you only want to output these arrays once. - - From 893bce5ab30a5a12138399b7c973640dd7c3dd29 Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Fri, 17 Apr 2026 19:27:22 -0400 Subject: [PATCH 4/8] Add changes needed for met forcing Some of the less common cases are now handled better with auto-detection of variables and better support for time offsets. --- doc/netcdf.rst | 160 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 128 insertions(+), 32 deletions(-) diff --git a/doc/netcdf.rst b/doc/netcdf.rst index 7ef3bb6..3050123 100644 --- a/doc/netcdf.rst +++ b/doc/netcdf.rst @@ -171,43 +171,136 @@ In ``setrun.py``: surge_data.storm_specification_type = 'data' surge_data.storm_file = 'isaac.storm' -Then create the storm descriptor file, *e.g.* ``isaac.storm``, using the -Python API: +Then create the storm descriptor file, *e.g.* ``isaac.storm``. GeoClaw +uses a two-stage discovery process: standard variable names are found +automatically from CF ``axis`` / ``standard_name`` attributes and +built-in fallback lists; you only need to supply ``var_mapping`` for +roles whose names are non-standard. Coordinate names (``lon``, ``lat``, +``time``) are always discovered automatically and never need to be +specified. + +**Case 1 — all standard names (ERA5 and similar):** + +ERA5 variable names (``u10``, ``v10``, ``msl``) are in the built-in +fallback lists and are found automatically. No ``var_mapping`` is +required: .. code:: python - from clawpack.geoclaw.surge.storm import Storm import numpy as np + from clawpack.geoclaw.surge.storm import Storm storm = Storm() storm.time_offset = np.datetime64('2012-08-29') storm.file_format = 'netcdf' - storm.file_paths = ['path/to/forcing.nc'] + storm.file_paths = ['path/to/era5_forcing.nc'] storm.write('isaac.storm', file_format='data') -If your variable or dimension names are non-standard, provide a mapping: +**Case 2 — mixed: standard wind names, non-standard pressure:** + +A common pattern when a file has multiple pressure fields (e.g. mean +sea-level and surface pressure) or a non-standard pressure variable +name. Specify only the roles that cannot be discovered automatically; +the rest are still found from the fallback lists: .. code:: python storm.write('isaac.storm', file_format='data', - dim_mapping={'t': 'valid_time'}, - var_mapping={'wind_u': 'u10', 'wind_v': 'v10', - 'pressure': 'msl'}) + var_mapping={'pressure': 'prmsl'}) -Time handling -^^^^^^^^^^^^^ +Any name supplied in ``var_mapping`` is validated against the variables +actually present in the file before the descriptor is written, so a +typo raises an informative error immediately rather than producing +incorrect output silently. + +**Case 3 — all non-standard names (e.g. NWS13/OWI-NetCDF):** + +Supply all three roles explicitly when none of the variable names match +the built-in fallback lists: + +.. code:: python + + storm = Storm() + storm.time_offset = np.datetime64('2012-08-29') + storm.file_format = 'nws13' + storm.file_paths = ['path/to/nws13_forcing.nc'] + storm.write('isaac.storm', file_format='data', + var_mapping={'wind_u': 'uwnd', + 'wind_v': 'vwnd', + 'pressure': 'press'}) -GeoClaw works in seconds from a user-defined offset (typically landfall -or storm genesis). All CF time decoding -- including calendar handling -and unit conversion from hours/days -- is done in Python. The Fortran -runtime sees only seconds from offset. +**Advanced: pre-built MetInterrogator:** -Set the offset when constructing the storm: +If you need direct control over CF validation, unit checking, or +lon/lat convention detection before the descriptor is written, you can +construct a :class:`~clawpack.geoclaw.netcdf_utils.MetInterrogator` +explicitly and pass it via ``met_interrogator``. When a pre-built +interrogator is supplied, auto-discovery and ``var_mapping`` validation +are bypassed entirely: + +.. code:: python + + from clawpack.geoclaw.netcdf_utils import MetInterrogator + + mi = MetInterrogator('path/to/forcing.nc', + variable_map={'wind_u': 'u10', + 'wind_v': 'v10', + 'pressure': 'msl'}) + storm.write('isaac.storm', file_format='data', met_interrogator=mi) + +.. note:: + + ``dim_mapping`` is accepted by ``write()`` / ``write_data()`` for + backwards compatibility but has no effect -- coordinate names are + always discovered automatically via CF conventions. + +.. note:: + + ``storm.window`` (ramp width and application domain) and + ``MetInterrogator``'s ``crop_bounds`` (read-time spatial subset of + the NetCDF file) are independent. Setting one does not affect the + other. + + +Time handling +^^^^^^^^^^^^^^^ + +GeoClaw works in seconds relative to a user-defined reference time +(typically landfall or storm genesis). Set it when constructing the +storm: .. code:: python storm.time_offset = np.datetime64('2012-08-29T00:00') # landfall time +Python computes ``nc_time_offset`` — the elapsed seconds from +``storm.time_offset`` to the first record in the file — and writes it +to the descriptor. This value is independent of how the file encodes +time internally (Unix epoch, local epoch, hours-since-reference, etc.). + +Fortran converts raw time values using: + +.. code:: + + storm_time[i] = raw[i] − raw[0] + nc_time_offset + +Subtracting ``raw[0]`` converts any absolute encoding to +elapsed-since-first-record; adding ``nc_time_offset`` then anchors that +to ``storm.time_offset``. No unit conversion (hours→seconds, +days→seconds) is performed in Fortran: the descriptor stores +``nc_time_offset`` in seconds and the raw values are read as stored. +``write_data`` passes ``time_reference=storm.time_offset`` to +``MetInterrogator`` automatically; no additional user action is needed. + +Longitude convention +^^^^^^^^^^^^^^^^^^^^^^ + +Longitude convention ([0, 360] vs [-180, 180]) is detected and +normalized automatically by ``MetInterrogator``. ERA5 files that store +longitude in [0, 360] do not need to be preprocessed to [-180, 180] +before use. The detected convention is written to the descriptor and +Fortran applies index normalization at runtime without copying arrays. + -------------- Developer Reference @@ -219,14 +312,16 @@ Architecture overview The system has a strict Python/Fortran split: **Python** handles: file interrogation, CF attribute parsing, coordinate -convention detection, fill value resolution, unit conversion, time -decoding, crop bound validation, and descriptor writing. +convention detection, fill value resolution, unit conversion, +``nc_time_offset`` computation (elapsed seconds from ``storm.time_offset`` +to the first record), crop bound validation, and descriptor writing. **Fortran** handles: opening the NetCDF file at runtime using information from the descriptor, index arithmetic for coordinate normalization (no data copies), domain subsetting via -``start``/``count`` arguments to ``nf90_get_var``, and time-slice reads -for met forcing. +``start``/``count`` arguments to ``nf90_get_var``, time-slice reads +for met forcing, and converting raw time values to seconds from +``storm.time_offset`` via ``raw[i] − raw[0] + nc_time_offset``. Fortran assumes the unit contract from ``GEOCLAW_NETCDF_UNITS`` in ``units.py`` without checking. Python enforces it. @@ -351,21 +446,21 @@ silent Fortran ``STOP`` produces no useful output. Known technical debt ~~~~~~~~~~~~~~~~~~~~ -- ``util.get_netcdf_names`` and the coordinate/variable discovery logic - in ``NetCDFInterrogator`` are parallel implementations. They should - be consolidated -- ``NetCDFInterrogator`` should be the single source - and ``util.get_netcdf_names`` should delegate to it. This is deferred - to the surge module refactor. -- ``Storm.write(file_format="data")`` does not yet call - ``DescriptorWriter`` directly for NetCDF storm entries. Currently the - tests write descriptors manually. The integration should happen as - part of the surge refactor. +- ``util.get_netcdf_names`` (in ``util.py``) remains a parallel + implementation of variable name discovery alongside + ``NetCDFInterrogator``. ``Storm.write_data`` now uses + ``MetInterrogator`` for NetCDF met forcing (format 2), but falls back + to ``util.get_netcdf_names`` for variable name discovery when no + explicit ``var_mapping`` is provided. Full consolidation -- making + ``util.get_netcdf_names`` delegate to ``NetCDFInterrogator`` and + removing the duplicate -- remains a TODO. - WRF raw output requires ``MetPreprocessor`` for string-time decoding (``Times`` character array, not a numeric CF time variable) and curvilinear grid handling (``XLAT``/``XLONG`` are 2D time-varying arrays). A skip-marked test stub documents the gap in ``test_storm.py``. + Adding a new met forcing field (e.g. precipitation) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -406,10 +501,11 @@ binary files are committed to the repository. Next steps ---------- -- **Surge module refactor**: consolidate ``util.get_netcdf_names`` into - ``NetCDFInterrogator``, wire ``Storm.write`` to use - ``DescriptorWriter``, clean up parallel discovery paths in - ``storm.py`` +- **Consolidate util.get_netcdf_names**: ``Storm.write_data`` now + calls ``MetInterrogator`` and ``DescriptorWriter`` for NetCDF met + forcing; the remaining step is to make ``util.get_netcdf_names`` + delegate to ``NetCDFInterrogator`` and remove the duplicate discovery + logic in ``storm.py`` - **WRF support**: implement ``MetPreprocessor`` to handle string-time axis and curvilinear grid; the skip-marked test stub in ``test_storm.py`` documents the expected interface From 8bb4891133820799acd9195e1e0040f651b19a8f Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Thu, 7 May 2026 10:44:44 -0400 Subject: [PATCH 5/8] Update netcdf docs for topography var name change --- doc/netcdf.rst | 40 ++++++++++++++++++++++++++++------------ 1 file changed, 28 insertions(+), 12 deletions(-) diff --git a/doc/netcdf.rst b/doc/netcdf.rst index 3050123..3e989a3 100644 --- a/doc/netcdf.rst +++ b/doc/netcdf.rst @@ -61,34 +61,50 @@ What your file must provide Registering a NetCDF topo file in setrun.py ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -If your file meets the above requirements and either has the variable `z` or -only one 2D variable in the file, you can simply use the following in +If your file meets the above requirements, you can simply use the following in ``setrun.py``: .. code:: python - + rundata.topo_data.topofiles.append([4, 'bathy.nc']) This matches with what is expected for backwards compatibility. -If your file meets the above requirements and has a non-standard variable name -or you want to specify crop bounds you can use the Python API to interrogate the -file and write a descriptor: +If you want to specify crop bounds, or if GeoClaw cannot find the elevation +variable automatically, use the Python API to interrogate the file and write a +descriptor: .. code:: python from clawpack.geoclaw.netcdf_utils import TopoInterrogator - meta = TopoInterrogator('bathy.nc', var_name='z', + meta = TopoInterrogator('bathy.nc', crop_bounds=(-100, -60, 15, 35)).interrogate_topo() rundata.topo_data.topofiles.append([4, 'bathy.nc', meta]) That is the only change required in most cases. GeoClaw's Python layer interrogates the file when you run ``setrun.py`` and writes the necessary -descriptor information into ``topo.data`` automatically. Variable and coordinate -names are auto-detected from CF attributes where possible; pass ``var_name``, -``lon_name``, or ``lat_name`` explicitly to ``TopoInterrogator`` if -auto-detection fails. This also should handle multiple different coordinate -layouts. +descriptor information into ``topo.data`` automatically. + +The elevation variable is found automatically using a two-step search: + +1. **CF** ``standard_name`` **attribute** — variables whose ``standard_name`` + is one of ``surface_altitude``, ``height_above_mean_sea_level``, + ``height_above_reference_ellipsoid``, ``bedrock_altitude``, ``altitude``, + ``height``, or ``sea_floor_depth_below_geoid`` are matched first. +2. **Common variable names** — if no CF match is found, the variable name is + checked against a built-in list that includes ``z``, ``elevation``, + ``topo``, ``height``, ``altitude``, ``depth``, ``dem``, ``bathymetry``, + ``bathy``, ``Band1``, and several capitalisation variants. + +Pass ``var_name`` explicitly only when none of the names above match your file, +or when the file contains multiple variables that would both match and you need +to disambiguate:: + + meta = TopoInterrogator('bathy.nc', var_name='my_elevation', + crop_bounds=(-100, -60, 15, 35)).interrogate_topo() + +Coordinate names (``lon``/``latitude``/``x`` etc.) and dimension ordering are +always discovered automatically and never need to be specified. Domain subsetting (crop) ^^^^^^^^^^^^^^^^^^^^^^^^ From 917e494a274544f1250a45a068f287fed71936d9 Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Thu, 7 May 2026 23:15:49 -0400 Subject: [PATCH 6/8] Add .gitignore Mostly just ignores build and doctree --- .gitignore | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 .gitignore diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..dd68408 --- /dev/null +++ b/.gitignore @@ -0,0 +1,65 @@ +# ----------------------------------------------------------------------------- +# Sphinx build output +# ----------------------------------------------------------------------------- +_build1 + +# Generated API docs (if autogenerated) +api/generated/ + +# ----------------------------------------------------------------------------- +# Python +# ----------------------------------------------------------------------------- +__pycache__/ +*.py[cod] +*$py.class + +.pytest_cache/ +.mypy_cache/ +.ruff_cache/ +.coverage +coverage.xml +htmlcov/ + +# ----------------------------------------------------------------------------- +# Virtual environments +# ----------------------------------------------------------------------------- +.venv/ +venv/ +env/ + +# ----------------------------------------------------------------------------- +# Jupyter +# ----------------------------------------------------------------------------- +.ipynb_checkpoints/ + +# ----------------------------------------------------------------------------- +# Documentation tooling caches +# ----------------------------------------------------------------------------- +.doctrees/ + +# If using sphinx-gallery +auto_examples/ + +# If using jupyter-cache / myst-nb +.jupyter_cache/ + +# ----------------------------------------------------------------------------- +# OS / editor files +# ----------------------------------------------------------------------------- +.DS_Store +Thumbs.db + +.vscode/ +.idea/ + +# Vim +*.swp +*.swo + +# ----------------------------------------------------------------------------- +# Package managers / environments +# ----------------------------------------------------------------------------- +# uv +.uv/ +.pixi/ +conda-meta/ From 4788a81a1869210b67d51fba0ee7e3feefbeec9a Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Thu, 7 May 2026 23:17:56 -0400 Subject: [PATCH 7/8] Add docs for lon wrapping support in netcdf --- doc/netcdf.rst | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/doc/netcdf.rst b/doc/netcdf.rst index 3e989a3..bcea652 100644 --- a/doc/netcdf.rst +++ b/doc/netcdf.rst @@ -71,18 +71,19 @@ If your file meets the above requirements, you can simply use the following in This matches with what is expected for backwards compatibility. If you want to specify crop bounds, or if GeoClaw cannot find the elevation -variable automatically, use the Python API to interrogate the file and write a -descriptor: +variable automatically, use ``topo_entries()`` to interrogate the file: .. code:: python from clawpack.geoclaw.netcdf_utils import TopoInterrogator - meta = TopoInterrogator('bathy.nc', - crop_bounds=(-100, -60, 15, 35)).interrogate_topo() - rundata.topo_data.topofiles.append([4, 'bathy.nc', meta]) - -That is the only change required in most cases. GeoClaw's Python layer -interrogates the file when you run ``setrun.py`` and writes the necessary + with TopoInterrogator('bathy.nc', crop_bounds=(-100, -60, 15, 35)) as intr: + rundata.topo_data.topofiles.extend(intr.topo_entries()) + +``topo_entries()`` returns a list of ``[4, path, TopoMetadata]`` entries ready +to pass directly to ``topofiles``. In the common case the list has one entry, +but it may contain two when the crop region straddles the file's longitude cut +point (e.g. a global file cropped across the dateline) — ``extend`` handles +both cases without extra logic. GeoClaw's Python layer writes the necessary descriptor information into ``topo.data`` automatically. The elevation variable is found automatically using a two-step search: @@ -110,16 +111,18 @@ Domain subsetting (crop) ^^^^^^^^^^^^^^^^^^^^^^^^ If your NetCDF file covers a larger area than your simulation domain -(common with global or regional datasets), you can crop at read time -without creating a smaller file: +(common with global or regional datasets), pass ``crop_bounds`` to +``TopoInterrogator`` and use ``topo_entries()`` to register the file: .. code:: python - topo_data.topofiles.append([4, 'gebco_global.nc', - {'crop_bounds': [-100, -80, 20, 35]}]) + from clawpack.geoclaw.netcdf_utils import TopoInterrogator + with TopoInterrogator('gebco_global.nc', + crop_bounds=(-100, -80, 20, 35)) as intr: + rundata.topo_data.topofiles.extend(intr.topo_entries()) Only the subset is read into memory at runtime. The full file is never -loaded. Note this is the same as using the +loaded. Checking CF compliance ^^^^^^^^^^^^^^^^^^^^^^ @@ -388,7 +391,7 @@ Topo (lines in topo.data after topo_type) var_name = z lon_name = longitude lat_name = latitude - lon_convention = 180 + lon_offset = 0.0 lat_order = S_to_N dim_order = lat,lon fill_value = -9999.0 From 3c2a1470a94502aa4c093e79b6d13cb5c5abf7ee Mon Sep 17 00:00:00 2001 From: Kyle Mandli Date: Fri, 22 May 2026 14:22:11 -0400 Subject: [PATCH 8/8] =?UTF-8?q?Rename=20Interrogator=E2=86=92Inspector=20i?= =?UTF-8?q?n=20NetCDF=20docs;=20add=20API=20module=20page=20and=20storm=5F?= =?UTF-8?q?time=5Fscale?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - netcdf.rst: rename all TopoInterrogator/MetInterrogator/NetCDFInterrogator references to TopoInspector/MetInspector/NetCDFInspector, rename met_interrogator parameter to met_inspector, rename test file stubs to match (test_base_inspector.py etc.) - netcdf_utils_module.rst: new Sphinx automodule page covering NetCDFInspector, TopoInspector, MetInspector, CFNormalizer, DescriptorWriter - geoclaw.rst: add netcdf and netcdf_utils_module to the toctree so both the user guide and API reference are reachable from the GeoClaw index - setrun_geoclaw.rst: document the new storm_time_scale parameter Co-Authored-By: Claude Sonnet 4.6 --- doc/geoclaw.rst | 2 + doc/netcdf.rst | 84 ++++++++++++++++++------------------- doc/netcdf_utils_module.rst | 39 +++++++++++++++++ doc/setrun_geoclaw.rst | 11 ++++- 4 files changed, 93 insertions(+), 43 deletions(-) create mode 100644 doc/netcdf_utils_module.rst diff --git a/doc/geoclaw.rst b/doc/geoclaw.rst index c3ee6d0..4dabffe 100644 --- a/doc/geoclaw.rst +++ b/doc/geoclaw.rst @@ -69,6 +69,8 @@ More will eventually appear in the :ref:`apps`. nearshore_interp tsunamidata surgedata + netcdf + netcdf_utils_module marching_front force_dry sphere_source diff --git a/doc/netcdf.rst b/doc/netcdf.rst index bcea652..173fd1c 100644 --- a/doc/netcdf.rst +++ b/doc/netcdf.rst @@ -71,13 +71,13 @@ If your file meets the above requirements, you can simply use the following in This matches with what is expected for backwards compatibility. If you want to specify crop bounds, or if GeoClaw cannot find the elevation -variable automatically, use ``topo_entries()`` to interrogate the file: +variable automatically, use ``topo_entries()`` to inspect the file: .. code:: python - from clawpack.geoclaw.netcdf_utils import TopoInterrogator - with TopoInterrogator('bathy.nc', crop_bounds=(-100, -60, 15, 35)) as intr: - rundata.topo_data.topofiles.extend(intr.topo_entries()) + from clawpack.geoclaw.netcdf_utils import TopoInspector + with TopoInspector('bathy.nc', crop_bounds=(-100, -60, 15, 35)) as insp: + rundata.topo_data.topofiles.extend(insp.topo_entries()) ``topo_entries()`` returns a list of ``[4, path, TopoMetadata]`` entries ready to pass directly to ``topofiles``. In the common case the list has one entry, @@ -101,8 +101,8 @@ Pass ``var_name`` explicitly only when none of the names above match your file, or when the file contains multiple variables that would both match and you need to disambiguate:: - meta = TopoInterrogator('bathy.nc', var_name='my_elevation', - crop_bounds=(-100, -60, 15, 35)).interrogate_topo() + meta = TopoInspector('bathy.nc', var_name='my_elevation', + crop_bounds=(-100, -60, 15, 35)).inspect_topo() Coordinate names (``lon``/``latitude``/``x`` etc.) and dimension ordering are always discovered automatically and never need to be specified. @@ -112,14 +112,14 @@ Domain subsetting (crop) If your NetCDF file covers a larger area than your simulation domain (common with global or regional datasets), pass ``crop_bounds`` to -``TopoInterrogator`` and use ``topo_entries()`` to register the file: +``TopoInspector`` and use ``topo_entries()`` to register the file: .. code:: python - from clawpack.geoclaw.netcdf_utils import TopoInterrogator - with TopoInterrogator('gebco_global.nc', - crop_bounds=(-100, -80, 20, 35)) as intr: - rundata.topo_data.topofiles.extend(intr.topo_entries()) + from clawpack.geoclaw.netcdf_utils import TopoInspector + with TopoInspector('gebco_global.nc', + crop_bounds=(-100, -80, 20, 35)) as insp: + rundata.topo_data.topofiles.extend(insp.topo_entries()) Only the subset is read into memory at runtime. The full file is never loaded. @@ -178,7 +178,7 @@ Time seconds from user-defined offset ============================= ================================ If your file uses hPa/mbar for pressure or knots for wind, -``MetInterrogator`` will convert automatically. +``MetInspector`` will convert automatically. Registering a NetCDF storm file ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -248,24 +248,24 @@ the built-in fallback lists: 'wind_v': 'vwnd', 'pressure': 'press'}) -**Advanced: pre-built MetInterrogator:** +**Advanced: pre-built MetInspector:** If you need direct control over CF validation, unit checking, or lon/lat convention detection before the descriptor is written, you can -construct a :class:`~clawpack.geoclaw.netcdf_utils.MetInterrogator` -explicitly and pass it via ``met_interrogator``. When a pre-built -interrogator is supplied, auto-discovery and ``var_mapping`` validation +construct a :class:`~clawpack.geoclaw.netcdf_utils.MetInspector` +explicitly and pass it via ``met_inspector``. When a pre-built +inspector is supplied, auto-discovery and ``var_mapping`` validation are bypassed entirely: .. code:: python - from clawpack.geoclaw.netcdf_utils import MetInterrogator + from clawpack.geoclaw.netcdf_utils import MetInspector - mi = MetInterrogator('path/to/forcing.nc', - variable_map={'wind_u': 'u10', - 'wind_v': 'v10', - 'pressure': 'msl'}) - storm.write('isaac.storm', file_format='data', met_interrogator=mi) + mi = MetInspector('path/to/forcing.nc', + variable_map={'wind_u': 'u10', + 'wind_v': 'v10', + 'pressure': 'msl'}) + storm.write('isaac.storm', file_format='data', met_inspector=mi) .. note:: @@ -276,7 +276,7 @@ are bypassed entirely: .. note:: ``storm.window`` (ramp width and application domain) and - ``MetInterrogator``'s ``crop_bounds`` (read-time spatial subset of + ``MetInspector``'s ``crop_bounds`` (read-time spatial subset of the NetCDF file) are independent. Setting one does not affect the other. @@ -309,13 +309,13 @@ to ``storm.time_offset``. No unit conversion (hours→seconds, days→seconds) is performed in Fortran: the descriptor stores ``nc_time_offset`` in seconds and the raw values are read as stored. ``write_data`` passes ``time_reference=storm.time_offset`` to -``MetInterrogator`` automatically; no additional user action is needed. +``MetInspector`` automatically; no additional user action is needed. Longitude convention ^^^^^^^^^^^^^^^^^^^^^^ Longitude convention ([0, 360] vs [-180, 180]) is detected and -normalized automatically by ``MetInterrogator``. ERA5 files that store +normalized automatically by ``MetInspector``. ERA5 files that store longitude in [0, 360] do not need to be preprocessed to [-180, 180] before use. The detected convention is written to the descriptor and Fortran applies index normalization at runtime without copying arrays. @@ -330,7 +330,7 @@ Architecture overview The system has a strict Python/Fortran split: -**Python** handles: file interrogation, CF attribute parsing, coordinate +**Python** handles: file inspection, CF attribute parsing, coordinate convention detection, fill value resolution, unit conversion, ``nc_time_offset`` computation (elapsed seconds from ``storm.time_offset`` to the first record), crop bound validation, and descriptor writing. @@ -350,7 +350,7 @@ Class hierarchy (``netcdf_utils.py``) :: - NetCDFInterrogator + NetCDFInspector - open file (xarray, Dask-lazy) - discover coordinate variables by name heuristics + CF standard_name - detect lon convention, lat order, dim order @@ -358,12 +358,12 @@ Class hierarchy (``netcdf_utils.py``) - validate crop bounds against file extent - output: NetCDFDescriptor dataclass - TopoInterrogator(NetCDFInterrogator) + TopoInspector(NetCDFInspector) - detect fill values within crop region (hard error) - verify and convert units to contract - no multi-file coverage logic (Fortran handles compositing) - MetInterrogator(NetCDFInterrogator) + MetInspector(NetCDFInspector) - check wind_u, wind_v, pressure present and on same grid/time axis - convert units to contract - decode CF time to seconds from offset @@ -467,11 +467,11 @@ Known technical debt - ``util.get_netcdf_names`` (in ``util.py``) remains a parallel implementation of variable name discovery alongside - ``NetCDFInterrogator``. ``Storm.write_data`` now uses - ``MetInterrogator`` for NetCDF met forcing (format 2), but falls back + ``NetCDFInspector``. ``Storm.write_data`` now uses + ``MetInspector`` for NetCDF met forcing (format 2), but falls back to ``util.get_netcdf_names`` for variable name discovery when no explicit ``var_mapping`` is provided. Full consolidation -- making - ``util.get_netcdf_names`` delegate to ``NetCDFInterrogator`` and + ``util.get_netcdf_names`` delegate to ``NetCDFInspector`` and removing the duplicate -- remains a TODO. - WRF raw output requires ``MetPreprocessor`` for string-time decoding (``Times`` character array, not a numeric CF time variable) and @@ -484,13 +484,13 @@ Adding a new met forcing field (e.g. precipitation) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. Add the role and unit to ``GEOCLAW_NETCDF_UNITS`` in ``units.py`` -2. Add detection logic to ``MetInterrogator`` for the new variable +2. Add detection logic to ``MetInspector`` for the new variable 3. ``DescriptorWriter`` requires no change -- new ``&variable_info`` - blocks are written automatically for any role the interrogator + blocks are written automatically for any role the inspector returns 4. Add the Fortran reader logic in the storm NetCDF module to consume the new ``geoclaw_role`` value -5. Add unit tests in ``tests/netcdf/test_met_interrogator.py`` +5. Add unit tests in ``tests/netcdf/test_met_inspector.py`` 6. Add a regression test in the appropriate storm surge example Test structure @@ -500,9 +500,9 @@ Test structure tests/netcdf/ unit tests for netcdf_utils.py conftest.py in-memory NetCDF fixtures - test_base_interrogator.py - test_topo_interrogator.py - test_met_interrogator.py + test_base_inspector.py + test_topo_inspector.py + test_met_inspector.py test_descriptor_writer.py test_cf_normalizer.py @@ -521,19 +521,19 @@ Next steps ---------- - **Consolidate util.get_netcdf_names**: ``Storm.write_data`` now - calls ``MetInterrogator`` and ``DescriptorWriter`` for NetCDF met + calls ``MetInspector`` and ``DescriptorWriter`` for NetCDF met forcing; the remaining step is to make ``util.get_netcdf_names`` - delegate to ``NetCDFInterrogator`` and remove the duplicate discovery + delegate to ``NetCDFInspector`` and remove the duplicate discovery logic in ``storm.py`` - **WRF support**: implement ``MetPreprocessor`` to handle string-time axis and curvilinear grid; the skip-marked test stub in ``test_storm.py`` documents the expected interface - **Precipitation field**: reserved in ``GEOCLAW_NETCDF_UNITS``, add - ``MetInterrogator`` detection and Fortran consumer + ``MetInspector`` detection and Fortran consumer - **Friction field**: same pattern as precipitation - **CLI tool**: expose ``CFNormalizer`` as a command-line utility for users who want to check or repair their NetCDF files before running GeoClaw - **Duplicate code audit**: other places in the GeoClaw codebase that do ad-hoc NetCDF variable discovery should be identified and migrated - to ``NetCDFInterrogator`` + to ``NetCDFInspector`` diff --git a/doc/netcdf_utils_module.rst b/doc/netcdf_utils_module.rst new file mode 100644 index 0000000..a760778 --- /dev/null +++ b/doc/netcdf_utils_module.rst @@ -0,0 +1,39 @@ + +.. _netcdf_utils_module: + +netcdf_utils module for NetCDF input +===================================== + +.. seealso:: + - :ref:`netcdf_input` + - :ref:`topotools_module` + - :ref:`storm_module` + +The ``netcdf_utils`` module provides the Python layer for reading NetCDF +topography and meteorological forcing files in GeoClaw. It handles CF +attribute parsing, coordinate convention detection, unit conversion, and +descriptor writing so that Fortran only needs to open the file and read +pre-validated indices. + +The main classes are: + +- :class:`~clawpack.geoclaw.netcdf_utils.NetCDFInspector` — base class; + coordinate discovery, fill-value resolution, crop-bound validation. +- :class:`~clawpack.geoclaw.netcdf_utils.TopoInspector` — topography + subclass; fill-value checking within the crop region, unit enforcement. +- :class:`~clawpack.geoclaw.netcdf_utils.MetInspector` — meteorological + subclass; wind/pressure variable discovery, CF time decoding, unit + conversion. +- :class:`~clawpack.geoclaw.netcdf_utils.CFNormalizer` — adds or repairs + CF attributes in place without modifying data values. +- :class:`~clawpack.geoclaw.netcdf_utils.DescriptorWriter` — writes the + key=value topo descriptor lines or the ``&file_info`` / ``&variable_info`` + Fortran namelist blocks for met forcing. + +Documentation auto-generated from the module docstrings +-------------------------------------------------------- + +.. automodule:: clawpack.geoclaw.netcdf_utils + :members: + :undoc-members: + :show-inheritance: diff --git a/doc/setrun_geoclaw.rst b/doc/setrun_geoclaw.rst index e67bc2b..0f94d95 100644 --- a/doc/setrun_geoclaw.rst +++ b/doc/setrun_geoclaw.rst @@ -453,7 +453,16 @@ Storm Specification Data .. attribute:: rundata.surge_data.storm_file : string Specifies the path to the storm data. IF `storm_specification_type > 0` then - this should point to a GeoClaw formatted storm file (see :ref:`storm_module` for + this should point to a GeoClaw formatted storm file (see :ref:`storm_module` for details). If `storm_specification < 0` then this should either specify a path to files specifying the storm or a single file depending on the type of input data. + +.. attribute:: rundata.surge_data.storm_time_scale : float + + A multiplicative scale factor applied to the storm's time axis. Values + greater than 1 slow the storm down (it takes longer to traverse the + domain); values less than 1 speed it up. The default is ``1.0`` (no + scaling). This is primarily useful for sensitivity studies and for + creating synthetic storms whose tracks are spatially identical to a + historical event but travel at a different speed.