Add open_dataset and open_mfdataset wrappers to mpas_tools.io #737

Open
xylar wants to merge 3 commits into MPAS-Dev:master from xylar:add-open-dataset

xylar commented Jun 27, 2026

Collaborator

This PR adds thin wrappers around xarray.open_dataset() and xarray.open_mfdataset() that select the NetCDF engine from the module-level mpas_tools.io.default_engine variable when no engine is passed explicitly.

xarray.open_dataset() sniffs a file for "magic bits" to auto-select a backend, and that probe can crash on NETCDF3_64BIT_DATA (CDF5) files (see E3SM-Project/polaris#624). Specifying an engine explicitly avoids the sniffing. Since xarray has no global default-engine setting, reusing the existing default_engine variable (already consumed by write_netcdf()) gives downstream tools a single, process-wide knob for both reading and writing without modifying every call site.

The logger argument is included for API symmetry with write_netcdf and future diagnostics; no error recovery is performed because the CDF5 failure is a hard crash that cannot be caught.
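
For readers who want a picture of the engine resolution described above, here is a minimal sketch. It is not the code from this PR: the signatures, the `resolve_engine` helper, and the `None` default for `default_engine` are assumptions made for illustration, and xarray is imported lazily only so the snippet loads without it installed.

```python
import logging
from typing import Any, Optional

# Stand-in for the module-level mpas_tools.io.default_engine variable.
# Assumption: None means "no default set, let xarray choose the backend".
default_engine: Optional[str] = None


def resolve_engine(engine: Optional[str],
                   default: Optional[str]) -> Optional[str]:
    """Hypothetical helper: an explicit engine wins, else use the default."""
    return engine if engine is not None else default


def open_dataset(filename_or_obj: Any, engine: Optional[str] = None,
                 logger: Optional[logging.Logger] = None, **kwargs: Any):
    """Sketch of a thin wrapper around xarray.open_dataset()."""
    import xarray as xr  # lazy import, for this sketch only

    # logger is accepted for symmetry with write_netcdf() but unused here.
    # default_engine is read at call time, so later changes take effect.
    engine = resolve_engine(engine, default_engine)
    return xr.open_dataset(filename_or_obj, engine=engine, **kwargs)


def open_mfdataset(paths: Any, engine: Optional[str] = None,
                   logger: Optional[logging.Logger] = None, **kwargs: Any):
    """Sketch of a thin wrapper around xarray.open_mfdataset()."""
    import xarray as xr  # lazy import, for this sketch only

    engine = resolve_engine(engine, default_engine)
    return xr.open_mfdataset(paths, engine=engine, **kwargs)
```

Under these assumptions, a downstream tool would assign an engine name such as "netcdf4" to the module-level variable once at startup, and every later call that omits `engine` would skip xarray's backend sniffing.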

xylar and others added 3 commits June 27, 2026 12:53
Add thin wrappers around xarray.open_dataset and xarray.open_mfdataset that
select the NetCDF engine from the module-level mpas_tools.io.default_engine
variable when no engine is passed explicitly.

xarray.open_dataset sniffs a file for "magic bits" to auto-select a backend,
and that probe can crash on NETCDF3_64BIT_DATA (CDF5) files (see
E3SM-Project/polaris#624). Specifying an engine
explicitly avoids the sniffing. Since xarray has no global default-engine
setting, reusing the existing default_engine variable (already consumed by
write_netcdf) gives downstream tools a single, process-wide knob for both
reading and writing without modifying every call site.

The logger argument is included for API symmetry with write_netcdf and future
diagnostics; no error recovery is performed because the CDF5 failure is a hard
crash that cannot be caught.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Cover the new wrappers: a basic write/read round trip, opening a
NETCDF3_64BIT_DATA (CDF5) file with an explicit engine (exercising the
backend-sniffing crash the wrappers work around), resolving the engine from
mpas_tools.io.default_engine when engine is None (restoring the global
afterward), and a multi-file open_mfdataset smoke test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add the new functions to the I/O autosummary in api.rst and describe them in
io.rst, including why specifying an engine via mpas_tools.io.default_engine
avoids the CDF5 backend-sniffing crash and an example of setting the default
engine.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
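
The test commit above mentions setting mpas_tools.io.default_engine and restoring the global afterward. One way to do that safely is a small context manager, sketched below. The `temporary_default_engine` name and the `io` stand-in object are hypothetical and are not taken from the PR's test suite.

```python
import contextlib
import types

# Stand-in for the mpas_tools.io module, so the sketch runs on its own.
io = types.SimpleNamespace(default_engine=None)


@contextlib.contextmanager
def temporary_default_engine(module, engine):
    """Set module.default_engine for the duration of a with-block."""
    previous = module.default_engine
    module.default_engine = engine
    try:
        yield
    finally:
        # Restore the old value even if the body raises, so one failing
        # test cannot leak its engine setting into later tests.
        module.default_engine = previous
```

The `try`/`finally` is the important part: because the variable is process-wide, a test that changes it without restoring it could alter the behavior of every test that runs afterward.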
@cbegeman

Contributor

@xylar Thanks!
