Skip to content

Add SGRID Xarray accessor#2638

Merged
VeckoTheGecko merged 22 commits into
Parcels-code:mainfrom
VeckoTheGecko:push-kupxytsrkxzn
May 21, 2026
Merged

Add SGRID Xarray accessor#2638
VeckoTheGecko merged 22 commits into
Parcels-code:mainfrom
VeckoTheGecko:push-kupxytsrkxzn

Conversation

@VeckoTheGecko
Copy link
Copy Markdown
Contributor

Description

As I work to improve our performance, I would like to do significant work on our software architecture to make things easier to read, maintain, and refactor.

And to help with this re-organisation - I think it would better if we rely more on SGRID-based fieldset creation in our test suite. This will mean that refactoring changes down the line will require little to no updates to the test suite, will result in less setup code in the test suite, and improve readability.

This PR adds an SGRID Xarray dataset accessor to help inspect, and manipulate Xarray SGRID datasets.

Namely:

  • ds.sgrid.metadata providing direct access to parsed SGRID metadata
  • ds.sgrid.rename allowing the renaming of dataset dimensions alongside the renaming of associated metadata (replaces the old parcels.sgrid.rename function)
  • ds.sgrid.isel selection of spatial dimensions (either by node or by face) which in conjunction selects the other dimension. Allows for the easy subsetting of regions and downsampling of data.
    • Subsetting, slicing with a step (aka. downsampling) etc. works out of the box for datasets that have padding as HIGH or LOW as there is a 1-1 correspondence between nodes and faces. Slicing with integer lists also works for this reason
    • With padding of BOTH (i.e., n_faces = n_nodes+1) and NONE (i.e., n_faces = n_nodes-1), it becomes ambiguous what slicing with an integer list or downsampling here means as its not clear where the extra node/face should come from. If padding is BOTH/NONE then only slice indexers with no step are supported (i.e., contiguous regions)

As we now have a bunch of xr.Dataset-SGRID logic - this PR also restructures the sgrid to a subpackage within parcels.

Checklist

  • Closes None
  • Tests added
  • This PR targets the correct branch (main for normal development, v3-support for v3 support)

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.
    • Describe how you used it (e.g., by pasting your prompt): Yes, only in the co-authored commits. Mapped out from a conceptual POV then asked claude to implement.

VeckoTheGecko and others added 18 commits May 18, 2026 14:48
Adds an xarray accessor for SGRID focussing on `metadata` and `rename()`. Removes old functions which handled this (i.e., get_grid_topology, parse_grid_attrs, and parcels.sgrid.rename(ds,...)), and updates all references
Alongside helpers get_n_nodes and get_n_faces
Relax requirement for slice objects in isel
Refactor to avoid calculating the index verbatim. Define NONE and BOTH isel indexing only for contiguous regions.

Add more hypothesis testing properties:
- P1: consistency invariant (assert_metadata_ds_consistency after any valid isel)
- P2: data correctness (values match direct xarray isel)
- P3: specification symmetry (isel by node dim == isel by derived face dim)


Co-authored-by: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@erikvansebille erikvansebille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; just a few small comments and questions below

Comment thread src/parcels/_sgrid/accessor.py Outdated
return grid_da

def isel(self, indexers: Mapping[str, Any] | None = None, **indexers_kwargs):
"""TODO: Docstring"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to still do now? Or in a next PR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping - I think its good to do now. Will implement before merge

f"Due to dataset padding of {padding!r}, expected face dimension {face} to actually be size {expected_n_faces}."
)

# TODO: Also check on coordinates
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a next PR, I assume?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes :)

)
except Exception as e:
raise SGridParsingException(f"Failed to parse Grid3DMetadata from {attrs=!r}") from e

def to_attrs(self) -> dict[str, str | int]:
d = dict(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own understanding, why has this changed from using dict() to {}? Why would the new approach be better?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this change was unnecessary. I thought it would be needed for typing, but it works fine the other way too

Comment thread src/parcels/_python.py
Comment on lines +6 to +7
K = TypeVar("K")
V = TypeVar("V")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's happening here? What is this for?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are generic type variables https://mypy.readthedocs.io/en/stable/generics.html

They allow you to describe (via typing) a relationship between the input and output types of a function, without providing concrete types.

e.g.,

T = TypeVar("T")
def make_list(obj: T) -> list[T]:
    # make a list of length 1 from provided object
    # works on any object type. Returns a list with objects of same type as input
    return [obj]

In our case here we define type variables K and V (for "key" and "value") for

def invert_non_unique_mapping(d: Mapping[K, V]) -> Mapping[V, list[K]]:
    inv_map: dict[V, list[K]] = {}
    for k, v in d.items():
        inv_map[v] = inv_map.get(v, []) + [k]
    return inv_map

@VeckoTheGecko VeckoTheGecko enabled auto-merge (squash) May 21, 2026 09:34
@VeckoTheGecko VeckoTheGecko merged commit 39c9b0b into Parcels-code:main May 21, 2026
15 of 16 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Parcels development May 21, 2026
@VeckoTheGecko VeckoTheGecko deleted the push-kupxytsrkxzn branch May 21, 2026 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants