Refactor spatial_method logic: add vectorized axis extraction, update method names, and clarify 'auto' logic #140
Open
Description
Summary
Implement a new batch extraction method _extract_axis_batch() (or similar) for spatial_method="axis", replacing the per-point selection currently used for spatial_method="nearest" with fully vectorized indexing using xarray's .sel(..., method="nearest") on all points at once.
- New spatial_method="axis": vectorized, 1-D lat/lon coordinate selection (xarray-native axis-based nearest; works on regular grids). Replaces spatial_method="nearest".
- spatial_method="auto" now resolves as follows:
  - If both lat.ndim == 1 and lon.ndim == 1 → use axis
  - Else (either is 2D) → use euclidean
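For orientation, the pointwise selection that the summary describes can be sketched as follows. This is a minimal illustration on a toy regular grid, not the project's implementation: the dataset, the variable name sst, the dimension name points, and the query values are all made up for the example.

```python
import numpy as np
import xarray as xr

# Toy regular grid with 1-D lat and lon coordinate axes (illustrative data).
ds = xr.Dataset(
    {"sst": (("lat", "lon"), np.arange(12.0).reshape(3, 4))},
    coords={"lat": [0.0, 1.0, 2.0], "lon": [10.0, 11.0, 12.0, 13.0]},
)

# Hypothetical query points for one granule.
pt_lat = np.array([0.1, 1.9, 1.2])
pt_lon = np.array([10.4, 12.6, 11.0])

# Wrapping the indexers in DataArrays that share a "points" dimension makes
# xarray pick (lat[i], lon[i]) pairs instead of the lat x lon outer product.
picked = ds.sel(
    lat=xr.DataArray(pt_lat, dims="points"),
    lon=xr.DataArray(pt_lon, dims="points"),
    method="nearest",
)
sst_values = picked["sst"].values  # one value per query point
```

Passing plain arrays instead of DataArrays with a shared dimension would return a 3 x 3 block, which is why the shared points dimension matters for batch extraction.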
Detailed requirements
- Vectorized extraction for axis:
  - See example in this notebook for how to implement vectorized indexing: how_to_select_xarray_gridcells_using_vectorized_indexing.ipynb
  - When spatial_method="axis" and a granule has one or more points, batch all points into a single .sel() call for all variables.
  - Support vectorized selection in latitude, longitude, and time (all in one call), as well as any extra coords supplied by coord_spec.
  - If the dataset lacks a time dimension but input points specify times, create a dummy/singleton time axis to enable vectorized selection.
  - Remove the old per-point nearest code and all spatial_method="nearest" references: spatial_method="axis" should always use the new batch/vectorized approach.
  - Check documentation in docs/ and examples/.py and examples/docs_.ipynb and README.md for any references to spatial_method="nearest" and replace with spatial_method="axis".
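One possible shape for the batch call, including the singleton time axis, is sketched below. The function name extract_axis_batch_sketch, its signature, and the coordinate names lat, lon, and time are assumptions made for illustration; the real _extract_axis_batch() may look quite different.

```python
import numpy as np
import xarray as xr


def extract_axis_batch_sketch(ds, lats, lons, times=None):
    """Select all points of one granule with a single .sel() call (illustrative only)."""
    indexers = {
        "lat": xr.DataArray(np.asarray(lats), dims="points"),
        "lon": xr.DataArray(np.asarray(lons), dims="points"),
    }
    if times is not None:
        times = np.asarray(times, dtype="datetime64[ns]")
        if "time" not in ds.dims:
            # Granule has no time axis: add a singleton one so time can join
            # the same vectorized call. The label used here is arbitrary.
            ds = ds.expand_dims(time=times[:1])
        indexers["time"] = xr.DataArray(times, dims="points")
    # One call selects every variable in the dataset at every point.
    return ds.sel(indexers, method="nearest")


# Usage on a toy granule without a time dimension (made-up data).
ds = xr.Dataset(
    {"chl": (("lat", "lon"), np.arange(6.0).reshape(2, 3))},
    coords={"lat": [0.0, 1.0], "lon": [0.0, 1.0, 2.0]},
)
out = extract_axis_batch_sketch(
    ds, lats=[0.2, 0.9], lons=[1.8, 0.1], times=["2020-01-01", "2020-06-01"]
)
```

With a length-one time axis every requested time resolves to the same slice, so the dummy axis changes only the call shape, not the values returned.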
- Additional axes:
  - For any additional axes specified in coord_spec (e.g., depth, wavelength), support them as additional vectorized indexers.
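Extra axes could be folded into the same call by building the indexer dictionary from coord_spec. The sketch below assumes, purely for illustration, that coord_spec maps dataset coordinate names to point column names and that points is a dict of 1-D arrays; the actual structure of coord_spec is not specified in this issue.

```python
import numpy as np
import xarray as xr


def build_indexers(points, coord_spec):
    """Map each dataset coordinate to a pointwise indexer (hypothetical helper)."""
    return {
        coord: xr.DataArray(np.asarray(points[column]), dims="points")
        for coord, column in coord_spec.items()
    }


# Toy dataset with an extra depth axis (made-up data).
ds = xr.Dataset(
    {"temp": (("depth", "lat", "lon"), np.arange(12.0).reshape(2, 2, 3))},
    coords={"depth": [0.0, 50.0], "lat": [0.0, 1.0], "lon": [0.0, 1.0, 2.0]},
)

# Assumed layout: dataset coordinate name -> point column name.
coord_spec = {"lat": "latitude", "lon": "longitude", "depth": "z"}
points = {"latitude": [0.1, 0.8], "longitude": [2.1, 0.9], "z": [40.0, 5.0]}

# depth is handled exactly like lat and lon: one more vectorized indexer.
picked = ds.sel(build_indexers(points, coord_spec), method="nearest")
temp_values = picked["temp"].values
```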
- auto logic update:
  - When spatial_method="auto", the engine should now check:
    - If both lat.ndim == 1 and lon.ndim == 1 in the dataset, use axis (vectorized via 1-D axis matching).
    - Else, use the existing logic to switch to a 2D method.
    - It would be very unusual for lat to be 1D and lon to be 2D, but don't fail if that were the case: expand the 1D coordinate to 2D and use the 2D method.
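The resolution rule, including the mixed 1D/2D case, might look roughly like this. The function name resolve_auto is hypothetical, and the broadcasting assumes that a 1-D lat varies along the first (row) dimension of a 2-D lon and that a 1-D lon varies along the second (column) dimension of a 2-D lat. The summary names euclidean as the 2D method, so the sketch returns that label.

```python
import numpy as np


def resolve_auto(lat, lon):
    """Return (method_name, lat, lon) for spatial_method="auto" (illustrative only)."""
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    if lat.ndim == 1 and lon.ndim == 1:
        # Regular grid: 1-D axis matching is enough.
        return "axis", lat, lon
    if lat.ndim == 1:
        # Unusual mixed case (assumed layout): lat follows the rows of lon.
        lat = np.broadcast_to(lat[:, np.newaxis], lon.shape)
    elif lon.ndim == 1:
        # Mirror case (assumed layout): lon follows the columns of lat.
        lon = np.broadcast_to(lon[np.newaxis, :], lat.shape)
    return "euclidean", lat, lon


method_regular, _, _ = resolve_auto([0.0, 1.0], [0.0, 1.0, 2.0])
method_mixed, lat2d, lon2d = resolve_auto([0.0, 1.0], np.zeros((2, 3)))
```

Promoting the 1-D coordinate rather than raising keeps the mixed case on the existing 2D code path, which matches the "don't fail" requirement.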
- Quality and compatibility:
  - Keep fallback/error handling as it is now (empty slices, NaNs, etc).
  - The API should remain unchanged: result rows, per-variable expansion, NaN handling, etc.
  - Unit tests should cover:
    - Datasets with/without time coordinates
    - With/without extra axes
    - Few/many points per granule
    - 1-D vs 2-D coordinate arrays
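One way to cover part of this matrix is an equivalence check: the batched result should match a per-point loop on the same synthetic data. The helpers make_grid and batch_matches_per_point below are hypothetical and use plain xarray calls rather than the project's API; in the real test suite the same cases could be expressed with pytest parametrization.

```python
import numpy as np
import xarray as xr


def make_grid(with_time):
    """Build a small synthetic regular-grid dataset (hypothetical fixture)."""
    lat = np.linspace(-2.0, 2.0, 5)
    lon = np.linspace(10.0, 14.0, 9)
    if with_time:
        time = np.array(["2021-01-01", "2021-01-02"], dtype="datetime64[ns]")
        data = np.arange(2 * 5 * 9, dtype=float).reshape(2, 5, 9)
        return xr.Dataset(
            {"v": (("time", "lat", "lon"), data)},
            coords={"time": time, "lat": lat, "lon": lon},
        )
    data = np.arange(5 * 9, dtype=float).reshape(5, 9)
    return xr.Dataset({"v": (("lat", "lon"), data)}, coords={"lat": lat, "lon": lon})


def batch_matches_per_point(n_points, with_time, seed=0):
    """Compare one vectorized .sel() call against a per-point loop."""
    ds = make_grid(with_time)
    rng = np.random.default_rng(seed)
    indexers = {
        "lat": rng.uniform(-2.0, 2.0, n_points),
        "lon": rng.uniform(10.0, 14.0, n_points),
    }
    if with_time:
        # Odd hour offsets avoid exact ties between the two time steps.
        hours = 2 * rng.integers(0, 24, n_points) + 1
        indexers["time"] = np.datetime64("2021-01-01", "ns") + hours.astype("timedelta64[h]")
    batch = ds["v"].sel(
        {k: xr.DataArray(v, dims="points") for k, v in indexers.items()},
        method="nearest",
    ).values
    loop = np.array([
        ds["v"].sel({k: v[i] for k, v in indexers.items()}, method="nearest").item()
        for i in range(n_points)
    ])
    return np.array_equal(batch, loop)
```

Calling batch_matches_per_point(1, False) and batch_matches_per_point(50, True) exercises the few/many and with/without time cases; extra axes and 2-D coordinates would need additional fixtures.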