Is your feature request related to a problem?
In some situations, it is important to be able to select multiple (disconnected) contiguous regions along a dimension, even if the dimension itself is very large.
If there's enough client memory, it is possible to emulate this by materializing the slices into an array of integers, but this becomes infeasible if that array is too large (and while supposedly we should be able to index by a `dask` integer array, I'm not sure how efficient that would be).
Examples of where this would be useful include:
- The healpix MOC index at xarray-contrib/xdggs#151, where cell ids are represented as a set of disconnected ranges at the smallest possible refinement level. To be able to support `Index.sel`, I'd need to return an `IndexSelResult` with either a list of slices, or materialize these into an integer array and error out if that wouldn't fit into memory (or try to use `dask` as an indexer).
- bcftools-style filtering (sgkit-dev/sgkit#1330 (comment)). I'll let him provide further details.
cc @benbovy, @shoyer, @TomNicholas, @dcherian
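For illustration, the emulation by materializing slices into an integer array could look roughly like the sketch below. It uses only NumPy; the helper name `slices_to_positions` is made up for this example and is not part of xarray. The resulting array is the kind of positional indexer that could then be passed to `Dataset.isel`.

```python
import numpy as np


def slices_to_positions(slices, size):
    """Expand slices into one integer array (hypothetical helper, not xarray API)."""
    # slice.indices(size) resolves None and negative bounds against the
    # length of the dimension and returns (start, stop, step).
    return np.concatenate([np.arange(*s.indices(size)) for s in slices])


size = 100  # length of the dimension being indexed
slices = [slice(0, 3), slice(10, 12), slice(97, None)]

positions = slices_to_positions(slices, size)

# The indexer needs one integer per selected element, so its memory
# footprint grows with the total length of the slices rather than
# with the number of slices.
indexer_bytes = positions.nbytes
```

Three slices are enough to describe the selection, while the materialized indexer grows with the number of selected elements, which is what makes this approach infeasible for very large dimensions.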
Describe the solution you'd like
I'd love to be able to specify this as another kind of indexer:
but that will obviously further increase the complexity of the indexing machinery
Describe alternatives you've considered
Manually iterating over the slices and then concatenating the results is possible, but incurs additional overhead if done using the `xarray` API. However, I don't see a way to make that work as part of `IndexSelResult`.
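As a rough sketch of the manual iteration described above, assuming a plain in-memory dataset: the helper name `isel_slices` is hypothetical and only wraps the existing `isel` and `xr.concat` calls.

```python
import numpy as np
import xarray as xr


def isel_slices(obj, dim, slices):
    """Select several slices along ``dim`` and concatenate them (hypothetical helper)."""
    # Every isel call runs through the regular xarray indexing path,
    # and concat then has to combine the pieces again.
    parts = [obj.isel({dim: s}) for s in slices]
    return xr.concat(parts, dim=dim)


ds = xr.Dataset(
    {"a": ("x", np.arange(100))},
    coords={"x": np.arange(100)},
)

subset = isel_slices(ds, "x", [slice(0, 3), slice(10, 12), slice(97, None)])
```

This works from user code, but it operates on whole objects, so it does not translate into something an index could return from `Index.sel`.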