from_template ergonomics: coregister your data onto the grid + neighborhood-friendly dask chunks #3562
Merged
…nks (#3561)

Add template.xrs.coregister(data): reproject a raster DataArray, or rasterize a GeoDataFrame, onto the caller grid's exact CRS, bounds, and shape. One call to land your own data on a template and run a tool. The result shares the template's y/x coords, so layers stack cell-for-cell.

Make the default dask tiling neighborhood-friendly: chunks='auto' (the default) now tiles into even, square-ish blocks (~2048 cells per side) instead of one giant byte-sized block or thin ragged edges, so downstream slope/focal/etc. parallelize cleanly through map_overlap. Small grids stay one chunk; the block grows for huge grids so 'auto' never trips the chunk cap.

Document the pick-a-grid, coregister, run-a-tool path in the from_template docstring and the templates reference page.

Closes #3561

Claude-Session: https://claude.ai/code/session_01TEXdRkRqYJKvATjquz26Pa
reproject always emits north-up (descending y, ascending x), so blindly relabeling with the caller's coords would flip the data for a grid stored in a different axis order. Flip to the caller's directions first, then snap. from_template grids are already north-up, so this only matters for arbitrary caller rasters, but it removes a silent-flip footgun.

Claude-Session: https://claude.ai/code/session_01TEXdRkRqYJKvATjquz26Pa
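The flip-then-snap order described in this commit can be illustrated with plain NumPy. This is only a sketch of the idea, not the code in accessor.py: the function name align_to_target, its arguments, and the tolerance check are all hypothetical.

```python
import numpy as np


def align_to_target(data, src_y, src_x, tgt_y, tgt_x):
    """Flip `data` to the target's axis directions, then adopt its coords.

    Hypothetical helper: `data` is a 2D array laid out on (src_y, src_x),
    for example the north-up output of a reprojection.
    """
    def ascending(coords):
        return coords[-1] > coords[0]

    out = np.asarray(data)
    y = np.asarray(src_y, dtype=float)
    x = np.asarray(src_x, dtype=float)
    tgt_y = np.asarray(tgt_y, dtype=float)
    tgt_x = np.asarray(tgt_x, dtype=float)

    # Step 1: flip any axis whose direction disagrees with the target.
    if ascending(y) != ascending(tgt_y):
        out, y = out[::-1, :], y[::-1]
    if ascending(x) != ascending(tgt_x):
        out, x = out[:, ::-1], x[::-1]

    # Step 2: snap. Relabel only when the grids already agree up to float
    # drift, so the snap can never move data to a different cell.
    if y.shape != tgt_y.shape or x.shape != tgt_x.shape:
        raise ValueError("source and target grids differ in shape")
    if not (np.allclose(y, tgt_y) and np.allclose(x, tgt_x)):
        raise ValueError("source and target coordinates do not line up")
    return out, tgt_y, tgt_x


# North-south gradient: the largest value sits in the northernmost row.
gradient = np.array([[3.0, 3.0], [2.0, 2.0], [1.0, 1.0]])
out, y, x = align_to_target(
    gradient,
    src_y=[30.0, 20.0, 10.0],         # north-up (descending y)
    src_x=[0.0, 1.0],
    tgt_y=[10.0, 20.0, 30.0 + 1e-9],  # ascending-y target with float drift
    tgt_x=[0.0, 1.0],
)
```

Relabeling without step 1 would attach the value 3.0 to y=10, which is the silent flip the commit guards against.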
brendancol
commented
Jun 27, 2026
PR Review: from_template ergonomics (coregister + neighborhood dask chunks)
I read the full diff from the PR worktree and exercised both features, including a few cases the tests do not cover (non-constant gradient orientation, ascending-y target, huge-grid chunk cap).
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- accessor.py _reproject_onto_accessor: **kwargs is forwarded to reproject, which already gets bounds, width, height, and resampling as explicit args. A caller passing any of those through coregister(..., bounds=...) would hit a "multiple values for keyword argument" TypeError instead of a friendly message. Low odds, since those are exactly what coregister computes for you. Not worth guarding unless it comes up.
- accessor.py ~line 895: the 2x2-cell minimum raises even when attrs['res'] is present and could anchor a 1x1 grid. Coregistering onto a single pixel is degenerate, so the clear error is fine, just slightly stricter than necessary.
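For reference, the keyword collision in the first nit and one possible guard can be reproduced in plain Python. Everything here is a hypothetical stand-in: reproject below is a dummy function with an assumed signature, and reproject_onto is not the real _reproject_onto_accessor.

```python
def reproject(raster, *, bounds=None, width=None, height=None,
              resampling="nearest", **options):
    """Dummy stand-in for the real reproject; it just echoes its arguments."""
    return {"bounds": bounds, "width": width, "height": height,
            "resampling": resampling, **options}


# Arguments the wrapper derives from the template grid (assumed set).
_DERIVED = ("bounds", "width", "height")


def reproject_onto(raster, grid, resampling="nearest", **kwargs):
    """Hypothetical wrapper that rejects derived arguments up front."""
    clashes = sorted(k for k in kwargs if k in _DERIVED)
    if clashes:
        raise TypeError(
            f"coregister derives {', '.join(clashes)} from the template "
            "grid; do not pass them as keyword arguments"
        )
    return reproject(raster, bounds=grid["bounds"], width=grid["width"],
                     height=grid["height"], resampling=resampling, **kwargs)


grid = {"bounds": (0, 0, 10, 10), "width": 5, "height": 5}

# Without a guard, Python itself raises the unfriendly error.
try:
    reproject("raster", bounds=grid["bounds"], **{"bounds": (1, 1, 2, 2)})
    unguarded_message = None
except TypeError as exc:
    unguarded_message = str(exc)

# With the guard, the caller is told why the argument is not accepted.
try:
    reproject_onto("raster", grid, bounds=(1, 1, 2, 2))
    guarded_message = None
except TypeError as exc:
    guarded_message = str(exc)
```

Whether the extra check is worth carrying is the judgment call the nit already makes; the sketch only shows how small the guard would be.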
What looks good
- Orientation is correct, which was the real risk. reproject always emits north-up (descending y, ascending x) no matter the source order, and the follow-up commit flips to the caller's axis directions before the coordinate snap, so an ascending-y target no longer flips silently. Verified with a north-south gradient: the north row stays north.
- Raster path data is byte-identical to a raw reproject with the same derived bounds; the coordinate snap only removes float drift, it does not move data.
- Vector path is exactly rasterize(coregister=True), so it inherits that path's CRS handling and tests.
- The chunk default is self-limiting: the block grows for very large grids so chunks='auto' stays under the existing 1e6 cap (the #3557 contract holds). conus@1m lands at ~200k blocks instead of erroring.
- Small grids still come back as a single chunk, so the change is invisible for the common small-template case.
- CRS attrs (crs int, crs_wkt, grid_mapping_name) are carried from the template onto the result, so it is a real drop-in rather than keeping reproject's WKT-string crs.
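A rough model of the self-limiting tiling noted above, as a sketch only. The rounding rule and the doubling of the block side are assumptions for illustration, not the PR's actual algorithm, so this will not reproduce the ~200k-block figure for conus@1m. The name plan_chunks and its parameters are hypothetical; the 1,000,000 cap mirrors the 1e6 cap mentioned in the review.

```python
import math


def plan_chunks(shape, target=2048, max_chunks=1_000_000):
    """Split each axis into even blocks of roughly `target` cells per side.

    Returns (blocks_per_axis, block_size_per_axis). The block side doubles
    until the total block count fits under `max_chunks`.
    """
    side = target
    while True:
        # At least one block per axis, so small grids stay a single chunk.
        counts = tuple(max(1, round(n / side)) for n in shape)
        if math.prod(counts) <= max_chunks:
            break
        side *= 2  # grow the block for very large grids
    # Even split: every block on an axis has (nearly) the same size.
    sizes = tuple(math.ceil(n / c) for n, c in zip(shape, counts))
    return counts, sizes


# conus at 1000 m, using the 3105x5865 shape reported in the PR summary.
conus_1km = plan_chunks((3105, 5865))

# Assumed shape for conus at 250 m: four times the 1000 m shape per axis.
conus_250m = plan_chunks((12420, 23460))

# A small template stays one chunk.
small = plan_chunks((500, 800))

# Assumed shape for conus at 1 m: the block side grows to respect the cap.
conus_1m = plan_chunks((3_105_000, 5_865_000))
```

Under these assumptions the 1000 m grid splits into 2x3 blocks of 1553x1955 and the 250 m grid into 6x11 blocks, in line with the 6 and 66 balanced blocks reported in the PR summary.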
Checklist
- coregister output lands on the template grid (exact coords, CRS carried over)
- No coordinate flip (orientation verified with a gradient and an ascending-y target)
- Raster data identical to raw reproject; vector path identical to rasterize(coregister=True)
- Dask source returns a dask result; default tiling balanced and square
- chunks='auto' stays under the chunk cap at extreme resolution; explicit chunks bypass the tiling
- Docstrings present (coregister, from_template Recipes) and a templates reference note
- README feature matrix: N/A (no new top-level function)
- Benchmark: N/A (thin wrapper over already-benchmarked reproject/rasterize; chunk change is in grid construction, not a hot path)
Summary
Two ergonomic additions to from_template, so users go from a region name to an analysis-ready grid with less friction.
- template.xrs.coregister(data): one call to put your own data on the grid. A raster DataArray is reprojected onto the template's exact CRS, bounds, and shape; a GeoDataFrame is rasterized onto it (the rasterize(coregister=True) path). The result shares the template's y/x coords, so layers stack cell-for-cell. This fills the gap left by the existing coregister paths, which covered vectors, points, and on-disk GeoTIFFs but not in-memory rasters.
- chunks='auto' (the default for dask templates) now tiles into even, square blocks (~2048 cells per side) instead of one giant byte-sized block or thin ragged edge slivers. Downstream slope/focal/etc. inherit a parallel, overlap-friendly graph. Small grids stay a single chunk; the block grows for very large grids so the default never trips the existing chunk-count cap (from_template: guard against task-graph explosion on dask with very fine resolution #3557).
Measured before/after for the chunking: from_template('conus', resolution=1000, backend='dask') went from a single 3105x5865 chunk to 6 balanced ~1553x1955 blocks; conus@250m from 15 blocks with 291-px slivers to 66 even ~2070x2132 blocks.
Backend coverage
coregister runs on whatever backends the underlying reproject/rasterize support (numpy and dask verified; a dask source returns a dask result). The chunk change is dask-only. No new compute kernels.
Test plan
rasterize(coregister=True), missing-CRS and bad-type errors, nearest resampling, dask source stays dask
Closes #3561
https://claude.ai/code/session_01TEXdRkRqYJKvATjquz26Pa