Skip to content

from_template: dask-aligned extents and explicit height/width#3567

Merged
brendancol merged 2 commits into
mainfrom
from-template-dask-aligned-extents
Jun 28, 2026
Merged

from_template: dask-aligned extents and explicit height/width#3567
brendancol merged 2 commits into
mainfrom
from-template-dask-aligned-extents

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

What

Two changes to from_template, both so the grid partitions cleanly for dask instead of letting the geographic bbox decide the shape.

Default dask tiling pads to whole tiles. A multi-block grid grows its extent out from the lower-left anchor until each axis is a whole number of blocks. Every chunk comes back full-size, no ragged remainder. The requested resolution is unchanged and the padded grid still covers the study area (it only grows).

This is dask-only and scoped so it doesn't disturb the existing paths:

  • eager numpy/cupy grids keep the exact bbox-derived shape (no point bloating a materialized array)
  • a grid small enough to stay one chunk isn't padded, so its dask coords still match the eager grid cell-for-cell
  • an explicit chunks= is honored verbatim
conus @ 1km, backend="dask":
  before:  3105 x 5865, chunks (1553,1552) x (1955,1955,1955)
  after:   4096 x 6144, chunks (2048,2048) x (2048,2048,2048)

New height/width params. Pass both to set the grid shape exactly, anchored at the region origin, with the extent floating off that anchor: shape x resolution when a resolution is given, otherwise the resolution is derived so the shape spans the region bbox.

from_template("conus", resolution=1000, height=4096, width=6144)  # exact 4096x6144
from_template("conus", height=600, width=1200)                    # 600x1200 spanning CONUS

Heads-up

Same name + resolution now gives a different extent on dask vs eager (dask padded larger, eager exact bbox). That's deliberate, since only the chunked path benefits from tiling, but it does mean backend is no longer a pure execution detail for large grids.

Tests

11 new tests in test_templates.py covering the padding (applied; skipped for eager, single-chunk, and explicit-chunks paths; resolution stays exact) and height/width (exact shape, extent floats with and without resolution, validation). Full file: 276 passed, flake8 clean.

https://claude.ai/code/session_01RngbqHJk4id4BNY1ironEc

The default dask grid now pads its shape up to whole blocks, so every
chunk is full-size instead of leaving a ragged remainder on the far
edge. Padding only happens on a dask backend under the default tiling:
eager grids and an explicit chunks= keep the exact bbox-derived shape,
and a grid small enough to stay one chunk is left alone.

Also adds height/width: pass both to fix the grid shape exactly. The
extent floats off the region's lower-left anchor (shape x resolution
when a resolution is given, otherwise the resolution is derived so the
shape spans the region bbox).

Claude-Session: https://claude.ai/code/session_01RngbqHJk4id4BNY1ironEc
@brendancol

Copy link
Copy Markdown
Contributor Author

Self-review: from_template dask-aligned extents + height/width

Read the integrated templates.py and the new tests in full. No blockers; the padding math is consistent and coverage is preserved. A couple of things worth tightening before merge.

Blockers

None.

Suggestions

  • Error messages name the wrong knob on the height/width path. The cell-cap (templates.py:583-587) and chunk-cap (templates.py:604-611) both start with resolution {(res_x, res_y)} produces .... When the caller passed height/width and no resolution, that res is derived, so the message points them at a parameter they never set. When explicit_shape is true, the message should lead with the height/width they actually gave.

  • Test gaps on the new path. The added tests cover the common cases well, but three combinations have no coverage:

    • height/width together with preserve= (the anchor comes from the reprojected bbox, and that interaction is untested).
    • a dask backend with an explicit shape that is not a block multiple (e.g. 5000x5000), which exercises the ragged _balanced_axis tiling on the exact-shape path rather than the clean 4096x6144 case.
    • the eager cell cap firing on an oversized explicit shape.

Nits

  • Float height/width is truncated, not rejected (templates.py:548). int(4095.7) silently becomes 4095. The hint says int, so this is minor, but a non-integer could surprise someone. Optional to guard.

Heads-up for reviewers

The deliberate behavior change here: same name + resolution now yields a larger extent on a dask backend than on numpy (dask pads to whole tiles, eager stays exact-bbox). It's in the docstring and PR body, but it's the one thing a downstream user could trip on, so worth a second look on whether that divergence is acceptable.

What looks good

  • Built test-first; the scoping guards (eager, single-chunk, and explicit-chunks= all keep the exact bbox shape) are each pinned by a test.
  • Padding only grows the extent, so the named study area stays covered.
  • The same block edge feeds both the padding and the chunking, so a padded axis tiles into exact full blocks with no leftover sliver.

Checklist

  • All implemented backends produce consistent fill (NaN float32)
  • NaN/dtype handling unchanged
  • Resolution-exact invariant holds after padding
  • Dask chunk count stays under the cap on the default path
  • No materialization added
  • Benchmark — n/a (grid construction, no compute op)
  • Docstrings updated for height/width and padding
  • Test coverage — see gaps above

The cell-cap and chunk-cap errors led with "resolution ..." even when
the caller set height/width and never passed a resolution. They now name
whichever knob was actually used and adjust the "use a coarser ..."
advice to match.

Adds tests for the height/width path that were missing: the cap message
on an oversized explicit shape, height/width combined with preserve=,
and a dask backend with a non-block-multiple shape (stays exact, tiles
balanced).

Claude-Session: https://claude.ai/code/session_01RngbqHJk4id4BNY1ironEc
@brendancol

Copy link
Copy Markdown
Contributor Author

Addressed in 31765e6:

  • Cap messages now name the knob the caller set. On the height/width path the cell-cap and chunk-cap errors lead with height=..., width=... and advise a smaller height/width; the resolution path is unchanged. Before, both led with resolution ... even when no resolution was passed.
  • Test gaps filled: the cap message on an oversized explicit shape, height/width combined with preserve= (anchor from the reprojected bbox, CRS = chosen EPSG), and a dask backend with a non-block-multiple shape (5000x5000 stays exact and tiles balanced).

Not changed: the float-truncation nit (int(4095.7) -> 4095). The type hint already says int, so I left it rather than add validation for an off-spec input. Happy to guard it if you'd rather reject non-integers.

The dask-vs-eager extent divergence stands as designed and is documented in the docstring and PR body.

279 passed, flake8 clean.

@brendancol brendancol merged commit d6ec0f8 into main Jun 28, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant