When merge() is called with same-CRS dask-backed rasters, each output
chunk triggers a full .compute() on the entire source dask array, not
just the source window the chunk needs.
_merge_block_adapter at xrspatial/reproject/__init__.py:1796-1806
carries the dask source array into the closure via functools.partial,
and the same_crs_list[i] branch calls src_data.compute() on the full
array before passing it to _place_same_crs.
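The shape of the problem can be sketched in isolation (the function and variable names below are illustrative, not xrspatial's actual code): `functools.partial` bakes the dask source into the per-chunk function, and every output chunk then calls `.compute()` on the whole source array.

```python
import functools
import dask.array as da
import numpy as np

# Tally how many source pixels get materialized inside the chunk function.
materialized = {"pixels": 0}

def _merge_block_adapter_sketch(block, src_data=None):
    full = src_data.compute()        # BUG: full-array compute per output chunk
    materialized["pixels"] += full.size
    return np.zeros_like(block)      # stand-in for _place_same_crs

src = da.from_array(np.arange(64.0).reshape(8, 8), chunks=(4, 4))
out = da.zeros((8, 8), chunks=(4, 4))
adapter = functools.partial(_merge_block_adapter_sketch, src_data=src)
out.map_blocks(adapter, dtype=float, meta=np.empty((0, 0))).compute(
    scheduler="synchronous")

# 4 output chunks x 64 source pixels each = 256 pixels materialized for a
# 64-pixel source: 4x amplification, the same mechanism as the measurement below.
print(materialized["pixels"])
```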
I measured this with two 256x256 sources split into 64x64 chunks, merged
with 32x32 output chunks:
- total source pixels: 131,072 (2 sources x 256x256)
- pixels materialized inside the chunk fn: 8,912,896
- amplification: 68x
For an 8192x8192 source merged with 256x256 output chunks (1024 chunks),
the amplification is ~1024x and pushes driver-side data flow into
terabyte territory.
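The arithmetic behind those figures, assuming float64 pixels:

```python
# Measured case: 8,912,896 pixels materialized vs. 131,072 source pixels.
measured_pixels = 8_912_896
total_src_pixels = 2 * 256 * 256            # two 256x256 sources
amplification = measured_pixels // total_src_pixels
print(amplification)                        # 68

# Extrapolated case: 8192x8192 source, 256x256 output chunks over the
# same extent, each chunk computing the full source.
n_chunks = (8192 // 256) ** 2               # 1024 chunks
bytes_materialized = n_chunks * 8192 * 8192 * 8
print(bytes_materialized / 1e12)            # ~0.55 TB per source
```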
The fix is to slice the dask source to the chunk's window before calling
.compute(), mirroring the pattern in _reproject_chunk_numpy
(lines 273-276) and _reproject_chunk_cupy (lines 425-428), both of which
slice first and then compute only the window.
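A minimal sketch of that pattern (the helper name is hypothetical): slicing a dask array is lazy, so computing the slice materializes only the source chunks that overlap the window.

```python
import dask.array as da
import numpy as np

def compute_window(src_dask, row_slice, col_slice):
    # Slicing is lazy; .compute() on the slice pulls only the overlapping
    # source chunks instead of the whole array.
    return src_dask[row_slice, col_slice].compute()

src = da.from_array(np.arange(256 * 256, dtype="f8").reshape(256, 256),
                    chunks=(64, 64))
window = compute_window(src, slice(0, 32), slice(0, 32))
print(window.shape)   # (32, 32)
```

For a 32x32 window over 64x64 source chunks, only a single source chunk is loaded rather than all sixteen.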
Reproducer:
import dask.array as da
import xarray as xr
import numpy as np
from xrspatial.reproject import merge
t1 = xr.DataArray(
    da.from_array(np.arange(256 * 256, dtype='f8').reshape(256, 256),
                  chunks=(64, 64)),
    dims=['y', 'x'],
    coords={'y': np.linspace(40, 35, 256),
            'x': np.linspace(-10, -5, 256)},
    attrs={'crs': 'EPSG:4326'},
)
t2 = xr.DataArray(
    da.from_array(np.ones((256, 256)) * 2, chunks=(64, 64)),
    dims=['y', 'x'],
    coords={'y': np.linspace(40, 35, 256),
            'x': np.linspace(-5, 0, 256)},
    attrs={'crs': 'EPSG:4326'},
)
merge([t1, t2], strategy='first', chunk_size=32).compute()
# Patch da.Array.compute to count -- 136 calls, 68x source size materialized
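One way to obtain those counts (a sketch, not the sweep's actual harness): monkey-patch da.Array.compute to tally calls and materialized pixels around the merge call.

```python
import dask.array as da

# Wrap da.Array.compute to count calls and the pixels each call materializes.
counts = {"calls": 0, "pixels": 0}
_orig_compute = da.Array.compute

def _counting_compute(self, *args, **kwargs):
    counts["calls"] += 1
    counts["pixels"] += self.size
    return _orig_compute(self, *args, **kwargs)

da.Array.compute = _counting_compute
try:
    # Substitute the merge(...).compute() call here; a tiny array stands in.
    da.ones((4, 4), chunks=(2, 2)).compute()
finally:
    da.Array.compute = _orig_compute

print(counts)
```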
Surfaced by the 2026-05-10 reproject performance sweep.