Apply nodata mask in read_geotiff_gpu (#1542) #1547

Open
brendancol wants to merge 1 commit into main from issue-1542

@brendancol
Contributor

Summary

Fixes #1542. read_geotiff_gpu (used when open_geotiff(path, gpu=True) is called) silently differed from the CPU and dask paths on rasters with a declared nodata sentinel: integer rasters kept the sentinel literal in a uint16 array, float rasters kept the sentinel rather than NaN, and attrs['nodata'] was never set. NaN-aware code that worked on the CPU and dask paths quietly produced wrong results on the GPU path.

The fix adds an _apply_nodata_mask_gpu helper that mirrors the CPU eager masking logic using cupy operations. Both the tiled main path and the stripped fallback inside read_geotiff_gpu now promote integer rasters to float64 with NaN at sentinel positions, replace finite sentinels in float arrays with NaN, and set attrs['nodata'] so the original value stays visible.
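The masking rules described above can be sketched roughly as follows. This is an illustration only, not the PR's code: apply_nodata_mask_sketch is a hypothetical name, and the example runs on NumPy arrays so it works without a GPU. The assumption is that the real helper does the equivalent with cupy arrays, which support the same comparison, astype, and boolean-index assignment operations.

```python
import numpy as np


def apply_nodata_mask_sketch(arr, nodata):
    """Hypothetical illustration of the masking rules in this PR.

    Not the actual _apply_nodata_mask_gpu; written against NumPy so it
    can run anywhere.
    """
    if nodata is None:
        # No declared sentinel: leave the data untouched.
        return arr
    if np.issubdtype(arr.dtype, np.integer):
        # Integers cannot hold NaN, so promote to float64 and put NaN
        # wherever the sentinel appeared in the original integer data.
        mask = arr == nodata
        out = arr.astype(np.float64)
        out[mask] = np.nan
        return out
    if np.issubdtype(arr.dtype, np.floating) and np.isfinite(nodata):
        # Float data keeps its dtype; only finite sentinels need
        # replacing (a NaN sentinel is already NaN in the data).
        out = arr.copy()
        out[arr == arr.dtype.type(nodata)] = np.nan
        return out
    return arr


ints = np.array([[1, 65535], [3, 4]], dtype=np.uint16)
masked = apply_nodata_mask_sketch(ints, 65535)  # float64, NaN at [0, 1]
```

In this sketch the original sentinel is not recoverable from the returned array alone, which is why the PR also sets attrs['nodata'] on the resulting DataArray.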

Two existing GPU tests had pinned the old behaviour and needed adjustment. test_sparse_tile_gpu_round_trip asserted dtype=uint16; it now expects float64 and NaN at sparse positions. test_*_sentinel_nodata in test_lerc_valid_mask_gpu compared read_geotiff_gpu against read_to_array (low-level reader); the helpers now restore the sentinel on the GPU side before the bit-for-bit comparison so the LERC mask preservation check still holds.
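As a rough sketch of what "restoring the sentinel" before a bit-for-bit comparison could look like: the helper name restore_sentinel_sketch is hypothetical, the actual test helpers may differ, and NumPy stands in for cupy here.

```python
import numpy as np


def restore_sentinel_sketch(masked, nodata, dtype):
    """Hypothetical inverse of the nodata mask, for test comparisons.

    Replaces NaN with the sentinel and casts back to the on-disk dtype
    so the result can be compared against a low-level, unmasked read.
    """
    filled = np.where(np.isnan(masked), nodata, masked)
    return filled.astype(dtype)


masked = np.array([1.0, np.nan, 3.0])
restored = restore_sentinel_sketch(masked, 65535, np.uint16)
```

One caveat of this approach: for float rasters that contain genuine NaN pixels in addition to the sentinel, those pixels would also be rewritten to the sentinel, so it is only a faithful inverse when NaN appears solely at masked positions.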

Test plan

  • New test_gpu_nodata_1542.py covers tiled + stripped paths, float32 and uint16 sentinels, signed int negative sentinel, NaN nodata, no-nodata-attr, and a four-backend agreement check across numpy, dask+numpy, cupy, and dask+cupy.
  • Updated LERC GPU and sparse COG GPU tests pass with the new behaviour.
  • Full geotiff test suite passes (943 passed, 6 skipped, 3 deselected for unrelated matplotlib failures).
  • Wider xrspatial suite shows no new regressions from this change. One pre-existing balanced_allocation dask+cupy failure is unrelated to geotiff.

The GPU read backend silently differed from the CPU and dask paths
when the file declared a nodata sentinel. open_geotiff(path, gpu=True)
returned a DataArray whose attrs had no 'nodata' key and whose pixel
data still carried the raw sentinel value. Integer rasters were not
promoted to float64, and float rasters kept the sentinel rather than
NaN. NaN-aware code that worked on the CPU and dask paths quietly
produced wrong results on the GPU path.

Add an _apply_nodata_mask_gpu helper that mirrors the CPU masking
logic with cupy operations, and call it from both the tiled main
path and the stripped fallback inside read_geotiff_gpu. Also set
attrs['nodata'] from geo_info.nodata so callers can still see the
sentinel value.

Two existing tests had codified the old behaviour and needed
updates. test_sparse_tile_gpu_round_trip checked dtype=uint16, but
the GPU read now promotes to float64 to represent NaN, matching the
CPU path. test_*_sentinel_nodata in test_lerc_valid_mask_gpu
compared read_geotiff_gpu against read_to_array (low-level); the
test helpers now restore the sentinel on the GPU side before the
bit-for-bit comparison, so the LERC mask preservation check still
holds.
github-actions bot added the "performance" label (PR touches performance-sensitive code) on May 10, 2026
Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

read_geotiff_gpu skips nodata masking and drops attrs['nodata']