Apply nodata mask in read_geotiff_gpu (#1542)#1547
Open
brendancol wants to merge 1 commit intomainfrom
Open
Conversation
The GPU read backend silently differed from the CPU and dask paths when the file declared a nodata sentinel. open_geotiff(path, gpu=True) returned a DataArray whose attrs had no 'nodata' key and whose pixel data still carried the raw sentinel value. Integer rasters were not promoted to float64, and float rasters kept the sentinel rather than NaN. NaN-aware code that worked on the CPU and dask paths quietly produced wrong results on the GPU path. Add an _apply_nodata_mask_gpu helper that mirrors the CPU masking logic with cupy operations, and call it from both the tiled main path and the stripped fallback inside read_geotiff_gpu. Also set attrs['nodata'] from geo_info.nodata so callers can still see the sentinel value. Two existing tests had codified the old behaviour and needed updates: test_sparse_tile_gpu_round_trip checked dtype=uint16, but the GPU read now promotes to float64 to represent NaN, matching the CPU path. test_*_sentinel_nodata in test_lerc_valid_mask_gpu compared read_geotiff_gpu against read_to_array (low-level), so the test helpers now restore the sentinel on the GPU side before the bit-for-bit comparison so the LERC mask preservation check still holds.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #1542.
read_geotiff_gpu(used whenopen_geotiff(path, gpu=True)is called) silently differed from the CPU and dask paths on rasters with a declared nodata sentinel: integer rasters kept the sentinel literal in a uint16 array, float rasters kept the sentinel rather than NaN, andattrs['nodata']was never set. NaN-aware code that worked on the CPU and dask paths quietly produced wrong results on the GPU path.The fix adds an
_apply_nodata_mask_gpuhelper that mirrors the CPU eager masking logic using cupy operations. Both the tiled main path and the stripped fallback insideread_geotiff_gpunow promote integer rasters to float64 with NaN at sentinel positions, replace finite sentinels in float arrays with NaN, and setattrs['nodata']so the original value stays visible.Two existing GPU tests had pinned the old behaviour and needed adjustment.
test_sparse_tile_gpu_round_tripasserteddtype=uint16; it now expectsfloat64and NaN at sparse positions.test_*_sentinel_nodataintest_lerc_valid_mask_gpucomparedread_geotiff_gpuagainstread_to_array(low-level reader); the helpers now restore the sentinel on the GPU side before the bit-for-bit comparison so the LERC mask preservation check still holds.Test plan
test_gpu_nodata_1542.pycovers tiled + stripped paths, float32 and uint16 sentinels, signed int negative sentinel, NaN nodata, no-nodata-attr, and a four-backend agreement check across numpy, dask+numpy, cupy, and dask+cupy.balanced_allocationdask+cupy failure is unrelated to geotiff.