Skip to content

Handle PlanarConfiguration=2 across CPU + GPU multi-band reads#1532

Open
brendancol wants to merge 1 commit intoxarray-contrib:mainfrom
brendancol:fix/planar-config-multiband
Open

Handle PlanarConfiguration=2 across CPU + GPU multi-band reads#1532
brendancol wants to merge 1 commit intoxarray-contrib:mainfrom
brendancol:fix/planar-config-multiband

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Two related multi-band accuracy bugs from the geotiff audit:

A2 (HIGH): GPU tiled reads of planar=2 files returned wrong pixel values.
gpu_decode_tiles / _assemble_tiles_kernel iterate a single TileOffsets
list with bytes_per_pixel = itemsize * samples and assume chunky interleave.
For PlanarConfiguration=2, each band has its own contiguous run of tiles in
TileOffsets, so the kernel read across band boundaries.

A3 (MEDIUM): The GPU stripped multi-band fallback hardcoded dims=['y','x']
on a 3-D CPU read, the same regression that PR #1525 addresses for the stripped
path. Folded in here so the planar=2 stripped tests pass without depending on
PR #1525 merging first.

Shared root cause: neither path branched on ifd.planar_config.

Fix

  • read_geotiff_gpu: when ifd.planar_config == 2 and samples > 1, slice
    TileOffsets / TileByteCounts per band, decode each slab as a single-band
    tile sequence, then cupy.stack(..., axis=2) into (H, W, samples). Chunky
    (planar=1) goes through the existing single-pass kernel unchanged.
  • read_geotiff_gpu stripped fallback: pick ('y', 'x', 'band') with a band
    coord when the CPU array is 3-D (mirrors PR Fix read_geotiff_gpu dim mismatch on stripped multi-band TIFFs #1525).
  • Both multi-band paths now assert the assembled shape equals
    (H, W, samples) before returning.

Diff is scoped to xrspatial/geotiff/__init__.py -- the GPU kernels are not
touched, so other GPU codecs and the chunky/predictor paths see no behaviour
change.

Paths that now diverge correctly

Layout Planar Backend Path
Tiled 1 GPU unchanged: single gpu_decode_tiles
Tiled 2 GPU new: per-band decode + cupy.stack
Stripped 1 / 2 CPU unchanged
Stripped 1 / 2 GPU dim fix (mirrors PR #1525)

Test plan

  • pytest xrspatial/geotiff/tests/test_planar_multiband.py -- 53 passed
    (planar in {separate, contig} x layout in {tiled, stripped} x bands in {2,3,4}
    x dtype in {uint8, uint16} x backend in {CPU, GPU}, plus single-band sanity
    and an explicit A3 repro). Seeds fixed; GPU tests skip without CUDA.
  • pytest xrspatial/geotiff/tests/ -k "multi or band or planar or strip or tile" -- 303 passed
  • pytest xrspatial/geotiff/tests/test_predictor_multisample.py -- 33 passed
  • pytest xrspatial/geotiff/tests/test_gpu_strict_fallback_1516.py -- 5 passed
  • PR Fix read_geotiff_gpu dim mismatch on stripped multi-band TIFFs #1525's test_gpu_stripped_multiband.py cases verified manually
    against this branch (3-band uint8, 2-band uint16, single-band sanity)
  • Full pytest xrspatial/geotiff/tests/ -- 837 passed (3 pre-existing
    matplotlib-deepcopy failures in test_features.py unrelated to this change)

Two related multi-band accuracy bugs from the geotiff audit:

A2 (HIGH): GPU tiled reads of planar=2 files (separate band planes)
returned wrong pixel values. gpu_decode_tiles iterates a single
TileOffsets list with bytes_per_pixel = itemsize * samples and assumes
chunky interleave. For planar=2, each band has its own contiguous run
of tiles in TileOffsets, so the kernel read across band boundaries and
produced garbage. read_geotiff_gpu now detects planar==2 and decodes
each band's tile slab as a single-band image, then stacks the results
into (H, W, samples). The chunky planar=1 path is unchanged.

A3 (MEDIUM): The GPU stripped multi-band fallback hardcoded
dims=['y','x'] even for 3-D CPU reads, mirroring the pre-xarray-contrib#1525 bug.
The same fix lands here so the planar=2 stripped tests pass; this
matches the change in PR xarray-contrib#1525 verbatim.

Both paths assert the assembled shape equals (H, W, samples) before
returning, so a future kernel regression surfaces immediately rather
than producing garbage.

Adds xrspatial/geotiff/tests/test_planar_multiband.py covering 53
combinations: planar in {separate, contig} x layout in {tiled, stripped}
x bands in {2, 3, 4} x dtype in {uint8, uint16} x backend in
{CPU, GPU}, plus single-band sanity checks and an explicit A3 repro.
GPU tests skip cleanly when CUDA is unavailable. Seeds are fixed.

Refs audit issues A2, A3.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 8, 2026
@brendancol brendancol requested a review from Copilot May 8, 2026 20:55
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes GeoTIFF multi-band read correctness for PlanarConfiguration=2 (separate planar) across GPU tiled reads, and aligns the GPU stripped fallback’s dimension handling with the CPU reader so multi-band stripped outputs consistently return (y, x, band).

Changes:

  • Add a new GPU tiled read branch for planar=2 that decodes per-band tile slabs and stacks them into (H, W, samples).
  • Fix GPU stripped fallback dims/coords so 3-D CPU reads return ('y', 'x', 'band') rather than forcing 2-D dims.
  • Add comprehensive tests covering planar configuration × layout × band count × dtype across CPU and GPU backends.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
xrspatial/geotiff/__init__.py Adds planar=2 handling in read_geotiff_gpu, introduces a per-band tile decode helper, and fixes stripped fallback dims.
xrspatial/geotiff/tests/test_planar_multiband.py New test module validating CPU/GPU multi-band correctness and dims across planar/layout combinations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1659 to +1660
# single-pass kernel is correct.
if planar == 2 and samples > 1:
Comment on lines +1400 to +1409
def _gpu_decode_single_band_tiles(
source, offsets, byte_counts,
tw, th, width, height,
compression, predictor, file_dtype,
*,
byte_order: str,
gpu: str,
overview_level,
cupy,
):
Comment on lines +1472 to +1477
# Per-band CPU fallback would require a single-band CPU
# decode path keyed off planar config; the easier safe move
# is to surface the failure so the caller upstream can fall
# back to read_to_array for the whole image.
raise

Comment on lines +1672 to +1685
for band_idx in range(samples):
b0 = band_idx * tiles_per_band
b1 = b0 + tiles_per_band
band_offsets = list(offsets[b0:b1])
band_byte_counts = list(byte_counts[b0:b1])
band_arr = _gpu_decode_single_band_tiles(
source, band_offsets, band_byte_counts,
tw, th, width, height,
compression, predictor, file_dtype,
byte_order=header.byte_order,
gpu=gpu, overview_level=overview_level,
cupy=cupy,
)
band_arrays.append(band_arr)
Comment on lines +1664 to +1670
if tiles_per_band * samples != len(offsets):
raise ValueError(
f"PlanarConfiguration=2 expects "
f"{tiles_per_band * samples} TileOffsets entries "
f"({tiles_across} x {tiles_down} x {samples} bands), "
f"got {len(offsets)}"
)
Comment on lines +1688 to +1691
assert arr_gpu.shape == (height, width, samples), (
f"planar=2 GPU assembly produced shape {arr_gpu.shape}, "
f"expected ({height}, {width}, {samples})"
)
Comment on lines +1753 to +1756
assert arr_gpu.shape[:2] == (height, width) and arr_gpu.shape[2] == samples, (
f"GPU multi-band tile assembly produced shape {arr_gpu.shape}, "
f"expected ({height}, {width}, {samples})"
)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants