Skip to content

Validate band index in read_to_array for backend parity (#1673)#1676

Merged
brendancol merged 2 commits into
mainfrom
issue-1673
May 12, 2026
Merged

Validate band index in read_to_array for backend parity (#1673)#1676
brendancol merged 2 commits into
mainfrom
issue-1673

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #1673.

Problem

xrspatial/geotiff/_reader.py:1963-1964 applied band with no validation:

if arr.ndim == 3 and ifd.samples_per_pixel > 1 and band is not None:
    arr = arr[:, :, band]

Two failure modes:

  • band=-1 silently selected the last channel via numpy negative indexing.
  • band=N with N >= samples_per_pixel raised a raw numpy IndexError whose message ("index N is out of bounds for axis 2 with size M") leaked the internal slice shape.

The dask path (read_geotiff_dask in __init__.py:1773-1780) and the GPU path (__init__.py:2413-2421) both validate band up front and raise IndexError("band=N out of range for M-band file."). The local eager path did not. Same public call, different error behaviour per backend.

Fix

Add the same validator in read_to_array, placed after the IFD is selected and the window is validated so it shares the existing parse work and rejects before any pixel allocation. The check mirrors the dask path verbatim, so read_to_array and read_geotiff_dask raise identical IndexError messages for the same invalid input.

The HTTP branch (_read_cog_http) accepts band but does not slice on it yet; #1669 will add the slice. A comment in the new validator notes that the same check should land there at that point.

Tests

xrspatial/geotiff/tests/test_band_validation_1673.py adds:

  • band=-1, band=samples_per_pixel, band=samples_per_pixel+100 against a 3-band tiled tiff. Each raises IndexError with the expected band=N out of range substring.
  • Backend parity: read_to_array and read_geotiff_dask raise the same error type and contain the same diagnostic substring for the same invalid band.
  • Sanity checks that valid band indices and band=None continue to work unchanged.

All 7 new tests fail against main and pass after the patch.

Verification

pytest xrspatial/geotiff/tests/test_band_validation_1673.py -q
.......                                                                  [100%]
7 passed in 0.10s

Full geotiff suite: 1622 passed, 7 skipped. The 3 test_features::TestPalette failures are pre-existing matplotlib deepcopy recursion issues unrelated to this change (also fail on main).

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 12, 2026 15:23
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes backend parity for GeoTIFF reads by validating the band argument in the eager read_to_array path, matching the existing validation behavior in the dask and GPU implementations (issue #1673).

Changes:

  • Add band index validation in read_to_array to reject negative and out-of-range band indices with a consistent IndexError message.
  • Add regression tests covering invalid band values and asserting parity between read_to_array and read_geotiff_dask.
  • Preserve existing behavior for valid band values and band=None.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
xrspatial/geotiff/_reader.py Adds eager-path band validation to prevent negative/out-of-range indexing and align error behavior with other backends.
xrspatial/geotiff/tests/test_band_validation_1673.py Adds regression tests ensuring invalid band values raise consistent IndexErrors and eager/dask parity holds.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread xrspatial/geotiff/_reader.py Outdated
# leaks a raw numpy ``IndexError`` with the internal slice
# shape. Mirrors the dask path's pre-flight validator (see
# ``read_geotiff_dask`` in ``__init__.py``) and the GPU path
# so all backends agree on the contract: 0-based positive
Two failure modes follow:

* ``band=-1`` silently selects the last channel via numpy negative
indexing. The public contract is "0-based positive index", so
brendancol added a commit that referenced this pull request May 12, 2026
Addresses Copilot review feedback on PR #1676: "positive" excludes 0
in standard math usage, but band=0 is valid. Switch to "non-negative"
in the three places the contract is documented (_reader.py comment,
_vrt.py comment, test_band_validation_1673.py module docstring).
The local eager path applied `arr[:, :, band]` with no pre-flight
check, so `band=-1` silently selected the last channel via numpy
negative indexing and `band>=samples_per_pixel` leaked a raw numpy
IndexError with the internal slice shape.

Add the same validator the dask and GPU paths already run, raising
`IndexError("band=N out of range for M-band file.")` so all three
backends agree on the contract: 0-based positive index only.

The HTTP branch accepts `band` but does not slice on it yet; issue
#1669 will add the slice. A note in the new validator flags that
the same check should land there at that point.
Addresses Copilot review feedback on PR #1676: "positive" excludes 0
in standard math usage, but band=0 is valid. Switch to "non-negative"
in the three places the contract is documented (_reader.py comment,
_vrt.py comment, test_band_validation_1673.py module docstring).
@brendancol brendancol merged commit d427d0a into main May 12, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Local eager band parameter accepts negative and out-of-range values silently

2 participants