Skip to content

Bounds-check IFD entry table in parse_ifd (#1672)#1677

Merged
brendancol merged 2 commits into
mainfrom
issue-1672
May 12, 2026
Merged

Bounds-check IFD entry table in parse_ifd (#1672)#1677
brendancol merged 2 commits into
mainfrom
issue-1672

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Three struct.unpack_from calls inside parse_ifd read from the IFD entry table without first checking the buffer length: the num_entries field, each entry's tag/type/count, and the next-IFD pointer. A truncated file used to escape with struct.error, which is outside the ValueError/TypeError contract #1661's Hypothesis fuzz tests assume. The typed-error work in #1661 (commit 58a8b3d) covered the IFD value area but not the entry table itself.

This PR adds three bounds checks in parse_ifd, each raising ValueError with the offset, required byte count, and len(data). Message style matches the existing MAX_IFD_ENTRY_COUNT and value-area bounds-check messages.

Tests

  • test_ifd_entry_table_bounds_1672.py: unit tests building buffers exactly one byte short of each new bounds check, in both classic and BigTIFF layouts, plus a sanity case for a complete buffer and an out-of-bounds offset.
  • test_fuzz_hypothesis_1661.py: new Group 4 truncation property tests. One runs the existing corpus through random truncation; a second sweeps integer truncation points 0..400 against a fresh make_minimal_tiff. Both fail if struct.error escapes; ValueError, TypeError, MemoryError, and OverflowError are accepted.

Verification

pytest xrspatial/geotiff/tests/test_ifd_entry_table_bounds_1672.py xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py xrspatial/geotiff/tests/test_parse_ifd_bounds.py xrspatial/geotiff/tests/test_header.py -q

Fixes #1672.

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 12, 2026 15:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the GeoTIFF parser’s parse_ifd routine by adding explicit bounds checks for the IFD entry table reads (entry count, per-entry header fields, and next-IFD pointer), ensuring truncated buffers raise typed ValueError instead of leaking struct.error—supporting the fuzz/property-test exception contract introduced in #1661.

Changes:

  • Add pre-unpack_from bounds checks in parse_ifd for num_entries, the entry table region, and the next-IFD pointer.
  • Add focused unit tests for classic TIFF and BigTIFF buffers that are 1 byte short of each guarded read.
  • Extend the Hypothesis fuzz suite with truncation-based properties to prevent regressions where struct.error escapes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
xrspatial/geotiff/_header.py Adds entry-table bounds checks in parse_ifd so truncation errors raise ValueError instead of struct.error.
xrspatial/geotiff/tests/test_ifd_entry_table_bounds_1672.py New unit tests covering truncations at each newly-guarded read in classic TIFF and BigTIFF layouts.
xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py Adds truncation fuzz/property tests to enforce typed-error-only behavior on truncated inputs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +25 to +31
from xrspatial.geotiff._dtypes import LONG
from xrspatial.geotiff._header import (
TIFFHeader,
TAG_IMAGE_WIDTH,
parse_header,
parse_ifd,
)
Comment on lines +483 to +488
num_entries_size = 8 if is_big else 2
if offset + num_entries_size > data_len:
raise ValueError(
f"IFD num_entries at offset {offset} needs "
f"{num_entries_size} bytes but file length is {data_len}"
)
brendancol added a commit that referenced this pull request May 12, 2026
Addresses two Copilot review comments on PR #1677.

The new bounds checks added in #1672 only guarded
`offset + size > len(data)`, missing negative offsets. With a negative
`offset`, `struct.unpack_from` interprets it as Python's negative-index
semantic (read from the buffer end), silently returning wrong bytes or
raising raw `struct.error` and escaping the typed `ValueError` contract.
This commit adds explicit `< 0` checks on all three bounds-check sites
(num_entries read, entry-table region, next-IFD pointer) and an unused
`TIFFHeader` import is removed from the test module so flake8 stays
clean.

Tests cover negative offsets on both classic TIFF and BigTIFF, plus a
large negative value, and reuse existing fixtures to keep the file
self-contained.
`parse_ifd` was unpacking `num_entries`, every entry's tag/type/count,
and the next-IFD pointer with no prior length check. A truncated or
crafted TIFF could escape with `struct.error`, outside the
ValueError/TypeError contract the #1661 fuzz tests assume. The S2
typed-error work in #1661 covered the IFD value area but not the
entry table itself.

Adds three explicit bounds checks (num_entries field, full entry
table, next-IFD pointer) each raising ValueError with the offset,
required byte count, and file length, matching the existing
MAX_IFD_ENTRY_COUNT / value-area error message style.

Tests:
- test_ifd_entry_table_bounds_1672.py: one-byte-short buffers for
  each of the three new checks, in classic and BigTIFF layouts.
- test_fuzz_hypothesis_1661.py: new Group 4 truncation property
  tests sweeping every byte offset; fails on struct.error escape.
Addresses two Copilot review comments on PR #1677.

The new bounds checks added in #1672 only guarded
`offset + size > len(data)`, missing negative offsets. With a negative
`offset`, `struct.unpack_from` interprets it as Python's negative-index
semantic (read from the buffer end), silently returning wrong bytes or
raising raw `struct.error` and escaping the typed `ValueError` contract.
This commit adds explicit `< 0` checks on all three bounds-check sites
(num_entries read, entry-table region, next-IFD pointer) and an unused
`TIFFHeader` import is removed from the test module so flake8 stays
clean.

Tests cover negative offsets on both classic TIFF and BigTIFF, plus a
large negative value, and reuse existing fixtures to keep the file
self-contained.
@brendancol brendancol merged commit d95d2d0 into main May 12, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

parse_ifd raises raw struct.error before bounds checks on truncated TIFFs

2 participants