Decode TIFF predictor=3 by file byte order so big-endian floats read exactly#1500

Open
brendancol wants to merge 3 commits into main from fix-predictor3-big-endian

@brendancol
Contributor

Summary

  • The floating-point predictor (TIFF Tech Note 3) byte-swizzles each row into bps (bytes-per-sample) lanes, MSB first. The decoder un-transposed those lanes assuming the output was always little-endian, so the MSB landed at byte index bps-1 of each output sample. Big-endian predictor=3 files therefore decoded with each sample's bytes in the wrong order, which silently produced garbage float values: the array still came back as a clean float32/float64, so nothing flagged the error.
  • The fix threads the file's byte order into the un-transpose on CPU and GPU. BE files put the MSB at byte index 0; LE files are unchanged. The GPU output also needed a final byteswap because cupy interprets the byte-stream view as native little-endian.
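The byte-order-aware un-transpose described above can be sketched in pure Python. This is an illustration only, not the code in this PR: the function name `undo_float_predictor_row` and its parameters are hypothetical, and it assumes a single-channel row (one sample per pixel).

```python
import struct


def undo_float_predictor_row(encoded: bytes, bps: int, big_endian: bool) -> bytes:
    """Decode one predictor=3 row (hypothetical helper, single channel assumed)."""
    width = len(encoded) // bps  # number of samples in the row

    # Step 1: undo the byte-wise horizontal differencing (modulo 256).
    lanes = bytearray(encoded)
    for i in range(1, len(lanes)):
        lanes[i] = (lanes[i] + lanes[i - 1]) & 0xFF

    # Step 2: un-transpose. Lane 0 holds every sample's MSB, so it goes to
    # byte 0 of each sample for big-endian output and to byte bps-1 for
    # little-endian output.
    out = bytearray(len(lanes))
    for lane in range(bps):
        dst = lane if big_endian else bps - 1 - lane
        out[dst::bps] = lanes[lane * width:(lane + 1) * width]
    return bytes(out)


# Hand-encoded row for the float32 values [1.0, 2.0].
encoded = bytes.fromhex("3f01408000000000")
be = undo_float_predictor_row(encoded, bps=4, big_endian=True)
le = undo_float_predictor_row(encoded, bps=4, big_endian=False)
print(struct.unpack(">2f", be))  # (1.0, 2.0)
print(struct.unpack("<2f", le))  # (1.0, 2.0)
```

The same encoded bytes decode correctly for either byte order once the destination index of each lane depends on `big_endian`. Hard-coding the little-endian branch reproduces the bug.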

Repro before the fix

import numpy as np
import tifffile
from xrspatial.geotiff._reader import read_to_array

# Write a big-endian float32 TIFF with the floating-point predictor enabled.
arr = np.array([[1.0, 2.0, 3.0, 4.0]], dtype='float32')
tifffile.imwrite('/tmp/be.tif', arr, byteorder='>',
                 predictor=3, compression='deflate')

# Read it back with the xrspatial reader.
print(read_to_array('/tmp/be.tif')[0])  # before: garbage like 4.6e-41
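As a rough check on where a value like 4.6e-41 comes from, the standard-library snippet below packs 1.0 as a big-endian float32 and reads the same four bytes back as little-endian. It is independent of xrspatial and tifffile, and only illustrates the effect of reversed sample bytes.

```python
import struct

# 1.0 as a big-endian float32 is the byte sequence 3f 80 00 00.
be_bytes = struct.pack(">f", 1.0)

# Reading those bytes with the opposite byte order yields a tiny denormal.
misread = struct.unpack("<f", be_bytes)[0]
print(be_bytes.hex(), misread)  # misread is roughly 4.6e-41
```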

Test plan

  • New test_predictor3_big_endian.py: float32 and float64 BE round-trip
  • BE predictor=3 with tiled layout (multiple tiles)
  • LE predictor=3 still round-trips (no regression)
  • GPU BE predictor=3 parity with CPU
  • Existing predictor and writer tests pass

… floating-point TIFFs read exactly

The floating-point predictor (TIFF Tech Note 3) byte-swizzles each row
into bytes_per_sample lanes with the most significant byte in lane 0.
The decoder un-transposed those lanes assuming the output was always
little-endian: it placed the MSB at byte index bps-1 of each output
sample. Big-endian predictor=3 files therefore decoded into samples
with the bytes in the wrong order; the result still came back as a
clean float32/float64 array, just with garbage values.
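For reference, the encoder side of the predictor can be sketched as follows. This is a minimal illustration assuming a single-channel row; `apply_float_predictor_row` is a hypothetical name and is not part of xrspatial or tifffile.

```python
import struct


def apply_float_predictor_row(raw: bytes, bps: int, big_endian: bool) -> bytes:
    """Encode one row with predictor=3 (hypothetical helper, single channel)."""
    # Swizzle: gather byte k of every sample into one lane, MSB lane first.
    # Where the MSB sits inside each sample depends on the input byte order.
    swizzled = bytearray()
    for lane in range(bps):
        src = lane if big_endian else bps - 1 - lane
        swizzled += raw[src::bps]

    # Byte-wise horizontal differencing over the whole swizzled row (mod 256).
    encoded = bytearray(swizzled)
    for i in range(1, len(swizzled)):
        encoded[i] = (swizzled[i] - swizzled[i - 1]) & 0xFF
    return bytes(encoded)


from_be = apply_float_predictor_row(struct.pack(">2f", 1.0, 2.0), 4, True)
from_le = apply_float_predictor_row(struct.pack("<2f", 1.0, 2.0), 4, False)
print(from_be.hex())       # 3f01408000000000
print(from_be == from_le)  # True
```

Because lane 0 always carries the MSB, the encoded row is the same whichever byte order the samples started in. The decoder is the only place that has to decide where each lane lands.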

The fix threads the file's byte order into the un-transpose step on
both CPU and GPU paths. For big-endian files the MSB now lands at byte
index 0 of each output sample. The little-endian path is unchanged.

The GPU output also needed a final byteswap when the file was
big-endian: the GPU decode keeps bytes in the file's order through
predictor decode and tile assembly, but cupy interprets the resulting
view as native (little-endian) byte order.
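The final byteswap can be illustrated on the CPU with the standard-library `array` module standing in for cupy. This is only an analogy under that assumption: `to_native_floats` is a hypothetical helper, and the PR's GPU code is not shown here.

```python
import struct
import sys
from array import array


def to_native_floats(buf: bytes, file_big_endian: bool) -> array:
    """View file-ordered float32 bytes as native floats (hypothetical helper)."""
    values = array("f")    # native-order floats, 4 bytes each on common platforms
    values.frombytes(buf)  # bytes are taken as-is, i.e. as if native-ordered
    # Swap only when the file's byte order differs from the host's.
    if file_big_endian != (sys.byteorder == "big"):
        values.byteswap()
    return values


file_bytes = struct.pack(">3f", 1.0, 2.0, 3.0)  # as stored in a big-endian file
print(to_native_floats(file_bytes, file_big_endian=True).tolist())  # [1.0, 2.0, 3.0]
```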

New tests (test_predictor3_big_endian.py) decode tifffile/imagecodecs-
encoded big-endian predictor=3 float32 and float64 files (strip and
tiled layouts, plus a GPU parity check) and confirm the existing
little-endian round-trip still works.
github-actions bot added the performance label (PR touches performance-sensitive code) on May 6, 2026
brendancol added 2 commits May 6, 2026 06:03
The third-pass agent worked from a stale local main and re-derived the
predictor=2 sample-wise fix that already shipped in #1498. That PR has been
closed as a duplicate. Only the predictor=3 big-endian fix in #1500 is a
genuine new finding.