Optimize read_vrt window for needs_resample sources #1704

@brendancol

Background

PR #1699 fixed issue #1694 by resampling source pixel data when a SimpleSource's <SrcRect> size differs from its <DstRect> size. The fix reads the full <SrcRect> from the source file before resampling and clipping to the destination window.

Concern

For needs_resample sources, read_vrt now always reads the full SrcRect, even when the caller requests a tiny window= (e.g., a 1x1 window). For a large SrcRect, this means decoding and holding in memory far more pixels than the requested window needs.

See xrspatial/geotiff/_vrt.py around the needs_resample branch (currently lines 552-556), where the read window is set to the full SrcRect:

if needs_resample:
    read_r0 = sr.y_off
    read_c0 = sr.x_off
    read_r1 = sr.y_off + sr.y_size
    read_c1 = sr.x_off + sr.x_size

Proposal

Compute the subset of SrcRect that maps to the requested window (via the inverse of the nearest-neighbour mapping used by _resample_nearest) and read only that subset.

Care needed:

  • The inverse mapping must include all source pixels that any output pixel within window could sample from. Off-by-one boundary errors here would drop pixels at the window edge.
  • Both integer-ratio fast paths and the general non-integer path must agree on the subset.
  • The clip math at the end of the needs_resample branch must adjust to work with a partial src_arr whose origin is no longer at (sr.y_off, sr.x_off).
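
A minimal sketch of the inverse mapping along one axis, to make the points above concrete. It assumes the nearest-neighbour rule maps each output pixel centre to a source pixel via floor((o + 0.5) * src_size / dst_size); the actual rule in _resample_nearest may differ, in which case nearest_src_index would need to match it exactly. The helper names (nearest_src_index, src_subset_1d, resample_window_1d) are hypothetical and not part of xrspatial.

```python
import numpy as np


def nearest_src_index(o, src_size, dst_size):
    """Assumed nearest-neighbour rule: output index -> source index.

    Integer form of floor((o + 0.5) * src_size / dst_size). For
    0 <= o < dst_size the result is always in [0, src_size).
    """
    o = np.asarray(o, dtype=np.int64)
    return ((2 * o + 1) * src_size) // (2 * dst_size)


def src_subset_1d(src_off, src_size, dst_off, dst_size, win0, win1):
    """Source range needed for the window [win0, win1) along one axis.

    Returns (s0, s1, o0, o1): read [s0, s1) in source file coordinates
    to fill output indices [o0, o1) relative to the DstRect origin.
    Returns None when the window does not overlap the DstRect.
    """
    o0 = max(win0, dst_off) - dst_off
    o1 = min(win1, dst_off + dst_size) - dst_off
    if o0 >= o1:
        return None
    # The mapping is monotonic, so the first and last output pixels
    # bound every source pixel that the window can sample.
    first = int(nearest_src_index(o0, src_size, dst_size))
    last = int(nearest_src_index(o1 - 1, src_size, dst_size))
    return src_off + first, src_off + last + 1, o0, o1


def resample_window_1d(src_line, src_off, src_size,
                       dst_off, dst_size, win0, win1):
    """Resample only the part of src_line that the window needs."""
    subset = src_subset_1d(src_off, src_size, dst_off, dst_size, win0, win1)
    if subset is None:
        return np.empty(0, dtype=src_line.dtype)
    s0, s1, o0, o1 = subset
    partial = src_line[s0:s1]  # stands in for a windowed file read
    # Shift indices because partial starts at s0, not at src_off.
    idx = nearest_src_index(np.arange(o0, o1), src_size, dst_size)
    return partial[idx - (s0 - src_off)]
```

Under the assumed rule, a 1-pixel window reads a single source pixel per axis instead of the full src_size. Because the same index function drives both the subset bounds and the final lookup, integer and non-integer ratios share one code path here; if the real implementation keeps separate fast paths, a brute-force comparison against the full-read result over all window positions is one way to check that they agree. Applying the helper to rows and columns independently would give a 2-D read window, assuming the resampling is separable per axis.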

Workaround

Until this is optimized, callers that pass a small window= on a VRT with large-SrcRect SimpleSources will see the full SrcRect read into memory. Reading without window= and slicing the result has the same cost.

Originally raised by Copilot in the review of #1699.
