Skip to content

refactor(raster): drop data()/contiguous_data() from Raster interface#915

Open
james-willis wants to merge 1 commit into
apache:mainfrom
james-willis:jw/raster-zerocopy-accessors
Open

refactor(raster): drop data()/contiguous_data() from Raster interface#915
james-willis wants to merge 1 commit into
apache:mainfrom
james-willis:jw/raster-zerocopy-accessors

Conversation

@james-willis
Copy link
Copy Markdown
Contributor

@james-willis james-willis commented Jun 6, 2026

After some discussions with @paleolimbot in some recent raster PRs, we reached the conclusion that we don't want the raster type to do transparent allocations.

Given this I'm simplifying the raster byte access patterns with this PR to be just the nd_buffer method, and eliminating the data and contiguous_data methods. Instead we added two methods to the NDBuffer trait:

  • is_contiguous: whether or not the data exported byte the buffer is all contiguous in memory
  • as_contiguous: returns that buffer as a Result<&'a [u8], ArrowError>, throwing an error if the data is not contiguous.

This avoids any transparent allocations. We also expose a try_contiguous_data method

Drive-by: split is_2d into is_spatial_2d + a contiguity check

While moving the GDAL boundary off data(), I noticed BandRef::is_2d() was conflating two unrelated things: whether a band is a 2-D spatial raster (dim count + recognized y/x, lat/lon, … names) and whether its view is the identity (a byte-layout question). The name only advertised the first.

So I split it:

  • is_spatial_2d() — a pure shape/naming predicate (dims only).
  • contiguity is now its own question, answered by NdBuffer::is_contiguous() / try_contiguous_data().

The GDAL gate now uses is_spatial_2d() for dimensionality and lets try_contiguous_data() enforce the byte layout where the bytes are actually read (it errors on a strided view). This also relaxes the old strict-identity requirement to plain contiguity — is_contiguous() is offset-agnostic, so a contiguous outer-axis slice is acceptable rather than rejected. Behavior-preserving today, since the read boundary still rejects non-identity views.

@github-actions github-actions Bot requested a review from paleolimbot June 6, 2026 01:16
@james-willis james-willis force-pushed the jw/raster-zerocopy-accessors branch 7 times, most recently from 0266335 to 01e03bb Compare June 6, 2026 02:19
@james-willis james-willis changed the title [WIP] refactor(raster): drop data()/contiguous_data() from Raster interface refactor(raster): drop data()/contiguous_data() from Raster interface Jun 6, 2026
…ccessors; split is_2d

Establish a no-transparent-materialization contract for raster byte access.

- Add NdBuffer::is_contiguous() + as_contiguous() — borrow-or-error access to
  the visible bytes as a packed row-major slice. Never materializes: a strided
  view returns an error pointing at RS_EnsureContiguous (apache#899).
- Remove BandRef::data() and contiguous_data(); callers read bytes via
  nd_buffer()?.as_contiguous(). Trait accessors stay zero-copy; explicit
  materialization is the caller's responsibility.
- Split is_2d() into is_spatial_2d() (a pure 2-D spatial shape/naming
  predicate) and a separate contiguity check. The GDAL boundary gates on
  is_spatial_2d() and lets nd_buffer()?.as_contiguous() enforce a usable byte
  layout. This relaxes the old strict-identity requirement to plain contiguity
  (is_contiguous is offset-agnostic).
- RS_EnsureLoaded: guard the InDb passthrough and OutDb result against
  non-identity views with a clear internal error (apache#897); the
  field-by-field rebuild can't carry a view yet.

Behavior-preserving: non-identity views are still rejected at the read
boundary, so every band is identity/contiguous today.
@james-willis james-willis force-pushed the jw/raster-zerocopy-accessors branch from 01e03bb to cf35a6b Compare June 6, 2026 02:41
@james-willis james-willis marked this pull request as ready for review June 6, 2026 02:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant