refactor(raster): drop data()/contiguous_data() from Raster interface#915
Open
james-willis wants to merge 1 commit into
Open
refactor(raster): drop data()/contiguous_data() from Raster interface#915james-willis wants to merge 1 commit into
james-willis wants to merge 1 commit into
Conversation
0266335 to
01e03bb
Compare
…ccessors; split is_2d Establish a no-transparent-materialization contract for raster byte access. - Add NdBuffer::is_contiguous() + as_contiguous() — borrow-or-error access to the visible bytes as a packed row-major slice. Never materializes: a strided view returns an error pointing at RS_EnsureContiguous (apache#899). - Remove BandRef::data() and contiguous_data(); callers read bytes via nd_buffer()?.as_contiguous(). Trait accessors stay zero-copy; explicit materialization is the caller's responsibility. - Split is_2d() into is_spatial_2d() (a pure 2-D spatial shape/naming predicate) and a separate contiguity check. The GDAL boundary gates on is_spatial_2d() and lets nd_buffer()?.as_contiguous() enforce a usable byte layout. This relaxes the old strict-identity requirement to plain contiguity (is_contiguous is offset-agnostic). - RS_EnsureLoaded: guard the InDb passthrough and OutDb result against non-identity views with a clear internal error (apache#897); the field-by-field rebuild can't carry a view yet. Behavior-preserving: non-identity views are still rejected at the read boundary, so every band is identity/contiguous today.
01e03bb to
cf35a6b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
After some discussions with @paleolimbot in some recent raster PRs, we reached the conclusion that we don't want the raster type to do transparent allocations.
Given this I'm simplifying the raster byte access patterns with this PR to be just the
nd_buffermethod, and eliminating thedataandcontiguous_datamethods. Instead we added two methods to the NDBuffer trait:is_contiguous: whether or not the data exported byte the buffer is all contiguous in memoryas_contiguous: returns that buffer as aResult<&'a [u8], ArrowError>, throwing an error if the data is not contiguous.This avoids any transparent allocations. We also expose a
try_contiguous_datamethodDrive-by: split
is_2dintois_spatial_2d+ a contiguity checkWhile moving the GDAL boundary off
data(), I noticedBandRef::is_2d()was conflating two unrelated things: whether a band is a 2-D spatial raster (dim count + recognizedy/x,lat/lon, … names) and whether its view is the identity (a byte-layout question). The name only advertised the first.So I split it:
is_spatial_2d()— a pure shape/naming predicate (dims only).NdBuffer::is_contiguous()/try_contiguous_data().The GDAL gate now uses
is_spatial_2d()for dimensionality and letstry_contiguous_data()enforce the byte layout where the bytes are actually read (it errors on a strided view). This also relaxes the old strict-identity requirement to plain contiguity —is_contiguous()is offset-agnostic, so a contiguous outer-axis slice is acceptable rather than rejected. Behavior-preserving today, since the read boundary still rejects non-identity views.