Add FlatLayout range read for sub-segment IO #6974

Open
jiaqizho wants to merge 1 commit into vortex-data:develop from jiaqizho:support-rangeread

Conversation


@jiaqizho jiaqizho commented Mar 16, 2026

Summary

When a FlatLayout has its array_tree metadata inlined in the footer, we can figure out exactly which bytes of the segment are needed for a given row range without any IO. This lets us issue a single small read instead of fetching the entire segment, which is a big win for point lookups and narrow scans on wide tables.

The range read planner walks the encoding tree (Primitive, Bool, BitPacked, Delta, FoR, ZigZag, ALP, ALPRD, Dict, FixedSizeList, Constant) and computes the minimal contiguous byte range covering the needed buffers. If that range is less than 50% of the full segment, we issue the targeted read; otherwise we fall back to reading the whole segment.
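As a rough sketch, the planner's final fallback decision might look like the following. The names `ReadPlan` and `choose_plan` are illustrative assumptions, not the actual vortex API:

```rust
// Hypothetical sketch of the planner's fallback threshold; names are
// illustrative, not the actual vortex API.

#[derive(Debug, PartialEq)]
enum ReadPlan {
    /// Targeted sub-segment read covering only the needed bytes.
    Range { start: u64, end: u64 },
    /// Fall back to fetching the entire segment.
    FullSegment,
}

/// Issue the targeted read only when the covered range is under 50% of
/// the segment; beyond that, one full read is usually cheaper overall.
fn choose_plan(start: u64, end: u64, segment_len: u64) -> ReadPlan {
    if (end - start) * 2 < segment_len {
        ReadPlan::Range { start, end }
    } else {
        ReadPlan::FullSegment
    }
}
```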

To make Delta work with sub-ranged buffers, Delta::build() now derives child array lengths from len + offset instead of metadata.deltas_len. On disk, offset is always 0 so this is a no-op for the normal decode path, but it lets the range read pass a smaller decode_len without the decoder panicking on buffer size mismatch.
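A minimal sketch of that length derivation, under the simplifying assumptions of a 1024-element fastlanes chunk and one base value per chunk; the function names are illustrative and the real `Delta::build()` signature differs:

```rust
// Illustrative only: derive Delta child lengths from `len + offset`
// rather than stored metadata, rounding up to whole 1024-element chunks
// so partially-needed chunks remain decodable. Chunk size and the
// one-base-per-chunk model are assumptions for this sketch.

const CHUNK: usize = 1024; // assumed fastlanes delta chunk size

/// Number of delta elements backing `len` logical rows at `offset`.
fn deltas_len(len: usize, offset: usize) -> usize {
    (len + offset).div_ceil(CHUNK) * CHUNK
}

/// Assumed one base value per chunk.
fn bases_len(len: usize, offset: usize) -> usize {
    (len + offset).div_ceil(CHUNK)
}
```

With `offset == 0` these reduce to the padded on-disk lengths, which is why the normal decode path is unchanged.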

Also adds request_range() to the SegmentSource trait with a default fallback implementation, efficient overrides in FileSegmentSource and BufferSegmentSource, a RangeReadEnabled session flag, and pub const NAME on all encoding structs for pattern matching in the planner.
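The default-plus-override pattern for `request_range()` can be sketched like this; the real `SegmentSource` trait is async and operates on segment identifiers and shared buffers, so treat these signatures as simplified assumptions:

```rust
// Simplified sketch of a trait-level fallback for ranged reads; the
// actual vortex SegmentSource trait differs in shape.
use std::ops::Range;

trait SegmentSource {
    /// Fetch the whole segment.
    fn request(&self) -> Vec<u8>;

    /// Fetch only `range`. The default fetches everything and slices,
    /// so every source stays correct; efficient sources override this
    /// with a genuinely ranged read.
    fn request_range(&self, range: Range<usize>) -> Vec<u8> {
        self.request()[range].to_vec()
    }
}

struct BufferSource(Vec<u8>);

impl SegmentSource for BufferSource {
    fn request(&self) -> Vec<u8> {
        self.0.clone()
    }

    // Override: slice in place instead of cloning the full buffer first.
    fn request_range(&self, range: Range<usize>) -> Vec<u8> {
        self.0[range].to_vec()
    }
}
```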

The current implementation requires the array encoding tree (ArrayNode) to be inlined in the footer via FLAT_LAYOUT_INLINE_ARRAY_NODE=1. Without this flag, the ArrayNode is stored inside the segment data and is not available to the range read planner until the entire segment is fetched (it would be possible to add an extra IO per column to fetch just the ArrayNode from the segment, but the overhead would negate much of the benefit). Since the planner needs the encoding tree to determine which byte ranges to read, range read is effectively disabled without inlining, and every take falls back to reading the full segment. A follow-up change will make inlining the default behavior.

Testing

@jiaqizho
Author

jiaqizho commented Mar 16, 2026

Note: If a segment contains a validity bitmap, it falls back to reading the entire segment.

Encodings with range read support

| Encoding | Description | Read Amplification Reduction |
| --- | --- | --- |
| Primitive | Reads sizeof(type) bytes | ~130,000x to ~1,000,000x |
| BitPacked | Reads one 1024-element chunk | ~250x to ~8,000x |
| FoR | Delegates to child (typically BitPacked) | ~250x to ~8,000x |
| ALP / ALPRD | Encodes to integers, delegates to child | ~250x to ~8,000x |
| Delta | Reads chunk-aligned deltas + corresponding bases | ~250x to ~8,000x |
| Bool | Reads byte-aligned range (1 byte) | ~1,000,000x |
| Dict | Reads code sub-range + full dictionary | ~250x to ~8,000x |
| Constant | Value already in metadata | N/A (zero IO) |
| ZigZag | Delegates to child | ~250x to ~8,000x |
| FixedSizeList | Delegates to child | Depends on child encoding |
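As a back-of-envelope check on the Primitive numbers above (illustrative arithmetic, not a benchmark result):

```rust
// Read amplification reduction = bytes of a full-segment read divided by
// bytes of the targeted read. For a point lookup of one f64 in a
// 1,000,000-row primitive segment, the targeted read is 8 bytes while a
// full read is 8,000,000 bytes, i.e. a ~1,000,000x reduction.
fn amplification_reduction(rows: u64, elem_size: u64, read_bytes: u64) -> u64 {
    (rows * elem_size) / read_bytes
}
```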

Encodings without range read support

| Encoding | Reason |
| --- | --- |
| Sparse | Cannot determine sub-range without reading indices first |
| BitPacked + patches | Patch indices are global coordinates, incompatible with sub-ranged data |
| Delta (offset ≠ 0) | Non-zero offset breaks chunk alignment |
| RunEnd | Variable-length runs, cannot map row number to byte offset |
| RLE (fastlanes) | Same as RunEnd, variable-length runs |
| VarBin | Variable-length data, cannot map row to byte offset |
| VarBinView | Variable-length data, cannot map row to byte offset |
| FSST | Compressed variable-length, requires decompression to locate |
| PCO | Opaque compressed blocks, no random access |
| ByteBool | Could be supported but not implemented (low priority, Bool is more common) |
| DateTimeParts | Multi-child container, requires recursive support for all children |
| DecimalByteParts | Multi-child container, requires recursive support for all children |
| Sequence | Container encoding, not a leaf node |
| Zstd / ZstdBuffers | Opaque compressed blocks, no random access |

Contributor

@joseph-isaacs joseph-isaacs left a comment


I really like the idea of sub-segment reads; however, this must be added in an extensible way. @gatesn, any thoughts on where we should put this?

It seems to me that we want to allow arrays to specify how to slice their buffers.

@joseph-isaacs
Contributor

joseph-isaacs commented Mar 16, 2026

I think this should be moved to a design discussion. Please open one and then we can refine the design before implementing this feature.

@gatesn
Contributor

gatesn commented Mar 16, 2026

A couple of immediate thoughts:

  1. This is definitely a problem we want to address.
  2. A lot of the nastiness here is because we don't yet have a ListLayout (which means we cannot chunk the elements array independently of the offsets array).
  3. We should change FixedSizeList to be a "view" type, possibly just ListView so that we can slice and shuffle without copying elements. (Somewhat unrelated I think)
  4. It's not widely documented, but if you're doing this from local disk, you may get sufficient pruning from memmap'ing the file, in which case due to Vortex buffer alignment you will get sub-segment slicing out of the box.
  5. As Joe says.... I would love for all arrays to be able to push-down slicing/selection into their I/O layer. I think we might be able to do something here with our BufferHandle abstraction that means arrays cannot assume the buffer is contiguous host memory. This might allow us to push down selection into the buffer, and then call to_host later to materialize it into something useful.

@connortsui20
Contributor

just as an aside, I do like the const NAME change. Is it possible to split that out into a separate PR?

@codspeed-hq

codspeed-hq bot commented Mar 16, 2026

Merging this PR will improve performance by 11.12%

⚡ 1 improved benchmark
✅ 1008 untouched benchmarks
⏩ 1515 skipped benchmarks¹

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | binary_search_std | 582.8 ns | 524.4 ns | +11.12% |

Comparing jiaqizho:support-rangeread (3d68adc) with develop (876813b)


Footnotes

  1. 1515 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@jiaqizho
Author

jiaqizho commented Mar 17, 2026

@joseph-isaacs @gatesn @connortsui20

Thanks for the thorough review and great suggestions! I've force-pushed an update addressing the feedback.

I have replaced the static match encoding_id dispatch with a VTable::plan_range_read method. Each encoding now declares its own buffer sub-ranges and child recursion strategy via the vtable without centralized match on encoding names. Unsupported encodings return None by default and fall back to full segment reads.
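The shape of such a vtable hook might look like the following; this is a hedged sketch with illustrative names, not the actual vortex `VTable` definition:

```rust
// Illustrative shape of a per-encoding range planner: each encoding may
// map a row range to byte sub-ranges, with `None` meaning "unsupported,
// read the full segment". Not the actual vortex VTable.
use std::ops::Range;

trait EncodingVTable {
    fn plan_range_read(&self, _rows: Range<usize>) -> Option<Vec<Range<usize>>> {
        None // default: fall back to a full segment read
    }
}

struct Primitive {
    elem_size: usize,
}

impl EncodingVTable for Primitive {
    fn plan_range_read(&self, rows: Range<usize>) -> Option<Vec<Range<usize>>> {
        // Fixed-width values map rows to bytes directly.
        Some(vec![rows.start * self.elem_size..rows.end * self.elem_size])
    }
}

struct RunEnd; // variable-length runs: keeps the `None` default

impl EncodingVTable for RunEnd {}
```

This keeps the dispatch extensible: adding range read support to a new encoding means implementing one method, with no centralized match to update.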

Design discussion: Opened #6991 with the motivation, full design, benchmark data (from my setup), and future directions (nullable support, BufferHandle-based approach, etc.).

Would love any further feedback on the design or implementation — happy to keep iterating!

@joseph-isaacs joseph-isaacs added the action/benchmark Trigger full benchmarks to run on this PR label Mar 17, 2026
@connortsui20 connortsui20 added changelog/feature A new feature and removed action/benchmark Trigger full benchmarks to run on this PR labels Mar 17, 2026
@connortsui20
Contributor

@jiaqizho If you look at the CI actions, you'll see that this change doesn't pass a few checks (formatting and docs). Let us know when you've fixed it and we can run CI for you!

@jiaqizho jiaqizho force-pushed the support-rangeread branch from 533ce6a to 458273c on March 17, 2026 at 12:08
@jiaqizho
Author

jiaqizho commented Mar 17, 2026

@jiaqizho If you look at the CI actions, you'll see that this change doesn't pass a few checks (formatting and docs). Let us know when you've fixed it and we can run CI for you!

@connortsui20 Thanks for the heads up! I've pushed fixes for the formatting and public API lock files. These were generated locally, so I'm not 100% sure they match your CI environment — please let me know if anything still fails.

@connortsui20
Contributor

Seems like https://github.com/vortex-data/vortex/actions/runs/23193429228/job/67395598365?pr=6974 is still failing, just need to fix the doc links

When a FlatLayout has its array_tree metadata inlined in the footer, we can
figure out exactly which bytes of the segment are needed for a given row range
without any IO. This lets us issue a single small read instead of fetching the
entire segment, which is a big win for point lookups and narrow scans on wide
tables.

The range read planner walks the encoding tree via `VTable::plan_range_read`,
where each encoding (Primitive, Bool, BitPacked, Delta, FoR, ZigZag, ALP, ALPRD,
Dict, FixedSizeList, Constant, Null, Sequence, ByteBool, DateTimeParts,
DecimalByteParts) declares its own buffer sub-ranges and child recursion strategy.
If the resulting byte range is less than 50% of the full segment, we issue the
targeted read; otherwise we fall back to reading the whole segment.

To make Delta work with sub-ranged buffers, Delta::build() now derives child
array lengths from `len + offset` instead of metadata.deltas_len. On disk,
offset is always 0 so this is a no-op for the normal decode path, but it lets
the range read pass a smaller decode_len without the decoder panicking on buffer
size mismatch.

Also adds `request_range()` to the SegmentSource trait with a default fallback
implementation, efficient overrides in FileSegmentSource and BufferSegmentSource,
a `RangeReadEnabled` session flag, and `ScanBuilder::with_split_row_indices` to
generate per-index tight ranges for point lookups.

Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>
@jiaqizho jiaqizho force-pushed the support-rangeread branch from 458273c to 3d68adc on March 17, 2026 at 13:26
@jiaqizho
Author

Seems like https://github.com/vortex-data/vortex/actions/runs/23193429228/job/67395598365?pr=6974 is still failing, just need to fix the doc links

@connortsui20 Done, please retrigger. Thanks!

@jiaqizho
Author

@joseph-isaacs @gatesn @connortsui20

Hi, wanted to follow up on this. The discussion in #6991 has the full design and S3 benchmark data. The key question is whether the vtable-based plan_range_read approach addresses the extensibility concern from the original PR review. Let me know if you'd prefer a different direction — happy to iterate.

@connortsui20
Contributor

See #6991 (comment) for more discussion
