Layout Reader V27 #8518
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Checkpoint of in-progress V2 ScanNode work (segment scheduling driver, scheduled segment source, scan scheduler) so agent fixes can be integrated on a clean base. Reviewed/benchmarked state. Signed-off-by: Nicholas Gates <nick@nickgates.com>
The scan2 StructScanNode single-field fast paths (single get_item and single-referenced-field expressions) routed straight to the child scan node, bypassing the parent struct's validity mask. Projecting one field out of a nullable struct therefore returned the child's own values and validity with no parent null mask applied, producing wrong nulls (and a non-nullable result where a nullable one was expected).

Mirror the v1 struct reader's `array.mask(validity)` behaviour: add a small MaskScanNode that reads an input value and the struct's non-nullable boolean validity child and produces `mask(input, validity)`. Wrap the single-field fast-path results in MaskScanNode when the struct is nullable. The full push_struct path already threads validity through StructValueScanNode, so it is unchanged.

Add a V1-vs-V2 differential test harness in vortex-file that scans the same ScanRequest through both paths and asserts equality across flat (nullable + non-nullable), chunked, dict-encoded, zoned, and nested nullable-struct fixtures, plus ports of the v1 struct-null regression tests (test_struct_layout_nulls / test_struct_layout_nested) to the V2 path.

Before the fix the five nested-nullable-struct cases failed with "expected i32?, actual i32"; after the fix all 18 cases pass.

Signed-off-by: Nicholas Gates <nick@nickgates.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
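To make the null-propagation rule concrete, here is a minimal sketch of what masking a child field by its parent struct's validity amounts to. It does not use the Vortex API: `mask_by_parent` is a hypothetical helper, `Option<T>` stands in for a nullable column, and the sketch assumes `true` in the validity slice means the struct row is valid.

```rust
/// Hypothetical stand-in for `mask(input, validity)`; not the Vortex API.
/// `child` holds the projected field, where `None` is a null from the child's own validity.
/// `parent_valid[i]` is assumed to be `true` when the enclosing struct row is non-null.
pub fn mask_by_parent<T: Clone>(child: &[Option<T>], parent_valid: &[bool]) -> Vec<Option<T>> {
    assert_eq!(
        child.len(),
        parent_valid.len(),
        "child field and struct validity must cover the same rows"
    );
    child
        .iter()
        .zip(parent_valid)
        // A row survives only if the struct row is valid; the child's own nulls pass through.
        .map(|(value, &struct_is_valid)| if struct_is_valid { value.clone() } else { None })
        .collect()
}
```

The result is always nullable, even when the child field is not, which matches the "expected i32?, actual i32" mismatch the commit describes.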
…filter-first

Port of the V1 multi-conjunct filter behavior to the V2 PartitionWorkScheduler driver:

(1) sort filter conjuncts cheapest-first in PreparedScanNodeFile::try_new so expensive residuals (e.g. FSST LIKE) run after cheap selective ones;
(2) when the demanded-row density falls below EXPR_EVAL_THRESHOLD (0.2), read the residual predicate with selection=need so the leaf returns the compacted array and the expression evaluates over only the demanded rows, scattering the verdict back via Mask::intersect_by_rank.

Adds V1-vs-V2 differential cases (low- and high-density multi-conjunct) and a predicate_cost unit test.

Improves ClickBench multi-conjunct filters (q22 701->547ms, q23 now < V1). A separate single-LIKE FSST amplification (q21) remains and is tracked separately.

Signed-off-by: Nicholas Gates <nick@nickgates.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
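The two steps in this commit can be sketched on plain boolean slices. This is an illustration under stated assumptions, not the code in the PR: the `(name, cost)` pairs, `sort_cheapest_first`, and `use_compacted_eval` are hypothetical, and `intersect_by_rank` here is a free function that only imitates the behaviour the commit attributes to `Mask::intersect_by_rank`. The 0.2 threshold is taken from the commit message.

```rust
/// Density threshold quoted in the commit message.
pub const EXPR_EVAL_THRESHOLD: f64 = 0.2;

/// Hypothetical: order conjuncts by an estimated cost so cheap, selective ones run first.
/// The sort is stable, so equal-cost conjuncts keep their original order.
pub fn sort_cheapest_first(conjuncts: &mut [(String, u32)]) {
    conjuncts.sort_by_key(|conjunct| conjunct.1);
}

/// Hypothetical: decide whether to evaluate the residual over only the demanded rows.
pub fn use_compacted_eval(need: &[bool]) -> bool {
    if need.is_empty() {
        return false;
    }
    let demanded = need.iter().filter(|&&b| b).count();
    (demanded as f64) / (need.len() as f64) < EXPR_EVAL_THRESHOLD
}

/// Scatter a compacted verdict back to full length: the i-th `true` in `need`
/// stays `true` only if `verdict[i]` is `true`.
pub fn intersect_by_rank(need: &[bool], verdict: &[bool]) -> Vec<bool> {
    let demanded = need.iter().filter(|&&b| b).count();
    assert_eq!(demanded, verdict.len(), "one verdict per demanded row");
    let mut next = verdict.iter().copied();
    need.iter()
        // `&&` short-circuits, so a verdict is consumed only for demanded rows.
        .map(|&is_needed| is_needed && next.next().unwrap_or(false))
        .collect()
}
```

Evaluating over the compacted rows pays off only when few rows are demanded, which is why the density check gates it.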
V2 parallelizes the join probe, aggregate, and Arrow decode ACROSS DataFusion partitions (V1 instead fans one partition into many split tasks). When a query projected a heavily-encoded column (e.g. a single RunEnd chunk for lineitem.l_orderkey), the opener fed split_aligned_row_range coarse chunk boundaries, which collapsed every byte-range file_group onto one partition and serialized the probe ~2-wide (TPC-H q4 ran 2.6x slower than V1).

Feed split_aligned_row_range the scan's own morsel ranges instead: the read-column chunk hints, or the 100k-row fallback when a read column is a single chunk (mirroring PreparedScanNodeFile::splits). Each morsel lands wholly in one partition, so the scan spreads across all of DataFusion's byte-range file_groups with no collapse and no chunk straddling a partition boundary. The assignment is contiguous per partition, so it is correct even when the scan output must preserve order.

Also run the Vortex->Arrow conversion on the runtime CPU pool (handle.spawn_cpu + buffered/buffer_unordered) so decode fans out within a partition rather than running serially on the consumer poll thread.

TPC-H SF1 (datafusion-bench, VORTEX_SCAN_IMPL=v2): q4 goes from 2.6x slower than V1 to faster than V1; overall ~parity.

Signed-off-by: Nicholas Gates <nick@nickgates.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
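As a rough illustration of the assignment property described above (each morsel wholly in one partition, contiguous per partition), the sketch below cuts a file into fixed-size morsels and hands each partition a contiguous run of them. It is a simplification and not the PR's `split_aligned_row_range`: the real code aligns DataFusion's byte-range file groups to row ranges, whereas the hypothetical `morsels` and `assign_contiguous` helpers here just divide the morsel list evenly.

```rust
use std::ops::Range;

/// Hypothetical: cut `total_rows` into morsels of at most `morsel_rows` rows
/// (the commit mentions a 100k-row fallback for single-chunk columns).
pub fn morsels(total_rows: u64, morsel_rows: u64) -> Vec<Range<u64>> {
    assert!(morsel_rows > 0, "morsel size must be positive");
    (0..total_rows)
        .step_by(morsel_rows as usize)
        .map(|start| start..(start + morsel_rows).min(total_rows))
        .collect()
}

/// Hypothetical: give each partition a contiguous run of whole morsels.
/// No morsel straddles a partition boundary, and concatenating the partitions
/// in order reproduces the original row order.
pub fn assign_contiguous(morsels: &[Range<u64>], partitions: usize) -> Vec<Vec<Range<u64>>> {
    assert!(partitions > 0, "need at least one partition");
    let n = morsels.len();
    (0..partitions)
        .map(|p| {
            let start = p * n / partitions;
            let end = (p + 1) * n / partitions;
            morsels[start..end].to_vec()
        })
        .collect()
}
```

With coarse boundaries (a single chunk covering the whole file) there is only one range to hand out, which is the collapse onto one partition that the commit fixes by switching to morsel ranges.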
…H_FULL_PLAN With --show-metrics and VORTEX_BENCH_FULL_PLAN=1, print the DataFusion EXPLAIN ANALYZE-style annotated plan (elapsed_compute / output_rows per operator) to stderr, to localize where wall time goes across scan, HashJoin build/probe, and aggregate. Signed-off-by: Nicholas Gates <nick@nickgates.com> Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Rename the runtime scan node API to ScanPlan and move the plan and segment primitives into vortex-scan. Layout v2 now expands directly through layout.new_scan_plan with a plan ScanRequest, and the docs describe the v2 path as the layout scan model. Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>

# Conflicts:
#	Cargo.lock
#	vortex-file/Cargo.toml
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.022x ➖
datafusion / vortex-file-compressed (1.022x ➖, 2↑ 3↓)
No file size changes detected.

Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.987x ➖, 2↑ 3↓)
datafusion / parquet (1.085x ➖, 0↑ 9↓)
datafusion / arrow (1.035x ➖, 1↑ 2↓)
duckdb / vortex-file-compressed (1.099x ➖, 0↑ 12↓)
duckdb / parquet (1.092x ➖, 0↑ 7↓)
File Size Changes (17 files changed, -44.6% overall, 3↑ 14↓)

Benchmarks: FineWeb NVMe
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.818x ✅, 4↑ 0↓)
datafusion / parquet (0.935x ➖, 3↑ 0↓)
duckdb / vortex-file-compressed (0.824x ✅, 4↑ 1↓)
duckdb / parquet (0.998x ➖, 0↑ 0↓)
File Size Changes (3 files changed, -46.3% overall, 0↑ 3↓)

Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.902x ➖, 47↑ 2↓)
datafusion / parquet (0.941x ➖, 16↑ 1↓)
duckdb / vortex-file-compressed (0.914x ➖, 49↑ 7↓)
duckdb / parquet (0.979x ➖, 0↑ 1↓)
File Size Changes (30 files changed, -43.4% overall, 3↑ 27↓)

Benchmarks: Clickbench Sorted on NVME
Verdict: No clear signal (medium confidence)
datafusion / vortex-file-compressed (0.923x ➖, 7↑ 2↓)
datafusion / parquet (1.016x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (1.200x ❌, 1↑ 6↓)
duckdb / parquet (1.000x ➖, 0↑ 0↓)
File Size Changes (201 files changed, -42.6% overall, 53↑ 148↓)

Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.010x ➖, 1↑ 1↓)
duckdb / parquet (1.016x ➖, 0↑ 0↓)
File Size Changes (3 files changed, -32.3% overall, 0↑ 3↓)

Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.919x ➖, 2↑ 2↓)
datafusion / parquet (0.963x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.866x ➖, 3↑ 2↓)
duckdb / parquet (1.003x ➖, 0↑ 0↓)

Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.919x ➖, 4↑ 1↓)
datafusion / parquet (1.002x ➖, 0↑ 0↓)
datafusion / arrow (0.978x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.024x ➖, 1↑ 0↓)
duckdb / parquet (0.992x ➖, 1↑ 0↓)
File Size Changes (47 files changed, -44.4% overall, 15↑ 32↓)

Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.868x ✅, 17↑ 1↓)
datafusion / parquet (0.947x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (0.953x ➖, 14↑ 7↓)
duckdb / parquet (0.979x ➖, 0↑ 0↓)
File Size Changes (201 files changed, -39.1% overall, 51↑ 150↓)

Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.276x ➖, 1↑ 10↓)
datafusion / parquet (0.976x ➖, 3↑ 4↓)
duckdb / vortex-file-compressed (0.997x ➖, 1↑ 2↓)
duckdb / parquet (1.087x ➖, 0↑ 4↓)
On what feels like the 27th time I've explored this space, I think I might finally be getting somewhere.
This design essentially pulls out a scan engine. Layouts are just one way to take serialized arrays and construct a ScanPlan; in theory we could also build a ScanPlan by hand or by any other means.
A ScanPlan node can accept push-down of various operations:
This plan can then be used to answer different types of questions:
[more description to come]