fix(layout): preserve nullability of structs nested in structs#8590
fix(layout): preserve nullability of structs nested in structs#8590miniex wants to merge 1 commit into
Conversation
a nullable struct nested directly in another single-field struct came back non-nullable on a file round-trip. stop collapsing `pack`/`merge` in the `partition` splitter and only fast-path non-reconstruction, so validity lands at each scope's level. Closes vortex-data#8348 Signed-off-by: Han Damin <miniex@daminstudio.net>
Merging this PR will not alter performance
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] |
168.8 µs | 205.6 µs | -17.89% |
| ⚡ | Simulation | chunked_varbinview_canonical_into[(100, 100)] |
259.5 µs | 224.4 µs | +15.64% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] |
306.6 µs | 271.7 µs | +12.84% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] |
273.6 ns | 244.4 ns | +11.93% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing miniex:fix/nested-struct-nullability (9b2114d) with develop (15cec3b)
Footnotes
-
4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
|
This might be required but I wonder if there's a more elegant solution to this? As in can we correctly handle pack/merge/get_item validity such that we account for validity when doing expression rewrite so the splitting ends up being correct. |
Summary
A nullable struct field nested directly inside another single-field struct came back non-nullable on a file round-trip (
{c0={a={b=i32}?}?}was read back as{c0={a={b=i32}}?}), silently dropping the inner?. The struct reader's single-partition fast path delegated this scope's reconstruction to the child, so chained single-field structs accumulatedpackwrappers and each scope's validity landed at the wrong nesting level. I stopped thepartitionsplitter from collapsingpack/mergereconstruction nodes, and made the struct reader take the fast path only when nothing is reconstructed at this level, so reconstruction now flows through the multi-partition path, one level per scope. Multi-field structs were already unaffected.The fuzzer's struct-of-struct
assume()guard can be dropped once the fuzzer is ondevelop.Closes: #8348
Testing
Added a round-trip regression test in
vortex-file/tests/test_write_table.rsfor the nested nullable-struct schema; it dropped the inner?before this change and now round-trips unchanged.cargo nextest run -p vortex-array -p vortex-layout -p vortex-filepasses (3270 tests, including thepartitionand struct-reader regressions);cargo +nightly fmt --all --checkandcargo clippy --all-targets --all-featuresare clean.I'm Korean, so sorry if any wording reads a little awkward.