fix(rust): fixed IVF_RQ indexes ignoring the frag-reuse index#7217
Open
gstamatakis95 wants to merge 1 commit into
Open
fix(rust): fixed IVF_RQ indexes ignoring the frag-reuse index#7217gstamatakis95 wants to merge 1 commit into
gstamatakis95 wants to merge 1 commit into
Conversation
ad10713 to
c9a48a9
Compare
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #7216
After compaction with defer_index_remap=true, covering vector indexes are expected to remap their stored row addresses through the __lance_frag_reuse index at load time. The IVF_PQ storage does this, but
RabitQuantizationStorage::try_from_batch received the frag-reuse index as a parameter named _fri and discarded it. As a result an IVF_RQ index loaded from a debt-carrying version kept its pre-compaction row addresses, and any ANN query that fetched row content failed at the scanner take stage with take operation specified fragment id N but this fragment does not exist.
The IVF state cache reconstruction path in reconstruct_typed passed a hard-coded None for the frag-reuse index, so even with the storage fixed the remap was skipped whenever the index was rebuilt from a cached state. The cache key already includes the frag-reuse uuid, so cached states with and without remap debt stay distinct when the index is threaded through.
This PR makes the following changes:
In rust/lance-index/src/vector/bq/storage.rs, try_from_batch now builds a row id mapping for the partition from FragReuseIndex::remap_row_id, mirroring the PQ derivation, and applies the storage's existing
remap method. Reusing remap instead of porting the PQ inline rewrite is deliberate, because RQ codes live in a packed and permuted SIMD layout and remap already performs the layout-correct filter and repack,
including the multi-bit split ex code and factor columns. It is already pinned by the existing identity-remap regression tests. When no frag-reuse index is present, when its row id maps are empty, or when
no row in the partition is affected, the path stays a no-op. The generic remap_row_ids_record_batch helper used by the flat and SQ storages was considered and does not fit here, both because it assumes a two
column batch and because a plain row-wise take is not safe on the packed code layout.
In rust/lance/src/index/vector/ivf/v2.rs and rust/lance/src/index.rs, IvfStateEntry::reconstruct gains a frag-reuse parameter and the caller passes open_frag_reuse_index, which is already cached in the index
cache, closing the warm cache hole without adding load overhead.
In rust/lance/src/dataset/optimize.rs, a regression test test_read_ivf_rq_index_v3_with_defer_index_remap is added next to the PQ analog. It queries with a non-empty projection because the take stage where
this bug manifests is only exercised when row content is fetched. The existing PQ test uses an empty projection and never reaches that stage, which is why this went undetected.
Verification: with the storage remap reverted, the new test fails with the exact defect signature from scanner.rs. With the fix it passes. cargo test -p lance defer_index_remap passes (the new RQ test plus
the existing PQ tests), cargo test -p lance-index bq passes, and cargo fmt and cargo clippy -- -D warnings are clean on the touched crates.