feat(vamana): deferred compression for dynamic Vamana index #326
Draft
Build a dynamic Vamana index initially with an uncompressed backend (FP32/FP16) and transparently swap to a compressed target backend (SQ8 / LVQ / LeanVec) once a configured live-vector threshold is crossed. The graph, ID translator, status, and entry point are reused; only the underlying Data is retrained and transplanted.

Public API:
- MutableVamanaIndex::TransplantTag ctor + rvalue release_*() accessors
- DynamicVamana::get_typed_impl<QueryTypes, Impl>()
- DynamicIndexParams::deferred_compression_threshold (default 0)
- DynamicIndexParams::initial_storage_kind (FP32 / FP16)
- DynamicVamanaIndex::get_current_storage_kind()

Behavior:
- threshold == 0 -> unchanged eager build.
- threshold > 0 -> initial build uses initial_storage_kind, swap closure retrains and transplants on add() once size() >= threshold.
- Fast path: when the very first add() already meets threshold, the runtime skips staging and builds the target backend directly.

Tests:
- 5 new runtime tests under [runtime][deferred_compression]: FP32 -> LVQ4x8, FP32 -> LeanVec4x8, default-disabled passthrough, rejection of unsupported initial kinds, first-add fast-path.
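The three behavior rules above can be illustrated with a small toy model. This is only a sketch of the decision logic, not the runtime's implementation: `ToyDeferredIndex`, `Kind`, and all member names are hypothetical, and the retrain-and-transplant step is reduced to flipping a storage-kind tag.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; none of these names come from the PR.
enum class Kind { FP32, FP16, LVQ4x8 };

struct ToyDeferredIndex {
    Kind initial_kind;      // uncompressed staging backend
    Kind target_kind;       // compressed backend to end up on
    std::size_t threshold;  // 0 disables deferral
    Kind current_kind;      // what a get_current_storage_kind()-style query would report
    std::size_t live = 0;   // number of live vectors
    bool built = false;     // whether the first add() has happened

    ToyDeferredIndex(Kind initial, Kind target, std::size_t thresh)
        : initial_kind{initial}
        , target_kind{target}
        , threshold{thresh}
        , current_kind{target} {}

    void add(std::size_t n) {
        if (!built) {
            // The first add() decides which backend is built:
            //  - threshold == 0: eager build on the target backend;
            //  - n >= threshold: fast path, build the target directly;
            //  - otherwise: stage on the uncompressed initial backend.
            const bool direct = (threshold == 0) || (n >= threshold);
            current_kind = direct ? target_kind : initial_kind;
            live = n;
            built = true;
            return;
        }
        live += n;
        // Swap once the live-vector count reaches the threshold.
        // In the real index this is where retraining and the transplant would run.
        if (threshold > 0 && current_kind != target_kind && live >= threshold) {
            current_kind = target_kind;
        }
    }
};
```

With a threshold of 100, adding 40 and then 59 vectors keeps the toy index on the initial kind, and one more vector triggers the swap; a first batch of 100 or more goes straight to the target kind.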
rfsaliev reviewed on May 5, 2026
///
/// The new instance recomputes ``first_empty_`` from ``status``.
template <typename Pool>
MutableVamanaIndex(
Member
I would consider a kind of 'move' constructor with a signature like:
template <typename SourceData, typename Pool>
MutableVamanaIndex(
    MutableVamanaIndex<Graph, SourceData, Dist>&& source_index,
    Pool threadpool,
    ...
)
    : graph_{source_index.release_graph()},
    ...
{
    ...
}
In that case, user code does not need to know about the internal structure of MutableVamanaIndex or which data has to be moved.
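A self-contained toy version of this idea might look like the sketch below. It is not the SVS code: `ToyIndex`, `ToyGraph`, and their members are hypothetical stand-ins, and only a graph and an entry point are carried over. The point it illustrates is that a converting constructor templated on the source data type can pull the reusable parts out of the source index itself, so the caller only supplies the newly trained data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy stand-in for the graph; not the real SVS type.
struct ToyGraph {
    std::vector<std::vector<std::size_t>> adjacency;
};

template <typename Data>
class ToyIndex {
  public:
    ToyIndex(Data data, ToyGraph graph, std::size_t entry_point)
        : data_{std::move(data)}
        , entry_point_{entry_point}
        , graph_{std::move(graph)} {}

    // Converting 'move' constructor: adopts the graph and entry point of an
    // index built over a different data type and pairs them with new data.
    // The caller never touches the source index's internals.
    template <typename SourceData>
    ToyIndex(ToyIndex<SourceData>&& source, Data new_data)
        : data_{std::move(new_data)}
        , entry_point_{source.entry_point()}
        , graph_{std::move(source).release_graph()} {
        // Mirror the size validation described for the transplant.
        if (data_.size() != graph_.adjacency.size()) {
            throw std::invalid_argument("data size must match graph node count");
        }
    }

    // Rvalue-qualified so the graph can only be taken from an expiring index.
    ToyGraph release_graph() && { return std::move(graph_); }

    std::size_t entry_point() const { return entry_point_; }
    std::size_t n_nodes() const { return graph_.adjacency.size(); }
    const Data& data() const { return data_; }

  private:
    Data data_;
    std::size_t entry_point_;
    ToyGraph graph_;
};
```

Constructing a `ToyIndex<std::vector<std::uint8_t>>` from `std::move(staged)` plus new data is then a single expression at the call site, which is the ergonomic gain the comment is pointing at.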
Summary
Adds support for "deferred compression" on svs::DynamicVamana: the index can be built initially with an uncompressed backend (FP32/FP16) and transparently swap to a compressed target backend (SQ8 / LVQ / LeanVec) once a configured live-vector threshold is crossed. The already-built graph, ID translator, status, and entry point are reused; only the underlying Data is retrained and transplanted.

Motivation
Compressed backends (LVQ/LeanVec) require sufficient training data for good recall. With purely eager dynamic builds, users either (a) build with poor compression on a tiny initial population or (b) rebuild from scratch later. Deferred compression resolves this: build cheaply on an uncompressed backend, then swap once enough vectors are present.

Public API additions
- svs::index::vamana::MutableVamanaIndex::TransplantTag: new constructor + tag that builds a MutableVamanaIndex over a new Data while reusing an existing graph / status / entry-point / translator. Validates data.size() == graph.n_nodes() == status.size().
- release_data() / release_graph() / release_status() / release_entry_point() / release_translator() on MutableVamanaIndex: move-out accessors enabling the transplant.
- svs::DynamicVamana::get_typed_impl<QueryTypes, Impl>(): controlled type-erasure escape hatch returning the concrete MutableVamanaIndex<...>* (or nullptr on type mismatch).
- svs::runtime::v0::VamanaIndex::DynamicIndexParams:
  - size_t deferred_compression_threshold = 0 (0 disables; >0 arms).
  - StorageKind initial_storage_kind = StorageKind::FP32 (must be FP32 or FP16).
- svs::runtime::v0::DynamicVamanaIndex::get_current_storage_kind(): makes the transition from the initial to the target kind observable.

Runtime behavior
- threshold == 0: unchanged eager behavior.
- threshold > 0: the initial build uses initial_storage_kind and installs a per-source-tag swap closure that retrains and transplants when size() >= threshold.
- Fast path: when the very first add() already meets the threshold, the runtime skips the staging build and constructs the target compressed backend directly.

Out of scope