feat(vamana): deferred compression for dynamic Vamana index #326

Draft
ibhati wants to merge 1 commit into main from ib/deferred-compression

Conversation

@ibhati (Member) commented May 5, 2026

Summary

Adds support for "deferred compression" on svs::DynamicVamana: the index can be built initially with an uncompressed backend (FP32/FP16) and transparently swap to a compressed target backend (SQ8 / LVQ / LeanVec) once a configured live-vector threshold is crossed. The already-built graph, ID translator, status, and entry point are reused — only the underlying Data is retrained and transplanted.

Motivation

Compressed backends (LVQ/LeanVec) require sufficient training data for good recall. With purely eager dynamic builds, users must either (a) build with poor compression over a tiny initial population or (b) rebuild from scratch later. Deferred compression resolves this: build cheaply on an uncompressed backend, then swap once enough vectors are present.

Public API additions

  • svs::index::vamana::MutableVamanaIndex::TransplantTag — new constructor + tag that builds a MutableVamanaIndex over a new Data while reusing an existing graph / status / entry-point / translator. Validates data.size() == graph.n_nodes() == status.size().
  • Rvalue-qualified release_data() / release_graph() / release_status() / release_entry_point() / release_translator() on MutableVamanaIndex — move-out accessors enabling the transplant.
  • svs::DynamicVamana::get_typed_impl<QueryTypes, Impl>() — controlled type-erasure escape hatch returning the concrete MutableVamanaIndex<...>* (or nullptr on type mismatch).
  • svs::runtime::v0::VamanaIndex::DynamicIndexParams:
    • size_t deferred_compression_threshold = 0 (0 disables; >0 arms).
    • StorageKind initial_storage_kind = StorageKind::FP32 (must be FP32 or FP16).
  • svs::runtime::v0::DynamicVamanaIndex::get_current_storage_kind() — observable transition between initial and target kind.

Runtime behavior

  • Default (threshold == 0): unchanged eager behavior.
  • Threshold > 0: builds the initial backend with initial_storage_kind, installs a per-source-tag swap closure that retrains and transplants when size() >= threshold.
  • Fast path: if the very first add() already meets the threshold, the runtime skips the staging build and constructs the target compressed backend directly.

Out of scope

  • Python bindings (separate follow-up).
  • Persistence of deferred config (saved indices remain post-swap state).

Build a dynamic Vamana index initially with an uncompressed backend
(FP32/FP16) and transparently swap to a compressed target backend
(SQ8 / LVQ / LeanVec) once a configured live-vector threshold is
crossed. The graph, ID translator, status, and entry point are reused;
only the underlying Data is retrained and transplanted.

Public API:
- MutableVamanaIndex::TransplantTag ctor + rvalue release_*() accessors
- DynamicVamana::get_typed_impl<QueryTypes, Impl>()
- DynamicIndexParams::deferred_compression_threshold (default 0)
- DynamicIndexParams::initial_storage_kind (FP32 / FP16)
- DynamicVamanaIndex::get_current_storage_kind()

Behavior:
- threshold == 0 -> unchanged eager build.
- threshold > 0  -> initial build uses initial_storage_kind, swap closure
  retrains and transplants on add() once size() >= threshold.
- Fast path: when the very first add() already meets threshold, the
  runtime skips staging and builds the target backend directly.

Tests:
- 5 new runtime tests under [runtime][deferred_compression]:
  FP32 -> LVQ4x8, FP32 -> LeanVec4x8, default-disabled passthrough,
  rejection of unsupported initial kinds, first-add fast-path.
///
/// The new instance recomputes ``first_empty_`` from ``status``.
template <typename Pool>
MutableVamanaIndex(
I would think about a kind of 'move' constructor with a signature like:

template <typename SourceData, typename Pool>
MutableVamanaIndex(
  MutableVamanaIndex<Graph, SourceData, Dist>&& source_index,
  Pool threadpool,
 ...
)
  : graph_{std::move(source_index).release_graph()},
    ...
{
...
}

In that case, user code would not need to know about MutableVamanaIndex's internal structure and which data has to be moved.

