SepRAG BET 4 (finding): IVF cluster-pruning is structurally redundant with tuned nprobe (NO-GO) #540

Open
shaal wants to merge 6 commits into ruvnet:main from shaal:feat/seprag-bet4-ivf-pruning

shaal commented Jun 5, 2026

SepRAG BET 4 (finding): IVF cluster-pruning is structurally redundant with tuned nprobe — NO-GO

This is a research finding (a NO-GO), not a feature request, so there is no merge urgency. It branches off clean main (independent of #535/#537/#539) and lives in a self-contained bench crate that touches no shipping code. Linked to #534. Writeup: ADR-205.

The question (closes the ADR-201 caveat)

ADR-201 built the region-pruning IVF kernel and validated it as exact, but only ran it as BET 2's mechanism against ACORN, never against its natural incumbent, plain IVF nprobe, on unfiltered ANN. This is that head-to-head: does a triangle-inequality lower bound, LB(q,c) = max(0, ‖q−μ_c‖ − r_c), where μ_c is the centroid of cluster c and r_c its maximum member radius, combined with branch-and-bound probing, beat tuned nprobe at matched recall@10?

The gate was pre-registered and frozen before any run (docs/plans/bet4-ivf-pruning/PRE-REGISTRATION.md): WIN = ≥2× member-eval reduction AND a wall-clock win across nclusters ∈ {64,256,1024}; KILL = <1.5×.
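
The frozen gate can be read as a small decision rule. The following is a minimal sketch, not the harness's actual code: the `Cell` and `Verdict` types, the `Inconclusive` middle band, and the choice that a single cell under 1.5× is enough to KILL are all assumptions made for illustration.

```rust
/// Outcome of the gate for one sweep (names are hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    Win,
    Kill,
    Inconclusive,
}

/// One cell of the sweep: a single `nclusters` value at matched recall.
#[derive(Debug, Clone, Copy)]
pub struct Cell {
    /// Member evaluations per query for plain nprobe.
    pub nprobe_evals: f64,
    /// Member evaluations per query for the B&B contender.
    pub bnb_evals: f64,
    /// Whether B&B was also faster in wall-clock time.
    pub wall_clock_win: bool,
}

impl Cell {
    /// Reduction factor; values above 1.0 mean B&B evaluates fewer members.
    pub fn ratio(&self) -> f64 {
        self.nprobe_evals / self.bnb_evals
    }
}

/// WIN needs >= 2x and a wall-clock win in every cell.
/// KILL (assumed here) triggers when any cell falls below 1.5x.
pub fn verdict(cells: &[Cell]) -> Verdict {
    if cells.is_empty() {
        return Verdict::Inconclusive;
    }
    let all_win = cells.iter().all(|c| c.ratio() >= 2.0 && c.wall_clock_win);
    let any_kill = cells.iter().any(|c| c.ratio() < 1.5);
    if all_win {
        Verdict::Win
    } else if any_kill {
        Verdict::Kill
    } else {
        Verdict::Inconclusive
    }
}
```

With a ratio of 1.00× in every cell, this rule returns `Verdict::Kill` regardless of the wall-clock flag.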

Result — NO-GO (robust, structural)

Three contenders share one index per nclusters (only the probe loop differs): plain nprobe, B&B LB-order (the faithful BET-2 kernel), and the steelman B&B (centroid-distance order + LB-skip — the strongest cluster-level variant).

The steelman ties nprobe at exactly 1.00× member-evals in EVERY cell — n=20k and n=50k, 128-d and a PCA-8 low-dim control.

n=50k, 128-d (member-evals/q at matched recall@10=0.95):

| nclusters | exact-prune | plain nprobe | B&B steelman | ratio |
|---|---|---|---|---|
| 64 | 0.0% | 11,102 | 11,102 | 1.00× |
| 256 | 4.7% | 7,890 | 7,890 | 1.00× |
| 1024 | 13.1% | 5,682 | 5,682 | 1.00× |
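
One plausible reading of "matched recall" is: sweep the probe budget for each contender, then compare the cheapest setting that reaches the target. The sketch below assumes that procedure; `SweepPoint` and `cheapest_at_recall` are hypothetical names, and the real harness may select its operating point differently.

```rust
/// One measured operating point of a contender (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SweepPoint {
    /// Probe budget used (nprobe or max_probe).
    pub budget: usize,
    /// Mean recall@k over the query set.
    pub recall: f64,
    /// Mean member evaluations per query.
    pub evals_per_query: f64,
}

/// Returns the point with the fewest member evaluations whose recall
/// meets `target`, or `None` if no point reaches it.
pub fn cheapest_at_recall(points: &[SweepPoint], target: f64) -> Option<SweepPoint> {
    points
        .iter()
        .copied()
        .filter(|p| p.recall >= target)
        .min_by(|a, b| a.evals_per_query.total_cmp(&b.evals_per_query))
}
```

Comparing the two contenders' `evals_per_query` at their respective cheapest points then gives the ratio column.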

Mechanism (structural, not dimensional): the true top-k live in the nearest clusters; every method must scan them. The bound only prunes far clusters — which tuned nprobe already skips. So the bound is redundant with nprobe's centroid-distance cutoff, and the win is structurally impossible, in any dimension. The faithful LB-order kernel is strictly worse (0.18–0.25×): ordering by LB probes far, large-radius clusters early.
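
The redundancy argument can be made concrete with the bound itself. The sketch below is illustrative only: `l2`, `lower_bound`, and `can_skip` are hypothetical helper names, not functions from the bench crate, and the skip rule (LB at or above the current k-th best distance) follows the kernel description in this PR.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
pub fn l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// LB(q, c) = max(0, ||q - mu_c|| - r_c).
/// No member of the cluster can be closer to `q` than this value,
/// provided `radius` covers every member.
pub fn lower_bound(q: &[f32], centroid: &[f32], radius: f32) -> f32 {
    (l2(q, centroid) - radius).max(0.0)
}

/// A cluster is safely skippable when even its best case cannot
/// improve on the current k-th best distance.
pub fn can_skip(q: &[f32], centroid: &[f32], radius: f32, kth_best: f32) -> bool {
    lower_bound(q, centroid, radius) >= kth_best
}
```

A cluster whose radius covers the query has LB = 0 and is not skipped for any positive k-th best distance, so the nearest clusters get scanned by every method. Only far clusters produce a large LB, and those are the ones a tuned nprobe never visits.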

Honest deviation (kept prominent)

The frozen pre-registration expected the PCA-8 control to show B&B winning (tight bound). It tied (1.00×) instead — my premise was false (a tight bound beats full scan, not tuned nprobe). The control still validated the kernel: the exact-prune fraction scales correctly with dimension (0–13% @128-d vs 8–82.5% @PCA-8), so the bound prunes hard when it can — it's just redundant with nprobe. The control disproved my predicted mechanism and taught the real one. Recorded as such in ADR-205 (cf. ADR-203's documented deviations).

Scope / caveats

  • Kills cluster-level triangle-inequality pruning vs tuned nprobe at recall@10≈0.95. Does not speak to within-list/PQ pruning (IVFADC asymmetric distance), which is a different mechanism and the only open lever.
  • Two correctness gates pass: full-budget B&B is exact (recall ≥ 0.999); the instrumented incumbent matches ruvector-rairs::IvfFlat within 0.01 recall.
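
Both correctness gates reduce to comparing recall@k values. A minimal sketch of that metric, assuming results are lists of vector ids ordered best-first; `recall_at_k` here is a hypothetical stand-in for the helper in oracle.rs, whose real signature is not shown in this PR.

```rust
use std::collections::HashSet;

/// Fraction of the true top-k ids that appear in the returned top-k.
/// `truth` and `found` are id lists ordered best-first.
pub fn recall_at_k(truth: &[usize], found: &[usize], k: usize) -> f64 {
    let k = k.min(truth.len());
    if k == 0 {
        // No ground truth to recover; report zero rather than divide by zero.
        return 0.0;
    }
    let truth_set: HashSet<usize> = truth[..k].iter().copied().collect();
    let hits = found
        .iter()
        .take(k)
        .filter(|id| truth_set.contains(*id))
        .count();
    hits as f64 / k as f64
}
```

The exactness gate then checks that the mean of this value over all queries is at least 0.999, and the faithfulness gate checks that two implementations differ by at most 0.01.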

Scoreboard

2 WINS (ADR-200/202 reuse+periodic; ADR-204 incremental) / 4 KILLS (ADR-199 CCH; ADR-201 filtered-ANN; ADR-203 KG-treewidth; ADR-205 IVF cluster-pruning).

New self-contained crate crates/ruvector-bet4-ivf-bench (3 tests green, clippy clean); harness examples/ivf_pruning_sweep.rs. No changes to any shipping crate.

shaal added 6 commits June 5, 2026 00:25
Closes the BET 4 caveat left open by ADR-201: the region-pruning IVF
kernel was only run against ACORN (BET 2), never against its natural
incumbent, plain IVF nprobe, on unfiltered ANN. Frozen gate: WIN = >=2x
member-scan reduction at matched recall@10 (R=0.95) AND wall-clock win
across nclusters in {64,256,1024}; KILL = <1.5x or wall-clock reverses.
Two controls: exact-vs-exact pruning-fraction probe + low-d (PCA-8)
soundness control. Honest prior: NO-GO lean (128-d concentration makes
the triangle-inequality bound loose) — the IVF-level companion to
ADR-199. Branch off clean main; B&B kernel rebuilt self-contained
(BET 2's lives only on ruvnet#536).
…s certified)

New crate ruvector-bet4-ivf-bench (deps: ruvector-rairs, rand).
- data.rs: aligned arxiv 128-d feature CSV loader.
- kernel.rs: BnBIvf — IVF probed in ascending lower-bound order with B&B
  early termination (break when LB >= kth-best); LB(q,c)=max(0,|q-mu_c|-r_c),
  r_c=max member radius. Full budget = exact; max_probe cap = nprobe analogue.
  Built on ruvector-rairs kmeans so it shares centroids with the IvfFlat
  incumbent (shared-index pre-reg requirement).
- oracle.rs: brute-force exact kNN + recall@k + shared true-L2 helper.
- M0 gate test PASSES on real arxiv slice: full-budget B&B == oracle
  (recall@10 >= 0.999) → B&B invariant certified. clippy clean.

Frozen gate: docs/plans/bet4-ivf-pruning/PRE-REGISTRATION.md. Off clean main.
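
The LB-order kernel described in this commit can be sketched as follows. This is a toy reconstruction under stated assumptions, not the crate's `BnBIvf`: the `Cluster` and `TopK` types, the free function `search_lb_order`, and its `(ids, member_evals)` return value are hypothetical, and clusters are assumed to be prebuilt with a correct covering radius.

```rust
/// One IVF list: centroid, covering radius, and (id, vector) members.
pub struct Cluster {
    pub centroid: Vec<f32>,
    /// Max distance from the centroid to any member (r_c).
    pub radius: f32,
    pub members: Vec<(usize, Vec<f32>)>,
}

/// Euclidean (L2) distance.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// The k best (distance, id) pairs seen so far, sorted ascending.
struct TopK {
    k: usize,
    items: Vec<(f32, usize)>,
}

impl TopK {
    fn new(k: usize) -> Self {
        assert!(k > 0, "k must be positive");
        TopK { k, items: Vec::with_capacity(k + 1) }
    }

    /// Current k-th best distance, or infinity until k results exist.
    fn kth_best(&self) -> f32 {
        if self.items.len() < self.k {
            f32::INFINITY
        } else {
            self.items[self.k - 1].0
        }
    }

    /// Inserts the candidate if it strictly improves on the k-th best.
    fn consider(&mut self, dist: f32, id: usize) {
        if dist < self.kth_best() {
            let pos = self.items.partition_point(|&(d, _)| d <= dist);
            self.items.insert(pos, (dist, id));
            self.items.truncate(self.k);
        }
    }
}

/// Probes clusters in ascending lower-bound order and stops at the first
/// cluster whose LB cannot beat the k-th best. Returns (ids, member evals).
pub fn search_lb_order(
    clusters: &[Cluster],
    q: &[f32],
    k: usize,
    max_probe: usize,
) -> (Vec<usize>, usize) {
    let mut order: Vec<(f32, usize)> = clusters
        .iter()
        .enumerate()
        .map(|(i, c)| ((l2(q, &c.centroid) - c.radius).max(0.0), i))
        .collect();
    order.sort_by(|a, b| a.0.total_cmp(&b.0));

    let mut top = TopK::new(k);
    let mut evals = 0usize;
    for &(lb, i) in order.iter().take(max_probe) {
        if lb >= top.kth_best() {
            break; // every later cluster has an LB at least this large
        }
        for (id, v) in &clusters[i].members {
            top.consider(l2(q, v), *id);
            evals += 1;
        }
    }
    (top.items.iter().map(|&(_, id)| id).collect(), evals)
}
```

Passing `clusters.len()` as `max_probe` gives the full-budget, exact mode; a smaller cap plays the role of nprobe. The global `break` is only safe because the list is sorted by LB.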
…aithfulness gate

BnBIvf::search_nprobe: the plain-IVF incumbent strategy (nprobe nearest
centroids, scan all members, no B&B) on the SAME centroids/lists as the
B&B contender, with member-eval counting. Refactored top-k accumulation
into shared consider()/finalize() so both strategies accumulate
identically and only the probe loop differs (shared-index pre-reg
requirement). New gate instrumented_nprobe_matches_rairs PASSES: recall
matches ruvector-rairs::IvfFlat within 0.01 at matched params → the
cost-measured incumbent is algorithmically the real one. 3 tests green.
- kernel: search_bnb_skip — the STEELMAN. Centroid-distance order (the
  effective nprobe ordering) + per-cluster LB-skip (correctness-safe in
  any order, unlike the LB-order global break). The strongest cluster-level
  B&B: if it can't beat tuned nprobe, the bound doesn't pay.
- pca: minimal power-iteration top-m PCA (no linalg dep) for the low-dim
  control — projects real arxiv features to 8-d where the bound is tight.
- examples/ivf_pruning_sweep: 3 contenders share one index per nclusters
  (plain nprobe / B&B LB-order / B&B steelman) x 2 regimes (128-d, PCA-8),
  exact-regime pruning probe, matched-recall@0.95, frozen-gate verdict.
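
The incumbent and the steelman differ only in one branch of the probe loop, which a toy version makes visible. This sketch is an assumption-laden illustration, not the crate's code: `search_nprobe` and `search_bnb_skip` are written here as free functions with hypothetical signatures, `Cluster`, `TopK`, and `probe` are invented for the example, and skipped clusters are assumed to count against the probe budget.

```rust
/// One IVF list: centroid, covering radius, and (id, vector) members.
pub struct Cluster {
    pub centroid: Vec<f32>,
    pub radius: f32,
    pub members: Vec<(usize, Vec<f32>)>,
}

/// Euclidean (L2) distance.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// The k best (distance, id) pairs seen so far, sorted ascending.
struct TopK {
    k: usize,
    items: Vec<(f32, usize)>,
}

impl TopK {
    fn kth_best(&self) -> f32 {
        if self.items.len() < self.k { f32::INFINITY } else { self.items[self.k - 1].0 }
    }

    fn consider(&mut self, dist: f32, id: usize) {
        if dist < self.kth_best() {
            let pos = self.items.partition_point(|&(d, _)| d <= dist);
            self.items.insert(pos, (dist, id));
            self.items.truncate(self.k);
        }
    }
}

/// Shared probe loop in centroid-distance order. With `lb_skip` set, a
/// cluster whose lower bound cannot beat the k-th best is skipped.
fn probe(clusters: &[Cluster], q: &[f32], k: usize, budget: usize, lb_skip: bool) -> (Vec<usize>, usize) {
    assert!(k > 0, "k must be positive");
    let mut order: Vec<(f32, usize)> = clusters
        .iter()
        .enumerate()
        .map(|(i, c)| (l2(q, &c.centroid), i))
        .collect();
    order.sort_by(|a, b| a.0.total_cmp(&b.0));

    let mut top = TopK { k, items: Vec::new() };
    let mut evals = 0usize;
    for &(d, i) in order.iter().take(budget) {
        let lb = (d - clusters[i].radius).max(0.0);
        if lb_skip && lb >= top.kth_best() {
            continue; // per-cluster skip: safe in any probe order
        }
        for (id, v) in &clusters[i].members {
            top.consider(l2(q, v), *id);
            evals += 1;
        }
    }
    (top.items.iter().map(|&(_, id)| id).collect(), evals)
}

/// Plain IVF: scan every member of the `nprobe` nearest clusters.
pub fn search_nprobe(clusters: &[Cluster], q: &[f32], k: usize, nprobe: usize) -> (Vec<usize>, usize) {
    probe(clusters, q, k, nprobe, false)
}

/// Steelman: same order and budget, plus the per-cluster LB skip.
pub fn search_bnb_skip(clusters: &[Cluster], q: &[f32], k: usize, max_probe: usize) -> (Vec<usize>, usize) {
    probe(clusters, q, k, max_probe, true)
}
```

On a toy index where the nearest cluster already holds the true top-k, both functions evaluate the same members at the smallest sufficient budget. The skip only saves work when the budget is larger than needed, which is the full-scan case rather than the tuned-nprobe case.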

RESULT (n=20k & n=50k both): steelman = 1.00x evals vs nprobe in EVERY
cell, BOTH regimes. NO-GO. Mechanism is structural, not dimensional: the
LB bound only prunes FAR clusters that tuned nprobe already skips, so it's
redundant with nprobe's centroid-distance cutoff. Exact-prune fraction
scales correctly with dim (0-13% @128-d, 8-87% @PCA-8) => kernel sound;
the redundancy is fundamental. LB-ORDER (faithful BET-2 kernel) is strictly
WORSE (0.18-0.25x) — LB-ordering probes far large-radius clusters early.
…l NO-GO

Verdict: NO-GO (robust, structural). Steelman B&B (centroid order +
LB-skip) ties tuned nprobe at exactly 1.00x member-evals in every cell,
n=20k & n=50k, 128-d & PCA-8. Mechanism: the triangle-inequality bound
only prunes FAR clusters that tuned nprobe already skips => redundant with
nprobe's centroid-distance cutoff; win is structurally impossible, not
just hard in high-d. LB-order (faithful BET-2 kernel) strictly worse
(0.18-0.25x). Companion to ADR-199.

Honest deviation recorded: the pre-registered PCA-8 control expected a B&B
WIN (tight bound). It tied instead — the premise was false (tight bound
beats full-scan, not tuned nprobe). Control still valid: exact-prune
fraction scales correctly with dim (0-13% @128-d, 8-82% @PCA-8) => kernel
sound; it revealed the structural redundancy. Scoreboard 2 WINS / 4 KILLS.