SepRAG BET 4 (finding): IVF cluster-pruning is structurally redundant with tuned nprobe - NO-GO #540
Open
shaal wants to merge 6 commits into
Closes the BET 4 caveat left open by ADR-201: the region-pruning IVF
kernel was only run against ACORN (BET 2), never against its natural
incumbent, plain IVF nprobe, on unfiltered ANN. Frozen gate: WIN = >=2x
member-scan reduction at matched recall@10 (R=0.95) AND wall-clock win
across nclusters in {64,256,1024}; KILL = <1.5x or wall-clock reverses.
Two controls: exact-vs-exact pruning-fraction probe + low-d (PCA-8)
soundness control. Honest prior: NO-GO lean (128-d concentration makes
the triangle-inequality bound loose) — the IVF-level companion to
ADR-199. Branch off clean main; B&B kernel rebuilt self-contained
(BET 2's lives only on ruvnet#536).
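
The frozen gate above can be sketched as a small decision function. This is a minimal illustration, not the crate's code: the `Cell` and `Verdict` types, their field names, and the rule for combining cells (any failing cell kills, every cell must win) are assumptions made here, and the `Inconclusive` band between 1.5x and 2x is inferred rather than stated in the pre-registration.

```rust
/// Outcome of the gate. `Inconclusive` covers the band between the KILL and
/// WIN thresholds (an assumption: the source names only WIN and KILL).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Win,
    Kill,
    Inconclusive,
}

/// One sweep cell: a single `nclusters` value at matched recall@10.
/// All field names are hypothetical.
#[derive(Debug, Clone, Copy)]
pub struct Cell {
    pub nprobe_evals: f64, // member evals per query, plain nprobe
    pub bnb_evals: f64,    // member evals per query, B&B contender
    pub nprobe_secs: f64,  // wall-clock, plain nprobe
    pub bnb_secs: f64,     // wall-clock, B&B contender
}

/// KILL if any cell shows <1.5x reduction or a wall-clock reversal;
/// WIN only if every cell shows >=2x reduction and a wall-clock win.
pub fn gate(cells: &[Cell]) -> Verdict {
    if cells.is_empty() {
        return Verdict::Inconclusive;
    }
    let mut all_win = true;
    for c in cells {
        let reduction = c.nprobe_evals / c.bnb_evals;
        if reduction < 1.5 || c.bnb_secs > c.nprobe_secs {
            return Verdict::Kill;
        }
        if reduction < 2.0 || c.bnb_secs >= c.nprobe_secs {
            all_win = false;
        }
    }
    if all_win {
        Verdict::Win
    } else {
        Verdict::Inconclusive
    }
}
```

Under these assumptions, a contender that ties the incumbent at 1.00x in any cell is a KILL regardless of timing.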
…s certified)

New crate ruvector-bet4-ivf-bench (deps: ruvector-rairs, rand).
- data.rs: aligned arxiv 128-d feature CSV loader.
- kernel.rs: BnBIvf: IVF probed in ascending lower-bound order with B&B early termination (break when LB >= kth-best); LB(q,c)=max(0,|q-mu_c|-r_c), r_c=max member radius. Full budget = exact; max_probe cap = nprobe analogue. Built on ruvector-rairs kmeans so it shares centroids with the IvfFlat incumbent (shared-index pre-reg requirement).
- oracle.rs: brute-force exact kNN + recall@k + shared true-L2 helper.
- M0 gate test PASSES on real arxiv slice: full-budget B&B == oracle (recall@10 >= 0.999) → B&B invariant certified. clippy clean.

Frozen gate: docs/plans/bet4-ivf-pruning/PRE-REGISTRATION.md. Off clean main.
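
The LB-order kernel described in this commit can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the contents of `kernel.rs`: the `Cluster` struct, the function names `lower_bound` and `search_lb_order`, and the sorted-`Vec` top-k are all hypothetical, and clusters are supplied directly rather than built by k-means.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// One IVF list. `radius` must be the max distance from `centroid`
/// to any member, otherwise the bound below is not valid.
pub struct Cluster {
    pub centroid: Vec<f32>,
    pub radius: f32,
    pub members: Vec<Vec<f32>>,
}

/// LB(q,c) = max(0, |q - mu_c| - r_c): no member of `c` can be closer than this.
pub fn lower_bound(q: &[f32], c: &Cluster) -> f32 {
    (l2(q, &c.centroid) - c.radius).max(0.0)
}

/// Probe clusters in ascending lower-bound order and stop once the next
/// bound cannot improve the k-th best distance. Returns the k smallest
/// distances (ascending) and the number of member evaluations.
pub fn search_lb_order(q: &[f32], clusters: &[Cluster], k: usize) -> (Vec<f32>, usize) {
    let mut best: Vec<f32> = Vec::with_capacity(k + 1);
    let mut evals = 0usize;
    if k == 0 {
        return (best, evals);
    }
    let mut order: Vec<(f32, usize)> = clusters
        .iter()
        .enumerate()
        .map(|(i, c)| (lower_bound(q, c), i))
        .collect();
    order.sort_by(|a, b| a.0.total_cmp(&b.0));
    for (lb, i) in order {
        // Every remaining cluster has a bound >= lb, so a global break is safe here.
        if best.len() == k && lb >= best[k - 1] {
            break;
        }
        for m in &clusters[i].members {
            evals += 1;
            let d = l2(q, m);
            let pos = best.partition_point(|&x| x <= d);
            if pos < k {
                best.insert(pos, d);
                best.truncate(k);
            }
        }
    }
    (best, evals)
}
```

The global `break` is only correct because the probe order is sorted by the bound itself; with any other ordering a per-cluster skip is needed instead.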
…aithfulness gate

BnBIvf::search_nprobe: the plain-IVF incumbent strategy (nprobe nearest centroids, scan all members, no B&B) on the SAME centroids/lists as the B&B contender, with member-eval counting. Refactored top-k accumulation into shared consider()/finalize() so both strategies accumulate identically and only the probe loop differs (shared-index pre-reg requirement).

New gate instrumented_nprobe_matches_rairs PASSES: recall matches ruvector-rairs::IvfFlat within 0.01 at matched params → the cost-measured incumbent is algorithmically the real one. 3 tests green.
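
A sketch of the incumbent strategy with a shared accumulator follows. The names `search_nprobe`, `consider`, and `finalize` come from the commit message, but their signatures, the `TopK` and `List` types, and the choice to count evaluations inside `consider` are assumptions made for illustration only.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// Shared top-k accumulator (hypothetical shape). Every probing strategy
/// feeds candidates through `consider`, so only the probe loop differs.
pub struct TopK {
    k: usize,
    best: Vec<(f32, usize)>, // (distance, id), ascending by distance
    evals: usize,
}

impl TopK {
    pub fn new(k: usize) -> Self {
        TopK { k, best: Vec::with_capacity(k + 1), evals: 0 }
    }

    /// Record one member evaluation and keep it if it is among the k best.
    pub fn consider(&mut self, id: usize, dist: f32) {
        self.evals += 1;
        let pos = self.best.partition_point(|&(d, _)| d <= dist);
        if pos < self.k {
            self.best.insert(pos, (dist, id));
            self.best.truncate(self.k);
        }
    }

    /// Return the ids of the k best (nearest first) and the eval count.
    pub fn finalize(self) -> (Vec<usize>, usize) {
        (self.best.into_iter().map(|(_, id)| id).collect(), self.evals)
    }
}

/// One inverted list: a centroid and its (id, vector) members.
pub struct List {
    pub centroid: Vec<f32>,
    pub members: Vec<(usize, Vec<f32>)>,
}

/// Plain IVF: scan every member of the `nprobe` lists with nearest centroids.
pub fn search_nprobe(q: &[f32], lists: &[List], k: usize, nprobe: usize) -> (Vec<usize>, usize) {
    let mut order: Vec<(f32, usize)> = lists
        .iter()
        .enumerate()
        .map(|(i, l)| (l2(q, &l.centroid), i))
        .collect();
    order.sort_by(|a, b| a.0.total_cmp(&b.0));
    let mut top = TopK::new(k);
    for &(_, i) in order.iter().take(nprobe) {
        for (id, v) in &lists[i].members {
            top.consider(*id, l2(q, v));
        }
    }
    top.finalize()
}
```

Counting evaluations in the accumulator rather than in each probe loop is one way to keep the cost metric identical across contenders.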
- kernel: search_bnb_skip, the STEELMAN. Centroid-distance order (the effective nprobe ordering) + per-cluster LB-skip (correctness-safe in any order, unlike the LB-order global break). The strongest cluster-level B&B: if it can't beat tuned nprobe, the bound doesn't pay.
- pca: minimal power-iteration top-m PCA (no linalg dep) for the low-dim control; projects real arxiv features to 8-d where the bound is tight.
- examples/ivf_pruning_sweep: 3 contenders share one index per nclusters (plain nprobe / B&B LB-order / B&B steelman) x 2 regimes (128-d, PCA-8), exact-regime pruning probe, matched-recall@0.95, frozen-gate verdict.

RESULT (n=20k & n=50k both): steelman = 1.00x evals vs nprobe in EVERY cell, BOTH regimes. NO-GO. Mechanism is structural, not dimensional: the LB bound only prunes FAR clusters that tuned nprobe already skips, so it's redundant with nprobe's centroid-distance cutoff. Exact-prune fraction scales correctly with dim (0-13% @128-d, 8-87% @PCA-8) => kernel sound; the redundancy is fundamental. LB-ORDER (faithful BET-2 kernel) is strictly WORSE (0.18-0.25x): LB-ordering probes far large-radius clusters early.
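
The steelman and the redundancy it exposes can be illustrated on toy data. This is a sketch under assumptions, not `search_bnb_skip` itself: the function `probe_centroid_order`, its `lb_skip` flag, and the `Cluster` struct are hypothetical, and a single function serves as both contenders so that only the skip differs.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

/// One IVF list; `radius` is the max centroid-to-member distance.
pub struct Cluster {
    pub centroid: Vec<f32>,
    pub radius: f32,
    pub members: Vec<Vec<f32>>,
}

/// Probe up to `max_probe` clusters in centroid-distance order.
/// With `lb_skip = false` this is plain nprobe; with `lb_skip = true` a
/// cluster is skipped when its lower bound cannot beat the k-th best so far.
/// Returns the k smallest distances (ascending) and the member-eval count.
pub fn probe_centroid_order(
    q: &[f32],
    clusters: &[Cluster],
    k: usize,
    max_probe: usize,
    lb_skip: bool,
) -> (Vec<f32>, usize) {
    let mut best: Vec<f32> = Vec::with_capacity(k + 1);
    let mut evals = 0usize;
    if k == 0 {
        return (best, evals);
    }
    let mut order: Vec<(f32, usize)> = clusters
        .iter()
        .enumerate()
        .map(|(i, c)| (l2(q, &c.centroid), i))
        .collect();
    order.sort_by(|a, b| a.0.total_cmp(&b.0));
    for &(dc, i) in order.iter().take(max_probe) {
        let lb = (dc - clusters[i].radius).max(0.0);
        // `continue`, not `break`: bounds are not monotone in centroid order.
        if lb_skip && best.len() == k && lb >= best[k - 1] {
            continue;
        }
        for m in &clusters[i].members {
            evals += 1;
            let d = l2(q, m);
            let pos = best.partition_point(|&x| x <= d);
            if pos < k {
                best.insert(pos, d);
                best.truncate(k);
            }
        }
    }
    (best, evals)
}
```

On a toy index with one near cluster and two far ones, the skip at full budget and plain nprobe tuned to a budget of one both scan only the near cluster, which mirrors the tie described above. The toy case illustrates the mechanism only and says nothing about the benchmark numbers.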
…l NO-GO

Verdict: NO-GO (robust, structural). Steelman B&B (centroid order + LB-skip) ties tuned nprobe at exactly 1.00x member-evals in every cell, n=20k & n=50k, 128-d & PCA-8. Mechanism: the triangle-inequality bound only prunes FAR clusters that tuned nprobe already skips => redundant with nprobe's centroid-distance cutoff; win is structurally impossible, not just hard in high-d. LB-order (faithful BET-2 kernel) strictly worse (0.18-0.25x). Companion to ADR-199.

Honest deviation recorded: the pre-registered PCA-8 control expected a B&B WIN (tight bound). It tied instead: the premise was false (tight bound beats full-scan, not tuned nprobe). Control still valid: exact-prune fraction scales correctly with dim (0-13% @128-d, 8-82% @PCA-8) => kernel sound; it revealed the structural redundancy.

Scoreboard 2 WINS / 4 KILLS.
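
For reference, the bound and the skip condition behind this verdict can be written out as follows. The symbol $d_k$ (the current k-th best distance) is introduced here for illustration and does not appear in the source; the last line assumes $d_k > 0$.

```latex
\begin{aligned}
&\text{For every member } x \text{ of cluster } c:\quad \lVert x - \mu_c \rVert \le r_c. \\
&\lVert q - \mu_c \rVert \;\le\; \lVert q - x \rVert + \lVert x - \mu_c \rVert \;\le\; \lVert q - x \rVert + r_c \\
&\Longrightarrow\quad \lVert q - x \rVert \;\ge\; \max\bigl(0,\ \lVert q - \mu_c \rVert - r_c\bigr) \;=\; \mathrm{LB}(q,c). \\
&\text{Cluster } c \text{ is skippable when } \mathrm{LB}(q,c) \ge d_k
\;\iff\; \lVert q - \mu_c \rVert \;\ge\; d_k + r_c .
\end{aligned}
```

Read this way, a cluster is pruned only when its centroid is far from the query relative to $d_k$ and $r_c$. That is consistent with the stated mechanism: such clusters sit late in centroid-distance order, where a tuned nprobe cutoff has typically stopped already.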
# SepRAG BET 4 (finding): IVF cluster-pruning is structurally redundant with tuned `nprobe` - NO-GO

This is a research finding (a NO-GO), not a feature request, so there is no merge urgency. It is off clean `main` (independent of #535/#537/#539), in a self-contained bench crate that touches no shipping code. Linked to #534. Writeup: ADR-205.

## The question (closes the ADR-201 caveat)

ADR-201 built the region-pruning IVF kernel and validated it as exact, but only ran it as BET 2's mechanism against ACORN, never against its natural incumbent, plain IVF `nprobe`, on unfiltered ANN. This is that head-to-head: does a triangle-inequality lower bound, `LB(q,c)=max(0, ‖q−μ_c‖ − r_c)`, with branch-and-bound probing beat tuned `nprobe` at matched recall@10?

Gate pre-registered and frozen before any run (`docs/plans/bet4-ivf-pruning/PRE-REGISTRATION.md`): WIN = ≥2× member-eval reduction AND wall-clock win across `nclusters ∈ {64,256,1024}`; KILL = <1.5×.

## Result: NO-GO (robust, structural)
Three contenders share one index per `nclusters` (only the probe loop differs): plain `nprobe`, B&B LB-order (the faithful BET-2 kernel), and the steelman B&B (centroid-distance order + LB-skip, the strongest cluster-level variant).

n=50k, 128-d (member-evals/q at matched recall@10=0.95):

[Results table not preserved. Per the commit log, the steelman B&B ties plain `nprobe` at 1.00× member-evals in every cell, and B&B LB-order lands at 0.18–0.25×.]

**Mechanism (structural, not dimensional):** the true top-k live in the nearest clusters, so every method must scan them. The bound only prunes far clusters, which tuned `nprobe` already skips. The bound is therefore redundant with `nprobe`'s centroid-distance cutoff, and the win is structurally impossible, in any dimension. The faithful LB-order kernel is strictly worse (0.18–0.25×): ordering by `LB` probes far, large-radius clusters early.

## Honest deviation (kept prominent)
The frozen pre-registration expected the PCA-8 control to show B&B winning (tight bound). It tied (1.00×) instead: my premise was false (a tight bound beats full scan, not tuned `nprobe`). The control still validated the kernel: the exact-prune fraction scales correctly with dimension (0–13% @128-d vs 8–82.5% @PCA-8), so the bound prunes hard when it can; it is just redundant with `nprobe`. The control disproved my predicted mechanism and taught the real one. Recorded as such in ADR-205 (cf. ADR-203's documented deviations).

## Scope / caveats
- … `nprobe`, recall@10≈0.95. Does not speak to within-list/PQ (IVFADC asymmetric distance), which is a different mechanism and the only open lever.
- … `ruvector-rairs::IvfFlat` within 0.01 recall.

## Scoreboard
2 WINS (ADR-200/202 reuse+periodic; ADR-204 incremental) / 4 KILLS (ADR-199 CCH; ADR-201 filtered-ANN; ADR-203 KG-treewidth; ADR-205 IVF cluster-pruning).

New self-contained crate `crates/ruvector-bet4-ivf-bench` (3 tests green, clippy clean); harness `examples/ivf_pruning_sweep.rs`. No changes to any shipping crate.