docs: finish RankQuantFastscan public-API promotion + positioning lift (PR3) #243
…lift
#233 promoted RankQuantFastscan to a stable public type in lib.rs only; the README, RANK_MODES, compatibility-policy, determinism, THREAT_MODEL, the fastscan module-doc, the rank_io persistence-contract doc, c-api, CHANGELOG, and the Python binding docstring all still described it as #[doc(hidden)] / unstable and undercounted the on-disk formats (four -> five, adds .ovfs) and the cargo-fuzz targets (eight -> nine). This finishes the propagation so the public surface is self-consistent.
Also:
- fuzz.yml weekly sweep now actually runs load_fastscan. The .ovfs loader was a registered fuzz target but was missing from the weekly matrix, so the THREAT_MODEL "all nine targets" claim is now true rather than aspirational.
- README "What's different" positioning lift: sharpen the training-free / data-oblivious no-fit wedge, credit the FAISS FastScan + binary-quant-rescore lineage for the two-stage / RankQuantFastscan path (no novelty overclaim), and make the HNSW params precise (ef_construction=200, ef_search=128).
- Claims-discipline softening (no in-repo benchmark backs them): sign_bitmap.rs "competitive with / superior to learned hash codes" -> a mechanism statement; the RANK_MODES arXiv paper-harness numbers (under the real-corpus rerun guardrail) -> redirect to the reproducible BEIR harness; README + CHANGELOG "nothing hand-entered" -> accurate "tables transcribe the harness summary outputs".
No code changes: doc comments, markdown, the Python module docstring, and the fuzz workflow matrix only.
Gate: fmt + clippy (-D warnings) + cargo doc (-D warnings) + doctests all clean.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
PR Summary by Qodo
Docs: complete RankQuantFastscan public API promotion and CI fuzz coverage
Walkthroughs
Description
• Make RankQuantFastscan's stable/public status consistent across docs, rustdoc, and bindings.
• Update persistence/docs to include .ovfs/OVFS and correct loader/format counts.
• Add load_fastscan to the weekly cargo-fuzz CI matrix to fuzz all loaders.
Diagram
graph TD
A["Docs (README + md)"] --> B["Rust public API (lib.rs)"] --> C["RankQuantFastscan type"] --> D[(".ovfs / OVFS format")]
E["Python binding docs"] --> B
F["CI workflow (fuzz.yml)"] --> G["cargo-fuzz weekly"] --> H["load_fastscan target"] --> I["rank_io loader"] --> D
High-Level Assessment
The following are alternative approaches to this PR:
1. Add a docs/CI consistency check script
2. Generate format/target lists from source-of-truth constants
3. Reduce duplicated enumerations in prose docs
Recommendation: The PR's approach (propagate the public-API promotion and correct inventories everywhere they're stated) is the right immediate fix. As a follow-up guardrail, consider adding a small CI lint that asserts the fuzz workflow matrix includes every
File Changes
Documentation (12)
Other (1)
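The CI lint suggested in the recommendation above is cut off in this capture, so the following is only a minimal sketch of one plausible reading: check that every registered fuzz target also appears in the workflow matrix. It assumes the default cargo-fuzz layout of one `<name>.rs` file per target under a `fuzz_targets` directory. The function names and the placeholder target names are hypothetical; only `load_fastscan` comes from this PR.

```python
from pathlib import Path


def registered_targets(fuzz_targets_dir: Path) -> set[str]:
    """Return target names, assuming one `<name>.rs` file per fuzz target."""
    return {p.stem for p in fuzz_targets_dir.glob("*.rs")}


def missing_from_matrix(registered: set[str], matrix: list[str]) -> list[str]:
    """Return registered targets that the CI matrix does not list, sorted."""
    return sorted(registered - set(matrix))


if __name__ == "__main__":
    # Hypothetical inputs: `example_target_a` / `example_target_b` are
    # placeholders, not real target names from the repository.
    registered = {"load_fastscan", "example_target_a", "example_target_b"}
    weekly_matrix = ["example_target_a", "example_target_b"]
    print(missing_from_matrix(registered, weekly_matrix))  # ['load_fastscan']
```

A real lint would read the matrix from fuzz.yml and fail the job when the returned list is non-empty; the YAML parsing is left out here to keep the sketch dependency-free.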
Codecov Report
✅ All modified and coverable lines are covered by tests.
Code Review
This pull request stabilizes the RankQuantFastscan API, making it a public, documented type with .ovfs persistence support. It also renames the on-disk format magics to OV* (retaining backward compatibility with legacy TV* magics), adds a ninth cargo-fuzz target (load_fastscan), and updates the documentation and benchmarks to reference a reproducible BEIR harness. There are no review comments, so I have no additional feedback to provide.
The CI weekly sweep now runs all nine targets, but fuzz/run_full_fuzz.sh (the manual dev campaign helper) still defaulted to the eight pre-.ovfs targets, so a full manual run skipped load_fastscan, the untrusted .ovfs loader. Add it to the default TARGETS list (now nine) and fix the comment, so the manual campaign matches the THREAT_MODEL 'all nine targets' claim and the CI weekly matrix.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Follow-up to adding load_fastscan: the header comment still quoted the eight-target campaign totals (~3h x 8 ~= 24h; the SECS_PER_TARGET=120 example ~16 min). Targets run sequentially, so the totals scale with the count: now ~3h x 9 ~= 27h and ~18 min. The runtime 'est. total' print is already computed from n_targets, so only the static guidance was stale.
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
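The corrected totals follow from the targets running one after another, so the estimate is the target count times the per-target budget. Here is a small sketch of that arithmetic; the helper name is hypothetical, and the real script is a shell helper whose internals are not shown in this PR.

```python
def campaign_seconds(n_targets: int, secs_per_target: int) -> int:
    """Estimated wall-clock time for a sequential fuzz campaign, in seconds."""
    return n_targets * secs_per_target


if __name__ == "__main__":
    three_hours = 3 * 3600
    # Full campaign: roughly 3 h per target.
    print(campaign_seconds(8, three_hours) / 3600)  # 24.0 (old, eight targets)
    print(campaign_seconds(9, three_hours) / 3600)  # 27.0 (new, nine targets)
    # Short example with SECS_PER_TARGET=120.
    print(campaign_seconds(8, 120) / 60)  # 16.0
    print(campaign_seconds(9, 120) / 60)  # 18.0
```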
What & why
#233 promoted RankQuantFastscan to a stable public type in lib.rs only. The rest of the public surface (README, docs, threat model, source rustdoc, CHANGELOG, Python binding) still described it as #[doc(hidden)] / "API not yet stable" and undercounted the on-disk formats (4→5, adds .ovfs) and cargo-fuzz targets (8→9). The result was a self-contradictory public surface. This PR finishes the propagation, plus a guardrail-clean positioning lift.
Branched off the current origin/main (4784f59, #233), the true public surface, not the local tree (which was 2 commits behind). No code changes: doc comments, markdown, the Python module docstring, and the fuzz workflow matrix only.
Staleness refresh (the core)
README.md: … #[doc(hidden)]); .ovfs added to the trust-model format list; HNSW params made precise (ef_construction=200, ef_search=128)
src/lib.rs: …
src/fastscan.rs: … #[doc(hidden)] framing for the type; the free fn stays pub(crate))
src/rank_io.rs: … RankQuantFastscan (.ovfs)
docs/RANK_MODES.md: … add/search/write/load (.ovfs), swap_remove still unsupported; probe deferral noted (#232)
docs/compatibility-policy.md: … search_asymmetric_byte_lut stays #[doc(hidden)])
docs/determinism.md: …
docs/c-api.md: … RankQuantFastscan (.ovfs)
THREAT_MODEL.md: … .ovfs / load_fastscan added; status date bump
CHANGELOG.md: … OV* rename) under Changed, #233 (FastScan public + .ovfs) under Added
ordvec-python docstring: … RankQuantFastscan is a Rust-only specialized type (parity gap stated, not silent)
Beyond docs: one real fix
fuzz.yml weekly sweep now actually runs load_fastscan. The .ovfs loader was a registered fuzz target but missing from the weekly matrix, so the THREAT_MODEL "all nine targets" line was aspirational. Added it (smoke matrix unchanged), so the claim is now true and the untrusted .ovfs loader is genuinely fuzzed.
Positioning lift (you approved: one PR, you review)
Claims-discipline softening (you approved both)
- sign_bitmap.rs "competitive with / sometimes superior to learned hash codes" → mechanism statement (no in-repo head-to-head backs the comparison).
- RANK_MODES.md arXiv paper-harness numbers (207,695 / 7,200 + the FAISS-HNSW comparison, under the real-corpus rerun guardrail) → redirect to the reproducible BEIR harness, which is the README's sanctioned real-corpus result. Also fixes the now-stale "summarized in the README" cross-refs (the README leads with BEIR, not arXiv).
Guardrails honored
No fabricated numbers (only reproducible BEIR + synthetic bench_rank); no "third category"; hypergeometric kept as a selectivity null; turbovec provenance unchanged. Every ordvec number in the positioning lift is from the in-repo reproducible run.
Deliberately deferred (flagged, not done here)
- ordvec-go / ordvec-ffi README stubs (parity pointers): additive, separate follow-up.
- CHANGELOG compare-link footer: v0.5.0 is not tagged (only v0.2.0 / v0.3.0 / v0.4.0), so I did not add [Unreleased] / [0.5.0] links that would 404. Worth a maintainer decision on tagging.
- CLAUDE.md is local-only / gitignored; it's materially stale (says v0.2.0 / .tv* / FastScan hidden) and I'll refresh it separately. It's not part of this diff.
Test plan
- cargo fmt --all --check ✓
- cargo clippy --all-targets --all-features -- -D warnings ✓
- RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features ✓ (new RankQuantFastscan intra-doc link resolves)
- cargo test --doc ✓ (4 doctests)
- fuzz.yml validated as YAML; weekly matrix = 9 targets incl. load_fastscan
🤖 Generated with Claude Code
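For anyone wanting to re-run the gate locally, the four cargo commands in the test plan can be chained in a small driver. This is a sketch only: the commands and the RUSTDOCFLAGS value come from the test plan, while the `GATE` list, the `run_gate` helper, and its `dry_run` flag are hypothetical and not part of the repository.

```python
import os
import subprocess

# (extra environment, command) pairs, in the order listed in the test plan.
GATE = [
    ({}, ["cargo", "fmt", "--all", "--check"]),
    ({}, ["cargo", "clippy", "--all-targets", "--all-features", "--", "-D", "warnings"]),
    ({"RUSTDOCFLAGS": "-D warnings"}, ["cargo", "doc", "--no-deps", "--all-features"]),
    ({}, ["cargo", "test", "--doc"]),
]


def run_gate(dry_run: bool = False) -> bool:
    """Run each gate step in order; stop and return False on the first failure."""
    for extra_env, cmd in GATE:
        if dry_run:
            # Only show what would run, without invoking cargo.
            print(" ".join(cmd))
            continue
        result = subprocess.run(cmd, env={**os.environ, **extra_env})
        if result.returncode != 0:
            return False
    return True
```

Calling `run_gate(dry_run=True)` prints the commands without needing a Rust toolchain, which is handy for checking the list itself.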