docs: finish RankQuantFastscan public-API promotion + positioning lift (PR3)#243

Merged: project-navi-bot merged 4 commits into main from docs/pr3-fastscan-promotion-and-positioning on Jun 15, 2026

@Fieldnote-Echo

Owner

What & why

#233 promoted RankQuantFastscan to a stable public type in lib.rs only — the rest of the public surface (README, docs, threat model, source rustdoc, CHANGELOG, Python binding) still described it as #[doc(hidden)] / "API not yet stable" and undercounted the on-disk formats (4→5, adds .ovfs) and cargo-fuzz targets (8→9). The result was a self-contradictory public surface. This PR finishes the propagation, plus a guardrail-clean positioning lift.

Branched off the current origin/main (4784f59, #233) — the true public surface — not the local tree (which was 2 commits behind). No code changes: doc comments, markdown, the Python module docstring, and the fuzz workflow matrix only.

Staleness refresh (the core)

| File | Change |
|---|---|
| README.md | FastScan entry → stable/specialized public (was #[doc(hidden)]); .ovfs added to the trust-model format list; HNSW params made precise (ef_construction=200, ef_search=128) |
| src/lib.rs | crate-doc "four families" → acknowledges FastScan as the specialized companion |
| src/fastscan.rs | module-doc no longer contradicts the struct-level "stable public type" (dropped the #[doc(hidden)] framing for the type; the free fn stays pub(crate)) |
| src/rank_io.rs | persistence-contract doc lists RankQuantFastscan (.ovfs) |
| docs/RANK_MODES.md | FastScan now add/search/write/load (.ovfs), swap_remove still unsupported; probe deferral noted (#232) |
| docs/compatibility-policy.md | FastScan reclassified public (only search_asymmetric_byte_lut stays #[doc(hidden)]) |
| docs/determinism.md | "hidden" → "public, specialized" |
| docs/c-api.md | V1 exclusions list RankQuantFastscan (.ovfs) |
| THREAT_MODEL.md | 8→9 fuzz targets, 4→5 loaders/formats, .ovfs/load_fastscan added; status date bump |
| CHANGELOG.md | Unreleased: #230 (OV* rename) under Changed, #233 (FastScan public + .ovfs) under Added |
| ordvec-python | docstring notes RankQuantFastscan is a Rust-only specialized type (parity gap stated, not silent) |
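
The inventory counts refreshed above (fuzz targets 8→9, loaders/formats 4→5) are the kind of drift a small script can flag. The following is a minimal Python sketch, not part of this PR: the function names and the word table are hypothetical, and it only recognizes a count that directly precedes the noun (for example "nine targets" or "9 targets"), so phrasings such as "nine cargo-fuzz targets" would need extra handling.

```python
import re

# Spelled-out counts a prose doc is likely to use; extend as needed.
# (Hypothetical table for illustration, not taken from the repository.)
_WORDS = {"four": 4, "five": 5, "eight": 8, "nine": 9}


def stated_counts(doc_text: str, noun: str) -> set:
    """Collect every count that directly precedes `noun` in the text.

    Matches digits ("9 targets") and the spelled-out words in _WORDS
    ("nine targets"); any other preceding word is ignored.
    """
    found = set()
    pattern = rf"\b(\w+)\s+{re.escape(noun)}\b"
    for token in re.findall(pattern, doc_text, re.IGNORECASE):
        token = token.lower()
        if token.isdigit():
            found.add(int(token))
        elif token in _WORDS:
            found.add(_WORDS[token])
    return found


def is_stale(doc_text: str, noun: str, actual: int) -> bool:
    """True when the doc states a count for `noun` that differs from `actual`."""
    return any(n != actual for n in stated_counts(doc_text, noun))
```

A caller would read a file such as THREAT_MODEL.md, pass its text with the noun "targets" and the real target count, and fail the check when `is_stale` returns `True`.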

Beyond docs — one real fix

  • fuzz.yml weekly sweep now actually runs load_fastscan. The .ovfs loader was a registered fuzz target but missing from the weekly matrix — so the THREAT_MODEL "all nine targets" line was aspirational. Added it (smoke matrix unchanged), so the claim is now true and the untrusted .ovfs loader is genuinely fuzzed.

Positioning lift (you approved: one PR, you review)

  • README "What's different": sharpened the training-free / data-oblivious no-fit wedge (every other compressed path carries a fit step); added a FAISS FastScan lineage credit so the two-stage / FastScan path is framed as batteries-included, not a new technique (fiction-free, no novelty overclaim).

Claims-discipline softening (you approved both)

  • sign_bitmap.rs "competitive with / sometimes superior to learned hash codes" → mechanism statement (no in-repo head-to-head backs the comparison).
  • RANK_MODES.md arXiv paper-harness numbers (207,695 / 7,200 + the FAISS-HNSW comparison, under the real-corpus rerun guardrail) → redirect to the reproducible BEIR harness, which is the README's sanctioned real-corpus result. Also fixes the now-stale "summarized in the README" cross-refs (the README leads with BEIR, not arXiv).
  • README + CHANGELOG "nothing hand-entered" → accurate "tables transcribe the harness summary outputs" (the nDCG/latency tables are transcribed from a representative run).

Guardrails honored

No fabricated numbers (only reproducible BEIR + synthetic bench_rank); no "third category"; hypergeometric kept as a selectivity null; turbovec provenance unchanged. Every ordvec number in the positioning lift is from the in-repo reproducible run.

Deliberately deferred (flagged, not done here)

  • ordvec-go / ordvec-ffi README stubs (parity pointers) — additive, separate follow-up.
  • CHANGELOG compare-link footer — v0.5.0 is not tagged (only v0.2.0/v0.3.0/v0.4.0), so I did not add [Unreleased]/[0.5.0] links that would 404. Worth a maintainer decision on tagging.
  • CLAUDE.md is local-only / gitignored; it's materially stale (says v0.2.0 / .tv* / FastScan hidden) and I'll refresh it separately — it's not part of this diff.

Test plan

  • cargo fmt --all --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features ✓ (new RankQuantFastscan intra-doc link resolves)
  • cargo test --doc ✓ (4 doctests)
  • fuzz.yml validated as YAML; weekly matrix = 9 targets incl. load_fastscan

🤖 Generated with Claude Code

…lift

#233 promoted RankQuantFastscan to a stable public type in lib.rs only; the
README, RANK_MODES, compatibility-policy, determinism, THREAT_MODEL, the
fastscan module-doc, the rank_io persistence-contract doc, c-api, CHANGELOG,
and the Python binding docstring all still described it as #[doc(hidden)] /
unstable and undercounted the on-disk formats (four -> five, adds .ovfs) and
the cargo-fuzz targets (eight -> nine). This finishes the propagation so the
public surface is self-consistent.

Also:
- fuzz.yml weekly sweep now actually runs load_fastscan. The .ovfs loader was
  a registered fuzz target but was missing from the weekly matrix, so the
  THREAT_MODEL "all nine targets" claim is now true rather than aspirational.
- README "What's different" positioning lift: sharpen the training-free /
  data-oblivious no-fit wedge, credit the FAISS FastScan + binary-quant-rescore
  lineage for the two-stage / RankQuantFastscan path (no novelty overclaim),
  and make the HNSW params precise (ef_construction=200, ef_search=128).
- Claims-discipline softening (no in-repo benchmark backs them):
  sign_bitmap.rs "competitive with / superior to learned hash codes" -> a
  mechanism statement; the RANK_MODES arXiv paper-harness numbers (under the
  real-corpus rerun guardrail) -> redirect to the reproducible BEIR harness;
  README + CHANGELOG "nothing hand-entered" -> accurate "tables transcribe the
  harness summary outputs".

No code changes: doc comments, markdown, the Python module docstring, and the
fuzz workflow matrix only. Gate: fmt + clippy (-D warnings) + cargo doc
(-D warnings) + doctests all clean.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

@qodo-code-review

qodo-code-review Bot commented Jun 15, 2026


Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Great, no issues found!

Qodo reviewed your code and found no material issues that require review

@qodo-code-review


PR Summary by Qodo

Docs: complete RankQuantFastscan public API promotion and CI fuzz coverage
📝 Documentation ⚙️ Configuration changes 🕐 20-40 Minutes

Walkthroughs

Description
• Make RankQuantFastscan’s stable/public status consistent across docs, rustdoc, and bindings.
• Update persistence/docs to include .ovfs/OVFS and correct loader/format counts.
• Add load_fastscan to the weekly cargo-fuzz CI matrix to fuzz all loaders.
Diagram
graph TD
  A["Docs (README + md)"] --> B["Rust public API (lib.rs)"] --> C["RankQuantFastscan type"] --> D[(".ovfs / OVFS format")]
  E["Python binding docs"] --> B
  F["CI workflow (fuzz.yml)"] --> G["cargo-fuzz weekly"] --> H["load_fastscan target"] --> I["rank_io loader"] --> D
High-Level Assessment

The following are alternative approaches to this PR:

1. Add a docs/CI consistency check script
  • ➕ Prevents future drift between docs (format/target counts) and the repository state
  • ➕ Low effort; can run in CI as a fast lint
  • ➖ Still relies on duplicated knowledge; only detects mismatch after changes land
2. Generate format/target lists from source-of-truth constants
  • ➕ Eliminates duplicated enumerations (formats, fuzz targets) across docs
  • ➕ Makes updates automatic when adding new formats/targets
  • ➖ More build/tooling complexity (custom generator, rustdoc include, or md templating)
  • ➖ Harder to edit docs purely as prose
3. Reduce duplicated enumerations in prose docs
  • ➕ Less chance of stale counts and lists
  • ➕ Keeps docs focused on invariants rather than inventories
  • ➖ Docs become less concrete; harder for readers to see full supported surface at a glance

Recommendation: The PR’s approach (propagate the public-API promotion and correct inventories everywhere they’re stated) is the right immediate fix. As a follow-up guardrail, consider adding a small CI lint that asserts the fuzz workflow matrix includes every fuzz/fuzz_targets/*.rs target and that the threat model’s loader/format counts match the rank_io magics; this keeps the documentation honest without requiring a heavier doc-generation pipeline.
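
The first half of that lint (every fuzz target appears in the workflow) could look roughly like the Python sketch below. It is an illustration, not code from this PR: the two paths come from the text above, the function names are hypothetical, and the check is a coarse whole-word text search over the entire workflow file, so a target mentioned only in a comment or only in the smoke matrix would still pass.

```python
import re
from pathlib import Path


def fuzz_targets_on_disk(targets_dir: Path) -> set:
    """Return the stem of every *.rs file in the fuzz targets directory."""
    return {p.stem for p in targets_dir.glob("*.rs")}


def missing_from_workflow(targets: set, workflow_text: str) -> list:
    """Return the targets that never appear as a whole word in the workflow.

    The lookarounds stop `load_fastscan` from matching inside a longer
    identifier such as `load_fastscan_v2`.
    """
    return sorted(
        t
        for t in targets
        if not re.search(rf"(?<![\w-]){re.escape(t)}(?![\w-])", workflow_text)
    )


def main(repo_root: Path = Path(".")) -> int:
    """Return 0 when every on-disk target is named in fuzz.yml, else 1."""
    targets = fuzz_targets_on_disk(repo_root / "fuzz" / "fuzz_targets")
    workflow = (repo_root / ".github" / "workflows" / "fuzz.yml").read_text(
        encoding="utf-8"
    )
    missing = missing_from_workflow(targets, workflow)
    if missing:
        print("fuzz.yml is missing targets: " + ", ".join(missing))
        return 1
    print(f"all {len(targets)} fuzz targets appear in fuzz.yml")
    return 0
```

A CI step could call `main()` from the repository root and fail the job on a non-zero return. Restricting the search to the weekly matrix would need a YAML parser, which is left out here to keep the sketch free of third-party dependencies.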

File Changes

Documentation (12)
CHANGELOG.md Changelog entries for FastScan public API and OV* magic rename +13/-2

• Clarifies that benchmark tables are transcribed from harness outputs (not “nothing hand-entered”). Adds an Unreleased entry noting 'RankQuantFastscan' public stabilization, '.ovfs' persistence, and the ninth fuzz target; also records 'TV*'→'OV*' magic renames under Changed.

CHANGELOG.md


README.md Reposition FastScan as stable/specialized; document '.ovfs' and benchmark wording +25/-17

• Rewrites top-level positioning to emphasize training-free/data-oblivious quantization and removes contradictory “doc(hidden)” framing for 'RankQuantFastscan'. Documents '.ovfs'/'OVFS' in the trust model section, adds lineage attribution for FastScan/two-stage patterns, and makes HNSW parameters precise; updates benchmark reproducibility wording to “tables transcribe harness outputs.”

README.md


THREAT_MODEL.md Threat model refresh for '.ovfs' loader and 9 fuzz targets +15/-13

• Updates status date and expands the deserialization surface to include '.ovfs'/'OVFS' with notes about lack of legacy magic. Corrects fuzz target inventory from eight to nine and states that five loaders (including 'load_fastscan') are fuzz-covered.

THREAT_MODEL.md


RANK_MODES.md Document FastScan capabilities and redirect real-corpus claims to BEIR harness +23/-23

• Replaces stale arXiv-paper-harness references with pointers to the reproducible in-repo BEIR harness summarized in the README. Updates the FastScan section to reflect stable/public status plus support for 'write'/'load' via '.ovfs', and notes deferred metadata-probe support.

docs/RANK_MODES.md


c-api.md Explicitly exclude FastScan ('RankQuantFastscan') from ABI v1 +5/-5

• Updates the ABI v1 exclusions list to include 'RankQuantFastscan' and calls out the '.ovfs' FastScan path as excluded for now.

docs/c-api.md


compatibility-policy.md Reclassify 'RankQuantFastscan' as stable/public; narrow what remains hidden +3/-2

• Moves 'RankQuantFastscan' out of the '#[doc(hidden)]' bucket and into the normal (pre-1.0) compatibility policy. Keeps 'search_asymmetric_byte_lut' as an example of still-hidden/internal API.

docs/compatibility-policy.md


determinism.md Update FastScan determinism section to match public/specialized status +1/-1

• Changes the FastScan determinism documentation to describe 'RankQuantFastscan' as public/specialized rather than hidden. Leaves the determinism caveats and score non-equivalence to exact RankQuant intact.

docs/determinism.md


__init__.py Python module docstring: clarify FastScan is Rust-only and not bound +3/-1

• Adds an explicit note that 'RankQuantFastscan' and its '.ovfs' persistence are intentionally not exposed in the Python bindings, preventing implied parity.

ordvec-python/python/ordvec/__init__.py


fastscan.rs FastScan module rustdoc aligns with stable/public type positioning +5/-5

• Rewrites module-level rustdoc to remove contradictory “re-exported doc(hidden)” language for 'RankQuantFastscan'. Clarifies it is stable/public but specialized, and that the free function entrypoint remains 'pub(crate)'.

src/fastscan.rs


lib.rs Crate rustdoc: position FastScan as a specialized companion to headline families +3/-1

• Adjusts crate-level docs to describe the four main retrieval families as the headline surface while acknowledging 'RankQuantFastscan' as a specialized b=2 latency companion.

src/lib.rs


rank_io.rs Persistence rustdoc includes 'RankQuantFastscan' and '.ovfs' +3/-1

• Updates module rustdoc to list five formats, including '.ovfs' for 'RankQuantFastscan', and includes it in the supported type-level 'write()'/'load()' API list.

src/rank_io.rs


sign_bitmap.rs Soften performance/competitiveness claim into a mechanism statement +4/-4

• Replaces an unbacked comparative claim about learned hash codes with a description of why sign patterns preserve angular structure for candidate generation. No functional code changes.

src/sign_bitmap.rs


Other (1)
fuzz.yml Include 'load_fastscan' in the weekly fuzz sweep matrix +4/-3

• Updates workflow comments and the weekly target matrix from eight to nine targets. Adds the missing 'load_fastscan' job entry so the '.ovfs' loader is exercised on the scheduled sweep.

.github/workflows/fuzz.yml


@codecov

codecov Bot commented Jun 15, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request stabilizes the RankQuantFastscan API, making it a public, documented type with .ovfs persistence support. It also renames the on-disk format magics to OV* (retaining backward compatibility with legacy TV* magics), adds a ninth cargo-fuzz target (load_fastscan), and updates the documentation and benchmarks to reference a reproducible BEIR harness. There are no review comments, so I have no additional feedback to provide.

The CI weekly sweep now runs all nine targets, but fuzz/run_full_fuzz.sh
(the manual dev campaign helper) still defaulted to the eight pre-.ovfs
targets, so a full manual run skipped load_fastscan — the untrusted .ovfs
loader. Add it to the default TARGETS list (now nine) and fix the comment,
so the manual campaign matches the THREAT_MODEL 'all nine targets' claim
and the CI weekly matrix.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Follow-up to adding load_fastscan: the header comment still quoted the
eight-target campaign totals (~3h x 8 ~= 24h; the SECS_PER_TARGET=120
example ~16 min). Targets run sequentially, so the totals scale with the
count — now ~3h x 9 ~= 27h and ~18 min. The runtime 'est. total' print is
already computed from n_targets, so only the static guidance was stale.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
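
The totals quoted in the commit message above follow from plain multiplication, since the targets run one after another. A minimal Python sketch of that arithmetic (the function name is hypothetical; the per-target durations and target counts are the ones stated in the message):

```python
def campaign_total_seconds(n_targets: int, secs_per_target: int) -> int:
    """Targets run sequentially, so the total is a simple product."""
    return n_targets * secs_per_target


# Full campaign: roughly 3 hours per target, nine targets.
full = campaign_total_seconds(9, 3 * 3600)
# Quick example from the header comment: SECS_PER_TARGET=120, nine targets.
quick = campaign_total_seconds(9, 120)

print(full / 3600, "hours")    # 27.0 hours
print(quick / 60, "minutes")   # 18.0 minutes
```

With eight targets the same function gives 24 hours and 16 minutes, which matches the stale guidance the commit replaces.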
@project-navi-bot project-navi-bot merged commit cd699c0 into main Jun 15, 2026
38 checks passed
@project-navi-bot project-navi-bot deleted the docs/pr3-fastscan-promotion-and-positioning branch June 15, 2026 17:30