feat: promote RankQuantFastscan to public API + .ovfs persistence by Fieldnote-Echo · Pull Request #233 · Fieldnote-Echo/ordvec

Fieldnote-Echo · 2026-06-15T02:53:22Z

Stacked on #230 (the .ov* format). Promotes the FastScan b=2 path from #[doc(hidden)] to a documented public type and gives it persistence in the OV* convention #230 establishes.

What

Un-hide RankQuantFastscan — it was #[doc(hidden)] on both the crate-root re-export and the struct definition. Now a stable, documented type, framed as a specialized b=2 minimum-latency scan path (block-32 nibble-LUT; AVX-512 → AVX2 → scalar) — not the headline retrieval surface (that's RankQuant / Bitmap / two-stage). Added a discoverable crate-root mention.
Persistence (net-new) — RankQuantFastscan::write / load via a new .ovfs format (magic OVFS): a 13-byte header (OVFS | version | dim | n_vectors) + the opaque block-32 packed payload. rank_io gains write_fastscan / load_fastscan / fastscan_payload_bytes (pub(crate)). The loader validates magic, dim % 4 == 0, n_vectors, and exact payload length (no trailing bytes), with the MAX_PAYLOAD guard enforced before File::create. OVFS is new in the ordvec format — no legacy TV* counterpart (FastScan never had persistence under turbovec).
Tests — write/load round-trip scans byte-identically; empty-index round-trip; OVFS-magic-on-write; rejects wrong magic / trailing bytes / dim % 4 != 0.
Fuzz — new load_fastscan libFuzzer target (parity with the four existing loader targets).

Not in scope

Kernels are unchanged — FastScan was already AVX-512/AVX2/scalar-optimized; this is promotion + persistence only.
Probe support for .ovfs is deferred: probe_index_metadata returns IndexKind, and adding a Fastscan variant to the non-#[non_exhaustive] IndexKind enum would be a breaking change → tracked for the 0.8.0 API re-arch (0.8.0 staging: public API re-architecture & streamlining (breaking) #232). A .ovfs file currently probes as 'unknown magic'.

Validation

cargo fmt --all --check; clippy --all-targets --all-features -D warnings; RUSTDOCFLAGS=-D warnings cargo doc; full test suites (experimental + default, 18 fastscan tests incl. 6 new persistence); cargo +nightly fuzz build load_fastscan.

Follow-up: PR3 docs refresh (surfaces FastScan + .ovfs across README / bindings / C-ABI / Go).

…-compat Files written by the crate now use the ordvec magics OVR1/OVRQ/OVBM/OVSB (extensions .ovr/.ovrq/.ovbm/.ovsb), replacing the turbovec-era TV* magics. The read contract is unchanged: all loaders (rank_io.rs) AND the C ABI accept BOTH the current OV* and the legacy TV* magics, so every file the crate (or turbovec) ever wrote still loads. Only the write path changed. - src/rank_io.rs: OV* magic constants (written) + TV* retained read-only for back-compat; writers emit OV*; loaders + probe_index_metadata accept both. - ordvec-ffi: the C ABI sniff-magic dispatch accepts both OV* and TV*; probe path was already format-agnostic (uses IndexKind). - tests/persistence_compat.rs: forward fixtures now pin OV*; added tests proving legacy TV* files still load for all four index types. - Parity sweep (docs / extensions only, no logic): ordvec-manifest (+ python bindings), ordvec-python docstrings + tests, ordvec-go test, C header, fuzz targets, docs/*, README format line, SECURITY/THREAT_MODEL, CONTRIBUTING stable-surface statement (the read contract is never broken), .gitignore. Gate: fmt + clippy -D warnings (core/ffi/manifest) + full test suites (core exp+default, manifest, ffi) + ordvec-python check + rustdoc -D warnings. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

The shorter UNSUPPORTED_FORMAT message in info_for_metadata let rustfmt collapse `info.kind = match {...}` onto one line. Formatting only, no logic change. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Un-hides RankQuantFastscan (it was `#[doc(hidden)]` on both the re-export and the struct) and gives it persistence, building on the OV* format convention. - Un-hide: remove `#[doc(hidden)]` from the crate-root re-export and the struct; reword its docs as a stable-but-specialized b=2 minimum-latency scan path (not the headline retrieval surface). Add a discoverable crate-root mention. - Persistence (net-new): `RankQuantFastscan::write`/`load` via a new `.ovfs` format (magic `OVFS`; 13-byte header + the opaque block-32 packed payload). rank_io gains `write_fastscan`/`load_fastscan`/`fastscan_payload_bytes` (pub(crate)); the loader validates magic, `dim % 4 == 0`, n_vectors, exact payload length (no trailing bytes), with the MAX_PAYLOAD pre-File::create guard. OVFS is new in the ordvec format — no legacy TV* counterpart. - Tests: write/load round-trip scans byte-identically; empty-index round-trip; OVFS-magic-on-write; rejects wrong magic / trailing bytes / dim%4!=0. - Fuzz: new `load_fastscan` libFuzzer target (parity with the other 4 loaders). Kernels were already AVX-512/AVX2/scalar-optimized; this PR is promotion + persistence only, no kernel changes. Gate: fmt + clippy -D warnings + rustdoc -D warnings + full test suites + `cargo +nightly fuzz build load_fastscan`. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

chatgpt-codex-connector · 2026-06-15T02:53:27Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

qodo-code-review · 2026-06-15T02:53:29Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Context used

✅ Tickets: 🎫 0.8.0 staging: public API re-architecture & streamlining (breaking)

1. ~~Write path can panic~~ ✓ Resolved 🐞 Bug ☼ Reliability

Description

rank_io::write_fastscan uses assert_eq!(packed_fs.len(), payload_bytes), so the new public
RankQuantFastscan::write can panic (abort the caller) if an internal invariant is ever violated,
even though it returns io::Result<()>. This is a robustness footgun for the newly-public
persistence API (panic instead of an error).

Code

src/rank_io.rs[R895-903]

+    let payload_bytes = fastscan_payload_bytes(dim, n_vectors)?;
+    check_payload_bytes(payload_bytes)?;
+    assert_eq!(packed_fs.len(), payload_bytes);
+    let mut f = BufWriter::new(File::create(path)?);
+    f.write_all(OVFS_MAGIC)?;
+    f.write_all(&[VERSION])?;
+    f.write_all(&(dim as u32).to_le_bytes())?;
+    f.write_all(&(n_vectors as u32).to_le_bytes())?;
+    f.write_all(packed_fs)?;

Evidence

The public API RankQuantFastscan::write forwards directly to rank_io::write_fastscan, which
contains an unconditional assert_eq! on the payload length; if triggered, it panics rather than
returning an io::Error.

src/fastscan.rs[645-654]
src/rank_io.rs[886-905]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`write_fastscan()` currently uses `assert_eq!(packed_fs.len(), payload_bytes)`. Because `RankQuantFastscan::write()` is public and returns `io::Result<()>`, a panic here is surprising and turns what should be an error return into a process crash.

### Issue Context
Even if today the invariant is established by construction, keeping the write path panic-free makes the new public persistence surface more robust against future refactors/in-crate callers that might accidentally violate the invariant.

### Fix Focus Areas
- src/rank_io.rs[895-904]

### Suggested change
Replace the `assert_eq!` with a checked branch that returns `io::ErrorKind::InvalidInput`/`InvalidData` (consistent with other loader/writer validation), e.g.:
- if `packed_fs.len() != payload_bytes` => `return Err(invalid("OVFS packed payload length mismatch"))`

(Optionally keep a `debug_assert_eq!` as an internal development aid.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-code-review · 2026-06-15T02:54:05Z

PR Summary by Qodo

Promote RankQuantFastscan to public API and add .ovfs persistence
✨ Enhancement 🧪 Tests ⚙️ Configuration changes 🕐 40+ Minutes

Walkthroughs

Description

• Make RankQuantFastscan a documented, stable public type for b=2 low-latency scans
• Add .ovfs (OVFS) write/load persistence with strict header and payload validation
• Extend test + fuzz coverage for OVFS round-trips and malformed-file rejection

Diagram

graph TD
  A["Public API: RankQuantFastscan"] --> B["rank_io fastscan I/O"] --> C[(".ovfs file (OVFS)")]
  D["Tests: fastscan persistence"] --> A
  E["Fuzz target: load_fastscan"] --> A

High-Level Assessment

The following are alternative approaches to this PR:

1. Embed FastScan payload into an existing OV* container (e.g., OVRQ)

➕ Fewer file extensions/magics to support
➕ Could reuse existing probing/metadata flows if already implemented for OV*
➖ More complex container format (optional sections, versioning, offsets)
➖ Higher risk of breaking compatibility expectations for existing OV* formats
➖ Harder to guarantee exact payload-length invariants without a dedicated spec

2. Add `.ovfs` probing support now (IndexKind::Fastscan)

➕ Better UX: probe can identify .ovfs instead of “unknown magic”
➕ Allows tooling to route files correctly without opening/parsing fully
➖ Requires changing a non-#[non_exhaustive] enum; breaking API change
➖ Pushes API churn earlier than planned (conflicts with stated 0.8.0 re-arch)

Recommendation: Keep the dedicated .ovfs/OVFS format as implemented: it cleanly isolates a specialized layout, enables strict length validation, and avoids entangling legacy formats. Deferring probe support is reasonable given the non-exhaustive enum breaking-change constraint; revisit during the planned API re-architecture.

File Changes

Enhancement (2)

fastscan.rs Promote RankQuantFastscan docs and add write/load API +34/-8
Promote RankQuantFastscan docs and add write/load API
• Removes doc-hiding and reframes 'RankQuantFastscan' as a stable but specialized b=2 scan path. Adds 'write'/'load' methods that persist and restore the opaque block-32 packed payload via the '.ovfs' format.
src/fastscan.rs

rank_io.rs Add OVFS (.ovfs) format support for FastScan persistence +81/-1
Add OVFS (.ovfs) format support for FastScan persistence
• Introduces OVFS magic and implements 'write_fastscan'/'load_fastscan' with a fixed 13-byte header and strict payload-length validation (including 'dim % 4 == 0' and no trailing bytes). Adds payload sizing logic with overflow checks and enforces MAX_PAYLOAD before file creation to avoid truncating existing files.
src/rank_io.rs

Tests (2)

load_fastscan.rs Add fuzz target that exercises RankQuantFastscan::load +27/-0
Add fuzz target that exercises RankQuantFastscan::load
• Introduces a libFuzzer harness that writes arbitrary bytes to a scratch file and calls the public 'ordvec::RankQuantFastscan::load' entry point. The goal is to ensure malformed inputs produce Ok/Err without panics/aborts/OOB reads.
fuzz/fuzz_targets/load_fastscan.rs

fastscan.rs Add OVFS persistence round-trip and validation tests +118/-0
Add OVFS persistence round-trip and validation tests
• Adds round-trip tests verifying identical search results after write/load and correct behavior for empty indexes. Adds negative tests rejecting wrong magic, trailing bytes, and 'dim' values not divisible by 4.
tests/index/fastscan.rs

Documentation (1)

lib.rs Document and publicly re-export RankQuantFastscan +12/-5
Document and publicly re-export RankQuantFastscan
• Adds crate-level documentation positioning FastScan as a b=2 latency-optimized companion to RankQuant/Bitmap. Removes the crate-root '#[doc(hidden)]' re-export so the type becomes discoverable in the public API.
src/lib.rs

Other (1)

Cargo.toml Register new libFuzzer target for FastScan loader +7/-0
Register new libFuzzer target for FastScan loader
• Adds a new fuzz binary target 'load_fastscan' pointing at 'fuzz_targets/load_fastscan.rs' so the OVFS loader path is fuzzed like the other index loaders.
fuzz/Cargo.toml

codecov · 2026-06-15T02:54:37Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

gemini-code-assist

Code Review

This pull request introduces persistence support for the RankQuantFastscan index, enabling saving and loading using the new .ovfs file format. It includes the implementation of the write and load methods, a new fuzz target, and comprehensive unit tests. The review feedback suggests validating dim and n_vectors prior to writing to prevent silent truncation and potential file corruption when casting n_vectors to a u32.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

`write_fastscan` backs the now-public `RankQuantFastscan` persistence API but could (a) silently truncate `dim`/`n_vectors` through the `as u32` casts, and (b) panic via `assert_eq!(packed_fs.len(), payload_bytes)` from a `Result`-returning fn (flagged by gemini + qodo). Validate `dim` (range + multiple-of-4 for FastScan b=2), `n_vectors`, the payload size, and the packed buffer length BEFORE `File::create`, returning a clean `io::Error` instead of panicking or truncating — and a rejected write never creates/truncates a file. Add a regression test: valid round-trip + Err-not-panic on bad dim / length. Regenerate `ordvec-ffi/include/ordvec.h` with cbindgen 0.29.3 (the CI version); the committed header had drifted from the loader doc-comment wording. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Fieldnote-Echo · 2026-06-15T16:11:50Z

Remediated in 1bbdb8f (+ merged main = #237/#240):

gemini + qodo (write_fastscan): write_fastscan (backing the now-public RankQuantFastscan persistence) now validates dim (range + multiple-of-4), n_vectors, payload size, and packed-buffer length before File::create — returning a clean io::Error instead of (a) silently truncating via as u32 or (b) panicking via assert_eq! in a Result fn. A rejected write never creates/truncates a file. Added regression test write_fastscan_validates_and_never_panics (valid round-trip + Err-not-panic on bad dim/length).
CI (ffi header drift): regenerated ordvec-ffi/include/ordvec.h with cbindgen 0.29.3 (the committed header had drifted from the loader doc-comment).

Verified: cbindgen --verify ok, fmt + clippy clean, full default suite green.

…b doc fix #230 (OV* on-disk format) landed on main, conflicting with this branch's FastScan work in src/rank_io.rs. Resolution: - kept this branch's FastScan additions — the 5th `.ovfs`/`OVFS` format (doc bullet, `OVFS_MAGIC`, "Five formats"); - took main's canonicalized OV* loader magic labels (`read_magic(.., "OVR1")` etc.), matching the surrounding `OV*` field labels #230 introduced. Also rolled in the stop-gate follow-up that post-dates #230's merge (per request): the SignBitmap write/load/dim-check doc comments still described the CURRENT format as `.tvsb`; corrected to `.ovsb` (the written extension — `.tvsb` is legacy-only, accepted on load and documented as such). Verified: cbindgen --verify ok (no ffi drift), fmt + clippy clean, full suite green. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

…t (PR3) (#243) * docs: propagate RankQuantFastscan public-API promotion + positioning lift #233 promoted RankQuantFastscan to a stable public type in lib.rs only; the README, RANK_MODES, compatibility-policy, determinism, THREAT_MODEL, the fastscan module-doc, the rank_io persistence-contract doc, c-api, CHANGELOG, and the Python binding docstring all still described it as #[doc(hidden)] / unstable and undercounted the on-disk formats (four -> five, adds .ovfs) and the cargo-fuzz targets (eight -> nine). This finishes the propagation so the public surface is self-consistent. Also: - fuzz.yml weekly sweep now actually runs load_fastscan. The .ovfs loader was a registered fuzz target but was missing from the weekly matrix, so the THREAT_MODEL "all nine targets" claim is now true rather than aspirational. - README "What's different" positioning lift: sharpen the training-free / data-oblivious no-fit wedge, credit the FAISS FastScan + binary-quant-rescore lineage for the two-stage / RankQuantFastscan path (no novelty overclaim), and make the HNSW params precise (ef_construction=200, ef_search=128). - Claims-discipline softening (no in-repo benchmark backs them): sign_bitmap.rs "competitive with / superior to learned hash codes" -> a mechanism statement; the RANK_MODES arXiv paper-harness numbers (under the real-corpus rerun guardrail) -> redirect to the reproducible BEIR harness; README + CHANGELOG "nothing hand-entered" -> accurate "tables transcribe the harness summary outputs". No code changes: doc comments, markdown, the Python module docstring, and the fuzz workflow matrix only. Gate: fmt + clippy (-D warnings) + cargo doc (-D warnings) + doctests all clean. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * fuzz: add load_fastscan to the manual full-campaign default targets The CI weekly sweep now runs all nine targets, but fuzz/run_full_fuzz.sh (the manual dev campaign helper) still defaulted to the eight pre-.ovfs targets, so a full manual run skipped load_fastscan — the untrusted .ovfs loader. Add it to the default TARGETS list (now nine) and fix the comment, so the manual campaign matches the THREAT_MODEL 'all nine targets' claim and the CI weekly matrix. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * fuzz: refresh run_full_fuzz duration guidance for nine targets Follow-up to adding load_fastscan: the header comment still quoted the eight-target campaign totals (~3h x 8 ~= 24h; the SECS_PER_TARGET=120 example ~16 min). Targets run sequentially, so the totals scale with the count — now ~3h x 9 ~= 27h and ~18 min. The runtime 'est. total' print is already computed from n_targets, so only the static guidance was stale. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> --------- Signed-off-by: Nelson Spence <nelson@projectnavi.ai> Co-authored-by: Navi Bot <267427491+project-navi-bot@users.noreply.github.com>

Fieldnote-Echo added 3 commits June 14, 2026 21:31

style: rustfmt ordvec-ffi (match layout after error-string shortening)

ccf963d

The shorter UNSUPPORTED_FORMAT message in info_for_metadata let rustfmt collapse `info.kind = match {...}` onto one line. Formatting only, no logic change. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

Fieldnote-Echo requested a review from project-navi-bot as a code owner June 15, 2026 02:53

gemini-code-assist Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread src/rank_io.rs Outdated

Fieldnote-Echo added 2 commits June 15, 2026 11:00

Merge remote-tracking branch 'origin/main' into feat/promote-fastscan

e8af81b

Base automatically changed from feat/ovec-file-format to main June 15, 2026 16:23

project-navi-bot approved these changes Jun 15, 2026

View reviewed changes

project-navi-bot merged commit 4784f59 into main Jun 15, 2026
38 checks passed

project-navi-bot deleted the feat/promote-fastscan branch June 15, 2026 16:33

This was referenced Jun 15, 2026

docs(rank_io): note OVFS metadata-probe deferral (#232) #242

Merged

0.8.0 staging: public API re-architecture & streamlining (breaking) #232

Open

docs: finish RankQuantFastscan public-API promotion + positioning lift (PR3) #243

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: promote RankQuantFastscan to public API + .ovfs persistence#233

feat: promote RankQuantFastscan to public API + .ovfs persistence#233
project-navi-bot merged 6 commits into
mainfrom
feat/promote-fastscan

Fieldnote-Echo commented Jun 15, 2026

Uh oh!

chatgpt-codex-connector Bot commented Jun 15, 2026

Uh oh!

qodo-code-review Bot commented Jun 15, 2026 •

edited

Loading

Uh oh!

qodo-code-review Bot commented Jun 15, 2026

Uh oh!

codecov Bot commented Jun 15, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Fieldnote-Echo commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Fieldnote-Echo commented Jun 15, 2026

What

Not in scope

Validation

Uh oh!

chatgpt-codex-connector Bot commented Jun 15, 2026

Uh oh!

qodo-code-review Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Uh oh!

qodo-code-review Bot commented Jun 15, 2026

PR Summary by Qodo

Walkthroughs

File Changes

Uh oh!

codecov Bot commented Jun 15, 2026

Codecov Report

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Fieldnote-Echo commented Jun 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

qodo-code-review Bot commented Jun 15, 2026 •

edited

Loading