Skip to content

gix-ref: lazy name index in packed::Buffer for fast bulk lookups#2608

Open
Nik B (nikicat) wants to merge 2 commits into
GitoxideLabs:mainfrom
nikicat:gix-ref-packed-lazy-name-index
Open

gix-ref: lazy name index in packed::Buffer for fast bulk lookups#2608
Nik B (nikicat) wants to merge 2 commits into
GitoxideLabs:mainfrom
nikicat:gix-ref-packed-lazy-name-index

Conversation

@nikicat
Copy link
Copy Markdown

Summary

Supersedes #2605 with the optimization moved one layer down — out of fetch::update_refs and into gix_ref::packed::Buffer itself, where it accelerates every caller of try_find_full_name, not just fetch.

Background

#2605 noticed that fetch::refs::update() calls repo.try_find_reference(name) once per advertised mapping, and on wide-refs mirrors (the AUR's 154k branches) those per-call binary searches dominate wall time. That PR fixed it by snapshotting packed-refs into a HashMap<BString, Reference> once at the top of update().

After review, the better place for the optimization is inside packed::Buffer. This PR replaces #2605's approach with a lazy name → record-offset index on the buffer itself.

What's different

Old (#2605) New (this PR)
Lives where gix::remote::connection::fetch::update_refs gix_ref::packed::Buffer
Build trigger Every fetch, eagerly After 8 lookups, lazily
Benefits Fetch only Every caller of try_find_full_name
Build cost paid by Every fetch, regardless of mapping count Only the 8th lookup against a given snapshot
Memory shape HashMap<BString, Reference> (~120 B/entry) HashMap<BString, usize> (~80 B/entry)
Code surface +85 lines in fetch +97 in packed::find, +30 in packed::mod

How it works

```rust
pub(crate) fn try_find_full_name(&self, name: &FullNameRef) -> Result<...> {
if let Some(index) = self.name_index.get() {
return self.lookup_via_index(name, index); // O(1)
}
let prev = self.lookup_count.fetch_add(1, Relaxed);
if prev + 1 >= INDEX_BUILD_AFTER_LOOKUPS { // (=8)
let index = self.name_index.get_or_init(|| self.build_name_index());
return self.lookup_via_index(name, index);
}
self.try_find_full_name_via_binary_search(name) // O(log n), unchanged
}
```

  • Single-shot CLI calls don't pay. git rev-parse HEAD against a 154k-ref packed-refs does one lookup and stays on binary search.
  • Lookup-count gate via AtomicUsize. One relaxed atomic-add per cold-path lookup. Compatible with both parallel and non-parallel builds (uses gix_features::threading::OnceCell).
  • Re-decode at offset returns Reference<'a> borrowing from the mmap — same return shape as binary_search_by, no Reference clones.
  • Parse-failure semantics preserved. Malformed records flip encountered_parse_failure while the index is built; a miss against a corrupt packed-refs surfaces as Error::Parse, matching the binary-search path.

Trade-off vs #2605

This PR drops the loose-ref enumeration that #2605 also included. On a wide-refs fetch, repo.try_find_reference() still does one filesystem `stat` per mapping for the loose-ref shadow check, which #2605 cached with a HashSet. On the AUR mirror that's roughly an extra ~5s (154k × ~30μs warm-cache stat) the new PR doesn't reclaim.

That's an independent optimization — a future PR could add a similar lazy loose-name set inside file::Store and benefit every caller, just like this PR does for packed-refs. I left it out to keep the surface small and reviewable.

Test coverage

  • cargo test -p gix-ref (146 tests) passes.
  • New tests in gix-ref/tests/refs/packed/find.rs:
    • many_lookups_keep_returning_correct_results_across_the_index_threshold verifies all 64 records round-trip whether served by binary search (first 7 lookups) or by the index (lookups 8+).
    • index_path_surfaces_parse_failures_on_miss confirms a missed lookup against a corrupt packed-refs returns Error::Parse after the index has been built — same behavior as the binary-search path.
  • The existing find_speed test (10k lookups against a large fixture) naturally exercises the index path.
  • The existing invalid_refs_within_a_file_do_not_lead_to_incorrect_results test does fewer than 8 lookups, so it still pins the binary-search parse-failure path.

AI disclosure

Per `CONTRIBUTING.md`, drafted with Claude Opus 4.7 (see the `Co-authored-by:` trailer on the commit). Investigation, design choice between in-fetch and in-buffer placement, and review were done by me.

Closes #2605 (superseded).

`packed::Buffer::try_find_full_name` previously ran a `log₂(n)` binary
search on every call. On wide-refs mirrors (e.g. the AUR's 154k
branches) a fetch's `update_refs` loop calls this once per advertised
mapping, and the per-call binary search dominates wall time.

Add a lazy name → record-offset index that's built once per buffer
snapshot and consulted on every subsequent lookup:

- Gated by a small lookup counter so single-shot CLI calls (e.g. `git
  rev-parse HEAD`) never pay the index-build cost — only callers that
  perform more than `INDEX_BUILD_AFTER_LOOKUPS` (=8) binary searches
  against the same buffer materialize the index.
- Stores only `(name, record-start offset)` rather than full
  `gix_ref::Reference` clones; the matched offset is re-decoded through
  `packed::decode::reference` to produce the same `packed::Reference<'a>`
  borrowed from the mmap as the binary-search path returns.
- Preserves the binary-search path's parse-failure semantics:
  malformed records encountered while building the index flip
  `encountered_parse_failure`, and a missed lookup surfaces as
  `Error::Parse` rather than `Ok(None)`, matching what
  `binary_search_by` does today.

The optimization lives in `packed::Buffer` rather than at the fetch
call site, so it accelerates every caller of `try_find_full_name`
(porcelain `repo.try_find_reference()`, file-store overlay lookups,
etc.), not just the fetch's `update_refs` loop. No call sites need
to change.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Adds an `#[ignore]`'d test `loose_stat_overhead_profile` that simulates
a fetch's `update_refs` loop against the existing 150k-packed-ref
fixture, reporting per-strategy timings:

- `store.try_find(name)` — current path, stats the loose-ref file before
  falling through to packed.
- A pre-built `HashSet` of loose names + `packed.try_find(name)` direct,
  which is the lookup shape a hypothetical `file::Store`-level loose-name
  cache would produce.

On warm cache with this PR's packed buffer index applied, ~80% of the
remaining lookup wall time is the per-call loose `stat` (most of which
returns ENOENT against the dentry cache). The absolute reclaimable time
is workload-dependent — typically ~0.3 s warm-cache, more on cold-cache
or network filesystems — so the helper lets future work decide whether
a loose-refs follow-up is worth its complexity, given local numbers.

Run with:

    cargo test -p gix-ref --release --features sha1 --test refs \\
        -- --ignored --nocapture loose_stat_overhead_profile

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant