gix-ref: lazy name index in packed::Buffer for fast bulk lookups#2608
Open
Nik B (nikicat) wants to merge 2 commits into
Open
gix-ref: lazy name index in packed::Buffer for fast bulk lookups#2608Nik B (nikicat) wants to merge 2 commits into
Nik B (nikicat) wants to merge 2 commits into
Conversation
`packed::Buffer::try_find_full_name` previously ran a `log₂(n)` binary search on every call. On wide-refs mirrors (e.g. the AUR's 154k branches) a fetch's `update_refs` loop calls this once per advertised mapping, and the per-call binary search dominates wall time. Add a lazy name → record-offset index that's built once per buffer snapshot and consulted on every subsequent lookup: - Gated by a small lookup counter so single-shot CLI calls (e.g. `git rev-parse HEAD`) never pay the index-build cost — only callers that perform more than `INDEX_BUILD_AFTER_LOOKUPS` (=8) binary searches against the same buffer materialize the index. - Stores only `(name, record-start offset)` rather than full `gix_ref::Reference` clones; the matched offset is re-decoded through `packed::decode::reference` to produce the same `packed::Reference<'a>` borrowed from the mmap as the binary-search path returns. - Preserves the binary-search path's parse-failure semantics: malformed records encountered while building the index flip `encountered_parse_failure`, and a missed lookup surfaces as `Error::Parse` rather than `Ok(None)`, matching what `binary_search_by` does today. The optimization lives in `packed::Buffer` rather than at the fetch call site, so it accelerates every caller of `try_find_full_name` (porcelain `repo.try_find_reference()`, file-store overlay lookups, etc.), not just the fetch's `update_refs` loop. No call sites need to change. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Adds an `#[ignore]`'d test `loose_stat_overhead_profile` that simulates
a fetch's `update_refs` loop against the existing 150k-packed-ref
fixture, reporting per-strategy timings:
- `store.try_find(name)` — current path, stats the loose-ref file before
falling through to packed.
- A pre-built `HashSet` of loose names + `packed.try_find(name)` direct,
which is the lookup shape a hypothetical `file::Store`-level loose-name
cache would produce.
On warm cache with this PR's packed buffer index applied, ~80% of the
remaining lookup wall time is the per-call loose `stat` (most of which
returns ENOENT against the dentry cache). The absolute reclaimable time
is workload-dependent — typically ~0.3 s warm-cache, more on cold-cache
or network filesystems — so the helper lets future work decide whether
a loose-refs follow-up is worth its complexity, given local numbers.
Run with:
cargo test -p gix-ref --release --features sha1 --test refs \\
-- --ignored --nocapture loose_stat_overhead_profile
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Supersedes #2605 with the optimization moved one layer down — out of
fetch::update_refsand intogix_ref::packed::Bufferitself, where it accelerates every caller oftry_find_full_name, not just fetch.Background
#2605 noticed that
fetch::refs::update()callsrepo.try_find_reference(name)once per advertised mapping, and on wide-refs mirrors (the AUR's 154k branches) those per-call binary searches dominate wall time. That PR fixed it by snapshottingpacked-refsinto aHashMap<BString, Reference>once at the top ofupdate().After review, the better place for the optimization is inside
packed::Buffer. This PR replaces #2605's approach with a lazyname → record-offsetindex on the buffer itself.What's different
gix::remote::connection::fetch::update_refsgix_ref::packed::Buffertry_find_full_nameHashMap<BString, Reference>(~120 B/entry)HashMap<BString, usize>(~80 B/entry)packed::find, +30 inpacked::modHow it works
```rust
pub(crate) fn try_find_full_name(&self, name: &FullNameRef) -> Result<...> {
if let Some(index) = self.name_index.get() {
return self.lookup_via_index(name, index); // O(1)
}
let prev = self.lookup_count.fetch_add(1, Relaxed);
if prev + 1 >= INDEX_BUILD_AFTER_LOOKUPS { // (=8)
let index = self.name_index.get_or_init(|| self.build_name_index());
return self.lookup_via_index(name, index);
}
self.try_find_full_name_via_binary_search(name) // O(log n), unchanged
}
```
git rev-parse HEADagainst a 154k-ref packed-refs does one lookup and stays on binary search.AtomicUsize. One relaxed atomic-add per cold-path lookup. Compatible with bothparalleland non-parallelbuilds (usesgix_features::threading::OnceCell).Reference<'a>borrowing from the mmap — same return shape asbinary_search_by, noReferenceclones.encountered_parse_failurewhile the index is built; a miss against a corrupt packed-refs surfaces asError::Parse, matching the binary-search path.Trade-off vs #2605
This PR drops the loose-ref enumeration that #2605 also included. On a wide-refs fetch,
repo.try_find_reference()still does one filesystem `stat` per mapping for the loose-ref shadow check, which #2605 cached with aHashSet. On the AUR mirror that's roughly an extra ~5s (154k × ~30μs warm-cache stat) the new PR doesn't reclaim.That's an independent optimization — a future PR could add a similar lazy loose-name set inside
file::Storeand benefit every caller, just like this PR does for packed-refs. I left it out to keep the surface small and reviewable.Test coverage
cargo test -p gix-ref(146 tests) passes.gix-ref/tests/refs/packed/find.rs:many_lookups_keep_returning_correct_results_across_the_index_thresholdverifies all 64 records round-trip whether served by binary search (first 7 lookups) or by the index (lookups 8+).index_path_surfaces_parse_failures_on_missconfirms a missed lookup against a corrupt packed-refs returnsError::Parseafter the index has been built — same behavior as the binary-search path.find_speedtest (10k lookups against a large fixture) naturally exercises the index path.invalid_refs_within_a_file_do_not_lead_to_incorrect_resultstest does fewer than 8 lookups, so it still pins the binary-search parse-failure path.AI disclosure
Per `CONTRIBUTING.md`, drafted with Claude Opus 4.7 (see the `Co-authored-by:` trailer on the commit). Investigation, design choice between in-fetch and in-buffer placement, and review were done by me.
Closes #2605 (superseded).