perf!: delta-aware BnbMetric via SelectionView/compute_view by evanlinjin · Pull Request #53 · bitcoindevkit/coin-select

evanlinjin · 2026-07-02T17:03:54Z

Summary

Makes the branch-and-bound metric evaluator delta-aware, eliminating the dominant cost in run_bnb. A flamegraph showed ~65% of run_bnb_lowest_fee time was spent in cs.selected_value() (47.5%) and cs.input_weight() (17.3%) — both O(|selected|) walks recomputed many times per branch via excess/is_target_met/drain.

Now BnB maintains a per-branch SelectionCache (running aggregates: value_sum, weight_sum, input_count, segwit_count, candidate_count). Inclusion expansions call cache.add(c) — O(1). Metrics receive a &SelectionView<'_> (a handle over &CoinSelector + Cow<SelectionCache>) whose read methods (selected_value, input_weight, excess, is_target_met, drain, …) are all O(1).

Outside the hot path, cs.compute_view() builds a fresh cache once on demand, collapsing the ~15 duplicated O(|selected|) read methods that previously lived on CoinSelector into a single entry point.

Benchmarks

Criterion median; delta-aware vs the bitset baseline, vs the pre-bitset master:

bench	master	bitset	delta-aware	total speedup
clone/4096	5.68 µs	58 ns	52 ns	109×
run_bnb_lowest_fee/20	470 µs	417 µs	158 µs	3.0×
run_bnb_lowest_fee/50	26.0 ms	10.3 ms	5.4 ms	4.8×
run_bnb_lowest_fee/100	73.2 ms	40.2 ms	14.2 ms	5.2×
run_bnb_lowest_fee/200	1.27 s	1.10 s	196 ms	6.5×

The gain is largest at large n, where selected_value/input_weight recomputation dominated.

Commits

perf!: delta-aware BnbMetric via SelectionView/compute_view — the core change. Introduces SelectionCache/SelectionView, moves the read methods off CoinSelector behind compute_view(), and makes the BnbMetric trait take &SelectionView. Adapted to the current trait API (target passed as a parameter; metric owns its change decision via drain(); Changeless<M>).
bench: extend pool sizes to wallet (1k) and exchange (10M) scale
perf: skip unselected scan in BnbIter via per-branch cursor — each Branch carries a cursor into candidate_order, so insert_new_branches jumps straight to the next undecided candidate instead of re-scanning selected/banned.
docs: fix unresolved rustdoc links + rustfmt reflow

API changes (breaking)

New public SelectionView<'_>; SelectionCache is internal.
BnbMetric::{score,bound,drain} now take &SelectionView<'_> instead of &CoinSelector<'_>.
The ~15 read methods (selected_value, input_weight, excess, is_target_met, drain, fee, …) move from CoinSelector to SelectionView; reach them via cs.compute_view().

Testing

cargo fmt --check, cargo clippy --all-targets, RUSTDOCFLAGS="-D warnings" cargo doc --no-deps, and the full suite (33 lib+integration tests + 2 doctests) all pass. A proptest in src/selection_view.rs checks the cache invariant (cache.value_sum == cs.compute_view().selected_value()) against a from-scratch rebuild under random insert/deselect sequences.

🤖 Generated with Claude Code

Squash of the two-commit perf series (cd1017a "delta-aware BnbMetric via SelectionView/SelectionCache" + 536a03a "replace duplicated CoinSelector read methods with compute_view()"), rebased onto the new BnbMetric API (target passed as a parameter; metric decides its own change output). Squashed because the first commit's intermediate state (src/selection_cache.rs, SelectionView-by-value) is fully superseded by the second (src/selection_view.rs, Cow-backed cache, &SelectionView, compute_view()); replaying both would mean resolving the same BnbMetric merge against master twice. The flamegraph on fix/better-memory showed ~65% of run_bnb_lowest_fee time was in cs.selected_value() (47.5%) and cs.input_weight() (17.3%) -- both O(|selected|) walks recomputed many times per branch via excess/is_target_met/drain. This makes the metric evaluator delta-aware. BnB maintains a SelectionCache (running aggregates over the selection: value_sum, weight_sum, input_count, segwit_count, candidate_count) per Branch; inclusion expansions call cache.add(c), which is O(1). The metric trait now takes a `&SelectionView<'_>` -- a handle over (&CoinSelector, Cow<SelectionCache>) -- with O(1) versions of every read method (selected_value, input_weight, excess, is_target_met, drain, ...). CoinSelector's ~15 duplicated O(|selected|) read methods collapse into a single `cs.compute_view()` entry point (fresh cache built once on demand); the BnB hot path borrows the per-branch cache instead (zero clone). Adapted to the master-side API changes: - score/bound/drain take `target: Target` as a parameter (metrics no longer store target); BnbIter threads its `target` through to the metric. - LowestFee owns the change decision (dust_relay_feerate + drain_weights, no change_policy) and implements `drain`; its delta-aware not-target-met bound loop is preserved. - Changeless<M> wraps an inner metric, using &SelectionView + target. All lib, integration and doc tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The original benches capped at 4k candidates for `clone` and 200 for `run_bnb_lowest_fee`. Real callers span a much wider range -- a typical wallet has ~1k UTXOs, a large exchange ~10M. For the O(n)-ish operations (`new`, `clone`, `compute_view`) extend the parameter list to 64 / 1k / 16k / 256k / 1M / 10M. These all scale roughly linearly: clone/64 51 ns clone/1024 52 ns clone/16384 260 ns clone/262144 2.0 us clone/1048576 6.3 us clone/10000000 98 us At 10M UTXOs the `Candidate` slice itself is ~320MB and the selector's `candidate_order` Vec is ~80MB -- commented as a heads-up for memory- constrained hosts. Add new groups: - `new`: cost of `CoinSelector::new(candidates)` -- allocations grow with pool size. - `compute_view`: cost of building a SelectionView. Scales with |selected| rather than |pool|; benched against a fixed sparse selection of ~100 candidates regardless of pool size, matching how wallets actually use selection. For `run_bnb_lowest_fee` add n=500 and n=1000. BnB's search space is exponential but `max_rounds=100k` caps per-sample work, so these end up measuring "first 100k rounds" -- per-round cost grows roughly linearly: run_bnb_lowest_fee/200 193 ms run_bnb_lowest_fee/500 211 ms run_bnb_lowest_fee/1000 377 ms 10M-scale BnB is intentionally not benchmarked: it's impractical at any finite round budget, and real callers pre-filter / pre-group at that scale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The flamegraph after the delta-aware refactor showed ~32% of run_bnb time spent walking candidate_order and checking Bitset::contains(selected) || Bitset::contains(banned) per element, inlined into insert_new_branches's `cs.unselected().next()`. As BnB descends and more candidates get selected/banned, each .next() call scans further before finding the next viable candidate. But BnB never re-considers a position: each branch's exploration only moves forward in candidate_order. Inclusion advances by 1; exclusion advances past every consecutive same-(value, weight) candidate. So we can store a per-Branch cursor and avoid the scan entirely. Add `Branch::cursor: usize` (the position the branch will expand on next). The init branch starts at 0; insert_new_branches advances past any pre-selected/pre-banned positions on demand, then expands at the located cursor and hands children their new cursors directly. Bench (run_bnb_lowest_fee, n = pool size): n=20 166 us -> 147 us (11%) n=50 5.3 ms -> 4.5 ms (16%) n=100 12.6 ms -> 11.5 ms (9%) n=500 200 ms -> 171 ms (14%) n=1000 365 ms -> 248 ms (32%) Largest win at large n where the unselected scan was burning the most time. Flamegraph confirms the unselected-scan hot spot is gone; new top is LowestFee::bound itself (the metric's float math + lookahead). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three rustdoc warnings: - `SelectionView` referenced the pub(crate) `crate::bnb::BnbIter` in prose. Reworded to describe the behaviour without naming the internal type. - `Drain` linked to `CoinSelector::drain`, which moved to `SelectionView::drain`. Retargeted. - `metrics` module docs used `[CoinSelector::run_bnb]` / `bnb_solutions` with no `use` in module scope. Switched to fully-qualified `crate::CoinSelector::...` paths. Also picks up rustfmt reflows across the changed files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

evanlinjin force-pushed the perf/delta-aware-metric branch 2 times, most recently from e94e9a1 to 89b4255 Compare July 2, 2026 17:16

evanlinjin and others added 4 commits July 2, 2026 17:21

evanlinjin force-pushed the perf/delta-aware-metric branch from 89b4255 to ec6a713 Compare July 2, 2026 17:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf!: delta-aware BnbMetric via SelectionView/compute_view#53

perf!: delta-aware BnbMetric via SelectionView/compute_view#53
evanlinjin wants to merge 4 commits into
bitcoindevkit:masterfrom
evanlinjin:perf/delta-aware-metric

evanlinjin commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

evanlinjin commented Jul 2, 2026

Summary

Benchmarks

Commits

API changes (breaking)

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant