Skip to content

sweep: fix utxo contention and spurious fee ratchet#10816

Draft
jtobin wants to merge 2 commits into
lightningnetwork:masterfrom
jtobin:sweep-contention-fixes
Draft

sweep: fix utxo contention and spurious fee ratchet#10816
jtobin wants to merge 2 commits into
lightningnetwork:masterfrom
jtobin:sweep-contention-fixes

Conversation

@jtobin
Copy link
Copy Markdown
Collaborator

@jtobin jtobin commented May 19, 2026

Makes two fixes to the UTXO sweeper. The first eliminates contention for UTXOs by leasing them during selection; the second ensures that we don't ratchet fee rates on sweep failure, if fee rates aren't the cause of the failure.

Interestingly, the first fix "unlocks" the need for the second one.

Consider two sweeps. Previously, the loser of a UTXO collision (in which both sweeps try to use the same UTXO) saw UnknownSpend, which would not trigger the fee ratchet. The first fix eliminates the concurrent UTXO contention; now, if there are no other UTXOs available for the second sweep after the first has leased one, the sweeper will fail with ErrNotEnoughInputs, which triggers the fee rate ratchet, even though fees aren't actually the issue (the issue is that there aren't any UTXOs to cover the fees at all).

Ratcheting the fee rate on a sweep for which fees are not the problem will not solve the problem, so the second fix makes sure we don't spuriously ratchet the fee rate to the point that the sweep becomes stranded.

Partially resolves lightninglabs/taproot-assets#1976. Resolves #7397.

@jtobin jtobin marked this pull request as draft May 19, 2026 02:43
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request improves the reliability of the UTXO sweeper by addressing two critical issues: race conditions during UTXO selection and incorrect fee rate escalation. By implementing a leasing system for selected UTXOs, the sweeper now ensures that inputs are reserved during the coin selection process. Additionally, the logic for handling sweep failures has been refined to distinguish between fee-related errors and resource-related errors, ensuring that fee rates are only ratcheted when appropriate.

Highlights

  • UTXO Contention Fix: Introduced a leasing mechanism for wallet UTXOs during selection to prevent concurrent sweeps from competing for the same inputs, effectively eliminating UTXO contention.
  • Fee Ratchet Logic Adjustment: Modified the sweeper to avoid ratcheting fee rates on resource-related failures (e.g., ErrNotEnoughInputs or ErrNotEnoughBudget), preventing inputs from becoming stranded due to artificially high fee requirements.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@jtobin jtobin self-assigned this May 19, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to lease wallet UTXOs selected as fee inputs during the sweeping process, preventing collisions between concurrent sweep attempts. It also refines the fee-bumping logic to avoid ratcheting up fee rates for non-fee-related failures, such as insufficient budget or missing wallet inputs. A suggestion was made to ensure that error reporting in handleReplacementTxError consistently reflects the underlying cause when a fee calculation fails.

Comment thread sweep/fee_bumper.go
Comment on lines 1894 to 1898
if ferr != nil {
// If there's an error with the fee calculation, we need to
// abort the sweep.
event = TxFatal
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with handleInitialTxError (lines 1124-1128), when calculateRetryFeeRate returns an error, the err variable should be updated to the calculation error. This ensures that the BumpResult and the subsequent error log (line 1909) reflect the reason why the sweep was aborted with a TxFatal event.

Suggested change
if ferr != nil {
// If there's an error with the fee calculation, we need to
// abort the sweep.
event = TxFatal
}
if ferr != nil {
// If there's an error with the fee calculation, we need to
// abort the sweep.
event = TxFatal
err = ferr
}

@github-actions github-actions Bot added the severity-critical Requires expert review - security/consensus critical label May 19, 2026
@github-actions
Copy link
Copy Markdown

🔴 PR Severity: CRITICAL

Automated classification | 5 files | 144 lines changed (non-test)

🔴 Critical (5 files)
  • sweep/fee_bumper.go - Fee bumping logic for output sweeping
  • sweep/interface.go - Core sweep package interfaces
  • sweep/sweeper.go - Main sweeper coordination logic
  • sweep/tx_input_set.go - Transaction input set management for sweeps
  • sweep/walletsweep.go - Wallet-level sweep operations
Test/Excluded (4 files — not counted)
  • sweep/fee_bumper_test.go
  • sweep/mock_test.go
  • sweep/sweeper_test.go
  • sweep/tx_input_set_test.go

Analysis

All changed files reside in the sweep/* package, which is classified as CRITICAL because it handles output sweeping, fund recovery, and fee bumping — operations directly responsible for recovering on-chain funds. Bugs in this package can result in financial loss (e.g., missed HTLC timeouts, incorrect fee estimation causing outputs to go unswept or swept at excessive cost).

The PR modifies 5 non-test production files with 144 net lines changed, which is under the bump thresholds (>20 files or >500 lines). No severity bump was applied. The changes appear to extend fee-bumping behavior and the tx input set logic, warranting careful review by an engineer familiar with the sweep package and on-chain fee estimation.


To override, add a severity-override-{critical,high,medium,low} label.
<!-- pr-severity-bot -->

@jtobin jtobin force-pushed the sweep-contention-fixes branch from 568f4ce to 36a7dab Compare May 19, 2026 10:36
@github-actions github-actions Bot added severity-critical Requires expert review - security/consensus critical and removed severity-critical Requires expert review - security/consensus critical labels May 19, 2026
jtobin added 2 commits May 19, 2026 08:07
When the sweeper processes multiple pending inputs in a single
coin-select-lock cycle, each one independently asks the wallet for a
UTXO to cover fees. Nothing in that loop communicates "this UTXO is
already claimed", so two concurrent sweeps can pick the same one;
when the winner publishes and burns it, the loser is left referencing
a spent input and can loop on retries without finding a free UTXO if
the wallet view is slow to reflect the spend.

Fix: lease each chosen wallet UTXO on the wallet under a sweeper-
specific lock ID at selection time. Subsequent selections in the
same cycle and across the retry window before the sweep tx confirms
see a strictly smaller pool. The lease expires after the default
duration if the sweep is abandoned, and is naturally consumed when
the sweep confirms.
When a sweep fails to publish, the sweeper computes a higher retry
fee rate and stores it as the input's new starting rate. That's the
right move for fee-related failures, where the broadcast was rejected
for not paying enough. But the same machinery also fires on resource
failures: no wallet UTXO available, the input itself can't cover the
desired fee, and so on. Those failures carry no fee-rate signal, and
repeatedly ratcheting the rate on them can push it past the input's
intrinsic budget, after which the aggregator silently skips the
input and it's stranded. The bug was previously masked by a separate
UTXO-collision quirk that produced UnknownSpend rather than a
resource error on the loser of a collision; with that fixed (in the
preceding commit), this ratchet bug surfaces.

Fix: distinguish fee-rate-bearing failures from resource ones at the
point where the bump result is produced, and skip the rate ratchet
for the latter. The input's existing starting fee rate is preserved
across non-fee failures, so the next retry attempts at the same rate
the original sweep request specified.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

severity-critical Requires expert review - security/consensus critical

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[flake]: suspected custom_channels_htlc_force_close_MPP flake sweeper should lock wallet utxos

1 participant