auctioneer: fix initial per-account subscribe failure#520
Merged
Roasbeef merged 3 commits intoMay 22, 2026
Conversation
connectAndAuthenticate's deferred cleanup runs unconditionally on any non-success exit. On the ErrServerErrored branch we hand off to HandleServerShutdown, which itself clears subscribedAccts and re-runs StartAccountSubscription for every previously-live account — installing a fresh entry at acctPubKey. Without suppressing the defer, the outer return would then turn around and delete that fresh entry, leaving the just-resubscribed account unreachable from sendToSubscription. Set success = true before invoking HandleServerShutdown so the inner resubscribe's map entry survives. The defer's invariant — "only delete what this call originally inserted" — still holds because the inner StartAccountSubscription has already replaced our insertion with its own before HandleServerShutdown returns.
The existing TestConnectAndAuthenticateCleansUpOnAccountDoesNotExist only exercised one of the per-account error paths whose cleanup is now guarded by the deferred delete. The INCOMPLETE_ACCOUNT_RESERVATION case is the most error-prone of the bunch: it returns a non-nil sub *and* a typed AcctResNotCompletedError, which makes the "sub got returned, so the entry must still be in the map" mental model easy to fall into. Refactor the test into a table-driven TestConnectAndAuthenticateCleansUpOnError with a runCleanupCase helper that drives the full handshake and asserts the post-condition, and add the INCOMPLETE_ACCOUNT_RESERVATION case alongside the original ACCOUNT_DOES_NOT_EXIST case.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
auctioneer.Client.connectAndAuthenticateregisters a newacctSubscriptioninc.subscribedAcctsbefore running the 3-way auth handshake, and only removes it again viacloseStream(called on full client shutdown or server-disconnect reconnect).Every per-account error path returns without deleting the entry, including the legitimate
ACCOUNT_DOES_NOT_EXISTresponse thatRecoverAccountstriggers when probing keys that aren't known to the auctioneer yet.The next call to
StartAccountSubscriptionfor that account key, typically fromhandleStateOpenafter the on-chain funding tx confirms, hits the "already subscribed" early-return guard, returnsnilwithout sending a freshCommit, and the per-account handshake never runs. The client thinks the account is subscribed; the auctioneer has never authenticated it. The trader is silently filtered asoffline traderat matching time and stays that way until the process is restarted (which clears the in-memory map).Symptoms
The bug fires whenever RecoverAccounts probes a key the auctioneer hasn't seen yet. This happens naturally when:
- A user opens an account and runs recovery before the funding tx confirms.
- A user restores from seed and has accounts in flight.
- A user has had failed account creation attempts (the recovery loop walks past "holes" in the key sequence; every probed-but-not-found key seeds a stale entry).
Litd / lightning-terminal users disproportionately affected. The terminal UI invokes
RecoverAccountsas part of its account-list refresh flow, so any restart of litd with a not-yet-confirmed account is enough to set the trap.Fix
Add a
successsentinel plus adeferinconnectAndAuthenticatethat removes the entry fromc.subscribedAcctsunless the handshake reached theGetSuccessbranch. The defer is set up after the "already-subscribed" early-return guard, so it does not delete a pre-existing entry on that path.srvMsg.GetSuccess() != nilis the only branch that setssuccess = true; every other exit (authenticate error, unrecognized response, recovery-modeINCOMPLETE_ACCOUNT_RESERVATION, recovery-modeACCOUNT_DOES_NOT_EXIST, unknown error, unknown message) cleans up. The cleanup is keyed onacctPubKey, captured at the top of the function and never re-derived, so the deferred delete cannot remove an entry for a different account.Tests
Adds
TestConnectAndAuthenticateCleansUpOnAccountDoesNotExistinauctioneer/client_test.go. It drives a fullconnectAndAuthenticatecall in recovery mode against a scripted fake stream: capture the Commit, echo itscommitHashback in a Challenge, drain the Subscribe, then respond withSubscribeError_ACCOUNT_DOES_NOT_EXIST. AssertsconnectAndAuthenticatereturns(sub, false, nil)andc.subscribedAcctsis empty afterwards.Verified the test catches the bug: against the unmodified
connectAndAuthenticate, it fails withsubscribedAccts entry was not cleaned up after ACCOUNT_DOES_NOT_EXIST; later subscribes for the same account would silently no-op.