Skip to content

auctioneer: resubscribe account after stream loss#521

Merged
Roasbeef merged 1 commit into
lightninglabs:masterfrom
djkazic:resubscribe-account-after-stream-loss
May 22, 2026
Merged

auctioneer: resubscribe account after stream loss#521
Roasbeef merged 1 commit into
lightninglabs:masterfrom
djkazic:resubscribe-account-after-stream-loss

Conversation

@djkazic
Copy link
Copy Markdown
Contributor

@djkazic djkazic commented May 21, 2026

Problem

HandleServerShutdown rebuilds every account's subscription after the auctioneer stream goes down. Today the loop bails on the first per-account error, leaving every remaining account silently un-subscribed until the process restarts. Switch the loop to best-effort: log per-account failures, keep going, and aggregate errors so the caller still gets a signal.

auctioneer/client.go:1304-1309 (current code):

for _, acctKey := range acctKeys {
    err := c.StartAccountSubscription(context.Background(), acctKey)
    if err != nil {
        return err
    }
}

acctKeys is drained from c.subscribedAccts immediately before the loop, so every entry that doesn't get re-subscribed is gone from the in-memory map as well. If StartAccountSubscription fails for one account, for example, the server returns any error response in non-recovery mode (client.go:694-697), or the new stream's per-account handshake errors, the function returns and every account after it in iteration order is silently dropped.

From the trader's perspective the accounts look healthy in the CLI and account manager state; from the server's perspective they're not subscribed, and their orders are filtered as "offline trader" at every matching attempt until the trader process restarts.

This is the same severity class as the EOF-no-reconnect bug (#518) and the recovery-leaves-stale-entry bug (#520): a single transient or account-specific failure silently degrades the trader's participation in auctions, with no in-band signal to the user.

The bug is also conditional on Go's randomized map iteration order — the "survivors" are whichever accounts happen to be drained before the failing one. That ordering can change between runs, which makes the symptom intermittent and hard to attribute.

Fix

auctioneer/client.go:HandleServerShutdown: collect per-account errors instead of returning on the first one. Log each failure (with the trader key) so the operator has actionable per-account context in the trader log, then return errors.Join of every collected error. The caller in pool/rpcserver.go:315 still receives — and logs — the aggregated error, so callers that previously inspected the return value continue to see a non-nil error when anything went wrong; they just no longer lose visibility into the survivors.

Tests

Adds TestHandleServerShutdownPartialResubscribeFailure in auctioneer/client_test.go. It pre-populates subscribedAccts with three accounts, drives HandleServerShutdown end-to-end against a scripted fake stream, and asserts every account's handshake is attempted even when one of them returns ACCOUNT_DOES_NOT_EXIST.

The orchestrator fails the first observed handshake rather than a specific account key. This keeps the assertion deterministic under Go's randomized map iteration order: with the buggy loop, exactly one handshake ever completes (the failing one); with the fix, all three do.

Verified the test catches the bug: against the unmodified HandleServerShutdown, it fails with expected re-subscribe to attempt all 3 accounts, only attempted 1 — the loop bails on first error and leaves later accounts silently un-subscribed.

Copy link
Copy Markdown
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🥕

@Roasbeef Roasbeef force-pushed the resubscribe-account-after-stream-loss branch from 44ba2ee to 5b7fe11 Compare May 22, 2026 02:46
@Roasbeef Roasbeef merged commit 573fb57 into lightninglabs:master May 22, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants