grpc: Add noncebalancer that tracks non-READY backends #8672

Closed
beautifulentropy wants to merge 1 commit into main from fix-noncebalancer

@beautifulentropy (Member) commented Mar 11, 2026

The nonce service sets maxConnectionAge to 30s, so it periodically sends a GOAWAY on each of the WFE's gRPC connections, and the affected SubConns briefly leave the READY state while they reconnect. Because maxConnectionAge is jittered, the getNonceService and redeemNonceService connections to the same backend can receive their GOAWAYs at slightly different times. That creates a window in which the WFE can still issue nonces from a backend it cannot currently redeem against.

The original nonce balancer/picker (grpc/noncebalancer) only tracks READY SubConns, so a backend that is reconnecting after a GOAWAY is indistinguishable from a backend that does not exist, and the subscriber gets a badNonce error. The v2 balancer fixes this by maintaining two maps: one for READY backends and one for not-READY backends. When a request targets a prefix whose backend exists but isn't READY, the picker returns ErrNoSubConnAvailable, which tells gRPC to queue the RPC and wait for the SubConn to reconnect (see picker_wrapper.go:159). Only genuinely unknown prefixes now produce ErrNoBackendsMatchPrefix.

To simplify comparison during review and testing in staging, the v2 balancer is implemented as a separate package (grpc/noncebalancerv2) alongside the existing grpc/noncebalancer. Either can be configured in the WFE by setting redeemNonceService.srvResolver to "nonce-srv" or "nonce-srv-v2" in the WFE config.

Note: grpc/noncebalancerv2/balancer.go is best compared directly against vendor/google.golang.org/grpc/balancer/base/balancer.go

Fixes #8662

@beautifulentropy changed the title from "grpc: Add noncebalancer that tracks reconnecting backends" to "grpc: Add noncebalancer that tracks non-READY backends" Mar 11, 2026
@beautifulentropy marked this pull request as ready for review March 11, 2026 21:33
@beautifulentropy requested a review from a team as a code owner March 11, 2026 21:33
@beautifulentropy requested a review from jsha March 11, 2026 21:33
@github-actions (Contributor) commented

@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values.

@aarongable (Contributor) commented

Closing in favor of #8679

@aarongable closed this Mar 17, 2026
Development

Successfully merging this pull request may close these issues.

Fix badNonce CI flake
