fix(amqp091): close BrokerDetails.state and Test_Retry races#104
Open
miotte wants to merge 1 commit into
Open
Conversation
…nect, and Test_connect_connecting_* Three pre-existing races surfaced under -race in this package: - BrokerDetails.state was a uint16 written under bd.Mutex by connect() and connectionWatcher() but read without the lock from WaitForConnect, queueSubscribe, Publish, and connect()'s own CONNECTING fast-path check. Converted to atomic.Uint32 so all reads/writes synchronize, and dropped the now-redundant Lock()/Unlock() around state-only writes in connectionWatcher. - getBrokerDetails() unconditionally reassigned bd.tlsConfig on every call, racing with concurrent reads of bd.tlsConfig in connect(). The assignment was redundant: BrokerDetails is constructed with tlsConfig: prov.tlsConfig and prov.tlsConfig is never reassigned after NewAMQP091Provider, so removing the write fixes the race without changing behavior. - Test_Retry built an amqp091Message, started a goroutine that sent it on a channel, and only then mutated mm.Headers from the test goroutine. Moved the Headers assignment ahead of the goroutine so the value is fully populated before it is sent. Test_Retry, Test_WaitForConnect, Test_connect_connecting_connected, Test_connect_connecting_closed, and Test_connect_connecting_disconnected now pass under \`go test -race\`. Signed-off-by: Michael Otteni <MichaelGOtteni@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #103
Summary
BrokerDetails.statefromuint16toatomic.Uint32so all reads/writes synchronize, and drop the now-redundantLock()/Unlock()blocks around state-only writes inconnectionWatcher. Mirrors the pattern PR fix(amqp091): close race on BrokerDetails.clientDisconnect #102 applies toclientDisconnect.bd.tlsConfig = prov.tlsConfigwrite ingetBrokerDetails— the constructor already sets it (amqp091.go:458) andprov.tlsConfignever changes afterNewAMQP091Provider, so the write only created a race against the unlocked read inconnect().mm.Headers = ...assignment inTest_Retryahead of the goroutine that sendsmmonmsgs, so the value is fully populated before it is sent.Test plan
go test -race -run "Test_Retry|Test_WaitForConnect|Test_connect_connecting_" -count=5 ./internal/provider/connectors/amqp091/...go test ./internal/provider/connectors/amqp091/...(no-race) to confirm no behavior changego vet ./...Verification (2026-05-30):
bd.staterace this PR fixes is confirmed gone: onmainthis-racecommand fails 6/6 with abd.stateDATA RACE; on this branch that race no longer appears.go vet ./...pass. (vetreports only one pre-existing, unrelated warning attest/config/env.go:60, also present onmain; this PR's code vets clean.)-count=5command still flakes (~1 in 4, even with the full stack fix(amqp091): close race on BrokerDetails.clientDisconnect #102+fix(amqp091): handle NewConsumer failure and atomic-load shared counters #84 merged) because of a pre-existing leakedconnectionWatchergoroutine — its 30s fallback timer fires after a test has ended, calling the mockIsClosed()(amqp091_test.go:67) and racing the package-levelNewAmqpConn091swap. This is not introduced or fixed by this PR/stack and warrants a separate issue for test teardown.