fix: delay client removal in contract test service to mitigate flaky FDv2 test (SDK-2050)#1200
fix: delay client removal in contract test service to mitigate flaky FDv2 test (SDK-2050)#1200
Conversation
…ondition The DisconnectsOnGoodbye FDv2 test uses assert.Never which spawns goroutines that can outlive the test. When the test cleanup deletes the client, stale goroutines may still try to evaluate flags on it, resulting in a 404 that crashes the next test. By delaying removal from the pool for 1 second after closing, the closed client can still serve requests (InMemoryFeatureStore.close() is a no-op) while stale goroutines complete. Co-Authored-By: Steven Zhang <szhang@launchdarkly.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
@launchdarkly/js-sdk-common size report |
|
@launchdarkly/browser size report |
|
@launchdarkly/js-client-sdk size report |
|
@cursor review |
|
@launchdarkly/js-client-sdk-common size report |
|
@cursor review |
|
I'll put in a fix in test-harness... will close once we can verify the fix in test harness works |
|
fixed upstream |
Requirements
Related issues
Describe the solution you've provided
The
streaming/fdv2/permanent fallback to secondary synchronizercontract test fails intermittently with a404onPOST /clients/:id.Root cause: The preceding
DisconnectsOnGoodbyetest uses testify'sassert.Never, which spawns goroutines to poll a condition. These goroutines can outlive theNevercall—when the test finishes and cleanup sendsDELETE /clients/:id, the client is immediately removed from the pool. A stale goroutine then callsEvaluateFlagon the now-missing client, gets a 404, and panics viarequire.NoError, crashing the next test. The stack trace confirms this (assert.Never.func1 → DisconnectsOnGoodbye.checkForUpdatedValue.func1).Fix: Delay removal of the client from the
ClientPoolby 1 second after closing it. SinceInMemoryFeatureStore.close()is a no-op, the closed client can still serve stale flag evaluation requests with valid cached data, preventing the 404.Describe alternatives you've considered
testharness-suppressions-fdv2.txt— loses coverage.assert.Never— not in scope for this repo.Additional context
Important
For reviewers: The 1-second delay is a heuristic chosen to exceed the
assert.Neverparameters in the harness (waitFor=100ms, tick=20ms). The closed client still has its in-memory store intact, sovariation()returns real data. However, other subsystems (event processor, update processor) are closed—this is safe because the stale goroutines only callEvaluateFlag, which only needs the feature store.Note that similar failures were observed before the
gotremoval (Mar 12–13), suggesting this is a pre-existing race condition in the FDv2 test suite rather than a regression from PR #1188.Link to Devin session: https://app.devin.ai/sessions/6d2f3eb1ad234bff9516ca5a5ef2c6f0
Requested by: @joker23
Note
Low Risk
Low risk: change is isolated to the contract test service cleanup path and only affects when a closed client is removed from the in-memory
ClientPool.Overview
Mitigates a flaky contract-test race by delaying
ClientPoolremoval of a client for 1s afterDELETE /clients/:id, instead of removing it immediately.The service still closes the client right away, but keeps the entry briefly so late test-harness requests don’t hit a transient
404during teardown.Written by Cursor Bugbot for commit c5bae6d. This will update automatically on new commits. Configure here.