fix: allow flow runs to transition when deployment is deleted #19845
Closed
CodSpeed Performance Report: Merging #19845 will not alter performance.
When a deployment is deleted while flow runs are in AwaitingConcurrencySlot state, those runs would get stuck forever. The SecureFlowConcurrencySlots orchestration rule was aborting with "Deployment not found" instead of allowing the transition to proceed.

Since the run can't execute without its deployment anyway, the fix now cancels the run with a clear message instead of leaving it stuck in AwaitingConcurrencySlot permanently.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This pull request is stale because it has been open 14 days with no activity. To keep this pull request open remove stale label or comment.
This pull request was closed because it has been stale for 14 days with no activity. If this pull request is important or you have more to add feel free to re-open it.
Summary
This PR fixes deployment concurrency issues that cause flow runs to get stuck in `AwaitingConcurrencySlot` state.

Bug 1: Cleanup incorrectly decremented active_slots when runs were rejected

Root cause: When a flow run was rejected to `AwaitingConcurrencySlot` because the deployment concurrency limit was reached, the `cleanup` method in `SecureFlowConcurrencySlots` was unconditionally decrementing `active_slots` even though no slot was ever acquired. This caused:

- `active_slots` to go negative
- runs in `AwaitingConcurrencySlot` to never acquire slots, even when slots became available

Fix: Only clean up (decrement slots and revoke the lease) if a slot was actually acquired, which is indicated by a lease ID in the validated state.
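
The guard described in the fix can be illustrated with a small toy model. This is only a sketch of the idea, not the code from this PR: `ConcurrencyLimit`, `ProposedRun`, `try_acquire`, and the standalone `cleanup` function are hypothetical names invented for the example, and the real rule operates on orchestration context and lease storage rather than plain dataclasses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConcurrencyLimit:
    """Hypothetical stand-in for a deployment's concurrency limit."""
    limit: int
    active_slots: int = 0


@dataclass
class ProposedRun:
    """Hypothetical record of one transition attempt."""
    lease_id: Optional[str] = None  # set only when a slot was acquired


def try_acquire(limit: ConcurrencyLimit, run: ProposedRun, lease_id: str) -> bool:
    """Take a slot if one is free; otherwise the run is rejected and holds no lease."""
    if limit.active_slots >= limit.limit:
        return False  # rejected: no slot taken, lease_id stays None
    limit.active_slots += 1
    run.lease_id = lease_id
    return True


def cleanup(limit: ConcurrencyLimit, run: ProposedRun) -> None:
    """Release a slot only if this run actually holds a lease."""
    if run.lease_id is None:
        return  # nothing was acquired, so there is nothing to give back
    limit.active_slots -= 1
    run.lease_id = None  # stands in for revoking the lease


if __name__ == "__main__":
    limit = ConcurrencyLimit(limit=1)
    first, second = ProposedRun(), ProposedRun()
    try_acquire(limit, first, "lease-a")   # takes the only slot
    try_acquire(limit, second, "lease-b")  # rejected
    cleanup(limit, second)                 # must not touch the counter
    print(limit.active_slots)
```

Without the `lease_id is None` check, cleaning up the rejected run would decrement the counter for a slot that was never taken, which is the negative `active_slots` behavior described above.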
Bug 2: Orphaned runs when deployment is deleted
Root cause: When a deployment was deleted while flow runs were in `AwaitingConcurrencySlot`, the orchestration rule would ABORT with "Deployment not found" when the worker tried to transition them, leaving the runs stuck forever.

Fix: Cancel runs gracefully when their deployment is deleted instead of aborting.
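
The change in behavior can be sketched as a small decision function. This is an illustrative model under stated assumptions, not the rule's actual implementation: `Outcome`, `Decision`, `decide_transition`, and the cancellation message text are all hypothetical, and a set of IDs stands in for the deployment lookup.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class Outcome(Enum):
    ACCEPT = "accept"
    ABORT = "abort"    # old behavior for a missing deployment: run stays stuck
    CANCEL = "cancel"  # new behavior: run reaches a terminal state


@dataclass
class Decision:
    outcome: Outcome
    message: str = ""


def decide_transition(deployment_id: str, existing_deployments: Set[str]) -> Decision:
    """Decide what to do with a run waiting on a concurrency slot."""
    if deployment_id not in existing_deployments:
        # The run cannot execute without its deployment, so cancel it
        # with an explanation instead of aborting the transition.
        return Decision(
            Outcome.CANCEL,
            "Deployment was deleted; cancelling flow run.",  # hypothetical wording
        )
    return Decision(Outcome.ACCEPT)


if __name__ == "__main__":
    deployments = {"dep-1"}
    print(decide_transition("dep-1", deployments).outcome)
    print(decide_transition("dep-2", deployments).outcome)
```

The design point is that an aborted transition leaves the run in its current state, so a run whose deployment no longer exists would be retried and aborted indefinitely, whereas a cancellation moves it to a terminal state.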
Test plan
- `test_rejected_to_awaiting_concurrency_slot_does_not_decrement_slots` verifies that `active_slots` doesn't go negative when runs are rejected
- `test_deleted_deployment_allows_transition_instead_of_abort` verifies orphaned runs are cancelled
- `TestFlowConcurrencyLimits` tests pass

Impact

This fixes the deployment concurrency integration test that was failing in the OSS testbed with "Expected 4 completed runs, got 2", where 2 runs would complete but the other 2 would stay stuck in `AwaitingConcurrencySlot` forever.

🤖 Generated with Claude Code