Fix scenario runner event loop freeze and initial_unpark timeout#292
Open
9f72e99 to 80dd5ee
humbertoyusta approved these changes on Apr 7, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
80dd5ee to bcb2e6b
Summary

- Event loop freeze fix: In scenario mode (`running_test_scenario`), tool calls are now routed through `scenario_generate_tool_result_via_model` as fire-and-forget instead of going through local tool handlers. The local handlers use `gql`/`aiohttp` sessions whose cleanup (`__aexit__`) blocks the event loop, causing the scenario runner to hang indefinitely after the first tool call.
- Initial unpark timeout: `_completed_initial_unpark` now has a 30-second timeout. Previously, scheduler events (kanban sync, etc.) kept `did_anything=True` on every unpark cycle, so the flag never got set to `True`, blocking the scenario runner forever in the "WAIT for bot to complete initial unpark" loop.
- Background cloudtool_post_result: Moved `cloudtool_post_result` in `_local_tool_call` to an `asyncio.create_task` background task to prevent blocking the unpark loop.
- Fire-and-forget pattern in ckit_scenario: `scenario_generate_tool_result_via_model` now wraps the GQL mutation in `asyncio.create_task` and immediately raises `AlreadyFakedResult`, so the caller never awaits the HTTP response.

Test plan

- `grok-4-1-fast-reasoning` -- 8/10 passed (2 failures were model quality issues, not infra)

🤖 Generated with Claude Code
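For reviewers who want to see the two patterns from the summary in isolation, here is a minimal sketch. It is not the code in this PR: `spawn_background`, `fake_mutation`, `generate_tool_result`, and `wait_for_flag` are hypothetical names, `AlreadyFakedResult` is redefined locally as a stand-in, and the short sleep merely simulates the GQL mutation. It shows a fire-and-forget `asyncio.create_task` call followed by an immediate raise, and a polling wait that gives up after a timeout instead of blocking forever.

```python
import asyncio


class AlreadyFakedResult(Exception):
    """Local stand-in for the exception named in the summary (assumption)."""


# Hold strong references so pending tasks are not garbage-collected early.
_background_tasks = set()


def spawn_background(coro):
    """Schedule coro on the running loop without awaiting it."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def fake_mutation(results, payload):
    # Hypothetical placeholder for the GQL mutation over HTTP.
    await asyncio.sleep(0.01)
    results.append(payload)


async def generate_tool_result(results, payload):
    # Fire-and-forget: start the mutation, then raise right away so the
    # caller never awaits the (simulated) HTTP response.
    spawn_background(fake_mutation(results, payload))
    raise AlreadyFakedResult()


async def wait_for_flag(is_done, timeout, poll=0.01):
    """Poll is_done() until it is true or timeout seconds elapse.

    Returns True if the flag was observed, False on timeout.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while not is_done():
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(poll)
    return True


async def demo():
    results = []
    raised = False
    try:
        await generate_tool_result(results, "tool-output")
    except AlreadyFakedResult:
        raised = True
    seen_immediately = list(results)  # background task has not run yet
    flag_set = await wait_for_flag(lambda: bool(results), timeout=1.0)
    timed_out = not await wait_for_flag(lambda: False, timeout=0.05)
    return raised, seen_immediately, list(results), flag_set, timed_out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping a reference to each task in a module-level set follows the guidance in the `asyncio.create_task` documentation, since the event loop itself only holds weak references to tasks. Whether the actual change retains task references this way is not stated in the summary.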