fix(e2e): widen TUI chat correlation retry guard to cover partial missing replies #4572
hunglp6d wants to merge 1 commit into
…sing replies

The looksLikeEventCaptureFailure guard only triggered a retry when ALL three replies were missing (chatEvents.length === 0). In nightly run 26698759656, prompts A and B received replies but prompt C timed out, a partial missing-reply scenario that bypassed the retry guard entirely.

Widen the condition to fire whenever missingReplies.length > 0 (not just === sentRuns.length) while keeping the correlation-bug checks (no empty finals, no duplicates, no uncorrelated replies) so real regressions still fail immediately.

Signed-off-by: Hung Le <hple@nvidia.com>
E2E Advisor Recommendation
Required E2E: None

E2E Scenario Advisor Recommendation
Required scenario E2E: None
PR Review Advisor
Findings: 0 needs attention, 2 worth checking, 0 nice ideas

This is an automated advisory review. A human maintainer must make the final merge decision.
Summary

The `openclaw-tui-chat-correlation-e2e` job in nightly run #26698759656 failed because the `looksLikeEventCaptureFailure` retry guard was too narrow: it only triggered when all three replies were missing (`chatEvents.length === 0`), but this nightly saw a partial missing-reply scenario in which prompts A and B received replies while prompt C timed out with zero chat events.

Changes

- Widened `looksLikeEventCaptureFailure` to fire whenever `missingReplies.length > 0` (not just `=== sentRuns.length`), while keeping the correlation-bug checks (no empty finals, no duplicates, no uncorrelated replies) so real regressions still fail immediately.
- Removed the `chatEvents.length === 0` requirement that blocked retries on partial failures.

Root Cause

The LLM inside the sandbox failed to answer the third rapid-fire prompt (`C2603`) within the 120-second polling deadline. Since 2 of 3 replies arrived, the retry guard saw non-zero `chatEvents` and `missingReplies.length` (1) `!== sentRuns.length` (3), so it did not retry. The test failed with `expected [ 'C2603-REPLY' ] to deeply equal []`.

Validation

The `GITHUB_TOKEN` used by this CI run does not include the `workflow` scope, so the `-custom-e2e` validation branch could not be pushed. To validate manually: `a25b3931`

Signed-off-by: Hung Le <hple@nvidia.com>
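
For reviewers who want to see the guard change in isolation, the following is a minimal sketch of the narrow and widened conditions, evaluated against the partial-failure shape described under Root Cause. It is not the code from this PR: the `AttemptResult` type, the `emptyFinals`, `duplicateReplies`, and `uncorrelatedReplies` field names, the `hasCorrelationBug` helper, and the sample prompt IDs for A and B are all hypothetical, chosen only to mirror the identifiers named in the description.

```typescript
// Hypothetical shape of one test attempt. Field names mirror identifiers
// mentioned in the PR text; the real test's types may differ.
interface AttemptResult {
  sentRuns: string[];            // prompts sent during the attempt
  chatEvents: string[];          // every chat event captured
  missingReplies: string[];      // expected replies that never arrived
  emptyFinals: string[];         // final messages with no text
  duplicateReplies: string[];    // replies observed more than once
  uncorrelatedReplies: string[]; // replies matching no sent run
}

// Any of these signals points at a real correlation regression,
// which should fail immediately rather than be retried.
function hasCorrelationBug(r: AttemptResult): boolean {
  return (
    r.emptyFinals.length > 0 ||
    r.duplicateReplies.length > 0 ||
    r.uncorrelatedReplies.length > 0
  );
}

// Narrow guard (assumed pre-change behavior): retry only when nothing
// was captured and every reply is missing.
function narrowGuard(r: AttemptResult): boolean {
  return (
    r.chatEvents.length === 0 &&
    r.missingReplies.length === r.sentRuns.length &&
    !hasCorrelationBug(r)
  );
}

// Widened guard (assumed post-change behavior): retry when at least one
// reply is missing, as long as no correlation-bug signal is present.
function widenedGuard(r: AttemptResult): boolean {
  return r.missingReplies.length > 0 && !hasCorrelationBug(r);
}

// Illustrative reconstruction of the nightly failure: two replies arrived,
// the third (C2603) did not. "A" and "B" are placeholder IDs.
const nightlyRun: AttemptResult = {
  sentRuns: ["A", "B", "C2603"],
  chatEvents: ["A-REPLY", "B-REPLY"],
  missingReplies: ["C2603-REPLY"],
  emptyFinals: [],
  duplicateReplies: [],
  uncorrelatedReplies: [],
};

console.log(narrowGuard(nightlyRun));  // false: chatEvents is non-empty and 1 !== 3
console.log(widenedGuard(nightlyRun)); // true: one reply missing, no correlation bug
```

Under these assumptions, a run that is missing a reply but also shows a duplicate or uncorrelated reply is still rejected by the widened guard, which matches the stated intent that real regressions fail immediately.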