fix: trimMessages() orphans tool messages, crashes OpenAI runs #70

Merged
Treelovah merged 1 commit into main from fix/trim-messages-orphaned-tool
Mar 10, 2026
Conversation

@Treelovah
Contributor

Summary

The sliding window in trimMessages() can slice between an assistant message (with tool_calls) and its tool response, orphaning the tool message. OpenAI hard-rejects the resulting request with a 400. The other providers tested degrade silently instead: the model is fed a tool response with no context for what was called.
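To make "orphaned" concrete, here is a minimal sketch of a check for it. The `ChatMessage` type and `findOrphanedToolMessages` helper are hypothetical and not part of this PR; the `tool_calls` and `tool_call_id` field names follow the OpenAI chat message format. The check is also looser than what the API enforces, since it accepts a matching call in any earlier assistant message.

```typescript
// Hypothetical message shape, loosely modeled on OpenAI chat messages.
interface ToolCall {
  id: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: ToolCall[]; // present on assistant messages that call tools
  tool_call_id?: string;   // present on tool messages
}

// Returns the indices of tool messages whose tool_call_id was not issued
// by any earlier assistant message in the list (simplified check).
function findOrphanedToolMessages(messages: ChatMessage[]): number[] {
  const seenCallIds = new Set<string>();
  const orphans: number[] = [];

  messages.forEach((msg, index) => {
    if (msg.role === "assistant") {
      for (const call of msg.tool_calls ?? []) {
        seenCallIds.add(call.id);
      }
    } else if (msg.role === "tool") {
      if (!msg.tool_call_id || !seenCallIds.has(msg.tool_call_id)) {
        orphans.push(index);
      }
    }
  });

  return orphans;
}
```

A trimmed list that begins with a tool message always reports index 0 as an orphan, which is the case this PR addresses.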

Reproduced on broken-auth-enum:

| Model | Provider | Iterations | Result |
|---|---|---|---|
| gpt-5.4 | OpenAI | 18/50 | Crashed: orphaned tool → 400 |
| claude-sonnet-4-5-20250929 | Anthropic | 50/50 | Survived, degraded context |
| grok-3-latest | xAI | 50/50 | Survived, degraded context |

The fix adds a second skip loop after the existing same-role dedup: any tool messages at the start of the trimmed tail are dropped, since their parent assistant message has already been sliced off.
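A rough sketch of the two skip loops, for illustration only. The actual trimMessages() signature and anchor handling are not shown in this PR, so `trimMessagesSketch`, the `anchor`/`history` split, and the exact form of the same-role dedup are assumptions. The default window of 39 comes from the commit message.

```typescript
// Hypothetical message shape; the project's real type may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Assumed structure: keep the anchor messages, then the last `windowSize`
// messages of the history, after cleaning up the start of that tail.
function trimMessagesSketch(
  anchor: ChatMessage[],
  history: ChatMessage[],
  windowSize = 39,
): ChatMessage[] {
  const tail = windowSize > 0 ? history.slice(-windowSize) : [];
  let start = 0;

  // Existing step (assumed form): skip tail messages that repeat the
  // role of the last anchor message.
  const anchorRole = anchor[anchor.length - 1]?.role;
  while (start < tail.length && tail[start]?.role === anchorRole) {
    start++;
  }

  // New step: drop tool messages at the start of the tail, because the
  // assistant message that issued their tool calls was sliced off.
  while (start < tail.length && tail[start]?.role === "tool") {
    start++;
  }

  return [...anchor, ...tail.slice(start)];
}
```

Running the tool skip after the dedup loop matters: when the dedup removes a leading message, a tool message can become the new head of the tail, which is the "anchor role collision" case listed in the test plan.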

The bug only manifests on challenges requiring ~20+ iterations. Easy challenges that complete in <15 iterations never trigger it.

Test plan

  • New unit tests for orphaned tool messages at trim boundary
  • New unit test for tool messages after anchor role collision
  • Full suite passes (417/417)
  • Reproduced bug before fix (GPT-5.4 crashes at iteration 18)

Closes #69

The sliding window in trimMessages() slices the last 39 messages but
can cut right between an assistant message (with tool_calls) and its
tool response. OpenAI's API enforces strict tool message pairing and
hard-rejects orphaned tool messages with a 400. Other providers don't
crash but silently feed the model a tool response with no context for
what was called or why, degrading reasoning quality.

Reproduced on broken-auth-enum: GPT-5.4 crashes at iteration 18 every
time. Claude Sonnet 4.5 and Grok 3 survive to iteration 50 but with
degraded context. The fix is embarrassingly simple — after the existing
same-role dedup loop, skip any orphaned tool messages at the start of
the trimmed tail.

Closes #69
Treelovah added the bug label on Mar 10, 2026
Treelovah requested a review from pi3-code on March 10, 2026 17:48
Treelovah merged commit b83dbfc into main on Mar 10, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

trimMessages() sliding window orphans tool messages, crashes OpenAI runs at ~20 iterations