[Bug] Provider API error 400: assistant message with 'tool_calls' missing corresponding tool response (tool_call_id Read:158) #705

@irfndi

Bug Description

Kimi Code repeatedly fails with a provider API error that makes the session unusable. The error claims an assistant message containing tool_calls is not followed by the required tool response messages.

Error Message

400 an assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: Read:158

Environment

  • Kimi Code version: 0.14.0
  • Install source: homebrew
  • OS: darwin arm64
  • Node.js version: 26.3.0
  • Wire protocol version: 1.4
  • Model: kimi-code/kimi-for-coding
  • Shell: /bin/zsh
  • Terminal: zed 1.6.3+stable.306.601ecb3ee5c16940191818ee7f244837abf6983c
  • Session ID: session_c44fe4ac-ae8b-4e56-a54a-4d37be7a1c47
  • Session title: based on @.omo what nexts?
  • Workspace: /Users/irfandi/.local/share/opencode/worktree/91fb77f4c6c843e75d4b0755a138a24dd7e2e7fe/clever-cabin

Reproduction / Observed behavior

  1. Resume an existing session.
  2. The assistant emits a tool call with id Read:158.
  3. Subsequent LLM requests fail with HTTP 400 because the message history is missing the corresponding tool response for Read:158.
  4. The failure persists across retries and turns, blocking the session completely.

Log excerpt from logs/kimi-code.log:

2026-06-13T02:33:12.563Z INFO  llm request  turnStep=9.18
2026-06-13T02:33:13.599Z WARN  llm request failed  turnStep=9.18 attempt=1/3 model=kimi-code/kimi-for-coding errorName=APIStatusError errorMessage="400 an assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: Read:158" statusCode=400
2026-06-13T02:33:13.600Z ERROR turn failed  turnId=9
...
2026-06-13T02:38:05.687Z WARN  llm request failed  turnStep=1.1 attempt=1/3 model=kimi-code/kimi-for-coding errorName=APIStatusError errorMessage="..." statusCode=400
2026-06-13T02:38:28.037Z ERROR compaction failed
  APIStatusError: 400 ... Read:158

The error also occurs during compaction, which suggests the corrupted conversation state is being persisted.

Expected behavior

  • The client should either:
    • Ensure every tool_calls assistant message is followed by the matching tool response before sending the next request, or
    • Detect and repair the inconsistent turn state rather than repeatedly sending the malformed history to the provider.
  • A session should be able to resume without hitting a permanent 400 loop.
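
As a rough illustration of the "detect" half of the expected behavior, the sketch below scans an OpenAI-compatible message list and reports every tool_call_id that is not answered by a tool message directly after its assistant message. The `ChatMessage` type and the `findOrphanedToolCallIds` name are assumptions made for this example; they are not Kimi Code's internal types or functions.

```typescript
// Minimal OpenAI-compatible message shapes (assumed for illustration only).
type ToolCall = { id: string };
type ChatMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

// Returns every tool_call_id that has no matching tool message in the
// contiguous run of tool messages directly after its assistant message.
function findOrphanedToolCallIds(messages: ChatMessage[]): string[] {
  const orphans: string[] = [];
  for (let i = 0; i < messages.length; i++) {
    const msg = messages[i];
    if (msg.role !== "assistant") continue;
    const calls = msg.tool_calls ?? [];
    if (calls.length === 0) continue;

    // Collect the ids answered by the tool messages that immediately follow.
    const answered = new Set<string>();
    for (let j = i + 1; j < messages.length; j++) {
      const next = messages[j];
      if (next.role !== "tool") break;
      answered.add(next.tool_call_id);
    }
    for (const call of calls) {
      if (!answered.has(call.id)) orphans.push(call.id);
    }
  }
  return orphans;
}

// Hypothetical history shaped like the failing session: the Read:158 call
// is followed by a user message instead of its tool result.
const history: ChatMessage[] = [
  { role: "user", content: "Resume the active goal." },
  { role: "assistant", content: null, tool_calls: [{ id: "Read:158" }] },
  { role: "user", content: "Background task finished." },
];
console.log(findOrphanedToolCallIds(history)); // [ 'Read:158' ]
```

Running a check like this before each request would let the client stop and repair the history instead of retrying a request the provider is certain to reject.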

Related issues and PRs

This appears to be a known class of bug in MoonshotAI/kimi-code and the broader Kimi/Moonshot ecosystem. Below are the most relevant existing reports and attempted fixes.

Directly related issues in MoonshotAI/kimi-code

| Issue | Status | Relationship |
| --- | --- | --- |
| #269 | open | Same error after force-interrupt during tool execution; resume hits 400. Root cause diagnosed as dirty pendingToolResultIds + project() not filtering incomplete tool sequences. |
| #660 | open | "Impossible to resume crashed sessions": OOM/force-kill during tool execution leaves session unresumable with tool_call_ids did not have response messages. |
| #701 | open | Same 400 on session resume with open tool calls; includes a proposed fix commit on an external fork. |
| #520 | open | Related flow: thinking model returns reasoning-only completion after a tool call (APIEmptyResponseError). |

Related PRs in MoonshotAI/kimi-code

| PR | Status | Relationship |
| --- | --- | --- |
| #664 | open | Direct fix for #660: adds trimTrailingOpenToolExchange() in project() and cleanupOrphanedToolCalls() in ContextMemory after resume. |
| #273 | closed | Earlier fix attempt for #269: synthesizes missing tool.result messages at replay time. |
| #553 | open | UX-side mitigation: auto-undo interrupted prompts when a turn produces no output. |

Related reports in other Moonshot / Kimi projects

| Project | Issue | Notes |
| --- | --- | --- |
| MoonshotAI/kimi-cli | #1977 | Exact same error (Shell:58 tool_call_id missing). |
| MoonshotAI/kimi-cli | #1299 | Same error on kimi --continue; corrupted session history (disk full). Workaround: start a new session without --continue. |
| MoonshotAI/kimi-cli | #1171 | Malformed tool_calls[].function.arguments JSON poisons session history permanently. |
| MoonshotAI/kimi-cli | #2165 | Corrupted context.jsonl makes session unrecoverable. |

Cross-project reports (same OpenAI-compatible error)

| Project | Issue | Notes |
| --- | --- | --- |
| openai/codex | #8479 | Parallel tool_calls with lost responses during session interruption. |
| pydantic/pydantic-ai | #562, #2360 | Workaround: parallel_tool_calls=False. |
| google/adk-python | #153, #187 | Multi-agent orchestration loses tool_call_ids. |
| microsoft/semantic-kernel | #9443 (Java), #7626 (.NET) | Framework injects user messages between tool responses. |

Root cause analysis (from local debug zip)

I exported the debug zip and inspected logs/kimi-code.log, agents/main/wire.jsonl, state.json, and manifest.json.

Timeline of the crash

| Time (UTC) | Event |
| --- | --- |
| 2026-06-12 11:39 | Session created. |
| ~11:40–02:24 | 8 long turns with many tool calls and 6 compactions. |
| 02:30:45 | Turn 9 begins. User prompt: "Resume the active goal." |
| 02:33:12 | Turn 9, step 17: model decides to Read config.ts. |
| 02:33:12 | Critical race: a turn.steer event (background task completion notification) is injected mid-step, creating a new turn 10 while turn 9 is still in-flight. |
| 02:33:12 | Turn 9, step 17's Read tool call completes, but the tool result appears to be split across the turn boundary created by the steer. |
| 02:33:13 | 400 error on turn 9 step 18: Read:158 tool_call_id has no matching tool response. |
| 02:33:30 | Turn 10 step 2 also fails with the same 400. |
| 02:38:05 | After session restart, the very first turn fails with the same 400. |
| 02:38:28 | Even manual compaction fails with the same 400, confirming the corrupted context is persisted. |

Compaction history

Six compaction events occurred before the crash; the last one completed successfully at 22:43:12. A manual compaction attempt at 02:38:27 was cancelled because the compaction itself hit the 400 error.

Hypothesis

The Read:158 tool result was orphaned by a turn.steer event that fired while a Read tool call was in-flight. The steer injected new user messages and spawned a new turn 10 on top of the still-running turn 9. When the context was reconstructed for turn 9's next API call, the assistant message containing the Read:158 tool_call was preserved, but its matching tool result was attributed to the wrong turn/context slot or dropped across the turn boundary. This is a concurrency bug in context reconstruction when background-task notifications interrupt an active step.
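
To make the hypothesized interleaving concrete, here is a deliberately simplified model. It is not Kimi Code's actual context code: `StepHistory`, `steer`, and `deferSteers` are hypothetical names. It only shows how appending a steer message while a tool call is in flight separates the assistant message from its tool result, and how queueing the steer until the pending results are committed keeps the exchange contiguous.

```typescript
type Msg =
  | { role: "user"; content: string }
  | { role: "assistant"; toolCallIds: string[] }
  | { role: "tool"; toolCallId: string };

// Toy message history that tracks which tool calls still await a result.
class StepHistory {
  readonly messages: Msg[] = [];
  private pending = new Set<string>();
  private queuedSteers: string[] = [];
  private deferSteers: boolean;

  constructor(deferSteers: boolean) {
    this.deferSteers = deferSteers;
  }

  assistantToolCalls(ids: string[]): void {
    this.messages.push({ role: "assistant", toolCallIds: ids });
    for (const id of ids) this.pending.add(id);
  }

  toolResult(id: string): void {
    this.messages.push({ role: "tool", toolCallId: id });
    this.pending.delete(id);
    // Once every pending call is answered, release any queued steers.
    if (this.pending.size === 0) {
      for (const content of this.queuedSteers) {
        this.messages.push({ role: "user", content });
      }
      this.queuedSteers = [];
    }
  }

  steer(content: string): void {
    if (this.deferSteers && this.pending.size > 0) {
      this.queuedSteers.push(content); // hold until tool results are committed
    } else {
      this.messages.push({ role: "user", content }); // lands mid-exchange
    }
  }
}

// Replays the suspected sequence: tool call, steer arrives, tool result.
function run(deferSteers: boolean): string {
  const h = new StepHistory(deferSteers);
  h.assistantToolCalls(["Read:158"]);
  h.steer("background task completed");
  h.toolResult("Read:158");
  return h.messages.map((m) => m.role).join(" > ");
}

console.log(run(false)); // assistant > user > tool
console.log(run(true)); // assistant > tool > user
```

In the first ordering the tool result no longer directly follows the assistant message, which is the shape the provider rejects. Whether the real result was misplaced like this or dropped outright cannot be told from this model.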

Debug artifacts

Exported debug zip:

/Users/irfandi/.local/share/opencode/worktree/91fb77f4c6c843e75d4b0755a138a24dd7e2e7fe/clever-cabin/session_c44fe4ac-ae8b-4e56-a54a-4d37be7a1c47.zip

Contains:

  • manifest.json
  • state.json
  • logs/kimi-code.log
  • logs/global/kimi-code.log
  • agents/main/wire.jsonl (8757 lines of wire-protocol events)

Please let me know if you need the zip uploaded somewhere or additional logs.

Suggested fixes / mitigations

  1. Defer turn.steer injection until the current step's tool calls have completed and their results are committed to the same context slot.
  2. Validate tool_call/tool_result pairing in project() / context reconstruction before sending to the API; drop or synthesize orphaned pairs rather than sending malformed history.
  3. On session resume, run cleanupOrphanedToolCalls() / trimTrailingOpenToolExchange() even when the orphaned tool calls are no longer at the very end of the history (e.g. because a new user prompt was appended after them).
  4. Prevent compaction from persisting corrupted context: if an API call fails with this specific 400, do not compact until the context is repaired.
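
Fixes 2 and 3 could look roughly like the sketch below, which repairs tool exchanges anywhere in the history rather than only at the tail. It assumes OpenAI-compatible message shapes; `repairToolExchanges` and the placeholder text are invented for this example and are not the functions from PR #664. Each assistant message is followed by the results for its tool calls, existing results are moved back next to their call, a placeholder is synthesized when no result exists, and tool messages that match no call are dropped.

```typescript
// Minimal OpenAI-compatible message shapes (assumed for illustration only).
type ToolCall = { id: string };
type ChatMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

function repairToolExchanges(messages: ChatMessage[]): ChatMessage[] {
  // Index the first tool result for each id, wherever it sits in the history.
  const results = new Map<string, ChatMessage>();
  for (const m of messages) {
    if (m.role === "tool" && !results.has(m.tool_call_id)) {
      results.set(m.tool_call_id, m);
    }
  }

  const repaired: ChatMessage[] = [];
  for (const m of messages) {
    // Tool messages are re-emitted next to their assistant message below;
    // ones that match no tool call are dropped here.
    if (m.role === "tool") continue;
    repaired.push(m);
    if (m.role !== "assistant") continue;

    for (const call of m.tool_calls ?? []) {
      const placeholder: ChatMessage = {
        role: "tool",
        tool_call_id: call.id,
        content: "[tool result unavailable: call was interrupted]",
      };
      repaired.push(results.get(call.id) ?? placeholder);
    }
  }
  return repaired;
}

// Hypothetical corrupted history: Read:158 has no result and a new user
// prompt was appended after it, so the open call is not at the tail.
const corrupted: ChatMessage[] = [
  { role: "assistant", content: null, tool_calls: [{ id: "Read:158" }] },
  { role: "user", content: "Resume the active goal." },
];
console.log(repairToolExchanges(corrupted).map((m) => m.role).join(" > "));
// assistant > tool > user
```

Synthesizing a placeholder keeps the assistant's tool call visible to the model, while dropping the orphaned call would discard it. Either choice satisfies the provider's pairing rule, and which one is preferable is a design decision for the maintainers.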

Additional context

This happened after the user chose to resume an active goal. The session had already accumulated many turns (turn 7.x, 8.x, 9.x) before the failure. The Read:158 tool call ID is the Kimi/Moonshot API-level ID for the 158th Read tool call in the session; it is distinct from Kimi Code's internal wire UUIDs.
