fix(codex): handle function_call_arguments streaming for both spark and non-spark models#235
Conversation
…eaming translator Some Codex models (e.g. gpt-5.3-codex-spark) send function call arguments in a single "done" event without preceding "delta" events. The streaming translator only handled "delta" events, causing tool call arguments to be lost — resulting in empty tool inputs and infinite retry loops in clients like Claude Code. Emit the full arguments from the "done" event as a single input_json_delta so downstream clients receive the complete tool input.
Build and push Docker image to GitHub Container Registry on push to fix/* and feat/* branches. Fork-only, not intended for upstream.
Summary of ChangesHello @thebtf, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue in the streaming translation of Codex responses to Claude, specifically concerning function call arguments. Previously, certain models that transmit these arguments solely via Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly fixes a bug where function call arguments were being dropped for certain models in the streaming translator. The change adds a handler for response.function_call_arguments.done events, which is a necessary addition. My review includes one suggestion to improve code maintainability by addressing code duplication.
| } else if typeStr == "response.function_call_arguments.done" { | ||
| // Some models (e.g. gpt-5.3-codex-spark) send function call arguments | ||
| // in a single "done" event without preceding "delta" events. | ||
| // Emit the full arguments as a single input_json_delta so the | ||
| // downstream Claude client receives the complete tool input. | ||
| if args := rootResult.Get("arguments").String(); args != "" { | ||
| template = `{"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":""}}` | ||
| template, _ = sjson.Set(template, "index", (*param).(*ConvertCodexResponseToClaudeParams).BlockIndex) | ||
| template, _ = sjson.Set(template, "delta.partial_json", args) | ||
|
|
||
| output += "event: content_block_delta\n" | ||
| output += fmt.Sprintf("data: %s\n\n", template) | ||
| } | ||
| } |
There was a problem hiding this comment.
This new block for handling response.function_call_arguments.done is functionally correct, but it introduces significant code duplication with the existing handler for response.function_call_arguments.delta on lines 173-179.
To improve maintainability, consider refactoring this duplicated logic. You could combine both else if blocks or extract the common code into a helper function.
For example, you could combine them like this:
} else if typeStr == "response.function_call_arguments.delta" || typeStr == "response.function_call_arguments.done" {
var args string
if typeStr == "response.function_call_arguments.delta" {
args = rootResult.Get("delta").String()
} else {
args = rootResult.Get("arguments").String()
}
if args != "" {
template = `{"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":""}}`
template, _ = sjson.Set(template, "index", (*param).(*ConvertCodexResponseToClaudeParams).BlockIndex)
template, _ = sjson.Set(template, "delta.partial_json", args)
output += "event: content_block_delta\n"
output += fmt.Sprintf("data: %s\n\n", template)
}
}While this change would also affect lines outside the current diff, it's a good candidate for a follow-up refactoring to keep the code DRY.
There was a problem hiding this comment.
Thanks for the suggestion! However, the second commit in this PR (2de17c8) adds a HasReceivedArgumentsDelta flag that makes the done handler conditional — it only emits when no delta events preceded it. This means the two handlers have different semantics:
- delta: always emits + sets the flag
- done: only emits if flag is false (spark models that skip deltas)
Merging them into one block would obscure this important distinction. The current structure makes the intent clear: spark (done-only) vs non-spark (delta+done) paths.
…ede done Non-spark codex models (gpt-5.3-codex, gpt-5.2-codex) stream function call arguments via multiple delta events followed by a done event. The done handler unconditionally emitted the full arguments, duplicating what deltas already streamed. This produced invalid double JSON that Claude Code couldn't parse, causing tool calls to fail with missing parameters and infinite retry loops. Add HasReceivedArgumentsDelta flag to track whether delta events were received. The done handler now only emits arguments when no deltas preceded it (spark models), while delta-based streaming continues to work for non-spark models.
|
This project only accepts pull requests that relate to third-party provider support. Any pull requests unrelated to third-party provider support will be rejected. If you need to submit any non-third-party provider changes, please open them against the mainline repository. Would you mind recreate a new PR to mainline? |
|
Thank you for the clarification! Understood — CLIProxyAPIplus is for third-party provider changes only. I've moved this to the mainline repository:
Closing this PR. |
Summary
Fixes tool call argument streaming in the Codex→Claude response translator for all Codex model variants:
Spark models (e.g.
gpt-5.3-codex-spark): Send arguments only in a singleresponse.function_call_arguments.doneevent without precedingdeltaevents. The translator didn't handledone, so arguments were lost → emptyinput: {}→ infinite retry loops.Non-spark models (e.g.
gpt-5.3-codex,gpt-5.2-codex): Send arguments via multipleresponse.function_call_arguments.deltaevents followed by adoneevent. After the spark fix, thedonehandler unconditionally emitted the full arguments, duplicating what deltas already streamed → invalid double JSON → parse failure → infinite retry loops.Changes
response.function_call_arguments.doneasinput_json_delta(spark fix)HasReceivedArgumentsDeltaflag to skipdoneemission whendeltaevents already streamed the arguments (non-spark fix)response.output_item.addedfunction_call (supports multiple tool calls per response)Tested with
Test plan