fix(responses): stream reasoning as a live reasoning item (#9658) by localai-bot · Pull Request #10284 · mudler/LocalAI

localai-bot · 2026-06-12T21:48:12Z

Problem

On the /v1/responses (Responses API) streaming path, a reasoning model's <think> monologue was streamed to the client as ordinary message text (output_text.delta on a msg_ item) and only reclassified into a reasoning output item after the stream completed. Subsequent deltas also kept referencing the old msg_ id.

Root causes

The reasoning gate used extractor.Reasoning(), which only reflects the Go-side ProcessToken parser and never the autoparser's ProcessChatDeltaReasoning accumulator - so autoparser-driven reasoning was dropped live and rebuilt only at end-of-stream.
The non-tool path eagerly emitted the msg_ item before any token, forcing reasoning to a later index and mis-attributing deltas to msg_.
Neither streaming path carried the sticky preferAutoparser flag, so a content-only autoparser could leak <think> into content (regression: "Reasoning/thinking output provided as regular output", #9985).
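
The first root cause can be illustrated with a small sketch. It is not the LocalAI code: the extractor type, its fields, and the simplified method signatures below are assumptions made for illustration; only the method names are borrowed from the description above. It shows why a gate on the accumulated Reasoning() value stays silent when reasoning arrives through the autoparser path, while a gate on the per-token delta fires.

```go
package main

import "fmt"

// extractor is a hypothetical stand-in for the reasoning extractor.
// It keeps two separate accumulators, mirroring the split described above.
type extractor struct {
	tagReasoning  string // assumed to be filled only by the Go-side ProcessToken parser
	autoReasoning string // assumed to be filled only by ProcessChatDeltaReasoning
}

// ProcessChatDeltaReasoning records an autoparser reasoning chunk and returns
// the newly added text (the delta). The signature is simplified for the sketch.
func (e *extractor) ProcessChatDeltaReasoning(chunk string) string {
	e.autoReasoning += chunk
	return chunk
}

// Reasoning reflects the tag-parser accumulator only.
func (e *extractor) Reasoning() string { return e.tagReasoning }

// oldGate models the pre-fix check on the accumulated tag-parser reasoning.
func oldGate(e *extractor) bool { return e.Reasoning() != "" }

// newGate models the post-fix check on the delta produced for this token.
func newGate(reasoningDelta string) bool { return reasoningDelta != "" }

func main() {
	e := &extractor{}
	delta := e.ProcessChatDeltaReasoning("Let me think")
	fmt.Println("old gate fires:", oldGate(e))
	fmt.Println("new gate fires:", newGate(delta))
}
```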

Fix

Extracted a pure streamReasoningRouter helper (mirroring chat_stream_workers.go) that gates on reasoningDelta != "", opens the message item lazily, and keeps a sticky autoparser preference. Both streaming callbacks now route reasoning deltas to the reasoning_ id, and the completed-response assembly orders reasoning -> message -> tool_calls.
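
A minimal sketch of such a router follows. It is not the streamReasoningRouter from this PR: the router, split, and event types, the event names, and the rule for when the autoparser becomes preferred are all assumptions chosen to illustrate the three behaviors named above (gating on the reasoning delta, lazy message opening, sticky preference).

```go
package main

import "fmt"

// split is one parser's view of a token: its reasoning part and content part.
type split struct{ reasoning, content string }

// event is a hypothetical streaming event; kinds are illustrative labels,
// not the actual Responses API event names.
type event struct{ kind, text string }

// router holds the per-stream state. All names here are hypothetical.
type router struct {
	preferAutoparser bool // sticky: set once, never cleared
	reasoningOpen    bool
	messageOpen      bool
}

// route classifies one token given the autoparser's split and the Go-side
// tag parser's split, and returns the events to emit.
func (r *router) route(auto, tag split) []event {
	// Assumption: the autoparser is trusted once it has produced reasoning,
	// and stays trusted for the rest of the stream.
	if auto.reasoning != "" {
		r.preferAutoparser = true
	}
	chosen := tag
	if r.preferAutoparser {
		chosen = auto
	}

	var out []event
	if chosen.reasoning != "" { // gate on the delta, not on an accumulator
		if !r.reasoningOpen {
			r.reasoningOpen = true
			out = append(out, event{"open_reasoning", ""})
		}
		out = append(out, event{"reasoning_delta", chosen.reasoning})
	}
	if chosen.content != "" { // the message item is opened lazily
		if !r.messageOpen {
			r.messageOpen = true
			out = append(out, event{"open_message", ""})
		}
		out = append(out, event{"text_delta", chosen.content})
	}
	return out
}

func main() {
	r := &router{}
	steps := []struct{ auto, tag split }{
		{split{reasoning: "thinking"}, split{}},
		{split{content: "answer"}, split{content: "answer"}},
	}
	for _, s := range steps {
		for _, ev := range r.route(s.auto, s.tag) {
			fmt.Printf("%s %q\n", ev.kind, ev.text)
		}
	}
}
```

In this sketch a content-only autoparser never sets preferAutoparser, so the tag parser's split is used and <think> text is still routed to the reasoning item rather than to content.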

Behavior note

A pure-reasoning turn with no content no longer emits an empty message item.
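
The ordering and the empty-message behavior can be sketched as below. The outputItem type, the assembleOutput helper, and the placeholder ids are hypothetical and only illustrate the reasoning -> message -> tool_calls order described in this PR, with the message item skipped when there is no content.

```go
package main

import "fmt"

// outputItem is a hypothetical, simplified output item.
type outputItem struct {
	Type string // "reasoning", "message", or "function_call" in this sketch
	ID   string
	Text string
}

// assembleOutput orders items as reasoning, then message, then tool calls.
// Empty reasoning or empty content produces no item at all.
func assembleOutput(reasoning, content string, toolCalls []outputItem) []outputItem {
	var items []outputItem
	if reasoning != "" {
		items = append(items, outputItem{Type: "reasoning", ID: "reasoning_1", Text: reasoning})
	}
	if content != "" {
		items = append(items, outputItem{Type: "message", ID: "msg_1", Text: content})
	}
	return append(items, toolCalls...)
}

func main() {
	// A pure-reasoning turn: only the reasoning item is produced.
	for _, it := range assembleOutput("thinking only", "", nil) {
		fmt.Println(it.Type, it.ID)
	}
}
```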

Test plan

New Ginkgo specs for streamReasoningRouter (red -> green).
go test ./core/http/endpoints/openresponses/... ./core/http/endpoints/openai/... green.
Scoped golangci-lint --new-from-merge-base=origin/master clean.

Assisted-by: claude:claude-opus-4-8 [Claude Code]

…9658)

In the /v1/responses streaming handler a reasoning model's thinking monologue was streamed to the client as normal message text (a msg_ output item with output_text.delta) and only reclassified into a reasoning item after the stream completed. Subsequent output_text.delta events also kept referencing the old msg_ item id instead of the reasoning_ id.

Root causes:

1. The live reasoning item was gated on extractor.Reasoning(), which is only updated by the Go-side raw-tag parser (ProcessToken). When the C++ autoparser drives reasoning through reasoning_content ChatDeltas, the reasoning delta is computed via ProcessChatDeltaReasoning into a separate accumulator, so extractor.Reasoning() stays empty and the gate never fired. The reasoning item was thus only reconstructed at end-of-stream.
2. The non-tool-call path created the message/msg_ output item eagerly before any token, forcing reasoning to a higher output index and making mis-split <think> text land on the pre-existing message item.
3. Neither path carried the sticky preferAutoparser flag, so a content-only autoparser (the non-jinja pure-content fallback, #9985) could leak <think>...</think> tokens into content.

Extract the per-token reasoning-vs-message classification into a pure, unit-tested streamReasoningRouter (mirroring chooseDeferredReasoning and processStream in the chat streaming worker): it gates the reasoning item on the reasoning delta, opens the message item lazily on the first content delta, and keeps a sticky preferAutoparser fallback. Both streaming paths now route reasoning deltas to the reasoning_ id and order the reasoning item ahead of the message at completion.

Assisted-by: claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

localai-bot wants to merge 1 commit into master from fix/9658-responses-streaming-reasoning
