
fix: handle streaming requests in chat_handler #5

Merged
bussyjd merged 1 commit into main from fix/streaming-sse-passthrough on Feb 13, 2026

Conversation

bussyjd (Collaborator) commented on Feb 13, 2026

Summary

  • Force stream=false unconditionally in process_chat() so upstream providers always return parseable JSON instead of SSE
  • Detect the client's original stream preference in chat_handler and convert the JSON response to SSE chunks (chat.completion.chunk with delta field) when the client requested streaming; the resulting wire format is sketched below
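
For reference, the converted response on the wire looks roughly like the following; the id, created, model, and content values are placeholders, and the trailing [DONE] event follows the usual OpenAI-compatible SSE convention:

```
data: {"id": "chatcmpl-abc123", "object": "chat.completion.chunk", "created": 1700000000, "model": "llama3", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello!"}, "finish_reason": "stop"}]}

data: [DONE]
```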

Closes #4

Context

chat_handler has never supported streaming — it collects the full response and returns web.json_response(). But process_chat() only defaulted stream=false when the field was absent, so clients sending stream=true triggered two failures:

  1. Providers like Ollama returned SSE, which response_json() couldn't parse → 500
  2. Even when the response succeeded, streaming clients (e.g. Vercel AI SDK) expected SSE chunks and silently discarded the plain JSON → "No reply from agent"

This broke OpenClaw's embedded agent, which uses @ai-sdk/openai-compatible and defaults to stream=true.

Test plan

  • stream=false request returns JSON as before
  • stream=true request returns proper SSE with chat.completion.chunk objects (see the verification sketch after this list)
  • OpenClaw agent gets actual LLM content through the full inference pipeline
  • Non-streaming direct API calls still work
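
As a quick sanity check, something like the script below exercises the streaming path end to end; the host, port, and model name are placeholders, and the /v1/chat/completions path comes from the linked issue:

```python
import asyncio

import aiohttp


async def check_streaming() -> None:
    body = {
        "model": "llama3",  # placeholder model name
        "stream": True,
        "messages": [{"role": "user", "content": "Say hello"}],
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:8080/v1/chat/completions", json=body
        ) as resp:
            # The fixed handler should answer with an SSE stream, not plain JSON.
            assert resp.headers["Content-Type"].startswith("text/event-stream")
            async for line in resp.content:
                # Expect chat.completion.chunk events followed by [DONE].
                print(line.decode().rstrip())


asyncio.run(check_streaming())
```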

chat_handler always collects the full provider response and returns
JSON, but when a client sends stream=true, two things break:

1. process_chat only sets stream=false when the field is absent.
   If the client includes stream=true, it passes through to the
   upstream provider (e.g. Ollama), which returns SSE instead of
   JSON.  response_json then fails to parse the SSE body, raising
   "Expecting value: line 1 column 1 (char 0)" and returning 500.

2. Even when the provider call succeeds, the response is returned
   as plain JSON.  Streaming clients (e.g. Vercel AI SDK's
   @ai-sdk/openai-compatible) expect Server-Sent Events with
   chat.completion.chunk objects containing a delta field, so they
   silently discard the unexpected JSON and report no response.

Fix both by:
- Unconditionally forcing stream=false in process_chat so providers
  always return a parseable JSON response.
- Detecting the client's original stream preference in chat_handler
  and, when true, converting the JSON response to a single SSE chunk
  in the chat.completion.chunk format before sending it back.
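
A rough sketch of the handler side of this change in aiohttp terms; the request parsing, field fallbacks, and SSE framing here are illustrative rather than the actual diff, and process_chat() is assumed to force stream=false internally as described above:

```python
import json
import time
import uuid

from aiohttp import web


async def chat_handler(request: web.Request) -> web.StreamResponse:
    payload = await request.json()
    # Remember the client's original preference; process_chat() forces
    # stream=false on the upstream call regardless.
    client_wants_stream = bool(payload.get("stream", False))

    result = await process_chat(payload)  # complete JSON response from the provider

    if not client_wants_stream:
        return web.json_response(result)

    # The client asked for streaming: wrap the full answer in a single
    # chat.completion.chunk event followed by the [DONE] terminator.
    content = result["choices"][0]["message"]["content"]
    chunk = {
        "id": result.get("id", f"chatcmpl-{uuid.uuid4().hex}"),
        "object": "chat.completion.chunk",
        "created": result.get("created", int(time.time())),
        "model": result.get("model", payload.get("model", "")),
        "choices": [{
            "index": 0,
            "delta": {"role": "assistant", "content": content},
            "finish_reason": "stop",
        }],
    }

    resp = web.StreamResponse(
        status=200,
        headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
    )
    await resp.prepare(request)
    await resp.write(f"data: {json.dumps(chunk)}\n\n".encode())
    await resp.write(b"data: [DONE]\n\n")
    await resp.write_eof()
    return resp
```
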
bussyjd merged commit e5d3d6a into main on Feb 13, 2026
4 checks passed


Development

Successfully merging this pull request may close these issues.

Streaming requests to /v1/chat/completions return 500 or are silently dropped
