
fix: handle streaming requests in chat_handler #5

Merged
bussyjd merged 1 commit into main from fix/streaming-sse-passthrough on Feb 13, 2026

Conversation

bussyjd (Collaborator) commented on Feb 13, 2026

Summary

  • Force stream=false unconditionally in process_chat() so upstream providers always return parseable JSON instead of SSE
  • Detect the client's original stream preference in chat_handler and convert the JSON response to SSE chunks (chat.completion.chunk with delta field) when the client requested streaming; the resulting wire format is sketched below
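
For reference, the converted response on the wire looks roughly like the following; the id, created, model, and content values are placeholders, and the trailing [DONE] event follows the usual OpenAI-compatible SSE convention:

```
data: {"id": "chatcmpl-abc123", "object": "chat.completion.chunk", "created": 1700000000, "model": "llama3", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello!"}, "finish_reason": "stop"}]}

data: [DONE]
```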

Closes #4

Context

chat_handler has never supported streaming — it collects the full response and returns web.json_response(). But process_chat() only defaulted stream=false when the field was absent, so clients sending stream=true triggered two failures:

  1. Providers like Ollama returned SSE, which response_json() couldn't parse → 500
  2. Even when the response succeeded, streaming clients (e.g. Vercel AI SDK) expected SSE chunks and silently discarded the plain JSON → "No reply from agent"

This broke OpenClaw's embedded agent, which uses @ai-sdk/openai-compatible and defaults to stream=true.

Test plan

  • stream=false request returns JSON as before
  • stream=true request returns proper SSE with chat.completion.chunk objects (see the verification sketch after this list)
  • OpenClaw agent gets actual LLM content through the full inference pipeline
  • Non-streaming direct API calls still work
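
As a quick sanity check, something like the script below exercises the streaming path end to end; the host, port, and model name are placeholders, and the /v1/chat/completions path comes from the linked issue:

```python
import asyncio

import aiohttp


async def check_streaming() -> None:
    body = {
        "model": "llama3",  # placeholder model name
        "stream": True,
        "messages": [{"role": "user", "content": "Say hello"}],
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:8080/v1/chat/completions", json=body
        ) as resp:
            # The fixed handler should answer with an SSE stream, not plain JSON.
            assert resp.headers["Content-Type"].startswith("text/event-stream")
            async for line in resp.content:
                # Expect chat.completion.chunk events followed by [DONE].
                print(line.decode().rstrip())


asyncio.run(check_streaming())
```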

chat_handler always collects the full provider response and returns
JSON, but when a client sends stream=true, two things break:

1. process_chat only sets stream=false when the field is absent.
   If the client includes stream=true, it passes through to the
   upstream provider (e.g. Ollama), which returns SSE instead of
   JSON.  response_json then fails to parse the SSE body, raising
   "Expecting value: line 1 column 1 (char 0)" and returning 500.

2. Even when the provider call succeeds, the response is returned
   as plain JSON.  Streaming clients (e.g. Vercel AI SDK's
   @ai-sdk/openai-compatible) expect Server-Sent Events with
   chat.completion.chunk objects containing a delta field, so they
   silently discard the unexpected JSON and report no response.

Fix both by:
- Unconditionally forcing stream=false in process_chat so providers
  always return a parseable JSON response.
- Detecting the client's original stream preference in chat_handler
  and, when true, converting the JSON response to a single SSE chunk
  in the chat.completion.chunk format before sending it back.
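
A rough sketch of the handler side of this change in aiohttp terms; the request parsing, field fallbacks, and SSE framing here are illustrative rather than the actual diff, and process_chat() is assumed to force stream=false internally as described above:

```python
import json
import time
import uuid

from aiohttp import web


async def chat_handler(request: web.Request) -> web.StreamResponse:
    payload = await request.json()
    # Remember the client's original preference; process_chat() forces
    # stream=false on the upstream call regardless.
    client_wants_stream = bool(payload.get("stream", False))

    result = await process_chat(payload)  # complete JSON response from the provider

    if not client_wants_stream:
        return web.json_response(result)

    # The client asked for streaming: wrap the full answer in a single
    # chat.completion.chunk event followed by the [DONE] terminator.
    content = result["choices"][0]["message"]["content"]
    chunk = {
        "id": result.get("id", f"chatcmpl-{uuid.uuid4().hex}"),
        "object": "chat.completion.chunk",
        "created": result.get("created", int(time.time())),
        "model": result.get("model", payload.get("model", "")),
        "choices": [{
            "index": 0,
            "delta": {"role": "assistant", "content": content},
            "finish_reason": "stop",
        }],
    }

    resp = web.StreamResponse(
        status=200,
        headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
    )
    await resp.prepare(request)
    await resp.write(f"data: {json.dumps(chunk)}\n\n".encode())
    await resp.write(b"data: [DONE]\n\n")
    await resp.write_eof()
    return resp
```
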
bussyjd merged commit e5d3d6a into main on Feb 13, 2026
4 checks passed


Development

Successfully merging this pull request may close these issues.

Streaming requests to /v1/chat/completions return 500 or are silently dropped
