Stop recommending run streams

ericallam · ericallam · commit dadfa857c5ac · 2026-05-19T16:10:53.000+01:00
diff --git a/docs/ai-chat/error-handling.mdx b/docs/ai-chat/error-handling.mdx
@@ -399,7 +399,7 @@ A specific run-failing error worth flagging on its own. Anything written through
 
 The error carries `chunkType`, `chunkSize`, and `maxSize`. Catch with the `isChatChunkTooLargeError` guard and route oversized values out-of-band.
 
-See [Large payloads in chat.agent](/ai-chat/patterns/large-payloads) for the two patterns that work around the cap (ID-reference + run-scoped `streams.writer()`).
+See [Large payloads in chat.agent](/ai-chat/patterns/large-payloads) for the ID-reference pattern that works around the cap, plus guidance on transient data parts and out-of-band logging.
 
 ## See also
 
diff --git a/docs/ai-chat/overview.mdx b/docs/ai-chat/overview.mdx
@@ -40,10 +40,10 @@ sequenceDiagram
     Task->>Task: onTurnStart({ chatId, messages })
     Task->>LLM: streamText({ model, messages, abortSignal })
     LLM-->>Task: Stream response chunks
-    Task->>API: streams.pipe("chat", uiStream)
+    Task->>API: Write chunks to session.out
     API-->>useChat: SSE: UIMessageChunks
     useChat-->>User: Render streaming text
-    Task->>API: Write trigger:turn-complete
+    Task->>API: Write turn-complete control record
     API-->>useChat: SSE: turn complete + refreshed token
     useChat->>useChat: Close stream, update session
     Task->>Task: onTurnComplete({ messages, stopped: false })
@@ -73,10 +73,10 @@ sequenceDiagram
     Task->>Task: onTurnStart({ turn: 1 })
     Task->>LLM: streamText({ messages: [all accumulated] })
     LLM-->>Task: Stream response
-    Task->>API: streams.pipe("chat", uiStream)
+    Task->>API: Write chunks to session.out
     API-->>useChat: SSE: UIMessageChunks
     useChat-->>User: Render streaming text
-    Task->>API: Write trigger:turn-complete
+    Task->>API: Write turn-complete control record
     Task->>Task: onTurnComplete({ turn: 1 })
     Task->>Task: Wait for next message (idle → suspend)
 ```
diff --git a/docs/ai-chat/patterns/large-payloads.mdx b/docs/ai-chat/patterns/large-payloads.mdx
@@ -1,7 +1,7 @@
 ---
 title: "Large payloads in chat.agent"
 sidebarTitle: "Large payloads"
-description: "Why a single chunk on the chat stream is capped at ~1 MiB, what error you'll see, and the two patterns that work around it: ID references and out-of-band run streams."
+description: "Why a single chunk on the chat stream is capped at ~1 MiB, what error you'll see, and how to work around it with ID references."
 ---
 
 The realtime stream that backs `chat.agent` enforces a **per-record cap of ~1 MiB** (`1048576` bytes minus a small envelope reserve). Anything written through the chat output — auto-piped LLM chunks, `chat.response.write`, custom `writer.write` parts — counts as one record per chunk and is rejected if it crosses the cap.
@@ -61,7 +61,7 @@ const fetchPage = tool({
 
 If the size is unbounded by input, fix the tool — not the stream.
 
-## Pattern 1: ID-reference (recommended)
+## ID-reference pattern
 
 Store the large value in your own database (or object store) and emit only an identifier through the chat stream. The frontend fetches the full payload separately on demand.
 
@@ -134,42 +134,19 @@ chat.response.write({ type: "data-report", data: { id, summary: shortSummary } }
   Persist the large value **before** you emit the id chunk. If the chunk reaches the UI before the row is written, the frontend gets a 404 on the follow-up fetch.
 </Tip>
 
-## Pattern 2: Out-of-band `streams.writer()`
+## Transient UI parts
 
-If the value is **only useful for the lifetime of the run** (a long log tail, a transient progress dump, a per-turn debug trace) and you don't want to persist it, write it to a **separate run-scoped stream** instead. Run-scoped `streams.writer()` is its own channel — chunks go through the same per-record cap, but the chat stream stays untouched, and `useRealtimeRunWithStreams` consumes them independently of the chat UI.
+For progress indicators or status data that should stream to the UI but not persist into the response message, use `chat.response.write` with `transient: true`. The chunk still travels on the chat stream (so the ~1 MiB per-record cap still applies), but it never lands in `responseMessage` or `uiMessages`:
 
 ```ts
-import { task, streams } from "@trigger.dev/sdk";
-import { chat } from "@trigger.dev/sdk/ai";
-
-const debugLog = streams.define<{ line: string }>("debug-log");
-
-export const myChat = chat.agent({
-  id: "my-chat",
-  run: async ({ messages, signal }) => {
-    // Heavy diagnostic stream lives on its own channel.
-    const log = debugLog.writer();
-    log.write({ line: "starting turn" });
-
-    return streamText({ /* ... */ });
-  },
+chat.response.write({
+  type: "data-progress",
+  data: { percent: 50 },
+  transient: true,
 });
 ```
 
-Frontend:
-
-```tsx
-import { useRealtimeRunWithStreams } from "@trigger.dev/react-hooks";
-
-function DebugPanel({ runId }: { runId: string }) {
-  const { streams } = useRealtimeRunWithStreams<typeof myChat>(runId);
-  return (
-    <pre>{streams?.["debug-log"]?.map((c) => c.line).join("\n")}</pre>
-  );
-}
-```
-
-Same 1 MiB cap applies per record, so split long content across multiple writes (one record per line, per page, per progress tick) rather than one large blob.
+For genuinely high-volume diagnostic data (per-token traces, large debug dumps), don't try to ship it through the realtime stream at all. Log to your own store (DB, object storage, OTel logger) and surface it through a separate UI route that isn't tied to the chat session.
 
 ## What does **not** trigger the cap