Commit 12d40ac

docs(ai-chat): slim-wire HITL continuations + field-level merge contract
1 parent 61ca40b commit 12d40ac

6 files changed

Lines changed: 89 additions & 20 deletions

docs/ai-chat/changelog.mdx

Lines changed: 29 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -4,6 +4,35 @@ sidebarTitle: "Changelog"
44
description: "Pre-release updates for AI chat agents."
55
---
66

7+
<Update label="May 23, 2026" description="4.5.0-rc.2" tags={["SDK", "Webapp", "Bug fix"]}>
8+
9+
## HITL continuations — slim wire by default + field-level merge
10+
11+
`chat.addToolOutput(...)` and `chat.addToolApproveResponse(...)` continuations on reasoning-heavy agent loops used to fail in one of two ways: either the wire body crossed the `/in/append` cap (encrypted reasoning blobs plus tool input routinely exceed 512 KiB), or apps that slimmed the wire as a workaround landed a tool call with no `arguments` on the next LLM step (the per-turn merge replaced the hydrated message wholesale instead of overlaying only the new tool-state advance). Both failure modes are fixed.
12+
13+
The transport (`TriggerChatTransport.sendMessages`, `AgentChat.sendRaw`) now slims the assistant message itself on `submit-message` turns whose assistant carries resolved or approval-responded tool parts. The wire shape ships as `{ id, role: "assistant", parts: [<resolved tool part only>] }`, where the resolved part carries `state` plus `output` / `errorText` / `approval`, depending on the new state. Everything else (reasoning blobs, prior text, tool `input`, provider metadata) is reconstructed server-side from `hydrateMessages` or the durable snapshot. Continuation payloads typically drop from 600 KiB – 1 MiB to ~1 KiB.
14+
15+
The per-turn merge now overlays only the tool-part state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) from the wire copy onto the matching hydrated entry. Hydrated `input`, text, reasoning, and provider metadata stay put. The agent still accepts a fuller `UIMessage` on the wire (the merge only reads the resolved fields), so custom transports that ship more don't break — they just waste bytes.
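
As a rough illustration of the overlay described above, the merge can be sketched as follows. This is a minimal sketch, not the runtime's actual implementation: the `Part` and `Message` shapes are simplified stand-ins for the AI SDK's `UIMessage`, and `overlayToolAdvances` is a hypothetical name.

```typescript
// Simplified stand-ins for UIMessage and its parts (assumption: the real types are richer).
type Part = Record<string, unknown>;
type Message = { id: string; role: string; parts: Part[] };

// Tool-part states the changelog treats as state advances.
const ADVANCE_STATES = new Set([
  "output-available",
  "output-error",
  "approval-responded",
  "output-denied",
]);

// The only fields read from the wire copy.
const RESOLVED_FIELDS = ["state", "output", "errorText", "approval"];

function isAdvance(part: Part): boolean {
  const state = part["state"];
  return typeof state === "string" && ADVANCE_STATES.has(state);
}

// Hypothetical helper: overlay the wire's resolved tool parts onto the hydrated chain.
function overlayToolAdvances(hydrated: Message[], wire: Message): Message[] {
  return hydrated.map((msg) => {
    if (msg.id !== wire.id) return msg;
    const parts = msg.parts.map((part) => {
      const callId = part["toolCallId"];
      if (callId === undefined) return part; // text, reasoning, etc. stay untouched
      const incoming = wire.parts.find(
        (p) => p["toolCallId"] === callId && isAdvance(p),
      );
      if (!incoming) return part;
      const merged: Part = { ...part }; // keeps hydrated input and provider metadata
      for (const field of RESOLVED_FIELDS) {
        if (incoming[field] !== undefined) merged[field] = incoming[field];
      }
      return merged;
    });
    return { ...msg, parts };
  });
}
```

Because only the listed fields are copied, a wire message that carries extra parts or fields changes nothing beyond the tool-state advance, which matches the "fuller `UIMessage` still works" behavior described above.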
16+
17+
### `hydrateMessages` upsert-by-id
18+
19+
If your `hydrateMessages` hook persists the incoming message, **upsert by id** — don't unconditionally push. HITL continuations ship the existing assistant's id with a slim payload; a blind `stored.push(newMsg)` duplicates the row in the chain you return, the merge updates the first match, and the slim duplicate hits `toModelMessages` with no `input`. The examples in [lifecycle hooks](/ai-chat/lifecycle-hooks#hydratemessages), [Database persistence](/ai-chat/patterns/database-persistence#alternative-hydratemessages), and [Persistence and replay](/ai-chat/patterns/persistence-and-replay) have all been updated.
20+
21+
### `onValidateMessages` slim wire caveat
22+
23+
The slim wire is what arrives in `onValidateMessages` on HITL turns. `validateUIMessages` from `ai` rejects the slim shape (the AI SDK schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). See the updated example in [lifecycle hooks](/ai-chat/lifecycle-hooks#onvalidatemessages).
24+
25+
### `/in/append` 413 + precise cap
26+
27+
In parallel:
28+
29+
- The 413 response now carries CORS headers, so browser fetches can read the status instead of failing with an opaque `TypeError: Failed to fetch`. App-side retry-on-disconnect loops no longer spin forever on a permanently rejected payload.
30+
- The per-record cap is now computed precisely against S2's actual ceiling instead of the conservative 512 KiB floor. Legitimate ~600 – 900 KiB tool outputs (search results, file content) now succeed; pathological all-quote content that would double under JSON escaping is still rejected cleanly with a clear error.
31+
32+
See the updated [413 row in the client protocol](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions).
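
With the status now readable from the browser, a retry loop can tell a permanent rejection from a transient one. The sketch below is one way to classify `/in/append` responses. `classifyAppendStatus` and `AppendDecision` are hypothetical names, and treating every non-2xx status other than `500` as non-retryable is an assumption drawn from the client protocol's status table, not a documented rule.

```typescript
type AppendDecision = "ok" | "retry" | "give-up";

// Hypothetical helper: decide what an app-side retry loop should do with a response status.
function classifyAppendStatus(status: number): AppendDecision {
  if (status >= 200 && status < 300) return "ok";
  // Documented as a transient backend failure that is safe to retry
  // (appends are idempotent when the X-Part-Id header is set).
  if (status === 500) return "retry";
  // 401, 403, 409, and 413 will not succeed by resending the same payload.
  // Assumption: any other status is also treated as permanent.
  return "give-up";
}
```

A loop that checks this decision before scheduling another attempt stops on the first 413 instead of resending an oversized body.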
33+
34+
</Update>
35+
736
<Update label="May 21, 2026" description="4.5.0-rc.1" tags={["SDK", "Bug fix"]}>
837

938
## v4.5.0-rc.1 — two bug fixes

docs/ai-chat/client-protocol.mdx

Lines changed: 9 additions & 7 deletions
@@ -692,7 +692,7 @@ The body is a JSON-serialized [`ChatInputChunk`](#chatinputchunk) — a tagged u
692692
| `401` | Missing or invalid `Authorization` header. |
693693
| `403` | Token doesn't carry `write:sessions:{externalId}`. |
694694
| `409` | The session is closed — `{ "ok": false, "error": "Cannot append to a closed session" }`. |
695-
| `413` | Body exceeds 512 KiB. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record. |
695+
| `413` | Body exceeds 1 MiB **or** the wrapped record would exceed S2's ~1 MiB per-record metered ceiling. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record or pushing a single tool output that's itself oversized. Carries CORS headers so browser fetches can read the status. |
696696
| `500` | Transient backend failure on the durable stream. Safe to retry — appends are idempotent on `(externalId, X-Part-Id)` if you set the optional `X-Part-Id` request header (the built-in clients set it from a UUID). |
697697
698698
<Warning>
@@ -851,7 +851,7 @@ The agent trims trailing assistant messages from its accumulator and re-streams
851851
852852
### Tool approval responses
853853
854-
When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** (with `approval-responded` tool parts) back as a `kind: "message"` chunk — singular, not the full chain:
854+
When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** back as a `kind: "message"` chunk — singular, not the full chain. The minimum shape the agent reads is just the resolved tool parts:
855855
856856
```json
857857
{
@@ -861,12 +861,10 @@ When a tool requires approval (`needsApproval: true`), the agent streams the too
861861
"id": "asst-msg-1",
862862
"role": "assistant",
863863
"parts": [
864-
{ "type": "text", "text": "I'll send that email for you." },
865864
{
866865
"type": "tool-sendEmail",
867866
"toolCallId": "call-1",
868867
"state": "approval-responded",
869-
"input": { "to": "user@example.com", "subject": "Hello" },
870868
"approval": { "id": "approval-1", "approved": true }
871869
}
872870
]
@@ -878,7 +876,11 @@ When a tool requires approval (`needsApproval: true`), the agent streams the too
878876
}
879877
```
880878
881-
The agent matches the incoming message by `id` against the rebuilt accumulator. If a match is found, it **replaces** the existing message instead of appending.
879+
The agent matches the incoming message by `id` against the rebuilt accumulator (or hydrated chain) and **overlays the tool-state advance** onto the matching entry — `state` plus `output` / `errorText` / `approval`, depending on the new state. Hydrated `input`, text, reasoning, and provider metadata stay put. This is what makes the slim shape above sufficient: the agent rebuilds everything else from the snapshot or from your `hydrateMessages` hook.
880+
881+
The same shape applies to HITL `addToolOutput` answers — substitute `state: "output-available"` and `output: <result>` for the approval pair above. Single-tool HITL `addToolOutput` continuation payloads are typically ~1 KiB on the wire.
882+
883+
The built-in transports (`TriggerChatTransport`, `AgentChat`) ship the slim shape by default on `submit-message` continuations. Custom transports can ship a fuller `UIMessage` — the agent still only reads the resolved tool-part fields — but the slim shape is the most efficient and avoids brushing the per-record cap on reasoning-heavy turns.
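
For a custom transport, producing the slim shape could look like the sketch below. It is an illustration under assumptions, not the built-in transports' code: `Part`, `Message`, and `slimForContinuation` are hypothetical names, the tool name and output in the usage are invented, and the set of fields kept per part is inferred from the JSON example and the fields listed above.

```typescript
// Simplified stand-ins for UIMessage and its parts (assumption: the real types are richer).
type Part = Record<string, unknown>;
type Message = { id: string; role: string; parts: Part[] };

// States that mark a tool part as resolved or approval-responded.
const RESOLVED_STATES = new Set([
  "output-available",
  "output-error",
  "approval-responded",
  "output-denied",
]);

// Fields kept on the wire; `input`, text, reasoning, and metadata are dropped.
const WIRE_FIELDS = ["type", "toolCallId", "state", "output", "errorText", "approval"];

// Hypothetical helper: reduce a full assistant message to its resolved tool parts.
function slimForContinuation(full: Message): Message {
  const parts = full.parts
    .filter((part) => {
      const state = part["state"];
      return typeof state === "string" && RESOLVED_STATES.has(state);
    })
    .map((part) => {
      const slim: Part = {};
      for (const field of WIRE_FIELDS) {
        if (part[field] !== undefined) slim[field] = part[field];
      }
      return slim;
    });
  return { id: full.id, role: full.role, parts };
}

// Usage: an answered HITL tool call becomes a single small part.
const slimExample = slimForContinuation({
  id: "asst-msg-1",
  role: "assistant",
  parts: [
    { type: "text", text: "Which city should I search?" },
    {
      type: "tool-askUser",
      toolCallId: "call-2",
      state: "output-available",
      input: { question: "Which city should I search?" },
      output: { answer: "Lisbon" },
    },
  ],
});
console.log(JSON.stringify(slimExample));
```

The message `id` is passed through unchanged, since the agent matches the wire copy to the hydrated entry by id.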
882884
883885
<Note>
884886
The message `id` must match the one the agent assigned during streaming. `TriggerChatTransport` keeps IDs in sync automatically. Custom transports should use the `messageId` from the stream's `start` chunk.
@@ -938,7 +940,7 @@ To bridge that gap, the head-start route handler ships **full UIMessage history*
938940
939941
Two reasons this exception is safe:
940942
941-
1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The 512 KiB body cap on the realtime route doesn't apply.
943+
1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The per-record cap on the realtime route doesn't apply.
942944
2. **`headStartMessages` is only honored on `trigger: "handover-prepare"`**. The runtime ignores the field on every other trigger — the one-message-per-record rule still holds for normal turns.
943945
944946
After turn 1 completes, the snapshot is written and turn 2+ run as a normal single-message-per-record chat.
@@ -1067,7 +1069,7 @@ No. `seq_num` is monotonic across the entire session — turn 1 might emit seq 0
10671069
</Expandable>
10681070
10691071
<Expandable title="What's the maximum size of a single `.in/append` body?">
1070-
512 KiB. A typical `kind: "message"` is a few KB. If you're brushing the cap you're shipping more than one message per record, which the protocol forbids. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there.
1072+
The HTTP body is capped at 1 MiB as a DoS guard. The actual ceiling is at the storage layer: each `.in/append` becomes a single S2 record, metered as `8 + body_bytes_after_JSON_wrap`, capped at 1 MiB. So the practical limit on the raw HTTP body is roughly 1023 KiB for content with low JSON-escape overhead (ASCII, base64) and roughly 512 KiB for content that escapes heavily (all quotes / backslashes). A typical `kind: "message"` is a few KiB. If you're brushing the cap, you're either shipping a single tool output that's itself oversized (see [Large payloads](/ai-chat/patterns/large-payloads)) or shipping more than one message per record, which the protocol forbids. The 413 response carries CORS headers so browser fetches can read the status. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there.
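
To make the arithmetic concrete, here is a rough pre-flight estimate of the metered size. It is a sketch under a loud assumption: this page does not specify the exact wrapper, so the code models the wrap as embedding the raw body in a single JSON string. That reproduces the doubling for quote-heavy content but not the small fixed envelope overhead. `estimateMeteredBytes` and `fitsRecordCap` are hypothetical names.

```typescript
const RECORD_CAP_BYTES = 1024 * 1024; // 1 MiB per-record ceiling
const FRAMING_BYTES = 8; // the constant in `8 + body_bytes_after_JSON_wrap`

// Assumption: the wrap is modeled as JSON.stringify of the raw body string.
function estimateMeteredBytes(rawBody: string): number {
  const wrapped = JSON.stringify(rawBody);
  return FRAMING_BYTES + new TextEncoder().encode(wrapped).length;
}

// Hypothetical pre-flight check a client could run before appending.
function fitsRecordCap(rawBody: string): boolean {
  return estimateMeteredBytes(rawBody) <= RECORD_CAP_BYTES;
}
```

Under this model a 600 KiB ASCII body fits, while 600 KiB of quote characters roughly doubles after escaping and is rejected.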
10711073
</Expandable>
10721074
10731075
## See also

docs/ai-chat/lifecycle-hooks.mdx

Lines changed: 26 additions & 8 deletions
@@ -242,14 +242,22 @@ import { validateUIMessages } from "ai";
242242
export const myChat = chat.agent({
243243
id: "my-chat",
244244
onValidateMessages: async ({ messages }) => {
245-
return validateUIMessages({ messages, tools: chatTools });
245+
const userMessages = messages.filter((m) => m.role === "user");
246+
if (userMessages.length > 0) {
247+
await validateUIMessages({ messages: userMessages, tools: chatTools });
248+
}
249+
return messages;
246250
},
247251
run: async ({ messages, signal }) => {
248252
return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools: chatTools, abortSignal: signal });
249253
},
250254
});
251255
```
252256

257+
<Warning>
258+
On HITL continuations (`addToolOutput` / `addToolApproveResponse`) the assistant entry in `messages` is **slim**: `state` + `output` / `errorText` / `approval` only, no `input` or other parts. `validateUIMessages` against the AI SDK schema rejects that shape (the schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). The example above does the filter.
259+
</Warning>
260+
253261
<Note>
254262
`onValidateMessages` fires **before** `onTurnStart` and message accumulation. If you need to validate messages loaded from a database, do the loading in `onChatStart` or `onPreload` and let `onValidateMessages` validate the full incoming set each turn.
255263
</Note>
@@ -278,14 +286,24 @@ export const myChat = chat.agent({
278286
const record = await db.chat.findUnique({ where: { id: chatId } });
279287
const stored = record?.messages ?? [];
280288

281-
// Append the new user message and persist
289+
// Upsert the incoming message by id. On HITL continuations
290+
// (`addToolOutput` / `addToolApproveResponse`) the incoming wire
291+
// shares the id of an existing assistant in `stored` — `push`ing
292+
// unconditionally would duplicate the row. The runtime merges the
293+
// resolution onto the existing entry; new ids (typically a fresh
294+
// user message) get appended.
282295
if (trigger === "submit-message" && incomingMessages.length > 0) {
283296
const newMsg = incomingMessages[incomingMessages.length - 1]!;
284-
stored.push(newMsg);
285-
await db.chat.update({
286-
where: { id: chatId },
287-
data: { messages: stored },
288-
});
297+
const existingIdx = newMsg.id
298+
? stored.findIndex((m) => m.id === newMsg.id)
299+
: -1;
300+
if (existingIdx === -1) {
301+
stored.push(newMsg);
302+
await db.chat.update({
303+
where: { id: chatId },
304+
data: { messages: stored },
305+
});
306+
}
289307
}
290308

291309
return stored;
@@ -298,7 +316,7 @@ export const myChat = chat.agent({
298316

299317
**Lifecycle position:** `onValidateMessages` → **`hydrateMessages`** → `onChatStart` (chat's first message only) → `onTurnStart` → `run()`
300318

301-
After the hook returns, any incoming wire message whose ID matches a hydrated message is auto-merged. This makes [tool approvals](/ai-chat/frontend#tool-approvals) work transparently with hydration.
319+
After the hook returns, the runtime overlays the wire's tool-state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) onto matching hydrated entries by id. Everything else on the hydrated entry (text, reasoning, tool `input`, `providerMetadata`) stays put. This makes [tool approvals](/ai-chat/frontend#tool-approvals) and HITL `addToolOutput` continuations work transparently: ship a slim resolution on the wire, and the agent merges the new state onto your DB-backed copy.
302320

303321
<Note>
304322
`hydrateMessages` also fires for [action](/ai-chat/actions) turns (`trigger: "action"`) with empty `incomingMessages`. This lets the action handler work with the latest DB state.

docs/ai-chat/patterns/database-persistence.mdx

Lines changed: 13 additions & 2 deletions
@@ -184,9 +184,20 @@ export const myChat = chat.agent({
184184
const record = await db.chat.findUnique({ where: { id: chatId } });
185185
const stored = record?.messages ?? [];
186186

187+
// Upsert by id. HITL continuations (addToolOutput /
188+
// addToolApproveResponse) ship the existing assistant's id with a
189+
// slim payload — push-without-check duplicates the row, the
190+
// runtime merges only the first match, and the duplicate slim copy
191+
// hits `toModelMessages` with no `input`.
187192
if (trigger === "submit-message" && incomingMessages.length > 0) {
188-
stored.push(incomingMessages[incomingMessages.length - 1]!);
189-
await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
193+
const newMsg = incomingMessages[incomingMessages.length - 1]!;
194+
const existingIdx = newMsg.id
195+
? stored.findIndex((m) => m.id === newMsg.id)
196+
: -1;
197+
if (existingIdx === -1) {
198+
stored.push(newMsg);
199+
await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
200+
}
190201
}
191202

192203
return stored;

docs/ai-chat/patterns/persistence-and-replay.mdx

Lines changed: 11 additions & 2 deletions
@@ -139,9 +139,18 @@ export const myChat = chat.agent({
139139
hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
140140
const stored = (await db.chat.findUnique({ where: { id: chatId } }))?.messages ?? [];
141141

142+
// Upsert by id — HITL continuations ship the existing assistant's
143+
// id with a slim payload; the runtime overlays the new state.
144+
// See lifecycle-hooks for the full pattern + rationale.
142145
if (trigger === "submit-message" && incomingMessages.length > 0) {
143-
stored.push(incomingMessages[0]!);
144-
await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
146+
const newMsg = incomingMessages[0]!;
147+
const existingIdx = newMsg.id
148+
? stored.findIndex((m) => m.id === newMsg.id)
149+
: -1;
150+
if (existingIdx === -1) {
151+
stored.push(newMsg);
152+
await db.chat.update({ where: { id: chatId }, data: { messages: stored } });
153+
}
145154
}
146155

147156
return stored;

docs/ai-chat/patterns/trusted-edge-signals.mdx

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ The body is a JSON-serialized `ChatInputChunk`. The proxy parses it, checks `kin
115115
}
116116
```
117117

118-
Both bodies stay well under the [512 KiB cap on `/in/append`](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions) — a typical trust object is ~200 bytes.
118+
Both bodies stay well under the [per-record cap on `/in/append`](/ai-chat/client-protocol#step-3-send-messages-stops-and-actions) — a typical trust object is ~200 bytes.
119119

120120
Other paths — `.out` SSE, `/api/v1/auth/jwt/claims`, anything else — pass through the proxy untouched. The SSE stream in particular must not be buffered; preserve the response body as-is.
121121
