Feat/ai sdk v6 migration#539
Open
saoc90 wants to merge 508 commits into
- Updated document checking logic to account for both chat-attached and persona-linked documents.
- Improved document hint generation to provide clearer context based on the type of documents available.
- Adjusted the conditions for adding the search_documents tool to include any available documents, enhancing functionality.
refactor: Enhance document handling in chat API responses
…cument access
- Added AllowedPersonaDocumentIds function to compute accessible persona document IDs for the current user.
- Updated searchDocuments function to include documentIds in the context, allowing for more refined document searches based on user permissions.
…for GPT-5 Co-authored-by: saoc90 <7711719+saoc90@users.noreply.github.com>
Enhance system prompt with explicit markdown formatting instructions for GPT-5
…rsation state management
… it in mapOpenAIChatMessages function
… provide user-friendly messages on abort and incomplete responses
…rams, embedding control headers, environment-based model selection, migration instructions, OpenAI responses API streaming, OpenAI SDK migration, and reasoning summaries implementation.
…e reasoning effort handling
…dentials chore: Add support for AzureDefaultCredentials
…in OpenAIResponsesStream
…protobufjs-7.5.8 chore(deps): bump protobufjs from 7.5.4 to 7.5.9 in /src
…multi-597d7db90c chore(deps): bump esbuild, vite, @vitest/coverage-v8 and vitest in /src
…ze-feedback Fix silent drop of large Code Interpreter docs (#113)
Adds ai@6, @ai-sdk/azure@3, @ai-sdk/react@3, zustand@5 (per-tab store factory) and tightens tooling configs (vitest, playwright, next) for the streamText route + UI message stream pipeline that follows.
Replaces the legacy 7-layer custom chat orchestrator (ChatAPIEntry → ChatAPIResponse → ConversationManager → OpenAIResponsesStream) with a single streamText call exposing `toUIMessageStreamResponse()`.
Highlights:
- New /api/chat route built around streamText + stepCountIs(8) + experimental_transform.
- thread-context.ts hydrates the per-request context (thread, user, history, extensions, attached files, turnId); inlines a blob:// -> data: resolver for legacy multimodal images so the AI SDK's downloadAssets pass doesn't reject the unfetchable scheme.
- persist-assistant.ts persists from streamText.onFinish (not the response stream's close event), retries on transient Cosmos failures, writes a friendly sentinel row on hard failures, and rewrites sandbox:// URLs in assistant text via rewrite-sandbox-urls.
- message-adapter.ts is the only place Cosmos rows <-> UIMessages cross (assistant + tool rows fold/split symmetrically).
- Tools registry registers RAG, sub-agent search/call, and dynamic Azure-function extensions as AI SDK custom tools; built-in Azure tools (code_interpreter, image_generation, web_search_preview) flow through providerOptions via the provider seam.
- Provider seam (models/provider.ts + provider-seam.ts) abstracts Azure-specific config so Anthropic / future providers can slot in without touching the route; preserves AAD-token fetch wrapper + image-generation deployment header.
- Per-thread turn registry + TTL plumbs resumable streams across page reloads inside the 10-minute window.
- Rate-limit subject abstraction so the cost-bomb guard can move to per-org / per-tenant later without touching the route.
- E2E fake provider returns MockLanguageModelV2 for Playwright runs; test-backend gate centralised in features/common/services/e2e-fakes-gate.ts.
Replaces the Valtio module singleton (one global store shared by every
open thread) with a per-React-tree Zustand store factory wrapped in a
ChatStoreProvider. Each chat-page mount creates its own store keyed by
threadId, so submitting in tab B no longer cancels the stream running
in tab A.
- chat-store-factory.ts: createChatStore({threadId, initialMessages,
userName, chatThread}) → returns a fresh Zustand store with selectedModel,
reasoningEffort, default-tool toggles, attachedFiles, tool-call history,
usage data and a debounced thread-persistence side effect.
- chat-store-context.tsx: React provider keyed on threadId so opening a
different thread in the same tab also gets a fresh store.
- active-chat-store.ts: bridges the live AI SDK useChat() handle with the
Zustand store so streaming state and persisted preferences coexist.
- chat-page.tsx + chat-header/* + chat-input/* + ai-elements/tool +
ui/chat/* are reworked to consume `useChatStore(selector)` instead of
the old module-singleton imports.
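The isolation this commit describes can be illustrated without Zustand. The following is a minimal, dependency-free sketch, not the PR's implementation: the hand-rolled store, the `streaming` field, and the `"default-model"` fallback are assumptions made for illustration. It only shows why a factory called once per mount avoids the cross-tab interference a module singleton causes.

```typescript
// Dependency-free sketch of a per-thread store factory.
// The PR uses Zustand; this hand-rolled store only demonstrates isolation.
type ChatState = { threadId: string; selectedModel: string; streaming: boolean };
type Listener = (state: ChatState) => void;

interface ChatStore {
  getState(): ChatState;
  setState(patch: Partial<ChatState>): void;
  subscribe(listener: Listener): () => void;
}

function createChatStore(init: { threadId: string; selectedModel?: string }): ChatStore {
  // State lives in this closure, so every call owns an independent copy.
  let state: ChatState = {
    threadId: init.threadId,
    selectedModel: init.selectedModel ?? "default-model", // hypothetical default
    streaming: false,
  };
  const listeners = new Set<Listener>();
  return {
    getState: () => state,
    setState(patch) {
      state = { ...state, ...patch };
      listeners.forEach((listener) => listener(state));
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Two mounts (for example two tabs) get two stores; neither sees the other's writes.
const tabA = createChatStore({ threadId: "thread-a" });
const tabB = createChatStore({ threadId: "thread-b" });
tabA.setState({ streaming: true });
tabB.setState({ streaming: false });
```

With a module singleton, the second `setState` would have overwritten the first tab's `streaming` flag; here each store keeps its own value.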
Azure's provider-executed image_generation tool emits the generated PNG as ~2 MB of raw base64 inside the tool-result chunk's output.result. Persisting that inline blows past Cosmos's 2 MB request-size limit, and the SSE wire ships the same blob across to every connected client. The model itself never sees the resulting URL (Azure resolves the tool inside one response generation), so the legacy code injected the markdown image into the assistant text from the app side; this commit restores that pattern, adapted for the AI SDK pipeline.
- image-generation-stream-rewriter.ts: intercepts the tool-result chunk, hands the base64 to persistBase64Image (the image service), swaps output.result for the returned blob:// reference, then emits text-start / text-delta / text-end chunks carrying the image markdown. Filters out partial-image previews and dedupes Azure's identical second emission per toolCallId.
- chat-file-store-ingest.ts: onFinish-time fallback. If a base64 payload still reaches event.toolResults (rewriter bypassed or failed), persistBase64Image runs there instead. Existing blob:// refs pass through unchanged.
- sandbox-rewrite-core.ts + sandbox-url-transform.ts + rewrite-sandbox-urls.ts: shared substring rewriter that swaps `sandbox:/mnt/data/<file>` URLs in assistant text for stored URLs, used both during streaming (text-delta transform) and at persist time (UIMessage tree walk). Handles container_file_citation resolution from code_interpreter outputs.
- chat-image-service.ts: GetImageUrlPath helper returns a same-origin relative `/api/images?…` URL (vs the legacy absolute form, which Streamdown's link sanitiser blocks).
- chat-image-persistence-utils.ts: resolveBlobReferenceToPath gives callers (RichResponse, tool-part-view) a single helper to translate a stored `blob://` reference into the renderable URL at the last mile.
- rich-response.tsx: pre-render scan over markdown text replaces `blob://` references with the resolved URL so Streamdown renders the image inline.
- tool-part-view.tsx: typed renderer for AI SDK tool parts; special-cases image_generation to render the URL as an inline <img> (also resolves blob:// client-side).
- /api/code-interpreter/file/[fileId] honours containerId + filename query params so the route can call DownloadContainerFile for files that live in a code_interpreter container.
- UpsertChatMessage runs processMessageForImagePersistence on tool rows too (legacy only handled assistant content).
Storage contract: `blob://threadId/filename` flows through everywhere internally (stream, onFinish, Cosmos rows). The only translation to a fetchable `/api/images?…` URL happens at the UI render boundary in RichResponse / tool-part-view via the image-service helper. No URL construction outside the image service.
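The storage contract above implies a last-mile scan that finds `blob://threadId/filename` references in markdown and hands each one to a resolver. Below is a rough sketch of that scan under stated assumptions: `parseBlobRef` and `replaceBlobRefs` are hypothetical names, and the resolver is injected by the caller because the real `/api/images?…` query shape is not shown in this PR text.

```typescript
// Hypothetical helpers illustrating the blob:// -> renderable URL boundary.
interface BlobRef {
  threadId: string;
  filename: string;
}

// Parse a single `blob://threadId/filename` reference; returns null if malformed.
function parseBlobRef(ref: string): BlobRef | null {
  const match = /^blob:\/\/([^/\s]+)\/([^\s)]+)$/.exec(ref);
  return match ? { threadId: match[1], filename: match[2] } : null;
}

// Replace every blob:// reference in markdown text using the injected resolver.
// The resolver stands in for the image-service helper; URL construction stays there.
function replaceBlobRefs(markdown: string, resolve: (ref: BlobRef) => string): string {
  return markdown.replace(/blob:\/\/[^/\s)]+\/[^\s)]+/g, (raw) => {
    const ref = parseBlobRef(raw);
    return ref ? resolve(ref) : raw;
  });
}

// Placeholder resolver for demonstration only; not the real /api/images URL format.
const demoResolver = (ref: BlobRef): string => `/resolved/${ref.threadId}/${ref.filename}`;
const rendered = replaceBlobRefs("![plot](blob://t1/plot.png) and text", demoResolver);
```

Keeping the resolver as a parameter mirrors the rule that no URL is constructed outside the image service: the scan knows only how to find references, not what they resolve to.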
The mount-gating in ThemeProvider deferred all rendering — including the inline <script> next-themes injects before hydration to prevent the dark/light flash — until after mount. React 19 warns scripts rendered after mount never execute, and the flash returned. next-themes is already SSR-safe; render it straight through.
Drops the seven-layer pre-migration pipeline now that streamText owns
the route:
- chat-api.ts (ChatAPIEntry) and chat-api-response.ts (ChatAPIResponse)
- conversation-manager.ts (ConversationManager + startConversation /
continueConversation)
- openai-responses-stream.ts (custom SSE encoder consuming Azure
Responses API events)
- function-registry.ts (Map<string, fn> tool dispatcher that wrapped
errors as {error:...} JSON the model saw as success)
- chat-store.tsx (Valtio module singleton)
- openai.test.ts (legacy OpenAI client tests no longer applicable)
All replaced by streamText, AI SDK tool primitives, Zustand store
factory, and the new fake-provider test seam.
Adds Playwright specs against the memory-backend fake provider so each new feature has a regression net before the cutover:
- abort-mid-stream: stop() during a streaming turn persists partial assistant text
- background-completion: onFinish fires even when the client disconnects mid-stream
- code-interpreter-end-to-end: container_file_citation -> blob ref -> rendered <img>
- image-generation: base64 -> blob:// -> RichResponse -> /api/images render
- multi-conversation-isolation: two tabs streaming simultaneously, no cross-talk
- multi-turn-tool-loop: tool -> result -> tool -> result -> text in one turn
- persistence-round-trip: submit -> stream -> reload -> identical UIMessage tree
- reasoning-effort: reasoning parts persist and render
- sub-agent-tool-call: call_sub_agent widget shows the sub-agent's response
- tool-execution-error: tool-error part renders as failure state and persists
Existing specs (abort, chat-thread, error-toast, multi-image-input, persisted-multi-turn, tool-call) updated to the new fake-provider and UIMessage shape.
Implements the AI SDK v6 resumable-streams pattern using an in-process
thread-keyed publisher (no Redis — single replica only). Switching
threads mid-stream and returning now reattaches to the live stream
(buffer replay then live forward) instead of waiting for completion +
router.refresh. The stop button now aborts streamText server-side via a
new POST /api/chat/[id]/stop endpoint instead of only disconnecting the
client.
- stream-publisher.ts: thread-keyed, multi-subscriber fan-out, owns the
AbortController passed into streamText.
- GET /api/chat/[id]/stream: replay-then-live; 204 when nothing active.
- POST /api/chat/[id]/stop: aborts upstream; existing onFinish chain
persists whatever tokens streamText produced.
- Client: useChat({ resume: true }) + prepareReconnectToStreamRequest;
stop button POSTs to the new endpoint before chat.stop().
- Removes turn-registry (single-reader, turnId-keyed, no client
consumer) and the now-orphaned turn-ttl module; stale per-thread
mutex doc-lines removed from models.ts / thread-context.ts.
Cross-replica returns 204 from GET and falls back to the existing
persisted-message path — same behaviour as before this change.
Vitest: 10 new tests in stream-publisher.test.ts cover buffer replay,
mid-stream subscribe, multi-subscriber fan-out, abort propagation,
idempotent abort, unregister, TTL eviction, and same-thread re-start.
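The replay-then-live behaviour described above can be sketched as a small in-process publisher. This is an illustrative sketch rather than the contents of stream-publisher.ts: the class names, the string chunk type, and the choice to abort a previous stream when the same thread restarts are assumptions, and TTL eviction is omitted.

```typescript
// Sketch of a thread-keyed, multi-subscriber publisher with buffer replay.
type Subscriber = (chunk: string) => void;

class ThreadStream {
  private buffer: string[] = [];
  private subscribers = new Set<Subscriber>();
  // The controller whose signal would be passed into the upstream generation call.
  readonly abortController = new AbortController();

  publish(chunk: string): void {
    this.buffer.push(chunk);
    this.subscribers.forEach((subscriber) => subscriber(chunk));
  }

  // Late subscribers first receive everything buffered so far, then live chunks.
  subscribe(subscriber: Subscriber): () => void {
    this.buffer.forEach((chunk) => subscriber(chunk));
    this.subscribers.add(subscriber);
    return () => {
      this.subscribers.delete(subscriber);
    };
  }
}

class StreamPublisher {
  private streams = new Map<string, ThreadStream>();

  start(threadId: string): ThreadStream {
    // Assumption: restarting a thread aborts whatever was still running for it.
    this.streams.get(threadId)?.abortController.abort();
    const stream = new ThreadStream();
    this.streams.set(threadId, stream);
    return stream;
  }

  get(threadId: string): ThreadStream | undefined {
    return this.streams.get(threadId);
  }

  // Returns false when nothing is active (the GET route would answer 204).
  stop(threadId: string): boolean {
    const stream = this.streams.get(threadId);
    if (!stream) return false;
    stream.abortController.abort(); // calling abort() twice is harmless
    return true;
  }
}
```

Because the map lives in process memory, a request landing on another replica finds no entry, which matches the 204 fallback noted above.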
Three bugs in fb204d8 surfaced by the new e2e specs:
1. customFetch (chat-store-context) unconditionally rebuilt the body as FormData, so the reconnect GET (which DefaultChatTransport routes through the same fetch) hit "Request with GET/HEAD method cannot have body" in the browser. Short-circuit non-POST methods to plain fetch; the FormData rebuild only makes sense for the submit POST.
2. same-origin check 403'd every request on Azure App Service because req.url carries the internal hostname while Origin/Referer carry the public one. Trust X-Forwarded-Host when present; App Service strips any client-supplied value at the edge so this isn't a CSRF bypass.
3. POST /api/chat/[id]/stop aborted streamText, but onFinish never fires after an abort, so the consumeStream finally tripped the sentinel path and clobbered the tokens the user had already seen with the "Something went wrong" message. Accumulate text/reasoning via onChunk and persist from onAbort using the same path as onFinish; also skip the sentinel when abortController.signal.aborted (onAbort owns persistence on the abort branch).
E2E coverage that caught these:
- reattach-mid-stream.spec.ts: switches threads mid-stream, returns, asserts the live stream resumes (validates the customFetch fix).
- abort-stream-server-side.spec.ts: clicks stop, hard-reloads, asserts the partial assistant text survives in Cosmos (validates onAbort persistence).
Both pass against a fresh production build (npm run build + next start); the previous validation pass missed the bugs because the e2e harness reuses build/BUILD_ID and was running the un-fixed bundle.
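The same-origin fix in item 2 comes down to which host the Origin/Referer header is compared against. A minimal sketch, assuming a plain header record and a hypothetical `isSameOrigin` helper; rejecting requests that carry neither Origin nor Referer is an assumption of this sketch, not something the commit states.

```typescript
// Sketch: compare the caller's Origin/Referer host against the public host.
// Behind a reverse proxy, req.url holds the internal hostname, so prefer
// X-Forwarded-Host when the platform guarantees it cannot be client-supplied.
function isSameOrigin(
  requestUrl: string,
  headers: Record<string, string | undefined>,
): boolean {
  const source = headers["origin"] ?? headers["referer"];
  if (!source) return false; // assumption: no Origin/Referer means reject

  let sourceHost: string;
  try {
    sourceHost = new URL(source).host;
  } catch {
    return false; // malformed header value
  }

  // X-Forwarded-Host may be a comma-separated chain; the first entry is the client-facing host.
  const forwarded = headers["x-forwarded-host"]?.split(",")[0].trim();
  const expectedHost = forwarded || new URL(requestUrl).host;
  return sourceHost.toLowerCase() === expectedHost.toLowerCase();
}
```

Trusting the forwarded header is only safe under the condition the commit names: the edge must strip any client-supplied value before the request reaches the app.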
…nject markdown
Azure's built-in code_interpreter tool sometimes returns generated images as inline `data:image/png;base64,…` URLs in `outputs[].url`. When the model echoes those URLs in markdown links like `[Download](data:…)`, Streamdown's link sanitiser (rehype-harden) rejects every data: scheme inside an <a> href, so the user sees a "[blocked]" placeholder instead of a working link or rendered image.
code-interpreter-stream-rewriter.ts mirrors the image_generation handler: it hands each data: image URL to persistBase64Image (the image service) to get back a canonical `blob://threadId/filename` reference, rewrites the tool-result chunk's `outputs[].url` to that ref so persistence stays small and the chunk no longer carries the megabyte of base64, then injects an image-markdown text-delta into the assistant stream so the image renders inline at the UI's last-mile blob → /api/images resolution step. Sandbox URL outputs are still handled by sandbox-url-transform.
Wired into the experimental_transform array between the image_generation rewriter and the sandbox URL rewriter.
Azure splits `sandbox:/mnt/data/<file>` URLs across many small text-delta chunks (observed sequence: "sandbox", ":/", "mnt", "/data", "/random", "_py", "plot", ".png", ")"). The per-delta substitution never saw a complete pattern, so the assistant text reached Streamdown still containing the raw sandbox URL — which rehype-harden's link sanitiser rejects, leaving a `[blocked]` placeholder next to the link text. Legacy worked because the legacy stream handler ran rewrite on the full assistant text after Azure's response completed. The transform now keeps a per-text-part `pendingTail` and holds back any suffix that could be the start of a sandbox URL — either a strict prefix of the literal "sandbox" (e.g. "san", "sandb") or a `sandbox:` token whose URL hasn't terminated yet (no `)`, `]`, whitespace, or quote). Once enough deltas arrive to either complete the URL (so the regex matches and rewrites) or rule it out, the buffered portion is emitted. The text-end chunk flushes any leftover tail. Tests cover the chunked-URL case and the spurious-prefix flush.
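The hold-back idea can be sketched as a small stateful rewriter. This is a simplified illustration, not the code in sandbox-url-transform.ts: `createSandboxRewriter`, its `push`/`flush` interface, and the filename-to-URL `fileMap` shape are assumptions, and it operates on plain strings rather than AI SDK stream chunks.

```typescript
// Sketch of a streaming rewriter that buffers a possible partial sandbox URL.
const TOKEN = "sandbox:";
const TERMINATOR = /[\s)\]"']/; // characters that end a URL in markdown text
const SANDBOX_URL = /sandbox:\/mnt\/data\/([^\s)\]"']+)/g;

// Index from which text must be held back because it may still grow into a URL.
function holdBackIndex(text: string): number {
  const last = text.lastIndexOf(TOKEN);
  // Case 1: a `sandbox:` token whose URL has not terminated yet.
  if (last !== -1 && !TERMINATOR.test(text.slice(last))) return last;
  // Case 2: the text ends with a strict prefix of the token ("s", "san", "sandbox", ...).
  for (let len = Math.min(TOKEN.length - 1, text.length); len > 0; len--) {
    if (text.slice(-len) === TOKEN.slice(0, len)) return text.length - len;
  }
  return text.length;
}

function createSandboxRewriter(fileMap: Map<string, string>) {
  let pendingTail = "";
  const rewrite = (text: string): string =>
    text.replace(SANDBOX_URL, (raw: string, file: string) => fileMap.get(file) ?? raw);
  return {
    // Feed one text delta; returns the portion that is safe to emit now.
    push(delta: string): string {
      const combined = pendingTail + delta;
      const cut = holdBackIndex(combined);
      pendingTail = combined.slice(cut);
      return rewrite(combined.slice(0, cut));
    },
    // Called at text-end: emit whatever is still buffered.
    flush(): string {
      const out = rewrite(pendingTail);
      pendingTail = "";
      return out;
    },
  };
}
```

Fed the delta sequence observed above, every chunk from "sandbox" through ".png" is held back, and the closing ")" releases the whole URL in rewritten form. A word that merely ends in "s" is delayed by one delta and then emitted unchanged.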
…ions
Azure attaches `container_file_citation` entries to the `text-end`
chunk's `providerMetadata.azure.annotations`, not to separate `source`
chunks earlier in the stream. Without those citations the
text-delta containing `[Download](sandbox:/mnt/data/<file>)` couldn't
be rewritten — fileMap was empty when the URL arrived and Streamdown's
link sanitiser rejected the raw `sandbox:` scheme.
The sandbox-URL transform now:
- Holds back completed `sandbox:/mnt/data/<file>` URLs when the
filename is not yet in fileMap (waiting on the citation).
- Reads the citation annotations off the `text-end` chunk, downloads
the container file via the chat-image-service, and writes the
resulting filename → URL pair into fileMap before flushing.
- Substitutes data: image URLs in text via a `dataUrlToBlobRef` map
shared with `code-interpreter-stream-rewriter` so the model's
`[Download](data:image/...)` link gets a renderable href.
`code-interpreter-stream-rewriter` now populates that shared map when
it persists a data: URL, and no longer injects its own image
markdown: the model's own sandbox URL substitution covers the
rendered image, and injecting would duplicate it.
The DATA_PREFIXES list also dropped its short literals ("data", "dat",
"da", "d") which were colliding with the literal "data" inside
`sandbox:/mnt/data/...` paths and breaking buffer reassembly.
Tests: cross-delta sandbox URL reassembly (chunked split), text-end
flushing for un-mapped URLs, and the existing pass-through paths
continue to pass (44 files / 398 tests).
The migration to AI SDK v6 dropped the bridge between the client's
attached-file picker and azure.tools.codeInterpreter(). Client still
sent `codeInterpreterFileIds` in the request body, but the route
ignored them and only passed a persisted containerId. First-turn
runs got an empty container, so SharePoint and uploaded files showed
up to the model as "I don't have this file".
- provider-seam: accept codeInterpreterFileIds and build the right
shape — { container: id } when reusing, { container: { fileIds } }
when bootstrapping, {} otherwise.
- route: read payload.codeInterpreterFileIds, signature-compare
against the thread's persisted signature, invalidate stale
container on mismatch, harvest the new container_id from the
step's tool-call input in onFinish so subsequent turns reuse it.
- getFileIdsSignature helper: sort + dedupe so reorder doesn't
cause spurious invalidation.
- Tests cover the three container/fileIds branches plus the
signature helper.
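The three container branches and the signature comparison described above can be sketched as follows. This is an approximation under stated assumptions: only `getFileIdsSignature` is named in the commit, while `buildCodeInterpreterArgs`, `resolveContainer`, the comma separator, and the persisted-state shape are hypothetical.

```typescript
// Order-insensitive, duplicate-insensitive signature of the attached file IDs.
// The "," separator is an assumption; any stable join would do.
function getFileIdsSignature(fileIds: string[]): string {
  return Array.from(new Set(fileIds)).sort().join(",");
}

interface CodeInterpreterArgs {
  container?: string | { fileIds: string[] };
}

// Three branches: reuse an existing container, bootstrap one from files, or neither.
function buildCodeInterpreterArgs(
  containerId: string | undefined,
  fileIds: string[],
): CodeInterpreterArgs {
  if (containerId) return { container: containerId };
  if (fileIds.length > 0) return { container: { fileIds } };
  return {};
}

// Hypothetical glue: drop the persisted container when the file set has changed.
function resolveContainer(
  persisted: { containerId?: string; signature?: string },
  requestedFileIds: string[],
): { signature: string; args: CodeInterpreterArgs } {
  const signature = getFileIdsSignature(requestedFileIds);
  const reusable =
    persisted.containerId !== undefined && persisted.signature === signature;
  return {
    signature,
    args: buildCodeInterpreterArgs(
      reusable ? persisted.containerId : undefined,
      requestedFileIds,
    ),
  };
}
```

Sorting and deduplicating before comparison means re-attaching the same files in a different order keeps the existing container, while adding or removing a file forces a fresh bootstrap.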
Add an /embed route group so agents can be embedded as an iframe in external apps (e.g. SharePoint):
- /embed/agent/[personaId]: auth-gated start card with popup login
- /embed/agent/[personaId]/chat: creates a thread, forwards to /embed/chat/[id]
- /embed/chat/[id]: reuses ChatPage inside a minimal EmbedFrame
- EmbedModeProvider/useEmbedMode strips full-app chrome (ChatHeader) in embed mode
- /embed/auth/start + /embed/auth/complete: popup OAuth round-trip (Entra can't render in an iframe) that postMessages the opener
- next.config headers(): CSP frame-ancestors from EMBED_ALLOWED_ANCESTORS on /embed; X-Frame-Options SAMEORIGIN + frame-ancestors 'self' everywhere else
- optional SameSite=None;Secure session cookies behind EMBED_ALLOW_THIRD_PARTY_COOKIES (off by default)
- persona card: single copy dropdown (agent link / embeddable link / iframe snippet)
- docs/embedding.md and unit tests
next.config.js headers() are evaluated at build time and baked into the routes manifest, so the CI-prebuilt standalone ignored EMBED_ALLOWED_ANCESTORS set as a runtime app setting (it stayed frozen at 'self'). Move the /embed frame-ancestors CSP into proxy.ts, which runs per request, so the allow-list can change via the env var without rebuilding. Non-embed routes keep the static 'self' lockdown.
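Computing the header per request rather than at build time can be sketched as a pure function that the request-time layer calls with the current env value. This is an illustrative sketch only: `frameAncestorsCsp` is a hypothetical name, and both the comma/whitespace-separated env format and the choice to keep `'self'` in the embed allow-list are assumptions.

```typescript
// Sketch: build the frame-ancestors directive from the env value at request time.
function frameAncestorsCsp(
  pathname: string,
  allowedAncestorsEnv: string | undefined,
): string {
  const isEmbed = pathname === "/embed" || pathname.indexOf("/embed/") === 0;
  // Non-embed routes keep the static lockdown.
  if (!isEmbed) return "frame-ancestors 'self'";

  // Assumed format: origins separated by commas and/or whitespace.
  const origins = (allowedAncestorsEnv ?? "")
    .split(/[\s,]+/)
    .filter((origin) => origin.length > 0);
  return ["frame-ancestors", "'self'", ...origins].join(" ");
}

// Reading the variable inside the handler (not at module load) is what lets
// a changed app setting take effect without a rebuild, e.g.:
//   frameAncestorsCsp(pathname, process.env.EMBED_ALLOWED_ANCESTORS)
```

With the env var unset, embed routes fall back to `'self'`, which is the same value the build-time version was frozen at.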
…-retention Set prompt_cache_retention=24h on Azure OpenAI calls
…asts
The full-page DisplayError rendered bare unstyled <p> text and error toasts had no title or icon, so failures were easy to miss. Redesign the shared surfaces:
- DisplayError: centered error state with a destructive icon badge, clear heading, readable message(s), multi-error list and an optional retry action; role=alert; responsive down to the embed iframe width.
- Toasts: leading icon (AlertCircle for errors, neutral Info otherwise), stronger description contrast, top-aligned layout. showError now carries a title. Icon is intent-correct: only destructive shows the error glyph, so default-variant warnings are not mislabelled with a success check.
No call sites changed.
Generative UI:
- genui catalog (Stack/Card/Stat/Badge/Table/Text/Chart) mapped to shadcn; rendered from genui fenced blocks in rich-response (explicit tag only)
- system prompt scoped so the markdown rule no longer vetoes UI output
UI polish (AI-elements review):
- reasoning: live "Thought for Ns" timer, measured in /api/chat onChunk and round-tripped via message-adapter so it survives reload
- tool: brand shimmer on running tools + human-readable tool labels
- streamdown image download button repositioned (was clipped)
Stability:
- fix stick-to-bottom resize loop on large streamed content (resize=instant + useChat experimental_throttle)
- recharts charts render at a measured width (avoids ResponsiveContainer loop)
- per-message error boundary so one bad message can't crash the chat
Adds deps: @json-render/core, @json-render/react, recharts.
Use browser-local datetime for the built-in `time` tool
…w arch
- resolve route.ts to the new streamText route (old ChatAPIEntry route dropped); keep old-stack files deleted (chat-api-response, function-registry, chat-store)
- port main's getCurrentTime to the new architecture: get_current_time tool in buildToolset + x-client-datetime header from the new transport (user-local ISO datetime, server-UTC fallback)
- drop over-engineered genui guards (specIsComplete / GenUILoading / genuiSystemPrompt) now covered by the streaming gate + error boundary
- bump zod 3.25 -> 4.4.3 (AI SDK v6 supports zod 4; dedupes with json-render's zod so they share one instance, removing the cross-instance type mismatch)
- v4 breaking changes fixed in app code: required_error/invalid_type_error -> error (extensions/news/persona models); ZodError.errors -> .issues (5 services)
- genui: import z from 'zod'. Catalog still cast `as any` because json-render's defineCatalog types props as a branded SchemaType, stricter than its public z.object() usage (unrelated to the zod instance, so zod 4 doesn't remove it)
- route: drop today/isoDate from the system prompt (main #135 moved time to a tool; the ported get_current_time tool now provides it)
next build passes. The vitest suite has pre-existing + new zod-4 type errors to clean up before merging to main.
openai@5 declares peerOptional zod@^3, which conflicts with zod@4 and fails CI's plain `npm install` with ERESOLVE. openai only uses zod for schema helpers (unused — we use the AzureOpenAI client), so set legacy-peer-deps in .npmrc and reconcile the lockfile (drops stray cross-platform esbuild peer entries from an earlier local install).