feat(spec): add renderTiming to McpUiToolMeta for deferred View rendering#553
netanelavr wants to merge 1 commit into modelcontextprotocol:main
Add a new `renderTiming` field to `McpUiToolMeta` that lets servers declare when a View should appear in the conversation:

- "inline" (default): render as soon as the tool returns
- "end-of-turn": defer rendering until the agent's turn is complete

This addresses a gap in the spec where hosts have no standardized way to know whether a View should be shown immediately or after the agent finishes its turn. Tools like "Apply to Site" need deferred rendering to prevent premature user interaction while the agent is still making additional tool calls.

This is orthogonal to the existing visual `displayMode` (inline/fullscreen/pip), which controls layout, not timing.

Changes:

- spec.types.ts: add McpUiRenderTiming type and renderTiming field
- types.ts: re-export new type and schema
- specification/draft/apps.mdx: document Render Timing section and design decision
- generated/schema.*: auto-regenerated from types

Made-with: Cursor
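As a rough illustration of what the commit message describes, the new type and field might look like the sketch below. This is not copied from `src/spec.types.ts`; the helper `effectiveRenderTiming` and the `applyToSiteMeta` example are hypothetical names introduced here, and the other `McpUiToolMeta` fields are omitted.

```typescript
// Sketch only: the real definitions live in src/spec.types.ts of the PR.
// The two values are the ones named in the commit message.
type McpUiRenderTiming = "inline" | "end-of-turn";

interface McpUiToolMeta {
  // Other McpUiToolMeta fields are omitted in this sketch.
  // Optional, so tools that never set it keep today's behaviour.
  renderTiming?: McpUiRenderTiming;
}

// Hypothetical helper: resolve the timing a host should use,
// applying the documented default of "inline" when the field is absent.
function effectiveRenderTiming(
  meta: McpUiToolMeta | undefined,
): McpUiRenderTiming {
  return meta?.renderTiming ?? "inline";
}

// Hypothetical metadata for an "Apply to Site" style tool that should
// only be shown once the agent has finished its turn.
const applyToSiteMeta: McpUiToolMeta = { renderTiming: "end-of-turn" };
```

Because the field is optional and defaults to `"inline"`, a server that never sets it resolves to the same behaviour as before.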
Thanks @netanelavr! To understand the gap, could you please provide additional example cases that the tool definition doesn't cover? For example, in your current example, I'd imagine the "approval" tool could be forced to be called after the reasoning by requiring the
@netanelavr just to make sure: currently the host renders the view immediately (and doesn't actually wait for the tool result). What to show inside the view is decided by the view itself, according to the data it gets from the host (i.e., no data -> loading state, tool inputs -> stateA, tool result -> stateB). This mechanism could theoretically be extended so that the host sends a new type of message to signal that it has finished reasoning, so that the view can respond to it. The view could then change according to these lifecycle events:
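To make the alternative in this comment concrete, here is a minimal sketch of a view choosing its state from the data the host has sent so far. It is an illustration only, not the list of lifecycle events from the comment: the `ViewData` shape, the `turnComplete` flag (standing in for the proposed "done reasoning" message), and the state names are all hypothetical.

```typescript
// Hypothetical view states, following the comment's
// "no data -> loading, tool inputs -> stateA, tool result -> stateB",
// plus one extra state for the proposed end-of-reasoning signal.
type ViewState = "loading" | "stateA" | "stateB" | "turn-complete";

// Hypothetical snapshot of what the host has delivered to the view so far.
interface ViewData {
  toolInput?: unknown;
  toolResult?: unknown;
  // Assumption: set once the host signals that the agent is done reasoning.
  turnComplete?: boolean;
}

// Pick the state for the most advanced lifecycle event seen so far.
function selectViewState(data: ViewData): ViewState {
  if (data.turnComplete) return "turn-complete";
  if (data.toolResult !== undefined) return "stateB";
  if (data.toolInput !== undefined) return "stateA";
  return "loading";
}
```

Under this approach the host keeps rendering immediately, and the view itself decides whether to enable interaction, for example only in the `"turn-complete"` state.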
Summary

Adds a new `renderTiming` field to `McpUiToolMeta` that lets servers declare when a View should appear in the conversation. This addresses a gap in the spec: hosts have no standardized way to distinguish between Views that should render immediately and Views that should render after the agent finishes its turn.

Problem

The current spec defines `displayMode` (inline/fullscreen/pip) for visual layout, but has no concept of temporal presentation, i.e., when to show the View. In agentic workflows where the LLM makes multiple sequential tool calls, some Views (e.g., "Apply to Site", confirmation dialogs) should only appear after the agent is done reasoning, to prevent premature user interaction.

Today, hosts that need this behavior must invent proprietary metadata fields. This PR standardizes the pattern.

Solution

New type and field on `McpUiToolMeta`:

- `"inline"` (default): render the View as soon as the tool returns
- `"end-of-turn"`: defer rendering until the agent's turn is complete (no more tool calls)

Design decisions

- Orthogonal to `displayMode`: timing and layout are independent concerns, so a View can be `end-of-turn` + `fullscreen`
- Defaults to `"inline"`, so existing tools are unaffected
- Further values can be added later (e.g. `"on-user-action"`) without breaking changes

Prior art

- … `_meta.ui.displayMode` field with `"inline"` / `"end-of-turn"` values). This PR standardizes the concept.
- … `_meta["openai/toolInvocation/invoking"]` / `invoked` fields tracked in "Protocol discrepancies between MCP Apps and Apps SDK" #201, though those are status text rather than timing control.

Changes

- `src/spec.types.ts`: add `McpUiRenderTiming` type and `renderTiming` field to `McpUiToolMeta`
- `src/types.ts`: re-export new type and schema
- `specification/draft/apps.mdx`: document Render Timing section and design decision
- `src/generated/*`: auto-regenerated schemas (Zod + JSON Schema + tests)

Test plan

- `npm test`: all 121 tests pass
- `npm run build`: builds successfully, including all examples
- `McpUiRenderTiming` round-trips correctly

Made with Cursor
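On the host side, the deferral described under Solution could be handled roughly as sketched below. This is an assumption about how a host might act on the field, not code from the PR: `PendingView`, `TurnRenderQueue`, and its method names are hypothetical, and "turn complete" is taken to mean the agent will make no more tool calls.

```typescript
type McpUiRenderTiming = "inline" | "end-of-turn";

// Hypothetical record a host keeps for each View produced by a tool call.
interface PendingView {
  toolCallId: string;
  renderTiming?: McpUiRenderTiming;
}

// Hypothetical host-side queue: inline Views render as soon as the tool
// returns, end-of-turn Views are held until the agent's turn is complete.
class TurnRenderQueue {
  private deferred: PendingView[] = [];
  private readonly render: (view: PendingView) => void;

  constructor(render: (view: PendingView) => void) {
    this.render = render;
  }

  // Call when a tool that carries a View returns.
  onToolResult(view: PendingView): void {
    const timing = view.renderTiming ?? "inline"; // documented default
    if (timing === "end-of-turn") {
      this.deferred.push(view);
    } else {
      this.render(view);
    }
  }

  // Call once the agent's turn is complete (no more tool calls).
  onTurnComplete(): void {
    const ready = this.deferred;
    this.deferred = [];
    for (const view of ready) {
      this.render(view); // flush in the order the tools returned
    }
  }
}
```

Keeping the queue separate from layout handling mirrors the design decision that timing and `displayMode` are independent: the `render` callback can still place a flushed View inline, fullscreen, or in pip.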