feat(spec): add renderTiming to McpUiToolMeta for deferred View rendering#553

Open
netanelavr wants to merge 1 commit into modelcontextprotocol:main from netanelavr:feat/render-timing

Conversation

@netanelavr

Summary

Adds a new `renderTiming` field to `McpUiToolMeta` that lets servers declare when a View should appear in the conversation. This addresses a gap in the spec: hosts have no standardized way to distinguish Views that should render immediately from Views that should appear only after the agent finishes its turn.

Problem

The current spec defines displayMode (inline / fullscreen / pip) for visual layout, but has no concept of temporal presentation — i.e., when to show the View. In agentic workflows where the LLM makes multiple sequential tool calls, some Views (e.g., "Apply to Site", confirmation dialogs) should only appear after the agent is done reasoning, to prevent premature user interaction.

Today, hosts that need this behavior must invent proprietary metadata fields. This PR standardizes the pattern.

Solution

New type and field on McpUiToolMeta:

```typescript
type McpUiRenderTiming = "inline" | "end-of-turn";

interface McpUiToolMeta {
  resourceUri?: string;
  visibility?: McpUiToolVisibility[];
  renderTiming?: McpUiRenderTiming;  // NEW
}
```
  • "inline" (default) — render the View as soon as the tool returns
  • "end-of-turn" — defer rendering until the agent's turn is complete (no more tool calls)
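As a minimal sketch of how a host could resolve the field (the `getRenderTiming` helper is illustrative, not part of this PR; the type definitions are repeated here only so the snippet is self-contained):

```typescript
type McpUiRenderTiming = "inline" | "end-of-turn";
type McpUiToolVisibility = string; // placeholder for illustration

interface McpUiToolMeta {
  resourceUri?: string;
  visibility?: McpUiToolVisibility[];
  renderTiming?: McpUiRenderTiming; // NEW: optional, defaults to "inline"
}

// Hypothetical helper: a missing renderTiming is treated as "inline",
// which is what keeps existing tools backward compatible.
function getRenderTiming(meta: McpUiToolMeta): McpUiRenderTiming {
  return meta.renderTiming ?? "inline";
}

console.log(getRenderTiming({}));                               // "inline"
console.log(getRenderTiming({ renderTiming: "end-of-turn" }));  // "end-of-turn"
```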

Design decisions

  • Server-declared hint: The server has domain knowledge about whether its View needs deferred rendering; the host SHOULD respect it but MAY ignore it
  • Orthogonal to displayMode: Timing and layout are independent concerns — a View can be end-of-turn + fullscreen
  • Backward compatible: Optional field, defaults to "inline", existing tools are unaffected
  • Extensible: String union allows future values (e.g., "on-user-action") without breaking changes
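A hypothetical host-side sketch of the deferral behavior these decisions imply: render `"inline"` Views as tool results arrive, queue `"end-of-turn"` Views, and flush the queue once the agent stops making tool calls. The `PendingView` shape, `ViewScheduler` class, and `onTurnComplete` hook are all illustrative assumptions, not spec.

```typescript
type McpUiRenderTiming = "inline" | "end-of-turn";

interface PendingView {
  resourceUri: string;
  renderTiming?: McpUiRenderTiming;
}

class ViewScheduler {
  private deferred: PendingView[] = [];
  rendered: string[] = []; // resource URIs in the order they were shown

  // Called for each tool result carrying a View.
  onToolResult(view: PendingView): void {
    if ((view.renderTiming ?? "inline") === "end-of-turn") {
      this.deferred.push(view); // hold until the agent's turn is complete
    } else {
      this.rendered.push(view.resourceUri); // show immediately
    }
  }

  // Called when the agent finishes its turn (no more tool calls).
  onTurnComplete(): void {
    for (const view of this.deferred) this.rendered.push(view.resourceUri);
    this.deferred = [];
  }
}
```

Because timing is orthogonal to `displayMode`, the same queue works regardless of whether a deferred View ultimately renders inline, fullscreen, or picture-in-picture.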

Prior art

  • Elementor's Angie has shipped this pattern in production (as a vendor-specific _meta.ui.displayMode field with "inline" / "end-of-turn" values). This PR standardizes the concept.
  • Related to the deferred _meta["openai/toolInvocation/invoking"] / invoked fields tracked in Protocol discrepancies between MCP Apps and Apps SDK #201, though those are status text rather than timing control.

Changes

  • src/spec.types.ts — add McpUiRenderTiming type and renderTiming field to McpUiToolMeta
  • src/types.ts — re-export new type and schema
  • specification/draft/apps.mdx — document Render Timing section and design decision
  • src/generated/* — auto-regenerated schemas (Zod + JSON Schema + tests)

Test plan

  • npm test — all 121 tests pass
  • npm run build — builds successfully including all examples
  • Schema generation produces correct Zod and JSON Schema for the new type
  • Type-level integration tests verify McpUiRenderTiming round-trips correctly

Made with Cursor


Add a new `renderTiming` field to `McpUiToolMeta` that lets servers
declare when a View should appear in the conversation:

- "inline" (default): render as soon as the tool returns
- "end-of-turn": defer rendering until the agent's turn is complete

This addresses a gap in the spec where hosts have no standardized way
to know whether a View should be shown immediately or after the agent
finishes its turn. Tools like "Apply to Site" need deferred rendering
to prevent premature user interaction while the agent is still making
additional tool calls.

This is orthogonal to the existing visual `displayMode`
(inline/fullscreen/pip) which controls layout, not timing.

Changes:
- spec.types.ts: add McpUiRenderTiming type and renderTiming field
- types.ts: re-export new type and schema
- specification/draft/apps.mdx: document Render Timing section and
  design decision
- generated/schema.*: auto-regenerated from types

Made-with: Cursor
@idosal
Contributor

idosal commented Mar 19, 2026

Thanks @netanelavr! To understand the gap, could you please provide additional example cases that the tool definition doesn't cover? For example, in your current example, I'd imagine the "approval" tool could be forced to be called after the reasoning by requiring a reason argument.

@liady
Contributor

liady commented Mar 19, 2026

@netanelavr just to make sure: currently the host renders the view immediately (it doesn't actually wait for the tool result). What to show inside the view is decided by the view itself, based on the data it gets from the host (i.e. no data -> loading state, tool inputs -> state A, tool result -> state B).

This mechanism can theoretically be extended so that the host sends a new type of message to signal that it has finished reasoning (so that the view can respond to that).
What do you think? This might allow the most accurate visual feedback for the user.

So the view can change according to these lifecycle events:

  • The host decides to use the tool (renders the view)
  • The host calls the tool (streams tool inputs to the view)
  • The host receives the tool response (sends the tool result to the view)
  • The host finishes the agentic reasoning (sends a message to the view)
