Skip to content

feat(mcp): Add plugin-scoped MCP tool support#93

Open
dcramer wants to merge 21 commits intomainfrom
feat/mcp-plugin-resume
Open

feat(mcp): Add plugin-scoped MCP tool support#93
dcramer wants to merge 21 commits intomainfrom
feat/mcp-plugin-resume

Conversation

@dcramer
Copy link
Member

@dcramer dcramer commented Mar 17, 2026

Add plugin-scoped MCP tool support with same-session auth resume.

Plugins can now declare a single HTTP MCP server in their manifest. When the agent loads a skill from that plugin, it progressively activates that plugin's MCP tools, and the turn checkpoint now persists both loaded skills and active MCP providers so resumed turns rebuild the same tool surface.

This also adds the MCP auth path. MCP auth challenges now pause the current turn, send a private auth link, and resume the same session after the callback instead of failing the request. The callback restores the stored turn context and persists the resumed reply back into Slack thread state so delivery and conversation state stay aligned.

I kept this scoped to tools-only MCP over HTTP. There is no sandbox-managed MCP transport here, and provider-specific behavior stays in the plugin manifest rather than leaking into the older capability and broker flows.

@vercel
Copy link

vercel bot commented Mar 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
junior-docs Ready Ready Preview, Comment Mar 20, 2026 1:11am

Request Review

dcramer and others added 2 commits March 17, 2026 15:38
Add plugin manifest support for HTTP MCP servers and a turn-local MCP tool manager that progressively exposes plugin tools when matching skills load. Pause turns on MCP auth challenges and resume the same session after the callback instead of failing the turn.

Persist loaded skills and active MCP providers across resume, and align the resumed callback path with normal reply delivery and thread-state persistence so Slack state stays consistent.

Co-Authored-By: Codex <noreply@openai.com>
Preserve the stored authorization URL when clearing only the code verifier so paused MCP authorization can resume without losing its redirect target.

Render the MCP callback page through React's escaping path and add focused regressions for callback HTML escaping and auth-session invalidation behavior.

Co-Authored-By: GPT-5 Codex <codex@openai.com>
Replace the router slash-trimming regex with deterministic string trimming and redact PEM private key blocks without a broad backtracking pattern.

Add focused regression coverage for both behaviors so the CodeQL findings stay fixed on future refactors.

Co-Authored-By: GPT-5 Codex <codex@openai.com>
Sync the resumability checkpoint as soon as a plugin skill is loaded so an MCP auth pause cannot drop the newly activated skill. This keeps the resumed turn aligned with the tool state the user asked for before OAuth interrupted execution.

Add a focused unit regression that exercises loadSkill, auth-required pause, checkpoint persistence, and resume-time MCP tool restoration.

Co-Authored-By: GPT-5 Codex <codex@openai.com>
Keep MCP client shutdown best-effort so one failing client does not prevent the rest of the active providers from closing or leave the manager state populated. Preserve the first close error after cleanup so the caller can still log the teardown failure.

Extend the existing tool-manager unit coverage to assert that all active clients are closed and the active provider state is cleared even when one close call throws.

Co-Authored-By: GPT-5 Codex <codex@openai.com>
Keep resumed OAuth callback delivery aligned with the normal reply executor by uploading files whenever the delivery plan requests them, even if thread text posting is intentionally suppressed. This avoids silently dropping artifacts in reaction-ack flows and other no-text reply modes.

Add callback coverage for the resumed delivery path where files are attached but thread text is skipped.

Co-Authored-By: GPT-5 Codex <codex@openai.com>
Replace per-provider Pi tool registration with host-managed searchTools and useTool dispatch so loadSkill can expose MCP tools without replaying the turn. Persist MCP auth and session state at the host boundary, migrate the Notion skill to the new dispatcher contract, and tighten the MCP resume path.

Reduce dev noise by shrinking console logging, deduping startup/plugin file-load logs, and making MCP dispatcher status updates more specific in Slack.

Co-Authored-By: GPT-5 Codex <openai-codex@openai.com>
Commit the generated Next route types reference so the example app matches the current dev output after the recent routing and handler changes.

Co-Authored-By: GPT-5 Codex <openai-codex@openai.com>
Shrink the exported MCP tool descriptor to the fields consumers actually use and keep argument validation inside the manager that owns the live tool registry. This removes duplicated lookup work from useTool and keeps the dispatcher path focused on dispatch.

Also collapse repeated state-adapter connection setup in the MCP auth store so the storage operations read directly instead of repeating the same boilerplate in every function.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Return fixed browser copy for MCP OAuth callback success and failure pages instead of reflecting provider error text or exception messages into the HTML response. The Slack-side logging and resume flow still keep the actionable details.

Update the callback unit coverage to assert the browser page stays generic for provider and exception failures.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
@dcramer dcramer marked this pull request as ready for review March 19, 2026 01:00
Keep plugin-backed skill metadata intact when loadSkill reads a skill from the sandbox so later MCP activation and capability checks see the same contract as startup discovery.

Also preserve later log content when a malformed PEM block is missing its footer by only redacting through the next blank-line boundary instead of discarding the rest of the record.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Collapse the callback resume path onto a shared helper so OAuth and MCP
callbacks follow the same shape. Persist only loaded skills for MCP
turn resume and always continue from the checkpointed turn state.

Co-Authored-By: Codex <noreply@openai.com>
Add a regression test proving PEM private keys are still redacted
when they appear inside escaped JSON strings. This documents the
current behavior and closes out review concerns without widening the
logging implementation.

Co-Authored-By: Codex <noreply@openai.com>
Clear the cached MCP tool catalog whenever the client is disposed so a
recovered session always reloads tools from the rebuilt transport.

Add a focused unit regression covering the stale-session retry path to
lock the refreshed tool list behavior.

Co-Authored-By: Codex <noreply@openai.com>
Remove skill-level MCP tool allowlists so provider tool exposure is
owned by the plugin manifest only.

Extract shared provider unlink cleanup so app-home disconnect and jr-rpc
provider deletion follow the same path.

Co-Authored-By: Codex <noreply@openai.com>
Document why only the timeout branch waits for prompt settlement
before snapshotting agent messages, and why auth-pause snapshots are
already final once promptPromise settles.

Co-Authored-By: Codex <noreply@openai.com>
Catch MCP authorization errors raised during initial client setup so
provider activation uses the same auth-pause path as discovery-time
failures.

Add focused unit coverage for the connect-time auth-required case.

Co-Authored-By: Codex <noreply@openai.com>
Wrap auth-pause checkpoint persistence so transient state adapter failures do not replace the intended auth resume retry path. Add a regression test to keep this control flow aligned with timeout checkpoint handling.

Co-Authored-By: Codex <noreply@openai.com>
}
throw error;
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale listedTools cache survives auth-required connect failure

Low Severity

When activateProvider catches McpAuthorizationRequiredError and the auth hook returns true, the PluginMcpClient created by getClient remains cached in clientsByProvider even though the provider failed to activate. If activateProvider is called again for the same provider within the same McpToolManager lifetime (e.g., from a second skill sharing the same plugin), getClient returns the cached client whose internal Client was disposed after the failed connect. The PluginMcpClient will recreate its internal client on next use, but it re-runs the full connect+auth dance, which could trigger a duplicate auth challenge that's silently swallowed by the pendingMcpAuthorizationPause guard rather than being avoided.

Additional Locations (1)
Fix in Cursor Fix in Web

Remove enduser.id from the console priority ordering.
The key remains hidden in compact console output, so this only changes
sorting behavior for console attributes.

Co-Authored-By: Codex <codex@openai.com>
Ignore MCP auth-pause requests once the turn has already produced a completed assistant stop reason, while still parking aborted turns for resume.

Add focused unit coverage for the tool-call race so completed turns are not discarded when an auth request arrives too late to interrupt them.

Co-Authored-By: Codex <codex@openai.com>
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

timeoutResumeSessionId
) {
const nextSliceId = timeoutResumeSliceId + 1;
const piMessages = trimTrailingAssistantMessages(timeoutResumeMessages);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auth pause checkpoint skips fallback to latest stored messages

Medium Severity

The MCP auth-pause checkpoint path uses trimTrailingAssistantMessages(timeoutResumeMessages) directly without falling back to the latest stored checkpoint when timeoutResumeMessages is empty. The timeout path has a defensive fallback (timeoutResumeMessages.length > 0 ? timeoutResumeMessages : latestCheckpoint?.piMessages ?? []) but the auth-pause path does not. If the auth challenge fires before any messages are captured, the checkpoint persists an empty message list, losing all prior turn context on resume.

Fix in Cursor Fix in Web

},
},
),
tools: baseAgentTools,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MCP tools not added to agent tool set after activation

High Severity

The agent is initialized with baseAgentTools which include searchTools and useTool wrappers, but these are the only indirect access paths. When onSkillLoaded activates an MCP provider mid-turn and discovers new tools, those tools are only returned as metadata in the loadSkill result — they're never injected into the agent's live tool set. The agent can only call MCP tools through the useTool indirection, which works, but the system prompt's <loaded_tools> section is built before the agent starts and never updated when new skills are loaded mid-turn, so the agent has stale tool catalog information after mid-turn skill loads.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant