Skip to content

feat(bridge): forward upstream MCP tools through the bridge with daemon-side allowlist gating#4

Open
benhoverter wants to merge 7 commits into
feat/bridge-tool-surface-v2from
feat/mcp-bridge-upstream-forwarding
Open

feat(bridge): forward upstream MCP tools through the bridge with daemon-side allowlist gating#4
benhoverter wants to merge 7 commits into
feat/bridge-tool-surface-v2from
feat/mcp-bridge-upstream-forwarding

Conversation

@benhoverter
Copy link
Copy Markdown
Owner

@benhoverter benhoverter commented May 21, 2026

Depends on RightNow-AI#1205. Do not merge until that lands; this PR is stacked on feat/bridge-tool-surface-v2. Diff-vs-base shows only the upstream-forwarding work.

Summary

Lets bridge-fronted agents call upstream MCP tools (Linear, Notion, etc.) without baking each one into the static ALLOWED_TOOLS table. The bridge asks the daemon for the agent's permitted upstream surface at handshake, advertises those tools to the rmcp client alongside built-ins, and routes mcp_{server}_{tool} invocations through the existing IPC. The daemon stays the source of truth for per-agent gating and tool dispatch.

Changes

Layer Change
Protocol (openfang-mcp-bridge/src/protocol.rs) New ListUpstreamRequest, UpstreamToolDef, UpstreamListResponse, UpstreamListResult. New Frame::ListUpstream / Frame::UpstreamList variants. Forward-compat: unknown type tags reject cleanly (serde error, no panic).
Daemon dispatch (openfang-api/src/bridge_ipc.rs) handle_list_upstream walks kernel.mcp_connections and returns the agent's permitted upstream tools. dispatch_upstream_mcp_call routes mcp_{server}_{tool} through the MCP registry, bypassing the static ALLOWED_TOOLS gate via an early branch. Request loop multiplexes Frame::Call and Frame::ListUpstream.
Bridge plumbing (openfang-mcp-bridge/src/lib.rs, main.rs) list_upstream() round-trip runs once after handshake() while the bridge still owns the stream, then spawn_ipc_actor() takes the halves. Results are cached on IpcDispatcher. Bridge::permitted_tools() appends upstream tools after built-ins. Local accept gate: name in allowed OR (starts_with("mcp_") AND advertised in cache).
Discovery hardening (openfang-runtime/src/mcp.rs) RESERVED_BUILTIN_NAMES mirrors ALLOWED_TOOLS. New register_discovered_tool helper skips (with WARN) any upstream tool whose namespaced form shadows a built-in or duplicates a same-server entry (catches get-teams vs get_teams normalization collisions). Drift test pins RESERVED_BUILTIN_NAMES == ALLOWED_TOOLS.

Per-commit:

  • dadc6f1 — protocol variants + tests, no behavior change
  • 5d33a4a — daemon dispatch + list_upstream handler
  • 8180191 — bridge handshake plumbing + advertise + dispatch
  • 52b4fba — discovery-time collision check + drift test
  • 8afe35acargo fmt --all on the touched crates (whitespace only)
  • 71639c0post-review: sandbox hard-deny on ~/.mcp-auth/ (MCP-02)
  • 4e0f71apost-review: list_upstream timeout + drift test + UTF-8 audit comment (MCP-06/08/09)

Testing

  • cargo fmt --all -- --check passes
  • cargo clippy -p openfang-runtime -p openfang-api -p openfang-mcp-bridge --tests -- -D warnings passes
  • cargo test on touched crates passes:
    • openfang-mcp-bridge --lib --bins → 16/16 (4 new, including list_upstream_handshake_and_mcp_dispatch end-to-end, permitted_tools_appends_upstream_after_builtins, is_advertised_upstream_matches_only_advertised_names, list_upstream_propagates_error_result)
    • openfang-api --lib bridge_ipc → 19/19 (4 new: ipc_list_upstream_roundtrip, ipc_mcp_call_through_twin_returns_canned_ok, reserved_builtin_names_matches_allowed_tools drift guard, no_builtin_uses_mcp_prefix drift guard)
    • openfang-runtime --lib (sandbox + mcp) → all green, +1 sandbox unit test for .mcp-auth/ deny
  • Live integration smoked against a local daemon with bridge-fronted agents:
    • Full surface: mcp_linear_*, mcp_notion_*, mcp_figma_*, mcp_supabase_* all callable end-to-end from an agent with the matching mcp_servers allowlist.
    • Gating verified by removing supabase from one agent's mcp_servers: the entire Supabase tool surface stopped being advertised (not just rejected at call time) — confirms the schema-layer gate.
    • mcp_servers = [] → zero upstream surface advertised. Default-deny on the bridge path holds (see Notes below).
    • Daemon restart resilience: bridge + plugin reconnect clean post-deploy-local.sh; no re-priming.
    • Upstream-MCP-server kill resilience: subprocess supervisor respawns transparently; the next tool call succeeds with no cryptic errors surfaced to the agent.
    • Forced exec attempts blocked at the schema layer when an agent has no permitted servers — denied tools aren't in the deferred registry, so a client can't fabricate the call.

New tests are named so they can be grepped:
frame_unknown_variant_is_rejected_cleanly, ipc_list_upstream_roundtrip, ipc_mcp_call_through_twin_returns_canned_ok, reserved_builtin_names_matches_allowed_tools, no_builtin_uses_mcp_prefix, permitted_tools_appends_upstream_after_builtins, is_advertised_upstream_matches_only_advertised_names, list_upstream_handshake_and_mcp_dispatch, list_upstream_propagates_error_result, denies_mcp_auth_home_dir.

Security

  • No new unsafe code
  • No secrets or API keys in diff
  • User input validated at boundaries

Threat-model notes for reviewers:

  • Hardened-token lane only. Both handle_list_upstream and dispatch_upstream_mcp_call resolve the agent identity from the Hello token. A legacy self-claimed agent_id on either new path is refused.
  • mcp_servers = [] is none on the bridge path (behavior change — see Notes below). The daemon refuses upstream MCP calls and returns an empty advertise list when an agent's allowlist is empty. The in-process path's historical [] = all semantic is left untouched in this PR (tracked: MCP-01 / MCP-05).
  • Explicit truncation. Oversized upstream results return is_error = true with a truncation marker; never silently truncated.
  • Server-side allowlist is authoritative. The bridge's local mcp_* accept gate is early-exit hygiene only — the daemon re-validates agent.mcp_servers on every call.
  • Collision-resistant naming. RESERVED_BUILTIN_NAMES + the discovery-time skip means a future built-in landing under mcp_* (or a misnamed upstream tool) cannot shadow a built-in or collide with a same-server sibling. Drift test guards the invariant; second drift test (no_builtin_uses_mcp_prefix) locks the dispatch-ordering invariant.
  • Audit log redaction. Tool arguments are not logged at info level on the new dispatch path; only the tool name + agent id + server.
  • Sandbox. ~/.mcp-auth/ OAuth tokens (mcp-remote, etc.) now hard-denied regardless of workspace_root, matching the 421e5d9 deny-outside-$OPENFANG_HOME threat model.

Post-review additions (security-openfang findings folded in)

Security review (_shared/openfang/findings/mcp-forwarding/) flagged 9 findings across the 12 commits in this stack. Four were folded into this PR before merge; the rest are tracked as follow-ups against local-main (see "Follow-ups" below).

Finding Sev Disposition Commit
MCP-02 HIGH Folded — hard-deny ~/.mcp-auth/ in workspace_sandbox.rs (same threat model as 421e5d9). 71639c0
MCP-06 MED Folded — tokio::time::timeout(15s) on list_upstream round-trip; downgrades to empty on timeout, matching the existing error path. 4e0f71a
MCP-08 LOW Folded — no_builtin_uses_mcp_prefix drift test locks the invariant is_mcp_tool dispatch ordering depends on. 4e0f71a
MCP-09 INFO Folded — UTF-8 4-byte-worst-case assumption named in a comment on the truncation budget. 4e0f71a
MCP-10 / MCP-11 / MCP-12 PASS No action — reviewer flagged these as reference patterns.

Follow-ups (tracked against local-main, not blockers for this PR)

  • MCP-01 (CRITICAL, pre-existing): tool_runner.rs:517-545 — in-process MCP path does not consult agent.toml mcp_servers. Out of scope for this bridge PR (kernel/runtime path), but filed concurrently with merge so it is not lost. Requires explicit default-deny-vs-allow+warn decision plus a parity test asserting bridge/in-process semantic equivalence.
  • MCP-03 (HIGH): Upstream tool results lack a provenance envelope ("external content from MCP server X, tool Y"). Same structural gap exists today on web_fetch and memory_recall (S9-H / FU-01). Should land all three together so the model sees a consistent boundary marker.
  • MCP-04 (MED): Per-tool (not per-server) allowlist granularity in agent.toml. New feature, not a regression; this PR's server-granular gate matches the existing config shape.
  • MCP-05 (MED): Resolves automatically when MCP-01 lands (couples to that issue).
  • MCP-07 (LOW): UX-only — mid-session manifest changes not reflected in advertise until restart. Dispatch-side gate is the real authority; the partial doc at main.rs:241-246 is sufficient interim.

Notes for reviewers

  • Behavior change — mcp_servers = [] on the bridge path. Historically, OpenFang interpreted mcp_servers = [] as "all servers allowed" (see openfang-api/src/routes.rs and the agents TUI's "all" mode). On the new bridge upstream path, [] means none. This is a deliberate default-deny posture for new code — bridge-fronted agents with no MCP intent specified get zero upstream surface advertised instead of every connected MCP server's tools. The in-process path's [] = all semantic is untouched to avoid a blast-radius this PR shouldn't claim. Follow-up planned: converge both paths on [] = none in a dedicated PR with a migration note. Calling this out now as "putting folks on notice."
  • Default-empty trait method. ToolDispatcher::upstream_tools() is added with a default empty Vec so existing Box<dyn ToolDispatcher> callsites compile unchanged.
  • Schema fallback. upstream_def_to_tool() falls back to an empty object schema when an upstream tool's inputSchema is not an object — keeps rmcp happy on non-conformant servers without a panic.

Out of scope

  • Converging the in-process path on [] = none (follow-up PR — MCP-01).
  • Per-tool (not per-server) allowlisting (MCP-04).
  • Streaming/chunked upstream results — current implementation returns a single response with explicit-truncation semantics.

…ior change)

Adds the wire-protocol surface for daemon→bridge forwarding of upstream
MCP tool calls (Linear, Notion, etc.) advertised via config.toml.
This commit is types-and-tests only; no caller wires the new variants
yet and no behavior changes.

New types in openfang_mcp_bridge::protocol:
  - ListUpstreamRequest         { request_id }
  - UpstreamToolDef             { name, server, description?, input_schema }
  - UpstreamListResponse        { request_id, result }
  - UpstreamListResult::{Ok,Error}

New Frame variants (internal tag "type", snake_case):
  - Frame::ListUpstream(ListUpstreamRequest)   tag: list_upstream
  - Frame::UpstreamList(UpstreamListResponse)  tag: upstream_list

Identity binding: ListUpstreamRequest carries no agent_id; the daemon
resolves the caller from the authenticated Hello token (consistent with
the hardened CallRequest path post-ANAI-31). Per-agent gating is
enforced server-side against `agent.toml mcp_servers`.

Naming: forwarded tools use the same `mcp_{server}_{tool}` namespace
already produced at MCP discovery in openfang-runtime/src/mcp.rs, so
the bridge surfaces them verbatim and the daemon dispatcher can route
on prefix.

Forward-compat: adding variants is additive on the wire. An older peer
sees the new tag and closes the connection with a clean serde decode
error (covered by `frame_unknown_variant_is_rejected_cleanly`); we ship
daemon and bridge from the same workspace, so version skew here would
be a build-system error, not a runtime concern.

Tests: 5 added, all green.
  - frame_roundtrip_list_upstream_request
  - frame_roundtrip_upstream_list_ok / _empty / _error
  - upstream_tool_def_omits_none_description
  - frame_unknown_variant_is_rejected_cleanly

Depends on RightNow-AI#1205 (feat/bridge-tool-surface-v2). Followup commits will
wire daemon dispatch, bridge surface, and the discovery-time
namespace-collision check.

Refs: projects/openfang-fork/mcp-upstream-forwarding-design.md
Adds the daemon half of the upstream MCP forwarding path
(design doc §5.2–5.4):

- `handle_list_upstream`: walks `kernel.mcp_connections`, returns
  tools for servers on the agent's `mcp_servers` allowlist.
- `dispatch_upstream_mcp_call`: routes `mcp_{server}_{tool}`
  invocations through the kernel's MCP registry, bypassing
  `execute_tool` and the static `ALLOWED_TOOLS` gate.
- Request loop now multiplexes `Frame::Call` and
  `Frame::ListUpstream` without state coupling.

Security gates baked in (per design doc §6):

- **Hardened-token lane only.** Both new paths refuse the legacy
  self-claimed `agent_id` lane outright — upstream MCP servers
  can carry secrets and we need a daemon-issued identity to
  authorize against.
- **Per-agent server allowlist with default-deny.**
  `entry.manifest.mcp_servers = []` means *no* upstream tools
  for the bridge path — deliberate semantic departure from the
  in-process MCP path's historical `[] = all` convention.
  Convergence on default-deny everywhere tracked as follow-up.
- **Truncation is explicit.** Results exceeding the frame
  budget come back as `is_error=true` with a truncation
  marker, never silently dropped.
- **Hyphen-safe server matching.** Uses
  `extract_mcp_server_from_known` against the agent's allowlist
  so servers like `bocha-search` resolve correctly.

Tests:

- `ipc_list_upstream_roundtrip` — wire round-trip of the new
  variants, plus a follow-up Call to confirm request-loop state
  isn't tainted between message kinds.
- `ipc_mcp_call_through_twin_returns_canned_ok` — exercises the
  production early-branch shape (mcp_* bypasses ALLOWED_TOOLS).
- Test twin extended to handle `ListUpstream` and mirror the
  production `is_mcp_tool` early branch.

No behavior change for the built-in tool surface: existing
17 `bridge_ipc::tests` (15 prior + 2 new) all pass.

Refs design doc: projects/openfang-fork/mcp-upstream-forwarding-design.md
…spatch

Bridge-side plumbing for upstream MCP forwarding (commit #3 of 4).

Lib:
- `ToolDispatcher` gains a default `upstream_tools()` method returning
  the cached upstream MCP tool catalog. Backward-compatible default impl
  for existing dispatchers.
- `Bridge::permitted_tools()` appends upstream tools to the built-in
  surface. Built-ins remain gated by `allowed_tools()`
  (`OPENFANG_BRIDGE_ALLOWED` / `DEFAULT_ALLOWED`); upstream tools are
  ungated here — server-side `mcp_servers` gating already ran when the
  daemon answered `ListUpstream`.
- `Bridge::call_tool()` accepts a call if the tool is either a permitted
  built-in OR a `mcp_*` name that was advertised in the upstream cache.
- Helper `upstream_def_to_tool()` converts the wire-format
  `UpstreamToolDef` to an rmcp `Tool`, with empty-object schema fallback
  for non-object input schemas.

Bin:
- `main()` performs a one-shot `ListUpstream` round-trip immediately
  after `handshake()`, while still owning the stream. Failures are
  logged and downgraded to "no upstream surface this session" — the
  bridge stays alive with built-ins only.
- `list_upstream()` helper factored alongside `handshake()`.
- `spawn_ipc_actor()` takes the cached `Vec<UpstreamToolDef>` and
  stores it on `IpcDispatcher`.
- `IpcDispatcher::call()` mirrors the bridge-side gate: allowed
  built-in OR advertised upstream `mcp_*` name. Anything else is
  refused locally with `NotPermitted` — early-exit hygiene; daemon
  remains the source of truth.

Tests:
- lib: `permitted_tools_appends_upstream_after_builtins`,
  `is_advertised_upstream_matches_only_advertised_names`.
- bin: `list_upstream_handshake_and_mcp_dispatch` (end-to-end fake
  daemon → handshake → ListUpstream → mcp_* round-trip + unadvertised
  rejection), `list_upstream_propagates_error_result`.

Verification:
- `cargo check -p openfang-mcp-bridge` clean.
- `cargo check -p openfang-api` clean (no trait-method exhaustiveness
  regressions on the daemon side).
- `cargo test -p openfang-mcp-bridge --lib --bins` → 16/16 green
  (13 lib + 3 bin).

No behavior change for agents without upstream MCP servers configured —
they get `Ok { tools: vec![] }` from the daemon and the bridge continues
exactly as before.
At `McpConnection::discover_tools`, refuse to register any upstream tool
whose namespaced form (`mcp_{server}_{tool}`) shadows an OpenFang
built-in, or duplicates a name already registered for the same server.
Both skips emit a `WARN` trace and silently drop the tool.

This is the final security gate from the upstream-forwarding design doc
§5.2: built-ins always win over upstream MCP servers, so the bridge's
blind trust of `mcp_*` names returned by the daemon is justified.

Built-in shadowing is structurally impossible today (no built-in starts
with `mcp_`), but enforcing this is defense-in-depth — adds zero risk,
protects against any future built-in landing under `mcp_*`, and pins
the invariant in a drift test.

Changes:
- `openfang-runtime::mcp`:
  - `RESERVED_BUILTIN_NAMES: &[&str]` — single-sourced list mirroring
    `openfang-api::bridge_ipc::ALLOWED_TOOLS`.
  - `is_reserved_builtin(name) -> bool` — gate helper.
  - `register_discovered_tool(...)` — extracted per-tool registration
    so the collision logic is unit-testable without a live MCP
    transport. Called from `discover_tools`.
- `openfang-api::bridge_ipc`:
  - `reserved_builtin_names_matches_allowed_tools` drift test —
    asserts equality between `RESERVED_BUILTIN_NAMES` and
    `ALLOWED_TOOLS`. Adds to the existing drift detection suite that
    catches the analogous `built_in_tools()` ↔ `ALLOWED_TOOLS` drift.

Tests:
- `cargo test -p openfang-runtime --lib mcp::tests` → 11/11 (4 new:
  `reserved_builtin_names_includes_core_surface`,
  `register_discovered_tool_skips_builtin_shadow`,
  `register_discovered_tool_skips_duplicate_from_same_server`,
  `register_discovered_tool_drops_namespaced_collision_with_builtin`).
- `cargo test -p openfang-api --lib bridge_ipc` → 18/18 (1 new drift
  test).
- `cargo test -p openfang-mcp-bridge --lib --bins` → 16/16 (unchanged).
- `cargo clippy -p openfang-runtime -p openfang-api -p openfang-mcp-bridge
  --tests -- -D warnings` ✅
…P-02)

Adds is_sensitive_home_path() check in resolve_sandbox_path so MCP OAuth token caches under ~/.mcp-auth/ cannot be read even from agents whose workspace_root would otherwise resolve under $HOME. Same threat model as the prior commit covering ~/.ssh, ~/.aws, etc.

Findings ref: _shared/openfang/findings/mcp-forwarding/ MCP-02 (HIGH).
…6/08/09)

MCP-06 (MED): wrap dispatch_upstream_list_tools round-trip in tokio::time::timeout (LIST_UPSTREAM_TIMEOUT = 15s). Slow upstream MCP servers previously could hang every new bridge session at startup. On timeout, downgrades to an empty upstream surface (same shape as the existing error path).

MCP-08 (LOW): adds no_builtin_uses_mcp_prefix drift test in bridge_ipc.rs to lock the dispatch-ordering invariant that builtin tools never collide with the mcp__ namespace.

MCP-09 (INFO): documents the UTF-8 4-byte assumption on the truncation budget in bridge_ipc.rs.

Findings ref: _shared/openfang/findings/mcp-forwarding/.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant