Add service-mode FastMCP integration for local and remote agents#2263
Add service-mode FastMCP integration for local and remote agents#2263jdye64 wants to merge 14 commits into
Conversation
upload-artifact flattens nemo_retriever/dist into ./dist on download; use find instead of ./dist/nemo_retriever/dist/*. Re-run failed jobs does not pick up workflow changes — dispatch a new run after merging.
Greptile SummaryThis PR adds a FastMCP integration to the NeMo Retriever service mode, mounting a
|
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/service/mcp_server.py | New FastMCP facade over the service HTTP API; contains an unrestricted server-side file read via the path parameter accessible to any authenticated remote agent, no file-size guard before read_bytes(), redundant _json_or_text logic, and missing return type on build_mcp_app. |
| nemo_retriever/src/nemo_retriever/service/app.py | Mounts the FastMCP ASGI app at /mcp during create_app; the broad except Exception: catch during MCP init will silently degrade programming errors rather than surfacing them at startup. |
| nemo_retriever/src/nemo_retriever/service/config.py | Adds well-structured MCPConfig Pydantic model with path validation; defaults to enabled: true, which auto-enables the MCP endpoint for all existing deployments on upgrade. |
| nemo_retriever/src/nemo_retriever/service/cli.py | Adds mcp-stdio Typer command that wires CLI flags into ServiceMCPSettings and runs the FastMCP server over stdio; clean and straightforward. |
| nemo_retriever/tests/test_service_mcp.py | Adds three focused MCP tests using httpx.MockTransport; the auth-protection test assertion is too permissive (!= 401) and could pass on a broken mount. |
| nemo_retriever/src/nemo_retriever/service/retriever-service.yaml | Adds the mcp: config block to the default YAML; values match MCPConfig defaults and are clearly commented. |
| docs/docs/extraction/workflow-agentic-retrieval.md | Adds MCP access documentation covering stdio and HTTP usage; correctly advises remote agents to use base64 content rather than server paths. |
| .github/workflows/reusable-pypi-publish.yml | Minor comment improvement clarifying the upload-artifact flat-layout behavior; no logic change. |
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Agent as Agent (local/remote)
participant MCP as FastMCP at /mcp
participant Auth as BearerAuthMiddleware
participant Client as ServiceMCPClient
participant Service as Retriever Service HTTP
Note over Agent,Service: HTTP mode (remote agent)
Agent->>Auth: GET/POST /mcp (Bearer token)
Auth-->>Agent: 401 if token invalid
Auth->>MCP: Forward request
MCP->>Client: dispatch tool call
Client->>Service: HTTP /v1/ingest/job, /v1/query, /v1/answer, etc.
Service-->>Client: JSON response
Client-->>MCP: result dict
MCP-->>Agent: MCP tool result
Note over Agent,Service: stdio mode (local agent)
Agent->>MCP: stdin JSON-RPC
MCP->>Client: dispatch tool call
Client->>Service: HTTP calls to base_url
Service-->>Client: JSON response
Client-->>MCP: result dict
MCP-->>Agent: stdout JSON-RPC result
Note over Agent,Client: ingest_documents with path parameter
Agent->>MCP: "ingest_documents(path='/etc/passwd')"
MCP->>Client: _document_bytes reads path.read_bytes()
Client->>Service: "POST /v1/ingest/job/{id}/document (file bytes)"
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant Agent as Agent (local/remote)
participant MCP as FastMCP at /mcp
participant Auth as BearerAuthMiddleware
participant Client as ServiceMCPClient
participant Service as Retriever Service HTTP
Note over Agent,Service: HTTP mode (remote agent)
Agent->>Auth: GET/POST /mcp (Bearer token)
Auth-->>Agent: 401 if token invalid
Auth->>MCP: Forward request
MCP->>Client: dispatch tool call
Client->>Service: HTTP /v1/ingest/job, /v1/query, /v1/answer, etc.
Service-->>Client: JSON response
Client-->>MCP: result dict
MCP-->>Agent: MCP tool result
Note over Agent,Service: stdio mode (local agent)
Agent->>MCP: stdin JSON-RPC
MCP->>Client: dispatch tool call
Client->>Service: HTTP calls to base_url
Service-->>Client: JSON response
Client-->>MCP: result dict
MCP-->>Agent: stdout JSON-RPC result
Note over Agent,Client: ingest_documents with path parameter
Agent->>MCP: "ingest_documents(path='/etc/passwd')"
MCP->>Client: _document_bytes reads path.read_bytes()
Client->>Service: "POST /v1/ingest/job/{id}/document (file bytes)"
Comments Outside Diff (3)
-
nemo_retriever/src/nemo_retriever/service/mcp_server.py, line 609-624 (link)Unrestricted server-side file read via HTTP
/mcpendpointWhen the MCP app is mounted at
/mcp, authenticated remote agents can passpath: "/etc/passwd"(or any other server-visible path) in aningest_documentsrequest, and the server will read and upload that file's contents. There is no directory restriction or allowlist —Path(doc.path).expanduser()resolves the path freely. The documentation advises remote agents to usecontent_base64instead, but nothing in the code enforces this. An attacker who obtains the bearer token, or any authorized-but-malicious agent, can exfiltrate arbitrary files accessible to the service process (config files, TLS keys, credentials under~/.config, etc.).Prompt To Fix With AI
This is a comment left during a code review. Path: nemo_retriever/src/nemo_retriever/service/mcp_server.py Line: 609-624 Comment: **Unrestricted server-side file read via HTTP `/mcp` endpoint** When the MCP app is mounted at `/mcp`, authenticated remote agents can pass `path: "/etc/passwd"` (or any other server-visible path) in an `ingest_documents` request, and the server will read and upload that file's contents. There is no directory restriction or allowlist — `Path(doc.path).expanduser()` resolves the path freely. The documentation advises remote agents to use `content_base64` instead, but nothing in the code enforces this. An attacker who obtains the bearer token, or any authorized-but-malicious agent, can exfiltrate arbitrary files accessible to the service process (config files, TLS keys, credentials under `~/.config`, etc.). How can I resolve this? If you propose a fix, please make it concise.
-
nemo_retriever/src/nemo_retriever/service/mcp_server.py, line 610-614 (link)No file size limit before reading into memory
path.read_bytes()loads the entire file into memory without any size check. The security rules for this codebase require validating file sizes at every entry point to prevent OOM. A multi-GB file (e.g. a large PDF) submitted by an agent would be read in full before the HTTP upload even starts, potentially exhausting the server's heap.Prompt To Fix With AI
This is a comment left during a code review. Path: nemo_retriever/src/nemo_retriever/service/mcp_server.py Line: 610-614 Comment: **No file size limit before reading into memory** `path.read_bytes()` loads the entire file into memory without any size check. The security rules for this codebase require validating file sizes at every entry point to prevent OOM. A multi-GB file (e.g. a large PDF) submitted by an agent would be read in full before the HTTP upload even starts, potentially exhausting the server's heap. How can I resolve this? If you propose a fix, please make it concise.
-
nemo_retriever/tests/test_service_mcp.py, line 940-944 (link)Weak auth-protection assertion
assert authorized.status_code != 401only confirms the MCP endpoint doesn't return a 401, but would silently pass for a 500, 404, or 405 returned by a broken mount. Asserting a specific expected status code (e.g. 200 or 307) would catch regressions where the endpoint mounts but fails for an unrelated reason.Prompt To Fix With AI
This is a comment left during a code review. Path: nemo_retriever/tests/test_service_mcp.py Line: 940-944 Comment: **Weak auth-protection assertion** `assert authorized.status_code != 401` only confirms the MCP endpoint doesn't return a 401, but would silently pass for a 500, 404, or 405 returned by a broken mount. Asserting a specific expected status code (e.g. 200 or 307) would catch regressions where the endpoint mounts but fails for an unrelated reason. How can I resolve this? If you propose a fix, please make it concise.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix All With AI
Fix the following 6 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 6
nemo_retriever/src/nemo_retriever/service/mcp_server.py:609-624
**Unrestricted server-side file read via HTTP `/mcp` endpoint**
When the MCP app is mounted at `/mcp`, authenticated remote agents can pass `path: "/etc/passwd"` (or any other server-visible path) in an `ingest_documents` request, and the server will read and upload that file's contents. There is no directory restriction or allowlist — `Path(doc.path).expanduser()` resolves the path freely. The documentation advises remote agents to use `content_base64` instead, but nothing in the code enforces this. An attacker who obtains the bearer token, or any authorized-but-malicious agent, can exfiltrate arbitrary files accessible to the service process (config files, TLS keys, credentials under `~/.config`, etc.).
### Issue 2 of 6
nemo_retriever/src/nemo_retriever/service/mcp_server.py:521-523
`build_mcp_app` is a public function but is missing a return type annotation, violating the `type-hints-public-api` rule.
```suggestion
def build_mcp_app(settings: ServiceMCPSettings | None = None) -> Any:
"""Create the ASGI app mounted by ``retriever service start``."""
return build_mcp(settings).http_app(path="/")
```
### Issue 3 of 6
nemo_retriever/src/nemo_retriever/service/mcp_server.py:140-148
The first branch of `_json_or_text` is redundant. When `content-type` contains `"application/json"` the code calls `resp.json()` immediately; when it doesn't, the `try` block still calls `resp.json()` and only falls back to `{"text": resp.text}` on failure. The `if "application/json"` guard therefore never changes behaviour and can be removed.
```suggestion
@staticmethod
def _json_or_text(resp: httpx.Response) -> Any:
try:
return resp.json()
except ValueError:
return {"text": resp.text}
```
### Issue 4 of 6
nemo_retriever/src/nemo_retriever/service/mcp_server.py:610-614
**No file size limit before reading into memory**
`path.read_bytes()` loads the entire file into memory without any size check. The security rules for this codebase require validating file sizes at every entry point to prevent OOM. A multi-GB file (e.g. a large PDF) submitted by an agent would be read in full before the HTTP upload even starts, potentially exhausting the server's heap.
### Issue 5 of 6
nemo_retriever/tests/test_service_mcp.py:940-944
**Weak auth-protection assertion**
`assert authorized.status_code != 401` only confirms the MCP endpoint doesn't return a 401, but would silently pass for a 500, 404, or 405 returned by a broken mount. Asserting a specific expected status code (e.g. 200 or 307) would catch regressions where the endpoint mounts but fails for an unrelated reason.
### Issue 6 of 6
nemo_retriever/src/nemo_retriever/service/app.py:257-259
**Bare `except Exception:` silently masks MCP init bugs**
Catching every exception here (including programming errors thrown by `build_mcp_app` or `combine_lifespans`) means a coding mistake introduced in `mcp_server.py` will degrade silently to "MCP not mounted" rather than surfacing at startup. Consider narrowing to `ImportError` for the optional-dependency case and letting other errors propagate, or at minimum catching as `except Exception as exc:` so the warning can include the exception message inline rather than relying solely on `exc_info=True`.
Reviews (1): Last reviewed commit: "Add service-mode FastMCP integration" | Re-trigger Greptile
| def build_mcp_app(settings: ServiceMCPSettings | None = None): | ||
| """Create the ASGI app mounted by ``retriever service start``.""" | ||
| return build_mcp(settings).http_app(path="/") |
There was a problem hiding this comment.
build_mcp_app is a public function but is missing a return type annotation, violating the type-hints-public-api rule.
| def build_mcp_app(settings: ServiceMCPSettings | None = None): | |
| """Create the ASGI app mounted by ``retriever service start``.""" | |
| return build_mcp(settings).http_app(path="/") | |
| def build_mcp_app(settings: ServiceMCPSettings | None = None) -> Any: | |
| """Create the ASGI app mounted by ``retriever service start``.""" | |
| return build_mcp(settings).http_app(path="/") |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/service/mcp_server.py
Line: 521-523
Comment:
`build_mcp_app` is a public function but is missing a return type annotation, violating the `type-hints-public-api` rule.
```suggestion
def build_mcp_app(settings: ServiceMCPSettings | None = None) -> Any:
"""Create the ASGI app mounted by ``retriever service start``."""
return build_mcp(settings).http_app(path="/")
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| @staticmethod | ||
| def _json_or_text(resp: httpx.Response) -> Any: | ||
| content_type = resp.headers.get("content-type", "") | ||
| if "application/json" in content_type: | ||
| return resp.json() | ||
| try: | ||
| return resp.json() | ||
| except ValueError: | ||
| return {"text": resp.text} |
There was a problem hiding this comment.
The first branch of
_json_or_text is redundant. When content-type contains "application/json" the code calls resp.json() immediately; when it doesn't, the try block still calls resp.json() and only falls back to {"text": resp.text} on failure. The if "application/json" guard therefore never changes behaviour and can be removed.
| @staticmethod | |
| def _json_or_text(resp: httpx.Response) -> Any: | |
| content_type = resp.headers.get("content-type", "") | |
| if "application/json" in content_type: | |
| return resp.json() | |
| try: | |
| return resp.json() | |
| except ValueError: | |
| return {"text": resp.text} | |
| @staticmethod | |
| def _json_or_text(resp: httpx.Response) -> Any: | |
| try: | |
| return resp.json() | |
| except ValueError: | |
| return {"text": resp.text} |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/service/mcp_server.py
Line: 140-148
Comment:
The first branch of `_json_or_text` is redundant. When `content-type` contains `"application/json"` the code calls `resp.json()` immediately; when it doesn't, the `try` block still calls `resp.json()` and only falls back to `{"text": resp.text}` on failure. The `if "application/json"` guard therefore never changes behaviour and can be removed.
```suggestion
@staticmethod
def _json_or_text(resp: httpx.Response) -> Any:
try:
return resp.json()
except ValueError:
return {"text": resp.text}
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| except Exception: | ||
| mcp_asgi_app = None | ||
| logger.warning("FastMCP service integration failed to initialise; /mcp will not be mounted", exc_info=True) |
There was a problem hiding this comment.
Bare
except Exception: silently masks MCP init bugs
Catching every exception here (including programming errors thrown by build_mcp_app or combine_lifespans) means a coding mistake introduced in mcp_server.py will degrade silently to "MCP not mounted" rather than surfacing at startup. Consider narrowing to ImportError for the optional-dependency case and letting other errors propagate, or at minimum catching as except Exception as exc: so the warning can include the exception message inline rather than relying solely on exc_info=True.
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/service/app.py
Line: 257-259
Comment:
**Bare `except Exception:` silently masks MCP init bugs**
Catching every exception here (including programming errors thrown by `build_mcp_app` or `combine_lifespans`) means a coding mistake introduced in `mcp_server.py` will degrade silently to "MCP not mounted" rather than surfacing at startup. Consider narrowing to `ImportError` for the optional-dependency case and letting other errors propagate, or at minimum catching as `except Exception as exc:` so the warning can include the exception message inline rather than relying solely on `exc_info=True`.
How can I resolve this? If you propose a fix, please make it concise.
Adds a FastMCP integration for NeMo Retriever service mode so agents can use the service locally over stdio or remotely over HTTP.
Changes
Mounts a FastMCP HTTP endpoint at /mcp from retriever service start
Adds retriever service mcp-stdio for local stdio-based agent configs
Adds MCP tools for:
Supports remote document upload through inline/base64 content
Keeps write tools enabled by default
Reuses the existing service bearer-token middleware for /mcp
Documents local stdio and remote HTTP MCP usage
Adds focused MCP tests