Skip to content

Python: Sanitize MCP tool schemas before sending to LLM APIs#4607

Closed
giles17 wants to merge 2 commits intomicrosoft:mainfrom
giles17:fix/mcp-tool-schema-sanitization
Closed

Python: Sanitize MCP tool schemas before sending to LLM APIs#4607
giles17 wants to merge 2 commits intomicrosoft:mainfrom
giles17:fix/mcp-tool-schema-sanitization

Conversation

@giles17
Copy link
Contributor

@giles17 giles17 commented Mar 10, 2026

Motivation and Context

Fixes #4540

MCP servers built with schema-generating libraries (e.g. Go's google/jsonschema-go, used by matlab-mcp-core-server) produce inputSchema dicts containing standard JSON Schema fields ($schema, $id, title, $defs, $ref) that OpenAI-compatible API backends (LM Studio, Aliyun) reject with 400 invalid_request_error.

Simple MCP servers like mcp-server-calculator produce minimal schemas that work fine, making this a class-specific incompatibility rather than a general MCP failure.

Description

New utility: sanitize_schema_for_api() in _tools.py

Produces a clean copy of an MCP tool schema suitable for LLM API parameters fields:

  • Resolves $ref pointers inline from $defs/definitions (recursive)
  • Strips unsupported root-level keys ($schema, $id, title)
  • Removes $defs/definitions after resolution
  • Ensures type: "object" when properties is present but type is missing
  • Returns a new dict — never mutates the cached original

Applied in two code paths:

  • OpenAIResponsesClient._prepare_tools_for_openai() — also fixes a pre-existing bug where params["additionalProperties"] = False mutated the cached schema dict returned by parameters()
  • FunctionTool.to_json_schema_spec() — covers Chat Completions and Ollama clients

Tests:

  • 12 unit tests for sanitize_schema_for_api and _resolve_refs in test_tools.py
  • 3 integration tests in test_openai_responses_client.py validating the end-to-end path through _prepare_tools_for_openai()

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

MCP servers (e.g. matlab-mcp-core-server) can produce inputSchema dicts
with standard JSON Schema fields that OpenAI-compatible API backends reject.

Add sanitize_schema_for_api() utility that:
- Resolves $ref pointers inline from $defs/definitions
- Strips unsupported root-level keys ($schema, $id, title)
- Removes $defs/definitions after resolution
- Ensures type: object when properties is present
- Deep-copies to avoid mutating cached schemas

Apply sanitization in:
- OpenAIResponsesClient._prepare_tools_for_openai() (also fixes
  pre-existing bug where additionalProperties=False mutated the cache)
- FunctionTool.to_json_schema_spec() (covers Chat Completions + Ollama)

Fixes microsoft#4540

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 10, 2026 22:54
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Mar 10, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _tools.py8258190%168–169, 328, 330, 348–350, 358, 376, 390, 397, 404, 420, 422, 429, 437, 469, 494, 498, 515–517, 564–566, 588, 644, 669, 704, 836–842, 878, 889–900, 919–921, 925, 929, 943–945, 1284, 1304, 1380–1384, 1508, 1512, 1536, 1654, 1684, 1704, 1706, 1759, 1822, 2013–2014, 2065, 2134–2135, 2195, 2200, 2207
packages/core/agent_framework/openai
   _responses_client.py80112884%304–307, 311–312, 315–316, 322–323, 328, 341–347, 368, 376, 399, 495, 497, 594, 649, 653, 655, 657, 659, 727, 741, 821, 831, 836, 879, 958, 975, 988, 1049, 1140, 1145, 1149–1151, 1155–1156, 1200, 1229, 1235, 1245, 1251, 1256, 1262, 1267–1268, 1329, 1351–1352, 1367–1368, 1386–1387, 1428–1431, 1593, 1648, 1650, 1730–1738, 1857, 1912, 1927, 1947–1957, 1970, 1981–1985, 1999, 2009–2020, 2029, 2061–2064, 2072–2073, 2075–2077, 2091–2093, 2103–2104, 2110, 2125
TOTAL22764260388% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
4988 20 💤 0 ❌ 0 🔥 1m 21s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Python-side sanitization step for MCP/JSON-Schema tool parameter schemas before sending them to OpenAI-compatible LLM APIs, addressing backends that reject common JSON Schema metadata and $ref/$defs usage (Fixes #4540).

Changes:

  • Introduces sanitize_schema_for_api() (and _resolve_refs()) to inline local $ref from $defs/definitions and strip unsupported root-level metadata keys.
  • Applies schema sanitization in FunctionTool.to_json_schema_spec() (Chat Completions / Ollama paths) and OpenAIResponsesClient._prepare_tools_for_openai() (Responses path), also avoiding cached-schema mutation.
  • Adds unit tests for schema sanitization/ref resolution and integration tests for the Responses tool-prep path.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
python/packages/core/agent_framework/_tools.py Adds schema sanitization + $ref resolution and uses it when emitting JSON-schema tool specs.
python/packages/core/agent_framework/openai/_responses_client.py Sanitizes FunctionTool parameter schemas before building Responses API tool definitions and prevents mutating cached schemas.
python/packages/core/tests/core/test_tools.py Adds unit tests covering sanitization behaviors and _resolve_refs deep-copy behavior.
python/packages/core/tests/openai/test_openai_responses_client.py Adds integration tests validating sanitized tool schemas and no cached-schema mutation in Responses tool prep.

You can also share your feedback on Copilot code review. Take the survey.

- Merge both $defs and definitions so refs using either prefix resolve
- Remove redundant cast(list[Any], ...) flagged by mypy
- Add test for schema with both $defs and definitions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Contributor

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 91%

✗ Correctness

The new _resolve_refs function has no guard against circular $ref references, which are common in JSON Schema (e.g., tree-structured types). A self-referential schema from an MCP server will cause infinite recursion and a RecursionError at runtime, crashing the request. Additionally, the unresolvable-$ref fallback path skips recursion on sibling values, so nested $refs in those siblings would silently survive unsanitized.

✗ Security Reliability

The new _resolve_refs function has no cycle detection and will hit Python's recursion limit (or stack overflow) on circular $ref schemas. Circular references are valid in JSON Schema (e.g., tree or linked-list types) and can appear in schemas from untrusted MCP servers, making this a reliability issue at a trust boundary. The rest of the sanitization logic and tests are solid.

✓ Test Coverage

The new sanitize_schema_for_api and _resolve_refs functions are thoroughly unit-tested with 13 cases covering empty input, root-key stripping, simple/nested/array $ref resolution, deep-copy safety, and a real-world Go-jsonschema scenario. Integration tests for _prepare_tools_for_openai cover sanitization and non-mutation of cached schemas. However, there are two notable gaps: (1) the to_json_schema_spec() method was also changed to call sanitize_schema_for_api but has no corresponding test, and (2) there is no test for circular $ref chains, which would currently cause infinite recursion in _resolve_refs. A minor gap is that the sibling-key merging logic in _resolve_refs (keeping e.g. description from a $ref node) is exercised by no test.

✗ Design Approach

The approach of sanitizing JSON Schema at the API boundary is sound and well-tested. However, _resolve_refs performs recursive inlining of $ref pointers without any cycle detection. Circular $ref is valid JSON Schema (e.g., a tree node referencing itself) and is common in real-world schemas. This will cause infinite recursion and a stack overflow, which is a design-level gap — the function needs to either detect cycles and stop, or cap recursion depth.

Flagged Issues

  • _resolve_refs will infinitely recurse on circular $ref schemas (e.g. a 'Node' type whose 'children' field references itself). This is a common JSON Schema pattern and will cause RecursionError at runtime when an MCP server returns such a schema.
  • _resolve_refs has no cycle detection: a circular $ref (e.g., TypeA → TypeB → TypeA, or self-referencing TypeA → TypeA) will cause unbounded recursion and a RecursionError. Since MCP server schemas are untrusted input, this is a denial-of-service / crash risk.
  • _resolve_refs has no cycle detection: a self-referential $ref (e.g. a tree node type) will cause infinite recursion / stack overflow. Circular $ref is valid JSON Schema and common in practice. The function should track a visited set of definition names (or cap depth) and emit a best-effort fallback when a cycle is detected.

Suggestions

  • Add a seen set parameter to _resolve_refs tracking which definition names are currently being resolved. If a def_name is already in seen, stop recursion (e.g., return the node without the $ref key). This prevents infinite recursion on self-referential schemas.
  • In the unresolvable-$ref fallback branch, recurse into the remaining sibling values rather than returning them as-is, so any nested $refs in siblings are still resolved.
  • Add a _seen (set of ref names) or max-depth parameter to _resolve_refs to break cycles, e.g., treat a re-visited ref the same as an unresolvable one.
  • Add a test with a self-referencing $ref (and a mutually-recursive pair) to assert graceful handling instead of RecursionError.
  • Add a test for FunctionTool.to_json_schema_spec() with a schema containing $schema/$ref/$defs to verify that code path also sanitizes. Currently only _prepare_tools_for_openai has integration coverage; to_json_schema_spec is a separate call site with no test.
  • Add a test (and likely a fix) for circular $ref chains (e.g. TypeA → TypeB → TypeA). _resolve_refs will currently recurse infinitely and raise RecursionError. Consider adding a seen set to guard against cycles.
  • Add a test for the sibling-key merging behaviour in _resolve_refs (lines 697-700), e.g. a $ref node that also carries a description override — verify the description appears in the resolved output.
  • Consider stripping title from nested nodes as well, not just the root — some LLM APIs (e.g. Gemini) reject title at any level in the schema.

Automated review by moonbox3's agents

Comment on lines +688 to +696
defs: The top-level ``$defs`` / ``definitions`` mapping to resolve against.

Returns:
A new dict with ``$ref`` pointers replaced by their resolved definitions.
"""
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circular $ref bug: if defs['A'] itself contains a $ref back to 'A' (directly or transitively), this _resolve_refs(resolved, defs) call recurses infinitely, raising RecursionError. Add a seen: set[str] parameter (defaulting to an empty set) and skip resolution when def_name in seen.

Suggested change
defs: The top-level ``$defs`` / ``definitions`` mapping to resolve against.
Returns:
A new dict with ``$ref`` pointers replaced by their resolved definitions.
"""
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if def_name in defs:
if def_name in _seen:
# Circular reference — return without the $ref to break the cycle
return {k: v for k, v in schema.items() if k != "$ref"}
resolved = dict(defs[def_name])
# Merge any sibling keys (e.g. description) from the referring node
for k, v in schema.items():
if k != "$ref" and k not in resolved:
resolved[k] = v
return _resolve_refs(resolved, defs, _seen | {def_name})

for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fallback drops $ref but does not recurse into the remaining sibling values. If any sibling key contains a nested dict with a resolvable $ref, it will not be resolved.

Suggested change
if def_name in defs:
return _resolve_refs({k: v for k, v in schema.items() if k != "$ref"}, defs)

Comment on lines +693 to +705
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
resolved = dict(defs[def_name])
# Merge any sibling keys (e.g. description) from the referring node
for k, v in schema.items():
if k != "$ref" and k not in resolved:
resolved[k] = v
return _resolve_refs(resolved, defs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circular $ref chains (common in JSON Schema for recursive types like trees) will recurse until RecursionError. Track visited ref names and bail out when a cycle is detected.

Suggested change
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
resolved = dict(defs[def_name])
# Merge any sibling keys (e.g. description) from the referring node
for k, v in schema.items():
if k != "$ref" and k not in resolved:
resolved[k] = v
return _resolve_refs(resolved, defs)
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs and def_name not in _seen:
resolved = dict(defs[def_name])
# Merge any sibling keys (e.g. description) from the referring node
for k, v in schema.items():
if k != "$ref" and k not in resolved:
resolved[k] = v
return _resolve_refs(resolved, defs, _seen | {def_name})

if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
resolved = dict(defs[def_name])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circular $ref chains (A → B → A) will cause infinite recursion here. Consider adding a seen: set[str] parameter to detect cycles and break out gracefully. There is no test covering this edge case.

Suggested change
resolved = dict(defs[def_name])
def _resolve_refs(schema: dict[str, Any], defs: dict[str, Any], _seen: set[str] | None = None) -> dict[str, Any]:
if _seen is None:
_seen = set()
if "$ref" in schema:
ref_path: str = schema["$ref"]
if ref_path in _seen:
return {k: v for k, v in schema.items() if k != "$ref"}
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
resolved = dict(defs[def_name])
# Merge any sibling keys (e.g. description) from the referring node
for k, v in schema.items():
if k != "$ref" and k not in resolved:
resolved[k] = v
return _resolve_refs(resolved, defs, _seen | {ref_path})
# Unresolvable $ref — drop it and keep sibling keys as a best-effort fallback
return {k: v for k, v in schema.items() if k != "$ref"}

assert result["properties"]["b"] == {"type": "integer"}


# endregion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no test that FunctionTool.to_json_schema_spec() produces a sanitized schema. This is a separate integration point from _prepare_tools_for_openai and should have its own test to prevent regressions.

assert result["properties"]["b"] == {"type": "integer"}


# endregion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test for _resolve_refs sibling-key merging: when a $ref node has additional keys (e.g. description), those should appear in the resolved output. Currently no test exercises lines 697-700 of _tools.py.

Comment on lines +690 to +700
Returns:
A new dict with ``$ref`` pointers replaced by their resolved definitions.
"""
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
resolved = dict(defs[def_name])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This recursive call will loop forever on circular $ref (e.g., a tree node whose children items reference the node definition itself). Pass a seen: set[str] through the recursion and short-circuit when a definition name is revisited — for example, emit {} or {"type": "object"} as a fallback for the cyclic reference.

Suggested change
Returns:
A new dict with ``$ref`` pointers replaced by their resolved definitions.
"""
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
resolved = dict(defs[def_name])
if "$ref" in schema:
ref_path: str = schema["$ref"]
# Only handle local fragment references: #/$defs/Name or #/definitions/Name
for prefix in ("#/$defs/", "#/definitions/"):
if ref_path.startswith(prefix):
def_name = ref_path[len(prefix) :]
if def_name in defs:
if def_name in _seen:
# Circular reference — emit best-effort fallback
return {k: v for k, v in schema.items() if k != "$ref"}
resolved = dict(defs[def_name])
# Merge any sibling keys (e.g. description) from the referring node
for k, v in schema.items():
if k != "$ref" and k not in resolved:
resolved[k] = v
return _resolve_refs(resolved, defs, _seen | {def_name})
# Unresolvable $ref — drop it and keep sibling keys as a best-effort fallback
return {k: v for k, v in schema.items() if k != "$ref"}

@giles17 giles17 marked this pull request as draft March 11, 2026 16:28
@giles17 giles17 closed this Mar 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: Local stdio MCP works for calculator but fails for official matlab-mcp-core-server on LM Studio /v1/responses

4 participants