Preserve MCP JSON schema structure in codex-rs #22166

Open
soheil-oai wants to merge 2 commits into main from soheil-oai/codex-rs-schema-handling-issue

soheil-oai commented May 11, 2026

Summary

  • preserve MCP/dynamic tool input schemas as raw JSON Schema when serializing tools for Responses
  • keep hand-authored/local Codex tools on the existing typed JsonSchema builders
  • limit compatibility normalization to the tool-parameter envelope: object root plus root properties
  • add regression coverage that serialized Responses payloads retain $defs, $ref, allOf, descriptions, additionalProperties, and nested {} property schemas
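As a rough illustration of what such a regression check asserts, the Python sketch below wraps a raw input schema in a Responses-style function tool payload, serializes it, and confirms that the listed keywords survive. The sample schema and the helper names `to_responses_tool` and `collect_keys` are hypothetical and exist only for this example; the actual tests live in the Rust `codex-tools` crate and are not shown in this description.

```python
import json

# Hypothetical MCP tool input schema that uses the constructs named above.
RAW_SCHEMA = {
    "type": "object",
    "description": "Create a page",
    "$defs": {"Parent": {"type": "object", "additionalProperties": True}},
    "properties": {
        "parent": {"$ref": "#/$defs/Parent"},
        "body": {"allOf": [{"type": "object"}, {"description": "Page body"}]},
        "anything": {},  # literal {} accepts any value
    },
    "additionalProperties": False,
}


def to_responses_tool(name, schema):
    """Wrap a raw schema as a non-strict function tool (illustrative shape)."""
    return {"type": "function", "name": name, "strict": False, "parameters": schema}


def collect_keys(node, found=None):
    """Recursively gather every object key that appears in a JSON value."""
    found = set() if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            collect_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)
    return found


# Serialize and parse back, as a stand-in for inspecting the outgoing payload.
serialized = json.dumps(to_responses_tool("create_page", RAW_SCHEMA))
parameters = json.loads(serialized)["parameters"]

expected = {"$defs", "$ref", "allOf", "description", "additionalProperties"}
assert expected <= collect_keys(parameters)
assert parameters["properties"]["anything"] == {}
```

Checking for key presence is deliberately coarse; comparing `parameters` with `RAW_SCHEMA` for full equality is a stricter variant of the same idea.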

Problem

Follow-up to the schema handling investigation and proposal:
https://www.notion.so/35d8e50b62b0813f8376d5ef25bdda22

The lossy step was parse_tool_input_schema: incoming MCP/dynamic schemas were forced into Codex's narrow typed JsonSchema representation. That representation is useful for hand-authored Codex tools, but it cannot faithfully model every provider-native schema shape ($ref, $defs, allOf, oneOf, literal {}, metadata siblings, and arbitrary permissive schemas).

Before this PR, the sanitizer filled those gaps by inventing types. In practice, unknown or permissive shapes could become string, so Codex sent Responses a less accurate schema than the connector exposed. The reported cases include nested provider objects such as Outlook-style $ref properties and permissive shapes such as Notion body.parent: {}.
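To make the failure mode concrete, here is a deliberately simplified Python stand-in for a lossy typed conversion. It is not the actual `parse_tool_input_schema` logic, which this description does not show; `lossy_sanitize` and the sample schema are hypothetical. It only demonstrates how a converter that understands nothing but `type` and `properties` ends up inventing `string` for a `$ref` property and for a literal `{}`.

```python
def lossy_sanitize(schema):
    """Toy typed conversion: keeps type/properties and guesses otherwise."""
    if not isinstance(schema, dict):
        return {"type": "string"}
    kind = schema.get("type")
    if kind == "object":
        props = schema.get("properties") or {}
        return {
            "type": "object",
            "properties": {key: lossy_sanitize(value) for key, value in props.items()},
        }
    if kind in ("string", "number", "integer", "boolean"):
        return {"type": kind}
    # No recognizable type ($ref, {}, allOf, ...): a type gets invented.
    return {"type": "string"}


# Hypothetical upstream schema shaped like the reported cases.
upstream = {
    "type": "object",
    "$defs": {
        "EventDateTime": {
            "type": "object",
            "properties": {
                "dateTime": {"type": "string"},
                "timeZone": {"type": "string"},
            },
        }
    },
    "properties": {
        "start": {"$ref": "#/$defs/EventDateTime"},
        "parent": {},
    },
}

result = lossy_sanitize(upstream)
print(result)
```

In this toy version both `start` and `parent` come out as `{"type": "string"}` and `$defs` is dropped entirely, so the model would be told to send a string where the connector expects an object.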

Approach

This PR avoids implementing JSON Schema semantics in Codex. Instead, MCP and dynamic tools now preserve their raw schema payload for serialization, while local/static tools continue using the typed JsonSchema builder API.

The raw-schema path only normalizes the root tool-parameter envelope:

  • non-object roots fall back to an empty object schema
  • missing/null root type becomes "object"
  • missing/null root properties becomes {}
  • nested schema content is otherwise preserved as provided

That means strict: false-compatible provider-native constructs such as $defs, $ref, allOf, oneOf, anyOf, additionalProperties, type arrays, descriptions, enums, and {} property schemas are no longer simplified before reaching Responses.
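The root-envelope rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Rust code in this PR: the function name `normalize_root_envelope` is hypothetical, "non-object root" is read as "the root JSON value is not an object", and the "empty object schema" is assumed to be `{"type": "object", "properties": {}}`.

```python
import copy


def normalize_root_envelope(schema):
    """Normalize only the root of a tool input schema; leave nested content alone."""
    # Assumption: a root that is not a JSON object falls back to an empty object schema.
    if not isinstance(schema, dict):
        return {"type": "object", "properties": {}}

    normalized = copy.deepcopy(schema)  # never mutate the caller's schema
    # A missing or null root type becomes "object".
    if normalized.get("type") is None:
        normalized["type"] = "object"
    # Missing or null root properties become {}.
    if normalized.get("properties") is None:
        normalized["properties"] = {}
    return normalized


# Nested $ref, $defs, and {} pass through untouched.
raw = {
    "$defs": {"Parent": {"type": "object"}},
    "properties": {"parent": {"$ref": "#/$defs/Parent"}, "extra": {}},
}
normalized = normalize_root_envelope(raw)
assert normalized["type"] == "object"
assert normalized["properties"] == raw["properties"]
assert normalized["$defs"] == raw["$defs"]
```

The function touches only the two root keys, which mirrors the stated goal of avoiding any interpretation of JSON Schema semantics below the root.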

Non-Goals

  • No $ref expansion or recursive ref resolution.
  • No sibling constraint merging.
  • No enum intersections.
  • No arbitrary allOf/oneOf/anyOf interpretation.
  • No broad schema rewriting beyond the root envelope compatibility rule.

The important principle is: if Responses accepts a non-strict schema shape, Codex should not preemptively simplify it.

Validation

  • cargo test -p codex-tools
  • just fmt
  • just fix -p codex-tools
  • git diff --cached --check

Base used for this clean branch: origin/main at 704ad620f625c92d83da1cc85e99bf15a7fb2f31.

Testing Methodology

In addition to the unit tests above, I validated the reported Outlook Calendar failure mode with local instrumentation on this branch:

  1. Added temporary opt-in logging at the MCP boundary to print the raw upstream rmcp::model::Tool schema and annotations before Codex normalization.

  2. Added temporary opt-in logging at the Responses conversion boundary to print the model-visible ResponsesApiTool.parameters payload after Codex conversion.

  3. Rebuilt the local CLI with that instrumentation and ran an end-to-end Outlook Calendar write through Responses API tool calling.

    The CLI was invoked with the Outlook Calendar connector enabled and destructive/open-world tools approved:

    CODEX_LOG_UPSTREAM_MCP_SCHEMAS='outlook calendar_create_event' \
    RUST_LOG=codex_tools=warn,codex_core::mcp_tool_call=debug \
    ./target/debug/codex \
      -c 'apps.connector_e6a7394682e24467ac68c60696f275a4.enabled=true' \
      -c 'apps.connector_e6a7394682e24467ac68c60696f275a4.destructive_enabled=true' \
      -c 'apps.connector_e6a7394682e24467ac68c60696f275a4.open_world_enabled=true' \
      -c 'apps.connector_e6a7394682e24467ac68c60696f275a4.default_tools_approval_mode="approve"' \
      exec --ephemeral --json \
      -C /Users/soheil/code/codex-pr-22166-test \
      'Use Outlook Calendar only. Create exactly one dummy event on my default Outlook calendar today Monday May 11, 2026 from 8:45 PM to 9:00 PM Eastern Time. Title it exactly: Codex Outlook schema smoke test (safe to delete). No attendees. Description: Created by local codex-rs Outlook schema smoke test; safe to delete. Report whether it succeeded, event id/link if available, and exact tool arguments used. Do not use Google Calendar. Do not create more than one event.'

    The prompt intentionally did not tell the model the create_event argument shape. It only described the user task.

  4. Confirmed the upstream Outlook Calendar create_event tool annotations were:

    {
      "readOnlyHint": false,
      "destructiveHint": true,
      "openWorldHint": true
    }

  5. Confirmed the upstream Outlook Calendar create_event schema exposes start and end as EventDateTime objects with nested dateTime and timeZone string properties.

  6. Confirmed the model-visible Responses tool schema still exposes start and end as objects with the same nested properties, rather than collapsing either field to string.

  7. Confirmed the Responses API tool call went through to the Outlook Calendar connector. The model inferred and sent object-shaped arguments without schema-shape hints:

    {
      "subject": "Codex Outlook schema smoke test (safe to delete)",
      "start": {
        "dateTime": "2026-05-11T20:45:00",
        "timeZone": "Eastern Standard Time"
      },
      "end": {
        "dateTime": "2026-05-11T21:00:00",
        "timeZone": "Eastern Standard Time"
      },
      "body_content": "Created by local codex-rs Outlook schema smoke test; safe to delete.",
      "body_content_type": "Text",
      "attendees": [],
      "calendar_id": null
    }

  8. Confirmed the connector completed successfully and created exactly one dummy Outlook Calendar event:

    • event id: AAMkADFiZDc5OGE0LWY5MmYtNDg0Ny1hZDRiLWRlMDJhOWRjM2Q4MwBGAAAAAADv5C0rm4T0TL1fE7-wwPEFBwDH912JMQjmSrZkxFhjLO2NAAAAAAENAADH912JMQjmSrZkxFhjLO2NAAA3ucCiAAA=
    • event link: https://outlook.office365.com/owa/?itemid=AAMkADFiZDc5OGE0LWY5MmYtNDg0Ny1hZDRiLWRlMDJhOWRjM2Q4MwBGAAAAAADv5C0rm4T0TL1fE7%2FwwPEFBwDH912JMQjmSrZkxFhjLO2NAAAAAAENAADH912JMQjmSrZkxFhjLO2NAAA3ucCiAAA%3D&exvsurl=1&path=/calendar/item
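The object-shape check from steps 6 and 7 can also be expressed as a small script. The Python sketch below parses the tool arguments shown in step 7 and confirms that start and end are objects carrying string `dateTime` and `timeZone` values. The helper name `is_event_date_time` is hypothetical, and this script was not part of the validation run described above.

```python
import json

# Tool arguments copied from step 7.
TOOL_ARGUMENTS = json.loads("""
{
  "subject": "Codex Outlook schema smoke test (safe to delete)",
  "start": {"dateTime": "2026-05-11T20:45:00", "timeZone": "Eastern Standard Time"},
  "end": {"dateTime": "2026-05-11T21:00:00", "timeZone": "Eastern Standard Time"},
  "body_content": "Created by local codex-rs Outlook schema smoke test; safe to delete.",
  "body_content_type": "Text",
  "attendees": [],
  "calendar_id": null
}
""")


def is_event_date_time(value):
    """True when value is an object with string dateTime and timeZone fields."""
    return (
        isinstance(value, dict)
        and isinstance(value.get("dateTime"), str)
        and isinstance(value.get("timeZone"), str)
    )


for field in ("start", "end"):
    assert is_event_date_time(TOOL_ARGUMENTS[field]), field

# A collapsed, string-typed value would fail the same check.
assert not is_event_date_time("2026-05-11T20:45:00")
```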

The earlier cancellation observed during manual testing was not a schema or Responses tool-calling failure. It came from using the wrong config override path (apps.apps.<connector_id>...) for the Outlook Calendar app. The working path is the flattened app config key apps.<connector_id>....

@github-actions
Contributor


Thank you for your submission; we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

soheil-oai force-pushed the soheil-oai/codex-rs-schema-handling-issue branch from 0e9c3bd to 7b19a8c on May 11, 2026 15:50
soheil-oai marked this pull request as ready for review on May 11, 2026 20:01
soheil-oai requested a review from a team as a code owner on May 11, 2026 20:01