fix: gracefully degrade on MCP server connection failure #1450

Open
YoungY620 wants to merge 3 commits into main from fix/mcp-connection-crash

@YoungY620 (Collaborator) commented Mar 16, 2026

Related Issue

Description

When an MCP server fails to connect (e.g., network issues or misconfiguration), the CLI previously raised MCPRuntimeError and crashed the entire session. Similarly, if an MCP tool call encountered a runtime error (e.g., server disconnected mid-call), the unhandled exception would propagate and crash the session.

This PR makes two key improvements:

  1. Graceful degradation on MCP connection failure: Instead of crashing, failed servers are logged as warnings and skipped. The remaining healthy servers still load normally. A single consolidated toast notification is emitted with priority: connection failure > authorization needed > success, preventing a success toast from overriding a failure toast.

  2. Resilient MCP tool execution: When an MCP tool call fails at runtime, the error is caught and returned as a ToolError instead of crashing the session, allowing the conversation to continue.
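The connection-side behavior in item 1 can be sketched roughly as follows. This is a minimal illustration and not the code in `toolset.py`: `FakeServer`, `connect_all`, and `demo` are hypothetical names, and the real connection logic may be structured differently.

```python
import asyncio
import logging

logger = logging.getLogger("mcp_sketch")


class FakeServer:
    """Hypothetical stand-in for an MCP server client (not a kimi_cli class)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    async def connect(self):
        if self.fail:
            raise ConnectionError(f"cannot reach {self.name}")


async def connect_all(servers):
    """Try every server; log and skip the ones that fail instead of raising."""
    connected, failed = [], []
    for server in servers:
        try:
            await server.connect()
        except Exception as exc:
            # Previously an error here would have aborted the whole session.
            logger.warning("MCP server %s failed to connect: %s", server.name, exc)
            failed.append(server.name)
            continue
        connected.append(server.name)
    return connected, failed


def demo():
    servers = [FakeServer("a"), FakeServer("b", fail=True), FakeServer("c")]
    return asyncio.run(connect_all(servers))
```

In this sketch the failed server is recorded so that a later step can report it, while the healthy servers on either side of it still load.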

Changes

  • src/kimi_cli/soul/toolset.py: Replaced MCPRuntimeError raise with warning logs + graceful skip on connection failure; consolidated toast notifications into a single emit with priority logic; wrapped MCP tool call exceptions in ToolError instead of re-raising
  • tests/core/test_toolset.py: Added 4 new test cases covering failure toast, authorization toast, success toast, and failure-over-unauthorized priority
  • .gitignore: Added playwright report/test-results paths

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked the related issue, if any.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run make gen-changelog to update the changelog.
  • I have run make gen-docs to update the user documentation.

…rashing

When an MCP server fails to connect (e.g. missing binary), log a warning
and continue with remaining servers instead of raising MCPRuntimeError.
Also catch all exceptions in MCPTool.__call__ to handle runtime server
disconnections gracefully by returning ToolError.
…ailure

The MCP background connection code sent multiple mutually-exclusive toasts
(failure, unauthorized, success) at the end of _connect(), causing the last
one to overwrite earlier ones. Consolidate into a single final_toast variable
with priority: failure > unauthorized > success. Also remove stray top-level
code in test_toolset.py that caused NameError on import.
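The priority rule from this commit message (failure > unauthorized > success) might look like the following. This is an illustrative sketch only: `pick_final_toast` and the message strings are assumptions, and the PR itself uses a `final_toast` variable inside `_connect()` whose exact shape is not shown here.

```python
from typing import List, Optional


def pick_final_toast(
    failed: List[str],
    unauthorized: List[str],
    connected: List[str],
) -> Optional[str]:
    """Choose one toast message; higher-priority outcomes win."""
    if failed:
        return f"MCP connection failed: {', '.join(failed)}"
    if unauthorized:
        return f"MCP authorization needed: {', '.join(unauthorized)}"
    if connected:
        return f"MCP servers connected: {', '.join(connected)}"
    # Nothing to report (for example, no servers configured).
    return None
```

Because only one value is returned and emitted, a later success message cannot overwrite an earlier failure message.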
@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.
