fix: prevent silent failure on GET stream so clients don't hang#1943
fix: prevent silent failure on GET stream so clients don't hang#1943sicoyle wants to merge 16 commits intomodelcontextprotocol:mainfrom
Conversation
… POST-only MCP servers Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
|
In the spec, it says:
It seems the issue here is more the "then gives up failing silently." than the method not supported. Footnotes |
|
I responded on the issue for GitHub MCP but the spec is explicit about the GET support being optional: github/github-mcp-server#1951 (comment) |
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
There was a problem hiding this comment.
Pull request overview
Adds explicit handling for servers that reject GET-based SSE connections (HTTP 405) in the StreamableHTTP client, and introduces a regression test to ensure the client doesn’t hang in this scenario (e.g., GitHub MCP behavior).
Changes:
- Update
handle_get_streamto treat HTTP 405 as a non-retryable condition and exit the GET SSE loop after logging. - Add an async test that simulates a server returning 405 for GET and verifies the client continues to function via POST without retrying the GET stream.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/mcp/client/streamable_http.py |
Stops reconnect attempts on GET SSE when the server returns 405, avoiding a hang/race. |
tests/issues/test_streamable_http_405_get_stream.py |
Regression test covering 405-on-GET SSE behavior and ensuring no reconnect loop occurs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
Signed-off-by: Samantha Coyle <sam@diagrid.io>
When connecting to an MCP server over streamable HTTP (ex: GitHub’s MCP server) from Dapr Agents, the client could hang indefinitely on calls like list_tools() if run outside a debugger; with a debugger attached, the timing often hid the issue and pointed to a race.
Root cause: The streamable_http client starts a background GET stream for server‑sent events. If that GET fails (ex: 405 Method Not Allowed), the client retries twice with a 1s delay and then gives up without surfacing that failure. The GET task exits while the rest of the client still assumes the stream may be used. Subsequent requests that expect or coordinate with that stream then block forever on a dead stream.
Fix: For permanent GET failures such as 405, we no longer retry and we log a clear warning instead of failing silently. The client continues to use POST for request/response (e.g. list_tools()), so it no longer hangs when the GET stream has been abandoned. This addresses the silent-failure behavior; per the spec, servers are expected to support both POST and GET, but when the GET stream cannot be established we now fail visibly and avoid leaving the client in a hung state.
This is a fix for an agent hang that I saw in Dapr Agents when it was trying to connect to the Github MCP server using streamable HTTP Transport. I only saw the issue if I ran it locally in terminal, but it was fine within the scope of a debugger. That led me down the path of a race condition. The MCP streamable_http client starts a background GET stream for SSE events that immediately get a 405 err from Github (which doesn't support GET), retries twice with the 1sec delay, and then gives up failing silently. If I called to list tools after the GET stream exhausted it's retries, it would hang forever waiting for the SSE responses on the dead stream. This PR prevents this hang by handling the 405 error.Motivation and Context
I'm associated with the Dapr open source project (CNCF graduated project), and am a maintainer for Dapr Agents repo. We are experiencing issues that this PR helps to correct for our open source community.
dapr/dapr-agents#399
How Has This Been Tested?
I tested and confirmed this works against my repo for a release note agent that we want to eventually publicize and make into a github action that others can consume for open source project release notes when it's ready:
https://github.com/sicoyle/release-note-agent
I can confirm locally that this corrects at least my hanging issue.
Breaking Changes
Types of changes
Checklist
Additional context
related issues:
#1941
dapr/dapr-agents#399