Graceful SSE drain on session manager shutdown#2239

Closed
wiggzz wants to merge 1 commit into modelcontextprotocol:main from wiggzz:wtj/mcp-graceful-drain

@wiggzz wiggzz commented Mar 6, 2026

Summary

Fixes #1272
Fixes #1739

When StreamableHTTPSessionManager.run() exits (e.g., during server shutdown), active SSE streams are abruptly cancelled by the task group cancellation. This means EventSourceResponse never sends its final more_body=False chunk, causing the TCP connection to reset. Reverse proxies like nginx interpret this as "upstream prematurely closed connection" and return 502 to clients.

This PR fixes the issue by terminating all active transports before cancelling the task group during shutdown. transport.terminate() closes the in-memory anyio streams, which causes sse_writer to exit cleanly → sse_stream_writer closes → EventSourceResponse._stream_response iterator ends → more_body=False is sent → clean HTTP close.
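The difference between the two shutdown paths can be illustrated with a toy model. This is only a sketch, not the SDK's code: it uses `asyncio` from the standard library rather than anyio, and `ToyTransport`, `_CLOSE`, and `shutdown` are hypothetical names invented for the illustration. The tuple `("body", b"", False)` stands in for the final `more_body=False` chunk.

```python
import asyncio

_CLOSE = object()  # hypothetical sentinel standing in for "in-memory stream closed"


class ToyTransport:
    """Toy stand-in for a transport; not the SDK's StreamableHTTP transport."""

    def __init__(self):
        self.queue = asyncio.Queue()
        self.sent = []  # chunks "written" to the HTTP response, in order

    async def sse_writer(self):
        # Forward events until the stream is closed, then emit the final chunk.
        while True:
            item = await self.queue.get()
            if item is _CLOSE:
                break
            self.sent.append(("body", item, True))  # more_body=True
        self.sent.append(("body", b"", False))  # more_body=False: clean close

    async def terminate(self):
        # Closing the stream lets sse_writer fall out of its loop normally.
        await self.queue.put(_CLOSE)


async def shutdown(graceful: bool):
    transport = ToyTransport()
    task = asyncio.create_task(transport.sse_writer())
    await transport.queue.put(b"data: hello\n\n")
    await asyncio.sleep(0)  # let the writer forward the first event

    if graceful:
        # Terminate first, then wait for the writer to finish on its own.
        await transport.terminate()
        await asyncio.wait_for(task, timeout=1)
    else:
        # Abrupt cancellation: the writer never reaches its final append.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
    return transport.sent


if __name__ == "__main__":
    print("graceful:", asyncio.run(shutdown(True)))
    print("abrupt:  ", asyncio.run(shutdown(False)))
```

In the abrupt case the list of sent chunks ends with a `more_body=True` chunk, which is the situation a reverse proxy would see as a prematurely closed upstream connection.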

Changes

  • streamable_http_manager.py: Track in-flight stateless transports in _stateless_transports set. In run()'s finally block, call terminate() on all stateful and stateless transports before tg.cancel_scope.cancel().
  • test_streamable_http_manager.py: Add E2E tests for both stateless and stateful modes that verify the SSE stream closes cleanly when the manager shuts down while a tool call is in-flight.
  • streamable_http.py: Remove unnecessary pragma: no cover annotations on lines already covered by the test suite.
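The bookkeeping described in the first bullet might look roughly like the following sketch. It is written against `asyncio` instead of anyio task groups, and `ToyManager`, `ToyTransport`, `handle_request`, and `_tasks` are hypothetical names; only `_stateless_transports`, `run()`, and `terminate()` come from the description above. The bounded wait before cancelling is an assumption of this sketch, not something the PR text confirms.

```python
import asyncio
from contextlib import asynccontextmanager


class ToyTransport:
    """Hypothetical transport: serves until terminated."""

    def __init__(self):
        self.closed = asyncio.Event()
        self.clean_close = False

    async def serve(self):
        await self.closed.wait()
        self.clean_close = True  # reached only if we were not cancelled first

    async def terminate(self):
        self.closed.set()


class ToyManager:
    """Hypothetical manager that tracks in-flight stateless transports."""

    def __init__(self):
        self._stateless_transports = set()
        self._tasks = set()

    async def handle_request(self):
        transport = ToyTransport()
        self._stateless_transports.add(transport)

        async def _run():
            try:
                await transport.serve()
            finally:
                # Stop tracking the transport once its request is finished.
                self._stateless_transports.discard(transport)

        self._tasks.add(asyncio.create_task(_run()))
        return transport

    @asynccontextmanager
    async def run(self):
        try:
            yield
        finally:
            # Step 1: terminate every tracked transport so handlers exit cleanly.
            for transport in list(self._stateless_transports):
                await transport.terminate()
            # Step 2 (assumption of this sketch): wait briefly, then cancel
            # whatever is still running.
            if self._tasks:
                _done, pending = await asyncio.wait(self._tasks, timeout=1)
                for task in pending:
                    task.cancel()


async def demo():
    manager = ToyManager()
    async with manager.run():
        transport = await manager.handle_request()
        await asyncio.sleep(0)  # let the handler start serving
    return transport.clean_close, len(manager._stateless_transports)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the transports in a set and discarding each one in a `finally` block means the shutdown path only ever touches requests that are still in flight.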

Motivation

In production, every rolling deploy triggers connection errors at our gateway since the streams are not terminated gracefully.

Breaking changes

None. This is an internal implementation change that only affects shutdown behavior.

wiggzz added a commit to dbt-labs/mcp-python-sdk that referenced this pull request Mar 6, 2026
Terminate all active transports before cancelling the task group during
shutdown. This closes in-memory anyio streams cleanly, allowing
EventSourceResponse to send a final `more_body=False` chunk — a clean
HTTP close instead of a connection reset that triggers "upstream
prematurely closed connection" errors at reverse proxies.

Changes:
- Track in-flight stateless transports in _stateless_transports set
- In run() finally block, call terminate() on all transports (both
  stateful and stateless) before tg.cancel_scope.cancel()
- Add E2E tests for graceful shutdown in both stateless and stateful
  modes using httpx.ASGITransport

Upstream PR: modelcontextprotocol#2239
@wiggzz wiggzz force-pushed the wtj/mcp-graceful-drain branch 4 times, most recently from b9909d8 to 05a1639 March 9, 2026 15:23
wiggzz added a commit to dbt-labs/mcp-python-sdk that referenced this pull request Mar 9, 2026
Terminate all active transports before cancelling the task group during
shutdown, allowing EventSourceResponse to send a final more_body=False
chunk for clean HTTP close instead of a connection reset.

Upstream PR: modelcontextprotocol#2239
@wiggzz wiggzz force-pushed the wtj/mcp-graceful-drain branch from 05a1639 to 5112436 March 9, 2026 16:48
Terminate all active transports before cancelling the task group during
StreamableHTTPSessionManager shutdown.  This closes their in-memory
streams, allowing EventSourceResponse to send a final `more_body=False`
chunk — a clean HTTP close instead of a connection reset.

Without this, reverse proxies like nginx see "upstream prematurely
closed connection" and return 502 to clients during rolling deploys.

Changes:
- Track in-flight stateless transports in `_stateless_transports` set
- In `run()` finally block, call `terminate()` on all stateful and
  stateless transports before `tg.cancel_scope.cancel()`
- Add E2E tests for both stateless and stateful modes that verify
  the SSE stream closes cleanly when the manager shuts down while
  a tool call is in-flight
@wiggzz wiggzz force-pushed the wtj/mcp-graceful-drain branch from 5112436 to 1a82b51 March 9, 2026 17:15
wiggzz commented Mar 9, 2026

I'm going to move this into #2145, since it is related (it deals with cancellation logic during shutdown). Without this change, the other PR would introduce a regression in the connection-closure logic.

@wiggzz wiggzz closed this Mar 9, 2026

Development

Successfully merging this pull request may close these issues.

  • Support clean server shutdown
  • Server hangs when shutting down if a connection is still open
