-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat(core): add opt-in periodic ping for connection health monitoring #1717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
1938c02
41a36b3
d9d31e1
2a6dc38
6ec1f9c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| --- | ||
| "@modelcontextprotocol/core": minor | ||
| "@modelcontextprotocol/client": minor | ||
| "@modelcontextprotocol/server": minor | ||
| --- | ||
|
|
||
| feat: add opt-in periodic ping for connection health monitoring | ||
|
|
||
| Adds a `pingIntervalMs` option to `ProtocolOptions` that enables automatic | ||
| periodic pings to verify the remote side is still responsive. Per the MCP | ||
| specification, implementations SHOULD periodically issue pings to detect | ||
| connection health, with configurable frequency. | ||
|
|
||
| The feature is disabled by default. When enabled, pings begin after | ||
| initialization completes and stop automatically when the connection closes. | ||
| Failures are reported via the `onerror` callback without stopping the timer. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -33,6 +33,7 @@ import type { | |
| TaskCreationParams | ||
| } from '../types/index.js'; | ||
| import { | ||
| EmptyResultSchema, | ||
| getNotificationSchema, | ||
| getRequestSchema, | ||
| getResultSchema, | ||
|
|
@@ -92,6 +93,22 @@ export type ProtocolOptions = { | |
| * so they should NOT be included here. | ||
| */ | ||
| tasks?: TaskManagerOptions; | ||
|
|
||
| /** | ||
| * Interval (in milliseconds) between periodic ping requests sent to the remote side | ||
| * to verify connection health. When set, pings begin after initialization completes | ||
| * (`Client` starts them after the MCP handshake; `Server` starts | ||
| * them on `notifications/initialized`) and stop automatically when the connection closes. | ||
| * | ||
| * Per the MCP specification, implementations SHOULD periodically issue pings to | ||
| * detect connection health, with configurable frequency. | ||
| * | ||
| * Disabled by default (no periodic pings). Typical values: 15000-60000 (15s-60s). | ||
| * | ||
| * Ping failures are reported via the {@linkcode Protocol.onerror | onerror} callback | ||
| * and do not stop the periodic loop. | ||
| */ | ||
| pingIntervalMs?: number; | ||
| }; | ||
|
|
||
| /** | ||
|
|
@@ -308,6 +325,10 @@ export abstract class Protocol<ContextT extends BaseContext> { | |
|
|
||
| private _taskManager: TaskManager; | ||
|
|
||
| private _pingTimer?: ReturnType<typeof setTimeout>; | ||
| private _pingIntervalMs?: number; | ||
| private _closing = false; | ||
|
|
||
| protected _supportedProtocolVersions: string[]; | ||
|
|
||
| /** | ||
|
|
@@ -336,6 +357,7 @@ export abstract class Protocol<ContextT extends BaseContext> { | |
|
|
||
| constructor(private _options?: ProtocolOptions) { | ||
| this._supportedProtocolVersions = _options?.supportedProtocolVersions ?? SUPPORTED_PROTOCOL_VERSIONS; | ||
| this._pingIntervalMs = _options?.pingIntervalMs; | ||
|
|
||
| // Create TaskManager from protocol options | ||
| this._taskManager = _options?.tasks ? new TaskManager(_options.tasks) : new NullTaskManager(); | ||
|
|
@@ -454,6 +476,7 @@ export abstract class Protocol<ContextT extends BaseContext> { | |
| */ | ||
| async connect(transport: Transport): Promise<void> { | ||
| this._transport = transport; | ||
| this._closing = false; | ||
| const _onclose = this.transport?.onclose; | ||
| this._transport.onclose = () => { | ||
| try { | ||
|
|
@@ -490,6 +513,8 @@ export abstract class Protocol<ContextT extends BaseContext> { | |
| } | ||
|
|
||
| private _onclose(): void { | ||
| this.stopPeriodicPing(); | ||
|
|
||
| const responseHandlers = this._responseHandlers; | ||
| this._responseHandlers = new Map(); | ||
| this._progressHandlers.clear(); | ||
|
|
@@ -728,10 +753,70 @@ export abstract class Protocol<ContextT extends BaseContext> { | |
| return this._transport; | ||
| } | ||
|
|
||
| /** | ||
| * Starts sending periodic ping requests at the configured interval. | ||
| * Pings are used to verify that the remote side is still responsive. | ||
| * Failures are reported via the {@linkcode onerror} callback but do not | ||
| * stop the loop; pings continue until the connection is closed. | ||
| * | ||
| * This is not called automatically by the base {@linkcode Protocol.connect | connect()} | ||
| * method. `Client` calls it after the MCP initialization handshake | ||
| * (and on reconnection), and `Server` calls it when the | ||
| * `notifications/initialized` notification is received. Custom `Protocol` | ||
| * subclasses must call this explicitly after their own initialization. | ||
| * | ||
| * Has no effect if periodic ping is already running or if no interval | ||
| * is configured. | ||
| */ | ||
|
Comment on lines
+756
to
+770
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 The JSDoc for Extended reasoning...What the bug isThe JSDoc comment on The specific code pathLooking at Why existing code does not prevent the confusionThe second sentence of the JSDoc partially acknowledges the real behavior: "Subclasses that override ImpactAny developer implementing a custom Step-by-step proof
FixUpdate the JSDoc to accurately state that Regarding the duplicate refutationOne verifier flagged this as a duplicate of bug_004. Whether or not another report covers the same issue, the bug is independently confirmed and real. The JSDoc is factually wrong regardless of duplication — the duplication concern is a process question, not a technical refutation. |
||
| protected startPeriodicPing(): void { | ||
| if (this._pingTimer || !this._pingIntervalMs) { | ||
| return; | ||
| } | ||
|
|
||
| const schedulePing = (): void => { | ||
| this._pingTimer = setTimeout(async () => { | ||
| try { | ||
| await this._requestWithSchema({ method: 'ping' }, EmptyResultSchema, { | ||
| timeout: this._pingIntervalMs | ||
| }); | ||
| } catch (error) { | ||
| // Suppress errors caused by intentional shutdown | ||
| if (!this._closing) { | ||
| this._onerror(error instanceof Error ? error : new Error(`Periodic ping failed: ${String(error)}`)); | ||
| } | ||
| } finally { | ||
| // Schedule the next ping only if we have not been stopped | ||
| if (this._pingTimer) { | ||
| schedulePing(); | ||
| } | ||
| } | ||
| }, this._pingIntervalMs); | ||
|
|
||
| // Allow the process to exit even if the timer is still running | ||
| if (typeof this._pingTimer === 'object' && 'unref' in this._pingTimer) { | ||
| this._pingTimer.unref(); | ||
| } | ||
| }; | ||
|
|
||
| schedulePing(); | ||
| } | ||
|
|
||
| /** | ||
| * Stops periodic ping requests. Called automatically when the connection closes. | ||
| */ | ||
| protected stopPeriodicPing(): void { | ||
| if (this._pingTimer) { | ||
| clearTimeout(this._pingTimer); | ||
| this._pingTimer = undefined; | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Closes the connection. | ||
| */ | ||
| async close(): Promise<void> { | ||
| this._closing = true; | ||
| this.stopPeriodicPing(); | ||
| await this._transport?.close(); | ||
| } | ||
|
Comment on lines
817
to
821
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🔴 When close() is called while a periodic ping is in-flight, _onclose() rejects all pending response handlers with SdkError(ConnectionClosed), causing the ping catch block to fire this._onerror() with that error. Users monitoring onerror for connection health will receive spurious Connection closed errors during normal, intentional shutdowns that are indistinguishable from genuine connection failures. Fix: add a _closing boolean flag set before closing the transport, and check it in the ping catch block to suppress the error. Extended reasoning...The Race ConditionThe bug lives in startPeriodicPing() at packages/core/src/shared/protocol.ts around lines 778-804. The setInterval callback is async, so once it fires and awaits _requestWithSchema(), JavaScript cannot cancel it. Clearing the interval only prevents future firings; the already-started async function continues running. This is the core of the race. Exact Code Path
Why Existing Code Does Not Prevent ItThe close() method does call stopPeriodicPing() before this._transport?.close(), which looks like it should protect against this. But stopPeriodicPing() only prevents new timer firings. It cannot retroactively cancel an async function that has already started and is awaiting a promise. There is no mechanism in the ping catch block to distinguish intentional close from a genuine connection failure. ImpactAny consumer that sets onerror to monitor connection health (the primary intended use case per the PR description) will see false positive ConnectionClosed errors every time close() is called while a ping happens to be in-flight. Since pings fire on a regular interval, the race is probabilistic but reliably reproducible at scale or in tests. More critically, these errors are indistinguishable from a real remote-side failure, so automated reconnection logic or alerting built on onerror will misfire. How to FixAdd a private _closing boolean field to Protocol, initialized to false. In close(), set this._closing = true before calling stopPeriodicPing() and this._transport?.close(). In the ping catch block, guard the onerror call: } catch (error) {
if (!this._closing) {
this._onerror(error instanceof Error ? error : new Error('Periodic ping failed: ' + String(error)));
}
}Reset _closing to false in _onclose() after cleanup so that a reconnected protocol instance works correctly. The _closing flag cleanly suppresses the error regardless of what exact error code the transport produces on shutdown. Step-by-Step Proof |
||
|
|
||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.