Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 21 additions & 9 deletions src/ModelContextProtocol.Core/Client/StdioClientTransport.cs
Original file line number Diff line number Diff line change
Expand Up @@ -151,7 +151,17 @@ public async Task<ITransport> ConnectAsync(CancellationToken cancellationToken =
stderrRollingLog.Enqueue(data);
}

_options.StandardErrorLines?.Invoke(data);
try
{
_options.StandardErrorLines?.Invoke(data);
}
catch (Exception ex)
{
// Prevent exceptions in the user callback from propagating
// to the background thread that dispatches ErrorDataReceived,
// which would crash the process.
LogStderrCallbackFailed(logger, endpointName, ex);
}

LogReadStderr(logger, endpointName, data);
}
Expand Down Expand Up @@ -228,14 +238,13 @@ internal static void DisposeProcess(
process.KillTree(shutdownTimeout);
}

// Ensure all redirected stderr/stdout events have been dispatched
// before disposing. Only the no-arg WaitForExit() guarantees this;
// WaitForExit(int) (as used by KillTree) does not.
// This should not hang: either the process already exited on its own
// (no child processes holding handles), or KillTree killed the entire
// process tree. If it does take too long, the test infrastructure's
// own timeout will catch it.
if (!processRunning && HasExited(process))
// When a process has exited either on it's own or via KillTree, we must
// call WaitForExit() to ensure all redirected stderr/stdout events have
// been dispatched, otherwise ErrorDataReceived handlers may fire after
// Dispose().
// The HasExited check is here to avoid waiting forever on a process that
// failed to be killed.
if (HasExited(process))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this change or the comment on it. If HasExited return false, how does this prevent ErrorDataReceived from firing after Dispose?

Copy link
Contributor Author

@ericstj ericstj Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what Copilot says:

The guard is there because WaitForExit() (no-arg) blocks indefinitely if the process hasn't exited.

Consider the scenarios at that point in the code:

Scenario HasExited What happens
Process exited on its own true Call WaitForExit() to flush events ✓
KillTree killed it true Call WaitForExit() to flush events ✓
KillTree called but process didn't die (edge case) false Skip — avoids hanging forever
Process never started (processStarted=false) throws → true Caught by the outer try

Without the guard, an unkillable process (or a race between kill and exit check) would cause WaitForExit() to hang the calling thread indefinitely. The tradeoff is that in that rare edge case, some ErrorDataReceived events may not be fully dispatched before Dispose() — but that's better than a deadlock.

HasExited returns true when the process has exited, but the no-arg WaitForExit() does additional work beyond waiting for exit — it drains the async stdout/stderr output readers.

Here are the docs that confirm that indeed you must call WaitForExit() to flush redirected events.
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.process.waitforexit?view=net-10.0#system-diagnostics-process-waitforexit(system-int32)

Seems we could have a better name such an API - like Flush() or something.

Copy link
Contributor

@stephentoub stephentoub Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand that response. My question wasn't about whether there needs to be a guard on the HasExited call... there already was a guard prior to this PR. This PR is just changing that guard from:

if (!processRunning && HasExited(process))

to

if (HasExited(process))

and the value of processRunning is not impacted at all based on whether KillTree was successful or not.

Moreover, the comment that was added suggests that "we must call [WaitForExit] [...] otherwise ErrorDataReceived handlers may fire after Dispose()" but there's nothing here preventing that race condition when we don't call WaitForExit because the process hasn't yet exited by the time we get here. So the logic is lost on me.

What am I missing?

{
process.WaitForExit();
}
Expand Down Expand Up @@ -299,6 +308,9 @@ private static string EscapeArgumentString(string argument) =>
[LoggerMessage(Level = LogLevel.Information, Message = "{EndpointName} received stderr log: '{Data}'.")]
private static partial void LogReadStderr(ILogger logger, string endpointName, string data);

[LoggerMessage(Level = LogLevel.Warning, Message = "{EndpointName} StandardErrorLines callback failed.")]
private static partial void LogStderrCallbackFailed(ILogger logger, string endpointName, Exception exception);

[LoggerMessage(Level = LogLevel.Information, Message = "{EndpointName} started server process with PID {ProcessId}.")]
private static partial void LogTransportProcessStarted(ILogger logger, string endpointName, int processId);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,17 @@ public async Task CreateAsync_ValidProcessInvalidServer_StdErrCallbackInvoked()
Assert.Contains(id, sb.ToString());
}

[Fact(Skip = "Platform not supported by this test.", SkipUnless = nameof(IsStdErrCallbackSupported))]
public async Task CreateAsync_StdErrCallbackThrows_DoesNotCrashProcess()
{
StdioClientTransport transport = RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ?
new(new() { Command = "cmd", Arguments = ["/c", "echo fail >&2 & exit /b 1"], StandardErrorLines = _ => throw new InvalidOperationException("boom") }, LoggerFactory) :
new(new() { Command = "sh", Arguments = ["-c", "echo fail >&2; exit 1"], StandardErrorLines = _ => throw new InvalidOperationException("boom") }, LoggerFactory);

// Should throw IOException for the failed server, not crash the host process.
await Assert.ThrowsAnyAsync<IOException>(() => McpClient.CreateAsync(transport, loggerFactory: LoggerFactory, cancellationToken: TestContext.Current.CancellationToken));
}

[Theory]
[InlineData(null)]
[InlineData("argument with spaces")]
Expand Down
Loading