feat: add interruptInUseConnections client option #4961

Open
italojs wants to merge 1 commit into mongodb:main from italojs:add-interrupt-in-use-connections-option

@italojs italojs commented Jun 10, 2026

Contributor

Summary

Adds a boolean interruptInUseConnections option to MongoClientOptions (default true, preserving the current SDAM spec behaviour).

When the SDAM monitor heartbeat times out, the driver clears the connection pool with interruptInUseConnections: true, which rejects in-flight operations with a PoolClearedOnNetworkError. For operations the driver does not retry — notably tailable getMore — this surfaces as an unrecoverable error after a transient blip such as a host being suspended/resumed (for example a laptop sleeping).

With interruptInUseConnections: false, a monitor network timeout still clears the pool (ResetPool) but does not interrupt in-use connections, so in-flight operations are left untouched and resume once the server is reachable again. The driver reconnects as usual.
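
To make the difference between the two settings concrete, here is a minimal toy model in TypeScript. It is only a sketch: `ToyPool`, `InFlightOp`, `serverReplies` and `simulateBlip` are hypothetical names invented for illustration and are not the driver's pool implementation. It assumes nothing beyond what is described above: clearing the pool always bumps its generation, and in-flight operations are rejected only when interruptInUseConnections is true.

```typescript
// Toy model only: these types are illustrative stand-ins, not driver internals.
type OpState = 'pending' | 'completed' | 'interrupted';

interface InFlightOp {
  name: string;
  state: OpState;
  error?: string;
}

class ToyPool {
  // Bumped on every clear, so connections from older generations are not reused.
  generation = 0;
  private inUse: InFlightOp[] = [];

  // Check out a "connection" and start an operation that stays pending
  // until the simulated server replies.
  startOperation(name: string): InFlightOp {
    const op: InFlightOp = { name, state: 'pending' };
    this.inUse.push(op);
    return op;
  }

  // Models the pool clear triggered by a monitor network timeout.
  clear(options: { interruptInUseConnections: boolean }): void {
    this.generation += 1; // the pool is cleared in both cases
    if (!options.interruptInUseConnections) {
      return; // in-use connections are left alone
    }
    for (const op of this.inUse) {
      op.state = 'interrupted';
      op.error = 'PoolClearedOnNetworkError';
    }
    this.inUse = [];
  }

  // Models the server becoming reachable again and answering pending operations.
  serverReplies(): void {
    for (const op of this.inUse) {
      op.state = 'completed';
    }
    this.inUse = [];
  }
}

// One transient blip: start a long-lived operation, clear the pool, then recover.
// The default of true mirrors the default proposed in this PR.
function simulateBlip(interruptInUseConnections: boolean = true): InFlightOp {
  const pool = new ToyPool();
  const getMore = pool.startOperation('tailable getMore');
  pool.clear({ interruptInUseConnections });
  pool.serverReplies();
  return getMore;
}

console.log(simulateBlip(true).state);  // interrupted
console.log(simulateBlip(false).state); // completed
```

In this model the only thing the option changes is whether `clear` touches the in-use list. The generation bump, which stands in for ResetPool, happens either way.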

Motivation

There is currently no supported way to opt out of the in-use connection interruption introduced with the interruptInUseConnections SDAM behaviour. This is the root of a long-standing "idle crash" reported downstream in Meteor (meteor/meteor#13108): when a laptop sleeps, the local mongod is frozen, the monitor times out, and the oplog tail's getMore — which the driver never retries — rejects with PoolClearedOnNetworkError and crashes the process. The same applies to any long-lived/non-retried operation across a transient network blip.


I don't have a NODE Jira ticket for this; I'm happy to reword the commit/title with one if a maintainer assigns it.

When the SDAM monitor heartbeat times out, the driver clears the connection
pool with interruptInUseConnections=true, rejecting in-flight operations with a
PoolClearedOnNetworkError. For operations the driver does not retry (notably
tailable getMore), this surfaces as an unrecoverable error after a transient
blip such as a host being suspended/resumed (e.g. a laptop sleeping).

Add a boolean interruptInUseConnections option to MongoClientOptions (default
true, preserving the current SDAM spec behaviour). When set to false, a monitor
network timeout still clears the pool (ResetPool) but does not interrupt in-use
connections, so such in-flight operations are left untouched and resume once the
server is reachable again.

Refs: meteor/meteor#13108
@italojs italojs requested a review from a team as a code owner June 10, 2026 15:56

italojs commented Jun 10, 2026

Contributor Author

Reproduction & verification (PoC)

A standalone, deterministic before/after proof of concept — plain Node + this driver + MongoDB in Docker, no Meteor:
https://github.com/italojs/meteor-idle-crash-13108

./run.sh installs the driver straight from this PR branch and runs the same uncaught in-flight tailable getMore across a simulated sleep/wake (docker pause/unpause):

  • default (interruptInUseConnections: true) → PoolClearedOnNetworkError → process exits 1 (the bug)
  • fixed (interruptInUseConnections: false) → process survives, exit 0 (the fix)

Same build in both runs; only the option changes. Output:

# 1) THE BUG -- default (interruptInUseConnections=true)
  PoolClearedOnNetworkError: Connection to 127.0.0.1:37017 interrupted due to server monitor timeout
    caused by: MongoNetworkTimeoutError: connection <monitor> to 127.0.0.1:37017 timed out
  -> exit code 1   (crashed)

# 2) THE FIX -- interruptInUseConnections=false
  RESULT: NO CRASH -- the in-use operation survived sleep/wake.
  -> exit code 0   (survived)
