Network bridge silently dies without reconnection when transport exception occurs during broker info handshake #1864
Description
In DemandForwardingBridgeSupport.start(), the local and remote transport onException() handlers first check whether the corresponding broker info future is done, e.g. if (!futureLocalBrokerInfo.isDone()). When the future is NOT done (i.e., the broker info handshake has not yet completed), the handler cancels the future and returns early, skipping serviceLocalException() / serviceRemoteException() entirely.
    // localBroker onException handler (~line 227)
    public void onException(IOException error) {
        if (!futureLocalBrokerInfo.isDone()) {
            LOG.info("Error with pending local brokerInfo on: {} ({})", localBroker, error.getMessage());
            LOG.debug("Peer error: ", error);
            futureLocalBrokerInfo.cancel(true);
            return; // <-- BUG: skips serviceLocalException()
        }
        serviceLocalException(error);
    }

The same pattern exists in the remoteBroker onException() handler (~line 247).
Impact
The reconnection chain is never triggered: serviceLocalException()/serviceRemoteException() → fireBridgeFailed() → NetworkBridgeListener.bridgeFailed() → discoveryAgent.serviceFailed(). The network bridge silently dies and is never re-established. Recovery requires a broker restart.
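To make the chain concrete, here is a minimal, self-contained sketch of how a failure is expected to propagate. It is not the ActiveMQ implementation: the nested DiscoveryAgent, NetworkBridgeListener, and Bridge types are simplified stand-ins that only borrow the method names from the chain above, and simulateFailure() is a hypothetical helper for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ReconnectChainSketch {
    // Simplified stand-ins; the real ActiveMQ interfaces have richer signatures.
    interface DiscoveryAgent { void serviceFailed(String uri); }
    interface NetworkBridgeListener { void bridgeFailed(); }

    static class Bridge {
        private final NetworkBridgeListener listener;

        Bridge(NetworkBridgeListener listener) { this.listener = listener; }

        // Entry point of the chain; the early return in onException() never reaches it.
        void serviceLocalException(IOException error) { fireBridgeFailed(); }

        void fireBridgeFailed() { listener.bridgeFailed(); }
    }

    // Hypothetical helper: returns the URIs reported as failed to the discovery agent.
    static List<String> simulateFailure(String remoteUri) {
        List<String> failed = new ArrayList<>();
        DiscoveryAgent agent = failed::add;
        NetworkBridgeListener listener = () -> agent.serviceFailed(remoteUri);
        Bridge bridge = new Bridge(listener);
        bridge.serviceLocalException(new IOException("transport closed"));
        return failed;
    }

    public static void main(String[] args) {
        System.out.println(simulateFailure("tcp://remote:61616"));
    }
}
```

If serviceLocalException() is never called, the list stays empty. In this simplified model that corresponds to the discovery agent never learning that the bridge failed, so nothing schedules a reconnect.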
Evidence this is a bug
The duplexInboundLocalBroker onException() handler in the same file does NOT have this guard — it unconditionally calls serviceLocalException(error), confirming the intended behavior.
Fix
Remove the two return statements so that serviceLocalException() / serviceRemoteException() is always called regardless of the future's state. The future cancellation is still performed, but execution continues to trigger the reconnection path.
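The following is a rough, standalone sketch of the proposed handler shape, not a patch against DemandForwardingBridgeSupport. It assumes a CompletableFuture as a stand-in for the broker info future and a counter in place of the real serviceLocalException() body; the class name BridgeHandlerSketch and the counter field are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class BridgeHandlerSketch {
    // Stand-in for the pending broker info handshake (assumption: any Future works here).
    final CompletableFuture<String> futureLocalBrokerInfo = new CompletableFuture<>();

    // Counts how often the reconnection path was entered.
    final AtomicInteger serviceLocalExceptionCalls = new AtomicInteger();

    // Placeholder for the real method, which would go on to fire bridgeFailed().
    void serviceLocalException(IOException error) {
        serviceLocalExceptionCalls.incrementAndGet();
    }

    // Handler with the early return removed.
    void onException(IOException error) {
        if (!futureLocalBrokerInfo.isDone()) {
            // Still cancel the pending handshake so waiters are released.
            futureLocalBrokerInfo.cancel(true);
            // No return here: fall through to the reconnection path.
        }
        serviceLocalException(error);
    }

    public static void main(String[] args) {
        BridgeHandlerSketch bridge = new BridgeHandlerSketch();
        bridge.onException(new IOException("connection reset during handshake"));
        System.out.println("future cancelled: " + bridge.futureLocalBrokerInfo.isCancelled());
        System.out.println("serviceLocalException calls: " + bridge.serviceLocalExceptionCalls.get());
    }
}
```

With the handshake still pending, the future is cancelled and the service-exception path runs once, which is the behavior the fix is meant to guarantee.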
Production Validation
Tested in a production environment with 15 network connectors over 3.5 months. The bug path was hit 12 times (transport exceptions during broker info handshake), and all 12 resulted in successful reconnection after the fix. Prior to the fix, these would have been permanent connector losses.
Component
activemq-broker / Network Connectors