Skip to content

agentHost: keep SSH connection registered after incompatible handshake#319455

Draft
roblourens wants to merge 4 commits into
mainfrom
agents/vsckb-implement-please-investigate-and-try-to-fi-9f9f15ed
Draft

agentHost: keep SSH connection registered after incompatible handshake#319455
roblourens wants to merge 4 commits into
mainfrom
agents/vsckb-implement-please-investigate-and-try-to-fi-9f9f15ed

Conversation

@roblourens
Copy link
Copy Markdown
Member

@roblourens roblourens commented Jun 1, 2026

Fixes #319381

Problem

When an SSH-bootstrapped remote agent host fails its handshake because the remote rejects the client's protocol version, the renderer surfaces the error correctly but the "Update Server" upgrade flow is unusable.

Root cause: _setupConnection in SshRemoteAgentHostServiceImpl throws on UnsupportedProtocolVersion before calling IRemoteAgentHostService.addManagedConnection. With no entry in the platform service, triggerServerUpgrade cannot locate the still-open SSH relay and the upgrade RPC has no transport to ride on.

This matched the WebSocket path's earlier behaviour for the same case, which was already fixed by registering an incompatible managed connection. SSH (and tunnels) needed the same treatment.

Fix

  • Extend IRemoteAgentHostService.addManagedConnection with an optional status: RemoteAgentHostConnectionStatus parameter (defaults to connected, so existing callers are unchanged).
  • In SSH _setupConnection (and the parallel tunnel/web-tunnel paths), wrap the handshake in a try/catch. If RemoteAgentHostConnectionStatus.fromConnectError classifies the error as incompatible:
    • Register the managed connection with status: incompatible, keeping the handle and protocolClient alive so triggerServerUpgrade can reuse the transport.
    • Rethrow the original error so the caller (e.g. connectWithProgress) still surfaces the failure to the user.
  • For any other failure, dispose protocolClient and the underlying transport as before and rethrow.
  • On SSH reconnect after a previous incompatible handshake, drop the stale handle and force a fresh handshake against the upgraded server so the renderer doesn't latch onto a permanently-incompatible client.

Applies to:

  • sshRemoteAgentHostServiceImpl.ts
  • tunnelAgentHostServiceImpl.ts
  • webTunnelAgentHostService.ts

Tests

  • remoteAgentHostService.test.ts: new test asserts that an incompatible managed connection is reachable for triggerServerUpgrade while getConnection still reports it as not connected.
  • sshRemoteAgentHostService.test.ts: new tests assert that (a) an incompatible handshake registers the managed connection with incompatible status, does not tear down the relay, and still rejects the connect promise with the original ProtocolError; (b) reconnect after an incompatible handshake replaces the stale handle and re-handshakes against the upgraded server.

Validation

  • compile-check-ts-native, valid-layers-check, precommit hygiene all clean.
  • Targeted unit tests for remoteAgentHostService and sshRemoteAgentHostService all green.
  • Manual end-to-end: with a pinned-old-server CLI on a remote macOS host, triggered an UnsupportedProtocolVersion handshake against a current dev build, clicked "Update supervisor reported the upgrade RPC and restarted on the new commit, and the desktop dropped the stale incompatible handle, re-handshook against the upgraded server, and connected cleanly.Server"

Notes for reviewers

  • The platform-level IConnectionEntry still carries both connected: boolean and status: RemoteAgentHostConnectionStatus. This PR keeps the former derived from the latter via isConnected(status) but does not deduplicate the two worth a follow-up to converge call sites on one source of truth.fields

(Written by Copilot)

When the remote server rejects the client's protocol version, SSH
_setupConnection used to throw before calling addManagedConnection. That
left IRemoteAgentHostService with no entry for the host, so
triggerServerUpgrade could not find the still-open relay and the 'Update
Server' flow was unusable from an SSH agent host.

Extend addManagedConnection with an optional status parameter, and
update the SSH path to detect incompatible handshakes via
RemoteAgentHostConnectionStatus.fromConnectError. On an incompatible
result, register the managed connection with status 'incompatible' and
keep the relay (handle + protocolClient) alive so triggerServerUpgrade
can reuse it, while still rethrowing the original error so the caller
surfaces the failure to the user.

Adds unit tests at both layers.

Fixes #319381

(Written by Copilot)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 1, 2026 21:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes the “Update Server” flow for SSH-bootstrapped remote agent hosts when the initial protocol handshake fails due to an unsupported protocol version, by still registering the underlying transport as a managed connection (marked incompatible) so triggerServerUpgrade can reuse it.

Changes:

  • Extend IRemoteAgentHostService.addManagedConnection to accept an optional status (defaulting to connected).
  • Update the SSH connection setup flow to register an incompatible managed connection on UnsupportedProtocolVersion, while still rejecting the original connect attempt.
  • Add/extend unit tests to ensure incompatible connections remain upgrade-addressable but are not treated as “connected”.
Show a summary per file
File Description
src/vs/platform/agentHost/test/electron-browser/sshRemoteAgentHostService.test.ts Adds a test asserting SSH keeps an incompatible transport registered for upgrade without tearing down the relay.
src/vs/platform/agentHost/test/electron-browser/remoteAgentHostService.test.ts Adds a test ensuring triggerServerUpgrade works even when the managed connection is incompatible.
src/vs/platform/agentHost/electron-browser/sshRemoteAgentHostServiceImpl.ts Registers managed SSH connections with incompatible status on handshake mismatch (without disconnecting the tunnel).
src/vs/platform/agentHost/common/remoteAgentHostService.ts Extends the addManagedConnection API contract with an optional status parameter and documents its intent.
src/vs/platform/agentHost/browser/remoteAgentHostServiceImpl.ts Implements the new status parameter, deriving connected from status and returning the provided status.

Copilot's findings

  • Files reviewed: 5/5 changed files
  • Comments generated: 2

roblourens and others added 2 commits June 1, 2026 14:52
After the upgrade RPC succeeds and the SSH provider calls
`SSHRemoteAgentHostService.reconnect()`, the main-side
`replaceRelay` path tears down the old WebSocket relay and creates a
fresh  but it deliberately keeps the same `connectionId` soone
nothing else (like the WebSocket relay channel routing) breaks.

That meant `_setupConnection` saw the old renderer-side handle in its
`_connections` map and short-circuited:

  const existing = this._connections.get(result.connectionId);
  if (existing) { return existing; }

So the stale `RemoteAgentHostProtocolClient` (permanently stuck in
`incompatible` after the original handshake rejection) was reused,
even though the underlying server had just been upgraded and would
now happily accept our protocol version. The connection therefore
never recovered until the window was reloaded.

Only short-circuit on existing handles when the managed entry is
still in a usable state (`getConnection` returns the client). When
it isn't, drop the stale  without running itshandle
`disconnectFn`, since the main service kept the SSH client alive
across `replaceRelay` and we'd otherwise kill the brand-new tunnel
and fall through to a fresh handshake. The subsequent
`addManagedConnection` call replaces the stale managed entry and
disposes the old protocol client.

Adds a regression test that pins a stable `connectionId` across
`connect`/`reconnect` to mimic `replaceRelay` and asserts that
the second handshake produces a fresh client, a fresh `connected`
registration, and no main-tunnel disconnect.

(Written by Copilot)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…hake

Mirrors the SSH fix for the tunnel paths (desktop and web). When the
protocol handshake fails with `UnsupportedProtocolVersion`, the
service now registers the still-open relay as an `incompatible`
managed connection instead of disposing it, so the renderer's
`triggerServerUpgrade` flow can locate the protocol client and send
`_vscodeUpgrade` over the existing transport. (Written by Copilot)

Note: tunnels generate a fresh `connectionId` on every connect (unlike
SSH's `replaceRelay` path), so the corresponding "stale-handle on
reconnect" fix from the SSH side is not needed  reconnect alwayshere
creates a brand-new protocol client.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 7/7 changed files
  • Comments generated: 7

Match the production shape (server advertises supportedVersions:
['^' + PROTOCOL_VERSION] in protocolServerHandler.ts) instead of
date-like placeholders. No behavioural change.

(Written by Copilot)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remote AH server upgrade path not working for SSH

3 participants