fix: stabilize tray pairing and reconnect behavior #80
andyeskridge wants to merge 2 commits into openclaw:master
Pull request overview
Stabilizes the Windows tray “node mode” pairing state machine and improves reconnect recovery after gateway restarts, aligning tray behavior with current gateway responses (including NOT_PAIRED and hello-ok without auth.deviceToken).
Changes:
- Update tray pairing notifications to suppress duplicate Pending/Paired toasts/activity.
- Extend WindowsNodeClient to handle pairing-required errors, pairing events, and treat hello-ok as paired even without auth.deviceToken.
- Make WebSocket reconnect retry continuously with backoff until the gateway is reachable again.
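A retry-until-connected loop with a clamped backoff table could look roughly like the sketch below. The schedule values, the `ReconnectSketch` class, and the `tryConnect` and `delayFor` delegates are assumptions for illustration; only the index clamping (`Math.Min(attempt, BackoffMs.Length - 1)`) mirrors code quoted in this PR.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReconnectSketch
{
    // Hypothetical schedule; the real BackoffMs values are not shown on this page.
    private static readonly int[] BackoffMs = { 1000, 2000, 4000, 8000 };

    // Delay for a zero-based attempt index, clamped to the last table entry.
    public static int DelayFor(int attempt) =>
        BackoffMs[Math.Min(attempt, BackoffMs.Length - 1)];

    // Keeps retrying until tryConnect succeeds or the token is cancelled.
    // Returns the number of attempts that were needed.
    public static async Task<int> ReconnectAsync(
        Func<Task<bool>> tryConnect,
        Func<int, int> delayFor,
        CancellationToken token)
    {
        int attempts = 0;
        while (true)
        {
            // Task.Delay throws an OperationCanceledException once the token is cancelled.
            await Task.Delay(delayFor(attempts), token);
            attempts++;
            if (await tryConnect())
            {
                return attempts;
            }
        }
    }
}
```

Passing the delay function in (for example `ReconnectSketch.DelayFor`) keeps the loop testable with a zero-delay schedule.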
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/OpenClaw.Tray.WinUI/App.xaml.cs | Debounces pairing toasts/activity based on last observed pairing status. |
| src/OpenClaw.Shared/WindowsNodeClient.cs | Adds explicit pairing state tracking, handles NOT_PAIRED + pair events, and revises hello-ok handling. |
| src/OpenClaw.Shared/WebSocketClientBase.cs | Changes reconnect to loop with backoff until connected/disposed. |
Comments suppressed due to low confidence (1)
src/OpenClaw.Shared/WindowsNodeClient.cs:575
_pairingApprovedAwaitingReconnect is documented as “until the next successful reconnect”, but in the hello-ok path it is only cleared when auth.deviceToken is present. If the gateway approves and then reconnects without returning auth.deviceToken, this flag stays true indefinitely, causing incorrect state/logging on subsequent reconnects. Clear _pairingApprovedAwaitingReconnect after processing the first successful hello-ok post-approval (even when no device token is returned).
else if (_pairingApprovedAwaitingReconnect)
{
    _logger.Info("hello-ok arrived after pairing approval without auth.deviceToken; keeping local state paired.");
}
_logger.Info($"Node registered successfully! ID: {_nodeId ?? _deviceIdentity.DeviceId.Substring(0, 16)}");
_logger.Info($"[NODE] hello-ok auth present={hasAuthPayload}, receivedDeviceToken={receivedDeviceToken}, storedDeviceToken={!string.IsNullOrEmpty(_deviceIdentity.DeviceToken)}, pendingApproval={_isPendingApproval}, awaitingReconnect={_pairingApprovedAwaitingReconnect}");
_isPendingApproval = false;
_isPaired = true;
_logger.Info(string.IsNullOrEmpty(_deviceIdentity.DeviceToken)
    ? "Gateway accepted the node without returning a device token; treating this device as paired"
    : "Already paired with stored device token");
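One way to express the suggested clearing is sketched below. This is a self-contained illustration rather than the PR's code: `PairingFlags`, `OnPairingApproved`, and `OnHelloOk` are hypothetical names, and only the flag names echo the fields quoted above. The awaiting-reconnect flag is reset on the first successful hello-ok whether or not a device token arrives.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for the pairing fields on WindowsNodeClient.
public sealed class PairingFlags
{
    public bool IsPendingApproval { get; private set; }
    public bool IsPaired { get; private set; }
    public bool PairingApprovedAwaitingReconnect { get; private set; }

    // Called when a pairing approval is observed before the reconnect.
    public void OnPairingApproved()
    {
        IsPendingApproval = false;
        PairingApprovedAwaitingReconnect = true;
    }

    // Called for every successful hello-ok. Returns true when this hello-ok
    // follows an observed approval or carries a device token.
    public bool OnHelloOk(bool receivedDeviceToken)
    {
        bool approvalObserved = receivedDeviceToken || PairingApprovedAwaitingReconnect;
        IsPendingApproval = false;
        IsPaired = true;
        // Clear unconditionally so the flag cannot outlive the first reconnect.
        PairingApprovedAwaitingReconnect = false;
        return approvalObserved;
    }
}
```

With this shape, a second hello-ok without a token reports no new approval, which keeps the log line accurate on later reconnects.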
if (!wasPairedBeforeHello)
{
    _isPendingApproval = true;
    _logger.Info("Not yet paired - check 'openclaw devices list' for pending approval");
    _logger.Info($"To approve, run: openclaw devices approve {_deviceIdentity.DeviceId}");
    PairingStatusChanged?.Invoke(this, new PairingStatusEventArgs(
        PairingStatus.Pending,
        PairingStatus.Paired,
        _deviceIdentity.DeviceId,
        $"Run: openclaw devices approve {ShortDeviceId}..."));
}
else
{
    _isPendingApproval = false;
    _logger.Info("Already paired with stored device token");
    PairingStatusChanged?.Invoke(this, new PairingStatusEventArgs(
        PairingStatus.Paired,
        _deviceIdentity.DeviceId));
        "Pairing approved!"));
}
The PairingStatusChanged event emitted on hello-ok uses the message "Pairing approved!" even when approval may have happened previously (e.g., device already approved on the gateway but no auth.deviceToken is returned). This is likely to mislead users because hello-ok indicates successful registration, not necessarily a new approval action. Consider using a more neutral message (e.g., "Node connected" / "Node registration accepted") or only using "approved" wording when a pairing approval event was actually observed.
This issue also appears on line 563 of the same file.
See below for a potential fix:
var pairingMessage = (receivedDeviceToken || _pairingApprovedAwaitingReconnect)
    ? "Pairing approved!"
    : "Node registration accepted";
PairingStatusChanged?.Invoke(this, new PairingStatusEventArgs(
    PairingStatus.Paired,
    _deviceIdentity.DeviceId,
    pairingMessage));
var previousStatus = _lastNodePairingStatus;
_lastNodePairingStatus = args.Status;

try
{
    if (args.Status == OpenClaw.Shared.PairingStatus.Pending)
    {
        AddRecentActivity("Node pairing pending", category: "node", dashboardPath: "nodes", nodeId: args.DeviceId);
        // Show toast with approval instructions
        new ToastContentBuilder()
            .AddText(LocalizationHelper.GetString("Toast_PairingPending"))
            .AddText(string.Format(LocalizationHelper.GetString("Toast_PairingPendingDetail"), args.DeviceId.Substring(0, 16)))
            .Show();
        if (previousStatus != OpenClaw.Shared.PairingStatus.Pending)
        {
            AddRecentActivity("Node pairing pending", category: "node", dashboardPath: "nodes", nodeId: args.DeviceId);
            new ToastContentBuilder()
                .AddText(LocalizationHelper.GetString("Toast_PairingPending"))
                .AddText(string.Format(LocalizationHelper.GetString("Toast_PairingPendingDetail"), args.DeviceId.Substring(0, 16)))
                .Show();
        }
Pairing toast/activity suppression only keys off the previous status, not the device identity. If the device identity changes during the app session (e.g., device key regenerated) and the new device immediately enters Pending/Paired, the toast could be incorrectly suppressed because previousStatus matches. Track the last deviceId along with status (or reset _lastNodePairingStatus when args.DeviceId changes).
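A minimal sketch of keying the suppression on both device identity and status follows. `PairingNotificationGate` and `ShouldNotify` are hypothetical names, and the local `PairingStatus` enum is a stand-in for `OpenClaw.Shared.PairingStatus`, whose full member list is not shown here.

```csharp
#nullable enable
using System;

// Stand-in for OpenClaw.Shared.PairingStatus (assumed members).
public enum PairingStatus { Unknown, Pending, Paired }

// Hypothetical helper that remembers the last (deviceId, status) pair surfaced to the user.
public sealed class PairingNotificationGate
{
    private string? _lastDeviceId;
    private PairingStatus _lastStatus = PairingStatus.Unknown;

    // Returns true when a toast/activity entry should be shown for this event.
    public bool ShouldNotify(string deviceId, PairingStatus status)
    {
        bool sameDevice = string.Equals(_lastDeviceId, deviceId, StringComparison.Ordinal);
        bool changed = !sameDevice || _lastStatus != status;

        // Always record the latest observation, even when nothing is shown.
        _lastDeviceId = deviceId;
        _lastStatus = status;
        return changed;
    }
}
```

A repeated Pending event for the same device is suppressed, while the same status for a new device ID is surfaced again.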
while (!_disposed)
{
    await Task.Delay(delay, _cts.Token);
    var delay = BackoffMs[Math.Min(_reconnectAttempts, BackoffMs.Length - 1)];
    _reconnectAttempts++;
    _logger.Warn($"{ClientRole} reconnecting in {delay}ms (attempt {_reconnectAttempts})");
    RaiseStatusChanged(ConnectionStatus.Connecting);
ReconnectWithBackoffAsync now raises ConnectionStatus.Connecting on every retry iteration. In the tray app, OnNodeStatusChanged logs recent activity for every status change, so this can create noisy repeated "Node mode Connecting" entries while the gateway is down. Consider deduping/only emitting Connecting when transitioning from a non-connecting state, or moving the attempt counter into logs without raising a new status event each loop.
See below for a potential fix:
// Only emit "Connecting" once when entering the reconnect loop to avoid noisy duplicates.
RaiseStatusChanged(ConnectionStatus.Connecting);
while (!_disposed)
{
    var delay = BackoffMs[Math.Min(_reconnectAttempts, BackoffMs.Length - 1)];
    _reconnectAttempts++;
    _logger.Warn($"{ClientRole} reconnecting in {delay}ms (attempt {_reconnectAttempts})");
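The other option mentioned in the comment, emitting a status only on an actual transition, might be sketched like this. `StatusNotifier` is a hypothetical wrapper, and the `ConnectionStatus` members listed are assumptions beyond `Connecting` and `Error`, which appear in the quoted code.

```csharp
#nullable enable
using System;

// Stand-in for the shared ConnectionStatus enum (member list assumed).
public enum ConnectionStatus { Disconnected, Connecting, Connected, Error }

// Hypothetical wrapper that forwards a status only when it differs from the last one raised.
public sealed class StatusNotifier
{
    private ConnectionStatus? _last;

    public event Action<ConnectionStatus>? StatusChanged;

    // Returns true when the status was forwarded to subscribers.
    public bool Raise(ConnectionStatus status)
    {
        if (_last == status)
        {
            return false; // duplicate of the previous status, nothing to report
        }

        _last = status;
        StatusChanged?.Invoke(status);
        return true;
    }
}
```

With this in place the retry loop can call `Raise(ConnectionStatus.Connecting)` on every iteration and subscribers still see a single Connecting entry per outage.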
    if (string.Equals(errorCode, "NOT_PAIRED", StringComparison.OrdinalIgnoreCase))
    {
        if (_isPendingApproval)
        {
            return;
        }

        _isPendingApproval = true;
        _isPaired = false;
        _pairingApprovedAwaitingReconnect = false;

        var detail = $"Device {ShortDeviceId} requires approval";
        if (!string.IsNullOrWhiteSpace(pairingRequestId))
        {
            detail += $" (request {pairingRequestId})";
        }

        _logger.Info($"[NODE] Pairing required for this device; waiting for gateway approval. reason={pairingReason ?? "unknown"}, requestId={pairingRequestId ?? "none"}");
        PairingStatusChanged?.Invoke(this, new PairingStatusEventArgs(
            PairingStatus.Pending,
            _deviceIdentity.DeviceId,
            detail));
        return;
    }

    _logger.Error($"Node registration failed: {error} (code: {errorCode})");
    RaiseStatusChanged(ConnectionStatus.Error);
}
The new pairing state machine (handling NOT_PAIRED, node/device.pair.* events, and hello-ok without auth.deviceToken) isn’t covered by unit tests. There are existing WindowsNodeClientTests, but they only validate URL normalization; adding tests that feed representative JSON into ProcessMessageAsync/handlers would help prevent regressions in reconnect + pairing behavior.
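As a starting point, the transitions worth pinning down can be written against a simplified model. Everything below is an assumption for illustration: `PairingStateModel` is not the real WindowsNodeClient, JSON parsing is left out because the gateway message shape is not shown here, and the checks use plain exceptions rather than the project's test framework.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public enum PairingStatus { Unknown, Pending, Paired }

// Hypothetical, simplified model of the pairing transitions described in this PR.
public sealed class PairingStateModel
{
    public PairingStatus Status { get; private set; } = PairingStatus.Unknown;
    public List<PairingStatus> Emitted { get; } = new List<PairingStatus>();

    // Mirrors the NOT_PAIRED branch: emit Pending once, ignore repeats and other codes.
    public void OnRegistrationError(string? errorCode)
    {
        if (!string.Equals(errorCode, "NOT_PAIRED", StringComparison.OrdinalIgnoreCase)) return;
        if (Status == PairingStatus.Pending) return;
        Set(PairingStatus.Pending);
    }

    // hello-ok counts as paired whether or not a device token was returned,
    // so the parameter does not influence the resulting state.
    public void OnHelloOk(bool receivedDeviceToken) => Set(PairingStatus.Paired);

    private void Set(PairingStatus status)
    {
        Status = status;
        Emitted.Add(status);
    }
}

public static class PairingStateModelChecks
{
    public static void Run()
    {
        var model = new PairingStateModel();
        model.OnRegistrationError("NOT_PAIRED");
        model.OnRegistrationError("not_paired"); // repeat must not re-emit Pending
        model.OnHelloOk(receivedDeviceToken: false);

        Expect(model.Emitted.Count == 2, "expected exactly Pending then Paired");
        Expect(model.Emitted[0] == PairingStatus.Pending, "first event should be Pending");
        Expect(model.Status == PairingStatus.Paired, "hello-ok without a token should pair");
    }

    private static void Expect(bool condition, string message)
    {
        if (!condition) throw new InvalidOperationException(message);
    }
}
```

The same three expectations could be ported to real tests once representative gateway payloads are available to feed into ProcessMessageAsync.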
Summary
This updates the Windows tray node pairing flow to match current OpenClaw gateway behavior and fixes reconnect recovery after gateway restarts.
Why
The tray previously assumed a node was only paired if hello-ok returned auth.deviceToken and that token was persisted locally.
In practice, current gateway behavior for Windows WS nodes looks like this:
- NOT_PAIRED
- hello-ok
- auth.deviceToken is not always returned
Pending, false Unknown, and failed reconnect recovery after gateway restarts.
Changes
- Treat NOT_PAIRED as a real pairing-needed state instead of a generic registration error
- Treat hello-ok as paired even when auth.deviceToken is absent
Validation
Tested locally against a real gateway:
- Pending on NOT_PAIRED
- Connected
- system.notify verified)