Skip to content

fix(browser): keep the extension bridge connection alive and its status in sync#113

Merged
cnjack merged 1 commit into
mainfrom
fix/browser-bridge-keepalive
Jul 4, 2026
Merged

fix(browser): keep the extension bridge connection alive and its status in sync#113
cnjack merged 1 commit into
mainfrom
fix/browser-bridge-keepalive

Conversation

@cnjack

@cnjack cnjack commented Jul 4, 2026

Copy link
Copy Markdown
Owner

Problem

The jcode Browser Bridge Chrome/Edge extension kept showing "Reconnecting…" with the red "Could not reach the jcode server" error — even though browser tools actually worked. The server log flapped extension connected / extension disconnected every 1–3 minutes for hours.

Root causes

1. No server-side keepalive. After the hello handshake the bridge cleared the read deadline and never sent anything. The only thing holding the extension's MV3 service worker awake was its own 30s chrome.alarms ping, which races the 30s idle-kill (and is clamped to 60s on some builds). When the worker napped, the socket dropped and only reconnected on the next alarm tick — a recurring down-window where the popup shows the scary error.

2. Stale-socket state clobber (the "works but shows Reconnecting" bug). All ws event handlers referenced the module-level ws. A superseded socket's late-firing onclose set connected = false after a newer socket had already welcomed; the keepalive alarm's reconnect then early-returned because the live global ws was already OPEN, so connected got stuck false forever while the socket kept working.

Fix

Server (internal/browser/bridge.go)

  • Send an application-level {"type":"ping"} every 15s — an inbound ws message resets the MV3 worker's idle timer, keeping it (and the whole bridge) alive between commands.
  • Set a 40s read deadline, refreshed on any inbound frame, to detect and drop a dead extension.
  • Route all writes through a serialized writeJSON helper (gorilla forbids concurrent writers) with a write deadline.

Extension (extension/background.js)

  • Capture each socket in a local sock and guard every handler with ws === sock, so a superseded socket can no longer clobber the live connection's state.
  • Reply pong to server pings, and self-heal connected when a ping arrives (receiving one proves we're connected).

Popup (extension/popup/popup.js, popup.html)

  • A routine reconnect now shows a neutral gray "Reconnecting to jcode…" instead of the red "Could not reach the jcode server", which only appears after several failed attempts in a row.

Test (internal/browser/bridge_test.go)

  • TestBridgeKeepAlivePing covers the server ping + read-watchdog drop.

Verification

  • go build ./..., go vet ./internal/browser/, and go test ./internal/browser/ all pass (including the new test).
  • Live: after deploying the server heartbeat, the connection held stable for 5+ minutes with zero churn (was flapping every 1–3 min) and /api/browser/status reported extension_online: true.

Reviewer notes

  • The server change ships in the jcode binary (needs a rebuild + restart); the extension changes need a reload in chrome://extensions / edge://extensions.
  • keepAlivePing / keepAliveWait are var (not const) so tests can shrink them.

Generated with Jack AI bot

Summary by CodeRabbit

  • New Features

    • Added background connection keep-alives so the extension stays responsive during idle periods.
    • The popup now shows a clearer neutral “Reconnecting…” status when reconnecting normally.
  • Bug Fixes

    • Improved connection handling to ignore outdated socket events and reduce false error states.
    • Better retry behavior and cleaner error messages when the service is temporarily unreachable.
    • Kept the connection alive more reliably by recognizing heartbeat traffic.

…us in sync

The jcode Browser Bridge extension flapped between connected and
"Reconnecting…" every 1–3 minutes, and could get stranded showing
"Reconnecting…" (with "Could not reach the jcode server") even while the
socket was up and browser tools worked.

Two root causes:

1. No server-side keepalive. After the hello handshake the bridge cleared
   the read deadline and never sent anything, so the only thing holding the
   extension's MV3 service worker awake was its own 30s chrome.alarms ping —
   which races the 30s idle-kill (and is clamped to 60s on some builds). When
   the worker napped the socket dropped and only came back on the next alarm
   tick. Fix: the server now sends an application-level {"type":"ping"} every
   15s (an inbound ws message resets the worker's idle timer) and drops the
   peer if no frame arrives within 40s.

2. Stale-socket state clobber. All ws event handlers referenced the module-
   level `ws`, so a superseded socket's late-firing onclose set connected=false
   after a newer socket had already welcomed; the keepalive alarm's reconnect
   then early-returned because the (live) global ws was OPEN, leaving connected
   stuck false forever. Fix: capture each socket in a local `sock` and guard
   every handler with `ws === sock`; also self-heal `connected` when a server
   ping arrives (receiving one proves we're connected).

Also: quiet the popup so a routine reconnect shows a neutral "Reconnecting to
jcode…" instead of the red "Could not reach the jcode server" error, which now
only appears after several failed attempts in a row.

Adds TestBridgeKeepAlivePing covering the server ping + read watchdog.

Generated with Jack AI bot
@coderabbitai

coderabbitai Bot commented Jul 4, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The extension background script now guards against stale WebSocket instances and handles ping/pong keep-alive frames, refining reconnect/error UI logic in the popup. The Go bridge adds a server-side keep-alive goroutine, centralizes writes through a mutex-protected writeJSON, extends read deadlines, and recognizes ping/pong frames, with an accompanying test.

Changes

Extension connection stability

Layer / File(s) Summary
Per-attempt socket guarding
extension/background.js
connect() assigns a local sock to module-level ws; connect-stall watchdog, onopen, and onmessage now verify ws === sock before acting.
Ping/pong and close/error handling
extension/background.js
onmessage responds to ping with pong and ignores pong; onclose/onerror guard on the active socket and only surface errors after repeated failures.
Popup reconnect messaging
extension/popup/popup.js, extension/popup/popup.html
Popup shows an informational "Reconnecting..." message when no lastError exists, versus an error with disconnect guidance on real failure, styled via a new .msg.info CSS class.

Server-side bridge keep-alive

Layer / File(s) Summary
Centralized write path
internal/browser/bridge.go
writeJSON serializes socket writes with a mutex and write deadline; request() uses it instead of writing directly.
Keep-alive ping loop
internal/browser/bridge.go
New timing variables and a keepAlive() goroutine started from HandleWS periodically sends "ping" frames, closing the connection on write failure.
Read loop deadline handling
internal/browser/bridge.go
readLoop sets and extends read deadlines based on inbound traffic and explicitly recognizes "ping"/"pong" frames as keep-alive traffic.
Keep-alive test
internal/browser/bridge_test.go
TestBridgeKeepAlivePing verifies ping delivery and watchdog-driven connection closure when the client goes silent.

Estimated code review effort: 3 (Moderate) | ~25 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Extension as Background.js
  participant Bridge as Go Bridge
  participant Popup

  Extension->>Bridge: WebSocket connect (sock)
  Bridge->>Extension: welcome + periodic ping
  Extension->>Bridge: pong response
  Bridge->>Bridge: readLoop extends read deadline
  Note over Extension,Bridge: If ws !== sock, stale events ignored
  Bridge-->>Extension: connection closed (no ping response)
  Extension->>Popup: update status (info or error)
  Popup->>Popup: render "Reconnecting..." or error message
Loading
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly captures the main fix: keeping the browser bridge connection alive and syncing its status.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/browser-bridge-keepalive

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@cnjack cnjack merged commit 46e56a7 into main Jul 4, 2026
2 of 3 checks passed
@cnjack cnjack deleted the fix/browser-bridge-keepalive branch July 4, 2026 15:41

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/browser/bridge.go (1)

203-217: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Close the websocket when readLoop exits

readLoop returns on read timeout without closing c.ws, so a silent-but-connected client can leave the underlying socket/fd open until something else tears it down. Add defer c.close() (or close it on the error path) so the keep-alive timeout actually releases the connection.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/browser/bridge.go` around lines 203 - 217, The readLoop exit path
leaves the websocket open on read timeout, so the underlying connection can
remain allocated. Update bridgeConn.readLoop to ensure c.close is invoked when
the loop exits, either by adding a deferred close at the start of readLoop or by
closing c.ws in the ReadJSON error path alongside the existing pending-channel
cleanup. Keep the fix localized to bridgeConn and preserve the current
closeErr/closed signaling behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@internal/browser/bridge.go`:
- Around line 203-217: The readLoop exit path leaves the websocket open on read
timeout, so the underlying connection can remain allocated. Update
bridgeConn.readLoop to ensure c.close is invoked when the loop exits, either by
adding a deferred close at the start of readLoop or by closing c.ws in the
ReadJSON error path alongside the existing pending-channel cleanup. Keep the fix
localized to bridgeConn and preserve the current closeErr/closed signaling
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d8302bf9-e307-48f2-af81-2028a57a2a04

📥 Commits

Reviewing files that changed from the base of the PR and between b52f083 and 52708ac.

📒 Files selected for processing (5)
  • extension/background.js
  • extension/popup/popup.html
  • extension/popup/popup.js
  • internal/browser/bridge.go
  • internal/browser/bridge_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant