feat(browser): browser-use capability (managed CDP + extension bridge) by cnjack · Pull Request #107 · cnjack/jcode

cnjack · 2026-07-03T18:17:28Z

What

Adds browser use — the agent can drive a browser to navigate, snapshot, interact, screenshot, read page text, and (in developer mode) eval JS.

Two backends behind one Session interface:

Managed — jcode launches its own Chrome and drives it over CDP.
Extension — controls the user's Chrome/Edge via a Chrome extension (extension/) over a WebSocket bridge + native-messaging host.

Approval has three tiers — read-only (auto), navigate/interact (site-scoped), high-risk eval (always prompt) — with per-origin site permissions surfaced in the web Settings → Browser tab. Tools: browser_open, browser_snapshot, browser_screenshot, browser_act, browser_read, browser_tabs, browser_eval. Wired into both TUI and web; screenshots served over /api/browser/shots.

Design notes: internal-doc/browser-use-design.md.

Lifecycle / approval fixes included

Shared managed Chrome no longer killed on task teardown. Session.Close only releases the task's own tabs; the managed Chrome is owned by the Manager and reused across tasks. getManaged drops and relaunches a dead backend (crash / user-quit) instead of caching it. Web mode now calls Manager.Close on exit so Chrome isn't orphaned.
interact site permission works. browser_act scopes its per-site permission by the active tab's origin (from the live session) instead of a hardcoded empty origin, so interact=allow and the interact class default actually take effect.
browser_act action=reload implemented (reuses Session.Reload, previously dead code).

Tests

go build ./..., go vet, and go test -race ./internal/browser/ ./internal/runner/ ./internal/tools/ pass. New regression tests: session teardown keeps the shared backend, CurrentOrigin caching, reload routing, and interact-by-session-origin approval.

Notes

browser_act is a single interact tier (dialog/scroll/hover included); if interact=always_allow, a dialog accept is also auto-approved — intentional per design.
A few known follow-ups remain (scroll position/delta conflation, fill clear-on-prefilled, extension MV3 state loss); not in this PR.

Generated with Jack AI bot

Summary by CodeRabbit

New Features
- Added browser-use support with a new browser settings tab, status view, and controls for enabling browser integration, choosing a backend, and managing site permissions.
- Added browser actions for opening pages, taking snapshots/screenshots, reading page text, tab management, and optional advanced evaluation in developer mode.
- Added Chrome extension and native-messaging setup for connecting to a local browser session.
Bug Fixes
- Improved browser session cleanup and connection handling to reduce stale or orphaned browser state.
Documentation
- Added setup and usage guides for the browser bridge and browser-use workflow.

…idge) Introduces browser use: the agent can navigate, snapshot, interact, screenshot, read, and (in dev mode) eval a page. Two backends behind one Session interface — a jcode-managed Chrome over CDP, and the user's Chrome/Edge via a Chrome extension WebSocket bridge + native-messaging host. Approval has three tiers (read-only auto, navigate/interact site-scoped, high-risk always-prompt) with per-origin site permissions in the web Settings UI. Wired into TUI and web; tools registered per config; screenshots served over /api/browser/shots. Includes lifecycle/approval fixes: - Session.Close no longer tears down the shared managed Chrome (the Manager owns it and reuses it across tasks); getManaged drops and relaunches a dead backend; web mode closes the manager on exit so Chrome isn't orphaned. - browser_act scopes its per-site "interact" permission by the active tab's origin from the session instead of a hardcoded empty origin, so interact=allow and the interact class default actually take effect. - browser_act action=reload is implemented (reuses Session.Reload). Generated with Jack AI bot

coderabbitai · 2026-07-03T18:17:47Z

Warning

Review limit reached

@cnjack, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 37 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ad17fecc-0206-4aa3-abf6-2ff158081994

📥 Commits

Reviewing files that changed from the base of the PR and between d960862 and 77a9ecb.

📒 Files selected for processing (8)

internal/browser/actions.go
internal/browser/bridge.go
internal/browser/bridge_test.go
internal/browser/cdp_test.go
internal/browser/session.go
internal/browser/session_test.go
internal/browser/smoke_test.go
internal/command/interactive.go

📝 Walkthrough

Walkthrough

This PR adds a native browser-use capability to jcode: Go packages implementing managed Chrome (CDP) and extension-bridge backends, session/snapshot/action logic, a Chrome MV3 extension with native messaging host installation, config/approval/tool wiring, Web server endpoints and frontend settings UI, plus design documentation.

Changes

Browser Use Feature

Layer / File(s)	Summary
CDP core & managed Chrome backend `internal/browser/cdp.go`, `internal/browser/cdp_test.go`, `internal/browser/discover.go`, `internal/browser/perms.go`	Adds wsCDP client, TabConn/TabInfo/Backend interfaces, managed backend/tab, Chrome discovery, launch, and origin utilities.
Extension bridge & token persistence `internal/browser/bridge.go`, `internal/browser/bridge_test.go`, `internal/browser/tokens.go`	Implements server-side websocket bridge authenticating extension connections and forwarding CDP commands, with token file persistence.
Session, snapshot, actions `internal/browser/session.go`, `internal/browser/session_test.go`, `internal/browser/snapshot.go`, `internal/browser/snapshot_test.go`, `internal/browser/actions.go`, `internal/browser/smoke_test.go`	Implements per-task Session with tab lifecycle, accessibility-tree snapshotting with generation-tagged UIDs, and Act dispatcher for click/fill/select/upload/press/scroll/dialog.
Manager `internal/browser/manager.go`	Adds process-wide Manager for backend selection, status, lazy Chrome launch reuse, and screenshot persistence.
Native messaging host `internal/browser/nativehost*.go`, `internal/browser/nativehost_test.go`, `cmd/jcode/main.go`	Implements native messaging protocol, endpoint persistence, per-OS host manifest install, and CLI early-exit hook.
Chrome extension client `extension/*`	Adds MV3 manifest, background service worker (websocket + chrome.debugger forwarding), popup UI/logic, and README.
Config, approval, and tool dispatch `internal/config/config.go`, `internal/runner/approval.go`, `internal/runner/approval_browser_test.go`, `internal/tools/browser.go`, `internal/tools/env.go`, `internal/skills/builtin/browser-use/SKILL.md`	Adds BrowserConfig schema, approval tiering hooks/tests, and browser_* tool dispatch on Env.
CLI/web wiring & endpoints `internal/command/interactive.go`, `internal/command/web.go`, `internal/web/browser.go`, `internal/web/server.go`, `internal/web/auth.go`, `internal/web/engine.go`, `internal/handler/web.go`	Wires Manager into interactive/web execution, approval hooks, HTTP endpoints for status/config/ext-ws/screenshots, and tool display metadata.
Web settings UI `web/src/composables/api.ts`, `web/src/components/SettingsDialog.vue`, `web/src/components/ToolCallCard.vue`, `web/src/i18n/locales/*`	Adds API client types/methods, a Browser settings tab, screenshot tool card rendering, and localized strings.
Design documentation `internal-doc/browser-use-design.md`, `internal-doc/browser-use-ui.html`	Adds design spec and HTML UI wireframe for the browser-use feature.

Estimated code review effort: 5 (Critical) | ~120 minutes

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant WebUI
  participant TaskEnv
  participant ApprovalState
  participant Session
  participant Manager
  participant Backend

  User->>WebUI: trigger browser_act tool call
  WebUI->>TaskEnv: dispatchBrowser(browser_act)
  TaskEnv->>ApprovalState: decideBrowser(origin, class)
  ApprovalState-->>TaskEnv: auto-approve or prompt
  TaskEnv->>Session: BrowserSession(ctx)
  Session->>Manager: OpenSession(ctx)
  Manager->>Backend: NewTab / AttachTab
  Backend-->>Manager: TabConn
  Manager-->>Session: Session ready
  TaskEnv->>Session: Act(req)
  Session->>Backend: CDP Input/DOM commands
  Backend-->>Session: result
  Session-->>WebUI: action summary

sequenceDiagram
  participant Extension
  participant NativeHost
  participant Bridge
  participant Session

  Extension->>NativeHost: request endpoint (native messaging)
  NativeHost-->>Extension: ws URL + token
  Extension->>Bridge: hello + token (websocket)
  Bridge-->>Extension: welcome
  Session->>Bridge: tab.new / cdp.send
  Bridge->>Extension: envelope request
  Extension->>Extension: chrome.debugger.sendCommand
  Extension-->>Bridge: envelope response
  Bridge-->>Session: result

Possibly related PRs

cnjack/jcode#29: Extends the same extractToolDisplayInfo mapping in internal/handler/web.go that this PR adds browser_* cases to.
cnjack/jcode#76: Modifies the same internal/runner/approval.go decision logic that this PR extends with browser-specific tiering.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: adding browser-use capability with managed CDP and an extension bridge.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/browser-use

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

cnjack

Senior staff engineer review — PR #107 "browser-use capability"

Overall Risk: High — two confirmed authorization/security gaps below allow, respectively, an approval-tier bypass for claiming control of the user's existing browser tabs, and removal of an origin check this codebase otherwise treats as mandatory for any WebSocket bridging an untrusted webpage to a privileged capability. Both are exploitable within the PR's own stated threat model (an AI agent acting on possibly-adversarial page content, driving the user's real logged-in browser). Recommend addressing Findings 1 and 2 before merge.

Reviewed the full diff (50 files, ~6k lines) across four subsystems: the CDP/managed-Chrome driver, the extension+native-messaging+WebSocket bridge, the approval/authorization layer, and the web API/frontend surface. The core lifecycle fix the PR claims (shared managed Chrome surviving task teardown, dead-backend detection/relaunch) is real, correctly implemented, and mutex-safe. The specific bug the PR says it fixed (hardcoded empty origin in browser_act's site-permission check) is genuinely fixed and failed closed even before the fix, so it was never a silent-approval hole.

Finding 1 (High): `browser_tabs op=select` grants the same control as `claim`, without the approval prompt `claim` requires

Impact: Live authorization bypass in exactly the threat model this PR targets. Defeats the "claim a pre-existing user tab requires approval" tier entirely.

Evidence:

internal/runner/approval.go:304-314 auto-approves browser_tabs when op is "", "list", or "select", on the theory that select is "read-only."
But Session.SelectTab (internal/browser/session.go:421-436) does not require the tab to already be under jcode's control: if tabID isn't already tracked, it calls s.backend.AttachTab(ctx, tabID) and registers it — the identical attach call ClaimTab uses (session.go:439-449, which is literally SelectTab plus flipping a cleanup-only created flag).
So browser_tabs op=list (auto-approved) enumerates all open tabs, including the user's own untouched tabs (e.g. a logged-in banking/email tab via the extension backend), and browser_tabs op=select tab_id=<that id> (also auto-approved) attaches to and takes control of it with zero prompt. If that tab's origin also has interact pre-approved (see Finding 3 below), the very next browser_act can click/fill on it with no user awareness at all.

Suggested Fix: The approval decision for browser_tabs needs session state, not just the op string: auto-approve select only when the target tab is already in the session's controlled set; otherwise treat it like claim (prompt). Expose an "already controlled" predicate to the approval layer, or move SelectTab's attach-if-uncontrolled branch behind the same gate ClaimTab uses.

Finding 2 (High): Extension bridge accepts WebSocket connections from any Origin, unlike every other socket in this codebase

Impact: The extension bridge is the trust boundary between jcode and the user's real, logged-in browser session (full CDP access — cookies, navigation, arbitrary JS in page context; extension/background.js:243-251 forwards CDP calls with no method allow-list). Any webpage the browser visits can open a raw WebSocket to this bridge — browsers don't apply same-origin/CORS restrictions to WS handshakes, only server-side Origin checks do.

Evidence:

internal/browser/bridge.go:34: upgrader: websocket.Upgrader{CheckOrigin: func(*http.Request) bool { return true }}.
This codebase already has the correct pattern for exactly this risk: internal/web/server.go:2569-2573 defines isAllowedWebOrigin with a comment explaining precisely this ("Access-Control-Allow-Origin: * plus a WebSocket CheckOrigin that returns true would let any website the user visits drive the agent..."), and internal/web/pty.go:59 correctly wires CheckOrigin: isAllowedWebOrigin. bridge.go doesn't reuse it.
Today a valid bearer token is still required (bridge.go:83), so this isn't independently exploitable yet — but it removes a defense-in-depth layer this codebase treats as load-bearing everywhere else, and turns any future token leak (Finding 3) into remotely web-exploitable rather than local-only.

Suggested Fix: Use isAllowedWebOrigin as the CheckOrigin for this upgrader, matching pty.go.

Finding 3 (Medium-High): Bridge tokens never expire and cannot actually be revoked, despite the extension telling users they can

Evidence:

internal/browser/bridge.go:56-63 (IssueToken) mints tokens but nothing ever removes an old one; no expiry/rotation logic anywhere in internal/browser.
The extension's "Disconnect" button (extension/popup/popup.js:78-82 → background.js:331-338) only clears local chrome.storage.local — no revoke endpoint or bridge message invalidates the token server-side.
extension/README.md:60-61 explicitly tells users "Use the popup's Disconnect to revoke the token" — false.

Suggested Fix: Rotate/invalidate the previous token on each new auth, give tokens a TTL, wire Disconnect to an authenticated server-side revoke.

Finding 4 (Medium): Cached tab origin goes stale after an in-page navigation, letting `browser_act` auto-approve on the wrong origin

Evidence: browser_act's origin check reads Session.CurrentOrigin() (internal/browser/session.go:212-223), a cache refreshed only by Open, Reload, Snapshot. Act() (internal/browser/actions.go:23-83) never writes the post-action URL back into the cache. So: snapshot origin A (approved) → click navigates cross-origin to B → cache still A → next press/scroll action is checked against stale origin A and auto-approved, though it executes on unapproved origin B.

Suggested Fix: Refresh t.url from the post-action URL in Act(), and/or defensively re-derive origin for actions that don't require a fresh snapshot.

Finding 5 (Medium): The PR's headline concurrency fix has no test coverage, and CI doesn't run `-race`

Evidence: No manager_test.go exists in internal/browser/; nothing constructs a browser.Manager. getManaged's dead-backend-detect/relaunch logic (manager.go:135-156) — the riskiest new state mutation in the PR — is untested. .github/workflows/ci.yml:52 runs go test ./... without -race, despite the PR description citing -race as part of its own verification.

Suggested Fix: Add a manager_test.go exercising getManaged directly (dead-backend detection, relaunch, concurrent OpenSession during relaunch). Add -race to CI, at least for internal/browser and internal/runner.

Finding 6 (Medium): `Manager.Close()` is skipped on one web-server shutdown error path, leaking the Chrome process

Evidence: internal/command/web.go:782-790:

if err := srv.Start(ctx); err != nil {
    return fmt.Errorf("server error: %w", err)   // returns here, skips cleanup below
}
srv.CloseAllEngines()
_ = browserMgr.Close()

Interactive mode does this correctly with defer browserMgr.Close() (internal/command/interactive.go:941); web mode should match, or a genuine Serve error orphans the Chrome process and locks its profile dir for the next start.

Additional Medium/Low items (worth fixing, not individually blocking)

Global Approval["interact"/"navigate"] = "always_allow" (internal/command/web.go:109-127) is not origin-scoped, contradicting the PR's "site-scoped" tier-2 model — grants blanket no-prompt interact/navigate on every site.
Screenshots are unfetchable whenever requireAuth is on: ToolCallCard.vue's <img>/<a> tags hit /api/browser/shots/{id} directly without the bearer token api.ts attaches, and that route isn't in the auth-exempt/query-token allowlist — 401s in exactly the self-hosted/multi-user deployment this feature targets.
SettingsDialog.vue's permission editor saves optimistically with no error handling — a failed POST /api/browser/config only logs to console; UI can silently diverge from persisted permission state.
Chrome crashes mid-task are misclassified as "user took back control" (ErrControlInterrupted via a "connection closed" substring match, actions.go:315-325), masking real crashes; a Session whose backend dies has no recovery path within the same task.
getManaged holds the Manager-wide mutex across the full Chrome launch (up to ~30s, manager.go:136-137), blocking Status/Close during that window.

Top Findings

browser_tabs op=select bypasses the claim approval gate — takes control of any open tab without a prompt. (internal/runner/approval.go:304-314, internal/browser/session.go:421-436)
Extension bridge WebSocket accepts connections from any Origin. (internal/browser/bridge.go:34)
Bridge auth tokens never expire; advertised "Disconnect to revoke" is non-functional. (internal/browser/bridge.go, tokens.go, extension/popup/popup.js, extension/README.md:60-61)
Stale cached tab origin lets browser_act auto-approve after a cross-origin navigation. (internal/browser/session.go:212-223, internal/browser/actions.go:23-83)
Headline concurrency fix (Manager.getManaged relaunch) has zero test coverage; CI doesn't run -race. (internal/browser/manager.go:135-156)

Generated by Claude Code

…session management

coderabbitai

Actionable comments posted: 15

🧹 Nitpick comments (3)

internal-doc/browser-use-design.md (1)
78-93: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add language tags to the fenced examples. markdownlint is already flagging these bare blocks (MD040); marking them as text will keep the doc lint-clean without changing rendering.

Also applies to: 104-111, 161-177, 187-193
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal-doc/browser-use-design.md` around lines 78 - 93, Add explicit
markdown language tags to all fenced example blocks in browser-use-design.md
that are currently unlabeled, especially the diagram/example blocks around the
internal/browser architecture and the other referenced sections; use text as the
fence language so markdownlint MD040 passes while preserving the current
rendering. Identify and update each bare fenced block in the affected document
sections rather than changing the content inside them.
Source: Linters/SAST tools
internal/browser/discover.go (1)
221-256: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Fail fast when Chrome exits before announcing DevTools.

If Chrome exits early (invalid path/flags, crash, or forwarding to an already-running instance on the same profile), the stderr scanner hits EOF and wsCh never receives, so Launch blocks for the full 30s before returning a generic timeout. Watching the process exit lets you surface the real failure immediately.
♻️ Detect early exit alongside the DevTools announcement
 	wsCh := make(chan string, 1)
 	go func() {
 		scanner := bufio.NewScanner(stderr)
 		scanner.Buffer(make([]byte, 64*1024), 1024*1024)
 		for scanner.Scan() {
 			if m := devtoolsRe.FindStringSubmatch(scanner.Text()); m != nil {
 				select {
 				case wsCh <- m[1]:
 				default:
 				}
 				// Keep draining so Chrome never blocks on a full stderr pipe.
 			}
 		}
 	}()
 
+	exitCh := make(chan error, 1)
+	go func() { exitCh <- cmd.Wait() }()
+
 	launchCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
 	defer cancel()
 	select {
 	case wsURL := <-wsCh:
 		stop := func() {
 			_ = cmd.Process.Kill()
-			_, _ = cmd.Process.Wait()
+			<-exitCh
 		}
 		backend, err := connectManaged(launchCtx, wsURL, stop)
 		if err != nil {
 			stop()
 			return nil, err
 		}
 		config.Logger().Printf("[browser] managed chrome started pid=%d ws=%s", cmd.Process.Pid, wsURL)
 		return backend, nil
+	case err := <-exitCh:
+		return nil, fmt.Errorf("chrome exited before announcing DevTools endpoint: %w", err)
 	case <-launchCtx.Done():
 		_ = cmd.Process.Kill()
-		_, _ = cmd.Process.Wait()
+		<-exitCh
 		return nil, fmt.Errorf("chrome did not announce DevTools endpoint within 30s")
 	}
Note: this also replaces the direct cmd.Process.Wait() calls with cmd.Wait(), which is the correct reaper when using StderrPipe. Please verify the stderr-draining goroutine still completes cleanly on EOF with this change.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/browser/discover.go` around lines 221 - 256, In discover.go’s Launch
flow, Chrome can exit before emitting the DevTools URL and the current select
only waits on wsCh or the 30s timeout. Add a process-exit signal alongside the
stderr scanner so the code fails fast when cmd exits early, surfaces that error
instead of a generic timeout, and still keeps draining stderr until EOF. While
doing this, replace the direct cmd.Process.Wait() cleanup in the stop/timeout
paths with cmd.Wait() so the StderrPipe is reaped correctly and the goroutine
can terminate cleanly.
internal/tools/browser.go (1)
67-139: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Inconsistent handling of malformed tool args.

browser_act (Line 151-153) returns an "invalid args: %w" error when JSON unmarshal fails, but every other branch here (browser_open, browser_snapshot, browser_screenshot, browser_read, browser_tabs, browser_eval) uses _ = json.Unmarshal(...) and silently proceeds with a zero-valued struct on malformed JSON. This contradicts the tool guideline to return descriptive errors that help the agent self-correct, and can produce confusing silent-default behavior (e.g. a malformed browser_screenshot call always falls back to full_page=false instead of surfacing the parse failure).

As per path instructions, "Return descriptive error strings that help the agent self-correct. Include file paths, line numbers, or command output in error messages."
♻️ Example fix for one branch (repeat pattern for the others)
 	case "browser_snapshot":
 		var in struct {
 			Filter   string `json:"filter"`
 			MaxLines int    `json:"max_lines"`
 		}
-		_ = json.Unmarshal([]byte(argsJSON), &in)
+		if err := json.Unmarshal([]byte(argsJSON), &in); err != nil {
+			return "", fmt.Errorf("invalid args: %w", err)
+		}
 		return sess.Snapshot(ctx, in.Filter, in.MaxLines)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tools/browser.go` around lines 67 - 139, The browser tool dispatch
handlers are swallowing malformed JSON args and defaulting to zero values
instead of returning self-correcting errors. Update dispatchBrowser to handle
json.Unmarshal failures consistently in each branch (browser_open,
browser_snapshot, browser_screenshot, browser_read, browser_tabs, browser_eval),
matching the browser_act pattern by returning a descriptive “invalid args” error
that includes the parse failure. Keep the validation for required fields like
URL and expression after successful parsing so the tool fails fast on bad input.
Source: Path instructions

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@extension/background.js`:
- Around line 38-50: The stop() shutdown path in background.js does not release
debugger attachments, so stale-token and reconnect-failure exits leave tabs
stuck attached. Update stop() to detach every tab tracked in attached before
clearing websocket/timer state, using the existing detachTab flow so the
disconnect cleanup is centralized. Keep the same stop() behavior for
ws/connected/badge/token cleanup, but make sure attached tabs are always
detached regardless of whether stop() is called from the stale-token path or the
give-up reconnect paths.
- Around line 261-296: groupTab is creating a fresh tab group on every
invocation instead of reusing the existing jcode 🔎 group. Update groupTab to
keep track of the created group id and pass it back into chrome.tabs.group when
adding later tabs, so tab.attach/tab.new calls join the same group instead of
spawning separate ones.

In `@extension/README.md`:
- Around line 34-36: The README security wording is inconsistent with the actual
reconnect behavior in background.js and the token flow described earlier. Update
the Security section and nearby token description so both use the same
terminology and clearly state that the extension stores a long-lived token in
chrome.storage.local, reconnects silently until Disconnect is used, and does not
describe this as a short-lived pairing code. Refer to the background.js token
storage/reconnect behavior and the README Security section text to keep the
wording aligned.
- Around line 48-55: The Permissions section is missing two manifest-declared
capabilities, `alarms` and `nativeMessaging`, even though they are part of the
extension’s core behavior. Update the Permissions list in the README to include
both entries alongside the existing permissions, keeping the descriptions
aligned with their roles in keepalive and auto-connect discovery as reflected in
manifest.json.

In `@internal/browser/actions.go`:
- Around line 277-293: The scroll handler in Session.scroll is mixing pointer
coordinates with wheel deltas by reading req.X/req.Y for both the mouse position
and the scroll amount. Update the ActRequest usage so the CDP event position
comes from dedicated cursor coordinates while deltaX/deltaY come from separate
scroll fields, and keep the default paging behavior in scroll intact when no
delta is provided.

In `@internal/browser/bridge.go`:
- Around line 160-234: The bridge connection can hang indefinitely because
bridgeConn.request writes with c.ws.WriteJSON(env) before honoring ctx, and
readLoop clears deadlines without any keepalive. Update request to bound the
write with the caller context or a write deadline, and add WebSocket liveness
handling in readLoop (for example pong handling and periodic ping/control
writes) so dead or suspended peers are detected and c.closed/c.closeErr are set
instead of leaving browser_* calls stuck forever.
- Around line 31-38: NewBridge currently configures
websocket.Upgrader.CheckOrigin to allow every origin, which bypasses browser
origin protection. Update the Bridge initialization so CheckOrigin only returns
true for the companion extension’s exact chrome-extension://<id> origin, using
the Bridge/NewBridge upgrader setup as the place to enforce it. Keep the
existing token loading and token validation flow unchanged so tokens remain the
primary authentication path.

In `@internal/browser/manager.go`:
- Around line 36-41: Restrict browser screenshot storage to owner-only access in
Manager and the screenshot write path. Update NewManager and the screenshot save
logic used around Manager/bridge so shotDir is created with private permissions
and individual screenshot files are written with user-only permissions instead
of world-readable modes. Keep the fix localized to the Manager setup and the
screenshot file creation code paths.

In `@internal/browser/session.go`:
- Around line 386-407: The ListTabs method computes s.active but never uses it,
so the returned TabInfo slice does not mark the currently active tab. Update
Session.ListTabs to either remove the unused active lookup or, preferably, use
it when iterating backend tabs to set the matching TabInfo as active (for
example via an Active field on TabInfo if present/added), while keeping the
existing Attached flag logic based on controlled tabs.

In `@internal/command/web.go`:
- Around line 106-129: The per-site override logic in browserSitePreapproved is
treating an unset BrowserSitePermission field as an explicit deny instead of
falling back to the class default. Update the origin match handling so that for
the requested class, a site-specific value only overrides when it is set (e.g.
navigate/interact is "allow" or "deny"), and otherwise defer to
bc.Approval[class]. Keep the lookup scoped to browserSitePreapproved and its use
of BrowserSitePermission, SitePermissions, and Approval.

In `@internal/runner/approval.go`:
- Around line 304-315: The browser_tabs branch in approval.go is treating
malformed JSON as a read-only op because json.Unmarshal errors are ignored and
in.Op defaults to empty, which auto-approves. Update the browser_tabs handling
in the decide logic to check the unmarshal error and return decisionPrompt on
parse failure, matching the fail-safe behavior already used by the read and
execute cases. Keep the existing decisionAutoApprove only for explicitly parsed
list/select ops, and consider adding a regression test near
TestDecideBrowserTiers for malformed toolArgs.

In `@internal/tools/browser.go`:
- Around line 88-102: The browser_screenshot response text is misleading because
it formats the PNG byte length as a fake width with a stray “x?” placeholder.
Update the return message in the browser_screenshot case of the browser tool to
remove the bogus dimension field and only report the real byte size plus the
image_ref, keeping the user-facing text clear and accurate.
- Around line 167-211: The browserTabs function currently returns success text
for select, claim, and close even when sess.SelectTab, sess.ClaimTab, or
sess.CloseTab fails. Update browserTabs so each of those cases checks the
session call result first and returns an empty string with the error on failure,
only constructing the “selected/claimed/closed tab” message after a successful
call.

In `@web/src/i18n/locales/ja.ts`:
- Line 244: The Japanese locale is missing the full settings.browser translation
group, so the Browser tab in SettingsDialog.vue falls back to raw keys. Add the
complete settings.browser.* section in ja.ts in the same spot as the other
locales, between the existing skills and ssh groups, and make sure it includes
every key used by the Browser tab such as title, enableTitle, enableDesc,
control, managed, noChrome, extension, connected, notConnected, online,
connectHint, chromePath, approval, navigate, interact, askEachSite, alwaysAllow,
sitePermissions, add, noSitePermissions, developerMode, elevatedRisk,
devModeTitle, and devModeDesc.

In `@web/src/i18n/locales/ko.ts`:
- Line 244: The Korean locale is missing the settings.browser translation group,
so add the full settings.browser.* key set to ko.ts using the same key structure
as en.ts and zh-Hans.ts. Insert the Browser settings strings between the
existing skills and ssh sections, including title, subtitle, enableTitle, and
devModeDesc, so SettingsDialog.vue can render the Browser tab without raw
fallback keys.

---

Nitpick comments:
In `@internal-doc/browser-use-design.md`:
- Around line 78-93: Add explicit markdown language tags to all fenced example
blocks in browser-use-design.md that are currently unlabeled, especially the
diagram/example blocks around the internal/browser architecture and the other
referenced sections; use text as the fence language so markdownlint MD040 passes
while preserving the current rendering. Identify and update each bare fenced
block in the affected document sections rather than changing the content inside
them.

In `@internal/browser/discover.go`:
- Around line 221-256: In discover.go’s Launch flow, Chrome can exit before
emitting the DevTools URL and the current select only waits on wsCh or the 30s
timeout. Add a process-exit signal alongside the stderr scanner so the code
fails fast when cmd exits early, surfaces that error instead of a generic
timeout, and still keeps draining stderr until EOF. While doing this, replace
the direct cmd.Process.Wait() cleanup in the stop/timeout paths with cmd.Wait()
so the StderrPipe is reaped correctly and the goroutine can terminate cleanly.

In `@internal/tools/browser.go`:
- Around line 67-139: The browser tool dispatch handlers are swallowing
malformed JSON args and defaulting to zero values instead of returning
self-correcting errors. Update dispatchBrowser to handle json.Unmarshal failures
consistently in each branch (browser_open, browser_snapshot, browser_screenshot,
browser_read, browser_tabs, browser_eval), matching the browser_act pattern by
returning a descriptive “invalid args” error that includes the parse failure.
Keep the validation for required fields like URL and expression after successful
parsing so the tool fails fast on bad input.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 402ea967-8b2b-4b16-a9c5-af6f0395689f

📥 Commits

Reviewing files that changed from the base of the PR and between 0052a43 and d960862.

⛔ Files ignored due to path filters (3)

extension/icons/icon128.png is excluded by !**/*.png
extension/icons/icon16.png is excluded by !**/*.png
extension/icons/icon48.png is excluded by !**/*.png

📒 Files selected for processing (47)

cmd/jcode/main.go
extension/README.md
extension/background.js
extension/manifest.json
extension/popup/popup.html
extension/popup/popup.js
internal-doc/browser-use-design.md
internal-doc/browser-use-ui.html
internal/browser/actions.go
internal/browser/bridge.go
internal/browser/bridge_test.go
internal/browser/cdp.go
internal/browser/cdp_test.go
internal/browser/discover.go
internal/browser/manager.go
internal/browser/nativehost.go
internal/browser/nativehost_notwindows.go
internal/browser/nativehost_test.go
internal/browser/nativehost_windows.go
internal/browser/perms.go
internal/browser/session.go
internal/browser/session_test.go
internal/browser/smoke_test.go
internal/browser/snapshot.go
internal/browser/snapshot_test.go
internal/browser/tokens.go
internal/command/interactive.go
internal/command/web.go
internal/config/config.go
internal/handler/web.go
internal/runner/approval.go
internal/runner/approval_browser_test.go
internal/skills/builtin/browser-use/SKILL.md
internal/tools/browser.go
internal/tools/env.go
internal/web/auth.go
internal/web/browser.go
internal/web/engine.go
internal/web/server.go
web/src/components/SettingsDialog.vue
web/src/components/ToolCallCard.vue
web/src/composables/api.ts
web/src/i18n/locales/en.ts
web/src/i18n/locales/ja.ts
web/src/i18n/locales/ko.ts
web/src/i18n/locales/zh-Hans.ts
web/src/i18n/locales/zh-Hant.ts

coderabbitai · 2026-07-03T18:38:14Z

+function stop(forget) {
+  desired = false;
+  if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }
+  if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; }
+  if (ws) {
+    ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null;
+    try { ws.close(); } catch {}
+    ws = null;
+  }
+  connected = false;
+  chrome.action.setBadgeText({ text: "" });
+  if (forget) chrome.storage.local.remove("token");
+}


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

stop() leaves debugger attached to tabs when connection dies.

stop() is the "hard-off switch" but never detaches attached tabs. On the stale-token path (line 150) and both give-up paths (lines 185, 191) it's called without first detaching, unlike the explicit "disconnect" handler (line 334) which manually loops detachTab first. Result: after a stale token or repeated reconnect failures, Chrome keeps the debugger attached (banner + "jcode 🔎" group stays) with no live connection to ever release it.

🔧 Proposed fix: detach tabs inside stop()

function stop(forget) { desired = false; if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; } if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; } if (ws) { ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null; try { ws.close(); } catch {} ws = null; } connected = false; chrome.action.setBadgeText({ text: "" }); + for (const tabId of [...attached]) { + try { chrome.debugger.detach({ tabId }); } catch {} + attached.delete(tabId); + } if (forget) chrome.storage.local.remove("token"); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

function stop(forget) {

desired = false;

if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }

if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; }

if (ws) {

ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null;

try { ws.close(); } catch {}

ws = null;

}

connected = false;

chrome.action.setBadgeText({ text: "" });

if (forget) chrome.storage.local.remove("token");

}

function stop(forget) {

desired = false;

if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }

if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; }

if (ws) {

ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null;

try { ws.close(); } catch {}

ws = null;

}

connected = false;

chrome.action.setBadgeText({ text: "" });

for (const tabId of [...attached]) {

try { chrome.debugger.detach({ tabId }); } catch {}

attached.delete(tabId);

}

if (forget) chrome.storage.local.remove("token");

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@extension/background.js` around lines 38 - 50, The stop() shutdown path in background.js does not release debugger attachments, so stale-token and reconnect-failure exits leave tabs stuck attached. Update stop() to detach every tab tracked in attached before clearing websocket/timer state, using the existing detachTab flow so the disconnect cleanup is centralized. Keep the same stop() behavior for ws/connected/badge/token cleanup, but make sure attached tabs are always detached regardless of whether stop() is called from the stale-token path or the give-up reconnect paths.

coderabbitai · 2026-07-03T18:38:14Z

+function attachTab(tabId) {
+  return new Promise((resolve, reject) => {
+    if (attached.has(tabId)) return resolve();
+    chrome.debugger.attach({ tabId }, DEBUGGER_VERSION, () => {
+      if (chrome.runtime.lastError) return reject(new Error(chrome.runtime.lastError.message));
+      attached.add(tabId);
+      resolve();
+    });
+  });
+}
+
+function detachTab(tabId) {
+  return new Promise((resolve) => {
+    if (!attached.has(tabId)) return resolve();
+    chrome.debugger.detach({ tabId }, () => {
+      attached.delete(tabId);
+      resolve();
+    });
+  });
+}
+
+function sendCDP(tabId, method, params) {
+  return new Promise((resolve, reject) => {
+    chrome.debugger.sendCommand({ tabId }, method, params || {}, (result) => {
+      if (chrome.runtime.lastError) return reject(new Error(chrome.runtime.lastError.message));
+      resolve(result);
+    });
+  });
+}
+
+async function groupTab(tabId) {
+  try {
+    const groupId = await chrome.tabs.group({ tabIds: [tabId] });
+    await chrome.tabGroups.update(groupId, { title: GROUP_TITLE, color: "orange" });
+  } catch {}
+}


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does chrome.tabs.group create a new group every call when groupId is omitted, even if a prior group with matching tabs already exists?

💡 Result:

Yes, chrome.tabs.group creates a new tab group every time it is called when the groupId parameter is omitted [1][2]. The API does not perform any automatic check to see if an existing group with matching properties or contents already exists [2][3]. If you do not provide a groupId, the browser will instantiate a new, distinct group for the specified tabs [1]. To add tabs to an existing group, you must manually identify the target group's ID (typically by querying existing groups using chrome.tabGroups.query) and explicitly pass that ID to the groupId parameter in your chrome.tabs.group call [2]. If you wish to manage group creation logically—such as grouping tabs by domain—your extension code must implement the logic to check for existing groups first, and only create a new group if no suitable existing one is found [2].

Citations:

1: https://stackoverflow.com/questions/69599168/how-to-add-chrome-tab-group-in-javascript

2: https://bestchromeextensions.com/docs/guides/tab-groups/

3: https://chromium.googlesource.com/chromium/src.git/+/refs/heads/main/chrome/common/extensions/api/tab_groups.json

🏁 Script executed:

#!/bin/bash set -euo pipefail # Map the relevant file and locate group-related logic/usages. git ls-files extension/background.js echo '--- outline ---' ast-grep outline extension/background.js --view expanded || true echo '--- groupTab usages ---' rg -n "groupTab|tabs\.group|tabGroups\.update|tabGroups\.query|GROUP_TITLE|jcode" extension/background.js extension || true echo '--- surrounding lines ---' sed -n '220,340p' extension/background.js

Repository: cnjack/jcode

Length of output: 2005

groupTab creates a new tab group on every call. chrome.tabs.group({ tabIds: [tabId] }) without groupId starts a fresh group each time, so every tab.new/tab.attach lands in a separate jcode 🔎 group. Track the group id and reuse it when adding tabs.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@extension/background.js` around lines 261 - 296, groupTab is creating a fresh tab group on every invocation instead of reusing the existing jcode 🔎 group. Update groupTab to keep track of the created group id and pass it back into chrome.tabs.group when adding later tabs, so tab.attach/tab.new calls join the same group instead of spawning separate ones.

coderabbitai · 2026-07-03T18:38:14Z

+Auto-connect exchanges for a long-lived token in `chrome.storage.local`;
+afterwards the extension reconnects silently — you connect once. Use
+**Disconnect** in the popup to stop and forget the token.


🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

Security section contradicts the token description above.

Lines 34-36 describe a "long-lived token" persisted in storage and reused silently, but the Security section (lines 59-60) says the bridge "authenticates with a short-lived pairing code." This mismatches actual behavior in background.js (token stored via chrome.storage.local and reused across reconnects until Disconnect) and could confuse users about the real security model.

Also applies to: 57-61

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@extension/README.md` around lines 34 - 36, The README security wording is inconsistent with the actual reconnect behavior in background.js and the token flow described earlier. Update the Security section and nearby token description so both use the same terminology and clearly state that the extension stores a long-lived token in chrome.storage.local, reconnects silently until Disconnect is used, and does not describe this as a short-lived pairing code. Refer to the background.js token storage/reconnect behavior and the README Security section text to keep the wording aligned.

coderabbitai · 2026-07-03T18:38:14Z

+## Permissions
+
+- `debugger` — the CDP control channel (Chrome shows a banner while attached).
+- `tabs`, `tabGroups` — create/switch/group tabs.
+- `storage` — persist the server URL and pairing token.
+- `scripting` — reserved for future in-page helpers.
+- `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks
+  to your local jcode.


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Permissions list omits alarms and nativeMessaging.

manifest.json declares alarms (keepalive) and nativeMessaging (Auto-connect discovery), both documented as core features earlier in this README, but they're missing from the Permissions list.

📝 Proposed addition

- `debugger` — the CDP control channel (Chrome shows a banner while attached). - `tabs`, `tabGroups` — create/switch/group tabs. - `storage` — persist the server URL and pairing token. - `scripting` — reserved for future in-page helpers. +- `alarms` — periodic keepalive ping/reconnect while the service worker sleeps. +- `nativeMessaging` — talk to the jcode native host for zero-input Auto-connect. - `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks to your local jcode.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

## Permissions

- `debugger` — the CDP control channel (Chrome shows a banner while attached).

- `tabs`, `tabGroups` — create/switch/group tabs.

- `storage` — persist the server URL and pairing token.

- `scripting` — reserved for future in-page helpers.

- `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks

to your local jcode.

## Permissions

- `debugger` — the CDP control channel (Chrome shows a banner while attached).

- `tabs`, `tabGroups` — create/switch/group tabs.

- `storage` — persist the server URL and pairing token.

- `scripting` — reserved for future in-page helpers.

- `alarms` — periodic keepalive ping/reconnect while the service worker sleeps.

- `nativeMessaging` — talk to the jcode native host for zero-input Auto-connect.

- `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks

to your local jcode.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@extension/README.md` around lines 48 - 55, The Permissions section is missing two manifest-declared capabilities, `alarms` and `nativeMessaging`, even though they are part of the extension’s core behavior. Update the Permissions list in the README to include both entries alongside the existing permissions, keeping the descriptions aligned with their roles in keepalive and auto-connect discovery as reflected in manifest.json.

coderabbitai · 2026-07-03T18:38:14Z

+func (s *Session) scroll(ctx context.Context, t *sessionTab, req ActRequest) error {
+	dy := req.Y
+	if dy == 0 {
+		dy = 600 // default one "page" down
+	}
+	x, y := req.X, req.Y
+	if x == 0 {
+		x = 400
+	}
+	if y == 0 {
+		y = 400
+	}
+	_, err := t.conn.Send(ctx, "Input.dispatchMouseEvent", map[string]any{
+		"type": "mouseWheel", "x": x, "y": y, "deltaX": req.X, "deltaY": dy,
+	})
+	return interpretErr(err)
+}


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

scroll conflates wheel-event position with scroll delta.

Per CDP's Input.dispatchMouseEvent, x/y are the pointer position while deltaX/deltaY are independent wheel-delta values. This code sets deltaY from req.Y and deltaX from req.X, while also using req.X/req.Y as the event's x/y position (Lines 282-288). A caller cannot express "scroll by 200px" at a specific pointer location without that same value also becoming the pointer's y-coordinate — the two independent CDP concepts are collapsed onto one struct field.

🛠️ Proposed fix: separate delta fields

type ActRequest struct { Action string // click|dblclick|fill|press|hover|scroll|select|upload|dialog UID string // element uid from the latest snapshot (most actions) X, Y float64 // coordinate fallback for scroll/click + DeltaX, DeltaY float64 // scroll amount (scroll only), independent of X/Y position Value string // fill text, select value, dialog decision (accept|dismiss) Key string // for action=press (e.g. "Enter") Files []string }

func (s *Session) scroll(ctx context.Context, t *sessionTab, req ActRequest) error { - dy := req.Y + dx, dy := req.DeltaX, req.DeltaY if dy == 0 { dy = 600 // default one "page" down } x, y := req.X, req.Y if x == 0 { x = 400 } if y == 0 { y = 400 } _, err := t.conn.Send(ctx, "Input.dispatchMouseEvent", map[string]any{ - "type": "mouseWheel", "x": x, "y": y, "deltaX": req.X, "deltaY": dy, + "type": "mouseWheel", "x": x, "y": y, "deltaX": dx, "deltaY": dy, }) return interpretErr(err) }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/browser/actions.go` around lines 277 - 293, The scroll handler in Session.scroll is mixing pointer coordinates with wheel deltas by reading req.X/req.Y for both the mouse position and the scroll amount. Update the ActRequest usage so the CDP event position comes from dedicated cursor coordinates while deltaX/deltaY come from separate scroll fields, and keep the default paging behavior in scroll intact when no delta is provided.

coderabbitai · 2026-07-03T18:38:14Z

+	case "browser_tabs":
+		var in struct {
+			Op string `json:"op"`
+		}
+		_ = json.Unmarshal([]byte(toolArgs), &in)
+		switch in.Op {
+		case "", "list", "select":
+			return decisionAutoApprove, true // read-only tab ops
+		default: // new/claim/close mutate the controlled set
+			return decisionPrompt, true
+		}
+	}


🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

browser_tabs auto-approves on malformed args instead of prompting.

json.Unmarshal errors are discarded here, so if toolArgs fails to parse, in.Op stays "" and falls into the auto-approve case — the opposite of the fail-safe pattern used elsewhere in this file (read/execute explicitly check err and default to decisionPrompt). A malformed close/new/claim payload would silently bypass the prompt that's supposed to gate mutating tab operations.

🔒️ Proposed fix

case "browser_tabs": var in struct { Op string `json:"op"` } - _ = json.Unmarshal([]byte(toolArgs), &in) + if err := json.Unmarshal([]byte(toolArgs), &in); err != nil { + return decisionPrompt, true + } switch in.Op { case "", "list", "select": return decisionAutoApprove, true // read-only tab ops default: // new/claim/close mutate the controlled set return decisionPrompt, true }

Want me to add a regression test for the malformed-JSON case alongside TestDecideBrowserTiers?

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

case "browser_tabs":

var in struct {

Op string `json:"op"`

}

_ = json.Unmarshal([]byte(toolArgs), &in)

switch in.Op {

case "", "list", "select":

return decisionAutoApprove, true // read-only tab ops

default: // new/claim/close mutate the controlled set

return decisionPrompt, true

}

}

case "browser_tabs":

var in struct {

Op string `json:"op"`

}

if err := json.Unmarshal([]byte(toolArgs), &in); err != nil {

return decisionPrompt, true

}

switch in.Op {

case "", "list", "select":

return decisionAutoApprove, true // read-only tab ops

default: // new/claim/close mutate the controlled set

return decisionPrompt, true

}

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/runner/approval.go` around lines 304 - 315, The browser_tabs branch in approval.go is treating malformed JSON as a read-only op because json.Unmarshal errors are ignored and in.Op defaults to empty, which auto-approves. Update the browser_tabs handling in the decide logic to check the unmarshal error and return decisionPrompt on parse failure, matching the fail-safe behavior already used by the read and execute cases. Keep the existing decisionAutoApprove only for explicitly parsed list/select ops, and consider adding a regression test near TestDecideBrowserTiers for malformed toolArgs.

coderabbitai · 2026-07-03T18:38:15Z

+	case "browser_screenshot":
+		var in struct {
+			FullPage bool `json:"full_page"`
+		}
+		_ = json.Unmarshal([]byte(argsJSON), &in)
+		png, err := sess.Screenshot(ctx, in.FullPage)
+		if err != nil {
+			return "", err
+		}
+		id, err := env.Browser.SaveScreenshot(png)
+		if err != nil {
+			return "", err
+		}
+		// The web UI renders image_ref inline; text clients see the ref + size.
+		return fmt.Sprintf("[screenshot %dx? bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), len(png), id), nil


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Confusing screenshot output text (%dx?).

fmt.Sprintf("[screenshot %dx? bytes=%d ...", len(png), len(png), id) prints the byte length as a fake "width" (e.g. 1024x? bytes=1024), which is misleading — it isn't the image's pixel dimensions, and the ? looks like a stray debug placeholder rather than intentional. This text is surfaced directly to the agent and (per ToolCallCard.vue) rendered in the UI.

🐛 Proposed fix: drop the bogus dimension field

- return fmt.Sprintf("[screenshot %dx? bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), len(png), id), nil + return fmt.Sprintf("[screenshot bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), id), nil

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

case "browser_screenshot":

var in struct {

FullPage bool `json:"full_page"`

}

_ = json.Unmarshal([]byte(argsJSON), &in)

png, err := sess.Screenshot(ctx, in.FullPage)

if err != nil {

return "", err

}

id, err := env.Browser.SaveScreenshot(png)

if err != nil {

return "", err

}

// The web UI renders image_ref inline; text clients see the ref + size.

return fmt.Sprintf("[screenshot %dx? bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), len(png), id), nil

return fmt.Sprintf("[screenshot bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), id), nil

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tools/browser.go` around lines 88 - 102, The browser_screenshot response text is misleading because it formats the PNG byte length as a fake width with a stray “x?” placeholder. Update the return message in the browser_screenshot case of the browser tool to remove the bogus dimension field and only report the real byte size plus the image_ref, keeping the user-facing text clear and accurate.

coderabbitai · 2026-07-03T18:38:15Z

+func browserTabs(ctx context.Context, sess *browser.Session, argsJSON string) (string, error) {
+	var in struct {
+		Op    string `json:"op"`
+		TabID string `json:"tab_id"`
+	}
+	_ = json.Unmarshal([]byte(argsJSON), &in)
+	switch in.Op {
+	case "", "list":
+		tabs, err := sess.ListTabs(ctx)
+		if err != nil {
+			return "", err
+		}
+		if len(tabs) == 0 {
+			return "(no tabs)", nil
+		}
+		var b strings.Builder
+		for _, t := range tabs {
+			mark := " "
+			if t.Attached {
+				mark = "*"
+			}
+			flag := ""
+			if t.UserTab {
+				flag = " [user]"
+			}
+			fmt.Fprintf(&b, "%s %s  %q  %s%s\n", mark, shortTabID(t.ID), t.Title, t.URL, flag)
+		}
+		b.WriteString("(* = controlled by jcode)")
+		return b.String(), nil
+	case "new":
+		id, err := sess.NewTab(ctx)
+		if err != nil {
+			return "", err
+		}
+		return "opened tab " + shortTabID(id), nil
+	case "select":
+		return "selected tab " + shortTabID(in.TabID), sess.SelectTab(ctx, in.TabID)
+	case "claim":
+		return "claimed tab " + shortTabID(in.TabID), sess.ClaimTab(ctx, in.TabID)
+	case "close":
+		return "closed tab " + shortTabID(in.TabID), sess.CloseTab(ctx, in.TabID)
+	default:
+		return "", fmt.Errorf("unknown tabs op %q", in.Op)
+	}
+}


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

set -euo pipefail git ls-files | rg '^internal/tools/browser\.go$|^internal/tools/.*browser.*\.go$|^internal/tools/.*\.go$' echo '--- browser.go outline ---' ast-grep outline internal/tools/browser.go --view expanded || true echo '--- relevant lines ---' sed -n '1,260p' internal/tools/browser.go echo '--- search browserTabs and InvokableRun ---' rg -n "browserTabs|InvokableRun|SelectTab|ClaimTab|CloseTab|ListTabs|NewTab" internal/tools -g '*.go'

Repository: cnjack/jcode

Length of output: 2005

Return no success text when tab operations fail

select/claim/close currently build a success-looking message even when the session call returns an error. Return "" on error so callers don’t receive a misleading result alongside the failure.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tools/browser.go` around lines 167 - 211, The browserTabs function currently returns success text for select, claim, and close even when sess.SelectTab, sess.ClaimTab, or sess.CloseTab fails. Update browserTabs so each of those cases checks the session call result first and returns an empty string with the error on failure, only constructing the “selected/claimed/closed tab” message after a successful call.

coderabbitai · 2026-07-03T18:38:15Z

@@ -241,6 +241,7 @@ export default {
      providers: 'プロバイダー',
      mcp: 'MCP サーバー',
      skills: 'スキル',
+      browser: 'ブラウザ',


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Missing settings.browser translation group breaks the Browser settings tab in Japanese.

This file adds only the tabs.browser label but not the full settings.browser.* group that en.ts/zh-Hans.ts added at the same location (after skills, before ssh). SettingsDialog.vue's Browser tab renders via t('settings.browser.title'), enableTitle, enableDesc, control, managed, noChrome, extension, connected, notConnected, online, connectHint, chromePath, approval, navigate, interact, askEachSite, alwaysAllow, sitePermissions, add, noSitePermissions, developerMode, elevatedRisk, devModeTitle, devModeDesc — all undefined for ja, so the entire tab will show fallback/raw keys instead of Japanese text.

As per path instructions/file convention, this file's own header states: "Every other locale file (zh-Hans/zh-Hant/ja/ko) mirrors this exact shape."

🌐 Proposed fix (insert after `skills`, before `ssh`, ~line 373)

loadingHint: '読み込み中…', }, + browser: { + title: 'ブラウザ', + subtitle: 'jcode にブラウザを操作させます。マネージドブラウザはそのまま使えます。ログイン状態を再利用するには Chrome 拡張機能を接続してください。', + enableTitle: 'ブラウザ操作を有効化', + enableDesc: 'エージェントに browser_* ツールを追加します。', + control: '制御', + managed: 'マネージドブラウザ', + noChrome: 'Chrome が見つかりません — 下にパスを設定してください', + extension: 'Google Chrome 拡張機能', + connected: '自分の Chrome とログイン状態を再利用', + notConnected: '未接続 — 自分の Chrome とログイン状態を再利用', + online: '接続済み', + connectHint: '接続方法：jcode Browser Bridge 拡張機能をインストールし、ポップアップを開いて「Auto-connect to jcode」をクリックしてください。', + chromePath: 'Chrome のパス', + approval: '承認', + navigate: 'サイトを開く（ナビゲーション）', + interact: 'ページ操作（クリック / 入力）', + askEachSite: '毎回確認する', + alwaysAllow: '常に許可', + sitePermissions: 'サイトごとの権限', + add: '追加', + noSitePermissions: 'サイト固有の権限はまだありません', + developerMode: '開発者モード', + elevatedRisk: 'リスク高', + devModeTitle: 'browser_eval と生の CDP を有効化', + devModeDesc: 'エージェントがページ内で JavaScript や生の DevTools コマンドを実行できるようにします。呼び出しごとに確認が必要で、サイトの許可リストは適用されません。', + }, ssh: {

(Translations above are best-effort for review purposes — please have a native speaker verify wording before merge.)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/src/i18n/locales/ja.ts` at line 244, The Japanese locale is missing the full settings.browser translation group, so the Browser tab in SettingsDialog.vue falls back to raw keys. Add the complete settings.browser.* section in ja.ts in the same spot as the other locales, between the existing skills and ssh groups, and make sure it includes every key used by the Browser tab such as title, enableTitle, enableDesc, control, managed, noChrome, extension, connected, notConnected, online, connectHint, chromePath, approval, navigate, interact, askEachSite, alwaysAllow, sitePermissions, add, noSitePermissions, developerMode, elevatedRisk, devModeTitle, and devModeDesc.

coderabbitai · 2026-07-03T18:38:15Z

@@ -241,6 +241,7 @@ export default {
      providers: '프로바이더',
      mcp: 'MCP 서버',
      skills: '스킬',
+      browser: '브라우저',


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Missing settings.browser translation group breaks the Browser settings tab in Korean.

Same gap as in ja.ts: only the tabs.browser label was added, but the settings.browser.* group (title/subtitle/enableTitle/…/devModeDesc) that en.ts/zh-Hans.ts added between skills and ssh is missing here. SettingsDialog.vue's Browser tab depends on these keys, so Korean-locale users will see fallback/raw key text throughout that tab.

As per path instructions/file convention: "키 구조는 en.ts와 완전히 동일합니다" (key structure must exactly match en.ts).

🌐 Proposed fix (insert after `skills`, before `ssh`, ~line 373)

loadingHint: '불러오는 중…', }, + browser: { + title: '브라우저', + subtitle: 'jcode가 브라우저를 조작하도록 합니다. 관리형 브라우저는 별도 설정 없이 바로 사용할 수 있습니다. 로그인 세션을 재사용하려면 Chrome 확장 프로그램을 연결하세요.', + enableTitle: '브라우저 사용 활성화', + enableDesc: '에이전트에 browser_* 도구를 추가합니다.', + control: '제어', + managed: '관리형 브라우저', + noChrome: 'Chrome을 찾을 수 없습니다 — 아래에서 경로를 설정하세요', + extension: 'Google Chrome 확장 프로그램', + connected: '내 Chrome과 로그인 상태 재사용', + notConnected: '연결되지 않음 — 내 Chrome과 로그인 상태 재사용', + online: '연결됨', + connectHint: '연결 방법: jcode Browser Bridge 확장 프로그램을 설치하고 팝업을 연 후 "Auto-connect to jcode"를 클릭하세요.', + chromePath: 'Chrome 경로', + approval: '승인', + navigate: '사이트 열기(탐색)', + interact: '페이지 상호작용(클릭 / 입력)', + askEachSite: '매번 확인', + alwaysAllow: '항상 허용', + sitePermissions: '사이트 권한', + add: '추가', + noSitePermissions: '아직 사이트별 권한이 없습니다', + developerMode: '개발자 모드', + elevatedRisk: '위험도 높음', + devModeTitle: 'browser_eval 및 원시 CDP 활성화', + devModeDesc: '에이전트가 페이지에서 JavaScript와 원시 DevTools 명령을 실행할 수 있게 합니다. 호출마다 여전히 확인이 필요하며 사이트 허용 목록은 적용되지 않습니다.', + }, ssh: {

(Translations above are best-effort for review purposes — please have a native speaker verify wording before merge.)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/src/i18n/locales/ko.ts` at line 244, The Korean locale is missing the settings.browser translation group, so add the full settings.browser.* key set to ko.ts using the same key structure as en.ts and zh-Hans.ts. Insert the Browser settings strings between the existing skills and ssh sections, including title, subtitle, enableTitle, and devModeDesc, so SettingsDialog.vue can render the Browser tab without raw fallback keys.

Merge branch 'main' into feat/browser-use

8d96a81

cnjack commented Jul 3, 2026

View reviewed changes

fix: handle errors on resource closure and simplify return values in …

3d413bc

…session management

cnjack force-pushed the feat/browser-use branch from 10001e9 to 3d413bc Compare July 3, 2026 18:34

Merge branch 'main' into feat/browser-use

77a9ecb

cnjack merged commit 4a1f48f into main Jul 3, 2026
2 of 3 checks passed

cnjack deleted the feat/browser-use branch July 3, 2026 18:34

coderabbitai Bot reviewed Jul 3, 2026

View reviewed changes

cnjack mentioned this pull request Jul 4, 2026

fix(site,browser): store extension ids, snippet overflow, privacy page #110

Merged

This was referenced Jul 4, 2026

feat(browser,ble): native-messaging stable connect, /browser command, BLE as desktop-only feature #111

Merged

feat(hooks): user-configurable lifecycle hooks (external command, 3-surface) #114

Merged

Conversation

cnjack commented Jul 3, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Lifecycle / approval fixes included

Tests

Notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Uh oh!

cnjack left a comment

Choose a reason for hiding this comment

Senior staff engineer review — PR #107 "browser-use capability"

Finding 1 (High): browser_tabs op=select grants the same control as claim, without the approval prompt claim requires

Finding 2 (High): Extension bridge accepts WebSocket connections from any Origin, unlike every other socket in this codebase

Finding 3 (Medium-High): Bridge tokens never expire and cannot actually be revoked, despite the extension telling users they can

Finding 4 (Medium): Cached tab origin goes stale after an in-page navigation, letting browser_act auto-approve on the wrong origin

Finding 5 (Medium): The PR's headline concurrency fix has no test coverage, and CI doesn't run -race

Finding 6 (Medium): Manager.Close() is skipped on one web-server shutdown error path, leaking the Chrome process

Additional Medium/Low items (worth fixing, not individually blocking)

Top Findings

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cnjack commented Jul 3, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jul 3, 2026 •

edited

Loading

Finding 1 (High): `browser_tabs op=select` grants the same control as `claim`, without the approval prompt `claim` requires

Finding 4 (Medium): Cached tab origin goes stale after an in-page navigation, letting `browser_act` auto-approve on the wrong origin

Finding 5 (Medium): The PR's headline concurrency fix has no test coverage, and CI doesn't run `-race`

Finding 6 (Medium): `Manager.Close()` is skipped on one web-server shutdown error path, leaking the Chrome process