feat(browser): browser-use capability (managed CDP + extension bridge)#107
Conversation
…idge) Introduces browser use: the agent can navigate, snapshot, interact, screenshot, read, and (in dev mode) eval a page. Two backends behind one Session interface — a jcode-managed Chrome over CDP, and the user's Chrome/Edge via a Chrome extension WebSocket bridge + native-messaging host. Approval has three tiers (read-only auto, navigate/interact site-scoped, high-risk always-prompt) with per-origin site permissions in the web Settings UI. Wired into TUI and web; tools registered per config; screenshots served over /api/browser/shots. Includes lifecycle/approval fixes: - Session.Close no longer tears down the shared managed Chrome (the Manager owns it and reuses it across tasks); getManaged drops and relaunches a dead backend; web mode closes the manager on exit so Chrome isn't orphaned. - browser_act scopes its per-site "interact" permission by the active tab's origin from the session instead of a hardcoded empty origin, so interact=allow and the interact class default actually take effect. - browser_act action=reload is implemented (reuses Session.Reload). Generated with Jack AI bot
|
Warning Review limit reached
Next review available in: 37 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
📝 WalkthroughWalkthroughThis PR adds a native browser-use capability to jcode: Go packages implementing managed Chrome (CDP) and extension-bridge backends, session/snapshot/action logic, a Chrome MV3 extension with native messaging host installation, config/approval/tool wiring, Web server endpoints and frontend settings UI, plus design documentation. ChangesBrowser Use Feature
Estimated code review effort: 5 (Critical) | ~120 minutes Sequence Diagram(s)sequenceDiagram
participant User
participant WebUI
participant TaskEnv
participant ApprovalState
participant Session
participant Manager
participant Backend
User->>WebUI: trigger browser_act tool call
WebUI->>TaskEnv: dispatchBrowser(browser_act)
TaskEnv->>ApprovalState: decideBrowser(origin, class)
ApprovalState-->>TaskEnv: auto-approve or prompt
TaskEnv->>Session: BrowserSession(ctx)
Session->>Manager: OpenSession(ctx)
Manager->>Backend: NewTab / AttachTab
Backend-->>Manager: TabConn
Manager-->>Session: Session ready
TaskEnv->>Session: Act(req)
Session->>Backend: CDP Input/DOM commands
Backend-->>Session: result
Session-->>WebUI: action summary
sequenceDiagram
participant Extension
participant NativeHost
participant Bridge
participant Session
Extension->>NativeHost: request endpoint (native messaging)
NativeHost-->>Extension: ws URL + token
Extension->>Bridge: hello + token (websocket)
Bridge-->>Extension: welcome
Session->>Bridge: tab.new / cdp.send
Bridge->>Extension: envelope request
Extension->>Extension: chrome.debugger.sendCommand
Extension-->>Bridge: envelope response
Bridge-->>Session: result
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
cnjack
left a comment
There was a problem hiding this comment.
Senior staff engineer review — PR #107 "browser-use capability"
Overall Risk: High — two confirmed authorization/security gaps below allow, respectively, an approval-tier bypass for claiming control of the user's existing browser tabs, and removal of an origin check this codebase otherwise treats as mandatory for any WebSocket bridging an untrusted webpage to a privileged capability. Both are exploitable within the PR's own stated threat model (an AI agent acting on possibly-adversarial page content, driving the user's real logged-in browser). Recommend addressing Findings 1 and 2 before merge.
Reviewed the full diff (50 files, ~6k lines) across four subsystems: the CDP/managed-Chrome driver, the extension+native-messaging+WebSocket bridge, the approval/authorization layer, and the web API/frontend surface. The core lifecycle fix the PR claims (shared managed Chrome surviving task teardown, dead-backend detection/relaunch) is real, correctly implemented, and mutex-safe. The specific bug the PR says it fixed (hardcoded empty origin in browser_act's site-permission check) is genuinely fixed and failed closed even before the fix, so it was never a silent-approval hole.
Finding 1 (High): browser_tabs op=select grants the same control as claim, without the approval prompt claim requires
Impact: Live authorization bypass in exactly the threat model this PR targets. Defeats the "claim a pre-existing user tab requires approval" tier entirely.
Evidence:
internal/runner/approval.go:304-314auto-approvesbrowser_tabswhenopis"","list", or"select", on the theory thatselectis "read-only."- But
Session.SelectTab(internal/browser/session.go:421-436) does not require the tab to already be under jcode's control: iftabIDisn't already tracked, it callss.backend.AttachTab(ctx, tabID)and registers it — the identical attach callClaimTabuses (session.go:439-449, which is literallySelectTabplus flipping a cleanup-onlycreatedflag). - So
browser_tabs op=list(auto-approved) enumerates all open tabs, including the user's own untouched tabs (e.g. a logged-in banking/email tab via the extension backend), andbrowser_tabs op=select tab_id=<that id>(also auto-approved) attaches to and takes control of it with zero prompt. If that tab's origin also hasinteractpre-approved (see Finding 3 below), the very nextbrowser_actcan click/fill on it with no user awareness at all.
Suggested Fix: The approval decision for browser_tabs needs session state, not just the op string: auto-approve select only when the target tab is already in the session's controlled set; otherwise treat it like claim (prompt). Expose an "already controlled" predicate to the approval layer, or move SelectTab's attach-if-uncontrolled branch behind the same gate ClaimTab uses.
Finding 2 (High): Extension bridge accepts WebSocket connections from any Origin, unlike every other socket in this codebase
Impact: The extension bridge is the trust boundary between jcode and the user's real, logged-in browser session (full CDP access — cookies, navigation, arbitrary JS in page context; extension/background.js:243-251 forwards CDP calls with no method allow-list). Any webpage the browser visits can open a raw WebSocket to this bridge — browsers don't apply same-origin/CORS restrictions to WS handshakes, only server-side Origin checks do.
Evidence:
internal/browser/bridge.go:34:upgrader: websocket.Upgrader{CheckOrigin: func(*http.Request) bool { return true }}.- This codebase already has the correct pattern for exactly this risk:
internal/web/server.go:2569-2573definesisAllowedWebOriginwith a comment explaining precisely this ("Access-Control-Allow-Origin: *plus a WebSocketCheckOriginthat returns true would let any website the user visits drive the agent..."), andinternal/web/pty.go:59correctly wiresCheckOrigin: isAllowedWebOrigin.bridge.godoesn't reuse it. - Today a valid bearer token is still required (
bridge.go:83), so this isn't independently exploitable yet — but it removes a defense-in-depth layer this codebase treats as load-bearing everywhere else, and turns any future token leak (Finding 3) into remotely web-exploitable rather than local-only.
Suggested Fix: Use isAllowedWebOrigin as the CheckOrigin for this upgrader, matching pty.go.
Finding 3 (Medium-High): Bridge tokens never expire and cannot actually be revoked, despite the extension telling users they can
Evidence:
internal/browser/bridge.go:56-63(IssueToken) mints tokens but nothing ever removes an old one; no expiry/rotation logic anywhere ininternal/browser.- The extension's "Disconnect" button (
extension/popup/popup.js:78-82→background.js:331-338) only clears localchrome.storage.local— no revoke endpoint or bridge message invalidates the token server-side. extension/README.md:60-61explicitly tells users "Use the popup's Disconnect to revoke the token" — false.
Suggested Fix: Rotate/invalidate the previous token on each new auth, give tokens a TTL, wire Disconnect to an authenticated server-side revoke.
Finding 4 (Medium): Cached tab origin goes stale after an in-page navigation, letting browser_act auto-approve on the wrong origin
Evidence: browser_act's origin check reads Session.CurrentOrigin() (internal/browser/session.go:212-223), a cache refreshed only by Open, Reload, Snapshot. Act() (internal/browser/actions.go:23-83) never writes the post-action URL back into the cache. So: snapshot origin A (approved) → click navigates cross-origin to B → cache still A → next press/scroll action is checked against stale origin A and auto-approved, though it executes on unapproved origin B.
Suggested Fix: Refresh t.url from the post-action URL in Act(), and/or defensively re-derive origin for actions that don't require a fresh snapshot.
Finding 5 (Medium): The PR's headline concurrency fix has no test coverage, and CI doesn't run -race
Evidence: No manager_test.go exists in internal/browser/; nothing constructs a browser.Manager. getManaged's dead-backend-detect/relaunch logic (manager.go:135-156) — the riskiest new state mutation in the PR — is untested. .github/workflows/ci.yml:52 runs go test ./... without -race, despite the PR description citing -race as part of its own verification.
Suggested Fix: Add a manager_test.go exercising getManaged directly (dead-backend detection, relaunch, concurrent OpenSession during relaunch). Add -race to CI, at least for internal/browser and internal/runner.
Finding 6 (Medium): Manager.Close() is skipped on one web-server shutdown error path, leaking the Chrome process
Evidence: internal/command/web.go:782-790:
if err := srv.Start(ctx); err != nil {
return fmt.Errorf("server error: %w", err) // returns here, skips cleanup below
}
srv.CloseAllEngines()
_ = browserMgr.Close()Interactive mode does this correctly with defer browserMgr.Close() (internal/command/interactive.go:941); web mode should match, or a genuine Serve error orphans the Chrome process and locks its profile dir for the next start.
Additional Medium/Low items (worth fixing, not individually blocking)
- Global
Approval["interact"/"navigate"] = "always_allow"(internal/command/web.go:109-127) is not origin-scoped, contradicting the PR's "site-scoped" tier-2 model — grants blanket no-prompt interact/navigate on every site. - Screenshots are unfetchable whenever
requireAuthis on:ToolCallCard.vue's<img>/<a>tags hit/api/browser/shots/{id}directly without the bearer tokenapi.tsattaches, and that route isn't in the auth-exempt/query-token allowlist — 401s in exactly the self-hosted/multi-user deployment this feature targets. SettingsDialog.vue's permission editor saves optimistically with no error handling — a failedPOST /api/browser/configonly logs to console; UI can silently diverge from persisted permission state.- Chrome crashes mid-task are misclassified as "user took back control" (
ErrControlInterruptedvia a"connection closed"substring match,actions.go:315-325), masking real crashes; aSessionwhose backend dies has no recovery path within the same task. getManagedholds the Manager-wide mutex across the full Chrome launch (up to ~30s,manager.go:136-137), blockingStatus/Closeduring that window.
Top Findings
browser_tabs op=selectbypasses theclaimapproval gate — takes control of any open tab without a prompt. (internal/runner/approval.go:304-314,internal/browser/session.go:421-436)- Extension bridge WebSocket accepts connections from any Origin. (
internal/browser/bridge.go:34) - Bridge auth tokens never expire; advertised "Disconnect to revoke" is non-functional. (
internal/browser/bridge.go,tokens.go,extension/popup/popup.js,extension/README.md:60-61) - Stale cached tab origin lets
browser_actauto-approve after a cross-origin navigation. (internal/browser/session.go:212-223,internal/browser/actions.go:23-83) - Headline concurrency fix (
Manager.getManagedrelaunch) has zero test coverage; CI doesn't run-race. (internal/browser/manager.go:135-156)
Generated by Claude Code
…session management
There was a problem hiding this comment.
Actionable comments posted: 15
🧹 Nitpick comments (3)
internal-doc/browser-use-design.md (1)
78-93: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd language tags to the fenced examples. markdownlint is already flagging these bare blocks (MD040); marking them as
textwill keep the doc lint-clean without changing rendering.Also applies to: 104-111, 161-177, 187-193
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal-doc/browser-use-design.md` around lines 78 - 93, Add explicit markdown language tags to all fenced example blocks in browser-use-design.md that are currently unlabeled, especially the diagram/example blocks around the internal/browser architecture and the other referenced sections; use text as the fence language so markdownlint MD040 passes while preserving the current rendering. Identify and update each bare fenced block in the affected document sections rather than changing the content inside them.Source: Linters/SAST tools
internal/browser/discover.go (1)
221-256: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winFail fast when Chrome exits before announcing DevTools.
If Chrome exits early (invalid path/flags, crash, or forwarding to an already-running instance on the same profile), the stderr scanner hits EOF and
wsChnever receives, soLaunchblocks for the full 30s before returning a generic timeout. Watching the process exit lets you surface the real failure immediately.♻️ Detect early exit alongside the DevTools announcement
wsCh := make(chan string, 1) go func() { scanner := bufio.NewScanner(stderr) scanner.Buffer(make([]byte, 64*1024), 1024*1024) for scanner.Scan() { if m := devtoolsRe.FindStringSubmatch(scanner.Text()); m != nil { select { case wsCh <- m[1]: default: } // Keep draining so Chrome never blocks on a full stderr pipe. } } }() + exitCh := make(chan error, 1) + go func() { exitCh <- cmd.Wait() }() + launchCtx, cancel := context.WithTimeout(ctx, 30*time.Second) defer cancel() select { case wsURL := <-wsCh: stop := func() { _ = cmd.Process.Kill() - _, _ = cmd.Process.Wait() + <-exitCh } backend, err := connectManaged(launchCtx, wsURL, stop) if err != nil { stop() return nil, err } config.Logger().Printf("[browser] managed chrome started pid=%d ws=%s", cmd.Process.Pid, wsURL) return backend, nil + case err := <-exitCh: + return nil, fmt.Errorf("chrome exited before announcing DevTools endpoint: %w", err) case <-launchCtx.Done(): _ = cmd.Process.Kill() - _, _ = cmd.Process.Wait() + <-exitCh return nil, fmt.Errorf("chrome did not announce DevTools endpoint within 30s") }Note: this also replaces the direct
cmd.Process.Wait()calls withcmd.Wait(), which is the correct reaper when usingStderrPipe. Please verify the stderr-draining goroutine still completes cleanly on EOF with this change.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/browser/discover.go` around lines 221 - 256, In discover.go’s Launch flow, Chrome can exit before emitting the DevTools URL and the current select only waits on wsCh or the 30s timeout. Add a process-exit signal alongside the stderr scanner so the code fails fast when cmd exits early, surfaces that error instead of a generic timeout, and still keeps draining stderr until EOF. While doing this, replace the direct cmd.Process.Wait() cleanup in the stop/timeout paths with cmd.Wait() so the StderrPipe is reaped correctly and the goroutine can terminate cleanly.internal/tools/browser.go (1)
67-139: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winInconsistent handling of malformed tool args.
browser_act(Line 151-153) returns an"invalid args: %w"error when JSON unmarshal fails, but every other branch here (browser_open,browser_snapshot,browser_screenshot,browser_read,browser_tabs,browser_eval) uses_ = json.Unmarshal(...)and silently proceeds with a zero-valued struct on malformed JSON. This contradicts the tool guideline to return descriptive errors that help the agent self-correct, and can produce confusing silent-default behavior (e.g. a malformedbrowser_screenshotcall always falls back tofull_page=falseinstead of surfacing the parse failure).As per path instructions, "Return descriptive error strings that help the agent self-correct. Include file paths, line numbers, or command output in error messages."
♻️ Example fix for one branch (repeat pattern for the others)
case "browser_snapshot": var in struct { Filter string `json:"filter"` MaxLines int `json:"max_lines"` } - _ = json.Unmarshal([]byte(argsJSON), &in) + if err := json.Unmarshal([]byte(argsJSON), &in); err != nil { + return "", fmt.Errorf("invalid args: %w", err) + } return sess.Snapshot(ctx, in.Filter, in.MaxLines)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tools/browser.go` around lines 67 - 139, The browser tool dispatch handlers are swallowing malformed JSON args and defaulting to zero values instead of returning self-correcting errors. Update dispatchBrowser to handle json.Unmarshal failures consistently in each branch (browser_open, browser_snapshot, browser_screenshot, browser_read, browser_tabs, browser_eval), matching the browser_act pattern by returning a descriptive “invalid args” error that includes the parse failure. Keep the validation for required fields like URL and expression after successful parsing so the tool fails fast on bad input.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@extension/background.js`:
- Around line 38-50: The stop() shutdown path in background.js does not release
debugger attachments, so stale-token and reconnect-failure exits leave tabs
stuck attached. Update stop() to detach every tab tracked in attached before
clearing websocket/timer state, using the existing detachTab flow so the
disconnect cleanup is centralized. Keep the same stop() behavior for
ws/connected/badge/token cleanup, but make sure attached tabs are always
detached regardless of whether stop() is called from the stale-token path or the
give-up reconnect paths.
- Around line 261-296: groupTab is creating a fresh tab group on every
invocation instead of reusing the existing jcode 🔎 group. Update groupTab to
keep track of the created group id and pass it back into chrome.tabs.group when
adding later tabs, so tab.attach/tab.new calls join the same group instead of
spawning separate ones.
In `@extension/README.md`:
- Around line 34-36: The README security wording is inconsistent with the actual
reconnect behavior in background.js and the token flow described earlier. Update
the Security section and nearby token description so both use the same
terminology and clearly state that the extension stores a long-lived token in
chrome.storage.local, reconnects silently until Disconnect is used, and does not
describe this as a short-lived pairing code. Refer to the background.js token
storage/reconnect behavior and the README Security section text to keep the
wording aligned.
- Around line 48-55: The Permissions section is missing two manifest-declared
capabilities, `alarms` and `nativeMessaging`, even though they are part of the
extension’s core behavior. Update the Permissions list in the README to include
both entries alongside the existing permissions, keeping the descriptions
aligned with their roles in keepalive and auto-connect discovery as reflected in
manifest.json.
In `@internal/browser/actions.go`:
- Around line 277-293: The scroll handler in Session.scroll is mixing pointer
coordinates with wheel deltas by reading req.X/req.Y for both the mouse position
and the scroll amount. Update the ActRequest usage so the CDP event position
comes from dedicated cursor coordinates while deltaX/deltaY come from separate
scroll fields, and keep the default paging behavior in scroll intact when no
delta is provided.
In `@internal/browser/bridge.go`:
- Around line 160-234: The bridge connection can hang indefinitely because
bridgeConn.request writes with c.ws.WriteJSON(env) before honoring ctx, and
readLoop clears deadlines without any keepalive. Update request to bound the
write with the caller context or a write deadline, and add WebSocket liveness
handling in readLoop (for example pong handling and periodic ping/control
writes) so dead or suspended peers are detected and c.closed/c.closeErr are set
instead of leaving browser_* calls stuck forever.
- Around line 31-38: NewBridge currently configures
websocket.Upgrader.CheckOrigin to allow every origin, which bypasses browser
origin protection. Update the Bridge initialization so CheckOrigin only returns
true for the companion extension’s exact chrome-extension://<id> origin, using
the Bridge/NewBridge upgrader setup as the place to enforce it. Keep the
existing token loading and token validation flow unchanged so tokens remain the
primary authentication path.
In `@internal/browser/manager.go`:
- Around line 36-41: Restrict browser screenshot storage to owner-only access in
Manager and the screenshot write path. Update NewManager and the screenshot save
logic used around Manager/bridge so shotDir is created with private permissions
and individual screenshot files are written with user-only permissions instead
of world-readable modes. Keep the fix localized to the Manager setup and the
screenshot file creation code paths.
In `@internal/browser/session.go`:
- Around line 386-407: The ListTabs method computes s.active but never uses it,
so the returned TabInfo slice does not mark the currently active tab. Update
Session.ListTabs to either remove the unused active lookup or, preferably, use
it when iterating backend tabs to set the matching TabInfo as active (for
example via an Active field on TabInfo if present/added), while keeping the
existing Attached flag logic based on controlled tabs.
In `@internal/command/web.go`:
- Around line 106-129: The per-site override logic in browserSitePreapproved is
treating an unset BrowserSitePermission field as an explicit deny instead of
falling back to the class default. Update the origin match handling so that for
the requested class, a site-specific value only overrides when it is set (e.g.
navigate/interact is "allow" or "deny"), and otherwise defer to
bc.Approval[class]. Keep the lookup scoped to browserSitePreapproved and its use
of BrowserSitePermission, SitePermissions, and Approval.
In `@internal/runner/approval.go`:
- Around line 304-315: The browser_tabs branch in approval.go is treating
malformed JSON as a read-only op because json.Unmarshal errors are ignored and
in.Op defaults to empty, which auto-approves. Update the browser_tabs handling
in the decide logic to check the unmarshal error and return decisionPrompt on
parse failure, matching the fail-safe behavior already used by the read and
execute cases. Keep the existing decisionAutoApprove only for explicitly parsed
list/select ops, and consider adding a regression test near
TestDecideBrowserTiers for malformed toolArgs.
In `@internal/tools/browser.go`:
- Around line 88-102: The browser_screenshot response text is misleading because
it formats the PNG byte length as a fake width with a stray “x?” placeholder.
Update the return message in the browser_screenshot case of the browser tool to
remove the bogus dimension field and only report the real byte size plus the
image_ref, keeping the user-facing text clear and accurate.
- Around line 167-211: The browserTabs function currently returns success text
for select, claim, and close even when sess.SelectTab, sess.ClaimTab, or
sess.CloseTab fails. Update browserTabs so each of those cases checks the
session call result first and returns an empty string with the error on failure,
only constructing the “selected/claimed/closed tab” message after a successful
call.
In `@web/src/i18n/locales/ja.ts`:
- Line 244: The Japanese locale is missing the full settings.browser translation
group, so the Browser tab in SettingsDialog.vue falls back to raw keys. Add the
complete settings.browser.* section in ja.ts in the same spot as the other
locales, between the existing skills and ssh groups, and make sure it includes
every key used by the Browser tab such as title, enableTitle, enableDesc,
control, managed, noChrome, extension, connected, notConnected, online,
connectHint, chromePath, approval, navigate, interact, askEachSite, alwaysAllow,
sitePermissions, add, noSitePermissions, developerMode, elevatedRisk,
devModeTitle, and devModeDesc.
In `@web/src/i18n/locales/ko.ts`:
- Line 244: The Korean locale is missing the settings.browser translation group,
so add the full settings.browser.* key set to ko.ts using the same key structure
as en.ts and zh-Hans.ts. Insert the Browser settings strings between the
existing skills and ssh sections, including title, subtitle, enableTitle, and
devModeDesc, so SettingsDialog.vue can render the Browser tab without raw
fallback keys.
---
Nitpick comments:
In `@internal-doc/browser-use-design.md`:
- Around line 78-93: Add explicit markdown language tags to all fenced example
blocks in browser-use-design.md that are currently unlabeled, especially the
diagram/example blocks around the internal/browser architecture and the other
referenced sections; use text as the fence language so markdownlint MD040 passes
while preserving the current rendering. Identify and update each bare fenced
block in the affected document sections rather than changing the content inside
them.
In `@internal/browser/discover.go`:
- Around line 221-256: In discover.go’s Launch flow, Chrome can exit before
emitting the DevTools URL and the current select only waits on wsCh or the 30s
timeout. Add a process-exit signal alongside the stderr scanner so the code
fails fast when cmd exits early, surfaces that error instead of a generic
timeout, and still keeps draining stderr until EOF. While doing this, replace
the direct cmd.Process.Wait() cleanup in the stop/timeout paths with cmd.Wait()
so the StderrPipe is reaped correctly and the goroutine can terminate cleanly.
In `@internal/tools/browser.go`:
- Around line 67-139: The browser tool dispatch handlers are swallowing
malformed JSON args and defaulting to zero values instead of returning
self-correcting errors. Update dispatchBrowser to handle json.Unmarshal failures
consistently in each branch (browser_open, browser_snapshot, browser_screenshot,
browser_read, browser_tabs, browser_eval), matching the browser_act pattern by
returning a descriptive “invalid args” error that includes the parse failure.
Keep the validation for required fields like URL and expression after successful
parsing so the tool fails fast on bad input.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 402ea967-8b2b-4b16-a9c5-af6f0395689f
⛔ Files ignored due to path filters (3)
extension/icons/icon128.pngis excluded by!**/*.pngextension/icons/icon16.pngis excluded by!**/*.pngextension/icons/icon48.pngis excluded by!**/*.png
📒 Files selected for processing (47)
cmd/jcode/main.goextension/README.mdextension/background.jsextension/manifest.jsonextension/popup/popup.htmlextension/popup/popup.jsinternal-doc/browser-use-design.mdinternal-doc/browser-use-ui.htmlinternal/browser/actions.gointernal/browser/bridge.gointernal/browser/bridge_test.gointernal/browser/cdp.gointernal/browser/cdp_test.gointernal/browser/discover.gointernal/browser/manager.gointernal/browser/nativehost.gointernal/browser/nativehost_notwindows.gointernal/browser/nativehost_test.gointernal/browser/nativehost_windows.gointernal/browser/perms.gointernal/browser/session.gointernal/browser/session_test.gointernal/browser/smoke_test.gointernal/browser/snapshot.gointernal/browser/snapshot_test.gointernal/browser/tokens.gointernal/command/interactive.gointernal/command/web.gointernal/config/config.gointernal/handler/web.gointernal/runner/approval.gointernal/runner/approval_browser_test.gointernal/skills/builtin/browser-use/SKILL.mdinternal/tools/browser.gointernal/tools/env.gointernal/web/auth.gointernal/web/browser.gointernal/web/engine.gointernal/web/server.goweb/src/components/SettingsDialog.vueweb/src/components/ToolCallCard.vueweb/src/composables/api.tsweb/src/i18n/locales/en.tsweb/src/i18n/locales/ja.tsweb/src/i18n/locales/ko.tsweb/src/i18n/locales/zh-Hans.tsweb/src/i18n/locales/zh-Hant.ts
| function stop(forget) { | ||
| desired = false; | ||
| if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; } | ||
| if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; } | ||
| if (ws) { | ||
| ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null; | ||
| try { ws.close(); } catch {} | ||
| ws = null; | ||
| } | ||
| connected = false; | ||
| chrome.action.setBadgeText({ text: "" }); | ||
| if (forget) chrome.storage.local.remove("token"); | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
stop() leaves debugger attached to tabs when connection dies.
stop() is the "hard-off switch" but never detaches attached tabs. On the stale-token path (line 150) and both give-up paths (lines 185, 191) it's called without first detaching, unlike the explicit "disconnect" handler (line 334) which manually loops detachTab first. Result: after a stale token or repeated reconnect failures, Chrome keeps the debugger attached (banner + "jcode 🔎" group stays) with no live connection to ever release it.
🔧 Proposed fix: detach tabs inside stop()
function stop(forget) {
desired = false;
if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }
if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; }
if (ws) {
ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null;
try { ws.close(); } catch {}
ws = null;
}
connected = false;
chrome.action.setBadgeText({ text: "" });
+ for (const tabId of [...attached]) {
+ try { chrome.debugger.detach({ tabId }); } catch {}
+ attached.delete(tabId);
+ }
if (forget) chrome.storage.local.remove("token");
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function stop(forget) { | |
| desired = false; | |
| if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; } | |
| if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; } | |
| if (ws) { | |
| ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null; | |
| try { ws.close(); } catch {} | |
| ws = null; | |
| } | |
| connected = false; | |
| chrome.action.setBadgeText({ text: "" }); | |
| if (forget) chrome.storage.local.remove("token"); | |
| } | |
| function stop(forget) { | |
| desired = false; | |
| if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; } | |
| if (connectTimer) { clearTimeout(connectTimer); connectTimer = null; } | |
| if (ws) { | |
| ws.onclose = null; ws.onerror = null; ws.onmessage = null; ws.onopen = null; | |
| try { ws.close(); } catch {} | |
| ws = null; | |
| } | |
| connected = false; | |
| chrome.action.setBadgeText({ text: "" }); | |
| for (const tabId of [...attached]) { | |
| try { chrome.debugger.detach({ tabId }); } catch {} | |
| attached.delete(tabId); | |
| } | |
| if (forget) chrome.storage.local.remove("token"); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@extension/background.js` around lines 38 - 50, The stop() shutdown path in
background.js does not release debugger attachments, so stale-token and
reconnect-failure exits leave tabs stuck attached. Update stop() to detach every
tab tracked in attached before clearing websocket/timer state, using the
existing detachTab flow so the disconnect cleanup is centralized. Keep the same
stop() behavior for ws/connected/badge/token cleanup, but make sure attached
tabs are always detached regardless of whether stop() is called from the
stale-token path or the give-up reconnect paths.
| function attachTab(tabId) { | ||
| return new Promise((resolve, reject) => { | ||
| if (attached.has(tabId)) return resolve(); | ||
| chrome.debugger.attach({ tabId }, DEBUGGER_VERSION, () => { | ||
| if (chrome.runtime.lastError) return reject(new Error(chrome.runtime.lastError.message)); | ||
| attached.add(tabId); | ||
| resolve(); | ||
| }); | ||
| }); | ||
| } | ||
|
|
||
| function detachTab(tabId) { | ||
| return new Promise((resolve) => { | ||
| if (!attached.has(tabId)) return resolve(); | ||
| chrome.debugger.detach({ tabId }, () => { | ||
| attached.delete(tabId); | ||
| resolve(); | ||
| }); | ||
| }); | ||
| } | ||
|
|
||
| function sendCDP(tabId, method, params) { | ||
| return new Promise((resolve, reject) => { | ||
| chrome.debugger.sendCommand({ tabId }, method, params || {}, (result) => { | ||
| if (chrome.runtime.lastError) return reject(new Error(chrome.runtime.lastError.message)); | ||
| resolve(result); | ||
| }); | ||
| }); | ||
| } | ||
|
|
||
| async function groupTab(tabId) { | ||
| try { | ||
| const groupId = await chrome.tabs.group({ tabIds: [tabId] }); | ||
| await chrome.tabGroups.update(groupId, { title: GROUP_TITLE, color: "orange" }); | ||
| } catch {} | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🌐 Web query:
Does chrome.tabs.group create a new group every call when groupId is omitted, even if a prior group with matching tabs already exists?
💡 Result:
Yes, chrome.tabs.group creates a new tab group every time it is called when the groupId parameter is omitted [1][2]. The API does not perform any automatic check to see if an existing group with matching properties or contents already exists [2][3]. If you do not provide a groupId, the browser will instantiate a new, distinct group for the specified tabs [1]. To add tabs to an existing group, you must manually identify the target group's ID (typically by querying existing groups using chrome.tabGroups.query) and explicitly pass that ID to the groupId parameter in your chrome.tabs.group call [2]. If you wish to manage group creation logically—such as grouping tabs by domain—your extension code must implement the logic to check for existing groups first, and only create a new group if no suitable existing one is found [2].
Citations:
- 1: https://stackoverflow.com/questions/69599168/how-to-add-chrome-tab-group-in-javascript
- 2: https://bestchromeextensions.com/docs/guides/tab-groups/
- 3: https://chromium.googlesource.com/chromium/src.git/+/refs/heads/main/chrome/common/extensions/api/tab_groups.json
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Map the relevant file and locate group-related logic/usages.
git ls-files extension/background.js
echo '--- outline ---'
ast-grep outline extension/background.js --view expanded || true
echo '--- groupTab usages ---'
rg -n "groupTab|tabs\.group|tabGroups\.update|tabGroups\.query|GROUP_TITLE|jcode" extension/background.js extension || true
echo '--- surrounding lines ---'
sed -n '220,340p' extension/background.jsRepository: cnjack/jcode
Length of output: 2005
groupTab creates a new tab group on every call. chrome.tabs.group({ tabIds: [tabId] }) without groupId starts a fresh group each time, so every tab.new/tab.attach lands in a separate jcode 🔎 group. Track the group id and reuse it when adding tabs.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@extension/background.js` around lines 261 - 296, groupTab is creating a fresh
tab group on every invocation instead of reusing the existing jcode 🔎 group.
Update groupTab to keep track of the created group id and pass it back into
chrome.tabs.group when adding later tabs, so tab.attach/tab.new calls join the
same group instead of spawning separate ones.
| Auto-connect exchanges for a long-lived token in `chrome.storage.local`; | ||
| afterwards the extension reconnects silently — you connect once. Use | ||
| **Disconnect** in the popup to stop and forget the token. |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win
Security section contradicts the token description above.
Lines 34-36 describe a "long-lived token" persisted in storage and reused silently, but the Security section (lines 59-60) says the bridge "authenticates with a short-lived pairing code." This mismatches actual behavior in background.js (token stored via chrome.storage.local and reused across reconnects until Disconnect) and could confuse users about the real security model.
Also applies to: 57-61
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@extension/README.md` around lines 34 - 36, The README security wording is
inconsistent with the actual reconnect behavior in background.js and the token
flow described earlier. Update the Security section and nearby token description
so both use the same terminology and clearly state that the extension stores a
long-lived token in chrome.storage.local, reconnects silently until Disconnect
is used, and does not describe this as a short-lived pairing code. Refer to the
background.js token storage/reconnect behavior and the README Security section
text to keep the wording aligned.
| ## Permissions | ||
|
|
||
| - `debugger` — the CDP control channel (Chrome shows a banner while attached). | ||
| - `tabs`, `tabGroups` — create/switch/group tabs. | ||
| - `storage` — persist the server URL and pairing token. | ||
| - `scripting` — reserved for future in-page helpers. | ||
| - `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks | ||
| to your local jcode. |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Permissions list omits alarms and nativeMessaging.
manifest.json declares alarms (keepalive) and nativeMessaging (Auto-connect discovery), both documented as core features earlier in this README, but they're missing from the Permissions list.
📝 Proposed addition
- `debugger` — the CDP control channel (Chrome shows a banner while attached).
- `tabs`, `tabGroups` — create/switch/group tabs.
- `storage` — persist the server URL and pairing token.
- `scripting` — reserved for future in-page helpers.
+- `alarms` — periodic keepalive ping/reconnect while the service worker sleeps.
+- `nativeMessaging` — talk to the jcode native host for zero-input Auto-connect.
- `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks
to your local jcode.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Permissions | |
| - `debugger` — the CDP control channel (Chrome shows a banner while attached). | |
| - `tabs`, `tabGroups` — create/switch/group tabs. | |
| - `storage` — persist the server URL and pairing token. | |
| - `scripting` — reserved for future in-page helpers. | |
| - `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks | |
| to your local jcode. | |
| ## Permissions | |
| - `debugger` — the CDP control channel (Chrome shows a banner while attached). | |
| - `tabs`, `tabGroups` — create/switch/group tabs. | |
| - `storage` — persist the server URL and pairing token. | |
| - `scripting` — reserved for future in-page helpers. | |
| - `alarms` — periodic keepalive ping/reconnect while the service worker sleeps. | |
| - `nativeMessaging` — talk to the jcode native host for zero-input Auto-connect. | |
| - `host_permissions` limited to `127.0.0.1` / `localhost` — it only ever talks | |
| to your local jcode. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@extension/README.md` around lines 48 - 55, The Permissions section is missing
two manifest-declared capabilities, `alarms` and `nativeMessaging`, even though
they are part of the extension’s core behavior. Update the Permissions list in
the README to include both entries alongside the existing permissions, keeping
the descriptions aligned with their roles in keepalive and auto-connect
discovery as reflected in manifest.json.
| func (s *Session) scroll(ctx context.Context, t *sessionTab, req ActRequest) error { | ||
| dy := req.Y | ||
| if dy == 0 { | ||
| dy = 600 // default one "page" down | ||
| } | ||
| x, y := req.X, req.Y | ||
| if x == 0 { | ||
| x = 400 | ||
| } | ||
| if y == 0 { | ||
| y = 400 | ||
| } | ||
| _, err := t.conn.Send(ctx, "Input.dispatchMouseEvent", map[string]any{ | ||
| "type": "mouseWheel", "x": x, "y": y, "deltaX": req.X, "deltaY": dy, | ||
| }) | ||
| return interpretErr(err) | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
scroll conflates wheel-event position with scroll delta.
Per CDP's Input.dispatchMouseEvent, x/y are the pointer position while deltaX/deltaY are independent wheel-delta values. This code sets deltaY from req.Y and deltaX from req.X, while also using req.X/req.Y as the event's x/y position (Lines 282-288). A caller cannot express "scroll by 200px" at a specific pointer location without that same value also becoming the pointer's y-coordinate — the two independent CDP concepts are collapsed onto one struct field.
🛠️ Proposed fix: separate delta fields
type ActRequest struct {
Action string // click|dblclick|fill|press|hover|scroll|select|upload|dialog
UID string // element uid from the latest snapshot (most actions)
X, Y float64 // coordinate fallback for scroll/click
+ DeltaX, DeltaY float64 // scroll amount (scroll only), independent of X/Y position
Value string // fill text, select value, dialog decision (accept|dismiss)
Key string // for action=press (e.g. "Enter")
Files []string
} func (s *Session) scroll(ctx context.Context, t *sessionTab, req ActRequest) error {
- dy := req.Y
+ dx, dy := req.DeltaX, req.DeltaY
if dy == 0 {
dy = 600 // default one "page" down
}
x, y := req.X, req.Y
if x == 0 {
x = 400
}
if y == 0 {
y = 400
}
_, err := t.conn.Send(ctx, "Input.dispatchMouseEvent", map[string]any{
- "type": "mouseWheel", "x": x, "y": y, "deltaX": req.X, "deltaY": dy,
+ "type": "mouseWheel", "x": x, "y": y, "deltaX": dx, "deltaY": dy,
})
return interpretErr(err)
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/browser/actions.go` around lines 277 - 293, The scroll handler in
Session.scroll is mixing pointer coordinates with wheel deltas by reading
req.X/req.Y for both the mouse position and the scroll amount. Update the
ActRequest usage so the CDP event position comes from dedicated cursor
coordinates while deltaX/deltaY come from separate scroll fields, and keep the
default paging behavior in scroll intact when no delta is provided.
| case "browser_tabs": | ||
| var in struct { | ||
| Op string `json:"op"` | ||
| } | ||
| _ = json.Unmarshal([]byte(toolArgs), &in) | ||
| switch in.Op { | ||
| case "", "list", "select": | ||
| return decisionAutoApprove, true // read-only tab ops | ||
| default: // new/claim/close mutate the controlled set | ||
| return decisionPrompt, true | ||
| } | ||
| } |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
browser_tabs auto-approves on malformed args instead of prompting.
json.Unmarshal errors are discarded here, so if toolArgs fails to parse, in.Op stays "" and falls into the auto-approve case — the opposite of the fail-safe pattern used elsewhere in this file (read/execute explicitly check err and default to decisionPrompt). A malformed close/new/claim payload would silently bypass the prompt that's supposed to gate mutating tab operations.
🔒️ Proposed fix
case "browser_tabs":
var in struct {
Op string `json:"op"`
}
- _ = json.Unmarshal([]byte(toolArgs), &in)
+ if err := json.Unmarshal([]byte(toolArgs), &in); err != nil {
+ return decisionPrompt, true
+ }
switch in.Op {
case "", "list", "select":
return decisionAutoApprove, true // read-only tab ops
default: // new/claim/close mutate the controlled set
return decisionPrompt, true
}Want me to add a regression test for the malformed-JSON case alongside TestDecideBrowserTiers?
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| case "browser_tabs": | |
| var in struct { | |
| Op string `json:"op"` | |
| } | |
| _ = json.Unmarshal([]byte(toolArgs), &in) | |
| switch in.Op { | |
| case "", "list", "select": | |
| return decisionAutoApprove, true // read-only tab ops | |
| default: // new/claim/close mutate the controlled set | |
| return decisionPrompt, true | |
| } | |
| } | |
| case "browser_tabs": | |
| var in struct { | |
| Op string `json:"op"` | |
| } | |
| if err := json.Unmarshal([]byte(toolArgs), &in); err != nil { | |
| return decisionPrompt, true | |
| } | |
| switch in.Op { | |
| case "", "list", "select": | |
| return decisionAutoApprove, true // read-only tab ops | |
| default: // new/claim/close mutate the controlled set | |
| return decisionPrompt, true | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/runner/approval.go` around lines 304 - 315, The browser_tabs branch
in approval.go is treating malformed JSON as a read-only op because
json.Unmarshal errors are ignored and in.Op defaults to empty, which
auto-approves. Update the browser_tabs handling in the decide logic to check the
unmarshal error and return decisionPrompt on parse failure, matching the
fail-safe behavior already used by the read and execute cases. Keep the existing
decisionAutoApprove only for explicitly parsed list/select ops, and consider
adding a regression test near TestDecideBrowserTiers for malformed toolArgs.
| case "browser_screenshot": | ||
| var in struct { | ||
| FullPage bool `json:"full_page"` | ||
| } | ||
| _ = json.Unmarshal([]byte(argsJSON), &in) | ||
| png, err := sess.Screenshot(ctx, in.FullPage) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| id, err := env.Browser.SaveScreenshot(png) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| // The web UI renders image_ref inline; text clients see the ref + size. | ||
| return fmt.Sprintf("[screenshot %dx? bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), len(png), id), nil |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Confusing screenshot output text (%dx?).
fmt.Sprintf("[screenshot %dx? bytes=%d ...", len(png), len(png), id) prints the byte length as a fake "width" (e.g. 1024x? bytes=1024), which is misleading — it isn't the image's pixel dimensions, and the ? looks like a stray debug placeholder rather than intentional. This text is surfaced directly to the agent and (per ToolCallCard.vue) rendered in the UI.
🐛 Proposed fix: drop the bogus dimension field
- return fmt.Sprintf("[screenshot %dx? bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), len(png), id), nil
+ return fmt.Sprintf("[screenshot bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), id), nil📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| case "browser_screenshot": | |
| var in struct { | |
| FullPage bool `json:"full_page"` | |
| } | |
| _ = json.Unmarshal([]byte(argsJSON), &in) | |
| png, err := sess.Screenshot(ctx, in.FullPage) | |
| if err != nil { | |
| return "", err | |
| } | |
| id, err := env.Browser.SaveScreenshot(png) | |
| if err != nil { | |
| return "", err | |
| } | |
| // The web UI renders image_ref inline; text clients see the ref + size. | |
| return fmt.Sprintf("[screenshot %dx? bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), len(png), id), nil | |
| return fmt.Sprintf("[screenshot bytes=%d image_ref=/api/browser/shots/%s.png]\nCaptured. The image is shown in the UI; use browser_snapshot for element ground truth.", len(png), id), nil |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/tools/browser.go` around lines 88 - 102, The browser_screenshot
response text is misleading because it formats the PNG byte length as a fake
width with a stray “x?” placeholder. Update the return message in the
browser_screenshot case of the browser tool to remove the bogus dimension field
and only report the real byte size plus the image_ref, keeping the user-facing
text clear and accurate.
| func browserTabs(ctx context.Context, sess *browser.Session, argsJSON string) (string, error) { | ||
| var in struct { | ||
| Op string `json:"op"` | ||
| TabID string `json:"tab_id"` | ||
| } | ||
| _ = json.Unmarshal([]byte(argsJSON), &in) | ||
| switch in.Op { | ||
| case "", "list": | ||
| tabs, err := sess.ListTabs(ctx) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| if len(tabs) == 0 { | ||
| return "(no tabs)", nil | ||
| } | ||
| var b strings.Builder | ||
| for _, t := range tabs { | ||
| mark := " " | ||
| if t.Attached { | ||
| mark = "*" | ||
| } | ||
| flag := "" | ||
| if t.UserTab { | ||
| flag = " [user]" | ||
| } | ||
| fmt.Fprintf(&b, "%s %s %q %s%s\n", mark, shortTabID(t.ID), t.Title, t.URL, flag) | ||
| } | ||
| b.WriteString("(* = controlled by jcode)") | ||
| return b.String(), nil | ||
| case "new": | ||
| id, err := sess.NewTab(ctx) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| return "opened tab " + shortTabID(id), nil | ||
| case "select": | ||
| return "selected tab " + shortTabID(in.TabID), sess.SelectTab(ctx, in.TabID) | ||
| case "claim": | ||
| return "claimed tab " + shortTabID(in.TabID), sess.ClaimTab(ctx, in.TabID) | ||
| case "close": | ||
| return "closed tab " + shortTabID(in.TabID), sess.CloseTab(ctx, in.TabID) | ||
| default: | ||
| return "", fmt.Errorf("unknown tabs op %q", in.Op) | ||
| } | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
set -euo pipefail
git ls-files | rg '^internal/tools/browser\.go$|^internal/tools/.*browser.*\.go$|^internal/tools/.*\.go$'
echo '--- browser.go outline ---'
ast-grep outline internal/tools/browser.go --view expanded || true
echo '--- relevant lines ---'
sed -n '1,260p' internal/tools/browser.go
echo '--- search browserTabs and InvokableRun ---'
rg -n "browserTabs|InvokableRun|SelectTab|ClaimTab|CloseTab|ListTabs|NewTab" internal/tools -g '*.go'Repository: cnjack/jcode
Length of output: 2005
Return no success text when tab operations fail
select/claim/close currently build a success-looking message even when the session call returns an error. Return "" on error so callers don’t receive a misleading result alongside the failure.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/tools/browser.go` around lines 167 - 211, The browserTabs function
currently returns success text for select, claim, and close even when
sess.SelectTab, sess.ClaimTab, or sess.CloseTab fails. Update browserTabs so
each of those cases checks the session call result first and returns an empty
string with the error on failure, only constructing the “selected/claimed/closed
tab” message after a successful call.
| @@ -241,6 +241,7 @@ export default { | |||
| providers: 'プロバイダー', | |||
| mcp: 'MCP サーバー', | |||
| skills: 'スキル', | |||
| browser: 'ブラウザ', | |||
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Missing settings.browser translation group breaks the Browser settings tab in Japanese.
This file adds only the tabs.browser label but not the full settings.browser.* group that en.ts/zh-Hans.ts added at the same location (after skills, before ssh). SettingsDialog.vue's Browser tab renders via t('settings.browser.title'), enableTitle, enableDesc, control, managed, noChrome, extension, connected, notConnected, online, connectHint, chromePath, approval, navigate, interact, askEachSite, alwaysAllow, sitePermissions, add, noSitePermissions, developerMode, elevatedRisk, devModeTitle, devModeDesc — all undefined for ja, so the entire tab will show fallback/raw keys instead of Japanese text.
As per path instructions/file convention, this file's own header states: "Every other locale file (zh-Hans/zh-Hant/ja/ko) mirrors this exact shape."
🌐 Proposed fix (insert after `skills`, before `ssh`, ~line 373)
loadingHint: '読み込み中…',
},
+ browser: {
+ title: 'ブラウザ',
+ subtitle: 'jcode にブラウザを操作させます。マネージドブラウザはそのまま使えます。ログイン状態を再利用するには Chrome 拡張機能を接続してください。',
+ enableTitle: 'ブラウザ操作を有効化',
+ enableDesc: 'エージェントに browser_* ツールを追加します。',
+ control: '制御',
+ managed: 'マネージドブラウザ',
+ noChrome: 'Chrome が見つかりません — 下にパスを設定してください',
+ extension: 'Google Chrome 拡張機能',
+ connected: '自分の Chrome とログイン状態を再利用',
+ notConnected: '未接続 — 自分の Chrome とログイン状態を再利用',
+ online: '接続済み',
+ connectHint: '接続方法:jcode Browser Bridge 拡張機能をインストールし、ポップアップを開いて「Auto-connect to jcode」をクリックしてください。',
+ chromePath: 'Chrome のパス',
+ approval: '承認',
+ navigate: 'サイトを開く(ナビゲーション)',
+ interact: 'ページ操作(クリック / 入力)',
+ askEachSite: '毎回確認する',
+ alwaysAllow: '常に許可',
+ sitePermissions: 'サイトごとの権限',
+ add: '追加',
+ noSitePermissions: 'サイト固有の権限はまだありません',
+ developerMode: '開発者モード',
+ elevatedRisk: 'リスク高',
+ devModeTitle: 'browser_eval と生の CDP を有効化',
+ devModeDesc: 'エージェントがページ内で JavaScript や生の DevTools コマンドを実行できるようにします。呼び出しごとに確認が必要で、サイトの許可リストは適用されません。',
+ },
ssh: {(Translations above are best-effort for review purposes — please have a native speaker verify wording before merge.)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/src/i18n/locales/ja.ts` at line 244, The Japanese locale is missing the
full settings.browser translation group, so the Browser tab in
SettingsDialog.vue falls back to raw keys. Add the complete settings.browser.*
section in ja.ts in the same spot as the other locales, between the existing
skills and ssh groups, and make sure it includes every key used by the Browser
tab such as title, enableTitle, enableDesc, control, managed, noChrome,
extension, connected, notConnected, online, connectHint, chromePath, approval,
navigate, interact, askEachSite, alwaysAllow, sitePermissions, add,
noSitePermissions, developerMode, elevatedRisk, devModeTitle, and devModeDesc.
| @@ -241,6 +241,7 @@ export default { | |||
| providers: '프로바이더', | |||
| mcp: 'MCP 서버', | |||
| skills: '스킬', | |||
| browser: '브라우저', | |||
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Missing settings.browser translation group breaks the Browser settings tab in Korean.
Same gap as in ja.ts: only the tabs.browser label was added, but the settings.browser.* group (title/subtitle/enableTitle/…/devModeDesc) that en.ts/zh-Hans.ts added between skills and ssh is missing here. SettingsDialog.vue's Browser tab depends on these keys, so Korean-locale users will see fallback/raw key text throughout that tab.
As per path instructions/file convention: "키 구조는 en.ts와 완전히 동일합니다" (key structure must exactly match en.ts).
🌐 Proposed fix (insert after `skills`, before `ssh`, ~line 373)
loadingHint: '불러오는 중…',
},
+ browser: {
+ title: '브라우저',
+ subtitle: 'jcode가 브라우저를 조작하도록 합니다. 관리형 브라우저는 별도 설정 없이 바로 사용할 수 있습니다. 로그인 세션을 재사용하려면 Chrome 확장 프로그램을 연결하세요.',
+ enableTitle: '브라우저 사용 활성화',
+ enableDesc: '에이전트에 browser_* 도구를 추가합니다.',
+ control: '제어',
+ managed: '관리형 브라우저',
+ noChrome: 'Chrome을 찾을 수 없습니다 — 아래에서 경로를 설정하세요',
+ extension: 'Google Chrome 확장 프로그램',
+ connected: '내 Chrome과 로그인 상태 재사용',
+ notConnected: '연결되지 않음 — 내 Chrome과 로그인 상태 재사용',
+ online: '연결됨',
+ connectHint: '연결 방법: jcode Browser Bridge 확장 프로그램을 설치하고 팝업을 연 후 "Auto-connect to jcode"를 클릭하세요.',
+ chromePath: 'Chrome 경로',
+ approval: '승인',
+ navigate: '사이트 열기(탐색)',
+ interact: '페이지 상호작용(클릭 / 입력)',
+ askEachSite: '매번 확인',
+ alwaysAllow: '항상 허용',
+ sitePermissions: '사이트 권한',
+ add: '추가',
+ noSitePermissions: '아직 사이트별 권한이 없습니다',
+ developerMode: '개발자 모드',
+ elevatedRisk: '위험도 높음',
+ devModeTitle: 'browser_eval 및 원시 CDP 활성화',
+ devModeDesc: '에이전트가 페이지에서 JavaScript와 원시 DevTools 명령을 실행할 수 있게 합니다. 호출마다 여전히 확인이 필요하며 사이트 허용 목록은 적용되지 않습니다.',
+ },
ssh: {(Translations above are best-effort for review purposes — please have a native speaker verify wording before merge.)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/src/i18n/locales/ko.ts` at line 244, The Korean locale is missing the
settings.browser translation group, so add the full settings.browser.* key set
to ko.ts using the same key structure as en.ts and zh-Hans.ts. Insert the
Browser settings strings between the existing skills and ssh sections, including
title, subtitle, enableTitle, and devModeDesc, so SettingsDialog.vue can render
the Browser tab without raw fallback keys.
What
Adds browser use — the agent can drive a browser to navigate, snapshot, interact, screenshot, read page text, and (in developer mode) eval JS.
Two backends behind one
Sessioninterface:extension/) over a WebSocket bridge + native-messaging host.Approval has three tiers — read-only (auto), navigate/interact (site-scoped), high-risk eval (always prompt) — with per-origin site permissions surfaced in the web Settings → Browser tab. Tools:
browser_open,browser_snapshot,browser_screenshot,browser_act,browser_read,browser_tabs,browser_eval. Wired into both TUI and web; screenshots served over/api/browser/shots.Design notes:
internal-doc/browser-use-design.md.Lifecycle / approval fixes included
Session.Closeonly releases the task's own tabs; the managed Chrome is owned by theManagerand reused across tasks.getManageddrops and relaunches a dead backend (crash / user-quit) instead of caching it. Web mode now callsManager.Closeon exit so Chrome isn't orphaned.interactsite permission works.browser_actscopes its per-site permission by the active tab's origin (from the live session) instead of a hardcoded empty origin, sointeract=allowand the interact class default actually take effect.browser_act action=reloadimplemented (reusesSession.Reload, previously dead code).Tests
go build ./...,go vet, andgo test -race ./internal/browser/ ./internal/runner/ ./internal/tools/pass. New regression tests: session teardown keeps the shared backend,CurrentOrigincaching, reload routing, and interact-by-session-origin approval.Notes
browser_actis a single interact tier (dialog/scroll/hover included); ifinteract=always_allow, a dialogacceptis also auto-approved — intentional per design.Generated with Jack AI bot
Summary by CodeRabbit
New Features
Bug Fixes
Documentation