[STG-2450] feat(cli): browse snapshot lean by default, add --full for ref maps #2296

Open
shrey150 wants to merge 3 commits into main from shrey/browse-lean-snapshot-default

@shrey150 shrey150 commented Jun 30, 2026

Summary

browse snapshot now prints the formatted accessibility tree only by default, omitting the xpathMap/urlMap ref maps that were previously included on every call.

  • browse snapshot (default): formatted tree only (~14KB on a content-heavy page).
  • browse snapshot --full: tree + xpathMap + urlMap.
  • browse snapshot --compact: deprecated no-op alias of the default (prints a stderr-only, TTY-gated deprecation notice).
  • browse refs: prints the cached maps on demand.
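
The flag behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in packages/cli/src/commands/snapshot.ts: the names `SnapshotFlags`, `StderrLike`, and `resolveSnapshotOptions`, the warning text, and the choice to let `--full` win when both flags are passed are all assumptions.

```typescript
// Hypothetical flag shape; the real command definition may differ.
interface SnapshotFlags {
  full?: boolean;
  compact?: boolean; // deprecated: accepted but has no effect on output
}

// Minimal stand-in for process.stderr so the sketch has no Node dependency.
interface StderrLike {
  isTTY?: boolean;
  write(chunk: string): unknown;
}

interface ResolvedSnapshotOptions {
  full: boolean;
}

function resolveSnapshotOptions(
  flags: SnapshotFlags,
  stderr: StderrLike,
): ResolvedSnapshotOptions {
  // The notice goes to stderr only, and only when stderr is a terminal,
  // so piped or agent-driven invocations stay silent.
  if (flags.compact && stderr.isTTY) {
    stderr.write("warning: --compact is deprecated; lean output is now the default\n");
  }
  // Ref maps are printed only when --full is given explicitly.
  return { full: flags.full === true };
}
```

In a Node CLI the second argument would typically be `process.stderr`, whose `isTTY` property is set only when the stream is attached to a terminal.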

Ref-based element commands (click, fill, select, …) are unaffected — the maps are still captured and cached server-side, so refs resolve exactly as before.
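
To make the "cached server-side" point concrete, here is a small sketch of a driver-side handler that always caches the ref maps and prints them only on request. `setRefMaps` is the name used in this PR's commit messages; `RefCache`, `handleSnapshot`, `resolveXPath`, and the data shapes are hypothetical stand-ins, not the actual driver code.

```typescript
type RefMaps = {
  xpathMap: Record<string, string>;
  urlMap: Record<string, string>;
};

type Capture = RefMaps & { tree: string };
type SnapshotOutput = { tree: string } & Partial<RefMaps>;

// Hypothetical in-memory cache standing in for the daemon's ref store.
class RefCache {
  private maps: RefMaps = { xpathMap: {}, urlMap: {} };

  setRefMaps(maps: RefMaps): void {
    this.maps = maps;
  }

  // Accepts refs with or without the leading "@" used on the command line.
  resolveXPath(ref: string): string | undefined {
    return this.maps.xpathMap[ref.replace(/^@/, "")];
  }

  getRefs(): RefMaps {
    return this.maps;
  }
}

function handleSnapshot(capture: Capture, cache: RefCache, full: boolean): SnapshotOutput {
  // Cache on every call, lean or full, so later ref commands keep resolving.
  cache.setRefMaps({ xpathMap: capture.xpathMap, urlMap: capture.urlMap });
  // Only the printed payload differs between the two modes.
  return full ? capture : { tree: capture.tree };
}
```

Under these assumptions, a `click @0-1588` issued after a lean snapshot resolves through `cache.resolveXPath("@0-1588")` without the map ever having been written to stdout.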

Why

The default snapshot emitted ~241KB (~60K tokens) on a content-heavy page, of which ~217KB (90%) is the xpathMap (a ref→XPath entry for every node) + urlMap. The tree an agent actually reasons over is only ~13.5KB.

Because refs resolve from a server-side cache (not from stdout), the printed maps were dead weight in the context of any agent consuming browse snapshot output. Making the tree lean by default cuts the default payload ~17× with no loss to element interaction; --full and browse refs recover the maps when needed.
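
As a quick check of the ~17× figure, the arithmetic can be reproduced from the byte counts reported in the test matrix of this PR (241,104 bytes for --full, 13,845 bytes for the default). The lean token estimate is an extrapolation that assumes the same bytes-per-token rate as the ~60K-token full payload; it was not measured.

```typescript
// Byte counts reported in this PR's test matrix.
const fullBytes = 241104;
const leanBytes = 13845;
// Approximate token count quoted for the full payload.
const fullTokensApprox = 60000;

const sizeRatio = fullBytes / leanBytes;

// Assumption: tokens scale linearly with bytes for this kind of output.
const bytesPerToken = fullBytes / fullTokensApprox;
const leanTokensApprox = Math.round(leanBytes / bytesPerToken);

console.log(`full/lean size ratio: ${sizeRatio.toFixed(1)}x`); // prints 17.4x
console.log(`estimated lean tokens: ~${leanTokensApprox}`);
```

Under that linear assumption the lean snapshot comes to roughly 3.4K tokens per call.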

E2E Test Matrix

Behavior + size (local build, live session on capeair.com)

Verified against packages/cli/bin/run.js (the local build), not the published browse.

| Command / flow | Observed output | Confidence / sufficiency |
| --- | --- | --- |
| browse snapshot (default) | 13,845 bytes, keys = tree only, hasMaps=false | Lean default; ~17× smaller than before |
| browse snapshot --full | 241,104 bytes, keys = tree,urlMap,xpathMap (1,640 entries) | --full returns the full tree + maps |
| browse snapshot --compact (piped) | 13,868 bytes, hasMaps=false, 0 bytes stderr | Deprecated alias == default; no noise on non-TTY |
| browse refs | count=1640 | Maps still captured + retrievable on demand |
| browse click @0-1588 after a lean snapshot | {"clicked": true} → navigated to /about_us/ | Ref interaction unaffected by dropping maps from stdout |
| default tree vs --full tree | identical (341 lines both) | Lean mode omits only the maps; it does not prune tree content |
| pnpm --dir packages/cli build (tsc) + evals tsc --noEmit | success / clean | Typechecks |
| driver-commands unit test | 15/15 pass | Locks in: default omits maps, --full includes them, maps cached in both modes |

Agent task-success A/B (lean vs full)

Controlled A/B — same task / model (Sonnet) / session per pair; the only variable is snapshot mode (--full = maps vs default = lean), against the local build.

| Task | Arm | Outcome | Snapshots | Agent tokens |
| --- | --- | --- | --- | --- |
| capeair (regional booking) | full (maps) | ✅ SUCCESS | 9 | 109.8K |
| capeair | lean | ✅ SUCCESS | 8 | 93.5K |
| Google Flights | full (maps) | ✅ SUCCESS | 15 | 85.3K |
| Google Flights | lean | ✅ SUCCESS | 15 | 86.4K |

4/4 success: lean-by-default held task success on both tasks, with no observed capability loss. (A standard-benchmark A/B via the evals package is in progress and will be added here.)
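
For readers comparing the arms, the per-task token difference can be computed directly from the figures reported in the A/B results. The `AbPair` type and `tokenDeltaPercent` helper are illustrative names only. With one run per arm these are point observations, not a statistical result.

```typescript
interface AbPair {
  task: string;
  fullTokensK: number; // agent tokens with --full, in thousands
  leanTokensK: number; // agent tokens with the lean default, in thousands
}

// Figures copied from the A/B results in this PR.
const pairs: AbPair[] = [
  { task: "capeair", fullTokensK: 109.8, leanTokensK: 93.5 },
  { task: "Google Flights", fullTokensK: 85.3, leanTokensK: 86.4 },
];

// Relative change of the lean arm versus the full arm, in percent.
function tokenDeltaPercent(pair: AbPair): number {
  return ((pair.leanTokensK - pair.fullTokensK) / pair.fullTokensK) * 100;
}

for (const pair of pairs) {
  const delta = tokenDeltaPercent(pair);
  const sign = delta >= 0 ? "+" : "";
  console.log(`${pair.task}: ${sign}${delta.toFixed(1)}%`);
}
// prints "capeair: -14.8%" then "Google Flights: +1.3%"
```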

What's in the diff

  • packages/cli/src/commands/snapshot.ts — lean default (compact decoupled from map omission); add --full; deprecate --compact.
  • packages/cli/src/lib/driver/commands/snapshot.ts — driver handler emits the full tree and includes ref maps only when full is requested.
  • packages/cli/skills/browse/SKILL.md — document lean default + --full.
  • packages/cli/tests/driver-commands.test.ts — test the default/--full/caching behavior.
  • packages/evals/core/tools/browse_cli.ts — derive refCount from the tree when maps are absent.
  • .changeset/lean-browse-snapshot.md: browse patch.
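
The refCount fallback in packages/evals/core/tools/browse_cli.ts can be pictured with the sketch below. It assumes refs appear in the formatted tree as bracketed `[frame-node]` ids such as `[0-1588]`; that syntax, and the names `countRefsInTree` and `deriveRefCount`, are assumptions for illustration and may not match the evals code.

```typescript
interface SnapshotPayload {
  tree: string;
  xpathMap?: Record<string, string>;
}

// Counts distinct bracketed refs like "[0-1588]" in the tree text.
// Assumption: this is how refs are rendered in the formatted tree.
function countRefsInTree(tree: string): number {
  const refs = new Set<string>();
  const pattern = /\[(\d+-\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(tree)) !== null) {
    const ref = match[1];
    if (ref !== undefined) refs.add(ref);
  }
  return refs.size;
}

function deriveRefCount(snapshot: SnapshotPayload): number {
  // Prefer the map when --full output is available; otherwise fall back
  // to counting refs in the tree text.
  return snapshot.xpathMap
    ? Object.keys(snapshot.xpathMap).length
    : countRefsInTree(snapshot.tree);
}
```

A tree-derived count only matches the map-derived count if every mapped node is rendered in the tree, so the two values should be treated as comparable rather than guaranteed identical.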

Linear: STG-2450

changeset-bot Bot commented Jun 30, 2026

🦋 Changeset detected

Latest commit: c05d2a4

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 0 packages

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Comment thread packages/cli/src/commands/snapshot.ts Outdated
Comment thread packages/cli/src/commands/snapshot.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

2 issues found across 4 files

Confidence score: 3/5

  • In packages/cli/src/commands/snapshot.ts, the new default path still applies the old compact tree filter, so lean mode may remove accessibility-tree lines (not just ref maps), which can change snapshot content unexpectedly for users — split “omit maps” logic from “compact tree” before merging the default switch.
  • In packages/cli/src/commands/snapshot.ts, the default flip plus new --full flag and deprecated flag behavior are untested, so regressions in typical CLI usage could ship unnoticed — add unit tests for default lean output, --full, and deprecated-flag compatibility before merging.
Architecture diagram
sequenceDiagram
    participant Agent as Agent (LLM/Client)
    participant CLI as CLI (browse commands)
    participant Session as Session Manager
    participant Daemon as Driver Daemon
    participant Cache as Refs Cache (in-memory)

    Note over Agent,Cache: NEW: Lean snapshot flow

    Agent->>CLI: browse snapshot
    CLI->>CLI: Parse flags (--full default=false)
    alt --full flag absent (default)
        CLI->>CLI: Set compact=true (lean output)
        CLI->>Daemon: snapshot(compact=true)
        Daemon->>Daemon: Capture accessibility tree only
        Daemon-->>CLI: { tree: formattedTree }
        CLI-->>Agent: JSON with tree only (hasMaps=false)
    else --full flag present
        CLI->>CLI: Set compact=false
        CLI->>Daemon: snapshot(compact=false)
        Daemon->>Daemon: Capture tree + xpathMap + urlMap
        Daemon-->>CLI: { tree, xpathMap, urlMap }
        CLI-->>Agent: JSON with tree + ref maps
    end

    Note over CLI,Daemon: --compact deprecation path (TTY only)

    opt --compact flag AND stderr is TTY
        CLI->>CLI: Print deprecation warning to stderr
    end

    Note over Agent,Cache: CHANGED: Refs still cached server-side

    Agent->>CLI: browse click @0-1588
    CLI->>Daemon: click(ref="0-1588")
    Daemon->>Cache: Resolve ref to XPath
    Cache-->>Daemon: XPath
    Daemon->>Daemon: Execute click on element
    Daemon-->>CLI: { clicked: true }
    CLI-->>Agent: Result

    Note over Agent,Cache: CHANGED: browse refs still works

    Agent->>CLI: browse refs
    CLI->>Daemon: getRefs()
    Daemon->>Cache: Retrieve cached maps
    Cache-->>Daemon: { xpathMap, urlMap }
    Daemon-->>CLI: Maps
    CLI-->>Agent: Formatted ref maps

    Note over Agent,Cache: Eval represent() with BROWSE_SNAPSHOT_FULL toggle

    alt BROWSE_SNAPSHOT_FULL=1
        Agent->>CLI: represent() → snapshot --full
        CLI-->>Agent: { tree, xpathMap, urlMap }
        Agent->>Agent: refCount = Object.keys(xpathMap).length
    else default (lean)
        Agent->>CLI: represent() → snapshot
        CLI-->>Agent: { tree }
        Agent->>Agent: refCount = count refs in tree text
    end

Comment thread packages/cli/src/commands/snapshot.ts Outdated
Comment thread packages/cli/src/commands/snapshot.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/cli/tests/driver-commands.test.ts">

<violation number="1" location="packages/cli/tests/driver-commands.test.ts:363">
P3: `toHaveBeenCalledWith` doesn't verify caching happens on the lean call specifically. If the handler is refactored to only cache maps in the `--full` path the test would still pass.</violation>
</file>
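
One way to tighten an assertion like this is to check the call count and the last call's arguments immediately after the lean call. The sketch below uses a hand-rolled spy so it runs without a test framework; `createSpy`, `snapshotHandler`, and `leanCallCachesMaps` are hypothetical names, and the handler is a stand-in rather than the real driver code.

```typescript
type RefMaps = {
  xpathMap: Record<string, string>;
  urlMap: Record<string, string>;
};
type Snapshot = { tree: string } & Partial<RefMaps>;
type Handler = (full: boolean, setRefMaps: (maps: RefMaps) => void) => Snapshot;

// Tiny spy that records the arguments of every call.
function createSpy<Args extends unknown[]>() {
  const calls: Args[] = [];
  const fn = (...args: Args): void => {
    calls.push(args);
  };
  return { fn, calls };
}

// Stand-in handler: caches maps on every call, prints them only when full.
const snapshotHandler: Handler = (full, setRefMaps) => {
  const maps: RefMaps = {
    xpathMap: { "0-1": "/html/body/a" },
    urlMap: { "0-1": "https://example.com/" },
  };
  setRefMaps(maps);
  const tree = "[0-1] link: Example";
  return full ? { tree, ...maps } : { tree };
};

// True only if the lean call itself cached the maps and omitted them from output.
function leanCallCachesMaps(handler: Handler): boolean {
  const spy = createSpy<[RefMaps]>();
  const lean = handler(false, spy.fn);
  const calledOnce = spy.calls.length === 1;
  const last = spy.calls[spy.calls.length - 1];
  const cachedMaps = last !== undefined && "0-1" in last[0].xpathMap;
  const omitsMaps = lean.xpathMap === undefined && lean.urlMap === undefined;
  return calledOnce && cachedMaps && omitsMaps;
}
```

Because the spy is fresh and only the lean call runs against it, a handler that caches solely in the full path leaves the call count at zero and the check fails.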

Comment thread packages/cli/tests/driver-commands.test.ts Outdated
shrey150 and others added 3 commits June 30, 2026 20:14
… maps

browse snapshot previously emitted the formatted tree plus xpathMap+urlMap
(~217KB / ~60K tokens on content-heavy pages) on every call. Refs resolve from
a server-side cache, so the printed maps were dead weight in agent context and
~18x larger than Playwright MCP / agent-browser / our own managed Agents output.

- Default output is now the formatted tree only (no ref maps).
- Add --full to restore tree + xpathMap + urlMap (the previous default).
- --compact is now a deprecated no-op alias of the default (stderr + TTY-gated notice).
- browse refs still prints the cached maps on demand; ref-based commands unaffected.
- evals(browse_cli): derive refCount from the tree when maps are absent; add
  BROWSE_SNAPSHOT_FULL toggle for the core-tier represent() path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Addresses PR review:
- Default snapshot now emits the full tree (no line pruning) and only omits
  the ref maps; --full adds xpathMap/urlMap. Verified default tree === --full tree.
- Driver snapshot handler takes `full` instead of overloading `compact`.
- Simplify the command description and drop verbose/historical comments.
- Add a driver test: default omits maps, --full includes them, maps cached either way.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Verify setRefMaps runs on the default (lean) call specifically via call-count
+ last-called-with, so caching moving into the --full path would fail the test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
shrey150 force-pushed the shrey/browse-lean-snapshot-default branch from ab29d17 to c05d2a4 on July 1, 2026 03:14