
Commit acc837e

shrey150 and claude committed
Add --connect flag docs to browser and browse skills
Documents the new `browse --connect <session-id>` flag that attaches to an existing Browserbase session instead of creating a new one. Updates SKILL.md, REFERENCE.md, and EXAMPLES.md for both browser/ and browse/ skills.

Sister PR: browserbase/stagehand#1889

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 1f532a1 commit acc837e

6 files changed

Lines changed: 772 additions & 0 deletions


skills/browse/EXAMPLES.md

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
1+
# Browser Automation Examples
2+
3+
Common browser automation workflows using the `browse` CLI. Each example demonstrates a distinct pattern using real commands.
4+
5+
## Example 1: Extract Data from a Page
6+
7+
**User request**: "Get the product details from example.com/product/123"
8+
9+
```bash
10+
browse open https://example.com/product/123
11+
browse snapshot # read page structure + element refs
12+
browse get text "body" # extract all visible text content
13+
browse stop
14+
```
15+
16+
Parse the text output to extract structured data (name, price, description, etc.).
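
The parsing step can be sketched in plain shell. This is a minimal illustration, not part of the `browse` CLI: `extract_field` is a hypothetical helper, and the sample text and its `Label: value` layout are assumptions, since real output depends entirely on the page.

```bash
# Hypothetical sample of what `browse get text ".product-details"` might print.
# The "Label: value" layout is an assumption for illustration only.
sample_text='Name: Acme Widget
Price: $19.99
Description: A sturdy widget for everyday use.'

# extract_field LABEL: read text on stdin and print the value that follows
# the first line starting with "LABEL:" (prints nothing if there is no match).
extract_field() {
  sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

name=$(printf '%s\n' "$sample_text" | extract_field "Name")
price=$(printf '%s\n' "$sample_text" | extract_field "Price")
printf 'name=%s price=%s\n' "$name" "$price"
```

Assuming the CLI writes the extracted text to stdout, the same helper could be fed directly, for example `browse get text ".product-details" | extract_field "Price"`.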
17+
18+
For a specific section, use a CSS selector:
19+
20+
```bash
21+
browse get text ".product-details" # text from a specific container
22+
```
23+
24+
**Note**: `browse get text` requires a CSS selector — use `"body"` for all page text.
25+
26+
## Example 2: Fill and Submit a Form
27+
28+
**User request**: "Fill out the contact form on example.com with my information"
29+
30+
```bash
31+
browse open https://example.com/contact
32+
browse snapshot # find form fields and their refs
33+
browse click @0-3 # click the Name input (ref from snapshot)
34+
browse type "John Doe"
35+
browse press Tab # move to next field
36+
browse type "john@example.com"
37+
browse fill "#message" "I would like to inquire about your services"
38+
browse snapshot # verify fields are filled
39+
browse click @0-8 # click Submit button (ref from snapshot)
40+
browse snapshot # confirm submission result
41+
browse stop
42+
```
43+
44+
**Key pattern**: Use `browse snapshot` before interacting to discover element refs, then `browse click <ref>` and `browse type` to interact.
45+
46+
## Example 3: Multi-Step Navigation
47+
48+
**User request**: "Get headlines from the first 3 pages of results on example.com/news"
49+
50+
```bash
51+
browse open https://example.com/news
52+
browse snapshot # read page 1 content
53+
browse get text ".headline" # extract headlines
54+
55+
browse snapshot # find "Next" button ref
56+
browse click @0-12 # click Next (ref from snapshot)
57+
browse wait load # wait for page 2 to load
58+
browse get text ".headline" # extract page 2 headlines
59+
60+
browse snapshot # find Next again (ref may change)
61+
browse click @0-15 # click Next
62+
browse wait load
63+
browse get text ".headline" # extract page 3 headlines
64+
65+
browse stop
66+
```
67+
68+
**Key pattern**: Re-run `browse snapshot` after each navigation because element refs change when the page updates.
69+
70+
## Example 4: Escalate to Remote Mode
71+
72+
**User request**: "Scrape pricing from competitor.com" (a site with Cloudflare protection)
73+
74+
```bash
75+
# Attempt 1: local mode
76+
browse open https://competitor.com/pricing
77+
browse snapshot
78+
# Output shows: "Checking your browser..." (Cloudflare interstitial)
79+
# or: page content is empty / access denied
80+
browse stop
81+
```
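
One way to make the detection step concrete is a small text check on the snapshot output. The sketch below is an assumption, not a feature of the `browse` CLI: `looks_blocked` is a hypothetical helper, and its marker phrases are taken from the comments in the attempt above.

```bash
# looks_blocked: read page text on stdin and succeed (exit 0) if it looks like
# a bot-protection interstitial, an access-denied page, or an empty page.
looks_blocked() {
  local text
  text=$(cat)
  # Empty or whitespace-only content is treated as blocked.
  if [ -z "$(printf '%s' "$text" | tr -d '[:space:]')" ]; then
    return 0
  fi
  # Marker phrases from the example; extend this list for other protections.
  printf '%s\n' "$text" | grep -qiE 'checking your browser|access denied'
}
```

If `browse snapshot` prints the page structure to stdout, a check such as `browse snapshot | looks_blocked && echo "suggest remote mode"` would flag the page before any extraction is attempted.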
82+
83+
The agent detects bot protection and tells the user:
84+
85+
> This site has Cloudflare bot detection. Browserbase remote mode can bypass this with anti-bot stealth and residential proxies. Want me to set it up?
86+
87+
If the user agrees:
88+
89+
```bash
90+
# Set up Browserbase credentials
91+
openclaw browserbase setup
92+
# User enters API key + project ID interactively
93+
94+
# Retry — credentials are now in the environment
95+
browse open https://competitor.com/pricing
96+
browse snapshot # full page content now accessible
97+
browse get text ".pricing-table"
98+
browse stop
99+
```
100+
101+
If the env vars aren't visible yet (setup was run outside OpenClaw):
102+
103+
```bash
104+
eval "$(openclaw browserbase env --format shell)" && browse open https://competitor.com/pricing
105+
browse snapshot
106+
browse get text ".pricing-table"
107+
browse stop
108+
```
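
Before retrying, it can help to confirm that the credentials are actually visible to the current shell. The sketch below assumes the variable names `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID`; the source does not name them, so check what `openclaw browserbase env` exports in your setup. `have_browserbase_creds` is a hypothetical helper.

```bash
# have_browserbase_creds: succeed only if both variables are set and non-empty.
# The variable names are an assumption, not confirmed by this document.
have_browserbase_creds() {
  [ -n "${BROWSERBASE_API_KEY:-}" ] && [ -n "${BROWSERBASE_PROJECT_ID:-}" ]
}

if have_browserbase_creds; then
  echo "Browserbase credentials found in the environment"
else
  echo "Browserbase credentials missing; load them before retrying"
fi
```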
109+
110+
## Example 5: Attach to an Existing Browserbase Session
111+
112+
**User request**: "Connect to this running Browserbase session and check its state"
113+
114+
Use `--connect` to attach the CLI to a Browserbase session that was created outside the CLI (via the API, dashboard, or another tool). The CLI connects without creating a new session, and `browse stop` disconnects without killing it.
115+
116+
```bash
117+
# Attach to the existing session
118+
browse --connect bb_sess_abc123 open https://app.example.com/dashboard
119+
browse snapshot # inspect the page
120+
browse get text ".status" # extract data
121+
browse stop # disconnect — session stays alive on Browserbase
122+
```
123+
124+
**Key pattern**: `--connect` is for sessions you don't own: the CLI is a guest. `browse stop` only disconnects; it does not terminate the remote session. `--connect` cannot be combined with `--context-id`.
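
The flag restriction can be checked up front in a wrapper script. The sketch below is an assumption about how a caller might guard against the conflict; `check_connect_args` is a hypothetical helper, and the CLI may already report this error on its own.

```bash
# check_connect_args: scan the arguments intended for `browse` and fail if
# both --connect and --context-id are present.
check_connect_args() {
  local has_connect=0 has_context=0 arg
  for arg in "$@"; do
    case "$arg" in
      --connect) has_connect=1 ;;
      --context-id) has_context=1 ;;
    esac
  done
  if [ "$has_connect" -eq 1 ] && [ "$has_context" -eq 1 ]; then
    echo "error: --connect cannot be combined with --context-id" >&2
    return 1
  fi
  return 0
}
```

A wrapper could then run `check_connect_args "$@" && browse "$@"` so that the conflicting combination never reaches the CLI.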
125+
126+
## Tips
127+
128+
- **Snapshot first**: Always run `browse snapshot` before interacting — it gives you the accessibility tree with element refs
129+
- **Use refs to click**: `browse click @0-5` is more reliable than trying to describe elements
130+
- **Re-snapshot after actions**: Element refs change when the page updates
131+
- **`get text` for data extraction**: Use `browse get text <selector>` to pull text content from specific elements; the selector is required, so pass `"body"` for all page text
132+
- **`stop` when done**: Always run `browse stop` to clean up the browser session; when attached with `--connect`, it only disconnects and leaves the remote session running
133+
- **Prefer snapshot over screenshot**: Snapshot is fast and structured; screenshot is slow and uses vision tokens. Only screenshot when you need visual context (layout, images, debugging)
