| 1 | +# Browser Automation Examples |
| 2 | + |
| 3 | +Common browser automation workflows using the `browse` CLI. Each example demonstrates a distinct pattern using real commands. |
| 4 | + |
| 5 | +## Example 1: Extract Data from a Page |
| 6 | + |
| 7 | +**User request**: "Get the product details from example.com/product/123" |
| 8 | + |
| 9 | +```bash |
| 10 | +browse open https://example.com/product/123 |
| 11 | +browse snapshot # read page structure + element refs |
| 12 | +browse get text "body" # extract all visible text content |
| 13 | +browse stop |
| 14 | +``` |
| 15 | + |
| 16 | +Parse the text output to extract structured data (name, price, description, etc.). |
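As a rough illustration of the parsing step, the sketch below pulls a price and a labeled field out of plain text with `grep` and `sed`. The sample text, the `Label: value` layout, and the helper names `extract_price` and `extract_field` are all hypothetical. Real output from `browse get text` depends on the page, so the patterns would need adjusting.

```bash
# Hypothetical sample standing in for the output of `browse get text "body"`.
# The real text layout depends entirely on the page being read.
sample_text='Acme Widget
Price: $19.99
In stock
Description: A sturdy widget for everyday use.'

# Print the first price-like token (for example $19.99) found on stdin.
extract_price() {
  grep -Eo '\$[0-9]+(\.[0-9]{2})?' | head -n 1
}

# Print the value that follows a "Label:" prefix; $1 is the label name.
extract_field() {
  sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

price=$(printf '%s\n' "$sample_text" | extract_price)
description=$(printf '%s\n' "$sample_text" | extract_field Description)
printf 'price=%s\ndescription=%s\n' "$price" "$description"
```

In practice the sample variable would be replaced by captured command output, assuming the CLI writes the extracted text to stdout.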
| 17 | + |
| 18 | +For a specific section, use a CSS selector: |
| 19 | + |
| 20 | +```bash |
| 21 | +browse get text ".product-details" # text from a specific container |
| 22 | +``` |
| 23 | + |
| 24 | +**Note**: `browse get text` requires a CSS selector — use `"body"` for all page text. |
| 25 | + |
| 26 | +## Example 2: Fill and Submit a Form |
| 27 | + |
| 28 | +**User request**: "Fill out the contact form on example.com with my information" |
| 29 | + |
| 30 | +```bash |
| 31 | +browse open https://example.com/contact |
| 32 | +browse snapshot # find form fields and their refs |
| 33 | +browse click @0-3 # click the Name input (ref from snapshot) |
| 34 | +browse type "John Doe" |
| 35 | +browse press Tab # move to next field |
| 36 | +browse type "john@example.com" |
| 37 | +browse fill "#message" "I would like to inquire about your services" |
| 38 | +browse snapshot # verify fields are filled |
| 39 | +browse click @0-8 # click Submit button (ref from snapshot) |
| 40 | +browse snapshot # confirm submission result |
| 41 | +browse stop |
| 42 | +``` |
| 43 | + |
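| 44 | +**Key pattern**: Run `browse snapshot` before interacting to discover element refs, then use `browse click <ref>` and `browse type` to interact. When a field has a stable CSS selector, `browse fill <selector> <text>` sets it directly, as with `#message` above. |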
| 45 | + |
| 46 | +## Example 3: Multi-Step Navigation |
| 47 | + |
| 48 | +**User request**: "Get headlines from the first 3 pages of results on example.com/news" |
| 49 | + |
| 50 | +```bash |
| 51 | +browse open https://example.com/news |
| 52 | +browse snapshot # read page 1 content |
| 53 | +browse get text ".headline" # extract headlines |
| 54 | + |
| 55 | +browse snapshot # find "Next" button ref |
| 56 | +browse click @0-12 # click Next (ref from snapshot) |
| 57 | +browse wait load # wait for page 2 to load |
| 58 | +browse get text ".headline" # extract page 2 headlines |
| 59 | + |
| 60 | +browse snapshot # find Next again (ref may change) |
| 61 | +browse click @0-15 # click Next |
| 62 | +browse wait load |
| 63 | +browse get text ".headline" # extract page 3 headlines |
| 64 | + |
| 65 | +browse stop |
| 66 | +``` |
| 67 | + |
| 68 | +**Key pattern**: Re-run `browse snapshot` after each navigation because element refs change when the page updates. |
| 69 | + |
| 70 | +## Example 4: Escalate to Remote Mode |
| 71 | + |
| 72 | +**User request**: "Scrape pricing from competitor.com" (a site with Cloudflare protection) |
| 73 | + |
| 74 | +```bash |
| 75 | +# Attempt 1: local mode |
| 76 | +browse open https://competitor.com/pricing |
| 77 | +browse snapshot |
| 78 | +# Output shows: "Checking your browser..." (Cloudflare interstitial) |
| 79 | +# or: page content is empty / access denied |
| 80 | +browse stop |
| 81 | +``` |
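One way an agent might recognize the blocked state from the snapshot text is a simple string check. This is a minimal sketch: the helper name `looks_blocked` is hypothetical, and the marker strings are just the symptoms listed in the comments above (an interstitial message, empty content, or an access-denied page). Real bot-protection pages vary, so the pattern list is an assumption.

```bash
# Return 0 (true) when the given snapshot text suggests a bot-protection page.
# $1 is the captured snapshot text.
looks_blocked() {
  # Empty or whitespace-only output counts as blocked.
  stripped=$(printf '%s' "$1" | tr -d '[:space:]')
  [ -z "$stripped" ] && return 0
  # Case-insensitive match on known interstitial phrases (assumed list).
  printf '%s\n' "$1" | grep -Eiq 'checking your browser|access denied'
}

# Example with a hypothetical snapshot string:
if looks_blocked 'Checking your browser before accessing competitor.com'; then
  echo "blocked: consider offering remote mode"
fi
```

The check is deliberately conservative: it only flags the page and leaves the decision to escalate with the user.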
| 82 | + |
| 83 | +The agent detects bot protection and tells the user: |
| 84 | + |
| 85 | +> This site has Cloudflare bot detection. Browserbase remote mode can bypass this with anti-bot stealth and residential proxies. Want me to set it up? |
| 86 | + |
| 87 | +If the user agrees: |
| 88 | + |
| 89 | +```bash |
| 90 | +# Set up Browserbase credentials |
| 91 | +openclaw browserbase setup |
| 92 | +# User enters API key + project ID interactively |
| 93 | + |
| 94 | +# Retry — credentials are now in the environment |
| 95 | +browse open https://competitor.com/pricing |
| 96 | +browse snapshot # full page content now accessible |
| 97 | +browse get text ".pricing-table" |
| 98 | +browse stop |
| 99 | +``` |
| 100 | + |
| 101 | +If the env vars aren't visible yet (setup was run outside OpenClaw): |
| 102 | + |
| 103 | +```bash |
| 104 | +eval "$(openclaw browserbase env --format shell)" && browse open https://competitor.com/pricing |
| 105 | +browse snapshot |
| 106 | +browse get text ".pricing-table" |
| 107 | +browse stop |
| 108 | +``` |
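To decide whether the `eval` step is needed, a script could first check whether the credentials are already in the environment. The sketch below assumes the variables are named `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID`; the source only says that an API key and project ID are configured, so confirm the actual names against what `openclaw browserbase env` prints. The helper name `have_browserbase_creds` is hypothetical.

```bash
# Succeeds only when both credentials are set and non-empty.
# Variable names are assumptions; adjust to match your setup.
have_browserbase_creds() {
  [ -n "${BROWSERBASE_API_KEY:-}" ] && [ -n "${BROWSERBASE_PROJECT_ID:-}" ]
}

if have_browserbase_creds; then
  echo "credentials visible"
else
  echo "credentials missing: load them before retrying"
fi
```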
| 109 | + |
| 110 | +## Example 5: Attach to an Existing Browserbase Session |
| 111 | + |
| 112 | +**User request**: "Connect to this running Browserbase session and check its state" |
| 113 | + |
| 114 | +Use `--connect` to attach the CLI to a Browserbase session that was created outside the CLI (via the API, dashboard, or another tool). The CLI connects without creating a new session, and `browse stop` disconnects without killing it. |
| 115 | + |
| 116 | +```bash |
| 117 | +# Attach to the existing session |
| 118 | +browse --connect bb_sess_abc123 open https://app.example.com/dashboard |
| 119 | +browse snapshot # inspect the page |
| 120 | +browse get text ".status" # extract data |
| 121 | +browse stop # disconnect — session stays alive on Browserbase |
| 122 | +``` |
| 123 | + |
| 124 | +**Key pattern**: `--connect` is for sessions you don't own, so the CLI acts as a guest. `browse stop` only disconnects; it does not terminate the remote session. `--connect` cannot be combined with `--context-id`. |
| 125 | + |
| 126 | +## Tips |
| 127 | + |
| 128 | +- **Snapshot first**: Always run `browse snapshot` before interacting — it gives you the accessibility tree with element refs |
| 129 | +- **Use refs to click**: `browse click @0-5` is more reliable than trying to describe elements |
| 130 | +- **Re-snapshot after actions**: Element refs change when the page updates |
| 131 | +- **`get text` for data extraction**: Use `browse get text <selector>` to pull text content from specific elements; the selector is required, so pass `"body"` for the whole page |
| 132 | +- **`stop` when done**: Always `browse stop` to clean up the browser session |
| 133 | +- **Prefer snapshot over screenshot**: Snapshot is fast and structured; screenshot is slow and uses vision tokens. Only screenshot when you need visual context (layout, images, debugging) |