browserkit is a framework for building site-specific MCP servers over real authenticated browser sessions. This guide covers how to contribute adapters, extend the core, and maintain code quality.
git clone https://github.com/browserkit-dev/browserkit
cd browserkit
pnpm install
pnpm build
pnpm test

The most important principle for adapter development: favour stability over specificity.
Google changes .DY5T1d class names constantly. LinkedIn redesigns component structure every few months. The adapters that survive these changes anchor on stable signals:
Prefer structural/semantic selectors over class names:
// ✅ Stable — Google uses data-hveid for click tracking internally
"[data-hveid]"
// ✅ Stable — heading roles are semantic, not layout
"h3, [role='heading']"
// ❌ Fragile: will break when Google pushes a CSS update
".DY5T1d-RZkmj"

Prefer textContent extraction over DOM walking:
// ✅ Resilient — extracts longest text node as title (works even if markup changes)
const allText = Array.from(card.querySelectorAll("div, span"))
  .filter(el => el.children.length === 0)     // leaf elements only
  .map(el => (el.textContent ?? "").trim())   // textContent is typed as string | null
  .filter(t => t.length > 20);                // drop short fragments
const title = allText.sort((a, b) => b.length - a.length)[0];
// ❌ Fragile — breaks when class name changes
card.querySelector(".article-title-class").textContent

Prefer URL navigation over clicking:
// ✅ Navigate directly to the section URL
await page.goto("https://linkedin.com/in/username/details/experience/");
// ❌ Click a tab element whose selector may change
await page.click('[data-section="experience"]');

Document any DOM dependency with a comment explaining why textContent/URL navigation isn't sufficient.
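The longest-leaf-text heuristic above can be factored into a pure function so it can be unit tested without a browser. This is a minimal sketch, assuming the caller has already collected the leaf text strings (for example inside `page.evaluate`); the name `pickTitle` and its `minLength` parameter are hypothetical and not part of browserkit.

```typescript
// Hypothetical helper: not part of browserkit's API.
// Picks the longest candidate string, ignoring short fragments.
function pickTitle(
  leafTexts: ReadonlyArray<string | null>,
  minLength = 20,
): string | undefined {
  let best: string | undefined;
  for (const raw of leafTexts) {
    const text = (raw ?? "").trim();
    // Mirror the `t.length > 20` filter from the example above.
    if (text.length <= minLength) continue;
    if (best === undefined || text.length > best.length) best = text;
  }
  return best;
}

const sampleLeafTexts = [
  "2h ago",
  null,
  "  Example Publisher  ",
  "Regulators approve the new battery recycling standard",
];
console.log(pickTitle(sampleLeafTexts));
```

Returning `undefined` when nothing qualifies lets the caller decide whether a missing title is an error or an empty result.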
npx @browserkit/core create-adapter my-site
cd adapter-my-site
pnpm install

- `src/selectors.ts`: CSS selector constants with stability comments; prefer `data-*` attributes and semantic elements over class names
- `src/index.ts`:
  - `defineAdapter({ site, domain, loginUrl, selectors, rateLimit })`: set `rateLimit` for authenticated sites
  - `isLoggedIn(page)`: check the current page only (no navigation); return `true` for public sites
  - `tools()`: one tool per distinct URL/action; include `annotations: { readOnlyHint, openWorldHint }`
  - Tool handlers return `{ content, references? }`; include `references` for extractable links
- Extract complex scraping logic into a `src/scraper.ts` function that takes a `Page` and can be tested independently
- `tests/<site>.test.ts`: L1 unit tests (schema validation, metadata, selectors exported)
- `tests/<site>.scraping.test.ts`: mock DOM tests using a local HTML fixture; launch a real Playwright browser against a `file://` fixture URL; no network
- `tests/mcp-protocol.test.ts`: L3 (MCP initialize, tool list including `browser` + `close_session`, tool dispatch, `isError` on schema violations)
- `tests/reliability.test.ts`: L4 (concurrency, latency p50/p95, error recovery)
- `tests/<site>.integration.test.ts`: L2 live scraping; tagged so the default `pnpm test` excludes it
- `README.md`: tool table, installation, config example with `deviceEmulation`/`channel` if needed
- Update the `browserkit.config.js` example in the main repo README if relevant
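One way to keep scraping logic independently testable, as the checklist suggests, is to have the `src/scraper.ts` function depend only on the small slice of the page it actually uses. The following is a sketch under that assumption: `PageLike`, `scrapeHeadlines`, and `HEADLINE_SELECTOR` are hypothetical names, and `PageLike` is a simplified stand-in rather than Playwright's actual `Page` type.

```typescript
// Hypothetical, simplified stand-in for the parts of a browser page the scraper needs.
interface PageLike {
  // Resolves to the text content of every element matching `selector`.
  textsOf(selector: string): Promise<string[]>;
}

// Semantic selector from the stability guidance; the constant name is an assumption.
const HEADLINE_SELECTOR = "h3, [role='heading']";

interface Headline {
  title: string;
}

// Pure scraping logic: no navigation, no network, only reads from `page`.
async function scrapeHeadlines(page: PageLike, limit = 10): Promise<Headline[]> {
  const texts = await page.textsOf(HEADLINE_SELECTOR);
  const seen = new Set<string>();
  const headlines: Headline[] = [];
  for (const raw of texts) {
    const title = raw.trim();
    if (title.length === 0 || seen.has(title)) continue; // skip blanks and duplicates
    seen.add(title);
    headlines.push({ title });
    if (headlines.length >= limit) break;
  }
  return headlines;
}

// In-memory fake used in place of a real browser for unit tests.
const fakePage: PageLike = {
  textsOf: async (selector) =>
    selector === HEADLINE_SELECTOR
      ? ["  First story ", "", "First story", "Second story"]
      : [],
};

scrapeHeadlines(fakePage).then((headlines) => console.log(headlines));
```

With a real Playwright page, `textsOf` could plausibly be a thin wrapper around `page.locator(selector).allTextContents()`, which keeps the browser dependency at the edge of the adapter.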
pnpm build
pnpm test # L1 + mock scraping + L3 + L4
pnpm test:integration  # L2 live (requires auth if applicable)

- Add the tool to the `tools()` array in `src/index.ts`
- Include `annotations: { readOnlyHint, openWorldHint }` (`readOnlyHint: false` + `destructiveHint: true` for write operations)
- Add `references` to the return value if the result contains navigable links
- Update `src/selectors.ts` if new selectors are needed
- Add an L1 schema test covering the new input
- Add a mock scraping test if the tool involves DOM extraction
- Update the `README.md` tool table
All adapter tools should declare annotations so AI agents can reason about safety:
{
name: "get_posts",
annotations: {
readOnlyHint: true, // tool does not modify state
openWorldHint: true, // tool calls external services (always true for web adapters)
},
// ...
}

| Annotation | When to use |
|---|---|
| `readOnlyHint: true` | Any read-only scraping tool |
| `readOnlyHint: false` + `destructiveHint: true` | Posting, deleting, purchasing |
| `openWorldHint: true` | Any tool that hits a real website (always) |
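The rules in the table lend themselves to an L1-style unit check. A minimal sketch, assuming tools are plain objects with a `name` and optional `annotations`; `ToolLike` and `checkAnnotations` are hypothetical and only illustrate the idea, they are not browserkit exports.

```typescript
// Local shapes for illustration; field names follow the annotations shown above.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  openWorldHint?: boolean;
}

interface ToolLike {
  name: string;
  annotations?: ToolAnnotations;
}

// Hypothetical lint helper: returns one message per rule violation.
function checkAnnotations(tools: readonly ToolLike[]): string[] {
  const problems: string[] = [];
  for (const tool of tools) {
    const a = tool.annotations;
    if (a === undefined) {
      problems.push(`${tool.name}: missing annotations`);
      continue;
    }
    if (a.readOnlyHint === undefined) {
      problems.push(`${tool.name}: readOnlyHint not declared`);
    }
    if (a.openWorldHint !== true) {
      problems.push(`${tool.name}: openWorldHint should be true for web adapters`);
    }
    if (a.readOnlyHint === false && a.destructiveHint !== true) {
      problems.push(`${tool.name}: write tool should set destructiveHint: true`);
    }
  }
  return problems;
}

const exampleTools: ToolLike[] = [
  { name: "get_posts", annotations: { readOnlyHint: true, openWorldHint: true } },
  { name: "create_post", annotations: { readOnlyHint: false, openWorldHint: true } },
  { name: "get_profile" },
];
console.log(checkAnnotations(exampleTools));
```

Here `get_posts` passes, `create_post` is flagged for the missing `destructiveHint`, and `get_profile` is flagged for having no annotations at all.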
If a tool result contains links to other entities (articles, profiles, comments), include them in references:
return {
content: [{ type: "text", text: JSON.stringify(articles) }],
references: articles.map(a => ({
kind: "article",
url: a.url,
text: a.title,
context: a.source,
})),
};

This allows AI agents to follow links without parsing JSON from the main content string.
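Scraped links are often duplicated or relative, so it can help to normalise them before returning `references`. The sketch below assumes the reference fields shown above (`kind`, `url`, `text`, `context`); `ArticleItem`, `ReferenceItem`, and `toReferences` are hypothetical names, and the filtering policy (absolute http(s) URLs only, no duplicates) is an illustrative choice rather than a browserkit requirement.

```typescript
// Local shapes for illustration; the reference fields follow the example above.
interface ArticleItem {
  url: string;
  title: string;
  source: string;
}

interface ReferenceItem {
  kind: string;
  url: string;
  text: string;
  context?: string;
}

// Hypothetical helper: keeps only absolute http(s) URLs and drops duplicates.
function toReferences(articles: readonly ArticleItem[]): ReferenceItem[] {
  const seen = new Set<string>();
  const refs: ReferenceItem[] = [];
  for (const a of articles) {
    let parsed: URL;
    try {
      parsed = new URL(a.url); // throws on relative or malformed URLs
    } catch {
      continue;
    }
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;
    if (seen.has(parsed.href)) continue;
    seen.add(parsed.href);
    refs.push({ kind: "article", url: parsed.href, text: a.title, context: a.source });
  }
  return refs;
}

const scrapedArticles: ArticleItem[] = [
  { url: "https://example.com/a", title: "Story A", source: "Example News" },
  { url: "https://example.com/a", title: "Story A (duplicate)", source: "Example News" },
  { url: "/relative/path", title: "Relative link", source: "Example News" },
];

const toolResult = {
  content: [{ type: "text", text: JSON.stringify(scrapedArticles) }],
  references: toReferences(scrapedArticles),
};
console.log(toolResult.references);
```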
- No `any` types: strict TypeScript throughout
- No `console.log` in committed code; use `getLogger()` from core
- Conventional commits: `feat:`, `fix:`, `refactor:`, `test:`, `docs:`, `chore:`
- Small incremental changes: PRs should be reviewable in under 10 minutes
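For contributors who want to check commit subjects locally, the prefix rule can be expressed as a small predicate. This is only a sketch: `isConventionalCommit` is a hypothetical helper, and accepting an optional `(scope)` after the type is an assumption borrowed from the Conventional Commits format, not something this guide specifies.

```typescript
// Prefixes taken from the list above; the helper itself is hypothetical.
const COMMIT_TYPES = ["feat", "fix", "refactor", "test", "docs", "chore"] as const;

// Matches "type: subject" and, as an assumption, "type(scope): subject".
const COMMIT_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?: \\S.*$`,
);

function isConventionalCommit(subject: string): boolean {
  return COMMIT_PATTERN.test(subject);
}

console.log(isConventionalCommit("feat: add get_posts tool"));         // true
console.log(isConventionalCommit("fix(linkedin): handle empty feed")); // true
console.log(isConventionalCommit("Updated stuff"));                    // false
```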
See ARCH.md for the full framework architecture and open design questions.