Conversation
Align the CLI with ScrapeGraphAI/scrapegraph-js#11 (v2 SDK migration):

- Rename smart-scraper → extract, search-scraper → search
- Remove commands dropped from the API: agentic-scraper, generate-schema, sitemap, validate
- Add client factory (src/lib/client.ts) using the new scrapegraphai({ apiKey }) pattern
- Update scrape command with --format flag (markdown, html, screenshot, branding)
- Update crawl to use the crawl.start/status polling lifecycle
- Update history to use v2 service names and parameters
- All commands now use try/catch (v2 throws on error) and self-timed elapsed reporting

BREAKING CHANGE: CLI commands have been renamed and removed to match the v2 API surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
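The start/status polling lifecycle mentioned above can be sketched as follows. This is illustrative only: `CrawlClient`, the status names, and the polling defaults are assumptions, not the SDK's actual types.

```ts
// Sketch of a crawl.start → crawl.status polling loop (assumed shapes).
type CrawlStatus = "queued" | "running" | "done" | "failed";

interface CrawlClient {
  start(url: string): Promise<{ id: string }>;
  status(id: string): Promise<{ status: CrawlStatus }>;
}

async function runCrawl(
  crawl: CrawlClient,
  url: string,
  pollMs = 1000,
  maxPolls = 60,
): Promise<{ status: CrawlStatus }> {
  const { id } = await crawl.start(url);
  for (let i = 0; i < maxPolls; i++) {
    const res = await crawl.status(id);
    // Stop polling once the job reaches a terminal state.
    if (res.status === "done" || res.status === "failed") return res;
    await new Promise((r) => setTimeout(r, pollMs));
  }
  throw new Error(`crawl did not finish after ${maxPolls} polls`);
}
```

In a CLI, the terminal status (and a poll-count cap like `maxPolls`) keeps the command from hanging forever on a stuck job.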
Aligns fetchConfig with the sgai-stack development branch contract:

- mode: "auto" | "fast" | "js" | "direct+stealth" | "js+stealth"
- Removes render and stealth boolean fields
- Updates timeout range to 1000-60000ms (default 30000)
- Adds SGAI-APIKEY header to all requests
- Fixes API URL paths (/v2 → /api/v2)
- Exports ApiFetchMode type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
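The mode and timeout constraints in this commit can be captured in a small normalizer. A minimal sketch, assuming the field names from the commit message; `normalizeFetchConfig` is an illustrative helper, not the SDK's actual code.

```ts
// Fetch mode enum and timeout clamping per the contract described above.
export type ApiFetchMode = "auto" | "fast" | "js" | "direct+stealth" | "js+stealth";

export interface FetchConfig {
  mode?: ApiFetchMode;
  timeout?: number; // milliseconds, valid range 1000–60000
}

export function normalizeFetchConfig(cfg: FetchConfig = {}): Required<FetchConfig> {
  const mode = cfg.mode ?? "auto";
  // Clamp into the documented 1000–60000 ms range; default 30000.
  const timeout = Math.min(60_000, Math.max(1_000, cfg.timeout ?? 30_000));
  return { mode, timeout };
}
```

Clamping (rather than rejecting) out-of-range timeouts is one possible design; the real contract may instead validate and throw.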
Rewrite proxy configuration page to document the FetchConfig object with the mode parameter (auto/fast/js/direct+stealth/js+stealth), country-based geotargeting, and all fetch options. Update the knowledge-base proxy guide and fix FetchConfig examples in both the Python and JavaScript SDK pages to match the actual v2 API surface.

Refs: ScrapeGraphAI/scrapegraph-js#11, ScrapeGraphAI/scrapegraph-py#82
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Final status — SDK v2 migration

All changes validated against the architecture below.

### Client surface

```ts
const sgai = scrapegraphai({ apiKey: "sgai-..." });

// Top-level
sgai.scrape(url, options?)        // POST /api/v2/scrape
sgai.extract(url, extractOptions) // POST /api/v2/extract
sgai.search(query, options?)      // POST /api/v2/search
sgai.credits()                    // GET  /api/v2/credits
sgai.history(filter?)             // GET  /api/v2/history

// Crawl namespace
sgai.crawl.start(url, options?)   // POST /api/v2/crawl
sgai.crawl.status(id)             // GET  /api/v2/crawl/:id
sgai.crawl.stop(id)               // POST /api/v2/crawl/:id/stop
sgai.crawl.resume(id)             // POST /api/v2/crawl/:id/resume

// Monitor namespace
sgai.monitor.create(input)        // POST /api/v2/monitor
sgai.monitor.list()               // GET  /api/v2/monitor
sgai.monitor.get(id)              // GET  /api/v2/monitor/:id
sgai.monitor.pause(id)            // POST /api/v2/monitor/:id/pause
sgai.monitor.resume(id)           // POST /api/v2/monitor/:id/resume
sgai.monitor.delete(id)           // DELETE /api/v2/monitor/:id
```
### Fetch mode enum

| Value | Description |
|---|---|
| `auto` | Tries all providers in order (default) |
| `fast` | Direct HTTP only (impit) |
| `js` | JavaScript rendering |
| `direct+stealth` | Residential proxy |
| `js+stealth` | JS render + stealth proxy |
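One way to picture `auto` is as an expansion into the concrete modes to try in sequence. This is a sketch only: the actual provider order for `auto` is an assumption, and `fetchAttempts` is a hypothetical helper.

```ts
type ApiFetchMode = "auto" | "fast" | "js" | "direct+stealth" | "js+stealth";

// Hypothetical expansion of a requested mode into concrete fetch attempts.
// The fallback order shown for "auto" is illustrative, not documented.
function fetchAttempts(mode: ApiFetchMode): Exclude<ApiFetchMode, "auto">[] {
  return mode === "auto"
    ? ["fast", "js", "direct+stealth", "js+stealth"] // cheapest first
    : [mode]; // explicit modes are used as-is, no fallback
}
```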
### Extract endpoint (simplified)

Extract accepts only `url`, `prompt`, `schema`, `mode`, and `contentType`; `fetchConfig` and `llmConfig` have been removed from this endpoint.
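A request-body builder that enforces this allow-list might look like the sketch below. `buildExtractBody` and its types are illustrative, not the SDK's actual internals.

```ts
// Only the five documented fields survive into the request body;
// anything else (e.g. a leftover v1 fetchConfig/llmConfig) is dropped.
interface ExtractBody {
  url: string;
  prompt: string;
  schema?: unknown;
  mode?: string;
  contentType?: string;
}

function buildExtractBody(
  input: Record<string, unknown> & { url: string; prompt: string },
): ExtractBody {
  const { url, prompt, schema, mode, contentType } = input;
  const body: ExtractBody = { url, prompt };
  if (schema !== undefined) body.schema = schema;
  if (typeof mode === "string") body.mode = mode;
  if (typeof contentType === "string") body.contentType = contentType;
  return body;
}
```

Building the body by explicit pick (rather than spreading the input) is what makes removed fields impossible to send by accident.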
### Search endpoint — nationality

Added a `nationality` parameter (2-letter ISO code, mapped to `hl` in Serper for language-targeted results).
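The validation and mapping could be sketched as below. Whether the client or the server performs the `hl` translation is not stated here, so treat `buildSearchBody` as a hypothetical illustration of the contract.

```ts
interface SearchOptions {
  numResults?: number;
  nationality?: string; // 2-letter ISO code, e.g. "us", "de"
}

interface SearchBody {
  query: string;
  numResults?: number;
  hl?: string;
}

function buildSearchBody(query: string, opts: SearchOptions = {}): SearchBody {
  const body: SearchBody = { query };
  if (opts.numResults !== undefined) body.numResults = opts.numResults;
  if (opts.nationality !== undefined) {
    // Reject anything that is not exactly two letters.
    if (!/^[a-z]{2}$/i.test(opts.nationality)) {
      throw new Error(`nationality must be a 2-letter ISO code, got "${opts.nationality}"`);
    }
    body.hl = opts.nationality.toLowerCase();
  }
  return body;
}
```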
### Request layer

- Headers: `Authorization: Bearer <key>`, `SGAI-APIKEY: <key>`, `X-SDK-Version: js@2.0.0`
- Retry on 502/503 with exponential backoff (default 2 retries)
- Retry on network errors (`TypeError`: fetch failed, connection refused)
- Configurable `timeout` (default 30s) and `AbortSignal` support
- Timeout range: 1000–60000ms
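The retry policy above can be sketched as a small wrapper. A minimal sketch, not the real `src/http.ts`: the fetch function is injected so the policy can be exercised without a network, and the backoff base is arbitrary.

```ts
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

async function requestWithRetry(
  fetchImpl: FetchLike,
  url: string,
  init: RequestInit = {},
  maxRetries = 2,
  baseDelayMs = 100,
): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const res = await fetchImpl(url, init);
      // Only 502/503 are treated as transient; other statuses return as-is.
      if (res.status !== 502 && res.status !== 503) return res;
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      // fetch signals network-level failures as TypeError; anything else bubbles up.
      if (!(err instanceof TypeError)) throw err;
      lastError = err;
    }
    if (attempt < maxRetries) {
      // Exponential backoff: base, 2×base, 4×base, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

Returning non-transient error statuses (like 401) instead of retrying them lets the caller surface auth failures immediately.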
### Unit tests — 16/16 ✅

| File | Tests | Coverage |
|---|---|---|
| `tests/http.test.ts` | 4 | POST body, retry on 502, auth headers, 401 error |
| `tests/client.test.ts` | 7 | scrape, extract, search, search+nationality, credits, crawl (start/status/stop), monitor (create/delete), fetchConfig.mode, history query params |
| `tests/zod.test.ts` | 5 | Zod object, raw passthrough, optional fields, arrays, nested objects |
### Integration tests (localhost:3002) ✅

| Endpoint | Tests | Status |
|---|---|---|
| scrape (markdown, html, screenshot, mock, stealth, all 5 fetch modes) | 12 | ✅ |
| extract (basic, schema, complex, fetchConfig, llmConfig) | 5 | ✅ |
| search (basic, numResults, llmConfig, nationality) | 4 | ✅ |
| history (no filters, limit, service filter) | 4 | ✅ |
| credits | 1 | ✅ |
| Error handling (invalid API key) | 1 | ✅ |
### Pending changes (uncommitted)

Two uncommitted changes are ready to be committed:

- `src/schemas.ts`: added a `nationality` field to `apiSearchRequestSchema`
- `tests/client.test.ts`: added a `search forwards nationality` test

Ready for review.
- Update biome.json schema to match the installed CLI version
- Exclude the .claude dir from biome checks
- Fix formatting in schemas.ts and client.test.ts
- Add search nationality forwarding test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary

Full rewrite of `scrapegraph-js` from the v1 flat-function API to a v2 client factory targeting `/api/v2/*` endpoints.

### What changed
- `scrapegraphai({ apiKey, baseUrl?, timeout?, maxRetries? })` replaces all standalone functions
- `scrape`, `extract`, `search`, `credits`, `history` as top-level methods; `crawl.*` and `monitor.*` as namespaced sub-objects
- HTTP layer (`src/http.ts`): retry on 502/503 with exponential backoff, `Authorization` + `SGAI-APIKEY` + `X-SDK-Version` headers, configurable timeout and `AbortSignal` support
- Schemas (`src/schemas.ts`): Zod v4 validation schemas copied from sgai-stack shared contracts, used for type derivation rather than runtime validation in the SDK
- Zod conversion (`src/zod.ts`): converts Zod v3/v4 schemas to JSON Schema so consumers can pass `z.object(...)` to `extract({ schema })`
- URL guard (`src/url.ts`): private/internal hostname and IP blocking (used server-side, exported for contract alignment)
- Replaced the `render`/`stealth` booleans with the `fetchConfig.mode` enum (`auto`, `fast`, `js`, `direct+stealth`, `js+stealth`)
- Removed `fetchConfig` and `llmConfig` from the extract body; extract only accepts `url`, `prompt`, `schema`, `mode`, `contentType`
- Added `nationality` (2-letter ISO) to the search endpoint
- Types derived via `z.infer<>` and exported from `src/types/index.ts`
- `MIGRATION.md` with a v1→v2 mapping for every function, parameter, and type
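A guard like the one described for `src/url.ts` typically rejects loopback, RFC 1918, and link-local targets. The sketch below is an illustration of that idea; the real module's exact rules are not shown in this summary.

```ts
// Illustrative private/internal hostname and IPv4 blocking (SSRF-style guard).
function isBlockedHostname(hostname: string): boolean {
  const h = hostname.toLowerCase();
  if (h === "localhost" || h.endsWith(".local") || h === "0.0.0.0") return true;
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(h);
  if (!m) return false; // non-IP hostnames pass (DNS rebinding not handled here)
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 ||                         // loopback
    a === 10 ||                          // RFC 1918 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // RFC 1918 172.16.0.0/12
    (a === 192 && b === 168) ||          // RFC 1918 192.168.0.0/16
    (a === 169 && b === 254)             // link-local 169.254.0.0/16
  );
}
```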
### Removed

- `src/scrapegraphai.ts` (old monolithic module with `smartScraper`, `searchScraper`, `markdownify`, `agenticScraper`, `sitemap`, `generateSchema`, `checkHealth`)
- `src/env.ts` (env-based config)
- `tsup.config.ts` (replaced with bun build)
- `integration_test.ts` (replaced with the unit test suite)
- `examples/` (to be recreated for the v2 API)
- `.DS_Store`

### Breaking changes
- All entry points go through the `scrapegraphai()` factory
- `crawl` and `monitor` are namespaced (`crawl.start`, `monitor.create`, etc.)
- Endpoints moved to `/api/v2/*`
- Responses resolve to `{ data, requestId }`; errors throw
- `snake_case` params replaced with `camelCase`

### Test plan
- Unit tests (`bun test`) with mock HTTP servers, no real API calls
  - `tests/http.test.ts`: POST body, retry on 502, auth headers, 401 error
  - `tests/client.test.ts`: scrape, extract, search, search+nationality, credits, crawl.start/status/stop, monitor.create/delete, fetchConfig.mode, history query params
  - `tests/zod.test.ts`: Zod object/optional/array/nested conversion, raw passthrough
- Typecheck (`tsc --noEmit`)
- Integration against `localhost:3002`: all endpoints working

🤖 Generated with Claude Code