feat(firecrawl): migrate FireCrawl loader to Firecrawl v2 (v4) SDK #1
Closed
rakshith48 wants to merge 4 commits into
Replace the hand-rolled v1 REST client in the FireCrawl document loader
with the official @mendable/firecrawl-js v2 API (Firecrawl class) and bump
the dependency from ^1.18.2 to ^4.25.2.
- Use `new Firecrawl({ apiKey, apiUrl })` and its `.scrape` / `.crawl` /
`.search` / `.extract` methods instead of manual axios calls to /v1/*.
- Adapt to v2 response shapes: scrape/crawl return Document(s) directly
(no { success, data } envelope); crawl returns a CrawlJob with `.data`;
search returns results grouped by source (use `.web`).
- Preserve the node's inputs, modes, defaults, and Document/Text output
shape. Search `country` now maps to v2's single `location` field, since
v1's separate `lang`/`country` params were removed in v2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
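The `country` to `location` mapping described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the loader's actual code: `NodeSearchInputs`, `V2SearchOptions`, and `toV2SearchOptions` are hypothetical names, and the option set is assumed to be only the fields this PR mentions.

```typescript
// Hypothetical view of the node's search inputs; field names are illustrative only.
interface NodeSearchInputs {
    limit?: number
    lang?: string // v1-only input: still present in the UI, never forwarded
    country?: string
    timeout?: number
    tbs?: string
    ignoreInvalidURLs?: boolean
}

// Assumed subset of the v2 search options named in this PR (not the SDK's full type).
interface V2SearchOptions {
    limit?: number
    location?: string
    timeout?: number
    tbs?: string
    ignoreInvalidURLs?: boolean
}

function toV2SearchOptions(inputs: NodeSearchInputs): V2SearchOptions {
    const options: V2SearchOptions = {}

    // v1 `country` becomes v2 `location`; blank strings are treated as unset.
    const country = inputs.country?.trim()
    if (country) options.location = country

    // Pass-through fields keep their names and values.
    if (inputs.limit !== undefined) options.limit = inputs.limit
    if (inputs.timeout !== undefined) options.timeout = inputs.timeout
    if (inputs.tbs) options.tbs = inputs.tbs
    if (inputs.ignoreInvalidURLs !== undefined) options.ignoreInvalidURLs = inputs.ignoreInvalidURLs

    // `inputs.lang` is deliberately ignored: v2 has no separate language parameter.
    return options
}
```

Building the options object field by field, rather than spreading the inputs, keeps `lang` from leaking into the request even if an existing node config still carries a value for it.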
…crawl-js) Both names dual-publish the identical v4 SDK; `firecrawl` is the current canonical package. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Updated to use the canonical Note: the dep was swapped in place in
…2 maxDiscoveryDepth Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Superseded; migrated to upstream: FlowiseAI#6474
Summary
Upgrades the FireCrawl document loader (`packages/components/nodes/documentloaders/FireCrawl/FireCrawl.ts`) from the legacy Firecrawl v1 API to the official `@mendable/firecrawl-js` v2 (v4 major) SDK, and bumps the dependency in `packages/components/package.json` from `^1.18.2` to `^4.25.2`.

The previous node declared `@mendable/firecrawl-js` as a dependency but never imported it. Instead, it hand-rolled its own `FirecrawlApp` class that hit the `/v1/scrape`, `/v1/crawl`, `/v1/extract`, and `/v1/search` REST endpoints directly. This PR replaces that hand-rolled client with the official SDK.

v1 → v2 changes
- `new FirecrawlApp({ apiKey, apiUrl })` (custom) → `new Firecrawl({ apiKey, apiUrl })` (SDK default export).
- `POST /v1/*` + status polling → SDK `app.scrape()`, `app.crawl()` (built-in waiter, `pollInterval: 2`), `app.search()`, `app.extract()`.
- Response shapes (no `{ success, data }` envelope):
  - `scrape` → returns a `Document` directly (markdown/html/metadata at top level).
  - `crawl` → returns a `CrawlJob` with `.status` and `.data: Document[]`.
  - `search` → returns `SearchData` grouped by source; the loader reads `.web` and normalizes each entry (lightweight `SearchResultWeb` or full `Document`).
  - `extract` → returns `ExtractResponse` with `.data` / `.status` / `.expiresAt`.
- Search: v2 removed v1's separate `lang`/`country` params. To avoid breaking existing node configs, the node's `Country` input is mapped to v2's single `location` field; `lang` is no longer sent (the input remains in the UI but is inert for v2). `limit`, `timeout`, `tbs`, and `ignoreInvalidURLs` are passed through.
- Preserved: the same modes (`crawl`/`scrape`/`extract`/`search`), same defaults (`formats: ['markdown']`, `onlyMainContent: true`), and the same LangChain `Document[]` / concatenated-`Text` output. `integration: 'flowise'` is still sent on every call.

Note for reviewers (security tradeoff)
The old hand-rolled client routed requests through Flowise's `secureAxiosRequest` (SSRF-protected HTTP wrapper). The official SDK uses its own internal HTTP client (undici/fetch), so this migration moves Firecrawl traffic off `secureAxiosRequest`. This matches how most other vendor-SDK nodes in `packages/components` operate, but it is flagged explicitly here because it is a behavioral change that warrants a conscious decision.

Verification
What I did verify:
- Installed `@mendable/firecrawl-js@4.25.2` in an isolated project and ran `tsc --strict --noEmit` on `FireCrawl.ts` (with only the Flowise-internal `../../../src/*` and `@langchain/*` imports stubbed). Result: 0 type errors; every v2 method name, option field, and response field access used in the file resolves against the actual v4 `.d.ts`.
- `Firecrawl` (default export) and the named types `Document`, `ScrapeOptions`, `CrawlOptions`, `SearchRequest`, `SearchResultWeb` are all exported by `@mendable/firecrawl-js@4.25.2`.
- Method names were checked against the SDK source (`src/v2/client.ts`, `src/v2/methods/*.ts`) and the published npm README usage examples; no guessed method names.
- `prettier --check` with Flowise's exact config (printWidth 140, tabWidth 4, singleQuote, no semicolons, trailingComma none) passes.

What I did NOT run (and why):
- `pnpm install`. The full install + turbo build is heavy; per task constraints it was avoided. The single-file strict type-check above substitutes for it but does not exercise the real `@langchain/*` / Flowise `Interface`/`utils` types (those were stubbed).

🤖 Generated with Claude Code
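The search normalization mentioned under "v1 → v2 changes" (reading `.web` and handling entries that are either lightweight results or full documents) might look something like the sketch below. The types are simplified local stand-ins, not the SDK's own `SearchResultWeb` or `Document` definitions, and `normalizeWebEntry` / `normalizeSearchResults` are hypothetical helper names; the metadata keys `source` and `title` are likewise assumptions.

```typescript
// Simplified, assumed merge of the two entry shapes a `.web` result may take:
// a lightweight result (url/title/description) or a scraped document (markdown/html/metadata).
interface WebEntry {
    url?: string
    title?: string
    description?: string
    markdown?: string
    html?: string
    metadata?: Record<string, unknown>
}

// Stand-in for the LangChain document shape the loader emits.
interface LoaderDoc {
    pageContent: string
    metadata: Record<string, unknown>
}

function normalizeWebEntry(entry: WebEntry): LoaderDoc {
    const meta = entry.metadata ?? {}
    // Prefer scraped content; fall back to the search snippet for lightweight results.
    const pageContent = entry.markdown ?? entry.html ?? entry.description ?? ''
    // Lightweight results carry `url`; scraped documents are assumed to carry `metadata.sourceURL`.
    const source = entry.url ?? (typeof meta.sourceURL === 'string' ? meta.sourceURL : undefined)
    return {
        pageContent,
        metadata: { ...meta, source, title: entry.title ?? meta.title }
    }
}

// Results are grouped by source; only the `web` group is read, and a missing group yields no documents.
function normalizeSearchResults(data: { web?: WebEntry[] }): LoaderDoc[] {
    return (data.web ?? []).map(normalizeWebEntry)
}
```

Treating both shapes through one permissive type keeps the mapping in a single function; a stricter version could discriminate the two shapes explicitly if the real SDK types make that practical.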