diff --git a/.github/styles/config/vocabularies/Docs/accept.txt b/.github/styles/config/vocabularies/Docs/accept.txt
index dfb5437af3..aac216fec5 100644
--- a/.github/styles/config/vocabularies/Docs/accept.txt
+++ b/.github/styles/config/vocabularies/Docs/accept.txt
@@ -7,7 +7,7 @@ CU
 booleans
 env
 npm
-serverless
+[Ss]erverless
 [Bb]oolean
 node_modules
 [Rr]egex
diff --git a/sources/platform/actors/development/quick-start/build_with_ai.md b/sources/platform/actors/development/quick-start/build_with_ai.md
index cc86e96019..948285fb3d 100644
--- a/sources/platform/actors/development/quick-start/build_with_ai.md
+++ b/sources/platform/actors/development/quick-start/build_with_ai.md
@@ -14,9 +14,10 @@ import TabItem from '@theme/TabItem';
 
 This guide provides best practices for building new Actors or improving existing ones using AI code generation tools by providing the AI agents with the right instructions and context.
 
-:::tip Develop AI agents on Apify
+:::tip Different goal?
 
-Looking to build and deploy AI agents as Actors? See [Develop AI agents on Apify](/platform/actors/development/quick-start/develop-ai-agents) for the full stack - templates, sandboxes, LLM access, and monetization.
+- _Building and deploying AI agents as Actors on Apify?_ See [Develop AI agents on Apify](/platform/actors/development/quick-start/develop-ai-agents) for the full stack - templates, sandboxes, LLM access, and monetization.
+- _Connecting an external AI agent to Apify?_ See [Apify for AI agents](/platform/integrations/agent-onboarding) for MCP, Agent Skills, client libraries, and the REST API.
 
 :::
 
diff --git a/sources/platform/index.mdx b/sources/platform/index.mdx
index d0710fb402..c7abf59de1 100644
--- a/sources/platform/index.mdx
+++ b/sources/platform/index.mdx
@@ -33,6 +33,11 @@ Learn how to run any Actor in Apify Store or create your own.
A step-by-step gui desc="Learn everything about web scraping and automation with free courses that will turn you into an expert scraper developer." to="/academy" /> + ## Contents
diff --git a/sources/platform/integrations/ai/agent-onboarding.md b/sources/platform/integrations/ai/agent-onboarding.md
new file mode 100644
index 0000000000..7048fbd38b
--- /dev/null
+++ b/sources/platform/integrations/ai/agent-onboarding.md
@@ -0,0 +1,306 @@
+---
+title: Apify for AI agents
+sidebar_label: Agent onboarding
+description: Connect your AI agent to the Apify platform - scrape the web, run Actors, and retrieve structured data via MCP, Agent Skills, client libraries, or the REST API.
+sidebar_position: 1
+slug: /integrations/agent-onboarding
+toc_max_heading_level: 3
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+Connect your AI agent or application to Apify - the platform for web scraping, data extraction, and browser automation. The typical agent workflow: find an Actor, run it, get structured data back.
+
+## Core concepts
+
+- _Actors_ - Serverless cloud programs that perform scraping, crawling, or automation tasks. Thousands of ready-made Actors are available in [Apify Store](https://apify.com/store).
+- _Datasets_ - Append-only storage for structured results. Every Actor run creates a default dataset. Export as JSON, CSV, Excel, XML, or RSS.
+- _API_ - RESTful API at `https://api.apify.com/v2` for all platform operations. Also accessible via [MCP](/platform/integrations/mcp), [CLI](/cli), and client libraries.
+
+## Prerequisites
+
+Sign up to [Apify Console](https://console.apify.com/sign-up). The free plan includes monthly platform usage credits with no credit card required. Get your API token from **[Console > Settings > Integrations](https://console.apify.com/settings/integrations)**.
+
+:::tip Free exploration
+
+The MCP server's `search-actors`, `fetch-actor-details`, and docs tools work without authentication.
You can browse Actors and documentation without an account.
+
+:::
+
+## Run your first Actor
+
+Every Apify Actor follows the same pattern: send input as JSON, get structured data back. The shortest path through each of the main integration methods, using the agent-optimized [RAG Web Browser](https://apify.com/apify/rag-web-browser) Actor:
+
+
+
+
+After [connecting the MCP server](#mcp-server) to your AI assistant, ask:
+
+```text
+Use Apify's RAG Web Browser to find the top 3 pages about Apify documentation, then summarize.
+```
+
+Your agent calls [`search-actors`](/platform/integrations/mcp#available-tools), [`call-actor`](/platform/integrations/mcp#available-tools), and reads the resulting dataset items - all through MCP, no code required.
+
+
+
+```typescript
+import { ApifyClient } from 'apify-client';
+
+const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
+const run = await client.actor('apify/rag-web-browser').call({
+    query: 'Apify documentation',
+    maxResults: 3,
+});
+const { items } = await client.dataset(run.defaultDatasetId).listItems();
+```
+
+
+
+```python
+import os
+from apify_client import ApifyClient
+
+client = ApifyClient(token=os.environ['APIFY_TOKEN'])
+run = client.actor('apify/rag-web-browser').call(
+    run_input={'query': 'Apify documentation', 'maxResults': 3},
+)
+items = client.dataset(run['defaultDatasetId']).list_items().items
+```
+
+
+
+```bash
+apify login # one-time
+apify call apify/rag-web-browser \
+  -i '{"query": "Apify documentation", "maxResults": 3}' \
+  --output-dataset
+```
+
+
+
+The pattern is the same across every integration method: pick an Actor, send input, receive structured data. Choose the connection method below that fits your stack.
+
+:::caution Cost controls
+
+When an agent calls Actors automatically, set run limits to prevent surprise bills. Pass these as query parameters on the [run Actor endpoint](/api/v2/act-runs-post):
+
+- `memory` (MB) - power of 2, minimum 128.
Lower memory means lower cost per second.
+- `timeout` (seconds) - cap how long a single run can last.
+- `maxTotalChargeUsd` - cap total run cost for pay-per-event Actors.
+
+See [Usage and resources](/platform/actors/running/usage-and-resources) and [Billing](/platform/console/billing) for details.
+
+:::
+
+## Choose your integration method
+
+| Method | Best for | Auth |
+| :--- | :--- | :--- |
+| [MCP server](#mcp-server) | AI agents and coding assistants | OAuth or API token |
+| [API client](#api-client) | Backend apps (JavaScript/Python) | API token |
+| [CLI](#cli) | Building and deploying custom Actors | API token |
+| [REST API](#rest-api) | Any language, HTTP integrations, no-code tools | API token |
+
+### MCP server
+
+The [Apify MCP server](/platform/integrations/mcp) connects your agent to the full Apify platform via the [Model Context Protocol](https://modelcontextprotocol.io/). No local installation needed for remote-capable clients.
+
+#### Remote (recommended)
+
+Works with Claude Code, Cursor, VS Code, GitHub Copilot, and other remote-capable clients.
+
+1. Add the following to your MCP client's configuration:
+
+    ```json
+    {
+        "mcpServers": {
+            "apify": {
+                "url": "https://mcp.apify.com"
+            }
+        }
+    }
+    ```
+
+1. Restart your client and sign in when prompted. OAuth handles authentication automatically.
+
+#### Local/stdio
+
+For clients that only support local MCP servers, for example Claude Desktop.
+
+1. Add the following to your MCP client's configuration:
+
+    ```json
+    {
+        "mcpServers": {
+            "apify": {
+                "command": "npx",
+                "args": ["-y", "@apify/actors-mcp-server"],
+                "env": { "APIFY_TOKEN": "YOUR_TOKEN" }
+            }
+        }
+    }
+    ```
+
+1. Replace `YOUR_TOKEN` with your API token and restart the client.
+
+For client-specific setup instructions, use the [MCP Configurator](https://mcp.apify.com) which generates ready-to-paste configs. For details, see the [MCP server documentation](/platform/integrations/mcp).
+
+### API client
+
+For integrating Apify into your application code.
+
+:::warning Package naming
+
+`apify-client` is the API client for _calling_ Actors. The `apify` package is the SDK for _building_ Actors. For backend integration, install `apify-client`.
+
+:::
+
+
+
+
+```bash
+npm install apify-client
+```
+
+```typescript
+import { ApifyClient } from 'apify-client';
+
+const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
+const run = await client.actor('apify/web-scraper').call({
+    startUrls: [{ url: 'https://example.com' }],
+});
+const { items } = await client.dataset(run.defaultDatasetId).listItems();
+```
+
+Full reference: [JavaScript API client docs](https://docs.apify.com/api/client/js)
+
+
+
+
+```bash
+pip install apify-client
+```
+
+```python
+import os
+from apify_client import ApifyClient
+
+client = ApifyClient(token=os.environ['APIFY_TOKEN'])
+run = client.actor('apify/web-scraper').call(
+    run_input={'startUrls': [{'url': 'https://example.com'}]}
+)
+items = client.dataset(run['defaultDatasetId']).list_items().items
+```
+
+Full reference: [Python API client docs](https://docs.apify.com/api/client/python)
+
+
+
+
+### CLI
+
+For running Actors and building custom ones from the command line.
+
+Install on macOS or Linux (Windows and Homebrew alternatives in the [CLI install docs](/cli/docs/installation)):
+
+```bash
+curl -fsSL https://apify.com/install-cli.sh | bash
+apify login # authenticate with your API token
+```
+
+Discover and inspect Actors:
+
+```bash
+apify actors search scraping # search Apify Store
+apify actors info apify/web-scraper --readme # get Actor README
+apify actors info apify/web-scraper --input # get input schema
+```
+
+Run an Actor and get its output:
+
+```bash
+apify actors call apify/web-scraper \
+  -i '{"startUrls": [{"url": "https://example.com"}]}' \
+  --output-dataset
+```
+
+Build and deploy custom Actors:
+
+```bash
+apify create my-actor # scaffold (JS/TS/Python)
+apify run # test locally
+apify push # deploy to Apify cloud
+```
+
+Full reference: [Apify CLI documentation](/cli).
+
+### REST API
+
+For HTTP-native integrations or languages without a dedicated client. Base URL: `https://api.apify.com/v2`. Authenticate with the `Authorization: Bearer YOUR_TOKEN` header.
+
+#### Quick reference
+
+| Action | Method | Endpoint |
+| :--- | :--- | :--- |
+| [Search Actors in Store](/api/v2/store-get) | `GET` | `/v2/store` |
+| [Get Actor details](/api/v2/act-get) | `GET` | `/v2/acts/{actorId}` |
+| [Run an Actor](/api/v2/act-runs-post) | `POST` | `/v2/acts/{actorId}/runs` |
+| [Run Actor (sync, get results)](/api/v2/act-run-sync-get-dataset-items-post) | `POST` | `/v2/acts/{actorId}/run-sync-get-dataset-items` |
+| [Get run status](/api/v2/actor-run-get) | `GET` | `/v2/actor-runs/{runId}` |
+| [Get dataset items](/api/v2/dataset-items-get) | `GET` | `/v2/datasets/{datasetId}/items` |
+
+The sync endpoint ([`run-sync-get-dataset-items`](/api/v2/act-run-sync-get-dataset-items-post)) runs an Actor and returns results in a single request (waits up to 5 minutes). Use [async endpoints](/api/v2/act-runs-post) for longer runs.
+
+For runs that take longer than the sync timeout, prefer [webhooks](/platform/integrations/webhooks) over polling - Apify will POST a notification to your URL when the run finishes, avoiding wasted requests.
+
+Full reference: [Apify API v2](/api/v2).
+
+## Agent Skills
+
+Once you connect an agent via MCP or a coding assistant, [Apify Agent Skills](https://skills.sh/apify/agent-skills) add pre-built workflows on top - guiding the agent through multi-step scraping pipelines and Actor development tasks. Skills are not a separate integration method; they layer over your existing connection.
+
+Install into Claude Code, Cursor, Gemini CLI, or OpenAI Codex:
+
+```bash
+npx skills add apify/agent-skills
+```
+
+| Skill | What it does |
+| :--- | :--- |
+| `apify-ultimate-scraper` | Routes web scraping requests to the right Actor for multi-step data pipelines |
+| `apify-actor-development` | Guided workflow for building and deploying custom Actors |
+| `apify-actorization` | Converts an existing project into an Apify Actor |
+| `apify-generate-output-schema` | Auto-generates output schemas from Actor source code |
+
+For the full list and details, see the [skills registry](https://skills.sh/apify/agent-skills).
+
+## Documentation access for agents
+
+Apify documentation is available in formats optimized for programmatic consumption.
+
+| Resource | How to access |
+| :--- | :--- |
+| Specific doc page | Append `.md` to any docs URL (for example, `docs.apify.com/platform/actors.md`) |
+| Specific doc page (alt) | Request with `Accept: text/markdown` header |
+| Docs index | [docs.apify.com/llms.txt](https://docs.apify.com/llms.txt) |
+| Full docs (large) | [docs.apify.com/llms-full.txt](https://docs.apify.com/llms-full.txt) |
+| Actor Store pages | Append `.md` to any Apify Store URL |
+| MCP docs tools | `search-apify-docs`, `fetch-apify-docs` |
+
+For targeted lookups, prefer `.md` URLs for specific pages or the MCP docs tools over the full `llms-full.txt` file.
Agents with limited context windows may not load `llms-full.txt` fully.
+
+## Useful resources
+
+- [MCP server integration](/platform/integrations/mcp) - Tool customization, dynamic Actor discovery, and advanced configuration
+- [CLI documentation](/cli) - Complete command reference
+- [API reference](/api/v2) - All REST API endpoints
+- [API client for JavaScript](https://docs.apify.com/api/client/js) | [for Python](https://docs.apify.com/api/client/python) - Client libraries
+- [Storage documentation](/platform/storage) - Datasets, key-value stores, and request queues
+- [Build with AI](/platform/actors/development) - Build and deploy your first Actor
+- [Framework integrations](./crewai.md) - CrewAI, LangChain, LlamaIndex, and more
diff --git a/sources/platform/integrations/ai/mcp/index.md b/sources/platform/integrations/ai/mcp/index.md
index 8615c20707..bf248a37e5 100644
--- a/sources/platform/integrations/ai/mcp/index.md
+++ b/sources/platform/integrations/ai/mcp/index.md
@@ -2,7 +2,7 @@
 title: Apify MCP server
 sidebar_label: MCP server
 description: Learn how to use the Apify MCP server to integrate Apify's library of Actors into your AI agents or large language model-based applications.
-sidebar_position: 1
+sidebar_position: 0
 slug: /integrations/mcp
 toc_max_heading_level: 4
 ---
diff --git a/sources/platform/integrations/index.mdx b/sources/platform/integrations/index.mdx
index ba6fdabc3d..6a37a54800 100644
--- a/sources/platform/integrations/index.mdx
+++ b/sources/platform/integrations/index.mdx
@@ -171,9 +171,15 @@ The Apify platform integrates with popular ETL and data pipeline services, enabl
 If you are working on AI/LLM-related applications, we recommend looking into the many integrations with popular AI/LLM ecosystems.
-These integrations allow you to use Apify Actors as tools and data sources.
+These integrations allow you to use Apify Actors as tools and data sources. If you are connecting any AI agent to Apify, start with the [Apify for AI agents](/platform/integrations/agent-onboarding) page.
+