diff --git a/docs.json b/docs.json
index 8233742c..5cb72d29 100644
--- a/docs.json
+++ b/docs.json
@@ -315,6 +315,7 @@
           "sdk/guides/agent-interactive-terminal",
           "sdk/guides/agent-browser-use",
           "sdk/guides/agent-custom",
+          "sdk/guides/agent-file-based",
           "sdk/guides/agent-stuck-detector",
           "sdk/guides/agent-tom-agent",
           "sdk/guides/critic"
diff --git a/sdk/guides/agent-file-based.mdx b/sdk/guides/agent-file-based.mdx
new file mode 100644
index 00000000..627e44a4
--- /dev/null
+++ b/sdk/guides/agent-file-based.mdx
@@ -0,0 +1,393 @@
+---
+title: File-Based Agents
+description: Define specialized sub-agents as simple Markdown files with YAML frontmatter, no Python code required.
+---
+
+import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx";
+
+> A ready-to-run example is available [here](#ready-to-run-example)!
+
+File-based agents let you define specialized sub-agents using Markdown files. Each file declares the agent's name, description, tools, and system prompt: the same things you'd pass to `register_agent()` in code, but without writing any Python.
+
+This is the fastest way to create reusable, domain-specific agents that can be invoked via [delegation](/sdk/guides/agent-delegation).
+
+## Agent File Format
+
+An agent is a single `.md` file with YAML frontmatter and a Markdown body:
+
+```markdown icon="markdown"
+---
+name: code-reviewer
+description: >
+  Reviews code for quality, bugs, and best practices.
+  <example>Review this pull request for issues</example>
+  <example>Check this code for bugs</example>
+tools:
+  - file_editor
+  - terminal
+model: inherit
+---
+
+# Code Reviewer
+
+You are a meticulous code reviewer. When reviewing code:
+
+1. **Correctness** - Look for bugs, off-by-one errors, and race conditions.
+2. **Style** - Check for consistent naming and idiomatic usage.
+3. **Performance** - Identify unnecessary allocations or algorithmic issues.
+4. **Security** - Flag injection vulnerabilities or hardcoded secrets.
+
+Keep feedback concise and actionable.
+For each issue, suggest a fix.
+```
+
+The YAML frontmatter configures the agent. The Markdown body becomes the agent's system prompt.
+
+### Frontmatter Fields
+
+| Field | Required | Default | Description |
+|-------|----------|---------|-------------|
+| `name` | Yes | - | Agent identifier (e.g., `code-reviewer`) |
+| `description` | No | `""` | What this agent does; shown to the orchestrator |
+| `tools` | No | `[]` | List of tools the agent can use |
+| `model` | No | `"inherit"` | LLM model (`"inherit"` uses the parent agent's model) |
+| `color` | No | `None` | [Rich color name](https://rich.readthedocs.io/en/stable/appendix/colors.html) (e.g., `"blue"`, `"green"`) used by visualizers to style this agent's output in terminal panels |
+
+### `<example>` Tags
+
+Add `<example>` tags inside the description to help the orchestrating agent know **when** to delegate to this agent:
+
+```markdown icon="markdown"
+description: >
+  Writes and improves technical documentation.
+  <example>Write docs for this module</example>
+  <example>Improve the README</example>
+```
+
+These examples are extracted and stored as `when_to_use_examples` on the `AgentDefinition` object. Routing logic or prompt-building code can use them to decide when to delegate to a given sub-agent.
+
+## Directory Conventions
+
+Place agent files in these directories, scanned in **priority order** (first match wins):
+
+| Priority | Location | Scope |
+|----------|----------|-------|
+| 1 | `{project}/.agents/agents/*.md` | Project-level (primary) |
+| 2 | `{project}/.openhands/agents/*.md` | Project-level (secondary) |
+| 3 | `~/.agents/agents/*.md` | User-level (primary) |
+| 4 | `~/.openhands/agents/*.md` | User-level (secondary) |
+
+**Rules:**
+- Only top-level `.md` files are loaded (subdirectories are skipped)
+- `README.md` files are automatically skipped
+- Project-level agents take priority over user-level agents with the same name
+
+Put agents shared across all your projects in `~/.agents/agents/`.
+Put project-specific agents in `{project}/.agents/agents/`.
+
+## Built-in Agents
+
+The SDK ships with built-in agents that are automatically loaded at the beginning of each conversation and are available to the user.
+
+By default, every agent also gets `FinishTool` and `ThinkTool`, which are appended after tool filtering.
+
+The table below summarizes all available built-in agents:
+
+| Agent | Tools | Description |
+|--------|-------|-------|
+| **default** | `terminal`, `file_editor`, `task_tracker`, `browser_tool_set` | General-purpose agent |
+
+## Overall Priority
+
+When the same agent name is defined in multiple places, the highest-priority source wins. Registration is first-come, first-served: once a name is registered, later sources do not overwrite it.
+
+| Priority | Source | Description |
+|----------|--------|-------------|
+| 1 (highest) | **Programmatic** `register_agent()` | Registered first, never overwritten |
+| 2 | **Plugin agents** (`Plugin.agents`) | Loaded from plugin `agents/` directories |
+| 3 | **Project-level** file-based agents | `.agents/agents/*.md` or `.openhands/agents/*.md` |
+| 4 | **User-level** file-based agents | `~/.agents/agents/*.md` or `~/.openhands/agents/*.md` |
+| 5 (lowest) | **Built-in agents** | SDK built-in agents |
+
+## Auto-Registration
+
+The simplest way to use file-based agents is auto-registration. Call `register_file_agents()` with your project directory, and all discovered agents are registered into the delegation system:
+
+```python icon="python" focus={3}
+from openhands.sdk.subagent import register_file_agents
+
+agent_names = register_file_agents("/path/to/project")
+print(f"Registered {len(agent_names)} agents: {agent_names}")
+```
+
+This scans both project-level and user-level directories, deduplicates by name, and registers each agent as a delegate that can be spawned by the orchestrator.
+
+## Manual Loading
+
+For more control, load and register agents explicitly:
+
+```python icon="python" focus={3-6, 8-14}
+from pathlib import Path
+
+from openhands.sdk import load_agents_from_dir, register_agent, agent_definition_to_factory
+
+# Load from a specific directory
+agents_dir = Path("agents")
+agent_definitions = load_agents_from_dir(agents_dir)
+
+# Register each agent
+for agent_def in agent_definitions:
+    register_agent(
+        name=agent_def.name,
+        factory_func=agent_definition_to_factory(agent_def),
+        description=agent_def.description,
+    )
+```
+
+### Key Functions
+
+#### `load_agents_from_dir()`
+
+Scans a directory for `.md` files and returns a list of `AgentDefinition` objects:
+
+```python icon="python" focus={3-4}
+from pathlib import Path
+
+from openhands.sdk import load_agents_from_dir
+
+definitions = load_agents_from_dir(Path(".agents/agents"))
+for d in definitions:
+    print(f"{d.name}: {d.tools}, model={d.model}")
+```
+
+#### `agent_definition_to_factory()`
+
+Converts an `AgentDefinition` into a factory function `(LLM) -> Agent`:
+
+```python icon="python"
+from openhands.sdk import agent_definition_to_factory
+
+factory = agent_definition_to_factory(agent_def)  # agent_def: an AgentDefinition loaded earlier
+# The factory is called by the delegation system with the parent's LLM
+```
+
+The factory:
+- Maps tool names from the frontmatter to `Tool` objects
+- Appends the Markdown body to the parent system message via `AgentContext(system_message_suffix=...)`
+- Respects the `model` field (`"inherit"` keeps the parent LLM; an explicit model name creates a copy)
+
+#### `load_project_agents()` / `load_user_agents()`
+
+Load agents from project-level or user-level directories respectively:
+
+```python icon="python" focus={3, 4}
+from openhands.sdk.subagent import load_project_agents, load_user_agents
+
+project_agents = load_project_agents("/path/to/project")
+user_agents = load_user_agents()  # scans ~/.agents/agents/ and ~/.openhands/agents/
+```
+
+## Using with Delegation
+
+File-based agents are designed to work with the [DelegateTool](/sdk/guides/agent-delegation). Once registered, the orchestrating agent can spawn and delegate tasks to them by name:
+
+```python icon="python" focus={6, 9-12, 15-19}
+from openhands.sdk import Agent, Conversation, Tool
+from openhands.sdk.subagent import register_file_agents
+from openhands.sdk.tool import register_tool
+from openhands.tools.delegate import DelegateTool, DelegationVisualizer
+
+register_file_agents("/path/to/project")  # Register .agents/agents/*.md
+
+# Set up the orchestrator with DelegateTool
+register_tool("DelegateTool", DelegateTool)
+main_agent = Agent(
+    llm=llm,  # an LLM instance created elsewhere
+    tools=[Tool(name="DelegateTool")],
+)
+
+conversation = Conversation(
+    agent=main_agent,
+    workspace="/path/to/project",
+    visualizer=DelegationVisualizer(name="Orchestrator"),
+)
+```
+
+To learn more about agent delegation, follow our [comprehensive guide](/sdk/guides/agent-delegation).
+
+## Example Agent Files
+
+### Code Reviewer
+
+```markdown icon="markdown"
+---
+name: code-reviewer
+description: >
+  Reviews code for quality, bugs, and best practices.
+  <example>Review this pull request for issues</example>
+  <example>Check this code for bugs</example>
+tools:
+  - file_editor
+  - terminal
+---
+
+# Code Reviewer
+
+You are a meticulous code reviewer. When reviewing code:
+
+1. **Correctness** - Look for bugs, off-by-one errors, null pointer issues, and race conditions.
+2. **Style** - Check for consistent naming, formatting, and idiomatic usage.
+3. **Performance** - Identify unnecessary allocations, N+1 queries, or algorithmic inefficiencies.
+4. **Security** - Flag potential injection vulnerabilities, hardcoded secrets, or unsafe deserialization.
+
+Keep feedback concise and actionable. For each issue found, suggest a concrete fix.
+```
+
+### Technical Writer
+
+```markdown icon="markdown"
+---
+name: tech-writer
+description: >
+  Writes and improves technical documentation.
+  <example>Write docs for this module</example>
+  <example>Improve the README</example>
+tools:
+  - file_editor
+---
+
+# Technical Writer
+
+You are a skilled technical writer. When creating or improving documentation:
+
+1. **Audience** - Write for developers who are new to the project.
+2. **Structure** - Use clear headings, code examples, and step-by-step instructions.
+3. **Accuracy** - Read the source code before documenting behavior. Never guess.
+4. **Brevity** - Prefer short, concrete sentences over long explanations.
+
+Always include a usage example with expected output when documenting functions or APIs.
+```
+
+## Agents in Plugins
+
+> Plugins bundle agents, tools, skills, and MCP servers into reusable packages.
+Learn more about plugins [here](/sdk/guides/plugins).
+
+File-based agents can also be bundled inside plugins. Place them in the `agents/` directory of your plugin.
+
+Plugin agents use the same `.md` format and are registered automatically when the plugin is loaded. They have higher priority than project-level and user-level file-based agents but lower than programmatic `register_agent()` calls.
+
+## Ready-to-run Example
+
+This example is available on GitHub: [examples/01_standalone_sdk/42_file_based_subagents.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/42_file_based_subagents.py)
+
+This example uses `AgentDefinition` directly. File-based agents are loaded into the same `AgentDefinition` objects (from Markdown) and registered the same way.
+
+```python icon="python" expandable examples/01_standalone_sdk/42_file_based_subagents.py
+"""Example: Defining a sub-agent inline with AgentDefinition.
+
+Defines a grammar-checker sub-agent using AgentDefinition, registers it,
+and delegates work to it from an orchestrator agent. The orchestrator then
+asks the builtin default agent to judge the results.
+"""
+
+import os
+from pathlib import Path
+
+from openhands.sdk import (
+    LLM,
+    Agent,
+    Conversation,
+    Tool,
+    agent_definition_to_factory,
+    register_agent,
+)
+from openhands.sdk.subagent import AgentDefinition
+from openhands.sdk.tool import register_tool
+from openhands.tools.delegate import DelegateTool, DelegationVisualizer
+
+
+# 1. Define a sub-agent using AgentDefinition
+grammar_checker = AgentDefinition(
+    name="grammar-checker",
+    description="Checks documents for grammatical errors.",
+    tools=["file_editor"],
+    system_prompt="You are a grammar expert. Find and list grammatical errors.",
+)
+
+# 2. Register it in the delegate registry
+register_agent(
+    name=grammar_checker.name,
+    factory_func=agent_definition_to_factory(grammar_checker),
+    description=grammar_checker.description,
+)
+
+# 3. Set up the orchestrator agent with the DelegateTool
+llm = LLM(
+    model=os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929"),
+    api_key=os.getenv("LLM_API_KEY"),
+    base_url=os.getenv("LLM_BASE_URL"),
+    usage_id="file-agents-demo",
+)
+
+register_tool("DelegateTool", DelegateTool)
+main_agent = Agent(
+    llm=llm,
+    tools=[Tool(name="DelegateTool")],
+)
+conversation = Conversation(
+    agent=main_agent,
+    workspace=Path.cwd(),
+    visualizer=DelegationVisualizer(name="Orchestrator"),
+)
+
+# 4. Ask the orchestrator to delegate to our agent
+task = (
+    "Please delegate to the grammar-checker agent and ask it to review "
+    "the README.md file in search of grammatical errors.\n"
+    "Then ask the default agent to judge the errors."
+)
+conversation.send_message(task)
+conversation.run()
+
+cost = conversation.conversation_stats.get_combined_metrics().accumulated_cost
+print(f"\nTotal cost: ${cost:.4f}")
+print(f"EXAMPLE_COST: {cost:.4f}")
+```
+
+## Next Steps
+
+- **[Sub-Agent Delegation](/sdk/guides/agent-delegation)** - Learn about the DelegateTool and delegation patterns
+- **[Skills](/sdk/guides/skill)** - Add specialized knowledge and triggers to agents
+- **[Plugins](/sdk/guides/plugins)** - Bundle agents, skills, hooks, and MCP servers together
+- **[Custom Agent](/sdk/guides/agent-custom)** - Create agents programmatically for more control
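
The Directory Conventions rules in the guide added by this diff (priority-ordered scan, top-level `.md` files only, `README.md` skipped, first match wins) can be sketched in plain Python. This is an illustration only, not the SDK's implementation: `read_agent_name` and `discover_agents` are hypothetical helpers, and the frontmatter handling is a naive stand-in for a real YAML parser.

```python
from pathlib import Path
from typing import Dict, Optional


def read_agent_name(md_file: Path) -> Optional[str]:
    """Return the `name` value from a file's frontmatter, or None if absent.

    Hypothetical helper: only handles a plain `name: value` line.
    """
    lines = md_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the frontmatter
            break
        if line.startswith("name:"):
            return line.split(":", 1)[1].strip() or None
    return None


def discover_agents(project_dir: Path, home_dir: Path) -> Dict[str, Path]:
    """Map agent name -> defining file, scanning in the guide's priority order."""
    search_dirs = [
        project_dir / ".agents" / "agents",     # priority 1
        project_dir / ".openhands" / "agents",  # priority 2
        home_dir / ".agents" / "agents",        # priority 3
        home_dir / ".openhands" / "agents",     # priority 4
    ]
    found: Dict[str, Path] = {}
    for directory in search_dirs:
        if not directory.is_dir():
            continue
        # glob("*.md") is non-recursive, so subdirectories are never entered
        for md_file in sorted(directory.glob("*.md")):
            if not md_file.is_file() or md_file.name == "README.md":
                continue
            name = read_agent_name(md_file)
            if name is not None and name not in found:  # first match wins
                found[name] = md_file
    return found
```

Under these assumptions, `discover_agents(Path("/path/to/project"), Path.home())` returns a mapping in which a project-level `code-reviewer` shadows a user-level agent of the same name.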