Awesome AI Agents

A curated list of AI agent frameworks, tools, platforms, research papers, and resources.

AI Agents are autonomous systems that use LLMs to reason, plan, and take actions. This list tracks the rapidly evolving ecosystem.

Contributing: PRs welcome! Read the contribution guidelines first.

Frameworks & Libraries

Multi-Agent Orchestration

AG2 — Successor to AutoGen. Multi-agent framework with improved APIs.
Agent Swarm — Multi-agent orchestration for AI coding assistants (Claude Code, Codex, Gemini CLI). Lead/worker coordination with Docker isolation, compounding memory, and Slack/GitHub integration.
AgentField — Open-source control plane that makes AI agents callable as microservices. Routing, coordination, memory, async execution, and cryptographic audit trails. Supports Python, Go, and TypeScript.
AgentScope — Alibaba's production-ready agent framework with essential abstractions, built-in fine-tuning support, and a visual drag-and-drop interface.
Agno — Programming language for agentic software. Build and manage multi-agent systems at scale.
AutoGen — Microsoft's multi-agent conversation framework. Supports complex agent topologies.
CAMEL — Communicative agents for role-playing and multi-agent cooperation. First LLM multi-agent framework.
CrewAI — Role-based multi-agent framework. Agents with roles, goals, and backstories.
DeerFlow — ByteDance's open-source long-horizon SuperAgent harness. Orchestrates sub-agents, sandboxes, memory, tools, and skills for tasks spanning minutes to hours. Hit #1 GitHub Trending with v2.0 (Feb 2026).
dimos — Agentic operating system for physical space. Build multi-agent systems that control humanoids, quadrupeds, drones, and other hardware via natural language.
Google Agent Development Kit (ADK) — Google's open-source, code-first Python framework for building multi-agent systems with A2A support.
LangGraph — Stateful agent workflows as graphs. Part of the LangChain ecosystem.
Mastra — TypeScript-first AI agent framework with workflows, RAG, and integrations.
MetaGPT — Multi-agent framework that mimics a software company with roles (PM, architect, engineer).
Microsoft Agent Framework — Framework for building, orchestrating and deploying multi-agent workflows (Python + .NET).
MiroFish — Concise and universal swarm intelligence engine for forecasting and prediction. Upload seed material, describe goals in natural language, get a detailed prediction report and an interactive simulation.
OpenAI Agents SDK — OpenAI's production framework for multi-agent orchestration with handoffs and guardrails.
Ruflo — Agent orchestration platform optimized for Claude. Features self-learning swarms, distributed intelligence, RAG integration, and native Claude Code/Codex integration. Formerly claude-flow.
Semantic Kernel — Microsoft's SDK for AI orchestration. Plugins, planners, and memory.
Swarm — OpenAI's lightweight multi-agent framework (educational).

Single Agent

GenericAgent — Self-evolving agent that grows its own skill tree from ~3K lines of seed code. 9 atomic tools for full system control (browser, terminal, filesystem, screen vision) with automatic skill crystallization.
Haystack — End-to-end NLP framework with agent pipelines.
Instructor — Structured output from LLMs. Essential for reliable tool use.
LangChain — The most popular LLM application framework. Agents, chains, tools.
LlamaIndex — Data framework for LLM apps. Strong RAG and data agent support.
PydanticAI — GenAI agent framework, the Pydantic way. Type-safe and production-ready.
smolagents — Hugging Face's lightweight agent library. ~1,000 lines of focused code, easy to understand and extend.

Code Agents

Aider — AI pair programming in the terminal.
Claude Code — Anthropic's agentic coding tool. Terminal-based, strong at complex refactors and multi-file changes.
Codex — OpenAI's cloud-based coding agent. Runs tasks in sandboxed environments, integrates with GitHub.
Cursor — AI-first code editor with agent capabilities.
Devin — Cognition's autonomous software engineer. Full environment with browser, editor, and terminal.
Gemini CLI — Open-source AI agent bringing Gemini directly into your terminal.
GitHub Copilot — AI pair programmer with agent mode for multi-file edits, terminal commands, and autonomous task execution.
Kiro — AWS's spec-driven AI coding IDE. Three-phase Specify, Plan, Execute workflow.
Open SWE — LangChain's open-source async cloud coding agent. Connects to GitHub repos, delegates tasks from issues via Slack or Linear.
OpenHands — AI software development agent (formerly OpenDevin).
OpenHands Software Agent SDK — Modular Python SDK for building code agents. Local or ephemeral workspaces, composable tools, powers OpenHands CLI and Cloud.
SWE-agent — Princeton's software engineering agent.
Windsurf — AI-native IDE by Codeium with agentic Cascade flows.

Personal AI Agents

CoPaw — Alibaba's open-source personal AI agent workstation. Supports multi-channel workflows, MCP skills, local/cloud LLMs, and persistent memory.
Hermes Agent — Nous Research's open-source self-improving personal AI agent. Closed learning loop, multi-platform gateway (Telegram, Discord, Slack, WhatsApp, Signal), MCP integration, and cron scheduling.
OpenClaw — Open-source personal AI agent with tool use, browser control, messaging integration, and persistent memory.

Browser Agents

Browser Use — Control browsers with AI agents. Most popular browser automation framework.
Playwright MCP — Microsoft's browser automation via MCP.
Stagehand — AI-powered browser automation framework by Browserbase.
UI-TARS Desktop — ByteDance's multimodal AI agent stack for desktop automation.

Research Agents

autoresearch — Andrej Karpathy's open-source framework for running AI agents that autonomously conduct research on single-GPU model training experiments overnight.
GPT Researcher — Autonomous agent for deep research on any topic using any LLM.
Perplexica — Open-source AI-powered answering engine (Perplexity alternative).

Platforms & Low-Code

Activepieces — Open-source AI workflow automation with 400+ MCP servers for agents.
Amazon Bedrock Agents — AWS managed agent service.
AnythingLLM — All-in-one desktop & Docker AI app with built-in RAG, agents, and MCP.
Anthropic Claude + Tool Use — Claude's function calling and agent capabilities.
Azure AI Foundry — Full-stack AI platform with agent capabilities.
Claude Managed Agents — Anthropic's hosted agent execution environment (public beta, April 2026). Stateful sessions, built-in sandboxing, and tool execution without managing your own infrastructure.
Composio — 1000+ toolkits, auth management, and sandboxed workbench for AI agents.
Dify — Open-source LLMOps platform with visual agent builder.
Google Vertex AI Agent Builder — Google Cloud's agent development platform.
MaxKB — Open-source platform for building enterprise-grade agents.
Microsoft Copilot Studio — Low-code agent builder. Integrates with M365, Dynamics, Power Platform.
n8n — Workflow automation with native AI agent capabilities and MCP support.
OpenAI Assistants API — OpenAI's managed agent platform with tools and retrieval.
Relevance AI — No-code AI agent platform.
Trigger.dev — Build and deploy fully-managed AI agents and workflows.

Agent Infrastructure

Tool Protocols

Agent2Agent Protocol (A2A) — Google's open protocol for agent-to-agent communication and discovery. Linux Foundation project.
Context7 — MCP server for up-to-date code documentation for LLMs.
GitHub MCP Server — GitHub's official MCP server for AI agents.
Model Context Protocol (MCP) — Anthropic's standard for connecting AI to tools and data.
OpenAI Function Calling — De facto standard for LLM tool use.

Agent Skills & Tools

Google Agents CLI — CLI and skill pack that turns any coding assistant (Claude Code, Codex, Gemini CLI, Cursor) into an expert at creating, evaluating, and deploying AI agents on Google Cloud.
PowerSkills — PowerShell automation toolkit for AI agents. Structured JSON control over Windows: Outlook, Edge browser, desktop, and system operations.

Memory & State

Hindsight — Agent memory that learns: state-of-the-art memory layer for AI agents with persistent, personalized recall.
Letta — Stateful agents with long-term memory (formerly MemGPT).
Mem0 — Universal memory layer for AI agents. Persistent, contextual.
Zep — Long-term memory for AI assistants.

Monitoring & Observability

Arize Phoenix — ML & LLM observability.
Future AGI — Open-source, end-to-end, self-hostable platform for evaluating, observing, and improving LLM and AI agent apps. Tracing, evals, simulations, datasets, gateway, and guardrails in one stack.
Helicone — LLM observability and cost tracking.
Langfuse — Open-source LLM observability. Traces, evals, prompt management.
LangSmith — LangChain's debugging and monitoring platform.

Data Extraction

Crawl4AI — Open-source LLM-friendly web crawler. High-performance async crawling.
Firecrawl — Turn entire websites into LLM-ready markdown or structured data.

Vector Databases

Azure AI Search — Enterprise search with vector + hybrid capabilities.
ChromaDB — Lightweight embedding database.
Pinecone — Managed vector database.
Qdrant — High-performance vector search.
Weaviate — Open-source vector database.

Sandboxing & Execution

CubeSandbox — Tencent Cloud's instant, concurrent, secure, and lightweight Rust-based sandbox for AI agents. Sub-second cold start with strong isolation for tool execution and code interpreters.
Daytona — Secure and elastic infrastructure for running AI-generated code.
E2B — Cloud sandboxes for AI agents. Secure code execution environments.
GitHub Agentic Workflows — AI agents running within GitHub Actions. Markdown-based workflow definitions.
Moltworker — Cloudflare's open-source framework for deploying personal AI agents on Workers with sandboxed execution.

Evaluation & Testing

AgentBench — Tsinghua's multi-dimensional agent benchmark.
AgentBoard — Multi-round agent evaluation platform.
GAIA — General AI Assistants benchmark by Meta AI, Hugging Face, and AutoGPT.
LangTest — Testing framework for delivering safe & effective language models.
RuLES — Benchmark for evaluating rule-following in language models.
SWE-bench — Benchmark for software engineering agents.
ToolBench — Benchmark for tool-use capabilities.
ToolEmu — LM-based emulation framework for identifying risks of agents with tool use (ICLR '24).
UQLM — Uncertainty quantification for LLMs. UQ-based hallucination detection.

Safety & Governance

Agent Governance Toolkit — Microsoft's runtime governance infrastructure for AI agents. Deterministic policy enforcement, zero-trust identity, execution sandboxing, and SRE. Covers all ten risks in the OWASP Agentic Top 10 across Python, TypeScript, .NET, Rust, and Go.
Agentic Security — LLM vulnerability scanner and AI red teaming kit.
Anthropic Constitutional AI — Self-improving AI safety through constitutions.
Azure AI Content Safety — Content moderation for AI outputs.
Guardrails AI — Validation framework for LLM outputs.
IronCurtain — Open-source security layer for autonomous AI agents. Runs agents in isolated VMs to prevent prompt injection and rogue behavior.
LangFair — Python library for LLM bias and fairness assessments.
LLM Guard — Security toolkit for LLM interactions.
NeMo Guardrails — NVIDIA's programmable guardrails.
PromptInject — Framework for quantitative analysis of LLM robustness to prompt attacks (Best Paper, NeurIPS '22 ML Safety Workshop).
Rebuff — Prompt injection detection.
Safe RLHF — Constrained value alignment via safe reinforcement learning from human feedback.

Research Papers

Surveys & Overviews

The Rise and Potential of Large Language Model Based Agents (2023) — Comprehensive survey of LLM-based agents.
A Survey on Large Language Model based Autonomous Agents (2023) — Systematic review of agent architectures.
Agent AI: Surveying the Horizons of Multimodal Interaction (2024) — Microsoft Research survey on agent AI.

Agent Architectures

ReAct: Synergizing Reasoning and Acting (2023) — The foundational Reason + Act paradigm.
Toolformer (2023) — Teaching LLMs to use tools autonomously.
Voyager (2023) — Lifelong learning agent in Minecraft.
Generative Agents (2023) — Stanford's believable simulacra of human behavior.
Tree of Thoughts (2023) — Deliberate problem solving through exploration of reasoning paths.
Self-Refine (2023) — Iterative self-refinement with self-feedback.

Multi-Agent Systems

CAMEL (2023) — Communicative agents for role-playing.
MetaGPT (2023) — Multi-agent collaboration mimicking software companies.
ChatDev (2023) — Agents collaborating in a virtual software company.
PaperOrchestra (2026) — Google's multi-agent framework for automated AI research paper writing, converting unstructured pre-writing materials into submission-ready papers.

Safety & Evaluation

AgentBench (2023) — Evaluating LLMs as agents across 8 environments.
InjectAgent (2024) — Indirect prompt injection attacks on tool-integrated agents.
R-Judge (2024) — Benchmarking safety risk awareness for LLM agents.

Agent Training

Group-in-Group Policy Optimization for LLM Agent Training (2025) — RL-based training for LLM/VLM agents.

Tutorials & Courses

DeepLearning.AI: A2A Protocol — Short course on Google's Agent2Agent protocol.
DeepLearning.AI: Building Agentic RAG — Andrew Ng's course on agentic RAG patterns.
Hugging Face: Building AI Agents — Open course on agent development.
LangChain Academy — Free courses on agents and RAG.
Microsoft: AI Agents for Beginners — 12 lessons to get started building AI agents.
Microsoft: Build AI Agents with Azure AI Foundry — Official Microsoft Learn path.
Microsoft: MCP for Beginners — Curriculum for Model Context Protocol with cross-language examples.
Prompt Engineering Guide — Comprehensive guides for prompt engineering, RAG, and AI agents.

Use Cases & Case Studies

Enterprise

IT Helpdesk Agents — Automated ticket resolution, password resets
Customer Service — Multi-turn conversation with CRM integration
Document Intelligence — Contract analysis, compliance checking
Data Analysis — Natural language to SQL, automated reporting

Research & Humanitarian

Disinformation Detection — Agents monitoring information ecosystems
Disaster Response — Coordinating information flows in crisis situations
Knowledge Management — Intelligent document retrieval for NGOs

Community

r/AI_Agents — Reddit community
AI Agents Discord — Active Discord server
awesome-ai-agent-papers — Curated collection of AI agent research papers released in 2026, covering engineering, memory, evaluation, workflows, and autonomous systems.
#AIAgents on X — Twitter/X hashtag

License

Disclaimer: This list aims to be vendor-neutral and community-driven. Inclusion does not imply endorsement by any employer or organization.
