aloth/awesome-ai-agents

Awesome AI Agents

A curated list of AI agent frameworks, tools, platforms, research papers, and resources.

AI agents are autonomous systems that use LLMs to reason, plan, and take actions. This list tracks the rapidly evolving ecosystem around them.

Contributing: PRs welcome! Read the contribution guidelines first.



Frameworks & Libraries

Multi-Agent Orchestration

  • AG2 — Successor to AutoGen. Multi-agent framework with improved APIs.
  • Agent Swarm — Multi-agent orchestration for AI coding assistants (Claude Code, Codex, Gemini CLI). Lead/worker coordination with Docker isolation, compounding memory, and Slack/GitHub integration.
  • AgentField — Open-source control plane that makes AI agents callable as microservices. Routing, coordination, memory, async execution, and cryptographic audit trails. Supports Python, Go, and TypeScript.
  • AgentScope — Alibaba's production-ready agent framework with essential abstractions, built-in fine-tuning support, and a visual drag-and-drop interface.
  • Agno — Programming language for agentic software. Build and manage multi-agent systems at scale.
  • AutoGen — Microsoft's multi-agent conversation framework. Supports complex agent topologies.
  • CAMEL — Communicative agents for role-playing and multi-agent cooperation. Describes itself as the first LLM multi-agent framework.
  • CrewAI — Role-based multi-agent framework. Agents with roles, goals, and backstories.
  • DeerFlow — ByteDance's open-source long-horizon SuperAgent harness. Orchestrates sub-agents, sandboxes, memory, tools, and skills for tasks spanning minutes to hours. Hit #1 GitHub Trending with v2.0 (Feb 2026).
  • dimos — Agentic operating system for physical space. Build multi-agent systems that control humanoids, quadrupeds, drones, and other hardware via natural language.
  • Google Agent Development Kit (ADK) — Google's open-source, code-first Python framework for building multi-agent systems with A2A support.
  • LangGraph — Stateful agent workflows as graphs. Part of the LangChain ecosystem.
  • Mastra — TypeScript-first AI agent framework with workflows, RAG, and integrations.
  • MetaGPT — Multi-agent framework that mimics a software company with roles (PM, architect, engineer).
  • Microsoft Agent Framework — Framework for building, orchestrating and deploying multi-agent workflows (Python + .NET).
  • MiroFish — Concise and universal swarm intelligence engine for forecasting and prediction. Upload seed material, describe goals in natural language, get a detailed prediction report and an interactive simulation.
  • OpenAI Agents SDK — OpenAI's production framework for multi-agent orchestration with handoffs and guardrails.
  • Ruflo — Agent orchestration platform optimized for Claude. Features self-learning swarms, distributed intelligence, RAG integration, and native Claude Code/Codex integration. Formerly claude-flow.
  • Semantic Kernel — Microsoft's SDK for AI orchestration. Plugins, planners, and memory.
  • Swarm — OpenAI's lightweight multi-agent framework (educational).

Single Agent

  • GenericAgent — Self-evolving agent that grows its own skill tree from ~3K lines of seed code. 9 atomic tools for full system control (browser, terminal, filesystem, screen vision) with automatic skill crystallization.
  • Haystack — End-to-end NLP framework with agent pipelines.
  • Instructor — Structured output from LLMs. Essential for reliable tool use.
  • LangChain — One of the most widely used LLM application frameworks. Agents, chains, tools.
  • LlamaIndex — Data framework for LLM apps. Strong RAG and data agent support.
  • PydanticAI — GenAI agent framework, the Pydantic way. Type-safe and production-ready.
  • smolagents — Hugging Face's lightweight agent library. ~1,000 lines of focused code, easy to understand and extend.

Code Agents

  • Aider — AI pair programming in the terminal.
  • Claude Code — Anthropic's agentic coding tool. Terminal-based, strong at complex refactors and multi-file changes.
  • Codex — OpenAI's cloud-based coding agent. Runs tasks in sandboxed environments, integrates with GitHub.
  • Cursor — AI-first code editor with agent capabilities.
  • Devin — Cognition's autonomous software engineer. Full environment with browser, editor, and terminal.
  • Gemini CLI — Open-source AI agent bringing Gemini directly into your terminal.
  • GitHub Copilot — AI pair programmer with agent mode for multi-file edits, terminal commands, and autonomous task execution.
  • Kiro — AWS's spec-driven AI coding IDE. Three-phase Specify, Plan, Execute workflow.
  • Open SWE — LangChain's open-source async cloud coding agent. Connects to GitHub repos, delegates tasks from issues via Slack or Linear.
  • OpenHands — AI software development agent (formerly OpenDevin).
  • OpenHands Software Agent SDK — Modular Python SDK for building code agents. Local or ephemeral workspaces, composable tools, powers OpenHands CLI and Cloud.
  • SWE-agent — Princeton's software engineering agent.
  • Windsurf — AI-native IDE by Codeium with agentic Cascade flows.

Personal AI Agents

  • CoPaw — Alibaba's open-source personal AI agent workstation. Supports multi-channel workflows, MCP skills, local/cloud LLMs, and persistent memory.
  • Hermes Agent — Nous Research's open-source self-improving personal AI agent. Closed learning loop, multi-platform gateway (Telegram, Discord, Slack, WhatsApp, Signal), MCP integration, and cron scheduling.
  • OpenClaw — Open-source personal AI agent with tool use, browser control, messaging integration, and persistent memory.

Browser Agents

  • Browser Use — Control browsers with AI agents. A widely used open-source browser automation framework for agents.
  • Playwright MCP — Microsoft's MCP server for browser automation with Playwright.
  • Stagehand — AI-powered browser automation framework by Browserbase.
  • UI-TARS Desktop — ByteDance's multimodal AI agent stack for desktop automation.

Research Agents

  • autoresearch — Andrej Karpathy's open-source framework for running AI agents that autonomously conduct research on single-GPU model training experiments overnight.
  • GPT Researcher — Autonomous agent for deep research on any topic using any LLM.
  • Perplexica — Open-source AI-powered answering engine (Perplexity alternative).

Platforms & Low-Code

  • Activepieces — Open-source AI workflow automation with 400+ MCP servers for agents.
  • Amazon Bedrock Agents — AWS managed agent service.
  • AnythingLLM — All-in-one desktop & Docker AI app with built-in RAG, agents, and MCP.
  • Anthropic Claude + Tool Use — Claude's function calling and agent capabilities.
  • Claude Managed Agents — Anthropic's hosted agent execution environment (public beta, April 2026). Stateful sessions, built-in sandboxing, and tool execution without managing your own infrastructure.
  • Azure AI Foundry — Full-stack AI platform with agent capabilities.
  • Composio — 1000+ toolkits, auth management, and sandboxed workbench for AI agents.
  • Dify — Open-source LLMOps platform with visual agent builder.
  • Google Vertex AI Agent Builder — Google Cloud's agent development platform.
  • MaxKB — Open-source platform for building enterprise-grade agents.
  • Microsoft Copilot Studio — Low-code agent builder. Integrates with M365, Dynamics, Power Platform.
  • n8n — Workflow automation with native AI agent capabilities and MCP support.
  • OpenAI Assistants API — OpenAI's managed agent platform with tools and retrieval.
  • Relevance AI — No-code AI agent platform.
  • Trigger.dev — Build and deploy fully-managed AI agents and workflows.

Agent Infrastructure

Tool Protocols

Agent Skills & Tools

  • Google Agents CLI — CLI and skill pack that turns any coding assistant (Claude Code, Codex, Gemini CLI, Cursor) into an expert at creating, evaluating, and deploying AI agents on Google Cloud.
  • PowerSkills — PowerShell automation toolkit for AI agents. Structured JSON control over Windows — Outlook, Edge browser, desktop, and system operations.

Memory & State

  • Hindsight — Agent memory that learns: state-of-the-art memory layer for AI agents with persistent, personalized recall.
  • Letta — Stateful agents with long-term memory (formerly MemGPT).
  • Mem0 — Universal memory layer for AI agents. Persistent, contextual.
  • Zep — Long-term memory for AI assistants.

Monitoring & Observability

  • Arize Phoenix — ML & LLM observability.
  • Future AGI — Open-source, end-to-end, self-hostable platform for evaluating, observing, and improving LLM and AI agent apps. Tracing, evals, simulations, datasets, gateway, and guardrails in one stack.
  • Helicone — LLM observability and cost tracking.
  • Langfuse — Open-source LLM observability. Traces, evals, prompt management.
  • LangSmith — LangChain's debugging and monitoring platform.

Data Extraction

  • Crawl4AI — Open-source LLM-friendly web crawler. High-performance async crawling.
  • Firecrawl — Turn entire websites into LLM-ready markdown or structured data.

Vector Databases

  • Azure AI Search — Enterprise search with vector + hybrid capabilities.
  • ChromaDB — Lightweight embedding database.
  • Pinecone — Managed vector database.
  • Qdrant — High-performance vector search.
  • Weaviate — Open-source vector database.

Sandboxing & Execution

  • CubeSandbox — Tencent Cloud's instant, concurrent, secure, and lightweight Rust-based sandbox for AI agents. Sub-second cold start with strong isolation for tool execution and code interpreters.
  • Daytona — Secure and elastic infrastructure for running AI-generated code.
  • E2B — Cloud sandboxes for AI agents. Secure code execution environments.
  • GitHub Agentic Workflows — AI agents running within GitHub Actions. Markdown-based workflow definitions.
  • Moltworker — Cloudflare's open-source framework for deploying personal AI agents on Workers with sandboxed execution.

Evaluation & Testing

  • AgentBench — Tsinghua's multi-dimensional agent benchmark.
  • AgentBoard — Multi-round agent evaluation platform.
  • GAIA — General AI Assistants benchmark from Meta AI, Hugging Face, and AutoGPT.
  • LangTest — Testing framework for delivering safe & effective language models.
  • RuLES — Benchmark for evaluating rule-following in language models.
  • SWE-bench — Benchmark for software engineering agents.
  • ToolBench — Benchmark for tool-use capabilities.
  • ToolEmu — LM-based emulation framework for identifying risks of agents with tool use (ICLR '24).
  • UQLM — Uncertainty quantification for LLMs. UQ-based hallucination detection.

Safety & Governance

  • Agent Governance Toolkit — Microsoft's runtime governance infrastructure for AI agents. Deterministic policy enforcement, zero-trust identity, execution sandboxing, and SRE. Covers all ten risks in the OWASP Agentic Top 10, with support for Python, TypeScript, .NET, Rust, and Go.
  • Agentic Security — LLM vulnerability scanner and AI red teaming kit.
  • Anthropic Constitutional AI — Self-improving AI safety through constitutions.
  • Azure AI Content Safety — Content moderation for AI outputs.
  • Guardrails AI — Validation framework for LLM outputs.
  • IronCurtain — Open-source security layer for autonomous AI agents. Runs agents in isolated VMs to prevent prompt injection and rogue behavior.
  • LangFair — Python library for LLM bias and fairness assessments.
  • LLM Guard — Security toolkit for LLM interactions.
  • NeMo Guardrails — NVIDIA's programmable guardrails.
  • PromptInject — Framework for quantitative analysis of LLM robustness to prompt attacks (Best Paper, NeurIPS '22 ML Safety Workshop).
  • Rebuff — Prompt injection detection.
  • Safe RLHF — Constrained value alignment via safe reinforcement learning from human feedback.

Research Papers

Surveys & Overviews

Agent Architectures

Multi-Agent Systems

  • CAMEL (2023) — Communicative agents for role-playing.
  • MetaGPT (2023) — Multi-agent collaboration mimicking software companies.
  • ChatDev (2023) — Agents collaborating in a virtual software company.
  • PaperOrchestra (2026) — Google's multi-agent framework for automated AI research paper writing, converting unstructured pre-writing materials into submission-ready papers.

Safety & Evaluation

  • AgentBench (2023) — Evaluating LLMs as agents across 8 environments.
  • InjectAgent (2024) — Indirect prompt injection attacks on tool-integrated agents.
  • R-Judge (2024) — Benchmarking safety risk awareness for LLM agents.

Agent Training

Tutorials & Courses

Use Cases & Case Studies

Enterprise

  • IT Helpdesk Agents — Automated ticket resolution, password resets
  • Customer Service — Multi-turn conversation with CRM integration
  • Document Intelligence — Contract analysis, compliance checking
  • Data Analysis — Natural language to SQL, automated reporting

Research & Humanitarian

  • Disinformation Detection — Agents monitoring information ecosystems
  • Disaster Response — Coordinating information flows in crisis situations
  • Knowledge Management — Intelligent document retrieval for NGOs

Community


License

CC0


Disclaimer: This list aims to be vendor-neutral and community-driven. Inclusion does not imply endorsement by any employer or organization.
