A plugin for Claude Code that treats it as an Operating System. 9 specialized agents, smart flow selection (direct edit · bugfix TDD · 5-phase workflow), self-healing memory, and multi-agent orchestration. Right effort matched to task risk.
Typos get direct edits. Bugs get 4-step TDD. Features get the 5-phase workflow. You never pick — the plugin matches effort to risk.
Install in 30 seconds · See it in action · Why Aura Frog? · Full benefits guide →
You open Claude Code. You type a prompt. Claude writes code. You hope it works.
No structure. No tests. No quality gates. Every session starts from scratch. Every complex feature turns into prompt spaghetti.
You're the project manager, QA lead, and architect — all while trying to code.
Aura Frog treats Claude Code as an Operating System — Claude is the kernel, agents are processes, and the context window is managed RAM. You describe the task. Aura Frog classifies complexity, picks the right flow (direct edit · bugfix TDD · full 5-phase), dispatches the right agent, and compresses context automatically so you never lose decisions.
Right effort for every task. You approve only when it matters (0 gates for typos, up to 2 for architecture).
```mermaid
flowchart TB
    User([👤 User prompt]):::user --> CC[Claude Code CLI]:::cc
    CC -->|every message| AD[🔍 agent-detector skill<br/>model: haiku · auto-invoke]:::skill
    AD --> Cx{Complexity<br/>detected?}:::gate
    Cx -->|Quick<br/>1 file / clear scope| Direct[✏️ Direct edit<br/>no workflow]:::quick
    Cx -->|Standard<br/>2-5 files / feature| SingleAgent[🤖 Single agent<br/>agent-detector picks one]:::std
    Cx -->|Deep<br/>6+ files / architecture| RO[🎯 run-orchestrator skill<br/>5-phase TDD]:::deep
    RO --> P1[📋 P1: Understand + Design<br/>architect]:::phase
    P1 --> G1{{⏸ Approve?}}:::gate
    G1 -->|approve| P2[🧪 P2: Test RED<br/>tester writes failing tests]:::phase
    P2 --> P3[⚙️ P3: Build GREEN<br/>architect/frontend/mobile]:::phase
    P3 --> G2{{⏸ Approve?}}:::gate
    G2 -->|approve| P4[🔍 P4: Refactor + Review<br/>security + tester NOT builder]:::phase
    P4 --> P5[📦 P5: Finalize<br/>lead — docs + commit]:::phase
    P5 --> Done([✅ Production-ready]):::done
    SingleAgent --> Done
    Direct --> Done
    classDef user fill:#f59e0b,stroke:#b45309,stroke-width:2px,color:#000000
    classDef cc fill:#6366f1,stroke:#3730a3,stroke-width:2px,color:#ffffff
    classDef skill fill:#ec4899,stroke:#9d174d,stroke-width:2px,color:#ffffff
    classDef gate fill:#dc2626,stroke:#7f1d1d,stroke-width:2px,color:#ffffff
    classDef quick fill:#10b981,stroke:#065f46,stroke-width:2px,color:#ffffff
    classDef std fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#ffffff
    classDef deep fill:#8b5cf6,stroke:#5b21b6,stroke-width:2px,color:#ffffff
    classDef phase fill:#475569,stroke:#1e293b,stroke-width:2px,color:#ffffff
    classDef done fill:#059669,stroke:#064e3b,stroke-width:2px,color:#ffffff
```
The flow, explained:

- Every message you send goes through the `agent-detector` skill (runs on haiku for cost) — it classifies complexity, picks the right agent, and suggests the right model.
- Quick tasks (typo, one-line fix) → direct edit, no workflow overhead.
- Standard tasks (one feature, clear scope) → single specialized agent runs inline.
- Deep tasks (feature + multi-file + TDD) → `run-orchestrator` spawns the 5-phase workflow with two human approval gates.
- Between phases, you either approve, reject, or modify — no commit happens until Phase 5 and you say so.
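The routing described above can be sketched as a simple heuristic. This is an illustrative reconstruction, not the plugin's actual code — the thresholds mirror the documented tiers (1 file / ≤5 files / 6+ files or vague scope), and the function name is hypothetical:

```python
# Illustrative sketch of three-tier flow selection, NOT the plugin's real code.
# Thresholds mirror the tiers documented above.

def select_flow(files_touched: int, scope_is_clear: bool) -> str:
    """Map a task's footprint to quick / standard / deep, per the tiers above."""
    if not scope_is_clear or files_touched >= 6:
        return "deep"        # run-orchestrator: 5-phase TDD, 2 approval gates
    if files_touched <= 1:
        return "quick"       # direct edit, no workflow, 0 gates
    return "standard"        # single specialized agent runs inline

assert select_flow(1, True) == "quick"
assert select_flow(4, True) == "standard"
assert select_flow(8, True) == "deep"
assert select_flow(2, False) == "deep"   # vague scope escalates to Deep
```

Note the design choice reflected here: vague scope escalates even a small change to Deep, because unclear requirements are where skipped design work hurts most.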
Aura Frog's 57 rules, 44 skills, and 9 agents are ~87% portable (weighted average) because they're markdown conventions, not tool-specific code. Only the thin hook layer needs adapters.
| Tool | Status | Coverage |
|---|---|---|
| | | 100% |
| | | ~85% |
| | | ~80% |
Why this matters: When you invest in Aura Frog's TDD discipline, gotcha-only expert skills, and agent architecture, that investment survives tool switches. Only the thin adapter layer changes.
Read the Portability Guide →
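For intuition, a count-weighted average like the "~87% portable" figure above is computed by weighting each component category by how many items it contains. The per-category portability fractions below are hypothetical placeholders chosen to illustrate the arithmetic — they are not measured values from the project:

```python
# Hypothetical illustration of a count-weighted portability average.
# Per-category fractions are made-up placeholders, NOT measured data.

components = {
    # name: (count, fraction assumed to port unchanged — hypothetical)
    "rules":  (57, 0.90),
    "skills": (44, 0.85),
    "agents": (9,  0.80),
}

total = sum(count for count, _ in components.values())
weighted = sum(count * frac for count, frac in components.values()) / total

assert total == 110
print(f"weighted portability ≈ {weighted:.0%}")
```

With these placeholder fractions the weighted average lands near 87%, showing why the rules (the largest category) dominate the overall figure.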
| β Without Aura Frog | β With Aura Frog |
|---|---|
Result: Production-ready code with tests, security review, and documentation β from a single prompt.
- Claude Code CLI installed — install guide
- Node.js ≥ 18 (for hook scripts)
- Git (for phase checkpoint commits)
```
# 1. Add the marketplace
/plugin marketplace add nguyenthienthanh/aura-frog

# 2. Install the plugin
/plugin install aura-frog@aurafrog

# 3. Verify
/af status
```

Expected output:
```
🐸 Aura Frog v3.6.0 — Ready
Agents: 9 loaded (lead, architect, frontend, mobile, tester, security, devops, strategist, scanner)
Skills: 44 available (5 auto-invoke, 39 on-demand)
Rules: 57 loaded (18 core + 17 agent + 22 workflow)
Hooks: 28 registered
MCP: context7, playwright, vitest, firebase, figma, slack
```
`/project init` scans your codebase and creates 7 context files (framework, conventions, rules, examples, architecture, etc.) in `.claude/project-contexts/<name>/`. Takes 30–60 seconds; saves minutes on every future session.
Install the `af` CLI for health checks outside Claude Code:

```
# Inside Claude Code:
/af setup cli

# Or manually:
sudo ln -sf "$HOME/.claude/plugins/marketplaces/aurafrog/scripts/af" /usr/local/bin/af
```

Then use anywhere: `af doctor`, `af measure`, `af setup remote`.
MCP tokens (Figma, Slack, Firebase):

```
cp .envrc.template .envrc
# Edit .envrc — add FIGMA_API_TOKEN, SLACK_BOT_TOKEN, FIREBASE_TOKEN, etc.
direnv allow   # if using direnv
```

Without tokens, the figma / slack / firebase MCP servers stay inactive. context7, playwright, and vitest need no config.
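To see which token-gated servers would be active, the gating above can be sketched as a quick env check. The server-to-variable mapping below comes from the template names above; the helper itself is an illustration, not part of the plugin:

```python
# Sketch: report which token-gated MCP servers have credentials set.
# Variable names come from .envrc.template; the helper is illustrative.
import os

TOKEN_VARS = {
    "figma": "FIGMA_API_TOKEN",
    "slack": "SLACK_BOT_TOKEN",
    "firebase": "FIREBASE_TOKEN",
}

def active_servers(env=os.environ) -> list:
    """context7 / playwright / vitest need no config; the rest need a token."""
    always_on = ["context7", "playwright", "vitest"]
    gated = [name for name, var in TOKEN_VARS.items() if env.get(var)]
    return always_on + gated

assert active_servers({}) == ["context7", "playwright", "vitest"]
assert "slack" in active_servers({"SLACK_BOT_TOKEN": "xoxb-example"})
```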
Skills-only mode on other platforms:

| Platform | Install | What works |
|---|---|---|
| Claude Code | `/plugin marketplace add nguyenthienthanh/aura-frog` | Everything |
| OpenAI Codex | `cp -r aura-frog/skills/* ~/.codex/skills/` | Skills + commands |
| Gemini CLI | `cp -r aura-frog/skills/* ~/.gemini/skills/` | Skills + commands |
| OpenCode | `cp -r aura-frog/skills/* .opencode/skills/` | Skills + commands |

Hooks, agent detection, subagent spawning, and MCP servers are Claude Code exclusive.
```
/run "Your task here"
```

See the Walkthrough below for a complete transcript of what this looks like.
| Symptom | Likely cause | Fix |
|---|---|---|
| `/plugin install` fails | Marketplace cache | Run `/plugin marketplace refresh` |
| Hooks not firing | `.claude/settings.json` missing hook config | `/af setup integrations` re-installs |
| `af: command not found` | PATH missing plugin scripts dir | Add `$HOME/.claude/plugins/marketplaces/aurafrog/scripts` to `$PATH` |
| State not saving during `/run` | Hook path drift (pre-v3.7) | Upgrade to 3.7+ (state path fixed) |
| Claude uses wrong agent | No `/project init` yet | Run `/project init` to load conventions |

Full guide: GET_STARTED.md.
A complete transcript of implementing user authentication with /run. This is what you actually see.
```mermaid
sequenceDiagram
    autonumber
    participant U as 👤 You
    participant CC as Claude Code
    participant AD as agent-detector<br/>(haiku)
    participant RO as run-orchestrator<br/>skill
    participant Lead as lead agent
    participant Arch as architect
    participant Tester as tester
    participant Sec as security
    U->>CC: /run "Add JWT authentication"
    CC->>AD: classify (auto-invoke)
    AD-->>CC: Deep · multi-file · sonnet
    CC->>RO: spawn 5-phase workflow
    RO->>Lead: coordinate
    Lead->>Arch: Phase 1 → design
    Arch-->>Lead: requirements.md + tech_spec.md
    Lead-->>U: ⏸ Approve Phase 1?
    U-->>Lead: approve
    Lead->>Tester: Phase 2 → write failing tests
    Tester-->>Lead: 5 tests · all RED
    Lead->>Arch: Phase 3 → build GREEN
    Arch-->>Lead: implementation · 5 tests pass
    Lead-->>U: ⏸ Approve Phase 3?
    U-->>Lead: approve
    Lead->>Sec: Phase 4 → security review
    Lead->>Tester: Phase 4 → coverage review
    Sec-->>Lead: ✅ 0 critical
    Tester-->>Lead: ✅ 94% coverage
    Lead->>Lead: Phase 5 → finalize
    Lead-->>U: ✅ Production-ready
```
```
$ /run "Add JWT authentication to the /api routes"

🐸 agent-detector (haiku):
   Complexity: Deep · Agent: architect (primary), security (reviewer)
   Model: sonnet · Estimated tokens: ~95K

🐸 Sprint Contract — confirm before Phase 2:
   Scope: JWT middleware, /login + /refresh endpoints, user model
   Done when: 5 tests passing, 0 critical security findings, coverage ≥80%
   Out of scope: password reset, 2FA, social auth

   Confirm contract? (yes / modify / skip)

> yes

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Phase 1: Understand + Design (architect)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📋 Requirements analyzed:
   • POST /auth/login — issues JWT (15m) + refresh token (7d)
   • POST /auth/refresh — rotates access token
   • Middleware verifies JWT on protected routes
   • User.passwordHash stored as bcrypt (cost 12)

🎯 Technical Design:
   • Stack: express-jwt, bcrypt, crypto.randomUUID for jti
   • Token storage: httpOnly cookie (refresh) + Authorization header (access)
   • Refresh token rotation on every use (prevents replay)
   • RS256 signing (public/private keypair)

⚠️ Risks flagged by strategist:
   • Secret leak — env var with rotation plan
   • No rate-limit — add bucket before deploy (Phase 5 followup)

📄 Saved:
   .claude/logs/runs/250421-jwt-auth/phase-1/requirements.md
   .claude/logs/runs/250421-jwt-auth/phase-1/tech-spec.md

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ⏸ APPROVAL GATE — Phase 1
 Options:
   approve           Continue to Phase 2 (write tests)
   modify <changes>  Adjust design without restart
   reject <reason>   Full redesign
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

> approve

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Phase 2: Test RED (tester)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🧪 Writing failing tests (TDD RED)...
   ✗ auth.test.ts: login returns JWT for valid creds [FAIL — no implementation]
   ✗ auth.test.ts: login rejects invalid password [FAIL]
   ✗ auth.test.ts: refresh rotates token on use [FAIL]
   ✗ middleware.test.ts: rejects missing Authorization header [FAIL]
   ✗ middleware.test.ts: rejects expired JWT [FAIL]

📊 Status: 5 failing, 0 passing (correct for RED)
   Auto-continue → Phase 3 (tests fail as expected) ✅

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Phase 3: Build GREEN (architect)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

⚙️ Implementing...
   ✓ src/auth/jwt-service.ts (new · 67 lines)
   ✓ src/auth/middleware.ts (new · 42 lines)
   ✓ src/routes/auth.ts (new · 58 lines)
   ✓ src/models/user.ts (modified · added passwordHash)
   ✓ .env.example (modified · added JWT_PRIVATE_KEY)

🧪 Running tests...
   ✓ 5 passing, 0 failing
   ✓ Coverage: 94% (target: 80%)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ⏸ APPROVAL GATE — Phase 3
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

> approve

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Phase 4: Refactor + Review (security + tester)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🔍 Security review (reviewer ≠ builder):
   ✓ No hardcoded secrets (env var pattern correct)
   ✓ bcrypt cost 12 (OWASP minimum: 10)
   ✓ httpOnly cookie flag set on refresh token
   ⚠️ INFO: Consider CSP header on /login response (not blocking)

🧪 Test coverage review:
   ✓ 94% coverage (target: 80%) ✅
   ✓ All critical paths covered
   ✓ No flaky tests detected

   Auto-continue → Phase 5 (0 critical findings) ✅

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Phase 5: Finalize (lead)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

📦 Deliverables:
   ✓ 4 new files, 2 modified
   ✓ 5 tests passing, 94% coverage
   ✓ 0 critical security findings
   ✓ ADR saved: docs/adr/0007-jwt-auth.md
   ✓ Deployment note: add rate-limit middleware (tracked in phase-1/risks.md)

📊 Workflow stats:
   Duration: 18m · Tokens: 82K · Budget: 30K target → 2.7x (Deep tier norm)

Ready to commit? (yes / no)

> yes

💾 Committed: 7a3b9c2 · feat(auth): JWT authentication with refresh rotation

✅ Workflow complete — JWT auth shipped.
```
| Step | Who ran it | Your role |
|---|---|---|
| 1. Detection | `agent-detector` skill (haiku, auto) | Nothing — zero friction |
| 2. Sprint Contract | Orchestrator proposed | Confirm scope |
| 3. Phase 1 design | `architect` in forked context | Approve design |
| 4. Phase 2 RED | `tester` (auto-continues) | Nothing |
| 5. Phase 3 GREEN | `architect` implements | Approve implementation |
| 6. Phase 4 review | `security` + `tester` (NOT architect) | Nothing |
| 7. Phase 5 finalize | `lead` | Confirm commit |

Two approvals. 18 minutes. Production-ready JWT auth with 94% coverage and security review.
Not every task gets the 5-phase workflow. Aura Frog's agent-detector classifies complexity on every message and picks the minimum viable flow:
| Task type | Flow | Gates | Example |
|---|---|---|---|
| Typo, one-line fix | Direct edit (no workflow) | 0 | `/run fix typo in login.ts` |
| Bug fix | 4-step TDD (Investigate → RED → GREEN → Verify) | 0 | `/run fix login button not disabling` |
| Refactor | Analyze → plan → test → refactor | 0 | `/run refactor auth service` |
| Add tests | Detect framework → write → verify coverage | 0 | `/run add tests for payment` |
| Feature (≤5 files) | Single-agent inline with TDD | 0–1 | `/run add email validation` |
| Feature (6+ files, architecture) | Full 5-phase workflow | 2 | `/run implement user subscription` |
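The table above is essentially a lookup from task type to flow and gate count. A minimal sketch — the real routing lives in the agent-detector skill, and these key names are illustrative:

```python
# Sketch of the routing table above as a lookup: task type -> (flow, gates).
# Illustrative only — the real routing lives in the agent-detector skill.

FLOWS = {
    "typo":          ("direct edit", 0),
    "bugfix":        ("4-step TDD", 0),
    "refactor":      ("analyze-plan-test-refactor", 0),
    "tests":         ("detect-write-verify", 0),
    "small-feature": ("single-agent TDD", 1),   # 0-1 gates; worst case shown
    "large-feature": ("5-phase workflow", 2),
}

def gates_for(task_type: str) -> int:
    """Return the maximum number of human approval gates for a task type."""
    return FLOWS[task_type][1]

assert gates_for("typo") == 0
assert gates_for("large-feature") == 2
```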
When the 5-phase workflow DOES fire (Deep complexity only):
```
⏸ Phase 1: Understand + Design   → You approve the plan
⚡ Phase 2: Test RED              → Failing tests written
⏸ Phase 3: Build GREEN           → You approve the implementation
⚡ Phase 4: Refactor + Review     → Auto quality + security check
⚡ Phase 5: Finalize              → Docs + notifications
```
Escape hatches — you control rigor when the detector gets it wrong:

- `/run fasttrack: <specs>` — skip Phase 1 if you've already designed
- `/run must do: <task>` / `/just do: <task>` — bypass brainstorming, execute literally
- `/run reopen <phase>` — unfreeze an approved phase to revise
- `/run reason: sc|tot|cove` — opt in to heavy reasoning (Self-Consistency / Tree of Thoughts / Chain-of-Verification) for hard decisions
- `/run handoff` — save state, resume in a fresh session
What you get vs what you skip:

- 80% of tasks never see a gate — fast iteration
- The 20% that matter (architecture, multi-file, vague scope) get disciplined TDD + human approval
- You never manually pick — the detector routes; you approve only when it matters
Full strategy matrix: Routing Strategies below. Full benefits guide: docs/reference/BENEFITS.md.
9 specialized agents activate automatically — no configuration:

```
"Build a React dashboard"     → frontend
"Optimize the SQL queries"    → architect
"Set up CI/CD pipeline"       → devops
"Fix the login screen crash"  → mobile
"Run a security audit"        → security
```
```mermaid
flowchart LR
    Msg([User message]):::m --> L0[Layer 0<br/>Task content<br/>+50-60]:::l
    Msg --> L1[Layer 1<br/>Explicit tech<br/>+60]:::l
    Msg --> L2[Layer 2<br/>Intent verb<br/>+50]:::l
    Msg --> L3[Layer 3<br/>Project context<br/>+40]:::l
    Msg --> L4[Layer 4<br/>File patterns<br/>+20]:::l
    L0 --> Score[[Sum per agent]]:::s
    L1 --> Score
    L2 --> Score
    L3 --> Score
    L4 --> Score
    Score --> T{Threshold?}:::g
    T -->|≥80| Primary[PRIMARY agent]:::p
    T -->|50-79| Secondary[SECONDARY agent]:::sec
    T -->|30-49| Optional[OPTIONAL agent]:::opt
    T -->|<30| Skip[Ask user]:::skip
    classDef m fill:#f59e0b,stroke:#b45309,stroke-width:2px,color:#000000
    classDef l fill:#6366f1,stroke:#3730a3,stroke-width:2px,color:#ffffff
    classDef s fill:#ec4899,stroke:#9d174d,stroke-width:2px,color:#ffffff
    classDef g fill:#dc2626,stroke:#7f1d1d,stroke-width:2px,color:#ffffff
    classDef p fill:#059669,stroke:#064e3b,stroke-width:2px,color:#ffffff
    classDef sec fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#ffffff
    classDef opt fill:#8b5cf6,stroke:#5b21b6,stroke-width:2px,color:#ffffff
    classDef skip fill:#475569,stroke:#1e293b,stroke-width:2px,color:#ffffff
```
Why 5 layers instead of one? A backend repo can contain frontend work (Blade/Jinja templates, email HTML, PDF styling), and a frontend repo can need backend work (API rate-limits, auth logic). Repo type alone lies. Task content (Layer 0) overrides repo context — so "Fix email template styling" in a Laravel repo correctly routes to frontend, not architect.
Details: skills/agent-detector/SKILL.md + skills/agent-detector/task-based-agent-selection.md.
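The additive scoring and thresholds can be sketched compactly. The signal weights below are taken from the worked examples in this README ("email template +55, styling +40"); the matching logic itself is illustrative, not the skill's actual code:

```python
# Minimal sketch of the 5-layer additive scoring. Signal weights mirror the
# worked examples in this README; everything else is illustrative.

def classify(score: int) -> str:
    """Map a summed score to the documented role thresholds."""
    if score >= 80:
        return "PRIMARY"
    if score >= 50:
        return "SECONDARY"
    if score >= 30:
        return "OPTIONAL"
    return "ASK_USER"

# "Fix email template styling" in a Laravel repo:
# Layer 0 (task content) outweighs Layer 3 (repo context), so frontend wins.
frontend_score = 55 + 40   # email template +55, styling +40
architect_score = 40       # Layer 3 repo context only

assert classify(frontend_score) == "PRIMARY"    # 95 -> frontend leads
assert classify(architect_score) == "OPTIONAL"  # 40 -> architect at most optional
```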
All 9 agents:

| Agent | Model | Tools | When it activates |
|---|---|---|---|
| `lead` | inherit | full | Coordinates workflows, enforces gates |
| `architect` | inherit | full | System design, DB schema, backend APIs — uses Opus when session is Opus |
| `frontend` | inherit | full | React, Vue, Angular, Next.js + design systems — uses Opus when session is Opus |
| `mobile` | inherit | full | React Native, Flutter, Expo, NativeWind — uses Opus when session is Opus |
| `strategist` | sonnet | read-only | ROI, MVP, scope creep (Phase 1 Deep) |
| `security` | sonnet | read-only | OWASP, auth/crypto review (Phase 4) |
| `tester` | sonnet | full | Jest, Cypress, Playwright, Detox, coverage |
| `devops` | sonnet | full | Docker, K8s, CI/CD, monitoring |
| `scanner` | haiku | read + Bash | Project detection, session-start context |

Agent + complexity + model selection are all done by the `agent-detector` skill (no separate router — consolidated in v3.6.0).
Each agent and skill declares its own `model:` in YAML frontmatter. Claude Code resolves the model like this:
| Priority | Source | When it applies |
|---|---|---|
| 1 (highest) | `CLAUDE_CODE_SUBAGENT_MODEL` env var | Overrides everything — useful for CI or cost control |
| 2 | Per-invocation `model` parameter | Rare — set at spawn time |
| 3 | Agent/skill frontmatter `model:` field | This is where Aura Frog declarations live |
| 4 (fallback) | Main session model | Used only if nothing above is set |
Key point: frontmatter wins over the session model. If you started your session on Opus but invoke `agent-detector`, that skill runs on haiku — not Opus. The session model is the fallback, not the override.
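The four-step resolution order can be written as a first-match-wins chain. A sketch under the rules described above — this is not Claude Code's actual implementation:

```python
# Sketch of the model resolution order: first non-empty source wins.
# Illustrative only — NOT Claude Code internals.
import os

def resolve_model(frontmatter_model, invocation_model, session_model,
                  env=os.environ) -> str:
    for candidate in (
        env.get("CLAUDE_CODE_SUBAGENT_MODEL"),  # 1: env var beats everything
        invocation_model,                        # 2: per-invocation parameter
        frontmatter_model,                       # 3: agent/skill frontmatter
    ):
        if candidate:
            return candidate
    return session_model                         # 4: session model as fallback

# agent-detector declares haiku; session is opus -> haiku wins (frontmatter > session)
assert resolve_model("haiku", None, "opus", env={}) == "haiku"
# the env var overrides even frontmatter
assert resolve_model("haiku", None, "opus",
                     env={"CLAUDE_CODE_SUBAGENT_MODEL": "sonnet"}) == "sonnet"
# no declarations anywhere -> session model is used
assert resolve_model(None, None, "opus", env={}) == "opus"
```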
Why we hard-code certain models:

| Agent or skill | Model | Why |
|---|---|---|
| `agent-detector`, `scanner` | haiku | Classification/detection tasks that fire on every message or session start. Haiku is ~3× faster and ~10× cheaper — Opus here wastes budget. |
| `security`, `strategist`, `tester`, `devops` | sonnet | Balanced reasoning for review/analysis/tests/deploy. Locked to sonnet — Opus rarely pays back for these roles. |
| `lead`, `architect`, `frontend`, `mobile` | inherit | These do the heavy design/build work. If you chose Opus for a complex task, these agents should reason at Opus too. |
What this means for you:

- Starting a session on Opus → `lead`, `architect`, `frontend`, `mobile` all run on Opus (they inherit). Review/test/deploy stay on sonnet. Detection stays on haiku. You get Opus-quality design + sonnet-cost everything else.
- Starting on Sonnet → everything runs ≤ sonnet (haiku calls still haiku). No Opus unless you escalate the session.
- Want everything on one model? Set `CLAUDE_CODE_SUBAGENT_MODEL=opus` (the env var at the top of the resolution order) — it overrides every frontmatter declaration.
Not what you want? Edit the `model:` field in `aura-frog/agents/<name>.md` frontmatter. Remove the line to inherit the session model, or set it to `opus`/`sonnet`/`haiku` to lock. See the Frontmatter Maintenance Rule in `.claude/CLAUDE.md`.
For deep tasks, 4 agents independently analyze your plan — then challenge each other:

```
Architect  — "How to build it"
Tester     — "How it can fail"
Frontend   — "How users experience it"
Strategist — "Should we even build this?"
```
Plans survive 4 rounds of scrutiny before a single line of code is written — catching scope creep and wasted effort before it happens.
Run `/project init` once. Every future session instantly understands your codebase — conventions, architecture, patterns, file relationships. 12 pattern detections. 7 context files generated.
No more re-explaining your project every session.
For complex work, Aura Frog spins up a real team working in parallel:
```
lead
├── architect — Designs the system
├── frontend  — Builds the UI
├── tester    — Writes tests
└── security  — Reviews for vulnerabilities
```
All cross-reviewing each other's work.
Only activates when needed. Simple tasks stay single-agent (saves ~3x tokens).
6 bundled servers auto-invoke when Claude needs them:
```
"Build with MUI"        → context7 fetches current MUI docs
"Test the login page"   → playwright launches a browser
"Check test coverage"   → vitest runs your suite
"Deploy to Firebase"    → firebase manages the project
```
Plus Figma design fetching and Slack notifications.
More features
Detects your patterns, remembers corrections, creates rules that persist across sessions. Optional Supabase sync for teams.
Automatically matches effort to task size — typos get direct edits, features get full workflows, architecture gets collaborative planning. No configuration. See Routing Strategies below.
Run crashed? /run resume. Context full? Decisions preserved across /compact. Need to pause? Type handoff to save everything.
All cached context is treated as a hint — agents verify against actual files before acting. State only updates after confirmed success (Strict Write Discipline). No stale assumptions propagate.
MicroCompact (free, every 10 turns) → AutoCompact (one /compact call at 80%) → ManualCompact (full session snapshot). Context stays lean. Decisions survive.
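The escalation can be sketched as a trigger check run each turn. Only the "every 10 turns" interval and "80% full" threshold come from the description above; the function and return values are illustrative:

```python
# Sketch of the 3-tier compaction triggers described above. Only the
# turn interval (10) and threshold (0.80) come from the text; the rest
# is illustrative.

def compaction_action(turn: int, context_used: float, user_requested: bool) -> str:
    if user_requested:
        return "manual-compact"   # full session snapshot on demand
    if context_used >= 0.80:
        return "auto-compact"     # one /compact call at 80% full
    if turn > 0 and turn % 10 == 0:
        return "micro-compact"    # free, every 10 turns
    return "none"

assert compaction_action(10, 0.30, False) == "micro-compact"
assert compaction_action(7, 0.85, False) == "auto-compact"
assert compaction_action(3, 0.20, True) == "manual-compact"
assert compaction_action(3, 0.20, False) == "none"
```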
3-tier rule loading (~75% less context), conditional hooks (~40% fewer executions), agent detection caching, session start caching (<1s repeat sessions).
Aura Frog picks one of three execution strategies per task — you never configure it manually.
```mermaid
flowchart TB
    Prompt([Prompt]):::m --> AD[agent-detector analyzes]:::s
    AD --> Cx{Complexity?}:::g
    Cx -->|single file<br/>clear scope<br/>~5K tokens| Q[Quick Strategy]:::q
    Cx -->|2-5 files<br/>feature<br/>~20K tokens| S[Standard Strategy]:::st
    Cx -->|6+ files<br/>architecture<br/>~80K tokens| D[Deep Strategy]:::d
    Q --> QR[Direct edit<br/>haiku model<br/>no workflow]:::q
    S --> SR[Single agent inline<br/>sonnet model<br/>TDD optional]:::st
    D --> DR[5-phase TDD<br/>2 approval gates<br/>builder ≠ reviewer<br/>collaborative planning]:::d
    classDef m fill:#f59e0b,stroke:#b45309,stroke-width:2px,color:#000000
    classDef s fill:#ec4899,stroke:#9d174d,stroke-width:2px,color:#ffffff
    classDef g fill:#dc2626,stroke:#7f1d1d,stroke-width:2px,color:#ffffff
    classDef q fill:#10b981,stroke:#065f46,stroke-width:2px,color:#ffffff
    classDef st fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#ffffff
    classDef d fill:#8b5cf6,stroke:#5b21b6,stroke-width:2px,color:#ffffff
```
| Strategy | Triggers | Model | Gates | Example |
|---|---|---|---|---|
| Quick | Single file, typo, one-line fix | haiku | 0 | "Fix typo in login.ts" |
| Standard | 2–5 files, one feature | sonnet | 0–1 | "Add email validation to signup form" |
| Deep | 6+ files, architecture, vague scope | sonnet (opus for design) | 2 (P1 + P3) | "Design and implement user subscription system" |
Why three tiers instead of always-TDD? Forcing Deep on every task burns tokens (~3× vs subagent mode) and slows iteration. Forcing Quick on complex work skips tests and breaks production. The three-tier model matches effort to risk.
Team Mode (subset of Deep): if the task spans 2+ domains AND CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1, multiple agents work in parallel and cross-review each other. See AGENT_TEAMS_GUIDE.
Details: rules/core/execution-rules.md, skills/agent-detector/SKILL.md, skills/run-orchestrator/SKILL.md.
| Component | Count | Why it matters |
|---|---|---|
| Agents | 9 | Right expert auto-selected per task |
| Skills | 41 | 5 auto-invoke on context, 36 on-demand |
| Commands | 6 | /run, /check, /design, /project, /af, /help |
| Rules | 57 | 3-tier loading (18 core + 17 agent + 22 workflow) — only what's needed |
| Hooks | 28 | Conditional β skip processing for non-code files |
| MCP Servers | 6 | Zero-config, auto-invoked |
Full workflow target: ≤30K tokens across all 5 phases.
Six commands cover every workflow. Each auto-detects intent and dispatches the right skills/agents.
Auto-detects what kind of work you want (feature / bugfix / refactor / test) and picks the right workflow.
| What you say | Intent detected | Flow |
|---|---|---|
| `/run implement user profile` | Feature | 5-phase TDD workflow |
| `/run fix login not working` | Bugfix | bugfix-quick skill: investigate → test → fix → verify |
| `/run refactor auth service` | Refactor | refactor-expert skill: analyze → plan → test → refactor |
| `/run add tests for payment` | Test | test-writer skill: detect framework → write tests → coverage |
| `/run fasttrack: <specs>` | Fast-Track | Skip Phase 1, auto-execute P2–P5 (specs must include Requirements + Design + API + Data Model + Acceptance Criteria) |
| `/run resume <id>` | Resume | Load state from `.claude/logs/runs/<id>/` |
| `/run status` | Status | Current phase + progress |
| `/run handoff` | Handoff | Save state for cross-session continuation |
```
/check            # all checks (security + perf + complexity + debt + coverage + deps)
/check security   # SAST only
/check perf       # performance bottlenecks
/check coverage   # test coverage report
/check deps       # outdated/vulnerable dependencies
```

```
/design api       # REST/GraphQL API spec (calls api-designer skill)
/design db        # Database schema design
/design doc       # ADR or runbook (calls documentation skill)
```

```
/project init     # First-time setup — generates 7 context files
/project status   # Current context + active workflow
/project refresh  # Re-scan codebase, update conventions
/project regen    # Regenerate context files from scratch
/project env      # Validate .envrc / MCP tokens
/project sync     # Sync status line + refresh cache
```

```
/af status        # Plugin health check
/af agents        # List loaded agents with their tools + model
/af metrics       # Workflow velocity + token efficiency
/af learn status  # Learning system state (Supabase or local)
/af learn analyze # Extract patterns from past workflows
/af learn apply   # Apply learned rules to future sessions
/af setup cli     # Install af CLI system-wide
/af prompts       # Analyze prompt quality + suggest improvements
```

```
/help             # Plugin overview
/help <command>   # Detailed help for a specific command
/help agents      # Agent selection guide
/help hooks       # Hook lifecycle reference
```

Full command docs: commands/README.md.
Real examples of what the agent-detector skill picks and why. Score thresholds: PRIMARY ≥80, SECONDARY 50–79, OPTIONAL 30–49.
| You type | PRIMARY agent | Why (scoring breakdown) |
|---|---|---|
| "Add login form with email+password" | frontend | form +35, login +30, UI intent +50 = 115 |
| "Add rate-limit to /api routes" | architect | api route +55, rate limit +45, backend intent +50 = 150 |
| "Fix email template styling in Laravel" | frontend (in Laravel repo!) | email template +55, styling +40, Layer 0 overrides repo = 95 |
| "Optimize this slow query" | architect | slow query +50, optimize +35, database intent +55 = 140 |
| "Run OWASP audit on payment flow" | security | OWASP +55, audit +50, security intent +55 = 160 |
| "Write Cypress tests for checkout" | tester | Cypress +50, tests +55, test infra exists +30 = 135 |
| "Set up GitHub Actions for CI" | devops | GitHub Actions +55, CI +50, deployment intent +50 = 155 |
| "Fix FlatList performance in Expo" | mobile | FlatList +50, Expo +55, mobile intent +50 = 155 |
| "Should we build this feature?" | strategist | should we +50, business-question intent +55 = 105 |
| "What does this repo do?" | scanner | Project detection intent +60, cached context +40 = 100 |
Key insight: Layer 0 (task content) overrides repo type. A Laravel repo asking "fix email template styling" gets frontend, not architect. See skills/agent-detector/task-based-agent-selection.md for the full scoring matrix.
Real measurements from production workflows. Numbers vary ±20% based on project size.
| Strategy | Typical Tokens | Cost (Sonnet) | Cost (Opus) | Gates | Example task |
|---|---|---|---|---|---|
| Quick (direct edit, haiku) | ~3K | $0.003 | — | 0 | Fix typo, rename variable |
| Standard (single agent, sonnet) | ~15–25K | $0.08 | $0.40 | 0–1 | Add validation to form |
| Deep (5-phase, sonnet) | ~60–90K | $0.30 | $1.50 | 2 | JWT auth, payment flow |
| Deep + Team Mode (multi-agent, sonnet) | ~120–180K | $0.60 | $3.00 | 2 | User subscription system |
```
Phase 1: Understand + Design    ~8K  (13%)
Phase 2: Test RED               ~6K  (10%)
Phase 3: Build GREEN           ~40K  (65%)  ← biggest phase
Phase 4: Refactor + Review      ~6K  (10%)
Phase 5: Finalize               ~2K   (2%)
```

Target: ≤30K tokens per workflow. Actual median: 62K (2x target — Phase 3 is the main target for future compression).
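The breakdown above is internally consistent — the per-phase estimates sum to the 62K median, and the budget ratio follows directly:

```python
# Check the arithmetic of the per-phase breakdown above (values in ~K tokens).
phases_k = {"P1": 8, "P2": 6, "P3": 40, "P4": 6, "P5": 2}

total = sum(phases_k.values())
assert total == 62                      # matches the stated 62K median

budget = 30                             # the <=30K per-workflow target
assert round(total / budget, 1) == 2.1  # ~2x the target, as noted above

# Phase 3 dominates spend — which is why it is the compression target.
share_p3 = phases_k["P3"] / total
assert share_p3 > 0.6                   # ~65% of the total
```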
Run /run predict <task> before a workflow to get a tailored estimate.
Q: Workflow state isn't saving. `/run status` shows nothing.

Likely cause: Path drift between hooks and skills (fixed in v3.7+).

Check:

```
ls -la .claude/logs/runs/       # Should exist after first /run
ls -la .claude/logs/workflows/  # Legacy path — may have old state
```

Fix:

- Upgrade to v3.7+ (`/plugin update aura-frog`)
- Or manually move: `mv .claude/logs/workflows/* .claude/logs/runs/`

Verify with `/af status` — it should show 0 orphan paths.
Q: Wrong agent picked for my task.

Likely cause: Missing project context or ambiguous task description.

Check:

- Did you run `/project init` yet? Scanner uses those files for Layer 3 (project context).
- Is your task description short or vague? `agent-detector` defaults to repo type when signals are weak.

Fix:

- Run `/project init` if you haven't
- Rephrase the task with domain-specific keywords: `"Add email template styling"` (frontend) vs `"Update email feature"` (ambiguous)
- Override manually: `/run @frontend implement X` forces the frontend agent

Full scoring logic: skills/agent-detector/task-based-agent-selection.md.
Q: Token budget blown past 200K. What happened?

Likely cause: Phase 3 (Build GREEN) hit an iteration loop on a complex refactor.

Check:

```
/run budget    # Shows per-phase consumption
/run metrics   # Shows if rejection count is high
```

Fix:

- `/run handoff` to save state → resume in a fresh session
- For next time: use `/run predict <task>` first — it flags Deep tasks likely to exceed budget
- Consider splitting: `/run part 1: <narrow scope>` → merge → `/run part 2`
Q: Hooks not firing (no SessionStart banner, no lint-autofix).

Likely cause: `.claude/settings.json` missing hook config, or plugin not activated in this project.

Check:

```
cat .claude/settings.json   # Should reference plugin hooks
/af status                  # Should show "Hooks: 28 registered"
```

Fix:

```
/af setup integrations   # Re-installs hook config
```

If still nothing, check the plugin.json path:

```
ls ~/.claude/plugins/marketplaces/aurafrog/aura-frog/hooks/hooks.json
```

Q: Opus session costs surprised me. Can I lock everything to Sonnet?

Yes — two ways.

Option 1 — Session override (temporary):

```
# Start Claude Code with model flag
claude --model sonnet
```

Option 2 — Env var (permanent, overrides ALL frontmatter):

```
export CLAUDE_CODE_SUBAGENT_MODEL=sonnet
```

This overrides every agent/skill `model:` declaration. See Per-Agent Model Override for the resolution order.

Cost tip: `scanner` and `agent-detector` stay on haiku regardless — you don't need to touch them.
Q: Can I run multiple /run workflows in parallel?

Yes — use git worktrees:

```
/run worktree: <task>   # Automatically creates an isolated worktree + runs there
```

Each worktree has its own state in `.claude/logs/runs/<id>/`. See the Git Worktree skill.

For full multi-agent parallel work, enable Agent Teams:

```
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```

See the Agent Teams Guide.
Q: How do I disable a hook that's slowing me down?

Each hook has a disable env var:

```
AF_LINT_AUTOFIX=false       # Skip post-edit linter
AF_PROMPT_LOGGING=false     # Skip prompt metadata logging
AF_LEARNING_ENABLED=false   # Skip all learning hooks
```

Or disable at the source by editing `aura-frog/hooks/hooks.json` (comment out the matcher).

Full hook list: hooks/README.md.
More issues: TROUBLESHOOTING.md.
Honest comparison with two popular plugins in the ecosystem (April 2026).
| | Aura Frog | wshobson/agents | Superpowers |
|---|---|---|---|
| Agents | 9 curated | 184 across 78 plugins | ~20 |
| Skills | 38 | 150 | Small focused set |
| Commands | 6 | 98 | ~10 |
| Workflow | 5-phase TDD with 2 gates | No structured workflow | Phase-gated workflow |
| Agent routing | Task-content Layer 0 override | Manual `/agent-name` | Similar to Aura Frog |
| TDD enforcement | ✅ Mandatory RED→GREEN→REFACTOR | ⚠️ Per-agent | ✅ Phase-gated |
| Context management | 3-tier (MicroCompact / AutoCompact / ManualCompact) | ❌ Base Claude Code | Partial |
| Approval gates | 2 (P1 + P3) | ❌ | Multiple |
| MCP bundled | 6 (context7, playwright, vitest, firebase, figma, slack) | Varies per plugin | 2–3 |
| Best fit | Teams shipping production features with TDD discipline | Extending with niche specialists | Structured workflows for research/writing |
| Weakness | Steeper learning curve | Agent sprawl (184 is a lot) | Smaller ecosystem |
Not competing — different optimization targets. Aura Frog optimizes for production code quality (TDD + security review). wshobson optimizes for breadth of specialists. Superpowers optimizes for structured thinking over code.
Combine freely — plugins coexist in Claude Code.
| Doc | Where |
|---|---|
| All Documentation | docs/README.md |
| Getting Started | GET_STARTED.md |
| First Workflow Tutorial | FIRST_WORKFLOW_TUTORIAL.md |
| All Commands (6) | commands/README.md |
| All Skills (38) | skills/README.md |
| Agent Teams Guide | AGENT_TEAMS_GUIDE.md |
| MCP Setup | MCP_GUIDE.md |
| Hooks & Lifecycle | hooks/README.md |
| Troubleshooting | TROUBLESHOOTING.md |
| Changelog | CHANGELOG.md |
```
Claude = Kernel         Context Window = RAM          Project Files = Disk
Agents = Processes      5-Phase TDD = Scheduler       MCP = Device Drivers
TOON = Compression      Approval Gates = Interrupts   Handoffs = IPC
```
```
aura-frog/
├── agents/      9 processes (auto-dispatched per task)
├── skills/      44 skills (5 auto-invoke + 39 on-demand)
├── commands/    6 commands (/run, /check, /design, /project, /af, /help)
├── rules/       57 rules (18 core + 17 agent + 22 workflow)
├── hooks/       28 lifecycle hooks (conditional execution)
├── scripts/     43 utility scripts
├── docs/        AI reference docs (phases, TOON refs)
└── .mcp.json    6 device drivers (MCP servers)
```
We welcome contributions — especially new MCP integrations, agents, skills, and bug fixes. See CONTRIBUTING.md or submit an issue.
Godot and SEO/GEO modules available as separate addons.
MIT β See LICENSE
Install Now · Tutorial · Report Issue
Built by @nguyenthienthanh · Changelog

