Skip to content

nguyenthienthanh/aura-frog

Repository files navigation

Aura Frog

Aura Frog

An Operating System for software engineering.

A plugin for Claude Code that treats it as an Operating System. 9 specialized agents, smart flow selection (direct edit Β· bugfix TDD Β· 5-phase workflow), self-healing memory, and multi-agent orchestration. Right effort matched to task risk.

Version License Claude Code Portable Trigger Accuracy PRs Welcome

Typos get direct edits. Bugs get 4-step TDD. Features get 5-phase workflow. You never pick β€” the plugin matches effort to risk.

Install in 30 seconds Β· See it in action Β· Why Aura Frog? Β· Full benefits guide β†’


The Problem

You open Claude Code. You type a prompt. Claude writes code. You hope it works.

No structure. No tests. No quality gates. Every session starts from scratch. Every complex feature turns into prompt spaghetti.

You're the project manager, QA lead, and architect β€” all while trying to code.

The Solution

Aura Frog treats Claude Code as an Operating System β€” Claude is the kernel, agents are processes, and the context window is managed RAM. You describe the task. Aura Frog classifies complexity, picks the right flow (direct edit Β· bugfix TDD Β· full 5-phase), dispatches the right agent, and compresses context automatically so you never lose decisions.

Right effort for every task. You only approve when it matters (0 gates for typos, up to 2 for architecture).


How It Works

flowchart TB
    User([πŸ‘€ User prompt]):::user --> CC[Claude Code CLI]:::cc
    CC -->|every message| AD[πŸ” agent-detector skill<br/>model: haiku Β· auto-invoke]:::skill
    AD --> Cx{Complexity<br/>detected?}:::gate

    Cx -->|Quick<br/>1 file / clear scope| Direct[✏️ Direct edit<br/>no workflow]:::quick
    Cx -->|Standard<br/>2-5 files / feature| SingleAgent[πŸ›  Single agent<br/>agent-detector picks one]:::std
    Cx -->|Deep<br/>6+ files / architecture| RO[🎯 run-orchestrator skill<br/>5-phase TDD]:::deep

    RO --> P1[πŸ“ P1: Understand + Design<br/>architect]:::phase
    P1 --> G1{{βœ‹ Approve?}}:::gate
    G1 -->|approve| P2[πŸ§ͺ P2: Test RED<br/>tester writes failing tests]:::phase
    P2 --> P3[βš™οΈ P3: Build GREEN<br/>architect/frontend/mobile]:::phase
    P3 --> G2{{βœ‹ Approve?}}:::gate
    G2 -->|approve| P4[πŸ”’ P4: Refactor + Review<br/>security + tester NOT builder]:::phase
    P4 --> P5[πŸ“¦ P5: Finalize<br/>lead β€” docs + commit]:::phase
    P5 --> Done([βœ… Production-ready]):::done

    SingleAgent --> Done
    Direct --> Done

    classDef user fill:#f59e0b,stroke:#b45309,stroke-width:2px,color:#000000
    classDef cc fill:#6366f1,stroke:#3730a3,stroke-width:2px,color:#ffffff
    classDef skill fill:#ec4899,stroke:#9d174d,stroke-width:2px,color:#ffffff
    classDef gate fill:#dc2626,stroke:#7f1d1d,stroke-width:2px,color:#ffffff
    classDef quick fill:#10b981,stroke:#065f46,stroke-width:2px,color:#ffffff
    classDef std fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#ffffff
    classDef deep fill:#8b5cf6,stroke:#5b21b6,stroke-width:2px,color:#ffffff
    classDef phase fill:#475569,stroke:#1e293b,stroke-width:2px,color:#ffffff
    classDef done fill:#059669,stroke:#064e3b,stroke-width:2px,color:#ffffff
Loading

The flow, explained:

  1. Every message you send goes through the agent-detector skill (runs on haiku for cost) β€” it classifies complexity + picks the right agent + suggests the right model.
  2. Quick tasks (typo, one-line fix) β†’ direct edit, no workflow overhead.
  3. Standard tasks (one feature, clear scope) β†’ single specialized agent runs inline.
  4. Deep tasks (feature + multi-file + TDD) β†’ run-orchestrator spawns the 5-phase workflow with two human approval gates.
  5. Between phases, you either approve, reject, or modify β€” no commit happens until Phase 5 and you say so.

Works Across AI Coding Tools

Aura Frog's 57 rules, 44 skills, and 9 agents are ~87% portable (weighted average) because they're markdown conventions, not tool-specific code. Only the thin hook layer needs adapters.

Tool Status Coverage
Claude Code Fully tested 100%
Codex Q2 2026 ~85%
Cursor Q2 2026 ~80%

Why this matters: When you invest in Aura Frog's TDD discipline, gotcha-only expert skills, and agent architecture, that investment survives tool switches. Only the thin adapter layer changes.

Read the Portability Guide β†’


Before & After

❌ Without Aura Frogβœ… With Aura Frog
You: "Add user authentication"
Claude: *writes 500 lines of untested code*
You: "Wait, that's not what Iβ€”"
Claude: *rewrites everything from scratch*
You: "Add user authentication"

🐸 Phase 1: "JWT or OAuth2? Here are trade-offs.
   3 endpoints needed. Approve?"

You: "approve"

🐸 Phase 2-3: 5 tests β†’ all GREEN.
🐸 Phase 4-5: Reviewed. Documented. Done.

Result: Production-ready code with tests, security review, and documentation β€” from a single prompt.


Installation

Prerequisites

  • Claude Code CLI installed β†’ install guide
  • Node.js β‰₯ 18 (for hook scripts)
  • Git (for phase checkpoint commits)

Install in Claude Code (30 seconds)

# 1. Add the marketplace
/plugin marketplace add nguyenthienthanh/aura-frog

# 2. Install the plugin
/plugin install aura-frog@aurafrog

# 3. Verify
/af status

Expected output:

🐸 Aura Frog v3.6.0 β€” Ready
  Agents:   9 loaded (lead, architect, frontend, mobile, tester, security, devops, strategist, scanner)
  Skills:   44 available (5 auto-invoke, 39 on-demand)
  Rules:    57 loaded (18 core + 17 agent + 22 workflow)
  Hooks:    28 registered
  MCP:      context7, playwright, vitest, firebase, figma, slack

Initialize Your Project (Recommended β€” one time)

/project init

Scans your codebase and creates 7 context files (framework, conventions, rules, examples, architecture, etc.) in .claude/project-contexts/<name>/. Takes 30–60 seconds; saves minutes on every future session.

Optional Setup

Install af CLI for health checks outside Claude Code
# Inside Claude Code:
/af setup cli

# Or manually:
sudo ln -sf "$HOME/.claude/plugins/marketplaces/aurafrog/scripts/af" /usr/local/bin/af

Then use anywhere: af doctor, af measure, af setup remote.

MCP tokens (Figma, Slack, Firebase)
cp .envrc.template .envrc
# Edit .envrc β€” add FIGMA_API_TOKEN, SLACK_BOT_TOKEN, FIREBASE_TOKEN, etc.
direnv allow   # if using direnv

Without tokens, figma / slack / firebase MCP servers stay inactive. context7, playwright, vitest need no config.

Skills-only mode on other platforms
Platform Install What works
Claude Code /plugin marketplace add nguyenthienthanh/aura-frog Everything
OpenAI Codex cp -r aura-frog/skills/* ~/.codex/skills/ Skills + commands
Gemini CLI cp -r aura-frog/skills/* ~/.gemini/skills/ Skills + commands
OpenCode cp -r aura-frog/skills/* .opencode/skills/ Skills + commands

Hooks, agent detection, subagent spawning, and MCP servers are Claude Code exclusive.

Start Your First Workflow

/run "Your task here"

See the Walkthrough below for a complete transcript of what this looks like.

Common Install Issues

Symptom Likely cause Fix
/plugin install fails Marketplace cache Run /plugin marketplace refresh
Hooks not firing .claude/settings.json missing hook config /af setup integrations re-installs
af: command not found PATH missing plugin scripts dir Add $HOME/.claude/plugins/marketplaces/aurafrog/scripts to $PATH
State not saving during /run Hook path drift (pre-v3.7) Upgrade to 3.7+ (state path fixed)
Claude uses wrong agent No /project init yet Run /project init to load conventions

Full guide: GET_STARTED.md.


Walkthrough: A Real Workflow in Action

A complete transcript of implementing user authentication with /run. This is what you actually see.

Interaction Sequence

sequenceDiagram
    autonumber
    participant U as πŸ‘€ You
    participant CC as Claude Code
    participant AD as agent-detector<br/>(haiku)
    participant RO as run-orchestrator<br/>skill
    participant Lead as lead agent
    participant Arch as architect
    participant Tester as tester
    participant Sec as security

    U->>CC: /run "Add JWT authentication"
    CC->>AD: classify (auto-invoke)
    AD-->>CC: Deep Β· multi-file Β· sonnet
    CC->>RO: spawn 5-phase workflow
    RO->>Lead: coordinate
    Lead->>Arch: Phase 1 β€” design
    Arch-->>Lead: requirements.md + tech_spec.md
    Lead-->>U: βœ‹ Approve Phase 1?
    U-->>Lead: approve
    Lead->>Tester: Phase 2 β€” write failing tests
    Tester-->>Lead: 5 tests Β· all RED
    Lead->>Arch: Phase 3 β€” build GREEN
    Arch-->>Lead: implementation Β· 5 tests pass
    Lead-->>U: βœ‹ Approve Phase 3?
    U-->>Lead: approve
    Lead->>Sec: Phase 4 β€” security review
    Lead->>Tester: Phase 4 β€” coverage review
    Sec-->>Lead: βœ“ 0 critical
    Tester-->>Lead: βœ“ 94% coverage
    Lead->>Lead: Phase 5 β€” finalize
    Lead-->>U: βœ… Production-ready
Loading

Mock Terminal Transcript

$ /run "Add JWT authentication to the /api routes"

🐸 agent-detector (haiku):
   Complexity: Deep Β· Agent: architect (primary), security (reviewer)
   Model: sonnet Β· Estimated tokens: ~95K

🐸 Sprint Contract β€” confirm before Phase 2:
   Scope:     JWT middleware, /login + /refresh endpoints, user model
   Done when: 5 tests passing, 0 critical security findings, coverage β‰₯80%
   Out of scope: password reset, 2FA, social auth

Confirm contract? (yes / modify / skip)

> yes

═══════════════════════════════════════════════
Phase 1: Understand + Design  (architect)
═══════════════════════════════════════════════

πŸ“‹ Requirements analyzed:
   β€’ POST /auth/login  β†’ issues JWT (15m) + refresh token (7d)
   β€’ POST /auth/refresh β†’ rotates access token
   β€’ Middleware verifies JWT on protected routes
   β€’ User.passwordHash stored as bcrypt (cost 12)

🎯 Technical Design:
   β€’ Stack: express-jwt, bcrypt, crypto.randomUUID for jti
   β€’ Token storage: httpOnly cookie (refresh) + Authorization header (access)
   β€’ Refresh token rotation on every use (prevents replay)
   β€’ RS256 signing (public/private keypair)

⚠️  Risks flagged by strategist:
   β€’ Secret leak β†’ env var with rotation plan
   β€’ No rate-limit β†’ add bucket before deploy (Phase 5 followup)

πŸ“ Saved:
   .claude/logs/runs/250421-jwt-auth/phase-1/requirements.md
   .claude/logs/runs/250421-jwt-auth/phase-1/tech-spec.md

──────────────────────────────────────────────
βœ‹ APPROVAL GATE β€” Phase 1

Options:
  approve            Continue to Phase 2 (write tests)
  modify <changes>   Adjust design without restart
  reject <reason>    Full redesign
──────────────────────────────────────────────

> approve

═══════════════════════════════════════════════
Phase 2: Test RED  (tester)
═══════════════════════════════════════════════

πŸ§ͺ Writing failing tests (TDD RED)...
   βœ“ auth.test.ts:  login returns JWT for valid creds         [FAIL β€” no implementation]
   βœ“ auth.test.ts:  login rejects invalid password            [FAIL]
   βœ“ auth.test.ts:  refresh rotates token on use              [FAIL]
   βœ“ middleware.test.ts: rejects missing Authorization header [FAIL]
   βœ“ middleware.test.ts: rejects expired JWT                  [FAIL]

πŸ“Š Status: 5 failing, 0 passing (correct for RED)

Auto-continue β†’ Phase 3 (tests fail as expected) βœ“

═══════════════════════════════════════════════
Phase 3: Build GREEN  (architect)
═══════════════════════════════════════════════

βš™οΈ  Implementing...
   βœ“ src/auth/jwt-service.ts    (new Β· 67 lines)
   βœ“ src/auth/middleware.ts     (new Β· 42 lines)
   βœ“ src/routes/auth.ts         (new Β· 58 lines)
   βœ“ src/models/user.ts         (modified Β· added passwordHash)
   βœ“ .env.example               (modified Β· added JWT_PRIVATE_KEY)

πŸ§ͺ Running tests...
   βœ“ 5 passing, 0 failing
   βœ“ Coverage: 94% (target: 80%)

──────────────────────────────────────────────
βœ‹ APPROVAL GATE β€” Phase 3

> approve

═══════════════════════════════════════════════
Phase 4: Refactor + Review  (security + tester)
═══════════════════════════════════════════════

πŸ”’ Security review (reviewer β‰  builder):
   βœ“ No hardcoded secrets (env var pattern correct)
   βœ“ bcrypt cost 12 (OWASP minimum: 10)
   βœ“ httpOnly cookie flag set on refresh token
   ⚠️  INFO: Consider CSP header on /login response (not blocking)

πŸ§ͺ Test coverage review:
   βœ“ 94% coverage (target: 80%) βœ“
   βœ“ All critical paths covered
   βœ“ No flaky tests detected

Auto-continue β†’ Phase 5 (0 critical findings) βœ“

═══════════════════════════════════════════════
Phase 5: Finalize  (lead)
═══════════════════════════════════════════════

πŸ“¦ Deliverables:
   βœ“ 4 new files, 2 modified
   βœ“ 5 tests passing, 94% coverage
   βœ“ 0 critical security findings
   βœ“ ADR saved: docs/adr/0007-jwt-auth.md
   βœ“ Deployment note: add rate-limit middleware (tracked in phase-1/risks.md)

πŸ“Š Workflow stats:
   Duration: 18m Β· Tokens: 82K Β· Budget: 30K target β†’ 2.7x (Deep tier norm)

Ready to commit? (yes / no)

> yes

πŸ’Ύ Committed: 7a3b9c2 Β· feat(auth): JWT authentication with refresh rotation

βœ… Workflow complete β€” JWT auth shipped.

What You Just Saw

Step Who ran it Your role
1. Detection agent-detector skill (haiku, auto) Nothing β€” zero friction
2. Sprint Contract Orchestrator proposed Confirm scope
3. Phase 1 design architect in forked context Approve design
4. Phase 2 RED tester (auto-continues) Nothing
5. Phase 3 GREEN architect implements Approve implementation
6. Phase 4 review security + tester (NOT architect) Nothing
7. Phase 5 finalize lead Confirm commit

Two approvals. 18 minutes. Production-ready JWT auth with 94% coverage and security review.


Why Teams Ship Faster With Aura Frog

1. Smart Flow Selection β€” Right Effort for Every Task

Not every task gets the 5-phase workflow. Aura Frog's agent-detector classifies complexity on every message and picks the minimum viable flow:

Task type Flow Gates Example
Typo, one-line fix Direct edit (no workflow) 0 /run fix typo in login.ts
Bug fix 4-step TDD (Investigate β†’ RED β†’ GREEN β†’ Verify) 0 /run fix login button not disabling
Refactor Analyze β†’ plan β†’ test β†’ refactor 0 /run refactor auth service
Add tests Detect framework β†’ write β†’ verify coverage 0 /run add tests for payment
Feature (≀5 files) Single-agent inline with TDD 0–1 /run add email validation
Feature (6+ files, architecture) Full 5-phase workflow 2 /run implement user subscription

When the 5-phase workflow DOES fire (Deep complexity only):

  βœ‹ Phase 1: Understand + Design    β†’ You approve the plan
  ⚑ Phase 2: Test RED               β†’ Failing tests written
  βœ‹ Phase 3: Build GREEN            β†’ You approve the implementation
  ⚑ Phase 4: Refactor + Review      β†’ Auto quality + security check
  ⚑ Phase 5: Finalize               β†’ Docs + notifications

Escape hatches β€” you control rigor when the detector gets it wrong:

  • /run fasttrack: <specs> β€” skip Phase 1 if you've already designed
  • /run must do: <task> / just do: <task> β€” bypass brainstorming, execute literally
  • /run reopen <phase> β€” unfreeze an approved phase to revise
  • /run reason: sc|tot|cove β€” opt in to heavy reasoning (Self-Consistency / Tree of Thoughts / Chain-of-Verification) for hard decisions
  • /run handoff β€” save state, resume in a fresh session

What you get vs what you skip:

  • 80% of tasks never see a gate β€” fast iteration
  • 20% that matter (architecture, multi-file, vague scope) get disciplined TDD + human approval
  • You never manually pick β€” the detector routes; you approve only when it matters

Full strategy matrix: Routing Strategies below. Full benefits guide: docs/reference/BENEFITS.md.

2. The Right Expert for Every Task

9 specialized agents activate automatically β€” no configuration:

"Build a React dashboard"     β†’ frontend
"Optimize the SQL queries"    β†’ architect
"Set up CI/CD pipeline"       β†’ devops
"Fix the login screen crash"  β†’ mobile
"Run a security audit"        β†’ security

How Agent Detection Works

flowchart LR
    Msg([User message]):::m --> L0[Layer 0<br/>Task content<br/>+50-60]:::l
    Msg --> L1[Layer 1<br/>Explicit tech<br/>+60]:::l
    Msg --> L2[Layer 2<br/>Intent verb<br/>+50]:::l
    Msg --> L3[Layer 3<br/>Project context<br/>+40]:::l
    Msg --> L4[Layer 4<br/>File patterns<br/>+20]:::l

    L0 --> Score[[Sum per agent]]:::s
    L1 --> Score
    L2 --> Score
    L3 --> Score
    L4 --> Score

    Score --> T{Threshold?}:::g
    T -->|β‰₯80| Primary[PRIMARY agent]:::p
    T -->|50-79| Secondary[SECONDARY agent]:::sec
    T -->|30-49| Optional[OPTIONAL agent]:::opt
    T -->|<30| Skip[Ask user]:::skip

    classDef m fill:#f59e0b,stroke:#b45309,stroke-width:2px,color:#000000
    classDef l fill:#6366f1,stroke:#3730a3,stroke-width:2px,color:#ffffff
    classDef s fill:#ec4899,stroke:#9d174d,stroke-width:2px,color:#ffffff
    classDef g fill:#dc2626,stroke:#7f1d1d,stroke-width:2px,color:#ffffff
    classDef p fill:#059669,stroke:#064e3b,stroke-width:2px,color:#ffffff
    classDef sec fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#ffffff
    classDef opt fill:#8b5cf6,stroke:#5b21b6,stroke-width:2px,color:#ffffff
    classDef skip fill:#475569,stroke:#1e293b,stroke-width:2px,color:#ffffff
Loading

Why 5 layers instead of one? A backend repo can contain frontend work (Blade/Jinja templates, email HTML, PDF styling), and a frontend repo can need backend work (API rate-limits, auth logic). Repo type alone lies. Task content (Layer 0) overrides repo context β€” so "Fix email template styling" in a Laravel repo correctly routes to frontend, not architect.

Details: skills/agent-detector/SKILL.md + skills/agent-detector/task-based-agent-selection.md.

All 9 agents
Agent Model Tools When it activates
lead inherit full Coordinates workflows, enforces gates
architect inherit full System design, DB schema, backend APIs β€” uses Opus when session is Opus
frontend inherit full React, Vue, Angular, Next.js + design systems β€” uses Opus when session is Opus
mobile inherit full React Native, Flutter, Expo, NativeWind β€” uses Opus when session is Opus
strategist sonnet read-only ROI, MVP, scope creep (Phase 1 Deep)
security sonnet read-only OWASP, auth/crypto review (Phase 4)
tester sonnet full Jest, Cypress, Playwright, Detox, coverage
devops sonnet full Docker, K8s, CI/CD, monitoring
scanner haiku read + Bash Project detection, session-start context

Agent + complexity + model selection all done by the agent-detector skill (no separate router β€” consolidated in v3.6.0).

Per-Agent Model Override β€” How It Works and Why

Each agent and skill declares its own model: in YAML frontmatter. Claude Code resolves the model like this:

Priority Source When it applies
1 (highest) CLAUDE_CODE_SUBAGENT_MODEL env var Override everything β€” useful for CI or cost control
2 Per-invocation model parameter Rare β€” set at spawn time
3 Agent/skill frontmatter model: field This is where Aura Frog declarations live
4 (fallback) Main session model Used only if nothing above is set

Key point: frontmatter wins over session model. If you started your session on Opus but invoke agent-detector, that skill runs on haiku β€” not Opus. The session model is the fallback, not the override.

Why we hard-code certain models:

Agent or skill Model Why
agent-detector, scanner haiku Classification/detection tasks. Fire every message or session-start. Haiku is ~3Γ— faster and ~10Γ— cheaper. Opus here wastes budget.
security, strategist, tester, devops sonnet Balanced reasoning for review/analysis/tests/deploy. Locked to sonnet β€” Opus rarely pays back for these roles.
lead, architect, frontend, mobile inherit These do the heavy design/build work. If you chose Opus for a complex task, these agents should reason at Opus too.

What this means for you:

  • Starting a session on Opus β†’ lead, architect, frontend, mobile all run on Opus (they inherit). Review/test/deploy stay on sonnet. Detection stays on haiku. You get Opus-quality design + sonnet-cost everything else.
  • Starting on Sonnet β†’ everything runs ≀ sonnet (haiku calls still haiku). No Opus unless you escalate the session.
  • Want everything on one model? Set CLAUDE_CODE_SUBAGENT_MODEL=opus (env var at top of resolution order) β€” overrides every frontmatter declaration.

Not what you want? Edit the model: field in aura-frog/agents/<name>.md frontmatter. Remove the line to inherit session model. Change to opus/sonnet/haiku to lock. See the Frontmatter Maintenance Rule in .claude/CLAUDE.md.

3. Complex Features Get Debated Before Built

For deep tasks, 4 agents independently analyze your plan β€” then challenge each other:

πŸ“ Architect    β†’ "How to build it"
πŸ” Tester       β†’ "How it can fail"
πŸ‘€ Frontend     β†’ "How users experience it"
πŸ’Ό Strategist   β†’ "Should we even build this?"

Plans survive 4 rounds of scrutiny before a single line of code. Catches scope creep and wasted effort before it happens.

4. Your Codebase Loads in Seconds, Not Minutes

Run project:init once. Every future session instantly understands your codebase β€” conventions, architecture, patterns, file relationships. 12 pattern detections. 7 context files generated.

No more re-explaining your project every session.

5. Multi-Agent Teams for Big Features

For complex work, Aura Frog spins up a real team working in parallel:

lead
β”œβ”€β”€ architect     β†’ Designs the system
β”œβ”€β”€ frontend      β†’ Builds the UI
β”œβ”€β”€ tester        β†’ Writes tests
└── security      β†’ Reviews for vulnerabilities

All cross-reviewing each other's work.

Only activates when needed. Simple tasks stay single-agent (saves ~3x tokens).

6. Context-Aware MCP Servers β€” Zero Config

6 bundled servers auto-invoke when Claude needs them:

"Build with MUI"          β†’ context7 fetches current MUI docs
"Test the login page"     β†’ playwright launches a browser
"Check test coverage"     β†’ vitest runs your suite
"Deploy to Firebase"      β†’ firebase manages the project

Plus Figma design fetching and Slack notifications.

More features

Self-Improving Learning

Detects your patterns, remembers corrections, creates rules that persist across sessions. Optional Supabase sync for teams.

Smart Complexity Routing

Automatically matches effort to task size β€” typos get direct edits, features get full workflows, architecture gets collaborative planning. No configuration. See Routing Strategies below.

Built-in Safety Net

Run crashed? /run resume. Context full? Decisions preserved across /compact. Need to pause? Type handoff to save everything.

Memory That Heals Itself

All cached context is treated as a hint β€” agents verify against actual files before acting. State only updates after confirmed success (Strict Write Discipline). No stale assumptions propagate.

3-Tier Context Compression

MicroCompact (free, every 10 turns) β†’ AutoCompact (one /compact call at 80%) β†’ ManualCompact (full session snapshot). Context stays lean. Decisions survive.

Performance by Design

3-tier rule loading (~75% less context), conditional hooks (~40% fewer executions), agent detection caching, session start caching (<1s repeat sessions).


Routing Strategies

Aura Frog picks one of three execution strategies per task β€” you never configure it manually.

flowchart TB
    Prompt([Prompt]):::m --> AD[agent-detector analyzes]:::s
    AD --> Cx{Complexity?}:::g

    Cx -->|single file<br/>clear scope<br/>~5K tokens| Q[Quick Strategy]:::q
    Cx -->|2-5 files<br/>feature<br/>~20K tokens| S[Standard Strategy]:::st
    Cx -->|6+ files<br/>architecture<br/>~80K tokens| D[Deep Strategy]:::d

    Q --> QR[Direct edit<br/>haiku model<br/>no workflow]:::q
    S --> SR[Single agent inline<br/>sonnet model<br/>TDD optional]:::st
    D --> DR[5-phase TDD<br/>2 approval gates<br/>builder β‰  reviewer<br/>collaborative planning]:::d

    classDef m fill:#f59e0b,stroke:#b45309,stroke-width:2px,color:#000000
    classDef s fill:#ec4899,stroke:#9d174d,stroke-width:2px,color:#ffffff
    classDef g fill:#dc2626,stroke:#7f1d1d,stroke-width:2px,color:#ffffff
    classDef q fill:#10b981,stroke:#065f46,stroke-width:2px,color:#ffffff
    classDef st fill:#3b82f6,stroke:#1e40af,stroke-width:2px,color:#ffffff
    classDef d fill:#8b5cf6,stroke:#5b21b6,stroke-width:2px,color:#ffffff
Loading
Strategy Triggers Model Gates Example
Quick Single file, typo, one-line fix haiku 0 "Fix typo in login.ts"
Standard 2–5 files, one feature sonnet 0–1 "Add email validation to signup form"
Deep 6+ files, architecture, vague scope sonnet (opus for design) 2 (P1 + P3) "Design and implement user subscription system"

Why three tiers instead of always-TDD? Forcing Deep on every task burns tokens (~3Γ— vs subagent mode) and slows iteration. Forcing Quick on complex work skips tests and breaks production. The three-tier model matches effort to risk.

Team Mode (subset of Deep): if the task spans 2+ domains AND CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1, multiple agents work in parallel and cross-review each other. See AGENT_TEAMS_GUIDE.

Details: rules/core/execution-rules.md, skills/agent-detector/SKILL.md, skills/run-orchestrator/SKILL.md.


The Numbers

Component Count Why it matters
Agents 9 Right expert auto-selected per task
Skills 41 5 auto-invoke on context, 36 on-demand
Commands 6 /run, /check, /design, /project, /af, /help
Rules 57 3-tier loading (18 core + 17 agent + 22 workflow) β€” only what's needed
Hooks 28 Conditional β€” skip processing for non-code files
MCP Servers 6 Zero-config, auto-invoked

Full workflow target: ≀30K tokens across all 5 phases.


Command Reference

Six commands cover every workflow. Each auto-detects intent and dispatches the right skills/agents.

/run <task> β€” The main entry point

Auto-detects what kind of work you want (feature / bugfix / refactor / test) and picks the right workflow.

What you say Intent detected Flow
/run implement user profile Feature 5-phase TDD workflow
/run fix login not working Bugfix bugfix-quick skill β€” investigate β†’ test β†’ fix β†’ verify
/run refactor auth service Refactor refactor-expert skill β€” analyze β†’ plan β†’ test β†’ refactor
/run add tests for payment Test test-writer skill β€” detect framework β†’ write tests β†’ coverage
/run fasttrack: <specs> Fast-Track Skip Phase 1, auto-execute P2–P5 (specs must include Requirements + Design + API + Data Model + Acceptance Criteria)
/run resume <id> Resume Load state from .claude/logs/runs/<id>/
/run status Status Current phase + progress
/run handoff Handoff Save state for cross-session continuation

/check β€” Health + quality checks

/check            # all checks (security + perf + complexity + debt + coverage + deps)
/check security   # SAST only
/check perf       # performance bottlenecks
/check coverage   # test coverage report
/check deps       # outdated/vulnerable dependencies

/design β€” Design artifacts

/design api       # REST/GraphQL API spec (calls api-designer skill)
/design db        # Database schema design
/design doc       # ADR or runbook (calls documentation skill)

/project β€” Project lifecycle

/project init     # First-time setup β€” generates 7 context files
/project status   # Current context + active workflow
/project refresh  # Re-scan codebase, update conventions
/project regen    # Regenerate context files from scratch
/project env      # Validate .envrc / MCP tokens
/project sync     # Sync status line + refresh cache

/af β€” Plugin management + learning

/af status        # Plugin health check
/af agents        # List loaded agents with their tools + model
/af metrics       # Workflow velocity + token efficiency
/af learn status  # Learning system state (Supabase or local)
/af learn analyze # Extract patterns from past workflows
/af learn apply   # Apply learned rules to future sessions
/af setup cli     # Install af CLI system-wide
/af prompts       # Analyze prompt quality + suggest improvements

/help β€” Contextual help

/help             # Plugin overview
/help <command>   # Detailed help for a specific command
/help agents      # Agent selection guide
/help hooks       # Hook lifecycle reference

Full command docs: commands/README.md.


Agent Selection Examples

Real examples of what the agent-detector skill picks and why. Score thresholds: PRIMARY β‰₯80, SECONDARY 50–79, OPTIONAL 30–49.

You type PRIMARY agent Why (scoring breakdown)
"Add login form with email+password" frontend form +35, login +30, UI intent +50 = 115
"Add rate-limit to /api routes" architect api route +55, rate limit +45, backend intent +50 = 150
"Fix email template styling in Laravel" frontend (in Laravel repo!) email template +55, styling +40, Layer 0 overrides repo = 95
"Optimize this slow query" architect slow query +50, optimize +35, database intent +55 = 140
"Run OWASP audit on payment flow" security OWASP +55, audit +50, security intent +55 = 160
"Write Cypress tests for checkout" tester Cypress +50, tests +55, test infra exists +30 = 135
"Set up GitHub Actions for CI" devops GitHub Actions +55, CI +50, deployment intent +50 = 155
"Fix FlatList performance in Expo" mobile FlatList +50, Expo +55, mobile intent +50 = 155
"Should we build this feature?" strategist should we +50, business-question intent +55 = 105
"What does this repo do?" scanner Project detection intent +60, cached context +40 = 100

Key insight: Layer 0 (task content) overrides repo type. A Laravel repo asking "fix email template styling" gets frontend, not architect. See skills/agent-detector/task-based-agent-selection.md for the full scoring matrix.


Token Budget

Real measurements from production workflows. Numbers vary Β±20% based on project size.

Strategy Typical Tokens Cost (Sonnet) Cost (Opus) Gates Example task
Quick (direct edit, haiku) ~3K $0.003 β€” 0 Fix typo, rename variable
Standard (single agent, sonnet) ~15–25K $0.08 $0.40 0–1 Add validation to form
Deep (5-phase, sonnet) ~60–90K $0.30 $1.50 2 JWT auth, payment flow
Deep + Team Mode (multi-agent, sonnet) ~120–180K $0.60 $3.00 2 User subscription system

Per-Phase Breakdown (Deep workflow, sonnet)

Phase 1: Understand + Design     ~8K   (13%)
Phase 2: Test RED                ~6K   (10%)
Phase 3: Build GREEN            ~40K   (65%)  ← biggest phase
Phase 4: Refactor + Review       ~6K   (10%)
Phase 5: Finalize                ~2K   ( 2%)

Target: ≀30K tokens per workflow. Actual median: 62K (2x target β€” Phase 3 is the compressor target for future optimization).

Run /run predict <task> before a workflow to get a tailored estimate.


Troubleshooting / FAQ

Q: Workflow state isn't saving. `/run status` shows nothing.

Likely cause: Path drift between hooks and skills (fixed in v3.7+).

Check:

ls -la .claude/logs/runs/         # Should exist after first /run
ls -la .claude/logs/workflows/    # Legacy path β€” may have old state

Fix:

  • Upgrade to v3.7+ (/plugin update aura-frog)
  • Or manually move: mv .claude/logs/workflows/* .claude/logs/runs/

Verify with /af status β€” should show 0 orphan paths.

Q: Wrong agent picked for my task.

Likely cause: Missing project context or ambiguous task description.

Check:

  • Did you run /project init yet? Scanner uses those files for Layer 3 (project context).
  • Is your task description short/vague? agent-detector defaults to repo type when signals are weak.

Fix:

  • Run /project init if you haven't
  • Rephrase task with domain-specific keywords: "Add email template styling" (frontend) vs "Update email feature" (ambiguous)
  • Override manually: /run @frontend implement X forces the frontend agent

Full scoring logic: skills/agent-detector/task-based-agent-selection.md.

Q: Token budget blown past 200K. What happened?

Likely cause: Phase 3 (Build GREEN) hit an iteration loop on a complex refactor.

Check:

/run budget      # Shows per-phase consumption
/run metrics     # Shows if rejection count is high

Fix:

  • /run handoff to save state β†’ resume in fresh session
  • For next time: use /run predict <task> first β€” flags Deep tasks likely to exceed budget
  • Consider splitting: /run part 1: <narrow scope> β†’ merge β†’ /run part 2
Q: Hooks not firing (no SessionStart banner, no lint-autofix).

Likely cause: .claude/settings.json missing hook config, or plugin not activated in this project.

Check:

cat .claude/settings.json   # Should reference plugin hooks
/af status                  # Should show "Hooks: 28 registered"

Fix:

/af setup integrations      # Re-installs hook config

If still nothing, check plugin.json path:

ls ~/.claude/plugins/marketplaces/aurafrog/aura-frog/hooks/hooks.json
Q: Opus session costs surprised me. Can I lock everything to Sonnet?

Yes β€” two ways:

Option 1 β€” Session override (temporary):

# Start Claude Code with model flag
claude --model sonnet

Option 2 β€” Env var (permanent, overrides ALL frontmatter):

export CLAUDE_CODE_SUBAGENT_MODEL=sonnet

This overrides every agent/skill model: declaration. See Per-Agent Model Override for resolution order.

Cost tip: scanner and agent-detector stay on haiku regardless β€” you don't need to touch them.

Q: Can I run multiple /run workflows in parallel?

Yes β€” use git worktrees:

/run worktree: <task>    # Automatically creates isolated worktree + runs there

Each worktree has its own state in .claude/logs/runs/<id>/. See Git Worktree skill.

For full multi-agent parallel work, enable Agent Teams:

export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

See Agent Teams Guide.

Q: How do I disable a hook that's slowing me down?

Each hook has a disable env var:

AF_LINT_AUTOFIX=false        # Skip post-edit linter
AF_PROMPT_LOGGING=false      # Skip prompt metadata logging
AF_LEARNING_ENABLED=false    # Skip all learning hooks

Or disable at the source by editing aura-frog/hooks/hooks.json (comment out the matcher).

Full hook list: hooks/README.md.

More issues: TROUBLESHOOTING.md.


Compared to Other Claude Code Plugins

Honest comparison with two popular plugins in the ecosystem (April 2026).

Aura Frog wshobson/agents Superpowers
Agents 9 curated 184 across 78 plugins ~20
Skills 38 150 Small focused set
Commands 6 98 ~10
Workflow 5-phase TDD with 2 gates No structured workflow Phase-gated workflow
Agent routing Task-content Layer 0 override Manual /agent-name Similar to Aura Frog
TDD enforcement βœ… Mandatory REDβ†’GREENβ†’REFACTOR ❌ Per-agent βœ… Phase-gated
Context management 3-tier (MicroCompact / AutoCompact / ManualCompact) ❌ Base Claude Code Partial
Approval gates 2 (P1 + P3) ❌ Multiple
MCP bundled 6 (context7, playwright, vitest, firebase, figma, slack) Varies per plugin 2–3
Best fit Teams shipping production features with TDD discipline Extending with niche specialists Structured workflows for research/writing
Weakness Steeper learning curve Agent sprawl (184 is a lot) Smaller ecosystem

Not competing β€” different optimization targets. Aura Frog optimizes for production code quality (TDD + security review). wshobson optimizes for breadth of specialists. Superpowers optimizes for structured thinking over code.

Combine freely β€” plugins coexist in Claude Code.


Documentation

All Documentation docs/README.md
Getting Started GET_STARTED.md
First Workflow Tutorial FIRST_WORKFLOW_TUTORIAL.md
All Commands (6) commands/README.md
All Skills (38) skills/README.md
Agent Teams Guide AGENT_TEAMS_GUIDE.md
MCP Setup MCP_GUIDE.md
Hooks & Lifecycle hooks/README.md
Troubleshooting TROUBLESHOOTING.md
Changelog CHANGELOG.md

Architecture β€” LLM OS

Claude = Kernel          Context Window = RAM           Project Files = Disk
Agents = Processes       5-Phase TDD = Scheduler        MCP = Device Drivers
TOON = Compression       Approval Gates = Interrupts    Handoffs = IPC

aura-frog/
β”œβ”€β”€ agents/         9 processes (auto-dispatched per task)
β”œβ”€β”€ skills/         44 skills (5 auto-invoke + 39 on-demand)
β”œβ”€β”€ commands/       6 commands (/run, /check, /design, /project, /af, /help)
β”œβ”€β”€ rules/          57 rules (18 core + 17 agent + 22 workflow)
β”œβ”€β”€ hooks/          28 lifecycle hooks (conditional execution)
β”œβ”€β”€ scripts/        43 utility scripts
β”œβ”€β”€ docs/           AI reference docs (phases, TOON refs)
└── .mcp.json       6 device drivers (MCP servers)

Contributing

We welcome contributions β€” especially new MCP integrations, agents, skills, and bug fixes. See CONTRIBUTING.md or submit an issue.

Godot and SEO/GEO modules available as separate addons.


License

MIT β€” See LICENSE


Aura Frog

Your AI writes code. Aura Frog runs the OS.

Install Now Β· Tutorial Β· Report Issue

Built by @nguyenthienthanh Β· Changelog