Autonomous app builder. Give it one goal; a manager maintains a business backlog,
a watcher drives a disk-backed event queue, and builders handle the technical job
kinds: design, implementation, acceptance review, and merge repair. Builders work
in parallel git worktrees, verify.sh checks software behavior, and the loop keeps
running until the goal is actually done.
When run in a terminal it shows a live TUI: a goal-entry screen lets you confirm or edit the goal before anything runs, then a jobs panel and a persistent input bar for adding tasks. Questions agents raise are answered automatically ("you decide"), so the loop never waits on you. When the goal is done it stays alive in standby so you can keep adding tasks. Piped/non-TTY runs fall back to plain event-line output and exit on completion.
Prebuilt binaries (macOS arm64/x86_64, Linux x86_64):
curl -fsSL https://raw.githubusercontent.com/ngthluu/agentloop/main/scripts/install.sh | bash
This installs the agentloop binary to ~/.local/bin (override with
AGENTLOOP_INSTALL_DIR=/usr/local/bin). Ensure the install dir is on your PATH.
The claude and/or codex CLIs must also be on PATH at runtime.
Requirements: Rust (edition 2021), git, and the claude and/or codex CLIs on PATH.
Platforms: macOS and Linux only. agentloop manages agent processes with POSIX
process groups and signals; it does not build on Windows (a clear compile_error!
says so). Use WSL on Windows.
agentloop is built to run unattended, which means it deliberately removes the safety prompts you may be used to:
- Every agent is spawned with permission checks disabled (claude --dangerously-skip-permissions, codex --yolo). Agents can run any shell command, edit any file your user can, and access the network.
- .agentloop/verify.sh is executed via bash on every iteration. It is arbitrary code living inside the workspace: anything that can write to the workspace (including the agents themselves) controls what it does.
- Task descriptions, designs, and notes written by one agent are fed into the prompts of others. agentloop sanitizes the identifiers (branch/path safety) and bounds the sizes, but it cannot make prompt content trustworthy.
Therefore: only point agentloop at goals and workspaces you trust, with
credentials you accept being exercised autonomously. For anything else, run it
inside a container or VM with scoped credentials. A run can also spend real API
credits for hours; set caps.total_budget_sec / caps.max_iterations
accordingly and watch the first runs of a new goal.
cargo build --release
./target/release/agentloop "<goal>" --workspace ./app # pre-fills goal-entry screen with <goal>
./target/release/agentloop --workspace ./app # resume: entry screen pre-filled from goal.md
./target/release/agentloop --workspace ./new-dir # fresh dir: entry screen is empty
Every interactive (terminal) launch opens on a goal-entry screen first. The screen
is pre-filled with the existing goal from <workspace>/.agentloop/state/goal.md, or
empty on a fresh workspace. Nothing runs until you
type/edit the goal and press enter (the "[ Continue ]" action). Ctrl-C at the
entry screen quits without running. A goal passed as the CLI argument pre-fills the
entry screen rather than starting immediately.
Headless/piped (non-TTY) runs are unchanged: they use the goal arg or persisted goal and run directly without the entry screen.
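The launch rules above can be summarized as a small decision function. This is only a sketch: the Launch enum and plan_launch are hypothetical names, not agentloop's own types, and the MissingGoal case (headless with no goal anywhere) is an assumption, since the text does not say what happens then.

```rust
/// What a launch does first. Illustrative names, not agentloop's real types.
#[derive(Debug, PartialEq)]
enum Launch {
    /// Interactive run: show the goal-entry screen with this pre-filled text.
    EntryScreen { prefill: String },
    /// Headless run: start immediately with this goal.
    RunDirectly { goal: String },
    /// Headless run with neither a CLI goal nor goal.md (assumed to be an error).
    MissingGoal,
}

fn plan_launch(is_tty: bool, cli_goal: Option<&str>, persisted_goal: Option<&str>) -> Launch {
    // A goal passed on the command line wins over the persisted goal.md text.
    let goal = cli_goal.or(persisted_goal);
    if is_tty {
        // Nothing runs yet: the user confirms or edits the pre-filled text.
        Launch::EntryScreen { prefill: goal.unwrap_or("").to_string() }
    } else {
        match goal {
            Some(g) => Launch::RunDirectly { goal: g.to_string() },
            None => Launch::MissingGoal,
        }
    }
}
```

The is_tty flag could come from std::io::IsTerminal (for example std::io::stdout().is_terminal()); which stream agentloop actually inspects is not stated here.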
Options:
- --config <path>: config.json path. By default agentloop uses $AGENTLOOP_CONFIG when set, otherwise ~/.agentloop/config.json.
- --fresh: wipe existing .agentloop state and start over. Prompts for confirmation (it deletes all run state, logs, and results); pass --yes to skip the prompt in scripts. The existing goal is preserved unless you pass a new one.
- --yes: skip confirmation prompts (required for --fresh when not on a TTY)
- --max-iterations N: override caps.max_iterations from config
- --dry-run: run the manager once and print the business backlog; do not dispatch builders
- --report: print the bounce/failure troubleshooting report for the workspace and exit
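The --config lookup order described above (explicit flag, then $AGENTLOOP_CONFIG, then ~/.agentloop/config.json) might look roughly like this. resolve_config_path is a hypothetical helper, and treating an empty environment value as unset is an assumption rather than documented behavior.

```rust
use std::path::PathBuf;

/// Picks the config file path: flag first, then the env value, then the home default.
fn resolve_config_path(cli_flag: Option<&str>, env_value: Option<&str>, home: &str) -> PathBuf {
    if let Some(path) = cli_flag {
        return PathBuf::from(path);
    }
    match env_value {
        // Assumption: an empty AGENTLOOP_CONFIG counts as "not set".
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => PathBuf::from(home).join(".agentloop").join("config.json"),
    }
}
```

A caller would typically pass std::env::var("AGENTLOOP_CONFIG").ok().as_deref() and the value of $HOME; taking them as parameters keeps the function easy to test.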
One run per workspace: agentloop holds an advisory lock on
.agentloop/state/.lock; a second concurrent run on the same workspace exits
with an error instead of corrupting shared state.
- State and queue: .agentloop/state/backlog.json is the manager-owned business backlog, with .agentloop/state/master.md as the human-readable status board. The watcher drives .agentloop/state/queue/events.jsonl, with active claims under .agentloop/state/queue/claims/ and exhausted events archived to .agentloop/state/queue/dead.jsonl. Per-task technical state lives under .agentloop/state/tasks/<task-id>/, including design.md, builder subitems, and acceptance review state.
- Routing: global ~/.agentloop/config.json maps visible roles to tool/model/effort. The configurable roles are only manager, builder, and watcher. Design, implementation, acceptance review, and merge-resolution are builder job kinds, not separate public roles. The watcher role is out of the build loop and is used for anomaly investigation only. Tool permission switches are fixed by agentloop: claude always gets --dangerously-skip-permissions, and codex always gets --yolo. An omitted model/effort leaves the choice to the tool itself; the default config pins no codex model, because codex model slugs churn (gpt-5 no longer exists). Press ctrl-o in the TUI to pick and persist tool/model/effort per role: edits are written back to the config file immediately and apply to the running loop from its next dispatch.
- Watcher loop: every tick recovers stale queue claims, synthesizes missing work events from backlog/task state, claims due events, and dispatches the correct visible role. Agents do not spawn other public roles directly: if work is needed, state changes become queue events and the watcher decides what to run next. The run parks only when the global gate passes and there is no open, failed, pending, or claimed queue work.
- Gate and acceptance: .agentloop/verify.sh still gates software behavior, so agentloop can target any kind of software. The manager owns the script; its contract is build + the project's test suite, nothing more. A business task's acceptance run passes that task id as $1 (so one task's flaky check can't fail, or force a redesign of, an unrelated task); the per-iteration DONE gate runs with no arguments. Scripts may ignore $1 and always run everything. After the scoped gate passes for a business task, a builder review_acceptance job approves or rejects that task by its acceptance criteria. The gate runs in its own process group with a wall-clock cap (default 30 min, override with AGENTLOOP_GATE_TIMEOUT_SECS) so a hung verify.sh can never hang the loop; a timeout reads as a gate failure (rc 124).
- Caps: max_iterations, max_parallel, item_timeout_sec, total_budget_sec, max_attempts, and max_redesigns (whole-task re-plans; deliberately separate from and higher than the per-builder attempt cap, so a handful of flaky gate runs can't fail a task outright). The loop also stops on a no-progress stall: two consecutive iterations that merge nothing and change no loop-relevant state (gate verdict, backlog/builder statuses and attempts). Iterations where the manager re-scopes a failed task or a builder consumes an attempt count as progress; those are cap-bounded, so they can't keep the loop alive forever.
- Parallelism: independent ready items run concurrently, each in its own git worktree; successful builders are merged back sequentially. Loop-owned tracked changes under .agentloop/ (e.g. the manager's verify.sh rewrites) are auto-committed before integration so they never read as a dirty tree; USER uncommitted changes still block all merges, and that bounce refunds the builder's attempt, since the dirty tree says nothing about the work.
- No-change dones: a builder that verifies its item and finds the acceptance criteria already hold reports done with "no_changes": true and commits nothing. This is accepted as a first-class done (the task-scoped gate still judges the task). A done with no commits and no such declaration is bounced as a lazy builder.
- Merge conflicts: when a builder's branch conflicts on merge, the watcher queues a resolve_merge builder job in the workspace to resolve the conflict and complete the merge, instead of bouncing the item. The merge repair job is unbounded (no attempt cap, no timeout) but is killed when you quit, so it never orphans. If it cannot resolve, the merge is aborted and the item is returned for manager repair.
- Autonomous decisions: a builder that hits a decision writes .agentloop/questions/<id>.json and reports status:"needs_input". The loop answers it immediately with a canned "decide the best option for me — you decide" reply (stored in .agentloop/answers/<id>.json) and re-dispatches the item with the Q&A appended to its prompt. Nothing waits on the user.
- Usage limits: when claude/codex dies on a provider usage/rate limit, agentloop parses the reset time when the output includes one (else waits a fallback window), logs a ⏳ note to the job log, and re-runs the agent automatically. Quitting interrupts the wait.
- Backlog repair: after every manager round the orchestrator drops deps on ids that don't exist in the backlog, re-designs in_progress tasks that lost their plan, and redesigns plans whose remaining items can never dispatch, so the loop never idles while open work exists.
- Add tasks any time: an add-task request is appended to .agentloop/state/requests.jsonl; the manager folds it into the business backlog on its next round (you feed intent; the manager stays the sole owner of the backlog).
- Standby: on completion (or a cap/stall) the interactive run idles in standby instead of exiting; adding a task re-engages it with a fresh budget window. The status bar shows why it parked (done / stall / max_iterations / budget, with open and failed counts).
- Failed tasks hold the run open: a task that exhausts its redesign cap is marked failed, but the run is only DONE when the gate passes and no open or failed items remain. Every failed item, including leaves nothing depends on, is listed in the manager prompt with its failure note, and the manager is required to reshape it into new tasks or drop it, instead of the loop silently ending over (or grinding forever against) abandoned work.
- Re-run = more context: re-running with new goal text (without --fresh) appends it to goal.md and queues it as a pending request, so the manager folds it into the business backlog as new tasks and the loop re-engages instead of reporting an instant "Done, nothing changed."
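To make the gate contract from the list above concrete, here is a minimal, std-only sketch of running a verify script under a wall-clock cap, passing an optional task id as $1 and mapping a timeout to rc 124. run_gate is a hypothetical helper, not agentloop's code. It kills only the direct child, whereas the text says the real gate runs in its own process group, so treat this as a simplification.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Exit code reported when the gate exceeds its wall-clock cap.
const TIMEOUT_RC: i32 = 124;

/// Runs `bash <script> [task_id]` and returns its exit code, or 124 on timeout.
fn run_gate(script: &str, task_id: Option<&str>, cap: Duration) -> io::Result<i32> {
    let mut cmd = Command::new("bash");
    cmd.arg(script);
    if let Some(id) = task_id {
        cmd.arg(id); // visible as $1 inside the script
    }
    cmd.stdin(Stdio::null());
    let mut child = cmd.spawn()?;
    let deadline = Instant::now() + cap;
    loop {
        if let Some(status) = child.try_wait()? {
            // A signal-terminated child has no exit code; report a generic failure.
            return Ok(status.code().unwrap_or(1));
        }
        if Instant::now() >= deadline {
            // Simplification: only the direct child is killed, not a whole process group.
            child.kill()?;
            child.wait()?; // reap the child so it does not linger as a zombie
            return Ok(TIMEOUT_RC);
        }
        std::thread::sleep(Duration::from_millis(50));
    }
}
```

A task-scoped acceptance run would call run_gate(".agentloop/verify.sh", Some(task_id), cap), while the per-iteration DONE gate would pass None. Polling with try_wait keeps the sketch dependency-free; killing an entire process group would need a libc-level call that the standard library does not expose.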
Running in a terminal opens a full-screen panel. A persistent, text-wrapping input bar sits at the bottom of the screen at all times (similar to Claude Code's input). Keys:
- Printable keys always type into the persistent bottom input bar; it wraps long text automatically. shift+enter (or alt+enter) inserts a newline.
- enter: submits the input as a new task for the manager. When the input is empty, enter opens the selected job's detail view (live log tail + a real-time working timer).
- ↑/↓: navigate the jobs list, or scroll the log in the job-detail view
- esc: clear the input bar, or leave the job-detail view
- q: quit (only when the input bar is empty); Ctrl-C always quits
The status bar shows the goal, current iteration, gate state, open-item count, and a
live ⏱ total-run-time readout. When the run parks, the banner says why:
✓ DONE · standby only when everything is done and the gate passes, otherwise
⏸ standby: <reason> (stall / max_iterations / budget, with open and failed counts).
The gate itself appears as a gate job row while verify.sh runs, so a long verify
never looks like a dead loop. (Headless runs print the total elapsed time on exit.)
Long goals are ellipsized so the counters and timer always stay visible.
The main panel is the Jobs list (full width).
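The banner rule above (DONE only when the gate passes and nothing is open or failed, otherwise a reason plus counts) can be sketched as follows. ParkReason and park_banner are hypothetical names, and the "(N open, M failed)" suffix is an assumed layout; the exact wording the TUI prints is not specified here.

```rust
/// Why a run parked without being fully done. Illustrative, not agentloop's own enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ParkReason {
    Stall,
    MaxIterations,
    Budget,
}

/// Builds the standby banner text from the gate verdict and item counts.
fn park_banner(gate_passed: bool, open: usize, failed: usize, reason: ParkReason) -> String {
    // DONE requires a passing gate and no open or failed items at all.
    if gate_passed && open == 0 && failed == 0 {
        return "✓ DONE · standby".to_string();
    }
    let label = match reason {
        ParkReason::Stall => "stall",
        ParkReason::MaxIterations => "max_iterations",
        ParkReason::Budget => "budget",
    };
    // Assumed layout for the open/failed counts.
    format!("⏸ standby: {label} ({open} open, {failed} failed)")
}
```

Keeping the DONE check separate from the park reason mirrors the rule that a failed item holds the run open even when the gate itself passes.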
Nothing the loop produces is deleted:
- .agentloop/state/events.jsonl: append-only history of every dispatch and status transition (bounced/failed/merged/approved/rejected/redesign) with its reason. agentloop --report --workspace <dir> prints all bounced and failed cases plus what is currently failed in the backlog and builder plans.
- .agentloop/state/queue/events.jsonl: disk-backed event queue the watcher claims from. .agentloop/state/queue/dead.jsonl records exhausted queue events that need operational attention.
- .agentloop/logs/iter-N/: per-iteration agent logs, plus each builder's archived result JSON (timestamp-prefixed).
- .agentloop/logs/gate.log: every verify.sh run (timestamp, rc, full output); state/last_gate.txt keeps just the latest.
- .agentloop/state/tasks/<id>/archive/: superseded builder plans, designs, acceptance reviews, and redesign counters.
- .agentloop/logs/answered-<id>-<ts>.json: consumed agent questions.
These files and archive dirs grow without bound by design (they are the audit
trail); prune them manually — or start over with --fresh — if a long-lived
workspace gets too big.
Before running, agentloop verifies that every CLI tool the config routes roles to (claude/codex) is installed, and exits with install instructions otherwise.
src/
  main.rs            binary entry point
  cli.rs             arg parsing (clap), workspace bootstrap, dry-run wiring
  config.rs          Config / Caps / Role deserialization + helpers
  state.rs           backlog.json validate / query / mutate (atomic writes)
  spawn.rs           timeout + claude/codex command building (+ fake-agent hook)
  worktree.rs        worktree create / merge / cleanup
  manager.rs         business backlog prompt + invoke + validate
  watcher.rs         queue-driven run loop, event synthesis, event processors
  queue.rs           disk-backed queue events, claims, retries, dead-lettering
  architect.rs       compatibility wrapper for builder design_task jobs
  worker.rs          builder job-kind prompts + dispatch
  customer.rs        compatibility wrapper for builder review_acceptance jobs
  task_state.rs      task-local design/builders/acceptance state helpers
  events.rs          Reporter trait, Event/Command enums, stderr + channel reporters
  history.rs         append-only state/events.jsonl, artifact archiving, --report
  preflight.rs       startup check that configured agent CLIs are installed
  inbox.rs           question/answer file IO + prior-Q&A prompt block (auto-answered)
  limits.rs          usage/rate-limit detection + auto-continue wait math
  requests.rs        pending user-request log (requests.jsonl) + manager prompt block
  orchestrator.rs    legacy iteration helpers, gate, merge, repair, standby machine
  tui.rs             ratatui view-model (events -> state, keys -> commands) + render
  app.rs             wires orchestrator + TUI over channels; TTY vs headless dispatch
  bin/fake_agent.rs  offline stub used by tests
templates/
  master.md          embedded default master status board
tests/               offline integration suite (fake_agent, scripted stub, no tokens)
Releases are cut by pushing to the production branch:
- Bump version in Cargo.toml (e.g. 0.1.0 -> 0.1.1) and merge to main.
- Fast-forward/merge main into production and push:
  git push origin main:production
- The release workflow reads version from Cargo.toml, creates and pushes the tag v{version}, builds agentloop for each supported target, and publishes a GitHub Release with the agentloop-<target>.tar.gz assets.
If the tag v{version} already exists, the workflow no-ops — bump the version to
cut a new release. install.sh always fetches the latest release.
cargo test # offline; uses the in-crate fake_agent + scripted stub, no tokens spent