10 changes: 8 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,14 @@
<div align="center">

# UseZombie Docs
<picture>
<source media="(prefers-color-scheme: dark)" srcset="logo/light.svg" />
<source media="(prefers-color-scheme: light)" srcset="logo/dark.svg" />
<img src="logo/dark.svg" width="200" alt="usezombie" />
</picture>

**Documentation site for UseZombie — submit a spec, get a validated PR.**
**Heroku for agents, but the agent never sees your keys.**

**Run your agents 24/7. Walled, watched, killable.**

[![Docs](https://img.shields.io/badge/UseZombie-Docs-blue?style=for-the-badge)](https://docs.usezombie.com)
[![Try Free](https://img.shields.io/badge/UseZombie-Try_Free-brightgreen?style=for-the-badge)](https://usezombie.com)
167 changes: 62 additions & 105 deletions how-it-works.mdx
@@ -1,151 +1,108 @@
---
title: "How it works"
description: "The spec-to-PR lifecycle with self-repairing agents."
description: "The UseZombie agent hosting model — credential firewall, always-on execution, and observability."
---

## The spec-to-PR lifecycle
## The agent hosting model

UseZombie turns a markdown spec into a validated pull request through a deterministic pipeline: validate, implement, gate, score, ship.
UseZombie sits between your agent and the outside world. You bring the agent logic. We provide the runtime: a sandboxed process, a credential firewall, wired webhooks, and a kill switch.

```mermaid
flowchart LR
A[Write spec] --> B[Submit via CLI/API]
B --> C[Validate spec]
C --> D[Agent implements]
D --> E[Gate loop]
E -->|Pass| F[Score run]
E -->|Fail| G[Self-repair]
G --> E
F --> H[Open PR with scorecard]
A[Your agent code] --> B[UseZombie sandbox]
B --> C[Credential firewall]
C --> D[External APIs / LLMs]
E[Webhooks: GitHub, Slack, email] --> B
F[zombiectl / Mission Control] --> B
```

Your agent never sees raw credentials. It makes requests. The firewall intercepts them, injects the right token, and forwards. Audit logs record every action.

## Step by step

<Steps>
<Step title="Write a spec">
A spec is a markdown file describing what you want built. It can follow any format — structured sections, free-form prose, bullet lists. The agent reads natural language and infers intent from your codebase context.

You describe **what** to build. The agent figures out **how**.
<Step title="Connect your agent">
Push your agent code to a workspace. UseZombie wraps it in a sandboxed process with resource limits (CPU, memory, wall time). The agent starts running immediately and restarts automatically on crash.
</Step>

<Step title="Submit">
Submit via `zombiectl run --spec <path>` or the REST API (`POST /v1/runs`). On submission, UseZombie validates that referenced files exist in the workspace, deduplicates against in-flight runs, and enqueues the work.
<Step title="Store credentials — once">
Add API keys, tokens, and secrets to the workspace credential store via `zombiectl skill-secret put` or Mission Control. Credentials are encrypted at rest and never passed into the sandbox.
</Step>

<Step title="Agent implements">
The NullClaw agent runtime picks up the run and works inside an **isolated git worktree** — a fresh working directory branched from your default branch. The agent receives the spec plus injected codebase context (relevant file contents, module structure) to produce an accurate implementation.

No changes touch your main branch until you approve the PR.
</Step>

<Step title="Gate loop">
After implementation, the agent runs your project's standard validation gates in sequence:

1. `make lint` — linting and type checks
2. `make test` — unit tests
3. `make build` — production build

If any gate fails, the agent reads the error output, diagnoses the issue, and self-repairs. This loop runs up to **3 times** by default. If all repair attempts fail, the run is marked `FAILED` with full error context.
<Step title="Firewall injects credentials per-request">
When your agent makes an outbound request, the firewall intercepts it, matches the target against your credential policy, and injects the token before forwarding. The agent code never contains a key — it just makes requests.

<Info>
The repair limit is configurable per agent profile. See [Gate loop](/runs/gate-loop) for details.
This is the core security guarantee: credential injection happens at the network boundary, outside the sandbox. A compromised agent cannot exfiltrate credentials it never received.
</Info>
</Step>

<Step title="Score">
Every completed run receives a **scorecard** with four weighted dimensions:

| Dimension | Weight | What it measures |
|-----------|--------|------------------|
| Completion | 40% | Did the agent implement everything the spec asked for? |
| Error rate | 30% | How many gate failures occurred before passing? |
| Latency | 20% | Wall-clock time from enqueue to PR. |
| Resource efficiency | 10% | Token and compute usage relative to task complexity. |

Scores map to tiers:

| Tier | Score range |
|------|-------------|
| Bronze | 0 -- 39 |
| Silver | 40 -- 69 |
| Gold | 70 -- 89 |
| Elite | 90 -- 100 |
<Step title="Webhooks arrive without ngrok">
Register webhook sources (GitHub, Slack, email, custom HTTP) on the workspace. UseZombie provides a stable inbound endpoint and routes matching events to your agent process. No tunneling, no port forwarding, no custom servers.
</Step>

<Step title="PR">
The agent pushes a branch named `zombie/<run_id>/<slug>` and opens a pull request on GitHub. The PR body contains an agent-generated explanation of what was implemented and why. A scorecard comment is posted with the quality metrics.

From here, it's a normal code review. Approve, request changes, or close.
<Step title="Observe and control">
Every agent action is timestamped in the audit log: what ran, when, what it called, and what it cost. Budget alerts fire before you hit limits. The kill switch stops any agent mid-action from the CLI or dashboard.
</Step>
</Steps>
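
The webhook step above can be pictured as a small routing table keyed by workspace and source. This is a minimal sketch, not UseZombie's actual implementation: `WebhookRouter`, `register`, and `dispatch` are hypothetical names chosen for illustration, and a real router would also verify signatures and deliver events across a process boundary.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class WebhookRouter:
    """Hypothetical router: maps (workspace, source) to agent handlers."""

    def __init__(self) -> None:
        self._routes: dict[tuple[str, str], list[Handler]] = defaultdict(list)

    def register(self, workspace_id: str, source: str, handler: Handler) -> None:
        # Register an agent handler for one webhook source, e.g. "github".
        self._routes[(workspace_id, source)].append(handler)

    def dispatch(self, workspace_id: str, source: str, event: dict) -> int:
        # Deliver the event to every matching handler; return how many ran.
        handlers = self._routes.get((workspace_id, source), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Usage: an agent that only records the events it receives.
received: list[dict] = []
router = WebhookRouter()
router.register("ws_1", "github", received.append)
delivered = router.dispatch("ws_1", "github", {"action": "opened"})
```

Events from a source with no registered handler are simply dropped in this sketch; whether the real service queues or rejects them is not stated in the docs.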

## Runtime architecture
## Credential firewall architecture

Under the hood, the CLI, API server, queue, worker, and executor coordinate to move a run from submission to PR.
```mermaid
sequenceDiagram
participant A as Agent (sandbox)
participant F as Credential firewall
participant V as Credential store
participant E as External API

A->>F: POST api.openai.com/v1/chat/completions
F->>V: lookup(workspace_id, target_host)
V-->>F: Bearer sk-...
F->>E: POST api.openai.com/v1/chat/completions<br/>Authorization: Bearer sk-...
E-->>F: 200 OK
F-->>A: 200 OK
Note over F: audit_log.append(action, ts, cost_tokens)
```

The agent makes a plain HTTP request. The firewall resolves the right credential from the store, injects it, and forwards. The agent receives the response. The credential value never crosses the sandbox boundary.
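
The injection flow in the diagram can be sketched in a few lines. This is an illustrative model under stated assumptions, not the real firewall: `CredentialFirewall`, its in-memory `store`, and the audit entry fields are hypothetical, and the sketch only computes the outbound headers rather than proxying traffic.

```python
from dataclasses import dataclass, field
from time import time
from urllib.parse import urlsplit


@dataclass
class CredentialFirewall:
    # Hypothetical store: (workspace_id, target_host) -> Authorization value.
    store: dict[tuple[str, str], str]
    audit_log: list[dict] = field(default_factory=list)

    def forward(self, workspace_id: str, method: str, url: str,
                headers: dict[str, str]) -> dict[str, str]:
        """Return the headers the firewall would send upstream."""
        host = urlsplit(url).hostname or ""
        token = self.store.get((workspace_id, host))
        outbound = dict(headers)  # copy: the agent's own headers stay untouched
        if token is not None:
            outbound["Authorization"] = token
        # Record the action without ever writing the credential value.
        self.audit_log.append(
            {"action": f"{method} {url}", "ts": time(), "injected": token is not None}
        )
        return outbound


# Usage: the agent sends no credential; the firewall adds it on the way out.
firewall = CredentialFirewall(store={("ws_1", "api.openai.com"): "Bearer sk-test"})
agent_headers = {"Content-Type": "application/json"}
outbound_headers = firewall.forward(
    "ws_1", "POST", "https://api.openai.com/v1/chat/completions", agent_headers
)
```

Copying the headers before injection is the point of the design: the object the agent holds never gains an `Authorization` entry, which mirrors the claim that the credential value stays outside the sandbox.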

## Runtime architecture

```mermaid
sequenceDiagram
participant CLI as zombiectl
participant API as zombied API
participant Q as Redis Streams
participant W as zombied worker
participant E as zombied-executor
participant GH as GitHub
participant S as Sandbox process
participant F as Credential firewall

CLI->>API: POST /v1/runs (spec)
API->>Q: enqueue run_id
CLI->>API: POST /v1/agents (agent config)
API->>Q: enqueue agent_start
Q->>W: claim work
W->>E: StartStage (agent config, tools)
E->>E: NullClaw agent implements
E->>W: ExecutionResult
W->>W: Gate loop (lint/test/build)
W->>GH: push branch + open PR
W->>GH: post scorecard comment
W->>S: spawn sandboxed process
S->>F: outbound requests (no credentials)
F->>F: inject credentials + log
W->>CLI: SSE: status, logs, cost
```

**Component responsibilities:**

- **zombiectl** — CLI client. Submits specs, checks status, streams logs.
- **zombied API** — HTTP server. Validates specs, manages runs and workspaces, serves the dashboard.
- **`zombiectl`** — CLI client. Deploys agents, checks status, manages secrets, streams logs.
- **`zombied` API** — HTTP server. Manages agent lifecycle, credential store, webhook routing, billing.
- **Redis Streams** — Work queue. Durable, ordered, with consumer group semantics for worker fleet scaling.
- **zombied worker** — Claim runs, orchestrate the gate loop, push results to GitHub. Supports drain and rolling deploys.
- **zombied-executor** — Sidecar process that owns the sandbox lifecycle. Spawns NullClaw agents, manages worktrees, enforces resource limits.
- **GitHub** — Target forge. Branch push, PR creation, scorecard comments.

## Agent relay model
- **`zombied` worker** — Owns the sandbox lifecycle. Spawns agents, enforces resource limits, handles restarts.
- **Credential firewall** — Network-layer proxy. Intercepts outbound requests, injects credentials, records audit logs.

For lightweight, interactive agent sessions (`spec init`, `run --preview`), UseZombie uses a different execution model: the **agent relay**. Instead of queuing work for a sandbox, `zombied` acts as a stateless pass-through between the CLI and the workspace's LLM provider.

```mermaid
sequenceDiagram
participant CLI as zombiectl
participant API as zombied API
participant LLM as LLM Provider

CLI->>API: POST /v1/agent/stream (mode, tools, messages)
API->>LLM: Forward with system prompt + API key
LLM-->>API: tool_use: list_dir(".")
API-->>CLI: SSE: event: tool_use
Note over CLI: Executes locally on laptop
CLI->>API: POST /v1/agent/stream (messages + tool_result)
API->>LLM: Forward accumulated messages
LLM-->>API: tool_use: read_file("go.mod")
API-->>CLI: SSE: event: tool_use
Note over CLI: Reads file locally
CLI->>API: POST /v1/agent/stream (messages + tool_result)
API->>LLM: Forward accumulated messages
LLM-->>API: text: "# M5_001: Rate Limiting..."
API-->>CLI: SSE: event: text_delta + done {usage}
```
## Spend control

**Key differences from the pipeline model:**
Every workspace has configurable limits that prevent runaway costs:

| | Pipeline (full runs) | Agent relay (spec init, preview) |
|---|---|---|
| **Execution** | Sandbox on worker, queued | Direct handler, no queue |
| **File access** | Agent reads files in sandbox | CLI reads files locally, sends to model on demand |
| **Duration** | 1-5 minutes | 3-8 seconds |
| **State** | Durable (DB + Redis) | Stateless (CLI manages conversation) |
| **Provider** | Configured per workspace | Same, resolved by `zombied` |
| Control | What it does |
|---------|-------------|
| Token budget | Max tokens per agent execution window |
| Wall time limit | Max wall-clock time before forced stop |
| Cost ceiling | Max USD spend per billing period |
| Kill switch | Manual stop from CLI or Mission Control at any time |

The relay model is inspired by how Claude Code and OpenCode work: the CLI holds tool definitions, the model calls them on demand, and the API layer is a relay. The difference is `zombied` sits between CLI and provider because API keys are server-side secrets managed per-workspace.
When a limit is hit, the agent receives a graceful shutdown signal. The audit log records the reason. No surprises on the invoice.
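
The limit checks in the table can be sketched as a single function that reports which control tripped, so the reason can be written to the audit log. The names here (`SpendLimits`, `Usage`, `breached_limit`) and the check order are assumptions for illustration; the docs do not specify how UseZombie evaluates limits internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpendLimits:
    token_budget: int        # max tokens per execution window
    wall_time_s: float       # max wall-clock seconds before forced stop
    cost_ceiling_usd: float  # max USD spend per billing period


@dataclass
class Usage:
    tokens: int = 0
    elapsed_s: float = 0.0
    cost_usd: float = 0.0


def breached_limit(limits: SpendLimits, usage: Usage,
                   kill_requested: bool = False) -> Optional[str]:
    """Return the name of the first control that requires a stop, else None."""
    if kill_requested:
        return "kill_switch"
    if usage.tokens >= limits.token_budget:
        return "token_budget"
    if usage.elapsed_s >= limits.wall_time_s:
        return "wall_time_limit"
    if usage.cost_usd >= limits.cost_ceiling_usd:
        return "cost_ceiling"
    return None


# Usage: an agent that has exhausted its token budget.
limits = SpendLimits(token_budget=100_000, wall_time_s=3600.0, cost_ceiling_usd=50.0)
reason = breached_limit(limits, Usage(tokens=100_000, elapsed_s=120.0, cost_usd=4.2))
```

A supervisor would call a check like this periodically and, on a non-`None` result, send the graceful shutdown signal and log the returned reason.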
54 changes: 33 additions & 21 deletions index.mdx
@@ -1,53 +1,65 @@
---
title: Introduction
description: Submit a spec. Get a validated PR.
description: Run your agents 24/7. Walled, watched, killable.
---

<Tip>
🧟 **Early Access Preview** · Pre-release — revised release coming up by April 11. APIs, CLI, and behavior may change without notice before general availability.
</Tip>

UseZombie is in a product pivot. The focus is practical operator leverage, not tunnel-vision optimization around one narrow bottleneck that frontier models may erase soon.
## Why we pivoted

Submit a spec. An agent implements it, self-repairs until quality gates pass, and opens a PR with a scorecard. You review one PR instead of babysitting ten agent sessions.
We started by obsessing over one thing: making AI-generated code *correct*. Self-repair loops, quality scoring, scorecard evidence. Good problems — but narrow ones. Frontier models get better every quarter, and the gap we were optimizing for keeps shrinking on its own.

[Join the waitlist →](https://usezombie.com)
</Tip>
The tempting move is to double down — we built it, so we stick with it. But code is cattle, not pets. You don't keep a solution alive just because you wrote it.

## Getting started
So we killed the old approach and pivoted. UseZombie is now a runtime for always-on agents — you bring your agent, we handle the credentials (hidden from the sandbox, injected at the firewall), webhooks (wired automatically), audit logs (every action timestamped), and a kill switch. Your agent runs 24/7 without ever seeing a password.

UseZombie turns markdown specs into validated pull requests with self-repairing agents, run quality scoring, and evidence-backed scorecards so teams ship with confidence, not babysitting.
## What UseZombie does

<Card title="Start here" icon="rocket" href="/quickstart" horizontal>
Install the CLI and submit your first spec in under 5 minutes.
</Card>
UseZombie is agent hosting infrastructure. Connect your agents, and we handle credentials, webhooks, and walls — your agents run 24/7 without ever seeing your keys.

## Make it yours
<Card title="Get started" icon="rocket" href="/quickstart" horizontal>
Connect your agent and have it running in minutes.
</Card>

Connect your GitHub repo, configure your workspace, and let agents ship validated PRs while you focus on specs and reviews.
## What zombies do

<Columns cols={2}>
<Card title="Install the CLI" icon="terminal" href="/cli/install">
Install zombiectl via npm, npx, or the install script.
<Card title="Always-on agents" icon="infinity">
Your agent runs continuously in a sandboxed process and restarts on crash. No babysitting.
</Card>
<Card title="How it works" icon="diagram-project" href="/how-it-works">
Understand the spec-to-PR lifecycle with self-repair gate loops.
<Card title="Credentials hidden" icon="shield-halved">
Agents never see tokens. The firewall injects them per-request, outside the sandbox boundary.
</Card>
<Card title="Write a spec" icon="file-lines" href="/specs/writing-specs">
Author specs in markdown. Any format works.
<Card title="Webhooks wired" icon="webhook">
Receive events from email, Slack, GitHub, and more — without ngrok or custom servers.
</Card>
<Card title="Mission Control" icon="gauge" href="https://app.usezombie.com">
View runs, scorecards, and workspace settings in the dashboard.
<Card title="Observability" icon="chart-line">
See what your agent did, when, why, and how much it cost — before the invoice surprises you.
</Card>
<Card title="Spend ceiling" icon="gauge-high">
Per-run token budgets and wall time limits. One bad prompt never becomes an infinite burn.
</Card>
<Card title="Kill switch" icon="power-off">
Stop any agent mid-action from the CLI or Mission Control.
</Card>
</Columns>

## Explore the docs

<Columns cols={2}>
<Card title="How it works" icon="diagram-project" href="/how-it-works">
Understand the agent hosting model and credential firewall.
</Card>
<Card title="Key concepts" icon="book" href="/concepts">
Agents, workspaces, credential walls, webhooks, and observability.
</Card>
<Card title="CLI reference" icon="rectangle-terminal" href="/cli/zombiectl">
Full command reference for zombiectl.
</Card>
<Card title="API reference" icon="code" href="/api-reference/introduction">
REST API for programmatic spec submission and run management.
REST API for agent management and event ingestion.
</Card>
<Card title="Operator guide" icon="server" href="/operator/deployment/architecture">
Deploy and operate the UseZombie control plane.