diff --git a/README.md b/README.md index b0b012b..afe981d 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,14 @@
-# UseZombie Docs + + + + usezombie + -**Documentation site for UseZombie — submit a spec, get a validated PR.** +**Heroku for agents, but the agent never sees your keys.** + +**Run your agents 24/7. Walled, watched, killable.** [![Docs](https://img.shields.io/badge/UseZombie-Docs-blue?style=for-the-badge)](https://docs.usezombie.com) [![Try Free](https://img.shields.io/badge/UseZombie-Try_Free-brightgreen?style=for-the-badge)](https://usezombie.com) diff --git a/how-it-works.mdx b/how-it-works.mdx index 89e929c..76aa945 100644 --- a/how-it-works.mdx +++ b/how-it-works.mdx @@ -1,87 +1,72 @@ --- title: "How it works" -description: "The spec-to-PR lifecycle with self-repairing agents." +description: "The UseZombie agent hosting model — credential firewall, always-on execution, and observability." --- -## The spec-to-PR lifecycle +## The agent hosting model -UseZombie turns a markdown spec into a validated pull request through a deterministic pipeline: validate, implement, gate, score, ship. +UseZombie sits between your agent and the outside world. You bring the agent logic. We provide the runtime: a sandboxed process, a credential firewall, wired webhooks, and a kill switch. ```mermaid flowchart LR - A[Write spec] --> B[Submit via CLI/API] - B --> C[Validate spec] - C --> D[Agent implements] - D --> E[Gate loop] - E -->|Pass| F[Score run] - E -->|Fail| G[Self-repair] - G --> E - F --> H[Open PR with scorecard] + A[Your agent code] --> B[UseZombie sandbox] + B --> C[Credential firewall] + C --> D[External APIs / LLMs] + E[Webhooks: GitHub, Slack, email] --> B + F[zombiectl / Mission Control] --> B ``` +Your agent never sees raw credentials. It makes requests. The firewall intercepts them, injects the right token, and forwards. Audit logs record every action. + ## Step by step - - A spec is a markdown file describing what you want built. It can follow any format — structured sections, free-form prose, bullet lists. 
The agent reads natural language and infers intent from your codebase context. - - You describe **what** to build. The agent figures out **how**. + + Push your agent code to a workspace. UseZombie wraps it in a sandboxed process with resource limits (CPU, memory, wall time). The agent starts running immediately and restarts automatically on crash. - - Submit via `zombiectl run --spec ` or the REST API (`POST /v1/runs`). On submission, UseZombie validates that referenced files exist in the workspace, deduplicates against in-flight runs, and enqueues the work. + + Add API keys, tokens, and secrets to the workspace credential store via `zombiectl skill-secret put` or Mission Control. Credentials are encrypted at rest and never passed into the sandbox. - - The NullClaw agent runtime picks up the run and works inside an **isolated git worktree** — a fresh working directory branched from your default branch. The agent receives the spec plus injected codebase context (relevant file contents, module structure) to produce an accurate implementation. - - No changes touch your main branch until you approve the PR. - - - - After implementation, the agent runs your project's standard validation gates in sequence: - - 1. `make lint` — linting and type checks - 2. `make test` — unit tests - 3. `make build` — production build - - If any gate fails, the agent reads the error output, diagnoses the issue, and self-repairs. This loop runs up to **3 times** by default. If all repair attempts fail, the run is marked `FAILED` with full error context. + + When your agent makes an outbound request, the firewall intercepts it, matches the target against your credential policy, and injects the token before forwarding. The agent code never contains a key — it just makes requests. - The repair limit is configurable per agent profile. See [Gate loop](/runs/gate-loop) for details. + This is the core security guarantee: credential injection happens at the network boundary, outside the sandbox. 
A compromised agent cannot exfiltrate credentials it never received. - - Every completed run receives a **scorecard** with four weighted dimensions: - - | Dimension | Weight | What it measures | - |-----------|--------|------------------| - | Completion | 40% | Did the agent implement everything the spec asked for? | - | Error rate | 30% | How many gate failures occurred before passing? | - | Latency | 20% | Wall-clock time from enqueue to PR. | - | Resource efficiency | 10% | Token and compute usage relative to task complexity. | - - Scores map to tiers: - - | Tier | Score range | - |------|-------------| - | Bronze | 0 -- 39 | - | Silver | 40 -- 69 | - | Gold | 70 -- 89 | - | Elite | 90 -- 100 | + + Register webhook sources (GitHub, Slack, email, custom HTTP) on the workspace. UseZombie provides a stable inbound endpoint and routes matching events to your agent process. No tunneling, no port forwarding, no custom servers. - - The agent pushes a branch named `zombie//` and opens a pull request on GitHub. The PR body contains an agent-generated explanation of what was implemented and why. A scorecard comment is posted with the quality metrics. - - From here, it's a normal code review. Approve, request changes, or close. + + Every agent action is timestamped in the audit log: what ran, when, what it called, and what it cost. Budget alerts fire before you hit limits. The kill switch stops any agent mid-action from the CLI or dashboard. -## Runtime architecture +## Credential firewall architecture -Under the hood, the CLI, API server, queue, worker, and executor coordinate to move a run from submission to PR. +```mermaid +sequenceDiagram + participant A as Agent (sandbox) + participant F as Credential firewall + participant V as Credential store + participant E as External API + + A->>F: POST api.openai.com/v1/chat/completions + F->>V: lookup(workspace_id, target_host) + V-->>F: Bearer sk-... + F->>E: POST api.openai.com/v1/chat/completions
Authorization: Bearer sk-... + E-->>F: 200 OK + F-->>A: 200 OK + Note over F: audit_log.append(action, ts, cost_tokens) +``` + +The agent makes a plain HTTP request. The firewall resolves the right credential from the store, injects it, and forwards. The agent receives the response. The credential value never crosses the sandbox boundary. + +## Runtime architecture ```mermaid sequenceDiagram @@ -89,63 +74,35 @@ sequenceDiagram participant API as zombied API participant Q as Redis Streams participant W as zombied worker - participant E as zombied-executor - participant GH as GitHub + participant S as Sandbox process + participant F as Credential firewall - CLI->>API: POST /v1/runs (spec) - API->>Q: enqueue run_id + CLI->>API: POST /v1/agents (agent config) + API->>Q: enqueue agent_start Q->>W: claim work - W->>E: StartStage (agent config, tools) - E->>E: NullClaw agent implements - E->>W: ExecutionResult - W->>W: Gate loop (lint/test/build) - W->>GH: push branch + open PR - W->>GH: post scorecard comment + W->>S: spawn sandboxed process + S->>F: outbound requests (no credentials) + F->>F: inject credentials + log + W->>CLI: SSE: status, logs, cost ``` **Component responsibilities:** -- **zombiectl** — CLI client. Submits specs, checks status, streams logs. -- **zombied API** — HTTP server. Validates specs, manages runs and workspaces, serves the dashboard. +- **`zombiectl`** — CLI client. Deploys agents, checks status, manages secrets, streams logs. +- **`zombied` API** — HTTP server. Manages agent lifecycle, credential store, webhook routing, billing. - **Redis Streams** — Work queue. Durable, ordered, with consumer group semantics for worker fleet scaling. -- **zombied worker** — Claim runs, orchestrate the gate loop, push results to GitHub. Supports drain and rolling deploys. -- **zombied-executor** — Sidecar process that owns the sandbox lifecycle. Spawns NullClaw agents, manages worktrees, enforces resource limits. -- **GitHub** — Target forge. 
Branch push, PR creation, scorecard comments. - -## Agent relay model +- **`zombied` worker** — Owns the sandbox lifecycle. Spawns agents, enforces resource limits, handles restarts. +- **Credential firewall** — Network-layer proxy. Intercepts outbound requests, injects credentials, records audit logs. -For lightweight, interactive agent sessions (`spec init`, `run --preview`), UseZombie uses a different execution model: the **agent relay**. Instead of queuing work for a sandbox, `zombied` acts as a stateless pass-through between the CLI and the workspace's LLM provider. - -```mermaid -sequenceDiagram - participant CLI as zombiectl - participant API as zombied API - participant LLM as LLM Provider - - CLI->>API: POST /v1/agent/stream (mode, tools, messages) - API->>LLM: Forward with system prompt + API key - LLM-->>API: tool_use: list_dir(".") - API-->>CLI: SSE: event: tool_use - Note over CLI: Executes locally on laptop - CLI->>API: POST /v1/agent/stream (messages + tool_result) - API->>LLM: Forward accumulated messages - LLM-->>API: tool_use: read_file("go.mod") - API-->>CLI: SSE: event: tool_use - Note over CLI: Reads file locally - CLI->>API: POST /v1/agent/stream (messages + tool_result) - API->>LLM: Forward accumulated messages - LLM-->>API: text: "# M5_001: Rate Limiting..." 
- API-->>CLI: SSE: event: text_delta + done {usage} -``` +## Spend control -**Key differences from the pipeline model:** +Every workspace has configurable limits that prevent runaway costs: -| | Pipeline (full runs) | Agent relay (spec init, preview) | -|---|---|---| -| **Execution** | Sandbox on worker, queued | Direct handler, no queue | -| **File access** | Agent reads files in sandbox | CLI reads files locally, sends to model on demand | -| **Duration** | 1-5 minutes | 3-8 seconds | -| **State** | Durable (DB + Redis) | Stateless (CLI manages conversation) | -| **Provider** | Configured per workspace | Same, resolved by `zombied` | +| Control | What it does | +|---------|-------------| +| Token budget | Max tokens per agent execution window | +| Wall time limit | Max wall-clock time before forced stop | +| Cost ceiling | Max USD spend per billing period | +| Kill switch | Manual stop from CLI or Mission Control at any time | -The relay model is inspired by how Claude Code and OpenCode work: the CLI holds tool definitions, the model calls them on demand, and the API layer is a relay. The difference is `zombied` sits between CLI and provider because API keys are server-side secrets managed per-workspace. +When a limit is hit, the agent receives a graceful shutdown signal. The audit log records the reason. No surprises on the invoice. diff --git a/index.mdx b/index.mdx index 3d54db9..cb8d777 100644 --- a/index.mdx +++ b/index.mdx @@ -1,53 +1,65 @@ --- title: Introduction -description: Submit a spec. Get a validated PR. +description: Run your agents 24/7. Walled, watched, killable. --- 🧟 **Early Access Preview** · Pre-release — revised release coming up by April 11. APIs, CLI, and behavior may change without notice before general availability. + - UseZombie is in a product pivot. The focus is practical operator leverage, not tunnel-vision optimization around one narrow bottleneck that frontier models may erase soon. +## Why we pivoted - Submit a spec. 
An agent implements it, self-repairs until quality gates pass, and opens a PR with a scorecard. You review one PR instead of babysitting ten agent sessions. +We started by obsessing over one thing: making AI-generated code *correct*. Self-repair loops, quality scoring, scorecard evidence. Good problems — but narrow ones. Frontier models get better every quarter, and the gap we were optimizing for keeps shrinking on its own. - [Join the waitlist →](https://usezombie.com) - +The tempting move is to double down — we built it, so we stick with it. But code is cattle, not pets. You don't keep a solution alive just because you wrote it. -## Getting started +So we killed the old approach and pivoted. UseZombie is now a runtime for always-on agents — you bring your agent, we handle the credentials (hidden from the sandbox, injected at the firewall), webhooks (wired automatically), audit logs (every action timestamped), and a kill switch. Your agent runs 24/7 without ever seeing a password. -UseZombie turns markdown specs into validated pull requests with self-repairing agents, run quality scoring, and evidence-backed scorecards so teams ship with confidence, not babysitting. +## What UseZombie does - - Install the CLI and submit your first spec in under 5 minutes. - +UseZombie is agent hosting infrastructure. Connect your agents, and we handle credentials, webhooks, and walls — your agents run 24/7 without ever seeing your keys. -## Make it yours + + Connect your agent and have it running in minutes. + -Connect your GitHub repo, configure your workspace, and let agents ship validated PRs while you focus on specs and reviews. +## What zombies do - - Install zombiectl via npm, npx, or the install script. + + Your agent runs continuously in a sandboxed process and restarts on crash. No babysitting. - - Understand the spec-to-PR lifecycle with self-repair gate loops. + + Agents never see tokens. The firewall injects them per-request, outside the sandbox boundary. 
- - Author specs in markdown. Any format works. + + Receive events from email, Slack, GitHub, and more — without ngrok or custom servers. - - View runs, scorecards, and workspace settings in the dashboard. + + See what your agent did, when, why, and how much it cost — before the invoice surprises you. + + + Per-run token budgets and wall time limits. One bad prompt never becomes an infinite burn. + + + Stop any agent mid-action from the CLI or Mission Control. ## Explore the docs + + Understand the agent hosting model and credential firewall. + + + Agents, workspaces, credential walls, webhooks, and observability. + Full command reference for zombiectl. - REST API for programmatic spec submission and run management. + REST API for agent management and event ingestion. Deploy and operate the UseZombie control plane.