Website · Documentation · Ask Tracy
AI agents are opaque. You deploy an OpenClaw agent, it runs autonomously, and you hope it works. When something goes wrong, you have no idea:
- Why did this run cost $4.70 when it usually costs $0.30? Was it a runaway context window? A retry loop? A wrong model?
- Why did the agent fail silently? Which step broke? What was the input that caused it?
- Is the agent drifting? Is it doing the same work as last week, or has behavior changed?
- Where is the bottleneck? Which step takes 80% of the wall clock time?
You end up reading raw logs, guessing at token counts, and manually tracing execution paths. This doesn't scale.
ClawTrace is the observability platform for OpenClaw agents. It captures every trajectory (a complete agent execution from start to finish), breaks it down into spans (individual LLM calls, tool executions, sub-agent delegations), and gives you the tools to understand what happened, why, and what to fix.
Tracy, your OpenClaw Doctor Agent. Every other observability tool gives you dashboards and expects you to interpret the data yourself. ClawTrace includes Tracy, an AI analyst that lives inside the platform. Ask Tracy a question in natural language, and she queries your trajectory data in real time, generates charts, spots anomalies, and delivers specific optimization recommendations.
| Feature | Description |
|---|---|
| Execution Path | Interactive trace tree showing every LLM call, tool use, and sub-agent delegation with full input/output payloads |
| Call Graph | Node-link diagram visualizing relationships between agents, tools, and models |
| Timeline | Gantt chart revealing parallelism, bottlenecks, and idle gaps |
| Cost Estimation | Per-span cost calculation with 80+ model pricing entries covering OpenAI, Anthropic, Google, DeepSeek, Mistral, Qwen, GLM, Kimi, and more. Cache-aware pricing (fresh input vs cached input vs output) |
| Ask Tracy | Conversational AI analyst that queries your trajectory graph, generates ECharts visualizations, and provides actionable recommendations |
| Consumption Billing | Pay for what you use with credits. No seat-based subscriptions |
Every trajectory can be inspected through three complementary views. Click any step in any view to open the Step Detail panel with full input/output payloads, token counts, duration, cost estimate, and error details.
The execution path renders the complete agent run as a collapsible tree. Each node represents one step the agent took: a session start, an LLM inference, a tool execution, or a sub-agent delegation. Hierarchy lines show parent-child relationships. Metadata badges on each node display the model name, duration, token counts, and estimated cost. Nodes with errors are highlighted with a red border. You can expand and collapse subtrees to focus on the part of the execution that matters.
The call graph shows the trajectory as an interactive force-directed node-link diagram. Each unique actor (agent session, LLM model, tool) appears as a node. Edges represent calls between them. Node size reflects how many times that actor was invoked. This view is ideal for understanding the shape of a complex multi-agent run at a glance: which models were used, which tools were called, and how they connect.
The timeline presents a horizontal Gantt chart of every span in the trajectory. Each bar represents one step, positioned by its start time and sized by its duration. Bars are color-coded by step type (LLM call, tool call, sub-agent, session). This view makes it easy to spot bottlenecks (the longest bars), parallelism (overlapping bars), and idle gaps (empty space between bars) that reveal optimization opportunities.
Install the plugin, run setup, and restart the gateway:

```bash
openclaw plugins install @epsilla/clawtrace
openclaw clawtrace setup
openclaw gateway restart
```

That's it. Every trajectory now streams to ClawTrace automatically.
Visit clawtrace.ai to sign up and get 100 free credits. Refer a friend and you both get 200 bonus credits.
```mermaid
graph TB
    subgraph Agent Runtime
        OC[OpenClaw Agent]
        PLG["@epsilla/clawtrace plugin<br/>8 hook types"]
    end
    subgraph Ingest Layer
        ING[Ingest Service<br/>FastAPI + Cloud Storage]
    end
    subgraph Data Lake
        RAW[Raw JSON Events<br/>Azure Blob / GCS / S3]
        DBX[Databricks Lakeflow<br/>SQL Pipeline]
        ICE[Iceberg Silver Tables<br/>events_all, pg_traces,<br/>pg_spans, pg_agents]
    end
    subgraph Graph Layer
        PG[PuppyGraph<br/>Cypher over Delta Lake]
    end
    subgraph Backend Services
        API[Backend API<br/>FastAPI + asyncpg]
        PAY[Payment Service<br/>Credits + Stripe]
        MCP[Tracy MCP Server<br/>Cypher queries]
    end
    subgraph AI Layer
        TRACY[Tracy Agent<br/>Anthropic Managed Harness<br/>Claude Sonnet 4.6]
    end
    subgraph Frontend
        UI[ClawTrace UI<br/>Next.js 15 + React 19]
        DOCS[Documentation<br/>Server-rendered Markdown]
    end
    subgraph External
        NEON[(Neon PostgreSQL<br/>Users, API Keys,<br/>Credits, Sessions)]
        STRIPE[Stripe<br/>Payments]
    end
    OC --> PLG
    PLG -->|"POST /v1/traces/events"| ING
    ING --> RAW
    RAW --> DBX
    DBX --> ICE
    ICE --> PG
    PG -->|Cypher| API
    PG -->|Cypher| MCP
    API --> NEON
    PAY --> NEON
    PAY --> STRIPE
    MCP -->|tool results| TRACY
    TRACY -->|SSE stream| API
    UI -->|REST API| API
    UI -->|SSE| API
    API -->|deficit check| PAY
```
- Capture: The `@epsilla/clawtrace` plugin intercepts 8 OpenClaw hook types: `session_start`, `session_end`, `llm_input`, `llm_output`, `before_tool_call`, `after_tool_call`, `subagent_spawning`, `subagent_ended`
- Ingest: Events are batched and POSTed to the ingest service, which writes partitioned JSON to cloud storage (`tenant={id}/agent={id}/dt=YYYY-MM-DD/hr=HH/`)
- Transform: Databricks Lakeflow SQL pipeline materializes raw events into 8 Iceberg silver tables every 3 minutes
- Query: PuppyGraph virtualizes the Delta Lake tables as a Cypher-queryable graph (Tenant → Agent → Trace → Span with CHILD_OF edges)
- Serve: The backend API runs Cypher queries, the payment service tracks credit consumption, and Tracy's MCP server provides graph access to the AI agent
- Display: Next.js UI renders trace trees, call graphs, timelines, and Tracy's streamed responses with inline ECharts
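The capture-to-storage leg of this flow can be sketched as follows. The endpoint path and partition layout come from the architecture above; the helper names and payload shape are illustrative assumptions, not the plugin's actual code:

```python
import json
from datetime import datetime, timezone

def partition_key(tenant_id: str, agent_id: str, ts: datetime) -> str:
    # Mirrors the partitioned layout: tenant={id}/agent={id}/dt=YYYY-MM-DD/hr=HH/
    return (
        f"tenant={tenant_id}/agent={agent_id}/"
        f"dt={ts:%Y-%m-%d}/hr={ts:%H}/"
    )

def batch_events(events: list[dict]) -> bytes:
    # Hook events are batched into a single body for POST /v1/traces/events
    return json.dumps({"events": events}).encode("utf-8")

ts = datetime(2025, 6, 1, 14, 30, tzinfo=timezone.utc)
key = partition_key("acme", "support-bot", ts)
body = batch_events([{"hook": "llm_output", "trace_id": "tr1"}])
```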
Agent observability data is naturally a graph: tenants own agents, agents produce traces, traces contain spans, and spans form parent-child hierarchies. ClawTrace models this explicitly with 4 vertex types (Tenant, Agent, Trace, Span) and 4 edge types (HAS_AGENT, OWNS, HAS_SPAN, CHILD_OF), queryable via Cypher.
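Against that schema, a typical traversal might look like the following Cypher. This is a sketch assuming the vertex and edge labels above; property names such as `trace_id` and `cost_usd` are illustrative:

```cypher
// All spans of one trace, with each span's parent, ordered by cost
MATCH (t:Trace {trace_id: $traceId})-[:HAS_SPAN]->(s:Span)
OPTIONAL MATCH (s)-[:CHILD_OF]->(parent:Span)
RETURN s.span_id, parent.span_id AS parent_id, s.cost_usd
ORDER BY s.cost_usd DESC
```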
Most observability tools store traces in relational databases or document stores. That works for thousands of traces. It breaks at billions.
Separation of storage and compute. Raw events land in cloud object storage (Azure Blob, GCS, or S3) as partitioned JSON. Databricks materializes them into Delta Lake Iceberg tables with data skipping statistics and Z-order clustering. PuppyGraph reads these tables directly without copying data. Storage scales infinitely at object storage prices. Compute scales independently.
Graph queries over a data lake. PuppyGraph virtualizes Delta Lake tables as a Cypher-queryable graph. This means you get the expressiveness of graph traversal (find all spans that are children of a specific span, trace the full call chain across sub-agents) with the storage economics of a data lake. No separate graph database to maintain, no ETL to a graph store, no data duplication.
Clustered for the access patterns that matter. Each table is clustered by the keys that the UI actually queries:
- `pg_traces`: clustered by `(tenant_id, agent_id, trace_id)` for fast agent dashboard loads
- `pg_spans`: clustered by `(trace_id, span_id)` for fast trace detail queries
- `pg_child_of_edges`: clustered by `(trace_id, parent_span_id)` for hierarchy traversal
Delta Lake's data skipping statistics prune irrelevant Parquet files before any data is read. A query for one agent's traces in a platform with a billion spans touches only the relevant files.
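In Databricks SQL, declaring clustering keys on a table looks roughly like this (a sketch only: the column list is invented for illustration, and the real pipeline's DDL may differ):

```sql
-- Illustrative: a silver table clustered by the keys the trace detail view queries
CREATE TABLE pg_spans (
  trace_id    STRING,
  span_id     STRING,
  kind        STRING,
  duration_ms BIGINT
)
CLUSTER BY (trace_id, span_id);
```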
Tracy queries the graph directly. Because PuppyGraph exposes the data lake as a Cypher endpoint, Tracy (the AI analyst) writes and executes graph queries in real time against the same data that powers the UI. No pre-computed aggregations, no stale materialized views. Tracy's answers are always up to date.
```
clawtrace/
├── packages/clawtrace-ui/       Next.js 15 frontend (App Router, React 19, Drizzle ORM)
├── services/clawtrace-backend/  FastAPI backend (PuppyGraph, JWT auth, Tracy chat)
├── services/clawtrace-ingest/   FastAPI ingest (multi-tenant, cloud-agnostic storage)
├── services/clawtrace-payment/  FastAPI billing (consumption credits, Stripe, notifications)
├── plugins/clawtrace/           @epsilla/clawtrace npm plugin for OpenClaw
├── sql/databricks/              Lakeflow SQL pipeline (silver tables + billing tables)
└── puppygraph/                  PuppyGraph schema configuration
```
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React 19, CSS Modules, ECharts, react-markdown |
| Backend | FastAPI, asyncpg, httpx, Pydantic Settings |
| Database | Neon PostgreSQL (users, credits, sessions), Drizzle ORM |
| Data Lake | Azure Blob Storage, Databricks, Delta Lake, Iceberg |
| Graph | PuppyGraph (Cypher over Delta Lake) |
| AI | Anthropic Managed Agents (Claude Sonnet 4.6), MCP protocol |
| Billing | Stripe, consumption-based credits |
| Deployment | Vercel (UI), Docker + Kubernetes (services) |
ClawTrace estimates per-span cost using a comprehensive pricing table covering 80+ models across all major vendors:
Western vendors: OpenAI (GPT-5.x, GPT-4.x, o-series), Anthropic (Claude Opus/Sonnet/Haiku), Google (Gemini 3.x/2.x/1.5), DeepSeek (V3, R1), Mistral (Large/Small/Codestral)
Chinese vendors: Alibaba Qwen (3.x Max/Plus/Flash), Zhipu GLM (5.x/4.x), Moonshot Kimi (K2.5), Baidu ERNIE (5.0/4.5), MiniMax (M2.x)
Open source: Llama 4/3.x, Mixtral, Stepfun
Cache-aware pricing: fresh input tokens, cached input tokens (~10% rate), cache write tokens, and output tokens are calculated separately for accurate cost estimation.
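As a sketch of how cache-aware cost estimation works, each token class is priced separately (the rates below are made-up placeholders, not entries from ClawTrace's actual pricing table):

```python
def span_cost(fresh_in: int, cached_in: int, cache_write: int, out: int,
              price: dict[str, float]) -> float:
    """Estimate one span's cost; `price` is $ per 1M tokens by token class."""
    per_tok = {k: v / 1_000_000 for k, v in price.items()}
    # Cached input is typically ~10% of the fresh input rate;
    # cache writes usually cost slightly more than fresh input.
    return (
        fresh_in * per_tok["input"]
        + cached_in * per_tok["cached_input"]
        + cache_write * per_tok["cache_write"]
        + out * per_tok["output"]
    )

# Hypothetical pricing entry ($ per 1M tokens)
PRICE = {"input": 3.0, "cached_input": 0.30, "cache_write": 3.75, "output": 15.0}
```

With a mostly cached prompt, the cached-input discount dominates: 1,000 fresh + 9,000 cached input tokens cost far less than 10,000 fresh ones.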
- Rubric-Based Evaluation — Define quality rubrics, auto-score agent trajectories, catch regressions before deployment
- A/B Testing — Run agent variants side by side, compare cost, quality, and speed, promote winners with confidence
- Version Control — Track agent config changes over time, roll back to known good versions, audit who changed what
- Self-Evolving Agents — The long vision: agents that learn from their own trajectory data to continuously improve reliability, reduce costs, and adapt to new patterns automatically
```bash
# Frontend (Next.js UI)
cd packages/clawtrace-ui
npm install
npm run dev        # localhost:3000
npm run typecheck  # TypeScript validation
```

```bash
# Backend API
cd services/clawtrace-backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env  # Edit with your credentials
uvicorn app.main:app --reload --port 8082
```

```bash
# Ingest service
cd services/clawtrace-ingest
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
uvicorn app.main:app --reload --port 8080
```

```bash
# Payment service
cd services/clawtrace-payment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
uvicorn app.main:app --reload --port 8083
```

```bash
# OpenClaw plugin
cd plugins/clawtrace
npm install
npm test
```

This project was inspired by and builds upon the work in openclaw-tracing, a reference implementation for tracing OpenClaw agent executions. ClawTrace extends this foundation with production-grade observability, a graph-based query engine, consumption-based billing, and Tracy, the AI observability analyst.
Apache 2.0. See LICENSE for details.
Built with ❤️ by Epsilla





