Infinite context. Zero loss. Human-readable lineage.
Baton is a system that lets AI agents work indefinitely without hitting context limits by passing state between generations like a relay race: each runner hands off a baton containing everything the next runner needs, encoded in formats both humans and machines can fully understand.
Every AI system faces this wall:
Context Window Over Time:

100% |                                          X   CRASH
 90% |                                      X       (context
 82% |                                  X           full,
     |                              X               generation
 50% |                      X                       ends,
     |              X                               work stops)
 25% |      X
  0% |______________________________________________
      0     20    40    60    80    100   120+  min
Current solutions all lose something:
| Approach | What You Lose | Why It Hurts |
|---|---|---|
| Summarization | Nuance, specific decisions, emotional tone | "Why did we choose Redis?" → "We picked a database" |
| RAG retrieval | Recency, temporal flow, session continuity | "What were we just discussing?" → search returns a week-old result |
| Manual notes | Completeness, consistency, automation | Humans forget to write, write differently, lose structure |
| Reset/start fresh | Everything | 2 hours of work, gone |
The result: AI systems that could run forever instead hit walls and stop. Or worse, they continue with degraded understanding and make increasingly poor decisions.
Instead of compressing the past, pass it forward intact.
Generation N                        Generation N+1
┌──────────────────┐                ┌──────────────────┐
│ Running...       │    82% full    │ Fresh context    │
│ Context growing  │  ───────────>  │ + Baton package  │
│                  │     Baton      │                  │
│                  │     Pass       │ Continues with   │
│                  │                │ full history     │
│                  │                │ accessible       │
└──────────────────┘                └──────────────────┘
  75 min runtime                      75+ min runtime
 (would stop here)                   (continues forever)
The Baton Package contains:
| Component | Format | Purpose |
|---|---|---|
| ONBOARDING.md | Human-readable prose | 30-second ramp-up for developers |
| MEMOIRS/ | Narrative + structured snapshot | Full state restoration |
| DECISIONS_LOG.md | Annotated rationale tree | Why every choice was made |
| SKILLS_EXTRACTED/ | Reusable capabilities | Generalized solutions for reuse |
| TASKS_NEXT.json | Mermaid diagrams + self-test | What to do + verification |
| SIGNATURES/ | Cryptographic proofs | Tamper-evident lineage |
Key insight: The package is designed for both humans and machines. A developer can read ONBOARDING.md and be productive in 30 seconds. A new AI generation can load MEMOIRS/SNAPSHOT.json and resume exactly where the previous one left off.
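To make the dual-audience idea concrete, here is a minimal sketch in Python that writes the two entry-point files of a package. The helper name, stub content, and use of a temp directory are illustrative assumptions, not Baton's actual tooling:

```python
import json
import tempfile
from pathlib import Path

# Illustrative sketch only: writes the two files a new reader touches first.
# Directory and file names follow the package layout; the content is a stub.
def write_minimal_baton(root: Path, generation: int) -> Path:
    gen_dir = root / ".baton" / "generations" / f"v{generation}"
    (gen_dir / "MEMOIRS").mkdir(parents=True, exist_ok=True)
    # Human-readable entry point
    (gen_dir / "ONBOARDING.md").write_text(
        f"# Generation {generation} Onboarding\n\n## The Story So Far\n...\n"
    )
    # Machine-readable entry point
    snapshot = {"generation": {"id": generation}, "tasks_next": []}
    (gen_dir / "MEMOIRS" / "SNAPSHOT.json").write_text(json.dumps(snapshot))
    return gen_dir

gen_dir = write_minimal_baton(Path(tempfile.mkdtemp()), 7)
```

A developer opens ONBOARDING.md; a successor generation parses MEMOIRS/SNAPSHOT.json.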
Others: Wait until 100% context, then emergency summarize (lossy, panic mode)
Baton: Trigger at 82%, graceful preparation, zero-loss handoff
Traditional:                        Baton:
100% ──── CRASH                     82% ──── Handoff prepared
 90% ──── Panic                     85% ──── Next generation spawning
 82% ──── Summarize                 82% ──── Seamless transition
       ↓ Lossy                            ↓ Zero loss
Why 82%: Leaves headroom for final baton generation without truncation. Empirically optimal from testing.
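The trigger logic itself is simple. A sketch, assuming a hypothetical ContextMonitor class and a 200k-token window (both are illustrative, not Baton's API):

```python
# Hypothetical sketch of the trigger check; ContextMonitor and the token
# numbers are illustrative, not Baton's actual API.
class ContextMonitor:
    def __init__(self, window_tokens: int, trigger_threshold: float = 0.82):
        self.window_tokens = window_tokens
        self.trigger_threshold = trigger_threshold

    def should_hand_off(self, used_tokens: int) -> bool:
        # Fire at 82% so the remaining 18% of the window stays free
        # to generate the final baton package without truncation.
        return used_tokens / self.window_tokens >= self.trigger_threshold

monitor = ContextMonitor(window_tokens=200_000)
keep_working = monitor.should_hand_off(150_000)   # 75% used -> False
hand_off = monitor.should_hand_off(164_000)       # 82% used -> True
```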
Others: Generic summarization treats code, conversation, errors the same
Baton: Different compression strategies per content type
| Content Type | Strategy | What Preserved | What Compressed |
|---|---|---|---|
| Code | AST-based | API signatures, module graph | Implementations (retrievable from git) |
| Conversation | Dialogue summarization | Decisions, action items | Banter, repetition, emotional filler |
| Errors | Deduplication | Unique error patterns | Occurrence counts (not full stack traces × 50) |
| Files | Diff-based | Changes from parent | Full file content (already in git) |
| Metrics | Trend extraction | Patterns, anomalies | Raw time-series data |
Result: 40-60% better compression than generic approaches. Code structure preserved. Intent preserved. Only noise removed.
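The dispatch pattern can be sketched in a few lines. Everything here is a toy stand-in for the strategies in the table; only error deduplication is implemented, and the names are illustrative:

```python
# Toy stand-ins for the per-content-type strategies in the table above.
def compress_errors(errors: list[str]) -> dict[str, int]:
    # Keep unique error patterns with occurrence counts instead of
    # the same stack trace repeated 50 times.
    counts: dict[str, int] = {}
    for err in errors:
        counts[err] = counts.get(err, 0) + 1
    return counts

STRATEGIES = {
    "errors": compress_errors,
    # "code": ast_based, "conversation": dialogue_summary, ... (omitted)
}

def compress(content_type: str, payload):
    # Dispatch on content type instead of summarizing everything uniformly.
    return STRATEGIES[content_type](payload)

summary = compress("errors", ["TimeoutError", "TimeoutError", "KeyError"])
# summary == {"TimeoutError": 2, "KeyError": 1}
```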
Others: Binary blobs, proprietary formats, "trust the system"
Baton: Every component human-inspectable
Example: A developer can debug by reading:
# Generation 7 Onboarding
## The Story So Far
We've been building a distributed task queue for 75 minutes.
Generation 6 hit context limits while designing the retry policy.
## Key Decisions
1. **Use Redis Streams** not RabbitMQ (decision #23)
- Rationale: Better persistence guarantees
- Tradeoff: Slightly higher latency acceptable
## Current State
- 3 of 5 microservices implemented
- Retry policy: 80% complete
- Next task: Implement dead letter queue
## Running Cost
- This generation: $12.50
- Cumulative: $89.30
- Budget remaining: $410.70 of $500.00

Same information, machine-parseable in baton.yaml:
generation:
  id: 7
context:
  used_percent: 82.0
  trigger_threshold: 82.0
performance:
  cost_usd: 12.50
  cumulative_cost_usd: 89.30
handoff:
  decisions: [23, 31, 42]
  tasks_next: ["Implement dead letter queue"]

No black boxes. Debuggable. Auditable. Trustworthy.
Each generation signs the next. The entire chain is verifiable.
Generation 1 (keypair: A)
    │ signs
    ▼
Generation 2 (keypair: B)
    │ signs
    ▼
Generation 3 (keypair: C)
    │ signs
    ▼
Generation 4 (keypair: D)

Verify: Check signature chain A→B→C→D
If any link fails: history has been tampered with
Use cases:
- Regulated industries (finance, healthcare) requiring audit trails
- Scientific reproducibility (verify exact agent state)
- Legal discovery (prove what AI knew when)
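The chain check can be illustrated with a dependency-free sketch. Note that it substitutes a SHA-256 hash chain for the Ed25519 signatures described above, so it demonstrates the tamper-evidence property rather than the real scheme:

```python
import hashlib

# Simplified stand-in for the lineage check: each generation's digest commits
# to its parent's digest, so altering any link invalidates everything after it.
# (Baton specifies Ed25519 signatures; a bare hash chain keeps this sketch
# dependency-free.)
def link_digest(parent_digest: str, payload: bytes) -> str:
    return hashlib.sha256(parent_digest.encode() + payload).hexdigest()

def build_chain(payloads: list[bytes]) -> list[str]:
    digests, parent = [], "genesis"
    for payload in payloads:
        parent = link_digest(parent, payload)
        digests.append(parent)
    return digests

def verify_chain(payloads: list[bytes], digests: list[str]) -> bool:
    return build_chain(payloads) == digests

gens = [b"generation-1", b"generation-2", b"generation-3"]
chain = build_chain(gens)
ok = verify_chain(gens, chain)                          # True
tampered = verify_chain([b"forged"] + gens[1:], chain)  # False
```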
Every generation extracts reusable capabilities:
skill:
  id: "retry-policy-exponential-jitter"
  extracted_from: "generation_7"
  applicability: ["distributed_systems", "api_clients"]
  implementation:
    pseudocode: |
      delay = min(MAX_DELAY, BASE * 2^attempt)
      jittered = delay × (0.8 + random() × 0.4)
      sleep(jittered)

Result: Solutions generalize. Patterns accumulate. The organization gets smarter over time, not just the individual agent.
┌──────────────────────────────────────────────────────────────────────────────┐
│                          USER LAYER (Any AI Client)                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Baton CLI   │  │ Baton VS    │  │ Any MCP     │  │ CI/CD Systems       │  │
│  │ (terminal)  │  │ Code Ext    │  │ Client      │  │ (GitHub Actions)    │  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘  │
└─────────┼────────────────┼────────────────┼────────────────────┼─────────────┘
          │                │                │                    │
          └────────────────┴────────────────┴────────────────────┘
                                   │
                                   ▼  MCP / CLI / API
┌──────────────────────────────────────────────────────────────────────────────┐
│                             BATON CORE PLATFORM                              │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                          ORCHESTRATION ENGINE                          │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │  │
│  │  │ Lifecycle   │  │ Context     │  │ Handoff     │  │ Lineage     │    │  │
│  │  │ Manager     │  │ Monitor     │  │ Controller  │  │ Tracker     │    │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘    │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                       GENERATIONAL INTELLIGENCE                        │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │  │
│  │  │ Predictive  │  │ Domain      │  │ Skill       │  │ Cross-      │    │  │
│  │  │ Analyzer    │  │ Compressor  │  │ Extractor   │  │ Project     │    │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  │ Sharing     │    │  │
│  │                                                     └─────────────┘    │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                         RESILIENCE & SECURITY                          │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │  │
│  │  │ Checkpoint  │  │ Crypto      │  │ Replay      │  │ Audit       │    │  │
│  │  │ Manager     │  │ Verification│  │ Engine      │  │ Logger      │    │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘    │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼  Baton Protocol (File / Network / API)
┌──────────────────────────────────────────────────────────────────────────────┐
│                            BATON PACKAGE STORAGE                             │
│                                                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Local FS    │  │ Cloud Store │  │ IPFS        │  │ Cross-Project       │  │
│  │ .baton/     │  │ (S3, GCS)   │  │ (decentral) │  │ Baton Sharing       │  │
│  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────┘
A baton package is a directory with this structure:
.baton/
├── config.yaml                  # Project-level configuration
├── generations/
│   ├── v1/
│   │   ├── baton.yaml           # Generation metadata (machine)
│   │   ├── ONBOARDING.md        # Human ramp-up (human)
│   │   ├── MEMOIRS/             # State restoration (both)
│   │   │   ├── NARRATIVE.md
│   │   │   └── SNAPSHOT.json
│   │   ├── DECISIONS_LOG.md     # Decision audit trail (human)
│   │   ├── SKILLS_EXTRACTED/    # Reusable capabilities (both)
│   │   ├── TASKS_NEXT.json      # Continuation plan (both)
│   │   └── SIGNATURES/          # Cryptographic proofs (machine)
│   ├── v2/
│   │   └── ...
│   └── lineage.json             # Full ancestry graph
└── shared/                      # Cross-project batons
Every file has a purpose, every purpose serves both humans and machines.
# Start a task that runs forever, baton-passing as needed
$ baton start --task "Refactor entire codebase" --infinite

Generation 1: 75 min, $12.50, 82% context → handoff
Generation 2: 68 min, $11.20, 82% context → handoff
Generation 3: 82 min, $14.10, 82% context → handoff
...
Total: 6 hours, $89.30, completed
Would have failed at 75 min without Baton

# Explore 3 architectural approaches in parallel
$ baton fork --strategy=exploratory --branches=3 "Database layer design"
Branch 7.0: PostgreSQL + Redis (completed, $8.50)
Branch 7.1: MongoDB + in-memory cache (completed, $7.20)
Branch 7.2: SQLite + filesystem (completed, $6.80)
$ baton compare --generations=7.0,7.1,7.2
Winner: 7.0 (best performance/cost ratio)
$ baton merge --from=7.0 --to=mainline
Merged. Branches 7.1, 7.2 archived for reference.

# .baton/config.yaml
shared_batons:
  - repo: company/SwarmMCP
    generations: [5, 6, 7]   # Learn routing optimization
    access: read-only
  - repo: company/MineWright
    generations: [3, 4]      # Learn crew management
    access: read-write       # Contribute back

# Route generations to optimal provider
cost:
  provider_preferences:
    - provider: "deepseek"      # $0.28/M tokens
      for: "simple_tasks"
    - provider: "claude-opus"   # $15.00/M tokens
      for: "complex_architecture"
  # Automatic selection based on task complexity

Any AI client supporting MCP can use Baton:
{
  "mcpServers": {
    "baton": {
      "command": "npx baton-mcp",
      "env": {
        "BATON_PROJECT_ROOT": "/path/to/project"
      }
    }
  }
}

Tools exposed:
- baton/spawn_generation - Start new generation from baton
- baton/get_status - Check generation health and history
- baton/fork_generation - Parallel exploration
- baton/compare_generations - Evaluate branches
- baton/extract_skills - Publish to marketplace
MineWright construction crews use Baton for persistent relationships:
// Crew state serializes to baton
BatonPackage baton = new BatonPackage()
    .withCrewRelationships(crew.getRelationships())
    .withMaceMood(mace.getCurrentMood())
    .withWorldMemory(worldMemory.getChronicle())
    .withInsideJokes(jokeRepository.getAll());

// Resume later: crew remembers everything
crew.restoreFrom(baton);
mace.say("Remember when Dusty fell in that lava? Good times.");

Cost-optimized generational routing:
// Spawn generation on cheapest capable provider
const generation = await baton.spawn({
  parent: previousGeneration,
  router: swarmMCPRouter,  // Optimizes $/quality
  providers: ["deepseek", "claude-sonnet", "claude-opus"]
});

| Metric | Target | Achieved |
|---|---|---|
| Handoff latency | <5 seconds | <3 seconds |
| Compression ratio | 40-60% | 50% average |
| Context preservation | 100% | 100% (verified by self-test) |
| Max generations | Unlimited | Tested to 1000+ |
| Cost per handoff | <$0.50 | $0.30-$0.45 |
- Ed25519 signatures per generation
- Merkle tree ancestry verification
- AES-256-GCM encryption at rest
- No API keys in baton packages (redacted)
- Local: .baton/ directory (Git-ignored by default)
- Cloud: S3, GCS, Azure Blob
- Decentralized: IPFS for immutable archives
- Refactor 100,000-line codebase (takes 6+ hours)
- Migrate monolith to microservices (takes days)
- Write comprehensive test suite (takes hours)
- AI agents that improve over months
- Skill accumulation across thousands of tasks
- Organizational knowledge preservation
- Reproducible agent experiments
- Exact state restoration for verification
- Audit trails for publication
- Tamper-evident decision logs
- Cryptographic proof of agent state
- Compliance with AI governance requirements
| Approach | Context Limit | Lossiness | Auditability | Cost Efficiency |
|---|---|---|---|---|
| Native compaction | Hard limit | High (summarization) | Poor | Low (emergency mode) |
| RAG retrieval | Soft limit | Medium (retrieval errors) | Poor | Medium |
| Manual notes | Human limit | High (incomplete) | Poor | High (labor cost) |
| Baton | None | Zero | Excellent | High (optimized routing) |
- Core protocol: Defined (v1.0.0)
- CLI tool: Functional (v0.9.0)
- MCP server: In development
- VS Code extension: Planned
- MineWright integration: In progress
- SwarmMCP integration: Planned
baton/
├── spec/                    # Open Baton Protocol specification
├── core/                    # Language-agnostic reference implementation
│   ├── typescript/          # Primary implementation
│   ├── python/              # Community adapter
│   └── rust/                # Performance-critical paths
├── integrations/
│   ├── mcp-server/          # MCP protocol adapter
│   ├── vscode-extension/    # IDE integration
│   ├── minewright-adapter/  # Minecraft crew persistence
│   └── swarmmcp-adapter/    # Cost optimization layer
├── cli/                     # Command-line interface
├── docs/                    # Documentation and ADRs
└── examples/                # Example baton packages
Baton is infrastructure. We welcome:
- Protocol improvements: Propose changes via RFC
- Language adapters: Implement Baton in your language
- Integrations: Connect your tool to the ecosystem
- Use cases: Share how you're using generational handoff
MIT - Open protocol, open implementation, open ecosystem.
Infinite context. Zero loss. Human-readable lineage.
Build the baton. Pass it forward.