agentic-sdlc

Deterministic local execution runtime for agentic software workflows.

✨ Production Ready (March 2026)

Status: ✅ All critical gaps addressed! Ready for production deployment.

What's Complete:

✅ Perfect Determinism - Replay store for LLM responses (10x faster, $0 cost)
✅ Process Isolation - Sandbox for untrusted skills with resource monitoring
✅ 6 LLM Providers - OpenAI, Gemini, Anthropic, Azure, Bedrock, Ollama (fully documented)
✅ Git Integration - Complete design for automated workflows (ready to implement in 24h)

Quality Metrics:

183 tests passing (100% pass rate)
13,200+ lines of comprehensive documentation
Production-ready security and reliability features

Gap Analysis: 8/12 gaps addressed (67%) - All critical gaps complete!

🔴 Critical: 2/2 (100%) ✅
🟠 High Priority: 3/5 (60%) ✅
🟡 Medium Priority: 2/5 (40%) ✅

See docs/GAP_QUICK_REFERENCE.md for detailed comparison or FINAL_SUMMARY.md for complete details.

Overview

This repository runs a single consolidated workflow engine with:

deterministic workflow execution and trace IDs
resumable runs with persisted state
idempotent step short-circuit support
runtime policy enforcement for permissions and trust tier
structured step telemetry and trace timeline export

Determinism Scope

Determinism currently applies to orchestration semantics:

ready-step ordering
state transitions and persistence
trace generation and replayability of engine decisions

LLM-generated text/content can still vary across runs unless provider/model/settings enforce deterministic output behavior. Treat content reproducibility as a separate concern from workflow engine determinism.

Deterministic Mode

Control LLM temperature for deterministic behavior:

# Default: temperature=0.0 (deterministic)
cargo run -- --workflow feature.md

# Override temperature
ANTIGRAV_LLM_TEMPERATURE=0.7 cargo run -- --workflow feature.md

# Use specific seed (OpenAI/Azure only)
ANTIGRAV_LLM_SEED=42 cargo run -- --workflow feature.md

Replay Store (Week 2 Feature)

Save and replay LLM responses for perfect determinism:

# Record mode - save all LLM responses to file
cargo run -- --workflow feature.md --save-replay llm_cache.json

# Replay mode - use cached responses (no LLM calls)
cargo run -- --workflow feature.md --replay-mode llm_cache.json

Benefits:

Perfect Determinism: Same inputs → same outputs, always
Fast Replay: 10x+ speedup by skipping LLM calls
Cost Savings: No API costs during replay
Offline Testing: Test workflows without internet

See Deterministic Mode Guide for details.

LLM Providers (Week 3 Feature)

Support for 6 LLM providers with automatic fallback:

# Choose your provider
export ANTIGRAV_LLM_PROVIDER=openai    # or ollama, gemini, anthropic, azure, bedrock
export OPENAI_API_KEY=sk-proj-...

# Optional: Configure fallback
export ANTIGRAV_LLM_FALLBACK=gemini,anthropic

# Run workflow
cargo run -- --workflow feature.md

Supported Providers

Provider	Best For	Cost	Setup
Ollama	Development, offline	Free	Guide
OpenAI	Production, determinism	$$	Guide
Gemini	Cost optimization	$	Guide
Anthropic	Quality, safety	$$$	Guide
Azure OpenAI	Enterprise	$$	Guide
AWS Bedrock	AWS ecosystem	$$$	Guide

See LLM Providers Guide for complete documentation.

Quick Examples

Development (Free):

export ANTIGRAV_LLM_PROVIDER=ollama
cargo run -- --workflow feature.md

Production (Reliable):

export ANTIGRAV_LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-proj-...
cargo run -- --workflow feature.md

Cost-Optimized:

export ANTIGRAV_LLM_PROVIDER=gemini
export GEMINI_API_KEY=AIza...
cargo run -- --workflow feature.md

Sandbox Execution (Week 4 Feature)

Automatic process isolation for untrusted skills with resource monitoring:

Trust Tiers

Tier	Execution	Isolation	Use Case
Trusted	In-process	None	Local, verified skills
Constrained	In-process	Permissions	Limited access skills
Untrusted	Subprocess	Full isolation	Imported, unverified skills

Features

✅ Process Isolation - Untrusted skills run in separate processes
✅ Resource Monitoring - Track CPU, memory, execution time
✅ Automatic Enforcement - Kill processes exceeding limits
✅ Zero Overhead - Trusted skills run in-process (no performance impact)

Resource Limits

// Configure resource limits
let budget = ExecutionBudget {
    resource_budget: ResourceBudget {
        max_cpu_ms: 10_000,      // 10 seconds CPU
        max_wall_time_ms: 30_000, // 30 seconds wall time
        max_memory_mb: 512,       // 512 MB memory
        ..Default::default()
    },
    ..Default::default()
};

Example: Untrusted Skill

SkillCapability::new(
    "imported_skill",
    "Skill from external source",
    SkillIOType::Text,
    SkillIOType::Text,
    CapabilityPermissions::none(),
    SideEffectClass::ExternalMutation,
)
.with_trust_tier(TrustTier::Untrusted)  // Runs in sandbox

See Sandbox Guide for complete documentation.

See Deterministic Mode Guide for details.

Current Release

Runtime target: v1.0.1
Package version: 1.0.1

Quick Start

cargo run -- --workflow valid_flow.md

Onboard/check local prerequisites:

./scripts/bootstrap.sh
cargo run -- workflow doctor
cargo run -- workflow setup

workflow setup now bootstraps a full core package when missing:

rules (runtime/branching/coding/merge)
workflows (starter/feature/bugfix/review/release)
templates (feature/bugfix/review/release_prompt)
roles (architect/implementer/reviewer/resolver/releaser)
starter skills (analyze_code/generate_tests/next_steps)

Run deterministic CI gate locally:

./scripts/ci_gate.sh

Optional live provider smoke tests (OpenAI/Gemini):

ANTIGRAV_RUN_LIVE_LLM_TESTS=1 OPENAI_API_KEY=... cargo test llm_subagent_live_smoke_openai -- --nocapture
ANTIGRAV_RUN_LIVE_LLM_TESTS=1 GEMINI_API_KEY=... cargo test llm_subagent_live_smoke_gemini -- --nocapture

Inspect active/previous runs:

cargo run -- workflow list
cargo run -- workflow status
cargo run -- workflow threads

Run a workflow with a reusable template prompt:

cargo run -- workflow start-template feature_prompt --task "add email validation to signup flow"

Run a role-bound workflow launch:

cargo run -- workflow start-role implementer --task "add email validation to signup flow"

Run chat-thread orchestration (thread branch + workflow + optional merge lifecycle):

cargo run -- workflow chat-thread feature-email --message "implement signup email validation"
cargo run -- workflow chat-thread review-thread --message "review current diff" --workflow-id review --template review_prompt --role reviewer --no-merge

Run thread-to-branch lifecycle end-to-end (includes auto conflict resolution attempts):

cargo run -- workflow thread-flow my-thread --target-branch main --validate-command "cargo test"

Run a direct workflow with template/role overrides:

cargo run -- --workflow-id feature --template feature_prompt --task "add email validation to signup flow"
cargo run -- --workflow-id feature --role-override "architect=planner,implementer=debugger"

Inspect available role profiles and templates:

cargo run -- workflow roles
cargo run -- workflow templates

Scaffold markdown package files with schema headers:

cargo run -- workflow scaffold workflow feature-search --profile advanced
cargo run -- workflow scaffold skill search_docs --profile advanced

Skill scaffold now follows folder layout:

.agents/skills/<skill-name>/SKILL.md

Generate an advanced domain pack (workflows + skills + roles + templates):

cargo run -- workflow scaffold-domain payments

Rebuild graph index for context retrieval (also refreshes sqlite context tables in .agents/memory/context.db):

cargo run -- workflow index-graph

Run skill quality validation and strict gate:

cargo run -- workflow quality-skills
cargo run -- workflow quality-skills --strict

List available curated bundles:

cargo run -- workflow bundles
cargo run -- workflow bundles --json

Hot domain bundles currently included:

ai-engineering
cloud-platform
cybersecurity
data-ml-eval
healthtech
climate-tech

Quick examples:

cargo run -- --workflow-id ai-engineering/feature --template ai-engineering/feature_prompt --task "build eval pipeline for support agent"
cargo run -- --workflow-id cybersecurity/review --template cybersecurity/review_prompt --task "review auth middleware diff for vulnerabilities"

Security Workflow and Internet Skill Gate

This repository now enforces a package rule:

workflows using internet-capable skills must include an explicit security-check step
recommended gate step is internet_security_check using cybersecurity.security_scan_guard

Run the dedicated security workflow:

cargo run -- --workflow-id cybersecurity/security-scan --template cybersecurity/security_scan_prompt --task "scan internet-surface risks and policy drift"

Run security/package validation gates:

cargo run -- workflow check
cargo run -- workflow quality-skills --strict

Run workflow report evaluation gate from dataset:

cargo run -- workflow eval .agents/evals/release_eval.json
cargo run -- workflow eval .agents/evals/release_eval.json --min-pass-rate 0.9 --json

Manual approval gate for release/review-sensitive runs:

cargo run -- workflow status <instance_id>
cargo run -- workflow approve <instance_id> --step manual_approval_gate --by release-manager --note "qa+security passed"
cargo run -- workflow resume <instance_id>

To explicitly block a run at the gate:

cargo run -- workflow reject <instance_id> --step manual_approval_gate --by security --note "critical risk unresolved"

Generate catalog/manifest/lock artifacts for skill bundles:

cargo run -- workflow build-catalog

Import third-party SKILL.md repos into local .agents/skills/imported:

cargo run -- workflow import-skills https://github.com/anthropics/skills --max-skills 20
cargo run -- workflow import-skills https://github.com/anthropics/skills --allow-missing-license
cargo run -- workflow import-skills https://github.com/anthropics/skills --mode global --allow-missing-license

Imported skills are normalized to folder layout:

.agents/skills/imported/<skill-name>/SKILL.md

Install using installer-style alias command:

cargo run -- workflow install-skillpack https://github.com/anthropics/skills --mode local --allow-missing-license
cargo run -- workflow install-skillpack https://github.com/anthropics/skills --mode global --allow-missing-license

Sync existing imported skills using pinned source commit/provenance from .agents/skills.lock.json:

cargo run -- workflow sync-imports --overwrite
cargo run -- workflow sync-imports --mode global --overwrite --allow-missing-license

Normalize existing imported skill metadata (risk/source/tags) without re-pulling upstream repos:

cargo run -- workflow normalize-imported-skills
cargo run -- workflow normalize-imported-skills --dry-run --json
cargo run -- workflow normalize-imported-skills --mode global --json

Install a curated bundle into local/global skills root:

cargo run -- workflow install-bundle core
cargo run -- workflow install-bundle imported --mode global --overwrite

Verify lock integrity (detect missing/changed/extra skill entries):

cargo run -- workflow verify-lock
cargo run -- workflow verify-lock --mode global --fail-on-extra
cargo run -- workflow verify-lock --require-attestation

--mode local writes to .agents/skills/imported; --mode global writes to $CODEX_HOME/skills/imported (fallback: ~/.codex/skills/imported).

Register/list/ping MCP runtime servers (stored in .agents/mcp/servers.json):

cargo run -- workflow mcp-register ollama-cli --transport stdio --command npx --arg -y --arg mcp-client-for-ollama --arg --ollama-host --arg http://127.0.0.1:11434
cargo run -- workflow mcp-register local-supabase --transport http --url http://127.0.0.1:54321/mcp --allow-tool query --allow-tool list_tables
cargo run -- workflow mcp-list
cargo run -- workflow mcp-ping
cargo run -- workflow mcp-ping ollama-cli --timeout-ms 8000 --json
cargo run -- workflow mcp-policy local-supabase --tool query

Resume a run:

cargo run -- workflow resume <instance_id>

Export trace:

cargo run -- workflow trace <instance_id> --json
cargo run -- workflow trace <instance_id> --timeline
cargo run -- workflow trace <instance_id> --otel

LLM Router

llm_subagent now routes real provider calls with timeout/retry/fallback and normalized telemetry output.

Common environment variables:

ANTIGRAV_LLM_PROVIDER=ollama|openai|gemini|anthropic
ANTIGRAV_LLM_MODEL=<primary model>
ANTIGRAV_LLM_FALLBACK=openai,gemini,anthropic
ANTIGRAV_LLM_FALLBACK_POLICY=transient_only|always|never
ANTIGRAV_LLM_TIMEOUT_MS=30000
ANTIGRAV_LLM_MAX_RETRIES=2
ANTIGRAV_LLM_SIMULATION_FALLBACK=true
OPENAI_API_KEY=...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...

Context retrieval service (for deterministic LLM context injection):

ANTIGRAV_CONTEXT_RETRIEVAL_MODE=vector|graph|hybrid|off
ANTIGRAV_CONTEXT_BACKEND=json|sqlite
ANTIGRAV_CONTEXT_INDEX_PATH=.agents/memory/vector_index.json
ANTIGRAV_CONTEXT_MIN_SCORE=0.1
ANTIGRAV_CONTEXT_GRAPH_INDEX_PATH=.agents/memory/graph_index.json
ANTIGRAV_CONTEXT_GRAPH_MIN_SCORE=0.05
ANTIGRAV_CONTEXT_DB_PATH=.agents/memory/context.db
ANTIGRAV_CONTEXT_VECTOR_TABLE=vector_entries
ANTIGRAV_CONTEXT_GRAPH_TABLE=graph_nodes
ANTIGRAV_CONTEXT_MAX_ITEMS=5
ANTIGRAV_CONTEXT_MAX_CHARS=300

Generated Artifacts

These files are generated and ignored by git by default:

.agents/catalog/*
.agents/skills_index.json
.agents/workflows.json
.agents/bundles.json
.agents/marketplace.json
.agents/skills.lock.json
.agents/skills/imported/*

Roadmap

Current gap closure plan is tracked in:

docs/GAP_ROADMAP.md

Regenerate them anytime with:

cargo run -- workflow build-catalog

Skill Format

Skill markdown supports both formats:

frontmatter metadata (--- block with at least name/domain/executor)
fenced JSON metadata block (existing format)

Folder-skill layout is supported:

.agents/skills/<skill-name>/SKILL.md
optional .agents/skills/<skill-name>/references/ and .agents/skills/<skill-name>/scripts/

Installation

cargo install --path .
antigrav workflow doctor

Documentation

Architecture and runtime semantics: docs/ARCHITECTURE.md
CLI usage guide: docs/CLI_USAGE.md
Dev OS execution blueprint: docs/DEV_OS_BLUEPRINT.md
Release notes: CHANGELOG.md
Agent package guide: .agents/README.md
Gemini package contract: .agents/GEMINI.md

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
.agents		.agents
.astro		.astro
.claude/commands		.claude/commands
.github/workflows		.github/workflows
.minimax/skills		.minimax/skills
docs		docs
examples		examples
proptest-regressions/platform/tier4_organizational		proptest-regressions/platform/tier4_organizational
scripts		scripts
src		src
.dockerignore		.dockerignore
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTEXT_TRANSFER_SUMMARY.md		CONTEXT_TRANSFER_SUMMARY.md
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
FINAL_SUMMARY.md		FINAL_SUMMARY.md
IMPLEMENTATION_STATUS.md		IMPLEMENTATION_STATUS.md
LICENSE		LICENSE
LICENSE-APACHE		LICENSE-APACHE
LICENSE-MIT		LICENSE-MIT
PROJECT_SUMMARY.md		PROJECT_SUMMARY.md
PUBLISH.md		PUBLISH.md
QUICK_COMMANDS.md		QUICK_COMMANDS.md
README.md		README.md
RELEASE_CHECKLIST.md		RELEASE_CHECKLIST.md
demo.md		demo.md
example.md		example.md
install.sh		install.sh
platform.toml		platform.toml
test_replay_workflow.md		test_replay_workflow.md
valid_flow.md		valid_flow.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agentic-sdlc

✨ Production Ready (March 2026)

Overview

Determinism Scope

Deterministic Mode

Replay Store (Week 2 Feature)

LLM Providers (Week 3 Feature)

Supported Providers

Quick Examples

Sandbox Execution (Week 4 Feature)

Trust Tiers

Features

Resource Limits

Example: Untrusted Skill

Current Release

Quick Start

Security Workflow and Internet Skill Gate

LLM Router

Generated Artifacts

Roadmap

Skill Format

Installation

Documentation

About

Licenses found

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agentic-sdlc

✨ Production Ready (March 2026)

Overview

Determinism Scope

Deterministic Mode

Replay Store (Week 2 Feature)

LLM Providers (Week 3 Feature)

Supported Providers

Quick Examples

Sandbox Execution (Week 4 Feature)

Trust Tiers

Features

Resource Limits

Example: Untrusted Skill

Current Release

Quick Start

Security Workflow and Internet Skill Gate

LLM Router

Generated Artifacts

Roadmap

Skill Format

Installation

Documentation

About

Topics

Resources

License

Licenses found

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages