agentic-sdlc


Deterministic local execution runtime for agentic software workflows.

✨ Production Ready (March 2026)

Status: ✅ All critical gaps addressed! Ready for production deployment.

What's Complete:

  • Perfect Determinism - Replay store for LLM responses (10x faster, $0 cost)
  • Process Isolation - Sandbox for untrusted skills with resource monitoring
  • 6 LLM Providers - OpenAI, Gemini, Anthropic, Azure, Bedrock, Ollama (fully documented)
  • Git Integration - Complete design for automated workflows (ready to implement in 24h)

Quality Metrics:

  • 183 tests passing (100% pass rate)
  • 13,200+ lines of comprehensive documentation
  • Production-ready security and reliability features

Gap Analysis: 8/12 gaps addressed (67%) - All critical gaps complete!

  • 🔴 Critical: 2/2 (100%) ✅
  • 🟠 High Priority: 3/5 (60%) ✅
  • 🟡 Medium Priority: 2/5 (40%) ✅

See docs/GAP_QUICK_REFERENCE.md for a detailed comparison, or FINAL_SUMMARY.md for complete details.


Overview

This repository runs a single consolidated workflow engine with:

  • deterministic workflow execution and trace IDs
  • resumable runs with persisted state
  • idempotent step short-circuit support
  • runtime policy enforcement for permissions and trust tier
  • structured step telemetry and trace timeline export
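The idempotent short-circuit above can be sketched roughly as follows. This is an illustrative model, not the engine's actual API: `RunState` and `run_step` are hypothetical names, and the real engine persists state rather than holding it in memory.

```rust
use std::collections::HashMap;

// Hypothetical persisted run state: step id -> recorded output.
struct RunState {
    completed: HashMap<String, String>,
}

impl RunState {
    fn new() -> Self {
        RunState { completed: HashMap::new() }
    }

    /// Execute a step only if no persisted result exists; otherwise
    /// short-circuit and return the previously recorded output.
    fn run_step<F: FnOnce() -> String>(&mut self, step_id: &str, step: F) -> String {
        if let Some(prev) = self.completed.get(step_id) {
            return prev.clone(); // idempotent short-circuit
        }
        let out = step();
        self.completed.insert(step_id.to_string(), out.clone());
        out
    }
}
```

On resume, a step that already has a persisted result is skipped rather than re-executed, which is what makes replayed runs cheap and deterministic at the orchestration level.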

Determinism Scope

Determinism currently applies to orchestration semantics:

  • ready-step ordering
  • state transitions and persistence
  • trace generation and replayability of engine decisions

LLM-generated text/content can still vary across runs unless provider/model/settings enforce deterministic output behavior. Treat content reproducibility as a separate concern from workflow engine determinism.

Deterministic Mode

Control LLM temperature for deterministic behavior:

# Default: temperature=0.0 (deterministic)
cargo run -- --workflow feature.md

# Override temperature
ANTIGRAV_LLM_TEMPERATURE=0.7 cargo run -- --workflow feature.md

# Use specific seed (OpenAI/Azure only)
ANTIGRAV_LLM_SEED=42 cargo run -- --workflow feature.md
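A runtime resolving these variables might look like the sketch below. The defaults and parsing rules shown are assumptions based on the documented behavior (temperature defaulting to 0.0), not the actual implementation:

```rust
use std::env;

// Illustrative resolution of the deterministic-mode variables;
// default values mirror the documentation above (assumption).
fn llm_temperature() -> f64 {
    env::var("ANTIGRAV_LLM_TEMPERATURE")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(0.0) // deterministic default
}

fn llm_seed() -> Option<u64> {
    // Seed is only honored by providers that support it (OpenAI/Azure).
    env::var("ANTIGRAV_LLM_SEED").ok().and_then(|v| v.parse().ok())
}
```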

Replay Store (Week 2 Feature)

Save and replay LLM responses for perfect determinism:

# Record mode - save all LLM responses to file
cargo run -- --workflow feature.md --save-replay llm_cache.json

# Replay mode - use cached responses (no LLM calls)
cargo run -- --workflow feature.md --replay-mode llm_cache.json

Benefits:

  • Perfect Determinism: Same inputs → same outputs, always
  • Fast Replay: 10x+ speedup by skipping LLM calls
  • Cost Savings: No API costs during replay
  • Offline Testing: Test workflows without internet

See Deterministic Mode Guide for details.
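The record/replay mechanism can be sketched as a cache keyed by prompt. This is a minimal in-memory model for illustration; the real store persists to a JSON file and its key derivation and error handling are assumptions:

```rust
use std::collections::HashMap;

enum ReplayMode { Record, Replay }

struct ReplayStore {
    mode: ReplayMode,
    cache: HashMap<String, String>, // prompt -> recorded response
}

impl ReplayStore {
    fn complete<F: FnOnce(&str) -> String>(
        &mut self,
        prompt: &str,
        call_llm: F,
    ) -> Result<String, String> {
        match self.mode {
            // Replay mode: serve only from cache, never call the LLM.
            ReplayMode::Replay => self
                .cache
                .get(prompt)
                .cloned()
                .ok_or_else(|| format!("no recorded response for prompt: {prompt}")),
            // Record mode: make the live call and save the response.
            ReplayMode::Record => {
                let resp = call_llm(prompt);
                self.cache.insert(prompt.to_string(), resp.clone());
                Ok(resp)
            }
        }
    }
}
```

Because replay never reaches the provider, replayed runs are deterministic by construction and cost nothing, which is where the speedup and $0 figures above come from.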

LLM Providers (Week 3 Feature)

Support for 6 LLM providers with automatic fallback:

# Choose your provider
export ANTIGRAV_LLM_PROVIDER=openai    # or ollama, gemini, anthropic, azure, bedrock
export OPENAI_API_KEY=sk-proj-...

# Optional: Configure fallback
export ANTIGRAV_LLM_FALLBACK=gemini,anthropic

# Run workflow
cargo run -- --workflow feature.md
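The fallback behavior amounts to trying the primary provider, then each fallback in order, collecting errors along the way. The function signature below is a sketch under assumed names, not the router's real interface:

```rust
// Try each provider in order; return the first success, or all errors.
fn complete_with_fallback(
    providers: &[&str],
    call: impl Fn(&str) -> Result<String, String>,
) -> Result<String, Vec<String>> {
    let mut errors = Vec::new();
    for provider in providers {
        match call(provider) {
            Ok(resp) => return Ok(resp),
            Err(e) => errors.push(format!("{provider}: {e}")),
        }
    }
    Err(errors) // every provider in the chain failed
}
```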

Supported Providers

| Provider | Best For | Cost | Setup |
|--------------|--------------------------|------|-------|
| Ollama | Development, offline | Free | Guide |
| OpenAI | Production, determinism | $$ | Guide |
| Gemini | Cost optimization | $ | Guide |
| Anthropic | Quality, safety | $$$ | Guide |
| Azure OpenAI | Enterprise | $$ | Guide |
| AWS Bedrock | AWS ecosystem | $$$ | Guide |

See LLM Providers Guide for complete documentation.

Quick Examples

Development (Free):

export ANTIGRAV_LLM_PROVIDER=ollama
cargo run -- --workflow feature.md

Production (Reliable):

export ANTIGRAV_LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-proj-...
cargo run -- --workflow feature.md

Cost-Optimized:

export ANTIGRAV_LLM_PROVIDER=gemini
export GEMINI_API_KEY=AIza...
cargo run -- --workflow feature.md

Sandbox Execution (Week 4 Feature)

Automatic process isolation for untrusted skills with resource monitoring:

Trust Tiers

| Tier | Execution | Isolation | Use Case |
|-------------|------------|----------------|------------------------------|
| Trusted | In-process | None | Local, verified skills |
| Constrained | In-process | Permissions | Limited access skills |
| Untrusted | Subprocess | Full isolation | Imported, unverified skills |

Features

  • Process Isolation - Untrusted skills run in separate processes
  • Resource Monitoring - Track CPU, memory, and execution time
  • Automatic Enforcement - Kill processes that exceed their limits
  • Zero Overhead - Trusted skills run in-process (no performance impact)
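The tier-to-execution mapping can be sketched as a simple dispatch. The enum and function names here are illustrative, not the runtime's actual types:

```rust
// Hypothetical routing: which execution path a skill takes based on
// its trust tier, mirroring the table above.
enum TrustTier { Trusted, Constrained, Untrusted }

enum ExecutionRoute { InProcess, InProcessWithPermissions, SandboxSubprocess }

fn route_for(tier: &TrustTier) -> ExecutionRoute {
    match tier {
        TrustTier::Trusted => ExecutionRoute::InProcess, // zero overhead
        TrustTier::Constrained => ExecutionRoute::InProcessWithPermissions,
        TrustTier::Untrusted => ExecutionRoute::SandboxSubprocess, // full isolation
    }
}
```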

Resource Limits

// Configure resource limits
let budget = ExecutionBudget {
    resource_budget: ResourceBudget {
        max_cpu_ms: 10_000,      // 10 seconds CPU
        max_wall_time_ms: 30_000, // 30 seconds wall time
        max_memory_mb: 512,       // 512 MB memory
        ..Default::default()
    },
    ..Default::default()
};
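Enforcement then reduces to periodically comparing observed usage against the budget and killing the process on the first violation. The check below is a sketch: field names mirror the config above, but the monitoring logic is an assumption:

```rust
// Simplified limit check: returns which limit was exceeded, if any.
struct ResourceUsage { cpu_ms: u64, wall_time_ms: u64, memory_mb: u64 }
struct ResourceBudget { max_cpu_ms: u64, max_wall_time_ms: u64, max_memory_mb: u64 }

fn exceeds_budget(usage: &ResourceUsage, budget: &ResourceBudget) -> Option<&'static str> {
    if usage.cpu_ms > budget.max_cpu_ms { return Some("cpu"); }
    if usage.wall_time_ms > budget.max_wall_time_ms { return Some("wall_time"); }
    if usage.memory_mb > budget.max_memory_mb { return Some("memory"); }
    None // within limits; let the process keep running
}
```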

Example: Untrusted Skill

SkillCapability::new(
    "imported_skill",
    "Skill from external source",
    SkillIOType::Text,
    SkillIOType::Text,
    CapabilityPermissions::none(),
    SideEffectClass::ExternalMutation,
)
.with_trust_tier(TrustTier::Untrusted)  // Runs in sandbox

See Sandbox Guide for complete documentation.


Current Release

  • Runtime target: v1.0.1
  • Package version: 1.0.1

Quick Start

cargo run -- --workflow valid_flow.md

Onboard/check local prerequisites:

./scripts/bootstrap.sh
cargo run -- workflow doctor
cargo run -- workflow setup

workflow setup now bootstraps a full core package when missing:

  • rules (runtime/branching/coding/merge)
  • workflows (starter/feature/bugfix/review/release)
  • templates (feature/bugfix/review/release_prompt)
  • roles (architect/implementer/reviewer/resolver/releaser)
  • starter skills (analyze_code/generate_tests/next_steps)

Run deterministic CI gate locally:

./scripts/ci_gate.sh

Optional live provider smoke tests (OpenAI/Gemini):

ANTIGRAV_RUN_LIVE_LLM_TESTS=1 OPENAI_API_KEY=... cargo test llm_subagent_live_smoke_openai -- --nocapture
ANTIGRAV_RUN_LIVE_LLM_TESTS=1 GEMINI_API_KEY=... cargo test llm_subagent_live_smoke_gemini -- --nocapture

Inspect active/previous runs:

cargo run -- workflow list
cargo run -- workflow status
cargo run -- workflow threads

Run a workflow with a reusable template prompt:

cargo run -- workflow start-template feature_prompt --task "add email validation to signup flow"

Run a role-bound workflow launch:

cargo run -- workflow start-role implementer --task "add email validation to signup flow"

Run chat-thread orchestration (thread branch + workflow + optional merge lifecycle):

cargo run -- workflow chat-thread feature-email --message "implement signup email validation"
cargo run -- workflow chat-thread review-thread --message "review current diff" --workflow-id review --template review_prompt --role reviewer --no-merge

Run thread-to-branch lifecycle end-to-end (includes auto conflict resolution attempts):

cargo run -- workflow thread-flow my-thread --target-branch main --validate-command "cargo test"

Run a direct workflow with template/role overrides:

cargo run -- --workflow-id feature --template feature_prompt --task "add email validation to signup flow"
cargo run -- --workflow-id feature --role-override "architect=planner,implementer=debugger"

Inspect available role profiles and templates:

cargo run -- workflow roles
cargo run -- workflow templates

Scaffold markdown package files with schema headers:

cargo run -- workflow scaffold workflow feature-search --profile advanced
cargo run -- workflow scaffold skill search_docs --profile advanced

Skill scaffolding now follows a folder layout:

  • .agents/skills/<skill-name>/SKILL.md

Generate an advanced domain pack (workflows + skills + roles + templates):

cargo run -- workflow scaffold-domain payments

Rebuild the graph index for context retrieval (also refreshes SQLite context tables in .agents/memory/context.db):

cargo run -- workflow index-graph

Run skill quality validation and strict gate:

cargo run -- workflow quality-skills
cargo run -- workflow quality-skills --strict

List available curated bundles:

cargo run -- workflow bundles
cargo run -- workflow bundles --json

Hot domain bundles currently included:

  • ai-engineering
  • cloud-platform
  • cybersecurity
  • data-ml-eval
  • healthtech
  • climate-tech

Quick examples:

cargo run -- --workflow-id ai-engineering/feature --template ai-engineering/feature_prompt --task "build eval pipeline for support agent"
cargo run -- --workflow-id cybersecurity/review --template cybersecurity/review_prompt --task "review auth middleware diff for vulnerabilities"

Security Workflow and Internet Skill Gate

This repository now enforces a package rule:

  • workflows using internet-capable skills must include an explicit security-check step
  • recommended gate step is internet_security_check using cybersecurity.security_scan_guard

Run the dedicated security workflow:

cargo run -- --workflow-id cybersecurity/security-scan --template cybersecurity/security_scan_prompt --task "scan internet-surface risks and policy drift"

Run security/package validation gates:

cargo run -- workflow check
cargo run -- workflow quality-skills --strict

Run workflow report evaluation gate from dataset:

cargo run -- workflow eval .agents/evals/release_eval.json
cargo run -- workflow eval .agents/evals/release_eval.json --min-pass-rate 0.9 --json
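The --min-pass-rate gate presumably reduces to a ratio check over the dataset results. The exact semantics (e.g. how an empty dataset is treated) are assumptions in this sketch:

```rust
// Gate passes when passed / total >= min_pass_rate.
fn eval_gate(passed: usize, total: usize, min_pass_rate: f64) -> bool {
    if total == 0 {
        return false; // assumption: an empty dataset fails the gate
    }
    (passed as f64 / total as f64) >= min_pass_rate
}
```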

Manual approval gate for release/review-sensitive runs:

cargo run -- workflow status <instance_id>
cargo run -- workflow approve <instance_id> --step manual_approval_gate --by release-manager --note "qa+security passed"
cargo run -- workflow resume <instance_id>

To explicitly block a run at the gate:

cargo run -- workflow reject <instance_id> --step manual_approval_gate --by security --note "critical risk unresolved"

Generate catalog/manifest/lock artifacts for skill bundles:

cargo run -- workflow build-catalog

Import third-party SKILL.md repos into local .agents/skills/imported:

cargo run -- workflow import-skills https://github.com/anthropics/skills --max-skills 20
cargo run -- workflow import-skills https://github.com/anthropics/skills --allow-missing-license
cargo run -- workflow import-skills https://github.com/anthropics/skills --mode global --allow-missing-license

Imported skills are normalized to folder layout:

  • .agents/skills/imported/<skill-name>/SKILL.md

Install using installer-style alias command:

cargo run -- workflow install-skillpack https://github.com/anthropics/skills --mode local --allow-missing-license
cargo run -- workflow install-skillpack https://github.com/anthropics/skills --mode global --allow-missing-license

Sync existing imported skills using pinned source commit/provenance from .agents/skills.lock.json:

cargo run -- workflow sync-imports --overwrite
cargo run -- workflow sync-imports --mode global --overwrite --allow-missing-license

Normalize existing imported skill metadata (risk/source/tags) without re-pulling upstream repos:

cargo run -- workflow normalize-imported-skills
cargo run -- workflow normalize-imported-skills --dry-run --json
cargo run -- workflow normalize-imported-skills --mode global --json

Install a curated bundle into local/global skills root:

cargo run -- workflow install-bundle core
cargo run -- workflow install-bundle imported --mode global --overwrite

Verify lock integrity (detect missing/changed/extra skill entries):

cargo run -- workflow verify-lock
cargo run -- workflow verify-lock --mode global --fail-on-extra
cargo run -- workflow verify-lock --require-attestation

--mode local writes to .agents/skills/imported; --mode global writes to $CODEX_HOME/skills/imported (fallback: ~/.codex/skills/imported).
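The documented resolution order can be sketched as follows; the helper name and error handling are assumptions, but the path precedence matches the description above:

```rust
use std::env;
use std::path::PathBuf;

// --mode local: repo-relative path.
// --mode global: $CODEX_HOME/skills/imported, falling back to
// ~/.codex/skills/imported when CODEX_HOME is unset.
fn imported_skills_root(global: bool) -> PathBuf {
    if !global {
        return PathBuf::from(".agents/skills/imported");
    }
    let base = env::var("CODEX_HOME")
        .map(PathBuf::from)
        .or_else(|_| env::var("HOME").map(|h| PathBuf::from(h).join(".codex")))
        .unwrap_or_else(|_| PathBuf::from(".codex"));
    base.join("skills/imported")
}
```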

Register/list/ping MCP runtime servers (stored in .agents/mcp/servers.json):

cargo run -- workflow mcp-register ollama-cli --transport stdio --command npx --arg -y --arg mcp-client-for-ollama --arg --ollama-host --arg http://127.0.0.1:11434
cargo run -- workflow mcp-register local-supabase --transport http --url http://127.0.0.1:54321/mcp --allow-tool query --allow-tool list_tables
cargo run -- workflow mcp-list
cargo run -- workflow mcp-ping
cargo run -- workflow mcp-ping ollama-cli --timeout-ms 8000 --json
cargo run -- workflow mcp-policy local-supabase --tool query

Resume a run:

cargo run -- workflow resume <instance_id>

Export trace:

cargo run -- workflow trace <instance_id> --json
cargo run -- workflow trace <instance_id> --timeline
cargo run -- workflow trace <instance_id> --otel

LLM Router

llm_subagent now routes real provider calls with timeout/retry/fallback and normalized telemetry output.

Common environment variables:

ANTIGRAV_LLM_PROVIDER=ollama|openai|gemini|anthropic|azure|bedrock
ANTIGRAV_LLM_MODEL=<primary model>
ANTIGRAV_LLM_FALLBACK=openai,gemini,anthropic
ANTIGRAV_LLM_FALLBACK_POLICY=transient_only|always|never
ANTIGRAV_LLM_TIMEOUT_MS=30000
ANTIGRAV_LLM_MAX_RETRIES=2
ANTIGRAV_LLM_SIMULATION_FALLBACK=true
OPENAI_API_KEY=...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...

Context retrieval service (for deterministic LLM context injection):

ANTIGRAV_CONTEXT_RETRIEVAL_MODE=vector|graph|hybrid|off
ANTIGRAV_CONTEXT_BACKEND=json|sqlite
ANTIGRAV_CONTEXT_INDEX_PATH=.agents/memory/vector_index.json
ANTIGRAV_CONTEXT_MIN_SCORE=0.1
ANTIGRAV_CONTEXT_GRAPH_INDEX_PATH=.agents/memory/graph_index.json
ANTIGRAV_CONTEXT_GRAPH_MIN_SCORE=0.05
ANTIGRAV_CONTEXT_DB_PATH=.agents/memory/context.db
ANTIGRAV_CONTEXT_VECTOR_TABLE=vector_entries
ANTIGRAV_CONTEXT_GRAPH_TABLE=graph_nodes
ANTIGRAV_CONTEXT_MAX_ITEMS=5
ANTIGRAV_CONTEXT_MAX_CHARS=300
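Resolving these settings might look like the sketch below. The numeric defaults mirror the values listed above; the struct shape and the "off" default for the retrieval mode are assumptions:

```rust
use std::env;

// Hypothetical config struct for the context retrieval service.
struct ContextConfig {
    mode: String,
    max_items: usize,
    max_chars: usize,
    min_score: f64,
}

fn context_config() -> ContextConfig {
    let get = |key: &str, default: &str| env::var(key).unwrap_or_else(|_| default.to_string());
    ContextConfig {
        mode: get("ANTIGRAV_CONTEXT_RETRIEVAL_MODE", "off"), // default is an assumption
        max_items: get("ANTIGRAV_CONTEXT_MAX_ITEMS", "5").parse().unwrap_or(5),
        max_chars: get("ANTIGRAV_CONTEXT_MAX_CHARS", "300").parse().unwrap_or(300),
        min_score: get("ANTIGRAV_CONTEXT_MIN_SCORE", "0.1").parse().unwrap_or(0.1),
    }
}
```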

Generated Artifacts

These files are generated and ignored by git by default:

  • .agents/catalog/*
  • .agents/skills_index.json
  • .agents/workflows.json
  • .agents/bundles.json
  • .agents/marketplace.json
  • .agents/skills.lock.json
  • .agents/skills/imported/*

Regenerate them anytime with:

cargo run -- workflow build-catalog

Roadmap

The current gap closure plan is tracked in docs/GAP_QUICK_REFERENCE.md and FINAL_SUMMARY.md.

Skill Format

Skill markdown supports both formats:

  • frontmatter metadata (--- block with at least name/domain/executor)
  • fenced JSON metadata block (existing format)

Folder-skill layout is supported:

  • .agents/skills/<skill-name>/SKILL.md
  • optional .agents/skills/<skill-name>/references/ and .agents/skills/<skill-name>/scripts/
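Reading the frontmatter variant (a --- block with at least name/domain/executor) might look like this minimal sketch; the parsing rules and rejection behavior are assumptions, not the loader's actual logic:

```rust
use std::collections::HashMap;

// Parse a leading `---` frontmatter block into key/value metadata.
// Returns None if the block is missing, unterminated, or lacks the
// minimum required keys (name/domain/executor).
fn parse_frontmatter(skill_md: &str) -> Option<HashMap<String, String>> {
    let mut lines = skill_md.lines();
    if lines.next()?.trim() != "---" {
        return None; // not the frontmatter format
    }
    let mut meta = HashMap::new();
    for line in lines {
        if line.trim() == "---" {
            let ok = ["name", "domain", "executor"].iter().all(|k| meta.contains_key(*k));
            return if ok { Some(meta) } else { None };
        }
        if let Some((k, v)) = line.split_once(':') {
            meta.insert(k.trim().to_string(), v.trim().to_string());
        }
    }
    None // unterminated block
}
```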

Installation

cargo install --path .
antigrav workflow doctor

Documentation

About

From Spec to Production: Fully Automated

License

MIT (see LICENSE and LICENSE-MIT), with additional terms in LICENSE-APACHE.
