Use Cases and Acceptance Traceability

This document maps TeaAgent's current acceptance coverage against the common usage standards visible in the READMEs of mainstream coding agents: Hermes Agent, OpenCode, Claude Code, and Codex. It separates implemented acceptance stories from market-standard product gaps that still need acceptance tests, documentation, or production hardening.

Generated matrix: use-case-matrix.md

Landscape survey (reviewed 2026-05-24): scripts/refresh_agent_readme_survey.md

Status Key

Label	Meaning
Implemented	Shipped with acceptance test coverage.
Partial	Implemented but missing surface docs, acceptance tests, or production hardening.
Planned	Identified gap; no implementation yet.

Implemented parity (competitive baseline)

Mainstream coding-agent expectations from Codex, Claude Code, OpenCode, OpenHands, and Aider are largely covered by acceptance flows today. TeaAgent does not need to replicate framework-native graph/crew orchestration; the harness focuses on governance, audit, and portable protocol surfaces.

Area	Status	Primary evidence
Terminal-first CLI/TUI	Implemented	`test_daily_cli.py`, `test_daily_tui.py`
First-run onboarding + provider readiness	Implemented	`test_first_run_experience_flow.py`, `test_provider_matrix_consistency_flow.py`
Repo instructions (`AGENTS.md`)	Implemented	`test_agents_md_injection_flow.py`
Read-only planning	Implemented	`test_plan_mode_read_only_flow.py`
Edit/test/diff loop + undo	Implemented	`test_workspace_edit_flow.py`, `test_run_undo_acceptance_flow.py`
Permission modes + policy	Implemented	`test_policy_as_code_flow.py`, `test_cancel_flow.py`
MCP + skills/plugins + hooks	Implemented	`test_remote_mcp_consumption_flow.py`, `test_skill_install_flow.py`, `test_hooks.py`
Memory + session continuity	Implemented	`test_memory_auto_curation_flow.py`, `test_session_resume_continuity_flow.py`
IDE surface (VS Code)	Implemented	`test_vscode_extension_mcp_boot_flow.py`
Federation (A2A, ANP)	Implemented	`test_anp_adapter_flow.py`, A2A acceptance flows

Competitive Differentiators (Implemented / Maintenance)

These items are intentionally not tracked as parity gaps. They are shipped differentiators identified in the 2026-05-24 landscape survey, and they now need release hygiene, drift checks, and periodic review rather than feature buildout.

Differentiator	Priority	Status and evidence
Docs/provider architecture drift guard	P0	Implemented (`validate_docs_consistency.py`, `test_provider_matrix_consistency_flow.py`)
Subagent lineage and isolation hardening	P1	Implemented (`test_subagent_lineage_flow.py`, `test_subagent_worktree_isolation_flow.py`, `test_subagent_container_isolation_flow.py`, `test_subagent_lineage.py`)
Repo-map / context pack for coding runs	P1	Implemented (`context_pack` on preflight with hybrid/knowledge/GraphQLite read-only hits; `test_context_pack_read_only_flow.py`)
Mode and safety comparison matrix	P1	Implemented (`docs/USAGE.md`, `validate_mode_safety_matrix`)
Multi-surface launch recipes	P1	Implemented (`docs/USAGE.md`, `test_surface_launch_recipes_flow.py`)
Plugin/skill compatibility catalog	P2	Implemented (`docs/plugin-skill-catalog.md`, fixture-backed validator)
Competitive use-case dashboard refresh	P2	Implemented (`refresh_competitive_docs.py`, matrix + HTML dashboard)
Periodic mainstream-agent refresh cadence	P2	Implemented (`docs/release-checklist.md`)

Requirement Baseline

Requirement	Mainstream signal	TeaAgent status	Verification evidence
Terminal-first local agent	Codex, Claude Code, and OpenCode all lead with a local CLI/TUI workflow.	Implemented.	`test_daily_cli.py`, `test_daily_tui.py`
First-run onboarding	Mainstream READMEs put install, setup, first command, and troubleshooting before architecture.	Implemented.	`test_first_run_experience_flow.py`, `test_model_smoke_gating_flow.py`
Project instruction loading	Modern agents rely on repo-local instruction files such as `AGENTS.md` or migration fallbacks.	Implemented.	`test_agents_md_injection_flow.py`
Read-only planning/exploration mode	OpenCode exposes a read-only `plan` agent; other tools distinguish explore/plan from edit/build.	Implemented.	`test_plan_mode_read_only_flow.py`
Build/edit/test/diff loop	Coding agents are expected to read code, edit files, run tests, inspect diffs, and summarize results.	Implemented.	`test_workspace_edit_flow.py`, `test_agent_fix_test_review_flow.py`
Approval and hard policy boundaries	Mainstream agents increasingly expose permission modes, approvals, and sandbox profiles.	Implemented.	`test_policy_as_code_flow.py`, `test_cancel_flow.py`, `test_run_undo_acceptance_flow.py`
Provider/model flexibility	Hermes and OpenCode emphasize no lock-in and multi-provider operation.	Implemented.	`test_provider_matrix_consistency_flow.py`, `test_live_provider_conformance_flow.py`
Tool ecosystem extensibility	MCP, skills/plugins, custom commands, external tools, and semantic code-analysis toolpacks are mainstream extension points.	Implemented.	`test_skill_install_flow.py`, `test_remote_mcp_consumption_flow.py`, `test_external_tool_manifest_compatibility_flow.py`, `test_code_analysis_prompt_injection_flow.py`
Multi-surface operation	Codex and Claude Code support IDE surfaces; Hermes supports messaging gateways; OpenCode supports desktop/client-server surfaces.	Implemented (VS Code surface).	`test_vscode_extension_mcp_boot_flow.py`, `test_vscode_mcp_runtime_smoke_flow.py`
Session continuity and memory	Hermes foregrounds learning loops and memory; terminal agents need resumable sessions.	Implemented.	`test_memory_auto_curation_flow.py`, `test_session_resume_continuity_flow.py`
Reversible change recovery	Production-grade autonomous edit tools need rollback/undo stories.	Implemented.	`test_run_undo_acceptance_flow.py`
Hook lifecycle system	Claude Code and Hermes implement 8-event hooks for extensibility.	Implemented.	`test_hooks.py`
Three-tier memory hierarchy	Claude Code implements Project/Personal/Auto-Memory tiers.	Implemented.	`test_memory.py`
Context compaction	Claude Code triggers auto-compaction at 75-92% token usage.	Implemented.	`test_preflight.py`
Plugin system	Claude Code supports Commands/Agents/Hooks/MCP extension points.	Implemented.	`test_plugins.py`
ACP IDE integration	Protocol for VS Code, Zed, and JetBrains integration.	Implemented.	`test_vscode_*_flow.py`
Read-before-write mtime guard	OpenCode and Codex enforce concurrent modification detection on writes.	Implemented.	`test_mtime_read_before_write_flow.py`
Protected path enforcement	Codex automatically protects `.git`/`.codex`/`.agents` directories.	Implemented.	`test_protected_paths_flow.py`
Declarative sub-agent definitions	Claude Code uses `.claude/agents/*.md` frontmatter; Codex uses config for thread/agent topology.	Implemented.	`test_subagent_definitions_flow.py`
Semantic code navigation (LSP)	OpenCode integrates LSP for diagnostics, definitions, and references.	Implemented.	`test_code_analysis_lsp_flow.py`, `test_code_analysis_prompt_injection_flow.py`
Persistent automation (cron-style)	Hermes-style scheduled agents with collectors, provenance quarantine, and webhook delivery.	Implemented.	`test_automation_wake_agent_gate_skips_unchanged_flow.py`, `test_automation_promote_quarantined_flow.py`, `test_automation_webhook_delivery_flow.py`, `test_automation_status_observability_flow.py`
Self-generated skills (quarantine pipeline)	Agent proposes skill candidates with artifacts, offline eval, and human review before install.	Implemented.	`test_skill_candidate_flow.py`, `test_skill_candidate_contract_policy_provenance_flow.py`, `test_skill_candidate_offline_eval_flow.py`
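
The context compaction row above cites a 75-92% token-usage trigger. A minimal sketch of such a threshold check follows. The function name `should_compact`, its parameters, and the 0.75 default are illustrative assumptions, not TeaAgent's actual preflight API.

```python
def should_compact(used_tokens: int, context_window: int, threshold: float = 0.75) -> bool:
    """Return True once token usage reaches the compaction threshold.

    `threshold` is an assumed default taken from the low end of the
    75-92% range cited in the table; it is illustrative only.
    """
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return used_tokens / context_window >= threshold


# Hypothetical numbers: a 200k-token window at 50% and at 75% usage.
print(should_compact(100_000, 200_000))  # below the threshold
print(should_compact(150_000, 200_000))  # at the threshold
```

A real trigger would likely also account for the size of the next request, which this sketch ignores.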

Current Core Use Cases

Use Case	User Goal	Blast Radius	Rollback Path	Audit Criticality	Primary Acceptance Coverage	Status
Project instruction conformance	Ensure repo-local agent rules are always applied.	high	git revert AGENTS.md	medium	`test_agents_md_injection_flow.py`, `test_first_run_experience_flow.py`	Implemented
Safe autonomous coding run	Execute coding tasks with policy controls and auditability.	high	teaagent agent undo	high	`test_daily_cli.py`, `test_daily_tui.py`, `test_policy_as_code_flow.py`, `test_workspace_edit_flow.py`, `test_agent_fix_test_review_flow.py`	Implemented
Destructive-action governance	Require approval before risky operations.	critical	teaagent agent undo	critical	`test_cancel_flow.py`, `test_daily_cli.py` (pause/resume), `test_policy_as_code_flow.py`, `test_run_undo_acceptance_flow.py`	Implemented
Tool ecosystem extensibility	Load skills and remote MCP tools reliably.	medium	remove skill/MCP config	medium	`test_skill_install_flow.py`, `test_remote_mcp_consumption_flow.py`, `test_mcp_client_flow.py`	Implemented baseline
Reliability and forensics	Preserve run history, webhook delivery, and audit integrity.	high	N/A (read-only verification)	critical	`test_audit_chain_integrity_flow.py`, `test_webhook_audit_flow.py`, `test_cost_tracking_flow.py`	Implemented baseline
Memory continuity	Reuse successful outcomes across runs without manual logging.	low	clear .teaagent/memory/	low	`test_memory_auto_curation_flow.py`, `test_session_resume_continuity_flow.py`	Implemented
IDE-assisted workflows	Operate MCP flows and commands from the VS Code extension.	low	restart VS Code	low	`test_vscode_extension_mcp_boot_flow.py`, `test_vscode_mcp_runtime_smoke_flow.py`	Implemented
Hook lifecycle management	Execute custom logic on tool events (PreToolUse, PostToolUse, etc.).	medium	disable hooks config	medium	`test_hooks.py`	Implemented
Three-tier memory system	Use Project/Personal/Auto-Memory for context persistence.	low	clear memory files	low	`test_memory.py`	Implemented
Context auto-compaction	Automatically compress context when approaching token limits.	low	N/A	low	`test_preflight.py`	Implemented
Plan mode exploration	Explore codebases in read-only mode without modifications.	low	N/A	low	`test_plan_mode_read_only_flow.py`	Implemented
Plugin extensibility	Add custom Commands, Agents, or MCP integrations.	medium	remove plugin	low	`test_plugins.py`	Implemented
LSP code analysis	Navigate codebases with semantic tools (definitions, references, diagnostics, symbols).	low	N/A	low	`test_code_analysis_lsp_flow.py`, `test_code_analysis_prompt_injection_flow.py`	Implemented
Declarative sub-agent management	Define sub-agents via YAML/JSON/Markdown files with isolation, background, and tool restrictions.	medium	remove .teaagent/subagents/	medium	`test_subagent_definitions_flow.py`, `test_subagent_lineage_flow.py`	Implemented
Concurrent modification safety	Prevent silent data loss when files are modified between read and write.	high	N/A	medium	`test_mtime_read_before_write_flow.py`	Implemented
Protected path enforcement	Block accidental writes to .git/ and .teaagent/ by default.	high	N/A	medium	`test_protected_paths_flow.py`	Implemented
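
The concurrent modification row above describes rejecting a write when the file changed between read and write. One way to sketch that guard is to record each file's modification time at read time and compare it before writing. The `GuardedFiles` class and `StaleFileError` exception are hypothetical names for illustration, and the mtime comparison is an assumption about how such a guard could work, not a description of TeaAgent's code.

```python
import os
import tempfile


class StaleFileError(RuntimeError):
    """Raised when a file was not read first or changed after the last read."""


class GuardedFiles:
    """Hypothetical helper that remembers each file's mtime at read time."""

    def __init__(self) -> None:
        self._seen: dict[str, int] = {}

    def read(self, path: str) -> str:
        # Record the mtime before reading so a change during the read is caught later.
        mtime = os.stat(path).st_mtime_ns
        with open(path, encoding="utf-8") as fh:
            text = fh.read()
        self._seen[path] = mtime
        return text

    def write(self, path: str, text: str) -> None:
        if path not in self._seen:
            raise StaleFileError(f"{path} must be read before it is written")
        if os.stat(path).st_mtime_ns != self._seen[path]:
            raise StaleFileError(f"{path} changed since it was last read")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
        self._seen[path] = os.stat(path).st_mtime_ns


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "notes.txt")
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("v1")

    files = GuardedFiles()
    files.read(target)
    files.write(target, "v2")  # allowed: unchanged since the read

    # Simulate another process touching the file after our last write.
    stat = os.stat(target)
    os.utime(target, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    try:
        files.write(target, "v3")
    except StaleFileError as exc:
        print(f"rejected: {exc}")
```

Comparing mtimes is cheap but coarse; a content hash would also catch edits that preserve the timestamp.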

Implemented Market-Standard Use Cases

Use Case	User Goal	Primary Acceptance Coverage	Priority	Status
Product onboarding and provider readiness	Install, initialize, verify providers, and start a safe first run without reading architecture docs.	`test_first_run_experience_flow.py`, `test_model_smoke_gating_flow.py`, `test_live_provider_conformance_flow.py`, `test_provider_matrix_consistency_flow.py`	P0	Implemented
Read-only planning mode	Explore an unfamiliar repo and produce a plan without file edits or shell mutation.	`test_plan_mode_read_only_flow.py`	P0	Implemented
End-to-end code-change loop	Ask the agent to fix a small failing test, apply a scoped edit, rerun tests, inspect diff, and report the result.	`test_workspace_edit_flow.py`, `test_agent_fix_test_review_flow.py`	P0	Implemented
Reversible change recovery	Undo or recover from an agent-authored workspace edit using a user-facing command.	`test_run_undo_acceptance_flow.py`	P1	Implemented
Runtime IDE MCP smoke	Start the workspace MCP endpoint from the VS Code command and verify an MCP client can attach.	`test_vscode_extension_mcp_boot_flow.py`, `test_vscode_mcp_runtime_smoke_flow.py`	P1	Implemented
Session resume continuity	Resume a paused or completed run and preserve task, observations, memory, and audit context.	`test_session_resume_continuity_flow.py`	P1	Implemented
External ecosystem compatibility	Validate representative MCP manifests, skill metadata, and tool annotations against TeaAgent's registry contract.	`test_external_tool_manifest_compatibility_flow.py`	P2	Implemented
Semantic code navigation (LSP)	Navigate codebases with go-to-definition, find-references, diagnostics, and document symbols via LSP-backed tools.	`test_code_analysis_lsp_flow.py`	P0	Implemented
Concurrent modification safety	Prevent silent data loss by rejecting writes when files were modified between read and write.	`test_mtime_read_before_write_flow.py`	P0	Implemented
Protected path enforcement	Block accidental writes to `.git/` and `.teaagent/` with built-in default deny rules.	`test_protected_paths_flow.py`	P1	Implemented
Declarative sub-agent orchestration	Define agent roles in `.teaagent/subagents/*.md` (Markdown frontmatter) mirroring Claude Code's `.claude/agents/` convention.	`test_subagent_definitions_flow.py`	P1	Implemented
Persistent automation / cron agent	Schedule repo watchers and script-first collectors with provenance gates and optional webhook delivery.	`test_automation_wake_agent_gate_skips_unchanged_flow.py`, `test_automation_context_from_chain_flow.py`, `test_automation_webhook_delivery_flow.py`	P2	Implemented
Self-generated skill candidates	Propose skills from completed runs; offline eval + review before install to active skill dirs.	`test_skill_candidate_flow.py`, `test_skill_candidate_offline_eval_flow.py`, `test_skill_activation_explain_flow.py`	P2	Implemented
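
The protected path row above describes built-in default deny rules for `.git/` and `.teaagent/`. A minimal sketch of such a check resolves the target path and inspects its first component relative to the workspace. The `is_protected` function and `PROTECTED_DIRS` constant are hypothetical, and denying paths that escape the workspace is an added assumption rather than documented behavior.

```python
from pathlib import Path

# Assumed default deny list, based on the two directories named in the table.
PROTECTED_DIRS = (".git", ".teaagent")


def is_protected(workspace: Path, target: Path) -> bool:
    """Return True if `target` lands in a protected directory or outside the workspace."""
    root = workspace.resolve()
    resolved = (root / target).resolve()  # collapses ".." segments before checking
    try:
        relative = resolved.relative_to(root)
    except ValueError:
        return True  # assumption: writes that escape the workspace are denied
    return bool(relative.parts) and relative.parts[0] in PROTECTED_DIRS


workspace = Path("workspace")  # hypothetical workspace root
for candidate in (".git/config", "src/main.py", "src/../.teaagent/memory/run.json", ".gitignore"):
    print(candidate, is_protected(workspace, Path(candidate)))
```

Resolving before matching matters: a plain string prefix check would miss `src/../.git/HEAD`.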

Completed Delivery Plan

Completed (P0): Provider/docs consistency acceptance (`test_provider_matrix_consistency_flow.py`).
Completed (P0): Read-only planning acceptance (`test_plan_mode_read_only_flow.py`).
Completed (P0): End-to-end repair loop acceptance (`test_agent_fix_test_review_flow.py`).
Completed (P1): Reversible change recovery acceptance (`test_run_undo_acceptance_flow.py`).
Completed (P1): VS Code runtime MCP smoke acceptance (`test_vscode_mcp_runtime_smoke_flow.py`).
Completed (P1): Session resume continuity acceptance (`test_session_resume_continuity_flow.py`).
Completed (P2): External ecosystem compatibility acceptance (`test_external_tool_manifest_compatibility_flow.py`).
Completed (P2): Published rendered dashboard at `docs/use-case-matrix.html`.
Completed (P0): LSP code analysis acceptance (`test_code_analysis_lsp_flow.py`).
Completed (P0): mtime read-before-write guard (`test_mtime_read_before_write_flow.py`).
Completed (P0): Protected paths default deny rules (`test_protected_paths_flow.py`).
Completed (P1): Declarative sub-agent definitions with Markdown frontmatter (`test_subagent_definitions_flow.py`).
Completed (P1): Context compaction latency SLO (`test_context_compaction_slo_flow.py`).
Completed (P1): Hook lifecycle acceptance elevation (`test_hook_lifecycle_flow.py`).
Completed (P2): Persistent automation with collectors, quarantine, promote, and webhook delivery.
Completed (P2): Self-generated skill candidate pipeline with offline eval and provenance artifacts.
Completed (P2): Automation status observability (prompt ledger, token contributors, gate reasons).
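
The Markdown-frontmatter sub-agent definitions listed above can be illustrated with a small parser. This sketch handles only flat `key: value` pairs and is not a full YAML parser. The field names (`name`, `isolation`, `background`, `tools`) are hypothetical examples suggested by the isolation, background, and tool restrictions mentioned earlier, not a confirmed TeaAgent schema.

```python
def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into flat `key: value` frontmatter and body.

    Only single-line scalar values are handled; nested or multi-line YAML
    would need a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block present
    meta: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[index + 1:])
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    raise ValueError("frontmatter block is not closed with '---'")


# Hypothetical definition file content; the field names are assumptions.
EXAMPLE = """\
---
name: reviewer
isolation: worktree
background: false
tools: read_file, grep
---
Review the diff and report risks. Do not edit files.
"""

meta, body = parse_frontmatter(EXAMPLE)
print(meta)
print(body)
```

The body after the closing `---` would serve as the sub-agent's instructions, while the frontmatter carries its configuration.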

Known productization gaps

TeaAgent ships strong governance and protocol acceptance coverage. The items below are not yet release-grade product surfaces: they have harness primitives and acceptance stories, but their documentation, packaging, and operations still lag behind mainstream daily-use agents.

Gap	Acceptance flow (regression guard)	Priority
First-hour e2e loop	`test_first_hour_e2e_flow.py`	P0
Actionable error recovery	`test_error_recovery_common_misuse_flow.py`	P0
Docs acceptance count accuracy	`test_docs_acceptance_count_accuracy.py`	P0
Background attach / resume / notify	`test_background_attach_resume_notify_flow.py`	P1
Automation vs foreground parity	`test_automation_foreground_parity_flow.py`	P1
Parallel subagent worktree merge story	`test_subagent_parallel_worktree_merge_flow.py`	P1
CLI / TUI surface parity	`test_cli_tui_surface_parity_flow.py`	P1
Desktop client-server session	`test_desktop_client_server_session_flow.py`	P2
Large-repo repo-map quality SLO	`test_repo_map_quality_large_repo_flow.py`	P2
Managed cloud task stub lifecycle	`test_managed_runtime_cloud_task_flow.py`	P2
Plugin / skill install security	`test_plugin_install_security_flow.py`	P2

Partial / Planned Gaps (docs & packaging)

These items are tracked as open gaps from the 2026-05-24 landscape survey. They are not claimed as done — each has a concrete next action.

Gap	Source agent(s)	Current state	Next action	Priority
Background/cloud surface docs	Codex Cloud Tasks, Claude Code background sessions	Partial — acceptance flows exist; hosted guide still thin	Write hosted deployment guide + background session walkthrough	P2
Desktop/client-server packaging	OpenCode desktop, Codex app server	Partial — MCP HTTP acceptance; no desktop bundle	Document desktop launch recipes in USAGE.md	P2
Repo-map quality benchmark	Aider repo-map, OpenCode LSP	Partial — large-repo SLO acceptance; no external benchmark dataset	Publish repo-map accuracy evaluation script + dataset	P2

Evidence Commands

Use these commands as the default claim-verification workflow before updating docs:

python3 scripts/refresh_competitive_docs.py
python3 -m pytest tests/acceptance --collect-only -q
Re-run scripts/refresh_agent_readme_survey.md when upstream agent signals change
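
As a complement to the commands above, a small script can check that every test file cited in a document actually exists on disk. This is a sketch under stated assumptions: the helper names `referenced_tests` and `missing_tests` are hypothetical, and it is not the logic of `refresh_competitive_docs.py` or `validate_docs_consistency.py`.

```python
import re
from pathlib import Path

# Matches backticked names such as `test_hooks.py`; glob-style references
# like `test_vscode_*_flow.py` do not match and are skipped.
TEST_REF = re.compile(r"`(test_[A-Za-z0-9_]+\.py)`")


def referenced_tests(doc_text: str) -> set[str]:
    """Collect the test file names cited in backticks within a document."""
    return set(TEST_REF.findall(doc_text))


def missing_tests(doc_text: str, tests_root: Path) -> list[str]:
    """Return cited test files that are not found anywhere under `tests_root`."""
    present = {path.name for path in tests_root.rglob("test_*.py")}
    return sorted(referenced_tests(doc_text) - present)


sample = "Evidence: `test_hooks.py`, `test_memory.py`, `test_vscode_*_flow.py`"
print(sorted(referenced_tests(sample)))
```

In a repository checkout, a call such as `missing_tests(Path("docs/use-cases.md").read_text(encoding="utf-8"), Path("tests"))` would list stale references; both paths are assumed locations.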
