This document maps TeaAgent's current acceptance coverage against the common usage standards visible in mainstream coding-agent READMEs: Hermes Agent, OpenCode, Claude Code, and Codex. It separates implemented acceptance stories from market-standard product gaps that still need acceptance tests.
Generated matrix: use-case-matrix.md
Landscape survey (reviewed 2026-05-24): scripts/refresh_agent_readme_survey.md
| Label | Meaning |
|---|---|
| Implemented | Shipped with acceptance test coverage. |
| Partial | Implemented but missing surface docs, acceptance tests, or production hardening. |
| Planned | Identified gap; no implementation yet. |
Mainstream coding-agent expectations from Codex, Claude Code, OpenCode, OpenHands, and Aider are largely covered by acceptance flows today. TeaAgent does not need to replicate framework-native graph/crew orchestration; the harness focuses on governance, audit, and portable protocol surfaces.
| Area | Status | Primary evidence |
|---|---|---|
| Terminal-first CLI/TUI | Implemented | test_daily_cli.py, test_daily_tui.py |
| First-run onboarding + provider readiness | Implemented | test_first_run_experience_flow.py, test_provider_matrix_consistency_flow.py |
| Repo instructions (AGENTS.md) | Implemented | test_agents_md_injection_flow.py |
| Read-only planning | Implemented | test_plan_mode_read_only_flow.py |
| Edit/test/diff loop + undo | Implemented | test_workspace_edit_flow.py, test_run_undo_acceptance_flow.py |
| Permission modes + policy | Implemented | test_policy_as_code_flow.py, test_cancel_flow.py |
| MCP + skills/plugins + hooks | Implemented | test_remote_mcp_consumption_flow.py, test_skill_install_flow.py, test_hooks.py |
| Memory + session continuity | Implemented | test_memory_auto_curation_flow.py, test_session_resume_continuity_flow.py |
| IDE surface (VS Code) | Implemented | test_vscode_extension_mcp_boot_flow.py |
| Federation (A2A, ANP) | Implemented | test_anp_adapter_flow.py, A2A acceptance flows |
The items below are not parity gaps. They are shipped differentiators identified in the 2026-05-24 landscape survey that now need release hygiene, drift checks, and periodic review rather than feature buildout.
| Differentiator | Priority | Backlog reference |
|---|---|---|
| Docs/provider architecture drift guard | P0 | Implemented (validate_docs_consistency.py, test_provider_matrix_consistency_flow.py) |
| Subagent lineage and isolation hardening | P1 | Implemented (test_subagent_lineage_flow.py, test_subagent_worktree_isolation_flow.py, test_subagent_container_isolation_flow.py, test_subagent_lineage.py) |
| Repo-map / context pack for coding runs | P1 | Implemented (context_pack on preflight with hybrid/knowledge/GraphQLite read-only hits; test_context_pack_read_only_flow.py) |
| Mode and safety comparison matrix | P1 | Implemented (docs/USAGE.md, validate_mode_safety_matrix) |
| Multi-surface launch recipes | P1 | Implemented (docs/USAGE.md, test_surface_launch_recipes_flow.py) |
| Plugin/skill compatibility catalog | P2 | Implemented (docs/plugin-skill-catalog.md, fixture-backed validator) |
| Competitive use-case dashboard refresh | P2 | Implemented (refresh_competitive_docs.py, matrix + HTML dashboard) |
| Periodic mainstream-agent refresh cadence | P2 | Implemented (docs/release-checklist.md) |
| Requirement | Mainstream signal | TeaAgent status | Verification evidence |
|---|---|---|---|
| Terminal-first local agent | Codex, Claude Code, and OpenCode all lead with a local CLI/TUI workflow. | Implemented. | test_daily_cli.py, test_daily_tui.py |
| First-run onboarding | Mainstream READMEs put install, setup, first command, and troubleshooting before architecture. | Implemented. | test_first_run_experience_flow.py, test_model_smoke_gating_flow.py |
| Project instruction loading | Modern agents rely on repo-local instruction files such as AGENTS.md or migration fallbacks. | Implemented. | test_agents_md_injection_flow.py |
| Read-only planning/exploration mode | OpenCode exposes a read-only plan agent; other tools distinguish explore/plan from edit/build. | Implemented. | test_plan_mode_read_only_flow.py |
| Build/edit/test/diff loop | Coding agents are expected to read code, edit files, run tests, inspect diffs, and summarize results. | Implemented. | test_workspace_edit_flow.py, test_agent_fix_test_review_flow.py |
| Approval and hard policy boundaries | Mainstream agents increasingly expose permission modes, approvals, and sandbox profiles. | Implemented. | test_policy_as_code_flow.py, test_cancel_flow.py, test_run_undo_acceptance_flow.py |
| Provider/model flexibility | Hermes and OpenCode emphasize no lock-in and multi-provider operation. | Implemented. | test_provider_matrix_consistency_flow.py, test_live_provider_conformance_flow.py |
| Tool ecosystem extensibility | MCP, skills/plugins, custom commands, external tools, and semantic code-analysis toolpacks are mainstream extension points. | Implemented. | test_skill_install_flow.py, test_remote_mcp_consumption_flow.py, test_external_tool_manifest_compatibility_flow.py, test_code_analysis_prompt_injection_flow.py |
| Multi-surface operation | Codex and Claude Code support IDE surfaces; Hermes supports messaging gateways; OpenCode supports desktop/client-server surfaces. | Implemented (VSCode surface). | test_vscode_extension_mcp_boot_flow.py, test_vscode_mcp_runtime_smoke_flow.py |
| Session continuity and memory | Hermes foregrounds learning loops and memory; terminal agents need resumable sessions. | Implemented. | test_memory_auto_curation_flow.py, test_session_resume_continuity_flow.py |
| Reversible change recovery | Production-grade autonomous edit tools need rollback/undo stories. | Implemented. | test_run_undo_acceptance_flow.py |
| Hook lifecycle system | Claude Code and Hermes implement 8-event hooks for extensibility. | Implemented. | test_hooks.py |
| Three-tier memory hierarchy | Claude Code implements Project/Personal/Auto-Memory tiers. | Implemented. | test_memory.py |
| Context compaction | Claude Code triggers auto-compaction at 75-92% token usage. | Implemented. | test_preflight.py |
| Plugin system | Claude Code supports Commands/Agents/Hooks/MCP extension points. | Implemented. | test_plugins.py |
| ACP IDE integration | Protocol for VS Code, Zed, JetBrains integration. | Implemented. | test_vscode_*_flow.py |
| Read-before-write mtime guard | OpenCode and Codex enforce concurrent modification detection on writes. | Implemented. | test_mtime_read_before_write_flow.py |
| Protected path enforcement | Codex automatically protects .git/.codex/.agents directories. | Implemented. | test_protected_paths_flow.py |
| Declarative sub-agent definitions | Claude Code uses .claude/agents/*.md frontmatter; Codex uses config for thread/agent topology. | Implemented. | test_subagent_definitions_flow.py |
| Semantic code navigation (LSP) | OpenCode integrates LSP for diagnostics, definitions, and references. | Implemented. | test_code_analysis_lsp_flow.py, test_code_analysis_prompt_injection_flow.py |
| Persistent automation (cron-style) | Hermes-style scheduled agents with collectors, provenance quarantine, and webhook delivery. | Implemented. | test_automation_wake_agent_gate_skips_unchanged_flow.py, test_automation_promote_quarantined_flow.py, test_automation_webhook_delivery_flow.py, test_automation_status_observability_flow.py |
| Self-generated skills (quarantine pipeline) | Agent proposes skill candidates with artifacts, offline eval, and human review before install. | Implemented. | test_skill_candidate_flow.py, test_skill_candidate_contract_policy_provenance_flow.py, test_skill_candidate_offline_eval_flow.py |
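
The read-before-write mtime guard row above describes rejecting a write when the file changed after the agent last read it. A minimal Python sketch of that idea follows. `MtimeGuard` and `StaleWriteError` are hypothetical names used only for illustration; they are not TeaAgent's API, and the guard exercised by test_mtime_read_before_write_flow.py may work differently.

```python
from pathlib import Path


class StaleWriteError(RuntimeError):
    """Raised when a file was not read first or changed after it was read."""


class MtimeGuard:
    """Hypothetical helper that remembers each file's mtime at read time."""

    def __init__(self) -> None:
        self._seen: dict[Path, int] = {}

    def read(self, path: Path) -> str:
        # Record the mtime before reading so a change during the read is caught later.
        mtime = path.stat().st_mtime_ns
        text = path.read_text(encoding="utf-8")
        self._seen[path.resolve()] = mtime
        return text

    def write(self, path: Path, text: str) -> None:
        key = path.resolve()
        if path.exists():
            recorded = self._seen.get(key)
            if recorded is None:
                raise StaleWriteError(f"{path} was not read before write")
            if path.stat().st_mtime_ns != recorded:
                raise StaleWriteError(f"{path} changed since it was read")
        path.write_text(text, encoding="utf-8")
        # Refresh the recorded mtime so consecutive writes by the same guard succeed.
        self._seen[key] = path.stat().st_mtime_ns
```

In this sketch a brand-new file may be written without a prior read, while an existing file must be read first. The check and the write are not atomic, so a production guard would likely need file locking or an atomic rename on top of this.
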
| Use Case | User Goal | Blast Radius | Rollback Path | Audit Criticality | Primary Acceptance Coverage | Status |
|---|---|---|---|---|---|---|
| Project instruction conformance | Ensure repo-local agent rules are always applied. | high | git revert AGENTS.md | medium | test_agents_md_injection_flow.py, test_first_run_experience_flow.py | Implemented |
| Safe autonomous coding run | Execute coding tasks with policy controls and auditability. | high | teaagent agent undo | high | test_daily_cli.py, test_daily_tui.py, test_policy_as_code_flow.py, test_workspace_edit_flow.py, test_agent_fix_test_review_flow.py | Implemented |
| Destructive-action governance | Require approval before risky operations. | critical | teaagent agent undo | critical | test_cancel_flow.py, test_daily_cli.py (pause/resume), test_policy_as_code_flow.py, test_run_undo_acceptance_flow.py | Implemented |
| Tool ecosystem extensibility | Load skills and remote MCP tools reliably. | medium | remove skill/MCP config | medium | test_skill_install_flow.py, test_remote_mcp_consumption_flow.py, test_mcp_client_flow.py | Implemented baseline |
| Reliability and forensics | Preserve run history, webhook delivery, and audit integrity. | high | N/A (read-only verification) | critical | test_audit_chain_integrity_flow.py, test_webhook_audit_flow.py, test_cost_tracking_flow.py | Implemented baseline |
| Memory continuity | Reuse successful outcomes across runs without manual logging. | low | clear .teaagent/memory/ | low | test_memory_auto_curation_flow.py, test_session_resume_continuity_flow.py | Implemented |
| IDE-assisted workflows | Operate MCP flows and commands from VSCode extension. | low | restart VSCode | low | test_vscode_extension_mcp_boot_flow.py, test_vscode_mcp_runtime_smoke_flow.py | Implemented |
| Hook lifecycle management | Execute custom logic on tool events (PreToolUse, PostToolUse, etc.). | medium | disable hooks config | medium | test_hooks.py | Implemented |
| Three-tier memory system | Use Project/Personal/Auto-Memory for context persistence. | low | clear memory files | low | test_memory.py | Implemented |
| Context auto-compaction | Automatically compress context when approaching token limits. | low | N/A | low | test_preflight.py | Implemented |
| Plan mode exploration | Explore codebases in read-only mode without modifications. | low | N/A | low | test_plan_mode_read_only_flow.py | Implemented |
| Plugin extensibility | Add custom Commands, Agents, or MCP integrations. | medium | remove plugin | low | test_plugins.py | Implemented |
| LSP code analysis | Navigate codebases with semantic tools (definitions, references, diagnostics, symbols). | low | N/A | low | test_code_analysis_lsp_flow.py, test_code_analysis_prompt_injection_flow.py | Implemented |
| Declarative sub-agent management | Define sub-agents via YAML/JSON/Markdown files with isolation, background, and tool restrictions. | medium | remove .teaagent/subagents/ | medium | test_subagent_definitions_flow.py, test_subagent_lineage_flow.py | Implemented |
| Concurrent modification safety | Prevent silent data loss when files are modified between read and write. | high | N/A | medium | test_mtime_read_before_write_flow.py | Implemented |
| Protected path enforcement | Block accidental writes to .git/ and .teaagent/ by default. | high | N/A | medium | test_protected_paths_flow.py | Implemented |
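
The protected path enforcement row above blocks writes to .git/ and .teaagent/ by default. The sketch below shows one way such a default-deny check could be written in Python. The names `PROTECTED_DIRS` and `is_protected` are hypothetical, and treating paths that escape the workspace as protected is an extra assumption of this sketch rather than documented TeaAgent behavior.

```python
from pathlib import Path

# Assumed default deny list, taken from the table row above.
PROTECTED_DIRS = (".git", ".teaagent")


def is_protected(workspace: Path, target: Path) -> bool:
    """Return True if target lands in a protected directory or outside the workspace."""
    root = workspace.resolve()
    # Resolve first so that "src/../.git/config" cannot slip past the check.
    resolved = (root / target).resolve()
    try:
        relative = resolved.relative_to(root)
    except ValueError:
        return True  # Assumption: anything outside the workspace is denied.
    return bool(relative.parts) and relative.parts[0] in PROTECTED_DIRS
```

The comparison here is case-sensitive and only inspects the first path component, so a real implementation on a case-insensitive filesystem, or one that protects nested directories, would need additional normalization.
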
| Use Case | User Goal | Blast Radius | Rollback Path | Audit Criticality | Required Acceptance Coverage | Priority | Status |
|---|---|---|---|---|---|---|---|
| Product onboarding and provider readiness | Install, initialize, verify providers, and start a safe first run without reading architecture docs. |  |  |  | test_first_run_experience_flow.py, test_model_smoke_gating_flow.py, test_live_provider_conformance_flow.py, test_provider_matrix_consistency_flow.py | P0 | Implemented |
| Read-only planning mode | Explore an unfamiliar repo and produce a plan without file edits or shell mutation. |  |  |  | test_plan_mode_read_only_flow.py | P0 | Implemented |
| End-to-end code-change loop | Ask the agent to fix a small failing test, apply a scoped edit, rerun tests, inspect diff, and report the result. |  |  |  | test_workspace_edit_flow.py, test_agent_fix_test_review_flow.py | P0 | Implemented |
| Reversible change recovery | Undo or recover from an agent-authored workspace edit using a user-facing command. |  |  |  | test_run_undo_acceptance_flow.py | P1 | Implemented |
| Runtime IDE MCP smoke | Start the workspace MCP endpoint from the VSCode command and verify an MCP client can attach. |  |  |  | test_vscode_extension_mcp_boot_flow.py, test_vscode_mcp_runtime_smoke_flow.py | P1 | Implemented |
| Session resume continuity | Resume a paused or completed run and preserve task, observations, memory, and audit context. |  |  |  | test_session_resume_continuity_flow.py | P1 | Implemented |
| External ecosystem compatibility | Validate representative MCP manifests, skill metadata, and tool annotations against TeaAgent's registry contract. |  |  |  | test_external_tool_manifest_compatibility_flow.py | P2 | Implemented |
| Semantic code navigation (LSP) | Navigate codebases with go-to-definition, find-references, diagnostics, and document symbols via LSP-backed tools. |  |  |  | test_code_analysis_lsp_flow.py | P0 | Implemented |
| Concurrent modification safety | Prevent silent data loss by rejecting writes when files were modified between read and write. |  |  |  | test_mtime_read_before_write_flow.py | P0 | Implemented |
| Protected path enforcement | Block accidental writes to .git/ and .teaagent/ with built-in default deny rules. |  |  |  | test_protected_paths_flow.py | P1 | Implemented |
| Declarative sub-agent orchestration | Define agent roles in .teaagent/subagents/*.md (Markdown frontmatter) mirroring Claude Code's .claude/agents/ convention. |  |  |  | test_subagent_definitions_flow.py | P1 | Implemented |
| Persistent automation / cron agent | Schedule repo watchers and script-first collectors with provenance gates and optional webhook delivery. |  |  |  | test_automation_wake_agent_gate_skips_unchanged_flow.py, test_automation_context_from_chain_flow.py, test_automation_webhook_delivery_flow.py | P2 | Implemented |
| Self-generated skill candidates | Propose skills from completed runs; offline eval + review before install to active skill dirs. |  |  |  | test_skill_candidate_flow.py, test_skill_candidate_offline_eval_flow.py, test_skill_activation_explain_flow.py | P2 | Implemented |
- Completed (P0): Provider/docs consistency acceptance (test_provider_matrix_consistency_flow.py).
- Completed (P0): Read-only planning acceptance (test_plan_mode_read_only_flow.py).
- Completed (P0): End-to-end repair loop acceptance (test_agent_fix_test_review_flow.py).
- Completed (P1): Reversible change recovery acceptance (test_run_undo_acceptance_flow.py).
- Completed (P1): VSCode runtime MCP smoke acceptance (test_vscode_mcp_runtime_smoke_flow.py).
- Completed (P1): Session resume continuity acceptance (test_session_resume_continuity_flow.py).
- Completed (P2): External ecosystem compatibility acceptance (test_external_tool_manifest_compatibility_flow.py).
- Completed (P2): Published rendered dashboard at docs/use-case-matrix.html.
- Completed (P0): LSP code analysis acceptance (test_code_analysis_lsp_flow.py).
- Completed (P0): mtime read-before-write guard (test_mtime_read_before_write_flow.py).
- Completed (P0): Protected paths default deny rules (test_protected_paths_flow.py).
- Completed (P1): Declarative sub-agent definitions with Markdown frontmatter (test_subagent_definitions_flow.py).
- Completed (P1): Context compaction latency SLO (test_context_compaction_slo_flow.py).
- Completed (P1): Hook lifecycle acceptance elevation (test_hook_lifecycle_flow.py).
- Completed (P2): Persistent automation with collectors, quarantine, promote, and webhook delivery.
- Completed (P2): Self-generated skill candidate pipeline with offline eval and provenance artifacts.
- Completed (P2): Automation status observability (prompt ledger, token contributors, gate reasons).
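
The context compaction items above rest on a token-usage trigger; the requirements table cites a 75-92% band for Claude Code. As a rough illustration only, the Python sketch below shows a threshold check of that kind. The function name `should_compact` and the 0.75 default are assumptions made for this example and do not describe TeaAgent's preflight logic.

```python
def should_compact(used_tokens: int, context_window: int, threshold: float = 0.75) -> bool:
    """Return True once token usage reaches the trigger fraction of the window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    # Compare in token units to avoid dividing by the window size.
    return used_tokens >= threshold * context_window
```

A caller would evaluate this before each model request and run compaction when it returns True; raising the threshold toward 0.92 trades more usable context for less headroom.
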
TeaAgent ships strong governance and protocol acceptance coverage. The items below are not yet release-grade product surfaces: they have harness primitives and acceptance stories, but documentation, packaging, and operations still lag behind mainstream daily-use agents.
| Gap | Acceptance flow (regression guard) | Priority |
|---|---|---|
| First-hour e2e loop | test_first_hour_e2e_flow.py | P0 |
| Actionable error recovery | test_error_recovery_common_misuse_flow.py | P0 |
| Docs acceptance count accuracy | test_docs_acceptance_count_accuracy.py | P0 |
| Background attach / resume / notify | test_background_attach_resume_notify_flow.py | P1 |
| Automation vs foreground parity | test_automation_foreground_parity_flow.py | P1 |
| Parallel subagent worktree merge story | test_subagent_parallel_worktree_merge_flow.py | P1 |
| CLI / TUI surface parity | test_cli_tui_surface_parity_flow.py | P1 |
| Desktop client-server session | test_desktop_client_server_session_flow.py | P2 |
| Large-repo repo-map quality SLO | test_repo_map_quality_large_repo_flow.py | P2 |
| Managed cloud task stub lifecycle | test_managed_runtime_cloud_task_flow.py | P2 |
| Plugin / skill install security | test_plugin_install_security_flow.py | P2 |
These items are tracked as open gaps from the 2026-05-24 landscape survey. They are not claimed as done — each has a concrete next action.
| Gap | Source agent(s) | Current state | Next action | Priority |
|---|---|---|---|---|
| Background/cloud surface docs | Codex Cloud Tasks, Claude Code background sessions | Partial — acceptance flows exist; hosted guide still thin | Write hosted deployment guide + background session walkthrough | P2 |
| Desktop/client-server packaging | OpenCode desktop, Codex app server | Partial — MCP HTTP acceptance; no desktop bundle | Document desktop launch recipes in USAGE.md | P2 |
| Repo-map quality benchmark | Aider repo-map, OpenCode LSP | Partial — large-repo SLO acceptance; no external benchmark dataset | Publish repo-map accuracy evaluation script + dataset | P2 |
Use these commands as the default claim-verification workflow before updating docs:
- python3 scripts/refresh_competitive_docs.py
- python3 -m pytest tests/acceptance --collect-only -q
- Re-run scripts/refresh_agent_readme_survey.md when upstream agent signals change
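
As a complement to the commands above, the following Python sketch shows one possible way to check that every test file named in this document exists on disk. It is not part of the TeaAgent scripts: `referenced_tests`, `missing_tests`, and the `tests_root` parameter are hypothetical, and the sketch assumes the referenced files live somewhere under a single tests directory.

```python
import re
from pathlib import Path

# Matches literal names such as test_hooks.py; wildcard entries like
# test_vscode_*_flow.py are deliberately not matched.
TEST_NAME = re.compile(r"\btest_[A-Za-z0-9_]+\.py\b")


def referenced_tests(markdown: str) -> set[str]:
    """Collect every literal test file name mentioned in the document text."""
    return set(TEST_NAME.findall(markdown))


def missing_tests(markdown: str, tests_root: Path) -> list[str]:
    """Return referenced test files that are not found under tests_root."""
    present = {path.name for path in tests_root.rglob("test_*.py")}
    return sorted(referenced_tests(markdown) - present)
```

A non-empty result from `missing_tests` would point at a row whose evidence column has drifted from the repository, which is the kind of claim this workflow is meant to catch before the docs are updated.
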