Mission Control — Comprehensive Manual E2E Test Plan
Context: We are dogfooding this plugin from within the plugin's own repo.
All mc_* tool calls are made by the AI agent (us) in this session.
This plan covers all 17 MCP tools, all 12 job states, and all 8 plan states.
Dynamic Path Convention: All paths use $(basename $(git rev-parse --show-toplevel)) instead of
hardcoded project names. This makes the plan portable across repos and forks.
Emergency Nuclear Cleanup Script
Run this FIRST if the environment is dirty, or LAST to guarantee clean state.
This script is idempotent and safe to run at any time. Every command is fault-tolerant.
#!/bin/bash
# Mission Control: Nuclear Cleanup
# Safe to run at any time. All commands are idempotent.
PROJECT_NAME=$(basename $(git rev-parse --show-toplevel))
STATE_DIR=~/.local/share/opencode-mission-control/$PROJECT_NAME
echo "=== Nuclear Cleanup: $PROJECT_NAME ==="

# 1. Kill ALL mc-tmc-* tmux sessions
echo "Killing tmux sessions..."
for s in $(tmux list-sessions -F '#{session_name}' 2>/dev/null | grep '^mc-tmc-'); do
  tmux kill-session -t "$s" 2>/dev/null || true
done

# 2. Remove ALL tmc-* worktrees
echo "Removing worktrees..."
for wt in $(git worktree list --porcelain 2>/dev/null | grep '^worktree ' | awk '{print $2}' | grep 'tmc-'); do
  git worktree remove --force "$wt" 2>/dev/null || true
done

# 3. Remove integration worktrees (from plan tests)
for wt in $(git worktree list --porcelain 2>/dev/null | grep '^worktree ' | awk '{print $2}' | grep 'mc-integration'); do
  git worktree remove --force "$wt" 2>/dev/null || true
done

# 4. Prune stale worktree references
git worktree prune 2>/dev/null || true

# 5. Delete ALL mc/tmc-* branches
echo "Deleting test branches..."
for br in $(git branch --list 'mc/tmc-*' 2>/dev/null); do
  git branch -D "$br" 2>/dev/null || true
done

# 6. Delete ALL mc/integration-* branches (flat pattern)
for br in $(git branch --list 'mc/integration-*' 2>/dev/null); do
  git branch -D "$br" 2>/dev/null || true
done

# 7. Delete ALL mc/integration/* branches (nested pattern)
for br in $(git branch --list 'mc/integration/*' 2>/dev/null); do
  git branch -D "$br" 2>/dev/null || true
done

# 8. Clean report files
echo "Cleaning report files..."
rm -f "$STATE_DIR/reports/"*.json 2>/dev/null || true
rm -f "$STATE_DIR/reports/"*.json.tmp 2>/dev/null || true

# 9. Clean plan state
echo "Cleaning plan state..."
rm -f "$STATE_DIR/state/plan.json" 2>/dev/null || true

# 10. Final prune
git worktree prune 2>/dev/null || true

# 11. Clean jobs state
echo "Cleaning jobs state..."
rm -f "$STATE_DIR/state/jobs.json" 2>/dev/null || true
echo "=== Nuclear Cleanup Complete ==="
Standalone Rebuild & Restart Procedure
If the plugin needs rebuilding mid-test (e.g., after a code change):
# 1. Build the plugin
bun run build
# 2. Verify the build output
ls -la dist/index.js
# 3. Restart OpenCode (the plugin re-loads on startup)
# Exit and re-enter your OpenCode session.
# 4. Verify plugin loaded
# Run mc_overview; if it responds, the plugin is loaded.
Safety Rules
Test name prefix: All test jobs use the prefix tmc- (test-mission-control) to distinguish from real work.
No pushes: We never call mc_pr or git push during testing. mc_pr is verified structurally only (documented but never invoked).
No OMO modes except plan mode: We only test vanilla and plan modes. Never ralph or ulw — these launch recursive agent loops.
Simple prompts only: Test job prompts create trivial files only (e.g., echo hello > test.txt).
Snapshot & restore: We record pre-test state (Phase 0) and verify post-test state matches (Phase 12).
SHA-based resets: NEVER use HEAD~N for git resets. Always save SHAs before merges (e.g., $PHASE4_SHA) and reset to them explicitly.
Wait 3-5 seconds after every mc_launch: The tmux session and worktree need time to initialize before monitoring/capture calls.
Always deleteBranch=true on cleanup: Every mc_cleanup call must include deleteBranch=true to prevent branch leaks.
Cancel before completion: Plans must be cancelled before all jobs reach merged state. If all jobs merge, the plan auto-pushes to remote and enters creating_pr state.
Dynamic paths only: Never hardcode project names in paths. Always use $(basename $(git rev-parse --show-toplevel)) or the $PROJECT_NAME variable.
Agent timing: Simple prompts (echo, file creation) complete in 10-25 seconds. If you need the agent to be in running state when you check, either check within 5-10 seconds of launch, use a longer-running prompt like "Read every file in src/ and summarize each one", or kill the agent immediately after launch.
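The SHA-based reset rule above can be sketched as a pair of shell helpers. This is a minimal sketch, not part of the plugin: the function names `save_sha` and `reset_to_sha` are hypothetical, and it assumes the commands run inside the repository under test.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the SHA-based reset rule (not part of the plugin).

# Print the full SHA of the current HEAD so it can be stored before a merge.
save_sha() {
  git rev-parse HEAD
}

# Reset the current branch to a previously saved SHA.
# Refuses to run if the argument is empty or does not name a known commit.
reset_to_sha() {
  local sha="$1"
  if [ -z "$sha" ] || ! git cat-file -e "${sha}^{commit}" 2>/dev/null; then
    echo "refusing to reset: '$sha' is not a known commit" >&2
    return 1
  fi
  git reset --hard "$sha"
}

# Usage:
#   PHASE4_SHA=$(save_sha)      # before the merge tests
#   reset_to_sha "$PHASE4_SHA"  # after the merge tests
```

Validating the SHA first means an unset variable fails loudly instead of resetting to an unintended commit.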
Quick Smoke Test (5 minutes)
Run these tests for basic validation after a code change. References use test IDs from the full phases.
| Step | Source | Test | Purpose |
| --- | --- | --- | --- |
| 1 | Phase 0 | Nuclear Cleanup | Clean environment |
| 2 | Phase 1 | 1.1-1.6 (launch → status → capture → kill → cleanup) | Core lifecycle |
| 3 | Phase 1 | 1.7 (duplicate name rejected) | Input validation |
| 4 | Phase 2 | 2.1 (error on nonexistent job) | Error handling |
| 5 | Phase 2 | 2.7 (cleanup running job rejected) | Safety check |
| 6 | Phase 5 | 5.7-5.10 (plan with deps → verify waiting_deps → cancel) | Plan basics |
| 7 | Phase 9 | 9.1 (overview empty) | Dashboard baseline |
| 8 | Phase 9 | 9.4 (overview with jobs) | Dashboard with data |
| 9 | Phase 5G | 5.76 (retry + relaunch mutual exclusion) | TouchSet param validation |
| 10 | Phase 9 | 9.14 (overview after cleanup) | Dashboard cleanup |
| 11 | Phase 12 | Nuclear Cleanup | Clean exit |
Pass criteria: All 11 steps succeed. If any fail, run the full test plan for that phase.
MCP Tools Coverage Matrix
All 17 tools must be exercised during this plan. Check off as tested:
| Tool | Phase(s) | Notes |
| --- | --- | --- |
| mc_launch | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | Core lifecycle |
| mc_jobs | 1, 2, 3, 4, 5, 6, 9 | List/filter jobs |
| mc_status | 1, 2, 9, 10 | Detailed job info |
| mc_capture | 1, 2, 7, 8, 10 | Terminal output |
| mc_attach | 1, 2 | Tmux attach command |
| mc_diff | 1, 2, 4 | Branch comparison |
| mc_kill | 1, 2, 3, 4, 6, 7, 8 | Stop running jobs |
| mc_cleanup | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12 | Remove artifacts |
| mc_sync | 4 | Rebase/merge sync |
| mc_merge | 4, 6 | Merge to main |
| mc_pr | - | NOT tested (pushes to remote). Structural mention only. |
Timing Caveat: The supervisor pre_merge checkpoint only triggers when all jobs complete and the merge train starts. With simple prompts, jobs complete in 10-20 seconds — the plan may auto-advance before you observe the paused state. Use a 3-job plan with a long-running first job, or verify synthetically by checking plan.json for status: "paused".
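The synthetic check mentioned above could look something like the following. This is a hedged sketch: the helper name `plan_is_paused` is hypothetical, and it assumes plan.json is JSON containing a `"status"` string field. Because it is a plain text match, a nested `"status"` field would also match.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the given plan file contains "status": "paused".
# Assumption: the plan file is JSON with a "status" string field.
# Note: this is a text match, not a JSON parse, so nested "status" fields also match.
plan_is_paused() {
  local plan_file="$1"
  [ -f "$plan_file" ] || return 1
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"paused"' "$plan_file"
}

# Usage (path layout taken from STATE_DIR in the Nuclear Cleanup script):
#   PROJECT_NAME=$(basename "$(git rev-parse --show-toplevel)")
#   plan_is_paused "$HOME/.local/share/opencode-mission-control/$PROJECT_NAME/state/plan.json" \
#     && echo "plan is paused"
```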
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 5.25 | Cancel copilot plan | mc_plan_cancel | Cancelled |
| 5.26 | Cleanup | mc_cleanup all=true, deleteBranch=true | Cleaned |
| 5.27 | Create supervisor plan | mc_plan name=tmc-plan-super, mode=supervisor, jobs=[{name: "tmc-sv1", prompt: "echo super > sv1.txt"}] | Plan created |
| 5.28 | Wait 3-5 seconds | - | - |
| 5.29 | Check for checkpoint pauses | mc_plan_status | May show paused at checkpoint or running |
| 5.30 | Approve checkpoint (if paused) | mc_plan_approve checkpoint=pre_merge | Execution continues |
5G: TouchSet Enforcement
This section tests the touchSet violation detection and the three resolution paths:
accept, relaunch, and retry. These require a plan with touchSet configured
and a job that deliberately modifies files outside its allowed patterns.
Key Concept: TouchSet validation runs after a job completes but before it enters the
merge train. If violations are found, the plan pauses at an on_error checkpoint with
structured checkpointContext containing failureKind, jobName, touchSetViolations,
and touchSetPatterns.
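To make the idea of a violation concrete, here is a small illustrative sketch that lists changed files matching none of the allowed patterns. It is not the plugin's implementation: the function name `touchset_violations` is hypothetical, and it assumes patterns behave like shell `case` globs (where `*` also matches `/`), which may differ from the plugin's real matching rules.

```shell
#!/bin/bash
# Illustrative only: prints every changed file that matches none of the allowed patterns.
# Assumption: patterns behave like shell case globs; the plugin's real rules may differ.
# Usage: touchset_violations "<newline-separated changed files>" pattern...
touchset_violations() {
  local changed="$1"; shift
  local file pattern allowed
  while IFS= read -r file; do
    [ -n "$file" ] || continue
    allowed=0
    for pattern in "$@"; do
      # Leaving $pattern unquoted in the case arm enables glob matching.
      case "$file" in
        $pattern) allowed=1; break ;;
      esac
    done
    [ "$allowed" -eq 1 ] || printf '%s\n' "$file"
  done <<< "$changed"
}

# Usage:
#   touchset_violations "$(printf 'allowed.txt\nrogue.txt')" 'allowed.txt'
```

In a real check the changed-file list would come from comparing the job branch against the integration branch, for example via `git diff --name-only`; the exact refs to compare are not specified here.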
5G-1: TouchSet Violation Detection
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 5.35 | Cleanup from prior tests | mc_cleanup all=true, deleteBranch=true | Clean |
| 5.36 | Create plan with touchSet | mc_plan name=tmc-plan-touch, mode=supervisor, jobs=[{name: "tmc-ts1", prompt: "Create allowed.txt with 'hello' and also create rogue.txt with 'oops'", touchSet: ["allowed.txt"]}] | |
Note: Steps 5.42-5.43 are synthetic — we manually set the job to completed to trigger
the orchestrator's touchSet validation. In production, the monitor detects agent completion
and transitions the job state automatically.
5G-2: Accept Path (Clear Checkpoint)
Continue from 5G-1 state (plan paused with touchSet violation).
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 5.47 | Accept violations | mc_plan_approve checkpoint=on_error | Checkpoint cleared, job moves to ready_to_merge |
| 5.48 | Verify plan resumed | mc_plan_status | Plan running (or merging if merge train started) |
| 5.49 | Verify job state | mc_jobs | tmc-ts1 = ready_to_merge or merging or merged |
Cancel immediately after verifying — do NOT let the plan reach creating_pr.
| # | Action | Verify |
| --- | --- | --- |
| 5.50 | mc_plan_cancel | Plan cancelled |
| 5.51 | mc_cleanup all=true, deleteBranch=true | Cleaned |
5G-3: Relaunch Path (Agent Correction)
This tests spawning a new agent in the existing worktree to fix violations.
Timing Note: Simple prompts complete in 10-25 seconds. Check running state within 5-10 seconds of launch, or use a longer prompt.
This phase tests a realistic workflow where 3 agents work on related tasks simultaneously.
We manually simulate the agents' commits to control timing and test merge ordering.
Background: What Do Spawned Agents Know?
The .opencode/ directory is automatically symlinked into every worktree via
BUILTIN_SYMLINKS in worktree-setup.ts. Both mc_launch and the orchestrator's
launchJob call resolvePostCreateHook(), which always includes .opencode in the
symlink list. This means spawned agents DO have access to Mission Control tools.
What each agent CAN see:
Its own prompt (passed via opencode --prompt '...')
The full repo codebase (checked out at the worktree's branch)
Standard OpenCode tools (read, write, bash, grep, etc.)
ALL mc_* tools (plugin loaded via .opencode symlink)
/mc-* slash commands
Other jobs via mc_jobs (they can see sibling jobs)
mc_report tool for reporting status back to the orchestrator
What each agent CANNOT see:
Whether it's part of an orchestrated plan (plan context not exposed to agents)
Other agents' terminal output (no cross-session capture)
The merge train or integration branch internals
The dependency graph (agents don't know what depends on them)
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| | | mc_launch name=tmc-docs, prompt="Update the README.md with a new section about troubleshooting" | Success, branch mc/tmc-docs |
| 6.4 | Wait 3-5 seconds | - | - |
| 6.5 | Launch bugfix job | mc_launch name=tmc-bugfix, prompt="Fix the config loading bug by updating shared-config.txt line2" | Success, branch mc/tmc-bugfix |
| 6.6 | Wait 3-5 seconds | - | - |
| 6.7 | Launch feature job | mc_launch name=tmc-feature, prompt="Add caching feature by updating shared-config.txt line4" | Success, branch mc/tmc-feature |
| 6.8 | Wait 3-5 seconds | - | - |
| 6.9 | All 3 running | mc_jobs | All 3 shown as running |
6C: Simulate Agent Work (Manual Commits)
We kill the agents quickly and create controlled commits to test merge behavior.
The bugfix and feature jobs both modify shared-config.txt but on different lines
(non-conflicting overlap).
| # | Action | Expected |
| --- | --- | --- |
| 6.10 | Kill all 3 agents: mc_kill tmc-docs, tmc-bugfix, tmc-feature | All stopped |
| 6.11 | Docs commit: In tmc-docs worktree, append a troubleshooting section to README.md and commit | Clean commit, no overlap |
| 6.12 | Bugfix commit: In tmc-bugfix worktree, change line2 of shared-config.txt from "original" to "bugfix-applied" and commit | Commit touches line2 |
| 6.13 | Feature commit: In tmc-feature worktree, change line4 of shared-config.txt from "original" to "cache-enabled" and commit | Commit touches line4 |
6D: Test Merge Ordering (Non-Conflicting Overlap)
The bugfix and feature both touch shared-config.txt but on different lines.
Merging should succeed in any order since the changes don't overlap.
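The non-conflicting overlap can be reproduced in a throwaway repository to see why both merges are clean. The sketch below is illustrative only and does not touch the repo under test. The function name is hypothetical, and the five-line `lineN=original` layout of shared-config.txt is an assumption made for the demo.

```shell
#!/bin/bash
# Illustrative only: reproduces the non-conflicting overlap in a temporary repository.
# Assumption: shared-config.txt is modelled as five lines of the form "lineN=original".
demo_non_conflicting_merge() {
  local repo status
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    git config user.name "demo"
    git config user.email "demo@example.com"
    printf 'line1=original\nline2=original\nline3=original\nline4=original\nline5=original\n' > shared-config.txt
    git add shared-config.txt
    git commit -q -m "base"
    base=$(git rev-parse --abbrev-ref HEAD)

    # Bugfix branch changes line 2 only.
    git checkout -q -b bugfix "$base"
    sed '2s/original/bugfix-applied/' shared-config.txt > edited && mv edited shared-config.txt
    git commit -q -am "bugfix"

    # Feature branch changes line 4 only.
    git checkout -q -b feature "$base"
    sed '4s/original/cache-enabled/' shared-config.txt > edited && mv edited shared-config.txt
    git commit -q -am "feature"

    # Merge both into the base branch; the edits are separated by an unchanged line.
    git checkout -q "$base"
    git merge -q --no-edit bugfix >/dev/null || exit 1
    git merge -q --no-edit feature >/dev/null || exit 1
    cat shared-config.txt
  )
  status=$?
  rm -rf "$repo"
  return "$status"
}
```

Line 3 stays unchanged between the two edits, so the hunks neither overlap nor touch and Git can combine them without a conflict. Edits on directly adjacent lines would not merge this cleanly.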
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 6.14 | Merge docs first (safe) | mc_merge name=tmc-docs | Clean merge: README change only |
| 6.15 | Merge bugfix second | mc_merge name=tmc-bugfix | Clean merge: line2 change in shared-config.txt |
| 6.16 | Merge feature third | mc_merge name=tmc-feature | Clean merge: line4 change (different line, no conflict) |
Verify shared-config.txt is gone: ls shared-config.txt
File not found
6.49
mc_jobs — final verify
Empty
Phase 7 — Model Verification
This phase verifies that the launcher script correctly passes model configuration to spawned agents.
Timing Caveat: .mc-launch.sh is auto-deleted after 5 seconds. You must read it IMMEDIATELY after launch. If you miss the window, re-launch and try again — it's not a test failure, just a timing issue.
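One way to avoid losing the race against auto-deletion is to poll for the launcher script and copy it as soon as it appears. This is a minimal sketch with a hypothetical helper name (`snapshot_launcher`). It assumes `sleep` accepts fractional seconds, which holds for GNU and BSD/macOS `sleep` but not for every implementation.

```shell
#!/bin/bash
# Hypothetical helper: poll for <worktree>/.mc-launch.sh and copy it before auto-deletion.
# Usage: snapshot_launcher <worktree> <destination-file> [attempts]
# Assumption: sleep accepts fractional seconds (GNU and BSD/macOS sleep do).
snapshot_launcher() {
  local worktree="$1" dest="$2"
  local attempts="${3:-25}"   # default: 25 polls x 0.2 s, about 5 seconds
  while [ "$attempts" -gt 0 ]; do
    # cp -p preserves the mode bits, so the executable check can run on the copy.
    if cp -p "$worktree/.mc-launch.sh" "$dest" 2>/dev/null; then
      return 0
    fi
    sleep 0.2
    attempts=$((attempts - 1))
  done
  return 1
}

# Usage:
#   snapshot_launcher "<worktree>" /tmp/mc-launch-copy.sh && cat /tmp/mc-launch-copy.sh
```

The copy can then be inspected at leisure for the -m flag, the prompt file reference, and permissions.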
7A: Identify Current Model
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 7.1 | Identify session model | Note the model you're currently using (check startup banner or ask "what model are you?") | Record as $CURRENT_MODEL (e.g., anthropic/claude-sonnet-4-20250514) |
7B: Verify Launcher Script (.mc-launch.sh)
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 7.2 | Launch job | mc_launch name=tmc-model, prompt="echo hello" | Success; note the worktree path from the response |
| 7.3 | IMMEDIATELY read launcher script | Read <worktree>/.mc-launch.sh (must read within 5 seconds) | File exists |
| 7.4 | Verify model flag | Check file contents for -m flag | Contains -m "$CURRENT_MODEL" or the model string from your session |
| 7.5 | Verify prompt file reference | Check file contents for .mc-prompt.txt | Contains --prompt "$(cat '.mc-prompt.txt')" or similar |
| 7.6 | Verify script is executable | ls -la <worktree>/.mc-launch.sh | -rwxr-xr-x permissions |
7C: Verify Terminal Output
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 7.7 | Wait for agent startup | Wait 5-10 seconds after launch | Agent should be running |
| 7.8 | Check terminal for model | mc_capture name=tmc-model, lines=30 | Terminal output shows model identifier (e.g., in opencode startup banner or model selection line) |
| 7.9 | Verify model matches | Compare captured model to $CURRENT_MODEL | Model in tmux matches the model from step 7.1 |
7D: Cleanup
| # | Action | Verify |
| --- | --- | --- |
| 7.10 | mc_kill name=tmc-model | Stopped |
| 7.11 | mc_cleanup name=tmc-model, deleteBranch=true | Cleaned |
| 7.12 | mc_jobs | Empty |
Phase 8 — mc_report Flow
mc_report is called by spawned agents to report their status back to Mission Control.
Agents have access to mc_report because .opencode is automatically symlinked into
every worktree. The MC_REPORT_SUFFIX appended to every agent prompt instructs agents
to call mc_report at key milestones. We verify reports via filesystem inspection
of report JSON files, and also check that mc_status and mc_overview surface report data.
8A: Launch a Reporting Job
#
Test
Action
Expected
8.1
Launch job with substantive prompt
mc_launch name=tmc-reporter, prompt="Create a file called report-test.txt with 'hello world'. When done, commit your changes."
Job launched
8.2
Wait 3-5 seconds
—
—
8.3
Verify job running
mc_capture name=tmc-reporter
Agent is working
8B: Verify Report Files (Non-Deterministic)
Note: Agents have mc_report available and are instructed to use it via MC_REPORT_SUFFIX.
Reports should appear reliably, but agent behavior is non-deterministic. If no report appears
after 15 seconds, investigate — the plugin is wired correctly, so absence likely indicates
the agent ignored the prompt suffix or hasn't reached a reporting milestone yet.
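Rather than sleeping a fixed 15 seconds, the wait can be written as a poll that returns as soon as a report file shows up. This is a rough sketch; the helper name `wait_for_report` is hypothetical, and the only assumption about reports is that they are `*.json` files in the reports directory.

```shell
#!/bin/bash
# Hypothetical helper: wait until at least one *.json file exists in a reports directory.
# Prints the first match and returns 0, or returns 1 once the timeout has elapsed.
# Usage: wait_for_report <reports-dir> [timeout-seconds]
wait_for_report() {
  local dir="$1" timeout="${2:-15}"
  local elapsed=0 f
  while :; do
    for f in "$dir"/*.json; do
      # The glob stays literal when nothing matches, so test for a real file.
      if [ -f "$f" ]; then
        printf '%s\n' "$f"
        return 0
      fi
    done
    [ "$elapsed" -lt "$timeout" ] || return 1
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Usage:
#   wait_for_report "$HOME/.local/share/opencode-mission-control/$PROJECT_NAME/reports" 15
```

The `*.json` glob does not match in-progress `*.json.tmp` files, so only completed reports count.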
| # | Test | Action | Expected |
| --- | --- | --- | --- |
| 8.4 | Wait for agent to potentially report | Wait 10-15 seconds | - |
| 8.5 | Check reports directory | ls ~/.local/share/opencode-mission-control/$(basename $(git rev-parse --show-toplevel))/reports/ | |
Why synthetic: Agents complete simple prompts in 10-20 seconds and rarely hit blocked state naturally. Synthetic injection tests the exact same code paths deterministically.
8E: Synthetic Blocked Pipeline Test
These tests use synthetic report injection to deterministically verify the monitor → notification → overview pipeline. We launch a real job to get a valid Job ID, then manually write a report JSON file.
Dogfooding paradox: We're testing Mission Control from within a Mission Control-managed session. Launching jobs will create worktrees of this repo, and those agents will also have MC loaded (via .opencode/plugins/mission-control.ts -> ../../dist/index.js). Our simple prompts and quick kills mitigate recursive chaos.
Merge pollution (Phases 4 & 6): The merge tests temporarily bring test commits into main. We immediately git reset --hard $SAVED_SHA to undo. If anything goes wrong, these are the highest-risk steps. NEVER use HEAD~N — always use saved SHAs.
State file corruption: If we crash mid-test, jobs.json and plan.json may have orphaned entries. The Nuclear Cleanup script (top of document) handles this.
tmux session leak: If mc_kill fails, tmux sessions persist. Phase 12 force-kills all mc-tmc-* sessions.
Integration branch leak: Plan tests create mc/integration-* AND mc/integration/* branches and worktrees. Both patterns are cleaned in the Nuclear Cleanup script and Phase 12.
Agent unawareness of plan: Spawned agents have full MC tools (.opencode is symlinked) and can see sibling jobs via mc_jobs, but they don't know they're part of an orchestrated plan — plan context is not exposed to agents. They also have access to dangerous tools (mc_kill, mc_plan_cancel, mc_merge) with no guardrails.
Plan auto-push: If all jobs in a plan reach merged state, the plan automatically pushes the integration branch to remote and enters creating_pr state. ALWAYS cancel plans before all jobs complete to prevent unwanted pushes.
Report reliability: Agents have mc_report available (plugin loaded via .opencode symlink) and are instructed to call it via MC_REPORT_SUFFIX prompt injection. Report files should appear reliably, but agent behavior is ultimately non-deterministic — a missing report after 15 seconds warrants investigation but is not necessarily a plugin failure.
Launcher script timing: .mc-launch.sh is deleted after 5 seconds. Phase 7 must read it immediately after launch. If you miss the window, the test is inconclusive, not failed.
Worktree initialization race: Some operations may fail if attempted before the worktree is fully initialized. The 3-5 second wait after every mc_launch mitigates this.
TouchSet testing on feature branches: When running Phase 5G on a non-main branch, job worktrees inherit the feature branch's uncommitted changes. TouchSet validation compares the job branch against the integration branch, so feature branch source files show up as spurious violations alongside the actual test violations (e.g., rogue.txt). This is a testing artifact — in production, both branches share the same base so only the job's own changes appear.
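Several of the risks above come down to leaked artifacts, so a read-only leak check can be useful before and after a run. The following is a sketch with a hypothetical function name (`list_leaks`). It reuses the name patterns from the Nuclear Cleanup script and deletes nothing.

```shell
#!/bin/bash
# Hypothetical post-test check: list leftover test artifacts. Prints nothing when clean.
# Name patterns (mc-tmc-*, tmc-, mc-integration, mc/tmc-*, mc/integration*) follow
# the Nuclear Cleanup script. Read-only: nothing is killed or deleted.
list_leaks() {
  # Leaked tmux sessions (silently skipped if tmux is absent or no server is running).
  tmux list-sessions -F '#{session_name}' 2>/dev/null \
    | grep '^mc-tmc-' | sed 's/^/tmux: /'

  # Leaked worktrees; sub() keeps paths that contain spaces intact.
  git worktree list --porcelain 2>/dev/null \
    | awk '/^worktree /{sub(/^worktree /, ""); print}' \
    | grep -E 'tmc-|mc-integration' | sed 's/^/worktree: /'

  # Leaked branches, covering both the flat and nested integration patterns.
  git for-each-ref --format='%(refname:short)' \
    'refs/heads/mc/tmc-*' 'refs/heads/mc/integration-*' 'refs/heads/mc/integration/*' \
    2>/dev/null | sed 's/^/branch: /'

  return 0
}

# Usage:
#   [ -z "$(list_leaks)" ] && echo "clean" || list_leaks
```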
Agent Capabilities Reference
Current State: Spawned Agents Have MC Tools
The .opencode/ directory is automatically symlinked into every worktree. This is
implemented via BUILTIN_SYMLINKS = ['.opencode'] in src/lib/worktree-setup.ts, which
is included in every resolvePostCreateHook() call from both mc_launch and the
orchestrator's launchJob. Plugin updates propagate automatically since it's a symlink.
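A quick way to confirm the symlink from the shell is sketched below. The helper name `check_opencode_symlink` is hypothetical; it only checks that `.opencode` in a given worktree is a symlink and prints the stored target, without asserting what that target should be.

```shell
#!/bin/bash
# Hypothetical check: confirm that <worktree>/.opencode is a symlink and print its target.
# Usage: check_opencode_symlink <worktree>
check_opencode_symlink() {
  local link="$1/.opencode"
  if [ -L "$link" ]; then
    # readlink without options prints the target exactly as stored in the link.
    readlink "$link"
  else
    echo "not a symlink: $link" >&2
    return 1
  fi
}

# Usage:
#   check_opencode_symlink "<worktree>"
```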
| Capability | Available? | Why |
| --- | --- | --- |
| mc_* tools | YES | Plugin loaded via .opencode symlink |
| /mc-* slash commands | YES | Plugin loaded via .opencode symlink |
| mc_report | YES | Agents can report status back to orchestrator |
| mc_jobs | YES | Agents can see sibling jobs |
| Worktree awareness | YES | getWorktreeContext() runs in agent session |
| Standard OpenCode tools | YES | Read, write, bash, grep, etc. all work |
| Git operations | YES | Full git access within the worktree |
| Plan awareness | NO | Plan context not exposed to agent prompts |
| Cross-agent visibility | PARTIAL | Can list jobs via mc_jobs but cannot capture other agents' output |
| Orchestrator control | UNSAFE | Agents COULD call mc_kill, mc_plan_cancel, mc_merge; no guardrails prevent this |
Safety Consideration: Dangerous Tools in Agent Hands
Since agents have full access to ALL mc_* tools, they could theoretically:
Kill other jobs (mc_kill)
Cancel the orchestrating plan (mc_plan_cancel)
Merge branches prematurely (mc_merge)
Launch new jobs (mc_launch)
This is mitigated by:
Simple test prompts that don't trigger complex tool use
Quick kills — agents are stopped before they can cause harm
The MC_REPORT_SUFFIX prompt only instructs agents to call mc_report, not other tools