From 8d1ec87bcf4aa6d33b27957dab4c81d60e8851f0 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]"
 <41898282+github-actions[bot]@users.noreply.github.com>
Date: Fri, 15 May 2026 11:23:50 +0000
Subject: [PATCH] docs: fix stale testing section, broken publishing link, and
 add agent-team troubleshooting
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CONTRIBUTING.md Testing section claimed "there is no automated test
harness" — factually wrong since tests/ has a full three-tier suite
(invariants, skill tests, E2E). Replaced with accurate description and
quick-start commands pointing to tests/README.md.

Fixed a broken cross-reference: "Publishing section of the README"
linked to README.md#publishing, which does not exist. Replaced with an
inline description of the maintainer publishing steps.

catalog/agent-team/README.md: expanded "Kicking off a task" step 5 to
cover all three recovery paths for a blocked issue — reviewer kickback,
the new "inputs not propagated" fail-loud state introduced in PR #69,
and full-restart from scratch. Previously, only the label-re-add path
was documented, which is the wrong recovery for input-propagation
failures (re-adding the label does nothing when a spec already exists).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 CONTRIBUTING.md              | 19 ++++++++++++++++---
 catalog/agent-team/README.md | 16 +++++++++++++++-
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1c2721e..96e7baa 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -74,14 +74,27 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit
 
 ## Testing
 
-There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are:
+The repo has a three-tier test suite under `tests/`. See [`tests/README.md`](tests/README.md) for the full reference.
+
+```bash
+./tests/test-invariants.sh   # Tier 2: <1s grep/filesystem checks — run this first
+./tests/run-tests.sh         # Tier 1: skill tests via Claude Code (~4-5 min total)
+```
+
+**Tier 2** (`test-invariants.sh`) checks for forbidden stale phrases, required remediation text, file-existence consistency, and version alignment — no Claude invocation, no cost. Run it locally before every commit.
+
+**Tier 1** (`run-tests.sh`) invokes Claude Code headlessly with questions about each skill and asserts patterns in the response. Runs in CI on every PR and push to `main`. Requires Claude auth (ambient OAuth or `ANTHROPIC_API_KEY`). Fork PRs skip this step — see `tests/README.md#ci`.
+
+**Tier 3** (`test-e2e*.sh`) runs a live pipeline on a playground repo or creates throwaway repos to exercise the install skills end-to-end. Expensive (~20-35 min); run manually before releases only. See `tests/README.md#running-tier-3`.
+
+Additional local checks:
 
 1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors.
 2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow.
 3. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
 4. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).
 
-Never test by committing untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.
+Never commit untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.
 
 ## Workflow files
 
@@ -112,4 +125,4 @@ Branch naming conventions:
 
 ## Publishing (maintainers only)
 
-See the [Publishing section of the README](README.md#publishing) for the steps to submit the plugin to the Claude plugin registry.
+Coordinate with the maintainer team to bump the version in `.claude-plugin/plugin.json`, create a GitHub Release, and submit the updated `marketplace.json` to the external plugin registries (`claude-plugins.dev`, `ClaudePluginHub`).
diff --git a/catalog/agent-team/README.md b/catalog/agent-team/README.md
index 5490837..06a6fd1 100644
--- a/catalog/agent-team/README.md
+++ b/catalog/agent-team/README.md
@@ -105,7 +105,21 @@ Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workfl
 2. Add the single label `agent-team`.
 3. Watch the thread. Each role posts its contribution as a comment; the implementer opens a draft PR that closes the issue when merged.
 4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually `gh workflow run` a specific role to retry a stuck stage. Manual dispatches must pass the required `workflow_dispatch` inputs, and the downstream workflow markdown must read them via `${{ github.event.inputs.* }}`.
-5. **Retrying a blocked task**: clear `state:blocked`, then re-add `agent-team`. Spec-agent treats it as a fresh dispatch (because the state:* labels are gone and the spec markers are already satisfied — actually: to redo from scratch, also delete the prior spec comment).
+5. **Retrying a blocked task**: the correct recovery depends on *why* the issue was blocked.
+   - **Blocked by the reviewer (kickback cap, architectural decision)**: clear `state:blocked`, re-add `agent-team`. The spec agent's early-exit guard will fire (spec + state:* labels already exist), so nothing restarts. Instead, manually dispatch the specific stage you want to retry: `gh workflow run planner-agent.lock.yml -f issue_number=<N> -f iteration=1` (reset iteration to 1 to clear the kickback counter). Always supply `-f issue_number=<N>` and any other required inputs.
+   - **Blocked because workflow_dispatch inputs were not propagated** (the agent posted `🛑 agent-team: workflow_dispatch inputs were not propagated`): this means the run was triggered without the required inputs — usually a manual dispatch that omitted `-f` flags. Clear `state:blocked`, then re-dispatch the correct stage with all required inputs:
+     ```bash
+     # Planner
+     gh workflow run planner-agent.lock.yml -f issue_number=<N> -f iteration=<I>
+     # Implementer (first pass — no pr_number)
+     gh workflow run implementer-agent.lock.yml -f issue_number=<N> -f iteration=<I>
+     # Implementer (kickback — existing PR)
+     gh workflow run implementer-agent.lock.yml -f issue_number=<N> -f iteration=<I> -f pr_number=<PR>
+     # Reviewer
+     gh workflow run reviewer-agent.lock.yml -f pr_number=<PR> -f issue_number=<N> -f iteration=<I>
+     ```
+     Do **not** simply re-add the `agent-team` label — the spec agent will immediately exit (spec already posted), so the pipeline will not restart.
+   - **Full restart from scratch**: clear all `state:*` labels and delete the `<!-- agent-team:spec -->` comment from the issue, then re-add the `agent-team` label.
 
 ## Limits and gotchas