From 8d1ec87bcf4aa6d33b27957dab4c81d60e8851f0 Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Fri, 15 May 2026 11:23:50 +0000 Subject: [PATCH] docs: fix stale testing section, broken publishing link, and add agent-team troubleshooting MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CONTRIBUTING.md Testing section claimed "there is no automated test harness" — factually wrong since tests/ has a full three-tier suite (invariants, skill tests, E2E). Replaced with accurate description and quick-start commands pointing to tests/README.md. Fixed a broken cross-reference: "Publishing section of the README" linked to README.md#publishing, which does not exist. Replaced with an inline description of the maintainer publishing steps. catalog/agent-team/README.md: expanded "Kicking off a task" step 5 to cover all three recovery paths for a blocked issue — reviewer kickback, the new "inputs not propagated" fail-loud state introduced in PR #69, and full-restart from scratch. Previously, only the label-re-add path was documented, which is the wrong recovery for input-propagation failures (re-adding the label does nothing when a spec already exists). Co-Authored-By: Claude Sonnet 4.6 --- CONTRIBUTING.md | 19 ++++++++++++++++--- catalog/agent-team/README.md | 16 +++++++++++++++- 2 files changed, 31 insertions(+), 4 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 1c2721e..96e7baa 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -74,14 +74,27 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit ## Testing -There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are: +The repo has a three-tier test suite under `tests/`. See [`tests/README.md`](tests/README.md) for the full reference. + +```bash +./tests/test-invariants.sh # Tier 2: <1s grep/filesystem checks — run this first +./tests/run-tests.sh # Tier 1: skill tests via Claude Code (~4-5 min total) +``` + +**Tier 2** (`test-invariants.sh`) checks for forbidden stale phrases, required remediation text, file-existence consistency, and version alignment — no Claude invocation, no cost. Run it locally before every commit. + +**Tier 1** (`run-tests.sh`) invokes Claude Code headlessly with questions about each skill and asserts patterns in the response. Runs in CI on every PR and push to `main`. Requires Claude auth (ambient OAuth or `ANTHROPIC_API_KEY`). Fork PRs skip this step — see `tests/README.md#ci`. + +**Tier 3** (`test-e2e*.sh`) runs a live pipeline on a playground repo or creates throwaway repos to exercise the install skills end-to-end. Expensive (~20-35 min); run manually before releases only. See `tests/README.md#running-tier-3`. + +Additional local checks: 1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors. 2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow. 3. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile. 4. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape). -Never test by committing untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run. +Never commit untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run. ## Workflow files @@ -112,4 +125,4 @@ Branch naming conventions: ## Publishing (maintainers only) -See the [Publishing section of the README](README.md#publishing) for the steps to submit the plugin to the Claude plugin registry. +Coordinate with the maintainer team to bump the version in `.claude-plugin/plugin.json`, create a GitHub Release, and submit the updated `marketplace.json` to the external plugin registries (`claude-plugins.dev`, `ClaudePluginHub`). diff --git a/catalog/agent-team/README.md b/catalog/agent-team/README.md index 5490837..06a6fd1 100644 --- a/catalog/agent-team/README.md +++ b/catalog/agent-team/README.md @@ -105,7 +105,21 @@ Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workfl 2. Add the single label `agent-team`. 3. Watch the thread. Each role posts its contribution as a comment; the implementer opens a draft PR that closes the issue when merged. 4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually `gh workflow run` a specific role to retry a stuck stage. Manual dispatches must pass the required `workflow_dispatch` inputs, and the downstream workflow markdown must read them via `${{ github.event.inputs.* }}`. -5. **Retrying a blocked task**: clear `state:blocked`, then re-add `agent-team`. Spec-agent treats it as a fresh dispatch (because the state:* labels are gone and the spec markers are already satisfied — actually: to redo from scratch, also delete the prior spec comment). +5. **Retrying a blocked task**: the correct recovery depends on *why* the issue was blocked. + - **Blocked by the reviewer (kickback cap, architectural decision)**: clear `state:blocked`, re-add `agent-team`. The spec agent's early-exit guard will fire (spec + state:* labels already exist), so nothing restarts. Instead, manually dispatch the specific stage you want to retry: `gh workflow run planner-agent.lock.yml -f issue_number= -f iteration=1` (reset iteration to 1 to clear the kickback counter). Always supply `-f issue_number=` and any other required inputs. + - **Blocked because workflow_dispatch inputs were not propagated** (the agent posted `🛑 agent-team: workflow_dispatch inputs were not propagated`): this means the run was triggered without the required inputs — usually a manual dispatch that omitted `-f` flags. Clear `state:blocked`, then re-dispatch the correct stage with all required inputs: + ```bash + # Planner + gh workflow run planner-agent.lock.yml -f issue_number= -f iteration= + # Implementer (first pass — no pr_number) + gh workflow run implementer-agent.lock.yml -f issue_number= -f iteration= + # Implementer (kickback — existing PR) + gh workflow run implementer-agent.lock.yml -f issue_number= -f iteration= -f pr_number= + # Reviewer + gh workflow run reviewer-agent.lock.yml -f pr_number= -f issue_number= -f iteration= + ``` + Do **not** simply re-add the `agent-team` label — the spec agent will immediately exit (spec already posted), so the pipeline will not restart. + - **Full restart from scratch**: clear all `state:*` labels and delete the `` comment from the issue, then re-add the `agent-team` label. ## Limits and gotchas