Draft
19 changes: 16 additions & 3 deletions CONTRIBUTING.md
@@ -74,14 +74,27 @@ Edit the relevant `SKILL.md` or data file. Test by running the skill locally wit

## Testing

There is no automated test harness for skills — they are instruction sets interpreted by Claude Code, not code with unit tests. The validation steps are:
The repo has a three-tier test suite under `tests/`. See [`tests/README.md`](tests/README.md) for the full reference.

```bash
./tests/test-invariants.sh # Tier 2: <1s grep/filesystem checks — run this first
./tests/run-tests.sh # Tier 1: skill tests via Claude Code (~4-5 min total)
```

**Tier 2** (`test-invariants.sh`) checks for forbidden stale phrases, required remediation text, file-existence consistency, and version alignment — no Claude invocation, no cost. Run it locally before every commit.

**Tier 1** (`run-tests.sh`) invokes Claude Code headlessly with questions about each skill and asserts patterns in the response. Runs in CI on every PR and push to `main`. Requires Claude auth (ambient OAuth or `ANTHROPIC_API_KEY`). Fork PRs skip this step — see `tests/README.md#ci`.

**Tier 3** (`test-e2e*.sh`) runs a live pipeline on a playground repo or creates throwaway repos to exercise the install skills end-to-end. Expensive (~20-35 min); run manually before releases only. See `tests/README.md#running-tier-3`.
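
Because Tier 2 is meant to run before every commit, one option is to call it from a Git pre-commit hook. The following is a minimal sketch under stated assumptions, not something the repo ships: the function name `run_tier2` and the idea of installing it as `.git/hooks/pre-commit` are hypothetical; only the script path `tests/test-invariants.sh` comes from the text above.

```bash
# Hypothetical pre-commit helper (not part of the repo).
# Runs the Tier 2 invariant checks found under the given repository root
# and returns their exit status, so a failure can abort the commit.
run_tier2() {
  local repo_root="${1:-}"
  local script="$repo_root/tests/test-invariants.sh"

  if [ ! -x "$script" ]; then
    echo "pre-commit: $script not found or not executable" >&2
    return 1
  fi

  # The function's status is the script's status.
  "$script"
}
```

In a hook file, this could be invoked as `run_tier2 "$(git rev-parse --show-toplevel)"`; Git aborts the commit when the hook exits non-zero.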

Additional local checks:

1. **Load the plugin**: `claude --plugin-dir .` — confirm no startup errors.
2. **Run the skill manually**: invoke `/discover-workflows` or `/install-workflow` and walk through the flow.
3. **Validate lock files** (if you changed `.lock.yml` files): `gh aw validate` — safe, does not recompile.
4. **Check grep counts** (if you applied the OAuth tweak): see [skills/install-workflow/auth.md](skills/install-workflow/auth.md#step-4--verify-the-tweak-shape).

Never test by committing untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.
Never commit untested changes to `main`. The installed workflows run on push to `main`, so a broken install skill or a bad `.lock.yml` will trigger a live workflow run.

## Workflow files

@@ -112,4 +125,4 @@ Branch naming conventions:

## Publishing (maintainers only)

See the [Publishing section of the README](README.md#publishing) for the steps to submit the plugin to the Claude plugin registry.
Coordinate with the maintainer team to bump the version in `.claude-plugin/plugin.json`, create a GitHub Release, and submit the updated `marketplace.json` to the external plugin registries (`claude-plugins.dev`, `ClaudePluginHub`).
16 changes: 15 additions & 1 deletion catalog/agent-team/README.md
@@ -105,7 +105,21 @@ Then apply the OAuth token tweak to each `.lock.yml` per [`skills/install-workfl
2. Add the single label `agent-team`.
3. Watch the thread. Each role posts its contribution as a comment; the implementer opens a draft PR that closes the issue when merged.
4. Human override at any time: add `state:blocked` to halt, edit a comment to steer the next agent, or manually `gh workflow run` a specific role to retry a stuck stage. Manual dispatches must pass the required `workflow_dispatch` inputs, and the downstream workflow markdown must read them via `${{ github.event.inputs.* }}`.
5. **Retrying a blocked task**: clear `state:blocked`, then re-add `agent-team`. Spec-agent treats it as a fresh dispatch (because the state:* labels are gone and the spec markers are already satisfied — actually: to redo from scratch, also delete the prior spec comment).
5. **Retrying a blocked task**: the correct recovery depends on *why* the issue was blocked.
- **Blocked by the reviewer (kickback cap, architectural decision)**: clear `state:blocked` and re-add `agent-team`. Re-adding the label restarts nothing by itself, because the spec agent's early-exit guard fires when the spec comment and `state:*` labels already exist. Manually dispatch the stage you want to retry instead: `gh workflow run planner-agent.lock.yml -f issue_number=<N> -f iteration=1` (resetting `iteration` to 1 clears the kickback counter). Always supply `-f issue_number=<N>` and any other required inputs.
- **Blocked because workflow_dispatch inputs were not propagated** (the agent posted `🛑 agent-team: workflow_dispatch inputs were not propagated`): this means the run was triggered without the required inputs — usually a manual dispatch that omitted `-f` flags. Clear `state:blocked`, then re-dispatch the correct stage with all required inputs:
```bash
# Planner
gh workflow run planner-agent.lock.yml -f issue_number=<N> -f iteration=<I>
# Implementer (first pass — no pr_number)
gh workflow run implementer-agent.lock.yml -f issue_number=<N> -f iteration=<I>
# Implementer (kickback — existing PR)
gh workflow run implementer-agent.lock.yml -f issue_number=<N> -f iteration=<I> -f pr_number=<PR>
# Reviewer
gh workflow run reviewer-agent.lock.yml -f pr_number=<PR> -f issue_number=<N> -f iteration=<I>
```
Do **not** simply re-add the `agent-team` label — the spec agent will immediately exit (spec already posted), so the pipeline will not restart.
- **Full restart from scratch**: clear all `state:*` labels and delete the `<!-- agent-team:spec -->` comment from the issue, then re-add the `agent-team` label.
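
Since the most common failure above is a manual dispatch that omits `-f` flags, a small wrapper that refuses to build an incomplete command may help. This is a sketch only: `dispatch_cmd` is a hypothetical helper, not part of the catalog, and it prints the command instead of running it so the result can be reviewed first. The workflow file names and input names are taken from the commands listed above.

```bash
# Hypothetical helper: prints (does not run) the dispatch command for a stage.
# Usage: dispatch_cmd <planner|implementer|reviewer> <issue> <iteration> [pr]
dispatch_cmd() {
  local stage="${1:-}" issue="${2:-}" iteration="${3:-}" pr="${4:-}"

  case "$stage" in
    planner|implementer|reviewer) ;;
    *) echo "unknown stage: $stage" >&2; return 1 ;;
  esac

  if [ -z "$issue" ] || [ -z "$iteration" ]; then
    echo "issue number and iteration are required" >&2
    return 1
  fi

  # The reviewer always needs a PR; the implementer only on a kickback.
  if [ "$stage" = "reviewer" ] && [ -z "$pr" ]; then
    echo "reviewer requires a PR number" >&2
    return 1
  fi

  local cmd="gh workflow run ${stage}-agent.lock.yml -f issue_number=${issue} -f iteration=${iteration}"
  # The planner takes no pr_number, so a PR argument is ignored for it.
  if [ -n "$pr" ] && [ "$stage" != "planner" ]; then
    cmd="$cmd -f pr_number=${pr}"
  fi
  echo "$cmd"
}
```

For example, `dispatch_cmd planner 42 1` prints `gh workflow run planner-agent.lock.yml -f issue_number=42 -f iteration=1`, while `dispatch_cmd reviewer 42 1` fails because no PR number was given.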

## Limits and gotchas
