diff --git a/CHANGELOG.md b/CHANGELOG.md index d1034a9..eee6c06 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,7 +10,8 @@ Please choose versions by [Semantic Versioning](http://semver.org/). ## Unreleased -- docs: add `docs/three-command-review-split.md` — design note locking the proposed shape for splitting the review commands into three distinct scopes (`/coding:pr-review` remote diff, `/coding:local-review` local pre-commit diff (renamed from current `/coding:code-review`), new `/coding:code-review` whole-codebase audit). No behavior change in this PR; the doc proposes a 5-PR migration plan (rename → deprecation stub → new behavior → flip deprecation) and the three mechanisms (severity filter, rule-id dedup, baseline file `.code-review-baseline.yaml`) that make whole-codebase review useful without drowning in pre-existing tech debt. Optional `golangci-lint` passthrough for Go projects. +- docs: add `docs/three-command-review-split.md` — design note locking the proposed shape for splitting the review commands into three distinct scopes (`/coding:pr-review` remote diff, `/coding:local-review` local pre-commit diff (renamed from current `/coding:code-review`), new `/coding:code-review` whole-codebase audit). The doc pins defaults for baseline-file location (`.code-review-baseline.yaml` at repo root, `--baseline-path` override), monorepo handling (one baseline per repo by default; per-subproject via override), `--refresh-baseline` clean-tree requirement, and `.gitignore` semantics (respected; no separate `.claude-ignore` introduced). +- **feat: split review commands into 3 distinct scopes.** Renames current `/coding:code-review` → `/coding:local-review` (preserving its diff-vs-`HEAD~1` semantics for pre-commit local checks), AND introduces a brand-new `/coding:code-review` that scans the WHOLE codebase via `git ls-files`. New command ships with severity filter (default-on: Must Fix + Should Fix only; `--include-optional` to opt in), rule-id dedup at consolidation (N occurrences → 1 summary with sample sites), and baseline file (`.code-review-baseline.yaml` via `--refresh-baseline`) so subsequent runs only flag NEW findings (drift since last sweep) rather than the operator's full accepted tech-debt set. Design rationale: `docs/three-command-review-split.md`. 10 reference sites updated across `llms.txt`, `README.md`, `scenarios/*`, and 4 agent definitions to point at `/coding:local-review` (preserving old semantics). **Migration**: sharp behavior cutover on the `/coding:code-review` slot — operators previously relying on its diff-vs-`HEAD~1` behavior must move to `/coding:local-review`. The contrast pair `pr-review` (remote, branch vs target) / `local-review` (local, pre-commit) reads cleanly; `code-review` takes the unmarked whole-codebase slot. ## v0.25.0 diff --git a/README.md b/README.md index 4fbc726..c35a69f 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ Opinionated coding guidelines, quality review agents, and slash commands for Go ## Overview -Writing consistent, idiomatic code across a large codebase is hard. This plugin bundles 50+ opinionated guides (Go architecture, error handling, testing, HTTP handlers, Python structure, Git workflow, documentation) together with specialized Claude Code agents that enforce them on your code. Install once, then run `/coding:code-review` or `/coding:pr-review` to review your work against the full ruleset. +Writing consistent, idiomatic code across a large codebase is hard. This plugin bundles 50+ opinionated guides (Go architecture, error handling, testing, HTTP handlers, Python structure, Git workflow, documentation) together with specialized Claude Code agents that enforce them on your code. Install once, then run `/coding:local-review` or `/coding:pr-review` to review your work against the full ruleset. ## Requirements @@ -35,11 +35,19 @@ Review your current branch against all guidelines: /coding:pr-review ``` -Review code in standard mode (7 agents) or full mode (14 agents): +Review local uncommitted changes in selector mode (default — zero LLM spawns, in-session classify + adjudicate) or full mode (per-owner dispatch, concurrent agents): ``` -/coding:code-review standard -/coding:code-review full +/coding:local-review # selector mode (default) +/coding:local-review full # per-owner dispatch +``` + +Audit the whole codebase (severity-filtered, baseline-aware): + +``` +/coding:code-review # whole codebase, Must Fix + Should Fix only +/coding:code-review --include-optional # add Nice to Have +/coding:code-review --refresh-baseline # write current findings to .code-review-baseline.yaml ``` Find relevant guides before starting work: @@ -58,8 +66,9 @@ Commit with changelog and version bump: | Command | Description | |---------|-------------| -| `/coding:pr-review` | Review pull request diff against standards | -| `/coding:code-review [short\|standard\|full]` | Review code against guidelines (standard: 7 agents, full: 14) | +| `/coding:pr-review` | Branch diff vs target — selector mode default; full mode = per-owner dispatch | +| `/coding:local-review [short\|selector\|full]` | Local uncommitted/recent diff vs `HEAD~1` — selector mode default | +| `/coding:code-review [--include-optional] [--refresh-baseline]` | Whole-codebase audit — severity-filtered (Must + Should) + baseline-aware (`.code-review-baseline.yaml`) | | `/coding:architecture-review [directory]` | Deep whole-codebase architectural review — top-down + dimensions, consolidated Must/Should/Could | | `/coding:check-guides "task"` | Find relevant guides before implementation | | `/coding:commit` | Git commit with changelog and versioning | @@ -258,7 +267,7 @@ End-to-end acceptance walks for the doc-driven review pipeline, following the [d | # | Scenario | Validates | |---|---|---| | 001 | [toolchain-preflight](scenarios/001-toolchain-preflight.md) | Step 4.0 / Step 0 preflight blocks exit 1 with documented stderr when `ast-grep` / `sg` is absent from PATH | -| 002 | [clean-pr-zero-findings](scenarios/002-clean-pr-zero-findings.md) | `/coding:code-review master` against a zero-violation diff produces empty Must Fix / Should Fix / Nice to Have (no LLM hallucination) | +| 002 | [clean-pr-zero-findings](scenarios/002-clean-pr-zero-findings.md) | `/coding:local-review master` against a zero-violation diff produces empty Must Fix / Should Fix / Nice to Have (no LLM hallucination) | | 003 | [scaling-funnel-100-files](scenarios/003-scaling-funnel-100-files.md) | 100-file synthetic fixture: mechanical funnel ≤30s, distinct Owners ≤30 (structural ceiling on Step 4b LLM calls) | | 004 | [findings-exist-path](scenarios/004-findings-exist-path.md) | `/coding:pr-review` against the stable test PR [bborbe/maintainer#2](https://github.com/bborbe/maintainer/pull/2): Step 4a surfaces ≥4 findings, every Owner has an agent file, citation discipline holds | diff --git a/agents/architecture-dimensions-assistant.md b/agents/architecture-dimensions-assistant.md index 51de441..8c75ea5 100644 --- a/agents/architecture-dimensions-assistant.md +++ b/agents/architecture-dimensions-assistant.md @@ -29,7 +29,7 @@ The user (typically via `/coding:architecture-review`) gives you a target direct - **No structural review** — that's the sibling agents' job. If you find a layering violation, note it briefly and defer to the structural pass. - **No code modification** — read-only. -- **No diff-scoped review** — whole-codebase only. For per-PR work, the user should use `/coding:code-review` instead. +- **No diff-scoped review** — whole-codebase only. For per-PR work, the user should use `/coding:pr-review` (branch diff) or `/coding:local-review` (uncommitted) instead. - **No mechanical findings** — "function too long", "too many params", "name should be camelCase" are owned by linters and `go-quality-assistant` / `python-quality-assistant`. Your altitude is architectural. ## Output format diff --git a/agents/go-architecture-assistant.md b/agents/go-architecture-assistant.md index 6f93c07..6157af6 100644 --- a/agents/go-architecture-assistant.md +++ b/agents/go-architecture-assistant.md @@ -422,7 +422,7 @@ Package boundary violations: . ## Integration -- Runs alongside `srp-checker` in `/coding:code-review full` — SRP checks unit-level, this checks cross-unit and extraction-quality +- Runs alongside `srp-checker` in `/coding:local-review full` — SRP checks unit-level, this checks cross-unit and extraction-quality - Invoked directly via `/coding:audit-architecture [directory]` - Read-only — never edits code diff --git a/agents/python-architecture-assistant.md b/agents/python-architecture-assistant.md index 2aa2434..ee5f6f4 100644 --- a/agents/python-architecture-assistant.md +++ b/agents/python-architecture-assistant.md @@ -438,7 +438,7 @@ Import-time side effects: . ## Integration -- Runs alongside `python-quality-assistant` in `/coding:code-review full` — quality checks style/idioms, this checks cross-unit architecture and extract-quality +- Runs alongside `python-quality-assistant` in `/coding:local-review full` — quality checks style/idioms, this checks cross-unit architecture and extract-quality - Invoked directly via `/coding:audit-architecture [directory]` - Read-only — never edits code diff --git a/commands/code-review.md b/commands/code-review.md index 953797c..d1d5996 100644 --- a/commands/code-review.md +++ b/commands/code-review.md @@ -1,253 +1,172 @@ --- -allowed-tools: Task, Bash(git diff:+), Bash(git log:+), Bash(git status:+), Bash(git ls-files:+) -argument-hint: "[short|full|selector] [directory]" -description: Perform a comprehensive code review of recent changes +allowed-tools: Task, Bash(git ls-files:+), Bash(git status:+), Bash(git log:+), Bash(git branch:+) +argument-hint: "[short|full|selector] [directory] [--include-optional] [--refresh-baseline]" +description: Whole-codebase audit — severity-filtered + baseline-aware --- ## Context - Current git status: `!git status` -- Recent changes (stat): `!git diff --stat HEAD~1` -- Recent commits: `!git log --oneline -5` - Current branch: `!git branch --show-current` +- Tracked file count: `!git ls-files | wc -l` ## Your task -Perform a code review with configurable depth based on mode. +Whole-codebase architectural + quality audit. **Different scope from `/coding:pr-review` and `/coding:local-review`** — see table: -### Step 1: Parse Arguments +| Command | Scope | Use when | +|---|---|---| +| `/coding:pr-review` | branch diff vs target | reviewing a PR before merge | +| `/coding:local-review` | uncommitted / `HEAD~1` diff | pre-commit local check | +| `/coding:code-review` (this) | **whole codebase** | onboarding, drift audit, periodic health-check | -Parse the first argument to determine mode: -- If first arg is `short|quick|fast` → **Short mode** (manual review only) -- If first arg is `full|comprehensive|complete` → **Full mode** (all agents, per-owner dispatch) -- Otherwise (including `standard`, `selector`, `--selector`, or no token) → **Selector mode (default)** (in-session classify + adjudicate, zero sub-agent spawns) +Design rationale: see `docs/three-command-review-split.md`. -Any remaining arguments are treated as the directory path. +## Step 0: Parse Arguments -### Step 2: Project Detection +- First positional → mode (`short` / `full` / `selector` (default)) +- Second positional → directory (default: current) +- `--include-optional` → include `Nice to Have` findings (default: filtered out) +- `--refresh-baseline` → write current finding set to `.code-review-baseline.yaml` and exit (no report) -Detect project type to determine which specialist agents to invoke: -- **Go project**: Check for `*.go` files or `go.mod` -- **Python project**: Check for `*.py` files or `pyproject.toml`/`requirements.txt` -- **Other languages**: Add detection as needed +Defaults are conservative — `selector` mode + Must Fix + Should Fix only — because whole-codebase output on a mature codebase is otherwise overwhelming. -### Step 3: Run Automated Checks (All Modes) +## Step 1: Walk the codebase -**3a. Check for LICENSE file (public repos only)** +Build the file-set the funnel processes: -First, detect if the repo is public or private: -- Run `git remote -v` and check the URL -- `github.com` → **public** → check LICENSE -- `bitbucket.seibert.tools` or other internal hosting → **private** → skip all license checks - -For public repos, use Glob to check if LICENSE file exists in project root: -``` -LICENSE or LICENSE.md or LICENSE.txt +```bash +cd && git ls-files | grep -E '\.(go|py|md|yaml|yml|sh)$' > /tmp/code-review-filelist.txt ``` -Store result for later: -- If private repo → skip license-assistant entirely (no LICENSE needed) -- If public and missing → flag for license-assistant in Standard mode, report in Short mode -- If public and present → skip license-assistant in Standard mode - -**3b. Run make precommit (Full mode only)** - -Running the full test suite is CI's job; the review needs the result, not a re-run. In **Selector** and **Short** mode, skip this step entirely and include in the Step 5 report: "precommit skipped (selector mode) — CI covers lint+test". - -**Full mode only**: Check if Makefile exists and has a `precommit` target: -1. Use Read tool to check if Makefile exists (will error if missing) -2. Use Grep tool to search for `^precommit:` pattern in Makefile +Exclude vendor + node_modules: -If both checks pass, use Task tool with simple-bash-runner agent: -``` -Task tool with subagent_type="coding:simple-bash-runner", prompt="cd [directory] && make precommit" +```bash +grep -v -E '^(vendor/|node_modules/|\.git/)' /tmp/code-review-filelist.txt > /tmp/code-review-files.txt +mv /tmp/code-review-files.txt /tmp/code-review-filelist.txt ``` -This provides automated checks (formatting, linting, tests, security) before running agents. Include the output and any failures in the final report. - -If Makefile doesn't exist or lacks `precommit` target, skip this step. If `make precommit` fails, note the failures but continue with the review. - -### Step 4: Dispatcher — ast-grep funnel → findings-scoped LLM adjudication - -Mirrors `commands/pr-review.md` Step 4. The funnel runs first (diff-scoped), then adjudicates findings in-session. **Selector mode (the default)**: in-session classify + adjudicate, zero sub-agent spawns. **Full mode**: keeps per-owner dispatch (all relevant owners + conditional agents). **Short mode**: skips Step 4 entirely. +This is the **scope source** — every file the audit considers. Replaces the diff-based file list that `/coding:pr-review` and `/coding:local-review` use. -**Short Mode**: No agents — skip to Step 5. +## Step 2: Project Detection + LICENSE check -**Early exit**: if NO changed file has extension `.go` or `.py` AND none matches `CHANGELOG.md`, `go.mod`, `LICENSE*`, `README.md`, `Makefile`, `pyproject.toml`, `k8s/**`, `agents/**`, `commands/**`, `skills/**`, `docs/**` — the diff cannot match any rule. Skip Step 4 entirely; note "Step 4 skipped: no rule-relevant files changed" in the report. One glance at the diff stat decides this — no tool calls needed. -- BUT: if LICENSE missing AND repo is public, add to "Should Fix": - - "Missing LICENSE file" - - "README missing license section" (check with Grep for `## License` in README.md) +Same as `/coding:pr-review` Step 2 + 3a — Go/Python detection drives which judgment rules trigger; LICENSE-presence is the conditional `license-assistant` gate. -#### 4.0: Toolchain preflight (fail-fast) +Skip `make precommit` (Step 3b in pr-review) — full lint+test on a whole codebase is CI's job; running it here is wasteful. -Mirror of `commands/pr-review.md` Step 4.0. Verify `ast-grep` is in PATH before invoking the runner; the runner script fail-fasts on the same check (exit 2 + JSON error), but doing it here too keeps the failure surface close to the dispatcher: +## Step 3: Toolchain preflight (fail-fast) -```bash -cd && (command -v ast-grep >/dev/null 2>&1 || command -v sg >/dev/null 2>&1) \ - || { echo "ERROR: ast-grep/sg not in PATH. Install via 'npm install -g @ast-grep/cli' (or 'apk add ast-grep' in alpine). pr-reviewer container fix: bborbe/maintainer agent/pr-reviewer/Dockerfile commit 1de083f." >&2; exit 1; } -``` - -Run exactly this one command, once. If it fails: report the toolchain gap in Step 5 and skip Step 4 entirely. Do NOT investigate further (no `which`, no `ls rules/`, no retry variants). +Identical to `/coding:pr-review` Step 4.0 — verify `ast-grep` is available. Failure → "Must Fix toolchain failure" in report; skip Step 4. -#### 4a: Mechanical funnel - -Run `scripts/ast-grep-runner.sh` (deterministic — covers ast-grep YAMLs AND script-tier rule-checks, diff-scoped) over the changed files from `git diff --stat HEAD~1`. The script ships with the coding plugin; resolve its path first — plugin install dir, falling back to the local checkout: +## Step 4: Mechanical funnel — whole codebase ```bash RUNNER="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/scripts/ast-grep-runner.sh" [ -x "$RUNNER" ] || RUNNER="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/scripts/ast-grep-runner.sh" [ -x "$RUNNER" ] || RUNNER="$HOME/Documents/workspaces/coding/scripts/ast-grep-runner.sh" -"$RUNNER" > /tmp/code-review-findings.json +"$RUNNER" $(cat /tmp/code-review-filelist.txt | tr '\n' ' ') > /tmp/code-review-findings.json ``` -Run exactly this one Bash call, once. Emits `{stats, findings_by_owner: {: [...findings]}, errors}` — read it from `/tmp/code-review-findings.json`. Do NOT spawn an agent for this step (the former `coding:ast-grep-runner` agent is deprecated). If the runner is missing or fails: note "mechanical funnel unavailable" for the Step 5 report and continue with Step 4b using judgment-rule triggers only — do NOT investigate (no `find`, no `which`, no path probing). +The runner is scope-agnostic — it processes whatever file list it receives. We pass the whole codebase. -#### 4b: Findings-scoped candidate computation +## Step 5: Adjudication -**Step 4b-i: Active judgment rules** — compute which judgment-tier rules are triggered by the diff. Run: +Selector mode (default) follows `docs/selector-mode-guide.md` § Step 4c-sel CLASSIFY + Step 4d-sel ADJUDICATE — identical to `/coding:pr-review`'s Step 4, with one change: -```bash -CHANGED_FILES="" -jq -r --arg files "$CHANGED_FILES" ' - [ .[] | select(.enforcement_type == "judgment") | - select( - .trigger == null or - (.trigger | any(. as $pat | - ($files | split("\n") | .[] | - (. == $pat) or - (($pat | startswith("@")) and $pat == "@commits") or - (($pat | contains("*")) and - (. | test("^" + ($pat | gsub("\\."; "\\.") | gsub("\\*\\*/"; "°") | gsub("\\*\\*"; "±") | gsub("\\*"; "[^/]*") | gsub("\\?"; "[^/]") | gsub("°"; "(.*/)?") | gsub("±"; ".*")) + "$")) - ) - ) - )) - ) - ] | .[] | .id + " " + .owner -' rules/index.json -``` +- **`DIFF`** input → the full content of changed-or-relevant files (the guide is source-agnostic; pass concatenated file contents OR pass the file paths as context — the adjudicator reads what it needs). +- **`MECHANICAL_FINDINGS`** → `/tmp/code-review-findings.json` (whole-codebase output). +- **`CANDIDATES`** → judgment-tier rules whose `trigger` glob matches at least one file in the codebase (most rules will match). -This produces a list of ` ` pairs whose trigger globs match at least one changed file. Rules with `trigger: ["@commits"]` are always included. This output feeds both Selector mode (Steps 4c-sel/4d-sel) and Full mode (per-owner dispatch). +Full mode → per-owner dispatch identical to `/coding:pr-review` Step 4b-ii, but the `` agents receive the whole-file-set scope. -#### Selector mode (the default): Steps 4c-sel and 4d-sel +Short mode → skip Step 5 entirely; report only the toolchain status + file count. -Steps 4.0, 4a, and 4b-i run unchanged. Resolve the guide and execute Steps 4c-sel and 4d-sel from it — zero sub-agent spawns. +## Step 6: Baseline diff (the critical noise-reduction step) -Run exactly this one command, once: +Without this, whole-codebase output drowns the operator in pre-existing tech debt. Read `.code-review-baseline.yaml` from the directory root: ```bash -GUIDE="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/docs/selector-mode-guide.md" -[ -f "$GUIDE" ] || GUIDE="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/docs/selector-mode-guide.md" -[ -f "$GUIDE" ] && echo "GUIDE_OK: $GUIDE" || echo "GUIDE_MISSING" +BASELINE="/.code-review-baseline.yaml" +[ -f "$BASELINE" ] && yq eval '.accepted' "$BASELINE" > /tmp/code-review-baseline.json 2>/dev/null || echo "{}" > /tmp/code-review-baseline.json ``` -If it prints `GUIDE_MISSING`: report "selector guide unavailable" as a **Must Fix toolchain failure** in Step 5 and STOP the selector path — do NOT continue with a mechanical-findings-only review presented as a complete selector review (a review without the judgment tier silently misses every judgment-tier rule; same fail-fast discipline as Step 4.0). Do NOT investigate further (no `find`, no `ls`, no path probing). +For each finding from Step 5: +- **CARRIED** — `(rule_id, file:line)` matches an `accepted` entry in the baseline → suppress from report, count in traceability section as "carried from baseline" +- **NEW** — finding not in baseline → report normally +- **FIXED-SINCE-BASELINE** — entry in baseline that has NO matching finding in current run → report as positive signal (line N's `no-fmt-errorf` was accepted, now gone) -If it prints `GUIDE_OK`: Read the file at that path, then execute its **Step 4c-sel CLASSIFY** and **Step 4d-sel ADJUDICATE** with: +Without a baseline file, every finding is NEW. First run on a mature codebase: huge report (this is the operator's signal to either fix the highest-value subset OR generate a baseline with `--refresh-baseline` and start tracking deltas). -- **DIFF** = `git diff HEAD~1` (or directory diff as parsed in Step 1) -- **CANDIDATES** = the Step 4b-i ` ` output -- **MECHANICAL_FINDINGS** = `/tmp/code-review-findings.json` -- **Working directory** = the reviewed directory +## Step 7: Severity filter + dedup -On the guide's short-circuit condition the report line is `selector clean — no adjudication needed`. Skip this section for short/full mode. +**Severity filter** (default-on): suppress `Nice to Have` findings unless `--include-optional` flag was passed. -Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. - -#### Full mode: per-owner dispatch - -**Full mode only** — skip this section in selector and short mode. - -Compute the dispatch set from Step 4b-i and Step 4a findings: +**Rule-id dedup**: group findings by `rule_id`. For each rule with ≥ 4 occurrences, emit ONE summary entry with the top 5 file:line citations + total count instead of N separate findings: ``` -owners_to_spawn = (keys of findings_by_owner) ∪ (owners from active judgment rules) +- **** — N occurrences across M files. + Sample sites: + - file1.go:42 + - file2.go:107 + - file3.go:18 + - …and (N−3) more ``` -If `owners_to_spawn` is empty AND `findings_by_owner` is empty, report "funnel clean — no adjudication needed" and proceed to Step 5. ZERO LLM spawns. +Rules with < 4 occurrences → list individually (no dedup benefit, more information loss than gain). -Otherwise, spawn ONE Task per owner in `owners_to_spawn` **concurrently** — they're independent. Each Task prompt: - -``` -coding: agent: "TARGET_DIR=. +## Step 8: `--refresh-baseline` mode -Pre-filtered mechanical findings for you (from ast-grep-runner): -] JSON, or empty array if none> +If the flag was set: write the CURRENT finding set (post-Step 5, pre-Step 6) to `.code-review-baseline.yaml`: -Active judgment rules you own (from rules/index.json, triggered by this diff): - - -Adjudicate: for each mechanical finding, assign severity (Critical / Important / Optional), add a fix suggestion that cites the rule by ID. Drop any finding whose rule_id is not in the index — stale-walker bug, not your concern. - -Also scan the diff for each active judgment rule listed above and report violations you find. Read only changed files relevant to those rules. - -Review changed code only." -``` - -#### 4c: Context-specific conventions - -Load these conventionally when the diff matches: - -| If diff touches… | Read first | -|---|---| -| `.env` files OR `k8s/*-secret.yaml` OR templates with `teamvault*` functions | `~/Documents/workspaces/coding/docs/teamvault-conventions.md` (teamvault lookup keys are not exposed credentials) | -| `main.go` of a k8s-deployed service | `~/Documents/workspaces/coding/docs/go-k8s-binary-conventions.md` | -| `k8s/*.yaml` (non-secret) | `~/Documents/workspaces/coding/docs/k8s-manifest-guide.md` | -| `CHANGELOG.md` | `~/Documents/workspaces/coding/docs/changelog-guide.md` | - -#### 4d: Citation validation - -**Full mode only** (selector mode's own citation call lives in the guide). Walk every finding from the per-owner agent reports and verify its `rule_id` field exists in `rules/index.json`. Drop findings citing missing IDs — they're hallucinations or stale-walker references. Log dropped findings to stderr. - -```bash -coding:simple-bash-runner agent: "bash scripts/validate-citations.sh " +```yaml +# .code-review-baseline.yaml — accepted pre-existing findings. +# Regenerate: /coding:code-review --refresh-baseline +generated_at: "" +generated_at_sha: "" +accepted: + : + count: + sample: + - file1.go:42 + - file2.go:107 + - file3.go:18 ``` -Drops findings citing missing rule IDs; logs drift to stderr. +Then exit — do NOT produce a report. The next normal `/coding:code-review` run will treat these as baseline and report only NEW findings. -#### Conditional / full-mode agents (independent of rule-base) +**Constraint**: `--refresh-baseline` requires a clean working tree (`git status --porcelain` empty). Baking accidental local cruft into accepted findings is exactly the failure mode this guards against. Refuse with a clear error if dirty. -These are file-presence / language-feature checks not yet expressed as RULE blocks in `rules/index.json`. Continue invoking directly: +## Step 9: Consolidated Report -- **`license-assistant`** — public repos with missing LICENSE -- **`readme-quality-assistant`** — full mode only (README quality) -- **`shellcheck-assistant`** — shell-script review -- **`context7-library-checker`** — full mode; up-to-date library docs -- **`go-version-manager`** / **`go-tooling-assistant`** — full mode; version + Makefile checks - -These will migrate to RULE blocks in follow-up PRs as their conventions get canonicalised. For now they fire on the legacy path alongside the dispatcher. - -### Step 5: Consolidated Report - -Merge all agent findings into a unified report. Each agent owns its domain — do NOT duplicate their rules here. - -Organize by severity: +Three buckets (per `/coding:pr-review` Step 5): #### Must Fix (Critical) -Agent-reported critical issues (security, context violations, concurrency bugs, data correctness, transaction deadlocks, circular imports, time.Now()/time.Time usage). - #### Should Fix (Important) -Agent-reported important issues (error handling, architecture, factory/handler patterns, test gaps, tooling, licensing). - #### Nice to Have (Optional) -Agent-reported minor issues (style, documentation, naming conventions, version updates). +*(suppressed by default; pass `--include-optional` to include)* -#### Selector Mode: Classify Traceability (selector mode only) +#### Baseline traceability section -Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. +- **Baseline**: `` (`.code-review-baseline.yaml`) +- **Findings before baseline diff**: `` +- **Carried from baseline (suppressed)**: `` +- **NEW (since baseline)**: `` +- **FIXED since baseline (positive)**: `` +- **Severity-filtered (Nice to Have suppressed)**: `` (or "all severities shown" if `--include-optional`) -### Step 6: Next Steps +#### Selector mode traceability (selector mode only) -If `go-test-coverage-assistant` reported missing tests, suggest: -``` -/coding:go-write-test basic — minimal tests for modified files -/coding:go-write-test standard — comprehensive with error cases -``` +Per `docs/selector-mode-guide.md` § Traceability Report Section. + +## Step 10: Next steps + +- If `--refresh-baseline` was just set up: commit `.code-review-baseline.yaml` so subsequent runs know the starting point. +- If NEW findings dominate: suggest opening focused fix-PRs grouped by rule_id (one PR per rule = clear scope, easy review). +- If FIXED-SINCE-BASELINE > 0: suggest refreshing the baseline (`--refresh-baseline`) to lock in the improvement so it can't regress unnoticed. -### Step 7: Manual Review (Short Mode / Non-Go) +## Constraints -For projects without agent coverage, review manually: -1. Code quality and readability -2. Security vulnerabilities -3. Performance bottlenecks -4. Test coverage -5. Documentation completeness +- Scope is the **whole tracked codebase** as of HEAD — not the working tree, not the staged set. Use `git ls-files`. +- Vendored / generated files (vendor/, node_modules/, .git/) are always excluded. +- Read-only — never modify code. The only file write is `.code-review-baseline.yaml` under `--refresh-baseline`. +- All paths in findings are repo-relative (no absolute, no `~/`) — same convention as the other review commands. diff --git a/commands/local-review.md b/commands/local-review.md new file mode 100644 index 0000000..7b628ba --- /dev/null +++ b/commands/local-review.md @@ -0,0 +1,253 @@ +--- +allowed-tools: Task, Bash(git diff:+), Bash(git log:+), Bash(git status:+), Bash(git ls-files:+) +argument-hint: "[short|full|selector] [directory]" +description: Perform a comprehensive review of local uncommitted/recent changes +--- + +## Context + +- Current git status: `!git status` +- Recent changes (stat): `!git diff --stat HEAD~1` +- Recent commits: `!git log --oneline -5` +- Current branch: `!git branch --show-current` + +## Your task + +Perform a review of local uncommitted/recent changes (since `HEAD~1`) with configurable depth based on mode. For branch-vs-target review use `/coding:pr-review`; for whole-codebase audit use `/coding:code-review`. + +### Step 1: Parse Arguments + +Parse the first argument to determine mode: +- If first arg is `short|quick|fast` → **Short mode** (manual review only) +- If first arg is `full|comprehensive|complete` → **Full mode** (all agents, per-owner dispatch) +- Otherwise (including `standard`, `selector`, `--selector`, or no token) → **Selector mode (default)** (in-session classify + adjudicate, zero sub-agent spawns) + +Any remaining arguments are treated as the directory path. + +### Step 2: Project Detection + +Detect project type to determine which specialist agents to invoke: +- **Go project**: Check for `*.go` files or `go.mod` +- **Python project**: Check for `*.py` files or `pyproject.toml`/`requirements.txt` +- **Other languages**: Add detection as needed + +### Step 3: Run Automated Checks (All Modes) + +**3a. Check for LICENSE file (public repos only)** + +First, detect if the repo is public or private: +- Run `git remote -v` and check the URL +- `github.com` → **public** → check LICENSE +- `bitbucket.seibert.tools` or other internal hosting → **private** → skip all license checks + +For public repos, use Glob to check if LICENSE file exists in project root: +``` +LICENSE or LICENSE.md or LICENSE.txt +``` + +Store result for later: +- If private repo → skip license-assistant entirely (no LICENSE needed) +- If public and missing → flag for license-assistant in Standard mode, report in Short mode +- If public and present → skip license-assistant in Standard mode + +**3b. Run make precommit (Full mode only)** + +Running the full test suite is CI's job; the review needs the result, not a re-run. In **Selector** and **Short** mode, skip this step entirely and include in the Step 5 report: "precommit skipped (selector mode) — CI covers lint+test". + +**Full mode only**: Check if Makefile exists and has a `precommit` target: +1. Use Read tool to check if Makefile exists (will error if missing) +2. Use Grep tool to search for `^precommit:` pattern in Makefile + +If both checks pass, use Task tool with simple-bash-runner agent: +``` +Task tool with subagent_type="coding:simple-bash-runner", prompt="cd [directory] && make precommit" +``` + +This provides automated checks (formatting, linting, tests, security) before running agents. Include the output and any failures in the final report. + +If Makefile doesn't exist or lacks `precommit` target, skip this step. If `make precommit` fails, note the failures but continue with the review. + +### Step 4: Dispatcher — ast-grep funnel → findings-scoped LLM adjudication + +Mirrors `commands/pr-review.md` Step 4. The funnel runs first (diff-scoped), then adjudicates findings in-session. **Selector mode (the default)**: in-session classify + adjudicate, zero sub-agent spawns. **Full mode**: keeps per-owner dispatch (all relevant owners + conditional agents). **Short mode**: skips Step 4 entirely. + +**Short Mode**: No agents — skip to Step 5. + +**Early exit**: if NO changed file has extension `.go` or `.py` AND none matches `CHANGELOG.md`, `go.mod`, `LICENSE*`, `README.md`, `Makefile`, `pyproject.toml`, `k8s/**`, `agents/**`, `commands/**`, `skills/**`, `docs/**` — the diff cannot match any rule. Skip Step 4 entirely; note "Step 4 skipped: no rule-relevant files changed" in the report. One glance at the diff stat decides this — no tool calls needed. +- BUT: if LICENSE missing AND repo is public, add to "Should Fix": + - "Missing LICENSE file" + - "README missing license section" (check with Grep for `## License` in README.md) + +#### 4.0: Toolchain preflight (fail-fast) + +Mirror of `commands/pr-review.md` Step 4.0. Verify `ast-grep` is in PATH before invoking the runner; the runner script fail-fasts on the same check (exit 2 + JSON error), but doing it here too keeps the failure surface close to the dispatcher: + +```bash +cd && (command -v ast-grep >/dev/null 2>&1 || command -v sg >/dev/null 2>&1) \ + || { echo "ERROR: ast-grep/sg not in PATH. Install via 'npm install -g @ast-grep/cli' (or 'apk add ast-grep' in alpine). pr-reviewer container fix: bborbe/maintainer agent/pr-reviewer/Dockerfile commit 1de083f." >&2; exit 1; } +``` + +Run exactly this one command, once. If it fails: report the toolchain gap in Step 5 and skip Step 4 entirely. Do NOT investigate further (no `which`, no `ls rules/`, no retry variants). + +#### 4a: Mechanical funnel + +Run `scripts/ast-grep-runner.sh` (deterministic — covers ast-grep YAMLs AND script-tier rule-checks, diff-scoped) over the changed files from `git diff --stat HEAD~1`. The script ships with the coding plugin; resolve its path first — plugin install dir, falling back to the local checkout: + +```bash +RUNNER="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/scripts/ast-grep-runner.sh" +[ -x "$RUNNER" ] || RUNNER="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/scripts/ast-grep-runner.sh" +[ -x "$RUNNER" ] || RUNNER="$HOME/Documents/workspaces/coding/scripts/ast-grep-runner.sh" +"$RUNNER" > /tmp/local-review-findings.json +``` + +Run exactly this one Bash call, once. Emits `{stats, findings_by_owner: {: [...findings]}, errors}` — read it from `/tmp/local-review-findings.json`. Do NOT spawn an agent for this step (the former `coding:ast-grep-runner` agent is deprecated). If the runner is missing or fails: note "mechanical funnel unavailable" for the Step 5 report and continue with Step 4b using judgment-rule triggers only — do NOT investigate (no `find`, no `which`, no path probing). + +#### 4b: Findings-scoped candidate computation + +**Step 4b-i: Active judgment rules** — compute which judgment-tier rules are triggered by the diff. Run: + +```bash +CHANGED_FILES="" +jq -r --arg files "$CHANGED_FILES" ' + [ .[] | select(.enforcement_type == "judgment") | + select( + .trigger == null or + (.trigger | any(. as $pat | + ($files | split("\n") | .[] | + (. == $pat) or + (($pat | startswith("@")) and $pat == "@commits") or + (($pat | contains("*")) and + (. | test("^" + ($pat | gsub("\\."; "\\.") | gsub("\\*\\*/"; "°") | gsub("\\*\\*"; "±") | gsub("\\*"; "[^/]*") | gsub("\\?"; "[^/]") | gsub("°"; "(.*/)?") | gsub("±"; ".*")) + "$")) + ) + ) + )) + ) + ] | .[] | .id + " " + .owner +' rules/index.json +``` + +This produces a list of ` ` pairs whose trigger globs match at least one changed file. Rules with `trigger: ["@commits"]` are always included. This output feeds both Selector mode (Steps 4c-sel/4d-sel) and Full mode (per-owner dispatch). + +#### Selector mode (the default): Steps 4c-sel and 4d-sel + +Steps 4.0, 4a, and 4b-i run unchanged. Resolve the guide and execute Steps 4c-sel and 4d-sel from it — zero sub-agent spawns. + +Run exactly this one command, once: + +```bash +GUIDE="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/coding}/docs/selector-mode-guide.md" +[ -f "$GUIDE" ] || GUIDE="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/marketplaces/coding/docs/selector-mode-guide.md" +[ -f "$GUIDE" ] && echo "GUIDE_OK: $GUIDE" || echo "GUIDE_MISSING" +``` + +If it prints `GUIDE_MISSING`: report "selector guide unavailable" as a **Must Fix toolchain failure** in Step 5 and STOP the selector path — do NOT continue with a mechanical-findings-only review presented as a complete selector review (a review without the judgment tier silently misses every judgment-tier rule; same fail-fast discipline as Step 4.0). Do NOT investigate further (no `find`, no `ls`, no path probing). + +If it prints `GUIDE_OK`: Read the file at that path, then execute its **Step 4c-sel CLASSIFY** and **Step 4d-sel ADJUDICATE** with: + +- **DIFF** = `git diff HEAD~1` (or directory diff as parsed in Step 1) +- **CANDIDATES** = the Step 4b-i ` ` output +- **MECHANICAL_FINDINGS** = `/tmp/local-review-findings.json` +- **Working directory** = the reviewed directory + +On the guide's short-circuit condition the report line is `selector clean — no adjudication needed`. Skip this section for short/full mode. + +Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. + +#### Full mode: per-owner dispatch + +**Full mode only** — skip this section in selector and short mode. + +Compute the dispatch set from Step 4b-i and Step 4a findings: + +``` +owners_to_spawn = (keys of findings_by_owner) ∪ (owners from active judgment rules) +``` + +If `owners_to_spawn` is empty AND `findings_by_owner` is empty, report "funnel clean — no adjudication needed" and proceed to Step 5. ZERO LLM spawns. + +Otherwise, spawn ONE Task per owner in `owners_to_spawn` **concurrently** — they're independent. Each Task prompt: + +``` +coding: agent: "TARGET_DIR=. + +Pre-filtered mechanical findings for you (from ast-grep-runner): +] JSON, or empty array if none> + +Active judgment rules you own (from rules/index.json, triggered by this diff): + + +Adjudicate: for each mechanical finding, assign severity (Critical / Important / Optional), add a fix suggestion that cites the rule by ID. Drop any finding whose rule_id is not in the index — stale-walker bug, not your concern. + +Also scan the diff for each active judgment rule listed above and report violations you find. Read only changed files relevant to those rules. + +Review changed code only." +``` + +#### 4c: Context-specific conventions + +Load these conventionally when the diff matches: + +| If diff touches… | Read first | +|---|---| +| `.env` files OR `k8s/*-secret.yaml` OR templates with `teamvault*` functions | `~/Documents/workspaces/coding/docs/teamvault-conventions.md` (teamvault lookup keys are not exposed credentials) | +| `main.go` of a k8s-deployed service | `~/Documents/workspaces/coding/docs/go-k8s-binary-conventions.md` | +| `k8s/*.yaml` (non-secret) | `~/Documents/workspaces/coding/docs/k8s-manifest-guide.md` | +| `CHANGELOG.md` | `~/Documents/workspaces/coding/docs/changelog-guide.md` | + +#### 4d: Citation validation + +**Full mode only** (selector mode's own citation call lives in the guide). Walk every finding from the per-owner agent reports and verify its `rule_id` field exists in `rules/index.json`. Drop findings citing missing IDs — they're hallucinations or stale-walker references. Log dropped findings to stderr. + +```bash +coding:simple-bash-runner agent: "bash scripts/validate-citations.sh " +``` + +Drops findings citing missing rule IDs; logs drift to stderr. + +#### Conditional / full-mode agents (independent of rule-base) + +These are file-presence / language-feature checks not yet expressed as RULE blocks in `rules/index.json`. Continue invoking directly: + +- **`license-assistant`** — public repos with missing LICENSE +- **`readme-quality-assistant`** — full mode only (README quality) +- **`shellcheck-assistant`** — shell-script review +- **`context7-library-checker`** — full mode; up-to-date library docs +- **`go-version-manager`** / **`go-tooling-assistant`** — full mode; version + Makefile checks + +These will migrate to RULE blocks in follow-up PRs as their conventions get canonicalised. For now they fire on the legacy path alongside the dispatcher. + +### Step 5: Consolidated Report + +Merge all agent findings into a unified report. Each agent owns its domain — do NOT duplicate their rules here. + +Organize by severity: + +#### Must Fix (Critical) +Agent-reported critical issues (security, context violations, concurrency bugs, data correctness, transaction deadlocks, circular imports, time.Now()/time.Time usage). + +#### Should Fix (Important) +Agent-reported important issues (error handling, architecture, factory/handler patterns, test gaps, tooling, licensing). + +#### Nice to Have (Optional) +Agent-reported minor issues (style, documentation, naming conventions, version updates). + +#### Selector Mode: Classify Traceability (selector mode only) + +Include the traceability section per `docs/selector-mode-guide.md` § Traceability Report Section. + +### Step 6: Next Steps + +If `go-test-coverage-assistant` reported missing tests, suggest: +``` +/coding:go-write-test basic — minimal tests for modified files +/coding:go-write-test standard — comprehensive with error cases +``` + +### Step 7: Manual Review (Short Mode / Non-Go) + +For projects without agent coverage, review manually: +1. Code quality and readability +2. Security vulnerabilities +3. Performance bottlenecks +4. Test coverage +5. Documentation completeness diff --git a/docs/architecture-dimensions-guide.md b/docs/architecture-dimensions-guide.md index d2745c8..3dc6a1c 100644 --- a/docs/architecture-dimensions-guide.md +++ b/docs/architecture-dimensions-guide.md @@ -9,7 +9,7 @@ How to review a whole codebase for **behavioral** architectural quality, complem - Before scaling a service (team, traffic, complexity) - Onboarding a codebase you didn't write -**Do NOT use for diff review** — `/coding:code-review` and `/coding:pr-review` cover per-PR work. +**Do NOT use for diff review** — `/coding:local-review` and `/coding:pr-review` cover per-PR work. ## Scope: structural vs behavioral @@ -201,5 +201,5 @@ Exclude from review: - `coding:go-architecture-assistant` / `coding:python-architecture-assistant` — sibling structural pass; run in parallel via `/coding:architecture-review` - `coding:srp-checker` — unit-level SRP; this guide handles cross-cutting consistency -- `/coding:code-review` — diff-scoped per-PR review (different altitude) +- `/coding:local-review` — diff-scoped per-PR review (different altitude) - `/coding:audit-architecture` — quick single-agent structural scan (lighter alternative) diff --git a/docs/releasing-coding.md b/docs/releasing-coding.md index 3050636..e38d171 100644 --- a/docs/releasing-coding.md +++ b/docs/releasing-coding.md @@ -30,7 +30,7 @@ The check is **release-time only** — `make precommit` does NOT run it. Until a `scenarios/` regression suite exists, the operator must manually exercise: - Reload Claude Code with the plugin source mounted; verify `commands/`, `agents/`, `skills/` load without errors -- Smoke-test at least one slash command (e.g. `/coding:code-review`) end-to-end +- Smoke-test at least one slash command (e.g. `/coding:local-review`) end-to-end - Inspect any new docs added under `docs/` for broken cross-references not caught by `check-links` If any check fails, fix before tagging. diff --git a/llms.txt b/llms.txt index a126c28..a56daaf 100644 --- a/llms.txt +++ b/llms.txt @@ -89,8 +89,8 @@ - [Markdown & Todos](docs/markdown-todo-guide.md): Formatting standards ## Acceptance Scenarios -End-to-end acceptance walks for the doc-driven review pipeline (`/coding:code-review`, `/coding:pr-review`). Each scenario is a manually-walked checklist under `scenarios/`; status `active` = walked successfully at least once. +End-to-end acceptance walks for the doc-driven review pipeline (`/coding:local-review`, `/coding:pr-review`). Each scenario is a manually-walked checklist under `scenarios/`; status `active` = walked successfully at least once. - [Scenario 001 — toolchain-preflight](scenarios/001-toolchain-preflight.md): Step 4.0 / Step 0 preflight blocks exit 1 with documented stderr when `ast-grep` / `sg` is absent from PATH -- [Scenario 002 — clean-pr-zero-findings](scenarios/002-clean-pr-zero-findings.md): `/coding:code-review master` against a zero-violation diff produces empty Must Fix / Should Fix / Nice to Have +- [Scenario 002 — clean-pr-zero-findings](scenarios/002-clean-pr-zero-findings.md): `/coding:local-review master` against a zero-violation diff produces empty Must Fix / Should Fix / Nice to Have - [Scenario 003 — scaling-funnel-100-files](scenarios/003-scaling-funnel-100-files.md): 100-file synthetic fixture, mechanical funnel ≤30s, distinct Owners ≤30 - [Scenario 004 — findings-exist-path](scenarios/004-findings-exist-path.md): `/coding:pr-review` against bborbe/maintainer#2 — Step 4a surfaces ≥4 findings, every Owner has an agent file, citation discipline holds diff --git a/scenarios/002-clean-pr-zero-findings.md b/scenarios/002-clean-pr-zero-findings.md index 9982de6..345f667 100644 --- a/scenarios/002-clean-pr-zero-findings.md +++ b/scenarios/002-clean-pr-zero-findings.md @@ -4,7 +4,7 @@ status: outdated # Scenario 002: Zero-violation PR in standard mode produces empty findings, no false positives -Validates that a diff with no mechanical-rule violations flows through `/coding:code-review master standard` (Step 4.0 → 4a → 4b → 4c → 4d → 5) and produces a report with empty Must Fix / Should Fix / Nice to Have sections — locking down the regression risk that the LLM tier hallucinates findings to fill the void when the mechanical layer surfaces none. +Validates that a diff with no mechanical-rule violations flows through `/coding:local-review master standard` (Step 4.0 → 4a → 4b → 4c → 4d → 5) and produces a report with empty Must Fix / Should Fix / Nice to Have sections — locking down the regression risk that the LLM tier hallucinates findings to fill the void when the mechanical layer surfaces none. Updated for funnel v2 (PR #48): runner output now lands in `/tmp/code-review-findings.json` (not an awk-extracted stdout block); zero LLM spawns are signalled by the dispatcher logging "funnel clean — no adjudication needed" when `findings_by_owner` is empty and no judgment rules are triggered by the README-only diff. @@ -17,7 +17,7 @@ Updated for funnel v2 (PR #48): runner output now lands in `/tmp/code-review-fin ## Action -- [ ] Run `/coding:code-review master` against `$WORK` in a fresh Claude Code session (standard mode is the default — do not pass a mode argument); tee stdout to `/tmp/code-review-stdout.log`, stderr to `/tmp/code-review-stderr.log`, capture exit code to `/tmp/code-review-exit` +- [ ] Run `/coding:local-review master` against `$WORK` in a fresh Claude Code session (standard mode is the default — do not pass a mode argument); tee stdout to `/tmp/code-review-stdout.log`, stderr to `/tmp/code-review-stderr.log`, capture exit code to `/tmp/code-review-exit` ## Expected diff --git a/scenarios/005-selector-clean-short-circuit.md b/scenarios/005-selector-clean-short-circuit.md index 0e097b0..101f98d 100644 --- a/scenarios/005-selector-clean-short-circuit.md +++ b/scenarios/005-selector-clean-short-circuit.md @@ -4,7 +4,7 @@ status: active # Scenario 005: Selector mode short-circuits cleanly on a README-only whitespace diff -Validates that `/coding:code-review master selector` (in-place review — works on a local-only branch, mirroring scenario 002) against a branch whose diff is a single README.md whitespace edit emits `selector clean — no adjudication needed`, produces a report with empty Must Fix / Should Fix / Nice to Have sections, and spawns zero sub-agents — locking down the regression risk that the classify step incorrectly marks any rule as applicable when no rule-relevant files changed. +Validates that `/coding:local-review master selector` (in-place review — works on a local-only branch, mirroring scenario 002) against a branch whose diff is a single README.md whitespace edit emits `selector clean — no adjudication needed`, produces a report with empty Must Fix / Should Fix / Nice to Have sections, and spawns zero sub-agents — locking down the regression risk that the classify step incorrectly marks any rule as applicable when no rule-relevant files changed. ## Setup @@ -15,7 +15,7 @@ Validates that `/coding:code-review master selector` (in-place review — works ## Action -- [ ] Run `/coding:code-review master selector` in `$WORK` (in-place; the fixture branch is local-only, which `/coding:pr-review`'s worktree flow does not support — that journey is scenario 006's) in a fresh Claude Code session; tee stdout to `/tmp/scen005-stdout.log`, stderr to `/tmp/scen005-stderr.log`, capture exit code to `/tmp/scen005-exit` +- [ ] Run `/coding:local-review master selector` in `$WORK` (in-place; the fixture branch is local-only, which `/coding:pr-review`'s worktree flow does not support — that journey is scenario 006's) in a fresh Claude Code session; tee stdout to `/tmp/scen005-stdout.log`, stderr to `/tmp/scen005-stderr.log`, capture exit code to `/tmp/scen005-exit` ## Expected @@ -35,4 +35,4 @@ Validates that `/coding:code-review master selector` (in-place review — works --- -**Status**: walked 2026-06-11 (MiniMax-M2.7-highspeed via an isolated `CLAUDE_CONFIG_DIR` (plugin clone pinned to the branch under test), skill @ f93717e): 1m29s, 17 turns, zero `coding:*` spawns, GUIDE_OK ×2 in stream, classify table with per-candidate `applies_when` + reasons, 5 candidates → 0 applicable → 5 skipped, literal `selector clean — no adjudication needed` fired. Transcript: `/tmp/scen005-walk2.jsonl` (volatile; key numbers recorded here). Promoted draft → active. Note: original draft used `/coding:pr-review` — switched to `/coding:code-review` (in-place) because the fixture branch is local-only; pr-review's worktree flow needs an origin branch (that journey is scenario 006's). +**Status**: walked 2026-06-11 (MiniMax-M2.7-highspeed via an isolated `CLAUDE_CONFIG_DIR` (plugin clone pinned to the branch under test), skill @ f93717e): 1m29s, 17 turns, zero `coding:*` spawns, GUIDE_OK ×2 in stream, classify table with per-candidate `applies_when` + reasons, 5 candidates → 0 applicable → 5 skipped, literal `selector clean — no adjudication needed` fired. Transcript: `/tmp/scen005-walk2.jsonl` (volatile; key numbers recorded here). Promoted draft → active. Note: original draft used `/coding:pr-review` — switched to `/coding:local-review` (in-place) because the fixture branch is local-only; pr-review's worktree flow needs an origin branch (that journey is scenario 006's). diff --git a/scripts/acceptance.sh b/scripts/acceptance.sh index 92bf514..ebc99d4 100755 --- a/scripts/acceptance.sh +++ b/scripts/acceptance.sh @@ -7,7 +7,7 @@ # slash-command run). # # 1. Mode coverage — short/standard/full dispatch contracts in commands/ -# 2. Per-language routing — Go vs Python agent scope in code-review.md +# 2. Per-language routing — Go vs Python agent scope in local-review.md (whole-codebase code-review.md has its own scope) # 3. Context loading — Step 2.5 file-glob → doc mapping consistency # 4. Broken-YAML isolation — ast-grep error on bad YAML doesn't mask # findings from good YAMLs in the same dir @@ -41,22 +41,22 @@ echo "=== 1/5 Mode coverage ===" # Short mode in both dispatcher commands must skip Step 4 entirely (no runner agent). if grep -qE "Short Mode.*No agents|Short Mode.*skip" commands/pr-review.md && \ - grep -qE "Short Mode.*No agents|Short Mode.*skip" commands/code-review.md; then - ok "short mode skips Step 4 in both pr-review.md AND code-review.md" + grep -qE "Short Mode.*No agents|Short Mode.*skip" commands/local-review.md; then + ok "short mode skips Step 4 in both pr-review.md AND local-review.md" else - fail "short mode skip directive missing from pr-review.md or code-review.md" + fail "short mode skip directive missing from pr-review.md or local-review.md" fi # All modes (selector/full) invoke the deterministic runner script (Step 4a is mandatory for # the dispatcher). The former coding:ast-grep-runner agent is deprecated and must # NOT be invoked by either command. -if grep -qE "scripts/ast-grep-runner\.sh" commands/code-review.md && \ +if grep -qE "scripts/ast-grep-runner\.sh" commands/local-review.md && \ grep -qE "scripts/ast-grep-runner\.sh" commands/pr-review.md; then ok "ast-grep-runner.sh invoked in both dispatcher commands" else - fail "scripts/ast-grep-runner.sh not referenced in pr-review.md or code-review.md" + fail "scripts/ast-grep-runner.sh not referenced in pr-review.md or local-review.md" fi -if grep -qE '^coding:ast-grep-runner agent' commands/code-review.md commands/pr-review.md; then +if grep -qE '^coding:ast-grep-runner agent' commands/local-review.md commands/pr-review.md; then fail "deprecated coding:ast-grep-runner agent still invoked in a dispatcher command" else ok "deprecated coding:ast-grep-runner agent not invoked by either command" @@ -66,22 +66,22 @@ fi # to RULE blocks yet (license, readme, shellcheck, context7, go-version-manager, …). conditional_agents=0 for a in license-assistant readme-quality-assistant shellcheck-assistant context7-library-checker go-version-manager go-tooling-assistant; do - if grep -qE "$a" commands/code-review.md; then + if grep -qE "$a" commands/local-review.md; then conditional_agents=$((conditional_agents+1)) fi done if [ "$conditional_agents" -ge 4 ]; then - ok "full mode references $conditional_agents legacy-path agents (≥ 4) in code-review.md" + ok "full mode references $conditional_agents legacy-path agents (≥ 4) in local-review.md" else - fail "full mode references only $conditional_agents legacy-path agents (< 4) in code-review.md" + fail "full mode references only $conditional_agents legacy-path agents (< 4) in local-review.md" fi # Default/otherwise token must route to Selector mode in both commands. if grep -qE "Selector mode \(the default\)|Selector mode \(default\)" commands/pr-review.md && \ - grep -qE "Selector mode \(the default\)|Selector mode \(default\)" commands/code-review.md; then + grep -qE "Selector mode \(the default\)|Selector mode \(default\)" commands/local-review.md; then ok "default/otherwise token routes to Selector mode (default) in both commands" else - fail "Selector mode (default) routing missing from pr-review.md or code-review.md" + fail "Selector mode (default) routing missing from pr-review.md or local-review.md" fi echo "=== 2/5 Per-language routing ===" @@ -89,11 +89,11 @@ echo "=== 2/5 Per-language routing ===" # Full-mode per-owner dispatch block present — findings_by_owner feeds the dispatch set and # coding: agent prompts reference it. Per-owner dispatch lives in full mode only # (standard/selector path removed); the check asserts what remains true post-flip. -if grep -qE "coding:|findings_by_owner" commands/code-review.md && \ +if grep -qE "coding:|findings_by_owner" commands/local-review.md && \ grep -qE "coding:|findings_by_owner" commands/pr-review.md; then ok "full-mode per-owner dispatch block (findings_by_owner / coding:) present in both dispatcher commands" else - fail "full-mode per-owner dispatch block missing from pr-review.md or code-review.md" + fail "full-mode per-owner dispatch block missing from pr-review.md or local-review.md" fi # Every owner referenced in rules/index.json must have a matching agent file in agents/. @@ -126,10 +126,10 @@ echo "=== 3/5 Context loading (Step 2.5 globs) ===" # Each canonical context-doc mapping must be referenced in both dispatcher commands. for mapping in "teamvault-conventions.md" "go-k8s-binary-conventions.md" "k8s-manifest-guide.md" "changelog-guide.md"; do - if grep -qF "$mapping" commands/pr-review.md && grep -qF "$mapping" commands/code-review.md; then + if grep -qF "$mapping" commands/pr-review.md && grep -qF "$mapping" commands/local-review.md; then ok "Step 2.5 mapping '$mapping' present in both dispatcher commands" else - fail "Step 2.5 mapping '$mapping' missing from pr-review.md OR code-review.md" + fail "Step 2.5 mapping '$mapping' missing from pr-review.md OR local-review.md" fi done @@ -174,7 +174,7 @@ echo "=== 5/5 Selector mode contracts ===" # (a) Both command files reference --selector token in Step 1 parse, the # short-circuit string, and the GUIDE_OK/GUIDE_MISSING fail-fast block. -for cmd in commands/pr-review.md commands/code-review.md; do +for cmd in commands/pr-review.md commands/local-review.md; do if grep -qE '\-\-selector|selector.*mode' "$cmd"; then ok "$cmd: --selector token present in Step 1 parse" else @@ -216,10 +216,10 @@ else fail "selector guide missing verbatim recall contract sentence 'When uncertain, include.'" fi -# (c) Sibling consistency: both pr-review.md and code-review.md reference the +# (c) Sibling consistency: both pr-review.md and local-review.md reference the # same guide filename and the same step labels (4c-sel, 4d-sel). pr_guide=$(grep -oE 'selector-mode-guide\.md' commands/pr-review.md | head -1) -cr_guide=$(grep -oE 'selector-mode-guide\.md' commands/code-review.md | head -1) +cr_guide=$(grep -oE 'selector-mode-guide\.md' commands/local-review.md | head -1) if [ "$pr_guide" = "selector-mode-guide.md" ] && [ "$cr_guide" = "selector-mode-guide.md" ]; then ok "both commands reference the same guide filename: selector-mode-guide.md" else @@ -227,7 +227,7 @@ else fi for label in '4c-sel' '4d-sel'; do pr_has=$(grep -c "$label" commands/pr-review.md || true) - cr_has=$(grep -c "$label" commands/code-review.md || true) + cr_has=$(grep -c "$label" commands/local-review.md || true) if [ "$pr_has" -ge 1 ] && [ "$cr_has" -ge 1 ]; then ok "step label '$label' present in both commands" else