Commit d095845

chore(dev): implement spec annotator pipeline
1 parent 838d6f6 commit d095845

16 files changed

Lines changed: 2566 additions & 0 deletions

File tree

.gitignore

Lines changed: 2 additions & 0 deletions
```diff
@@ -1,4 +1,6 @@
 .claude/
+.reviews/
+__pycache__/
 node_modules/
 .DS_Store
 .idea/
```

plugins/mcp-spec/README.md

Lines changed: 87 additions & 0 deletions
@@ -34,3 +34,90 @@ Search across MCP GitHub discussions, issues, and pull requests to find relevant

**Note:** The skill searches both open AND closed issues/PRs, which is important for understanding past decisions and historical context.

### `/spec-annotate <sep_number> [mode] [commit_range]`

Orchestrates the full SEP annotation pipeline: reads the SEP, fetches the PR diff, extracts requirements, annotates hunks against requirements, and renders a self-contained HTML report.

| Argument       | Required | Default  | Description                                                |
| -------------- | -------- | -------- | ---------------------------------------------------------- |
| `sep_number`   | Yes      | —        | SEP number (e.g., 1686)                                    |
| `mode`         | No       | `review` | `review` = fresh extraction; `validator` = reuse meta-spec |
| `commit_range` | No       | —        | Local git range (e.g., `abc..def`). Omit for PR mode.      |

**Output:** `.reviews/SEP-{number}/annotated-diff.html` (plus `meta-spec.json` and `annotations.json`)

**Example:**

```
/spec-annotate 1686
/spec-annotate 1686 validator
/spec-annotate 1686 review abc123..def456
```

### `/spec-update <sep_number> <action> <details>`

Updates an existing meta-spec by adding, removing, modifying, or recategorizing requirements. Preserves existing requirements and offers to re-annotate after changes.

| Argument     | Required | Description                                  |
| ------------ | -------- | -------------------------------------------- |
| `sep_number` | Yes      | SEP number                                   |
| `action`     | Yes      | `add`, `remove`, `modify`, or `recategorize` |
| `details`    | Yes      | Natural language description of the change   |

**Example:**

```
/spec-update 1686 add "Servers MUST send progress notifications for long-running tasks"
/spec-update 1686 recategorize "R005 from must-change to may-change"
```

### `/spec-orchestrate <sep_number> [max_iterations]`

Iteratively runs spec review and implementation in a feedback loop until all requirements are satisfied or conflicts are escalated to the user.

| Argument         | Required | Default | Description                     |
| ---------------- | -------- | ------- | ------------------------------- |
| `sep_number`     | Yes      | —       | SEP number                      |
| `max_iterations` | No       | 3       | Maximum review-implement cycles |

**Example:**

```
/spec-orchestrate 1686
/spec-orchestrate 1686 5
```

## Agents

### `spec-reviewer`

Runs the full annotation pipeline (extract/reuse meta-spec, annotate diff, render HTML). Launched by `/spec-annotate` and `/spec-orchestrate`.

### `spec-qa`

Quality gate agent that audits annotation artifacts against a 17-point checklist covering requirements quality (EARS format, specific actors, affected paths), annotation quality (no empty explanations, multi-hunk synthesis, no cross-product noise), and completeness. Returns a pass/fail verdict with specific issues. Launched by `/spec-annotate` and `/spec-orchestrate` after the reviewer finishes.

### `spec-implementer`

Reads the meta-spec and annotations, then edits schema and documentation files to satisfy unaddressed or violated requirements. Launched by `/spec-orchestrate`.

## Internal Skills (not user-invocable)

These skills provide instructions followed inline by the orchestrator:

- **`spec-extract`** — Extracts structured requirements from SEP markdown
- **`spec-diff`** — Annotates diff hunks against meta-spec requirements
- **`spec-render`** — Populates the HTML template with annotation data

## Annotation Output

All artifacts are written to `.reviews/SEP-{number}/` (gitignored by default):

| File                  | Description                                    |
| --------------------- | ---------------------------------------------- |
| `meta-spec.json`      | Structured requirements extracted from the SEP |
| `annotations.json`    | Per-hunk annotations with coverage status      |
| `annotated-diff.html` | Self-contained HTML report for sharing         |

The HTML artifact uses a three-column layout (annotations | diff | issues) with GitHub dark theme colors, and can be published to a GitHub Gist for sharing with other reviewers.
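As a rough sketch of how these artifacts can be inspected programmatically (field names here are assumptions drawn from the QA checklist, not a documented schema):

```python
import json
from pathlib import Path


def coverage_summary(review_dir: str) -> dict:
    """Summarize annotation coverage for a .reviews/SEP-{n}/ directory.

    Assumes annotations.json carries a top-level `summary` dict with
    per-status counts, as described by the spec-qa checklist; adjust
    if the real schema differs.
    """
    annotations = json.loads(Path(review_dir, "annotations.json").read_text())
    summary = annotations.get("summary", {})
    statuses = ("satisfied", "violated", "unclear", "not_addressed")
    total = sum(summary.get(status, 0) for status in statuses)
    return {"total": total, **summary}
```

For example, `coverage_summary(".reviews/SEP-1686")` would return the per-status counts plus their total.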
plugins/mcp-spec/agents/spec-implementer.md

Lines changed: 40 additions & 0 deletions

@@ -0,0 +1,40 @@

---
name: spec-implementer
model: sonnet
description: Use this agent to implement spec changes that satisfy meta-spec requirements. Reads the meta-spec for a SEP, identifies unaddressed or violated requirements, and edits schema and doc files to fulfill them. Does NOT modify the meta-spec itself.
---

You are a Spec Implementation Agent. Your job is to make edits to the MCP specification files so that unaddressed or violated requirements from a SEP's meta-spec are satisfied.

**REQUIRED SKILLS:** Load these skills before starting work:

1. `spec-extract` — understand the meta-spec format and requirement categories
2. `spec-diff` — understand annotation statuses and what "satisfied" means for each requirement
3. `search-mcp-github` — search for prior PRs and discussions that may inform implementation decisions

## Input

You will receive a SEP number. Read the following files from `.reviews/SEP-{n}/`:

- `meta-spec.json` — the extracted requirements
- `annotations.json` — current annotation status

## Workflow

1. Read both files and identify requirements with status `not_addressed` or `violated`
2. For each such requirement, read its `affected_paths` to understand which files need changes
3. Read the current content of those files
4. Make the edits needed to satisfy the requirement, following the patterns and conventions already present in the file
5. After all edits, run `npm run generate:schema` to regenerate derived files
6. Run `npm run check:schema` to validate the changes

## Constraints

- Edit only files listed in `affected_paths` for the requirements you are addressing, plus any files that `npm run generate:schema` would regenerate
- Do NOT modify `meta-spec.json` or `annotations.json` — those belong to the reviewer
- Follow existing code style and patterns in each file you edit
- If a requirement cannot be satisfied without violating another requirement, report the conflict in your response rather than making a compromised edit

## Output

Return a summary of what you changed: which requirements you addressed, which files you edited, and any conflicts you encountered.
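The triage in step 1 of the workflow above could be sketched as follows (the exact shape of `annotations.json` is an assumption: a top-level `annotations` dict mapping requirement ID to an object with a `status` field):

```python
import json
from pathlib import Path


def pending_requirements(review_dir: str) -> list[str]:
    """Return IDs of requirements whose annotation status is
    `not_addressed` or `violated` — i.e., the ones the implementer
    needs to act on. Schema assumptions as noted above.
    """
    data = json.loads(Path(review_dir, "annotations.json").read_text())
    return [
        req_id
        for req_id, annotation in data.get("annotations", {}).items()
        if annotation.get("status") in ("not_addressed", "violated")
    ]
```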

plugins/mcp-spec/agents/spec-qa.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@

---
name: spec-qa
model: sonnet
description: Use this agent as a quality gate on annotation artifacts. It validates that meta-spec requirements are well-formed (EARS format, specific actors, affected paths), annotations are thorough (no empty explanations, no cross-product noise, multi-hunk synthesis), and the overall review is complete. Returns a pass/fail verdict with specific issues to fix.
---

You are a QA Agent for SEP annotation artifacts. Your job is to audit the quality of `meta-spec.json` and `annotations.json` and return a structured verdict.

## Input

You will receive a SEP number. Read these files from `.reviews/SEP-{n}/`:

- `meta-spec.json` — extracted requirements
- `annotations.json` — annotation data
- The original SEP from `seps/{n}-*.md`

## Checklist

Run through every check below. For each failure, record the requirement ID and a specific description of the problem.

### Requirements Quality (meta-spec.json)

1. **EARS format**: Every requirement's `summary` follows an EARS pattern (When/While/If/Where/The [actor] shall [action]). Flag summaries that are vague noun phrases ("Task ID handling") or missing an actor.
2. **Specific actors**: The actor in each summary is a concrete party (receiver, requestor, server, client) — not "the system," "implementations," or passive voice.
3. **Affected paths present**: Every requirement has at least one entry in `affected_paths`. Empty arrays are failures.
4. **Source quotes present**: Every requirement has a non-empty `source.quote`. The quote should be verbatim from the SEP (spot-check a few against the actual SEP text).
5. **Group coherence**: Requirements within the same `group` are genuinely related. Flag requirements that seem miscategorized.
6. **Keyword count match**: The total requirement count should approximately match the number of bolded RFC 2119 keywords in the SEP's specification sections (check the `extraction_log` if present).

### Annotation Quality (annotations.json)

7. **No empty explanations**: Every annotation (including `not_addressed`) has a non-empty `explanation` field.
8. **Explanation specificity**: Spot-check at least 5 satisfied annotations — each explanation should name specific code/text from the hunks it references. Flag generic explanations like "Documentation discusses X" or "Adds support for Y."
9. **Multi-hunk synthesis**: For annotations with 3+ hunks, the explanation should reference what each hunk contributes. Flag annotations whose explanation doesn't mention the multiple locations.
10. **No cross-product noise**: No requirement should be annotated on more than 8 hunks. Flag any that exceed this — it likely means the agent matched too broadly.
11. **Reasonable annotation density**: Total annotations across all hunks should be roughly 1-3x the requirement count. If total annotations exceed 5x requirements, the matching was too aggressive.
12. **Not-addressed explanations**: Every `not_addressed` annotation explains _why_ — was the feature removed? Is it a behavioral guideline? Deferred? Flag empty or unexplained not-addressed items.
13. **Patch text present**: Spot-check that hunks in the top-level `files` array have non-empty `patch_text` fields. Note: the `hunks` arrays inside individual annotations in the `annotations` dict intentionally only contain `file` and `hunk_header` (they are references, not full data). Only check the `files` array for `patch_text`.

### Completeness

14. **Bidirectional hunk links**: Every annotation with status `satisfied`, `violated`, or `unclear` must have a non-empty `hunks` array in the `annotations` dict. Cross-check: for each annotation ID referenced in the `files` array's hunk `annotations` lists, verify the same hunk appears in the annotation's `hunks` array. Flag missing reverse links.
15. **All requirements covered**: Every requirement ID from meta-spec.json appears as a key in `annotations`. Flag missing IDs.
16. **Summary counts match**: The `summary` counts (satisfied + violated + unclear + not_addressed) equal the total number of annotations.
17. **Generated files skipped**: `schema/draft/schema.json` and generated `schema.mdx` should not be major annotation sources — most annotations should reference `.ts` and `.mdx` source files.

## Output

Return a JSON object in your response. Issues are split into two categories so the caller knows which agent to dispatch for fixes:

```json
{
  "verdict": "pass" | "fail",
  "score": "14/17",
  "meta_spec_issues": [
    {
      "check": 1,
      "severity": "error" | "warning",
      "description": "5 requirements have vague summaries not in EARS format",
      "affected": ["CAP-001", "LIF-002", "..."],
      "fix_hint": "Rewrite summaries using When/While/If/Where/The [actor] shall [action] patterns"
    }
  ],
  "annotation_issues": [
    {
      "check": 7,
      "severity": "error" | "warning",
      "description": "12 not_addressed annotations have empty explanations",
      "affected": ["TAD-001", "TAD-002", "AUA-001", "..."],
      "fix_hint": "Add explanations stating why each requirement is not covered (removed feature, behavioral guideline, deferred, etc.)"
    }
  ]
}
```

- **verdict**: `pass` if no errors (warnings are okay), `fail` if any errors exist
- **severity**: `error` = must fix before the review is usable, `warning` = should fix but doesn't block
- **meta_spec_issues**: Problems with `meta-spec.json` (checks 1-6) — these need the meta-spec to be updated before re-annotating
- **annotation_issues**: Problems with `annotations.json` (checks 7-17) — these can be fixed by resuming the reviewer
- **fix_hint**: Actionable instruction the fixing agent can follow
- Only include checks that found issues — omit passing checks
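Checks 15 and 16 are mechanical enough to script. A minimal sketch, under the same schema assumptions the checklist describes (`requirements` list in the meta-spec, `annotations` dict and `summary` counts in the annotation file):

```python
def completeness_issues(meta_spec: dict, annotations: dict) -> list[str]:
    """Flag requirements missing from the annotations dict (check 15)
    and summary counts that disagree with the annotation total
    (check 16). Field names are assumptions, not a documented schema.
    """
    issues = []
    annotated = annotations.get("annotations", {})

    # Check 15: every requirement ID must appear as an annotation key.
    for requirement in meta_spec.get("requirements", []):
        if requirement["id"] not in annotated:
            issues.append(f"check 15: {requirement['id']} has no annotation")

    # Check 16: summary counts must add up to the number of annotations.
    summary = annotations.get("summary", {})
    counted = sum(summary.get(status, 0) for status in
                  ("satisfied", "violated", "unclear", "not_addressed"))
    if counted != len(annotated):
        issues.append(
            f"check 16: summary totals {counted}, "
            f"but {len(annotated)} annotations exist")
    return issues
```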
plugins/mcp-spec/agents/spec-reviewer.md

Lines changed: 42 additions & 0 deletions

@@ -0,0 +1,42 @@

---
name: spec-reviewer
model: sonnet
description: Use this agent to run the full spec annotation workflow for a SEP. It extracts requirements from a SEP, annotates the PR diff against those requirements, and renders an HTML report. Decides dynamically whether to create or update existing annotations.
---

You are a SEP Annotation Agent. Your job is to produce a complete annotated diff artifact for a given SEP number.

**REQUIRED SKILLS:** Load and follow these skills in order:

1. `spec-annotation-workflow` — the end-to-end pipeline (diff resolution, extraction, annotation, rendering)
2. `spec-extract` — requirement extraction format and rules
3. `spec-diff` — per-hunk annotation rules, hunk splitting, and explanation quality
4. `spec-render` — how to invoke the render script
5. `search-mcp-github` — GitHub search patterns, useful when resolving PR metadata

## Behavior

1. You will receive a SEP number (and optionally a mode and commit range)
2. Follow the `spec-annotation-workflow` skill end-to-end
3. If `.reviews/SEP-{n}/meta-spec.json` already exists and mode is not explicitly `review`:
   - Compare its content against the current SEP file
   - If the SEP has changed (different content), re-extract the meta-spec
   - If the SEP is unchanged, reuse the existing meta-spec
4. Always re-annotate the diff (requirements may be the same but the diff may have changed)
5. Always re-render the HTML via the render script

## Being Resumed with QA Issues

You may be resumed by the orchestrator with a list of annotation issues from the `spec-qa` agent. When this happens:

1. Read the issues — each has a `check` number, `description`, `affected` requirement IDs, and a `fix_hint`
2. Load the existing `annotations.json`
3. For each issue, apply the fix described in `fix_hint` to the affected annotations
4. Re-render the HTML via the render script
5. Return a summary of what you fixed

Do not re-run the full pipeline — only fix the specific issues identified.

## Output

Return a summary of the annotation results: counts of satisfied/violated/unclear/not_addressed requirements and the path to the HTML artifact.
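The staleness check in behavior step 3 could be sketched as a content-hash comparison. This is purely illustrative: nothing in the pipeline is documented to store a hash, so `recorded_hash` is a hypothetical stand-in for whatever record of the previous SEP content the meta-spec keeps:

```python
import hashlib
from pathlib import Path


def sep_changed(sep_path: str, recorded_hash: str) -> bool:
    """True if the SEP file's current content differs from a
    previously recorded SHA-256 digest (hypothetical bookkeeping)."""
    current = hashlib.sha256(Path(sep_path).read_bytes()).hexdigest()
    return current != recorded_hash
```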
Lines changed: 80 additions & 0 deletions

@@ -0,0 +1,80 @@

---
name: spec-annotate
description: Orchestrates the full SEP annotation pipeline — extracts requirements, annotates the diff, and renders an HTML artifact
user_invocable: true
arguments:
  - name: sep_number
    description: The SEP number to annotate (e.g., 1686)
    required: true
  - name: mode
    description: '"review" (default) creates fresh annotations; "validator" reuses existing meta-spec if available'
    required: false
  - name: commit_range
    description: "Git commit range for local diff (e.g., abc123..def456). If omitted, fetches the PR diff from GitHub."
    required: false
---

# Annotating a SEP

This skill dispatches the `spec-reviewer` agent, then runs `spec-qa` as a quality gate. If QA fails, it branches based on the issue type: meta-spec issues go through `spec-update`, annotation issues go back to the reviewer.

## Workflow

### Step 1: Review

Launch the `spec-reviewer` agent:

```
Annotate SEP-{sep_number}. Mode: {mode}. {commit_range if provided, else "PR mode."}
```

Save the reviewer's agent ID.

### Step 2: Quality Gate

Launch the `spec-qa` agent:

```
Audit the annotation artifacts for SEP-{sep_number}.
```

If `verdict` is `pass`, skip to Step 5.

### Step 3: Fix meta-spec issues (if any)

If `meta_spec_issues` contains errors:

1. Read the current `.reviews/SEP-{sep_number}/meta-spec.json`
2. For each issue, apply the fix described in `fix_hint` directly to the meta-spec JSON — rewrite summaries to EARS format, fill in missing affected_paths, fix source quotes, etc.
3. Write the updated meta-spec back
4. Since the meta-spec changed, the annotations are now stale — launch a **new** `spec-reviewer` agent in `validator` mode to re-annotate against the fixed meta-spec:

```
Re-annotate SEP-{sep_number}. Mode: validator. {commit_range if provided, else "PR mode."}
The meta-spec was updated to fix QA issues. Re-annotate the diff against it and re-render.
```

Save this new reviewer's agent ID (replacing the old one).

### Step 4: Fix annotation issues (if any)

If `annotation_issues` contains errors (either from the original QA or from a re-run after Step 3):

Resume the `spec-reviewer` agent (using its agent ID) with the issues:

```
The QA agent found these annotation issues. Fix them in annotations.json and re-render:

{paste annotation_issues JSON here}
```

After the reviewer finishes, re-run `spec-qa` to verify. Allow up to 2 total QA rounds — if still failing after 2 fix attempts, report remaining issues to the user rather than looping further.

### Step 5: Report

Once QA passes (or max iterations reached), relay to the user:

- The satisfaction counts
- The artifact path
- The QA score (e.g., "QA: 15/17, 1 warning")
- Any remaining warnings
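The Step 1-4 control flow above amounts to a bounded retry loop. A sketch, where `review`, `qa`, and `fix_meta_spec` are stand-ins for the agent dispatches (not real APIs):

```python
def run_with_qa_gate(review, qa, fix_meta_spec, max_rounds: int = 2) -> dict:
    """Review once, then allow up to `max_rounds` QA-and-fix cycles
    before surfacing the remaining issues to the user.

    `review()` runs or resumes the spec-reviewer; `qa()` returns the
    spec-qa verdict dict; `fix_meta_spec(issues)` patches the meta-spec.
    """
    review()
    verdict = qa()
    rounds = 0
    while verdict["verdict"] != "pass" and rounds < max_rounds:
        if verdict.get("meta_spec_issues"):
            fix_meta_spec(verdict["meta_spec_issues"])
        review()  # re-annotate, or resume the reviewer with the issues
        verdict = qa()
        rounds += 1
    return verdict  # either a pass, or issues to report to the user
```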