Compact high-traffic MAP skill playbooks #138
Pull request overview
This PR compacts the highest-traffic MAP workflow skill playbooks (/map-plan, /map-efficient, /map-check, /map-review) to reduce retained context after invocation/compaction, while preserving detailed guidance in bundled supporting reference Markdown files. It also adds regression coverage and documentation to keep these skills compact over time.
Changes:
- Added a regression test enforcing high-traffic `SKILL.md` bodies stay ≤500 lines and link to bundled supporting references (in both `.claude/skills` and shipped templates).
- Refactored multiple skills to move low-frequency examples/troubleshooting/rubrics into new `*-reference.md` supporting files.
- Updated maintainer docs/improvement bookkeeping to document the compact-skill lifecycle and validation workflow.
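For orientation, the compactness regression described in the first bullet might look roughly like the sketch below. It is not the code in `tests/test_skills.py`: the helper names, the frontmatter handling (treating a leading `---` block as outside the "body"), and the throwaway demo directory are all assumptions made for illustration.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

MAX_BODY_LINES = 500  # limit stated in the PR overview


def skill_body(text: str) -> str:
    """Return the text after an optional leading '---' frontmatter block (assumed layout)."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return "\n".join(lines[i + 1:])
    return text


def check_skill(skill_md: Path, reference_name: str) -> list[str]:
    """Collect compactness problems for one skill; an empty list means it passes."""
    problems = []
    body = skill_body(skill_md.read_text(encoding="utf-8"))
    line_count = len(body.splitlines())
    if line_count > MAX_BODY_LINES:
        problems.append(f"{skill_md}: body has {line_count} lines (limit {MAX_BODY_LINES})")
    if reference_name not in body:
        problems.append(f"{skill_md}: no link to {reference_name}")
    if not (skill_md.parent / reference_name).is_file():
        problems.append(f"{skill_md}: {reference_name} is not bundled next to it")
    return problems


# Demo on a throwaway directory rather than the real repository tree.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "map-check"
    skill_dir.mkdir()
    (skill_dir / "check-reference.md").write_text("# Reference\n", encoding="utf-8")
    skill = skill_dir / "SKILL.md"

    # Compact skill: short body that links to its bundled reference.
    skill.write_text(
        "---\nname: map-check\n---\nSee [reference](check-reference.md).\n",
        encoding="utf-8",
    )
    compact_problems = check_skill(skill, "check-reference.md")

    # Bloated skill: 501 body lines and no link to the reference.
    skill.write_text("line\n" * 501, encoding="utf-8")
    bloated_problems = check_skill(skill, "check-reference.md")

print(compact_problems)
print(len(bloated_problems))
```

Running the same check over both `.claude/skills` and the shipped templates would mirror the two locations the bullet mentions.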
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tests/test_skills.py | Adds regression enforcing compactness + supporting-reference linkage for high-traffic skills. |
| src/mapify_cli/templates/skills/map-review/SKILL.md | Compacts /map-review active playbook and links to review-reference.md. |
| src/mapify_cli/templates/skills/map-review/review-reference.md | Adds /map-review supporting reference (rubrics/examples/troubleshooting). |
| src/mapify_cli/templates/skills/map-plan/plan-reference.md | Adds /map-plan supporting reference (spec template/examples/troubleshooting). |
| src/mapify_cli/templates/skills/map-efficient/SKILL.md | Compacts /map-efficient active playbook and links to efficient-reference.md. |
| src/mapify_cli/templates/skills/map-efficient/efficient-reference.md | Adds /map-efficient supporting reference (waves/TDD/examples/troubleshooting). |
| src/mapify_cli/templates/skills/map-check/SKILL.md | Compacts /map-check playbook and links to check-reference.md. |
| src/mapify_cli/templates/skills/map-check/check-reference.md | Adds /map-check supporting reference (matrices/examples/troubleshooting). |
| README.md | Documents the “compact high-traffic playbooks” philosophy. |
| docs/USAGE.md | Adds maintainer guardrail: keep high-traffic skills under 500 lines + supporting refs. |
| docs/learned/testing-strategies.md | Adds learned note about smoke-testing compact skills in generated projects. |
| docs/learned/review-checks.md | Adds review check for verifying mandatory workflow markers remain in active SKILL.md. |
| docs/learned/commands.md | Documents the new focused compact-skill regression command. |
| docs/learned/architecture-patterns.md | Adds architecture note reinforcing compact high-traffic skills. |
| docs/improvement-plan.md | Updates improvement-plan bookkeeping to mark this slice as shipped and remove duplicated section. |
| docs/improvement-loop-log.md | Logs the improvement-loop decision and validation steps for this compaction slice. |
| docs/improvement-done.md | Records completion summary and artifacts moved into supporting references. |
| docs/ARCHITECTURE.md | Documents “Compact Skill Playbooks” as an architectural principle. |
| .claude/skills/map-review/SKILL.md | Mirrors /map-review compaction in source-of-truth .claude skills. |
| .claude/skills/map-review/review-reference.md | Mirrors /map-review supporting reference in .claude. |
| .claude/skills/map-plan/SKILL.md | Mirrors /map-plan compaction in .claude and links to plan-reference.md. |
| .claude/skills/map-plan/plan-reference.md | Mirrors /map-plan supporting reference in .claude. |
| .claude/skills/map-efficient/SKILL.md | Mirrors /map-efficient compaction in .claude. |
| .claude/skills/map-efficient/efficient-reference.md | Mirrors /map-efficient supporting reference in .claude. |
| .claude/skills/map-check/SKILL.md | Mirrors /map-check compaction in .claude. |
| .claude/skills/map-check/check-reference.md | Mirrors /map-check supporting reference in .claude. |
Comments suppressed due to low confidence (2)
src/mapify_cli/templates/skills/map-check/SKILL.md:185
- The diagnostics snippet references `$TEST_CMD`, but it is never set in this skill anymore, and the snippet no longer shows re-running the failing command to produce `$LOG_FILE` before parsing. Define `TEST_CMD` (or use the actual command string inline), run it with output redirected to `$LOG_FILE`, then pass the real exit code to `diagnostics.py parse`/`summarize`.
```bash
python3 .map/scripts/diagnostics.py summarize \
  --tool tests \
  --command "$TEST_CMD" \
  --exit-code $? \
  --summary "Pytest run for branch verification" \
  --known-issues ".map/${BRANCH}/known-issues.json" \
  --notes "Capture deviations, flaky behavior, or environment quirks here"
```
.claude/skills/map-check/SKILL.md:185
- The diagnostics snippet references `$TEST_CMD`, but it is never set in this skill anymore, and the snippet no longer shows re-running the failing command to produce `$LOG_FILE` before parsing. Define `TEST_CMD` (or use the actual command string inline), run it with output redirected to `$LOG_FILE`, then pass the real exit code to `diagnostics.py parse`/`summarize`.
```bash
python3 .map/scripts/diagnostics.py summarize \
  --tool tests \
  --command "$TEST_CMD" \
  --exit-code $? \
  --summary "Pytest run for branch verification" \
  --known-issues ".map/${BRANCH}/known-issues.json" \
  --notes "Capture deviations, flaky behavior, or environment quirks here"
```
| if [ "$DETACHED_FLAG" = "true" ]; then
| # EC-15: prepare detached review once; compare runs reuse the same path.
| DETACHED_JSON=$(python3 .map/scripts/map_step_runner.py prepare_detached_review)
| DETACHED_PATH=$(echo "$DETACHED_JSON" | python3 -c "import sys,json; print(json.load(sys.stdin).get('detached_path',''))")
Fixed in a22ad14: the detached helper snippet now parses status, reason, and worktree_path from prepare_detached_review and passes BUNDLE_JSON_PATH.
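As a rough illustration of that change, the fields named in the reply could be pulled out of the helper's JSON in one pass, instead of reading a single `detached_path` key whose `.get(..., '')` default silently yields an empty string. Only the key names `status`, `reason`, and `worktree_path` come from the reply; the function name and the sample payload values below are invented, and how `BUNDLE_JSON_PATH` is passed on is not shown here.

```python
from __future__ import annotations

import json


def parse_detached_review(raw_json: str) -> tuple[str, str, str]:
    """Return (status, reason, worktree_path); absent keys become empty strings."""
    payload = json.loads(raw_json)
    return (
        str(payload.get("status", "")),
        str(payload.get("reason", "")),
        str(payload.get("worktree_path", "")),
    )


# Hypothetical payload: the key names come from the reply, the values are invented.
sample = '{"status": "ready", "reason": "", "worktree_path": "/tmp/example-worktree"}'
status, reason, worktree_path = parse_detached_review(sample)
print(status, worktree_path)
```

Keeping `status` and `reason` alongside the path lets the caller report why a detached review could not be prepared rather than continuing with an empty path.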
| ```bash
| REVIEW_PROMPTS_JSON=$(python3 .map/scripts/map_step_runner.py build_review_prompts \
| --review-preferences "[paste Review Preferences section above]")
|
| MONITOR_PROMPT=$(printf '%s' "$REVIEW_PROMPTS_JSON" | python3 -c 'import json,sys; print(json.load(sys.stdin)["prompts"]["monitor"]["prompt"])')
| PREDICTOR_PROMPT=$(printf '%s' "$REVIEW_PROMPTS_JSON" | python3 -c 'import json,sys; print(json.load(sys.stdin)["prompts"]["predictor"]["prompt"])')
| EVALUATOR_PROMPT=$(printf '%s' "$REVIEW_PROMPTS_JSON" | python3 -c 'import json,sys; print(json.load(sys.stdin)["prompts"]["evaluator"]["prompt"])')
| PROMPTS_JSON=$(python3 .map/scripts/map_step_runner.py build_review_prompts --review-preferences "$ARGUMENTS")
| ```
|
| Use the extracted `MONITOR_PROMPT`, `PREDICTOR_PROMPT`, and `EVALUATOR_PROMPT` strings as the Task prompts.
| The helper preserves the review bundle as primary context, clips lower-priority raw diff
| context first when the prompt exceeds `MAP_REVIEW_PROMPT_BUDGET_TOKENS` (default 12,000
| estimated tokens), clips oversized review preferences only if needed to preserve the hard
| budget, and includes a `Review Prompt Budget` diagnostic document when clipping occurs.
|
| **3 agent Task calls** (pass the budgeted prompt for each role):
| Launch all three tasks from the generated prompt files. Keep reviewer task calls below the bundle and prompt-builder commands.
|
| ```
| Task(
| subagent_type="monitor",
| description="Review code changes",
| prompt="[paste MONITOR_PROMPT]"
| )
|
| Task(
| subagent_type="predictor",
| description="Analyze change impact",
| prompt="[paste PREDICTOR_PROMPT]"
| )
|
| Task(
| subagent_type="evaluator",
| description="Score change quality",
| prompt="[paste EVALUATOR_PROMPT]"
| )
| ```text
| Task(subagent_type="monitor", description="Review diff for correctness", prompt="Read generated monitor prompt from build_review_prompts output.")
| Task(subagent_type="predictor", description="Predict integration risk", prompt="Read generated predictor prompt from build_review_prompts output.")
| Task(subagent_type="evaluator", description="Score review quality", prompt="Read generated evaluator prompt from build_review_prompts output.")
Fixed in b45c28e: build_review_prompts output is now parsed into MONITOR_PROMPT/PREDICTOR_PROMPT/EVALUATOR_PROMPT and those variables are passed to Task calls.
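The parsing step can also be done in a single pass instead of three separate `python3 -c` calls. The sketch below assumes only the nesting visible in the hunk (`prompts` → role → `prompt`); the function name, the error handling, and the sample payload are illustrative and not part of `map_step_runner.py`.

```python
from __future__ import annotations

import json

ROLES = ("monitor", "predictor", "evaluator")


def extract_role_prompts(raw_json: str) -> dict[str, str]:
    """Map each reviewer role to its prompt, failing loudly on a missing or empty entry."""
    payload = json.loads(raw_json)
    prompts = payload.get("prompts", {})
    extracted = {}
    for role in ROLES:
        prompt = prompts.get(role, {}).get("prompt", "")
        if not prompt:
            raise KeyError(f"build_review_prompts output has no prompt for role {role!r}")
        extracted[role] = prompt
    return extracted


# Hypothetical payload with the nesting used in the hunk; real prompts are much longer.
sample = json.dumps({"prompts": {role: {"prompt": f"{role} prompt text"} for role in ROLES}})
role_prompts = extract_role_prompts(sample)
print(role_prompts["monitor"])
```

Raising on a missing role means a Task call is never launched with a placeholder string in place of the generated prompt.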
| ```bash
| SECTIONS_JSON=$(python3 .map/scripts/map_step_runner.py shuffle-sections "$MODE_FLAG" "$SEED_RAW")
| SECTIONS_JSON=$(python3 .map/scripts/map_step_runner.py shuffle-sections --mode "$MODE_FLAG" --seed "$SEED_RAW")
Fixed in b45c28e: shuffle-sections now uses positional arguments again.
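To see why the flag form is a problem when a subcommand takes positionals, here is a small `argparse` demonstration. The parser is a hypothetical stand-in: whether `map_step_runner.py` actually uses `argparse` is not stated in this thread, and the argument values are invented.

```python
import argparse
import contextlib
import io


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for a subcommand that declares two positional arguments."""
    parser = argparse.ArgumentParser(prog="shuffle-sections")
    parser.add_argument("mode")
    parser.add_argument("seed")
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None when argparse rejects the arguments."""
    stderr = io.StringIO()  # swallow the usage message argparse prints on error
    try:
        with contextlib.redirect_stderr(stderr):
            return build_parser().parse_args(argv)
    except SystemExit:
        return None


positional = try_parse(["compare", "42"])
flagged = try_parse(["--mode", "compare", "--seed", "42"])
print(positional.mode, positional.seed)
print(flagged)
```

With this parser, the positional call succeeds, while `--mode`/`--seed` are reported as unrecognized arguments and the process exits with an error.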
| if [ "$DETACHED_FLAG" = "true" ]; then
| # EC-15: prepare detached review once; compare runs reuse the same path.
| DETACHED_JSON=$(python3 .map/scripts/map_step_runner.py prepare_detached_review)
| DETACHED_PATH=$(echo "$DETACHED_JSON" | python3 -c "import sys,json; print(json.load(sys.stdin).get('detached_path',''))")
Fixed in a22ad14 in the .claude source copy; templates were resynced.
| ```bash
| REVIEW_PROMPTS_JSON=$(python3 .map/scripts/map_step_runner.py build_review_prompts \
| --review-preferences "[paste Review Preferences section above]")
|
| MONITOR_PROMPT=$(printf '%s' "$REVIEW_PROMPTS_JSON" | python3 -c 'import json,sys; print(json.load(sys.stdin)["prompts"]["monitor"]["prompt"])')
| PREDICTOR_PROMPT=$(printf '%s' "$REVIEW_PROMPTS_JSON" | python3 -c 'import json,sys; print(json.load(sys.stdin)["prompts"]["predictor"]["prompt"])')
| EVALUATOR_PROMPT=$(printf '%s' "$REVIEW_PROMPTS_JSON" | python3 -c 'import json,sys; print(json.load(sys.stdin)["prompts"]["evaluator"]["prompt"])')
| PROMPTS_JSON=$(python3 .map/scripts/map_step_runner.py build_review_prompts --review-preferences "$ARGUMENTS")
| ```
|
| Use the extracted `MONITOR_PROMPT`, `PREDICTOR_PROMPT`, and `EVALUATOR_PROMPT` strings as the Task prompts.
| The helper preserves the review bundle as primary context, clips lower-priority raw diff
| context first when the prompt exceeds `MAP_REVIEW_PROMPT_BUDGET_TOKENS` (default 12,000
| estimated tokens), clips oversized review preferences only if needed to preserve the hard
| budget, and includes a `Review Prompt Budget` diagnostic document when clipping occurs.
|
| **3 agent Task calls** (pass the budgeted prompt for each role):
| Launch all three tasks from the generated prompt files. Keep reviewer task calls below the bundle and prompt-builder commands.
|
| ```
| Task(
| subagent_type="monitor",
| description="Review code changes",
| prompt="[paste MONITOR_PROMPT]"
| )
|
| Task(
| subagent_type="predictor",
| description="Analyze change impact",
| prompt="[paste PREDICTOR_PROMPT]"
| )
|
| Task(
| subagent_type="evaluator",
| description="Score change quality",
| prompt="[paste EVALUATOR_PROMPT]"
| )
| ```text
| Task(subagent_type="monitor", description="Review diff for correctness", prompt="Read generated monitor prompt from build_review_prompts output.")
| Task(subagent_type="predictor", description="Predict integration risk", prompt="Read generated predictor prompt from build_review_prompts output.")
| Task(subagent_type="evaluator", description="Score review quality", prompt="Read generated evaluator prompt from build_review_prompts output.")
Fixed in b45c28e in the .claude source copy; templates were resynced.
| ```bash
| SECTIONS_JSON=$(python3 .map/scripts/map_step_runner.py shuffle-sections "$MODE_FLAG" "$SEED_RAW")
| SECTIONS_JSON=$(python3 .map/scripts/map_step_runner.py shuffle-sections --mode "$MODE_FLAG" --seed "$SEED_RAW")
Fixed in b45c28e in the .claude source copy; templates were resynced.
| # /map-check - Quality Gates & Verification
|
| **Purpose:** Run code quality checks (linters, type checkers, tests) and/or verify MAP workflow completion.
| Purpose: run verification only. Do not plan, implement, or fix from this skill.
Fixed in b45c28e: purpose text now says quality gates and MAP workflow verification.
| Optional structured diagnostics for failing gates:
| ```bash
| TEST_CMD="pytest" # Default; override if project uses different test runner
| echo "Running final tests..."
| eval "$TEST_CMD"
|
| # Optional (structured diagnostics):
| # If tests fail and you want a durable artifact for follow-up/debugging,
| # re-run capturing output and parse to .map/<branch>/diagnostics.json:
| #
| # BRANCH=$(git rev-parse --abbrev-ref HEAD | sed -E 's|/|-|g; s|[^a-zA-Z0-9_.-]|-|g; s|-{2,}|-|g; s|^-||; s|-$||')
| # LOG_FILE=".map/${BRANCH}/tests.log"
| # mkdir -p ".map/${BRANCH}"
| # ( $TEST_CMD ) >"$LOG_FILE" 2>&1
| # python3 .map/scripts/diagnostics.py parse --tool tests --log "$LOG_FILE" --command "$TEST_CMD" --exit-code $?
|
| if [[ $? -ne 0 ]]; then
| echo "❌ Tests failed - verification REJECTED"
| VERDICT="REJECTED"
| REASON="Test suite has failures"
| fi
| LOG_FILE=".map/${BRANCH}/tests.log"
| mkdir -p ".map/${BRANCH}"
| # Re-run the failing command into $LOG_FILE, then parse it:
| python3 .map/scripts/diagnostics.py parse --tool tests --log "$LOG_FILE" --command "$TEST_CMD" --exit-code $?
| ```
Fixed in b45c28e: the diagnostics snippet defines TEST_CMD, reruns it into LOG_FILE, captures TEST_EXIT, and passes that exit code to parse/summarize.
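The sequence in that reply (rerun the command into a log, capture its exit code, then hand that code on) matters because in bash `$?` always reflects the most recently executed command, so in the hunk above `--exit-code $?` would report the status of `mkdir -p`, not of the test run. A minimal Python sketch of the same capture-then-pass idea follows; `run_and_log`, the temporary log path, and the stand-in failing command are assumptions for illustration and are not part of `diagnostics.py`.

```python
from __future__ import annotations

import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(command: list[str], log_file: Path) -> int:
    """Run command, write combined stdout/stderr to log_file, and return the exit code."""
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("w", encoding="utf-8") as log:
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT, check=False)
    return completed.returncode


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "example-branch" / "tests.log"
    # Stand-in for a failing test run: prints one line and exits with status 3.
    fake_tests = [sys.executable, "-c", "import sys; print('1 failed'); sys.exit(3)"]
    test_exit = run_and_log(fake_tests, log_path)
    log_text = log_path.read_text(encoding="utf-8")

print(test_exit)
print(log_text.strip())
```

Because the exit code is stored in `test_exit` immediately, any later directory creation or logging cannot overwrite it before it is passed to the parse or summarize step.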
| # /map-check - Quality Gates & Verification
|
| **Purpose:** Run code quality checks (linters, type checkers, tests) and/or verify MAP workflow completion.
| Purpose: run verification only. Do not plan, implement, or fix from this skill.
Fixed in b45c28e in the .claude source copy; templates were resynced.
| Optional structured diagnostics for failing gates:
| ```bash
| TEST_CMD="pytest" # Default; override if project uses different test runner
| echo "Running final tests..."
| eval "$TEST_CMD"
|
| # Optional (structured diagnostics):
| # If tests fail and you want a durable artifact for follow-up/debugging,
| # re-run capturing output and parse to .map/<branch>/diagnostics.json:
| #
| # BRANCH=$(git rev-parse --abbrev-ref HEAD | sed -E 's|/|-|g; s|[^a-zA-Z0-9_.-]|-|g; s|-{2,}|-|g; s|^-||; s|-$||')
| # LOG_FILE=".map/${BRANCH}/tests.log"
| # mkdir -p ".map/${BRANCH}"
| # ( $TEST_CMD ) >"$LOG_FILE" 2>&1
| # python3 .map/scripts/diagnostics.py parse --tool tests --log "$LOG_FILE" --command "$TEST_CMD" --exit-code $?
|
| if [[ $? -ne 0 ]]; then
| echo "❌ Tests failed - verification REJECTED"
| VERDICT="REJECTED"
| REASON="Test suite has failures"
| fi
| LOG_FILE=".map/${BRANCH}/tests.log"
| mkdir -p ".map/${BRANCH}"
| # Re-run the failing command into $LOG_FILE, then parse it:
| python3 .map/scripts/diagnostics.py parse --tool tests --log "$LOG_FILE" --command "$TEST_CMD" --exit-code $?
| ```
Fixed in b45c28e in the .claude source copy; templates were resynced.
Summary
User impact
The main MAP golden-path skills now carry less recurring context after invocation and compaction while preserving mandatory execution gates and full supporting references for edge cases.
Validation