Address MAP framework bundle: 8 framework gaps surfaced in a downstream run by azalio · Pull Request #142 · azalio/map-framework

azalio · 2026-05-22T13:56:44Z

Summary

8 framework issues (Handle missing recitation subtask IDs #2, docs: Remove non-functional Claude Marketplace installation documentation #3, docs: Add comprehensive PATH setup instructions for mapify CLI installation #4, fix: Add CLI subcommands to fix UV tool isolation issue #6, feat: Add CLI tool development support to MAP framework #7, feat: add /map-efficient and /map-fast workflow variants #11, feat: Sequential Thinking Integration for MAP Agents #12, feat: add 'mapify playbook apply-delta' CLI command #13) surfaced during a downstream MAP run, bundled into one PR. Each carries regression tests; template sync is enforced.
New orchestrator surfaces: peek_current_step, mark_subtask_complete.
New step-runner surfaces: save_research / load_research (subtask-scoped artifact API), validate_mutation_boundary (warn-only by default, MAP_STRICT_SCOPE=1 opt-in to hard reject).
Skill / agent prompt fixes: map-check Step 2 (jq schema bug), map-efficient (RESEARCH wired through new API, no-op short-circuit, build_context_block CLI), task-decomposer (Minimize Dependencies guidance to fix wave over-serialization), monitor (mutation-boundary verification step, ast.parse replaces py_compile).
Side fix: end-of-turn.sh hook used py_compile, which writes __pycache__/*.pyc next to source even with -B. Replaced with ast.parse so the template-hygiene gate stops tripping when any .py under src/mapify_cli/templates/ is touched.
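
The side fix above can be illustrated with a minimal sketch. It assumes the hook only needs a pass/fail syntax check; the helper name `check_syntax` is hypothetical and is not taken from end-of-turn.sh.

```python
import ast
from pathlib import Path
from typing import Optional


def check_syntax(path: Path) -> Optional[str]:
    """Return None when the file parses, else a one-line error description.

    Hypothetical helper: ast.parse builds the syntax tree in memory only,
    so no __pycache__/*.pyc is written next to the source.
    """
    source = path.read_text(encoding="utf-8")
    try:
        ast.parse(source, filename=str(path))
    except SyntaxError as exc:
        return f"{path}:{exc.lineno}: {exc.msg}"
    return None
```

Unlike `py_compile`, whose job is to emit bytecode, this check leaves the directory untouched, which is the property the template-hygiene gate depends on.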

Per-issue notes

feat: Add CLI tool development support to MAP framework #7 get_next_step short-circuit: WORKFLOW_COMPLETE now wins over stale pending_steps. Also fixed the map-check/SKILL.md Step 2 jq pattern that crashed with "Cannot index array with string" — the schema is a flat list, not a dict.
fix: Add CLI subcommands to fix UV tool isolation issue #6 build_context_block CLI: skill stopped recommending python -c "import sys; sys.path.insert...".
feat: Sequential Thinking Integration for MAP Agents #12 save/load_research API: .map/<branch>/research/<subtask_id>__<kind>.md, strict id/kind sanitization, CLI for stdin/stdout streaming.
Handle missing recitation subtask IDs #2 peek_current_step: read-only recovery escape hatch after validate_step double-advance.
docs: Remove non-functional Claude Marketplace installation documentation #3 mark_subtask_complete: short-circuit no-op / docs-only subtasks (records status=no-op for audit, advances cursor, closes workflow atomically).
feat: add /map-efficient and /map-fast workflow variants #11 validate_mutation_boundary: warn-only by default to avoid blocking legitimate cycle-fix expansion; .map/ + .codex/ excluded from actual surface.
docs: Add comprehensive PATH setup instructions for mapify CLI installation #4 wave-planner over-serialization: root cause was decomposer-side false dependencies (linear deps → 15 single waves). Decomposer prompt now requires every dependency edge to be load-bearing and affected_files populated.
feat: add 'mapify playbook apply-delta' CLI command #13 auto-compact at subtask boundaries: already implemented in .claude/hooks/context-meter.py — closed as resolved.
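
As a rough illustration of the #7 note, the sketch below shows a completion short-circuit over a flat `pending_steps` list. The function name `next_step_sketch`, the return shape, and the `"IN_PROGRESS"` status value are assumptions for illustration; only `workflow_status`, `WORKFLOW_COMPLETE`, and the flat-list schema come from the PR text.

```python
def next_step_sketch(state: dict) -> dict:
    """Hypothetical reduction of the get_next_step short-circuit."""
    # Terminal status wins, even if pending_steps was repopulated afterwards.
    if state.get("workflow_status") == "WORKFLOW_COMPLETE":
        return {"next_step": "COMPLETE", "is_complete": True}
    # pending_steps is a flat list of phase ids, so it is indexed by
    # position, never by a subtask id such as "ST-001".
    pending = state.get("pending_steps", [])
    next_step = pending[0] if pending else None
    return {"next_step": next_step, "is_complete": False}
```

For example, `next_step_sketch({"workflow_status": "WORKFLOW_COMPLETE", "pending_steps": ["2.2"]})` reports completion instead of handing back the stale `"2.2"` entry.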

Test plan

uv run pytest -q — 1437 passed, 4 skipped (was 1398 on main; +44 new test cases this PR).
make lint — ruff + mypy clean.
make sync-templates — all dev/template pairs in sync.
Validate end-to-end on neuro-vlad branch new-road: copy updated .claude/skills/map-check/SKILL.md (or mapify init against the new release) and re-run /map-check to confirm Step 2 no longer crashes.
Smoke-test the new CLI subcommands manually: peek_current_step, mark_subtask_complete, save_research, load_research, validate_mutation_boundary.

Deferred (not in this PR)

~10 pre-existing Pyright diagnostics in map_step_runner.py (dict.get() typed-as-object → .pop()/[]/int() errors) and 6 in test_map_step_runner.py surfaced during this work. They are unrelated to the eight fixes and would balloon the diff; tracked for a separate cleanup PR.

🤖 Generated with Claude Code

Eight inter-related framework issues surfaced during a downstream run. All fixes ship together with regression tests and template sync.

map-check/SKILL.md (#7-bundle bug)
Step 2 indexed `pending_steps["ST-001"]`, but the canonical schema makes pending_steps a flat list[str] of workflow phase ids, so jq crashed with `Cannot index array with string`. Rewrote Step 2 around workflow_status + flat-array iteration.

get_next_step short-circuit (#7)
Added early-return on workflow_status=='WORKFLOW_COMPLETE' so a stale repopulation of pending_steps after a finished run no longer surfaces a phantom RESEARCH step.

build_context_block CLI surfacing (#6)
map_step_runner.py already exposed the CLI subcommand; skill docs still pushed `python -c "import sys; sys.path.insert..."`. Replaced with the canonical CLI invocation + bash recipe.

save_research / load_research API (#12)
New subtask-scoped artifact API in map_step_runner.py (function + CLI subcommands) with strict sanitization. Storage lands at .map/<branch>/research/<subtask_id>__<kind>.md, partitioned by kind (actor / monitor / decomposer). map-efficient RESEARCH phase rewired to use it.

peek_current_step (#2)
Read-only recovery escape hatch for "Step mismatch: expected Y, got X" after validate_step double-advance. Returns the same shape as get_next_step but never saves the state.

mark_subtask_complete (#3)
CLI subcommand on the orchestrator to short-circuit already-done / no-op subtasks without the research→actor→monitor cycle. Records a synthetic subtask_result with status='no-op' for audit, advances the cursor, and closes the workflow atomically when it was the last subtask. Skill prompt updated with the new path.

validate_mutation_boundary (#11)
New CLI in map_step_runner.py compares the actual git diff vs blueprint.subtasks[id].affected_files. Warn-only default (appends to .map/<branch>/scope-violations.log); MAP_STRICT_SCOPE=1 escalates to hard reject. .map/ and .codex/ paths are excluded from the actual surface: they are framework infrastructure, not subtask scope. Monitor agent prompt now runs it during the verification sequence.

Wave-planner over-serialization guidance (#4)
Audit identified the root cause as decomposer-side false dependencies (linear deps collapse the wave planner to single-subtask waves). Added "Minimize Dependencies for Parallelism" section to task-decomposer.md + new checklist items requiring each edge be load-bearing and affected_files always populated.

context-meter (#13)
Already implemented in .claude/hooks/context-meter.py; closed as resolved with documentation in TaskUpdate.

Side fix: end-of-turn.sh hook used `py_compile`, which writes __pycache__/*.pyc next to source even with -B (emitting bytecode is its entire job). Replaced with `ast.parse` so editing any .py module under src/mapify_cli/templates/ no longer trips the template-hygiene gate. Same change in the Monitor agent's syntax-check recommendation (monitor.md + monitor.toml).

Code hygiene cleanups along the way:
- Removed unused `state: StepState` param from _write_retry_quarantine.
- pyright: ignore on the dynamic DependencyGraph / SubtaskNode imports (importlib spec fallback Pyright cannot follow).
- Three pre-existing tmp_path unused fixture params in test_map_orchestrator.py got the documented `del` suppression.

Tests: +44 new test cases. Full suite 1437 passed / 4 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
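
The mutation-boundary comparison described in this commit can be sketched as a pure function over two file lists. This is a minimal sketch under stated assumptions: the function name `compare_scope` is hypothetical, the git invocation is left out, and the mapping "unexpected files give `warning` by default and `violation` under `MAP_STRICT_SCOPE=1`" is inferred from the PR text rather than confirmed by it.

```python
import os
from typing import Iterable, Optional

# Framework infrastructure paths excluded from the actual surface (per the PR).
EXCLUDED_PREFIXES = (".map/", ".codex/")


def compare_scope(expected: Iterable[str], actual: Iterable[str],
                  strict: Optional[bool] = None) -> dict:
    """Hypothetical core of a mutation-boundary check."""
    if strict is None:
        strict = os.environ.get("MAP_STRICT_SCOPE") == "1"
    declared = set(expected)
    surface = sorted(p for p in set(actual) if not p.startswith(EXCLUDED_PREFIXES))
    unexpected = [p for p in surface if p not in declared]
    if not unexpected:
        status = "clean"
    else:
        # Assumed mapping: warn by default, hard reject only in strict mode.
        status = "violation" if strict else "warning"
    return {
        "status": status,
        "expected": sorted(declared),
        "actual": surface,
        "unexpected": unexpected,
        "strict": strict,
    }
```

Keeping the comparison separate from the `git diff` call makes the warn/strict behaviour testable without a repository.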

Copilot

Pull request overview

Bundles several MAP framework hardening fixes surfaced by downstream usage: adds new orchestration/step-runner surfaces, wires research artifacts into /map-efficient, strengthens mutation-boundary verification, and updates hook/agent guidance to avoid template hygiene regressions.

Changes:

Added orchestrator recovery + workflow helpers (peek_current_step, mark_subtask_complete) and fixed get_next_step completion short-circuit.
Added step-runner research artifact API (save_research / load_research) and a git-diff-based mutation boundary validator with warn/strict modes.
Updated skills/agents/hooks and added regression tests (including replacing py_compile with ast.parse to prevent __pycache__ pollution).

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 16 comments.


| File | Description |
| --- | --- |
| tests/test_skills.py | Adds regression tests ensuring updated decomposer/skills guidance stays present and schema-correct. |
| tests/test_map_step_runner.py | Adds tests for mutation-boundary validation and save/load research CLI behavior. |
| tests/test_map_orchestrator.py | Adds regression tests for new orchestrator helpers and completion short-circuit behavior. |
| tests/hooks/test_end_of_turn.py | Adds regression tests ensuring syntax checks don’t create `__pycache__` while still catching syntax errors. |
| src/mapify_cli/templates/skills/map-efficient/SKILL.md | Documents no-op short-circuit, research artifact wiring, and build_context_block CLI usage. |
| src/mapify_cli/templates/skills/map-check/SKILL.md | Fixes jq usage to treat `pending_steps` as a flat array and rely on `workflow_status`. |
| src/mapify_cli/templates/map/scripts/map_step_runner.py | Implements `save_research`/`load_research` + `validate_mutation_boundary` and exposes new CLI subcommands. |
| src/mapify_cli/templates/map/scripts/map_orchestrator.py | Adds `peek_current_step`, `mark_subtask_complete`, and `WORKFLOW_COMPLETE` short-circuit in `get_next_step`. |
| src/mapify_cli/templates/hooks/end-of-turn.sh | Replaces `py_compile` with `ast.parse` to avoid writing bytecode into templates. |
| src/mapify_cli/templates/codex/agents/monitor.toml | Updates Python build-gate guidance to use `ast.parse` instead of `py_compile`. |
| src/mapify_cli/templates/agents/task-decomposer.md | Adds mandatory guidance to minimize false dependency edges and require populated `affected_files`. |
| src/mapify_cli/templates/agents/monitor.md | Adds mutation-boundary verification step and updates Python build-gate guidance. |
| .map/scripts/map_step_runner.py | Mirrors template step-runner updates for runtime use. |
| .map/scripts/map_orchestrator.py | Mirrors template orchestrator updates for runtime use. |
| .codex/agents/monitor.toml | Mirrors template Codex monitor guidance update. |
| .claude/skills/map-efficient/SKILL.md | Mirrors template `/map-efficient` skill updates. |
| .claude/skills/map-check/SKILL.md | Mirrors template `/map-check` skill updates. |
| .claude/hooks/end-of-turn.sh | Mirrors template hook update using `ast.parse` to prevent `__pycache__`. |
| .claude/agents/task-decomposer.md | Mirrors template task-decomposer guidance additions. |
| .claude/agents/monitor.md | Mirrors template monitor guidance additions. |


+        diff_result = subprocess.run(
+            ["git", "diff", "--name-only", base_ref],
+            cwd=project_dir,
+            capture_output=True,
+            text=True,


+        # Exit code: 0 unless MAP_STRICT_SCOPE=1 AND status=="violation".
+        base_ref_arg = sys.argv[4] if len(sys.argv) >= 5 else None
+        report = validate_mutation_boundary(sys.argv[2], sys.argv[3], base_ref_arg)
+        print(json.dumps(report, indent=2))
+        if report.get("status") == "violation" and report.get("strict"):
+            sys.exit(1)


+    Return shape::
+        {
+          "status": "clean" | "warning" | "violation",
+          "subtask_id": str,
+          "base_ref": str,


+                load_research(branch_arg, subtask_arg, kind=kind_arg)
+            )
+        except ValueError as exc:
+            print(json.dumps({"status": "error", "message": str(exc)}))


+
+Some subtasks are already-done historically (rename/refactor landed in a prior PR), or are docs-only and don't need the full research→actor→monitor cycle. Skip them up-front to save tokens:
+
+```bash


+          "status": "clean" | "warning" | "violation",
+          "subtask_id": str,
+          "base_ref": str,
+          "expected": [str],   # declared affected_files
+          "actual": [str],     # files actually changed
+          "unexpected": [str], # actual but not expected (scope leak)
+          "strict": bool,



+5. **Verify mutation boundary (MANDATORY):** Run
+   `python3 .map/scripts/map_step_runner.py validate_mutation_boundary <branch> <subtask_id>`
+   to compare the actual git diff against the subtask's declared `affected_files`.
+   - `status="clean"` → continue.
+   - `status="warning"` → record the `unexpected` files in your verdict; do


+Call `research-agent` for the current subtask, then persist its concise findings via the canonical `save_research` API so Actor and Monitor consume them from the same path. Validate the phase with the orchestrator.
+
+```bash
+# After research-agent returns findings in $RESEARCH_FINDINGS:


Copilot flagged 16 comments (8 unique × dev/template copies). All fixed in the same PR.

Functional bugs
- validate_mutation_boundary now checks return codes from `git status` and `git diff`. `git status` non-zero ⇒ hard error (cannot silently report "clean" outside a git repo). Caller-supplied invalid `base_ref` ⇒ hard error. Auto-resolved base_ref that doesn't exist (fresh repo, no commits yet) ⇒ fall through to porcelain-only and report against uncommitted state, not error.
- CLI for validate_mutation_boundary now exits 1 on status="error" so Monitor's mandatory gate cannot silently pass via missing blueprint / unknown subtask / git failure.
- load_research CLI now writes its error JSON to STDERR (stdout stays empty) so command substitution `FOO=$(... load_research ...)` is not corrupted by error payloads.

Documentation
- validate_mutation_boundary docstring now lists the "error" return shape so callers don't assume only clean/warning/violation.
- Monitor agent prompt now spells out the status="error" branch (`valid: false` with returned message).

Skill snippet bugs
- Both `mark_subtask_complete` and `save_research` snippets in map-efficient/SKILL.md now define `SUBTASK_ID=$(jq -r '.current_subtask_id' …)` before use. Previously the snippets relied on a variable set only in a later phase, producing an empty / wrong value on the no-op path and on RESEARCH.

Tests / cosmetic
- test_branch_is_sanitized actually passes `feature/x` now and asserts the result lands under `feature-x/`, not the literal subpath. The prior version's docstring lied about what it verified.
- "each dependencies edge" → "each dependency edge" typo.

Regression tests added
- test_error_when_not_a_git_repo
- test_cli_exits_non_zero_on_error_status
- TestLoadResearchCliErrorChannel.test_invalid_subtask_id_writes_to_stderr_keeps_stdout_empty

Full suite: 1440 passed (+3) / 4 skipped. ruff + mypy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
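
The stderr/stdout split for load_research errors can be sketched as follows. This is an illustrative wrapper, not the actual CLI code: `emit_load_result` is a hypothetical name, the loader is passed in as a callable, and the non-zero return code on error is an assumption (the PR text only states the exit code for validate_mutation_boundary).

```python
import json
import sys
from typing import Callable


def emit_load_result(load: Callable[..., str], *args: str) -> int:
    """Write the payload to stdout on success; error JSON goes to stderr only."""
    try:
        content = load(*args)
    except ValueError as exc:
        # stdout stays empty so FOO=$(... load_research ...) captures nothing.
        print(json.dumps({"status": "error", "message": str(exc)}), file=sys.stderr)
        return 1  # assumed exit code
    sys.stdout.write(content)
    return 0
```

With this split, a shell caller can test the captured value for emptiness instead of parsing it to find out whether it is an error payload.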

A downstream invocation of /map-efficient against a repo that had a complete task_plan_<branch>.md ready for resume refused with "needs a task description in $TASK_ARGS": the model skipped Step 0 and checked $ARGUMENTS for emptiness as a stop condition.

Step 0 has always supported resume, but the contract was implicit. Made it explicit:
- "MANDATORY: Empty $TASK_ARGS is NOT a stop condition." Spelled out the 3-of-3 contract: only exit when args are empty AND no step_state.json AND no task_plan_<branch>.md.
- Step 0 now checks step_state.json BEFORE plan resume (in-flight work wins over a stale plan-only resume that would recreate state from INIT_STATE and lose subtask_results).
- On the empty-everything path, the skill exits with a clear "provide a task description OR run /map-plan first" message instead of silently doing nothing.

Regression tests: TestMapEfficientEmptyArgsResumeGuard (3 cases × 2 copies = 6 tests). Suite 1446 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
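
The 3-of-3 contract reads naturally as a small decision function. The sketch below is hypothetical: the skill expresses this as prompt text, not Python, and the function name, return labels, and the directory holding the two files are assumptions. How non-empty arguments interact with existing state is not specified in the commit, so that precedence is also assumed.

```python
from pathlib import Path


def decide_entry(task_args: str, state_dir: Path, branch: str) -> str:
    """Hypothetical Step 0 decision for /map-efficient."""
    has_state = (state_dir / "step_state.json").exists()
    has_plan = (state_dir / f"task_plan_{branch}.md").exists()
    if has_state:
        # In-flight work wins over a plan-only resume.
        return "resume_from_state"
    if has_plan:
        return "resume_from_plan"
    if task_args.strip():
        return "start_new"
    # Only reached when all three are missing: args, state, and plan.
    return "exit_with_message"
```

Empty arguments alone never reach the exit branch; it takes the absence of all three inputs.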

Same PR as the original eight fixes. These came from a second downstream review of issues #1-#12 that had not landed in the first batch.

#4 (real bug, found via code read at lines 736-742): validate_step at the inter-subtask boundary (pending_steps emptied but more subtasks remain) was setting current_step_id="COMPLETE" and returning next_step="COMPLETE". get_next_step then re-advanced and handed back RESEARCH for the next subtask, so the workflow recovered, but the validate_step response had already lied, making COMPLETE indistinguishable from a true terminal state. Now emits the explicit "ADVANCE_SUBTASK" sentinel (with matching current_step_id/phase) and reserves COMPLETE for the actual terminal case.

#11 validate_step idempotency: Re-running validate_step X when X is already in completed_steps is now a no-op success ({valid: True, idempotent: True}) instead of "Step mismatch: expected Y, got X". Combined with the new peek_current_step, callers can safely retry without recovery dances.

#5 RESEARCH enforced (not prompt-text): validate_step("2.2") now verifies that .map/<branch>/research/<current_subtask>__*.md exists. If not, rejects with valid=false and the exact save_research command to run. "MANDATORY RESEARCH" is now actual behaviour, not just docs.

#3 resume_from_plan auto-set_waves: When blueprint.json is present, resume_from_plan now invokes set_waves itself and reports the outcome in waves_computed: "success" / "error" / "skipped". /map-efficient skill no longer needs to dispatch set_waves manually after every resume.

#7 get_subtask CLI: `python3 .map/scripts/map_step_runner.py get_subtask <ID> [--branch X]` hides the {flat, blueprint-wrapped} schema dichotomy so callers stop needing ad-hoc jq with two fallbacks.

#10 pytest-timeout in test deps: CLAUDE.md examples reference `pytest --timeout=60` but the package was missing; added to requirements-test.txt and pyproject.toml test/dev extras.

#2 wave-API integration (partial, full pivot deferred): Added documentation guidance in map-efficient/SKILL.md explaining when the sequential walker (get_next_step) vs the wave loop (get_wave_step / validate_wave_step / advance_wave) applies, and noted that resume_from_plan now auto-populates execution_waves. The deeper unification (making get_next_step itself walk by execution_waves rather than subtask_sequence) touches multiple invariant tests and is tracked as a separate follow-up plan.

Test integration fix: tests/integration/test_e2e_artifact_contracts.py walked subtask phases without writing research artifacts; updated to plant .map/<branch>/research/ST-NNN__actor.md per subtask now that RESEARCH enforcement is real.

+11 new test cases: TestValidateStepIdempotency, TestValidateStepInterSubtaskBoundary, TestValidateStepResearchEnforcement, TestResumeFromPlanAutoSetWaves, TestGetSubtaskCli. Suite: 1456 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
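
The idempotency rule (#11) and the inter-subtask sentinel (#4) can be combined in one reduced sketch. The state keys (`completed_steps`, `pending_steps`, `subtask_index`, `subtask_sequence`) are named in this PR, but the function `validate_step_sketch` and its exact control flow are assumptions, and RESEARCH enforcement is deliberately left out.

```python
def validate_step_sketch(state: dict, step_id: str) -> dict:
    """Hypothetical reduction of validate_step; mutates state in place."""
    # Re-validating an already completed step is a no-op success.
    if step_id in state["completed_steps"]:
        return {"valid": True, "idempotent": True}
    expected = state["pending_steps"][0] if state["pending_steps"] else None
    if step_id != expected:
        return {"valid": False,
                "message": f"Step mismatch: expected {expected}, got {step_id}"}
    state["pending_steps"].pop(0)
    state["completed_steps"].append(step_id)
    if state["pending_steps"]:
        next_step = state["pending_steps"][0]
    elif state["subtask_index"] + 1 < len(state["subtask_sequence"]):
        # More subtasks remain: signal a boundary, not a terminal state.
        next_step = "ADVANCE_SUBTASK"
    else:
        next_step = "COMPLETE"
    return {"valid": True, "next_step": next_step}
```

The point of the sentinel is that `COMPLETE` can then be trusted as terminal by any caller that only sees the validate_step response.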

…hment)

#1: workflow-context-injector now stamps the [MAP] reminder with the hook's wall-clock UTC time AND the age of step_state.json (now - mtime), e.g. `[MAP] @ 14:23:01.234Z (state +0.5s) 2.3 ACTOR | ...`. If the hook is reading stale state (the symptom: "[MAP] still says ACTOR after I validate_step'd to MONITOR"), the "state +Xs" delta makes it obvious: a fresh validate_step would push mtime to "now", so the next hook firing should report a small delta. Future repros can compare deltas across consecutive reminders to confirm whether it's a hook cache or a genuinely stale state file.

#6: build_context_block now emits the subtask's `description` field (the long-form prose what/why from blueprint) and `risk_level`. Validated against the real neuro-vlad blueprint: ST-001's 400-char description flows into the context block instead of forcing Actor to re-open blueprint.json. Length went 21 → 22 lines but the per-line density grew substantially. Description is truncated to 480 chars to stay within the context budget.

+2 new test cases (TestBuildContextBlockIncludesDescription). Suite: 1458.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
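
A minimal sketch of the reminder stamp from #1, assuming the hook has the current UTC time and the mtime of step_state.json at hand. The function name `stamp_reminder` is hypothetical; the output format follows the example in the commit message.

```python
from datetime import datetime, timezone


def stamp_reminder(now: datetime, state_mtime: float, body: str) -> str:
    """Prefix a reminder with wall-clock UTC time and state-file age."""
    utc = now.astimezone(timezone.utc)
    clock = utc.strftime("%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
    # Clamp at zero so clock skew never prints a negative age.
    age = max(0.0, now.timestamp() - state_mtime)
    return f"[MAP] @ {clock} (state +{age:.1f}s) {body}"
```

In a real hook `state_mtime` would come from something like `os.stat(path).st_mtime`; a delta that keeps growing across consecutive reminders would point at a stale state file rather than a hook cache.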

Eight more fixes surfaced in a fresh /map-efficient run on neuro-vlad new-road after the earlier batches landed.

#1 record_subtask_result CLI: Skill text already said "record files changed in step_state.json" but no public command existed; callers reached into Python or hoped validate_step did it implicitly. Added `python3 .map/scripts/map_orchestrator.py record_subtask_result <ID> <status> --files a.py,b.py --summary "..." --commit-sha SHA`.

#2 ADVANCE_SUBTASK documented: The "ADVANCE_SUBTASK" sentinel introduced in the previous batch (#4) had no description in map-efficient/SKILL.md. Added an explicit "Phase: ADVANCE_SUBTASK (synthetic boundary)" section so callers know it's a free transition (call get_next_step again) and not a phase to execute.

#3 Wave banner truthfulness: workflow-context-injector now reports "[waves computed, sequential walker active]" when execution_waves is populated but current_wave_index is still 0 (sequential walker has not been swapped for the wave loop). Previously the banner claimed "mode batch:parallel" even when nothing parallel was happening.

#5 Monitor verdict contract: Added a "Verdict consistency contract (MANDATORY)" block to monitor.md: MEDIUM+ severity issues force valid:false, and any `recommendation in {"revise","block","needs_investigation"}` forbids `valid:true`. Closes the loophole where Monitor returned valid:true with recommendation:revise and the skill silently advanced.

#6 build_context_block truncation marker: Added a compact "# [TRUNCATED] see .map/<branch>/token_budget.json" marker inside the budgeted text when clipping happened, replacing the prior silent loss. Token-budget aware: the marker REPLACES the existing "# Context Budget:..." footer so net token cost is zero (the contract assertion stays <= configured budget).

#7 save_research attempt versioning: `save_research(..., attempt=N)` (and CLI flag `--attempt N`) now preserves a numbered snapshot at `<id>__<kind>.attempt-<N>.md` BEFORE overwriting the canonical file. Useful for clean-retry diffing after Monitor rejection.

#9 mark_subtask_complete hint: get_next_step's RESEARCH (2.2) instruction now mentions both the save_research command (positive path) AND the mark_subtask_complete no-op short-circuit (escape hatch). Previously the operator had to recall the latter from efficient-reference.md.

#11 finalize_plan CLI: `python3 .map/scripts/map_orchestrator.py finalize_plan` bumps artifact_manifest.stages.plan to "complete" when blueprint + task_plan are present. Closes the stage-stuck-partial trap reported on neuro-vlad new-road's manifest.

#12 validate_step("2.4") auto mutation-boundary: MONITOR gate now runs validate_mutation_boundary internally for the current subtask. Warn-only by default; MAP_STRICT_SCOPE=1 escalates to a hard reject. Best-effort: missing blueprint or git failure is silently skipped so the gate stays usable in unit-test contexts.

Skipped from the 12-issue list:
- #4 hook lag: repro now possible with timestamps from previous PR commit; awaiting fresh logs to diagnose root cause.
- #8 type-ignore misapplication: agent-quality, not framework.
- #10 per-subtask token accounting: needs new transcript-parsing infrastructure; tracked for separate plan.

+7 test classes added. Suite: 1462 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
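
The verdict consistency contract from #5 lives in the Monitor prompt, but the same rule can be written as a check. The sketch below is an assumption-heavy illustration: the `issues` list shape, the severity names other than MEDIUM, the `"approve"` recommendation value, and the function name are all hypothetical.

```python
BLOCKING_RECOMMENDATIONS = {"revise", "block", "needs_investigation"}
# Assumed severity scale; only "MEDIUM+" is named in the commit message.
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def verdict_is_consistent(verdict: dict) -> bool:
    """Return False when valid:true contradicts the rest of the verdict."""
    if not verdict.get("valid"):
        return True  # the contract only constrains valid:true
    if verdict.get("recommendation") in BLOCKING_RECOMMENDATIONS:
        return False
    for issue in verdict.get("issues", []):
        severity = str(issue.get("severity", "LOW")).upper()
        # Unknown severities are treated as MEDIUM to stay conservative.
        if SEVERITY_RANK.get(severity, 1) >= SEVERITY_RANK["MEDIUM"]:
            return False
    return True
```

A caller could run a check like this before advancing, so a `valid: true` paired with `recommendation: "revise"` is caught mechanically rather than by prompt discipline alone.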

Closes the "no cheap way to know how many tokens spent in current subtask" gap from the latest framework triage.

`python3 .map/scripts/map_step_runner.py subtask_token_usage <branch> [subtask_id] [--since-ts <ISO>]`

Behaviour:
* Resolves Claude Code's per-session log dir via the canonical ~/.claude/projects/<cwd-with-dashes>/ convention; falls back to cwd-matching across project dirs when the canonical path isn't there.
* Picks the newest *.jsonl by mtime as the active session transcript.
* Anchors the window at step_state.json mtime (the orchestrator rewrites that file on every advance, so it's a clean per-subtask transition signal). Override with --since-ts for arbitrary windows.
* Sums message.usage.{input,output,cache_creation,cache_read}_tokens across assistant turns with timestamp >= anchor, returning a flat JSON report.

Result on neuro-vlad new-road ST-004 with explicit since-ts 2026-05-23T06:00:00Z: 33 messages counted, 27265 output tokens, 331344 cache-creation, 2129820 cache-read. This is the kind of signal that previously required eyeballing transcripts.

+3 test cases (TestSubtaskTokenUsage). Suite: 1465 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
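
The summing step in the last bullet can be sketched over already-read transcript lines. This is a sketch under loud assumptions: the usage key names follow the commit's shorthand and may not match the real transcript schema, the `type == "assistant"` and `timestamp` fields are assumed, and `sum_usage` is a hypothetical name. Session discovery and the mtime anchor are left out.

```python
import json
from datetime import datetime, timezone
from typing import Iterable

# Key names follow the commit's shorthand; the real schema may differ.
USAGE_KEYS = ("input_tokens", "output_tokens",
              "cache_creation_tokens", "cache_read_tokens")


def sum_usage(lines: Iterable[str], since: datetime) -> dict:
    """Sum usage counters over assistant turns at or after `since` (tz-aware)."""
    totals = {key: 0 for key in USAGE_KEYS}
    totals["messages"] = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or partially written lines
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        stamp = entry.get("timestamp")
        if not stamp:
            continue
        when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        if when < since:
            continue
        usage = (entry.get("message") or {}).get("usage") or {}
        for key in USAGE_KEYS:
            totals[key] += int(usage.get(key, 0))
        totals["messages"] += 1
    return totals
```

Passing the epoch as `since` gives a whole-session total, which matches the behaviour described for an arbitrary `--since-ts` window.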

Convenience over `--since-ts 1970-01-01T00:00:00Z`: pass `--all` to report tokens spent across the entire active session, ignoring the default step_state.json mtime anchor that scopes the report to the current subtask. Useful when the operator wants a running session total rather than "since current subtask boundary".

Real smoke on neuro-vlad new-road (currently at ST-005, 58 messages in the active jsonl): 388 223 total tokens, 4.6M cache_read. This is the exact "how much have I burned this session" signal that was missing.

+1 test case. Suite: 1466 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…test run)

A downstream invocation of /map-efficient finished ST-004, returned a "Pausing to report progress... re-run /map-efficient to resume at ST-005" message, and stopped. The operator had to issue another /map-efficient call to drive ST-005. This doubles round-trips and burns attention; the operator explicitly asked the skill to ship the whole plan, not check in after each subtask.

Step 2b now carries a "MANDATORY: Do NOT pause between subtasks" rule with the four legitimate stop conditions enumerated:
1. next_step="COMPLETE" with subtask_index+1 == len(subtask_sequence)
2. retry-quarantine adjudication required
3. user explicit interrupt
4. circuit breaker tripped

Anything else is "the wrong default" the operator just called out.

+2 regression test cases (TestMapEfficientNoInterSubtaskPause). Suite: 1470 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Sixth round of fixes from a downstream /map-efficient run on neuro-vlad new-road. Six framework gaps, one commit, regression-tested.

#7 transactional MONITOR pass: validate_step("2.4") now implicitly closes pending 2.3 (ACTOR) when the cursor is mid-flight. This is a caller convenience: Monitor approval logically means Actor work was accepted, so requiring a separate validate_step("2.3") before validate_step("2.4") was just ceremony that produced "Step mismatch: expected 2.3, got 2.4" errors. Skill can now go straight Monitor-pass → record_subtask_result → validate_step("2.4").

#10 build_context_block auto-loads research: Inlines the latest research artifact (actor → monitor → decomposer kinds, first hit wins, cap 1500 chars) into the context block under "# Research Findings (ST-NNN, kind=actor):". Stops the manual "load_research → glue into Actor prompt" two-step.

#6 detect_already_done CLI: `python3 .map/scripts/map_step_runner.py detect_already_done <branch> <subtask_id> [--since-ref REF]`. Heuristic check: does every affected_file exist AND have commits in the window? Returns "likely_done" / "partial" / "unclear". Falls back to all-history when --since-ref doesn't resolve (fresh repos). Pragmatic, not authoritative: operators still review evidence before mark_subtask_complete.

#3 scope baseline: validate_mutation_boundary now subtracts a per-branch baseline (.map/<branch>/scope-baseline.json) from `actual`. Capture it with the new record_scope_baseline CLI when the branch carries pre-existing untracked / unstaged work from prior waves; subsequent mutation-boundary checks then only flag files the current subtask actually changed. Closes the "every ST shows warning because the branch is dirty" friction.

#4 verification-command REQUIRED suppression: workflow-context-injector now recognizes verification invocations (pytest, ruff check, ruff format --check, mypy, pyright, go vet/build, cargo check, tsc --noEmit) and emits the base reminder WITHOUT the trailing " | REQUIRED: Run Actor" pressure tag. Actor running pytest on their own work shouldn't be nagged to re-enter the phase they're already in.

#9 WAVE banner only when wave loop is active: workflow-context-injector no longer surfaces "WAVE 1/N" while the sequential walker (get_next_step) drives, only when current_wave_index > 0 (wave loop actually advanced). Removes the "[waves computed, sequential walker active]" cognitive-noise tail the operator just called out.

+9 new test cases across orchestrator and step_runner suites. Suite: 1476 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
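
The verification-command suppression from #4 can be sketched as a token-prefix match. The command list comes from the commit message; everything else is an assumption for illustration, including the function names, the stripping of `uv run` / `python -m` wrappers, and treating `--check` / `--noEmit` as required flags anywhere in the command.

```python
import shlex

# (leading tokens, flag that must also be present or None)
VERIFICATION_RULES = [
    (("pytest",), None),
    (("ruff", "check"), None),
    (("ruff", "format"), "--check"),
    (("mypy",), None),
    (("pyright",), None),
    (("go", "vet"), None),
    (("go", "build"), None),
    (("cargo", "check"), None),
    (("tsc",), "--noEmit"),
]
# Assumed wrappers that may precede the real tool.
WRAPPERS = (("uv", "run"), ("python", "-m"), ("python3", "-m"))


def is_verification_command(command: str) -> bool:
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: do not guess
    for wrapper in WRAPPERS:
        if tuple(tokens[:len(wrapper)]) == wrapper:
            tokens = tokens[len(wrapper):]
    for head, flag in VERIFICATION_RULES:
        if tuple(tokens[:len(head)]) == head and (flag is None or flag in tokens):
            return True
    return False


def build_reminder(base: str, command: str) -> str:
    """Drop the pressure tag when the command is a verification run."""
    if is_verification_command(command):
        return base
    return base + " | REQUIRED: Run Actor"
```

Matching on tokens rather than substrings keeps `ruff format .` (which rewrites files) from being mistaken for the read-only `ruff format --check .`.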

Copilot AI review requested due to automatic review settings May 22, 2026 13:56

Copilot started reviewing on behalf of azalio May 22, 2026 13:56 View session

Copilot AI reviewed May 22, 2026


azalio and others added 9 commits May 22, 2026 17:14

azalio wants to merge 10 commits into main from fix-map-framework-bundle

2 participants

