Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 52 additions & 15 deletions commands/gen-plan.md
Original file line number Diff line number Diff line change
Expand Up @@ -266,6 +266,11 @@ After Claude candidate plan v1 is ready, run iterative challenge/refine rounds w
"${CLAUDE_PLUGIN_ROOT}/scripts/ask-codex.sh" "<review current candidate plan>"
```
- Prompt MUST include current candidate plan, prior disagreements, and unresolved items
- Prompt MUST include the RLCR plan contract: `AC-*` items are current RLCR completion gates; deferred, future, out-of-scope, post-work, or successor-loop goals must be represented as `FUT-*` under `## Future Work / Out of Scope`, optionally with a current-loop handoff AC.
- Prompt MUST require Codex to inspect each AC for deferral semantics. If any AC claims the real work happens outside this RLCR loop, Codex MUST put it under `REQUIRED_CHANGES`, not `OPTIONAL_IMPROVEMENTS`.
- Prompt MUST include the Handoff AC Pattern definition inline: when preserving a future goal, a current-loop AC may cover only the handoff state/artifact/documentation. The handoff AC may reference `FUT-*`, but its positive and negative tests must be fully verifiable in this loop without completing the future work. The final plan must not leave a `Handoff AC Pattern` template/example section behind.
- Prompt MUST require semantic deferred-AC detection within each AC body. Use these strings as review hints, not automatic blockers: `TODO`, `TBD`, `deferred`, `future`, `follow-up`, `subsequent`, `next phase`, `next iteration`, `next milestone`, `next loop`, `v2`, `v.next`, `Phase II`, `left for`, `to be implemented in`, `see FUT-`. Codex MUST put the issue under `REQUIRED_CHANGES` only when the AC's meaning makes the real work happen outside this RLCR loop and the AC is not a valid Handoff AC. Do not block current-scope domain wording solely because it contains a marker term, such as an AC about validating future dates as input.
- Prompt MUST require AC/Task bidirectional coverage: every `AC-*` is targeted by at least one Task Breakdown row; every Task Breakdown row targets at least one current-scope `AC-*`; no task target may be empty, `-`, `FUT-*`, or `DEC-*`.
- Require output format:
- `AGREE:` points accepted as reasonable
- `DISAGREE:` points considered unreasonable and why
Expand All @@ -280,8 +285,9 @@ After Claude candidate plan v1 is ready, run iterative challenge/refine rounds w
- Topic
- Claude position
- Second Codex position
- Resolution status (`resolved`, `needs_user_decision`, `deferred`)
- Resolution status (`resolved` or `needs_user_decision`)
- Round-to-round delta
- Do NOT use `deferred` as a convergence status. If the selected resolution defers work, it is `resolved` only after the candidate plan records a `DEC-*` decision with a non-`PENDING` `Decision Status`, links that decision to a `FUT-*` item under `## Future Work / Out of Scope`, and ensures the deferred work is not represented as a current-scope AC/task. If that DEC/FUT linkage is missing or the decision needs human input, mark the topic `needs_user_decision`.

### Loop Termination Rules

Expand All @@ -293,8 +299,9 @@ Repeat convergence rounds until one of the following is true:
If max rounds are reached with unresolved opposite opinions, carry them to user decision phase explicitly.

Set convergence state explicitly:
- `PLAN_CONVERGENCE_STATUS=converged` when convergence conditions are met
- `PLAN_CONVERGENCE_STATUS=converged` when convergence conditions are met, no `needs_user_decision` topic remains, and every resolution that defers work already has a resolved `DEC-*` plus linked `FUT-*` entry
- `PLAN_CONVERGENCE_STATUS=partially_converged` otherwise
- Any unlinked deferred-work resolution MUST force `PLAN_CONVERGENCE_STATUS=partially_converged` and `HUMAN_REVIEW_REQUIRED=true`

---

Expand All @@ -309,19 +316,21 @@ Decide if manual review can be skipped:
- Else if `AUTO_START_RLCR_IF_CONVERGED=true` **and** `PLAN_CONVERGENCE_STATUS=converged`, set `HUMAN_REVIEW_REQUIRED=false`
- Otherwise set `HUMAN_REVIEW_REQUIRED=true`

If `HUMAN_REVIEW_REQUIRED=false`, skip Step 2-4 and continue directly to Phase 7.
Do not skip Step 1.5. If `HUMAN_REVIEW_REQUIRED=false`, run Step 1.5 first, then skip Step 2-4 and continue directly to Phase 7.

### Step 1.5: Consolidate Pending User Decisions (runs unconditionally)

Before proceeding (regardless of `HUMAN_REVIEW_REQUIRED`), consolidate all user-facing questions from prior phases into the plan's `## Pending User Decisions` section:

1. Extract `QUESTIONS_FOR_USER` items from Codex Analysis v1 (Phase 3)
2. Extract items with status `needs_user_decision` from the final convergence matrix (Phase 5) — use the last round's state, not intermediate rounds
3. Deduplicate: if the same topic appears in both sources, merge into one entry
4. For each collected item, check if it was substantively resolved during Phase 4-5 plan refinement (i.e., Claude addressed it and second Codex agreed in a subsequent round). Remove only items with clear evidence of resolution.
5. Write all remaining unresolved items into the plan's `## Pending User Decisions` section. Use `DEC-N` identifiers. Set `Decision Status` to `PENDING`.
3. Extract any convergence topic whose selected resolution defers work but lacks a resolved `DEC-*` plus linked `FUT-*` entry. Add it as a `PENDING` decision so it blocks auto-start instead of silently escaping the completion gate.
4. Deduplicate: if the same topic appears in multiple sources, merge into one entry
5. For each collected item, check if it was substantively resolved during Phase 4-5 plan refinement (i.e., Claude addressed it and second Codex agreed in a subsequent round). Remove only items with clear evidence of resolution and, for deferred-work resolutions, complete resolved `DEC-*`/`FUT-*` linkage.
6. Write all remaining unresolved items into the plan's `## Pending User Decisions` section. Use `DEC-N` identifiers. Set `Decision Status` to `PENDING`.
- For Claude-vs-Codex disagreements: fill `Claude Position`, `Codex Position`, and `Tradeoff Summary`
- For open questions (no opposing positions): set `Claude Position` to Claude's tentative answer (if any), `Codex Position` to `N/A - open question`, and `Tradeoff Summary` to the question's context
- For deferred-work resolutions that are already decided, do not leave them as `PENDING`; instead record the resolved decision, reference its `FUT-*` entry, and ensure the `FUT-*` entry includes `Source DEC: DEC-N`

This ensures:
- When `HUMAN_REVIEW_REQUIRED=true`: items are visible for Steps 2-4 user resolution
Expand Down Expand Up @@ -384,6 +393,7 @@ Deeply think and generate the plan.md following these rules:
## Acceptance Criteria

Following TDD philosophy, each criterion includes positive and negative tests for deterministic verification.
The `AC-*` items are current RLCR completion gates for this implementation loop.

- AC-1: <First criterion>
- Positive Tests (expected to PASS):
Expand Down Expand Up @@ -450,11 +460,22 @@ Each task must include exactly one routing tag:
- `coding`: implemented by Claude
- `analyze`: executed via Codex (`/humanize:ask-codex`)

Every `AC-*` must be covered by at least one task. Every task must target at least one `AC-*`. Do not target `FUT-*`, `DEC-*`, or `-` in the Target AC column.

| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On |
|---------|-------------|-----------|----------------------------|------------|
| task1 | <...> | AC-1 | coding | - |
| task2 | <...> | AC-2 | analyze | task1 |

## Future Work / Out of Scope

Future, deferred, post-work, successor-loop, and out-of-scope items belong here, not under `## Acceptance Criteria`.

- FUT-1: <Future item that is not required for this RLCR loop>
- Source DEC: DEC-1
- Current-loop handoff: AC-X
- Promotion trigger: <Condition or follow-up loop that should promote this to a current-scope AC>

## Claude-Codex Deliberation

### Agreements
Expand Down Expand Up @@ -510,23 +531,33 @@ When `alternative_plan_language` is empty, absent, set to `"English"`, or set to

5. **AC Format**: All acceptance criteria must use AC-X or AC-X.Y format.

6. **Clear Dependencies**: Show what depends on what, not when things happen.
6. **Current-Scope AC Contract**: `AC-*` items are the current RLCR completion gate. Do NOT create deferred ACs. Any deferred, future, out-of-scope, post-work, successor-task, or successor-loop goal must be written as `FUT-*` under `## Future Work / Out of Scope`, optionally linked to a current-loop Handoff AC.

7. **Deferred AC Semantic Guard**: Before finalizing, inspect each AC body for deferral semantics. Deferral marker terms such as `TODO`, `TBD`, `deferred`, `future`, `follow-up`, `subsequent`, `next phase`, `next iteration`, `next milestone`, `next loop`, `v2`, `v.next`, `Phase II`, `left for`, `to be implemented in`, and `see FUT-` are review hints, not automatic failures. If the AC's meaning makes the real work happen outside this loop, rewrite the item as a current-loop handoff AC plus a `FUT-*` item, or move it entirely to future work. If a marker term is ordinary current-scope domain language, such as validating future dates as input, keep the AC if its tests are fully current-loop verifiable.

8. **Handoff AC Pattern**: When preserving a future goal, write a current-loop AC only for the handoff state/artifact/documentation. The handoff AC may reference `FUT-*`, but its positive and negative tests must be fully verifiable in this loop and must not require completing the future work. This pattern is generation guidance only; do not leave a `Handoff AC Pattern` template/example section in the final plan.

9. **AC/Task Bidirectional Coverage**: Every `AC-*` must be covered by at least one Task Breakdown row. Every Task Breakdown row must target at least one current-scope `AC-*`. No row may use an empty target, `-`, `FUT-*`, or `DEC-*` as its Target AC.

10. **DEC/FUT Linkage**: If a resolved decision defers work, the decision resolution must explicitly reference a `FUT-*` item. Each `FUT-*` item caused by a decision must include `Source DEC: DEC-N`. If there is a current-loop handoff, both the DEC and FUT entry should reference the handoff AC.

11. **Clear Dependencies**: Show what depends on what, not when things happen.

7. **TDD-Style Tests**: Each acceptance criterion MUST include both positive tests (expected to pass) and negative tests (expected to fail). This follows Test-Driven Development philosophy and enables deterministic verification.
12. **TDD-Style Tests**: Each acceptance criterion MUST include both positive tests (expected to pass) and negative tests (expected to fail). This follows Test-Driven Development philosophy and enables deterministic verification.

8. **Affirmative Path Boundaries**: Describe upper and lower bounds using affirmative language (what IS acceptable) rather than negative language (what is NOT acceptable).
13. **Affirmative Path Boundaries**: Describe upper and lower bounds using affirmative language (what IS acceptable) rather than negative language (what is NOT acceptable).

9. **Respect Deterministic Designs**: If the draft specifies a fixed approach with no choices, reflect this in the plan by narrowing the path boundaries to match the user's specification.
14. **Respect Deterministic Designs**: If the draft specifies a fixed approach with no choices, reflect this in the plan by narrowing the path boundaries to match the user's specification.

10. **Code Style Constraint**: The generated plan MUST include a section or note instructing that implementation code and comments should NOT contain plan-specific progress terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers. These terms belong in the plan document, not in the resulting codebase.
15. **Code Style Constraint**: The generated plan MUST include a section or note instructing that implementation code and comments should NOT contain plan-specific progress terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers. These terms belong in the plan document, not in the resulting codebase.

11. **Draft Completeness Requirement**: The generated plan MUST incorporate ALL information from the input draft document without omission. The draft represents the most valuable human input and must be fully preserved. Any clarifications obtained through Phase 6 should be added incrementally to the draft's original content, never replacing or losing any original requirements. The final plan must be a superset of the draft information plus all clarified details.
16. **Draft Completeness Requirement**: The generated plan MUST incorporate ALL information from the input draft document without omission. The draft represents the most valuable human input and must be fully preserved. Any clarifications obtained through Phase 6 should be added incrementally to the draft's original content, never replacing or losing any original requirements. The final plan must be a superset of the draft information plus all clarified details.

12. **Debate Traceability**: The plan MUST include Codex-first findings, Claude/Codex agreements, resolved disagreements, and unresolved decisions. Unresolved opposite opinions MUST be recorded in `## Pending User Decisions` for explicit user decision.
17. **Debate Traceability**: The plan MUST include Codex-first findings, Claude/Codex agreements, resolved disagreements, and unresolved decisions. Unresolved opposite opinions MUST be recorded in `## Pending User Decisions` for explicit user decision.

13. **Convergence Requirement**: The plan MUST record Claude/Codex agreements, resolved disagreements, and final convergence status in `## Claude-Codex Deliberation`. Stop only when convergence conditions are met or max rounds reached with explicit carry-over decisions.
18. **Convergence Requirement**: The plan MUST record Claude/Codex agreements, resolved disagreements, and final convergence status in `## Claude-Codex Deliberation`. Stop only when convergence conditions are met or max rounds reached with explicit carry-over decisions.

14. **Task Tag Requirement**: The plan MUST include `## Task Breakdown`, and every task MUST be tagged as either `coding` or `analyze` (no untagged tasks, no other tag values).
19. **Task Tag Requirement**: The plan MUST include `## Task Breakdown`, and every task MUST be tagged as either `coding` or `analyze` (no untagged tasks, no other tag values).

---

Expand All @@ -549,6 +580,11 @@ After updating, **read the complete plan file** and verify:
- The structured plan aligns with the original draft content
- Claude/Codex disagreement handling is explicit and correctly reflected
- No contradictions exist between different parts of the document
- No instructional `Handoff AC Pattern` template/example section remains in the final plan
- No `AC-*` uses deferred, future, out-of-scope, post-work, or successor-loop semantics except as a valid Handoff AC whose current-loop verification is complete without performing future work
- Every `AC-*` is covered by at least one Task Breakdown row, and every Task Breakdown row targets at least one current-scope `AC-*`
- Every decision that defers work links to a `FUT-*` entry, and every such `FUT-*` entry links back with `Source DEC: DEC-N`
- Items under `## Future Work / Out of Scope` use `FUT-*`, not `AC-*`, and are not listed as current-scope Task Breakdown work

If inconsistencies are found, fix them using the Edit tool.

Expand Down Expand Up @@ -598,6 +634,7 @@ If all of the following are true:
- `PLAN_CONVERGENCE_STATUS=converged`
- `GEN_PLAN_MODE=discussion`
- There are no pending decisions with status `PENDING`
- Every convergence topic whose resolution defers work has a resolved `DEC-*` plus linked `FUT-*` entry; no deferred-work resolution exists only in the convergence matrix

Then start work immediately by running:

Expand Down
11 changes: 7 additions & 4 deletions prompt-template/codex/full-alignment-review.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,9 @@ This is a **mandatory checkpoint** (at configurable intervals). You must conduct
@{{PLAN_FILE}}

You MUST read this plan file first to understand the full scope of work before conducting your review.
Only items under `## Acceptance Criteria` and current-scope Task Breakdown rows are completion gates.
Items under `## Future Work` / `## Out of Scope`, including `FUT-*` items, are informational and MUST NOT block the COMPLETE verdict.
If a current-scope AC or current-scope task is deferred, treat it as incomplete.

---
## Claude's Work Summary
Expand Down Expand Up @@ -103,12 +106,12 @@ The project's `.humanize/rlcr/{{LOOP_TIMESTAMP}}/` directory contains the histor

## Part 6: Output Requirements

- If issues found OR any AC is NOT MET (including deferred ACs), write your findings to @{{REVIEW_RESULT_FILE}}
- If issues found OR any current-scope AC is NOT MET (including deferred current-scope ACs), write your findings to @{{REVIEW_RESULT_FILE}}
- Include specific action items for Claude to address, classified into:
- Mainline Gaps
- Blocking Side Issues
- Queued Side Issues
- **If development is stagnating** (see Part 4), write "STOP" as the last line
- **CRITICAL**: Only write "COMPLETE" as the last line if ALL ACs from the original plan are FULLY MET with no deferrals
- DEFERRED items are considered INCOMPLETE - do NOT output COMPLETE if any AC is deferred
- The ONLY condition for COMPLETE is: all original plan tasks are done, all ACs are met, no deferrals allowed
- **CRITICAL**: Only write "COMPLETE" as the last line if ALL current-scope ACs from the original plan are FULLY MET with no deferrals
- DEFERRED current-scope items are considered INCOMPLETE - do NOT output COMPLETE if any current-scope AC is deferred
- The ONLY condition for COMPLETE is: all current-scope original plan tasks are done, all current-scope ACs are met, no current-scope deferrals allowed
Loading
Loading