From b5e740ae485d956f0f02ff9586c39b9f77d72f0a Mon Sep 17 00:00:00 2001 From: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Date: Mon, 4 May 2026 13:00:32 +0000 Subject: [PATCH 1/4] fix(ce-dispatch): address second-round Codex review on EveryInc#762 Two more bugs flagged by chatgpt-codex-connector[bot] on the upstream PR after the first-round fixes landed: P3 - Phase 1 dependency parser keyed to the wrong field label. ce-plan emits unit dependencies under `**Dependencies:**` (per its unit template), but the ce-dispatch parse rule was keyed to a `Depends on:` label. Every unit produced by ce-plan therefore fell back to `none`, making dependent units look like roots and dispatching out of order. Fix: reference the canonical `Dependencies:` field as the primary key, accept `Depends on:` as an alias for hand-edited plans, and explicitly forbid silent fallback when a parseable line exists. Inline call-out so future maintainers don't drop the alias. P4 - Phase 4 merge step ran the test suite against pre-merge code. `gh pr merge` lands the merge on GitHub but does not update the local checkout, so the immediately-following test-suite run could report a false green while the merged branch was broken. Fix: between `gh pr merge` and the test-suite step, sync the local working tree to the merged base (`git fetch origin && git checkout && git pull --ff-only`). Inline rationale alongside the new step. Added three regression assertions in tests/skills/ce-dispatch-contract .test.ts (extends the existing 'regression guards (Codex-flagged bugs)' describe block): - Phase 1 Dependencies bullet must reference the canonical `Dependencies:` (or bolded `**Dependencies:**`) label. - Phase 4 'Merge a PR' block must contain `git fetch` and `git pull` between `gh pr merge` and the 'test suite' phrase. - (existing P1 guard kept; existing P2 guard kept.) Verified locally: bun test tests/skills/ce-dispatch-contract.test.ts -> 49 pass / 0 fail (was 47; +2 new regression guards) bun test (full) -> 1294 pass / 1 fail (the 1 fail is the pre-existing resolve-base.sh failure on main) bun run release:validate -> in sync --- .../skills/ce-dispatch/SKILL.md | 4 +- tests/skills/ce-dispatch-contract.test.ts | 50 +++++++++++++++++++ 2 files changed, 52 insertions(+), 2 deletions(-) diff --git a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md index 17aecb642..a07f11ff4 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md +++ b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md @@ -77,7 +77,7 @@ Locate the `Implementation Units` section. Each unit is a top-level bullet whose - **Patterns** (the unit's `Patterns to follow` field, if present) - **Approach** (the unit's `Approach` field, if present) - **Verification** (the unit's `Verification` or `Test scenarios` field) -- **Dependencies** (any `Depends on:` field listing other U-IDs, or inferred from the plan's sequencing prose; default to `none`) +- **Dependencies** (the unit's `**Dependencies:**` field listing other U-IDs — this is the canonical label `ce-plan` emits per its unit template; also accept `Depends on:` as an alias for hand-edited or external plans. If neither label is present, fall back to inferring from the plan's sequencing prose; default to `none` only when nothing is found. Do **not** default to `none` silently when a `**Dependencies:**` line exists with parseable U-IDs — that would let dependent units look like roots and dispatch out of order.) If the plan has no recognizable Implementation Units section, stop and tell the user the plan must contain implementation units before dispatch. Do not invent units. @@ -173,7 +173,7 @@ Act on the user's selection — do not just announce it. The bare per-option act - CI rollup on the PR is green (no `FAILURE` or `ERROR` checks). If checks are pending, ask the user whether to wait or skip. - The PR has a `## Dispatch Result` section in its body with `Status: completed`. If the section is missing or `Status` is `partial` / `failed`, refuse and surface the issue back to the user. - When all gates pass, run `gh pr merge --squash --delete-branch`. After the merge succeeds, run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. + When all gates pass, run `gh pr merge --squash --delete-branch`. `gh pr merge` lands the merge on GitHub but does not touch the local checkout, so before running any verification commands locally, sync the working tree to the merged base: `git fetch origin && git checkout && git pull --ff-only origin `. Without this sync the test suite would run against pre-merge code and could report a false green even when the merged commit is broken. Then run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. On merge conflict (`gh pr merge` reports the PR is not mergeable due to conflicts), do **not** attempt to resolve the conflict in the dispatching session — the conflict belongs to the workspace that produced the PR. Surface the conflict and advise the user: "Open the workspace, run `git fetch origin && git rebase origin/`, resolve conflicts, push, and re-run option 1 to refresh status." Re-render the menu without merging. diff --git a/tests/skills/ce-dispatch-contract.test.ts b/tests/skills/ce-dispatch-contract.test.ts index e65425c3a..f6ef185e6 100644 --- a/tests/skills/ce-dispatch-contract.test.ts +++ b/tests/skills/ce-dispatch-contract.test.ts @@ -363,6 +363,56 @@ describe("ce-dispatch SKILL.md regression guards (Codex-flagged bugs)", () => { expect(inv).toContain("--short") } }) + + test("Phase 1 dependency parser keys to the canonical ce-plan field", () => { + // ce-plan emits the bolded `**Dependencies:**` field (see ce-plan/SKILL.md + // Implementation Units template). An earlier draft of ce-dispatch keyed + // the parser to `Depends on:` instead, which would silently fall back to + // `none` for every unit produced by ce-plan, making dependent units look + // like roots and dispatching them out of order. The Phase 1 parse rule + // must explicitly reference the `Dependencies:` label as the primary key. + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase2Start = SKILL_BODY.indexOf("### Phase 2:") + expect(phase1Start).toBeGreaterThan(-1) + expect(phase2Start).toBeGreaterThan(phase1Start) + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + // Field-extraction bullet for Dependencies must name the canonical label. + const dependenciesBullet = phase1Region.match( + /-\s+\*\*Dependencies\*\*[^\n]+/, + ) + expect(dependenciesBullet).not.toBeNull() + const bulletText = dependenciesBullet![0] + // Primary label is `Dependencies:` (bolded `**Dependencies:**` accepted). + expect(bulletText).toMatch(/`(?:\*\*)?Dependencies:(?:\*\*)?`/) + }) + + test("Phase 4 merge step syncs local checkout before running tests", () => { + // `gh pr merge` lands the merge on GitHub but does not update the local + // checkout. Running the project test suite immediately after `gh pr merge` + // therefore tests pre-merge code, which can falsely report success while + // the merged commit is broken. The merge bullet must include an explicit + // local sync (fetch + checkout base + pull) between `gh pr merge` and + // running the test suite. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + expect(phase4Start).toBeGreaterThan(-1) + const phase4Region = SKILL_BODY.slice(phase4Start) + // Find the "Merge a PR" routing block — from its label to the next bullet. + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + expect(mergeBlockMatch).not.toBeNull() + const mergeBlock = mergeBlockMatch![0] + // Must mention `gh pr merge`, then `git fetch`/`git pull` (local sync), + // then the test suite — in that order. + const ghMergeIdx = mergeBlock.indexOf("gh pr merge") + const fetchIdx = mergeBlock.search(/git fetch/) + const pullIdx = mergeBlock.search(/git pull/) + const testSuiteIdx = mergeBlock.toLowerCase().indexOf("test suite") + expect(ghMergeIdx).toBeGreaterThan(-1) + expect(fetchIdx).toBeGreaterThan(ghMergeIdx) + expect(pullIdx).toBeGreaterThan(ghMergeIdx) + expect(testSuiteIdx).toBeGreaterThan(Math.max(fetchIdx, pullIdx)) + }) }) describe("conductor-notes.md documents key Conductor behavior", () => { From dbf5d33616b991678d458bacf739b37d989eaa12 Mon Sep 17 00:00:00 2001 From: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Date: Mon, 4 May 2026 13:09:02 +0000 Subject: [PATCH 2/4] fix(ce-dispatch): wider audit pass on Codex review themes Audit pass after the bot's two flagged bugs (P3 dependency label, P4 post-merge sync) revealed four sibling issues in the same code paths. Folding them into the same review-cycle PR rather than shipping in pieces. Phase 1.1 Files field-label mismatch (same class as P3). ce-plan emits Files sub-bullets as Create:/Modify:/Test: per its unit template; ce-dispatch listed them as Create/Modify/Read. Test files were therefore dropped from (a) the Phase 1.3 parallel-safety file-to-unit map and (b) the dispatch prompt's section - masking real test-file overlap between dispatched units and giving in-workspace agents an incomplete picture of what to author. Fix: canonicalize on Create:/Modify:/Test: with Read: accepted as alias in both SKILL.md and dispatch-prompt-template.md. Phase 1.3 already said 'Test paths' - the inconsistency is now resolved in Phase 1.1's favour. Phase 1.1 Verification vs Test scenarios collapse. Phase 2 substitutes with the unit's test scenarios and with the project's test/lint commands - different prompt sections, different sources, different audiences. The Phase 1 parse list collapsed them into one 'Verification or Test scenarios' field, which leaks each into the wrong template section. Fix: split into two distinct captured fields, named to match the canonical ce-plan labels. Phase 0.2 origin/HEAD fallback. The dispatch_base_branch default invoked `git symbolic-ref --short refs/remotes/origin/HEAD` with no fallback. That command exits non-zero on bare clones, fresh `git clone --no-checkout`, and some CI checkouts where origin/HEAD was never set. Fix: document the documented fallback chain (`git remote show origin` HEAD-branch parse, then default-to-main with a user-facing warning). Phase 4 merge sync precondition (regression introduced by P4 fix). The new `git fetch + git checkout + git pull --ff-only` step will fail if the dispatching session's working tree is dirty, and silently displace the user from a feature branch they were on. Fix: add an explicit `git status --porcelain` precondition check, capture the current branch via `git symbolic-ref --short HEAD` (with detached- HEAD fallback) before the sync, and restore it after the test run. Surface the cycled-through-base behaviour to the user explicitly. Phase 3 `gh issue create --label` wording. Skill said `gh` 'prints a warning' on missing labels; in fact `gh` exits non-zero and refuses to create the issue (intentional, per cli/cli#715, to prevent accidental label creation). Fix: describe the actual error semantics, reference the upstream issue, and specify the create-label-then-retry recovery path. Added five regression assertions in tests/skills/ce-dispatch-contract.test.ts covering each fix above. Verified locally: bun test tests/skills/ce-dispatch-contract.test.ts -> 54 pass / 0 fail (was 49; +5 new regression guards) bun test (full) -> 1298 pass / 1 fail (the 1 fail is the pre-existing resolve-base.sh failure on main) bun run release:validate -> in sync --- .../skills/ce-dispatch/SKILL.md | 19 ++-- .../references/dispatch-prompt-template.md | 8 +- tests/skills/ce-dispatch-contract.test.ts | 102 ++++++++++++++++++ 3 files changed, 118 insertions(+), 11 deletions(-) diff --git a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md index a07f11ff4..aa5e25db9 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md +++ b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md @@ -49,7 +49,7 @@ Config keys and resolution: |---|---|---| | `dispatch_mode` | `conductor`, or another short identifier | `conductor` | | `dispatch_branch_prefix` | any string (no leading/trailing slashes) | `dispatch/` | -| `dispatch_base_branch` | any branch name | repo's default branch (`git symbolic-ref --short refs/remotes/origin/HEAD`) | +| `dispatch_base_branch` | any branch name | repo's default branch (`git symbolic-ref --short refs/remotes/origin/HEAD`; if that fails because `origin/HEAD` is not set — happens on bare clones, fresh `git clone --no-checkout`, or some CI checkouts — fall back to `git remote show origin` and parse the `HEAD branch:` line, or to `main` with a one-line warning to the user) | | `dispatch_labels` | comma-separated label list | `ce-dispatch` | | `dispatch_auto_review` | `true` or `false` | `true` | @@ -72,11 +72,12 @@ Locate the `Implementation Units` section. Each unit is a top-level bullet whose - **U-ID** (e.g., `U1`, `U3`) - **Name** (the bolded heading text) -- **Goal** (the unit's "Goal" or "Why" field) -- **Files** (the unit's `Files:` section — Create, Modify, Read paths) -- **Patterns** (the unit's `Patterns to follow` field, if present) -- **Approach** (the unit's `Approach` field, if present) -- **Verification** (the unit's `Verification` or `Test scenarios` field) +- **Goal** (the unit's `**Goal:**` field, verbatim — this is the canonical label `ce-plan` emits) +- **Files** (the unit's `**Files:**` section, capturing all three sub-bullets `Create:`, `Modify:`, and `Test:` — these are the canonical labels `ce-plan` emits per its unit template; `Read:` is also accepted as an alias for hand-edited or external plans. **Test paths are required**: they feed into the Phase 1.3 parallel-safety file-to-unit map, so dropping them silently lets test-file overlaps slip through unflagged.) +- **Patterns** (the unit's `**Patterns to follow:**` field, if present) +- **Approach** (the unit's `**Approach:**` field, if present) +- **Test scenarios** (the unit's `**Test scenarios:**` field, if present — captured as a distinct field from `Verification` because Phase 2 substitutes them into the dispatch prompt's `` and `` sections respectively, which are different. Collapsing them here would leak test-scenario detail into `` and verification commands into ``.) +- **Verification** (the unit's `**Verification:**` field, if present — this carries the project's combined test/lint commands and feeds the dispatch prompt's `` section) - **Dependencies** (the unit's `**Dependencies:**` field listing other U-IDs — this is the canonical label `ce-plan` emits per its unit template; also accept `Depends on:` as an alias for hand-edited or external plans. If neither label is present, fall back to inferring from the plan's sequencing prose; default to `none` only when nothing is found. Do **not** default to `none` silently when a `**Dependencies:**` line exists with parseable U-IDs — that would let dependent units look like roots and dispatch out of order.) If the plan has no recognizable Implementation Units section, stop and tell the user the plan must contain implementation units before dispatch. Do not invent units. @@ -103,7 +104,7 @@ For each dispatchable unit (initially the roots; later, units whose dependencies Substitute concrete values for every section: - `` — plan file repo-relative path; one-sentence project context - `` — Goal from the unit (single-unit case) -- `` — the unit's combined Create/Modify/Read file list +- `` — the unit's combined Create/Modify/Test file list (with `Read:` accepted as alias). Test paths are part of the agent's deliverable, not just reading material — preserve them. - `` — the unit's `Patterns to follow` content (or the fallback line) - `` — the unit's Approach field - `` — the template's constraints, plus any predicted-overlap hint @@ -138,7 +139,7 @@ gh issue create \ Notes: - Write the rendered prompt to a per-run scratch file under `mktemp -d -t ce-dispatch-XXXXXX` (per the repo's "Scratch Space" guidance in `AGENTS.md`). The scratch directory holds one file per dispatched unit so retries can re-use them. -- The label list comes from `dispatch_labels` (default `ce-dispatch`). If a label does not yet exist in the repo, `gh` prints a warning — surface it to the user once and offer to create the label via `gh label create` (single confirmation, not per-issue). +- The label list comes from `dispatch_labels` (default `ce-dispatch`). If a label does not yet exist in the repo, `gh issue create` returns a non-zero exit and refuses to create the issue (`could not add label: '' not found`) — this is intentional `gh` behavior to prevent accidental label creation (cli/cli#715). Detect the missing-label error on the first failed `gh issue create` call, surface it once with the missing label name(s), and offer to run `gh label create ` (single confirmation covering all missing labels, not per-issue), then retry the failed `gh issue create` invocations. Do **not** strip the label from the command and proceed silently — dispatched issues without their labels won't be picked up by `dispatch_labels`-filtering automations. - After each successful issue creation, capture the issue URL and number and append them to an in-memory `dispatched_units` map keyed by U-ID: `{ U3: { issue_number: 142, issue_url: "...", expected_branch: "dispatch/U3-...", status: "issue_created", pr: null } }`. - If `gh issue create` fails (auth error, rate limit, etc.), stop the round and surface the error. Do not try to "recover" by retrying with different flags — the user needs to fix the underlying problem. @@ -173,7 +174,7 @@ Act on the user's selection — do not just announce it. The bare per-option act - CI rollup on the PR is green (no `FAILURE` or `ERROR` checks). If checks are pending, ask the user whether to wait or skip. - The PR has a `## Dispatch Result` section in its body with `Status: completed`. If the section is missing or `Status` is `partial` / `failed`, refuse and surface the issue back to the user. - When all gates pass, run `gh pr merge --squash --delete-branch`. `gh pr merge` lands the merge on GitHub but does not touch the local checkout, so before running any verification commands locally, sync the working tree to the merged base: `git fetch origin && git checkout && git pull --ff-only origin `. Without this sync the test suite would run against pre-merge code and could report a false green even when the merged commit is broken. Then run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. + When all gates pass, run `gh pr merge --squash --delete-branch`. `gh pr merge` lands the merge on GitHub but does not touch the local checkout, so before running any verification commands locally, sync the working tree to the merged base. **Precondition** (run before the sync, not after): check `git status --porcelain` — if the dispatching session's working tree has uncommitted changes, do **not** run `git checkout` (it can fail or silently overwrite user work). Surface the dirty paths to the user and ask them to commit, stash, or skip the local test step before proceeding. Also capture the current branch with `git symbolic-ref --short HEAD` (or `git rev-parse HEAD` if detached) so it can be restored after tests — the user may not have been on the base branch when they invoked the orchestrator. With those guards in place, run `git fetch origin && git checkout && git pull --ff-only origin `. Without this sync the test suite would run against pre-merge code and could report a false green even when the merged commit is broken. Then run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. Finally, restore the captured pre-sync branch (`git checkout `) and tell the user the working tree was cycled through `` for verification — don't leave them silently displaced. On merge conflict (`gh pr merge` reports the PR is not mergeable due to conflicts), do **not** attempt to resolve the conflict in the dispatching session — the conflict belongs to the workspace that produced the PR. Surface the conflict and advise the user: "Open the workspace, run `git fetch origin && git rebase origin/`, resolve conflicts, push, and re-run option 1 to refresh status." Re-render the menu without merging. diff --git a/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md b/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md index b57fae0ac..f5bd25f48 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md +++ b/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md @@ -28,8 +28,12 @@ issues -- otherwise prefer one issue per unit.] -[Combined file list from the unit(s) -- files to create, modify, or read. -Use the plan's `Files:` section as the source of truth. Repo-relative paths only.] +[Combined file list from the unit(s) -- files to Create, Modify, or Test. +Use the plan's `**Files:**` section as the source of truth (canonical sub-bullets: +`Create:`, `Modify:`, `Test:` -- per the ce-plan unit template). `Read:` is also +accepted as an alias when the plan was hand-edited or produced outside ce-plan. +Repo-relative paths only. Do not silently drop `Test:` paths -- they are the +test files the agent is expected to author or update, not just reference.] diff --git a/tests/skills/ce-dispatch-contract.test.ts b/tests/skills/ce-dispatch-contract.test.ts index f6ef185e6..1c2aa4187 100644 --- a/tests/skills/ce-dispatch-contract.test.ts +++ b/tests/skills/ce-dispatch-contract.test.ts @@ -413,6 +413,108 @@ describe("ce-dispatch SKILL.md regression guards (Codex-flagged bugs)", () => { expect(pullIdx).toBeGreaterThan(ghMergeIdx) expect(testSuiteIdx).toBeGreaterThan(Math.max(fetchIdx, pullIdx)) }) + + test("Phase 4 merge sync guards dirty working tree and restores branch", () => { + // The post-merge sync (`git fetch` + `git checkout ` + `git pull`) + // can fail or silently overwrite user work if the dispatching session's + // working tree is dirty. It can also leave the user displaced from a + // feature branch they were on. The merge block must (a) check + // `git status --porcelain` before running checkout, and (b) restore the + // pre-sync branch (or surface that the working tree was cycled) afterward. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + expect(mergeBlockMatch).not.toBeNull() + const mergeBlock = mergeBlockMatch![0] + // Precondition guard: dirty-tree check before checkout. + expect(mergeBlock).toMatch(/git status --porcelain/) + const dirtyCheckIdx = mergeBlock.search(/git status --porcelain/) + const checkoutIdx = mergeBlock.search(/git checkout /) + expect(dirtyCheckIdx).toBeGreaterThan(-1) + expect(checkoutIdx).toBeGreaterThan(dirtyCheckIdx) + // Branch capture before sync; restore after tests. + expect(mergeBlock).toMatch(/git symbolic-ref --short HEAD/) + expect(mergeBlock).toMatch(/restore|displaced|cycled/i) + }) + + test("Phase 0 base-branch default documents an origin/HEAD fallback", () => { + // `git symbolic-ref --short refs/remotes/origin/HEAD` exits non-zero on + // clones where origin/HEAD was never set (bare clones, fresh + // `git clone --no-checkout`, some CI checkouts). The default-resolution + // table assumed it always succeeds. Must document a fallback (parsing + // `git remote show origin`, defaulting to `main`, or both). + const phase0Start = SKILL_BODY.indexOf("### Phase 0:") + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase0Region = SKILL_BODY.slice(phase0Start, phase1Start) + const baseBranchRow = phase0Region.match( + /\| `dispatch_base_branch` \|[^\n]+/, + ) + expect(baseBranchRow).not.toBeNull() + // Either a parse of `git remote show origin` or a default-to-main, with a + // user-facing warning, must be documented in the default cell. + expect(baseBranchRow![0]).toMatch( + /git remote show origin|default(?:s)? to `?main`?/, + ) + }) + + test("Phase 1 Files field captures Test paths (not just Read)", () => { + // ce-plan emits Files as `Create:` / `Modify:` / `Test:` (per its unit + // template), but the original parse rule listed them as `Create, Modify, + // Read`. Test files therefore got dropped from the parallel-safety + // file-to-unit map (Phase 1.3) and from the dispatch prompt's `` + // section, masking real test-file overlap between dispatched units. + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase2Start = SKILL_BODY.indexOf("### Phase 2:") + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + const filesBullet = phase1Region.match(/-\s+\*\*Files\*\*[^\n]+/) + expect(filesBullet).not.toBeNull() + // Must explicitly name `Test:` as a captured sub-bullet. + expect(filesBullet![0]).toMatch(/`Test:`/) + // Phase 1.3 already says "Test paths" — keep that consistent. + expect(phase1Region).toMatch(/Create, Modify, and Test paths/) + }) + + test("Phase 1 captures Test scenarios separately from Verification", () => { + // Phase 2 substitutes `` with the unit's test scenarios and + // `` with the project's test/lint commands — different prompt + // sections, different sources. The Phase 1 parse list must capture both + // the `**Test scenarios:**` and `**Verification:**` fields as separate + // entries; collapsing them into one field leaks each into the wrong + // template section. + const phase1Start = SKILL_BODY.indexOf("### Phase 1:") + const phase2Start = SKILL_BODY.indexOf("### Phase 2:") + const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) + expect(phase1Region).toMatch(/-\s+\*\*Test scenarios\*\*[^\n]+/) + expect(phase1Region).toMatch(/-\s+\*\*Verification\*\*[^\n]+/) + // Verification bullet must NOT also claim to capture Test scenarios in + // the same field — that's the bug we're guarding against. + const verificationBullet = phase1Region.match( + /-\s+\*\*Verification\*\*[^\n]+/, + )! + expect(verificationBullet[0]).not.toMatch(/Test scenarios/) + }) + + test("Phase 3 documents gh issue create label-missing as an error", () => { + // `gh issue create` with `--label ` exits non-zero and refuses + // to create the issue (cli/cli#715 — intentional, prevents accidental + // label creation). Calling it a "warning" understates the recovery the + // user needs to perform; the skill must describe it as an error and + // outline the create-label-then-retry path. + const phase3Start = SKILL_BODY.indexOf("### Phase 3:") + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase3Region = SKILL_BODY.slice(phase3Start, phase4Start) + // Find the bullet that talks about labels. + const labelBullet = phase3Region.match(/- The label list comes from[^\n]+/) + expect(labelBullet).not.toBeNull() + // Must NOT call the missing-label outcome a "warning" only. + expect(labelBullet![0]).not.toMatch(/`gh` prints a warning/) + // Must describe an error/refusal and a retry path. + expect(labelBullet![0]).toMatch(/non-zero|refuses|error|not found/i) + expect(labelBullet![0]).toMatch(/gh label create/) + expect(labelBullet![0]).toMatch(/retry/i) + }) }) describe("conductor-notes.md documents key Conductor behavior", () => { From d845c5edaf913258d0f9b3429c9d56532ef33a4b Mon Sep 17 00:00:00 2001 From: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Date: Mon, 4 May 2026 13:28:28 +0000 Subject: [PATCH 3/4] fix(ce-dispatch): address third-round Codex review + defensive hardening MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Round 3 Codex bot review on EveryInc#762 surfaced two new P1 bugs and three sibling defensive issues raised on the second-round review thread. This commit lands all five. Bot-flagged P1s: - P13 (Comment 12) — Allow ce-dispatch to be invoked from ce-plan routing. ce-plan's option-4 routing fires Skill ce-dispatch through the platform's skill-invocation primitive, but ce-dispatch's frontmatter carried disable-model-invocation: true, which (per plugins/compound-engineering/AGENTS.md 'Beta Skills' caveat) blocks model-initiated Skill-tool invocation. The flag is mutually exclusive with being a callee of another skill — ce-work, the established sibling callee, doesn't carry it. Removed the flag. Description specificity and the required argument prevent accidental auto-fire instead, per the documented non-beta caveat. - P14 (Comment 13) — Normalize merged-state casing across dispatch state checks. The merge-routing block required dependencies in uppercase state MERGED while the same block wrote the merged unit status as lowercase merged; unblock-dispatch and loop-completion routings also keyed off the lowercase form. With mixed enums, dependencies merged via one path were treated as unmerged by another, causing false merge blocks. Declared a single canonical lowercase status taxonomy in Phase 3 (pending / issue_created / pr_open / merged / closed / failed) and mapped gh's uppercase PR-state enum (OPEN/MERGED/CLOSED) to it on ingest. Phase 4 reads/writes now reference the canonical lowercase status field consistently. Defensive hardening (raised but not enforced on round 2): - P10 — gh pr list --search 'head:...' is substring-matched, not exact. A search for dispatch/U3-add-rate-limiter would also match a sibling branch dispatch/U3-add-rate-limiter-v2. Phase 4 status check now adds headRefName to the --json projection, post-filters by exact case- sensitive headRefName equality, and falls back to a linked-issue query (gh pr list --search 'linked-issue:...') when the workspace renamed the branch. - P11 — gh pr view --json mergeable returns UNKNOWN for several seconds after a PR is opened (GitHub computes mergeability asynchronously). Treating that as MERGEABLE or CONFLICTING silently mis-routes the merge gate. Phase 4 status check now retries the mergeable poll up to three times with a ~2 second pause, then surfaces the unknown state to the user rather than coercing. - P12 — gh pr merge --delete-branch deletes the head ref on the remote. Without --prune, the dispatching session keeps stale origin/ refs which then poison subsequent gh pr list --search 'head:...' lookups. Phase 4 merge sync now uses git fetch origin --prune. Tests: tests/skills/ce-dispatch-contract.test.ts grows by five regression guards (P10/P11/P12/P13/P14). Local: bun test tests/skills/ce-dispatch-contract.test.ts → 58 pass / 0 fail (was 54). Full bun test → 1302 pass / 1 fail (the 1 is the pre-existing resolve-base.sh failure unrelated to this PR). bun run release:validate → in sync. --- .../skills/ce-dispatch/SKILL.md | 12 +- tests/skills/ce-dispatch-contract.test.ts | 123 +++++++++++++++++- 2 files changed, 127 insertions(+), 8 deletions(-) diff --git a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md index aa5e25db9..f6e9461bf 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md +++ b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md @@ -1,7 +1,6 @@ --- name: ce-dispatch description: "[BETA] Dispatch plan implementation units to external agent workspaces via GitHub issues. Use after ce-plan to fan out execution to Conductor workspaces or any issue-driven agent workflow. One issue per implementation unit, dispatched in dependency order; the orchestrator monitors PRs, gates merges on dependencies, and re-dispatches newly unblocked units." -disable-model-invocation: true argument-hint: "[Plan doc path. Blank to auto-detect latest plan]" --- @@ -141,6 +140,7 @@ Notes: - Write the rendered prompt to a per-run scratch file under `mktemp -d -t ce-dispatch-XXXXXX` (per the repo's "Scratch Space" guidance in `AGENTS.md`). The scratch directory holds one file per dispatched unit so retries can re-use them. - The label list comes from `dispatch_labels` (default `ce-dispatch`). If a label does not yet exist in the repo, `gh issue create` returns a non-zero exit and refuses to create the issue (`could not add label: '' not found`) — this is intentional `gh` behavior to prevent accidental label creation (cli/cli#715). Detect the missing-label error on the first failed `gh issue create` call, surface it once with the missing label name(s), and offer to run `gh label create ` (single confirmation covering all missing labels, not per-issue), then retry the failed `gh issue create` invocations. Do **not** strip the label from the command and proceed silently — dispatched issues without their labels won't be picked up by `dispatch_labels`-filtering automations. - After each successful issue creation, capture the issue URL and number and append them to an in-memory `dispatched_units` map keyed by U-ID: `{ U3: { issue_number: 142, issue_url: "...", expected_branch: "dispatch/U3-...", status: "issue_created", pr: null } }`. +- The `status` field is the unit's **canonical lifecycle enum** and is the only state value the orchestrator reads or writes for gate decisions. Allowed values are **always lowercase**: `pending` (pre-dispatch), `issue_created` (issue exists, no PR yet), `pr_open` (PR exists and not yet merged or closed), `merged` (PR squash-merged), `closed` (PR closed without merge), `failed` (issue creation or dispatch errored). When ingesting `gh pr view --json state`, map its uppercase enum to the canonical lowercase form (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) before storing — never compare against `gh`'s raw uppercase values directly. Mixing the two casings would let a unit merged via the `gh pr merge` action (which writes lowercase `merged`) be treated as unmerged by a dependency check that compared against uppercase `MERGED`. - If `gh issue create` fails (auth error, rate limit, etc.), stop the round and surface the error. Do not try to "recover" by retrying with different flags — the user needs to fix the underlying problem. After all issues in the round are created, summarize to the user: count, U-IDs dispatched, base branch, and the expectation that workspaces will pick them up. @@ -165,22 +165,22 @@ Dispatch status: / merged. open PRs. < Act on the user's selection — do not just announce it. The bare per-option action lives inline below. Elaborate sub-flows (review tool selection, conflict resolution prose) live further down. -- **Check PR status (1)** — for each dispatched unit, run `gh pr list --state all --search "head:"` (or query by linked issue if the workspace renamed the branch); `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss PRs merged outside this orchestrator (GitHub UI, Conductor, another shell). For each match, run `gh pr view --json state,mergeable,statusCheckRollup,headRefName`. Update `dispatched_units` with the latest PR number, state (`OPEN`, `MERGED`, `CLOSED`), CI rollup, and mergeable flag. Re-render the loop status line and re-render the menu. +- **Check PR status (1)** — for each dispatched unit, run `gh pr list --state all --search "head:"`; `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss PRs merged outside this orchestrator (GitHub UI, Conductor, another shell). The `--search "head:..."` query is **substring-matched**, not exact — if `expected_branch` is `dispatch/U3-add-rate-limiter`, a sibling branch like `dispatch/U3-add-rate-limiter-v2` will also match. For each candidate match, run `gh pr view --json state,mergeable,statusCheckRollup,headRefName,number` and discard rows whose `headRefName` is not strictly equal to `expected_branch` (case-sensitive exact match). If no PR survives the exact-match filter, fall back to querying by linked issue: `gh pr list --state all --search "linked-issue:"` and apply the same exact-match-on-`headRefName` filter (the workspace may have renamed the branch but the linked-issue link is durable). For each surviving match, update `dispatched_units[]` with the latest PR number, CI rollup, `mergeable` flag, and the canonical lowercase `status` mapped from `gh`'s uppercase PR-state enum (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) per the taxonomy in Phase 3. **`mergeable: UNKNOWN` is transient**: GitHub computes mergeability asynchronously, so newly-opened PRs may report `UNKNOWN` for several seconds. When `mergeable` is `UNKNOWN`, re-poll `gh pr view --json mergeable` up to three times with a ~2 second pause between calls before storing the value. If still `UNKNOWN` after retries, store it as such and surface that to the user ("'s PR mergeability is still being computed by GitHub — retry option 1 in a moment") rather than treating `UNKNOWN` as `MERGEABLE` or `CONFLICTING`. Re-render the loop status line and re-render the menu. - **Review a PR (2)** — ask the user which U-ID's PR to review (blocking tool single-select from open PRs in `dispatched_units`). Then invoke the `ce-code-review` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the PR URL as the argument. When `dispatch_auto_review: true`, also auto-trigger this for every newly opened PR before the user is asked to merge it (record per-PR `reviewed: true` so it isn't re-run). - **Merge a PR (3)** — ask which U-ID's PR to merge (blocking tool single-select from PRs that pass the merge gate below). Apply this gate before merging: - - All of the unit's dependencies (per the dependency graph) are already in state `MERGED` in `dispatched_units`. If any dependency is not yet merged, refuse with the message "Cannot merge `` — dependency `` is still . Merge it first." and re-render the menu. + - All of the unit's dependencies (per the dependency graph) have `status: merged` in `dispatched_units` (lowercase canonical — see Phase 3). If any dependency is not yet merged, refuse with the message "Cannot merge `` — dependency `` is still . Merge it first." and re-render the menu. - CI rollup on the PR is green (no `FAILURE` or `ERROR` checks). If checks are pending, ask the user whether to wait or skip. - The PR has a `## Dispatch Result` section in its body with `Status: completed`. If the section is missing or `Status` is `partial` / `failed`, refuse and surface the issue back to the user. - When all gates pass, run `gh pr merge --squash --delete-branch`. `gh pr merge` lands the merge on GitHub but does not touch the local checkout, so before running any verification commands locally, sync the working tree to the merged base. **Precondition** (run before the sync, not after): check `git status --porcelain` — if the dispatching session's working tree has uncommitted changes, do **not** run `git checkout` (it can fail or silently overwrite user work). Surface the dirty paths to the user and ask them to commit, stash, or skip the local test step before proceeding. Also capture the current branch with `git symbolic-ref --short HEAD` (or `git rev-parse HEAD` if detached) so it can be restored after tests — the user may not have been on the base branch when they invoked the orchestrator. With those guards in place, run `git fetch origin && git checkout && git pull --ff-only origin `. Without this sync the test suite would run against pre-merge code and could report a false green even when the merged commit is broken. Then run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. Finally, restore the captured pre-sync branch (`git checkout `) and tell the user the working tree was cycled through `` for verification — don't leave them silently displaced. + When all gates pass, run `gh pr merge --squash --delete-branch`. `gh pr merge` lands the merge on GitHub but does not touch the local checkout, so before running any verification commands locally, sync the working tree to the merged base. **Precondition** (run before the sync, not after): check `git status --porcelain` — if the dispatching session's working tree has uncommitted changes, do **not** run `git checkout` (it can fail or silently overwrite user work). Surface the dirty paths to the user and ask them to commit, stash, or skip the local test step before proceeding. Also capture the current branch with `git symbolic-ref --short HEAD` (or `git rev-parse HEAD` if detached) so it can be restored after tests — the user may not have been on the base branch when they invoked the orchestrator. With those guards in place, run `git fetch origin --prune && git checkout && git pull --ff-only origin `. The `--prune` flag is required: `gh pr merge --delete-branch` deletes the head ref on the remote, and without `--prune` the dispatching session keeps a stale `origin/` ref which then poisons subsequent `gh pr list --state all --search "head:..."` lookups (and confuses humans inspecting `git branch -r`). Without the local sync the test suite would run against pre-merge code and could report a false green even when the merged commit is broken. Then run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. Finally, restore the captured pre-sync branch (`git checkout `) and tell the user the working tree was cycled through `` for verification — don't leave them silently displaced. On merge conflict (`gh pr merge` reports the PR is not mergeable due to conflicts), do **not** attempt to resolve the conflict in the dispatching session — the conflict belongs to the workspace that produced the PR. Surface the conflict and advise the user: "Open the workspace, run `git fetch origin && git rebase origin/`, resolve conflicts, push, and re-run option 1 to refresh status." Re-render the menu without merging. - **Dispatch newly unblocked units (4)** — recompute the dispatchable set: U-IDs whose dependencies are all `merged` and that have not yet been dispatched. Re-enter Phases 2-3 for that set. If the set is empty, say so and re-render the menu. -- **Show dependency graph (5)** — render an ASCII graph (or a Mermaid diagram if the harness renders one) of all U-IDs, with each node labeled by U-ID and current state (`merged` / `open #PR` / `blocked` / `pending`). Re-render the menu. +- **Show dependency graph (5)** — render an ASCII graph (or a Mermaid diagram if the harness renders one) of all U-IDs, reading each node's `status` field (canonical lowercase enum from Phase 3). Render `pr_open` as `open #` for human friendliness; render `merged`, `blocked`, `pending`, `closed`, `failed`, and `issue_created` verbatim. Re-render the menu. - **Done for now (6)** — print a summary (units merged, units still open, units blocked) and exit the loop. The dispatched issues and PRs persist in GitHub; the user can re-invoke `ce-dispatch` later to resume monitoring. @@ -188,7 +188,7 @@ If the user enters free text instead of a number, interpret intent and route to #### 4.2 Completion -The skill is **not** complete until the user picks `Done for now` or every unit in the plan is in state `merged`. Re-rendering the menu and stopping at the user's selection without acting on it is not completion — fire the routed action. +The skill is **not** complete until the user picks `Done for now` or every unit in the plan has `status: merged` (canonical lowercase). Re-rendering the menu and stopping at the user's selection without acting on it is not completion — fire the routed action. When every unit is merged, congratulate the user, optionally run the plan's final verification command (e.g., the full test suite from ``), and exit the loop. Do not auto-close the dispatched issues — `gh pr merge` typically closes them via the linked-issue mechanism, but verify and report. diff --git a/tests/skills/ce-dispatch-contract.test.ts b/tests/skills/ce-dispatch-contract.test.ts index 1c2aa4187..244d62603 100644 --- a/tests/skills/ce-dispatch-contract.test.ts +++ b/tests/skills/ce-dispatch-contract.test.ts @@ -65,8 +65,18 @@ describe("ce-dispatch SKILL.md frontmatter", () => { expect(desc.toLowerCase()).toContain("implementation unit") }) - test("disable-model-invocation is true (beta skill)", () => { - expect(fm["disable-model-invocation"]).toBe(true) + test("does not set disable-model-invocation (skill is invoked from ce-plan routing)", () => { + // ce-plan's option-4 routing fires `Skill ce-dispatch ` via the + // platform's skill-invocation primitive. `disable-model-invocation: true` + // would block that exact path (per plugins/compound-engineering/AGENTS.md + // "Beta Skills" section: the flag blocks model-initiated invocations via + // the Skill tool; only a user typing a slash command bypasses it). + // The flag is therefore mutually exclusive with being a callee of another + // skill. ce-dispatch is a callee, so the flag must be absent (or false). + // Prevent accidental auto-fire instead via the description's specificity + // and the required argument, per the AGENTS.md non-beta caveat. + const flag = fm["disable-model-invocation"] + expect(flag === undefined || flag === false).toBe(true) }) test("argument-hint references plan path with auto-detect fallback", () => { @@ -496,6 +506,115 @@ describe("ce-dispatch SKILL.md regression guards (Codex-flagged bugs)", () => { expect(verificationBullet[0]).not.toMatch(/Test scenarios/) }) + test("Phase 4 status check applies an exact-match filter on headRefName", () => { + // `gh pr list --search "head:..."` is substring-matched, not exact, so a + // sibling branch like `dispatch/U3-add-rate-limiter-v2` will collide with + // a search for `dispatch/U3-add-rate-limiter`. The status check must + // post-filter the candidate rows so only those whose headRefName equals + // the expected_branch survive, and must fall back to a linked-issue query + // when no candidate survives (e.g., the workspace renamed the branch). + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const statusBlockMatch = phase4Region.match( + /\*\*Check PR status \(1\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + expect(statusBlockMatch).not.toBeNull() + const statusBlock = statusBlockMatch![0] + // Must call out substring-matching as a known caveat. + expect(statusBlock).toMatch(/substring-?match/i) + // Must require headRefName is part of the --json projection so the post- + // filter is possible. + expect(statusBlock).toMatch(/headRefName/) + // Must describe an exact-match filter and a linked-issue fallback. + expect(statusBlock).toMatch(/exact[-\s]?match/i) + expect(statusBlock).toMatch(/linked-issue/) + }) + + test("Phase 4 status check retries on transient mergeable: UNKNOWN", () => { + // GitHub computes mergeability asynchronously, so newly-opened PRs report + // `mergeable: UNKNOWN` for several seconds after creation. Treating that + // value as if it were CONFLICTING or MERGEABLE silently mis-routes the + // merge gate. The status check must explicitly retry the mergeable poll + // a small number of times before storing UNKNOWN as a final state, and + // must surface the unknown state to the user when retries exhaust rather + // than coercing it to a known value. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const statusBlockMatch = phase4Region.match( + /\*\*Check PR status \(1\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + const statusBlock = statusBlockMatch![0] + expect(statusBlock).toMatch(/mergeable[`'"]?:?\s*`?UNKNOWN/i) + // Some retry / re-poll language must be present. + expect(statusBlock).toMatch(/re-?poll|retry|retries/i) + // The skill must explicitly forbid coercing UNKNOWN to a known state. + expect(statusBlock).toMatch(/(?:not|never|rather than).{0,40}MERGEABLE/i) + }) + + test("dispatched_units status uses one canonical lowercase enum across reads and writes", () => { + // gh's PR-state JSON returns uppercase enums (`OPEN`, `MERGED`, `CLOSED`), + // while the merge-routing block writes the merged status as lowercase + // `merged` and the unblock-dispatch / loop-completion routings also key off + // the lowercase form. If any read or write uses the uppercase form, a unit + // merged via one path is treated as unmerged by the other, causing false + // merge blocks or missed unblocking. The skill must declare a single + // canonical lowercase taxonomy and explicitly map the uppercase gh enum to + // it on ingest, never compare against the uppercase form directly. + const phase3Start = SKILL_BODY.indexOf("### Phase 3:") + const phase3End = SKILL_BODY.indexOf("### Phase 4:") + const phase3Region = SKILL_BODY.slice(phase3Start, phase3End) + // Phase 3 must declare the canonical taxonomy. + expect(phase3Region).toMatch(/canonical/i) + expect(phase3Region).toMatch(/lowercase/i) + // Each canonical value should be enumerated as a lowercase backticked token. + for (const value of [ + "pending", + "issue_created", + "pr_open", + "merged", + "closed", + "failed", + ]) { + expect(phase3Region).toContain("`" + value + "`") + } + // The taxonomy must include the explicit gh-state -> lowercase mapping. + expect(phase3Region).toMatch(/OPEN[^\n]*pr_open/) + expect(phase3Region).toMatch(/MERGED[^\n]*\bmerged\b/) + expect(phase3Region).toMatch(/CLOSED[^\n]*\bclosed\b/) + + // Phase 4 must not require an uppercase MERGED for the merge-gate + // dependency check. The `MERGED` token may still appear inside the + // documented `OPEN -> pr_open` / `MERGED -> merged` / `CLOSED -> closed` + // mapping prose (for the gh-side enum), but a literal "in state `MERGED`" + // dependency check would re-introduce the bug. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + expect(phase4Region).not.toMatch(/state\s+`MERGED`/) + // The merge gate must reference the canonical lowercase status field. + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + const mergeBlock = mergeBlockMatch![0] + expect(mergeBlock).toMatch(/status:\s*merged/) + }) + + test("Phase 4 merge sync uses git fetch --prune to clear stale refs", () => { + // `gh pr merge --delete-branch` removes the head ref on the remote, but + // `git fetch origin` without `--prune` retains the stale local + // `origin/` ref. Subsequent `gh pr list --search + // "head:..."` queries can match the stale ref and confuse the orchestrator + // about whether a follow-up PR exists. The Phase 4 sync step must use + // `git fetch --prune` so deleted branches are swept on the next sync. + const phase4Start = SKILL_BODY.indexOf("### Phase 4:") + const phase4Region = SKILL_BODY.slice(phase4Start) + const mergeBlockMatch = phase4Region.match( + /\*\*Merge a PR \(3\)\*\*[\s\S]*?(?=\n- \*\*[A-Z])/, + ) + const mergeBlock = mergeBlockMatch![0] + // The post-merge sync's git fetch must carry --prune. + expect(mergeBlock).toMatch(/git fetch origin --prune/) + }) + test("Phase 3 documents gh issue create label-missing as an error", () => { // `gh issue create` with `--label ` exits non-zero and refuses // to create the issue (cli/cli#715 — intentional, prevents accidental From 6ea2276535edad3c01f6c426c68711aa505f4bc0 Mon Sep 17 00:00:00 2001 From: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Date: Mon, 4 May 2026 13:59:47 +0000 Subject: [PATCH 4/4] fix(ce-dispatch): rename to ce-dispatch-beta per beta-skills framework triplet P15: Restore convention triplet (-beta name + [BETA] description + disable-model-invocation). Removing the flag in P13 broke the convention; the right resolution is to apply it cleanly: - Rename ce-dispatch -> ce-dispatch-beta (directory, frontmatter, internal references, template metadata footer marker, default labels) - Re-add disable-model-invocation: true (matches ce-work-beta, ce-polish-beta) - Description prefixed with [BETA] - Add ce-dispatch (legacy unsuffixed name) to STALE_SKILL_DIRS in legacy-cleanup.ts and EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN['compound-engineering'].skills so flat-install upgrades sweep the stale dir on next install P15-routing: ce-plan option-4 routing changed from Skill-primitive invocation to user-typed slash command. The flag blocks model-initiated invocations via the Skill tool; only a user typing /ce-dispatch-beta directly bypasses it. Inline routing in ce-plan/SKILL.md and references/plan-handoff.md now end the turn with the slash-command instruction and an explicit 'do not attempt Skill ce-dispatch-beta' negation. Promotion-back-to-stable note included for forward compatibility. P16 (Codex Comment 15 / P1): Replace invalid 'linked-issue:' GitHub-search qualifier with a body-content search keyed on the 'Unit ID:' line in the PR body (durable across branch renames). 'linked-issue:' is not a valid GitHub-search qualifier; the documented form is 'linked:issue' (a flag). The earlier fallback would have silently matched nothing. P17: Remove cross-skill file-path reference 'plugins/compound-engineering/skills/ce-work/ SKILL.md' from Phase 1.3 (parallel-safety analysis). Per AGENTS.md 'File References in Skills' rule, each skill directory must be self-contained. The analysis is now inlined. P20: Standardize dispatched_units[].pr as a sub-object {number, state, mergeable, ci_rollup} consistently across Phase 3 init, Phase 4 status check, and Phase 4 graph render. Earlier draft mixed flat sibling fields (pr_number) with the sub-object shape; unified now to prevent the same casing-class drift we saw with the lifecycle enum. P22: Removed Pipeline Mode section (impossible under disable-model-invocation: true) with an explicit explanation: the only entrypoint is a user-typed slash command, so there is no automated caller to compress Phase 4 for. Forward reference to promotion path notes that pipeline-mode wiring can be added when the flag is removed alongside the routing change in ce-plan. Tests: - tests/skills/ce-dispatch-contract.test.ts: updated paths, frontmatter assertions (name=ce-dispatch-beta, description starts with [BETA], disable-model-invocation: true), template metadata marker (ce-dispatch-beta-metadata), and ce-plan routing assertions (slash command, flag-aware, explicit negation of Skill primitive) - New regression guards: linked-issue: qualifier never appears in actual gh invocations, no cross-skill file paths in SKILL.md, pr field is a consistent sub-object shape - 61 contract tests pass (was 54). Full suite: 1305 pass / 1 fail (pre-existing resolve-base.sh shallow-checkout failure, unrelated). bun run release:validate: in sync. --- .../config.local.example.yaml | 10 +- .../SKILL.md | 27 +-- .../references/conductor-notes.md | 24 +-- .../references/dispatch-prompt-template.md | 10 +- .../skills/ce-plan/SKILL.md | 2 +- .../skills/ce-plan/references/plan-handoff.md | 2 +- .../ce-setup/references/config-template.yaml | 10 +- src/data/plugin-legacy-artifacts.ts | 1 + src/utils/legacy-cleanup.ts | 5 + tests/skills/ce-dispatch-contract.test.ts | 204 ++++++++++++++---- 10 files changed, 220 insertions(+), 75 deletions(-) rename plugins/compound-engineering/skills/{ce-dispatch => ce-dispatch-beta}/SKILL.md (77%) rename plugins/compound-engineering/skills/{ce-dispatch => ce-dispatch-beta}/references/conductor-notes.md (64%) rename plugins/compound-engineering/skills/{ce-dispatch => ce-dispatch-beta}/references/dispatch-prompt-template.md (95%) diff --git a/.compound-engineering/config.local.example.yaml b/.compound-engineering/config.local.example.yaml index dd3d9b040..3e272d79c 100644 --- a/.compound-engineering/config.local.example.yaml +++ b/.compound-engineering/config.local.example.yaml @@ -12,13 +12,17 @@ # work_delegate_effort: high # minimal | low | medium | high | xhigh (omit to use ~/.codex/config.toml default) # --- Dispatch (external workspace delegation) --- -# Settings for /ce-dispatch, which fans out plan implementation units to -# external agent workspaces (e.g., Conductor) via GitHub issues. +# Settings for /ce-dispatch-beta, which fans out plan implementation units to +# external agent workspaces (e.g., Conductor) via GitHub issues. The beta +# suffix follows the beta-skills framework triplet (-beta name + +# [BETA] description prefix + disable-model-invocation: true); promotion +# to stable will rename the slash command to /ce-dispatch and update the +# default labels accordingly. # dispatch_mode: conductor # conductor | (default: conductor) # dispatch_branch_prefix: dispatch/ # branch prefix suggested in dispatch prompts (default: dispatch/) # dispatch_base_branch: main # PR base branch (default: repo default branch) -# dispatch_labels: ce-dispatch # comma-separated labels applied to created issues (default: ce-dispatch) +# dispatch_labels: ce-dispatch-beta # comma-separated labels applied to created issues (default: ce-dispatch-beta) # dispatch_auto_review: true # true | false (default: true) -- auto-run ce-code-review on each new PR # --- Product pulse --- diff --git a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md b/plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md similarity index 77% rename from plugins/compound-engineering/skills/ce-dispatch/SKILL.md rename to plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md index f6e9461bf..b6ee70193 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md +++ b/plugins/compound-engineering/skills/ce-dispatch-beta/SKILL.md @@ -1,6 +1,7 @@ --- -name: ce-dispatch +name: ce-dispatch-beta description: "[BETA] Dispatch plan implementation units to external agent workspaces via GitHub issues. Use after ce-plan to fan out execution to Conductor workspaces or any issue-driven agent workflow. One issue per implementation unit, dispatched in dependency order; the orchestrator monitors PRs, gates merges on dependencies, and re-dispatches newly unblocked units." +disable-model-invocation: true argument-hint: "[Plan doc path. Blank to auto-detect latest plan]" --- @@ -8,7 +9,9 @@ argument-hint: "[Plan doc path. Blank to auto-detect latest plan]" Fan out a structured plan's implementation units to external agent workspaces (Conductor or any issue-driven agent platform) by creating one GitHub issue per dispatchable unit. The orchestrator monitors the resulting pull requests, enforces dependency-ordered merges, and re-dispatches units whose dependencies have just merged. -This skill is a sibling to `ce-work` and `ce-work-beta`. Where `ce-work` executes a plan in the **current** session and `ce-work-beta` can delegate to `codex exec`, `ce-dispatch` hands units off to **separate workspaces** that the dispatching session does not control directly. Use it when units are independent enough to parallelize across workspaces, when you want human-in-the-loop review at the PR layer, or when integrating with a workspace platform (e.g., Conductor) that picks up GitHub issues. +This skill is a sibling to `ce-work` and `ce-work-beta`. Where `ce-work` executes a plan in the **current** session and `ce-work-beta` can delegate to `codex exec`, `ce-dispatch-beta` hands units off to **separate workspaces** that the dispatching session does not control directly. Use it when units are independent enough to parallelize across workspaces, when you want human-in-the-loop review at the PR layer, or when integrating with a workspace platform (e.g., Conductor) that picks up GitHub issues. + +Like all `-beta` skills in this plugin (`ce-work-beta`, `ce-polish-beta`, etc.), `ce-dispatch-beta` carries `disable-model-invocation: true` and is invoked **only** by the user typing `/ce-dispatch-beta` (or by direct slash-command pipelines). It is not auto-invoked by the model, and other skills do not call it via the platform's skill-invocation primitive — that path is blocked by the flag. `ce-plan`'s post-generation menu surfaces it as a manual next step the user can choose. For background on Conductor's specific behavior (issue-to-workspace lifecycle, startup scripts, PR creation flow), see `references/conductor-notes.md`. For the structure of the prompt embedded in each issue, see `references/dispatch-prompt-template.md`. @@ -16,7 +19,7 @@ For background on Conductor's specific behavior (issue-to-workspace lifecycle, s When asking the user a question, use the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question. -The Phase 4 monitor loop renders **6 menu options**, which exceeds the 4-option cap most blocking tools enforce. For that menu — and only that menu — render a numbered list directly in chat per the option-overflow exception in `plugins/compound-engineering/AGENTS.md`. Tell the user "Pick a number or describe what you want." so the list retains the open-endedness of the blocking tool. Earlier phases (Phase 0 plan-path confirmation, Phase 3 confirm-before-creating-issues) stay within the 4-option cap and use the blocking tool. +The Phase 4 monitor loop renders **6 menu options**, which exceeds the 4-option cap most blocking tools enforce. For that menu — and only that menu — render a numbered list directly in chat instead of calling the blocking tool. Tell the user "Pick a number or describe what you want." so the list retains the open-endedness of the blocking tool. Earlier phases (Phase 0 plan-path confirmation, Phase 3 confirm-before-creating-issues) stay within the 4-option cap and use the blocking tool. The general principle: prefer the blocking tool whenever the option count fits its cap; fall back to a numbered chat list with explicit "Pick a number or describe what you want." wording only for menus that exceed the cap, so the open-ended free-text path is preserved. ## Input @@ -49,7 +52,7 @@ Config keys and resolution: | `dispatch_mode` | `conductor`, or another short identifier | `conductor` | | `dispatch_branch_prefix` | any string (no leading/trailing slashes) | `dispatch/` | | `dispatch_base_branch` | any branch name | repo's default branch (`git symbolic-ref --short refs/remotes/origin/HEAD`; if that fails because `origin/HEAD` is not set — happens on bare clones, fresh `git clone --no-checkout`, or some CI checkouts — fall back to `git remote show origin` and parse the `HEAD branch:` line, or to `main` with a one-line warning to the user) | -| `dispatch_labels` | comma-separated label list | `ce-dispatch` | +| `dispatch_labels` | comma-separated label list | `ce-dispatch-beta` | | `dispatch_auto_review` | `true` or `false` | `true` | If a key has an unrecognized value, fall through to the default for that key. Do not error. @@ -90,7 +93,7 @@ Construct a directed graph from the captured `Dependencies` lists. Nodes are U-I #### 1.3 Parallel Safety Check -Mirror the parallel-safety analysis from `ce-work` (the canonical version lives in `plugins/compound-engineering/skills/ce-work/SKILL.md`'s "Parallel Safety Check" section). Build a file-to-unit mapping from every unit's `Files:` section (Create, Modify, and Test paths). Detect intersections. +Apply a parallel-safety analysis: build a file-to-unit mapping from every unit's `Files:` section (Create, Modify, and Test paths). Detect intersections. The same analysis runs in `ce-work` for in-session execution and in `ce-work-beta` for delegated execution; the inlined version below is the source of truth for dispatch and is intentionally duplicated here per the plugin's "File References in Skills" rule (each skill directory is self-contained; cross-skill file paths break runtime resolution and converter portability). Each external workspace runs in its own working tree (Conductor: one workspace = one branch = one isolated working tree), so file overlap between units in different workspaces does **not** corrupt git state — but it predicts merge conflicts when those PRs land. @@ -138,8 +141,8 @@ gh issue create \ Notes: - Write the rendered prompt to a per-run scratch file under `mktemp -d -t ce-dispatch-XXXXXX` (per the repo's "Scratch Space" guidance in `AGENTS.md`). The scratch directory holds one file per dispatched unit so retries can re-use them. -- The label list comes from `dispatch_labels` (default `ce-dispatch`). If a label does not yet exist in the repo, `gh issue create` returns a non-zero exit and refuses to create the issue (`could not add label: '' not found`) — this is intentional `gh` behavior to prevent accidental label creation (cli/cli#715). Detect the missing-label error on the first failed `gh issue create` call, surface it once with the missing label name(s), and offer to run `gh label create ` (single confirmation covering all missing labels, not per-issue), then retry the failed `gh issue create` invocations. Do **not** strip the label from the command and proceed silently — dispatched issues without their labels won't be picked up by `dispatch_labels`-filtering automations. -- After each successful issue creation, capture the issue URL and number and append them to an in-memory `dispatched_units` map keyed by U-ID: `{ U3: { issue_number: 142, issue_url: "...", expected_branch: "dispatch/U3-...", status: "issue_created", pr: null } }`. +- The label list comes from `dispatch_labels` (default `ce-dispatch-beta`). If a label does not yet exist in the repo, `gh issue create` returns a non-zero exit and refuses to create the issue (`could not add label: '' not found`) — this is intentional `gh` behavior to prevent accidental label creation (cli/cli#715). Detect the missing-label error on the first failed `gh issue create` call, surface it once with the missing label name(s), and offer to run `gh label create ` (single confirmation covering all missing labels, not per-issue), then retry the failed `gh issue create` invocations. Do **not** strip the label from the command and proceed silently — dispatched issues without their labels won't be picked up by `dispatch_labels`-filtering automations. +- After each successful issue creation, capture the issue URL and number and append them to an in-memory `dispatched_units` map keyed by U-ID: `{ U3: { issue_number: 142, issue_url: "...", expected_branch: "dispatch/U3-...", status: "issue_created", pr: null } }`. The `pr` slot is the **canonical PR sub-object** for the unit and is the only place PR-related fields live; once Phase 4 finds a PR, it populates `pr` as `{ number, state, mergeable, ci_rollup, reviewed }` (lowercase keys, never as flat siblings like `pr_number` / `pr_state` — the orchestrator reads `.pr.number`, `.pr.state`, etc., and a flat-sibling write would split state across two namespaces and reintroduce the same casing-class bug as the lifecycle enum). - The `status` field is the unit's **canonical lifecycle enum** and is the only state value the orchestrator reads or writes for gate decisions. Allowed values are **always lowercase**: `pending` (pre-dispatch), `issue_created` (issue exists, no PR yet), `pr_open` (PR exists and not yet merged or closed), `merged` (PR squash-merged), `closed` (PR closed without merge), `failed` (issue creation or dispatch errored). When ingesting `gh pr view --json state`, map its uppercase enum to the canonical lowercase form (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) before storing — never compare against `gh`'s raw uppercase values directly. Mixing the two casings would let a unit merged via the `gh pr merge` action (which writes lowercase `merged`) be treated as unmerged by a dependency check that compared against uppercase `MERGED`. - If `gh issue create` fails (auth error, rate limit, etc.), stop the round and surface the error. Do not try to "recover" by retrying with different flags — the user needs to fix the underlying problem. @@ -165,7 +168,7 @@ Dispatch status: / merged. open PRs. < Act on the user's selection — do not just announce it. The bare per-option action lives inline below. Elaborate sub-flows (review tool selection, conflict resolution prose) live further down. -- **Check PR status (1)** — for each dispatched unit, run `gh pr list --state all --search "head:"`; `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss PRs merged outside this orchestrator (GitHub UI, Conductor, another shell). The `--search "head:..."` query is **substring-matched**, not exact — if `expected_branch` is `dispatch/U3-add-rate-limiter`, a sibling branch like `dispatch/U3-add-rate-limiter-v2` will also match. For each candidate match, run `gh pr view --json state,mergeable,statusCheckRollup,headRefName,number` and discard rows whose `headRefName` is not strictly equal to `expected_branch` (case-sensitive exact match). If no PR survives the exact-match filter, fall back to querying by linked issue: `gh pr list --state all --search "linked-issue:"` and apply the same exact-match-on-`headRefName` filter (the workspace may have renamed the branch but the linked-issue link is durable). For each surviving match, update `dispatched_units[]` with the latest PR number, CI rollup, `mergeable` flag, and the canonical lowercase `status` mapped from `gh`'s uppercase PR-state enum (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) per the taxonomy in Phase 3. **`mergeable: UNKNOWN` is transient**: GitHub computes mergeability asynchronously, so newly-opened PRs may report `UNKNOWN` for several seconds. When `mergeable` is `UNKNOWN`, re-poll `gh pr view --json mergeable` up to three times with a ~2 second pause between calls before storing the value. If still `UNKNOWN` after retries, store it as such and surface that to the user ("'s PR mergeability is still being computed by GitHub — retry option 1 in a moment") rather than treating `UNKNOWN` as `MERGEABLE` or `CONFLICTING`. Re-render the loop status line and re-render the menu. +- **Check PR status (1)** — for each dispatched unit, run `gh pr list --state all --search "head:"`; `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss PRs merged outside this orchestrator (GitHub UI, Conductor, another shell). The `--search "head:..."` query is **substring-matched**, not exact — if `expected_branch` is `dispatch/U3-add-rate-limiter`, a sibling branch like `dispatch/U3-add-rate-limiter-v2` will also match. For each candidate match, run `gh pr view --json state,mergeable,statusCheckRollup,headRefName,number` and discard rows whose `headRefName` is not strictly equal to `expected_branch` (case-sensitive exact match). If no PR survives the exact-match filter, fall back to a body-content search keyed on the U-ID (workspaces sometimes rename the branch, but every dispatched PR carries a `**Unit ID:** ` line in its description per the dispatch prompt template's output contract): `gh pr list --state all --search 'in:body "Unit ID: "'`. Do **not** use `--search "linked-issue:"` — `linked-issue:` is not a valid GitHub-search qualifier (the documented qualifier is `linked:issue`, which is a flag returning all PRs linked to any issue, not a per-issue lookup). For PRs surfaced by the body-search fallback, skip the `headRefName` filter — the whole point of the fallback is that the branch was renamed, so the `Unit ID:` line is the durable correlation key. For each surviving match, populate `dispatched_units[].pr` as a sub-object `{ number, state, mergeable, ci_rollup, reviewed }` with values pulled from `gh pr view`. The `state` value must be the canonical lowercase mapped from `gh`'s uppercase PR-state enum (`OPEN` -> `pr_open`, `MERGED` -> `merged`, `CLOSED` -> `closed`) per the taxonomy in Phase 3, and `dispatched_units[].status` must be set to the same value (the unit-level `status` and the unit-level `pr.state` move in lockstep — never let them disagree). **`mergeable: UNKNOWN` is transient**: GitHub computes mergeability asynchronously, so newly-opened PRs may report `UNKNOWN` for several seconds. When `mergeable` is `UNKNOWN`, re-poll `gh pr view --json mergeable` up to three times with a ~2 second pause between calls before storing the value. If still `UNKNOWN` after retries, store it as such and surface that to the user ("'s PR mergeability is still being computed by GitHub — retry option 1 in a moment") rather than treating `UNKNOWN` as `MERGEABLE` or `CONFLICTING`. Re-render the loop status line and re-render the menu. - **Review a PR (2)** — ask the user which U-ID's PR to review (blocking tool single-select from open PRs in `dispatched_units`). Then invoke the `ce-code-review` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the PR URL as the argument. When `dispatch_auto_review: true`, also auto-trigger this for every newly opened PR before the user is asked to merge it (record per-PR `reviewed: true` so it isn't re-run). @@ -180,9 +183,9 @@ Act on the user's selection — do not just announce it. The bare per-option act - **Dispatch newly unblocked units (4)** — recompute the dispatchable set: U-IDs whose dependencies are all `merged` and that have not yet been dispatched. Re-enter Phases 2-3 for that set. If the set is empty, say so and re-render the menu. -- **Show dependency graph (5)** — render an ASCII graph (or a Mermaid diagram if the harness renders one) of all U-IDs, reading each node's `status` field (canonical lowercase enum from Phase 3). Render `pr_open` as `open #` for human friendliness; render `merged`, `blocked`, `pending`, `closed`, `failed`, and `issue_created` verbatim. Re-render the menu. +- **Show dependency graph (5)** — render an ASCII graph (or a Mermaid diagram if the harness renders one) of all U-IDs, reading each node's `status` field (canonical lowercase enum from Phase 3). Render `pr_open` as `open #.pr.number` for human friendliness; render `merged`, `blocked`, `pending`, `closed`, `failed`, and `issue_created` verbatim. Re-render the menu. -- **Done for now (6)** — print a summary (units merged, units still open, units blocked) and exit the loop. The dispatched issues and PRs persist in GitHub; the user can re-invoke `ce-dispatch` later to resume monitoring. +- **Done for now (6)** — print a summary (units merged, units still open, units blocked) and exit the loop. The dispatched issues and PRs persist in GitHub; the user can re-invoke `/ce-dispatch-beta` later to resume monitoring. If the user enters free text instead of a number, interpret intent and route to the closest option, or ask one clarifying question and resume the loop. @@ -194,9 +197,9 @@ When every unit is merged, congratulate the user, optionally run the plan's fina ## Pipeline Mode -If `ce-dispatch` is invoked from an automated workflow (e.g., LFG, or any `disable-model-invocation` upstream), skip the Phase 4 interactive loop and return immediately after Phase 3 with a structured summary of dispatched units. The caller decides what to do with the open PRs. +This skill is intentionally **not** schedulable: `disable-model-invocation: true` blocks every model-initiated invocation (Skill primitive, scheduled re-entry from `/loop`, automated pipeline harnesses), so the only entrypoint is a user typing `/ce-dispatch-beta` directly. Any "pipeline mode" framing in earlier drafts is moot — there is no automated caller to compress Phase 4 for. If a future stable `ce-dispatch` is promoted by removing the flag (per `docs/solutions/skill-design/beta-skills-framework.md`), pipeline-mode wiring can be added at that point alongside the routing change in `ce-plan`. -## What ce-dispatch does NOT do +## What ce-dispatch-beta does NOT do - It does not programmatically create Conductor workspaces. Conductor opens workspaces from issues at the user's discretion (per `references/conductor-notes.md`, section 1). - It does not write to or modify the dispatched workspace's filesystem. The orchestrating session only touches GitHub via `gh` and the local plan file. diff --git a/plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md b/plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md similarity index 64% rename from plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md rename to plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md index 2764cdc79..e91859a55 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md +++ b/plugins/compound-engineering/skills/ce-dispatch-beta/references/conductor-notes.md @@ -1,6 +1,6 @@ # Conductor Notes -Findings from the public Conductor documentation at https://www.conductor.build/docs (researched at the time `ce-dispatch` was authored). Conductor is the primary integration target for `ce-dispatch`, but the skill is written to be generic over any issue-driven agent workflow — these notes exist so future maintainers can verify or revise the assumptions baked into the skill. +Findings from the public Conductor documentation at https://www.conductor.build/docs (researched at the time `ce-dispatch-beta` was authored). Conductor is the primary integration target for `ce-dispatch-beta`, but the skill is written to be generic over any issue-driven agent workflow — these notes exist so future maintainers can verify or revise the assumptions baked into the skill. If Conductor's behavior changes, update both this file and the SKILL.md sections that depend on it (Phase 0 config defaults, Phase 3 issue body conventions, Phase 4 PR/merge guidance). @@ -10,8 +10,8 @@ Source: [From issue to PR](https://www.conductor.build/docs/guides/issue-to-pr) - Workspace creation is **user-initiated** in the Conductor desktop app (Cmd+Shift+N → choose GitHub or Linear issue). There is no automatic trigger that spins up a workspace the moment a GitHub issue is created — a human picks the issue from a list inside Conductor. - When the user picks a GitHub or Linear issue, Conductor creates a workspace and the agent inherits the issue title, description, and context as starting prompt material. -- There are no documented label or metadata conventions Conductor requires on issues. Any GitHub issue the user can see is a candidate. `ce-dispatch` is therefore free to apply its own label scheme (`ce-dispatch` by default, configurable via `dispatch_labels`) for human filtering rather than to satisfy Conductor. -- Implication for `ce-dispatch`: the issue body **is** the agent's initial prompt context. Make the body fully self-contained — do not rely on a separate "startup prompt" file Conductor will inject. Any context the in-workspace agent needs (plan path, unit goal, files, patterns, approach, constraints, output contract) must be in the issue body. +- There are no documented label or metadata conventions Conductor requires on issues. Any GitHub issue the user can see is a candidate. `ce-dispatch-beta` is therefore free to apply its own label scheme (`ce-dispatch-beta` by default, configurable via `dispatch_labels`) for human filtering rather than to satisfy Conductor. +- Implication for `ce-dispatch-beta`: the issue body **is** the agent's initial prompt context. Make the body fully self-contained — do not rely on a separate "startup prompt" file Conductor will inject. Any context the in-workspace agent needs (plan path, unit goal, files, patterns, approach, constraints, output contract) must be in the issue body. ## 2. Startup scripts @@ -19,7 +19,7 @@ Source: [Scripts](https://www.conductor.build/docs/reference/scripts), [Setup sc - Conductor supports three repo-level scripts: `setup` (runs at workspace creation), `run` (Run-button command), `archive` (pre-archive cleanup). Defined in `conductor.json` at the repo root or per-user in Repository Settings. - The `setup` script is for **environment preparation** (`npm install`, copy `.env`, build assets, install local plugins) — not for injecting an LLM prompt. There is no documented hook to bake an LLM prompt into a workspace independent of the issue body. -- Implication for `ce-dispatch`: do not assume any "startup prompt" is wired up. The full agent prompt rides in the issue body. If the target repo has a `conductor.json`, ce-dispatch leaves it alone; if a maintainer wants the CE plugin auto-installed in every workspace, that is a Conductor-level configuration choice, not something `ce-dispatch` writes for them. +- Implication for `ce-dispatch-beta`: do not assume any "startup prompt" is wired up. The full agent prompt rides in the issue body. If the target repo has a `conductor.json`, ce-dispatch-beta leaves it alone; if a maintainer wants the CE plugin auto-installed in every workspace, that is a Conductor-level configuration choice, not something `ce-dispatch-beta` writes for them. ## 3. Worktree and branch management @@ -27,14 +27,14 @@ Source: [Isolated workspaces](https://www.conductor.build/docs/concepts/workspac - Each workspace = one git working tree on its own branch. One workspace per branch; a branch can only be checked out in one workspace at a time. - Conductor auto-creates a branch when a workspace starts. The first chat typically prompts the in-workspace agent to **rename** the branch to match the work (per the Conductor doc note: "When you start your first chat, Conductor will instruct the agent to rename this branch to match what you're working on"). Workspaces also have a directory name (e.g., `warsaw-v2`) separate from the git branch. -- There is no enforced branch naming convention from Conductor — naming is left to the in-workspace agent / user. `ce-dispatch` therefore **suggests** a branch name in the issue body (e.g., `dispatch/U3-add-rate-limiter` derived from `dispatch_branch_prefix` + U-ID + slugged unit goal) and lets the agent honor it. The metadata block records the expected branch so Phase 4 monitoring can match PRs to U-IDs even if the agent renamed the branch. +- There is no enforced branch naming convention from Conductor — naming is left to the in-workspace agent / user. `ce-dispatch-beta` therefore **suggests** a branch name in the issue body (e.g., `dispatch/U3-add-rate-limiter` derived from `dispatch_branch_prefix` + U-ID + slugged unit goal) and lets the agent honor it. The metadata block records the expected branch so Phase 4 monitoring can match PRs to U-IDs even if the agent renamed the branch. ## 4. Agent configuration Source: [Agent modes](https://www.conductor.build/docs/concepts/agent-modes), [Setup script reference](https://www.conductor.build/docs/reference/scripts/setup). - Conductor runs Claude Code or Codex inside the workspace. Skills work in both. Repository instructions (`AGENTS.md`, `CLAUDE.md`) and skills the user already has installed are available. -- The CE plugin is **not** automatically installed in every Conductor workspace. It must be either (a) already installed at the user/system level so it's available in every workspace, or (b) installed by the repo's `setup` script. `ce-dispatch` does not enforce this — the dispatch prompt's `` block tells the in-workspace agent how to detect and use the plugin **if available**, and what to do otherwise (follow the prompt sections directly). +- The CE plugin is **not** automatically installed in every Conductor workspace. It must be either (a) already installed at the user/system level so it's available in every workspace, or (b) installed by the repo's `setup` script. `ce-dispatch-beta` does not enforce this — the dispatch prompt's `` block tells the in-workspace agent how to detect and use the plugin **if available**, and what to do otherwise (follow the prompt sections directly). ## 5. PR lifecycle @@ -42,24 +42,24 @@ Source: [Workflow](https://www.conductor.build/docs/concepts/workflow), [From is - Conductor has a built-in **`Create PR`** action (Cmd+Shift+P). When invoked, Conductor sends the current diff and repo context to the in-workspace agent so it can draft the PR description. - After the PR exists, Conductor's Checks tab follows GitHub Actions, deployments, review comments, and todos. -- Implication for `ce-dispatch`: do not fight Conductor's PR flow. The dispatch prompt's `` tells the in-workspace agent to commit, push, and **open a PR** when the unit is complete — whether via Conductor's `Create PR` UI, the `ce-commit-push-pr` skill (when CE plugin is installed), or a manual `gh pr create`. Any of those produces a real GitHub PR, which is what `ce-dispatch` Phase 4 monitors via `gh pr view`/`gh pr checks`. +- Implication for `ce-dispatch-beta`: do not fight Conductor's PR flow. The dispatch prompt's `` tells the in-workspace agent to commit, push, and **open a PR** when the unit is complete — whether via Conductor's `Create PR` UI, the `ce-commit-push-pr` skill (when CE plugin is installed), or a manual `gh pr create`. Any of those produces a real GitHub PR, which is what `ce-dispatch-beta` Phase 4 monitors via `gh pr view`/`gh pr checks`. ## 6. API and CLI - The public docs do not describe a CLI or HTTP API for **programmatic** workspace creation. Workspace creation is desktop-app driven (keyboard shortcut or `...` menu on the New Workspace button). - There is a [Deep Links](https://www.conductor.build/docs/reference/deep-links) reference (`conductor://` URLs) that can open Conductor and trigger actions, but it's not a substitute for an API. -- Implication for `ce-dispatch`: the skill is **not** trying to programmatically create Conductor workspaces. It creates GitHub issues; the human (or Conductor user) opens those issues as workspaces in Conductor. This is intentional — it keeps `ce-dispatch` decoupled from any one platform's workspace orchestration. +- Implication for `ce-dispatch-beta`: the skill is **not** trying to programmatically create Conductor workspaces. It creates GitHub issues; the human (or Conductor user) opens those issues as workspaces in Conductor. This is intentional — it keeps `ce-dispatch-beta` decoupled from any one platform's workspace orchestration. -## What `ce-dispatch` does NOT assume about Conductor +## What `ce-dispatch-beta` does NOT assume about Conductor - That a specific label name is required for issues to be picked up — Conductor accepts any visible issue. - That Conductor will rename branches to a specific pattern — it lets the in-workspace agent decide. - That a startup script can deliver an LLM prompt — the issue body is the prompt. -- That Conductor exposes an API for headless workspace creation — `ce-dispatch` stays at the issue layer. +- That Conductor exposes an API for headless workspace creation — `ce-dispatch-beta` stays at the issue layer. -## What `ce-dispatch` is opinionated about (and why) +## What `ce-dispatch-beta` is opinionated about (and why) -- **Label** (default `ce-dispatch`): so humans can filter their issue list; not a Conductor requirement. +- **Label** (default `ce-dispatch-beta`): so humans can filter their issue list; not a Conductor requirement. - **Branch name suggestion** (`dispatch_branch_prefix` + U-ID + slug): so the orchestrator can correlate PRs back to U-IDs in Phase 4; the in-workspace agent is encouraged but not forced to honor it. - **HTML metadata comment in the issue body** (plan path, U-ID, dependencies, expected branch, base branch): structured data the orchestrator parses on subsequent runs to detect dependency state without rebuilding the graph from scratch. The HTML comment renders invisibly to humans on GitHub but stays parseable. - **PR-based output contract** (a `## Dispatch Result` section in the PR description): replaces ce-work-beta's `--output-schema` JSON, since dispatched agents don't have a shared scratch directory with the orchestrator. The PR description is the durable handoff surface. diff --git a/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md b/plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md similarity index 95% rename from plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md rename to plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md index f5bd25f48..a16371a20 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md +++ b/plugins/compound-engineering/skills/ce-dispatch-beta/references/dispatch-prompt-template.md @@ -13,7 +13,7 @@ Render exactly these sections, in this order. Keep the XML tags so downstream to [One paragraph orienting the in-workspace agent: - Plan file path (repo-relative) the unit was extracted from - One-sentence project context (read from plan frontmatter / repo README if available) -- Note that this issue was created by ce-dispatch and corresponds to a single +- Note that this issue was created by ce-dispatch-beta and corresponds to a single implementation unit (or a small batch of independent units) from the plan. The agent should `Read` the plan file for the full picture before starting.] @@ -122,7 +122,7 @@ without it. Report the result via the **PR description**, not via a JSON file or scratch -artifact -- ce-dispatch reads the PR body to drive Phase 4 monitoring, +artifact -- ce-dispatch-beta reads the PR body to drive Phase 4 monitoring, review, and merge gating. Render this section verbatim under a top-level `## Dispatch Result` heading @@ -155,10 +155,10 @@ If verification was not possible, say why. ## Metadata footer -Append the following HTML comment **outside** the `` block, at the very end of the rendered issue body. The comment is invisible in the GitHub UI but parseable by `ce-dispatch` on subsequent runs (and other tooling that wants to round-trip dispatch state). +Append the following HTML comment **outside** the `` block, at the very end of the rendered issue body. The comment is invisible in the GitHub UI but parseable by `ce-dispatch-beta` on subsequent runs (and other tooling that wants to round-trip dispatch state). ```html -