From ea385ab577d2913a437c933fc93c5c9c002fe2f0 Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 18:19:21 +0200 Subject: [PATCH 1/9] docs(pr-review): add Notice tier doctrine terms to CONTEXT.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Captures the domain language that emerged from the grilling session behind the ADO-error-hardening graduation: Notice (the structured message emitted by orchestration agents), Notice Tier (the four-state classification — OK / EMPTY-BY-DESIGN / DEGRADED / ABORTED — with no fifth ASK tier), and Trailer (the mandatory end-of-run line in the Claude interface). The relationships section is extended with a new bullet stating that every orchestration-agent operation terminates in one of the four Notice Tiers and that DEGRADED and Doc-Context EMPTY-BY-DESIGN operations emit a Notice that flows through the Review Summary (ADO modes) or the pre-findings block (Pre-PR mode), with counts also echoed in the Trailer. These terms are referenced by the two PRDs created in the follow-up commit (PRD A: pr-review-ado-fetcher-reliability; PRD B: pr-review-platform-failure-handling) and by the planned ADRs 0014 and 0015. Capturing them in CONTEXT.md first so the PRDs can reference the canonical definitions rather than re-litigating. Co-Authored-By: Claude Sonnet 4.6 --- apps/claude-code/pr-review/CONTEXT.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/apps/claude-code/pr-review/CONTEXT.md b/apps/claude-code/pr-review/CONTEXT.md index 35266a2..1b896d6 100644 --- a/apps/claude-code/pr-review/CONTEXT.md +++ b/apps/claude-code/pr-review/CONTEXT.md @@ -123,6 +123,30 @@ A Thread Classification state. No action taken; the issue still exists in the ne **obsolete**: A Thread Classification state. The relevant code was deleted or moved; the comment no longer applies. +### Platform-failure handling + +**Notice**: +A user-facing message emitted by an orchestration agent when a Review operation completed in a non-OK Notice Tier. Carries `severity` (`info` or `warning`), `kind` (a small enum identifying the failed operation), and a one-line `message`. Notices are merged across agents by the orchestrator, rendered in the Review Summary, included in the end-of-run Trailer, and (for Pre-PR mode) printed in the Claude interface before findings. +_Avoid_: warning, error, log line + +**Notice Tier**: +A four-state classification of every Review operation outcome: **OK**, **EMPTY-BY-DESIGN**, **DEGRADED**, **ABORTED**. The tier choice IS the gating decision — there is no fifth "ask the user" tier. Failure modes that tempt one are reclassified as ABORTED. + +**OK**: +A Notice Tier. The operation completed with a non-empty result. No Notice emitted. + +**EMPTY-BY-DESIGN**: +A Notice Tier. The operation completed with an empty result that is a legitimate domain state (no work-items linked, no Confluence pages, no prior threads). Currently emits an `info` Notice only for the Doc Context family; other empty states are inherent to the Review type and stay silent. + +**DEGRADED**: +A Notice Tier. The operation failed but the Review can still complete with reduced coverage. Emits a `warning` Notice; the Review still posts. + +**ABORTED**: +A Notice Tier. The operation failed and continuing would corrupt cross-run state (Bot Signature drift, Summary thread desync, mode misdetection). The run stops before the Review Summary is composed; the failure goes to stderr plus the end-of-run Trailer. + +**Trailer**: +A single end-of-run line printed by the orchestrator to the Claude interface, regardless of mode or success state. Carries findings count by severity, Notice counts by severity, and (for ADO modes) the PR URL. Designed for AFK skim: the invoker sees outcome status without opening the PR. + ## Relationships - A **Review** produces one **Review Summary**, zero or more **Inline Comments**, and zero or more **General Comments** @@ -137,6 +161,7 @@ A Thread Classification state. The relevant code was deleted or moved; the comme - The **ADO Fetcher** is invoked by first-review and re-review modes; **Pre-PR mode** skips it entirely and goes directly to Review Aspect agents - The **Re-review Coordinator** is invoked only when the mode is re-review; first-review and pre-PR modes never load it - The **ADO Writer** is invoked by first-review and re-review modes; **Pre-PR mode** does not write back to ADO +- Every operation in an orchestration agent terminates in one of the four **Notice Tiers**. **DEGRADED** and **EMPTY-BY-DESIGN**-with-message operations emit a **Notice** that flows from the agent's structured result block, through the orchestrator's merge step, into the **Review Summary** (for ADO modes) or the printed pre-findings block (for **Pre-PR mode**). The end-of-run **Trailer** carries Notice counts so the invoker sees them without opening the PR. ## Example dialogue From 46c23f0e24ae7ac98e35cc919750cbadf8abcf5e Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 18:19:42 +0200 Subject: [PATCH 2/9] docs(pr-review): graduate ADO error-hardening inbox into PRD A and PRD B MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Splits the deferred items from PR #29's silent-failure-hunter review into two coherent PRDs after a grilling session that locked the underlying doctrine, scope, and module shape: - PRD A — pr-review-ado-fetcher-reliability (foundation): introduces the four-tier Notice doctrine (OK / EMPTY-BY-DESIGN / DEGRADED / ABORTED), the helper layer under scripts/ado/ (classify-http-error, notices, fetch-iterations, fetch-work-items), the canonical HTTP-tier mapping (401/403 abort, 5xx/network degrade, no retries in v1), the Notice flow from agent result blocks through the orchestrator into the Review Summary, the mandatory end-of-run Trailer line, and the DIFF_RANGE: full | incremental sentinel in ADO_FETCHER_RESULT. Also schedules ADR 0014 (helper-layer refinement of ADR 0013), ADR 0015 (HTTP-tier mapping), and an in-place ADR 0004 amendment for the γ-downgrade rule. - PRD B — pr-review-platform-failure-handling (consumer, depends on PRD A): applies the doctrine to the Re-review Coordinator (DIFF_RANGE consumption with γ-downgrade of addressed/obsolete to pending, match-finding try/catch with DEGRADED Notice, PATCH-to-fixed routed through the canonical mapping), the ADO Writer (every az devops invoke routed through a new parse-write-response helper, H1 inline POST retroactively inheriting 401/403 abort, *.err streamed to stderr at moment of failure, parseAdoWriterResult discriminated-union refactor), and Pre-PR mode (parseChangedFilesFromDiff suspicious- shape Notice, Gitflow-aware default-branch fallback chain via a new scripts/pre-pr/detect-default-branch.mjs helper). Both PRDs land with Status: needs-triage. Test scope follows the user's choice during grilling: unit tests for the new deep helpers; MODIFY helpers and agent prompts are verified by integration smoke test against a real ADO PR (per ADR 0013's testing posture). The originating inbox file (docs/inbox/pr-review-ado-error-hardening- pass.md) is removed in this commit, per the graduation flow documented in docs/inbox/README.md. Discriminated-union refactors are verified breaking-change-free (zero consumers outside the pr-review plugin). Co-Authored-By: Claude Sonnet 4.6 --- .../pr-review-ado-error-hardening-pass.md | 36 --- .../pr-review-ado-fetcher-reliability/PRD.md | 206 ++++++++++++++++++ .../PRD.md | 193 ++++++++++++++++ 3 files changed, 399 insertions(+), 36 deletions(-) delete mode 100644 docs/inbox/pr-review-ado-error-hardening-pass.md create mode 100644 docs/issues/pr-review-ado-fetcher-reliability/PRD.md create mode 100644 docs/issues/pr-review-platform-failure-handling/PRD.md diff --git a/docs/inbox/pr-review-ado-error-hardening-pass.md b/docs/inbox/pr-review-ado-error-hardening-pass.md deleted file mode 100644 index f271aaf..0000000 --- a/docs/inbox/pr-review-ado-error-hardening-pass.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -title: pr-review ADO error-hardening pass -created: 2026-05-13 ---- - -**Status:** needs-triage -**Category:** enhancement - -> _This was generated by AI during triage of PR #29._ - -PR #29 (pr-review orchestrator split) addressed the highest-stakes silent-failure paths in the ADO Writer and the orchestrator's mode-detection step (H1–H5). A second wave of silent-failure findings from the same review was deferred because the changes span multiple agents and helpers and feel like a feature in themselves — "ADO write reliability" — rather than a fit for the orchestrator-split PR. Capture them here so they aren't lost. - -## Items to harden - -- **ADO Fetcher Step 2 (iterations fetch).** `ITERATIONS_JSON=$(az devops invoke ...)` does not capture the exit code; an `az` failure produces an empty variable, which the subsequent Node parse silently coerces to `LATEST_ITERATION_ID=''` and `LATEST_COMMIT_SHA=''`. Every comment is then signed `Iteration ` (empty) and re-review detection breaks forever afterward because no `PRIOR_ITERATION_ID` can match `""`. Capture the exit code, branch on it, and abort the whole agent with a clear error rather than emitting a signature-less review. -- **ADO Fetcher Step 5 (work-item fetch).** `WI_RESPONSE=$(az devops invoke ... 2>/dev/null) || WI_RESPONSE=""` conflates legitimate "no work items linked" with auth expiry, project rename, 5xx, extension uninstall, network partition. The Doc Context Orchestrator then runs without business context, silently. Capture stderr, log it, and emit a distinct sentinel (e.g. `WORK_ITEM_IDS_ERROR`) so the orchestrator can surface to the user. -- **ADO Fetcher Step 4 (diff-range fallback).** When the prior commit can't be fetched, the agent silently falls back to the full diff. The Re-review Coordinator then classifies prior threads against the wider hunk set and may flip threads from `obsolete` to `pending` or vice versa. Either propagate a `DIFF_RANGE: full|incremental` field through `ADO_FETCHER_RESULT` so the Coordinator can react, or skip classification when the fallback fires. -- **`parseWorkItemIds` discriminated union.** The helper currently maps null/undefined responses to `[]` — the JSDoc even codifies this as intentional. Replace with `{ ok: true, ids } | { ok: false, reason }` so callers must distinguish "no work items" from "fetch failed". -- **`parseAdoWriterResult` discriminated union.** Returns `null` when the result block is missing. The orchestrator has no documented branch for "writer returned null block" — it presumably treats it as 0/empty and proceeds to report success. Either throw on missing block or add a `{ status: 'missing' | 'parsed' }` variant. -- **`parseIterations` empty-input default.** Returns `{ latestIterationId: 1, latestCommitSha: '' }` when `value` is empty — but `iterationId=1` is explicitly the iteration the project never uses (per plugin CLAUDE.md). If a fetch failure produces an empty value array, the agent happily emits `Iteration 1` signatures, silently violating the rule. Either throw, or return a discriminated union. -- **Pre-PR default-branch detection.** `DEFAULT_BRANCH=$(git remote show origin 2>/dev/null | grep 'HEAD branch' | awk '{print $NF}' || echo "main")` silently falls back to `main`. The repo's own Gitflow uses `develop` — so a transient `git remote show` failure would compute the pre-PR diff against the wrong base and review hundreds of unrelated commits with no warning. Surface the fallback with a stderr warning. -- **Inline POST `*.err` files cleanup.** `Step 4 — Clean up` in the Writer runs `rm -f` unconditionally — destroying the only persistent record of stderr from failed inline POSTs. Keep them on failure (`if FINDINGS_POSTED < EXPECTED`) or stream their contents to stderr before cleanup. -- **Re-review Coordinator PATCH-to-fixed catch-all.** The Node catch block only special-cases `409` → continue; everything else (401, 403, 404, 5xx, network) becomes a 200-character "PATCH warning" in stdout that no automated layer inspects. Threads stay active when the user expects them fixed, and the next re-review re-replies to them. Distinguish recoverable (409) from unrecoverable and propagate the latter. -- **Re-review Coordinator per-finding match parse.** `MATCH=$(... 2>/dev/null || echo "")` — on Node parse error, CLASSIFICATION and THREAD_ID both end up empty strings, dispatch falls through to "no match → add to freshFindings", and the reviewer sees the same finding posted twice. Capture the exit code separately and abort/log instead of silently downgrading. -- **`pre-pr.mjs` parseChangedFilesFromDiff edge case.** `[]` is returned for both "empty input" and "diff with no `diff --git` headers" — the latter is suspicious (a real diff always has those headers). Pre-PR mode then prints `✅ Pre-PR review complete — no issues found.` even when the actual reason was a broken pipeline. Log a debug line when non-empty input produces zero files, or have `buildPrePrContext` throw on suspicious shapes. - -## What grilling needs to resolve - -- Is "harden every silent-failure path" a single feature, or should it be split (Writer / Fetcher / Coordinator / Pre-PR)? -- Where is the line between "abort and surface to user" and "log and continue"? Different items currently lean different ways. -- Adding `try/catch` + structured exit codes around every Node heredoc balloons the agent prompts — is that acceptable, or do we extract more helpers (the `mode-detection.mjs` precedent) so the bash side gets simpler? -- Are the discriminated-union refactors of `parseWorkItemIds` / `parseIterations` / `parseAdoWriterResult` a breaking change to the helper API? If they have any external consumers (they shouldn't, but verify), call it out. - -## Source - -PR #29 silent-failure-hunter review. The fixed subset (H1–H5) is documented in `apps/claude-code/pr-review/CHANGELOG.md` under [Unreleased] → Fixed. diff --git a/docs/issues/pr-review-ado-fetcher-reliability/PRD.md b/docs/issues/pr-review-ado-fetcher-reliability/PRD.md new file mode 100644 index 0000000..c09bc4d --- /dev/null +++ b/docs/issues/pr-review-ado-fetcher-reliability/PRD.md @@ -0,0 +1,206 @@ +# PRD: pr-review — ADO Fetcher reliability + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` + +--- + +## Problem Statement + +When the ADO Fetcher's Azure DevOps reads fail — iterations endpoint down, work-item fetch denied by auth, prior commit missing for an incremental diff — the failures are currently invisible. The bot keeps running and produces output that looks like a normal Review, but signed `Iteration ` (empty), or with `WORK_ITEM_IDS=[]` (indistinguishable from "no work items linked"), or classifying prior threads against the wrong diff range. The reviewer reading the PR has no way to tell that the Review was produced on degraded inputs; the next re-review can be corrupted permanently because Bot Signature drift breaks re-review detection. This is the most consequential class of silent failure surfaced by the PR #29 review. + +## Solution + +Introduce a four-state Notice Tier doctrine across the plugin and apply it first to the ADO Fetcher. Every Fetcher read terminates in one of four tiers — OK, EMPTY-BY-DESIGN, DEGRADED, ABORTED — and emits a structured Notice when the tier is non-OK (with one carve-out: EMPTY-BY-DESIGN is silent except for the Doc Context family). Notices flow from the Fetcher's structured result block through the orchestrator into the Review Summary, where the reviewer sees them. A mandatory end-of-run Trailer line printed in the Claude interface also reports notice counts, so the user invoking the command sees outcome status without opening the PR. + +Failure classification moves to pure JavaScript helpers under `scripts/ado/`, refining the architecture documented in ADR 0013. Three new deep helpers (`classify-http-error`, `notices`, plus per-fetch wrappers) replace the inline bash-and-Node heredocs that today implicitly swallow exit codes. The discriminated-union return shape distinguishes EMPTY-BY-DESIGN from DEGRADED at the helper API level, removing the conflation that today defaults `parseIterations([])` to `{ latestIterationId: 1 }` (silently violating CLAUDE.md's "iterationId=1 is never used" rule). + +The diff-range fallback that the Fetcher already performs (when the prior iteration's commit is unreachable, falling back to the full PR diff) gets a `DIFF_RANGE: full | incremental` sentinel field in `ADO_FETCHER_RESULT` and a DEGRADED Notice. PRD B will consume the sentinel; PRD A only emits it. + +## User Stories + +1. As a PR reviewer, I want a banner at the top of the Review Summary listing any platform failures that occurred during the Review, so that I can tell whether the bot's findings are based on complete or degraded context. +2. As a developer invoking `/pr-review:review-pr`, I want a single end-of-run Trailer line in my Claude interface reporting findings counts, notice counts, and the PR URL, so that I can scan outcome status across many AFK invocations without opening each PR. +3. As a PR reviewer, I want to know when the Review was produced without business context (no work items linked, or work-item fetch failed), so that I can decide whether to re-run with a linked work item or accept the review-without-context. +4. As a Plugin maintainer, I want the Bot Signature to never carry an empty or fabricated Iteration ID, so that re-review detection on the next run is not silently corrupted. +5. As a Plugin maintainer, I want auth or permission failures on the Azure DevOps iterations endpoint to abort the run with a clear stderr message naming `az devops login` as the remedy, so that the user is not left wondering why subsequent re-reviews behave oddly. +6. As a PR reviewer in re-review mode, I want the bot to tell me when it classified prior threads against the full PR diff instead of the incremental diff, so that I can interpret an unexpected `pending` verdict as conservative rather than definitive. +7. As a Plugin maintainer, I want failure classification logic to live in pure JS helpers with unit tests, rather than in bash-and-Node heredocs inside agent prompts, so that I can verify the doctrine is applied consistently without running an end-to-end ADO smoke test. +8. As a developer reading the codebase, I want every ADO write call site to consult one canonical helper that maps HTTP status codes to Notice Tiers, so that 401 means the same thing in every code path and a future contributor cannot accidentally invent a divergent mapping. +9. As a developer running `/pr-review:review-pr` in Pre-PR mode, I want any Fetcher-related infrastructure changes to be invisible to me, because Pre-PR mode does not run the Fetcher. +10. As a Plugin maintainer, I want a Notice that is emitted by multiple agents for the same root cause (e.g. Fetcher and Doc Context Orchestrator both noticing a Confluence outage) to be deduplicated in the orchestrator's merge step, so that the Summary does not list the same problem twice. +11. As a developer maintaining a Re-review on an ADO PR that was merged before the Review completed, I want the Fetcher to still return a usable iteration list (because comments are still useful as a review record), so that the merged-but-reviewable workflow ADR 0013 acknowledges keeps working. +12. As a Plugin maintainer, I want the discriminated-union refactor of `parseIterations` and `parseWorkItemIds` to be purely internal, so that no other plugin or release-tool depends on the old return shape. +13. As a developer running Pre-PR mode, I want the Doc Context EMPTY-BY-DESIGN informational Notice to be emitted only for the Doc Context family (no linked work items), so that other inherently-empty states (first-review having no prior threads, a clean PR having no findings) do not pollute the Summary with redundant `ℹ️` lines. +14. As a Plugin maintainer, I want the ADRs that record the new doctrine (helper-layer split from ADR 0013, canonical HTTP-tier mapping, γ-downgrade rule for diff-range) to be in place before PRD B's consumers start arriving, so that PRD B can reference them rather than re-litigate the decisions. +15. As a CI engineer, I want every new deep helper module to come with `node:test` unit tests in the prior-art style of `packages/release-tools/scripts/verify-changelog.test.mjs`, so that the helpers can be verified without an ADO PR and without Azure CLI installed. + +## Implementation Decisions + +### Notice Tier doctrine + +A four-state classification of every Review operation outcome — **OK**, **EMPTY-BY-DESIGN**, **DEGRADED**, **ABORTED** — captured in `CONTEXT.md` under "Platform-failure handling". The tier choice is the gating decision; there is no fifth ASK tier. AFK invocations never block on user input. Failure modes that tempt an ASK tier are reclassified as ABORTED. + +EMPTY-BY-DESIGN is silent for most states. The Doc Context family is the one exception: when `WORK_ITEM_IDS=[]` the orchestrator emits an `info`-severity Notice in the Summary, because the reviewer cannot tell from the PR alone whether the bot considered linked business context. + +### Notice flow + +Each orchestration agent emits a `NOTICES` JSON array as a new field in its structured result block. The orchestrator parses, merges (with `kind`-based deduplication), and passes the merged array to the ADO Writer alongside `FINDINGS`. The ADO Writer renders a `## ⚠ Notices` block above the findings in the Review Summary content. + +Notice shape: `{ severity: "info" | "warning", kind: , message: string }`. `kind` is a small enum (`doc-context`, `diff-range`, `work-items`, `iterations`, `default-branch`, `partial-run-check`, `thread-match`, `thread-classify`, `inline-post`, `summary-post`, `patch-to-fixed`, `diff-parse`); rejected: free-form strings, severity-coded numerics. ABORTED never reaches the Notice channel — its surface is stderr + the Trailer. + +### End-of-run Trailer + +The orchestrator prints a mandatory single-line Trailer to the Claude interface at end-of-run, regardless of mode or outcome: + +- ADO modes: `✅ Review posted: findings ( critical, important) · warning notices · info notices → ` +- Pre-PR mode: `✅ Pre-PR review complete: findings ( critical, important) · warning notices` +- Aborted: `❌ Review aborted: ` + +Designed for AFK skim: the invoker sees outcome status without opening the PR. Same `NOTICES` array drives both the Summary rendering and the Trailer counts. + +### Helper layer (ADR 0014) + +Failure classification moves from inline bash-and-Node heredocs to pure JS helpers under `scripts/ado/`. Agent prompts shrink to "import, call, branch on `result.ok`". This refines ADR 0013 — orchestration still lives in agent prompts, but **failure classification** lives in helpers. + +New helper modules: + +- **`scripts/ado/classify-http-error.mjs`** — pure function taking an HTTP status code, response body excerpt, and process exit code. Returns `{ tier: 'ok' | 'degraded' | 'aborted', kind, message }`. Encodes the canonical HTTP-tier mapping. Consumed by PRD B too. +- **`scripts/ado/notices.mjs`** — pure helpers `createNotice`, `mergeNotices` (dedupe by `kind`), `formatNoticesAsSummaryBlock`, `formatNoticesAsPrePrPreamble`, `formatTrailer`. +- **`scripts/ado/fetch-iterations.mjs`** — wraps the iterations fetch and parse; returns `{ ok: true, latestIterationId, latestCommitSha } | { ok: false, reason }`. Subsumes the existing `parseIterations` helper, refactored to the discriminated-union shape. Empty `value` array on a real PR → `{ ok: false, reason: 'empty-iterations' }` → ABORTED. +- **`scripts/ado/fetch-work-items.mjs`** — wraps the work-items fetch and parse; returns `{ ok: true, ids } | { ok: false, reason }`. Subsumes `parseWorkItemIds`. Empty array (legitimate "no work items linked") → `{ ok: true, ids: [] }`; fetch failure (auth, 5xx, network) → `{ ok: false }`. + +### Canonical HTTP-tier mapping (ADR 0015) + +| HTTP outcome | Tier | Notes | +| --------------------- | -------- | ---------------------------------------------------------- | +| 200 / 201 | OK | No Notice. | +| 404 | OK | Domain "the thing is already gone." | +| 409 | OK | Domain "state already changed." | +| 401 | ABORTED | Token expired or revoked; all subsequent writes will fail. | +| 403 | ABORTED | Permission revoked; same. | +| 5xx | DEGRADED | Transient backend; emit Notice; continue. | +| Other 4xx (400 / 422) | DEGRADED | Malformed request bug; Notice includes body excerpt. | +| Network error | DEGRADED | Treat as 5xx. | + +No retries in v1. Retries add latency, complexity, and a new failure mode (retry storm). The doctrine produces correct behaviour without them; retries can be added later behind the same Notice surface. + +### DIFF_RANGE sentinel and ADR 0004 amendment + +The ADO Fetcher's existing fallback from incremental to full diff (when the prior iteration's commit is unreachable) is currently silent. PRD A introduces: + +- A new `DIFF_RANGE: full | incremental` line in `ADO_FETCHER_RESULT_START/END`. +- A DEGRADED Notice (`kind: diff-range`, message: "Incremental diff unavailable — Coordinator will classify against the full PR diff with conservative downgrades.") when the fallback fires. + +The PRD B Coordinator changes (γ-downgrade rule that remaps `addressed` / `obsolete` to `pending` when `DIFF_RANGE=full`) consume this sentinel. PRD A only emits it. + +ADR 0004 ("incremental diff baseline") is amended in-place with a "Degraded baseline" subsection covering this rule. + +### Agent and orchestrator changes + +- **`.agents/ado-fetcher.md`** — three inline bash heredocs (Steps 2, 4a/work-items, 4-diff) replaced with `await import` calls to the three new helpers. `ADO_FETCHER_RESULT` output block grows two fields: `DIFF_RANGE` and `NOTICES`. +- **`.agents/ado-writer.md`** — accepts a new `NOTICES` input; renders the `## ⚠ Notices` block above the existing severity-grouped findings in the Summary content. No changes to write call sites in PRD A (those land in PRD B). +- **`commands/review-pr.md`** — parses `NOTICES` and `DIFF_RANGE` from `ADO_FETCHER_RESULT`; merges Notices via the `notices` helper; passes merged Notices to the ADO Writer prompt; emits Doc-Context EMPTY-BY-DESIGN info Notice when `WORK_ITEM_IDS=[]`; prints the mandatory end-of-run Trailer line. The 200-line cap from PRD-orchestrator-split is preserved by leaning on the new helpers (the bash side becomes uniform `if [ "$RESULT_OK" != "true" ]; then ...`). + +### Existing helpers, breaking-change check + +`parseIterations`, `parseWorkItemIds`, and any other affected helpers are verified to have zero consumers outside the `pr-review` plugin (`grep` across `apps/`, `packages/`, `docs/` returns no matches outside `apps/claude-code/pr-review/`). The discriminated-union refactor is therefore safe to land without a deprecation period. + +## Testing Decisions + +### What makes a good test + +Tests assert the external behaviour of each helper given controlled inputs — no implementation-detail inspection, no internal-branching tests. Inputs are plain JavaScript objects or short JSON fixtures. A test reads as a sentence: "given an HTTP 401, classifyHttpError returns the aborted tier." `node:test` built-in, `node:assert/strict`, no external deps. + +### Modules under test + +**New deep helpers (full unit-test coverage):** + +- `scripts/ado/classify-http-error.mjs` — one test per row of the canonical mapping (200, 201, 401, 403, 404, 409, 5xx, 400, 422, network/exit-code paths) plus the case where the body excerpt is malformed JSON. +- `scripts/ado/notices.mjs` — `createNotice` shape, `mergeNotices` dedup behaviour across multiple sources, the three `format…` renderers producing expected markdown / line shapes, `formatTrailer` for first-review / re-review / pre-pr / aborted modes. +- `scripts/ado/fetch-iterations.mjs` — happy path with one iteration, multiple iterations (returns the max), empty `value` array → `{ ok: false, reason: 'empty-iterations' }`, missing `value` key, malformed JSON, an ADO error response. +- `scripts/ado/fetch-work-items.mjs` — empty PR-work-item links → `{ ok: true, ids: [] }`, populated links, dedup of duplicate IDs (existing parseWorkItemIds invariant), null/missing response → `{ ok: false }`, ADO error response. + +The existing test files for `parseIterations` and `parseWorkItemIds` are subsumed — the fetch helpers replace them and inherit their fixtures. + +### Modules NOT under test in PRD A + +- Agent prompt content (`.agents/*.md`, `commands/review-pr.md`): no new string-match assertions. The existing pattern is flagged as brittle in `docs/inbox/pr-review-prompt-content-tests-brittleness.md` and behaviour is verified by integration smoke test against a real ADO PR after merge, per ADR 0013's testing posture. + +### Prior art + +`packages/release-tools/scripts/verify-changelog.test.mjs`, `packages/release-tools/scripts/bump-version.test.mjs`, and `apps/claude-code/pr-review/tests/parse-diff-hunks.test.mjs` (added in PR #29). Same style throughout. + +## Out of Scope + +- Coordinator and Writer changes (DIFF_RANGE consumption, γ-downgrade rule applied, HTTP-tier mapping applied to every write call site, `*.err` retention policy, `parseAdoWriterResult` discriminated-union refactor) — those land in **PRD B**. +- Pre-PR mode changes (`parseChangedFilesFromDiff` suspicious-shape Notice, default-branch fallback chain + Notice) — those land in **PRD B**. +- The integration smoke test against a real ADO PR — verification is manual, post-merge. +- Retries on transient HTTP errors — out of scope per the doctrine. Re-evaluate if 5xx Notices prove painful in practice. +- A canonical thread shape spanning ADO and GitHub — deferred per ADR 0013 until a second platform consumer exists. +- Lifting any helper from `scripts/ado/` to `pr-review-toolkit` — none of these helpers are platform-shared yet. + +## Further Notes + +**ADR 0014** (`apps/claude-code/pr-review/docs/adr/0014-failure-classification-helpers.md`) records the helper-layer refinement to ADR 0013. + +**ADR 0015** (`apps/claude-code/pr-review/docs/adr/0015-canonical-http-tier-mapping.md`) records the HTTP-tier mapping, the 401/403 abort rule, and the no-retries-in-v1 stance. + +**ADR 0004** (`apps/claude-code/pr-review/docs/adr/0004-incremental-diff-baseline.md`) is amended in-place with a "Degraded baseline" subsection covering the γ-downgrade rule that PRD B will implement on the consumer side. + +**`CONTEXT.md`** is already updated with the new terms (Notice, Notice Tier and its four states, Trailer). + +**Source:** the deferred items from the PR #29 multi-agent review, grilled against the domain doctrine over the conversation captured in this session. The originating inbox file (`docs/inbox/pr-review-ado-error-hardening-pass.md`) is removed once PRD A and PRD B are published. + +--- + +## Agent Brief + +> _This was generated by AI during triage._ + +**Category:** enhancement +**Summary:** Apply the four-tier Notice doctrine to the ADO Fetcher. Introduce three new deep helpers in `scripts/ado/` (`classify-http-error`, `notices`, plus `fetch-iterations` and `fetch-work-items` as discriminated-union refactors of the existing parsers). The Fetcher emits a `NOTICES` array and a `DIFF_RANGE` sentinel in its structured result block; the orchestrator merges Notices, passes them to the ADO Writer, prints a mandatory end-of-run Trailer line. The ADO Writer renders a `## ⚠ Notices` block above findings in the Review Summary. + +**Current behavior:** +ADO Fetcher reads silently swallow exit codes. An iterations-fetch failure produces `LATEST_ITERATION_ID=''`, drifting the Bot Signature to `Iteration ` (empty) and breaking re-review detection forever afterward. A work-item-fetch failure is indistinguishable from "no work items linked" — both produce `WORK_ITEM_IDS=[]`. A diff-range fallback to the full PR diff happens silently, causing the Coordinator to classify prior threads against the wrong range. None of these surface to the reviewer or the invoker. + +**Desired behavior:** +Every Fetcher operation terminates in one of four Notice Tiers (OK, EMPTY-BY-DESIGN, DEGRADED, ABORTED). Tier choice is the gating decision — no user prompts, AFK-friendly. Failures route to: + +- **ABORTED** for state-corrupting failures (empty iterations on a real PR, 401/403 on iteration fetch). Process exits non-zero with stderr message + Trailer aborted line. +- **DEGRADED** for failures the Review can still complete around (work-item fetch failed, diff-range fallback to full diff, 5xx on any read). Emits a `warning` Notice surfaced in the Review Summary. +- **EMPTY-BY-DESIGN** for legitimate empty states. Silent except for the Doc Context family (`WORK_ITEM_IDS=[]` → `info` Notice in the Summary). +- **OK** for normal completion. + +The four new helpers under `scripts/ado/` own the classification logic. Agent prompts shrink to `await import` + branch on `result.ok`. The Bot Signature is never signed with an empty Iteration ID again. + +**Key interfaces:** + +- `classifyHttpError({ status, body, exitCode }) → { tier, kind, message }` — canonical HTTP-tier mapping; consumed by PRD B. +- `createNotice / mergeNotices / formatNoticesAsSummaryBlock / formatNoticesAsPrePrPreamble / formatTrailer` from `scripts/ado/notices.mjs`. +- `fetchIterations(...) → { ok: true, latestIterationId, latestCommitSha } | { ok: false, reason }`. +- `fetchWorkItems(...) → { ok: true, ids } | { ok: false, reason }`. +- `ADO_FETCHER_RESULT` grows `DIFF_RANGE: full | incremental` and `NOTICES: [{severity, kind, message}, …]` fields. + +**Acceptance criteria:** + +- [ ] The four new helpers under `scripts/ado/` exist and pass their unit tests (`pnpm --filter pr-review test`). +- [ ] `parseIterations` / `parseWorkItemIds` are gone — the new fetch helpers fully subsume them; no consumer outside `pr-review` is broken (verified by `grep`). +- [ ] An iterations fetch that returns `value: []` aborts the run with a clear stderr message and a Trailer `❌ Review aborted: empty-iterations — …` line. +- [ ] A work-item fetch that fails with auth/5xx/network emits a DEGRADED Notice (`kind: work-items`); a work-item fetch that returns an empty list emits an `info` Notice (`kind: doc-context`). +- [ ] A diff-range fallback emits `DIFF_RANGE: full` in `ADO_FETCHER_RESULT` and a DEGRADED Notice (`kind: diff-range`). +- [ ] The orchestrator merges Notices, dedupes by `kind`, and passes them to the ADO Writer. +- [ ] The ADO Writer renders a `## ⚠ Notices` block above findings in first-review and re-review Summaries. +- [ ] Every successful run ends with a Trailer line in the Claude interface listing findings, notices, and the PR URL (ADO modes) or finding counts (Pre-PR mode). +- [ ] `commands/review-pr.md` remains ≤ 200 lines. +- [ ] ADR 0014 (helper layer), ADR 0015 (HTTP-tier mapping), and the in-place ADR 0004 amendment exist. +- [ ] `pnpm test` passes; `pnpm format` produces no diff; `pnpm check` reports zero warnings. + +**Out of scope:** + +- Coordinator and Writer changes — PRD B. +- Pre-PR mode changes — PRD B. +- Retries on transient HTTP errors. +- Integration smoke test (manual, post-merge). +- Lifting helpers to `pr-review-toolkit`. diff --git a/docs/issues/pr-review-platform-failure-handling/PRD.md b/docs/issues/pr-review-platform-failure-handling/PRD.md new file mode 100644 index 0000000..f02b2e0 --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/PRD.md @@ -0,0 +1,193 @@ +# PRD: pr-review — platform-failure handling + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Depends on:** `docs/issues/pr-review-ado-fetcher-reliability/PRD.md` (PRD A — must land first) + +--- + +## Problem Statement + +After PRD A lands, the four-tier Notice doctrine is established and the ADO Fetcher is correct, but the rest of the plugin still swallows platform failures in three places. The Re-review Coordinator silently downgrades to first-review mode when its match-finding helper throws (duplicating every prior thread); its PATCH-to-fixed call has a catch-all that only special-cases HTTP 409 (silently letting 401/403/5xx through as 200-char "warnings" no one reads). The ADO Writer's inline POST path captures stderr to `*.err` files but only logs them on cleanup — the actual failure text never reaches the user, and 401/403 are treated like recoverable per-finding failures. Pre-PR mode's diff parser returns `[]` for both empty diffs and malformed inputs, so a broken pipeline looks like a clean Review. The default-branch fallback chain still hardcodes `main` even though most Unic projects use Gitflow. + +These are not isolated bugs — they are the same doctrine PRD A established, not yet applied across the remaining surfaces. PRD B finishes the job by routing every ADO write call site through PRD A's canonical helpers, extending the Coordinator's classification helpers to honour `DIFF_RANGE`, and giving Pre-PR mode the same Notice surface that the ADO modes get. + +## Solution + +Apply the Notice Tier doctrine + helper-layer doctrine (both from PRD A) to the Re-review Coordinator, ADO Writer, and Pre-PR mode. The shared helpers (`classify-http-error`, `notices`, `parse-write-response`) own all classification; the agent prompts shrink to "call the helper, branch on `result.ok`". + +The Coordinator consumes the `DIFF_RANGE` sentinel PRD A emits: when `full`, the existing `classify-thread` helper downgrades verdicts that depend on diff position (`addressed`, `obsolete`) to the safer `pending`; `disputed` is unaffected. A DEGRADED Notice surfaces the downgrade. The Coordinator's per-finding match call wraps `match-finding` in try/catch and emits a DEGRADED Notice on throw, instead of silently treating the parse failure as "no match" and duplicating the thread. PATCH-to-fixed routes every response through the canonical HTTP-tier mapping — 401/403 abort the whole re-review, 5xx/network/other-4xx emit per-thread DEGRADED Notices. + +The ADO Writer routes every `az devops invoke` POST/PATCH through `parse-write-response` (a new pure helper composing PRD A's `classify-http-error` with response-`id` parsing). The H1 path (inline POST) inherits the canonical mapping retroactively — auth failures no longer log-and-continue, they abort. `*.err` files stream their content to stderr at the moment of failure (so the failure text is adjacent to the Notice that references it), then are unconditionally cleaned up. The `parseAdoWriterResult` helper is refactored to the discriminated-union shape, so the orchestrator can distinguish a missing result block (Writer crashed before printing) from a parsed-with-zero-findings outcome. + +Pre-PR mode gets the same Notice surface that PRD A wired for ADO modes. `parseChangedFilesFromDiff` detects a suspicious shape (non-empty input with `diff --git` headers but zero parsed paths) and emits a DEGRADED Notice via `buildPrePrContext`. The default-branch detection becomes a fallback chain (`git remote show origin` → `origin/develop` → `origin/main` → `origin/master` → ABORTED) implemented in a pure helper `scripts/pre-pr/detect-default-branch.mjs`, with a Notice that names the actually-used branch when any fallback level fires. + +## User Stories + +1. As a PR reviewer in re-review mode, I want the bot to never silently re-post a thread it already opened on a prior iteration, so that my PR's thread list does not accumulate duplicates. +2. As a PR reviewer, I want any HTTP 401 / 403 error from Azure DevOps during write-back to abort the Review with a clear stderr message naming `az devops login` as the remedy, so that the run does not silently complete with most threads missing. +3. As a PR reviewer, I want a per-thread DEGRADED Notice in the Review Summary listing every thread the bot tried to mark as fixed but couldn't (because of a 5xx or network blip), so that I can manually mark them fixed if appropriate. +4. As a PR reviewer in re-review mode with no incremental diff available, I want prior threads that would have been classified `addressed` or `obsolete` to instead be classified `pending`, so that I am never told a comment is resolved when the bot wasn't actually able to verify it. +5. As a developer running Pre-PR mode in a Gitflow-style project, I want the default-branch detection to try `origin/develop` before `origin/main`, so that the local diff is computed against the actual integration branch most of the time. +6. As a developer running Pre-PR mode, I want a Notice telling me which branch the bot diffed against when default-branch detection fell back, so that I can spot the case where it picked the wrong branch. +7. As a Plugin maintainer, I want the ADO Writer's existing H1 inline-POST path (which today logs auth failures and continues) to inherit the canonical HTTP-tier mapping introduced in PRD A, so that 401/403 abort the writer consistently with every other ADO write. +8. As a Plugin maintainer, I want `*.err` file contents to be visible at the moment of failure, not buried in a cleanup step, so that diagnosing a partial-success run does not require reaching for temp files that may have been deleted. +9. As a Plugin maintainer, I want the `parseAdoWriterResult` helper to distinguish "result block missing" (Writer crashed mid-run) from "result block parsed with zero findings" (legitimate zero outcome), so that the orchestrator can fail loud on the first case instead of silently reporting success. +10. As a PR reviewer, I want a Notice telling me when Pre-PR mode's diff parser detected `diff --git` headers but produced zero file paths, so that I can tell the "no files changed" message apart from "the pipeline broke". +11. As a Plugin maintainer, I want the Coordinator's match-finding error path to emit a DEGRADED Notice (`kind: thread-match`) when the helper throws on a parse error, so that the reviewer sees one warning instead of one silent duplicate posting. +12. As a developer reading the codebase, I want every ADO write call site (inline POST, summary POST, delta reply, completion marker, PATCH-to-fixed) to route through one shared helper, so that adding a new write call type in the future inherits the same HTTP-tier mapping for free. +13. As a Plugin maintainer, I want the existing classify-thread and match-finding tests to be extended with the new branches (diffRange parameter; throw on parse error), so that the new behaviour is verified at the helper boundary even though the agent prompts are not unit-tested. +14. As a developer running re-review mode, I want the partial-run check from H4 (already landed in PR #29) to keep its exit-code contract (`0` = found, `1` = not found, `2` = crash); PRD B does not modify that path. + +## Implementation Decisions + +### Foundation (from PRD A) + +All shared helpers (`scripts/ado/classify-http-error.mjs`, `scripts/ado/notices.mjs`) and the Notice flow + Trailer printing in the orchestrator are assumed in place. PRD B only adds consumers and one new shared helper (`parse-write-response.mjs`). The Notice tier doctrine, the canonical HTTP-tier mapping, the four-state classification, and the no-fifth-ASK-tier rule are all documented in PRD A's ADRs (0014, 0015) and the in-place ADR 0004 amendment. + +### New helpers + +- **`scripts/ado/parse-write-response.mjs`** — pure function composing PRD A's `classify-http-error` with response-`id` parsing. Returns `{ ok: true, id } | { ok: false, tier, kind, message }`. Consumed by every ADO write call site (inline POST, threadContext fallback, summary POST, delta reply, completion marker, PATCH-to-fixed). One shape, one classifier. +- **`scripts/pre-pr/detect-default-branch.mjs`** — pure function over an injectable `branchExists(name) → bool` tester. Walks the fallback chain `git remote show origin HEAD` → `origin/develop` → `origin/main` → `origin/master` → `{ branch: null }`. Returns `{ branch, source: 'remote-show' | 'develop-fallback' | 'main-fallback' | 'master-fallback' | 'none', notice?: Notice }`. The bash side wires the tester to `git rev-parse --verify --quiet`. ABORTED when all four fail. + +### Modified helpers + +- **`scripts/re-review/classify-thread.mjs`** — adds a `diffRange: 'full' | 'incremental'` parameter. When `diffRange === 'full'`, outputs that would be `addressed` or `obsolete` are remapped to `pending`; `disputed` is unaffected. Default `diffRange === 'incremental'` preserves today's behaviour. Single new branch, ~3 lines. +- **`scripts/re-review/match-finding.mjs`** — today returns `null` on no match. New contract: `null` continues to mean "legitimate no-match"; a thrown `Error` distinguishes a parse failure in the input. The Coordinator's per-finding call wraps in try/catch. +- **`scripts/ado-writer.mjs` (`parseAdoWriterResult`)** — discriminated-union refactor: `{ ok: true, summaryThreadId, findingsPosted } | { ok: false, reason: 'missing-block' | 'malformed' }`. Subsumes today's `null` return. +- **`scripts/pre-pr.mjs` (`buildPrePrContext`, `parseChangedFilesFromDiff`)** — `buildPrePrContext` return shape extends to `{ rawDiff, changedFiles, filteredFiles, notices: Notice[] }`. `parseChangedFilesFromDiff` detects suspicious shape (non-empty input with ≥ 1 `diff --git` header but zero parsed paths) and emits a DEGRADED Notice (`kind: diff-parse`). + +### Agent and orchestrator changes + +- **`.agents/re-review-coordinator.md`**: + - Consume `DIFF_RANGE` from `ADO_FETCHER_RESULT`; pass it to `classify-thread`. + - Wrap per-finding `match-finding` call in try/catch; on throw, push a DEGRADED Notice and continue to the next finding (do NOT add the unclassified prior thread to `freshFindings` — let it fall through naturally to a duplicate posting, but with a Notice surfacing the cause). + - Route PATCH-to-fixed responses through `parse-write-response`. Tier `aborted` → exit non-zero with the abort kind. Tier `degraded` → push a Notice (`kind: patch-to-fixed`) and continue to the next thread. + - Emit `NOTICES` array in `RE_REVIEW_COORDINATOR_RESULT_START/END` for the orchestrator to merge. +- **`.agents/ado-writer.md`**: + - Route every `az devops invoke` POST/PATCH (inline POST, threadContext-fallback, summary POST, delta reply, completion marker) through `parse-write-response`. + - H1 retroactive fix: the inline POST path inherits the canonical mapping — 401/403 abort the writer immediately, 5xx/network/other-4xx push a `warning` Notice and continue. + - Stream the `*.err` file content to stderr at the moment of failure, then unconditional `rm -f` in cleanup. No conditional retention. + - Emit `NOTICES` array in `ADO_WRITER_RESULT_START/END`. +- **`commands/review-pr.md` (Pre-PR mode)**: + - Wire `detect-default-branch.mjs` (via the existing helper-import pattern). On `branch: null`, abort with stderr message and Trailer aborted line. + - Use `buildPrePrContext().notices` to prepend the pre-findings Notices block in the Claude interface. + - Trailer line includes Pre-PR notice counts (already mandatory per PRD A). + - 200-line cap preserved. + +### Test-scope choice + +The user explicitly chose "NEW deep modules only" in the test-scope question during the grilling session. PRD B writes unit tests for the two new helpers (`parse-write-response`, `detect-default-branch`). The MODIFY helpers (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs`) get no new unit tests in this PRD; their existing test files stay frozen except for whatever fixture updates the new return shapes force. Behaviour change verification on the MODIFY helpers and on the agent prompts goes to the integration smoke test against a real ADO PR, per ADR 0013's stated testing posture. + +## Testing Decisions + +### What makes a good test + +Same as PRD A: tests assert the external behaviour of each helper given controlled inputs. Plain JS object or short JSON fixtures, sentence-shaped test names, `node:test` + `node:assert/strict`, no external deps. + +### Modules under test + +**New deep helpers (full unit-test coverage):** + +- `scripts/ado/parse-write-response.mjs` — happy path (`{ id: 12345 }` response), 401 → `{ ok: false, tier: 'aborted', kind: 'auth' }`, 5xx → `{ ok: false, tier: 'degraded' }`, 404 → `{ ok: true }` (domain-OK), 409 → `{ ok: true }`, malformed JSON body, network exit-code path, missing `id` field on otherwise-200 response. +- `scripts/pre-pr/detect-default-branch.mjs` — `git remote show` succeeds → no fallback Notice, `develop` exists → `develop-fallback` with Notice, only `main` exists → `main-fallback` with Notice, only `master` exists → `master-fallback` with Notice, nothing exists → ABORTED (no branch, no Notice — Trailer carries the abort), `branchExists` thrown exception → propagated. + +### Modules NOT under test in PRD B + +Per the user's choice during grilling: + +- `classify-thread.mjs` extension (`diffRange` parameter) — verified by integration smoke test. +- `match-finding.mjs` extension (throw-on-parse-error) — same. +- `parseAdoWriterResult` discriminated-union refactor — same. +- `pre-pr.mjs` suspicious-shape Notice — same. +- All agent prompt content (`.agents/*.md`, `commands/review-pr.md`) — same. + +### Prior art + +Same as PRD A: `packages/release-tools/scripts/verify-changelog.test.mjs`, `bump-version.test.mjs`, `apps/claude-code/pr-review/tests/parse-diff-hunks.test.mjs`. No external deps, no spawnSync, fixtures as inline JS objects. + +## Out of Scope + +- Anything PRD A delivers (helper layer, canonical HTTP mapping, ADRs, Fetcher fixes, orchestrator Notice merging + Trailer). +- Unit tests for MODIFY-only helpers (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs` suspicious-shape). +- Unit tests for agent prompt content. +- Retries on transient HTTP errors. +- The integration smoke test (manual, post-merge). +- A canonical thread shape spanning ADO and GitHub — deferred per ADR 0013. +- Changes to the four pre-existing re-review modules' interfaces (`detect-prior-review`, `parse-signature`) — only `classify-thread` and `match-finding` are modified, and only additively (new parameter / new throw path). +- Pre-PR mode informational notices for inherently-empty states beyond the Doc Context family — PRD A already capped that. + +## Further Notes + +**Dependency on PRD A:** PRD B cannot land before PRD A. The helper imports (`classify-http-error`, `notices`, `formatTrailer`), the orchestrator's Notice-merge step, the ADO Writer's `## ⚠ Notices` block rendering, and the Trailer line are all PRD A deliverables that PRD B's new consumers and modified call sites rely on. The two PRDs ship together as a coherent "platform-failure handling" feature; PRD A is the foundation, PRD B is the rollout. + +**Inbox file removal:** the originating `docs/inbox/pr-review-ado-error-hardening-pass.md` is deleted once PRD A and PRD B are published (per the inbox graduation flow documented in `docs/inbox/README.md`). + +**Source:** same grilling session as PRD A. See PRD A's "Further Notes" for the doctrine, ADR cross-references, and `CONTEXT.md` term additions. + +--- + +## Agent Brief + +> _This was generated by AI during triage._ + +**Category:** enhancement +**Summary:** Apply PRD A's four-tier Notice doctrine + helper-layer architecture to the remaining surfaces: Re-review Coordinator, ADO Writer (every write call site, including H1 retroactively), and Pre-PR mode. Adds two new deep helpers (`parse-write-response`, `detect-default-branch`), extends two re-review classifier helpers (`classify-thread` gets a `diffRange` parameter for γ-downgrade; `match-finding` throws instead of returning null on parse error), and refactors `parseAdoWriterResult` to the discriminated-union shape. Default-branch detection becomes a Gitflow-aware fallback chain (`develop` → `main` → `master`) with a Notice naming the actually-used branch. + +**Current behavior (after PRD A lands):** + +- Coordinator's per-finding `match-finding` call falls back to "no match" on Node parse error, silently duplicating prior threads. +- Coordinator's PATCH-to-fixed catch-all only special-cases HTTP 409; auth, 5xx, and network failures become 200-char `process.stdout.write` warnings that nothing reads. Threads stay open, the user is not told. +- ADO Writer's inline POST path (H1, from PR #29) treats 401/403 as recoverable per-finding failures — every subsequent inline POST in the same run also fails, but the user only sees "N findings posted" with N=0 or partial. +- ADO Writer's `*.err` files are unconditionally cleaned up at the end, destroying the only diagnostic for partial-success runs. +- `parseAdoWriterResult` returns `null` for both "Writer never printed a result block" and "Writer parsed but block was malformed", conflating crash with empty-success. +- Pre-PR `parseChangedFilesFromDiff` returns `[]` for both empty input and `diff --git`-bearing input that fails to parse — broken pipelines look like clean reviews. +- Pre-PR default-branch detection hardcodes `main` as the fallback, computing the diff against the wrong base on every Gitflow project. +- Coordinator and Writer ignore the new `DIFF_RANGE` sentinel PRD A emits. + +**Desired behavior:** + +- All five ADO write call sites in the Writer and Coordinator route through `parse-write-response` (composing PRD A's `classify-http-error` with response-`id` parsing). One canonical HTTP-tier mapping across the plugin. +- 401/403 anywhere in a Writer or Coordinator run aborts that run with a single stderr message + Trailer aborted line. +- Every per-thread / per-finding write failure that the canonical mapping classifies as DEGRADED pushes a `warning` Notice (`kind` = `inline-post` / `summary-post` / `patch-to-fixed`) that the orchestrator merges and the Writer renders in the Summary. +- Coordinator consumes `DIFF_RANGE: full | incremental`. When `full`, `classify-thread` downgrades `addressed` / `obsolete` outputs to `pending`; `disputed` is unaffected; a DEGRADED Notice (`kind: diff-range` is emitted by the Fetcher, so the Coordinator only consumes — the Notice is already in the merged array). +- Coordinator `match-finding` calls wrap in try/catch; on throw, push DEGRADED Notice (`kind: thread-match`) and let the finding fall through naturally — the reviewer sees one duplicate-and-Notice instead of one silent duplicate. +- Writer streams `*.err` content to stderr at the moment of failure; unconditional cleanup follows. +- `parseAdoWriterResult` returns the discriminated union; orchestrator fails-loud on `{ ok: false, reason: 'missing-block' }`. +- Pre-PR `parseChangedFilesFromDiff` detects suspicious-shape and emits a DEGRADED Notice (`kind: diff-parse`); `buildPrePrContext` returns the Notice array. +- Pre-PR `detect-default-branch.mjs` walks `develop` → `main` → `master`; emits a Notice naming the actually-used branch; aborts when none exists. + +**Key interfaces:** + +- `parseWriteResponse({ httpExit, responseText, errStream }) → { ok: true, id } | { ok: false, tier, kind, message }`. +- `detectDefaultBranch({ branchExists }) → { branch, source, notice? }`. +- `classifyThread({ ..., diffRange }) → { classification }` — new optional parameter with default `'incremental'`. +- `matchFinding(...) → { classification, threadId } | null` — now throws on parse error. +- `parseAdoWriterResult(...) → { ok: true, summaryThreadId, findingsPosted } | { ok: false, reason }`. +- `buildPrePrContext(rawDiff) → { rawDiff, changedFiles, filteredFiles, notices: Notice[] }`. + +**Acceptance criteria:** + +- [ ] PRD A is merged before PRD B starts. +- [ ] Every `az devops invoke` POST/PATCH in `.agents/ado-writer.md` and `.agents/re-review-coordinator.md` is routed through `parse-write-response`. +- [ ] 401 or 403 from any Writer or Coordinator HTTP call aborts the run with a clear stderr message and a Trailer aborted line. +- [ ] 5xx / network / other-4xx from any write call emits a DEGRADED Notice and continues; the Notice appears in the Review Summary. +- [ ] `classify-thread` accepts a `diffRange` parameter; when `'full'`, `addressed` / `obsolete` are remapped to `pending`; `disputed` unaffected. +- [ ] `match-finding` throws on parse error; the Coordinator's call site catches the throw and emits a DEGRADED Notice (`kind: thread-match`). +- [ ] `parseAdoWriterResult` returns a discriminated union; the orchestrator surfaces `{ ok: false, reason: 'missing-block' }` as an ABORTED run. +- [ ] `buildPrePrContext` returns a `notices: Notice[]` field; suspicious-shape diffs emit a DEGRADED Notice (`kind: diff-parse`). +- [ ] `detect-default-branch.mjs` exists, has unit tests covering the four fallback levels + the abort case, and the orchestrator wires it via injectable `branchExists`. +- [ ] Pre-PR mode aborts with a clear stderr message when none of `develop`, `main`, `master` exist. +- [ ] `*.err` content is streamed to stderr at the moment of failure; cleanup is unconditional. +- [ ] `commands/review-pr.md` remains ≤ 200 lines. +- [ ] `pnpm test` passes; `pnpm format` produces no diff; `pnpm check` reports zero warnings. +- [ ] `docs/inbox/pr-review-ado-error-hardening-pass.md` is removed. + +**Out of scope:** + +- Anything PRD A delivers. +- Retries on transient HTTP errors. +- Integration smoke test (manual, post-merge). +- Lifting helpers to `pr-review-toolkit`. +- Unit tests for MODIFY helpers (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs` suspicious-shape). From 38d8387a3ac67bbb37a8e3347f5d2858c5136054 Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 18:28:56 +0200 Subject: [PATCH 3/9] chore(triage): break PRD A and PRD B into 4 + 6 tracer-bullet slices MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Each slice is a thin vertical cut through every layer (helper → agent prompt → orchestrator → tests where applicable) and is demoable on its own. All 10 slices land with Status: needs-triage and are AFK-grabbable once their blocker has merged. PRD A — pr-review-ado-fetcher-reliability (foundation): 01 — End-to-end Notice pipeline via Doc-Context info Notice + ADR-0014 02 — classify-http-error + fetch-work-items refactor → DEGRADED tier + ADR-0015 03 — fetch-iterations refactor → ABORTED tier 04 — DIFF_RANGE sentinel + ADR-0004 amendment PRD B — pr-review-platform-failure-handling (consumer): 01 — parse-write-response helper + Writer applies HTTP-tier mapping to all writes + .err streaming 02 — parseAdoWriterResult discriminated-union refactor 03 — Coordinator consumes DIFF_RANGE → γ-downgrade in classify-thread 04 — Coordinator match-finding throws + DEGRADED Notice on catch 05 — Coordinator PATCH-to-fixed routed through parse-write-response 06 — Pre-PR Notice surface (suspicious-shape Notice + Gitflow-aware default-branch fallback) Dependency graph: A1 ──┬── A2 ── A3 │ └── B1 ── B5 ├── A4 ── B3 ├── B2 ├── B4 └── B6 After A1 lands, A2 / A4 / B2 / B4 / B6 can be picked up in parallel. Only A3 (after A2) and B5 (after B1) extend the critical path. Each slice file follows the to-issues template: Parent (PRD reference), What to build (end-to-end description), Acceptance criteria (checkbox list verifiable by pnpm test/check/verify:changelog), Blocked by (concrete file path of the blocker, or "None" for A1). Co-Authored-By: Claude Sonnet 4.6 --- .../01-end-to-end-notice-pipeline.md | 43 ++++++++++++++++ .../02-classify-http-error-and-work-items.md | 41 ++++++++++++++++ .../03-fetch-iterations-aborted-tier.md | 41 ++++++++++++++++ .../04-diff-range-sentinel.md | 37 ++++++++++++++ .../01-writer-http-tier-mapping.md | 49 +++++++++++++++++++ ...e-ado-writer-result-discriminated-union.md | 38 ++++++++++++++ ...-coordinator-diff-range-gamma-downgrade.md | 38 ++++++++++++++ .../04-coordinator-match-finding-throw.md | 39 +++++++++++++++ .../05-coordinator-patch-to-fixed-mapping.md | 40 +++++++++++++++ .../06-pre-pr-notice-surface.md | 40 +++++++++++++++ 10 files changed, 406 insertions(+) create mode 100644 docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md create mode 100644 docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md create mode 100644 docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md create mode 100644 docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md create mode 100644 docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md create mode 100644 docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md create mode 100644 docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md create mode 100644 docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md create mode 100644 docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md create mode 100644 docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md diff --git a/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md b/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md new file mode 100644 index 0000000..44fbd8a --- /dev/null +++ b/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md @@ -0,0 +1,43 @@ +# A1. End-to-end Notice pipeline via Doc-Context info Notice + ADR-0014 + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-ado-fetcher-reliability/PRD.md` + +## What to build + +Deliver the foundational Notice pipeline end-to-end with one concrete emission site: the Doc-Context `info` Notice when the PR has no linked work items. + +Implementation cuts through every layer: + +- **New helper** `scripts/ado/notices.mjs` — pure functions `createNotice`, `mergeNotices` (dedupe by `kind`), `formatNoticesAsSummaryBlock`, `formatNoticesAsPrePrPreamble`, `formatTrailer`. With unit tests. +- **ADO Fetcher prompt** — `ADO_FETCHER_RESULT_START/END` block gains a `NOTICES: [...]` field. When `WORK_ITEM_IDS=[]`, the Fetcher appends an `info` Notice (`kind: doc-context`, message: "Reviewed without business context — no work items linked to this PR."). +- **Orchestrator** — parses `NOTICES` from the Fetcher result, merges via the new helper (no-op at this stage but the merge wiring must exist), passes a merged `NOTICES_JSON` to the ADO Writer prompt. +- **ADO Writer prompt** — accepts the new `NOTICES_JSON` input; renders a `## Notices` block (with `ℹ️` for `info`, `⚠` for `warning`) above the severity-grouped findings in the Summary content. +- **Trailer** — orchestrator prints a mandatory end-of-run line in the Claude interface for every run (ADO modes, Pre-PR mode, aborted). Carries findings counts, notice counts, and (for ADO modes) the PR URL. +- **ADR-0014** — new `apps/claude-code/pr-review/docs/adr/0014-notice-tier-doctrine-and-failure-classification-helpers.md`. Records the four-tier doctrine (OK / EMPTY-BY-DESIGN / DEGRADED / ABORTED), the no-fifth-ASK-tier rule, and the JS-helper-layer refinement to ADR 0013. +- **CHANGELOG** — `[Unreleased]` Added entries for the new helper and ADR; Changed entries for the Fetcher result block, the orchestrator merge wiring, and the Summary rendering. +- **`commands/review-pr.md`** stays ≤ 200 lines. + +End-to-end demoable: invoke `/pr-review:review-pr` against an ADO PR with no linked work items. The Summary opens with `ℹ️ Reviewed without business context — no work items linked to this PR.` followed by the findings. The Claude interface ends with `✅ Review posted: findings · 0 warning notices · 1 info notice → `. + +## Acceptance criteria + +- [ ] `scripts/ado/notices.mjs` exists with all five exported functions and passes `pnpm --filter pr-review test`. +- [ ] `ADO_FETCHER_RESULT_START/END` block emits a `NOTICES` field (JSON array; empty array if no notices). +- [ ] Orchestrator parses, merges, and passes `NOTICES` to the ADO Writer prompt. +- [ ] ADO Writer Summary content renders a `## Notices` block above findings when `NOTICES` is non-empty; no block when empty. +- [ ] Doc-Context info Notice appears on a real ADO PR with no linked work items. +- [ ] End-of-run Trailer line is printed in the Claude interface for every run (success, abort, pre-PR). +- [ ] ADR-0014 exists at the documented path. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, and `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +None — can start immediately. diff --git a/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md b/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md new file mode 100644 index 0000000..c76bffa --- /dev/null +++ b/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md @@ -0,0 +1,41 @@ +# A2. `classify-http-error` + `fetch-work-items` refactor → DEGRADED tier + ADR-0015 + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-ado-fetcher-reliability/PRD.md` + +## What to build + +Wire the DEGRADED tier end-to-end by introducing the canonical HTTP-tier mapping and applying it to the ADO Fetcher's work-item fetch. + +Implementation cuts through every layer: + +- **New helper** `scripts/ado/classify-http-error.mjs` — pure function `({ status, body, exitCode }) → { tier: 'ok' | 'degraded' | 'aborted', kind, message }`. Encodes the canonical mapping (200/201/404/409 → ok; 401/403 → aborted; 5xx/network/4xx → degraded). With unit tests covering every row of the mapping table from PRD A's Implementation Decisions, plus malformed-body and network-exit-code paths. +- **New helper** `scripts/ado/fetch-work-items.mjs` — wraps the work-items fetch (currently inline in the Fetcher prompt). Returns `{ ok: true, ids } | { ok: false, reason, message }`. Subsumes today's `parseWorkItemIds`. Empty array on successful fetch → `{ ok: true, ids: [] }` (legitimate EMPTY-BY-DESIGN). Auth/5xx/network → `{ ok: false }` with a kind and message ready for Notice creation. With unit tests covering the discriminated-union behaviour. +- **ADO Fetcher prompt** — Step 5 (work-item fetch) refactored to `await import(...)` the new helper. On `{ ok: false }`, emit a DEGRADED Notice (`kind: work-items`, message from the helper) into the Fetcher's `NOTICES` array (the channel wired by A1). The Doc-Context info Notice from A1 still fires when `{ ok: true, ids: [] }`. +- **ADR-0015** — new `apps/claude-code/pr-review/docs/adr/0015-canonical-http-tier-mapping.md`. Records the HTTP-tier mapping, the 401/403 abort rule, and the no-retries-in-v1 stance. +- **CHANGELOG** — `[Unreleased]` Added entries for the two new helpers and ADR-0015; Changed entry for the Fetcher prompt's work-item step. + +Subsumption: today's `parseWorkItemIds` is removed (subsumed into `fetch-work-items.mjs`), and its existing test file is replaced by the new helper's tests. + +End-to-end demoable: invoke `/pr-review:review-pr` against an ADO PR while the local Azure CLI is unauthenticated (e.g. revoke the token). The Summary renders `## Notices` containing `⚠ work-items: Failed to fetch linked work items (auth error). Review proceeded without business context.` The Trailer reports `· 1 warning notice`. Restore auth and the same PR shows the Doc-Context info Notice from A1 (or no Notice at all if work items are populated). + +## Acceptance criteria + +- [ ] `scripts/ado/classify-http-error.mjs` exists with full HTTP-tier-mapping unit tests (≥ 10 cases). +- [ ] `scripts/ado/fetch-work-items.mjs` exists with discriminated-union return shape and unit tests; subsumes `parseWorkItemIds`. +- [ ] Old `parseWorkItemIds` exports and tests are removed. +- [ ] ADO Fetcher prompt's work-item step calls the new helper via `await import(...)`. +- [ ] A simulated work-item-fetch failure (e.g. wrong PROJECT name, revoked auth) produces a DEGRADED Notice with `kind: work-items` in the Summary. +- [ ] ADR-0015 exists at the documented path. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` diff --git a/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md b/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md new file mode 100644 index 0000000..c21b352 --- /dev/null +++ b/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md @@ -0,0 +1,41 @@ +# A3. `fetch-iterations` refactor → ABORTED tier + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-ado-fetcher-reliability/PRD.md` + +## What to build + +Wire the ABORTED tier end-to-end by refactoring the iterations fetch and removing the `iterationId=1` default that today silently violates CLAUDE.md. + +Implementation cuts through every layer: + +- **New helper** `scripts/ado/fetch-iterations.mjs` — wraps the iterations fetch and parse. Returns `{ ok: true, latestIterationId, latestCommitSha } | { ok: false, reason: 'empty-iterations' | 'auth' | 'transient' | 'malformed', message }`. Uses `classify-http-error` (from A2) for HTTP failures. Empty `value` array on a real PR → `{ ok: false, reason: 'empty-iterations' }`. 401/403 on the fetch → `{ ok: false, reason: 'auth' }`. With unit tests covering each branch. +- **ADO Fetcher prompt** — Step 2 (iterations fetch) refactored to `await import(...)` the new helper. On `{ ok: false }` of any kind, the prompt exits non-zero with a clear stderr message ("ERROR: . Try `az devops login` to re-authenticate." for auth; "ERROR: iterations endpoint returned empty value array. Cannot sign Review with a valid Iteration ID." for empty). +- **Orchestrator** — recognises a Fetcher non-zero exit and prints the Trailer aborted line in the Claude interface: `❌ Review aborted: `. No Summary is composed (the run never reaches the Writer). +- **Removed:** today's `parseIterations` helper exports and tests are subsumed into `fetch-iterations.mjs`. +- **CHANGELOG** — `[Unreleased]` Added entry for the new helper; Changed entry for the Fetcher prompt's iterations step; Fixed entry calling out the `iterationId=1` default removal. + +Subsumption: the existing `parseIterations` test fixtures move into the new helper's tests. The empty-input case is reclassified from "returns iterationId=1" to "returns `{ ok: false, reason: 'empty-iterations' }`" per the grilling decision. + +End-to-end demoable: invoke `/pr-review:review-pr` against a PR while the local `az devops login` is expired. The Claude interface ends with `❌ Review aborted: auth — Failed to fetch iterations (HTTP 401). Try \`az devops login\` to re-authenticate.` No Summary is posted. Restore auth and the same PR signs comments with the correct latest iteration. + +## Acceptance criteria + +- [ ] `scripts/ado/fetch-iterations.mjs` exists with discriminated-union return shape and unit tests (≥ 6 cases including empty value, 401, 5xx, malformed JSON, single iteration, multiple iterations). +- [ ] Old `parseIterations` exports and tests are removed. +- [ ] ADO Fetcher prompt's iterations step calls the new helper via `await import(...)`. +- [ ] A simulated empty-iterations response causes the run to ABORT with a clear stderr message and a Trailer aborted line. +- [ ] A simulated 401 on the iterations endpoint causes the same abort with `reason: auth` in the Trailer. +- [ ] No comment is ever signed with `Iteration ` (empty) or `Iteration 1` due to the empty-default path. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md` diff --git a/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md b/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md new file mode 100644 index 0000000..d560f8f --- /dev/null +++ b/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md @@ -0,0 +1,37 @@ +# A4. `DIFF_RANGE` sentinel + ADR-0004 amendment + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-ado-fetcher-reliability/PRD.md` + +## What to build + +Emit the `DIFF_RANGE` sentinel and the corresponding Notice when the Fetcher's existing diff-range fallback fires, and amend ADR 0004 in-place with the γ-downgrade rule that PRD B's Coordinator will consume. + +Implementation cuts through every layer: + +- **ADO Fetcher prompt** — Step 4 (raw diff) updated to emit `DIFF_RANGE: full | incremental` as a new field in the `ADO_FETCHER_RESULT_START/END` block. The value reflects which diff range was actually computed: `incremental` when the prior iteration's commit was reachable and the diff ran against `${PRIOR_COMMIT_SHA}..${LATEST_COMMIT_SHA}`; `full` when any fallback fired and the diff ran against `origin/${TARGET_BRANCH}...HEAD`. When `full`, the prompt also appends a DEGRADED Notice (`kind: diff-range`, message: "Incremental diff unavailable — Coordinator will classify against the full PR diff with conservative downgrades.") to the Fetcher's `NOTICES` array. +- **Orchestrator** — parses the new `DIFF_RANGE` field alongside the other Fetcher result fields. PRD A does not yet consume the value; PRD B (issue B3) will. +- **ADR 0004 amendment** — `apps/claude-code/pr-review/docs/adr/0004-incremental-diff-baseline.md` gets a new "Degraded baseline" subsection (in-place, not a separate ADR) documenting the rule: when `DIFF_RANGE=full`, the Coordinator MAY classify against the full diff but MUST downgrade `addressed` / `obsolete` outputs to `pending` and emit a DEGRADED Notice. Status of ADR 0004 stays `Accepted`; the amendment is additive. +- **CHANGELOG** — `[Unreleased]` Changed entry for the Fetcher result-block extension; Fixed entry for the diff-range fallback no longer being silent. + +End-to-end demoable: invoke `/pr-review:review-pr` against a PR where the prior iteration's commit has been force-pushed away (so the Fetcher's `git fetch origin "$PRIOR_COMMIT_SHA"` fails). The Summary opens with `⚠ diff-range: Incremental diff unavailable — Coordinator will classify against the full PR diff with conservative downgrades.` The Trailer reports `· 1 warning notice`. (Without PRD B's B3 landed, the Coordinator does not yet downgrade — that's B3's verification surface.) + +## Acceptance criteria + +- [ ] `ADO_FETCHER_RESULT_START/END` block emits a `DIFF_RANGE: full | incremental` field. +- [ ] When the diff-range fallback fires, the Fetcher's `NOTICES` array contains a `warning`-severity entry with `kind: diff-range`. +- [ ] When the incremental diff succeeds, `DIFF_RANGE=incremental` and no diff-range Notice is emitted. +- [ ] Orchestrator parses the new field (does not yet consume it — PRD B will). +- [ ] ADR 0004 has the "Degraded baseline" subsection appended in-place. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` diff --git a/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md b/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md new file mode 100644 index 0000000..144d972 --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md @@ -0,0 +1,49 @@ +# B1. `parse-write-response` helper + ADO Writer applies HTTP-tier mapping to all writes + `*.err` streaming + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-platform-failure-handling/PRD.md` + +## What to build + +Route every ADO write call site in the ADO Writer through one canonical helper. Apply the HTTP-tier mapping consistently. Fix the H1 retroactive auth gap and the `*.err` retention policy. + +Implementation cuts through every layer: + +- **New helper** `scripts/ado/parse-write-response.mjs` — pure function `({ httpExit, responseText, errStream }) → { ok: true, id } | { ok: false, tier, kind, message }`. Composes `classify-http-error` (from A2) with response-`id` parsing. Used by every ADO write call site. With unit tests covering happy path, 200/201 with valid `id`, 401, 403, 404, 409, 5xx, network exit-code, malformed JSON body, missing `id` field on 200 response. +- **ADO Writer prompt** — every `az devops invoke` POST/PATCH call site routed through the new helper: + - inline POST (Step 1) — including the threadContext-fallback path + - summary POST (Step 2 first-review) + - delta reply POST (Step 2 re-review) + - completion marker POST (Step 3) +- **Tier handling per call site:** + - `ok: true` → record the `id`, increment counters, continue (today's H1 behaviour, now formalised through the helper). + - `ok: false, tier: 'aborted'` (401/403) → emit stderr message ("ERROR: . Try `az devops login` to re-authenticate.") and exit non-zero. The orchestrator surfaces the abort in the Trailer. + - `ok: false, tier: 'degraded'` (5xx/network/4xx) → push a Notice (`kind: inline-post | summary-post | patch-to-fixed`-equivalent for delta/completion marker) to the Writer's `NOTICES` array, continue to next call site. +- **`*.err` retention policy** — at the moment of failure, stream the contents of the per-call `*.err` file to the Writer's stderr (so the failure text is adjacent to the Notice that references it). Cleanup step at the end of the Writer is unconditional — no retention based on counts. +- **Writer result block** — `ADO_WRITER_RESULT_START/END` gains a `NOTICES: [...]` array so the orchestrator can merge Writer-emitted notices with Fetcher-emitted notices for the Summary. +- **Orchestrator** — merges the Writer's `NOTICES` into the combined array passed to subsequent rendering steps if needed; the Trailer notice counts already reflect all merged notices per A1. +- **CHANGELOG** — `[Unreleased]` Added entry for `parse-write-response.mjs`; Changed entries for the Writer call sites; Fixed entry retroactively covering the H1 inline-POST auth gap and the `*.err` streaming policy. + +End-to-end demoable: invoke `/pr-review:review-pr` against a PR while the local `az devops login` token is revoked. The Claude interface ends with `❌ Review aborted: auth — ` after the first failing write. Restore auth, simulate a 5xx (e.g. malformed REPO_ID), and the Summary renders `## Notices` with `⚠ inline-post: Failed to post inline comment at /src/foo.ts:42 (HTTP 503).` plus the `*.err` content visible in stderr above the Notice. + +## Acceptance criteria + +- [ ] `scripts/ado/parse-write-response.mjs` exists with full unit-test coverage (≥ 10 cases). +- [ ] Every `az devops invoke` POST/PATCH in `.agents/ado-writer.md` routes through the new helper. +- [ ] 401 or 403 from any write call aborts the Writer with a clear stderr message; the orchestrator's Trailer line reads `❌ Review aborted: auth — ...`. +- [ ] 5xx, network, and other-4xx from any write call emits a DEGRADED Notice; the Writer continues to the next call site. +- [ ] `*.err` file content is streamed to stderr at the moment of failure; cleanup at the end is unconditional. +- [ ] `ADO_WRITER_RESULT_START/END` emits a `NOTICES` array. +- [ ] The H1 inline-POST path (from PR #29) inherits the canonical mapping — auth failures no longer log-and-continue. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md` diff --git a/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md b/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md new file mode 100644 index 0000000..86b28a5 --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md @@ -0,0 +1,38 @@ +# B2. `parseAdoWriterResult` discriminated-union refactor + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-platform-failure-handling/PRD.md` + +## What to build + +Refactor `parseAdoWriterResult` to a discriminated-union return shape so the orchestrator can distinguish "Writer crashed before printing its result block" from "Writer parsed successfully with zero findings posted." + +Implementation cuts through every layer: + +- **Refactor** `scripts/ado-writer.mjs` (`parseAdoWriterResult`) — return type becomes `{ ok: true, summaryThreadId, findingsPosted } | { ok: false, reason: 'missing-block' | 'malformed' }`. Today's `null` return is subsumed: missing `ADO_WRITER_RESULT_START` / `_END` → `{ ok: false, reason: 'missing-block' }`; present block with garbage inside → `{ ok: false, reason: 'malformed' }`; valid block → `{ ok: true, ... }`. +- **Update existing tests** — the existing `ado-writer.test.mjs` cases that asserted `null` are updated to the new return shape. Add cases for the two `ok: false` reasons explicitly. +- **Orchestrator** — `commands/review-pr.md` parses the Writer's result block via the refactored helper. On `{ ok: false }`, the orchestrator emits a stderr abort message ("ERROR: Writer did not return a valid result block (). The Summary may or may not have been posted; verify on ADO.") and prints a Trailer aborted line. On `{ ok: true }`, behaviour unchanged. +- **ADO Writer prompt** — the existing round-trip validation step (added in PR #29's H2 fix) is updated to assert against the new `{ ok: true }` shape rather than the old "non-null" shape. +- **CHANGELOG** — `[Unreleased]` Changed entry for the helper API; Fixed entry covering the silent-success bug on Writer crash. + +End-to-end demoable: inject a fault that crashes the Writer mid-Summary-post (e.g. corrupt `LATEST_ITERATION_ID` env var). The Claude interface ends with `❌ Review aborted: writer-missing-block — Writer did not return a valid result block. The Summary may or may not have been posted; verify on ADO.` instead of the misleading success Trailer that would have appeared with the old `null` behaviour. + +## Acceptance criteria + +- [ ] `parseAdoWriterResult` returns the discriminated-union shape; existing test file's `null` assertions are migrated to `{ ok: false }` assertions. +- [ ] At least two new test cases cover `reason: 'missing-block'` and `reason: 'malformed'`. +- [ ] Orchestrator branches on `result.ok` and emits an abort message + Trailer aborted line on `{ ok: false }`. +- [ ] ADO Writer prompt's round-trip validation step works against the new shape. +- [ ] No code path in the orchestrator relies on the old `null` return. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` diff --git a/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md b/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md new file mode 100644 index 0000000..0d713d1 --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md @@ -0,0 +1,38 @@ +# B3. Coordinator consumes `DIFF_RANGE` → γ-downgrade in `classify-thread` + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-platform-failure-handling/PRD.md` + +## What to build + +Extend `classify-thread` with the γ-downgrade rule and have the Re-review Coordinator pass the `DIFF_RANGE` sentinel (emitted by A4) into every thread classification call. + +Implementation cuts through every layer: + +- **`scripts/re-review/classify-thread.mjs`** — adds a `diffRange: 'full' | 'incremental'` parameter (default `'incremental'`, preserving today's behaviour). When `diffRange === 'full'`, the function remaps `addressed` → `pending` and `obsolete` → `pending`; `disputed` is unaffected (its derivation is reviewer-reply-based, not diff-position-based). The downgrade is a single new branch at the end of the existing classification flow. +- **Existing tests** — `scripts/re-review/classify-thread.test.mjs` gets new cases (the user-confirmed test scope for PRD B is "NEW deep modules only", but this is a behaviour change to a MODIFY module that ships with the slice — the new cases are minimal additions, not full new test files). +- **Re-review Coordinator prompt** — parses `DIFF_RANGE` from `ADO_FETCHER_RESULT` (which A4 already emits). Threads the value into every `classify-thread` invocation in Step 5 of the Coordinator. The Notice surfacing the downgrade is already emitted by the Fetcher in A4; the Coordinator does not emit a duplicate. +- **CHANGELOG** — `[Unreleased]` Changed entry for the classify-thread parameter; Fixed entry covering the previously-silent classification against a full-diff fallback. + +End-to-end demoable: trigger A4's diff-range fallback (force-push away the prior iteration's commit on a re-review). The Summary opens with `⚠ diff-range: Incremental diff unavailable...` (emitted by A4), and the thread classifications visibly downgrade — what would have been `addressed` or `obsolete` is now `pending`. The reviewer sees one Notice + one consistently-conservative classification, instead of false-confidence verdicts. + +## Acceptance criteria + +- [ ] `classify-thread` accepts a `diffRange` parameter; default is `'incremental'`. +- [ ] When `diffRange === 'full'`, outputs `addressed` and `obsolete` are remapped to `pending`; `disputed` is unaffected. +- [ ] At least two new test cases in `classify-thread.test.mjs` cover the downgrade branches. +- [ ] Re-review Coordinator parses `DIFF_RANGE` from `ADO_FETCHER_RESULT` and passes it to every classify-thread call. +- [ ] On a synthetic full-diff fallback, no thread is classified as `addressed` or `obsolete` purely from diff position. +- [ ] No duplicate diff-range Notice is emitted by the Coordinator (the Fetcher's Notice from A4 is the only one). +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md` diff --git a/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md b/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md new file mode 100644 index 0000000..78771bf --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md @@ -0,0 +1,39 @@ +# B4. Coordinator `match-finding` throws + DEGRADED Notice on catch + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-platform-failure-handling/PRD.md` + +## What to build + +Change `match-finding` to throw on parse errors (distinguishable from `null` = legitimate no-match), and have the Re-review Coordinator catch the throw and surface a DEGRADED Notice instead of silently treating it as no-match (which today causes the same finding to be re-posted as a duplicate inline thread). + +Implementation cuts through every layer: + +- **`scripts/re-review/match-finding.mjs`** — today returns `null` on no match. New contract: `null` continues to mean "legitimate no-match"; a thrown `Error` distinguishes a parse failure in the input. Internal `JSON.parse` calls and helper-call boundaries that previously swallowed errors now propagate them. +- **Existing tests** — `scripts/re-review/match-finding.test.mjs` gets new cases asserting that malformed input throws (rather than returns `null`). Existing "no match → null" cases unchanged. +- **Re-review Coordinator prompt** — Step 6a (per-finding match) wraps the `match-finding` call in try/catch. On throw, append a DEGRADED Notice to the Coordinator's `NOTICES` array (`kind: thread-match`, message: "Could not classify finding at : — falling back to no-match.") and let the finding fall through naturally to the unclassified path (which today causes duplicate posting — but now with the Notice surfacing the cause to the reviewer). +- **Coordinator result block** — `RE_REVIEW_COORDINATOR_RESULT_START/END` gains a `NOTICES: [...]` field (similar to the Fetcher's and the Writer's). The orchestrator merges Coordinator notices with Fetcher and Writer notices into the combined Summary block. +- **CHANGELOG** — `[Unreleased]` Changed entry for match-finding's new throw semantics; Fixed entry covering the silent-duplicate-posting bug. + +End-to-end demoable: inject a synthetic match-finding parse failure (e.g. corrupt the prior threads JSON for one specific thread). The reviewer sees the finding re-posted as a new inline thread (today's behaviour) AND a `⚠ thread-match: Could not classify finding at /src/foo.ts:42 — falling back to no-match.` Notice in the Summary, instead of just the silent duplicate. + +## Acceptance criteria + +- [ ] `match-finding` throws on input parse errors; returns `null` only for legitimate no-match. +- [ ] At least two new test cases cover the throw paths. +- [ ] Coordinator wraps the per-finding match call in try/catch and emits a DEGRADED Notice (`kind: thread-match`) on throw. +- [ ] Coordinator's result block emits a `NOTICES` array. +- [ ] Orchestrator merges Coordinator notices into the combined Notice block. +- [ ] On a synthetic match-finding parse failure, both the duplicate posting and the Notice appear in the Summary. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` diff --git a/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md b/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md new file mode 100644 index 0000000..e2c57e8 --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md @@ -0,0 +1,40 @@ +# B5. Coordinator PATCH-to-fixed routed through `parse-write-response` + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-platform-failure-handling/PRD.md` + +## What to build + +Apply the canonical HTTP-tier mapping to the Re-review Coordinator's PATCH-to-fixed call site, replacing the existing 409-only catch-all with the same uniform error handling every other ADO write surface now uses. + +Implementation cuts through every layer: + +- **Re-review Coordinator prompt** — Step 5's PATCH-to-fixed call site (where the Coordinator marks an `addressed` thread as fixed by PATCHing its `status`) is refactored to capture exit code, response body, and stderr, then route them through `parse-write-response.mjs` (from B1). +- **Tier handling for PATCH-to-fixed:** + - `ok: true` → thread successfully marked fixed; continue. + - `ok: false, tier: 'aborted'` (401/403) → emit stderr message and exit the Coordinator non-zero. The orchestrator surfaces the abort in the Trailer. + - `ok: false, tier: 'degraded'` (5xx/network/other-4xx) → push a per-thread Notice (`kind: patch-to-fixed`, message: "Could not mark thread as fixed (HTTP ). Thread remains active and will be re-evaluated on next re-review.") to the Coordinator's `NOTICES` array, continue to the next thread. +- **Special-cases preserved by the canonical mapping** — 404 (thread deleted) and 409 (state already changed) both map to `ok: true` in `classify-http-error`, so the Coordinator continues silently for those, matching today's behaviour and the user's intent. +- **CHANGELOG** — `[Unreleased]` Changed entry for the Coordinator's PATCH-to-fixed call site; Fixed entry covering the silent-failure auth gap (401/403 used to be a "PATCH warning" string on stdout that nothing read). + +End-to-end demoable: run a re-review against a PR whose threads include at least one `addressed` candidate, while the local `az devops login` is revoked. The Claude interface ends with `❌ Review aborted: auth — Could not mark thread N as fixed (HTTP 401). Try \`az devops login\` to re-authenticate.`Restore auth and simulate a 5xx (e.g. throttling), and the Summary renders`## Notices`with`⚠ patch-to-fixed: Could not mark thread N as fixed (HTTP 503). Thread remains active and will be re-evaluated on next re-review.` plus a passing re-review. + +## Acceptance criteria + +- [ ] Coordinator's PATCH-to-fixed call routes through `parse-write-response.mjs`. +- [ ] 401 or 403 from any PATCH-to-fixed aborts the Coordinator with a clear stderr message; the orchestrator's Trailer line reads `❌ Review aborted: auth — ...`. +- [ ] 5xx / network / other-4xx from any PATCH-to-fixed emits a per-thread DEGRADED Notice; the Coordinator continues to the next thread. +- [ ] 404 and 409 continue silently (canonical OK tier). +- [ ] The old 409-only catch-all is removed from the Coordinator prompt. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md` diff --git a/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md b/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md new file mode 100644 index 0000000..e835d8f --- /dev/null +++ b/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md @@ -0,0 +1,40 @@ +# B6. Pre-PR Notice surface: suspicious-shape Notice + Gitflow-aware default-branch fallback + +**Status:** needs-triage +**Category:** enhancement +**Plugin:** `apps/claude-code/pr-review` +**Type:** AFK + +## Parent + +`docs/issues/pr-review-platform-failure-handling/PRD.md` + +## What to build + +Give Pre-PR mode the same Notice surface as ADO modes. Detect malformed diff inputs. Replace the hardcoded `main` fallback with a Gitflow-aware fallback chain. + +Implementation cuts through two helpers and the orchestrator's Pre-PR steps: + +- **`scripts/pre-pr.mjs` (`buildPrePrContext` + `parseChangedFilesFromDiff`)** — return shape of `buildPrePrContext` extended to `{ rawDiff, changedFiles, filteredFiles, notices: Notice[] }`. `parseChangedFilesFromDiff` detects suspicious shape: non-empty input that contains ≥ 1 `diff --git` header but produces zero parsed paths. When detected, `buildPrePrContext` pushes a DEGRADED Notice (`kind: diff-parse`, message: "Pre-PR diff parsed to zero files but contained diff headers — input may be malformed.") to the returned `notices` array. Existing test file extended with cases covering the suspicious-shape detection. +- **New helper** `scripts/pre-pr/detect-default-branch.mjs` — pure function `({ branchExists }) → { branch: string | null, source: 'remote-show' | 'develop-fallback' | 'main-fallback' | 'master-fallback' | 'none', notice?: Notice }`. The function tries (in order): the value from `git remote show origin HEAD branch` if non-empty; then `origin/develop`; then `origin/main`; then `origin/master`. Returns `{ branch: null, source: 'none' }` when nothing exists. Emits a `warning` Notice (`kind: default-branch`, message: `"Default branch not detected via remote-show; computed diff against origin/ ()."`) when any fallback level fires. With unit tests covering each branch. +- **`commands/review-pr.md` (Pre-PR Step A)** — wires `detect-default-branch.mjs` via the same `await import(...)` pattern as other helpers. The bash side passes an injected `branchExists(name)` implementation that runs `git rev-parse --verify --quiet "refs/remotes/origin/$name"`. On `branch: null`, the orchestrator emits a stderr message and prints the Trailer aborted line; on any fallback level, the Notice is pushed into the Pre-PR `notices` array. +- **`commands/review-pr.md` (Pre-PR Step B + E)** — Step B uses `buildPrePrContext().notices` and merges them with the default-branch Notice. Step E prints all Notices before findings (per PRD A's pre-PR contract), then the findings, then the Trailer line (which already includes notice counts). +- **CHANGELOG** — `[Unreleased]` Added entry for `detect-default-branch.mjs`; Changed entry for `buildPrePrContext` return shape; Fixed entries for the suspicious-shape detection and the Gitflow-aware fallback. + +End-to-end demoable: run `/pr-review:review-pr` (no URL) in a Gitflow project where `origin/develop` exists but the local `git remote show origin` is offline (e.g. break the remote). The Claude interface prints `⚠ default-branch: Default branch not detected via remote-show; computed diff against origin/develop (develop-fallback).` then the findings, then the Trailer. Repeat in a trunk-only project (no develop branch) → fallback message names `main`. Repeat in a project with no `develop`, `main`, or `master` → `❌ Review aborted: default-branch — No detectable default branch on origin (tried develop, main, master). Specify a base manually.` + +## Acceptance criteria + +- [ ] `buildPrePrContext` returns a `notices: Notice[]` field. +- [ ] `parseChangedFilesFromDiff` suspicious-shape detection emits a DEGRADED Notice (`kind: diff-parse`) via `buildPrePrContext`. +- [ ] `scripts/pre-pr/detect-default-branch.mjs` exists with full unit-test coverage (≥ 6 cases). +- [ ] The fallback chain order is `remote-show` → `origin/develop` → `origin/main` → `origin/master` → `none`. +- [ ] Any fallback level fires a `warning` Notice (`kind: default-branch`) that names the actually-used branch. +- [ ] `none` aborts the run with a clear stderr message and a Trailer aborted line. +- [ ] Pre-PR mode prints all Notices before findings. +- [ ] `commands/review-pr.md` is ≤ 200 lines. +- [ ] `pnpm format`, `pnpm check`, `pnpm --filter pr-review test`, `pnpm --filter pr-review verify:changelog` all pass. + +## Blocked by + +`docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` From 2e1f7b58963fc80f06c93ac3cd544914d11a77fa Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 18:32:17 +0200 Subject: [PATCH 4/9] chore(triage): move all 10 PRD A + PRD B slices to ready-for-agent Each slice was fully grilled to lock in the doctrine, scope, and module shape during the conversational /grill-with-docs session of 2026-05-13. All acceptance criteria are precise and verifiable via the standard pnpm test/check/verify:changelog loop. No outstanding questions on any slice; no slice needs human implementation (all AFK). Triage Notes appended per file map each slice to the specific grilling questions that produced its locked decisions, so an AFK agent picking up a slice tomorrow can trace the rationale without re-reading the full conversation. Co-Authored-By: Claude Sonnet 4.6 --- .../01-end-to-end-notice-pipeline.md | 10 +++++++++- .../02-classify-http-error-and-work-items.md | 10 +++++++++- .../03-fetch-iterations-aborted-tier.md | 10 +++++++++- .../04-diff-range-sentinel.md | 10 +++++++++- .../01-writer-http-tier-mapping.md | 10 +++++++++- .../02-parse-ado-writer-result-discriminated-union.md | 10 +++++++++- .../03-coordinator-diff-range-gamma-downgrade.md | 10 +++++++++- .../04-coordinator-match-finding-throw.md | 10 +++++++++- .../05-coordinator-patch-to-fixed-mapping.md | 10 +++++++++- .../06-pre-pr-notice-surface.md | 10 +++++++++- 10 files changed, 90 insertions(+), 10 deletions(-) diff --git a/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md b/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md index 44fbd8a..3c1c342 100644 --- a/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md +++ b/docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md @@ -1,6 +1,6 @@ # A1. End-to-end Notice pipeline via Doc-Context info Notice + ADR-0014 -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -41,3 +41,11 @@ End-to-end demoable: invoke `/pr-review:review-pr` against an ADO PR with no lin ## Blocked by None — can start immediately. + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q2 four-tier doctrine (OK / EMPTY-BY-DESIGN / DEGRADED / ABORTED, no fifth ASK tier), Q3 Doc-Context info-Notice carve-out, Q4 Option A Notice flow (per-agent `NOTICES` array, orchestrator merges with `kind`-based dedup, Writer renders `## Notices` block above findings), and the mandatory Trailer convention. ADR-0014 content is fully specified in PRD A's "Implementation Decisions → Helper layer". No outstanding questions; ready for an AFK agent to implement. diff --git a/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md b/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md index c76bffa..08f417f 100644 --- a/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md +++ b/docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md @@ -1,6 +1,6 @@ # A2. `classify-http-error` + `fetch-work-items` refactor → DEGRADED tier + ADR-0015 -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -39,3 +39,11 @@ End-to-end demoable: invoke `/pr-review:review-pr` against an ADO PR while the l ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q7 canonical HTTP-tier mapping (200/201/404/409 → OK; 401/403 → ABORTED; 5xx/network/4xx → DEGRADED; no retries in v1). Helper-API discriminated-union refactor for `parseWorkItemIds` verified breaking-change-free (zero consumers outside the plugin, confirmed by `grep` across `apps/`, `packages/`, `docs/`). ADR-0015 content is fully specified in PRD A's "Implementation Decisions → Canonical HTTP-tier mapping". No outstanding questions. diff --git a/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md b/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md index c21b352..127e5b0 100644 --- a/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md +++ b/docs/issues/pr-review-ado-fetcher-reliability/03-fetch-iterations-aborted-tier.md @@ -1,6 +1,6 @@ # A3. `fetch-iterations` refactor → ABORTED tier -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -39,3 +39,11 @@ End-to-end demoable: invoke `/pr-review:review-pr` against a PR while the local ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q5 (empty-iterations reclassification — `value: []` on a real PR is ABORTED, including for merged PRs per the explicit "record-keeping reviews are not supported on PRs with missing iteration history" caveat). The `iterationId=1` default that currently violates CLAUDE.md is removed. `parseIterations` is subsumed; verified breaking-change-free. No outstanding questions. diff --git a/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md b/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md index d560f8f..3e0d69a 100644 --- a/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md +++ b/docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md @@ -1,6 +1,6 @@ # A4. `DIFF_RANGE` sentinel + ADR-0004 amendment -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -35,3 +35,11 @@ End-to-end demoable: invoke `/pr-review:review-pr` against a PR where the prior ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q6 (sentinel naming `DIFF_RANGE: full | incremental` chosen over the boolean alternative for forward-compat with future range types; in-place amendment to ADR 0004 rather than a new ADR-0015a — the amendment is additive). Option γ (the γ-downgrade rule) is implemented in PRD B issue B3, not here; A4 only emits the sentinel and Notice. No outstanding questions. diff --git a/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md b/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md index 144d972..881c51e 100644 --- a/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md +++ b/docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md @@ -1,6 +1,6 @@ # B1. `parse-write-response` helper + ADO Writer applies HTTP-tier mapping to all writes + `*.err` streaming -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -47,3 +47,11 @@ End-to-end demoable: invoke `/pr-review:review-pr` against a PR while the local ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/02-classify-http-error-and-work-items.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q7 canonical HTTP-tier mapping applied to every Writer call site; Q8(b) `*.err` retention policy (stream-to-stderr at moment of failure, unconditional cleanup, rejected the conditional-retention alternative). H1's per-finding LOG-AND-CONTINUE pattern (from PR #29) is preserved for the per-thread cases but extended with the 401/403 → ABORT escalation — the inbox flagged this consistency gap and grilling confirmed the escalation. No outstanding questions. diff --git a/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md b/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md index 86b28a5..fded2f3 100644 --- a/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md +++ b/docs/issues/pr-review-platform-failure-handling/02-parse-ado-writer-result-discriminated-union.md @@ -1,6 +1,6 @@ # B2. `parseAdoWriterResult` discriminated-union refactor -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -36,3 +36,11 @@ End-to-end demoable: inject a fault that crashes the Writer mid-Summary-post (e. ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q8 (mechanical batch — discriminated-union refactor distinguishes Writer-crash from zero-success; `null` return was conflating the two cases). Verified breaking-change-free: zero consumers outside `apps/claude-code/pr-review/`. No outstanding questions. diff --git a/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md b/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md index 0d713d1..cd712ac 100644 --- a/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md +++ b/docs/issues/pr-review-platform-failure-handling/03-coordinator-diff-range-gamma-downgrade.md @@ -1,6 +1,6 @@ # B3. Coordinator consumes `DIFF_RANGE` → γ-downgrade in `classify-thread` -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -36,3 +36,11 @@ End-to-end demoable: trigger A4's diff-range fallback (force-push away the prior ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/04-diff-range-sentinel.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q6 Option γ — when `DIFF_RANGE=full`, `addressed` and `obsolete` outputs from `classify-thread` are remapped to `pending`; `disputed` is unaffected (its derivation is reviewer-reply-based, not diff-position-based). Option α (silent continuation) and Option β (skip classification entirely) were both rejected — γ preserves classifications the Coordinator can still make confidently while defaulting diff-position-derived verdicts to the safer state. No outstanding questions. diff --git a/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md b/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md index 78771bf..edea2be 100644 --- a/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md +++ b/docs/issues/pr-review-platform-failure-handling/04-coordinator-match-finding-throw.md @@ -1,6 +1,6 @@ # B4. Coordinator `match-finding` throws + DEGRADED Notice on catch -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -37,3 +37,11 @@ End-to-end demoable: inject a synthetic match-finding parse failure (e.g. corrup ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q8(a) — throw-on-parse-error in `match-finding`; try/catch in the Coordinator; on throw the finding falls through to natural duplicate posting but with a DEGRADED Notice (`kind: thread-match`) surfacing the cause. Aborting the whole Coordinator was considered and rejected because parse errors are local to one finding-thread pair. No outstanding questions. diff --git a/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md b/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md index e2c57e8..d563f3a 100644 --- a/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md +++ b/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md @@ -1,6 +1,6 @@ # B5. Coordinator PATCH-to-fixed routed through `parse-write-response` -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -38,3 +38,11 @@ End-to-end demoable: run a re-review against a PR whose threads include at least ## Blocked by `docs/issues/pr-review-platform-failure-handling/01-writer-http-tier-mapping.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q7 canonical HTTP-tier mapping applied to the Coordinator's PATCH-to-fixed (the only HTTP write surface in the Coordinator). 404 and 409 stay OK per the canonical mapping (today's 409-only catch-all behaviour is preserved; 404 is now also OK because a deleted thread is a domain success). 401/403 abort the Coordinator with the same stderr+Trailer contract as the Writer. No outstanding questions. diff --git a/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md b/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md index e835d8f..7b45932 100644 --- a/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md +++ b/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md @@ -1,6 +1,6 @@ # B6. Pre-PR Notice surface: suspicious-shape Notice + Gitflow-aware default-branch fallback -**Status:** needs-triage +**Status:** ready-for-agent **Category:** enhancement **Plugin:** `apps/claude-code/pr-review` **Type:** AFK @@ -38,3 +38,11 @@ End-to-end demoable: run `/pr-review:review-pr` (no URL) in a Gitflow project wh ## Blocked by `docs/issues/pr-review-ado-fetcher-reliability/01-end-to-end-notice-pipeline.md` + +--- + +## Triage Notes + +> _This was generated by AI during triage._ + +Locked during the `/grill-with-docs` session of 2026-05-13: Q8(c) suspicious-shape doctrine (non-empty input with `diff --git` headers but zero parsed paths → DEGRADED Notice); Q8(d) + user follow-up — Gitflow-aware fallback chain (`remote-show` → `origin/develop` → `origin/main` → `origin/master` → ABORT) replacing the hardcoded `main` fallback, with a Notice naming the actually-used branch. `detect-default-branch.mjs` takes an injectable `branchExists` tester for cross-platform unit testability. No outstanding questions. From 117ff096fc3eea1966a2150183e3419a450f1dea Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 19:03:22 +0200 Subject: [PATCH 5/9] docs(pr-review): unify Notices heading to bare `## Notices` (Copilot K1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PR #30 had `## ⚠ Notices` in PRD A's Notice-flow prose but `## Notices` in the slice files — Copilot flagged the inconsistency on PRD A:49. Reconcile to the bare heading. Justification: a mixed `info` + `warning` Notices list would force a single heading emoji to misrepresent one tier; per-item emoji prefixes (`ℹ️` for `info`, `⚠` for `warning`) correctly distinguish them without polluting the heading. Add an explicit doctrine sentence to PRD A's "Notice flow" subsection so future contributors don't relitigate the choice. Co-Authored-By: Claude Sonnet 4.6 --- docs/issues/pr-review-ado-fetcher-reliability/PRD.md | 8 ++++---- docs/issues/pr-review-platform-failure-handling/PRD.md | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/issues/pr-review-ado-fetcher-reliability/PRD.md b/docs/issues/pr-review-ado-fetcher-reliability/PRD.md index c09bc4d..8456eff 100644 --- a/docs/issues/pr-review-ado-fetcher-reliability/PRD.md +++ b/docs/issues/pr-review-ado-fetcher-reliability/PRD.md @@ -46,7 +46,7 @@ EMPTY-BY-DESIGN is silent for most states. The Doc Context family is the one exc ### Notice flow -Each orchestration agent emits a `NOTICES` JSON array as a new field in its structured result block. The orchestrator parses, merges (with `kind`-based deduplication), and passes the merged array to the ADO Writer alongside `FINDINGS`. The ADO Writer renders a `## ⚠ Notices` block above the findings in the Review Summary content. +Each orchestration agent emits a `NOTICES` JSON array as a new field in its structured result block. The orchestrator parses, merges (with `kind`-based deduplication), and passes the merged array to the ADO Writer alongside `FINDINGS`. The ADO Writer renders a `## Notices` block above the findings in the Review Summary content. The heading stays bare (no emoji) so a mixed `info` + `warning` Notices list does not require the heading emoji to misrepresent one of the tiers; each list item carries its own per-Notice emoji prefix (`ℹ️` for `info`, `⚠` for `warning`). Notice shape: `{ severity: "info" | "warning", kind: , message: string }`. `kind` is a small enum (`doc-context`, `diff-range`, `work-items`, `iterations`, `default-branch`, `partial-run-check`, `thread-match`, `thread-classify`, `inline-post`, `summary-post`, `patch-to-fixed`, `diff-parse`); rejected: free-form strings, severity-coded numerics. ABORTED never reaches the Notice channel — its surface is stderr + the Trailer. @@ -100,7 +100,7 @@ ADR 0004 ("incremental diff baseline") is amended in-place with a "Degraded base ### Agent and orchestrator changes - **`.agents/ado-fetcher.md`** — three inline bash heredocs (Steps 2, 4a/work-items, 4-diff) replaced with `await import` calls to the three new helpers. `ADO_FETCHER_RESULT` output block grows two fields: `DIFF_RANGE` and `NOTICES`. -- **`.agents/ado-writer.md`** — accepts a new `NOTICES` input; renders the `## ⚠ Notices` block above the existing severity-grouped findings in the Summary content. No changes to write call sites in PRD A (those land in PRD B). +- **`.agents/ado-writer.md`** — accepts a new `NOTICES` input; renders the `## Notices` block above the existing severity-grouped findings in the Summary content. No changes to write call sites in PRD A (those land in PRD B). - **`commands/review-pr.md`** — parses `NOTICES` and `DIFF_RANGE` from `ADO_FETCHER_RESULT`; merges Notices via the `notices` helper; passes merged Notices to the ADO Writer prompt; emits Doc-Context EMPTY-BY-DESIGN info Notice when `WORK_ITEM_IDS=[]`; prints the mandatory end-of-run Trailer line. The 200-line cap from PRD-orchestrator-split is preserved by leaning on the new helpers (the bash side becomes uniform `if [ "$RESULT_OK" != "true" ]; then ...`). ### Existing helpers, breaking-change check @@ -160,7 +160,7 @@ The existing test files for `parseIterations` and `parseWorkItemIds` are subsume > _This was generated by AI during triage._ **Category:** enhancement -**Summary:** Apply the four-tier Notice doctrine to the ADO Fetcher. Introduce three new deep helpers in `scripts/ado/` (`classify-http-error`, `notices`, plus `fetch-iterations` and `fetch-work-items` as discriminated-union refactors of the existing parsers). The Fetcher emits a `NOTICES` array and a `DIFF_RANGE` sentinel in its structured result block; the orchestrator merges Notices, passes them to the ADO Writer, prints a mandatory end-of-run Trailer line. The ADO Writer renders a `## ⚠ Notices` block above findings in the Review Summary. +**Summary:** Apply the four-tier Notice doctrine to the ADO Fetcher. Introduce three new deep helpers in `scripts/ado/` (`classify-http-error`, `notices`, plus `fetch-iterations` and `fetch-work-items` as discriminated-union refactors of the existing parsers). The Fetcher emits a `NOTICES` array and a `DIFF_RANGE` sentinel in its structured result block; the orchestrator merges Notices, passes them to the ADO Writer, prints a mandatory end-of-run Trailer line. The ADO Writer renders a `## Notices` block above findings in the Review Summary. **Current behavior:** ADO Fetcher reads silently swallow exit codes. An iterations-fetch failure produces `LATEST_ITERATION_ID=''`, drifting the Bot Signature to `Iteration ` (empty) and breaking re-review detection forever afterward. A work-item-fetch failure is indistinguishable from "no work items linked" — both produce `WORK_ITEM_IDS=[]`. A diff-range fallback to the full PR diff happens silently, causing the Coordinator to classify prior threads against the wrong range. None of these surface to the reviewer or the invoker. @@ -191,7 +191,7 @@ The four new helpers under `scripts/ado/` own the classification logic. Agent pr - [ ] A work-item fetch that fails with auth/5xx/network emits a DEGRADED Notice (`kind: work-items`); a work-item fetch that returns an empty list emits an `info` Notice (`kind: doc-context`). - [ ] A diff-range fallback emits `DIFF_RANGE: full` in `ADO_FETCHER_RESULT` and a DEGRADED Notice (`kind: diff-range`). - [ ] The orchestrator merges Notices, dedupes by `kind`, and passes them to the ADO Writer. -- [ ] The ADO Writer renders a `## ⚠ Notices` block above findings in first-review and re-review Summaries. +- [ ] The ADO Writer renders a `## Notices` block above findings in first-review and re-review Summaries. - [ ] Every successful run ends with a Trailer line in the Claude interface listing findings, notices, and the PR URL (ADO modes) or finding counts (Pre-PR mode). - [ ] `commands/review-pr.md` remains ≤ 200 lines. - [ ] ADR 0014 (helper layer), ADR 0015 (HTTP-tier mapping), and the in-place ADR 0004 amendment exist. diff --git a/docs/issues/pr-review-platform-failure-handling/PRD.md b/docs/issues/pr-review-platform-failure-handling/PRD.md index f02b2e0..5b82397 100644 --- a/docs/issues/pr-review-platform-failure-handling/PRD.md +++ b/docs/issues/pr-review-platform-failure-handling/PRD.md @@ -120,7 +120,7 @@ Same as PRD A: `packages/release-tools/scripts/verify-changelog.test.mjs`, `bump ## Further Notes -**Dependency on PRD A:** PRD B cannot land before PRD A. The helper imports (`classify-http-error`, `notices`, `formatTrailer`), the orchestrator's Notice-merge step, the ADO Writer's `## ⚠ Notices` block rendering, and the Trailer line are all PRD A deliverables that PRD B's new consumers and modified call sites rely on. The two PRDs ship together as a coherent "platform-failure handling" feature; PRD A is the foundation, PRD B is the rollout. +**Dependency on PRD A:** PRD B cannot land before PRD A. The helper imports (`classify-http-error`, `notices`, `formatTrailer`), the orchestrator's Notice-merge step, the ADO Writer's `## Notices` block rendering, and the Trailer line are all PRD A deliverables that PRD B's new consumers and modified call sites rely on. The two PRDs ship together as a coherent "platform-failure handling" feature; PRD A is the foundation, PRD B is the rollout. **Inbox file removal:** the originating `docs/inbox/pr-review-ado-error-hardening-pass.md` is deleted once PRD A and PRD B are published (per the inbox graduation flow documented in `docs/inbox/README.md`). From 64489ad948a29c98ce5a039b6061a5a43277cc45 Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 19:03:51 +0200 Subject: [PATCH 6/9] docs(pr-review): repair collapsed inline-code spacing in slice 05 (Copilot K2) Prettier had collapsed the whitespace around several inline-code fragments in the End-to-end demoable paragraph of B5 (lines 26), making the example unreadable. Rewrite the paragraph with prose spacing instead of inline-code adjacency, so the example renders cleanly in both raw markdown and GitHub. Copilot caught this on PR #30 line 26 of the slice file. Co-Authored-By: Claude Sonnet 4.6 --- .../05-coordinator-patch-to-fixed-mapping.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md b/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md index d563f3a..e917d0a 100644 --- a/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md +++ b/docs/issues/pr-review-platform-failure-handling/05-coordinator-patch-to-fixed-mapping.md @@ -23,7 +23,7 @@ Implementation cuts through every layer: - **Special-cases preserved by the canonical mapping** — 404 (thread deleted) and 409 (state already changed) both map to `ok: true` in `classify-http-error`, so the Coordinator continues silently for those, matching today's behaviour and the user's intent. - **CHANGELOG** — `[Unreleased]` Changed entry for the Coordinator's PATCH-to-fixed call site; Fixed entry covering the silent-failure auth gap (401/403 used to be a "PATCH warning" string on stdout that nothing read). -End-to-end demoable: run a re-review against a PR whose threads include at least one `addressed` candidate, while the local `az devops login` is revoked. The Claude interface ends with `❌ Review aborted: auth — Could not mark thread N as fixed (HTTP 401). Try \`az devops login\` to re-authenticate.`Restore auth and simulate a 5xx (e.g. throttling), and the Summary renders`## Notices`with`⚠ patch-to-fixed: Could not mark thread N as fixed (HTTP 503). Thread remains active and will be re-evaluated on next re-review.` plus a passing re-review. +End-to-end demoable: run a re-review against a PR whose threads include at least one `addressed` candidate, while the local `az devops login` is revoked. The Claude interface ends with the Trailer aborted line naming the auth failure ("Could not mark thread N as fixed (HTTP 401). Try `az devops login` to re-authenticate."). Restore auth and simulate a 5xx (e.g. throttling), and the Summary's `## Notices` block contains an entry like "⚠ patch-to-fixed: Could not mark thread N as fixed (HTTP 503). Thread remains active and will be re-evaluated on next re-review." — and the rest of the re-review completes normally. ## Acceptance criteria From 3a63d78ba12d5feac23f3c1317233baa16aed2dd Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 19:03:59 +0200 Subject: [PATCH 7/9] docs(pr-review): correct `git remote show` description in slice 06 (Copilot K3) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The earlier wording read as if `git remote show origin HEAD branch` were a single command invocation (an invalid one — `git remote show` takes a remote name only). Reword to make clear we parse the `HEAD branch:` line out of `git remote show origin`'s output and pass the parsed branch name into the helper via a new `remoteHeadBranch` argument. Step A's prose mirrors the helper signature. Copilot caught this on PR #30 line 19 of the slice file. Co-Authored-By: Claude Sonnet 4.6 --- .../06-pre-pr-notice-surface.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md b/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md index 7b45932..f18cc1e 100644 --- a/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md +++ b/docs/issues/pr-review-platform-failure-handling/06-pre-pr-notice-surface.md @@ -16,8 +16,8 @@ Give Pre-PR mode the same Notice surface as ADO modes. Detect malformed diff inp Implementation cuts through two helpers and the orchestrator's Pre-PR steps: - **`scripts/pre-pr.mjs` (`buildPrePrContext` + `parseChangedFilesFromDiff`)** — return shape of `buildPrePrContext` extended to `{ rawDiff, changedFiles, filteredFiles, notices: Notice[] }`. `parseChangedFilesFromDiff` detects suspicious shape: non-empty input that contains ≥ 1 `diff --git` header but produces zero parsed paths. When detected, `buildPrePrContext` pushes a DEGRADED Notice (`kind: diff-parse`, message: "Pre-PR diff parsed to zero files but contained diff headers — input may be malformed.") to the returned `notices` array. Existing test file extended with cases covering the suspicious-shape detection. -- **New helper** `scripts/pre-pr/detect-default-branch.mjs` — pure function `({ branchExists }) → { branch: string | null, source: 'remote-show' | 'develop-fallback' | 'main-fallback' | 'master-fallback' | 'none', notice?: Notice }`. The function tries (in order): the value from `git remote show origin HEAD branch` if non-empty; then `origin/develop`; then `origin/main`; then `origin/master`. Returns `{ branch: null, source: 'none' }` when nothing exists. Emits a `warning` Notice (`kind: default-branch`, message: `"Default branch not detected via remote-show; computed diff against origin/ ()."`) when any fallback level fires. With unit tests covering each branch. -- **`commands/review-pr.md` (Pre-PR Step A)** — wires `detect-default-branch.mjs` via the same `await import(...)` pattern as other helpers. The bash side passes an injected `branchExists(name)` implementation that runs `git rev-parse --verify --quiet "refs/remotes/origin/$name"`. On `branch: null`, the orchestrator emits a stderr message and prints the Trailer aborted line; on any fallback level, the Notice is pushed into the Pre-PR `notices` array. +- **New helper** `scripts/pre-pr/detect-default-branch.mjs` — pure function `({ branchExists, remoteHeadBranch }) → { branch: string | null, source: 'remote-show' | 'develop-fallback' | 'main-fallback' | 'master-fallback' | 'none', notice?: Notice }`. The function tries (in order): the `remoteHeadBranch` argument (parsed by the bash side from the `HEAD branch:` line of `git remote show origin` output) if non-empty; then `origin/develop`; then `origin/main`; then `origin/master`. Returns `{ branch: null, source: 'none' }` when nothing exists. Emits a `warning` Notice (`kind: default-branch`, message: `"Default branch not detected via remote-show; computed diff against origin/ ()."`) when any fallback level fires. With unit tests covering each branch. +- **`commands/review-pr.md` (Pre-PR Step A)** — wires `detect-default-branch.mjs` via the same `await import(...)` pattern as other helpers. The bash side first runs `git remote show origin 2>/dev/null` and parses the `HEAD branch:` line out of its output (passing the resulting branch name, or empty string on failure, as `remoteHeadBranch`); it also passes an injected `branchExists(name)` implementation that runs `git rev-parse --verify --quiet "refs/remotes/origin/$name"`. On `branch: null`, the orchestrator emits a stderr message and prints the Trailer aborted line; on any fallback level, the Notice is pushed into the Pre-PR `notices` array. - **`commands/review-pr.md` (Pre-PR Step B + E)** — Step B uses `buildPrePrContext().notices` and merges them with the default-branch Notice. Step E prints all Notices before findings (per PRD A's pre-PR contract), then the findings, then the Trailer line (which already includes notice counts). - **CHANGELOG** — `[Unreleased]` Added entry for `detect-default-branch.mjs`; Changed entry for `buildPrePrContext` return shape; Fixed entries for the suspicious-shape detection and the Gitflow-aware fallback. From cde034179e51b7d86ef22c66d739a70750271035 Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 19:04:07 +0200 Subject: [PATCH 8/9] docs(pr-review): reconcile PRD B test-scope with slice requirements (Copilot K4) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PRD B's "Test-scope choice" section stated MODIFY helpers receive no new unit tests, but slices B2/B3/B4/B6 each explicitly require 2–3 new test cases pinning the new behaviour branches. Copilot flagged this real contradiction on PR #30 line 81. Reconcile by clarifying the doctrine: the user's "NEW deep modules only" answer during grilling was about not writing whole new test suites for MODIFY helpers; the minimal branch-verification cases each slice requires are necessary to confirm the new behaviour landed and do not constitute new suites. Update Test-scope choice + Modules under test sections and two Out of Scope bullets in PRD B and its Agent Brief. Co-Authored-By: Claude Sonnet 4.6 --- .../PRD.md | 29 ++++++++++++------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/docs/issues/pr-review-platform-failure-handling/PRD.md b/docs/issues/pr-review-platform-failure-handling/PRD.md index 5b82397..457efce 100644 --- a/docs/issues/pr-review-platform-failure-handling/PRD.md +++ b/docs/issues/pr-review-platform-failure-handling/PRD.md @@ -78,7 +78,13 @@ All shared helpers (`scripts/ado/classify-http-error.mjs`, `scripts/ado/notices. ### Test-scope choice -The user explicitly chose "NEW deep modules only" in the test-scope question during the grilling session. PRD B writes unit tests for the two new helpers (`parse-write-response`, `detect-default-branch`). The MODIFY helpers (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs`) get no new unit tests in this PRD; their existing test files stay frozen except for whatever fixture updates the new return shapes force. Behaviour change verification on the MODIFY helpers and on the agent prompts goes to the integration smoke test against a real ADO PR, per ADR 0013's stated testing posture. +The user chose "NEW deep modules only" in the test-scope question during the grilling session. Reconciling that choice with the slice acceptance criteria: + +- **New deep helpers** (`parse-write-response`, `detect-default-branch`) receive full unit-test coverage — every documented return-shape branch is asserted. +- **MODIFY helpers** (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs`) receive **minimal branch-verification** test cases — 2–3 cases per helper, each pinning a single new branch introduced by this PRD (the `diffRange='full'` downgrade, the throw-on-parse-error path, the `{ ok: false, reason }` variants, the suspicious-shape Notice). No full new test suites; the existing fixture conventions in those files are reused. +- **Agent prompt content** (`.agents/*.md`, `commands/review-pr.md`) gets no new tests of any kind. End-to-end behaviour change is verified by the integration smoke test against a real ADO PR, per ADR 0013's stated testing posture. + +The intent of the user's "NEW deep modules only" choice was to avoid writing wholly new test files and full coverage for MODIFY helpers; the minimal branch-verification cases above are necessary to confirm the new behaviour landed and do not constitute new test suites. ## Testing Decisions @@ -93,15 +99,18 @@ Same as PRD A: tests assert the external behaviour of each helper given controll - `scripts/ado/parse-write-response.mjs` — happy path (`{ id: 12345 }` response), 401 → `{ ok: false, tier: 'aborted', kind: 'auth' }`, 5xx → `{ ok: false, tier: 'degraded' }`, 404 → `{ ok: true }` (domain-OK), 409 → `{ ok: true }`, malformed JSON body, network exit-code path, missing `id` field on otherwise-200 response. - `scripts/pre-pr/detect-default-branch.mjs` — `git remote show` succeeds → no fallback Notice, `develop` exists → `develop-fallback` with Notice, only `main` exists → `main-fallback` with Notice, only `master` exists → `master-fallback` with Notice, nothing exists → ABORTED (no branch, no Notice — Trailer carries the abort), `branchExists` thrown exception → propagated. -### Modules NOT under test in PRD B +### MODIFY helpers — minimal branch-verification cases -Per the user's choice during grilling: +Each of these receives 2–3 new test cases in its existing test file, pinning the new branch introduced by this PRD. No full new coverage. + +- `classify-thread.mjs` — `diffRange='full'` downgrade of `addressed`/`obsolete` → `pending`; `disputed` unaffected (3 cases). +- `match-finding.mjs` — legitimate no-match still returns `null`; malformed input throws (3 cases). +- `parseAdoWriterResult` — `{ ok: true, ... }` for valid block; `{ ok: false, reason: 'missing-block' }`; `{ ok: false, reason: 'malformed' }` (3 cases). +- `pre-pr.mjs` — suspicious-shape detection emits a `kind: diff-parse` Notice through `buildPrePrContext` (2 cases). + +### Modules NOT under test in PRD B -- `classify-thread.mjs` extension (`diffRange` parameter) — verified by integration smoke test. -- `match-finding.mjs` extension (throw-on-parse-error) — same. -- `parseAdoWriterResult` discriminated-union refactor — same. -- `pre-pr.mjs` suspicious-shape Notice — same. -- All agent prompt content (`.agents/*.md`, `commands/review-pr.md`) — same. +- All agent prompt content (`.agents/*.md`, `commands/review-pr.md`) — verified end-to-end by the integration smoke test against a real ADO PR, per ADR 0013. ### Prior art @@ -110,7 +119,7 @@ Same as PRD A: `packages/release-tools/scripts/verify-changelog.test.mjs`, `bump ## Out of Scope - Anything PRD A delivers (helper layer, canonical HTTP mapping, ADRs, Fetcher fixes, orchestrator Notice merging + Trailer). -- Unit tests for MODIFY-only helpers (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs` suspicious-shape). +- **Full new test suites** for MODIFY-only helpers — only the minimal 2–3 branch-verification cases listed in Testing Decisions are in scope. - Unit tests for agent prompt content. - Retries on transient HTTP errors. - The integration smoke test (manual, post-merge). @@ -190,4 +199,4 @@ Same as PRD A: `packages/release-tools/scripts/verify-changelog.test.mjs`, `bump - Retries on transient HTTP errors. - Integration smoke test (manual, post-merge). - Lifting helpers to `pr-review-toolkit`. -- Unit tests for MODIFY helpers (`classify-thread`, `match-finding`, `parseAdoWriterResult`, `pre-pr.mjs` suspicious-shape). +- Full new test suites for MODIFY helpers — only minimal branch-verification cases (per PRD Testing Decisions) are in scope. From 3303f798e5ae1e6ff0d13746b577900adcdb8d78 Mon Sep 17 00:00:00 2001 From: Oriol Torrent Florensa Date: Wed, 13 May 2026 19:10:33 +0200 Subject: [PATCH 9/9] docs(pr-review): propagate K3 helper-signature fix into PRD B (Copilot K3 follow-up) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The K3 commit (3a63d78) only fixed slice 06's description of detect-default-branch.mjs. PRD B itself carried four stale references to the same helper that contradicted the slice: - Implementation Decisions: helper signature missing `remoteHeadBranch` parameter; fallback chain described as `git remote show origin HEAD` (the exact invalid-command wording K3 was meant to remove). - Testing Decisions: listed test cases referenced `git remote show succeeds` without naming the `remoteHeadBranch` injection. - Agent Brief desired-behaviour: said the helper walks `develop → main → master` (omitting the `remote-show` first level). - Agent Brief acceptance criteria: said orchestrator wires the helper via injectable `branchExists` only, omitting `remoteHeadBranch`. Reconcile all four against slice 06's K3-corrected contract so an AFK agent picking up B6 reads one consistent signature across PRD and slice. No effect on the other 9 slices. Co-Authored-By: Claude Sonnet 4.6 --- docs/issues/pr-review-platform-failure-handling/PRD.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/issues/pr-review-platform-failure-handling/PRD.md b/docs/issues/pr-review-platform-failure-handling/PRD.md index 457efce..57810d5 100644 --- a/docs/issues/pr-review-platform-failure-handling/PRD.md +++ b/docs/issues/pr-review-platform-failure-handling/PRD.md @@ -49,7 +49,7 @@ All shared helpers (`scripts/ado/classify-http-error.mjs`, `scripts/ado/notices. ### New helpers - **`scripts/ado/parse-write-response.mjs`** — pure function composing PRD A's `classify-http-error` with response-`id` parsing. Returns `{ ok: true, id } | { ok: false, tier, kind, message }`. Consumed by every ADO write call site (inline POST, threadContext fallback, summary POST, delta reply, completion marker, PATCH-to-fixed). One shape, one classifier. -- **`scripts/pre-pr/detect-default-branch.mjs`** — pure function over an injectable `branchExists(name) → bool` tester. Walks the fallback chain `git remote show origin HEAD` → `origin/develop` → `origin/main` → `origin/master` → `{ branch: null }`. Returns `{ branch, source: 'remote-show' | 'develop-fallback' | 'main-fallback' | 'master-fallback' | 'none', notice?: Notice }`. The bash side wires the tester to `git rev-parse --verify --quiet`. ABORTED when all four fail. +- **`scripts/pre-pr/detect-default-branch.mjs`** — pure function over an injectable `branchExists(name) → bool` tester and a `remoteHeadBranch` argument (a string parsed by the bash side from the `HEAD branch:` line of `git remote show origin` output, or empty string on failure). Walks the fallback chain `remoteHeadBranch` → `origin/develop` → `origin/main` → `origin/master` → `{ branch: null }`. Returns `{ branch, source: 'remote-show' | 'develop-fallback' | 'main-fallback' | 'master-fallback' | 'none', notice?: Notice }`. The bash side parses `HEAD branch:` from `git remote show origin 2>/dev/null` and wires the `branchExists` tester to `git rev-parse --verify --quiet`. ABORTED when all four fallback levels fail. ### Modified helpers @@ -97,7 +97,7 @@ Same as PRD A: tests assert the external behaviour of each helper given controll **New deep helpers (full unit-test coverage):** - `scripts/ado/parse-write-response.mjs` — happy path (`{ id: 12345 }` response), 401 → `{ ok: false, tier: 'aborted', kind: 'auth' }`, 5xx → `{ ok: false, tier: 'degraded' }`, 404 → `{ ok: true }` (domain-OK), 409 → `{ ok: true }`, malformed JSON body, network exit-code path, missing `id` field on otherwise-200 response. -- `scripts/pre-pr/detect-default-branch.mjs` — `git remote show` succeeds → no fallback Notice, `develop` exists → `develop-fallback` with Notice, only `main` exists → `main-fallback` with Notice, only `master` exists → `master-fallback` with Notice, nothing exists → ABORTED (no branch, no Notice — Trailer carries the abort), `branchExists` thrown exception → propagated. +- `scripts/pre-pr/detect-default-branch.mjs` — non-empty `remoteHeadBranch` (e.g. `'develop'`) → no fallback Notice, empty `remoteHeadBranch` and `branchExists('develop')=true` → `develop-fallback` with Notice, empty `remoteHeadBranch` and only `main` exists → `main-fallback` with Notice, only `master` exists → `master-fallback` with Notice, nothing exists → `{ branch: null, source: 'none' }` (no notice — Trailer carries the abort), `branchExists` thrown exception → propagated. ### MODIFY helpers — minimal branch-verification cases @@ -165,7 +165,7 @@ Same as PRD A: `packages/release-tools/scripts/verify-changelog.test.mjs`, `bump - Writer streams `*.err` content to stderr at the moment of failure; unconditional cleanup follows. - `parseAdoWriterResult` returns the discriminated union; orchestrator fails-loud on `{ ok: false, reason: 'missing-block' }`. - Pre-PR `parseChangedFilesFromDiff` detects suspicious-shape and emits a DEGRADED Notice (`kind: diff-parse`); `buildPrePrContext` returns the Notice array. -- Pre-PR `detect-default-branch.mjs` walks `develop` → `main` → `master`; emits a Notice naming the actually-used branch; aborts when none exists. +- Pre-PR `detect-default-branch.mjs` walks the parsed `HEAD branch:` line from `git remote show origin` → `origin/develop` → `origin/main` → `origin/master`; emits a Notice naming the actually-used branch; aborts when none exists. **Key interfaces:** @@ -186,7 +186,7 @@ Same as PRD A: `packages/release-tools/scripts/verify-changelog.test.mjs`, `bump - [ ] `match-finding` throws on parse error; the Coordinator's call site catches the throw and emits a DEGRADED Notice (`kind: thread-match`). - [ ] `parseAdoWriterResult` returns a discriminated union; the orchestrator surfaces `{ ok: false, reason: 'missing-block' }` as an ABORTED run. - [ ] `buildPrePrContext` returns a `notices: Notice[]` field; suspicious-shape diffs emit a DEGRADED Notice (`kind: diff-parse`). -- [ ] `detect-default-branch.mjs` exists, has unit tests covering the four fallback levels + the abort case, and the orchestrator wires it via injectable `branchExists`. +- [ ] `detect-default-branch.mjs` exists, has unit tests covering the four fallback levels + the abort case, and the orchestrator wires it via injectable `branchExists` plus a `remoteHeadBranch` argument parsed from `git remote show origin`'s `HEAD branch:` line. - [ ] Pre-PR mode aborts with a clear stderr message when none of `develop`, `main`, `master` exist. - [ ] `*.err` content is streamed to stderr at the moment of failure; cleanup is unconditional. - [ ] `commands/review-pr.md` remains ≤ 200 lines.