feat(capture): add the failed-then-fixed learning trigger#3568
Conversation
Add the second capture trigger: PRs whose CI failed then passed before merge feed the existing propose-only pipeline. Transition detection is one GraphQL query per PR; the fixing diff is the primary signal with best-effort failing logs; a new secret/redaction scan layers on the private-name gate because logs and diffs leak more than repo names; a discriminated candidate union keeps dedup, cap, and privacy shared across both triggers.
…vacy module CI logs and diffs leak more than repo names — tokens, keys, connection strings, internal hostnames, and paths. Add a block scan for hard secrets (PATs, private keys, credential-bearing connection strings, cloud/LLM keys) and a structural redaction for paths, internal hostnames, and bearer/authorization values, both pure and callable from the same shared module the existing private-name gate uses.
Make Candidate a trigger-discriminated union so a second source can share the digest's dedup, cap, and privacy path without overloading review-specific fields. The privacy step now redacts structural secrets and drops on a residual hard secret as well as a private name. Collapse candidates to one per merge SHA within a run so a PR matching both triggers yields a single proposal, review prose taking precedence. The review source emits the same shape as before.
…ence Add the CI fail-then-pass harvester: one GraphQL query per PR yields per-commit check conclusions, a chronological walk finds the last-failing and first-passing commits for a required check, and the fixing diff between them becomes the primary evidence. A best-effort failing-log excerpt enriches it and degrades to a placeholder when logs are gone. The diff and log pass the same redact-and-scan privacy gate as review prose, dropping on a private name or residual secret. A PR with no fixing diff (a bare green re-run) is not a candidate.
Fetch merged PRs once and feed both harvesters, resolve the required-check set from branch protection (falling back to all checks when none are configured), and concatenate both sources into the digest. The agent prompt now branches on the trigger so a failed-then-fixed candidate is distilled from its fixing diff and failure log, and the run summary reports the failed-then-fixed counts and the secret-blocked count.
Bound each gh invocation with a timeout and cap commit pagination so a hung process can't stall the run. Stop treating cancelled and action-required checks as failures — only real failures qualify — and recognize errored status contexts. Scan the failing check name alongside the diff and log, add the open step's secret check as a final gate before posting, widen the secret patterns (Google, GitLab, JWT, generic credentials), include base64 characters in the bearer redaction, decode buffer log responses, and give failed-then-fixed proposals a neutral title.
fro-bot
left a comment
There was a problem hiding this comment.
The structural spine here is clean: pure core (buildCandidateDigest, findFailPassTransition) split from the I/O shell, evidence injected so the privacy scan stays pure and testable. The fail-closed gating follows the prior learning in docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md to the letter — one shared capture-learnings-privacy.ts module, two chokepoints (upstream enrichment scan + authored-body backstop in the open step), and the empty-set-vs-load-failure distinction handled in main() via applyEnrichmentScanAvailability rather than smuggled into the core. 222 tests pass, covering the transition ordering invariant, CANCELLED exclusion, StatusContext ERROR detection, redaction ordering, and the mutation-proof "evidence reaches output when the gate is bypassed" case.
The chrome is light where it counts: SHA-pinned actions, contents: read only, file-based digest (no GITHUB_OUTPUT shell-injection vector), App token scoped to this repo, and the agent step deliberately denied issues: write so it cannot bypass the deterministic open step. No pull_request_target. No any, no @ts-ignore, no console in the new code — the as unknown as Parameters<...> casts are the established Octokit pagination idiom here, not type holes.
Verdict: PASS
Blocking issues
None.
Non-blocking concerns
scripts/capture-learnings-harvest.ts:1069buildLogExcerptreranks error lines ahead of everything else ([...errorLines, ...otherLines]), which severs each error from its surrounding stack-trace context before the excerpt reaches the agent. The diff is the primary signal so this doesn't break the learning, but the log evidence loses the very context that makes "this failure → this fix" legible. Consider emitting contiguous windows around error lines instead of a global partition. Defer if you want — it's a quality knob, not a correctness one.scripts/capture-learnings-privacy.ts:61,71thesk-[A-Za-z0-9\-]{20,}and generic-credential catch-all patterns are broad enough to match benign base64/hex blobs in a diff. The bias is fail-closed (over-block enriched evidence, never leak), which is the right direction — but ifenrichmentBlockedBySecrettelemetry climbs in production, that's the knob to tighten. No action needed now; the counts-only telemetry already gives you the signal to watch.fetchMergedPrsInWindowissues a secondpulls.listpagination alongsideharvestCandidates's own fetch. The inline rationale (avoid entangling the filter logic, bounded byLOOKBACK_DAYS) is sound and honest, so this is a noted tradeoff rather than a flaw.
Missing tests
None. New behavior — trigger discrimination, within-run dedup precedence (review-heavy wins), CI transition detection across CheckRun/StatusContext, secret block/redact, scan-availability fail-closed, and log-redaction — all carry coverage.
Risk assessment: LOW
Scheduled/dispatch-only workflow with read-scoped permissions and no untrusted-input → run: interpolation. The blast radius is bounded: worst case under a logic error is dropped enriched evidence (fail-closed) or a low-value proposal issue a human still reviews before it reaches docs/solutions/. The privacy gate is load-bearing and proven so by a mutation test. The accepted value-gate risk (corpus adoption unmeasured) is recorded in the requirements doc — eyes open.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/.github |
| Run ID | 28064390701 |
| Cache | hit |
| Session | ses_1092934e3ffen5NQkmGlPzW6Xn |
Emit contiguous windows around each error line instead of a global error-first partition, so each error keeps the surrounding stack-trace lines that make the failure legible; merge overlapping windows and fall back to the log head when no error line is present.
|
Applied the actionable NBC: |
fro-bot
left a comment
There was a problem hiding this comment.
Re-review after the dismissing push (e0af2cc). You took the one actionable note and turned it.
buildLogExcerpt no longer globally partitions error lines away from their context. It now emits contiguous windows (3 lines before, 6 after each error), merges overlapping/adjacent windows, preserves original order, drops an ... elision marker between non-contiguous segments, and falls back to the log head when no error line is present — all before the budget truncation. The privacy ordering invariant holds: this still runs in the I/O shell upstream of the pure-core redact-then-scan, so the windowed excerpt is what gets gated, not raw log text. A stack trace now reaches the agent as a legible unit instead of a scatter of orphaned error lines.
The three new unit tests (context-preservation, no-error head fallback, budget truncation) land exactly where the behavior lives. Suite is green at 225 (was 222). The other two NBCs — broad secret patterns and the second pulls.list — were correctly read as accepted tradeoffs; the enrichmentBlockedBySecret counter is the telescope you watch them through.
The spine I praised last round is untouched: pure core / I/O shell split, fail-closed gating per docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md, SHA-pinned actions, contents: read, file-based digest, agent denied issues: write.
Verdict: PASS
Blocking issues
None.
Non-blocking concerns
scripts/capture-learnings-privacy.ts:61,71— broadsk-/ generic-credential patterns remain a fail-closed over-block risk. Accepted; watchenrichmentBlockedBySecret.fetchMergedPrsInWindowissues a secondpulls.listalongsideharvestCandidates. Accepted tradeoff, bounded byLOOKBACK_DAYS, documented inline.
Missing tests
None. The log-windowing change carries direct coverage; all prior behavior (trigger discrimination, dedup precedence, transition detection, secret block/redact, scan-availability) remains covered. 225 passing.
Risk assessment: LOW
Scheduled/dispatch-only, read-scoped, no untrusted input reaching a run: block. Worst case under a logic error is dropped enriched evidence (fail-closed) or a low-value proposal a human reviews before it reaches docs/solutions/. The windowing change is pure and confined to evidence quality — it cannot widen the leak surface. Ship it.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/.github |
| Run ID | 28065498429 |
| Cache | hit |
| Session | ses_1092934e3ffen5NQkmGlPzW6Xn |
Adds a second source to the learning-capture run: merged PRs whose CI failed then passed before merge. The fixing change becomes the learning, distilled from real evidence rather than the PR title.
What it does
Hardening from review
Bounded gh timeouts and capped pagination so a hung process can't stall the run; recognized errored status contexts; widened the secret patterns; and added precedence, scan-availability, and log-redaction tests.
The brainstorm and plan ship in this branch. The accepted value-gate risk (corpus adoption not yet measured) is recorded in the requirements doc.