feat(capture): add the failed-then-fixed learning trigger by marcusrbrown · Pull Request #3568 · fro-bot/.github

marcusrbrown · 2026-06-23T23:34:51Z

Adds a second source to the learning-capture run: merged PRs whose CI failed then passed before merge. The fixing change becomes the learning, distilled from real evidence rather than the PR title.

What it does

Detection is one GraphQL query per PR — per-commit check conclusions across the PR's history — walked chronologically to find the last-failing and first-passing commits for a required check. Only real failures count (cancelled runs and policy gates are excluded).
Evidence is the fixing diff (primary, always available) plus a best-effort failing-log excerpt that degrades to a placeholder when logs are purged.
Privacy layers a new secret scan (PATs, cloud/LLM keys, JWTs, connection strings, generic credentials) and structural redaction (paths, hostnames, bearer tokens) on top of the existing private-name gate. Every evidence field — diff, log, and the failing check name — is redacted and scanned before it reaches the agent, fail-closed, with the open step re-checking the authored body as a final gate before posting.
Both triggers share the same dedup, per-run cap, and proposal flow; a PR matching both yields one proposal.

Hardening from review

Bounded gh timeouts and capped pagination so a hung process can't stall the run; recognized errored status contexts; widened the secret patterns; and added precedence, scan-availability, and log-redaction tests.

The brainstorm and plan ship in this branch. The accepted value-gate risk (corpus adoption not yet measured) is recorded in the requirements doc.

Add the second capture trigger: PRs whose CI failed then passed before merge feed the existing propose-only pipeline. Transition detection is one GraphQL query per PR; the fixing diff is the primary signal with best-effort failing logs; a new secret/redaction scan layers on the private-name gate because logs and diffs leak more than repo names; a discriminated candidate union keeps dedup, cap, and privacy shared across both triggers.

…vacy module CI logs and diffs leak more than repo names — tokens, keys, connection strings, internal hostnames, and paths. Add a block scan for hard secrets (PATs, private keys, credential-bearing connection strings, cloud/LLM keys) and a structural redaction for paths, internal hostnames, and bearer/authorization values, both pure and callable from the same shared module the existing private-name gate uses.

Make Candidate a trigger-discriminated union so a second source can share the digest's dedup, cap, and privacy path without overloading review-specific fields. The privacy step now redacts structural secrets and drops on a residual hard secret as well as a private name. Collapse candidates to one per merge SHA within a run so a PR matching both triggers yields a single proposal, review prose taking precedence. The review source emits the same shape as before.

…ence Add the CI fail-then-pass harvester: one GraphQL query per PR yields per-commit check conclusions, a chronological walk finds the last-failing and first-passing commits for a required check, and the fixing diff between them becomes the primary evidence. A best-effort failing-log excerpt enriches it and degrades to a placeholder when logs are gone. The diff and log pass the same redact-and-scan privacy gate as review prose, dropping on a private name or residual secret. A PR with no fixing diff (a bare green re-run) is not a candidate.

Fetch merged PRs once and feed both harvesters, resolve the required-check set from branch protection (falling back to all checks when none are configured), and concatenate both sources into the digest. The agent prompt now branches on the trigger so a failed-then-fixed candidate is distilled from its fixing diff and failure log, and the run summary reports the failed-then-fixed counts and the secret-blocked count.

Bound each gh invocation with a timeout and cap commit pagination so a hung process can't stall the run. Stop treating cancelled and action-required checks as failures — only real failures qualify — and recognize errored status contexts. Scan the failing check name alongside the diff and log, add the open step's secret check as a final gate before posting, widen the secret patterns (Google, GitLab, JWT, generic credentials), include base64 characters in the bearer redaction, decode buffer log responses, and give failed-then-fixed proposals a neutral title.

fro-bot

The structural spine here is clean: pure core (buildCandidateDigest, findFailPassTransition) split from the I/O shell, evidence injected so the privacy scan stays pure and testable. The fail-closed gating follows the prior learning in docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md to the letter — one shared capture-learnings-privacy.ts module, two chokepoints (upstream enrichment scan + authored-body backstop in the open step), and the empty-set-vs-load-failure distinction handled in main() via applyEnrichmentScanAvailability rather than smuggled into the core. 222 tests pass, covering the transition ordering invariant, CANCELLED exclusion, StatusContext ERROR detection, redaction ordering, and the mutation-proof "evidence reaches output when the gate is bypassed" case.

The chrome is light where it counts: SHA-pinned actions, contents: read only, file-based digest (no GITHUB_OUTPUT shell-injection vector), App token scoped to this repo, and the agent step deliberately denied issues: write so it cannot bypass the deterministic open step. No pull_request_target. No any, no @ts-ignore, no console in the new code — the as unknown as Parameters<...> casts are the established Octokit pagination idiom here, not type holes.

Verdict: PASS

Blocking issues

None.

Non-blocking concerns

scripts/capture-learnings-harvest.ts:1069 buildLogExcerpt reranks error lines ahead of everything else ([...errorLines, ...otherLines]), which severs each error from its surrounding stack-trace context before the excerpt reaches the agent. The diff is the primary signal so this doesn't break the learning, but the log evidence loses the very context that makes "this failure → this fix" legible. Consider emitting contiguous windows around error lines instead of a global partition. Defer if you want — it's a quality knob, not a correctness one.
scripts/capture-learnings-privacy.ts:61,71 the sk-[A-Za-z0-9\-]{20,} and generic-credential catch-all patterns are broad enough to match benign base64/hex blobs in a diff. The bias is fail-closed (over-block enriched evidence, never leak), which is the right direction — but if enrichmentBlockedBySecret telemetry climbs in production, that's the knob to tighten. No action needed now; the counts-only telemetry already gives you the signal to watch.
fetchMergedPrsInWindow issues a second pulls.list pagination alongside harvestCandidates's own fetch. The inline rationale (avoid entangling the filter logic, bounded by LOOKBACK_DAYS) is sound and honest, so this is a noted tradeoff rather than a flaw.

Missing tests

None. New behavior — trigger discrimination, within-run dedup precedence (review-heavy wins), CI transition detection across CheckRun/StatusContext, secret block/redact, scan-availability fail-closed, and log-redaction — all carry coverage.

Risk assessment: LOW

Scheduled/dispatch-only workflow with read-scoped permissions and no untrusted-input → run: interpolation. The blast radius is bounded: worst case under a logic error is dropped enriched evidence (fail-closed) or a low-value proposal issue a human still reviews before it reaches docs/solutions/. The privacy gate is load-bearing and proven so by a mutation test. The accepted value-gate risk (corpus adoption unmeasured) is recorded in the requirements doc — eyes open.

Run Summary

Field	Value
Event	pull_request
Repository	fro-bot/.github
Run ID	28064390701
Cache	hit
Session	ses_1092934e3ffen5NQkmGlPzW6Xn

Emit contiguous windows around each error line instead of a global error-first partition, so each error keeps the surrounding stack-trace lines that make the failure legible; merge overlapping windows and fall back to the log head when no error line is present.

marcusrbrown · 2026-06-24T00:02:05Z

Applied the actionable NBC: buildLogExcerpt now emits contiguous windows around each error line (a few lines before, more after) and merges overlapping windows, so each error keeps its surrounding stack-trace context instead of being globally partitioned away from it. Falls back to the log head when no error line is present. Added direct unit tests for context-preservation, the no-error fallback, and budget truncation. The other two NBCs (broad secret patterns watched via the enrichmentBlockedBySecret counter; the two pulls.list fetches) are accepted as-is per your note.

fro-bot

Re-review after the dismissing push (e0af2cc). You took the one actionable note and turned it.

buildLogExcerpt no longer globally partitions error lines away from their context. It now emits contiguous windows (3 lines before, 6 after each error), merges overlapping/adjacent windows, preserves original order, drops an ... elision marker between non-contiguous segments, and falls back to the log head when no error line is present — all before the budget truncation. The privacy ordering invariant holds: this still runs in the I/O shell upstream of the pure-core redact-then-scan, so the windowed excerpt is what gets gated, not raw log text. A stack trace now reaches the agent as a legible unit instead of a scatter of orphaned error lines.

The three new unit tests (context-preservation, no-error head fallback, budget truncation) land exactly where the behavior lives. Suite is green at 225 (was 222). The other two NBCs — broad secret patterns and the second pulls.list — were correctly read as accepted tradeoffs; the enrichmentBlockedBySecret counter is the telescope you watch them through.

The spine I praised last round is untouched: pure core / I/O shell split, fail-closed gating per docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md, SHA-pinned actions, contents: read, file-based digest, agent denied issues: write.

Verdict: PASS

Blocking issues

None.

Non-blocking concerns

scripts/capture-learnings-privacy.ts:61,71 — broad sk- / generic-credential patterns remain a fail-closed over-block risk. Accepted; watch enrichmentBlockedBySecret.
fetchMergedPrsInWindow issues a second pulls.list alongside harvestCandidates. Accepted tradeoff, bounded by LOOKBACK_DAYS, documented inline.

Missing tests

None. The log-windowing change carries direct coverage; all prior behavior (trigger discrimination, dedup precedence, transition detection, secret block/redact, scan-availability) remains covered. 225 passing.

Risk assessment: LOW

Scheduled/dispatch-only, read-scoped, no untrusted input reaching a run: block. Worst case under a logic error is dropped enriched evidence (fail-closed) or a low-value proposal a human reviews before it reaches docs/solutions/. The windowing change is pure and confined to evidence quality — it cannot widen the leak surface. Ship it.

Run Summary

Field	Value
Event	pull_request
Repository	fro-bot/.github
Run ID	28065498429
Cache	hit
Session	ses_1092934e3ffen5NQkmGlPzW6Xn

marcusrbrown added 6 commits June 23, 2026 14:37

marcusrbrown requested a review from fro-bot as a code owner June 23, 2026 23:34

fro-bot added the agent: working label Jun 23, 2026

fro-bot previously approved these changes Jun 23, 2026

View reviewed changes

fro-bot removed the agent: working label Jun 23, 2026

marcusrbrown dismissed fro-bot’s stale review via e0af2cc June 24, 2026 00:01

fro-bot added the agent: working label Jun 24, 2026

fro-bot approved these changes Jun 24, 2026

View reviewed changes

fro-bot removed the agent: working label Jun 24, 2026

marcusrbrown merged commit d5a105e into main Jun 24, 2026
14 checks passed

marcusrbrown deleted the feat/c2-failed-then-fixed branch June 24, 2026 00:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(capture): add the failed-then-fixed learning trigger#3568

feat(capture): add the failed-then-fixed learning trigger#3568
marcusrbrown merged 7 commits into
mainfrom
feat/c2-failed-then-fixed

marcusrbrown commented Jun 23, 2026

Uh oh!

fro-bot left a comment

Uh oh!

marcusrbrown commented Jun 24, 2026

Uh oh!

fro-bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

marcusrbrown commented Jun 23, 2026

What it does

Hardening from review

Uh oh!

fro-bot left a comment

Choose a reason for hiding this comment

Verdict: PASS

Blocking issues

Non-blocking concerns

Missing tests

Risk assessment: LOW

Uh oh!

marcusrbrown commented Jun 24, 2026

Uh oh!

fro-bot left a comment

Choose a reason for hiding this comment

Verdict: PASS

Blocking issues

Non-blocking concerns

Missing tests

Risk assessment: LOW

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants