Skip to content

fix(capture): attach ci-fix evidence to dual-trigger candidates#3581

Merged
marcusrbrown merged 5 commits into
mainfrom
fix/merged-candidate-evidence
Jun 24, 2026
Merged

fix(capture): attach ci-fix evidence to dual-trigger candidates#3581
marcusrbrown merged 5 commits into
mainfrom
fix/merged-candidate-evidence

Conversation

@marcusrbrown

Copy link
Copy Markdown
Collaborator

A live capture run proved the failed-then-fixed trigger detects candidates but never authored a proposal: within-run dedup dropped the failed-then-fixed candidate before the cap floor ran, and in this repo nearly every failed-then-fixed PR also drew review rounds, so every dual-trigger PR collapsed to review-only.

This attaches the ci-fix evidence to the surviving review candidate instead of dropping it, so the most-informative PRs — those that both drew review and broke then fixed CI — yield one proposal grounded in both signals.

What changes

  • Attach, not drop: a same-SHA dual-trigger PR keeps its review candidate with the fixing diff/log attached. One proposal per merge SHA still holds.
  • Floor delivers: the cap floor now reserves a slot for any candidate carrying ci-fix evidence (pure or attached), so the trigger delivers when it detects.
  • Independent privacy scan: a dual candidate's review prose and attached diff/log are scanned separately and fail-closed — a secret in the diff drops only the diff, a private name in the prose drops only the prose, and a name in both is caught in both. The scan runs on the final candidate over every populated field, with a shared helper so the two evidence paths cannot drift.
  • Prompt + telemetry: the agent weaves both signals into one learning; a dual-trigger count surfaces in the run summary.

The brainstorm and plan ship in this branch. After merge, a capture run should finally author a proposal grounded in the fixing diff — the live proof the delivery bug is fixed.

…andidates

Live validation showed the failed-then-fixed trigger detects candidates but never
authors a proposal: within-run dedup drops the ci-fix candidate before the cap
floor runs, and in this repo nearly every failed-then-fixed PR also drew review
rounds. Attach the ci-fix evidence to the surviving review candidate instead of
dropping it, retarget the floor to candidates that carry ci-fix evidence, and scan
every populated evidence field on the final candidate.
…ad of dropping it

When a PR matched both triggers, within-run dedup dropped the failed-then-fixed
candidate, so the cap floor had nothing to reserve and the learning was never
authored. Attach the ci-fix evidence to the surviving review candidate, and key the
floor on whether a candidate carries ci-fix evidence (pure or attached) so those
candidates are guaranteed a slot. The review candidate keeps its position, so
freshness order is preserved.
…osed

A dual-trigger candidate now carries both review prose and the fixing diff, so the
privacy step scans each independently on the final candidate: a secret in the diff
drops only the diff, a private name in the review prose drops only the prose, and a
private name in both is caught in both. A scan-unavailable run clears the attached
evidence too, so no unscanned evidence reaches the agent.
When a candidate carries both review prose and attached ci-fix evidence, the agent
now weaves the review feedback and the failure-to-fix into one learning rather than
treating them separately. Report a dual-trigger count in the run summary so the
merged candidates are visible.
…ger counter

Extract the redact-then-scan logic shared by the standalone and attached ci-fix
evidence paths into one helper so the two scans cannot drift, rename the telemetry
counter to dualTriggerCandidates so it does not read as 'merged into main', and
correct the blocked-counter docs now that a dual-trigger candidate can drop each
evidence field independently.

@fro-bot fro-bot left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dual-trigger collapse is the right call. A PR that both drew review rounds and broke-then-fixed CI is the densest learning signal in the repo, and the prior behavior silently dropped exactly that intersection at the within-run dedup before the floor could rescue it. Attaching the ci-fix evidence to the surviving review candidate — rather than emitting two proposals or discarding one — keeps the one-proposal-per-merge-SHA contract intact while preserving both signals.

The privacy architecture holds up under scrutiny. The two evidence paths (reviewExcerpts and attached ciFix) are scanned independently and fail-closed: a secret in the diff drops only the diff, a private name in the prose drops only the prose, and a name in both is caught in both. The shared scanCiFixEvidence helper means the pure CiFixCandidate path and the attached-evidence path can't drift — the same chokepoint discipline from docs/solutions/best-practices/pure-core-privacy-gates-shared-module-2026-06-22.md. The privacy-ordering invariant (truncate in the I/O shell, scan the already-truncated text in the pure core) is documented at every call site and respected. applyEnrichmentScanAvailability correctly clears attached ciFix too when the token load fails, so no path leaks unscanned evidence.

Verified mechanically:

  • selectWithCiFixFloor keys on object identity over a reference-unique list (the byMergeSha map guarantees distinct refs), and hasCiFixEvidence correctly treats an empty diffExcerpt as "no substantive evidence" so a privacy-cleared dual candidate doesn't consume a floor slot it can't fill.
  • dualTriggerCandidates is counted pre-scan on the capped set (line 523), so a candidate whose evidence is later dropped still registers as dual-trigger — matches the documented telemetry intent.
  • One-proposal-per-SHA: when both triggers hit a SHA, byMergeSha retains the review candidate (line 483) and the standalone ci-fix is reattached as evidence rather than emitted separately. No double-proposal.

172 tests pass locally, including the load-bearing ones: cross-field mutation proof (private name in both fields caught in both), independent-drop (diff hit clears only diff), scan-unavailable clearing both fields, and the floor-reservation end-to-end. No any, no @ts-ignore, no console in the diff. The workflow change is prompt prose plus one numeric jq telemetry line reading a counts-only field from the script-authored digest — no untrusted input enters a run: block, no injection surface.

One non-blocking observation, not a request:

The two-fetch approach (harvestCandidates fetches closed PRs internally, fetchMergedPrsInWindow fetches them again for the ci-fix path) is documented as a deliberate blast-radius tradeoff, and the rationale is sound for now. But it means every capture run pages the full closed-PR list twice. As the repo's closed-PR history grows past a few thousand, that doubles the dominant API cost of the job. Worth a tracking note for when the pagination count climbs — not a reason to hold this PR.

The delivery bug is real and this fixes it at the correct seam. The ghost that the within-run dedup was erasing now gets to teach the elder.

Verdict: PASS

Blocking issues

None.

Non-blocking concerns

  • scripts/capture-learnings-harvest.ts (fetchMergedPrsInWindow): the closed-PR list is paginated twice per run (once here, once inside harvestCandidates). Documented as an intentional blast-radius tradeoff. Impact: doubles the dominant API cost as closed-PR history grows. Remediation (future): thread a single pre-fetched merged-PR list into both harvesters once the entanglement with harvestCandidates's filtering is worth untangling.

Missing tests

None. New behavior is covered across happy path (dual candidate, both survive redacted), independent drops (diff-only, review-only), cross-field mutation proof, floor reservation (reserved when present, not reserved when diffExcerpt empty), scan-unavailable clearing both fields, mixed threat types incrementing both counters independently, and order/edge cases (first-seen wins for same-trigger duplicates, position preserved).

Risk assessment: LOW

Change is confined to a single offline harvest script and its workflow prompt/telemetry. Fail-closed privacy semantics are preserved and independently tested; the worst-case regression (unscanned evidence reaching the digest) is guarded on every path including token-load failure. No new dependencies, no new untrusted-input surface, no workflow-injection vector. The harvest step is best-effort (errors degrade to an empty digest, exit 0), so a latent bug cannot fail the workflow or leak content.


Run Summary
Field Value
Event pull_request
Repository fro-bot/.github
Run ID 28077028929
Cache hit
Session ses_107ed295cffewTR1bVXWJLQ11Y

@marcusrbrown marcusrbrown merged commit 76154b8 into main Jun 24, 2026
14 checks passed
@marcusrbrown marcusrbrown deleted the fix/merged-candidate-evidence branch June 24, 2026 05:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants