feat(ai): synthetic-defect harness for critic calibration (AI-044) by mrviduus · Pull Request #339 · mrviduus/textstack

mrviduus · 2026-06-16T01:06:14Z

AI-044 — Synthetic-defect injection harness (Phase 7)

Validates the calibration of the AI-041 critic, which AutoPublishCrew and SeoCrew rely on to gate publishing. The harness injects KNOWN defects into clean drafts, runs them through the real nano critic, and measures the catch rate plus a false-positive rate on clean controls.

How

Mirrors ToolCallEvalRunner. Injection and scoring are deterministic and pure (no LLM), so they are CI-testable with a fake critic and no API key. The live nano run is admin-triggered (POST /admin/ai-quality/evals/criticdefects/run, ~23 calls, synchronous) and persists a criticdefects eval_run: no judge, Score = catch rate, BreakdownJson = per-axis results + FP rate. Reuses EvalRun, so no schema change.
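To illustrate why the scoring half needs no key, here is a minimal Python sketch of a runner that takes the critic as a plain callable. It is not the PR's C# code: `Fixture`, `run_eval`, and `fake_critic` are hypothetical names, and the critic is reduced to a flagged / not-flagged boolean for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Fixture:
    text: str
    defect: Optional[str]  # None marks a clean control (hypothetical shape)


# Assumption: a critic is any callable that returns True when it flags a draft.
Critic = Callable[[str], bool]


def run_eval(fixtures: list[Fixture], critic: Critic) -> dict:
    """Score a critic against fixtures; pure apart from the critic call itself."""
    defects = [f for f in fixtures if f.defect is not None]
    controls = [f for f in fixtures if f.defect is None]
    caught = sum(1 for f in defects if critic(f.text))
    flagged_clean = sum(1 for f in controls if critic(f.text))
    return {
        "catch_rate": caught / len(defects) if defects else 0.0,
        "false_positive_rate": flagged_clean / len(controls) if controls else 0.0,
        "n": len(fixtures),
    }


def fake_critic(text: str) -> bool:
    """Deterministic stand-in for the live critic: flags a known marker."""
    return "DEFECT" in text


fixtures = [
    Fixture("A grounded, neutral description.", None),
    Fixture("A description with an injected DEFECT.", "tone_break"),
]
result = run_eval(fixtures, fake_critic)
```

Swapping `fake_critic` for a callable that wraps the live critic would be the only change needed for a keyed run under this design.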

Defect taxonomy (23 fixtures on a real edition-description brief)

Type	n	Axis	Counts as caught when
factual_hallucination	6	factual_accuracy	factual_accuracy ≤ 2, or a factual issue is reported, or ParseFailed
banned_phrase	4	banned_phrases	banned_phrases ≤ 2, or a banned-phrase issue is reported, or ParseFailed
length (over/under)	4	length	length ≤ 2, or ParseFailed
tone_break	4	tone	tone ≤ 2, or ParseFailed
clean (control)	5	n/a	never; being flagged counts as a false positive

ParseFailed (fail-closed verdict) counts as a correct reject for any defect.
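The "caught" rule from the table can be sketched as a small pure function. This is a Python illustration under assumptions, not the harness's actual types: `Verdict`, `DEFECT_AXIS`, `ISSUE_COUNTS`, and `is_caught` are hypothetical names, and the verdict shape (per-axis integer scores plus a set of axes with reported issues) is a guess at what the critic returns.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    # Hypothetical verdict shape: axis name -> integer score.
    scores: dict[str, int] = field(default_factory=dict)
    # Axes for which the critic reported an explicit issue.
    issue_axes: set[str] = field(default_factory=set)
    # True when the critic output could not be parsed (fail-closed).
    parse_failed: bool = False


# Defect type -> axis expected to catch it, as listed in the taxonomy table.
DEFECT_AXIS = {
    "factual_hallucination": "factual_accuracy",
    "banned_phrase": "banned_phrases",
    "length": "length",
    "tone_break": "tone",
}

# Per the table, only these two types also count a reported issue as a catch.
ISSUE_COUNTS = {"factual_hallucination", "banned_phrase"}


def is_caught(defect_type: str, verdict: Verdict) -> bool:
    """Return True when the verdict counts as catching the injected defect."""
    if verdict.parse_failed:
        return True  # fail-closed: an unparseable verdict is a correct reject
    axis = DEFECT_AXIS[defect_type]
    score = verdict.scores.get(axis)
    if score is not None and score <= 2:
        return True
    return defect_type in ISSUE_COUNTS and axis in verdict.issue_axes
```

For example, `is_caught("tone_break", Verdict(scores={"tone": 2}))` is true, while a tone score of 3 is not a catch.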

Honest gate (hardened per adversarial QA)

Passed = catchRate ≥ 0.80 AND falsePositiveRate ≤ 0.20. The original gate checked catchRate only, which let a flag-everything critic (FP = 1.0) report success; the runner's own test now proves that such a critic fails. A useless critic can't masquerade as calibrated.
Clean controls were rewritten to be free of meta-phrases, grounded, and within length bounds, so a legitimately strict critic isn't penalized with spurious false positives.
Length defects breach the bounds by a wide margin (under at ~½·Min, over at ~1.5·Max), so an LLM eyeballing the prose can actually catch them. A <1% margin would produce misses for reasons unrelated to critic quality.
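The two-sided gate is simple enough to sketch directly. The thresholds below come from the description above; the function and constant names are hypothetical, and the fixture counts (18 defects, 5 clean controls) are taken from the taxonomy table.

```python
CATCH_RATE_MIN = 0.80
FALSE_POSITIVE_MAX = 0.20


def rates(defect_flags: list[bool], control_flags: list[bool]) -> tuple[float, float]:
    """Return (catch_rate, false_positive_rate) from per-fixture flag results."""
    catch_rate = sum(defect_flags) / len(defect_flags)
    false_positive_rate = sum(control_flags) / len(control_flags)
    return catch_rate, false_positive_rate


def passed(catch_rate: float, false_positive_rate: float) -> bool:
    """Both conditions must hold; a high catch rate alone is not enough."""
    return catch_rate >= CATCH_RATE_MIN and false_positive_rate <= FALSE_POSITIVE_MAX


# A flag-everything critic: perfect catch rate, but every clean control is flagged.
flag_all = rates([True] * 18, [True] * 5)

# A reasonable critic: 16 of 18 defects caught, 1 of 5 controls flagged.
good = rates([True] * 16 + [False] * 2, [True] + [False] * 4)

# Same catch rate, but 2 of 5 controls flagged: just over the FP limit.
fp_over = rates([True] * 16 + [False] * 2, [True] * 2 + [False] * 3)
```

With these fixture counts the gate works out to at least 15 of 18 defects caught (14/18 ≈ 0.78 falls short) and at most 1 of 5 clean controls flagged.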

Admin UI

New "Run critic-defect eval" button on the AI-quality Evals tab → catch% / FP% / n + PASS/FAIL badge; result persists into the eval history.

Tests — 15 (full AiEvals suite 41 pass / 5 live-key skip)

Pure injector transforms (breaches are real and deterministic) plus runner scoring with fake critics: flag-everything → fails (FP guard), catches-nothing → catch rate 0.0, good critic → pass, FP just over the limit → fail, garbage output → ParseFailed, counted as caught. StudyBuddy set-equality check is green; no ITool leaked.
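As an illustration of what "breaches are real, deterministic" could mean for the length injector, here is a hedged Python sketch. It assumes bounds are measured in characters and uses made-up bounds of 400 and 900; `inject_under`, `inject_over`, and both constants are hypothetical and not taken from the PR.

```python
MIN_LEN = 400  # hypothetical lower bound, in characters
MAX_LEN = 900  # hypothetical upper bound, in characters


def inject_under(draft: str, min_len: int) -> str:
    """Truncate to about half the minimum (assumes min_len >= 2)."""
    target = max(1, min_len // 2)
    return draft[:target]


def inject_over(draft: str, max_len: int) -> str:
    """Pad by repeating the draft, then cut at about 1.5x the maximum."""
    target = (max_len * 3) // 2
    repeats = target // max(1, len(draft)) + 1
    return ((draft + " ") * repeats)[:target]


clean = "word " * 120  # a 600-character in-bounds stand-in draft

under = inject_under(clean, MIN_LEN)
over = inject_over(clean, MAX_LEN)

# The breach is real: both outputs fall well outside the bounds.
assert len(under) < MIN_LEN
assert len(over) > MAX_LEN

# The transform is deterministic: same input, same output.
assert inject_under(clean, MIN_LEN) == under
assert inject_over(clean, MAX_LEN) == over
```

Because the transforms use no randomness and no model call, assertions like these can run in CI without a key.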

Verify

dotnet test tests/TextStack.AiEvals → 41 pass / 5 skip (deterministic half runs with no key)
dotnet test tests/TextStack.UnitTests → 402 pass
dotnet format --verify-no-changes → clean
pnpm -C apps/admin exec tsc --noEmit + build → clean

Note: FP-rate enforced now; golden set grows later (per RAG/StudyBuddy golden TODOs). Admin button is build-verified; live click is owner-triggered (needs prod key + admin session).

🤖 Generated with Claude Code

Injects KNOWN defects (hallucinated facts, banned phrases, wrong length, tone breaks) into clean drafts, runs them through the real AI-041 critic (nano), and measures catch-rate + clean-control false-positive rate, validating the calibration AutoPublishCrew/SeoCrew gate publish on.

- Deterministic injector + scoring → CI-testable with a FAKE critic, no key; live nano run admin-triggered via POST /admin/ai-quality/evals/criticdefects/run, persists a criticdefects eval_run. Mirrors ToolCallEvalRunner (no judge, Score=catch-rate, BreakdownJson per-axis).
- Honest gate: Passed = catch-rate >= 0.80 AND false-positive <= 0.20; a flag-everything critic (FP=1.0) correctly FAILS, not passes.
- 23 fixtures (factual x6, banned x4, length x4, tone x4, clean x5) on a real edition-description brief; clean controls neutral + grounded, length defects breach by a wide margin so an LLM can actually catch them.
- Admin Evals tab: Run critic-defect button → catch%/FP%/n + PASS/FAIL.

15 tests (injector + runner w/ fake critic, fail-closed + gate cases). FP-rate enforced; golden grows later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mrviduus merged commit e2367e4 into main Jun 16, 2026
5 checks passed

mrviduus deleted the ai-044-critic-defect-harness branch June 16, 2026 01:15

mrviduus merged 1 commit into main from ai-044-critic-defect-harness
