From 3fb6696d728b8c372e5ad980cb7dad4f8f0d97e1 Mon Sep 17 00:00:00 2001
From: Max Flanagan
Date: Sat, 4 Apr 2026 21:48:32 -0400
Subject: [PATCH] feat(red-lines): add assertion verification and scope lock (v1.7.0)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two behavioral failure patterns not covered by existing red lines:

1. Unverified factual assertions made during OBSERVE/PLAN/EXECUTE —
   market hours, licensing status, prices, deployment state — stated as
   fact without tool verification. "No rubber-stamp verification" only
   covers the VERIFY phase; this gap lets confident wrong claims slip
   through earlier.

2. Scope expansion during EXECUTE without user approval — research tasks
   that transition to building, bug fixes that add unrequested cleanup
   or convenience features. ISC Quality Gate enforces criteria
   completeness before BUILD but has no symmetric containment gate
   during execution.

Adds two red lines after the v1.6.0 orphaned-PASS rule:
- No unverified factual assertions
- No scope expansion without approval

Evidence: 51 failure captures, 78 algorithm reflections, 30-day window.
Both patterns averaged 3.0–3.1/10 sentiment vs 5.1/10 overall.

Closes #11
---
 versions/TheAlgorithm_Latest.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/versions/TheAlgorithm_Latest.md b/versions/TheAlgorithm_Latest.md
index 23dc45e..19ef81e 100644
--- a/versions/TheAlgorithm_Latest.md
+++ b/versions/TheAlgorithm_Latest.md
@@ -1202,6 +1202,8 @@ Check background agent output with Read tool on the output_file path.
 - **No build drift (v1.3.0).** Re-read [CRITICAL] ISC criteria BEFORE creating artifacts. Check [CRITICAL] anti-criteria AFTER each artifact. Never build on autopilot while ISC criteria sit unread.
 - **No rubber-stamp verification (v1.3.0).** Every VERIFY claim requires SPECIFIC evidence. Numeric criteria need actual computed values. Anti-criteria need specific checks performed. "PASS" without evidence = violation.
 - **No orphaned PASS claims (v1.6.0).** Writing "PASS" or "verified" in prose without calling TaskUpdate(completed) is a violation. Every PASS claim MUST be accompanied by a TaskUpdate call. The VERIFY COMPLETION GATE catches missed calls — but this red line means you should never need it.
+- **No unverified factual assertions (v1.7.0).** Before stating ANY current-state fact — prices, service status, API behavior, software licensing, deployment state, what's visible in a UI — verify it with a tool call first. Stating a guess as fact is a trust violation equivalent to rubber-stamp verification. If you cannot verify, say "I haven't verified this." This applies in every phase, not just VERIFY.
+- **No scope expansion without approval (v1.7.0).** The ISC defined at the end of PLAN is the complete scope of work. If you discover adjacent work during EXECUTE (cleanup, bonus features, extra data pulls, convenience additions), STOP and ask before doing it. "Explore X" = find and report. "Fix X" = fix that specific thing. Neither authorizes building Y. Test before any unplanned action: "Did the user explicitly request this?" If no → ask.
 ALWAYS. USE. THE. ALGORITHM. AND. PROPER. OUTPUT. FORMAT. AND. INVOKE. CAPABILITIES.
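
The two v1.7.0 red lines added in this patch can be read as simple gates. The sketch below is illustrative only and is not part of the patch: `ScopeLock`, `check`, and `state_fact` are hypothetical names, the approved set merely stands in for the ISC fixed at the end of PLAN, and exact string matching is a simplifying assumption about how "explicitly requested" might be decided.

```python
from dataclasses import dataclass, field


@dataclass
class ScopeLock:
    """Hypothetical gate; `approved` stands in for the ISC fixed at the end of PLAN."""

    approved: frozenset
    pending_questions: list = field(default_factory=list)

    def check(self, action: str) -> bool:
        """Return True only if the action was explicitly requested.

        Anything else is not executed; a question for the user is queued instead.
        """
        if action in self.approved:
            return True
        self.pending_questions.append(
            f"Not in the approved scope: {action!r}. Proceed?"
        )
        return False


def state_fact(claim: str, verified: bool) -> str:
    """Pass a verified claim through unchanged; label an unverified one."""
    if verified:
        return claim
    return f"{claim} (I haven't verified this.)"


if __name__ == "__main__":
    lock = ScopeLock(approved=frozenset({"fix login redirect bug"}))
    print(lock.check("fix login redirect bug"))  # in scope
    print(lock.check("refactor auth module"))    # out of scope, question queued
    print(lock.pending_questions)
    print(state_fact("The deployment is live", verified=False))
```

The design choice worth noting is that an out-of-scope action is never silently dropped or silently run: it becomes a pending question, which mirrors the "If no → ask" test in the second red line.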