yetanotherco · MauroToscano · Jun 16, 2026
diff --git a/.github/ai-review/matrix.json b/.github/ai-review/matrix.json
@@ -0,0 +1,73 @@
+{
+  "standard": {
+    "review_lanes": [
+      {
+        "id": "minimax-correctness",
+        "model": "minimax/minimax-m3",
+        "prompt": "correctness",
+        "max_output_tokens": 32000,
+        "provider": {
+          "order": ["novita"],
+          "allow_fallbacks": false
+        }
+      }
+    ],
+    "verifier_lanes": []
+  },
+  "critical": {
+    "review_lanes": [
+      {
+        "id": "minimax-critical-correctness",
+        "model": "minimax/minimax-m3",
+        "prompt": "correctness",
+        "max_output_tokens": 32000,
+        "provider": {
+          "order": ["novita"],
+          "allow_fallbacks": false
+        }
+      },
+      {
+        "id": "minimax-critical-maintainability",
+        "model": "minimax/minimax-m3",
+        "prompt": "maintainability",
+        "max_output_tokens": 32000,
+        "provider": {
+          "order": ["novita"],
+          "allow_fallbacks": false
+        }
+      },
+      {
+        "id": "deepseek-soundness",
+        "model": "deepseek/deepseek-v4-pro",
+        "prompt": "soundness",
+        "max_output_tokens": 32000
+      },
+      {
+        "id": "glm-critical",
+        "model": "z-ai/glm-5.1",
+        "prompt": "critical",
+        "max_output_tokens": 32000
+      },
+      {
+        "id": "qwen-critical",
+        "model": "qwen/qwen3.7-max",
+        "prompt": "critical",
+        "max_output_tokens": 32000
+      }
+    ],
+    "verifier_lanes": [
+      {
+        "id": "glm-critical-verifier",
+        "model": "z-ai/glm-5.1",
+        "prompt": "verify-critical",
+        "max_output_tokens": 32000
+      },
+      {
+        "id": "deepseek-critical-verifier",
+        "model": "deepseek/deepseek-v4-pro",
+        "prompt": "verify-critical",
+        "max_output_tokens": 32000
+      }
+    ]
+  }
+}
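
Reviewer sketch (not part of the diff): the matrix above implies a lane schema, which could be checked before the workflow fans out. The snippet below is a minimal sketch under stated assumptions. The required keys are inferred only from the lanes shown in this file. The function names `validate_matrix` and `load_and_validate`, the rule that lane ids are unique across tiers, and the rule that a `provider` block always carries both `order` and `allow_fallbacks` are assumptions, not something the PR states.

```python
import json

# Keys that every lane in the diff carries; inferred from matrix.json only.
REQUIRED_LANE_KEYS = {"id", "model", "prompt", "max_output_tokens"}


def validate_matrix(matrix):
    """Return a list of problems; an empty list means the matrix looks consistent."""
    problems = []
    seen_ids = set()
    for tier, cfg in matrix.items():
        for group in ("review_lanes", "verifier_lanes"):
            lanes = cfg.get(group) if isinstance(cfg, dict) else None
            if not isinstance(lanes, list):
                problems.append(f"{tier}.{group}: expected a list")
                continue
            for index, lane in enumerate(lanes):
                where = f"{tier}.{group}[{index}]"
                if not isinstance(lane, dict):
                    problems.append(f"{where}: expected an object")
                    continue
                missing = REQUIRED_LANE_KEYS - lane.keys()
                if missing:
                    problems.append(f"{where}: missing {sorted(missing)}")
                lane_id = lane.get("id")
                # Assumption: ids are unique across all tiers, as they are in the diff.
                if lane_id in seen_ids:
                    problems.append(f"{where}: duplicate lane id {lane_id!r}")
                seen_ids.add(lane_id)
                provider = lane.get("provider")
                if provider is not None:
                    # Assumption: a provider block pins an ordered, non-empty
                    # provider list and states the fallback policy explicitly.
                    order = provider.get("order")
                    if not isinstance(order, list) or not order:
                        problems.append(f"{where}: provider.order must be a non-empty list")
                    if not isinstance(provider.get("allow_fallbacks"), bool):
                        problems.append(f"{where}: provider.allow_fallbacks must be a boolean")
    return problems


def load_and_validate(path):
    """Read a matrix file from disk and validate it."""
    with open(path, encoding="utf-8") as handle:
        return validate_matrix(json.load(handle))
```

Calling `load_and_validate(".github/ai-review/matrix.json")` from a CI step and failing on a non-empty result would catch a mistyped key before any model call is made.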
diff --git a/.github/ai-review/prompts/critical.md b/.github/ai-review/prompts/critical.md
@@ -0,0 +1,32 @@
+This is the critical AI review tier. Treat this PR as security- or
+soundness-sensitive even if the diff is small.
+
+Review only issues introduced by this PR. Use the diff as the scope anchor,
+but inspect surrounding code, call sites, tests, and relevant base/head
+behavior when needed.
+
+Focus on:
+
+1. **Soundness, security, and correctness**
+   - Constraint under-specification, missing bus interactions, trace mistakes
+   - VM/executor behavior changes, memory access, privilege or state bugs
+   - Obvious transcript/Fiat-Shamir, commitment, challenge-ordering, or
+     witness-soundness drift visible from the changed code
+   - Unsafe Rust, panics on reachable inputs, unchecked assumptions
+
+2. **Regression and integration risk**
+   - Changed invariants, changed public contracts, test fixture drift
+   - Interactions across prover tables, statement generation, AIR inclusion,
+     executor behavior, GPU/CUDA paths, or infra scripts
+
+3. **Maintainability risks**
+   - Complexity that hides correctness assumptions
+   - Stale comments, stale names, misleading docs, or scope drift
+
+Guidelines:
+- Prefer concrete, high-confidence findings over exhaustive speculation.
+- Do not attempt a full spec audit in this workflow. Flag obvious spec or doc
+  drift only when it is directly visible from the PR context.
+- Do not report unrelated pre-existing issues unless this PR worsens them.
+- Be concise and actionable.
+- If no issues are found, say so briefly.
diff --git a/.github/ai-review/prompts/general.md b/.github/ai-review/prompts/general.md
@@ -0,0 +1,21 @@
+1. **Soundness and security issues** - Label by criticality (Critical/High/Medium/Low)
+   - Rust: unsafe blocks, error handling, panics, memory safety issues
+   - ZK/prover soundness: incorrect local constraints, missing trace assignments,
+     invalid witness assumptions, inconsistent proving or verification behavior
+   - VM/executor: instruction semantics, memory access, state transitions,
+     inconsistent execution/proving behavior
+
+2. **Potential bugs** - Logic errors, edge cases, incorrect behavior, race conditions
+
+3. **Performance issues** - Only significant: e.g. O(n^2) on unbounded input, unnecessary allocations, hot path inefficiencies
+
+4. **Simplicity and readability** - Prefer simple, readable code over clever
+   abstractions. Cosmetic rewrites are acceptable when they make changed code,
+   names, comments, or docs easier to understand.
+
+Guidelines:
+- Be concise and to the point
+- Do NOT suggest micro-optimizations, churn, or premature abstractions
+- Always prefer simplicity over complexity when performance gains are marginal
+- Focus on real issues, not hypothetical improvements
+- Be concise and actionable
diff --git a/.github/ai-review/prompts/lanes/correctness.md b/.github/ai-review/prompts/lanes/correctness.md
@@ -0,0 +1,14 @@
+Review this PR for concrete correctness issues introduced by the changed code.
+
+Focus on:
+
+- logic errors, edge cases, and changed invariants
+- incorrect error handling or reachable panics
+- VM, executor, prover, memory, trace, bus, and constraint behavior affected by
+  the diff
+- inconsistent behavior between execution, proving, verification, and tests
+
+If constraints, trace generation, or bus interactions change, check local
+consistency against nearby code and tests. Do not attempt a full spec audit.
+
+Ignore unrelated pre-existing issues. Prefer high-confidence findings.
diff --git a/.github/ai-review/prompts/lanes/maintainability.md b/.github/ai-review/prompts/lanes/maintainability.md
@@ -0,0 +1,15 @@
+Review this PR for simplification, readability, stale comments, stale names, and
+scope drift introduced by the changed code.
+
+Useful cosmetic rewrites are allowed when they make the changed code, names,
+comments, or docs easier to understand. Do not suggest low-signal churn.
+
+Focus on:
+
+- stale or misleading comments and doc comments
+- names that no longer match behavior or scope
+- duplicated logic, avoidable abstractions, and unnecessarily clever code
+- tests whose names or assertions no longer match what they check
+- comments or docs that were not updated after code behavior changed
+
+Prefer concise, actionable findings with concrete file and line references.
diff --git a/.github/ai-review/prompts/lanes/soundness.md b/.github/ai-review/prompts/lanes/soundness.md
@@ -0,0 +1,15 @@
+Review this PR for soundness-sensitive issues visible from the changed code and
+nearby context.
+
+Focus on:
+
+- under-constrained values, missing constraints, and incorrect selectors
+- missing or incorrect bus interactions
+- trace assignment mistakes and witness assumptions
+- inconsistent prover/verifier behavior
+- AIR inclusion or statement-generation drift
+- obvious transcript, commitment, or challenge-ordering drift visible from the
+  changed code
+
+This is not a full spec audit. Report only issues with concrete evidence in the
+diff or surrounding code.
diff --git a/.github/ai-review/prompts/lanes/tests.md b/.github/ai-review/prompts/lanes/tests.md
@@ -0,0 +1,12 @@
+Review this PR for missing or stale tests.
+
+Focus on:
+
+- changed behavior without a test
+- edge cases that are likely to regress
+- tests whose names, fixtures, or assertions no longer match the implementation
+- prover, executor, trace, bus, and constraint changes that need targeted tests
+- docs or comments that imply behavior not covered by tests
+
+Do not ask for broad test rewrites. Prefer targeted tests tied to the changed
+behavior.
diff --git a/.github/ai-review/prompts/lanes/verify-critical.md b/.github/ai-review/prompts/lanes/verify-critical.md
@@ -0,0 +1,13 @@
+Verify candidate review findings for this critical PR.
+
+For each candidate, decide whether the finding is supported by the diff and
+provided surrounding code. Mark it as:
+
+- `confirmed` when the issue is real and introduced or exposed by this PR
+- `rejected` when the claim is wrong, unrelated, or too speculative
+- `uncertain` when it may be real but the provided context is insufficient
+
+For soundness-sensitive claims, require concrete evidence from constraints,
+trace generation, bus interactions, statement generation, executor behavior, or
+nearby tests. Do not accept protocol-level speculation that is not visible from
+the changed code.
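
Reviewer sketch (not part of the diff): the critical tier in `matrix.json` runs two verifier lanes, but the diff does not show how their verdicts are merged. One conservative option is sketched below. The function name `combine_verdicts` and the unanimity rule are assumptions for illustration only. The three verdict labels are taken from the prompt above.

```python
# Verdict labels as defined in the verify prompts.
VERDICTS = ("confirmed", "rejected", "uncertain")


def combine_verdicts(verdicts):
    """Merge per-verifier verdicts for one candidate finding.

    Assumed policy: a finding is `confirmed` or `rejected` only when every
    verifier agrees; any disagreement, or no verdicts at all, is `uncertain`.
    """
    unknown = [v for v in verdicts if v not in VERDICTS]
    if unknown:
        raise ValueError(f"unknown verdicts: {unknown}")
    unique = set(verdicts)
    if unique == {"confirmed"}:
        return "confirmed"
    if unique == {"rejected"}:
        return "rejected"
    return "uncertain"
```

Requiring unanimity matches the prompt's bias against speculative soundness claims, at the cost of pushing more findings into `uncertain` for a human to triage.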
diff --git a/.github/ai-review/prompts/lanes/verify.md b/.github/ai-review/prompts/lanes/verify.md
@@ -0,0 +1,10 @@
+Verify candidate review findings for this PR.
+
+For each candidate, decide whether the finding is supported by the diff and
+provided surrounding code. Mark it as:
+
+- `confirmed` when the issue is real and introduced or exposed by this PR
+- `rejected` when the claim is wrong, unrelated, or too speculative
+- `uncertain` when it may be real but the provided context is insufficient
+
+Prefer rejecting speculative findings. Do not invent new findings in this step.
diff --git a/.github/ai-review/prompts/standard.md b/.github/ai-review/prompts/standard.md
@@ -0,0 +1,31 @@
+This is the standard AI review tier. Review this PR seriously and report
+concrete issues that should be addressed before merge.
+
+Review only issues introduced by this PR. Use the diff as the scope anchor.
+Do not attempt a full spec audit in this workflow. Flag obvious spec or doc drift
+only when it is directly visible from the PR context, and do not report unrelated
+pre-existing issues.
+
+Focus on:
+
+1. **Correctness and regressions**
+   - Logic errors, edge cases, changed invariants, incorrect error handling
+   - VM, prover, memory, bus, trace, and constraint behavior affected by the diff
+   - If constraints, trace generation, or bus interactions change, check their
+     local consistency against the surrounding code and tests
+
+2. **Tests and observability**
+   - Missing tests for new behavior or fixed edge cases
+   - Tests whose names/assertions no longer match the behavior
+
+3. **Simplicity and maintainability**
+   - Unnecessary complexity, duplicated logic, avoidable abstractions
+   - Stale comments, stale names, misleading doc comments, or scope drift
+   - Cosmetic rewrites when they make changed code easier to read or maintain
+
+Guidelines:
+- Prefer fewer, higher-confidence findings.
+- Do not suggest micro-optimizations or low-signal churn.
+- Be concise and actionable.
+- Include concrete file and line references when possible.
+- If no issues are found, say so briefly.
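
Reviewer sketch (not part of the diff): lanes in `matrix.json` name prompts such as `correctness` and `critical`, while the prompt files added here live in two places, `prompts/lanes/` and `prompts/`. The diff does not show how a name is resolved to a file. The snippet below is a minimal sketch that assumes a lookup order of `lanes/<name>.md` first and `<name>.md` second. Both `resolve_prompt` and that order are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional, Set

PROMPT_ROOT = PurePosixPath(".github/ai-review/prompts")


def resolve_prompt(name: str, available: Set[str]) -> Optional[str]:
    """Return the first existing prompt path for `name`, or None if there is none.

    Assumed lookup order: the lane-specific file, then the tier-level file.
    """
    candidates = (
        PROMPT_ROOT / "lanes" / f"{name}.md",
        PROMPT_ROOT / f"{name}.md",
    )
    for candidate in candidates:
        if str(candidate) in available:
            return str(candidate)
    return None


# Prompt files added by this diff.
ADDED_PROMPTS = {
    ".github/ai-review/prompts/critical.md",
    ".github/ai-review/prompts/general.md",
    ".github/ai-review/prompts/standard.md",
    ".github/ai-review/prompts/lanes/correctness.md",
    ".github/ai-review/prompts/lanes/maintainability.md",
    ".github/ai-review/prompts/lanes/soundness.md",
    ".github/ai-review/prompts/lanes/tests.md",
    ".github/ai-review/prompts/lanes/verify-critical.md",
    ".github/ai-review/prompts/lanes/verify.md",
}
```

Under this assumed order, every prompt name used in the matrix resolves to a file, with `critical` being the only one that falls through to the tier-level directory.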