lemmas: booster-only gap fixes — widthOpCode, int/bytes simplifications, preserves-definedness by ehildenb · Pull Request #2849 · runtimeverification/evm-semantics

ehildenb · 2026-05-25T18:39:14Z

The booster backend (booster-only-simplify mode) fails to discharge several side conditions and lemmas that kore handles correctly, because booster's equation engine has stricter matching requirements. This PR closes the gaps found while running the functional test-suite in booster-only mode. Note that I collected the information needed (or Claude did, more precisely) using functionality that's not merged yet (runtimeverification/k#4919), but that the improvements here are independent of that.

Changes:

Updates CLAUDE.md with some basic information about how to debug stuck proofs, mostly focused on getting logs out of the haskell backend for analysis.
evm.md: add total to #widthOpCode declaration; add preserves-definedness to all four #computeValidJumpDests rules so booster's definedness checker passes
lemmas.k: add smtlib(widthOpCode) annotation + smt-lemma bounds for #widthOpCode (0 ≤ x ≤ 33); add widthOpCode-concat-{left,right} rules for booster's single-step equation engine; add Bool tautology rules; add KSet membership/dedup rules with comments explaining which are permanently indeterminate in booster (backend limitation, not fixable from K source)
int-simplification.k: add minInt/maxInt comparison rules, commutativity normalisation (==Int concrete-on-left), arithmetic and Bool simplifications
evm-int-simplification.k: add bit-mask-asWord-m and bit-notmask-asWord-nm
bytes-simplification.k: add preserves-definedness to lookup-as-asWord; add concrete b"" concat identity rules
compute-valid-jump-dests-spec.k: add a fast N=5 functional claim alongside the existing N=1000 performance claim for quicker CI feedback
set-dedup-spec.k: new functional spec exercising the KSet membership and dedup rules under kore. This one is still not passing with --booster-only-simplify.

Validation:

Booster-only functional suite: 8/10 specs pass after these fixes (avg ~2× wall-clock speedup with --booster-only-simplify flag vs kore baseline). The two remaining failures (lemmas-spec with many claims, merkle-spec) are engine-level limitations unrelated to these lemmas.
compute-valid-jump-dests specifically: 1.23× wall-clock speedup in booster-only mode on the N=1000 symbolic claim; with --use-booster-dev (pure booster, no kore proxy) the same proof drops from ~200 s to 0.68s. The kore fallback is the bottleneck, not the semantics.
int-simplifications-spec improves 3.48× in booster-only mode after the minInt/maxInt and commutativity normalisation rules.
widthOpCode smt-lemma bounds let Z3 discharge #widthOpCode-containing side conditions without concrete evaluation, which was previously causing fallbacks.
set-dedup-spec is intentionally kore-only: it documents the booster matchSets() limitation (booster cannot match non-empty KSet patterns — Match.hs:673-676) so the permanently-dead rules in lemmas.k have a test-level explanation.

…alidJumpDests rules `total` on `#widthOpCode` tells the Haskell backend the function is defined for all inputs, suppressing spurious #Ceil obligations. `preserves-definedness` on all four `#computeValidJumpDests` rules unblocks the booster's equation evaluator: without it the backend aborted with `[failure][break] Uncertain about definedness` on every attempt because `PGM[I]` (byte indexing) is non-total in general. With the annotation the booster correctly iterates the 2001-byte concrete prefix via LLVM and halts at the symbolic `#buf` boundary. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…-asWord Without this attribute Booster refused to apply the rule because the LHS contains the non-total hook Lbl_[_]_BYTES-HOOKE. After the fix Booster correctly applies it (8× succeeded in the execute-node simplify request). Analysis: scripts/bench-prove.py --analyse-fallbacks shows the "Uncertain about definedness of rule due to: non-total symbol Lbl_[_]_BYTES-HOOKE" error gone, replaced by successful Booster applications. The rule-index gap and indeterminate-match cases for other equations remain and will be addressed in follow-up commits. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

bytes-simplification.k: add preserves-definedness to lengthBytes-buf and lengthBytes-range. evm-int-simplification.k: add asWord-buf-inversion-rangeUInt256.

…) => true)

lemmas.k: add widthOpCode-concat-left/right composition rules. evm-int-simplification.k: add asWord-range-buf to match the actual #asWord(#range(#buf(N,X), S, W)) term structure after concat-lookup and lookup-as-asWord reductions.

…Code, smt-lemma bounds Add smtlib(widthOpCode) to the #widthOpCode syntax declaration so it is registered as a declared SMT function. This allows smt-lemma rules to encode bounds on #widthOpCode as universally-quantified axioms in the solver prelude, letting Z3 discharge side conditions that contain #widthOpCode without needing to evaluate it concretely. Add widthOpCode-lb and widthOpCode-ub with [smt-lemma] so the solver knows 0 ≤ #widthOpCode(X) ≤ 33 for all X. Also document the "save output to file before analysing" workflow in CLAUDE.md to avoid re-running slow commands with different pipes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…1000 claim compute-valid-jump-dests-fast (N=5): quick smoke test for iterating on simplification lemmas — completes in under a second. compute-valid-jump-dests-perf (N=1000): performance regression benchmark; run without --haskell-logging for accurate baseline timing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ion rules Fixes `int-simplifications-spec` regression under `--booster-only-simplify`. The four new `minint-lt-maxint-*` / `minint-leq-maxint-*` rules at priority 40 make Booster produce `true` directly for claims like: minInt(A, _B) <Int maxInt(C, _D) requires A <Int C minInt(_A, B) <=Int maxInt(_C, D) requires B <=Int D and their shifted variants (E+A, E+B vs F+C, F+D). Without these rules, the existing priority-50 expansion rules (minint-lt, maxint-gt) expand to a disjunction that Booster cannot collapse to `true` in booster-only mode because the path condition is unavailable at the implies check. The non-linear shift rules (E appears twice in the minInt args) correctly fire at priority 40 but their compound `andBool` side conditions fail to discharge when the path condition variables are in non-linear position. Two additional "factored-form" rules handle the result after minInt-factor-left / maxInt-factor-left (priority 50) have applied: the factored LHS is linear, so the compound condition discharges normally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…k-asWord-nm Parametric masking of #asWord(BA:Bytes) with byte-aligned maxUIntN / notMaxUIntN masks; fires at priority 40 (before multimask-b-and at 50).

…n-left Kore's execute-fallback applies INT-KORE.eq-int-true-rigth which can swap ==Int argument order in path conditions (e.g. lengthBytes(BA) ==Int 32 → 32 ==Int lengthBytes(BA)). This makes Booster's implies check fail due to a syntactic mismatch even when the logical content is identical. The new rule [int-eq-comm-concrete] at priority 30 fires before the +Int cancellation rules and normalises concrete-on-left to concrete-on-right, restoring the form the target spec expects. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ation lemmas Adds to INT-SIMPLIFICATION-COMMON: - A <=Int A => true (reflexivity; fixes chop-overflow-02 under --booster-only-simplify) - A <=Int A +Int B => 0 <=Int B - A +Int ((B -Int A) +Int C) => B +Int C (4-term cancellation; fixes cancellation-02) - A <Int B andBool B <Int A => false (strict order contradiction; fixes b2w-03) Adds to INT-SIMPLIFICATION-HASKELL: - A <Int B andBool C <=Int A => false requires B <=Int C [concrete(B,C)] (range contradiction) - C <=Int A andBool A <Int B => false requires B <=Int C [concrete(B,C)] (range contradiction) (requires is purely concrete, evaluated by LLVM hook — no circular self-application risk) Adds to LEMMAS-HASKELL: - B orBool notBool B => true (and commuted variant; fixes b2w-05, chop-additional-knowledge) - B andBool notBool B => false (and commuted variant) Validation: full functional test suite passes in normal mode (slot-updates-spec.k failure is pre-existing, unrelated to these changes). Under --booster-only-simplify, lemmas-spec.k drops from 22 to 18 failing claims. No loops detected (checked via kevm-analyse hot-rules). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…iling claims Add a higher-priority (40) variant of `A <=Int A+B => 0 <=Int B` that short-circuits directly to `true` when `0 <=Int B` is already in the path condition, fixing `chop-overflow-01` in --booster-only-simplify mode where the intermediate `0 <=Int Y` form cannot be discharged at the implies check. Add `map-update-comm` to `LEMMAS-HASKELL`: normalises `M[I1<-V1][I2<-V2]` to `M[I2<-V2][I1<-V1]` when `I2 <Int I1`, fixing the map-update-commutativity claim (`012353799830...`). Loop-safe: after the swap the outer key is larger than the inner key, so the `I2 <Int I1` condition is false on the result. Validated: two booster-only claims fixed, zero new failures (306/322 now pass vs 304/322 before). No regressions in normal mode. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

In K, .Bytes (the constructor) and b"" (LLVM-evaluated concrete empty bytes, a Kore domain value \dv{SortBytes}("")) are different Kore terms. The existing bytes-concat-empty-right/left rules use .Bytes and do not fire when LLVM hooks (e.g. #encodeArgs base case) return b"". New rules: [bytes-concat-empty-right-concrete]: B +Bytes b"" => B [bytes-concat-empty-left-concrete]: b"" +Bytes B => B Fixes booster-only failure for encodepacked-keccak01: the contract evaluates keccak(1 : #encodeArgs(#uint256(A0))) where the #encodeArgs base case produces b"", leaving keccak(X +Bytes b"") vs keccak(X) as the implies-check mismatch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… functional spec Adds a runLemma/doneLemma spec that tests Set deduplication lemmas (SetItem membership and |Set deduplication) in isolation. Registers the spec in the KOMPILE_MAIN_FILE table so the pytest harness picks it up automatically. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ead booster rules

ehildenb · 2026-05-25T18:43:32Z

+    rule [lengthBytes-buf]:        lengthBytes(#buf(S, _))                       => S requires 0 <=Int S                     [simplification, preserves-definedness]
+    rule [lengthBytes-range]:      lengthBytes(#range(_, S:Int, W:Int))          => maxInt(0, W) requires 0 <=Int S          [simplification, preserves-definedness]


I'm not sure if these rules actually do preserve definedness.

#buf and #range are total so I don't see a problem with these.

ehildenb · 2026-05-25T18:45:25Z

+  // General reduction: #asWord of a range extracted from a buf.
+  // #buf(N, X) is the big-endian N-byte encoding of X.  Taking W bytes
+  // starting at byte S gives (X >> (8*(N-S-W))) & (2^(8*W)-1).
+  // Activated with concrete N, S, W so Booster can evaluate the arithmetic.
+  rule [asWord-range-buf]:
+    #asWord ( #range ( #buf ( N:Int, X:Int ), S:Int, W:Int ) ) =>
+      X /Int ( 2 ^Int ( 8 *Int ( N -Int S -Int W ) ) ) modInt ( 2 ^Int ( 8 *Int W ) )
+    requires 0 <=Int S andBool 0 <Int W andBool S +Int W <=Int N
+     andBool 0 <=Int X andBool X <Int 2 ^Int ( 8 *Int N )
+    [simplification, concrete(N, S, W), preserves-definedness]


This lemma seems unnecessarily complicated, we should track down which test triggered adding it, and see if we can find a smaller/simpler version that works.

…refs Remove '## Running long commands' (belongs in the backend-docs PR). Fix two bullet points in '## Diagnosing slow Haskell backend' that referenced the now-deleted bench-prove.py script. Keep '## Testing a local Haskell backend build' and its kdist cache rebuild instruction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ehildenb · 2026-05-25T18:58:53Z

+    // minInt <Int maxInt / minInt <=Int maxInt: priority 40 fires before the default-priority (50) expansion rules,
+    // so Booster sees `true` directly rather than a disjunction requiring path-condition reasoning.
+    rule [minint-lt-maxint-a]:
+        minInt(A:Int, _B:Int) <Int maxInt(C:Int, _D:Int) => true
+        requires A <Int C
+        [simplification(40), preserves-definedness]
+    rule [minint-lt-maxint-shift-a]:
+        minInt(E:Int +Int A:Int, E:Int +Int _B:Int) <Int maxInt(F:Int +Int C:Int, F:Int +Int _D:Int) => true
+        requires A <Int C andBool E <=Int F
+        [simplification(40), preserves-definedness]
+    rule [minint-leq-maxint-b]:
+        minInt(_A:Int, B:Int) <=Int maxInt(_C:Int, D:Int) => true
+        requires B <=Int D
+        [simplification(40), preserves-definedness]
+    rule [minint-leq-maxint-shift-b]:
+        minInt(_A:Int +Int E:Int, B:Int +Int E:Int) <=Int maxInt(_C:Int +Int F:Int, D:Int +Int F:Int) => true
+        requires B <=Int D andBool E <=Int F
+        [simplification(40), preserves-definedness]
+    // Factored forms: after minInt-factor-{left,right} + maxInt-factor-{left,right} (priority 50) have fired,
+    // the non-linear shift rules above no longer match; these linear rules catch the result.
+    rule [minint-lt-maxint-shift-a-factored]:
+        E:Int +Int minInt(A:Int, _B:Int) <Int F:Int +Int maxInt(C:Int, _D:Int) => true
+        requires A <Int C andBool E <=Int F
+        [simplification(40), preserves-definedness]
+    rule [minint-leq-maxint-shift-b-factored]:
+        minInt(_A:Int, B:Int) +Int E:Int <=Int maxInt(_C:Int, D:Int) +Int F:Int => true
+        requires B <=Int D andBool E <=Int F
+        [simplification(40), preserves-definedness]
+


Some of these seem like they can be simplified with some other reasoning steps in the way. For eaxmple some of the rules could be combined with a disjunction in the requires clause, and others can first do arithmetic simplification on the expression then rely on the other rules here to fire afterwards.

ehildenb · 2026-05-25T19:01:47Z

+    // NOTE: the two non-linear rules below fire in kore but are INDETERMINATE in booster.
+    // Booster internalises SetItem(X) as KSet{def,[X],None}; matchSets() returns addIndeterminate
+    // for any non-empty KSet (Match.hs:673-676). Backend change required to fix.
    rule   ( S:Set |Set SetItem ( X )                      ) |Set SetItem( X ) =>   ( S:Set |Set SetItem ( X )                      ) [simplification]
    rule ( ( S:Set |Set SetItem ( X ) ) |Set SetItem ( Y ) ) |Set SetItem( X ) => ( ( S:Set |Set SetItem ( X ) ) |Set SetItem ( Y ) ) [simplification]

+    // Membership in the |Set (union) form — fires in kore only (same matchSets limitation).
+    // Kore uses these to discharge the requires-clause of the dedup rule below.
+    rule X:Int in (REST:Set |Set SetItem(Y:Int)) => true      requires           X ==Int Y   [simplification]
+    rule X:Int in (REST:Set |Set SetItem(Y:Int)) => X in REST requires notBool ( X ==Int Y ) [simplification]
+
+    // Deduplication of repeated concrete Int elements in a symbolic |Set expression.
+    // Required in normal mode (where kore's simplify step fires it) but INDETERMINATE in booster.
+    // Booster cannot match SetItem(X) in the pattern: matchSets() is unimplemented for non-empty
+    // KSets (booster/Match.hs:673-676). Cannot be fixed from K source; requires backend change.
+    // Context: #touchAccounts in evm.md adds addresses via |Set; when the same address is added
+    // twice to a symbolic touchedAccounts, this rule is needed to normalize back to a 2-element form.
+    rule S:Set |Set SetItem(X:Int) => S requires X in S [simplification]
+


@jberthold how much work would it be to support these types of simplification rules in the booster backend directly? We don't need to support sets in general, but just these expressions.

The attached set-dedup-spec.k file should capture claims that need to pass with the new rules.

Related code is here: https://github.com/runtimeverification/haskell-backend/blob/00479fbdb86b70fc602aefca9fabab5afc8bad5f/booster/library/Booster/Pattern/Match.hs#L659-L676
At the moment, only empty sets are matched. It looks like we could consider matching singleton sets like the ones above, but there is the danger of discarding a match prematurely. For the above rules, it should be OK because there is always an explicit |Set operation separating the sets and they are singletons where it matters... 🤔

ehildenb · 2026-05-25T19:03:09Z

+    rule [bool-or-not-self]:      B:Bool  orBool notBool B:Bool => true  [simplification]
+    rule [bool-not-or-self]:  notBool B:Bool  orBool B:Bool     => true  [simplification]
+    rule [bool-and-not-self]:     B:Bool andBool notBool B:Bool => false [simplification]
+    rule [bool-not-and-self]: notBool B:Bool andBool B:Bool     => false [simplification]


Formatting needs fixed.

ehildenb · 2026-05-25T19:03:38Z

-    rule #computeValidJumpDests(  _, I, RESULT, LEN) => RESULT                                                                           requires I >=Int LEN
-    rule #computeValidJumpDests(PGM, I, RESULT, LEN) => #computeValidJumpDests(PGM, I +Int 1,                        RESULT[I <- 1], LEN) requires I <Int LEN andBool PGM [ I ] ==Int 91
-    rule #computeValidJumpDests(PGM, I, RESULT, LEN) => #computeValidJumpDests(PGM, I +Int #widthOpCode(PGM [ I ]), RESULT,          LEN) requires I <Int LEN andBool notBool PGM [ I ] ==Int 91
+    rule #computeValidJumpDests(PGM)                 => #computeValidJumpDests(PGM, 0, padRightBytes(.Bytes, lengthBytes(PGM), 0), lengthBytes(PGM))                                                                             [preserves-definedness]


This function is already marked as total.

I think this rule should still be marked preserves-definedness ... the RHS #computeValidJumpDests(_, _, _, _) is not total but should be defined for these arguments.

ehildenb · 2026-05-25T19:04:32Z

+    rule #computeValidJumpDests(  _, I, RESULT, LEN) => RESULT                                                                           requires I >=Int LEN                                                                   [preserves-definedness]
+    rule #computeValidJumpDests(PGM, I, RESULT, LEN) => #computeValidJumpDests(PGM, I +Int 1,                        RESULT[I <- 1], LEN) requires I <Int LEN andBool PGM [ I ] ==Int 91                                      [preserves-definedness]
+    rule #computeValidJumpDests(PGM, I, RESULT, LEN) => #computeValidJumpDests(PGM, I +Int #widthOpCode(PGM [ I ]), RESULT,          LEN) requires I <Int LEN andBool notBool PGM [ I ] ==Int 91                              [preserves-definedness]


Is it true that these are preserving definedness?

I guess not, but it is pragmatically the only choice to avoid having lengthBytes(PGM) again here.
This is only OK if the length of PGM and RESULT is > I but at the call sites they are the same length LEN.

ehildenb · 2026-05-25T19:04:45Z

+    syntax Int ::= #widthOpCode(Int) [symbol(#widthOpCode), function, total, smtlib(widthOpCode)]
 // -----------------------------------------------------------------


…flow In '## Testing a local Haskell backend build': add the pyproject.toml snippet for pointing at a local pyk checkout, and the uv sync / revert steps that go with it. Add '## Updating local pyk vs local K': explains that a pyk-only change only needs uv sync --reinstall-package kframework (no kdist rebuild), while a K/Haskell backend change requires the full kup + kevm-kdist flow. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ehildenb and others added 15 commits May 25, 2026 18:28

lemmas: Round 2 Booster gap fixes

fa9903d

bytes-simplification.k: add preserves-definedness to lengthBytes-buf and lengthBytes-range. evm-int-simplification.k: add asWord-buf-inversion-rangeUInt256.

lemmas/lemmas.k: Round 3 — add widthOpCode-lb (0 <=Int #widthOpCode(_…

e849dee

…) => true)

lemmas/evm-int-simplification.k: add bit-mask-asWord-m and bit-notmas…

90dd3b2

…k-asWord-nm Parametric masking of #asWord(BA:Bytes) with byte-aligned maxUIntN / notMaxUIntN masks; fires at priority 40 (before multimask-b-and at 50).

lemmas/lemmas.k: add KSet match limitation comments for permanently-d…

d9cc35a

…ead booster rules

ehildenb commented May 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lemmas: booster-only gap fixes — widthOpCode, int/bytes simplifications, preserves-definedness#2849

lemmas: booster-only gap fixes — widthOpCode, int/bytes simplifications, preserves-definedness#2849
ehildenb wants to merge 17 commits into
masterfrom
kevm-lemmas-booster

ehildenb commented May 25, 2026 •

edited

Loading

Uh oh!

ehildenb May 25, 2026

Uh oh!

jberthold May 26, 2026

Uh oh!

ehildenb May 25, 2026

Uh oh!

ehildenb May 25, 2026

Uh oh!

ehildenb May 25, 2026

Uh oh!

jberthold May 26, 2026

Uh oh!

ehildenb May 25, 2026

Uh oh!

ehildenb May 25, 2026

Uh oh!

jberthold May 26, 2026

Uh oh!

ehildenb May 25, 2026

Uh oh!

jberthold May 26, 2026 •

edited

Loading

Uh oh!

ehildenb May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		rule [lengthBytes-buf]: lengthBytes(#buf(S, _)) => S requires 0 <=Int S [simplification, preserves-definedness]
		rule [lengthBytes-range]: lengthBytes(#range(_, S:Int, W:Int)) => maxInt(0, W) requires 0 <=Int S [simplification, preserves-definedness]

		syntax Int ::= #widthOpCode(Int) [symbol(#widthOpCode), function, total, smtlib(widthOpCode)]
		// -----------------------------------------------------------------

Conversation

ehildenb commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jberthold May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ehildenb commented May 25, 2026 •

edited

Loading

jberthold May 26, 2026 •

edited

Loading