perf(compiler): narrow EVM SUB lowering for u64-proven operand pairs #536

Merged
zoowii merged 6 commits into DTVMStack:main from abmcar:perf/evm-sub-u64-wrap-lowering
Jun 24, 2026

Conversation

@abmcar abmcar commented Jun 9, 2026

Narrow EVM SUB lowering when both operands are range-proven u64, replacing the eight-limb generic borrow chain with a three-instruction fast path.

What

A SUB result wraps to full 256-bit width on underflow, so the result cannot carry a narrow range tag — but when both operands are range-proven u64, the computation can be narrowed: for a, b ∈ [0, 2^64),

(a - b) mod 2^256  ==  { wrapping_sub(a0, b0), fill, fill, fill }
where fill = 0 - borrow,  borrow = (a0 <u b0)

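On underflow, borrow is 1 and the i64 negation 0 - borrow yields all-ones, which reproduces the wrapped upper 192 bits bit-exactly; without underflow, borrow is 0 and the upper limbs stay zero. The identity therefore holds for every input, and no proof of a ≥ b is required. This sidesteps the relational-fact prerequisite that correctly deferred result-narrowing for SUB.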
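
The identity can be checked by hand by splitting on whether the subtraction underflows. The derivation below is a sketch that uses the same symbols as the formula; it assumes only that a and b have zero upper limbs, so a = a0 and b = b0, and it writes d for wrapping_sub(a0, b0).

```latex
\begin{aligned}
&\text{Let } a, b \in [0, 2^{64}),\quad d = (a_0 - b_0) \bmod 2^{64}.\\[4pt]
&\textbf{Case } a \ge b:\quad \text{borrow} = 0,\ \text{fill} = 0,\\
&\qquad (a - b) \bmod 2^{256} = a - b = d.\\[4pt]
&\textbf{Case } a < b:\quad \text{borrow} = 1,\ \text{fill} = (0 - 1) \bmod 2^{64} = 2^{64} - 1,\\
&\qquad (a - b) \bmod 2^{256} = 2^{256} - (b - a)\\
&\qquad\qquad = \bigl(2^{64} - (b - a)\bigr) + \bigl(2^{256} - 2^{64}\bigr)\\
&\qquad\qquad = d + (2^{64} - 1)\,\bigl(2^{64} + 2^{128} + 2^{192}\bigr).
\end{aligned}
```

In the second case the last factor places the value 2^64 - 1 in each of the three upper limbs, which is exactly { d, fill, fill, fill }. For 5 - 7 this gives d = 2^64 - 2 and a total of 2^256 - 2.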

The new fast path (gated on bothFitU64 with both operands non-constant; all existing constant paths keep priority) emits one sub, one unsigned compare, and one negation. The generic path it replaces pre-materializes all eight operand limbs through protectUnsafeValue to shield its SUB/SBB borrow chain from flag clobbering — the new path has no SBB chain, so no barrier is needed. The difference has a single consumer and no flag chain, so it needs no protectUnsafeValue either. The result keeps the default U256 range tag, symmetric with the analyzer's SUB transfer rule.
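
To see why the borrow chain collapses when the upper limbs are known to be zero, the following Python sketch models both lowerings on lists of 64-bit limbs. It is an illustration only: `sub_generic`, `sub_narrow_u64`, `to_limbs`, and `from_limbs` are hypothetical names, not functions from the compiler, and the model ignores flags, spills, and `protectUnsafeValue` entirely.

```python
MASK64 = (1 << 64) - 1


def to_limbs(x):
    """Split a 256-bit value into four 64-bit limbs, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(4)]


def from_limbs(limbs):
    """Reassemble four 64-bit limbs into one integer."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def sub_generic(a, b):
    """Four-limb subtract with a borrow chain (models SUB plus three SBBs)."""
    out, borrow = [], 0
    for al, bl in zip(to_limbs(a), to_limbs(b)):
        t = al - bl - borrow
        out.append(t & MASK64)
        borrow = 1 if t < 0 else 0
    return out


def sub_narrow_u64(a0, b0):
    """Narrow model: one sub, one unsigned compare, one negation."""
    diff = (a0 - b0) & MASK64          # wrapping_sub(a0, b0)
    borrow = 1 if a0 < b0 else 0       # a0 <u b0
    fill = (0 - borrow) & MASK64       # all-ones on underflow, else zero
    return [diff, fill, fill, fill]
```

With zero upper limbs, every step of the chain after limb 0 computes 0 - 0 - borrow, so the borrow passes through unchanged and each upper limb equals fill. The gate matters: for a wide minuend such as `1 << 64`, `sub_generic(1 << 64, 1)` yields `[MASK64, 0, 0, 0]`, whereas applying the narrow model to the low limbs alone would produce all-ones upper limbs.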

Also wires the new SubFastRangeU64Count into the arithmetic-summary predicate and log line.

Soundness

  • Wrap algebra verified two independent ways during review: a 4-million-pair brute force over the boundary set (a>b, a<b, a==b, 0 - (2^64-1), single-bit edges) and a separate 442-pair computational check — zero mismatches.
  • Every u64 range-tag producer on main (constant auto-derive, compare results, AND-with-const, analyzer entry import) materializes upper limbs as genuine zeros; no producer can attach a u64 tag to a value with non-zero upper limbs.
  • The dual use of each operand's limb0 (sub + compare) matches the existing ADD fast-path precedent; CgIR lowering memoizes by node pointer, so there is no double emission.
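
A boundary sweep of the kind described in the first bullet can be sketched in a few lines of Python, using big-integer arithmetic as the reference. This is not the check that was run for the PR: the boundary set built by `boundary_values` is an assumption chosen for illustration, and `narrow_sub` is a hypothetical integer model of the lowering rather than compiler code.

```python
import itertools

MASK64 = (1 << 64) - 1
MOD256 = 1 << 256


def narrow_sub(a, b):
    """Integer model of the narrow lowering: low limb plus a broadcast fill."""
    diff = (a - b) & MASK64
    fill = MASK64 if a < b else 0  # 0 - borrow in 64-bit arithmetic
    return diff | (fill << 64) | (fill << 128) | (fill << 192)


def boundary_values():
    """Assumed boundary set: extremes plus single-bit edges and neighbours."""
    vals = {0, 1, MASK64, MASK64 - 1}
    for k in range(64):
        bit = 1 << k
        vals.update({bit, (bit - 1) & MASK64, (bit + 1) & MASK64})
    return sorted(vals)


def count_mismatches():
    """Compare the model against (a - b) mod 2^256 for every ordered pair."""
    vals = boundary_values()
    return sum(
        1
        for a, b in itertools.product(vals, repeat=2)
        if narrow_sub(a, b) != (a - b) % MOD256
    )
```

Calling `count_mismatches()` covers a > b, a < b, and a == b for each pair of boundary values, including 0 - (2^64-1). A sweep like this exercises only the algebra; it says nothing about whether a range tag was attached correctly, which is the subject of the second bullet.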

Verification

  • 6 adversarial differential fixtures + EVMSubWrapDifferentialTest (interpreter vs multipass byte-equality + JIT-compiled assertions): no-underflow, underflow all-ones fill (5 - 7 → 2^256 - 2), equal operands, the 0 - (2^64-1) wrap boundary (limb0 = 1, upper limbs all-ones), dynamic-zero RHS, and a one-sided-wide control that must not take the new path. 6/6 pass; golden suite no regressions.
  • multipass evmone-unittests 223/223; multipass evmone-statetest -k fork_Cancun 2723/2723; format check clean; no new warnings.
  • Both quality findings from review were addressed (see Soundness and What).

Measurements

Paired site-weighted measurement (per-site instrumentation tap, on a separate branch not part of this PR) on the EEST Cancun suite (28,109 shared compiled sites):

| metric | base | this PR | delta |
|---|---|---|---|
| SUB fast-path hit rate | 18.4% | 42.1% | +23.8pp (925 sites moved from the full-width path to the narrow u64 path) |
| all other opcodes | bitwise identical site-by-site | | |

SUB has the largest full-width-path site population on this suite (3,893 sites). Measured against plain main. Stacked with the range-tag consumption PR (#534), whose ENV/compare tags create additional u64 pairs, covered sites grow from 925 to ~1,594 (directional estimate from #534's own measurement run, not re-measured here).

Performance (evmone-bench 27-bench sweep vs upstream/main, median of 5 + 15-rep outlier reruns): median +0.62%, with every outlier — including benchmarks the diff cannot affect — resolving into its run-variance band on rerun; the largest and most stable benchmark (snailtracer, cv ~1%) is +0.4%. End-to-end neutral, no regressions; the win is the per-site removal of seven spills plus the SBB chain, which this suite's hot paths do not isolate.

Known limitation

On real-mainnet workloads, cross-block range widening currently leaves almost no dynamically-proven u64 SUB pairs at full-path sites (execution-weighted ≈ 0); the EEST gains come from in-block pairs (loop counters, gas math). Real-load benefit depends on planned cross-block range-precision work that propagates u64 ranges across basic blocks. Noted in the change document.

Change document: docs/changes/2026-06-10-evm-sub-u64-wrap-lowering/README.md.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings June 9, 2026 21:10

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new range-based u64 fast path lowering for EVM SUB in the MIR compiler and introduces differential tests/fixtures to validate interpreter vs multipass/JIT equivalence on wrap/borrow edge cases.

Changes:

  • Implement range-proven u64 SUB lowering using diff + unsigned-borrow compare + broadcast fill for upper limbs.
  • Add EVM assembly fixtures (goldens) covering no-underflow, underflow, boundary wrap, dynamic-zero RHS, and a wide control case.
  • Extend arithmetic compile stats and add a new gtest parameterized differential suite for the new fixtures.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| tests/evm_asm/sub_wide_u64_control.expected | Golden output for wide-u256 minus u64 control case. |
| tests/evm_asm/sub_wide_u64_control.easm | Fixture ensuring the u64 range fast path does not fire for wide minuend. |
| tests/evm_asm/sub_u64_pair_zero_rhs_dyn.expected | Golden output for dynamic-zero RHS case. |
| tests/evm_asm/sub_u64_pair_zero_rhs_dyn.easm | Fixture exercising new range-based u64 fast path (RHS is dynamic 0). |
| tests/evm_asm/sub_u64_pair_wrap_boundary.expected | Golden output for wrap boundary case with upper-limb all-ones fill. |
| tests/evm_asm/sub_u64_pair_wrap_boundary.easm | Fixture validating borrow broadcast behavior at 0 - (2^64-1). |
| tests/evm_asm/sub_u64_pair_underflow.expected | Golden output for adversarial underflow (5-7). |
| tests/evm_asm/sub_u64_pair_underflow.easm | Fixture validating underflow behavior and all-ones upper limbs. |
| tests/evm_asm/sub_u64_pair_nounderflow.expected | Golden output for typical non-underflow subtraction. |
| tests/evm_asm/sub_u64_pair_nounderflow.easm | Fixture for non-underflow in the range-u64 path. |
| tests/evm_asm/sub_u64_pair_equal.expected | Golden output for equal operands case. |
| tests/evm_asm/sub_u64_pair_equal.easm | Fixture validating equal operands produce zero. |
| src/tests/evm_interp_tests.cpp | Adds parameterized differential test (interp vs multipass/JIT) for the new stems. |
| src/compiler/evm_frontend/evm_mir_compiler.h | Implements new range-based u64 SUB lowering and adds a compile-stat counter. |
| src/compiler/evm_frontend/evm_mir_compiler.cpp | Wires new counter into hasArithCompileStats() and summary logging. |
| docs/changes/2026-06-10-evm-sub-u64-wrap-lowering/README.md | Design/change note describing the rationale, lowering, and validation. |

Comment thread src/compiler/evm_frontend/evm_mir_compiler.h

github-actions Bot commented Jun 9, 2026

⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)

Performance Benchmark Results (threshold: 25%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 3.92 3.79 -3.1% PASS
total/main/blake2b_huff/empty 0.06 0.06 +1.3% PASS
total/main/blake2b_shifts/8415nulls 20.87 21.02 +0.8% PASS
total/main/sha1_divs/5311 12.72 12.30 -3.3% PASS
total/main/sha1_divs/empty 0.15 0.15 +5.4% PASS
total/main/sha1_shifts/5311 9.19 9.35 +1.8% PASS
total/main/sha1_shifts/empty 0.08 0.08 -0.6% PASS
total/main/snailtracer/benchmark 108.78 108.02 -0.7% PASS
total/main/structarray_alloc/nfts_rank 1.38 1.38 -0.3% PASS
total/main/swap_math/insufficient_liquidity 0.00 0.00 +0.3% PASS
total/main/swap_math/received 0.01 0.01 -1.6% PASS
total/main/swap_math/spent 0.01 0.01 -1.6% PASS
total/main/weierstrudel/1 0.41 0.40 -0.3% PASS
total/main/weierstrudel/15 4.55 4.35 -4.4% PASS
total/micro/JUMPDEST_n0/empty 2.86 2.87 +0.2% PASS
total/micro/jump_around/empty 0.12 0.11 -3.7% PASS
total/micro/loop_with_many_jumpdests/empty 64.65 64.70 +0.1% PASS
total/micro/memory_grow_mload/by1 0.23 0.23 +1.9% PASS
total/micro/memory_grow_mload/by16 0.25 0.25 +1.0% PASS
total/micro/memory_grow_mload/by32 0.15 0.15 -0.7% PASS
total/micro/memory_grow_mload/nogrow 0.12 0.12 +0.8% PASS
total/micro/memory_grow_mstore/by1 0.25 0.24 -4.2% PASS
total/micro/memory_grow_mstore/by16 0.14 0.14 -0.7% PASS
total/micro/memory_grow_mstore/by32 0.28 0.28 -0.1% PASS
total/micro/memory_grow_mstore/nogrow 0.23 0.24 +5.4% PASS
total/micro/signextend/one 0.28 0.28 +0.4% PASS
total/micro/signextend/zero 0.49 0.48 -3.3% PASS
total/synth/ADD/b0 3.23 3.23 +0.0% PASS
total/synth/ADD/b1 6.05 5.69 -5.9% PASS
total/synth/ADDRESS/a0 6.64 6.65 +0.1% PASS
total/synth/ADDRESS/a1 5.35 5.35 -0.0% PASS
total/synth/AND/b0 2.87 2.88 +0.1% PASS
total/synth/AND/b1 5.32 5.40 +1.6% PASS
total/synth/BYTE/b0 6.07 6.09 +0.3% PASS
total/synth/BYTE/b1 8.51 8.76 +3.0% PASS
total/synth/CALLDATASIZE/a0 5.54 5.57 +0.7% PASS
total/synth/CALLDATASIZE/a1 3.48 3.48 +0.1% PASS
total/synth/CALLER/a0 6.67 6.73 +0.9% PASS
total/synth/CALLER/a1 6.97 7.18 +3.1% PASS
total/synth/CALLVALUE/a0 3.39 3.39 +0.0% PASS
total/synth/CALLVALUE/a1 6.46 6.55 +1.4% PASS
total/synth/CODESIZE/a0 7.07 6.80 -3.7% PASS
total/synth/CODESIZE/a1 6.35 6.53 +2.7% PASS
total/synth/DUP1/d0 2.03 1.99 -1.1% PASS
total/synth/DUP1/d1 2.05 2.14 +4.6% PASS
total/synth/DUP10/d0 2.00 1.97 -1.1% PASS
total/synth/DUP10/d1 2.14 2.12 -0.9% PASS
total/synth/DUP11/d0 1.23 1.23 +0.0% PASS
total/synth/DUP11/d1 2.15 2.13 -1.2% PASS
total/synth/DUP12/d0 1.99 1.99 -0.2% PASS
total/synth/DUP12/d1 1.45 1.65 +13.6% PASS
total/synth/DUP13/d0 1.99 1.95 -2.2% PASS
total/synth/DUP13/d1 2.04 2.12 +4.0% PASS
total/synth/DUP14/d0 1.23 1.23 +0.1% PASS
total/synth/DUP14/d1 2.07 2.10 +1.2% PASS
total/synth/DUP15/d0 2.03 2.03 +0.2% PASS
total/synth/DUP15/d1 1.63 1.44 -11.7% PASS
total/synth/DUP16/d0 2.01 1.97 -2.2% PASS
total/synth/DUP16/d1 2.10 2.20 +5.1% PASS
total/synth/DUP2/d0 1.23 1.23 +0.1% PASS
total/synth/DUP2/d1 2.09 2.17 +3.7% PASS
total/synth/DUP3/d0 1.94 2.07 +6.5% PASS
total/synth/DUP3/d1 1.52 1.65 +8.7% PASS
total/synth/DUP4/d0 1.95 1.96 +0.2% PASS
total/synth/DUP4/d1 2.13 2.18 +2.1% PASS
total/synth/DUP5/d0 1.23 1.23 +0.2% PASS
total/synth/DUP5/d1 2.18 2.11 -3.3% PASS
total/synth/DUP6/d0 2.02 1.97 -2.7% PASS
total/synth/DUP6/d1 1.63 1.57 -3.7% PASS
total/synth/DUP7/d0 1.97 1.97 +0.2% PASS
total/synth/DUP7/d1 2.14 2.16 +1.2% PASS
total/synth/DUP8/d0 1.23 1.23 +0.0% PASS
total/synth/DUP8/d1 2.20 2.07 -5.5% PASS
total/synth/DUP9/d0 1.99 2.00 +0.6% PASS
total/synth/DUP9/d1 1.65 1.44 -12.7% PASS
total/synth/EQ/b0 6.07 6.14 +1.2% PASS
total/synth/EQ/b1 6.40 6.39 -0.2% PASS
total/synth/GAS/a0 3.95 3.94 -0.2% PASS
total/synth/GAS/a1 7.32 7.38 +0.8% PASS
total/synth/GT/b0 6.42 6.30 -1.7% PASS
total/synth/GT/b1 6.59 6.62 +0.5% PASS
total/synth/ISZERO/u0 10.11 10.13 +0.3% PASS
total/synth/JUMPDEST/n0 2.86 2.87 +0.1% PASS
total/synth/LT/b0 6.31 6.44 +2.0% PASS
total/synth/LT/b1 5.41 5.44 +0.5% PASS
total/synth/MSIZE/a0 6.06 6.06 +0.0% PASS
total/synth/MSIZE/a1 6.21 6.22 +0.2% PASS
total/synth/MUL/b0 7.88 8.18 +3.8% PASS
total/synth/MUL/b1 5.93 5.91 -0.4% PASS
total/synth/NOT/u0 8.01 7.97 -0.6% PASS
total/synth/OR/b0 5.28 5.00 -5.3% PASS
total/synth/OR/b1 3.40 3.42 +0.4% PASS
total/synth/PC/a0 5.63 5.62 -0.1% PASS
total/synth/PC/a1 3.50 3.46 -1.0% PASS
total/synth/PUSH1/p0 2.22 2.30 +3.8% PASS
total/synth/PUSH1/p1 1.53 1.68 +9.8% PASS
total/synth/PUSH10/p0 2.20 2.31 +5.3% PASS
total/synth/PUSH10/p1 1.76 1.71 -2.6% PASS
total/synth/PUSH11/p0 2.21 2.26 +2.4% PASS
total/synth/PUSH11/p1 2.40 2.49 +3.6% PASS
total/synth/PUSH12/p0 1.32 1.32 -0.1% PASS
total/synth/PUSH12/p1 2.38 2.36 -0.9% PASS
total/synth/PUSH13/p0 2.15 2.37 +10.3% PASS
total/synth/PUSH13/p1 1.54 1.57 +2.1% PASS
total/synth/PUSH14/p0 2.31 2.31 +0.1% PASS
total/synth/PUSH14/p1 2.63 2.54 -3.2% PASS
total/synth/PUSH15/p0 1.32 1.32 +0.3% PASS
total/synth/PUSH15/p1 2.57 2.64 +2.7% PASS
total/synth/PUSH16/p0 2.28 2.65 +17.3% PASS
total/synth/PUSH16/p1 1.56 1.63 +4.5% PASS
total/synth/PUSH17/p0 2.25 2.34 +4.0% PASS
total/synth/PUSH17/p1 2.45 2.43 -0.6% PASS
total/synth/PUSH18/p0 1.32 1.31 -1.1% PASS
total/synth/PUSH18/p1 2.44 2.52 +3.1% PASS
total/synth/PUSH19/p0 2.29 2.37 +3.3% PASS
total/synth/PUSH19/p1 1.68 1.58 -5.8% PASS
total/synth/PUSH2/p0 2.41 2.23 -7.5% PASS
total/synth/PUSH2/p1 2.41 2.45 +1.6% PASS
total/synth/PUSH20/p0 3.03 2.25 -25.8% PASS
total/synth/PUSH20/p1 2.39 2.40 +0.2% PASS
total/synth/PUSH21/p0 1.34 1.32 -1.1% PASS
total/synth/PUSH21/p1 2.42 2.44 +0.8% PASS
total/synth/PUSH22/p0 2.34 2.38 +1.7% PASS
total/synth/PUSH22/p1 1.55 1.64 +5.6% PASS
total/synth/PUSH23/p0 2.28 2.28 +0.2% PASS
total/synth/PUSH23/p1 2.39 2.61 +9.1% PASS
total/synth/PUSH24/p0 1.32 1.32 +0.2% PASS
total/synth/PUSH24/p1 2.61 2.41 -7.7% PASS
total/synth/PUSH25/p0 2.43 2.24 -7.6% PASS
total/synth/PUSH25/p1 1.65 1.76 +6.2% PASS
total/synth/PUSH26/p0 2.50 2.34 -6.6% PASS
total/synth/PUSH26/p1 2.42 2.47 +2.1% PASS
total/synth/PUSH27/p0 1.32 1.32 +0.4% PASS
total/synth/PUSH27/p1 2.48 2.51 +1.3% PASS
total/synth/PUSH28/p0 2.35 2.30 -2.2% PASS
total/synth/PUSH28/p1 1.57 1.60 +2.0% PASS
total/synth/PUSH29/p0 2.33 2.33 -0.2% PASS
total/synth/PUSH29/p1 2.49 2.46 -1.2% PASS
total/synth/PUSH3/p0 1.30 1.32 +2.0% PASS
total/synth/PUSH3/p1 2.42 2.59 +7.0% PASS
total/synth/PUSH30/p0 1.58 1.49 -5.8% PASS
total/synth/PUSH30/p1 2.46 2.53 +2.8% PASS
total/synth/PUSH31/p0 2.42 2.29 -5.3% PASS
total/synth/PUSH31/p1 1.85 1.71 -7.5% PASS
total/synth/PUSH32/p0 2.42 2.38 -1.7% PASS
total/synth/PUSH32/p1 2.43 2.44 +0.4% PASS
total/synth/PUSH4/p0 2.28 2.37 +4.0% PASS
total/synth/PUSH4/p1 1.68 1.66 -1.4% PASS
total/synth/PUSH5/p0 2.28 2.34 +2.3% PASS
total/synth/PUSH5/p1 2.40 2.46 +2.8% PASS
total/synth/PUSH6/p0 1.32 1.31 -0.7% PASS
total/synth/PUSH6/p1 2.41 2.61 +8.4% PASS
total/synth/PUSH7/p0 2.36 2.33 -1.4% PASS
total/synth/PUSH7/p1 1.55 1.84 +19.0% PASS
total/synth/PUSH8/p0 2.32 2.33 +0.4% PASS
total/synth/PUSH8/p1 2.45 2.38 -2.7% PASS
total/synth/PUSH9/p0 1.32 1.32 +0.2% PASS
total/synth/PUSH9/p1 2.37 2.37 -0.2% PASS
total/synth/RETURNDATASIZE/a0 3.56 3.60 +1.2% PASS
total/synth/RETURNDATASIZE/a1 6.65 6.50 -2.0% PASS
total/synth/SAR/b0 3.94 3.95 +0.2% PASS
total/synth/SAR/b1 7.63 7.52 -1.5% PASS
total/synth/SGT/b0 5.69 5.65 -0.6% PASS
total/synth/SGT/b1 4.13 4.14 +0.1% PASS
total/synth/SHL/b0 7.01 7.39 +5.3% PASS
total/synth/SHL/b1 3.65 3.65 -0.2% PASS
total/synth/SHR/b0 7.26 7.22 -0.4% PASS
total/synth/SHR/b1 6.06 6.38 +5.2% PASS
total/synth/SIGNEXTEND/b0 3.37 3.37 -0.1% PASS
total/synth/SIGNEXTEND/b1 6.45 6.65 +3.0% PASS
total/synth/SLT/b0 4.30 4.30 +0.1% PASS
total/synth/SLT/b1 5.63 5.58 -0.9% PASS
total/synth/SUB/b0 5.82 5.55 -4.7% PASS
total/synth/SUB/b1 5.68 5.79 +1.8% PASS
total/synth/SWAP1/s0 3.52 3.52 +0.1% PASS
total/synth/SWAP10/s0 3.54 3.54 +0.1% PASS
total/synth/SWAP11/s0 3.49 3.53 +1.3% PASS
total/synth/SWAP12/s0 3.49 3.59 +2.9% PASS
total/synth/SWAP13/s0 3.54 3.54 +0.0% PASS
total/synth/SWAP14/s0 3.70 3.46 -6.5% PASS
total/synth/SWAP15/s0 5.73 3.58 -37.6% PASS
total/synth/SWAP16/s0 3.56 3.56 +0.1% PASS
total/synth/SWAP2/s0 3.55 3.51 -1.3% PASS
total/synth/SWAP3/s0 3.39 3.50 +3.2% PASS
total/synth/SWAP4/s0 3.53 3.53 +0.1% PASS
total/synth/SWAP5/s0 3.42 3.44 +0.6% PASS
total/synth/SWAP6/s0 3.47 3.52 +1.4% PASS
total/synth/SWAP7/s0 3.53 3.51 -0.5% PASS
total/synth/SWAP8/s0 3.59 3.53 -1.6% PASS
total/synth/SWAP9/s0 3.58 3.46 -3.5% PASS
total/synth/XOR/b0 5.03 5.06 +0.6% PASS
total/synth/XOR/b1 5.26 5.57 +5.9% PASS
total/synth/loop_v1 12.22 12.16 -0.5% PASS
total/synth/loop_v2 12.74 12.32 -3.3% PASS

Summary: 194 benchmarks, 0 regressions


✅ Performance Check Passed (multipass)

Performance Benchmark Results (threshold: 25%)

Benchmark Baseline (us) Current (us) Change Status
total/main/blake2b_huff/8415nulls 1.59 1.57 -0.9% PASS
total/main/blake2b_huff/empty 0.02 0.03 +2.1% PASS
total/main/blake2b_shifts/8415nulls 18.99 17.31 -8.8% PASS
total/main/sha1_divs/5311 12.82 12.89 +0.5% PASS
total/main/sha1_divs/empty 0.16 0.16 +0.6% PASS
total/main/sha1_shifts/5311 9.77 9.86 +0.9% PASS
total/main/sha1_shifts/empty 0.08 0.08 -5.9% PASS
total/main/snailtracer/benchmark 119.90 120.67 +0.6% PASS
total/main/structarray_alloc/nfts_rank 1.36 1.20 -11.7% PASS
total/main/swap_math/insufficient_liquidity 0.00 0.00 -4.2% PASS
total/main/swap_math/received 0.01 0.01 -0.6% PASS
total/main/swap_math/spent 0.01 0.01 +0.4% PASS
total/main/weierstrudel/1 0.40 0.40 -0.6% PASS
total/main/weierstrudel/15 4.37 4.41 +0.9% PASS
total/micro/JUMPDEST_n0/empty 0.00 0.00 -0.6% PASS
total/micro/jump_around/empty 0.06 0.06 +0.4% PASS
total/micro/loop_with_many_jumpdests/empty 0.01 0.01 -2.2% PASS
total/micro/memory_grow_mload/by1 0.02 0.02 -1.0% PASS
total/micro/memory_grow_mload/by16 0.02 0.02 -0.2% PASS
total/micro/memory_grow_mload/by32 0.01 0.01 +0.7% PASS
total/micro/memory_grow_mload/nogrow 0.01 0.01 -2.6% PASS
total/micro/memory_grow_mstore/by1 0.02 0.02 -13.1% PASS
total/micro/memory_grow_mstore/by16 0.01 0.01 -0.2% PASS
total/micro/memory_grow_mstore/by32 0.02 0.02 +0.3% PASS
total/micro/memory_grow_mstore/nogrow 0.02 0.02 +2.1% PASS
total/micro/signextend/one 0.12 0.13 +1.4% PASS
total/micro/signextend/zero 0.23 0.23 +0.3% PASS
total/synth/ADD/b0 0.00 0.00 -0.6% PASS
total/synth/ADD/b1 0.00 0.00 -1.0% PASS
total/synth/ADDRESS/a0 0.21 0.20 -3.6% PASS
total/synth/ADDRESS/a1 0.15 0.15 -0.3% PASS
total/synth/AND/b0 0.00 0.00 -0.9% PASS
total/synth/AND/b1 0.00 0.00 -0.9% PASS
total/synth/BYTE/b0 0.00 0.00 -0.8% PASS
total/synth/BYTE/b1 0.00 0.00 -3.8% PASS
total/synth/CALLDATASIZE/a0 0.12 0.12 +1.4% PASS
total/synth/CALLDATASIZE/a1 0.08 0.08 -0.7% PASS
total/synth/CALLER/a0 0.21 0.22 +5.6% PASS
total/synth/CALLER/a1 0.21 0.20 -3.0% PASS
total/synth/CALLVALUE/a0 0.21 0.21 +0.1% PASS
total/synth/CALLVALUE/a1 0.28 0.29 +3.5% PASS
total/synth/CODESIZE/a0 0.12 0.12 -1.5% PASS
total/synth/CODESIZE/a1 0.12 0.12 -3.7% PASS
total/synth/DUP1/d0 0.00 0.00 -0.4% PASS
total/synth/DUP1/d1 0.00 0.00 +1.6% PASS
total/synth/DUP10/d0 0.00 0.00 -2.3% PASS
total/synth/DUP10/d1 0.00 0.00 -1.5% PASS
total/synth/DUP11/d0 0.00 0.00 -1.0% PASS
total/synth/DUP11/d1 0.00 0.00 -0.6% PASS
total/synth/DUP12/d0 0.00 0.00 -0.8% PASS
total/synth/DUP12/d1 0.00 0.00 -1.1% PASS
total/synth/DUP13/d0 0.00 0.00 -2.6% PASS
total/synth/DUP13/d1 0.00 0.00 -1.8% PASS
total/synth/DUP14/d0 0.00 0.00 -0.6% PASS
total/synth/DUP14/d1 0.00 0.00 -2.5% PASS
total/synth/DUP15/d0 0.00 0.00 +1.6% PASS
total/synth/DUP15/d1 0.00 0.00 +0.3% PASS
total/synth/DUP16/d0 0.00 0.00 -1.2% PASS
total/synth/DUP16/d1 0.00 0.00 -1.7% PASS
total/synth/DUP2/d0 0.00 0.00 -0.6% PASS
total/synth/DUP2/d1 0.00 0.00 -0.9% PASS
total/synth/DUP3/d0 0.00 0.00 -1.6% PASS
total/synth/DUP3/d1 0.00 0.00 -0.3% PASS
total/synth/DUP4/d0 0.00 0.00 -1.4% PASS
total/synth/DUP4/d1 0.00 0.00 +1.0% PASS
total/synth/DUP5/d0 0.00 0.00 -0.8% PASS
total/synth/DUP5/d1 0.00 0.00 -1.9% PASS
total/synth/DUP6/d0 0.00 0.00 -1.4% PASS
total/synth/DUP6/d1 0.00 0.00 -1.2% PASS
total/synth/DUP7/d0 0.00 0.00 -2.7% PASS
total/synth/DUP7/d1 0.00 0.00 +1.6% PASS
total/synth/DUP8/d0 0.00 0.00 -0.5% PASS
total/synth/DUP8/d1 0.00 0.00 +0.3% PASS
total/synth/DUP9/d0 0.00 0.00 -1.8% PASS
total/synth/DUP9/d1 0.00 0.00 -0.8% PASS
total/synth/EQ/b0 0.00 0.00 -1.3% PASS
total/synth/EQ/b1 0.00 0.00 -0.5% PASS
total/synth/GAS/a0 0.52 0.52 +0.4% PASS
total/synth/GAS/a1 0.79 0.80 +1.1% PASS
total/synth/GT/b0 0.00 0.00 -1.8% PASS
total/synth/GT/b1 0.00 0.00 -2.2% PASS
total/synth/ISZERO/u0 0.00 0.00 +0.4% PASS
total/synth/JUMPDEST/n0 0.00 0.00 +0.3% PASS
total/synth/LT/b0 0.00 0.00 -1.7% PASS
total/synth/LT/b1 0.00 0.00 -0.2% PASS
total/synth/MSIZE/a0 0.00 0.00 +2.7% PASS
total/synth/MSIZE/a1 0.00 0.00 -0.9% PASS
total/synth/MUL/b0 0.00 0.00 -2.1% PASS
total/synth/MUL/b1 0.00 0.00 -0.6% PASS
total/synth/NOT/u0 0.00 0.00 +0.6% PASS
total/synth/OR/b0 0.00 0.00 -2.0% PASS
total/synth/OR/b1 0.00 0.00 -1.1% PASS
total/synth/PC/a0 0.00 0.00 +0.5% PASS
total/synth/PC/a1 0.00 0.00 +0.2% PASS
total/synth/PUSH1/p0 0.00 0.00 +0.8% PASS
total/synth/PUSH1/p1 0.00 0.00 -0.4% PASS
total/synth/PUSH10/p0 0.00 0.00 -1.7% PASS
total/synth/PUSH10/p1 0.00 0.00 +2.6% PASS
total/synth/PUSH11/p0 0.00 0.00 +1.2% PASS
total/synth/PUSH11/p1 0.00 0.00 -1.9% PASS
total/synth/PUSH12/p0 0.00 0.00 -0.9% PASS
total/synth/PUSH12/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH13/p0 0.00 0.00 -3.7% PASS
total/synth/PUSH13/p1 0.00 0.00 +1.4% PASS
total/synth/PUSH14/p0 0.00 0.00 +5.5% PASS
total/synth/PUSH14/p1 0.00 0.00 +1.6% PASS
total/synth/PUSH15/p0 0.00 0.00 +0.3% PASS
total/synth/PUSH15/p1 0.00 0.00 +0.5% PASS
total/synth/PUSH16/p0 0.00 0.00 -0.1% PASS
total/synth/PUSH16/p1 0.00 0.00 +8.7% PASS
total/synth/PUSH17/p0 0.00 0.00 -9.1% PASS
total/synth/PUSH17/p1 0.00 0.00 +1.3% PASS
total/synth/PUSH18/p0 0.00 0.00 +0.3% PASS
total/synth/PUSH18/p1 0.00 0.00 +0.3% PASS
total/synth/PUSH19/p0 0.00 0.00 -5.9% PASS
total/synth/PUSH19/p1 0.00 0.00 +10.5% PASS
total/synth/PUSH2/p0 0.00 0.00 -1.2% PASS
total/synth/PUSH2/p1 0.00 0.00 +0.1% PASS
total/synth/PUSH20/p0 0.00 0.00 -5.9% PASS
total/synth/PUSH20/p1 0.00 0.00 -2.4% PASS
total/synth/PUSH21/p0 0.00 0.00 -10.9% PASS
total/synth/PUSH21/p1 0.00 0.00 +0.2% PASS
total/synth/PUSH22/p0 2.45 2.45 -0.1% PASS
total/synth/PUSH22/p1 1.57 1.67 +5.8% PASS
total/synth/PUSH23/p0 2.47 2.49 +0.6% PASS
total/synth/PUSH23/p1 2.53 2.51 -0.7% PASS
total/synth/PUSH24/p0 1.49 1.44 -3.6% PASS
total/synth/PUSH24/p1 2.53 2.51 -0.9% PASS
total/synth/PUSH25/p0 2.48 2.48 +0.2% PASS
total/synth/PUSH25/p1 1.64 1.69 +2.9% PASS
total/synth/PUSH26/p0 2.49 2.48 -0.1% PASS
total/synth/PUSH26/p1 2.50 2.50 -0.3% PASS
total/synth/PUSH27/p0 1.50 1.45 -3.0% PASS
total/synth/PUSH27/p1 2.51 2.51 -0.1% PASS
total/synth/PUSH28/p0 2.48 2.49 +0.4% PASS
total/synth/PUSH28/p1 1.64 1.70 +3.4% PASS
total/synth/PUSH29/p0 2.47 2.29 -7.2% PASS
total/synth/PUSH29/p1 2.51 2.51 -0.0% PASS
total/synth/PUSH3/p0 0.00 0.00 -1.4% PASS
total/synth/PUSH3/p1 0.00 0.00 +0.7% PASS
total/synth/PUSH30/p0 1.49 1.46 -2.0% PASS
total/synth/PUSH30/p1 2.50 2.50 -0.0% PASS
total/synth/PUSH31/p0 2.47 2.44 -1.3% PASS
total/synth/PUSH31/p1 1.59 1.73 +8.9% PASS
total/synth/PUSH32/p0 2.49 2.49 +0.2% PASS
total/synth/PUSH32/p1 2.51 2.50 -0.1% PASS
total/synth/PUSH4/p0 0.00 0.00 +0.7% PASS
total/synth/PUSH4/p1 0.00 0.00 +1.5% PASS
total/synth/PUSH5/p0 0.00 0.00 -3.3% PASS
total/synth/PUSH5/p1 0.00 0.00 -0.3% PASS
total/synth/PUSH6/p0 0.00 0.00 -3.3% PASS
total/synth/PUSH6/p1 0.00 0.00 -2.9% PASS
total/synth/PUSH7/p0 0.00 0.00 -3.0% PASS
total/synth/PUSH7/p1 0.00 0.00 -0.9% PASS
total/synth/PUSH8/p0 0.00 0.00 -0.6% PASS
total/synth/PUSH8/p1 0.00 0.00 -0.3% PASS
total/synth/PUSH9/p0 0.00 0.00 +3.3% PASS
total/synth/PUSH9/p1 0.00 0.00 +0.3% PASS
total/synth/RETURNDATASIZE/a0 0.04 0.04 -0.1% PASS
total/synth/RETURNDATASIZE/a1 0.05 0.05 -9.3% PASS
total/synth/SAR/b0 0.00 0.00 -1.0% PASS
total/synth/SAR/b1 0.00 0.00 -1.5% PASS
total/synth/SGT/b0 0.00 0.00 -2.6% PASS
total/synth/SGT/b1 0.00 0.00 +0.0% PASS
total/synth/SHL/b0 0.00 0.00 -0.5% PASS
total/synth/SHL/b1 0.00 0.00 -0.7% PASS
total/synth/SHR/b0 0.00 0.00 -1.8% PASS
total/synth/SHR/b1 0.00 0.00 -1.8% PASS
total/synth/SIGNEXTEND/b0 0.00 0.00 -1.2% PASS
total/synth/SIGNEXTEND/b1 0.00 0.00 -0.4% PASS
total/synth/SLT/b0 0.00 0.00 -1.4% PASS
total/synth/SLT/b1 0.00 0.00 -2.1% PASS
total/synth/SUB/b0 0.00 0.00 -2.1% PASS
total/synth/SUB/b1 0.00 0.00 -1.3% PASS
total/synth/SWAP1/s0 0.00 0.00 -1.2% PASS
total/synth/SWAP10/s0 0.00 0.00 -0.1% PASS
total/synth/SWAP11/s0 0.00 0.00 -0.1% PASS
total/synth/SWAP12/s0 0.00 0.00 -2.9% PASS
total/synth/SWAP13/s0 0.00 0.00 -1.3% PASS
total/synth/SWAP14/s0 0.00 0.00 -1.9% PASS
total/synth/SWAP15/s0 0.00 0.00 -1.4% PASS
total/synth/SWAP16/s0 0.00 0.00 -0.2% PASS
total/synth/SWAP2/s0 0.00 0.00 -0.7% PASS
total/synth/SWAP3/s0 0.00 0.00 +0.0% PASS
total/synth/SWAP4/s0 0.00 0.00 -1.1% PASS
total/synth/SWAP5/s0 0.00 0.00 +0.6% PASS
total/synth/SWAP6/s0 0.00 0.00 -1.8% PASS
total/synth/SWAP7/s0 0.00 0.00 -0.7% PASS
total/synth/SWAP8/s0 0.00 0.00 +1.2% PASS
total/synth/SWAP9/s0 0.00 0.00 -1.0% PASS
total/synth/XOR/b0 0.00 0.00 +0.9% PASS
total/synth/XOR/b1 0.00 0.00 -2.7% PASS
total/synth/loop_v1 12.66 12.68 +0.1% PASS
total/synth/loop_v2 12.67 12.68 +0.1% PASS

Summary: 194 benchmarks, 0 regressions


abmcar and others added 4 commits June 11, 2026 11:34
A SUB result wraps to full width on underflow, so the RESULT cannot be
narrowed -- but when both operands are range-proven u64 the COMPUTATION
can: (a - b) mod 2^256 is exactly {wrapping_sub(a0, b0), fill, fill,
fill} where fill = 0 - borrow and borrow = (a0 <u b0). On underflow the
i64 negation yields all-ones, reproducing the wrapped upper 192 bits
bit-exactly for every input -- no no-underflow proof is needed.

Replace the generic path's eight protectUnsafeValue spills plus SUB/SBB
chain with one sub, one compare, and one negation (no flag-protection
barrier needed without an SBB chain). The result keeps the default U256
range tag, symmetric with the analyzer's SUB transfer rule.

Wire the new SubFastRangeU64Count into the [EVM-ARITH-SUMMARY] predicate
and log line.

Add 6 adversarial differential fixtures (underflow all-ones fill, the
0 - (2^64-1) wrap boundary, dynamic zero RHS, one-sided-wide control)
plus an EVMSubWrapDifferentialTest suite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The evmone-statetests job on the previous run died ~5 minutes in with
its Build-and-Test step log never uploaded (runner-level termination);
the suite never reached test execution. All 16 other checks passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The implementation ships in this PR; the Status field was left at the
Proposed value from the drafting stage.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@abmcar abmcar force-pushed the perf/evm-sub-u64-wrap-lowering branch from 094eee3 to 48d32d5 Compare June 11, 2026 03:40

abmcar commented Jun 11, 2026

Follow-up measurement: the stacking estimate in the PR body is now confirmed by a paired capture on a local merge of current main (which includes #524/#532) plus #534/#535/#536. On the EEST Cancun suite (27,742 shared compiled sites, site-weighted, zero reverse transitions):

  • SUB fast-path hit rate: 18.4% → 59.3% (+40.9pp, 1,594 sites FULL→narrow — matching the ~1,594 stacked estimate exactly)
  • combined stack overall: 80.01% → 86.99% (+6.98pp)
  • full Cancun statetest on the merged stack: 2723/2723

The remaining SUB FULL mass on that baseline is dominated by statically unproven U256|U256 pairs (1,262 of 1,584), i.e. analysis-side rather than lowering-side.

The EVMSubWrapDifferentialTest suite and its 6 SUB-path fixtures relocate to
the dedicated EVM differential-suite change so optimization PRs stay code-only
and the shared evm_interp_tests.cpp file stops accumulating per-PR copies.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

abmcar commented Jun 11, 2026

The differential test suite and its fixtures have moved out of this PR into #539, which consolidates the interp-vs-multipass differential coverage from all three optimization PRs into a dedicated test target. This PR is now code + docs only; #539 carries the tests and can merge independently in any order.

Align the doc with the technical-writing rule: describe internal labels by behavior, remove who-reviewed and process narrative, and keep every count, flag, code anchor, and measurement. No code or runtime change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Cen5bPpPEgkSkcxWWTSY7d
@zoowii zoowii merged commit 569cfe8 into DTVMStack:main Jun 24, 2026
17 checks passed

3 participants