perf(compiler): consume value-range tags in EVM bool/compare/bitwise lowering #534
⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)
Performance Benchmark Results (threshold: 25%)
Summary: 194 benchmarks, 0 regressions

✅ Performance Check Passed (multipass)
Performance Benchmark Results (threshold: 25%)
Summary: 194 benchmarks, 0 regressions
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR improves EVM MIR lowering by consuming existing ValueRange tags to generate narrower (and sometimes fused) code paths for ISZERO/JUMPI, OR/XOR, and signed comparisons, and adds differential regression fixtures to ensure Multipass matches the interpreter.
Changes:
- Propagate base range through deferred ISZERO operands and use it to narrow limb folding in ISZERO materialization and JUMPI condition evaluation (including deferred-zero-test fusion).
- Add non-constant range-narrowed OR/XOR paths and u64-constant fast paths for SLT/SGT.
- Add 21 adversarial EVM-asm fixtures plus a new parameterized differential test to validate interpreter vs. Multipass output.
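The ISZERO and JUMPI items above can be illustrated at the value level. The following is a minimal sketch, not the PR's MIR lowering: `U256`, `RangeTier`, `foldLimbs`, `isZeroNarrow`, `DeferredZeroTest`, and `branchTaken` are hypothetical names, and the three tiers are assumed to correspond to the 1/2/4-limb folds mentioned in the PR description.

```cpp
#include <array>
#include <cstdint>

// 256-bit value as four 64-bit limbs; limb 0 is least significant.
using U256 = std::array<uint64_t, 4>;

// Hypothetical range tiers; the real ValueRange tags may be named differently.
enum class RangeTier { U64, U128, Full };

// Number of limbs that may be non-zero under a tier (1, 2 or 4).
inline int foldLimbs(RangeTier T) {
  return T == RangeTier::U64 ? 1 : T == RangeTier::U128 ? 2 : 4;
}

// ISZERO with a range-narrowed OR-fold: only range-live limbs are folded.
inline bool isZeroNarrow(const U256 &V, RangeTier T) {
  uint64_t Acc = 0;
  for (int I = 0; I < foldLimbs(T); ++I)
    Acc |= V[I];
  return Acc == 0;
}

// A deferred zero-test: ISZERO(Base), or its negation after a second ISZERO.
struct DeferredZeroTest {
  U256 Base;
  RangeTier BaseRange; // range of Base, carried along with the deferral
  bool Negated;        // true for ISZERO(ISZERO(Base))
};

// JUMPI fused with a deferred zero-test: branch directly on the base value
// instead of materializing the 0/1 result and folding it again.
inline bool branchTaken(const DeferredZeroTest &D) {
  bool BaseIsZero = isZeroNarrow(D.Base, D.BaseRange);
  return D.Negated ? !BaseIsZero : BaseIsZero;
}
```

A base value whose only non-zero limb is limb 3 must use the full fold; this is the case the high-sparse fixtures guard against.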
Reviewed changes
Copilot reviewed 46 out of 46 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/evm_asm/xor_dyn_u64_wide.expected | New expected output for XOR mixed wide/u64-range narrowing coverage |
| tests/evm_asm/xor_dyn_u64_wide.easm | New adversarial XOR fixture (upper-limb passthrough) |
| tests/evm_asm/xor_dyn_u64_u64.expected | New expected output for XOR u64/u64 fast path |
| tests/evm_asm/xor_dyn_u64_u64.easm | New XOR fixture exercising both-fit-u64 path |
| tests/evm_asm/slt_dyn_neg_vs_const.expected | New expected output for SLT signed fast path (negative vs const) |
| tests/evm_asm/slt_dyn_neg_vs_const.easm | New SLT fixture for negative dynamic vs u64 const |
| tests/evm_asm/slt_dyn_msb64_vs_const.expected | New expected output for SLT msb64 edge case |
| tests/evm_asm/slt_dyn_msb64_vs_const.easm | New SLT fixture guarding against incorrect signed-i64 limb0 compare |
| tests/evm_asm/slt_dyn_highsparse_vs_const.expected | New expected output for SLT highsparse dynamic vs const |
| tests/evm_asm/slt_dyn_highsparse_vs_const.easm | New SLT fixture for upper-limb selection logic |
| tests/evm_asm/slt_dyn_eq_const.expected | New expected output for SLT equality boundary |
| tests/evm_asm/slt_dyn_eq_const.easm | New SLT fixture ensuring strict-less behavior |
| tests/evm_asm/slt_const_vs_dyn.expected | New expected output for SLT const-vs-dyn swapped-dispatch case |
| tests/evm_asm/slt_const_vs_dyn.easm | New SLT fixture exercising swapped dispatch |
| tests/evm_asm/sgt_dyn_neg_vs_const.expected | New expected output for SGT negative vs const |
| tests/evm_asm/sgt_dyn_neg_vs_const.easm | New SGT fixture for negative dynamic vs const |
| tests/evm_asm/sgt_dyn_msb64_vs_const.expected | New expected output for SGT msb64 edge case |
| tests/evm_asm/sgt_dyn_msb64_vs_const.easm | New SGT fixture guarding unsigned limb0 compare logic |
| tests/evm_asm/sgt_dyn_highsparse_vs_const.expected | New expected output for SGT highsparse dynamic vs const |
| tests/evm_asm/sgt_dyn_highsparse_vs_const.easm | New SGT fixture for upper-limb selection logic |
| tests/evm_asm/sgt_const_vs_dyn.expected | New expected output for SGT const-vs-dyn swapped-dispatch case |
| tests/evm_asm/sgt_const_vs_dyn.easm | New SGT fixture exercising swapped dispatch |
| tests/evm_asm/or_dyn_u64_wide.expected | New expected output for OR mixed wide/u64-range narrowing coverage |
| tests/evm_asm/or_dyn_u64_wide.easm | New adversarial OR fixture (upper-limb passthrough) |
| tests/evm_asm/or_dyn_u64_u64.expected | New expected output for OR u64/u64 fast path |
| tests/evm_asm/or_dyn_u64_u64.easm | New OR fixture exercising both-fit-u64 path |
| tests/evm_asm/jumpi_u64_cond_taken.expected | New expected output for JUMPI with u64-tagged cond (taken) |
| tests/evm_asm/jumpi_u64_cond_taken.easm | New JUMPI fixture exercising FoldLimbs=1 path |
| tests/evm_asm/jumpi_u64_cond_nottaken.expected | New expected output for JUMPI with u64-tagged cond (not taken) |
| tests/evm_asm/jumpi_u64_cond_nottaken.easm | New JUMPI fixture exercising FoldLimbs=1 path (zero) |
| tests/evm_asm/jumpi_iszero_iszero_fused.expected | New expected output for double-ISZERO deferred fusion through JUMPI |
| tests/evm_asm/jumpi_iszero_iszero_fused.easm | New fixture exercising deferred-zero-test negation flip + JUMPI fold |
| tests/evm_asm/jumpi_iszero_fused_taken.expected | New expected output for deferred ISZERO fused into JUMPI (taken) |
| tests/evm_asm/jumpi_iszero_fused_taken.easm | New fixture verifying ISZERO materialization elision in JUMPI |
| tests/evm_asm/jumpi_iszero_fused_nottaken_highsparse.expected | New expected output for highsparse base in deferred ISZERO fusion (not taken) |
| tests/evm_asm/jumpi_iszero_fused_nottaken_highsparse.easm | New adversarial fixture preventing limb0-only folding bugs |
| tests/evm_asm/iszero_dyn_u64_nonzero.expected | New expected output for ISZERO on u64-tagged nonzero dynamic |
| tests/evm_asm/iszero_dyn_u64_nonzero.easm | New ISZERO fixture exercising narrowed fold for u64 |
| tests/evm_asm/iszero_dyn_highsparse.expected | New expected output for ISZERO highsparse dynamic |
| tests/evm_asm/iszero_dyn_highsparse.easm | New ISZERO fixture guarding against limb0-only fold |
| tests/evm_asm/iszero_calldatasize.expected | New expected output for ISZERO(CALLDATASIZE) |
| tests/evm_asm/iszero_calldatasize.easm | New fixture covering env-op producer tagging + ISZERO narrow fold |
| src/tests/evm_interp_tests.cpp | Add parameterized differential test suite and instantiate it for new fixtures |
| src/compiler/evm_frontend/evm_mir_compiler.h | Track deferred-zero-test base range; add SLT/SGT u64-const helper declarations; extend EQZ signature |
| src/compiler/evm_frontend/evm_mir_compiler.cpp | Implement narrowed ISZERO fold, JUMPI fusion/narrow fold, OR/XOR range narrowing, SLT/SGT u64-const fast paths, and U64-tagging for single-instr u256 producers |
| docs/changes/2026-06-10-evm-range-lowering-gaps/README.md | Design/change note describing the new range-consuming lowering paths and verification/measurements |
The differential test suite and its fixtures have moved out of this PR into #539, which consolidates the interp-vs-multipass differential coverage from all three optimization PRs into a dedicated test target. This PR is now code + docs only; #539 carries the tests and can merge independently in any order.
…lowering

Close lowering-side gaps where operands already proven u64 by the range analyzer still took full 4-limb paths:

- ISZERO: deferred zero-test operands now carry the base value range and tag their 0/1 result as u64; materialization folds only the limbs the range allows (u64 base -> single-limb test).
- JUMPI: fuse deferred zero-test conditions into the branch (no materialize-then-refold round trip) and fold only range-live limbs of non-deferred conditions.
- OR/XOR: new non-constant narrow paths. Both-u64 emits a single i64 op with zeroed upper limbs; one-sided u64 passes the wide side's upper limbs through (identity), mirroring the existing const-u64 path.
- SLT/SGT: new fast paths against u64 constants with the same three range tiers as the unsigned compare helpers; a u64 constant is a non-negative signed-256 value, so sign-bit + upper-or + unsigned limb0 compare decide the result.
- Context-size producers (PC/GAS/CALLDATASIZE/CODESIZE/MSIZE/RETURNDATASIZE): tag results u64; limbs 1..3 are literal zeros by construction, closing a builder/analyzer SSOT divergence.

Add 21 adversarial differential fixtures (interp vs multipass, boundary values incl. 2^64/2^128/2^192/-1/high-sparse and limb0 MSB cases) plus an EVMRangeNarrowingDifferentialTest suite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
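The OR/XOR item in the commit message above rests on zero being the identity for both operations. The following is a value-level sketch under that reading, not the emitted MIR: `U256`, `bitwiseNarrow`, `orNarrow`, and `xorNarrow` are hypothetical names, and the boolean flags stand in for a u64 range tag on each operand.

```cpp
#include <array>
#include <cstdint>

// 256-bit value as four 64-bit limbs; limb 0 is least significant.
using U256 = std::array<uint64_t, 4>;

// Range-narrowed OR/XOR. AIsU64 / BIsU64 stand in for "operand is tagged as
// fitting in u64", i.e. its limbs 1..3 are known to be zero.
template <typename Op>
U256 bitwiseNarrow(const U256 &A, bool AIsU64, const U256 &B, bool BIsU64,
                   Op F) {
  U256 R{};              // upper limbs start as zero
  R[0] = F(A[0], B[0]);  // limb 0 always needs the real operation
  if (AIsU64 && BIsU64)
    return R;            // both narrow: one 64-bit op, zeroed upper limbs
  for (int I = 1; I < 4; ++I) {
    if (AIsU64)
      R[I] = B[I];       // x | 0 == x and x ^ 0 == x: pass the wide side through
    else if (BIsU64)
      R[I] = A[I];
    else
      R[I] = F(A[I], B[I]);  // full 4-limb path
  }
  return R;
}

inline U256 orNarrow(const U256 &A, bool AIsU64, const U256 &B, bool BIsU64) {
  return bitwiseNarrow(A, AIsU64, B, BIsU64,
                       [](uint64_t X, uint64_t Y) { return X | Y; });
}

inline U256 xorNarrow(const U256 &A, bool AIsU64, const U256 &B, bool BIsU64) {
  return bitwiseNarrow(A, AIsU64, B, BIsU64,
                       [](uint64_t X, uint64_t Y) { return X ^ Y; });
}
```

The passthrough is only valid when the tag is correct; an operand flagged u64 with a non-zero upper limb would produce a wrong result, which is why the narrow paths depend on the analyzer's proof.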
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…after rebase Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…e step Set Status to Implemented. Reword the two references to repository paths that ship with the separate mainnet-replay analysis-suite work and are absent from this tree: the motivation's real-load analysis pointer and the known-limitation re-measure step (formerly tools/run_real_load_profile.py). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The EVMRangeNarrowingDifferentialTest suite and its 21 evm_asm fixtures relocate to the dedicated differential-suite change so optimization PRs stay code-only and the shared evm_interp_tests file stops accumulating per-PR copies. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Align the doc with the technical-writing rule: describe internal labels by behavior, remove who-reviewed and process narrative, and keep every count, flag, code anchor, and measurement. No code or runtime change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Cen5bPpPEgkSkcxWWTSY7d
Consume existing `ValueRange` tags in four EVM lowering paths so operands already proven to fit in u64 skip full 4-limb sequences. Overall fast-path hit rate rises 78.72% -> 80.02% on the EEST Cancun suite, with no measured performance regression and 21/21 differential fixtures passing.

What

The multipass JIT range analyzer already proves many operands fit in u64, but several lowering paths ignore that proof and emit full 4-limb sequences. This PR makes four lowerings consume the existing `ValueRange` tags:

- … The `LT; ISZERO; JUMPI` loop-exit shape previously paid a full compare, a 4-limb OR-fold + cmp + select materialization, and a second 4-limb OR-fold + cmp + select at the branch.
- … (`handleCompareSltRhsU64` / `handleCompareSgtRhsU64`). A u64 constant has zero upper limbs, so it is a non-negative signed-256 value; the compare reduces to sign-bit test + upper-limb OR + an unsigned limb0 compare, with the same three range tiers as the unsigned helpers. Both constant positions are dispatched (`c <s x ⟺ x >s c`).

Why
Real-mainnet-load profiling showed these ops taking the full path while their operands were already statically proven narrow: 58.5% of ISZERO full-path executions had a proven-u64 operand, 51.8% of OR full-path executions had at least one proven-u64 side, and 22.4% of SLT full-path executions compared against a u64 constant. These are lowering-side gaps — no new analysis precision is required to close them.
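The signed compare against a u64 constant described under What reduces to three checks. The sketch below works on plain values rather than MIR and assumes the decomposition stated there (sign bit, upper-limb OR, unsigned limb0 compare); `U256`, `sltRhsU64`, `sgtRhsU64`, and `sltLhsU64` are hypothetical stand-ins, not the signatures of `handleCompareSltRhsU64` / `handleCompareSgtRhsU64`.

```cpp
#include <array>
#include <cstdint>

// 256-bit two's-complement value as four 64-bit limbs; limb 0 is least
// significant, so bit 63 of limb 3 is the sign bit.
using U256 = std::array<uint64_t, 4>;

// x <s c, where c is a u64 constant (a non-negative signed-256 value).
inline bool sltRhsU64(const U256 &X, uint64_t C) {
  if (X[3] >> 63)
    return true;   // x is negative, c is not
  if ((X[1] | X[2] | X[3]) != 0)
    return false;  // x >= 2^64 > c
  return X[0] < C; // unsigned compare: limb0 in [2^63, 2^64-1] is positive
}

// x >s c, where c is a u64 constant.
inline bool sgtRhsU64(const U256 &X, uint64_t C) {
  if (X[3] >> 63)
    return false;  // a negative x is never greater than a non-negative c
  if ((X[1] | X[2] | X[3]) != 0)
    return true;   // x >= 2^64 > c
  return X[0] > C; // unsigned compare on limb 0
}

// Constant on the left: c <s x is the same as x >s c.
inline bool sltLhsU64(uint64_t C, const U256 &X) { return sgtRhsU64(X, C); }
```

When the dynamic operand is itself tagged u64, the first two checks are statically false and only the unsigned limb0 compare remains. Using a signed 64-bit compare on limb 0 would misclassify values with limb0 at or above 2^63, which is the case the msb64 fixtures target.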
Soundness
- … `ValueRange` tag already proves upper limbs are semantically zero; no new tag sources are introduced (change 5's tag is justified by the function's structural zero-fill).
- … `[2^63, 2^64-1]` are positive 256-bit values; the new paths use unsigned predicates on limb0 and decide negative inputs from limb3's sign bit. The truth tables are verified across the boundary set (2^63, 2^64, 2^128, 2^192, 2^255, -1, equal values, c=0, c>=2^63).

Diff scope
This diff includes the base-range plumbing (`DeferredBaseRange`) and the 1/2/4-limb narrow fold at materialization and JUMPI fusion. Two tags that previously sat here are excluded because they landed upstream independently: the context-size producer tag (now #532) and the U64 tag on the deferred zero-test result (now #524).

Verification
- … `tests/evm_asm/`, committed) cover each new path's narrow and full sides with boundary values including 2^64, 2^128, 2^192, -1, high-sparse, and the limb0-MSB unsigned-predicate hard gate. A new `EVMRangeNarrowingDifferentialTest` suite asserts interpreter and multipass outputs match byte-for-byte and that multipass actually JIT-compiled: 21/21 pass.
- … `-k fork_Cancun`: 2723/2723; ctest: 11/11; golden `.easm` suite: 178/178 (no regressions); `tools/format.sh check` clean; no new build warnings.

Measurements
Fast-path hit rate, paired site-weighted measurement on the EEST Cancun suite (28,109 shared compiled sites, measured with an instrumentation-only tap branch not included in this PR). The EEST measurement predates the #522/#524/#530/#532 merges; the ADD row's 17 sites are attributable to the context-size tag landed as #532. Transition column reads as " sites: -> ".
Performance (evmone-bench, 27-bench sweep, multipass, vs upstream/main baseline): median delta -0.23%, within run-to-run variance (about ±2%). First-pass outliers re-measured with 15 repetitions all resolved to noise; the largest and most stable benchmark (snailtracer, cv 1.5%) improved consistently across two independent runs (-1.1% / -1.3%). No regressions.
Known limitation
With `ZEN_ENABLE_EVM_STACK_SSA_LIFT=ON` (default OFF, CI OFF), `getOperandIdentityKey()` in the lifted-stack path does not handle deferred operands; a live-out deferred zero-test can hit an assertion there. This exposure predates this PR (the lifter is untouched and deferred operand lifetimes are unchanged) and is noted in the change document for the stack-lift follow-up.

Change document: `docs/changes/2026-06-10-evm-range-lowering-gaps/README.md`.

🤖 Generated with Claude Code