perf(compiler): statically resolve constant-amount EVM shift guards by abmcar · Pull Request #535 · DTVMStack/DTVM

abmcar · 2026-06-09T20:07:55Z

Resolves constant-amount SHL/SHR/SAR shift guards at compile time, removing four Selects plus a 4-limb compare chain per constant-shift site. Correctness-neutral (12/12 differential fixtures, 223/223 unittests, 2723/2723 statetest); benchmark-neutral (median -0.08% over the 27-bench sweep).

What

The const-amount fast paths in EVM SHL/SHR/SAR lowering still emitted a runtime >= 256 guard — an isU256GreaterOrEqual comparison chain plus one Select per result limb — even when the shift amount is a compile-time constant that makes the guard statically decidable, and still computed shifted/carry terms for source limbs that the value operand's range proves zero. This PR resolves both at compile time:

Static large-shift resolution (handleShift): when the shift amount is constant, the full 256-bit amount is checked via intx. An amount >= 256 folds SHL/SHR to a constant zero (per the EVM spec the result is identically zero for any value, mirroring the existing both-constant fold); SAR keeps the generic flow since its fill depends on the value's sign bit. An amount < 256 skips building IsLargeShift entirely and passes nullptr to the limb helpers.
Guard-select pruning: the three helpers' const-amount paths skip the per-limb Select(IsLargeShift, fill, R) when the guard is nullptr. SAR's out-of-bounds sign-fill comes from the limb's default initializer, not the removed select, and is untouched. Dynamic-amount paths gained a defensive ZEN_ASSERT(IsLargeShift != nullptr).
Range-aware source-limb pruning (SHL/SHR_U const paths only): a new LiveLimbs parameter (value range U64 → 1, U128 → 2, else 4) drops shifted/carry terms whose source limb index is >= LiveLimbs — those limbs are semantically zero under the existing range contract. SAR is deliberately excluded so the change introduces no new range claim.
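The three-way decision in the first item above can be sketched as a small model. This is a hedged illustration in Python, not the C++ code in handleShift: the names `GuardPlan` and `plan_shift_guard` are hypothetical, and `None` stands in for a shift amount that is not a compile-time constant.

```python
from enum import Enum
from typing import Optional

U256_LIMIT = 1 << 256


class GuardPlan(Enum):
    """Hypothetical labels for the three outcomes described in the PR."""
    FOLD_TO_ZERO = "fold the result to a constant zero"
    NO_GUARD = "emit no guard (helpers receive nullptr)"
    RUNTIME_GUARD = "build IsLargeShift and keep the per-limb Selects"


def plan_shift_guard(op: str, const_amount: Optional[int]) -> GuardPlan:
    """Model the compile-time decision for one shift site.

    op is "SHL", "SHR" or "SAR"; const_amount is the full 256-bit
    constant amount, or None when the amount is only known at run time.
    """
    if op not in ("SHL", "SHR", "SAR"):
        raise ValueError(f"unknown shift opcode: {op}")
    if const_amount is None:
        # Dynamic amount: the runtime guard is still required.
        return GuardPlan.RUNTIME_GUARD
    if not 0 <= const_amount < U256_LIMIT:
        raise ValueError("amount must fit in 256 bits")
    if const_amount < 256:
        # Guard is statically false, so no Select chain is needed.
        return GuardPlan.NO_GUARD
    if op in ("SHL", "SHR"):
        # Result is zero for every value operand.
        return GuardPlan.FOLD_TO_ZERO
    # SAR: the fill depends on the value's sign bit, so keep the generic flow.
    return GuardPlan.RUNTIME_GUARD
```

The model decides on the full constant rather than its low limb, so an amount such as 2^64 falls into the fold or guard branch and never into the no-guard branch.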

Net effect per constant-shift site: four Selects plus a 4-limb comparison chain removed; for shifts of proven-narrow values, additional dead shl/ushr/or terms removed.
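The source-limb pruning can be modelled with Python integers standing in for the four 64-bit limbs. This is a sketch under stated assumptions, not the compiler's lowering: the function names are hypothetical, limbs are taken least significant first, and `live_limbs` follows the U64 → 1, U128 → 2, else 4 tiers from the description. `check_against_reference` compares the pruned limb arithmetic with plain big-integer shifts for every amount 0-255.

```python
MASK64 = (1 << 64) - 1
MASK256 = (1 << 256) - 1


def split_limbs(x):
    """Split a 256-bit integer into four 64-bit limbs, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(4)]


def join_limbs(limbs):
    """Inverse of split_limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def shl_const_pruned(value, amount, live_limbs):
    """Constant SHL that skips terms whose source limb index is >= live_limbs."""
    assert 0 <= amount < 256 and live_limbs in (1, 2, 4)
    src = split_limbs(value)
    limb_shift, bit_shift = divmod(amount, 64)
    out = []
    for i in range(4):
        limb = 0
        shifted_src = i - limb_shift   # limb providing the shifted term
        carry_src = shifted_src - 1    # limb providing the carried-in high bits
        if 0 <= shifted_src < live_limbs:
            limb |= (src[shifted_src] << bit_shift) & MASK64
        if bit_shift and 0 <= carry_src < live_limbs:
            limb |= src[carry_src] >> (64 - bit_shift)
        out.append(limb)
    return join_limbs(out)


def shr_const_pruned(value, amount, live_limbs):
    """Constant logical SHR with the same source-limb pruning."""
    assert 0 <= amount < 256 and live_limbs in (1, 2, 4)
    src = split_limbs(value)
    limb_shift, bit_shift = divmod(amount, 64)
    out = []
    for i in range(4):
        limb = 0
        shifted_src = i + limb_shift
        carry_src = shifted_src + 1
        if shifted_src < live_limbs:
            limb |= src[shifted_src] >> bit_shift
        if bit_shift and carry_src < live_limbs:
            limb |= (src[carry_src] << (64 - bit_shift)) & MASK64
        out.append(limb)
    return join_limbs(out)


def check_against_reference(samples_per_tier):
    """Compare the pruned models with big-integer shifts for amounts 0-255."""
    for live_limbs, values in samples_per_tier.items():
        for value in values:
            # The range contract: limbs at index >= live_limbs are zero.
            assert value < 1 << (64 * live_limbs)
            for amount in range(256):
                assert shl_const_pruned(value, amount, live_limbs) == (value << amount) & MASK256
                assert shr_const_pruned(value, amount, live_limbs) == value >> amount
    return True
```

The pruned and unpruned forms agree only because the skipped limbs are zero under the range contract; passing a value wider than its tier trips the assertion in `check_against_reference`.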

Correctness notes

The limb0 trap: getConstShiftAmount inspects only limb0, so a constant amount like 2^64 (limb0 == 0, upper limb set) historically relied on the runtime guard. The static resolution checks the full 256-bit constant, so such amounts either fold to zero (SHL/SHR) or keep a real guard (SAR); nullptr is only ever passed for full constants < 256. Covered by a dedicated fixture.
Term-liveness was checked by exhaustive enumeration against a 256-bit reference implementation over all (limb-offset × intra-limb shift × live-limb tier) combinations for shift amounts 0-255, and by manual trace of the boundary cases (e.g. a u64 value shifted left by 136, where the top result limb keeps only the carry term sourced from the live low limb while its shifted term is dead).
The early return for >= 256 constants fires after both operands are popped; EVM stack operands are pure values, so dropping the unmaterialized value expression is safe.
Result range tags: SHR_U keeps its existing value-range passthrough (pruning makes the zero limbs structural, strengthening the tag); SHL stays U256. The only change is that the >= 256 fold now yields a constant zero whose auto-derived tag is more precise than the old dynamic zero; this is a strict strengthening and changes no observable behavior.
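The limb0 trap and the SAR exception from the notes above can be reproduced with a few lines of reference arithmetic. This is an illustrative sketch: `evm_shl`, `evm_shr`, `evm_sar` and `limb0` are hypothetical helper names that follow the EVM shift semantics (amount first, then value), not functions from the code base.

```python
MASK64 = (1 << 64) - 1
MASK256 = (1 << 256) - 1
SIGN_BIT = 1 << 255


def evm_shl(amount, value):
    """Reference SHL: zero for any amount >= 256."""
    return 0 if amount >= 256 else (value << amount) & MASK256


def evm_shr(amount, value):
    """Reference logical SHR: zero for any amount >= 256."""
    return 0 if amount >= 256 else value >> amount


def evm_sar(amount, value):
    """Reference arithmetic SHR: the fill depends on the value's sign bit."""
    negative = bool(value & SIGN_BIT)
    if amount >= 256:
        return MASK256 if negative else 0
    signed = value - (1 << 256) if negative else value
    # Python's >> on negative integers is an arithmetic shift.
    return (signed >> amount) & MASK256


def limb0(amount):
    """What an inspection of only the low 64-bit limb would see."""
    return amount & MASK64


# The trap constant: limb0 is zero although the full amount is far above 255.
TRAP = 1 << 64
```

With these helpers, `limb0(TRAP)` is 0, so a limb0-only reading treats the shift as a shift by zero and `evm_shl(limb0(TRAP), 1)` returns 1, whereas `evm_shl(TRAP, 1)` is 0. For SAR, `evm_sar(TRAP, MASK256)` is all ones and `evm_sar(TRAP, 1)` is 0, which is why a constant amount >= 256 cannot be folded to a single constant without knowing the value's sign.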

Verification

12 new adversarial differential fixtures (tests/evm_asm/) plus an EVMConstShiftDifferentialTest suite that asserts interpreter and multipass outputs match byte-for-byte and that multipass actually JIT-compiled the code. The fixtures cover cross-limb carries (<< 96), source pruning (u64 << 200, u64 >> 8), >= 256 folds, the 2^64 trap, SAR sign-fill for both signs, and a dynamic-amount control. 12/12 pass.
multipass evmone-unittests 223/223; multipass evmone-statetest -k fork_Cancun 2723/2723; golden .easm suite no regressions; tools/format.sh check clean; no new warnings.

Performance

evmone-bench 27-bench sweep (multipass, vs upstream/main baseline, median of 5 reps): median delta -0.08%. Shift-focused benchmarks and all first-pass outliers re-measured with 15 repetitions resolve into their run-variance bands (blake2b_shifts +1.3% at cv 2.4-3.4%, sha1_shifts +0.2%, signextend -0.1%). End-to-end neutral, no regressions; the benefit is generated-code reduction at constant-shift sites, which this suite's hot paths do not isolate.
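One way to read "resolves into its run-variance band" is to compare the median-to-median delta with the coefficient of variation (cv) of the repetitions. The sketch below assumes that reading; the function names are hypothetical, the sample standard deviation is an assumption, and the acceptance rule is an illustration rather than the procedure used for this PR.

```python
from statistics import mean, median, stdev


def percent_delta(baseline, current):
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def coefficient_of_variation(samples):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(samples) / mean(samples) * 100.0


def within_run_variance(baseline_reps, current_reps):
    """True when the median delta is no larger than the noisier side's cv."""
    delta = percent_delta(median(baseline_reps), median(current_reps))
    noise = max(coefficient_of_variation(baseline_reps),
                coefficient_of_variation(current_reps))
    return abs(delta) <= noise
```

Under this rule, a +1.3% delta measured at a cv of 2.4-3.4% counts as noise, which matches how the blake2b_shifts figure is interpreted above.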

Known limitation

Source-limb pruning trusts the range contract. Within-block narrow producers (AND-mask, constants) physically zero the upper limbs; cross-block narrow tags imported via EntryStackRanges additionally rely on the analyzer being a sound over-approximation (verified for current transfer rules). That import path is gated by ZEN_ENABLE_EVM_STACK_SSA_LIFT, default OFF and OFF in CI; if the flag is enabled by default in the future, the analyzer transfer rules should be re-audited and a lift-ON differential fixture added. Noted in the change document.

Change document: docs/changes/2026-06-10-evm-const-shift-pruning/README.md.

🤖 Generated with Claude Code

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Optimizes EVM shift lowering in the multipass JIT by statically eliminating redundant >= 256 guards for constant shift amounts and pruning dead source-limb contributions based on value range, with new differential fixtures to ensure interpreter/multipass agree.

Changes:

Statically fold SHL/SHR_U with constant shift amounts >= 256 to zero and omit the IsLargeShift per-limb Select chain when constant < 256.
Add range-aware pruning (U64/U128) for const-amount SHL/SHR_U to skip dead shifted/carry terms from provably-zero source limbs.
Add differential EVM asm fixtures and a new gtest suite covering the optimized paths and edge cases.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 2 comments.

File	Description
tests/evm_asm/shr_const8_u64val.expected	Adds expected output for SHR const path with U64-masked input.
tests/evm_asm/shr_const8_u64val.easm	Adds fixture exercising SHR const path limb pruning from U64 range.
tests/evm_asm/shr_const72_dyn.expected	Adds expected output for cross-limb SHR by 72.
tests/evm_asm/shr_const72_dyn.easm	Adds fixture for SHR const cross-limb behavior (CompShift=1, ShiftMod=8).
tests/evm_asm/shr_const4_dyn.expected	Adds expected output for SHR by 4.
tests/evm_asm/shr_const4_dyn.easm	Adds SHR-by-4 fixture for const-amount path.
tests/evm_asm/shr_const256_dyn.expected	Adds expected output for SHR by 256 folding to zero.
tests/evm_asm/shr_const256_dyn.easm	Adds fixture for large constant SHR amount (`>=256`).
tests/evm_asm/shl_dyn_amount.expected	Adds expected output for dynamic shift amount path regression coverage.
tests/evm_asm/shl_dyn_amount.easm	Adds fixture to force dynamic shift amount lowering (memory-laundered amount).
tests/evm_asm/shl_const_highlimb_dyn.expected	Adds expected output for “high limb set” shift amount folding to zero.
tests/evm_asm/shl_const_highlimb_dyn.easm	Adds fixture for 2^64 “trap” constant shift amount (upper limb set).
tests/evm_asm/shl_const96_dyn.expected	Adds expected output for SHL cross-limb carry behavior.
tests/evm_asm/shl_const96_dyn.easm	Adds fixture exercising SHL const cross-limb carry terms (<<96).
tests/evm_asm/shl_const4_dyn.expected	Adds expected output for SHL by 4.
tests/evm_asm/shl_const4_dyn.easm	Adds SHL-by-4 fixture for const-amount path.
tests/evm_asm/shl_const256_dyn.expected	Adds expected output for SHL by 256 folding to zero.
tests/evm_asm/shl_const256_dyn.easm	Adds fixture for large constant SHL amount (`>=256`).
tests/evm_asm/shl_const200_u64val.expected	Adds expected output for U64-masked value shifted left by 200.
tests/evm_asm/shl_const200_u64val.easm	Adds fixture to verify SHL const path pruning dead source limbs (U64).
tests/evm_asm/sar_const8_neg.expected	Adds expected output for SAR negative value sign-fill preservation.
tests/evm_asm/sar_const8_neg.easm	Adds fixture covering SAR const path sign-fill behavior.
tests/evm_asm/sar_const64_pos.expected	Adds expected output for SAR positive value (CompShift=1, ShiftMod=0).
tests/evm_asm/sar_const64_pos.easm	Adds fixture for SAR const shift amount correctness on positive values.
src/tests/evm_interp_tests.cpp	Adds a parameterized differential test suite comparing interp vs multipass & JIT compilation.
src/compiler/evm_frontend/evm_mir_compiler.h	Implements static large-shift folding, nullable guard plumbing, and LiveLimbs propagation.
src/compiler/evm_frontend/evm_mir_compiler.cpp	Updates shift helpers to omit guard Selects when statically false and prune dead limbs.
docs/changes/2026-06-10-evm-const-shift-pruning/README.md	Documents motivation, soundness argument, tests, and measurements for the optimization.

github-actions · 2026-06-09T20:23:33Z

⚡ Performance Regression Check Results

✅ Performance Check Passed (interpreter)

Performance Benchmark Results (threshold: 25%)

Benchmark	Baseline (us)	Current (us)	Change	Status
total/main/blake2b_huff/8415nulls	4.13	4.21	+2.0%	PASS
total/main/blake2b_huff/empty	0.07	0.07	+0.7%	PASS
total/main/blake2b_shifts/8415nulls	17.21	16.85	-2.1%	PASS
total/main/sha1_divs/5311	12.79	13.13	+2.6%	PASS
total/main/sha1_divs/empty	0.16	0.16	-2.7%	PASS
total/main/sha1_shifts/5311	9.91	9.82	-1.0%	PASS
total/main/sha1_shifts/empty	0.07	0.07	-1.3%	PASS
total/main/snailtracer/benchmark	123.30	122.56	-0.6%	PASS
total/main/structarray_alloc/nfts_rank	1.22	1.20	-1.4%	PASS
total/main/swap_math/insufficient_liquidity	0.00	0.00	-0.2%	PASS
total/main/swap_math/received	0.01	0.01	-0.4%	PASS
total/main/swap_math/spent	0.01	0.01	-0.6%	PASS
total/main/weierstrudel/1	0.41	0.41	+0.4%	PASS
total/main/weierstrudel/15	4.50	4.52	+0.5%	PASS
total/micro/JUMPDEST_n0/empty	1.86	1.85	-0.8%	PASS
total/micro/jump_around/empty	0.08	0.07	-11.0%	PASS
total/micro/loop_with_many_jumpdests/empty	69.99	69.88	-0.1%	PASS
total/micro/memory_grow_mload/by1	0.26	0.26	+0.6%	PASS
total/micro/memory_grow_mload/by16	0.27	0.28	+4.3%	PASS
total/micro/memory_grow_mload/by32	0.14	0.14	+0.6%	PASS
total/micro/memory_grow_mload/nogrow	0.12	0.12	+1.0%	PASS
total/micro/memory_grow_mstore/by1	0.26	0.26	+1.2%	PASS
total/micro/memory_grow_mstore/by16	0.13	0.13	-0.1%	PASS
total/micro/memory_grow_mstore/by32	0.31	0.30	-4.0%	PASS
total/micro/memory_grow_mstore/nogrow	0.26	0.26	+0.1%	PASS
total/micro/signextend/one	0.25	0.25	+0.5%	PASS
total/micro/signextend/zero	0.51	0.51	+0.9%	PASS
total/synth/ADD/b0	3.05	3.02	-0.9%	PASS
total/synth/ADD/b1	6.58	6.43	-2.3%	PASS
total/synth/ADDRESS/a0	11.29	11.11	-1.6%	PASS
total/synth/ADDRESS/a1	6.79	6.60	-2.8%	PASS
total/synth/AND/b0	2.79	2.81	+0.9%	PASS
total/synth/AND/b1	5.95	5.82	-2.3%	PASS
total/synth/BYTE/b0	5.45	4.96	-9.0%	PASS
total/synth/BYTE/b1	9.00	8.82	-2.0%	PASS
total/synth/CALLDATASIZE/a0	7.84	7.27	-7.3%	PASS
total/synth/CALLDATASIZE/a1	3.49	3.55	+1.5%	PASS
total/synth/CALLER/a0	10.88	10.96	+0.8%	PASS
total/synth/CALLER/a1	11.23	11.15	-0.7%	PASS
total/synth/CALLVALUE/a0	3.32	3.33	+0.3%	PASS
total/synth/CALLVALUE/a1	7.23	7.01	-2.9%	PASS
total/synth/CODESIZE/a0	8.63	8.30	-3.8%	PASS
total/synth/CODESIZE/a1	8.65	8.68	+0.4%	PASS
total/synth/DUP1/d0	2.45	2.35	-3.7%	PASS
total/synth/DUP1/d1	2.47	2.54	+2.7%	PASS
total/synth/DUP10/d0	2.35	2.47	+5.0%	PASS
total/synth/DUP10/d1	2.37	2.50	+5.7%	PASS
total/synth/DUP11/d0	1.62	1.63	+0.7%	PASS
total/synth/DUP11/d1	2.43	2.51	+3.4%	PASS
total/synth/DUP12/d0	2.48	2.44	-1.4%	PASS
total/synth/DUP12/d1	1.73	1.74	+0.7%	PASS
total/synth/DUP13/d0	2.46	2.46	+0.3%	PASS
total/synth/DUP13/d1	2.45	2.48	+1.0%	PASS
total/synth/DUP14/d0	1.63	1.64	+0.8%	PASS
total/synth/DUP14/d1	2.47	2.62	+6.4%	PASS
total/synth/DUP15/d0	2.76	2.82	+1.9%	PASS
total/synth/DUP15/d1	1.73	1.74	+0.7%	PASS
total/synth/DUP16/d0	2.37	2.46	+3.8%	PASS
total/synth/DUP16/d1	2.47	2.48	+0.5%	PASS
total/synth/DUP2/d0	1.61	1.63	+1.2%	PASS
total/synth/DUP2/d1	2.49	2.39	-4.3%	PASS
total/synth/DUP3/d0	2.47	2.39	-3.3%	PASS
total/synth/DUP3/d1	1.73	1.73	+0.3%	PASS
total/synth/DUP4/d0	2.46	2.45	-0.5%	PASS
total/synth/DUP4/d1	2.43	2.47	+1.7%	PASS
total/synth/DUP5/d0	1.62	1.63	+1.0%	PASS
total/synth/DUP5/d1	2.48	2.37	-4.3%	PASS
total/synth/DUP6/d0	2.35	2.34	-0.5%	PASS
total/synth/DUP6/d1	1.73	1.75	+1.1%	PASS
total/synth/DUP7/d0	2.31	2.45	+5.9%	PASS
total/synth/DUP7/d1	2.42	2.47	+2.0%	PASS
total/synth/DUP8/d0	1.62	1.64	+1.2%	PASS
total/synth/DUP8/d1	2.47	2.26	-8.5%	PASS
total/synth/DUP9/d0	2.46	2.47	+0.2%	PASS
total/synth/DUP9/d1	1.73	1.74	+1.1%	PASS
total/synth/EQ/b0	9.12	9.06	-0.7%	PASS
total/synth/EQ/b1	9.13	9.09	-0.4%	PASS
total/synth/GAS/a0	3.71	3.66	-1.4%	PASS
total/synth/GAS/a1	9.35	8.79	-6.0%	PASS
total/synth/GT/b0	9.39	9.05	-3.6%	PASS
total/synth/GT/b1	9.46	9.36	-1.1%	PASS
total/synth/ISZERO/u0	14.58	14.26	-2.1%	PASS
total/synth/JUMPDEST/n0	1.86	1.85	-0.3%	PASS
total/synth/LT/b0	9.26	9.22	-0.4%	PASS
total/synth/LT/b1	5.14	5.06	-1.4%	PASS
total/synth/MSIZE/a0	8.48	9.07	+7.0%	PASS
total/synth/MSIZE/a1	9.04	8.93	-1.2%	PASS
total/synth/MUL/b0	10.13	10.68	+5.4%	PASS
total/synth/MUL/b1	5.46	5.47	+0.2%	PASS
total/synth/NOT/u0	8.46	8.47	+0.2%	PASS
total/synth/OR/b0	5.94	5.84	-1.7%	PASS
total/synth/OR/b1	3.39	3.38	-0.1%	PASS
total/synth/PC/a0	7.91	7.92	+0.2%	PASS
total/synth/PC/a1	3.45	3.52	+2.0%	PASS
total/synth/PUSH1/p0	2.61	2.63	+0.5%	PASS
total/synth/PUSH1/p1	1.58	1.46	-8.1%	PASS
total/synth/PUSH10/p0	2.61	2.60	-0.5%	PASS
total/synth/PUSH10/p1	1.60	1.51	-5.8%	PASS
total/synth/PUSH11/p0	2.63	2.55	-3.0%	PASS
total/synth/PUSH11/p1	2.58	2.61	+1.1%	PASS
total/synth/PUSH12/p0	1.38	1.45	+4.7%	PASS
total/synth/PUSH12/p1	2.62	2.72	+3.7%	PASS
total/synth/PUSH13/p0	2.54	2.63	+3.5%	PASS
total/synth/PUSH13/p1	1.55	1.51	-2.1%	PASS
total/synth/PUSH14/p0	2.63	2.60	-0.9%	PASS
total/synth/PUSH14/p1	2.63	2.66	+1.0%	PASS
total/synth/PUSH15/p0	1.37	1.45	+6.2%	PASS
total/synth/PUSH15/p1	2.63	2.40	-8.5%	PASS
total/synth/PUSH16/p0	2.63	2.56	-2.7%	PASS
total/synth/PUSH16/p1	1.60	1.52	-4.9%	PASS
total/synth/PUSH17/p0	2.62	2.54	-2.8%	PASS
total/synth/PUSH17/p1	2.62	2.64	+1.1%	PASS
total/synth/PUSH18/p0	1.38	1.47	+6.5%	PASS
total/synth/PUSH18/p1	2.66	2.68	+0.5%	PASS
total/synth/PUSH19/p0	2.64	2.61	-1.1%	PASS
total/synth/PUSH19/p1	1.59	1.52	-4.3%	PASS
total/synth/PUSH2/p0	2.62	2.48	-5.3%	PASS
total/synth/PUSH2/p1	2.64	2.37	-10.5%	PASS
total/synth/PUSH20/p0	2.62	2.58	-1.3%	PASS
total/synth/PUSH20/p1	2.61	2.65	+1.6%	PASS
total/synth/PUSH21/p0	1.39	1.48	+6.2%	PASS
total/synth/PUSH21/p1	2.61	2.64	+1.1%	PASS
total/synth/PUSH22/p0	2.62	2.60	-0.6%	PASS
total/synth/PUSH22/p1	1.59	1.53	-3.8%	PASS
total/synth/PUSH23/p0	2.61	2.61	-0.1%	PASS
total/synth/PUSH23/p1	2.64	2.69	+2.2%	PASS
total/synth/PUSH24/p0	1.37	1.50	+9.7%	PASS
total/synth/PUSH24/p1	2.65	2.70	+2.1%	PASS
total/synth/PUSH25/p0	2.54	2.63	+3.4%	PASS
total/synth/PUSH25/p1	1.60	1.53	-4.6%	PASS
total/synth/PUSH26/p0	2.60	2.60	-0.3%	PASS
total/synth/PUSH26/p1	2.59	2.59	-0.1%	PASS
total/synth/PUSH27/p0	1.42	1.47	+3.3%	PASS
total/synth/PUSH27/p1	2.62	2.65	+1.1%	PASS
total/synth/PUSH28/p0	2.56	2.61	+2.0%	PASS
total/synth/PUSH28/p1	1.59	1.57	-1.7%	PASS
total/synth/PUSH29/p0	2.60	2.62	+0.6%	PASS
total/synth/PUSH29/p1	2.66	2.55	-4.1%	PASS
total/synth/PUSH3/p0	1.34	1.43	+6.9%	PASS
total/synth/PUSH3/p1	2.60	2.59	-0.4%	PASS
total/synth/PUSH30/p0	1.48	1.58	+6.7%	PASS
total/synth/PUSH30/p1	2.59	2.64	+1.7%	PASS
total/synth/PUSH31/p0	2.62	2.61	-0.3%	PASS
total/synth/PUSH31/p1	1.59	1.62	+2.0%	PASS
total/synth/PUSH32/p0	2.59	2.60	+0.4%	PASS
total/synth/PUSH32/p1	2.64	2.69	+1.9%	PASS
total/synth/PUSH4/p0	2.63	2.55	-3.2%	PASS
total/synth/PUSH4/p1	1.59	1.50	-5.1%	PASS
total/synth/PUSH5/p0	2.30	2.61	+13.2%	PASS
total/synth/PUSH5/p1	2.60	2.67	+2.6%	PASS
total/synth/PUSH6/p0	1.37	1.44	+5.1%	PASS
total/synth/PUSH6/p1	2.62	2.65	+1.2%	PASS
total/synth/PUSH7/p0	2.62	2.58	-1.7%	PASS
total/synth/PUSH7/p1	1.60	1.48	-7.5%	PASS
total/synth/PUSH8/p0	2.63	2.62	-0.5%	PASS
total/synth/PUSH8/p1	2.64	2.69	+1.8%	PASS
total/synth/PUSH9/p0	1.33	1.43	+7.0%	PASS
total/synth/PUSH9/p1	2.43	2.63	+8.3%	PASS
total/synth/RETURNDATASIZE/a0	3.70	3.79	+2.2%	PASS
total/synth/RETURNDATASIZE/a1	8.61	8.70	+1.1%	PASS
total/synth/SAR/b0	3.92	3.96	+1.0%	PASS
total/synth/SAR/b1	9.25	9.27	+0.2%	PASS
total/synth/SGT/b0	8.05	7.96	-1.1%	PASS
total/synth/SGT/b1	3.80	3.86	+1.7%	PASS
total/synth/SHL/b0	8.05	8.05	-0.0%	PASS
total/synth/SHL/b1	3.80	3.76	-1.2%	PASS
total/synth/SHR/b0	7.88	7.92	+0.6%	PASS
total/synth/SHR/b1	7.04	7.13	+1.3%	PASS
total/synth/SIGNEXTEND/b0	3.27	3.26	-0.4%	PASS
total/synth/SIGNEXTEND/b1	7.23	6.86	-5.0%	PASS
total/synth/SLT/b0	3.31	3.29	-0.8%	PASS
total/synth/SLT/b1	7.93	8.12	+2.5%	PASS
total/synth/SUB/b0	6.27	6.49	+3.6%	PASS
total/synth/SUB/b1	6.47	6.69	+3.4%	PASS
total/synth/SWAP1/s0	2.14	2.13	-0.4%	PASS
total/synth/SWAP10/s0	2.16	2.17	+0.5%	PASS
total/synth/SWAP11/s0	4.01	4.01	-0.1%	PASS
total/synth/SWAP12/s0	4.01	3.77	-6.0%	PASS
total/synth/SWAP13/s0	2.17	2.16	-0.6%	PASS
total/synth/SWAP14/s0	4.01	4.05	+1.2%	PASS
total/synth/SWAP15/s0	3.95	4.11	+4.1%	PASS
total/synth/SWAP16/s0	2.20	2.19	-0.3%	PASS
total/synth/SWAP2/s0	3.99	3.99	-0.1%	PASS
total/synth/SWAP3/s0	3.99	3.98	-0.3%	PASS
total/synth/SWAP4/s0	2.15	2.15	-0.2%	PASS
total/synth/SWAP5/s0	3.99	4.00	+0.2%	PASS
total/synth/SWAP6/s0	4.01	3.99	-0.3%	PASS
total/synth/SWAP7/s0	2.16	2.15	-0.2%	PASS
total/synth/SWAP8/s0	4.03	4.01	-0.3%	PASS
total/synth/SWAP9/s0	4.02	4.00	-0.6%	PASS
total/synth/XOR/b0	5.97	5.63	-5.6%	PASS
total/synth/XOR/b1	5.87	5.85	-0.3%	PASS
total/synth/loop_v1	12.04	13.27	+10.2%	PASS
total/synth/loop_v2	13.29	13.25	-0.3%	PASS

Summary: 194 benchmarks, 0 regressions

✅ Performance Check Passed (multipass)

Performance Benchmark Results (threshold: 25%)

Benchmark	Baseline (us)	Current (us)	Change	Status
total/main/blake2b_huff/8415nulls	2.13	2.13	+0.0%	PASS
total/main/blake2b_huff/empty	0.04	0.04	+1.6%	PASS
total/main/blake2b_shifts/8415nulls	20.85	20.95	+0.4%	PASS
total/main/sha1_divs/5311	12.21	11.93	-2.3%	PASS
total/main/sha1_divs/empty	0.15	0.15	+1.4%	PASS
total/main/sha1_shifts/5311	8.58	9.11	+6.2%	PASS
total/main/sha1_shifts/empty	0.08	0.08	-3.4%	PASS
total/main/snailtracer/benchmark	105.74	110.47	+4.5%	PASS
total/main/structarray_alloc/nfts_rank	1.38	1.40	+1.4%	PASS
total/main/swap_math/insufficient_liquidity	0.00	0.00	-2.0%	PASS
total/main/swap_math/received	0.01	0.01	-2.8%	PASS
total/main/swap_math/spent	0.01	0.01	-3.9%	PASS
total/main/weierstrudel/1	0.38	0.40	+4.8%	PASS
total/main/weierstrudel/15	4.27	4.28	+0.4%	PASS
total/micro/JUMPDEST_n0/empty	0.00	0.00	-24.4%	PASS
total/micro/jump_around/empty	0.06	0.06	-1.1%	PASS
total/micro/loop_with_many_jumpdests/empty	0.01	0.01	+0.3%	PASS
total/micro/memory_grow_mload/by1	0.02	0.02	-0.3%	PASS
total/micro/memory_grow_mload/by16	0.02	0.02	+2.3%	PASS
total/micro/memory_grow_mload/by32	0.01	0.01	-0.7%	PASS
total/micro/memory_grow_mload/nogrow	0.01	0.01	+0.6%	PASS
total/micro/memory_grow_mstore/by1	0.02	0.02	-4.2%	PASS
total/micro/memory_grow_mstore/by16	0.01	0.01	-0.2%	PASS
total/micro/memory_grow_mstore/by32	0.02	0.03	+2.6%	PASS
total/micro/memory_grow_mstore/nogrow	0.02	0.02	+30.4%	PASS
total/micro/signextend/one	0.07	0.07	-0.8%	PASS
total/micro/signextend/zero	0.18	0.18	+1.5%	PASS
total/synth/ADD/b0	0.00	0.00	+3.6%	PASS
total/synth/ADD/b1	0.00	0.00	+1.4%	PASS
total/synth/ADDRESS/a0	0.23	0.23	-1.0%	PASS
total/synth/ADDRESS/a1	0.15	0.15	+0.3%	PASS
total/synth/AND/b0	0.00	0.00	+3.2%	PASS
total/synth/AND/b1	0.00	0.00	+1.4%	PASS
total/synth/BYTE/b0	0.00	0.00	+3.3%	PASS
total/synth/BYTE/b1	0.00	0.00	+1.1%	PASS
total/synth/CALLDATASIZE/a0	0.13	0.12	-8.1%	PASS
total/synth/CALLDATASIZE/a1	0.07	0.07	+0.0%	PASS
total/synth/CALLER/a0	0.24	0.23	-7.4%	PASS
total/synth/CALLER/a1	0.26	0.24	-9.5%	PASS
total/synth/CALLVALUE/a0	0.19	0.19	+0.5%	PASS
total/synth/CALLVALUE/a1	0.28	0.28	-0.8%	PASS
total/synth/CODESIZE/a0	0.12	0.12	+0.3%	PASS
total/synth/CODESIZE/a1	0.12	0.13	+7.6%	PASS
total/synth/DUP1/d0	0.00	0.00	+1.6%	PASS
total/synth/DUP1/d1	0.00	0.00	+3.4%	PASS
total/synth/DUP10/d0	0.00	0.00	+1.2%	PASS
total/synth/DUP10/d1	0.00	0.00	+1.1%	PASS
total/synth/DUP11/d0	0.00	0.00	+3.6%	PASS
total/synth/DUP11/d1	0.00	0.00	+1.7%	PASS
total/synth/DUP12/d0	0.00	0.00	+1.6%	PASS
total/synth/DUP12/d1	0.00	0.00	+3.7%	PASS
total/synth/DUP13/d0	0.00	0.00	+2.3%	PASS
total/synth/DUP13/d1	0.00	0.00	+2.1%	PASS
total/synth/DUP14/d0	0.00	0.00	+3.8%	PASS
total/synth/DUP14/d1	0.00	0.00	-0.2%	PASS
total/synth/DUP15/d0	0.00	0.00	+0.9%	PASS
total/synth/DUP15/d1	0.00	0.00	+3.2%	PASS
total/synth/DUP16/d0	0.00	0.00	+2.0%	PASS
total/synth/DUP16/d1	0.00	0.00	-0.0%	PASS
total/synth/DUP2/d0	0.00	0.00	+3.6%	PASS
total/synth/DUP2/d1	0.00	0.00	+1.1%	PASS
total/synth/DUP3/d0	0.00	0.00	+1.4%	PASS
total/synth/DUP3/d1	0.00	0.00	+3.4%	PASS
total/synth/DUP4/d0	0.00	0.00	+0.9%	PASS
total/synth/DUP4/d1	0.00	0.00	-1.0%	PASS
total/synth/DUP5/d0	0.00	0.00	+3.6%	PASS
total/synth/DUP5/d1	0.00	0.00	+2.6%	PASS
total/synth/DUP6/d0	0.00	0.00	+1.3%	PASS
total/synth/DUP6/d1	0.00	0.00	+3.7%	PASS
total/synth/DUP7/d0	0.00	0.00	+1.7%	PASS
total/synth/DUP7/d1	0.00	0.00	+2.3%	PASS
total/synth/DUP8/d0	0.00	0.00	+3.7%	PASS
total/synth/DUP8/d1	0.00	0.00	+1.9%	PASS
total/synth/DUP9/d0	0.00	0.00	+0.6%	PASS
total/synth/DUP9/d1	0.00	0.00	+3.6%	PASS
total/synth/EQ/b0	0.00	0.00	+2.7%	PASS
total/synth/EQ/b1	0.00	0.00	+1.9%	PASS
total/synth/GAS/a0	0.76	0.76	+0.0%	PASS
total/synth/GAS/a1	0.96	0.92	-4.6%	PASS
total/synth/GT/b0	0.00	0.00	+1.1%	PASS
total/synth/GT/b1	0.00	0.00	+1.4%	PASS
total/synth/ISZERO/u0	0.00	0.00	+1.5%	PASS
total/synth/JUMPDEST/n0	0.00	0.00	+0.0%	PASS
total/synth/LT/b0	0.00	0.00	+1.7%	PASS
total/synth/LT/b1	0.00	0.00	+4.0%	PASS
total/synth/MSIZE/a0	0.00	0.00	+1.2%	PASS
total/synth/MSIZE/a1	0.00	0.00	+1.5%	PASS
total/synth/MUL/b0	0.00	0.00	+1.4%	PASS
total/synth/MUL/b1	0.00	0.00	+3.3%	PASS
total/synth/NOT/u0	0.00	0.00	+1.9%	PASS
total/synth/OR/b0	0.00	0.00	+2.2%	PASS
total/synth/OR/b1	0.00	0.00	+2.1%	PASS
total/synth/PC/a0	0.00	0.00	+2.5%	PASS
total/synth/PC/a1	0.00	0.00	+3.4%	PASS
total/synth/PUSH1/p0	0.00	0.00	+3.0%	PASS
total/synth/PUSH1/p1	0.00	0.00	+3.2%	PASS
total/synth/PUSH10/p0	0.00	0.00	+1.8%	PASS
total/synth/PUSH10/p1	0.00	0.00	+4.2%	PASS
total/synth/PUSH11/p0	0.00	0.00	+1.2%	PASS
total/synth/PUSH11/p1	0.00	0.00	-1.3%	PASS
total/synth/PUSH12/p0	0.00	0.00	+1.8%	PASS
total/synth/PUSH12/p1	0.00	0.00	+4.7%	PASS
total/synth/PUSH13/p0	0.00	0.00	+4.9%	PASS
total/synth/PUSH13/p1	0.00	0.00	+1.0%	PASS
total/synth/PUSH14/p0	0.00	0.00	+6.7%	PASS
total/synth/PUSH14/p1	0.00	0.00	+2.4%	PASS
total/synth/PUSH15/p0	0.00	0.00	+0.8%	PASS
total/synth/PUSH15/p1	0.00	0.00	+0.5%	PASS
total/synth/PUSH16/p0	0.00	0.00	+3.3%	PASS
total/synth/PUSH16/p1	0.00	0.00	+1.8%	PASS
total/synth/PUSH17/p0	0.00	0.00	-1.4%	PASS
total/synth/PUSH17/p1	0.00	0.00	+6.3%	PASS
total/synth/PUSH18/p0	0.00	0.00	+2.4%	PASS
total/synth/PUSH18/p1	0.00	0.00	+4.8%	PASS
total/synth/PUSH19/p0	0.00	0.00	+1.9%	PASS
total/synth/PUSH19/p1	0.00	0.00	+1.7%	PASS
total/synth/PUSH2/p0	0.00	0.00	+3.2%	PASS
total/synth/PUSH2/p1	0.00	0.00	+1.2%	PASS
total/synth/PUSH20/p0	0.00	0.00	+0.7%	PASS
total/synth/PUSH20/p1	0.00	0.00	+0.8%	PASS
total/synth/PUSH21/p0	0.00	0.00	+2.8%	PASS
total/synth/PUSH21/p1	0.00	0.00	+1.0%	PASS
total/synth/PUSH22/p0	2.17	2.22	+2.7%	PASS
total/synth/PUSH22/p1	1.57	1.86	+18.4%	PASS
total/synth/PUSH23/p0	2.18	2.18	-0.0%	PASS
total/synth/PUSH23/p1	2.35	2.64	+12.4%	PASS
total/synth/PUSH24/p0	1.33	1.41	+5.5%	PASS
total/synth/PUSH24/p1	2.27	2.47	+8.9%	PASS
total/synth/PUSH25/p0	2.14	2.19	+2.6%	PASS
total/synth/PUSH25/p1	1.62	1.83	+13.2%	PASS
total/synth/PUSH26/p0	2.16	2.33	+7.9%	PASS
total/synth/PUSH26/p1	2.30	2.62	+13.9%	PASS
total/synth/PUSH27/p0	1.32	1.40	+6.1%	PASS
total/synth/PUSH27/p1	2.20	2.48	+12.4%	PASS
total/synth/PUSH28/p0	2.20	2.19	-0.5%	PASS
total/synth/PUSH28/p1	1.58	1.85	+17.5%	PASS
total/synth/PUSH29/p0	2.21	2.25	+1.8%	PASS
total/synth/PUSH29/p1	2.22	2.40	+8.0%	PASS
total/synth/PUSH3/p0	0.00	0.00	+2.8%	PASS
total/synth/PUSH3/p1	0.00	0.00	+1.4%	PASS
total/synth/PUSH30/p0	1.49	1.61	+8.0%	PASS
total/synth/PUSH30/p1	2.21	2.38	+8.0%	PASS
total/synth/PUSH31/p0	2.27	2.20	-3.3%	PASS
total/synth/PUSH31/p1	1.71	2.04	+19.0%	PASS
total/synth/PUSH32/p0	2.14	2.20	+2.7%	PASS
total/synth/PUSH32/p1	2.37	2.40	+1.1%	PASS
total/synth/PUSH4/p0	0.00	0.00	+1.8%	PASS
total/synth/PUSH4/p1	0.00	0.00	+3.4%	PASS
total/synth/PUSH5/p0	0.00	0.00	+1.8%	PASS
total/synth/PUSH5/p1	0.00	0.00	-0.9%	PASS
total/synth/PUSH6/p0	0.00	0.00	+2.2%	PASS
total/synth/PUSH6/p1	0.00	0.00	-1.1%	PASS
total/synth/PUSH7/p0	0.00	0.00	+1.8%	PASS
total/synth/PUSH7/p1	0.00	0.00	+0.3%	PASS
total/synth/PUSH8/p0	0.00	0.00	-1.0%	PASS
total/synth/PUSH8/p1	0.00	0.00	-7.1%	PASS
total/synth/PUSH9/p0	0.00	0.00	+6.0%	PASS
total/synth/PUSH9/p1	0.00	0.00	+6.1%	PASS
total/synth/RETURNDATASIZE/a0	0.03	0.03	+0.8%	PASS
total/synth/RETURNDATASIZE/a1	0.06	0.06	+9.5%	PASS
total/synth/SAR/b0	0.00	0.00	+3.5%	PASS
total/synth/SAR/b1	0.00	0.00	+3.1%	PASS
total/synth/SGT/b0	0.00	0.00	+2.0%	PASS
total/synth/SGT/b1	0.00	0.00	+3.6%	PASS
total/synth/SHL/b0	0.00	0.00	-0.1%	PASS
total/synth/SHL/b1	0.00	0.00	+3.9%	PASS
total/synth/SHR/b0	0.00	0.00	+1.3%	PASS
total/synth/SHR/b1	0.00	0.00	+1.7%	PASS
total/synth/SIGNEXTEND/b0	0.00	0.00	+3.2%	PASS
total/synth/SIGNEXTEND/b1	0.00	0.00	+1.9%	PASS
total/synth/SLT/b0	0.00	0.00	+3.3%	PASS
total/synth/SLT/b1	0.00	0.00	+2.4%	PASS
total/synth/SUB/b0	0.00	0.00	+0.6%	PASS
total/synth/SUB/b1	0.00	0.00	+1.8%	PASS
total/synth/SWAP1/s0	0.00	0.00	+3.4%	PASS
total/synth/SWAP10/s0	0.00	0.00	+2.4%	PASS
total/synth/SWAP11/s0	0.00	0.00	+2.3%	PASS
total/synth/SWAP12/s0	0.00	0.00	+1.7%	PASS
total/synth/SWAP13/s0	0.00	0.00	+3.5%	PASS
total/synth/SWAP14/s0	0.00	0.00	+1.5%	PASS
total/synth/SWAP15/s0	0.00	0.00	+1.6%	PASS
total/synth/SWAP16/s0	0.00	0.00	+3.0%	PASS
total/synth/SWAP2/s0	0.00	0.00	+1.8%	PASS
total/synth/SWAP3/s0	0.00	0.00	+2.7%	PASS
total/synth/SWAP4/s0	0.00	0.00	+3.5%	PASS
total/synth/SWAP5/s0	0.00	0.00	+1.1%	PASS
total/synth/SWAP6/s0	0.00	0.00	+0.3%	PASS
total/synth/SWAP7/s0	0.00	0.00	+2.6%	PASS
total/synth/SWAP8/s0	0.00	0.00	+1.3%	PASS
total/synth/SWAP9/s0	0.00	0.00	+2.6%	PASS
total/synth/XOR/b0	0.00	0.00	+1.4%	PASS
total/synth/XOR/b1	0.00	0.00	+1.3%	PASS
total/synth/loop_v1	11.96	11.56	-3.3%	PASS
total/synth/loop_v2	11.58	11.61	+0.3%	PASS

Summary: 194 benchmarks, 0 regressions

The const-amount fast paths in SHL/SHR/SAR lowering kept a runtime >= 256 guard (one Select per result limb) plus an isU256GreaterOrEqual comparison chain even when the full 256-bit shift constant makes the guard statically decidable, and emitted shifted/carry terms for source limbs the value's range proves zero.

- handleShift: a constant amount >= 256 folds SHL/SHR_U to a constant zero (EVM spec; SAR keeps the generic sign-dependent flow); a constant amount < 256 skips building IsLargeShift and passes nullptr.
- Helpers skip the per-limb guard Select when IsLargeShift is nullptr; dynamic paths assert non-null. SAR sign-fill is untouched.
- SHL/SHR_U const paths drop shifted/carry terms whose source limb index is >= the value operand's range tier (U64 -> 1 live limb, U128 -> 2); SAR is excluded so no new range claim is introduced.

The limb0-only getConstShiftAmount trap (constant like 2^64 with a small limb0) is resolved statically by checking the full constant.

Add 12 adversarial differential fixtures (cross-limb carries, source pruning, >= 256 folds, the 2^64 trap, SAR sign-fill both signs, and a dynamic-amount control) plus an EVMConstShiftDifferentialTest suite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Compute the full 256-bit constant amount a single time and carry the below-256 verdict in a flag, instead of re-deriving it for the IsLargeShift gating (review feedback). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The evmone-statetests job died ~4.5 minutes in with its Build-and-Test step log never uploaded — the same runner-level termination signature seen earlier tonight on another PR's run (which passed on retrigger). The suite never reached test execution; this job passed on this PR's previous round. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…abels The shifted-dead/carry-live boundary example used << 200, where the top result limb actually keeps the shifted term and no carry term exists; << 136 is the case where only the carry term survives. Also replace internal reviewer codenames with neutral wording and set Status to Implemented. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Covers the carry-only emission branch in the const-shift handler: for a u64-tagged value (LiveLimbs=1), SHL by 136 (CompShift=2, ShiftMod=8) makes result limb 3 read SrcIdx=1 as a dead shifted term while its carry term reads the live limb Value[0] (Value[0] >> 56). That limb is therefore computed from a single carry shift with no OR and no >=256 guard Select. The existing 12 fixtures exercised shifted-term-live cases but never this shifted-dead/carry-live branch. Adds shl_const136_u64val (.easm + .expected) modeled on shl_const200_u64val, registers it in EVMConstShiftDifferentialTest, and bumps the change-doc verification count to 13/13. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
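The shifted-dead/carry-live case in this commit message can be worked through by listing which terms survive for each result limb. The helper below is a hypothetical model: it borrows the CompShift and ShiftMod names from the message, takes limbs least significant first, and is not the handler's actual code.

```python
def shl_const_terms(amount, live_limbs):
    """List the surviving shifted/carry terms per result limb for a constant SHL."""
    assert 0 <= amount < 256 and 1 <= live_limbs <= 4
    comp_shift, shift_mod = divmod(amount, 64)  # whole-limb and intra-limb parts
    terms = {}
    for result_limb in range(4):
        shifted_src = result_limb - comp_shift
        carry_src = shifted_src - 1
        limb_terms = []
        # A term survives only if its source limb exists and is live.
        if 0 <= shifted_src < live_limbs:
            limb_terms.append(f"Value[{shifted_src}] << {shift_mod}")
        if shift_mod and 0 <= carry_src < live_limbs:
            limb_terms.append(f"Value[{carry_src}] >> {64 - shift_mod}")
        terms[result_limb] = limb_terms
    return terms


print(shl_const_terms(136, 1))
# {0: [], 1: [], 2: ['Value[0] << 8'], 3: ['Value[0] >> 56']}
```

For 136 = 2 * 64 + 8 and one live limb, result limb 3 would read Value[1] for its shifted term, which is dead, and keeps only the carry term Value[0] >> 56. With an amount of 200 the same limb instead keeps the shifted term Value[0] << 8 and has no carry term.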

The WASM multipass job normally completes in 3-4 minutes; the current run's instance has been stuck in progress for over two hours with no log output (runner-level hang). The EVM workflow on this same head passed completely. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The EVMConstShiftDifferentialTest suite and its 13 evm_asm fixtures relocate to the dedicated differential-suite change so optimization PRs stay code-only and the shared evm_interp_tests.cpp stops accumulating per-PR copies. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

abmcar · 2026-06-11T07:56:01Z

The differential test suite and its fixtures have moved out of this PR into #539, which consolidates the interp-vs-multipass differential coverage from all three optimization PRs into a dedicated test target. This PR is now code + docs only; #539 carries the tests and can merge independently in any order.

Align the doc with the technical-writing rule: describe internal labels by behavior, remove who-reviewed and process narrative, and keep every count, flag, code anchor, and measurement. No code or runtime change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Cen5bPpPEgkSkcxWWTSY7d

Copilot AI review requested due to automatic review settings June 9, 2026 20:07

Copilot AI reviewed Jun 9, 2026

Comment thread src/compiler/evm_frontend/evm_mir_compiler.h

Comment thread src/compiler/evm_frontend/evm_mir_compiler.h

abmcar closed this Jun 10, 2026

abmcar reopened this Jun 10, 2026

abmcar and others added 6 commits June 11, 2026 11:34

docs(compiler): translate change document to English

11d47d3

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

abmcar force-pushed the perf/evm-const-shift-pruning branch from 8bc0816 to 44d6223 Compare June 11, 2026 03:40

abmcar mentioned this pull request Jun 11, 2026

perf(compiler): narrow EVM SUB lowering for u64-proven operand pairs #536

Merged

abmcar mentioned this pull request Jun 11, 2026

test: consolidate multipass differential suites into one target #539

Merged

abmcar mentioned this pull request Jun 11, 2026

test(evm): add value-range lowering differential harness #533

Closed

zoowii merged commit 8c98d30 into DTVMStack:main Jun 24, 2026
17 checks passed

zoowii merged 9 commits into DTVMStack:main from abmcar:perf/evm-const-shift-pruning

3 participants
