Skip to content

feat(tests,fw): Add MIP-8 implementation and tests#25

Open
pdobacz wants to merge 8 commits into
upstreamfrom
mip8
Open

feat(tests,fw): Add MIP-8 implementation and tests#25
pdobacz wants to merge 8 commits into
upstreamfrom
mip8

Conversation

@pdobacz

@pdobacz pdobacz commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Notes:

  1. This PR includes a WET addition of a new MONAD_NEXT fork (will be MONAD_TEN but CL uses MONAD_NEXT for now and we must match), there are a lot of files under forks/monad_next which are just copies of monad_nine predecessor.
  2. Most important part to review is the comprehensiveness of the spec test suite under monad_ten (we don't need to match on this)
  3. Second most important part is the new page_commitment function, but that's based on the reference implementation from the MIP-8 text.

Assuming that MIP-8 is going in, I'm going to go ahead and merge this to forks/monad_nine as soon as we review this. Maybe, not the 100% standard process for EEST, but for convenience - it will be easier to release and distribute MIP-8 tests as part of the usual monad spec tests JSON bundle.

I'm still doing some final validations but I'm not expecting surprises at this point, ready to review.

Oh, and it's branching off of upstream, so #23 must go in first.

pdobacz added 8 commits June 12, 2026 14:28
Co-Authored-By: Claude <claude-opus-4-7>
Co-Authored-By: Claude <claude-opus-4-7>
Co-Authored-By: Claude <claude-opus-4-7>
Co-Authored-By: Claude <claude-opus-4-7>
@pdobacz pdobacz requested review from QEDK and mijovic as code owners June 12, 2026 15:33
@greptile-apps

greptile-apps Bot commented Jun 12, 2026

Copy link
Copy Markdown

Greptile Summary

This PR introduces MIP-8 (pageified storage) for the new MONAD_NEXT fork: storage slots are grouped into 128-slot pages, each committed via a BLAKE3-based ISMC scheme, with page-level warming replacing EIP-2929 per-key warming. A pure-Python BLAKE3 implementation is added, along with paged_storage_trie.py for the new storage root, and a comprehensive test suite under tests/monad_ten/ covering gas metering, cross-call page propagation, and the MONAD_NINE→MONAD_NEXT fork transition.

  • The core paged_storage_trie.py and sstore/sload gas logic correctly implement the MIP-8 spec; page warming propagation across call frames (success inherits, failure discards) is wired properly in incorporate_child_on_success / incorporate_child_on_error.
  • test_sstore_refund_removed contains contradictory post-state assertions: it expects storage={} (successful execution) alongside balance = initial - gas_limit * gas_price (all gas consumed), which require actual_gas_used == gas_limit; this is not satisfied because full_page_sweep_gas is calibrated for 0→1 fresh writes while the test performs 99→0 clears, leading to significant gas underuse for MONAD_NEXT.
  • The multi-chunk tree-merging strategy in blake3_hash and the assert isinstance(root_node, Bytes) guard in storage_root_paged each warrant a second look for edge-case correctness.

Confidence Score: 3/5

The production EVM code (page gas charging, BLAKE3, storage root) is well-structured and appears spec-correct; the main concern before merging is fixing the contradictory assertions in test_sstore_refund_removed and confirming the BLAKE3 multi-chunk tree matches the reference.

test_sstore_refund_removed has self-contradictory post-state assertions that will cause the test to fail on MONAD_NEXT: it expects the storage to be fully cleared (successful tx) while also expecting the sender's balance to reflect all gas consumed (OOG). These two outcomes are mutually exclusive, and the root cause is that full_page_sweep_gas is calibrated for setting zero slots (0→1), not clearing non-zero ones (99→0), giving a gas budget roughly 50× too large for the operation under test.

tests/monad_ten/mip8_pageified_storage/test_sstore_refunds.py needs the balance/receipt assertions corrected to use actual clearing gas rather than full_page_sweep_gas; src/ethereum/crypto/blake3.py multi-chunk tree merging needs a non-power-of-two byte-level cross-check against the reference BLAKE3.

Important Files Changed

Filename Overview
src/ethereum/paged_storage_trie.py New core MIP-8 module: groups storage slots into 4 KB pages, computes BLAKE3 ISMC page commitments, and builds the keccak256 MPT storage root; logic appears correct with one uncertain type-safety edge in the root-node assertion.
src/ethereum/crypto/blake3.py Pure-Python BLAKE3 implementation (compression function + hash); internal G-function and round scheduling match the spec, but the multi-chunk tree merging strategy should be validated against the official BLAKE3 reference for non-power-of-two chunk counts.
src/ethereum/forks/monad_next/vm/instructions/storage.py MIP-8 SLOAD/SSTORE implementations with page-level warming and state-growth gas tracking; logic is sound, minor ordering question about static-context check placement and a fragile but safe growth-counter reset pattern.
src/ethereum/forks/monad_next/vm/init.py Adds read_accessed_pages, write_accessed_pages, current_state_growth, net_state_growth to Evm; incorporate_child_on_success correctly propagates all four by assignment (safe because child starts with a copy of parent's data).
src/ethereum/forks/monad_next/fork.py New fork module; compute_paged_state_root correctly builds copied tries, applies changes, and delegates per-account storage roots to storage_root_paged with an empty-trie guard.
tests/monad_ten/mip8_pageified_storage/test_sstore_refunds.py test_sstore_refund_removed has contradictory post-state: storage={} requires success but balance=initial-gas_limit*gas_price requires all gas consumed (OOG); these cannot both be true since full_page_sweep_gas is calibrated for 0→1 writes not 99→0 clears.
tests/monad_ten/mip8_pageified_storage/helpers.py TxPageState and simulate_sstore accurately mirror the EVM sstore algorithm; STATE_TRANSITIONS covers all zero/non-zero combinations; simulate_sstore omits key_warm but that is appropriate since MIP-8 tests don't use that field.
tests/monad_ten/mip8_pageified_storage/test_sstore_gas.py Comprehensive gas-measurement tests covering all state transitions, cross-page warming combinations, and full-page sweeps; parametrisation is thorough and the CodeGasMeasure pattern is used correctly.
tests/monad_ten/mip8_pageified_storage/test_cross_call.py Extensive cross-call page-warming propagation tests; correctly covers CALL/DELEGATECALL/CALLCODE/STATICCALL success and revert scenarios with all call opcodes parameterised.
tests/monad_ten/mip8_pageified_storage/test_fork_transition.py Tests MONAD_NINE→MONAD_NEXT fork transition: storage persistence, page-warming activation, and gas cost changes across the fork boundary are well-structured and complete.
packages/testing/src/execution_testing/forks/forks/forks.py Adds MONAD_NEXT fork class with MIP-8 gas overrides for SLOAD/SSTORE; _calculate_sstore_gas_mip8 correctly implements page-level cost formula; opcode_refund_map correctly removes all SSTORE refunds.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    SSTORE["SSTORE opcode"] --> STIPEND{"gas_left ≤ 2300?"}
    STIPEND -- OOG --> HALT1["OutOfGasError"]
    STIPEND -- OK --> GETVAL["get_storage current_value"]
    GETVAL --> PAGEKEY["compute page_key\n(target, slot >> 7)"]
    PAGEKEY --> READWARM{"page_key in\nread_accessed_pages?"}
    READWARM -- No --> LOADCOST["gas += PAGE_LOAD_COST\nadd to read_accessed_pages"]
    READWARM -- Yes --> BASE["gas += PAGE_BASE_COST"]
    LOADCOST --> BASE
    BASE --> VALUEDIFF{"current != new?"}
    VALUEDIFF -- No --> GROWCHECK
    VALUEDIFF -- Yes --> WRITEWARM{"page_key in\nwrite_accessed_pages?"}
    WRITEWARM -- No --> WRITECOST["gas += PAGE_WRITE_COST\nadd to write_accessed_pages\nreset growth counters"]
    WRITEWARM -- Yes --> GROWTRACK
    WRITECOST --> GROWTRACK["update current_state_growth\n(+1 if 0→nonzero, -1 if nonzero→0)"]
    GROWTRACK --> GROWCHECK{"current_growth\n> net_growth?"}
    GROWCHECK -- Yes --> GROWCOST["gas += PAGE_STATE_GROWTH_COST\nnet_growth = current_growth"]
    GROWCHECK -- No --> CHARGE
    GROWCOST --> CHARGE["charge_gas"]
    CHARGE --> STATIC{"is_static?"}
    STATIC -- Yes --> HALT2["WriteInStaticContext"]
    STATIC -- No --> WRITE["set_storage"]
    WRITE --> DONE["pc += 1"]
Loading
Prompt To Fix All With AI
Fix the following 5 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 5
tests/monad_ten/mip8_pageified_storage/test_sstore_refunds.py:37-73
**Contradictory post-state assertions for MONAD_NEXT**

The test simultaneously expects `storage={}` (requires the transaction to have succeeded and all 128 slots cleared) and `balance = initial - gas_limit * gas_price` (requires that ALL `gas_limit` gas was consumed, implying OOG). These two assertions are mutually exclusive: on OOG the EVM reverts, leaving storage populated with 99s; on success, unused gas is refunded and the sender's balance deduction is less than `gas_limit * gas_price`.

For MONAD_NEXT specifically: `gas_limit = generous_gas + full_page_sweep_gas` where `full_page_sweep_gas` is calibrated for *setting* fresh zero slots (0→1, paying `PAGE_STATE_GROWTH_COST = 17 000` per slot). The test actually *clears* pre-set slots (99→0), which never triggers the growth charge. A rough estimate gives actual gas ≈ 45 000 vs `gas_limit` ≈ 2 440 000—the leftover gas is refunded, so the sender's balance after the test will be much higher than the assertion expects, causing a test failure.

### Issue 2 of 5
src/ethereum/forks/monad_next/vm/instructions/storage.py:100-128
**Static-context check fires after gas and state-growth counters are mutated**

`charge_gas` and the page-warmth/state-growth updates all run before the `is_static` guard. When `WriteInStaticContext` is raised, gas is already consumed, `read_accessed_pages`, `write_accessed_pages`, `current_state_growth`, and `net_state_growth` are already mutated in the current frame. Because `incorporate_child_on_error` propagates only `gas_left`, the parent frame is unaffected, and the state reverts correctly—so there is no observable correctness issue. However, the current ordering diverges from the pattern used in most EVM SSTORE implementations (check static → charge gas → write), which could confuse future readers or cause incorrect behaviour if the guard is ever moved or the call depth changes. It's worth confirming this ordering is intentional per the MIP-8 / Monad spec.

### Issue 3 of 5
src/ethereum/forks/monad_next/vm/instructions/storage.py:105-115
**Page-write initialisation resets growth counters even when growth tracking preceded the write**

When the page is first write-warmed (`page_key not in evm.write_accessed_pages` AND `current_value != new_value`), the code forcibly resets `evm.current_state_growth[page_key] = 0` and `evm.net_state_growth[page_key] = 0`. Because a child EVM initialises its own `current_state_growth` from the parent's dict, if the parent had growth data for this page key (e.g., accumulated via a successful nested call) and then tries a write on its own that is the first write-warm in *this* frame without checking the inherited dict, the reset would clobber the inherited growth figure. In practice this cannot happen today because the parent would already have the key in `write_accessed_pages` after any successful child write propagates via `incorporate_child_on_success`. But the invariant is fragile—it would be clearer to use `.setdefault(page_key, 0)` rather than an unconditional `= 0` to make the intent explicit.

### Issue 4 of 5
src/ethereum/crypto/blake3.py:213-244
**Multi-chunk tree merging via left-to-right pairing may not match standard BLAKE3**

The `while len(cvs) > 2` loop repeatedly pairs adjacent chunk chaining-values from left to right. Standard BLAKE3 constructs an irregular binary tree where the *left* subtree is always a complete binary tree sized to the largest power of two not exceeding the remaining chunk count (achieved via a counter-based stack in the reference implementation). For inputs whose chunk count is not a power of two the two strategies can produce different trees and therefore different hashes.

For example, with 6 chunks the current code produces `parent(parent(A,B,C,D), parent(E,F))` which matches standard BLAKE3. With 5 chunks both yield `parent(parent(A,B,C,D), E)`. However, more complex counts should be validated. The BLAKE3 test vectors in `blake3_test_vectors.json` cover this, but it is worth explicitly adding a multi-chunk test case with a non-power-of-two input length to confirm byte-for-byte compatibility with the official BLAKE3 reference.

### Issue 5 of 5
src/ethereum/paged_storage_trie.py:139-156
**`assert isinstance(root_node, Bytes)` may trip for very small tries**

`encode_internal_node` returns a raw (un-hashed) node when `len(rlp.encode(node)) < 32`, and a 32-byte keccak hash otherwise. For the single-leaf case (exactly one page), `encode_internal_node` may return a raw bytes object that is not a `Bytes` instance but a plain `bytes` or another bytes-compatible type, potentially causing the `isinstance` assertion to fail. The code path that hashes the short root (`return keccak256(...)`) is correct, but the remaining `else` branch relies on the raw node being wrapped as `Bytes`—confirm that `encode_internal_node` always returns a `Bytes` subclass (not `bytes`) for the hashed case.

Reviews (1): Last reviewed commit: "MIP-8 tests" | Re-trigger Greptile

Comment on lines +37 to +73
Assert exact cumulative_gas_used via TransactionReceipt and
sender balance change matches the computed gas charge.
"""
pre_storage: dict[NumberConvertible, NumberConvertible] = dict.fromkeys(
range(Spec.SLOTS_PER_PAGE), pre_storage_value
)

code = Bytecode()
for i in range(Spec.SLOTS_PER_PAGE):
code += Op.SSTORE(i, 0)

contract_address = pre.deploy_contract(code, storage=pre_storage)

sender = pre.fund_eoa(amount=sender_initial_balance)
gas_price = 10**9
gas_limit = generous_gas(fork) + full_page_sweep_gas(fork)

tx = Transaction(
gas_limit=gas_limit,
gas_price=gas_price,
to=contract_address,
sender=sender,
expected_receipt=TransactionReceipt(
cumulative_gas_used=gas_limit,
),
)

state_test(
pre=pre,
post={
contract_address: Account(storage={}),
sender: Account(
balance=sender_initial_balance - gas_limit * gas_price,
nonce=1,
),
},
tx=tx,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Contradictory post-state assertions for MONAD_NEXT

The test simultaneously expects storage={} (requires the transaction to have succeeded and all 128 slots cleared) and balance = initial - gas_limit * gas_price (requires that ALL gas_limit gas was consumed, implying OOG). These two assertions are mutually exclusive: on OOG the EVM reverts, leaving storage populated with 99s; on success, unused gas is refunded and the sender's balance deduction is less than gas_limit * gas_price.

For MONAD_NEXT specifically: gas_limit = generous_gas + full_page_sweep_gas where full_page_sweep_gas is calibrated for setting fresh zero slots (0→1, paying PAGE_STATE_GROWTH_COST = 17 000 per slot). The test actually clears pre-set slots (99→0), which never triggers the growth charge. A rough estimate gives actual gas ≈ 45 000 vs gas_limit ≈ 2 440 000—the leftover gas is refunded, so the sender's balance after the test will be much higher than the assertion expects, causing a test failure.

Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/monad_ten/mip8_pageified_storage/test_sstore_refunds.py
Line: 37-73

Comment:
**Contradictory post-state assertions for MONAD_NEXT**

The test simultaneously expects `storage={}` (requires the transaction to have succeeded and all 128 slots cleared) and `balance = initial - gas_limit * gas_price` (requires that ALL `gas_limit` gas was consumed, implying OOG). These two assertions are mutually exclusive: on OOG the EVM reverts, leaving storage populated with 99s; on success, unused gas is refunded and the sender's balance deduction is less than `gas_limit * gas_price`.

For MONAD_NEXT specifically: `gas_limit = generous_gas + full_page_sweep_gas` where `full_page_sweep_gas` is calibrated for *setting* fresh zero slots (0→1, paying `PAGE_STATE_GROWTH_COST = 17 000` per slot). The test actually *clears* pre-set slots (99→0), which never triggers the growth charge. A rough estimate gives actual gas ≈ 45 000 vs `gas_limit` ≈ 2 440 000—the leftover gas is refunded, so the sender's balance after the test will be much higher than the assertion expects, causing a test failure.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +100 to +128
evm.current_state_growth[page_key] = 0
evm.net_state_growth[page_key] = 0

# State growth cost
if current_value == 0 and new_value != 0:
evm.current_state_growth[page_key] = (
evm.current_state_growth.get(page_key, 0) + 1
)
elif current_value != 0 and new_value == 0:
evm.current_state_growth[page_key] = (
evm.current_state_growth.get(page_key, 0) - 1
)

current = evm.current_state_growth.get(page_key, 0)
peak = evm.net_state_growth.get(page_key, 0)
if current > peak:
gas_cost += GasCosts.PAGE_STATE_GROWTH_COST
evm.net_state_growth[page_key] = current

charge_gas(evm, gas_cost)
if evm.message.is_static:
raise WriteInStaticContext
set_storage(tx_state, target, key, new_value)

# PROGRAM COUNTER
evm.pc += Uint(1)


def tload(evm: Evm) -> None:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Static-context check fires after gas and state-growth counters are mutated

charge_gas and the page-warmth/state-growth updates all run before the is_static guard. When WriteInStaticContext is raised, gas is already consumed, read_accessed_pages, write_accessed_pages, current_state_growth, and net_state_growth are already mutated in the current frame. Because incorporate_child_on_error propagates only gas_left, the parent frame is unaffected, and the state reverts correctly—so there is no observable correctness issue. However, the current ordering diverges from the pattern used in most EVM SSTORE implementations (check static → charge gas → write), which could confuse future readers or cause incorrect behaviour if the guard is ever moved or the call depth changes. It's worth confirming this ordering is intentional per the MIP-8 / Monad spec.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ethereum/forks/monad_next/vm/instructions/storage.py
Line: 100-128

Comment:
**Static-context check fires after gas and state-growth counters are mutated**

`charge_gas` and the page-warmth/state-growth updates all run before the `is_static` guard. When `WriteInStaticContext` is raised, gas is already consumed, `read_accessed_pages`, `write_accessed_pages`, `current_state_growth`, and `net_state_growth` are already mutated in the current frame. Because `incorporate_child_on_error` propagates only `gas_left`, the parent frame is unaffected, and the state reverts correctly—so there is no observable correctness issue. However, the current ordering diverges from the pattern used in most EVM SSTORE implementations (check static → charge gas → write), which could confuse future readers or cause incorrect behaviour if the guard is ever moved or the call depth changes. It's worth confirming this ordering is intentional per the MIP-8 / Monad spec.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +105 to +115
evm.current_state_growth[page_key] = (
evm.current_state_growth.get(page_key, 0) + 1
)
elif current_value != 0 and new_value == 0:
evm.current_state_growth[page_key] = (
evm.current_state_growth.get(page_key, 0) - 1
)

current = evm.current_state_growth.get(page_key, 0)
peak = evm.net_state_growth.get(page_key, 0)
if current > peak:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Page-write initialisation resets growth counters even when growth tracking preceded the write

When the page is first write-warmed (page_key not in evm.write_accessed_pages AND current_value != new_value), the code forcibly resets evm.current_state_growth[page_key] = 0 and evm.net_state_growth[page_key] = 0. Because a child EVM initialises its own current_state_growth from the parent's dict, if the parent had growth data for this page key (e.g., accumulated via a successful nested call) and then tries a write on its own that is the first write-warm in this frame without checking the inherited dict, the reset would clobber the inherited growth figure. In practice this cannot happen today because the parent would already have the key in write_accessed_pages after any successful child write propagates via incorporate_child_on_success. But the invariant is fragile—it would be clearer to use .setdefault(page_key, 0) rather than an unconditional = 0 to make the intent explicit.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ethereum/forks/monad_next/vm/instructions/storage.py
Line: 105-115

Comment:
**Page-write initialisation resets growth counters even when growth tracking preceded the write**

When the page is first write-warmed (`page_key not in evm.write_accessed_pages` AND `current_value != new_value`), the code forcibly resets `evm.current_state_growth[page_key] = 0` and `evm.net_state_growth[page_key] = 0`. Because a child EVM initialises its own `current_state_growth` from the parent's dict, if the parent had growth data for this page key (e.g., accumulated via a successful nested call) and then tries a write on its own that is the first write-warm in *this* frame without checking the inherited dict, the reset would clobber the inherited growth figure. In practice this cannot happen today because the parent would already have the key in `write_accessed_pages` after any successful child write propagates via `incorporate_child_on_success`. But the invariant is fragile—it would be clearer to use `.setdefault(page_key, 0)` rather than an unconditional `= 0` to make the intent explicit.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +213 to +244
----------
data :
Input bytes to hash.

Returns
-------
digest : bytes
32-byte BLAKE3 digest.

"""
data = bytes(data)

if len(data) == 0:
output = compress(
IV,
b"\x00" * BLOCK_LEN,
0,
0,
CHUNK_START | CHUNK_END | ROOT,
)
return words_to_bytes(output[:8])

if len(data) <= CHUNK_LEN:
cv = _compress_chunk(data, 0, ROOT)
return words_to_bytes(cv)

cvs: List[List[int]] = []
offset = 0
counter = 0
while offset < len(data):
chunk = data[offset : offset + CHUNK_LEN]
offset += CHUNK_LEN

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Multi-chunk tree merging via left-to-right pairing may not match standard BLAKE3

The while len(cvs) > 2 loop repeatedly pairs adjacent chunk chaining-values from left to right. Standard BLAKE3 constructs an irregular binary tree where the left subtree is always a complete binary tree sized to the largest power of two not exceeding the remaining chunk count (achieved via a counter-based stack in the reference implementation). For inputs whose chunk count is not a power of two the two strategies can produce different trees and therefore different hashes.

For example, with 6 chunks the current code produces parent(parent(A,B,C,D), parent(E,F)) which matches standard BLAKE3. With 5 chunks both yield parent(parent(A,B,C,D), E). However, more complex counts should be validated. The BLAKE3 test vectors in blake3_test_vectors.json cover this, but it is worth explicitly adding a multi-chunk test case with a non-power-of-two input length to confirm byte-for-byte compatibility with the official BLAKE3 reference.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ethereum/crypto/blake3.py
Line: 213-244

Comment:
**Multi-chunk tree merging via left-to-right pairing may not match standard BLAKE3**

The `while len(cvs) > 2` loop repeatedly pairs adjacent chunk chaining-values from left to right. Standard BLAKE3 constructs an irregular binary tree where the *left* subtree is always a complete binary tree sized to the largest power of two not exceeding the remaining chunk count (achieved via a counter-based stack in the reference implementation). For inputs whose chunk count is not a power of two the two strategies can produce different trees and therefore different hashes.

For example, with 6 chunks the current code produces `parent(parent(A,B,C,D), parent(E,F))` which matches standard BLAKE3. With 5 chunks both yield `parent(parent(A,B,C,D), E)`. However, more complex counts should be validated. The BLAKE3 test vectors in `blake3_test_vectors.json` cover this, but it is worth explicitly adding a multi-chunk test case with a non-power-of-two input length to confirm byte-for-byte compatibility with the official BLAKE3 reference.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +139 to +156
return mapped


def storage_root_paged(storage: Mapping[Bytes32, U256]) -> Hash32:
"""
Compute the storage root over a keccak256 MPT whose leaves are BLAKE3
page commitments (MIP-8).

``storage`` maps each 32-byte slot key to its `U256` value.
"""
obj = _prepare_storage_trie(storage)

root_node = encode_internal_node(patricialize(obj, Uint(0)))
if len(rlp.encode(root_node)) < 32:
return keccak256(rlp.encode(root_node))
else:
assert isinstance(root_node, Bytes)
return Hash32(root_node)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 assert isinstance(root_node, Bytes) may trip for very small tries

encode_internal_node returns a raw (un-hashed) node when len(rlp.encode(node)) < 32, and a 32-byte keccak hash otherwise. For the single-leaf case (exactly one page), encode_internal_node may return a raw bytes object that is not a Bytes instance but a plain bytes or another bytes-compatible type, potentially causing the isinstance assertion to fail. The code path that hashes the short root (return keccak256(...)) is correct, but the remaining else branch relies on the raw node being wrapped as Bytes—confirm that encode_internal_node always returns a Bytes subclass (not bytes) for the hashed case.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ethereum/paged_storage_trie.py
Line: 139-156

Comment:
**`assert isinstance(root_node, Bytes)` may trip for very small tries**

`encode_internal_node` returns a raw (un-hashed) node when `len(rlp.encode(node)) < 32`, and a 32-byte keccak hash otherwise. For the single-leaf case (exactly one page), `encode_internal_node` may return a raw bytes object that is not a `Bytes` instance but a plain `bytes` or another bytes-compatible type, potentially causing the `isinstance` assertion to fail. The code path that hashes the short root (`return keccak256(...)`) is correct, but the remaining `else` branch relies on the raw node being wrapped as `Bytes`—confirm that `encode_internal_node` always returns a `Bytes` subclass (not `bytes`) for the hashed case.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant