Upgrade reth 1.9.3 → 1.11.3#927
Conversation
|
Claude finished @ZanCorDX's task in 4m 14s —— View job Review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
The migration is well-structured and the Main concern — provider opens moved into hot loops (perf)The shift from a single
Opening a history provider is meaningfully heavier than an Other items
Minor / notes
No blocking correctness or security issues found in the reviewed paths. The perf concern in the hot loops is the one I'd most want validated before merge. |
| let order_id = task.order.id(); | ||
| let start_time = Instant::now(); | ||
| let state_provider = match provider | ||
| .history_by_block_hash(current_sim_context.block_ctx.attributes.parent) |
There was a problem hiding this comment.
Perf regression in the sim hot loop. Previously the state provider was opened once per sim context (outside the inner while let loop) and cheaply Arc-cloned per order. Now history_by_block_hash is called for every order task. Opening a history provider walks reth's state/overlay setup and is materially more expensive than the old Arc::clone, so this adds per-order latency and DB pressure on the simulation path.
Since CachedDB now owns a non-cloneable StateProviderBox, the provider can't simply be hoisted — but it's worth confirming the per-order open cost is acceptable (benchmark mev_boost/sim throughput), or reworking so the provider is opened once per block and the per-order CachedDB borrows it.
There was a problem hiding this comment.
It's ~0.4 µs but I can fix it here
| fn initialize_block_state( | ||
| &mut self, | ||
| state_provider: Arc<dyn StateProvider>, | ||
| ) -> BlockState<CachedDB> { | ||
| let cached = CachedDB::new(state_provider, self.ctx.shared_cached_reads.clone()); | ||
| BlockState::new(cached) | ||
| source: StateProviderSource, | ||
| ) -> ProviderResult<BlockState<CachedDB>> { | ||
| let cached = CachedDB::new( | ||
| source.state_provider()?, | ||
| self.ctx.shared_cached_reads.clone(), | ||
| ); |
There was a problem hiding this comment.
Perf regression in the conflict-resolution hot loop. initialize_block_state now opens a fresh provider via source.state_provider()? on every sequence attempt: resolve_conflict_task loops over sequence_to_try (multiple per task) × many tasks per thread. Before this change the provider was opened once per worker thread in conflict_resolving_pool::start and Arc-cloned per task. The existing // @todo actually reuse it for the duration of the block (line 130) is now much more costly than before. Worth measuring the parallel-builder throughput impact, or reusing one provider per thread/block.
There was a problem hiding this comment.
parallel builder is not working anyway
| .get_or_init(|| { | ||
| let config = match tokio::runtime::Handle::try_current() { | ||
| Ok(handle) => RuntimeConfig::with_existing_handle(handle), | ||
| Err(_) => RuntimeConfig::default(), | ||
| }; | ||
| // Invariant: building the process-wide task runtime is a startup operation. If it | ||
| // fails the builder cannot compute state roots or open the provider factory, so there | ||
| // is no meaningful way to continue. | ||
| RuntimeBuilder::new(config) | ||
| .build() |
There was a problem hiding this comment.
The process-wide OnceLock captures the tokio Handle from whatever context makes the first call, and that decision is permanent. Two consequences worth checking:
- If the first caller runs inside a short-lived runtime (e.g. a
#[tokio::test]), the cachedRuntimeholds a handle to a tokio runtime that is dropped at the end of that test. Any later use of the global runtime (another test, ortest_reth_dbwhich is a plain#[test]) then operates on a dead handle — a plausible source of order-dependent test flakiness, since test execution order within a binary isn't deterministic. - First-call-wins also means the standalone-vs-attached choice can be "wrong" for later callers (attached when the first caller happened to be outside a runtime, or vice versa).
In the production single-runtime path this is fine; the risk is concentrated in the test suite. Consider not attaching to the ambient handle (always build standalone), or documenting that the first call must come from the long-lived runtime.
| fn unpack_nibbles<'a>(nibbles: &Nibbles, buf: &'a mut [u8; 64]) -> &'a [u8] { | ||
| let len = nibbles.len(); | ||
| for (i, nibble) in nibbles.iter().enumerate() { | ||
| buf[i] = nibble; | ||
| } | ||
| &buf[..len] | ||
| } |
There was a problem hiding this comment.
The fixed [0u8; 64] buffer indexes buf[i] without a bound check, so a Nibbles longer than 64 panics with an index-out-of-bounds. 64 is the correct max for account/storage trie keys (32-byte keccak = 64 nibbles), so the invariant holds today — but a debug_assert!(nibbles.len() <= 64) would make the invariant explicit and fail loudly in tests rather than relying on the implicit panic. Same applies to the identical helper in v_experimental/trie/mod.rs.
| + 'static, | ||
| { | ||
| let overlay = OverlayStateProviderFactory::new(provider); | ||
| let overlay = OverlayStateProviderFactory::new(provider, ChangesetCache::new()); |
There was a problem hiding this comment.
A fresh ChangesetCache::new() is allocated on every root-hash computation. If reth intends this cache to be reused across blocks/computations (to amortize changeset reads), a per-call cache provides no benefit. If a per-computation cache is the intended semantics here, ignore — but worth confirming against how reth's node wires OverlayStateProviderFactory.
|
Claude finished @ZanCorDX's task in 1m 15s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
|
Claude finished @ZanCorDX's task in 1m 20s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
📝 Summary
Bumps reth
27a8c0f→d6324d6(1.11.3) and the coupled major bumps: revm 31→34,alloy-evm 0.23→0.27, alloy 1.x→1.6, alloy-trie 0.8→0.9, revm-inspectors→0.34.
Adapts rbuilder to the new reth/revm APIs and removes the
unsafethat reth'strait-object
Syncchange would otherwise force.Dependencies
reth-*→ revd6324d6; revm 34, revm-inspectors 0.34.2, alloy-evm 0.27.2,alloy 1.6.3, alloy-primitives 1.5.6, alloy-trie 0.9.4.
time = 0.3.47(0.3.48 has an E0119 coherence bug); MSRV 1.85 → 1.88.reth-tasksin rbuilder-utils repointed to the same rev (was stale).Key adaptations
Provider Send/Sync . reth 1.11 dropped
SyncfromStateProviderBox(
Box<dyn StateProvider + Send>). The unit shared acrossbuilding threads is now
StateProviderSource(factory + parent hash,Send+Sync+Clone);each consumer opens its own provider on demand.
CachedDBholds an already-open provider;the block-building helper clones itself via an explicit, fallible
try_clonethat reopensthe provider. This is the reth way.
revm 34.
Database::Erroris nowEvmDatabaseError<ProviderError>;CfgEnv::with_spec→with_spec_and_mainnet_gas_params(sets Osaka's per-tx gas cap,EIP-7825).
order_commitmaps the new error shape.Root hash.
OverlayStateProviderFactory::new(factory, ChangesetCache),ParallelStateRoot::new(.., Runtime); provider bounds gainChangeSetReader + StorageChangeSetReader + DBProvider + StorageSettingsCache.Transparently supports reth storage v1 and v2 (reth reads the mode from the DB;
rbuilder only forwards the bound).
Provider construction.
ProviderFactory::newnow takes a RocksDBProvider + Runtimeand returns
Result; it reads storage settings on open. Adds a process-wide task runtime.eth-sparse-mpt. Unified onto nybbles 0.4 (matching alloy-trie 0.9 / reth): the crate's
Nibblesis now the same packed type reth's trie nodes use, so the conversion shims betweenthe two representations are gone — encode/decode pass
Nibblesstraight into alloy-trie nodetypes, removing per-node conversions in the root-hash hot path. Single nybbles version
across the workspace.
🤖 Generated with Claude Code with me kicking his ass to get it right....
💡 Motivation and Context
I like prime numbers so I switched 9 for 11.
✅ I have completed the following steps:
make lintmake test