fix(epoch-cache): use finalized L1 block and correct lag for committee guard #22153

Open
spalladino wants to merge 7 commits into merge-train/spartan from palla/fix/epoch-cache-finalized-guard

@spalladino
Contributor

Motivation

The computeCommittee guard in epoch-cache had two bugs: it used lagInEpochsForValidatorSet (the looser constraint) instead of lagInEpochsForRandao (the binding one), and it queried the latest L1 block instead of finalized. This meant an L1 reorg could change the RANDAO seed for a committee we'd already cached, and when the two lag values differed the guard was less strict than the L1 contract.

Approach

Switch the guard to use the finalized block tag and lagInEpochsForRandao. Compute the sampling timestamp from the epoch start (not the individual slot timestamp) to match L1 contract logic. Introduce typed error classes (EpochNotFinalizedError, EpochNotStableError) so callers can distinguish between "not yet finalized" and "not yet stable on L1". Extract types and errors into separate files.
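
The guard and the two error classes described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `GuardConstants` shape, the `samplingTimestamp` formula (epoch start minus the RANDAO lag), the `>` comparison, and the constructor signatures are all assumptions; only the class names `EpochNotFinalizedError` and `EpochNotStableError` come from the description.

```typescript
// Hypothetical shapes for illustration; the PR's real definitions live in
// epoch-cache's types.ts and errors.ts and may differ.
interface GuardConstants {
  l1GenesisTime: number; // seconds
  slotDuration: number; // seconds per L2 slot
  epochDuration: number; // L2 slots per epoch
  lagInEpochsForRandao: number;
}

class EpochNotFinalizedError extends Error {
  constructor(epoch: number, samplingTs: number, finalizedTs: number) {
    super(`Epoch ${epoch} samples at ${samplingTs}, finalized L1 block is at ${finalizedTs}`);
    this.name = 'EpochNotFinalizedError';
    // Keeps instanceof working when compiled to older JS targets.
    Object.setPrototypeOf(this, EpochNotFinalizedError.prototype);
  }
}

class EpochNotStableError extends Error {
  inner: unknown; // the underlying failure being wrapped (assumed design)
  constructor(epoch: number, inner: unknown) {
    super(`Epoch ${epoch} is not yet stable on L1`);
    this.name = 'EpochNotStableError';
    this.inner = inner;
    Object.setPrototypeOf(this, EpochNotStableError.prototype);
  }
}

// Assumed formula: sample at the epoch start minus the RANDAO lag, in epochs.
function samplingTimestamp(epoch: number, c: GuardConstants): number {
  const epochSeconds = c.epochDuration * c.slotDuration;
  const epochStart = c.l1GenesisTime + epoch * epochSeconds;
  return epochStart - c.lagInEpochsForRandao * epochSeconds;
}

// Throws unless the sampling timestamp is at or before the finalized L1 block.
function assertSampleFinalized(epoch: number, finalizedTs: number, c: GuardConstants): void {
  const ts = samplingTimestamp(epoch, c);
  if (ts > finalizedTs) {
    throw new EpochNotFinalizedError(epoch, ts, finalizedTs);
  }
}

// Caller side: the typed errors let each case be handled differently.
function describeFailure(err: unknown): string {
  if (err instanceof EpochNotFinalizedError) return 'wait-for-finality';
  if (err instanceof EpochNotStableError) return 'wait-for-l1-stability';
  return 'unexpected';
}
```

Because the sampling timestamp depends only on the epoch, every slot in the same epoch hits the same guard decision, which is the point of computing it from the epoch start rather than the slot timestamp.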

Changes

  • epoch-cache: Fix computeCommittee guard to use lagInEpochsForRandao, finalized block tag, and epoch-start-based sampling timestamp. Add EpochCacheConstants type and getEpochCacheConstants() accessor. Extract errors to errors.ts and types/interfaces to types.ts.
  • epoch-cache (tests): Use different lag values (lagInEpochsForValidatorSet=2, lagInEpochsForRandao=1) to exercise the fix. Add unit test for EpochNotStableError wrapping. Add integration tests against real Anvil: happy path (committee, caching, proposer selection) and two guard tests that independently trigger each error class.
  • epoch-cache (docs): Rewrite README with committee computation, LAG values, RANDAO, proposer selection, escape hatch, finalized block guard, and caching strategy.
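
To see why the tests use different lag values, the following worked example compares the two sampling times for `lagInEpochsForValidatorSet=2` and `lagInEpochsForRandao=1`. The slot and epoch durations, the genesis time, and the `sampleTime` formula are illustrative assumptions, not values taken from the PR.

```typescript
// Illustrative constants only (not the network's real values).
const SLOT_SECONDS = 36;
const SLOTS_PER_EPOCH = 32;
const EPOCH_SECONDS = SLOT_SECONDS * SLOTS_PER_EPOCH; // 1152
const GENESIS = 0;

// Assumed: data for an epoch is sampled `lagInEpochs` epochs before it starts.
function sampleTime(epoch: number, lagInEpochs: number): number {
  return GENESIS + (epoch - lagInEpochs) * EPOCH_SECONDS;
}

// A guard is satisfied once the finalized L1 timestamp reaches the sample time.
function guardPasses(sample: number, finalizedTs: number): boolean {
  return finalizedTs >= sample;
}

const epoch = 10;
const validatorSetSample = sampleTime(epoch, 2); // 9216
const randaoSample = sampleTime(epoch, 1); // 10368

// A finalized timestamp that falls between the two sample times.
const finalizedTs = 9500;
console.log(guardPasses(validatorSetSample, finalizedTs)); // true
console.log(guardPasses(randaoSample, finalizedTs)); // false
```

The smaller lag yields the later sample time, so it is the binding constraint: a guard keyed on the validator-set lag would accept this epoch while its RANDAO sample is still not finalized. With equal lag values the two guards coincide and the bug stays hidden.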

Fixes A-680

@spalladino spalladino added the ci-no-fail-fast and backport-to-v4-next labels Mar 30, 2026
…e guard

The computeCommittee guard was using lagInEpochsForValidatorSet (the looser
constraint) instead of lagInEpochsForRandao (the binding constraint), and
queried the latest L1 block instead of the finalized one. This could allow
caching a committee whose RANDAO seed is not yet finalized on L1.

Fixes the guard to use lagInEpochsForRandao and the finalized block tag,
computes sampling timestamp from epoch start (not slot timestamp), introduces
EpochNotFinalizedError and EpochNotStableError, and adds integration tests
against a real Anvil instance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@spalladino spalladino force-pushed the palla/fix/epoch-cache-finalized-guard branch from eb6dcfa to c362c0e March 30, 2026 17:19
spalladino and others added 4 commits March 30, 2026 14:24
Increases the Anvil slotsInAnEpoch from 1 to 8 so finalized = latest - 16
blocks, making tests less likely to pass by accident because of an
off-by-one near the finality boundary.
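
The arithmetic behind this change can be sketched as below. It assumes Anvil reports the finalized block two epochs behind latest, which is consistent with the "8 slots gives latest - 16" figure in this commit message; `finalizedBlockNumber` is a hypothetical helper, not part of the PR.

```typescript
// Assumption: "finalized" trails "latest" by two epochs' worth of blocks.
function finalizedBlockNumber(latest: number, slotsInAnEpoch: number): number {
  return Math.max(0, latest - 2 * slotsInAnEpoch);
}

console.log(finalizedBlockNumber(100, 1)); // 98
console.log(finalizedBlockNumber(100, 8)); // 84
console.log(finalizedBlockNumber(10, 8)); // 0
```

With a gap of only two blocks, a guard that is off by one could still pass most of the time; a gap of sixteen makes such a mistake far more likely to surface.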

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ests

Only interval mining needs to be stopped/restored to control the gap
between latest and finalized blocks. Automine is not used since Anvil
is started with l1BlockTime (interval mining).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…llup cheat codes

Adds mineUntilTimestamp which mines real L1 blocks (via hardhat_mine with
a timestamp interval) so finalized block timestamps advance alongside latest.
This prevents epoch-cache's finalized guard from rejecting committees after
time advances in tests.

The method derives the block interval from the last two block timestamps
(to handle anvil_setBlockTimestampInterval overrides), stops interval mining
before the burst, and leaves it stopped so the caller controls when to resume.

Updates rollup cheat codes (advanceToEpoch, advanceToNextEpoch, advanceToNextSlot,
advanceSlots) to use mineUntilTimestamp with automatic interval restore.
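
A rough sketch of the planning step in mineUntilTimestamp, under stated assumptions: `planMineUntil` and `MinePlan` are hypothetical names, and the real method also stops interval mining and issues the RPC call, which is omitted here. The sketch only derives the interval from the last two block timestamps and works out how many blocks a `hardhat_mine` call (block count and interval, both hex-encoded) would need.

```typescript
interface MinePlan {
  blocks: number; // number of blocks to mine
  interval: number; // seconds between mined blocks
  params: [string, string]; // hex-encoded [count, interval] for hardhat_mine
}

function planMineUntil(prevTs: number, latestTs: number, targetTs: number): MinePlan {
  // Derive the interval from the chain itself so timestamp-interval
  // overrides are respected; fall back to 1s if the timestamps coincide.
  const interval = Math.max(1, latestTs - prevTs);
  const remaining = targetTs - latestTs;
  // Round up so the last mined block lands at or after the target.
  const blocks = remaining > 0 ? Math.ceil(remaining / interval) : 0;
  const toHex = (n: number): string => '0x' + n.toString(16);
  return { blocks, interval, params: [toHex(blocks), toHex(interval)] };
}

const plan = planMineUntil(1000, 1012, 1100);
console.log(plan.blocks, plan.interval, plan.params); // 8 12 [ '0x8', '0xc' ]
```

The exact timestamp of the first mined block depends on the node's behavior, so this sketch only fixes the count and interval.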

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@spalladino spalladino force-pushed the palla/fix/epoch-cache-finalized-guard branch from a013a5b to 97da230 March 30, 2026 21:54
spalladino and others added 2 commits March 30, 2026 19:15
…finalized block

warpL2TimeAtLeastTo used warp (single block jump), causing the finalized
L1 block to lag behind after large time jumps. This triggered
EpochNotFinalizedError in the epoch cache, blocking the sequencer from
building blocks after the warp.

Switches to mineUntilTimestamp which mines real blocks at the ethereum
slot interval so finalized advances alongside latest.
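
The difference between the two approaches can be illustrated with a toy chain model. Everything here is an assumption made for illustration: the chain is just a list of block timestamps, finality is taken to be 16 blocks deep, and `warp` and `mineUntil` are simplified stand-ins for the real cheat codes.

```typescript
const FINALITY_DEPTH = 16; // assumed: two epochs of 8 slots each

// Timestamp of the finalized block, for a chain given as block timestamps.
function finalizedTimestamp(chain: number[]): number {
  const index = Math.max(0, chain.length - 1 - FINALITY_DEPTH);
  return chain[index];
}

// A warp appends a single block at the target timestamp.
function warp(chain: number[], targetTs: number): number[] {
  return chain.concat([targetTs]);
}

// Mining appends a block every `interval` seconds until the target is reached.
function mineUntil(chain: number[], targetTs: number, interval: number): number[] {
  const out = chain.slice();
  while (out[out.length - 1] < targetTs) {
    out.push(out[out.length - 1] + interval);
  }
  return out;
}

// Twenty blocks, 12 seconds apart: timestamps 0, 12, ..., 228.
const base: number[] = [];
for (let i = 0; i < 20; i++) base.push(i * 12);

const target = 228 + 3600; // jump one hour ahead

console.log(finalizedTimestamp(warp(base, target))); // 48
console.log(finalizedTimestamp(mineUntil(base, target, 12))); // 3636
```

After the warp, latest is an hour ahead but the finalized block is still near the start of the chain, so any guard that reads the finalized timestamp sees stale time. After mining, finalized trails latest by only 16 blocks.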

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eouts

For large time jumps (e.g., 1 day in crossTimestampOfChange), mining at
the ethereum slot interval (12s) would require thousands of blocks,
causing Anvil to time out. Caps at ~100 blocks and spreads the interval
to cover the full jump.
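
The capping rule can be sketched as follows. `planCappedMine` and `CappedPlan` are hypothetical names, and the cap of exactly 100 is an assumption (the commit message says "~100").

```typescript
const ETHEREUM_SLOT_SECONDS = 12;
const MAX_BLOCKS = 100; // the commit says "~100"; the exact cap is assumed

interface CappedPlan {
  blocks: number;
  interval: number; // seconds between mined blocks
}

function planCappedMine(latestTs: number, targetTs: number): CappedPlan {
  const remaining = targetTs - latestTs;
  if (remaining <= 0) {
    return { blocks: 0, interval: ETHEREUM_SLOT_SECONDS };
  }
  const uncapped = Math.ceil(remaining / ETHEREUM_SLOT_SECONDS);
  if (uncapped <= MAX_BLOCKS) {
    // Small jump: mine at the normal slot interval.
    return { blocks: uncapped, interval: ETHEREUM_SLOT_SECONDS };
  }
  // Large jump: mine the maximum number of blocks and widen the interval
  // so the burst still covers the full distance.
  return { blocks: MAX_BLOCKS, interval: Math.ceil(remaining / MAX_BLOCKS) };
}

console.log(planCappedMine(0, 600)); // { blocks: 50, interval: 12 }
console.log(planCappedMine(0, 86400)); // { blocks: 100, interval: 864 }
```

For a one-day jump, the uncapped plan would need 7200 blocks at 12 seconds; the capped plan mines 100 blocks at 864 seconds each and covers the same 86400 seconds.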

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Labels

backport-to-v4-next
ci-no-fail-fast: Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure

2 participants