WIP: Gloas upgrade#18956

Open
domiwei wants to merge 89 commits into main from feature/caplin_gloas

Conversation

Member

@domiwei domiwei commented Feb 4, 2026

No description provided.

@domiwei domiwei requested a review from Giulio2002 as a code owner February 4, 2026 06:47
@domiwei domiwei changed the title WIP: Gloas hardfork WIP: Gloas upgrade Feb 4, 2026
@domiwei domiwei force-pushed the feature/caplin_gloas branch from e6788aa to 93cc1fd Compare February 9, 2026 03:31
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 26f539f to ec9aa2c Compare February 9, 2026 19:47
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 5c97886 to 6bd1123 Compare February 10, 2026 10:17
@domiwei domiwei force-pushed the feature/caplin_gloas branch 2 times, most recently from 457288b to ead33e8 Compare February 16, 2026 05:25
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 3b3cea8 to 1bae0e0 Compare February 24, 2026 17:19
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 1bae0e0 to 29ba4f9 Compare February 25, 2026 08:18
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 18b0a57 to 37c12f7 Compare March 2, 2026 07:29
@domiwei domiwei requested a review from sudeepdino008 as a code owner March 3, 2026 09:34
domiwei and others added 2 commits April 1, 2026 07:51
Emit SSE events for three new EIP-7732 gossip topics:
payload_attestation_message, execution_payload_bid, execution_payload_available.
Also fix PayloadStatus enum to match spec: Pending=0, Full=1, Empty=2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d sync fixes, and P2P hardening

GLOAS envelope fetching: add by-range fallback after by-root failures, return full
blocks from determineFullGloasBlocks for slot-based requests. Forward sync: add 30s
request timeout, advance past empty slot gaps, update P2P status during sync, and set
head state on stale handoff so ChainTipSync can receive gossip blocks.

P2P fixes: separate transport errors from fork mismatches in handshake validation
(keep peer on stream reset, disconnect on wrong fork), fix StatusV2 to include
EarliestAvailableSlot (92 bytes), fix StateVersionByForkDigest to error on unknown
digest instead of silently returning FuluVersion.

Also: improve error messages in httpreqresp and rpc, add peer disconnect trace log,
remove devnet-only workarounds (HTTP finality fetch, multi-peer retry, bootnode skip,
metadata V2 fallback, gossip relaxation during sync).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
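The StateVersionByForkDigest change described above (return an error for an unknown digest rather than a default version) can be sketched as follows. This is a minimal illustration, not the code in this PR: the `ForkDigest` and `StateVersion` types, the lookup map, and the lower-case function name are all assumptions made for the example.

```go
package main

import "fmt"

// StateVersion is a hypothetical stand-in for the client's fork enum.
// Values start at 1 so that the zero value is never a valid version.
type StateVersion int

const (
	FuluVersion StateVersion = iota + 1
	GloasVersion
)

// ForkDigest is the 4-byte digest that identifies a fork on the wire.
type ForkDigest [4]byte

// stateVersionByForkDigest resolves a digest to a version and reports an
// error for unknown digests instead of silently picking a default.
func stateVersionByForkDigest(known map[ForkDigest]StateVersion, digest ForkDigest) (StateVersion, error) {
	version, ok := known[digest]
	if !ok {
		return 0, fmt.Errorf("unknown fork digest %x", digest)
	}
	return version, nil
}

func main() {
	// The digests below are made-up values for illustration only.
	known := map[ForkDigest]StateVersion{
		ForkDigest{0x01, 0x02, 0x03, 0x04}: FuluVersion,
		ForkDigest{0xaa, 0xbb, 0xcc, 0xdd}: GloasVersion,
	}
	fmt.Println(stateVersionByForkDigest(known, ForkDigest{0xaa, 0xbb, 0xcc, 0xdd}))
	fmt.Println(stateVersionByForkDigest(known, ForkDigest{0xde, 0xad, 0xbe, 0xef}))
}
```

Surfacing the error lets the caller decide whether to drop the message or disconnect the peer, which a silent fallback to an older version hides.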
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 0e0249c to 7a293b8 Compare April 8, 2026 08:30
domiwei and others added 4 commits April 9, 2026 02:05
…st data bug)

The consensus-specs v1.7.0-alpha.4 proposer_boost and proposer_boost_is_first_block
test data uses tick=51 for GLOAS, but the GLOAS BPS timing refactor lowered the
attestation threshold to 3000ms (ATTESTATION_DUE_BPS_GLOAS=2500). At tick=51 the
block arrives exactly at 3000ms into the slot, and 3000 < 3000 = false (not timely),
so the proposer boost is not applied. The tick should be 50 for GLOAS. Our
implementation matches the spec; the test data was not regenerated for the new timing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
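The arithmetic behind the threshold in the commit message above can be checked with a short sketch. It assumes a 12-second slot and that basis points are taken out of 10000; the function names are hypothetical and the strict `<` comparison is the one quoted in the message.

```go
package main

import "fmt"

const (
	slotDurationMs         = 12_000 // assumption: mainnet slot length
	basisPoints            = 10_000
	attestationDueBpsGloas = 2_500 // value quoted in the commit message
)

// attestationDueMs converts a basis-point fraction of the slot into milliseconds.
func attestationDueMs(bps uint64) uint64 {
	return bps * slotDurationMs / basisPoints
}

// isTimely reports whether a block that arrived arrivalMs into its slot
// is strictly before the attestation deadline.
func isTimely(arrivalMs, bps uint64) bool {
	return arrivalMs < attestationDueMs(bps)
}

func main() {
	fmt.Println(attestationDueMs(attestationDueBpsGloas)) // 3000
	fmt.Println(isTimely(3000, attestationDueBpsGloas))   // false: arrival exactly on the deadline
	fmt.Println(isTimely(2999, attestationDueBpsGloas))   // true
}
```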
…n routing (V5/V6)

Add NewPayloadV5 and GetPayloadV6 to the EngineAPI interface and CL-side
RPC client, fixing two bugs in the CL→EL version routing for GLOAS:
- NewPayload for GloasVersion fell through to the error default instead of routing to V5
- GetAssembledBlock for GloasVersion was caught by the >= FuluVersion branch, which routed
  it to V5 instead of V6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
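The second routing bug above is a common pitfall with range checks on an ordered version enum: a `>= FuluVersion` branch also matches every later fork. A minimal sketch of the ordering that avoids it is shown below. The enum, the error value, and the returned method labels are illustrative assumptions, and older forks are omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// StateVersion is a hypothetical stand-in for the client's fork enum.
type StateVersion int

const (
	ElectraVersion StateVersion = iota
	FuluVersion
	GloasVersion
)

var errUnsupportedVersion = errors.New("unsupported state version")

// getPayloadMethod picks the GetPayload variant for a fork. The newest fork
// is checked first so that the Fulu range check cannot swallow GLOAS.
func getPayloadMethod(v StateVersion) (string, error) {
	switch {
	case v >= GloasVersion:
		return "GetPayloadV6", nil
	case v >= FuluVersion:
		return "GetPayloadV5", nil
	default:
		// Older forks are left out of this sketch.
		return "", errUnsupportedVersion
	}
}

func main() {
	for _, v := range []StateVersion{FuluVersion, GloasVersion} {
		method, err := getPayloadMethod(v)
		fmt.Println(v, method, err)
	}
}
```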
…ounds hardening

- Add bounds checks in IsActiveBuilder, CanBuilderCoverBid, and
  InitiateBuilderExit before calling builders.Get() to prevent
  remote DoS panic from malicious builder_index values.
- Guard ExecutionRequests parsing with version check to prevent
  nil deref in GLOAS block production (requests live in envelope).
- Skip MEV-Boost blinded block path for GLOAS (Blinded() returns
  ErrGloasCannotBlind); builders use ePBS gossip bids instead.
- Populate SignedExecutionPayloadBid for GLOAS self-build with
  correct spec fields (BuilderIndexSelfBuild, InfiniteSignature,
  slot, parentBlockHash, parentBlockRoot, prevRandao).
- Route blob KZG commitments into the bid for GLOAS blocks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
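The bounds checks in the first bullet follow a simple pattern: validate an untrusted index against the list length before indexing. The sketch below is a simplified illustration. The `Builder` struct and the coverage rule are assumptions and do not reproduce the spec's full balance accounting.

```go
package main

import "fmt"

// Builder is a hypothetical, simplified builder record.
type Builder struct {
	Balance uint64
	Active  bool
}

// lookupBuilder returns the builder at index, or false when the index is
// out of range, so callers never index past the end of the list.
func lookupBuilder(builders []Builder, index uint64) (Builder, bool) {
	if index >= uint64(len(builders)) {
		return Builder{}, false
	}
	return builders[index], true
}

// canBuilderCoverBid treats an unknown builder index as "cannot cover"
// rather than panicking on attacker-controlled input.
func canBuilderCoverBid(builders []Builder, index, bid uint64) bool {
	b, ok := lookupBuilder(builders, index)
	return ok && b.Active && b.Balance >= bid
}

func main() {
	builders := []Builder{{Balance: 100, Active: true}}
	fmt.Println(canBuilderCoverBid(builders, 0, 50))
	fmt.Println(canBuilderCoverBid(builders, 1<<40, 50))
}
```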
Implement the "defer payload processing to next block" change from
consensus-specs PR #5094 (merged Apr 16, 2026). This eliminates the
dual-state model (pre-envelope / post-envelope) in favor of a single
canonical post-state per block root.

Key changes:

Types & SSZ:
- Add ParentExecutionRequests field to BeaconBody (GLOAS-only)
- Add ExecutionRequestsRoot to ExecutionPayloadBid
- Remove StateRoot from ExecutionPayloadEnvelope
- Swap latest_block_hash / latest_execution_payload_bid positions (specs#5113)
- Fix PayloadStatus constant values to match spec ordering (EMPTY=0, FULL=1, PENDING=2)

State Transition:
- New ProcessParentExecutionPayload (called first in ProcessBlock, before header)
- New ApplyParentExecutionPayload (extracted helper for verify/apply split)
- ProcessExecutionPayloadEnvelope becomes verification-only (no state mutations)
- process_withdrawals uses inline FULL/EMPTY check
- Epoch-aware builder payment with old-epoch direct withdrawal fallback

Fork Choice:
- Remove GetFullStateAtBlockRoot / GetExecutionPayloadState (single state model)
- Simplify on_block parent state selection (no FULL state branching)
- on_execution_payload: no state copy, verification only, stores envelope
- Invalidate head payload status cache after envelope persistence
- Add is_payload_verified (HasEnvelope) guard in ShouldExtendPayload
- Add HasEnvelope check for PTC attestations in validate_on_attestation

Block Production:
- Use HasEnvelope && ShouldExtendPayload per spec's prepare_execution_payload
- FULL path: copy state, apply parent execution payload, compute withdrawals
- EMPTY path: use cached payload_expected_withdrawals
- Hard error if head is FULL but envelope missing

Forward Sync:
- Determine FULL/EMPTY from consecutive block bids (count+1 lookahead)
- Only request envelopes for FULL blocks via RequestEnvelopesFrantically
- Remove anchor block from fork graph (not needed in single-state model)

Cleanup:
- Remove FetchFinalizedBlock from checkpoint sync
- Remove IsParentBlockFull dead code
- Remove findMissingEnvelopeRoots from chain_tip_sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yperbasis yperbasis removed the dependencies Pull requests that update a dependency file label Apr 17, 2026
…cs, and spec deviations

Squash of fix/gloas-review-items addressing PR #18956 review findings:

- Fix pendingCond.Wait() deadlock on context cancellation in gossip services
- Return ErrIgnore (not nil) for already-seen data column sidecars
- Fix SSZ EncodingSizeSSZ +4 offsets for variable-length ePBS types
- Fix DecodeSSZ nil pointer deref via beaconCfg propagation
- Fix getPayloadStatusTiebreaker to match EIP-7732 spec
- Fix data race in OnPayloadAttestationMessage (GetState alwaysCopy=true)
- Filter unsolicited envelope responses by requested root set
- Return ErrIgnore when block not found for pending envelopes
- Deep copy BlobKzgCommitments in ExecutionPayloadBid.Copy()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
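The pendingCond.Wait() item touches a general Go issue: `sync.Cond` has no notion of cancellation, so a waiter stays blocked unless something broadcasts when the context ends. One way to handle this, sketched below with a hypothetical `pendingQueue` type, is to register a `context.AfterFunc` callback (Go 1.21+) that takes the lock and broadcasts. It is not necessarily the approach taken in this PR.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// pendingQueue is a hypothetical queue guarded by a mutex and a condition variable.
type pendingQueue struct {
	mu    sync.Mutex
	cond  *sync.Cond
	items []string
}

func newPendingQueue() *pendingQueue {
	q := &pendingQueue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

func (q *pendingQueue) push(item string) {
	q.mu.Lock()
	q.items = append(q.items, item)
	q.mu.Unlock()
	q.cond.Signal()
}

// pop blocks until an item is available or ctx is cancelled.
func (q *pendingQueue) pop(ctx context.Context) (string, error) {
	// Wake waiters on cancellation. Taking the lock before broadcasting
	// ensures the wake-up cannot slip between the ctx check and Wait.
	stop := context.AfterFunc(ctx, func() {
		q.mu.Lock()
		defer q.mu.Unlock()
		q.cond.Broadcast()
	})
	defer stop()

	q.mu.Lock()
	defer q.mu.Unlock()
	for len(q.items) == 0 {
		if err := ctx.Err(); err != nil {
			return "", err
		}
		q.cond.Wait()
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item, nil
}

// cancelledPopResult cancels a blocked pop and reports what it returned.
func cancelledPopResult() string {
	q := newPendingQueue()
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan error, 1)
	go func() {
		_, err := q.pop(ctx)
		done <- err
	}()
	cancel()
	select {
	case err := <-done:
		if err != nil {
			return err.Error()
		}
		return "no error"
	case <-time.After(2 * time.Second):
		return "timed out"
	}
}

func main() {
	q := newPendingQueue()
	q.push("envelope")
	fmt.Println(q.pop(context.Background()))
	fmt.Println(cancelledPopResult())
}
```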
domiwei added a commit that referenced this pull request Apr 18, 2026
domiwei and others added 7 commits April 19, 2026 04:23
Squash merge of fix/gloas-spec-compliance branch (6 commits):

Spec alignment:
- Add block_access_list (EIP-7928) and slot_number (EIP-7843) to ExecutionPayload
- Remove non-spec Slot field from ExecutionPayloadEnvelope
- Add missing genesis checks in GLOAS state transition
- Enforce ByteListSSZ max length and fix M1 parent_bid.slot
- Add block.slot == payload.slot_number validation (EIP-7843)

Audit fixes:
- historical_states_reader: GLOAS version guard to prevent nil panic
- block_production: use GetLatestBlockHash() for GLOAS builder parentHash
- clstages: skip ExecutionPayload for GLOAS genesis blocks
- spectests: use GetBlobKzgCommitments() accessor for GLOAS compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Persist and reconstruct all 9 GLOAS BeaconState fields so that
ReadHistoricalState produces correct block roots for GLOAS-era slots.

Storage strategies:
- SlotData: latestBlockHash, nextWithdrawalBuilderIndex
- Per-slot compressed SSZ: latestExecutionPayloadBid, builderPendingPayments,
  executionPayloadAvailability, ptcWindow
- Dump+diff queues: builders, builderPendingWithdrawals,
  payloadExpectedWithdrawals

Also includes:
- DB schema version 7 migration
- Snapshot table registration
- ErrMissingGloasData sentinel with antiquary backoff
- readCompressedSSZ passes actual state version (not hardcoded 0)
- ReadRequiredQueueSSZ inlined to avoid double dump-table read
- Comments on NewEth1Block constructors re: empty BlockAccessList
- Comprehensive roundtrip tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…spectests

Upgrade spectest fixtures from v1.7.0-alpha.4 to v1.7.0-alpha.5 and
resolve all test failures across 8 forks (phase0 through gloas).

Key changes:

- Fix Eth1Header SSZ schema: remove incorrect GLOAS gate that added
  block_access_list_root and slot_number to ExecutionPayloadHeader
  (spec only includes them on ExecutionPayload). Fixes 10 LightClient
  "bad encoding size" failures.

- Fix ExtraData nil panics: use NewEth1Header() in Clone/DecodeSSZ/New
  and add nil-receiver guards on ExtraData SSZ methods.

- Fix envelope validation: use ValidatingMachine (BLS enabled) in
  applyEnvelope, return error for unknown beacon block roots.

- Handle execution_valid=false: read execution.yaml metadata to skip
  CL-only tests where EL rejection cannot be simulated. Fixes 23
  failures across bellatrix-fulu.

- Cache SHA256 per i/16 group in ComputeBalanceWeightedSelection and
  computeProposerIndexElectra, reducing hash calls from 268M to 16.8M.
  GLOAS historical_accumulator test: 62s → 14s.

- Implement gossip_attester_slashing and gossip_proposer_slashing
  spectest handlers with full three-state validation (valid/reject/ignore).

- Register missing spectest handlers: execution_payload,
  parent_execution_payload, on_execution_payload_envelope,
  get_parent_payload_status, and several ssz_static types.

- Un-skip proposer_boost fork choice tests (alpha.5 fixed tick values).

- Fix RlpHeader for GLOAS: populate BlockAccessListHash (EIP-7928) and
  SlotNumber (EIP-7843) to prevent hash mismatches in NewPayload, sync,
  and block production.

- Fix ListObjectSSZRoot deadlock in concurrent SSZ hashing.

Results: 8 forks, 0 failures, 1 fixture-limitation skip (wrong_withdrawals).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
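The SHA256 caching item above relies on the fact that, in the Electra-style selection loop, one 32-byte digest supplies sixteen 2-byte random values, so the hash only needs recomputing when `i/16` changes. The sketch below shows that idea in isolation. The type and function names are hypothetical, and the byte layout (seed followed by the little-endian group index) reflects my reading of the spec rather than code from this PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// groupDigest hashes the seed together with the little-endian group index.
func groupDigest(seed [32]byte, group uint64) [32]byte {
	var buf [40]byte
	copy(buf[:32], seed[:])
	binary.LittleEndian.PutUint64(buf[32:], group)
	return sha256.Sum256(buf[:])
}

// uncachedValue hashes on every call.
func uncachedValue(seed [32]byte, i uint64) uint16 {
	d := groupDigest(seed, i/16)
	off := (i % 16) * 2
	return binary.LittleEndian.Uint16(d[off : off+2])
}

// cachedSource keeps the digest of the most recent group and counts hashes.
type cachedSource struct {
	seed   [32]byte
	group  uint64
	digest [32]byte
	valid  bool
	hashes int
}

// value returns the same result as uncachedValue but rehashes only when
// the group (i/16) differs from the cached one.
func (c *cachedSource) value(i uint64) uint16 {
	if group := i / 16; !c.valid || c.group != group {
		c.digest = groupDigest(c.seed, group)
		c.group, c.valid = group, true
		c.hashes++
	}
	off := (i % 16) * 2
	return binary.LittleEndian.Uint16(c.digest[off : off+2])
}

// compare walks i = 0..n-1 and reports the number of hashes the cache
// needed and whether every value matched the uncached computation.
func compare(n uint64) (int, bool) {
	var seed [32]byte
	seed[0] = 0x42
	src := &cachedSource{seed: seed}
	same := true
	for i := uint64(0); i < n; i++ {
		if src.value(i) != uncachedValue(seed, i) {
			same = false
		}
	}
	return src.hashes, same
}

func main() {
	fmt.Println(compare(64))
}
```

For sequential access this cuts the hash count by a factor of 16, which is consistent with the 268M to 16.8M reduction reported in the commit message.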
…on, and spec compliance

Squash merge of fix/caplin-gloas-audit-fixes into feature/caplin_gloas.

Fixes:
- cl/phase1/core/state/raw: fix baseOffsetSSZ GLOAS constant (2741117 → 3134333)
  and refactor EncodingSizeSSZ to stop double-counting fixed-size fields already
  in the base offset. Properly scope inactivityScores (Altair+), historicalSummaries
  (Capella+), and latestExecutionPayloadHeader (Bellatrix–Fulu only).

- cl/sentinel/handlers: fix envelope handler tests that used a hardcoded
  startSlot=100, far below the minimum serve epoch derived from wall-clock time.

- cl/beacon/handler: populate ExecutionRequestsRoot in self-build bid by decoding
  the EL's requestsBundle and computing hash_tree_root(ExecutionRequests).

- cl/beacon/handler: broadcast self-built execution payload envelope on the
  execution_payload gossip topic so blocks transition from PENDING to FULL.
  Uses an LRU cache to bridge payload data between produceBeaconBody and
  broadcastBlock (envelope needs the BeaconBlockRoot from the final block).

- cl/beacon/handler: select highest external builder bid from epbsPool when it
  exceeds local execution value, with proper ExecutionValue update in the API
  response.

- cl/beacon/handler: proper envelope signing via validator client following
  beacon-APIs PR #580 pattern. Returns unsigned envelope in the v3 block
  production response; accepts signed envelope in the v2 block publish request.
  Falls back to InfiniteSignature with warning when no signed envelope provided.

- cl/das: update stale TODO on GLOAS inclusion proof — consensus-specs
  v1.7.0-alpha.5 removes KzgCommitmentsInclusionProof from DataColumnSidecar;
  commitments verified via builder bid instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add missing GLOAS block to dynamic baseOffsetSSZ() (latestBlockHash,
  builders, builderPendingPayments, ptcWindow, etc.) — verified to match
  the hardcoded 3134333 for mainnet preset
- Restore peerDas nil check in OnBlock EL blob availability path
- Use config-aware MaxBlobCommittmentsPerBlock in NewBeaconBody
- Add missing cltypes import in cache_accessors.go
- Remove unused solid import in clstages.go
- Remove duplicate PeerDas initialization in run.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore PayloadStatusNone case in on_block.go switch (defence-in-depth
  against EL returning unknown status string via Engine API)
- Restore peerDas nil guard for Fulu data availability check to prevent
  nil pointer dereference when peerDas is not initialized
- Fix proposerLookahead shallow copy in BeaconState.CopyInto to use
  CopyTo (prevents Merkle tree data race between copied states)
- Gate forward sync stale timeout by fork version: 5min for pre-GLOAS
  chains (more conservative), 2min for GLOAS (handles new error paths)
- Remove devnet debug logging from inserters.go (WriteTd read-back
  verification on hot path, hardcoded block range 3220-3250 log)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BuilderPendingPayment.EncodingSizeSSZ() returned 8 when Withdrawal was
nil, but the type declares Static() == true — its SSZ size must be
constant (44 = 8 weight + 36 withdrawal).  The nil guard caused
baseOffsetSSZ() to under-count by 2304 bytes (64 elements × 36 byte
shortfall), breaking GLOAS BeaconState SSZ encode/decode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
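The size arithmetic in the message above can be reproduced with a small sketch. The types are stripped-down stand-ins (the withdrawal fields are omitted) and `buggyEncodingSize` is a reconstruction of the described behaviour, not the original code.

```go
package main

import "fmt"

const (
	weightSize     = 8  // uint64 weight
	withdrawalSize = 36 // fixed-size withdrawal container, per the commit message
	paymentSize    = weightSize + withdrawalSize
	paymentCount   = 64 // element count quoted in the commit message
)

// BuilderPendingWithdrawal is a placeholder; its fields are omitted here.
type BuilderPendingWithdrawal struct{}

type BuilderPendingPayment struct {
	Weight     uint64
	Withdrawal *BuilderPendingWithdrawal
}

// Static reports that the type has a constant SSZ size.
func (p *BuilderPendingPayment) Static() bool { return true }

// EncodingSizeSSZ is constant, regardless of whether Withdrawal is nil.
func (p *BuilderPendingPayment) EncodingSizeSSZ() int { return paymentSize }

// buggyEncodingSize mimics the old nil guard that shrank the size.
func buggyEncodingSize(p *BuilderPendingPayment) int {
	if p.Withdrawal == nil {
		return weightSize
	}
	return paymentSize
}

// vectorSize sums the per-element size over a fixed-length vector.
func vectorSize(payments []BuilderPendingPayment, size func(*BuilderPendingPayment) int) int {
	total := 0
	for i := range payments {
		total += size(&payments[i])
	}
	return total
}

func main() {
	payments := make([]BuilderPendingPayment, paymentCount)
	fixed := vectorSize(payments, (*BuilderPendingPayment).EncodingSizeSSZ)
	buggy := vectorSize(payments, buggyEncodingSize)
	fmt.Println(fixed, buggy, fixed-buggy) // 2816 512 2304
}
```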
@domiwei domiwei force-pushed the feature/caplin_gloas branch from 2f8e480 to 3ba600d Compare April 21, 2026 09:43
NewForkGraphDisk persists the filled-in header Root back to the anchor
state to prevent transitionSlot from recomputing HashSSZ.  This is only
needed for GLOAS where BlockRoot() uses the stored Root directly; for
pre-GLOAS versions BlockRoot() always re-derives via HashSSZ, so the
mutation makes the second BlockRoot call (in NewForkChoiceStore) return
a different anchor root, breaking TestForkChoiceChainBellatrix.

Gate the SetLatestBlockHeader call on Version >= GloasVersion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@domiwei domiwei force-pushed the feature/caplin_gloas branch from aaf5552 to bb8d2d3 Compare April 22, 2026 04:28
domiwei and others added 7 commits April 22, 2026 04:41
…l panic

NewEth1Block left Extra, Transactions, and Withdrawals nil, but getSchema()
references them during HashSSZ(). This caused a nil pointer dereference when
queuePendingEnvelope tried to hash an envelope with a freshly-constructed
Eth1Block. Initialize these fields to match NewEth1BlockFromExecutionHeader.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous commit eagerly initialized Extra, Transactions, and Withdrawals
in NewEth1Block, which changed JSON serialization from null to empty values
and broke beacon handler golden file tests (TestHarnessPhase0 blocks).

Move the nil guards into a shared ensureSSZFields() called by HashSSZ and
EncodeSSZ. This prevents the nil pointer panic in queuePendingEnvelope
without altering the JSON representation of freshly constructed blocks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
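The trade-off described in the two messages above (eager initialization changes JSON output, lazy guards do not) can be seen in a reduced example. The `block` and `extraData` types below are hypothetical, and `encodedLen` stands in for HashSSZ/EncodeSSZ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extraData is a hypothetical field type that SSZ code dereferences.
type extraData struct{ bytes []byte }

type block struct {
	Extra *extraData `json:"extra"`
}

// ensureSSZFields fills nil fields right before SSZ code needs them,
// leaving freshly constructed values untouched until then.
func (b *block) ensureSSZFields() {
	if b.Extra == nil {
		b.Extra = &extraData{}
	}
}

// encodedLen stands in for an SSZ method; without the guard it would
// dereference a nil Extra.
func (b *block) encodedLen() int {
	b.ensureSSZFields()
	return len(b.Extra.bytes)
}

func toJSON(b *block) string {
	out, err := json.Marshal(b)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	fresh := &block{}
	fmt.Println(toJSON(fresh))      // {"extra":null}
	fmt.Println(fresh.encodedLen()) // 0
}
```

Note that in this sketch the guard mutates the receiver, so JSON produced after an SSZ call would no longer show `null`; only freshly constructed values keep their original representation.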
…coring

Fix two EIP-7732 spec compliance issues found during audit:

1. Return HTTP 400 (not 500) when requesting blinded blocks for GLOAS-era
   blocks, since the blinded block concept does not apply post-GLOAS.
2. Add GossipSub topic scoring params for execution_payload_bid,
   payload_attestation_message, and proposer_preferences topics. Includes
   substring ordering fix (bid checked before payload to avoid false match).
   GLOAS weights excluded from maxScore() to avoid inflating penalties on
   pre-GLOAS forks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
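The substring ordering fix in item 2 matters because the name `execution_payload_bid` contains `execution_payload`, so a substring match on the shorter name also matches the bid topic. A minimal sketch, with a hypothetical classifier function and a made-up fork digest in the topic strings:

```go
package main

import (
	"fmt"
	"strings"
)

// topicKind classifies a gossip topic by substring. The longer, more
// specific name is tested first; otherwise the bid topic would be
// classified as execution_payload.
func topicKind(topic string) string {
	switch {
	case strings.Contains(topic, "execution_payload_bid"):
		return "execution_payload_bid"
	case strings.Contains(topic, "execution_payload"):
		return "execution_payload"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(topicKind("/eth2/01020304/execution_payload_bid/ssz_snappy"))
	fmt.Println(topicKind("/eth2/01020304/execution_payload/ssz_snappy"))
}
```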
…5125)

When a GLOAS block has a pre-GLOAS (Fulu) parent at the fork boundary,
getParentPayloadStatus accessed GetSignedExecutionPayloadBid() on the
parent block which returns nil for pre-GLOAS blocks. The nil check then
returned PayloadStatusEmpty instead of the correct PayloadStatusPending.

Add a version check early — after getting the parent block but before
accessing GetSignedExecutionPayloadBid(). If the parent block's version
is less than GloasVersion, return PayloadStatusPending.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
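A compact sketch of the early version check follows. The types are hypothetical, the status values use the EMPTY=0, FULL=1, PENDING=2 ordering mentioned earlier in this PR, and the FULL/EMPTY rule for GLOAS parents (comparing the child's bid parent hash with the parent's bid block hash) is my reading of the spec rather than something stated in this commit.

```go
package main

import "fmt"

type StateVersion int

const (
	FuluVersion StateVersion = iota
	GloasVersion
)

type PayloadStatus uint8

const (
	PayloadStatusEmpty PayloadStatus = iota
	PayloadStatusFull
	PayloadStatusPending
)

// Bid and Block are hypothetical, stripped-down stand-ins.
type Bid struct {
	ParentBlockHash [32]byte
	BlockHash       [32]byte
}

type Block struct {
	Version StateVersion
	Bid     *Bid // nil for pre-GLOAS blocks
}

// parentPayloadStatus checks the parent's fork version before touching its
// bid, so a Fulu parent at the fork boundary yields PENDING, not EMPTY.
func parentPayloadStatus(parent, child Block) PayloadStatus {
	if parent.Version < GloasVersion {
		return PayloadStatusPending
	}
	if parent.Bid == nil || child.Bid == nil {
		return PayloadStatusEmpty
	}
	if child.Bid.ParentBlockHash == parent.Bid.BlockHash {
		return PayloadStatusFull
	}
	return PayloadStatusEmpty
}

func main() {
	fuluParent := Block{Version: FuluVersion}
	child := Block{Version: GloasVersion, Bid: &Bid{}}
	fmt.Println(parentPayloadStatus(fuluParent, child) == PayloadStatusPending)
}
```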
The spec simplified apply_parent_execution_payload by removing the
parent_bid parameter. The function now derives it internally from
state via s.GetLatestExecutionPayloadBid().

Update the interface in machine.go, the implementation in operations.go,
and all call sites (ProcessParentExecutionPayload and block_production.go).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…compliance

Address verified issues from PR #18956 review and Beacon API spec audit:

- forkchoice: eliminate TOCTOU race in OnBlock pending envelope processing by
  extracting applyEnvelopeLocked and calling it while f.mu is held, deferring
  DB index writes to after unlock to avoid deadlock
- fork_graph: add bounds checking in readBeaconStateFromDisk and
  ReadEnvelopeFromDisk to prevent panic on corrupt file lengths, with automatic
  buffer growth when needed
- beacon/handler: add dependent_root and execution_optimistic to PTC duties
  response, matching attester/proposer duty patterns
- beacon/handler: change POST pool/payload_attestations to accept array input
  with per-item error tracking, matching existing pool endpoint semantics
- beacon/handler: add WithOptimistic and WithFinalized metadata to GET
  execution_payload_envelope response
- beaconevents: fix execution_payload_available SSE event to emit flat
  {slot, block_root} per spec instead of full envelope with version wrapper

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
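The fork_graph item describes validating lengths read from disk before using them to slice a buffer. The on-disk layout used by the PR is not shown here, so the sketch below assumes a hypothetical format: an 8-byte big-endian length prefix followed by the payload. The function names and the `maxLen` limit are likewise illustrative.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readFramed reads one length-prefixed record. It rejects lengths above
// maxLen (a sign of corruption) and grows buf only when it is too small.
func readFramed(r io.Reader, buf []byte, maxLen uint64) ([]byte, error) {
	var hdr [8]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, fmt.Errorf("read length prefix: %w", err)
	}
	n := binary.BigEndian.Uint64(hdr[:])
	if n > maxLen {
		return nil, fmt.Errorf("corrupt length %d exceeds limit %d", n, maxLen)
	}
	if uint64(cap(buf)) < n {
		buf = make([]byte, n)
	}
	buf = buf[:n]
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, fmt.Errorf("read payload: %w", err)
	}
	return buf, nil
}

// readFramedBytes is a convenience wrapper for in-memory data.
func readFramedBytes(data []byte, maxLen uint64) ([]byte, error) {
	return readFramed(bytes.NewReader(data), nil, maxLen)
}

func main() {
	good := []byte{0, 0, 0, 0, 0, 0, 0, 2, 'h', 'i'}
	out, err := readFramedBytes(good, 16)
	fmt.Println(string(out), err)

	corrupt := []byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}
	_, err = readFramedBytes(corrupt, 16)
	fmt.Println(err)
}
```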
…d panic

Fix four issues discovered via Kurtosis local testing with Caplin CL + Erigon EL
in GLOAS (EIP-7732) mode:

1. Block production sent zero head hash to ForkchoiceUpdated at the GLOAS fork
   boundary. The initial bid created by UpgradeToGloas has ParentBlockHash=0x0
   (spec-compliant), but the EMPTY path in produceBeaconBody used this zero value
   as the EL head. Add fork boundary detection (Slot==0 && ParentBlockHash==0x0)
   to use parentBid.BlockHash instead, matching the spec's intent that pre-GLOAS
   blocks always had their payloads executed.

2. GetPayloadV1-V6 panicked on empty payload ID (index out of range [7] with
   length 0). Add decodePayloadID helper that validates the 8-byte length before
   calling binary.BigEndian.Uint64.

3. Block production continued to call GetAssembledBlock with an empty payload ID
   when ForkchoiceUpdated returned nil PayloadId (EL syncing). Add early return
   when idBytes is empty.

4. Gossip topic re-subscription at fork boundary: subscribeUpcomingTopics aborted
   on ErrExpiryInThePast for externally-managed topics (attestation, sync_committee,
   data_column_sidecar), preventing all subsequent topics from being created with the
   new fork digest. Also add missing RegisterTopicValidator for new fork digest topics.

Also add gloas-caplin.io Kurtosis config for local GLOAS testing with Caplin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
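The panic in item 2 is what `binary.BigEndian.Uint64` does when handed fewer than 8 bytes. A length-checking helper along the lines described might look like the sketch below. The name `decodePayloadID` comes from the commit message, but the signature and error text are assumptions.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodePayloadID validates the length before decoding, turning a
// would-be index-out-of-range panic into an ordinary error.
func decodePayloadID(id []byte) (uint64, error) {
	if len(id) != 8 {
		return 0, fmt.Errorf("invalid payload ID length: got %d, want 8", len(id))
	}
	return binary.BigEndian.Uint64(id), nil
}

func main() {
	fmt.Println(decodePayloadID([]byte{0, 0, 0, 0, 0, 0, 1, 2}))
	fmt.Println(decodePayloadID(nil))
}
```

With this in place, the early return from item 3 (skip GetAssembledBlock when the EL returned no payload ID) becomes a matter of checking the error rather than recovering from a panic.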
@domiwei domiwei requested a review from mriccobene as a code owner April 25, 2026 04:35
domiwei and others added 2 commits April 25, 2026 14:51
The standalone caplin binary parsed --sentinel.bootnodes and
--sentinel.staticpeers flags via SentinelCliCfg but never passed them
to CaplinConfig, so the P2P manager never received bootstrap nodes.

Also add gloas-caplin-mixed.io Kurtosis config for testing GLOAS with
a mixed Lighthouse + Caplin setup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
goCheckForkAndResubscribe ran its first fork check immediately on
goroutine start, before RegisterGossipServices had populated any
topics. This caused subscribeUpcomingTopics to iterate an empty list,
then update forkDigest to the upcoming value — permanently missing
the resubscription window.

Move the ticker wait to the top of the loop so the first check happens
after one slot tick (~6s), giving RegisterGossipServices time to
complete. Also upgrade the fork digest change log from Debug to Info
for observability, and log errors from subscribeUpcomingTopics.

Also update gloas-caplin.io to use GLOAS-compatible Lighthouse VC
image (bal-devnet-3) and single-node config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
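The loop restructuring described above (wait for a tick first, then check) can be sketched as follows. The tick source is passed in as a channel so the behaviour is easy to exercise; in production it would be the channel of a `time.Ticker`. The function names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runForkChecks waits for a tick at the top of every iteration, so the
// first check never runs before at least one tick has elapsed.
func runForkChecks(ctx context.Context, ticks <-chan time.Time, check func()) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticks:
		}
		check()
	}
}

// simulate delivers n ticks and returns how many checks ran.
func simulate(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	ticks := make(chan time.Time)
	done := make(chan struct{})
	count := 0
	go func() {
		defer close(done)
		runForkChecks(ctx, ticks, func() { count++ })
	}()
	for i := 0; i < n; i++ {
		ticks <- time.Now()
	}
	cancel()
	<-done // count is safe to read once the loop goroutine has exited
	return count
}

func main() {
	fmt.Println(simulate(0)) // no tick, so no check
	fmt.Println(simulate(3))
}
```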
Labels

Caplin (Caplin: Consensus Layer, Beacon API)
Glamsterdam (https://eips.ethereum.org/EIPS/eip-7773)

Projects

None yet

3 participants