cmd/integration: commitment rebuild with history support#19016
Merged
sudeepdino008 merged 88 commits into main on Mar 21, 2026
Conversation
Force-pushed from `594b302` to `018864d`
sudeepdino008 (Member, Author) commented:

> current iteration (with parallel prefetch for each block) works but is still super slow
…uildCommitmentFilesWithHistory
Windows doesn't allow renaming files that are still open. The `defer comp.Close()` only ran after `Compress()` (which performs the rename), and the decompressor was being opened before the compressor was closed.
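The corrected ordering can be sketched with a toy event log. The `compressor` type and `buildAccessor` function here are illustrative stand-ins, not erigon's actual `db/seg` API; the point is only that the writer's handle is released before the renamed file is reopened.

```go
package main

import "fmt"

// events records the order of file operations so the ordering is checkable.
var events []string

// compressor is an illustrative stand-in for the snapshot file writer.
type compressor struct{}

// Compress finalizes and renames the output file; on Windows the rename
// fails if any handle to the file is still open.
func (c *compressor) Compress() { events = append(events, "rename") }

// Close releases the writer's file handle.
func (c *compressor) Close() { events = append(events, "close") }

// buildAccessor shows the corrected ordering: the compressor is closed
// explicitly before the decompressor reopens the renamed file, instead of
// relying on a defer that only fires after the reopen.
func buildAccessor() []string {
	events = nil
	comp := &compressor{}
	comp.Compress()
	comp.Close() // must run before any reopen, not in a late-firing defer
	events = append(events, "open-decompressor")
	return events
}

func main() { fmt.Println(buildAccessor()) }
```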
…transaction

The `Clone()` method was returning itself without updating the transaction, causing stale transaction usage when `trieContext()` clones the state reader. This affected:

- Commitment history regeneration (`RebuildCommitmentFilesWithHistory`)
- RPC commitment verification (`eth_getProof`)
- Receipts generation with state root computation
Add `RebuildStateReader` to `commitmentdb` that stores a `SharedDomains` reference for proper `Clone()` behavior. This reader:

- Reads commitment from the `SharedDomains` in-memory batch (`LatestStateReader`)
- Reads plain state from history (`HistoryStateReader`)
- `Clone()` creates a new reader with the new tx while preserving `sd` and `plainStateAsOf`

Use `NewRebuildStateReader` in `RebuildCommitmentFilesWithHistory` instead of `CommitmentReplayStateReader`.
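The `Clone()` contract described above can be sketched as follows. The struct shape and `Tx`/`SharedDomains` types are simplified placeholders, not erigon's real definitions; the essential behavior is that `Clone` binds the new transaction while keeping the shared-domains reference and the history cutoff.

```go
package main

import "fmt"

// Tx and SharedDomains are illustrative stand-ins for erigon's kv
// transaction and in-memory domain batch types.
type Tx struct{ id int }
type SharedDomains struct{}

// RebuildStateReader sketch: commitment reads come from the SharedDomains
// batch, plain state reads come from history as of plainStateAsOf.
type RebuildStateReader struct {
	tx             *Tx
	sd             *SharedDomains
	plainStateAsOf uint64
}

// Clone returns a reader bound to the new tx while preserving sd and
// plainStateAsOf. A Clone that returns the receiver unchanged would leave
// warmup goroutines reading through a stale transaction.
func (r *RebuildStateReader) Clone(tx *Tx) *RebuildStateReader {
	return &RebuildStateReader{tx: tx, sd: r.sd, plainStateAsOf: r.plainStateAsOf}
}

func main() {
	r := &RebuildStateReader{tx: &Tx{id: 1}, sd: &SharedDomains{}, plainStateAsOf: 42}
	c := r.Clone(&Tx{id: 2})
	fmt.Println(c.tx.id, c.plainStateAsOf, c.sd == r.sd)
}
```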
Plain state reads come from disk history via HistoryStateReader, not from in-memory batch. Disabling inMemHistoryReads avoids accumulating unnecessary history data in memory.
The variable `keyPos`, used to track key offsets in accessor files, was declared outside the retry loop. When a recsplit collision occurred and the loop retried, `keyPos` retained its value from the previous iteration, causing incorrect key offset tracking in the index.

Fix: move the `keyPos` initialization inside the retry loop so it is reset on each attempt. Similar to the fix in `history.go` (PR #19697).
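A minimal sketch of the fixed loop shape, assuming a simplified model of the accessor build (the `buildIndex` function and collision simulation are hypothetical; the real code lives in the recsplit index builder):

```go
package main

import "fmt"

// buildIndex sketches the fixed retry loop: keyPos is (re)initialized inside
// the loop, so a recsplit-collision retry starts offset tracking from zero
// instead of carrying over the previous attempt's running total.
func buildIndex(keys []string, collideOnce bool) []uint64 {
	retried := false
	for {
		keyPos := uint64(0) // reset on every attempt (the fix)
		offsets := make([]uint64, 0, len(keys))
		for _, k := range keys {
			offsets = append(offsets, keyPos)
			keyPos += uint64(len(k))
		}
		if collideOnce && !retried {
			retried = true // simulate one recsplit collision forcing a retry
			continue
		}
		return offsets
	}
}

func main() {
	// Offsets stay correct even though the first attempt was retried.
	fmt.Println(buildIndex([]string{"ab", "cde"}, true))
}
```

With `keyPos` declared outside the loop, the second attempt would have started at 5 instead of 0, shifting every recorded offset.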
Conflicts resolved:

- `db/state/domain.go`: keep `testHook` from main, remove redundant outer var
- `db/state/entity_integrity_check.go`: keep `disableInterDomain` field from HEAD, drop unused `dirs` field
- `execution/commitment/commitmentdb/reader.go`: use main's `CommitmentReplayStateReader.Clone` fix (don't replace `plainStateReader`), keep `RebuildStateReader` from HEAD

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, `BuildFiles2` runs in a background goroutine and the code immediately proceeds to prune. Since files aren't built yet, the prune sees a stale `commitFilesEndTxNum` and the DB keeps growing (1.1TB+). Use the existing `WaitForFiles()` to block until file building completes, ensuring data is moved from DB to snapshot files before pruning.
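The fix amounts to inserting a barrier between the asynchronous build and the prune. This toy `agg` type mirrors the `BuildFiles2`/`WaitForFiles` names from the PR, but its implementation (and the `prune` helper) is purely illustrative:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// agg is a toy stand-in for the aggregator: BuildFiles2 runs in a background
// goroutine, WaitForFiles blocks until it finishes.
type agg struct {
	wg    sync.WaitGroup
	built bool
}

func (a *agg) BuildFiles2() {
	a.wg.Add(1)
	go func() {
		defer a.wg.Done()
		time.Sleep(10 * time.Millisecond) // simulate file building
		a.built = true
	}()
}

func (a *agg) WaitForFiles() { a.wg.Wait() }

// prune is only safe once files exist; otherwise it would act on a stale
// view of what has been moved out of the DB.
func (a *agg) prune() string {
	if !a.built {
		return "stale prune: files not built"
	}
	return "pruned"
}

func main() {
	a := &agg{}
	a.BuildFiles2()
	a.WaitForFiles() // the fix: block here before pruning
	fmt.Println(a.prune())
}
```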
Replace adaptive block-based batching with step-based batching. Each iteration processes exactly one step's worth of blocks, then flushes and builds files. This ensures predictable memory usage and file sizes, since the old `memBatch` size metric only tracked latest state (~2GB) while history data (5-12GB per step) was invisible to the adaptive logic.

- Remove `batchSize`, `batchBlockCount`, and the adaptive grow/shrink logic
- Each iteration: find the current step from `blockFrom`'s txNum, compute the step boundary block, process blocks, flush, build files
- The last step may be partial (fewer blocks), which is fine
`lastToTxNum` is inclusive (the last txNum of the step, e.g. 781249 for step 1 with stepSize 390625). Integer division 781249/390625 = 1 means `toStep` = 1, so the `BuildFiles2` loop `for step := fromStep; step < toStep` skips building step 1. Fix: use `(lastToTxNum+1)/stepSize` to get the exclusive step boundary.
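The off-by-one can be checked with the numbers from the commit message (the `toStep` helper name is illustrative; the arithmetic is the source's):

```go
package main

import "fmt"

// toStep converts the inclusive last txNum of a step into the exclusive
// step boundary expected by a `for step := fromStep; step < toStep` loop.
// Naive division of the inclusive value under-counts by one whenever the
// value is exactly the last txNum of a step.
func toStep(lastToTxNum, stepSize uint64) uint64 {
	return (lastToTxNum + 1) / stepSize // exclusive boundary
}

func main() {
	const stepSize = 390625
	const lastToTxNum = 781249 // last txNum of step 1
	// naive: 781249/390625 = 1 (loop skips step 1); fixed: 781250/390625 = 2
	fmt.Println(lastToTxNum/uint64(stepSize), toStep(lastToTxNum, stepSize))
}
```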
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	db/state/aggregator.go
This reverts commit 82788f1.
- Replace `RebuildStateReader` with `CommitmentReplayStateReader`
- Revert `integrity_checker_test` changes (belongs in a separate PR)
- Fix gofmt in `domain.go`
sudeepdino008 (Member, Author) commented:

✅ **Successful Hoodi commitment history regeneration**

- Branch:
- Timeline:
- Final disk usage:
- State root:
- Integrity check — checks run: Blocks, HeaderNoGaps, BlocksTxnID, InvertedIndex, HistoryNoSystemTxs, CommitmentKvi, ReceiptsNoDups, RCacheNoDups, CommitmentRoot, CommitmentHistVal, StateRootVerifyByHistory (257 sampled blocks, 6m5s), Publishable ("All snapshots are publishable")
- Key fixes applied:
`CommitmentReplayStateReader.Clone()` does not clone the `plainStateReader`, which causes warmup goroutines to share the write transaction's `HistoryStateReader`. `RebuildStateReader.Clone()` creates a fresh reader pair with the new tx, avoiding the issue. This reverts the reader swap from cleanup 2 (abf8ea1) while keeping `CommitmentReplayStateReader` intact for its other callers.
Flush every 2 steps instead of 1, halving the number of MDBX writes and `BuildFiles2` calls during the block processing phase.
This reverts commit a2adf2b.
Resolved conflicts in DeleteStateSnapshots API (positional args → struct).
AskAlexSharov approved these changes on Mar 20, 2026.
sudeepdino008 added a commit that referenced this pull request on Apr 1, 2026.
## Summary

Extends `integration commitment rebuild` to support regenerating **commitment history** (`.v` and `.vi` files) alongside the commitment domain (`.kv` files). This is the only practical way to regenerate commitment history for an existing synced node — `stage_exec` re-executes all blocks from scratch, which is ~2-4x slower and requires full EVM re-execution.

Closes #18954

## Usage

```bash
integration commitment rebuild --datadir=<dir> --chain=<chain>
integration commitment rebuild --datadir=<dir> --chain=<chain> --clear-commitment
integration commitment rebuild --datadir=<dir> --chain=<chain> --resume
```

### Flags

| Flag | Description |
|------|-------------|
| `--clear-commitment` | Remove commitment data from DB and delete state files, then exit without rebuilding |
| `--resume` | Resume a previously interrupted commitment rebuild (requires commitment history enabled) |
| `--squeeze` | Enable squeeze pass for ReplaceKeysInValues (default: true) |

### Prerequisites

- The node must be synced (accounts/storage/code domain files must exist)
- The `MaxTxNum` table must be populated. If not, run: `integration stage_headers --reset --datadir=<dir> --chain=<chain>`

## How It Works

When commitment history is **not** enabled, the existing `RebuildPatriciaTrieBasedOnFiles` path is used — it reads the latest state from domain files and recomputes the commitment trie in a single pass.

When commitment history **is** enabled, the new `RebuildCommitmentFilesWithHistory` path:

1. **Collects history keys** — for each batch of blocks, queries `AccountsDomain` and `StorageDomain` history to find which keys changed in each block
2. **Replays block-by-block** — for each block, touches the changed keys in the commitment trie using `TouchKey`, then computes the commitment root via `ComputeCommitment`
3. **Verifies state roots** — after each block, compares the computed root against the canonical header's state root
4. **Flushes per-step** — when the accumulated data crosses a step boundary (stepSize txNums), flushes domains to MDBX, builds snapshot files via `BuildFiles2`, waits for completion, then prunes
5. **Merges and squeezes** — after all blocks are processed, runs the merge loop and squeeze migration to produce final compressed files

## Key design decisions

- **Step-based batching**: flushes exactly at step boundaries to produce clean per-step `.kv` files that merge correctly
- **ETL-based history collection**: uses ETL collectors to sort history keys by block number, avoiding expensive per-block history queries
- **Discards non-commitment writes**: account/storage/code domain writes are discarded since those files already exist — only commitment domain writes are persisted
- **Warmup cache**: pre-warms the commitment trie cache for better read performance (disable with `ERIGON_REBUILD_NO_WARMUP_CACHE=1`)

## Performance (Hoodi testnet, 2.4M blocks)

| Metric | `commitment rebuild` (with history) | `stage_exec` (from scratch) |
|--------|-------------------------------------|-----------------------------|
| Block processing | ~38 blk/s | ~8 blk/s |
| Block processing time | ~17.5 hours | ~80+ hours (estimated) |
| Total wall clock (incl. merge/squeeze) | ~33.5 hours | ~80+ hours (estimated) |
| Peak memory (RSS) | ~90 GB (mostly mmap) | ~6 GB |
| Chaindata growth | Stable at 27 GB | Growing |
| EVM re-execution | No | Yes |

The commitment rebuild approach is **~2-4x faster** because it reads pre-computed state from existing snapshot files instead of re-executing every transaction through the EVM.

## Files Changed

- **`cmd/integration/commands/commitment.go`** — adds `--clear-commitment` and `--resume` flags, interactive prompt for history mode, routes to the appropriate rebuild function
- **`cmd/integration/commands/flags.go`** — new flag definitions
- **`db/state/squeeze.go`** — core `RebuildCommitmentFilesWithHistory` function with ETL-based history collection, block-by-block replay, step-based flushing, and merge/squeeze
- **`db/kv/rawdbv3/txnum.go`** — adds `IsMaxTxNumPopulated` helper
- **`db/state/execctx/domain_shared.go`** — exposes `ClearWarmupCache`
- **`execution/stagedsync/stage_commit_rebuild.go`** — adds `RebuildPatriciaTrieWithHistory` entry point
- **`execution/commitment/commitmentdb/`** — reader/context changes for the rebuild state reader

---------

Co-authored-by: Alexey Sharov <AskAlexSharov@gmail.com>
Co-authored-by: nanobot <nanobot@example.com>
Co-authored-by: antonis19 <antonis19@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
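The replay phase described above (collect changed keys, touch them, recompute and verify the root) can be sketched as a toy loop. `TouchKey` and `ComputeCommitment` mirror the trie method names from the description, but the `trie` type and its "root" (a plain concatenation rather than a real Merkle hash) are illustrative only:

```go
package main

import "fmt"

// trie is a toy stand-in for the commitment trie.
type trie struct{ touched []string }

// TouchKey marks a key as changed since the last commitment computation.
func (t *trie) TouchKey(k string) { t.touched = append(t.touched, k) }

// ComputeCommitment folds the touched keys into a placeholder "root"
// (concatenation here, a Merkle root in reality) and resets the touch set.
func (t *trie) ComputeCommitment() string {
	root := ""
	for _, k := range t.touched {
		root += k
	}
	t.touched = t.touched[:0]
	return root
}

func main() {
	// Per-block changed keys, as history collection would produce them.
	changedKeys := map[uint64][]string{1: {"a", "b"}, 2: {"c"}}
	// Expected roots, as the canonical headers would provide them.
	expectedRoot := map[uint64]string{1: "ab", 2: "c"}

	t := &trie{}
	for blk := uint64(1); blk <= 2; blk++ {
		for _, k := range changedKeys[blk] {
			t.TouchKey(k) // step 2: touch keys changed in this block
		}
		root := t.ComputeCommitment()
		fmt.Println(blk, root == expectedRoot[blk]) // step 3: verify root
	}
}
```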