test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

julienrbrt · 2026-01-22T10:52:20Z

ref: #2965 (comment)
ref: #2982 (comment)

github-actions · 2026-01-22T10:52:32Z

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`⏩ skipped`	`✅ passed`	`✅ passed`	Jan 27, 2026, 4:27 PM

gemini-code-assist · 2026-01-22T10:52:36Z

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enabling and validating the 'force inclusion' functionality within the system. It removes a previous block in the configuration validation that prevented the feature from being active and subsequently re-enables the comprehensive suite of end-to-end tests designed to verify its correct operation across various EVM components.

Highlights

Force Inclusion Feature Activation: The 'forced inclusion' feature has been activated by removing a temporary error message that previously prevented its use in pkg/config/config.go. The configuration now correctly validates the ForcedInclusionNamespace.
E2E Tests Re-enabled: All end-to-end tests related to EVM force inclusion, including sequencer, full node, and malicious sequencer scenarios, have been re-enabled in test/e2e/evm_force_inclusion_e2e_test.go by removing t.Skip() directives.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

claude · 2026-01-22T10:52:38Z

Claude finished @julienrbrt's task in 2m 40s —— View job

Comprehensive Code Review

I've completed a thorough review of this PR which enables force inclusion functionality. Below is my detailed feedback organized by category.

Summary

This PR successfully enables the force inclusion feature by:

Removing validation block in pkg/config/config.go (lines 295-299)
Re-enabling 3 comprehensive E2E tests that were previously skipped
Adding extensive benchmarks for the FilterTxs performance path

The changes are well-structured and the test coverage is excellent. However, there are several areas that need attention.

Critical Issues

1. Missing Test Coverage for Config Validation 🔴

File: pkg/config/config.go:295-299

The PR removes the temporary block that prevented force inclusion from being enabled:

if len(c.DA.GetForcedInclusionNamespace()) > 0 {
    if err := validateNamespace(c.DA.GetForcedInclusionNamespace()); err != nil {
        return fmt.Errorf("could not validate forced inclusion namespace (%s): %w", c.DA.GetForcedInclusionNamespace(), err)
    }
}

Issue: Codecov reports 0% patch coverage for these 2 lines. The namespace validation logic for force inclusion is now active but completely untested.

Impact: If there's a bug in the namespace validation or someone provides an invalid namespace format, it won't be caught by tests.

Recommendation: Add unit tests in pkg/config/config_test.go covering:

Valid forced inclusion namespace formats
Invalid namespace formats (too short, wrong encoding, etc.)
Empty namespace (should pass validation check since len == 0)
Edge cases like special characters, maximum length

Fix this →

Code Quality Issues

2. Flag Consistency - Prefix Usage ⚠️

File: test/e2e/evm_force_inclusion_e2e_test.go:92,107,114,219

The test uses --evnode. prefix for flags (e.g., --evnode.node.aggregator=true), but there's an inconsistency discussion in the review comments about whether to move to rollkit flags.

Current usage:

"--evnode.node.aggregator=true",
"--evnode.signer.passphrase_file", passphraseFile,
"--evnode.da.forced_inclusion_namespace", "forced-inc",

Issue: The codebase supports both rollkit. and evnode. prefixes via flag normalization (config.go:344-350), but there should be a consistent convention in tests.

Review comment context: @randygrok asked about this, and @julienrbrt acknowledged "consistency should be the other way around indeed."

Recommendation:

Decide on a single prefix convention for E2E tests (evnode. or rollkit.)
Update all E2E tests to use the chosen convention consistently
Document the decision in CLAUDE.md or a testing guide

3. Benchmark Test Quality ✅ Good, with suggestions

File: execution/evm/filter_bench_test.go

Strengths:

Excellent coverage of different scenarios (normal txs, forced txs, mixed, gas limits, size limits)
Good use of sub-benchmarks for different transaction counts
Proper use of b.ResetTimer() and b.ReportAllocs()
Direct comparison benchmarks (BenchmarkFilterTxs_CompareUnmarshalOverhead)

Minor suggestions:

a) Line 131-132: Using the new b.Loop() pattern - good! However, the old pattern for i := 0; i < b.N; i++ is more widely understood. Consider a comment for clarity:

for b.Loop() { // Replaces: for i := 0; i < b.N; i++

b) Lines 96-97: Invalid transaction generation could be more realistic:

if includeInvalid && i%10 == 0 {
    // Every 10th transaction is invalid (random garbage)

Suggestion: Add a benchmark with transactions that are structurally valid but fail gas validation or signature checks. Random garbage is one failure mode, but malformed-but-parseable transactions are another.

c) Line 18: Missing constant documentation:

const (
	benchPrivateKey = "cece4f25ac74deb1468965160c7185e07dff413f23fcadb611b05ca37ab0a52e"
	benchToAddress  = "0x944fDcD1c868E3cC566C78023CcB38A32cDA836E"
	benchChainID    = "1234"
)

Suggestion: Add a comment noting these are test-only keys with no real value.

Security Review

4. E2E Test Security Properties ✅ Excellent

File: test/e2e/evm_force_inclusion_e2e_test.go:363-556

The TestEvmSyncerMaliciousSequencerForceInclusionE2E test is particularly well-designed:

Strengths:

Tests critical censorship resistance property
Properly simulates malicious sequencer (wrong namespace configuration)
Validates that honest nodes detect and halt on force inclusion violations
Comprehensive documentation (lines 363-397) explaining threat model

Architecture validation:

// Malicious Sequencer: "wrong-namespace" (doesn't see forced txs)
"--evnode.da.forced_inclusion_namespace", "wrong-namespace",

// Honest Full Node: "forced-inc" (validates forced txs from DA)
"--evnode.da.forced_inclusion_namespace", "forced-inc",

This correctly tests the security invariant that nodes with correct configuration will halt rather than follow a censoring sequencer.

One minor issue (line 465):

result := daClient.Submit(ctx, [][]byte{txBytes}, -1, []byte("forced-inc"), nil)

The namespace is passed as []byte("forced-inc") but everywhere else uses string constants. Consider extracting to a constant:

const TestForcedInclusionNamespace = "forced-inc"

5. No Secret Exposure ✅ Pass

Reviewed test keys and credentials:

TestPrivateKey used in tests (line 149, 250, 262, 429, 441) - appears to be test-only key
Benchmark private key (filter_bench_test.go:18) - test-only
No production credentials or secrets detected

Performance Review

6. FilterTxs Optimization Path ✅ Good Design

File: execution/evm/execution.go:839-889

The optimization where UnmarshalBinary is skipped when hasForceIncludedTransaction=false is smart:

if hasForceIncludedTransaction {
    var ethTx types.Transaction
    if err := ethTx.UnmarshalBinary(tx); err != nil {
        // ... filter out invalid tx
    }
}

Rationale: Mempool transactions are already validated, so parsing can be skipped unless force-included transactions (which may be untrusted) are present.

Performance impact: The benchmarks will quantify this. Based on the code:

Without forced txs: O(n) iteration + size checks
With forced txs: O(n) iteration + unmarshal + size + gas checks

Potential concern: The hasForceIncludedTransaction flag is a boolean covering the entire batch. If there's 1 forced tx and 9,999 mempool txs, all 10,000 get unmarshaled.

Suggestion for future optimization (not blocking for this PR):

// Instead of: hasForceIncludedTransaction bool
// Consider: forceIncludedIndices []int

// Then:
if contains(forceIncludedIndices, i) {
    // Only unmarshal this specific tx
}

This would unmarshal only the forced transactions, not the entire batch.

7. E2E Test Performance Parameters ⚠️

File: test/e2e/evm_force_inclusion_e2e_test.go:98,205,297

The tests set da_epoch_forced_inclusion = 2 for faster testing:

enableForceInclusionInGenesis(t, sequencerHome, 2)

Issue: While this makes tests faster, it may not catch edge cases that occur with realistic epoch values.

Questions:

What's the production default for da_epoch_forced_inclusion?
Should there be at least one test variant with production-like values?
Are there any consensus or security implications of using epoch=2?

Recommendation: Document in comments why epoch=2 is safe for testing, or add one test with a more realistic epoch value.

Documentation Review

8. Test Documentation ✅ Excellent

The E2E tests have outstanding documentation:

TestEvmSyncerMaliciousSequencerForceInclusionE2E (lines 363-397): Comprehensive doc comment explaining threat model, architecture, and expected behavior
Helper functions have clear docstrings (lines 28, 49, 75, etc.)
Inline comments explain test flow and timing decisions

No action needed - this is exemplary.

9. Config Comments ✅ Good

The YAML comment for ForcedInclusionNamespace (config.go:189) is clear:

ForcedInclusionNamespace string `comment:"Namespace ID for forced inclusion transactions on the DA layer."`

Suggestion: Consider adding a usage example or link to force inclusion documentation in the comment.

Architecture & Design

10. Modular Test Setup ✅ Good

The test helpers are well-designed:

setupSequencerWithForceInclusion() (line 76)
setupMaliciousSequencer() (line 279)
setupFullNodeWithForceInclusionCheck() (line 326)

Strengths:

Clear separation of concerns
Reusable across different test scenarios
Good use of t.Helper() for clearer test output

Minor suggestion: The three setup functions share ~80% of the same initialization code. Consider extracting a common base function:

func setupSequencerBase(t *testing.T, sut *SystemUnderTest, nodeHome string, opts SequencerOptions) (...)

This would reduce duplication and make test maintenance easier.

11. Grace Period and Timing ⚠️ Needs clarification

File: test/e2e/evm_force_inclusion_e2e_test.go:476-481,529

The malicious sequencer test advances the chain with 15 transactions (line 476) and monitors for "up to 60 seconds" (line 494) for the node to detect violation and stop.

Questions:

Is the 15 transaction count derived from epoch=2 + grace period calculation?
What's the actual grace period formula?
Could this test be flaky if system is slow?

Observation: Lines 528-533 wait for 10 consecutive checks where height doesn't change. This is a good heuristic, but:

if consecutiveStops >= 10 {
    t.Log("✅ Full node stopped syncing - malicious behavior detected!")
    stoppedSyncing = true
    break
}

Recommendation:

Document the expected detection latency formula: (epoch * DA_block_time) + grace_period
Consider making timeout calculation dynamic based on test parameters
Add defensive timeout for CI environments: timeout := calculateTimeout(epoch, daBlockTime) * 2 // 2x for CI

Error Handling

12. Benchmark Error Handling ✅ Appropriate

Benchmarks properly fail fast with b.Fatalf():

if err != nil {
    b.Fatalf("FilterTxs failed: %v", err)
}

This is correct - benchmarks should not continue if the operation fails.

13. E2E Test Assertions ✅ Good

The E2E tests use require appropriately:

require.Eventually(t, func() bool {
    return evm.CheckTxIncluded(fnClient, txForce.Hash())
}, 40*time.Second, 1*time.Second, "Forced inclusion tx not synced to full node")

Good use of require.Eventually with reasonable timeouts and clear failure messages.

Testing Coverage Summary

Component	Coverage	Assessment
Config validation	❌ 0%	Needs tests
E2E sequencer force inclusion	✅ Good	Complete scenario
E2E full node force inclusion	✅ Good	Complete scenario
E2E malicious sequencer detection	✅ Excellent	Security property validated
FilterTxs benchmarks	✅ Good	Comprehensive scenarios

Overall test quality: High, but config validation tests are missing.

Recommendations Summary

Must Fix (Blocking)

✅ Add unit tests for forced inclusion namespace validation (0% coverage)

Should Fix (Recommended)

⚠️ Resolve flag prefix consistency (evnode. vs rollkit.)
⚠️ Document or validate epoch=2 safety for testing
⚠️ Document grace period calculation and detection timing

Nice to Have (Optional)

💡 Extract test constants (forced-inc namespace, epoch values)
💡 Refactor common sequencer setup code
💡 Add benchmark for structurally-valid but invalid transactions
💡 Consider per-transaction force inclusion flag for future optimization

Conclusion

This is a well-implemented feature enablement with excellent E2E test coverage and comprehensive benchmarks. The security properties are properly tested, particularly the malicious sequencer detection scenario.

Primary concern: The 0% patch coverage on the namespace validation code. This must be addressed before merging.

Secondary concerns: Flag consistency and test parameter documentation.

Once the config validation tests are added and flag prefix is standardized, this PR will be in excellent shape. The force inclusion feature appears production-ready from a test coverage perspective.

gemini-code-assist

Code Review

This pull request enables the force inclusion feature. The changes in pkg/config/config.go correctly re-enable the validation logic for the forced inclusion namespace, and the changes in test/e2e/evm_force_inclusion_e2e_test.go re-enable the end-to-end tests for this feature by removing t.Skip() calls. The changes are straightforward and effectively achieve the goal of activating the force inclusion functionality and its associated tests. The pull request title also mentions the addition of force inclusion benchmarks, but no new benchmark files were found in the changes; this might be planned for a subsequent pull request.

codecov · 2026-01-22T10:56:30Z

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.91%. Comparing base (62c9339) to head (ecee902).

Files with missing lines	Patch %	Lines
pkg/config/config.go	0.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3005      +/-   ##
==========================================
- Coverage   58.04%   57.91%   -0.13%     
==========================================
  Files         110      110              
  Lines       10522    10523       +1     
==========================================
- Hits         6107     6094      -13     
- Misses       3764     3778      +14     
  Partials      651      651

Flag	Coverage Δ
combined	`57.91% <0.00%> (-0.13%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

test/e2e/evm_force_inclusion_e2e_test.go

test: enable force inclusion, re-enable fi e2e and add fi benchmarks

c92c7d8

github-actions bot assigned julienrbrt Jan 22, 2026

gemini-code-assist bot reviewed Jan 22, 2026

View reviewed changes

julienrbrt added 5 commits January 22, 2026 11:56

add filter tx bench

670313a

updates

055f6ee

Merge branch 'main' into julien/test-fi

4ed0a4b

check others

cab4a67

Merge branch 'main' into julien/test-fi

2f7b304

randygrok reviewed Jan 26, 2026

View reviewed changes

test/e2e/evm_force_inclusion_e2e_test.go Outdated Show resolved Hide resolved

julienrbrt added 4 commits January 26, 2026 16:52

Merge branch 'main' into julien/test-fi

36ce299

fix flags

cb09920

re-enable test

54d5917

fix sequencer p2p address

287ccc4

julienrbrt force-pushed the julien/test-fi branch from d0bab1c to 287ccc4 Compare January 27, 2026 08:33

Merge branch 'main' into julien/test-fi

ecee902

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

Uh oh!

julienrbrt commented Jan 22, 2026

Uh oh!

github-actions bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

gemini-code-assist bot commented Jan 22, 2026

Uh oh!

claude bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

codecov bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

Are you sure you want to change the base?

test: enable force inclusion, re-enable fi e2e and add fi benchmarks #3005

Uh oh!

Conversation

julienrbrt commented Jan 22, 2026

Uh oh!

github-actions bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Jan 22, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

claude bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comprehensive Code Review

Summary

Critical Issues

1. Missing Test Coverage for Config Validation 🔴

Code Quality Issues

2. Flag Consistency - Prefix Usage ⚠️

3. Benchmark Test Quality ✅ Good, with suggestions

Security Review

4. E2E Test Security Properties ✅ Excellent

5. No Secret Exposure ✅ Pass

Performance Review

6. FilterTxs Optimization Path ✅ Good Design

7. E2E Test Performance Parameters ⚠️

Documentation Review

8. Test Documentation ✅ Excellent

9. Config Comments ✅ Good

Architecture & Design

10. Modular Test Setup ✅ Good

11. Grace Period and Timing ⚠️ Needs clarification

Error Handling

12. Benchmark Error Handling ✅ Appropriate

13. E2E Test Assertions ✅ Good

Testing Coverage Summary

Recommendations Summary

Must Fix (Blocking)

Should Fix (Recommended)

Nice to Have (Optional)

Conclusion

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

codecov bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions bot commented Jan 22, 2026 •

edited

Loading

claude bot commented Jan 22, 2026 •

edited

Loading

codecov bot commented Jan 22, 2026 •

edited

Loading