Give more wait time and retry for kafka setup for ExactlyOnceKafkaRealtimeClusterIntegrationTest and separate it from the test suite run #17752
Closed
xiangfu0 wants to merge 1 commit into apache:master from
❌ 2 Tests Failed: 1 failed test and 1 ❄️ flaky test.
Pull request overview
This pull request addresses flakiness in the ExactlyOnceKafkaRealtimeClusterIntegrationTest when running in GitHub Actions CI environments. The test uses a 3-broker Kafka cluster with transactions, which requires more resources and time to start reliably in resource-constrained CI environments.
Changes:
- Introduced configurable Kafka startup parameters (max attempts, retry wait time, cluster ready timeout) in `BaseClusterIntegrationTest`
- Overrode these parameters in `ExactlyOnceKafkaRealtimeClusterIntegrationTest` to use more generous timeouts when running in GitHub Actions
- Restructured Maven test execution to run `ExactlyOnceKafkaRealtimeClusterIntegrationTest` first in isolation before other tests
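For readers without the diff at hand, the retry flow these parameters control might look roughly like the sketch below. Only the three getter names come from this PR; the class name `KafkaStartupSketch`, the default values, and the `startBrokers` / `isClusterReady` / `stopBrokers` hooks are hypothetical stand-ins, not Pinot's actual code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the base test class. Only the three getter names
// come from the PR; the defaults and the start/ready/stop hooks are assumptions.
abstract class KafkaStartupSketch {
  protected int getKafkaStartMaxAttempts() { return 3; }
  protected long getKafkaStartRetryWaitMs() { return 2_000L; }
  protected long getKafkaClusterReadyTimeoutMs() { return 60_000L; }

  protected abstract void startBrokers();
  protected abstract boolean isClusterReady();
  protected abstract void stopBrokers();

  /** Tries to start the cluster; returns the 1-based attempt that succeeded. */
  final int startKafkaWithRetry() {
    int maxAttempts = getKafkaStartMaxAttempts();
    RuntimeException lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        startBrokers();
        if (waitFor(this::isClusterReady, getKafkaClusterReadyTimeoutMs())) {
          return attempt;
        }
        lastFailure = new IllegalStateException("Cluster not ready within timeout");
      } catch (RuntimeException e) {
        lastFailure = e;
      }
      // Tear down the partial cluster before trying again.
      stopBrokers();
      if (attempt < maxAttempts) {
        sleep(getKafkaStartRetryWaitMs());
      }
    }
    throw new IllegalStateException(
        "Kafka did not start after " + maxAttempts + " attempts", lastFailure);
  }

  // Polls the condition until it holds or the timeout elapses.
  private static boolean waitFor(BooleanSupplier condition, long timeoutMs) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      sleep(Math.min(100L, timeoutMs));
    }
  }

  private static void sleep(long ms) {
    try {
      Thread.sleep(ms);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for Kafka", e);
    }
  }
}
```

Because the loop reads every limit through a protected getter, a subclass can lengthen the waits without touching the startup logic itself.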
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pinot-integration-test-base/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTest.java | Added three protected methods to allow subclasses to customize Kafka startup configuration (max attempts, retry wait time, cluster ready timeout); updated all usages to call these methods instead of using constants directly |
| pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/ExactlyOnceKafkaRealtimeClusterIntegrationTest.java | Overrode Kafka configuration methods to use higher values (5 attempts, 5s retry wait, 180s timeout) when GITHUB_ACTIONS environment variable is true |
| pinot-integration-tests/pom.xml | Restructured integration-tests-set-1 profile to run ExactlyOnceKafkaRealtimeClusterIntegrationTest in a separate execution first, then exclude it from the remaining E*Test.java pattern |
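The CI-specific override described in the table could be sketched as follows. The higher values (5 attempts, 5s retry wait, 180s timeout) and the `GITHUB_ACTIONS` check come from the review summary above; the class names, the base defaults, and passing the environment in as a `Map` (to keep the sketch testable) are assumptions rather than the PR's actual code.

```java
import java.util.Map;

// Hypothetical base class; the default values here are assumptions.
class KafkaStartupDefaults {
  protected int getKafkaStartMaxAttempts() { return 3; }
  protected long getKafkaStartRetryWaitMs() { return 2_000L; }
  protected long getKafkaClusterReadyTimeoutMs() { return 60_000L; }
}

// Hypothetical subclass mirroring the override pattern described in the review.
class ExactlyOnceStartupSketch extends KafkaStartupDefaults {
  private final boolean _runningInGithubActions;

  // The environment is injected so the behavior can be exercised locally;
  // production-style code would pass System.getenv().
  ExactlyOnceStartupSketch(Map<String, String> env) {
    // parseBoolean(null) is false, so an unset variable means "not CI".
    _runningInGithubActions = Boolean.parseBoolean(env.get("GITHUB_ACTIONS"));
  }

  @Override
  protected int getKafkaStartMaxAttempts() {
    return _runningInGithubActions ? 5 : super.getKafkaStartMaxAttempts();
  }

  @Override
  protected long getKafkaStartRetryWaitMs() {
    return _runningInGithubActions ? 5_000L : super.getKafkaStartRetryWaitMs();
  }

  @Override
  protected long getKafkaClusterReadyTimeoutMs() {
    return _runningInGithubActions ? 180_000L : super.getKafkaClusterReadyTimeoutMs();
  }
}
```

Falling back to `super` outside CI keeps local runs on the base defaults, so only the resource-constrained environment pays for the longer waits.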
This pull request improves the reliability and configurability of Kafka cluster startup in integration tests, particularly to better support resource-constrained CI environments. The main changes include making Kafka startup parameters overridable, updating the test execution order in Maven to ensure a specific test runs first, and customizing Kafka startup behavior for CI.
Kafka startup configurability and reliability:
- Added new protected methods (`getKafkaStartMaxAttempts`, `getKafkaStartRetryWaitMs`, `getKafkaClusterReadyTimeoutMs`) in `BaseClusterIntegrationTest.java` to allow subclasses to override Kafka broker startup attempts, retry wait time, and cluster readiness timeout. All usages of the previous constants in Kafka startup logic now use these methods.
- Overrode the new Kafka startup configuration methods in `ExactlyOnceKafkaRealtimeClusterIntegrationTest.java` to provide more generous retry and timeout values when running in CI environments (detected via `GITHUB_ACTIONS`), improving test reliability under resource constraints.

Test execution order improvements:
- Updated `pinot-integration-tests/pom.xml` to disable the default test execution, and instead:
  - Run `ExactlyOnceKafkaRealtimeClusterIntegrationTest` first in its own execution.