Description
AdvancedShardAwarenessIT.should_not_struggle_to_fill_pools intermittently fails in CI with a 20-second Awaitility timeout. The test opens 4 concurrent sessions against a 2-node ScyllaDB cluster (with --smp=3) and waits for all connection pools to be fully initialized, but pools fail to fill due to DriverTimeoutException on the protocol OPTIONS handshake.
Error
org.awaitility.core.ConditionTimeoutException: Condition with lambda expression in com.datastax.oss.driver.core.pool.AdvancedShardAwarenessIT was not fulfilled within 20 seconds.
at com.datastax.oss.driver.core.pool.AdvancedShardAwarenessIT.should_not_struggle_to_fill_pools(AdvancedShardAwarenessIT.java:239)
Root Cause
Multiple channels fail to initialize with a protocol timeout:
WARN c.d.o.d.i.core.pool.ChannelPool - [s1|/127.0.2.1:19042] Error while opening new channel
com.datastax.oss.driver.api.core.DriverTimeoutException: [s1|id: 0xf3999fab, L:/127.0.0.1:11669 - R:/127.0.2.1:19042] Protocol initialization request, step 1 (OPTIONS): timed out after 5000 ms
Opening 4 sessions simultaneously with multiple channels per shard creates a burst of connection attempts. On CI runners with limited resources, the ScyllaDB node cannot respond to all OPTIONS requests within the 5-second timeout, causing channels to fail and preventing pools from reaching full capacity within 20 seconds.
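To give a rough sense of the burst size, the sketch below multiplies out the figures from this report (4 sessions, 2 nodes, --smp=3). The number of channels per shard is not stated here, so the value 2 used in `main` is purely an assumption for illustration, and `HandshakeBurst` is a hypothetical helper, not part of the driver or the test. Control connections are ignored.

```java
/** Hypothetical helper: estimates how many channels are opened at once during pool fill. */
final class HandshakeBurst {

  /** Channels a single node has to accept: one OPTIONS handshake per channel. */
  static int channelsPerNode(int sessions, int shardsPerNode, int channelsPerShard) {
    return Math.multiplyExact(sessions, Math.multiplyExact(shardsPerNode, channelsPerShard));
  }

  /** Channels opened across the whole cluster. */
  static int totalChannels(int sessions, int nodes, int shardsPerNode, int channelsPerShard) {
    return Math.multiplyExact(nodes, channelsPerNode(sessions, shardsPerNode, channelsPerShard));
  }

  public static void main(String[] args) {
    int sessions = 4;          // from this report
    int nodes = 2;             // from this report
    int shardsPerNode = 3;     // from --smp=3
    int channelsPerShard = 2;  // ASSUMPTION: the report only says "multiple"

    System.out.println("per node: " + channelsPerNode(sessions, shardsPerNode, channelsPerShard));
    System.out.println("total:    " + totalChannels(sessions, nodes, shardsPerNode, channelsPerShard));
  }
}
```

Under that assumption each node would see 24 near-simultaneous handshakes, all of which need an OPTIONS response within the 5-second init timeout.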
Environment
- ScyllaDB version: 2025.4.3 (also seen on LTS versions)
- CCM config: 2 nodes, --smp=3
- CI: GitHub Actions ubuntu-latest
- Test category: IsolatedTests
CI Run
https://github.com/scylladb/java-driver/actions/runs/22542986983/job/65300844148?pr=818
Also observed in base branch (scylla-4.x) CI runs.
Possible Fixes
- Increase Awaitility timeout from 20s to 60s
- Increase per-channel protocol init timeout (advanced.connection.init-query-timeout)
- Reduce the number of concurrent sessions from 4 to 2
- Add retry/backoff logic for initial connection pool fill
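As a starting point for the retry/backoff option, here is a minimal plain-Java sketch of capped exponential backoff. `BackoffRetry`, `delayFor` and `retry` are hypothetical names, not driver APIs, and where such logic would hook into the pool fill or the test is left open. It catches only unchecked exceptions, on the assumption that the failure to retry is a `RuntimeException` like the `DriverTimeoutException` shown above.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical helper, not part of the driver: retries an action with capped exponential backoff. */
final class BackoffRetry {

  /** Delay before retry number {@code attempt} (0-based): base * 2^attempt, capped at max. */
  static Duration delayFor(int attempt, Duration base, Duration max) {
    long baseMs = base.toMillis();
    long maxMs = max.toMillis();
    // Guard the shift so large attempt counts cannot overflow.
    long factor = attempt >= 30 ? Long.MAX_VALUE : (1L << attempt);
    if (factor > maxMs / Math.max(baseMs, 1)) {
      return Duration.ofMillis(maxMs);
    }
    return Duration.ofMillis(Math.min(baseMs * factor, maxMs));
  }

  /** Runs the action up to maxAttempts times, sleeping between failed attempts. */
  static <T> T retry(Supplier<T> action, int maxAttempts, Duration base, Duration max) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    RuntimeException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return action.get();
      } catch (RuntimeException e) {
        last = e; // remember the failure and fall through to the backoff
      }
      if (attempt + 1 < maxAttempts) {
        try {
          Thread.sleep(delayFor(attempt, base, max).toMillis());
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
          break;
        }
      }
    }
    throw last; // every attempt failed (or the thread was interrupted)
  }
}
```

With a 100 ms base and a 1 s cap, the delays between attempts would be 100, 200, 400, 800 and then 1000 ms. Spreading attempts out like this is meant to reduce the pressure on the node during the initial burst, at the cost of a longer worst-case fill time, so the Awaitility timeout might need to grow alongside it.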
Related
should_initialize_all_channels), different root cause (socket collision)