(fix) test: Fix KeyError in test_idle_heartbeat with connection repla…#762

Draft
mykaul wants to merge 2 commits into scylladb:master from mykaul:fix/test-idle-heartbeat

Conversation

@mykaul mykaul commented Mar 25, 2026

…cement

The test_idle_heartbeat test was failing with a KeyError when connections were dynamically replaced during the test run in shard-aware environments.

Root cause: The test captures connection object IDs at the start, sleeps for heartbeat intervals, then tries to validate those connections. In shard-aware ScyllaDB deployments, the driver may replace connections during this window. When a connection is replaced, the new connection object has a different Python ID, causing a KeyError on lookup.

Fix: Skip validation for connections that weren't present in the initial snapshot. This preserves the test's intent (validating heartbeats on stable connections) while being robust to dynamic connection management in shard-aware mode.
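
To illustrate the failure mode and the skip-based workaround described above, here is a minimal, self-contained sketch. It is not the actual test code: `FakeConnection`, `snapshot_by_id`, and `stable_connections` are hypothetical names invented for this example, and the real test works with the driver's own connection objects.

```python
class FakeConnection:
    """Hypothetical stand-in for a driver connection (not the real class)."""

    def __init__(self, name):
        self.name = name
        self.msg_received = False


def snapshot_by_id(connections):
    """Record per-connection state keyed by id(conn), as the test is described to do."""
    return {id(c): c.msg_received for c in connections}


def stable_connections(snapshot, current):
    """Keep only connections that were present in the initial snapshot."""
    return [c for c in current if id(c) in snapshot]


pool = [FakeConnection("shard-0"), FakeConnection("shard-1")]
snapshot = snapshot_by_id(pool)

# Simulate the driver replacing a connection while the test sleeps.
# The replacement is created while the old object is still alive,
# so the two objects are guaranteed to have different ids.
replacement = FakeConnection("shard-1-replacement")
pool[1] = replacement

# Naive validation: looking up every current connection raises KeyError.
try:
    for conn in pool:
        snapshot[id(conn)]
    naive_lookup_failed = False
except KeyError:
    naive_lookup_failed = True

# Skip-based validation: only connections from the snapshot are checked.
validated = [c.name for c in stable_connections(snapshot, pool)]
```

The trade-off is that replaced connections are never validated, so a test written this way only asserts heartbeat behavior on connections that survived the whole sleep window.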

I'm not sure this is a great fix, and I don't think it has anything to do with this branch. I'll push it and, if it's OK, I'll also create a separate PR for it. Why did it pop up now?

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
@mykaul mykaul marked this pull request as draft March 25, 2026 15:35

mykaul commented Mar 26, 2026

CI flakiness - unrelated, will send a separate PR.
