tests: fix NLB replacement test bootstrap crash due to missing rackdc properties #772

Open
mykaul wants to merge 1 commit into scylladb:master from mykaul:fix/nlb-test-bootstrap-rackdc
mykaul commented Mar 28, 2026

Summary

  • Fix TestFullNodeReplacementThroughNlb::test_should_survive_full_node_replacement_through_nlb which always crashes when bootstrapping node 4/5
  • Root cause: _bootstrap_node() calls ccm_cluster.add() without data_center or rack — CCM leaves cassandra-rackdc.properties empty, and Scylla's snitch crashes with locator::bad_property_file_error
  • Fix: infer data_center/rack from existing cluster nodes and pass explicitly
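
The fix described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `StubNode`, `StubCluster`, `infer_topology` and `bootstrap_node` are hypothetical stand-ins written for this example, and the stub `add()` only mimics the behaviour described above (nothing is inferred when `data_center` and `rack` are omitted). It does not import or reproduce the real CCM classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StubNode:
    """Hypothetical stand-in for a CCM node; only the fields used here."""
    name: str
    data_center: Optional[str] = None
    rack: Optional[str] = None


@dataclass
class StubCluster:
    """Hypothetical stand-in for a CCM cluster, not the real ccmlib class."""
    nodes: List[StubNode] = field(default_factory=list)

    def add(self, node, is_seed, data_center=None, rack=None):
        # Mimics the behaviour described above: nothing is inferred, so a
        # node added without topology keeps data_center and rack unset.
        node.data_center = data_center
        node.rack = rack
        self.nodes.append(node)


def infer_topology(cluster) -> Tuple[str, str]:
    """Return (data_center, rack) of the first node that has both set."""
    for node in cluster.nodes:
        if node.data_center and node.rack:
            return node.data_center, node.rack
    raise RuntimeError("no existing node has data_center and rack set")


def bootstrap_node(cluster, name):
    """Add a new node, passing the inferred topology explicitly."""
    data_center, rack = infer_topology(cluster)
    node = StubNode(name)
    cluster.add(node, is_seed=False, data_center=data_center, rack=rack)
    return node


# Example values ("dc1", "RAC1") are made up for the illustration.
cluster = StubCluster([StubNode("node1", "dc1", "RAC1")])
node4 = bootstrap_node(cluster, "node4")
print(node4.data_center, node4.rack)
```

Raising when no existing node carries a topology keeps the failure in the test helper rather than in the new node's startup log.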

Details

The test was added in PR #706 with @skip_scylla_version_lt(scylla_version="2026.1.0"). Since CI uses SCYLLA_VERSION: release:2025.2, the test was always skipped and the bug was never caught.
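
To make the skip condition concrete, here is a small sketch of the version comparison involved. It is an assumption about how such a check could work, not the implementation of `skip_scylla_version_lt`; `parse_version` and `is_skipped` are hypothetical helpers, and stripping the `release:` prefix is a guess at how the CI value would be normalised.

```python
def parse_version(text):
    """Turn 'release:2025.2' or '2026.1.0' into a 3-part integer tuple."""
    # Drop an optional 'release:' prefix (assumed format of the CI value).
    number = text.split(":", 1)[-1]
    parts = [int(part) for part in number.split(".")]
    # Pad to three components so '2026.1' compares equal to '2026.1.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_skipped(running_version, minimum_version):
    """A test is skipped when the running version is below the minimum."""
    return parse_version(running_version) < parse_version(minimum_version)


print(is_skipped("release:2025.2", "2026.1.0"))
print(is_skipped("release:2026.1", "2026.1.0"))
```

Under these assumptions the CI version 2025.2 falls below the 2026.1.0 threshold, so the test body never runs there, while a 2026.1 run would execute it.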

Scylla error from node4 startup log:

ERROR snitch_logger - Failed to parse a properties file (cassandra-rackdc.properties). Halting...
ERROR init - Startup failed: locator::bad_property_file_error (std::exception)

Testing

Verified 10/10 passes on Scylla 2026.1 after the fix (previously 0/10).

… properties

The _bootstrap_node() method in TestFullNodeReplacementThroughNlb calls
ccm_cluster.add() without passing data_center or rack.  CCM does not
infer these from existing nodes, so the new node's
cassandra-rackdc.properties file is left with only template comments.
Scylla's GossipingPropertyFileSnitch fails to parse the empty file and
crashes on startup with 'locator::bad_property_file_error'.

Fix by reading data_center/rack from an existing cluster node and
passing them explicitly to ccm_cluster.add().

This test was added in PR scylladb#706 with a @skip_scylla_version_lt(2026.1.0)
decorator and CI runs Scylla 2025.2, so the bug was never caught.
@dkropachev
Collaborator

@mykaul , this fix is wrong - test needs to specify concrete dc and rack it wants to place new nodes in, taking them from nodes already provisioned

@mykaul
Author

mykaul commented Mar 28, 2026

> @mykaul , this fix is wrong - test needs to specify concrete dc and rack it wants to place new nodes in, taking them from nodes already provisioned

@dkropachev - isn't it correct anyway, to specify the topology explicitly? I can fix the test in addition, but this seems correct as well.
