Add swarms regression test for bucket-split task rescheduling (Altinity/ClickHouse#1486) #118

Open
CarlosFelipeOR wants to merge 1 commit into main from add-swarms-task-rescheduling-test

@CarlosFelipeOR
Collaborator

Summary

Test plan

  • Pre-fix build: test fails with row loss (e.g. Expected 200000 total rows, but got 165762)
  • Post-fix build (PR #1493): test passes with RESULT: 200000

Made with Cursor



@TestStep(Given)
def create_parquet_with_many_row_groups(
Member

This step needs cleanup afterwards.
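
One possible shape for the requested cleanup, as a minimal sketch rather than the PR's actual step: it assumes the step can be written in a try/yield/finally form so that whatever it creates is removed even when the test body fails. The standard-library `contextlib.contextmanager` stands in for the test framework's step decorator, the name `create_temp_data_file` is hypothetical, and the file written here is a plain placeholder, not a real Parquet file.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def create_temp_data_file(contents: bytes):
    """Create a temporary file and always remove it afterwards.

    Hypothetical stand-in for a Given step that creates test data.
    """
    fd, path = tempfile.mkstemp(suffix=".parquet")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(contents)
        # Hand the path to the test body.
        yield path
    finally:
        # Cleanup runs even if the test body raises.
        if os.path.exists(path):
            os.remove(path)
```

Used as `with create_temp_data_file(b"...") as path:`, the file exists only for the duration of the block.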

Comment thread swarms/regression.py
"swarms work only with antalya",
check_if_not_antalya_build,
),
"/swarms/feature/task rescheduling/*": (
Member

This fix was also added to 25.8 (Altinity/ClickHouse#1237), so the xfail should be updated for it as well.

max_threads=1,
lock_object_storage_task_distribution_ms=2,
cluster_table_function_split_granularity="bucket",
cluster_table_function_buckets_batch_size=1,
Member

Why is this setting set to 1?

assert sent_to_matched + sent_to_non_matched > 1, error(
    "Bucket split was not active: expected more than one distributed task."
)
assert processed_tasks_total > 1, error(
Member

Shouldn't we assert that the total number of tasks is 200 (200000 rows / 1000 rows per row group)?

Member

I always get 198; maybe the logging is incomplete when a node is killed.
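
The arithmetic behind the 200 figure can be sketched as follows. This assumes one distributed task per row group, which is what the question above implies but is not confirmed in this thread; the helper name `expected_task_count` is hypothetical, and 198 is simply the value reported in the reply.

```python
def expected_task_count(total_rows: int, rows_per_row_group: int) -> int:
    """Number of row groups, assuming one task per row group."""
    # Ceiling division so a final partial row group still counts.
    return -(-total_rows // rows_per_row_group)


expected = expected_task_count(200_000, 1_000)
observed = 198  # value reported in the reply above
missing = expected - observed  # tasks not accounted for in the counters
```

If the gap really comes from counters lost when a node is killed, an exact `== 200` assertion would be flaky, which may be why the test only checks for more than one task.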

def feature(self, minio_root_user, minio_root_password, node=None):
    """Check that task rescheduling works correctly when swarm replicas fail,
    verifying data completeness after recovery."""
    if node is None:
Member

Variable is not used.
