HDDS-14822. Add a configuration for excluding Datanodes from pipelines #9914

Draft
siddhantsangwan wants to merge 5 commits into apache:master from siddhantsangwan:HDDS-14822
siddhantsangwan commented Mar 13, 2026

What changes were proposed in this pull request?

Adds an SCM configuration for excluding Datanodes from pipeline creation and selection.

  1. Applies on the normal allocateBlock code path for both Ratis and EC, as well as on the SCMClient and server code paths.
  2. Is independent of the existing ExcludeList that the Ozone client provides.
  3. Accepts a Datanode's UUID, hostname, or IP address.
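As a rough illustration of point 3, the sketch below checks a Datanode against an exclusion list by any of its three identifiers. It is not the code in this PR: the class name `DatanodeExclusionSketch`, the comma-separated value format, and the case-insensitive comparison are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; not the class added by this PR. */
final class DatanodeExclusionSketch {
  private final Set<String> excluded;

  /**
   * Parses a list of UUIDs, hostnames, or IP addresses.
   * Assumption: entries are comma-separated; the real format may differ.
   */
  DatanodeExclusionSketch(String configValue) {
    if (configValue == null) {
      this.excluded = Collections.emptySet();
      return;
    }
    this.excluded = Arrays.stream(configValue.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .map(s -> s.toLowerCase(Locale.ROOT))
        .collect(Collectors.toSet());
  }

  /** Returns true if any one of the node's identifiers is in the list. */
  boolean isExcluded(String uuid, String hostname, String ipAddress) {
    return matches(uuid) || matches(hostname) || matches(ipAddress);
  }

  // Assumption: identifiers are compared case-insensitively.
  private boolean matches(String identifier) {
    return identifier != null
        && excluded.contains(identifier.toLowerCase(Locale.ROOT));
  }
}
```

Matching on any identifier means an operator can list whichever one is at hand, at the cost of having to keep the list in sync if a node's hostname or IP address changes.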

Limitations:

  1. The cluster has configured limits on the total number of open Ratis pipelines and on the total number of open EC pipelines. Pipelines that are excluded because of this configuration remain open and still count toward those limits. They are effectively useless, however, and once a limit is reached they can prevent good pipelines from being opened.

To handle this, we should either add a thread (or some other mechanism) that closes useless open pipelines, or stop counting excluded-but-open pipelines when checking whether the limit has been reached. This can be done in a future PR.

Workaround: manually increase the configured limits.
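The second option for the first limitation (not counting excluded-but-open pipelines against the limit) could look roughly like the sketch below. This is only an illustration of the idea and is not implemented in this PR; `OpenPipeline`, `countUsable`, and `canOpenAnother` are hypothetical names, and a pipeline is reduced to a list of node IDs.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of limit accounting that ignores excluded pipelines. */
final class UsablePipelineCountSketch {

  /** Minimal stand-in for an open pipeline: an ID plus its member node IDs. */
  static final class OpenPipeline {
    final String id;
    final List<String> nodeIds;

    OpenPipeline(String id, List<String> nodeIds) {
      this.id = id;
      this.nodeIds = nodeIds;
    }
  }

  /** Counts open pipelines that contain no excluded node. */
  static long countUsable(List<OpenPipeline> open, Set<String> excludedNodeIds) {
    return open.stream()
        .filter(p -> p.nodeIds.stream().noneMatch(excludedNodeIds::contains))
        .count();
  }

  /** Only usable pipelines are compared against the configured limit. */
  static boolean canOpenAnother(List<OpenPipeline> open,
      Set<String> excludedNodeIds, int limit) {
    return countUsable(open, excludedNodeIds) < limit;
  }
}
```

With this accounting, a pipeline that contains an excluded Datanode no longer blocks a replacement from being opened, although it still stays open until something closes it.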

  2. Changing this configuration requires an SCM restart. To make it reconfigurable at runtime, we should either use the ReconfigurableConfig framework or introduce a custom CLI command that updates it without a restart. Both approaches require some thought about correctness and thread safety: the code should not simply switch to the new configuration in the middle of an allocateBlock call, because that could cause subtle bugs.
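One possible way to address the thread-safety concern, shown purely as a sketch and not as what this PR or Ozone does, is to publish each new exclusion set as an immutable snapshot and have every allocation read the snapshot once at its start. The class and method names below are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder that swaps the exclusion set atomically. */
final class ExclusionSnapshotSketch {
  private final AtomicReference<Set<String>> current =
      new AtomicReference<>(Collections.<String>emptySet());

  /**
   * Would be called from a reconfiguration path (assumed, not in this PR).
   * Publishes an unmodifiable copy so readers never see a partial update.
   */
  void reconfigure(Set<String> newExcluded) {
    current.set(Collections.unmodifiableSet(new HashSet<>(newExcluded)));
  }

  /**
   * Would be called once at the start of an allocation. The returned set
   * does not change, even if reconfigure() runs while the call is in flight.
   */
  Set<String> snapshot() {
    return current.get();
  }
}
```

An allocation that holds on to its snapshot sees one consistent exclusion set for its whole duration, and only later calls observe a new value.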

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14822

How was this patch tested?

Added unit tests.
Successful CI run in my fork - https://github.com/siddhantsangwan/ozone/actions/runs/23043011479.
