Skip to content

DAOS-18768 cart: Change default mrc settings#17948

Open
frostedcmos wants to merge 1 commit intomasterfrom
frostedcmos/DAOS-18768
Open

DAOS-18768 cart: Change default mrc settings#17948
frostedcmos wants to merge 1 commit intomasterfrom
frostedcmos/DAOS-18768

Conversation

@frostedcmos
Copy link
Copy Markdown
Contributor

  • Disable mrc for all providers on servers, leave untouched on clients
  • Include UCX_RCACHE_ENABLE=n setting when disabling MRC
  • Move mrc code from per-provider setup to data_init()

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

- Disable mrc for all providers on servers, leave untouched on clients
- Include UCX_RCACHE_ENABLE=n setting when disabling MRC
- Move mrc code from per-provider setup to data_init()

Signed-off-by: Alexander A Oganezov <alexander.oganezov@hpe.com>
d_setenv("FI_UNIVERSE_SIZE", "2048", 1);

/* Disable MRC on servers and enable on clients by default */
mrc_enable = server ? 0 : 1;
Copy link
Copy Markdown
Collaborator

@soumagne soumagne Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mrc_enable = server ? 0 : 1;
mrc_enable = !server;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe it is clearer to keep the original and spell out the two cases explicitly with "?"

mrc_enable = server ? 0 : 1;
crt_env_get(CRT_MRC_ENABLE, &mrc_enable);

if (mrc_enable == 0) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (mrc_enable == 0) {
if (!mrc_enable) {

unsigned int mrecv_buf = CRT_HG_MRECV_BUF;
unsigned int mrecv_buf_copy = 0; /* buf copy disabled by default */
char *swim_traffic_class = NULL;
uint32_t mrc_enable = 0;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not make it a boolean ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be uint32_t for crt_env stuff. server is also currently defined as int, so changing all to bools would be messier in this pr

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Ticket title is 'Rank excluded due to: corrupted double-linked list after rebuild was triggered'
Status is 'Open'
https://daosio.atlassian.net/browse/DAOS-18768

Copy link
Copy Markdown
Collaborator

@Michael-Hennecke Michael-Hennecke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex - is this worth a 2.6 backport (after the 2.8 backport)?

d_setenv("FI_UNIVERSE_SIZE", "2048", 1);

/* Disable MRC on servers and enable on clients by default */
mrc_enable = server ? 0 : 1;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe it is clearer to keep the original and spell out the two cases explicitly with "?"

@frostedcmos
Copy link
Copy Markdown
Contributor Author

Thanks Alex - is this worth a 2.6 backport (after the 2.8 backport)?

I am not sure on this. the primary fix here is to add ucx disabling part, and i dont think we have any ucx users on 2.6. But if someone else decides it is needed for 2.6 then a backport can be easily made

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants