Skip to content

MDEV-37664: Fix main.lotofstack failure under UBSAN/MSAN+Debug#5039

Open
kjarir wants to merge 1 commit intoMariaDB:10.11from
kjarir:10.11
Open

MDEV-37664: Fix main.lotofstack failure under UBSAN/MSAN+Debug#5039
kjarir wants to merge 1 commit intoMariaDB:10.11from
kjarir:10.11

Conversation

@kjarir
Copy link
Copy Markdown
Contributor

@kjarir kjarir commented May 4, 2026

The test was expecting only ER_STACK_OVERRUN_NEED_MORE (1436) but under sanitizer builds (UBSAN, MSAN+Debug), the recursive stored procedure call bug10100p(255) fails with ER_SP_RECURSION_LIMIT or succeeds (0) instead, because sanitizer instrumentation increases per-frame stack usage, altering the recursion depth at which the stack guard fires.

Fix: Accept 0 and ER_SP_RECURSION_LIMIT as additional valid outcomes for bug10100p in the test, following the pattern from PR #4821.

The test was expecting only ER_STACK_OVERRUN_NEED_MORE (1436) but under
sanitizer builds (UBSAN, MSAN+Debug) the recursive stored procedure call
bug10100p(255) fails with ER_SP_RECURSION_LIMIT or succeeds (0) instead,
because sanitizer instrumentation increases per-frame stack usage,
altering the recursion depth at which the stack guard or recursion limit is hit.

Fix: Accept 0 and ER_SP_RECURSION_LIMIT as additional valid outcomes in the
test, following the pattern from PR MariaDB#4821.

Reviewed-by: <add reviewer>
@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label May 4, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contributions! This is a preliminary review.

I believe the test should be disabled for UBSAN similarly to how it's disabled for ASAN. Or fixed for all cases.

-- disable_ps_protocol
-- disable_result_log
-- error ER_STACK_OVERRUN_NEED_MORE
-- error 0,ER_STACK_OVERRUN_NEED_MORE,ER_SP_RECURSION_LIMIT
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I subscribe to idea of the fix.
Serg's approach here was to just disable these for ASAN. I guess the same needs to be done for UBSAN.

But, even if you are hoping to fix this, you need to be fixing it for all cases where the above error is returned.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just discussed with @grooverdan who suggested fixing rather than disabling. The approach is to add more local variables to the procedure to increase per-frame stack consumption so 255 recursions reliably triggers the stack guard under UBSAN. Currently setting up a local UBSAN build to verify the right amount

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change the test doesn't test what it was supposed to test anymore. I support fixing rather than disabling if reasonable fix is possible.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is that you're not really fixing the test, as @svoj suggested. Please consider the alternative I've mentioned.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

original bug https://bugs.mysql.com/bug.php?id=10100 was about ensuring recursion was possible. The current tests this by reaching the stack limit.

Seems hitting the recursion limit is also a suitable test for proving recursion works.

If max_sp_recursion_depth is dropped to say 25 (MSAN Debug was hitting cursor limit at 49), and the error is changed to ER_SP_RECURSION_LIMIT, then it will work on all, including ASAN. Acceptable?

Sorry @kjarir, orginal stack padding was probably a bad idea given max_sp_recursion_depth engineerd to a 255 max is probably trying to prevent a stack overflow, even though it doesn't.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grooverdan dunno, 0 looks definitely wrong. ER_SP_RECURSION_LIMIT is testing different code path, which is already covered by a different test (the other part of the test that was left in sp.test after the split).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

4 participants