MDEV-37713/MDEV-37714 Fold boolean literals in SELECT-list #4754

Open
jaeheonshim wants to merge 1 commit into MariaDB:main from jaeheonshim:MDEV-37713


jaeheonshim commented Mar 8, 2026

The order of evaluation of the expressions that appear in a SELECT-list is undefined. This change exploits that fact by recursively folding TRUE/FALSE literals in OR/AND expressions, which may allow the server to skip evaluating parts of an expression or even the whole expression.

Example

Setup

CREATE TABLE t1(c0 INT8);
CREATE TABLE t2(c0 INT8);
INSERT INTO t1 VALUES(1);
INSERT INTO t2 SELECT seq FROM seq_1_to_10000000;

Before

SELECT (SELECT MIN(c0) FROM t2)<0 OR true;
+------------------------------------+
| (SELECT MIN(c0) FROM t2)<0 OR true |
+------------------------------------+
|                                  1 |
+------------------------------------+
1 row in set (1.335 sec)


SELECT ((SELECT MIN(c0) FROM t2)<0 AND false) OR (SELECT MAX(c0) FROM t1)>0;
+----------------------------------------------------------------------+
| ((SELECT MIN(c0) FROM t2)<0 AND false) OR (SELECT MAX(c0) FROM t1)>0 |
+----------------------------------------------------------------------+
|                                                                    1 |
+----------------------------------------------------------------------+
1 row in set (1.413 sec)

After

SELECT (SELECT MIN(c0) FROM t2)<0 OR true;
+------------------------------------+
| (SELECT MIN(c0) FROM t2)<0 OR true |
+------------------------------------+
|                                  1 |
+------------------------------------+
1 row in set (0.000 sec)


SELECT ((SELECT MIN(c0) FROM t2)<0 AND false) OR (SELECT MAX(c0) FROM t1)>0;
+----------------------------------------------------------------------+
| ((SELECT MIN(c0) FROM t2)<0 AND false) OR (SELECT MAX(c0) FROM t1)>0 |
+----------------------------------------------------------------------+
|                                                                    1 |
+----------------------------------------------------------------------+
1 row in set (0.000 sec)
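
The folding behind the timings above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `Kind`, `Expr` and `fold` are hypothetical names, not MariaDB's `Item` classes or the patch's actual code, and every `Opaque` leaf is assumed to be boolean-valued (TRUE, FALSE or NULL), so that replacing `TRUE AND x` with `x` does not change the result.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical mini expression tree for illustration; not MariaDB's Item classes.
enum class Kind { True, False, Opaque, And, Or };

struct Expr {
  Kind kind;
  std::vector<Expr> args;  // operands, used only by And / Or nodes
  int id;                  // tags an Opaque leaf such as a subquery comparison
};

// Recursively folds TRUE/FALSE literals out of nested AND/OR nodes.
// Assumes every Opaque leaf is boolean-valued (TRUE, FALSE or NULL).
Expr fold(const Expr &e) {
  if (e.kind != Kind::And && e.kind != Kind::Or)
    return e;  // literals and opaque leaves are returned unchanged

  const bool is_and = (e.kind == Kind::And);
  const Kind absorbing = is_and ? Kind::False : Kind::True;  // decides the result
  const Kind neutral = is_and ? Kind::True : Kind::False;    // can be dropped

  std::vector<Expr> kept;
  for (const Expr &arg : e.args) {
    Expr folded = fold(arg);
    if (folded.kind == absorbing)
      return Expr{absorbing, {}, 0};  // x AND FALSE -> FALSE, x OR TRUE -> TRUE
    if (folded.kind != neutral)
      kept.push_back(std::move(folded));
  }

  if (kept.empty())
    return Expr{neutral, {}, 0};  // e.g. TRUE AND TRUE -> TRUE
  if (kept.size() == 1)
    return kept.front();          // e.g. FALSE OR x -> x
  return Expr{e.kind, std::move(kept), 0};
}
```

With the expensive `t2` comparison modelled as one opaque leaf and the cheap `t1` comparison as another, the first query folds to the literal TRUE and the second folds to the `t1` leaf alone, which is consistent with both queries no longer scanning `t2`.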

jaeheonshim force-pushed the MDEV-37713 branch 3 times, most recently from 8acdfc3 to 3dc42db, on March 8, 2026 07:52
gkodinov added the External Contribution label (all PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements) on Mar 9, 2026
Member

gkodinov left a comment

Thank you for your contribution! This is a preliminary review.

Please also add a test case.

Also, the Jira mentions one optimization you're not really doing:

SELECT (a > 0 AND a < 0) FROM t1;
SELECT (a > 0 OR a <=0) FROM t1;

Would you consider doing this too? If you won't, we'll need to open a sub-task Jira instead of using the current one.

li2.remove();
}
}
if (list.is_empty())
Member

Can the list ever be empty, given that you're removing one of the branches only if another branch exists?

I'd say no, in which case this is dead code and should be turned into an assert.

Author

The list can be empty. For example, if we are dealing with something like TRUE AND TRUE, both TRUEs will be dropped during the while loop, and we will be left with an empty list.

Member

Not sure I follow. You will only simplify an X AND|OR Y expression by replacing it with X or Y. That is a multiple-parameter expression; you will do nothing on single-parameter expressions.

Can you please demonstrate with an example SQL statement how the list becomes empty?

Author

In an expression or sub-expression composed only of boolean literals, the list can become empty, since all TRUEs are dropped from AND clauses and all FALSEs are dropped from OR clauses. Example SQL where this happens: SELECT TRUE AND TRUE, or SELECT FALSE OR FALSE. I tested this on my end and can confirm that the code after if (list.is_empty()) runs in certain cases, but I'd be happy to clarify or demonstrate further.
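
A minimal sketch of how an argument list can end up empty, assuming the loop drops every neutral literal without checking whether other arguments remain. `Lit` and `simplify_args` are hypothetical stand-ins for illustration, not the functions or list type used in the patch.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-ins: Lit::Other marks any non-literal argument.
enum class Lit { True, False, Other };

// Removes neutral literals from an AND/OR argument list in place and
// returns the literal the node folds to, or Lit::Other if arguments remain.
Lit simplify_args(std::list<Lit> &args, bool is_and) {
  const Lit neutral = is_and ? Lit::True : Lit::False;
  const Lit absorbing = is_and ? Lit::False : Lit::True;

  for (auto it = args.begin(); it != args.end();) {
    if (*it == absorbing)
      return absorbing;      // x AND FALSE, x OR TRUE: the result is decided
    if (*it == neutral)
      it = args.erase(it);   // TRUE in AND, FALSE in OR: drop the argument
    else
      ++it;
  }

  if (args.empty())
    return neutral;          // only neutral literals were present
  return Lit::Other;
}
```

For the argument list of TRUE AND TRUE, both entries are erased and the empty-list branch returns TRUE; for FALSE OR FALSE the same branch returns FALSE. Under this reading the empty-list check is reachable rather than dead code.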

If item is a SELECT-list COND_ITEM, rewrite it on the first time this
query is optimized to fold boolean expressions
*/
if (thd->lex->current_select->context_analysis_place == SELECT_LIST &&
Member

Any specific reason why you're doing this only for the SELECT list expressions?
Why not do this during fix_fields for all Item_conds for example?

Member

Forgot to mention one thing: When doing the optimization you are doing you need to check for NULL-ness!

SELECT true AND NULL FROM t1

should return NULL and not true.

Author

> Any specific reason why you're doing this only for the SELECT list expressions? Why not do this during fix_fields for all Item_conds for example?

Clauses like WHERE/HAVING/ON already implement this folding through the remove_eq_conds function. However, its semantics are not correct for the SELECT-list (I tried using remove_eq_conds at first, but it failed a bunch of tests). For example, something like NULL AND TRUE would evaluate to FALSE after remove_eq_conds, when it should be NULL.

I guess you can say the simplify_cond function is a 'weaker' version of the remove_eq_conds in order to preserve correct semantics.
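
A small sketch of why a rewrite that is harmless in a filtering clause can be wrong in the SELECT-list. It assumes C++17 and models SQL's three-valued logic with `std::optional<bool>`; `Tri`, `sql_and` and `passes_where` are hypothetical names and do not reflect how remove_eq_conds is implemented.

```cpp
#include <cassert>
#include <optional>

// std::nullopt stands for SQL NULL (hypothetical model, not server code).
using Tri = std::optional<bool>;

// Three-valued AND: FALSE wins, then NULL, otherwise TRUE.
Tri sql_and(Tri a, Tri b) {
  if ((a && !*a) || (b && !*b))
    return false;            // any FALSE operand decides the result
  if (!a || !b)
    return std::nullopt;     // no FALSE, but at least one NULL
  return true;
}

// WHERE/HAVING/ON keep a row only when the condition is TRUE,
// so NULL and FALSE are indistinguishable to them.
bool passes_where(Tri cond) { return cond.value_or(false); }
```

`sql_and(std::nullopt, true)` is NULL. A filtering clause rejects the row either way, so treating that NULL as FALSE there is unobservable, while a SELECT-list expression returns the value itself and must keep the NULL.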

Author

jaeheonshim commented Mar 10, 2026

> Forgot to mention one thing: When doing the optimization you are doing you need to check for NULL-ness!
>
> SELECT true AND NULL FROM t1
>
> should return NULL and not true.

This works! The true simply gets folded out and the expression is left as NULL.

Member

How about TRUE OR NULL? This should still return NULL. And in your example above it's returning TRUE (1).

Author

On the latest stable version of MariaDB, it returns TRUE:

Your MariaDB connection id is 5
Server version: 11.8.6-MariaDB-ubu2404 mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SELECT TRUE OR NULL;
+--------------+
| TRUE OR NULL |
+--------------+
|            1 |
+--------------+
1 row in set (0.000 sec)

By the way, I tested all of the test cases I wrote against a distribution version of MariaDB running in a Docker container.
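
The server output above matches standard SQL three-valued logic, where a TRUE operand decides an OR regardless of the other operand. A minimal sketch, assuming C++17 and using `std::optional<bool>` as a hypothetical model of a nullable boolean (`Tri` and `sql_or` are illustrative names only):

```cpp
#include <cassert>
#include <optional>

// std::nullopt stands for SQL NULL (hypothetical model, not server code).
using Tri = std::optional<bool>;

// Three-valued OR: TRUE wins, then NULL, otherwise FALSE.
Tri sql_or(Tri a, Tri b) {
  if ((a && *a) || (b && *b))
    return true;             // any TRUE operand decides the result
  if (!a || !b)
    return std::nullopt;     // no TRUE, but at least one NULL
  return false;
}
```

Here `sql_or(true, std::nullopt)` is TRUE, while `sql_or(false, std::nullopt)` is NULL. That is why TRUE can absorb an entire OR, but FALSE can only be dropped from it.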

Author

> Also, the Jira mentions one optimization you're not really doing:
>
> SELECT (a > 0 AND a < 0) FROM t1;
> SELECT (a > 0 OR a <=0) FROM t1;
>
> Would you consider doing this too? If you won't, we'll need to open a sub-task Jira instead of using the current one.

Unfortunately, I don't think something like that can be added. We would risk breaking the semantics of the SQL query. For example, if a is NULL, we cannot fold the expression (a > 0 AND a < 0) to FALSE, as the correct answer should be NULL.
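
The NULL case can be checked with a short sketch, assuming C++17 and modelling a nullable column value with `std::optional`. `Tri`, `NullableInt`, `gt`, `lt`, `sql_and` and `contradiction` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <optional>

// std::nullopt stands for SQL NULL (hypothetical model, not server code).
using Tri = std::optional<bool>;
using NullableInt = std::optional<long long>;

// A comparison involving NULL yields NULL rather than TRUE or FALSE.
Tri gt(NullableInt a, long long k) {
  if (!a) return std::nullopt;
  return *a > k;
}

Tri lt(NullableInt a, long long k) {
  if (!a) return std::nullopt;
  return *a < k;
}

// Three-valued AND: FALSE wins, then NULL, otherwise TRUE.
Tri sql_and(Tri a, Tri b) {
  if ((a && !*a) || (b && !*b)) return false;
  if (!a || !b) return std::nullopt;
  return true;
}

// Models the SELECT-list expression (a > 0 AND a < 0).
Tri contradiction(NullableInt a) { return sql_and(gt(a, 0), lt(a, 0)); }
```

For any non-NULL `a` the expression is FALSE, but `contradiction(std::nullopt)` is NULL, so an unconditional fold to FALSE would change the result. Such a fold would only be safe if the column were known to be non-nullable, which this sketch does not attempt to detect.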
