MDEV-37713/MDEV-37714 Fold boolean literals in SELECT-list#4754
jaeheonshim wants to merge 1 commit into MariaDB:main
gkodinov left a comment
Thank you for your contribution! This is a preliminary review.
Please also add a test case.
Also, the Jira mentions one optimization that this change does not implement:
SELECT (a > 0 AND a < 0) FROM t1;
SELECT (a > 0 OR a <= 0) FROM t1;
Would you consider doing this too? If you won't, we'll need to open a sub-task Jira instead of using the current one.
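One point worth checking before folding the two expressions above to constants is how they behave when a is NULL. The sketch below is not code from the PR; it models SQL three-valued logic with a hypothetical `Tv` enum and uses a null pointer to stand for a NULL column value. Under those assumptions, neither expression is constant unless a is known to be NOT NULL.

```cpp
// Hypothetical three-valued truth type; not a MariaDB type.
enum class Tv { False, Null, True };

Tv tv_and(Tv x, Tv y) {
  if (x == Tv::False || y == Tv::False) return Tv::False;
  if (x == Tv::Null || y == Tv::Null) return Tv::Null;
  return Tv::True;
}

Tv tv_or(Tv x, Tv y) {
  if (x == Tv::True || y == Tv::True) return Tv::True;
  if (x == Tv::Null || y == Tv::Null) return Tv::Null;
  return Tv::False;
}

// A comparison with a NULL operand yields NULL. nullptr models a NULL column.
Tv gt0(const int* a) { return !a ? Tv::Null : (*a > 0 ? Tv::True : Tv::False); }
Tv lt0(const int* a) { return !a ? Tv::Null : (*a < 0 ? Tv::True : Tv::False); }
Tv le0(const int* a) { return !a ? Tv::Null : (*a <= 0 ? Tv::True : Tv::False); }

// (a > 0 AND a < 0): FALSE for every non-NULL a, but NULL when a is NULL.
Tv contradiction(const int* a) { return tv_and(gt0(a), lt0(a)); }

// (a > 0 OR a <= 0): TRUE for every non-NULL a, but NULL when a is NULL.
Tv tautology(const int* a) { return tv_or(gt0(a), le0(a)); }
```

In this model, a rewrite to a constant FALSE or TRUE would be safe only for operands that cannot be NULL.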
      li2.remove();
    }
  }
  if (list.is_empty())
Can the list ever be empty, given that you're removing one of the branches only if another branch exists?
I'd say no, in which case this is dead code and should be turned into an assert.
The list can be empty. For example, if we are dealing with something like TRUE AND TRUE, both TRUEs will be dropped during the while loop, leaving an empty list.
Not sure I follow. You will only simplify X AND|OR Y by replacing it with X or Y. This is a multiple-parameter expression. You will do nothing on single-parameter expressions.
Can you please demonstrate how the list becomes empty with an example SQL statement?
In an expression or sub-expression composed of only boolean literals, the list can become empty. Examples where this happens are SELECT TRUE AND TRUE and SELECT FALSE OR FALSE, since all TRUEs are dropped from AND clauses and all FALSEs are dropped from OR clauses. I tested this on my end and can confirm that the code after if (list.is_empty()) runs in certain cases, but I'd be happy to clarify or demonstrate further.
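The empty-list case described above can be illustrated with a small standalone sketch. It is not the PR's code: `Operand`, `Op`, `drop_identity_literals`, and `identity_literal` are hypothetical names, and a plain `std::vector` stands in for the argument list. It only shows that dropping every TRUE from an AND (or every FALSE from an OR) can leave nothing behind, at which point the expression equals its identity literal.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical operand kinds: a boolean literal or any other expression.
enum class Operand { True, False, Other };
enum class Op { And, Or };

// TRUE is the identity of AND, FALSE is the identity of OR.
Operand identity_literal(Op op) {
  return op == Op::And ? Operand::True : Operand::False;
}

// Removes every identity literal from args and returns how many operands
// remain. A result of 0 is the empty-list case discussed in the thread.
std::size_t drop_identity_literals(Op op, std::vector<Operand>& args) {
  const Operand identity = identity_literal(op);
  args.erase(std::remove(args.begin(), args.end(), identity), args.end());
  return args.size();
}
```

For TRUE AND TRUE the vector ends up empty, so a caller would substitute `identity_literal(Op::And)`, that is TRUE. For TRUE AND x, one operand remains and the list is not empty.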
    If item is a SELECT-list COND_ITEM, rewrite it on the first time this
    query is optimized to fold boolean expressions
  */
  if (thd->lex->current_select->context_analysis_place == SELECT_LIST &&
Any specific reason why you're doing this only for the SELECT list expressions?
Why not do this during fix_fields for all Item_conds for example?
Forgot to mention one thing: when doing this optimization, you need to check for NULL-ness!
SELECT true AND NULL FROM t1;
should return NULL and not true.
> Any specific reason why you're doing this only for the SELECT list expressions? Why not do this during fix_fields for all Item_conds for example?
Clauses like WHERE/HAVING/ON already implement this folding through the remove_eq_conds function. However, its semantics are not correct in the SELECT-list (I tried using remove_eq_conds at first, but it failed a bunch of tests). For example, something like NULL AND TRUE would evaluate to FALSE after remove_eq_conds, when it should be NULL.
I guess you can say the simplify_cond function is a 'weaker' version of the remove_eq_conds in order to preserve correct semantics.
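The difference between the two contexts can be sketched with SQL three-valued logic. The code below is an illustration only, not MariaDB code: `Tri`, `tri_and`, and `row_passes_filter` are hypothetical names. It shows why treating NULL as FALSE is harmless for a filter condition but changes the value a SELECT-list expression produces.

```cpp
// Hypothetical three-valued truth type; not a MariaDB type.
enum class Tri { False, Null, True };

// SQL AND: FALSE dominates, then NULL, otherwise TRUE.
Tri tri_and(Tri a, Tri b) {
  if (a == Tri::False || b == Tri::False) return Tri::False;
  if (a == Tri::Null || b == Tri::Null) return Tri::Null;
  return Tri::True;
}

// A WHERE/HAVING/ON condition keeps a row only when it is TRUE, so NULL
// and FALSE are indistinguishable to the filter.
bool row_passes_filter(Tri cond) { return cond == Tri::True; }

// A SELECT-list expression returns its value to the client, so NULL and
// FALSE must stay distinct.
Tri select_list_value(Tri cond) { return cond; }
```

Here `tri_and(Tri::Null, Tri::True)` is `Tri::Null`. The filter rejects the row either way, but the SELECT list must still report NULL rather than FALSE.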
> Forgot to mention one thing: When doing the optimization you are doing you need to check for NULL-ness!
> SELECT true AND NULL FROM t1 should return NULL and not true.
This works! The true simply gets folded out and the expression is left as NULL.
How about TRUE OR NULL? This should still return NULL. And in your example above it's returning TRUE (1).
On the latest stable version of MariaDB, it returns TRUE:
Your MariaDB connection id is 5
Server version: 11.8.6-MariaDB-ubu2404 mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> SELECT TRUE OR NULL;
+--------------+
| TRUE OR NULL |
+--------------+
| 1 |
+--------------+
1 row in set (0.000 sec)
By the way, I tested all of the test cases I wrote against a distribution build of MariaDB running in a Docker container.
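The session output above agrees with SQL three-valued logic, in which TRUE is absorbing for OR even when the other operand is NULL. A minimal sketch of the OR truth table follows, using a hypothetical `Truth` enum and `sql_or` function rather than anything from the server source:

```cpp
// Hypothetical three-valued truth type; not a MariaDB type.
enum class Truth { False, Null, True };

// SQL OR: TRUE dominates, then NULL, otherwise FALSE.
Truth sql_or(Truth a, Truth b) {
  if (a == Truth::True || b == Truth::True) return Truth::True;
  if (a == Truth::Null || b == Truth::Null) return Truth::Null;
  return Truth::False;
}
```

In this model, `sql_or(Truth::True, Truth::Null)` is `Truth::True`, while `sql_or(Truth::False, Truth::Null)` is `Truth::Null`. Only the second case requires NULL to be preserved.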
Unfortunately, I don't think something like that can be added. We would risk breaking the semantics of the sql query. For example, if |
The order of evaluation of the expressions that appear in a SELECT-list is undefined. This change exploits this fact by recursively folding TRUE/FALSE literals in OR/AND expressions which may allow for skipping evaluation of some parts of the expression or even the whole expression.
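The recursive folding described here can be sketched as a small standalone function. This is an illustration under stated assumptions, not the PR's implementation: `Expr` and `fold` are hypothetical, operands are assumed to be boolean-valued, and `Other` stands for any non-literal operand (including NULL). Identity literals are dropped, an absorbing literal replaces the whole expression, and an emptied list becomes the identity literal.

```cpp
#include <utility>
#include <vector>

// Hypothetical expression node; args is used only for And/Or.
struct Expr {
  enum Kind { True, False, Other, And, Or } kind;
  std::vector<Expr> args;
};

Expr fold(Expr e) {
  if (e.kind != Expr::And && e.kind != Expr::Or) return e;
  const Expr::Kind identity = (e.kind == Expr::And) ? Expr::True : Expr::False;
  const Expr::Kind absorbing = (e.kind == Expr::And) ? Expr::False : Expr::True;

  std::vector<Expr> kept;
  for (Expr& child : e.args) {
    Expr folded = fold(std::move(child));
    // FALSE in an AND (or TRUE in an OR) decides the result on its own,
    // even if another operand is NULL.
    if (folded.kind == absorbing) return Expr{absorbing, {}};
    // TRUE in an AND (or FALSE in an OR) never changes the result.
    if (folded.kind != identity) kept.push_back(std::move(folded));
  }
  if (kept.empty()) return Expr{identity, {}};  // e.g. TRUE AND TRUE
  if (kept.size() == 1) return std::move(kept.front());
  e.args = std::move(kept);
  return e;
}
```

Because NULL is treated as an opaque `Other` operand, TRUE AND NULL folds to the NULL operand and TRUE OR NULL folds to TRUE, which matches three-valued logic.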