MDEV-39499 Updates to derived-with-keys, window functions determining records per key#5020
MDEV-39499 Updates to derived-with-keys, window functions determining records per key#5020mariadb-RexJohnston wants to merge 1 commit into11.4from
Conversation
3b06e82 to
48a91d9
Compare
|
I was just looking at the patch, mrr_derived_crash_4610.result. This thing is present both before and after the patch: top-level select is an inner join, and there is a row with rows=0.... |
| --echo # | ||
|
|
||
|
|
||
| # |
There was a problem hiding this comment.
Use --echo # so that the testcase is delimited in .result, too.
| where ranked = 1 | ||
| ) | ||
| select * from ranked_table; | ||
|
|
There was a problem hiding this comment.
One has to strain the eye and compare $q0, $q1 ... . Please use short comment near let $q_N=... describing the difference.
Please do the same near EXPLAIN. So when/if that changes one can have a clue if that's a problem or not.
|
handle_single_part_rownumber: Use ha_rows datatype for anything that's number of records. |
Unfortunately the number shown in the explain output is an |
48a91d9 to
fd9f165
Compare
| longlong val_datetime_packed(THD *thd) override; | ||
| uint size_of() const override { return sizeof *this; } | ||
| Binlog_type_info binlog_type_info() const override; | ||
| bool hash_join_is_possible() override { return !null_bit; }; //MDEV-39369 |
There was a problem hiding this comment.
We will need to discuss this change.
Topics:
-
Remove
;at the very end. -
instead of
null_bit, usereal_maybe_null()(or should it bemaybe_null()?) -
Why do we need to disable this? Because MDEV-39499 will make it possible to hit MDEV-39369 for a wider set of queries?
-
Should the change be in a separate patch?
There was a problem hiding this comment.
I'm not sure what to do with this either. It's there because i needed to make a test failure involving a hash join go away. It should be a separate patch, but do we want this to be dependent on it?
The change allowing a const key part on a derived table will give the optimizer more choice, this is what has happened here.
I don't think this is a very good fix for MDEV-39369.
| if ((*part_list->item)->type() != Item::NULL_ITEM) | ||
| { | ||
| if ((*part_list->item)->type() == Item::FIELD_ITEM) | ||
| { |
There was a problem hiding this comment.
What is the point of the above two nested if conditions?
if ((*part_list->item)->type() != Item::NULL_ITEM)
if ((*part_list->item)->type() == Item::FIELD_ITEM)
``
Can't one just check the second?
There was a problem hiding this comment.
yes this is daft. I'll change it.
| If we have only one select in our unit and | ||
| we have a window function in our select | ||
| */ | ||
| if (!first->next_select() && first->with_sum_func) |
There was a problem hiding this comment.
To discuss: It is not clear that first->with_sum_func mean "we have a window function" ?
Is it the case here, because we've checked that derived table does not have SELECTS that have GROUP BY? (And if yes, where does the case of implicit grouping go?)
There was a problem hiding this comment.
The Item_sum base class constructor sets this flag during parsing.
│ 11113 simple_window_func:
│ 11114 ROW_NUMBER_SYM '(' ')'
│ 11115 {
│ > 11116 $$= new (thd->mem_root) Item_sum_row_number(thd);
│ 11117 if (unlikely($$ == NULL))
│ 11118 MYSQL_YYABORT;
│ 11119 }
This appears to be true of any window function, independent of any grouping operators.
|
If I just take this part of the patch: diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index 531b30cce21..5f5233aaf95 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -9002,8 +9002,18 @@ best_access_path(JOIN *join,
table->opt_range[key].get_costs(&tmp);
goto got_cost2;
}
- /* quick_range couldn't use key! */
- records= (double) s->records/rec;
+ /*
+ quick_range couldn't use key!
+ Check records per key, it may have been inferred from
+ derived table characteristics, so there will be no table map.
+ */
+
+ if (!(records=
+ keyinfo->rec_per_key_null_aware(key_parts-1, notnull_part)))
+ records= (double) s->records/rec;
+
+ // /* quick_range couldn't use key! */
+ // records= (double) s->records/rec;
if (unlikely(trace_access_idx.trace_started()))
trace_access_idx.
add("used_range_estimates", false).Then I already get Looking at the change in rowid_filter_myisam --- /home/psergey/dev-git/11.4-review-derived-const-keys/mysql-test/main/rowid_filter_myisam.result 2026-05-
04 14:56:42.450322361 +0300
+++ /home/psergey/dev-git/11.4-review-derived-const-keys/mysql-test/main/rowid_filter_myisam.reject 2026-05-
04 15:02:59.218141505 +0300
@@ -91,7 +91,7 @@
SELECT * FROM t1 HAVING (7, 9) IN (SELECT t2.i1, t2.i2 FROM t2 WHERE t2.i1 = 3);
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY NULL NULL NULL NULL NULL NULL NULL NULL Impossible HAVING
-2 SUBQUERY t2 index_subquery i1,i2 i2 5 const 10 0.92 Using where
+2 SUBQUERY NULL NULL NULL NULL NULL NULL NULL NULL no matching row in const
table
Warnings:
Note 1003 /* select#1 */ select `test`.`t1`.`pk` AS `pk` from `test`.`t1` having 0
DROP TABLE t1,t2;Looks unexpected. |
t2 has |
fd9f165 to
f536bac
Compare
..records per key Enabling derived keys optimization for derived.col = const pushed conditions. Estimating records per key in derived key for the optimizer based on form and/or size of components of a derived table. Consider any derived table of the form SELECT ..., ROW_NUMBER () OVER (PARTITION BY c1,c2 order by ...) FROM t1, t2, t3 ... WHERE ... If the optimizer generates a key on this derived table because of a constraint being pushed into it, it currently will not consider key components of the form col = const We lift this constraint and add code to TABLE::add_tmp_key to search for a window function ROW_NUMBER(). From the partition list c1, c2 we can in infer an estimate of the number of rows we expect to see for each key value. In the absence of EITS, we still have an number of expected records in our referred to table and will fall back to using that as a worst case scenario. As a consequence of forcing best_access_path to now consider a generated index's records per key on any type of derived table, we see a number of optimization changes, for example a range optimizer might now be considered best and might return IMPOSSIBLE_RANGE (seen in rowid_filter_myisam.result).
f536bac to
6b3e95b
Compare
Enabling derived keys optimization for derived.col = const pushed conditions.
Estimating records per key in derived key for the optimizer based on form and/or size of components of a derived table.
Consider a derived table of the form
SELECT ..., ROW_NUMBER () OVER (PARTITION BY c1,c2 order by ...) FROM t1, t2, t3 ...
WHERE ...
If the optimizer generates a key on this derived table because of a constraint being pushed into it, it currently will not consider key components of the form col = const
We lift this constraint and add code to TABLE::add_tmp_key to search for a window function ROW_NUMBER(). From the partition list c1, c2 we can in infer an estimate of the number of rows we expect to see for each key value. The optimizer can then use this number to determine a better table join order.