Add case-heavy LEFT JOIN benchmark and debug timing/logging for PushDownFilter hot paths#20664

Merged
kosiew merged 6 commits into apache:main from kosiew:push_down-20002
Mar 7, 2026

Conversation

@kosiew
Contributor

@kosiew kosiew commented Mar 3, 2026

Which issue does this PR close?

Rationale for this change

The PushDownFilter optimizer rule shows a severe planner-time performance pathology in the sql_planner_extended benchmark, where profiling indicates it dominates total planning CPU time and repeatedly recomputes expression types.

This PR adds a deterministic, CASE-heavy LEFT JOIN benchmark to reliably reproduce the worst-case behavior and introduces lightweight debug-only timing + counters inside push_down_filter to make it easier to pinpoint expensive sub-sections (e.g. predicate simplification and join predicate inference) during profiling.

What changes are included in this PR?

  • Benchmark: add a deterministic CASE-heavy LEFT JOIN workload

    • Adds build_case_heavy_left_join_query and helpers to construct a CASE-nested predicate chain over a LEFT JOIN.

    • Adds a new benchmark logical_plan_optimize_case_heavy_left_join to stress planning/optimization time.

    • Adds an A/B benchmark group push_down_filter_case_heavy_left_join_ab that sweeps predicate counts and CASE depth, comparing:

      • default optimizer with push_down_filter enabled
      • optimizer with push_down_filter removed
  • Optimizer instrumentation (debug-only)

    • Adds a small with_debug_timing helper gated by log_enabled!(Debug) to record microsecond timings for specific sections.

    • Instruments and logs:

      • time spent in infer_join_predicates
      • time spent in simplify_predicates
      • counts of parent predicates, on_filters, inferred join predicates
      • before/after predicate counts for simplification
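The instrumentation approach above can be sketched roughly as follows. This is a hypothetical, dependency-free approximation: the PR gates timing on `log_enabled!(Debug)` from the `log` crate and emits via `log::debug!`, while this sketch approximates the gate with a `RUST_LOG` environment check and `eprintln!` so it stands alone.

```rust
use std::time::Instant;

// Hypothetical sketch of a debug-only timing helper. The real helper is
// gated by `log_enabled!(Debug)`; here we check RUST_LOG directly so the
// example has no external dependencies.
fn with_debug_timing<T>(section: &str, f: impl FnOnce() -> T) -> T {
    let debug_on = std::env::var("RUST_LOG")
        .map(|v| v.contains("debug"))
        .unwrap_or(false);
    if !debug_on {
        // Fast path: zero timing overhead when debug logging is disabled.
        return f();
    }
    let start = Instant::now();
    let result = f();
    // The real helper would use `log::debug!` instead of eprintln!.
    eprintln!(
        "push_down_filter: {section} took {} µs",
        start.elapsed().as_micros()
    );
    result
}

fn main() {
    // Wrap an expensive planner sub-section, e.g. predicate simplification.
    let simplified = with_debug_timing("simplify_predicates", || {
        vec!["a > 1", "b IS NOT NULL"] // placeholder for simplified predicates
    });
    println!("{}", simplified.len());
}
```

Because the closure result is returned unchanged, the helper can wrap existing sections without restructuring the surrounding code.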

Are these changes tested?

  • No new unit/integration tests were added because this PR is focused on benchmarking and debug-only instrumentation rather than changing optimizer semantics.

  • Coverage is provided by:

    • compiling/running the sql_planner_extended benchmark
    • validating both benchmark variants (with/without push_down_filter) produce optimized plans without errors
    • enabling RUST_LOG=debug to confirm timing sections and counters emit as expected

Are there any user-facing changes?

  • No user-facing behavior changes.
  • The optimizer logic is unchanged; only debug logging is added (emits only when RUST_LOG enables Debug for the relevant modules).
  • Benchmark suite additions only affect developers running benches.

LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate labels Mar 3, 2026
@kosiew kosiew marked this pull request as ready for review March 5, 2026 09:05
query.push_str(" AND ");
}

let mut expr = format!("length(l.c{})", i % 20);
Contributor

I don't think we really want a benchmark for the case expression.
We want to optimize for the evaluation cost of the filter during pushdown, so perhaps it could be written without a large CASE expression (as is done currently), or with adaptive filter removal, etc.

Contributor

So the TPC-H/TPC-DS one is already a good one to optimize for.

@kosiew
Contributor Author
Mar 6, 2026

I don't think we really want a benchmark for the case expression.
We want to optimize for the evaluation cost of the filter during pushdown, so perhaps it could be written without a large CASE expression (as is done currently), or with adaptive filter removal, etc.

I believe your concern is that, by using a huge CASE expression, we might be tuning for an unrealistic "case expression" workload instead of the more common cost of pushing filters through joins.

The benchmark uses CASE because that form triggered a profiler hotspot in PushDownFilter — the nullability/type‑inference codepath for filters on non‑inner joins. I don’t believe real‑world queries typically look like this, so the presence of CASE is purely a convenient way to exercise that particular expensive planner path, not the target of optimization.

To make this clear and avoid overfitting, I’m going to treat the CASE variant as a narrowly scoped micro‑benchmark and add a companion LEFT JOIN query with a simple predicate instead of a CASE.
With both in place we can separate:

  1. the baseline cost of pushing a filter through a join, and
  2. the extra work incurred when a CASE expression forces nullability inference.

That way the benchmark remains useful for optimization while still reflecting more general planner behaviour.
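The two query shapes discussed above can be sketched as small predicate-builder helpers. These names and signatures are illustrative only, not the ones in the PR (the real benchmark uses build_case_heavy_left_join_query and its helpers):

```rust
// Hypothetical sketch: build a CASE-nested predicate that forces the
// planner to re-derive nullability/types at every nesting level.
fn case_nested_predicate(col: &str, depth: usize) -> String {
    let mut expr = format!("length(l.{col})");
    for _ in 0..depth {
        expr = format!("CASE WHEN {expr} > 0 THEN {expr} ELSE 0 END");
    }
    expr
}

// Companion predicate without CASE: measures the baseline cost of
// pushing a filter through the LEFT JOIN.
fn simple_predicate(col: &str) -> String {
    format!("length(l.{col}) > 0")
}

fn main() {
    println!("{}", case_nested_predicate("c0", 2));
    println!("{}", simple_predicate("c0"));
}
```

Benchmarking both shapes over the same LEFT JOIN separates the baseline pushdown cost from the extra nullability-inference work that CASE triggers.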

So the TPC-H/TPC-DS one is already a good one to optimize for.

Agreed. TPC-H/TPC-DS should remain the primary goal for optimization value and regression detection.

The intent here is to complement those suites with a deterministic micro-benchmark that isolates a known planner hotspot; macro benchmarks are still required to verify end-to-end relevance and prevent narrow wins.

Contributor

Hm I didn't realize the issue is about planning (not about evaluation cost / pushdown per se), sorry about that!

Perhaps this would make sense to move into the planning benchmarks instead?

Contributor

Oh wait - it already is 🙈

@kosiew kosiew force-pushed the push_down-20002 branch from 0500e0c to 330ea04 on March 6, 2026 08:06
@Dandandan
Contributor

@adriangb FYI

Can we reconsider creating a crazy large expression for the dynamic filters?

Right now the size of the dynamic expression is something like:

  • number_of_join_keys * number_of_partitions, which creates extremely large expressions on large-core machines.

Perhaps create an EvaluateByIdExpr PhysicalExpr or something that has a Vec<PhysicalExpr> and evaluates them by id, or disable dynamic filters for partitioned joins for the moment.
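The EvaluateByIdExpr idea can be sketched in simplified standalone form. This is not DataFusion's actual PhysicalExpr trait (which evaluates Arrow RecordBatches); the hypothetical Expr trait below stands in for it to show the dispatch-by-id shape:

```rust
// Stand-in for DataFusion's PhysicalExpr trait, simplified to rows of i64.
trait Expr {
    fn evaluate(&self, row: &[i64]) -> bool;
}

// A trivial comparison expression: row[col] > bound.
struct GtExpr {
    col: usize,
    bound: i64,
}
impl Expr for GtExpr {
    fn evaluate(&self, row: &[i64]) -> bool {
        row[self.col] > self.bound
    }
}

// Sketch of the suggested EvaluateByIdExpr: instead of one combined
// expression of size number_of_join_keys * number_of_partitions, keep
// the per-partition expressions in a Vec and dispatch by partition id.
struct EvaluateByIdExpr {
    exprs: Vec<Box<dyn Expr>>, // one expression per partition, indexed by id
}
impl EvaluateByIdExpr {
    fn evaluate(&self, partition_id: usize, row: &[i64]) -> bool {
        // O(1) dispatch by id instead of a partitions-wide CASE chain.
        self.exprs[partition_id].evaluate(row)
    }
}

fn main() {
    let filter = EvaluateByIdExpr {
        exprs: vec![
            Box::new(GtExpr { col: 0, bound: 10 }),
            Box::new(GtExpr { col: 0, bound: 20 }),
        ],
    };
    println!("{}", filter.evaluate(0, &[15])); // partition 0: 15 > 10
    println!("{}", filter.evaluate(1, &[15])); // partition 1: 15 > 20
}
```

The point of the design is that the expression tree size stays proportional to the number of join keys per partition, with the partition fan-out handled by indexing rather than by expression nesting.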

@adriangb
Contributor

adriangb commented Mar 6, 2026

@adriangb FYI

Can we reconsider creating a crazy large expression for the dynamic filters?

Right now the size of the dynamic expression is something like:

* `number_of_join_keys` * `number_of_partitions`, which creates _extremely large expressions_ on large-core machines.

Perhaps create an EvaluateByIdExpr PhysicalExpr or something that has a Vec<PhysicalExpr> and evaluates them by id, or disable dynamic filters for partitioned joins for the moment.

I'm open to suggestions. We should find a solution that keeps the performance wins for small number of join keys / CPUs without degrading for large combinations of those. My hope was that CASE is basically already an optimized version of what you are suggesting (afaik it has a hash map internally to do the routing), but I guess not.

@kosiew
Contributor Author

kosiew commented Mar 7, 2026

@Dandandan
Thanks for the review

@kosiew kosiew added this pull request to the merge queue Mar 7, 2026
Merged via the queue into apache:main with commit 0ac434d Mar 7, 2026
34 checks passed