
feat: Improve partition_statistics() for AggregateExec using distinct_count#20731

Open
jonathanc-n wants to merge 3 commits into apache:main from
jonathanc-n:estimate-group-by-stats-with-distinct_count

@jonathanc-n
Contributor

Which issue does this PR close?

  • Closes #.

Rationale for this change

We can improve the accuracy of partition_statistics() for AggregateExec by using the distinct_count of each column in the GROUP BY expression.
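
To illustrate the general idea, here is a minimal, standalone sketch of one common way to bound the number of groups from per-column distinct counts: multiply the distinct counts of the group-by columns and cap the result at the input row count. This is an assumption about the approach, not necessarily what this PR implements. The function `estimate_group_count` and the `DistinctCount` alias are hypothetical names for illustration and are not DataFusion APIs; null handling and exact versus inexact precision are ignored.

```rust
/// Hypothetical stand-in for a column's distinct-count statistic.
/// `None` means the statistic is unknown.
type DistinctCount = Option<usize>;

/// Estimate an upper bound on the number of rows a GROUP BY produces.
///
/// Assumed rule (not taken from the PR): the group count is at most the
/// product of the distinct counts of the group-by columns, and never more
/// than the number of input rows.
fn estimate_group_count(
    input_rows: Option<usize>,
    group_by_ndv: &[DistinctCount],
) -> Option<usize> {
    // With no group-by columns, assume a global aggregate yielding one row.
    if group_by_ndv.is_empty() {
        return Some(1);
    }

    // Multiply the distinct counts; the result is `None` as soon as any
    // column's distinct count is unknown. Saturate instead of overflowing.
    let product = group_by_ndv
        .iter()
        .try_fold(1usize, |acc, &ndv| ndv.map(|n| acc.saturating_mul(n)));

    match (product, input_rows) {
        // Both known: the tighter of the two bounds wins.
        (Some(p), Some(rows)) => Some(p.min(rows)),
        // Only the product is known.
        (Some(p), None) => Some(p),
        // Some distinct count is unknown: fall back to the input row count.
        (None, rows) => rows,
    }
}
```

Under these assumptions, grouping 1000 input rows by two columns with 10 and 5 distinct values gives an estimate of at most 50 groups, while the same columns over only 30 input rows give at most 30.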

The test was written by AI.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the physical-plan Changes to the physical-plan crate label Mar 5, 2026
@github-actions github-actions bot added the core Core DataFusion crate label Mar 5, 2026