Allow Spark partial / Comet final for compatible aggregates by Shekharrajak · Pull Request #2994 · apache/datafusion-comet

Shekharrajak · 2025-12-27T09:35:54Z

Which issue does this PR close?

Rationale for this change

Comet currently falls back to Spark for ALL final hash aggregates when there's no Comet partial aggregate in the child plan. This is overly conservative because some aggregates have compatible intermediate buffer formats between Spark and Comet.
For example, MIN, MAX, COUNT, and bitwise aggregates (BIT_AND, BIT_OR, BIT_XOR) have simple intermediate buffers (single value) that are compatible between Spark and Comet. These can safely run with "Spark partial / Comet final" execution.
Other aggregates, such as SUM, AVG, and VARIANCE, have known incompatibilities (e.g., differences in decimal overflow handling, or complex intermediate buffers) and should continue to fall back when there is no Comet partial aggregate.
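To illustrate the reasoning above, here is a minimal, engine-agnostic sketch of why single-value buffers are easy to hand from one engine to another. It is not Spark or Comet code. The object name, function names, and sample data are all hypothetical and exist only for this example.

```scala
object MixedAggregationSketch {
  // Partial stage: each partition is reduced to a single-value buffer.
  def partialMin(partition: Seq[Long]): Option[Long] =
    if (partition.isEmpty) None else Some(partition.min)

  def partialCount(partition: Seq[Long]): Long = partition.size.toLong

  // Final stage: merging only needs the single values themselves,
  // not any knowledge of which engine produced them.
  def finalMin(buffers: Seq[Option[Long]]): Option[Long] = {
    val present = buffers.flatten
    if (present.isEmpty) None else Some(present.min)
  }

  def finalCount(buffers: Seq[Long]): Long = buffers.sum

  // By contrast, an average needs a multi-field buffer, so both stages
  // must agree on field order, types, and overflow behaviour.
  final case class AvgBuffer(sum: Double, count: Long)

  // Hypothetical input: three partitions, one of them empty.
  val partitions: Seq[Seq[Long]] =
    Seq(Seq(5L, 3L, 9L), Seq.empty[Long], Seq(7L, 1L))
}
```

In this toy model the final stage for MIN and COUNT is a plain merge of one value per partition, which is the property the PR relies on. The `AvgBuffer` case class is only there to show what a multi-field buffer looks like; it does not claim to match either engine's actual layout.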

What changes are included in this PR?

Added a supportsSparkPartialCometFinal method to the CometAggregateExpressionSerde trait. The default is false.

Added a helper function, aggSupportsMixedExecution(), in QueryPlanSerde.
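A rough sketch of the opt-in pattern these two changes describe, assuming a much simplified serde trait. The names supportsSparkPartialCometFinal, CometAggregateExpressionSerde, and aggSupportsMixedExecution come from this PR; the signatures, the string-keyed registry, and the wrapper object are assumptions made for illustration and are not the real Comet API.

```scala
object MixedExecutionSketch {
  // Simplified stand-in for the real trait (assumption: the real trait
  // has more members and a different shape).
  trait CometAggregateExpressionSerde {
    // Conservative default: Spark partial / Comet final is not allowed.
    def supportsSparkPartialCometFinal: Boolean = false
  }

  // An aggregate with a compatible single-value buffer opts in explicitly.
  object CometMin extends CometAggregateExpressionSerde {
    override def supportsSparkPartialCometFinal: Boolean = true
  }

  // An aggregate with known incompatibilities just inherits the default.
  object CometSum extends CometAggregateExpressionSerde

  // Hypothetical registry keyed by function name, for this sketch only.
  val serdes: Map[String, CometAggregateExpressionSerde] =
    Map("min" -> CometMin, "sum" -> CometSum)

  // Hypothetical helper signature: unknown aggregates are treated as unsafe.
  def aggSupportsMixedExecution(aggName: String): Boolean =
    serdes.get(aggName).exists(_.supportsSparkPartialCometFinal)
}
```

With a default of false, any aggregate that has not been reviewed keeps the existing fallback behaviour. A caller would presumably require every aggregate in the operator to pass the check, for example `Seq("min", "sum").forall(MixedExecutionSketch.aggSupportsMixedExecution)`, before keeping the final stage in Comet.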

How are these changes tested?

"CometExecRule should not allow Spark partial and Comet final for unsafe aggregates" - Verifies SUM still falls back to Spark

"CometExecRule should allow Spark partial and Comet final for safe aggregates" - Verifies MIN/MAX/COUNT can use Comet final with Spark partial

codecov-commenter · 2025-12-27T15:42:03Z

Codecov Report

❌ Patch coverage is 53.84615% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.58%. Comparing base (f09f8af) to head (51869b1).
⚠️ Report is 803 commits behind head on main.

Files with missing lines	Patch %	Lines
...main/scala/org/apache/comet/serde/aggregates.scala	50.00%	3 Missing ⚠️
.../scala/org/apache/comet/serde/QueryPlanSerde.scala	33.33%	1 Missing and 1 partial ⚠️
...n/scala/org/apache/spark/sql/comet/operators.scala	66.66%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2994      +/-   ##
============================================
- Coverage     56.12%   54.58%   -1.54%     
- Complexity      976     1256     +280     
============================================
  Files           119      167      +48     
  Lines         11743    15505    +3762     
  Branches       2251     2571     +320     
============================================
+ Hits           6591     8464    +1873     
- Misses         4012     5822    +1810     
- Partials       1140     1219      +79

parthchandra · 2026-01-06T01:29:38Z

spark/src/test/scala/org/apache/comet/rules/CometExecRuleSuite.scala

+      val sparkPlan =
+        createSparkPlan(
+          spark,
+          "SELECT MIN(id), MAX(id), COUNT(*) FROM test_data GROUP BY (id % 3)")


It would be nice to add cases for the bit*** functions as well

andygrove · 2026-03-16T13:55:43Z

Sorry for the late review @Shekharrajak. This LGTM except for the missing end-to-end tests for bitwise aggregates that @parthchandra already stated.

I will go ahead and add those tests and push to this branch if permissions allow, or create a new branch from this one.

andygrove · 2026-03-16T14:00:15Z

Something is wrong with the git history on this branch so I cannot rebase or upmerge.

@Shekharrajak let me know if you are still interested in working on this. If not, I will create a new PR based on your changes.

Shekharrajak · 2026-03-16T18:17:36Z

Thanks for checking. Let me work on this.

andygrove

LGTM pending CI. Thanks @Shekharrajak

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from f2e6748 to 51869b1 Compare December 27, 2025 09:42

parthchandra reviewed Jan 9, 2026

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 91e0de7 to 274f38b Compare January 22, 2026 16:19

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 274f38b to 863ba03 Compare March 16, 2026 19:09

Shekharrajak added 4 commits March 17, 2026 00:49

Allow Spark partial / Comet final for compatible aggregates

5f66242

Add unit tests for aggSupportsMixedExecution including bitwise aggregates

f9968be

Fix build errors: Remove unused imports and fix feature flag conflicts

24db310

Add bitwise aggregate mixed execution test

141ba57

Shekharrajak force-pushed the fix/issue-2894-aggregate-fallback branch from 863ba03 to 141ba57 Compare March 16, 2026 19:20

andygrove approved these changes Mar 17, 2026

minor change

4eaef6b

Shekharrajak wants to merge 5 commits into apache:main from Shekharrajak:fix/issue-2894-aggregate-fallback

4 participants