[SPARK-56155][SQL] Collect_list/collect_set sql() function includes "RESPECT NULLS"#54957

Open
helioshe4 wants to merge 1 commit into apache:master from helioshe4:collect-list-set-sql-include-respect-nulls

@helioshe4 helioshe4 commented Mar 23, 2026

What changes were proposed in this pull request?

Fix the pretty string used for the collect_list/collect_set column alias. When collect_list or collect_set is called with RESPECT NULLS, the string shown in the column header now includes "RESPECT NULLS".

Why are the changes needed?

For clarity. The output can be misleading if the column header omits 'RESPECT NULLS' even though the user specified it.

Does this PR introduce any user-facing change?

Yes. Column headers now include 'RESPECT NULLS' when the keyword has been included.
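
As a rough illustration of the change, here is a minimal stand-alone sketch of how such a header string could be assembled. It is not the code in this patch: `CollectSqlSketch`, `prettySql`, and the `respectNulls` flag are hypothetical names, and placing the suffix after the closing parenthesis is an assumption, since the exact format is not shown here.

```scala
// Hypothetical stand-alone model of the header string; not Spark's actual classes.
object CollectSqlSketch {
  // name:         function name, e.g. "collect_list" or "collect_set"
  // childSql:     the already-rendered argument expression
  // isDistinct:   whether DISTINCT was specified
  // respectNulls: assumed flag recording that the user wrote RESPECT NULLS
  def prettySql(
      name: String,
      childSql: String,
      isDistinct: Boolean,
      respectNulls: Boolean): String = {
    val distinct = if (isDistinct) "DISTINCT " else ""
    // Assumption: the keyword is appended after the closing parenthesis.
    val nulls = if (respectNulls) " RESPECT NULLS" else ""
    s"$name($distinct$childSql)$nulls"
  }
}
```

Under these assumptions, `CollectSqlSketch.prettySql("collect_list", "a", isDistinct = false, respectNulls = true)` returns `collect_list(a) RESPECT NULLS`, while the same call with `respectNulls = false` returns `collect_list(a)`, so headers are unchanged for queries that do not use the keyword.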

How was this patch tested?

Unit tests in DataFrameAggregateSuite.

Was this patch authored or co-authored using generative AI tooling?

No

}

override def sql(isDistinct: Boolean): String = {
  val distinct = if (isDistinct) "DISTINCT " else ""

@helioshe4 helioshe4 Mar 23, 2026

isDistinct should always be false for CollectSet, because the EliminateDistinct rule in Optimizer.scala removes the DISTINCT.

But I will keep it here for now, to be consistent with the other aggregate functions.
