[SPARK-56182][SQL] Allow SPJ reducing identity to other transforms#54980

Open
peter-toth wants to merge 1 commit into apache:master from peter-toth:SPARK-56182-reduce-identity-to-other-transforms

@peter-toth
Contributor

What changes were proposed in this pull request?

Extended KeyedShuffleSpec to support storage-partitioned joins (SPJ) where one side uses an identity transform (represented as AttributeReference) and the other side uses an arbitrary TransformExpression.

Why are the changes needed?

Previously, SPJ with compatible transform reduction only worked for (TransformExpression, TransformExpression) pairs. If one data source partitions by identity and the other by an arbitrary transform (e.g. bucket), SPJ was not triggered even though the identity side's raw values can be reduced by simply applying the transform.
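
To illustrate the idea only (this is not the Spark implementation), here is a minimal Python sketch. It assumes a hypothetical modulo-based `bucket` function standing in for a data source's bucket transform, and models each side's partitions as plain dicts keyed by partition value. The helper names `reduce_identity_partitions` and `partitionwise_join` are made up for this example.

```python
from collections import defaultdict

NUM_BUCKETS = 4


def bucket(value, num_buckets=NUM_BUCKETS):
    # Hypothetical stand-in for a bucket transform; real data sources
    # use their own hash function, not a plain modulo.
    return value % num_buckets


def reduce_identity_partitions(identity_partitions, transform):
    # The identity side exposes raw key values as partition keys, so it can
    # be "reduced" by applying the other side's transform to each key and
    # merging the partitions that map to the same transformed value.
    reduced = defaultdict(list)
    for key, rows in identity_partitions.items():
        reduced[transform(key)].extend(rows)
    return dict(reduced)


def partitionwise_join(left_parts, right_parts):
    # Join only partitions that share the same partition key (no shuffle).
    out = []
    for part_key in sorted(left_parts.keys() & right_parts.keys()):
        for lk, lv in left_parts[part_key]:
            for rk, rv in right_parts[part_key]:
                if lk == rk:
                    out.append((lk, lv, rv))
    return out


# Left side: partitioned by identity(id), one partition per distinct id.
left = {1: [(1, "a")], 2: [(2, "b")], 5: [(5, "c")], 6: [(6, "d")]}
# Right side: partitioned by bucket(4, id); the partition key is the bucket number.
right = {1: [(1, "x"), (5, "y")], 2: [(2, "z")]}

left_reduced = reduce_identity_partitions(left, bucket)
joined = partitionwise_join(left_reduced, right)
print(left_reduced)
print(joined)
```

After the reduction both sides are keyed by bucket number, so matching partitions can be joined pairwise without shuffling either side.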

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a new test case.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

@peter-toth
Contributor Author

I will rebase this PR once the bug fix #54961 has landed.

@peter-toth peter-toth force-pushed the SPARK-56182-reduce-identity-to-other-transforms branch from 17a1d3b to b5fc337 Compare March 24, 2026 14:37
@peter-toth peter-toth changed the title [WIP][SPARK-56182][SQL] Allow SPJ reducing identity to other transforms [SPARK-56182][SQL] Allow SPJ reducing identity to other transforms Mar 24, 2026
@peter-toth peter-toth marked this pull request as ready for review March 24, 2026 14:38
@peter-toth
Contributor Author

cc @dongjoon-hyun, @szehon-ho

@peter-toth peter-toth force-pushed the SPARK-56182-reduce-identity-to-other-transforms branch from b5fc337 to c13f167 Compare March 24, 2026 14:40
Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you for rebasing, @peter-toth!
