
[WIP][DO-NOT-REVIEW][SPARK-55886][SQL] Add DataFrame.zip for merging column-projected DataFrames#54976

Draft
zhengruifeng wants to merge 6 commits into apache:master from zhengruifeng:df-zip

@zhengruifeng
Contributor

Add a new DataFrame.zip(other) API that combines columns from two DataFrames that derive from the same base plan through Project chains. The optimizer rewrites the Zip node into a single Project over the shared base plan, and analysis rejects plans that cannot be merged.

Co-authored-by: Isaac
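
To make the intended rewrite concrete, here is a minimal toy model in plain Python. It is not Spark code and not the implementation in this PR: `Scan`, `Project`, `flatten`, `zip_plans`, and `execute` are hypothetical names invented for illustration, and plain equality of the base `Scan` stands in for whatever check the PR uses to decide that two plans share a base. It only sketches the idea of collapsing two Project chains over one base plan into a single Project, and of rejecting inputs whose bases differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union

Row = Dict[str, object]


@dataclass(frozen=True)
class Scan:
    """Leaf plan: a named base relation with a fixed list of columns."""
    name: str
    columns: Tuple[str, ...]


@dataclass(frozen=True)
class Project:
    """Computes named output columns from the rows produced by `child`."""
    columns: Tuple[Tuple[str, Callable[[Row], object]], ...]
    child: Union["Scan", "Project"]


def flatten(plan):
    """Collapse a chain of Projects into (base scan, {column: fn(base_row)})."""
    if isinstance(plan, Scan):
        # A bare scan passes every base column through unchanged.
        return plan, {c: (lambda row, c=c: row[c]) for c in plan.columns}
    base, inner = flatten(plan.child)

    def compose(fn):
        # Evaluate the child's columns first, then apply this level's function.
        return lambda row: fn({k: g(row) for k, g in inner.items()})

    return base, {name: compose(fn) for name, fn in plan.columns}


def zip_plans(left, right):
    """Merge two Project chains over the same base into one Project.

    Duplicate output names are not handled in this sketch.
    """
    lbase, lcols = flatten(left)
    rbase, rcols = flatten(right)
    if lbase != rbase:  # toy stand-in for "derive from the same base plan"
        raise ValueError("cannot zip: plans do not share a base plan")
    merged = tuple(lcols.items()) + tuple(rcols.items())
    return Project(merged, lbase)


def execute(plan, tables):
    """Evaluate a plan against in-memory tables keyed by scan name."""
    base, cols = flatten(plan)
    return [{n: fn(row) for n, fn in cols.items()} for row in tables[base.name]]


base = Scan("t", ("id", "x"))
left = Project((("id", lambda r: r["id"]), ("x2", lambda r: r["x"] * 2)), base)
# The right side reaches the same base through a chain of two Projects.
right = Project(
    (("x_plus_1", lambda r: r["x"] + 1),),
    Project((("x", lambda r: r["x"]),), base),
)

zipped = zip_plans(left, right)
tables = {"t": [{"id": 1, "x": 10}, {"id": 2, "x": 20}]}
rows = execute(zipped, tables)
```

In this toy, `zipped` is a single `Project` directly over `base`, so each base row is read once and all three output columns are computed from it. Zipping plans built on different scans raises `ValueError`, mirroring the "cannot be merged" rejection described above.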

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

@zhengruifeng changed the title from "[DO-NOT-REVIEW][SPARK-55886][SQL] Add DataFrame.zip for merging column-projected DataFrames" to "[WIP][DO-NOT-REVIEW][SPARK-55886][SQL] Add DataFrame.zip for merging column-projected DataFrames" on Mar 24, 2026

Zip is now always unresolved (resolved=false). A new ResolveZip
analyzer rule rewrites it into a Project when both children share the
same base plan. Removes the CollapseZip optimizer rule.

Co-authored-by: Isaac

Zip is always unresolved, so deduplication does not help it resolve.
ResolveZip already handles attribute remapping from right base to left
base via sameResult() and AttributeMap.

Co-authored-by: Isaac
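
As a rough illustration of the remapping idea in the commit message above, the toy Python below models attributes with unique expression IDs, so two scans of the same table produce distinct attributes. Everything here is hypothetical: `Attr`, `Relation`, `scan`, `same_result`, and `remap_to_left` are invented names, `same_result` is only a crude stand-in for Catalyst's `sameResult()`, and a plain `dict` stands in for `AttributeMap`. Pairing the two bases' outputs by position is an assumption of this sketch, not something the commit message states.

```python
from dataclasses import dataclass
from itertools import count
from typing import Tuple

_next_id = count()


@dataclass(frozen=True)
class Attr:
    """A column reference identified by a unique expression ID."""
    name: str
    expr_id: int


@dataclass(frozen=True)
class Relation:
    """A base plan whose output attributes get fresh IDs on every scan."""
    table: str
    output: Tuple[Attr, ...]


def scan(table, names):
    """Create a relation with freshly numbered output attributes."""
    return Relation(table, tuple(Attr(n, next(_next_id)) for n in names))


def same_result(a, b):
    """Toy equivalence check: same table and column names, ignoring IDs."""
    return a.table == b.table and [x.name for x in a.output] == [
        x.name for x in b.output
    ]


def remap_to_left(left_base, right_base, right_refs):
    """Rewrite references to right_base attributes as left_base attributes."""
    if not same_result(left_base, right_base):
        raise ValueError("bases are not equivalent; nothing to remap")
    # Assumption: outputs of equivalent bases line up position by position.
    mapping = dict(zip(right_base.output, left_base.output))
    return [mapping[a] for a in right_refs]


left_base = scan("t", ["id", "x"])
right_base = scan("t", ["id", "x"])  # same table, different expression IDs

# An expression on the right side refers to the right base's "x".
right_refs = [right_base.output[1]]
remapped = remap_to_left(left_base, right_base, right_refs)
```

After remapping, the reference points at the left base's `x`, so expressions from both sides can sit in one Project over the left base without a separate deduplication step.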
No longer referenced after removing Zip from DeduplicateRelations
and changing resolved to always false.

Co-authored-by: Isaac