Add SQL and physical planner support for MERGE INTO #2
Draft
wirybeaver wants to merge 2 commits into
This was referenced Jun 11, 2026
Add merge_into async method to TableProvider trait for MERGE INTO DML support.

The method accepts:
- source: ExecutionPlan representing the USING clause
- on: Expr representing the ON join condition
- clauses: Vec<MergeIntoClause> for WHEN MATCHED/NOT MATCHED actions

Default implementation returns not_impl_err for tables that don't support MERGE INTO operations.
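The default-method pattern this commit describes can be illustrated with a minimal, dependency-free sketch. Everything here is a hypothetical stand-in: `ExecutionPlan`, `Expr`, `MergeIntoClause`, and `PlanError` are simplified placeholder types, not the DataFusion definitions, and the method is synchronous (the real hook is described as async) so the example compiles with the standard library alone.

```rust
// Hypothetical stand-ins for the DataFusion types named in the commit message.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecutionPlan(pub String);
#[derive(Debug, Clone, PartialEq)]
pub struct Expr(pub String);
#[derive(Debug, Clone, PartialEq)]
pub struct MergeIntoClause(pub String);

// Simplified error type; stands in for the error produced by `not_impl_err`.
#[derive(Debug, PartialEq)]
pub enum PlanError {
    NotImplemented(String),
}

pub trait TableProvider {
    fn name(&self) -> &str;

    // Default body: providers that do not override this reject MERGE INTO,
    // so adding the method does not break existing implementors.
    fn merge_into(
        &self,
        _source: ExecutionPlan,
        _on: Expr,
        _clauses: Vec<MergeIntoClause>,
    ) -> Result<ExecutionPlan, PlanError> {
        Err(PlanError::NotImplemented(format!(
            "MERGE INTO not supported for table {}",
            self.name()
        )))
    }
}

// An existing provider: compiles unchanged and inherits the default.
pub struct ReadOnlyTable;
impl TableProvider for ReadOnlyTable {
    fn name(&self) -> &str {
        "read_only"
    }
}

// A provider that opts in by overriding the hook.
pub struct MergeableTable;
impl TableProvider for MergeableTable {
    fn name(&self) -> &str {
        "mergeable"
    }

    fn merge_into(
        &self,
        source: ExecutionPlan,
        on: Expr,
        clauses: Vec<MergeIntoClause>,
    ) -> Result<ExecutionPlan, PlanError> {
        // Placeholder "plan": just records what it was asked to merge.
        Ok(ExecutionPlan(format!(
            "Merge(target={}, source={}, on={}, clauses={})",
            self.name(),
            source.0,
            on.0,
            clauses.len()
        )))
    }
}
```

In this sketch, `ReadOnlyTable` needs no code change to keep compiling after the trait gains the method, which is the property the default implementation is meant to provide.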
Implement merge_to_plan and merge_clause_to_plan in SQL planner:
- Parse Statement::Merge into LogicalPlan::Dml with WriteOp::MergeInto
- Resolve target table and plan source (USING clause) as LogicalPlan
- Build combined schema for target + source to resolve ON and WHEN expressions
- Convert ON condition and WHEN clauses to DataFusion Expr
- Handle UPDATE, INSERT, and DELETE actions in WHEN clauses

Add physical planner dispatch for WriteOp::MergeInto:
- Use source_as_provider() to recover the TableProvider from the TableSource
- Extract source ExecutionPlan from children
- Call TableProvider::merge_into with source plan, ON condition, and clauses
- Wrap errors with MERGE INTO operation context

Wire MergeInto's expressions through LogicalPlan tree-traversal so optimizers can rewrite them: add MergeIntoOp::exprs() (stable iteration order: on, then per-clause predicate + action value Exprs) and MergeIntoOp::with_new_exprs() to rebuild the op from a transformed expr vector. Branch LogicalPlan::apply_expressions, map_expressions, and with_new_exprs on WriteOp::MergeInto to use these helpers; other WriteOp variants continue to expose no expressions as before.
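The `exprs()` / `with_new_exprs()` round trip described in this commit relies on both methods walking the op in the same order. The following is a rough sketch of that idea, not the code in this PR: the `Expr`, `MergeIntoAction`, `MergeIntoClause`, and `MergeIntoOp` shapes below are assumed for illustration, and the `rebuild` helper is hypothetical.

```rust
// Hypothetical, simplified shapes; the real types live in datafusion-expr.
#[derive(Debug, Clone, PartialEq)]
pub struct Expr(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum MergeIntoAction {
    Update(Vec<(String, Expr)>), // (column, value expr)
    Insert(Vec<(String, Expr)>), // (column, value expr)
    Delete,
}

#[derive(Debug, Clone, PartialEq)]
pub struct MergeIntoClause {
    pub predicate: Option<Expr>,
    pub action: MergeIntoAction,
}

#[derive(Debug, Clone, PartialEq)]
pub struct MergeIntoOp {
    pub on: Expr,
    pub clauses: Vec<MergeIntoClause>,
}

// Hypothetical helper: keep column names, take new value exprs in order.
fn rebuild(
    old: &[(String, Expr)],
    it: &mut impl Iterator<Item = Expr>,
) -> Vec<(String, Expr)> {
    old.iter()
        .map(|(col, _)| (col.clone(), it.next().expect("length checked by caller")))
        .collect()
}

impl MergeIntoOp {
    // Stable order: `on`, then for each clause its predicate (if any)
    // followed by the action's value expressions.
    pub fn exprs(&self) -> Vec<&Expr> {
        let mut out = vec![&self.on];
        for clause in &self.clauses {
            out.extend(clause.predicate.iter());
            match &clause.action {
                MergeIntoAction::Update(a) | MergeIntoAction::Insert(a) => {
                    out.extend(a.iter().map(|(_, e)| e))
                }
                MergeIntoAction::Delete => {}
            }
        }
        out
    }

    // Rebuild the op by consuming `exprs` in exactly the order `exprs()` emits.
    pub fn with_new_exprs(&self, exprs: Vec<Expr>) -> Result<MergeIntoOp, String> {
        let expected = self.exprs().len();
        if exprs.len() != expected {
            return Err(format!("expected {} exprs, got {}", expected, exprs.len()));
        }
        let mut it = exprs.into_iter();
        let on = it.next().expect("length checked above");
        let mut clauses = Vec::with_capacity(self.clauses.len());
        for c in &self.clauses {
            let predicate = if c.predicate.is_some() { it.next() } else { None };
            let action = match &c.action {
                MergeIntoAction::Update(a) => MergeIntoAction::Update(rebuild(a, &mut it)),
                MergeIntoAction::Insert(a) => MergeIntoAction::Insert(rebuild(a, &mut it)),
                MergeIntoAction::Delete => MergeIntoAction::Delete,
            };
            clauses.push(MergeIntoClause { predicate, action });
        }
        Ok(MergeIntoOp { on, clauses })
    }
}
```

Because the structure (which clauses have predicates, which columns each action assigns) is taken from `self` and only the expressions come from the input vector, an optimizer pass can map over `exprs()` and hand the results back without knowing anything about clause layout.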
Which issue does this PR close?
Stacked on top of apache#20763 (Add MERGE INTO types to datafusion-expr).
Rationale for this change
Complete the MERGE INTO execution path so that the type definitions added in the parent PR can be planned and dispatched.
What changes are included in this PR?
- `datafusion/catalog/src/table.rs`: extend `TableProvider` with a `merge_into` async hook (default returns `not_impl_err`).
- `datafusion/sql/src/statement.rs`: SQL planner:
  - `merge_to_plan`: parses `Statement::Merge` into `LogicalPlan::Dml(WriteOp::MergeInto(...))`. Resolves the target table, plans the `USING` source, and builds a combined schema for resolving `ON` and `WHEN` expressions.
  - `merge_clause_to_plan`: converts each `WHEN MATCHED` / `NOT MATCHED` clause into a `MergeIntoClause` with a typed `MergeIntoAction`.
- `datafusion/expr/src/logical_plan/dml.rs`: expression traversal on `MergeIntoOp`:
  - `exprs()`: stable iteration (on expr → per-clause predicate → action value exprs).
  - `with_new_exprs()`: reconstructs the op from a transformed expr slice (used by optimizer rewrites).
- `datafusion/expr/src/logical_plan/plan.rs` / `tree_node.rs`: branch `apply_expressions`, `map_expressions`, and `with_new_exprs` on `WriteOp::MergeInto` to delegate to the helpers above; other `WriteOp` variants are unchanged.
- `datafusion/core/src/physical_planner.rs`: physical dispatch for `WriteOp::MergeInto`:
  - Recovers the `TableProvider` via `source_as_provider()`.
  - Extracts the source `ExecutionPlan` from children.
  - Calls `TableProvider::merge_into(source_plan, on_expr, clauses)`.

Are these changes tested?
Unit and integration tests are in the parent PR (apache#20763). End-to-end sqllogictests covering a concrete `TableProvider::merge_into` implementation are planned as a follow-up once a reference implementation exists.
Are there any user-facing changes?
`TableProvider` gains a new method `merge_into` with a default `not_impl_err` implementation, so existing implementors are unaffected.