fix: avoid pushing join conds with outer scope tags#24960
Conversation
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
e8126c6 to
e8c9d47
Compare
e8c9d47 to
fafe015
Compare
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
XuPeng-SH
left a comment
There was a problem hiding this comment.
Looks good to me. The fix stays narrowly scoped to the join-order/separateNonEquiConds path, the new outer-scope-aware side classification addresses the remapping failure directly, and the added unit/BVT coverage matches the failure mode well.
aunjgr
left a comment
There was a problem hiding this comment.
The fix correctly identifies the root cause, but the approach reuses JoinSideBoth for outer-scope tags — semantically wrong. JoinSideBoth means "from both sides of this join," not "from an outer scope."
The correct fix:
- Add
JoinSideOuter = 1 << 5to the constants inpkg/sql/plan/types.go - In
getJoinSide, returnJoinSideOuterinstead ofJoinSideNonefor columns not in leftTags/rightTags/markTag - In
pushdownFilters, treatJoinSideOuterthe same asJoinSideBoth— cannot push to a single child, stays at current level
This makes the intent explicit: outer-scope tags don't belong to either child of the current join. The current fix works but conflates two distinct concepts.
aunjgr
left a comment
There was a problem hiding this comment.
The root cause is correctly identified but the fix has two issues:
-
Misuse of JoinSideBoth: Outer-scope tags are not "from both sides" — they are not from either side. Add JoinSideOuter = 1 << 5 in types.go and return it from getJoinSide for columns not in leftTags/rightTags/markTag. Then handle it in pushdownFilters (same as JoinSideBoth — stays at current level).
-
Unnecessary wrapper: Fix getJoinSide directly instead of adding getJoinSideWithOuterScope. A correctly classified JoinSideOuter naturally prevents single-child pushdown for ALL callers, not just the separateNonEquiConds path. The conditional wrapper should not be needed — the fix should be universal.
|
@aunjgr Thanks for the review. I agree with the semantic issue: an outer-scope tag is not really I updated the PR to add I intentionally kept the base So the compromise here is: make the outer-scope semantics explicit where the bug happens, but avoid changing the legacy classifier globally until we can separately audit the other callers and TPCH plan impact. Validation after the update:
|
aunjgr
left a comment
There was a problem hiding this comment.
Good catch on TPCH Q18/Q21. The issue with just adding JoinSideOuter and returning it from getJoinSide is the routing logic in pushdownFilters:
if side & JoinSideLeft == 0 → rightPushdown
else if side & JoinSideRight == 0 → leftPushdown
else → cantPushdownJoinSideOuter | JoinSideLeft has the JoinSideRight bit clear → pushed LEFT. Wrong.
The proper fix needs two changes:
-
Add
JoinSideOuter = 1 << 5in types.go, return it fromgetJoinSidefor columns outside leftTags/rightTags/markTag (instead of falling through to the default 0). -
In the routing logic, check for
JoinSideOuterbefore pushing to a child:
side := getJoinSideForPushdown(filter, leftTags, rightTags, markTag)
if side & JoinSideOuter != 0 || side == JoinSideNone {
cantPushdown = append(cantPushdown, filter)
} else if side & JoinSideLeft == 0 {
rightPushdown = append(rightPushdown, filter)
} else if side & JoinSideRight == 0 {
leftPushdown = append(leftPushdown, filter)
} else {
cantPushdown = append(cantPushdown, filter)
}JoinSideOuter means "references a scope outside this join subtree" — semantically distinct from JoinSideBoth. The routing check is explicit instead of relying on JoinSideBoth's bit pattern as a side effect.
Merge Queue Status
This pull request spent 1 hour 6 minutes 57 seconds in the queue, including 1 hour 5 minutes 55 seconds running CI. Required conditions to merge
|
What type of PR is this?
Which issue(s) this PR fixes:
issue #19635
What this PR does / why we need it:
This is a narrower fix for the join predicate pushdown issue previously addressed by #23719.
During the join-ordering path,
determineJoinOrdercallspushdownFilters(..., separateNonEquiConds=true)with a global condition list. While recursively placing those conditions into the reordered join tree, a condition can reference tags outside the current join subtree. The oldgetJoinSidereturnsJoinSideNonefor those outer-scope tags, so a condition can be misclassified as a single-side predicate and pushed into a child subtree that does not contain all referenced columns. That later fails during column remapping.Instead of changing
getJoinSideglobally, this PR only uses an outer-scope-aware side classifier for theseparateNonEquiConds=truepath after join ordering. This preserves the existing early filter pushdown behavior and avoids changing TPCH plans, while still preventing invalid pushdown of join-order conditions with outer-scope tags.A BVT case based on issue #19635 is added to
test/distributed/cases/join/join.sql.Validation:
Column remapping failed: cannot find supplier.s_suppkey./data3/hanfeng/tpch1000-explain-pr23719-old-20260612-143441.out/data3/hanfeng/tpch1000-explain-pr23719-separate-only-20260612-144338.outgo test ./pkg/sql/planpassed on 10.222.1.128 with the same CGO include/lib settings used bymake build.