Skip to content

fix: avoid pushing join conds with outer scope tags#24960

Merged
mergify[bot] merged 2 commits into
matrixorigin:mainfrom
aptend:fix-join-pushdown-outer-scope-pr23719
Jun 16, 2026
Merged

fix: avoid pushing join conds with outer scope tags#24960
mergify[bot] merged 2 commits into
matrixorigin:mainfrom
aptend:fix-join-pushdown-outer-scope-pr23719

Conversation

@aptend

@aptend aptend commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #19635

What this PR does / why we need it:

This is a narrower fix for the join predicate pushdown issue previously addressed by #23719.

During the join-ordering path, determineJoinOrder calls pushdownFilters(..., separateNonEquiConds=true) with a global condition list. While recursively placing those conditions into the reordered join tree, a condition can reference tags outside the current join subtree. The old getJoinSide returns JoinSideNone for those outer-scope tags, so a condition can be misclassified as a single-side predicate and pushed into a child subtree that does not contain all referenced columns. That later fails during column remapping.

Instead of changing getJoinSide globally, this PR only uses an outer-scope-aware side classifier for the separateNonEquiConds=true path after join ordering. This preserves the existing early filter pushdown behavior and avoids changing TPCH plans, while still preventing invalid pushdown of join-order conditions with outer-scope tags.

A BVT case based on issue #19635 is added to test/distributed/cases/join/join.sql.

Validation:

  • Reproduced the old failure on 10.222.1.128: Column remapping failed: cannot find supplier.s_suppkey.
  • Verified the new BVT SQL passes with this fix.
  • Verified TPCH 1T explain plan has no diff against the old-behavior baseline:
    • baseline: /data3/hanfeng/tpch1000-explain-pr23719-old-20260612-143441.out
    • fixed: /data3/hanfeng/tpch1000-explain-pr23719-separate-only-20260612-144338.out
    • diff lines: 0
  • go test ./pkg/sql/plan passed on 10.222.1.128 with the same CGO include/lib settings used by make build.

@aptend aptend requested review from aunjgr and ouyuanning as code owners June 12, 2026 07:22
Copilot AI review requested due to automatic review settings June 12, 2026 07:22
@aptend aptend requested a review from heni02 as a code owner June 12, 2026 07:22
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@matrix-meow matrix-meow added the size/M Denotes a PR that changes [100,499] lines label Jun 12, 2026
@mergify mergify Bot added kind/bug Something isn't working kind/test-ci labels Jun 12, 2026
@aptend aptend force-pushed the fix-join-pushdown-outer-scope-pr23719 branch from e8126c6 to e8c9d47 Compare June 12, 2026 07:43
@aptend aptend marked this pull request as draft June 12, 2026 08:34
@aptend aptend force-pushed the fix-join-pushdown-outer-scope-pr23719 branch from e8c9d47 to fafe015 Compare June 12, 2026 08:47
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@XuPeng-SH XuPeng-SH left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. The fix stays narrowly scoped to the join-order/separateNonEquiConds path, the new outer-scope-aware side classification addresses the remapping failure directly, and the added unit/BVT coverage matches the failure mode well.

@aunjgr aunjgr left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix correctly identifies the root cause, but the approach reuses JoinSideBoth for outer-scope tags — semantically wrong. JoinSideBoth means "from both sides of this join," not "from an outer scope."

The correct fix:

  1. Add JoinSideOuter = 1 << 5 to the constants in pkg/sql/plan/types.go
  2. In getJoinSide, return JoinSideOuter instead of JoinSideNone for columns not in leftTags/rightTags/markTag
  3. In pushdownFilters, treat JoinSideOuter the same as JoinSideBoth — cannot push to a single child, stays at current level

This makes the intent explicit: outer-scope tags don't belong to either child of the current join. The current fix works but conflates two distinct concepts.

aunjgr

This comment was marked as duplicate.

@aunjgr aunjgr left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The root cause is correctly identified but the fix has two issues:

  1. Misuse of JoinSideBoth: Outer-scope tags are not "from both sides" — they are not from either side. Add JoinSideOuter = 1 << 5 in types.go and return it from getJoinSide for columns not in leftTags/rightTags/markTag. Then handle it in pushdownFilters (same as JoinSideBoth — stays at current level).

  2. Unnecessary wrapper: Fix getJoinSide directly instead of adding getJoinSideWithOuterScope. A correctly classified JoinSideOuter naturally prevents single-child pushdown for ALL callers, not just the separateNonEquiConds path. The conditional wrapper should not be needed — the fix should be universal.

@aptend

aptend commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

@aunjgr Thanks for the review. I agree with the semantic issue: an outer-scope tag is not really JoinSideBoth.

I updated the PR to add JoinSideOuter and return it from the outer-scope-aware classifier, so this path no longer conflates outer-scope references with “columns from both children”. pushdownFilters now keeps predicates with JoinSideOuter at the current join level.

I intentionally kept the base getJoinSide behavior unchanged and did not make this universal. This PR is scoped to the post-join-order separateNonEquiConds=true path, where predicates may reference tags outside the current join subtree. Making getJoinSide globally return JoinSideOuter for unknown tags is the same risk class as the earlier broad change in #23719; in our local validation/investigation that changed TPCH q18/q21 plans. The existing JoinSideNone behavior is still relied on by other pushdown paths for current compatibility.

So the compromise here is: make the outer-scope semantics explicit where the bug happens, but avoid changing the legacy classifier globally until we can separately audit the other callers and TPCH plan impact.

Validation after the update:

  • go test ./pkg/sql/plan passed on 10.222.1.128

@aunjgr aunjgr left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on TPCH Q18/Q21. The issue with just adding JoinSideOuter and returning it from getJoinSide is the routing logic in pushdownFilters:

if side & JoinSideLeft == 0rightPushdown
else if side & JoinSideRight == 0leftPushdown
elsecantPushdown

JoinSideOuter | JoinSideLeft has the JoinSideRight bit clear → pushed LEFT. Wrong.

The proper fix needs two changes:

  1. Add JoinSideOuter = 1 << 5 in types.go, return it from getJoinSide for columns outside leftTags/rightTags/markTag (instead of falling through to the default 0).

  2. In the routing logic, check for JoinSideOuter before pushing to a child:

side := getJoinSideForPushdown(filter, leftTags, rightTags, markTag)
if side & JoinSideOuter != 0 || side == JoinSideNone {
    cantPushdown = append(cantPushdown, filter)
} else if side & JoinSideLeft == 0 {
    rightPushdown = append(rightPushdown, filter)
} else if side & JoinSideRight == 0 {
    leftPushdown = append(leftPushdown, filter)
} else {
    cantPushdown = append(cantPushdown, filter)
}

JoinSideOuter means "references a scope outside this join subtree" — semantically distinct from JoinSideBoth. The routing check is explicit instead of relying on JoinSideBoth's bit pattern as a side effect.

@mergify

mergify Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Merge Queue Status

  • Entered queue2026-06-16 09:01 UTC · Rule: main
  • Checks passed · in-place
  • Merged2026-06-16 10:07 UTC · at a866d9b6bfac17feb9a4a14e8a2cf625e766cab4 · squash

This pull request spent 1 hour 6 minutes 57 seconds in the queue, including 1 hour 5 minutes 55 seconds running CI.

Required conditions to merge
  • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
  • #review-threads-unresolved = 0 [🛡 GitHub branch protection]
  • github-review-decision = APPROVED [🛡 GitHub branch protection]
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Compose CI / multi cn e2e bvt test docker compose(PESSIMISTIC)
    • check-neutral = Matrixone Compose CI / multi cn e2e bvt test docker compose(PESSIMISTIC)
    • check-skipped = Matrixone Compose CI / multi cn e2e bvt test docker compose(PESSIMISTIC)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Standlone CI / Multi-CN e2e BVT Test on Linux/x64(LAUNCH, PROXY)
    • check-neutral = Matrixone Standlone CI / Multi-CN e2e BVT Test on Linux/x64(LAUNCH, PROXY)
    • check-skipped = Matrixone Standlone CI / Multi-CN e2e BVT Test on Linux/x64(LAUNCH, PROXY)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH, PESSIMISTIC)
    • check-neutral = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH, PESSIMISTIC)
    • check-skipped = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH, PESSIMISTIC)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone CI / SCA Test on Ubuntu/x86
    • check-neutral = Matrixone CI / SCA Test on Ubuntu/x86
    • check-skipped = Matrixone CI / SCA Test on Ubuntu/x86
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone CI / UT Test on Ubuntu/x86
    • check-neutral = Matrixone CI / UT Test on Ubuntu/x86
    • check-skipped = Matrixone CI / UT Test on Ubuntu/x86
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Compose CI / multi cn e2e bvt test docker compose(Optimistic/PUSH)
    • check-neutral = Matrixone Compose CI / multi cn e2e bvt test docker compose(Optimistic/PUSH)
    • check-skipped = Matrixone Compose CI / multi cn e2e bvt test docker compose(Optimistic/PUSH)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH,Optimistic)
    • check-neutral = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH,Optimistic)
    • check-skipped = Matrixone Standlone CI / e2e BVT Test on Linux/x64(LAUNCH,Optimistic)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Upgrade CI / Compatibility Test With Target on Linux/x64(LAUNCH)
    • check-neutral = Matrixone Upgrade CI / Compatibility Test With Target on Linux/x64(LAUNCH)
    • check-skipped = Matrixone Upgrade CI / Compatibility Test With Target on Linux/x64(LAUNCH)
  • any of [🛡 GitHub branch protection]:
    • check-success = Matrixone Utils CI / Coverage
    • check-neutral = Matrixone Utils CI / Coverage
    • check-skipped = Matrixone Utils CI / Coverage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working kind/test-ci size/M Denotes a PR that changes [100,499] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants