Skip to content

sql-parser: Fix IS [NOT] DISTINCT FROM precedence vs AND/OR#37061

Merged
aljoscha merged 2 commits into
MaterializeInc:mainfrom
aljoscha:fix-is-distinct-from-precedence
Jun 16, 2026
Merged

sql-parser: Fix IS [NOT] DISTINCT FROM precedence vs AND/OR#37061
aljoscha merged 2 commits into
MaterializeInc:mainfrom
aljoscha:fix-is-distinct-from-precedence

Conversation

@aljoscha

Copy link
Copy Markdown
Contributor

Motivation

Fixes SQL-236.

The right-hand side of IS [NOT] DISTINCT FROM was parsed at
Precedence::Zero, the lowest precedence, so the parser greedily pulled a
trailing AND/OR into the RHS. a IS DISTINCT FROM b AND c parsed as
a IS DISTINCT FROM (b AND c) instead of (a IS DISTINCT FROM b) AND c,
silently producing wrong results for any predicate combining
IS [NOT] DISTINCT FROM with a following boolean connective.

Description

Parse the RHS of IS [NOT] DISTINCT FROM at the precedence of the IS
operator (self.parse_subexpr(precedence)) rather than parse_expr()
(Precedence::Zero). With that precedence:

  • AND/OR (lower precedence) are left for the surrounding expression, so
    they are no longer absorbed into the RHS.
  • comparison and arithmetic operators (higher precedence) are still consumed
    into the RHS, unchanged.

This matches PostgreSQL, where IS DISTINCT FROM binds tighter than
AND/OR but looser than comparison/arithmetic.

Verification

  • New mz-sql-parser datadriven cases (src/sql-parser/tests/testdata/scalar)
    assert the parse tree for a IS DISTINCT FROM b AND c,
    a IS NOT DISTINCT FROM b OR c, and a IS DISTINCT FROM b + c.
  • New sqllogictest regression cases in test/sqllogictest/distinct_from.slt
    cover the exact queries from the report and assert the unparenthesized forms
    now match the explicitly-parenthesized ones.
  • Expected results were cross-checked against PostgreSQL 16 (the same queries
    return the same rows there).

@aljoscha aljoscha requested a review from a team as a code owner June 16, 2026 08:45
The right-hand side of `IS [NOT] DISTINCT FROM` was parsed at
`Precedence::Zero`, which greedily pulled a trailing `AND`/`OR` into the
RHS. As a result `a IS DISTINCT FROM b AND c` parsed as `a IS DISTINCT
FROM (b AND c)` instead of `(a IS DISTINCT FROM b) AND c`, producing
wrong results for any predicate that combined `IS [NOT] DISTINCT FROM`
with a following boolean connective.

Parse the RHS at the precedence of the `IS` operator instead, so that
`AND`/`OR` are left for the surrounding expression while
comparison/arithmetic (which bind tighter) are still consumed. This
matches PostgreSQL.

Adds parser unit tests for the AND/OR/arithmetic cases and a
sqllogictest regression test covering the query results from the report.

Fixes SQL-236
@aljoscha aljoscha force-pushed the fix-is-distinct-from-precedence branch from 97b6b57 to 87c2d10 Compare June 16, 2026 09:03

@def- def- left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a catalog migration for potentially already persisted objects?

@ggevay ggevay self-requested a review June 16, 2026 10:52
@aljoscha

Copy link
Copy Markdown
Contributor Author

Do we need a catalog migration for potentially already persisted objects?

We normally don't store parsed results but store the SQL for objects, so I'd say we should be fine, plus a migration is a bit heavy. But if you have more concrete concerns I could look into it.

@ggevay ggevay left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread src/sql-parser/src/parser.rs Outdated
Comment on lines 1237 to 1259

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: extract construct to a variable to avoid the deep indentation here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 8144698 — extracted the IsExprConstruct into a let binding so the match arms are no longer nested inside the struct literal.

@def- def- left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, but actually looks fine when upgrading, so no problem from QA side:

diff --git a/misc/python/materialize/checks/all_checks/materialized_views.py b/misc/python/materialize/checks/all_checks/materialized_views.py
index 6066d1af45..455e737028 100644
--- a/misc/python/materialize/checks/all_checks/materialized_views.py
+++ b/misc/python/materialize/checks/all_checks/materialized_views.py
@@ -352,3 +352,61 @@ class MaterializedViewReplacement(Check):
                 # `validate` remains idempotent.
                 > DROP MATERIALIZED VIEW mv_replacement_replacement
             """))
+
+
+class MaterializedViewIsNotDistinctFromPrecedence(Check):
+    def initialize(self) -> Testdrive:
+        return Testdrive(dedent("""
+                > CREATE TABLE precedence_table (flag boolean, active boolean);
+                > INSERT INTO precedence_table VALUES (true, true), (false, true), (NULL, true), (true, false);
+                > CREATE VIEW precedence_view1 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                > CREATE MATERIALIZED VIEW precedence_mv1 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                """))
+
+    def manipulate(self) -> list[Testdrive]:
+        return [
+            Testdrive(dedent(s))
+            for s in [
+                """
+                > CREATE VIEW precedence_view2 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                > CREATE MATERIALIZED VIEW precedence_mv2 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                """,
+                """
+                > CREATE VIEW precedence_view3 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                > CREATE MATERIALIZED VIEW precedence_mv3 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                """,
+            ]
+        ]
+
+    def validate(self) -> Testdrive:
+        return Testdrive(dedent("""
+                > SELECT * FROM precedence_view1
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_view2
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_view3
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_mv1
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_mv2
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_mv3
+                <null> true
+                true false
+                true true
+                """))

@aljoscha

Copy link
Copy Markdown
Contributor Author

I tried, but actually looks fine when upgrading, so no problem from QA side:

diff --git a/misc/python/materialize/checks/all_checks/materialized_views.py b/misc/python/materialize/checks/all_checks/materialized_views.py
index 6066d1af45..455e737028 100644
--- a/misc/python/materialize/checks/all_checks/materialized_views.py
+++ b/misc/python/materialize/checks/all_checks/materialized_views.py
@@ -352,3 +352,61 @@ class MaterializedViewReplacement(Check):
                 # `validate` remains idempotent.
                 > DROP MATERIALIZED VIEW mv_replacement_replacement
             """))
+
+
+class MaterializedViewIsNotDistinctFromPrecedence(Check):
+    def initialize(self) -> Testdrive:
+        return Testdrive(dedent("""
+                > CREATE TABLE precedence_table (flag boolean, active boolean);
+                > INSERT INTO precedence_table VALUES (true, true), (false, true), (NULL, true), (true, false);
+                > CREATE VIEW precedence_view1 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                > CREATE MATERIALIZED VIEW precedence_mv1 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                """))
+
+    def manipulate(self) -> list[Testdrive]:
+        return [
+            Testdrive(dedent(s))
+            for s in [
+                """
+                > CREATE VIEW precedence_view2 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                > CREATE MATERIALIZED VIEW precedence_mv2 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                """,
+                """
+                > CREATE VIEW precedence_view3 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                > CREATE MATERIALIZED VIEW precedence_mv3 AS SELECT flag, active FROM precedence_table WHERE flag IS DISTINCT FROM false OR active = false
+                """,
+            ]
+        ]
+
+    def validate(self) -> Testdrive:
+        return Testdrive(dedent("""
+                > SELECT * FROM precedence_view1
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_view2
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_view3
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_mv1
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_mv2
+                <null> true
+                true false
+                true true
+
+                > SELECT * FROM precedence_mv3
+                <null> true
+                true false
+                true true
+                """))

@def- you want that test addition in there as well? Or that was just a one-off check from your side?

Address review nit: compute the IsExprConstruct in a let binding before
building Expr::IsExpr, so the match arms are no longer nested inside the
struct literal. Pure refactor, no behavior change.
@def-

def- commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Just a one-off check. Since it found no bug we don't need it as a regression test.

@aljoscha aljoscha merged commit a86b11d into MaterializeInc:main Jun 16, 2026
118 checks passed
@aljoscha aljoscha deleted the fix-is-distinct-from-precedence branch June 16, 2026 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants