Skip to content

feat: emit warning diagnostic when comparing with NULL using = or <>#20739

Open
wilbyang wants to merge 2 commits intoapache:mainfrom
wilbyang:feat/eq-null-warning-14434
Open

feat: emit warning diagnostic when comparing with NULL using = or <>#20739
wilbyang wants to merge 2 commits intoapache:mainfrom
wilbyang:feat/eq-null-warning-14434

Conversation

@wilbyang
Copy link

@wilbyang wilbyang commented Mar 6, 2026

Which issue does this PR close?

Closes #14434

Rationale for this change

expr = NULL and expr <> NULL always evaluate to NULL (never TRUE or FALSE) due to SQL's three-valued logic. This is almost always a bug — the user almost certainly meant IS NULL or IS NOT NULL. DataFusion currently executes these queries silently with no feedback.

What changes are included in this PR?

Warning infrastructure (datafusion/sql/src/planner.rs)

Added warnings: Arc<Mutex<Vec<Diagnostic>>> to PlannerContext with two new public methods:

  • add_warning(diag: Diagnostic) — push a non-fatal warning during planning
  • take_warnings() -> Vec<Diagnostic> — drain all collected warnings after planning

Using Arc ensures all clones of PlannerContext (e.g. for subqueries) share the same backing store, so warnings from nested queries surface to the top-level caller. This is the first use of Diagnostic::new_warning in the codebase and lays the foundation for future non-fatal diagnostics.

Detection (datafusion/sql/src/expr/mod.rs)

In the binary expression stack machine, detects = NULL / <> NULL on either operand and emits a Diagnostic::new_warning:

'= NULL' will always be NULL (null comparisons never return true or false)
help: use 'IS NULL' instead

The span points at the non-null operand (the column/expression), since NULL literals carry no span.

Tests (datafusion/sql/tests/cases/diagnostic.rs)

Added a do_query_warnings helper and 5 new tests:

Test Expectation
WHERE col = NULL 1 warning, span on col, suggests IS NULL
WHERE NULL = col 1 warning, span on col, suggests IS NULL
WHERE col <> NULL 1 warning, span on col, suggests IS NOT NULL
WHERE col IS NULL 0 warnings (correct form)
WHERE a = NULL AND b = NULL 2 warnings

How callers consume warnings

let mut planner_context = PlannerContext::new();
let plan = sql_to_rel
    .sql_statement_to_plan_with_context(stmt, &mut planner_context)?;
let warnings = planner_context.take_warnings(); // Vec<Diagnostic>

Are these changes tested?

Yes — 5 new tests, all passing. Full datafusion-sql test suite passes (459 + 69 + 12 tests).

Closes apache#14434

Comparing a column against NULL with `=` or `<>` always produces NULL
(never TRUE or FALSE), which is almost certainly a mistake. The user
almost certainly meant `IS NULL` or `IS NOT NULL` instead.

This commit:

- Adds `warnings: Arc<Mutex<Vec<Diagnostic>>>` to `PlannerContext` so
  non-fatal diagnostics can be collected during planning without halting
  execution. The `Arc` ensures all clones (e.g. subquery contexts) share
  the same backing store, so warnings from nested queries surface to the
  top-level caller.
- Detects `expr = NULL` / `NULL = expr` / `expr <> NULL` in the binary
  expression stack machine and pushes a `Diagnostic::new_warning` with
  a help message suggesting `IS NULL` / `IS NOT NULL`.
- Adds `PlannerContext::add_warning` and `PlannerContext::take_warnings`
  as the public API for producing and consuming warnings.
- Adds 5 tests in `datafusion/sql/tests/cases/diagnostic.rs` covering
  the new warning, the negative case (`IS NULL` → no warning), and
  multiple warnings in one query.
@github-actions github-actions bot added the sql SQL Planner label Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Emit warning with attached Diagnostic when doing = NULL

1 participant