feat: emit warning diagnostic when comparing with NULL using = or <>#20739
Open
wilbyang wants to merge 2 commits intoapache:mainfrom
Open
feat: emit warning diagnostic when comparing with NULL using = or <>#20739wilbyang wants to merge 2 commits intoapache:mainfrom
wilbyang wants to merge 2 commits intoapache:mainfrom
Conversation
Closes apache#14434 Comparing a column against NULL with `=` or `<>` always produces NULL (never TRUE or FALSE), which is almost certainly a mistake. The user almost certainly meant `IS NULL` or `IS NOT NULL` instead. This commit: - Adds `warnings: Arc<Mutex<Vec<Diagnostic>>>` to `PlannerContext` so non-fatal diagnostics can be collected during planning without halting execution. The `Arc` ensures all clones (e.g. subquery contexts) share the same backing store, so warnings from nested queries surface to the top-level caller. - Detects `expr = NULL` / `NULL = expr` / `expr <> NULL` in the binary expression stack machine and pushes a `Diagnostic::new_warning` with a help message suggesting `IS NULL` / `IS NOT NULL`. - Adds `PlannerContext::add_warning` and `PlannerContext::take_warnings` as the public API for producing and consuming warnings. - Adds 5 tests in `datafusion/sql/tests/cases/diagnostic.rs` covering the new warning, the negative case (`IS NULL` → no warning), and multiple warnings in one query.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #14434
Rationale for this change
expr = NULLandexpr <> NULLalways evaluate toNULL(neverTRUEorFALSE) due to SQL's three-valued logic. This is almost always a bug — the user almost certainly meantIS NULLorIS NOT NULL. DataFusion currently executes these queries silently with no feedback.What changes are included in this PR?
Warning infrastructure (
datafusion/sql/src/planner.rs)Added
warnings: Arc<Mutex<Vec<Diagnostic>>>toPlannerContextwith two new public methods:add_warning(diag: Diagnostic)— push a non-fatal warning during planningtake_warnings() -> Vec<Diagnostic>— drain all collected warnings after planningUsing
Arcensures all clones ofPlannerContext(e.g. for subqueries) share the same backing store, so warnings from nested queries surface to the top-level caller. This is the first use ofDiagnostic::new_warningin the codebase and lays the foundation for future non-fatal diagnostics.Detection (
datafusion/sql/src/expr/mod.rs)In the binary expression stack machine, detects
= NULL/<> NULLon either operand and emits aDiagnostic::new_warning:The span points at the non-null operand (the column/expression), since NULL literals carry no span.
Tests (
datafusion/sql/tests/cases/diagnostic.rs)Added a
do_query_warningshelper and 5 new tests:WHERE col = NULLcol, suggestsIS NULLWHERE NULL = colcol, suggestsIS NULLWHERE col <> NULLcol, suggestsIS NOT NULLWHERE col IS NULLWHERE a = NULL AND b = NULLHow callers consume warnings
Are these changes tested?
Yes — 5 new tests, all passing. Full
datafusion-sqltest suite passes (459 + 69 + 12 tests).