Status: Active
Owner: flowscope-core
Last updated: 2026-02-21
FlowScope ships 72 lint rules across 9 families (AL, AM, CP, CV, JJ, LT, RF, ST, TQ),
each implemented in a dedicated one-rule-per-file module under linter/rules/.
The linter started with a handful of core AST rules plus a monolithic parity.rs that
used regex/heuristic matching for SQLFluff compatibility. That monolith has been fully
decommissioned — all rules are now in dedicated modules using AST-driven or
token-stream-driven implementations.
Goals:
- Robust correctness across dialects and real-world SQL.
- Sound architecture with explicit semantics and low false positives.
- Maintainable implementation with clear rule ownership and minimal coupling.
- Scalable rule engine that can grow without a monolith.
- Deterministic outputs and stable spans suitable for editor and CI usage.

Non-goals:
- One-shot rewrite of all existing rules.
- "AST-only" implementation of purely lexical formatting rules.
- Perfect SQLFluff behavior clone across every dialect from day one.
- AST-first semantics
  - Semantic rules must be driven by parsed AST plus scope/resolution context, not regex.
  - Examples: aliasing semantics, reference qualification, join logic, set operation checks.
- Token-aware style
  - Formatting and trivia rules must use token stream data, not AST-only approximations.
  - Examples: whitespace, newlines, comments, casing style, quoting style, Jinja padding.
- Parse once, tokenize once
  - Build a single lint document model per SQL input and reuse it across all rules.
  - Avoid repeated parsing and repeated ad hoc string scans in each rule.
- Stable rule contract
  - Each rule gets structured input from the engine, not direct access to ad hoc helpers.
  - Rule output must include deterministic code, message, severity, statement index, and span.
- Dialect-explicit behavior
  - Rule decisions must be dialect-aware and must not silently assume generic SQL semantics.
  - When parser fallback is used, confidence should degrade explicitly.
- Deterministic and testable
  - Same input and config must always produce the same ordered issue set.
  - Rules must be independently testable with focused fixtures.
- Regex is migration glue, not architecture
  - Existing regex heuristics can remain temporarily for parity continuity.
  - New semantic rules must not be implemented with regex.
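
The "stable rule contract" principle above could look roughly like the following. This is a minimal sketch, not FlowScope's actual API: `Span`, `Severity`, `Issue`, `RuleContext`, `LintRule`, and the `DEMO_001` rule are all hypothetical names chosen for illustration.

```rust
// Hypothetical types: none of these names are taken from FlowScope's source.

/// Byte range into the original SQL text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Warning,
    Error,
}

/// Every issue carries code, message, severity, statement index, and span.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Issue {
    pub code: &'static str,
    pub message: String,
    pub severity: Severity,
    pub statement_index: usize,
    pub span: Span,
}

/// Structured, read-only input handed to each rule by the engine.
pub struct RuleContext<'a> {
    pub sql: &'a str,
    pub statement_index: usize,
    pub statement_range: Span,
}

pub trait LintRule {
    fn code(&self) -> &'static str;
    fn check(&self, ctx: &RuleContext<'_>) -> Vec<Issue>;
}

/// Toy rule (placeholder code): flags a statement that is blank after trimming.
pub struct EmptyStatement;

impl LintRule for EmptyStatement {
    fn code(&self) -> &'static str {
        "DEMO_001"
    }

    fn check(&self, ctx: &RuleContext<'_>) -> Vec<Issue> {
        let text = &ctx.sql[ctx.statement_range.start..ctx.statement_range.end];
        if text.trim().is_empty() {
            vec![Issue {
                code: self.code(),
                message: "statement is empty".to_string(),
                severity: Severity::Warning,
                statement_index: ctx.statement_index,
                span: ctx.statement_range,
            }]
        } else {
            Vec::new()
        }
    }
}
```

Because the rule only sees `RuleContext`, it can be tested with a hand-built context and a focused fixture, without running the full pipeline.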
The linter engine should construct a normalized input model once:
- `sql` (full source text)
- `dialect` and parser/fallback metadata
- parsed statements with statement ranges
- token stream with token spans and token kinds
- optional scope/resolution metadata for semantic rules
This becomes the only rule input surface.
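
As an illustration of that input model, the sketch below builds a document once and exposes statement ranges and a token stream to any consumer. The field names, the `TokenKind` variants, the toy tokenizer, and the naive `;` statement splitter are all assumptions made for the example; the real `LintDocument` uses the SQL parser and tokenizer.

```rust
// Sketch only: field and type names are assumptions, not FlowScope's definitions.
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Word,
    Symbol,
    Whitespace,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Range<usize>,
}

#[derive(Debug)]
pub struct LintDocument<'a> {
    pub sql: &'a str,
    pub dialect: &'a str,
    pub used_parser_fallback: bool,
    pub statement_ranges: Vec<Range<usize>>,
    pub tokens: Vec<Token>,
}

/// Toy tokenizer: merges runs of word or whitespace characters; each symbol is its own token.
fn tokenize(sql: &str) -> Vec<Token> {
    let mut tokens: Vec<Token> = Vec::new();
    for (i, c) in sql.char_indices() {
        let kind = if c.is_whitespace() {
            TokenKind::Whitespace
        } else if c.is_alphanumeric() || c == '_' {
            TokenKind::Word
        } else {
            TokenKind::Symbol
        };
        let end = i + c.len_utf8();
        let extend = kind != TokenKind::Symbol
            && matches!(tokens.last(), Some(last) if last.kind == kind);
        if extend {
            if let Some(last) = tokens.last_mut() {
                last.span.end = end;
            }
        } else {
            tokens.push(Token { kind, span: i..end });
        }
    }
    tokens
}

/// Naive splitter on `;` (ignores strings and comments; a real parser supplies these ranges).
fn statement_ranges(sql: &str) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start = 0;
    for (i, c) in sql.char_indices() {
        if c == ';' {
            ranges.push(start..i + 1);
            start = i + 1;
        }
    }
    if !sql[start..].trim().is_empty() {
        ranges.push(start..sql.len());
    }
    ranges
}

impl<'a> LintDocument<'a> {
    /// Parse and tokenize once; every rule then reads from this value.
    pub fn build(sql: &'a str, dialect: &'a str) -> Self {
        LintDocument {
            sql,
            dialect,
            used_parser_fallback: false,
            statement_ranges: statement_ranges(sql),
            tokens: tokenize(sql),
        }
    }
}
```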
- Semantic engine
  - Input: AST + scope/resolution context.
  - Handles semantic correctness and structural SQL logic.
- Lexical engine
  - Input: token stream + token spans.
  - Handles formatting/casing/quoting/comment-aware style rules.
- Document engine
  - Input: whole file/document metadata.
  - Handles file-level checks (EOF newline, leading blank lines, batch separators).
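
To make the document engine concrete, here is a minimal sketch of one file-level check: requiring exactly one trailing newline at EOF. The `EofNewline` enum and `check_eof_newline` function are hypothetical, and treating an empty file as acceptable and counting only `\n` (not `\r\n`) are simplifying assumptions.

```rust
// Hypothetical document-level check; names and edge-case policy are assumptions.
#[derive(Debug, PartialEq, Eq)]
pub enum EofNewline {
    Ok,
    Missing,
    /// Carries the number of trailing newline characters found.
    TooMany(usize),
}

pub fn check_eof_newline(source: &str) -> EofNewline {
    // Assumption: an empty file is not reported.
    if source.is_empty() {
        return EofNewline::Ok;
    }
    // Count '\n' characters at the very end of the file.
    let trailing = source.chars().rev().take_while(|c| *c == '\n').count();
    match trailing {
        0 => EofNewline::Missing,
        1 => EofNewline::Ok,
        n => EofNewline::TooMany(n),
    }
}
```

The check needs neither the AST nor the token stream, which is why it belongs to a separate engine that only sees whole-file input.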
- Move from `parity.rs` monolith to `rules/<code>.rs` modules.
- Keep shared traversal and token utilities in common helpers.
- Preserve existing lint codes for API stability.
- Primary span source: parser or tokenizer spans.
- Secondary span source: scoped fallback search only when necessary.
- No free-form "best guess" spans without explicit fallback path.
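
The span policy above can be sketched as a single resolution function: use the parser or tokenizer span when one exists, otherwise search only inside the owning statement, and report which path produced the result. `SpanSource` and `resolve_span` are hypothetical names, and the substring search stands in for whatever scoped fallback a rule actually uses.

```rust
// Sketch under assumptions: names are illustrative, not FlowScope's API.
use std::ops::Range;

#[derive(Debug, PartialEq, Eq)]
pub enum SpanSource {
    Primary,
    ScopedFallback,
}

/// Returns a span plus its source, or `None` rather than a free-form guess.
pub fn resolve_span(
    sql: &str,
    statement: Range<usize>,
    primary: Option<Range<usize>>,
    needle: &str,
) -> Option<(Range<usize>, SpanSource)> {
    // Primary source: a span reported by the parser or tokenizer.
    if let Some(span) = primary {
        return Some((span, SpanSource::Primary));
    }
    if needle.is_empty() {
        return None;
    }
    // Secondary source: search restricted to the owning statement's range.
    let scope = sql.get(statement.clone())?;
    let offset = scope.find(needle)?;
    let start = statement.start + offset;
    Some((start..start + needle.len(), SpanSource::ScopedFallback))
}
```

Returning the source alongside the span keeps the fallback path explicit, so downstream code can lower confidence when `ScopedFallback` is reported.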
Each issue should carry internal provenance metadata:
- engine type (`semantic`, `lexical`, `document`)
- confidence (`high`, `medium`, `low`)
- fallback source (if parser fallback or heuristic logic was used)
This supports telemetry, triage, and quality gates.
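
A possible shape for that provenance record is sketched below. The type names and the one-step downgrade policy in `with_fallback` are assumptions for illustration; the document only states that the three fields exist and that confidence should degrade explicitly on fallback.

```rust
// Hypothetical provenance types; the downgrade policy is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LintEngine {
    Semantic,
    Lexical,
    Document,
}

// Declaration order gives Low < Medium < High for comparisons.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Confidence {
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Provenance {
    pub engine: LintEngine,
    pub confidence: Confidence,
    pub fallback_source: Option<String>,
}

impl Provenance {
    /// Issue produced directly from parser or tokenizer data.
    pub fn direct(engine: LintEngine) -> Self {
        Provenance { engine, confidence: Confidence::High, fallback_source: None }
    }

    /// Record a fallback and lower confidence by one step (assumed policy).
    pub fn with_fallback(mut self, source: &str) -> Self {
        self.confidence = match self.confidence {
            Confidence::High => Confidence::Medium,
            _ => Confidence::Low,
        };
        self.fallback_source = Some(source.to_string());
        self
    }

    /// Simple quality gate: does this issue meet a minimum confidence?
    pub fn meets(&self, min: Confidence) -> bool {
        self.confidence >= min
    }
}
```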
Rule metadata should include whether a deterministic fix is supported.
- No inferred fix logic from message text.
- Fix support should be explicit and tested per rule.
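
One way to keep fix support explicit is to declare it in rule metadata and have the fixer consult only that declaration. The sketch below assumes a hypothetical `RuleMeta` record and uses placeholder `DEMO_*` codes rather than real FlowScope rules.

```rust
// Hypothetical metadata shape; FlowScope's real rule metadata may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FixSupport {
    None,
    Deterministic,
}

#[derive(Debug, Clone, Copy)]
pub struct RuleMeta {
    pub code: &'static str,
    pub fix: FixSupport,
}

// Placeholder entries for illustration only.
pub const REGISTRY: &[RuleMeta] = &[
    RuleMeta { code: "DEMO_001", fix: FixSupport::Deterministic },
    RuleMeta { code: "DEMO_002", fix: FixSupport::None },
];

/// A fix is attempted only when the rule declares one; message text is never inspected.
pub fn supports_fix(registry: &[RuleMeta], code: &str) -> bool {
    registry
        .iter()
        .any(|meta| meta.code == code && meta.fix == FixSupport::Deterministic)
}
```

Unknown codes fall through to `false`, so a rule without an explicit declaration is never auto-fixed.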
- Parse SQL into statements with selected dialect.
- Tokenize full source with token spans.
- Build `LintDocument` with statement ranges and shared metadata.
- Optionally build scope/resolution context once for semantic rules.
- Execute semantic, lexical, and document engines.
- Normalize, sort, and deduplicate issues.
- Emit final issues with deterministic ordering and stable spans.
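
The final normalize/sort/deduplicate steps can be sketched with a total ordering over issues. This is an assumed ordering (statement index, then span, then code, then message) expressed through field order on a hypothetical `Issue` struct; the actual sort key used by FlowScope is not specified here.

```rust
// Sketch: derived ordering follows field declaration order, which is the assumed sort key.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Issue {
    pub statement_index: usize,
    pub start: usize,
    pub end: usize,
    pub code: String,
    pub message: String,
}

/// Sort into a total order, then drop exact duplicates emitted by different engines.
pub fn finalize(mut issues: Vec<Issue>) -> Vec<Issue> {
    issues.sort();
    // `dedup` removes adjacent equal items, so it must run after sorting.
    issues.dedup();
    issues
}
```

Because the ordering is total over every field, the output does not depend on the order in which the semantic, lexical, and document engines happened to emit their issues.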
- `LintDocument` model, tokenization pass, and document-level lint execution path are live.
- Token stream provider propagated through rule context (`parse once, tokenize once`).
All semantic-heavy rules migrated to AST-driven implementations:
- references (`RF_001`–`RF_006`)
- structure checks (`ST_001`–`ST_012`)
- ambiguous join/reference rules (`AM_001`–`AM_009`)
- convention rules (`CV_001`–`CV_012`)
- aliasing rules (`AL_001`–`AL_009`)
All style-oriented checks migrated to dedicated modules:
- capitalization (`CP_001`–`CP_005`): tokenizer-driven, except `CP_002`, which uses shared AST identifier-candidate traversal
- layout (`LT_001`–`LT_015`): tokenizer/line-aware checks
- jinja padding (`JJ_001`): delimiter scanning
- TSQL checks (`TQ_001`–`TQ_003`): AST/token-driven
Remaining work: SQLFluff configuration-depth parity gaps for some CP/LT/JJ rules.
- `parity.rs` retired and deleted.
- All 72 rules live in one-rule-per-file modules under `linter/rules/`.
- Phase 0 foundation shipped: `LintDocument` model, tokenization pass, and document-level lint execution path are live.
- Engine split is active in linter orchestration: semantic + lexical + document passes run with deterministic sort/dedupe.
- Issue provenance metadata is implemented (`lint_engine`, `lint_confidence`, `lint_fallback_source`).
- Phase 1 AST migrations landed for: `AM_001`-`AM_009`, `CV_001`-`CV_012`, `RF_001`-`RF_006`, `ST_001`-`ST_012`, `AL_001`-`AL_009`.
- `LINT_AM_009` now follows SQLFluff AM09 semantics via AST query-clause analysis, flagging LIMIT/OFFSET usage without ORDER BY across top-level and nested SELECTs.
- `LINT_AM_004` now follows SQLFluff AM04 semantics via AST output-width analysis, flagging queries whose result column count is unknown due to unresolved wildcard expansion (`*`/`alias.*`) across CTE/subquery/set-operation scopes, and now resolves wildcard width through declared CTE column lists, table-factor alias column lists (`AS alias(col1, ...)`), and aliased nested-join factors (including `USING(...)` width deduction plus `NATURAL JOIN` overlap deduction when both sides expose deterministic output column names).
- `LINT_AM_002` now follows SQLFluff AM02 core semantics by flagging bare `UNION` (without explicit `ALL`/`DISTINCT`), with CLI fixer behavior inserting explicit `DISTINCT` through AST set-operation quantifier rewrites (text-regex path removed), and dialect-scoped execution aligned to SQLFluff-supported dialects available in FlowScope.
- `LINT_CV_002` now follows SQLFluff CV02 semantics and fixer behavior by flagging IFNULL/NVL function usage and rewriting to COALESCE.
- `LINT_CV_005` now follows SQLFluff CV05 semantics and fixer behavior by flagging `= NULL`/`<> NULL` comparisons and rewriting to `IS [NOT] NULL`.
- `LINT_CV_008` fixer parity is now AST-driven across both simple and chained/nested RIGHT JOIN patterns, rewriting them to LEFT JOIN form by swapping join operands and normalizing join operators.
- `LINT_ST_004` now follows SQLFluff ST04 semantics via AST CASE analysis, flagging flattenable nested CASE expressions in ELSE clauses (instead of depth-based heuristics); fixer parity now flattens eligible nested `ELSE CASE` branches into a single CASE.
- `LINT_ST_007` now includes SQLFluff ST07 fixer parity via AST join-constraint rewrites, converting `JOIN ... USING (...)` to explicit `ON` predicates (including multi-column USING lists).
- `LINT_ST_009` now includes SQLFluff ST09 fixer parity via AST expression rewrites, swapping reversed qualified equality sides in `JOIN ... ON` predicates.
- `LINT_ST_006` now follows SQLFluff ST06 detection semantics via AST SELECT projection analysis (simple targets after leading complex expressions) and includes fixer parity via AST reordering.
- `LINT_ST_002` now follows SQLFluff ST02 detection semantics via AST CASE analysis (repeated equality checks on a common operand) and includes fixer parity via AST CASE rewrites.
- `LINT_ST_008` now follows SQLFluff ST08 detection semantics via AST SELECT analysis for `DISTINCT(<expr>)` and includes fixer parity via AST SELECT rewrite to `SELECT DISTINCT <expr>`.
- `LINT_ST_010` now aligns closer to SQLFluff ST10 by covering equivalent-expression predicate comparisons across `=`/`!=`/`<`/`>`/`<=`/`>=` (e.g. `x = x`, `x < x`) with operator-side guardrails (including equivalent concat/arithmetic expression detection such as `'A'||'B' = 'A'||'B'` while deferring nested comparison-expression operands), preserving SQLFluff-style literal handling (`1=1`/`1=0` allowed and non-equality literal-vs-literal comparisons deferred), traversing SELECT/UPDATE/DELETE/MERGE predicate contexts, and reporting per-occurrence violations rather than collapsing to one statement-level hit.
- `LINT_ST_011` now aligns closer to SQLFluff ST11 by scoping candidate checks to explicit OUTER joins, tracking only joined relations (not the base `FROM` source), deferring on unqualified references (RF02-style), accounting for references in other JOIN `ON` clauses, `DISTINCT ON (...)`, query-level `ORDER BY`, `CLUSTER BY`/`DISTRIBUTE BY`, `LATERAL VIEW`, `CONNECT BY`, named `WINDOW` clause expressions, and later JOIN relation expressions (e.g. `UNNEST(g.nested_array)`), evaluating multi-root `FROM` clauses, treating both qualified wildcards (`alias.*`) and unqualified wildcard projections (`*`) as table references (including Snowflake qualified wildcard `EXCLUDE` forms), and normalizing quoted joined-source names across MySQL backticks and MSSQL brackets.
- `LINT_AL_009` now follows SQLFluff AL09 core detection semantics via AST projection analysis for identifier/qualified-identifier self-alias patterns (`col AS col`), with quote-aware case matching and `alias_case_check` configuration support including mode-accurate `quoted_cs_naked_upper`/`quoted_cs_naked_lower` behavior.
- `LINT_AL_001` now uses AST-driven table-factor alias traversal with token-aware `AS` detection, replacing regex-based matching, and now includes SQLFluff AL01 parity for `MERGE` target/source aliases.
- `LINT_AL_002` now uses AST-driven SELECT projection alias traversal with token-aware `AS` detection, replacing regex-based clause extraction, and now excludes TSQL assignment-style projection aliases (`SELECT alias = expr`) from AL02 violations.
- `LINT_AL_004` now also checks implicit table-name aliases (no explicit `AS`) plus parent-scope collisions across both nested FROM/JOIN subqueries (excluding wrapper aliases) and expression subqueries (`WHERE`/`IN`/`EXISTS`), and supports quote-aware `alias_case_check` configuration with mode-accurate `quoted_cs_naked_upper`/`quoted_cs_naked_lower` behavior.
- `LINT_AL_008` now checks duplicate projected output names from both explicit aliases and unaliased column references (e.g., `foo`, `schema.foo`) in SELECT clauses, with quote-aware `alias_case_check` configuration support and mode-accurate `quoted_cs_naked_upper`/`quoted_cs_naked_lower` behavior.
- `lint.ruleConfigs` now supports per-rule configuration objects keyed by canonical/shorthand/dotted rule references; `LINT_AL_001` and `LINT_AL_002` use this for SQLFluff-style `aliasing=explicit|implicit`.
- `LINT_AL_006` now runs as a dedicated AST rule via table-factor alias traversal and supports `min_alias_length`/`max_alias_length` via `lint.ruleConfigs` (default behavior now aligns with SQLFluff by leaving `max_alias_length` unset unless configured).
- `LINT_AL_003` now supports `allow_scalar` via `lint.ruleConfigs` with SQLFluff-aligned default behavior (`allow_scalar=true`).
- `LINT_AL_007` now runs as a dedicated AST rule over base-table factors in `FROM`/`JOIN`, flagging unnecessary aliases in both single-source and multi-source scopes while allowing aliases for repeated self-join table references.
- `LINT_RF_004`/`LINT_RF_005`/`LINT_RF_006` are now split out of `parity.rs` into dedicated core modules (`rf_004.rs`-`rf_006.rs`); all three now use AST-driven traversal (`RF04` identifier/alias analysis with identifier-policy/ignore config support, `RF05` identifier/special-char analysis with identifier-policy and allowed-character config support, `RF06` identifier quoting analysis with identifier-policy/keyword-preference/ignore config support).
- `LINT_ST_012` and `LINT_TQ_001`-`LINT_TQ_003` are now split out of `parity.rs` into dedicated core modules (`st_012.rs`, `tq_001.rs`-`tq_003.rs`); `LINT_TQ_001`/`LINT_TQ_002` are AST-driven (`CreateProcedure` name/body analysis), and `LINT_ST_012`/`LINT_TQ_003` now use token-driven sequencing checks.
- `LINT_CV_001`, `LINT_CV_007`, and `LINT_CV_009`-`LINT_CV_011` are now split out of `parity.rs` into dedicated core modules (`cv_001.rs`, `cv_007.rs`, `cv_009.rs`-`cv_011.rs`); `LINT_CV_007`, `LINT_CV_009`, `LINT_CV_010`, and `LINT_CV_011` are now AST-driven, and `LINT_CV_001` now uses token-aware operator scanning (plus `preferred_not_equal_style` config support) instead of regex.
- `LINT_CV_004` now supports SQLFluff-style COUNT preference knobs (`prefer_count_1`/`prefer_count_0`) via `lint.ruleConfigs` while keeping AST expression traversal for detection; default fixer behavior now rewrites both `COUNT(1)` and `COUNT(0)` to `COUNT(*)`.
- `LINT_CV_006` now supports `multiline_newline`/`require_final_semicolon` via `lint.ruleConfigs` while keeping statement-boundary aware terminator checks.
- Lint execution now propagates the single document token stream through rule context (`parse once, tokenize once`), and `LINT_CV_006` consumes that shared stream before fallback tokenization.
- MSSQL statement parsing now splits batches on `GO` separators before best-effort parsing, enabling `LINT_CV_006` final-semicolon checks to run on post-`GO` statements without parser dropouts.
- `LINT_CV_009` now supports configurable `blocked_words`/`blocked_regex` via `lint.ruleConfigs` (AST traversal scope unchanged).
- `LINT_CV_010` now supports `preferred_quoted_literal_style` via `lint.ruleConfigs` and uses mixed-style (single+double) detection for `consistent` mode (current behavior remains narrower than full SQLFluff literal semantics).
- `LINT_CV_011` now supports `preferred_type_casting_style` via `lint.ruleConfigs` (including `consistent`/`shorthand`/`cast`/`convert` preferences).
- `LINT_LT_005` now supports `max_line_length`, `ignore_comment_lines`, and `ignore_comment_clauses` via `lint.ruleConfigs`.
- `LINT_LT_009` now supports `wildcard_policy` (`single`/`multiple`) via `lint.ruleConfigs`.
- `LINT_LT_011` now supports `line_position` (`alone:strict`/`leading`/`trailing`) via `lint.ruleConfigs`.
- `LINT_LT_015` now supports `maximum_empty_lines_inside_statements`/`maximum_empty_lines_between_statements` via `lint.ruleConfigs`.
- `LINT_LT_003` now supports operator line-placement configuration via `lint.ruleConfigs` (`line_position=leading|trailing`, plus legacy SQLFluff `operator_new_lines=after|before` mapping).
- `LINT_LT_004` now supports comma line-placement configuration via `lint.ruleConfigs` (`line_position=trailing|leading`, plus legacy SQLFluff `comma_style` mapping).
- `LINT_ST_005` now supports `forbid_subquery_in` (`both`/`join`/`from`) via `lint.ruleConfigs`, with SQLFluff-aligned default behavior set to `join`, and now exempts correlated JOIN-derived subqueries that reference outer query sources (SQLFluff ST05 parity for correlated cases).
- CLI lint mode now supports SQLFluff-style config/template parity plumbing: `--rule-configs` JSON for per-rule options, explicit `--template` passthrough in lint mode, and Jinja fallback retry for parse-erroring templated SQL (enabling config-aware fixture replay parity checks for `AL05`/`ST05`/`ST11`).
- `LINT_ST_009` now supports `preferred_first_table_in_join_clause` (`earlier`/`later`) via `lint.ruleConfigs`.
- `LINT_RF_001` now supports `force_enable` via `lint.ruleConfigs`.
- `LINT_RF_002` now supports `force_enable` via `lint.ruleConfigs`, aligns closer to SQLFluff projection-alias semantics (self-alias projections are flagged while later references to earlier aliases remain allowed), and avoids false positives on common datepart function-argument keywords (e.g., `timestamp_trunc(..., month)`, `datediff(year, ...)`).
- `LINT_RF_003` now supports `single_table_references` (`consistent`/`qualified`/`unqualified`) and `force_enable` via `lint.ruleConfigs`, and treats qualified wildcards (`alias.*`) as qualified references for mixed-style detection.
- `LINT_RF_006` now supports `prefer_quoted_identifiers`/`case_sensitive` via `lint.ruleConfigs`.
- `LINT_AL_007` now supports `force_enable` via `lint.ruleConfigs` and is disabled by default to match SQLFluff behavior (rule logic runs when `force_enable=true`).
- `LINT_AL_005` now supports `alias_case_check` (including SQLFluff-style casefolding modes, with mode-accurate `quoted_cs_naked_upper`/`quoted_cs_naked_lower` normalization) via `lint.ruleConfigs`, includes dialect-aware quoted/unquoted alias normalization in default (`dialect`) mode (e.g., Postgres/Redshift lower-folding vs Snowflake upper-folding, plus case-insensitive quoted identifiers for dialects such as DuckDB/Hive/SQLite), tracks alias usage across additional AST clauses (`QUALIFY`, named `WINDOW`, `DISTINCT ON`, `PREWHERE`, `CLUSTER BY`/`DISTRIBUTE BY`/`SORT BY`, `LATERAL VIEW`, `CONNECT BY`) plus join relation table-factor expressions (`LATERAL` subqueries, `UNNEST(...)`, Snowflake `LATERAL FLATTEN(...)`, etc.), now ignores derived-subquery wrapper aliases and value-table-function aliases per SQLFluff AL05 parity, recursively checks nested derived-query scopes for inner alias usage/violations, applies to single-table scopes as well as multi-source joins, and now includes dialect-aware parity for BigQuery `TO_JSON_STRING(<table_alias>)`, SQLFluff Redshift `QUALIFY` ordering behavior (QUALIFY alias references count only when `QUALIFY` follows FROM/JOIN directly, including unqualified alias-prefixed identifiers such as `ss_sold_date`), BigQuery/Redshift implicit array-relation alias usage (`FROM t, t.arr` / `FROM t, t.super_array AS x`), repeated-self-join alias handling where sibling aliases on the same base relation are exempt when one is referenced, and Snowflake `DELETE ... USING` subquery parity for inner CTE alias usage.
- Parser fallback now normalizes escaped quoted-identifier edge cases for BigQuery/ClickHouse (plus ClickHouse trailing-comma-before-`FROM` fallback normalization), removing supported-dialect fixture parse blockers that previously masked `LINT_AL_005` parity checks.
- `LINT_AM_005` now supports `fully_qualify_join_types` (`inner`/`outer`/`both`) via `lint.ruleConfigs`, with AST-driven outer-mode detection for unqualified `LEFT`/`RIGHT` joins and token fallback kept only for `FULL JOIN` keyword-form disambiguation.
- `LINT_AM_006` now supports `group_by_and_order_by_style` (`consistent`/`explicit`/`implicit`) via `lint.ruleConfigs`.
- `LINT_CP_001` now supports `capitalisation_policy`, `ignore_words`, and `ignore_words_regex` via `lint.ruleConfigs`.
- `LINT_CP_002`-`LINT_CP_005` now support `extended_capitalisation_policy`, `ignore_words`, and `ignore_words_regex` via `lint.ruleConfigs`; `LINT_CP_002` additionally supports SQLFluff-style `unquoted_identifiers_policy`.
- `LINT_CV_003` now uses token/depth-aware SELECT-clause analysis for trailing-comma detection, replacing regex scanning, and supports SQLFluff-style `select_clause_trailing_comma` (`forbid`/`require`) via `lint.ruleConfigs`.
- `LINT_JJ_001` and `LINT_LT_010`/`LINT_LT_011`/`LINT_LT_012`/`LINT_LT_013`/`LINT_LT_015` are now split out of `parity.rs` into dedicated core modules (`jj_001.rs`, `lt_010.rs`, `lt_011.rs`, `lt_012.rs`, `lt_013.rs`, `lt_015.rs`); `LINT_JJ_001` now uses delimiter scanning (including `%}`/`#}` close checks and trim-marker-safe tags), `LINT_LT_010`/`LINT_LT_011` now use tokenizer line-aware checks, `LINT_LT_012` now enforces a single trailing newline at EOF, and `LINT_LT_013`/`LINT_LT_015` now use direct newline-run scanning instead of regex matching.
- `LINT_LT_002`/`LINT_LT_003`/`LINT_LT_004`/`LINT_LT_007` are now split out of `parity.rs` into dedicated core modules (`lt_002.rs`, `lt_003.rs`, `lt_004.rs`, `lt_007.rs`); `LINT_LT_002` now performs configurable indent-width checks (`indent_unit`/`tab_space_size`) with mixed tab/space detection, `LINT_LT_003`/`LINT_LT_004` now use tokenizer-based operator/comma layout checks, and `LINT_LT_007` now uses deterministic CTE sequence scanning instead of regex matching.
- `LINT_LT_001`/`LINT_LT_005`/`LINT_LT_006`/`LINT_LT_008`/`LINT_LT_009`/`LINT_LT_014` are now split out of `parity.rs` into dedicated core modules (`lt_001.rs`, `lt_005.rs`, `lt_006.rs`, `lt_008.rs`, `lt_009.rs`, `lt_014.rs`); `LINT_LT_001` now uses deterministic layout-pattern scanners, `LINT_LT_006` uses token-stream spacing detection for function-like calls, `LINT_LT_009` uses tokenizer-located SELECT-line target counting, and `LINT_LT_014` uses token/line-aware major-clause placement checks instead of regex masking.
- `LINT_CP_001`-`LINT_CP_005` are now split out of `parity.rs` into dedicated core modules (`cp_001.rs`-`cp_005.rs`); `LINT_CP_004` was migrated to tokenizer-driven literal detection, `LINT_CP_001`/`LINT_CP_003`/`LINT_CP_005` are tokenizer-driven (keyword/function/type token analysis), and `LINT_CP_002` now uses shared AST identifier-candidate traversal (with SQLFluff-style identifier-policy filtering), replacing regex + manual masking paths.
- `LINT_AM_003` now follows SQLFluff AM03 semantics via AST `ORDER BY` analysis, flagging mixed implicit/explicit sort direction (including `NULLS` ordering cases) across nested query scopes; fixer parity now adds explicit `ASC` to implicit items in mixed clauses.
- `LINT_AM_005` fixer now follows SQLFluff AM05 config-aware behavior: default/`inner` rewrites bare `JOIN` to `INNER JOIN`, and `outer`/`both` modes also qualify `LEFT`/`RIGHT` joins plus rewrite bare `FULL JOIN` keywords to `FULL OUTER JOIN` (outside string literals) after AST rewrites.
- `LINT_AM_006` now follows SQLFluff AM06 default (`consistent`) semantics via AST traversal of `GROUP BY`/`ORDER BY` clauses, including nested-query precedence and rollup-style references.
- `LINT_AM_008` now follows SQLFluff AM08 semantics via AST join-operator analysis (implicit cross join detection, with `WHERE` deferral to CV12 and UNNEST/CROSS/NATURAL/USING exclusions); fixer parity now rewrites eligible implicit joins to explicit `CROSS JOIN`.
- `LINT_CV_012` now broadens AST join-operator handling to include `INNER JOIN` forms represented as `JoinOperator::Inner` without `ON`/`USING`, and now aligns closer to SQLFluff CV12 chain semantics by flagging only when all naked joins in a join chain are represented via WHERE join predicates.
- `LINT_AM_007` now performs AST set-expression branch-width checks with deterministic wildcard resolution for CTE/derived sources (including declared CTE column lists and table-factor alias column lists) and aliased nested-join factors (including `USING(...)` width deduction plus `NATURAL JOIN` overlap deduction when both sides expose deterministic output column names), while unresolved wildcard expansions remain non-violating (SQLFluff-aligned behavior).
- Parity monolith decommission is complete: migrated rule registrations and parity tests are removed, and `crates/flowscope-core/src/linter/rules/parity.rs` has been retired.
- [~] SQLFluff fixture adoption is in progress; AM, CV, ST fixture cases adopted for most semantic rules. Additional rule-level coverage is still being expanded.
- [~] SQLFluff parity quality gaps remain for a subset of rules. See `docs/sqlfluff-gap-matrix.md` for the current status of per-rule parity deltas.
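
One item in the status list above, splitting MSSQL input into batches on `GO` separators before parsing, can be illustrated with a small line-based splitter. This is a simplified sketch, not the FlowScope implementation: `split_go_batches` is a hypothetical name, a separator is assumed to be a line containing only `GO` (case-insensitive), and `GO` lines inside multi-line strings or block comments, as well as the `GO <count>` form, are not handled.

```rust
/// Split source text into batches at lines that contain only `GO`.
/// Returned slices borrow from `sql`, so spans stay relative to the original text.
pub fn split_go_batches(sql: &str) -> Vec<&str> {
    let mut batches = Vec::new();
    let mut start = 0; // byte offset where the current batch begins
    let mut pos = 0; // byte offset of the current line
    for line in sql.split_inclusive('\n') {
        if line.trim().eq_ignore_ascii_case("GO") {
            let batch = &sql[start..pos];
            if !batch.trim().is_empty() {
                batches.push(batch);
            }
            // The next batch starts after the separator line.
            start = pos + line.len();
        }
        pos += line.len();
    }
    let tail = &sql[start..];
    if !tail.trim().is_empty() {
        batches.push(tail);
    }
    batches
}
```

Each batch can then be parsed independently, so a statement after a separator still reaches rules such as the final-semicolon check even if an earlier batch fails to parse.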
Each migrated rule must pass:
- correctness: fixture and regression coverage for trigger/non-trigger cases
- span quality: stable and accurate primary highlight span
- precision guardrails: false positive threshold on curated corpus
- performance: no meaningful regression on representative workloads
- parity continuity: no unintentional code/message regressions unless documented
- Parser limitations and missing AST locations
  - Mitigation: token spans become first-class; fallback span logic remains explicit.
- Dialect edge cases not fully supported upstream
  - Mitigation: dialect-specific behavior tables and confidence downgrade on fallback paths.
- Migration churn and temporary duplicate logic
  - Resolved: phased rule-by-rule migration completed; parity monolith retired.
- All semantic rules run through AST/scope engine.
- All style/layout rules run through lexical/document engines.
- `parity.rs` no longer acts as a rule home.
- Rule additions are modular, testable, and engine-scoped by default.
- [~] Lint output quality and determinism improve while preserving stable public rule codes.
- Close remaining SQLFluff parity gaps (see `docs/sqlfluff-gap-matrix.md`).