Skip to content

Strict mode for expression evaluation (#213 runtime)#232

Merged
amc-corey-cox merged 2 commits into
mainfrom
fix-expr-eval-silent-none
May 12, 2026
Merged

Strict mode for expression evaluation (#213 runtime)#232
amc-corey-cox merged 2 commits into
mainfrom
fix-expr-eval-silent-none

Conversation

@amc-corey-cox
Copy link
Copy Markdown
Contributor

Closes #213 (runtime side; the validate-spec side is closed by #231).

Summary

Adds a strict flag on the Transformer dataclass. With strict=False (default), an expression that references a name with no binding emits a warning and returns None — backward-compatible with today's silent-null behavior, but no longer entirely silent. With strict=True, the same condition raises TransformationError, surfacing typos, stale references, and wrong-table accesses that would otherwise produce silent nulls in transform output.

The trick is distinguishing two cases that today both produce None:

  1. Field exists in source schema, value is absent on this rowNone is correct (SQL NULL semantics).
  2. Field does not exist in source schema at allNone masks a bug.

Bindings._collect_schema_slots() now pre-computes case 1 from SchemaView.class_induced_slots(source_type). Bindings.__getitem__ returns None for case 1 and raises KeyError (→ NameNotDefinedTransformationError in strict, warning + None in non-strict) for case 2.

Design choices

  • Strict mode is a flag, not the default. Default-to-strict is a separate conversation deferred to follow-up. The flag is on the base Transformer dataclass alongside unrestricted_eval so all transformer subclasses inherit it.
  • Warnings dedupe per unbound name across rows. The shared set lives on ObjectTransformer._warned_unbound_names and threads through eval_expr_with_mapping(..., warned_unbound=...). Without this, a typo in a spec used against a 10M-row dataset would log 10M lines for one underlying problem.
  • Explicit-join target class slots are intentionally NOT added to the schema-slot set. Explicit joins are accessed via {Alias.column} (dotted), not bare. A bare {score} on a source class that doesn't declare score is almost certainly a typo for the dotted form. Documented in _collect_schema_slots's docstring.
  • Implicit-join column resolution is out of scope. Belongs in the Sparse implicit-join semantics: nested object on no-match + data-dependent ambiguity enforcement #217 follow-up, where the auto-join runtime semantics live anyway. Strict mode with implicit-join specs may falsely reject slots that would be auto-resolved at runtime; this is a known gap that resolves cleanly when Sparse implicit-join semantics: nested object on no-match + data-dependent ambiguity enforcement #217 lands.
  • unrestricted_eval fallback path is not covered by strict mode. When the restricted evaluator raises and asteval takes over, asteval has its own (also-permissive) handling. Marginal case — flagging here as a known limitation.

CLI

linkml-map map-data --strict/--no-strict (default off). Falls through map_data's **kwargs into ObjectTransformer(**kwargs) — same pattern as --unrestricted-eval.

Test plan

  • uv run pytest tests/ — 819 passed, 4 skipped (was 804 passed before this PR; +15 new tests)
  • uv run pytest --doctest-modules src/linkml_map/ — 18 passed
  • uv run ruff check . + uv run ruff format --check . — clean
  • New tests:
    • tests/test_utils/test_eval_utils.py — 4 direct evaluator tests (default returns None, default logs warning, strict raises, bound-None-in-strict still returns None)
    • tests/test_transformer/test_expr_strict_mode.py — 8 integration tests via ObjectTransformer.map_object covering typo / valid-slot / null-value / absent-from-row scenarios, parameterized over strict=True/False, plus a dedup-across-rows test
    • tests/test_cli/test_cli.py — 2 CLI tests (--strict makes a typo'd spec exit non-zero; default exits 0)

Coordination

This branch is off main at 5ff8b00. Independent of PR #231 (validate-spec side of the silent-null defense arc) — no shared files, no merge order requirement.

By default, expression references to names that aren't slots on the
source class silently resolve to None, masking typos and stale
references. This adds a `strict` flag on the `Transformer` dataclass
that surfaces such names as `TransformationError` instead. Non-strict
mode keeps the silent-null behavior but logs a warning the first time
each unique unbound name is seen.

The `Bindings` class now pre-computes the source class's slot set so
"slot declared but absent from this row" (legitimate SQL null) is
distinguishable from "name not in schema at all" (typo). Both
behaviors are preserved in non-strict mode; only the typo case
escalates in strict mode.

Explicit-join target class slots are intentionally not added to the
slot set — those are accessed via the join alias (`{Reading.score}`),
not bare references, so a bare `{score}` on a source class without
that slot is a typo. Implicit-join column resolution is deferred to
the follow-up runtime work for #217.

Exposed on the CLI as `map-data --strict/--no-strict` (default off).
Copilot AI review requested due to automatic review settings May 12, 2026 18:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an opt-in “strict mode” to LinkML expression evaluation so that typos / unbound variable references can be surfaced as errors (instead of silently propagating None), while preserving backward-compatible non-strict behavior (now with deduped warnings).

Changes:

  • Introduce strict flag on Transformer / ObjectTransformer and thread it into expression evaluation.
  • Make binding lookup schema-aware so “declared slot but missing on this row” returns None, while “not a schema slot” becomes unbound (→ warn/raise depending on strictness).
  • Add CLI --strict/--no-strict and new unit/integration/CLI tests for strict-mode behavior and warning deduplication.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/linkml_map/utils/eval_utils.py Adds strict-mode behavior and warning dedupe support to the evaluator when encountering unbound names.
src/linkml_map/transformer/transformer.py Adds strict configuration flag to the base Transformer dataclass.
src/linkml_map/transformer/object_transformer.py Adds schema-slot precomputation to Bindings, introduces per-transformer warning dedupe set, and passes strict/dedupe into expression evaluation.
src/linkml_map/cli/cli.py Adds map-data --strict/--no-strict option to enable strict-mode at runtime via CLI.
tests/test_utils/test_eval_utils.py Adds unit tests validating strict vs non-strict behavior for unbound vs bound-None names.
tests/test_transformer/test_expr_strict_mode.py Adds integration tests for strict-mode behavior through ObjectTransformer.map_object, including deduped warnings across rows.
tests/test_cli/test_cli.py Adds CLI tests verifying --strict fails on typos and default mode remains backward-compatible.

Comment thread src/linkml_map/utils/eval_utils.py
Comment thread src/linkml_map/transformer/object_transformer.py Outdated
Comment thread tests/test_transformer/test_expr_strict_mode.py Outdated
- _eval_name raises NameError (not TransformationError) in strict mode
  so ObjectTransformer._slot_error_context can enrich it with class /
  slot / source-row context. NameError is also outside the
  (InvalidExpression, TypeError, ValueError) catch list in _eval_expr,
  so it propagates past the unrestricted_eval fallback rather than
  being swallowed.

- Update _warned_unbound_names docstring to match actual lifetime
  (per transformer instance, not per "run"), and document that this
  is intentional rather than a bug.

- Drop "(or any explicit join target)" from the test module docstring;
  joined-class slots are intentionally excluded from bare-name
  resolution.

- Inline codespell:ignore directives on lines using the deliberate
  'scroe' typo (kept inline rather than added to repo-wide
  ignore-words-list).
@amc-corey-cox amc-corey-cox merged commit f2363c3 into main May 12, 2026
12 checks passed
@amc-corey-cox amc-corey-cox deleted the fix-expr-eval-silent-none branch May 12, 2026 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Expr evaluation silently returns None for nonexistent variables

2 participants