Strict mode for expression evaluation (#213 runtime)#232
Merged
Conversation
By default, expression references to names that aren't slots on the
source class silently resolve to None, masking typos and stale
references. This adds a `strict` flag on the `Transformer` dataclass
that surfaces such names as `TransformationError` instead. Non-strict
mode keeps the silent-null behavior but logs a warning the first time
each unique unbound name is seen.
The `Bindings` class now pre-computes the source class's slot set so
"slot declared but absent from this row" (legitimate SQL null) is
distinguishable from "name not in schema at all" (typo). Both
behaviors are preserved in non-strict mode; only the typo case
escalates in strict mode.
Explicit-join target class slots are intentionally not added to the
slot set — those are accessed via the join alias (`{Reading.score}`),
not bare references, so a bare `{score}` on a source class without
that slot is a typo. Implicit-join column resolution is deferred to
the follow-up runtime work for #217.
Exposed on the CLI as `map-data --strict/--no-strict` (default off).
Contributor
There was a problem hiding this comment.
Pull request overview
Adds an opt-in “strict mode” to LinkML expression evaluation so that typos / unbound variable references can be surfaced as errors (instead of silently propagating None), while preserving backward-compatible non-strict behavior (now with deduped warnings).
Changes:
- Introduce
strictflag onTransformer/ObjectTransformerand thread it into expression evaluation. - Make binding lookup schema-aware so “declared slot but missing on this row” returns
None, while “not a schema slot” becomes unbound (→ warn/raise depending on strictness). - Add CLI
--strict/--no-strictand new unit/integration/CLI tests for strict-mode behavior and warning deduplication.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/linkml_map/utils/eval_utils.py |
Adds strict-mode behavior and warning dedupe support to the evaluator when encountering unbound names. |
src/linkml_map/transformer/transformer.py |
Adds strict configuration flag to the base Transformer dataclass. |
src/linkml_map/transformer/object_transformer.py |
Adds schema-slot precomputation to Bindings, introduces per-transformer warning dedupe set, and passes strict/dedupe into expression evaluation. |
src/linkml_map/cli/cli.py |
Adds map-data --strict/--no-strict option to enable strict-mode at runtime via CLI. |
tests/test_utils/test_eval_utils.py |
Adds unit tests validating strict vs non-strict behavior for unbound vs bound-None names. |
tests/test_transformer/test_expr_strict_mode.py |
Adds integration tests for strict-mode behavior through ObjectTransformer.map_object, including deduped warnings across rows. |
tests/test_cli/test_cli.py |
Adds CLI tests verifying --strict fails on typos and default mode remains backward-compatible. |
- _eval_name raises NameError (not TransformationError) in strict mode so ObjectTransformer._slot_error_context can enrich it with class / slot / source-row context. NameError is also outside the (InvalidExpression, TypeError, ValueError) catch list in _eval_expr, so it propagates past the unrestricted_eval fallback rather than being swallowed. - Update _warned_unbound_names docstring to match actual lifetime (per transformer instance, not per "run"), and document that this is intentional rather than a bug. - Drop "(or any explicit join target)" from the test module docstring; joined-class slots are intentionally excluded from bare-name resolution. - Inline codespell:ignore directives on lines using the deliberate 'scroe' typo (kept inline rather than added to repo-wide ignore-words-list).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #213 (runtime side; the validate-spec side is closed by #231).
Summary
Adds a
strictflag on theTransformerdataclass. Withstrict=False(default), an expression that references a name with no binding emits a warning and returnsNone— backward-compatible with today's silent-null behavior, but no longer entirely silent. Withstrict=True, the same condition raisesTransformationError, surfacing typos, stale references, and wrong-table accesses that would otherwise produce silent nulls in transform output.The trick is distinguishing two cases that today both produce
None:Noneis correct (SQL NULL semantics).Nonemasks a bug.Bindings._collect_schema_slots()now pre-computes case 1 fromSchemaView.class_induced_slots(source_type).Bindings.__getitem__returnsNonefor case 1 and raisesKeyError(→NameNotDefined→TransformationErrorin strict, warning +Nonein non-strict) for case 2.Design choices
Transformerdataclass alongsideunrestricted_evalso all transformer subclasses inherit it.ObjectTransformer._warned_unbound_namesand threads througheval_expr_with_mapping(..., warned_unbound=...). Without this, a typo in a spec used against a 10M-row dataset would log 10M lines for one underlying problem.{Alias.column}(dotted), not bare. A bare{score}on a source class that doesn't declarescoreis almost certainly a typo for the dotted form. Documented in_collect_schema_slots's docstring.unrestricted_evalfallback path is not covered by strict mode. When the restricted evaluator raises and asteval takes over, asteval has its own (also-permissive) handling. Marginal case — flagging here as a known limitation.CLI
linkml-map map-data --strict/--no-strict(default off). Falls throughmap_data's**kwargsintoObjectTransformer(**kwargs)— same pattern as--unrestricted-eval.Test plan
uv run pytest tests/— 819 passed, 4 skipped (was 804 passed before this PR; +15 new tests)uv run pytest --doctest-modules src/linkml_map/— 18 passeduv run ruff check .+uv run ruff format --check .— cleantests/test_utils/test_eval_utils.py— 4 direct evaluator tests (default returns None, default logs warning, strict raises, bound-None-in-strict still returns None)tests/test_transformer/test_expr_strict_mode.py— 8 integration tests viaObjectTransformer.map_objectcovering typo / valid-slot / null-value / absent-from-row scenarios, parameterized overstrict=True/False, plus a dedup-across-rows testtests/test_cli/test_cli.py— 2 CLI tests (--strictmakes a typo'd spec exit non-zero; default exits 0)Coordination
This branch is off
mainat5ff8b00. Independent of PR #231 (validate-spec side of the silent-null defense arc) — no shared files, no merge order requirement.