
Safe-function extension surface + slugify/case-converter built-ins #249

Open
amc-corey-cox wants to merge 3 commits into main from safe-fn-extensions

@amc-corey-cox

Summary

  • @safe_function extension mechanism for downstream-supplied functions in the expression eval namespace, loaded via -F/--functions (repeatable). Decorator-based discovery, file-path loading, and explicit override semantics declared on the function rather than through a CLI flag. Implements #248 ("Python-module extension surface for safe-function namespace").
  • Slugify family of built-ins (slugify, to_snake, to_camel, to_pascal), backed by python-slugify and inflection. Implements #242 ("Add slugify safe builtin for identifier-shaped expressions").
  • Enriched unknown-function errors with Levenshtein did-you-mean suggestions and a pointer to --functions, replacing simpleeval's FunctionNotDefined default message.

The extension mechanism was dogfooded by writing slugify as an extension first (see tests/input/extensions/slugify_ext.py), then lifting it into the built-ins. The example extension stays as a learning artifact and demonstrates override=True.

Design notes (locked in #248)

  • Author declares safety via the decorator; linkml-map enforces namespace boundaries, not function purity. Same posture as typing.final.
  • Collision between two extensions → ExtensionError. Collision with a built-in → ExtensionError unless declared override=True. override=True without a matching built-in → warning.
  • File paths only; module-name discovery deferred.
  • No --allow-override CLI flag — the decorator's override=True is sufficient declaration.
  • Trans-specs that need an extension document it in a header comment; declarative required_extensions: YAML key is a noted follow-up.
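
The collision rules above can be sketched in a few lines. This is only an illustration of the stated semantics, not the PR's loader: `merge_functions`, the `(name, function, override)` tuple shape, and the local `ExtensionError` class are hypothetical stand-ins.

```python
import warnings


class ExtensionError(Exception):
    """Local stand-in for the PR's ExtensionError (assumption for this sketch)."""


def merge_functions(builtins, extensions):
    """Merge extension functions into a copy of the built-in namespace.

    ``extensions`` is a list of (name, function, override) tuples, a
    hypothetical shape chosen only for this sketch.
    """
    merged = dict(builtins)
    seen = set()
    for name, fn, override in extensions:
        if name in seen:
            # Two extensions registering the same name is always an error.
            raise ExtensionError(f"Two extensions both define {name!r}")
        if name in builtins and not override:
            # Shadowing a built-in needs an explicit declaration.
            raise ExtensionError(f"{name!r} collides with a built-in; declare override=True to replace it")
        if override and name not in builtins:
            # Probably a typo in the function name, so warn instead of failing.
            warnings.warn(f"{name!r} declares override=True but no built-in has that name", stacklevel=2)
        seen.add(name)
        merged[name] = fn
    return merged
```

Putting the override decision on the function keeps it next to the code that is taking responsibility for replacing a built-in, which is the rationale given for not adding a CLI flag.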

Test plan

  • uv run pytest — 864 passed, 4 skipped
  • uv run pytest --doctest-modules src/linkml_map/utils/eval_utils.py — all doctests pass
  • uv run ruff check . and uv run ruff format . clean
  • CLI smoke test exercises --functions end-to-end with the slugify example extension

…-ins

A `@safe_function` decorator + file-path loader lets downstream trans-spec
authors register custom safe functions into the expression eval namespace
from user-supplied Python modules, loaded via `-F`/`--functions`
(repeatable). Author declares safety on the function (pure, bounded-time,
no I/O); linkml-map enforces namespace boundaries, not function purity.

- Collision between extensions → error. Collision with a built-in → error
  unless declared `override=True`. `override=True` with no matching
  built-in → warning (typo catcher).
- Threaded through `ObjectTransformer.extension_functions` so Python
  callers can also set it directly.
- Replaces simpleeval's `FunctionNotDefined` message with one that
  includes Levenshtein suggestions and a pointer to `--functions`.
- Adds `python-slugify` and `inflection` as deps; built-in `slugify`,
  `to_snake`, `to_camel`, `to_pascal` join `_SCALAR_FUNCTIONS`.
- Docs: `docs/api/extensions.md` covers the contract, the
  when-not-to-use anti-pattern, override semantics, and the
  required-extensions header convention.
- Example extension in `tests/input/extensions/slugify_ext.py`
  dogfoods both `@safe_function` and `override=True`.

Closes #242, #248.
Copilot AI review requested due to automatic review settings May 15, 2026 19:41

Copilot AI left a comment

Pull request overview

Adds a downstream extension mechanism for “safe” expression functions, introduces slugify/case-conversion built-ins, and improves unknown-function diagnostics so users can tell a typo from an extension that was never loaded.

Changes:

  • Introduces @safe_function + load_extensions() to discover and load tagged functions from --functions/-F Python files into the expression namespace.
  • Adds new safe built-ins: slugify, to_snake, to_camel, to_pascal, along with enriched “unknown function” errors with did-you-mean suggestions.
  • Wires extension functions into ObjectTransformer (Python API) and the map-data CLI, and documents and tests the new surface.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| uv.lock | Locks new deps needed for slugify/case conversion (python-slugify, inflection, text-unidecode). |
| pyproject.toml | Declares new runtime dependencies for the new built-ins. |
| src/linkml_map/utils/eval_utils.py | Adds slugify/case-converter built-ins and richer unknown-function errors. |
| src/linkml_map/utils/extensions.py | New extension loader + decorator for downstream safe-function registration. |
| src/linkml_map/transformer/object_transformer.py | Accepts extension_functions and injects them into expression evaluation. |
| src/linkml_map/cli/cli.py | Adds repeatable --functions/-F option and loads extensions into the transformer. |
| tests/test_utils/test_extensions.py | New unit/integration tests for decorator, loader semantics, CLI, and transformer integration. |
| tests/test_utils/test_eval_utils.py | Updates unknown-function/import tests and adds coverage for new built-ins. |
| tests/input/extensions/slugify_ext.py | Example extension module demonstrating override=True behavior. |
| mkdocs.yml | Adds Extensions page to docs nav. |
| docs/api/functions.md | Documents new built-ins and points to extension docs. |
| docs/api/extensions.md | Adds end-user documentation for writing/loading extension functions. |
Comments suppressed due to low confidence (3)

src/linkml_map/utils/extensions.py:117

  • _load_module_from_path() executes arbitrary user modules but doesn’t catch import-time exceptions; non-ExtensionError failures will bypass the CLI’s ExtensionError handling and can leave a partially-initialized module in sys.modules. Wrap exec_module() in try/except to raise ExtensionError with path context, and ensure sys.modules[spec.name] is cleaned up on failure.
    spec = importlib.util.spec_from_file_location(f"_linkml_ext_{path.stem}", path)
    if spec is None or spec.loader is None:
        msg = f"Could not create import spec for: {path}"
        raise ExtensionError(msg)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return module
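
One way the suggested fix could look, as a hedged sketch rather than the code that was eventually committed: the excerpt above with `exec_module()` wrapped so that any import-time failure becomes an `ExtensionError` carrying the path, and the partially initialised module is removed from `sys.modules`. The local `ExtensionError` class and the public function name are stand-ins for this example.

```python
import importlib.util
import sys
from pathlib import Path


class ExtensionError(Exception):
    """Local stand-in for the PR's ExtensionError (assumption for this sketch)."""


def load_module_from_path(path: Path):
    """Import ``path`` as a module, converting import-time failures to ExtensionError."""
    spec = importlib.util.spec_from_file_location(f"_linkml_ext_{path.stem}", path)
    if spec is None or spec.loader is None:
        raise ExtensionError(f"Could not create import spec for: {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except Exception as exc:
        # Drop the half-initialised module so later imports do not see it.
        sys.modules.pop(spec.name, None)
        raise ExtensionError(f"Error while importing extension {path}: {exc}") from exc
    return module
```

Catching `Exception` leaves `SystemExit` and `KeyboardInterrupt` untouched; whether those should also be translated is a design choice this sketch does not settle.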

src/linkml_map/utils/extensions.py:115

  • spec_from_file_location(f"_linkml_ext_{path.stem}", path) can collide when two extension files share the same basename (e.g. foo/ext.py and bar/ext.py), causing sys.modules overwrites and confusing import behavior. Consider deriving a unique module name from the full resolved path (e.g., include a short hash) to avoid collisions.
        msg = f"Extension file not found: {path}"
        raise ExtensionError(msg)
    spec = importlib.util.spec_from_file_location(f"_linkml_ext_{path.stem}", path)
    if spec is None or spec.loader is None:
        msg = f"Could not create import spec for: {path}"
        raise ExtensionError(msg)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
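
A minimal sketch of the hash-based naming idea, assuming a SHA-256 digest of the resolved path truncated to eight hex characters. The helper name `unique_module_name` and the digest length are illustrative choices, not taken from the PR.

```python
import hashlib
from pathlib import Path


def unique_module_name(path: Path) -> str:
    """Derive a sys.modules key that differs for same-named files in different directories."""
    resolved = str(path.resolve())
    # A short digest of the full path keeps the name readable but distinct.
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:8]
    return f"_linkml_ext_{path.stem}_{digest}"
```

Because the digest is computed from the resolved path, loading the same file twice yields the same module name, while `foo/ext.py` and `bar/ext.py` get different ones.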

src/linkml_map/utils/extensions.py:95

  • The decorator sets fn._linkml_safe_function = ... directly instead of using the _SAFE_FUNCTION_ATTR constant. Using setattr(fn, _SAFE_FUNCTION_ATTR, ...) avoids the hard-coded attribute name so future refactors can’t accidentally desync tagging vs discovery.
    def _tag(fn: Callable) -> Callable:
        fn._linkml_safe_function = {"override": override, "distributes": distributes}  # type: ignore[attr-defined]
        return fn

Comment thread src/linkml_map/utils/extensions.py Outdated
Comment thread src/linkml_map/utils/eval_utils.py Outdated
- Fix docstring path: docs/expressions/extensions.md → docs/api/extensions.md
- Sort known-names pool before difflib.get_close_matches for deterministic
  suggestion ordering across runs/Python hash seeds
- Hash-suffix sys.modules names so same-basename files don't clobber
- Wrap exec_module so import-time errors surface as ExtensionError with
  path context, and sys.modules is cleaned up on failure
- Use setattr(fn, _SAFE_FUNCTION_ATTR, ...) to keep tagging/discovery in sync
- Reserve 'slot' (and any future per-call names) — refuse at load time
  rather than silently shadow
- New tests: import-time error, same-basename non-collision, reserved
  name, override end-to-end through ObjectTransformer, CLI clean-error
  on broken extension
- New fixture tests/input/extensions/override_demo_ext.py with distinct
  output so the integration test can prove override fires through the
  full stack
- Docs: extensions.md notes the reserved-name guard
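
The reserved-name guard mentioned in this commit might look roughly like the following. Only the name 'slot' is taken from the commit message; `RESERVED_NAMES`, `check_reserved`, and the local `ExtensionError` are hypothetical names for this sketch.

```python
class ExtensionError(Exception):
    """Local stand-in for the PR's ExtensionError (assumption for this sketch)."""


# Names injected per call by the evaluator; only 'slot' is named in the commit.
RESERVED_NAMES = frozenset({"slot"})


def check_reserved(names, source):
    """Refuse extension function names that would shadow per-call names."""
    clashes = sorted(set(names) & RESERVED_NAMES)
    if clashes:
        raise ExtensionError(f"{source}: reserved name(s) {', '.join(clashes)} cannot be registered")
```

Failing at load time means the author sees the problem once, up front, instead of getting a silently shadowed name during evaluation.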
Copilot AI review requested due to automatic review settings May 15, 2026 22:13

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

Comment on lines +225 to +249
def _slugify(s: str, separator: str = "_") -> str | None:
    """Slugify a string to an identifier-shaped form.

    ASCII-folds, lowercases, and collapses non-alphanumeric runs to
    ``separator``. Returns ``None`` if no identifier-shaped content remains,
    matching the SQL-style null convention so ``slugify`` composes with
    ``or``-chain fallbacks::

        expr: "slugify(name) or slugify(label) or 'anonymous'"

    Backed by `python-slugify <https://pypi.org/project/python-slugify/>`_.

    >>> _slugify("Hello, World!")
    'hello_world'
    >>> _slugify("Schöne Grüße")
    'schone_grusse'
    >>> _slugify("café-au-lait", separator="-")
    'cafe-au-lait'
    >>> _slugify("!!!") is None
    True
    >>> _slugify("") is None
    True
    """
    return _slugify_lib(s, separator=separator, lowercase=True) or None
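
To show why returning `None` matters for `or`-chains, here is a standard-library approximation of the same behaviour. It is not python-slugify: `slugify_or_none` is a hypothetical stand-in that folds to ASCII with `unicodedata`, so characters without an ASCII decomposition (such as ß) are dropped, where the real built-in transliterates them.

```python
import re
import unicodedata


def slugify_or_none(s, separator="_"):
    """Rough stand-in for the built-in: slug string, or None when nothing is left."""
    # Decompose accented characters, then drop anything that is not ASCII.
    folded = unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
    # Collapse each run of non-alphanumerics to one separator, then trim the ends.
    slug = re.sub(r"[^a-z0-9]+", lambda _m: separator, folded.lower()).strip(separator)
    # An empty slug becomes None so callers can chain fallbacks with ``or``.
    return slug or None


# The first candidate has no identifier-shaped content, so the chain falls through.
label = slugify_or_none("!!!") or slugify_or_none("Fallback Label") or "anonymous"
```

In the last line `label` ends up as the slug of the second candidate, because the first call returns `None` and `or` moves on to the next operand.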

Comment on lines +488 to +503
def suggest_for_unknown_name(name: str, *, known_names: Iterable[str]) -> str:
    """Build a user-facing error message for an unknown function name.

    Includes Levenshtein-style suggestions from the known-names pool plus a
    parenthetical pointing at ``--functions`` so the single message covers both
    the typo case and the missing-extension case.

    :param name: The unknown name as referenced in the expression.
    :param known_names: Names currently in the eval namespace (built-ins plus
        any loaded extensions), the pool for did-you-mean suggestions.
    """
    suggestions = difflib.get_close_matches(name, sorted(set(known_names)), n=3, cutoff=0.6)
    suggestion = f" Did you mean: {', '.join(repr(s) for s in suggestions)}?" if suggestions else ""
    return f"Unknown function {name!r}.{suggestion} (If this is a custom function, pass it via --functions <path>.)"
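
A small standalone demonstration of the suggestion step, using only the standard library. One note on terminology: `difflib.get_close_matches` ranks candidates by `SequenceMatcher`'s similarity ratio, which is not a Levenshtein edit distance, so "Levenshtein-style" is best read loosely. The helper name and the `BUILTINS` list below are illustrative.

```python
import difflib


def unknown_function_message(name, known_names):
    """Build a did-you-mean message; mirrors the excerpt minus the --functions hint."""
    pool = sorted(set(known_names))
    # Keep at most three candidates whose similarity ratio is at least 0.6.
    suggestions = difflib.get_close_matches(name, pool, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(repr(s) for s in suggestions)}?" if suggestions else ""
    return f"Unknown function {name!r}.{hint}"


BUILTINS = ["slugify", "to_snake", "to_camel", "to_pascal"]
typo_message = unknown_function_message("slugfy", BUILTINS)
unrelated_message = unknown_function_message("zzzz", BUILTINS)
```

A near miss such as `slugfy` picks up `slugify` as a suggestion, while a name with nothing in common with the pool gets no "Did you mean" clause at all.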


Comment on lines +552 to +560
        """Pre-check function names for unknown-function errors with did-you-mean.

        Replaces simpleeval's default ``FunctionNotDefined`` message ("Function
        'X' not defined") with one that includes Levenshtein suggestions and a
        pointer to ``--functions`` for the missing-extension case.
        """
        if isinstance(node.func, ast.Name) and node.func.id not in self.functions:
            msg = suggest_for_unknown_name(node.func.id, known_names=self.functions.keys())
            raise UnknownFunctionError(node.func.id, msg) from None
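
The same check can be tried outside the evaluator with a plain `ast.walk`. This is a simplified sketch: the PR performs the check inside an evaluator method, whereas `check_calls` and the single-argument `UnknownFunctionError` below are hypothetical stand-ins.

```python
import ast


class UnknownFunctionError(Exception):
    """Local stand-in; the PR's exception takes (name, message)."""


def check_calls(expression, functions):
    """Raise if the expression calls a bare name that is not a known function."""
    tree = ast.parse(expression, mode="eval")
    for node in ast.walk(tree):
        # Only bare-name calls like f(x) are checked; method calls such as
        # x.lower() have an ast.Attribute as their func and are skipped.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in functions:
                raise UnknownFunctionError(f"Unknown function {node.func.id!r}")
```

Checking the name before the call is evaluated is what allows a custom message to replace the evaluator's default one.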