Safe-function extension surface + slugify/case-converter built-ins #249
Open
amc-corey-cox wants to merge 3 commits into
…-ins

A `@safe_function` decorator + file-path loader lets downstream trans-spec authors register custom safe functions into the expression eval namespace from user-supplied Python modules, loaded via `-F`/`--functions` (repeatable). The author declares safety on the function (pure, bounded-time, no I/O); linkml-map enforces namespace boundaries, not function purity.

- Collision between extensions → error. Collision with a built-in → error unless declared `override=True`. `override=True` with no matching built-in → warning (typo catcher).
- Threaded through `ObjectTransformer.extension_functions` so Python callers can also set it directly.
- Replaces simpleeval's `FunctionNotDefined` message with one that includes Levenshtein suggestions and a pointer to `--functions`.
- Adds `python-slugify` and `inflection` as deps; built-in `slugify`, `to_snake`, `to_camel`, `to_pascal` join `_SCALAR_FUNCTIONS`.
- Docs: `docs/api/extensions.md` covers the contract, the when-not-to-use anti-pattern, override semantics, and the required-extensions header convention.
- Example extension in `tests/input/extensions/slugify_ext.py` dogfoods both `@safe_function` and `override=True`.

Closes #242, #248.
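The collision rules in the list above can be sketched in a few lines. This is a minimal illustration, not linkml-map's loader: `merge_functions`, the `(name, fn, override)` tuple shape, and this local `ExtensionError` are hypothetical stand-ins chosen for the example.

```python
import warnings


class ExtensionError(Exception):
    """Stand-in error type for this sketch."""


def merge_functions(builtins, extensions):
    """Merge (name, fn, override) entries into a copy of the built-ins."""
    merged = dict(builtins)
    seen = set()  # names already claimed by an earlier extension
    for name, fn, override in extensions:
        if name in seen:
            # Two extensions defining the same name is always an error.
            raise ExtensionError(f"{name!r} is defined by more than one extension")
        if name in builtins and not override:
            raise ExtensionError(f"{name!r} shadows a built-in; declare override=True")
        if override and name not in builtins:
            # Nothing to override: most likely a typo in the function name.
            warnings.warn(f"{name!r} declares override=True but matches no built-in", stacklevel=2)
        seen.add(name)
        merged[name] = fn
    return merged
```

Because the built-ins are copied before merging, a failed load leaves the original namespace untouched.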
Pull request overview
Adds a downstream extension mechanism for “safe” expression functions, introduces slugify/case-conversion built-ins, and improves unknown-function diagnostics so users can tell a typo apart from an extension that was never loaded.
Changes:
- Introduces `@safe_function` + `load_extensions()` to discover and load tagged functions from `--functions`/`-F` Python files into the expression namespace.
- Adds new safe built-ins: `slugify`, `to_snake`, `to_camel`, `to_pascal`, along with enriched “unknown function” errors with did-you-mean suggestions.
- Wires extension functions into `ObjectTransformer` (Python API) and the `map-data` CLI, and documents and tests the new surface.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| uv.lock | Locks new deps needed for slugify/case conversion (python-slugify, inflection, text-unidecode). |
| pyproject.toml | Declares new runtime dependencies for the new built-ins. |
| src/linkml_map/utils/eval_utils.py | Adds slugify/case-converter built-ins and richer unknown-function errors. |
| src/linkml_map/utils/extensions.py | New extension loader + decorator for downstream safe-function registration. |
| src/linkml_map/transformer/object_transformer.py | Accepts extension_functions and injects them into expression evaluation. |
| src/linkml_map/cli/cli.py | Adds repeatable --functions/-F option and loads extensions into the transformer. |
| tests/test_utils/test_extensions.py | New unit/integration tests for decorator, loader semantics, CLI, and transformer integration. |
| tests/test_utils/test_eval_utils.py | Updates unknown-function/import tests and adds coverage for new built-ins. |
| tests/input/extensions/slugify_ext.py | Example extension module demonstrating override=True behavior. |
| mkdocs.yml | Adds Extensions page to docs nav. |
| docs/api/functions.md | Documents new built-ins and points to extension docs. |
| docs/api/extensions.md | Adds end-user documentation for writing/loading extension functions. |
Comments suppressed due to low confidence (3)
src/linkml_map/utils/extensions.py:117
`_load_module_from_path()` executes arbitrary user modules but doesn’t catch import-time exceptions; non-`ExtensionError` failures will bypass the CLI’s `ExtensionError` handling and can leave a partially-initialized module in `sys.modules`. Wrap `exec_module()` in `try/except` to raise `ExtensionError` with path context, and ensure `sys.modules[spec.name]` is cleaned up on failure.

    spec = importlib.util.spec_from_file_location(f"_linkml_ext_{path.stem}", path)
    if spec is None or spec.loader is None:
        msg = f"Could not create import spec for: {path}"
        raise ExtensionError(msg)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return module
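A minimal sketch of the fix this comment asks for, under the assumption that any import-time exception should be converted. `load_module_guarded` and the local `ExtensionError` are hypothetical names; the PR's actual function is `_load_module_from_path()` and its final form is not shown here.

```python
import importlib.util
import sys
from pathlib import Path


class ExtensionError(Exception):
    """Stand-in for the PR's ExtensionError."""


def load_module_guarded(path: Path, module_name: str):
    """Import a module from a file path, converting import-time failures."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ExtensionError(f"Could not create import spec for: {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except Exception as exc:
        # Drop the half-initialized module so later imports never see it.
        sys.modules.pop(spec.name, None)
        raise ExtensionError(f"Failed to import extension {path}: {exc}") from exc
    return module
```

Chaining with `from exc` keeps the original traceback available for debugging while the caller only has to handle one error type.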
src/linkml_map/utils/extensions.py:115
`spec_from_file_location(f"_linkml_ext_{path.stem}", path)` can collide when two extension files share the same basename (e.g. `foo/ext.py` and `bar/ext.py`), causing `sys.modules` overwrites and confusing import behavior. Consider deriving a unique module name from the full resolved path (e.g., include a short hash) to avoid collisions.

        msg = f"Extension file not found: {path}"
        raise ExtensionError(msg)
    spec = importlib.util.spec_from_file_location(f"_linkml_ext_{path.stem}", path)
    if spec is None or spec.loader is None:
        msg = f"Could not create import spec for: {path}"
        raise ExtensionError(msg)
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
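One way to derive a collision-free module name, as the comment suggests, is to append a short hash of the resolved path. The helper name, the choice of SHA-256, and the 8-character digest length are assumptions made for this sketch; the PR only says the names are hash-suffixed.

```python
import hashlib
from pathlib import Path


def unique_module_name(path: Path, prefix: str = "_linkml_ext_") -> str:
    """Build a sys.modules key that differs for same-basename files."""
    # Resolve first so relative and absolute spellings of one file agree.
    resolved = str(path.resolve())
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:8]
    return f"{prefix}{path.stem}_{digest}"
```

The stem is kept in the name so tracebacks still hint at which file the module came from.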
src/linkml_map/utils/extensions.py:95
The decorator sets `fn._linkml_safe_function = ...` directly instead of using the `_SAFE_FUNCTION_ATTR` constant. Using `setattr(fn, _SAFE_FUNCTION_ATTR, ...)` avoids the hard-coded attribute name so future refactors can’t accidentally desync tagging vs discovery.

    def _tag(fn: Callable) -> Callable:
        fn._linkml_safe_function = {"override": override, "distributes": distributes}  # type: ignore[attr-defined]
        return fn
- Fix docstring path: docs/expressions/extensions.md → docs/api/extensions.md
- Sort known-names pool before difflib.get_close_matches for deterministic suggestion ordering across runs/Python hash seeds

- Hash-suffix sys.modules names so same-basename files don't clobber
- Wrap exec_module so import-time errors surface as ExtensionError with path context, and sys.modules is cleaned up on failure
- Use setattr(_SAFE_FUNCTION_ATTR, ...) to keep tagging/discovery in sync
- Reserve 'slot' (and any future per-call names): refuse at load time rather than silently shadow
- New tests: import-time error, same-basename non-collision, reserved name, override end-to-end through ObjectTransformer, CLI clean-error on broken extension
- New fixture tests/input/extensions/override_demo_ext.py with distinct output so the integration test can prove override fires through the full stack
- Docs: extensions.md notes the reserved-name guard
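The reserved-name guard mentioned above could look roughly like this. The constant name, the helper, and the error text are hypothetical; the only reserved name the commit message confirms is 'slot'.

```python
class ExtensionError(Exception):
    """Stand-in for the PR's ExtensionError."""


# Assumed contents: the commit message names only 'slot'.
RESERVED_NAMES = frozenset({"slot"})


def check_reserved(names) -> None:
    """Refuse extension names that would shadow per-call names."""
    clashes = sorted(set(names) & RESERVED_NAMES)
    if clashes:
        raise ExtensionError(f"Extension uses reserved name(s): {', '.join(clashes)}")
```

Failing at load time means the author sees the problem once, up front, instead of getting a silently shadowed value in the middle of a transformation.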
Comment on lines +225 to +249

    def _slugify(s: str, separator: str = "_") -> str | None:
        """Slugify a string to an identifier-shaped form.

        ASCII-folds, lowercases, and collapses non-alphanumeric runs to
        ``separator``. Returns ``None`` if no identifier-shaped content remains —
        matching the SQL-style null convention so ``slugify`` composes with
        ``or``-chain fallbacks::

            expr: "slugify(name) or slugify(label) or 'anonymous'"

        Backed by `python-slugify <https://pypi.org/project/python-slugify/>`_.

        >>> _slugify("Hello, World!")
        'hello_world'
        >>> _slugify("Schöne Grüße")
        'schone_grusse'
        >>> _slugify("café-au-lait", separator="-")
        'cafe-au-lait'
        >>> _slugify("!!!") is None
        True
        >>> _slugify("") is None
        True
        """
        return _slugify_lib(s, separator=separator, lowercase=True) or None
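To show why returning `None` matters for `or`-chains, here is a stdlib-only approximation. It is not the PR's `_slugify` and does not use `python-slugify`: `slugify_or_none` is a hypothetical helper, and its NFKD-based folding drops characters such as "ß" that `python-slugify` transliterates, so results differ on such input.

```python
from __future__ import annotations

import re
import unicodedata


def slugify_or_none(s: str, separator: str = "_") -> str | None:
    """Stdlib stand-in that mimics the None-on-empty convention."""
    # Decompose accented characters, then drop anything outside ASCII.
    folded = unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
    # Collapse each run of non-alphanumerics into one separator, then trim.
    slug = re.sub(r"[^a-z0-9]+", separator, folded.lower()).strip(separator)
    # An empty slug becomes None so `or` falls through to the next candidate.
    return slug or None


name, label = "!!!", "My Label"
identifier = slugify_or_none(name) or slugify_or_none(label) or "anonymous"
```

With these inputs `identifier` is `'my_label'`: the first call yields `None`, so the chain moves on to the label.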
Comment on lines +488 to +503

    def suggest_for_unknown_name(name: str, *, known_names: Iterable[str]) -> str:
        """Build a user-facing error message for an unknown function name.

        Includes Levenshtein-style suggestions from the known-names pool plus a
        parenthetical pointing at ``--functions`` so the single message covers both
        the typo case and the missing-extension case.

        :param name: The unknown name as referenced in the expression.
        :param known_names: Names currently in the eval namespace (built-ins plus
            any loaded extensions) — the pool for did-you-mean suggestions.
        """
        suggestions = difflib.get_close_matches(name, sorted(set(known_names)), n=3, cutoff=0.6)
        suggestion = f" Did you mean: {', '.join(repr(s) for s in suggestions)}?" if suggestions else ""
        return f"Unknown function {name!r}.{suggestion} (If this is a custom function, pass it via --functions <path>.)"
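A quick way to see the message this produces is to run the function body from the diff against the four new built-in names. The function is reproduced as shown above with its docstring trimmed; the misspelled input is made up for the example. Note that `difflib.get_close_matches` ranks by `SequenceMatcher` similarity ratio rather than true Levenshtein distance, which is presumably why the docstring says "Levenshtein-style".

```python
import difflib
from collections.abc import Iterable


def suggest_for_unknown_name(name: str, *, known_names: Iterable[str]) -> str:
    # Sorting removes any dependence on set or dict iteration order.
    suggestions = difflib.get_close_matches(name, sorted(set(known_names)), n=3, cutoff=0.6)
    suggestion = f" Did you mean: {', '.join(repr(s) for s in suggestions)}?" if suggestions else ""
    return f"Unknown function {name!r}.{suggestion} (If this is a custom function, pass it via --functions <path>.)"


known = {"slugify", "to_snake", "to_camel", "to_pascal"}
message = suggest_for_unknown_name("slugfy", known_names=known)
# message == "Unknown function 'slugfy'. Did you mean: 'slugify'? "
#            "(If this is a custom function, pass it via --functions <path>.)"
```

When nothing clears the 0.6 cutoff, the "Did you mean" clause is omitted and only the `--functions` hint remains.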
Comment on lines +552 to +560

    """Pre-check function names for unknown-function errors with did-you-mean.

    Replaces simpleeval's default ``FunctionNotDefined`` message ("Function
    'X' not defined") with one that includes Levenshtein suggestions and a
    pointer to ``--functions`` for the missing-extension case.
    """
    if isinstance(node.func, ast.Name) and node.func.id not in self.functions:
        msg = suggest_for_unknown_name(node.func.id, known_names=self.functions.keys())
        raise UnknownFunctionError(node.func.id, msg) from None
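The same check can be illustrated without simpleeval by walking the parsed expression with the standard `ast` module. This is a sketch under stated assumptions: `precheck_calls` is a hypothetical helper, this `UnknownFunctionError` is a stand-in, and it inspects every call eagerly, whereas the PR's check runs inside the evaluator when a call node is reached.

```python
import ast


class UnknownFunctionError(NameError):
    """Stand-in error carrying the offending name and a message."""

    def __init__(self, func_name: str, message: str):
        super().__init__(message)
        self.func_name = func_name


def precheck_calls(expression: str, functions: dict) -> None:
    """Raise if the expression calls a bare name missing from `functions`."""
    tree = ast.parse(expression, mode="eval")
    for node in ast.walk(tree):
        # Only plain-name calls like f(x); attribute calls are left alone.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in functions:
                raise UnknownFunctionError(node.func.id, f"Unknown function {node.func.id!r}.")
```

Checking `ast.Name` specifically mirrors the diff: method-style calls such as `x.lower()` have an `ast.Attribute` as `func` and fall through to the evaluator's normal handling.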
Summary
- `@safe_function` extension mechanism for downstream-supplied functions in the expression eval namespace, loaded via `-F`/`--functions` (repeatable). Decorator-based discovery, file-path loading, explicit override semantics on the function rather than a CLI flag. Implements "Python-module extension surface for safe-function namespace" #248.
- `slugify`, `to_snake`, `to_camel`, `to_pascal`, backed by `python-slugify` and `inflection`. Implements "Add slugify safe builtin for identifier-shaped expressions" #242.
- Unknown-function errors with did-you-mean suggestions and a pointer to `--functions`, replacing simpleeval's `FunctionNotDefined` default message.

The extension mechanism was dogfooded by writing `slugify` as an extension first (see `tests/input/extensions/slugify_ext.py`), then lifting it into the built-ins. The example extension stays as a learning artifact and demonstrates `override=True`.

Design notes (locked in #248)
- … `typing.final`.
- Collision between extensions → `ExtensionError`. Collision with a built-in → `ExtensionError` unless declared `override=True`. `override=True` without a matching built-in → warning.
- No `--allow-override` CLI flag: the decorator's `override=True` is sufficient declaration.
- A `required_extensions:` YAML key is a noted follow-up.

Test plan
- `uv run pytest`: 864 passed, 4 skipped
- `uv run pytest --doctest-modules src/linkml_map/utils/eval_utils.py`: all doctests pass
- `uv run ruff check .` and `uv run ruff format .` clean
- `--functions` end-to-end with the slugify example extension