Replace empty module name strings with validated ModuleName type#1151
Conversation
Introduce ModuleName as a validated type (non-empty array of non-empty components) for PythonIdent.pythonModule, eliminating the root cause of empty module name bugs. Key changes: - PythonIdent.mk replaced by PythonIdent.ofComponent with compile-time proof that module string is non-empty - DDM.lean converted to monadic FromDDM for proper error reporting - signaturesToLaurel takes ModuleName instead of String prefix - Extract overload logic into dedicated ExtractM monad (independent of ToLaurelContext) - buildPySpecLaurel takes Array (ModuleName × String) instead of Array (String × String) - ModuleName API: ofString?, ofString!, back, parent, push, append, ofComponent, toString with configurable separator Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bfb75a1 to
e89e3f5
Compare
Introduce `ModuleOfPath` and `ofRelativePath` in PythonIdent.lean to provide a single canonical way to derive a ModuleName from a relative file path. Use it to simplify `discoverModules`, `RelativeImportTest`, and `findInPath`. Delete the unused `ofFile` function and replace `containsSubstr` with `String.contains`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Signature Add `PythonIdent.toString (sep)` to eliminate repeated manual assembly of underscore-separated ident strings. Change `TypeSignature.rank` key from `String` to `ModuleName` for type safety. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 Well-designed PR that eliminates a class of bugs at the type level. The ModuleName invariant is clean, ofRelativePath consolidates scattered logic, and the FromDDM monad properly replaces panic! with structured errors. A few issues below.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts: # StrataMain.lean
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 None of the 7 previous review comments have been addressed — the latest commits all predate the review. Specifically:
PythonIdent.lean:77— still missings!prefix inpanic!messagePythonIdent.lean:117— typoStatastill present (should beStrata)DDM.lean:357— still missings!prefix in.throwstringDecls.lean:112—SpecType.ofComponentstill not renamed back tomkToLaurel.lean:201-202— stale docstring still says "Returns the name unchanged if the prefix is empty"ToLaurel.lean:10,16— duplicateOverloadTableimport still presentSpecs.lean:25—since := "FIXME"placeholder still present
- Add missing s! prefix to panic/error string interpolations - Fix "Stata" typo → "Strata" in ModuleName Repr instance - Revert accidental SpecType constructor rename (ofComponent → mk) - Remove stale docstring clause about empty prefix - Remove duplicate OverloadTable import Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 All previous comments appear addressed — reviewer sign-off still needed.
tautschnig
left a comment
There was a problem hiding this comment.
Nothing fundamentally blocking, but a couple of suggestions that would seem like genuine improvements.
- Push ModuleName through extractFunctionSignatures, eliminating dead isEmpty branch - Rewrite pushOverloadEntry: dedup branches, use extractError instead of dbg_trace, replace .get! with match - Add ModuleOfPath.modulePrefix? returning Option ModuleName directly - Restore lost warning in discoverModules on ofRelativePath error - Add #guard cases for parent, ofString? edge cases, round-trips, back, append Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 All actionable comments from tautschnig's review are addressed in commit 225b37b:
- ✅
ofRelativePathnow usesSystem.FilePath.components(portability fix) - ✅ Warning restored in
SimpleAPI.lean(behavior regression fixed) - ✅
extractFunctionSignaturestakesModuleNamedirectly, deadisEmptybranch removed - ✅
pushOverloadEntrycleaned up:matchreplaces.get!,extractError .overloadParamDisagreereplacesdbg_trace, branches deduplicated - ✅
modulePrefix?added and call site simplified inSpecs.lean - ✅
#guardcases added for round-trips,back,append, andparent
Items explicitly declined by author (with justification): Inhabited instance removal, back refactoring, parent rename. These are design choices for the reviewer to accept or push back on.
Reviewer sign-off still needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 All previous comments appear addressed — reviewer sign-off still needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
13ece5a to
d69ce1b
Compare
Resolve conflicts with Pipeline refactor (PipelineM, MessageKind, structured messaging). Adapt ModuleName typing to new pipeline infrastructure, remove old ExtractM monad in favor of ToLaurelM-based overload extraction, delete Error.lean (replaced by MessageKind.lean). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 All previous comments appear addressed — reviewer sign-off still needed.
MikaelMayer
left a comment
There was a problem hiding this comment.
🤖🔍 All previous comments appear addressed — reviewer sign-off still needed.
Summary
PythonIdent.pythonModulewas a rawString, allowing empty module names to propagate silently through the system and cause downstream failures (see #1143). This PR changespythonModuleto useModuleName— which now guarantees at least one non-empty component — and adopts it more widely through the PySpec pipeline, eliminating this class of bugs at the type level.Approach
Rather than adding runtime checks at each use site,
ModuleNamegains a strengthened invariant via a private constructor enforcing non-empty components.PythonIdent.ofComponentusesdecideto verify non-emptiness at compile time for literal module names (used inDecls.lean). Runtime parsing goes throughModuleName.ofString?which returnsOption.Key changes
PythonIdent.pythonModulechanged fromStringtoModuleName. NewofComponentconstructor with compile-time proof for single-component literals.ModuleNameAPI extended:back,parent (n),push,append/++,toString (sep)ModuleName.ofRelativePathprovides a single canonical function for deriving aModuleNamefrom a.pyfile path relative to a search root, handling both regular files and__init__.pypackages. This eliminates redundant ad-hoc implementations scattered acrossdiscoverModules,RelativeImportTest, andfindInPath— each of which had slightly different__init__.pyhandling, creating inconsistency and latent bugs (e.g.,pySpecToLaurelCommandsilently failed for package modules)findInPathsimplified to return(FilePath × Bool)instead of(FilePath × Array ModuleComponent)— theBoolisisInit, and callers useModuleOfPath.modulePrefixto derive the package prefix. Deleted unusedofFilefunctionfromDDMfunctions to monadicFromDDM(Except (SourceRange × String)) for proper error propagation instead of panicssignaturesToLaureltakesModuleNameinstead ofStringprefix. Overload extraction factored into dedicatedExtractMmonad (no dependency onToLaurelContext)buildPySpecLaureltakesArray (ModuleName × String)instead ofArray (String × String)ModuleNamethroughout;translateFilerequires aModuleName(no longer optional)ofComponentwith compile-time proofUnrelated cleanup
Strata.Python.containsSubstrhelper and replaced all call sites withString.contains(which accepts substring patterns in Lean 4.29)By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.