Version 8: Extract core pipeline into library#39
Conversation
WalkthroughVersion 8 refactors the project by extracting core pipeline functionality from main.rs into a new lib.rs, creating a mathsys library. Module imports now require .msm extensions. Numerous syntax structs expose previously private fields for public access. The solver adopts a new selection-based derivation pruning strategy, and module loading gains filesystem integration capabilities. Changes
Sequence Diagram(s)sequenceDiagram
participant Use as Use (Level1)
participant File as File System
participant Tokenizer
participant Parser
participant Solver
participant Settings
Use->>File: load module by name
File-->>Use: file content
Use->>Tokenizer: tokenize content
Tokenizer-->>Use: tokens
Use->>Parser: parse tokens
Parser-->>Use: AST (Start)
Use->>Solver: solve AST
Solver-->>Use: resolved Start
Use->>Use: update self.start with result
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
main.rs (2)
42-51:⚠️ Potential issue | 🟠 MajorIndex-based target dispatch is a silent breakage hazard.
If anyone reorders or inserts an entry in
TARGETS, every single index here silently points to the wrong command. You're oneTARGETSedit away fromhelprunninglatex. Use the string names directly instead ofTARGETS[i].0.🛡️ Proposed fix — match on string literals directly
- match &transformers.settings.target.clone().unwrap_or(Target {name: String::from("help")}).name as &str { - target if target == TARGETS[0].0 => help(&mut transformers)?, - target if target == TARGETS[1].0 => println!("Running Mathsys v{} on {}/{}/{}", version(&mut transformers)?, OS, ARCH, rustcv()), - target if target == TARGETS[2].0 => println!("{:#?}", tokens(&mut transformers)?), - target if target == TARGETS[3].0 => {let len = length(&mut transformers)?; println!("Token length: {len} / {MAXLEN} ({}%)", len as f32 / MAXLEN as f32 * 100.0)}, - target if target == TARGETS[4].0 => println!("{}", {check(&mut transformers)?; "Valid"}), - target if target == TARGETS[5].0 => println!("{:#?}", ast(&mut transformers)?), - target if target == TARGETS[6].0 => println!("{}", latex(&mut transformers)?), - other => return Err(Issue::UnknownTarget(other.to_string())) + match &transformers.settings.target.clone().unwrap_or(Target {name: String::from("help")}).name as &str { + "help" => help(&mut transformers)?, + "version" => println!("Running Mathsys v{} on {}/{}/{}", version(&mut transformers)?, OS, ARCH, rustcv()), + "tokens" => println!("{:#?}", tokens(&mut transformers)?), + "length" => {let len = length(&mut transformers)?; println!("Token length: {len} / {MAXLEN} ({}%)", len as f32 / MAXLEN as f32 * 100.0)}, + "check" => println!("{}", {check(&mut transformers)?; "Valid"}), + "ast" => println!("{:#?}", ast(&mut transformers)?), + "latex" => println!("{}", latex(&mut transformers)?), + other => return Err(Issue::UnknownTarget(other.to_string())) };
29-29: 🧹 Nitpick | 🔵 TrivialFile extension check via
split(".")is naive — handle dotless filenames and paths with dots.
file.split(".").last().unwrap()works forfoo.msmbut consider a path likemy.project/input— it'd matchinputas the extension (correctly not matchingmsm), but the intent is murky. Usestd::path::Path::extension()for robust extension checking.♻️ Proposed fix
- file if file.split(".").last().unwrap() == "msm" => Argument::File(File { + file if std::path::Path::new(file.as_str()).extension().is_some_and(|ext| ext == "msm") => Argument::File(File {syntax/level2.rs (1)
43-43: 🧹 Nitpick | 🔵 TrivialBare
panic!()with no message — you're making future-you's life harder.When this inevitably blows up on unexpected input, you'll get a panic with zero context. At minimum, include the offending
itemin the message so you can actually debug it.🔧 Proposed fix
- other => panic!() + other => panic!("Expression::summon: unexpected item {other:?}")
🤖 Fix all issues with AI agents
In @.github/CHANGELOG.md:
- Around line 210-222: Add a blank line before and after each level-3 heading
(e.g., "### Added", "### Changed", "### Removed", "### Fixed") in the "v8 --
Semantic solver" section (and the section title "v8 -- Semantic solver" itself)
so they match the v7 formatting, and ensure the file ends with a single trailing
newline; update the changelog text around those headings (including the "1. Full
issue coverage now." and list blocks) to preserve markdown paragraph separation
and add the final newline.
In @.gitignore:
- Around line 14-15: The current gitignore entry '*.msm' will ignore all .msm
files globally; change this to a more scoped pattern (for example '/*.msm' to
only ignore root-level .msm files) or move test .msm files into a dedicated
ignored directory and update the .gitignore accordingly so you don't
accidentally hide future module/example .msm files; update the '*.msm' entry in
the .gitignore to the chosen scoped pattern or directory rule.
In `@file.msm`:
- Around line 1-150: The file contains 150 identical repeated lines "e = lim n->
inf of (1 + 1/n)^n^" which is almost certainly an accidental paste; remove the
duplicates and keep a single canonical line (or a small set of distinct test
cases) instead of the repeated block, or if the repetition is intentional for
stress-testing add a top-of-file comment stating that and replace the
hard‑copied repeats with a programmatic generator (e.g., a loop or a test
fixture) so the intent is clear; look for the repeated literal "e = lim n-> inf
of (1 + 1/n)^n^" to locate and fix the issue.
In `@lib.rs`:
- Around line 65-77: The impl block for Transformers is crammed into one line
which reduces readability; reformat the impl Transformers { pub fn
new(arguments: &[Argument]) -> Result<Self, Issue> { ... } } into a multi-line,
idiomatic block: place impl Transformers on its own line, open a block, put pub
fn new(...) on a new line, and format the return Ok(Transformers { tokenizer:
Tokenizer::new(), parser: Parser::new(), solver: Solver::new(), settings:
Settings::set(arguments)?, time: Time::now() }) across multiple indented lines
so each field (tokenizer, parser, solver, settings, time) is on its own line and
the function braces are clear; keep function signature and error handling
(Settings::set(...)? ) unchanged.
- Around line 142-148: The Noise::change method's boolean parameter (shift) is
unclear at callsites like Settings::apply; replace it with a self-documenting
parameter by either renaming the parameter to increase: bool or better yet
define a small Direction enum (e.g., enum Direction {Up, Down}) and change pub
fn change(&mut self, shift: bool) to pub fn change(&mut self, dir: Direction)
(or increase: bool) and update all callsites (e.g., Settings::apply where
self.noise.change(false) appears) to use Direction::Down/Direction::Up (or
increase: false/true) so intent is explicit; keep the match logic but switch on
the enum variant (or boolean name) to preserve behavior.
- Line 179: The code currently swallows a second file/target silently
(Argument::File => if self.file.is_none() { self.file = Some(file.clone()) } and
similarly for Argument::Target), so change the branches to detect duplicates and
surface them: if self.file.is_none() set it as before, else emit a clear warning
or return an error (e.g., eprintln! or return Err with a message) indicating a
duplicate file argument; do the same for the Argument::Target branch (check
self.target/self.targets) so duplicate targets are not silently dropped. Ensure
the messages include the offending value (file or target) so users know what was
ignored.
- Around line 16-26: The nested modules tokenizer::tokenizer, parser::parser,
and solver::solver create awkward callsites; fix by either renaming the inner
modules (e.g., change tokenizer::tokenizer -> tokenizer::core, parser::parser ->
parser::core, solver::solver -> solver::core and update the corresponding mod
file names) or re-export the key types at the outer module level (add pub use
statements in the outer modules to expose tokenizer::Tokenizer, parser::Parser,
solver::Solver as tokenizer::Tokenizer, parser::Parser, solver::Solver) and
update any callsites accordingly; ensure the mod declarations (pub mod
tokenizer, pub mod parser, pub mod solver) remain correct and adjust file/module
names to match the chosen approach.
- Around line 86-127: The five pipeline functions (tokens, length, check, ast,
latex) duplicate the same read→tokenize→parse→solve→modules chain; extract a
shared helper (e.g., run_pipeline or prepare_start) that accepts &mut
Transformers and returns the intermediate results needed (at minimum: content or
tokens, and for the full pipeline a Start or whatever type the solver returns)
so each public function calls that helper instead of repeating steps; update
check, ast, and latex to call the helper and then perform their final actions
(e.g., start.modules(...) or start.latex()), keep tokens and length either
calling a lightweight helper that only performs read+tokenize or leave them
as-is but prefer a second helper for read+tokenize to avoid duplication. Ensure
the helper surfaces the same Result<..., Issue> errors and uses the existing
symbols transformers.settings, transformers.tokenizer.run,
transformers.parser.run, transformers.solver.run, and Start::modules.
- Around line 6-10: The crate currently enables unstable features in lib.rs
(try_trait_v2, default_field_values, if_let_guard) which require a nightly
compiler; add a rust-toolchain.toml pinning a minimum working nightly (or
alternatively add a clear README note) so contributors use the correct
toolchain. Update the repository by creating rust-toolchain.toml specifying the
nightly version that successfully compiles these features, or add a prominent
section in README referencing the exact nightly needed and why (point to the
feature names in lib.rs) so users won't attempt to build on stable Rust.
In `@solver/solver.rs`:
- Around line 76-84: The resolve function currently panics for any NonTerminal
pair that isn't the specific Declaration vs. Equation arm; update the resolve
method (the function named resolve on the solver struct operating on
&NonTerminal) to handle all expected pair combinations by adding explicit match
arms for pairs like (Node, Node), (Definition, Equation), etc., and change its
signature to return a Result<bool, ResolveError> (or equivalent) instead of
panicking so unexpected pairs yield an Err rather than crashing; also remove the
mutable struct field opposite and instead thread an opposite flag as a local
parameter (e.g., resolve(&mut self, first: &NonTerminal, second: &NonTerminal,
opposite: bool)) or use a local stack/recursion parameter so recursive calls
call resolve(second, first, !opposite) safely without shared mutable state.
- Around line 50-56: The current winner-selection initializes winner = &built[0]
and then only compares when both winner.1 and contender.1 match
Partition::NonTerminal, so if built[0] is a Token you may never compare and end
up "winning" a Token unintentionally; change the logic in the selection loop
(the block using built, winner, select, Partition::NonTerminal, and resolve) to
either (a) initialize winner to the first entry whose partition is NonTerminal
(scan built for the first Partition::NonTerminal) and fall back to the first
Token only if no NonTerminal exists, or (b) filter built into NonTerminals up
front and run the comparison only on that filtered list, then explicitly handle
the all-Token case afterward; also add a brief comment documenting the chosen
behavior so the intent is clear.
- Around line 38-74: The code can panic when candidates becomes empty before
candidates.pop().unwrap(); change select to return Result<Partition<'resolving>,
Issue> (e.g., fn select(...) -> Result<Partition<'resolving>, Issue>) and add a
guard after the while loop: if candidates.is_empty() { return
Err(Issue::EmptyDerivations(node.clone() or suitable context)); }. Propagate the
Result through all recursive calls to self.select (use ?), convert existing
returns like Partition::... into Ok(...), and update all call sites to handle
the Result; ensure resolve comparisons still work by matching
Ok(Partition::NonTerminal(...)) where needed or by unwrapping via ? before
comparing. This prevents the unwrap panic and surfaces an explicit error when
derivations are exhausted.
In `@syntax/level1.rs`:
- Around line 130-140: Change Use::load to propagate file-read errors instead of
returning Ok(()): when File { name: self.module.clone().into() }.read() fails,
map that error to Issue::FileNotFound (or the appropriate Issue variant) and
return Err(...) so callers get a clear diagnostic instead of silent success.
Also avoid reusing the same mutable Solver for nested module loads: before
calling start.modules(tokenizer, parser, solver, settings) create and pass a
fresh solver instance (e.g., clone or construct a new Solver based on the
current one) so nested solver.run / resolve calls cannot flip the outer
solver.opposite state and corrupt outer resolution. Ensure you update the call
site to use the new mutable solver reference when invoking start.modules.
In `@syntax/start.rs`:
- Line 39: Start::modules currently calls Use::load which re-enters
Start::modules, so add cycle detection by tracking modules being loaded (e.g., a
HashSet<String> of module paths) either on Solver or passed as an extra
parameter (e.g., &mut HashSet<String>) and have Use::load/Start::modules check
that set and return a proper Issue error if the module path is already present
to prevent recursion; also refactor the one-line impl Start::modules into a
properly formatted multi-line function for readability and to clearly handle
insertion/removal of the module path from the tracking set around the recursive
load call.
Summary by CodeRabbit
New Features
Documentation
Chores