Skip to content

Version 8: Extract core pipeline into library#39

Open
alejandro-vaz wants to merge 2 commits intomainfrom
development
Open

Version 8: Extract core pipeline into library#39
alejandro-vaz wants to merge 2 commits intomainfrom
development

Conversation

@alejandro-vaz
Copy link
Owner

@alejandro-vaz alejandro-vaz commented Feb 15, 2026

Summary by CodeRabbit

  • New Features

    • Redesigned semantic solver for improved expression evaluation
    • Module import capability from filesystem (.msm extension required)
  • Documentation

    • Added README documentation to package
  • Chores

    • Version updated to 8.0.0

@alejandro-vaz alejandro-vaz self-assigned this Feb 15, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 15, 2026

Walkthrough

Version 8 refactors the project by extracting core pipeline functionality from main.rs into a new lib.rs, creating a mathsys library. Module imports now require .msm extensions. Numerous syntax structs expose previously private fields for public access. The solver adopts a new selection-based derivation pruning strategy, and module loading gains filesystem integration capabilities.

Changes

Cohort / File(s) Summary
Version & Project Structure
.github/CHANGELOG.md, Cargo.toml, .gitignore
Version bumped to 8.0.0 with new v8 changelog section documenting semantic solver changes. Cargo.toml now declares library with name "mathsys" and path "lib.rs", adds readme reference. Gitignore pattern expanded from single file to glob pattern for test MSM files.
Library Extraction
lib.rs, main.rs
Transformers struct, Noise enum, Settings struct, and pipeline functions (help, version, tokens, length, check, ast, latex) moved from main.rs to new lib.rs public API. main.rs now delegates to mathsys crate imports, reducing internal surface from ~200 lines of definitions to ~14 lines of re-exports.
Solver Refactoring
solver/solver.rs, solver/nonterminal.rs
Solver struct gains mutable opposite: bool field; run() changes to accept &mut self. New select() and resolve() methods implement selection-based derivation pruning with operand toggling. Object::score() method removed from nonterminal.rs.
Module Loading Feature
syntax/level1.rs, tokenizer/tokenizer.rs
Use struct gains public load() method for filesystem-based module import with integrated tokenization, parsing, and solving. Fields module and start exposed publicly. MODULE token pattern tightened to require .msm extension (^"[a-z]+\.msm").
Public Field Exposure
syntax/level2.rs, syntax/level3.rs, syntax/level4.rs, syntax/level5.rs, syntax/start.rs
Multiple syntax structs expose previously private fields: Expression.terms, Term (numerator, denominator), Factor (value, exponent), Limit (all 5 fields), Variable.name, Nest.value, Tensor.values, Whole.number, Absolute.value, Rational.number, Start.stream. New Start::modules() method orchestrates module loading.
Test File Updates
file.msm
Content replaced with repeated definitions of e as limit expression; legacy definitions (x, a, pi, L) removed. Demonstrates updated MSM format.

Sequence Diagram(s)

sequenceDiagram
    participant Use as Use (Level1)
    participant File as File System
    participant Tokenizer
    participant Parser
    participant Solver
    participant Settings

    Use->>File: load module by name
    File-->>Use: file content
    Use->>Tokenizer: tokenize content
    Tokenizer-->>Use: tokens
    Use->>Parser: parse tokens
    Parser-->>Use: AST (Start)
    Use->>Solver: solve AST
    Solver-->>Use: resolved Start
    Use->>Use: update self.start with result
Loading

Poem

🚀 Big refactor energy! You've carved out a proper library,

Pulled the guts from main and made them legendary,

Module imports now require their .msm identity—no shortcuts, no messiness,

Exposed those hidden fields for everyone to see,

The solver's got new smarts; v8 hits different. Nice work. 🎯

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main
Title Check ✅ Passed Title check skipped as CodeRabbit has written the PR title.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot changed the title @coderabbitai Version 8: Extract core pipeline into library Feb 15, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
main.rs (2)

42-51: ⚠️ Potential issue | 🟠 Major

Index-based target dispatch is a silent breakage hazard.

If anyone reorders or inserts an entry in TARGETS, every single index here silently points to the wrong command. You're one TARGETS edit away from help running latex. Use the string names directly instead of TARGETS[i].0.

🛡️ Proposed fix — match on string literals directly
-    match &transformers.settings.target.clone().unwrap_or(Target {name: String::from("help")}).name as &str {
-        target if target == TARGETS[0].0 => help(&mut transformers)?,
-        target if target == TARGETS[1].0 => println!("Running Mathsys v{} on {}/{}/{}", version(&mut transformers)?, OS, ARCH, rustcv()),
-        target if target == TARGETS[2].0 => println!("{:#?}", tokens(&mut transformers)?),
-        target if target == TARGETS[3].0 => {let len = length(&mut transformers)?; println!("Token length: {len} / {MAXLEN} ({}%)", len as f32 / MAXLEN as f32 * 100.0)},
-        target if target == TARGETS[4].0 => println!("{}", {check(&mut transformers)?; "Valid"}),
-        target if target == TARGETS[5].0 => println!("{:#?}", ast(&mut transformers)?),
-        target if target == TARGETS[6].0 => println!("{}", latex(&mut transformers)?),
-        other => return Err(Issue::UnknownTarget(other.to_string()))
+    match &transformers.settings.target.clone().unwrap_or(Target {name: String::from("help")}).name as &str {
+        "help" => help(&mut transformers)?,
+        "version" => println!("Running Mathsys v{} on {}/{}/{}", version(&mut transformers)?, OS, ARCH, rustcv()),
+        "tokens" => println!("{:#?}", tokens(&mut transformers)?),
+        "length" => {let len = length(&mut transformers)?; println!("Token length: {len} / {MAXLEN} ({}%)", len as f32 / MAXLEN as f32 * 100.0)},
+        "check" => println!("{}", {check(&mut transformers)?; "Valid"}),
+        "ast" => println!("{:#?}", ast(&mut transformers)?),
+        "latex" => println!("{}", latex(&mut transformers)?),
+        other => return Err(Issue::UnknownTarget(other.to_string()))
     };

29-29: 🧹 Nitpick | 🔵 Trivial

File extension check via split(".") is naive — handle dotless filenames and paths with dots.

file.split(".").last().unwrap() works for foo.msm but consider a path like my.project/input — it'd match input as the extension (correctly not matching msm), but the intent is murky. Use std::path::Path::extension() for robust extension checking.

♻️ Proposed fix
-        file if file.split(".").last().unwrap() == "msm" => Argument::File(File {
+        file if std::path::Path::new(file.as_str()).extension().is_some_and(|ext| ext == "msm") => Argument::File(File {
syntax/level2.rs (1)

43-43: 🧹 Nitpick | 🔵 Trivial

Bare panic!() with no message — you're making future-you's life harder.

When this inevitably blows up on unexpected input, you'll get a panic with zero context. At minimum, include the offending item in the message so you can actually debug it.

🔧 Proposed fix
-        other => panic!()
+        other => panic!("Expression::summon: unexpected item {other:?}")
🤖 Fix all issues with AI agents
In @.github/CHANGELOG.md:
- Around line 210-222: Add a blank line before and after each level-3 heading
(e.g., "### Added", "### Changed", "### Removed", "### Fixed") in the "v8 --
Semantic solver" section (and the section title "v8 -- Semantic solver" itself)
so they match the v7 formatting, and ensure the file ends with a single trailing
newline; update the changelog text around those headings (including the "1. Full
issue coverage now." and list blocks) to preserve markdown paragraph separation
and add the final newline.

In @.gitignore:
- Around line 14-15: The current gitignore entry '*.msm' will ignore all .msm
files globally; change this to a more scoped pattern (for example '/*.msm' to
only ignore root-level .msm files) or move test .msm files into a dedicated
ignored directory and update the .gitignore accordingly so you don't
accidentally hide future module/example .msm files; update the '*.msm' entry in
the .gitignore to the chosen scoped pattern or directory rule.

In `@file.msm`:
- Around line 1-150: The file contains 150 identical repeated lines "e = lim n->
inf of (1 + 1/n)^n^" which is almost certainly an accidental paste; remove the
duplicates and keep a single canonical line (or a small set of distinct test
cases) instead of the repeated block, or if the repetition is intentional for
stress-testing add a top-of-file comment stating that and replace the
hard‑copied repeats with a programmatic generator (e.g., a loop or a test
fixture) so the intent is clear; look for the repeated literal "e = lim n-> inf
of (1 + 1/n)^n^" to locate and fix the issue.

In `@lib.rs`:
- Around line 65-77: The impl block for Transformers is crammed into one line
which reduces readability; reformat the impl Transformers { pub fn
new(arguments: &[Argument]) -> Result<Self, Issue> { ... } } into a multi-line,
idiomatic block: place impl Transformers on its own line, open a block, put pub
fn new(...) on a new line, and format the return Ok(Transformers { tokenizer:
Tokenizer::new(), parser: Parser::new(), solver: Solver::new(), settings:
Settings::set(arguments)?, time: Time::now() }) across multiple indented lines
so each field (tokenizer, parser, solver, settings, time) is on its own line and
the function braces are clear; keep function signature and error handling
(Settings::set(...)? ) unchanged.
- Around line 142-148: The Noise::change method's boolean parameter (shift) is
unclear at callsites like Settings::apply; replace it with a self-documenting
parameter by either renaming the parameter to increase: bool or better yet
define a small Direction enum (e.g., enum Direction {Up, Down}) and change pub
fn change(&mut self, shift: bool) to pub fn change(&mut self, dir: Direction)
(or increase: bool) and update all callsites (e.g., Settings::apply where
self.noise.change(false) appears) to use Direction::Down/Direction::Up (or
increase: false/true) so intent is explicit; keep the match logic but switch on
the enum variant (or boolean name) to preserve behavior.
- Line 179: The code currently swallows a second file/target silently
(Argument::File => if self.file.is_none() { self.file = Some(file.clone()) } and
similarly for Argument::Target), so change the branches to detect duplicates and
surface them: if self.file.is_none() set it as before, else emit a clear warning
or return an error (e.g., eprintln! or return Err with a message) indicating a
duplicate file argument; do the same for the Argument::Target branch (check
self.target/self.targets) so duplicate targets are not silently dropped. Ensure
the messages include the offending value (file or target) so users know what was
ignored.
- Around line 16-26: The nested modules tokenizer::tokenizer, parser::parser,
and solver::solver create awkward callsites; fix by either renaming the inner
modules (e.g., change tokenizer::tokenizer -> tokenizer::core, parser::parser ->
parser::core, solver::solver -> solver::core and update the corresponding mod
file names) or re-export the key types at the outer module level (add pub use
statements in the outer modules to expose tokenizer::Tokenizer, parser::Parser,
solver::Solver as tokenizer::Tokenizer, parser::Parser, solver::Solver) and
update any callsites accordingly; ensure the mod declarations (pub mod
tokenizer, pub mod parser, pub mod solver) remain correct and adjust file/module
names to match the chosen approach.
- Around line 86-127: The five pipeline functions (tokens, length, check, ast,
latex) duplicate the same read→tokenize→parse→solve→modules chain; extract a
shared helper (e.g., run_pipeline or prepare_start) that accepts &mut
Transformers and returns the intermediate results needed (at minimum: content or
tokens, and for the full pipeline a Start or whatever type the solver returns)
so each public function calls that helper instead of repeating steps; update
check, ast, and latex to call the helper and then perform their final actions
(e.g., start.modules(...) or start.latex()), keep tokens and length either
calling a lightweight helper that only performs read+tokenize or leave them
as-is but prefer a second helper for read+tokenize to avoid duplication. Ensure
the helper surfaces the same Result<..., Issue> errors and uses the existing
symbols transformers.settings, transformers.tokenizer.run,
transformers.parser.run, transformers.solver.run, and Start::modules.
- Around line 6-10: The crate currently enables unstable features in lib.rs
(try_trait_v2, default_field_values, if_let_guard) which require a nightly
compiler; add a rust-toolchain.toml pinning a minimum working nightly (or
alternatively add a clear README note) so contributors use the correct
toolchain. Update the repository by creating rust-toolchain.toml specifying the
nightly version that successfully compiles these features, or add a prominent
section in README referencing the exact nightly needed and why (point to the
feature names in lib.rs) so users won't attempt to build on stable Rust.

In `@solver/solver.rs`:
- Around line 76-84: The resolve function currently panics for any NonTerminal
pair that isn't the specific Declaration vs. Equation arm; update the resolve
method (the function named resolve on the solver struct operating on
&NonTerminal) to handle all expected pair combinations by adding explicit match
arms for pairs like (Node, Node), (Definition, Equation), etc., and change its
signature to return a Result<bool, ResolveError> (or equivalent) instead of
panicking so unexpected pairs yield an Err rather than crashing; also remove the
mutable struct field opposite and instead thread an opposite flag as a local
parameter (e.g., resolve(&mut self, first: &NonTerminal, second: &NonTerminal,
opposite: bool)) or use a local stack/recursion parameter so recursive calls
call resolve(second, first, !opposite) safely without shared mutable state.
- Around line 50-56: The current winner-selection initializes winner = &built[0]
and then only compares when both winner.1 and contender.1 match
Partition::NonTerminal, so if built[0] is a Token you may never compare and end
up "winning" a Token unintentionally; change the logic in the selection loop
(the block using built, winner, select, Partition::NonTerminal, and resolve) to
either (a) initialize winner to the first entry whose partition is NonTerminal
(scan built for the first Partition::NonTerminal) and fall back to the first
Token only if no NonTerminal exists, or (b) filter built into NonTerminals up
front and run the comparison only on that filtered list, then explicitly handle
the all-Token case afterward; also add a brief comment documenting the chosen
behavior so the intent is clear.
- Around line 38-74: The code can panic when candidates becomes empty before
candidates.pop().unwrap(); change select to return Result<Partition<'resolving>,
Issue> (e.g., fn select(...) -> Result<Partition<'resolving>, Issue>) and add a
guard after the while loop: if candidates.is_empty() { return
Err(Issue::EmptyDerivations(node.clone() or suitable context)); }. Propagate the
Result through all recursive calls to self.select (use ?), convert existing
returns like Partition::... into Ok(...), and update all call sites to handle
the Result; ensure resolve comparisons still work by matching
Ok(Partition::NonTerminal(...)) where needed or by unwrapping via ? before
comparing. This prevents the unwrap panic and surfaces an explicit error when
derivations are exhausted.

In `@syntax/level1.rs`:
- Around line 130-140: Change Use::load to propagate file-read errors instead of
returning Ok(()): when File { name: self.module.clone().into() }.read() fails,
map that error to Issue::FileNotFound (or the appropriate Issue variant) and
return Err(...) so callers get a clear diagnostic instead of silent success.
Also avoid reusing the same mutable Solver for nested module loads: before
calling start.modules(tokenizer, parser, solver, settings) create and pass a
fresh solver instance (e.g., clone or construct a new Solver based on the
current one) so nested solver.run / resolve calls cannot flip the outer
solver.opposite state and corrupt outer resolution. Ensure you update the call
site to use the new mutable solver reference when invoking start.modules.

In `@syntax/start.rs`:
- Line 39: Start::modules currently calls Use::load which re-enters
Start::modules, so add cycle detection by tracking modules being loaded (e.g., a
HashSet<String> of module paths) either on Solver or passed as an extra
parameter (e.g., &mut HashSet<String>) and have Use::load/Start::modules check
that set and return a proper Issue error if the module path is already present
to prevent recursion; also refactor the one-line impl Start::modules into a
properly formatted multi-line function for readability and to clearly handle
insertion/removal of the module path from the tracking set around the recursive
load call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant