Refactor module #7

Merged
b4prog merged 10 commits into main from refactor-module
Jun 26, 2026

Conversation

@b4prog

@b4prog b4prog commented Jun 26, 2026

Owner

Summary by CodeRabbit

  • New Features
    • Added CLI help output with version derived from the bundled Cargo.lock.
    • Enhanced argument parsing with stronger normalization and validation (including legacy flag handling).
  • Bug Fixes
    • Improved source file discovery for both explicit paths and recursive searches, with better extension filtering, ignore rules, and more consistent de-duplication.
  • Chores
    • Bumped version to 0.7.0.
    • Updated command verification to run formatting checks, tests, linting, and builds in sequence.

@coderabbitai

coderabbitai Bot commented Jun 26, 2026


Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9a9b2282-3024-4279-a0ac-17ab46ac3214

📥 Commits

Reviewing files that changed from the base of the PR and between 5b17e8e and be5325b.

📒 Files selected for processing (4)
  • src/cli/help.rs
  • src/discovery/explicit.rs
  • src/language/classification.rs
  • src/language/patterns.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/language/patterns.rs
  • src/discovery/explicit.rs
  • src/cli/help.rs
  • src/language/classification.rs

📝 Walkthrough

The CLI was split into argument parsing, help text, and version lookup modules. Source-file discovery now has explicit and recursive paths. Language duplicate-mitigation data and classification moved into language submodules, and crate-level wiring now uses the new discovery and report entry points.

Changes

Duplicate-report pipeline refactor

| Layer | File(s) | Summary |
| --- | --- | --- |
| CLI surface and release metadata | AGENTS.md, Cargo.toml, src/cli/* | CLI parsing now handles help and -h, normalizes legacy flags, validates --report-duplicate, and returns CliConfig. Help text and version lookup moved into dedicated modules. Repo instructions add cargo test, and the package version becomes 0.7.0. |
| Explicit and recursive discovery | src/discovery/* | discover_source_files now selects explicit-file or recursive discovery. Explicit discovery canonicalizes and deduplicates provided paths, and recursive discovery walks the tree with ignore rules and skipped directories. |
| Language mitigation registry | src/language/* | Language pattern data, supported extensions, and duplicate-mitigation classification move into separate language submodules that build the extension registry and evaluate line matches. |
| Crate-level report routing | src/lib.rs, src/report/mod.rs | src/report/mod.rs re-exports the detection and rendering APIs. src/lib.rs stops exporting the old duplicate and git modules and routes changed-file lookup through discovery and duplicate detection through report. |
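
The "canonicalizes and deduplicates provided paths" behavior summarized for explicit discovery can be illustrated with a minimal sketch. This is not the code in src/discovery/explicit.rs: the function name `dedup_explicit_paths` is hypothetical, and silently skipping paths that cannot be canonicalized is an assumption made only to keep the example short.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Hypothetical helper (not the PR's implementation): canonicalize each
/// provided path and keep only the first occurrence of every distinct file.
fn dedup_explicit_paths(paths: &[PathBuf]) -> Vec<PathBuf> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for path in paths {
        // Assumption: paths that cannot be canonicalized (for example,
        // missing files) are skipped rather than reported as errors.
        let Ok(canonical) = path.canonicalize() else {
            continue;
        };
        // `insert` returns false when the canonical path was already seen.
        if seen.insert(canonical.clone()) {
            unique.push(canonical);
        }
    }
    unique
}
```

Because comparison happens on canonical paths, `src/a.rs` and `src/../src/a.rs` collapse into one entry while first-seen order is preserved.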

Sequence Diagram(s)

sequenceDiagram
  participant run
  participant discovery as discovery::changed_files_against_origin
  participant report as report::detect_duplicate_blocks

  run->>discovery: current_dir
  discovery-->>run: changed files
  run->>report: processed_files
  report-->>run: duplicate blocks

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • b4prog/CodeM8#1: Refactors the same CLI, discovery, language, and crate wiring surfaces into the new module layout.
  • b4prog/CodeM8#2: Changes the same discovery and duplicate-report pipeline, including discover_source_files and report routing.
  • b4prog/CodeM8#5: Updates the language mitigation patterns and classification path that this PR moves into language submodules.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title is too generic and does not convey the main changes in the PR. | Use a more specific title that names the primary refactor, such as the CLI/discovery/language module reorganization or duplicate-detection overhaul. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 93.22%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cli/args.rs (1)

158-168: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Allow the declared --codem8-* long options through normalization.

Line 167 currently turns the long flags defined on Lines 14-30 into unknown argument errors, so standard invocations like --codem8-verbose, --codem8-git-branch, --codem8-file-extension rs, and --codem8-files path never reach Clap.

Suggested fix
 fn normalized_clap_arg(arg: String) -> Result<String> {
     if arg == "-verbose" {
         Ok("--codem8-verbose".to_owned())
     } else if arg == "-git-branch" {
         Ok("--codem8-git-branch".to_owned())
+    } else if matches!(
+        arg.as_str(),
+        "--report-duplicate"
+            | "--codem8-verbose"
+            | "--codem8-git-branch"
+            | "--codem8-file-extension"
+            | "--codem8-files"
+    ) || arg.starts_with("--codem8-file-extension=")
+        || arg.starts_with("--codem8-files=")
+    {
+        Ok(arg)
     } else if let Some(value) = arg.strip_prefix("-file-extension=") {
         Ok(format!("--codem8-file-extension={value}"))
     } else if let Some(value) = arg.strip_prefix("-files=") {
         Ok(format!("--codem8-files={value}"))
     } else if arg.starts_with("--") && arg != "--report-duplicate" {
         Err(CodeM8Error::new(format!("unknown argument: {arg}")))
     } else {
         Ok(arg)
     }
 }
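
To check the intended behavior outside the crate, the suggested fix can be restated as a standalone sketch. Assumptions: `String` stands in for `CodeM8Error`, the `DECLARED_LONG_FLAGS` constant is a hypothetical name, and the list of declared flags is copied from the diff above rather than from the Clap definitions themselves.

```rust
/// Long options that the diff above lets through unchanged.
/// The constant name is hypothetical; the values come from the suggested fix.
const DECLARED_LONG_FLAGS: [&str; 5] = [
    "--report-duplicate",
    "--codem8-verbose",
    "--codem8-git-branch",
    "--codem8-file-extension",
    "--codem8-files",
];

/// Standalone restatement of the suggested fix, with `String` as the error
/// type instead of the crate's `CodeM8Error`.
fn normalized_clap_arg(arg: String) -> Result<String, String> {
    if arg == "-verbose" {
        Ok("--codem8-verbose".to_owned())
    } else if arg == "-git-branch" {
        Ok("--codem8-git-branch".to_owned())
    } else if let Some(value) = arg.strip_prefix("-file-extension=") {
        Ok(format!("--codem8-file-extension={value}"))
    } else if let Some(value) = arg.strip_prefix("-files=") {
        Ok(format!("--codem8-files={value}"))
    } else if DECLARED_LONG_FLAGS.contains(&arg.as_str())
        || arg.starts_with("--codem8-file-extension=")
        || arg.starts_with("--codem8-files=")
    {
        // Declared long options reach Clap untouched.
        Ok(arg)
    } else if arg.starts_with("--") {
        // Any other long option is still rejected.
        Err(format!("unknown argument: {arg}"))
    } else {
        // Positional values such as `rs` or a file path pass through.
        Ok(arg)
    }
}
```

With this shape, `--codem8-verbose` and `--codem8-files=path` pass through, legacy `-files=path` is rewritten to `--codem8-files=path`, and an undeclared flag such as `--bogus` still produces an "unknown argument" error.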
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/args.rs` around lines 158 - 168, The normalized_clap_arg helper is
rejecting valid declared long options by treating every "--" argument except
"--report-duplicate" as unknown. Update normalized_clap_arg so the existing
Clap-defined "--codem8-*" flags are allowed through unchanged (or normalized
consistently), including the options handled by Args/Clap such as verbose,
git-branch, file-extension, and files, while still rejecting truly unknown long
arguments.
🧹 Nitpick comments (1)
src/cli/version.rs (1)

1-13: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Use Cargo package metadata as the version source.

help_text() renders this helper directly, so the CLI banner now depends on Cargo.lock staying in sync with Cargo.toml. That adds a second source of truth for release metadata and can show the wrong version after a future bump. env!("CARGO_PKG_VERSION") is tied to the compiled crate and removes the lockfile parser entirely.
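
A minimal sketch of a compile-time version lookup follows. The function name `codem8_version` is hypothetical. The sketch uses `option_env!` with a fallback so that it also compiles outside a Cargo build; inside the crate, `env!("CARGO_PKG_VERSION")` is the direct form and needs no fallback, because Cargo always sets that variable when compiling a package.

```rust
/// Hypothetical replacement for the lockfile-based helper: the version is
/// embedded at compile time, so there is nothing to parse at runtime.
fn codem8_version() -> &'static str {
    // Cargo sets CARGO_PKG_VERSION from Cargo.toml while building the crate.
    // The "unknown" fallback only applies when compiling without Cargo.
    option_env!("CARGO_PKG_VERSION").unwrap_or("unknown")
}
```

Since the value comes from the manifest of the crate being built, the banner cannot drift from Cargo.toml the way a parsed Cargo.lock entry can.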

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/version.rs` around lines 1 - 13, The version helper in
codem8_version_from_cargo_lock should stop reading from Cargo.lock and use Cargo
package metadata instead. Replace the lockfile-based lookup used by help_text()
with the compiled crate version source so the CLI banner always matches the
built package, and remove the CargoLockPackage/cargo_lock_packages parsing path
if it is no longer needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cli/help.rs`:
- Around line 6-12: Update the help documentation in the help output so the
supported `-h` entrypoint is advertised alongside `help`. In `parse_command` and
the help text emitted from `src/cli/help.rs`, make sure the usage and command
list clearly show both invocations, keeping the existing `help` entry and adding
`-h` as an equivalent supported command so it is discoverable.

In `@src/discovery/explicit.rs`:
- Around line 56-60: The `display_path` logic in `explicit.rs` is relativizing
`canonical_path` against an unnormalized `current_dir`, so `strip_prefix` can
fail for symlinked or non-canonical workspaces and leak absolute paths.
Canonicalize or otherwise normalize `current_dir` before the
`canonical_path.strip_prefix(...)` check inside the `display_path` computation,
so `normalize_display_path` consistently returns stable relative paths for
`src/line.rs` and similar inputs.
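
The normalization described for `display_path` could look roughly like the sketch below. It is an illustration under assumptions, not the code in `explicit.rs`: the signature is hypothetical, and falling back to the raw `current_dir` (and to the absolute path) on failure is a choice made for the example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical sketch: relativize an already-canonical path against a
/// canonicalized working directory so `strip_prefix` compares like with like.
fn display_path(canonical_path: &Path, current_dir: &Path) -> PathBuf {
    // Resolve symlinks and `..` segments in the base directory as well.
    // Assumption: if canonicalization fails, use the directory as given.
    let base = current_dir
        .canonicalize()
        .unwrap_or_else(|_| current_dir.to_path_buf());
    canonical_path
        .strip_prefix(&base)
        .map(Path::to_path_buf)
        // Assumption: paths outside the workspace keep their absolute form.
        .unwrap_or_else(|_| canonical_path.to_path_buf())
}
```

With both sides canonical, a working directory reached through a symlink or written with `..` segments still yields a relative result such as `src/line.rs`.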

In `@src/language/patterns.rs`:
- Around line 141-143: The Rust mitigation regex in the pattern set is too broad
because the optional separator in the expression also matches struct-literal
field labels like field: value,. Tighten the pattern used in the Rust rule so it
only matches path or enum-fragment forms with :: and does not allow a lone
colon; update the regex in the Rust patterns entry that is consumed by
classification::register_rust_rules so the comment about path/enum fragments
remains accurate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 822da5ec-b765-4594-a949-09da04fe84fa

📥 Commits

Reviewing files that changed from the base of the PR and between 6ea7567 and 5b17e8e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (20)
  • AGENTS.md
  • Cargo.toml
  • src/cli/args.rs
  • src/cli/help.rs
  • src/cli/mod.rs
  • src/cli/version.rs
  • src/discovery.rs
  • src/discovery/explicit.rs
  • src/discovery/git.rs
  • src/discovery/mod.rs
  • src/discovery/recursive.rs
  • src/language.rs
  • src/language/classification.rs
  • src/language/mod.rs
  • src/language/patterns.rs
  • src/language/registry.rs
  • src/lib.rs
  • src/report/duplicate_detection.rs
  • src/report/duplicate_renderer.rs
  • src/report/mod.rs
💤 Files with no reviewable changes (2)
  • src/discovery.rs
  • src/language.rs
📜 Review details
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: Before finishing a change, run `cargo fmt --all -- --check` from the workspace root and fix any formatting issues before handing work back.
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: Before finishing a change, run `cargo test` from the workspace root and fix any test failures before handing work back.
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: Before finishing a change, run `cargo clippy --workspace --all-targets --all-features -- -D warnings -W clippy::too_many_lines -W clippy::too_many_arguments -W clippy::type_complexity -W clippy::excessive_nesting -W clippy::cognitive_complexity -W clippy::pedantic -W clippy::nursery -W clippy::cargo` from the workspace root and fix any Clippy issues before handing work back.
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: Before finishing a change, run `cargo build --locked --all-targets` from the workspace root and fix any build issues before handing work back.
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: Treat Clippy warnings as errors for generated or edited code.
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: Prefer changes that satisfy the repository `clippy.toml` configuration without adding `#[allow(...)]` attributes unless a maintainer explicitly asks for them.
Learnt from: CR
Repo: b4prog/CodeM8

Timestamp: 2026-06-26T14:42:45.825Z
Learning: If a required command cannot be run in the current environment, call that out clearly in the handoff.
🔇 Additional comments (5)
src/cli/help.rs (1)

50-89: LGTM!

src/discovery/mod.rs (1)

19-41: LGTM!

src/discovery/recursive.rs (1)

25-161: LGTM!

src/report/mod.rs (1)

1-5: LGTM!

src/lib.rs (1)

35-50: LGTM!

Comment thread src/cli/help.rs
Comment thread src/discovery/explicit.rs
Comment thread src/language/patterns.rs
@b4prog

b4prog commented Jun 26, 2026

Owner Author

@CodeRabbit resume

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

✅ Action performed

Reviews resumed.

@b4prog b4prog merged commit 904b59d into main Jun 26, 2026
3 checks passed
@b4prog b4prog deleted the refactor-module branch June 26, 2026 16:27
This was referenced Jun 27, 2026