reDUP

Code duplication analyzer and refactoring planner for LLMs.

AI Cost Tracking

🤖 LLM usage: $7.5000 (65 commits)
👤 Human dev: ~$1688 (16.9h @ $100/h, 30min dedup)

Generated on 2026-04-16 using openrouter/qwen/qwen3-coder-next

reDUP scans codebases for duplicated functions, blocks, and structural patterns — then builds a prioritized refactoring map that LLMs can consume to eliminate redundancy systematically.

Features

Exact duplicate detection via SHA-256 block hashing
Structural clone detection — same AST shape, different variable names
LSH near-duplicate detection for large code blocks (>50 lines)
Multi-language support — 35+ languages via tree-sitter (Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, C#, Ruby, PHP, Bash, SQL, HTML, CSS, Lua, Scala, Kotlin, Swift, Objective-C, JSON, YAML, TOML, XML, Markdown, GraphQL, Dockerfile, Makefile, Nginx, Vim, Svelte, Vue, and more)
Parallel scanning for large projects (2x+ performance improvement)
Fuzzy near-duplicate matching via SequenceMatcher / rapidfuzz
Function-level analysis using Python AST and tree-sitter extraction
Impact scoring — prioritizes duplicates by saved_lines × similarity
Refactoring planner — generates concrete extract/inline suggestions
Multiple output formats: JSON, YAML, TOON, Markdown
Configuration system — TOML files and environment variables
CLI commands: scan, compare, diff, check, config, info
Cross-project comparison — detect shared code between projects with merge/extract recommendations
CI integration with configurable quality gates
Clean output — no syntax warnings from external libraries

New Features (v0.4.20)

🤖 MCP Server

Full MCP (Model Context Protocol) server for AI assistant integration:

# Start MCP server
redup-mcp

# Or HTTP mode
redup-mcp --transport http --port 8000

Available Tools:

analyze_project — Full duplication analysis
find_duplicates — Quick duplicate detection
check_project — Quality gate check
compare_projects — Cross-project comparison
suggest_refactoring — AI-powered refactoring suggestions
project_info — Project metadata

🌐 Universal Fuzzy Similarity Detection

Cross-language duplicate detection across all 35+ supported languages:

# Detect similar code across languages
redup scan . --fuzzy --fuzzy-threshold 0.65

Cross-Language Matching:

JavaScript ↔ Python functions: ~65% similarity
Docker ↔ YAML configs: ~40% similarity
Auth patterns across languages: ~70% similarity

Supported Patterns:

Functions, classes, API endpoints
Database queries, web components
Auth/validation, error handling, logging
Configuration, infrastructure code

🌳 Modular Tree-Sitter Extractor

Refactored tree-sitter extraction with clean, modular architecture:

ts_extractor/
├── extractors/          # Modular per-language extractors
│   ├── c_family.py      # C, C++, C#, Objective-C
│   ├── go.py            # Go
│   ├── java.py          # Java, Scala, Kotlin
│   ├── markup.py        # HTML, XML, Svelte, Vue
│   ├── web.py           # JavaScript, TypeScript
│   └── ...
├── dispatcher.py        # Smart language routing
├── config.py            # Language registry
└── main.py              # Unified API

Benefits:

Easier to add new languages
Better testability
Cleaner separation of concerns
35+ languages supported

New Features (v0.5.0+)

🌐 Universal Fuzzy Similarity Detection

Cross-language fuzzy matching for detecting similar code patterns across all 35+ supported languages:

# Detect similar patterns across different languages
redup scan . --fuzzy --ext .py,.js,.ts

# Cross-project comparison with fuzzy matching
redup compare ./project-a ./project-b --fuzzy --threshold 0.65

Features:

Detects similar functions, API endpoints, validation logic across languages (e.g., JS ↔ Python)
Pattern recognition: authentication, error handling, database queries, web components
Language-agnostic signature generation with identifier normalization
Complexity scoring (0.0-1.0) for each detected pattern

Example patterns detected:

Express.js route handler ↔ Flask endpoint (70% similarity)
Docker Compose service ↔ Kubernetes deployment (40% similarity)
Auth middleware patterns across frameworks

🧩 Modular ts_extractor Architecture

The tree-sitter multi-language extractor has been refactored from a 782-line god module into a clean package:

redup/core/ts_extractor/
├── extractors/
│   ├── web.py        # JavaScript/TypeScript
│   ├── c_family.py   # C/C++
│   ├── dotnet.py     # C#
│   ├── ruby.py       # Ruby
│   ├── php.py        # PHP
│   └── ...           # 10+ language-specific modules

Benefits:

Better maintainability (avg 100 lines per module vs 782)
Easier to add new language extractors
Shared base utilities for common operations
Full backward compatibility maintained

🎯 Enriched TOON Reporter

The TOON format now includes actionable sections for practical refactoring:

HOTSPOTS — Top 7 files with most duplicated lines (where to focus effort)
QUICK_WINS — Low-risk, high-savings suggestions (do first)
DEPENDENCY_RISK — Duplicates spanning multiple packages (cross-module risk)
EFFORT_ESTIMATE — Time estimates per task with difficulty (easy/medium/hard)

🤖 LLM-Powered Refactoring Plans

Generate AI-assisted refactoring TODO lists from cross-project comparisons:

redup compare ./project-a ./project-b --refactor-plan --env .env --output report.json

Uses litellm for flexible LLM provider support
Compact metadata-only prompts for efficiency
Structured JSON output with prioritized tasks
Token usage tracking

📊 Simplified Compare Reports

Cross-project comparison reports are now more compact and human-readable:

Relative file paths instead of absolute
Matches deduplicated by function pair
Communities with compact member dicts
Filtered trivial entries to reduce noise
~60% smaller JSON size

Installation

pip install redup

With optional dependencies:

pip install redup[all]       # Everything
pip install redup[fuzzy]     # rapidfuzz for better similarity matching
pip install redup[ast]       # tree-sitter for multi-language AST
pip install redup[lsh]       # datasketch for LSH near-duplicate detection
pip install redup[compare]   # networkx for cross-project community detection
pip install redup[llm]       # litellm for LLM-powered refactoring plans

Quick Start

CLI

# Scan current directory, output TOON to stdout
redup scan .

# Scan with JSON output saved to file
redup scan ./src --format json --output ./reports/

# Parallel scanning for large projects
redup scan . --parallel --max-workers 4

# Multi-language scanning with 35+ supported languages
redup scan . --ext ".py,.js,.ts,.go,.rs,.java,.rb,.php,.html,.css,.sql,.lua,.scala,.kt,.swift,.m,.json,.yaml,.toml,.xml,.md,.graphql,.dockerfile,.svelte,.vue"

# CI gate with thresholds
redup check . --max-groups 10 --max-lines 100

# Compare two scans
redup diff before.json after.json

# Cross-project comparison (merge vs extract decision)
redup compare ./project-a ./project-b --threshold 0.75

# With LLM-powered refactoring plan (requires litellm + .env with API keys)
redup compare ./project-a ./project-b --refactor-plan --env .env --output comparison.json

# Specify custom LLM model
redup compare ./project-a ./project-b --refactor-plan --llm-model openrouter/anthropic/claude-3.5-sonnet

# Initialize configuration
redup config --init

# Scan with all formats
redup scan . --format all --output ./redup_output/

# Only function-level duplicates (faster)
redup scan . --functions-only

# Custom thresholds
redup scan . --min-lines 5 --min-sim 0.9

# Show installed optional dependencies
redup info

# Export duplications as tasks to TODO.md (requires: pip install redup[tasks])
redup tasks ./my-project

# Export with GitHub sync
redup tasks ./my-project --backend github --milestone "Sprint 1"

# Export with GitLab sync and custom output
redup tasks ./my-project -b gitlab -o refactoring-tasks.md

# Preview tasks without creating files
redup tasks ./my-project --dry-run

Task Management with Planfile (Optional)

When you install redup[tasks], you can export duplication findings as actionable tasks in TODO.md format with synchronization to GitHub, GitLab, or Jira:

# Install with planfile support
pip install redup[tasks]

# Generate TODO.md from duplications
redup tasks ./my-project --output TODO.md

# The generated TODO.md includes:
# - Priority-based task organization (critical/major/minor)
# - Difficulty estimation (easy/medium/hard)
# - Line savings potential
# - Detailed refactoring suggestions
# - Planfile export configuration

Example TODO.md output:

# TODO - Duplication Refactoring Tasks

## CRITICAL (3 tasks)
- [ ] **Refactor: process_file (4x duplication)** 🔴
   Priority: critical | Savings: 124L
   <details>
   Extract function to shared utility module.
   Files: src/core/scanner.py, src/core/planner.py, ...
   </details>

## MAJOR (5 tasks)
- [ ] **Refactor: validate_input (3x duplication)** 🟡
   Priority: major | Savings: 45L
   ...

Configuration

Create a redup.toml file:

[scan]
extensions = ".py,.js,.ts,.go,.rs,.java,.rb,.php,.html,.css,.sql,.lua,.scala,.kt,.swift,.m,.json,.yaml,.toml,.xml,.md,.graphql,.dockerfile,.svelte,.vue"
min_lines = 3
min_similarity = 0.85
include_tests = false

[lsh]
enabled = true
min_lines = 50
threshold = 0.8

[check]
max_groups = 10
max_lines = 100

[output]
format = "toon"
output = "redup_output"

[reporting]
include_snippets = true
generate_suggestions = true

Or use [tool.redup] in pyproject.toml. Environment variables with REDUP_ prefix override file settings.

Python API

from pathlib import Path
from redup import ScanConfig, analyze
from redup.reporters.toon_reporter import to_toon
from redup.reporters.json_reporter import to_json

config = ScanConfig(
    root=Path("./my_project"),
    extensions=[".py", ".js", ".ts", ".go", ".rs", ".java", ".rb", ".php", ".html", ".css"],
    min_block_lines=3,
    min_similarity=0.85,
)

result = analyze(config=config, function_level_only=True)

print(f"Found {result.total_groups} duplicate groups")
print(f"Lines recoverable: {result.total_saved_lines}")

# For LLM consumption
print(to_toon(result))

# For tooling / CI
Path("duplication.json").write_text(to_json(result))

Output Formats

TOON (LLM-optimized)

# redup/duplication | 15 groups | 86f 10453L | 2026-04-16

SUMMARY:
  files_scanned: 86
  total_lines:   10453
  dup_groups:    15
  dup_fragments: 36
  saved_lines:   217
  scan_ms:       3620

HOTSPOTS[7] (files with most duplication):
  src/redup/core/ts_extractor.py  dup=74L  groups=4  frags=11  (0.7%)
  src/redup/core/scanner_utils.py  dup=70L  groups=3  frags=3  (0.7%)
  src/redup/core/scanner_loader.py  dup=52L  groups=1  frags=1  (0.5%)

DUPLICATES[15] (ranked by impact):
  [E0001] ! EXAC  _preload_files  L=52 N=2 saved=52 sim=1.00
      src/redup/core/scanner_loader.py:9-60  (_preload_files)
      src/redup/core/scanner_utils.py:53-104  (_preload_files)

REFACTOR[15] (ranked by priority):
  [1] ◐ extract_module     → src/redup/core/utils/_preload_files.py
      WHY: 2 occurrences of 52-line block across 2 files — saves 52 lines
      FILES: src/redup/core/scanner_loader.py, src/redup/core/scanner_utils.py

QUICK_WINS[8] (low risk, high savings — do first):
  [3] extract_function   saved=26L  → src/redup/core/utils/find_exact_duplicates_lazy.py
      FILES: lazy_grouper.py
  [4] extract_function   saved=21L  → src/redup/core/utils/_extract_functions_go.py
      FILES: ts_extractor.py

DEPENDENCY_RISK[3] (duplicates spanning multiple packages):
  validate_input  packages=2  files=2
      api/routes/users.py
      services/auth/validate.py

EFFORT_ESTIMATE (total ≈ 8.7h):
  hard   _preload_files                      saved=52L  ~156min
  hard   __init__                            saved=36L  ~108min
  medium find_exact_duplicates_lazy          saved=26L  ~52min
  easy   _is_test_file                       saved=12L  ~24min

METRICS-TARGET:
  dup_groups:  15 → 0
  saved_lines: 217 lines recoverable

JSON (machine-readable)

{
  "summary": {
    "total_groups": 3,
    "total_saved_lines": 84
  },
  "groups": [
    {
      "id": "E0001",
      "type": "exact",
      "normalized_name": "calculate_tax",
      "fragments": [
        {"file": "billing.py", "line_start": 1, "line_end": 8},
        {"file": "shipping.py", "line_start": 1, "line_end": 8}
      ],
      "saved_lines_potential": 16
    }
  ],
  "refactor_suggestions": [
    {
      "priority": 1,
      "action": "extract_function",
      "new_module": "utils/calculate_tax.py",
      "risk_level": "low"
    }
  ]
}

Cross-Project Comparison

The redup compare command analyzes two separate projects to detect shared code and recommends a refactoring strategy:

Merge projects — if >60% code overlap
Extract shared library — if 5-60% overlap with well-defined clusters
Keep separate — if <5% overlap

CLI Usage

# Basic comparison
redup compare ./project-a ./project-b --threshold 0.75

# With semantic similarity (slower, more accurate)
redup compare ./project-a ./project-b --semantic --threshold 0.70

# Multi-language projects
redup compare ./backend ./frontend --ext ".py,.js,.ts" --threshold 0.80

# Skip community detection (faster, no networkx required)
redup compare ./a ./b --no-community

# Generate LLM-powered refactoring plan (requires redup[llm])
redup compare ./a ./b --refactor-plan --env .env --output plan.json

Sample Output

Comparing project-a ↔ project-b (threshold=0.75)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Cross-Project Comparison                        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Metric                  │ Value                      │
├─────────────────────────┼────────────────────────────┤
│ Project A files         │ 42                         │
│ Project B files         │ 38                         │
│ Project A lines         │ 8500                       │
│ Project B lines         │ 7200                       │
│ Cross matches           │ 15                         │
│ Shared LOC (potential)  │ 1200                       │
└─────────────────────────┴────────────────────────────┘

Recommendation: extract_shared_lib
15% overlap (1200 shared lines, 5 clusters). Extract to shared library.
Confidence: 80%

Top Communities (shared code candidates):
┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━┳━━━━━━━━━━┓
┃ ID ┃ Name                 ┃ Similarity ┃ LOC ┃ Members  ┃
┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━╇━━━━━━━━━━┩
│  0 │ validate_input       │ 0.89       │ 180 │ 5        │
│  1 │ parse_config         │ 0.82       │ 140 │ 4        │
│  2 │ format_response      │ 0.76       │ 100 │ 3        │
└────┴──────────────────────┴────────────┴─────┴──────────┘

Report JSON Structure

{
  "project_a": "./project-a",
  "project_b": "./project-b",
  "stats": {
    "a": {"files": 42, "lines": 8500},
    "b": {"files": 38, "lines": 7200}
  },
  "total_matches": 15,
  "shared_loc_potential": 1200,
  "recommendation": {
    "decision": "extract_shared_lib",
    "rationale": "15% overlap (1200 shared lines, 5 clusters). Extract to shared library.",
    "overlap_pct": 0.1523,
    "shared_loc": 1200,
    "confidence": 0.8
  },
  "communities": [
    {
      "name": "validate_input",
      "similarity": 0.89,
      "loc": 180,
      "members": [
        {"project": "A", "file": "api/validators.py", "function": "validate_input"},
        {"project": "B", "file": "utils/validation.py", "function": "validate_input"}
      ]
    }
  ],
  "matches": [...]
}

Algorithm Overview

The comparison uses a 3-tier similarity detection:

Structural hash — exact AST matches (fast, O(n+m))
LSH (Locality Sensitive Hashing) — near-duplicates via MinHash
Semantic similarity — CodeBERT embeddings (optional, slowest)

Matches are deduplicated by (function_a, function_b, file_a, file_b) with the highest similarity score retained.

Community Detection

Requires networkx (pip install redup[compare]).

Uses greedy modularity communities on a similarity graph where:

Nodes = functions from both projects
Edges = similarity score (filtered by --threshold)
Communities = clusters of mutually similar functions

Each community gets a generated name based on longest common prefix of its member functions (e.g., validate_* → validate_input).

Architecture

src/redup/
├── __init__.py            # Public API
├── __main__.py            # python -m redup
├── mcp_server.py          # MCP server entry point (re-exports from mcp package)
├── mcp/                   # MCP server package
│   ├── __init__.py        # Public MCP API
│   ├── handlers.py        # Tool handlers
│   ├── schemas.py         # JSON-RPC schemas
│   ├── server.py          # JSON-RPC server core
│   └── utils.py           # Shared utilities
├── core/
│   ├── models.py          # Pydantic data models
│   ├── scanner.py         # File discovery + block extraction
│   ├── scanner/           # Scanner package
│   │   ├── __init__.py    # Public scanner API
│   │   ├── cache.py       # Memory cache
│   │   ├── filters.py     # File filtering
│   │   ├── loader.py      # File preloading
│   │   └── types.py       # Scanner types
│   ├── hasher.py          # SHA-256 / structural fingerprinting
│   ├── matcher.py         # Fuzzy similarity comparison
│   ├── planner.py         # Refactoring suggestion generator
│   ├── pipeline.py        # Legacy: re-exports from pipeline package
│   └── pipeline/          # Pipeline package (new)
│       ├── __init__.py    # analyze(), analyze_optimized(), analyze_parallel()
│       ├── phases.py      # scan_phase(), process_blocks()
│       ├── duplicate_finder.py  # Duplicate finding phases
│       └── groups.py      # Group creation, deduplication
│   └── ts_extractor/        # Tree-sitter extraction (35+ languages)
│       ├── __init__.py    # Public API
│       ├── main.py        # Core extraction API
│       ├── dispatcher.py  # Language routing
│       ├── config.py      # Language registry
│       └── extractors/    # Per-language extractors
├── reporters/
│   ├── json_reporter.py   # JSON output
│   ├── yaml_reporter.py   # YAML output
│   └── toon_reporter.py   # TOON output (LLM-optimized)
└── cli_app/
    └── main.py            # Typer CLI

Analysis Pipeline

1. SCAN      Walk project, read files, extract function-level + sliding-window blocks
2. HASH      Generate exact (SHA-256) and structural (normalized AST) fingerprints
3. GROUP     Bucket by hash, keep only groups with 2+ blocks from different locations
4. MATCH     Verify candidates with fuzzy similarity (SequenceMatcher / rapidfuzz)
5. DEDUP     Remove overlapping groups (keep highest-impact)
6. PLAN      Generate prioritized refactoring suggestions with risk assessment
7. REPORT    Export to JSON / YAML / TOON

Recent Improvements (v0.5.0)

🏗️ Modular Architecture Refactoring

Major internal restructuring for better maintainability and extensibility:

MCP Server Package

The MCP server has been split from a 675-line monolith into a clean package:

redup/mcp/
├── __init__.py      # Public API
├── handlers.py      # 8 tool handlers
├── schemas.py       # JSON-RPC schemas
├── server.py        # Server core
└── utils.py         # Utilities

82% code reduction in main file
Backward compatible: mcp_server.py re-exports all APIs
Better testability: Isolated handlers can be tested independently

Pipeline Package

The analysis pipeline (714 lines) now lives in a modular package:

redup/core/pipeline/
├── __init__.py          # analyze(), analyze_optimized(), analyze_parallel()
├── phases.py            # scan_phase(), process_blocks()
├── duplicate_finder.py  # find_exact_groups(), find_structural_groups(), etc.
└── groups.py            # deduplicate_groups(), blocks_to_group(), etc.

66% reduction in main orchestrator file
Phases can be used independently for custom workflows
Cleaner separation of concerns

Scanner Improvements

The scanner has been refactored with extracted helpers:

_init_strategy() - Strategy initialization
_process_single_file() - Per-file processing
_extract_blocks_for_file() - Block extraction
Reduced CC and fan-out in main scan_project() function

🎯 Sprint 1 Refactoring Complete

Reduced cyclomatic complexity from CC̄=4.2 to CC̄=3.5
Eliminated all critical functions (CC > 10): 2 → 0
Achieved HEALTHY status with no structural issues
Dispatch pattern implementation for AST node processing
Modular TOON reporter split into 5 focused functions
CLI refactoring with helper functions for better maintainability

🚀 Technical Achievements

_process_ast_node: CC=14 → CC=6 (dispatch dict pattern)
to_toon: CC=12 → CC=8 (5 helper functions)
CLI scan(): fan-out=18 → ≤10 (4 helper functions)
Code quality: 0 high-complexity functions
Test coverage: 64/64 tests passing (100%)

📊 Quality Metrics

Health status: ✅ HEALTHY (no critical issues)
Cyclomatic complexity: CC̄=3.5 (target ≤ 3.0 achieved)
Maximum CC: 9 (target ≤ 10 achieved)
Code maintainability: Significantly improved
Duplication: Minimal (2 groups, 6 lines - acceptable patterns)

🔧 Code Architecture

Dispatch tables for extensible AST processing
Single responsibility functions throughout codebase
Clean separation of concerns in CLI pipeline
Type safety improvements with proper annotations
Error handling enhanced for edge cases

Integration with wronai Toolchain

reDUP is part of the wronai developer toolchain:

code2llm — static analysis engine (health diagnostics, complexity)
reDUP — deep duplication analysis and refactoring planning
code2docs — automatic documentation generation
vallm — validation of LLM-generated code proposals

📈 Typical workflow:

code2llm analyzes the project → .toon diagnostics
redup finds duplicates → duplication.toon.yaml
Feed both to an LLM for targeted refactoring
vallm validates the LLM's proposals before merging

🎯 Why reDUP?

LLM-ready: TOON format optimized for LLM consumption
Actionable: Generates concrete refactoring suggestions
Prioritized: Ranks duplicates by impact and risk
Integrated: Works seamlessly with wronai toolchain
Fast: Scans 1000+ lines in < 1 second
Clean: No syntax warnings, professional output

Development

git clone https://github.com/semcod/redup.git
cd redup
pip install -e ".[dev]"
pytest

License

Licensed under Apache-2.0.

Author

Tom Sapletta

Status

Last updated by taskill at 2026-04-25 13:46 UTC

Metric	Value
HEAD	`7055183`
Coverage	42.9%
Failing tests	—
Commits in last cycle	50

Added markdown output and a configuration management system, with numerous docs and code-analysis refactors and some test additions. Several refactors target the code analysis engine and TypeScript extractor components.

Name		Name	Last commit message	Last commit date
Latest commit History 75 Commits
.idea		.idea
.redup/cache		.redup/cache
benchmarks		benchmarks
code2llm_output		code2llm_output
docs		docs
examples		examples
logs		logs
project		project
proxy_analysis		proxy_analysis
redup_self_analysis		redup_self_analysis
scripts		scripts
src		src
tests		tests
toon		toon
.gitignore		.gitignore
BUGFIX_SUMMARY.md		BUGFIX_SUMMARY.md
CHANGELOG.md		CHANGELOG.md
COMPLEXITY_REDUCTION_COMPLETE.md		COMPLEXITY_REDUCTION_COMPLETE.md
CONTINUATION_PLAN.md		CONTINUATION_PLAN.md
LICENSE		LICENSE
MEMORY_OPTIMIZATION.md		MEMORY_OPTIMIZATION.md
PERFORMANCE.md		PERFORMANCE.md
PERFORMANCE_OPTIMIZATIONS.md		PERFORMANCE_OPTIMIZATIONS.md
README.md		README.md
SPRINT_2_PLAN.md		SPRINT_2_PLAN.md
TEST_OUTPUT.md		TEST_OUTPUT.md
TODO.md		TODO.md
TODO_github.md		TODO_github.md
TODO_gitlab.md		TODO_gitlab.md
Taskfile.yml		Taskfile.yml
VERSION		VERSION
app.doql.css		app.doql.css
benchmark.py		benchmark.py
compare_report.json		compare_report.json
compare_report_test.json		compare_report_test.json
complexity_reduction_plan.md		complexity_reduction_plan.md
coverage.json		coverage.json
goal.yaml		goal.yaml
planfile.yaml		planfile.yaml
prefact.yaml		prefact.yaml
project.sh		project.sh
pyproject.toml		pyproject.toml
pyqual.yaml		pyqual.yaml
redup.toml		redup.toml
test_fuzzy_similarity.py		test_fuzzy_similarity.py
test_universal_fuzzy.py		test_universal_fuzzy.py

Folders and files

Latest commit

History

Repository files navigation

reDUP

AI Cost Tracking

Features

New Features (v0.4.20)

🤖 MCP Server

🌐 Universal Fuzzy Similarity Detection

🌳 Modular Tree-Sitter Extractor

New Features (v0.5.0+)

🌐 Universal Fuzzy Similarity Detection

🧩 Modular ts_extractor Architecture

🎯 Enriched TOON Reporter

🤖 LLM-Powered Refactoring Plans

📊 Simplified Compare Reports

Installation

Quick Start

CLI

Task Management with Planfile (Optional)

Configuration

Python API

Output Formats

TOON (LLM-optimized)

JSON (machine-readable)

Cross-Project Comparison

CLI Usage

Sample Output

Report JSON Structure

Algorithm Overview

Community Detection

Architecture

Analysis Pipeline

Recent Improvements (v0.5.0)

🏗️ Modular Architecture Refactoring

MCP Server Package

Pipeline Package

Scanner Improvements

🎯 Sprint 1 Refactoring Complete

🚀 Technical Achievements

📊 Quality Metrics

🔧 Code Architecture

Integration with wronai Toolchain

📈 Typical workflow:

🎯 Why reDUP?

Development

License

Author

Status

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages