# Benchmarking Cipherscope

This document explains how to run and interpret the benchmark suite.

## Quick Start

```bash
# Run all benchmarks (fast mode, ~5-8 minutes)
cargo bench

# Run extended benchmarks (~30 minutes)
CIPHERSCOPE_BENCH_EXTENDED=1 cargo bench

# Run a specific benchmark
cargo bench --bench component_bench
cargo bench --bench scale_bench
```

## Benchmark Modes

### Normal Mode (Default)
Runs essential benchmarks with minimal variants. Completes in **~5-8 minutes**.

### Extended Mode
Set `CIPHERSCOPE_BENCH_EXTENDED=1` to enable:
- More file size variants (1KB-1MB)
- More file count variants (100-10K)
- More thread counts (1, 2, 4, 8, 16, 32)
- Memory profiling benchmarks
- Large fixture benchmarks (5K+ files)

## Benchmark Summary

| Benchmark | Normal Mode | Extended Mode |
|-----------|-------------|---------------|
| `scan_bench` | 4 variants (~1 min) | Same |
| `component_bench` | 8 variants (~1.5 min) | 15 variants (~3 min) |
| `file_size_bench` | 3 sizes (~1 min) | 5 sizes (~2 min) |
| `scale_bench` | 3 file counts (~1.5 min) | 6 counts + density (~5 min) |
| `thread_scaling_bench` | 3 thread counts (~1 min) | 7 thread counts (~3 min) |
| `memory_bench` | Skipped | 3 variants (~3 min) |
| `large_fixture_bench` | Skipped | 5K files + nested (~5 min) |

## Benchmark Details

### scan_bench
Basic end-to-end scan benchmark using the existing fixtures.
```bash
cargo bench --bench scan_bench
```

### component_bench
Isolates individual scanner components:
- `parsing` - Tree-sitter AST parsing
- `anchor_hint` - Fast regex pre-filter
- `library_anchors` - Library detection
- `algorithm_detection` - Pattern matching
- `full_pipeline` - Complete scan pipeline
- `language_detection` - File extension mapping
- `pattern_loading` - PatternSet initialization

```bash
cargo bench --bench component_bench
```

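To make the component list concrete, here is a minimal sketch of what the `language_detection` stage does (mapping file extensions to languages). This is illustrative only: the scanner's real table is richer, and the function name and language strings here are assumptions, not Cipherscope's actual API.

```rust
use std::path::Path;

// Toy extension-to-language mapping of the kind the `language_detection`
// benchmark exercises. Names and supported languages are illustrative.
fn detect_language(path: &Path) -> Option<&'static str> {
    match path.extension()?.to_str()? {
        "py" => Some("python"),
        "rs" => Some("rust"),
        "go" => Some("go"),
        "java" => Some("java"),
        _ => None, // unknown extensions are skipped by the scanner
    }
}

fn main() {
    assert_eq!(detect_language(Path::new("crypto.py")), Some("python"));
    assert_eq!(detect_language(Path::new("README")), None);
    println!("ok");
}
```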
### file_size_bench
Tests performance with different file sizes (1KB, 10KB, 100KB, etc.).
```bash
cargo bench --bench file_size_bench
```

### scale_bench
Tests performance with different file counts (100, 500, 1000, etc.).
```bash
cargo bench --bench scale_bench
```

### thread_scaling_bench
Measures parallel scaling efficiency.
```bash
cargo bench --bench thread_scaling_bench
```

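A common way to read thread-scaling results is parallel efficiency: single-thread time divided by (thread count × multi-thread time). The sketch below uses invented timings, not measured Cipherscope numbers.

```rust
// Parallel efficiency: 1.0 means perfect linear scaling; lower values mean
// the extra threads are partly wasted on coordination or contention.
fn efficiency(t1_ms: f64, threads: u32, tn_ms: f64) -> f64 {
    t1_ms / (threads as f64 * tn_ms)
}

fn main() {
    // e.g. 200 ms on 1 thread vs 60 ms on 4 threads -> ~0.83 efficiency
    let e = efficiency(200.0, 4, 60.0);
    println!("efficiency: {e:.2}");
}
```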
### memory_bench (Extended Only)
Profiles memory usage during scans.
```bash
CIPHERSCOPE_BENCH_EXTENDED=1 cargo bench --bench memory_bench
```

### large_fixture_bench (Extended Only)
Tests with large synthetic fixtures (5K+ files).
```bash
CIPHERSCOPE_BENCH_EXTENDED=1 cargo bench --bench large_fixture_bench
```

## Interpreting Results

Criterion reports time as a range:
```
parsing/lang/python time:  [1.98 ms 2.00 ms 2.01 ms]
                    thrpt: [4.76 MiB/s 4.79 MiB/s 4.83 MiB/s]
```

- First line: timing, shown as `[lower bound  estimate  upper bound]` of the confidence interval
- Second line: throughput over the same interval (if configured)

### Throughput Metrics
- `Elements/s`: Files scanned per second
- `MiB/s`: Data processed per second

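The throughput line is just the input size divided by the measured time. The sketch below reproduces the sample figures above; the ~9.8 KiB input size is an assumption chosen to match that output, not a real fixture size.

```rust
// Convert a measured time and input size into the MiB/s figure Criterion
// prints. Input size here is assumed, chosen to match the sample output.
fn mib_per_sec(bytes: f64, secs: f64) -> f64 {
    bytes / (1024.0 * 1024.0) / secs
}

fn main() {
    let bytes = 10_045.0;      // assumed input size (~9.8 KiB)
    let median_secs = 2.00e-3; // 2.00 ms point estimate from the sample
    println!("{:.2} MiB/s", mib_per_sec(bytes, median_secs)); // prints "4.79 MiB/s"
}
```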
## Environment Variables

| Variable | Description |
|----------|-------------|
| `CIPHERSCOPE_BENCH_EXTENDED` | Enable extended benchmarks |
| `CIPHERSCOPE_BENCH_FIXTURE` | Custom fixture path for `scan_large_bench` |
| `CIPHERSCOPE_BENCH_THREADS` | Custom thread counts (comma-separated) |

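As a sketch of how a comma-separated `CIPHERSCOPE_BENCH_THREADS` value could be consumed, assuming the harness simply splits on commas; the bench harness's real parsing, error handling, and default thread counts may differ.

```rust
use std::env;

// Hypothetical parsing of CIPHERSCOPE_BENCH_THREADS ("1,2,4,8" style).
// Malformed entries are silently dropped; defaults below are an assumption.
fn parse_thread_counts(spec: &str) -> Vec<usize> {
    spec.split(',')
        .filter_map(|n| n.trim().parse().ok())
        .collect()
}

fn main() {
    let counts = env::var("CIPHERSCOPE_BENCH_THREADS")
        .map(|s| parse_thread_counts(&s))
        .unwrap_or_else(|_| vec![1, 4]); // fallback when the variable is unset
    println!("thread counts: {counts:?}");
}
```

An invocation would then look like `CIPHERSCOPE_BENCH_THREADS=1,2,4,8 cargo bench --bench thread_scaling_bench` (assumed usage).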
## Tips

1. Close other applications to reduce noise
2. Run multiple times to verify consistency
3. Results are saved to `target/criterion/`
4. HTML reports: `target/criterion/<name>/report/index.html`

## Comparing Baselines

```bash
# Save baseline
cargo bench -- --save-baseline before

# Make changes, then compare
cargo bench -- --baseline before
```