diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
index c9a2865..1ec7ce5 100644
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -10,7 +10,13 @@
"repository": "https://github.com/denfry/codebase-index",
"license": "MIT",
"keywords": [
+ "claude-code",
"code-search",
+ "semantic-code-search",
+ "codebase-index",
+ "mcp",
+ "ai-agents",
+ "local-first",
"tree-sitter",
"rag",
"sqlite",
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8cde91a..58a6aa9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,33 @@ All notable changes to this project are documented here. The format is based on
## [Unreleased]
+### Added
+- **`clean` is now implemented** (it was a documented-but-stubbed `_todo` since M0).
+ `codebase-index clean` resets the index database (`index.sqlite` + WAL/SHM
+ sidecars); `codebase-index clean --all` wipes the whole per-project cache
+ directory. It prompts before deleting (skip with `--yes`), supports `--json`,
+ and never touches the installed skill. Locked in by `tests/test_clean_cli.py`.
+- **`docs/PRODUCT_UPGRADE_PLAN.md`**: positioning, target users, competitor matrix,
+ differentiators, current weaknesses, a ranked roadmap, and documentation /
+ benchmark / distribution / technical task lists.
+- **`docs/RELEASE_CHECKLIST.md`**: a repeatable release checklist (version sync,
+ tests, benchmarks, doctor, install/plugin/MCP smoke, changelog) with signed
+ checksums + SBOM tracked as future hardening.
+
+### Changed
+- **README**: added "Who Is It For?" and a "How Is This Different?" section that
+ answers why-not-grep / Cursor / Aider repo-map / Sourcegraph / Codebase-Memory
+ MCP on the first screen, plus a proven-today-vs-roadmap table.
+- **`docs/COMPARISON.md`**: explicit rows and "choose them when / choose us when"
+ guidance for Continue, Sourcegraph/Cody/Amp, and Codebase-Memory MCP.
+- **`docs/BENCHMARKS.md`**: a status table separating toy surfaces from the one
+ proven real-repo run, an explicit "claims that should NOT be made yet" list, and
+ a TODO-friendly benchmark task checklist with a no-overclaim procedure.
+
+### Fixed
+- `docs/FAQ.md`: removed a dangling/duplicated sentence in "Is it
+ production-ready?" and documented the real `clean` / `clean --all` behavior.
+
## [1.3.0] - 2026-06-09
### Added
diff --git a/README.md b/README.md
index 998502c..6b01fbd 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,11 @@ references without scanning an entire repository.
[](docs/DATABASE_SCHEMA.md)
[](docs/ARCHITECTURE.md)
+
+
+
+
## What Is codebase-index?
**codebase-index is a private, offline retrieval layer for AI code search.** It
@@ -27,6 +32,24 @@ an AI coding agent can read instead of opening broad file sets.
Use it when you want Cursor-like codebase awareness in terminal-based AI tools
while keeping source code, snippets, and search metadata on your machine.
+> **codebase-index is not an IDE and not a coding agent.** It is the local
+> retrieval/index layer that gives terminal and MCP-based AI agents precise
+> codebase context. The agent stays your interface; this gives it better aim.
+
+## Who Is It For?
+
+- **Claude Code / Codex CLI / OpenCode users** on medium-to-large repos who want
+ the agent to read 3 ranked files instead of grepping and scanning 60.
+- **Privacy-constrained teams** (proprietary or regulated code) who cannot send
+ source to a cloud code-intelligence service.
+- **MCP power users** who want a stable, queryable code index as a tool, not a
+ black box baked into one agent's prompt.
+- **Tooling authors** who need scriptable retrieval (`--json`, SQLite, MCP) that
+ other tools can build on.
+
+Not for you if you want a full IDE, org-scale multi-repo search, or a hosted
+platform — use Cursor or Sourcegraph for those.
+
## Start Here
If you are opening this repository for the first time, follow this order:
@@ -145,6 +168,61 @@ Developers get Cursor-like codebase awareness in Claude Code, Codex CLI, and
OpenCode without leaving the terminal or sending code to a remote indexing
service.
+## How Is This Different?
+
+Short answers to the questions people actually ask. The full, honest matrix —
+including when you should pick the other tool — is in
+[docs/COMPARISON.md](docs/COMPARISON.md).
+
+- **Why not just `grep`/`rg`?** Grep returns every match with no ranking, no
+ symbol awareness, and no idea which files relate. codebase-index ranks results,
+ knows a definition from a call, expands along the dependency graph, and returns
+ specific line ranges under a token budget — so the agent reads less and answers
+ with citations.
+- **Why not Cursor?** Cursor is a great AI IDE with strong codebase awareness, but
+ it is proprietary and IDE-centric. codebase-index is a local, open retrieval
+ layer for **terminal and MCP** agents, offline by default, with no IDE lock-in.
+ If you live inside Cursor, keep using Cursor.
+- **Why not Aider repo-map?** Aider's repo-map is a good graph-ranked,
+ token-budgeted context map — but it is optimized to feed Aider's own chat.
+ codebase-index is a **reusable, queryable index**: CLI/JSON/MCP commands return
+ ranked `file:line` ranges, symbols, references, and impact that *any*
+ shell-capable agent can consume, with freshness and security gates.
+- **Why not Sourcegraph / Cody / Amp?** They are excellent enterprise-grade,
+ cross-repo code intelligence platforms. They are also heavier and
+ account/platform-oriented. codebase-index is single-repo, local, and
+ lightweight — no server, no account, no code leaving the machine by default.
+- **Why not Codebase-Memory MCP?** It is the closest direct alternative — a
+ broader graph engine with a static binary and wide language/agent coverage. We
+ do **not** claim to beat it globally. We differentiate on simplicity, a strict
+ privacy model, token-budgeted retrieval packets, a transparent Python
+ implementation, the Claude/Codex/OpenCode workflow, and honest benchmarks. If
+ you need its broader graph and language reach today, choose it.
+
+**What makes it trustworthy?** No telemetry, no network by default, a multi-gate
+exclusion pipeline (secrets/binaries/generated/dependencies never indexed),
+output-time secret redaction, a `doctor --strict` safety self-check, and a
+public benchmark suite wired as a CI regression gate. Claims that aren't proven
+in this repo are marked as roadmap, not done.
+
+### Proven today vs. roadmap
+
+| Capability | Status |
+|---|---|
+| Hybrid retrieval (path + symbol + FTS5 + graph), token-budgeted packets | ✅ Shipped |
+| Tree-sitter symbols for 12 Tier-A languages + Tier-B generic path | ✅ Shipped |
+| Import/call/reference/inheritance graph, `refs`/`impact` | ✅ Shipped |
+| Optional local embeddings; external embeddings gated 3 ways | ✅ Shipped |
+| stdio MCP server; CLI/skill/MCP share one service layer | ✅ Shipped |
+| Honest 55k LOC Java benchmark (recall@3 70% vs 40% `rg`, ~13× fewer tokens) | ✅ Shipped |
+| 10k/100k/1M LOC public-repo benchmarks | 🚧 Roadmap |
+| Framework-aware typed edges (route→handler→service→model) | 🚧 Roadmap |
+| PyPI / `uvx` / Homebrew, signed checksums, SBOM | 🚧 Roadmap |
+| Verified per-client MCP docs, paged/progressive results | 🚧 Roadmap |
+
+See [docs/PRODUCT_UPGRADE_PLAN.md](docs/PRODUCT_UPGRADE_PLAN.md) for the full
+upgrade plan and ranked roadmap.
+
## How Does codebase-index Work?
`codebase-index` builds a local hybrid index that combines:
@@ -537,7 +615,8 @@ Yes. The CLI is agent-agnostic. Any agent that can run shell commands can use
### How do I reset the index?
```bash
-codebase-index clean
+codebase-index clean # reset the index DB (keeps the skill)
+codebase-index clean --all # wipe the whole .claude/cache/codebase-index/ dir
# Or manually: rm -rf .claude/cache/codebase-index/
codebase-index index
```
diff --git a/assets/demo.png b/assets/demo.png
new file mode 100644
index 0000000..3ff97e2
Binary files /dev/null and b/assets/demo.png differ
diff --git a/assets/social-preview.png b/assets/social-preview.png
new file mode 100644
index 0000000..0f34eff
Binary files /dev/null and b/assets/social-preview.png differ
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 1555eae..2215440 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -153,7 +153,7 @@ upward to nearest `.git`/`.claude`), and `--quiet`. Search-family commands accep
| `explain` | `""`, `--token-budget` | 0 | intent-aware bundle |
| `stats` | — | 0 | counts, coverage %, freshness |
| `doctor` | `--strict` | non-zero if unsafe config found | findings list |
-| `clean` | `--yes` | removes cache | confirmation |
+| `clean` | `--yes`, `--all` | resets index DB (`--all` wipes cache dir) | removed-count |
| `watch` | `--debounce ms` | long-running | event log |
The skill only ever calls the **read-only** family (`search`, `symbol`, `refs`, `impact`,
diff --git a/docs/BENCHMARKS.md b/docs/BENCHMARKS.md
index 245d979..062da77 100644
--- a/docs/BENCHMARKS.md
+++ b/docs/BENCHMARKS.md
@@ -1,6 +1,28 @@
# Benchmarks
-`codebase-index` has three benchmark surfaces.
+`codebase-index` has three benchmark surfaces. Read them with their status in
+mind — the whole point of this page is to keep evidence and aspiration separate.
+
+| Surface | What it is | Status | Use it as |
+|---|---|---|---|
+| Public suite (`tests/benchmark_public.py`) | Deterministic synthetic multi-language fixture with the full metric framework | **Toy/synthetic** | CI regression gate + metric shape, **not** product-quality evidence |
+| Smoke/perf (`test_perf_smoke.py`, `test_benchmark_comparison.py`) | Latency + output-size guards on a tiny fixture | **Toy/smoke** | Regression checks only |
+| Honest real-repo (`tests/benchmark_honest.py`) | 55k LOC Java repo, recall@3 vs disciplined `rg` baseline, symmetric token accounting | **Proven (one repo)** | The only headline product-quality number we stand behind today |
+
+### Claims that should NOT be made yet
+
+Do not write, imply, or ship any of these until a run with published logs exists:
+
+- Any 10k / 100k / 1M LOC scale or speed claim (no real run at that size).
+- "Beats Cursor / Sourcegraph / Codebase-Memory MCP" — no head-to-head exists.
+- Per-language quality claims beyond Java (the honest run is Java-only).
+- Generic "Nx faster" / "Nx fewer tokens" without naming the baseline and repo.
+- Latency claims — the honest run explicitly does not headline latency
+ (Python process start dominates; real `rg` is tens of ms).
+
+The defensible headline today is exactly: **on one 55k LOC Java repo, recall@3 was
+70% (index) vs 40% (`rg`+window), using ~13× fewer answer tokens.** Everything
+else is roadmap.
## Public benchmark suite
@@ -67,13 +89,27 @@ objective recall@3 ground truth.
latency and output-size behavior. They are useful regression checks, not product
quality evidence.
-## Remaining benchmark work
-
-The public suite now has the metric framework, but the next step is adding
-larger public or documented external repositories:
-
-- 10k, 100k, and 1M LOC scale targets
-- More real-world Python, TypeScript, Java, Go, Rust, C#, PHP repos
-- Agent answer grading with human-reviewed expected answers
-- Comparisons against repo-map style context and vanilla agent exploration
-- Framework graph tasks: route -> handler -> service -> DB, migrations, config consumers, CI/infra
+## Remaining benchmark work (TODO checklist)
+
+The public suite has the metric framework; the next step is real, larger,
+documented repositories. Each task must publish raw logs alongside any headline
+number (the pattern set by `tests/benchmark_honest_RESULTS.md`).
+
+- [ ] **10k LOC public repo** — Recall@1/3/5, MRR, nDCG, token economy; named repo + commit SHA.
+- [ ] **100k LOC public repo** — same metrics, plus full index build time and incremental update latency.
+- [ ] **1M LOC target** — feasibility + scale counters (files/symbols/edges/bytes); may be partial.
+- [ ] **Multi-language repo** (≥3 Tier-A languages) — per-language recall and answer-correctness breakdown.
+- [ ] **vs vanilla agent grep/read** — tokens and recall against an undisciplined agent exploring the same questions.
+- [ ] **vs repo-map-style context** — tokens and recall against an Aider-repo-map-style context blob.
+- [ ] **Graph task benchmark** — `refs`, `impact`, and route→handler→service paths against hand-labeled ground truth.
+- [ ] **Answer grading** — human-reviewed expected answers, not just file-level recall proxies.
+- [ ] **Framework graph tasks** — migrations, config consumers, CI/infra wiring once typed edges land.
+
+How to add one without overclaiming:
+
+1. Pick a public repo; record its URL and commit SHA.
+2. Derive ground truth independently of the index (e.g. naming convention), so the
+ index cannot grade its own homework.
+3. Use a symmetric token estimator and read window on both sides.
+4. Commit the raw run output next to a short `*_RESULTS.md` summary.
+5. Only then update README/COMPARISON headline numbers.
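
As a rough illustration of the file-level metrics named in the checklist above, Recall@k and MRR can be computed from ranked result lists and independently derived ground truth. This is a minimal sketch, not the code in `tests/benchmark_honest.py`: the function names, the sample paths, and the per-question hit-rate reading of Recall@k are assumptions made for the example.

```python
from typing import Iterable, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> bool:
    """True if any ground-truth file appears in the top-k ranked paths."""
    return any(path in relevant for path in ranked[:k])


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1/rank of the first ground-truth file, or 0.0 if none was returned."""
    for rank, path in enumerate(ranked, start=1):
        if path in relevant:
            return 1.0 / rank
    return 0.0


def summarize(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 3) -> dict:
    """Average hit@k (reported here as Recall@k) and MRR over all questions."""
    runs = list(runs)
    if not runs:
        raise ValueError("at least one question is required")
    recall = sum(hit_at_k(ranked, rel, k) for ranked, rel in runs) / len(runs)
    mrr = sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)
    return {f"recall@{k}": recall, "mrr": mrr}


# Hypothetical data: (ranked paths returned by a tool, ground-truth paths).
EXAMPLE_RUNS = [
    (["src/Auth.java", "src/User.java", "src/Db.java"], {"src/Auth.java"}),
    (["src/Db.java", "src/Cache.java", "src/Auth.java"], {"src/Cache.java"}),
    (["src/User.java", "src/Db.java", "src/Auth.java"], {"src/Billing.java"}),
]

if __name__ == "__main__":
    print(summarize(EXAMPLE_RUNS, k=3))
```

Running the same `summarize` over the index results and over the baseline results, with identical questions and ground truth, keeps the comparison symmetric.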
diff --git a/docs/COMPARISON.md b/docs/COMPARISON.md
index fe7613a..bed7020 100644
--- a/docs/COMPARISON.md
+++ b/docs/COMPARISON.md
@@ -14,8 +14,9 @@ platform.
| codebase-index | Local CLI/skill/MCP retrieval for Claude Code, Codex CLI, OpenCode, and MCP clients | Broad framework-aware graph is still a roadmap item |
| Cursor indexing | Integrated AI IDE workflow | Proprietary and tied to Cursor |
| Aider repo-map | Aider chat sessions with compact repository context | Context map, not a reusable local search API |
-| Sourcegraph Cody | Enterprise-scale code intelligence across many repos | Cloud/account setup and heavier platform surface |
-| Serena / MCP tools | MCP-first local tool integration | Quality and schemas vary by server |
+| Sourcegraph / Cody / Amp | Enterprise-scale code intelligence across many repos | Cloud/account setup and heavier platform surface |
+| Continue | Open-source coding agent for IDE + CLI | An agent with context features, not a standalone retrieval index |
+| Codebase-Memory MCP | Local graph-based code-memory over MCP | Broader/heavier graph engine; different simplicity/privacy tradeoffs |
| Manual grep/read | Exact ad hoc search | No ranking, graph, symbol contract, or token budgeting |
## Criteria
@@ -46,6 +47,100 @@ platform.
| Update model | Manual `index`/`update`, hooks, optional watcher | IDE-managed | Rebuilt as Aider manages context | Platform-managed | Varies | Always live but manual |
| Extensibility | CLI `--json`; MCP schema v1.0; SQLite local DB | Limited external contract | Aider internals/context | Sourcegraph APIs | MCP by design | Shell pipelines |
+## When to choose what
+
+Honest, per-tool guidance. None of these are attacks — each tool is good at the
+job it was built for. The question is which layer you actually need.
+
+### Manual grep / read
+
+- **Good at:** exact string matching, zero setup, always live, universally
+ available. For a single known identifier in a small scope, nothing beats `rg`.
+- **Where codebase-index differs:** ranking, symbol awareness (definition vs
+ call), graph expansion to related files, and token-budgeted line ranges instead
+ of every matching line.
+- **Choose grep when:** you know the exact string, the repo is small, or you only
+ need one match.
+- **Choose codebase-index when:** the question is conceptual ("where is auth
+ implemented?"), the repo is large, or an AI agent will pay for every irrelevant
+ line it reads.
+
+### Cursor
+
+- **Good at:** an integrated AI IDE with strong, low-friction codebase awareness
+ for people who work inside Cursor.
+- **Where codebase-index differs:** it is a local, open retrieval layer for
+ **terminal and MCP** agents, offline by default, with no IDE lock-in and a
+ scriptable CLI/JSON/MCP contract.
+- **Choose Cursor when:** you want an AI-native IDE and are comfortable with a
+ proprietary, IDE-centric workflow.
+- **Choose codebase-index when:** your agent is Claude Code, Codex CLI, OpenCode,
+ or any MCP client in the terminal, and you want code to stay on your machine.
+
+### Aider repo-map
+
+- **Good at:** a compact, graph-ranked, token-budgeted repository map that feeds
+ Aider's chat context well. It is not "just grep" — it ranks with a graph
+ algorithm over source and dependencies.
+- **Where codebase-index differs:** it is a reusable, queryable index rather than
+ context injection for one agent. CLI/JSON/MCP commands return ranked `file:line`
+ ranges, symbols, references, and `impact` that any shell-capable agent can
+ consume, with freshness checks and security/ignore gates.
+- **Choose Aider repo-map when:** Aider is your agent and you want its built-in
+ context with nothing extra to run.
+- **Choose codebase-index when:** you want one index shared across multiple agents
+ (Claude Code, Codex, OpenCode, MCP) with a stable, scriptable contract.
+
+### Sourcegraph / Cody / Amp
+
+- **Good at:** enterprise-grade, cross-repo code intelligence, search, and code
+ graph at organization scale, with mature platform features.
+- **Where codebase-index differs:** single-repo, local, and lightweight — no
+ server, no account, no code leaving the machine by default. It is a retrieval
+ layer for an agent, not a platform.
+- **Choose Sourcegraph/Cody/Amp when:** you need org-wide search across many
+ repositories, team features, and are fine with a hosted/account-based platform.
+- **Choose codebase-index when:** you want per-repo retrieval for a terminal/MCP
+ agent with a strict local-first privacy model and minimal moving parts.
+
+### Continue
+
+- **Good at:** an open-source coding **agent** with IDE and CLI integrations and
+ built-in context features. It is a full assistant, not just an index.
+- **Where codebase-index differs:** it is the **retrieval/index layer itself**,
+ not an agent. It exposes a CLI/JSON/MCP contract that an agent (including, in
+ principle, agents like Continue) can query, and it focuses on token-budgeted
+ packets and a strict privacy model rather than on being the chat surface.
+- **Choose Continue when:** you want the agent — an open assistant to drive your
+ edits.
+- **Choose codebase-index when:** you already have an agent and want to give it
+ precise, local, ranked codebase context.
+
+### Codebase-Memory MCP
+
+This is the closest direct alternative, so the comparison is the most careful.
+
+- **Good at:** a broader graph engine with a static binary, wide language and
+ agent coverage, and more advanced graph features than codebase-index ships
+ today.
+- **Where codebase-index differs — and we do not claim to beat it globally:**
+ - **Simplicity and safety:** a small pure-Python surface, a multi-gate exclusion
+ pipeline, output-time secret redaction, and a `doctor --strict` self-check.
+ - **Strict privacy model:** no telemetry, no network by default; external
+ embeddings are opt-in and gated three ways.
+ - **Token-budgeted retrieval packets:** ranked `file:line` ranges and
+ `recommended_reads` under an explicit budget, tuned for the Claude/Codex/
+ OpenCode workflow.
+ - **Transparency:** readable Python, 80% coverage gate, golden CLI snapshots,
+ and a public benchmark suite wired as a CI regression gate.
+ - **Honest benchmarks:** we publish raw logs (see the 55k LOC Java run) and mark
+ unproven scale/graph claims as roadmap.
+- **Choose Codebase-Memory MCP when:** you need its broader graph engine,
+ static-binary distribution, or wider language/agent reach today.
+- **Choose codebase-index when:** you want a simpler, privacy-strict, transparent
+ retrieval layer tuned for terminal AI agents with token-budgeted output and
+ benchmarks you can audit.
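
To make "token-budgeted retrieval packets" concrete, one simple way to fill a budget is to take ranked line ranges greedily until the budget is spent. This is a sketch under stated assumptions, not the project's ranking or packing code: `Candidate`, `estimate_tokens`, and `pack_under_budget` are hypothetical names, and the four-characters-per-token estimate is a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A ranked line range (hypothetical shape, for illustration only)."""
    path: str
    start_line: int
    end_line: int
    score: float
    text: str


def estimate_tokens(text: str) -> int:
    """Crude estimate: about four characters per token, at least one token."""
    return max(1, (len(text) + 3) // 4)


def pack_under_budget(candidates: Iterable[Candidate], token_budget: int) -> List[Candidate]:
    """Greedily keep the highest-scoring ranges that still fit the budget."""
    selected: List[Candidate] = []
    remaining = token_budget
    for candidate in sorted(candidates, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(candidate.text)
        if cost <= remaining:
            selected.append(candidate)
            remaining -= cost
    return selected


if __name__ == "__main__":
    ranked = [
        Candidate("auth/login.py", 10, 40, 0.9, "x" * 40),
        Candidate("auth/session.py", 1, 120, 0.8, "y" * 80),
        Candidate("auth/tokens.py", 5, 12, 0.5, "z" * 20),
    ]
    for c in pack_under_budget(ranked, token_budget=16):
        print(f"{c.path}:{c.start_line}-{c.end_line}")
```

A greedy pass skips a range that does not fit and keeps trying smaller ones, which is why a lower-ranked short range can still appear in the packet.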
+
## Aider repo-map clarification
Aider repo-map should not be described as "just grep" or as lacking ranking.
diff --git a/docs/FAQ.md b/docs/FAQ.md
index 3a14b1e..0541db2 100644
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -95,9 +95,12 @@ Yes. The CLI is agent-agnostic:
## How do I reset the index?
```bash
-# Delete the cache
+# Reset the index database (default — keeps resolved config and skill backups)
codebase-index clean
+# Wipe the whole per-project cache directory
+codebase-index clean --all
+
# Or manually
rm -rf .claude/cache/codebase-index/
@@ -105,6 +108,9 @@ rm -rf .claude/cache/codebase-index/
codebase-index index
```
+`clean` never touches the installed skill (it lives in `.claude/skills/`, not the
+cache). Add `--yes` to skip the confirmation prompt in scripts.
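
The split between the default reset and `--all` can be pictured as follows. This is an illustrative sketch under stated assumptions, not the project's implementation: `clean_cache` is a hypothetical helper, `resolved-config.json` is a made-up file name, and `-wal`/`-shm` are SQLite's standard sidecar suffixes for a database named `index.sqlite`.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List

INDEX_DB_NAME = "index.sqlite"
SIDECAR_SUFFIXES = ("-wal", "-shm")  # SQLite write-ahead log and shared-memory files


def clean_cache(cache_dir: Path, wipe_all: bool = False) -> List[Path]:
    """Remove the index database (or the whole cache dir) and return what was removed."""
    if not cache_dir.exists():
        return []
    if wipe_all:
        # Equivalent in spirit to `clean --all`: drop the entire cache directory.
        shutil.rmtree(cache_dir)
        return [cache_dir]
    removed: List[Path] = []
    db_path = cache_dir / INDEX_DB_NAME
    targets = [db_path] + [db_path.with_name(db_path.name + s) for s in SIDECAR_SUFFIXES]
    for target in targets:
        if target.exists():
            target.unlink()
            removed.append(target)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "codebase-index"
        cache.mkdir()
        for name in (INDEX_DB_NAME, INDEX_DB_NAME + "-wal", "resolved-config.json"):
            (cache / name).write_text("placeholder")
        print([p.name for p in clean_cache(cache)])
        print(sorted(p.name for p in cache.iterdir()))
```

Nothing outside `cache_dir` is ever touched in this sketch, which mirrors the statement that the installed skill is unaffected.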
+
## What languages are supported?
Tier-A symbol extraction currently covers:
@@ -152,9 +158,8 @@ Yes. Use any of these methods:
## Is it production-ready?
-Yes — `codebase-index` is released as **v1.3.0**. Indexing, hybrid search, Tree-sitter
-The core indexing and search functionality is implemented and tested. The
-current `1.3.0` package includes:
+Yes — `codebase-index` is released as **v1.3.0**. The core indexing and search
+functionality is implemented and tested. The current `1.3.0` package includes:
- Hybrid FTS/path/symbol/vector retrieval
- Import/call/reference graph expansion and `impact`
diff --git a/docs/PRODUCT_UPGRADE_PLAN.md b/docs/PRODUCT_UPGRADE_PLAN.md
new file mode 100644
index 0000000..16be998
--- /dev/null
+++ b/docs/PRODUCT_UPGRADE_PLAN.md
@@ -0,0 +1,163 @@
+# Product Upgrade Plan
+
+> Status: living document. Created 2026-06-12 alongside the `1.3.0` line.
+> This is a planning artifact, not a claims document. Anything not marked
+> **Shipped** is a roadmap item and must not be advertised as done.
+
+## 1. Positioning
+
+**codebase-index is not an IDE and not a coding agent. It is the local
+retrieval/index layer that gives terminal and MCP-based AI agents precise
+codebase context.**
+
+One-line description used everywhere:
+
+> Local-first codebase retrieval for AI coding agents — Cursor-like codebase
+> awareness for Claude Code, Codex CLI, OpenCode and MCP, without cloud indexing
+> or IDE lock-in.
+
+What that commits us to:
+
+- We sit **below** the agent, not beside it. The agent (Claude Code, Codex CLI,
+ OpenCode, any MCP client) stays the user's interface; we return ranked
+ `file:line` packets it reads instead of scanning the repo.
+- We are a **queryable index with a stable contract** (CLI `--json`, MCP schema),
+ not a one-shot context blob baked into a single agent's prompt.
+- We are **local-first and offline by default**. The only path that can leave the
+ machine is opt-in external embeddings, gated three ways (see SECURITY_MODEL.md).
+
+What we explicitly do **not** claim:
+
+- Not a Cursor/IDE replacement.
+- Not best-in-class framework-aware graph retrieval *yet* — today the graph is
+ import/call/reference/inheritance, not full route→handler→service→model
+ intelligence.
+- Not proven at 100k/1M LOC scale — the public suite is synthetic; the only
+ real-repo evidence is a single 55k LOC Java run.
+
+## 2. Target users
+
+| Persona | Pain today | What we give them |
+|---|---|---|
+| Claude Code / Codex CLI / OpenCode user on a medium-to-large repo | Agent burns context window grepping and reading whole files | Ranked `file:line` packets; the agent reads 3 files, not 60 |
+| Privacy-constrained team (proprietary / regulated code) | Cloud code-intelligence is a non-starter | No network by default, no telemetry, secret redaction, ignore gates |
+| MCP power user wiring multiple tools | Wants a stable, queryable code index as a tool, not a black box | stdio MCP server with a documented tool contract + `--json` CLI |
+| Tooling/automation author | Needs scriptable retrieval other tools can build on | Agent-agnostic CLI with machine-readable JSON, plus the SQLite database the index lives in |
+
+Non-users (be honest): people who want a full IDE, multi-repo enterprise code
+search, or a turnkey hosted platform. Point them at Cursor / Sourcegraph.
+
+## 3. Competitor matrix
+
+Full prose lives in [COMPARISON.md](COMPARISON.md); this is the planning view.
+
+| Tool | Category | Strongest at | Where we differ | Choose them when |
+|---|---|---|---|---|
+| Manual grep/read | Baseline | Exact ad-hoc string match | Ranking, symbols, graph, token budget | One known string, tiny scope |
+| Cursor | AI IDE | Integrated editor + codebase awareness | Terminal/MCP-agnostic, offline by default, open | You live in Cursor's IDE |
+| Aider repo-map | Agent context | Graph-ranked, token-budgeted map feeding Aider chat | Reusable queryable API across agents, freshness/security gates | You use Aider as your agent |
+| Sourcegraph / Cody / Amp | Enterprise code intelligence | Cross-repo search/graph at org scale | Single-repo, local, lightweight, no platform/account | You need org-wide multi-repo search |
+| Continue | Open-source coding agent | IDE+CLI agent with context features | Standalone retrieval index any agent can query, not an agent itself | You want the agent, not just the index |
+| Codebase-Memory MCP | Local graph code-memory MCP | Broad graph engine, static binary, many languages | Simplicity, strict privacy model, token-budgeted packets, transparent Python, honest benchmarks | You need its broader graph/language reach today |
+
+We **do not** claim to beat Codebase-Memory MCP globally. We differentiate on
+simplicity, the Claude/Codex/OpenCode workflow, token-budgeted packets, a
+transparent Python implementation, a strict privacy model, and honest benchmarks.
+
+## 4. Differentiators (defensible today)
+
+1. **Token-budgeted retrieval packets** — output is line ranges + recommended
+ reads under an explicit token budget, not whole files or raw grep dumps.
+ Shipped: `--token-budget`, `recommended_reads`, honest ~13× fewer answer
+ tokens than an `rg`+window baseline on the 55k LOC Java run.
+2. **One index, three surfaces, one service layer** — CLI, Claude/Codex/OpenCode
+ skills, and stdio MCP all run through `service.py`, so they cannot drift.
+3. **Strict, auditable privacy model** — no network by default, no telemetry,
+ multi-gate exclusion pipeline, output-time secret redaction, `doctor` safety
+ self-check with `--strict` CI gating.
+4. **Freshness contract** — every search response carries an `index` block
+ (`exists`/`stale`/`files_changed_since_build`) so the agent knows when to
+ `update` before trusting results.
+5. **Graph-coverage honesty** — Tier-A vs Tier-B languages are labeled in
+ `stats`/`refs`/`impact`/`doctor`; partial-graph languages tell the agent to
+ fall back to Grep rather than reading "no references" as proof.
+6. **Transparent, testable implementation** — pure-Python, 80% coverage gate,
+ golden CLI snapshots, public benchmark suite as a CI regression gate.
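
The freshness contract in item 4 suggests a small decision step on the agent side before results are trusted. The sketch below assumes a JSON response whose `index` block carries the three fields named above; the value types (two booleans and an integer count) and the helper name `next_action` are assumptions, not a documented schema.

```python
import json
from typing import Any, Mapping


def next_action(response: Mapping[str, Any]) -> str:
    """Return "index", "update", or "ok" based on the response's `index` block."""
    index = response.get("index") or {}
    if not index.get("exists", False):
        return "index"  # no index yet: build it first
    if index.get("stale", False) or index.get("files_changed_since_build", 0) > 0:
        return "update"  # index exists but lags behind the working tree
    return "ok"


if __name__ == "__main__":
    # Hypothetical payload; only the three `index` field names come from the plan.
    payload = json.loads(
        '{"results": [], "index": {"exists": true, "stale": true, '
        '"files_changed_since_build": 4}}'
    )
    print(next_action(payload))
```

Treating a missing `index` block the same as a missing index is a conservative choice for the sketch: the agent rebuilds rather than trusting results it cannot date.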
+
+## 5. Current weaknesses (own them)
+
+| Weakness | Impact | Plan |
+|---|---|---|
+| No large-scale real-repo benchmark | Can't claim 100k/1M LOC quality | Benchmark tasks §8; recruit public repos |
+| Graph is import/call/reference/inheritance only | `impact` misses framework wiring | ARCHITECTURE §9 typed-edge roadmap |
+| GitHub-only distribution | No `pip install codebase-index` / `uvx` | Distribution tasks §9 |
+| MCP client docs unverified | Templates may be wrong per client version | Verify against each client, add per-client docs |
+| Single-repo only | No monorepo/fleet context | Out of scope near-term; documented as non-goal |
+| `clean` was documented but only a stub | Doc/reality gap | **Shipped in this pass**: real cache reset + test |
+
+## 6. High-impact roadmap (ranked)
+
+1. **Scale benchmarks on real public repos** (10k → 100k LOC), published with raw
+ logs. Highest credibility lever.
+2. **Typed framework edges** (route→handler→service→model, test→impl, config→consumer)
+ with source spans + confidence. Biggest product-quality lever for `impact`.
+3. **Distribution hardening**: PyPI publish, `uvx`/`pipx` story, signed checksums,
+ SBOM. Lowers adoption friction and raises supply-chain trust.
+4. **MCP contract hardening**: `schema_version` on every payload, golden
+ snapshots per tool, verified client docs, paging/progressive results.
+5. **Retrieval tuning**: dampen the god-class `in_degree` tiebreak (the 3 honest
+ misses in the Java run) and review per-intent weights.
+6. **Language reach**: config/IaC awareness (Dockerfile, Terraform, migrations,
+ CI), plus Swift/Dart/Scala/Vue/Svelte gaps called out in FAQ.
+
+## 7. Documentation tasks
+
+- [x] `docs/PRODUCT_UPGRADE_PLAN.md` (this file).
+- [x] README "How is this different?" section answering why-not-grep/Cursor/Aider/
+ Sourcegraph/Codebase-Memory on the first screen.
+- [x] `docs/COMPARISON.md` explicit rows + prose for Continue, Amp, Codebase-Memory MCP.
+- [x] `docs/BENCHMARKS.md` "claims not to make yet" + TODO benchmark checklist.
+- [x] `docs/RELEASE_CHECKLIST.md`.
+- [ ] Verified per-client MCP setup docs (after testing each client version).
+- [ ] A short "trust model in 60 seconds" callout reused across README/SECURITY.
+
+## 8. Benchmark tasks
+
+Track in [BENCHMARKS.md](BENCHMARKS.md); none may be reported until run with logs.
+
+- [ ] 10k LOC public repo: Recall@1/3/5, MRR, nDCG, token economy.
+- [ ] 100k LOC public repo: same, plus index build time + incremental update latency.
+- [ ] Multi-language public repo (≥3 Tier-A languages) with per-language breakdown.
+- [ ] Head-to-head vs vanilla agent grep/read behavior (tokens + recall).
+- [ ] Head-to-head vs repo-map-style context (tokens + recall).
+- [ ] Graph task benchmark: `refs`, `impact`, and route→handler→service paths
+ against hand-labeled ground truth.
+- [ ] Publish raw logs next to every headline number, like
+ `tests/benchmark_honest_RESULTS.md`.
+
+## 9. Distribution / release tasks
+
+- [ ] Publish to PyPI; switch docs to `pip install codebase-index` with GitHub
+ pin as the reproducible alternative.
+- [ ] `uvx codebase-index` and `pipx install codebase-index` once on PyPI.
+- [ ] Homebrew tap.
+- [ ] Signed release checksums (e.g. `cosign`/`minisign`) + published SBOM.
+- [ ] Reproducible-install smoke on a clean machine per OS (extend
+ `scripts/release_smoke.py`).
+- [x] `docs/RELEASE_CHECKLIST.md` to make releases repeatable.
+
+## 10. Technical improvements (ranked by impact / risk)
+
+| # | Improvement | Impact | Risk | Status |
+|---|---|---|---|---|
+| 1 | Implement `clean` (documented but was a stub) | Fixes doc/reality gap | Low | **Shipped this pass** |
+| 2 | Dampen god-class `in_degree` tiebreak in rerank | +recall on real repos | Medium (retune) | Planned |
+| 3 | `schema_version` on every MCP payload | Stable contract | Low | Partly (architecture claims it) — verify+test |
+| 4 | Golden snapshots for each MCP tool output | Regression safety | Low | Planned |
+| 5 | Typed framework edges in the graph | Better `impact` | High | Roadmap (ARCHITECTURE §9) |
+| 6 | Config/IaC parsers (Dockerfile, Terraform, migrations) | Coverage | Medium | Roadmap |
+| 7 | Paging/progressive MCP results | Big-repo UX | Medium | Roadmap (MCP.md) |
+
+Rule for this repo: small, safe, tested changes land directly; anything that
+risks destabilizing retrieval quality or the security model is documented here
+first and lands behind a benchmark.
diff --git a/docs/RELEASE_CHECKLIST.md b/docs/RELEASE_CHECKLIST.md
new file mode 100644
index 0000000..fc35e4d
--- /dev/null
+++ b/docs/RELEASE_CHECKLIST.md
@@ -0,0 +1,118 @@
+# Release Checklist
+
+A repeatable, copy-pasteable checklist for cutting a `codebase-index` release.
+Distribution is **GitHub-only** today (no PyPI publish yet — see "Future
+hardening"). Tagging `v*` triggers `.github/workflows/release.yml`, which builds,
+`twine check`s, runs the clean-machine smoke, and publishes a GitHub release.
+
+Work top to bottom. Do not tag until every required box is checked.
+
+## 1. Version sync (single source + the two manual mirrors)
+
+The package version is single-sourced from `src/codebase_index/__init__.py`
+(hatch dynamic version). Two files mirror it and are **not** auto-synced — bump
+them by hand and verify:
+
+- [ ] `src/codebase_index/__init__.py` → `__version__` bumped (canonical).
+- [ ] `.claude-plugin/plugin.json` → `"version"` matches.
+- [ ] `.claude-plugin/marketplace.json` → version matches (if present).
+- [ ] `requirements.lock` → the GitHub tarball tag matches the new tag
+ (`.../tags/vX.Y.Z.tar.gz`). The plugin bootstrap installs exactly this pin.
+- [ ] README / QUICKSTART / INSTALLATION / FAQ / MCP install snippets reference
+ the new tag (`@vX.Y.Z`).
+- [ ] Skill copies + `.skill_version` stamps regenerated and in sync:
+
+ ```bash
+ python scripts/sync_skill_copies.py # regenerate
+ python scripts/sync_skill_copies.py --check # CI gate: must pass clean
+ ```
+
+## 2. Tests and lint
+
+- [ ] `pytest` green locally (coverage gate `--cov-fail-under=80` enforced).
+- [ ] `ruff check src/ tests/` clean.
+- [ ] `mypy src/codebase_index` (advisory) reviewed.
+- [ ] Slow/perf tests considered: `pytest --runslow` for index/search latency.
+- [ ] CI matrix green (Ubuntu/macOS/Windows × py3.11–3.13) on the release branch.
+
+## 3. Benchmark run
+
+- [ ] Public suite runs and reports all metric families:
+
+ ```bash
+ python tests/benchmark_public.py --workdir .tmp-public-benchmark
+ ```
+
+- [ ] If any headline number in README/COMPARISON changed, re-run the honest
+ benchmark and refresh `tests/benchmark_honest_RESULTS.md` with raw logs.
+ Do **not** publish a new number without a logged run (see BENCHMARKS.md).
+
+## 4. Security / doctor checks
+
+- [ ] `codebase-index doctor --strict` exits 0 in this repo.
+- [ ] No secret/binary/generated file slipped into the index (doctor reports clean).
+- [ ] External-embeddings path still refused without all three gates
+ (config + key + warning) — covered by tests, eyeball SECURITY_MODEL.md if
+ anything in `embeddings/` changed.
+- [ ] Skill `allowed-tools` still limited to read-only subcommands (no `Bash(python *)`).
+
+## 5. Install smoke tests
+
+- [ ] Clean-venv build + install + init + index + search:
+
+ ```bash
+ python scripts/release_smoke.py # build wheel, install in throwaway venv, exercise path
+ ```
+
+- [ ] `pipx install "git+https://github.com/denfry/codebase-index.git@vX.Y.Z"` on a
+ clean machine → `init` → `index` → `search` works (the M9 exit criterion).
+- [ ] Installer scripts sanity-checked: `tests/installer/smoke.sh` /
+ `tests/installer/smoke.ps1`.
+
+## 6. Plugin smoke tests
+
+- [ ] `.claude-plugin/plugin.json` + `marketplace.json` validate and version-match
+ (`tests/test_plugin_manifest.py`).
+- [ ] `bin/cbx` / `bin/codebase-index` wrappers still enforce the subcommand
+ whitelist and refuse non-whitelisted commands like `clean`
+ (`tests/test_plugin_wrappers.py`).
+- [ ] Plugin ↔ skill parity holds (`tests/test_plugin_skill_parity.py`).
+- [ ] `SessionStart` bootstrap provisions a venv from `requirements.lock` and
+ reinstalls only when the lock changes (`tests/test_bootstrap.py`).
+
+## 7. MCP smoke tests
+
+- [ ] `codebase-index mcp --root .` starts and registers all tools
+ (`tests/test_mcp_server.py`): `healthcheck`, `search_code`, `find_symbol`,
+ `find_refs`, `impact_of`, `explain_code`, `index_stats`.
+- [ ] MCP and CLI agree (shared `service.py`) — vector channel + graph tier
+ surfaced in both.
+- [ ] `docs/MCP.md` client templates still match the shipped tool list.
+
+## 8. Changelog and docs
+
+- [ ] `CHANGELOG.md`: move `[Unreleased]` items under the new `vX.Y.Z` dated
+ heading; add the version-compare link at the bottom.
+- [ ] ROADMAP / docs reflect anything that shipped or moved.
+- [ ] `docs/PRODUCT_UPGRADE_PLAN.md` status column updated for shipped items.
+
+## 9. Tag and publish
+
+- [ ] Commit the version bump + changelog on the release branch; open/merge PR.
+- [ ] Tag: `git tag vX.Y.Z && git push origin vX.Y.Z`.
+- [ ] `release.yml` build job green (test gate + `python -m build` + `twine check`
+ + `release_smoke.py`).
+- [ ] GitHub release created with artifacts attached; release notes reviewed.
+- [ ] Post-publish: re-run `pipx install "...@vX.Y.Z"` once to confirm the tag
+ resolves.
+
+## Future hardening (not yet implemented — do not claim as done)
+
+- [ ] PyPI publish (then `pip install codebase-index`, `uvx`, `pipx` without a Git URL).
+- [ ] Homebrew tap.
+- [ ] Signed release checksums (`cosign` / `minisign`).
+- [ ] Published SBOM (e.g. CycloneDX) attached to each release.
+- [ ] Provenance / build attestation (SLSA).
+
+These matter for a tool that reads entire repositories, but in the current release
+line they are roadmap items, not shipped features. See `docs/PRODUCT_UPGRADE_PLAN.md` §9.
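Section 1 of the checklist above relies on manual comparison of the version mirrors. That comparison could also be scripted. The following is a minimal sketch, assuming `__version__` is a plain string literal in `src/codebase_index/__init__.py`, that `plugin.json` carries a top-level `"version"` key, and that `requirements.lock` contains a `/tags/vX.Y.Z.tar.gz` pin. The `check_version_sync` helper and its regex are hypothetical and not part of this PR.

```python
"""Hypothetical pre-tag check: do the manual version mirrors match __version__?"""
from __future__ import annotations

import json
import re
from pathlib import Path

# Assumption: __version__ is assigned as a plain string literal.
_VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def read_canonical_version(init_py: Path) -> str:
    """Extract the canonical version string from the package __init__.py."""
    match = _VERSION_RE.search(init_py.read_text(encoding="utf-8"))
    if match is None:
        raise ValueError(f"no __version__ assignment found in {init_py}")
    return match.group(1)


def check_version_sync(root: Path) -> list[str]:
    """Return human-readable mismatches; an empty list means everything agrees."""
    version = read_canonical_version(root / "src" / "codebase_index" / "__init__.py")
    problems: list[str] = []

    # Mirror 1: the plugin manifest's "version" field.
    plugin = root / ".claude-plugin" / "plugin.json"
    found = json.loads(plugin.read_text(encoding="utf-8")).get("version")
    if found != version:
        problems.append(f"{plugin.name}: {found!r} != {version!r}")

    # Mirror 2: the GitHub tarball pin in the lock file (checked only if present).
    lock = root / "requirements.lock"
    if lock.is_file() and f"/tags/v{version}.tar.gz" not in lock.read_text(encoding="utf-8"):
        problems.append(f"{lock.name}: no pin for tag v{version}")

    return problems
```

Calling `check_version_sync(Path.cwd())` from the repository root before tagging would surface a stale mirror as a non-empty list. Whether such a check belongs in CI or stays a local aid is a design choice this sketch does not settle.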
diff --git a/docs/SEO.md b/docs/SEO.md
index 7f74fa4..03ed308 100644
--- a/docs/SEO.md
+++ b/docs/SEO.md
@@ -15,15 +15,18 @@ Rationale: Matches the intended package name and primary product keyword.
### GitHub About Description
```
-Local-first codebase indexing for Claude Code, Codex CLI, OpenCode, and AI coding agents.
+Local-first codebase indexing for Claude Code, Codex CLI, OpenCode & AI coding agents — hybrid FTS5 + Tree-sitter + graph search, fully offline.
```
### GitHub Topics
```
-claude-code, codex-cli, opencode, ai-coding, codebase-indexing, semantic-code-search, code-search, rag, codebase-rag, tree-sitter, sqlite, fts5, developer-tools, ai-agents, cursor-alternative, context-engineering, token-optimization, local-first, python, cli
+ai-agents, ai-coding, claude-code, cli, code-search, codebase-indexing, codex-cli, context-engineering, cursor-alternative, developer-tools, fts5, local-first, mcp, opencode, python, rag, semantic-code-search, sqlite, token-optimization, tree-sitter
```
+GitHub caps topics at 20; this list is the live set (all 20 slots used). `codebase-rag`
+is a swap candidate if a slot frees up.
+
### Website
Leave blank initially. Can be set to GitHub Pages docs site later.
@@ -77,7 +80,7 @@ Include shields.io badges in the README hero section:
```markdown

-
+



@@ -92,22 +95,34 @@ Include shields.io badges in the README hero section:
## Social Preview Image
-Create `assets/social-preview.png`:
+Built and committed as `assets/social-preview.png` (1280×640). Regenerate with:
+
+```bash
+python scripts/gen_assets.py
+```
+
+This also builds `assets/demo.png` (1200×760), the static terminal still embedded
+near the top of `README.md`.
+
+- **Dimensions:** 1280×640 (GitHub recommended)
+- **Background:** GitHub dark theme (#0d1117), accent glow
+- **Text:** wordmark `codebase-index` + "Local codebase indexing for AI coding agents"
+- **Elements:** terminal mock with a ranked search result + capability chips
+- **Style:** clean, minimal, professional
-- **Dimensions:** 1280x640 (GitHub recommended)
-- **Background:** Dark theme (#0d1117 or similar)
-- **Text:** "codebase-index - local codebase indexing for AI coding agents"
-- **Elements:** Terminal screenshot or code snippet graphic
-- **Style:** Clean, minimal, professional
+> **Action still required:** the file in the repo is not the social card by itself.
+> Upload it in **Settings → General → Social preview** so GitHub serves it as the
+> `og:image` on X / Slack / Discord / LinkedIn. (`usesCustomOpenGraphImage` is
+> currently `false`.)
## Launch Checklist
-- [x] Create v1.2.0 release with release notes
-- [ ] Add all GitHub topics (see list above)
-- [ ] Set repository description in About section
-- [ ] Upload social preview image
-- [ ] Ensure README first 150 words contain target keywords
-- [ ] Verify all badges render correctly
+- [x] Create v1.3.0 release with release notes
+- [x] Add all GitHub topics (20/20 slots used; see list above)
+- [x] Set repository description in About section
+- [ ] Upload social preview image (`assets/social-preview.png` built; must be uploaded in Settings → Social preview)
+- [x] Ensure README first 150 words contain target keywords
+- [x] Verify all badges render correctly
- [ ] Submit to awesome Claude Code skills lists
- [ ] Submit to awesome AI coding tools lists
- [ ] Post announcement on:
@@ -115,10 +130,10 @@ Create `assets/social-preview.png`:
- Reddit (r/LocalLLaMA, r/ClaudeAI, r/artificial)
- Hacker News (Show HN)
- Dev.to
-- [ ] Add demo GIF or terminal recording to README
-- [ ] Ensure comparison page is complete
-- [ ] Ensure security model page is complete
-- [ ] Tag release on GitHub
+- [~] Add demo GIF or terminal recording to README (`assets/demo.png` static still built; animated GIF still pending)
+- [x] Ensure comparison page is complete (`docs/COMPARISON.md`)
+- [x] Ensure security model page is complete (`docs/SECURITY_MODEL.md`)
+- [x] Tag release on GitHub (`v1.3.0`)
## Backlink Targets
diff --git a/pyproject.toml b/pyproject.toml
index 25d501a..cf35174 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -10,7 +10,11 @@ readme = "README.md"
requires-python = ">=3.11"
license = { text = "MIT" }
authors = [{ name = "codebase-index contributors" }]
-keywords = ["claude-code", "code-search", "tree-sitter", "rag", "sqlite", "fts5"]
+keywords = [
+ "claude-code", "codex-cli", "opencode", "mcp", "code-search",
+ "semantic-code-search", "codebase-indexing", "codebase-rag", "ai-agents",
+ "local-first", "tree-sitter", "rag", "sqlite", "fts5", "cli",
+]
classifiers = [
"Development Status :: 4 - Beta",
"Environment :: Console",
diff --git a/scripts/gen_assets.py b/scripts/gen_assets.py
new file mode 100644
index 0000000..2cb2a94
--- /dev/null
+++ b/scripts/gen_assets.py
@@ -0,0 +1,291 @@
+#!/usr/bin/env python3
+"""Generate brand assets for codebase-index (social preview + README demo still).
+
+Pure-Pillow, no external binaries. Renders at 3x and downsamples with LANCZOS for
+crisp typography. Re-run after changing copy:
+
+ python scripts/gen_assets.py
+
+Outputs:
+ assets/social-preview.png 1280x640 -> upload in Settings -> Social preview
+ assets/demo.png 1200x760 -> embed near the top of README.md
+"""
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+from PIL import Image, ImageDraw, ImageFilter, ImageFont
+
+SS = 3 # supersample factor
+
+# GitHub dark palette
+BG_TOP = (13, 17, 23) # #0d1117
+BG_BOT = (1, 4, 9) # #010409
+PANEL = (22, 27, 34) # #161b22
+PANEL_BAR = (26, 32, 40)
+BORDER = (48, 54, 61) # #30363d
+FG = (230, 237, 243) # #e6edf3
+FG2 = (173, 186, 199) # #adbac7
+MUTED = (110, 118, 129) # #6e7681
+DIM = (118, 131, 144) # #768390
+BLUE = (88, 166, 255) # #58a6ff
+CYAN = (121, 192, 255) # #79c0ff
+GREEN = (63, 185, 80) # #3fb950
+PURPLE = (188, 140, 255) # #bc8cff
+YELLOW = (210, 153, 34) # #d29922
+RED = (248, 81, 73) # #f85149
+
+FONTDIR = Path(os.environ.get("WINDIR", r"C:\Windows")) / "Fonts"  # Windows fonts dir
+
+
+def font(name: str, size: int) -> ImageFont.FreeTypeFont:
+    for candidate in (FONTDIR / name, Path(name)):
+        try:
+            return ImageFont.truetype(str(candidate), size * SS)
+        except OSError:
+            continue
+    return ImageFont.load_default()  # last resort; the requested `size` is not applied
+
+
+# Font roles
+def f_ui(size: int) -> ImageFont.FreeTypeFont:  # Segoe UI regular
+    return font("segoeui.ttf", size)
+
+
+def f_ui_b(size: int) -> ImageFont.FreeTypeFont:  # Segoe UI bold
+    return font("segoeuib.ttf", size)
+
+
+def f_mono(size: int) -> ImageFont.FreeTypeFont:  # Consolas regular
+    return font("consola.ttf", size)
+
+
+def f_mono_b(size: int) -> ImageFont.FreeTypeFont:  # Consolas bold
+    return font("consolab.ttf", size)
+
+
+def gradient_bg(w: int, h: int) -> Image.Image:
+    img = Image.new("RGB", (w, h), BG_TOP)
+    px = img.load()
+    for y in range(h):
+        t = y / max(1, h - 1)
+        r = round(BG_TOP[0] + (BG_BOT[0] - BG_TOP[0]) * t)
+        g = round(BG_TOP[1] + (BG_BOT[1] - BG_TOP[1]) * t)
+        b = round(BG_TOP[2] + (BG_BOT[2] - BG_TOP[2]) * t)
+        for x in range(w):
+            px[x, y] = (r, g, b)
+    return img
+
+
+def add_glow(img: Image.Image, cx: int, cy: int, radius: int, color, alpha: int) -> None:
+    layer = Image.new("RGBA", img.size, (0, 0, 0, 0))
+    d = ImageDraw.Draw(layer)
+    d.ellipse([cx - radius, cy - radius, cx + radius, cy + radius], fill=color + (alpha,))
+    layer = layer.filter(ImageFilter.GaussianBlur(radius // 2))
+    img.paste(Image.alpha_composite(img.convert("RGBA"), layer).convert("RGB"), (0, 0))
+
+
+def spaced_text(d, xy, text, fnt, fill, tracking):
+    """Draw text with extra letter-spacing (tracking in unscaled px)."""
+    x, y = xy
+    for ch in text:
+        d.text((x, y), ch, font=fnt, fill=fill)
+        x += d.textlength(ch, font=fnt) + tracking * SS
+
+
+def pill(d, x, y, label, fnt, fg, bg, pad_x=16, pad_y=9):
+    w = d.textlength(label, font=fnt)
+    h = (fnt.getbbox("Hg")[3] - fnt.getbbox("Hg")[1])
+    x1 = x + w + pad_x * 2 * SS
+    y1 = y + h + pad_y * 2 * SS
+    d.rounded_rectangle([x, y, x1, y1], radius=(h // 2 + pad_y * SS), fill=bg, outline=BORDER, width=SS)
+    d.text((x + pad_x * SS, y + pad_y * SS - fnt.getbbox("Hg")[1]), label, font=fnt, fill=fg)
+    return x1
+
+
+def window_chrome(d, x0, y0, x1, y1, title, bar_h=44):
+    d.rounded_rectangle([x0, y0, x1, y1], radius=16 * SS, fill=PANEL, outline=BORDER, width=SS + SS // 2)
+    # title bar separator
+    d.line([x0 + SS, y0 + bar_h * SS, x1 - SS, y0 + bar_h * SS], fill=BORDER, width=SS)
+    cy = y0 + (bar_h // 2) * SS
+    for i, col in enumerate((RED, YELLOW, GREEN)):
+        cx = x0 + (24 + i * 26) * SS
+        r = 7 * SS
+        d.ellipse([cx - r, cy - r, cx + r, cy + r], fill=col)
+    tf = f_mono(16)
+    tw = d.textlength(title, font=tf)
+    d.text(((x0 + x1) / 2 - tw / 2, cy - (tf.getbbox("Hg")[3] - tf.getbbox("Hg")[1]) / 2 - tf.getbbox("Hg")[1]),
+           title, font=tf, fill=MUTED)
+
+
+def downsave(img: Image.Image, w: int, h: int, path: Path) -> None:
+    out = img.resize((w, h), Image.LANCZOS)
+    path.parent.mkdir(parents=True, exist_ok=True)
+    out.save(path, "PNG", optimize=True)
+    kb = path.stat().st_size / 1024
+    print(f" {path} {w}x{h} {kb:.0f} KB")
+
+
+# --------------------------------------------------------------------------- #
+# Social preview: 1280 x 640
+# --------------------------------------------------------------------------- #
+def build_social(out: Path) -> None:
+    W, H = 1280, 640
+    w, h = W * SS, H * SS
+    img = gradient_bg(w, h)
+    add_glow(img, int(w * 0.92), int(h * 0.08), 360 * SS, BLUE, 46)
+    add_glow(img, int(w * 0.04), int(h * 0.98), 320 * SS, PURPLE, 34)
+    d = ImageDraw.Draw(img)
+
+    LM = 72 * SS
+
+    # eyebrow
+    dot_r = 6 * SS
+    ey = 82 * SS
+    d.ellipse([LM, ey - dot_r, LM + 2 * dot_r, ey + dot_r], fill=GREEN)
+    spaced_text(d, (LM + 22 * SS, 74 * SS),
+                "LOCAL-FIRST · NO NETWORK BY DEFAULT · MCP-READY",
+                f_ui_b(15), DIM, 2)
+
+    # wordmark (split color)
+    wm = f_mono_b(80)
+    wy = 104 * SS
+    d.text((LM, wy), "codebase", font=wm, fill=FG)
+    seg = d.textlength("codebase", font=wm)
+    d.text((LM + seg, wy), "-index", font=wm, fill=BLUE)
+
+    # tagline
+    d.text((LM, 214 * SS), "Local codebase indexing for AI coding agents",
+           font=f_ui(36), fill=FG2)
+
+    # terminal
+    x0, y0, x1, y1 = LM, 290 * SS, (W - 72) * SS, 540 * SS
+    window_chrome(d, x0, y0, x1, y1, "codebase-index — search")
+    bx = x0 + 30 * SS
+    mono = f_mono(21)
+    cw = d.textlength("0", font=mono)
+    by = y0 + 64 * SS
+
+    # command line
+    d.text((bx, by), "$", font=f_mono_b(21), fill=GREEN)
+    d.text((bx + cw * 2, by), "codebase-index search ", font=mono, fill=FG)
+    cmd_w = d.textlength("codebase-index search ", font=mono)
+    d.text((bx + cw * 2 + cmd_w, by), '"where is auth implemented?"', font=mono, fill=CYAN)
+
+    # results header
+    hy = by + 46 * SS
+    d.text((bx, hy), "Top matches", font=f_mono(18), fill=MUTED)
+
+    rows = [
+        ("1", "src/auth/AuthService.ts", "0.92", "exact symbol match", GREEN),
+        ("2", "src/routes/auth.ts", "0.78", "FTS · 4 callers", BLUE),
+        ("3", "src/middleware/auth.ts", "0.65", "path · FTS match", MUTED),
+    ]
+    col_rank = bx
+    col_path = bx + cw * 3
+    col_score = bx + cw * 31
+    col_reason = bx + cw * 38
+    ry = hy + 36 * SS
+    for rank, path, score, reason, scol in rows:
+        d.text((col_rank, ry), rank, font=f_mono_b(20), fill=BLUE)
+        d.text((col_path, ry), path, font=mono, fill=CYAN)
+        d.text((col_score, ry), score, font=f_mono_b(20), fill=scol)
+        d.text((col_reason, ry), reason, font=mono, fill=FG2)
+        ry += 33 * SS
+
+    # chips
+    chips = ["Tree-sitter", "SQLite FTS5", "Graph impact", "MCP server"]
+    cf = f_ui_b(16)
+    cx = LM
+    cy = 574 * SS
+    for c in chips:
+        cx = pill(d, cx, cy, c, cf, FG2, PANEL_BAR) + 12 * SS
+
+    downsave(img, W, H, out)
+
+
+# --------------------------------------------------------------------------- #
+# README demo still: 1200 x 760
+# --------------------------------------------------------------------------- #
+def build_demo(out: Path) -> None:
+    W, H = 1200, 760
+    w, h = W * SS, H * SS
+    img = gradient_bg(w, h)
+    add_glow(img, int(w * 0.5), int(h * -0.05), 460 * SS, BLUE, 26)
+    d = ImageDraw.Draw(img)
+
+    # terminal
+    x0, y0, x1, y1 = 48 * SS, 56 * SS, (W - 48) * SS, 660 * SS
+    window_chrome(d, x0, y0, x1, y1, "bash — codebase-index")
+
+    mono = f_mono(20)
+    mono_b = f_mono_b(20)
+    bx = x0 + 32 * SS
+    cw = d.textlength("0", font=mono)
+    lh = 30 * SS
+    y = y0 + 70 * SS
+
+    def col(n):  # x position at character column n
+        return bx + cw * n
+
+    # command
+    d.text((bx, y), "$", font=mono_b, fill=GREEN)
+    d.text((col(2), y), "codebase-index search ", font=mono, fill=FG)
+    cmdw = d.textlength("codebase-index search ", font=mono)
+    d.text((col(2) + cmdw, y), '"where is user authentication implemented?"', font=mono, fill=CYAN)
+    y += lh * 2
+
+    d.text((bx, y), "Top matches:", font=mono_b, fill=FG); y += lh
+    d.text((bx, y), "Rank Path Symbols Score Reason",
+           font=mono, fill=MUTED); y += lh
+
+    table = [
+        ("1", "src/auth/AuthService.ts", "AuthService, login", "0.92", "exact symbol match", GREEN),
+        ("2", "src/routes/auth.ts", "loginHandler, logout", "0.78", "FTS · 4 callers", BLUE),
+        ("3", "src/middleware/auth.ts", "requireAuth", "0.65", "path · FTS match", MUTED),
+    ]
+    for rank, path, syms, score, reason, scol in table:
+        d.text((col(2), y), rank, font=mono_b, fill=BLUE)
+        d.text((col(7), y), path, font=mono, fill=CYAN)
+        d.text((col(32), y), syms, font=mono, fill=FG2)
+        d.text((col(53), y), score, font=mono_b, fill=scol)
+        d.text((col(60), y), reason, font=mono, fill=FG2)
+        y += lh
+    y += lh
+
+    d.text((bx, y), "Recommended reads:", font=mono_b, fill=FG); y += lh
+    reads = [
+        ("1.", "src/auth/AuthService.ts:12-148", "matched AuthService, login(), validatePassword()"),
+        ("2.", "src/routes/auth.ts:20-91", "/login route calls AuthService.login()"),
+        ("3.", "src/middleware/auth.ts:5-42", "auth middleware validates sessions"),
+    ]
+    for n, loc, reason in reads:
+        d.text((col(2), y), n, font=mono, fill=MUTED)
+        d.text((col(5), y), loc, font=mono_b, fill=CYAN)
+        y += lh
+        d.text((col(5), y), "reason: " + reason, font=mono, fill=MUTED)
+        y += lh + 4 * SS
+
+    # footer wordmark + tagline
+    fy = (H - 64) * SS
+    wm = f_mono_b(26)
+    d.text((48 * SS, fy), "codebase-index", font=wm, fill=FG)
+    seg = d.textlength("codebase-index", font=wm)
+    d.text((48 * SS + seg + 14 * SS, fy + 6 * SS),
+           "local hybrid index · no network by default", font=f_ui(17), fill=MUTED)
+
+    downsave(img, W, H, out)
+
+
+def main() -> None:
+    root = Path(__file__).resolve().parent.parent
+    assets = root / "assets"
+    print("Generating assets:")
+    build_social(assets / "social-preview.png")
+    build_demo(assets / "demo.png")
+    print("Done.")
+
+
+if __name__ == "__main__":
+    main()
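`FONTDIR` in `scripts/gen_assets.py` points only at the Windows fonts folder, so on macOS or Linux `font()` will most likely fall through to Pillow's default font. One possible way to widen the lookup is sketched below. The directory list reflects common system font locations and is an assumption, as are the helper names `font_dirs` and `find_font`; nothing here is part of the PR.

```python
"""Hypothetical cross-platform font lookup for scripts/gen_assets.py."""
from __future__ import annotations

from pathlib import Path


def font_dirs(platform: str, env: dict[str, str], home: Path) -> list[Path]:
    """Candidate font directories for a platform string such as sys.platform."""
    if platform.startswith("win"):
        return [Path(env.get("WINDIR", r"C:\Windows")) / "Fonts"]
    if platform == "darwin":
        return [
            Path("/System/Library/Fonts"),
            Path("/Library/Fonts"),
            home / "Library" / "Fonts",
        ]
    # Assumption: everything else follows the usual Linux/BSD layout.
    return [
        Path("/usr/share/fonts"),
        Path("/usr/local/share/fonts"),
        home / ".local" / "share" / "fonts",
        home / ".fonts",
    ]


def find_font(names: list[str], dirs: list[Path]) -> Path | None:
    """Return the first existing file matching any name, searching dirs recursively."""
    for name in names:
        for directory in dirs:
            if not directory.is_dir():
                continue
            for hit in directory.rglob(name):
                if hit.is_file():
                    return hit
    return None
```

A caller could pass `font_dirs(sys.platform, dict(os.environ), Path.home())` together with a preference list such as `["consola.ttf", "DejaVuSansMono.ttf"]` and hand the result to `ImageFont.truetype`. Which fallback fonts are actually installed is machine-dependent, so the rendered assets may differ between platforms.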
diff --git a/src/codebase_index/cli.py b/src/codebase_index/cli.py
index 9bf6657..0e18ccc 100644
--- a/src/codebase_index/cli.py
+++ b/src/codebase_index/cli.py
@@ -36,11 +36,6 @@
# --- global state resolved from common options --------------------------------------------------
-def _todo(name: str) -> None:
- typer.echo(f"[codebase-index] '{name}' is not implemented yet (M0 scaffold). See docs/ROADMAP.md")
- raise typer.Exit(code=0)
-
-
def _ensure_index(ctx: "typer.Context") -> tuple[Path, Any]:
from .indexer.pipeline import build_index
from .service import resolve_db
@@ -650,9 +645,66 @@ def mcp(
@app.command()
-def clean(yes: bool = typer.Option(False, "--yes", help="Skip confirmation.")) -> None:
-    """Remove the per-project cache (keeps the skill)."""
-    _todo("clean")
+def clean(
+    ctx: typer.Context,
+    yes: bool = typer.Option(False, "--yes", help="Skip the confirmation prompt."),
+    all_cache: bool = typer.Option(
+        False,
+        "--all",
+        help="Remove the whole cache dir (index DB, resolved config, graph exports, "
+        "skill backups), not just the index database.",
+    ),
+    json_flag: bool = typer.Option(False, "--json", help="Emit machine-readable JSON."),
+) -> None:
+    """Reset the local index. Default removes the index DB; --all wipes the cache dir.
+
+    The installed skill (in .claude/skills/) is never touched. Rebuild with
+    `codebase-index index`.
+    """
+    import json as _json
+    import shutil
+
+    from .service import cache_dir_for, resolve_db
+
+    is_json = json_flag or bool(ctx.obj and ctx.obj.get("json"))
+    quiet = bool(ctx.obj and ctx.obj.get("quiet"))
+    root_opt = ctx.obj.get("root") if ctx.obj else None
+    db_path, cfg = resolve_db(root_opt)
+    cache_dir = cache_dir_for(cfg)
+
+    if all_cache:
+        targets = [cache_dir]
+    else:
+        # The index database plus its SQLite WAL/SHM sidecar files.
+        targets = [db_path, *(db_path.with_name(db_path.name + s) for s in ("-wal", "-shm"))]
+    existing = [p for p in targets if p.exists()]
+
+    if not existing:
+        if is_json:
+            typer.echo(_json.dumps({"removed": [], "existed": False}))
+        elif not quiet:
+            typer.echo("Nothing to clean (no cache found).")
+        raise typer.Exit(code=0)
+
+    if not yes and not is_json and sys.stdin.isatty():
+        what = "the entire cache directory" if all_cache else "the index database"
+        where = cache_dir if all_cache else db_path
+        typer.confirm(f"Remove {what} at {where}?", abort=True)
+
+    removed: list[str] = []
+    for path in existing:
+        if path.is_dir():
+            shutil.rmtree(path)
+        else:
+            path.unlink()
+        removed.append(str(path))
+
+    if is_json:
+        typer.echo(_json.dumps({"removed": removed, "existed": True}))
+    elif not quiet:
+        typer.echo(
+            f"Removed {len(removed)} item(s). Run `codebase-index index` to rebuild."
+        )
@app.command()
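The two decisions inside the new `clean` command (which paths are targeted, and when the prompt appears) can be read in isolation. The sketch below restates them as pure functions for illustration only; `clean_targets` and `needs_confirmation` are hypothetical names that do not exist in `cli.py`, and the logic is copied from the hunk above rather than from any separate specification.

```python
"""Illustrative restatement of the target and prompt logic in `clean`."""
from __future__ import annotations

from pathlib import Path


def clean_targets(db_path: Path, cache_dir: Path, all_cache: bool) -> list[Path]:
    """Paths `clean` would consider, before filtering to those that exist."""
    if all_cache:
        return [cache_dir]
    # SQLite in WAL mode keeps "<db>-wal" and "<db>-shm" next to the database.
    return [db_path, *(db_path.with_name(db_path.name + s) for s in ("-wal", "-shm"))]


def needs_confirmation(yes: bool, is_json: bool, stdin_is_tty: bool) -> bool:
    """Only interactive, non-JSON runs without --yes are prompted."""
    return not yes and not is_json and stdin_is_tty
```

Read this way, a run with piped stdin or with `--json` proceeds to deletion without a prompt even when `--yes` is absent, which callers scripting `clean` may want to keep in mind.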
diff --git a/tests/test_clean_cli.py b/tests/test_clean_cli.py
new file mode 100644
index 0000000..c102825
--- /dev/null
+++ b/tests/test_clean_cli.py
@@ -0,0 +1,87 @@
+"""`clean` resets the local index (documented in README/FAQ/ARCHITECTURE §5).
+
+Until 1.3.x it was a stub; these lock in the real reset behavior and the
+"never touch the installed skill" guarantee.
+"""
+
+from __future__ import annotations
+
+import json
+
+from typer.testing import CliRunner
+
+from codebase_index.cli import app
+
+runner = CliRunner()
+
+
+def _make_project(tmp_path):
+    (tmp_path / ".git").mkdir()
+    src = tmp_path / "src"
+    src.mkdir()
+    (src / "app.py").write_text("def greet(name):\n return f'hi {name}'\n", encoding="utf-8")
+    return tmp_path
+
+
+def _cache_dir(root):
+    return root / ".claude" / "cache" / "codebase-index"
+
+
+def test_clean_removes_index_db_but_keeps_cache_dir(tmp_path):
+    root = _make_project(tmp_path)
+    assert runner.invoke(app, ["--root", str(root), "index"]).exit_code == 0
+
+    db = _cache_dir(root) / "index.sqlite"
+    assert db.exists()
+
+    result = runner.invoke(app, ["--root", str(root), "clean", "--yes", "--json"])
+    assert result.exit_code == 0, result.output
+    payload = json.loads(result.output)
+    assert payload["existed"] is True
+    assert any("index.sqlite" in p for p in payload["removed"])
+    assert not db.exists()
+    # default clean is a DB reset, not a cache wipe
+    assert _cache_dir(root).exists()
+
+
+def test_clean_all_wipes_cache_dir(tmp_path):
+    root = _make_project(tmp_path)
+    assert runner.invoke(app, ["--root", str(root), "index"]).exit_code == 0
+    assert _cache_dir(root).exists()
+
+    result = runner.invoke(app, ["--root", str(root), "clean", "--all", "--yes", "--json"])
+    assert result.exit_code == 0, result.output
+    assert json.loads(result.output)["existed"] is True
+    assert not _cache_dir(root).exists()
+
+
+def test_clean_never_removes_installed_skill(tmp_path):
+    root = _make_project(tmp_path)
+    assert runner.invoke(app, ["--root", str(root), "init", "--target", "claude"]).exit_code == 0
+    assert runner.invoke(app, ["--root", str(root), "index"]).exit_code == 0
+
+    skill = root / ".claude" / "skills" / "codebase-index" / "SKILL.md"
+    assert skill.is_file()
+
+    assert runner.invoke(app, ["--root", str(root), "clean", "--all", "--yes"]).exit_code == 0
+    assert skill.is_file(), "clean must keep the installed skill"
+
+
+def test_clean_is_a_noop_when_nothing_to_clean(tmp_path):
+    root = _make_project(tmp_path)
+    result = runner.invoke(app, ["--root", str(root), "clean", "--yes", "--json"])
+    assert result.exit_code == 0, result.output
+    payload = json.loads(result.output)
+    assert payload["existed"] is False
+    assert payload["removed"] == []
+
+
+def test_clean_rebuild_cycle(tmp_path):
+    root = _make_project(tmp_path)
+    assert runner.invoke(app, ["--root", str(root), "index"]).exit_code == 0
+    assert runner.invoke(app, ["--root", str(root), "clean", "--yes"]).exit_code == 0
+
+    db = _cache_dir(root) / "index.sqlite"
+    assert not db.exists()
+    assert runner.invoke(app, ["--root", str(root), "index"]).exit_code == 0
+    assert db.exists()