Merged
73 changes: 47 additions & 26 deletions README.md
@@ -29,9 +29,9 @@ Use Codegraph when you need fast structural answers about a repo without relying
- Export graph data as JSON, Mermaid, DOT, or SQLite, then inspect it from scripts, Markdown renderers, Graphviz, or SQL tools.
- Keep one workflow across source languages, monorepos, and graph-first document and template formats instead of stitching together separate tools.

For a first pass, run `orient --root . --budget small --pretty`.
Use `packet get`, `search`, `explain`, `impact`, and `review` from the recommended next commands when you need deeper architecture, symbol, or change context.
For PR, worktree, or sweeping review tasks, start with `review --base HEAD --head WORKTREE --summary` or `impact --base HEAD --head WORKTREE --pretty`.
For unfamiliar repos, start with `orient --root . --budget small --pretty`, then use `search` and `explain` to land on one concrete code anchor.
For daily change work, start with `review --base HEAD --head WORKTREE --summary`; use `impact --base HEAD --head WORKTREE --pretty` as the broader blast-radius map when needed.
Search is code-first by default in hybrid mode, and search, explain, and review packets now include analysis labels so reduced-mode or mixed-semantics runs stay visible.
Detailed command contracts and JSON shapes live in [docs/cli.md](./docs/cli.md).

## Features
@@ -77,23 +77,28 @@ npm run build

`npm run build` always rebuilds `dist/`. If Cargo is available, it also requires the local native workspace build to succeed; if Cargo is unavailable, it still completes with the JavaScript build output and a warning.

Then start with orientation and follow the returned commands:
Then start with the default workflow. For code reviews, the lowest-friction loop is `review --summary` first, `impact --pretty` only when you need blast radius, then `search` or `explain` on a file or symbol named in the summary; use review JSON when a follow-up needs stable handles.

```bash
# initial repo orientation with next-step suggestions
# compact reviewer handoff for current edits
node ./dist/cli.js review --base HEAD --head WORKTREE --summary

# broader blast-radius map when the review packet needs expansion
node ./dist/cli.js impact --base HEAD --head WORKTREE --pretty

# bounded repo orientation with next-step suggestions
node ./dist/cli.js orient --root . --budget small --pretty

# find and explain a concrete anchor
node ./dist/cli.js search "build review report" --json
node ./dist/cli.js explain src/cli.ts

# optional runtime and artifact health check
node ./dist/cli.js doctor

# optional broader architecture summary
node ./dist/cli.js inspect ./src --limit 20

# find and explain a concrete anchor
node ./dist/cli.js packet get src/cli.ts --pretty
node ./dist/cli.js search "graph json" --json
node ./dist/cli.js explain src/cli.ts

# build a graph for product code
node ./dist/cli.js graph --root . ./src --compact-json --output codegraph.json

@@ -122,11 +127,14 @@ Choose output by consumer:
Use these as starting points, then see [docs/cli.md](./docs/cli.md) for all flags, defaults, and output contracts.

```bash
# fastest code-review handoff for current edits
codegraph review --base HEAD --head WORKTREE --summary
codegraph impact --base HEAD --head WORKTREE --pretty

# repo orientation and bounded follow-up
codegraph orient --root . --budget small --pretty
codegraph packet get src/cli/graph.ts --pretty
codegraph search "graph json" --json
codegraph explain file:src/cli/graph.ts
codegraph search "build review report" --json
codegraph explain src/review.ts

# semantic navigation
codegraph goto <file> <line> <column>
@@ -178,19 +186,32 @@ Recommended next
```json
{
"schemaVersion": 1,
"query": "graph json",
"query": "build review report",
"mode": "hybrid",
"resultCount": 20,
"totalCandidates": 7911,
"analysis": {
"label": "native semantic"
},
"resultCount": 1,
"totalCandidates": 42,
"results": [
{
"handle": "chunk:docs%2Fcli.md:646",
"kind": "chunk",
"label": "docs/cli.md:646",
"file": "docs/cli.md",
"score": 282,
"rankReasons": ["exact phrase match in docs text", "text token match: graph, json"],
"followUps": ["codegraph chunk docs/cli.md", "codegraph deps docs/cli.md --json"]
"handle": "symbol:src%2Freview.ts:buildReviewReport:214:1",
"kind": "symbol",
"label": "buildReviewReport",
"file": "src/review.ts",
"score": 248,
"provenance": {
"surface": "code",
"capability": "semantic",
"analysisMode": "semantic",
"backend": "native",
"confidence": "high"
},
"rankReasons": ["exact phrase match in symbol name", "symbol token match: build, review, report"],
"followUps": [
"codegraph explain \"symbol:src%2Freview.ts:buildReviewReport:214:1\"",
"codegraph refs --file src/review.ts --line 214 --col 1 --pretty"
]
}
]
}
@@ -319,7 +340,7 @@ For a custom location, use `codegraph skill install --target <path>/skills/codeg

## Using as a library

Use the TypeScript API when another program needs deterministic file packs, review packets, or model prompts. CLI `--pretty` and `--summary` output is also useful for model-readable triage, but library callers should keep structured fields until the final UI or prompt boundary.
Use the TypeScript API when another program needs deterministic file packs, review packets, or model prompts. CLI `--pretty` and `--summary` output is also useful for model-readable triage, but library callers should keep structured fields until the final UI or prompt boundary. For repeated calls, prefer one warm `createCodeReviewSession()` or one agent/MCP session over rebuilding ad hoc indexes.

```ts
import {
@@ -380,8 +401,8 @@ The supported package import surface includes the compatibility root export, `@l
- Repo triage: run `codegraph inspect ./src --limit 20`, then follow with `codegraph hotspots ./src --limit 20` or `codegraph unresolved` to focus the next pass.
- Duplicate cleanup: run `codegraph duplicates ./src --min-confidence medium` for the default pretty triage view, or add `--json` when a downstream tool needs grouped duplicate data.
- Symbol navigation: use `codegraph goto <file> <line> <column>` and `codegraph refs --file <file> --line <line> --col <column> --pretty` when a question is about definitions or semantic usages rather than matching strings.
- PR review: run `codegraph impact --base origin/main --head HEAD --pretty` for a ranked map, `codegraph review --base origin/main --head HEAD --summary` for a compact reviewer handoff with actionable candidate tests, or redirect plain `review` output when a downstream tool needs the full JSON bundle.
- Worktree review: run `codegraph impact --base HEAD --head WORKTREE --pretty` for current staged and unstaged tracked-file changes, then `codegraph review --base HEAD --head WORKTREE --summary` for a compact handoff. Use `--head STAGED` to compare `HEAD` against the current index.
- PR review: run `codegraph review --base origin/main --head HEAD --summary` for a compact reviewer handoff with actionable candidate tests, add `codegraph impact --base origin/main --head HEAD --pretty` when you need a ranked blast-radius map, or redirect plain `review` output when a downstream tool needs the full JSON bundle.
- Worktree review: run `codegraph review --base HEAD --head WORKTREE --summary` for current staged and unstaged tracked-file changes, then add `codegraph impact --base HEAD --head WORKTREE --pretty` only when the handoff needs wider blast-radius context. Use `--head STAGED` to compare `HEAD` against the current index.
- Graph exploration: run `codegraph graph --root . ./src --compact-json --output codegraph.json` for scripts, `--mermaid` for Markdown renderers, or `--dot` for Graphviz. Bare `codegraph graph` writes `codegraph.json`; add `--stdout` when piping.
- Public API inspection: run `codegraph apisurface` to summarize exported symbols before refactors, reviews, or release checks.

10 changes: 5 additions & 5 deletions codegraph-skill/codegraph/SKILL.md
@@ -18,16 +18,14 @@ Do not use Codegraph as the only evidence for runtime behavior; pair it with tes

## First Move

Start bounded:
For PR, worktree, or sweeping review tasks, start with the compact reviewer handoff:

```bash
codegraph orient --root . --budget small --pretty
codegraph review --base HEAD --head WORKTREE --summary
```

Use `codegraph impact --base HEAD --head WORKTREE --pretty` when you need the broader blast-radius map. For unfamiliar repos without a diff, start bounded with `codegraph orient --root . --budget small --pretty`.
Use `doctor` only when install, native-runtime, or artifact health is the task.

For PR, worktree, or sweeping review tasks, start with `codegraph review --base HEAD --head WORKTREE --summary` or `codegraph impact --base HEAD --head WORKTREE --pretty` instead.

Then choose the smallest useful follow-up:

- packet: `codegraph packet get <file|symbol|sql-object|handle> --pretty`
@@ -52,6 +50,8 @@ For `orient`, `drift`, and positional graph commands, positional paths are inclu
Use readable output when a human or model will read the result.
Use JSON when the next step needs exact fields, counts, or filtering.

Hybrid search is code-first by default, and search/explain packets include analysis labels plus per-result provenance so reduced or mixed runs stay visible.

Current high-value surfaces:

- `orient --pretty`: ranked first-turn focus targets with copyable follow-ups
38 changes: 32 additions & 6 deletions docs/agent-workflows.md
@@ -6,14 +6,27 @@ Use Codegraph for structural repo questions: architecture, dependency direction,

## Start here

For code reviews, start with `review`; it is the compact handoff with changed files, changed symbols, candidate tests, risks, duplicate leads, and analysis labels.

```bash
codegraph review --base HEAD --head WORKTREE --summary
```

Add `impact` only when you need a wider blast-radius map:

```bash
codegraph impact --base HEAD --head WORKTREE --pretty
```

For an unfamiliar repo, keep the first loop bounded and actionable:

```bash
codegraph orient --root . --budget small --pretty
codegraph packet get <file-from-orient> --pretty
codegraph search "auth user" --json
codegraph explain <file-from-search-or-orient> --json
```

For PR, worktree, or sweeping review tasks, start with `codegraph review --base HEAD --head WORKTREE --summary` or `codegraph impact --base HEAD --head WORKTREE --pretty` instead of orientation.
For PR, worktree, or sweeping review tasks, prefer `review` first; use `impact` when you need the broader blast-radius map rather than the reviewer handoff.

Use `doctor` only when package/runtime state or an existing artifact path is the question.
Use `search` when the agent has a query but no handle, `explain` when it already knows a file/symbol/SQL object/handle, and `inspect` for a human-readable architecture summary.
@@ -55,11 +68,12 @@ codegraph search "handle login" --mode graph --from src/auth.ts --depth 1 --json
codegraph explain "<handle-from-search>" --json
```

Search results include stable handles, evidence, rank reasons, neighbors, follow-ups, limits, and omitted counts.
Search results include top-level `analysis` metadata plus stable handles, per-result `provenance`, evidence, rank reasons, neighbors, follow-ups, limits, and omitted counts.
`explain` accepts those handles plus file paths, symbol names, and SQL object names, then returns bounded dependencies, references, snippets, duplicate context, SQL relation facts, review context, and follow-ups.
Generated command strings quote dynamic arguments, SQL handles avoid ambiguous basenames, and omission counts stay explicit when packets hit limits.
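
As a rough illustration of consuming these fields, the sketch below filters a parsed search packet down to high-confidence code anchors and reads the top-level analysis label. The `SearchPacket`, `SearchResult`, and `Provenance` types are assumptions inferred from the short JSON example in the README, not the package's exported types, and `pickCodeAnchors` is a hypothetical helper.

```typescript
// Hypothetical types inferred from the README's short JSON example;
// the real packet may carry more fields or stricter unions.
interface Provenance {
  surface: string;
  capability: string;
  analysisMode: string;
  backend: string;
  confidence: string;
}

interface SearchResult {
  handle: string;
  kind: string;
  label: string;
  file: string;
  score: number;
  provenance?: Provenance;
  rankReasons: string[];
  followUps: string[];
}

interface SearchPacket {
  schemaVersion: number;
  query: string;
  mode: string;
  analysis?: { label: string };
  resultCount: number;
  totalCandidates: number;
  results: SearchResult[];
}

// Hypothetical helper: keep results whose provenance marks them as
// high-confidence code matches; results without provenance are dropped.
function pickCodeAnchors(packet: SearchPacket): SearchResult[] {
  return packet.results.filter(
    (r) => r.provenance?.surface === "code" && r.provenance?.confidence === "high"
  );
}

// Sample data copied from the README example, plus one result without provenance.
const samplePacket: SearchPacket = {
  schemaVersion: 1,
  query: "build review report",
  mode: "hybrid",
  analysis: { label: "native semantic" },
  resultCount: 2,
  totalCandidates: 42,
  results: [
    {
      handle: "symbol:src%2Freview.ts:buildReviewReport:214:1",
      kind: "symbol",
      label: "buildReviewReport",
      file: "src/review.ts",
      score: 248,
      provenance: {
        surface: "code",
        capability: "semantic",
        analysisMode: "semantic",
        backend: "native",
        confidence: "high",
      },
      rankReasons: ["exact phrase match in symbol name"],
      followUps: ["codegraph refs --file src/review.ts --line 214 --col 1 --pretty"],
    },
    {
      handle: "chunk:docs%2Fcli.md:646",
      kind: "chunk",
      label: "docs/cli.md:646",
      file: "docs/cli.md",
      score: 282,
      rankReasons: ["exact phrase match in docs text"],
      followUps: ["codegraph chunk docs/cli.md"],
    },
  ],
};

const anchors = pickCodeAnchors(samplePacket);
const analysisLabel = samplePacket.analysis?.label ?? "unlabeled";
console.log(analysisLabel, anchors.map((a) => a.handle));
```

Filtering on structured `provenance` rather than on `label` or `rankReasons` text keeps the check independent of how the readable output is worded.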

Agent CLI commands use the incremental index path and default to disk cache.
Hybrid search is code-first by default. Use `mode: "text"` when you specifically want documentation or prose-heavy matches to outrank implementation symbols.
Pure path/text searches skip detailed symbol graph construction; hybrid, symbol, SQL, and graph searches keep symbol-aware ranking and neighbors.
Pass shared index flags only when an agent pass must mirror a specific scan mode; see [docs/cli.md](./cli.md#agent-oriented-commands) for the canonical flag list.

@@ -82,7 +96,14 @@ See [MCP server](./mcp.md) for client configuration examples.

## Session management

For agents performing code reviews or making multiple queries, use sessions to maintain warm caches:
For agents performing code reviews or making multiple queries, use sessions to maintain warm caches. Use one of these canonical reuse models:

- library callers: one shared `createCodeReviewSession()` per repo snapshot
- agent hosts: one shared `createAgentSession()` or MCP server per repo snapshot

The local review session refreshes manually with `refresh()` and records stale-snapshot metadata in `getStats()`. Navigation checks the requested file immediately and checks config or added/removed-file drift on the stale-check interval; impact calls add an interval-throttled tracked-file scan before computing the report.
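
The idea of an interval-throttled check can be sketched in a few lines. This is a generic illustration, not the session's actual implementation: `ThrottledFreshness`, its constructor arguments, and the injected clock are all hypothetical names chosen for the example.

```typescript
// Hypothetical sketch of an interval-throttled drift check.
// The expensive scan runs at most once per interval; calls made
// inside the interval reuse the previous answer.
class ThrottledFreshness {
  private lastCheckedAt = Number.NEGATIVE_INFINITY;
  private lastResult = false;
  private intervalMs: number;
  private detectDrift: () => boolean;
  private now: () => number;

  constructor(
    intervalMs: number,
    detectDrift: () => boolean,
    now: () => number = () => Date.now()
  ) {
    this.intervalMs = intervalMs;
    this.detectDrift = detectDrift;
    this.now = now;
  }

  // Returns true when the most recent scan detected drift.
  isStale(): boolean {
    const t = this.now();
    if (t - this.lastCheckedAt < this.intervalMs) {
      return this.lastResult;
    }
    this.lastCheckedAt = t;
    this.lastResult = this.detectDrift();
    return this.lastResult;
  }
}

// Usage with a fake clock so the behavior is deterministic.
let clock = 0;
let scans = 0;
const freshness = new ThrottledFreshness(
  1000,
  () => {
    scans += 1; // stands in for an expensive tracked-file scan
    return false;
  },
  () => clock
);

freshness.isStale(); // first call always scans
clock = 500;
freshness.isStale(); // inside the interval: cached answer, no scan
clock = 1500;
freshness.isStale(); // interval elapsed: scans again
console.log(scans);
```

The trade-off is that drift occurring inside the interval goes unnoticed until the next scan, which is why a cheap per-file check on the requested file is a useful complement.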

For library callers performing repeated navigation or impact work, use sessions like this:

```ts
import { createCodeReviewSession } from "@lzehrung/codegraph";
@@ -344,13 +365,18 @@ codegraph review --base origin/main --head HEAD --include-symbol-details --max-c
codegraph review --base origin/main --head HEAD --review-depth standard > review.json
```

For current local edits, start with a ranked model-readable map, then hand off the compact review summary:
For current local edits, start with the compact review summary:

```bash
codegraph impact --base HEAD --head WORKTREE --pretty
codegraph review --base HEAD --head WORKTREE --summary
```

Add a ranked blast-radius map only when needed:

```bash
codegraph impact --base HEAD --head WORKTREE --pretty
```

Use `--head STAGED` instead of `WORKTREE` when the review should cover only the index. Keep the full JSON review bundle for scripts or agent steps that need `projectFiles`, `graphDelta`, or detailed symbol handles.

For function-call integrations, keep the JSON object as the handoff. Do not parse `review --summary` or `impact --pretty` text to recover fields that are already present in the TypeScript return values.
17 changes: 15 additions & 2 deletions docs/cli.md
@@ -10,6 +10,13 @@ Bare `codegraph graph` writes `codegraph.json` and `codegraph.err` in the curren

Numeric options such as `--limit`, `--threads`, `--depth`, `--max-refs`, and token bounds must be integers in their documented ranges; invalid numeric values fail instead of being silently clamped or ignored.
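
A minimal sketch of what fail-fast numeric validation can look like, as opposed to clamping. `parseIntegerOption` is a hypothetical helper and the ranges passed to it below are made up for the example; the real per-flag ranges are whatever this document lists for each option.

```typescript
// Hypothetical helper: reject anything that is not an integer inside
// [min, max] instead of rounding, clamping, or ignoring it.
function parseIntegerOption(name: string, raw: string, min: number, max: number): number {
  const text = raw.trim();
  if (!/^-?\d+$/.test(text)) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  const value = Number(text);
  if (value < min || value > max) {
    throw new Error(`${name} must be between ${min} and ${max}, got ${raw}`);
  }
  return value;
}

// Small wrapper used only to show which inputs are rejected.
function rejects(name: string, raw: string, min: number, max: number): boolean {
  try {
    parseIntegerOption(name, raw, min, max);
    return false;
  } catch {
    return true;
  }
}

// The ranges here are illustrative assumptions, not the documented ones.
const limit = parseIntegerOption("--limit", "20", 1, 100);
console.log(limit, rejects("--depth", "2.5", 0, 10), rejects("--threads", "500", 1, 64));
```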

Default workflow:

- code review: `codegraph review --base HEAD --head WORKTREE --summary`
- blast-radius follow-up: `codegraph impact --base HEAD --head WORKTREE --pretty`
- unfamiliar repo: `codegraph orient --root . --budget small --pretty`
- targeted follow-up: `codegraph search "<query>" --json` then `codegraph explain <handle|file|symbol>`

## Runtime selection

The CLI defaults to `--native auto`, which uses the native Tree-sitter path when a compatible native artifact is available and falls back automatically otherwise.
@@ -45,7 +52,12 @@ Cache and manifest reuse is rooted at `--root`. Reusing a project root lets comm
### Dependency graphs

```bash
# Fast code-review handoff for current local edits
codegraph review --base HEAD --head WORKTREE --summary
codegraph impact --base HEAD --head WORKTREE --pretty

# First-pass repo summary and next-step suggestions
codegraph orient --root . --budget small --pretty
codegraph inspect ./src --limit 20

# Whole-repo graph
@@ -116,8 +128,9 @@ codegraph index --workers --threads 8 --cache disk
# Search for agent-ready anchors across symbols, paths, chunks, SQL objects, and graph context
codegraph orient --root . --budget small --pretty
codegraph orient --root . ./src --budget medium --json
codegraph search "build review report" --json
codegraph explain src/review.ts --json
codegraph packet get src/cli.ts --pretty
codegraph search "validate user" --json
codegraph search "public users" --mode sql --json
codegraph search "handle login" --from src/auth.ts --mode graph --depth 1 --json
codegraph search --help
@@ -233,7 +246,7 @@ Short JSON shape:
- Use `packet get` with file paths, symbol names, SQL object names, file/symbol/chunk/SQL/graph handles, or review handles to retrieve bounded evidence plus follow-up commands.
- Agent commands reuse the incremental index path and default to disk cache. Use shared index flags such as `--cache`, `--cache-strict`, `--cache-verify`, `--threads`, `--native`, `--workers`, `--include-glob`, `--ignore-glob`, and `--no-gitignore` when the packet should match a specific scan mode.

`search` is deterministic and vectorless. `explain` resolves file paths, symbol names, SQL object names, and search handles into bounded packets with symbols, graph context, references, snippets, duplicate context, SQL facts, review tasks, candidate tests, limits, omissions, and follow-ups. Use `--max-duplicates` to tune duplicate context in `explain` and `packet get`; duplicate context also uses an internal pair budget and reports skipped duplicate work through omission counts.
`search` is deterministic and vectorless. Hybrid search is code-first by default: source symbols and implementation files outrank docs unless `--mode text` is explicit or docs are the strongest remaining evidence. Search JSON now includes top-level `analysis` metadata plus per-result `provenance` so mixed or reduced runs stay visible. `explain` resolves file paths, symbol names, SQL object names, and search handles into bounded packets with symbols, graph context, references, snippets, duplicate context, SQL facts, review tasks, candidate tests, analysis metadata, limits, omissions, and follow-ups. Use `--max-duplicates` to tune duplicate context in `explain` and `packet get`; duplicate context also uses an internal pair budget and reports skipped duplicate work through omission counts.

For SQL, prefer handles or schema-qualified names when basenames may be ambiguous. Reference and snippet omission counts are lower bounds after bounded navigation reaches its cap.

2 changes: 2 additions & 0 deletions docs/how-it-works.md
@@ -34,6 +34,7 @@ Runtime behavior, performance characteristics, architecture, extension points, a
- `.codegraph-cache/index-v1/manifest.json` stores the last indexed commit, graph options, and per-file signatures plus resolved edges.
- Incremental runs treat the manifest as a cached base graph: unchanged files keep their edges, while changed files are reparsed and their edges replaced.
- `codegraph hotspots` and `codegraph inspect` reuse the disk index cache when the manifest is present and log the manifest path, timestamp, and last commit hash to stderr.
- Agent tool wrappers and agent sessions default to incremental warm-cache reuse so repeated local and MCP queries pay the cold build cost once, then reuse compatible manifests and parsed state.
- Remove the manifest, clear `.codegraph-cache/index-v1`, or rerun with different graph flags to force a full graph rebuild.
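
The incremental rule in the bullets above (unchanged files keep their edges, changed files are reparsed and their edges replaced) can be sketched as follows. The `Manifest` shape, `updateManifest`, and the `reparse` callback are simplified assumptions for illustration and do not mirror the on-disk `manifest.json` format.

```typescript
// Hypothetical, simplified manifest: per-file signature plus resolved edges.
interface FileEntry {
  signature: string;
  edges: string[];
}
type Manifest = { [file: string]: FileEntry };

// Reuse cached edges when the signature matches; otherwise reparse.
// Files absent from currentSignatures drop out of the new manifest.
function updateManifest(
  previous: Manifest,
  currentSignatures: { [file: string]: string },
  reparse: (file: string) => string[]
): { manifest: Manifest; reparsed: string[] } {
  const manifest: Manifest = {};
  const reparsed: string[] = [];
  for (const file of Object.keys(currentSignatures)) {
    const signature = currentSignatures[file];
    const cached = Object.prototype.hasOwnProperty.call(previous, file)
      ? previous[file]
      : undefined;
    if (cached !== undefined && cached.signature === signature) {
      manifest[file] = cached; // unchanged: keep cached edges
    } else {
      manifest[file] = { signature, edges: reparse(file) }; // changed or new
      reparsed.push(file);
    }
  }
  return { manifest, reparsed };
}

const previousManifest: Manifest = {
  "src/a.ts": { signature: "s1", edges: ["src/b.ts"] },
  "src/b.ts": { signature: "s2", edges: [] },
  "src/old.ts": { signature: "s9", edges: ["src/a.ts"] },
};

const update = updateManifest(
  previousManifest,
  { "src/a.ts": "s1", "src/b.ts": "s3" },
  () => ["src/c.ts"] // stand-in for a real parse of the changed file
);
console.log(update.reparsed, Object.keys(update.manifest));
```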

### Read paths
@@ -117,6 +118,7 @@ Language adapters expose:
- Call compatibility runs only for changed callable signatures with provider-backed signature extraction and high-confidence callsite argument counts.
- Hints compare arity only. They do not perform type checking, overload resolution, data-flow analysis, macro expansion, or dynamic dispatch.
- Existing impact filters apply before hints are emitted, so ignored files and tests excluded by default stay out of call compatibility results.
- Long-lived `CodeReviewSession` instances keep cheap freshness baselines for config files and project-directory mtimes. Navigation also checks the requested file signature, while impact calls add an interval-throttled tracked-file scan before reuse. When those signals show drift, the session refreshes before serving results, and `getStats()` exposes stale/refresh metadata for callers that want to surface it.
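
To make the arity-only comparison in the call-compatibility bullets above concrete, here is a minimal sketch. `SignatureArity` and `arityHint` are hypothetical, and the handling of optional and variadic parameters is an assumption; the source only states that hints compare arity and nothing else.

```typescript
// Hypothetical description of a callable's parameter counts.
interface SignatureArity {
  required: number; // parameters without defaults
  total: number; // required plus optional parameters
  variadic: boolean; // accepts a rest/varargs parameter
}

// Returns a hint string when the callsite argument count cannot fit
// the signature, or null when arity alone shows no problem.
// No types, overloads, or data flow are considered.
function arityHint(sig: SignatureArity, argCount: number): string | null {
  if (argCount < sig.required) {
    return `too few arguments: ${argCount} given, ${sig.required} required`;
  }
  if (!sig.variadic && argCount > sig.total) {
    return `too many arguments: ${argCount} given, at most ${sig.total} accepted`;
  }
  return null;
}

// A changed signature that gained a third required parameter.
const changedSignature: SignatureArity = { required: 3, total: 3, variadic: false };
const hintForOldCall = arityHint(changedSignature, 2);
const hintForNewCall = arityHint(changedSignature, 3);
console.log(hintForOldCall, hintForNewCall);
```

Because only counts are compared, a call that passes the right number of wrongly typed arguments produces no hint, which matches the stated limits of the check.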

### 6. AST grep
