A Neo4j code knowledge graph for TypeScript codebases: index NestJS and React code, then answer architecture questions with Cypher.
graphrag-code turns a TypeScript/TSX repository into a queryable code knowledge graph: a structured retrieval backend for Claude Code, Claude, and other AI coding agents. It walks the AST, recognises framework constructs (NestJS controllers, modules, DI; React components and hooks), and loads the result into Neo4j. Your agent can then ask architectural questions (dependency chains, endpoint inventories, component usage, DI hubs) in Cypher, instead of fuzzy-matching code chunks with embeddings.
Built at Leyton CognitX to make large TypeScript monorepos legible to humans, to Claude, and to LLM agents alike.
pipx install cognitx-codegraph
cd /path/to/your-repo
codegraph init

codegraph init asks 4-5 short questions (which packages to index, which package boundaries to enforce, whether to install the Claude Code surface + GitHub Actions gate + local Neo4j) and then:
- Writes .claude/commands/ (7 slash commands), .github/workflows/arch-check.yml, .arch-policies.toml, docker-compose.yml, and a CLAUDE.md snippet.
- Starts a local Neo4j container via docker compose up -d.
- Runs the first index.
- Prints what to query next.
You're fully set up in ~2 minutes. Want everything without prompts? codegraph init --yes. Want just the files and no Docker? codegraph init --yes --skip-docker --skip-index.
Full walkthrough: codegraph/docs/init.md. Policy reference: codegraph/docs/arch-policies.md.
- Framework-aware parsing: not just imports. Controllers, injectables, modules, entities, React components and hooks are first-class nodes.
- Neo4j-backed: every relationship is a Cypher query away. Dependency walks, shortest paths, DI chains, orphan detection, all out of the box.
- Claude Code & AI agent native: the typed graph is a structured retrieval backend for Claude Code, Claude, and other coding agents that need architectural context, not just nearest-neighbour code chunks.
- Monorepo-friendly: scope indexing to specific packages (twenty-server, twenty-front, …) and exclude build/test artefacts by default.
- Batteries included: a Typer CLI (index, query, validate), Docker Compose for Neo4j, and a library of example Cypher queries.
- Why a code knowledge graph?
- Using with Claude Code & AI agents
- Architecture
- Quickstart
- Graph schema
- Example queries
- Configuration
- Roadmap
- Contributing
- Contributors
- Star history
- License
Vector search over raw code chunks is a blunt instrument. It finds lexically similar snippets, not architecturally relevant ones. Questions like "which services does this controller transitively depend on?", "who injects AuthService?", or "which React components use this hook?" are graph queries, not similarity queries.
graphrag-code gives an LLM (or a human) the structured backbone it needs:
- Retrieval-augmented generation (RAG) over a TypeScript codebase with typed traversals instead of opaque embeddings.
- Architecture audits: find hubs, cycles, orphans, tangled modules.
- Safer refactors: understand the blast radius of a change before you make it.
- Onboarding: let new engineers query the codebase in plain Cypher instead of reading files top-to-bottom.
graphrag-code is designed as a drop-in retrieval backend for agentic coding workflows. The typical pattern for Claude Code (and any other LLM coding agent, such as Cursor, Aider, Continue, or custom MCP clients):
- Index your repo once (see Quickstart): codegraph.cli index walks the AST and loads the graph into Neo4j.
- Expose the graph to your agent, either via a thin MCP server, a CLI wrapper the agent can shell out to, or direct Bolt queries from tool-call handlers.
- Let the agent ask architectural questions in Cypher before editing code.
Claude Code and other coding agents work best with structured, low-noise context. Vector search over code chunks pulls back things that look similar; a typed graph answers the question the agent is actually asking:
| Agent question | Graph query |
|---|---|
| "What would break if I rename AuthService?" | Reverse INJECTS + IMPORTS* traversal |
| "What endpoints does UserController expose?" | EXPOSES direct lookup |
| "Which React components call useAuth?" | USES_HOOK lookup |
| "How is this file reached from the auth entrypoint?" | shortestPath on IMPORTS |
| "Which services are DI hubs I should treat as core?" | INJECTS aggregation |
All answered in single-digit milliseconds, with zero tokens spent on retrieving irrelevant snippets.
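To make the first row of the table concrete, here is a minimal Python sketch that builds a parameterised reverse-INJECTS query. The function name `blast_radius_query` and the depth cap are hypothetical; the `Class` label, its `name` property, and the `INJECTS` direction are taken from the example queries in this README. It only builds the query text and is not codegraph's own implementation.

```python
def blast_radius_query(class_name: str, max_depth: int = 3) -> tuple[str, dict]:
    """Return (cypher, params) listing classes that transitively inject `class_name`."""
    # The hop bound is interpolated into the query text rather than passed as a
    # parameter (Cypher has traditionally not accepted parameters in
    # variable-length bounds), so it is validated strictly first.
    if isinstance(max_depth, bool) or not isinstance(max_depth, int):
        raise ValueError("max_depth must be an integer")
    if not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be between 1 and 10")

    cypher = (
        "MATCH (target:Class {name: $name})"
        f"<-[:INJECTS*1..{max_depth}]-(caller:Class) "
        "RETURN DISTINCT caller.name AS caller ORDER BY caller"
    )
    # The class name always travels as a query parameter, never as query text.
    return cypher, {"name": class_name}
```

The returned pair can be handed to whatever executes Cypher for your agent (the CLI, the MCP query_graph tool, or a Bolt session). Keeping the class name in the parameter map means an agent-supplied name cannot change the shape of the query.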
codegraph ships a first-class Model Context Protocol stdio server. Install the optional extra, add one block to Claude Code's config, and ten typed tools appear in the agent's tool menu, so there is no more shelling out to codegraph query.
pip install "codegraph[mcp]"

In ~/.claude.json (or your Claude Desktop config):
{
  "mcpServers": {
    "codegraph": {
      "command": "codegraph-mcp",
      "type": "stdio",
      "env": {
        "CODEGRAPH_NEO4J_URI": "bolt://localhost:7688",
        "CODEGRAPH_NEO4J_USER": "neo4j",
        "CODEGRAPH_NEO4J_PASS": "codegraph123"
      }
    }
  }
}

Restart Claude Code. Ten tools become available:
| Tool | Purpose |
|---|---|
| query_graph(cypher, limit) | Read-only Cypher escape hatch. Writes are rejected at the session level, so an LLM-generated DROP/DELETE can't mutate the graph. |
| describe_schema() | Labels, relationship types, and per-label node counts: a cheap way for an agent to learn what's in the graph at session start. |
| list_packages() | Every indexed monorepo package with its detected framework, version, TypeScript flag, package manager, and detection confidence. |
| callers_of_class(class_name, max_depth) | Blast-radius traversal over INJECTS / EXTENDS / IMPLEMENTS. The canonical "what breaks if I rename X" query. |
| endpoints_for_controller(controller_name) | HTTP routes exposed by a NestJS controller class (method + path + handler). |
| files_in_package(name, limit) | List files belonging to a :Package by name. |
| hook_usage(hook_name, limit) | Which components / functions use a given React hook. |
| gql_operation_callers(op_name, op_type, limit) | Who calls a GraphQL query / mutation / subscription, optionally narrowed by type. |
| most_injected_services(limit) | Rank @Injectable classes by number of unique callers: the classic "DI hub detection" query. |
| find_class(name_pattern, limit) | Case-sensitive substring search over class names, backed by the class_name index. |
All ten tools share a single long-lived Neo4j driver and open sessions in READ_ACCESS mode. Configuration is env-var only (the same CODEGRAPH_NEO4J_* vars the CLI uses). The server is stdio-only, with no network exposure.
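If you manage your Claude config with a script, the mcpServers entry shown earlier in this section can be merged in without disturbing other servers. The sketch below is an illustration only: `add_codegraph_server` is a hypothetical helper, and it works on an already-parsed config dict, so no file layout beyond the JSON shown in this README is assumed.

```python
import copy

# Server entry copied from the JSON block in this README.
CODEGRAPH_SERVER = {
    "command": "codegraph-mcp",
    "type": "stdio",
    "env": {
        "CODEGRAPH_NEO4J_URI": "bolt://localhost:7688",
        "CODEGRAPH_NEO4J_USER": "neo4j",
        "CODEGRAPH_NEO4J_PASS": "codegraph123",
    },
}


def add_codegraph_server(config: dict) -> dict:
    """Return a copy of `config` with the codegraph MCP server registered.

    Other top-level keys and other MCP servers are left untouched, and the
    input dict is not mutated.
    """
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})["codegraph"] = copy.deepcopy(CODEGRAPH_SERVER)
    return merged
```

To apply it, load the existing file with json.load, pass the result through the helper, and write it back with json.dump. Back up the original first.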
TypeScript repo                Parser               Graph loader           Neo4j
┌────────────────┐      ┌──────────────────┐      ┌──────────────┐      ┌──────────┐
│ *.ts / *.tsx   │ ───► │ AST walk         │ ───► │ Typed nodes  │ ───► │ Property │
│ packages/*/src │      │ + framework      │      │ + edges      │      │ graph    │
└────────────────┘      │ detection        │      └──────────────┘      └────┬─────┘
                        │ (NestJS / React) │                                 │
                        └──────────────────┘                                 ▼
                                                                       Cypher / RAG
All indexing is local: your code never leaves the machine, and Neo4j runs in a Docker container alongside the CLI.
cd codegraph
# 1. Python environment
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt
# 2. Neo4j (Docker)
docker compose up -d
# Browser UI: http://localhost:7475 (neo4j / codegraph123)
# Bolt: bolt://localhost:7688
# 3. Tell codegraph which packages in your monorepo to index.
# Either drop a codegraph.toml at the repo root (see Configuration below)
# or pass --package flags:
.venv/bin/python -m codegraph.cli index /path/to/your-monorepo \
  --package packages/server --package packages/web
# 4. Sanity-check the load
.venv/bin/python -m codegraph.cli validate /path/to/your-monorepo
# 5. Ask a question
.venv/bin/python -m codegraph.cli query \
  "MATCH (e:Endpoint) RETURN e.method, e.path LIMIT 10"

Nodes
| Kind | Examples / notes |
|---|---|
| Package | One per configured monorepo package with detected framework (React / Next.js / Vue / Angular / Svelte / SvelteKit / NestJS / Odoo), framework_version, typescript, styling, router, state_management, ui_library, build_tool, package_manager, and a detection confidence. Framework detection walks up to the monorepo root for lockfiles + workspace-hoisted dependencies. |
| File | TS/TSX files with language, LOC, and framework flags (is_controller, is_component, …) |
| Class | NestJS controllers, injectables, modules, entities, resolvers |
| Function | Exported functions and React components |
| Interface | TypeScript interfaces |
| Endpoint | HTTP routes exposed by controllers (method + path + handler) |
| Hook | React hooks (custom and built-in usage sites) |
| Decorator | Framework decorators applied to classes/methods |
| External | Symbols imported from node_modules |
Edges
IMPORTS, IMPORTS_EXTERNAL, DEFINES_CLASS, DEFINES_FUNC, DEFINES_IFACE, EXPOSES, INJECTS, EXTENDS, IMPLEMENTS, RENDERS, USES_HOOK, DECORATED_BY, BELONGS_TO (File → Package).
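Because the edge vocabulary is small and fixed, an agent wrapper can cheaply flag LLM-generated Cypher that uses a relationship type the graph doesn't have. The sketch below is a rough heuristic, not part of codegraph: `unknown_edge_types` is a hypothetical helper, and its regex only understands simple bracketed patterns such as `[:INJECTS]` or `[r:INJECTS|EXTENDS*1..3]` (relationships with inline property maps are not inspected).

```python
import re

# Edge types listed in the schema section of this README.
KNOWN_EDGES = {
    "IMPORTS", "IMPORTS_EXTERNAL", "DEFINES_CLASS", "DEFINES_FUNC",
    "DEFINES_IFACE", "EXPOSES", "INJECTS", "EXTENDS", "IMPLEMENTS",
    "RENDERS", "USES_HOOK", "DECORATED_BY", "BELONGS_TO",
}

# Captures the type list of a relationship pattern: optional variable, a colon,
# one or more types separated by "|", then an optional "*min..max" range.
_REL_PATTERN = re.compile(r"\[\s*\w*\s*:\s*([\w|:\s]+?)\s*(?:\*[^\]]*)?\]")


def unknown_edge_types(cypher: str) -> set[str]:
    """Return relationship types used in `cypher` that are not in KNOWN_EDGES."""
    used = set()
    for match in _REL_PATTERN.finditer(cypher):
        for name in re.split(r"[|:\s]+", match.group(1)):
            if name:
                used.add(name)
    return used - KNOWN_EDGES
```

An empty result means every relationship type the helper could see is part of the schema. A non-empty result is a cheap signal to send the query back to the model before it reaches Neo4j.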
A handful of the queries in codegraph/queries.md:
// 1. Every HTTP endpoint with its controller
MATCH (c:Class {is_controller:true})-[:EXPOSES]->(e:Endpoint)
RETURN c.name, e.method, e.path, e.handler
ORDER BY c.name, e.path;
// 2. Most-injected services (DI hubs)
MATCH (svc:Class {is_injectable:true})<-[:INJECTS]-(caller:Class)
RETURN svc.name, count(caller) AS injections
ORDER BY injections DESC LIMIT 20;
// 3. Which React components use a given hook?
MATCH (:Hook {name:'useAuth'})<-[:USES_HOOK]-(c:Function)
RETURN c.name, c.file;
// 4. Transitive dependencies of a file
MATCH (:File {path:$start})-[:IMPORTS*1..3]->(d:File)
RETURN DISTINCT d.path;

See codegraph/queries.md for the full catalogue.
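To see what query 4 computes, it can help to mirror it on a toy in-memory import map. The sketch below is only an illustration of the `IMPORTS*1..3` semantics (every file reachable in one to three hops, deduplicated); the function name and the dict-of-lists input are hypothetical, and the real traversal runs inside Neo4j.

```python
def transitive_imports(imports: dict, start: str, max_depth: int = 3) -> set:
    """Files reachable from `start` by following 1..max_depth import edges.

    `imports` maps a file path to the list of files it imports.
    """
    reached = set()
    frontier = {start}
    for _ in range(max_depth):
        next_frontier = set()
        for path in frontier:
            for dep in imports.get(path, ()):
                # Each file is recorded once, like RETURN DISTINCT.
                if dep not in reached:
                    reached.add(dep)
                    next_frontier.add(dep)
        if not next_frontier:
            break
        frontier = next_frontier
    return reached
```

Note that the start file itself appears in the result only when an import cycle leads back to it within the hop limit, which matches how the variable-length pattern behaves.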
codegraph has no hardcoded packages. You tell it which packages to index via a codegraph.toml at the repo root, a [tool.codegraph] block in your existing pyproject.toml, or --package flags on the CLI. Config file values are loaded first; CLI flags override them.
codegraph.toml (preferred: a standalone file, no interference with Python tooling):
# Paths are relative to the repo root. Each entry should be a TypeScript
# package directory (i.e. contain a package.json / tsconfig.json so path
# aliases can be resolved).
packages = [
  "packages/server",
  "packages/web",
]

# Optional: these extend the built-in defaults, they don't replace them.
exclude_dirs = ["custom-build", "fixtures"]
exclude_suffixes = [".gen.ts"]

pyproject.toml (if you already have one and want everything in one place):
[tool.codegraph]
packages = ["packages/server", "packages/web"]

CLI override (wins over either file):
codegraph index . --package packages/server --package packages/web

If no config file exists and no --package flags are passed, index stops with a clear error. There are no Twenty-specific or other defaults.
codegraph also indexes Python codebases. The detector auto-picks the language based on the package directory: if the directory contains __init__.py, it's parsed as Python; otherwise it's parsed as TypeScript (with tsconfig.json / package.json).
Install the optional [python] extra to enable the Python frontend:
pip install "codegraph[python]"

Then point --package at a Python package root (the directory containing __init__.py):
codegraph index . --package src/my_package

Stage 1 indexes: modules (.py files), classes, functions, methods, imports (relative + absolute + import x as y), class inheritance, and decorators. Framework detection (FastAPI / Flask / Django / Typer / pytest) and route extraction land in Stage 2.
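The directory-based language choice described above amounts to a single file check. A minimal sketch, with `detect_language` as a hypothetical name and the return strings chosen for illustration:

```python
from pathlib import Path


def detect_language(package_dir) -> str:
    """Return "python" if the directory holds an __init__.py, else "typescript"."""
    if (Path(package_dir) / "__init__.py").is_file():
        return "python"
    return "typescript"
```

One practical consequence of a rule like this: a Python project root without its own __init__.py (for example a src/ layout) would be treated as TypeScript, which is why --package should point at the package directory itself.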
Controlled via environment variables (defaults match the bundled docker-compose.yml):
| Variable | Default |
|---|---|
| CODEGRAPH_NEO4J_URI | bolt://localhost:7688 |
| CODEGRAPH_NEO4J_USER | neo4j |
| CODEGRAPH_NEO4J_PASS | codegraph123 |
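If you write your own tool-call handler against the same database, reading these variables with the documented defaults keeps it in step with the CLI. A minimal sketch; the `Neo4jSettings` class and `load_settings` function are hypothetical names, and only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    user: str
    password: str


def load_settings(env=None) -> Neo4jSettings:
    """Read the CODEGRAPH_NEO4J_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Neo4jSettings(
        uri=env.get("CODEGRAPH_NEO4J_URI", "bolt://localhost:7688"),
        user=env.get("CODEGRAPH_NEO4J_USER", "neo4j"),
        password=env.get("CODEGRAPH_NEO4J_PASS", "codegraph123"),
    )
```

With the official Neo4j Python driver installed, the result plugs into `GraphDatabase.driver(settings.uri, auth=(settings.user, settings.password))`.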
Indexing always skips node_modules, dist, build, .next, .turbo, .nuxt, .svelte-kit, .vercel, coverage, generated, __generated__, .cache, .parcel-cache, plus *.d.ts and *.stories.{ts,tsx}. Add to these via exclude_dirs / exclude_suffixes in your config; those keys extend the defaults, they don't replace them.
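The "extend, don't replace" behaviour can be pictured as a union of the built-in lists with your own. The sketch below is an approximation for illustration: `is_excluded` is a hypothetical helper, and it assumes directory names are matched against any path component and suffixes against the end of the file name.

```python
from pathlib import PurePosixPath

# Built-in exclusions as listed in this README.
DEFAULT_EXCLUDE_DIRS = {
    "node_modules", "dist", "build", ".next", ".turbo", ".nuxt", ".svelte-kit",
    ".vercel", "coverage", "generated", "__generated__", ".cache", ".parcel-cache",
}
DEFAULT_EXCLUDE_SUFFIXES = (".d.ts", ".stories.ts", ".stories.tsx")


def is_excluded(rel_path: str, extra_dirs=(), extra_suffixes=()) -> bool:
    """True if `rel_path` (POSIX-style, relative to the repo root) would be skipped."""
    dirs = DEFAULT_EXCLUDE_DIRS | set(extra_dirs)          # user entries extend defaults
    suffixes = DEFAULT_EXCLUDE_SUFFIXES + tuple(extra_suffixes)
    path = PurePosixPath(rel_path)
    # Check every directory component, but not the file name itself.
    if any(part in dirs for part in path.parts[:-1]):
        return True
    return path.name.endswith(suffixes)
```

For example, with exclude_suffixes = [".gen.ts"] from the sample config, `is_excluded("packages/web/src/api.gen.ts", extra_suffixes=[".gen.ts"])` is true while the built-in rules alone would keep the file.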
For confidential routes, components, or files that shouldn't reach the graph (and therefore shouldn't reach any LLM agent querying it), drop a .codegraphignore file at the repo root. Syntax is gitignore-style, plus two codegraph extensions:
# Standard gitignore: file paths
**/admin/**
**/*.secret.ts
!**/admin/public/** # negation: re-include a subtree
# Route patterns: match RouteNode.path
@route:/admin/*
@route:/settings/system/*
# Component patterns: match React component / NestJS class names
@component:*Admin*
@component:*UserManagement*

Override the default location with --ignore-file PATH on the CLI or ignore_file = "custom/.ignore" in codegraph.toml. .codegraphignore is additive on top of BASE_EXCLUDE_DIRS: it doesn't replace them.
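To show how the three kinds of rule separate, here is a small Python sketch that sorts the lines of a .codegraphignore file into buckets and evaluates the two codegraph-specific kinds with shell-style globbing. The names (`IgnoreRules`, `parse_ignore`, `route_ignored`, `component_ignored`) are hypothetical, only full-line comments are handled, and fnmatch-style matching is an assumption rather than the tool's documented semantics. File patterns are collected but not evaluated, since faithful gitignore matching is better left to a dedicated library such as pathspec.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class IgnoreRules:
    file_patterns: list = field(default_factory=list)       # gitignore-style, kept verbatim
    route_patterns: list = field(default_factory=list)      # from "@route:" lines
    component_patterns: list = field(default_factory=list)  # from "@component:" lines


def parse_ignore(text: str) -> IgnoreRules:
    """Split ignore-file text into file, route, and component patterns."""
    rules = IgnoreRules()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        if line.startswith("@route:"):
            rules.route_patterns.append(line[len("@route:"):])
        elif line.startswith("@component:"):
            rules.component_patterns.append(line[len("@component:"):])
        else:
            rules.file_patterns.append(line)
    return rules


def route_ignored(rules: IgnoreRules, route_path: str) -> bool:
    return any(fnmatchcase(route_path, p) for p in rules.route_patterns)


def component_ignored(rules: IgnoreRules, name: str) -> bool:
    return any(fnmatchcase(name, p) for p in rules.component_patterns)
```

With the sample file above, a route such as /admin/users and a component named SuperAdminPanel would both be filtered out under these assumptions, while /settings/profile and UserCard would pass through.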
- Incremental re-indexing on file changes
- Python and Go language frontends
- First-class MCP server exposing the graph to LLM agents: shipped (see Exposing the graph to Claude via MCP)
- Pre-built RAG retrievers for common architecture questions
PRs welcome. The repository uses protected branches:
- main: production-ready code. All changes land here via PR.
- release: release-candidate branch. Stabilisation before tagging.
- hotfix: urgent fixes that need to skip the normal cycle.
Every PR into main, release, or hotfix requires a Code Owner review (see CODEOWNERS). Please open an issue before a large refactor so we can align on direction.
Thanks to everyone who has helped shape graphrag-code:
Made with contrib.rocks.
If graphrag-code helps you make sense of a TypeScript monorepo, a star helps others find it too.
Licensed under the Apache License 2.0. Copyright © Leyton CognitX and contributors.