OpenCodeRAG is a local-first RAG plugin for semantic code search. It converts your codebase into vector indices and retrieves relevant code chunks for natural-language queries. The primary aim is to save tokens by replacing full-file reads with targeted chunk retrieval, and to speed up tool calls in large codebases. It integrates with OpenCode and also works as a standalone MCP server or CLI tool.
You don't need a dedicated GPU to run smaller embedding models; they perform well on modern CPUs.
⚠️ Note: Don't confuse this with the npm package opencode-rag (a discontinued project by a different author).
# 1. Clone and install
git clone https://github.com/your-org/OpenCodeRAG.git
cd OpenCodeRAG
npm install --legacy-peer-deps
npm run build
./install.sh
# 2. Initialize in your project
cd /path/to/your/project
opencode-rag init
# 3. Index your workspace
opencode-rag index
# 4. Search
opencode-rag query "authentication middleware"

Prerequisites: Node.js v22+ and Ollama (default) or another LLM host (OpenAI-, Google-, or Anthropic-compatible).
| Feature | Description |
|---|---|
| MCP server | opencode-rag mcp - stdio-based MCP server exposing search_semantic, get_file_skeleton, find_usages tools for any MCP-compatible client |
| AST chunking | 26 languages via tree-sitter (TS, JS, Python, Java, Go, Rust, C/C++, C#, Ruby, Kotlin, Swift, Bash, PHP, PowerShell, SQL, JSON, HTML, CSS, XML, YAML, TOML, INI, Dockerfile, Markdown, LaTeX, Razor) |
| Document support | Markdown, LaTeX, PDF, DOCX, DOC, Excel |
| Image indexing | Describe images via vision LLMs (Ollama, OpenAI, Anthropic, Gemini) and store descriptions as searchable vector chunks |
| Hybrid search | Vector similarity + TF×IDF keyword fusion |
| OpenCode plugin | Auto-inject context, read-tool override, TUI settings, Ctrl+Enter to add RAG context, MCP registration on init |
| Incremental indexing | File-hash manifest, background watcher, auto-rebuild on corruption |
| Privacy-first | All processing stays local (when using Ollama) |
| CLI | index, query, status, list, show, dump, clear, init, ui, mcp |
| Programmatic API | TypeScript search(), indexWorkspace(), getContext(), validateConfig(), scanWorkspace(), createBackgroundIndexer(), getIndexStatusSummary() |
| Proxy-aware | Corporate proxy support with raw-socket localhost bypass |
| OpenAI / Cohere | Alternate embedding providers with API key auto-resolution |
| Evaluation | Session-level token tracking, RAG-on vs RAG-off comparison, tiktoken BPE counting |
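The "Incremental indexing" row above mentions a file-hash manifest. As a rough illustration of the idea (not OpenCodeRAG's actual implementation), the sketch below compares a manifest saved by a previous run with one computed from the current workspace and reports which files would need re-indexing. The `Manifest` shape, the `diffManifests` name, and the sample hashes are all assumptions made for this example.

```typescript
// Hypothetical manifest shape: relative file path -> content hash (hex string).
type Manifest = Record<string, string>;

interface ManifestDiff {
  added: string[];   // files that are new since the last run
  changed: string[]; // files whose content hash differs
  removed: string[]; // files that no longer exist
}

// Own-property check that is safe for keys such as "constructor".
function hasEntry(manifest: Manifest, path: string): boolean {
  return Object.prototype.hasOwnProperty.call(manifest, path);
}

// Compare the previous manifest with the current one. Only added and changed
// files need to be re-chunked and re-embedded; removed files need purging.
function diffManifests(previous: Manifest, current: Manifest): ManifestDiff {
  const diff: ManifestDiff = { added: [], changed: [], removed: [] };
  for (const path of Object.keys(current)) {
    if (!hasEntry(previous, path)) {
      diff.added.push(path);
    } else if (previous[path] !== current[path]) {
      diff.changed.push(path);
    }
  }
  for (const path of Object.keys(previous)) {
    if (!hasEntry(current, path)) {
      diff.removed.push(path);
    }
  }
  return diff;
}

// Sample data with made-up hashes, for illustration only.
const previousManifest: Manifest = { "src/a.ts": "aa11", "src/b.ts": "bb22", "src/c.ts": "cc33" };
const currentManifest: Manifest = { "src/a.ts": "aa11", "src/b.ts": "bb99", "src/d.ts": "dd44" };
const manifestDiff = diffManifests(previousManifest, currentManifest);
console.log(manifestDiff);
```

Unchanged files (here `src/a.ts`) appear in none of the three lists, which is what lets an incremental run skip them.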
A browser-based dashboard for exploring the indexed vector database: browse and inspect chunks, and evaluate OpenCode sessions in terms of retrieved chunks, relevance scores, and more.
Launch with opencode-rag ui. See Web UI documentation for details.
| Document | Contents |
|---|---|
| Architecture | Module design, data flow, tech stack |
| Installation | Full install guide, global setup, uninstall |
| Configuration | All options: embedding, indexing, retrieval, description, plugin |
| Chunking | Language matrix, adding new chunkers, custom chunkers |
| Embedding | Providers, model recommendations, proxy, dimension probing |
| Retrieval | Pipeline, hybrid search, score fusion, caching |
| Plugin | OpenCode integration, tools, hooks, TUI, troubleshooting |
| CLI Reference | All commands, options, examples |
| Web UI | Dashboard, chunk browser, file explorer, compare view |
| Evaluation | Token analysis, session logging, benchmark runner, accuracy guide |
| Development | Setup, testing, conventions, adding providers |
| Troubleshooting | Common issues, logging, debugging |
| Roadmap | Completed items, short/mid/long-term plans |
OpenCodeRAG can index image files (PNG, JPEG, WebP, etc.) by sending them to a vision-capable LLM and storing the generated text descriptions as searchable vector chunks. This makes visual assets discoverable via natural language queries (e.g., "login screen screenshot", "architecture diagram").
Supported providers: Ollama, OpenAI, Anthropic, Google Gemini, and compatible providers.
Disabled by default — enable in opencode-rag.json to opt in.
OpenCodeRAG ships a stdio-based MCP (Model Context Protocol) server that exposes semantic code tools to any MCP-compatible client (Claude Desktop, OpenCode, Cursor, etc.).
opencode-rag mcp

| Tool | Description |
|---|---|
| search_semantic | Vector + keyword hybrid search across the indexed codebase |
| get_file_skeleton | AST-based file outline (functions, classes, methods) |
| find_usages | Find all references to a symbol by name |
Clients can configure the MCP server manually, or opencode-rag init auto-registers it.
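The search_semantic tool is described above as a vector + keyword hybrid search. The following is a minimal sketch of one common way to fuse the two signals, assuming a weighted sum of cosine similarity and a max-normalized TF×IDF score. The weight `alpha`, the chunk shape, and all function names are assumptions for illustration; the project's real fusion formula may differ.

```typescript
// Hypothetical chunk shape; field names are illustrative only.
interface IndexedChunk {
  id: string;
  text: string;
  embedding: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Lowercase and split on anything that is not a letter, digit, or underscore.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9_]+/).filter((t) => t.length > 0);
}

// Sum of TF×IDF over the query terms for one chunk (smoothed IDF).
function keywordScore(terms: string[], chunk: IndexedChunk, corpus: IndexedChunk[]): number {
  const tokens = tokenize(chunk.text);
  if (tokens.length === 0) return 0;
  let score = 0;
  for (const term of terms) {
    const tf = tokens.filter((t) => t === term).length / tokens.length;
    const df = corpus.filter((c) => tokenize(c.text).indexOf(term) !== -1).length;
    score += tf * (Math.log((1 + corpus.length) / (1 + df)) + 1);
  }
  return score;
}

// Weighted fusion: alpha weights the vector score, (1 - alpha) the keyword score.
// Keyword scores are divided by the corpus maximum so both parts share a 0..1 scale.
function hybridSearch(
  query: string,
  queryEmbedding: number[],
  corpus: IndexedChunk[],
  alpha: number = 0.7, // assumed default, not taken from the project
): ScoredChunk[] {
  const terms = tokenize(query);
  const keyword = corpus.map((c) => keywordScore(terms, c, corpus));
  const maxKeyword = keyword.reduce((m, s) => Math.max(m, s), 0);
  return corpus
    .map((c, i) => ({
      id: c.id,
      score:
        alpha * cosineSimilarity(queryEmbedding, c.embedding) +
        (1 - alpha) * (maxKeyword > 0 ? keyword[i] / maxKeyword : 0),
    }))
    .sort((x, y) => y.score - x.score);
}

// Toy corpus with 2-dimensional embeddings, for illustration only.
const toyCorpus: IndexedChunk[] = [
  { id: "logger", text: "function logRequest(req) { write(req.url) }", embedding: [1, 0] },
  { id: "auth", text: "function authMiddleware(req) { verifyToken(req) }", embedding: [0.8, 0.6] },
  { id: "math", text: "function add(a, b) { return a + b }", embedding: [0, 1] },
];
const ranked = hybridSearch("verifyToken", [1, 0], toyCorpus);
console.log(ranked.map((r) => r.id));
```

In this toy corpus the "logger" chunk is closest to the query embedding, but the exact identifier match lifts "auth" to the top once the keyword score is fused in. That is the kind of case keyword fusion is meant to help with.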
OpenCodeRAG registers tools that agents can invoke directly. Agents discover these tools via the OpenCode skill system - when opencode-rag init runs, it creates .opencode/skills/opencode-rag/SKILL.md which teaches agents the recommended workflow:
1. Skeleton first - get_file_skeleton(filePath) to orient in a file
2. Find usages - find_usages(symbolName) before editing any symbol
3. Search - search_semantic(query) to find relevant code
4. Read - use read on specific line ranges
5. Edit - make changes with full context
| Tool | Purpose | When to Use |
|---|---|---|
| search_semantic | General-purpose code retrieval | Before any code task when you haven't read the relevant code |
| get_file_skeleton | Quick file overview via AST | Before reading a large file to decide which sections matter |
| find_usages | Symbol reference search | Before editing any function, variable, or class |
| read (optional) | RAG-enhanced file read | Full file contents with supplementary context chunks |
When using OpenCode, the plugin enhances your agent with three discovery mechanisms:
opencode-rag init creates .opencode/skills/opencode-rag/SKILL.md - an OpenCode skill that teaches agents the tool workflow. Agents load it on demand via the skill tool, keeping token overhead minimal.
After every message you send, the plugin searches your vector-indexed codebase:
- contentType: "file_paths" (default): A lightweight list of relevant files is appended (e.g., src/plugin.ts (typescript, lines 10-42, relevance 0.92)). Agents must call search_semantic or find_usages to retrieve actual code, which nudges them toward proactive tool usage.
- contentType: "chunks": High-confidence code chunks (score ≥ 0.85) are injected directly into your prompt, giving the agent instant context without a tool-call round-trip.
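The difference between the two contentType modes can be sketched as follows. This is an illustrative approximation, not the plugin's code: the `RetrievedChunk` shape, the `buildInjectedContext` name, and the layout of the "chunks" output are assumptions. Only the file-list line format and the 0.85 threshold are taken from the description above.

```typescript
// Hypothetical shape of one retrieval hit.
interface RetrievedChunk {
  filePath: string;
  language: string;
  startLine: number;
  endLine: number;
  score: number;
  content: string;
}

type ContentType = "file_paths" | "chunks";

// Build the text appended to a message for the given mode.
function buildInjectedContext(
  hits: RetrievedChunk[],
  contentType: ContentType,
  minChunkScore: number = 0.85, // threshold quoted in the text above
): string {
  if (contentType === "file_paths") {
    // One line per hit: path, language, line range, relevance.
    return hits
      .map(
        (h) =>
          `${h.filePath} (${h.language}, lines ${h.startLine}-${h.endLine}, relevance ${h.score.toFixed(2)})`,
      )
      .join("\n");
  }
  // "chunks" mode: keep only high-confidence hits and include their code.
  // The header-comment layout here is an assumption for readability.
  return hits
    .filter((h) => h.score >= minChunkScore)
    .map((h) => `// ${h.filePath}:${h.startLine}-${h.endLine}\n${h.content}`)
    .join("\n\n");
}

// Sample hits with made-up contents and scores.
const sampleHits: RetrievedChunk[] = [
  { filePath: "src/plugin.ts", language: "typescript", startLine: 10, endLine: 42, score: 0.92, content: "export function activate() {}" },
  { filePath: "src/util.ts", language: "typescript", startLine: 1, endLine: 8, score: 0.61, content: "export const noop = () => {};" },
];
const fileList = buildInjectedContext(sampleHits, "file_paths");
const chunkText = buildInjectedContext(sampleHits, "chunks");
console.log(fileList);
console.log(chunkText);
```

With these sample hits, "file_paths" mode lists both files, while "chunks" mode injects only the src/plugin.ts chunk because the second hit scores below 0.85.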
When chunks are indexed, a brief tool list is prepended to the system prompt so agents know the tools exist. This is skipped when no chunks are indexed to save tokens.
Press Ctrl+Enter in the terminal prompt to retrieve and append a relevant file list to your current prompt. Press Ctrl+Alt+Enter to append full code chunks instead. The query is taken from your typed text - if the prompt is empty, a toast reminds you to type first. Results are appended directly to the prompt as formatted code blocks with file paths, line ranges, and relevance scores. No dialogs are opened. Keybindings are configurable in the settings menu (Ctrl+Shift+R).
OpenCodeRAG tracks token usage, RAG injection overhead, and costs across sessions. Compare RAG-on vs RAG-off to measure whether semantic retrieval saves tokens.
opencode-rag eval:sessions # list sessions
opencode-rag eval:analyze <id> # detailed breakdown
opencode-rag eval:compare <A> <B> # side-by-side comparison

Token counting uses tiktoken BPE (cl100k_base) for accurate code tokenization. See Evaluation documentation for details.
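To make the RAG-on vs RAG-off comparison concrete, here is a small sketch of the arithmetic such a comparison involves. The `SessionTotals` fields, the `compareSessions` name, and the token counts are hypothetical and chosen only to show the calculation; they are not measurements and do not reflect the tool's actual output format.

```typescript
// Hypothetical per-session totals; field names are illustrative.
interface SessionTotals {
  id: string;
  promptTokens: number;      // assumed to already include any injected RAG context
  completionTokens: number;
  ragInjectedTokens: number; // portion of promptTokens added by RAG injection
}

interface SessionComparison {
  savedTokens: number;
  savedPercent: number;
  injectionOverheadPercent: number;
}

// Compare a RAG-off baseline session with a RAG-on session.
function compareSessions(ragOff: SessionTotals, ragOn: SessionTotals): SessionComparison {
  const totalOff = ragOff.promptTokens + ragOff.completionTokens;
  const totalOn = ragOn.promptTokens + ragOn.completionTokens;
  const savedTokens = totalOff - totalOn;
  return {
    savedTokens,
    savedPercent: totalOff === 0 ? 0 : (savedTokens * 100) / totalOff,
    injectionOverheadPercent: totalOn === 0 ? 0 : (ragOn.ragInjectedTokens * 100) / totalOn,
  };
}

// Made-up numbers, for illustration only.
const baselineSession: SessionTotals = { id: "A", promptTokens: 42000, completionTokens: 3000, ragInjectedTokens: 0 };
const ragSession: SessionTotals = { id: "B", promptTokens: 24000, completionTokens: 3000, ragInjectedTokens: 1800 };
const comparison = compareSessions(baselineSession, ragSession);
console.log(comparison);
```

A negative savedTokens value would mean the RAG-on session consumed more tokens than the baseline, which is the case such a comparison is meant to surface.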
100% local by default. Embeddings are generated locally via Ollama. The vector database stays in your project directory. No source code or embeddings leave your machine unless you explicitly configure a third-party API.
MIT
