Major architectural overhaul of RagCode MCP — new service layer, deterministic AST-based tools, native Ollama client, Streamable HTTP transport, and a dramatically reduced hardware footprint. by doITmagic · Pull Request #25 · doITmagic/rag-code-mcp

doITmagic · 2026-03-04T16:39:30Z

Description

This PR is a major overhaul of RagCode MCP, focused on making it lighter, faster, and easier
to work with for both developers and AI agents. Here's what changed and why it matters.

The biggest win: runs on almost any machine

The old version required a full LLM chat model (phi3:medium, 14B parameters) running in the background
at all times. That was 5–8 GB of RAM just to keep the server alive.

We removed the chat model entirely. The server now uses only a single small embedding model
(qwen3-embedding:0.6b, 600M parameters) to encode search vectors — everything else is handled
deterministically in Go through AST analysis and Qdrant filters. The result:

Minimum RAM dropped from ~6–8 GB to ~1.5 GB
No GPU required — runs comfortably on a developer laptop
Faster startup, lower idle CPU

New and improved tools

We cleaned up the tool set significantly. Several overlapping tools from the old version were merged or
replaced with more precise, deterministic alternatives:

Gone: rag_get_function_details, rag_find_type_definition, rag_get_code_context,
rag_find_implementations, rag_search_docs, rag_hybrid_search

New in v2:

rag_search — smart router that auto-picks semantic vs exact search based on the query
rag_find_usages — finds all usages of a symbol using the AST Code Graph (no embedding needed)
rag_call_hierarchy — recursive caller/callee tree, also via AST relations
rag_read_file_context — returns code around a line with full AST context (surrounding function/class)
rag_check_update / rag_apply_update — self-update from GitHub releases

The existing rag_search_code was extended with explicit discovery (semantic) and exact (BM25 + vector)
modes, plus Graph Context Expansion that auto-fetches related definitions without extra embedding calls.

file_path is now optional in all tools — if omitted, the server detects the workspace automatically
from the last active project.

Architecture: from monolith to packages

The old entry point was a 1553-line main.go that mixed HTTP, LLM calls, indexing, tools, config, and
logging in one place. Hard to navigate, hard to test.

Now everything lives in focused, independently testable packages — each with its own README and tests:

internal/service/engine — workspace detection, background indexing, Git branch awareness
internal/service/search — semantic and hybrid search coordination
internal/service/tools — one file per MCP tool
pkg/indexer — parallel file indexing with state persistence
pkg/llm — native Ollama client (replaces langchaingo)
pkg/parser/{go,php,python,html} — pluggable parser registry with full AST extraction
pkg/workspace/{detector,registry,resolver,branchstate,watch} — robust multi-workspace tracking
pkg/telemetry — tracks how many bytes the tools saved vs reading full files

Indexer improvements

Markdown files (.md) are now indexed alongside code and searchable via include_docs=true
Concurrency is RAM-aware — 1 worker on 8 GB, up to 4 on machines with 32 GB+
A stall watchdog detects silent embedding deadlocks and recovers automatically
Only changed files are re-indexed on each save (incremental, Git-diff based)
Indexing progress is reported in real-time inside tool responses
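
The RAM-aware concurrency rule can be illustrated with a small sketch. Only the two endpoints (1 worker on 8 GB, up to 4 workers at 32 GB or more) come from the description above; the function name `workersForRAM` and the intermediate tiers are assumptions made for illustration, not the indexer's actual code.

```go
package main

import "fmt"

// workersForRAM returns how many indexing workers to run on a machine with
// totalGB of RAM. The 8 GB and 32 GB endpoints follow the PR description;
// the 16 GB and 24 GB tiers are assumed for this sketch.
func workersForRAM(totalGB int) int {
	switch {
	case totalGB >= 32:
		return 4
	case totalGB >= 24: // assumed tier
		return 3
	case totalGB >= 16: // assumed tier
		return 2
	default:
		return 1
	}
}

func main() {
	for _, gb := range []int{8, 16, 32, 64} {
		fmt.Printf("%d GB -> %d worker(s)\n", gb, workersForRAM(gb))
	}
}
```

Capping the worker count keeps peak memory bounded on small machines, since each worker holds parsed files and pending embedding requests in memory.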

Simpler server transport

The old SSE transport required agents to open a stream first (GET /sse), grab a sessionid,
then send every request to /messages?sessionid=XYZ. Many agents lost track of sessions or handled
the async stream incorrectly.

The new transport is a single POST /mcp — synchronous, stateless, response in the HTTP body.
No sessions, no streams to manage. Any agent that can send an HTTP POST can use it without session bookkeeping.
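
A minimal client sketch of the single-request flow, under stated assumptions: the JSON-RPC 2.0 envelope and the `tools/call` method follow the MCP convention, while `callTool`, `newFakeServer`, and the echoed reply shape are hypothetical helpers invented so the example runs offline. A real server may also expect an `initialize` exchange before tool calls.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// rpcRequest is a JSON-RPC 2.0 envelope, the message framing MCP uses.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  map[string]any `json:"params"`
}

// callTool sends one tool call as a single POST to <baseURL>/mcp and decodes
// the JSON reply from the HTTP body. No session ID or stream is involved.
func callTool(baseURL, tool string, args map[string]any) (map[string]any, error) {
	payload, err := json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "tools/call",
		Params:  map[string]any{"name": tool, "arguments": args},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/mcp", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json, text/event-stream")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("POST /mcp: status %d: %s", resp.StatusCode, body)
	}
	var reply map[string]any
	if err := json.Unmarshal(body, &reply); err != nil {
		return nil, err
	}
	return reply, nil
}

// newFakeServer is a stand-in for a real server (assumption for the demo).
// It only echoes the requested tool name back in the result.
func newFakeServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/mcp" {
			http.NotFound(w, r)
			return
		}
		var req rpcRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]any{
			"jsonrpc": "2.0",
			"id":      req.ID,
			"result":  map[string]any{"echo": req.Params["name"]},
		})
	}))
}

func main() {
	srv := newFakeServer()
	defer srv.Close()
	reply, err := callTool(srv.URL, "rag_search", map[string]any{"query": "workspace detection"})
	if err != nil {
		panic(err)
	}
	fmt.Println(reply["result"])
}
```

Compared with the old flow, there is no `GET /sse` step and no `sessionid` to carry between requests; each call is complete in itself.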

Type of change

New feature (non-breaking change which adds functionality)
Breaking change — legacy /sse and /messages endpoints removed; agents must use POST /mcp
Documentation update

Migration: any IDE or agent configured with url: .../sse needs to switch to url: .../mcp.
Run rag-code-install --transport auto to reconfigure automatically.

Checklist

I have performed a self-review of my own code
I have formatted my code with go fmt ./...
I have run tests go test ./... and they pass
I have verified integration with Ollama/Qdrant (if applicable)
I have updated the documentation accordingly

…tate, registry, and functional demo

## Core Modules Implemented

- **contract**: request/response/error types, validation helper, reason codes
- **resolver**: deterministic pipeline, strict mode, ambiguity handling, structured logging
- **detector**: marker-based upward root detection, security validation, metadata fallback
- **branchstate**: variant 2 branch/head tracking with persisted JSON state and cache TTL
- **registry**: confirmed workspace persistence with upsert/lookup/cleanup operations
- **tests**: AI-like scenario matrix with cross-module branchstate integration

## Key Features

- Deterministic workspace detection (no opaque fallbacks)
- Explicit confirmation workflow for ambiguous cases
- Branch-aware reindex decisions (FIRST_SEEN, BRANCH_CHANGED, HEAD_CHANGED)
- Persistent registry for confirmed workspaces
- Functional demo with IDE-like modes (file/root)
- Comprehensive unit and end-to-end test coverage

## AI Dependency Reduction

- <5% AI dependency (only optional alias resolution)
- >95% cases handled deterministically via file structure and git metadata
- Strict mode and confirmation-required workflows without AI calls

## Validation

- All modules unit-tested and passing
- End-to-end scenarios validated: first seen, branch switch, head change, non-git fallback
- Demo program validates IDE integration paths (workspace_root vs file_path)

## Files Added

- 21 new files with 2385 insertions
- Complete module documentation (TASKS.md) and architecture overview
- Self-contained V2 core ready for integration

This establishes the foundation for RagCode V2 with predictable, modular workspace detection.

…ties

- Add phased implementation checklist to v2/TASKS.md
- Add Phase 1/2 checklists to contract/resolver/tests module TASKS.md
- Align ARCHITECTURE.md with Issue #21 requirements
- Define priority labels (P0/P1) for deterministic rollout
- Separate core metadata (Phase 1) from feedback loop (Phase 2)

Phase 1 focuses on:

- branch-aware context key
- response metadata envelope
- branch mismatch risk
- invalidation hardening

Phase 2 adds:

- AI feedback contract
- candidate promotion workflow
- audit logs/metrics

All documentation in English as per project standards.

- Enhance SearchDocsTool to detect language from file path and fallback to project language
- Add SearchDocsTool tests for language selection logic
- Improve SearchDocsOnly to handle both markdown and text chunks with proper scoring
- Add workspace indexing improvements to treat non-code files as text documentation
- Add binary file detection to skip indexing of binary files
- Add comprehensive workspace scanning improvements for documentation files
- Add PLANNING.md with roadmap for branch awareness and detection improvements

- Implemented intelligent startup indexing: only re-index if the workspace collection is missing or empty.
- Added parameter to IndexWorkspace and StartIndexingAsync to allow forced re-indexing when needed.
- Updated config.yaml to exclude directory from indexing to prevent duplicate code results.
- Fixed tool to correctly handle the flag.
- Added functional test script for end-to-end SSE interaction verification.

This change significantly improves application startup time for existing workspaces and cleans up search results by ignoring irrelevant directories.

… observability

- Unified and into a single, high-performance tool
- Added parameter ( vs ) to allow agents to choose search strategies
- Implemented colorful terminal logging (Cyan) for search requests to improve branding and debugging
- Updated llms.txt and llms-full.txt to emphasize protocol-agnostic, agent-autonomous design
- Expanded tool documentation to a 13-item roadmap, including skill management and updates
- Refactored codebase: removed redundant search files and organized the directory
- Enhanced integration tests in with complex architectural queries

…gs tracker

Fixed a fatal cross-file symbol injection bug in read_file_context AST parsing. Added detailed logging to the core MCP tools, enabled selectively via MCP_LOG_LEVEL=debug. Implemented the telemetry package calculation (a heuristic of 4 bytes per token), embedded in the JSON context metadata, which reports the bytes and the ratio saved by using the tools instead of reading full files.
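
The telemetry heuristic can be sketched as follows. The 4-bytes-per-token constant comes from the commit message; the `Savings` struct, its field names, and `estimateSavings` are hypothetical and do not claim to match the telemetry package's real types.

```go
package main

import "fmt"

// bytesPerToken is the heuristic named in the commit message.
const bytesPerToken = 4

// Savings is a hypothetical summary of what a tool response avoided sending.
type Savings struct {
	BytesSaved  int
	TokensSaved int
	Ratio       float64 // fraction of the full file that was not sent
}

// estimateSavings compares the size of the whole file with the size of the
// snippet a tool actually returned.
func estimateSavings(fullFileBytes, returnedBytes int) Savings {
	saved := fullFileBytes - returnedBytes
	if saved < 0 {
		saved = 0 // the response was larger than the file; nothing saved
	}
	var ratio float64
	if fullFileBytes > 0 {
		ratio = float64(saved) / float64(fullFileBytes)
	}
	return Savings{BytesSaved: saved, TokensSaved: saved / bytesPerToken, Ratio: ratio}
}

func main() {
	s := estimateSavings(4000, 1000)
	fmt.Printf("saved %d bytes (~%d tokens, %.0f%%)\n", s.BytesSaved, s.TokensSaved, s.Ratio*100)
}
```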

- Implemented BDD testing infrastructure using Ginkgo for the tools package.
- Created robust mocks (mockVectorStore, mockLLMProvider, mockDetector) in mocks_test.go for isolated unit testing.
- Added test suites for:
  - SearchLocalIndexTool: verified semantic search and Code Graph context expansion.
  - FindUsagesTool: tested usage identification through AST relations.
  - CallHierarchyTool: validated recursive call hierarchy (incoming/outgoing).
  - ListPackageExportsTool: verified package exports for Go, PHP, and Python.
  - ReadFileContextTool: tested smart AST context extraction vs. naive fallback.
- Fix(search): ensured SearchService.ExactSearch correctly utilizes vector store filters.
- Refactor: added validation for the `symbol_name` parameter in CallHierarchyTool.
- Config: registered language analyzers in the test suite for full multi-language support.

…xactSearch via Scroll

## Phase 4: File-level parallel indexing (pkg/indexer/service.go)

- Replace sequential file loop with a worker pool (min 2, max 8 goroutines = NumCPU/2)
- Files are dispatched via buffered channel; errors collected with sync.Mutex
- Reduces wall-clock indexing time proportionally to available CPUs

## Phase 4: Race condition fix in Go AST parser (pkg/parser/go/analyzer.go)

- Remove mutable `fset *token.FileSet` from CodeAnalyzer struct
- Make `fset` a local variable in AnalyzePackage(), pass as explicit parameter to analyzeFunctionDecl(), analyzeTypeDecl(), analyzeConstantDecl(), analyzeVariableDecl()
- CodeAnalyzer is now fully stateless and safe for concurrent use under -race

## Phase 1: is_public field on parser.Symbol (pkg/parser/parser.go)

- Add `IsPublic bool` (json:"is_public") to Symbol struct
- Go analyzer: ast.IsExported(ch.Name)
- Python analyzer: strings.HasPrefix(ch.Name, "_") => false
- PHP analyzer: reads chunk.Metadata["visibility"]; public/"" => true, else false

## Phase 1: list_package_exports filters by is_public (internal/service/tools/list_package_exports.go)

- ExactSearch filter now includes {"is_public": true}
- Removed runtime isExported() string-check function (replaced by indexed field)

## ExactSearch via Scroll (pkg/storage/qdrant.go, pkg/storage/interface.go)

- Add ExactSearch(ctx, collection, filters, limit) to QdrantStore and Store interface
- Performs metadata-only Qdrant Scroll (no HNSW, no embeddings required)
- All results assigned score=1.0 (exact match semantics)
- Support nested array paths via "array[].field" notation => qdrant.NewNestedFilter
- Fix bool filter bug: use Match_Boolean for bool values instead of fmt.Sprintf keyword mismatch
- Add matchBool() helper alongside matchKeyword()
- Add ExactSearchPolyglot() to Engine (iterates all language collections for a workspace)

## Storage: Scroll-based search wired into search service (internal/service/search/search.go)

- ExactSearch exposed through SearchService for use by tools

## Engine: ExactSearchPolyglot + SearchByName (internal/service/engine/engine.go)

- ExactSearchPolyglot(ctx, wsID, filter, limit): fan-out ExactSearch across all lang collections
- SearchByName(ctx, wsID, name, limit): thin wrapper => ExactSearchPolyglot({"name": name})

## Tests added

- pkg/indexer/service_test.go: TestIndexWorkspaceParallelFiles (8 packages, -race flag)
- pkg/storage/qdrant_test.go: ExactSearch unit tests incl. bool filter, nested array, pagination
- internal/service/engine/engine_exact_search_test.go: ExactSearchPolyglot unit tests
- internal/service/search/search_test.go: SearchService.ExactSearch unit tests
- tests/functional_sse_tools_test.go: timeout 30s -> 90s; 5/5 SSE functional specs PASS

All 26 unit packages + functional SSE suite PASS (go test -race ./...)
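
The per-language `is_public` rules listed in this commit can be condensed into one function for illustration. The three rules follow the commit message; the single `isPublic` helper, its signature, and the behavior for unknown languages are assumptions for this sketch (the real analyzers set the field separately per language).

```go
package main

import (
	"fmt"
	"go/ast"
	"strings"
)

// isPublic mirrors the visibility rules from the commit message:
//   - Go: exported identifiers (first letter uppercase) are public
//   - Python: names starting with "_" are private
//   - PHP: visibility metadata "public" or "" means public
//
// Returning false for unknown languages is an assumption of this sketch.
func isPublic(lang, name, visibility string) bool {
	switch lang {
	case "go":
		return ast.IsExported(name)
	case "python":
		return !strings.HasPrefix(name, "_")
	case "php":
		return visibility == "" || visibility == "public"
	default:
		return false
	}
}

func main() {
	fmt.Println(isPublic("go", "DetectContext", ""))   // exported Go name
	fmt.Println(isPublic("python", "_helper", ""))     // underscore prefix
	fmt.Println(isPublic("php", "handle", "private")) // explicit PHP visibility
}
```

Storing the result at index time is what lets list_package_exports filter with a plain `{"is_public": true}` payload match instead of re-checking names at query time.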

- CallHierarchyInput.FilePath: added omitempty — no longer required in MCP JSON schema - IndexWorkspaceInput.FilePath: added omitempty — no longer required in MCP JSON schema - Updated call_hierarchy description: removed MANDATORY wording, added fallback note All tools now use registry fallback (last active workspace) when file_path is omitted. DetectContext already supported empty path via GetActiveWorkspace() registry fallback.

…TTL detection cache

## CollectionNameFor + CollectionName (engine.go)

- Add package-level CollectionNameFor(wsID, lang string) string
- Add WorkspaceContext.CollectionName(lang string) string (thin wrapper)
- Replace all 5 inline fmt.Sprintf("ragcode-%s-%s", ...) with the helpers

## DetectFromParams (engine.go)

- New Engine.DetectFromParams(ctx, params map[string]any) helper
- Reads file_path, workspace_root, workspace keys in priority order
- Falls through to registry fallback when none are present

## TTL detection cache (engine.go)

- Add detectionCache sync.Map + detectionCacheEntry{wctx, expiry} to Engine
- DetectContext caches resolved WorkspaceContext for 5s (detectionCacheTTL)
- Cache is skipped/invalidated when ReindexRequired=true or MismatchRisk=high
- Eliminates repeated resolver cascade calls within the same tool execution

## ContextFromWorkspace helper (tools/response.go)

- New ContextFromWorkspace(wctx *engine.WorkspaceContext) ContextMetadata
- Centralizes ContextMetadata construction from WorkspaceContext
- Accepts nil safely (returns empty ContextMetadata)

All 26 unit packages + SSE functional suite PASS
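
A minimal sketch of the TTL detection cache idea, assuming a `sync.Map` keyed by path and entries that carry an expiry time. The 5-second TTL and the skip conditions follow the commit message; the type names, the `get`/`put` methods, and the injectable `fakeClock` are hypothetical and exist only to make the example deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// WorkspaceContext is a trimmed, hypothetical version of the resolved context.
type WorkspaceContext struct {
	ID              string
	ReindexRequired bool
	MismatchRisk    string // "high" bypasses the cache
}

type cacheEntry struct {
	wctx   *WorkspaceContext
	expiry time.Time
}

// fakeClock lets the demo move time forward without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) now() time.Time      { return c.t }
func (c *fakeClock) advance(seconds int) { c.t = c.t.Add(time.Duration(seconds) * time.Second) }

type detectionCache struct {
	ttl     time.Duration
	now     func() time.Time
	entries sync.Map // path -> cacheEntry
}

func newDetectionCache(ttlSeconds int, clk *fakeClock) *detectionCache {
	return &detectionCache{ttl: time.Duration(ttlSeconds) * time.Second, now: clk.now}
}

// get returns a cached context unless it is missing or expired.
func (c *detectionCache) get(path string) (*WorkspaceContext, bool) {
	v, ok := c.entries.Load(path)
	if !ok {
		return nil, false
	}
	e := v.(cacheEntry)
	if !c.now().Before(e.expiry) {
		c.entries.Delete(path) // expired: drop and force a fresh detection
		return nil, false
	}
	return e.wctx, true
}

// put stores a context, except when a reindex is required or the mismatch
// risk is high; in those cases any existing entry is invalidated instead.
func (c *detectionCache) put(path string, w *WorkspaceContext) {
	if w == nil || w.ReindexRequired || w.MismatchRisk == "high" {
		c.entries.Delete(path)
		return
	}
	c.entries.Store(path, cacheEntry{wctx: w, expiry: c.now().Add(c.ttl)})
}

func main() {
	clk := &fakeClock{}
	cache := newDetectionCache(5, clk)
	cache.put("/repo", &WorkspaceContext{ID: "ws1"})

	_, hit := cache.get("/repo")
	fmt.Println("hit within TTL:", hit)

	clk.advance(6)
	_, hit = cache.get("/repo")
	fmt.Println("hit after TTL:", hit)
}
```

A short TTL like this mostly deduplicates detection within one tool execution while keeping staleness after a branch switch to a few seconds at most.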

…chPolyglot

Phase 2 (engine.go):

- Add WorkspaceID to SearchCodeResult
- SearchCode: embed ONCE, fan-out parallel to all lang collections
- Surface errors only when all collections fail and no results
- Add SearchWithVector to search.go (pre-computed vector, includeDocs path)

Phase 3 (search_local_index.go):

- Graph expansion: parallel goroutines bounded to maxExpansions=10
- SearchByName first (zero embedding, deterministic Qdrant Scroll)
- Fallback: SearchCode with embedding only when ExactSearch returns 0 results
- seenTargets dedup prevents redundant goroutines

call_hierarchy rewrite:

- Remove per-lang loop pattern (sequential CollectionExists + ExactSearch)
- findSymbolInfo -> engine.SearchByName (ExactSearchPolyglot internally parallel)
- resolveIncoming -> engine.ExactSearchPolyglot (single parallel call)
- resolveOutgoing -> engine.SearchByName per target
- Remove search.Service dependency and parser.SupportedLanguages() usage
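
The "embed once, fan out in parallel" pattern from Phase 2 might look roughly like this. It is a sketch with stand-in function types rather than the engine's real interfaces: `Hit`, `embedFn`, `searchFn`, `searchAll`, and `demoSearch` are hypothetical names, and sorting merged hits by score is an assumption. The collection names use the `ragcode-<wsID>-<lang>` format mentioned in the earlier commit.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

type Hit struct {
	Collection string
	Name       string
	Score      float32
}

type embedFn func(ctx context.Context, query string) ([]float32, error)
type searchFn func(ctx context.Context, collection string, vec []float32) ([]Hit, error)

// searchAll embeds the query once, then searches every collection in its own
// goroutine. An error is returned only if every collection failed.
func searchAll(ctx context.Context, query string, collections []string, embed embedFn, search searchFn) ([]Hit, error) {
	vec, err := embed(ctx, query) // the only embedding call
	if err != nil {
		return nil, err
	}
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		hits []Hit
		errs []error
	)
	for _, c := range collections {
		wg.Add(1)
		go func(collection string) {
			defer wg.Done()
			res, err := search(ctx, collection, vec)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				errs = append(errs, err)
				return
			}
			hits = append(hits, res...)
		}(c)
	}
	wg.Wait()
	if len(hits) == 0 && len(errs) > 0 && len(errs) == len(collections) {
		return nil, fmt.Errorf("all %d collections failed: %w", len(errs), errs[0])
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	return hits, nil
}

// demoSearch wires searchAll to fake embed/search functions and reports the
// number of hits, the number of embedding calls, and the error.
func demoSearch(allFail bool) (int, int, error) {
	calls := 0
	embed := func(ctx context.Context, q string) ([]float32, error) {
		calls++
		return []float32{0.1, 0.2}, nil
	}
	search := func(ctx context.Context, collection string, vec []float32) ([]Hit, error) {
		if allFail {
			return nil, fmt.Errorf("%s: unavailable", collection)
		}
		return []Hit{{Collection: collection, Name: "DetectContext", Score: 0.9}}, nil
	}
	res, err := searchAll(context.Background(), "detect workspace",
		[]string{"ragcode-ws1-go", "ragcode-ws1-python"}, embed, search)
	return len(res), calls, err
}

func main() {
	hits, calls, err := demoSearch(false)
	fmt.Println(hits, calls, err)
}
```

Since the embedding call is typically the slowest step, reusing one vector across all language collections keeps query latency close to that of a single-collection search.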

engine_searchcode_test.go (new):

- TestSearchCodeEmbedsOnce: verifies embedder called exactly once even with go+python collections
- TestSearchCodePopulatesWorkspaceID: verifies WorkspaceID != "" in SearchCodeResult
- TestSearchCodeMergesMultiLangResults: go+python results both present after fan-out
- TestSearchCodeSurfacesErrorWhenAllFail: error propagated when all collections fail

search_test.go (Phase 3 graph expansion, 3 specs):

- ExactSearch first: SearchCodeOnly NOT called with expansion limit when ExactSearch succeeds
- Fallback: SearchCodeOnly IS called with limit=2 when ExactSearch returns empty
- Dedup: 3 identical relations in payload → only 1 goroutine → 2 result items

find_usages.go:

- Remove per-lang sequential loop (CollectionExists + ExactSearch per lang)
- Replace with engine.ExactSearchPolyglot (parallel, zero embedding)
- Remove search.Service + parser dependencies

list_package_exports.go:

- Same loop removal, ExactSearchPolyglot with package filter only
- Add RelationsCount int to ExportedSymbol (AST relation count as complexity indicator)
- Add isExported(name) fallback: graceful handling of pre-is_public index entries (checks Go naming convention: first rune uppercase)
- Show Relations count in markdown output when > 0
- Remove search.Service + parser dependencies

doITmagic · 2026-03-06T12:33:54Z

I've addressed the remaining unresolved review comments in the latest commits (up to 37ea155):

docs: Updated the broken analyzer documentation markdown links in README.md.
tools: Renamed the shadowed filePath variable inside the rag_find_usages tool loop to resultFilePath to avoid confusion and future bugs.
tests: Added a loop to extractSSEData in cmd/sse-client-test/main.go that concatenates JSON payloads spanning multiple data: lines. This prevents json.Unmarshal failures.
core tests: Revisited and fixed the FindUsagesTool telemetry metrics test and the directory mock. The test now runs against a real temporary file instead of a mock that cannot be stat'ed, which confirms the byte accounting for tokens saved, and it passes locally. Also fixed the test cleanup race condition caused by overlapping background indexer goroutines.
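
For illustration, joining multi-line SSE payloads could look like the sketch below. It follows the Server-Sent Events rule that consecutive `data:` lines of one event are joined with newlines; `joinSSEData` is a hypothetical name and is not the project's `extractSSEData` implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// joinSSEData collects every "data:" line of one SSE event and joins the
// values with "\n". Other fields (event:, id:, comments) are ignored.
func joinSSEData(event string) string {
	var parts []string
	for _, line := range strings.Split(event, "\n") {
		line = strings.TrimSuffix(line, "\r") // tolerate CRLF line endings
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		value := strings.TrimPrefix(line, "data:")
		value = strings.TrimPrefix(value, " ") // one optional leading space
		parts = append(parts, value)
	}
	return strings.Join(parts, "\n")
}

func main() {
	// A JSON payload that the server split across two data: lines.
	event := "event: message\ndata: {\"tool\":\ndata:  \"rag_search\"}\n\n"

	var payload map[string]string
	if err := json.Unmarshal([]byte(joinSSEData(event)), &payload); err != nil {
		panic(err)
	}
	fmt.Println(payload["tool"])
}
```

Parsing only the first `data:` line would hand `json.Unmarshal` a truncated document, which is the failure mode the fix above describes.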

All unit tests consistently stay 🟢 Green. Let me know if everything is up to standards and ready for merge!

doITmagic · 2026-03-06T12:35:13Z

As requested, this is now fixed in 0edbad1: rag_evaluate places the b.String() output inside Message instead of Data, and the structured fields are parsed into the data map[string]interface{} (including the models mapping, if available). Thank you!

Copilot

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 5 comments.

internal/service/tools/index_workspace.go

internal/service/tools/find_usages.go

internal/service/tools/index_workspace.go

internal/service/search/search.go

internal/service/tools/find_usages.go

Copilot

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 4 comments.

internal/service/tools/index_workspace.go

internal/service/engine/index_progress.go

cmd/mcp-http-client-test/main.go

Copilot

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

README.md:1

./docs/IDE-SETUP.md is referenced here but is deleted in this PR, leaving a broken link in the main README. Either restore the doc (or replace it with the new canonical location) or update/remove this link so the README stays self-consistent.

<div align="center">

internal/service/search/search.go

internal/service/tools/find_usages.go

Copilot

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

go.mod:3

The go directive in go.mod is expected to be major.minor (e.g., go 1.24), not a patch version. Using go 1.24.4 will break go mod parsing in standard toolchains. If you intend to pin a patch toolchain, use go 1.24 and add a toolchain go1.24.4 directive instead.

go 1.24.4

internal/service/tools/index_workspace.go

Copilot

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 6 comments.

internal/healthcheck/healthcheck.go

cmd/mcp-http-client-test/main.go

internal/service/tools/index_workspace.go

internal/service/tools/evaluate_ragcode.go

internal/service/tools/find_usages.go

cmd/mcp-http-client-test/main.go

razvan and others added 30 commits February 15, 2026 00:02

gemini flash tried to implement what we jae in another place ... FAILED

47e94be

feat(v2): finalize internal detection core and parser package migration

e33ea0e

refactor(v2): move workspace detection core to pkg/workspace

0bb3eed

docs(v2/workspace): add comprehensive README guides for all modules

605b79a

Add legacy parity tests for Qdrant store

aee8af8

added test for qdrant

62bd515

WIP 80% finished version v2

5839691

WIP bind all

a18df5e

WIP errors

3da339c

WIP v2 must add tests

8d084dd

wip

61d1384

feat: migrate graph resolution natively to rag_search_code tool

b8d61a1

chore: remove graph search tool and registration

19b31bd

docs: document recent graph and hybrid capabilities in rag_search_cod…

20a1c46

…e tool

Merge branch 'v2-detection-core' and resolve conflicts in search_loca…

bdb4d07

…l_index.go

feat: enhance read_file_context with formatted AST lines and relation…

164b87c

…s output

all tests passed

9daeea3

Copilot AI review requested due to automatic review settings March 6, 2026 12:33

Copilot AI reviewed Mar 6, 2026

fix: include models in evaluate_ragcode tool JSON Data

7b74aed

doITmagic force-pushed the v2-detection-core branch from 0edbad1 to 7b74aed Compare March 6, 2026 12:39

fix: return actual Go errors from Execute for find_usages validation …

fcb36f6

…failures

Copilot AI review requested due to automatic review settings March 6, 2026 12:40

Copilot AI reviewed Mar 6, 2026

internal/service/tools/index_workspace.go (outdated, resolved)

internal/service/engine/index_progress.go (resolved)

internal/service/engine/index_progress.go (outdated, resolved)

cmd/mcp-http-client-test/main.go (outdated, resolved)

doITmagic added 2 commits March 6, 2026 14:45

fix: redact timeline metadata in rag_index_workspace and add HybridSe…

02c3b5e

…arch tests

fix: return error alongside ToolResponse JSON in rag_index_workspace …

ee7067f

…evaluation

Copilot AI review requested due to automatic review settings March 6, 2026 12:46

perf: release progress store lock before persisting state to disk dur…

c086b2a

…ing get, complete, and fail operations

doITmagic force-pushed the v2-detection-core branch from 0e9c618 to 5f92ba1 Compare March 6, 2026 12:48

Copilot AI reviewed Mar 6, 2026

internal/service/search/search.go (resolved)

internal/service/tools/find_usages.go (resolved)

refactor: rename cmd/sse-client-test to cmd/mcp-http-client-test to r…

bef4e55

…eflect stateless transport

doITmagic force-pushed the v2-detection-core branch from 5f92ba1 to bef4e55 Compare March 6, 2026 12:51

perf: add early return to HybridSearch when limit is zero to avoid Qd…

a827fed

…rant calls

Copilot AI review requested due to automatic review settings March 6, 2026 12:52

Copilot AI reviewed Mar 6, 2026

internal/service/tools/index_workspace.go (resolved)

internal/service/tools/index_workspace.go (outdated, resolved)

doITmagic added 2 commits March 6, 2026 15:24

test: fix unchecked json.Unmarshal error in find_usages_test

a7452ff

fix: pass ToolResponse json string inside CallToolResult content on f…

7f817f5

…ailure in rag_index_workspace

Copilot AI review requested due to automatic review settings March 6, 2026 13:26

Copilot AI reviewed Mar 6, 2026

doITmagic added 2 commits March 6, 2026 15:48

fix: return JSON ToolResponse with nil error instead of both in rag_i…

4624c81

…ndex_workspace

fix: resolve final PR review comments for mcp-http-client-test, find_…

344d4fa

…usages, evaluate_ragcode, index_workspace

doITmagic requested review from Copilot and removed request for Copilot March 6, 2026 14:00

doITmagic merged commit 7adda5a into dev Mar 6, 2026
3 checks passed

doITmagic deleted the v2-detection-core branch March 6, 2026 14:07

2 participants