Major architectural overhaul of RagCode MCP — new service layer, deterministic AST-based tools, native Ollama client, Streamable HTTP transport, and a dramatically reduced hardware footprint. #25

Merged
doITmagic merged 119 commits into dev from v2-detection-core
Mar 6, 2026

Conversation

@doITmagic
Owner

Description

This PR brings a major overhaul of RagCode MCP — focusing on making it lighter, faster, and easier
to work with both for developers and AI agents. Here's what changed and why it matters.


The biggest win: runs on almost any machine

The old version required a full LLM chat model (phi3:medium, ~8B parameters) running in the background
at all times. That was 5–8 GB of RAM just to keep the server alive.

We removed the chat model entirely. The server now uses only a single small embedding model
(qwen3-embedding:0.6b, 600M parameters) to encode search vectors — everything else is handled
deterministically in Go through AST analysis and Qdrant filters. The result:

  • Minimum RAM dropped from ~6–8 GB to ~1.5 GB
  • No GPU required — runs comfortably on a developer laptop
  • Faster startup, lower idle CPU

New and improved tools

We cleaned up the tool set significantly. Several overlapping tools from the old version were merged or
replaced with more precise, deterministic alternatives:

Gone: rag_get_function_details, rag_find_type_definition, rag_get_code_context,
rag_find_implementations, rag_search_docs, rag_hybrid_search

New in v2:

  • rag_search — smart router that auto-picks semantic vs exact search based on the query
  • rag_find_usages — finds all usages of a symbol using the AST Code Graph (no embedding needed)
  • rag_call_hierarchy — recursive caller/callee tree, also via AST relations
  • rag_read_file_context — returns code around a line with full AST context (surrounding function/class)
  • rag_check_update / rag_apply_update — self-update from GitHub releases

The existing rag_search_code was extended with explicit discovery (semantic) and exact (BM25 + vector)
modes, plus Graph Context Expansion that auto-fetches related definitions without extra embedding calls.

file_path is now optional in all tools — if omitted, the server detects the workspace automatically
from the last active project.


Architecture: from monolith to packages

The old entry point was a 1553-line main.go that mixed HTTP, LLM calls, indexing, tools, config, and
logging in one place. Hard to navigate, hard to test.

Now everything lives in focused, independently testable packages — each with its own README and tests:

  • internal/service/engine — workspace detection, background indexing, Git branch awareness
  • internal/service/search — semantic and hybrid search coordination
  • internal/service/tools — one file per MCP tool
  • pkg/indexer — parallel file indexing with state persistence
  • pkg/llm — native Ollama client (replaces langchaingo)
  • pkg/parser/{go,php,python,html} — pluggable parser registry with full AST extraction
  • pkg/workspace/{detector,registry,resolver,branchstate,watch} — robust multi-workspace tracking
  • pkg/telemetry — tracks how many bytes the tools saved vs reading full files

Indexer improvements

  • Markdown files (.md) are now indexed alongside code and searchable via include_docs=true
  • Concurrency is RAM-aware — 1 worker on 8 GB, up to 4 on machines with 32 GB+
  • A stall watchdog detects silent embedding deadlocks and recovers automatically
  • Only changed files are re-indexed on each save (incremental, Git-diff based)
  • Indexing progress is reported in real-time inside tool responses
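
The RAM-aware concurrency rule can be sketched as a small lookup. Only the two endpoints (1 worker on 8 GB, up to 4 on 32 GB+) come from this description; the 16 GB tier and the function name `workersForRAM` are assumptions made for illustration.

```go
package main

import "fmt"

const gib = uint64(1) << 30

// workersForRAM (hypothetical) maps total system memory to an indexer worker count.
// The 8 GiB and 32 GiB endpoints follow the PR text; the middle tier is a guess.
func workersForRAM(totalBytes uint64) int {
	switch {
	case totalBytes >= 32*gib:
		return 4
	case totalBytes >= 16*gib:
		return 2 // assumption: not stated in the PR
	default:
		return 1
	}
}

func main() {
	for _, gb := range []uint64{8, 16, 32, 64} {
		fmt.Printf("%d GiB -> %d worker(s)\n", gb, workersForRAM(gb*gib))
	}
}
```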

Simpler server transport

The old SSE transport required agents to open a stream first (GET /sse), grab a sessionid,
then send every request to /messages?sessionid=XYZ. Many agents lost track of sessions or handled
the async stream incorrectly.

The new transport is a single POST /mcp — synchronous, stateless, response in the HTTP body.
No sessions, no streams to manage. Works correctly with every agent out of the box.
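
For agents written in Go, a tool call against the new endpoint might be built roughly as follows. This is a sketch under assumptions: the base URL and port are placeholders, the `query` argument name for rag_search is a guess, and the envelope follows the generic MCP JSON-RPC `tools/call` shape rather than anything confirmed by this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// rpcRequest is a JSON-RPC 2.0 envelope as used by MCP.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params"`
}

// toolCallParams is the params object of an MCP "tools/call" request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// newToolCall (hypothetical helper) builds a single POST /mcp request.
func newToolCall(baseURL string, id int, tool string, args map[string]any) (*http.Request, error) {
	body, err := json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/mcp", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	// Streamable HTTP clients advertise both content types.
	req.Header.Set("Accept", "application/json, text/event-stream")
	return req, nil
}

func main() {
	// Placeholder address and argument name; adjust to the real server config.
	req, err := newToolCall("http://localhost:3000", 1, "rag_search",
		map[string]any{"query": "workspace detection"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to `http.DefaultClient.Do` and reading the response body is all that remains; there is no session id to carry between calls.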


Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change — legacy /sse and /messages endpoints removed; agents must use POST /mcp
  • Documentation update

Migration: any IDE or agent configured with url: .../sse needs to switch to url: .../mcp.
Run rag-code-install --transport auto to reconfigure automatically.

Checklist

  • I have performed a self-review of my own code
  • I have formatted my code with go fmt ./...
  • I have run tests go test ./... and they pass
  • I have verified integration with Ollama/Qdrant (if applicable)
  • I have updated the documentation accordingly

razvan and others added 30 commits February 15, 2026 00:02
…tate, registry, and functional demo

## Core Modules Implemented
- **contract**: request/response/error types, validation helper, reason codes
- **resolver**: deterministic pipeline, strict mode, ambiguity handling, structured logging
- **detector**: marker-based upward root detection, security validation, metadata fallback
- **branchstate**: variant 2 branch/head tracking with persisted JSON state and cache TTL
- **registry**: confirmed workspace persistence with upsert/lookup/cleanup operations
- **tests**: AI-like scenario matrix with cross-module branchstate integration
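
A minimal sketch of marker-based upward root detection, as described for the detector module. The marker list and the `findRoot` signature are assumptions for illustration; the real detector also performs security validation and a metadata fallback that are not shown here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// markers is an assumed list; the real detector may use a different set.
var markers = []string{".git", "go.mod", "composer.json", "pyproject.toml"}

// findRoot walks from startDir toward the filesystem root and returns the
// first directory containing a marker. exists is injected so the walk can be
// exercised without touching the disk.
func findRoot(startDir string, exists func(path string) bool) (string, bool) {
	dir := filepath.Clean(startDir)
	for {
		for _, m := range markers {
			if exists(filepath.Join(dir, m)) {
				return dir, true
			}
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the filesystem root
			return "", false
		}
		dir = parent
	}
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		panic(err)
	}
	onDisk := func(p string) bool {
		_, statErr := os.Stat(p)
		return statErr == nil
	}
	root, ok := findRoot(cwd, onDisk)
	fmt.Println(root, ok)
}
```

When the input is a file_path rather than a directory, the caller would pass `filepath.Dir(filePath)` as the starting point.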

## Key Features
- Deterministic workspace detection (no opaque fallbacks)
- Explicit confirmation workflow for ambiguous cases
- Branch-aware reindex decisions (FIRST_SEEN, BRANCH_CHANGED, HEAD_CHANGED)
- Persistent registry for confirmed workspaces
- Functional demo with IDE-like modes (file/root)
- Comprehensive unit and end-to-end test coverage
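
The branch-aware reindex decision can be pictured as a comparison between the persisted state and the current Git state. The reason codes come from the list above; the `branchState` type, the function name, and the order in which the checks run are assumptions.

```go
package main

import "fmt"

// branchState is a hypothetical snapshot of the Git state for a workspace.
type branchState struct {
	Branch string
	Head   string
}

// reindexReason returns a reason code, or "" when no reindex is needed.
// prev is nil when the workspace has never been seen before.
func reindexReason(prev *branchState, cur branchState) string {
	switch {
	case prev == nil:
		return "FIRST_SEEN"
	case prev.Branch != cur.Branch:
		return "BRANCH_CHANGED"
	case prev.Head != cur.Head:
		return "HEAD_CHANGED"
	default:
		return ""
	}
}

func main() {
	saved := &branchState{Branch: "dev", Head: "abc123"}
	fmt.Println(reindexReason(nil, branchState{Branch: "dev", Head: "abc123"}))
	fmt.Println(reindexReason(saved, branchState{Branch: "feature", Head: "abc123"}))
	fmt.Println(reindexReason(saved, branchState{Branch: "dev", Head: "def456"}))
}
```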

## AI Dependency Reduction
- <5% AI dependency (only optional alias resolution)
- >95% cases handled deterministically via file structure and git metadata
- Strict mode and confirmation-required workflows without AI calls

## Validation
- All modules unit-tested and passing
- End-to-end scenarios validated: first seen, branch switch, head change, non-git fallback
- Demo program validates IDE integration paths (workspace_root vs file_path)

## Files Added
- 21 new files with 2385 insertions
- Complete module documentation (TASKS.md) and architecture overview
- Self-contained V2 core ready for integration

This establishes the foundation for RagCode V2 with predictable, modular workspace detection.
…ties

- Add phased implementation checklist to v2/TASKS.md
- Add Phase 1/2 checklists to contract/resolver/tests module TASKS.md
- Align ARCHITECTURE.md with Issue #21 requirements
- Define priority labels (P0/P1) for deterministic rollout
- Separate core metadata (Phase 1) from feedback loop (Phase 2)

Phase 1 focuses on:
- branch-aware context key
- response metadata envelope
- branch mismatch risk
- invalidation hardening

Phase 2 adds:
- AI feedback contract
- candidate promotion workflow
- audit logs/metrics

All documentation in English as per project standards.
- Enhance SearchDocsTool to detect language from file path and fallback to project language
- Add SearchDocsTool tests for language selection logic
- Improve SearchDocsOnly to handle both markdown and text chunks with proper scoring
- Add workspace indexing improvements to treat non-code files as text documentation
- Add binary file detection to skip indexing of binary files
- Add comprehensive workspace scanning improvements for documentation files
- Add PLANNING.md with roadmap for branch awareness and detection improvements
- Implemented intelligent startup indexing: only re-index if the workspace collection is missing or empty.
- Added a parameter to IndexWorkspace and StartIndexingAsync to allow forced re-indexing when needed.
- Updated config.yaml to exclude a directory from indexing to prevent duplicate code results.
- Fixed  tool to correctly handle the  flag.
- Added a functional test script for end-to-end SSE interaction verification.

This change significantly improves application startup time for existing workspaces and cleans up search results by ignoring irrelevant directories.
… observability

- Unified  and  into a single, high-performance tool
- Added  parameter ( vs ) to allow agents to choose search strategies
- Implemented colorful terminal logging (Cyan) for search requests to improve branding and debugging
- Updated llms.txt and llms-full.txt to emphasize protocol-agnostic, agent-autonomous design
- Expanded tool documentation to a 13-item roadmap, including skill management and updates
- Refactored codebase: removed redundant search files and organized the  directory
- Enhanced integration tests in  with complex architectural queries
…gs tracker

Fixed a fatal cross-file symbol injection in read_file_context AST parsing.

Added detailed logging to the core MCP tools, enabled selectively via MCP_LOG_LEVEL=debug.
Implemented the telemetry package calculation (a 4-bytes-per-token heuristic), embedded in the JSON context metadata, which reports the ratio and number of bytes avoided by using the tools.
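
The telemetry arithmetic can be sketched as below. Only the 4-bytes-per-token heuristic is taken from this commit message; the `savings` struct, its field names, and the clamping behaviour are assumptions.

```go
package main

import "fmt"

// bytesPerToken is the heuristic named in the commit message.
const bytesPerToken = 4

// savings is a hypothetical shape for the metadata attached to a tool response.
type savings struct {
	BytesSaved  int
	TokensSaved int
	Ratio       float64 // fraction of the full file that was not sent
}

// computeSavings compares the size of the full file with the size of the
// context actually returned by a tool.
func computeSavings(fullFileBytes, returnedBytes int) savings {
	if fullFileBytes <= 0 || returnedBytes >= fullFileBytes {
		return savings{} // nothing saved
	}
	saved := fullFileBytes - returnedBytes
	return savings{
		BytesSaved:  saved,
		TokensSaved: saved / bytesPerToken,
		Ratio:       float64(saved) / float64(fullFileBytes),
	}
}

func main() {
	s := computeSavings(4000, 1000)
	fmt.Printf("bytes=%d tokens=%d ratio=%.2f\n", s.BytesSaved, s.TokensSaved, s.Ratio)
}
```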
- Implemented BDD testing infrastructure using Ginkgo for the tools package.
- Created robust mocks (mockVectorStore, mockLLMProvider, mockDetector) in mocks_test.go for isolated unit testing.
- Added test suites for:
    - SearchLocalIndexTool: verified semantic search and Code Graph context expansion.
    - FindUsagesTool: tested usage identification through AST relations.
    - CallHierarchyTool: validated recursive call hierarchy (incoming/outgoing).
    - ListPackageExportsTool: verified package exports for Go, PHP, and Python.
    - ReadFileContextTool: tested smart AST context extraction vs. naive fallback.
- Fix(search): ensured SearchService.ExactSearch correctly utilizes vector store filters.
- Refactor: added validation for the `symbol_name` parameter in CallHierarchyTool.
- Config: registered language analyzers in the test suite for full multi-language support.
…xactSearch via Scroll

## Phase 4: File-level parallel indexing (pkg/indexer/service.go)
- Replace sequential file loop with a worker pool (min 2, max 8 goroutines = NumCPU/2)
- Files are dispatched via buffered channel; errors collected with sync.Mutex
- Reduces wall-clock indexing time proportionally to available CPUs
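
A condensed sketch of the worker-pool pattern described above: a buffered channel of file paths, NumCPU/2 goroutines clamped to the 2..8 range, and a mutex-protected error slice. The function names and the `indexOne` callback are illustrative, not the actual pkg/indexer API.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

var errUnreadable = errors.New("unreadable file")

// workerCount clamps NumCPU/2 to the range [2, 8].
func workerCount(numCPU int) int {
	n := numCPU / 2
	if n < 2 {
		n = 2
	}
	if n > 8 {
		n = 8
	}
	return n
}

// indexFiles dispatches files over a buffered channel and collects errors.
func indexFiles(files []string, workers int, indexOne func(string) error) []error {
	jobs := make(chan string, len(files))
	for _, f := range files {
		jobs <- f
	}
	close(jobs)

	var (
		mu   sync.Mutex
		errs []error
		wg   sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for f := range jobs {
				if err := indexOne(f); err != nil {
					mu.Lock()
					errs = append(errs, err)
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	return errs
}

func main() {
	files := []string{"a.go", "b.go", "bad.go"}
	errs := indexFiles(files, workerCount(runtime.NumCPU()), func(path string) error {
		if path == "bad.go" {
			return fmt.Errorf("%s: %w", path, errUnreadable)
		}
		return nil
	})
	fmt.Println(len(errs), "error(s)")
}
```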

## Phase 4: Race condition fix in Go AST parser (pkg/parser/go/analyzer.go)
- Remove mutable `fset *token.FileSet` from CodeAnalyzer struct
- Make `fset` a local variable in AnalyzePackage(), pass as explicit parameter to
  analyzeFunctionDecl(), analyzeTypeDecl(), analyzeConstantDecl(), analyzeVariableDecl()
- CodeAnalyzer is now fully stateless and safe for concurrent use under -race
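
The shape of the fix can be illustrated with a stripped-down analyzer. The `CodeAnalyzer` below is a simplified stand-in, and `FuncLines` and `funcLine` are hypothetical helpers; the point is only that the `token.FileSet` is created per call and passed down explicitly, so concurrent callers share no mutable state.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// CodeAnalyzer has no fields, so concurrent calls share nothing.
type CodeAnalyzer struct{}

// FuncLines parses src and returns the starting line of each function.
func (CodeAnalyzer) FuncLines(filename, src string) (map[string]int, error) {
	fset := token.NewFileSet() // local to this call, never stored on the struct
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	lines := make(map[string]int)
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			lines[fn.Name.Name] = funcLine(fset, fn)
		}
	}
	return lines, nil
}

// funcLine receives the FileSet as an explicit parameter.
func funcLine(fset *token.FileSet, fn *ast.FuncDecl) int {
	return fset.Position(fn.Pos()).Line
}

func main() {
	src := "package p\n\nfunc A() {}\nfunc B() {}\n"
	lines, err := CodeAnalyzer{}.FuncLines("p.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines["A"], lines["B"])
}
```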

## Phase 1: is_public field on parser.Symbol (pkg/parser/parser.go)
- Add `IsPublic bool` (json:"is_public") to Symbol struct
- Go analyzer: ast.IsExported(ch.Name)
- Python analyzer: strings.HasPrefix(ch.Name, "_") => false
- PHP analyzer: reads chunk.Metadata["visibility"]; public/"" => true, else false
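
The three visibility rules above can be summarized in one function. This is a sketch: the real analyzers compute the flag separately per language, and the `isPublic` signature here is invented for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"strings"
)

// isPublic applies the per-language rules listed in the commit message.
// visibility is only consulted for PHP.
func isPublic(lang, name, visibility string) bool {
	switch lang {
	case "go":
		return ast.IsExported(name)
	case "python":
		return !strings.HasPrefix(name, "_")
	case "php":
		return visibility == "" || visibility == "public"
	default:
		return false // assumption: unknown languages are treated as private
	}
}

func main() {
	fmt.Println(isPublic("go", "ParseConfig", ""))
	fmt.Println(isPublic("python", "_helper", ""))
	fmt.Println(isPublic("php", "render", "private"))
}
```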

## Phase 1: list_package_exports filters by is_public (internal/service/tools/list_package_exports.go)
- ExactSearch filter now includes {"is_public": true}
- Removed runtime isExported() string-check function (replaced by indexed field)

## ExactSearch via Scroll (pkg/storage/qdrant.go, pkg/storage/interface.go)
- Add ExactSearch(ctx, collection, filters, limit) to QdrantStore and Store interface
- Performs metadata-only Qdrant Scroll (no HNSW, no embeddings required)
- All results assigned score=1.0 (exact match semantics)
- Support nested array paths via "array[].field" notation => qdrant.NewNestedFilter
- Fix bool filter bug: use Match_Boolean for bool values instead of fmt.Sprintf keyword mismatch
- Add matchBool() helper alongside matchKeyword()
- Add ExactSearchPolyglot() to Engine (iterates all language collections for a workspace)
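
The filter handling can be sketched without the Qdrant client. The `condition` struct below is a hypothetical stand-in for the client's condition types, and the `relations[].target` key is an invented example; the sketch only shows the two ideas from the list above: bool values get a bool match instead of a stringified keyword, and `array[].field` keys are split into a nested path.

```go
package main

import (
	"fmt"
	"strings"
)

// condition is a stand-in for a vector-store filter condition.
type condition struct {
	NestedIn string  // array path when the key used "array[].field" notation
	Key      string  // field to match
	Keyword  *string // set for keyword matches
	Bool     *bool   // set for boolean matches
}

// toCondition converts one filter entry into a condition.
func toCondition(key string, value any) condition {
	c := condition{Key: key}
	if arr, field, ok := strings.Cut(key, "[]."); ok {
		c.NestedIn, c.Key = arr, field
	}
	switch v := value.(type) {
	case bool:
		// Formatting a bool as the keyword "true" would not match a bool payload.
		c.Bool = &v
	default:
		s := fmt.Sprint(v)
		c.Keyword = &s
	}
	return c
}

func main() {
	c := toCondition("is_public", true)
	fmt.Println(c.Key, *c.Bool)

	n := toCondition("relations[].target", "ParseConfig")
	fmt.Println(n.NestedIn, n.Key, *n.Keyword)
}
```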

## Storage: Scroll-based search wired into search service (internal/service/search/search.go)
- ExactSearch exposed through SearchService for use by tools

## Engine: ExactSearchPolyglot + SearchByName (internal/service/engine/engine.go)
- ExactSearchPolyglot(ctx, wsID, filter, limit): fan-out ExactSearch across all lang collections
- SearchByName(ctx, wsID, name, limit): thin wrapper => ExactSearchPolyglot({"name": name})

## Tests added
- pkg/indexer/service_test.go: TestIndexWorkspaceParallelFiles (8 packages, -race flag)
- pkg/storage/qdrant_test.go: ExactSearch unit tests incl. bool filter, nested array, pagination
- internal/service/engine/engine_exact_search_test.go: ExactSearchPolyglot unit tests
- internal/service/search/search_test.go: SearchService.ExactSearch unit tests
- tests/functional_sse_tools_test.go: timeout 30s -> 90s; 5/5 SSE functional specs PASS

All 26 unit packages + functional SSE suite PASS (go test -race ./...)
- CallHierarchyInput.FilePath: added omitempty — no longer required in MCP JSON schema
- IndexWorkspaceInput.FilePath: added omitempty — no longer required in MCP JSON schema
- Updated call_hierarchy description: removed MANDATORY wording, added fallback note

All tools now use registry fallback (last active workspace) when file_path is omitted.
DetectContext already supported empty path via GetActiveWorkspace() registry fallback.
…TTL detection cache

## CollectionNameFor + CollectionName (engine.go)
- Add package-level CollectionNameFor(wsID, lang string) string
- Add WorkspaceContext.CollectionName(lang string) string — thin wrapper
- Replace all 5 inline fmt.Sprintf("ragcode-%s-%s", ...) with the helpers
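
A sketch of the two helpers. The format string is quoted from the commit message; the argument order (workspace ID first, then language) and the `ID` field on `WorkspaceContext` are assumptions.

```go
package main

import "fmt"

// CollectionNameFor builds the per-language collection name for a workspace.
// Assumption: the workspace ID fills the first %s and the language the second.
func CollectionNameFor(wsID, lang string) string {
	return fmt.Sprintf("ragcode-%s-%s", wsID, lang)
}

// WorkspaceContext is reduced to the single field this sketch needs.
type WorkspaceContext struct {
	ID string // hypothetical field name
}

// CollectionName is the thin wrapper over CollectionNameFor.
func (w *WorkspaceContext) CollectionName(lang string) string {
	return CollectionNameFor(w.ID, lang)
}

func main() {
	w := &WorkspaceContext{ID: "a1b2c3"}
	fmt.Println(w.CollectionName("go"))
	fmt.Println(w.CollectionName("python"))
}
```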

## DetectFromParams (engine.go)
- New Engine.DetectFromParams(ctx, params map[string]any) helper
- Reads file_path, workspace_root, workspace keys in priority order
- Falls through to registry fallback when none are present
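
The key priority can be sketched as a simple ordered lookup. `pathFromParams` is a hypothetical helper, not the actual DetectFromParams; it only shows the ordering and the point at which a caller would fall back to the registry.

```go
package main

import "fmt"

// pathFromParams returns the first non-empty path hint, checking keys in
// priority order. ok is false when the caller should use the registry fallback.
func pathFromParams(params map[string]any) (path string, ok bool) {
	for _, key := range []string{"file_path", "workspace_root", "workspace"} {
		if s, isString := params[key].(string); isString && s != "" {
			return s, true
		}
	}
	return "", false
}

func main() {
	p, ok := pathFromParams(map[string]any{
		"workspace_root": "/home/dev/project",
		"file_path":      "/home/dev/project/main.go",
	})
	fmt.Println(p, ok)

	_, ok = pathFromParams(map[string]any{"query": "indexer"})
	fmt.Println(ok) // no hint: fall back to the last active workspace
}
```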

## TTL detection cache (engine.go)
- Add detectionCache sync.Map + detectionCacheEntry{wctx, expiry} to Engine
- DetectContext caches resolved WorkspaceContext for 5s (detectionCacheTTL)
- Cache is skipped/invalidated when ReindexRequired=true or MismatchRisk=high
- Eliminates repeated resolver cascade calls within the same tool execution
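
A reduced sketch of the detection cache: a sync.Map of entries with an expiry timestamp, plus the rule that risky results are never cached. The 5s TTL and the skip conditions come from the list above; the struct layout and the method names `get` and `put` are assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const detectionCacheTTL = 5 * time.Second

// WorkspaceContext is reduced to the fields the cache logic inspects.
type WorkspaceContext struct {
	Root            string
	ReindexRequired bool
	MismatchRisk    string
}

type detectionCacheEntry struct {
	wctx   *WorkspaceContext
	expiry time.Time
}

type detectionCache struct {
	entries sync.Map // key: path hint, value: detectionCacheEntry
}

// get returns a cached context unless it is missing or expired.
func (c *detectionCache) get(key string) (*WorkspaceContext, bool) {
	v, ok := c.entries.Load(key)
	if !ok {
		return nil, false
	}
	e := v.(detectionCacheEntry)
	if time.Now().After(e.expiry) {
		c.entries.Delete(key)
		return nil, false
	}
	return e.wctx, true
}

// put stores a context, or invalidates the key when the result is risky.
func (c *detectionCache) put(key string, w *WorkspaceContext) {
	if w.ReindexRequired || w.MismatchRisk == "high" {
		c.entries.Delete(key)
		return
	}
	c.entries.Store(key, detectionCacheEntry{wctx: w, expiry: time.Now().Add(detectionCacheTTL)})
}

func main() {
	var cache detectionCache
	cache.put("/repo/main.go", &WorkspaceContext{Root: "/repo"})
	w, ok := cache.get("/repo/main.go")
	fmt.Println(ok, w.Root)
}
```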

## ContextFromWorkspace helper (tools/response.go)
- New ContextFromWorkspace(wctx *engine.WorkspaceContext) ContextMetadata
- Centralizes ContextMetadata construction from WorkspaceContext
- Accepts nil safely (returns empty ContextMetadata)

All 26 unit packages + SSE functional suite PASS
…chPolyglot

Phase 2 (engine.go):
- Add WorkspaceID to SearchCodeResult
- SearchCode: embed ONCE, fan-out parallel to all lang collections
- Surface errors only when all collections fail and no results
- Add SearchWithVector to search.go (pre-computed vector, includeDocs path)

Phase 3 (search_local_index.go):
- Graph expansion: parallel goroutines bounded to maxExpansions=10
- SearchByName first (zero embedding, deterministic Qdrant Scroll)
- Fallback: SearchCode with embedding only when ExactSearch returns 0 results
- seenTargets dedup prevents redundant goroutines
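
The Phase 3 expansion logic can be sketched as dedup first, then one goroutine per unique target, with the exact lookup tried before the embedding-based one. The callbacks `exact` and `semantic` stand in for SearchByName and SearchCode; their signatures are invented for this example, and the cap is interpreted here as a limit on unique targets.

```go
package main

import (
	"fmt"
	"sync"
)

const maxExpansions = 10

// expand resolves relation targets to definitions. Duplicates are dropped
// before any goroutine starts, and at most maxExpansions targets are resolved.
func expand(targets []string, exact, semantic func(name string) []string) map[string][]string {
	seenTargets := make(map[string]struct{})
	var unique []string
	for _, t := range targets {
		if _, dup := seenTargets[t]; dup {
			continue
		}
		if len(unique) == maxExpansions {
			break
		}
		seenTargets[t] = struct{}{}
		unique = append(unique, t)
	}

	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	out := make(map[string][]string, len(unique))
	for _, t := range unique {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			defs := exact(name) // deterministic lookup, no embedding
			if len(defs) == 0 {
				defs = semantic(name) // fallback that needs an embedding
			}
			mu.Lock()
			out[name] = defs
			mu.Unlock()
		}(t)
	}
	wg.Wait()
	return out
}

func main() {
	exact := func(name string) []string {
		if name == "ParseConfig" {
			return []string{"func ParseConfig(path string) (*Config, error)"}
		}
		return nil
	}
	semantic := func(name string) []string { return []string{"semantic match for " + name} }
	out := expand([]string{"ParseConfig", "ParseConfig", "loadEnv"}, exact, semantic)
	fmt.Println(len(out), out["loadEnv"][0])
}
```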

call_hierarchy rewrite:
- Remove per-lang loop pattern (sequential CollectionExists + ExactSearch)
- findSymbolInfo -> engine.SearchByName (ExactSearchPolyglot internally parallel)
- resolveIncoming -> engine.ExactSearchPolyglot (single parallel call)
- resolveOutgoing -> engine.SearchByName per target
- Remove search.Service dependency and parser.SupportedLanguages() usage
engine_searchcode_test.go (new):
- TestSearchCodeEmbedsOnce: verifies embedder called exactly once even with go+python collections
- TestSearchCodePopulatesWorkspaceID: verifies WorkspaceID != "" in SearchCodeResult
- TestSearchCodeMergesMultiLangResults: go+python results both present after fan-out
- TestSearchCodeSurfacesErrorWhenAllFail: error propagated when all collections fail

search_test.go (Phase 3 graph expansion, 3 specs):
- ExactSearch first: SearchCodeOnly NOT called with expansion limit when ExactSearch succeeds
- Fallback: SearchCodeOnly IS called with limit=2 when ExactSearch returns empty
- Dedup: 3 identical relations in payload → only 1 goroutine → 2 result items
find_usages.go:
- Remove per-lang sequential loop (CollectionExists + ExactSearch per lang)
- Replace with engine.ExactSearchPolyglot (parallel, zero embedding)
- Remove search.Service + parser dependencies

list_package_exports.go:
- Same loop removal, ExactSearchPolyglot with package filter only
- Add RelationsCount int to ExportedSymbol (AST relation count as complexity indicator)
- Add isExported(name) fallback: graceful handling of pre-is_public index entries
  (checks Go naming convention: first rune uppercase)
- Show Relations count in markdown output when > 0
- Remove search.Service + parser dependencies
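
The naming-convention fallback mentioned for list_package_exports.go amounts to checking whether the first rune is uppercase. A minimal sketch, assuming nothing beyond that rule:

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isExported reports whether name starts with an uppercase rune, which is the
// Go export convention. An empty name is treated as unexported.
func isExported(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.IsUpper(r)
}

func main() {
	for _, name := range []string{"ParseConfig", "parseConfig", "Ölçü", ""} {
		fmt.Printf("%q exported=%v\n", name, isExported(name))
	}
}
```

The standard library's `ast.IsExported` applies the same rule, so the fallback and the indexed is_public field should agree for Go symbols.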
Copilot AI review requested due to automatic review settings March 6, 2026 12:33
@doITmagic
Owner Author

I've addressed the remaining unresolved review comments in the latest commits (up to 37ea155):

  • docs: Updated the broken analyzer documentation markdown links in README.md.
  • tools: Renamed the shadowed filePath variable inside the rag_find_usages tool loop to resultFilePath to avoid confusion and future bugs.
  • tests: Added multi-line handling to extractSSEData in cmd/sse-client-test/main.go: it now concatenates JSON payloads that span multiple data: lines, which prevents json.Unmarshal failures.
  • core tests: Revisited and fixed the FindUsagesTool telemetry metrics test and the directory mock. The test now uses a real temporary file instead of a mock that cannot be stat-ed, confirms the bytes accounting for tokens saved, and passes locally. Also fixed the test cleanup race condition caused by overlapping background indexer goroutines.
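
For reference, the multi-line data: handling can be sketched as follows. The function name matches the one mentioned above, but its signature and body here are assumptions; the sketch follows the SSE convention that consecutive data: lines of one event are joined with a newline.

```go
package main

import (
	"fmt"
	"strings"
)

// extractSSEData joins the payload of every "data:" line in a single SSE
// event. Lines with other field names (event:, id:) are ignored.
func extractSSEData(event string) string {
	var parts []string
	for _, line := range strings.Split(event, "\n") {
		line = strings.TrimSuffix(line, "\r")
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		payload := strings.TrimPrefix(line, "data:")
		payload = strings.TrimPrefix(payload, " ") // one optional leading space
		parts = append(parts, payload)
	}
	return strings.Join(parts, "\n")
}

func main() {
	event := "event: message\ndata: {\"result\":\ndata: \"ok\"}\n\n"
	fmt.Printf("%q\n", extractSSEData(event))
}
```

Because JSON treats the joining newline as whitespace, the concatenated string can be passed directly to json.Unmarshal.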

All unit tests consistently stay 🟢 Green. Let me know if everything is up to standards and ready for merge!

@doITmagic
Owner Author

rag_evaluate now places the b.String() output in Message instead of Data and parses the structured fields into var data map[string]interface{} (including the models mapping, if available). The fix has been pushed in 0edbad1, as requested. Thank you!

Copilot AI left a comment

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 5 comments.

@doITmagic doITmagic force-pushed the v2-detection-core branch from 0edbad1 to 7b74aed Compare March 6, 2026 12:39
Copilot AI review requested due to automatic review settings March 6, 2026 12:40
Copilot AI left a comment

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 4 comments.

Copilot AI review requested due to automatic review settings March 6, 2026 12:46
@doITmagic doITmagic force-pushed the v2-detection-core branch from 0e9c618 to 5f92ba1 Compare March 6, 2026 12:48
Copilot AI left a comment

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

README.md:1

  • ./docs/IDE-SETUP.md is referenced here but is deleted in this PR, leaving a broken link in the main README. Either restore the doc (or replace it with the new canonical location) or update/remove this link so the README stays self-consistent.
<div align="center">

@doITmagic doITmagic force-pushed the v2-detection-core branch from 5f92ba1 to bef4e55 Compare March 6, 2026 12:51
Copilot AI review requested due to automatic review settings March 6, 2026 12:52
Copilot AI left a comment

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

go.mod:3

  • The go directive in go.mod is expected to be major.minor (e.g., go 1.24), not a patch version. Using go 1.24.4 will break go mod parsing in standard toolchains. If you intend to pin a patch toolchain, use go 1.24 and add a toolchain go1.24.4 directive instead.
go 1.24.4

Copilot AI review requested due to automatic review settings March 6, 2026 13:26
Copilot AI left a comment

Pull request overview

Copilot reviewed 89 out of 346 changed files in this pull request and generated 6 comments.

@doITmagic doITmagic requested review from Copilot and removed request for Copilot March 6, 2026 14:00
@doITmagic doITmagic merged commit 7adda5a into dev Mar 6, 2026
3 checks passed
@doITmagic doITmagic deleted the v2-detection-core branch March 6, 2026 14:07