Conversation
|
Thank you for this and the clean implementation, midweste. It's well-structured, with good error isolation and test coverage. Adding a plugin/extension system is a significant architectural decision that I'd like to think through carefully before committing to it. The project is still young and changing rapidly, and it feels a bit early. Some considerations:
SocratiCode already has context artifacts for extending project knowledge without code changes, which may partially overlap with this. I'm not ruling it out for the future; I do like the idea of external plugins. But I'd prefer to think about it alongside a few concrete plugins, so there are real use cases and the API is validated by actual usage. I definitely prefer the idea of plugins over code that bloats the core beyond its design, but I'd want them to stay very surface-level, without posing any security concern or touching the core functionality and indexes. I'll keep this open for now for further comments from other contributors. In the meantime, I'd like to focus on the core product: smoothing out existing bugs and implementing core features. |
|
Thanks for the thoughtful review and for keeping the door open. Totally understand wanting to be careful with architectural decisions this early. I want to share the context behind why I built the plugin system, because there's a concrete feature driving it.

Git Memory Plugin

SocratiCode answers "what does this code do?" through semantic search and context artifacts. Git Memory fills a gap: "why was it written this way?" When a project is indexed, it reads unprocessed git commits (diffs, messages, and git trailers), batches them, and sends them to a configurable LLM (OpenRouter, OpenAI, Google, Ollama) to extract structured memories: architectural decisions, bug fixes, refactors, patterns. These get embedded and stored in the same index.

For example, a search for "authentication middleware" wouldn't just find the code; it would also surface "Switched from JWT to sessions due to XSS vulnerability (commit abc123)." An AI assistant would know not to reintroduce a bug because it can find "This validation was missing, caused a production outage, fixed in def456." It's fully opt-in.

I originally built it just as a local project MCP in Go, but I can see that what I've built complements a fully indexed and searchable codebase.

On the Plugin Architecture

Git Memory is a first use case for the plugin architecture. The interface is intentionally minimal (4 optional lifecycle hooks), and I think it's actually the right pattern for SocratiCode going forward. Features like this should be isolated modules that plug into the lifecycle, not code wired throughout the core. That keeps the core lean and each feature self-contained. That said, your concerns are fair.

I'd love to get your take on the git-memory feature itself; if it makes sense, we can work out the right integration approach together. Happy to discuss. |
|
The Git Memory idea is interesting, and thanks for addressing the security concern. A few thoughts:

Feature vs plugin: Git Memory feels more like a core feature behind a flag (like INCLUDE_DOT_FILES or context artifacts) than something that validates a plugin system (more below).

LLM dependency: SocratiCode today uses just embeddings. Simple, local, no API keys (by default). Git Memory would need, I think, a generative LLM to read diffs and extract structured memories, which means configuring providers, API keys, and picking models. That's a different level of complexity that I'm trying to avoid. How would it work without that?

Existing artifacts: Could raw commit messages + diffs be embedded directly without LLM structuring? How could they be kept up to date? If possible, a simple script that updates a folder with all of that would mean artifacts could already cover it. I keep thinking git memory is something for the existing artifacts more than anything.

Existing tools: most coding AI agents already have access to git; they can run git log, git blame, and search commit history dynamically. So is it really a concern for SocratiCode?

On plugins generally: I really do like the idea of opening SocratiCode to community extensions, but I think a plugin should be a lighter touch: something that enriches the index (or the use of it) without introducing heavy dependencies or high complexity.

I'd love to see the Git Memory implementation to understand the architecture better, even if we end up shipping it as a core feature (maybe as part of artifacts) rather than a plugin. Happy to keep discussing :-) |
I understand that perspective too; I never want to be too presumptuous with other people's projects :)
So currently, I've built it to use OpenRouter, and it evaluates first. The three steps it does via LLM are:

Extraction:

Triage:

Synthesis (where some of the magic happens; as currently set up, it needs a good-sized context window):
I expect so, but I do think the "why" is where some of this becomes more valuable. However, without any LLM, "git-commit" could be a context artifact of its own with its own links to search results. "Git memory lite"?
But do they use it? That's the real question. In my experience, I have to keep telling it over and over to scan files, and tell it myself what to scan. MCPs seem to function as first-class tools that it will use. I talk to Opus a lot about why it doesn't use certain things, and it tells me that the more friction something takes, the more likely it is to skip it and just do it the "old-fashioned" way with grep etc. Maybe this is a cost-cutting methodology in how they train the models or inject prompts.
I don't really have a preference one way or another on this. The reason I'm here is that I liked what you put together and thought my beta project was a natural fit. I did some dry-run AI testing with both systems as MCPs before I even considered porting it, and Opus reported a lot of complementary results. I had it mentally dry-run a feature addition to a codebase; it used both MCPs and told me what information it would draw from each system and how that would influence how it built the new feature. I do like modularity (even your existing indexer could, in theory, be a plugin), but if it works as part of the existing codebase, that's fine too, as the plugin I submitted is really the only touch point needed.
What's the best way for me to do this? I can push to my fork after I test a couple of runs. My gap analysis of the port is nearly covered now. My main motivation for the git memory MCP was the realization that git is probably one of the best and most available memory systems a project has at its disposal. Yes, memories can be superseded by later commits, but let's be honest: once it ends up in the repo, it's a meaningful thing to remember. |
|
Ok, got the basic flow working for memory additions. Here's a document I asked Opus to make showing a theoretical feature; it chose a "Test coverage plugin", I'm guessing one where test coverage information is added to the index. Not even sure if this makes any sense, but it does show what it thinks about what it's finding:

Dry Run: "Add a Test Coverage Plugin"

A walkthrough of how an AI agent researches a new feature using SocratiCode. Each step shows the combined results the agent receives, tagged by source.

Phase 1: How Do I Create a Plugin?

The agent runs two searches in parallel and receives these combined results:
Agent is ready to: scaffold.

Phase 2: How Do I Store Data?
Agent is ready to: write the storage layer using the correct API, with relative paths, into the shared collection.
Phase 3: Collection & Project Identity
Agent is ready to: use the canonical naming functions instead of constructing collection names manually.

Full Knowledge Map

Every piece of knowledge the agent gathered, by source:
Code search provided 7 pieces: implementation contracts, API signatures, working templates. Together: 12 distinct pieces of knowledge from 4 parallel query pairs, zero files opened. |
|
I've been thinking more about this, and I think there's an approach that could work well for both the plugin system and Git Memory, one that I like more because it's a good compromise between SocratiCode's KISS philosophy and an expandable plugin system that doesn't affect or interact with any of the core features.

Plugins as artifact generators

What if the plugin contract was simply: "generate files in the context artifacts directory"? SocratiCode's existing pipeline handles embedding, indexing, and search. Plugins just produce the knowledge. The interface could be minimal:
A plugin gets the project path and a directory to write to. It does its thing. SocratiCode indexes whatever it finds there. No Qdrant access, no lifecycle hooks into the indexing pipeline, no core API surface to maintain.

This solves the concerns I have:

Security: plugins never see Qdrant or any core internals.

How Git Memory fits

Git Memory becomes a perfect first plugin, with two modes:

Lite (no LLM): runs git log, extracts commits, and writes structured markdown artifacts: commit messages, authors, affected files, diffs. No AI processing, just organized git history made searchable.

Full (with LLM): same extraction step, then sends batches to a configured LLM for structuring: types, relationships, importance scores. Writes richer artifacts.

Both modes just produce markdown files. SocratiCode's hybrid search (embeddings + BM25) handles them naturally: "switched from JWT to sessions due to XSS" will surface when someone searches "authentication security" regardless of format. Structured markdown with good headings actually chunks well for embeddings. The LLM provider configuration stays entirely within the plugin, while SocratiCode core remains embeddings-only. Users who want the full mode configure the plugin separately.

What this enables

Because the contract is just "generate useful artifacts," other plugins become natural too: API docs from OpenAPI specs, dependency analysis, architecture decision records, CI/CD context. All just file generators, all sandboxed by design.

I know this trades power for safety compared to your original plugin system with lifecycle hooks and Qdrant access. But I think that limitation is actually a feature right now, fitting SocratiCode's original philosophy: doing one thing, well. It's enough to validate the concept, and it covers the git memory use case fully.

What do you think? If this direction works for you, could you share, maybe in a dedicated PR, this approach for the plugin system plus Git Memory Lite and Full?
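The Lite mode described above could be sketched as a small standalone script (the output layout and file naming are assumptions, not the actual implementation):

```typescript
import { execFileSync } from "node:child_process";
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

interface Commit {
  hash: string;
  author: string;
  date: string;
  subject: string;
  body: string;
}

// Read recent commits with `git log`, using control characters as field and
// record separators so multi-line commit bodies parse cleanly.
export function readCommits(repoPath: string, limit = 50): Commit[] {
  const FIELD = "\x1f";
  const RECORD = "\x1e";
  const out = execFileSync(
    "git",
    ["log", `-n${limit}`, `--format=%H${FIELD}%an${FIELD}%aI${FIELD}%s${FIELD}%b${RECORD}`],
    { cwd: repoPath, encoding: "utf8" },
  );
  return out
    .split(RECORD)
    .map((r) => r.trim())
    .filter(Boolean)
    .map((r) => {
      const [hash, author, date, subject, body = ""] = r.split(FIELD);
      return { hash, author, date, subject, body: body.trim() };
    });
}

// One markdown artifact per commit: heading, metadata list, message body.
export function commitToMarkdown(c: Commit): string {
  return `# ${c.subject}\n\n- commit: ${c.hash}\n- author: ${c.author}\n- date: ${c.date}\n\n${c.body}\n`;
}

// Write every commit as a file named by its short hash; the indexer can then
// pick the directory up as context artifacts.
export function writeArtifacts(repoPath: string, outDir: string): void {
  mkdirSync(outDir, { recursive: true });
  for (const c of readCommits(repoPath)) {
    writeFileSync(join(outDir, `${c.hash.slice(0, 12)}.md`), commitToMarkdown(c));
  }
}
```

The Full mode would slot in between `readCommits` and the file write, replacing the raw body with LLM-structured content.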
So there's the plugin system and two first use cases for it: a simple one and a more complex one needing more configuration (the LLM part should support OpenRouter and major providers: OpenAI-compatible, Ollama, etc.). |
|
When I get some additional time, I'll ingest this a bit more fully and see what changes need to be made. Honestly, I don't think it's that much; however, the file format may benefit from a structure like JSON. The current system generates links between memories during the final step and adds "superseded by" type tags to inform the consumer of their relevance. A couple of quick questions:
Maybe the plugin implements a schema that can plug in? A JSON schema would be useful, as the core could validate input before adding anything (or maybe that's the domain of the plugin?). The upsert ends up looking like this:

```js
{
  id, // UUID derived from SHA-256 of "git-memory:{contentHash}"
  vector, // embedding of prepareDocumentText(`[${memoryType}] ${summary}`, `git-memory:${filePath}`)
  bm25Text, // same text as above (for hybrid search)
  payload: {
    // ── Context artifact fields (shared with all SocratiCode context) ──
    artifactName: "git-memory", // constant — groups all git memories
    artifactDescription: "[decision] Git memory (importance: 85) from commits abc123, def456",
    filePath: "src/services/auth.ts", // primary file path
    relativePath: "git-memory", // constant
    content: "Decided to use JWT over sessions because...", // the memory summary
    startLine: 0, // not applicable
    endLine: 0, // not applicable
    language: "git-memory:decision", // "git-memory:{memoryType}"
    type: "git-memory", // constant — Qdrant filter key
    // ── Git-memory-specific fields ──
    contentHash: "a1b2c3d4e5f67890", // 16-char hex for dedup
    sourceCommits: ["abc123...", "def456..."], // full commit hashes
    filePaths: ["src/services/auth.ts", "src/config.ts"], // all related files
    tags: ["authentication", "jwt", "architecture"],
    importance: 85, // 0-100
    confidence: 70, // 0-100
    memoryType: "decision", // one of GIT_MEMORY_TYPES
    createdAt: "2025-06-15T10:30:00Z", // ISO date of earliest source commit
  },
}
```

Will respond more later; haven't had my second cup of joe yet, so I may be completely off base here :P |
|
Sorry, I forgot to answer your questions:
|
|
Closing this one for now; happy to explore it again later, also in consideration of all the core features that are being added and will keep being added, because the product is still new and expanding :-) |
|
Currently I'm just using it as-is (adding context meta to Qdrant), getting a handle on what the data looks like after running it on a few repos, and seeing what value it gives. Didn't have time to do the conversion to flat JSON yet, but I'm keeping it in mind. One thing I noticed, and this may be relevant when using a remote Qdrant server: I was hoping to set this up as a shared resource for our small development team, and the paths are absolute in the Qdrant store. Does this need to be that way? Could we write a file to source control that identifies the project by name or hash, and have the data be written relatively? Just a thought. |
|
The one gap I saw when initially planning to fit it into the existing JSON file schema is that the current schema doesn't seem designed to hold extra meta. That wouldn't allow a plugin to define what data matters within its plugin scope. These are all fields I didn't see a way to make available in the current schema design, but they are VERY important to the knowledge, operation, and value of git memory. Without them, it really loses a lot of value. |
This was actually addressed already in recent weeks (you might need to update): the Qdrant payload stores both [filePath] (absolute, used internally for reading the file) and [relativePath] (relative to the project root). Search results returned to the AI use the relative path, so what the agent sees is something like src/services/auth.ts (lines 42-67), not /home/dev1/projects/myapp/src/services/auth.ts.

The collection naming also doesn't expose your path. By default, the collection name is a SHA-256 hash of the absolute path (e.g. codebase_a1b2c3d4e5f6), not the path itself. So, looking at Qdrant directly, you'll see hash-based names, not filesystem paths.

For your shared remote Qdrant use case, the key is [SOCRATICODE_PROJECT_ID]. Set this to a stable team-wide name (e.g. my-project) and every team member's instance will read/write the same Qdrant collections regardless of where they cloned the repo locally. Without it, each developer's different absolute path produces a different hash, so they'd each get their own index. It goes in the env block of your MCP config.

With this, the whole team shares one index on your remote Qdrant. The SOCRATICODE_PROJECT_ID is also documented in the [Git Worktrees section] of the README and in the [Environment Variables] table. If you want, you can also join the SocratiCode Discord: https://discord.gg/5DrMXfNG |
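For reference, a sketch of that MCP config entry (the server name, command, and args below are placeholders for however you launch SocratiCode; only the SOCRATICODE_PROJECT_ID variable is the point here):

```json
{
  "mcpServers": {
    "socraticode": {
      "command": "node",
      "args": ["/path/to/socraticode/dist/index.js"],
      "env": {
        "SOCRATICODE_PROJECT_ID": "my-project"
      }
    }
  }
}
```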
I think there may be a misunderstanding about the constraint here. Qdrant payloads are schemaless; you can store any JSON fields you want. The current code chunks have fields like [filePath], [relativePath], [content], [startLine], etc., and context artifacts add [artifactName], [artifactDescription], [contentHash], but those aren't a rigid schema that limits what can be stored.

For a git-memory plugin, you wouldn't need to modify the existing payload structure at all. The plugin would manage its own data in its own Qdrant collection (or in the existing collection, with a different [type] field to distinguish git-memory points from code/artifact points). Your PR #12 already exports getClient() from qdrant.ts, so the plugin can upsert points directly with whatever payload fields it needs: sourceCommits, tags, importance, confidence, memoryType, createdAt, all of it.

The helper functions like [upsertChunks] are convenience wrappers for code indexing specifically. A plugin doing something fundamentally different (like git memory) would use the Qdrant client directly, which gives full control over the payload. The collection creation and embedding generation utilities are also exported and reusable.

So the architecture actually supports what you described already. The plugin system from your PR provides the lifecycle hooks (when to run), and the exported Qdrant client provides the storage layer (where to write). No changes to the existing schema needed.

Happy to dig into specifics. There have been many updates, with more in the planning (one of the reasons I keep saying a plugin system at this point must be very light-touch: the product will keep changing at speed). |
|
Great to hear in regard to paths! In regard to the project ID: it seems some AI client implementations use global MCPs for some reason, so putting "SOCRATICODE_PROJECT_ID": "my-project" in the MCP config would fix it to one index, yes? (Currently using Antigravity, and there is no per-project MCP config.) I'll look to see if .socraticode.json in the project root allows a project ID; that would cover it.

I think I misunderstood how .socraticodecontextartifacts.json worked. Not at the computer yet, but after looking again, it seems I could generate git memories in a folder using my own JSON schema and then just add them to .socraticodecontextartifacts.json? Essentially bypassing the need for direct access to Qdrant. Sorry, I'm having to catch up mentally! Then the "plugin manager" would scale down to just something that triggered hooks? This is probably what you were suggesting before, but it went over my head! Apologies.
Project ID with global MCPs

Setting SOCRATICODE_PROJECT_ID in the MCP config pins it to that specific index regardless of what path the client resolves. So even if Antigravity uses a global MCP config, adding "SOCRATICODE_PROJECT_ID": "my-project" means every call maps to the same Qdrant collections. That's the intended escape hatch for clients that don't support per-project MCP configs.

On .socraticode.json: right now that file only supports linkedProjects (for cross-project search). It doesn't have a projectId field yet, so the env var is the way to go for your setup. Adding projectId to .socraticode.json is a reasonable feature request, though, if you want it; it would cover the case where the env var isn't an option.

Context artifacts for git memories

No apologies needed! That's exactly the pattern:
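As a sketch, an entry might look something like this (the field names here are illustrative guesses, not the documented schema; check the README for the exact shape):

```json
{
  "artifacts": [
    {
      "name": "git-memory",
      "path": "docs/git-memory",
      "description": "Git-derived architectural decisions and code evolution patterns"
    }
  ]
}
```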
The description field in the config is important here because it gets stored in every chunk's payload. So if you set it to something like "Git-derived architectural decisions and code evolution patterns", the AI gets that context when results come back, helping it understand what it's looking at.

And yes, the "plugin manager" simplifies down to just triggering hooks. Something like: on codebase_index completion, run the git analysis script, write the output files, and let SocratiCode pick them up on the next codebase_update. No need to touch Qdrant directly at all.

The beauty of this approach is that it's completely decoupled. Your git-memory tool doesn't even need to know SocratiCode exists. It just writes files. SocratiCode just indexes files. The .socraticodecontextartifacts.json config is the glue. |
|
That approach would simplify some of what I'm doing in the code as well. Currently:
Most recently, I had changed it to write to Qdrant after every generation, since if some LLM calls failed I was having to regenerate the whole commit again. Later revisions wrote to Qdrant per generation and then altered the existing memories on synthesis. Files would work well because it's a semi-heavy operation. Having files based on memories makes it idempotent: once memories are generated, they never have to be regenerated (for instance, if the remote Qdrant connection is down or something). This would save LLM hits. Also, some AIs will just look at files in the codebase anyway and gather knowledge, so it would help if the LLM decides it's "too hard" to use the MCP; it would run into the same memories then. |
File-based approach

That's a great direction. The file-based pattern gives you exactly what you described: idempotency (skip already-generated commits), resilience (Qdrant being down doesn't lose work), and discoverability (agents browsing the repo find the memories directly). It's a strictly better architecture than writing to Qdrant on every LLM call.

Where this leaves the PR

Thinking about it, the file-based approach means your git-memory tool is fully external to SocratiCode. It generates files, you list them in the context artifacts config, and SocratiCode indexes them.

So PR #12 as scoped (plugin manager with lifecycle hooks) may not be needed anymore for your use case. That said, if after building it out you find there's a small, focused contribution that would help, we can look at it then.

I'd say: build out the git-memory tool with the file-based approach, see how it feels end-to-end, and then maybe we can see how it could be integrated into SocratiCode? That way it maps directly to a real friction point rather than theoretical infrastructure (with only one main application). Looking forward to seeing what you build with it. |
Summary
Adds a PluginManager class that enables SocratiCode to be extended via self-contained plugins without modifying core code. Plugins are auto-discovered from src/plugins/*/index.ts at startup and receive lifecycle hooks. All plugin errors are non-fatal — a failing plugin never affects the indexer.
SocratiCode gives AI agents context about what code does and how it's structured. Plugins extend that context with knowledge that can't be extracted from source files alone — things like why code was written a certain way, which parts of the codebase are most volatile, or what implicit dependencies exist between components. This additional context is stored alongside the existing index and surfaced automatically during search, giving AI agents a deeper understanding of the project without any changes to the core.
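In spirit, the non-fatal hook dispatch works like this sketch (the hook names and Plugin shape are illustrative, not the PR's exact API):

```typescript
type HookName = "onIndexStart" | "onFileIndexed" | "onIndexComplete" | "onShutdown";

interface Plugin {
  name: string;
  onIndexStart?(projectPath: string): void | Promise<void>;
  onFileIndexed?(filePath: string): void | Promise<void>;
  onIndexComplete?(): void | Promise<void>;
  onShutdown?(): void | Promise<void>;
}

class PluginManager {
  private plugins: Plugin[] = [];

  register(plugin: Plugin): void {
    this.plugins.push(plugin);
  }

  // Dispatch one lifecycle hook to every registered plugin, in registration
  // order. A plugin that throws is logged and skipped; it never aborts the run.
  async dispatch(hook: HookName, ...args: unknown[]): Promise<void> {
    for (const plugin of this.plugins) {
      const fn = plugin[hook] as ((...a: unknown[]) => unknown) | undefined;
      if (!fn) continue; // all hooks are optional
      try {
        await fn.apply(plugin, args);
      } catch (err) {
        console.error(`[plugin:${plugin.name}] ${hook} failed:`, err);
      }
    }
  }
}
```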
Changes
Type of change
Testing
12 tests covering: plugin registration, hook dispatch order, non-fatal error isolation, onProgress forwarding, shutdown resilience, and test reset. No plugins directory = gracefully skipped.
Checklist
Related issues
None