zmem

Local-first hybrid memory for engineering workflows.

zmem is a small POC/experiment inspired by QMD-style document search, built to explore whether we can get strong practical recall by combining:

dense retrieval via @zvec/zvec
lexical retrieval via SQLite FTS5/BM25
a simple fusion pipeline (RRF)
a local MCP server for coding-agent integration
graph data for relationships between memories (wip)

overall it is a local-first, agent-oriented memory substrate for engineering decisions and evolving project context

Quickstart

Get zmem running locally in a few minutes.

1. Install

npm install -g @cosmiclasagnadev/zmem
zmem help

Or from this repo:

npm install
npm run build
npm install -g .

2. Initialize a workspace

Create or update a starter config for the workspace you want to index:

zmem init --workspace=default --root=/absolute/path/to/your/project --yes

This creates a config and sets up zmem to use user-scoped storage by default.

3. Check setup

Validate config, storage, and local model settings:

zmem doctor --workspace=default
zmem models status

4. Prepare local query-expansion models

zmem uses local-first query expansion by default.

See the exact model pull commands:

zmem models pull

This will print commands for the default local models:

primary: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_M
fallback: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_S

Run the printed node-llama-cpp pull commands, or let zmem download on first query.

5. Ingest your docs

zmem ingest /absolute/path/to/your/project --workspace=default

6. Query your memory

zmem query "database migration" --workspace=default
zmem query "" --mode=recent --workspace=default
zmem query "" --mode=typed --types=decision,preference --workspace=default

7. Use with MCP / OpenCode

Start the MCP server:

zmem mcp --workspace=default

Then point your MCP client, such as OpenCode, at the zmem command.

First-run notes

Default storage is outside your repo:
- macOS/Linux: ~/.local/share/zmem/workspaces/<workspace-slug>/
- Windows: %APPDATA%/zmem/workspaces/<workspace-slug>/
Query expansion is enabled by default for hybrid retrieval.
If the local query-expansion model is unavailable, zmem warns once and falls back to deterministic expansion.
You can inspect the fully resolved config with:

zmem config show --workspace=default

References:

QMD
ZVec

What this does differently

explicit memory types and lifecycle (pending, active, archived, deleted) plus supersedes_id
- if no explicit type is defined, it defaults to 'fact'
- if no explicit tags are defined, it defaults to '[]'
local-first query expansion is enabled by default for hybrid retrieval, with deterministic fallback when the local model is unavailable
uses @zvec/zvec for dense retrieval when doing vector search

Architecture (high level)

Language/runtime: TypeScript + Node.js
Vectors: @zvec/zvec
Lexical: better-sqlite3 + FTS5/BM25
Embeddings: node-llama-cpp by default, with explicit provider hooks for Gemini and offline mock testing
Validation: zod
Agent integration: @modelcontextprotocol/sdk

Hybrid retrieval flow (follows qmd's style closely):

lexical search (FTS/BM25)
vector search (zvec)
reciprocal rank fusion (RRF)
bounded query expansion
heuristic memory-item reranking

Indexing pipeline

flowchart LR
    A[Markdown/Text Sources] --> B[Ingestion Pipeline]
    B --> C[Parse + Chunk]
    C --> D[(SQLite + FTS5)]
    C --> E[Generate Embeddings]
    E --> F[(zvec Vector Index)]

Retrieval pipeline

flowchart LR
    Q[User Query] --> M{Mode}

    M -->|lexical| L[FTS/BM25 Search]
    M -->|vector| V[Embedding + zvec Search]
    M -->|hybrid| H[FTS + Vector in parallel]

    L --> O[Top Results]
    V --> O
    H --> R[RRF Fusion]
    R --> O

    O --> C1[CLI Output]
    O --> C2[MCP Tools]

At runtime, the same core API powers both CLI commands and MCP tools, so behavior stays consistent across direct terminal use and agent workflows.

Project layout

src/ingest/ - discovery, parsing, chunking, orchestration
src/search/ - lexical/vector retrieval + fusion
src/core/ - shared API (save, recall, get, list, delete, reindex, status)
src/mcp/ - MCP server + tool registration
src/db/ - SQLite handling + migrations
config.example.json - sample configuration

Getting started

1) Install

npm install

2) Configure

Create your local config file:

cp config.example.json config.json

Then update:

workspaces[].root to a real absolute path
any desired patterns for ingestion
storage paths (storage.dbPath, storage.zvecPath) if needed
ai.embedding.provider if you want to switch from local llamacpp to gemini
ai.embedding.apiKey / ZMD_EMBED_API_KEY when using gemini
storage.baseDir only if you want to override the default XDG-style storage root
ai.queryExpansion.* if you want to tune or disable default local-first query expansion

If config.json is missing, zmem falls back to defaults.

Default storage is outside your repo:

macOS/Linux: ~/.local/share/zmem/workspaces/<workspace-slug>/
Windows: %APPDATA%/zmem/workspaces/<workspace-slug>/

Within each workspace directory, zmem stores:

memory.db
vectors/

Advanced overrides still work through storage.baseDir, storage.dbPath, storage.zvecPath, ZMEM_STORAGE_BASE_DIR, ZMEM_DB_PATH, and ZMEM_ZVEC_PATH.

Query expansion can also be tuned with env vars such as:

ZMEM_QUERY_EXPANSION_ENABLED
ZMEM_QUERY_EXPANSION_PROVIDER
ZMEM_QUERY_EXPANSION_MODEL
ZMEM_QUERY_EXPANSION_FALLBACK_MODEL
ZMEM_QUERY_EXPANSION_MAX_EXPANSIONS

3) Ingest documents

npm run dev -- ingest ./docs --workspace=default

4) Query memory

npm run dev -- query "database decisions" --mode=hybrid --workspace=default

5) Check status

npm run dev -- status --workspace=default

CLI usage

Run help:

npm run dev -- help

Primary commands:

ingest <path> [--workspace=<name>] [--logs=true|false]
query <query> [--workspace=<name>] [--mode=hybrid|lexical|vector|recent|important|typed] [--scopes=a,b] [--types=a,b] [--expansion-mode=off|deterministic|llm] [--logs=true|false]
status [--workspace=<name>] [--logs=true|false]
init [--config=./config.json] [--workspace=<name>] [--root=<path>] [--storage-base-dir=<path>] [--enable-query-expansion=true|false] [--yes]
doctor [--config=./config.json] [--workspace=<name>]
config show [--config=./config.json] [--workspace=<name>]
models <status|check|pull> [--config=./config.json]
mcp [--config=./config.json] [--workspace=<name>] [--verbose=true|false]

Examples:

npm run dev -- ingest ./test-docs/search --workspace=default
npm run dev -- query "sqlite" --mode=lexical --workspace=default
npm run dev -- query "vector embeddings" --mode=vector --workspace=default
npm run dev -- init --workspace=default --root=/absolute/path/to/repo --yes
npm run dev -- doctor --workspace=default
npm run dev -- models pull

Local query expansion

zmem now defaults to local-first query expansion for hybrid retrieval.

primary model: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_M
fallback model: hf:mradermacher/qmd-query-expansion-qwen3.5-2B-GGUF:Q4_K_S
expansion is skipped when lexical exact-match signal is already strong
if the local model is unavailable, zmem warns once and falls back to deterministic expansion

To inspect or prepare local models:

npm run dev -- models status
npm run dev -- models pull

MCP usage

Start the MCP stdio server:

npm run dev -- mcp --workspace=default

Implemented tools:

memory_query
memory_search
memory_get
memory_list
memory_save
memory_delete
memory_status
memory_link
memory_neighbors
memory_edge_update

Optional admin tool:

memory_reindex (enabled with ZMEM_ENABLE_REINDEX_TOOL=true)

Verbose MCP logs:

ZMEM_MCP_VERBOSE=true npm run dev -- mcp --workspace=default

Local development

Useful scripts:

npm run dev - run CLI via tsx
npm run build - build TypeScript to dist/
npm start - run built CLI
npm run typecheck - type-check without emitting
npm test - run tests
npm run smoke - build + smoke script

Typical dev loop:

npm run typecheck
npm test
npm run smoke

Environment variables

ZMD_EMBED_MODEL - override embedding model
ZMD_EMBED_DIMENSIONS - override embedding dimensions
ZMD_EMBED_PROVIDER - override embedding provider (llamacpp, openai, ollama, gemini, mock)
ZMD_EMBED_API_KEY - embedding API key override for remote providers such as Gemini
ZMD_EMBED_BASE_URL - optional embedding API base URL override
ZMD_EMBED_TASK_TYPE - optional Gemini task type override such as RETRIEVAL_DOCUMENT
ZMEM_STORAGE_BASE_DIR - override the XDG-style storage root
ZMEM_DB_PATH - override the resolved database path directly
ZMEM_ZVEC_PATH - override the resolved vector storage path directly
ZMEM_WORKSPACE - default workspace for MCP resolution
ZMEM_MCP_VERBOSE=true - verbose MCP logs to stderr
ZMEM_ENABLE_REINDEX_TOOL=true - expose memory_reindex MCP tool

Roadmap

graph traversal poc
Rust implementation and comparison with metrics
policy/compliance/audit/retention layer
added unit tests for error paths
improve batching controls and recall latency metrics
more integration ergonomics and ideas
reranking improvements (position-aware blend)
query expansion strategies
deeper retrieval tuning and eval harnesses

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.github/workflows		.github/workflows
plans		plans
scripts		scripts
src		src
test-docs		test-docs
test		test
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
config.example.json		config.example.json
package-lock.json		package-lock.json
package.json		package.json
test-discovery.cjs		test-discovery.cjs
tsconfig.build.json		tsconfig.build.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

zmem

Quickstart

1. Install

2. Initialize a workspace

3. Check setup

4. Prepare local query-expansion models

5. Ingest your docs

6. Query your memory

7. Use with MCP / OpenCode

First-run notes

What this does differently

Architecture (high level)

Indexing pipeline

Retrieval pipeline

Project layout

Getting started

1) Install

2) Configure

3) Ingest documents

4) Query memory

5) Check status

CLI usage

Local query expansion

MCP usage

Local development

Environment variables

Roadmap

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

zmem

Quickstart

1. Install

2. Initialize a workspace

3. Check setup

4. Prepare local query-expansion models

5. Ingest your docs

6. Query your memory

7. Use with MCP / OpenCode

First-run notes

What this does differently

Architecture (high level)

Indexing pipeline

Retrieval pipeline

Project layout

Getting started

1) Install

2) Configure

3) Ingest documents

4) Query memory

5) Check status

CLI usage

Local query expansion

MCP usage

Local development

Environment variables

Roadmap

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages