embd is a RAG pipeline: ingest documents → embed with sentence-transformers → store in ChromaDB → query via CLI, TUI, or HTTP API. The HTTP API is designed as a ChatGPT Actions backend. Retrieval uses hybrid search (semantic + BM25 via Reciprocal Rank Fusion).
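
The hybrid step mentioned above can be illustrated with a minimal Reciprocal Rank Fusion sketch. This is not the code in `hybrid_retriever.py`; the function name `rrf_merge`, the `k=60` constant, and the optional per-list `weights` are assumptions made for illustration only.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60, weights=None):
    """Fuse several best-first ranked lists of chunk IDs with RRF.

    Hypothetical helper: each ID scores sum(weight / (k + rank)) over
    the lists it appears in, with ranks starting at 1.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, chunk in enumerate(ranking, start=1):
            scores[chunk] += weight / (k + rank)
    # Highest fused score first; ties broken by ID so output is stable.
    return sorted(scores, key=lambda c: (-scores[c], c))


semantic_hits = ["c1", "c2", "c3"]  # made-up IDs from the vector search
bm25_hits = ["c3", "c1", "c4"]      # made-up IDs from the BM25 search
fused = rrf_merge([semantic_hits, bm25_hits])
print(fused)  # ['c1', 'c3', 'c2', 'c4']
```

Because RRF only looks at ranks, it does not need the semantic distances and BM25 scores to be on a comparable scale, which is presumably why it suits this kind of merge.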
# Development
source .venv/bin/activate
embd ingest # ingest documents + build BM25 index
embd ingest --refresh # re-check all files by hash
embd ingest --reset # drop everything, full re-ingest
embd ingest --contextualize # LLM context generation on chunks
embd ingest --contextualize-estimate-only # cost/time estimate, no writes
embd query "question" # query from CLI (hybrid search)
embd shell # interactive TUI
embd serve # HTTP API
# Docker
docker compose up -d # dev (self-signed TLS)
docker compose -f docker-compose.yml \
  -f deploy/docker-compose.le.yml up -d # production (Let's Encrypt)

- `src/embd/`: main Python package
  - `config.py`: loads `config.toml` + `.env`
  - `embedding/encoder.py`: sentence-transformers wrapper (MPS/CUDA/CPU)
  - `server.py`: FastAPI HTTP API
  - `ingestion/`: extractors, chunker, scanner, file watcher
    - `bm25_index.py`: BM25 index build/persist/query
    - `contextual.py`: contextual generation pipeline + cost estimation
  - `store/`: ChromaDB vector store + SQLite metadata
    - `vector_store.py`: ChromaDB wrapper
    - `meta_db.py`: SQLite file-level tracking, contextual progress
  - `qa/`: retrieval and generation
    - `retriever.py`: semantic + BM25 hybrid retriever
    - `hybrid_retriever.py`: RRF merge function
    - `generator_*.py`: MLX, Ollama, Claude backends
- `deploy/`: nginx reverse proxy, Docker, TLS, IP allowlist
- `config.toml`: all tunables (paths, models, chunking, retrieval weights, server)
- `.env`: secrets (git-ignored)
- `tests/`: pytest tests for meta_db, bm25, rrf, contextual estimation

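
As a rough picture of how file-level tracking in SQLite can support a hash-based re-check like `embd ingest --refresh`, here is a small sketch. The table layout and the helper names (`open_meta_db`, `needs_ingest`, `record_ingest`, `file_sha256`) are hypothetical and are not taken from `meta_db.py`; the choice of SHA-256 is also an assumption.

```python
import hashlib
import sqlite3


def file_sha256(path, block_size=1 << 20):
    """Hash a file in blocks so large documents are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def open_meta_db(db_path=":memory:"):
    """Open (or create) a tiny tracking table; the schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "source_key TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    return conn


def needs_ingest(conn, source_key, digest):
    """True when the file is new or its content hash has changed."""
    row = conn.execute(
        "SELECT sha256 FROM files WHERE source_key = ?", (source_key,)
    ).fetchone()
    return row is None or row[0] != digest


def record_ingest(conn, source_key, digest):
    """Remember the hash that was ingested for this file."""
    conn.execute(
        "INSERT OR REPLACE INTO files (source_key, sha256) VALUES (?, ?)",
        (source_key, digest),
    )
    conn.commit()
```

A scanner built on these helpers would call `needs_ingest` with `file_sha256(path)` for each file and skip unchanged ones.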
- Config: `config.toml` for tunables, `.env` for secrets
- Chunk IDs are deterministic from `(source_key, page, chunk_index)`
- Embeddings are L2-normalized; ChromaDB uses cosine space
- File watcher (watchdog) auto-ingests on change during `serve`/`shell`
- BM25 index rebuilds after every ingest + 30s debounced rebuild in watcher
- SQLite `embd_meta.db` tracks file stats and contextual generation progress
- Contextual generation is resumable per-file
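
Two of the conventions above, deterministic chunk IDs and L2-normalized embeddings in cosine space, can be sketched in a few lines. The hashing scheme, the separator, and the 32-character truncation are assumptions for illustration; the project may derive its IDs differently. Pure-Python vectors stand in for real embeddings.

```python
import hashlib
import math


def chunk_id(source_key, page, chunk_index):
    """Hypothetical ID scheme: hash the (source_key, page, chunk_index) triple.

    The same triple always yields the same ID, so re-ingesting an unchanged
    chunk overwrites its record instead of creating a duplicate.
    """
    raw = f"{source_key}\x1f{page}\x1f{chunk_index}"  # \x1f = unit separator
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """Cosine distance, 1 - cos(a, b), for non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
# For unit vectors the cosine distance reduces to 1 - dot product.
dot = sum(x * y for x, y in zip(a, b))
print(chunk_id("docs/a.pdf", 1, 0) == chunk_id("docs/a.pdf", 1, 0))  # True
print(abs(cosine_distance(a, b) - (1.0 - dot)) < 1e-12)              # True
```

With unit-length embeddings, ranking by cosine distance and ranking by dot product give the same order, so normalizing at encode time keeps the two consistent.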
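
Install the package in editable mode with all extras, plus `rank_bm25` and `pytest`, then run the test suite:

pip install -e '.[all]' rank_bm25 pytest
pytest tests/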