OmniBioAI Dev Hub — RAG V6

Production-grade Retrieval-Augmented Generation (RAG) system powering the OmniBioAI ecosystem documentation, architecture search, workflow discovery, and developer assistant APIs.

Features

FAISS-native vector search
Incremental indexing
Ollama local embeddings + local LLM inference
FastAPI API server
Streaming responses (SSE)
Chunk-level document retrieval
Repository-wide multi-project indexing
Fully local execution
No OpenAI dependency
Production-safe embedding normalization
Hybrid-ready architecture
V6 dimension consistency enforcement

Architecture

Repositories
     ↓
Document Loader
     ↓
Chunker
     ↓
Ollama Embeddings (768-d)
     ↓
FAISS Vector Index
     ↓
RAG Engine
     ↓
FastAPI API
     ↓
LLM Answer Generation

V6 Major Improvements

FAISS Native Retrieval

Previous versions used brute-force cosine scanning across vectors.

V6 uses:

faiss.IndexFlatIP

Benefits:

10–50x faster retrieval
scalable search
lower latency
future ANN support
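
Note that `faiss.IndexFlatIP` ranks by raw inner product, so it reproduces a cosine ranking only when the stored and query vectors are L2-normalized. The pure-Python sketch below illustrates that equivalence. The function names are illustrative and are not taken from this repository.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Reference cosine similarity computed directly from the raw vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return inner_product(a, b) / (norm_a * norm_b)

a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]

# Inner product of the normalized vectors equals cosine of the raw vectors.
assert math.isclose(inner_product(l2_normalize(a), l2_normalize(b)), cosine(a, b))
```

In practice this means normalizing once at indexing time and once per query, after which the inner-product index needs no further cosine logic.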

Embedding Consistency Fix

A major issue in previous builds was embedding mismatch.

Old Problem

Stage	Model	Dimension
Indexing	all-MiniLM-L6-v2	384
Querying	nomic-embed-text	768

This caused FAISS assertion failures:

AssertionError: d == self.d

V6 Fix

Now BOTH ingestion and retrieval use:

nomic-embed-text

Dimension:

768

This guarantees:

stable retrieval
no dimension mismatch
deterministic FAISS behavior
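
One way to surface a mismatch earlier, with a clearer message than the bare FAISS assertion, is to validate vector length before anything touches the index. This is a minimal sketch under the assumption that vectors are plain sequences of floats; `check_dim` and `EXPECTED_DIM` are hypothetical names, not part of the repository.

```python
EXPECTED_DIM = 768  # nomic-embed-text output size stated in this README

def check_dim(vector, expected_dim=EXPECTED_DIM):
    """Raise a descriptive error before a wrong-sized vector reaches FAISS."""
    actual = len(vector)
    if actual != expected_dim:
        raise ValueError(
            f"Embedding has {actual} dimensions but the index expects "
            f"{expected_dim}; was it produced by a different embedding model?"
        )
    return vector
```

Calling such a guard on both the ingestion and the query path would turn the old 384 vs 768 failure into an explicit error that names both sizes.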

Repository Structure

omnibioai-dev-hub/
│
├── api/
│   ├── main.py
│   └── routes/
│
├── rag/
│   ├── engine.py
│   └── control_plane.py
│
├── index/
│   └── vector_store.py
│
├── embeddings/
│   └── embedder.py
│
├── ingestion/
│   └── doc_loader.py
│
├── processing/
│   └── chunker.py
│
├── scripts/
│   └── build_index.py
│
└── data/

Requirements

Python

Recommended:

Python 3.11

Ollama

Install:

curl -fsSL https://ollama.com/install.sh | sh

Required Ollama Models

Embedding Model

ollama pull nomic-embed-text

Generation Models

Recommended:

ollama pull mistral

Optional:

ollama pull llama3
ollama pull deepseek-coder
ollama pull deepseek-r1

Installation

Create Environment

conda create -n chemoinfo python=3.11 -y
conda activate chemoinfo

Install Dependencies

pip install fastapi uvicorn requests numpy faiss-cpu sentence-transformers

Build Index

Clean Existing Data

rm -rf data/*

Build V6 Index

python scripts/build_index.py

Expected output:

🚀 Incremental V6 Indexing Starting...
✅ V6 Index Complete
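
This README does not describe how the incremental step decides what to re-embed. One common approach, sketched here purely as an assumption, is to keep a manifest of content hashes from the previous build and re-embed only files whose hash changed. All names below are hypothetical.

```python
import hashlib

def content_hash(text):
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def plan_incremental_update(documents, manifest):
    """Split documents into those needing (re)embedding and those removed.

    documents: mapping of path -> current text
    manifest:  mapping of path -> hash recorded at the previous build
    """
    to_index, new_manifest = [], {}
    for path, text in documents.items():
        digest = content_hash(text)
        new_manifest[path] = digest
        if manifest.get(path) != digest:
            to_index.append(path)  # new or changed since the last build
    removed = [path for path in manifest if path not in documents]
    return to_index, removed, new_manifest
```

With an empty manifest, as after cleaning `data/`, every document lands in `to_index`, which matches a full rebuild.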

Run API Server

uvicorn api.main:app --host 0.0.0.0 --port 8082 --reload

Test Query API

curl -X POST http://localhost:8082/rag/query \
-H "Content-Type: application/json" \
-d '{"query":"What is workflow engine in OmniBioAI?"}'

Example response:

{
  "query": "What is workflow engine in OmniBioAI?",
  "answer": "According to the provided context...",
  "sources": [
    "../omnibioai-workflow-bundles/README.md"
  ],
  "context_used": 5,
  "version": "v6-faiss",
  "api_version": "v6"
}
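
A client can parse this response into a small typed object. The sketch below relies only on the fields shown in the example response above; `RagAnswer` and `parse_rag_response` are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass

@dataclass
class RagAnswer:
    query: str
    answer: str
    sources: list
    context_used: int

def parse_rag_response(body):
    """Parse a /rag/query JSON body; a missing field raises KeyError."""
    data = json.loads(body)
    return RagAnswer(
        query=data["query"],
        answer=data["answer"],
        sources=list(data["sources"]),
        context_used=int(data["context_used"]),
    )
```

Reading the fields explicitly makes a changed response shape fail immediately at the client boundary instead of deeper in calling code.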

Streaming API

Endpoint:

POST /rag/stream

Uses:

Server-Sent Events (SSE)
token streaming
real-time generation
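
On the client side, an SSE stream is a sequence of `data:` lines grouped into events by blank lines. The payload format of each event from `/rag/stream` is not documented here, so this sketch only extracts the raw `data` strings and leaves their interpretation to the caller. `iter_sse_data` is a hypothetical helper.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)  # a blank line ends the current event
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored.
    if buffer:
        # Lenient choice: flush a final event that lacks a trailing blank line.
        yield "\n".join(buffer)
```

The function accepts any iterable of decoded lines, so it can be fed from a streaming HTTP response or from a list in a unit test.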

V6 Retrieval Pipeline

Step 1 — Chunking

Documents are split into semantic chunks.
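
The chunking strategy in `processing/chunker.py` is not spelled out in this README. As a rough approximation of semantic chunking, the sketch below groups consecutive paragraphs up to a size limit; `chunk_text` and the `max_chars` default are assumptions for illustration.

```python
def chunk_text(text, max_chars=800):
    """Group consecutive paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole rather than being
    cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk before it overflows
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries keeps related sentences together, which tends to give the embedding model more coherent input than fixed-width slicing.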

Step 2 — Embedding

Each chunk is embedded using:

nomic-embed-text

Output dimension:

768

Step 3 — FAISS Indexing

Vectors are stored in:

faiss.IndexFlatIP

Step 4 — Query Embedding

The user query is embedded using the SAME embedding model as the chunks, so both live in the same 768-dimensional space.
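
Because indexing and querying must share one model, a single helper can serve both paths. This is a minimal sketch, assuming Ollama listens on its default address `http://localhost:11434` and that its `/api/embeddings` endpoint is used; the helper names are illustrative and not taken from `embeddings/embedder.py`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed: Ollama's default local address
EMBED_MODEL = "nomic-embed-text"

def build_embed_request(text, model=EMBED_MODEL, base_url=OLLAMA_URL):
    """Build the HTTP request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text, timeout=60):
    """Embed a chunk or a query. Requires a running Ollama server."""
    with urllib.request.urlopen(build_embed_request(text), timeout=timeout) as resp:
        return json.loads(resp.read())["embedding"]
```

Routing every embedding call through one function like `embed` makes it hard to reintroduce the two-model mismatch described earlier.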

Step 5 — Vector Search

FAISS retrieves the nearest chunks, ranked by inner-product score.
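
Conceptually, a flat inner-product index scores the query against every stored vector and keeps the best k. The pure-Python sketch below mirrors that behavior without FAISS, assuming the vectors are already normalized; `top_k` is a hypothetical helper, and the default `k=5` simply matches the `context_used` value in the example response.

```python
import heapq

def top_k(query_vec, chunk_vecs, metadata, k=5):
    """Exhaustive inner-product search: score every chunk, keep the k best."""
    scored = (
        (sum(q * c for q, c in zip(query_vec, vec)), idx)
        for idx, vec in enumerate(chunk_vecs)
    )
    best = heapq.nlargest(k, scored)  # highest scores first
    return [(score, metadata[idx]) for score, idx in best]
```

FAISS performs the same exhaustive comparison in optimized native code, which is where the speedup over a Python-level scan comes from.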

Step 6 — Prompt Assembly

The retrieved chunks are concatenated into the context section of the prompt.
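
The exact prompt template used by `rag/engine.py` is not shown in this README. The sketch below is one plausible layout, assuming each retrieved chunk is a dict with `source` and `text` keys; both the keys and the wording of the instructions are assumptions.

```python
def build_prompt(question, chunks):
    """Combine retrieved chunks and the user question into one prompt string."""
    context = "\n\n".join(
        f"[{i}] (source: {chunk['source']})\n{chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the chunks and naming their sources in the prompt makes it easier to map an answer back to the `sources` list returned by the API.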

Step 7 — LLM Generation

The prompt is sent to the local Ollama model, which generates the answer.
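
A non-streaming call to Ollama can be sketched as follows, assuming the default address `http://localhost:11434` and the `/api/generate` endpoint. The helper names and the 120-second timeout are illustrative choices, not values taken from the repository.

```python
import json
import urllib.request

def build_generate_request(prompt, model="mistral",
                           base_url="http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, model="mistral", timeout=120):
    """Return the model's full answer. Requires a running Ollama server."""
    req = build_generate_request(prompt, model=model)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

An explicit `timeout` keeps a slow model from hanging the request indefinitely; larger models may need a higher value.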

Supported Repositories

Current indexing targets:

repos = [
    "../omnibioai",
    "../omnibioai-rag",
    "../omnibioai-toolserver",
    "../omnibioai-sdk",
    "../omnibioai-workflow-bundles",
    "../omnibioai-control-center",
    "../omnibioai-lims",
    "../omnibioai-model-registry",
    "../omnibioai-dev-docker"
]
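
A loader can walk such a list and collect the files to index. The sketch below assumes only Markdown files are indexed and silently skips repositories that are not checked out next to this one; `find_docs` is a hypothetical helper and not the actual `ingestion/doc_loader.py` implementation.

```python
from pathlib import Path

def find_docs(repo_paths, suffixes=(".md",)):
    """Return sorted document paths under each repository that exists on disk."""
    found = []
    for repo in repo_paths:
        root = Path(repo)
        if not root.is_dir():
            continue  # skip repositories that are not checked out locally
        found.extend(
            path for path in root.rglob("*")
            if path.is_file() and path.suffix.lower() in suffixes
        )
    return sorted(found)
```

Passing the `repos` list above to `find_docs` would return every Markdown file under the sibling repositories that are present.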

Performance

Before V6

brute-force cosine scan
slow retrieval
dimension mismatch bugs
unstable indexing

After V6

FAISS-native retrieval
stable dimensions
fast semantic search
local-only execution
scalable architecture

Troubleshooting

FAISS Dimension Mismatch

Error:

AssertionError: d == self.d

Cause:

Different embedding models used during indexing vs querying.

Fix:

Rebuild index using the SAME embedding model.

Ollama Timeout

Error:

Read timed out

Fix:

Use a smaller generation model:

model="mistral"

instead of:

deepseek-r1

Empty Retrieval Results

Check:

python -c "
from index.vector_store import VectorStore
import numpy as np

vs = VectorStore()
vs.add([np.random.rand(768)], [{'text':'test'}])

print(vs.index.ntotal)
"

Expected (when the store starts empty):

1

Future Roadmap

Planned V7 Features

IVF indexes
HNSW search
metadata filtering
hybrid BM25 + vector search
reranking
cross-encoder scoring
persistent FAISS storage
multi-user collections
distributed indexing
workflow-aware retrieval
graph RAG
plugin-aware retrieval

License

Internal OmniBioAI Development License.

OmniBioAI Ecosystem

RAG V6 powers:

architecture discovery
workflow documentation search
plugin documentation retrieval
developer assistant APIs
AI infrastructure exploration
cross-repository semantic search
internal engineering copilots
