Sentinel-RAG is a local-first AI agent for secure Retrieval-Augmented Generation (RAG). It lets users query their private document repositories (PDFs, DOCs, notes) using Google Gemini, while keeping the document store and vector index on the local machine.
The goal is to provide a secure, private interface for document interaction in which every answer is strictly grounded in the provided context and fully traceable to its source.
- Privacy-First: Operates in a local-first environment. Your data stays on your machine.
- Strict Grounding: The agent ONLY uses provided document context to answer queries.
- Zero Hallucination: If the answer isn't in the documents, the agent will clearly state it doesn't know.
- Traceable Citations: Every response includes precise citations (Document Name | Page/Chunk).
- Gemini Optimized: Fully integrated with Google Gemini 3.0+ for high-quality reasoning and embedding.
- Minimalist UI: A clean, professional Gradio interface for seamless document ingestion and chat.
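The "Document Name | Page/Chunk" citation format listed above could be produced along the following lines. This is a minimal sketch, not the project's actual implementation: `RetrievedChunk`, `format_citation`, and `format_citations` are hypothetical names, and the "Location unknown" fallback is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetrievedChunk:
    """Hypothetical container for one retrieved chunk (not the project's real type)."""
    document_name: str
    text: str
    page: Optional[int] = None         # page number, when the source has pages
    chunk_index: Optional[int] = None  # fallback locator for unpaginated sources


def format_citation(chunk: RetrievedChunk) -> str:
    """Render 'Document Name | Page N' or 'Document Name | Chunk N'."""
    if chunk.page is not None:
        locator = f"Page {chunk.page}"
    elif chunk.chunk_index is not None:
        locator = f"Chunk {chunk.chunk_index}"
    else:
        locator = "Location unknown"  # assumed fallback, not from the README
    return f"{chunk.document_name} | {locator}"


def format_citations(chunks: list[RetrievedChunk]) -> list[str]:
    """Return unique citations, preserving retrieval order."""
    seen: set[str] = set()
    ordered: list[str] = []
    for chunk in chunks:
        citation = format_citation(chunk)
        if citation not in seen:
            seen.add(citation)
            ordered.append(citation)
    return ordered
```

Deduplicating while preserving order keeps the citation list short when several retrieved chunks come from the same page.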
- LLM: Google Gemini (Gemini 2.0/3.0)
- Framework: LlamaIndex (Context Retrieval & Orchestration)
- Backend: FastAPI
- UI: Gradio
- Vector Store: Qdrant / Local File System
- Language: Python 3.11+
- Python 3.11 installed
- A Google Gemini API Key
Clone the repository and install dependencies using Poetry:
    # Install dependencies
    poetry install --with ui,llms-gemini,embeddings-gemini

The project uses a consolidated settings.yaml for all configurations.
- Create or edit settings.yaml in the root directory.
- Add your Gemini API Key:

    gemini:
      api_key: "YOUR_GOOGLE_API_KEY"

Start the local server and UI:

    python -m private_gpt

Once started, the UI will be available at http://localhost:8001.
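To confirm that the server came up, a small standard-library check against the UI address can help. This is a sketch under assumptions: `ui_is_reachable` is a hypothetical helper that is not part of the project, and it only verifies that something answers HTTP at the URL, not that the agent is fully initialized.

```python
import urllib.error
import urllib.request


def ui_is_reachable(url: str = "http://localhost:8001", timeout: float = 3.0) -> bool:
    """Return True if any HTTP response comes back from the URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    print("UI reachable" if ui_is_reachable() else "UI not reachable yet")
```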
The agent is hardcoded with the following non-negotiable rules:
- Context Only: Answers must be derived strictly from retrieved chunks.
- No Hallucinations: Fabricating details is strictly forbidden.
- Mandatory Citations: Sources must be cited for every fact provided.
- Professional Tone: Responses are precise, minimal, and professional.
- Failure Mode: If context is missing, the response defaults to: "I don’t have enough information in the provided documents."
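The rules above can be sketched as a prompt builder plus a guard that returns the fallback message when no context was retrieved. This is an illustrative sketch only: `SYSTEM_RULES`, `build_prompt`, `answer`, and the `generate` callable (standing in for the model call) are hypothetical and do not reflect the project's actual prompt or code.

```python
from typing import Callable, Sequence

# Fallback text quoted from the rules above.
FALLBACK_ANSWER = "I don’t have enough information in the provided documents."

# Assumed wording; the project's real system prompt is not shown in this README.
SYSTEM_RULES = (
    "Answer using ONLY the context below. "
    "Cite a source for every fact. "
    "Keep the answer precise, minimal, and professional. "
    f'If the context is insufficient, reply exactly: "{FALLBACK_ANSWER}"'
)


def build_prompt(question: str, chunks: Sequence[str]) -> str:
    """Assemble a prompt that exposes only the retrieved chunks to the model."""
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(chunks, start=1))
    return f"{SYSTEM_RULES}\n\nContext:\n{context}\n\nQuestion: {question}"


def answer(question: str, chunks: Sequence[str], generate: Callable[[str], str]) -> str:
    """Skip the model call and return the fallback when no usable context exists."""
    usable = [c.strip() for c in chunks if c.strip()]
    if not usable:
        return FALLBACK_ANSWER
    return generate(build_prompt(question, usable))
```

Handling the empty-context case in code, rather than leaving it to the model, makes the failure mode deterministic for that case. When context is present, the prompt instructions steer the model toward grounded answers.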
- private_gpt/: Core application logic.
- private_gpt/ui/: Gradio UI implementation.
- private_gpt/server/: FastAPI server and API routes.
- settings.yaml: Centralized configuration for LLM, Embeddings, and UI behavior.
- scripts/: Utility scripts for data ingestion and setup.
This project is licensed under the MIT License. See the LICENSE file for details.
Maintained by the Sentinel-RAG Team.

