A completely offline, privacy-first repair assistant that "sees" hardware and retrieves technical documentation.
Technicians in the field (e.g., repairing wind turbines or EVs) often lack reliable internet. Cloud-based AI fails here.
This project demonstrates an Edge-Native Architecture: running a Vision-Language Model (VLM) and a Vector Search Engine entirely on-device to identify parts and pull up repair manuals in seconds.
(Placeholder: Upload a screenshot of the main Streamlit UI here)
- 100% Offline: No data leaves the device. Zero reliance on OpenAI/AWS.
- Hardware Accelerated: Uses Metal (MPS) on macOS to run quantized LLMs at interactive speeds.
- Visual RAG: Combines "Visual Understanding" (Llava) with "Semantic Retrieval" (ChromaDB).
- Edge Optimized: Runs on 16GB Unified Memory (M2/M3) or NVIDIA Jetson Orin (ARM64).
```mermaid
graph LR
    A[Camera Input] -->|Image| B(Vision Engine)
    B -->|Llava v1.5 4-bit| C{Part Identification}
    C -->|"Describe: M3 Screw"| D[Vector DB]
    D -->|Semantic Search| E[PDF Manual]
    E -->|Page 42| F[Technician UI]
```
- The Eye: Llava v1.5 7B (Quantized GGUF) analyzes the image.
- The Brain: llama-cpp-python binds the C++ inference engine to Python.
- The Memory: ChromaDB stores chunked embeddings of technical manuals.
- The Interface: Streamlit provides a low-latency UI.
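A minimal sketch of how these four pieces connect is shown below. The model paths, the prompt, the collection name, and the `page` metadata key are illustrative placeholders, not the project's exact code; `src/app.py` is the source of truth.

```python
# Sketch of the Visual RAG loop: identify the part, then retrieve manual pages.
# Paths, prompt, and names are illustrative placeholders.
import base64
import chromadb
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

# The Eye + Brain: Llava v1.5 7B (4-bit GGUF) plus its CLIP projector,
# with all layers offloaded to the GPU (Metal on Apple Silicon).
chat_handler = Llava15ChatHandler(clip_model_path="models/mmproj-model-f16.gguf")
llm = Llama(
    model_path="models/ggml-model-q4_k.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,        # leave room for the image embedding
    n_gpu_layers=-1,   # offload everything to Metal/CUDA
)

def image_to_data_uri(path: str) -> str:
    """Encode a local image as a base64 data URI for the chat handler."""
    with open(path, "rb") as f:
        return "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

def describe_part(image_uri: str) -> str:
    """Ask the VLM to identify the component in the image."""
    result = llm.create_chat_completion(messages=[
        {"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": image_uri}},
            {"type": "text", "text": "Identify this hardware component in one sentence."},
        ]},
    ])
    return result["choices"][0]["message"]["content"]

# The Memory: semantic search over the embedded manual.
client = chromadb.PersistentClient(path="data/chroma")
manual = client.get_or_create_collection("manual")

description = describe_part(image_to_data_uri("assets/part_input.jpg"))
hits = manual.query(query_texts=[description], n_results=3)
for doc, meta in zip(hits["documents"][0], hits["metadatas"][0]):
    print(meta.get("page"), doc[:80])
```

Everything above runs in-process; there is no network call anywhere in the loop.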
| Metric | Value | Implementation Details |
|---|---|---|
| Visual Inference | ~2.4s | Offloaded to GPU (Metal) |
| Vector Search | <0.1s | Local HNSW Index |
| Cold Boot | ~8.0s | Model loading (Cached) |
| RAM Usage | ~6.5 GB | 4-bit Quantization (Q4_K_M) |
- Python 3.10+
- Mac with Apple Silicon (M1/M2/M3) OR NVIDIA GPU (Linux)
- ~8GB Free RAM
```bash
git clone https://github.com/YOUR_USERNAME/edge-maintenance-assistant.git
cd edge-maintenance-assistant

# Create virtual env (Recommended)
python -m venv venv
source venv/bin/activate

# Install dependencies (Pre-compiled wheels for Apple Silicon)
pip install -r requirements.txt
```
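If pip cannot find a pre-built `llama-cpp-python` wheel for your platform, the upstream project documents building it from source with Metal enabled (older releases used `-DLLAMA_METAL=on` instead):

```bash
# Only needed when no pre-built wheel matches your platform:
# compile the llama.cpp backend with Metal acceleration
CMAKE_ARGS="-DGGML_METAL=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```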
We use the quantized Llava v1.5 model to fit within edge memory constraints.
```bash
python download_model.py
```
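Under the hood, the download step amounts to fetching two GGUF files from the Hugging Face Hub. The sketch below is a hypothetical reduction of `download_model.py`; the `repo_id` and filenames are assumptions based on a common GGUF distribution of Llava v1.5, not necessarily what the script actually uses.

```python
# Hypothetical core of download_model.py: fetch the quantized Llava weights
# and the CLIP projector into models/. repo_id and filenames are assumptions.
from huggingface_hub import hf_hub_download

for filename in ("ggml-model-q4_k.gguf", "mmproj-model-f16.gguf"):
    path = hf_hub_download(
        repo_id="mys/ggml_llava-v1.5-7b",  # a popular GGUF mirror of Llava v1.5 7B
        filename=filename,
        local_dir="models",
    )
    print("Saved", path)
```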
- Place your technical PDF at `data/manual.pdf`.
- Run the ingestion script to chunk and embed the document:
```bash
python src/ingest.py
```
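Conceptually, ingestion looks like the sketch below: extract text per page, split it into chunks, and store the chunks in a persistent Chroma collection with page metadata. The chunk size, collection name, and use of `pypdf` are illustrative assumptions; `src/ingest.py` holds the real logic.

```python
# Sketch of PDF ingestion. Chunking strategy and names are illustrative.
import chromadb
from pypdf import PdfReader

CHUNK_SIZE = 800  # characters; a simple fixed-size split for illustration

reader = PdfReader("data/manual.pdf")
client = chromadb.PersistentClient(path="data/chroma")
manual = client.get_or_create_collection("manual")

for page_num, page in enumerate(reader.pages, start=1):
    text = page.extract_text() or ""
    for i in range(0, len(text), CHUNK_SIZE):
        manual.add(
            ids=[f"p{page_num}-c{i // CHUNK_SIZE}"],
            documents=[text[i : i + CHUNK_SIZE]],  # embedded by Chroma's default model
            metadatas=[{"page": page_num}],
        )
print("Indexed", manual.count(), "chunks")
```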
```bash
streamlit run src/app.py
```
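For orientation, the UI layer reduces to roughly the shape below. This is a hypothetical condensation of `src/app.py`, reusing `describe_part` and the `manual` collection from the architecture sketch earlier; it is not standalone code.

```python
# Hypothetical skeleton of src/app.py: capture an image, identify the part,
# and surface the matching manual passages. describe_part and manual come
# from the earlier sketch, not the actual module.
import base64
import streamlit as st

st.title("Edge Maintenance Assistant")
photo = st.camera_input("Point the camera at the part")

if photo is not None:
    data_uri = "data:image/jpeg;base64," + base64.b64encode(photo.getvalue()).decode()
    with st.spinner("Identifying part..."):
        description = describe_part(data_uri)  # VLM call from the sketch above
    st.subheader(description)
    hits = manual.query(query_texts=[description], n_results=3)
    for doc, meta in zip(hits["documents"][0], hits["metadatas"][0]):
        st.markdown(f"**Page {meta['page']}:** {doc}")
```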
```
.
├── assets/                 # Demo screenshots/gifs
│   ├── demo_screenshot.png # Main UI View
│   ├── part_input.jpg      # Example input image (Placeholder)
│   └── analysis_result.png # Example AI output (Placeholder)
├── data/                   # PDF Manuals & Vector DB (Chroma)
├── models/                 # GGUF Quantized Models (Ignored by Git)
├── src/
│   ├── app.py              # Main Streamlit Application
│   ├── ingest.py           # PDF Processing & Vectorization logic
│   └── vision_test.py      # CLI diagnostic tool for Vision
├── download_model.py       # Automation script for Model Fetching
└── requirements.txt        # Dependency lockfile
```
- Voice Control: Integrate Whisper.cpp for hands-free queries.
- AR Overlay: Use OpenCV to draw bounding boxes around identified parts.
- Docker Container: Package the entire stack for deployment on Rivian/Tesla service laptops.