Your turnkey local AI stack. Buy hardware. Run installer. AI running.
```bash
# One-line install (Linux/WSL)
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/get-dream-server.sh | bash
```

Or manually:

```bash
git clone https://github.com/Light-Heart-Labs/Lighthouse-AI.git
cd Lighthouse-AI/dream-server
./install.sh
```

The installer auto-detects your GPU, picks the right model, generates secure passwords, and starts everything. Open http://localhost:3000 and start chatting.
By default, Dream Server uses bootstrap mode for instant gratification:
- Starts immediately with a tiny 1.5B model (downloads in <1 minute)
- You can start chatting within 2 minutes of running the installer
- The full model downloads in the background
- When ready, run `./scripts/upgrade-model.sh` to hot-swap to the full model
No more staring at download bars. Start playing immediately.
To skip bootstrap and wait for the full model: `./install.sh --no-bootstrap`
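vLLM's OpenAI-compatible server also lets you confirm which model is loaded, which is handy before and after a hot-swap:

```bash
# List the model(s) vLLM is currently serving
curl -s http://localhost:8000/v1/models
```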
```powershell
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/install.ps1" -OutFile install.ps1
.\install.ps1
```

The Windows installer handles WSL2 setup, Docker Desktop, and NVIDIA drivers automatically.
Requirements: Windows 10 21H2+ or Windows 11, NVIDIA GPU, Docker Desktop
Dream Server ships as a set of Docker services:

| Component | Purpose | Port |
|---|---|---|
| vLLM | High-performance LLM inference | 8000 |
| Open WebUI | Beautiful chat interface | 3000 |
| Dashboard | System status, GPU metrics, service health | 3001 |
| Privacy Shield | PII redaction for external API calls | 8085 |
| Whisper | Speech-to-text (optional) | 9000 |
| Kokoro | Text-to-speech (optional) | 8880 |
| LiveKit | Real-time WebRTC voice chat (optional) | 7880 |
| n8n | Workflow automation (optional) | 5678 |
| Qdrant | Vector database for RAG (optional) | 6333 |
| LiteLLM | Multi-model API gateway (optional) | 4000 |
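To see which of these are actually listening, a quick loop over the default ports works (optional services only respond when their profile is enabled; adjust ports if you changed them in `.env`):

```bash
# Probe each default port; curl exits non-zero if nothing is listening
for port in 3000 8000 3001 8085 9000 8880 7880 5678 6333 4000; do
  if curl -s --max-time 1 -o /dev/null "http://localhost:${port}"; then
    echo "port ${port}: responding"
  else
    echo "port ${port}: no response"
  fi
done
```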
The installer automatically detects your GPU and selects the right configuration:
| Tier | VRAM | Model | Context | Example GPUs |
|---|---|---|---|---|
| 1 (Entry) | <12GB | Qwen2.5-7B | 8K | RTX 3080, RTX 4070 |
| 2 (Prosumer) | 12-20GB | Qwen2.5-14B-AWQ | 16K | RTX 3090, RTX 4080 |
| 3 (Pro) | 20-40GB | Qwen2.5-32B-AWQ | 32K | RTX 4090, A6000 |
| 4 (Enterprise) | 40GB+ | Qwen2.5-72B-AWQ | 32K | A100, H100, multi-GPU |
Override with: `./install.sh --tier 3`
See docs/HARDWARE-GUIDE.md for buying recommendations.
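Under the hood, tier selection comes down to total VRAM. A minimal sketch of that logic (not the installer's actual code), assuming `nvidia-smi` is on the PATH:

```bash
# Total VRAM (MiB) of the first GPU
vram=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)

# Map VRAM to a tier, mirroring the table above
if   [ "$vram" -lt 12288 ]; then tier=1
elif [ "$vram" -lt 20480 ]; then tier=2
elif [ "$vram" -lt 40960 ]; then tier=3
else tier=4
fi
echo "Detected ${vram} MiB VRAM -> tier ${tier}"
```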
```
┌─────────────────────────────────────────────────┐
│                   Open WebUI                    │
│               (localhost:3000)                  │
└─────────────────────┬───────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────┐
│                     vLLM                        │
│           (localhost:8000/v1/...)               │
│          Qwen2.5-32B-Instruct-AWQ               │
└─────────────────────────────────────────────────┘
         │                        │
┌────────▼────────┐      ┌───────▼────────┐
│     Whisper     │      │     Kokoro     │
│   (STT :9000)   │      │  (TTS :8880)   │
└─────────────────┘      └────────────────┘
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│ n8n (:5678) │  │Qdrant(:6333)│  │LiteLLM(:4K) │
│  Workflows  │  │  Vector DB  │  │ API Gateway │
└─────────────┘  └─────────────┘  └─────────────┘
```
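Because vLLM speaks the OpenAI API, any OpenAI-compatible client can point at port 8000. A quick smoke test with curl (the model name assumes the tier-3 default):

```bash
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen2.5-32B-Instruct-AWQ",
        "messages": [{"role": "user", "content": "Say hello in five words."}]
      }'
```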
Enable components with Docker Compose profiles:
```bash
# Voice (STT + TTS)
docker compose --profile voice up -d

# Workflows (n8n)
docker compose --profile workflows up -d

# RAG (Qdrant + embeddings)
docker compose --profile rag up -d

# LiveKit Voice Chat (real-time WebRTC voice)
docker compose --profile livekit --profile voice up -d

# Everything
docker compose --profile voice --profile workflows --profile rag --profile livekit up -d
```

Real-time voice conversation with your local AI:
- Enable the profiles: `docker compose --profile livekit --profile voice up -d`
- Open http://localhost:7880 for the LiveKit playground
- Or integrate with any LiveKit-compatible client
What it does:
- WebRTC voice streaming (low latency)
- Whisper STT → Local LLM → Kokoro TTS pipeline
- Works with browser, mobile apps, or custom clients
See agents/voice/ for the agent implementation.
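You can also poke at the pipeline pieces individually. A hedged sketch, assuming the Whisper and Kokoro containers expose OpenAI-compatible audio routes (check each image's docs; paths and voice names may differ):

```bash
# Speech-to-text: transcribe a local WAV file
curl -s http://localhost:9000/v1/audio/transcriptions \
  -F file=@sample.wav -F model=whisper-1

# Text-to-speech: synthesize speech to a WAV file (voice name is illustrative)
curl -s http://localhost:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "kokoro", "input": "Hello from Dream Server", "voice": "af_bella"}' \
  --output reply.wav
```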
Copy `.env.example` to `.env` and customize:

```bash
LLM_MODEL=Qwen/Qwen2.5-32B-Instruct-AWQ   # Model (auto-set by installer)
MAX_CONTEXT=8192                          # Context window
GPU_UTIL=0.9                              # VRAM allocation (0.0-1.0)
```
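Changes take effect once the affected container is recreated. A minimal sketch, assuming the inference service is named `vllm` in the compose file:

```bash
# Recreate the vLLM container so it picks up new .env values
docker compose up -d --force-recreate vllm
```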
Try the demos and tests:

```bash
# Interactive showcase (requires running services)
./scripts/showcase.sh

# Offline demo mode (no GPU/services needed)
./scripts/demo-offline.sh

# Run integration tests
./tests/integration-test.sh
```

Manage the stack from the install directory:

```bash
cd ~/dream-server
docker compose ps # Check status
docker compose logs -f vllm # Watch vLLM logs
docker compose restart # Restart services
docker compose down # Stop everything
./status.sh # Health check all services
```

| Feature | Dream Server | Ollama + WebUI | LocalAI |
|---|---|---|---|
| Full-stack one-command install | LLM + voice + workflows + RAG + privacy | LLM + chat only | LLM only |
| Hardware auto-detect + model selection | Yes | No | No |
| Voice agents (STT + TTS + WebRTC) | Built in | No | Limited |
| Inference engine | vLLM (continuous batching) | llama.cpp | llama.cpp |
| Workflow automation | n8n (400+ integrations) | No | No |
| PII redaction / privacy tools | Built in | No | No |
| Multi-GPU | Yes | Partial | Partial |
vLLM won't start / OOM errors
- Reduce `MAX_CONTEXT` in `.env` (try 4096)
- Lower `GPU_UTIL` to 0.85
- Use a smaller model: `./install.sh --tier 1`
"Model not found" on first boot
- First launch downloads the model (10-30 min depending on size)
- Watch progress: `docker compose logs -f vllm`
Open WebUI shows "Connection error"
- vLLM is still loading. Wait for the health check to pass: `curl localhost:8000/health`
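To block until it's ready, a small polling loop works:

```bash
# Poll vLLM's health endpoint until it returns success
until curl -sf http://localhost:8000/health > /dev/null; do
  echo "waiting for vLLM..."
  sleep 5
done
echo "vLLM is ready"
```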
Port already in use
- Change ports in `.env` (e.g., `WEBUI_PORT=3001`)
- Or find the conflicting service with `sudo lsof -i :3000` and stop it
Docker permission denied
- Add yourself to the docker group: `sudo usermod -aG docker $USER`
- Log out and back in for it to take effect
WSL: GPU not detected
- Install NVIDIA drivers on Windows (not inside WSL)
- Verify with `nvidia-smi` inside WSL
- Ensure Docker Desktop has WSL integration enabled
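To confirm containers (not just the WSL shell) can reach the GPU, run `nvidia-smi` inside a throwaway container; the CUDA image tag here is just an example:

```bash
# Should print the same GPU table as on the host
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
```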
- QUICKSTART.md — Detailed setup guide
- HARDWARE-GUIDE.md — What to buy
- TROUBLESHOOTING.md — Extended troubleshooting
- SECURITY.md — Security best practices
- OPENCLAW-INTEGRATION.md — Connect OpenClaw agents
- Workflows README — Pre-built n8n workflows
Apache 2.0 — Use it, modify it, sell it. Just don't blame us.
Built by The Collective — Android-17, Todd, and friends