| 1 | +--- |
| 2 | +sidebar_position: 4 |
| 3 | +title: "Scalable Enterprise Deployment Options" |
| 4 | +--- |
| 5 | + |
| 6 | +# Enterprise Deployment Options |
| 7 | + |
| 8 | +Open WebUI's **stateless, container-first architecture** means the same application runs identically whether you deploy it as a Python process on a VM, a container in a managed service, or a pod in a Kubernetes cluster. The difference between deployment patterns is how you **orchestrate, scale, and operate** the application — not how the application itself behaves. |
| 9 | + |
| 10 | +:::tip Model Inference Is Independent |
| 11 | +How you serve LLMs is separate from how you deploy Open WebUI. You can use **managed APIs** (OpenAI, Anthropic, Azure OpenAI, Google Gemini) or **self-hosted inference** (Ollama, vLLM) with any deployment pattern. See [Integration](/enterprise/integration) for details on connecting models. |
| 12 | +::: |
| 13 | + |
| 14 | +--- |
| 15 | + |
| 16 | +## Shared Infrastructure Requirements |
| 17 | + |
| 18 | +Regardless of which deployment pattern you choose, every scaled Open WebUI deployment requires the same set of backing services. Configure these **before** scaling beyond a single instance. |
| 19 | + |
| 20 | +| Component | Why It's Required | Options | |
| 21 | +| :--- | :--- | :--- | |
| 22 | +| **PostgreSQL** | Multi-instance deployments require a client-server database. SQLite is a single local file that allows only one writer at a time and cannot be shared safely between instances on different hosts. | Self-managed, Amazon RDS, Azure Database for PostgreSQL, Google Cloud SQL |
| 23 | +| **Redis** | Session management, WebSocket coordination, and configuration sync across instances. | Self-managed, Amazon ElastiCache, Azure Cache for Redis, Google Memorystore | |
| 24 | +| **Vector Database** | The default ChromaDB uses a local SQLite backend that is not safe for multi-process access. | PGVector (shares PostgreSQL), Milvus, Qdrant, or ChromaDB in HTTP server mode | |
| 25 | +| **Shared Storage** | Uploaded files must be accessible from every instance. | Shared filesystem (NFS, EFS, CephFS) or object storage (`S3`, `GCS`, `Azure Blob`) | |
| 26 | +| **Content Extraction** | The default `pypdf` extractor leaks memory under sustained load. | Apache Tika or Docling as a sidecar service | |
| 27 | +| **Embedding Engine** | The default SentenceTransformers model loads ~500 MB into RAM per worker process. | OpenAI Embeddings API, or Ollama running an embedding model | |
| 28 | + |
| 29 | +### Critical Configuration |
| 30 | + |
| 31 | +These environment variables **must** be set consistently across every instance. The one exception is `ENABLE_DB_MIGRATIONS`, which stays enabled on a single designated instance: |
| 32 | + |
| 33 | +```bash |
| 34 | +# Shared secret — MUST be identical on all instances |
| 35 | +WEBUI_SECRET_KEY=your-secret-key-here |
| 36 | + |
| 37 | +# Database |
| 38 | +DATABASE_URL=postgresql://user:password@db-host:5432/openwebui |
| 39 | + |
| 40 | +# Vector Database |
| 41 | +VECTOR_DB=pgvector |
| 42 | +PGVECTOR_DB_URL=postgresql://user:password@db-host:5432/openwebui |
| 43 | + |
| 44 | +# Redis |
| 45 | +REDIS_URL=redis://redis-host:6379/0 |
| 46 | +WEBSOCKET_MANAGER=redis |
| 47 | +ENABLE_WEBSOCKET_SUPPORT=true |
| 48 | + |
| 49 | +# Content Extraction |
| 50 | +CONTENT_EXTRACTION_ENGINE=tika |
| 51 | +TIKA_SERVER_URL=http://tika:9998 |
| 52 | + |
| 53 | +# Embeddings |
| 54 | +RAG_EMBEDDING_ENGINE=openai |
| 55 | + |
| 56 | +# Storage — choose ONE: |
| 57 | +# Option A: shared filesystem (mount the same volume to all instances, no env var needed) |
| 58 | +# Option B: object storage (see https://docs.openwebui.com/reference/env-configuration#cloud-storage for all required vars) |
| 59 | +# STORAGE_PROVIDER=s3 |
| 60 | + |
| 61 | +# Workers — let the orchestrator handle scaling |
| 62 | +UVICORN_WORKERS=1 |
| 63 | + |
| 64 | +# Migrations: keep this false on every instance except the single one that runs migrations |
| 65 | +ENABLE_DB_MIGRATIONS=false |
| 66 | +``` |
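Because a missing or mismatched variable tends to surface only after a second instance joins, it can help to fail fast at startup. The following is a minimal sketch of a preflight check, not part of Open WebUI itself: the function name `check_required_env` and the `REQUIRED_VARS` list are hypothetical, and the list covers only the variables from the block above that apply to every storage and vector database choice.

```bash
# Hypothetical preflight helper (not shipped with Open WebUI).
# Variable names come from the configuration block above; extend the list
# to match your own setup (for example PGVECTOR_DB_URL when VECTOR_DB=pgvector).
REQUIRED_VARS="WEBUI_SECRET_KEY DATABASE_URL VECTOR_DB REDIS_URL WEBSOCKET_MANAGER"

# Reports every name that is unset or empty in the exported environment.
# Returns 0 when all names are set, 1 otherwise.
check_required_env() {
  local name missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is what the server process inherits.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A startup script could call `check_required_env $REQUIRED_VARS || exit 1` (unquoted on purpose, so the list splits into separate names) before launching the server.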
| 67 | + |
| 68 | +:::warning Database Migrations |
| 69 | +Set `ENABLE_DB_MIGRATIONS=false` on **all instances except one**. During updates, scale down to a single instance, allow migrations to complete, then scale back up. Concurrent migrations can corrupt your database. |
| 70 | +::: |
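On Kubernetes, the scale-down, migrate, scale-up sequence described in the warning could be scripted roughly as follows. This is a sketch under stated assumptions rather than the project's documented procedure: the function name `migrate_then_scale`, the deployment and container names, and the idea of toggling `ENABLE_DB_MIGRATIONS` with `kubectl set env` are all illustrative, and a Helm-managed release would normally change these values through the chart instead.

```bash
# Hypothetical helper; KUBECTL can be overridden (for example for a dry run).
KUBECTL="${KUBECTL:-kubectl}"

# Usage: migrate_then_scale <deployment> <container> <image> <replicas>
migrate_then_scale() {
  local deploy="$1" container="$2" image="$3" replicas="$4"

  # 1. Scale down so only one instance can run migrations.
  "$KUBECTL" scale "deployment/${deploy}" --replicas=1 || return 1

  # 2. Enable migrations and roll out the new image on that single pod.
  "$KUBECTL" set env "deployment/${deploy}" ENABLE_DB_MIGRATIONS=true || return 1
  "$KUBECTL" set image "deployment/${deploy}" "${container}=${image}" || return 1

  # 3. Wait until the updated pod is ready, which implies startup (and migrations) finished.
  "$KUBECTL" rollout status "deployment/${deploy}" || return 1

  # 4. Disable migrations again, then restore the replica count.
  "$KUBECTL" set env "deployment/${deploy}" ENABLE_DB_MIGRATIONS=false || return 1
  "$KUBECTL" scale "deployment/${deploy}" --replicas="${replicas}" || return 1
}
```

Each step aborts the sequence on failure, so a failed rollout leaves the deployment at one replica instead of scaling out instances against a half-migrated schema.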
| 71 | + |
| 72 | +For the complete step-by-step scaling walkthrough, see [Scaling Open WebUI](/getting-started/advanced-topics/scaling). For the full environment variable reference, see [Environment Variable Configuration](/reference/env-configuration). |
| 73 | + |
| 74 | +--- |
| 75 | + |
| 76 | +## Choose Your Deployment Pattern |
| 77 | + |
| 78 | +Open WebUI supports three production deployment patterns. Each guide covers architecture, scaling strategy, and key considerations specific to that approach. |
| 79 | + |
| 80 | +### [Python / Pip on Auto-Scaling VMs](./python-pip) |
| 81 | + |
| 82 | +Deploy `open-webui serve` as a systemd-managed process on virtual machines in a cloud auto-scaling group (AWS ASG, Azure VMSS, GCP MIG). Best for teams with established VM-based infrastructure and strong Linux administration skills, or when regulatory requirements mandate direct OS-level control. |
| 83 | + |
| 84 | +### [Container Service](./container-service) |
| 85 | + |
| 86 | +Run the official Open WebUI container image on a managed platform such as AWS ECS/Fargate, Azure Container Apps, or Google Cloud Run. Best for teams wanting container benefits — immutable images, versioned deployments, no OS management — without Kubernetes complexity. |
| 87 | + |
| 88 | +### [Kubernetes with Helm](./kubernetes-helm) |
| 89 | + |
| 90 | +Deploy using the official Open WebUI Helm chart on any Kubernetes distribution (EKS, AKS, GKE, OpenShift, Rancher, self-managed). Best for large-scale, mission-critical deployments requiring declarative infrastructure-as-code, advanced auto-scaling, and GitOps workflows. |
| 91 | + |
| 92 | +--- |
| 93 | + |
| 94 | +## Deployment Comparison |
| 95 | + |
| 96 | +| | **Python / Pip (VMs)** | **Container Service** | **Kubernetes (Helm)** | |
| 97 | +| :--- | :--- | :--- | :--- | |
| 98 | +| **Operational complexity** | Moderate — OS patching, Python management | Low — platform-managed containers | Higher — requires K8s expertise | |
| 99 | +| **Auto-scaling** | Cloud ASG/VMSS with health checks | Platform-native, minimal configuration | HPA with fine-grained control | |
| 100 | +| **Container isolation** | None — process runs directly on OS | Full container isolation | Full container + namespace isolation | |
| 101 | +| **Rolling updates** | Manual (scale down, update, scale up) | Platform-managed rolling deployments | Declarative rolling updates with rollback | |
| 102 | +| **Infrastructure-as-code** | Terraform/Pulumi for VMs + config mgmt | Task/service definitions (CloudFormation, Bicep, Terraform) | Helm charts + GitOps (Argo CD, Flux) | |
| 103 | +| **Best suited for** | Teams with VM-centric operations, regulatory constraints | Teams wanting container benefits without K8s complexity | Large-scale, mission-critical deployments | |
| 104 | +| **Minimum team expertise** | Linux administration, Python | Container fundamentals, cloud platform | Kubernetes, Helm, cloud-native patterns | |
| 105 | + |
| 106 | +--- |
| 107 | + |
| 108 | +## Observability |
| 109 | + |
| 110 | +Production deployments should include monitoring and observability regardless of deployment pattern. |
| 111 | + |
| 112 | +### Health Checks |
| 113 | + |
| 114 | +- **`/health`** — Basic liveness check. Returns HTTP 200 when the application is running. Use this for load balancer and auto-scaler health checks. |
| 115 | +- **`/api/models`** — Verifies the application can connect to configured model backends. Requires an API key. |
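For deployment scripts that need to wait until an instance is up, the `/health` endpoint can be polled with `curl`. The sketch below is illustrative only: the function name `wait_for_healthy` and its default attempt count and delay are assumptions, and it treats anything other than HTTP 200 as not ready.

```bash
# Hypothetical helper: polls <base-url>/health until it returns HTTP 200.
# Usage: wait_for_healthy <base-url> [attempts] [delay-seconds]
wait_for_healthy() {
  local base_url="$1" attempts="${2:-30}" delay="${3:-2}" i=1 status
  while [ "$i" -le "$attempts" ]; do
    # -w '%{http_code}' prints only the status code; curl prints 000 when the connection fails.
    status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "${base_url}/health")" || true
    if [ "$status" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "instance at ${base_url} did not become healthy" >&2
  return 1
}
```

For example, `wait_for_healthy http://localhost:8080 10 3` would try up to ten times, three seconds apart; the host and port here are placeholders for wherever your instance listens.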
| 116 | + |
| 117 | +### OpenTelemetry |
| 118 | + |
| 119 | +Open WebUI supports **OpenTelemetry** for distributed tracing and HTTP metrics. Enable it with: |
| 120 | + |
| 121 | +```bash |
| 122 | +ENABLE_OTEL=true |
| 123 | +OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4318 |
| 124 | +OTEL_SERVICE_NAME=open-webui |
| 125 | +``` |
| 126 | + |
| 127 | +This auto-instruments FastAPI, SQLAlchemy, Redis, and HTTP clients — giving visibility into request latency, database query performance, and cross-service traces. |
| 128 | + |
| 129 | +### Structured Logging |
| 130 | + |
| 131 | +Enable JSON-formatted logs for integration with log aggregation platforms (Datadog, Loki, CloudWatch, Splunk): |
| 132 | + |
| 133 | +```bash |
| 134 | +LOG_FORMAT=json |
| 135 | +GLOBAL_LOG_LEVEL=INFO |
| 136 | +``` |
| 137 | + |
| 138 | +For full monitoring setup details, see [Monitoring](/reference/monitoring) and [OpenTelemetry](/reference/monitoring/otel). |
| 139 | + |
| 140 | +--- |
| 141 | + |
| 142 | +## Next Steps |
| 143 | + |
| 144 | +- **[Architecture & High Availability](/enterprise/architecture)** — Deeper dive into Open WebUI's stateless design and HA capabilities. |
| 145 | +- **[Security](/enterprise/security)** — Compliance frameworks, SSO/LDAP integration, RBAC, and audit logging. |
| 146 | +- **[Integration](/enterprise/integration)** — Connecting AI models, pipelines, and extending functionality. |
| 147 | +- **[Scaling Open WebUI](/getting-started/advanced-topics/scaling)** — The complete step-by-step technical scaling guide. |
| 148 | +- **[Multi-Replica Troubleshooting](/troubleshooting/multi-replica)** — Solutions for common issues in scaled deployments. |
| 149 | + |
| 150 | +--- |
| 151 | + |
| 152 | +**Need help planning your enterprise deployment?** Our team works with organizations worldwide to design and implement production Open WebUI environments. |
| 153 | + |
| 154 | +[**Contact Enterprise Sales → sales@openwebui.com**](mailto:sales@openwebui.com) |