Enterprise Deployment Documentation #1131
---
sidebar_position: 2
title: "Container Service"
---

# Container Service

Run the official `ghcr.io/open-webui/open-webui` image on a managed container platform such as AWS ECS/Fargate, Azure Container Apps, or Google Cloud Run.

:::info Prerequisites
Before proceeding, ensure you have configured the [shared infrastructure requirements](/enterprise/deployment#shared-infrastructure-requirements) — PostgreSQL, Redis, a vector database, shared storage, and content extraction.
:::

## When to Choose This Pattern

- You want container benefits (immutable images, versioned deployments, no OS management) without Kubernetes complexity
- Your organization already uses a managed container platform
- You need fast scaling with minimal operational overhead
- You prefer managed infrastructure with platform-native auto-scaling

## Architecture

```mermaid
flowchart TB
    LB["Load Balancer"]

    subgraph CS["Container Service"]
        T1["Container Task 1"]
        T2["Container Task 2"]
        T3["Container Task N"]
    end

    subgraph Backend["Managed Backing Services"]
        PG["PostgreSQL + PGVector"]
        Redis["Redis"]
        S3["Object Storage"]
        Tika["Tika"]
    end

    LB --> CS
    CS --> Backend
```

## Image Selection

Use **versioned tags** for production stability:

```
ghcr.io/open-webui/open-webui:v0.x.x
```

Avoid the `:main` tag in production — it tracks the latest development build and can introduce breaking changes without warning. Check the [Open WebUI releases](https://github.com/open-webui/open-webui/releases) for the latest stable version.
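As a sketch, you can exercise a pinned image locally with the shared-infrastructure settings before wiring them into a task definition (hostnames, credentials, and the published port are placeholders; the container serves on port 8080 internally):

```shell
# Run a pinned release with the shared backing services (values are illustrative).
# In a managed service, these become env vars/secrets in the task definition.
docker run -d --name open-webui -p 80:8080 \
  -e DATABASE_URL="postgresql://user:password@db-host:5432/openwebui" \
  -e VECTOR_DB="pgvector" \
  -e REDIS_URL="redis://redis-host:6379/0" \
  -e WEBSOCKET_MANAGER="redis" \
  -e WEBUI_SECRET_KEY="your-secret-key-here" \
  ghcr.io/open-webui/open-webui:v0.x.x
```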
## Scaling Strategy

- **Platform-native auto-scaling**: Configure your container service to scale on CPU utilization, memory, or request count.
- **Health checks**: Use the `/health` endpoint for both liveness and readiness probes.
- **Task-level env vars**: Pass all shared infrastructure configuration as environment variables or secrets in your task definition.
- **Session affinity**: Enable sticky sessions on your load balancer for WebSocket stability. While Redis handles cross-instance coordination, session affinity reduces unnecessary session handoffs.

## Key Considerations

| Consideration | Detail |
| :--- | :--- |
| **Storage** | Use object storage (S3, GCS, Azure Blob) or a shared filesystem (such as EFS). Container-local storage is ephemeral and not shared across tasks. |
| **Tika sidecar** | Run Tika as a sidecar container in the same task definition, or as a separate service. The sidecar pattern keeps extraction traffic local. |
| **Secrets management** | Use your platform's secrets manager (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager) for `DATABASE_URL`, `REDIS_URL`, and `WEBUI_SECRET_KEY`. |
| **Updates** | Perform a rolling deployment with a single task first — this task runs migrations (`ENABLE_DB_MIGRATIONS=true`). Once healthy, scale out the remaining tasks with `ENABLE_DB_MIGRATIONS=false`. |
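On ECS, for example, that update flow might look like the following (cluster, service, and task-definition names are hypothetical; `open-webui-migrate` is assumed to be a revision that sets `ENABLE_DB_MIGRATIONS=true`):

```shell
# 1) Deploy a single task from the migration-enabled revision and wait
#    until it passes health checks:
aws ecs update-service --cluster prod --service open-webui \
  --task-definition open-webui-migrate:1 --desired-count 1
aws ecs wait services-stable --cluster prod --services open-webui

# 2) Scale out on a revision with ENABLE_DB_MIGRATIONS=false:
aws ecs update-service --cluster prod --service open-webui \
  --task-definition open-webui:2 --desired-count 3
aws ecs wait services-stable --cluster prod --services open-webui
```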
## Anti-Patterns to Avoid

| Anti-Pattern | Impact | Fix |
| :--- | :--- | :--- |
| Using local SQLite | Data loss on task restart, database locks with multiple tasks | Set `DATABASE_URL` to PostgreSQL |
| Default ChromaDB | SQLite-backed vector DB crashes under multi-process access | Set `VECTOR_DB=pgvector` (or Milvus/Qdrant) |
| Inconsistent `WEBUI_SECRET_KEY` | Login loops, 401 errors, sessions that don't persist across tasks | Set the same key on every task via secrets manager |
| No Redis | WebSocket failures, config not syncing, "Model Not Found" errors | Set `REDIS_URL` and `WEBSOCKET_MANAGER=redis` |
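To fail fast instead of hitting these problems at runtime, an entrypoint wrapper can verify the critical variables before starting the server. A minimal sketch (the variable list is illustrative; extend it to cover your storage and extraction settings):

```shell
# Refuse to start when a multi-instance prerequisite is missing.
check_env() {
  for var in DATABASE_URL REDIS_URL WEBUI_SECRET_KEY VECTOR_DB; do
    # Indirect lookup of the variable named in $var (POSIX-portable)
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing required env var: $var" >&2
      return 1
    fi
  done
}

# Usage in an entrypoint: check_env || exit 1
```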
For container basics, see the [Quick Start guide](/getting-started/quick-start).

---

**Need help planning your enterprise deployment?** Our team works with organizations worldwide to design and implement production Open WebUI environments.

[**Contact Enterprise Sales → sales@openwebui.com**](mailto:sales@openwebui.com)
---
sidebar_position: 4
title: "Scalable Enterprise Deployment Options"
---

# Enterprise Deployment Options

Open WebUI's **stateless, container-first architecture** means the same application runs identically whether you deploy it as a Python process on a VM, a container in a managed service, or a pod in a Kubernetes cluster. The difference between deployment patterns is how you **orchestrate, scale, and operate** the application — not how the application itself behaves.

:::tip Model Inference Is Independent
How you serve LLM models is separate from how you deploy Open WebUI. You can use **managed APIs** (OpenAI, Anthropic, Azure OpenAI, Google Gemini) or **self-hosted inference** (Ollama, vLLM) with any deployment pattern. See [Integration](/enterprise/integration) for details on connecting models.
:::

---

## Shared Infrastructure Requirements

Regardless of which deployment pattern you choose, every scaled Open WebUI deployment requires the same set of backing services. Configure these **before** scaling beyond a single instance.

| Component | Why It's Required | Options |
| :--- | :--- | :--- |
| **PostgreSQL** | Multi-instance deployments require a real database. SQLite does not support concurrent writes from multiple processes. | Self-managed, Amazon RDS, Azure Database for PostgreSQL, Google Cloud SQL |
| **Redis** | Session management, WebSocket coordination, and configuration sync across instances. | Self-managed, Amazon ElastiCache, Azure Cache for Redis, Google Memorystore |
| **Vector Database** | The default ChromaDB uses a local SQLite backend that is not safe for multi-process access. | PGVector (shares PostgreSQL), Milvus, Qdrant, or ChromaDB in HTTP server mode |
| **Shared Storage** | Uploaded files must be accessible from every instance. | Shared filesystem (NFS, EFS, CephFS) or object storage (S3, GCS, Azure Blob) |
| **Content Extraction** | The default `pypdf` extractor leaks memory under sustained load. | Apache Tika or Docling as a sidecar service |
| **Embedding Engine** | The default SentenceTransformers model loads ~500 MB into RAM per worker process. | OpenAI Embeddings API, or Ollama running an embedding model |
### Critical Configuration

These environment variables **must** be set consistently across every instance:

```bash
# Shared secret — MUST be identical on all instances
WEBUI_SECRET_KEY=your-secret-key-here

# Database
DATABASE_URL=postgresql://user:password@db-host:5432/openwebui

# Vector Database
VECTOR_DB=pgvector
PGVECTOR_DB_URL=postgresql://user:password@db-host:5432/openwebui

# Redis
REDIS_URL=redis://redis-host:6379/0
WEBSOCKET_MANAGER=redis
ENABLE_WEBSOCKET_SUPPORT=true

# Content Extraction
CONTENT_EXTRACTION_ENGINE=tika
TIKA_SERVER_URL=http://tika:9998

# Embeddings
RAG_EMBEDDING_ENGINE=openai

# Storage — choose ONE:
# Option A: shared filesystem (mount the same volume to all instances, no env var needed)
# Option B: object storage (see https://docs.openwebui.com/reference/env-configuration#cloud-storage for all required vars)
# STORAGE_PROVIDER=s3

# Workers — let the orchestrator handle scaling
UVICORN_WORKERS=1

# Migrations — only ONE instance should run migrations
ENABLE_DB_MIGRATIONS=false
```
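A quick way to produce a strong `WEBUI_SECRET_KEY` (this assumes `openssl` is available; any cryptographically secure random source works). Generate it once and distribute the same value through your secrets manager:

```shell
# Generate the shared secret ONCE, then push the same value to every
# instance via your secrets manager. Never generate it per instance:
# differing keys cause login loops and non-persistent sessions.
WEBUI_SECRET_KEY=$(openssl rand -hex 32)
```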
:::warning Database Migrations
Set `ENABLE_DB_MIGRATIONS=false` on **all instances except one**. During updates, scale down to a single instance, allow migrations to complete, then scale back up. Concurrent migrations can corrupt your database.
:::

For the complete step-by-step scaling walkthrough, see [Scaling Open WebUI](/getting-started/advanced-topics/scaling). For the full environment variable reference, see [Environment Variable Configuration](/reference/env-configuration).

---

## Choose Your Deployment Pattern

Open WebUI supports three production deployment patterns. Each guide covers architecture, scaling strategy, and key considerations specific to that approach.

### [Python / Pip on Auto-Scaling VMs](./python-pip)

Deploy `open-webui serve` as a systemd-managed process on virtual machines in a cloud auto-scaling group (AWS ASG, Azure VMSS, GCP MIG). Best for teams with established VM-based infrastructure and strong Linux administration skills, or when regulatory requirements mandate direct OS-level control.
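As a sketch of the VM pattern, a minimal systemd unit might look like this (the unit path, service user, virtualenv location, and environment file are assumptions, not project defaults):

```shell
# Install a minimal unit for `open-webui serve` (all paths are illustrative)
sudo tee /etc/systemd/system/open-webui.service >/dev/null <<'EOF'
[Unit]
Description=Open WebUI
After=network-online.target
Wants=network-online.target

[Service]
User=openwebui
# Shared infrastructure env vars (DATABASE_URL, REDIS_URL, ...) live here
EnvironmentFile=/etc/open-webui/env
ExecStart=/opt/open-webui/venv/bin/open-webui serve
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now open-webui
```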
### [Container Service](./container-service)

Run the official Open WebUI container image on a managed platform such as AWS ECS/Fargate, Azure Container Apps, or Google Cloud Run. Best for teams wanting container benefits — immutable images, versioned deployments, no OS management — without Kubernetes complexity.

### [Kubernetes with Helm](./kubernetes-helm)

Deploy using the official Open WebUI Helm chart on any Kubernetes distribution (EKS, AKS, GKE, OpenShift, Rancher, self-managed). Best for large-scale, mission-critical deployments requiring declarative infrastructure-as-code, advanced auto-scaling, and GitOps workflows.

---

## Deployment Comparison

| | **Python / Pip (VMs)** | **Container Service** | **Kubernetes (Helm)** |
| :--- | :--- | :--- | :--- |
| **Operational complexity** | Moderate — OS patching, Python management | Low — platform-managed containers | Higher — requires K8s expertise |
| **Auto-scaling** | Cloud ASG/VMSS with health checks | Platform-native, minimal configuration | HPA with fine-grained control |
| **Container isolation** | None — process runs directly on OS | Full container isolation | Full container + namespace isolation |
| **Rolling updates** | Manual (scale down, update, scale up) | Platform-managed rolling deployments | Declarative rolling updates with rollback |
| **Infrastructure-as-code** | Terraform/Pulumi for VMs + config mgmt | Task/service definitions (CloudFormation, Bicep, Terraform) | Helm charts + GitOps (Argo CD, Flux) |
| **Best suited for** | Teams with VM-centric operations, regulatory constraints | Teams wanting container benefits without K8s complexity | Large-scale, mission-critical deployments |
| **Minimum team expertise** | Linux administration, Python | Container fundamentals, cloud platform | Kubernetes, Helm, cloud-native patterns |

---

## Observability

Production deployments should include monitoring and observability regardless of deployment pattern.

### Health Checks

- **`/health`** — Basic liveness check. Returns HTTP 200 when the application is running. Use this for load balancer and auto-scaler health checks.
- **`/api/models`** — Verifies the application can connect to configured model backends. Requires an API key.
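Deployment scripts can poll `/health` before shifting traffic or continuing a rollout. A minimal sketch (the URL and timeout in the usage line are assumptions):

```shell
# Poll a health-check command until it succeeds or a timeout (seconds) elapses.
wait_for_healthy() {
  timeout=$1; shift
  start=$(date +%s)
  until "$@" >/dev/null 2>&1; do
    if [ $(( $(date +%s) - start )) -ge "$timeout" ]; then
      return 1    # gave up: still unhealthy after $timeout seconds
    fi
    sleep 1
  done
}

# Usage: wait_for_healthy 60 curl -fsS http://localhost:8080/health
```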
### OpenTelemetry

Open WebUI supports **OpenTelemetry** for distributed tracing and HTTP metrics. Enable it with:

```bash
ENABLE_OTEL=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4318
OTEL_SERVICE_NAME=open-webui
```

This auto-instruments FastAPI, SQLAlchemy, Redis, and HTTP clients — giving visibility into request latency, database query performance, and cross-service traces.

### Structured Logging

Enable JSON-formatted logs for integration with log aggregation platforms (Datadog, Loki, CloudWatch, Splunk):

```bash
LOG_FORMAT=json
GLOBAL_LOG_LEVEL=INFO
```

For full monitoring setup details, see [Monitoring](/reference/monitoring) and [OpenTelemetry](/reference/monitoring/otel).

---

## Next Steps

- **[Architecture & High Availability](/enterprise/architecture)** — Deeper dive into Open WebUI's stateless design and HA capabilities.
- **[Security](/enterprise/security)** — Compliance frameworks, SSO/LDAP integration, RBAC, and audit logging.
- **[Integration](/enterprise/integration)** — Connecting AI models, pipelines, and extending functionality.
- **[Scaling Open WebUI](/getting-started/advanced-topics/scaling)** — The complete step-by-step technical scaling guide.
- **[Multi-Replica Troubleshooting](/troubleshooting/multi-replica)** — Solutions for common issues in scaled deployments.

---

**Need help planning your enterprise deployment?** Our team works with organizations worldwide to design and implement production Open WebUI environments.

[**Contact Enterprise Sales → sales@openwebui.com**](mailto:sales@openwebui.com)