open-webui
diff --git a/docs/enterprise/deployment/container-service.md b/docs/enterprise/deployment/container-service.md
Lines changed: 85 additions & 0 deletions
diff --git a/docs/enterprise/deployment/index.md b/docs/enterprise/deployment/index.md
Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,85 @@
+---
+sidebar_position: 2
+title: "Container Service"
+---
+
+# Container Service
+
+Run the official `ghcr.io/open-webui/open-webui` image on a managed container platform such as AWS ECS/Fargate, Azure Container Apps, or Google Cloud Run.
+
+:::info Prerequisites
+Before proceeding, ensure you have configured the [shared infrastructure requirements](/enterprise/deployment#shared-infrastructure-requirements) — PostgreSQL, Redis, a vector database, shared storage, and content extraction.
+:::
+
+## When to Choose This Pattern
+
+- You want container benefits (immutable images, versioned deployments, no OS management) without Kubernetes complexity
+- Your organization already uses a managed container platform
+- You need fast scaling with minimal operational overhead
+- You prefer managed infrastructure with platform-native auto-scaling
+
+## Architecture
+
+```mermaid
+flowchart TB
+    LB["Load Balancer"]
+
+    subgraph CS["Container Service"]
+        T1["Container Task 1"]
+        T2["Container Task 2"]
+        T3["Container Task N"]
+    end
+
+    subgraph Backend["Managed Backing Services"]
+        PG["PostgreSQL + PGVector"]
+        Redis["Redis"]
+        S3["Object Storage"]
+        Tika["Tika"]
+    end
+
+    LB --> CS
+    CS --> Backend
+```
+
+## Image Selection
+
+Use **versioned tags** for production stability:
+
+```
+ghcr.io/open-webui/open-webui:v0.x.x
+```
+
+Avoid the `:main` tag in production — it tracks the latest development build and can introduce breaking changes without warning. Check the [Open WebUI releases](https://github.com/open-webui/open-webui/releases) for the latest stable version.
+
+## Scaling Strategy
+
+- **Platform-native auto-scaling**: Configure your container service to scale on CPU utilization, memory, or request count.
+- **Health checks**: Use the `/health` endpoint for both liveness and readiness probes.
+- **Task-level env vars**: Pass all shared infrastructure configuration as environment variables or secrets in your task definition.
+- **Session affinity**: Enable sticky sessions on your load balancer for WebSocket stability. Redis handles cross-instance coordination, but keeping each client on the same task avoids unnecessary session handoffs between tasks.
+
+## Key Considerations
+
+| Consideration | Detail |
+| :--- | :--- |
+| **Storage** | Use object storage (S3, GCS, Azure Blob) or a shared filesystem (such as EFS). Container-local storage is ephemeral and not shared across tasks. |
+| **Tika sidecar** | Run Tika as a sidecar container in the same task definition, or as a separate service. The sidecar pattern keeps extraction traffic local to the task. |
+| **Secrets management** | Use your platform's secrets manager (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager) for `DATABASE_URL`, `REDIS_URL`, and `WEBUI_SECRET_KEY`. |
+| **Updates** | Perform a rolling deployment with a single task first — this task runs migrations (`ENABLE_DB_MIGRATIONS=true`). Once healthy, scale the remaining tasks with `ENABLE_DB_MIGRATIONS=false`. |
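+
+The update sequence in the **Updates** row above can be sketched as two sets of task-level settings. This is a minimal illustration rather than a platform-specific task definition; how each set is applied (a new task definition revision, a service update, and so on) depends on your platform.
+
+```bash
+# Phase 1: a single task on the new image version runs the migrations
+ENABLE_DB_MIGRATIONS=true
+
+# Phase 2: once that task reports healthy on /health, the remaining tasks
+# start on the same image version with migrations disabled
+ENABLE_DB_MIGRATIONS=false
+```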
+
+## Anti-Patterns to Avoid
+
+| Anti-Pattern | Impact | Fix |
+| :--- | :--- | :--- |
+| Using local SQLite | Data loss on task restart, database locks with multiple tasks | Set `DATABASE_URL` to PostgreSQL |
+| Default ChromaDB | SQLite-backed vector DB crashes under multi-process access | Set `VECTOR_DB=pgvector` (or Milvus/Qdrant) |
+| Inconsistent `WEBUI_SECRET_KEY` | Login loops, 401 errors, sessions that don't persist across tasks | Set the same key on every task via secrets manager |
+| No Redis | WebSocket failures, config not syncing, "Model Not Found" errors | Set `REDIS_URL` and `WEBSOCKET_MANAGER=redis` |
+
+For container basics, see the [Quick Start guide](/getting-started/quick-start).
+
+---
+
+**Need help planning your enterprise deployment?** Our team works with organizations worldwide to design and implement production Open WebUI environments.
+
+[**Contact Enterprise Sales → sales@openwebui.com**](mailto:sales@openwebui.com)
@@ -0,0 +1,154 @@
+---
+sidebar_position: 4
+title: "Scalable Enterprise Deployment Options"
+---
+
+# Enterprise Deployment Options
+
+Open WebUI's **stateless, container-first architecture** means the same application runs identically whether you deploy it as a Python process on a VM, a container in a managed service, or a pod in a Kubernetes cluster. The difference between deployment patterns is how you **orchestrate, scale, and operate** the application — not how the application itself behaves.
+
+:::tip Model Inference Is Independent
+How you serve LLM models is separate from how you deploy Open WebUI. You can use **managed APIs** (OpenAI, Anthropic, Azure OpenAI, Google Gemini) or **self-hosted inference** (Ollama, vLLM) with any deployment pattern. See [Integration](/enterprise/integration) for details on connecting models.
+:::
+
+---
+
+## Shared Infrastructure Requirements
+
+Regardless of which deployment pattern you choose, every scaled Open WebUI deployment requires the same set of backing services. Configure these **before** scaling beyond a single instance.
+
+| Component | Why It's Required | Options |
+| :--- | :--- | :--- |
+| **PostgreSQL** | Multi-instance deployments require a client-server database. SQLite is a single-file embedded database that allows only one writer at a time and cannot be shared safely between instances on different hosts. | Self-managed, Amazon RDS, Azure Database for PostgreSQL, Google Cloud SQL |
+| **Redis** | Session management, WebSocket coordination, and configuration sync across instances. | Self-managed, Amazon ElastiCache, Azure Cache for Redis, Google Memorystore |
+| **Vector Database** | The default ChromaDB uses a local SQLite backend that is not safe for multi-process access. | PGVector (shares PostgreSQL), Milvus, Qdrant, or ChromaDB in HTTP server mode |
+| **Shared Storage** | Uploaded files must be accessible from every instance. | Shared filesystem (NFS, EFS, CephFS) or object storage (`S3`, `GCS`, `Azure Blob`) |
+| **Content Extraction** | The default `pypdf` extractor leaks memory under sustained load. | Apache Tika or Docling as a sidecar service |
+| **Embedding Engine** | The default SentenceTransformers model loads ~500 MB into RAM per worker process. | OpenAI Embeddings API, or Ollama running an embedding model |
+
+### Critical Configuration
+
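+These environment variables **must** be set consistently across every instance. The one exception is `ENABLE_DB_MIGRATIONS`, which is enabled on exactly one instance (see the warning below):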
+
+```bash
+# Shared secret — MUST be identical on all instances
+WEBUI_SECRET_KEY=your-secret-key-here
+
+# Database
+DATABASE_URL=postgresql://user:password@db-host:5432/openwebui
+
+# Vector Database
+VECTOR_DB=pgvector
+PGVECTOR_DB_URL=postgresql://user:password@db-host:5432/openwebui
+
+# Redis
+REDIS_URL=redis://redis-host:6379/0
+WEBSOCKET_MANAGER=redis
+ENABLE_WEBSOCKET_SUPPORT=true
+
+# Content Extraction
+CONTENT_EXTRACTION_ENGINE=tika
+TIKA_SERVER_URL=http://tika:9998
+
+# Embeddings
+RAG_EMBEDDING_ENGINE=openai
+
+# Storage — choose ONE:
+# Option A: shared filesystem (mount the same volume to all instances, no env var needed)
+# Option B: object storage (see https://docs.openwebui.com/reference/env-configuration#cloud-storage for all required vars)
+# STORAGE_PROVIDER=s3
+
+# Workers — let the orchestrator handle scaling
+UVICORN_WORKERS=1
+
+# Migrations — only ONE instance should run migrations
+ENABLE_DB_MIGRATIONS=false
+```
+
+:::warning Database Migrations
+Set `ENABLE_DB_MIGRATIONS=false` on **all instances except one**. During updates, scale down to a single instance, allow migrations to complete, then scale back up. Concurrent migrations can corrupt your database.
+:::
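+
+For storage Option B, a minimal sketch of the S3 settings might look like the following. The variable names are given as best understood from the cloud storage section of the environment variable reference, and the bucket, region, and credential values are placeholders. Confirm both against the reference for your Open WebUI version before relying on them.
+
+```bash
+# Object storage (assumed S3 variable names; placeholder values)
+STORAGE_PROVIDER=s3
+S3_BUCKET_NAME=openwebui-uploads
+S3_REGION_NAME=us-east-1
+# Prefer injecting credentials from your secrets manager
+S3_ACCESS_KEY_ID=your-access-key-id
+S3_SECRET_ACCESS_KEY=your-secret-access-key
+# Only needed for S3-compatible services other than AWS (for example MinIO)
+# S3_ENDPOINT_URL=https://s3.example.internal
+```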
+
+For the complete step-by-step scaling walkthrough, see [Scaling Open WebUI](/getting-started/advanced-topics/scaling). For the full environment variable reference, see [Environment Variable Configuration](/reference/env-configuration).
+
+---
+
+## Choose Your Deployment Pattern
+
+Open WebUI supports three production deployment patterns. Each guide covers architecture, scaling strategy, and key considerations specific to that approach.
+
+### [Python / Pip on Auto-Scaling VMs](./python-pip)
+
+Deploy `open-webui serve` as a systemd-managed process on virtual machines in a cloud auto-scaling group (AWS ASG, Azure VMSS, GCP MIG). Best for teams with established VM-based infrastructure and strong Linux administration skills, or when regulatory requirements mandate direct OS-level control.
+
+### [Container Service](./container-service)
+
+Run the official Open WebUI container image on a managed platform such as AWS ECS/Fargate, Azure Container Apps, or Google Cloud Run. Best for teams wanting container benefits — immutable images, versioned deployments, no OS management — without Kubernetes complexity.
+
+### [Kubernetes with Helm](./kubernetes-helm)
+
+Deploy using the official Open WebUI Helm chart on any Kubernetes distribution (EKS, AKS, GKE, OpenShift, Rancher, self-managed). Best for large-scale, mission-critical deployments requiring declarative infrastructure-as-code, advanced auto-scaling, and GitOps workflows.
+
+---
+
+## Deployment Comparison
+
+| | **Python / Pip (VMs)** | **Container Service** | **Kubernetes (Helm)** |
+| :--- | :--- | :--- | :--- |
+| **Operational complexity** | Moderate — OS patching, Python management | Low — platform-managed containers | Higher — requires K8s expertise |
+| **Auto-scaling** | Cloud ASG/VMSS with health checks | Platform-native, minimal configuration | HPA with fine-grained control |
+| **Container isolation** | None — process runs directly on OS | Full container isolation | Full container + namespace isolation |
+| **Rolling updates** | Manual (scale down, update, scale up) | Platform-managed rolling deployments | Declarative rolling updates with rollback |
+| **Infrastructure-as-code** | Terraform/Pulumi for VMs + config mgmt | Task/service definitions (CloudFormation, Bicep, Terraform) | Helm charts + GitOps (Argo CD, Flux) |
+| **Best suited for** | Teams with VM-centric operations, regulatory constraints | Teams wanting container benefits without K8s complexity | Large-scale, mission-critical deployments |
+| **Minimum team expertise** | Linux administration, Python | Container fundamentals, cloud platform | Kubernetes, Helm, cloud-native patterns |
+
+---
+
+## Observability
+
+Production deployments should include monitoring and observability regardless of deployment pattern.
+
+### Health Checks
+
+- **`/health`** — Basic liveness check. Returns HTTP 200 when the application is running. Use this for load balancer and auto-scaler health checks.
+- **`/api/models`** — Verifies the application can connect to configured model backends. Requires an API key.
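+
+As a quick manual check, both endpoints can be probed with `curl`. This is a sketch under assumptions: `localhost:8080` stands in for wherever your instance listens, and `OPEN_WEBUI_API_KEY` is a hypothetical shell variable holding an API key generated for your account.
+
+```bash
+# Liveness: --fail makes curl exit non-zero on an HTTP error status
+curl --fail --silent --show-error http://localhost:8080/health
+
+# Model backends: the API key is sent as a Bearer token
+curl --fail --silent --show-error \
+  -H "Authorization: Bearer ${OPEN_WEBUI_API_KEY}" \
+  http://localhost:8080/api/models
+```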
+
+### OpenTelemetry
+
+Open WebUI supports **OpenTelemetry** for distributed tracing and HTTP metrics. Enable it with:
+
+```bash
+ENABLE_OTEL=true
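+# 4318 is the conventional OTLP/HTTP port and 4317 the conventional OTLP/gRPC port;
+# use the one that matches your collector's receiver and the configured exporter protocol
+OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4318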
+OTEL_SERVICE_NAME=open-webui
+```
+
+This auto-instruments FastAPI, SQLAlchemy, Redis, and HTTP clients — giving visibility into request latency, database query performance, and cross-service traces.
+
+### Structured Logging
+
+Enable JSON-formatted logs for integration with log aggregation platforms (Datadog, Loki, CloudWatch, Splunk):
+
+```bash
+LOG_FORMAT=json
+GLOBAL_LOG_LEVEL=INFO
+```
+
+For full monitoring setup details, see [Monitoring](/reference/monitoring) and [OpenTelemetry](/reference/monitoring/otel).
+
+---
+
+## Next Steps
+
+- **[Architecture & High Availability](/enterprise/architecture)** — Deeper dive into Open WebUI's stateless design and HA capabilities.
+- **[Security](/enterprise/security)** — Compliance frameworks, SSO/LDAP integration, RBAC, and audit logging.
+- **[Integration](/enterprise/integration)** — Connecting AI models, pipelines, and extending functionality.
+- **[Scaling Open WebUI](/getting-started/advanced-topics/scaling)** — The complete step-by-step technical scaling guide.
+- **[Multi-Replica Troubleshooting](/troubleshooting/multi-replica)** — Solutions for common issues in scaled deployments.
+
+---
+
+**Need help planning your enterprise deployment?** Our team works with organizations worldwide to design and implement production Open WebUI environments.
+
+[**Contact Enterprise Sales → sales@openwebui.com**](mailto:sales@openwebui.com)