diff --git a/docs/en/envoy_ai_gateway/index.mdx b/docs/en/envoy_ai_gateway/index.mdx
new file mode 100644
index 0000000..e1a3b19
--- /dev/null
+++ b/docs/en/envoy_ai_gateway/index.mdx
@@ -0,0 +1,6 @@
+---
+weight: 115
+---
+# Alauda Build of Envoy AI Gateway
+
+
diff --git a/docs/en/envoy_ai_gateway/install.mdx b/docs/en/envoy_ai_gateway/install.mdx
new file mode 100644
index 0000000..2b080c0
--- /dev/null
+++ b/docs/en/envoy_ai_gateway/install.mdx
@@ -0,0 +1,36 @@
+---
+weight: 20
+---
+
+# Install Envoy AI Gateway
+
+## Downloading Cluster Plugin
+
+:::info
+
+The `Alauda Build of Envoy AI Gateway` cluster plugin can be retrieved from the Customer Portal.
+
+Please contact Customer Support for more information.
+
+:::
+
+## Uploading the Cluster Plugin
+
+For more information on uploading the cluster plugin, refer to the platform documentation.
+
+## Installing Alauda Build of Envoy AI Gateway
+
+1. Go to the `Administrator` -> `Marketplace` -> `Cluster Plugin` page, switch to the target cluster, and then deploy the `Alauda Build of Envoy AI Gateway` Cluster plugin.
+ :::info
+   Deployment form parameters can be kept at their default values, or adjusted once you understand how they are used.
+ :::
+
+2. Verify the result. The plugin shows an `Installed` status in the UI; alternatively, check the pod status:
+ ```bash
+ kubectl get pods -n envoy-gateway-system | grep "ai-gateway"
+ ```
+
+## Upgrading Alauda Build of Envoy AI Gateway
+
+1. Upload the new version of the **Alauda Build of Envoy AI Gateway** plugin package to ACP.
+2. Go to the `Administrator` -> `Clusters` -> `Target Cluster` -> `Functional Components` page, then click the `Upgrade` button to upgrade **Alauda Build of Envoy AI Gateway** to the new version.
diff --git a/docs/en/envoy_ai_gateway/intro.mdx b/docs/en/envoy_ai_gateway/intro.mdx
new file mode 100644
index 0000000..2d5da83
--- /dev/null
+++ b/docs/en/envoy_ai_gateway/intro.mdx
@@ -0,0 +1,33 @@
+---
+weight: 10
+---
+
+# Introduction
+
+## Envoy AI Gateway
+
+**Alauda Build of Envoy AI Gateway** is based on the [Envoy AI Gateway](https://aigateway.envoyproxy.io/) project.
+Envoy AI Gateway is a Kubernetes-native, AI-specific gateway layer built on top of [Envoy Gateway](https://gateway.envoyproxy.io/), providing intelligent traffic management, routing, and policy enforcement for AI inference workloads.
+
+Main components and capabilities include:
+
+- **AI-Aware Routing**: Routes inference requests to the appropriate backend model service based on request content, model name, and backend availability — enabling transparent multi-model serving behind a single endpoint.
+- **OpenAI-Compatible API**: Exposes a unified, OpenAI-compatible API surface (`/v1/chat/completions`, `/v1/completions`, `/v1/models`) for all downstream inference services, regardless of the underlying runtime.
+- **Per-Model Rate Limiting & Policies**: Enforces fine-grained rate limiting, token quotas, and traffic policies at the individual model level, preventing resource starvation and ensuring fair usage across tenants.
+- **Backend Load Balancing**: Distributes inference requests across multiple replicas of the same model using configurable load-balancing strategies, with health checking and automatic failover.
+- **Envoy Gateway Integration**: Runs as an extension of Envoy Gateway, inheriting its Kubernetes Gateway API-native control plane, TLS termination, and observability features (metrics, access logs, distributed tracing).
+- **Gateway API Inference Extension (GIE)**: Integrates with the Kubernetes SIG Gateway API Inference Extension for advanced, inference-aware scheduling and load balancing decisions based on real-time backend state.
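
As a minimal sketch of what the OpenAI-compatible surface looks like from a client's perspective, the snippet below builds a `/v1/chat/completions` request against a hypothetical gateway address (`ai-gateway.example.com` and the model name `llama-3-8b` are illustrative placeholders, not values from this document):

```python
import json
import urllib.request

# Hypothetical endpoint -- substitute your gateway's actual address.
GATEWAY_URL = "http://ai-gateway.example.com/v1/chat/completions"

# Standard OpenAI-compatible chat payload; the gateway routes by model name.
payload = {
    "model": "llama-3-8b",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}

request = urllib.request.Request(
    GATEWAY_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would return an OpenAI-compatible JSON
# response; it is not called here because the endpoint is illustrative.
print(request.get_full_url())
```

Because the API shape is shared across backends, the same client code works regardless of which runtime ultimately serves the model.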
+
+Envoy AI Gateway is a required dependency of **Alauda Build of KServe** for exposing inference services.
+
+For installation on the platform, see [Install Envoy AI Gateway](./install).
+
+## Documentation
+
+Envoy AI Gateway upstream documentation and related resources:
+
+- **Envoy AI Gateway Documentation**: [https://aigateway.envoyproxy.io/](https://aigateway.envoyproxy.io/) — Official documentation covering architecture, configuration, and API references.
+- **Envoy AI Gateway GitHub**: [https://github.com/envoyproxy/ai-gateway](https://github.com/envoyproxy/ai-gateway) — Source code, release notes, and issues.
+- **Envoy Gateway**: [https://gateway.envoyproxy.io/](https://gateway.envoyproxy.io/) — The underlying gateway infrastructure that Envoy AI Gateway extends.
+- **Gateway API Inference Extension (GIE)**: [https://gateway-api-inference-extension.sigs.k8s.io/](https://gateway-api-inference-extension.sigs.k8s.io/) — Kubernetes SIG project for AI-aware routing integrated with Envoy AI Gateway.
+- **KServe (Alauda Build)**: [../kserve/intro](../kserve/intro) — KServe uses Envoy AI Gateway as a required dependency for exposing and routing inference services.
diff --git a/docs/en/installation/ai-cluster.mdx b/docs/en/installation/ai-cluster.mdx
index ee72e11..c74f6fd 100644
--- a/docs/en/installation/ai-cluster.mdx
+++ b/docs/en/installation/ai-cluster.mdx
@@ -155,6 +155,10 @@ Confirm that the **Alauda AI** tile shows one of the following states:
+## Installing Alauda Build of KServe Operator
+
+For detailed installation steps, see [Install KServe](../kserve/install.mdx) in Alauda Build of KServe.
+
## Enabling Knative Functionality
Knative functionality is an optional capability that requires an additional operator and instance to be deployed.
@@ -220,6 +224,7 @@ Once **Knative Operator** is installed, you need to create the `KnativeServing`
6. Replace the content with the following YAML:
7. Click **Create**.
+
```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
@@ -254,10 +259,14 @@ Once **Knative Operator** is installed, you need to create the `KnativeServing`
kourier:
enabled: true
```
+:::warning
+- For ACP 4.0, use version **1.18.1**
+- For ACP 4.1 and above, use version **1.19.6**
+:::
-1. For ACP 4.0, keep the version as "1.18.1". For ACP 4.1 and above, change the version to "1.19.6".
+1. Specify the version of Knative Serving to be deployed.
2. `private-registry` is a placeholder for your private registry address. You can find this in the **Administrator** view, then click **Clusters**, select `your cluster`, and check the **Private Registry** value in the **Basic Info** section.
@@ -347,75 +356,6 @@ default True Succeeded
Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx).
-## Migrating to Knative Operator
-
-In the 1.x series of products, the serverless capability for inference services was provided by the `Alauda AI Model Serving` operator. In the 2.x series, this capability is provided by the `Knative Operator`. This section guides you through migrating your serverless capability from the legacy operator to the new one.
-
-### 1. Remove Legacy Serving Instance
-
-
-
-#### Procedure
-
-In **Administrator** view:
-
-1. Click **Marketplace / OperatorHub**.
-2. At the top of the console, from the **Cluster** dropdown list, select the destination cluster where **Alauda AI** is installed.
-3. Select **Alauda AI**, then click the **All Instances** tab.
-4. Locate the `default` instance and click **Update**.
-5. In the update form, locate the **Serverless Configuration** section.
-6. Set **BuiltIn Knative Serving** to `Removed`.
-7. Click **Update** to apply the changes.
-
-
-
-### 2. Install Knative Operator and Create Serving Instance
-
-Install the **Knative Operator** from the Marketplace and create the `KnativeServing` instance. For detailed instructions, refer to the [Enabling Knative Functionality](#enabling-knative-functionality) section.
-
-:::info
-Once the above steps are completed, the migration of the Knative serving control plane is complete.
-
-- If you are migrating from the **Alauda AI 2.0** + **Alauda AI Model Serving** combination, the migration is fully complete here. Business services will automatically switch their configuration shortly.
-- If you are migrating from the **Alauda AI 1.x** + **Alauda AI Model Serving** combination, please ensure that **Alauda AI** is simultaneously upgraded to version **2.x**.
-:::
-
-## Replace GitLab Service After Installation
-
-If you want to replace GitLab Service after installation, follow these steps:
-
-1. **Reconfigure GitLab Service**
- Refer to the [Pre-installation Configuration](./pre-configuration.mdx) and re-execute its steps.
-
-2. **Update Alauda AI Instance**
- - In Administrator view, navigate to **Marketplace > OperatorHub**
- - From the **Cluster** dropdown, select the target cluster
- - Choose **Alauda AI** and click the **All Instances** tab
- - Locate the **'default'** instance and click **Update**
-
-3. **Modify GitLab Configuration**
- In the **Update default** form:
- - Locate the **GitLab** section
- - Enter:
- - **Base URL**: The URL of your new GitLab instance
- - **Admin Token Secret Namespace**: `cpaas-system`
- - **Admin Token Secret Name**: `aml-gitlab-admin-token`
-
-4. **Restart Components**
- Restart the `aml-controller` deployment in the `kubeflow` namespace.
-
-5. **Refresh Platform Data**
- In Alauda AI management view, re-manage all namespaces.
- - In Alauda AI view, navigate to **Admin** view from **Business View**
- - On the **Namespace Management** page, delete all existing managed namespaces
- - Use "Managed Namespace" to add namespaces requiring Alauda AI integration
- :::info
- Original models won't migrate automatically
- Continue using these models:
- - Recreate and re-upload in new GitLab OR
- - Manually transfer model files to new repository
- :::
-
## FAQ
### 1. Configure the audit output directory for aml-skipper
diff --git a/docs/en/installation/ai-generative.mdx b/docs/en/installation/ai-generative.mdx
deleted file mode 100644
index 4cfbe45..0000000
--- a/docs/en/installation/ai-generative.mdx
+++ /dev/null
@@ -1,104 +0,0 @@
----
-weight: 35
----
-
-# Install Alauda Build of KServe
-
-**Alauda Build of KServe** is a cloud-native component built on **KServe** for serving generative AI models. As an extension of the Alauda AI ecosystem, it specifically optimizes for **Large Language Models (LLMs)**, offering essential features such as inference orchestration, streaming responses, and resource-based auto-scaling for generative workloads.
-
-
-## Prerequisites
-
-Before installing **Alauda Build of KServe**, you need to ensure the following dependencies are installed:
-
-### Required Dependencies
-
-| Dependency | Type | Description |
-|------------|------|-------------|
-| Alauda build of Envoy Gateway | Operator | Provides the underlying gateway functionality for AI services |
-| Envoy AI Gateway | Cluster Plugin | Provides AI-specific gateway capabilities |
-| [Alauda Build of LeaderWorkerSet](../../lws/install.mdx) | Cluster Plugin | Provides leader-worker set functionality for AI workloads |
-
-:::info
-`Alauda build of Envoy Gateway` is natively integrated into ACP 4.2. For environments running earlier versions (including ACP 4.0 and 4.1), please contact Customer Support for compatibility and installation guidance.
-:::
-
-### Optional Dependencies
-
-| Dependency | Type | Description |
-|------------|------|-------------|
-| GIE | Built-in | Integrated GIE (gateway-api-inference-extension) for enhanced AI capabilities. Can be enabled through the Alauda Build of KServe UI. |
-| Alauda AI | Operator | Required only if you need to use KServe Predictive AI functionality. Can be disabled if you only need LLM Generative AI functionality. |
-
-### Installation Notes
-
-1. **Required Dependencies**: All three required dependencies must be installed before installing Alauda Build of KServe.
-2. **GIE Integration**: If you want to use GIE, you can enable it during the installation process by selecting the "Integrated GIE" option in the Alauda Build of KServe UI.
-3. **Alauda AI Integration**: If you don't need KServe Predictive AI functionality and only want to use LLM Generative AI, you can disable the "Integrated With Alauda AI" option during installation.
-
-## Downloading Cluster Plugin
-
-:::info
-
-`Alauda Build of KServe` cluster plugin can be retrieved from Customer Portal.
-
-Please contact Consumer Support for more information.
-
-:::
-
-## Uploading the Cluster Plugin
-
-For more information on uploading the cluster plugin, please refer to
-
-## Installing Alauda Build of KServe
-
-1. Go to the `Administrator` -> `Marketplace` -> `Cluster Plugin` page, switch to the target cluster, and then deploy the `Alauda Build of KServe` Cluster plugin.
-
-2. In the deployment form, configure the following parameters as needed:
-
-### Envoy Gateway Configuration
-
-| Parameter | Description | Default Value |
-|-----------|-------------|---------------|
-| **ServiceAccount Name** | The name of the service account used by Envoy Gateway. | envoy-gateway |
-| **ServiceAccount Namespace** | The namespace where the service account is located. | envoy-gateway-system |
-| **Create Instance** | Create an Envoy Gateway instance to manage inference traffic with bundled extensions. | Enabled |
-| **Instance Name** | The name of the Envoy Gateway instance to be created. | aieg |
-
-### Envoy AI Gateway Configuration
-
-| Parameter | Description | Default Value |
-|-----------|-------------|---------------|
-| **Service Name** | The Kubernetes service name for Envoy AI Gateway. | ai-gateway-controller |
-| **Port Number** | The port number used by Envoy AI Gateway. | 1063 |
-
-### KServe Gateway Configuration
-
-| Parameter | Description | Default Value |
-|-----------|-------------|---------------|
-| **Enabled** | Install a KServe Gateway Instance for inferenceservices functionality. | Enabled |
-| **Gateway Name** | The name of the KServe Gateway. | kserve-ingress-gateway |
-| **Gateway Namespace** | The namespace where the KServe Gateway is deployed. | kserve |
-| **GatewayClass** | Optional. The custom name for the GatewayClass. If left empty, the system will automatically derive it following the "\{Namespace\}-\{Name\}" pattern. | (Empty) |
-| **Port Number** | The port number used by KServe Gateway. | 80 |
-
-### GIE(gateway-api-inference-extension) Configuration
-
-| Parameter | Description | Default Value |
-|-----------|-------------|---------------|
-| **BuiltIn** | Install with the bundled gateway-api-inference-extension v0.5.1 dependencies for enhanced AI capabilities. | Enabled |
-
-### Alauda AI Integration
-
-| Parameter | Description | Default Value |
-|-----------|-------------|---------------|
-| **Integrated** | Enable integration with Alauda AI core plugin to reuse existing configurations. | Disabled |
-
-3. Click **Install** to begin the installation process.
-
-4. Verify result. You can see the status of "Installed" in the UI.
-
-## Upgrading Alauda Build of KServe
-
-1. Upload the new version for package of **Alauda Build of KServe** plugin to ACP.
-2. Go to the `Administrator` -> `Clusters` -> `Target Cluster` -> `Functional Components` page, then click the `Upgrade` button, and you will see the `Alauda Build of KServe` can be upgraded.
diff --git a/docs/en/installation/pre-configuration.mdx b/docs/en/installation/pre-configuration.mdx
index 3be96ff..439704f 100644
--- a/docs/en/installation/pre-configuration.mdx
+++ b/docs/en/installation/pre-configuration.mdx
@@ -4,16 +4,6 @@ weight: 5
# Pre-installation Configuration
-## **Deploy Service Mesh**
-
-Since Alauda AI leverages Service Mesh capabilities for model inference services, Service Mesh must be deployed in the cluster before deploying Alauda AI. For detailed deployment procedures, refer to .
-
-:::info
-
-After completing the **Prerequisites** on the **Create Service Mesh** page, proceed to the **Creating a Service Mesh** page and follow the on-screen instructions to finalize the deployment of the Service Mesh.
-
-:::
-
## **Preparing the GitLab Service**
In Alauda AI, GitLab is the core component for **Model Management**. Before deploying Alauda AI, you **must prepare** a GitLab service.
diff --git a/docs/en/kserve/index.mdx b/docs/en/kserve/index.mdx
new file mode 100644
index 0000000..7a59347
--- /dev/null
+++ b/docs/en/kserve/index.mdx
@@ -0,0 +1,7 @@
+---
+weight: 95
+---
+
+# Alauda Build of KServe
+
+
diff --git a/docs/en/kserve/install.mdx b/docs/en/kserve/install.mdx
new file mode 100644
index 0000000..a070fe7
--- /dev/null
+++ b/docs/en/kserve/install.mdx
@@ -0,0 +1,182 @@
+---
+weight: 20
+---
+
+# Install KServe
+
+## Prerequisites
+
+Before installing **Alauda Build of KServe**, you need to ensure the following dependencies are installed:
+
+### Required Dependencies
+
+| Dependency | Type | Description |
+|------------|------|-------------|
+| Alauda build of Envoy Gateway | Operator | Provides the underlying gateway functionality for AI services |
+| [Alauda Build of Envoy AI Gateway](../../envoy_ai_gateway/install.mdx) | Cluster Plugin | Provides AI-specific gateway capabilities |
+| [Alauda Build of LeaderWorkerSet](../../lws/install.mdx) | Cluster Plugin | Provides leader-worker set functionality for AI workloads |
+| GIE (gateway-api-inference-extension) | Built-in | Bundled with Alauda Build of KServe by default. If GIE is already installed in the cluster, the built-in installation can be disabled by setting `preset.GIE.enabled` to `false` in the `KServe` resource. |
+
+:::info
+`Alauda build of Envoy Gateway` is natively integrated into ACP 4.2. For environments running earlier versions (including ACP 4.0 and 4.1), please contact Customer Support for compatibility and installation guidance.
+:::
+
+### Installation Notes
+
+1. **Required Dependencies**: All required dependencies must be installed before installing Alauda Build of KServe.
+2. **GIE Integration**: GIE is bundled and enabled by default. If your environment already has GIE installed separately, set `preset.GIE.enabled` to `false` in the `KServe` resource to disable the built-in installation.
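
For reference, disabling the bundled GIE comes down to a short fragment of the `KServe` custom resource (a sketch of just the relevant keys, matching the `preset.GIE` structure used by the instance configuration on this page):

```yaml
spec:
  values:
    preset:
      GIE:
        enabled: false  # skip the bundled GIE; an existing installation is used instead
```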
+
+## Upload Operator
+
+Download the Alauda Build of KServe Operator installation file (e.g., `kserve-operator.ALL.xxxx.tgz`).
+
+Use the `violet` command to publish it to the platform repository:
+
+```bash
+violet push --platform-address=<platform-address> --platform-username=<username> --platform-password=<password> kserve-operator.ALL.xxxx.tgz
+```
+
+## Install Operator
+
+In **Administrator** view:
+
+1. Click **Marketplace / OperatorHub**.
+2. At the top of the console, from the **Cluster** dropdown list, select the destination cluster where you want to install the KServe Operator.
+3. Search for and select **Alauda Build of KServe**, then click **Install**.
+
+ **Install Alauda Build of KServe** window will pop up.
+
+4. Leave **Channel** unchanged.
+5. Check whether the **Version** matches the **Alauda Build of KServe** version you want to install.
+6. Leave **Installation Location** unchanged; it should be `kserve-operator` by default.
+7. Select **Manual** for **Upgrade Strategy**.
+8. Click **Install**.
+
+### Verification
+
+Confirm that the **Alauda Build of KServe** tile shows one of the following states:
+
+- `Installing`: installation is in progress; wait for this to change to `Installed`.
+- `Installed`: installation is complete.
+
+## Create KServe Instance
+
+After the operator is installed, create a `KServe` custom resource to deploy the KServe instance.
+
+Switch to **YAML view** and apply the following configuration, then adjust the callout fields for your environment:
+
+```yaml
+apiVersion: components.aml.dev/v1alpha1
+kind: KServe
+metadata:
+ name: default-kserve
+spec:
+ namespace: kserve # [!code callout]
+ values:
+ global:
+ clusterName: # [!code callout]
+ deployFlavor: single-node # [!code callout]
+ platformAddress: # [!code callout]
+ preset:
+ GIE: # [!code callout]
+ enabled: true
+ envoy_ai_gateway: # [!code callout]
+ port: 1063
+ service: ai-gateway-controller
+ envoy_gateway: # [!code callout]
+ create_instance: true
+ deploy_type: ControllerNamespace
+ instance_name: aieg
+ sa_namespace: envoy-gateway-system
+ service_account: envoy-gateway
+ kserve_gateway: # [!code callout]
+ enabled: true
+ gateway_class: ""
+ name: kserve-ingress-gateway
+ namespace: kserve
+ port: 80
+ registry:
+ address: # [!code callout]
+ kserve:
+ controller:
+ deploymentMode: Knative # [!code callout]
+ gateway:
+ domain: # [!code callout]
+ storage:
+ caBundleConfigMapName: aml-global-ca-bundle # [!code callout]
+```
+
+
+
+1. `spec.namespace` — Kubernetes namespace where KServe components are deployed. Default: `kserve`.
+2. `global.clusterName` — Cluster name as registered in the platform. Example: `business-1`.
+3. `global.deployFlavor` — `single-node` for non-HA, `ha-cluster` for production HA.
+4. `global.platformAddress` — Alauda Container Platform management endpoint address. Example: `https://192.168.131.112`.
+5. `preset.GIE` — Built-in Gateway API Inference Extension for enhanced AI capabilities. See [GIE Configuration](#gie-gateway-api-inference-extension-configuration).
+6. `preset.envoy_ai_gateway` — AI-specific gateway for intelligent routing and policy enforcement. See [Envoy AI Gateway Configuration](#envoy-ai-gateway-configuration).
+7. `preset.envoy_gateway` — Underlying Envoy-based gateway infrastructure. See [Envoy Gateway Configuration](#envoy-gateway-configuration).
+8. `preset.kserve_gateway` — Ingress gateway for KServe inference services. See [KServe Gateway Configuration](#kserve-gateway-configuration).
+9. `global.registry.address` — The container registry endpoint used by the target cluster (`global.clusterName`) to pull KServe infrastructure and runtime images.
+Example: `registry.alauda.cn:60070`.
+10. `kserve.controller.deploymentMode` — Set to `Knative` for serverless features like scale-to-zero, or `Standard` for native Kubernetes deployments.
+11. `kserve.controller.gateway.domain` — Domain for the ingress gateway to expose inference service endpoints. Use a wildcard domain, e.g., `*.example.com`.
+12. `kserve.storage.caBundleConfigMapName` — ConfigMap name containing the CA bundle for storage connections.
+
+
+
+
+### Verification
+
+Check the status of the `KServe` resource:
+
+```bash
+kubectl get kserve default-kserve -n kserve-operator
+```
+
+The instance is ready when the status shows `DEPLOYED: True`.
+
+### Envoy Gateway Configuration
+
+| Field | Description | Default |
+|-------|-------------|---------|
+| `preset.envoy_gateway.service_account` | Service account name used by Envoy Gateway. | `envoy-gateway` |
+| `preset.envoy_gateway.sa_namespace` | Namespace where the Envoy Gateway service account is located. | `envoy-gateway-system` |
+| `preset.envoy_gateway.create_instance` | Create an Envoy Gateway instance to manage inference traffic with bundled extensions. | `true` |
+| `preset.envoy_gateway.instance_name` | Name of the Envoy Gateway instance to create. | `aieg` |
+
+### Envoy AI Gateway Configuration
+
+| Field | Description | Default |
+|-------|-------------|---------|
+| `preset.envoy_ai_gateway.service` | Kubernetes service name for Envoy AI Gateway. | `ai-gateway-controller` |
+| `preset.envoy_ai_gateway.port` | Port number used by Envoy AI Gateway. | `1063` |
+
+### KServe Gateway Configuration
+
+| Field | Description | Default |
+|-------|-------------|---------|
+| `preset.kserve_gateway.enabled` | Deploy a KServe Gateway instance for InferenceService traffic. | `true` |
+| `preset.kserve_gateway.name` | Name of the KServe Gateway. | `kserve-ingress-gateway` |
+| `preset.kserve_gateway.namespace` | Namespace where the KServe Gateway is deployed. | `kserve` |
+| `preset.kserve_gateway.gateway_class` | Optional custom GatewayClass name. If empty, derived as `{namespace}-{name}`. | `""` |
+| `preset.kserve_gateway.port` | Port number used by the KServe Gateway. | `80` |
+
+### GIE (gateway-api-inference-extension) Configuration
+
+| Field | Description | Default |
+|-------|-------------|---------|
+| `preset.GIE.enabled` | Enable the bundled Gateway API Inference Extension. Set to `false` if GIE is already installed separately in the cluster. | `true` |
+
+
+## Upgrading Alauda Build of KServe
+
+1. Upload the new version of the **Alauda Build of KServe** operator package using the `violet` tool.
+2. Go to the `Administrator` -> `Marketplace` -> `OperatorHub` page, find **Alauda Build of KServe**, and click **Confirm** to apply the new version.
+
+### Verification
+
+After upgrading, confirm that the **Alauda Build of KServe** tile shows `Installed` and verify the KServe instance status:
+
+```bash
+kubectl get kserve default-kserve -n kserve-operator
+```
\ No newline at end of file
diff --git a/docs/en/kserve/intro.mdx b/docs/en/kserve/intro.mdx
new file mode 100644
index 0000000..6f277f3
--- /dev/null
+++ b/docs/en/kserve/intro.mdx
@@ -0,0 +1,44 @@
+---
+weight: 10
+---
+
+# Introduction
+
+## KServe
+
+**Alauda Build of KServe** is based on the [KServe](https://kserve.github.io/website/) project.
+KServe provides a standardized, cloud-native interface for serving machine learning models at scale on Kubernetes.
+It has evolved around two primary scenarios: **Predictive AI** for traditional ML inference, and **Generative AI** for LLM-based workloads.
+
+### Generative AI
+
+Generative AI support is optimized for Large Language Model (LLM) serving with OpenAI-compatible APIs.
+
+- **llm-d (Distributed LLM Inference)**: A Kubernetes-native distributed inference framework that runs under the KServe control plane. llm-d orchestrates multi-node LLM inference using a Leader/Worker pattern and makes real-time routing decisions based on KV cache state and GPU load — enabling KV-cache-aware request scheduling, elastic tensor/pipeline parallelism, and cluster-wide inference that behaves like a single machine. This lowers cost per token and maximizes GPU utilization for large models (e.g., Llama 3.1 405B) that exceed single-node memory.
+- **LLM Inference & Streaming**: Native support for streaming responses (SSE / chunked transfer), enabling real-time token delivery for chat and completion workloads, with OpenAI-compatible `/chat/completions` and `/completions` APIs.
+- **vLLM Runtime**: First-class integration with vLLM as the high-performance LLM serving backend, with support for continuous batching and PagedAttention.
+- **Gateway Integration**: Native integration with Envoy Gateway and the Gateway API Inference Extension (GIE) for AI-aware traffic routing, load balancing, and per-model rate limiting across inference services.
+- **Autoscaling for LLMs**: Metrics-driven autoscaling policies tailored to LLM throughput characteristics, including scale-to-zero for cost efficiency.
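
The streaming behavior described above can be illustrated with a small, self-contained parser for OpenAI-style SSE chunks (the sample chunks below are fabricated for illustration; real responses carry additional fields such as `id` and `created`):

```python
import json

def collect_stream_content(sse_lines):
    """Accumulate delta content tokens from OpenAI-style SSE chunks."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments and blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel used by OpenAI-compatible APIs
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print(collect_stream_content(sample))  # prints: Hello, world
```

Each `data:` line arrives as soon as its token is generated, which is what enables real-time display in chat UIs.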
+
+### Predictive AI
+
+Predictive AI covers traditional machine learning model serving with high throughput and low latency requirements.
+
+- **InferenceService**: The core CRD for deploying and managing model serving endpoints. Supports canary rollouts, traffic splitting across model versions, and A/B testing workflows.
+- **Model Serving Runtimes**: Pre-integrated runtimes for popular ML frameworks — TensorFlow Serving, TorchServe, Triton Inference Server, SKLearn, XGBoost, and more. Custom runtimes are supported via the **ClusterServingRuntime** and **ServingRuntime** CRDs.
+- **Inference Graph**: The **InferenceGraph** CRD enables composing multiple models into a pipeline, including pre/post-processing nodes, routing logic, and ensemble patterns.
+- **Autoscaling**: Scale-to-zero and scale-from-zero support via KEDA or Kubernetes HPA, with policies based on request rate, queue depth, or custom metrics.
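
As a concrete picture of the `InferenceService` API, here is the canonical sklearn example from the upstream KServe quickstart (the `storageUri` points at a public sample bucket from the KServe documentation; adapt the model format and storage location to your environment):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```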
+
+For installation on the platform, see [Install KServe](./install).
+
+## Documentation
+
+KServe upstream documentation and key dependencies:
+
+- **KServe Documentation**: [https://kserve.github.io/website/](https://kserve.github.io/website/) — Official documentation covering concepts, model serving runtimes, and API references.
+- **KServe GitHub**: [https://github.com/kserve/kserve](https://github.com/kserve/kserve) — Source code, release notes, and issues.
+- **llm-d**: [https://github.com/llm-d/llm-d](https://github.com/llm-d/llm-d) — Kubernetes-native distributed LLM inference framework with KV-cache-aware scheduling and elastic parallelism.
+- **LeaderWorkerSet (LWS)**: [https://github.com/kubernetes-sigs/lws](https://github.com/kubernetes-sigs/lws) — Kubernetes SIG workload controller for multi-node Leader/Worker patterns, required for multi-node LLM inference.
+- **Envoy Gateway**: [https://gateway.envoyproxy.io/](https://gateway.envoyproxy.io/) — Kubernetes-native gateway built on Envoy Proxy, providing the underlying traffic management for KServe inference services.
+- **Envoy AI Gateway**: [https://aigateway.envoyproxy.io/](https://aigateway.envoyproxy.io/) — AI-specific gateway capabilities layered on top of Envoy Gateway, including AI-aware routing and per-model policies.
+- **Gateway API Inference Extension (GIE)**: [https://gateway-api-inference-extension.sigs.k8s.io/](https://gateway-api-inference-extension.sigs.k8s.io/) — Kubernetes SIG project providing AI-aware routing and load balancing for inference services.
diff --git a/docs/en/kubeflow/index.mdx b/docs/en/kubeflow/index.mdx
index dae986a..fd4b38d 100644
--- a/docs/en/kubeflow/index.mdx
+++ b/docs/en/kubeflow/index.mdx
@@ -1,5 +1,5 @@
---
-weight: 61
+weight: 120
---
# Alauda support for Kubeflow
diff --git a/docs/en/kueue/index.mdx b/docs/en/kueue/index.mdx
index fbd9406..37983a9 100644
--- a/docs/en/kueue/index.mdx
+++ b/docs/en/kueue/index.mdx
@@ -1,5 +1,5 @@
---
-weight: 82
+weight: 92
---
# Alauda Build of Kueue
diff --git a/docs/en/kueue/install.mdx b/docs/en/kueue/install.mdx
index f1a1a5c..aff3e3c 100644
--- a/docs/en/kueue/install.mdx
+++ b/docs/en/kueue/install.mdx
@@ -2,7 +2,7 @@
weight: 20
---
-# Install
+# Install Kueue
## Downloading Cluster plugin
diff --git a/docs/en/llama_stack/index.mdx b/docs/en/llama_stack/index.mdx
index 9e4afeb..483f684 100644
--- a/docs/en/llama_stack/index.mdx
+++ b/docs/en/llama_stack/index.mdx
@@ -1,5 +1,5 @@
---
-weight: 83
+weight: 98
---
# Alauda Build of Llama Stack
diff --git a/docs/en/lws/index.mdx b/docs/en/lws/index.mdx
index 873f482..fc78b27 100644
--- a/docs/en/lws/index.mdx
+++ b/docs/en/lws/index.mdx
@@ -1,5 +1,5 @@
---
-weight: 90
+weight: 100
---
# Alauda Build of LeaderWorkerSet
diff --git a/docs/en/lws/install.mdx b/docs/en/lws/install.mdx
index c4dd5a2..2e7cb61 100644
--- a/docs/en/lws/install.mdx
+++ b/docs/en/lws/install.mdx
@@ -2,7 +2,7 @@
weight: 20
---
-# Install
+# Install LeaderWorkerSet
## Downloading Cluster plugin
diff --git a/docs/en/lws/intro.mdx b/docs/en/lws/intro.mdx
new file mode 100644
index 0000000..4813776
--- /dev/null
+++ b/docs/en/lws/intro.mdx
@@ -0,0 +1,29 @@
+---
+weight: 10
+---
+
+# Introduction
+
+## LeaderWorkerSet
+
+**Alauda Build of LeaderWorkerSet** is based on the [LeaderWorkerSet (LWS)](https://github.com/kubernetes-sigs/lws) Kubernetes SIG project.
+LeaderWorkerSet provides a Kubernetes-native workload API for deploying groups of pods in a **Leader/Worker** pattern, enabling multi-node distributed workloads — particularly large AI model training and inference — to run as first-class citizens on Kubernetes.
+
+Main components and capabilities include:
+
+- **LeaderWorkerSet CRD**: The core API resource that defines a group of replicated Leader/Worker pod sets. Each replica consists of one leader pod and a configurable number of worker pods, co-scheduled and managed as a unit.
+- **Co-scheduling & Topology Awareness**: Leader and worker pods within a group are scheduled together, with support for topology spread constraints to co-locate pods on the same node, rack, or availability zone for low-latency inter-node communication (e.g., NVLink, InfiniBand).
+- **Multi-node LLM Inference**: Enables large language models that exceed single-node GPU memory (e.g., Llama 3.1 405B) to be served across multiple nodes using tensor parallelism or pipeline parallelism. LWS is a required dependency of **Alauda Build of KServe** for this use case.
+- **Multi-node Training**: Supports distributed training frameworks (PyTorch DDP, DeepSpeed, Megatron-LM) by providing stable, co-located leader/worker pod groups with predictable hostnames and network identities.
+- **Rolling Updates & Failure Recovery**: Supports rolling restarts and automatic pod replacement at the group level, ensuring the entire Leader/Worker group is recycled consistently when a failure or update occurs.
+- **Startup Sequencing**: The leader pod can act as the entry point and coordinator, with worker pods starting after the leader is ready — enabling frameworks that require a master process to be initialized before workers connect.
+
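+As an illustrative sketch (not an Alauda-specific manifest), a minimal `LeaderWorkerSet` that deploys two groups of one leader and three workers each might look like the following; the name, image, and sizes are placeholders:
+
+```yaml
+apiVersion: leaderworkerset.x-k8s.io/v1
+kind: LeaderWorkerSet
+metadata:
+  name: example-lws
+spec:
+  replicas: 2                  # number of leader/worker groups
+  leaderWorkerTemplate:
+    size: 4                    # pods per group: 1 leader + 3 workers
+    workerTemplate:
+      spec:
+        containers:
+        - name: worker
+          image: nginx:1.25    # placeholder; use your training/inference image
+```
+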
+For installation on the platform, see [Install LeaderWorkerSet](./install).
+
+## Documentation
+
+LeaderWorkerSet upstream documentation and related resources:
+
+- **LeaderWorkerSet Documentation**: [https://lws.sigs.k8s.io/](https://lws.sigs.k8s.io/) — Official documentation covering concepts, API reference, and usage guides.
+- **LeaderWorkerSet GitHub**: [https://github.com/kubernetes-sigs/lws](https://github.com/kubernetes-sigs/lws) — Source code, releases, issue tracker, and examples for the LeaderWorkerSet Kubernetes SIG project.
+- **KServe (Alauda Build)**: [../kserve/intro](../kserve/intro) — KServe uses LeaderWorkerSet as a required dependency for multi-node LLM inference workloads.
diff --git a/docs/en/trustyai/index.mdx b/docs/en/trustyai/index.mdx
index f9fdd99..359407d 100644
--- a/docs/en/trustyai/index.mdx
+++ b/docs/en/trustyai/index.mdx
@@ -1,5 +1,5 @@
---
-weight: 95
+weight: 110
---
# Alauda Build of TrustyAI
diff --git a/docs/en/upgrade/migrating-to-knative-operator.mdx b/docs/en/upgrade/migrating-to-knative-operator.mdx
new file mode 100644
index 0000000..77332f1
--- /dev/null
+++ b/docs/en/upgrade/migrating-to-knative-operator.mdx
@@ -0,0 +1,43 @@
+---
+weight: 20
+---
+
+# Migrating to Knative Operator
+
+In the 1.x series of products, the serverless capability for inference services was provided by the `Alauda AI Model Serving` operator. In the 2.x series, this capability is provided by the `Knative Operator`. This section guides you through migrating your serverless capability from the legacy operator to the new one.
+
+## 1. Remove Legacy Serving Instance
+
+
+
+### Procedure
+
+In **Administrator** view:
+
+1. Click **Marketplace / OperatorHub**.
+2. At the top of the console, from the **Cluster** dropdown list, select the destination cluster where **Alauda AI** is installed.
+3. Select **Alauda AI**, then click the **All Instances** tab.
+4. Locate the `default` instance and click **Update**.
+5. On the update page, switch to the **YAML** view.
+6. Set `spec.knativeServing.managementState` to `Removed`, for example:
+
+ ```yaml
+ spec:
+ knativeServing:
+ managementState: Removed
+ ```
+
+7. Click **Update** to apply the changes.
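+
+As an alternative to editing the YAML in the console, the same change can be applied from the command line. This is a sketch that assumes the `default` instance is backed by the `AmlCluster` resource of the same name (the resource used later in this guide for status checks):
+
+```bash
+kubectl patch amlcluster default --type merge \
+  -p '{"spec":{"knativeServing":{"managementState":"Removed"}}}'
+```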
+
+
+
+## 2. Install Knative Operator and Create Serving Instance
+
+Install the **Knative Operator** from the Marketplace and create the `KnativeServing` instance. For detailed instructions, refer to the [Enabling Knative Functionality](../installation/ai-cluster.mdx#enabling-knative-functionality) section.
+
+:::info
+Once the above steps are finished, the migration of the Knative serving control plane is complete.
+
+- If you are migrating from the **Alauda AI 2.0** + **Alauda AI Model Serving** combination, the migration is fully complete at this point; business services will automatically switch to the new configuration shortly.
+- If you are migrating from the **Alauda AI 1.x** + **Alauda AI Model Serving** combination, make sure that **Alauda AI** is also upgraded to version **2.x**.
+:::
diff --git a/docs/en/upgrade/upgrade-from-previous-version.mdx b/docs/en/upgrade/upgrade-from-previous-version.mdx
index 0fdd35b..aa9b8de 100644
--- a/docs/en/upgrade/upgrade-from-previous-version.mdx
+++ b/docs/en/upgrade/upgrade-from-previous-version.mdx
@@ -3,7 +3,7 @@ weight: 10
---
export const prevVersion = '1.5'
-export const curVer = '2.0'
+export const curVer = '2.2'
# Upgrade Alauda AI
@@ -16,12 +16,13 @@ Upgrade from {prevVersion} to {curVer}
Please visit [Alauda AI Cluster](../installation/ai-cluster.mdx) for:
:::warning
-Please ignore `Creating Alauda AI Cluster Instance` since we are upgrading **Alauda AI** from a previously managed version.
+Please ignore `Creating Alauda AI Instance` since we are upgrading **Alauda AI** from a previously managed version.
:::
-1. [Downloading](../installation/ai-cluster.mdx#downloading) operator bundle packages for `Alauda AI Cluster` and `KServeless`.
-2. [Uploading](../installation/ai-cluster.mdx#uploading) operator bundle packages to the destination cluster.
-3. To upgrade, follow the process described below.
+1. [Downloading](../installation/ai-cluster.mdx#downloading) operator bundle packages for `Alauda AI` and `Knative Operator` (Optional).
+2. [Downloading](../kserve/install.mdx#upload-operator) operator bundle packages for `Alauda Build of KServe`.
+3. [Uploading](../installation/ai-cluster.mdx#uploading) operator bundle packages to the destination cluster.
+4. To upgrade, follow the process described below.
## Pre-Upgrade Operations
@@ -76,19 +77,25 @@ After the upgrade is complete, please confirm that the status of **Alauda AI Ess
### Upgrading Alauda AI Operators
-The procedure for upgrading both operators is nearly identical, with only the target component being different.
+Follow the steps below to upgrade the **Alauda AI** operator.
-| Step | Alauda AI Operator | Alauda AI Model Serving Operator |
-|:----------------|:--------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|
-| **1. Navigate** | Log into the Web Console, then go to **Marketplace > OperatorHub** in the **Administrator** view. | Log into the Web Console, then go to **Marketplace > OperatorHub** in the **Administrator** view. |
-| **2. Select** | Select your target **cluster**. | Select your target **cluster**. |
-| **3. Click** | Click the **Alauda AI** card. | Click the **Alauda AI Model Serving** card. |
-| **4. Confirm** | Click **Confirm** on the upgrade prompt. | Click **Confirm** on the upgrade prompt. |
+| Step | Alauda AI Operator |
+|:----------------|:--------------------------------------------------------------------------------------------------|
+| **1. Navigate** | Log into the Web Console, then go to **Marketplace > OperatorHub** in the **Administrator** view. |
+| **2. Select** | Select your target **cluster**. |
+| **3. Click** | Click the **Alauda AI** card. |
+| **4. Confirm** | Click **Confirm** on the upgrade prompt. |
:::info
Once the new version is uploaded and recognized by the platform, an upgrade prompt will appear at the top of the operator's page.
:::
+### Installing Alauda Build of KServe Operator
+
+Starting from version {curVer}, **Alauda Build of KServe** is provided as a separate operator plugin to offer more specialized and flexible model serving capabilities. After completing the core AI operator upgrades, you must install the KServe operator to enable model serving functionality.
+
+For detailed installation and configuration steps, please refer to the [Alauda Build of KServe Installation Guide](../kserve/install.mdx).
+
### Upgrading Cluster Plugins
:::info
@@ -178,7 +185,7 @@ For each existing inference service, perform the following steps:
### Alauda AI
-Check the status field from the `AmlCluster` resource which named `default`:
+Check the status field from the `AmlCluster` resource named `default`:
```bash
kubectl get amlcluster default
@@ -191,22 +198,22 @@ NAME READY REASON
default True Succeeded
```
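
If you prefer to block until the instance is ready (for example in an upgrade script), `kubectl wait` can be used; this assumes the `Ready` condition reflected in the output above:

```bash
kubectl wait amlcluster/default --for=condition=Ready --timeout=300s
```
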
-### Alauda AI Model Serving
+### Alauda Build of KServe
-Check the status field from the `KnativeServing` resource which named `default-knative-serving`:
+Check the status field from the `KServe` resource named `default-kserve`:
```bash
-kubectl get KnativeServing.components.aml.dev default-knative-serving
+kubectl get kserve default-kserve -n kserve-operator
```
-Should returns `InstallSuccessful`:
+Should return `DEPLOYED: True`:
```
-NAME DEPLOYED REASON
-default-knative-serving True UpgradeSuccessful
+NAME DEPLOYED REASON
+default-kserve True UpgradeSuccessful
```
-### Alauda AI Cluster Plugins
+### Other Cluster Plugins
In the **Administrator** view, navigate to **Marketplace > Cluster Plugins** and confirm that the following cluster plugins show `Installed` status with the new version:
@@ -215,3 +222,9 @@ In the **Administrator** view, navigate to **Marketplace > Cluster Plugins** and
- Alauda AI Volcano (if deployed)
+
+## Deprecating Alauda AI Model Serving
+
+Starting from the **Alauda AI 2.x** series, the legacy **Alauda AI Model Serving** operator is deprecated. We strongly recommend that users requiring serverless inference capabilities switch to the **Knative Operator** as soon as possible to ensure long-term support and access to the latest features.
+
+For guidance on how to move your serverless workloads to the new operator, please see the [Migrating to Knative Operator](./migrating-to-knative-operator.mdx) guide.