:::info
The `Alauda Build of Envoy AI Gateway` cluster plugin can be retrieved from the Customer Portal.
Please contact Customer Support for more information.
:::
## Uploading the Cluster Plugin
For more information on uploading the cluster plugin, please refer to <ExternalSiteLink name="acp" href="ui/cli_tools/index.html#uploading-cluster-plugins" children="Uploading Cluster Plugins" />.
## Installing Alauda Build of Envoy AI Gateway
1. Go to the `Administrator` -> `Marketplace` -> `Cluster Plugin` page, switch to the target cluster, and then deploy the `Alauda Build of Envoy AI Gateway` cluster plugin.
:::info
**Note: The deploy form parameters can be left at their defaults, or adjusted once you understand their effects.**
:::
2. Verify the result. The plugin shows an `Installed` status in the UI; alternatively, check the pod status:
```bash
kubectl get pods -n envoy-gateway-system | grep "ai-gateway"
```
## Upgrading Alauda Build of Envoy AI Gateway
1. Upload the new version of the **Alauda Build of Envoy AI Gateway** plugin package to ACP.
2. Go to the `Administrator` -> `Clusters` -> `Target Cluster` -> `Functional Components` page, then click the `Upgrade` button to upgrade **Alauda Build of Envoy AI Gateway** to the new version.
**Alauda Build of Envoy AI Gateway** is based on the [Envoy AI Gateway](https://aigateway.envoyproxy.io/) project.
Envoy AI Gateway is a Kubernetes-native, AI-specific gateway layer built on top of [Envoy Gateway](https://gateway.envoyproxy.io/), providing intelligent traffic management, routing, and policy enforcement for AI inference workloads.
Main components and capabilities include:
- **AI-Aware Routing**: Routes inference requests to the appropriate backend model service based on request content, model name, and backend availability — enabling transparent multi-model serving behind a single endpoint.
- **OpenAI-Compatible API**: Exposes a unified, OpenAI-compatible API surface (`/v1/chat/completions`, `/v1/completions`, `/v1/models`) for all downstream inference services, regardless of the underlying runtime.
- **Per-Model Rate Limiting & Policies**: Enforces fine-grained rate limiting, token quotas, and traffic policies at the individual model level, preventing resource starvation and ensuring fair usage across tenants.
- **Backend Load Balancing**: Distributes inference requests across multiple replicas of the same model using configurable load-balancing strategies, with health checking and automatic failover.
- **Envoy Gateway Integration**: Runs as an extension of Envoy Gateway, inheriting its Kubernetes Gateway API-native control plane, TLS termination, and observability features (metrics, access logs, distributed tracing).
- **Gateway API Inference Extension (GIE)**: Integrates with the Kubernetes SIG Gateway API Inference Extension for advanced, inference-aware scheduling and load balancing decisions based on real-time backend state.
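Because every backend sits behind the same OpenAI-compatible surface, a client needs only the gateway address and a model name. As a sketch (the endpoint and model name below are hypothetical placeholders, not values from this platform), the request shape looks like:

```python
import json

# Placeholder values -- substitute your gateway's external address and a
# model name that is actually served behind it.
GATEWAY_URL = "http://ai-gateway.example.com"
MODEL = "my-model"

# The gateway exposes the standard OpenAI chat-completions path; routing to a
# backend is driven by the "model" field in the request body.
endpoint = f"{GATEWAY_URL}/v1/chat/completions"
body = json.dumps(
    {
        "model": MODEL,
        "messages": [{"role": "user", "content": "Hello"}],
    }
)
print(endpoint)  # http://ai-gateway.example.com/v1/chat/completions
```

The same body could be sent with `curl` or any OpenAI-compatible client SDK pointed at the gateway address.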
Envoy AI Gateway is a required dependency of **Alauda Build of KServe** for exposing inference services.
For installation on the platform, see [Install Envoy AI Gateway](./install).
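To make the AI-aware routing model concrete, below is a minimal, illustrative sketch of an `AIGatewayRoute` resource. This is not taken from this platform's documentation: field names follow the upstream `v1alpha1` API and may differ between releases, and every name and value is a placeholder, so consult the upstream Envoy AI Gateway reference before applying anything similar.

```yaml
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: demo-route                  # placeholder name
  namespace: default
spec:
  parentRefs:
    - name: demo-gateway            # placeholder: a Gateway managed by Envoy Gateway
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model   # model name the gateway extracts from the request body
              value: demo-model     # placeholder model name
      backendRefs:
        - name: demo-backend        # placeholder: a backend serving this model
```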
## Documentation
Envoy AI Gateway upstream documentation and related resources:
- **Envoy AI Gateway Documentation**: [https://aigateway.envoyproxy.io/](https://aigateway.envoyproxy.io/) — Official documentation covering architecture, configuration, and API references.
- **Envoy AI Gateway GitHub**: [https://github.com/envoyproxy/ai-gateway](https://github.com/envoyproxy/ai-gateway) — Source code, release notes, and issues.
- **Envoy Gateway**: [https://gateway.envoyproxy.io/](https://gateway.envoyproxy.io/) — The underlying gateway infrastructure that Envoy AI Gateway extends.
- **Gateway API Inference Extension (GIE)**: [https://gateway-api-inference-extension.sigs.k8s.io/](https://gateway-api-inference-extension.sigs.k8s.io/) — Kubernetes SIG project for AI-aware routing integrated with Envoy AI Gateway.
- **KServe (Alauda Build)**: [../kserve/intro](../kserve/intro) — KServe uses Envoy AI Gateway as a required dependency for exposing and routing inference services.
The following section is from `docs/en/installation/ai-cluster.mdx`.
Confirm that the **Alauda AI** tile shows one of the following states:
</Steps>
## Installing Alauda Build of KServe Operator
For detailed installation steps, see [Install KServe](../kserve/install.mdx) in Alauda Build of KServe.
## Enabling Knative Functionality
Knative functionality is an optional capability that requires an additional operator and instance to be deployed.
Once **Knative Operator** is installed, you need to create the `KnativeServing` instance:
6. Replace the content with the following YAML:
7. Click **Create**.
```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
# ... (intermediate fields omitted in this excerpt)
kourier:
enabled: true
```
:::warning
- For ACP 4.0, use version **1.18.1**
- For ACP 4.1 and above, use version **1.19.6**
:::
<Callouts>
1. Specify the version of Knative Serving to be deployed.
2. `private-registry` is a placeholder for your private registry address. You can find this in the **Administrator** view, then click **Clusters**, select `your cluster`, and check the **Private Registry** value in the **Basic Info** section.
Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx).
## Migrating to Knative Operator
In the 1.x series of products, the serverless capability for inference services was provided by the `Alauda AI Model Serving` operator. In the 2.x series, this capability is provided by the `Knative Operator`. This section guides you through migrating your serverless capability from the legacy operator to the new one.
### 1. Remove Legacy Serving Instance
<Steps>
#### Procedure
In **Administrator** view:
1. Click **Marketplace / OperatorHub**.
2. At the top of the console, from the **Cluster** dropdown list, select the destination cluster where **Alauda AI** is installed.
3. Select **Alauda AI**, then click the **All Instances** tab.
4. Locate the `default` instance and click **Update**.
5. In the update form, locate the **Serverless Configuration** section.
6. Set **BuiltIn Knative Serving** to `Removed`.
7. Click **Update** to apply the changes.
</Steps>
### 2. Install Knative Operator and Create Serving Instance
Install the **Knative Operator** from the Marketplace and create the `KnativeServing` instance. For detailed instructions, refer to the [Enabling Knative Functionality](#enabling-knative-functionality) section.
:::info
Once the above steps are completed, the migration of the Knative serving control plane is complete.
- If you are migrating from the **Alauda AI 2.0** + **Alauda AI Model Serving** combination, the migration is fully complete here. Business services will automatically switch their configuration shortly.
- If you are migrating from the **Alauda AI 1.x** + **Alauda AI Model Serving** combination, please ensure that **Alauda AI** is simultaneously upgraded to version **2.x**.
:::
## Replacing the GitLab Service After Installation
If you want to replace the GitLab service after installation, follow these steps:
1. **Reconfigure GitLab Service**
Refer to the [Pre-installation Configuration](./pre-configuration.mdx) and re-execute its steps.
2. **Update Alauda AI Instance**
- In Administrator view, navigate to **Marketplace > OperatorHub**
- From the **Cluster** dropdown, select the target cluster
- Choose **Alauda AI** and click the **All Instances** tab
- Locate the **'default'** instance and click **Update**
3. **Modify GitLab Configuration**
In the **Update default** form:
- Locate the **GitLab** section
- Enter:
- **Base URL**: The URL of your new GitLab instance