6 changes: 6 additions & 0 deletions docs/en/envoy_ai_gateway/index.mdx
@@ -0,0 +1,6 @@
---
weight: 115
---
# Alauda Build of Envoy AI Gateway

<Overview />
36 changes: 36 additions & 0 deletions docs/en/envoy_ai_gateway/install.mdx
@@ -0,0 +1,36 @@
---
weight: 20
---

# Install Envoy AI Gateway

## Downloading Cluster Plugin

:::info

The `Alauda Build of Envoy AI Gateway` cluster plugin can be retrieved from the Customer Portal.

Please contact Customer Support for more information.

:::

## Uploading the Cluster Plugin

For more information on uploading the cluster plugin, please refer to <ExternalSiteLink name="acp" href="ui/cli_tools/index.html#uploading-cluster-plugins" children="Uploading Cluster Plugins" />.

## Installing Alauda Build of Envoy AI Gateway

1. Go to the `Administrator` -> `Marketplace` -> `Cluster Plugin` page, switch to the target cluster, and then deploy the `Alauda Build of Envoy AI Gateway` Cluster plugin.
:::info
**Note: The deploy form parameters can be kept at their defaults, or modified once you understand their usage.**
:::

2. Verify the result. The plugin shows a status of `Installed` in the UI, or you can check the pod status:
```bash
kubectl get pods -n envoy-gateway-system | grep "ai-gateway"
```
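If you want to script that check, the plain-text `kubectl` output can be parsed for readiness. A minimal sketch in Python (the `ai-gateway` name prefix and the standard `kubectl get pods` column layout are the only assumptions; the sample output is illustrative and real pod names will differ):

```python
def unready_pods(kubectl_output: str, name_filter: str = "ai-gateway") -> list[str]:
    """Return names of matching pods whose READY column is not n/n or whose STATUS is not Running."""
    unready = []
    for line in kubectl_output.strip().splitlines():
        parts = line.split()
        if len(parts) < 3 or name_filter not in parts[0]:
            continue  # skip the header row and unrelated pods
        name, ready, status = parts[0], parts[1], parts[2]
        current, desired = ready.split("/")
        if current != desired or status != "Running":
            unready.append(name)
    return unready

# Illustrative output shape only; actual pod names are generated by the deployment.
sample = """\
NAME                                     READY   STATUS    RESTARTS   AGE
ai-gateway-controller-6d9f8b7c5d-x2k4q   1/1     Running   0          2m
ai-gateway-extproc-7c8d9e6f4b-p9s3t      0/1     Pending   0          2m
"""
print(unready_pods(sample))  # -> ['ai-gateway-extproc-7c8d9e6f4b-p9s3t']
```

An empty list means every matching pod is ready.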

## Upgrading Alauda Build of Envoy AI Gateway

1. Upload the new version of the **Alauda Build of Envoy AI Gateway** plugin package to ACP.
2. Go to the `Administrator` -> `Clusters` -> `Target Cluster` -> `Functional Components` page, then click the `Upgrade` button to upgrade **Alauda Build of Envoy AI Gateway** to the new version.
33 changes: 33 additions & 0 deletions docs/en/envoy_ai_gateway/intro.mdx
@@ -0,0 +1,33 @@
---
weight: 10
---

# Introduction

## Envoy AI Gateway

**Alauda Build of Envoy AI Gateway** is based on the [Envoy AI Gateway](https://aigateway.envoyproxy.io/) project.
Envoy AI Gateway is a Kubernetes-native, AI-specific gateway layer built on top of [Envoy Gateway](https://gateway.envoyproxy.io/), providing intelligent traffic management, routing, and policy enforcement for AI inference workloads.

Main components and capabilities include:

- **AI-Aware Routing**: Routes inference requests to the appropriate backend model service based on request content, model name, and backend availability — enabling transparent multi-model serving behind a single endpoint.
- **OpenAI-Compatible API**: Exposes a unified, OpenAI-compatible API surface (`/v1/chat/completions`, `/v1/completions`, `/v1/models`) for all downstream inference services, regardless of the underlying runtime.
- **Per-Model Rate Limiting & Policies**: Enforces fine-grained rate limiting, token quotas, and traffic policies at the individual model level, preventing resource starvation and ensuring fair usage across tenants.
- **Backend Load Balancing**: Distributes inference requests across multiple replicas of the same model using configurable load-balancing strategies, with health checking and automatic failover.
- **Envoy Gateway Integration**: Runs as an extension of Envoy Gateway, inheriting its Kubernetes Gateway API-native control plane, TLS termination, and observability features (metrics, access logs, distributed tracing).
- **Gateway API Inference Extension (GIE)**: Integrates with the Kubernetes SIG Gateway API Inference Extension for advanced, inference-aware scheduling and load balancing decisions based on real-time backend state.
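To make the per-model rate limiting idea above concrete, here is an illustrative token-bucket sketch keyed by model name. This is not Envoy AI Gateway's actual implementation, and the model names and limits are invented:

```python
import time

class PerModelRateLimiter:
    """Token bucket per model: each model refills at its own rate up to a burst capacity."""

    def __init__(self, limits: dict[str, tuple[float, float]]):
        # limits: model name -> (tokens_per_second, burst_capacity)
        self.limits = limits
        self.state = {m: (cap, time.monotonic()) for m, (_, cap) in limits.items()}

    def allow(self, model: str, cost: float = 1.0) -> bool:
        if model not in self.limits:
            return False  # unknown model: reject rather than let it starve known tenants
        rate, cap = self.limits[model]
        tokens, last = self.state[model]
        now = time.monotonic()
        tokens = min(cap, tokens + (now - last) * rate)  # refill since last check
        if tokens >= cost:
            self.state[model] = (tokens - cost, now)
            return True
        self.state[model] = (tokens, now)
        return False

limiter = PerModelRateLimiter({"llama-3-8b": (5.0, 2.0)})  # hypothetical model and limits
print(limiter.allow("llama-3-8b"))   # True: burst capacity is available
print(limiter.allow("unknown-model"))  # False: model not configured
```

In the real gateway these decisions are enforced in the data plane per request; the sketch only shows the bookkeeping that makes "fair usage across tenants" possible.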

Envoy AI Gateway is a required dependency of **Alauda Build of KServe** for exposing inference services.

For installation on the platform, see [Install Envoy AI Gateway](./install).
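Since the API surface is OpenAI-compatible, any OpenAI-style HTTP client can target the gateway. The sketch below only constructs the request; the gateway URL and model name are placeholders, not real endpoints:

```python
import json
from urllib.request import Request

def chat_completion_request(base_url: str, model: str, prompt: str) -> Request:
    """Build an OpenAI-style /v1/chat/completions request; the gateway routes on the model field."""
    body = {
        "model": model,  # the gateway uses this to select the backend model service
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder host and model; substitute the address your Gateway resource exposes.
req = chat_completion_request("http://ai-gateway.example.internal", "my-model", "Hello")
print(req.full_url)  # http://ai-gateway.example.internal/v1/chat/completions
```

Because the same body shape works for every backend, swapping the `model` field is all a client needs to do to reach a different model behind the single endpoint.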

## Documentation

Envoy AI Gateway upstream documentation and related resources:

- **Envoy AI Gateway Documentation**: [https://aigateway.envoyproxy.io/](https://aigateway.envoyproxy.io/) — Official documentation covering architecture, configuration, and API references.
- **Envoy AI Gateway GitHub**: [https://github.com/envoyproxy/ai-gateway](https://github.com/envoyproxy/ai-gateway) — Source code, release notes, and issues.
- **Envoy Gateway**: [https://gateway.envoyproxy.io/](https://gateway.envoyproxy.io/) — The underlying gateway infrastructure that Envoy AI Gateway extends.
- **Gateway API Inference Extension (GIE)**: [https://gateway-api-inference-extension.sigs.k8s.io/](https://gateway-api-inference-extension.sigs.k8s.io/) — Kubernetes SIG project for AI-aware routing integrated with Envoy AI Gateway.
- **KServe (Alauda Build)**: [../kserve/intro](../kserve/intro) — KServe uses Envoy AI Gateway as a required dependency for exposing and routing inference services.
Comment on lines +1 to +33
**Fix internal link format to match repository convention.**

The link on line 33, `../kserve/intro`, should include the `.mdx` extension (`../kserve/intro.mdx`); all other cross-directory links in the documentation use this format (e.g., `../installation/ai-cluster.mdx`, `../kserve/install.mdx`).

Otherwise, the documentation is well-structured, clearly explains Envoy AI Gateway's purpose and capabilities, and provides good upstream context.

80 changes: 10 additions & 70 deletions docs/en/installation/ai-cluster.mdx
@@ -155,6 +155,10 @@ Confirm that the **Alauda AI** tile shows one of the following states:

</Steps>

## Installing Alauda Build of KServe Operator

For detailed installation steps, see [Install KServe](../kserve/install.mdx) in Alauda Build of KServe.

## Enabling Knative Functionality

Knative functionality is an optional capability that requires an additional operator and instance to be deployed.
@@ -220,6 +224,7 @@ Once **Knative Operator** is installed, you need to create the `KnativeServing`
6. Replace the content with the following YAML:
7. Click **Create**.


```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
@@ -254,10 +259,14 @@ Once **Knative Operator** is installed, you need to create the `KnativeServing`
kourier:
enabled: true
```
:::warning
- For ACP 4.0, use version **1.18.1**
- For ACP 4.1 and above, use version **1.19.6**
:::
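The version rule in the warning above is a simple conditional. A throwaway sketch of it, assuming ACP version strings of the form `major.minor`:

```python
def knative_version_for_acp(acp_version: str) -> str:
    """Map an ACP release to the Knative Serving version named in the warning above."""
    major, minor = (int(p) for p in acp_version.split(".")[:2])
    # ACP 4.0 pins Knative Serving 1.18.1; ACP 4.1 and above use 1.19.6.
    return "1.18.1" if (major, minor) == (4, 0) else "1.19.6"

print(knative_version_for_acp("4.0"))  # 1.18.1
print(knative_version_for_acp("4.1"))  # 1.19.6
```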

<Callouts>

1. For ACP 4.0, keep the version as "1.18.1". For ACP 4.1 and above, change the version to "1.19.6".
1. Specify the version of Knative Serving to be deployed.

2. `private-registry` is a placeholder for your private registry address. You can find this in the **Administrator** view, then click **Clusters**, select `your cluster`, and check the **Private Registry** value in the **Basic Info** section.

@@ -347,75 +356,6 @@ default True Succeeded

Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx).

## Migrating to Knative Operator

In the 1.x series of products, the serverless capability for inference services was provided by the `Alauda AI Model Serving` operator. In the 2.x series, this capability is provided by the `Knative Operator`. This section guides you through migrating your serverless capability from the legacy operator to the new one.

### 1. Remove Legacy Serving Instance

<Steps>

#### Procedure

In **Administrator** view:

1. Click **Marketplace / OperatorHub**.
2. At the top of the console, from the **Cluster** dropdown list, select the destination cluster where **Alauda AI** is installed.
3. Select **Alauda AI**, then click the **All Instances** tab.
4. Locate the `default` instance and click **Update**.
5. In the update form, locate the **Serverless Configuration** section.
6. Set **BuiltIn Knative Serving** to `Removed`.
7. Click **Update** to apply the changes.

</Steps>

### 2. Install Knative Operator and Create Serving Instance

Install the **Knative Operator** from the Marketplace and create the `KnativeServing` instance. For detailed instructions, refer to the [Enabling Knative Functionality](#enabling-knative-functionality) section.

:::info
Once the above steps are completed, the migration of the Knative serving control plane is complete.

- If you are migrating from the **Alauda AI 2.0** + **Alauda AI Model Serving** combination, the migration is fully complete here. Business services will automatically switch their configuration shortly.
- If you are migrating from the **Alauda AI 1.x** + **Alauda AI Model Serving** combination, please ensure that **Alauda AI** is simultaneously upgraded to version **2.x**.
:::

## Replace GitLab Service After Installation

If you want to replace GitLab Service after installation, follow these steps:

1. **Reconfigure GitLab Service**
Refer to the [Pre-installation Configuration](./pre-configuration.mdx) and re-execute its steps.

2. **Update Alauda AI Instance**
- In Administrator view, navigate to **Marketplace > OperatorHub**
- From the **Cluster** dropdown, select the target cluster
- Choose **Alauda AI** and click the **All Instances** tab
- Locate the **'default'** instance and click **Update**

3. **Modify GitLab Configuration**
In the **Update default** form:
- Locate the **GitLab** section
- Enter:
- **Base URL**: The URL of your new GitLab instance
- **Admin Token Secret Namespace**: `cpaas-system`
- **Admin Token Secret Name**: `aml-gitlab-admin-token`

4. **Restart Components**
Restart the `aml-controller` deployment in the `kubeflow` namespace.

5. **Refresh Platform Data**
In Alauda AI management view, re-manage all namespaces.
- In Alauda AI view, navigate to **Admin** view from **Business View**
- On the **Namespace Management** page, delete all existing managed namespaces
- Use "Managed Namespace" to add namespaces requiring Alauda AI integration
:::info
Original models won't migrate automatically
Continue using these models:
- Recreate and re-upload in new GitLab OR
- Manually transfer model files to new repository
:::

## FAQ

### 1. Configure the audit output directory for aml-skipper
104 changes: 0 additions & 104 deletions docs/en/installation/ai-generative.mdx

This file was deleted.

7 changes: 7 additions & 0 deletions docs/en/kserve/index.mdx
@@ -0,0 +1,7 @@
---
weight: 95
---

# Alauda Build of KServe

<Overview />