-
Notifications
You must be signed in to change notification settings - Fork 0
split plugin Alauda Build of Kserve #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
fyuan1316
wants to merge
1
commit into
master
Choose a base branch
from
upgrade-build-of-kserve
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| --- | ||
| weight: 115 | ||
| --- | ||
| # Alauda Build of Envoy AI Gateway | ||
|
|
||
| <Overview /> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| --- | ||
| weight: 20 | ||
| --- | ||
|
|
||
| # Install Envoy AI Gateway | ||
|
|
||
| ## Downloading Cluster Plugin | ||
|
|
||
| :::info | ||
|
|
||
| `Alauda Build of Envoy AI Gateway` cluster plugin can be retrieved from Customer Portal. | ||
|
|
||
| Please contact Consumer Support for more information. | ||
|
|
||
| ::: | ||
|
|
||
| ## Uploading the Cluster Plugin | ||
|
|
||
| For more information on uploading the cluster plugin, please refer to <ExternalSiteLink name="acp" href="ui/cli_tools/index.html#uploading-cluster-plugins" children="Uploading Cluster Plugins" /> | ||
|
|
||
| ## Installing Alauda Build of Envoy AI Gateway | ||
|
|
||
| 1. Go to the `Administrator` -> `Marketplace` -> `Cluster Plugin` page, switch to the target cluster, and then deploy the `Alauda Build of Envoy AI Gateway` Cluster plugin. | ||
| :::info | ||
| **Note: Deploy form parameters can be kept as default or modified after knowing how to use them.** | ||
| ::: | ||
|
|
||
| 2. Verify result. You can see the status of "Installed" in the UI or you can check the pod status: | ||
| ```bash | ||
| kubectl get pods -n envoy-gateway-system | grep "ai-gateway" | ||
| ``` | ||
|
|
||
| ## Upgrading Alauda Build of Envoy AI Gateway | ||
|
|
||
| 1. Upload the new version for package of **Alauda Build of Envoy AI Gateway** plugin to ACP. | ||
| 2. Go to the `Administrator` -> `Clusters` -> `Target Cluster` -> `Functional Components` page, then click the `Upgrade` button to upgrade **Alauda Build of Envoy AI Gateway** to the new version. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| --- | ||
| weight: 10 | ||
| --- | ||
|
|
||
| # Introduction | ||
|
|
||
| ## Envoy AI Gateway | ||
|
|
||
| **Alauda Build of Envoy AI Gateway** is based on the [Envoy AI Gateway](https://aigateway.envoyproxy.io/) project. | ||
| Envoy AI Gateway is a Kubernetes-native, AI-specific gateway layer built on top of [Envoy Gateway](https://gateway.envoyproxy.io/), providing intelligent traffic management, routing, and policy enforcement for AI inference workloads. | ||
|
|
||
| Main components and capabilities include: | ||
|
|
||
| - **AI-Aware Routing**: Routes inference requests to the appropriate backend model service based on request content, model name, and backend availability — enabling transparent multi-model serving behind a single endpoint. | ||
| - **OpenAI-Compatible API**: Exposes a unified, OpenAI-compatible API surface (`/v1/chat/completions`, `/v1/completions`, `/v1/models`) for all downstream inference services, regardless of the underlying runtime. | ||
| - **Per-Model Rate Limiting & Policies**: Enforces fine-grained rate limiting, token quotas, and traffic policies at the individual model level, preventing resource starvation and ensuring fair usage across tenants. | ||
| - **Backend Load Balancing**: Distributes inference requests across multiple replicas of the same model using configurable load-balancing strategies, with health checking and automatic failover. | ||
| - **Envoy Gateway Integration**: Runs as an extension of Envoy Gateway, inheriting its Kubernetes Gateway API-native control plane, TLS termination, and observability features (metrics, access logs, distributed tracing). | ||
| - **Gateway API Inference Extension (GIE)**: Integrates with the Kubernetes SIG Gateway API Inference Extension for advanced, inference-aware scheduling and load balancing decisions based on real-time backend state. | ||
|
|
||
| Envoy AI Gateway is a required dependency of **Alauda Build of KServe** for exposing inference services. | ||
|
|
||
| For installation on the platform, see [Install Envoy AI Gateway](./install). | ||
|
|
||
| ## Documentation | ||
|
|
||
| Envoy AI Gateway upstream documentation and related resources: | ||
|
|
||
| - **Envoy AI Gateway Documentation**: [https://aigateway.envoyproxy.io/](https://aigateway.envoyproxy.io/) — Official documentation covering architecture, configuration, and API references. | ||
| - **Envoy AI Gateway GitHub**: [https://github.com/envoyproxy/ai-gateway](https://github.com/envoyproxy/ai-gateway) — Source code, release notes, and issues. | ||
| - **Envoy Gateway**: [https://gateway.envoyproxy.io/](https://gateway.envoyproxy.io/) — The underlying gateway infrastructure that Envoy AI Gateway extends. | ||
| - **Gateway API Inference Extension (GIE)**: [https://gateway-api-inference-extension.sigs.k8s.io/](https://gateway-api-inference-extension.sigs.k8s.io/) — Kubernetes SIG project for AI-aware routing integrated with Envoy AI Gateway. | ||
| - **KServe (Alauda Build)**: [../kserve/intro](../kserve/intro) — KServe uses Envoy AI Gateway as a required dependency for exposing and routing inference services. | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| weight: 95 | ||
| --- | ||
|
|
||
| # Alauda Build of KServe | ||
|
|
||
| <Overview /> |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
Repository: alauda/aml-docs
Length of output: 128
🏁 Script executed:
Repository: alauda/aml-docs
Length of output: 3854
Fix internal link format to match repository convention.
The link on line 33
../kserve/introshould include the.mdxextension as../kserve/intro.mdx— all other cross-directory links in the documentation consistently use this format (e.g.,../installation/ai-cluster.mdx,../kserve/install.mdx).Otherwise, the documentation is well-structured, clearly explains Envoy AI Gateway's purpose and capabilities, and provides good upstream context.
🧰 Tools
🪛 LanguageTool
[style] ~30-~30: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... configuration, and API references. - Envoy AI Gateway GitHub: [https://github.co...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~31-~31: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ce code, release notes, and issues. - Envoy Gateway: [https://gateway.envoyproxy....
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🤖 Prompt for AI Agents