From 4446f0c100d43260f843459cae339c36da577ee8 Mon Sep 17 00:00:00 2001
From: Yuan Fang
Date: Sat, 25 Apr 2026 11:17:03 +0800
Subject: [PATCH 1/3] add oci-models docs

---
 docs/en/installation/ai-cluster.mdx | 106 ++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/docs/en/installation/ai-cluster.mdx b/docs/en/installation/ai-cluster.mdx
index 8734563b..8e2f2f03 100644
--- a/docs/en/installation/ai-cluster.mdx
+++ b/docs/en/installation/ai-cluster.mdx
@@ -354,6 +354,112 @@ default True Succeeded
 ```
 
+## Importing Built-in Model Images for Catalog \{#importing-built-in-model-images-for-catalog}
+
+The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into your cluster's registry before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`.
+
+### Obtaining the OCI image tarballs
+
+Built-in model images are delivered as OCI archive tarballs (`.tar` files compliant with the OCI Image Layout Specification). Each tarball contains a multi-architecture image (`linux/amd64` + `linux/arm64`) for one model.
+
+Download the tarballs from the Customer Portal Marketplace, or contact your Alauda support representative to obtain the package matching your Alauda AI version.
+
+### Pushing to your cluster registry
+
+The recommended approach uses `ctr` (the containerd CLI), which is available on every cluster node. Run the following on **any one node** of the target cluster — only that node needs network access to your registry; once pushed, the image is available cluster-wide.
+
+First, set the environment variables:
+
+```bash
+export REG=<registry-address>       # [!code callout]
+export REPO=<project>/<image-name>  # [!code callout]
+export AUTH=<user>:<password>       # [!code callout]
+export TAR=./Qwen3.5-0.8B.oci.tar   # [!code callout]
+```
+
+1. The cluster's image registry endpoint. You can find this value in the **Administrator** view, then click **Clusters**, select `your cluster`, and check the **Private Registry** value in the **Basic Info** section.
+2. Target repository path inside that registry, in the form `<project>/<image-name>`. For example, `mlops/modelcar-qwen3.5-0.8b` would push to project `mlops` with image name `modelcar-qwen3.5-0.8b`. The project must already exist in your registry and the credentials below must have write access to it.
+3. Registry credentials in the form `user:password`. **Contact your platform administrator if you do not have these.**
+4. Path to the OCI archive tarball obtained in the previous step.
+
+The tarball carries its own tag (e.g. `v2.3.0`) inside the OCI image layout. Extract it from the tar so the rest of the procedure does not depend on knowing it ahead of time:
+
+```bash
+export TAG=$(tar -xOf "$TAR" index.json \
+  | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')
+echo "$TAG"  # should print something like v2.3.0
+```
+
+Then run the push procedure:
+
+```bash
+# 1. Import into the node's containerd content store.
+#    --base-name prepends $REG/$REPO to the tag carried inside the tarball,
+#    producing a fully-qualified reference $REG/$REPO:$TAG.
+ctr -n k8s.io images import \
+  --all-platforms \
+  --base-name "$REG/$REPO" \
+  "$TAR"
+
+# 2. Verify the import. You should see "$REG/$REPO:$TAG".
+ctr -n k8s.io images ls -q | grep "$REPO"
+
+# 3. Push to the registry.
+#    --skip-verify : skip TLS verification (use when the registry has a private CA)
+#    --plain-http  : use HTTP instead of HTTPS (use for HTTP-only registries)
+#    --local       : resolve content from the local store (required for pushing
+#                    locally-imported multi-arch indexes)
+ctr -n k8s.io images push \
+  -u "$AUTH" \
+  --skip-verify \
+  --local \
+  "$REG/$REPO:$TAG"
+
+# 4. Clean up the local reference on the node.
Blob data is reclaimed by +# containerd's garbage collector, leaving no persistent state on the node. +ctr -n k8s.io images rm "$REG/$REPO:$TAG" +``` + +Repeat this procedure for each built-in model tarball, varying `$REPO` and `$TAR` per model. + +:::info +`--all-platforms` is critical at the **import** step: omitting it imports only the node's host architecture, and the subsequent push will silently miss the other platform's blobs. The flag is not needed on `push` — pushing the multi-arch index automatically pushes all platforms it references. +::: + +### Verifying the push + +Confirm that the registry now serves the image as a multi-architecture index: + +```bash +curl -sk -u "$AUTH" \ + -H 'Accept: application/vnd.oci.image.index.v1+json' \ + -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \ + "https://$REG/v2/$REPO/manifests/$TAG" \ + | jq '{mediaType, platforms: [.manifests[]?.platform]}' +``` + +Expected output: + +```json +{ + "mediaType": "application/vnd.oci.image.index.v1+json", + "platforms": [ + {"architecture": "amd64", "os": "linux"}, + {"architecture": "arm64", "os": "linux"} + ] +} +``` + +If `mediaType` is `application/vnd.oci.image.manifest.v1+json` instead and `platforms` contains only one entry, only one architecture was pushed. Re-run the import step with `--all-platforms` and push again. + + + Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx). 
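Repeating the per-model procedure by hand for every tarball is error-prone, so the import/push/cleanup steps can be wrapped in a function and driven from a list. This is a sketch, not part of the verified procedure: the function name and the model list are illustrative, `REG` and `AUTH` are expected in the environment as above, and the driver loop is guarded so that sourcing the file only defines the function.

```shell
#!/usr/bin/env bash
set -eu

# Sketch: batch-import built-in model tarballs (hypothetical wrapper around
# the steps above). Assumes REG and AUTH are already exported.
push_one() {
  local tar="$1" repo="$2" tag
  # The tag is carried inside the OCI layout, as shown earlier.
  tag=$(tar -xOf "$tar" index.json \
    | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')
  ctr -n k8s.io images import --all-platforms --base-name "$REG/$repo" "$tar"
  ctr -n k8s.io images push -u "$AUTH" --skip-verify --local "$REG/$repo:$tag"
  ctr -n k8s.io images rm "$REG/$repo:$tag"
}

# Illustrative driver: "<tarball> <project/name>" pairs, one per line.
# Guarded behind RUN_BATCH=1 so this file can be sourced safely.
if [ "${RUN_BATCH:-0}" = "1" ]; then
  while read -r tar repo; do
    push_one "$tar" "$repo"
  done <<'EOF'
./Qwen3.5-0.8B.oci.tar mlops/modelcar-qwen3.5-0.8b
EOF
fi
```

Run with `RUN_BATCH=1` after editing the list; each model still goes through the same import, push, and cleanup sequence as the manual steps.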
## FAQ From b28b14dfa850c5e79dc59d9f0782f4a08b293206 Mon Sep 17 00:00:00 2001 From: Yuan Fang Date: Wed, 6 May 2026 13:53:18 +0800 Subject: [PATCH 2/3] change to harbor style --- docs/en/installation/ai-cluster.mdx | 92 +++++++++++++++++++---------- 1 file changed, 61 insertions(+), 31 deletions(-) diff --git a/docs/en/installation/ai-cluster.mdx b/docs/en/installation/ai-cluster.mdx index 8e2f2f03..5acd0828 100644 --- a/docs/en/installation/ai-cluster.mdx +++ b/docs/en/installation/ai-cluster.mdx @@ -356,7 +356,7 @@ default True Succeeded ## Importing Built-in Model Images for Catalog \{#importing-built-in-model-images-for-catalog} -The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into your cluster's registry before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`. +The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into Harbor before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`. @@ -366,29 +366,33 @@ Built-in model images are delivered as OCI archive tarballs (`.tar` files compli Download the tarballs from the Customer Portal Marketplace, or contact your Alauda support representative to obtain the package matching your Alauda AI version. -### Pushing to your cluster registry +### Pushing to Harbor -The recommended approach uses `ctr` (the containerd CLI), which is available on every cluster node. 
Run the following on **any one node** of the target cluster — only that node needs network access to your registry; once pushed, the image is available cluster-wide.
+The recommended target is Harbor. The following procedure has been verified with an HTTP Harbor registry, where the push command must include `--plain-http`.
+
+Run the commands on a node that has `ctr`, `curl`, and `jq` installed and can reach Harbor.
 
 First, set the environment variables:
 
 ```bash
-export REG=<registry-address>       # [!code callout]
-export REPO=<project>/<image-name>  # [!code callout]
-export AUTH=<user>:<password>       # [!code callout]
+export REG=192.168.140.0:32700           # [!code callout]
+export REPO=mlops/modelcar-qwen3.5-0.8b  # [!code callout]
+export TAG=v2.3.0                        # [!code callout]
 export TAR=./Qwen3.5-0.8B.oci.tar   # [!code callout]
+export AUTH='user:password'              # [!code callout]
 ```
 
-1. The cluster's image registry endpoint. You can find this value in the **Administrator** view, then click **Clusters**, select `your cluster`, and check the **Private Registry** value in the **Basic Info** section.
-2. Target repository path inside that registry, in the form `<project>/<image-name>`. For example, `mlops/modelcar-qwen3.5-0.8b` would push to project `mlops` with image name `modelcar-qwen3.5-0.8b`. The project must already exist in your registry and the credentials below must have write access to it.
-3. Registry credentials in the form `user:password`. **Contact your platform administrator if you do not have these.**
+1. Harbor registry endpoint, without the URL scheme.
+2. Target repository path in Harbor, in the form `<project>/<repository>`. For example, `mlops/modelcar-qwen3.5-0.8b` uses the Harbor project `mlops` and repository `modelcar-qwen3.5-0.8b`.
+3. Image tag carried by the OCI archive. If you do not know it, extract it from the tarball with the command below.
 4. Path to the OCI archive tarball obtained in the previous step.
+5. Harbor credentials in the form `user:password`. **Contact your platform administrator if you do not have these.**
 
-The tarball carries its own tag (e.g.
`v2.3.0`) inside the OCI image layout. Extract it from the tar so the rest of the procedure does not depend on knowing it ahead of time: +The tarball usually carries its own tag (e.g. `v2.3.0`) inside the OCI image layout. If needed, extract it from the tarball: ```bash export TAG=$(tar -xOf "$TAR" index.json \ @@ -396,7 +400,33 @@ export TAG=$(tar -xOf "$TAR" index.json \ echo "$TAG" # should print something like v2.3.0 ``` -Then run the push procedure: +Check whether the image tag already exists in Harbor: + +```bash +URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG" + +HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL") + +echo "HTTP=$HTTP URL=$URL" + +[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \ + || jq -r '.errors[]?.message' /tmp/harbor-artifact.json +``` + +If the Harbor project does not exist yet, create it before pushing: + +```bash +PROJECT="${REPO%%/*}" + +curl -s -u "$AUTH" -X POST "http://$REG/api/v2.0/projects" \ + -H 'Content-Type: application/json' \ + -d "{\"project_name\":\"$PROJECT\",\"public\":false}" \ + -w '\nHTTP %{http_code}\n' +``` + +If the project already exists, Harbor returns a non-2xx status code. After confirming the project exists, continue with the import and push. + +Then run the import and push procedure: ```bash # 1. Import into the node's containerd content store. @@ -410,15 +440,10 @@ ctr -n k8s.io images import \ # 2. Verify the import. You should see "$REG/$REPO:$TAG". ctr -n k8s.io images ls -q | grep "$REPO" -# 3. Push to the registry. 
-# --skip-verify : skip TLS verification (use when the registry has a private CA) -# --plain-http : use HTTP instead of HTTPS (use for HTTP-only registries) -# --local : resolve content from the local store (required for pushing -# locally-imported multi-arch indexes) +# 3. Push to Harbor. HTTP Harbor requires --plain-http. ctr -n k8s.io images push \ - -u "$AUTH" \ - --skip-verify \ - --local \ + --plain-http \ + --user "$AUTH" \ "$REG/$REPO:$TAG" # 4. Clean up the local reference on the node. Blob data is reclaimed by @@ -426,29 +451,36 @@ ctr -n k8s.io images push \ ctr -n k8s.io images rm "$REG/$REPO:$TAG" ``` -Repeat this procedure for each built-in model tarball, varying `$REPO` and `$TAR` per model. +Repeat this procedure for each built-in model tarball, varying `$REPO`, `$TAG`, and `$TAR` per model. :::info `--all-platforms` is critical at the **import** step: omitting it imports only the node's host architecture, and the subsequent push will silently miss the other platform's blobs. The flag is not needed on `push` — pushing the multi-arch index automatically pushes all platforms it references. 
::: -### Verifying the push +### Verifying the Harbor import -Confirm that the registry now serves the image as a multi-architecture index: +Confirm that Harbor now serves the image: ```bash -curl -sk -u "$AUTH" \ - -H 'Accept: application/vnd.oci.image.index.v1+json' \ - -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \ - "https://$REG/v2/$REPO/manifests/$TAG" \ - | jq '{mediaType, platforms: [.manifests[]?.platform]}' +URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG" + +HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL") + +echo "HTTP=$HTTP URL=$URL" + +[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \ + || jq -r '.errors[]?.message' /tmp/harbor-artifact.json ``` -Expected output: +`HTTP=200` means the image was successfully imported into Harbor. Expected output includes the digest, size, push time, tag, and platform information: ```json { - "mediaType": "application/vnd.oci.image.index.v1+json", + "digest": "sha256:...", + "size": 123456789, + "push_time": "2026-05-06T00:00:00.000Z", + "arch": "amd64", + "tags": ["v2.3.0"], "platforms": [ {"architecture": "amd64", "os": "linux"}, {"architecture": "arm64", "os": "linux"} @@ -456,8 +488,6 @@ Expected output: } ``` -If `mediaType` is `application/vnd.oci.image.manifest.v1+json` instead and `platforms` contains only one entry, only one architecture was pushed. Re-run the import step with `--all-platforms` and push again. - Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx). 
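The Harbor API URL built inline above — project from `${REPO%%/*}`, percent-encoded repository from `${REPO#*/}` — is repeated in both the pre-push check and the verification step, and is easy to get wrong when retyped. A small helper can centralize the construction. This is a sketch assuming the Harbor v2.0 API path layout; the function name is ours:

```shell
# Sketch: build the Harbor v2.0 artifact URL used in the checks above.
# Usage: harbor_artifact_url <registry> <project/repo> <tag>
harbor_artifact_url() {
  local reg="$1" repo="$2" tag="$3" project name
  project="${repo%%/*}"
  # Harbor expects nested repository paths percent-encoded in the URL.
  name=$(printf '%s' "${repo#*/}" | sed 's|/|%2F|g')
  printf 'http://%s/api/v2.0/projects/%s/repositories/%s/artifacts/%s\n' \
    "$reg" "$project" "$name" "$tag"
}
```

For example, `URL=$(harbor_artifact_url "$REG" "$REPO" "$TAG")` can replace either inline construction; change `http` to `https` in the format string for a TLS-enabled Harbor.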
From 9a0838653b2e73ac7c53ff8796c664a92bae5275 Mon Sep 17 00:00:00 2001
From: Yuan Fang
Date: Sat, 9 May 2026 16:03:41 +0800
Subject: [PATCH 3/3] add model catalog install

---
 docs/en/installation/ai-cluster.mdx        | 78 +++++++++++++++++++---
 docs/en/installation/pre-configuration.mdx | 15 +++++
 sites.yaml                                 |  3 +
 3 files changed, 88 insertions(+), 8 deletions(-)

diff --git a/docs/en/installation/ai-cluster.mdx b/docs/en/installation/ai-cluster.mdx
index 5acd0828..bfd21229 100644
--- a/docs/en/installation/ai-cluster.mdx
+++ b/docs/en/installation/ai-cluster.mdx
@@ -332,11 +332,73 @@ In **Administrator** view:
 :::
 
 12. Under **Gitlab** section:
+
+    :::warning
+    GitLab-backed model storage is deprecated. It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead.
+    :::
+
     1. Type the URL of self-hosted Gitlab for **Base URL**.
     2. Type `cpaas-system` for **Admin Token Secret Namespace**.
     3. Type `aml-gitlab-admin-token` for **Admin Token Secret Name**.
-13. Review above configurations and then click **Create**.
+13. Under **Model Catalog** section, configure the following parameters:
+
+    - **Database Password Secret Namespace**: Namespace of the secret containing the PostgreSQL password for Model Catalog.
+    - **Database Password Secret Name**: Name of the secret containing the PostgreSQL password for Model Catalog.
+
+    Create the secret before creating the Alauda AI instance. If you use the following example, set **Database Password Secret Namespace** to `aml-operator` and **Database Password Secret Name** to `model-catalog`.
+
+    ```yaml
+    apiVersion: v1
+    kind: Secret
+    metadata:
+      name: model-catalog        # [!code callout]
+      namespace: aml-operator    # [!code callout]
+    stringData:
+      password: <your-password>  # [!code callout]
+    type: Opaque
+    ```
+
+    1. `metadata.name` is the value for **Database Password Secret Name**.
+    2.
`metadata.namespace` is the value for **Database Password Secret Namespace**.
+    3. `stringData.password` is the PostgreSQL password in plain text. Kubernetes stores it as base64-encoded `data.password` after the Secret is created.
+
+    After creation, the stored Secret has a base64-encoded `data.password` field, for example:
+
+    ```yaml
+    apiVersion: v1
+    data:
+      password: cGc=
+    kind: Secret
+    metadata:
+      name: model-catalog
+      namespace: aml-operator
+    type: Opaque
+    ```
+
+    - **Model OCI Registry Address**: Registry address hosting model OCI artifacts for Model Catalog. The default value is `build-harbor.alauda.cn`.
+
+      This registry stores the model OCI images used by Model Catalog. Use Harbor or another production-mode OCI registry with HTTPS access enabled. The Harbor project or repository used for Model Catalog must allow anonymous pull access from inference cluster nodes.
+
+      If you cannot deploy a registry with HTTPS in the target environment, you can use an HTTP registry as a fallback. Configure the container runtime on every node in the inference cluster before deploying models. For containerd, add an insecure registry mirror for the registry address, for example by creating `/etc/containerd/certs.d/<registry-address>/hosts.toml`:
+
+      ```toml
+      server = "http://<registry-address>"
+
+      [host."http://<registry-address>"]
+        capabilities = ["pull", "resolve"]
+      ```
+
+      Then restart containerd or apply the equivalent node-runtime configuration through your cluster management system. This configuration must exist on the nodes where inference service pods are scheduled; otherwise the pod image pull will fail even if Model Catalog can list the model. The exact containerd configuration path can vary by Kubernetes distribution; after applying the configuration, verify that the node can pull a Model Catalog image, for example with `crictl pull <registry-address>/<repository>:<tag>`.
+
+    - **Source of PVC**: Choose whether to reuse an existing PVC or create a new one. Use `CreateNew` to let the installation create the PVC.
+ - **StorageClass Name**: StorageClass used when creating a new PVC. + +14. Review above configurations and then click **Create**. ### Verification @@ -356,7 +418,7 @@ default True Succeeded ## Importing Built-in Model Images for Catalog \{#importing-built-in-model-images-for-catalog} -The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into Harbor before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`. +The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into the OCI registry configured by Model Catalog before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`. @@ -368,7 +430,7 @@ Download the tarballs from the Customer Portal Marketplace, or contact your Alau ### Pushing to Harbor -The recommended target is Harbor. The following procedure has been verified with an HTTP Harbor registry, where the push command must include `--plain-http`. +The recommended target is Harbor. The example below uses an HTTP Harbor registry. If your Harbor registry uses HTTPS, omit `--plain-http` and change the API URLs from `http://` to `https://`. Run the commands on a node that has `ctr`, `curl`, and `jq` installed and can reach Harbor. 
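The tooling requirement just above can be checked up front, before any import is attempted. A minimal sketch — the helper name is ours:

```shell
# Sketch: fail fast if any CLI required by the procedure is missing.
# Usage: preflight ctr curl jq tar
preflight() {
  local missing=0 bin
  for bin in "$@"; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      echo "preflight: missing required tool: $bin" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run `preflight ctr curl jq tar` on the chosen node before starting; a non-zero exit reports every missing tool at once instead of failing midway through the procedure.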
@@ -377,7 +439,7 @@ First, set the environment variables: ```bash export REG=192.168.140.0:32700 # [!code callout] export REPO=mlops/modelcar-qwen3.5-0.8b # [!code callout] -export TAG=v2.3.0 # [!code callout] +export TAG=v0.1.0 # [!code callout] export TAR=./Qwen3.5-0.8B.oci.tar # [!code callout] export AUTH='user:password' # [!code callout] ``` @@ -392,12 +454,12 @@ export AUTH='user:password' # [!code callout] -The tarball usually carries its own tag (e.g. `v2.3.0`) inside the OCI image layout. If needed, extract it from the tarball: +The tarball usually carries its own tag (e.g. `v0.1.0`) inside the OCI image layout. If needed, extract it from the tarball: ```bash export TAG=$(tar -xOf "$TAR" index.json \ | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]') -echo "$TAG" # should print something like v2.3.0 +echo "$TAG" # should print something like v0.1.0 ``` Check whether the image tag already exists in Harbor: @@ -440,7 +502,7 @@ ctr -n k8s.io images import \ # 2. Verify the import. You should see "$REG/$REPO:$TAG". ctr -n k8s.io images ls -q | grep "$REPO" -# 3. Push to Harbor. HTTP Harbor requires --plain-http. +# 3. Push to Harbor. Use --plain-http only for HTTP Harbor. ctr -n k8s.io images push \ --plain-http \ --user "$AUTH" \ @@ -480,7 +542,7 @@ echo "HTTP=$HTTP URL=$URL" "size": 123456789, "push_time": "2026-05-06T00:00:00.000Z", "arch": "amd64", - "tags": ["v2.3.0"], + "tags": ["v0.1.0"], "platforms": [ {"architecture": "amd64", "os": "linux"}, {"architecture": "arm64", "os": "linux"} diff --git a/docs/en/installation/pre-configuration.mdx b/docs/en/installation/pre-configuration.mdx index e17cdf72..0509c2b1 100644 --- a/docs/en/installation/pre-configuration.mdx +++ b/docs/en/installation/pre-configuration.mdx @@ -8,6 +8,10 @@ weight: 5 In Alauda AI, GitLab is the core component for **Model Management**. Before deploying Alauda AI, you **must prepare** a GitLab service. +:::warning +GitLab-backed model storage is deprecated. 
It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead.
+:::
+
 ### **Deployment Options**
 
 #### **1. GitLab service requirements**
@@ -94,6 +98,17 @@ kubectl create secret generic aml-gitlab-admin-token \
 
+## **Preparing the Harbor Service**
+
+If you plan to use the **Model Catalog** feature, prepare an **Alauda Build of Harbor** service before installing Alauda AI. The registry must meet the following requirements:
+
+- Run in production mode. Use HTTPS for production deployments.
+- Allow inference cluster nodes to pull Model Catalog images without image pull credentials.
+
+If you cannot deploy an HTTPS registry in the target environment, you can use an HTTP registry as a fallback, but you must configure the inference cluster container runtime before deploying models. The detailed containerd configuration is covered when setting **Model OCI Registry Address** during Alauda AI instance creation.
+
+For the installation procedure, see the **Alauda Build of Harbor** installation documentation.
+
 ## **Frequently Asked Questions (FAQ)**
 
 ### **1. How to optimize GitLab 18.5 and later configuration for large LFS objects?**

diff --git a/sites.yaml b/sites.yaml
index fe46bc34..b86d156a 100644
--- a/sites.yaml
+++ b/sites.yaml
@@ -4,6 +4,9 @@
 - name: alauda-build-of-gitlab
   base: /alauda-build-of-gitlab
   version: v18.5
+- name: alauda-build-of-harbor
+  base: /alauda-build-of-harbor
+  version: "2.14"
 - name: servicemeshv1
   base: /servicemeshv1
   version: "4.3"
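For the HTTP-registry fallback described above, the per-node containerd drop-in can be generated rather than hand-written, which keeps the registry address consistent between the `server` line and the `[host]` section. A sketch — the function name is ours, and the `certs.d` directory layout follows containerd's host configuration convention:

```shell
# Sketch: emit a hosts.toml for an HTTP (non-TLS) registry.
# Redirect the output to /etc/containerd/certs.d/<registry-address>/hosts.toml
# on every node that will run inference pods, then restart containerd.
hosts_toml() {
  local reg="$1"
  printf 'server = "http://%s"\n\n[host."http://%s"]\n  capabilities = ["pull", "resolve"]\n' \
    "$reg" "$reg"
}
```

For example: `hosts_toml 192.168.140.0:32700 | sudo tee /etc/containerd/certs.d/192.168.140.0:32700/hosts.toml`.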