diff --git a/docs/en/installation/ai-cluster.mdx b/docs/en/installation/ai-cluster.mdx index 8734563..bfd2122 100644 --- a/docs/en/installation/ai-cluster.mdx +++ b/docs/en/installation/ai-cluster.mdx @@ -332,11 +332,73 @@ In **Administrator** view: ::: 12. Under **Gitlab** section: + + :::warning + GitLab-backed model storage is deprecated. It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead. + ::: + 1. Type the URL of self-hosted Gitlab for **Base URL**. 2. Type `cpaas-system` for **Admin Token Secret Namespace**. 3. Type `aml-gitlab-admin-token` for **Admin Token Secret Name**. -13. Review above configurations and then click **Create**. +13. Under **Model Catalog** section, configure the following parameters: + + - **Database Password Secret Namespace**: Namespace of the secret containing the PostgreSQL password for Model Catalog. + - **Database Password Secret Name**: Name of the secret containing the PostgreSQL password for Model Catalog. + + Create the secret before creating the Alauda AI instance. If you use the following example, set **Database Password Secret Namespace** to `aml-operator` and **Database Password Secret Name** to `model-catalog`. + + ```yaml + apiVersion: v1 + kind: Secret + metadata: + name: model-catalog # [!code callout] + namespace: aml-operator # [!code callout] + stringData: + password: <plain-text-password> # [!code callout] + type: Opaque + ``` + + + + 1. `metadata.name` is the value for **Database Password Secret Name**. + 2. `metadata.namespace` is the value for **Database Password Secret Namespace**. + 3. `stringData.password` is the PostgreSQL password in plain text. Kubernetes stores it as base64-encoded `data.password` after the Secret is created.
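The `stringData`-to-`data` encoding can be checked locally with `base64`; a quick sketch using the example value `pg`:

```shell
# Encode a plain-text password the way Kubernetes stores stringData values.
printf '%s' 'pg' | base64      # → cGc=

# Decode a stored data.password value back to plain text.
printf '%s' 'cGc=' | base64 -d # → pg
```

Note that `printf '%s'` is used instead of `echo` so no trailing newline is encoded into the password.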
+ + + + After creation, the stored Secret has a base64-encoded `data.password` field, for example: + + ```yaml + apiVersion: v1 + data: + password: cGc= + kind: Secret + metadata: + name: model-catalog + namespace: aml-operator + type: Opaque + ``` + + - **Model OCI Registry Address**: Registry address hosting model OCI artifacts for Model Catalog. The default value is `build-harbor.alauda.cn`. + + This registry stores the model OCI images used by Model Catalog. Use Harbor or another production-mode OCI registry with HTTPS access enabled. The Harbor project or repository used for Model Catalog must allow anonymous pull access from inference cluster nodes. + + If you cannot deploy a registry with HTTPS in the target environment, you can use an HTTP registry as a fallback. Configure the container runtime on every node in the inference cluster before deploying models. For containerd, add an insecure registry mirror for the registry address, for example by creating `/etc/containerd/certs.d/<registry-address>/hosts.toml`: + + ```toml + server = "http://<registry-address>" + + [host."http://<registry-address>"] + capabilities = ["pull", "resolve"] + ``` + + Then restart containerd or apply the equivalent node-runtime configuration through your cluster management system. This configuration must exist on the nodes where inference service pods are scheduled; otherwise the pod image pull will fail even if Model Catalog can list the model. The exact containerd configuration path can vary by Kubernetes distribution; after applying the configuration, verify that the node can pull a Model Catalog image, for example with `crictl pull <registry-address>/<repository>:<tag>`. + + - **Source of PVC**: Choose whether to reuse an existing PVC or create a new one. Use `CreateNew` to let the installation create the PVC. + - **StorageClass Name**: StorageClass used when creating a new PVC. + +14. Review the above configurations and then click **Create**.
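To make the HTTP-fallback shape in the **Model OCI Registry Address** step concrete, here is a sketch of the containerd mirror file for a hypothetical registry at `192.0.2.10:5000` (an example address only; substitute your own registry address), stored as `/etc/containerd/certs.d/192.0.2.10:5000/hosts.toml`:

```toml
# Hypothetical HTTP registry; replace 192.0.2.10:5000 with your registry address.
server = "http://192.0.2.10:5000"

[host."http://192.0.2.10:5000"]
  capabilities = ["pull", "resolve"]
```

The directory name under `certs.d` must match the registry address exactly as it appears in image references, including the port.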
### Verification @@ -354,6 +416,142 @@ default True Succeeded ``` +## Importing Built-in Model Images for Catalog \{#importing-built-in-model-images-for-catalog} + +The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into the OCI registry configured by Model Catalog before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`. + + + +### Obtaining the OCI image tarballs + +Built-in model images are delivered as OCI archive tarballs (`.tar` files compliant with the OCI Image Layout Specification). Each tarball contains a multi-architecture image (`linux/amd64` + `linux/arm64`) for one model. + +Download the tarballs from the Customer Portal Marketplace, or contact your Alauda support representative to obtain the package matching your Alauda AI version. + +### Pushing to Harbor + +The recommended target is Harbor. The example below uses an HTTP Harbor registry. If your Harbor registry uses HTTPS, omit `--plain-http` and change the API URLs from `http://` to `https://`. + +Run the commands on a node that has `ctr`, `curl`, and `jq` installed and can reach Harbor. + +First, set the environment variables: + +```bash +export REG=192.168.140.0:32700 # [!code callout] +export REPO=mlops/modelcar-qwen3.5-0.8b # [!code callout] +export TAG=v0.1.0 # [!code callout] +export TAR=./Qwen3.5-0.8B.oci.tar # [!code callout] +export AUTH='user:password' # [!code callout] +``` + + + +1. Harbor registry endpoint, without the URL scheme. +2. Target repository path in Harbor, in the form `/`. For example, `mlops/modelcar-qwen3.5-0.8b` uses the Harbor project `mlops` and repository `modelcar-qwen3.5-0.8b`. +3. Image tag carried by the OCI archive. If you do not know it, extract it from the tarball with the command below. +4. 
Path to the OCI archive tarball obtained in the previous step. +5. Harbor credentials in the form `user:password`. **Contact your platform administrator if you do not have these.** + + + +The tarball usually carries its own tag (e.g. `v0.1.0`) inside the OCI image layout. If needed, extract it from the tarball: + +```bash +export TAG=$(tar -xOf "$TAR" index.json \ + | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]') +echo "$TAG" # should print something like v0.1.0 +``` + +Check whether the image tag already exists in Harbor: + +```bash +URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG" + +HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL") + +echo "HTTP=$HTTP URL=$URL" + +[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \ + || jq -r '.errors[]?.message' /tmp/harbor-artifact.json +``` + +If the Harbor project does not exist yet, create it before pushing: + +```bash +PROJECT="${REPO%%/*}" + +curl -s -u "$AUTH" -X POST "http://$REG/api/v2.0/projects" \ + -H 'Content-Type: application/json' \ + -d "{\"project_name\":\"$PROJECT\",\"public\":false}" \ + -w '\nHTTP %{http_code}\n' +``` + +If the project already exists, Harbor returns a non-2xx status code. After confirming the project exists, continue with the import and push. + +Then run the import and push procedure: + +```bash +# 1. Import into the node's containerd content store. +# --base-name prepends $REG/$REPO to the tag carried inside the tarball, +# producing a fully-qualified reference $REG/$REPO:$TAG. +ctr -n k8s.io images import \ + --all-platforms \ + --base-name "$REG/$REPO" \ + "$TAR" + +# 2. Verify the import. You should see "$REG/$REPO:$TAG". +ctr -n k8s.io images ls -q | grep "$REPO" + +# 3. Push to Harbor. Use --plain-http only for HTTP Harbor. 
+ctr -n k8s.io images push \ + --plain-http \ + --user "$AUTH" \ + "$REG/$REPO:$TAG" + +# 4. Clean up the local reference on the node. Blob data is reclaimed by +# containerd's garbage collector, leaving no persistent state on the node. +ctr -n k8s.io images rm "$REG/$REPO:$TAG" +``` + +Repeat this procedure for each built-in model tarball, varying `$REPO`, `$TAG`, and `$TAR` per model. + +:::info +`--all-platforms` is critical at the **import** step: omitting it imports only the node's host architecture, and the subsequent push will silently miss the other platform's blobs. The flag is not needed on `push` — pushing the multi-arch index automatically pushes all platforms it references. +::: + +### Verifying the Harbor import + +Confirm that Harbor now serves the image: + +```bash +URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG" + +HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL") + +echo "HTTP=$HTTP URL=$URL" + +[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \ + || jq -r '.errors[]?.message' /tmp/harbor-artifact.json +``` + +`HTTP=200` means the image was successfully imported into Harbor. Expected output includes the digest, size, push time, tag, and platform information: + +```json +{ + "digest": "sha256:...", + "size": 123456789, + "push_time": "2026-05-06T00:00:00.000Z", + "arch": "amd64", + "tags": ["v0.1.0"], + "platforms": [ + {"architecture": "amd64", "os": "linux"}, + {"architecture": "arm64", "os": "linux"} + ] +} +``` + + + Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx). 
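If you script the Harbor verification above (for example in CI), the platform check can be automated with a `jq` predicate; a sketch against an inline sample with the same shape as the Harbor artifact response (all values hypothetical):

```shell
# Write a sample artifact JSON shaped like Harbor's /artifacts response.
cat > /tmp/sample-artifact.json <<'EOF'
{
  "references": [
    {"platform": {"architecture": "amd64", "os": "linux"}},
    {"platform": {"architecture": "arm64", "os": "linux"}}
  ]
}
EOF

# Prints "true" and exits 0 only if both architectures are referenced.
jq -e '([.references[]?.platform.architecture] | sort) == ["amd64", "arm64"]' \
  /tmp/sample-artifact.json
```

Run against the real `/tmp/harbor-artifact.json`, a non-zero exit code indicates a missing platform, typically caused by omitting `--all-platforms` at the import step.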
## FAQ diff --git a/docs/en/installation/pre-configuration.mdx b/docs/en/installation/pre-configuration.mdx index e17cdf7..0509c2b 100644 --- a/docs/en/installation/pre-configuration.mdx +++ b/docs/en/installation/pre-configuration.mdx @@ -8,6 +8,10 @@ weight: 5 In Alauda AI, GitLab is the core component for **Model Management**. Before deploying Alauda AI, you **must prepare** a GitLab service. +:::warning +GitLab-backed model storage is deprecated. It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead. +::: + ### **Deployment Options** #### **1. GitLab service requirements** @@ -94,6 +98,17 @@ kubectl create secret generic aml-gitlab-admin-token \ +## **Preparing the Harbor Service** + +If you plan to use the **Model Catalog** feature, prepare an **Alauda Build of Harbor** service before installing Alauda AI. The registry must meet the following requirements: + +- Run in production mode. Use HTTPS for production deployments. +- Allow inference cluster nodes to pull Model Catalog images without image pull credentials. + +If you cannot deploy an HTTPS registry in the target environment, you can use an HTTP registry as a fallback, but you must configure the inference cluster container runtime before deploying models. The detailed containerd configuration is covered when setting **Model OCI Registry Address** during Alauda AI instance creation. + +For the installation procedure, see . + ## **Frequently Asked Questions (FAQ)** ### **1. 
How to optimize GitLab 18.5 and later configuration for large LFS objects?** diff --git a/sites.yaml b/sites.yaml index fe46bc3..b86d156 100644 --- a/sites.yaml +++ b/sites.yaml @@ -4,6 +4,9 @@ - name: alauda-build-of-gitlab base: /alauda-build-of-gitlab version: v18.5 +- name: alauda-build-of-harbor + base: /alauda-build-of-harbor + version: "2.14" - name: servicemeshv1 base: /servicemeshv1 version: "4.3"