# add oci-models docs #210
@@ -332,11 +332,73 @@ In **Administrator** view:

    :::

12. Under the **Gitlab** section:

    :::warning
    GitLab-backed model storage is deprecated. It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead.
    :::

    1. Type the URL of the self-hosted GitLab instance for **Base URL**.
    2. Type `cpaas-system` for **Admin Token Secret Namespace**.
    3. Type `aml-gitlab-admin-token` for **Admin Token Secret Name**.
13. Under the **Model Catalog** section, configure the following parameters:

    - **Database Password Secret Namespace**: Namespace of the secret containing the PostgreSQL password for Model Catalog.
    - **Database Password Secret Name**: Name of the secret containing the PostgreSQL password for Model Catalog.

    Create the secret before creating the Alauda AI instance. If you use the following example, set **Database Password Secret Namespace** to `aml-operator` and **Database Password Secret Name** to `model-catalog`.

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: model-catalog # [!code callout]
      namespace: aml-operator # [!code callout]
    stringData:
      password: <postgres-password> # [!code callout]
    type: Opaque
    ```

    <Callouts>

    1. `metadata.name` is the value for **Database Password Secret Name**.
    2. `metadata.namespace` is the value for **Database Password Secret Namespace**.
    3. `stringData.password` is the PostgreSQL password in plain text. Kubernetes stores it as base64-encoded `data.password` after the Secret is created.

    </Callouts>

    After creation, the stored Secret has a base64-encoded `data.password` field, for example:

    ```yaml
    apiVersion: v1
    data:
      password: cGc=
    kind: Secret
    metadata:
      name: model-catalog
      namespace: aml-operator
    type: Opaque
    ```
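The relationship between `stringData.password` and the stored `data.password` can be checked with `base64`. A minimal sketch, assuming the example password is the literal string `pg` (which is what `cGc=` above decodes to); substitute your own value:

```shell
# Encode a plain-text password the way Kubernetes stores it in data.password.
# 'pg' is only the example value behind cGc=; use your real password instead.
printf '%s' 'pg' | base64        # prints: cGc=

# Decode a stored data.password value back to plain text, e.g. when auditing.
printf '%s' 'cGc=' | base64 -d   # prints: pg
```

Note the `printf '%s'` (rather than `echo`): it avoids a trailing newline being encoded into the stored value.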
    - **Model OCI Registry Address**: Registry address hosting model OCI artifacts for Model Catalog. The default value is `build-harbor.alauda.cn`.

      This registry stores the model OCI images used by Model Catalog. Use Harbor or another production-mode OCI registry with HTTPS access enabled. The Harbor project or repository used for Model Catalog must allow anonymous pull access from inference cluster nodes.

      If you cannot deploy a registry with HTTPS in the target environment, you can use an HTTP registry as a fallback. Configure the container runtime on every node in the inference cluster before deploying models. For containerd, add an insecure registry mirror for the registry address, for example by creating `/etc/containerd/certs.d/<registry-host:port>/hosts.toml`:

      ```toml
      server = "http://<registry-host:port>"

      [host."http://<registry-host:port>"]
        capabilities = ["pull", "resolve"]
      ```

      Then restart containerd or apply the equivalent node-runtime configuration through your cluster management system. This configuration must exist on the nodes where inference service pods are scheduled; otherwise the pod image pull will fail even if Model Catalog can list the model. The exact containerd configuration path can vary by Kubernetes distribution; after applying the configuration, verify that the node can pull a Model Catalog image, for example with `crictl pull <registry-host:port>/<repository>:<tag>`.
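The `hosts.toml` setup above can be scripted per node. A minimal sketch, assuming the containerd 1.6+ `certs.d` layout and a placeholder registry address; the scratch-directory default only keeps the sketch safe to dry-run anywhere, while on a real inference node you would set `CERTS_DIR=/etc/containerd/certs.d` (as root) and then restart containerd as described above:

```shell
# Generate the per-registry hosts.toml for an HTTP (insecure) registry.
# REG is a placeholder; replace it with your <registry-host:port>.
REG="192.168.140.0:32700"
# On real nodes: CERTS_DIR=/etc/containerd/certs.d (requires root).
CERTS_DIR="${CERTS_DIR:-$(mktemp -d)}"

mkdir -p "${CERTS_DIR}/${REG}"
cat > "${CERTS_DIR}/${REG}/hosts.toml" <<EOF
server = "http://${REG}"

[host."http://${REG}"]
  capabilities = ["pull", "resolve"]
EOF

echo "wrote ${CERTS_DIR}/${REG}/hosts.toml"
```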
    - **Source of PVC**: Choose whether to reuse an existing PVC or create a new one. Use `CreateNew` to let the installation create the PVC.
    - **StorageClass Name**: StorageClass used when creating a new PVC.

14. Review the above configurations and then click **Create**.
### Verification

@@ -354,6 +416,142 @@ default True Succeeded
```
</Steps>
## Importing Built-in Model Images for Catalog \{#importing-built-in-model-images-for-catalog}

The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into the OCI registry configured for Model Catalog before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`.

<Steps>
### Obtaining the OCI image tarballs

Built-in model images are delivered as OCI archive tarballs (`.tar` files compliant with the OCI Image Layout Specification). Each tarball contains a multi-architecture image (`linux/amd64` + `linux/arm64`) for one model.

Download the tarballs from the Customer Portal Marketplace, or contact your Alauda support representative to obtain the package matching your Alauda AI version.
### Pushing to Harbor

The recommended target is Harbor. The example below uses an HTTP Harbor registry. If your Harbor registry uses HTTPS, omit `--plain-http` and change the API URLs from `http://` to `https://`.

Run the commands on a node that has `ctr`, `curl`, and `jq` installed and can reach Harbor.

First, set the environment variables:

```bash
export REG=192.168.140.0:32700 # [!code callout]
export REPO=mlops/modelcar-qwen3.5-0.8b # [!code callout]
export TAG=v0.1.0 # [!code callout]
export TAR=./Qwen3.5-0.8B.oci.tar # [!code callout]
export AUTH='user:password' # [!code callout]
```
<Callouts>

1. Harbor registry endpoint, without the URL scheme.
2. Target repository path in Harbor, in the form `<project>/<image-name>`. For example, `mlops/modelcar-qwen3.5-0.8b` uses the Harbor project `mlops` and the repository `modelcar-qwen3.5-0.8b`.
3. Image tag carried by the OCI archive. If you do not know it, extract it from the tarball with the command below.
4. Path to the OCI archive tarball obtained in the previous step.
5. Harbor credentials in the form `user:password`. **Contact your platform administrator if you do not have these.**

</Callouts>
The tarball usually carries its own tag (e.g. `v0.1.0`) inside the OCI image layout. If needed, extract it from the tarball. If the annotation is absent, `jq -r` prints `null`, so guard against an empty or `null` result instead of pushing with a bad tag:

```bash
export TAG=$(tar -xOf "$TAR" index.json \
  | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')
if [ -z "$TAG" ] || [ "$TAG" = "null" ]; then
  echo "ERROR: could not extract tag from $TAR; set \$TAG manually." >&2
  exit 1
fi
echo "$TAG" # should print something like v0.1.0
```
Check whether the image tag already exists in Harbor:

```bash
URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG"

HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL")

echo "HTTP=$HTTP URL=$URL"

[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \
  || jq -r '.errors[]?.message' /tmp/harbor-artifact.json
```
If the Harbor project does not exist yet, create it before pushing:

```bash
PROJECT="${REPO%%/*}"

curl -s -u "$AUTH" -X POST "http://$REG/api/v2.0/projects" \
  -H 'Content-Type: application/json' \
  -d "{\"project_name\":\"$PROJECT\",\"public\":false}" \
  -w '\nHTTP %{http_code}\n'
```

If the project already exists, Harbor returns a non-2xx status code. After confirming the project exists, continue with the import and push.

Then run the import and push procedure:
```bash
# 1. Import into the node's containerd content store.
#    --base-name prepends $REG/$REPO to the tag carried inside the tarball,
#    producing a fully-qualified reference $REG/$REPO:$TAG.
ctr -n k8s.io images import \
  --all-platforms \
  --base-name "$REG/$REPO" \
  "$TAR"

# 2. Verify the import. You should see "$REG/$REPO:$TAG".
ctr -n k8s.io images ls -q | grep "$REPO"

# 3. Push to Harbor. Use --plain-http only for HTTP Harbor.
ctr -n k8s.io images push \
  --plain-http \
  --user "$AUTH" \
  "$REG/$REPO:$TAG"

# 4. Clean up the local reference on the node. Blob data is reclaimed by
#    containerd's garbage collector, leaving no persistent state on the node.
ctr -n k8s.io images rm "$REG/$REPO:$TAG"
```
Repeat this procedure for each built-in model tarball, varying `$REPO`, `$TAG`, and `$TAR` per model.
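That repetition can be scripted. A dry-run sketch, assuming the tarballs sit in the current directory and that each Harbor repository name is derived from the tarball filename (a hypothetical convention; adjust `REPO` to your actual layout). It only prints the `ctr` commands; remove the `echo` prefixes to execute them:

```shell
# Dry-run: derive $REPO and $TAG per tarball and print the ctr commands.
REG="${REG:-192.168.140.0:32700}"
PROJECT="mlops"                       # hypothetical Harbor project
for TAR in ./*.oci.tar; do
  [ -e "$TAR" ] || continue           # no tarballs in the current directory
  base=$(basename "$TAR" .oci.tar | tr '[:upper:]' '[:lower:]')
  REPO="${PROJECT}/modelcar-${base}"  # assumed naming convention
  TAG=$(tar -xOf "$TAR" index.json \
    | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')
  if [ -z "$TAG" ] || [ "$TAG" = "null" ]; then
    echo "skip $TAR: no embedded tag; set TAG manually" >&2
    continue
  fi
  echo ctr -n k8s.io images import --all-platforms --base-name "$REG/$REPO" "$TAR"
  echo ctr -n k8s.io images push --plain-http --user "$AUTH" "$REG/$REPO:$TAG"
  echo ctr -n k8s.io images rm "$REG/$REPO:$TAG"
done
```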
:::info
`--all-platforms` is critical at the **import** step: omitting it imports only the node's host architecture, and the subsequent push will silently miss the other platform's blobs. The flag is not needed on `push`; pushing the multi-arch index automatically pushes all platforms it references.
:::
### Verifying the Harbor import

Confirm that Harbor now serves the image:
```bash
URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG"

HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL")

echo "HTTP=$HTTP URL=$URL"

[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \
  || jq -r '.errors[]?.message' /tmp/harbor-artifact.json
```
`HTTP=200` means the image was successfully imported into Harbor. Expected output includes the digest, size, push time, tag, and platform information:

```json
{
  "digest": "sha256:...",
  "size": 123456789,
  "push_time": "2026-05-06T00:00:00.000Z",
  "arch": "amd64",
  "tags": ["v0.1.0"],
  "platforms": [
    {"architecture": "amd64", "os": "linux"},
    {"architecture": "arm64", "os": "linux"}
  ]
}
```
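As a quick pass/fail check on the saved response, you can assert that both architectures appear in the artifact's references; this reuses the `/tmp/harbor-artifact.json` file written by the `curl` call above:

```shell
# Exit through the success branch only if both linux/amd64 and linux/arm64
# made it to Harbor; a missing platform usually means --all-platforms was
# omitted at the import step.
jq -e '[.references[]?.platform.architecture]
       | (index("amd64") != null) and (index("arm64") != null)' \
  /tmp/harbor-artifact.json \
  && echo "multi-arch OK" \
  || echo "WARNING: a platform is missing; re-import with --all-platforms" >&2
```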
</Steps>

Now, the core capabilities of Alauda AI have been successfully deployed. If you want to quickly experience the product, please refer to the [Quick Start](../../overview/quick_start.mdx).

## FAQ