200 changes: 199 additions & 1 deletion docs/en/installation/ai-cluster.mdx
In **Administrator** view:
:::

12. Under **Gitlab** section:

:::warning
GitLab-backed model storage is deprecated. It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead.
:::

1. Enter the URL of your self-hosted GitLab instance for **Base URL**.
2. Enter `cpaas-system` for **Admin Token Secret Namespace**.
3. Enter `aml-gitlab-admin-token` for **Admin Token Secret Name**.

13. Under **Model Catalog** section, configure the following parameters:

- **Database Password Secret Namespace**: Namespace of the secret containing the PostgreSQL password for Model Catalog.
- **Database Password Secret Name**: Name of the secret containing the PostgreSQL password for Model Catalog.

Create the secret before creating the Alauda AI instance. If you use the following example, set **Database Password Secret Namespace** to `aml-operator` and **Database Password Secret Name** to `model-catalog`.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: model-catalog # [!code callout]
  namespace: aml-operator # [!code callout]
stringData:
  password: <postgres-password> # [!code callout]
type: Opaque
```

<Callouts>

1. `metadata.name` is the value for **Database Password Secret Name**.
2. `metadata.namespace` is the value for **Database Password Secret Namespace**.
3. `stringData.password` is the PostgreSQL password in plain text. Kubernetes stores it as base64-encoded `data.password` after the Secret is created.

</Callouts>
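
The same Secret can also be created imperatively; this is equivalent to the YAML manifest above (the key name `password` is the one the manifest uses):

```bash
kubectl create secret generic model-catalog \
  --namespace aml-operator \
  --from-literal=password='<postgres-password>'
```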

After creation, the stored Secret has a base64-encoded `data.password` field, for example:

```yaml
apiVersion: v1
data:
  password: cGc=
kind: Secret
metadata:
  name: model-catalog
  namespace: aml-operator
type: Opaque
```
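
You can reproduce the encoded value with the `base64` utility; for illustration, assuming the plain-text password were `pg`:

```bash
# Encode a plain-text password the same way Kubernetes stores it in data.password.
printf '%s' 'pg' | base64
# cGc=
```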

- **Model OCI Registry Address**: Registry address hosting model OCI artifacts for Model Catalog. The default value is `build-harbor.alauda.cn`.

This registry stores the model OCI images used by Model Catalog. Use Harbor or another production-mode OCI registry with HTTPS access enabled. The Harbor project or repository used for Model Catalog must allow anonymous pull access from inference cluster nodes.

If you cannot deploy a registry with HTTPS in the target environment, you can use an HTTP registry as a fallback. Configure the container runtime on every node in the inference cluster before deploying models. For containerd, add an insecure registry mirror for the registry address, for example by creating `/etc/containerd/certs.d/<registry-host:port>/hosts.toml`:

```toml
server = "http://<registry-host:port>"

[host."http://<registry-host:port>"]
capabilities = ["pull", "resolve"]
```

Then restart containerd or apply the equivalent node-runtime configuration through your cluster management system. This configuration must exist on the nodes where inference service pods are scheduled; otherwise the pod image pull will fail even if Model Catalog can list the model. The exact containerd configuration path can vary by Kubernetes distribution; after applying the configuration, verify that the node can pull a Model Catalog image, for example with `crictl pull <registry-host:port>/<repository>:<tag>`.
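
Note that containerd only consults `hosts.toml` files when a registry `config_path` is set. A sketch for containerd 1.6 or later (the exact plugin key can differ across containerd versions and distributions):

```toml
# /etc/containerd/config.toml: point the CRI registry plugin at the certs.d directory.
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"
```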

- **Source of PVC**: Choose whether to reuse an existing PVC or create a new one. Use `CreateNew` to let the installation create the PVC.
- **StorageClass Name**: StorageClass used when creating a new PVC.
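
If you reuse an existing PVC, it must exist before instance creation. A minimal sketch, in which the claim name, access mode, and size are illustrative assumptions:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-catalog-data # assumed name; use the PVC you intend to select
  namespace: aml-operator
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: <storageclass-name>
  resources:
    requests:
      storage: 100Gi # assumed size; adjust for your models
```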

14. Review the configurations above, then click **Create**.

### Verification

```
default True Succeeded
```
</Steps>

## Importing Built-in Model Images for Catalog \{#importing-built-in-model-images-for-catalog}

The **Catalog** feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images **must be imported into the OCI registry configured by Model Catalog before the Catalog can serve them**. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with `ImagePullBackOff`.

<Steps>

### Obtaining the OCI image tarballs

Built-in model images are delivered as OCI archive tarballs (`.tar` files compliant with the OCI Image Layout Specification). Each tarball contains a multi-architecture image (`linux/amd64` + `linux/arm64`) for one model.

Download the tarballs from the Customer Portal Marketplace, or contact your Alauda support representative to obtain the package matching your Alauda AI version.

### Pushing to Harbor

The recommended target is Harbor. The example below uses an HTTP Harbor registry. If your Harbor registry uses HTTPS, omit `--plain-http` and change the API URLs from `http://` to `https://`.

Run the commands on a node that has `ctr`, `curl`, and `jq` installed and can reach Harbor.

First, set the environment variables:

```bash
export REG=192.168.140.0:32700 # [!code callout]
export REPO=mlops/modelcar-qwen3.5-0.8b # [!code callout]
export TAG=v0.1.0 # [!code callout]
export TAR=./Qwen3.5-0.8B.oci.tar # [!code callout]
export AUTH='user:password' # [!code callout]
```

<Callouts>

1. Harbor registry endpoint, without the URL scheme.
2. Target repository path in Harbor, in the form `<project>/<image-name>`. For example, `mlops/modelcar-qwen3.5-0.8b` uses the Harbor project `mlops` and repository `modelcar-qwen3.5-0.8b`.
3. Image tag carried by the OCI archive. If you do not know it, extract it from the tarball with the command below.
4. Path to the OCI archive tarball obtained in the previous step.
5. Harbor credentials in the form `user:password`. **Contact your platform administrator if you do not have these.**

</Callouts>

The tarball usually carries its own tag (e.g. `v0.1.0`) inside the OCI image layout. If needed, extract it from the tarball:

```bash
export TAG=$(tar -xOf "$TAR" index.json \
| jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')

# Guard against a missing annotation: jq -r prints the literal string "null",
# which would make every later ctr/curl command use "$REG/$REPO:null".
if [ -z "$TAG" ] || [ "$TAG" = "null" ]; then
  echo "ERROR: could not extract tag from $TAR; set \$TAG manually." >&2
  exit 1
fi

echo "$TAG" # should print something like v0.1.0
```
Check whether the image tag already exists in Harbor:

```bash
URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG"

HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL")

echo "HTTP=$HTTP URL=$URL"

[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \
|| jq -r '.errors[]?.message' /tmp/harbor-artifact.json
```

If the Harbor project does not exist yet, create it before pushing:

```bash
PROJECT="${REPO%%/*}"

curl -s -u "$AUTH" -X POST "http://$REG/api/v2.0/projects" \
-H 'Content-Type: application/json' \
-d "{\"project_name\":\"$PROJECT\",\"public\":false}" \
-w '\nHTTP %{http_code}\n'
```

If the project already exists, Harbor returns `409 Conflict`, which is safe to ignore. After confirming the project exists, continue with the import and push.

Then run the import and push procedure:

```bash
# 1. Import into the node's containerd content store.
# --base-name prepends $REG/$REPO to the tag carried inside the tarball,
# producing a fully-qualified reference $REG/$REPO:$TAG.
ctr -n k8s.io images import \
--all-platforms \
--base-name "$REG/$REPO" \
"$TAR"

# 2. Verify the import. You should see "$REG/$REPO:$TAG".
ctr -n k8s.io images ls -q | grep "$REPO"

# 3. Push to Harbor. Use --plain-http only for HTTP Harbor.
ctr -n k8s.io images push \
--plain-http \
--user "$AUTH" \
"$REG/$REPO:$TAG"

# 4. Clean up the local reference on the node. Blob data is reclaimed by
# containerd's garbage collector, leaving no persistent state on the node.
ctr -n k8s.io images rm "$REG/$REPO:$TAG"
```

Repeat this procedure for each built-in model tarball, varying `$REPO`, `$TAG`, and `$TAR` per model.
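
The per-model repetition can be sketched as a loop. This is a hypothetical helper, not part of the product: the file-name-to-repository mapping below is an assumption and should be adjusted to your models.

```bash
# Batch import: derive $REPO and $TAG per tarball, then import, push, and clean up.
for TAR in ./*.oci.tar; do
  name=$(basename "$TAR" .oci.tar | tr '[:upper:]' '[:lower:]')
  REPO="mlops/modelcar-${name}"  # assumed naming scheme
  TAG=$(tar -xOf "$TAR" index.json \
    | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')
  if [ -z "$TAG" ] || [ "$TAG" = "null" ]; then
    echo "Skipping $TAR: no tag annotation found" >&2
    continue
  fi
  ctr -n k8s.io images import --all-platforms --base-name "$REG/$REPO" "$TAR"
  ctr -n k8s.io images push --plain-http --user "$AUTH" "$REG/$REPO:$TAG"
  ctr -n k8s.io images rm "$REG/$REPO:$TAG"
done
```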

:::info
`--all-platforms` is critical at the **import** step: omitting it imports only the node's host architecture, and the subsequent push will silently miss the other platform's blobs. The flag is not needed on `push` — pushing the multi-arch index automatically pushes all platforms it references.
:::

### Verifying the Harbor import

Confirm that Harbor now serves the image:

```bash
URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG"

HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL")

echo "HTTP=$HTTP URL=$URL"

[ "$HTTP" = 200 ] && jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json \
|| jq -r '.errors[]?.message' /tmp/harbor-artifact.json
```

`HTTP=200` means the image was successfully imported into Harbor. Expected output includes the digest, size, push time, tag, and platform information:

```json
{
"digest": "sha256:...",
"size": 123456789,
"push_time": "2026-05-06T00:00:00.000Z",
"arch": "amd64",
"tags": ["v0.1.0"],
"platforms": [
{"architecture": "amd64", "os": "linux"},
{"architecture": "arm64", "os": "linux"}
]
}
```

</Steps>

The core capabilities of Alauda AI are now deployed. To explore the product quickly, see the [Quick Start](../../overview/quick_start.mdx).

## FAQ
15 changes: 15 additions & 0 deletions docs/en/installation/pre-configuration.mdx

In Alauda AI, GitLab is the core component for **Model Management**. Before deploying Alauda AI, you **must prepare** a GitLab service.

:::warning
GitLab-backed model storage is deprecated. It remains available for compatibility only and is planned for removal in a future Alauda AI release. For built-in models and new model delivery workflows, use **Model Catalog** with OCI model artifacts instead.
:::

### **Deployment Options**

#### **1. GitLab service requirements**

</Callouts>

## **Preparing the Harbor Service**

If you plan to use the **Model Catalog** feature, prepare an **Alauda Build of Harbor** service before installing Alauda AI. The registry must meet the following requirements:

- Run in production mode. Use HTTPS for production deployments.
- Allow inference cluster nodes to pull Model Catalog images without image pull credentials.

If you cannot deploy an HTTPS registry in the target environment, you can use an HTTP registry as a fallback, but you must configure the inference cluster container runtime before deploying models. The detailed containerd configuration is covered when setting **Model OCI Registry Address** during Alauda AI instance creation.

For the installation procedure, see <ExternalSiteLink name="alauda-build-of-harbor" href="/install/01_installation_guide.html" children="Deploy Alauda Build of Harbor" />.
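
A quick way to check anonymous pull access from an inference cluster node is to query the registry v2 API without credentials (the project and repository names below are placeholders):

```bash
# HTTP 200 means anonymous pull is allowed; 401 means the project still
# requires authentication and must be made public in Harbor.
curl -s -o /dev/null -w '%{http_code}\n' \
  "https://<registry-host>/v2/<project>/<repository>/tags/list"
```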

## **Frequently Asked Questions (FAQ)**

### **1. How to optimize GitLab 18.5 and later configuration for large LFS objects?**
Expand Down
3 changes: 3 additions & 0 deletions sites.yaml
```yaml
- name: alauda-build-of-gitlab
  base: /alauda-build-of-gitlab
  version: v18.5
- name: alauda-build-of-harbor
  base: /alauda-build-of-harbor
  version: "2.14"
- name: servicemeshv1
  base: /servicemeshv1
  version: "4.3"
```