Commit 5cd2a1d

Giluerre and Daniel Beres authored
feat: add SONiC CLOS network lab (#79)
* feat: add SONiC CLOS network lab
* feat: fix the license issue in the CLOS directory
* feat: add redeploy.sh to the CLOS directory, separate logic from the makefile, automatically create Custom Resources
* feat: update README.md for clos emulation

Signed-off-by: Daniel Beres <daniel.beres@sap.com>
Co-authored-by: Daniel Beres <daniel.beres@sap.com>
1 parent 26ed10c commit 5cd2a1d

8 files changed

Lines changed: 435 additions & 0 deletions

REUSE.toml

Lines changed: 1 addition & 0 deletions
```diff
@@ -26,6 +26,7 @@ SPDX-PackageDownloadLocation = "https://github.com/ironcore-dev/sonic-operator"
     "README.md",
     "internal/agent/hack/**",
     "internal/agent/proto/**",
+    "test/emulation/**",
 ]
 precedence = "aggregate"
 SPDX-FileCopyrightText = "2025 SAP SE or an SAP affiliate company and IronCore contributors"
```

test/emulation/clos/README.md

Lines changed: 112 additions & 0 deletions
# SONiC Lab Topology (CLOS)

A containerized network lab environment running SONiC switches in a CLOS topology, orchestrated on Kubernetes using Clabernetes.

## Overview

This project sets up a complete network topology with:

- **2 Spine switches** (SONiC VMs)
- **2 Leaf switches** (SONiC VMs)
- **2 Client nodes** (Linux multitool containers)

The topology implements a standard data center CLOS architecture.

### Topology Diagram

![CLOS Topology](clos_topology.svg)

## Prerequisites

The following tools must be installed on the host:

- **kind** - Kubernetes in Docker (for a local Kubernetes cluster)
- **kubectl** - Kubernetes command-line tool
- **sshpass** - SSH password automation utility
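A quick way to confirm the tools are present before deploying (an optional sketch, not part of the lab scripts):

```bash
# Report any missing prerequisite
for tool in kind kubectl sshpass; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
```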
## Project Structure

```
clos/
├── clos.clab.yml - Network topology definition (YAML)
├── deploy.sh     - Deployment automation script
├── init_setup.sh - Node initialization and agent setup
├── destroy.sh    - Infrastructure cleanup script
└── README.md     - This file
```

## Dependencies

### Software Packages

```
docker     - Container runtime
kubernetes - Container orchestration
helm       - Package manager
kubectl    - Kubernetes CLI
sshpass    - SSH password utility
jq         - JSON processor
```

### Kubernetes Services

- Clabernetes - deployed via Helm in the `c9s` namespace
- kube-vip - RBAC and manifests applied to the cluster
- kube-vip Cloud Controller - deployed in the `kube-vip` namespace

## Configuration Details

### IP Management

- kube-vip external IP range: `172.18.1.10 - 172.18.1.250`
- Services are exposed via kube-vip in ARP mode on `eth0`
## Setup Steps

### 1. Prerequisites

Ensure all dependencies are installed and the Kubernetes tooling above is available.

### 2. Deploy the Lab Environment

Deploy the full topology to Kubernetes:

```bash
./deploy.sh
```

**What it does**:

- Creates the Kind cluster `clos-lab-kind`
- Installs CRDs
- Installs Clabernetes via Helm in the `c9s` namespace
- Applies kube-vip RBAC policies
- Deploys the kube-vip cloud controller
- Creates the kube-vip ConfigMap with the IP range
- Deploys the kube-vip ARP DaemonSet
- Converts the containerlab topology to Kubernetes resources (see the sketch below)
- Applies the topology configuration to the cluster
- Waits for pods to become ready (up to 300 seconds)
- Configures DNS, then pulls and starts the SONiC agent on port 57400 on each SONiC node via SSH
- Creates `Switch` custom resources for the switches
- Displays external IPs for all services
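The topology conversion step boils down to the clabverter invocation below (simplified from `deploy.sh`, which also diffs the output against the cluster before applying):

```bash
docker run --user "$(id -u)" -v "$(pwd):/clabernetes/work" --rm \
  ghcr.io/srl-labs/clabernetes/clabverter --stdout --naming non-prefixed \
  | kubectl apply -f -
```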
### 3. Access the Lab

After successful deployment, retrieve the external IPs:

```bash
# View all services with external IPs
kubectl get -n c9s-clos svc

# SSH into a specific SONiC node (default credentials: admin/admin)
ssh admin@<external-ip>

# Example
ssh admin@172.18.1.15
```
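To check that the on-switch agent is reachable before the operator connects (a sketch; any TCP probe of port 57400 works):

```bash
# Probe the agent port on a switch's external IP
nc -vz 172.18.1.15 57400
```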
### 4. Cleanup

Tear down the entire lab environment:

```bash
./destroy.sh
```

**What it does**:

- Deletes the `c9s-clos` namespace (all topology resources)
- Deletes the `c9s` namespace (Clabernetes)
- Removes the kube-vip ConfigMap, DaemonSet, and cloud controller
- Cleans up kube-vip RBAC resources
- Removes all related Kubernetes objects

test/emulation/clos/clos.clab.yml

Lines changed: 29 additions & 0 deletions
```yaml
name: clos

topology:
  nodes:
    sonic-spine1:
      kind: sonic-vm
      image: dberes/sonic-vs:latest
    sonic-spine2:
      kind: sonic-vm
      image: dberes/sonic-vs:latest
    sonic-leaf1:
      kind: sonic-vm
      image: dberes/sonic-vs:latest
    sonic-leaf2:
      kind: sonic-vm
      image: dberes/sonic-vs:latest
    client1:
      kind: linux
      image: ghcr.io/hellt/network-multitool:latest
    client2:
      kind: linux
      image: ghcr.io/hellt/network-multitool:latest
  links:
    - endpoints: ["sonic-spine1:eth1", "sonic-leaf1:eth1"]
    - endpoints: ["sonic-spine1:eth2", "sonic-leaf2:eth1"]
    - endpoints: ["sonic-spine2:eth1", "sonic-leaf1:eth2"]
    - endpoints: ["sonic-spine2:eth2", "sonic-leaf2:eth2"]
    - endpoints: ["sonic-leaf1:eth3", "client1:eth1"]
    - endpoints: ["sonic-leaf2:eth3", "client2:eth1"]
```
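In this lab the file is consumed by clabverter, but the same topology can be brought up directly for a quick local check (assuming containerlab is installed and can run the `sonic-vm` node kind):

```bash
# Deploy the topology with plain containerlab, bypassing Kubernetes
containerlab deploy -t clos.clab.yml
```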

test/emulation/clos/clos_topology.svg

Lines changed: 4 additions & 0 deletions

test/emulation/clos/deploy.sh

Lines changed: 138 additions & 0 deletions
```bash
#!/bin/bash

# SPDX-FileCopyrightText: 2025 SAP SE or an SAP affiliate company and IronCore contributors
# SPDX-License-Identifier: Apache-2.0

set -eu

# Set up the Kind cluster for the lab if it does not exist
KIND_CLUSTER="clos-lab-kind"
echo "Setting up Kind cluster for tests..."
if ! command -v kind &> /dev/null; then
  echo "Kind is not installed. Please install Kind manually."
  exit 1
fi

if kind get clusters 2>/dev/null | grep -q "^${KIND_CLUSTER}$"; then
  echo "Kind cluster '${KIND_CLUSTER}' already exists. Skipping creation."
else
  echo "Creating Kind cluster '${KIND_CLUSTER}'..."
  kind create cluster --name "${KIND_CLUSTER}"
fi

# Go to the git repo root
pushd "$(git rev-parse --show-toplevel)" || exit 1

echo "Installing CRDs..."
make install
# Return to the original directory
popd || exit 1

HELM="docker run --network host -ti --rm -v $(pwd):/apps -w /apps \
  -v $HOME/.kube:/root/.kube -v $HOME/.helm:/root/.helm \
  -v $HOME/.config/helm:/root/.config/helm \
  -v $HOME/.cache/helm:/root/.cache/helm \
  alpine/helm:3.12.3"

CLABVERTER="sudo docker run --user $(id -u) -v $(pwd):/clabernetes/work --rm ghcr.io/srl-labs/clabernetes/clabverter"

$HELM upgrade --install --create-namespace --namespace c9s \
  clabernetes oci://ghcr.io/srl-labs/clabernetes/clabernetes

kubectl apply -f https://kube-vip.io/manifests/rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/kube-vip/kube-vip-cloud-provider/main/manifest/kube-vip-cloud-controller.yaml
kubectl create configmap --namespace kube-system kubevip \
  --from-literal range-global=172.18.1.10-172.18.1.250 || true

# Set up the kube-vip CLI
KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | \
  jq -r ".[0].name")
KUBEVIP="docker run --network host \
  --rm ghcr.io/kube-vip/kube-vip:$KVVERSION"
# Install the kube-vip load balancer daemonset in ARP mode
$KUBEVIP manifest daemonset --services --inCluster --arp --interface eth0 | \
  kubectl apply -f -

echo "Checking for configuration changes..."
CONFIG=$($CLABVERTER --stdout --naming non-prefixed)

if echo "$CONFIG" | kubectl diff -f - > /dev/null 2>&1; then
  echo "No changes detected, skipping apply and wait"
else
  echo "Changes detected, applying configuration..."
  echo "$CONFIG" | kubectl apply -f -

  # Wait for services to be ready
  echo "Waiting for services to be ready..."
  kubectl wait --namespace c9s --for=condition=ready --timeout=300s pods --all
  kubectl wait --namespace c9s-clos --for=condition=ready --timeout=300s pods --all

  # Run the init script on each SONiC node
  echo "Provisioning SONiC nodes..."
  for service in $(kubectl get -n c9s-clos svc -o jsonpath='{.items[*].metadata.name}' 2>/dev/null | tr ' ' '\n' | grep '^sonic-' | grep -v '\-vx$'); do
    # Wait until the service has an external IP assigned by kube-vip
    until h=$(kubectl get svc "$service" -n c9s-clos -o jsonpath='{.status.loadBalancer.ingress[0].ip}') && [ -n "$h" ]; do
      echo "Waiting for external IP..."
      sleep 1
    done

    echo "Running init_setup.sh on $h"
    max_attempts=36 # 36 attempts with 10 seconds sleep = 6 minutes total wait time
    attempt=1
    while [ $attempt -le $max_attempts ]; do
      if sshpass -p 'admin' ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null admin@"$h" 'bash -s' < init_setup.sh; then
        echo "Successfully provisioned $h"
        break
      else
        if [ $attempt -lt $max_attempts ]; then
          echo "Provisioning attempt $attempt of $max_attempts failed for $h. Retrying in 10 seconds..."
          sleep 10
        else
          echo "Failed to provision $h after $max_attempts attempts"
        fi
      fi
      ((attempt++))
    done
  done
fi

echo ""
echo "=========================================="
echo "SONiC Lab Topology - External IPs"
echo "=========================================="
for service in $(kubectl get -n c9s-clos svc -o jsonpath='{.items[*].metadata.name}' 2>/dev/null | tr ' ' '\n' | grep -v '\-vx$'); do
  ip=$(kubectl get -n c9s-clos svc "$service" -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
  if [ -n "$ip" ]; then
    echo "$service -> $ip"

    if [[ "$service" == *sonic* ]]; then
      cat <<EOF | kubectl apply -f -
apiVersion: sonic.networking.metal.ironcore.dev/v1alpha1
kind: Switch
metadata:
  labels:
    app.kubernetes.io/name: sonic-operator
    app.kubernetes.io/managed-by: kustomize
  name: $service
  namespace: c9s-clos
spec:
  management:
    host: $ip
    port: "57400"
    credentials:
      name: switchcredentials-sample
  macAddress: "aa:bb:cc:dd:ee:ff"
EOF
    fi
  fi
done

echo ""
echo "Script ended successfully"
```

test/emulation/clos/destroy.sh

Lines changed: 55 additions & 0 deletions
```bash
#!/bin/bash

# SPDX-FileCopyrightText: 2025 SAP SE or an SAP affiliate company and IronCore contributors
# SPDX-License-Identifier: Apache-2.0

set -eu

echo "Starting destruction of SONiC lab infrastructure..."

# Delete the c9s-clos namespace (contains all topology resources)
echo "Deleting c9s-clos namespace..."
kubectl delete namespace c9s-clos --ignore-not-found=true
sleep 10

# Delete the c9s namespace (contains clabernetes)
echo "Deleting c9s namespace..."
kubectl delete namespace c9s --ignore-not-found=true
sleep 10

# Remove kube-vip configmap
echo "Removing kube-vip configmap..."
kubectl delete configmap -n kube-system kubevip --ignore-not-found=true

# Remove kube-vip daemonset
echo "Removing kube-vip daemonset..."
kubectl delete daemonset -n kube-system kube-vip-ds --ignore-not-found=true

# Remove kube-vip cloud controller
echo "Removing kube-vip cloud controller deployment..."
kubectl delete deployment -n kube-vip kube-vip-cloud-provider --ignore-not-found=true

# Remove kube-vip namespace if empty
echo "Cleaning up kube-vip namespace..."
kubectl delete namespace kube-vip --ignore-not-found=true

# Remove RBAC resources for kube-vip
echo "Removing kube-vip RBAC resources..."
kubectl delete clusterrole system:kube-vip-role --ignore-not-found=true
kubectl delete clusterrole system:kube-vip-cloud-controller-role --ignore-not-found=true
kubectl delete clusterrolebinding system:kube-vip-binding --ignore-not-found=true
kubectl delete clusterrolebinding system:kube-vip-cloud-controller-binding --ignore-not-found=true
kubectl delete serviceaccount -n kube-system kube-vip --ignore-not-found=true
kubectl delete serviceaccount -n kube-vip kube-vip-cloud-controller --ignore-not-found=true

echo "Destruction complete!"
echo "All SONiC lab resources have been removed."

# Tear down the Kind cluster; the name must match the one created by deploy.sh
KIND_CLUSTER="clos-lab-kind"
echo "Tearing down Kind cluster '${KIND_CLUSTER}'..."
if command -v kind &> /dev/null; then
  kind delete cluster --name "${KIND_CLUSTER}" 2>/dev/null || true
else
  echo "Kind is not installed, skipping cluster cleanup."
fi
```
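A quick way to confirm the teardown completed (neither namespace should be listed afterwards):

```bash
# Both namespaces should be absent once destroy.sh finishes
kubectl get namespace c9s c9s-clos --ignore-not-found
```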

test/emulation/clos/init_setup.sh

Lines changed: 30 additions & 0 deletions
```bash
#!/bin/bash

# SPDX-FileCopyrightText: 2025 SAP SE or an SAP affiliate company and IronCore contributors
# SPDX-License-Identifier: Apache-2.0

set -euo pipefail

IMAGE="ghcr.io/ironcore-dev/sonic-agent:sha-966298d"

echo "Configuring DNS..."
if [ -d "/etc/resolvconf/resolv.conf.d" ]; then
  echo "nameserver 8.8.8.8" | sudo tee /etc/resolvconf/resolv.conf.d/head
  sudo /sbin/resolvconf --enable-updates
  sudo /sbin/resolvconf -u
  sudo /sbin/resolvconf --disable-updates
else
  echo "Warning: resolvconf not found, skipping DNS configuration"
fi

echo "Removing old agent container if it exists..."
docker rm -f switch-operator-agent 2>/dev/null || true

echo "Pulling agent image..."
docker pull "$IMAGE"

echo "Starting agent container..."
docker run --pull always -d \
  --name switch-operator-agent \
  --entrypoint /switch-agent-server \
  --network host \
  --restart unless-stopped \
  -v /var/run/dbus:/var/run/dbus:rw \
  "$IMAGE" -port 57400

echo "Agent setup completed successfully"
```
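To verify on a switch that the agent container came up (a sketch reusing the lab's default credentials):

```bash
# Show the agent container's name and status on a SONiC node
sshpass -p 'admin' ssh -o StrictHostKeyChecking=no admin@<external-ip> \
  'docker ps --filter name=switch-operator-agent --format "{{.Names}}: {{.Status}}"'
```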
