Commit 1cf3ef3 (parent a867802)

docs: Add Cilium eBPF L4 load balancer solution documentation

- Add note on the VIP Layer 2 network requirement
- Add description of the Custom network mode
- Generate the English version of the document

(cherry picked from commit f823a0122ba89d5658d160588d5040b750d3c72c)

---
id: KB260300001
products:
  - Alauda Container Platform
kind:
  - Solution
sourceSHA: pending
---

# High-Performance Container Networking with Cilium CNI and eBPF-based L4 Load Balancer (Source IP Preservation)

This document describes how to deploy Cilium CNI in an ACP 4.2+ cluster and leverage eBPF to implement high-performance Layer 4 load balancing with source IP preservation.

## Prerequisites

| Item | Requirement |
|------|------|
| ACP Version | 4.2+ |
| Network Mode | Custom Mode |
| Architecture | x86_64 / amd64 |

> **OS Requirements**: Refer to the ACP product documentation [OS Baseline](https://docs.alauda.io/container_platform/4.1/install/prepare/prerequisites.html).
>
> **Note**: Cilium/eBPF requires Linux kernel 4.19+ (5.10+ recommended). The following operating systems are **NOT supported**:
> - CentOS 7.x (kernel version 3.10.x)
> - RHEL 7.x (kernel version 3.10.x - 4.18.x)
>
> Recommended:
> - RHEL 8.x
> - Ubuntu 22.04
> - MicroOS

### Node Port Requirements

| Port | Component | Description |
|------|------|------|
| 4240 | cilium-agent | Health API |
| 9962 | cilium-agent | Prometheus Metrics |
| 9879 | cilium-agent | Envoy Metrics |
| 9890 | cilium-agent | Agent Metrics |
| 9963 | cilium-operator | Prometheus Metrics |
| 9891 | cilium-operator | Operator Metrics |
| 9234 | cilium-operator | Metrics |

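The port requirements above can be checked up front. The following is a rough sketch to run on each node before installation; it assumes the `ss` utility from iproute2 is available:

```shell
# Check that none of the ports required by Cilium are already in use.
for port in 4240 9962 9879 9890 9963 9891 9234; do
  if ss -ltn 2>/dev/null | grep -q ":${port} "; then
    echo "IN USE  ${port}"
  else
    echo "free    ${port}"
  fi
done
```

Any port reported as in use must be freed (or the conflicting service moved) before deploying Cilium.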
### Kernel Configuration Requirements

Ensure the following kernel configurations are enabled on the nodes (can be checked via `grep` in `/boot/config-$(uname -r)`):

- `CONFIG_BPF=y` or `=m`
- `CONFIG_BPF_SYSCALL=y` or `=m`
- `CONFIG_NET_CLS_BPF=y` or `=m`
- `CONFIG_BPF_JIT=y` or `=m`
- `CONFIG_NET_SCH_INGRESS=y` or `=m`
- `CONFIG_CRYPTO_USER_API_HASH=y` or `=m`

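The checks above can be scripted. A minimal sketch, assuming the running kernel's config is exposed at `/boot/config-$(uname -r)` (some distributions expose it at `/proc/config.gz` instead):

```shell
# Report each required BPF-related kernel option as OK (=y or =m) or MISSING.
CONFIG_FILE="/boot/config-$(uname -r)"
for opt in CONFIG_BPF CONFIG_BPF_SYSCALL CONFIG_NET_CLS_BPF \
           CONFIG_BPF_JIT CONFIG_NET_SCH_INGRESS CONFIG_CRYPTO_USER_API_HASH; do
  if grep -Eq "^${opt}=(y|m)" "$CONFIG_FILE" 2>/dev/null; then
    echo "OK      $opt"
  else
    echo "MISSING $opt"
  fi
done
```

Any option reported as MISSING means the node's kernel must be rebuilt or replaced before Cilium can run.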
## ACP 4.x Cilium Deployment Steps

### Step 1: Create Cluster

On the cluster creation page, set **Network Mode** to **Custom**. Wait until the cluster reaches the `EnsureWaitClusterModuleReady` status before deploying Cilium.

### Step 2: Install Cilium

1. Download the latest Cilium image package (v4.2.x) from the ACP marketplace.

2. Upload it to the platform using violet:

   ```bash
   export PLATFORM_URL=""
   export USERNAME=''
   export PASSWORD=''
   export CLUSTER_NAME=''

   violet push cilium-v4.2.17.tgz --platform-address "$PLATFORM_URL" --platform-username "$USERNAME" --platform-password "$PASSWORD" --clusters "$CLUSTER_NAME"
   ```

3. Create a temporary RBAC configuration on the business cluster where Cilium will be installed (this RBAC permission is not yet present before the cluster has been fully deployed):

   Create the temporary RBAC configuration file:

   ```bash
   cat > tmp.yaml << 'EOF'
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRole
   metadata:
     name: cilium-clusterplugininstance-admin
     labels:
       app.kubernetes.io/name: cilium
   rules:
     - apiGroups: ["cluster.alauda.io"]
       resources: ["clusterplugininstances"]
       verbs: ["*"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRoleBinding
   metadata:
     name: cilium-admin-clusterplugininstance
     labels:
       app.kubernetes.io/name: cilium
   roleRef:
     apiGroup: rbac.authorization.k8s.io
     kind: ClusterRole
     name: cilium-clusterplugininstance-admin
   subjects:
     - apiGroup: rbac.authorization.k8s.io
       kind: User
       name: admin
   EOF
   ```

   Apply the temporary RBAC configuration:

   ```bash
   kubectl apply -f tmp.yaml
   ```

4. Navigate to **Administrator → Marketplace → Cluster Plugins** and install Cilium.

5. After Cilium is successfully installed, delete the temporary RBAC configuration:

   ```bash
   kubectl delete -f tmp.yaml
   rm tmp.yaml
   ```

## Create L4 Load Balancer with Source IP Preservation

Execute the following operations from a shell on a master node.

### Step 1: Remove kube-proxy and Clean Up Rules

1. Get the current kube-proxy image:

   ```bash
   kubectl get -n kube-system ds kube-proxy -oyaml | grep image
   ```

2. Back up and delete the kube-proxy DaemonSet:

   ```bash
   kubectl -n kube-system get ds kube-proxy -oyaml > kube-proxy-backup.yaml

   kubectl -n kube-system delete ds kube-proxy
   ```


3. Create a BroadcastJob to clean up kube-proxy rules:

   ```yaml
   apiVersion: operator.alauda.io/v1alpha1
   kind: BroadcastJob
   metadata:
     name: kube-proxy-cleanup
     namespace: kube-system
   spec:
     completionPolicy:
       ttlSecondsAfterFinished: 300
       type: Always
     failurePolicy:
       type: FailFast
     template:
       metadata:
         labels:
           k8s-app: kube-proxy-cleanup
       spec:
         serviceAccountName: kube-proxy
         hostNetwork: true
         restartPolicy: Never
         nodeSelector:
           kubernetes.io/os: linux
         priorityClassName: system-node-critical
         tolerations:
           - operator: Exists
         containers:
           - name: kube-proxy-cleanup
             image: registry.alauda.cn:60070/tkestack/kube-proxy:v1.33.5 # Replace with the kube-proxy image from Step 1
             imagePullPolicy: IfNotPresent
             command:
               - /bin/sh
               - -c
               - "/usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=$(NODE_NAME) --cleanup || true"
             env:
               - name: NODE_NAME
                 valueFrom:
                   fieldRef:
                     apiVersion: v1
                     fieldPath: spec.nodeName
             securityContext:
               privileged: true
             volumeMounts:
               - mountPath: /var/lib/kube-proxy
                 name: kube-proxy
               - mountPath: /lib/modules
                 name: lib-modules
                 readOnly: true
               - mountPath: /run/xtables.lock
                 name: xtables-lock
         volumes:
           - name: kube-proxy
             configMap:
               name: kube-proxy
           - name: lib-modules
             hostPath:
               path: /lib/modules
               type: ""
           - name: xtables-lock
             hostPath:
               path: /run/xtables.lock
               type: FileOrCreate
   ```


   Save this as `kube-proxy-cleanup.yaml` and apply it:

   ```bash
   kubectl apply -f kube-proxy-cleanup.yaml
   ```

   The BroadcastJob is configured with `ttlSecondsAfterFinished: 300`, so it is automatically cleaned up within 5 minutes after completion.

### Step 2: Create Address Pool

> **VIP Address Requirement**: Cilium L2 Announcement implements IP failover through ARP broadcasting. Therefore, the VIP must be in the **same Layer 2 network** as the cluster nodes to ensure ARP requests can be properly broadcast and answered.

Save the following as `lb-resources.yaml`:

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool
spec:
  blocks:
    - cidr: "192.168.132.192/32" # Replace with the actual VIP segment
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-policy
spec:
  interfaces:
    - eth0 # Replace with the actual network interface name
  externalIPs: true
  loadBalancerIPs: true
```

Apply the configuration:

```bash
kubectl apply -f lb-resources.yaml
```

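After applying, you can optionally confirm that both resources were created. A sketch using the resource names from the manifest above (`|| true` keeps the check non-fatal on clusters where the Cilium CRDs are not yet registered):

```shell
# Resource names taken from lb-resources.yaml above.
POOL="lb-pool"
POLICY="l2-policy"

# List the address pool and the L2 announcement policy; non-fatal if absent.
kubectl get ciliumloadbalancerippools.cilium.io "$POOL" || true
kubectl get ciliuml2announcementpolicies.cilium.io "$POLICY" || true
```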
### Step 3: Verification

Create a LoadBalancer Service to verify IP allocation and test connectivity.

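The test Service itself is not included in this document. A minimal sketch is shown below; the `test` name matches the expected output in Verification 1, while the `app: nginx` selector and port are assumptions about your test workload:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test
spec:
  type: LoadBalancer
  selector:
    app: nginx        # assumed label of an existing test Deployment
  ports:
    - port: 80
      targetPort: 80
```

Once applied, Cilium's LB IPAM should assign the Service an `EXTERNAL-IP` from the address pool created above.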
**Verification 1: Check whether the LB Service has been assigned an IP**

```bash
kubectl get svc -A
```

Expected output example:

```
NAMESPACE      NAME   TYPE           CLUSTER-IP   EXTERNAL-IP       PORT(S)        AGE
cilium-123-1   test   LoadBalancer   10.4.98.81   192.168.132.192   80:31447/TCP   35s
```

**Verification 2: Check the leader node that sends ARP requests**

```bash
kubectl get leases -A | grep cilium
```

Expected output example:

```
cpaas-system   cilium-l2announce-cilium-123-1-test   192.168.141.196   24s
```

**Verification 3: Test external access**

From an external client, access the LoadBalancer Service. Capturing packets inside the Pod should show the client's IP as the source address, confirming that the source IP is preserved.
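As a concrete sketch of this check (the VIP value is the example address from earlier steps, and the `tcpdump` invocation assumes the tool is present in the Pod image; substitute your Service's actual `EXTERNAL-IP` and Pod name):

```shell
# Hypothetical values; replace with your Service's EXTERNAL-IP.
VIP="192.168.132.192"

# From a client outside the cluster, send a request to the VIP; `|| true`
# keeps the probe non-fatal if the VIP is unreachable from this host.
curl --max-time 5 "http://${VIP}/" || true

# Inside the backend Pod, capture the incoming traffic; the source address
# should be the client's IP, not a node or SNAT address.
# (<pod-name> is a placeholder; tcpdump must exist in the Pod image.)
# kubectl exec -it <pod-name> -- tcpdump -nn -i any tcp port 80
```

If the capture shows a node IP instead of the client IP, the traffic is still being SNATed and source IP preservation is not in effect.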
