Commit 107cf46

coco: adding confidential documentation
Basic markdown file with deployment steps. Signed-off-by: Beraldo Leal <bleal@redhat.com>
1 parent c41a0ed commit 107cf46

1 file changed

Lines changed: 174 additions & 0 deletions

docs/CONFIDENTIAL-CONTAINERS.md

@@ -0,0 +1,174 @@
# Confidential Containers Integration

This document describes how to deploy the Layered Zero Trust Validated
Pattern with Confidential Containers (CoCo) support. CoCo extends the
pattern with hardware-rooted workload identity: the SPIRE agent runs
inside a confidential VM (peer pod) and performs x509pop node attestation
with certificates that the Key Broker Service (KBS) releases only after
the TEE hardware attestation succeeds.

## Architecture

In a production deployment, Trustee (the attestation server) should run
on a separate trusted cluster, since it verifies the integrity of the
infrastructure where workloads run. Running it on the same cluster
means the attestation server shares the untrusted infrastructure it is
supposed to verify. A single-cluster deployment is fine for development
and testing.

The SPIRE agent runs as a sidecar container inside each CoCo peer pod.
This is different from the regular ZTVP deployment where agents run as
a DaemonSet on each node. In the CoCo model, the agent must be inside
the confidential VM so that its identity is rooted in hardware
attestation. Each CoCo workload gets its own SPIRE agent instance.

The trust chain:

1. Peer pod VM created inside a TEE (AMD SEV-SNP or Intel TDX)
2. Confidential Data Hub (CDH) inside the TEE attests to KBS
3. KBS validates the TEE evidence and returns sealed secrets
4. SPIRE agent loads x509pop certificates from the unsealed secrets
5. Agent connects to SPIRE server and performs x509pop node attestation
6. Workload receives X509-SVID via Unix attestation through spiffe-helper

## Prerequisites

- Cloud provider region with confidential VM quota for peer pod VMs
  (worker nodes themselves do not need to be confidential)
- Vault as the secret backend

### Azure Instance Types

Azure confidential VM SKU families:

- DCasv5: AMD Milan (SEV-SNP)
- DCasv6: AMD Genoa (SEV-SNP)
- DCesv6: Intel TDX

Availability varies by region. The default configuration uses
Standard_DC2as_v6. Change the VM flavor in values-coco-dev.yaml under
the sandbox-policies app overrides if your region requires a different
SKU.

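Before changing the flavor, it can help to check which of the families above your region offers. The following is a minimal sketch, assuming the Azure CLI (`az`) is installed and logged in. The function names and the example region are illustrative and not part of the pattern. The listing reflects availability to your subscription, not remaining quota.

```bash
#!/usr/bin/env bash
# Sketch: list the confidential VM sizes from the families above that a region offers.
# Assumes the Azure CLI (az) is installed and logged in; function names are illustrative.

# Keep only DCasv5, DCasv6 and DCesv6 sizes from a list of SKU names on stdin.
filter_confidential_skus() {
  grep -E '^Standard_DC[0-9]+(as_v5|as_v6|es_v6)$'
}

# Query the VM sizes available to the current subscription in the given region.
list_confidential_skus() {
  local region="$1"
  az vm list-skus --location "$region" --resource-type virtualMachines \
    --size Standard_DC --query '[].name' --output tsv | filter_confidential_skus
}

# Example (the region name is a placeholder):
#   list_confidential_skus eastus
```

If Standard_DC2as_v6 is missing from the output, pick one of the listed sizes and set it as the VM flavor as described above.
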
## Deployment

### 1. Configure clusterGroupName

Edit values-global.yaml and set the clusterGroupName to coco-dev:

```yaml
main:
  clusterGroupName: coco-dev
```

Commit and push this change before deploying.

### 2. Generate secrets

Run the pre-deployment scripts from the pattern root:

```bash
./scripts/gen-secrets-coco.sh
./scripts/get-pcr.sh
```

gen-secrets-coco.sh creates the cryptographic keys that Trustee (the
attestation server) needs to authenticate requests. It also copies the
values-secret template if it is not already present. The script is safe
to re-run because it does not overwrite existing files.

get-pcr.sh retrieves the expected hardware measurements for the
confidential VM image. Trustee compares these against the measurements
reported by the actual hardware to decide whether a VM is genuine. The
script requires a Red Hat pull secret. It defaults to ~/pull-secret.json
and can be overridden with the PULL_SECRET env var.

Both scripts output to ~/.config/validated-patterns/trustee/.

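A quick sanity check after running the scripts is to confirm that the output directory exists and is not empty. This is a minimal sketch; the function name is hypothetical, and it deliberately does not check for specific file names, since those are defined by the scripts themselves.

```bash
#!/usr/bin/env bash
# Sketch: confirm the pre-deployment scripts produced output.
# The function name is illustrative; no specific file names are assumed.

trustee_outputs_present() {
  local dir="${1:-$HOME/.config/validated-patterns/trustee}"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # An empty result means the directory holds no regular files.
  if [ -z "$(find "$dir" -type f | head -n 1)" ]; then
    echo "no files found in: $dir" >&2
    return 1
  fi
  return 0
}

# Example:
#   trustee_outputs_present && echo "trustee outputs found"
```
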
### 3. Edit the secrets template

Edit ~/.config/validated-patterns/values-secret-layered-zero-trust.yaml
and uncomment the CoCo secrets section. Each secret has inline comments
in the template explaining its purpose and how to populate it.

### 4. Deploy

```bash
# If deploying from a fork, set TARGET_ORIGIN to your git remote name:
# TARGET_ORIGIN=myfork ./pattern.sh make install
./pattern.sh make install
```

Wait for all ArgoCD apps to reach Healthy/Synced. CoCo apps (sandbox,
trustee, sandbox-policies) reference CRDs created by the operators. On
first deploy, ArgoCD may try to sync these apps before the operator
has finished installing and registering its CRDs, so the first sync
attempts can fail. This is normal and resolves automatically once the
operator CSV succeeds and ArgoCD retries the sync.

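Waiting for the apps can be scripted. The sketch below assumes `oc` is logged in to the cluster and that the Argo CD Application resources expose `.status.sync.status` and `.status.health.status`. The function names, the 30-second interval and the attempt limit are illustrative choices, not part of the pattern.

```bash
#!/usr/bin/env bash
# Sketch: poll Argo CD Application status until everything is Synced and Healthy.
# Function names, the 30-second interval and the attempt limit are assumptions.

# Read "NAME SYNC HEALTH" lines on stdin; print the names that are not ready yet.
pending_apps() {
  awk '$2 != "Synced" || $3 != "Healthy" { print $1 }'
}

# Print one "NAME SYNC HEALTH" line per Application across all namespaces.
list_app_status() {
  oc get applications.argoproj.io --all-namespaces --no-headers \
    -o custom-columns=NAME:.metadata.name,SYNC:.status.sync.status,HEALTH:.status.health.status
}

# Poll until no app is pending; give up after the given number of attempts.
wait_for_apps() {
  local attempts="${1:-40}" status pending
  while [ "$attempts" -gt 0 ]; do
    status="$(list_app_status)" || return 1
    pending="$(printf '%s\n' "$status" | pending_apps)"
    if [ -z "$pending" ]; then
      return 0
    fi
    echo "still waiting for: $(printf '%s' "$pending" | tr '\n' ' ')"
    attempts=$((attempts - 1))
    sleep 30
  done
  return 1
}

# Example:
#   wait_for_apps && echo "all apps Synced and Healthy"
```

Keeping the filter separate from the `oc` call makes it easy to reuse on saved output when troubleshooting a failed sync.
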
The imperative framework runs jobs on a 10-minute schedule for:

- Azure NAT gateway configuration
- initdata generation and compression
- SPIRE x509pop certificate generation
- SPIRE server x509pop plugin configuration

### 5. Create SPIRE workload registration entry

The regular SPIRE agents (DaemonSet) use the k8s workload attestor,
which identifies workloads through the kubelet API. In the CoCo model,
the infrastructure (including Kubernetes) is untrusted. The SPIRE agent
runs inside the confidential VM where the kubelet is not accessible by
design, ensuring workload identity is rooted in hardware attestation
rather than the cluster control plane. The agent uses the Unix workload
attestor instead, which identifies the calling process by its UID on
the Unix socket. Because of this, ClusterSPIFFEID CRDs do not apply and
registration entries must be created manually:

```bash
oc exec -n zero-trust-workload-identity-manager spire-server-0 -- \
  spire-server entry create \
    -parentID "spiffe://<trust-domain>/spire/agent/x509pop/<cert-fingerprint>" \
    -spiffeID "spiffe://<trust-domain>/ns/zero-trust-workload-identity-manager/sa/spire-agent" \
    -selector "unix:uid:1000800000"
```

Replace the trust-domain placeholder with your SPIRE trust domain. The
fingerprint in the parentID comes from the agent's x509pop certificate.
The UID in the selector (1000800000 in this example) is assigned by
OpenShift based on the namespace UID range, so it may differ in your
cluster.

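The two values for the registration entry can be derived with standard tools. The sketch below makes two assumptions that should be verified in your environment. First, that the x509pop fingerprint is the lowercase hex SHA-1 of the agent certificate, which is my understanding of SPIRE's default agent path; running `spire-server agent list` on the server shows the actual agent ID. Second, that the pod runs with the first UID of the range in the namespace's `openshift.io/sa.scc.uid-range` annotation. All function names are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: derive the parentID fingerprint and the selector UID.
# Assumptions: SHA-1 fingerprint format and first-UID-in-range; names are illustrative.

# Turn an openssl fingerprint line ("SHA1 Fingerprint=AB:CD:...") into lowercase hex.
normalize_fingerprint() {
  sed -e 's/^.*=//' | tr -d ':' | tr '[:upper:]' '[:lower:]'
}

# Print the SHA-1 fingerprint of a PEM certificate file.
cert_fingerprint() {
  openssl x509 -in "$1" -noout -fingerprint -sha1 | normalize_fingerprint
}

# Print the first UID of a range such as "1000800000/10000".
uid_range_start() {
  printf '%s\n' "${1%%/*}"
}

# Read the UID range annotation of a namespace and print its first UID.
namespace_uid_start() {
  local range
  range="$(oc get namespace "$1" \
    -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}')" || return 1
  uid_range_start "$range"
}

# Example (the certificate path is a placeholder):
#   cert_fingerprint ./agent-cert.pem
#   namespace_uid_start zero-trust-workload-identity-manager
```
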
## Verification

Check that the hello-coco pod is running with 3/3 containers ready:

```bash
oc get pod -n zero-trust-workload-identity-manager hello-coco
```

Check that SVIDs were issued:

```bash
oc exec -n zero-trust-workload-identity-manager hello-coco \
  -c test-workload -- ls -la /svids/
```

Expected files: svid.pem, svid_key.pem, svid_bundle.pem.

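The file check can be automated by comparing a plain listing against the expected names. This is a minimal sketch; the function name is hypothetical, and it expects one file name per line (plain `ls`, not `ls -la`).

```bash
#!/usr/bin/env bash
# Sketch: report which expected SVID files are absent from a listing.
# The function name is illustrative; input is one file name per line on stdin.

missing_svid_files() {
  local listing expected
  listing="$(cat)"
  for expected in svid.pem svid_key.pem svid_bundle.pem; do
    # -x matches the whole line, -F treats the name as a literal string.
    printf '%s\n' "$listing" | grep -qxF "$expected" || echo "$expected"
  done
}

# Example:
#   oc exec -n zero-trust-workload-identity-manager hello-coco \
#     -c test-workload -- ls /svids/ | missing_svid_files
```

Empty output means all three files are present. Any name printed is a file that spiffe-helper has not written yet.
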
Verify attestation from inside the TEE:

```bash
oc exec -n zero-trust-workload-identity-manager hello-coco \
  -c test-workload -- \
  curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status
```

The command should return the value configured in the attestationStatus
secret.

## Known Limitations

1. The ZTWIM operator CRD does not support x509pop plugin configuration.
   An imperative job patches the SPIRE server ConfigMap and StatefulSet
   directly. CREATE_ONLY_MODE must be enabled to prevent the operator
   from reverting these patches.

2. For now, SPIRE workload registration entries for CoCo pods must be
   created manually. The ClusterSPIFFEID CRD only works with
   k8s-attested agents. We are working on alternatives to automate this.
