| 1 | +# Confidential Containers Integration |
| 2 | + |
| 3 | +This document describes how to deploy the Layered Zero Trust Validated |
| 4 | +Pattern with Confidential Containers (CoCo) support. CoCo extends the |
| 5 | +pattern with hardware-rooted workload identity: the SPIRE agent runs |
| 6 | +inside a confidential VM (peer pod) and uses x509pop node attestation, |
| 7 | +with its certificates released only after the TEE attests to KBS. |
| 8 | + |
| 9 | +## Architecture |
| 10 | + |
| 11 | +In a production deployment, Trustee (the attestation server) should run |
| 12 | +on a separate trusted cluster, since it verifies the integrity of the |
| 13 | +infrastructure where workloads run. Running it on the same cluster |
| 14 | +means the attestation server shares the untrusted infrastructure it is |
| 15 | +supposed to verify. A single-cluster deployment is acceptable for |
| 16 | +development and testing. |
| 17 | + |
| 18 | +The SPIRE agent runs as a sidecar container inside each CoCo peer pod. |
| 19 | +This is different from the regular ZTVP deployment where agents run as |
| 20 | +a DaemonSet on each node. In the CoCo model, the agent must be inside |
| 21 | +the confidential VM so that its identity is rooted in hardware |
| 22 | +attestation. Each CoCo workload gets its own SPIRE agent instance. |
| 23 | + |
| 24 | +The trust chain: |
| 25 | + |
| 26 | +1. Peer pod VM created inside a TEE (AMD SEV-SNP or Intel TDX) |
| 27 | +2. Confidential Data Hub (CDH) inside the TEE attests to KBS |
| 28 | +3. KBS validates the TEE evidence and returns sealed secrets |
| 29 | +4. SPIRE agent loads x509pop certificates from the unsealed secrets |
| 30 | +5. Agent connects to SPIRE server and performs x509pop node attestation |
| 31 | +6. Workload receives an X.509-SVID via Unix workload attestation through spiffe-helper |
| 32 | + |
| 33 | +## Prerequisites |
| 34 | + |
| 35 | +- Cloud provider region with confidential VM quota for peer pod VMs |
| 36 | + (worker nodes themselves do not need to be confidential) |
| 37 | +- Vault as the secret backend |
| 38 | + |
| 39 | +### Azure Instance Types |
| 40 | + |
| 41 | +Azure confidential VM SKU families: |
| 42 | + |
| 43 | +- DCasv5: AMD Milan (SEV-SNP) |
| 44 | +- DCasv6: AMD Genoa (SEV-SNP) |
| 45 | +- DCesv6: Intel TDX |
| 46 | + |
| 47 | +Availability varies by region. The default configuration uses |
| 48 | +Standard_DC2as_v6. Change the VM flavor in values-coco-dev.yaml under |
| 49 | +the sandbox-policies app overrides if your region requires a different |
| 50 | +SKU. |
| 51 | + |
| 52 | +## Deployment |
| 53 | + |
| 54 | +### 1. Configure clusterGroupName |
| 55 | + |
| 56 | +Edit values-global.yaml and set the clusterGroupName to coco-dev: |
| 57 | + |
| 58 | +```yaml |
| 59 | +main: |
| 60 | + clusterGroupName: coco-dev |
| 61 | +``` |
| 62 | + |
| 63 | +Commit and push this change before deploying. |
| 64 | + |
| 65 | +### 2. Generate secrets |
| 66 | + |
| 67 | +Run the pre-deployment scripts from the pattern root: |
| 68 | + |
| 69 | +```bash |
| 70 | +./scripts/gen-secrets-coco.sh |
| 71 | +./scripts/get-pcr.sh |
| 72 | +``` |
| 73 | + |
| 74 | +gen-secrets-coco.sh creates the cryptographic keys that Trustee (the |
| 75 | +attestation server) needs to authenticate requests. It also copies the |
| 76 | +values-secret template if not already present. The script is safe to |
| 77 | +re-run: it does not overwrite existing files. |
| 78 | + |
| 79 | +get-pcr.sh retrieves the expected hardware measurements for the |
| 80 | +confidential VM image. Trustee compares these against the measurements |
| 81 | +reported by the actual hardware to decide whether a VM is genuine. |
| 82 | +It requires a Red Hat pull secret (~/pull-secret.json by default; set |
| 83 | +the PULL_SECRET environment variable to override). |
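A pre-flight check can catch a missing pull secret before get-pcr.sh runs. The following is a minimal sketch, not part of the pattern's scripts: the function name `resolve_pull_secret` is hypothetical, and it assumes that PULL_SECRET, when set, holds a path to the pull secret file.

```bash
# Hypothetical helper: prints the pull secret path that would be used,
# or fails if that file is not readable.
# Assumption: PULL_SECRET, when set, is a file path.
resolve_pull_secret() {
  local path="${PULL_SECRET:-$HOME/pull-secret.json}"
  if [ ! -r "$path" ]; then
    echo "pull secret not readable: $path" >&2
    return 1
  fi
  printf '%s\n' "$path"
}
```

With the function sourced into the shell, `resolve_pull_secret && ./scripts/get-pcr.sh` runs the script only when the file is readable.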
| 84 | + |
| 85 | +Both scripts write their output to ~/.config/validated-patterns/trustee/. |
| 86 | + |
| 87 | +### 3. Edit the secrets template |
| 88 | + |
| 89 | +Edit ~/.config/validated-patterns/values-secret-layered-zero-trust.yaml |
| 90 | +and uncomment the CoCo secrets section. Each secret has inline comments |
| 91 | +in the template explaining its purpose and how to populate it. |
| 92 | + |
| 93 | +### 4. Deploy |
| 94 | + |
| 95 | +```bash |
| 96 | +# If deploying from a fork, set TARGET_ORIGIN to your git remote name: |
| 97 | +# TARGET_ORIGIN=myfork ./pattern.sh make install |
| 98 | +./pattern.sh make install |
| 99 | +``` |
| 100 | + |
| 101 | +Wait for all ArgoCD apps to reach Healthy/Synced. CoCo apps (sandbox, |
| 102 | +trustee, sandbox-policies) reference CRDs created by the operators. On |
| 103 | +first deploy, ArgoCD may try to sync these apps before the operator |
| 104 | +has finished installing and registering its CRDs. This is normal and |
| 105 | +resolves automatically once the operator CSV succeeds and ArgoCD |
| 106 | +retries the sync. |
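Waiting for Healthy/Synced can be scripted rather than watched in the console. The sketch below is one possible approach, not something the pattern ships: `apps_ready` is a hypothetical helper that reads lines of the form `<app> <sync-status> <health-status>` and succeeds only when every app reports Synced and Healthy. It relies on the standard Argo CD Application status fields `.status.sync.status` and `.status.health.status`.

```bash
# Hypothetical helper: reads "<app> <sync> <health>" lines on stdin.
# Succeeds only if at least one app is listed and all are Synced/Healthy.
apps_ready() {
  local name sync health pending=0 seen=0
  while read -r name sync health; do
    [ -n "$name" ] || continue
    seen=1
    if [ "$sync" != "Synced" ] || [ "$health" != "Healthy" ]; then
      echo "waiting: $name (sync=${sync:-unknown}, health=${health:-unknown})" >&2
      pending=1
    fi
  done
  [ "$seen" -eq 1 ] && [ "$pending" -eq 0 ]
}

# Usage against a live cluster (requires a logged-in oc session):
#   until oc get applications.argoproj.io -A \
#       -o jsonpath='{range .items[*]}{.metadata.name} {.status.sync.status} {.status.health.status}{"\n"}{end}' \
#       | apps_ready; do sleep 30; done
```

Apps that are still waiting on operator CRDs show up in the "waiting" lines until ArgoCD retries the sync.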
| 107 | + |
| 108 | +The imperative framework runs jobs on a 10-minute schedule for: |
| 109 | + |
| 110 | +- Azure NAT gateway configuration |
| 111 | +- initdata generation and compression |
| 112 | +- SPIRE x509pop certificate generation |
| 113 | +- SPIRE server x509pop plugin configuration |
| 114 | + |
| 115 | +### 5. Create SPIRE workload registration entry |
| 116 | + |
| 117 | +The regular SPIRE agents (DaemonSet) use the k8s workload attestor, |
| 118 | +which identifies workloads through the kubelet API. In the CoCo model, |
| 119 | +the infrastructure (including Kubernetes) is untrusted. The SPIRE agent |
| 120 | +runs inside the confidential VM where the kubelet is not accessible by |
| 121 | +design, ensuring workload identity is rooted in hardware attestation |
| 122 | +rather than the cluster control plane. The agent uses the Unix workload |
| 123 | +attestor instead, which identifies processes by UID over the Unix |
| 124 | +socket. Because of this, ClusterSPIFFEID CRDs do not apply and |
| 125 | +registration entries must be created manually: |
| 126 | + |
| 127 | +```bash |
| 128 | +oc exec -n zero-trust-workload-identity-manager spire-server-0 -- \ |
| 129 | + spire-server entry create \ |
| 130 | + -parentID "spiffe://<trust-domain>/spire/agent/x509pop/<cert-fingerprint>" \ |
| 131 | + -spiffeID "spiffe://<trust-domain>/ns/zero-trust-workload-identity-manager/sa/spire-agent" \ |
| 132 | + -selector "unix:uid:1000800000" |
| 133 | +``` |
| 134 | + |
| 135 | +The <cert-fingerprint> in the parentID is the fingerprint of the agent's |
| 136 | +x509pop certificate. The UID is assigned by OpenShift from the namespace UID range. |
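The two placeholders in the command above can be derived with a couple of small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and the fingerprint is taken to be the hex SHA-1 of the DER-encoded certificate, which is what SPIRE's x509pop attestor uses in its default agent path as far as we know. Compare the result with the agent ID reported by `spire-server agent list` before relying on it.

```bash
# Hypothetical helper: hex SHA-1 of the DER encoding of a PEM certificate.
# Assumption: this matches the fingerprint in the x509pop agent SPIFFE ID.
x509pop_fingerprint() {
  openssl x509 -in "$1" -outform DER | openssl dgst -sha1 -r | cut -d' ' -f1
}

# Hypothetical helper: first UID of a range such as "1000800000/10000",
# the format of the openshift.io/sa.scc.uid-range namespace annotation.
first_uid() {
  printf '%s\n' "${1%%/*}"
}

# Usage (agent.crt is a placeholder for the agent's x509pop certificate):
#   x509pop_fingerprint agent.crt
#   first_uid "$(oc get namespace zero-trust-workload-identity-manager \
#       -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}')"
```

The first UID of the range is typically what a pod in that namespace runs as when no explicit runAsUser is set, but the UID of the actual workload process should be confirmed, for example with `id -u` inside the container.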
| 137 | + |
| 138 | +## Verification |
| 139 | + |
| 140 | +Check that the hello-coco pod is running with 3/3 containers ready: |
| 141 | + |
| 142 | +```bash |
| 143 | +oc get pod -n zero-trust-workload-identity-manager hello-coco |
| 144 | +``` |
| 145 | + |
| 146 | +Check that SVIDs were issued: |
| 147 | + |
| 148 | +```bash |
| 149 | +oc exec -n zero-trust-workload-identity-manager hello-coco \ |
| 150 | + -c test-workload -- ls -la /svids/ |
| 151 | +``` |
| 152 | + |
| 153 | +Expected files: svid.pem, svid_key.pem, svid_bundle.pem. |
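The file check can also be automated. The following sketch assumes nothing beyond the three file names listed above; `check_svid_listing` is a hypothetical helper that reads a one-name-per-line directory listing and reports any expected file that is absent.

```bash
# Hypothetical helper: reads file names (one per line) on stdin and fails
# if any of the expected SVID files is missing from the listing.
check_svid_listing() {
  local listing missing=0 f
  listing="$(cat)"
  for f in svid.pem svid_key.pem svid_bundle.pem; do
    if ! printf '%s\n' "$listing" | grep -qxF "$f"; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Usage against the running pod:
#   oc exec -n zero-trust-workload-identity-manager hello-coco \
#       -c test-workload -- ls /svids/ | check_svid_listing
```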
| 154 | + |
| 155 | +Verify attestation from inside the TEE: |
| 156 | + |
| 157 | +```bash |
| 158 | +oc exec -n zero-trust-workload-identity-manager hello-coco \ |
| 159 | + -c test-workload -- \ |
| 160 | + curl http://127.0.0.1:8006/cdh/resource/default/attestation-status/status |
| 161 | +``` |
| 162 | + |
| 163 | +This should return the value configured in the attestationStatus secret. |
| 164 | + |
| 165 | +## Known Limitations |
| 166 | + |
| 167 | +1. The ZTWIM operator CRD does not support x509pop plugin configuration. |
| 168 | + An imperative job patches the SPIRE server ConfigMap and StatefulSet |
| 169 | + directly. CREATE_ONLY_MODE must be enabled to prevent the operator |
| 170 | + from reverting these patches. |
| 171 | + |
| 172 | +2. For now, SPIRE workload registration entries for CoCo pods must be |
| 173 | + created manually. The ClusterSPIFFEID CRD only works with |
| 174 | + k8s-attested agents. We are working on alternatives to automate this. |