diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml index 2c55f29a4284..8c42a4f6004c 100644 --- a/_topic_maps/_topic_map.yml +++ b/_topic_maps/_topic_map.yml @@ -1330,8 +1330,10 @@ Topics: File: zero-trust-manager-proxy - Name: Configuring Zero Trust Workload Identity Manager OIDC Federation File: zero-trust-manager-oidc-federation - - Name: Configuring Zero Trust Workload Identity Manager SPIRE Federation + - Name: Configuring Zero Trust Workload Identity Manager SPIRE Federation File: zero-trust-manager-spire-federation + - Name: Integrating SPIRE federation with multi-cluster Red Hat OpenShift Service Mesh + File: zero-trust-manager-mesh-integration-multi-cluster - Name: Enabling create-only mode for the Zero Trust Workload Identity Manager File: zero-trust-manager-reconciliation - Name: Monitoring Zero Trust Workload Identity Manager diff --git a/modules/zero-trust-manager-configure-spire-mesh-multi-cluster.adoc b/modules/zero-trust-manager-configure-spire-mesh-multi-cluster.adoc new file mode 100644 index 000000000000..e0f916f3ef77 --- /dev/null +++ b/modules/zero-trust-manager-configure-spire-mesh-multi-cluster.adoc @@ -0,0 +1,255 @@ +// Module included in the following assemblies: +// +// * security/zero_trust_workload_identity_manager/zero-trust-manager-oidc-federation.adoc + +:_mod-docs-content-type: PROCEDURE +[id="zero-trust-manager-configure-spire-mesh-multi-cluster_{context}"] += Configuring {SMProductName} for multi-cluster SPIRE integration + +[role="_abstract"] +Configure {SMProductName} on each cluster with federation settings, East-West Gateways, and Remote Secrets to enable cross-cluster service communication by using SPIRE-issued certificates. + +.Prerequisites + +* You have deployed SPIRE with federation on all clusters. +* You have installed the OpenShift Service Mesh 3 Operator on all clusters. +* You have configured SPIRE federation and trust bundle exchange between clusters.
+* You have cluster administrator permissions on all clusters. + +.Procedure + +. On each cluster, create the `Istio` custom resource (CR) with SPIRE integration and multi-cluster settings: ++ +[source,yaml] +---- +apiVersion: sailoperator.io/v1alpha1 +kind: Istio +metadata: + name: default +spec: + version: v1.27.3 + namespace: istio-system + values: + global: + pilotCertProvider: istiod + multiCluster: + clusterName: cluster-a + pilot: + env: + PILOT_ENABLE_XDS_IDENTITY_CHECK: "false" + meshConfig: + trustDomain: cluster-a.example.org + trustDomainAliases: + - spiffe://cluster-b.example.org + - spiffe://cluster-c.example.org + defaultConfig: + proxyMetadata: + CREDENTIAL_SOCKET_EXISTS: "true" + SPIFFE_ENDPOINT_SOCKET: "unix:///tmp/spire-agent/public/socket" + meshNetworks: + network-cluster-b: + endpoints: + - fromRegistry: cluster-b + gateways: + - address: eastwest-gateway.istio-system.svc.cluster.local + port: 15443 + network-cluster-c: + endpoints: + - fromRegistry: cluster-c + gateways: + - address: eastwest-gateway.istio-system.svc.cluster.local + port: 15443 +---- ++ +where: + +`spec.values.global.multiCluster.clusterName`:: Specifies the unique cluster name for each cluster in the mesh. + +`spec.values.meshConfig.trustDomain`:: Specifies the SPIFFE trust domain for this cluster. + +`spec.values.meshConfig.trustDomainAliases`:: Specifies the list of all federated trust domains from remote clusters. + +`spec.values.meshConfig.defaultConfig.proxyMetadata.SPIFFE_ENDPOINT_SOCKET`:: Specifies the SPIRE Agent socket path. The socket filename must be `socket` to match the SPIRE Agent configuration. + +`spec.values.meshConfig.meshNetworks`:: Specifies the mesh networks for each remote cluster. + +.
On each cluster, create an East-West Gateway for cross-cluster traffic: ++ +[source,yaml] +---- +apiVersion: v1 +kind: Service +metadata: + name: eastwest-gateway + namespace: istio-system +spec: + type: LoadBalancer + selector: + istio: eastwestgateway + ports: + - port: 15443 + name: tls + protocol: TCP + targetPort: 15443 +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: eastwest-gateway + namespace: istio-system +spec: + selector: + matchLabels: + istio: eastwestgateway + template: + metadata: + labels: + istio: eastwestgateway + annotations: + inject.istio.io/templates: gateway + spec: + containers: + - name: istio-proxy + image: auto +--- +apiVersion: networking.istio.io/v1beta1 +kind: Gateway +metadata: + name: cross-network-gateway + namespace: istio-system +spec: + selector: + istio: eastwestgateway + servers: + - port: + number: 15443 + name: tls + protocol: TLS + tls: + mode: PASSTHROUGH + hosts: + - "*.local" +---- ++ +where: + +`spec.servers.tls.mode`:: Specifies that the TLS Passthrough mode preserves SPIRE-issued certificates for end-to-end mTLS. + +. Create Remote Secrets to enable cross-cluster service discovery. From Cluster A, create a secret for Cluster B by running the following command: ++ +[source,terminal] +---- +$ istioctl create-remote-secret \ + --context=cluster-b \ + --name=cluster-b | \ + oc apply -f - --context=cluster-a +---- ++ +Repeat this step for each cluster pair to enable bidirectional service discovery. + +. 
On each cluster, create a `ClusterSPIFFEID` CR with federation settings: ++ +[source,yaml] +---- +apiVersion: spire.openshift.io/v1alpha1 +kind: ClusterSPIFFEID +metadata: + name: federated-workloads +spec: + className: zero-trust-workload-identity-manager-spire + spiffeIDTemplate: "spiffe://cluster-a.example.org/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}" + federatesWith: + - spiffe://cluster-b.example.org + - spiffe://cluster-c.example.org + podSelector: + matchLabels: + istio.io/rev: default + namespaceSelector: + matchLabels: + istio-injection: enabled + workloadSelectorTemplates: + - "k8s:ns:{{ .PodMeta.Namespace }}" + - "k8s:sa:{{ .PodSpec.ServiceAccountName }}" +---- ++ +where: + +`spec.className`:: Specifies that the `className` field is required when using `federatesWith`. + +`spec.federatesWith`:: Specifies the list of all federated trust domains. + +. Deploy test workloads with the required annotations for SPIRE socket mounting: ++ +[source,yaml] +---- +apiVersion: v1 +kind: Pod +metadata: + name: helloworld + namespace: sample + annotations: + sidecar.istio.io/userVolume: '{"spire-agent-socket":{"csi":{"driver":"csi.spiffe.io","readOnly":true}}}' + sidecar.istio.io/userVolumeMount: '{"spire-agent-socket":{"mountPath":"/tmp/spire-agent/public","readOnly":true}}' +spec: + serviceAccountName: helloworld + containers: + - name: helloworld + image: registry.access.redhat.com/ubi9/ubi-minimal:latest + ports: + - containerPort: 8080 +---- + +.Verification + +. Verify that workloads have federated trust bundles in their X.509 SVIDs by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-agent- -- \ + /opt/spire/bin/spire-agent api fetch x509 \ + -socketPath /run/spire/agent-sockets/socket +---- ++ +The output should show multiple CA certificates in the trust bundle. + +. 
Verify that Envoy has the federated trust bundles by running the following command: ++ +[source,terminal] +---- +$ oc exec -n sample helloworld -c istio-proxy -- \ + curl localhost:15000/config_dump | \ + jq '.configs[] | select(.["@type"] | contains("SecretsConfigDump"))' +---- ++ +Check for multiple CA certificates in the validation context. + +. Verify cross-cluster service discovery by running the following command: ++ +[source,terminal] +---- +$ oc exec -n sample helloworld -c istio-proxy -- \ + curl localhost:15000/clusters | grep +---- ++ +You should see endpoints for services in remote clusters. + +. Test cross-cluster connectivity by running the following command: ++ +[source,terminal] +---- +$ oc exec -n sample helloworld -- \ + curl -v http://..svc.cluster.local:8080 +---- ++ +Check the curl output for successful TLS handshake by using SPIRE-issued certificates. + +. Verify SPIRE registration entries include federation by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + /opt/spire/bin/spire-server entry show -output json | \ + jq '.entries[] | select(.federates_with | length > 0)' +---- ++ +Entries should show the `federates_with` field containing remote trust domains. 
\ No newline at end of file diff --git a/modules/zero-trust-manager-deploy-spire-multi-cluster.adoc b/modules/zero-trust-manager-deploy-spire-multi-cluster.adoc new file mode 100644 index 000000000000..cf0572492c5a --- /dev/null +++ b/modules/zero-trust-manager-deploy-spire-multi-cluster.adoc @@ -0,0 +1,143 @@ +// Module included in the following assemblies: +// +// * security/zero_trust_workload_identity_manager/zero-trust-manager-oidc-federation.adoc + +:_mod-docs-content-type: PROCEDURE +[id="zero-trust-manager-deploy-spire-multi-cluster_{context}"] += Deploying SPIRE with federation for multi-cluster integration + +[role="_abstract"] +Deploy SPIRE with federation capabilities on each cluster to enable cross-cluster trust bundle exchange and workload identity validation across multiple {product-title} clusters. + +.Prerequisites + +* You have deployed SPIRE on each cluster by using the {zero-trust-full} Operator. +* You have cluster administrator permissions on all clusters. +* You have the `oc` CLI installed. +* Each cluster has a unique SPIFFE trust domain. For example, `cluster-a.example.org`, `cluster-b.example.org`, and so on. + +.Procedure + +. On each cluster, to prevent the {zero-trust-full} from overwriting manual ConfigMap changes, enable `CREATE_ONLY_MODE` by running the following command: ++ +[source,terminal] +---- +$ oc set env deployment/ztwim-operator-controller-manager \ + -n openshift-spire-operator \ + CREATE_ONLY_MODE=true +---- + +.
On each cluster, patch the SPIRE Agent ConfigMap to configure the socket path and SDS settings for federation by running the following command: ++ +[source,terminal] +---- +$ oc patch configmap spire-agent -n openshift-spire --type=json -p='[ + { + "op": "replace", + "path": "/data/agent.conf", + "value": "server_address = \"spire-server.openshift-spire.svc.cluster.local\"\nserver_port = \"8081\"\ntrust_domain = \"cluster-a.example.org\"\n\nplugins {\n KeyManager \"disk\" {\n plugin_data {\n directory = \"/run/spire/data\"\n }\n }\n NodeAttestor \"k8s_psat\" {\n plugin_data {\n cluster = \"openshift-cluster\"\n }\n }\n WorkloadAttestor \"k8s\" {\n plugin_data {}\n }\n WorkloadAttestor \"unix\" {\n plugin_data {}\n }\n WorkloadAPI \"socket\" {\n plugin_cmd = \"/opt/spire/bin/spire-agent\"\n plugin_data {\n socket_path = \"/run/spire/agent-sockets/socket\"\n }\n }\n SDS \"unix\" {\n plugin_cmd = \"/opt/spire/bin/spire-agent\"\n plugin_data {\n default_bundle_name = \"null\"\n default_all_bundles_name = \"ROOTCA\"\n }\n }\n}\n" + } +]' +---- ++ +The `socket_path` setting must end in the filename `socket` for Istio compatibility, and the `default_all_bundles_name = "ROOTCA"` setting serves all federated trust bundles under the name `ROOTCA`. + +. On each cluster, create a `ClusterFederatedTrustDomain` custom resource to define the remote clusters to federate with: ++ +[source,yaml] +---- +apiVersion: spire.openshift.io/v1alpha1 +kind: ClusterFederatedTrustDomain +metadata: + name: cluster-b-federation +spec: + trustDomain: spiffe://cluster-b.example.org + bundleEndpointURL: https://spire-server.openshift-spire.svc.cluster.local:8443 + bundleEndpointProfile: + type: https_spiffe +---- ++ +where: + +`spec.trustDomain`:: Specifies the SPIFFE trust domain of the remote cluster. + +`spec.bundleEndpointURL`:: Specifies the URL where the remote SPIRE Server exposes its trust bundle.
+ +`spec.bundleEndpointProfile.type`:: Specifies that the `https_spiffe` profile is used for secure bundle exchange. ++ +Repeat this step for each remote cluster you want to federate with. + +. Export the trust bundle from each cluster by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + /opt/spire/bin/spire-server bundle show -format spiffe > cluster-a-bundle.json +---- ++ +Save the bundle file for each cluster. + +. To bootstrap federation, manually load the remote trust bundles into each SPIRE Server by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -i -- \ + /opt/spire/bin/spire-server bundle set \ + -format spiffe \ + -id spiffe://cluster-b.example.org < cluster-b-bundle.json +---- ++ +Replace `spiffe://cluster-b.example.org` and `cluster-b-bundle.json` with the trust domain and bundle file of the remote cluster. ++ +Repeat this step on each cluster for all remote clusters. + +. Restart the SPIRE Agent `DaemonSet` on each cluster to apply the ConfigMap changes by running the following command: ++ +[source,terminal] +---- +$ oc rollout restart daemonset/spire-agent -n openshift-spire +---- + +. Restart the SPIRE Server `StatefulSet` on each cluster to activate federation by running the following command: ++ +[source,terminal] +---- +$ oc rollout restart statefulset/spire-server -n openshift-spire +---- + +.Verification + +. Verify that the SPIRE Server has the federated trust bundles by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + /opt/spire/bin/spire-server bundle list +---- ++ +The output should show trust bundles for both the local trust domain and all federated trust domains. + +.
Verify that the SPIRE Agent socket is correctly named by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-agent- -- \ + ls -l /run/spire/agent-sockets/ +---- ++ +.Example output +[source,terminal] +---- +srwxrwxrwx 1 root root 0 May 20 10:15 socket +---- ++ +The socket must be named `socket`, not `spire-agent.sock`. + +. Check the SPIRE Server logs for federation activity by running the following command: ++ +[source,terminal] +---- +$ oc logs -n openshift-spire spire-server-0 | grep -i federation +---- ++ +Look for messages indicating successful bundle retrieval from federated trust domains. \ No newline at end of file diff --git a/modules/zero-trust-manager-limits-troubleshooting.adoc b/modules/zero-trust-manager-limits-troubleshooting.adoc new file mode 100644 index 000000000000..a470b459baaa --- /dev/null +++ b/modules/zero-trust-manager-limits-troubleshooting.adoc @@ -0,0 +1,590 @@ +// Module included in the following assemblies: +// +// * security/zero_trust_workload_identity_manager/zero-trust-manager-oidc-federation.adoc + +:_mod-docs-content-type: REFERENCE +[id="zero-trust-manager-limits-troubleshooting_{context}"] += Known limitations and troubleshooting for multi-cluster SPIRE integration + +[role="_abstract"] +This reference provides known technical limitations, configuration constraints, and troubleshooting procedures for integrating SPIRE federation with multi-cluster {SMProductName} deployments. + +[id="spire-multicluster-known-limitations_{context}"] +== Known limitations + +The following describes known limitations and critical requirements when integrating SPIRE federation with multi-cluster {SMProductName} deployments. + +[id="spire-socket-filename-requirement_{context}"] +SPIRE Agent socket filename requirement:: ++ +The SPIRE Agent must create a socket file named exactly `socket`, not `spire-agent.sock` or any other name, for proper integration with Istio's pilot-agent.
++ +The socket filename must be configured as `/run/spire/agent-sockets/socket` in the SPIRE Agent WorkloadAPI plugin configuration. ++ +*Impact*: If the socket filename is incorrect, Envoy cannot connect to the SPIRE Agent SDS API. The Istio injection templates and CSI Driver expect the socket at `/tmp/spire-agent/public/socket`. ++ +For troubleshooting missing sockets, see <>. + +[id="classname-required-with-federateswith_{context}"] +className required when using federatesWith:: ++ +When a `ClusterSPIFFEID` custom resource includes the `federatesWith` field to enable cross-cluster authentication, the `className` field is mandatory and must be set to `zero-trust-workload-identity-manager-spire`. ++ +*Impact*: Without the `className` field, SPIRE Server does not issue certificates with federated trust bundles, causing cross-cluster mTLS authentication to fail. Workloads cannot validate certificates from remote clusters. ++ +*Error symptom*: No federated bundles appear in the workload's X.509 SVID when inspecting with `spire-agent api fetch x509`. ++ +For troubleshooting, see <>. + +[id="readonly-filesystem-error_{context}"] +Read-only filesystem error with pilot-agent:: ++ +The Istio pilot-agent process might encounter read-only filesystem errors when attempting to create a socket in the CSI-mounted directory. ++ +*Cause*: The CSI volume mount is read-only, but pilot-agent expects to create files in that location. ++ +*Resolution*: Ensure the SPIRE Agent creates the socket before pilot-agent starts. The socket must already exist in the CSI-mounted path when the Envoy sidecar initializes. 
++ +Configure the CSI mount path and SPIRE Agent socket path to align as follows: ++ +* SPIRE Agent: `socket_path = "/run/spire/agent-sockets/socket"` + +* CSI Driver: Mounts `/run/spire/agent-sockets` to `/tmp/spire-agent/public` in pods + +[id="sds-rootca-bundle-requirement_{context}"] +SDS ROOTCA bundle configuration for federation:: ++ +In multi-cluster deployments, the SPIRE Agent SDS configuration must include `default_all_bundles_name: "ROOTCA"` to serve federated trust bundles to Envoy. ++ +*Impact*: Without this setting, Envoy receives only the local cluster's trust bundle. Cross-cluster certificate validation fails because Envoy cannot verify certificates issued by remote SPIRE Servers. ++ +For verification procedures, see <>. + +[id="federation-bootstrap-manual-bundle_{context}"] +Federation bootstrap requires manual trust bundle loading:: ++ +SPIRE federation requires manual initialization of the remote cluster's trust bundle before automatic bundle exchange can begin. ++ +*Bootstrap problem*: Each SPIRE Server needs the remote server's CA certificate to validate the remote server's SVID when retrieving bundles. This creates a circular dependency. ++ +*Resolution*: Manually export the trust bundle from each remote cluster and load it into the local SPIRE Server by using the `spire-server bundle set` command. ++ +*Impact*: Federation does not function until this bootstrap process completes on all clusters. Automated bundle refresh begins after the initial bundle is loaded. ++ +For detailed procedures, see <>. + +[id="trustdomain-aliases-required_{context}"] +trustDomainAliases required for cross-cluster validation:: ++ +The Istio `meshConfig` must include `trustDomainAliases` that map federated trust domains to their SPIFFE trust domain URIs. ++ +*Impact*: Without `trustDomainAliases`, Envoy rejects certificates from remote clusters even if the federated trust bundles are present. 
The trust domain must be explicitly mapped for Envoy to validate remote certificates. ++ +For troubleshooting missing trust domain aliases, see <>. + +[id="eastwest-gateway-tls-passthrough_{context}"] +East-West Gateway requires TLS Passthrough mode:: ++ +East-West Gateways in multi-cluster SPIRE deployments must use TLS Passthrough mode `PASSTHROUGH`, not TLS Termination modes `SIMPLE` or `MUTUAL`. ++ +*Reason*: TLS Termination would replace SPIRE-issued certificates with gateway certificates, breaking end-to-end mutual TLS. Passthrough preserves the original workload certificates for peer authentication. ++ +*Impact*: If the gateway uses TLS Termination, cross-cluster services cannot authenticate using SPIFFE identities. The receiving cluster sees the gateway's certificate instead of the originating workload's certificate. ++ +For verification, see <>. + +[id="remote-secrets-service-discovery_{context}"] +Remote Secrets required for cross-cluster service discovery:: ++ +Multi-cluster service discovery requires Remote Secrets that allow one cluster's Istio control plane to access the Kubernetes API of remote clusters. ++ +*Impact*: Without Remote Secrets, Cluster A's control plane cannot discover services in Cluster B. Workloads in Cluster A do not receive endpoint information for remote services, causing connection failures even if SPIRE federation is configured correctly. ++ +For troubleshooting, see <>. + +[id="meshnetworks-gateway-configuration_{context}"] +meshNetworks configuration for gateway routing:: ++ +The Istio `meshConfig` must include the `meshNetworks` configuration that defines how to route traffic through East-West Gateways. ++ +*Impact*: Without the `meshNetworks` configuration, Istio attempts direct pod-to-pod communication across clusters, which fails due to network isolation. Traffic must be routed through gateways. ++ +For troubleshooting, see <>. 
+ +[id="version-specific-multicluster-integration_{context}"] +Version-specific implementation details:: ++ +The multi-cluster SPIRE integration procedures are specific to {SMProductName} 3.x, Istio v1.27.3, and the {zero-trust-full}. ++ +*Recommendation*: Verify all configuration requirements against the specific versions of {SMProductName}, Istio, and {zero-trust-full} in your environment. + +[id="spire-multicluster-troubleshooting-procedures_{context}"] +== Troubleshooting procedures + +The following procedures resolve common issues that you might encounter when integrating SPIRE federation with multi-cluster OpenShift Service Mesh deployments. + +[id="cross-cluster-connection-failures_{context}"] +Cross-cluster service connection failures:: ++ +*Symptom* ++ +Workloads in one cluster cannot connect to services in another cluster, even though SPIRE federation is configured. ++ +*Possible causes and resolutions* ++ +*Missing Remote Secrets* ++ +. Verify that Remote Secrets exist on each cluster by running the following command: ++ +[source,terminal] +---- +$ oc get secrets -n istio-system -l istio/multiCluster=true +---- ++ +. If Remote Secrets are missing, create them by running the following `istioctl` command: ++ +[source,terminal] +---- +$ istioctl create-remote-secret \ + --context=cluster-b \ + --name=cluster-b | \ + oc apply -f - --context=cluster-a +---- + +*Missing trustDomainAliases* ++ +. Verify that the Istio `meshConfig` includes all federated trust domains by running the following command: ++ +[source,terminal] +---- +$ oc get istio default -n istio-system -o jsonpath='{.spec.values.meshConfig.trustDomainAliases}' +---- ++ +. Add missing trust domains to the Istio CR: ++ +[source,yaml] +---- +spec: + values: + meshConfig: + trustDomainAliases: + - spiffe://cluster-b.example.org + - spiffe://cluster-c.example.org +---- + +*Gateway routing issues* ++ +.
Check that East-West Gateways are running and accessible by running the following commands: ++ +[source,terminal] +---- +$ oc get pods -n istio-system -l istio=eastwestgateway +---- ++ +[source,terminal] +---- +$ oc get svc -n istio-system eastwest-gateway +---- ++ +Verify that the gateway has an external IP assigned for `LoadBalancer` type services. + +[id="federated-bundles-not-loaded_{context}"] +Federated trust bundles not loaded in workload certificates:: ++ +*Symptom* ++ +Workloads have SPIRE-issued certificates, but the certificates do not include federated trust bundles. Cross-cluster authentication fails. ++ +*Cause* ++ +The `ClusterSPIFFEID` resource is missing the `className` field or the `federatesWith` field is not properly configured. ++ +*Resolution* + +. Verify the `ClusterSPIFFEID` configuration by running the following command: ++ +[source,terminal] +---- +$ oc get clusterspiffeid -o yaml +---- + +. Ensure the `className` field is set when using `federatesWith`: ++ +[source,yaml] +---- +apiVersion: spire.openshift.io/v1alpha1 +kind: ClusterSPIFFEID +metadata: + name: federated-workloads +spec: + className: zero-trust-workload-identity-manager-spire + federatesWith: + - spiffe://cluster-b.example.org + - spiffe://cluster-c.example.org + spiffeIDTemplate: "spiffe://cluster-a.example.org/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}" +---- ++ +The `className` field must be set to `zero-trust-workload-identity-manager-spire`. + +. Check SPIRE Server registration entries for federation: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + /opt/spire/bin/spire-server entry show -output json | \ + jq '.entries[] | select(.federates_with | length > 0)' +---- + +. If entries do not show `federates_with`, delete and re-create the workload pods to trigger re-registration. 
+ +[id="spire-bundle-endpoint-unreachable_{context}"] +SPIRE federation bundle endpoint unreachable:: ++ +*Symptom* ++ +SPIRE Server logs show errors retrieving trust bundles from remote clusters: ++ +[source,text] +---- +Failed to fetch bundle from remote endpoint: connection refused +---- ++ +*Possible causes and resolutions* + +*Incorrect bundle endpoint URL* ++ +* Verify that the `bundleEndpointURL` value in the `ClusterFederatedTrustDomain` resource is correct by running the following command: ++ +[source,terminal] +---- +$ oc get clusterfederatedtrustdomain -o yaml +---- ++ +The URL should point to the SPIRE Server's federation endpoint (default port 8443). + +*Network connectivity issues* ++ +* Test connectivity from one cluster's SPIRE Server to another cluster's federation endpoint by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + curl -k https://spire-server.openshift-spire.svc.cluster-b.local:8443/ +---- ++ +If the test fails, check the network policies, firewall rules, and DNS resolution. + +*Federation not enabled on SPIRE Server* ++ +* Verify that the SPIRE Server is configured to expose the federation endpoint by running the following command: ++ +[source,terminal] +---- +$ oc logs -n openshift-spire spire-server-0 | grep -i "bundle endpoint" +---- ++ +Look for messages indicating the bundle endpoint is listening on port 8443. + +*Manual trust bundle bootstrap required* ++ +. Manually load the initial trust bundle from the remote cluster: + +.. Export the trust bundle from Cluster B by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + /opt/spire/bin/spire-server bundle show -format spiffe > cluster-b-bundle.json +---- + +..
Load the bundle into Cluster A's SPIRE Server by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -i -- \ + /opt/spire/bin/spire-server bundle set -format spiffe -id spiffe://cluster-b.example.org < cluster-b-bundle.json +---- + +[id="envoy-rejecting-remote-certificates_{context}"] +Envoy rejecting certificates from federated clusters:: ++ +*Symptom* ++ +Cross-cluster connections fail during the TLS handshake. Envoy logs show certificate validation errors: ++ +[source,text] +---- +TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER +---- ++ +*Cause* ++ +Envoy does not have the federated trust bundle, or the East-West Gateway is using TLS Termination instead of TLS Passthrough. ++ +*Resolution* + +. Verify that Envoy has federated trust bundles by running the following command: ++ +[source,terminal] +---- +$ oc exec -n -c istio-proxy -- \ + curl localhost:15000/config_dump | \ + jq -r '.configs[] | select(.["@type"] | contains("SecretsConfigDump")) | .dynamic_active_secrets[0].secret.validation_context.trusted_ca.inline_bytes' | \ + base64 -d | \ + openssl crl2pkcs7 -nocrl -certfile /dev/stdin | \ + openssl pkcs7 -print_certs -noout +---- ++ +The output should show multiple CA certificates, local and federated. + +. Verify that the East-West Gateway uses TLS Passthrough by running the following command: ++ +[source,terminal] +---- +$ oc get gateway cross-network-gateway -n istio-system -o jsonpath='{.spec.servers[0].tls.mode}' +---- ++ +The output should be `PASSTHROUGH`, not `SIMPLE` or `MUTUAL`. ++ +. If the mode is incorrect, update the Gateway configuration as follows: ++ +[source,yaml] +---- +spec: + servers: + - port: + number: 15443 + name: tls + protocol: TLS + tls: + mode: PASSTHROUGH + hosts: + - "*.local" +---- ++ +The `tls.mode` field must be `PASSTHROUGH`, not `SIMPLE` or `MUTUAL`. + +.
Check that the SPIRE Agent SDS configuration includes `default_all_bundles_name: "ROOTCA"` by running the following command: ++ +[source,terminal] +---- +$ oc get configmap spire-agent -n openshift-spire -o yaml | grep -A5 SDS +---- ++ +The configuration should include the following settings: ++ +[source,text] +---- +SDS "unix" { + plugin_data { + default_bundle_name = "null" + default_all_bundles_name = "ROOTCA" + } +} +---- ++ +The `default_all_bundles_name` field serves all trust bundles, both local and federated, under the name `ROOTCA`. + +[id="workload-identity-mismatch_{context}"] +Workload identity mismatch in cross-cluster communication:: ++ +*Symptom* ++ +Cross-cluster connections fail with authorization errors. SPIRE logs show identity validation failures. ++ +*Cause* ++ +The SPIFFE ID template generates identities that differ from what the receiving cluster expects, or the trust domains do not match the configured federation. ++ +*Resolution* + +. Verify the SPIFFE ID format of the source workload by running the following command: ++ +[source,terminal] +---- +$ oc exec -n -c istio-proxy -- \ + openssl s_client -connect localhost:15000 -showcerts < /dev/null 2>/dev/null | \ + openssl x509 -noout -text | grep -A1 "Subject Alternative Name" +---- + +. Verify the SPIFFE ID template in the `ClusterSPIFFEID` resource by running the following command: ++ +[source,terminal] +---- +$ oc get clusterspiffeid -o jsonpath='{.items[*].spec.spiffeIDTemplate}' +---- + +. Ensure that the trust domain in the SPIFFE ID matches the local cluster's trust domain and that the SPIFFE ID follows this format: ++ +[source,text] +---- +spiffe:///ns//sa/ +---- ++ +Check that both clusters use compatible SPIFFE ID templates for service accounts. + +[id="missing-socket-in-workload-pod_{context}"] +SPIRE socket not present in workload pod:: ++ +*Symptom* ++ +The SPIRE socket is not mounted at `/tmp/spire-agent/public/socket` in workload pods.
++ +*Cause* ++ +The required `userVolume` and `userVolumeMount` annotations are missing from the pod specification. ++ +*Resolution* + +. Verify that the socket is present by running the following command: ++ +[source,terminal] +---- +$ oc exec -n -c istio-proxy -- ls -l /tmp/spire-agent/public/ +---- ++ +The output should show a file named `socket`, not `spire-agent.sock`. + +. Add the required annotations to the pod or deployment: ++ +[source,yaml] +---- +metadata: + annotations: + sidecar.istio.io/userVolume: '{"spire-agent-socket":{"csi":{"driver":"csi.spiffe.io","readOnly":true}}}' + sidecar.istio.io/userVolumeMount: '{"spire-agent-socket":{"mountPath":"/tmp/spire-agent/public","readOnly":true}}' +---- + +. Delete and re-create the pod to apply the annotations. + +. Verify that the SPIRE Agent WorkloadAPI configuration uses the correct socket path. The configuration should include the following settings: ++ +[source,text] +---- +plugins { + WorkloadAPI "socket" { + plugin_cmd = "/opt/spire/bin/spire-agent" + plugin_data { + socket_path = "/run/spire/agent-sockets/socket" + } + } +} +---- +The `socket_path` filename must be `socket`, not `spire-agent.sock`. + +[id="eastwest-gateway-503-errors_{context}"] +East-West Gateway returning 503 Service Unavailable:: ++ +*Symptom* ++ +Cross-cluster requests fail with HTTP 503 errors at the East-West Gateway. ++ +*Possible causes and resolutions* ++ +*No healthy endpoints for remote service* ++ +. Check the gateway's endpoint status by running the following command: ++ +[source,terminal] +---- +$ oc exec -n istio-system -c istio-proxy -- \ + curl localhost:15000/clusters | grep +---- ++ +If no healthy endpoints appear, verify that: ++ +* Remote Secrets are correctly configured +* Services exist in the remote cluster +* The remote cluster's Istio control plane is running ++ +*meshNetworks misconfiguration* ++ +.
Verify the `meshNetworks` configuration points to the correct gateway by running the following command: ++ +[source,terminal] +---- +$ oc get istio default -n istio-system -o jsonpath='{.spec.values.meshConfig.meshNetworks}' +---- ++ +Ensure the gateway address matches the actual East-West Gateway service or external IP. ++ +. If the configuration is missing or incorrect, update the Istio CR as follows: ++ +[source,yaml] +---- +spec: + values: + meshConfig: + meshNetworks: + network-cluster-b: + endpoints: + - fromRegistry: cluster-b + gateways: + - address: eastwest-gateway.istio-system.svc.cluster.local + port: 15443 +---- + +[id="verifying-multicluster-spire-integration_{context}"] +Verifying multi-cluster SPIRE integration:: ++ +To confirm that multi-cluster SPIRE integration is working correctly: ++ +. Verify that the federated trust bundles are in the SPIRE Server by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-server-0 -- \ + /opt/spire/bin/spire-server bundle list +---- ++ +The output should show trust bundles for all federated trust domains. ++ +. Verify that the workload X.509 SVIDs include federated bundles by running the following command: ++ +[source,terminal] +---- +$ oc exec -n openshift-spire spire-agent- -- \ + /opt/spire/bin/spire-agent api fetch x509 \ + -socketPath /run/spire/agent-sockets/socket +---- ++ +Look for multiple CA certificates in the trust bundle section. ++ +. Verify that Envoy has federated bundles by running the following command: ++ +[source,terminal] +---- +$ oc exec -n -c istio-proxy -- \ + curl localhost:15000/config_dump | \ + jq '.configs[] | select(.["@type"] | contains("SecretsConfigDump"))' +---- ++ +The output should show multiple CA certificates in the validation context if federation is working correctly. ++ +. 
Test the cross-cluster connectivity with verbose output by running the following command: ++ +[source,terminal] +---- +$ oc exec -n -- \ + curl -v http://..svc.cluster.local:8080 +---- ++ +Check the TLS handshake details for SPIFFE identities. ++ +. Verify East-West Gateway traffic by running the following command: ++ +[source,terminal] +---- +$ oc logs -n istio-system -c istio-proxy +---- ++ +Look for TLS passthrough connections to remote clusters. ++ +. Check the SPIRE federation health by running the following command: ++ +[source,terminal] +---- +$ oc logs -n openshift-spire spire-server-0 | grep -i "bundle refresh" +---- ++ +Look for successful bundle refresh messages from federated trust domains. \ No newline at end of file diff --git a/modules/zero-trust-manager-multi-cluster-spire-mesh-about.adoc b/modules/zero-trust-manager-multi-cluster-spire-mesh-about.adoc new file mode 100644 index 000000000000..19ede1487c30 --- /dev/null +++ b/modules/zero-trust-manager-multi-cluster-spire-mesh-about.adoc @@ -0,0 +1,92 @@ +// Module included in the following assemblies: +// +// * security/zero_trust_workload_identity_manager/zero-trust-manager-install.adoc + +:_mod-docs-content-type: CONCEPT +[id="zero-trust-manager-multi-cluster-spire-mesh-about_{context}"] += Multi-cluster SPIRE integration with OpenShift Service Mesh + +[role="_abstract"] +You can use SPIRE federation to enable cross-cluster mutual TLS authentication between OpenShift Service Mesh workloads running on separate clusters, providing a unified zero trust identity framework across a multi-cluster deployment. + +Multi-cluster SPIRE integration extends single-cluster SPIRE capabilities to enable workloads in different clusters to authenticate each other using SPIFFE identities. This eliminates the need for separate certificate authorities per cluster and enables true cross-cluster zero trust architecture. 
+ +[id="multicluster-spire-architecture_{context}"] +== Multi-cluster architecture components + +The multi-cluster SPIRE integration with OpenShift Service Mesh involves the following components beyond the single-cluster setup: + +SPIRE federation:: A mechanism for exchanging trust bundles (CA certificates) between SPIRE Servers in different clusters, allowing workloads in one cluster to validate certificates issued by another cluster's SPIRE Server. + +East-West Gateway:: Istio gateways that handle cross-cluster service mesh traffic by using TLS Passthrough mode, preserving the original SPIRE-issued certificates for end-to-end mutual TLS. + +Remote Secrets:: Kubernetes secrets containing credentials that allow one cluster's Istio control plane to discover services in another cluster. + +meshNetworks configuration:: Istio configuration that defines how to route traffic between clusters through East-West Gateways. + +ClusterFederatedTrustDomain:: Custom resource that defines remote SPIRE trust domains to federate with, including bundle endpoint URLs. + +trustDomainAliases:: Istio configuration that maps remote trust domains to local aliases, enabling Envoy to validate certificates from federated clusters. + +[id="multicluster-traffic-flow_{context}"] +== Cross-cluster traffic flow + +In a multi-cluster SPIRE-integrated service mesh, traffic flows as follows: + +. A workload in Cluster A initiates a connection to a service in Cluster B. +. The Envoy sidecar in Cluster A uses service discovery information obtained through Remote Secrets to identify the East-West Gateway in Cluster B. +. The connection is routed to Cluster B's East-West Gateway using TLS Passthrough mode, preserving the original SPIRE-issued certificate. +. The gateway forwards traffic to the target workload in Cluster B. +. The target workload's Envoy sidecar validates the client certificate using the federated trust bundle obtained from Cluster A's SPIRE Server. +. 
mTLS is established using SPIRE-issued certificates, with both workloads authenticating using their SPIFFE identities. + +[id="multicluster-vs-singlecluster_{context}"] +== Multi-cluster versus single-cluster integration + +The key differences between single-cluster and multi-cluster SPIRE integration are: + +[cols="2,3,3",options="header"] +|=== +|Aspect +|Single-cluster +|Multi-cluster + +|Certificate authority +|One SPIRE Server per cluster +|Multiple SPIRE Servers federated across clusters + +|Trust model +|All workloads trust the same CA +|Workloads trust multiple federated CAs + +|Service discovery +|Local Kubernetes service discovery +|Requires Remote Secrets for cross-cluster discovery + +|Network routing +|Direct pod-to-pod or service communication +|Traffic routed through East-West Gateways + +|Configuration complexity +|Basic SPIRE and Istio configuration +|Additional federation, gateway, and meshNetworks configuration + +|Use case +|Single cluster deployment +|Multi-cluster federation, disaster recovery, geo-distribution +|=== + +[id="benefits-multicluster-spire_{context}"] +== Benefits of multi-cluster SPIRE integration + +Multi-cluster SPIRE integration provides the following benefits: + +* *Unified identity framework*: Workloads across all clusters use consistent SPIFFE identities, simplifying access control policies. + +* *Federated trust*: Eliminates the need to configure separate trust relationships for each cluster pair. + +* *Cross-cluster zero trust*: Cryptographic authentication extends across cluster boundaries, not just within a cluster. + +* *Certificate lifecycle management*: SPIRE automatically rotates certificates and updates federated trust bundles without manual intervention. + +* *Support for hybrid deployments*: Enables secure communication between workloads in different cloud providers or on-premises environments.
\ No newline at end of file diff --git a/modules/zero-trust-manager-using-multi-cluster.adoc b/modules/zero-trust-manager-using-multi-cluster.adoc new file mode 100644 index 000000000000..a93d1067d2a0 --- /dev/null +++ b/modules/zero-trust-manager-using-multi-cluster.adoc @@ -0,0 +1,33 @@ +// Module included in the following assemblies: +// +// * security/zero_trust_workload_identity_manager/zero-trust-manager-install.adoc + +:_mod-docs-content-type: CONCEPT +[id="zero-trust-manager-using-multi-cluster_{context}"] += Using multi-cluster SPIRE integration + +[role="_abstract"] +Use multi-cluster SPIRE integration with federation when your service mesh spans multiple {product-name} clusters and requires cross-cluster mTLS authentication with a unified identity framework. + +A multi-cluster deployment extends SPIRE integration across multiple {product-name} clusters by using SPIRE federation. This configuration is appropriate when: + +* Your service mesh spans multiple {product-name} clusters +* You need cross-cluster service-to-service communication with mTLS authentication +* You require a unified identity framework across clusters in different regions or cloud providers +* You are implementing disaster recovery or high availability across clusters +* You need to support hybrid deployments with workloads in multiple environments + +Multi-cluster SPIRE integration requires additional components beyond single-cluster: + +* SPIRE federation for trust bundle exchange between clusters +* East-West Gateways for cross-cluster traffic routing +* Remote Secrets for cross-cluster service discovery +* `meshNetworks` configuration to define cluster network topology +* `ClusterFederatedTrustDomain` resources defining remote SPIRE servers +* `trustDomainAliases` mapping to enable certificate validation across clusters + +[NOTE] +==== +Multi-cluster SPIRE integration builds on a single-cluster deployment. 
You must first deploy SPIRE on each cluster following the single-cluster procedures before configuring federation. +==== + diff --git a/security/zero_trust_workload_identity_manager/zero-trust-manager-mesh-integration-multi-cluster.adoc b/security/zero_trust_workload_identity_manager/zero-trust-manager-mesh-integration-multi-cluster.adoc new file mode 100644 index 000000000000..69f788bfe649 --- /dev/null +++ b/security/zero_trust_workload_identity_manager/zero-trust-manager-mesh-integration-multi-cluster.adoc @@ -0,0 +1,38 @@ +:_mod-docs-content-type: ASSEMBLY +[id="zero-trust-manager-mesh-integration-multi-cluster"] += Integrating SPIRE federation with multi-cluster Red Hat OpenShift Service Mesh +include::_attributes/common-attributes.adoc[] +:context: zero-trust-manager-mesh-integration-multi-cluster + +toc::[] + +[role="_abstract"] +Configure SPIRE federation across multiple {product-name} clusters to enable cross-cluster mutual TLS (mTLS) authentication and zero trust workload identity in a multi-cluster service mesh deployment. + +[id="prerequisites-spire-multicluster_{context}"] +== Prerequisites + +* You have access to multiple {product-name} clusters using an account with `cluster-admin` permissions on each cluster. +* You have installed the `oc` CLI and `istioctl` CLI. +* You have installed {zero-trust-full} from the OperatorHub on all clusters. +* You have installed the {SMProductName} 3 Operator (`servicemeshoperator3`) from the OperatorHub on all clusters. +* You have deployed SPIRE on each cluster following the single-cluster deployment procedure. +* Each cluster has a unique SPIFFE trust domain. For example, `cluster-a.example.org`, `cluster-b.example.org`, and so on. +* Network connectivity exists between clusters for service mesh traffic. 
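Two of the prerequisites above can be checked locally before you start: that the `oc` and `istioctl` CLIs are installed, and that the trust domains planned for each cluster are unique. The following is a minimal sketch, not part of the product tooling. The helper names `check_cli` and `duplicate_trust_domains` are hypothetical, and the trust domains are the example values from the prerequisites list.

```shell
#!/bin/sh
# Hypothetical helper: report whether a CLI is available on PATH.
check_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Hypothetical helper: print any trust domain that is passed more than once.
# Empty output means that every trust domain is unique.
duplicate_trust_domains() {
  printf '%s\n' "$@" | sort | uniq -d
}

# Check both CLIs named in the prerequisites; keep going so both are reported.
for cli in oc istioctl; do
  check_cli "$cli" || true
done

# Example trust domains taken from the prerequisites; replace with your own.
dups=$(duplicate_trust_domains cluster-a.example.org cluster-b.example.org)
if [ -n "$dups" ]; then
  echo "duplicate trust domains: $dups"
fi
```

The script only inspects the local workstation and the names you pass in. It does not contact any cluster, so it cannot confirm Operator installation, SPIRE deployment, or network connectivity.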
+ +include::modules/zero-trust-manager-multi-cluster-spire-mesh-about.adoc[leveloffset=+1] + +include::modules/zero-trust-manager-using-multi-cluster.adoc[leveloffset=+1] + +include::modules/zero-trust-manager-deploy-spire-multi-cluster.adoc[leveloffset=+1] + +include::modules/zero-trust-manager-configure-spire-mesh-multi-cluster.adoc[leveloffset=+1] + +include::modules/zero-trust-manager-limits-troubleshooting.adoc[leveloffset=+1] + + + +[id="additional-resources-spire-multicluster_{context}"] +== Additional resources + +* link:https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.3/html-single/installing/index#ossm-multi-cluster-configuration-overview_ossm-multi-cluster-topologies[Multi-cluster configuration overview]