|
| 1 | +A101: xDS-Based setting SNI and server certificate SAN validation |
| 2 | +---- |
| 3 | +* Author: [Kannan Jayaprakasam](https://github.com/kannanjgithub) |
| 4 | +* Approver: [Eric Anderson](https://github.com/ejona86) |
| 5 | +* Status: Draft |
| 6 | +* Implemented in: |
| 7 | +* Last updated: 2025-07-28 |
| 8 | + |
| 9 | +## Abstract |
| 10 | + |
| 11 | +gRPC will add support for setting Server Name Indication (SNI) and for validating the server certificate's |
| 12 | +Subject Alternative Names (SANs) against the SNI that was used. |
| 13 | + |
| 14 | +### Background |
| 15 | + |
| 16 | +During the TLS handshake, the server presents its certificate to the client for authentication. For servers |
| 17 | +serving multiple domains, the client needs to indicate which domain it is requesting, so that the server |
| 18 | +can present the certificate it has for that domain. The client does this during the TLS handshake |
| 19 | +via Server Name Indication (SNI). When using `XdsChannelCredentials` for a channel, the gRPC client needs |
| 20 | +to be configured by the xDS server with the value to send for SNI, and the gRPC client should use it for |
| 21 | +the TLS handshake. |
| 22 | + |
| 23 | +In [A29][A29] for TLS security in xDS-managed connections, the `sni` field from [UpstreamTlsContext.sni][UTC_SNI] |
| 24 | +was ignored. |
| 25 | + |
| 26 | +When using `XdsChannelCredentials` for the channel, hostname validation |
| 27 | +is turned off and SAN matching is instead performed against [UpstreamTlsContext.match_subject_alt_names][match_subject_alt_names] |
| 28 | +rather than against a typical hostname. This proposal adds SAN matching against the same name that the client used for SNI. |
| 29 | + |
| 30 | +For an overview of securing connections in the Envoy proxy using SNI |
| 31 | +and SAN validation, see [envoy-SNI]. |
| 32 | + |
| 33 | +[UTC_SNI]: https://github.com/envoyproxy/envoy/blob/ee2bab9e40e7d7649cc88c5e1098c74e0c79501d/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L42 |
| 34 | +[A29]: A29-xds-tls-security.md |
| 35 | +[envoy-SNI]: https://www.envoyproxy.io/docs/envoy/latest/start/quick-start/securing |
| 36 | +[match_subject_alt_names]: https://github.com/envoyproxy/envoy/blob/b29d6543e7568a8a3e772c7909a1daa182acc670/api/envoy/extensions/transport_sockets/tls/v3/common.proto#L407 |
| 37 | + |
| 38 | +## Proposal |
| 39 | +This proposal has two parts: |
| 40 | +1. Setting SNI: When using `XdsChannelCredentials` for the channel, gRPC clients will set SNI for the TLS handshake |
| 41 | +using the fields from [UpstreamTlsContext][UTC] in the CDS update. |
| 42 | + |
| 43 | + i. If [UpstreamTlsContext][UTC] specifies `auto_host_sni` and the hostname is available, then SNI will be set to the hostname. The hostname |
| 44 | + is either the DNS name for logical DNS clusters or the endpoint hostname for EDS clusters, as in the case of the hostname used for [authority rewriting][A81-hostname]. |
| 45 | + |
| 46 | + ii. Else, if `UpstreamTlsContext.sni` specifies the SNI to use, then it will be used. |
| 47 | + |
| 48 | + iii. Else, no SNI will be set for the TLS handshake. |
| 49 | + |
| 50 | +[UTC]: https://github.com/envoyproxy/envoy/blob/ee2bab9e40e7d7649cc88c5e1098c74e0c79501d/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L29 |
| 51 | +[A81-hostname]: A81-xds-authority-rewriting.md#xds-resource-validation |
| 52 | + |
| 53 | +2. Server SAN validation against SNI used: If `auto_sni_san_validation` is true in the [UpstreamTlsContext][UTC], |
| 54 | +the gRPC client will match the server certificate's SANs against the SNI used for the handshake. The normal matching when using |
| 55 | +`TlsCredentials` for the channel only checks against DNS SANs in the certificate, but with `XdsChannelCredentials` |
| 56 | +matching will be done using any of the DNS, URI, or IP address SAN types in the server certificate. |
| 57 | + |
| 58 | +### Related Proposals: |
| 59 | +* [gRFC A29: xDS-Based Security for gRPC Clients and Servers][A29] |
| 60 | +* [gRFC A81: xDS Authority Rewriting][A81] |
| 61 | + |
| 62 | +[A29]: A29-xds-tls-security.md |
| 63 | +[A81]: A81-xds-authority-rewriting.md |
| 64 | + |
| 65 | +### Setting SNI during Tls handshake |
| 66 | +As mentioned in [A29 implementation details][A29_impl-details], the `UpstreamTlsContext` is |
| 67 | +passed down to child policies via channel arguments or a similar mechanism, depending on the language. |
| 68 | +[A29 implementation details][A29_impl-details] also describes a `CertificateProvider` object that represents |
| 69 | +a plugin that provides the required certificates and keys to the gRPC implementation. When the TLS handshake is |
| 70 | +initiated for a channel that is using `XdsCredentials`, this `CertificateProvider` object is used to |
| 71 | +provide the certs and trust roots for establishing the secure connection. During this handshake we need |
| 72 | +to set the SNI in the `ClientHello` message. To determine the SNI, we need both the |
| 73 | +SNI-related fields from the parsed `UpstreamTlsContext` and the hostname for the endpoint. |
| 74 | +For this purpose, `UpstreamTlsContext.sni` and `UpstreamTlsContext.auto_host_sni` from the parsed |
| 75 | +cluster resource will also be set into the `CertificateProvider` by the xds_cluster_impl policy. |
| 76 | +When the TLS handling code uses the certs and trust roots from the `CertificateProvider` |
| 77 | +to establish the connection, it will now also determine the SNI to set based on the parsed SNI-related fields |
| 78 | +available in the `CertificateProvider` and the hostname in the endpoint attributes. |
| 79 | +The precedence order given at the top of the [Proposal](#proposal) section will be used to determine the SNI. For example, |
| 80 | +if `UpstreamTlsContext.auto_host_sni` is set but there is no EDS hostname for the endpoint, and |
| 81 | +`UpstreamTlsContext.sni` is set, then the value of `UpstreamTlsContext.sni` will be used. |
| 82 | +If no SNI value is determined, then SNI will not be set for the TLS handshake. |
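
The precedence rules above can be sketched in Java as follows. This is only an illustrative sketch: the class `SniSelector`, the method `selectSni`, and its parameter names are hypothetical and are not part of any gRPC implementation. It assumes the caller has already extracted `auto_host_sni` and `sni` from the parsed `UpstreamTlsContext` and the hostname from the endpoint attributes.

```java
import java.util.Optional;

/** Hypothetical helper illustrating the SNI precedence order; not a gRPC API. */
final class SniSelector {
  private SniSelector() {}

  /**
   * @param autoHostSni value of UpstreamTlsContext.auto_host_sni
   * @param endpointHostname hostname from the endpoint attributes, or null if absent
   * @param configuredSni value of UpstreamTlsContext.sni, or null/empty if unset
   * @return the SNI to send, or empty if no SNI should be set
   */
  static Optional<String> selectSni(
      boolean autoHostSni, String endpointHostname, String configuredSni) {
    // i. auto_host_sni is set and a hostname is available: use the hostname.
    if (autoHostSni && endpointHostname != null && !endpointHostname.isEmpty()) {
      return Optional.of(endpointHostname);
    }
    // ii. Otherwise use UpstreamTlsContext.sni if it is specified.
    if (configuredSni != null && !configuredSni.isEmpty()) {
      return Optional.of(configuredSni);
    }
    // iii. Otherwise no SNI is set for the handshake.
    return Optional.empty();
  }
}
```

For instance, `SniSelector.selectSni(true, null, "svc.example.com")` falls through to the second rule and yields `svc.example.com`, which mirrors the example in the paragraph above.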
| 83 | + |
| 84 | +#### Language specific example |
| 85 | +As an example, in Java, the ClusterImpl LB policy creates the `SslContextProviderSupplier` wrapping the |
| 86 | +`UpstreamTlsContext` and puts it in the subchannel wrapper when its child policy creates a subchannel. At the time of TLS protocol negotiation |
| 87 | +for the subchannel, the TLS handling code should use the hostname from the endpoint attributes and the SNI-related fields in `UpstreamTlsContext` |
| 88 | +to determine the SNI to be used for the TLS handshake. This SNI will also be passed to the `SslContextProviderSupplier`, in addition to the |
| 89 | +callback to be invoked to provide the `SslContext` when it is available. The `ClientCertificateSslContextProvider` instantiated by the `SslContextProviderSupplier` |
| 90 | +will be passed both the callback argument and the SNI value, which will be used in the `XdsX509TrustManager` it creates to perform the SAN-SNI |
| 91 | +matching. The TLS protocol negotiation code will use the determined SNI value when creating the SSL engine from the `SslContext` received via the callback. |
| 92 | + |
| 93 | +[A29_impl-details]: A29-xds-tls-security.md#implementation-details |
| 94 | +[UTC_SNI]: https://github.com/envoyproxy/envoy/blob/ee2bab9e40e7d7649cc88c5e1098c74e0c79501d/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L42 |
| 95 | + |
| 96 | +### SAN SNI validation |
| 97 | +The server certificate validation described in [A29 SAN matching][A29_SAN-matching] |
| 98 | +matches the Subject Alternative Names specified in the server certificate against |
| 99 | +[`match_subject_alt_names`][match_subject_alt_names] in `CertificateValidationContext`. |
| 100 | +If `auto_sni_san_validation` is set in the [UpstreamTlsContext][UTC], matching will be |
| 101 | +performed against the SNI that was used by the client, and this validation will replace |
| 102 | +the [`match_subject_alt_names`][match_subject_alt_names] if set. The value of the |
| 103 | +`auto_sni_san_validation` field and the SNI used by the client will need to be propagated |
| 104 | +to the certificate verifying mechanism that is used based on the settings in the |
| 105 | +`CertificateProvider` when using `XdsChannelCredentials` for the transport. |
| 106 | +The SNI used by the client will be used for matching, regardless of how that SNI was determined. |
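
As a rough illustration of matching the SNI against DNS, URI, and IP address SANs, the sketch below works on the collection shape returned by the JDK's `X509Certificate.getSubjectAlternativeNames()`, where each entry is a list holding an integer type tag and a value. The class `SanSniMatcher` and method `matchesSni` are hypothetical names. The matching semantics are an assumption made for this sketch (case-insensitive exact match for DNS names, exact string match for URI and IP address entries, no wildcard handling); this proposal does not spell them out.

```java
import java.util.Collection;
import java.util.List;

/** Hypothetical sketch of SAN-SNI matching; matching semantics are assumed, not specified. */
final class SanSniMatcher {
  // GeneralName type tags as used by X509Certificate.getSubjectAlternativeNames().
  static final int DNS_NAME = 2;
  static final int URI = 6;
  static final int IP_ADDRESS = 7;

  private SanSniMatcher() {}

  /** Returns true if any DNS, URI, or IP address SAN equals the SNI that was sent. */
  static boolean matchesSni(String sni, Collection<List<?>> sans) {
    if (sni == null || sni.isEmpty() || sans == null) {
      return false; // nothing to match against
    }
    for (List<?> san : sans) {
      // Skip malformed entries and SAN types whose value is not a String.
      if (san.size() < 2
          || !(san.get(0) instanceof Integer)
          || !(san.get(1) instanceof String)) {
        continue;
      }
      int type = (Integer) san.get(0);
      String value = (String) san.get(1);
      if (type == DNS_NAME && value.equalsIgnoreCase(sni)) {
        return true; // DNS names compare case-insensitively
      }
      if ((type == URI || type == IP_ADDRESS) && value.equals(sni)) {
        return true;
      }
    }
    return false;
  }
}
```

With a real certificate, a caller could pass `cert.getSubjectAlternativeNames()` directly; the method tolerates the `null` that call returns when the certificate has no SAN extension.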
| 107 | + |
| 108 | +#### Language specific example |
| 109 | +For example, in Java the SAN-SNI validation occurs in the `TrustManager` created by the `CertProviderClientSslContextProvider` using |
| 110 | +the cert store indicated by `CertificateValidationContext` in `UpstreamTlsContext`, which is either a managed cert store or the system root cert store. |
| 111 | + |
| 112 | +gRPC Java also caches the `SslContext`. The `SslContextProviderSupplier` (named so because it |
| 113 | +supplies both client and server `SslContext` providers) creates a provider for the client `SslContext` and today |
| 114 | +maintains a cache mapping `UpstreamTlsContext` to client `SslContext` provider instances. |
| 115 | +For the SNI requirement, the `TrustManager` in the `SslContext` needs to |
| 116 | +be aware of the SNI to validate the SANs against, so a different `TrustManager` instance needs |
| 117 | +to be created for each SNI used with the same `UpstreamTlsContext`. This cache's key will therefore |
| 118 | +need to be extended to `<UpstreamTlsContext, String>` to hold the SNI as well. The client |
| 119 | +`SslContext` provider for a particular key will create a `TrustManager` instance that takes the |
| 120 | +SNI to validate the SANs against and set it in the `SslContext` it provides. |
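
The composite cache key described above might look like the following sketch. It is deliberately generic: the type parameter `C` stands in for `UpstreamTlsContext` and `P` for the client `SslContext` provider, and the names `SniAwareCache` and `Key` are hypothetical. Any lifecycle management of cached providers is omitted.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical cache keyed on (TLS context, SNI); C and P are placeholder types. */
final class SniAwareCache<C, P> {
  /** Composite key; the SNI may be null when no SNI is sent. */
  static final class Key<C> {
    final C tlsContext;
    final String sni;

    Key(C tlsContext, String sni) {
      this.tlsContext = Objects.requireNonNull(tlsContext);
      this.sni = sni;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key<?> other = (Key<?>) o;
      return tlsContext.equals(other.tlsContext) && Objects.equals(sni, other.sni);
    }

    @Override
    public int hashCode() {
      return Objects.hash(tlsContext, sni);
    }
  }

  private final Map<Key<C>, P> cache = new ConcurrentHashMap<>();
  private final BiFunction<C, String, P> providerFactory;

  SniAwareCache(BiFunction<C, String, P> providerFactory) {
    this.providerFactory = providerFactory;
  }

  /** Returns the cached provider for this (context, SNI) pair, creating it on first use. */
  P get(C tlsContext, String sni) {
    return cache.computeIfAbsent(
        new Key<>(tlsContext, sni), k -> providerFactory.apply(k.tlsContext, k.sni));
  }

  int size() {
    return cache.size();
  }
}
```

Two lookups with the same context but different SNI values produce two distinct providers, so each provider can build a `TrustManager` bound to its own SNI.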
| 121 | + |
| 122 | +[A29_SAN-matching]: A29-xds-tls-security.md#server-authorization-aka-subject-alt-name-checks |
| 123 | +[match_subject_alt_names]: https://github.com/envoyproxy/envoy/blob/b29d6543e7568a8a3e772c7909a1daa182acc670/api/envoy/extensions/transport_sockets/tls/v3/common.proto#L407 |
| 124 | +[UTC]: https://github.com/envoyproxy/envoy/blob/ee2bab9e40e7d7649cc88c5e1098c74e0c79501d/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L29 |
| 125 | + |
| 126 | +#### Validation |
| 127 | +The CDS update will be NACKed if `UpstreamTlsContext.sni` exceeds 255 characters, similar to Envoy. |
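
A minimal sketch of this length check is shown below. The names `SniValidator` and `isValidSni` are hypothetical, and how a failed check is turned into a NACK is left to the implementation. The sketch assumes that "characters" can be approximated by `String.length()`, which counts UTF-16 code units.

```java
/** Hypothetical validation helper for UpstreamTlsContext.sni; not a gRPC API. */
final class SniValidator {
  static final int MAX_SNI_LENGTH = 255;

  private SniValidator() {}

  /** Returns false when the configured SNI is too long and the resource should be rejected. */
  static boolean isValidSni(String sni) {
    // An unset SNI is acceptable; only the length of a configured value is checked.
    return sni == null || sni.length() <= MAX_SNI_LENGTH;
  }
}
```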
| 128 | + |
| 129 | +### Environment variable protections |
| 130 | +Setting SNI and performing the SAN validation against SNI will be guarded by the `GRPC_EXPERIMENTAL_XDS_SNI` env var. The env var guard will be removed once |
| 131 | +the feature passes interop tests. When no SNI value for the TLS handshake can be determined from the rules described above, no SNI will be sent. |
| 132 | +Some language implementations send the xDS channel authority as SNI today, and some customers may see breaking behavior if no SNI is sent. To mitigate |
| 133 | +this risk, an env var `GRPC_USE_CHANNEL_AUTHORITY_IF_NO_SNI_APPLICABLE` will be provided to revert to the old behavior of sending the xDS channel authority |
| 134 | +when no SNI is determined. This value, however, will not be used for SAN verification. |
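
The fallback behavior can be sketched as below. The class `SniFallback` and its methods are hypothetical. The sketch takes the environment as a `Map` so it can be exercised without real environment variables, and it assumes the variable is enabled by the value `true`, which this proposal does not specify.

```java
import java.util.Map;

/** Hypothetical sketch of the channel-authority fallback; env var parsing is assumed. */
final class SniFallback {
  static final String FALLBACK_ENV = "GRPC_USE_CHANNEL_AUTHORITY_IF_NO_SNI_APPLICABLE";

  private SniFallback() {}

  /**
   * Returns the SNI to send in the handshake, or null to send none.
   *
   * @param determinedSni SNI chosen by the precedence rules, or null if none applies
   * @param channelAuthority the xDS channel authority used as the legacy fallback
   */
  static String handshakeSni(
      Map<String, String> env, String determinedSni, String channelAuthority) {
    if (determinedSni != null) {
      return determinedSni;
    }
    // Boolean.parseBoolean(null) is false, so an unset variable disables the fallback.
    boolean fallbackEnabled = Boolean.parseBoolean(env.get(FALLBACK_ENV));
    return fallbackEnabled ? channelAuthority : null;
  }

  /** The fallback authority is never used for SAN validation; only a determined SNI is. */
  static String sanValidationSni(String determinedSni) {
    return determinedSni;
  }
}
```

A caller would typically pass `System.getenv()` as the first argument.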