From 6ed398ab6db134c0ff1f63acc0df535b12c558ac Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Thu, 23 Apr 2026 20:41:41 +0300 Subject: [PATCH 01/12] docs(networking): add Gateway API page describing the Cilium-based per-tenant ingress Covers the architecture, the two-step opt-in (gateway.enabled at platform level, tenant.spec.gateway per tenant), per-service routing (HTTPRoute for termination, TLSRoute for passthrough), the four independent ValidatingAdmissionPolicies that guard cross-tenant hostname hijacking plus the listener allowedRoutes whitelist, the per-tenant cert-manager Issuer that enables isolated ACME state for child tenants, migration from ingress-nginx, rate-limit considerations, and operational troubleshooting. Weight 15 places the page between 'Architecture' (5) and 'HTTP Cache' (20) in the networking section sidebar. Assisted-By: Claude Signed-off-by: Aleksei Sviridkin --- .../en/docs/next/networking/gateway-api.md | 322 ++++++++++++++++++ 1 file changed, 322 insertions(+) create mode 100644 content/en/docs/next/networking/gateway-api.md diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md new file mode 100644 index 00000000..4d19919b --- /dev/null +++ b/content/en/docs/next/networking/gateway-api.md @@ -0,0 +1,322 @@ +--- +title: "Gateway API (Cilium)" +linkTitle: "Gateway API" +description: "Per-tenant Gateway API ingress backed by Cilium — cert-manager integration, TLS termination and passthrough, cross-tenant isolation via ValidatingAdmissionPolicies." +weight: 15 +--- + +## Overview + +Cozystack ships Gateway API support as an opt-in alternative to ingress-nginx. When enabled, every tenant with `spec.gateway: true` gets its own `Gateway` materialised in its own namespace, with the Cilium Gateway API controller programming Envoy-on-DaemonSet and announcing the tenant's LoadBalancer IPs through Cilium LB IPAM. 
Certificates are issued by cert-manager against a per-tenant `Issuer` so each tenant gets an isolated ACME account. + +This page documents the architecture, the two-step opt-in, the security model (four independent ValidatingAdmissionPolicies plus the listener namespace whitelist), and the migration story from ingress-nginx. + +Gateway API and ingress-nginx coexist on the same cluster — the two modes are selected per service / per tenant, not globally. Existing clusters upgrade with `gateway.enabled=false` and see no behavioural change. + +## Architecture + +### Traffic path + +```mermaid +flowchart LR + CLIENT["External client"] + LB["CiliumLoadBalancerIPPool
<br/>(announces publishing.externalIPs)"] + ENV["cilium-envoy DaemonSet<br/>(L7 termination / L4 passthrough)"] + GW["Gateway 'cozystack'<br/>(per-tenant namespace)"] + HTR["HTTPRoute<br/>dashboard, keycloak, harbor, bucket"] + TLR["TLSRoute<br/>cozystack-api, vm-exportproxy,<br/>cdi-uploadproxy"] + CM["cert-manager Issuer<br/>(per-tenant ACME account)"] + SVC["Service<br/>
(backend)"] + + CLIENT -->|DNS → LB IP| LB + LB --> ENV + ENV --> GW + GW --> HTR + GW --> TLR + HTR --> SVC + TLR --> SVC + CM -.->|issues wildcard Certificate| GW +``` + +- **One `GatewayClass cilium`** cluster-wide, reconciled by Cilium's Gateway API controller. There is no per-tenant GatewayClass, so no tenant can hijack the class by naming theirs after someone else. +- **One `Gateway` per tenant** in the tenant's own namespace. All listeners for that tenant live on a single Gateway object; there is no cross-Gateway merge. +- **Envoy** runs as a Cilium DaemonSet (`cilium.envoy.enabled=true`) and handles both TLS termination (HTTPS listener) and TLS passthrough (dedicated per-service listeners for the kubeapiserver and the KubeVirt VM export / CDI upload proxies). +- **LoadBalancer IP** is assigned by Cilium LB IPAM from a `CiliumLoadBalancerIPPool` scoped to the tenant's `cilium-gateway-cozystack` Service. Tenants with shared apex IPs compete for addresses — operators running multi-tenant bare-metal clusters should either carve up `publishing.externalIPs` or give every tenant its own subset. + +### Listener layout on a tenant Gateway + +A tenant Gateway always materialises three base listeners: + +| # | Name | Protocol | Port | Hostname | Purpose | +|---|---|---|---|---|---| +| 1 | `http` | `HTTP` | 80 | none (wildcard) | ACME `/.well-known/acme-challenge/*` + HTTP→HTTPS redirect HTTPRoute | +| 2 | `https` | `HTTPS` | 443 | `*.` | TLS termination for wildcard subdomain services (dashboard, keycloak, etc.) | +| 3 | `https-apex` | `HTTPS` | 443 | `` | TLS termination for the apex domain itself | + +Plus one extra listener per TLS-passthrough service (see [TLS passthrough](#tls-passthrough) below). + +`` is read from `_namespace.host` which the tenant chart derives from the tenant's `spec.host` (or inherits from the parent). Listeners 2 and 3 both consume the wildcard `Certificate` that cert-manager issues against the per-tenant `Issuer`. 
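
The listener layout above can be pictured as a single `Gateway` manifest. This is a minimal sketch, not the output of the tenant chart: the namespace `tenant-alice`, the hostname `alice.example.org`, the Secret name `alice-gateway-tls`, and the contents of the namespace whitelist are all hypothetical placeholders, and the per-service Passthrough listeners are left out.

```yaml
# Illustrative sketch only: namespace, hostnames, and Secret name are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cozystack
  namespace: tenant-alice              # hypothetical tenant namespace
spec:
  gatewayClassName: cilium
  listeners:
    - name: http                       # ACME challenges and the HTTP-to-HTTPS redirect
      protocol: HTTP
      port: 80
      allowedRoutes: &allowed          # YAML anchor, reused by the listeners below
        namespaces:
          from: Selector
          selector:
            matchExpressions:
              - key: kubernetes.io/metadata.name
                operator: In
                values:
                  - tenant-alice       # the Gateway's own namespace
                  - cozy-dashboard     # example attachedNamespaces entry
    - name: https                      # wildcard subdomain services
      protocol: HTTPS
      port: 443
      hostname: "*.alice.example.org"
      tls:
        mode: Terminate
        certificateRefs:
          - name: alice-gateway-tls    # hypothetical Secret holding the wildcard cert
      allowedRoutes: *allowed
    - name: https-apex                 # the apex domain itself
      protocol: HTTPS
      port: 443
      hostname: "alice.example.org"
      tls:
        mode: Terminate
        certificateRefs:
          - name: alice-gateway-tls
      allowedRoutes: *allowed
```

The YAML anchor is only there to keep the sketch short; a rendered manifest would normally repeat the `allowedRoutes` block on each listener.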
+ +## Enabling Gateway API + +Gateway API is opt-in at two levels. Both defaults stay `false`; upgrades do not flip tenants silently. + +### 1. Platform-level flag + +Set `gateway.enabled: true` on the `cozystack.cozystack-platform` Package: + +```yaml +apiVersion: cozystack.io/v1alpha1 +kind: Package +metadata: + name: cozystack.cozystack-platform +spec: + variant: isp-full + components: + platform: + values: + publishing: + host: example.org + gateway: + enabled: true + attachedNamespaces: + - cozy-cert-manager + - cozy-dashboard + - cozy-keycloak + - cozy-system + - cozy-harbor + - cozy-bucket + - cozy-kubevirt + - cozy-kubevirt-cdi + - cozy-monitoring + - cozy-linstor-gui +``` + +Flipping `gateway.enabled=true` wires three things: + +- cert-manager `ClusterIssuer.spec.acme.solvers` switches from `http01.ingress.ingressClassName` to `http01.gatewayHTTPRoute` that attaches to the publishing tenant's Gateway. +- The exposed-service templates (dashboard, keycloak) stop rendering their `Ingress` and start rendering their `HTTPRoute`. +- TLS-passthrough services (cozystack-api, vm-exportproxy, cdi-uploadproxy) stop rendering their `Ingress` and start rendering a `TLSRoute` attached to a dedicated Passthrough listener. + +The `attachedNamespaces` list restricts which namespaces may attach `HTTPRoute`s to tenant Gateways through the listener `allowedRoutes` whitelist (see [Security](#security)). It is also guarded by a runtime `ValidatingAdmissionPolicy` that rejects any `tenant-*` entry. + +### 2. Per-tenant toggle + +Set `spec.gateway: true` on any tenant to materialise its `Gateway`, `Certificate`, `Issuer` and `CiliumLoadBalancerIPPool`: + +```yaml +apiVersion: apps.cozystack.io/v1alpha1 +kind: Tenant +metadata: + name: alice + namespace: tenant-root +spec: + gateway: true + resourceQuotas: + count/certificates.cert-manager.io: "10" +``` + +Tenants may leave `spec.host` empty — the tenant chart computes it as `.`. 
Setting `spec.host` is reserved for cluster-admins and cozystack/Flux service accounts (enforced at runtime by `cozystack-tenant-host-policy`, see [Security](#security)). + +A child tenant with `spec.gateway: true` receives its own Gateway, its own wildcard Certificate, and its own `Issuer` that talks to Let's Encrypt on a separate ACME account, so child tenants do not share HTTP-01 challenge state with the parent or with siblings. + +## Per-service routing + +When `gateway.enabled=true`, the following services switch from `Ingress` to Gateway API resources: + +### HTTPRoute (TLS termination on Gateway) + +| Service | Namespace | `HTTPRoute` name | Backend | Listener | +|---|---|---|---|---| +| dashboard | `cozy-dashboard` | `dashboard` | `incloud-web-gatekeeper:8000` | `https` | +| keycloak | `cozy-keycloak` | `keycloak` | `keycloak-http:80` | `https` | +| harbor | tenant namespace | `` | `:80` | `https` (tenant's own Gateway) | +| bucket | tenant namespace | `-ui` | `-ui:8080` | `https` (tenant's own Gateway) | + +cert-manager's HTTP-01 solver places its short-lived `HTTPRoute` on the `http` listener of the same Gateway, path-matched to `/.well-known/acme-challenge/`. The more specific path match takes precedence over the catch-all HTTP→HTTPS redirect HTTPRoute. + +### TLSRoute (TLS passthrough) + +Services that need SNI-based passthrough (clients present certificates, backend terminates TLS) use `TLSRoute` on a dedicated Passthrough listener. 
One listener per service, hostname scoped to that service's FQDN: + +| Service | Namespace | `TLSRoute` name | Backend | Listener | +|---|---|---|---|---| +| Kubernetes API | `default` | `kubernetes-api` | `kubernetes:443` | `tls-api` | +| KubeVirt VM export | `cozy-kubevirt` | `vm-exportproxy` | `vm-exportproxy:443` | `tls-vm-exportproxy` | +| KubeVirt CDI upload | `cozy-kubevirt-cdi` | `cdi-uploadproxy` | `cdi-uploadproxy:443` | `tls-cdi-uploadproxy` | + +The Passthrough listener is added to the Gateway only if the corresponding service appears in `publishing.exposedServices`. The wildcard `https` listener at `*.` and these specific `tls-*` listeners coexist on port 443; Cilium resolves SNI to the most-specific hostname match. + +`TLSRoute` is shipped from the Gateway API experimental channel (CRD `gateway.networking.k8s.io/v1alpha2`). It graduates to `v1` in Gateway API v1.5 / Cilium v1.20; Cozystack currently pins `v1alpha2` for compatibility with Cilium v1.19. + +## Security + +Gateway API multi-tenancy in Cozystack is guarded by **four independent ValidatingAdmissionPolicies** plus a listener-level namespace whitelist. Each check enforces one invariant and fails closed on policy/ConfigMap errors (`failurePolicy: Fail`, `validationActions: [Deny]`). Compromising one of them does not bypass the others. + +```mermaid +flowchart TD + ATK["Attacker
<br/>(tenant user, misconfig, compromised SA)"] + L1["Listener allowedRoutes whitelist<br/>(kubernetes.io/metadata.name, kube-apiserver enforced)"] + L2["VAP: Gateway listener hostname<br/>must match namespace.cozystack.io/host"] + L3["VAP: Tenant spec.host restricted<br/>to cluster-admins and cozystack/Flux SAs"] + L4["VAP: namespace.cozystack.io/host label<br/>immutable after first write"] + L5["VAP: Package attachedNamespaces<br/>must not contain tenant-*"] + L6["Render-time helm fail<br/>for tenant-* in attachedNamespaces"] + GW["Cross-tenant hostname hijack<br/>
BLOCKED"] + + ATK --> L1 + ATK --> L2 + ATK --> L3 + ATK --> L4 + ATK --> L5 + ATK --> L6 + L1 --> GW + L2 --> GW + L3 --> GW + L4 --> GW + L5 --> GW + L6 --> GW +``` + +### Layer 1 — Listener `allowedRoutes` namespace whitelist + +Every listener on a tenant Gateway pins `allowedRoutes.namespaces.from: Selector` to a `matchExpressions` whitelist against the built-in `kubernetes.io/metadata.name` label. That label is written by kube-apiserver on every namespace and cannot be spoofed. + +The whitelist is the publishing tenant's namespace (always, implicit) plus `publishing.gateway.attachedNamespaces`. A namespace outside the list literally cannot attach any `HTTPRoute` to the Gateway. + +### Layer 2 — `cozystack-gateway-hostname-policy` + +`ValidatingAdmissionPolicy` scoped to `gateway.networking.k8s.io/v1 Gateway` CREATE/UPDATE. CEL reads `namespaceObject.metadata.labels["namespace.cozystack.io/host"]` and rejects any listener whose hostname is not equal to that value or a subdomain of it. + +Because the VAP reads the namespace label (not a cluster-wide ConfigMap), a tenant with a fully independent apex domain (e.g. `customer1.io`, not a subdomain of the platform apex) is validated correctly — the VAP does not assume a subdomain hierarchy. + +### Layer 3 — `cozystack-tenant-host-policy` + +`ValidatingAdmissionPolicy` scoped to `apps.cozystack.io/v1alpha1 Tenant` CREATE/UPDATE. Rejects setting or changing `spec.host` unless the caller is in the `system:masters` group or is a service account in `cozy-*`, `flux-system` or `kube-system`. Tenants can still create tenants with empty `spec.host` (normal inheritance flow). + +This closes the path where a tenant user creates a Tenant with `spec.host=dashboard.example.org` to have the tenant chart write a hijacked label into their namespace. + +### Layer 4 — `cozystack-namespace-host-label-policy` + +`ValidatingAdmissionPolicy` scoped to core `v1 Namespace` UPDATE. 
Rejects any change to `namespace.cozystack.io/host` once the label is set, except by the same trusted-caller whitelist. CREATE is unrestricted (initial label write happens there, by the cozystack chart). + +Combined with Layer 3, a tenant user cannot rewrite their host through either route. + +### Layer 5 — `cozystack-gateway-attached-namespaces-policy` + +`ValidatingAdmissionPolicy` scoped to `cozystack.io/v1alpha1 Package` CREATE/UPDATE. CEL walks `spec.components.platform.values.gateway.attachedNamespaces` and rejects any entry starting with `tenant-`. Catches `kubectl edit packages.cozystack.io` that would bypass helm. + +### Layer 6 — Render-time `fail` + +cozystack-basics' hostname policy template also fails the chart render if `_cluster.gateway-attached-namespaces` contains a `tenant-*` entry. Triggers on the helm-install path before the cluster ever sees the values. Belt-and-suspenders with Layer 5. + +### What this does NOT defend + +These residuals are design choices, not runtime gaps: + +- **Cluster-admin credentials.** Anyone in `system:masters` or with a matching cozystack/Flux SA can set any host. Gateway API isolation is not the weakest link at that trust level. +- **DNS control.** A tenant whose VAP-allowed hostname does not resolve to the cluster's LB IP cannot complete ACME HTTP-01. No Certificate is issued; no hijack even if admission somehow admitted the Gateway. ACME's DNS-based identity proof is the last line. +- **Shared LB IP pool.** Tenants drawing from the same `publishing.externalIPs` block compete for addresses via Cilium LB IPAM. Operators with multiple opted-in tenants should carve up the IP space per tenant. + +## Certificates + +Every tenant with `spec.gateway: true` gets its own cert-manager `Issuer` (namespace-scoped, not `ClusterIssuer`) named `gateway`. The Issuer carries its own ACME account via `privateKeySecretRef: gateway-acme-account`. 
The wildcard `Certificate` for the tenant references `issuerRef.kind: Issuer, name: gateway`. + +Two ACME servers are supported out of the box: + +- `publishing.certificates.issuerName: letsencrypt-prod` → `https://acme-v02.api.letsencrypt.org/directory` +- `publishing.certificates.issuerName: letsencrypt-stage` → `https://acme-staging-v02.api.letsencrypt.org/directory` + +Any other value fails the chart render with a pointer to `packages/extra/gateway/templates/issuer.yaml` for how to add a new mapping. + +### Rate limits + +Let's Encrypt enforces per-account and per-registered-domain quotas: + +- 50 new certificates per registered domain per week +- 5 duplicate certificates per week for the same hostname set +- 300 new orders per account per 3 hours + +A cluster where many tenants share the same apex domain can exhaust these quickly. Mitigations: + +- `publishing.certificates.issuerName: letsencrypt-stage` for non-production clusters (staging quotas do not affect prod). +- `tenant.spec.resourceQuotas.count/certificates.cert-manager.io` to cap per-tenant certificate creations. +- For air-gapped deployments, use the bundled `selfsigned-cluster-issuer` or an internal ACME server. + +## Migration from ingress-nginx + +The two modes coexist. Switching happens per cluster (`gateway.enabled`) and per tenant (`tenant.spec.gateway`), not globally. + +### For a new cluster + +Set both flags at install time. Ingress-nginx can be disabled entirely: + +```yaml +gateway: + enabled: true +publishing: + exposure: loadBalancer # ingress-nginx also moves off Service.spec.externalIPs +``` + +Tenants then enable `spec.gateway: true` at creation time. + +### For an existing cluster + +1. Flip `gateway.enabled: true` on the platform Package. This rerenders cert-manager ClusterIssuers and the exposed-service templates. 
Existing `Ingress` objects for dashboard / keycloak / cozystack-api / vm-exportproxy / cdi-uploadproxy are deleted by Flux as they are replaced by `HTTPRoute` / `TLSRoute`. +2. For each tenant that should move to Gateway API, set `tenant.spec.gateway: true`. The tenant chart materialises the `Gateway`, `Certificate` and `Issuer`. +3. Verify: `kubectl -n wait gateway/cozystack --for=condition=Programmed`, then `kubectl -n wait certificate/-gateway-tls --for=condition=Ready`. +4. Once every tenant has migrated, the `cozystack.ingress-application` package source can be removed from the system bundle — ingress-nginx deployment is no longer required. + +Applications that live in upstream vendored charts (harbor, bucket) attach to their tenant's Gateway through `_namespace.gateway`, which the tenant chart populates automatically once `spec.gateway: true` is set. + +## Known limitations + +- **Tenant IP allocation from a shared pool.** `publishing.externalIPs` is cluster-wide. Tenants with `gateway: true` compete for addresses. Operators running multi-tenant deployments should subset IPs per tenant — Cozystack does not partition the list automatically. +- **TLSRoute v1alpha2.** Gateway API v1.5 / Cilium v1.20 will promote TLSRoute to `v1`. Cozystack will follow once the Cilium version lands. `v1alpha2` is the currently-supported version. +- **`tenant.spec.host` enforcement.** A tenant cannot set their own host (runtime-blocked), but a cluster-admin who misconfigures it will produce a tenant that publishes a hostname they do not own. ACME will fail (no DNS control), so no cert is issued and no hijack materialises, but the diagnostics stop at "Certificate stuck in Pending". +- **Upstream application features.** Some chart-level features in harbor / bucket still rely on ingress-nginx annotations upstream. Cozystack tracks those as upstream PRs; they remain the reason some ops teams will keep ingress-nginx alongside Gateway API for a while. 
+ +## Troubleshooting + +### Gateway stuck in `Programmed=False` + +Check the Cilium Gateway API controller logs: + +```bash +kubectl -n cozy-cilium logs deploy/cilium-operator --tail=100 | grep -i gateway +``` + +Common causes: `gatewayClassName` typo (must be exactly `cilium`), a listener that collides with another listener (same port + protocol + hostname), or an HTTPS listener whose `certificateRefs` points at a Secret that does not exist yet. + +### Certificate stuck in `Ready=False` + +```bash +kubectl -n describe certificate -gateway-tls +kubectl -n describe challenge +``` + +If the Challenge's `HTTPRoute` has `Accepted=False`, the HTTP listener's `allowedRoutes` whitelist does not include the Challenge's namespace — expected to be the tenant namespace itself, always implicitly in the list. If the Challenge reports ACME server errors, check DNS: `` and `*.` must resolve to the Gateway's LB IP. + +### Admission denied: "Gateway listener hostname must equal..." + +The VAP `cozystack-gateway-hostname-policy` rejected the Gateway because a listener hostname does not match `namespace.cozystack.io/host` on the Gateway's namespace. Fix the listener hostname, or (if the namespace label is wrong) update the tenant's `spec.host` via a trusted caller. + +### Admission denied: "tenant.spec.host can only be set..." + +A non-trusted caller tried to set `tenant.spec.host`. Use an empty `spec.host` (inherit from parent) or have a cluster-admin apply the Tenant. + +### Gateway Service `` LoadBalancer IP + +Two causes: + +- `publishing.externalIPs` is empty. No `CiliumLoadBalancerIPPool` is rendered. +- Another Gateway (same tenant or another tenant's on the same IP pool) has already claimed the addresses. + +`kubectl get ciliumloadbalancerippool` shows the pools, their serviceSelector, and which Service owns each IP. 
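
When reading that output, it can help to know roughly what a pool scoped to a single tenant Gateway Service looks like. The sketch below is an assumption-laden illustration, not the manifest the chart renders: the pool name, the namespace `tenant-alice`, and the address `192.0.2.10` are hypothetical, and the `apiVersion` should be checked against what the installed Cilium release serves (`kubectl api-resources | grep -i loadbalancerippool`).

```yaml
# Illustrative sketch only: name, namespace, and address are hypothetical.
apiVersion: cilium.io/v2alpha1        # assumption; confirm the served version on your cluster
kind: CiliumLoadBalancerIPPool
metadata:
  name: tenant-alice-gateway          # hypothetical pool name
spec:
  blocks:
    - cidr: 192.0.2.10/32             # documentation-range address standing in for an externalIPs entry
  serviceSelector:
    matchLabels:
      # Cilium LB IPAM selector keys that match a Service by namespace and name
      io.kubernetes.service.namespace: tenant-alice
      io.kubernetes.service.name: cilium-gateway-cozystack
```

If two pools list the same address, or one pool's selector matches more than one Gateway Service, the Services compete for the address, which matches the second cause listed above.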
+ +## See also + +- Upstream Gateway API spec: [gateway-api.sigs.k8s.io](https://gateway-api.sigs.k8s.io/) +- Cilium Gateway API documentation: [docs.cilium.io/.../gateway-api](https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/gateway-api/) +- KEP-5707 (`Service.spec.externalIPs` deprecation): [kubernetes/enhancements#5707](https://github.com/kubernetes/enhancements/issues/5707) +- Let's Encrypt rate limits: [letsencrypt.org/docs/rate-limits](https://letsencrypt.org/docs/rate-limits/) From 31b44cab6c25c9993651ec91ba612682b08d81d2 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Thu, 23 Apr 2026 23:27:38 +0300 Subject: [PATCH 02/12] docs(gateway-api): clarify attachedNamespaces applies to HTTPRoute and TLSRoute Address review feedback from gemini-code-assist on content/en/docs/next/networking/gateway-api.md:101: the whitelist guards both HTTPRoute attachments (dashboard, keycloak, harbor, bucket) and TLSRoute attachments (Kubernetes API, vm-exportproxy, cdi-uploadproxy), not only HTTPRoute. Assisted-By: Claude Signed-off-by: Aleksei Sviridkin --- content/en/docs/next/networking/gateway-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 4d19919b..9ae7a289 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -98,7 +98,7 @@ Flipping `gateway.enabled=true` wires three things: - The exposed-service templates (dashboard, keycloak) stop rendering their `Ingress` and start rendering their `HTTPRoute`. - TLS-passthrough services (cozystack-api, vm-exportproxy, cdi-uploadproxy) stop rendering their `Ingress` and start rendering a `TLSRoute` attached to a dedicated Passthrough listener. -The `attachedNamespaces` list restricts which namespaces may attach `HTTPRoute`s to tenant Gateways through the listener `allowedRoutes` whitelist (see [Security](#security)). 
It is also guarded by a runtime `ValidatingAdmissionPolicy` that rejects any `tenant-*` entry. +The `attachedNamespaces` list restricts which namespaces may attach `HTTPRoute` or `TLSRoute` to tenant Gateways through the listener `allowedRoutes` whitelist (see [Security](#security)). It is also guarded by a runtime `ValidatingAdmissionPolicy` that rejects any `tenant-*` entry. ### 2. Per-tenant toggle From 6e04e0843f9f714918f58435e38ff4fd3667b933 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Thu, 23 Apr 2026 23:27:57 +0300 Subject: [PATCH 03/12] docs(gateway-api): align diagram and migration prose with the actual TLSRoute name kubernetes-api Address review feedback from gemini-code-assist on content/en/docs/next/networking/gateway-api.md:144: the routing table listed the TLSRoute as kubernetes-api (the real resource name in the cozystack-api package, pointing at the kubernetes Service in the default namespace), but the Mermaid diagram labelled it cozystack-api. Update the diagram to match the actual resource name and add a parenthetical clarification in the migration section that the cozystack-api package ships the Kubernetes API TLSRoute. Assisted-By: Claude Signed-off-by: Aleksei Sviridkin --- content/en/docs/next/networking/gateway-api.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 9ae7a289..3ca26ad9 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -24,7 +24,7 @@ flowchart LR ENV["cilium-envoy DaemonSet
<br/>(L7 termination / L4 passthrough)"] GW["Gateway 'cozystack'<br/>(per-tenant namespace)"] HTR["HTTPRoute<br/>dashboard, keycloak, harbor, bucket"] - TLR["TLSRoute<br/>cozystack-api, vm-exportproxy,<br/>cdi-uploadproxy"] + TLR["TLSRoute<br/>kubernetes-api, vm-exportproxy,<br/>cdi-uploadproxy"] CM["cert-manager Issuer<br/>(per-tenant ACME account)"] SVC["Service<br/>
(backend)"] @@ -262,7 +262,7 @@ Tenants then enable `spec.gateway: true` at creation time. ### For an existing cluster -1. Flip `gateway.enabled: true` on the platform Package. This rerenders cert-manager ClusterIssuers and the exposed-service templates. Existing `Ingress` objects for dashboard / keycloak / cozystack-api / vm-exportproxy / cdi-uploadproxy are deleted by Flux as they are replaced by `HTTPRoute` / `TLSRoute`. +1. Flip `gateway.enabled: true` on the platform Package. This rerenders cert-manager ClusterIssuers and the exposed-service templates. Existing `Ingress` objects for dashboard / keycloak / cozystack-api (Kubernetes API) / vm-exportproxy / cdi-uploadproxy are deleted by Flux as they are replaced by `HTTPRoute` / `TLSRoute`. 2. For each tenant that should move to Gateway API, set `tenant.spec.gateway: true`. The tenant chart materialises the `Gateway`, `Certificate` and `Issuer`. 3. Verify: `kubectl -n wait gateway/cozystack --for=condition=Programmed`, then `kubectl -n wait certificate/-gateway-tls --for=condition=Ready`. 4. Once every tenant has migrated, the `cozystack.ingress-application` package source can be removed from the system bundle — ingress-nginx deployment is no longer required. From eaec93dc8bdd6dc94362bdf465fa494d851d9f3d Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Thu, 23 Apr 2026 23:28:11 +0300 Subject: [PATCH 04/12] docs(gateway-api): clarify listener allowedRoutes blocks both HTTPRoute and TLSRoute MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address review feedback from gemini-code-assist on content/en/docs/next/networking/gateway-api.md:185: the Security section's Layer 1 description said the listener allowedRoutes whitelist blocks HTTPRoute attachments, but listener.allowedRoutes in Gateway API applies to every route kind attaching to that listener — HTTPRoute on the HTTPS listeners and TLSRoute on the tls-* Passthrough listeners. 
Assisted-By: Claude Signed-off-by: Aleksei Sviridkin --- content/en/docs/next/networking/gateway-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 3ca26ad9..3e0d6c2d 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -182,7 +182,7 @@ flowchart TD Every listener on a tenant Gateway pins `allowedRoutes.namespaces.from: Selector` to a `matchExpressions` whitelist against the built-in `kubernetes.io/metadata.name` label. That label is written by kube-apiserver on every namespace and cannot be spoofed. -The whitelist is the publishing tenant's namespace (always, implicit) plus `publishing.gateway.attachedNamespaces`. A namespace outside the list literally cannot attach any `HTTPRoute` to the Gateway. +The whitelist is the publishing tenant's namespace (always, implicit) plus `publishing.gateway.attachedNamespaces`. A namespace outside the list literally cannot attach any `HTTPRoute` or `TLSRoute` to the Gateway. ### Layer 2 — `cozystack-gateway-hostname-policy` From 546eed5435fd8964919686a16b00de4be7ff2961 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Thu, 23 Apr 2026 23:28:25 +0300 Subject: [PATCH 05/12] docs(gateway-api): fix broken in-page anchor for the TLS passthrough section Address review feedback from coderabbitai on content/en/docs/next/networking/gateway-api.md:56: the link fragment #tls-passthrough did not match the heading ID Hugo generates for 'TLSRoute (TLS passthrough)' (which slugifies to tlsroute-tls-passthrough), so the jump target was broken and markdownlint-cli2 flagged MD051. 
Assisted-By: Claude Signed-off-by: Aleksei Sviridkin --- content/en/docs/next/networking/gateway-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 3e0d6c2d..50be3dd3 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -53,7 +53,7 @@ A tenant Gateway always materialises three base listeners: | 2 | `https` | `HTTPS` | 443 | `*.` | TLS termination for wildcard subdomain services (dashboard, keycloak, etc.) | | 3 | `https-apex` | `HTTPS` | 443 | `` | TLS termination for the apex domain itself | -Plus one extra listener per TLS-passthrough service (see [TLS passthrough](#tls-passthrough) below). +Plus one extra listener per TLS-passthrough service (see [TLS passthrough](#tlsroute-tls-passthrough) below). `` is read from `_namespace.host` which the tenant chart derives from the tenant's `spec.host` (or inherits from the parent). Listeners 2 and 3 both consume the wildcard `Certificate` that cert-manager issues against the per-tenant `Issuer`. From 927ca5f02c61c3f069734e5e396e18b479248521 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Mon, 27 Apr 2026 12:02:44 +0300 Subject: [PATCH 06/12] docs(gateway-api): document publishing.exposure flag and ingress-nginx Service modes Address review feedback from @myasnikovdaniil: the Migration section referenced `exposure: loadBalancer` in a YAML example without explaining what the flag does. Add a subsection covering both modes (externalIPs vs loadBalancer), the KEP-5707 deprecation timeline that motivates the flip, and the loadBalancer-mode caveats (non-empty externalIPs, externalTrafficPolicy: Local, no built-in Cilium announcement, brief ingress interruption on switch, scope limited to ingress-nginx). 
Signed-off-by: Aleksei Sviridkin --- .../en/docs/next/networking/gateway-api.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 50be3dd3..ba42c1d9 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -260,6 +260,25 @@ publishing: Tenants then enable `spec.gateway: true` at creation time. +### `publishing.exposure` — ingress-nginx Service mode + +`publishing.exposure` controls how the ingress-nginx `Service` itself is provisioned. It is independent of `gateway.enabled` — Gateway API always uses the per-tenant `CiliumLoadBalancerIPPool` regardless of this flag — but a Gateway API rollout is the natural moment to flip it, so ingress-nginx (still in place for unmigrated tenants and for chart-level features that have not yet moved) and the per-tenant Gateway draw from the same Cilium-managed pool. + +| Value | Service shape | IP source | +| ----------------------- | ------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------- | +| `externalIPs` (default) | `ClusterIP` with `Service.spec.externalIPs` set from `publishing.externalIPs` | Operator-managed routing of those IPs to a cluster node | +| `loadBalancer` | `type: LoadBalancer` | Cilium LB IPAM allocates from a `CiliumLoadBalancerIPPool` populated with `publishing.externalIPs` | + +`Service.spec.externalIPs` is deprecated upstream in Kubernetes v1.36 ([KEP-5707](https://github.com/kubernetes/enhancements/issues/5707)). The `AllowServiceExternalIPs` feature gate is expected to default to `false` around v1.40 and the implementation removed around v1.43 — switch to `loadBalancer` before upgrading past v1.40. 
+ +Caveats for `loadBalancer` mode: + +- `publishing.externalIPs` must contain at least one non-empty address; otherwise the chart render fails fast (a LoadBalancer Service without a pool would sit in `` forever). +- The ingress-nginx Service is created with `externalTrafficPolicy: Local` to preserve the client source IP. The external IP must therefore be routed to a node that runs an ingress-nginx pod (floating IP, keepalived, upstream router, or `podAntiAffinity` to constrain pod placement). +- Cilium does not announce the IP on its own unless L2 announcements or BGP are enabled in Cilium values (disabled by default in Cozystack). This mode assumes the operator already routes `publishing.externalIPs` to a cluster node. +- Switching this value on a running cluster recreates the ingress-nginx Service (the kind changes between `ClusterIP` and `LoadBalancer`, and the `HelmRelease` has `upgrade.force: true`). Expect a brief ingress traffic interruption. +- Scope: this setting controls only the ingress-nginx Service. Other components that write `Service.spec.externalIPs` directly (for example `packages/apps/vpn/templates/service.yaml`) are unaffected and must be migrated separately before the `AllowServiceExternalIPs` gate flips off. + ### For an existing cluster 1. Flip `gateway.enabled: true` on the platform Package. This rerenders cert-manager ClusterIssuers and the exposed-service templates. Existing `Ingress` objects for dashboard / keycloak / cozystack-api (Kubernetes API) / vm-exportproxy / cdi-uploadproxy are deleted by Flux as they are replaced by `HTTPRoute` / `TLSRoute`. From 7db740bad18b8f089a5b27220f309930d155b688 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Mon, 27 Apr 2026 15:24:52 +0300 Subject: [PATCH 07/12] docs(gateway-api): correct attachedNamespaces config path Layer 1 of the Security section called the whitelist publishing.gateway.attachedNamespaces. 
The actual platform values schema (packages/core/platform/values.yaml on chore/gateway-api-crds-v1.5.1) puts attachedNamespaces directly under the root gateway: key, and the helm consumer (packages/core/platform/templates/apps.yaml) reads .Values.gateway.attachedNamespaces. The publishing.gateway path appears in the upstream PR description, the extra/gateway README, and one helm-fail string, but it is not the real config path. Use gateway.attachedNamespaces here to match the schema authors will actually configure. Signed-off-by: Aleksei Sviridkin --- content/en/docs/next/networking/gateway-api.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index ba42c1d9..124bdecc 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -182,7 +182,7 @@ flowchart TD Every listener on a tenant Gateway pins `allowedRoutes.namespaces.from: Selector` to a `matchExpressions` whitelist against the built-in `kubernetes.io/metadata.name` label. That label is written by kube-apiserver on every namespace and cannot be spoofed. -The whitelist is the publishing tenant's namespace (always, implicit) plus `publishing.gateway.attachedNamespaces`. A namespace outside the list literally cannot attach any `HTTPRoute` or `TLSRoute` to the Gateway. +The whitelist is the publishing tenant's namespace (always, implicit) plus `gateway.attachedNamespaces` from the platform Package. A namespace outside the list literally cannot attach any `HTTPRoute` or `TLSRoute` to the Gateway. 
### Layer 2 — `cozystack-gateway-hostname-policy` From 6888f84b90964a7f581ade722ddd1f6560fbaf88 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Mon, 27 Apr 2026 15:25:50 +0300 Subject: [PATCH 08/12] docs(platform-package): document gateway and publishing.exposure parameters Address review feedback from @myasnikovdaniil: the platform parameters introduced by the Gateway API rollout (gateway.enabled, gateway.attachedNamespaces) and publishing.exposure were only described in the Gateway API guide. Add them to the Platform Package Reference, which is where operators look up platform values. - publishing.exposure: new row in the Publishing table with both modes, KEP-5707 deprecation pointer, and a cross-reference to the Gateway API page for the full caveat list. - New Gateway section between Authentication and Scheduling, mirroring the schema from packages/core/platform/values.yaml on chore/gateway-api-crds-v1.5.1: gateway.enabled and gateway.attachedNamespaces, with the default whitelist printed verbatim and a forward link to the Gateway API guide. Signed-off-by: Aleksei Sviridkin --- .../configuration/platform-package.md | 29 +++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/content/en/docs/next/operations/configuration/platform-package.md b/content/en/docs/next/operations/configuration/platform-package.md index ac39c501..d0a858aa 100644 --- a/content/en/docs/next/operations/configuration/platform-package.md +++ b/content/en/docs/next/operations/configuration/platform-package.md @@ -63,6 +63,7 @@ spec: | `publishing.exposedServices` | `[api, dashboard, vm-exportproxy, cdi-uploadproxy]` | List of services to expose. Possible values: `api`, `dashboard`, `cdi-uploadproxy`, `vm-exportproxy`. | | `publishing.ingressName` | `"tenant-root"` | Ingress controller to use for exposing services. | | `publishing.externalIPs` | `[]` | List of external IPs used for the specified ingress controller. If not specified, a LoadBalancer service is used by default. 
| +| `publishing.exposure` | `"externalIPs"` | Mode for the ingress-nginx Service. `externalIPs` creates a `ClusterIP` Service with `Service.spec.externalIPs` populated from `publishing.externalIPs`. `loadBalancer` creates a `type: LoadBalancer` Service backed by a `CiliumLoadBalancerIPPool` populated with the same addresses. `Service.spec.externalIPs` is deprecated upstream in Kubernetes v1.36 ([KEP-5707][kep-5707]) — switch to `loadBalancer` before upgrading past v1.40. The chart fails fast if `loadBalancer` is set with an empty `publishing.externalIPs`. See [Gateway API → ingress-nginx Service mode]({{% ref "/docs/next/networking/gateway-api#publishingexposure--ingress-nginx-service-mode" %}}) for the full caveat list. | | `publishing.certificates.solver` | `"http01"` | ACME challenge solver type for default letsencrypt issuer. Possible values: `http01`, `dns01`. | | `publishing.certificates.issuerName` | `"letsencrypt-prod"` | `ClusterIssuer` name for TLS certificates used in system Helm releases. | @@ -98,6 +99,33 @@ spec: | `authentication.oidc.keycloakExtraRedirectUri` | `""` | Additional redirect URI for Keycloak OIDC client. | | `authentication.oidc.keycloakInternalUrl` | `""` | Internal URL for backend-to-backend requests to Keycloak. When set, the dashboard's oauth2-proxy skips OIDC discovery and routes token, JWKS, userinfo, and logout requests through this URL while keeping browser redirects on the external URL. Example: `http://keycloak-http.cozy-keycloak.svc:8080/realms/cozy`. | +#### Gateway + +Platform-wide Gateway API integration. Per-tenant opt-in is governed separately by `tenant.spec.gateway`. See the [Gateway API guide]({{% ref "/docs/next/networking/gateway-api" %}}) for the full architecture and migration path. + +| Value | Default | Description | +| --- | --- | --- | +| `gateway.enabled` | `false` | Enable Gateway API support across the platform. 
When `true`, cert-manager `ClusterIssuer`s use an `http01.gatewayHTTPRoute` solver attached to the publishing tenant's Gateway, and exposed services (`dashboard`, `keycloak`, `harbor`, `bucket`, `cozystack-api`, `vm-exportproxy`, `cdi-uploadproxy`) render `HTTPRoute`/`TLSRoute` instead of `Ingress`. Materialising the actual per-tenant Gateway still requires `tenant.spec.gateway: true`. | +| `gateway.attachedNamespaces` | (see below) | Namespaces allowed to attach `HTTPRoute` or `TLSRoute` to a tenant Gateway via the listener `allowedRoutes` whitelist (matched on the built-in `kubernetes.io/metadata.name` label). The publishing tenant's namespace is always implicitly included. Tenant namespaces (`tenant-*`) are rejected by `cozystack-gateway-attached-namespaces-policy` and by a render-time helm guard — use `tenant.spec.gateway` instead. The `default` namespace is included by default because the Kubernetes API `TLSRoute` lives next to the `kubernetes` Service in `default`. | + +Default `gateway.attachedNamespaces`: + +```yaml +gateway: + attachedNamespaces: + - cozy-cert-manager + - cozy-dashboard + - cozy-keycloak + - cozy-system + - cozy-harbor + - cozy-bucket + - cozy-kubevirt + - cozy-kubevirt-cdi + - cozy-monitoring + - cozy-linstor-gui + - default +``` + #### Scheduling | Value | Default | Description | @@ -162,5 +190,6 @@ These fields are managed automatically by the Cozystack operator and should not [overwrite-parameters]: {{% ref "/docs/next/operations/configuration/components#overwriting-component-parameters" %}} [Resource Management]: {{% ref "/docs/next/guides/resource-management#cpu-allocation-ratio" %}} [oidc]: {{% ref "/docs/next/operations/oidc" %}} +[kep-5707]: https://github.com/kubernetes/enhancements/issues/5707 [telemetry]: {{% ref "/docs/next/operations/configuration/telemetry" %}} [kube-ovn]: https://kubeovn.github.io/docs/en/guide/subnet/#join-subnet From b4d413c7e72f27af1d8ece6395d21625ee7f3d82 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin 
Date: Fri, 1 May 2026 11:46:20 +0300 Subject: [PATCH 09/12] docs(gateway-api): align with TenantGateway CRD + 7-layer security model MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Major rewrite of the Gateway API page to reflect the architecture shipped in cozystack/cozystack#2470: - Add TenantGateway CRD + cozystack-controller reconciliation flow. The chart no longer renders Gateway / Issuer / Certificate directly — those come from the controller reconciling a per-tenant TenantGateway CR. Adds a reconciliation-flow mermaid alongside the traffic-path one. - Add HTTP-01 (default) vs DNS-01 (opt-in) cert-mode section. HTTP-01 is the new default with per-listener Certificates and listeners added dynamically per HTTPRoute hostname. DNS-01 is the wildcard opt-in with parametrized provider — full provider matrix (cloudflare, route53, digitalocean, rfc2136). Document that the same provider config drives both per-tenant Issuers and the cluster-wide ClusterIssuers used by the legacy ingress flow. - Renumber the security model from 5 layers to 7 layers and add the missing layers: - Layer 3 (cozystack-gateway-attached-namespaces-policy) was previously listed as Layer 5; recategorised to match the in-repo README ordering. - Layer 7 (cozystack-route-hostname-policy) — the HTTPRoute / TLSRoute hostname VAP scoped to tenant-* namespaces — was missing entirely. This is the layer that closes the cross-apex hostname surface a tenant user with HTTPRoute RBAC could otherwise exploit. Document its fail-closed behavior on missing namespace.cozystack.io/host label. - Document the narrow port-80 listener allowedRoutes (only the tenant namespace + cozy-cert-manager) — the Layer 1 hardening that prevents app HTTPRoutes attaching by hostname from binding to port 80 and serving plaintext. 
- Document the HTTPS listener allowedRoutes.kinds=[HTTPRoute] restriction (TLSRoute for passthrough listeners) — prevents GRPCRoute / TCPRoute / UDPRoute from bypassing the route-hostname VAP. - Add HostnameConflict resolution section: cozy-* > tenant-* priority, lexicographic tiebreak, status condition under the controller's name in Status.Parents. - Add Foreign-takeover guards section listing all five reconcile paths (Gateway, redirect HTTPRoute, Issuer, wildcard Certificate, per-listener Certificate) that refuse to silently take over pre-existing objects without an OwnerReference back to the TenantGateway. - Add cilium-lb-pool empty-IPs exception note: in loadBalancer mode with empty publishing.externalIPs the chart skips the per-tenant pool render rather than failing — this is the legitimate operator pattern for clusters running BGP / L2-announce pools managed outside the chart. - Refresh listener-naming docs: per-app listeners use a sha256 suffix (`https--<8-hex>`) so two hostnames sharing the first label (harbor.foo.example.com vs harbor.alice.example.com) produce distinct names. - Refresh troubleshooting: TenantGateway Ready=False with ReconcileError points at foreign-takeover refusals; route VAP rejection messages. Plus platform-package.md: extend publishing.certificates table with the dns01.* provider matrix so operators can find the wiring keys in one place. 
Signed-off-by: Aleksei Sviridkin --- .../en/docs/next/networking/gateway-api.md | 250 ++++++++++++++---- .../configuration/platform-package.md | 14 + 2 files changed, 209 insertions(+), 55 deletions(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 124bdecc..4ba48a10 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -1,20 +1,51 @@ --- title: "Gateway API (Cilium)" linkTitle: "Gateway API" -description: "Per-tenant Gateway API ingress backed by Cilium — cert-manager integration, TLS termination and passthrough, cross-tenant isolation via ValidatingAdmissionPolicies." +description: "Per-tenant Gateway API ingress backed by Cilium — TenantGateway CRD, cert-manager integration, TLS termination and passthrough, seven-layer cross-tenant isolation." weight: 15 --- ## Overview -Cozystack ships Gateway API support as an opt-in alternative to ingress-nginx. When enabled, every tenant with `spec.gateway: true` gets its own `Gateway` materialised in its own namespace, with the Cilium Gateway API controller programming Envoy-on-DaemonSet and announcing the tenant's LoadBalancer IPs through Cilium LB IPAM. Certificates are issued by cert-manager against a per-tenant `Issuer` so each tenant gets an isolated ACME account. +Cozystack ships Gateway API support as an opt-in alternative to ingress-nginx. When enabled, every tenant with `spec.gateway: true` gets its own `Gateway` materialised in its own namespace, with the Cilium Gateway API controller programming Envoy-on-DaemonSet and announcing the tenant's LoadBalancer IPs through Cilium LB IPAM. -This page documents the architecture, the two-step opt-in, the security model (four independent ValidatingAdmissionPolicies plus the listener namespace whitelist), and the migration story from ingress-nginx. +The chart does not render `Gateway`, `Issuer`, or `Certificate` resources directly. 
Instead it renders one `gateway.cozystack.io/v1alpha1 TenantGateway` CR per tenant, and `cozystack-controller` reconciles all the downstream Gateway API and cert-manager objects from there. This avoids the Helm-vs-controller race on `Gateway.spec.listeners` that route-driven dynamic listener materialization would otherwise cause. + +This page documents the architecture, the two-step opt-in, the cert-mode choice (HTTP-01 default vs DNS-01 wildcard opt-in), the seven-layer security model, and the migration story from ingress-nginx. Gateway API and ingress-nginx coexist on the same cluster — the two modes are selected per service / per tenant, not globally. Existing clusters upgrade with `gateway.enabled=false` and see no behavioural change. ## Architecture +### Reconciliation flow + +```mermaid +flowchart TD + CHART["extra/gateway chart"] + CR["TenantGateway CR
(gateway.cozystack.io/v1alpha1)"] + CTRL["cozystack-controller
(TenantGatewayReconciler)"] + GW["Gateway
(per-tenant, dynamic listeners)"] + ISS["Issuer
(per-tenant ACME account)"] + CERT["Certificate(s)
HTTP-01: per-listener
DNS-01: single wildcard"] + REDIR["HTTPRoute
(http→https redirect, owned)"] + HTR["HTTPRoute / TLSRoute
(app-owned, watched)"] + + CHART -->|renders| CR + CR --> CTRL + CTRL -->|materialises| GW + CTRL -->|materialises| ISS + CTRL -->|materialises| CERT + CTRL -->|materialises| REDIR + HTR -.->|hostnames feed listener set| CTRL +``` + +The controller: + +- Materialises the `Gateway`, the per-tenant `Issuer`, the redirect HTTPRoute, and the Certificate(s) from `TenantGateway.spec`. +- Watches `HTTPRoute` and `TLSRoute` resources cluster-wide. For each route attached to its Gateway, it picks up the hostnames and (in HTTP-01 mode) appends a per-app HTTPS listener + a per-app `Certificate`. +- Resolves cross-namespace hostname conflicts: `cozy-*` namespaces (cluster-admin-managed platform services) win over tenant namespaces; the loser receives a `HostnameConflict` condition under the controller's name in `Status.Parents`. +- Refuses to silently take over pre-existing `Gateway`, `Issuer`, `Certificate`, or redirect `HTTPRoute` objects that share the controller-derived name but carry no `OwnerReference` back to the TenantGateway. Operators see an explicit `Ready=False/ReconcileError` condition instead of having their hand-pinned config rewritten. + ### Traffic path ```mermaid @@ -23,9 +54,9 @@ flowchart LR LB["CiliumLoadBalancerIPPool
(announces publishing.externalIPs)"] ENV["cilium-envoy DaemonSet
(L7 termination / L4 passthrough)"] GW["Gateway 'cozystack'
(per-tenant namespace)"] - HTR["HTTPRoute
dashboard, keycloak, harbor, bucket"] + HTR["HTTPRoute
dashboard, keycloak, harbor, bucket, ..."] TLR["TLSRoute
kubernetes-api, vm-exportproxy,
cdi-uploadproxy"] - CM["cert-manager Issuer
(per-tenant ACME account)"] + CM["cert-manager
(per-tenant Issuer + Certificate(s))"] SVC["Service
(backend)"] CLIENT -->|DNS → LB IP| LB @@ -35,27 +66,30 @@ flowchart LR GW --> TLR HTR --> SVC TLR --> SVC - CM -.->|issues wildcard Certificate| GW + CM -.->|issues Certificate(s)| GW ``` -- **One `GatewayClass cilium`** cluster-wide, reconciled by Cilium's Gateway API controller. There is no per-tenant GatewayClass, so no tenant can hijack the class by naming theirs after someone else. +- **One `GatewayClass cilium`** cluster-wide. There is no per-tenant GatewayClass, so no tenant can hijack the class by naming theirs after someone else. - **One `Gateway` per tenant** in the tenant's own namespace. All listeners for that tenant live on a single Gateway object; there is no cross-Gateway merge. -- **Envoy** runs as a Cilium DaemonSet (`cilium.envoy.enabled=true`) and handles both TLS termination (HTTPS listener) and TLS passthrough (dedicated per-service listeners for the kubeapiserver and the KubeVirt VM export / CDI upload proxies). -- **LoadBalancer IP** is assigned by Cilium LB IPAM from a `CiliumLoadBalancerIPPool` scoped to the tenant's `cilium-gateway-cozystack` Service. Tenants with shared apex IPs compete for addresses — operators running multi-tenant bare-metal clusters should either carve up `publishing.externalIPs` or give every tenant its own subset. +- **Envoy** runs as a Cilium DaemonSet (`cilium.envoy.enabled=true`) and handles both TLS termination (HTTPS listeners) and TLS passthrough (dedicated per-service listeners for the kubeapiserver and the KubeVirt VM export / CDI upload proxies). +- **LoadBalancer IP** is assigned by Cilium LB IPAM from a `CiliumLoadBalancerIPPool` scoped to the tenant's `cilium-gateway-cozystack` Service. 
### Listener layout on a tenant Gateway -A tenant Gateway always materialises three base listeners: +A tenant Gateway always materialises an HTTP listener: | # | Name | Protocol | Port | Hostname | Purpose | |---|---|---|---|---|---| | 1 | `http` | `HTTP` | 80 | none (wildcard) | ACME `/.well-known/acme-challenge/*` + HTTP→HTTPS redirect HTTPRoute | -| 2 | `https` | `HTTPS` | 443 | `*.` | TLS termination for wildcard subdomain services (dashboard, keycloak, etc.) | -| 3 | `https-apex` | `HTTPS` | 443 | `` | TLS termination for the apex domain itself | -Plus one extra listener per TLS-passthrough service (see [TLS passthrough](#tlsroute-tls-passthrough) below). +Plus HTTPS listeners that depend on cert mode: -`` is read from `_namespace.host` which the tenant chart derives from the tenant's `spec.host` (or inherits from the parent). Listeners 2 and 3 both consume the wildcard `Certificate` that cert-manager issues against the per-tenant `Issuer`. +- **HTTP-01 mode (default):** one HTTPS listener per attached HTTPRoute hostname, named `https--<8-hex>`. The hex suffix is the first 32 bits of `sha256(hostname)` so two different hostnames sharing the same first label (`harbor.foo.example.com` vs `harbor.alice.example.com`) get distinct listener names. Each listener's `tls.certificateRefs` points at a per-listener `Certificate` named `--<8-hex>-tls`, also auto-issued. +- **DNS-01 mode (opt-in):** two HTTPS listeners — `https` (`*.`) and `https-apex` (``) — both consuming a single wildcard Certificate. + +Plus one extra listener per TLS-passthrough service (see [TLS passthrough](#tlsroute-tls-passthrough)). + +The plain-HTTP listener (port 80) carries a strictly narrower `allowedRoutes.namespaces` selector than the HTTPS listeners — only the tenant namespace itself (where the controller-owned redirect HTTPRoute lives) and `cozy-cert-manager` (HTTP-01 ACME challenge HTTPRoutes). 
App HTTPRoutes attaching to the Gateway by hostname therefore cannot bind to port 80 and serve plaintext. HTTPS listeners further restrict `allowedRoutes.kinds` to `HTTPRoute` (and TLS-passthrough listeners to `TLSRoute`), preventing GRPCRoute / TCPRoute / UDPRoute from attaching outside the route-hostname VAP's coverage. ## Enabling Gateway API @@ -98,11 +132,11 @@ Flipping `gateway.enabled=true` wires three things: - The exposed-service templates (dashboard, keycloak) stop rendering their `Ingress` and start rendering their `HTTPRoute`. - TLS-passthrough services (cozystack-api, vm-exportproxy, cdi-uploadproxy) stop rendering their `Ingress` and start rendering a `TLSRoute` attached to a dedicated Passthrough listener. -The `attachedNamespaces` list restricts which namespaces may attach `HTTPRoute` or `TLSRoute` to tenant Gateways through the listener `allowedRoutes` whitelist (see [Security](#security)). It is also guarded by a runtime `ValidatingAdmissionPolicy` that rejects any `tenant-*` entry. +The `attachedNamespaces` list restricts which namespaces may attach `HTTPRoute` or `TLSRoute` to tenant Gateways through the listener `allowedRoutes` whitelist (see [Security](#security)). It is also guarded by a runtime `ValidatingAdmissionPolicy` that rejects any `tenant-*` entry, plus a render-time helm `fail` for the same. ### 2. Per-tenant toggle -Set `spec.gateway: true` on any tenant to materialise its `Gateway`, `Certificate`, `Issuer` and `CiliumLoadBalancerIPPool`: +Set `spec.gateway: true` on any tenant to materialise its `TenantGateway` CR (and through the controller, its `Gateway`, `Issuer`, `Certificate`(s) and `CiliumLoadBalancerIPPool`): ```yaml apiVersion: apps.cozystack.io/v1alpha1 @@ -118,7 +152,41 @@ spec: Tenants may leave `spec.host` empty — the tenant chart computes it as `.`. Setting `spec.host` is reserved for cluster-admins and cozystack/Flux service accounts (enforced runtime by `cozystack-tenant-host-policy`, see [Security](#security)). 
-A child tenant with `spec.gateway: true` receives its own Gateway, its own wildcard Certificate, and its own `Issuer` that talks to Let's Encrypt on a separate ACME account — so child tenants do not share HTTP-01 challenge state with the parent or with siblings. +A child tenant with `spec.gateway: true` receives its own Gateway, its own Certificate(s), and its own `Issuer` that talks to Let's Encrypt on a separate ACME account — so child tenants do not share HTTP-01 challenge state with the parent or with siblings. There is no "share the parent's Gateway" mode; per-tenant Gateway is a deliberate isolation property of the security model. + +## Cert mode: HTTP-01 (default) vs DNS-01 (opt-in) + +`publishing.certificates.solver` controls how the per-tenant Issuer sources TLS certs. + +### HTTP-01 (default) + +Out of the box, no extra config required. The controller: + +- Renders an ACME `Issuer` in the tenant namespace with an `http01.gatewayHTTPRoute` solver pointing at the tenant's own Gateway / `http` listener. +- Watches HTTPRoutes / TLSRoutes attached to the Gateway (parentRefs pointing at it). For each unique hostname seen, it adds a per-app HTTPS listener and a per-app `Certificate` (dnsNames containing exactly that hostname). +- Per-app listener naming: `https--<8-hex>` (e.g. `https-harbor-deadbeef`). +- Per-app cert naming: `--<8-hex>-tls`. + +Adding a new published app is purely a matter of deploying its HTTPRoute — no edits to the platform Package or to `_cluster.expose-services` needed. 
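The per-app listener naming rule described above can be sketched in Python. This is an illustration only, not the controller's code: the helper name `listener_name` is hypothetical, and the sketch assumes that the middle segment of the name is the hostname's first DNS label (as the `https-harbor-deadbeef` example suggests) and that the hash is taken over the UTF-8 bytes of the hostname exactly as written on the route.

```python
import hashlib


def listener_name(hostname: str) -> str:
    """Derive a per-app HTTPS listener name for one hostname (hypothetical helper)."""
    # Assumption: the middle segment is the first DNS label of the hostname.
    first_label = hostname.split(".", 1)[0]
    # The first 32 bits of sha256(hostname) are its first 4 bytes, i.e. 8 hex characters.
    suffix = hashlib.sha256(hostname.encode("utf-8")).hexdigest()[:8]
    return f"https-{first_label}-{suffix}"


if __name__ == "__main__":
    # Two hostnames that share the first label still get distinct listener names.
    for host in ("harbor.foo.example.com", "harbor.alice.example.com"):
        print(host, "->", listener_name(host))
```

Hashing the full hostname rather than only the first label is what keeps the two `harbor.*` names apart while the listener name stays short and deterministic across reconciles.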
+ +### DNS-01 (opt-in) + +Set `publishing.certificates.solver: dns01` and pick a provider: + +| `publishing.certificates.dns01.provider` | Required `publishing.certificates.dns01.` keys | +|---|---| +| `cloudflare` (default) | `cloudflare.secretName`, `cloudflare.secretKey` | +| `route53` | `route53.region`, `route53.secretName` (and `route53.accessKeyID` if not running with IRSA) | +| `digitalocean` | `digitalocean.secretName` | +| `rfc2136` | `rfc2136.nameserver`, `rfc2136.tsigKeyName`, `rfc2136.secretName` | + +Each provider sub-block carries safe defaults for secret-key field names (`api-token`, `secret-access-key`, `access-token`, `tsig-secret-key`) so the typical opt-in is `solver: dns01` plus the provider-specific `secretName` (and `region` for route53 / `nameserver`+`tsigKeyName` for rfc2136). + +DNS-01 mode renders a single wildcard `Certificate` covering `` and `*.`, plus the corresponding `https` (`*.`) and `https-apex` (``) listeners. New apps published under the apex pick up the existing wildcard cert without per-listener provisioning. + +The platform chart writes the provider config into `_cluster.dns01-*` keys consumed by both the per-tenant gateway chart (rendering the TenantGateway CR) and the cluster-wide `letsencrypt-prod` / `letsencrypt-stage` ClusterIssuers used by the legacy ingress flow. Both paths agree on which provider is active. + +Pick DNS-01 when you specifically want a wildcard cert (e.g. a long-lived staging cluster with many short-lived apps and tight Let's Encrypt rate limits). Otherwise stay on HTTP-01. 
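As a pre-flight aid, the provider matrix above can be turned into a small check that reports which keys are still missing before a values change is applied. This is a sketch under stated assumptions: the function name `missing_dns01_keys` is hypothetical, the input is assumed to be the already-merged `publishing.certificates.dns01` block as a plain dictionary (after chart defaults), and the optional `route53.accessKeyID` is deliberately not checked because it is only needed without IRSA.

```python
# Required keys per provider, transcribed from the provider table above.
REQUIRED_KEYS = {
    "cloudflare": ("secretName", "secretKey"),
    "route53": ("region", "secretName"),
    "digitalocean": ("secretName",),
    "rfc2136": ("nameserver", "tsigKeyName", "secretName"),
}


def missing_dns01_keys(dns01: dict) -> list:
    """Return the required keys that are absent or empty (hypothetical helper)."""
    # cloudflare is the documented default provider.
    provider = dns01.get("provider") or "cloudflare"
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown dns01 provider: {provider!r}")
    block = dns01.get(provider) or {}
    return [f"{provider}.{key}" for key in REQUIRED_KEYS[provider] if not block.get(key)]


if __name__ == "__main__":
    values = {"provider": "rfc2136", "rfc2136": {"nameserver": "192.0.2.53:53"}}
    print(missing_dns01_keys(values))
```

An empty result means the block satisfies the table; it does not prove that the referenced Secret exists or that the credentials work.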
## Per-service routing @@ -128,10 +196,10 @@ When `gateway.enabled=true`, the following services switch from `Ingress` to Gat | Service | Namespace | `HTTPRoute` name | Backend | Listener | |---|---|---|---|---| -| dashboard | `cozy-dashboard` | `dashboard` | `incloud-web-gatekeeper:8000` | `https` | -| keycloak | `cozy-keycloak` | `keycloak` | `keycloak-http:80` | `https` | -| harbor | tenant namespace | `` | `:80` | `https` (tenant's own Gateway) | -| bucket | tenant namespace | `-ui` | `-ui:8080` | `https` (tenant's own Gateway) | +| dashboard | `cozy-dashboard` | `dashboard` | `incloud-web-gatekeeper:8000` | per-app `https-dashboard-...` (HTTP-01) or `https` (DNS-01) | +| keycloak | `cozy-keycloak` | `keycloak` | `keycloak-http:80` | same | +| harbor | tenant namespace | `` | `:80` | tenant's own Gateway | +| bucket | tenant namespace | `-ui` | `-ui:8080` | tenant's own Gateway | cert-manager's HTTP-01 solver places its short-lived `HTTPRoute` on the `http` listener of the same Gateway, path-matched to `/.well-known/acme-challenge/`. More-specific path matching wins over the catch-all HTTP→HTTPS redirect HTTPRoute. @@ -145,23 +213,24 @@ Services that need SNI-based passthrough (clients present certificates, backend | KubeVirt VM export | `cozy-kubevirt` | `vm-exportproxy` | `vm-exportproxy:443` | `tls-vm-exportproxy` | | KubeVirt CDI upload | `cozy-kubevirt-cdi` | `cdi-uploadproxy` | `cdi-uploadproxy:443` | `tls-cdi-uploadproxy` | -The Passthrough listener is added to the Gateway only if the corresponding service appears in `publishing.exposedServices`. The wildcard `https` listener at `*.` and these specific `tls-*` listeners coexist on port 443 — Cilium resolves SNI to the most-specific hostname match. +The Passthrough listener is added to the Gateway only if the corresponding service appears in `publishing.exposedServices`. -`TLSRoute` is shipped from the Gateway API experimental channel (CRD `gateway.networking.k8s.io/v1alpha2`). 
It graduates to `v1` in Gateway API v1.5 / Cilium v1.20; Cozystack currently pins `v1alpha2` for compatibility with Cilium v1.19. +`TLSRoute` is shipped from the Gateway API experimental channel (CRD `gateway.networking.k8s.io/v1alpha2`) in v1.5.x. It graduates to `v1` upstream; Cozystack will follow the rename when it lands. ## Security -Gateway API multi-tenancy in Cozystack is guarded at **four independent ValidatingAdmissionPolicies** plus a listener-level namespace whitelist. Each check enforces one invariant and fails closed on policy/ConfigMap errors (`failurePolicy: Fail`, `validationActions: [Deny]`). Compromising one of them does not bypass the others. +Gateway API multi-tenancy in Cozystack is guarded at **seven independent layers**: one listener-level selector (Layer 1, controller-owned), five `ValidatingAdmissionPolicy` gates (Layers 2-5, 7), and one render-time helm guard (Layer 6). Compromising one of them does not bypass the others; admission-time checks fail closed (`failurePolicy: Fail`, `validationActions: [Deny]`). ```mermaid flowchart TD ATK["Attacker
(tenant user, misconfig, compromised SA)"] - L1["Listener allowedRoutes whitelist
(kubernetes.io/metadata.name, kube-apiserver enforced)"] - L2["VAP: Gateway listener hostname
must match namespace.cozystack.io/host"] - L3["VAP: Tenant spec.host restricted
to cluster-admins and cozystack/Flux SAs"] - L4["VAP: namespace.cozystack.io/host label
immutable after first write"] - L5["VAP: Package attachedNamespaces
must not contain tenant-*"] - L6["Render-time helm fail
for tenant-* in attachedNamespaces"] + L1["L1: Listener allowedRoutes selector
(kubernetes.io/metadata.name)"] + L2["L2 VAP: Gateway listener hostname
matches namespace.cozystack.io/host"] + L3["L3 VAP: Package attachedNamespaces
rejects tenant-*"] + L4["L4 VAP: Tenant spec.host writes
restricted to trusted callers"] + L5["L5 VAP: namespace.cozystack.io/host label
writes restricted to trusted callers"] + L6["L6 Render-time helm fail
tenant-* in attachedNamespaces"] + L7["L7 VAP: HTTPRoute/TLSRoute hostnames
match namespace label (tenant-* only)"] GW["Cross-tenant hostname hijack
BLOCKED"] ATK --> L1 @@ -170,45 +239,78 @@ flowchart TD ATK --> L4 ATK --> L5 ATK --> L6 + ATK --> L7 L1 --> GW L2 --> GW L3 --> GW L4 --> GW L5 --> GW L6 --> GW + L7 --> GW ``` ### Layer 1 — Listener `allowedRoutes` namespace whitelist Every listener on a tenant Gateway pins `allowedRoutes.namespaces.from: Selector` to a `matchExpressions` whitelist against the built-in `kubernetes.io/metadata.name` label. That label is written by kube-apiserver on every namespace and cannot be spoofed. -The whitelist is the publishing tenant's namespace (always, implicit) plus `gateway.attachedNamespaces` from the platform Package. A namespace outside the list literally cannot attach any `HTTPRoute` or `TLSRoute` to the Gateway. +The whitelist on **HTTPS / TLS-passthrough listeners** is the publishing tenant's namespace plus `gateway.attachedNamespaces`. The whitelist on the **plain-HTTP listener (port 80)** is strictly narrower — only the tenant namespace itself plus `cozy-cert-manager` (where HTTP-01 challenge HTTPRoutes are published). App HTTPRoutes attaching by hostname therefore cannot bind to port 80 and silently serve plaintext. + +HTTPS listeners additionally restrict `allowedRoutes.kinds` to `HTTPRoute` (TLS-passthrough listeners to `TLSRoute`), preventing `GRPCRoute` / `TCPRoute` / `UDPRoute` from attaching outside the Layer 7 VAP's coverage. ### Layer 2 — `cozystack-gateway-hostname-policy` -`ValidatingAdmissionPolicy` scoped to `gateway.networking.k8s.io/v1 Gateway` CREATE/UPDATE. CEL reads `namespaceObject.metadata.labels["namespace.cozystack.io/host"]` and rejects any listener whose hostname is not equal to that value or a subdomain of it. +`ValidatingAdmissionPolicy` scoped to `gateway.networking.k8s.io/v1 Gateway` CREATE/UPDATE. CEL reads `namespaceObject.metadata.labels["namespace.cozystack.io/host"]` and rejects any listener whose hostname is not equal to that value or a subdomain of it. 
`matchConditions` gate the VAP to cozystack-managed namespaces only — Gateways in unrelated namespaces (e.g. `kube-system`) are not touched. Because the VAP reads the namespace label (not a cluster-wide ConfigMap), a tenant with a fully independent apex domain (e.g. `customer1.io`, not a subdomain of the platform apex) is validated correctly — the VAP does not assume a subdomain hierarchy. -### Layer 3 — `cozystack-tenant-host-policy` +### Layer 3 — `cozystack-gateway-attached-namespaces-policy` -`ValidatingAdmissionPolicy` scoped to `apps.cozystack.io/v1alpha1 Tenant` CREATE/UPDATE. Rejects setting or changing `spec.host` unless the caller is in the `system:masters` group or is a service account in `cozy-*`, `flux-system` or `kube-system`. Tenants can still create tenants with empty `spec.host` (normal inheritance flow). +`ValidatingAdmissionPolicy` scoped to `cozystack.io/v1alpha1 Package` CREATE/UPDATE. CEL walks `spec.components.platform.values.gateway.attachedNamespaces` and rejects any entry starting with `tenant-`. Catches `kubectl edit packages.cozystack.io` that would bypass the helm render-time guard in Layer 6. + +### Layer 4 — `cozystack-tenant-host-policy` + +`ValidatingAdmissionPolicy` scoped to `apps.cozystack.io/v1alpha1 Tenant` CREATE/UPDATE. Rejects setting or changing `spec.host` unless the caller is in the `system:masters` group or is one of `system:serviceaccounts:cozy-system`, `system:serviceaccounts:cozy-cert-manager`, `system:serviceaccounts:cozy-fluxcd`, `system:serviceaccounts:kube-system`. Tenants can still create tenants with empty `spec.host` (normal inheritance flow). This closes the path where a tenant user creates a Tenant with `spec.host=dashboard.example.org` to have the tenant chart write a hijacked label into their namespace. -### Layer 4 — `cozystack-namespace-host-label-policy` +### Layer 5 — `cozystack-namespace-host-label-policy` + +`ValidatingAdmissionPolicy` scoped to core `v1 Namespace` CREATE/UPDATE. 
Rejects any set or change of the `namespace.cozystack.io/host` label, except by the same trusted-caller whitelist as Layer 4. Closes both first-time label writes on CREATE and first-time adds on UPDATE — only cozystack/Flux service accounts (which apply the tenant chart) can stamp the label. + +Combined with Layer 4, a tenant user cannot rewrite their host through either the Tenant CR or the namespace label. + +### Layer 6 — Render-time `fail` in cozystack-basics + +The cozystack-basics chart fails the helm render if `_cluster.gateway-attached-namespaces` contains any `tenant-*` entry. Triggers on the helm-install path before the cluster ever sees the values; complements Layer 3 which triggers at `kubectl apply` time. + +### Layer 7 — `cozystack-route-hostname-policy` -`ValidatingAdmissionPolicy` scoped to core `v1 Namespace` UPDATE. Rejects any change to `namespace.cozystack.io/host` once the label is set, except by the same trusted-caller whitelist. CREATE is unrestricted (initial label write happens there, by the cozystack chart). +`ValidatingAdmissionPolicy` scoped to `gateway.networking.k8s.io/v1 HTTPRoute` and `v1alpha2 TLSRoute` CREATE/UPDATE. Scoped to `tenant-*` namespaces (cozy-* are cluster-admin-managed and trusted to publish under any apex). Rejects any `spec.hostnames` entry that is not equal to the namespace's `namespace.cozystack.io/host` label or a subdomain of it. **Fail-closed when the label is absent** — a `tenant-*` namespace without `namespace.cozystack.io/host` is rejected, not silently allowed. -Combined with Layer 3, a tenant user cannot rewrite their host through either route. +Closes the cross-apex hostname surface a tenant user with HTTPRoute RBAC could otherwise exploit. The within-apex cross-namespace case (a tenant claiming a hostname owned by a `cozy-*` app) is handled by the controller at reconcile time — see [HostnameConflict resolution](#hostnameconflict-resolution) below. 
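
The "equal to the label or a subdomain of it" rule used by Layer 7 can be sketched in a few lines. This is an illustration only: the real policy is a CEL expression inside the `ValidatingAdmissionPolicy`, and the function name `hostname_allowed` and the lowercase / trailing-dot normalisation are assumptions made for this example, not something the source confirms.

```python
from typing import Optional


def hostname_allowed(hostname: str, namespace_host: Optional[str]) -> bool:
    """Return True if hostname equals namespace_host or is a subdomain of it.

    namespace_host stands in for the value of the
    namespace.cozystack.io/host label (hypothetical helper, not the real CEL).
    """
    if not namespace_host:
        # Fail closed: a namespace without the label accepts no hostname.
        return False
    # Assumed normalisation: case-insensitive, ignore a trailing dot.
    candidate = hostname.lower().rstrip(".")
    apex = namespace_host.lower().rstrip(".")
    # The leading "." in the suffix check prevents "evilalice.example.org"
    # from matching the apex "alice.example.org".
    return candidate == apex or candidate.endswith("." + apex)


if __name__ == "__main__":
    print(hostname_allowed("app.alice.example.org", "alice.example.org"))
    print(hostname_allowed("dashboard.example.org", "alice.example.org"))
```

Because the check reads only the per-namespace value, an independent apex such as `customer1.io` needs no special case.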
-### Layer 5 — `cozystack-gateway-attached-namespaces-policy` +For `tenant-root` the allowed host suffix is `publishing.host`; for any `tenant-` that inherits from its parent the suffix is `.`. A child tenant with an independent apex (`customer1.io` instead of a subdomain) is handled correctly because the VAP reads the per-namespace label rather than assuming a subdomain hierarchy. -`ValidatingAdmissionPolicy` scoped to `cozystack.io/v1alpha1 Package` CREATE/UPDATE. CEL walks `spec.components.platform.values.gateway.attachedNamespaces` and rejects any entry starting with `tenant-`. Catches `kubectl edit packages.cozystack.io` that would bypass helm. +### HostnameConflict resolution -### Layer 6 — Render-time `fail` +When two routes from different namespaces claim the same hostname, the controller picks the winner deterministically: -cozystack-basics' hostname policy template also fails the chart render if `_cluster.gateway-attached-namespaces` contains a `tenant-*` entry. Triggers on the helm-install path before the cluster ever sees the values. Belt-and-suspenders with Layer 5. +- A route from a `cozy-*` namespace (cluster-admin-managed platform service) wins over a route from any other namespace. +- Within the same priority tier, the route with the lexicographically smallest `/` pair wins. + +The losing route receives `Accepted=False` with `Reason=HostnameConflict` in `Status.Parents` under the controller's name (`gateway.cozystack.io/tenantgateway-controller`). Other controllers' status entries (Cilium etc.) are left untouched. 
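
The two tie-breaking rules above can be sketched as a single sort key. This is a minimal illustration, not the controller's code: the function name `pick_winner` is hypothetical, and it assumes the compared pair is the route's namespace and name joined with `/`.

```python
from typing import List, Tuple

# Assumed shape of a claimant: (namespace, name) of the route.
Route = Tuple[str, str]


def pick_winner(claimants: List[Route]) -> Route:
    """Pick the route that keeps a contested hostname."""

    def sort_key(route: Route) -> Tuple[int, str]:
        namespace, name = route
        # Tier 0: cozy-* platform namespaces; tier 1: everything else.
        tier = 0 if namespace.startswith("cozy-") else 1
        # Within a tier, compare the joined "namespace/name" string.
        return (tier, f"{namespace}/{name}")

    return min(claimants, key=sort_key)


if __name__ == "__main__":
    routes = [
        ("tenant-alice", "web"),
        ("cozy-dashboard", "dashboard"),
        ("tenant-bob", "app"),
    ]
    print(pick_winner(routes))
```

Every route other than the returned one would be the "losing route" in the sense described above. The key depends only on the routes' identities, so the outcome does not depend on creation order.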
+ +### Foreign-takeover guards + +Five reconcile paths refuse to silently rewrite a pre-existing object that shares the controller-derived name but carries no `OwnerReference` back to the TenantGateway: + +- `Gateway` (named after the TenantGateway) +- redirect `HTTPRoute` (`-http-redirect`) +- per-tenant `Issuer` (`-gateway`) +- wildcard `Certificate` (`-gateway-tls`, DNS-01 mode) +- per-listener `Certificate` (`--<8-hex>-tls`, HTTP-01 mode) + +An operator who hand-pinned a Certificate or Issuer at the controller's derived name (private CA, manual cert pinning, internal ACME) gets an explicit `Ready=False/ReconcileError` condition on the TenantGateway instead of having their config silently destroyed and the resource re-issued from a different ACME account. The error message points at the offending object so the operator can either delete it (handing ownership to the controller) or rename it. ### What this does NOT defend @@ -220,14 +322,16 @@ These residuals are design choices, not runtime gaps: ## Certificates -Every tenant with `spec.gateway: true` gets its own cert-manager `Issuer` (namespace-scoped, not `ClusterIssuer`) named `gateway`. The Issuer carries its own ACME account via `privateKeySecretRef: gateway-acme-account`. The wildcard `Certificate` for the tenant references `issuerRef.kind: Issuer, name: gateway`. +Every tenant with `spec.gateway: true` gets its own cert-manager `Issuer` (namespace-scoped, not `ClusterIssuer`) named `-gateway`. The Issuer carries its own ACME account via `privateKeySecretRef: -acme-account`. Certificates reference `issuerRef.kind: Issuer, name: -gateway`. + +In **HTTP-01 mode**, one Certificate per published-app hostname (named `--<8-hex>-tls`). In **DNS-01 mode**, a single wildcard Certificate (named `-gateway-tls`) covers `` and `*.`. 
Two ACME servers are supported out of the box: - `publishing.certificates.issuerName: letsencrypt-prod` → `https://acme-v02.api.letsencrypt.org/directory` - `publishing.certificates.issuerName: letsencrypt-stage` → `https://acme-staging-v02.api.letsencrypt.org/directory` -Any other value fails the chart render with a pointer to `packages/extra/gateway/templates/issuer.yaml` for how to add a new mapping. +Any other value fails the chart render with a pointer to the controller's renderer (`internal/controller/tenantgateway/renderers.go`) for how to add a new mapping. ### Rate limits @@ -237,12 +341,24 @@ Let's Encrypt enforces per-account and per-registered-domain quotas: - 5 duplicate certificates per week for the same hostname set - 300 new orders per account per 3 hours -A cluster where many tenants share the same apex domain can exhaust these quickly. Mitigations: +A cluster where many tenants share the same apex domain can exhaust these quickly, especially in HTTP-01 mode where each published app contributes one certificate. Mitigations: - `publishing.certificates.issuerName: letsencrypt-stage` for non-production clusters (staging quotas do not affect prod). - `tenant.spec.resourceQuotas.count/certificates.cert-manager.io` to cap per-tenant certificate creations. +- Switch to DNS-01 to consolidate every tenant's apps under one wildcard cert (cuts cert count from N apps to 1). - For air-gapped deployments, use the bundled `selfsigned-cluster-issuer` or an internal ACME server. +Recommended tenant-level quota to contain a misbehaving tenant: + +```yaml +apiVersion: apps.cozystack.io/v1alpha1 +kind: Tenant +spec: + gateway: true + resourceQuotas: + count/certificates.cert-manager.io: "10" +``` + ## Migration from ingress-nginx The two modes coexist. Switching happens per cluster (`gateway.enabled`) and per tenant (`tenant.spec.gateway`), not globally. 
@@ -275,15 +391,15 @@ Caveats for `loadBalancer` mode: - `publishing.externalIPs` must contain at least one non-empty address; otherwise the chart render fails fast (a LoadBalancer Service without a pool would sit in `` forever). - The ingress-nginx Service is created with `externalTrafficPolicy: Local` to preserve the client source IP. The external IP must therefore be routed to a node that runs an ingress-nginx pod (floating IP, keepalived, upstream router, or `podAntiAffinity` to constrain pod placement). -- Cilium does not announce the IP on its own unless L2 announcements or BGP are enabled in Cilium values (disabled by default in Cozystack). This mode assumes the operator already routes `publishing.externalIPs` to a cluster node. +- Cilium does not announce the IP on its own unless L2 announcements or BGP are enabled in Cilium values (disabled by default in Cozystack). This mode assumes the operator already routes `publishing.externalIPs` to a cluster node. **Exception**: `tenant.spec.gateway=true` with empty `publishing.externalIPs` and `exposure=loadBalancer` is a legitimate operator pattern (Cilium picks IPs from a cluster-wide BGP / L2-announce pool managed outside this chart) — the chart skips the per-tenant pool render rather than failing. - Switching this value on a running cluster recreates the ingress-nginx Service (the kind changes between `ClusterIP` and `LoadBalancer`, and the `HelmRelease` has `upgrade.force: true`). Expect a brief ingress traffic interruption. - Scope: this setting controls only the ingress-nginx Service. Other components that write `Service.spec.externalIPs` directly (for example `packages/apps/vpn/templates/service.yaml`) are unaffected and must be migrated separately before the `AllowServiceExternalIPs` gate flips off. ### For an existing cluster 1. Flip `gateway.enabled: true` on the platform Package. This rerenders cert-manager ClusterIssuers and the exposed-service templates. 
Existing `Ingress` objects for dashboard / keycloak / cozystack-api (Kubernetes API) / vm-exportproxy / cdi-uploadproxy are deleted by Flux as they are replaced by `HTTPRoute` / `TLSRoute`. -2. For each tenant that should move to Gateway API, set `tenant.spec.gateway: true`. The tenant chart materialises the `Gateway`, `Certificate` and `Issuer`. -3. Verify: `kubectl -n wait gateway/cozystack --for=condition=Programmed`, then `kubectl -n wait certificate/-gateway-tls --for=condition=Ready`. +2. For each tenant that should move to Gateway API, set `tenant.spec.gateway: true`. The tenant chart materialises the `TenantGateway` CR; the controller reconciles the rest. +3. Verify: `kubectl -n wait gateway/cozystack --for=condition=Programmed`, then `kubectl -n get certificate` (one wildcard in DNS-01 mode, one per published app in HTTP-01 mode) and `kubectl -n wait certificate/ --for=condition=Ready`. 4. Once every tenant has migrated, the `cozystack.ingress-application` package source can be removed from the system bundle — ingress-nginx deployment is no longer required. Applications that live in upstream vendored charts (harbor, bucket) attach to their tenant's Gateway through `_namespace.gateway`, which the tenant chart populates automatically once `spec.gateway: true` is set. @@ -291,15 +407,27 @@ Applications that live in upstream vendored charts (harbor, bucket) attach to th ## Known limitations - **Tenant IP allocation from a shared pool.** `publishing.externalIPs` is cluster-wide. Tenants with `gateway: true` compete for addresses. Operators running multi-tenant deployments should subset IPs per tenant — Cozystack does not partition the list automatically. -- **TLSRoute v1alpha2.** Gateway API v1.5 / Cilium v1.20 will promote TLSRoute to `v1`. Cozystack will follow once the Cilium version lands. `v1alpha2` is the currently-supported version. 
-- **`tenant.spec.host` enforcement.** A tenant cannot set their own host (runtime-blocked), but a cluster-admin who misconfigures it will produce a tenant that publishes a hostname they do not own. ACME will fail (no DNS control), so no cert is issued and no hijack materialises, but the diagnostics stop at "Certificate stuck in Pending". +- **TLSRoute v1alpha2.** Gateway API v1.5 ships TLSRoute at `v1alpha2`. It graduates to `v1` upstream; Cozystack will follow the rename when it lands. +- **Inheritance from parent Gateway.** Child tenants currently must opt into their own Gateway via `tenant.spec.gateway=true`. There is no "share the parent's Gateway" mode; per-tenant Gateway is a deliberate isolation property of the security model. Inheritance may land later behind an explicit `tenant.spec.gatewayInheritFromParent` flag, paired with extensions to Layers 5 and 7. +- **Supported ACME issuers.** `publishing.certificates.issuerName` must be `letsencrypt-prod` or `letsencrypt-stage` (the controller maps those to ACME server URLs). To support another ACME provider, extend the controller's renderer with an additional branch. +- **`tenant.spec.host` enforcement.** A tenant cannot set their own host (runtime-blocked), but a cluster-admin who misconfigures it produces a tenant publishing a hostname they do not own. ACME will fail (no DNS control), so no cert is issued and no hijack materialises, but the diagnostics stop at "Certificate stuck in Pending". - **Upstream application features.** Some chart-level features in harbor / bucket still rely on ingress-nginx annotations upstream. Cozystack tracks those as upstream PRs; they remain the reason some ops teams will keep ingress-nginx alongside Gateway API for a while. ## Troubleshooting -### Gateway stuck in `Programmed=False` +### TenantGateway stuck in `Ready=False` with `ReconcileError` + +```bash +kubectl -n describe tenantgateway cozystack +``` + +The status condition's message names the failing step. 
Common cases: -Check the Cilium Gateway API controller logs: +- `gateway /cozystack exists but is not owned by TenantGateway ...` — a pre-existing Gateway with our derived name was found and refused. Rename or delete the foreign Gateway, or set its `OwnerReference` to the TenantGateway by hand if you intend to take ownership. +- `issuer /-gateway exists but is not owned ...` — same shape for a foreign Issuer. +- `certificate /... exists but is not owned ...` — same for a foreign Certificate. + +### Gateway stuck in `Programmed=False` ```bash kubectl -n cozy-cilium logs deploy/cilium-operator --tail=100 | grep -i gateway @@ -310,15 +438,27 @@ Common causes: `gatewayClassName` typo (must be exactly `cilium`), a listener th ### Certificate stuck in `Ready=False` ```bash -kubectl -n describe certificate -gateway-tls +kubectl -n describe certificate kubectl -n describe challenge ``` -If the Challenge's `HTTPRoute` has `Accepted=False`, the HTTP listener's `allowedRoutes` whitelist does not include the Challenge's namespace — expected to be the tenant namespace itself, always implicitly in the list. If the Challenge reports ACME server errors, check DNS: `` and `*.` must resolve to the Gateway's LB IP. +If the Challenge's `HTTPRoute` has `Accepted=False`, the HTTP listener's `allowedRoutes` whitelist does not include the Challenge's namespace — expected to be `cozy-cert-manager`, always implicitly in the list. If the Challenge reports ACME server errors, check DNS: `` (HTTP-01) or `` and `*.` (DNS-01) must resolve to the Gateway's LB IP / be answered by the configured DNS-01 provider. + +### HTTPRoute rejected with `HostnameConflict` + +```bash +kubectl -n describe httproute +``` + +Look for an entry under `Status.Parents` with `controllerName: gateway.cozystack.io/tenantgateway-controller` and `Reason: HostnameConflict`. The message names the conflicting hostname(s) and the route that owns them. 
Within-apex conflicts are resolved with `cozy-*` priority; the loser must use a different hostname. ### Admission denied: "Gateway listener hostname must equal..." -The VAP `cozystack-gateway-hostname-policy` rejected the Gateway because a listener hostname does not match `namespace.cozystack.io/host` on the Gateway's namespace. Fix the listener hostname, or (if the namespace label is wrong) update the tenant's `spec.host` via a trusted caller. +Layer 2 (`cozystack-gateway-hostname-policy`) rejected the Gateway because a listener hostname does not match `namespace.cozystack.io/host` on the Gateway's namespace. Fix the listener hostname, or (if the namespace label is wrong) update the tenant's `spec.host` via a trusted caller. + +### Admission denied: "HTTPRoute hostnames must equal..." + +Layer 7 (`cozystack-route-hostname-policy`) rejected the HTTPRoute or TLSRoute because a hostname falls outside the apex of the namespace's `namespace.cozystack.io/host` label. Either change the hostname to live under the apex, or move the route to a namespace whose label covers the desired hostname. ### Admission denied: "tenant.spec.host can only be set..." @@ -328,7 +468,7 @@ A non-trusted caller tried to set `tenant.spec.host`. Use an empty `spec.host` ( Two causes: -- `publishing.externalIPs` is empty. No `CiliumLoadBalancerIPPool` is rendered. +- `publishing.externalIPs` is empty AND `publishing.exposure=externalIPs`. No `CiliumLoadBalancerIPPool` is rendered. (Empty + `loadBalancer` mode is OK; an external pool is expected.) - Another Gateway (same tenant or another tenant's on the same IP pool) has already claimed the addresses. `kubectl get ciliumloadbalancerippool` shows the pools, their serviceSelector, and which Service owns each IP. 
diff --git a/content/en/docs/next/operations/configuration/platform-package.md b/content/en/docs/next/operations/configuration/platform-package.md index d0a858aa..b40a7642 100644 --- a/content/en/docs/next/operations/configuration/platform-package.md +++ b/content/en/docs/next/operations/configuration/platform-package.md @@ -66,6 +66,20 @@ spec: | `publishing.exposure` | `"externalIPs"` | Mode for the ingress-nginx Service. `externalIPs` creates a `ClusterIP` Service with `Service.spec.externalIPs` populated from `publishing.externalIPs`. `loadBalancer` creates a `type: LoadBalancer` Service backed by a `CiliumLoadBalancerIPPool` populated with the same addresses. `Service.spec.externalIPs` is deprecated upstream in Kubernetes v1.36 ([KEP-5707][kep-5707]) — switch to `loadBalancer` before upgrading past v1.40. The chart fails fast if `loadBalancer` is set with an empty `publishing.externalIPs`. See [Gateway API → ingress-nginx Service mode]({{% ref "/docs/next/networking/gateway-api#publishingexposure--ingress-nginx-service-mode" %}}) for the full caveat list. | | `publishing.certificates.solver` | `"http01"` | ACME challenge solver type for default letsencrypt issuer. Possible values: `http01`, `dns01`. | | `publishing.certificates.issuerName` | `"letsencrypt-prod"` | `ClusterIssuer` name for TLS certificates used in system Helm releases. | +| `publishing.certificates.dns01.provider` | `"cloudflare"` | DNS-01 provider when `solver=dns01`. Possible values: `cloudflare`, `route53`, `digitalocean`, `rfc2136`. Both the per-tenant Issuer (rendered by `cozystack-controller` from the `TenantGateway` CR) and the cluster-wide `letsencrypt-prod` / `letsencrypt-stage` `ClusterIssuer`s used by the legacy ingress flow read this. | +| `publishing.certificates.dns01.cloudflare.secretName` | `"cloudflare-api-token-secret"` | Secret name holding a Cloudflare API token with `Zone:Read` + `Zone:DNS:Edit` on the apex zone. 
| +| `publishing.certificates.dns01.cloudflare.secretKey` | `"api-token"` | Key inside the Secret holding the API token. | +| `publishing.certificates.dns01.route53.region` | `""` | AWS region of the Route53 hosted zone. Required when `provider=route53`. | +| `publishing.certificates.dns01.route53.accessKeyID` | `""` | IAM access key ID. Optional when running with IRSA / instance profile. | +| `publishing.certificates.dns01.route53.secretName` | `""` | Secret name holding the IAM secret access key. Optional when running with IRSA / instance profile. | +| `publishing.certificates.dns01.route53.secretKey` | `"secret-access-key"` | Key inside the Route53 Secret holding the secret access key. | +| `publishing.certificates.dns01.digitalocean.secretName` | `"digitalocean-api-token-secret"` | Secret name holding a DigitalOcean API token with write access to the apex domain. | +| `publishing.certificates.dns01.digitalocean.secretKey` | `"access-token"` | Key inside the Secret holding the DigitalOcean token. | +| `publishing.certificates.dns01.rfc2136.nameserver` | `""` | `host:port` of the authoritative nameserver accepting RFC 2136 dynamic updates. Required when `provider=rfc2136`. | +| `publishing.certificates.dns01.rfc2136.tsigKeyName` | `""` | TSIG key name authorising the dynamic updates. Required when `provider=rfc2136`. | +| `publishing.certificates.dns01.rfc2136.tsigAlgorithm` | `"HMACSHA256"` | TSIG HMAC algorithm. | +| `publishing.certificates.dns01.rfc2136.secretName` | `""` | Secret name holding the TSIG key material. Required when `provider=rfc2136`. | +| `publishing.certificates.dns01.rfc2136.secretKey` | `"tsig-secret-key"` | Key inside the Secret holding the TSIG key. 
| #### Networking From 5ebb32375db8526f53b611bfcfa453efeebfdf67 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Fri, 1 May 2026 13:32:45 +0300 Subject: [PATCH 10/12] docs(gateway-api): address CodeRabbit review on b4d413c Two findings from CodeRabbit on the docs/gateway-api-cilium branch: 1. The platform Package example in `gateway-api.md` was missing the `default` namespace from `gateway.attachedNamespaces`. The actual cozystack platform values (packages/core/platform/values.yaml) include it because the Kubernetes API TLSRoute lives next to the `kubernetes` Service in the `default` namespace. Add the entry and a one-line explanation so a copy-paste of the example matches the working default. 2. `platform-package.md` triggered markdownlint MD052 on the `[KEP-5707][kep-5707]` reference-style link inside the `publishing.exposure` table cell. markdownlint-cli2 has a known parsing limitation around reference-style links inside table cells. Switch to an inline URL and drop the now-orphaned `[kep-5707]:` reference definition. Signed-off-by: Aleksei Sviridkin --- content/en/docs/next/networking/gateway-api.md | 3 +++ .../en/docs/next/operations/configuration/platform-package.md | 3 +-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 4ba48a10..b37ef926 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -124,8 +124,11 @@ spec: - cozy-kubevirt-cdi - cozy-monitoring - cozy-linstor-gui + - default ``` +The `default` namespace is included because the Kubernetes API `TLSRoute` (shipped by the cozystack-api package) lives next to the `kubernetes` Service it points at, which is always in `default`. 
+ Flipping `gateway.enabled=true` wires three things: - cert-manager `ClusterIssuer.spec.acme.solvers` switches from `http01.ingress.ingressClassName` to `http01.gatewayHTTPRoute` that attaches to the publishing tenant's Gateway. diff --git a/content/en/docs/next/operations/configuration/platform-package.md b/content/en/docs/next/operations/configuration/platform-package.md index b40a7642..b8e5cd8c 100644 --- a/content/en/docs/next/operations/configuration/platform-package.md +++ b/content/en/docs/next/operations/configuration/platform-package.md @@ -63,7 +63,7 @@ spec: | `publishing.exposedServices` | `[api, dashboard, vm-exportproxy, cdi-uploadproxy]` | List of services to expose. Possible values: `api`, `dashboard`, `cdi-uploadproxy`, `vm-exportproxy`. | | `publishing.ingressName` | `"tenant-root"` | Ingress controller to use for exposing services. | | `publishing.externalIPs` | `[]` | List of external IPs used for the specified ingress controller. If not specified, a LoadBalancer service is used by default. | -| `publishing.exposure` | `"externalIPs"` | Mode for the ingress-nginx Service. `externalIPs` creates a `ClusterIP` Service with `Service.spec.externalIPs` populated from `publishing.externalIPs`. `loadBalancer` creates a `type: LoadBalancer` Service backed by a `CiliumLoadBalancerIPPool` populated with the same addresses. `Service.spec.externalIPs` is deprecated upstream in Kubernetes v1.36 ([KEP-5707][kep-5707]) — switch to `loadBalancer` before upgrading past v1.40. The chart fails fast if `loadBalancer` is set with an empty `publishing.externalIPs`. See [Gateway API → ingress-nginx Service mode]({{% ref "/docs/next/networking/gateway-api#publishingexposure--ingress-nginx-service-mode" %}}) for the full caveat list. | +| `publishing.exposure` | `"externalIPs"` | Mode for the ingress-nginx Service. `externalIPs` creates a `ClusterIP` Service with `Service.spec.externalIPs` populated from `publishing.externalIPs`. 
`loadBalancer` creates a `type: LoadBalancer` Service backed by a `CiliumLoadBalancerIPPool` populated with the same addresses. `Service.spec.externalIPs` is deprecated upstream in Kubernetes v1.36 ([KEP-5707](https://github.com/kubernetes/enhancements/issues/5707)) — switch to `loadBalancer` before upgrading past v1.40. The chart fails fast if `loadBalancer` is set with an empty `publishing.externalIPs`. See [Gateway API → ingress-nginx Service mode]({{% ref "/docs/next/networking/gateway-api#publishingexposure--ingress-nginx-service-mode" %}}) for the full caveat list. | | `publishing.certificates.solver` | `"http01"` | ACME challenge solver type for default letsencrypt issuer. Possible values: `http01`, `dns01`. | | `publishing.certificates.issuerName` | `"letsencrypt-prod"` | `ClusterIssuer` name for TLS certificates used in system Helm releases. | | `publishing.certificates.dns01.provider` | `"cloudflare"` | DNS-01 provider when `solver=dns01`. Possible values: `cloudflare`, `route53`, `digitalocean`, `rfc2136`. Both the per-tenant Issuer (rendered by `cozystack-controller` from the `TenantGateway` CR) and the cluster-wide `letsencrypt-prod` / `letsencrypt-stage` `ClusterIssuer`s used by the legacy ingress flow read this. 
| @@ -204,6 +204,5 @@ These fields are managed automatically by the Cozystack operator and should not [overwrite-parameters]: {{% ref "/docs/next/operations/configuration/components#overwriting-component-parameters" %}} [Resource Management]: {{% ref "/docs/next/guides/resource-management#cpu-allocation-ratio" %}} [oidc]: {{% ref "/docs/next/operations/oidc" %}} -[kep-5707]: https://github.com/kubernetes/enhancements/issues/5707 [telemetry]: {{% ref "/docs/next/operations/configuration/telemetry" %}} [kube-ovn]: https://kubeovn.github.io/docs/en/guide/subnet/#join-subnet From 09d917bc375b456d10e1e20a7e5a1ffa5972ca30 Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Sat, 2 May 2026 16:29:17 +0300 Subject: [PATCH 11/12] docs(gateway-api): document Variant B auto-on default for derived-apex tenants cozystack/cozystack#2470 lands `tenant.spec.gateway` as auto-on for tenants whose apex is derived from the parent (i.e. `tenant.spec.host` is empty), opt-in for custom-apex tenants, and an explicit opt-out escape hatch via `gateway: false`. Reflect that in the Per-tenant Gateway section: replace the "Set spec.gateway: true on any tenant" framing with the actual three-rule resolution (auto-on, opt-in, opt-out) plus example manifests for each. Update the Gateway section in platform-package.md's parameter table to mirror the new resolution semantics. 
Signed-off-by: Aleksei Sviridkin --- .../en/docs/next/networking/gateway-api.md | 31 ++++++++++++++----- .../configuration/platform-package.md | 4 +-- 2 files changed, 26 insertions(+), 9 deletions(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index b37ef926..89c86730 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -137,9 +137,9 @@ Flipping `gateway.enabled=true` wires three things: The `attachedNamespaces` list restricts which namespaces may attach `HTTPRoute` or `TLSRoute` to tenant Gateways through the listener `allowedRoutes` whitelist (see [Security](#security)). It is also guarded by a runtime `ValidatingAdmissionPolicy` that rejects any `tenant-*` entry, plus a render-time helm `fail` for the same. -### 2. Per-tenant toggle +### 2. Per-tenant Gateway -Set `spec.gateway: true` on any tenant to materialise its `TenantGateway` CR (and through the controller, its `Gateway`, `Issuer`, `Certificate`(s) and `CiliumLoadBalancerIPPool`): +A tenant gets its own `TenantGateway` CR (and through the controller, its `Gateway`, `Issuer`, `Certificate`(s) and `CiliumLoadBalancerIPPool`) automatically when its apex is derived from the parent — i.e. `tenant.spec.host` is left unset and the chart computes `.`. The implicit assumption is that derived-apex tenants want their URLs routable; forcing operators to also set `tenant.spec.gateway: true` would be a needless extra step. ```yaml apiVersion: apps.cozystack.io/v1alpha1 @@ -147,15 +147,32 @@ kind: Tenant metadata: name: alice namespace: tenant-root +spec: {} # gateway auto-on, host derived as alice. 
+``` + +For tenants with a custom non-derived apex (independent domain like `customer1.io`, not a subdomain), the operator made a deliberate apex choice — keep explicit opt-in to avoid surprising LB IP / ACME registration on tenants the operator may not have intended to expose: + +```yaml +apiVersion: apps.cozystack.io/v1alpha1 +kind: Tenant +metadata: + name: acme + namespace: tenant-root spec: - gateway: true - resourceQuotas: - count/certificates.cert-manager.io: "10" + host: customer1.io + gateway: true # required: custom apex does not auto-default +``` + +Operators who specifically want a derived-apex tenant without a Gateway (e.g. a dev sandbox without external exposure) opt out explicitly: + +```yaml +spec: + gateway: false # escape hatch — disables the auto-on default ``` -Tenants may leave `spec.host` empty — the tenant chart computes it as `.`. Setting `spec.host` is reserved for cluster-admins and cozystack/Flux service accounts (enforced runtime by `cozystack-tenant-host-policy`, see [Security](#security)). +Setting `tenant.spec.host` to a custom value is reserved for cluster-admins and cozystack/Flux service accounts (enforced runtime by `cozystack-tenant-host-policy`, see [Security](#security)). -A child tenant with `spec.gateway: true` receives its own Gateway, its own Certificate(s), and its own `Issuer` that talks to Let's Encrypt on a separate ACME account — so child tenants do not share HTTP-01 challenge state with the parent or with siblings. There is no "share the parent's Gateway" mode; per-tenant Gateway is a deliberate isolation property of the security model. +A tenant Gateway, regardless of how it was opted in, is its own per-tenant boundary: separate `Gateway`, separate `Issuer` and ACME account, separate `Certificate`(s) — child tenants do not share HTTP-01 challenge state with the parent or with siblings. There is no "share the parent's Gateway" mode; per-tenant Gateway is a deliberate isolation property of the security model. 
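
The three rules (auto-on, opt-in, opt-out) reduce to a small decision function. The sketch below is an assumption-laden illustration: `gateway_wanted` is a hypothetical name, `None` models an unset field, and it assumes an explicit `gateway` value always takes precedence. The platform-level `gateway.enabled` switch is not modelled here.

```python
from typing import Optional


def gateway_wanted(host: Optional[str], gateway: Optional[bool]) -> bool:
    """Resolve whether a tenant should get its own Gateway.

    host    -- tenant.spec.host (None or "" means the apex is derived)
    gateway -- tenant.spec.gateway (None means the field is unset)
    """
    if gateway is not None:
        # Explicit true (opt-in) or false (opt-out) is taken as-is.
        return gateway
    # Unset: auto-on only when the apex is derived from the parent.
    return not host


if __name__ == "__main__":
    print(gateway_wanted(None, None))            # derived apex, unset
    print(gateway_wanted("customer1.io", None))  # custom apex, unset
    print(gateway_wanted(None, False))           # derived apex, opted out
```

The asymmetry is in the unset case only: a custom apex never turns a Gateway on by default, which matches the stated goal of avoiding surprise LB IP and ACME registration.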
## Cert mode: HTTP-01 (default) vs DNS-01 (opt-in) diff --git a/content/en/docs/next/operations/configuration/platform-package.md b/content/en/docs/next/operations/configuration/platform-package.md index b8e5cd8c..fe3f59e3 100644 --- a/content/en/docs/next/operations/configuration/platform-package.md +++ b/content/en/docs/next/operations/configuration/platform-package.md @@ -115,11 +115,11 @@ spec: #### Gateway -Platform-wide Gateway API integration. Per-tenant opt-in is governed separately by `tenant.spec.gateway`. See the [Gateway API guide]({{% ref "/docs/next/networking/gateway-api" %}}) for the full architecture and migration path. +Platform-wide Gateway API integration. The actual per-tenant Gateway is materialised on a tenant-by-tenant basis: derived-apex tenants (the common case) opt in automatically; custom-apex tenants opt in explicitly via `tenant.spec.gateway: true`. See the [Gateway API guide]({{% ref "/docs/next/networking/gateway-api" %}}) for the full architecture and migration path. | Value | Default | Description | | --- | --- | --- | -| `gateway.enabled` | `false` | Enable Gateway API support across the platform. When `true`, cert-manager `ClusterIssuer`s use an `http01.gatewayHTTPRoute` solver attached to the publishing tenant's Gateway, and exposed services (`dashboard`, `keycloak`, `harbor`, `bucket`, `cozystack-api`, `vm-exportproxy`, `cdi-uploadproxy`) render `HTTPRoute`/`TLSRoute` instead of `Ingress`. Materialising the actual per-tenant Gateway still requires `tenant.spec.gateway: true`. | +| `gateway.enabled` | `false` | Enable Gateway API support across the platform. When `true`, cert-manager `ClusterIssuer`s use an `http01.gatewayHTTPRoute` solver attached to the publishing tenant's Gateway, and exposed services (`dashboard`, `keycloak`, `harbor`, `bucket`, `cozystack-api`, `vm-exportproxy`, `cdi-uploadproxy`) render `HTTPRoute`/`TLSRoute` instead of `Ingress`. 
Materialising the actual per-tenant Gateway is auto-on for tenants whose apex is derived from the parent (`tenant.spec.host` empty); custom-apex tenants need explicit `tenant.spec.gateway: true`. | | `gateway.attachedNamespaces` | (see below) | Namespaces allowed to attach `HTTPRoute` or `TLSRoute` to a tenant Gateway via the listener `allowedRoutes` whitelist (matched on the built-in `kubernetes.io/metadata.name` label). The publishing tenant's namespace is always implicitly included. Tenant namespaces (`tenant-*`) are rejected by `cozystack-gateway-attached-namespaces-policy` and by a render-time helm guard — use `tenant.spec.gateway` instead. The `default` namespace is included by default because the Kubernetes API `TLSRoute` lives next to the `kubernetes` Service in `default`. | Default `gateway.attachedNamespaces`: From 830357b49a4373e489b9bce6a68c6300a25b8f0c Mon Sep 17 00:00:00 2001 From: Aleksei Sviridkin Date: Sat, 9 May 2026 19:16:06 +0300 Subject: [PATCH 12/12] docs(gateway-api): security model as three groups, anchor on tenant API surface Mirrors the security framing rewrite in cozystack/cozystack#2470 README: - Security section opens with the three-group framing (tenant-user- input gates / defense-in-depth / admin-against-themselves), anchored in the apps.cozystack.io/* tenant API surface. - Mermaid diagram redrawn so the attacker arrow lands on Layer 4 (cozystack-api admission of Tenant.spec.host) as the user-input boundary; defense-in-depth and admin-against-themselves layers branch off as separate sources. - Layer 7 wording reframed: drop the implication that a tenant user with HTTPRoute RBAC could exploit the cross-apex hostname surface. Tenants in Cozystack do not hold gateway.networking.k8s.io/* RBAC by design. Reframed as defense-in-depth against an app chart bug or supply-chain compromise. - New Tenant API surface subsection in Overview anchors that constraint up front, so the rest of the security model reads correctly without re-deriving it. 
Description metadata flips 'seven-layer cross-tenant isolation' to 'three-group security model' to match. Assisted-By: Claude Signed-off-by: Aleksei Sviridkin --- .../en/docs/next/networking/gateway-api.md | 50 +++++++++++++------ 1 file changed, 34 insertions(+), 16 deletions(-) diff --git a/content/en/docs/next/networking/gateway-api.md b/content/en/docs/next/networking/gateway-api.md index 89c86730..c1a6a59c 100644 --- a/content/en/docs/next/networking/gateway-api.md +++ b/content/en/docs/next/networking/gateway-api.md @@ -1,7 +1,7 @@ --- title: "Gateway API (Cilium)" linkTitle: "Gateway API" -description: "Per-tenant Gateway API ingress backed by Cilium — TenantGateway CRD, cert-manager integration, TLS termination and passthrough, seven-layer cross-tenant isolation." +description: "Per-tenant Gateway API ingress backed by Cilium — TenantGateway CRD, cert-manager integration, TLS termination and passthrough, three-group security model." weight: 15 --- @@ -11,10 +11,14 @@ Cozystack ships Gateway API support as an opt-in alternative to ingress-nginx. W The chart does not render `Gateway`, `Issuer`, or `Certificate` resources directly. Instead it renders one `gateway.cozystack.io/v1alpha1 TenantGateway` CR per tenant, and `cozystack-controller` reconciles all the downstream Gateway API and cert-manager objects from there. This avoids the Helm-vs-controller race on `Gateway.spec.listeners` that route-driven dynamic listener materialization would otherwise cause. -This page documents the architecture, the two-step opt-in, the cert-mode choice (HTTP-01 default vs DNS-01 wildcard opt-in), the seven-layer security model, and the migration story from ingress-nginx. +This page documents the architecture, the two-step opt-in, the cert-mode choice (HTTP-01 default vs DNS-01 wildcard opt-in), the three-group security model, and the migration story from ingress-nginx. 
Gateway API and ingress-nginx coexist on the same cluster — the two modes are selected per service / per tenant, not globally. Existing clusters upgrade with `gateway.enabled=false` and see no behavioural change. +### Tenant API surface + +Tenants in Cozystack interact with the platform exclusively through `apps.cozystack.io/*` resources (Tenant, Bucket, Kubernetes, …) served by `cozystack-api`. Tenant RBAC (`cozy:tenant:*` aggregated to a RoleBinding in the tenant's own namespace) does not grant write access to `gateway.networking.k8s.io/*`, core `Namespaces`, or `cozystack.io/Package`. The security model below is built around that constraint — tenants do not write Gateways or HTTPRoutes directly, so most of its layers protect against chart bugs, controller bugs, supply-chain compromise, and cluster-admin mistakes rather than against tenant-user input. + ## Architecture ### Reconciliation flow @@ -239,34 +243,48 @@ The Passthrough listener is added to the Gateway only if the corresponding servi ## Security -Gateway API multi-tenancy in Cozystack is guarded at **seven independent layers**: one listener-level selector (Layer 1, controller-owned), five `ValidatingAdmissionPolicy` gates (Layers 2-5, 7), and one render-time helm guard (Layer 6). Compromising one of them does not bypass the others; admission-time checks fail closed (`failurePolicy: Fail`, `validationActions: [Deny]`). +Tenants in Cozystack interact with the platform exclusively through `apps.cozystack.io/*` resources (Tenant, Bucket, Kubernetes, …) served by `cozystack-api`. Tenant RBAC (`cozy:tenant:*` aggregated to a RoleBinding in the tenant's own namespace) does not grant write access to `gateway.networking.k8s.io/*`, core `Namespaces`, or `cozystack.io/Package`. 
The protections below split into three groups by who they defend against — most of the seven layers are not protecting against tenant-user input (that RBAC isn't granted in the first place); they guard against bugs in cozystack-controller / Flux, supply-chain compromise of an app chart, and confused-deputy mistakes by a cluster admin. All admission-time checks are fail-closed (`failurePolicy: Fail`, `validationActions: [Deny]`). + +**Tenant-user-input gates** — Layer 4 (`cozystack-tenant-host-policy`). `Tenant.spec.host` is the user-supplied field that surfaces as a security boundary at the hostname layer; it is gated on every Create / Update via `cozystack-api`'s admission chain. + +**Defense-in-depth** — Layers 1, 2, 5, 6, 7. Cover chart bugs, controller bugs, supply-chain compromise of an app chart, confused-deputy admin mistakes. They do not protect against tenant-user input because the relevant RBAC isn't granted. + +**Admin-against-themselves** — Layer 3 (`cozystack-gateway-attached-namespaces-policy`). Rejects a `kubectl edit packages.cozystack.io` that would slip a `tenant-*` entry into the platform Package's `gateway.attachedNamespaces`. Layer 6 catches the same misconfiguration at helm render time. ```mermaid flowchart TD - ATK["Attacker
(tenant user, misconfig, compromised SA)"] + USER["Tenant user
(apps.cozystack.io/* via cozystack-api)"] + CHART["App chart bug /
supply-chain compromise"] + ADMIN["Cluster admin
(misconfig)"] + + L4["L4 VAP: Tenant spec.host writes
restricted to trusted callers"] + L1["L1: Listener allowedRoutes selector
(kubernetes.io/metadata.name)"] L2["L2 VAP: Gateway listener hostname
matches namespace.cozystack.io/host"] - L3["L3 VAP: Package attachedNamespaces
rejects tenant-*"] - L4["L4 VAP: Tenant spec.host writes
restricted to trusted callers"] L5["L5 VAP: namespace.cozystack.io/host label
writes restricted to trusted callers"] L6["L6 Render-time helm fail
tenant-* in attachedNamespaces"] L7["L7 VAP: HTTPRoute/TLSRoute hostnames
match namespace label (tenant-* only)"] + + L3["L3 VAP: Package attachedNamespaces
rejects tenant-*"] + GW["Cross-tenant hostname hijack
BLOCKED"] - ATK --> L1 - ATK --> L2 - ATK --> L3 - ATK --> L4 - ATK --> L5 - ATK --> L6 - ATK --> L7 + USER -->|Tenant spec.host| L4 + L4 --> GW + + CHART -->|emits Gateway/HTTPRoute| L1 + CHART --> L2 + CHART --> L5 + CHART --> L6 + CHART --> L7 L1 --> GW L2 --> GW - L3 --> GW - L4 --> GW L5 --> GW L6 --> GW L7 --> GW + + ADMIN -->|kubectl edit Package| L3 + L3 --> GW ``` ### Layer 1 — Listener `allowedRoutes` namespace whitelist @@ -307,7 +325,7 @@ The cozystack-basics chart fails the helm render if `_cluster.gateway-attached-n `ValidatingAdmissionPolicy` scoped to `gateway.networking.k8s.io/v1 HTTPRoute` and `v1alpha2 TLSRoute` CREATE/UPDATE. Scoped to `tenant-*` namespaces (cozy-* are cluster-admin-managed and trusted to publish under any apex). Rejects any `spec.hostnames` entry that is not equal to the namespace's `namespace.cozystack.io/host` label or a subdomain of it. **Fail-closed when the label is absent** — a `tenant-*` namespace without `namespace.cozystack.io/host` is rejected, not silently allowed. -Closes the cross-apex hostname surface a tenant user with HTTPRoute RBAC could otherwise exploit. The within-apex cross-namespace case (a tenant claiming a hostname owned by a `cozy-*` app) is handled by the controller at reconcile time — see [HostnameConflict resolution](#hostnameconflict-resolution) below. +Defense-in-depth against an app chart bug or supply-chain compromise that emits Gateway API resources outside the tenant's apex — tenants in Cozystack do not hold `gateway.networking.k8s.io/*` RBAC by design, so this is not a tenant-user defense. The within-apex cross-namespace case (a tenant chart claiming a hostname owned by a `cozy-*` app) is handled by the controller at reconcile time — see [HostnameConflict resolution](#hostnameconflict-resolution) below. For `tenant-root` the allowed host suffix is `publishing.host`; for any `tenant-` that inherits from its parent the suffix is `.`. 
A child tenant with an independent apex (`customer1.io` instead of a subdomain) is handled correctly because the VAP reads the per-namespace label rather than assuming a subdomain hierarchy.
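
The Layer 7 hostname rule described above (equal to the namespace's `namespace.cozystack.io/host` label or a subdomain of it, fail-closed when the label is absent) can be sketched in a few lines. This is an illustrative Python model only, not the policy's actual CEL expression: the function names `hostname_allowed` and `rejected_hostnames` are hypothetical, and the sketch assumes plain case-sensitive string comparison. How the real policy treats wildcard hostnames or an empty `spec.hostnames` list is not covered here.

```python
HOST_LABEL = "namespace.cozystack.io/host"


def hostname_allowed(hostname: str, namespace_labels: dict[str, str]) -> bool:
    """Return True if hostname is the namespace apex or a subdomain of it."""
    apex = namespace_labels.get(HOST_LABEL)
    if not apex:
        # Fail closed: a namespace without the host label admits nothing.
        return False
    # The leading dot keeps "evilcustomer1.io" from matching "customer1.io".
    return hostname == apex or hostname.endswith("." + apex)


def rejected_hostnames(hostnames: list[str], namespace_labels: dict[str, str]) -> list[str]:
    """List the route hostnames that fall outside the namespace apex."""
    return [h for h in hostnames if not hostname_allowed(h, namespace_labels)]


if __name__ == "__main__":
    labels = {HOST_LABEL: "customer1.io"}
    print(rejected_hostnames(["app.customer1.io", "other.example.org"], labels))
```

Because the check reads the label rather than deriving the apex from the parent tenant, the same function covers both a derived apex and an independent one such as `customer1.io`.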