Commit c033f1d

feat: [pko-351] added documentation for HostedClusterPackage API
Signed-off-by: Ankit152 <ankitkurmi152@gmail.com>
1 parent a015b2d commit c033f1d

2 files changed

Lines changed: 516 additions & 70 deletions

Lines changed: 226 additions & 0 deletions
@@ -0,0 +1,226 @@
1+
---
2+
title: HostedClusterPackage API
3+
weight: 1201
4+
images: []
5+
mermaid: true
6+
---
7+
8+
**Note**: The `HostedClusterPackage` API is experimental and subject to
9+
change in the future.
10+
11+
The `HostedClusterPackage` API extends Package Operator with progressive
12+
rollout capabilities for Packages targeting HyperShift Hosted Control Planes
13+
(HCP). It introduces a cluster-scoped custom resource, the
14+
`HostedClusterPackage`, which governs the lifecycle and update process of
15+
Packages across all hosted control planes within a HyperShift Management
16+
Cluster.
17+
18+
## Overview
19+
20+
This API allows for the central rollout of `Packages` to all hosted control
21+
planes on a given management cluster. It provides facilities to control
22+
rollout strategies, significantly reducing the "blast radius" of failed
23+
upgrades compared to simultaneous updates. It also simplifies the
24+
configuration required to deliver objects into an HCP namespace by reducing
25+
the dependency on multiple systems down to a single API.
26+
27+
```mermaid
flowchart LR
  subgraph "HyperShift Management Cluster"
    metrics-hsp["<b>HostedClusterPackage</b><br>example-package"]
    subgraph "Namespace: my-cluster-x"
      ns-c1["<b>Package</b><br>example-package"]
    end
    subgraph "Namespace: my-cluster-y"
      ns-c2["<b>Package</b><br>example-package"]
    end
  end
  metrics-hsp--->ns-c1
  metrics-hsp--->ns-c2
```
41+
42+
### Key Features
43+
44+
* **Progressive Rollout**: Updates are rolled out gradually to `HostedClusters`
45+
rather than all at once.
46+
* **Lifecycle Automation**: Automatically creates Packages for new
47+
`HostedClusters` and deletes them when the cluster is removed.
48+
* **Status & Monitoring**: Provides status updates on the number
49+
of available, unavailable, or updated packages.
50+
* **Simplified Configuration**: Reduces the configuration surface from
51+
multiple objects to just the `HostedClusterPackage` API.
52+
53+
## HostedClusterPackage Resource
54+
55+
The `HostedClusterPackage` resource is the configuration object for this
functionality. It coordinates the rollout process and determines which
`HostedClusters` in the fleet are targeted.
58+
59+
### Targeting Clusters
60+
61+
The list of `HostedCluster` objects can be limited by specifying an optional
62+
label selector.
63+
64+
Example selecting `HostedClusters` with the `foo: bar` label:
65+
66+
```yaml
spec:
  hostedClusterSelector:
    matchLabels:
      foo: bar
  template: {}
```
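
To illustrate the selection semantics, here is a minimal Python sketch, assuming `matchLabels` follows the usual Kubernetes rule that every listed key/value pair must be present on the object. The function name and the sample cluster labels are hypothetical and are not part of Package Operator.

```python
def matches_selector(labels: dict[str, str], match_labels: dict[str, str]) -> bool:
    """Return True if every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Hypothetical HostedCluster labels, keyed by cluster name.
clusters = {
    "my-cluster-x": {"foo": "bar", "region": "eu"},
    "my-cluster-y": {"foo": "baz"},
    "my-cluster-z": {},
}

# Only clusters carrying foo=bar are targeted.
selected = [
    name
    for name, labels in clusters.items()
    if matches_selector(labels, {"foo": "bar"})
]
```

Under this assumption an empty `matchLabels` map matches every cluster, which is consistent with the selector being optional.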
73+
74+
### Partitioning
75+
76+
To control the order of updates, you can attach an optional partition
configuration to the `HostedClusterPackage`. This ensures that all
`HostedClusters` within a specific group are processed before the rollout
moves to the next group.
79+
80+
* **Grouping**: The configuration uses labels on the `HostedCluster` object to
assign groups (e.g., `hypershift.openshift.io/risk-group`).
82+
* **Ordering**: Groups can be ordered via a static list or by alphanumeric
83+
ascending order.
84+
* **Implicit Handling**: HostedClusters without the specified label or with
85+
unknown values are placed in an implicit "unknown" group and upgraded last.
86+
* **Dynamic Regrouping**: If a cluster's label changes to an earlier group
87+
during an upgrade, the process will jump back to handle that group before
88+
continuing.
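
The grouping and ordering rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the controller's implementation: `partition_clusters` and its inputs are hypothetical, the implicit group is simply named `"unknown"` here, and "alphanumeric ascending" is approximated with Python's default string sort.

```python
def partition_clusters(clusters, label_key, static_order=None):
    """Group cluster names by label value and return the groups in rollout order.

    clusters: dict mapping cluster name -> labels dict (hypothetical input shape).
    static_order: list of group names, or None for alphanumeric ascending order.
    """
    groups = {}
    unknown = []
    for name, labels in sorted(clusters.items()):
        value = labels.get(label_key)
        # A value is known if the label exists and, for static ordering,
        # appears in the configured list.
        known = value is not None and (static_order is None or value in static_order)
        if known:
            groups.setdefault(value, []).append(name)
        else:
            unknown.append(name)

    if static_order is not None:
        order = [group for group in static_order if group in groups]
    else:
        order = sorted(groups)

    result = [(group, groups[group]) for group in order]
    if unknown:
        # The implicit group is always handled last.
        result.append(("unknown", unknown))
    return result
```

Recomputing this ordering on every reconciliation is one way to get the dynamic regrouping behavior: a cluster whose label moves to an earlier group shows up there on the next pass.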
89+
90+
Example for `static` ordering:
91+
92+
```yaml
spec:
  partition:
    labelKey: hypershift.openshift.io/risk-group
    order:
      static:
      - early
      - normal
      - late
```
102+
103+
Example for `alphanumeric` ordering:
104+
105+
```yaml
spec:
  partition:
    labelKey: hypershift.openshift.io/risk-group
    order:
      alphanumericAsc: {}
```
112+
113+
### Progression Strategies
114+
115+
The API supports configurable progression strategies to control the speed
116+
and safety of the rollout.
117+
118+
#### Rolling Upgrade
119+
120+
The `rollingUpgrade` strategy is designed to keep service disruptions to a
121+
minimum.
122+
123+
* `maxUnavailable`: Configures the maximum number of Package instances that
124+
can be updating or unavailable at the same time. If a Package is already
125+
unavailable before the upgrade starts, it counts towards this limit. These
126+
unavailable packages are prioritized for updates to prevent accumulating
127+
faulty versions.
128+
129+
```yaml
spec:
  strategy:
    rollingUpgrade:
      maxUnavailable: 1
```
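
One way to read the `maxUnavailable` rule is as a budget that already-unavailable Packages consume first. The Python sketch below illustrates that reading only. The function, the package dictionaries, and the choice to always update outdated unavailable Packages (since they already count against the limit) are assumptions, not the controller's actual logic.

```python
def select_for_update(packages, max_unavailable):
    """Pick the Packages that may start updating in this round.

    packages: list of dicts with hypothetical keys
      "name" (str), "updated" (bool), "available" (bool).
    """
    # Every unavailable Package consumes budget, updated or not.
    unavailable = [p for p in packages if not p["available"]]
    budget = max_unavailable - len(unavailable)

    # Assumption: outdated, unavailable Packages go first because updating
    # them does not take any additional Package out of service.
    selected = [p["name"] for p in unavailable if not p["updated"]]

    # Whatever budget is left goes to outdated Packages that are still healthy.
    healthy_outdated = [
        p["name"] for p in packages if p["available"] and not p["updated"]
    ]
    if budget > 0:
        selected += healthy_outdated[:budget]
    return selected
```

With `maxUnavailable: 1` and one Package already unavailable, this sketch updates only that Package and leaves the healthy ones untouched until it recovers.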
135+
136+
## Status & Observability
137+
138+
The `HostedClusterPackage` API exposes status information to help you track the
139+
progress of a rollout and the health of the fleet. This status can help you
140+
understand if an update is proceeding smoothly or if it has stalled due to
141+
errors.
142+
143+
### Rollout State
144+
145+
The status subresource provides high-level metrics regarding the rollout
146+
process. These fields allow you to quickly assess the distribution of package
147+
versions across your `HostedClusters`:
148+
149+
* **Updated Packages**: The number of `HostedClusters` that have successfully
150+
received the latest version of the Package.
151+
152+
* **Available Packages**: The number of `HostedClusters` where the Package is
currently healthy, as indicated by the `Available=True` status condition
of the Package.
155+
156+
* **Unavailable Packages**: The number of `HostedClusters` where the Package is
currently degraded. This count is used to enforce the
`maxUnavailable` limit defined in the progression strategy.
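
As a rough illustration of how these counters relate to each other, the Python sketch below derives them from a list of per-cluster Package states. The input shape, the `summarize` function, and the `unavailablePackages` key are hypothetical. The other key names mirror fields shown in the status example in this document.

```python
def summarize(packages, desired_image):
    """Compute rollout counters from per-cluster Package states.

    packages: list of dicts with hypothetical keys "image" (str) and "available" (bool).
    """
    total = len(packages)
    updated = sum(1 for p in packages if p["image"] == desired_image)
    available = sum(1 for p in packages if p["available"])
    return {
        "totalPackages": total,
        "updatedPackages": updated,
        "availablePackages": available,
        # Derived value; this key name is an assumption for illustration.
        "unavailablePackages": total - available,
    }
```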
159+
160+
### Progression Logic
161+
162+
The controller uses the status of individual Packages to determine if the
163+
rollout can proceed to the next target.
164+
165+
* **Success**: If a targeted HostedCluster successfully updates and becomes
166+
available, the operator proceeds to select the next cluster in the partition.
167+
168+
* **Failure**: If a Package update fails or the Package becomes unavailable,
the rollout pauses at that target hosted cluster. This prevents the
propagation of errors to the rest of the fleet.
171+
172+
Example `Status` and `Conditions` for a successful rollout:
173+
174+
```yaml
status:
  availablePackages: 3
  conditions:
  - lastTransitionTime: "2026-02-05T06:59:51Z"
    message: 3/3 packages available.
    observedGeneration: 2
    reason: EnoughPackagesAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2026-02-05T06:59:57Z"
    message: 3/3 packages progressed.
    observedGeneration: 2
    reason: AllPackagesProgressed
    status: "False"
    type: Progressing
  - lastTransitionTime: "2026-02-05T06:57:57Z"
    message: 0/3 packages paused.
    observedGeneration: 2
    reason: NoPackagePaused
    status: "False"
    type: HasPausedPackage
  observedGeneration: 2
  progressedPackages: 3
  totalPackages: 3
  updatedPackages: 3
```
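
A script that waits for a rollout could evaluate such a status roughly as in the Python sketch below. It assumes the status has already been loaded into a dictionary (for example from `kubectl get -o json` output), and the `rollout_complete` helper and its notion of "complete" are illustrative assumptions rather than an official check.

```python
def rollout_complete(status):
    """Return True if the status looks like a finished, healthy rollout."""
    # Index the conditions by type for easy lookup.
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    return (
        conditions.get("Available") == "True"
        and conditions.get("Progressing") == "False"
        and conditions.get("HasPausedPackage") == "False"
        and status.get("updatedPackages") == status.get("totalPackages")
    )


# Trimmed copy of the example status above.
example_status = {
    "availablePackages": 3,
    "conditions": [
        {"type": "Available", "status": "True"},
        {"type": "Progressing", "status": "False"},
        {"type": "HasPausedPackage", "status": "False"},
    ],
    "totalPackages": 3,
    "updatedPackages": 3,
}

done = rollout_complete(example_status)
```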
201+
202+
## Configuration Example
203+
204+
The following YAML example demonstrates a HostedClusterPackage configured with
205+
risk-based partitioning and a rolling upgrade strategy.
206+
207+
```yaml
apiVersion: package-operator.run/v1alpha1
kind: HostedClusterPackage
metadata:
  name: example-hosted-cluster-package
spec:
  partition:
    labelKey: hypershift.openshift.io/risk-group
    order:
      alphanumericAsc: {}
  strategy:
    rollingUpgrade:
      maxUnavailable: 1 # Max packages to update concurrently
  template:
    metadata:
      labels:
        foo: bar
    spec:
      image: some-registry.io/foo-bar/test:v0.0.1
```
