43 changes: 43 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -290,6 +290,7 @@ aws-smithy-types = { version = "1.1.8", features = ["byte-stream-poll-next"] }
aws-types = "1.3.9"
axum = { version = "0.8.8", features = ["ws"] }
axum-extra = { version = "0.12.5", features = ["typed-header"] }
axum-server = { version = "0.8.0", features = ["tls-openssl"] }
azure_core = "0.21.0"
azure_identity = "0.21.0"
azure_storage = "0.21.0"
Expand Down
8 changes: 5 additions & 3 deletions bin/bump-version
Original file line number Diff line number Diff line change
Expand Up @@ -70,9 +70,11 @@ rm -f src/{clusterd,environmentd,materialized,persist-client,testdrive,catalog-d

cargo update --workspace

crd_descriptions_json=doc/user/data/self_managed/materialize_crd_descriptions.json
cargo run -p mz-cloud-resources --bin crd-writer > "${crd_descriptions_json}"
git add "${crd_descriptions_json}"
for crd_version in v1alpha1 v1; do
crd_descriptions_json="doc/user/data/self_managed/materialize_crd_descriptions_${crd_version}.json"
cargo run -p mz-cloud-resources --bin crd-writer -- "${crd_version}" > "${crd_descriptions_json}"
git add "${crd_descriptions_json}"
done

bin/helm-chart-version-bump --bump-orchestratord-version "v$version"

Expand Down
26 changes: 26 additions & 0 deletions ci/nightly/pipeline.template.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2439,6 +2439,32 @@ steps:
agents:
queue: hetzner-aarch64-16cpu-32gb

- id: orchestratord-v1-opt-in
label: "Orchestratord v1 opt-in tests"
artifact_paths: ["mz_debug_*.zip"]
depends_on: devel-docker-tags
timeout_in_minutes: 120
plugins:
- ./ci/plugins/mzcompose:
composition: orchestratord
run: v1-opt-in
ci-builder: stable
agents:
queue: hetzner-aarch64-16cpu-32gb

- id: orchestratord-manually-promote
label: "Orchestratord ManuallyPromote tests"
artifact_paths: ["mz_debug_*.zip"]
depends_on: devel-docker-tags
timeout_in_minutes: 120
plugins:
- ./ci/plugins/mzcompose:
composition: orchestratord
run: manually-promote
ci-builder: stable
agents:
queue: hetzner-aarch64-16cpu-32gb

- id: emulator
label: Materialize Emulator
depends_on: build-aarch64
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ Additionally, the current system is difficult to automate when faced with evicti

1. **Automatic rollout detection**: The system should automatically detect when a rollout is needed based on spec changes, without requiring users to manually set a UUID.

2. **Seamless version migration**: Existing v1alpha1 resources should continue to work, with automatic conversion to v1alpha2 as needed.
2. **Seamless version migration**: Existing v1alpha1 resources should continue to work, with automatic conversion to v1 as needed.

3. **Terraform compatibility**: Configuration must not fight with infrastructure as code tools such as Terraform.

Expand All @@ -34,9 +34,9 @@ Additionally, the current system is difficult to automate when faced with evicti

## Solution Proposal

### 1. New CRD Version: v1alpha2
### 1. New CRD Version: v1

Introduce a new `v1alpha2` version of the Materialize CRD with the following changes:
Introduce a new `v1` version of the Materialize CRD with the following changes:

**Spec changes:**
- Remove `requestRollout` (`Uuid`) - Rollouts are now triggered automatically when the spec hash changes.
Expand Down Expand Up @@ -122,14 +122,14 @@ A new HTTPS webhook server handles CRD version conversion:
**Endpoint:** `POST /convert`

**Supported conversions:**
- v1alpha1 -> v1alpha2
- v1alpha2 -> v1alpha1\*
- v1alpha1 -> v1
- v1 -> v1alpha1\*

\*The API server seemed to want this, I don't know why. We can't reconcile these, so going back never makes sense.

**Key conversion logic:**

###### v1alpha1 to v1alpha2:
###### v1alpha1 to v1:
- Spec fields:
  - `forcePromote: Uuid` becomes `forcePromote: Option<String>` (nil UUID becomes None)
  - `requestRollout` is removed.
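
The `forcePromote` mapping above could be sketched roughly as follows. This is a minimal illustration, not the webhook's actual code: the helper name `convert_force_promote` and the choice to handle the UUID in its textual form are assumptions made for the example.

```rust
/// Textual form of the nil UUID, which the v1alpha1 field uses for "not set".
const NIL_UUID: &str = "00000000-0000-0000-0000-000000000000";

/// Hypothetical helper (not from the PR): map a v1alpha1 `forcePromote`
/// value, given as text, to the v1 `Option<String>` form.
/// The nil UUID becomes `None`; any other value is carried over unchanged.
fn convert_force_promote(v1alpha1_value: &str) -> Option<String> {
    if v1alpha1_value == NIL_UUID {
        None
    } else {
        Some(v1alpha1_value.to_string())
    }
}
```

Representing "unset" as `None` rather than a sentinel UUID means a v1 manifest can simply omit the field.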
Expand All @@ -144,52 +144,52 @@ A new HTTPS webhook server handles CRD version conversion:
- If we are already in "promoting" status, we should unconditionally complete the promotion for the current rollout rather than destroying and replacing it.
This may trigger an additional rollout this one time, but I don't know any way around that. I think this is acceptable given the user is doing something very weird by updating orchestratord mid-rollout.

###### v1alpha2 to v1alpha1:
###### v1 to v1alpha1:

We need to include the `lastCompletedRolloutHash` from v1alpha2 in v1alpha1 as well. This is required for round tripping from v1alpha2 -> v1alpha1 -> v1alpha2,
which may happen if a user applies a v1alpha1 change over a v1alpha2 object.
We need to include the `lastCompletedRolloutHash` from v1 in v1alpha1 as well. This is required for round tripping from v1 -> v1alpha1 -> v1,
which may happen if a user applies a v1alpha1 change over a v1 object.

In the case there is an existing `lastCompletedRolloutHash`, it should be kept as-is through the round trip. As we never reconcile with v1alpha1, it should only change at v1alpha2, so this should be safe.
In the case there is an existing `lastCompletedRolloutHash`, it should be kept as-is through the round trip. As we never reconcile with v1alpha1, it should only change at v1, so this should be safe.

No attempt is made to support v1alpha1 beyond giving a valid v1alpha1 structure and supporting round tripping to v1alpha2. Fields that do not exist in v1alpha2 may have their nil value.
No attempt is made to support v1alpha1 beyond giving a valid v1alpha1 structure and supporting round tripping to v1. Fields that do not exist in v1 may have their nil value.

##### Example round trips

In these examples, we assume that orchestratord's attempt to update the stored version succeeds and that reconciliation is triggered after this update. This is only to simplify this document, and is not necessary for correctness. If orchestratord's attempt to update the stored version fails, or the reconciliation is triggered first, the conversion webhook is simply called at that time and we will reconcile the same v1alpha2 object.
In these examples, we assume that orchestratord's attempt to update the stored version succeeds and that reconciliation is triggered after this update. This is only to simplify this document, and is not necessary for correctness. If orchestratord's attempt to update the stored version fails, or the reconciliation is triggered first, the conversion webhook is simply called at that time and we will reconcile the same v1 object.

###### Simplest case
1. There is a stored v1alpha1 Materialize resource, not actively rolling out, with both `status.lastCompletedRolloutRequest` and `spec.requestRollout` matching.
1. Orchestratord gets updated to a version with v1alpha2 support.
1. Orchestratord lists existing v1alpha1 resources on startup, in order to upgrade them to v1alpha2.
1. The API server calls the conversion webhook, which returns a v1alpha2 resource. In this case, it would have `status.lastCompletedRolloutHash` and `status.requestedRolloutHash` set to the same calculated hash after conversion.
1. Orchestratord calls `replace` to store the resource as v1alpha2.
1. Orchestratord gets notified of the new v1alpha2 resource, but determines there is nothing to do.
1. Orchestratord gets updated to a version with v1 support.
1. Orchestratord lists existing v1alpha1 resources on startup, in order to upgrade them to v1.
1. The API server calls the conversion webhook, which returns a v1 resource. In this case, it would have `status.lastCompletedRolloutHash` and `status.requestedRolloutHash` set to the same calculated hash after conversion.
1. Orchestratord calls `replace` to store the resource as v1.
1. Orchestratord gets notified of the new v1 resource, but determines there is nothing to do.

At this point, the stored version is v1alpha2, and no rollout is triggered.
At this point, the stored version is v1, and no rollout is triggered.

1. The user then applies a v1alpha1 resource. It contains some change that affects the hash (e.g., `spec.environmentd_image_ref`). It may or may not include `spec.requestRollout`; that doesn't matter.
1. Before storing this change, the API server calls the conversion webhook, which returns a v1alpha2 resource. In this case, it should not contain a status, as the user applied v1alpha1 resource did not contain a status (TODO verify this).
1. Orchestratord gets notified of the new v1alpha2 resource, which contains the old status not yet updated after the applied v1alpha1 resource. This means the `status.lastCompletedRolloutHash` and `status.requestedRolloutHash` still match each other, but do not match the calculated hash.
1. Before storing this change, the API server calls the conversion webhook, which returns a v1 resource. In this case, it should not contain a status, as the user-applied v1alpha1 resource did not contain a status (TODO verify this).
1. Orchestratord gets notified of the new v1 resource, which contains the old status not yet updated after the applied v1alpha1 resource. This means the `status.lastCompletedRolloutHash` and `status.requestedRolloutHash` still match each other, but do not match the calculated hash.
1. Orchestratord reconciles like normal, calculating a new `status.requestedRolloutHash` and triggering a rollout since it is different.

If the user had instead applied a v1alpha2 resource instead, no conversion would be needed and orchestratord would reconcile it directly.
If the user had applied a v1 resource instead, no conversion would be needed and orchestratord would reconcile it directly.
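
The rollout decision in this walkthrough can be sketched as a hash comparison. The sketch below makes several assumptions: `SpecSketch` is a hypothetical stand-in for the real spec, the hash is computed with the standard library's `DefaultHasher` (the document does not say which hash function orchestratord uses), and a rollout is treated as needed whenever the calculated hash differs from `status.lastCompletedRolloutHash`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the fields of the spec that affect the hash.
#[derive(Hash)]
struct SpecSketch {
    environmentd_image_ref: String,
}

/// Assumed hashing scheme, for illustration only: hash the spec with the
/// standard library's default hasher and render the result as hex.
fn calculate_rollout_hash(spec: &SpecSketch) -> String {
    let mut hasher = DefaultHasher::new();
    spec.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Assumed rule: a rollout is needed when the freshly calculated hash does
/// not match the last completed one (a missing hash never matches).
fn rollout_needed(calculated_hash: &str, last_completed_rollout_hash: Option<&str>) -> bool {
    last_completed_rollout_hash != Some(calculated_hash)
}
```

Under these assumptions, re-applying an unchanged spec yields the same hash and no rollout, while changing `environmentd_image_ref` yields a different hash and triggers one.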

###### Existing v1alpha1 resource is mid-upgrade, but not promoting
1. There is a stored v1alpha1 Materialize resource, actively rolling out, with `status.lastCompletedRolloutRequest` and `spec.requestRollout` not matching. It is not in "promoting" status.
1. Orchestratord gets updated to a version with v1alpha2 support.
1. Orchestratord lists existing v1alpha1 resources on startup, in order to upgrade them to v1alpha2.
1. The API server calls the conversion webhook, which returns a v1alpha2 resource. In this case, it would have `status.lastCompletedRolloutHash` set to `None` and `status.requestedRolloutHash` set to the calculated hash after conversion.
1. Orchestratord calls `replace` to store the resource as v1alpha2.
1. Orchestratord gets notified of the new v1alpha2 resource.
1. Orchestratord gets updated to a version with v1 support.
1. Orchestratord lists existing v1alpha1 resources on startup, in order to upgrade them to v1.
1. The API server calls the conversion webhook, which returns a v1 resource. In this case, it would have `status.lastCompletedRolloutHash` set to `None` and `status.requestedRolloutHash` set to the calculated hash after conversion.
1. Orchestratord calls `replace` to store the resource as v1.
1. Orchestratord gets notified of the new v1 resource.
1. Orchestratord reconciles like normal, continuing the existing rollout and overwriting any objects that are different. This is the same behavior it would have with current orchestratord and v1alpha1.

###### Existing v1alpha1 resource is mid-upgrade and already promoting
1. There is a stored v1alpha1 Materialize resource, actively rolling out, with `status.lastCompletedRolloutRequest` and `spec.requestRollout` not matching. It is in "promoting" status.
1. Orchestratord gets updated to a version with v1alpha2 support.
1. Orchestratord lists existing v1alpha1 resources on startup, in order to upgrade them to v1alpha2.
1. The API server calls the conversion webhook, which returns a v1alpha2 resource. In this case, it would have `status.lastCompletedRolloutHash` set to `None` and `status.requestedRolloutHash` set to the calculated hash after conversion.
1. Orchestratord calls `replace` to store the resource as v1alpha2.
1. Orchestratord gets notified of the new v1alpha2 resource.
1. Orchestratord gets updated to a version with v1 support.
1. Orchestratord lists existing v1alpha1 resources on startup, in order to upgrade them to v1.
1. The API server calls the conversion webhook, which returns a v1 resource. In this case, it would have `status.lastCompletedRolloutHash` set to `None` and `status.requestedRolloutHash` set to the calculated hash after conversion.
1. Orchestratord calls `replace` to store the resource as v1.
1. Orchestratord gets notified of the new v1 resource.
1. Orchestratord reconciles like normal. Critically, it unconditionally continues with promotion rather than overwriting any objects.
1. After promotion is successful, the updated status triggers a new rollout. (TODO verify that this works if we have a `status.requestedRolloutHash` set in the initial conversion)
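
The status conversion shared by the three examples above might look like the following sketch. The struct and function names are hypothetical, and the sketch covers only the two hash fields as the examples describe them; it does not model the "promoting" handling or the open TODOs.

```rust
/// Hypothetical subset of the v1 status fields discussed in the examples.
#[derive(Debug, PartialEq)]
struct V1RolloutStatus {
    last_completed_rollout_hash: Option<String>,
    requested_rollout_hash: Option<String>,
}

/// Sketch of the v1alpha1 -> v1 status conversion described above.
/// `request_rollout` is `spec.requestRollout` and
/// `last_completed_rollout_request` is `status.lastCompletedRolloutRequest`,
/// both taken from the v1alpha1 resource in textual form.
fn convert_rollout_status(
    request_rollout: &str,
    last_completed_rollout_request: &str,
    calculated_hash: &str,
) -> V1RolloutStatus {
    // In v1alpha1, a rollout is in progress when these two UUIDs differ.
    let rollout_in_progress = request_rollout != last_completed_rollout_request;
    V1RolloutStatus {
        // Only a finished rollout carries a completed hash into v1.
        last_completed_rollout_hash: if rollout_in_progress {
            None
        } else {
            Some(calculated_hash.to_string())
        },
        requested_rollout_hash: Some(calculated_hash.to_string()),
    }
}
```

With matching UUIDs both hashes are set to the calculated hash, so orchestratord finds nothing to do; with differing UUIDs `lastCompletedRolloutHash` is `None`, so the existing rollout continues.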

Expand All @@ -216,8 +216,8 @@ Orchestratord will also get readiness probes so nothing tries to call this webho
### 5. CRD Registration

The CRD is registered with:
- Both v1alpha1 and v1alpha2 versions
- v1alpha2 as the stored version
- Both v1alpha1 and v1 versions
- v1 as the stored version
- Webhook conversion configuration pointing to the operator service

```rust
Expand All @@ -241,7 +241,7 @@ mz_crd.spec.conversion = Some(CustomResourceConversion {

### 6. Replace all Materialize resources to update their stored versions

We have set v1alpha2 as the stored version, but that doesn't update existing resources. Those are only updated when they are reapplied.
We have set v1 as the stored version, but that doesn't update existing resources. Those are only updated when they are reapplied.

During orchestratord startup, after waiting for the CRD to be established, we need to loop through all Materialize resources and `replace` them.

Expand All @@ -250,22 +250,22 @@ If it is possible to determine the stored version of these resources, we should
I think it is OK for this to be best-effort, and only warn in case of failure.
For backward compatibility reasons, we're going to have to support the old version for some time.
Orchestratord is likely to get restarted/upgraded multiple times in that period, so it can try again.
If the user ever writes an updated CR, it will also be stored in v1alpha2, so it isn't critical that this work immediately.
If the user ever writes an updated CR, it will also be stored in v1, so it isn't critical that this work immediately.
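
A best-effort pass of this kind could be structured as in the sketch below. It is deliberately independent of any Kubernetes client: the `replace` closure is a hypothetical stand-in for whatever call orchestratord uses to replace a resource, and resources are identified by name only.

```rust
/// Hypothetical best-effort pass: try to `replace` every resource, warn on
/// failure, and return the names that failed so a later startup can retry.
fn replace_all_best_effort<F>(names: &[&str], mut replace: F) -> Vec<String>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut failed = Vec::new();
    for &name in names {
        // A failure is logged but never aborts the loop or the startup.
        if let Err(err) = replace(name) {
            eprintln!("warning: failed to replace Materialize resource {name}: {err}");
            failed.push(name.to_string());
        }
    }
    failed
}
```

Returning the failed names instead of an error keeps startup from blocking on a single resource, which matches the best-effort intent described above.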

## Known testing required

Our existing nightly orchestratord tests cover a lot, but we'll need to extend them to work with multiple CRD versions.

- Upgrades from existing v1alpha1 environments by applying v1alpha1 CR. (this is basically what we have now, but we need to not break it with the orchestratord changes to reconcile v1alpha2 after conversion)
- Upgrades from existing v1alpha1 environments by applying v1alpha2 CR.
- Upgrades from existing v1alpha2 environments by applying v1alpha1 CR.
- Upgrades from existing v1alpha2 environments by applying v1alpha2 CR.
- Upgrades from existing v1alpha1 environments by applying v1alpha1 CR. (this is basically what we have now, but we need to not break it with the orchestratord changes to reconcile v1 after conversion)
- Upgrades from existing v1alpha1 environments by applying v1 CR.
- Upgrades from existing v1 environments by applying v1alpha1 CR.
- Upgrades from existing v1 environments by applying v1 CR.
- Upgrade from existing v1alpha1 environment that is mid-rollout not in "promoting" status.
- Upgrade from existing v1alpha1 environment that is mid-rollout in "promoting" status.
- Upgrades with a previous rollout already in progress.
- Upgrades triggered by annotation.
- Deploy of latest Materialize image versions using v1alpha2 CR.
- Deploy of older Materialize image versions using v1alpha2 CR.
- Deploy of latest Materialize image versions using v1 CR.
- Deploy of older Materialize image versions using v1 CR.

## Minimal Viable Prototype

Expand Down
4 changes: 2 additions & 2 deletions doc/user/content/releases/_index.md
Original file line number Diff line number Diff line change
Expand Up @@ -1042,7 +1042,7 @@ v26.4.0 introduces several performance improvements and bugfixes.
- **Up to 3x faster hydration times for large PostgreSQL tables**: We've reduced the overhead incurred by communication between multiple *workers* on a large cluster. We've observed up to 3x throughput improvement when ingesting 1 TB PostgreSQL tables on large clusters.
- **More efficient source ingestion batching**: Sources now batch writes more effectively. This can result in improved freshness and lower resource utilization, especially when a source is doing a large number of writes.
- **CloudSQL HA failover support** (<red>*Materialize Self-Managed only*</red>): Materialize Self-Managed now offers better support for handling failovers in CloudSQL HA sources, without downtime. [Contact our support team](/support/) to enable this in your environment.
- **Manual Promotion** (<red>*Materialize Self-Managed only*</red>): [Rollout strategies](/self-managed-deployments/upgrading/#rollout-strategies) allow you control how Materialize transitions from the current generation to a new generation during an upgrade. We've added a new rollout strategy called `ManuallyPromote` which allows you to choose when to promote the new generation. This means that you can minimize the impact of potential downtime.
- **Manual Promotion** (<red>*Materialize Self-Managed only*</red>): [Rollout strategies](/self-managed-deployments/upgrading/materialize-instances/v1/#rollout-strategies) allow you to control how Materialize transitions from the current generation to a new generation during an upgrade. We've added a new rollout strategy called `ManuallyPromote` which allows you to choose when to promote the new generation. This means that you can minimize the impact of potential downtime.

### Bug Fixes {#v26.4-bug-fixes}
- Fixed timestamp determination logic to handle empty read holds correctly.
Expand Down Expand Up @@ -1211,7 +1211,7 @@ use the new setting `rolloutStrategy` to specify either:
- `WaitUntilReady` (*Default*)
- `ImmediatelyPromoteCausingDowntime`

For more information, see [`rolloutStrategy`](/self-managed-deployments/upgrading/#rollout-strategies).
For more information, see [`rolloutStrategy`](/self-managed-deployments/upgrading/materialize-instances/v1/#rollout-strategies).

### Terraform helpers

Expand Down