Skip to content

TQ: Support for ZFS Key Rotation#9737

Open
plaidfinch wants to merge 9 commits intomainfrom
tq-zfs-key-rotation
Open

TQ: Support for ZFS Key Rotation#9737
plaidfinch wants to merge 9 commits intomainfrom
tq-zfs-key-rotation

Conversation

@plaidfinch
Copy link
Contributor

@plaidfinch plaidfinch commented Jan 28, 2026

When Trust Quorum commits a new epoch, all U.2 crypt datasets must have their encryption keys rotated to use the new epoch's derived key. This change implements the key rotation flow triggered by epoch commits.

Trust Quorum Integration

  • Add watch channel to NodeTaskHandle for epoch change notifications
  • Initialize channel with current committed epoch on startup
  • Notify subscribers via send_if_modified() when epoch changes

Config Reconciler Integration

  • Accept committed_epoch_rx watch channel from trust quorum
  • Trigger reconciliation when epoch changes
  • Track per-disk encryption epoch in ExternalDisks
  • Add rekey_for_epoch() to coordinate key rotation:
    • Filter disks needing rekey (cached epoch < target OR unknown)
    • Derive keys for each disk via StorageKeyRequester
    • Send batch request to dataset task
    • Update cached epochs on success
    • Retry on failure via normal reconciliation retry logic

Dataset Task Changes

  • Add RekeyRequest/RekeyResult types for batch rekey operations
  • Add datasets_rekey() with idempotency check (skip if already at target)
  • Use Zfs::change_key() for atomic key + epoch property update

ZFS Utilities

  • Add Zfs::change_key() using zfs_atomic_change_key crate
  • Add Zfs::load_key(), unload_key(), dataset_exists()
  • Add epoch field to DatasetProperties
  • Add structured error types for key operations

Crash Recovery

  • Add trial decryption recovery in sled-storage for datasets with missing epoch property (e.g., crash during initial creation)
  • Unload key before each trial attempt to handle crash-after-load-key
  • Set epoch property after successful recovery

@andrewjstone andrewjstone mentioned this pull request Jan 31, 2026
48 tasks
andrewjstone added a commit that referenced this pull request Jan 31, 2026
This commit adds a 3 phase mechanism for sled expungement.

The first phase is to remove the sled from the latest trust quorum
configuration via omdb. The second phase is to reboot the sled after
polling for commit the trust quorum removal. The third phase is to
issue the existing omdb expunge command, which changes the sled policy
as before.

The first and second phases remove the need to physically remove the
sled before expungement. They act as a software mechanism that gates the
sled-agent from restarting on the sled and doing work when it should be
treated as "absent". We've discussed this numerous times in the update
huddle and it is finally arriving!

The third phase is what informs reconfigurator that the sled is gone
and remains the same except for an extra sanity check that that the last
committed trust quorum configuration does not contain the sled that is
to be expunged.

The removed sled may be added back to this rack or another after being
clean slated. I tested this by deleting the files in the internal
"cluster" and "config" directories and rebooting the removed sled in
a4x2 and it worked.

This PR is marked draft because it changes the current
sled-expunge pathway to depend on real trust quorum. We
cannot safely merge it in until the key-rotation work from
#9737 is merged in.

This also builds on #9741 and should merge after that PR.
When Trust Quorum commits a new epoch, all U.2 crypt datasets must have
their encryption keys rotated to use the new epoch's derived key. This
change implements the key rotation flow triggered by epoch commits.

## Trust Quorum Integration

- Add watch channel to `NodeTaskHandle` for epoch change notifications
- Initialize channel with current committed epoch on startup
- Notify subscribers via `send_if_modified()` when epoch changes

## Config Reconciler Integration

- Accept `committed_epoch_rx` watch channel from trust quorum
- Trigger reconciliation when epoch changes
- Track per-disk encryption epoch in `ExternalDisks`
- Add `rekey_for_epoch()` to coordinate key rotation:
  - Filter disks needing rekey (cached epoch < target OR unknown)
  - Derive keys for each disk via `StorageKeyRequester`
  - Send batch request to dataset task
  - Update cached epochs on success
  - Retry on failure via normal reconciliation retry logic

## Dataset Task Changes

- Add `RekeyRequest`/`RekeyResult` types for batch rekey operations
- Add `datasets_rekey()` with idempotency check (skip if already at target)
- Use `Zfs::change_key()` for atomic key + epoch property update

## ZFS Utilities

- Add `Zfs::change_key()` using `zfs_atomic_change_key` crate
- Add `Zfs::load_key()`, `unload_key()`, `dataset_exists()`
- Add `epoch` field to `DatasetProperties`
- Add structured error types for key operations

## Crash Recovery

- Add trial decryption recovery in `sled-storage` for datasets with
  missing epoch property (e.g., crash during initial creation)
- Unload key before each trial attempt to handle crash-after-load-key
- Set epoch property after successful recovery

## Safety Properties

- Atomic: Key and epoch property set together via `zfs_atomic_change_key`
- Idempotent: Skip rekey if dataset already at target epoch
- Crash-safe: Epoch read from ZFS on restart rebuilds cache correctly
- Conservative: Unknown epochs (None) trigger rekey attempt
@plaidfinch plaidfinch linked an issue Feb 4, 2026 that may be closed by this pull request
@plaidfinch plaidfinch marked this pull request as ready for review February 4, 2026 03:11
Create a new key-manager-types crate containing the disk encryption key
types (Aes256GcmDiskEncryptionKey and VersionedAes256GcmDiskEncryptionKey)
that were previously defined in key-manager. This breaks the dependency
from illumos-utils to key-manager, allowing illumos-utils to depend only
on the minimal types crate.

The key-manager crate re-exports VersionedAes256GcmDiskEncryptionKey for
backwards compatibility.
Copy link
Contributor

@andrewjstone andrewjstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fantastic @plaidfinch.

We'll need to test this and possibly coordinate merging it in case it relies on trust quorum being enabled / lrtq upgrade, etc...

@andrewjstone
Copy link
Contributor

We tested this on a4x2 with TQ enabled and keys were created correctly during RSS. We then added a sled and rotation completed correctly. We're going to relaunch with trust quorum disabled and see if we can upgrade out of lrtq successfully and then add a sled in our next experiment.

@andrewjstone
Copy link
Contributor

Woohoo. Tested on hardware. LRTQ upgrade succeeded. This required a small patch, which we should pull into this branch: 494d49a

I think it's probably good to go in once the comments are resolved and this patch is pulled in. Will re-assess when fresh tomorrow.

andrewjstone and others added 4 commits February 5, 2026 16:34
- Format ZFS_GET_PROPS const with concat! and clarify epoch field docs
- Preserve error source chain with anyhow::Error::from instead of formatting
- Convert KeyRotationError from enum to struct (single variant)
- Log current and new epochs in key rotation success and failure paths
- Change rekey_for_epoch to return ReconciliationResult instead of bool
- Add error log for unexpected epoch-ahead-of-target condition
- Simplify epoch filter using Option<Epoch> ordering
- Move dataset_name allocation into Ok branch to minimize scope
- Upgrade all-key-derivations-failed log from info to warn
- Inline dataset_exists helper, calling Zfs::dataset_exists directly
- Log warning on best-effort unload_key failure
The mark_unchanged() call before the reconciler loop was a no-op:
borrow_and_update() inside do_reconcilation() already reads and
processes the current epoch unconditionally, marking it as seen.
Also remove a stale comment about copying epoch out of the Ref,
which was informational and no longer adds clarity.
@andrewjstone
Copy link
Contributor

andrewjstone commented Feb 6, 2026

Out of an abundance of caution, we'll hold off merging this until the R18 branch has been cut. I don't think there's any real risk here, but it also isn't necessary to merge. We'll be testing the latest changes today.

@andrewjstone
Copy link
Contributor

I was able to add, expunge, add with a4x2 and it all worked. We ended up at epoch 4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TQ: Support for ZFS Key Rotation

4 participants