bump patchset to v52 #153

Draft
phip1611 wants to merge 176 commits into cyberus-technology:gardenlinux-next-v52-base from phip1611:gardenlinux-next-v52

@phip1611
Member

@phip1611 phip1611 commented Apr 30, 2026

This series bumps the gardenlinux Cloud Hypervisor patchset onto the current
base (soon to be released as v52).

You can find an overview of the difficulties during the rebase in this outline document (trivial patches, hard to rebase patches, patches that are now upstream...).

From the 248 commits in the current gardenlinux branch, we are now down to ~158 (once TLS is merged upstream). I expect the v52 release to happen very soon.

Changes & Hints for Reviewers

  • The commits that remain have the same names as in the old gardenlinux branch
  • I reordered the patchset quite significantly: small standalone commits are mostly moved to the beginning where it makes sense, followed by larger series
  • All commits belonging to a series were consolidated, moved together, and sometimes even squashed (init A -> ... -> fix A commits were squashed)
  • For example, the whole CPU Profiles effort is now a single commit series at the end of our patchset
  • This was by far the toughest patchset rebase we had so far
  • Beware: I am unfortunately fairly sure that I missed some minor changes from our gardenlinux branch during the rebase, for example an error message improvement here or there, but nothing major. This is a consequence of how complex the operation was.
  • Changes I had to make on top of upstream to work with our stack:
    • rename pci_device_id from upstream back to device_id to be compatible with us
    • remove mutual TLS (mTLS) (use normal TLS)
  • libvirt pipeline run: https://gitlab.cyberus-technology.de/cyberus/cloud/libvirt/-/merge_requests/194/pipelines

The result is a shorter and more reviewable branch than
cyberus-github/gardenlinux while preserving the relevant Gardenlinux behavior
on top of the current Cloud Hypervisor base.

Ticket: https://github.com/cobaltcore-dev/cobaltcore/issues/503#issuecomment-4311454443

@phip1611 phip1611 self-assigned this Apr 30, 2026
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from bc2452a to 1a41fef Compare April 30, 2026 09:21
@phip1611
Member Author

@olivereanderson please take a brief look. I grouped all your commits and brought them into consecutive order. Once cloud-hypervisor#8029 is merged - what are the implications for our fork? What is your recommendation to keep the patchset working and maintainable? What are your thoughts and ideas?

@olivereanderson

> @olivereanderson please take a brief look. I grouped all your commits and brought them into consecutive order. Once cloud-hypervisor#8029 is merged - what are the implications for our fork? What is your recommendation to keep the patchset working and maintainable? What are your thoughts and ideas?

I plan to backport cloud-hypervisor#8029 as soon as it is merged because the code is simply better.

@phip1611
Member Author

If possible, I'd prefer to not merge (or backport) anything into gardenlinux before we finish this. But we can plan this together next week as well!

@olivereanderson

> If possible, I'd prefer to not merge (or backport) anything into gardenlinux before we finish this. But we can plan this together next week as well!

We can definitely merge this PR (v52) first. Let's discuss further next week 🙂

@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch 7 times, most recently from 768a632 to 7ddbe2c Compare May 5, 2026 06:17
@phip1611
Member Author

phip1611 commented May 5, 2026

Normal libvirt-tests (default suite) are already passing.

@phip1611
Member Author

phip1611 commented May 5, 2026

TODO: I totally missed the GARP changes somehow.

@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch 5 times, most recently from 892081b to c5d2ed1 Compare May 12, 2026 08:45
@phip1611
Member Author

All normal tests and the live migration tests are passing locally! 🥳 Pipeline is running! https://gitlab.cyberus-technology.de/cyberus/cloud/libvirt/-/merge_requests/194/pipelines

@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from c5d2ed1 to e676786 Compare May 12, 2026 09:11
weltling and others added 8 commits May 12, 2026 13:53
When a guest resets a device by writing status=0 and reinitializes
without enabling queues before writing DRIVER_OK, the activation
path would collect zero ready queues and treat that as a fatal
error, killing the entire VMM process.

The PCI transport now checks that at least one queue is ready
before reporting that the device needs activation. This prevents
a spurious activation attempt that would fatally fail when no
queues are enabled.

Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
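
The guard described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual Cloud Hypervisor code: `Queue` and `needs_activation` are hypothetical names, and only the DRIVER_OK status bit value is taken from the virtio specification.

```rust
/// virtio device status bit that the driver sets once it is fully set up
/// (value from the virtio specification).
const DEVICE_STATUS_DRIVER_OK: u8 = 0x04;

/// Hypothetical, stripped-down queue state: only the `ready` flag matters here.
#[derive(Clone, Copy)]
pub struct Queue {
    pub ready: bool,
}

/// Returns true only if the driver wrote DRIVER_OK, the device has not been
/// activated yet, and at least one queue is enabled. A guest that resets the
/// device and writes DRIVER_OK without enabling any queue therefore never
/// triggers an activation attempt.
pub fn needs_activation(status: u8, activated: bool, queues: &[Queue]) -> bool {
    let driver_ok = (status & DEVICE_STATUS_DRIVER_OK) != 0;
    let any_queue_ready = queues.iter().any(|q| q.ready);
    !activated && driver_ok && any_queue_ready
}
```

With this shape, the "zero ready queues" case is handled by not reporting the device as needing activation at all, so the activation path never has to treat it as a fatal error.
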
User-defined zones may be mapped private. Create a memfd for private
zones so that fallocate operations are available on all regions, not
just shared ones. This prepares for zone management via hole punching.

The MAP_ANONYMOUS flag is now omitted since the memory becomes
tmpfs-backed via memfd.

Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
The trait is not used and thus can be removed.

On-behalf-of: SAP sebastian.eydam@sap.com
Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
ReadVolatile already provides a default read_volatile_exact()
implementation, and WriteVolatile a default write_volatile_exact()
implementation. Overriding these functions adds no behavioral value, but
duplicates logic and needs to be updated whenever SocketStream gains or
changes a variant.

On-behalf-of: SAP sebastian.eydam@sap.com
Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
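
To illustrate why the overrides are redundant, here is a small sketch of the same pattern with a hypothetical trait. `ReadChunk`, `read_chunk_exact`, and the `Stream` enum are stand-ins invented for this example; they are not the vm-memory `ReadVolatile` API, and the variants wrap in-memory cursors instead of sockets.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical stand-in for a trait with one required method and a default
/// "exact" method built on top of it.
pub trait ReadChunk {
    fn read_chunk(&mut self, buf: &mut [u8]) -> io::Result<usize>;

    /// Default implementation: keep reading until `buf` is full, and fail
    /// with `UnexpectedEof` if the source runs dry first.
    fn read_chunk_exact(&mut self, mut buf: &mut [u8]) -> io::Result<()> {
        while !buf.is_empty() {
            match self.read_chunk(buf)? {
                0 => return Err(io::ErrorKind::UnexpectedEof.into()),
                n => {
                    let rest = std::mem::take(&mut buf);
                    buf = &mut rest[n..];
                }
            }
        }
        Ok(())
    }
}

/// Hypothetical stream enum with two variants.
pub enum Stream {
    Plain(Cursor<Vec<u8>>),
    Wrapped(Cursor<Vec<u8>>),
}

impl ReadChunk for Stream {
    // Only the required method is implemented. `read_chunk_exact` comes from
    // the trait, so a new variant needs changes in one place only.
    fn read_chunk(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self {
            Stream::Plain(inner) | Stream::Wrapped(inner) => inner.read(buf),
        }
    }
}
```

Adding a third variant to `Stream` would only require extending the `match` in `read_chunk`; the exact-read logic stays untouched.
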
TLS connections have a TLS server (listens for incoming connections) and
a TLS client (initiates the connection). This commit adds the code for
the client side, which is the sender of a migration.

On-behalf-of: SAP sebastian.eydam@sap.com
Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Code for the TLS server, i.e. the receiver of a live migration.

On-behalf-of: SAP sebastian.eydam@sap.com
Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Teach the migration transport to handle TLS-backed streams alongside
plain TCP and UNIX sockets.

Introduce a Tls variant in SocketStream and implement the necessary
traits.

Also updates the local-migration error path to reject any non-UNIX
transport, which now includes TLS-wrapped TCP connections.

On-behalf-of: SAP sebastian.eydam@sap.com
Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
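
The local-migration rule from this commit message can be sketched as below. The names `Transport`, `LocalMigrationError`, and `check_local_migration` are assumptions made for this example; the real `SocketStream` wraps actual stream types and is not reproduced here.

```rust
use std::fmt;

/// Hypothetical transport tag mirroring the three kinds of streams mentioned
/// above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport {
    Unix,
    Tcp,
    /// TLS layered on top of a TCP connection.
    Tls,
}

/// Error returned when a local migration is attempted over a non-UNIX transport.
#[derive(Debug, PartialEq, Eq)]
pub struct LocalMigrationError(pub Transport);

impl fmt::Display for LocalMigrationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "local migration requires a UNIX socket, got {:?}", self.0)
    }
}

/// Accepts only UNIX sockets. Matching on "everything else" means a newly
/// added variant such as `Tls` is rejected without further changes.
pub fn check_local_migration(transport: Transport) -> Result<(), LocalMigrationError> {
    match transport {
        Transport::Unix => Ok(()),
        other => Err(LocalMigrationError(other)),
    }
}
```

Rejecting "anything that is not UNIX" instead of "TCP specifically" is what lets the TLS variant fall into the error path automatically.
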
Extend ReceiveListener with a TLS-backed listener variant for migration
receivers.

Store the TCP listener together with the server TLS configuration, wrap
accepted sockets in TlsStream::new_server(), and preserve the existing
listener cloning and fd polling behavior so receive-side migration code
can treat TLS listeners like the existing TCP and UNIX cases.

On-behalf-of: SAP sebastian.eydam@sap.com
Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>
Due to the changes to IA32_ARCH_CAPABILITIES applied after the last
code review we introduce stricter checks.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
This list will be used to help us detect unknown MSRs when generating
CPU profiles. It serves no other purpose beyond that.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
TODO: Squash into previous commit if this all works as expected

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We include a list of non-architectural MSRs. This list will only be
used to help the CPU profile generation tool rule out MSRs that it
does not know how to handle.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We include a list of MSRs defined by KVM that may be approved by
CPU profiles and another list of those that may not be approved by
CPU profiles. These lists will later be used by the CPU profile
generation tool.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
The list of Hyper-V MSRs introduced here will be utilized during CPU
profile generation and also at runtime to filter them out whenever
`kvm_hyperv` is set to `false`.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
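
A rough sketch of the runtime filtering idea, assuming a flat list of MSR indices. The two Hyper-V indices are only an illustrative subset (the real list is much longer), and `permitted_msrs` is a hypothetical helper, not a function from the patchset.

```rust
/// Two Hyper-V synthetic MSRs, used here as an illustrative subset.
const HV_X64_MSR_GUEST_OS_ID: u32 = 0x4000_0000;
const HV_X64_MSR_HYPERCALL: u32 = 0x4000_0001;

const HYPERV_MSRS: &[u32] = &[HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL];

/// Returns the MSRs the guest may use. When `kvm_hyperv` is false, every MSR
/// found in `HYPERV_MSRS` is dropped from the candidate list.
pub fn permitted_msrs(candidates: &[u32], kvm_hyperv: bool) -> Vec<u32> {
    candidates
        .iter()
        .copied()
        .filter(|msr| kvm_hyperv || !HYPERV_MSRS.contains(msr))
        .collect()
}
```

For example, with a candidate list of IA32_EFER (0xC000_0080) and HV_X64_MSR_GUEST_OS_ID, only IA32_EFER survives when `kvm_hyperv` is `false`.
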
We introduce functionality related to computing necessary MSR updates
in accordance with the given CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We introduce functionality to filter out MSRs which we want to deny
guests from using.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We record the necessary MSR-based feature modifications that need to be
set in the `CpuManager` and make sure to set these MSR values upon
vCPU configuration. We also use the Vm to filter access to MSRs that
are incompatible with the chosen CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We adapt the CPU profile generation tool to also take the MSR-based
features into account.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We regenerate the CPU profiles and include the MSR-related data.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Windows Server needs the machine check architecture (MCA) CPUID bit to
be set in order to boot.

Since Windows Server is a use case we want to support, we need to revert
our previous decision to disable MCA for non-host CPU profiles.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We permit these MSRs because they are expected to be available when
the CPUID 0x1.EDX[14] (MCA) feature bit is set. Recall that MCA is
necessary in order to boot Windows Server, which we want to support.

We also do not list the error reporting banks as forbidden any longer.
Aside: The previous implementation did not end up denying those MSRs
anyway, because KVM does not report them via KVM_GET_MSR_INDEX_LIST.
Now with MCA explicitly set, the guest will certainly expect the
presence of error reporting banks, so we make sure not to indicate
otherwise.

Recall that by default KVM reports all (32) error banks as available
and leaves all feature bits of IA32_MCG_CAP unset, hence the
information displayed to the guest in these MSRs will remain consistent
before and after a live migration in the absence of machine check
errors.

Note that as of today Cloud Hypervisor does not transfer the error
reporting banks to the destination of a live migration, which can indeed
lead to surprises. On the other hand, the information is likely to
be inaccurate at the point of resume anyway.

As a follow-up, we could try to mitigate the aforementioned problem
by checking for MCEs during live migration and marking the migration
as failed if any MCE occurred before or during the live migration.
That should, however, be addressed in a separate PR.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
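
The relationship between the MCA CPUID bit, IA32_MCG_CAP, and the error reporting bank MSRs can be sketched as follows. The helper names are hypothetical; the bit position, the bank count field (bits 7:0 of IA32_MCG_CAP), and the bank MSR layout starting at 0x400 follow the Intel SDM.

```rust
use std::ops::Range;

/// Bit position of the MCA feature flag in CPUID leaf 0x1, register EDX.
const CPUID_1_EDX_MCA_BIT: u32 = 14;
/// First machine-check bank MSR (IA32_MC0_CTL). Each bank occupies four
/// consecutive MSRs: CTL, STATUS, ADDR, MISC.
const IA32_MC0_CTL: u32 = 0x400;
const MSRS_PER_BANK: u32 = 4;

/// True if the MCA feature bit is set in CPUID 0x1.EDX.
pub fn has_mca(cpuid_1_edx: u32) -> bool {
    ((cpuid_1_edx >> CPUID_1_EDX_MCA_BIT) & 1) == 1
}

/// Number of error reporting banks, taken from bits 7:0 of IA32_MCG_CAP.
pub fn bank_count(mcg_cap: u64) -> u32 {
    (mcg_cap & 0xff) as u32
}

/// MSR index range covering all error reporting banks advertised by `mcg_cap`.
pub fn bank_msr_range(mcg_cap: u64) -> Range<u32> {
    IA32_MC0_CTL..IA32_MC0_CTL + MSRS_PER_BANK * bank_count(mcg_cap)
}
```

With the KVM default described above (32 banks, no feature bits set), `bank_msr_range(32)` covers the indices 0x400 up to, but not including, 0x480.
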
Regenerate CPU profiles in order to enable machine check architecture
(MCA) for non-host CPU profiles, which is required to boot Windows
Server.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
These are already displayed as not available to guests via CPUID for
non-host CPU profiles, but we forgot to forbid the corresponding MSRs.

The profiles we have generated are OK with respect to this oversight
because KVM_GET_MSR_INDEX_LIST did not report those MSRs at the time
they were generated, but it does now.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Hardware duty cycling (HDC) does not make sense in the virtualization
setting and should thus not be displayed as available to guests.

We have already disabled certain HDC aspects via CPUID 0x6 ECX[13],
but we forgot to disable the state components which is what we do
in this commit.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We have already disabled architectural LBR (last branch record) for CPU
profiles, but we forgot to disable the corresponding state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Hardware P-states (HWP) is already disabled for non-host CPU profiles,
but we forgot to also disable the associated state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We already disabled Processor Trace (PT) for CPU profiles, but forgot
to disable the associated state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We have already forbidden IA32_PASID, an MSR related to process
address space identifiers (PASID), but we forgot to disable the
associated state components.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
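
The state-component commits above (HDC, LBR, HWP, PT, PASID) all follow the same idea, sketched here as a single mask. The bit positions are assumed from the Intel SDM's numbering of XSAVE supervisor state components and should be double-checked against it; `mask_supervisor_components` is a hypothetical helper.

```rust
// Supervisor state-component bits (assumed from the Intel SDM numbering).
const XSS_PT: u64 = 1 << 8;
const XSS_PASID: u64 = 1 << 10;
const XSS_HDC: u64 = 1 << 13;
const XSS_LBR: u64 = 1 << 15;
const XSS_HWP: u64 = 1 << 16;

/// Components that should not be shown as available to guests that use a
/// non-host CPU profile.
const DISABLED_COMPONENTS: u64 = XSS_PT | XSS_PASID | XSS_HDC | XSS_LBR | XSS_HWP;

/// Clears the disabled components from the host-supported bitmap and leaves
/// all other bits untouched.
pub fn mask_supervisor_components(host_supported: u64) -> u64 {
    host_supported & !DISABLED_COMPONENTS
}
```

Disabling the feature bit alone is not enough: if the state component stays advertised, the guest still sees a save area for a feature it was told does not exist.
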
Bit 56 (VM_ENTRY_HARDWARE_EXCEPTIONS) of IA32_VMX_BASIC is only
set on rather recent KVM versions.

Thus, whenever a CPU profile is generated on a machine with a recent
Linux kernel, the current inherit policy makes the CPU profile
incompatible with deployments running older Linux kernels. This may
not be the intention of the person generating the CPU profile, so
we change the policy to `Static(0)` for the time being.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
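
The difference between an inherit policy and `Static(0)` can be illustrated with a small sketch. `Policy`, `FieldRule`, and `apply` are hypothetical names chosen for this example and only model a single bit field; the actual policy types in the patchset may look different.

```rust
/// Hypothetical per-field policy used when generating a CPU profile.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    /// Take whatever the generating host reports.
    Inherit,
    /// Always use the given value, regardless of the host.
    Static(u64),
}

/// A bit field inside an MSR: `width` bits starting at bit `shift`.
/// `width` must be in 1..=64 and `shift + width` must not exceed 64.
pub struct FieldRule {
    pub shift: u32,
    pub width: u32,
    pub policy: Policy,
}

/// Computes the profile value of an MSR from the host value and one rule.
pub fn apply(host_value: u64, rule: &FieldRule) -> u64 {
    let mask = (u64::MAX >> (64 - rule.width)) << rule.shift;
    match rule.policy {
        Policy::Inherit => host_value,
        Policy::Static(v) => (host_value & !mask) | ((v << rule.shift) & mask),
    }
}
```

With `Static(0)` on bit 56, a host with a recent kernel (bit set) and a host with an older kernel (bit clear) produce the same profile value, which is what keeps the generated profile usable on both.
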
IA32_XSS (Extended Supervisor State Mask) is only reported via
KVM_GET_MSR_INDEX_LIST on rather recent kernels. As a result, CPU
profiles generated on a machine with the latest Linux kernel may not
work on deployments whose hosts run slightly older kernels, which
may be unintentional.

We thus decide to forbid this MSR for now, even though
CPUID 0xd.0x1.EAX[3] can inform the guest that the MSR is available.
We do not want to force the aforementioned feature bit to 0 because
it is also used to report support for XSAVES/XRSTORS.

Although not ideal, we consider denying access to IA32_XSS to be
acceptable because the 0xd CPUID leaves report all IA32_XSS related
state components to be unsupported. There is thus no reason for the
guest to be interested in using this MSR.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We have disabled LBR for non-host CPU profiles, but forgot to also do
so in the VM-Exit and VM-Entry control MSRs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We add developer documentation on how to use the CPU profile generation
tool.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We will later use flate2 in arch/build.rs to compress CPU profile
JSON files at compile time and to decompress them at runtime.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We introduce a build.rs build script in the arch crate which
automatically constructs the x86_64 CpuProfile enum with one variant
per pre-generated CPU profile.

In order to keep the binary size in check, we also take the opportunity
to compress the CPU profile JSON files embedded in the binary; they are
then decompressed at runtime.

We will adapt cpu_profile.rs in the next commit to use the output
of build.rs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
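
The enum-generation half of such a build script might look roughly like this. It is a std-only sketch under several assumptions: the profile file stems (`skylake-server`, `sapphire-rapids`) are invented for illustration, `variant_name` and `generate_enum` are hypothetical helpers, and the flate2 compression step is left out entirely.

```rust
/// Turns a file stem such as "skylake-server" into a variant name such as
/// "SkylakeServer".
pub fn variant_name(stem: &str) -> String {
    stem.split(|c: char| c == '-' || c == '_')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect()
}

/// Generates Rust source for an enum with one variant per profile file stem.
/// A real build script would write this string to a file under `OUT_DIR`.
pub fn generate_enum(stems: &[&str]) -> String {
    let mut out = String::from("pub enum CpuProfile {\n");
    for stem in stems {
        out.push_str(&format!("    {},\n", variant_name(stem)));
    }
    out.push_str("}\n");
    out
}
```

Generating the enum from the files on disk means that adding a new pre-generated profile only requires dropping in the JSON file, with no manual enum edits.
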
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
When we introduced our build script, we forgot to tell `serde` to
(de)serialize the `CpuProfile` enum in kebab-case, which was a breaking
change.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
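
In serde, this naming is normally requested with the `#[serde(rename_all = "kebab-case")]` container attribute. To show what that means for the strings users pass in, here is a hand-rolled, std-only approximation of the conversion. `to_kebab_case` is a hypothetical helper for illustration, not serde's implementation, and the variant name `SkylakeServer` is invented.

```rust
/// Converts a PascalCase variant name into kebab-case by inserting '-' before
/// every uppercase letter except the first and lowercasing everything.
pub fn to_kebab_case(variant: &str) -> String {
    let mut out = String::new();
    for (i, ch) in variant.char_indices() {
        if i > 0 && ch.is_uppercase() {
            out.push('-');
        }
        out.extend(ch.to_lowercase());
    }
    out
}
```

Without the attribute, a variant would be (de)serialized under its Rust name (for example `SkylakeServer`), so existing configurations that use the kebab-case spelling (`skylake-server`) would stop parsing.
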
@phip1611 phip1611 force-pushed the gardenlinux-next-v52 branch from e676786 to d4012bb Compare May 13, 2026 10:55

9 participants