[Bug]: RHEL 9 driver installer fails when using RHEL 9 EUS kernel #793

@hasueki

Important Note: NVIDIA AI Enterprise customers can get support from NVIDIA Enterprise support. Please open a case here.

Describe the bug
When RHEL 9 worker nodes run EUS kernels, nvidia-driver-daemonset/nvidia-driver-ctr fails to install the following packages:

  • kernel-headers
  • kernel-devel

Example failure:

Installing Linux kernel headers...
+ echo 'Installing Linux kernel headers...'
+ dnf -q -y --releasever=9.6 install kernel-headers-5.14.0-570.112.1.el9_6.x86_64 kernel-devel-5.14.0-570.112.1.el9_6.x86_64
Error: Unable to find a match: kernel-headers-5.14.0-570.112.1.el9_6.x86_64 kernel-devel-5.14.0-570.112.1.el9_6.x86_64

I believe this is because on RHEL 9 the above packages reside in the AppStream RPM repos, but the driver container does not enable the rhel-9-for-x86_64-appstream-eus-rpms repo.

Only the rhel-9-for-x86_64-baseos-eus-rpms repo gets enabled:

dnf config-manager --set-enabled rhel-9-for-$DRIVER_ARCH-baseos-eus-rpms || true
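
A possible fix, sketched under the assumption that enabling the AppStream EUS repo alongside BaseOS is enough to make the packages resolvable. The helper names `eus_repo_ids` and `enable_eus_repos` are hypothetical and are not taken from the driver container's scripts. The repo IDs follow the naming shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the RHEL 9 EUS repo IDs that may be needed
# for kernel packages on the given architecture (one ID per line).
eus_repo_ids() {
    arch="$1"
    for channel in baseos appstream; do
        echo "rhel-9-for-${arch}-${channel}-eus-rpms"
    done
}

# Hypothetical helper: enable each repo, tolerating failures the same way
# the existing baseos line does with "|| true".
enable_eus_repos() {
    for repo in $(eus_repo_ids "$1"); do
        dnf config-manager --set-enabled "$repo" || true
    done
}

# Show which repos would be enabled; this does not call dnf.
eus_repo_ids "${DRIVER_ARCH:-x86_64}"
```

Inside the driver container this would be invoked as `enable_eus_repos "$DRIVER_ARCH"` before the kernel header install step. Whether the node's entitlement actually grants access to the AppStream EUS repo is not verified here.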

To Reproduce
Install the NVIDIA GPU Operator and driver on RHEL 9 worker nodes running an EUS kernel.

sh-5.1# uname -r
5.14.0-570.112.1.el9_6.x86_64
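
To check whether a node is affected, the minor release and architecture can be read off the kernel release string. The `el9_6` tag here corresponds to the `--releasever=9.6` seen in the failing command. This is a minimal sketch with hypothetical helper names, not the logic the driver container uses.

```shell
#!/bin/sh
# Hypothetical helper: derive "9.6" from a kernel release such as
# 5.14.0-570.112.1.el9_6.x86_64. Prints nothing if there is no elN_M tag.
kernel_releasever() {
    printf '%s\n' "$1" | sed -n 's/.*\.el\([0-9][0-9]*\)_\([0-9][0-9]*\)\..*/\1.\2/p'
}

# Hypothetical helper: the architecture is the final dot-separated field.
kernel_arch() {
    printf '%s\n' "${1##*.}"
}

kernel="5.14.0-570.112.1.el9_6.x86_64"
echo "releasever=$(kernel_releasever "$kernel") arch=$(kernel_arch "$kernel")"
```

On a live node, `kernel="$(uname -r)"` can replace the hard-coded value.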

Expected behavior
The NVIDIA GPU Operator's driver installer container should install all necessary packages successfully.

Environment (please provide the following information):

  • gpu-driver-container source (Commit SHA or image digest): Any
  • NVIDIA Driver Version: Any
  • Host OS: RHEL 9
  • Kernel Version: 5.14.0-570.112.1.el9_6.x86_64 (or any RHEL 9 EUS kernel)
  • Container Runtime Version: Any
  • CPU Architecture: x86_64
  • GPU Model(s): Any

If applicable, also provide:

  • Kubernetes Distro and Version: OpenShift
  • NVIDIA GPU Operator version: Any

Information to attach (optional if deemed irrelevant)

  • Output of nvidia-smi
  • Container logs
  • Kernel logs (dmesg)
  • Driver install/build output

Labels

bug, help wanted
