K3s worker restarter uses k3s.service instead of k3s-agent.service on real k3s clusters #620

@ernoaapa

Summary

On real k3s worker nodes using systemd, runtime-class-manager v0.1.0 still tries to restart k3s.service instead of k3s-agent.service.

This is not a custom service name from Hetzner tooling (I'm using hetzner-k3s). It is the standard upstream k3s naming for agent nodes.

Environment

  • runtime-class-manager: v0.1.0
  • k3s: v1.32.0+k3s1
  • cluster provisioning: hetzner-k3s on Hetzner Cloud
  • node type affected: worker / agent nodes

Observed behavior

Worker install pods fail after writing the shim and containerd config because the restarter targets k3s.service on worker nodes.

Example log from a worker install pod:

2026/03/12 05:10:44 INFO shim installed shim=spin-v2 path=/opt/rcm/bin/containerd-shim-spin-v2 new-version=true
2026/03/12 05:10:44 INFO runtime config already exists, skipping runtime=spin-v2
2026/03/12 05:10:44 INFO shim configured shim=spin-v2 path=/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
2026/03/12 05:10:44 INFO D-Bus is already installed and running
2026/03/12 05:10:44 INFO restarting containerd
2026/03/12 05:10:44 ERROR failed to install error="failed to restart containerd: unable to restart k3s: exit status 5"

A retry pod then completed with:

2026/03/12 05:10:56 INFO shim installed shim=spin-v2 path=/opt/rcm/bin/containerd-shim-spin-v2 new-version=false
2026/03/12 05:10:56 INFO runtime config already exists, skipping runtime=spin-v2
2026/03/12 05:10:56 INFO shim configured shim=spin-v2 path=/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
2026/03/12 05:10:56 INFO nothing changed, nothing more to do

So the cluster converges eventually, but only because the retry finds nothing changed and skips the restart step. The first worker install job still fails when it should not.

Why this looks like an RCM bug, not a cluster naming bug

The released v0.1.0 code still hardcodes systemctl restart k3s in K3sRestarter:
https://github.com/spinframework/runtime-class-manager/blob/v0.1.0/internal/containerd/restart_unix.go

Upstream k3s uses k3s.service for servers and k3s-agent.service for agents/workers:
https://github.com/k3s-io/k3s/blob/master/install.sh

I also checked hetzner-k3s, and its worker install logs show the normal k3s-agent.service name, for example here:
vitobotta/hetzner-k3s#487

Expected behavior

On real k3s worker nodes, the restarter should detect and restart k3s-agent instead of k3s.
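
To illustrate the expected behavior, here is a minimal sketch of how the unit could be picked at runtime. It is not the actual RCM code: `unitProbe`, `systemctlIsActive`, and `detectK3sUnit` are hypothetical names, and it assumes `systemctl` can be called directly, which may not match how the node installer reaches the host's systemd. The only systemd call it relies on is `systemctl is-active --quiet <unit>`, which exits 0 when the unit is active.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// unitProbe reports whether a systemd unit is currently active.
// (Hypothetical type, introduced so the detection logic can be tested
// without a real systemd.)
type unitProbe func(unit string) bool

// systemctlIsActive runs `systemctl is-active --quiet <unit>`.
// The command exits 0 only when the unit is active, so a nil error
// from Run means "active". Assumption: systemctl is directly callable.
func systemctlIsActive(unit string) bool {
	return exec.Command("systemctl", "is-active", "--quiet", unit).Run() == nil
}

// detectK3sUnit returns the k3s unit that should be restarted on this node.
// Upstream k3s runs agents/workers as k3s-agent.service and servers as
// k3s.service, so both names are probed instead of hardcoding "k3s".
func detectK3sUnit(isActive unitProbe) (string, error) {
	for _, unit := range []string{"k3s-agent", "k3s"} {
		if isActive(unit) {
			return unit, nil
		}
	}
	return "", errors.New("neither k3s-agent nor k3s is active")
}

func main() {
	unit, err := detectK3sUnit(systemctlIsActive)
	if err != nil {
		fmt.Println("detect:", err)
		return
	}
	// A real restarter would run `systemctl restart <unit>` here;
	// this sketch only prints what it would do.
	fmt.Println("would run: systemctl restart", unit)
}
```

Passing the probe as a function keeps the selection logic independent of systemd, so it can be unit-tested with a fake probe for the server, agent, and "neither" cases.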

Related

This looks adjacent to the existing restart / node-installer issues, but seems distinct from k3d-specific handling.
