📋 Prerequisites
🎯 Affected Service(s)
Controller Service
🚦 Impact/Severity
Blocker
🐛 Bug Description
When an Agent is configured with SSH-based git skills (skills.gitAuthSecretRef pointing to a kubernetes.io/ssh-auth Secret), the skills-init init container crashes immediately and the Agent pod never becomes ready. SSH git authentication is completely non-functional.
🔄 Steps To Reproduce
Create a kubernetes.io/ssh-auth Secret with a valid SSH private key:
apiVersion: v1
kind: Secret
type: kubernetes.io/ssh-auth
metadata:
  name: my-agent-git-auth
  namespace: kagent
data:
  ssh-privatekey: <base64-encoded private key>
Configure an Agent with SSH gitRefs and gitAuthSecretRef:
spec:
  skills:
    gitAuthSecretRef:
      name: my-agent-git-auth
    gitRefs:
      - url: git@github.com:my-org/my-repo.git
        ref: main
        path: .agents/skills/my-skill
        name: my-skill
Observe the skills-init init container status; it enters CrashLoopBackOff.
Check init container logs: kubectl -n kagent logs <agent-pod-name> -c skills-init
🤔 Expected Behavior
The skills-init container adds the git host to ~/.ssh/known_hosts via ssh-keyscan, then clones the repository successfully using the provided SSH key.
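For reference, the expected flow above needs the SSH host from each gitRefs URL before ssh-keyscan can run. The following is only a sketch of that step, not the code in skills-init.sh.tmpl; the helper name ssh_host_of is hypothetical, and it assumes URLs are either scp-like (user@host:path) or ssh:// URLs.

```shell
#!/bin/sh
# ssh_host_of: hypothetical helper, not taken from skills-init.sh.tmpl.
# Prints the host part of an SSH git URL, or returns 1 for non-SSH URLs.
ssh_host_of() {
  url=$1
  case "$url" in
    http://*|https://*)
      return 1                 # token-based auth, no known_hosts entry needed
      ;;
    ssh://*)
      rest=${url#ssh://}       # drop the scheme
      rest=${rest%%/*}         # keep only user@host:port
      rest=${rest##*@}         # drop the optional user@
      echo "${rest%%:*}"       # drop the optional :port
      ;;
    *@*:*)
      rest=${url#*@}           # scp-like form: user@host:path
      echo "${rest%%:*}"
      ;;
    *)
      return 1
      ;;
  esac
}

host=$(ssh_host_of "git@github.com:my-org/my-repo.git")
echo "$host"   # prints github.com
# The step that fails in this report would then be roughly:
#   ssh-keyscan "$host" >> ~/.ssh/known_hosts
```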
📱 Actual Behavior
The init container exits immediately with:
/skills-init.sh: ssh-keyscan: not found
The pod enters CrashLoopBackOff. The Agent is never reconciled.
💻 Environment
- OS and version: Linux (Kubernetes node)
- Kubernetes version: 1.32
- Kubernetes provider: RKE2
- Application version: v0.9.0
- skills-init image: cr.kagent.dev/kagent-dev/kagent/skills-init:0.9.0
🔧 CLI Bug Report
N/A
🔍 Additional Context
Root cause: The generated skills-init.sh (from skills-init.sh.tmpl) runs under set -e. The SSH key branch calls ssh-keyscan as a standalone binary to populate ~/.ssh/known_hosts. The skills-init image does not have openssh-client installed, so ssh-keyscan is not present. The set -e flag causes immediate exit.
PR #1529 changed the template to derive SSH hosts dynamically from gitRefs URLs instead of hardcoding github.com gitlab.com bitbucket.org, but made no changes to the skills-init Dockerfile. The binary was never in the image; this bug predates and survives that PR.
Note: The HTTPS token path (elif [ -f "${_auth_mount}/token" ]) never calls ssh-keyscan and works correctly. Only SSH auth is broken.
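The failure mode described in the root cause can be reproduced outside the cluster. This is a minimal sketch, not the real script: the inner shell stands in for the SSH branch of skills-init.sh, and definitely-missing-keyscan is a made-up name standing in for the absent ssh-keyscan binary.

```shell
#!/bin/sh
# Hypothetical stand-in for the SSH branch of skills-init.sh.
run_ssh_branch() {
  # Use a child shell so that set -e aborts only the sketch, not the caller.
  sh -c '
    set -e
    # "$1" plays the role of ssh-keyscan; a missing binary yields status 127.
    "$1" github.com > /dev/null
    # Never reached when the command above is missing.
    echo "clone step reached"
  ' sh "$1"
}

out=$(run_ssh_branch definitely-missing-keyscan 2>/dev/null) && status=0 || status=$?
echo "exit status: $status, output: '$out'"
```

The non-zero status and empty output match what the init container shows: the script stops at the missing binary and the clone is never attempted.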
Suggested fix: Add openssh-client to the skills-init Dockerfile:
RUN apk add --no-cache git openssh-client
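Independently of the Dockerfile change, a preflight check could make this kind of gap fail with a clearer message. The sketch below is an assumption about how such a check might look; require_tools is a hypothetical helper and is not part of the current template.

```shell
#!/bin/sh
# require_tools: hypothetical preflight helper, not part of skills-init.sh.tmpl.
# Returns 1 and names every missing tool instead of failing on first use.
require_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "missing required tools:$missing" >&2
    return 1
  fi
}

# Example: report anything missing on this machine without aborting.
require_tools git ssh ssh-keyscan || echo "preflight failed" >&2
```

Something like this could also serve as a smoke test against the rebuilt image, assuming the image ships a POSIX shell.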
📋 Logs
/skills-init.sh: ssh-keyscan: not found
📷 Screenshots
No response
🙋 Are you willing to contribute?