fix: add pre-flight check for oras binary before ACR login #7908
Open
Lyqed wants to merge 1 commit into Azure:main
Without this check, a missing `oras` binary causes CSE to exit with ERR_ORAS_PULL_NETWORK_TIMEOUT (211), a misleading code that points engineers at networking rather than the actual problem.

What:
- Add ERR_ORAS_BINARY_NOT_FOUND=232 error code to cse_helpers.sh
- Add pre-flight check at the start of oras_login_with_kubelet_identity that verifies `oras` is present in PATH before doing any ACR work

Why:
- AzureLinux V3 image 202601.27.0 shipped without the oras binary, causing all Karpenter-provisioned nodes to fail CSE with exit 211. The new check emits clear diagnostic output (PATH, known install paths, OS info, rpm/dpkg package list) and returns the unambiguous ERR_ORAS_BINARY_NOT_FOUND code so operators know immediately what went wrong.

How:
- `command -v oras` is checked first; on failure, probe /usr/local/bin/oras, /usr/bin/oras, /opt/bin/oras and log OS info and installed packages via rpm (AzureLinux/Mariner) or dpkg (Ubuntu) before returning ERR_ORAS_BINARY_NOT_FOUND

Fixes Azure#7907
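The flow described under "How" can be sketched roughly as follows. This is an illustration, not the code from this PR: the helper name `check_oras_present` and the exact log wording are assumptions, and the PR places equivalent logic inline at the start of `oras_login_with_kubelet_identity`.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and messages below are assumptions.
ERR_ORAS_BINARY_NOT_FOUND=232

# check_oras_present is a hypothetical helper name used for this sketch.
# Returns 0 if oras is on PATH, otherwise logs diagnostics and returns 232.
check_oras_present() {
    local oras_path

    # Fast path: oras resolves via PATH, nothing else to do.
    if command -v oras >/dev/null 2>&1; then
        return 0
    fi

    echo "oras binary not found in PATH: ${PATH}"

    # Probe the known install locations so the log shows whether the
    # binary is missing entirely or merely not on PATH.
    for oras_path in /usr/local/bin/oras /usr/bin/oras /opt/bin/oras; do
        if [ -x "${oras_path}" ]; then
            echo "found executable at ${oras_path} (not on PATH)"
        else
            echo "not present: ${oras_path}"
        fi
    done

    # OS info helps identify which image shipped without the binary.
    if [ -r /etc/os-release ]; then
        cat /etc/os-release
    fi

    # Package query: rpm on AzureLinux/Mariner, dpkg on Ubuntu.
    if command -v rpm >/dev/null 2>&1; then
        rpm -qa | grep -i oras || echo "no oras rpm package installed"
    elif command -v dpkg >/dev/null 2>&1; then
        dpkg -l | grep -i oras || echo "no oras deb package installed"
    fi

    return "${ERR_ORAS_BINARY_NOT_FOUND}"
}
```

A caller would typically write `check_oras_present || return $?` so that 232 propagates unchanged instead of being masked by a later, less specific error.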
Summary
Closes #7907
AzureLinux V3 image 202601.27.0 shipped without the `oras` binary. When `oras_login_with_kubelet_identity` runs on such a node, every call to `oras` fails silently and error propagation eventually surfaces as exit code 211 (`ERR_ORAS_PULL_NETWORK_TIMEOUT`), a misleading code that sends operators chasing network/IMDS issues instead of the real problem: the binary is simply absent.

This PR adds a pre-flight guard at the top of `oras_login_with_kubelet_identity` that:
- Runs `command -v oras` before doing any ACR work.
- On failure, logs `$PATH`, probes the three canonical install locations (`/usr/local/bin/oras`, `/usr/bin/oras`, `/opt/bin/oras`), dumps `/etc/os-release`, and queries `rpm` (AzureLinux/Mariner) or `dpkg` (Ubuntu) for installed oras packages.
- Returns the new `ERR_ORAS_BINARY_NOT_FOUND=232` error code so the failure is instantly understandable from CSE logs.

Changes
- `parts/linux/cloud-init/artifacts/cse_helpers.sh`: add `ERR_ORAS_BINARY_NOT_FOUND=232`; add pre-flight check in `oras_login_with_kubelet_identity`
- `pkg/agent/testdata/**/CustomData`: regenerated with `make generate` (snapshot test data embeds the script)

Test plan
- `make generate`: shellcheck passes on `cse_helpers.sh`; Go snapshot tests regenerated
- `make test`: all unit tests pass
- Node without `oras`: confirm CSE exits with code 232 and the diagnostic block appears in logs
- Node with `oras` present: confirm no regression (pre-flight check passes and login succeeds)

Notes for reviewers
- The check sits at the top of the function, next to the `client_id`/`tenant_id` guard and before any `oras` call, so it catches the missing-binary case regardless of ACR anonymity or identity configuration.
- The diagnostic path uses only `command -v` (POSIX), `rpm` (AzureLinux/Mariner), and `dpkg` (Ubuntu), with no bashisms; the local variable `oras_path` is declared with `local` per shell script guidelines.
- Code 232 follows `ERR_IMDS_FETCH_FAILED` in the numeric sequence and does not collide with any existing code.

🤖 Generated with Claude Code
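For reviewers who want to confirm the no-collision note themselves, one possible check is sketched below. It is not part of this PR. The function name `find_duplicate_error_codes` is hypothetical, and the sketch assumes error codes are declared one per line as `ERR_NAME=NUMBER` at the start of the line.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR.
# Assumes codes are declared as ERR_NAME=NUMBER at the start of a line.
# Prints every numeric value that is assigned to more than one ERR_* name.
find_duplicate_error_codes() {
    local file=$1
    grep -Eo '^ERR_[A-Z0-9_]+=[0-9]+' "${file}" \
        | cut -d= -f2 \
        | sort -n \
        | uniq -d
}
```

Running `find_duplicate_error_codes parts/linux/cloud-init/artifacts/cse_helpers.sh` should print nothing if every code is unique; any number it prints is shared by at least two `ERR_*` names.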