fix(shim): do not abort Create on guest rootfs choice failure #768
Open
WHOIM1205 wants to merge 1 commit into
Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
Hi @cmainas This PR addresses a lifecycle inconsistency in The change treats post-create guest rootfs selection failures as best-effort, allowing the runtime to recompute the rootfs choice during Exec while keeping task state consistent with containerd. Please let me know if you'd prefer the rollback-on-error approach instead of the current log-and-continue behavior. Thanks for taking a look!
Description
taskService.Create() performs guest rootfs selection after the underlying task has already been successfully created by the inner containerd task service.

If chooseGuestRootfs() returns a non-skip error at that point, the shim currently aborts Create and returns an error to containerd even though the task has already been created, registered, and its reexec init process is running. This leaves containerd believing task creation failed while the runtime still has a live task, resulting in an inconsistent lifecycle state.

The guest rootfs selection performed during Create is only a pre-computation optimization. The same rootfs choice is already recomputed later during Exec when the annotation is absent. Treating failures in this best-effort step as fatal can therefore orphan partially-created containers, leak resources, and prevent successful reconciliation.
This change makes guest rootfs selection best-effort after task creation:
chooseGuestRootfs() fails after a successful inner Create

This restores a consistent Create lifecycle: once the inner task has been committed, Create no longer fails because of a recoverable rootfs pre-computation error.
Related issues
How was this tested?
Verified by code inspection and local validation.
chooseGuestRootfs() executes after s.TaskService.Create() successfully creates the task
LLM usage
N/A
Checklist
#767