fix(shim): do not abort Create on guest rootfs choice failure #768

Open
WHOIM1205 wants to merge 1 commit into urunc-dev:main from WHOIM1205:fix/shim-create-rootfs-no-abort

WHOIM1205 commented Jun 16, 2026

Description

taskService.Create() performs guest rootfs selection after the underlying task has already been successfully created by the inner containerd task service.

If chooseGuestRootfs() returns a non-skip error at that point, the shim currently aborts Create and returns an error to containerd even though the task has already been created, registered, and its reexec init process is running. This leaves containerd believing task creation failed while the runtime still has a live task, resulting in an inconsistent lifecycle state.

The guest rootfs selection performed during Create is only a pre-computation optimization. The same rootfs choice is already recomputed later during Exec when the annotation is absent. Treating failures in this best-effort step as fatal can therefore orphan partially-created containers, leak resources, and prevent successful reconciliation.
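The Exec-time fallback described above can be sketched roughly as follows. This is only an illustration of the "use the annotation if present, otherwise recompute" idea, not the actual urunc code: the annotation key `guestRootfsAnnotation`, the helper `resolveGuestRootfs`, and the `choose` callback (standing in for `chooseGuestRootfs()`) are all hypothetical names.

```go
package main

// guestRootfsAnnotation is a hypothetical annotation key used only for this
// sketch; the real key used by the runtime may differ.
const guestRootfsAnnotation = "example.invalid/guest-rootfs"

// resolveGuestRootfs returns the pre-computed choice when Create managed to
// record it, and otherwise recomputes it by calling choose, which stands in
// for chooseGuestRootfs().
func resolveGuestRootfs(annotations map[string]string, choose func() (string, error)) (string, error) {
	if v, ok := annotations[guestRootfsAnnotation]; ok && v != "" {
		// Create already stored a choice, so no recomputation is needed.
		return v, nil
	}
	// The annotation is absent (for example because the pre-computation in
	// Create failed), so the choice is made now.
	return choose()
}
```

Because the lookup falls back to recomputation, a missing annotation costs an extra selection at Exec time rather than a failed container.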

This change makes guest rootfs selection best-effort after task creation:

  • Continue returning success when chooseGuestRootfs() fails after a successful inner Create
  • Log guest rootfs selection failures for visibility
  • Preserve the existing skip behavior for unsupported workloads
  • Allow the runtime to recompute rootfs selection during Exec as designed

This restores a consistent Create lifecycle: once the inner task has been committed, Create no longer fails because of a recoverable rootfs pre-computation error.
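A minimal sketch of the resulting control flow, under stated assumptions: `createTask` stands in for `taskService.Create()`, `inner` for the wrapped containerd task service call, and `choose` for `chooseGuestRootfs()`. The sentinel errors `errSkipGuestRootfs` and `errSelectionFailed` are hypothetical and exist only to make the example self-contained; the real code may signal the skip case differently.

```go
package main

import (
	"errors"
	"log"
)

// Hypothetical sentinels for this sketch only.
var (
	errSkipGuestRootfs = errors.New("guest rootfs selection skipped")
	errSelectionFailed = errors.New("guest rootfs selection failed")
)

// createTask models the fixed Create path: once inner succeeds, a failure in
// the rootfs pre-computation is logged but no longer returned to the caller.
func createTask(inner func() error, choose func() (string, error)) (string, error) {
	if err := inner(); err != nil {
		// Nothing was created, so failing here is still correct.
		return "", err
	}

	rootfs, err := choose()
	switch {
	case err == nil:
		return rootfs, nil
	case errors.Is(err, errSkipGuestRootfs):
		// Existing behavior: unsupported workloads are skipped silently.
		return "", nil
	default:
		// Best-effort: the task already exists, so report success and let
		// Exec recompute the choice later.
		log.Printf("guest rootfs selection failed after Create, deferring to Exec: %v", err)
		return "", nil
	}
}
```

An empty rootfs string here simply means that no choice was recorded during Create.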

Related issues

  • Fixes orphaned unikontainer init processes caused by post-create guest rootfs selection failures
  • Fixes task lifecycle inconsistencies where containerd reports Create failure after the task has already been created
  • Prevents resource leakage from unreconciled tasks created before guest rootfs pre-computation errors

How was this tested?

Verified by code inspection and local validation.

  • Confirmed chooseGuestRootfs() executes after s.TaskService.Create() successfully creates the task
  • Verified non-skip guest rootfs selection errors previously caused Create to return failure after task creation had already been committed
  • Confirmed runtime Exec already recomputes guest rootfs selection when the annotation is absent
  • Verified Create now succeeds and preserves task lifecycle consistency when guest rootfs pre-computation fails
  • Confirmed existing successful guest rootfs selection paths are unchanged
  • Ran build and validation checks successfully

LLM usage

N/A

Checklist

  • I have read the contribution guide
  • The project builds successfully after the change
  • The fix is limited to post-create guest rootfs error handling
  • No functional changes for successful guest rootfs selection paths
  • Runtime rootfs recomputation behavior remains unchanged

#767

Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
netlify Bot commented Jun 16, 2026

Deploy Preview for urunc canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | 43b272e |
| 🔍 Latest deploy log | https://app.netlify.com/projects/urunc/deploys/6a31a5654572380008b03ef2 |

WHOIM1205 (Author) commented

Hi @cmainas

This PR addresses a lifecycle inconsistency in taskService.Create() where a guest rootfs pre-computation failure could cause Create to return an error after the underlying task had already been successfully created.

The change treats post-create guest rootfs selection failures as best-effort, allowing the runtime to recompute the rootfs choice during Exec while keeping task state consistent with containerd.

Please let me know if you'd prefer the rollback-on-error approach instead of the current log-and-continue behavior. Thanks for taking a look!
