firecracker: serve fork mem-file via per-template uffd page server #219

Draft
sjmiller609 wants to merge 4 commits into hypeship/uffd-prefetch-hotpages from hypeship/uffd-firecracker-wiring
@sjmiller609

Summary

  • Wires firecracker's snapshot restore through the userfaultfd page server (PR #216, "uffd: serve firecracker page faults from a shared template mem-file") when the fork descends from a template. Such forks now restore from a per-template uffd.Server instead of mmapping the mem-file directly.
  • Adds a per-template tracker on the manager: it lazily starts one uffd.Server per template on the first fork acquire and tears it down once the last fork is released. Non-template forks (the symlink-only path from PR #214, "fork: share template mem-file via symlink for fan-out forks") are unaffected.
  • Threads UffdSocketPath through ForkPrepareRequest → firecracker restoreMetadata → snapshotLoadParams.MemBackend ({ backend_type: "Uffd", backend_path: <UDS> }), with mem_file_path cleared when a backend is set.
  • Releases the uffd registration on DeleteInstance for forks (best-effort: an error is logged as a warning and does not fail the delete).
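For reviewers who want the backend selection spelled out, here is a minimal sketch of the rule in the third bullet: when a uffd socket path is present, the load params carry a mem_backend and leave mem_file_path empty. The struct and function names are hypothetical stand-ins, not the types in this PR; only the JSON keys backend_type, backend_path, and mem_file_path come from the summary above, and snapshot_path and the example paths are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// memBackend mirrors the { backend_type, backend_path } object from the summary.
type memBackend struct {
	BackendType string `json:"backend_type"`
	BackendPath string `json:"backend_path"`
}

// snapshotLoadParams is a hypothetical stand-in for the PR's type of the same
// name. The snapshot_path key is an assumption for illustration.
type snapshotLoadParams struct {
	SnapshotPath string      `json:"snapshot_path"`
	MemFilePath  string      `json:"mem_file_path,omitempty"`
	MemBackend   *memBackend `json:"mem_backend,omitempty"`
}

// buildLoadParams picks exactly one memory source: the uffd backend when a
// socket path is given, otherwise the plain mem-file path.
func buildLoadParams(snapshotPath, memFilePath, uffdSocketPath string) snapshotLoadParams {
	p := snapshotLoadParams{SnapshotPath: snapshotPath}
	if uffdSocketPath != "" {
		// mem_file_path stays empty so the two sources never coexist.
		p.MemBackend = &memBackend{BackendType: "Uffd", BackendPath: uffdSocketPath}
		return p
	}
	p.MemFilePath = memFilePath
	return p
}

func main() {
	// Example paths are made up for the demo.
	withUffd := buildLoadParams("/snap/state", "/snap/memory", "/tmp/h-uffd/tpl/fork.uffd")
	out, err := json.Marshal(withUffd)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Using a pointer with omitempty keeps the mem_backend key out of the payload entirely for non-template forks, which matches the "unaffected" claim for the symlink-only path.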

Test plan

  • go vet ./... clean
  • go build ./... clean
  • new unit tests pass: go test ./lib/instances/... -run TestUffdTracker -count=1
  • CI integration test TestFirecrackerForkFromTemplate boots a firecracker source, puts it in standby, promotes it to a template, forks from the template, and asserts:
    • fork reaches Running
    • fork mem-file is a symlink to the template's mem-file
    • template ForkCount bumps to 1
    • per-template uffd tracker registers the fork
    • delete drops ForkCount and detaches from uffd
  • firecracker config_test.go updated for uffd backend assertions

Stacked on #218.

🤖 Generated with Claude Code

sjmiller609 and others added 4 commits May 8, 2026 15:09
Wires firecracker's snapshot restore through a userfaultfd-backed page
server when the fork descends from a template. Each template gets one
uffd.Server lazily started on the first fork and torn down once the
last fork is released, so non-template forks (the symlink-only path)
are unaffected.

Adds an E2E test that boots a firecracker source, standbys it,
promotes it to a template, forks against the template, and asserts:
fork reaches Running, mem-file is a symlink to the template's, refcount
bumps to 1, the per-template uffd tracker registers the fork, and on
fork delete the refcount drops and the tracker detaches.

Also adds unit coverage for the uffd tracker lifecycle (lazy start,
multi-fork share, last-release teardown, closeAll, empty-acquire
rejection).
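
The lifecycle those unit tests cover can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: pageServer stands in for uffd.Server, and the tracker, acquire, release, and closeAll names and signatures are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// pageServer is a hypothetical stand-in for uffd.Server.
type pageServer interface{ Close() error }

type entry struct {
	srv   pageServer
	forks map[string]struct{}
}

// tracker keeps at most one page server per template, refcounted by fork ID.
type tracker struct {
	mu      sync.Mutex
	start   func(templateID string) (pageServer, error)
	servers map[string]*entry
}

func newTracker(start func(string) (pageServer, error)) *tracker {
	return &tracker{start: start, servers: map[string]*entry{}}
}

// acquire registers a fork, starting the template's server on first use.
func (t *tracker) acquire(templateID, forkID string) error {
	if templateID == "" || forkID == "" {
		return errors.New("uffd tracker: empty template or fork ID")
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	e, ok := t.servers[templateID]
	if !ok {
		srv, err := t.start(templateID)
		if err != nil {
			return err
		}
		e = &entry{srv: srv, forks: map[string]struct{}{}}
		t.servers[templateID] = e
	}
	e.forks[forkID] = struct{}{}
	return nil
}

// release drops a fork and closes the server once no forks remain.
func (t *tracker) release(templateID, forkID string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	e, ok := t.servers[templateID]
	if !ok {
		return nil
	}
	delete(e.forks, forkID)
	if len(e.forks) > 0 {
		return nil
	}
	delete(t.servers, templateID)
	return e.srv.Close()
}

// closeAll shuts down every server, e.g. on manager shutdown.
func (t *tracker) closeAll() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	var errs []error
	for id, e := range t.servers {
		errs = append(errs, e.srv.Close())
		delete(t.servers, id)
	}
	return errors.Join(errs...)
}

// fakeServer counts Close calls so the demo can show when teardown happens.
type fakeServer struct{ closed *int }

func (f fakeServer) Close() error { *f.closed++; return nil }

func main() {
	starts, closes := 0, 0
	tr := newTracker(func(string) (pageServer, error) {
		starts++
		return fakeServer{closed: &closes}, nil
	})
	_ = tr.acquire("tpl", "fork-a")
	_ = tr.acquire("tpl", "fork-b")
	_ = tr.release("tpl", "fork-a")
	fmt.Println("starts:", starts, "closes after first release:", closes)
	_ = tr.release("tpl", "fork-b")
	fmt.Println("closes after last release:", closes)
}
```

Holding one mutex across start and Close keeps the sketch simple; a real tracker might prefer per-template locking if starting a server is slow.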

Unix domain socket paths are capped at 108 bytes (sun_path) on Linux. Putting
the per-fork socket under
<DataDir>/templates/<25-char-cuid>/uffd/<25-char-cuid>.uffd blew that limit on
CI runners where t.TempDir() returns long prefixes, surfacing as
"bind: invalid argument" in TestFirecrackerForkFromTemplate.

Sockets are ephemeral, so anchoring them at /tmp/h-uffd/<templateID>/
keeps the path well under the limit regardless of how deep DataDir is.
The tracker now also rm -rfs the per-template socket dir on the last
release so we don't leak stale entries.
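
A rough sketch of the path arithmetic, assuming the socket file keeps the <forkID>.uffd name under the new root (the commit message only states the directory). The helper name and the explicit length check are hypothetical additions for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sunPathMax is the size of sockaddr_un.sun_path on Linux, including the
// trailing NUL, so a usable path must be strictly shorter.
const sunPathMax = 108

// socketRoot is the short, fixed anchor from the commit message.
const socketRoot = "/tmp/h-uffd"

// socketPath builds the per-fork socket path and rejects anything that would
// not fit in sun_path. The "<forkID>.uffd" file name is an assumption.
func socketPath(templateID, forkID string) (string, error) {
	p := filepath.Join(socketRoot, templateID, forkID+".uffd")
	if len(p) >= sunPathMax {
		return "", fmt.Errorf("socket path %q is %d bytes, limit is %d", p, len(p), sunPathMax-1)
	}
	return p, nil
}

func main() {
	// Two made-up 25-character IDs, matching the cuid length in the message.
	tpl, fork := "tz4a98xxat96iws9zmbrgj3a1", "pfh0haxfpzowht3oi213cqos1"
	p, err := socketPath(tpl, fork)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(p), p)
}
```

With two 25-character IDs the path depends only on the fixed root, so the depth of DataDir no longer matters.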

The previous fork-mem-file symlink looped through withSnapshotSourceDirAlias
during firecracker restore: fork/.../memory -> source/.../memory, but the
alias dance temporarily symlinks source dir -> fork dir, so resolution
ping-pongs back to fork/.../memory and trips ELOOP. Switching to a hardlink
makes the fork's mem-file resolve by inode so the temporary directory alias
no longer affects it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Firecracker enables snapshot base reuse, which renames the post-restore
snapshot dir from snapshot-latest to snapshot-base. The hardlink survives
the rename (same inode), so the test just needs the right path.