
uffd: record warmup faults and prefetch them on later forks#218

Draft
sjmiller609 wants to merge 1 commit into hypeship/uffd-page-server from hypeship/uffd-prefetch-hotpages

@sjmiller609
Collaborator

Stacked on: #216 (uffd page server). Review #213, #214, and #216 first.

Summary

  • Adds HotPage / HotPageList types with sort+dedup snapshot, atomic Save, and LoadHotPageList (binary varint format with a HPL1 magic).
  • New Config.RecordHotPages flag turns on per-fault recording in the page-fault loop.
  • New Server.Prefetch(forkID, list) issues UFFDIO_COPY for every entry in a hot-page list against the fork's userfaultfd before the guest unpauses.
  • The prefetcher is installed by the platform listener once the uffd has been received and registered; EEXIST/EAGAIN are tolerated to absorb first-touch races with vCPUs.

Why

Even with the shared mem-file + UFFD page server, a fresh fork still pays a fault round-trip on every page the guest needs to boot: tens of thousands of round-trips on the critical path. Recording the hot set during a template's first warmup fork and prefetching it on every later fork eliminates those round-trips for the recorded pages. Template.HotPagesPath (reserved in PR 2) finally has a producer and a consumer.

Test plan

  • go test ./lib/uffd/... (covers HotPageList sort/dedup/save/load + bad-magic + truncation)
  • Manual: warm a template with RecordHotPages: true, save the list, fork without prefetch and time boot; fork with prefetch and time boot; confirm the second is faster
  • Manual: prefetch a corrupted/wrong-region list and confirm it fails with a clean error rather than undefined behavior

🤖 Generated with Claude Code

Adds a hot-page recorder + prefetch primitive on top of the userfaultfd
page server. During a template's first warmup fork the server can
record every served page (Config.RecordHotPages); the resulting
HotPageList is stable-sorted, deduplicated, and saved to disk in a
small binary format alongside the template. Later forks call
Server.Prefetch(forkID, list) to issue UFFDIO_COPY for every recorded
page against their userfaultfd before the guest unpauses, eliminating
the fault round-trips on those addresses.

The prefetcher is installed by the platform-specific listener once the
fork's uffd has been received and registered, so callers can race
Prefetch and the fault loop safely. EEXIST/EAGAIN are tolerated, as in
the fault handler, to absorb first-touch races with vCPUs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>