Fix FUSE mount race: poll for readiness #258

Merged: ejc3 merged 2 commits into main from fix-fuse-mount-race on Feb 6, 2026

ejc3 (Owner) commented Feb 6, 2026

Summary

Replace the fixed 500ms sleep after FUSE mount initialization with a poll loop. The loop waits up to 30s for each mount to become accessible via read_dir before starting the container, and returns an error if a mount is still not ready (previously the code silently continued with broken mounts).
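
For illustration, the polling approach described above might look roughly like the sketch below. This is not the code from this PR: the function name `wait_for_mount`, its parameters, and the choice of `TimedOut` as the error kind are all assumptions made for the example. It treats a mount as ready once `std::fs::read_dir` succeeds on the mount point.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: polls `mount_point` until it can be listed,
/// giving up after `max_attempts` failed checks.
fn wait_for_mount(mount_point: &Path, max_attempts: u32, interval: Duration) -> io::Result<()> {
    for attempt in 1..=max_attempts {
        // Assumption: the mount counts as ready once read_dir succeeds.
        if fs::read_dir(mount_point).is_ok() {
            return Ok(());
        }
        // Only sleep when another check will follow.
        if attempt < max_attempts {
            thread::sleep(interval);
        }
    }
    // Surface the failure instead of continuing with a broken mount.
    Err(io::Error::new(
        io::ErrorKind::TimedOut,
        format!(
            "mount {} not ready after {} checks",
            mount_point.display(),
            max_attempts
        ),
    ))
}
```

A caller would invoke something like `wait_for_mount(path, 60, Duration::from_millis(500))?` once per mount before launching the container, so a mount that never comes up aborts the start rather than producing a dangling entrypoint.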

Problem

The entrypoint symlink was dangling when the container launched because the FUSE mount wasn't ready after only 500ms.

Test plan

make test-root FILTER=localhost
make test-root FILTER=sanity_rootless

EJ Campbell and others added 2 commits February 5, 2026 23:30
The 500ms sleep wasn't enough for large images or slow hosts. Replace
with a poll loop that waits up to 30s for each FUSE mount to become
accessible via read_dir before starting the container.
- Return error when mount not ready after 30s (was silently continuing)
- Fix elapsed time calculation: (attempt - 1) * 500 instead of attempt * 500
- Ensures containers don't start with inaccessible mounts
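
To make the elapsed-time fix concrete, here is a small sketch of the corrected formula. The helper name `elapsed_ms` is hypothetical, and the sketch assumes a 1-based attempt counter with one sleep after each failed check, so the first check runs with no time elapsed.

```rust
/// Hypothetical helper: milliseconds elapsed when check number `attempt`
/// (1-based) runs, assuming one `interval_ms` sleep after each failed check.
fn elapsed_ms(attempt: u64, interval_ms: u64) -> u64 {
    // The first check happens before any sleep, hence (attempt - 1).
    // saturating_sub keeps an out-of-range attempt of 0 from underflowing.
    attempt.saturating_sub(1) * interval_ms
}
```

With a 500ms interval, a mount that is ready on attempt 3 has waited 1000ms; the earlier `attempt * 500` form would have reported 1500ms.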

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
claude-claude Bot (Contributor) commented Feb 6, 2026

🔍 Claude Review

SEVERITY: low

Findings

[LOW] Unnecessary sleep on final timeout iteration - When the poll loop reaches its 60th iteration and the mount is still not ready, the code sleeps for 500ms (line 1712) before exiting the loop and returning an error (line 1715). This wastes 500ms before reporting the timeout. Consider adding a condition to skip the sleep when attempt == 60:

if !ready && attempt < 60 {
    std::thread::sleep(std::time::Duration::from_millis(500));
}

[LOW] Documentation mismatch - The comment on line 1696 and error message on line 1716 state "up to 30s" timeout, but the actual implementation waits up to 29.5s (59 sleeps × 500ms = 29,500ms). With for attempt in 1..=60, the checks happen at: 0ms, 500ms, 1000ms, ..., 29,500ms. To match the "30s" documentation, either:

  • Change the loop to 1..=61 to get 60 sleeps = 30s total
  • Update the documentation to say "up to 29.5s"
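
The arithmetic behind both findings can be checked with a short sketch. The function `timeout_profile` is hypothetical and only models the loop shape described in this review (a check, then a 500ms sleep, repeated per attempt); it is not taken from the PR.

```rust
/// Hypothetical model of the poll loop's timing. Returns
/// (offset of the last check, total time spent before the timeout error),
/// both in milliseconds.
fn timeout_profile(attempts: u64, interval_ms: u64, skip_final_sleep: bool) -> (u64, u64) {
    // Checks run at 0, interval, 2 * interval, ..., so the last one is at (attempts - 1) * interval.
    let last_check = attempts.saturating_sub(1) * interval_ms;
    // Without the guard, every failed check is followed by a sleep, including the last.
    let sleeps = if skip_final_sleep {
        attempts.saturating_sub(1)
    } else {
        attempts
    };
    (last_check, sleeps * interval_ms)
}
```

Under these assumptions, `1..=60` without the guard gives a last check at 29,500ms and an error at 30,000ms, while `1..=61` with the guard gives a last check at 30,000ms and an error immediately afterwards.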

Summary

The core improvement (replacing fixed 500ms sleep with proper polling + error handling) is excellent and addresses the race condition effectively. The issues found are minor optimizations and documentation consistency improvements.



claude-claude Bot (Contributor) commented Feb 6, 2026

🔧 Auto-Fix Created

I found minor issues and created a fix PR: #261

The fix PR:

  • Corrects the timeout duration to exactly 30s (was 29.5s)
  • Eliminates unnecessary 500ms sleep before timeout error

Please review and merge the fix PR first, then this PR.


ejc3 merged commit 7720314 into main on Feb 6, 2026
16 checks passed
ejc3 added 2 commits that referenced this pull request on Mar 2, 2026, both titled "Fix FUSE mount race: poll for readiness"