Remove output listener read timeout for large image imports by ejc3 · Pull Request #263 · ejc3/fcvm

ejc3 · 2026-02-06T08:12:48Z

Summary

The 5-minute read timeout on the container output vsock caused the listener to exit during long image imports (10+ min for 26GB images). Container stdout/stderr was lost.

Fix

Remove the read timeout. The listener stays alive until EOF (connection closed by fc-agent) or VM exit. 6 lines added, 15 removed.

Test plan

make test-root FILTER=sanity_rootless
make test-root FILTER=localhost

The 5-minute read timeout on the container output vsock caused the listener to exit during long image imports (10+ min). When the container finally started, its stdout/stderr had nowhere to go. Remove the timeout — the listener stays alive until EOF (connection closed) or the VM exits. The VM exit handler already cleans up.

claude-claude · 2026-02-06T08:39:30Z

🔍 Claude Review

SEVERITY: none

Findings

No issues found. The PR correctly addresses the timeout problem for large image imports.

Analysis:

✅ The 5-minute timeout was too restrictive for large image imports (10+ minutes)
✅ Removing the timeout allows the listener to run until the natural EOF when fc-agent closes the connection
✅ Proper cleanup mechanisms are in place:
- Socket file cleanup at line 885
- Task will be aborted if still running when output_handle drops (line 1433)
- VM exit triggers vsock connection close, causing EOF
✅ Code is simplified and clearer (15 deletions, 6 additions)
✅ SIGTERM/SIGINT handling ensures users can still interrupt long operations

The change is well-justified and follows good async patterns.

Summary

Clean fix for a real problem. No issues identified.

Review by Claude | Run

claude-claude · 2026-02-06T09:00:31Z

🔧 CI Auto-Fix

Created fix PR: #264

Issue: The test was failing because the output listener task was hanging when VMs were killed abruptly during stress tests (100 concurrent VMs). The vsock connection wouldn't cleanly close, causing to block indefinitely.

Fix: Added explicit task abortion during VM cleanup by storing the output listener task handle and aborting it in .

View Claude Run

The removal of the 5-minute read timeout in run_output_listener allowed long-running image imports to complete, but introduced a hang in stress tests where 100 VMs are spawned/killed rapidly. When VMs are killed abruptly, the vsock connection may not cleanly close, causing read_line() to block indefinitely. This prevents the task from terminating naturally. Fix by: 1. Storing the output listener task handle (removing _ prefix) 2. Adding output_listener_handle parameter to cleanup_vm() 3. Explicitly aborting the task when cleanup_vm() is called This ensures the listener is forcefully terminated during cleanup while still allowing indefinite waits for legitimate long-running operations.

claude-claude · 2026-02-06T15:11:09Z

⚠️ CI Check Failed

The Host-Root-arm64-SnapshotDisabled test has failed. Please fix the failing test before I can complete the review.

Failed Check:

Host-Root-arm64-SnapshotDisabled - View logs

Review by Claude | Run

claude-claude · 2026-02-06T15:35:57Z

CI Failure Analysis

I analyzed CI #21755094976 but couldn't create an automatic fix.

Diagnosis

The test test_podman_healthcheck_unhealthy is timing out after 81 seconds. The test is waiting for the VM's health status to become Unhealthy, but the status remains None (VM state not found/reported).

Test expectations:

Container image: localhost/test-unhealthy (nginx with HEALTHCHECK CMD exit 1)
Healthcheck config: --interval=1s --retries=1
Expected: Status should become Unhealthy within seconds
Actual: Status stays None for 60+ seconds, test times out

What Changed in PR:

Removed 5-minute read timeout on output listener
Added aborting of output listener task during VM cleanup

Observed in logs:

Multiple Error: no such object: "fcvm-container" errors (expected during startup)
VM appears to start successfully
Health monitor should be running, but status isn't being reported

Why Not Fixable

This appears to be either:

A race condition introduced by the output listener changes that prevents health status from being saved/read
A flaky test that was already intermittently failing
An environmental issue specific to the CI runner's timing/load

The code compiles successfully (cargo check --release passes), so there are no type errors or missing arguments. The issue is a subtle runtime behavior that requires:

Local reproduction with debug logging
Deeper investigation into health monitor task lifecycle
Possibly adding instrumentation to understand why VM state isn't being reported

Recommendation

Check if this test passes locally on the same architecture (x64-SnapshotEnabled, arm64-SnapshotDisabled)
Add debug logging to health monitor to see if it's starting/running
Consider if the test needs adjustment for the new output listener behavior
Check if there's a timing issue with how quickly health status is expected to be available

View Claude Run

ejc3 mentioned this pull request Feb 6, 2026

fix: abort output listener task during VM cleanup to prevent hangs #264

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove output listener read timeout for large image imports#263

Remove output listener read timeout for large image imports#263
ejc3 wants to merge 2 commits intomainfrom
fix-output-timeout

ejc3 commented Feb 6, 2026

Uh oh!

claude-claude bot commented Feb 6, 2026

Uh oh!

claude-claude bot commented Feb 6, 2026

Uh oh!

claude-claude bot commented Feb 6, 2026

Uh oh!

claude-claude bot commented Feb 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ejc3 commented Feb 6, 2026

Summary

Fix

Test plan

Uh oh!

claude-claude bot commented Feb 6, 2026

🔍 Claude Review

Findings

Summary

Uh oh!

claude-claude bot commented Feb 6, 2026

🔧 CI Auto-Fix

Uh oh!

claude-claude bot commented Feb 6, 2026

⚠️ CI Check Failed

Uh oh!

claude-claude bot commented Feb 6, 2026

CI Failure Analysis

Diagnosis

Why Not Fixable

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant