Fix stale lockfile cleanup for function extension builds#6696
Fix stale lockfile cleanup for function extension builds#6696skypher wants to merge 7 commits intoShopify:mainfrom
Conversation
When a function build process crashes or is killed, the `.build-lock` directory can be left behind. Previously, this caused subsequent builds to hang for an extended period (with 20 retries using exponential backoff) before failing, making development frustrating. This change: - Adds proactive stale lock detection before attempting to acquire a lock - Uses `lockfile.check()` to verify if a lock is actively held - Removes orphaned lock directories from crashed builds automatically - Reduces retries from 20 to 3 with shorter timeouts (faster failure) - Sets explicit 10-second stale threshold for lock files - Improves error message to show the actual lockfile path The fix handles three scenarios: 1. Stale lock exists (no active process) - auto-removes and proceeds 2. Active lock held by another process - waits normally 3. Corrupted lock (check fails) - attempts cleanup and proceeds
This prevents stale lockfiles from being left behind when the build process is interrupted by a signal (Ctrl+C, kill, terminal close). The signal handlers: - Register after the lock is acquired - Release the lock before re-emitting the signal to allow normal termination - Are cleaned up after normal build completion to avoid memory leaks Also adds a test to verify signal handlers are properly registered and removed during the build lifecycle.
The @shopify/cli-kit rmdir function uses the 'del' library which handles recursion automatically. The RmDirOptions interface only supports 'force', not 'recursive'.
|
CLA signed. |
|
This PR seems inactive. If it's still relevant, please add a comment saying so. Otherwise, take no action. |
|
Still relevant 🤷♂️ |
|
This PR seems inactive. If it's still relevant, please add a comment saying so. Otherwise, take no action. |
|
Still waiting |
|
This PR seems inactive. If it's still relevant, please add a comment saying so. Otherwise, take no action. |
Summary
When a function build process crashes or is killed (e.g., by SIGINT/Ctrl+C, SIGTERM, SIGHUP), the
.build-lockdirectory can be left behind. Previously, this caused subsequent builds to hang for an extended period (with 20 retries using exponential backoff) before failing, making development frustrating. This was particularly confusing because the build would appear to hang indefinitely with no clear error message.This PR implements a two-pronged approach to handle stale locks:
1. Signal Handlers for Clean Shutdown
Registers signal handlers (SIGINT, SIGTERM, SIGHUP) after acquiring the lock to ensure the lock is released even when the process is interrupted:
2. Proactive Stale Lock Detection on Startup
Uses
lockfile.check()to verify if a lock is actively held before attempting acquisition:The fix handles three startup scenarios:
Test plan