fix(ci): fix flaky yarn install in lint-and-build-ctst job #2368
bert-e merged 1 commit into development/2.14
Yarn v1's default parallel fetching can race during tar extraction, corrupting packages in the cache. Add --network-concurrency 1 to serialize fetching, and a cache clean + retry fallback for caches that are already corrupt. Issue: ZENKO-5246
Looking at the yarn tickets, it seems this may not be the root cause: cf. yarnpkg/yarn#7212 (comment), for example.
/approve
I have successfully merged the changeset of this pull request
The following branches have NOT changed:
Please check the status of the associated issue ZENKO-5246. Goodbye delthas.
The following options are set: approve
Summary
The lint-and-build-ctst job intermittently fails during yarn install with a corrupt cache error (example failure).
Root cause
Yarn v1 fetches and extracts packages in parallel by default. This can cause a race condition where concurrent tar extractions corrupt files in the local cache directory. Once a corrupted entry is written, the actions/setup-node cache action persists it to the GitHub Actions cache, causing subsequent workflow runs to restore the bad state and fail deterministically until the cache key rotates.
Solution
Two complementary mitigations:
- --network-concurrency 1: serializes package fetching and extraction, preventing the race condition that causes corruption in the first place. This is the primary fix; the slight increase in install time is negligible compared to the cost of flaky CI reruns.
- || (yarn cache clean && yarn install --network-concurrency 1): if the cache is already corrupt from a previous run, the first yarn install fails, the fallback wipes the local cache, and the retry succeeds with a clean slate. This handles the transitional period until all runners pick up a fresh cache.
Fixes ZENKO-5246
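For illustration, the two mitigations described under Solution could be combined in a single shell step along the following lines. This is a minimal sketch, not the exact change merged in this PR: the function name install_with_fallback and the YARN_BIN override are hypothetical and exist only so the logic can be exercised with a stub instead of a real yarn binary.

```shell
# Hypothetical override (not part of the PR): lets a stub replace yarn.
YARN_BIN="${YARN_BIN:-yarn}"

# Run a serialized install; on failure, clean the cache and retry once.
install_with_fallback() {
  # First attempt: limit network concurrency to 1 to avoid the suspected race.
  if "$YARN_BIN" install --network-concurrency 1; then
    return 0
  fi
  # Fallback: assume the local cache is already corrupt, wipe it, retry.
  echo "yarn install failed; cleaning cache and retrying" >&2
  "$YARN_BIN" cache clean && "$YARN_BIN" install --network-concurrency 1
}
```

Calling install_with_fallback from the CI step behaves like the one-liner in the description: the retry only runs when the first install fails, and the step still fails if the retry fails too, so a genuinely broken install is not masked.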