fix: add concurrency control to nix-hashes workflow #18228

Open
jerome-benoit wants to merge 2 commits into anomalyco:dev from jerome-benoit:fix/nix-hashes-race-condition

Conversation

@jerome-benoit
Contributor

Issue for this PR

Closes #18227

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Adds a workflow-level concurrency group with cancel-in-progress: true to the nix-hashes workflow to prevent a race condition where two concurrent runs can overwrite each other's hashes.

The problem: When two pushes happen in quick succession (e.g. two commits within 2 minutes that both modify bun.lock or package.json), both trigger a nix-hashes run. The 4 matrix compute-hash jobs run at different speeds across platforms (darwin runners are significantly slower than linux), so the two runs do not necessarily finish in the order they started. Whichever update-hashes job finishes last commits its hashes, and those are not necessarily the hashes for the most recent commit.

Concrete evidence from 2026-03-18: Run 1 (SHA 81be5449, triggered 23:52) and Run 2 (SHA 5d2f8d77, triggered 23:54) ran concurrently. Run 2's update-hashes completed at 00:07:35, but Run 1's update-hashes completed at 00:08:33 — overwriting Run 2's correct hashes with stale ones from an older commit. This left x86_64-linux with hash sha256-yfA50QKqylmaioxi+6d++W8Xv4Wix1hl3hEF6Zz7Ue0= when the correct value is sha256-b0IXNtTj5geRLZGtCI5DxOXyqBJoxuwVf++bUgY3dco=.

The fix: setting concurrency.cancel-in-progress: true at the workflow level cancels the entire older run (matrix jobs and update-hashes) when a newer push triggers the workflow. Combined with the existing git pull --rebase defense in the commit step, this eliminates the race condition. Workflow-level rather than job-level concurrency is used because a job-level group on update-hashes alone would still allow stale matrix results to queue up.
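
For reference, the change described above amounts to a top-level block like the one below in the workflow file. The group expression is the one stated in this PR (github.workflow plus github.ref). Its placement relative to the other top-level keys is an assumption, since the diff itself is not reproduced here.

```yaml
# Top-level key, a sibling of `on:` and `jobs:` (placement assumed).
concurrency:
  # One group per workflow and ref, so each branch gets its own group.
  group: ${{ github.workflow }}-${{ github.ref }}
  # A newer run in the same group cancels the run already in progress.
  cancel-in-progress: true
```

On a push to dev, github.ref is refs/heads/dev, so the group name ends in -refs/heads/dev. A push to beta produces a different group name, which is why runs on the two branches do not cancel each other.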

How did you verify your code works?

  • Audited the GitHub Actions concurrency model documentation: workflow-level cancel-in-progress sends SIGINT to all running jobs, and the needs: dependency prevents update-hashes from starting if compute-hash is cancelled
  • Verified the concurrency.group uses github.workflow + github.ref so dev and beta runs don't cancel each other
  • Confirmed workflow_dispatch triggers are also covered by the concurrency group (same workflow name + ref)
  • Cross-validated against production patterns in other repos using matrix → aggregate → commit workflows
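
The needs: relationship mentioned in the first bullet can be pictured with a skeleton like the one below. Only the job names compute-hash and update-hashes, the x86_64-linux system, and the nix/hashes.json path come from this PR. The runner labels, the second matrix entry, and the placeholder steps are assumptions for illustration, not the real workflow.

```yaml
# Sketch only: runner labels, matrix entries, and step bodies are assumptions.
jobs:
  compute-hash:
    strategy:
      matrix:
        include:
          - system: x86_64-linux     # named in the PR description
            runner: ubuntu-latest    # assumed runner label
          - system: aarch64-darwin   # assumed matrix entry
            runner: macos-latest     # assumed runner label
    runs-on: ${{ matrix.runner }}
    steps:
      - run: echo "compute hash for ${{ matrix.system }}"   # placeholder step

  update-hashes:
    # With no custom `if:`, this job starts only after every
    # compute-hash matrix job has succeeded.
    needs: compute-hash
    runs-on: ubuntu-latest
    steps:
      - run: echo "collect hashes, git pull --rebase, commit nix/hashes.json"   # placeholder step
```

Because update-hashes has no if: condition of its own in this sketch, a cancelled compute-hash job means update-hashes never reaches its commit step.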

Screenshots / recordings

N/A — CI workflow change only.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Copilot AI review requested due to automatic review settings March 19, 2026 11:17
@github-actions
Contributor

The following comment was made by an LLM, it may be inaccurate:

Contributor

Copilot AI left a comment

Pull request overview

Adds workflow-level concurrency to the nix-hashes GitHub Actions workflow so only the latest run per branch (dev/beta) proceeds, preventing stale nix/hashes.json commits caused by overlapping runs.

Changes:

  • Introduces a workflow-level concurrency group keyed by ${{ github.workflow }}-${{ github.ref }}.
  • Enables cancel-in-progress: true so older runs are cancelled when a newer run starts on the same ref.


Development

Successfully merging this pull request may close these issues.

Race condition in nix-hashes workflow causes stale hashes

2 participants