Skip to content

nix-hashes workflow fails on transient infrastructure errors #30742

@jerome-benoit

Description

@jerome-benoit

Description

The compute-hash matrix step in .github/workflows/nix-hashes.yml intermittently fails on transient infrastructure errors during nix build / bun install.

A sample of 8 recent failures shows three distinct transient patterns:

  • 6/8 — APFS unlink race on macOS aarch64-darwin during nix's post-build cleanup:
    error: cannot unlink "/nix/store/.../node_modules/...": Directory not empty
  • 1/8 — bun npm registry batch resolution failure (run 26936133588, x86_64-darwin):
    error: wrangler@4.50.0 failed to resolve (~25 packages)
  • 1/8 — nix DNS failure: error: unable to download 'https://channels.nixos.org/...': Could not resolve hostname

Re-running the workflow normally succeeds in all cases.

Impact

When this fires, the update-hashes job is skipped and nix/hashes.json is not refreshed, breaking downstream Nix consumers until the workflow is manually re-run.

Proposed fix

Wrap the nix build + hash-extraction block in a 3-attempt retry loop with linear backoff (10s, 20s). Success criterion stays "hash extracted from fakeHash mismatch log"; structural failures still surface after the final attempt.

OpenCode version

dev @ 69cfc44db

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions