ci: shared .github/actions/go-setup composite (15 call-sites) #506

Merged
trilamsr merged 2 commits into main from ci/go-setup-composite
Jun 3, 2026
@trilamsr trilamsr commented Jun 3, 2026

Contributor

Summary

Centralises the actions/setup-go boilerplate that was copy-pasted across
7 workflows (15 call-sites) into one shared .github/actions/go-setup
composite, so Dependabot SHA bumps land in one place and a stray
cache: false regression (which cost CI wall-time twice before this
consolidation) can no longer silently diverge between call-sites.

The composite is setup-go-only. GitHub requires actions/checkout
to run before any ./.github/actions/<x> path can resolve, so each
caller keeps its own checkout step. This also lets the few callers that
need fetch-depth: 0 or ref: ${{ github.sha }} keep that config
without forcing every caller through the lowest common denominator.
The existing kind-cluster-setup composite follows the same shape.
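
The shape described above can be sketched as follows. This is a minimal illustration, not the merged file: the `name` and `description` values and the `example` job are assumptions, the checkout reference is a placeholder for whatever pin each workflow already carries, and the decision comments that the merged `action.yml` carries are omitted.

```yaml
# .github/actions/go-setup/action.yml (sketch; name/description are assumed)
name: go-setup
description: Install Go from go.mod with the module cache enabled
runs:
  using: composite
  steps:
    # SHA pin and inputs as listed under "Behavioral guarantees".
    - uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0
      with:
        go-version-file: go.mod
        cache: true
---
# Caller side (hypothetical job in a migrated workflow)
jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      # Checkout stays in the caller: the local action path below only
      # exists on the runner after the repository has been checked out.
      - uses: actions/checkout@v4 # placeholder; keep the workflow's existing SHA pin
      - uses: ./.github/actions/go-setup
```

Because the composite takes no inputs in this sketch, every migrated call-site reduces to a single `uses:` line and cannot drift on `cache` or `go-version-file`.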

Migrated (15 call-sites, 7 workflows)

| Workflow | Call-sites |
|---|---|
| bench.yml | 1 |
| chaos.yml | 3 |
| chart.yml | 3 |
| ci.yml | 5 |
| codeql.yml | 1 |
| install-bench.yml | 1 |
| nccl-fr-fuzz-nightly.yml | 1 |
| Total | 15 |

Skipped (intentional — preserves auditable inline state)

  • compat-matrix.yml: sets an explicit cache: false with a multi-line
    zizmor cache-poisoning justification. Migrating would either lose
    the justification or force a cache: 'false' input plus the same
    comment, with no LoC win.
  • release.yml (×2 sites): each setup-go step carries the inline
    pragma cache: true # zizmor: ignore[cache-poisoning]. The pragma
    is load-bearing for the zizmor audit and cannot survive an
    input-passthrough. Each job also pins ref: ${{ github.sha }} or
    fetch-depth: 0 on its own checkout for force-push protection;
    those stay verbatim.
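
For contrast, the steps that stay inline look roughly like this. It is a sketch assembled from the description above, assuming release.yml uses the same setup-go pin and `go-version-file` as the migrated callers; the checkout reference is a placeholder.

```yaml
# release.yml, one job's steps (sketch; everything except the zizmor pragma is assumed)
steps:
  - uses: actions/checkout@v4 # placeholder; the real step is SHA-pinned
    with:
      ref: ${{ github.sha }} # or fetch-depth: 0, depending on the job
  - uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0 (assumed same pin)
    with:
      go-version-file: go.mod # assumed
      cache: true # zizmor: ignore[cache-poisoning]
```

The ignore comment lives on the `cache:` line of the setup-go step itself, which is the auditable inline state this PR leaves untouched.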

Behavioral guarantees

  • Same actions/checkout SHA preserved per caller (no checkout step
    changed in this PR — they were already there and stay there).
  • Same actions/setup-go SHA (4a3601121dd01d1626a1e23e37211e3254c1c06c / v6.4.0).
  • Same go-version-file: go.mod for every migrated caller.
  • Same cache: true for every migrated caller.

LoC delta

.github/actions/go-setup/action.yml        | 58 ++++++++++++++++++++
.github/workflows/bench.yml                |  5 +--
.github/workflows/chaos.yml                | 15 ++------
.github/workflows/chart.yml                | 15 ++------
.github/workflows/ci.yml                   | 25 +++----------
.github/workflows/codeql.yml               |  5 +--
.github/workflows/install-bench.yml        |  5 +--
.github/workflows/nccl-fr-fuzz-nightly.yml |  5 +--
8 files changed, 73 insertions(+), 60 deletions(-)

Net +13 LoC (73 insertions, 60 deletions), but the composite carries
~45 lines of decision-doc (why setup-go-only, SKIP list, SHA pin
rationale). Pure-mechanics delta is -47 lines of duplicated workflow
boilerplate replaced by 13 lines of composite YAML.

Test plan

  • actionlint .github/workflows/*.yml — green
  • make actionlint — green
  • python3 -c "import yaml; yaml.safe_load(open('.github/actions/go-setup/action.yml'))" — valid YAML
  • Pre-commit: golangci-lint, go vet, go mod verify, attribute-namespace-check, deprecation-check — all green
  • Pre-push hooks — green
  • Wait for CI on ci.yml (5 call-sites migrated) + chart.yml (3 call-sites) + codeql.yml to confirm composite resolves on ubuntu-latest
  • Verify no setup-go cache hit-rate regression (compare cache hit logs to a recent main run)
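
One way to make the cache hit-rate check in the last item easier to compare across runs is to surface setup-go's `cache-hit` output from the composite. This is a hypothetical extension, not part of this PR: the `setup` and `go` step ids, the composite-level `cache-hit` output, and the reporting step are all assumptions.

```yaml
# .github/actions/go-setup/action.yml (hypothetical variant exposing cache-hit)
name: go-setup
description: Install Go from go.mod and report whether the cache was restored
outputs:
  cache-hit:
    description: Forwarded from actions/setup-go
    value: ${{ steps.setup.outputs.cache-hit }}
runs:
  using: composite
  steps:
    - id: setup
      uses: actions/setup-go@4a3601121dd01d1626a1e23e37211e3254c1c06c # v6.4.0
      with:
        go-version-file: go.mod
        cache: true
---
# Caller side (hypothetical job)
jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # placeholder; keep the workflow's existing SHA pin
      - id: go
        uses: ./.github/actions/go-setup
      - name: Report setup-go cache result
        env:
          # Passed through env rather than interpolated into the script body.
          CACHE_HIT: ${{ steps.go.outputs.cache-hit }}
        run: echo "setup-go cache-hit=${CACHE_HIT}"
```

The logged value can then be compared against the same step on a recent main run.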
ci: shared `.github/actions/go-setup` composite consolidates the
`actions/setup-go` boilerplate across 15 call-sites in 7 workflows.
Dependabot SHA bumps now land in one place. No workflow's effective
behavior changes; `compat-matrix.yml` and `release.yml` keep their
inline zizmor cache-poisoning audit annotations.

@trilamsr trilamsr enabled auto-merge (squash) June 3, 2026 22:58
trilamsr added a commit that referenced this pull request Jun 3, 2026
## Summary

Bumps the Go toolchain pin from **1.26.3 -> 1.26.4** to pick up the
stdlib fix for [GO-2026-5037](https://pkg.go.dev/vuln/GO-2026-5037)
(`crypto/x509.HostnameError.Error`), which `govulncheck` flags via
`tools/pyspy-lint/main.go:106:14` (reachable through `fmt.Fprintln` on
an error path). This was failing the `verify-static` job on every
recent PR.

## Root cause

`crypto/x509.HostnameError.Error` shipped vulnerable in Go 1.26.3.
Patched in Go 1.26.4. There is no in-repo workaround — the call site
in `tools/pyspy-lint` is legitimate error formatting; the only correct
fix is bumping the toolchain pin. Confirmed locally:

```
$ govulncheck ./tools/pyspy-lint/...   # with GOTOOLCHAIN=go1.26.4
No vulnerabilities found.
```

## Files touched (5)

- `go.mod` — `go 1.26.3` -> `go 1.26.4`
- `go.work` — `go 1.26.3` -> `go 1.26.4` (+ updated header comments)
- `.go-version` — `1.26.3` -> `1.26.4` (drives `actions/setup-go` via
  `go-version-file`)
- `install/kubernetes/tracecore/Dockerfile` — base image bumped to
  `golang:1.26.4-alpine` with refreshed sha256 digest
  (`f23e8b22…2a17f`, fetched via `crane digest`)
- `docs/SUPPORT-MATRIX.md` — Go-toolchain row updated to `1.26.4`

`module/go.mod` is intentionally untouched — it pins `go 1.22.0` to
track the OTel collector v0.110.0 OCB-distribution baseline (see
existing comment), and the workspace `go` directive (`1.26.4`)
remains `>=` the member-module floor (`1.22.0`), so workspace mode is
unaffected.

## Test plan

- [x] `govulncheck ./tools/pyspy-lint/...` -> No vulnerabilities found
- [x] `go build ./...` (root, GOTOOLCHAIN=go1.26.4) -> clean
- [x] `go test ./tools/... ./internal/...` -> all green (incl.
  `tools/pyspy-lint`, the file containing the flagged call site)
- [x] `module/` `go test ./...` -> matches `main` (one pre-existing
  failure in `processor/patterndetectorprocessor`,
  `TestPatternDetector_NegativeFixturesEmitNoVerdicts/synthetic-2026-06-multi-rank-disk-pressure`,
  reproducible on `main` at the same SHA; unrelated to this bump,
  out-of-scope here)
- [x] `make lint` -> 0 issues
- [ ] CI `verify-static` job passes (the gate this PR exists to fix)
- [ ] CI `build` / kind install bench builds against new pinned-digest
  golang base image

## Unblocks

Should clear `verify-static` for PRs #504, #505, #507 (and #506 once
its own `action.yml` fix lands).

```release-notes
chore: bump Go toolchain pin to 1.26.4 to pick up the stdlib fix for
GO-2026-5037 (crypto/x509.HostnameError.Error). No behavior change.
```

Signed-off-by: Tri Lam <tree@lumalabs.ai>
@trilamsr trilamsr force-pushed the ci/go-setup-composite branch from 9d3dcfd to acff49a Compare June 3, 2026 23:23
trilamsr added 2 commits June 3, 2026 16:34
Centralises the actions/setup-go boilerplate that was copy-pasted across
7 workflows so Dependabot SHA bumps land in one place and a stray
`cache: false` regression (which cost CI wall-time twice before this)
can no longer silently diverge between call-sites.

The composite is setup-go-only. GitHub requires `actions/checkout` to
run before any `./.github/actions/<x>` path resolves, so each caller
keeps its own checkout step (also lets the few callers that need
`fetch-depth: 0` or `ref: ${{ github.sha }}` keep that config without
forcing every caller through the lowest common denominator).

Migrated (15 call-sites, 7 workflows):
  bench.yml, chaos.yml (3), chart.yml (3), ci.yml (5),
  codeql.yml, install-bench.yml, nccl-fr-fuzz-nightly.yml

Skipped (intentional, preserves auditable inline state):
  compat-matrix.yml — `cache: false` with zizmor cache-poisoning note
  release.yml      — `cache: true # zizmor: ignore[cache-poisoning]`
                     comment + `ref: ${{ github.sha }}` pin per job

Validated: actionlint green; no workflow's effective behavior changed
(same checkout SHA preserved per caller, same setup-go SHA, same
`go-version-file: go.mod`, same `cache: true`).

Signed-off-by: Tri Lam <tree@lumalabs.ai>
Signed-off-by: Tri Lam <tree@lumalabs.ai>
@trilamsr trilamsr force-pushed the ci/go-setup-composite branch from acff49a to eaa34dc Compare June 3, 2026 23:34
@trilamsr trilamsr merged commit 37cca43 into main Jun 3, 2026
31 checks passed
@trilamsr trilamsr deleted the ci/go-setup-composite branch June 3, 2026 23:45
trilamsr added a commit that referenced this pull request Jun 3, 2026
## Summary

Wave 2 PR-B per Lane H followups-dir recon. Two doc-only edits trim
~643 net LoC of stale carry-forward backlog from `docs/followups/`:

1. **`M13.md` stubbed** (446 → 13 lines, -433). Pyspy receiver work
   is DEFERRED to v0.4.0+ per [#222](#222)
   (`external-clock` label) and RFC-0009. The receiver still ships in
   `components/receivers/pyspy/`, but the carry-forward queue is
   paused. File kept in place as a load-bearing marker — sibling
   issue #335 tracks re-evaluation preconditions.
2. **`M8.md` DCGM section removed** (318 → 108 lines, -210). RFC-0013
   §7 STRIKE'd the in-tree DCGM receiver in favor of `dcgm-exporter`
   + `prometheusreceiver` in the bundled recipe; the cgo stub never
   shipped real code. All DCGM-specific bullets (`pkg/dcgm`,
   `client_cgo`, `kindWatch`/`kindMIG`, AST resolver, libdcgm SIGSEGV
   subprocess, `dcgm_info` join-target, `pkg/vendorsdk-template`,
   perf-gate, HARDWARE-TESTING shipped marker) are gone. Surviving
   items: `tracecore debug dump` `[AUDIT]`, `validate --explain`
   closed marker, prometheus-alerts federation labels `[KEEP]`,
   grafana-dashboard per receiver `[KEEP]`, OTel `hw.*` semconv
   upstream PR `[UPSTREAM]`, build-tags CI closed marker, M8↔M9
   Option-pattern consistency review `[AUDIT]`.

`M3.md` and `M15.md` strikethrough sweep was a no-op — `rg '^~~'`
returned zero matches in both files (the spec's "if 5+ matches,
collapse" predicate did not fire).

## Pre-flight grep

`rg DCGM docs/` cross-confirmed no inbound anchor refs into the
deleted M8 bullets — DCGM mentions in `MILESTONES.md`,
`v1-rc1-cut-criteria.md`, `FOLLOWUPS.md`, etc. are about the milestone
definitions and the RFC-0013 strike table, not specific M8 line
anchors. `M11.md` already documents that DCGM lives under M8 and is
`[STRIKE]` per RFC-0013 §7 — that pointer remains valid.

## LoC delta

- Net delete: **643 lines** (701 deletions − 58 insertions).
- M13.md: -433. M8.md: -210.

## Conflict check

Different files than Wave 2 PR-A (`docs/v1-rc1-*`), Lane F
(`module/**/*.go`), and #504/#506. No overlap with any in-flight PR
touching `docs/followups/`.

## Test plan

- [x] `git diff --stat` confirms 2 files, 58 insertions, 701 deletions
- [x] `wc -l docs/followups/M13.md` = 13 lines (target ≤25)
- [x] `grep -ic dcgm docs/followups/M8.md` = 6 (down from 32)
- [x] Pre-commit hooks pass (golangci-lint, go vet,
  attribute-namespace-check)
- [x] DCO sign-off present

```release-notes
NONE
```

Signed-off-by: Tri Lam <tree@lumalabs.ai>