
[m3] container image publish to ghcr.io/tracecoreai/tracecore #145

Merged
trilamsr merged 7 commits into main from worktree-m3-ghcr-image-publish
May 21, 2026

Conversation

@trilamsr trilamsr commented May 20, 2026

Contributor

Summary

Closes the long-standing chart-default-image gap. The chart's install/kubernetes/tracecore/values.yaml has shipped with image.repository: ghcr.io/tracecoreai/tracecore as the default since M5b, but release.yml only ever published the binary + SBOM + cosign-bundle + provenance as GitHub Release artifacts. Operators following the chart's defaults could not helm install. RFC-0008 names this path as the target operator-pull surface.

Root cause

The chart's default image.repository and release.yml's output set drifted. The chart was deliberately specified against a future-state image registry; the registry-publish job was tracked as an M3 follow-up but had not yet been built. This PR closes the gap at the source by adding the publish job, not by walking back the chart default.

Architecture

  • Dockerfile pins gcr.io/distroless/static-debian12:nonroot by digest (sha256:d093aa3e30...). Non-root UID 65532 matches the chart's runAsUser. CGO_ENABLED=0 makes scratch viable too, but distroless gives a working CA bundle for the otlphttp exporter's HTTPS path and tzdata for RFC3339 stamping with zero shell-attack surface. The Dockerfile also declares ARG SOURCE_DATE_EPOCH so the determinism contract is visible to a Dockerfile-only reader.
  • The image consumes the pre-built reproducible binary from the build job (COPY release/$BINARY_BASENAME), not a recompile. Image reproducibility reduces to binary reproducibility (already gated) plus the digest-pinned base layer plus SOURCE_DATE_EPOCH threaded through buildkit's layer-rewrite (via the step env: block, not just --build-arg).

release.yml image job

  • needs: build (downloads binary artifact, verifies SHA-256 matches build.outputs.digest before push).
  • docker/build-push-action@v6.19.2 with SOURCE_DATE_EPOCH set via both the step env: block AND --build-arg so buildkit's layer-rewrite kicks in.
  • Always tags :TAG. Floats :latest only on stable releases (no - in the SemVer pre-release field), so a pre-release cannot silently promote alpha bits to the chart's default-pull surface.
  • cosign sign --yes "$IMAGE_REPO@$DIGEST": signs by digest, not tag. A registry rebuild of a floating tag would otherwise let an attacker replace what cosign verify resolves.
  • cosign verify smoke check pins the same identity binding the binary blob already uses (--certificate-github-workflow-ref refs/tags/$TAG, --trigger push).
  • attest-build-provenance with push-to-registry: true attaches the SLSA v1.0 provenance to the manifest in the registry, so a verifier pulls everything from one place via gh attestation verify oci://.
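
The `:latest` floating rule in the list above can be sketched in shell. This is a minimal sketch, not the workflow's actual step: the function name `image_tags` and its output format are hypothetical, and it assumes release tags look like `vX.Y.Z` with an optional `-prerelease` suffix and no build metadata.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the image references to push for a release tag.
# Assumption: a tag containing "-" carries a SemVer pre-release field.
image_tags() {
  local repo="$1" tag="$2"
  printf '%s:%s\n' "$repo" "$tag"          # always publish the exact tag
  case "$tag" in
    *-*) ;;                                # pre-release: never move :latest
    *)   printf '%s:latest\n' "$repo" ;;   # stable: float :latest
  esac
}

image_tags ghcr.io/tracecoreai/tracecore v0.1.0
image_tags ghcr.io/tracecoreai/tracecore v0.2.0-rc.1
```

Keeping the decision in one small function makes the "pre-release never moves `:latest`" rule easy to test in isolation from the push itself.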

Permissions: id-token: write, attestations: write, packages: write. No long-lived registry credentials (GHCR auth uses the workflow's GITHUB_TOKEN); no long-lived signing keys (cosign keyless via OIDC).

Docs

  • docs/reproducibility.md grows two steps (8: resolve digest with crane digest, then cosign verify by digest; 9: gh attestation verify oci://) with the same identity-binding flags as the binary-side steps. crane added to prerequisites.
  • install/kubernetes/tracecore/README.md "Pre-release note" replaced with the live-publish contract. Troubleshooting "ImagePullBackOff on first install" entry updated with the Dockerfile-based local-build workaround (was: "M3 release stream has not landed yet").
  • docs/followups/M3.md "Container-image publish" item closed with the HTML-comment + struck-italic convention used by the rows already closed in that shard. New section "Items impossible to accomplish locally" added for the three M21-trigger items (end-to-end push, oci:// attestation smoke, two-build image-digest equality) so a future contributor does not file a "missing test" issue assuming the gap is oversight.
  • CHANGELOG.md [Unreleased] ### Added gains an M3 entry.

Self-review fixes (commits 2-4)

Two rounds of self-review surfaced and closed:

Round 1 (commit 2 — 7578feb):

  • F3: cosign triangulate --type digest was the wrong tool. It resolves the signature reference for a subject, not the subject's own digest. Replaced with crane digest (canonical tag→digest resolver); added crane to prerequisites.
  • F5: SOURCE_DATE_EPOCH did not actually reach buildkit. Build-args undeclared in the Dockerfile are silently ignored, so the COPY layer's mtime was non-deterministic. Now threaded through both env: block (buildkit layer-rewrite) and ARG SOURCE_DATE_EPOCH (Dockerfile contract).
  • F1: release-doc-parity.sh only covered the binary surface. Extended with a parallel block for image-side cosign verify. Mutation-verified.
  • F4: Force-push comment overstated the SHA pin's guarantee. Reworded to match the actual (binary-digest guard + tree-checkout) closure.

Round 2 (commit 3 — 7034e1a, commit 4 — 459b686):

  • R1 (gh CLI semantic drift): New scripts/gh-attestation-flag-lint.sh parses gh attestation verify --help and asserts every long flag used in release.yml + reproducibility.md is still recognised by the installed CLI. Wired into make doc-check. Mutation-verified (mutated --help output that drops one flag → script exits 1 with fix hint).
  • R2 (distroless base digest rotation): New scripts/base-digest-check.sh compares the Dockerfile pin against crane digest gcr.io/distroless/static-debian12:nonroot. Two modes: --warn (default, exits 0) for periodic cadences and --strict (exits non-zero) for M21 release-prep via make base-digest-check. Deliberately NOT in doc-check (network + legitimate-lag). Mutation-verified.
  • A++ #1 (gate-the-gate): scripts/testdata/release-doc-parity/{intact,drift-binary,drift-image}/ fixtures exercise both parity blocks; scripts/test-release-doc-parity.sh drives them with WORKFLOW/DOC env overrides and asserts expected exit codes. Mutation-verified: breaking the image-side awk anchor in the gate makes the intact fixture fail.
  • R3 (timeout-minutes): Out of scope (no per-job timeouts exist anywhere in release.yml today). Documented as an M3 follow-up with concrete per-job minute suggestions.
  • Commit 4 fixes one em-dash the doc-check em-dash gate flagged in the fixture README.

Items impossible to accomplish locally (documented in docs/followups/M3.md)

Three checks only become exercisable at M21 v0.1.0 (or any vX.Y.Z tag) push time, because the image job is tag-triggered:

  1. End-to-end image push smoke against ghcr.io/tracecoreai/tracecore. Mitigations in place: actionlint, release-doc-parity.sh image block, gh-attestation-flag-lint.sh, binary-digest guard.
  2. gh attestation verify "oci://$DIGEST" against a real attestation in the shape this pipeline emits. No public OCI image carries a GitHub Actions provenance attestation in matching shape, so the verifier walkthrough cannot be smoke-tested end-to-end before M21. gh-attestation-flag-lint.sh partially covers this by asserting flag-name compatibility; semantic flag changes are the residual risk.
  3. Two-build digest equality for the image. The SOURCE_DATE_EPOCH plumbing claims image reproducibility, but the claim is only verifiable by building twice at the same SHA and diff'ing the manifest digests. The local dev environment currently lacks a working docker buildx; CI has buildx but doubling the runner-time at every tag is a tradeoff worth revisiting post-M21.

Release notes

[FEATURE] Container images publish to ghcr.io/tracecoreai/tracecore:<TAG> on every release tag, signed and attested (cosign keyless + SLSA v1.0 provenance, both pushed to the registry). The Helm chart's default image.repository is now a live pull path. Verification walkthrough in docs/reproducibility.md steps 8-9.

Test plan

  • make ci clean: golangci-lint, govulncheck, vet, mod-verify, RCE gate, register-lint, actionlint, zizmor, all unit/race tests.
  • make doc-check clean (14 sub-gates including 3 new ones from round 2: test-release-doc-parity (3 fixtures), gh-attestation-flag-lint (6 flags), image-side release-doc-parity block).
  • actionlint clean on release.yml after the env: block addition.
  • make base-digest-check clean against live gcr.io (pinned digest is current).
  • Mutation-verified: every new gate's failure mode (gh CLI flag rename, Dockerfile digest forge, parity-script regex break).
  • Dockerfile validates by inspection: distroless base pinned by digest; UID 65532 matches chart runAsUser; ENTRYPOINT/CMD shape allows the chart's args: [collect, --config=/etc/tracecore/config.yaml] to override cleanly; ARG SOURCE_DATE_EPOCH declared for local-reproducibility.
  • End-to-end image push exercise + gh attestation verify oci:// against a real attestation + two-build image-digest equality: impossible locally; see "Items impossible to accomplish locally" above. First real exercise will be M21 v0.1.0 (or any pre-release tag).

🤖 Generated with Claude Code

Update (commits 5-6, after main moved further)

While this PR was open, main advanced by a further 6 PRs (#143, #144, #146, #147, #148, #149). The branch caught up via git merge origin/main, per the merge-not-rebase policy this PR also documents (see commit ddf86f7).

All 14 doc-check gates still green post-merge. The merge commit is preserved on the branch (git merge with --no-ff); GitHub squash-merge on the PR button will collapse it into the same single-commit-on-main shape every other tracecore PR lands as.

Update (commit 7 — 285640c, A+ polish)

A self-review pass after the merge surfaced two cross-cutting hardening items. Each takes roughly one line per job to land, and leaving either out would have kept the surface incomplete:

  1. timeout-minutes on every release.yml job (build=20, sbom=15, sign=10, provenance=10, image=20, release=10). GitHub's default ceiling is 360m / 6h; a wedged push or hung Sigstore round-trip now fails fast inside the per-job cap rather than burning a runner-hour. Caps chosen at 2-4x observed real wall-clock so transient ghcr/Sigstore weather doesn't trip on healthy runs. Closes the M3.md row that previously held this out as "opportunistic."
  2. cosign verify-attestation --type slsaprovenance1 smoke check in the image job after attest-build-provenance pushes the SLSA v1 attestation to the registry. Uses the same identity binding (refs/tags/$TAG + release.yml workflow path + push trigger) the manifest-signature verify already enforces. Now every artifact this pipeline publishes — binary blob, image manifest, image provenance — is CI-verified inside the same run that produced it, against the same identity claims a third-party verifier would reproduce offline.

docs/followups/M3.md also gains an explicit "Out of scope for M3" section rowing three items the self-review asked about: multi-arch image build (linux/arm64), container vulnerability scan gate (trivy/grype), and image SBOM sub-attestation (syft/cyclonedx with --upload). Each is rowed with a trigger, so a future audit can find it without commit archaeology and nothing is left ambiguously deferred.

actionlint clean on release.yml; make doc-check clean across all gates including the new release-doc-parity image block, test-release-doc-parity (3/3 fixtures), gh-attestation-flag-lint (6 flags), and chart-appversion-check.

trilamsr and others added 4 commits May 20, 2026 16:45
Closes the long-standing chart-default-image gap. The chart's
install/kubernetes/tracecore/values.yaml has shipped with
image.repository=ghcr.io/tracecoreai/tracecore as the default since
M5b, but release.yml only ever published the binary + SBOM +
cosign-bundle + provenance as GitHub Release artifacts. Operators
following the chart's defaults could not pull. RFC-0008 names this
as the target operator-pull path.

Architecture:
- New Dockerfile pinned to gcr.io/distroless/static-debian12:nonroot
  by digest (sha256:d093aa3e30...). Non-root UID 65532 matches the
  chart's runAsUser. CGO_ENABLED=0 binary means scratch was viable
  too, but distroless gives a working CA bundle for the otlphttp
  exporter's HTTPS path and tzdata for the binary's RFC3339 stamping
  with zero shell-attack surface.
- The image consumes the pre-built reproducible binary from the
  build job (COPY release/$BINARY_BASENAME), not a recompile. Image
  reproducibility reduces to binary reproducibility (already gated)
  plus the digest-pinned base layer.

release.yml `image` job:
- needs: build (downloads binary artifact, verifies sha256 matches
  build.outputs.digest before push).
- Builds with docker/build-push-action@v6.19.2, SOURCE_DATE_EPOCH
  threaded through so the COPY layer's mtime is deterministic.
- Always tags :TAG. Floats :latest only on stable releases (no `-`
  in the SemVer pre-release field) so a pre-release does not
  silently promote alpha bits to the chart's default-pull surface.
- cosign sign --yes "$IMAGE_REPO@$DIGEST": sign the manifest BY
  DIGEST, not tag. A registry rebuild of a floating tag would
  otherwise let an attacker replace what cosign verify resolves.
- cosign verify smoke check pins the same identity binding the
  binary blob already uses (--certificate-github-workflow-ref
  refs/tags/$TAG, --trigger push).
- attest-build-provenance with push-to-registry=true attaches the
  SLSA v1.0 provenance to the manifest in the registry, so a verifier
  pulls everything from one place via `gh attestation verify oci://`.
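
The binary-digest guard named in the `needs: build` bullet can be sketched as follows. This is a minimal sketch under stated assumptions, not the job's actual step: the function name `verify_sha256` is hypothetical, GNU coreutils `sha256sum` is assumed to be on the runner, and the expected digest may arrive with or without a `sha256:` prefix.

```shell
#!/usr/bin/env bash
# Hypothetical guard: refuse to continue unless the file's SHA-256 equals
# the digest recorded by an earlier job. Assumes GNU coreutils sha256sum.
verify_sha256() {
  local file="$1" expected="${2#sha256:}" actual
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  if [ "$actual" != "$expected" ]; then
    echo "digest mismatch for $file: want $expected, got $actual" >&2
    return 1
  fi
}

# Usage (the variable holding the build job's digest is illustrative):
#   verify_sha256 "release/$BINARY_BASENAME" "$BUILD_DIGEST" || exit 1
```

Running the comparison before the image build means a binary produced from a different tree never reaches the registry.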

Permissions: id-token: write, attestations: write, packages: write.
No long-lived registry credentials (GHCR auth uses the workflow's
GITHUB_TOKEN); no long-lived signing keys (cosign keyless via OIDC).

Docs:
- docs/reproducibility.md grows two steps (8: cosign verify image
  manifest by digest; 9: gh attestation verify oci://) with the
  same identity-binding flags as the binary-side steps. The
  release-doc-parity.sh gate scopes to the binary-side
  `gh attestation verify "$BINARY_BASENAME"` line specifically,
  so adding the image verifier does not break parity.
- "What this verifies" / "What this does not verify" / "If a step
  fails" tables extended for steps 8-9.
- install/kubernetes/tracecore/README.md "Pre-release note"
  replaced with the live-publish contract. Troubleshooting
  "ImagePullBackOff on first install" entry updated with the
  Dockerfile-based local-build workaround (was: "M3 release stream
  has not landed yet").

Followup + changelog:
- docs/followups/M3.md "Container-image publish" item closed with
  the project's HTML-comment-above + struck-italic-line-below
  convention (mirrors the rows already closed in that shard).
- CHANGELOG [Unreleased] ### Added gains an M3 entry.

Verification:
- `make ci` clean (golangci-lint, govulncheck, vet, mod-verify, RCE
  gate, register-lint, actionlint, zizmor, all unit/race tests).
- `make doc-check` clean: 438 markdown links resolve, em-dash + en-dash
  diff gate clean, release-doc-parity green, chart-appversion green.
- actionlint clean on the new release.yml job.
- Cannot exercise the end-to-end push without a real tag push;
  workflow runs at next vX.Y.Z tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… gate, force-push comment

Self-review of #145 surfaced three correctness bugs and one missing
anti-regression gate. All fixable now; fixing now beats deferring to
the followup list.

F3 — `cosign triangulate --type digest` is the wrong tool.

`cosign triangulate` resolves the signature/attestation OCI reference
for a subject, not the subject's own digest. Step 8 as written likely
fails at M21 tag-push exercise. Replaced with `crane digest` (the
canonical tag-to-digest resolver) and added `crane` to the
prerequisites list. The "If a step fails" row for step 8 already
mentioned `crane digest` as a fallback; it is now the primary.

F5 — SOURCE_DATE_EPOCH did not actually reach buildkit.

Build-args not declared in the Dockerfile are silently ignored by
buildkit, so the COPY layer's mtime was non-deterministic despite the
intent. Two fixes layered:
  1. `env: SOURCE_DATE_EPOCH:` on the build-push-action step so
     buildkit's layer-timestamp rewrite picks up the epoch from the
     build environment (buildkit >= 0.11 honors this).
  2. `ARG SOURCE_DATE_EPOCH` declared in the Dockerfile so the
     determinism contract is visible to readers of the Dockerfile
     alone and so `docker buildx build --build-arg
     SOURCE_DATE_EPOCH=...` reproduces the CI image bit-for-bit.

F1 — release-doc-parity.sh did not cover the image surface.

The existing gate only compared the binary-side `gh attestation
verify` flag set. Extended with a parallel block covering the
image-side `cosign verify` flag set: the `image` job's smoke check vs
`docs/reproducibility.md` step 8. Mutation-verified locally —
breaking one flag in the workflow makes the gate exit non-zero.

F4 — softened the force-push comment on the image-job checkout.

The previous wording claimed the SHA pin alone prevents tree drift
between `build` and `image`. In fact the pin guarantees the
Dockerfile + workflow tree this job reads matches the commit that
ran; the binary-digest guard (already present) catches the case of a
binary built from a different tree than the Dockerfile read here.
Together they close the force-push window. Rewrote to match the
actual guarantee.

Verification:
- make doc-check clean (438 markdown links, em-dash + comment-noise
  gates, both release-doc-parity blocks, chart-appversion gate).
- actionlint clean on release.yml.
- release-doc-parity image block mutation-verified (rename one flag
  in the workflow -> gate fails).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-review surfaced three residual risks (R1 gh CLI semantic drift,
R2 distroless base digest rotation, A++ #1 gate-the-gate). All three
are bridged before merge instead of carried forward as followups.
Also drops the section-banner the comment-noise gate flagged in the
prior commit's parity-script edit.

R1 - gh attestation verify flag-shape regression lint.

New scripts/gh-attestation-flag-lint.sh parses
`gh attestation verify --help` and asserts every long flag we use in
release.yml + docs/reproducibility.md is still recognised by the
installed gh CLI. Catches the failure mode "gh renamed
--signer-workflow between point releases and our published verifier
walkthrough is now broken." Skips quietly when gh is not installed
(local dev); CI ubuntu-latest always has it. Mutation-verified.
Wired into `make doc-check`.
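
The idea behind the flag-shape lint can be sketched like this. It is a minimal sketch, not the contents of scripts/gh-attestation-flag-lint.sh: the function name `missing_flags` is hypothetical, and the help text is read from stdin so the check can be exercised without the gh CLI installed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a flag-shape lint: read CLI help text on stdin and
# report any required long flag that the help text no longer mentions.
missing_flags() {
  local help flag status=0
  help="$(cat)"
  for flag in "$@"; do
    # Match the flag as a whole token, so --signer does not match
    # --signer-workflow.
    if ! printf '%s\n' "$help" |
        grep -Eq -- "(^|[^[:alnum:]-])${flag}([^[:alnum:]-]|\$)"; then
      echo "flag no longer recognised: $flag" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage against the installed CLI:
#   gh attestation verify --help | missing_flags --signer-workflow --repo
```

As the PR notes, a lint of this shape only catches renamed or removed flags; a flag that keeps its name but changes meaning would still pass.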

R2 - Distroless base digest rotation gate.

New scripts/base-digest-check.sh calls
`crane digest gcr.io/distroless/static-debian12:nonroot` and
compares against the Dockerfile pin. Two modes: --warn (default,
exits 0) for periodic cadences and --strict (exits non-zero) wired
into the M21 release-prep flow via the new `make base-digest-check`
target. Deliberately NOT in doc-check: requires outbound network to
gcr.io and the pin legitimately lags between rotations. M3 follow-up
shard gains a row describing the cadence.
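
The two modes can be sketched as below. This is a minimal sketch, not the contents of scripts/base-digest-check.sh: the function names are hypothetical, and the live digest is passed in as an argument (in practice the output of `crane digest`) so the comparison itself needs no network.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare the digest pinned in a Dockerfile FROM line
# with a digest supplied by the caller.
pinned_digest() {
  # Print the first sha256 digest that appears on a FROM line.
  grep -E '^FROM ' "$1" | grep -oE 'sha256:[0-9a-f]{64}' | head -n 1
}

check_base_digest() {
  local mode="$1" dockerfile="$2" live="$3" pinned
  pinned="$(pinned_digest "$dockerfile")"
  if [ "$pinned" = "$live" ]; then
    echo "base digest is current: $pinned"
    return 0
  fi
  echo "base digest drift: pinned=$pinned live=$live" >&2
  # --warn reports drift but exits 0; --strict turns drift into a failure.
  [ "$mode" = "--strict" ] && return 1
  return 0
}

# Usage:
#   check_base_digest --strict Dockerfile \
#     "$(crane digest gcr.io/distroless/static-debian12:nonroot)"
```

Separating the network call from the comparison keeps the drift logic testable offline, which matches the decision to keep this check out of doc-check.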

A++ #1 - Gate-the-gate fixture for release-doc-parity.sh.

scripts/testdata/release-doc-parity/{intact,drift-binary,drift-image}
fixtures exercise both parity blocks.
scripts/test-release-doc-parity.sh drives them and asserts the
expected exit codes. release-doc-parity.sh now accepts WORKFLOW/DOC
env overrides (production paths unchanged). Mutation-verified:
breaking the image-side awk anchor in the gate makes the intact
fixture fail, driver catches it.
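
A gate-the-gate driver of this kind can be sketched as follows. This is a minimal sketch, not the contents of scripts/test-release-doc-parity.sh: the function name `expect_gate` and the fixture file names `workflow.yml` and `doc.md` are hypothetical; only the WORKFLOW/DOC env overrides come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical driver: run a gate against one fixture directory and check
# that its result matches the expectation ("pass" or "fail").
# Assumptions: the gate reads its inputs from the WORKFLOW and DOC env vars,
# and each fixture holds files named workflow.yml and doc.md.
expect_gate() {
  local gate="$1" fixture="$2" want="$3" got=pass
  WORKFLOW="$fixture/workflow.yml" DOC="$fixture/doc.md" \
    "$gate" >/dev/null 2>&1 || got=fail
  if [ "$got" != "$want" ]; then
    echo "fixture $fixture: expected $want, gate reported $got" >&2
    return 1
  fi
}

# Usage (paths are illustrative):
#   expect_gate scripts/release-doc-parity.sh fixtures/intact pass
#   expect_gate scripts/release-doc-parity.sh fixtures/drift-image fail
```

Asserting both a passing and a failing fixture is what makes the driver catch a gate that has silently stopped matching anything.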

R3 - timeout-minutes on release.yml jobs: out of scope (no per-job
timeouts exist anywhere in release.yml today). Documented as an M3
follow-up with concrete per-job minute suggestions.

Items impossible to accomplish locally are listed in docs/followups/
M3.md under a new "Items impossible to accomplish locally" section:
end-to-end image-push smoke, real-attestation oci:// verifier exercise,
and two-build image-digest equality. All three only become exercisable
at M21 v0.1.0 tag-push time.

Banner-comment cleanup: removed the section banner from
release-doc-parity.sh that the comment-noise gate flagged in 7578feb.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tri Lam <tri@maydow.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 21, 2026
#147)

## Summary

Single-PR bundle of 10 low-risk follow-up actions. Each row was
anchor-verified on `main` before editing; no production behavior change.
Diff is 10 files, +91/-37, dominated by markdown.

**Breakdown:**
- 3 strikes (anchor shipped, row was stale)
- 1 test-only struct add (k8sevents `NodeWatchErrors`)
- 1 bash test add (`no-autoupdate-check` hit-line format lock)
- 5 doc-only clarifications / partial-ship / audit notes

## Items applied

### Strikes — anchor on `main` confirms shipped

1. **M3.md L188 `make doc-check`.** `scripts/doc-check.sh` header reads
"verify every Test\*/Fuzz\*/Benchmark\* name referenced in docs"; wired
into `make doc-check` and `make ci`.
2. **M8.md L103 `docs/HARDWARE-TESTING.md` libdcgm + nv-hostengine
setup.** File exists (28 hits for libdcgm/dcgm/nv-hostengine); covers
Ubuntu 22.04 driver / `libdcgm-dev` / `nv-hostengine` provisioning,
x86_64 + aarch64-SBSA build matrix, and the `//go:build dcgm,hardware`
distinction. Doc shipped ahead of cgo client to unblock GPU-less
contributors.
3. **M19.md L18 `nodeWatchErrCount` not in SnapshotCounters.** Closed by
item 4 below, which added the `NodeWatchErrors` field symmetrically.

### Test-only struct add

4. **components/receivers/k8sevents/export_test.go.** Added
`NodeWatchErrors int64` field on `CountersForTest`; `SnapshotCounters`
now reads `rr.nodeWatchErrCount.Load()` symmetrically with
`rr.watchErrCount.Load()`. Both call sites are keyed-init inside the
same file; no external positional callers to break (grep confirmed: only
2 hits, both in `export_test.go`).

### Bash test add (M23 grep-gate format lock)

5. **scripts/no-autoupdate-check_test.sh "hit-line-format-stable".** New
assertion that runs the gate against the hyphenated-go-update fixture,
captures stdout (existing tests discard it), and asserts at least one
line matches `^[^:]+:[0-9]+:`. Locks the parseable hit-line shape
*before* the first automation consumer (CI summary, dashboard, Slack
notifier) wires up — a cosmetic tweak to the gate's message body now
fails CI instead of silently breaking downstream parsers. M23.md row
struck.
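
The format lock itself reduces to one regex check. The sketch below is hypothetical and is not the assertion in `scripts/no-autoupdate-check_test.sh`: the function name `has_hit_line` is an assumption, and only the pattern `^[^:]+:[0-9]+:` comes from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical assertion: succeed only if at least one line on stdin has the
# parseable "path:line:" prefix that downstream consumers would rely on.
has_hit_line() {
  grep -Eq '^[^:]+:[0-9]+:'
}

# Usage (the gate script name comes from the test file's name):
#   bash scripts/no-autoupdate-check.sh 2>/dev/null | has_hit_line \
#     || echo "hit-line format changed" >&2
```

Because the pattern is anchored at the start of the line, a message-body tweak that drops the `path:line:` prefix fails the check even if the rest of the text is unchanged.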

### Doc-only clarifications

6. **M15.md L185 falsifying-check backfill.** Anchored the
"/var/lib/tracecore/ subdir governance" row's grep-falsifying-check to
RFC-0010 §Proposal — `docs/rfcs/0010-containerstdout-receiver-scope.md`
L177/L217/L274/L393/L407 already carry the convention ("M15 owns
`/var/lib/tracecore/container_stdout/`. Future siblings reserve their
own subdirectories."). Row marked `[x]`.

7. **M15.md L192 + RFC-0010 §Pod-attribution forward-pointer.** Appended
one-line cross-reference at RFC-0010 L158 → `docs/followups/M15.md`
"Cross-receiver rank-label reconciliation" so the deferred audit trail
is discoverable from the RFC. Row marked `[x]`.

8. **M8.md L30 `tracecore debug dump` partial-ship.**
`cmd/tracecore/debug.go::runDebugDump` already writes version + revision
+ branch + build date + Go runtime stats + registered components +
redacted config to `tracecore debug dump > diagnostic.txt`. Remaining
gap is "last N samples" — needs receiver-side ring buffer (M2
carry-forward). Row kept open with partial-ship line +
remaining-trigger.

9. **M3.md L153 SUPPLY-CHAIN-IDENTITY.md scope clarification.** Added
one sentence noting the consolidation is a copy-and-deduplicate pass
against existing `release.yml` comment blocks (cosign-sign-blob,
gh-attestation-sign), not net-new authoring — so the next reader sees
the actual scope of work, not a misleading "30-min write" estimate that
implies green-field.

10. **otlphttp.md L182 workflow paths audit + M14.md L88 test pointer.**
- **otlphttp**: inlined audit findings (2026-05-20). `chart.yml` and
`install-bench.yml` are substrate-aware (include `cmd/tracecore/**`,
`internal/**`); `kernelevents-integration.yml` and
`pyspy-integration.yml` cover only `components/receivers/<name>/**` +
`internal/runtime/lifecycle/**` — a `cmd/tracecore` factory wiring or
`internal/pipeline` contract change can land without re-running these
integration jobs. `chaos.yml` covers `tools/failure-inject/**` +
`internal/synthesis/**` only (indirect coupling, acceptable). Remaining:
6-line YAML edit per integration workflow.
- **M14**: added inline pointer from the multi-retry slow-write fixture
row to the existing single-retry baseline at
`components/receivers/kineto/shutdown_test.go::TestIngest_RetryOnTruncated`
so the future author has the test-shape anchor.

## Files changed

| File | LOC | Kind |
|---|---|---|
| `components/receivers/k8sevents/export_test.go` | +2 | test struct
field |
| `scripts/no-autoupdate-check_test.sh` | +20 | bash test add |
| `docs/rfcs/0010-containerstdout-receiver-scope.md` | +1/-1 | inline
cross-ref |
| `docs/followups/M3.md` | +9/-5 | strike + scope clarification |
| `docs/followups/M8.md` | +16/-5 | strike + partial-ship |
| `docs/followups/M14.md` | +1/-1 | test pointer |
| `docs/followups/M15.md` | +15/-8 | 2 strikes |
| `docs/followups/M19.md` | +5/-9 | strike (anchored to test add) |
| `docs/followups/M23.md` | +9/-7 | strike |
| `docs/followups/otlphttp.md` | +13/-1 | audit findings inline |

## Test plan

- [x] `go test ./components/receivers/k8sevents/...` green.
- [x] `bash scripts/no-autoupdate-check_test.sh` 10/10 assertions pass
(added "hit-line-format-stable" — the 10th).
- [x] `bash scripts/doc-check.sh` green (437 markdown links resolve,
em-dash + en-dash diff gate clean, comment-noise diff gate clean).
- [x] Pre-commit hook ran full `make check` + `make ci` (all package
tests cached/passing).
- [ ] CI green on this branch.

## Release notes

```release-notes
NONE
```

## Sequencing

Builds on `main` after PRs #132 (shard split), #133 (RUNBOOK +
chart-appversion), #142 (opportunistic curation), #134 (chaos.yml row),
#143 (cross-shard audit). Independent of currently-open PRs #144 (m6
integration recipes) and #145 (m3 GHCR image publish).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tri Lam added 3 commits May 20, 2026 21:20
Documents the actually-correct branch sync policy. Prior implicit
practice was to rebase feature branches against origin/main, on the
theory that `required_linear_history` on main forced it. It does
not - that setting only governs how PRs land on main, and tracecore
squash-merges every PR (which collapses any feature-branch merge
commits anyway).

Cost of rebasing pushed feature branches: every rebase rewrites
every commit SHA, so reviewers who have already loaded the PR lose
their "show changes since last review" position. Force-push (even
--force-with-lease) requires explicit auth per session. During
long-running features the main branch moves often, multiplying the
cost on every catch-up.

Cost of merging: one merge commit on the feature branch, which the
squash button collapses on the way into main. Plain fast-forward
push. Reviewer position preserved.

This PR demonstrates the policy by resolving its own conflict via
`git merge origin/main` (next commit).

Signed-off-by: Tri Lam <tri@maydow.com>
Sync feature branch with main per the merge-not-rebase policy
documented in CONTRIBUTING.md (commit ddf86f7).

Main moved 6 PRs ahead during this branch's lifetime:
- PR #143 (followups sweep)
- PR #134 (chaos.yml pattern-pod-evicted)
- PR #142 (follow-up curation)
- PR #144 (M6 integration recipes)
- PR #146 (kineto MaxEvents stub)
- PR #147 (followups bundle)

Conflicts expected in CHANGELOG.md and docs/followups/M3.md
(both additive).

# Conflicts:
#	CHANGELOG.md
Two cross-cutting hardening items the self-review surfaced; both close
gaps that were one-line-per-job to fix but kept the surface incomplete.

1. `timeout-minutes` on every release.yml job (build=20, sbom=15,
   sign=10, provenance=10, image=20, release=10). GitHub's default
   360m / 6h ceiling no longer applies — a stuck Sigstore call or
   wedged push fails fast inside the cap rather than burning a full
   runner-hour. Caps picked at 2-4x observed wall-clock to tolerate
   transient ghcr/Sigstore weather without tripping on healthy runs.

2. `cosign verify-attestation --type slsaprovenance1` smoke check
   added to the image job after `attest-build-provenance` pushes the
   SLSA v1 attestation to the registry. Uses the same identity binding
   the manifest-signature verify enforces (refs/tags/$TAG +
   release.yml workflow path + push trigger), so every artifact this
   pipeline publishes — binary blob, image manifest, image provenance —
   is now CI-verified inside the same run that produced it, against
   the same identity claims a third-party verifier reproduces offline.

docs/followups/M3.md updates:
- Close the `timeout-minutes` row (was P3 followup).
- Close the new image-attestation-CI-verify row (was implicit in the
  attestation-push step but unverified).
- Add explicit "Out of scope for M3" section rowing multi-arch image
  build, container vuln-scan gate, and image SBOM sub-attestation,
  so a future audit can find them without commit archaeology.

Gates: actionlint clean on release.yml, make doc-check 0 (all
release-doc-parity and gh-attestation-flag-lint pass).

Signed-off-by: Tri Lam <tri@maydow.com>
@trilamsr trilamsr merged commit 86fd2f6 into main May 21, 2026
13 checks passed
@trilamsr trilamsr deleted the worktree-m3-ghcr-image-publish branch May 21, 2026 06:33