chore(pivot): PR-E unblock — bench heartbeat to hostmetricsreceiver#180

Merged
trilamsr merged 2 commits into main from chore/pivot-pr-e-hostmetrics
May 31, 2026

@trilamsr trilamsr commented May 31, 2026

Contributor

What this PR does

Unblocks RFC-0013 PR-E. Swaps the install-bench heartbeat from the in-tree clockreceiver to upstream hostmetricsreceiver (loadscraper @ 1s), adds hostmetricsreceiver to builder-config.yaml, and adds an opt-in receivers.hostmetrics block to chart values.yaml (default disabled — chart default stays clockreceiver this release; see deferral below).

Root cause

RFC-0013's distribution-first intent is "no custom receiver where upstream satisfies." clockreceiver is a custom heartbeat receiver doing what hostmetricsreceiver's loadscraper does upstream. The root-cause fix is the swap.

The originally planned upstream replacement, telemetrygeneratorreceiver, does not exist in opentelemetry-collector-contrib at any tag from v0.95.0 through v0.130.0 (verified 2026-05-30 against the GitHub tree API). Two community proposals, contrib issues #41687 and #43657, were both closed not_planned.

hostmetricsreceiver (already shipping in contrib v0.110.0, our OCB pin) satisfies the bench's pass condition (first parseable JSON line at the sink — see bench/install/run.sh) with a low-cardinality real signal: system.cpu.load_average.{1m,5m,15m} (3 series, constant shape). Re-evaluation trigger: a generator-shaped receiver landing in contrib at any future tag.
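For illustration, a collector config matching this description might look roughly like the sketch below. It is not the file shipped in this PR: the helper name `write_bench_config`, the temp-file location, and the otlphttp endpoint are assumptions made for the example. Only `collection_interval` and the `load` scraper are taken from the upstream hostmetricsreceiver configuration.

```shell
# Hypothetical helper: writes a minimal heartbeat config to the given path.
write_bench_config() {
  cat > "$1" <<'EOF'
receivers:
  hostmetrics:
    collection_interval: 1s      # loadscraper @ 1s, as described above
    scrapers:
      load:                      # emits system.cpu.load_average.{1m,5m,15m}
exporters:
  otlphttp:
    endpoint: http://sink.example:4318   # assumed endpoint, not from the PR
service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [otlphttp]
EOF
}

cfg="$(mktemp)"
write_bench_config "$cfg"
echo "wrote $cfg"
```

Enabling only the `load` scraper is what keeps the series count small and constant in shape.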

Why the deferred work is a scope split, not a workaround

components/receivers/clockreceiver/ has ~92 in-tree references across cmd/tracecore/*_test.go, internal/pipeline, internal/selftelemetry, internal/config/fuzz_test.go, internal/pipelinebuilder/fuzz_test.go, chart CI fixtures, STYLE.md examples, and CONTRIBUTING.md templates. It doubles as the canonical example receiver across the codebase. Deleting the source and flipping the chart default are two separate operator-visible changes:

  1. Chart-default flip (receivers.clockreceiver → receivers.hostmetrics) needs a values-keys NOTES.txt deprecation cycle per RFC-0013 §8.
  2. Source deletion needs coordinated migration of ~92 test-fixture references.

Bundling both into PR-K (in-tree-receiver deletion wave) means one coordinated cut + one deprecation cycle. Splitting into this PR means two operator-visible flips. Per priority_order (UX first), the bundled-cut path wins.

Cross-references updated:

  • RFC-0013 §4 v0.1.0 / v0.2.0 release rows
  • RFC-0013 §7 deletion table (clockreceiver row)
  • RFC-0013 §migration PR-E (full rationale) + PR-F (clock dropped from list) + PR-K (clock added + chart-default flip)
  • docs/migration/v0.1-to-v0.2.md heartbeat-primitive row
  • CHANGELOG.md Wave 3 entry + v0.1.0 deletions list
  • bench/install/README.md schema notes + tick-aliasing caveat (per adversarial-review)

Adversarial-review findings addressed (commit a3c5432)

Pre-merge adversarial review surfaced:

  • Field-name semantic drift (clockreceiver_interval_seconds in bench result JSON describes a hostmetrics interval after this PR). Resolved by documenting the historical-name reality in bench/install/README.md schema section + tick-aliasing caveat; full rename gated on a schema-v2 bump in PR-K (alongside the chart-default flip — single coordinated operator-visible change).
  • RFC §4 v0.1.0 row ambiguity vs CHANGELOG. Tightened to spell out bench-file scope + PR-K deferral target explicitly.
  • Loadscraper-@-1s unverified on darwin. Verified locally: OCB-built binary + minimal hostmetrics @ 1s config emits 3 datapoints (system.cpu.load_average.{1m,5m,15m}) at 1.002s after pipeline start, no errors / no warnings, clean shutdown on SIGINT.

Release notes

NONE

(No operator-visible default-rendered-config change this release. hostmetricsreceiver is opt-in via receivers.hostmetrics.enabled: true. The chart's default render is byte-identical to prior versions.)

Test plan

  • make verify green (golangci-lint, vet, mod-verify, fmt, tidy, doc-check, alert-check, chart-appversion-check, no-autoupdate-check)
  • make build-ocb green — OCB compiles tracecore binary with hostmetricsreceiver bundled
  • Hostmetrics @ 1s loadscraper smoke-tested locally on darwin/arm64 against the OCB-built binary
  • helm lint install/kubernetes/tracecore green
  • helm template tracecore install/kubernetes/tracecore -f bench/install/tracecore-values.yaml renders the expected hostmetrics pipeline (loadscraper @ 1s, otlphttp exporter)
  • helm template tracecore install/kubernetes/tracecore (defaults) renders byte-identical to prior versions — backward-compat preserved for existing operators
  • Install-bench CI gate (.github/workflows/install-bench.yml) — first run against this branch should confirm the hostmetrics pipeline produces a parseable JSON line at the sink within the timeout window
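The byte-identical default-render check in the test plan could be scripted along these lines. This is a sketch only: `assert_render_identical` is a hypothetical helper, and the worktree path in the usage comment is an assumption about how the baseline render would be obtained.

```shell
# Hypothetical helper: succeeds only if two rendered manifests are
# byte-identical; prints a unified diff to stderr otherwise.
assert_render_identical() {
  local before="$1" after="$2"
  if cmp -s "$before" "$after"; then
    echo "default render unchanged"
  else
    echo "default render differs:" >&2
    diff -u "$before" "$after" >&2 || true
    return 1
  fi
}

# Usage sketch (needs helm and a checkout of the base revision):
#   git worktree add /tmp/base origin/main
#   helm template tracecore /tmp/base/install/kubernetes/tracecore > before.yaml
#   helm template tracecore install/kubernetes/tracecore > after.yaml
#   assert_render_identical before.yaml after.yaml
```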

Tri Lam added 2 commits May 30, 2026 19:18
clockreceiver -> hostmetricsreceiver (loadscraper @ 1s) in builder-config.yaml +
bench/install/tracecore-values.yaml. Adds opt-in `receivers.hostmetrics` block
to chart values.yaml (default disabled; chart default stays clockreceiver this
release per RFC-0013 §migration PR-E rationale).

The originally-planned telemetrygeneratorreceiver does not exist in
opentelemetry-collector-contrib at any tag (verified 2026-05-30; contrib issues
#41687 and #43657 both closed `not_planned`). hostmetrics' loadscraper emits 3
low-cardinality series (system.cpu.load_average.{1m,5m,15m}) at the cadence the
bench's pass condition needs (first parseable JSON line at the sink).

RFC-0013 §4 + §7 + §migration PR-E updated. CHANGELOG + migration guide row
filled. Chart-default flip + clockreceiver source-deletion (~92 in-tree
fixture references) deferred to PR-K alongside coordinated test-fixture
migration + NOTES.txt deprecation cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: Tri Lam <tri@maydow.com>

Address adversarial-review findings on #180:

- bench/install/README.md: document `clockreceiver_interval_seconds`
  field is a historical schema-v1 name; semantics still correct
  (heartbeat-receiver emit period). Schema v2 will rename alongside
  PR-K chart-default flip. Also update tick-aliasing caveat to
  name both receivers.
- docs/rfcs/0013-distro-first-pivot.md §4: tighten v0.1.0 release
  row to match CHANGELOG wording ("bench/install/tracecore-values.yaml"
  scope explicit; "source delete deferred to v0.2.0 / PR-K").

Hostmetrics @ 1s loadscraper verified locally on darwin/arm64
against the OCB-built binary: 3 datapoints
(system.cpu.load_average.{1m,5m,15m}) emit 1.002s after pipeline
start, no errors, no warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: Tri Lam <tri@maydow.com>
@trilamsr trilamsr enabled auto-merge (squash) May 31, 2026 02:25
@trilamsr trilamsr merged commit 2cd6b81 into main May 31, 2026
14 of 15 checks passed
@trilamsr trilamsr deleted the chore/pivot-pr-e-hostmetrics branch May 31, 2026 02:28
trilamsr added a commit that referenced this pull request May 31, 2026
## Root cause

PR #180 (`chore(pivot): PR-E unblock — bench heartbeat to
hostmetricsreceiver`) enabled the `hostmetrics` receiver in
`bench/install/tracecore-values.yaml` and added it to
`builder-config.yaml`, but the install-bench `Dockerfile` was still
building from `./cmd/tracecore`. The generated
`cmd/tracecore/components.go` only registers in-tree receivers —
`hostmetricsreceiver` is upstream OTel-contrib and is only bundled by
the OCB-assembled binary at `_build/tracecore`. The daemonset pod failed
config load with `unknown component type hostmetrics`, `kubectl rollout
status` timed out at 5 m, and `bash -e` aborted `run.sh` before any
diagnostics fired — so CI showed a bare red with no actionable log.

`install-bench` has been red on `main` since 2026-05-31T02:28:40Z and on
every PR opened after #180.

**Affected PRs (open as of this writing)**: #186, #187, #188, #189.

## Fix

Switch `install/kubernetes/tracecore/Dockerfile` to build via OCB:

```
make build-ocb   # generates ./_build/{main.go,go.mod,...} and compiles
# re-link statically with our flags so the distroless base can exec the binary
cd _build && CGO_ENABLED=0 go build -trimpath -ldflags "-s -w" .
```

The re-link with our flags guarantees the static binary the distroless
base can exec; OCB's intermediate compile uses its own defaults. The
final image still uses `gcr.io/distroless/static-debian12:nonroot` at
the same pinned digest.

This is a **tactical bridge**: PR-A2 (#189) makes `_build/tracecore` the
canonical binary for all builds. Once that lands, the in-tree
`cmd/tracecore` path retires entirely (RFC-0013 PR-F) and this
Dockerfile change becomes the new normal across every image, not just
install-bench.

### Why not the alternatives?

- **Revert hostmetrics in bench values** → walks back PR #180's pivot
intent ("no custom receiver where upstream satisfies"); the legacy
`clockreceiver` is on its way out in PR-K.
- **Add hostmetricsreceiver to `cmd/tracecore/components.go`** →
diverges the in-tree component list from the OCB-managed one; the whole
point of PR-A2 is to delete that divergence.

## Bonus: surface root cause on rollout-status failure

`bench/install/run.sh` had post-deadline diagnostics for the first-data
path, but the rollout-status path (the actual failure mode of this
regression) just exited via `set -e`. Added `dump_failure_diagnostics()`
(pod state, `kubectl describe`, current + previous container logs,
rendered config) wired to both failure paths; refactor eliminates the
duplicated tracecore-pod spelunking that lived inline. Future
regressions surface root cause in the CI log without re-running.
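The shape of that change can be sketched as below. This is not the actual `run.sh` code: the namespace, label selector, DaemonSet name, and the `wait_for_rollout` wrapper are assumptions chosen for illustration. The point it shows is that testing `kubectl rollout status` inside an `if` keeps `set -e` from aborting the script before the diagnostics run.

```shell
# Assumed values; the real script may use different names.
NAMESPACE="${NAMESPACE:-tracecore}"
SELECTOR="${SELECTOR:-app.kubernetes.io/name=tracecore}"

dump_failure_diagnostics() {
  # Each command is best-effort so one failure does not hide the rest.
  echo "=== pod state ==="
  kubectl -n "$NAMESPACE" get pods -l "$SELECTOR" -o wide || true
  echo "=== describe ==="
  kubectl -n "$NAMESPACE" describe pods -l "$SELECTOR" || true
  echo "=== current container logs ==="
  kubectl -n "$NAMESPACE" logs -l "$SELECTOR" --tail=200 || true
  echo "=== previous container logs ==="
  kubectl -n "$NAMESPACE" logs -l "$SELECTOR" --previous --tail=200 || true
  echo "=== rendered config ==="
  kubectl -n "$NAMESPACE" get configmap -l "$SELECTOR" -o yaml || true
}

wait_for_rollout() {
  # A command tested by `if` does not trigger `set -e`, so the dump
  # always runs on failure.
  if ! kubectl -n "$NAMESPACE" rollout status daemonset/tracecore --timeout=5m; then
    dump_failure_diagnostics
    return 1
  fi
}
```

With this shape, a config-load failure such as `unknown component type hostmetrics` would appear in the previous-container logs section of the CI output.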

## Verification

```
$ make check
… 0 issues, all modules verified

$ docker build -f install/kubernetes/tracecore/Dockerfile -t tracecore:bench-test .
… exporting to image done

$ docker run --rm tracecore:bench-test components | grep -E "hostmetrics|otlp"
    - name: hostmetrics
      module: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver v0.110.0
    - name: otlp
      module: go.opentelemetry.io/collector/receiver/otlpreceiver v0.110.0
    - name: otlphttp
      module: go.opentelemetry.io/collector/exporter/otlphttpexporter v0.110.0
```

End-to-end install-bench (kind cluster + helm install) runs on this PR
via the workflow itself.
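The manual `grep` above could also be turned into a pass/fail check. The following is a hedged sketch: `require_components` is a hypothetical helper, and it assumes the `components` subcommand keeps printing one `- name: <component>` line per component, as in the output shown.

```shell
# Hypothetical helper: reads `components` output on stdin and fails if any
# of the named components is missing from it.
require_components() {
  local listing name missing=0
  listing="$(cat)"
  for name in "$@"; do
    # Anchor the match so that "otlp" does not match "otlphttp".
    if ! printf '%s\n' "$listing" |
        grep -Eq "^[[:space:]]*- name: ${name}[[:space:]]*\$"; then
      echo "missing component: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage sketch:
#   docker run --rm tracecore:bench-test components |
#     require_components hostmetrics otlp otlphttp
```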

## Cost

Docker build stage adds ~100 s (OCB compile inside Alpine). Bench Docker
rebuild only fires on chart/bench/builder-config changes — acceptable.

```release-notes
[CI] install-bench Dockerfile now builds via OpenTelemetry Collector Builder so the bench daemonset can load hostmetricsreceiver; also dumps pod state, logs, and rendered config on rollout-status failure. Unblocks every PR opened after #180.
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

Reconcile the four pivot-tracking docs
(`docs/rfcs/0013-distro-first-pivot.md`, `CHANGELOG.md`,
`MILESTONES.md`, `docs/migration/v0.1-to-v0.2.md`) with the wave-3
(PR-B1-shape sibling ports) and wave-4 (PR-B2-shape upstream-only ports
+ PR-F.1 + PR-J + PR-L + PR-N) landings. Pure doc sweep — no code or
config touched.

## What changed

### `docs/rfcs/0013-distro-first-pivot.md` §migration

PR sequence rows updated with PR-number citations and landed markers:

- **PR-A2** (landed, #189, 2026-05-30)
- **PR-B2** (landed, #201) — also enumerates sibling-receiver follow-ups
under PR-B2 to dispel the slug collision with #188's PR-B2-labelled dcgm
port: stdoutexporter (#202), pyspy (#203), kernelevents (#208),
containerstdout (#209)
- **PR-F.1** (landed) — fleshed-out delete list
(`internal/{selftelemetry,telemetry}` + `components/receivers/dcgm/` +
`pkg/dcgm/` + one orphan clockreceiver integration test)
- **PR-F.2** re-scoped — now deletes the whole
`internal/{componentstatus,pipeline,pipelinebuilder,consumer,fanout,runtime/lifecycle}`
bundle in one cut once the last three pipeline+consumer-importing
receivers land (#204 k8sevents, #205 clockreceiver, #207 otlphttp). Per
the import-graph state — `internal/componentstatus`'s only non-test
consumer is `internal/pipeline`, so they delete together
- **PR-G** (landed, #182), **PR-H** (landed, #183)
- **PR-I.1a** (in flight — scaffold agent), **PR-I.1b** (pre-staged;
gate satisfied by #201)
- **PR-J** (landed, #195) — kept existing marker
- **PR-K.1** (in flight — separate agent landing)
- **PR-L** (landed, skeleton #179 + body #191) — flagged as living
document
- **PR-N** (landed, #200) — shipped at v0.1.0 ahead of v0.3.0 as a
doc-only update at `docs/migration/v0.2-to-v0.3.md`

### `CHANGELOG.md` [Unreleased]

- Restructured the pivot wave list as **four waves** (was three). Wave 3
enumerates PR-B1-shape sibling ports + support infra (#180-#194/#196).
Wave 4 enumerates PR-B2-shape upstream-only ports + PR-J (#195) + PR-F.1
(#206) + PR-N (#200) + lint/TOCTOU hardening (#198/#210).
- Tightened the PR-F.2 deferred note to point at the three open ports
(#204/#205/#207) as the gate.

### `MILESTONES.md`

- **M1** (pipeline runtime) — status row now cites PR-A2 (#189), PR-F.1
(#206), PR-F.2 gate (#204/#205/#207), PR-E (#180), retains
`internal/config/` (still load-bearing for `tracecore validate`).
- **M2** (self-telemetry) — status row now cites PR-F.1 (#206); flags
`internal/componentstatus` as travelling with `internal/pipeline` in
PR-F.2.
- **M8** (DCGM receiver) — status flipped to *landed-and-replaced*:
cites PR-F.1 (#206) deletion + PR-J (#195)
`docs/integrations/prometheus-scrape.md` recipe. Notes the inert chart
toggle retention until PR-K.3.

### `docs/migration/v0.1-to-v0.2.md`

- §`internal/*` package deletion (PR-F) status flips from "not yet open"
to "PR-F.1 landed (#206), PR-F.2 gated on three open ports".
- Open-items checklist expanded from 5 to 13 entries — tracks every PR
letter the migration guide cares about (A2 / E / F.1 / F.2 / I.1a-c / J
/ K.1-3 / L / N) with PR numbers and links.

## Why now

Tracking docs accumulated drift across wave-3 + wave-4 because every
sibling-port PR (and the support-infra PRs around them) updated the
bottom of `CHANGELOG.md` but did not always touch the upstream
sequencing section in RFC-0013. Per memory rule `[Keeping this document
current]`: status drift is a review blocker. This PR is the consolidated
catch-up; future port PRs include their RFC-row flip in-PR.

## What this PR does NOT change

- No code, no config, no YAML, no chart — only the four tracking docs.
- No new doc gates added; existing gates pass.
- No files other than the four named docs are modified.

## Test plan

- [x] `bash scripts/doc-check.sh` clean (33 test refs, 528 links
resolve, comment-noise diff gate clean vs `origin/main`, all 13 gates
green).
- [x] Pre-commit hook (`commitlint` 72-char subject limit + DCO +
AI-trailer gates) passed.
- [x] Pre-push hook (`make ci-fast` equivalent: `golangci-lint`, `go
vet`, `go mod verify`, `no-autoupdate-check`, `doc-check.sh`) passed on
second attempt after `git fetch origin main` populated the worktree's
`origin/main` ref — first push failed because the worktree previously
tracked the (gone) `pr-a2-ocb-main-swap` branch, so `doc-check.sh`'s
comment-noise diff-scope gate exited 128 on the missing ref. Root cause
fixed by the fetch; not a workaround.
- [ ] CI green on this branch.
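The missing-ref failure described in the pre-push item could be guarded against with a check like the one below. It is a sketch under assumptions: `ensure_base_ref` is a hypothetical helper and is not part of `doc-check.sh`; it only relies on `git rev-parse --verify --quiet` and `git fetch`.

```shell
# Hypothetical guard: make sure the diff base exists locally before any
# diff-scoped gate runs against it.
ensure_base_ref() {
  local remote="${1:-origin}" branch="${2:-main}"
  if ! git rev-parse --verify --quiet \
      "refs/remotes/${remote}/${branch}" >/dev/null; then
    echo "fetching missing ref ${remote}/${branch}" >&2
    git fetch "$remote" "$branch"
  fi
}

# Usage sketch, before running the doc gates:
#   ensure_base_ref origin main && bash scripts/doc-check.sh
```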

```release-notes
NONE
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

RFC-0013 §migration PR-K.2: delete the four in-tree receivers
(`clockreceiver`, `kernelevents`, `k8sevents`, `containerstdout`)
plus the `xidgen` failure-injector (whose sole consumer was
`kernelevents`'s wire shape per RFC §migration L253). PR-K.1
(#211) just severed `internal/synthesis/patterns/` + `replay` from
`k8sevents`, unblocking the source-tree cut. PR-J (#195) shipped
the four upstream-OTel recipes that replace the in-tree receivers
in the bundled Helm-chart pipeline; this PR retires the
behind-the-curtain code.

Chart cleanup (values keys, DaemonSet template refs, `NOTES.txt`
deprecation warnings) intentionally stays in PR-K.3 so operators
get one minor of deprecation telemetry before the values shape
breaks.

> **Branch state note:** PR-F.2 (#215,
`internal/{componentstatus,pipeline,pipelinebuilder,config,consumer,fanout,runtime/lifecycle}`
deletion) and PR-I.1a (#214, `module/` Go submodule scaffold +
`go.work`) both landed on main mid-flight and have been merged into this
branch. The `_test.go` placeholder-name migration originally scoped here
(RFC §migration's "~86 fixture refs" line) is now moot — PR-F.2 deleted
every `internal/*` file that held those refs, so the migration target
evaporated. `make ci` + `make verify` + `make build` re-run green
against the merged tree.

```release-notes
[CHANGE] In-tree receivers clockreceiver/kernelevents/k8sevents/containerstdout deleted in favor of PR-J upstream-OTel recipes. xidgen failure-injector deleted alongside kernelevents (sole consumer).
```

## What lands

### Deletions — receivers (4)

| Path | Files | LOC | Replacement (PR-J recipe) |
|---|---|---|---|
| `components/receivers/clockreceiver/` | 10 | ~1.4k | `hostmetricsreceiver` (loadscraper @ 1s); PR-E landed in #180. RFC-0013 §migration originally named `telemetrygeneratorreceiver` but that receiver does not exist in opentelemetry-collector-contrib (contrib #41687 + #43657 both closed `not_planned`). |
| `components/receivers/kernelevents/` | 49 | ~13.2k | [`journaldreceiver` + `filelogreceiver` (kmsg) + OTTL Xid transform](../blob/main/docs/integrations/journald-kernel.md). Customer-stable `kernelevents.xid` + `gpu.id` attributes preserved via the OTTL transform per RFC-0013 §3. |
| `components/receivers/k8sevents/` | 37 | ~6.8k | [`k8sobjectsreceiver` (watch mode on `events`) + OTTL `k8s.event.hint` transform](../blob/main/docs/integrations/k8sobjects-events.md). 11-entry hint enum preserved via the OTTL transform per RFC-0013 §3. The typed `internal/synthesis/patterns/Record` + `NodeRecord` (severed in PR-K.1) keeps M19's pod-evicted detector pinned. |
| `components/receivers/containerstdout/` | 56 | ~6.4k | [`filelogreceiver` + container stanza + `file_storage` extension](../blob/main/docs/integrations/filelog-container.md). Per-rank attribution + dataloader-timing extraction move to OTTL transforms in the bundled recipe per RFC-0013 §3. |

### Deletions — supporting infra

| Path | Why |
|---|---|
| `tools/failure-inject/xidgen/` (2 files, 281 LOC) | Sole consumer was the kernelevents wire shape per RFC-0013 §migration L253. Operators inject NVRM Xid via real `/dev/kmsg` (`sudo tee`) or `systemd-cat` against the journald recipe. |
| `install/kubernetes/tracecore/ci/containerstdout-on-values.yaml` (41 LOC) | Chart-render fixture for the deleted receiver. Remaining chart fixtures (`all-receivers-off-values.yaml`, `one-receiver-on-values.yaml`, `pyspy-on-values.yaml`) untouched; their `clockreceiver: enabled: false` / `kernelevents: enabled: false` rows survive to PR-K.3's chart cleanup. |
| `.github/ISSUE_TEMPLATE/component-bug-kernelevents.yml` | Receiver-specific bug template; no surviving receiver. |

### failure-inject CLI surface

`failure-inject xid --code=N [--format=…] [--count=N]` removed.
`failure-inject {pod-evict,nccl-hang,cpu-steal}` unchanged.
`tools/failure-inject/testdata/golden.sha256` drops the two `xid`
golden rows. `.github/workflows/chaos.yml` drops the two
two-run-determinism steps that exercised the xid byte-determinism
contract; pod-evict's determinism + the golden-SHA replay loop
survive.

### Tooling shed

- `Makefile`: retire the `test-extras-sustained` body (it was
  kernelevents-only and is now `@true`; the target is retained so
  downstream automation has a stable name and future sustained-load
  suites can slot back in). Retire `test-extras-fuzz-kmsg` /
  `test-extras-fuzz-journald`; the `test-extras-fuzz` loop drops to
  nccl-fr only. Drop the kernelevents row from `test-extras-race`.
  Empty the `bench-check` for-loop (k8sevents was the only baseline;
  PR-F.2 already rewrote the comment block to reflect this, and the
  merge keeps both edits aligned).
- `go.mod`: sheds
  `k8s.io/{api,apimachinery,client-go,klog/v2,kube-openapi,utils}`,
  `sigs.k8s.io/{json,randfill,structured-merge-diff/v6,yaml}`, and
  `gopkg.in/{evanphx/json-patch.v4,inf.v0}`, the dep cluster that
  k8sevents dragged in. `go.uber.org/goleak` is also dropped
  post-merge (it was held only by `internal/pipeline/chaos_test.go`,
  which PR-F.2 deleted).

### Doc + comment sweep

Comment-only references to deleted receivers in
`components/exporters/{otlphttp,stdoutexporter}/`,
`components/receivers/{nccl_fr,pyspy}/`,
`internal/synthesis/patterns/{doc,model,verdict}.go` rewired to
surviving references (or to upstream recipe pointers).
`docs/README.md` per-component-docs table drops five dead links
(caught by `doc-check.sh`'s rotten-link gate — that is in fact
how I caught the last drift). `bench/install/README.md`
tick-alias note + schema-v2-rename note updated.
`tools/failure-inject/README.md` xid section removed; status
banner rewritten.

CHANGELOG (new PR-K.2 entry under Unreleased + wave-4 paragraph
re-balanced), MILESTONES (M1 + M9 + M10 + M15 status lines flipped
from "DELETED at v0.2.0" → "DELETED in PR-K.2" with file pointers
to the integration recipes), `docs/migration/v0.1-to-v0.2.md`
(PR-K.1 + PR-K.2 checkboxes flipped, PR-F.2 + PR-I.1a status
block updated post-merge), AGENTS.md (queued-for-deletion
paragraph updated to current reality) all swept.

### What evaporated mid-flight

Originally this PR was also going to migrate a handful of
`_test.go` files that held placeholder-name string references
to `clockreceiver` (in `internal/pipeline/saferun_test.go`,
`internal/config/fuzz_test.go`,
`internal/pipelinebuilder/fuzz_test.go`).
PR-F.2 (#215) deleted those files outright while this branch
was being drafted, so the migration target disappeared. No
follow-up needed.

## Net LOC delta

```
192 files changed, 120 insertions(+), 28,927 deletions(-)
```

## What is intentionally NOT in this PR

- **Helm-chart `receivers.clockreceiver` / `receivers.kernelevents`
  / `receivers.containerstdout` toggles + DaemonSet template refs
  + `containerstdout-rbac.yaml` template** — stays for PR-K.3 so
  operators see `NOTES.txt` deprecation warnings before the values
  shape breaks. The toggles are already inert post-PR-A2 (enabling
  any of them in `values.yaml` crashes the OCB binary at boot
  because the factories are not registered) — keeping them as
  no-ops for one minor preserves operator UX.
- **`internal/{componentstatus,pipeline,…}/` deletion** — already
  done in PR-F.2 (#215) before this branch landed; merged in.
- **Chart `values.yaml` `# clockreceiver — in-tree heartbeat
  retired by RFC-0013 PR-A2` style retire-banners** — stays for
  PR-K.3 alongside the actual values-keys cleanup.
- **`tools/failure-inject/ncclhang/`** — KEPT. Used by
  `pkg/nccl/fr_parser/synthesize_test.go` +
  `bench/overhead/nccl_fr_bench_test.go`; this is the canonical
  example of a failure-injector that survives the v0.2.0 cut.

## Root cause

The four receivers + xidgen survived only because PR-K.1's
pattern-library severance from `k8sevents` was still in flight. With #211
merged at the start of this session, the deletion is unblocked.
There is no workaround being applied here — this PR is the
root-cause deletion of the in-tree receivers themselves, which
RFC-0013 §migration set as the v0.2.0 deletion target.

## Test plan

- [x] `make ci` green post-merge (verify + license + nccl-fr-rce-gate +
      register-lint + actionlint + zizmor + coverage-check +
      ci-fuzz-nccl-fr + govulncheck + doc-check + no-autoupdate-check +
      build).
- [x] `make verify` green post-merge.
- [x] `make build` green post-merge (OCB compile against
      `builder-config.yaml` with `GOWORK=off` per the PR-I.1a
      isolation guard yields `./_build/tracecore`).
- [x] `go test ./...` green across the post-merge tree.
- [x] Hard pre-flight: zero external Go importers for any
      deletion target (`clockreceiver`, `kernelevents`, `k8sevents`,
      `containerstdout`, `xidgen`) — re-verified after the merge.
- [ ] CI: `chart-render` job validates the surviving chart fixtures
      (`all-receivers-off-values.yaml`, `one-receiver-on-values.yaml`,
      `pyspy-on-values.yaml`); the deleted `containerstdout-on-values.yaml`
      drop-out should not regress conftest coverage of the
      containerstdout-allowlist / operational-invariant rules
      because the template guard still fires when
      `containerstdout.enabled=true` is set in any future values
      file. PR-K.3 will reassess once the chart-side keys go.
- [ ] CI: `install (kind)` job continues to render the bench
      tracecore-values.yaml against the OCB binary
      (hostmetricsreceiver heartbeat surface).
- [ ] CI: `harness-determinism (amd64/arm64)` job no longer runs
      the xid byte-determinism steps; pod-evict + golden-SHA loop
      survive. Expected: 2 fewer steps per matrix arm.

## Gates that should fail this PR if I missed something

- `doc-check`'s "dead markdown link" gate would catch any
  surviving link into the deleted dirs (caught the
  `docs/README.md` regressions on the first run; fixed and
  re-verified).
- `go vet ./...` would catch a stale import; ran clean.
- `golangci-lint run ./...` would catch unused imports or dead
  code introduced by the sweep; 0 issues reported.
- `go mod tidy -diff` would catch a missing dep prune; ran clean
  after the post-merge prune of `go.uber.org/goleak`.

Refs RFC-0013 §migration PR-K.2.

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>