
feat(pivot): PR-F.2 — delete in-tree boot-path internals #215

Merged: trilamsr merged 3 commits into main from pr-f-2-delete-internal-pipeline-consumer on May 31, 2026


@trilamsr

Contributor

Summary

Deletes the seven internal/* packages that RFC-0013 §migration step 8 PR-F.2 promised once the upstream-port wave (#201/#202/#203/#204/#205/#207/#208/#209) cleared every external caller of the in-tree pipeline runtime.

Net: -6,888 LOC across 56 deleted files, +80 LOC across 14 modified files. 70 files total. This is the final cut of RFC-0013 §migration step 8 PR-F.

What deletes

| Path | LOC | Replacement |
|---|---|---|
| `internal/pipeline/` | 4,134 | `go.opentelemetry.io/collector/service` (OCB-generated `_build/main.go` consumes `builder-config.yaml`). |
| `internal/pipelinebuilder/` | 1,282 | Same: assembly is upstream `service`. |
| `internal/config/` | 718 | Upstream `confmap` providers (`file`, `yaml`, `env`). |
| `internal/consumer/` | 87 | Upstream `go.opentelemetry.io/collector/consumer`. |
| `internal/fanout/` | 366 | Upstream `internal/fanoutconsumer` (collector module). |
| `internal/componentstatus/` | 16 | Upstream `component/componentstatus.ReportStatus` (same free-function shape). |
| `internal/runtime/lifecycle/` | 505 | Per-receiver package-local `lifecycle.go` siblings, already ported during the PR-B1 wave (#184/#185/#186/#187/#194/#196/#197); the in-tree helper had no remaining non-test consumer after PR-F.1 + the wave-2 upstream-port PRs. `kernelevents/lifecycle.go` was inherited from k8sevents (#208). |

Pre-flight grep evidence

$ grep -rnE 'tracecoreai/tracecore/internal/(pipeline|consumer|pipelinebuilder|config|fanout|componentstatus|runtime/lifecycle)' --include='*.go' .
(zero matches)

Tooling

  • .golangci.yml ignore-interface-regexps repointed at upstream consumer.{Metrics,Traces,Logs} + component.Component. The in-tree-only same-package-error-wrap exemption stays — the STYLE rule applies regardless of which interface is forwarded.
  • .github/workflows/chaos.yml drops the chaos-pipeline-test job (the in-tree internal/pipeline/chaos_test.go is gone; upstream service provides the equivalent panic-recovery contract). harness-determinism (failure-inject golden-SHA), cpu-steal-mpstat, pattern-pod-evicted jobs preserved.
  • .github/workflows/install-bench.yml drops the internal/{pipeline,runtime,selftelemetry}/** path-filter rows.
  • go.mod / go.sum unchanged.

Doc sweep

  • CHANGELOG.md Unreleased: PR-F.2 landed entry replacing the "PR-F.2 deferred" sentence; "Remaining v0.1.0 work" line updated; one dead internal/pipeline/README.md link in Foundation block rewritten as "deleted at v0.1.0".
  • docs/rfcs/0013-distro-first-pivot.md §7 deletion table: both pipeline-internals and runtime/lifecycle rows updated from "v0.1.0 (audit first…)" / "v0.2.0 (with last consumer)" to "v0.1.0 (landed PR-F.2)". §migration step 8 reframed.
  • docs/FAILURE-MODES.md Lifecycle / Data flow / Shutdown timing / Backend tables rewired from in-tree internal/{config,pipeline,fanout}/*_test.go::TestName pointers to upstream-delegated wording matching the pattern PR-A2 established.
  • docs/STRATEGY.md "Post-RFC-0013 status" intro updated; "Stable interfaces in internal/pipeline/" graduation row rewritten to point at the upstream surface.
  • docs/migration/v0.1-to-v0.2.md internal/* section status banner flipped from "deferred, still present in RC builds" to "landed, deleted in v0.2.0 builds".
  • MILESTONES.md v0.1.0 deletions row extended with boot-path internals; M1 + M4b + M19 rubric details annotated with the PR-F.2 retirement.
  • README.md Contributor row repointed at upstream go.opentelemetry.io/collector package docs.
  • AGENTS.md "Self-telemetry internals" bullet split into "Self-tel internals" + "Pipeline / boot-path internals" with explicit deletion status.
  • docs/README.md table row for internal/pipeline/README.md dropped.
  • components/receivers/kernelevents/README.md lifecycle-sibling rationale updated to past-tense.
  • tools/failure-inject/README.md "Testing locally" section drops the -tags=chaos ./internal/pipeline/... invocation.

Sequencing

This PR is hard-gated on every upstream-port PR landing first:

All nine merged before this PR opened; this is the moat-deletion payoff. Remaining v0.1.0 work is PR-K (chart-default flip + clockreceiver + stdoutexporter + remaining receiver source deletions, coupled with test-fixture migration and the telemetry: values-key deprecation cycle).

Test plan

  • make check — golangci-lint 0 issues, go vet clean, go mod verify ok.
  • go build ./... — clean.
  • go test -count=1 ./... — green (excluding the known kernelevents/TestReceiver_SLIBudget flake called out in feat(pivot): PR-B3 — port clockreceiver off internal pipeline+consumer #205's body, which only triggers under heavy parallel go test ./... load; passes standalone).
  • grep confirms zero non-internal callers of the deleted packages.
  • Doc-check pre-push hook passes after the CHANGELOG dead-link fix.
[CHANGE] internal/{pipeline,pipelinebuilder,config,consumer,fanout,componentstatus,runtime/lifecycle} packages deleted. The OCB-generated boot path off builder-config.yaml replaces them. Third-party importers of internal/* (unlikely pre-1.0; the packages live under internal/ and the Go compiler rejects external imports) lose the pipeline-assembly + lifecycle + config-loader surfaces; receiver authors now wire against upstream go.opentelemetry.io/collector/{component,receiver,consumer,pipeline} directly. See docs/migration/v0.1-to-v0.2.md "internal/* package deletion".

Tri Lam added 2 commits May 31, 2026 02:07
Deletes the seven `internal/*` packages that RFC-0013 §migration step 8
PR-F.2 promised once the upstream-port wave (#201/#202/#203/#204/#205/
#207/#208/#209) cleared every external caller.

**Net: -6,888 LOC across 56 deleted files, +80 LOC across 14 modified
files. 70 files total.**

### What deletes

| Path | LOC | Replacement |
|---|---|---|
| `internal/pipeline/` | 4,134 | `go.opentelemetry.io/collector/service` (OCB-generated `_build/main.go` consumes `builder-config.yaml`). |
| `internal/pipelinebuilder/` | 1,282 | Same — assembly is upstream `service`. |
| `internal/config/` | 718 | Upstream `confmap` providers (`file`, `yaml`, `env`). |
| `internal/consumer/` | 87 | Upstream `go.opentelemetry.io/collector/consumer`. |
| `internal/fanout/` | 366 | Upstream `internal/fanoutconsumer` (collector module). |
| `internal/componentstatus/` | 16 | Upstream `component/componentstatus.ReportStatus` (same free-function shape). |
| `internal/runtime/lifecycle/` | 505 | Per-receiver package-local `lifecycle.go` siblings — already ported during the PR-B1 wave (#184/#185/#186/#187/#194/#196/#197); the in-tree helper had no remaining non-test consumer after PR-F.1 + the wave-2 upstream-port PRs. `kernelevents/lifecycle.go` was inherited from k8sevents (#208). |

### Pre-flight grep evidence

```
$ grep -rnE 'tracecoreai/tracecore/internal/(pipeline|consumer|pipelinebuilder|config|fanout|componentstatus|runtime/lifecycle)' --include='*.go' .
(zero matches)
```
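The same pre-flight check can be sketched in Go by parsing import declarations instead of matching text, which avoids regex-dialect pitfalls. This is a minimal sketch, not the tooling used in this PR: the module path `github.com/tracecoreai/tracecore` is assumed from the paths quoted elsewhere in this thread, and `isDeletedImport` and `scan` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"io/fs"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// modulePrefix is an assumption: the module path is inferred from the
// import paths quoted in this thread.
const modulePrefix = "github.com/tracecoreai/tracecore/internal/"

// deleted lists the seven packages removed by PR-F.2.
var deleted = []string{
	"pipeline", "pipelinebuilder", "config", "consumer",
	"fanout", "componentstatus", "runtime/lifecycle",
}

// isDeletedImport reports whether path names one of the deleted
// packages or a subpackage of one.
func isDeletedImport(path string) bool {
	rest, ok := strings.CutPrefix(path, modulePrefix)
	if !ok {
		return false
	}
	for _, d := range deleted {
		if rest == d || strings.HasPrefix(rest, d+"/") {
			return true
		}
	}
	return false
}

// scan walks root and returns one "file: import" entry per offending import.
func scan(root string) ([]string, error) {
	var hits []string
	fset := token.NewFileSet()
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(p, ".go") {
			return nil
		}
		// ImportsOnly stops parsing after the import block.
		f, err := parser.ParseFile(fset, p, nil, parser.ImportsOnly)
		if err != nil {
			return err
		}
		for _, imp := range f.Imports {
			path, err := strconv.Unquote(imp.Path.Value)
			if err != nil {
				return err
			}
			if isDeletedImport(path) {
				hits = append(hits, p+": "+path)
			}
		}
		return nil
	})
	return hits, err
}

func main() {
	hits, err := scan(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	for _, h := range hits {
		fmt.Println(h)
	}
	if len(hits) > 0 {
		os.Exit(1) // non-zero exit lets a CI step fail on surviving importers
	}
}
```

Run from the repository root, an empty output with exit code 0 corresponds to the "zero matches" result above.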

### Tooling

- `.golangci.yml` interface-forwarder regexps repointed at upstream
  `consumer.{Metrics,Traces,Logs}` + `component.Component`. The
  in-tree-only same-package-error-wrap exemption stays — STYLE rule
  applies regardless of which interface is forwarded.
- `.github/workflows/chaos.yml` drops the `chaos-pipeline-test` job
  (the in-tree `internal/pipeline/chaos_test.go` is gone; upstream
  `service` provides the equivalent panic-recovery contract).
  `harness-determinism` (failure-inject golden-SHA), `cpu-steal-mpstat`,
  `pattern-pod-evicted` jobs preserved.
- `.github/workflows/install-bench.yml` drops the
  `internal/{pipeline,runtime,selftelemetry}/**` path-filter rows.
- `go.mod` / `go.sum` unchanged (the deleted packages only consumed
  modules already used elsewhere in the tree).

### Doc sweep

- `CHANGELOG.md` Unreleased: PR-F.2 landed entry replacing the
  "PR-F.2 deferred" sentence; "Remaining v0.1.0 work" line updated.
- `docs/rfcs/0013-distro-first-pivot.md` §7 deletion table: both
  pipeline-internals and runtime/lifecycle rows updated from
  "v0.1.0 (audit first…)" / "v0.2.0 (with last consumer)" to
  "v0.1.0 (landed PR-F.2)". §migration step 8 reframed from
  "splits into PR-F.1 + PR-F.2 because pipeline is out of scope"
  to "PR-F.2 collapsed the full boot-path infrastructure in one cut"
  once wave-2 cleared external callers.
- `docs/FAILURE-MODES.md` Lifecycle / Data flow / Shutdown timing /
  Backend tables rewired from in-tree `internal/{config,pipeline,
  fanout}/*_test.go::TestName` pointers to upstream-delegated wording
  matching the pattern PR-A2 established for the M1 retirements.
- `docs/STRATEGY.md` "Post-RFC-0013 status" intro updated; "Stable
  interfaces in `internal/pipeline/`" graduation row rewritten to
  point at upstream surface.
- `docs/migration/v0.1-to-v0.2.md` `internal/*` section status banner
  flipped from "deferred, still present in RC builds" to "landed,
  deleted in v0.2.0 builds"; remaining `internal/*` row table preserved
  as the migration recipe for third-party importers (unlikely OSS
  pre-1.0, but the table is the canonical mapping).
- `MILESTONES.md` v0.1.0 deletions row extended with boot-path
  internals; M1 + M4b + M19 rubric details annotated with the PR-F.2
  retirement; M1 doc-reference line repointed at upstream.
- `README.md` Contributor row repointed at upstream
  `go.opentelemetry.io/collector` package docs (the `internal/pipeline/
  README.md` quickstart is gone).
- `AGENTS.md` "Self-telemetry internals" bullet split into "Self-tel
  internals" + "Pipeline / boot-path internals" with explicit deletion
  status.
- `docs/README.md` table row for `internal/pipeline/README.md` dropped.
- `components/receivers/kernelevents/README.md` lifecycle-sibling
  rationale updated to past-tense ("migrated off the now-deleted
  `internal/runtime/lifecycle`").
- `tools/failure-inject/README.md` "Testing locally" section drops
  the `-tags=chaos ./internal/pipeline/...` invocation.

### Test verification

- `make check` — golangci-lint 0 issues, go vet clean, go mod verify ok.
- `go build ./...` — clean.
- `go test -count=1 ./...`, excluding `internal/integration` (stale ARM
  `_build/tracecore` from a prior worktree session) and `kernelevents`
  (known `TestReceiver_SLIBudget` flake called out in #205's body): all green.
- `go test -count=1 ./components/receivers/kernelevents/` standalone —
  green (the flake only triggers under heavy parallel `go test ./...`
  load).

### Sequencing

This is the final cut of RFC-0013 §migration step 8 PR-F. The remaining
v0.1.0 work is PR-K (chart-default flip + `clockreceiver` +
`stdoutexporter` + remaining receiver source deletions, coupled with
test-fixture migration and the `telemetry:` values-key deprecation
cycle).

```release-notes
[CHANGE] internal/{pipeline,pipelinebuilder,config,consumer,fanout,componentstatus,runtime/lifecycle} packages deleted. The OCB-generated boot path off builder-config.yaml replaces them. Third-party importers of internal/* (unlikely pre-1.0; the packages live under internal/ and the Go compiler rejects external imports) lose the pipeline-assembly + lifecycle + config-loader surfaces; receiver authors now wire against upstream go.opentelemetry.io/collector/{component,receiver,consumer,pipeline} directly. See docs/migration/v0.1-to-v0.2.md "internal/* package deletion".
```

Signed-off-by: Tri Lam <tri@maydow.com>
doc-check on the parent commit flagged a dead `internal/pipeline/
README.md` link in CHANGELOG's Foundation block; rewrite it as a
"deleted at v0.1.0" reference pointing at upstream
`go.opentelemetry.io/collector` docs. Same pattern the parent
applied to other deleted-doc references.

Also picks up the MILESTONES.md M1 "Status (RFC-0013)" row that was
edited in the same sweep but missed the parent's staging — extends
the v0.1.0 delete list with the boot-path internals + componentstatus
+ runtime/lifecycle paths.

Signed-off-by: Tri Lam <tri@maydow.com>
@trilamsr trilamsr enabled auto-merge (squash) May 31, 2026 09:11
…l-pipeline-consumer

# Conflicts:
#	CHANGELOG.md
#	MILESTONES.md
#	docs/migration/v0.1-to-v0.2.md
#	docs/rfcs/0013-distro-first-pivot.md
@trilamsr trilamsr merged commit b285998 into main May 31, 2026
16 of 18 checks passed
@trilamsr trilamsr deleted the pr-f-2-delete-internal-pipeline-consumer branch May 31, 2026 09:38
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

RFC-0013 §migration PR-K.2: delete the four in-tree receivers
(`clockreceiver`, `kernelevents`, `k8sevents`, `containerstdout`)
plus the `xidgen` failure-injector (whose sole consumer was
`kernelevents`'s wire shape per RFC §migration L253). PR-K.1
(#211) just severed `internal/synthesis/patterns/` + `replay` from
`k8sevents`, unblocking the source-tree cut. PR-J (#195) shipped
the four upstream-OTel recipes that replace the in-tree receivers
in the bundled Helm-chart pipeline; this PR retires the
behind-the-curtain code.

Chart cleanup (values keys, DaemonSet template refs, `NOTES.txt`
deprecation warnings) intentionally stays in PR-K.3 so operators
get one minor of deprecation telemetry before the values shape
breaks.

> **Branch state note:** PR-F.2 (#215,
`internal/{componentstatus,pipeline,pipelinebuilder,config,consumer,fanout,runtime/lifecycle}`
deletion) and PR-I.1a (#214, `module/` Go submodule scaffold +
`go.work`) both landed on main mid-flight and have been merged into this
branch. The `_test.go` placeholder-name migration originally scoped here
(RFC §migration's "~86 fixture refs" line) is now moot — PR-F.2 deleted
every `internal/*` file that held those refs, so the migration target
evaporated. `make ci` + `make verify` + `make build` re-run green
against the merged tree.

```release-notes
[CHANGE] In-tree receivers clockreceiver/kernelevents/k8sevents/containerstdout deleted in favor of PR-J upstream-OTel recipes. xidgen failure-injector deleted alongside kernelevents (sole consumer).
```

## What lands

### Deletions — receivers (4)

| Path | Files | LOC | Replacement (PR-J recipe) |
|---|---|---|---|
| `components/receivers/clockreceiver/` | 10 | ~1.4k | `hostmetricsreceiver` (loadscraper @ 1s); PR-E landed in #180. RFC-0013 §migration originally named `telemetrygeneratorreceiver`, but that receiver does not exist in opentelemetry-collector-contrib (contrib #41687 + #43657 both closed `not_planned`). |
| `components/receivers/kernelevents/` | 49 | ~13.2k | [`journaldreceiver` + `filelogreceiver` (kmsg) + OTTL Xid transform](../blob/main/docs/integrations/journald-kernel.md). Customer-stable `kernelevents.xid` + `gpu.id` attributes preserved via the OTTL transform per RFC-0013 §3. |
| `components/receivers/k8sevents/` | 37 | ~6.8k | [`k8sobjectsreceiver` (watch mode on `events`) + OTTL `k8s.event.hint` transform](../blob/main/docs/integrations/k8sobjects-events.md). 11-entry hint enum preserved via the OTTL transform per RFC-0013 §3. The typed `internal/synthesis/patterns/Record` + `NodeRecord` (severed in PR-K.1) keeps M19's pod-evicted detector pinned. |
| `components/receivers/containerstdout/` | 56 | ~6.4k | [`filelogreceiver` + container stanza + `file_storage` extension](../blob/main/docs/integrations/filelog-container.md). Per-rank attribution + dataloader-timing extraction move to OTTL transforms in the bundled recipe per RFC-0013 §3. |

### Deletions — supporting infra

| Path | Why |
|---|---|
| `tools/failure-inject/xidgen/` (2 files, 281 LOC) | Sole consumer was the kernelevents wire shape per RFC-0013 §migration L253. Operators inject NVRM Xid via real `/dev/kmsg` (`sudo tee`) or `systemd-cat` against the journald recipe. |
| `install/kubernetes/tracecore/ci/containerstdout-on-values.yaml` (41 LOC) | Chart-render fixture for the deleted receiver. Remaining chart fixtures (`all-receivers-off-values.yaml`, `one-receiver-on-values.yaml`, `pyspy-on-values.yaml`) untouched; their `clockreceiver: enabled: false` / `kernelevents: enabled: false` rows survive to PR-K.3's chart cleanup. |
| `.github/ISSUE_TEMPLATE/component-bug-kernelevents.yml` | Receiver-specific bug template; no surviving receiver. |

### failure-inject CLI surface

`failure-inject xid --code=N [--format=…] [--count=N]` removed.
`failure-inject {pod-evict,nccl-hang,cpu-steal}` unchanged.
`tools/failure-inject/testdata/golden.sha256` drops the two `xid`
golden rows. `.github/workflows/chaos.yml` drops the two
two-run-determinism steps that exercised the xid byte-determinism
contract; pod-evict's determinism + the golden-SHA replay loop
survive.
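For readers unfamiliar with the two-run determinism contract mentioned above, here is a minimal sketch of the idea: run a seeded generator twice and compare SHA-256 digests. `generate` is a hypothetical stand-in, not the actual `failure-inject` code, and the record format is invented for illustration.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
)

// generate is a hypothetical stand-in for a failure-inject subcommand.
// Its output depends only on seed and count, never on wall-clock time.
func generate(seed int64, count int) []byte {
	rng := rand.New(rand.NewSource(seed))
	var buf bytes.Buffer
	for i := 0; i < count; i++ {
		fmt.Fprintf(&buf, "event=%d node=node-%02d\n", i, rng.Intn(16))
	}
	return buf.Bytes()
}

// digest returns the hex-encoded SHA-256 of b, the same shape a
// golden.sha256 row would hold.
func digest(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// twoRunDeterministic reports whether two runs with identical inputs
// produce byte-identical output.
func twoRunDeterministic(seed int64, count int) bool {
	return digest(generate(seed, count)) == digest(generate(seed, count))
}

func main() {
	fmt.Println("deterministic:", twoRunDeterministic(42, 8))
	fmt.Println("sha256:", digest(generate(42, 8)))
}
```

A golden-SHA replay would additionally compare the printed digest against a checked-in value, so drift across commits is caught as well as drift across runs.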

### Tooling shed

- `Makefile`: retire `test-extras-sustained` body (was kernelevents-
  only; now `@true` — target retained so downstream automation has
  a stable name; future sustained-load suites slot back in).
  Retire `test-extras-fuzz-kmsg` / `test-extras-fuzz-journald`;
  `test-extras-fuzz` loop drops to nccl-fr only. Drop the
  kernelevents row from `test-extras-race`. Empty the `bench-check`
  for-loop (k8sevents was the only baseline; PR-F.2 already
  rewrote the comment block to reflect this — the merge keeps
  both edits aligned).
- `go.mod`: sheds
  `k8s.io/{api,apimachinery,client-go,klog/v2,kube-openapi,utils}`,
  `sigs.k8s.io/{json,randfill,structured-merge-diff/v6,yaml}`, and
  `gopkg.in/{evanphx/json-patch.v4,inf.v0}`, the dep cluster that
  k8sevents dragged in. `go.uber.org/goleak` was also dropped
  post-merge (its only holder was `internal/pipeline/chaos_test.go`,
  which PR-F.2 deleted).

### Doc + comment sweep

Comment-only references to deleted receivers in
`components/exporters/{otlphttp,stdoutexporter}/`,
`components/receivers/{nccl_fr,pyspy}/`,
`internal/synthesis/patterns/{doc,model,verdict}.go` rewired to
surviving references (or to upstream recipe pointers).
`docs/README.md` per-component-docs table drops five dead links
(caught by `doc-check.sh`'s rotten-link gate — that is in fact
how I caught the last drift). `bench/install/README.md`
tick-alias note + schema-v2-rename note updated.
`tools/failure-inject/README.md` xid section removed; status
banner rewritten.

CHANGELOG (new PR-K.2 entry under Unreleased + wave-4 paragraph
re-balanced), MILESTONES (M1 + M9 + M10 + M15 status lines flipped
from "DELETED at v0.2.0" → "DELETED in PR-K.2" with file pointers
to the integration recipes), `docs/migration/v0.1-to-v0.2.md`
(PR-K.1 + PR-K.2 checkboxes flipped, PR-F.2 + PR-I.1a status
block updated post-merge), AGENTS.md (queued-for-deletion
paragraph updated to current reality) all swept.

### What evaporated mid-flight

Originally this PR was also going to migrate a handful of
`_test.go` files that held placeholder-name string references
to `clockreceiver` (in `internal/pipeline/saferun_test.go`,
`internal/config/fuzz_test.go`,
`internal/pipelinebuilder/fuzz_test.go`).
PR-F.2 (#215) deleted those files outright while this branch
was being drafted, so the migration target disappeared. No
follow-up needed.

## Net LOC delta

```
192 files changed, 120 insertions(+), 28,927 deletions(-)
```

## What is intentionally NOT in this PR

- **Helm-chart `receivers.clockreceiver` / `receivers.kernelevents`
  / `receivers.containerstdout` toggles + DaemonSet template refs
  + `containerstdout-rbac.yaml` template** — stays for PR-K.3 so
  operators see `NOTES.txt` deprecation warnings before the values
  shape breaks. The toggles are already inert post-PR-A2 (enabling
  any of them in `values.yaml` crashes the OCB binary at boot
  because the factories are not registered) — keeping them as
  no-ops for one minor preserves operator UX.
- **`internal/{componentstatus,pipeline,…}/` deletion** — already
  done in PR-F.2 (#215) before this branch landed; merged in.
- **Chart `values.yaml` `# clockreceiver — in-tree heartbeat
  retired by RFC-0013 PR-A2` style retire-banners** — stays for
  PR-K.3 alongside the actual values-keys cleanup.
- **`tools/failure-inject/ncclhang/`** — KEPT. Used by
  `pkg/nccl/fr_parser/synthesize_test.go` +
  `bench/overhead/nccl_fr_bench_test.go`; this is the canonical
  example of a failure-injector that survives the v0.2.0 cut.

## Root cause

The four receivers + xidgen survived only because PR-K.1's pattern-
library severance from `k8sevents` was still in flight. With #211
merged at the start of this session, the deletion is unblocked.
There is no workaround being applied here — this PR is the
root-cause deletion of the in-tree receivers themselves, which
RFC-0013 §migration set as the v0.2.0 deletion target.

## Test plan

- [x] `make ci` green post-merge (verify + license + nccl-fr-rce-gate +
      register-lint + actionlint + zizmor + coverage-check +
      ci-fuzz-nccl-fr + govulncheck + doc-check + no-autoupdate-check +
      build).
- [x] `make verify` green post-merge.
- [x] `make build` green post-merge (OCB compile against
      `builder-config.yaml` with `GOWORK=off` per the PR-I.1a
      isolation guard yields `./_build/tracecore`).
- [x] `go test ./...` green across the post-merge tree.
- [x] Hard pre-flight: zero external Go importers for any
      deletion target (`clockreceiver`, `kernelevents`, `k8sevents`,
      `containerstdout`, `xidgen`) — re-verified after the merge.
- [ ] CI: `chart-render` job validates the surviving chart fixtures
      (`all-receivers-off-values.yaml`, `one-receiver-on-values.yaml`,
      `pyspy-on-values.yaml`); the deleted `containerstdout-on-values.yaml`
      drop-out should not regress conftest coverage of the
      containerstdout-allowlist / operational-invariant rules
      because the template guard still fires when
      `containerstdout.enabled=true` is set in any future values
      file. PR-K.3 will reassess once the chart-side keys go.
- [ ] CI: `install (kind)` job continues to render the bench
      tracecore-values.yaml against the OCB binary
      (hostmetricsreceiver heartbeat surface).
- [ ] CI: `harness-determinism (amd64/arm64)` job no longer runs
      the xid byte-determinism steps; pod-evict + golden-SHA loop
      survive. Expected: 2 fewer steps per matrix arm.

## Gates that should fail this PR if I missed something

- `doc-check`'s "dead markdown link" gate would catch any
  surviving link into the deleted dirs (caught the
  `docs/README.md` regressions on the first run; fixed and
  re-verified).
- `go vet ./...` would catch a stale import; ran clean.
- `golangci-lint run ./...` would catch unused imports or dead
  code introduced by the sweep; 0 issues reported.
- `go mod tidy -diff` would catch a missing dep prune; ran clean
  after the post-merge prune of `go.uber.org/goleak`.

Refs RFC-0013 §migration PR-K.2.

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr pushed a commit that referenced this pull request May 31, 2026
PR-K.2 (#217) and PR-F.2 (#215) deleted clockreceiver, containerstdout,
k8sevents, kernelevents from the binary; the namespace-alignment section
in v0.1-to-v0.2.md was authored before those merges landed and still
listed all eight. Trim the per-component substitution table to the four
surviving in-tree components (nccl_fr, pyspy, otlphttp, stdoutexporter)
and point operators at the existing "Orphan in-tree components" table
for the deleted-receiver migration path. Refresh the PromQL example
to use otlphttp (still present) instead of containerstdout (deleted).

Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
…216)

## Summary

The four surviving in-tree components (`nccl_fr`, `pyspy`, `otlphttp`,
`stdoutexporter`) each emit self-telemetry through their own
per-component MeterProvider. This PR renames their instrument names from
the v0.1.x `tracecore.*` family to the upstream
`otelcol_<role>_<component>_<metric>` convention so the in-tree
namespace does not collide with the OCB pipeline-runtime's own
`otelcol_*` family.

Per [RFC-0013 §migration v0.1.0](docs/rfcs/0013-distro-first-pivot.md)
row 119: *"self-tel metric rename `tracecore.*` → `otelcol_*`"*.

**Scope reduced post-PR-K.2 / PR-F.2.** The four legacy in-tree
receivers originally listed for rename (`clockreceiver`,
`containerstdout`, `k8sevents`, `kernelevents`) plus the in-tree
boot-path internals were deleted from the binary in #217 and #215 while
this PR was in CI. The namespace-alignment edits to those files were
dropped during the post-#217 merge; the surviving net diff covers only
the four components still in the binary.

## Rename matrix

OTel-dot form (Prometheus scrape renders dots as underscores):

| v0.1.x instrument | post-rename instrument |
|---|---|
| `tracecore.receiver.errors_total` | `otelcol.receiver.<name>.errors_total` |
| `tracecore.receiver.emissions_total` | `otelcol.receiver.<name>.emissions_total` |
| `tracecore.receiver.collection_latency_seconds` | `otelcol.receiver.<name>.collection_latency_seconds` |
| `tracecore.receiver.degraded_seconds_total` | `otelcol.receiver.<name>.degraded_seconds_total` |
| `tracecore.receiver.last_activity_unix_seconds` | `otelcol.receiver.<name>.last_activity_unix_seconds` |
| `tracecore.exporter.calls_total` | `otelcol.exporter.<name>.calls_total` |
| `tracecore.selftelemetry.init_errors_total` | `otelcol.selftelemetry.init_errors_total` |

Where `<name>` is the OCB component name without underscores.
Per-component substitutions:

| Component | `<name>` |
|---|---|
| `components/receivers/nccl_fr` | `ncclfr` (underscore stripped per RFC) |
| `components/receivers/pyspy` | `pyspy` |
| `components/exporters/otlphttp` | `otlphttp` |
| `components/exporters/stdoutexporter` | `stdoutexporter` |

**Label shape is preserved.** `component_id` still partitions
per-instance; `kind` / `result` values are unchanged. Dashboards and
alerts that filtered on labels need only the metric-name rename, not a
label-selector rewrite.
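The rename matrix can be expressed as a small mapping function, which may help when rewriting dashboard queries in bulk. This is a sketch under the rules stated above; `componentName`, `renameInstrument`, and `promName` are hypothetical helper names, not functions from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// componentName derives <name> from an in-tree directory name by
// stripping underscores, e.g. "nccl_fr" becomes "ncclfr".
func componentName(dir string) string {
	return strings.ReplaceAll(dir, "_", "")
}

// renameInstrument maps a v0.1.x instrument to its post-rename form.
// Receiver and exporter instruments gain a <name> segment; the
// selftelemetry family only swaps the root. ok is false for any
// instrument outside the tracecore.* families in the matrix.
func renameInstrument(old, name string) (renamed string, ok bool) {
	rest, found := strings.CutPrefix(old, "tracecore.")
	if !found {
		return "", false
	}
	role, metric, found := strings.Cut(rest, ".")
	if !found {
		return "", false
	}
	switch role {
	case "receiver", "exporter":
		return "otelcol." + role + "." + name + "." + metric, true
	case "selftelemetry":
		return "otelcol.selftelemetry." + metric, true
	}
	return "", false
}

// promName renders the OTel dot form as it appears in a Prometheus scrape.
func promName(otelName string) string {
	return strings.ReplaceAll(otelName, ".", "_")
}

func main() {
	name := componentName("nccl_fr")
	renamed, _ := renameInstrument("tracecore.receiver.errors_total", name)
	fmt.Println(renamed)           // otelcol.receiver.ncclfr.errors_total
	fmt.Println(promName(renamed)) // otelcol_receiver_ncclfr_errors_total
}
```

Because labels are untouched, only the metric name in each query needs to pass through a mapping like this one.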

## What changed

- **Four `selftel.go` files** (otlphttp, stdoutexporter, nccl_fr, pyspy)
— instrument literals updated; header comments rewritten to document the
new convention + label-preservation invariant.
- **Four `selftel_test.go` files** — `findInstrument(..., "...")` and
`scopeOf(..., "...")` assertions updated; `receiverInstrumentPrefix` /
`exporterInstrumentPrefix` constants (used by the `failingReceiverMP` /
`failingExporterMP` synthetic-failure seams) bumped to the new prefix so
a future drift back to `tracecore.*` would fail compile-time on the
prefix mismatch.
- **`docs/examples/prometheus-alerts.example.yaml`** — rewritten as
receiver-agnostic starter using regex matchers
(`{__name__=~"otelcol_receiver_.*_errors_total"}`) so a new in-tree
receiver inherits coverage on first scrape. Removed the
`tracecore_exporter_failure_rate` / `tracecore_build_info` rules — those
v0.1.x gauges no longer exist post-RFC-0013.
- **`docs/migration/v0.1-to-v0.2.md`** — new "In-tree receiver /
exporter namespace alignment (RFC-0013 v0.1.0)" section listing only the
four surviving in-tree components; points operators at the existing
"Orphan in-tree components" table for deleted-receiver migration.
- **`NOTE on ExporterCarrier removal`** blocks in otlphttp.go +
stdoutexporter.go collapsed — they referenced
`cmd/tracecore.collect.collectFailureRateReaders` (deleted in PR-A2) and
`internal/selftelemetry` (deleted in PR-F.1). New comments point at the
PromQL recipe
`rate(otelcol_exporter_<name>_calls_total{result="error"}[5m])`.
- **CHANGELOG** — `[Unreleased]` entry added documenting the alignment.
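To see why the regex matcher in the alerts example covers any receiver, the sketch below mimics Prometheus's `=~` semantics, where the pattern must match the entire value. `promMatch` is a hypothetical helper for illustration; Prometheus itself performs this anchoring internally.

```go
package main

import (
	"fmt"
	"regexp"
)

// promMatch mimics a Prometheus =~ matcher: the pattern is anchored at
// both ends, so it must match the whole value.
func promMatch(pattern, value string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(value)
}

func main() {
	const pattern = "otelcol_receiver_.*_errors_total"
	for _, metric := range []string{
		"otelcol_receiver_ncclfr_errors_total",
		"otelcol_receiver_pyspy_errors_total",
		"otelcol_exporter_otlphttp_calls_total",
	} {
		fmt.Printf("%-40s %v\n", metric, promMatch(pattern, metric))
	}
}
```

Both receiver metrics match and the exporter metric does not, so a newly added receiver that follows the naming convention is picked up without editing the rule.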

## Test plan

- [x] `go vet ./...` clean (root + module/ submodule)
- [x] `go test ./...` all green (root module; module/ submodule has no
test files)
- [x] `make check` (fmt + tidy-check + lint + vet + mod-verify) green
- [x] `make doc-check` green
- [x] `TestSelfTelemetry_*` assertions pass for every surviving package
- [x] `TestFactory_FallsBackToNoopWhenMeterFails`
synthetic-register-failure seam still trips on the renamed prefix
(`failingReceiverMP` / `failingExporterMP`)

## Merge resolution (post-#215, #217)

Conflicts surfaced as "deleted by them" across 20 files in the four
deleted receivers. Resolution: accept all upstream deletions; the
namespace-alignment edits to those files were moot because the files
themselves no longer exist on main. No content edits required to the
surviving four components from the merge — their namespace-alignment
commits remain intact.

```release-notes
**Breaking (pre-1.0)**: in-tree component self-telemetry metric names renamed from `tracecore.*` to `otelcol.<role>.<name>.*` per RFC-0013 §migration v0.1.0 namespace alignment. Affects the four surviving in-tree components (`nccl_fr`, `pyspy`, `otlphttp`, `stdoutexporter`); the four legacy in-tree receivers (`clockreceiver`, `containerstdout`, `k8sevents`, `kernelevents`) were already deleted from the binary in #215 / #217 and migrate via the upstream-receiver replacements documented in the "Orphan in-tree components" table in `docs/migration/v0.1-to-v0.2.md`. Label shape preserved (`component_id`, `kind`, `result` unchanged). Operators with dashboards / alerts on the v0.1.x names rename per the table in `docs/migration/v0.1-to-v0.2.md` under "In-tree receiver / exporter namespace alignment".
```

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr pushed a commit that referenced this pull request May 31, 2026
`scripts/doc-check.sh` `scan_paths` listed four docs, three of which
were deleted in #215 (PR-F.2, `internal/pipeline/README.md`) and #217
(PR-K.2, `components/receivers/dcgm/RUNBOOK.md` +
`components/receivers/kernelevents/RUNBOOK.md`). The `grep` swallowed
the "No such file" errors via `2>/dev/null`, so the gate appeared to
pass while only `docs/FAILURE-MODES.md` was actually scanned — silent
rot, not a hard CI fail.
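One way to guard against this kind of silent rot is to verify that every scan path exists before scanning it. The sketch below shows that check in Go as an illustration only; the real gate is a shell script, and `missingPaths` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// missingPaths returns every entry of paths that does not exist on disk.
// Errors other than "not exist" (for example, permission errors) are
// returned rather than swallowed.
func missingPaths(paths []string) ([]string, error) {
	var missing []string
	for _, p := range paths {
		_, err := os.Stat(p)
		switch {
		case err == nil:
			// Path exists; nothing to record.
		case errors.Is(err, fs.ErrNotExist):
			missing = append(missing, p)
		default:
			return nil, err
		}
	}
	return missing, nil
}

func main() {
	// The four paths named in the commit message above.
	scanPaths := []string{
		"docs/FAILURE-MODES.md",
		"internal/pipeline/README.md",
		"components/receivers/dcgm/RUNBOOK.md",
		"components/receivers/kernelevents/RUNBOOK.md",
	}
	missing, err := missingPaths(scanPaths)
	if err != nil {
		fmt.Fprintln(os.Stderr, "stat failed:", err)
		return
	}
	// A CI gate would exit non-zero here instead of only printing.
	fmt.Println("missing scan paths:", missing)
}
```

The design point is that a missing input is reported as a failure of the gate itself rather than being treated as "nothing to flag".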

Also drops `pkg/` from the `defined` Go-test-symbol grep roots (deleted
in PR-F.2) and adds `module/` (introduced in PR-I.1a / extended here in
PR-I.1b), so the gate keeps full coverage of the post-pivot test corpus.

Side fix: `go.work` header claimed the Go directive matched root +
submodule `go.mod` exactly. `module/go.mod` pins `go 1.22.0` to track
the collector v0.110.0 OCB-distribution baseline; root is `1.26.3`.
Workspace mode only requires `>= every member module's directive`, so
1.26.3 >= 1.22.0 is fine — but the comment claim was wrong. Updated to
acknowledge the delta + name the reason (OCB baseline floor).
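The version relationship described above can be checked mechanically with the standard library's `go/version` package (available since Go 1.22). This is a minimal sketch; `workspaceSatisfies` is a hypothetical helper, and the member paths are illustrative.

```go
package main

import (
	"fmt"
	"go/version"
	"sort"
)

// workspaceSatisfies reports whether the go.work directive is at least
// as new as every member module's go directive. Versions are passed as
// written in go.mod / go.work (e.g. "1.26.3"); go/version expects a
// "go" prefix, so it is added here. The second result lists members
// whose directive is newer than the workspace's.
func workspaceSatisfies(work string, members map[string]string) (bool, []string) {
	var offenders []string
	for path, v := range members {
		if version.Compare("go"+work, "go"+v) < 0 {
			offenders = append(offenders, path)
		}
	}
	sort.Strings(offenders)
	return len(offenders) == 0, offenders
}

func main() {
	members := map[string]string{
		".":        "1.26.3", // root go.mod
		"./module": "1.22.0", // submodule pinned to the OCB baseline floor
	}
	ok, offenders := workspaceSatisfies("1.26.3", members)
	fmt.Println(ok, offenders)
}
```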

Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
#224)

## Summary

RFC-0013 §migration **PR-I.1b** — mechanical move of `nccl_fr` receiver
+ safe-pickle parser into the in-repo Go submodule scaffolded by #214.

- `git mv components/receivers/nccl_fr → module/receiver/ncclfrreceiver`
(Go package renamed `ncclfr → ncclfrreceiver` to match the OCB receiver
name; **OCB component type string `nccl_fr` unchanged** — operator
scrape names + dashboards do not regress, per PR-B1 metric-namespace
decision).
- `git mv pkg/nccl/fr_parser → module/pkg/nccl/fr_parser`.
- `builder-config.yaml` uncomments the PR-I.1a placeholder: adds `gomod:
github.com/tracecoreai/tracecore/module v0.1.0` + `import:
github.com/tracecoreai/tracecore/module/receiver/ncclfrreceiver` (split
because the single-`go.mod` submodule puts the receiver one path-segment
below the module root — a per-component `gomod:` would fail since
`module/receiver/ncclfrreceiver/` has no `go.mod` of its own).
- Root-level `replaces: github.com/tracecoreai/tracecore/module =>
../module` (`../module`, not `./module`, because OCB writes the replaces
verbatim into `./_build/go.mod`, one directory deeper than repo root).
- `module/go.mod` pins collector deps to **v0.110.0** + otel to
**v1.30.0** (the OCB-distribution baseline). MVS inside `_build/go.mod`
would otherwise pull forward to v1.59.0 — `scraperhelper` was split out
of `collector/receiver` between those two release lines, which would
break the `hostmetricsreceiver@v0.110.0` build added in PR-E.
- Root `go.mod` adds `replace github.com/tracecoreai/tracecore/module =>
./module` + matching `require` so `go mod tidy` (which ignores
`go.work`) resolves the submodule from the in-repo checkout rather than
the proxy. (The `replace` can be dropped the first time a release builds
against a published `module/vX.Y.Z` tag.)
- Importers updated in root: `tools/genfixtures` (+ `-out` default),
`bench/overhead/nccl_fr_bench_test.go`, `scripts/nccl-fr-rce-gate.sh`
(runs `go list -deps` from inside `module/`), `Makefile`
(`generate-fixtures` / `test-extras-fuzz-nccl-fr` / `ci-fuzz-nccl-fr` /
`nccl-fr-rce-gate`), `.github/workflows/nccl-fr-fuzz-nightly.yml`, and
two doc-comment refs in
`components/exporters/stdoutexporter/selftel{,_test}.go`.
- OTel instrumentation scope on `selftel.go` moves to the new Go import
path
(`github.com/tracecoreai/tracecore/module/receiver/ncclfrreceiver`),
OTel convention: scope = Go import path.
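
The version-pinning bullet above relies on how minimal version selection (MVS) behaves: for a given module path, the build uses the highest of the minimum versions that any participant requires. A rough sketch of that rule, assuming plain `vMAJOR.MINOR.PATCH` strings and treating the two collector release lines as versions of one module path, as the text does. `selected` and its helpers are hypothetical names, not Go toolchain APIs.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits "vMAJOR.MINOR.PATCH" into integers. Simplified: pre-release
// and build metadata are not handled, and malformed parts count as 0.
func parse(v string) [3]int {
	var out [3]int
	parts := strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3)
	for i, p := range parts {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// less reports whether version a sorts before version b.
func less(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// selected returns the version MVS would settle on for one module path:
// the highest of the minimum versions required across the build graph.
func selected(reqs ...string) string {
	best := ""
	for _, r := range reqs {
		if best == "" || less(best, r) {
			best = r
		}
	}
	return best
}

func main() {
	// OCB baseline and submodule both require v0.110.0: the build stays there.
	fmt.Println(selected("v0.110.0", "v0.110.0")) // v0.110.0
	// If the submodule required v1.59.0, the whole build would move forward.
	fmt.Println(selected("v0.110.0", "v1.59.0")) // v1.59.0
}
```

The second call is the situation the pin avoids: one higher requirement anywhere in the graph raises the version for every consumer in `_build/go.mod`.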

**No operator-visible metric/log-data regression**: the receiver type
string and metric instrument names are unchanged across the OCB wire
boundary. The OTel instrumentation scope name does move (per OTel
convention, scope = Go import path; matches the last bullet above,
RFC-0013 §Migration, and CHANGELOG), but the scope is a logger/meter
attribute, not part of the metric instrument names or the component type
string that operators configure against, so dashboards and scrape jobs do
not regress.

Post-merge: tag `module/v0.1.0 <merge-sha>` (the first version pinned in
`builder-config.yaml`).

## Test plan

Locally run + green on this branch (verified at commit `da44388`;
subsequent commits are an `origin/main` merge + a `doc-check.sh`
`scan_paths` fix-up that prunes three docs deleted by #215/#217 — no
source semantic change):

- [x] `make check` — fmt, golangci-lint, vet, mod-verify (0 issues).
- [x] `make verify` — license-check, generate-fixtures-check,
build-tags, **nccl-fr-rce-gate** (parser depends only on stdlib),
register-lint, actionlint, zizmor, doc-check, no-autoupdate-check.
- [x] `make build` — OCB v0.110.0 builds `./_build/tracecore` with the
submodule receiver wired in.
- [x] `./_build/tracecore components` — confirms `nccl_fr` receiver
registered (alongside hostmetrics, filelog, journald, k8sobjects, otlp,
prometheus).
- [x] `go test ./...` (root module).
- [x] `cd module && go test ./...` (submodule:
`module/pkg/nccl/fr_parser` + `module/receiver/ncclfrreceiver`).
- [x] `make ci-fuzz-nccl-fr` — 30s `FuzzParseFRPickle` on the moved
parser, no crashers.
- [x] `bash scripts/nccl-fr-rce-gate.sh` — parser deps clean (no
`os/exec`, `plugin`, `reflect.Call`, `reflect.MakeFunc`).
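
For readers unfamiliar with this kind of gate, the package-level half of the check can be sketched as a deny-list filter over `go list -deps` output. This is a hypothetical Go illustration, not the contents of `scripts/nccl-fr-rce-gate.sh`; the names `forbidden` and `violations` are assumptions. The symbol-level checks (`reflect.Call`, `reflect.MakeFunc`) are not covered here.

```go
package main

import (
	"fmt"
	"strings"
)

// forbidden is a hypothetical deny list of packages the parser must not
// depend on, taken from the package names mentioned in the test plan.
var forbidden = map[string]bool{
	"os/exec": true,
	"plugin":  true,
}

// violations scans newline-separated import paths (the shape that
// `go list -deps` prints) and returns every forbidden package found.
func violations(goListOutput string) []string {
	var bad []string
	for _, line := range strings.Split(goListOutput, "\n") {
		pkg := strings.TrimSpace(line)
		if forbidden[pkg] {
			bad = append(bad, pkg)
		}
	}
	return bad
}

func main() {
	clean := "bytes\nencoding/binary\nfmt\n"
	dirty := "bytes\nos/exec\nfmt\n"
	fmt.Println(violations(clean)) // []
	fmt.Println(violations(dirty)) // [os/exec]
}
```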

CI to confirm:

- [ ] All required checks green on this PR.
- [ ] `nccl-fr-fuzz-nightly` runs against the new path on next nightly
trigger.

Post-merge:

- [ ] `git tag module/v0.1.0 <merge-sha> && git push origin module/v0.1.0`.

## Why this is a root-cause fix, not a workaround

Two non-obvious decisions in this diff — both root-cause, not
workarounds:

1. **`replaces: ../module` (not `./module`) in builder-config.yaml.**
OCB writes the `replaces:` directive verbatim into `./_build/go.mod`.
Paths in `go.mod` resolve relative to that file's directory, which is
one level deeper than repo root. `./module` would fail to resolve;
`../module` resolves correctly to `<repo>/module/`.
2. **`module/go.mod` pins collector v0.110.0, not the root's v1.59.0.**
Go's MVS (minimal version selection) resolves the require graph of
`_build/go.mod`, which includes the submodule's `go.mod`. If the submodule
pinned the higher version, MVS would pull the collector forward inside
`_build/go.mod`, and `hostmetricsreceiver@v0.110.0` (added in PR-E to
replace `clockreceiver`) would fail to build because `scraperhelper` moved
out of `go.opentelemetry.io/collector/receiver` between v0.110.0 and
v1.59.0. The root module independently uses v1.59.0 for its non-OCB code
paths; MVS reconciles to the higher version where root + submodule
overlap, which is fine for root.
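
The path reasoning in item 1 can be checked with a short sketch. It assumes a repository checked out at the placeholder path `/repo` and uses the slash-only `path` package so the result is the same on every platform. `resolveReplace` is a hypothetical helper that mimics how a relative `replace` target resolves against the directory holding the `go.mod` that contains it.

```go
package main

import (
	"fmt"
	"path"
)

// resolveReplace joins a relative replace target onto the directory that
// contains the go.mod file, then cleans the result. Hypothetical helper
// for illustration; "/repo" below is a placeholder checkout location.
func resolveReplace(goModDir, target string) string {
	return path.Join(goModDir, target)
}

func main() {
	buildDir := "/repo/_build" // where OCB writes the generated go.mod

	// builder-config.yaml value: resolves to the in-repo submodule.
	fmt.Println(resolveReplace(buildDir, "../module")) // /repo/module

	// The tempting alternative points at a directory that does not exist.
	fmt.Println(resolveReplace(buildDir, "./module")) // /repo/_build/module

	// The root go.mod sits at repo root, so "./module" is correct there.
	fmt.Println(resolveReplace("/repo", "./module")) // /repo/module
}
```

This is why the same submodule appears as `../module` in `builder-config.yaml` and as `./module` in the root `go.mod`: each path is relative to a different `go.mod`.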

```release-notes
none
```

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>