feat(pivot): port stdoutexporter off internal/selftelemetry #186

Merged
trilamsr merged 2 commits into main from pr-stdoutexporter-selftel-port on May 31, 2026

trilamsr commented May 31, 2026

Summary

Port components/exporters/stdoutexporter off internal/selftelemetry. This is the exporter-side half of RFC-0013 §migration PR-B — sibling to the receiver-side PR-B1 (#184) that ported components/receivers/nccl_fr.

Same shape as PR-B1: the component-scoped self-telemetry helpers travel as siblings (selftel.go + selftel_test.go) co-located with the component, the scope name is the component's Go import path, and metric names + label shape are preserved bit-for-bit so dashboards / alerts do not regress.

Root cause

Not a bug — a planned dependency severance.

v0.1.x stdoutexporter imports internal/selftelemetry for the Exporter interface, the NewNoopExporter / NewExporter constructors, the Kind type, and RecordInitError. RFC-0013 PR-F deletes internal/selftelemetry entirely (the runtime that consumes it dies in the same wave). Every component carrying that import has to migrate to a sibling helper before PR-F can land. PR-B1 did the receiver; this PR does the exporter.

Changes

  • components/exporters/stdoutexporter/selftel.go (new): sibling selfExporter interface + selfExporterImpl + recordInitError. Emits tracecore.exporter.calls_total{result,kind,component_id} on a meter acquired from set.Telemetry.MeterProvider. Scope = github.com/tracecoreai/tracecore/components/exporters/stdoutexporter (OTel convention).
  • components/exporters/stdoutexporter/selftel_test.go (new): 7 tests — noop safety, nil-MP error sentinel, calls_total emission + {result,kind,component_id} label shape, scope-name standard, init_errors_total tick + nil-MP safety, factory noop-fallback wiring with synthetic register failure, and a compile-time sibling-types guard.
  • components/exporters/stdoutexporter/stdoutexporter.go (rewired): use sibling types; drop internal/selftelemetry import; drop (*stdoutExporter).SelfExporter().
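
A minimal sketch of the sibling shape described above, assuming a single hypothetical `IncCall` method and an in-memory map standing in for the OTel counter (the real `selftel.go` registers `tracecore.exporter.calls_total` on a meter from `set.Telemetry.MeterProvider`; its method names are not shown in this PR). The `success` / `error` result values and the `stdout` component ID are placeholders. It shows the no-op default plus a live implementation that keys counts by the `{result,kind,component_id}` label shape.

```go
package main

import (
	"fmt"
	"sync"
)

// kind classifies a call outcome; "io" is the only value assumed here.
type kind string

const kindIO kind = "io"

// selfExporter is the package-local telemetry surface.
// IncCall is a hypothetical method name chosen for this sketch.
type selfExporter interface {
	IncCall(result string, k kind)
}

// noopSelfExporter is safe on the hot path when telemetry is not wired.
type noopSelfExporter struct{}

func (noopSelfExporter) IncCall(string, kind) {}

// labelSet mirrors the {result,kind,component_id} label shape.
type labelSet struct {
	Result, Kind, ComponentID string
}

// countingSelfExporter stands in for an OTel counter: it keeps one
// in-memory count per label set instead of emitting to a meter.
type countingSelfExporter struct {
	mu          sync.Mutex
	componentID string
	calls       map[labelSet]int64
}

func newCountingSelfExporter(componentID string) *countingSelfExporter {
	return &countingSelfExporter{componentID: componentID, calls: make(map[labelSet]int64)}
}

func (c *countingSelfExporter) IncCall(result string, k kind) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.calls[labelSet{Result: result, Kind: string(k), ComponentID: c.componentID}]++
}

// count returns the current value for one label combination.
func (c *countingSelfExporter) count(result string, k kind) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.calls[labelSet{Result: result, Kind: string(k), ComponentID: c.componentID}]
}

func main() {
	var tel selfExporter = noopSelfExporter{} // default until telemetry is wired
	tel.IncCall("success", "")

	live := newCountingSelfExporter("stdout")
	tel = live
	tel.IncCall("error", kindIO)
	fmt.Println(live.count("error", kindIO))
}
```

Keeping the interface unexported and co-located with the component is what lets the `internal/selftelemetry` import disappear without changing what the scrape surface reports.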

ExporterCarrier removal — rationale

PR-B1 (nccl_fr receiver) didn't expose any selftelemetry types to the runtime — it was a clean port. stdoutexporter is different: v0.1.x exposed SelfExporter() selftelemetry.Exporter so cmd/tracecore/collect.collectFailureRateReaders could feed tracecore.exporter.failure_rate. This PR drops that contract:

  • The runtime's reader-collection path silently skips components that don't implement ExporterCarrier — documented "no per-exporter signal" degraded mode.
  • stdoutexporter is the canonical debug / example exporter (writes JSON lines to stdout). Operators don't alert on its failure_rate. Real backends in components/exporters/otlphttp carry that contract instead.
  • tracecore_exporter_failure_rate still surfaces in scrape via the SLO observable gauge (reports 0 with no readers registered) — the M2 acceptance check in cmd/tracecore/integration_telemetry_test.go still passes.
  • tracecore_exporter_calls_total continues to surface because the sibling impl emits it on set.Telemetry.MeterProvider directly, just under a new scope.
  • PR-F deletes the entire internal/selftelemetry package, so the ExporterCarrier contract evaporates regardless.
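
The "silently skips" behavior in the first bullet is the usual Go optional-interface pattern. The sketch below is an illustration under assumptions, not the `collectFailureRateReaders` source: `failureRateReader`, `component`, and the mean-based aggregation are hypothetical names and choices made for this example. It shows that a component without the carrier method is skipped without error, and that the aggregate reports 0 when no readers are registered.

```go
package main

import "fmt"

// failureRateReader is a hypothetical stand-in for what a carrier exposes.
type failureRateReader interface {
	FailureRate() float64
}

// exporterCarrier is the optional interface a component may implement.
type exporterCarrier interface {
	SelfExporter() failureRateReader
}

// component is the minimal common surface assumed for this sketch.
type component interface {
	Name() string
}

type fixedRate float64

func (f fixedRate) FailureRate() float64 { return float64(f) }

// carrying implements the optional carrier interface.
type carrying struct{ rate float64 }

func (carrying) Name() string                      { return "carrying" }
func (c carrying) SelfExporter() failureRateReader { return fixedRate(c.rate) }

// plain does not implement the carrier, like stdoutexporter after this PR.
type plain struct{}

func (plain) Name() string { return "plain" }

// collectReaders keeps only components that implement exporterCarrier;
// everything else is skipped without an error (the degraded mode).
func collectReaders(components []component) []failureRateReader {
	var readers []failureRateReader
	for _, c := range components {
		carrier, ok := c.(exporterCarrier)
		if !ok {
			continue
		}
		readers = append(readers, carrier.SelfExporter())
	}
	return readers
}

// aggregateFailureRate reports 0 with no readers; the mean is an
// assumption of this sketch, not the real gauge's formula.
func aggregateFailureRate(readers []failureRateReader) float64 {
	if len(readers) == 0 {
		return 0
	}
	var sum float64
	for _, r := range readers {
		sum += r.FailureRate()
	}
	return sum / float64(len(readers))
}

func main() {
	readers := collectReaders([]component{plain{}, carrying{rate: 0.5}})
	fmt.Println(len(readers), aggregateFailureRate(readers), aggregateFailureRate(nil))
}
```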

Rationale documented inline in stdoutexporter.go and the selfExporter docstring so a six-months-cold reader knows why the carrier is missing.

Test plan

  • make check (gofumpt, golangci-lint, go vet, go mod verify) — green
  • go test ./components/exporters/stdoutexporter/... -count=1 — 13 tests pass (7 new selftel + 6 existing)
  • go test -race ./components/exporters/stdoutexporter/... -count=1 — green
  • go test ./cmd/tracecore/ -run TestIntegration_TelemetrySurface_EndToEnd -count=1 — passes (asserts both tracecore_exporter_calls_total and tracecore_exporter_failure_rate appear in scrape)
  • go test ./... -count=1 -short — full repo green; no name regression in cmd/tracecore, internal/telemetry, or internal/selftelemetry (which still has its own tests until PR-F)
  • Scope-name pin: assertion against github.com/tracecoreai/tracecore/components/exporters/stdoutexporter in selftel_test.go
  • Sibling-types compile-time guard: asSelfExporter helper forces compile break if the type ever moves back to internal/selftelemetry
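
For readers unfamiliar with compile-time guards, here is a minimal sketch of the idea, assuming a one-method interface. The body of the real `asSelfExporter` helper is not shown in this PR, so the version below is a guess at its shape. The guard works because the Go compiler rejects any argument whose type does not satisfy the package-local interface, so drift shows up as a build failure rather than a runtime surprise.

```go
package main

import "fmt"

// selfExporter is the package-local interface (one hypothetical method).
type selfExporter interface {
	IncCall(result string)
}

// selfExporterImpl is the package-local implementation.
type selfExporterImpl struct{ calls int }

func (s *selfExporterImpl) IncCall(string) { s.calls++ }

// Compile-time assertion: the build fails if *selfExporterImpl ever
// stops satisfying selfExporter.
var _ selfExporter = (*selfExporterImpl)(nil)

// asSelfExporter only accepts values assignable to the local interface,
// so a test calling it stops compiling if the field it passes is
// retyped to something that no longer satisfies that interface.
func asSelfExporter(s selfExporter) selfExporter { return s }

func main() {
	impl := &selfExporterImpl{}
	asSelfExporter(impl).IncCall("success")
	fmt.Println(impl.calls)
}
```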

Pattern continuity

References #184 (PR-B1, nccl_fr) for the receiver sibling pattern. Test helpers (newTestMeterProvider, collectRM, scopeOf, kvMatch, dumpNames, failingExporterMP) mirror the nccl_fr equivalents so an M8+ exporter author can read either pair and infer the convention.

Release notes

NONE: this is a pivot-internal port. Metric names + labels are unchanged on the scrape surface. Removing SelfExporter(), a method on the unexported stdoutExporter type, is not observable to any v0.1.x operator (only cmd/tracecore's reader-collection path read it, and that path's "skip on absent" branch is documented behavior).

NONE

Mirrors the PR-B1 sibling pattern (#184 — nccl_fr receiver) for the
stdoutexporter component, the exporter-side half of RFC-0013
§migration PR-B.

## Changes

- Add `components/exporters/stdoutexporter/selftel.go`: sibling
  selfExporter interface + impl + recordInitError. Metric name
  (`tracecore.exporter.calls_total`) and label shape
  (`{result,kind,component_id}`) are preserved bit-for-bit so
  dashboards / alerts that pin those names don't regress.
- Add `components/exporters/stdoutexporter/selftel_test.go`: 7 tests
  pinning noop safety, nil-MP error, calls_total emission + label
  shape, scope-name standard, init_errors_total tick + nil safety,
  factory noop-fallback wiring, and a compile-time sibling-types
  guard that prevents drift back to the internal package.
- Rewire `stdoutexporter.go` to use the sibling types; drop
  `internal/selftelemetry` import entirely.
- Drop `(*stdoutExporter).SelfExporter()` (the
  `selftelemetry.ExporterCarrier` implementation). Rationale
  documented inline:
  - stdoutexporter is the canonical debug / example exporter; the
    runtime's reader-collection path silently skips components that
    don't implement the carrier (documented degraded mode).
  - `tracecore_exporter_failure_rate` still surfaces in scrape via
    the SLO observable gauge (reports 0 with no readers) — the M2
    acceptance check in `cmd/tracecore/integration_telemetry_test`
    still passes.
  - PR-F deletes `internal/selftelemetry` entirely anyway.

## Scope-name standard

Instrumentation scope = exporter's Go import path
(`github.com/tracecoreai/tracecore/components/exporters/stdoutexporter`),
per OTel convention + matching the PR-B1 receiver choice. Will move
with the exporter to `module/exporter/stdoutexporter/` in PR-I.

## Verification

- `make check` green (gofumpt + golangci-lint + go vet + go mod verify).
- `go test ./... -count=1 -short` green across the repo, including
  `cmd/tracecore` integration tests that scrape `/metrics` for
  `tracecore_exporter_calls_total` and `tracecore_exporter_failure_rate`.
- `go test -race` green for the stdoutexporter package.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr enabled auto-merge (squash) May 31, 2026 04:10
trilamsr disabled auto-merge May 31, 2026 04:13
Two cross-cut-reviewer findings against PR #186:

1. Symmetry: existing register-failure fallback had a test
   (TestFactory_FallsBackToNoopWhenMeterFails) but the nil-MeterProvider
   branch did not. Adds TestFactory_FallsBackToNoopWhenMeterProviderIsNil
   to pin (a) factory returns without error, (b) telemetry field is the
   non-nil noop, (c) hot-path Inc* calls do not panic. Documents the
   intentional skip-tick semantic for the nil path (recordInitError is
   only meaningful when telemetry is wired but registration failed).

2. Operator-visible telemetry gap: stdoutexporter no longer contributes
   to tracecore_exporter_failure_rate (ExporterCarrier dropped). Adds a
   row to the v0.1->v0.2 migration table so operators on stdoutexporter-
   only pipelines know to switch to a real backend (otlphttp, etc.) for
   per-exporter failure-rate alerts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr enabled auto-merge (squash) May 31, 2026 04:27
trilamsr merged commit 32e241f into main May 31, 2026
13 of 14 checks passed
trilamsr deleted the pr-stdoutexporter-selftel-port branch May 31, 2026 04:47
trilamsr added a commit that referenced this pull request May 31, 2026
## Root cause

PR #180 (`chore(pivot): PR-E unblock — bench heartbeat to
hostmetricsreceiver`) enabled the `hostmetrics` receiver in
`bench/install/tracecore-values.yaml` and added it to
`builder-config.yaml`, but the install-bench `Dockerfile` was still
building from `./cmd/tracecore`. The generated
`cmd/tracecore/components.go` only registers in-tree receivers —
`hostmetricsreceiver` is upstream OTel-contrib and is only bundled by
the OCB-assembled binary at `_build/tracecore`. The daemonset pod failed
config load with `unknown component type hostmetrics`, `kubectl rollout
status` timed out at 5 m, and `bash -e` aborted `run.sh` before any
diagnostics fired — so CI showed a bare red with no actionable log.

`install-bench` has been red on `main` since 2026-05-31T02:28:40Z and on
every PR opened after #180.

**Affected PRs (open as of this writing)**: #186, #187, #188, #189.

## Fix

Switch `install/kubernetes/tracecore/Dockerfile` to build via OCB:

```
make build-ocb           # generates ./_build/{main.go,go.mod,...} + compiles
cd _build && go build .  # re-link with CGO_ENABLED=0 -trimpath -ldflags "-s -w"
```

The re-link with our flags guarantees the static binary the distroless
base can exec; OCB's intermediate compile uses its own defaults. The
final image still uses `gcr.io/distroless/static-debian12:nonroot` at
the same pinned digest.

This is a **tactical bridge**: PR-A2 (#189) makes `_build/tracecore` the
canonical binary for all builds. Once that lands, the in-tree
`cmd/tracecore` path retires entirely (RFC-0013 PR-F) and this
Dockerfile change becomes the new normal across every image, not just
install-bench.

### Why not the alternatives?

- **Revert hostmetrics in bench values** → walks back PR #180's pivot
intent ("no custom receiver where upstream satisfies"); the legacy
`clockreceiver` is on its way out in PR-K.
- **Add hostmetricsreceiver to `cmd/tracecore/components.go`** →
diverges the in-tree component list from the OCB-managed one; the whole
point of PR-A2 is to delete that divergence.

## Bonus: surface root cause on rollout-status failure

`bench/install/run.sh` had post-deadline diagnostics for the first-data
path, but the rollout-status path (the actual failure mode of this
regression) just exited via `set -e`. Added `dump_failure_diagnostics()`
(pod state, `kubectl describe`, current + previous container logs,
rendered config) wired to both failure paths; refactor eliminates the
duplicated tracecore-pod spelunking that lived inline. Future
regressions surface root cause in the CI log without re-running.

## Verification

```
$ make check
… 0 issues, all modules verified

$ docker build -f install/kubernetes/tracecore/Dockerfile -t tracecore:bench-test .
… exporting to image done

$ docker run --rm tracecore:bench-test components | grep -E "hostmetrics|otlp"
    - name: hostmetrics
      module: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver v0.110.0
    - name: otlp
      module: go.opentelemetry.io/collector/receiver/otlpreceiver v0.110.0
    - name: otlphttp
      module: go.opentelemetry.io/collector/exporter/otlphttpexporter v0.110.0
```

End-to-end install-bench (kind cluster + helm install) runs on this PR
via the workflow itself.

## Cost

Docker build stage adds ~100 s (OCB compile inside Alpine). Bench Docker
rebuild only fires on chart/bench/builder-config changes — acceptable.

```release-notes
[CI] install-bench Dockerfile now builds via OpenTelemetry Collector Builder so the bench daemonset can load hostmetricsreceiver; also dumps pod state, logs, and rendered config on rollout-status failure. Unblocks every PR opened after #180.
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr pushed a commit that referenced this pull request May 31, 2026
Two-file conflict from main landings since last sync:

- install/kubernetes/tracecore/Dockerfile: both sides build the
  OCB-generated _build/tracecore binary. Took PR-A2's canonical
  `make build` invocation (the OCB target) and kept #190's
  defense-in-depth re-link inside ./_build/ with distroless flags
  (CGO_ENABLED=0, -trimpath, -s -w) so the resulting binary is
  guaranteed-static for distroless/static-debian12.
- docs/migration/v0.1-to-v0.2.md: PR-A2 added two self-telemetry
  rows (otelcol_* rename + telemetry.listen split), #186 added a
  stdoutexporter row. All three rows belong; kept all three.

values.yaml + .github/workflows/chart.yml had no actual conflicts
after fetch — must have auto-merged in the prior sync.

Gates: make check / go test ./... / helm lint / helm template /
make build / ./_build/tracecore validate against rendered chart —
all green.

Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
…193)

## Summary

Sibling-port the `otlphttp` exporter off `internal/selftelemetry` so
RFC-0013 PR-F can delete the internal package. Mirrors the
stdoutexporter precedent landed in #186.

- `components/exporters/otlphttp/selftel.go` owns the package-local
`kind`, `selfExporter`, `newSelfExporter`, and `recordInitError` (no
`internal/selftelemetry` import).
- Metric contract preserved: `tracecore.exporter.calls_total{result,
kind, component_id}` with the same three kind values the v0.1.x exporter
emitted (`marshal`, `io`, `downstream`). `init_errors_total` ticks on
register failure via the same factory fallback shape as stdoutexporter.
Dashboards / alerts keyed on the counter do not regress.
- Instrumentation scope name pins to the exporter's Go import path
(`github.com/tracecoreai/tracecore/components/exporters/otlphttp`); when
the exporter moves under `module/` in PR-I, the scope name moves with
it.

## ExporterCarrier dropped (intentional gap)

The v0.1.x `SelfExporter() selftelemetry.Exporter` carrier is removed.
There is no current production consumer of the carrier in this tree, and
PR-F deletes the contract regardless. The runtime degrades to the
documented "no per-exporter signal" mode;
`tracecore_exporter_failure_rate` still appears in scrape via the SLO
observable gauge (reports 0 with no readers). Documented at
`newSelfTelemetry` in `otlphttp.go`. Matches the stdoutexporter
precedent.

## Root cause

The otlphttp exporter's `internal/selftelemetry` import was the
load-bearing dependency blocking PR-F from deleting the internal
package. Root-caused at the import itself, not a symptom — fixed by full
sibling port (no shim, no compat layer).

## Test plan

- [x] `make check` (golangci-lint, vet, mod verify) — clean.
- [x] `go test ./components/exporters/otlphttp/...` — all tests green (8
new sibling tests + 5 rewired classify tests + 22 pre-existing
integration tests).
- [x] `go build ./...` — full tree compiles; no other consumers of
otlphttp's selftelemetry surface.
- [x] New tests cover: noop hot-path safety; `errNilMeterProvider`
returned (not silent noop); `calls_total` emission with all three kinds
+ `component_id` label; OTel scope name pinned to the exporter import
path; `init_errors_total` shape with `kind=exporter` +
`reason=instrument_register`; factory fallback when meter registration
fails (synthetic failingExporterMP); factory fallback when MeterProvider
is nil; sibling-types compile pin (any reintroduction of
`internal/selftelemetry` here breaks compile).

```release-notes
NONE
```

Internal refactor — sibling-port of the otlphttp exporter's
self-telemetry to a package-local surface. Metric names, label shape,
and kind values preserved. The v0.1.x `SelfExporter()`
(`ExporterCarrier`) handle is dropped — the runtime degrades to the
documented "no per-exporter signal" mode;
`tracecore_exporter_failure_rate` still appears via the SLO observable
gauge. No operator-facing change.

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
trilamsr added a commit that referenced this pull request May 31, 2026
)

## Summary

Four amendments to `docs/rfcs/0013-distro-first-pivot.md` (plus a
one-line sweep across 6 companion docs) per the scope-review findings
staged before PR-I.1 / PR-K / PR-M code work begins. Pre-stages each
decision in the RFC so the autonomous code PRs don't escalate
mid-flight.

## Root cause

#181 (RFC-0013 PR-I in-repo submodule rescope) was incomplete:

1. Sweep missed 6 companion docs still pointing at the original-design
external `tracecoreai/tracecore-components` repo.
2. §7 listed 3 GitHub workflows for deletion that were already removed
pre-RFC and 1 issue template (`component-bug-dcgm.yml`) that was already
removed pre-RFC.
3. PR-K was a single 4-receiver-delete-plus-chart-migration mega-PR with
no decoupling of the `internal/synthesis/patterns/` k8sevents dep break,
which is on PR-I.2's critical path.
4. PR-I.1 conflated the `module/go.mod` scaffolding with the `git mv` +
package rename, blocking PR-I.1a from landing without PR-B2 even though
the scaffolding step has no nccl_fr dep.

Mid-flight discovery during merge cycle: merged commit #188
(`feat(pivot): PR-B2 — port dcgm off internal selftel + lifecycle`)
reused the `PR-B2` slug for a PR-B1-shape dcgm port (which is moot since
dcgm is deleted entirely in PR-F), creating a naming collision against
the canonical PR-B2 defined in the RFC — the nccl_fr
`internal/{pipeline,consumer,runtime/lifecycle}` → upstream port that
hard-gates the PR-I.1b `git mv`.

## Amendments

1. **§6/§7 sweep miss (Amendment 1)**: Remove surviving
`tracecoreai/tracecore-components` external-repo references across
`docs/getting-started.md`, `docs/followups/M11.md`,
`docs/followups/M19.md`, `docs/FOLLOWUPS.md`,
`docs/rfcs/0003-pipeline-runtime-and-component-contract.md`,
`AGENTS.md`. All re-pointed at `github.com/tracecoreai/tracecore/module`
per RFC-0013 §6. Verified zero surviving stale refs.
2. **§7 nonexistent workflow entries (Amendment 2)**: Collapse
`pyspy-integration.yml`, `python-publish.yml`,
`kernelevents-integration.yml` deletion rows into one row marked
"already removed pre-RFC". `component-bug-dcgm.yml` also already
removed. Only `component-bug-kernelevents.yml` survives for PR-K. §4
v0.3.0 row + PR-M slug cleaned for consistency.
3. **§migration PR-K sub-slice (Amendment 3)**:
- **PR-K.1** — sever `internal/synthesis/patterns/` from
`components/receivers/k8sevents` via local model types in
`internal/synthesis/patterns/model.go`. No deletions. **Unblocks
PR-I.2.**
- **PR-K.2** — delete
`components/receivers/{clockreceiver,kernelevents,k8sevents,containerstdout}`
+ migrate ~86 test fixtures + delete `tools/failure-inject/xidgen/` +
keep `tools/failure-inject/ncclhang/`.
- **PR-K.3** — chart cleanup: flip `containerstdout-on-values.yaml` to
filelog+container-stanza, delete `containerstdout-rbac.yaml`, delete
`.github/ISSUE_TEMPLATE/component-bug-kernelevents.yml`, ship
`NOTES.txt` deprecation + values-key removal.
4. **§migration PR-I sub-slice + PR-B2 promotion (Amendment 4)**:
- **PR-B2** reframed as hard gate for PR-I.1b: port
`components/receivers/nccl_fr` off
`internal/{pipeline,consumer,runtime/lifecycle}` to upstream
`go.opentelemetry.io/collector/{component,receiver,consumer,pipeline}`.
Slug-collision note added re: merged #188.
- **PR-I.1a** — `module/go.mod` + root `go.work` + `builder-config.yaml`
`replaces:` skeleton. No file movement. Tag `module/v0.0.1` (genesis
tag, validates the tagging contract).
- **PR-I.1b** — `git mv components/receivers/nccl_fr →
module/receiver/ncclfrreceiver` + `git mv pkg/nccl/fr_parser →
module/pkg/nccl/fr_parser` + rename Go package `nccl_fr` →
`ncclfrreceiver` + update all importers. Hard-gated on PR-B2. No new
tag; next bump is `module/v0.1.0` at PR-I.2.
- **PR-I.2** — `rankjoinprocessor` + `patterndetectorprocessor` net-new.
Hard-gated on PR-K.1. Tag `module/v0.1.0` (first version pinned in
`builder-config.yaml` for v0.2.0).

Also: PR-J marked `(landed, #195)` with note that recipe docs landed but
chart-values compat map follows in PR-K.3.

## Adversarial review (5 lenses, inline)

- **(a) PR slug internal consistency**: PR-I.1b ↔ PR-B2 ↔ PR-I.2 ↔
PR-K.1 bidirectional gates all match. PR-J landed marker consistent with
#195. §4 v0.2.0 row has pre-existing drift (mentions dcgm+kueue in
v0.2.0 when PR-F+#168 already deleted them in v0.1.0) — out of these 4
amendments' scope; flag for follow-up.
- **(b) PR-B2 hard-gate naming**: tightened from "Hard gate for PR-I.1"
to "Hard gate for PR-I.1b" — accurate because PR-I.1a is
scaffolding-only with no file movement.
- **(c) Sub-PR numbering collision**: #188 explicitly addressed in
slug-collision note. #185/#186/#187/#193/#194/#196 are PR-B1-shape ports
for non-nccl receivers (no PR-slug label in their commits), no
collision.
- **(d) Stale external-repo refs**: `grep -rn
"tracecoreai/tracecore-components" docs/ AGENTS.md README.md` returns
zero hits post-amendment.
- **(e) Cross-reference link integrity**:
`docs/migration/v0.1-to-v0.2.md` references `#migration--rollout` and
`#3-customer-stable-telemetry-contracts`; both anchors preserved (§
headers unchanged). `make doc-check` confirms 526 markdown links
resolve.

## Test plan

- [x] `make doc-check` — 526 markdown links resolve, 0 stale refs,
banned-phrase lint clean, alert-check + chart-appversion gates green.
- [x] Pre-push hooks: golangci-lint clean, go vet clean, go mod verify
clean.
- [ ] CI doc-check + actionlint + zizmor gates pass on PR.

```release-notes
NONE
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

Port `components/exporters/stdoutexporter` off
`internal/{pipeline,consumer}` onto upstream OTel collector v1.59.0 /
v0.153.0 types. Mirrors PR-B2 #201 (nccl_fr receiver port),
exporter-flavor. Sibling self-telemetry from PR #186 stays in place —
only the `pipeline.ID` → `component.ID` + `pipeline.CreateSettings` →
`exporter.Settings` type-swaps in tests are needed.

## Type swap

| Before (internal) | After (upstream) |
|---|---|
| `internal/pipeline.ExporterFactory` | `exporter.Factory` |
| `internal/pipeline.CreateSettings` | `exporter.Settings` |
| `internal/pipeline.Config` | `component.Config` |
| `internal/pipeline.Exporter` | `exporter.Metrics` |
| `internal/consumer.{Metrics,Capabilities}` | `consumer.{Metrics,Capabilities}` |
| `pipeline.ComponentState` (embedded) | explicit no-op `Start`/`Shutdown` |
| `pipeline.ErrSignalNotSupported` (manual returns) | upstream's built-in unregistered-signal sentinel |
| `*slog.Logger` | `*zap.Logger` (factory-side only; exporter has no log sites) |

Factory shape matches upstream-contrib (`debugexporter`,
`fileexporter`):

```go
func NewFactory() exporter.Factory {
    return exporter.NewFactory(
        componentType(),
        createDefaultConfig,
        exporter.WithMetrics(createMetrics, stability),
    )
}
```

`WithMetrics` is the only signal registered. Logs + traces fall through
to upstream's built-in *"telemetry type is not supported"* sentinel —
tests pin the surfaced message so a future upstream rename surfaces here
at test-time rather than as a silent contract regression.

## Root cause

RFC-0013 §migration mandates that `internal/pipeline`,
`internal/consumer`, and `internal/selftelemetry` are deleted in PR-F.
Every in-tree component must first switch to upstream OTel types so PR-F
can land cleanly. PR-B2 #201 did this for the nccl_fr receiver; this PR
is the exporter sibling. Not a workaround — this is the migration step
itself.

## Why drop `pipeline.ComponentState`

`stdoutexporter` is purely synchronous (no goroutines, no resources to
acquire). The embedded `ComponentState` only existed to provide free
`Start`/`Shutdown`/`Started`/`Stopped` accessors. Upstream
`component.Component` requires `Start`/`Shutdown` (now explicit no-ops)
but does NOT require `Started`/`Stopped` — those tests are dropped. The
`lifecycle` helper that nccl_fr's port introduced is intentionally
absent here: the exporter has no run-loop to bookend.

## New adversarial tests added with the swap

- `TestFactory_CreateMetrics_BadConfigType` — pins the type-guard in
`createMetrics` so a non-`*Config` cfg surfaces a clear error rather
than panicking on type assertion.
- `TestExporter_StartShutdown_NoOps` — pins the no-op contract so a
future refactor that introduces a side-effecting `Start`/`Shutdown` is
forced to add its own test.
- `TestExporter_WriteErrorSurfaces` — pins the I/O failure path: a
writer that errors must surface a wrapped error from `ConsumeMetrics`
(paired with the existing `kindIO` selftel coverage).
- `var _ exporter.Factory = NewFactory()` — compile-time pin against
silent regression back to `internal/pipeline.ExporterFactory`.

## LOC delta

```
components/exporters/stdoutexporter/  6 files, +221 / -142
go.mod                                +6 / -0
go.sum                                +40 / -2
```

Net +79 LOC for the exporter package, almost entirely the four new
adversarial tests above and the type-swap boilerplate the upstream
contract requires (separate `componentType` + `createDefaultConfig` +
`createMetrics` funcs vs. one struct with three methods).

## Test plan

- [x] `make check` — clean (fmt, tidy, lint, vet, mod-verify all green).
- [x] `go test -race -count=1 ./components/exporters/stdoutexporter/...`
— green.
- [x] `go test ./...` — green except three pre-existing flakes (verified
by stashing this diff and re-running on `origin/main`):
  - `components/receivers/kernelevents/TestReceiver_SLIBudget` (p99 perf budget; flakes ~50% on local Darwin)
  - `components/receivers/kernelevents/TestKernelevents_Lifecycle_ConcurrentAddDuringShutdown_NoPanic`
  - `components/receivers/k8sevents/TestK8sevents_Lifecycle_ConcurrentAddDuringShutdown_NoPanic`
- [ ] CI to run the same gates on a fresh checkout.

```release-notes
NONE
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
#206)

## Summary

Deletes the three internal moats and the in-tree DCGM receiver that
RFC-0013 §migration step 8 promised — the payoff for the wave-3
sibling-port PRs (#184/#185/#186/#187/#188/#193/#194/#196/#197).

**Net: -12,482 LOC across 92 files (78 deletions, 14 modifications).**

### What deletes

| Path | LOC | Why safe now |
|---|---|---|
| `components/receivers/dcgm/` | 7,604 | cgo stub never shipped real code; #188's PR-B2-shaped dcgm sweep already removed the live port surface. |
| `pkg/dcgm/` | 922 | Only consumer was the deleted receiver. Bonus cleanup. |
| `internal/selftelemetry/` | 1,946 | Every consumer (containerstdout, clockreceiver, kernelevents, k8sevents, nccl_fr, dcgm, pyspy, stdoutexporter, otlphttp) ported onto receiver/exporter-scoped sibling `selftel.go` files. |
| `internal/telemetry/` | 1,991 | Probes flow through upstream `healthcheckextension`; MeterProvider via upstream `service.telemetry`. Only remaining consumers were `internal/selftelemetry/*_test.go` (deleted together) + one orphan clockreceiver test. |
| `components/receivers/clockreceiver/errors_integration_test.go` | 100 | Orphan from #185's PR-B1 clockreceiver port: bootstrapped via the deleted `selftelemetry.Receiver` interface but never migrated to the receiver-scoped sibling `selftel.go`. Covered behaviour ("errors_total surfaces on downstream failure") is now exercised through clockreceiver's sibling tests. |

### Pre-flight grep evidence (post-merge of origin/main)

```
$ grep -rn "tracecoreai/tracecore/internal/selftelemetry" --include="*.go" .
(zero matches)

$ grep -rn "tracecoreai/tracecore/internal/telemetry" --include="*.go" .
(zero matches)

$ grep -rn "tracecoreai/tracecore/components/receivers/dcgm" --include="*.go" .
$ grep -rn "tracecoreai/tracecore/pkg/dcgm" --include="*.go" .
(zero matches)
```

### Tooling

- Retire the `dcgm` build tag — `make build-tags` no longer vets `-tags
dcgm` (kept as a hook for future build-tag-gated paths).
- `make bench-check` loop drops both deleted package rows
(`internal/telemetry`, `components/receivers/dcgm`).
- `scripts/register-lint.sh` allowlist emptied (the two
`internal/telemetry/{build_info,slo}.go` entries are gone with the
package; allowlist comment notes the post-PR-F.1 state).
- `go.mod` direct deps shrink — `github.com/prometheus/client_golang`
and `go.opentelemetry.io/otel/exporters/prometheus` drop to indirect
(they were used by `internal/telemetry/server.go`).

### Chart toggles intentionally retained

Chart `receivers.dcgm` toggle + `templates/NOTES.txt` warning +
`templates/_helpers.tpl` doc-comment list keep the `dcgm` symbol for the
migration window. The toggle has been inert since PR-A2 — operators
enabling `receivers.dcgm.enabled=true` already crashed at boot because
the OCB binary doesn't register the factory. PR-K removes the toggle
entirely alongside the chart-default flip from `clockreceiver` →
`hostmetrics` and the v0.2.0 recipe migration.

### Doc sweep

- `internal/runtime/lifecycle/lifecycle.go` doc-comment: drop the dcgm
pointer; flag containerstdout as the sole remaining in-tree consumer;
reschedule the package itself for PR-F.2 deletion once containerstdout
ports off the helper or PR-K.2 deletes the receiver.
- `docs/FAILURE-MODES.md` self-tel-surface rows rewired from
`internal/telemetry/server_test.go::*` (deleted) to upstream-delegated
wording.
- `docs/patterns/{README,pattern-{1,3,4,5}}.md` replay-test pointers
updated — the in-tree `components/receivers/dcgm/pattern_replay_test.go`
is gone; pattern replay now flows through
`docs/integrations/prometheus-scrape.md` (PR-J's upstream
`dcgm-exporter` recipe).
- `docs/README.md` per-component table: drop the deleted
`internal/telemetry/{README,SECURITY}.md` rows + the deleted
`components/receivers/dcgm/{README,RUNBOOK}.md` rows.
- `STYLE.md` vendor-SDK section: drop the `pkg/dcgm/` reference + the
`//go:build dcgm` example; explicit cross-reference to PR-F.1 in the
integration-test build-tag note.
- `CHANGELOG.md`: PR-F.1 landed entry under Unreleased; "Remaining
v0.1.0 work" line updated to point at PR-F.2.
- `docs/rfcs/0013-distro-first-pivot.md` §migration step 8: PR-F entry
replaced with the PR-F.1/PR-F.2 split + the explicit rationale
(componentstatus travels with pipeline; pipeline is out of PR-F's scope
per line 240's original framing).

### Out of scope (PR-F.2 follow-up)

- `internal/componentstatus/` — 5-line `ReportStatus` free function.
Travels with `internal/pipeline` (its only non-test consumers are
`internal/pipeline/runtime_test.go` +
`internal/pipeline/pipelinetest/fixture_test.go`). Deletion lands when
pipeline migrates to upstream
`go.opentelemetry.io/collector/component/componentstatus`.

### Rationale links

- RFC-0013 §migration step 8 — the PR-F entry now codifies the F.1/F.2
split in this branch's RFC update.
- PR-B2 scope-discovery (#188) — established the "rename + slim, don't
reshape" pattern for the dcgm sweep that retired the cgo path.
- Wave-3 PRs that unblocked selftelemetry deletion: #184 (nccl_fr), #185
(clockreceiver), #186 (stdoutexporter), #187 (kernelevents), #188
(dcgm), #193 (otlphttp), #194 (pyspy), #196 (k8sevents), #197
(containerstdout).

```release-notes
[CHANGE] internal/{selftelemetry,telemetry} packages deleted; components/receivers/dcgm + pkg/dcgm deleted. Operators using the v0.1.x in-tree `tracecore.*` self-telemetry metric names migrate per docs/migration/v0.1-to-v0.2.md. Third-party importers of internal/* (unlikely pre-1.0) lose the `selftelemetry.{Receiver,Exporter}` interfaces and the `telemetry.MeterProvider` wrapper; receiver/exporter authors now wire a receiver-scoped sibling `selftel.go` per the PR-B1 pattern.
```
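
As a rough illustration of the sibling `selftel.go` pattern the release note refers to, the sketch below uses stdlib-only stand-ins. The `counter` and `meterProvider` interfaces, `errNilMeterProvider`, `selfExporterOrNoop`, the in-memory test doubles, and the label values are all hypothetical; the real helper acquires its meter from the OTel `MeterProvider` in the component's telemetry settings. Only the metric name, the label keys, and the scope name are taken from this PR's description.

```go
package main

import (
	"errors"
	"fmt"
)

// counter and meterProvider are hypothetical stand-ins for the OTel metric API.
type counter interface {
	Add(n int64, labels map[string]string)
}

type meterProvider interface {
	Counter(scope, name string) (counter, error)
}

// scopeName follows the convention in the PR: the component's Go import path.
const scopeName = "github.com/tracecoreai/tracecore/components/exporters/stdoutexporter"

var errNilMeterProvider = errors.New("selftel: nil meter provider")

// selfExporter is the package-local self-telemetry surface.
type selfExporter interface {
	RecordCall(result, kind string)
}

// noopSelfExporter is the safe fallback when registration fails.
type noopSelfExporter struct{}

func (noopSelfExporter) RecordCall(string, string) {}

type selfExporterImpl struct {
	calls       counter
	componentID string
}

func (s *selfExporterImpl) RecordCall(result, kind string) {
	s.calls.Add(1, map[string]string{
		"result": result, "kind": kind, "component_id": s.componentID,
	})
}

// newSelfExporter returns a sentinel error for a nil provider instead of panicking.
func newSelfExporter(mp meterProvider, componentID string) (selfExporter, error) {
	if mp == nil {
		return nil, errNilMeterProvider
	}
	c, err := mp.Counter(scopeName, "tracecore.exporter.calls_total")
	if err != nil {
		return nil, fmt.Errorf("register calls_total: %w", err)
	}
	return &selfExporterImpl{calls: c, componentID: componentID}, nil
}

// selfExporterOrNoop mirrors the factory's noop-fallback wiring.
func selfExporterOrNoop(mp meterProvider, componentID string) selfExporter {
	se, err := newSelfExporter(mp, componentID)
	if err != nil {
		return noopSelfExporter{}
	}
	return se
}

// memCounter and memProvider are in-memory test doubles.
type memCounter struct{ adds []map[string]string }

func (c *memCounter) Add(_ int64, labels map[string]string) { c.adds = append(c.adds, labels) }

type memProvider struct{ c *memCounter }

func (p memProvider) Counter(_, _ string) (counter, error) { return p.c, nil }

func main() {
	c := &memCounter{}
	selfExporterOrNoop(memProvider{c: c}, "stdout/default").RecordCall("success", "metrics")
	selfExporterOrNoop(nil, "stdout/default").RecordCall("success", "metrics") // noop path
	fmt.Println("recorded calls:", len(c.adds))
}
```

The noop fallback keeps a telemetry registration failure from blocking component start-up, which matches the "factory noop-fallback wiring" test listed in the PR description.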

## Test plan

- [x] `make verify` (lint + vet + tidy-check + mod-verify +
license-check + generate-fixtures-check + build-tags + nccl-fr-rce-gate
+ register-lint + actionlint + zizmor + doc-check + no-autoupdate-check)
— exit 0.
- [x] `go test ./...` — all green (29 packages).
- [x] `make build` (OCB) — `./_build/tracecore` produced.
- [x] `./_build/tracecore --version` — `tracecore version
0.1.0-m9-alpha`.
- [x] Pre-flight greps for all four deleted paths — zero external
importers.
- [ ] CI green on PR (linux/race matrix, chart render, install-bench,
zizmor, govulncheck).
- [ ] Operator verification that the chart's `dcgm` toggle remains inert
post-merge (no behaviour change from main — already inert since PR-A2).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

Deletes the seven `internal/*` packages that RFC-0013 §migration step 8
PR-F.2 promised once the upstream-port wave
(#201/#202/#203/#204/#205/#207/#208/#209) cleared every external caller
of the in-tree pipeline runtime.

**Net: -6,888 LOC across 56 deleted files, +80 LOC across 14 modified
files. 70 files total.** This is the final cut of RFC-0013 §migration
step 8 PR-F.

## What deletes

| Path | LOC | Replacement |
|---|---|---|
| `internal/pipeline/` | 4,134 | `go.opentelemetry.io/collector/service` (OCB-generated `_build/main.go` consumes `builder-config.yaml`). |
| `internal/pipelinebuilder/` | 1,282 | Same: assembly is upstream `service`. |
| `internal/config/` | 718 | Upstream `confmap` providers (`file`, `yaml`, `env`). |
| `internal/consumer/` | 87 | Upstream `go.opentelemetry.io/collector/consumer`. |
| `internal/fanout/` | 366 | Upstream `internal/fanoutconsumer` (collector module). |
| `internal/componentstatus/` | 16 | Upstream `component/componentstatus.ReportStatus` (same free-function shape). |
| `internal/runtime/lifecycle/` | 505 | Per-receiver package-local `lifecycle.go` siblings, already ported during the PR-B1 wave (#184/#185/#186/#187/#194/#196/#197); the in-tree helper had no remaining non-test consumer after PR-F.1 + the wave-2 upstream-port PRs. `kernelevents/lifecycle.go` was inherited from k8sevents (#208). |

## Pre-flight grep evidence

```
$ grep -rnE 'tracecoreai/tracecore/internal/(pipeline|consumer|pipelinebuilder|config|fanout|componentstatus|runtime/lifecycle)' --include='*.go' .
(zero matches)
```
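
The same pre-flight check can be sketched in Go, which is handy if it ever needs to run as a unit test rather than a shell one-liner. Note that `grep` treats a bare `(a|b)` group as alternation only in extended mode (`-E`), whereas Go's `regexp` package always does. The `importers` helper and the in-memory file map are hypothetical; a real check would walk the repository's `*.go` files.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// deletedImport matches an import path under any of the seven deleted packages.
var deletedImport = regexp.MustCompile(
	`tracecoreai/tracecore/internal/(pipeline|consumer|pipelinebuilder|config|fanout|componentstatus|runtime/lifecycle)`)

// importers returns, in sorted order, the paths of every file whose source
// still mentions one of the deleted packages. files maps path to source text.
func importers(files map[string]string) []string {
	var hits []string
	for path, src := range files {
		if deletedImport.MatchString(src) {
			hits = append(hits, path)
		}
	}
	sort.Strings(hits)
	return hits
}

func main() {
	// Hypothetical file contents, for illustration only.
	files := map[string]string{
		"a.go": `import "github.com/tracecoreai/tracecore/internal/fanout"`,
		"b.go": `import "go.opentelemetry.io/collector/consumer"`,
	}
	fmt.Println(importers(files))
}
```

An empty result corresponds to the "(zero matches)" line in the evidence above.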

## Tooling

- `.golangci.yml` `ignore-interface-regexps` repointed at upstream
`consumer.{Metrics,Traces,Logs}` + `component.Component`. The
in-tree-only same-package-error-wrap exemption stays — the STYLE rule
applies regardless of which interface is forwarded.
- `.github/workflows/chaos.yml` drops the `chaos-pipeline-test` job (the
in-tree `internal/pipeline/chaos_test.go` is gone; upstream `service`
provides the equivalent panic-recovery contract). `harness-determinism`
(failure-inject golden-SHA), `cpu-steal-mpstat`, `pattern-pod-evicted`
jobs preserved.
- `.github/workflows/install-bench.yml` drops the
`internal/{pipeline,runtime,selftelemetry}/**` path-filter rows.
- `go.mod` / `go.sum` unchanged.

## Doc sweep

- `CHANGELOG.md` Unreleased: PR-F.2 landed entry replacing the "PR-F.2
deferred" sentence; "Remaining v0.1.0 work" line updated; one dead
`internal/pipeline/README.md` link in Foundation block rewritten as
"deleted at v0.1.0".
- `docs/rfcs/0013-distro-first-pivot.md` §7 deletion table: both
pipeline-internals and runtime/lifecycle rows updated from "v0.1.0
(audit first…)" / "v0.2.0 (with last consumer)" to "v0.1.0 (landed
PR-F.2)". §migration step 8 reframed.
- `docs/FAILURE-MODES.md` Lifecycle / Data flow / Shutdown timing /
Backend tables rewired from in-tree
`internal/{config,pipeline,fanout}/*_test.go::TestName` pointers to
upstream-delegated wording matching the pattern PR-A2 established.
- `docs/STRATEGY.md` "Post-RFC-0013 status" intro updated; "Stable
interfaces in `internal/pipeline/`" graduation row rewritten to point at
the upstream surface.
- `docs/migration/v0.1-to-v0.2.md` `internal/*` section status banner
flipped from "deferred, still present in RC builds" to "landed, deleted
in v0.2.0 builds".
- `MILESTONES.md` v0.1.0 deletions row extended with boot-path
internals; M1 + M4b + M19 rubric details annotated with the PR-F.2
retirement.
- `README.md` Contributor row repointed at upstream
`go.opentelemetry.io/collector` package docs.
- `AGENTS.md` "Self-telemetry internals" bullet split into "Self-tel
internals" + "Pipeline / boot-path internals" with explicit deletion
status.
- `docs/README.md` table row for `internal/pipeline/README.md` dropped.
- `components/receivers/kernelevents/README.md` lifecycle-sibling
rationale updated to past-tense.
- `tools/failure-inject/README.md` "Testing locally" section drops the
`-tags=chaos ./internal/pipeline/...` invocation.

## Sequencing

This PR is hard-gated on every upstream-port PR landing first:

- #201 nccl_fr (PR-B2)
- #202 stdoutexporter
- #203 pyspy
- #204 k8sevents
- #205 clockreceiver (PR-B3)
- #207 otlphttp
- #208 kernelevents
- #209 containerstdout
- #206 PR-F.1 (selftel / telemetry / dcgm)

All nine merged before this PR opened; this is the moat-deletion payoff.
Remaining v0.1.0 work is PR-K (chart-default flip + `clockreceiver` +
`stdoutexporter` + remaining receiver source deletions, coupled with
test-fixture migration and the `telemetry:` values-key deprecation
cycle).

## Test plan

- [x] `make check` — golangci-lint 0 issues, go vet clean, go mod verify
ok.
- [x] `go build ./...` — clean.
- [x] `go test -count=1 ./...` — green (excluding the known
`kernelevents/TestReceiver_SLIBudget` flake called out in #205's body,
which only triggers under heavy parallel `go test ./...` load; passes
standalone).
- [x] `grep` confirms zero non-internal callers of the deleted packages.
- [x] Doc-check pre-push hook passes after the CHANGELOG dead-link fix.

```release-notes
[CHANGE] internal/{pipeline,pipelinebuilder,config,consumer,fanout,componentstatus,runtime/lifecycle} packages deleted. The OCB-generated boot path off builder-config.yaml replaces them. Third-party importers of internal/* (unlikely pre-1.0; the packages live under internal/ and the Go compiler rejects external imports) lose the pipeline-assembly + lifecycle + config-loader surfaces; receiver authors now wire against upstream go.opentelemetry.io/collector/{component,receiver,consumer,pipeline} directly. See docs/migration/v0.1-to-v0.2.md "internal/* package deletion".
```
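
To make the "Go compiler rejects external imports" remark concrete, here is a simplified sketch of the `internal/` visibility rule: a package whose path contains an `internal` element may be imported only by code rooted at the parent of that `internal` directory. `canImport` and `internalParent` are hypothetical helpers written for this illustration; they cover module-style paths only and are not the Go toolchain's implementation. The third-party import path in `main` is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// internalParent returns the path that owns the innermost internal directory
// of target, and whether target is an internal package at all.
func internalParent(target string) (string, bool) {
	switch {
	case strings.HasSuffix(target, "/internal"):
		return strings.TrimSuffix(target, "/internal"), true
	case strings.Contains(target, "/internal/"):
		return target[:strings.LastIndex(target, "/internal/")], true
	}
	return "", false
}

// canImport reports whether importer may import target under the rule.
func canImport(importer, target string) bool {
	parent, isInternal := internalParent(target)
	if !isInternal {
		return true // ordinary packages are importable from anywhere
	}
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	const target = "github.com/tracecoreai/tracecore/internal/pipeline"
	inTree := "github.com/tracecoreai/tracecore/components/receivers/nccl_fr"
	thirdParty := "example.com/thirdparty/app" // hypothetical external importer
	fmt.Println(canImport(inTree, target), canImport(thirdParty, target))
}
```

Under this rule only packages inside the repository's own module tree could ever have imported the deleted packages, which is why the release note calls third-party breakage unlikely.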

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>