feat(pivot): port pyspy off internal selftel + lifecycle (PR-F unblock) #194
Merged
added 3 commits
May 30, 2026 23:05
Sibling-port pattern per PR #184 (nccl_fr reference): co-locate `selftel.go` + `lifecycle.go` inside `components/receivers/pyspy/` so the receiver no longer depends on `internal/selftelemetry` or `internal/runtime/lifecycle`. Unblocks RFC-0013 PR-F deletion of those internal moats.

Scope name pins to the receiver's Go import path (`github.com/tracecoreai/tracecore/components/receivers/pyspy`) per OTel convention; metric names + label shape preserved (`tracecore.receiver.errors_total{kind,component_id}` and siblings) so dashboards/alerts don't regress.

Lifecycle is the single-source slim variant (no `Add()`) since pyspy already fans into its three goroutines via its own `sync.WaitGroup` inside `runAll`.

Kinds become receiver-local typed `kind` constants; the canonical `selftelemetry.KindPanic` is replaced with a local `kindPanic`.

Tests use a sibling-local `fakeTelemetry` captor + `sdkmetric.NewManualReader` so the test surface stays decoupled from `internal/*`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tri Lam <tri@maydow.com>
Three reviewer fixes on PR-F pyspy port:

1. Delete kindSidecarUIDDrift: no impl call site ever ticked it; the const, allKinds entry, kinds_test want-map row, README table row, RUNBOOK section, and RFC-0009 degraded-modes row all removed in lockstep per the kinds_test lockstep contract.
2. Rename TestSelfTelemetry_* / TestFactory_* / TestRecordInitError_* / TestNewFactory_* tests to TestPyspy_* form so test output unambiguously identifies the package under test.
3. Add asSelfTelemetry compile-pin + TestPyspy_SiblingTypesArePackageLocal mirroring the stdoutexporter asSelfExporter pattern; pins the PR-B1 sibling contract that selfTelemetry stays package-local.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tri Lam <tri@maydow.com>
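The lockstep contract and the compile-pin mentioned in these fixes could be sketched roughly as below. This is an illustration only, not the code from the PR: the `RecordError` method, the `kindTargetNotFound` constant, and every kind string value are assumptions, and the real `asSelfTelemetry` helper may have a different shape.

```go
package main

import "fmt"

// kind is a receiver-local typed string: a value of some other package's
// kind type will not compile where this one is expected.
type kind string

const (
	kindPanic          kind = "panic"
	kindTargetNotFound kind = "target_not_found" // hypothetical, for illustration
)

// allKinds is the single list a lockstep test would walk.
var allKinds = []kind{kindPanic, kindTargetNotFound}

// selfTelemetry stands in for the package-local interface; the method
// name is an assumption.
type selfTelemetry interface {
	RecordError(k kind)
}

type noopSelfTelemetry struct{}

func (noopSelfTelemetry) RecordError(kind) {}

// asSelfTelemetry only compiles if its argument satisfies the local
// interface, so the package-level var below breaks the build on drift.
func asSelfTelemetry(t selfTelemetry) selfTelemetry { return t }

var _ = asSelfTelemetry(noopSelfTelemetry{})

// lockstepDiff reports kinds that are declared but undocumented (missing)
// and kinds that are documented but no longer declared (stale).
func lockstepDiff(documented map[kind]string) (missing, stale []kind) {
	declared := make(map[kind]bool, len(allKinds))
	for _, k := range allKinds {
		declared[k] = true
		if _, ok := documented[k]; !ok {
			missing = append(missing, k)
		}
	}
	for k := range documented {
		if !declared[k] {
			stale = append(stale, k)
		}
	}
	return missing, stale
}

func main() {
	documented := map[kind]string{
		kindPanic:           "goroutine panic recovered",
		kindTargetNotFound:  "target process not found",
		"sidecar_uid_drift": "row left behind after the const was deleted",
	}
	missing, stale := lockstepDiff(documented)
	fmt.Println("missing:", missing, "stale:", stale)
}
```

A check of this kind is what forces a deleted constant to take its README, RUNBOOK, and RFC rows with it in the same commit.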
3 tasks
trilamsr added a commit that referenced this pull request on May 31, 2026
## Summary

Four amendments to `docs/rfcs/0013-distro-first-pivot.md` (plus a one-line sweep across 6 companion docs) per the scope-review findings staged before PR-I.1 / PR-K / PR-M code work begins. Pre-stages each decision in the RFC so the autonomous code PRs don't escalate mid-flight.

## Root cause

#181 (RFC-0013 PR-I in-repo submodule rescope) was incomplete:

1. Sweep missed 6 companion docs still pointing at the original-design external `tracecoreai/tracecore-components` repo.
2. §7 listed 3 GitHub workflows for deletion that were already removed pre-RFC and 1 issue template (`component-bug-dcgm.yml`) that was already removed pre-RFC.
3. PR-K was a single 4-receiver-delete-plus-chart-migration mega-PR with no decoupling of the `internal/synthesis/patterns/` k8sevents dep break, which is on PR-I.2's critical path.
4. PR-I.1 conflated the `module/go.mod` scaffolding with the `git mv` + package rename, blocking PR-I.1a from landing without PR-B2 even though the scaffolding step has no nccl_fr dep.

Mid-flight discovery during merge cycle: merged commit #188 (`feat(pivot): PR-B2 — port dcgm off internal selftel + lifecycle`) reused the `PR-B2` slug for a PR-B1-shape dcgm port (which is moot since dcgm is deleted entirely in PR-F), creating a naming collision against the canonical PR-B2 defined in the RFC: the nccl_fr `internal/{pipeline,consumer,runtime/lifecycle}` → upstream port that hard-gates the PR-I.1b `git mv`.

## Amendments

1. **§6/§7 sweep miss (Amendment 1)**: Remove surviving `tracecoreai/tracecore-components` external-repo references across `docs/getting-started.md`, `docs/followups/M11.md`, `docs/followups/M19.md`, `docs/FOLLOWUPS.md`, `docs/rfcs/0003-pipeline-runtime-and-component-contract.md`, `AGENTS.md`. All re-pointed at `github.com/tracecoreai/tracecore/module` per RFC-0013 §6. Verified zero surviving stale refs.
2. **§7 nonexistent workflow entries (Amendment 2)**: Collapse `pyspy-integration.yml`, `python-publish.yml`, `kernelevents-integration.yml` deletion rows into one row marked "already removed pre-RFC". `component-bug-dcgm.yml` also already removed. Only `component-bug-kernelevents.yml` survives for PR-K. §4 v0.3.0 row + PR-M slug cleaned for consistency.
3. **§migration PR-K sub-slice (Amendment 3)**:
   - **PR-K.1**: sever `internal/synthesis/patterns/` from `components/receivers/k8sevents` via local model types in `internal/synthesis/patterns/model.go`. No deletions. **Unblocks PR-I.2.**
   - **PR-K.2**: delete `components/receivers/{clockreceiver,kernelevents,k8sevents,containerstdout}` + migrate ~86 test fixtures + delete `tools/failure-inject/xidgen/` + keep `tools/failure-inject/ncclhang/`.
   - **PR-K.3**: chart cleanup: flip `containerstdout-on-values.yaml` to filelog+container-stanza, delete `containerstdout-rbac.yaml`, delete `.github/ISSUE_TEMPLATE/component-bug-kernelevents.yml`, ship `NOTES.txt` deprecation + values-key removal.
4. **§migration PR-I sub-slice + PR-B2 promotion (Amendment 4)**:
   - **PR-B2** reframed as hard gate for PR-I.1b: port `components/receivers/nccl_fr` off `internal/{pipeline,consumer,runtime/lifecycle}` to upstream `go.opentelemetry.io/collector/{component,receiver,consumer,pipeline}`. Slug-collision note added re: merged #188.
   - **PR-I.1a**: `module/go.mod` + root `go.work` + `builder-config.yaml` `replaces:` skeleton. No file movement. Tag `module/v0.0.1` (genesis tag, validates the tagging contract).
   - **PR-I.1b**: `git mv components/receivers/nccl_fr → module/receiver/ncclfrreceiver` + `git mv pkg/nccl/fr_parser → module/pkg/nccl/fr_parser` + rename Go package `nccl_fr` → `ncclfrreceiver` + update all importers. Hard-gated on PR-B2. No new tag; next bump is `module/v0.1.0` at PR-I.2.
   - **PR-I.2**: `rankjoinprocessor` + `patterndetectorprocessor` net-new. Hard-gated on PR-K.1. Tag `module/v0.1.0` (first version pinned in `builder-config.yaml` for v0.2.0).

Also: PR-J marked `(landed, #195)` with note that recipe docs landed but chart-values compat map follows in PR-K.3.

## Adversarial review (5 lenses, inline)

- **(a) PR slug internal consistency**: PR-I.1b ↔ PR-B2 ↔ PR-I.2 ↔ PR-K.1 bidirectional gates all match. PR-J landed marker consistent with #195. §4 v0.2.0 row has pre-existing drift (mentions dcgm+kueue in v0.2.0 when PR-F+#168 already deleted them in v0.1.0), which is out of these 4 amendments' scope; flag for follow-up.
- **(b) PR-B2 hard-gate naming**: tightened from "Hard gate for PR-I.1" to "Hard gate for PR-I.1b", which is accurate because PR-I.1a is scaffolding-only with no file movement.
- **(c) Sub-PR numbering collision**: #188 explicitly addressed in slug-collision note. #185/#186/#187/#193/#194/#196 are PR-B1-shape ports for non-nccl receivers (no PR-slug label in their commits), no collision.
- **(d) Stale external-repo refs**: `grep -rn "tracecoreai/tracecore-components" docs/ AGENTS.md README.md` returns zero hits post-amendment.
- **(e) Cross-reference link integrity**: `docs/migration/v0.1-to-v0.2.md` references `#migration--rollout` and `#3-customer-stable-telemetry-contracts`; both anchors preserved (§ headers unchanged). `make doc-check` confirms 526 markdown links resolve.

## Test plan

- [x] `make doc-check`: 526 markdown links resolve, 0 stale refs, banned-phrase lint clean, alert-check + chart-appversion gates green.
- [x] Pre-push hooks: golangci-lint clean, go vet clean, go mod verify clean.
- [ ] CI doc-check + actionlint + zizmor gates pass on PR.

```release-notes
NONE
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
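The hard gates introduced by Amendments 3 and 4 can be read as a small dependency map. The sketch below encodes only the gates stated in this commit message (PR-I.1b on PR-B2, PR-I.2 on PR-K.1, PR-I.1a ungated); the `hardGates` table and the `blockedBy` helper are illustrative names, not tooling that exists in the repository.

```go
package main

import "fmt"

// hardGates maps a PR slug to the slugs that must land first, as stated
// in Amendments 3 and 4. PR-I.1a is scaffolding-only and has no gate.
var hardGates = map[string][]string{
	"PR-I.1a": nil,
	"PR-I.1b": {"PR-B2"},
	"PR-I.2":  {"PR-K.1"},
}

// blockedBy returns the gates for slug that have not landed yet.
// An empty result means the PR is free to land.
func blockedBy(slug string, landed map[string]bool) []string {
	var unmet []string
	for _, gate := range hardGates[slug] {
		if !landed[gate] {
			unmet = append(unmet, gate)
		}
	}
	return unmet
}

func main() {
	// Assume, for illustration, that only PR-B2 has landed so far.
	landed := map[string]bool{"PR-B2": true}
	for _, slug := range []string{"PR-I.1a", "PR-I.1b", "PR-I.2"} {
		fmt.Printf("%s blocked by %v\n", slug, blockedBy(slug, landed))
	}
}
```

Splitting PR-I.1 into a and b is what lets the first row carry no gate at all.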
This was referenced May 31, 2026
trilamsr added a commit that referenced this pull request on May 31, 2026
## Summary

- Port `components/receivers/pyspy` off the v0.1.x internal facades (`internal/pipeline`, `internal/consumer`, `internal/runtime/lifecycle`) onto upstream `go.opentelemetry.io/collector/{component,receiver,consumer}` v1.59.0, the canonical types the OCB-generated `_build/main.go` already consumes for all third-party receivers. Mirrors PR-B2 #201 (`nccl_fr` port) one-for-one; sibling wave of the PR-F.2 series.
- Factory is now `receiver.NewFactory(componentType, createDefaultConfig, receiver.WithLogs(createLogs, stability))` instead of a hand-rolled struct implementing `internal/pipeline.ReceiverFactory`. The hand-rolled `var Factory` + `NewFactory()` indirection wrapper is deleted: callers construct via `pyspy.NewFactory()` directly, mirroring upstream `otlpreceiver` / `filelogreceiver`. The `tools/components-gen` driver that motivated the package-var was deleted in PR-A2 #168, so no consumer remains.
- Stability level pinned at `Alpha`: pyspy is in-tree but NOT registered with the OCB-stitched main (per `install/kubernetes/tracecore/values.yaml` comment: "NOT registered by OCB; no upstream equivalent yet"). The Beta-pin path is reserved for when chart enablement crosses out of "always-disabled in default values.yaml", the same gate PR-B2 used for nccl_fr but at a different milestone.
- Receiver struct is `pyspyReceiver` (already the local name; no rename needed) and implements `receiver.Logs` via a `Start(ctx, component.Host) error` / `Shutdown(ctx) error` pair. The `pipeline.ComponentState` embed was dropped (upstream `component.Component` carries no equivalent mixin; the lifecycle bookkeeping the receiver actually needs lives in the sibling `lifecycle.go` helper added in PR #194).
- Logger swapped from `*slog.Logger` → upstream's `*zap.Logger` (the type carried in `component.TelemetrySettings.Logger`). All log call sites converted to `zap.String/Int/Int64/Duration/Error` fields; log messages and fields are byte-for-byte preserved so operator alerting on log content does not regress.
- `selftel.go` swap: `newSelfTelemetry(id pipeline.ID, ...)` → `newSelfTelemetry(id component.ID, ...)`. The instrumentation scope name (`github.com/tracecoreai/tracecore/components/receivers/pyspy`) and the five metric instrument names + label shapes are preserved; the scope-name pin test (`TestPyspy_ScopeNameIsReceiverImportPath`) and the factory-fallback contract test (`TestPyspy_FallsBackToNoopWhenMeterFails`) still pass unchanged.

## Hard gate

PR-I.1 (submodule extraction to `module/receiver/pyspyreceiver/`) requires `grep -r 'internal/(pipeline|consumer|runtime/lifecycle)' components/receivers/pyspy/` to return zero hits. This PR clears it:

```
$ grep -rn 'internal/pipeline\|internal/consumer\|internal/runtime/lifecycle' components/receivers/pyspy/*.go
components/receivers/pyspy/lifecycle.go:4:// v0.1.x dependency on `internal/runtime/lifecycle`, which is slated
```

One comment-only mention remains in `lifecycle.go`: historical context for the v0.1.x → v0.2.0 migration, not an import.

## Predecessor

PR #194 (`feat(pivot): port pyspy off internal selftel + lifecycle`, merged before this) ported the **self-telemetry + lifecycle** helpers into the package as siblings. This PR handles the **pipeline + consumer + factory** layer (the last remaining `internal/*` imports), mirroring the PR-B1 → PR-B2 split for `nccl_fr`.

## Test plan

- [x] `go build ./...`: green
- [x] `go test ./components/receivers/pyspy/... -race`: all tests pass
- [x] `go test ./...`: green except pre-existing flake in `components/receivers/kernelevents/TestKernelevents_Lifecycle_ConcurrentAddDuringShutdown_NoPanic` (verified flaky on `git stash` of pyspy changes; predates this PR)
- [x] `make check` (golangci-lint + go vet + go mod verify): green
- [x] Scope-name pin (`TestPyspy_ScopeNameIsReceiverImportPath`) still asserts `github.com/tracecoreai/tracecore/components/receivers/pyspy`
- [x] Factory fallback contract (`TestPyspy_FallsBackToNoopWhenMeterFails`) still surfaces `tracecore.selftelemetry.init_errors_total` when every `tracecore.receiver.*` instrument registration is synthetically failed
- [x] Sibling-types pin (`TestPyspy_SiblingTypesArePackageLocal`) still enforces selfTelemetry + kind ownership by the receiver package
- [x] All 11 RFC-0009 §Degraded modes still covered by `TestKinds_AllRFC0009DegradedModesCovered`
- [x] Linux integration gate (`TestIntegration_NoBannedSyscalls`) still runs the receiver under strace in `target_not_attached` posture

## Compatibility note

`go.mod` promotes `go.opentelemetry.io/collector/pipeline v1.59.0` from indirect → direct (used by `factory_test.go` for the `pipeline.ErrSignalNotSupported` sentinel, upstream's canonical "signal not supported" error that `receiver.NewFactory`'s default unimplemented handlers return). No transitive-dep churn.

```release-notes
NONE
```

## Type-swap reference

Inherits the PR-B2 #201 type-swap table verbatim. Repeated here for the PR-F.2 series tracking sheet:

| Internal | Upstream |
|---|---|
| `internal/pipeline.Type` | `component.Type` |
| `internal/pipeline.ReceiverFactory` | `receiver.Factory` |
| `internal/pipeline.CreateSettings` | `receiver.Settings` (via `receivertest.NewNopSettings` in tests) |
| `internal/pipeline.Config` | `component.Config` |
| `internal/pipeline.Receiver` | `receiver.Logs` (`= interface{ component.Component }`) |
| `internal/pipeline.ErrSignalNotSupported` | `pipeline.ErrSignalNotSupported` (`go.opentelemetry.io/collector/pipeline`) |
| `internal/pipeline.ComponentState` | (dropped; lifecycle bookkeeping lives in sibling `lifecycle.go`) |
| `internal/consumer.Logs` | `consumer.Logs` |
| `*slog.Logger` | `*zap.Logger` |
| `internal/pipeline.MustNewType` | `component.MustNewType` |
| `internal/pipeline.MustNewID` | `component.NewIDWithName` |

## Test-helper changes

- `pyspy_test.go` adds a package-local `testSettings()` wrapper around `receivertest.NewNopSettings(componentType())` that pins the ID to `pyspy/test` so selftel label assertions stay deterministic (a per-run UUID would defeat the `component_id=pyspy/test` asserts). Mirrors the nccl_fr PR-B2 pattern.
- `integration_linux_test.go` drops the `nopHost` struct + the `pipelinetest.NewHost()` dependency: the strace harness passes `nil` for `component.Host` (the receiver doesn't read it).
- `pyspy_test.go` and `selftel_test.go` swap every `pipelinetest.NewHost()` call to `nil` for the same reason, following the pattern PR-B2 established.
- `factory_test.go` deletes `TestPyspy_NewFactory_ReturnsTheSamePackageVar` since the `var Factory` package-var is gone (tools/components-gen driver was deleted in PR-A2 #168; the codegen seam this test pinned no longer exists).

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
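The hard gate in this commit is a `grep`, which is why the comment-only mention in `lifecycle.go` shows up as a hit that then has to be explained away. As an alternative sketch (not something this PR ships), the same gate could be checked against parsed import declarations using the standard library's `go/parser`, so comments never match. The `bannedImports` helper and the two sample sources are hypothetical.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// banned lists the package path segments the hard gate forbids.
var banned = []string{
	"internal/pipeline",
	"internal/consumer",
	"internal/runtime/lifecycle",
}

// bannedImports parses only the import declarations of src and returns
// every import path that is a banned package or a subpackage of one.
// Comments are never inspected.
func bannedImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, imp := range f.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, b := range banned {
			if strings.HasSuffix(path, "/"+b) || strings.Contains(path, "/"+b+"/") {
				hits = append(hits, path)
				break
			}
		}
	}
	return hits, nil
}

// commentOnly mentions a banned path in a comment but does not import it.
const commentOnly = `// Package pyspy drops the v0.1.x dependency on
// internal/runtime/lifecycle, which is slated for deletion.
package pyspy

import "context"
`

// stillImports actually imports a banned package.
const stillImports = `package pyspy

import "github.com/tracecoreai/tracecore/internal/runtime/lifecycle"
`

func main() {
	for name, src := range map[string]string{"comment.go": commentOnly, "import.go": stillImports} {
		hits, err := bannedImports(name, src)
		if err != nil {
			fmt.Println(name, "parse error:", err)
			continue
		}
		fmt.Println(name, "banned imports:", hits)
	}
}
```

The trade-off is that `grep` needs no build step and also catches stale doc-comment pointers, which may be exactly what a reviewer wants to see listed.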
trilamsr added a commit that referenced this pull request on May 31, 2026
#206)

## Summary

Deletes the three internal moats and the in-tree DCGM receiver that RFC-0013 §migration step 8 promised: the payoff for the wave-3 sibling-port PRs (#184/#185/#186/#187/#188/#193/#194/#196/#197).

**Net: -12,482 LOC across 92 files (78 deletions, 14 modifications).**

### What deletes

| Path | LOC | Why safe now |
|---|---|---|
| `components/receivers/dcgm/` | 7,604 | cgo stub never shipped real code; #188's PR-B2-shaped dcgm sweep already removed the live port surface. |
| `pkg/dcgm/` | 922 | Only consumer was the deleted receiver. Bonus cleanup. |
| `internal/selftelemetry/` | 1,946 | Every consumer (containerstdout, clockreceiver, kernelevents, k8sevents, nccl_fr, dcgm, pyspy, stdoutexporter, otlphttp) ported onto receiver/exporter-scoped sibling `selftel.go` files. |
| `internal/telemetry/` | 1,991 | Probes flow through upstream `healthcheckextension`; MeterProvider via upstream `service.telemetry`. Only remaining consumers were `internal/selftelemetry/*_test.go` (deleted together) + one orphan clockreceiver test. |
| `components/receivers/clockreceiver/errors_integration_test.go` | 100 | Orphan from #185's PR-B1 clockreceiver port: bootstrapped via the deleted `selftelemetry.Receiver` interface but never migrated to the receiver-scoped sibling `selftel.go`. Covered behaviour ("errors_total surfaces on downstream failure") is now exercised through clockreceiver's sibling tests. |

### Pre-flight grep evidence (post-merge of origin/main)

```
$ grep -rn "tracecoreai/tracecore/internal/selftelemetry" --include="*.go" .
(zero matches)
$ grep -rn "tracecoreai/tracecore/internal/telemetry" --include="*.go" .
(zero matches)
$ grep -rn "tracecoreai/tracecore/components/receivers/dcgm" --include="*.go" .
$ grep -rn "tracecoreai/tracecore/pkg/dcgm" --include="*.go" .
(zero matches)
```

### Tooling

- Retire the `dcgm` build tag: `make build-tags` no longer vets `-tags dcgm` (kept as a hook for future build-tag-gated paths).
- `make bench-check` loop drops both deleted package rows (`internal/telemetry`, `components/receivers/dcgm`).
- `scripts/register-lint.sh` allowlist emptied (the two `internal/telemetry/{build_info,slo}.go` entries are gone with the package; allowlist comment notes the post-PR-F.1 state).
- `go.mod` direct deps shrink: `github.com/prometheus/client_golang` and `go.opentelemetry.io/otel/exporters/prometheus` drop to indirect (they were used by `internal/telemetry/server.go`).

### Chart toggles intentionally retained

Chart `receivers.dcgm` toggle + `templates/NOTES.txt` warning + `templates/_helpers.tpl` doc-comment list keep the `dcgm` symbol for the migration window. The toggle has been inert since PR-A2: operators enabling `receivers.dcgm.enabled=true` already crashed at boot because the OCB binary doesn't register the factory. PR-K removes the toggle entirely alongside the chart-default flip from `clockreceiver` → `hostmetrics` and the v0.2.0 recipe migration.

### Doc sweep

- `internal/runtime/lifecycle/lifecycle.go` doc-comment: drop the dcgm pointer; flag containerstdout as the sole remaining in-tree consumer; reschedule the package itself for PR-F.2 deletion once containerstdout ports off the helper or PR-K.2 deletes the receiver.
- `docs/FAILURE-MODES.md` self-tel-surface rows rewired from `internal/telemetry/server_test.go::*` (deleted) to upstream-delegated wording.
- `docs/patterns/{README,pattern-{1,3,4,5}}.md` replay-test pointers updated: the in-tree `components/receivers/dcgm/pattern_replay_test.go` is gone; pattern replay now flows through `docs/integrations/prometheus-scrape.md` (PR-J's upstream `dcgm-exporter` recipe).
- `docs/README.md` per-component table: drop the deleted `internal/telemetry/{README,SECURITY}.md` rows + the deleted `components/receivers/dcgm/{README,RUNBOOK}.md` rows.
- `STYLE.md` vendor-SDK section: drop the `pkg/dcgm/` reference + the `//go:build dcgm` example; explicit cross-reference to PR-F.1 in the integration-test build-tag note.
- `CHANGELOG.md`: PR-F.1 landed entry under Unreleased; "Remaining v0.1.0 work" line updated to point at PR-F.2.
- `docs/rfcs/0013-distro-first-pivot.md` §migration step 8: PR-F entry replaced with the PR-F.1/PR-F.2 split + the explicit rationale (componentstatus travels with pipeline; pipeline is out of PR-F's scope per line 240's original framing).

### Out of scope (PR-F.2 follow-up)

- `internal/componentstatus/`: 5-line `ReportStatus` free function. Travels with `internal/pipeline` (its only non-test consumers are `internal/pipeline/runtime_test.go` + `internal/pipeline/pipelinetest/fixture_test.go`). Deletion lands when pipeline migrates to upstream `go.opentelemetry.io/collector/component/componentstatus`.

### Rationale links

- RFC-0013 §migration step 8: the PR-F entry now codifies the F.1/F.2 split in this branch's RFC update.
- PR-B2 scope-discovery (#188): established the "rename + slim, don't reshape" pattern for the dcgm sweep that retired the cgo path.
- Wave-3 PRs that unblocked selftelemetry deletion: #184 (nccl_fr), #185 (clockreceiver), #186 (kernelevents), #187 (stdoutexporter), #188 (dcgm), #193 (otlphttp), #194 (pyspy), #196 (k8sevents), #197 (containerstdout).

```release-notes
[CHANGE] internal/{selftelemetry,telemetry} packages deleted; components/receivers/dcgm + pkg/dcgm deleted. Operators using the v0.1.x in-tree `tracecore.*` self-telemetry metric names migrate per docs/migration/v0.1-to-v0.2.md. Third-party importers of internal/* (unlikely pre-1.0) lose the `selftelemetry.{Receiver,Exporter}` interfaces and the `telemetry.MeterProvider` wrapper; receiver/exporter authors now wire a receiver-scoped sibling `selftel.go` per the PR-B1 pattern.
```

## Test plan

- [x] `make verify` (lint + vet + tidy-check + mod-verify + license-check + generate-fixtures-check + build-tags + nccl-fr-rce-gate + register-lint + actionlint + zizmor + doc-check + no-autoupdate-check): exit 0.
- [x] `go test ./...`: all green (29 packages).
- [x] `make build` (OCB): `./_build/tracecore` produced.
- [x] `./_build/tracecore --version`: `tracecore version 0.1.0-m9-alpha`.
- [x] Pre-flight greps for all four deleted paths: zero external importers.
- [ ] CI green on PR (linux/race matrix, chart render, install-bench, zizmor, govulncheck).
- [ ] Operator verification that the chart's `dcgm` toggle remains inert post-merge (no behaviour change from main; already inert since PR-A2).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
This was referenced May 31, 2026
trilamsr added a commit that referenced this pull request on May 31, 2026
## Summary

Reconcile the four pivot-tracking docs (`docs/rfcs/0013-distro-first-pivot.md`, `CHANGELOG.md`, `MILESTONES.md`, `docs/migration/v0.1-to-v0.2.md`) with the wave-3 (PR-B1-shape sibling ports) and wave-4 (PR-B2-shape upstream-only ports + PR-F.1 + PR-J + PR-L + PR-N) landings. Pure doc sweep: no code or config touched.

## What changed

### `docs/rfcs/0013-distro-first-pivot.md`

§migration PR sequence rows updated with PR-number citations and landed markers:

- **PR-A2** (landed, #189, 2026-05-30)
- **PR-B2** (landed, #201): also enumerates sibling-receiver follow-ups under PR-B2 to dispel the slug collision with #188's PR-B2-labelled dcgm port: stdoutexporter (#202), pyspy (#203), kernelevents (#208), containerstdout (#209)
- **PR-F.1** (landed): fleshed-out delete list (`internal/{selftelemetry,telemetry}` + `components/receivers/dcgm/` + `pkg/dcgm/` + one orphan clockreceiver integration test)
- **PR-F.2** re-scoped: now deletes the whole `internal/{componentstatus,pipeline,pipelinebuilder,consumer,fanout,runtime/lifecycle}` bundle in one cut once the last three pipeline+consumer-importing receivers land (#204 k8sevents, #205 clockreceiver, #207 otlphttp). Per the import-graph state, `internal/componentstatus`'s only non-test consumer is `internal/pipeline`, so they delete together
- **PR-G** (landed, #182), **PR-H** (landed, #183)
- **PR-I.1a** (in flight, scaffold agent), **PR-I.1b** (pre-staged; gate satisfied by #201)
- **PR-J** (landed, #195): kept existing marker
- **PR-K.1** (in flight, separate agent landing)
- **PR-L** (landed, skeleton #179 + body #191): flagged as living document
- **PR-N** (landed, #200): shipped at v0.1.0 ahead of v0.3.0 as a doc-only update at `docs/migration/v0.2-to-v0.3.md`

### `CHANGELOG.md` [Unreleased]

- Restructured the pivot wave list as **four waves** (was three). Wave 3 enumerates PR-B1-shape sibling ports + support infra (#180-#194/#196). Wave 4 enumerates PR-B2-shape upstream-only ports + PR-J (#195) + PR-F.1 (#206) + PR-N (#200) + lint/TOCTOU hardening (#198/#210).
- Tightened the PR-F.2 deferred note to point at the three open ports (#204/#205/#207) as the gate.

### `MILESTONES.md`

- **M1** (pipeline runtime): status row now cites PR-A2 (#189), PR-F.1 (#206), PR-F.2 gate (#204/#205/#207), PR-E (#180), retains `internal/config/` (still load-bearing for `tracecore validate`).
- **M2** (self-telemetry): status row now cites PR-F.1 (#206); flags `internal/componentstatus` as travelling with `internal/pipeline` in PR-F.2.
- **M8** (DCGM receiver): status flipped to *landed-and-replaced*: cites PR-F.1 (#206) deletion + PR-J (#195) `docs/integrations/prometheus-scrape.md` recipe. Notes the inert chart toggle retention until PR-K.3.

### `docs/migration/v0.1-to-v0.2.md`

- §`internal/*` package deletion (PR-F) status flips from "not yet open" to "PR-F.1 landed (#206), PR-F.2 gated on three open ports".
- Open-items checklist expanded from 5 to 13 entries: tracks every PR letter the migration guide cares about (A2 / E / F.1 / F.2 / I.1a-c / J / K.1-3 / L / N) with PR numbers and links.

## Why now

Tracking docs accumulated drift across wave-3 + wave-4 because every sibling-port PR (and the support-infra PRs around them) updated the bottom of `CHANGELOG.md` but did not always touch the upstream sequencing section in RFC-0013. Per memory rule `[Keeping this document current]`: status drift is a review blocker. This PR is the consolidated catch-up; future port PRs include their RFC-row flip in-PR.

## What this PR does NOT change

- No code, no config, no YAML, no chart: only the four tracking docs.
- No new doc gates added; existing gates pass.
- No PRs other than the four named docs are modified.

## Test plan

- [x] `bash scripts/doc-check.sh` clean (33 test refs, 528 links resolve, comment-noise diff gate clean vs `origin/main`, all 13 gates green).
- [x] Pre-commit hook (`commitlint` 72-char subject limit + DCO + AI-trailer gates) passed.
- [x] Pre-push hook (`make ci-fast` equivalent: `golangci-lint`, `go vet`, `go mod verify`, `no-autoupdate-check`, `doc-check.sh`) passed on second attempt after `git fetch origin main` populated the worktree's `origin/main` ref. The first push failed because the worktree previously tracked the (gone) `pr-a2-ocb-main-swap` branch, so `doc-check.sh`'s comment-noise diff-scope gate exited 128 on the missing ref. Root cause fixed by the fetch; not a workaround.
- [ ] CI green on this branch.

```release-notes
NONE
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request on May 31, 2026
## Summary

Deletes the seven `internal/*` packages that RFC-0013 §migration step 8 PR-F.2 promised once the upstream-port wave (#201/#202/#203/#204/#205/#207/#208/#209) cleared every external caller of the in-tree pipeline runtime.

**Net: -6,888 LOC across 56 deleted files, +80 LOC across 14 modified files. 70 files total.**

This is the final cut of RFC-0013 §migration step 8 PR-F.

## What deletes

| Path | LOC | Replacement |
|---|---|---|
| `internal/pipeline/` | 4,134 | `go.opentelemetry.io/collector/service` (OCB-generated `_build/main.go` consumes `builder-config.yaml`). |
| `internal/pipelinebuilder/` | 1,282 | Same: assembly is upstream `service`. |
| `internal/config/` | 718 | Upstream `confmap` providers (`file`, `yaml`, `env`). |
| `internal/consumer/` | 87 | Upstream `go.opentelemetry.io/collector/consumer`. |
| `internal/fanout/` | 366 | Upstream `internal/fanoutconsumer` (collector module). |
| `internal/componentstatus/` | 16 | Upstream `component/componentstatus.ReportStatus` (same free-function shape). |
| `internal/runtime/lifecycle/` | 505 | Per-receiver package-local `lifecycle.go` siblings, already ported during the PR-B1 wave (#184/#185/#186/#187/#194/#196/#197); the in-tree helper had no remaining non-test consumer after PR-F.1 + the wave-2 upstream-port PRs. `kernelevents/lifecycle.go` was inherited from k8sevents (#208). |

## Pre-flight grep evidence

```
$ grep -rn 'tracecoreai/tracecore/internal/(pipeline|consumer|pipelinebuilder|config|fanout|componentstatus|runtime/lifecycle)' --include='*.go' .
(zero matches)
```

## Tooling

- `.golangci.yml` `ignore-interface-regexps` repointed at upstream `consumer.{Metrics,Traces,Logs}` + `component.Component`. The in-tree-only same-package-error-wrap exemption stays: the STYLE rule applies regardless of which interface is forwarded.
- `.github/workflows/chaos.yml` drops the `chaos-pipeline-test` job (the in-tree `internal/pipeline/chaos_test.go` is gone; upstream `service` provides the equivalent panic-recovery contract). `harness-determinism` (failure-inject golden-SHA), `cpu-steal-mpstat`, `pattern-pod-evicted` jobs preserved.
- `.github/workflows/install-bench.yml` drops the `internal/{pipeline,runtime,selftelemetry}/**` path-filter rows.
- `go.mod` / `go.sum` unchanged.

## Doc sweep

- `CHANGELOG.md` Unreleased: PR-F.2 landed entry replacing the "PR-F.2 deferred" sentence; "Remaining v0.1.0 work" line updated; one dead `internal/pipeline/README.md` link in Foundation block rewritten as "deleted at v0.1.0".
- `docs/rfcs/0013-distro-first-pivot.md` §7 deletion table: both pipeline-internals and runtime/lifecycle rows updated from "v0.1.0 (audit first…)" / "v0.2.0 (with last consumer)" to "v0.1.0 (landed PR-F.2)". §migration step 8 reframed.
- `docs/FAILURE-MODES.md` Lifecycle / Data flow / Shutdown timing / Backend tables rewired from in-tree `internal/{config,pipeline,fanout}/*_test.go::TestName` pointers to upstream-delegated wording matching the pattern PR-A2 established.
- `docs/STRATEGY.md` "Post-RFC-0013 status" intro updated; "Stable interfaces in `internal/pipeline/`" graduation row rewritten to point at the upstream surface.
- `docs/migration/v0.1-to-v0.2.md` `internal/*` section status banner flipped from "deferred, still present in RC builds" to "landed, deleted in v0.2.0 builds".
- `MILESTONES.md` v0.1.0 deletions row extended with boot-path internals; M1 + M4b + M19 rubric details annotated with the PR-F.2 retirement.
- `README.md` Contributor row repointed at upstream `go.opentelemetry.io/collector` package docs.
- `AGENTS.md` "Self-telemetry internals" bullet split into "Self-tel internals" + "Pipeline / boot-path internals" with explicit deletion status.
- `docs/README.md` table row for `internal/pipeline/README.md` dropped.
- `components/receivers/kernelevents/README.md` lifecycle-sibling rationale updated to past-tense.
- `tools/failure-inject/README.md` "Testing locally" section drops the `-tags=chaos ./internal/pipeline/...` invocation.

## Sequencing

This PR is hard-gated on every upstream-port PR landing first:

- #201 nccl_fr (PR-B2)
- #202 stdoutexporter
- #203 pyspy
- #204 k8sevents
- #205 clockreceiver (PR-B3)
- #207 otlphttp
- #208 kernelevents
- #209 containerstdout
- #206 PR-F.1 (selftel / telemetry / dcgm)

All nine merged before this PR opened; this is the moat-deletion payoff. Remaining v0.1.0 work is PR-K (chart-default flip + `clockreceiver` + `stdoutexporter` + remaining receiver source deletions, coupled with test-fixture migration and the `telemetry:` values-key deprecation cycle).

## Test plan

- [x] `make check`: golangci-lint 0 issues, go vet clean, go mod verify ok.
- [x] `go build ./...`: clean.
- [x] `go test -count=1 ./...`: green (excluding the known `kernelevents/TestReceiver_SLIBudget` flake called out in #205's body, which only triggers under heavy parallel `go test ./...` load; passes standalone).
- [x] `grep` confirms zero non-internal callers of the deleted packages.
- [x] Doc-check pre-push hook passes after the CHANGELOG dead-link fix.

```release-notes
[CHANGE] internal/{pipeline,pipelinebuilder,config,consumer,fanout,componentstatus,runtime/lifecycle} packages deleted. The OCB-generated boot path off builder-config.yaml replaces them. Third-party importers of internal/* (unlikely pre-1.0; the packages live under internal/ and the Go compiler rejects external imports) lose the pipeline-assembly + lifecycle + config-loader surfaces; receiver authors now wire against upstream go.opentelemetry.io/collector/{component,receiver,consumer,pipeline} directly. See docs/migration/v0.1-to-v0.2.md "internal/* package deletion".
```

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
Summary
Ports `components/receivers/pyspy` off `internal/selftelemetry` + `internal/runtime/lifecycle` using the sibling co-location pattern established by PR #184 (nccl_fr reference impl). `selftel.go` + `lifecycle.go` now live next to the receiver source.

Unblocks RFC-0013 PR-F deletion of those internal moats.
What changed
- `selftel.go` (new): receiver-scoped `selfTelemetry` interface + `selfTelemetryImpl` (OTel-backed) + `noopSelfTelemetry` + `recordInitError`. Metric names + label shape preserved 1:1 with `internal/selftelemetry`:
  - `tracecore.receiver.errors_total{kind,component_id}`
  - `tracecore.receiver.emissions_total{component_id}`
  - `tracecore.receiver.collection_latency_seconds{component_id}` (12-bucket boundaries, matches v0.1.x)
  - `tracecore.receiver.degraded_seconds_total{component_id}` (observable counter)
  - `tracecore.receiver.last_activity_unix_seconds{component_id}` (observable gauge)
  - `tracecore.selftelemetry.init_errors_total{kind,component_id,reason}` (factory-fallback signal)
- `lifecycle.go` (new): slim single-source variant (no `Add()`). pyspy fans its three goroutines (scan loop, trigger cadence driver, trigger run loop) via its own `sync.WaitGroup` inside `runAll`, so lifecycle only needs `Start` + `Shutdown`.
- `kinds.go`: `kind` is now a receiver-local type. `kindPanic` is declared locally (replaces `selftelemetry.KindPanic`).
- `factory.go`: uses local `newSelfTelemetry` / `recordInitError` / `reasonInstrumentRegister`.
- `pyspy.go`: fields/types switched to local sibling.
- `pyspy_test.go`: uses sibling-local `fakeTelemetry` captor (replaces `selftelemetry.CapturingReceiver`).
- `selftel_test.go` (new, RED-first TDD): 7 tests covering noop safety, nil-MP, errors_total kind+component_id, scope-name pin, init_errors_total, nil-MP safety on recordInitError, factory fallback path.
- `fake_telemetry_test.go` (new): test-only captor implementing the `selfTelemetry` interface; mirrors the kernelevents `export_test.go` pattern.

Scope-name standard
Instrumentation scope pinned to the receiver's Go import path: `github.com/tracecoreai/tracecore/components/receivers/pyspy`.

When the receiver moves to `module/receiver/pyspyreceiver/` in PR-I.1, the scope name moves with it (matches OTel convention).

Lifecycle pattern decision
pyspy uses single-source lifecycle (like nccl_fr), NOT multi-source (like kernelevents). The receiver's `runAll` callback owns a local `sync.WaitGroup` and fans into three goroutines from within the single `lifecycle.Start(ctx, r.runAll)` call. No `Add()` needed → dropped from the slim sibling.

Tests
13 targeted tests pass (7 new selftel + 6 existing receiver lifecycle tests).
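For readers unfamiliar with the captor pattern these tests lean on, a minimal stand-alone sketch follows. It is an approximation under stated assumptions: the `selfTelemetry` method set (`RecordError`, `RecordEmission`), the `guarded` helper, and the `"panic"` string value are invented for illustration and are not taken from the PR's source.

```go
package main

import (
	"fmt"
	"sync"
)

type kind string

const kindPanic kind = "panic" // string value assumed

// selfTelemetry is a stand-in for the receiver-local interface; the real
// method set is not shown in the PR description.
type selfTelemetry interface {
	RecordError(k kind)
	RecordEmission()
}

// fakeTelemetry is a test-only captor: it counts calls in memory instead
// of exporting metrics, so assertions need no metrics SDK.
type fakeTelemetry struct {
	mu        sync.Mutex
	errors    map[kind]int
	emissions int
}

func newFakeTelemetry() *fakeTelemetry {
	return &fakeTelemetry{errors: make(map[kind]int)}
}

func (f *fakeTelemetry) RecordError(k kind) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.errors[k]++
}

func (f *fakeTelemetry) RecordEmission() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.emissions++
}

func (f *fakeTelemetry) errorCount(k kind) int {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.errors[k]
}

func (f *fakeTelemetry) emissionCount() int {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.emissions
}

// Compile-time check that the captor satisfies the local interface.
var _ selfTelemetry = (*fakeTelemetry)(nil)

// guarded is a hypothetical unit under test: it runs fn, records an
// emission on success, and records kindPanic if fn panics.
func guarded(tel selfTelemetry, fn func()) {
	defer func() {
		if r := recover(); r != nil {
			tel.RecordError(kindPanic)
		}
	}()
	fn()
	tel.RecordEmission()
}

func main() {
	tel := newFakeTelemetry()
	guarded(tel, func() {})
	guarded(tel, func() { panic("boom") })
	fmt.Println("panics:", tel.errorCount(kindPanic), "emissions:", tel.emissionCount())
}
```

Keeping the captor in a `_test.go` file next to the receiver is what lets the package drop its last test-time dependency on `internal/*`.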
Wave context
pyspy is 1 of 4 remaining unported consumers of `internal/selftelemetry` + `internal/runtime/lifecycle`. Sibling agents (otlphttp / containerstdout / k8sevents) are landing in parallel; this PR's scope is the pyspy receiver only, with zero file overlap with the parallel work.

Test plan
- `go test ./components/receivers/pyspy/ -count=1`: green
- `go test -race ./components/receivers/pyspy/ -count=1`: green
- `make check`: 0 lint issues, vet clean, modules verified
- Scope name pinned to `github.com/tracecoreai/tracecore/components/receivers/pyspy`
- Factory fallback surfaces `tracecore.selftelemetry.init_errors_total` on instrument-register failure