feat(pivot): PR-B3 — port clockreceiver off internal pipeline+consumer#205
Merged
Conversation
Mirrors PR-B2 (#201) for the canonical example receiver: swaps internal/{pipeline,consumer,runtime/lifecycle} for upstream OTel v1.59.0 types (component, consumer, receiver, receivertest) and *slog.Logger for *zap.Logger. Factory shape moves to receiver.NewFactory(componentType(), createDefaultConfig, receiver.WithMetrics(...)) — no package-var. Receiver struct (renamed clockReceiver → Receiver, public) drops ComponentState embedding; the upstream receiver.Metrics interface is now the only lifecycle contract, pinned at build time via `var _ receiver.Metrics = (*Receiver)(nil)`. Behavior preserved: same emit/shutdown/panic-recovery semantics, same selftel metric names + label shape, same scope name. errors_integration_test.go drops the last v0.1.x internal/pipeline+pipelinetest dep by inlining a 10-line failingMetricsSink + using receivertest.NewNopSettings. Unblocks RFC-0013 §migration PR-F.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Tri Lam <tri@maydow.com>
6 tasks
trilamsr
added a commit
that referenced
this pull request
May 31, 2026
…ts+k8sevents) (#210) ## Summary Backport containerstdout's hardened TOCTOU race-window test pattern (#197, #209) to `kernelevents` and `k8sevents` `Lifecycle_ConcurrentAddDuringShutdown_NoPanic` tests. The pre-hardened tests have a brittle race window that collapses on fast schedulers — Shutdown wins universally, every Add no-ops via the closed-guard, and the test passes vacuously while never exercising the TOCTOU path it claims to cover. On slightly slower (or differently-loaded) CI machines, the same brittleness flips and produces intermittent failures, blocking unrelated PRs. ## Root cause The lifecycle's mutex guard around `(closed-check, wg.Add(1))` is correct in production code. The *test*, however, releases all 50 adders + the shutdowner from a single gate, then relies on `runtime.Gosched()` alone to interleave them. On fast schedulers the shutdowner reaches `lc.Shutdown(...)` before any adder reaches `lc.Add(...)`, so: - Branch (a) "Add wins, registers under WaitGroup" never fires - Branch (b) "Shutdown wins, Add no-ops" fires for all 50 adders - Test passes but tests nothing meaningful - On flip side: race window straddle changes between CI runs, surfacing as flake Containerstdout's hardened equivalent inserts a `shutdownGate` channel that holds the shutdowner back until 50µs after `release` fires, deterministically straddling adders-in-flight with the Shutdown call. This PR ports that pattern verbatim. ## Changes - `components/receivers/kernelevents/lifecycle_test.go`: add `shutdownGate` chan + 50µs sleep + comment explaining intent - `components/receivers/k8sevents/lifecycle_test.go`: same Production code: unchanged. ## Verification ``` for i in $(seq 1 10); do go test -race -count=1 -run ConcurrentAddDuringShutdown ./components/receivers/kernelevents/...; done # 10/10 PASS for i in $(seq 1 10); do go test -race -count=1 -run ConcurrentAddDuringShutdown ./components/receivers/k8sevents/...; done # 10/10 PASS ``` Branch-coverage check across 10 verbose iterations (registered count <50 == branch (b) exercised): - kernelevents: registered counts 44, 47, 49, 50, 50, 47, 44, 50, 50, 50 → branch (b) hit 6/10 iterations - k8sevents: registered counts 49, 49, 50, 50, 50, 48, 50, 44, 48, 50 → branch (b) hit 7/10 iterations Both branches deterministically exercised. The existing `registeredCount == 0` guard in the test prevents the inverse-vacuous regression (all-no-op). Full repo: `make check` clean, `go test -race ./...` green. ## Motivation Unblocks four in-flight PRs hitting this flake on CI: - #203 - #204 - #205 - #207 ## Reference Containerstdout's port: #197 (original finding + fix), #209 (additional hardening). Same pattern applied here with no behavioral divergence. ## Test plan - [x] `make check` passes locally - [x] `go test -race ./...` passes locally - [x] 10-iter stress on `ConcurrentAddDuringShutdown` in both target packages, 10/10 pass - [x] Verified both TOCTOU branches still exercised (branch (a) every iter; branch (b) 6-7/10 iters) - [ ] CI green on this branch - [ ] PRs #203, #204, #205, #207 re-run + CI green after merge ```release-notes NONE ``` Signed-off-by: Tri Lam <tri@maydow.com> Co-authored-by: Tri Lam <tri@maydow.com>
…ream # Conflicts: # components/receivers/clockreceiver/errors_integration_test.go
This was referenced May 31, 2026
trilamsr
added a commit
that referenced
this pull request
May 31, 2026
## Summary Cross-cut review cleanup of [#205](#205) (clockreceiver port off internal/pipeline). Two convention drifts caught when reviewing the merged PR against peer receivers; both are pure realignment, no behavior change. ## Root cause #205 ported `clockreceiver` from `internal/pipeline+consumer` to the upstream `receiver.Factory` shape. The port worked, but two package-surface decisions diverged from every other component package in the repo: 1. **`type Receiver struct` exported.** Every peer keeps the concrete struct package-private: - `containerStdoutReceiver` (components/receivers/containerstdout) - `k8sEventsReceiver` (components/receivers/k8sevents) - `kernelEventsReceiver` (components/receivers/kernelevents) - `ncclfrReceiver` (components/receivers/nccl_fr) - `pyspyReceiver` (components/receivers/pyspy) The factory is the only exported constructor — callers stitch via OCB, never by importing the concrete struct. `git grep clockreceiver.Receiver` returns zero external hits, confirming no caller depended on the exported name. Restore convention. 2. **Local `type componenttest struct{}` test stub.** The fake `component.Host` in `clockreceiver_test.go` shadowed the upstream package `go.opentelemetry.io/collector/component/componenttest` — which is already in `go.mod` at `v0.153.0` and is what every peer test uses (`componenttest.NewNopHost()`). The local stub was invented for no reason; peer ports [#208](#208) (containerstdout) and [#209](#209) (k8sevents) use the upstream helper directly. Neither drift is a workaround for an upstream blocker — both are pure convention misses caught post-merge. This PR is the root-cause fix. ## Changes - Rename `Receiver` → `clockReceiver` across `clockreceiver.go`, `factory.go` (comment), `selftel_test.go` (type-assertion). - Drop local `componenttest{}` stub from `clockreceiver_test.go`; replace usage with upstream `componenttest.NewNopHost()`. - Add doc comment on `clockReceiver` explaining the package-private convention so a future drift fails review. Net diff: 4 files, +23/-23 LOC. ## Test plan - [x] `make check` (fmt + tidy + lint + vet + mod-verify) — green - [x] `go test -race ./components/receivers/clockreceiver/...` — green - [x] Verified no external `clockreceiver.Receiver` callers exist (`grep -rn 'clockreceiver\.Receiver' --include='*.go'` returns nothing) - [x] Verified `componenttest.NewNopHost()` is the peer-receiver pattern (containerstdout, kernelevents, nccl_fr, k8sevents) ```release-notes NONE ``` Signed-off-by: Tri Lam <tri@maydow.com> Co-authored-by: Tri Lam <tri@maydow.com>
5 tasks
trilamsr
added a commit
that referenced
this pull request
May 31, 2026
## Summary Reconcile the four pivot-tracking docs (`docs/rfcs/0013-distro-first-pivot.md`, `CHANGELOG.md`, `MILESTONES.md`, `docs/migration/v0.1-to-v0.2.md`) with the wave-3 (PR-B1-shape sibling ports) and wave-4 (PR-B2-shape upstream-only ports + PR-F.1 + PR-J + PR-L + PR-N) landings. Pure doc sweep — no code or config touched. ## What changed ### `docs/rfcs/0013-distro-first-pivot.md` §migration PR sequence rows updated with PR-number citations and landed markers: - **PR-A2** (landed, #189, 2026-05-30) - **PR-B2** (landed, #201) — also enumerates sibling-receiver follow-ups under PR-B2 to dispel the slug collision with #188's PR-B2-labelled dcgm port: stdoutexporter (#202), pyspy (#203), kernelevents (#208), containerstdout (#209) - **PR-F.1** (landed) — fleshed-out delete list (`internal/{selftelemetry,telemetry}` + `components/receivers/dcgm/` + `pkg/dcgm/` + one orphan clockreceiver integration test) - **PR-F.2** re-scoped — now deletes the whole `internal/{componentstatus,pipeline,pipelinebuilder,consumer,fanout,runtime/lifecycle}` bundle in one cut once the last three pipeline+consumer-importing receivers land (#204 k8sevents, #205 clockreceiver, #207 otlphttp). Per the import-graph state — `internal/componentstatus`'s only non-test consumer is `internal/pipeline`, so they delete together - **PR-G** (landed, #182), **PR-H** (landed, #183) - **PR-I.1a** (in flight — scaffold agent), **PR-I.1b** (pre-staged; gate satisfied by #201) - **PR-J** (landed, #195) — kept existing marker - **PR-K.1** (in flight — separate agent landing) - **PR-L** (landed, skeleton #179 + body #191) — flagged as living document - **PR-N** (landed, #200) — shipped at v0.1.0 ahead of v0.3.0 as a doc-only update at `docs/migration/v0.2-to-v0.3.md` ### `CHANGELOG.md` [Unreleased] - Restructured the pivot wave list as **four waves** (was three). Wave 3 enumerates PR-B1-shape sibling ports + support infra (#180-#194/#196). Wave 4 enumerates PR-B2-shape upstream-only ports + PR-J (#195) + PR-F.1 (#206) + PR-N (#200) + lint/TOCTOU hardening (#198/#210). - Tightened the PR-F.2 deferred note to point at the three open ports (#204/#205/#207) as the gate. ### `MILESTONES.md` - **M1** (pipeline runtime) — status row now cites PR-A2 (#189), PR-F.1 (#206), PR-F.2 gate (#204/#205/#207), PR-E (#180), retains `internal/config/` (still load-bearing for `tracecore validate`). - **M2** (self-telemetry) — status row now cites PR-F.1 (#206); flags `internal/componentstatus` as travelling with `internal/pipeline` in PR-F.2. - **M8** (DCGM receiver) — status flipped to *landed-and-replaced*: cites PR-F.1 (#206) deletion + PR-J (#195) `docs/integrations/prometheus-scrape.md` recipe. Notes the inert chart toggle retention until PR-K.3. ### `docs/migration/v0.1-to-v0.2.md` - §`internal/*` package deletion (PR-F) status flips from "not yet open" to "PR-F.1 landed (#206), PR-F.2 gated on three open ports". - Open-items checklist expanded from 5 to 13 entries — tracks every PR letter the migration guide cares about (A2 / E / F.1 / F.2 / I.1a-c / J / K.1-3 / L / N) with PR numbers and links. ## Why now Tracking docs accumulated drift across wave-3 + wave-4 because every sibling-port PR (and the support-infra PRs around them) updated the bottom of `CHANGELOG.md` but did not always touch the upstream sequencing section in RFC-0013. Per memory rule `[Keeping this document current]`: status drift is a review blocker. This PR is the consolidated catch-up; future port PRs include their RFC-row flip in-PR. ## What this PR does NOT change - No code, no config, no YAML, no chart — only the four tracking docs. - No new doc gates added; existing gates pass. - No PRs other than the four named docs are modified. ## Test plan - [x] `bash scripts/doc-check.sh` clean (33 test refs, 528 links resolve, comment-noise diff gate clean vs `origin/main`, all 13 gates green). - [x] Pre-commit hook (`commitlint` 72-char subject limit + DCO + AI-trailer gates) passed. - [x] Pre-push hook (`make ci-fast` equivalent: `golangci-lint`, `go vet`, `go mod verify`, `no-autoupdate-check`, `doc-check.sh`) passed on second attempt after `git fetch origin main` populated the worktree's `origin/main` ref — first push failed because the worktree previously tracked the (gone) `pr-a2-ocb-main-swap` branch, so `doc-check.sh`'s comment-noise diff-scope gate exited 128 on the missing ref. Root cause fixed by the fetch; not a workaround. - [ ] CI green on this branch. ```release-notes NONE ``` Signed-off-by: Tri Lam <tri@maydow.com> Co-authored-by: Tri Lam <tri@maydow.com>
8 tasks
trilamsr
added a commit
that referenced
this pull request
May 31, 2026
## Summary Deletes the seven `internal/*` packages that RFC-0013 §migration step 8 PR-F.2 promised once the upstream-port wave (#201/#202/#203/#204/#205/#207/#208/#209) cleared every external caller of the in-tree pipeline runtime. **Net: -6,888 LOC across 56 deleted files, +80 LOC across 14 modified files. 70 files total.** This is the final cut of RFC-0013 §migration step 8 PR-F. ## What deletes | Path | LOC | Replacement | |---|---|---| | `internal/pipeline/` | 4,134 | `go.opentelemetry.io/collector/service` (OCB-generated `_build/main.go` consumes `builder-config.yaml`). | | `internal/pipelinebuilder/` | 1,282 | Same — assembly is upstream `service`. | | `internal/config/` | 718 | Upstream `confmap` providers (`file`, `yaml`, `env`). | | `internal/consumer/` | 87 | Upstream `go.opentelemetry.io/collector/consumer`. | | `internal/fanout/` | 366 | Upstream `internal/fanoutconsumer` (collector module). | | `internal/componentstatus/` | 16 | Upstream `component/componentstatus.ReportStatus` (same free-function shape). | | `internal/runtime/lifecycle/` | 505 | Per-receiver package-local `lifecycle.go` siblings — already ported during the PR-B1 wave (#184/#185/#186/#187/#194/#196/#197); the in-tree helper had no remaining non-test consumer after PR-F.1 + the wave-2 upstream-port PRs. `kernelevents/lifecycle.go` was inherited from k8sevents (#208). | ## Pre-flight grep evidence ``` $ grep -rn 'tracecoreai/tracecore/internal/(pipeline|consumer|pipelinebuilder|config|fanout|componentstatus|runtime/lifecycle)' --include='*.go' . (zero matches) ``` ## Tooling - `.golangci.yml` `ignore-interface-regexps` repointed at upstream `consumer.{Metrics,Traces,Logs}` + `component.Component`. The in-tree-only same-package-error-wrap exemption stays — the STYLE rule applies regardless of which interface is forwarded. - `.github/workflows/chaos.yml` drops the `chaos-pipeline-test` job (the in-tree `internal/pipeline/chaos_test.go` is gone; upstream `service` provides the equivalent panic-recovery contract). `harness-determinism` (failure-inject golden-SHA), `cpu-steal-mpstat`, `pattern-pod-evicted` jobs preserved. - `.github/workflows/install-bench.yml` drops the `internal/{pipeline,runtime,selftelemetry}/**` path-filter rows. - `go.mod` / `go.sum` unchanged. ## Doc sweep - `CHANGELOG.md` Unreleased: PR-F.2 landed entry replacing the "PR-F.2 deferred" sentence; "Remaining v0.1.0 work" line updated; one dead `internal/pipeline/README.md` link in Foundation block rewritten as "deleted at v0.1.0". - `docs/rfcs/0013-distro-first-pivot.md` §7 deletion table: both pipeline-internals and runtime/lifecycle rows updated from "v0.1.0 (audit first…)" / "v0.2.0 (with last consumer)" to "v0.1.0 (landed PR-F.2)". §migration step 8 reframed. - `docs/FAILURE-MODES.md` Lifecycle / Data flow / Shutdown timing / Backend tables rewired from in-tree `internal/{config,pipeline,fanout}/*_test.go::TestName` pointers to upstream-delegated wording matching the pattern PR-A2 established. - `docs/STRATEGY.md` "Post-RFC-0013 status" intro updated; "Stable interfaces in `internal/pipeline/`" graduation row rewritten to point at the upstream surface. - `docs/migration/v0.1-to-v0.2.md` `internal/*` section status banner flipped from "deferred, still present in RC builds" to "landed, deleted in v0.2.0 builds". - `MILESTONES.md` v0.1.0 deletions row extended with boot-path internals; M1 + M4b + M19 rubric details annotated with the PR-F.2 retirement. - `README.md` Contributor row repointed at upstream `go.opentelemetry.io/collector` package docs. - `AGENTS.md` "Self-telemetry internals" bullet split into "Self-tel internals" + "Pipeline / boot-path internals" with explicit deletion status. - `docs/README.md` table row for `internal/pipeline/README.md` dropped. - `components/receivers/kernelevents/README.md` lifecycle-sibling rationale updated to past-tense. - `tools/failure-inject/README.md` "Testing locally" section drops the `-tags=chaos ./internal/pipeline/...` invocation. ## Sequencing This PR is hard-gated on every upstream-port PR landing first: - #201 nccl_fr (PR-B2) - #202 stdoutexporter - #203 pyspy - #204 k8sevents - #205 clockreceiver (PR-B3) - #207 otlphttp - #208 kernelevents - #209 containerstdout - #206 PR-F.1 (selftel / telemetry / dcgm) All nine merged before this PR opened; this is the moat-deletion payoff. Remaining v0.1.0 work is PR-K (chart-default flip + `clockreceiver` + `stdoutexporter` + remaining receiver source deletions, coupled with test-fixture migration and the `telemetry:` values-key deprecation cycle). ## Test plan - [x] `make check` — golangci-lint 0 issues, go vet clean, go mod verify ok. - [x] `go build ./...` — clean. - [x] `go test -count=1 ./...` — green (excluding the known `kernelevents/TestReceiver_SLIBudget` flake called out in #205's body, which only triggers under heavy parallel `go test ./...` load; passes standalone). - [x] `grep` confirms zero non-internal callers of the deleted packages. - [x] Doc-check pre-push hook passes after the CHANGELOG dead-link fix. ```release-notes [CHANGE] internal/{pipeline,pipelinebuilder,config,consumer,fanout,componentstatus,runtime/lifecycle} packages deleted. The OCB-generated boot path off builder-config.yaml replaces them. Third-party importers of internal/* (unlikely pre-1.0; the packages live under internal/ and the Go compiler rejects external imports) lose the pipeline-assembly + lifecycle + config-loader surfaces; receiver authors now wire against upstream go.opentelemetry.io/collector/{component,receiver,consumer,pipeline} directly. See docs/migration/v0.1-to-v0.2.md "internal/* package deletion". ``` --------- Signed-off-by: Tri Lam <tri@maydow.com> Co-authored-by: Tri Lam <tri@maydow.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Ports
components/receivers/clockreceiveroffinternal/{pipeline,consumer,runtime/lifecycle}onto upstream OTel v1.59.0 types, mirroring PR-B2 (#201) which did the same fornccl_fr. Unblocks RFC-0013 §migration PR-F.2 (deletion of the three internal packages once every receiver/exporter has migrated).Third receiver port in the cross-receiver upstream-API wave (after
nccl_fr#201; siblings for kernelevents/stdoutexporter/k8sevents/containerstdout/otlphttp/pyspy running in parallel).Type swap applied
internal/pipeline.Typecomponent.Typeinternal/pipeline.ReceiverFactoryreceiver.Factoryinternal/pipeline.CreateSettingsreceiver.Settings(viareceivertest.NewNopSettings)internal/pipeline.Configcomponent.Configinternal/pipeline.Receiverreceiver.Metricsinternal/consumer.Metricsconsumer.Metrics*slog.Logger*zap.Loggerinternal/pipeline.MustNewTypecomponent.MustNewTypeinternal/pipeline.MustNewIDcomponent.NewIDWithNameShape changes
NewFactory()returns a freshreceiver.Factoryeach call (mirrors upstreamotlpreceiver/filelogreceiver); the v0.1.xclockreceiver.Factorypackage-var is gone. No external Go consumers exist outside the receiver dir, so this is internally safe (kernelevents + stdoutexporter doc comments referencing the var are unchanged — they'll get rewritten by their own in-flight upstream-port PRs).clockReceiverstruct →Receiver(exported, matches "single-Type-per-package" upstream convention).pipeline.ComponentStateembedding dropped — the upstreamreceiver.Metricsinterface is now the only lifecycle contract, pinned at build time viavar _ receiver.Metrics = (*Receiver)(nil). The only test that depended onStarted()/Stopped()accessors (TestReceiver_ComponentStateFlags) is removed; equivalent guarantee is now the compile-time interface assertion.createMetricsnow callscfg.Validate()before constructing the receiver, mirroring PR-B2. Catchesinterval=5msat component-construction time instead of first tick.Test rewrites
selftel_test.go+clockreceiver_test.gousereceivertest.NewNopSettings(componentType())for stable settings; ID overridden toclockreceiver/testso scope/label assertions stay deterministic.errors_integration_test.godropsinternal/pipeline/pipelinetestby inlining a 10-linefailingMetricsSink.internal/telemetry(MeterProvider + Prometheus handler) is kept — it's not in the banned set and the Prometheus surface assertion is the point of the test.lifecycle_test.gouseszap.NewNop()instead ofslog.Default().Verification
grep "internal/pipeline\|internal/consumer\|internal/runtime/lifecycle" components/receivers/clockreceiver/*.goreturns only doc-comment references to the v0.1.x history (no live imports / call sites).make check— golangci-lint 0 issues,go vet ./...clean,go mod verifyok.go test -race -count=1 ./components/receivers/clockreceiver/...— pass (1.6s).go test -count=1 ./...— pass everywhere except a pre-existing flakyTestReceiver_SLIBudgetinkernelevents(CPU-budget assertion, p99 16ms vs 5ms target; reproduces on second run as PASS — orthogonal to this PR).Test plan
make checkclean.go test -race ./components/receivers/clockreceiver/...green.go test ./...— only sibling-receiver flake (not introduced here).Refs: PR-B2 #201, RFC-0013 §migration PR-F.2.