Merged
12 changes: 11 additions & 1 deletion CHANGELOG.md
@@ -4,7 +4,7 @@ User-visible changes are documented here. Format: [Keep a Changelog](https://kee

## [Unreleased]

Pre-alpha. **Distribution-first pivot adopted ([RFC-0013](docs/rfcs/0013-distro-first-pivot.md))** - binary now assembled via the OpenTelemetry Collector Builder (OCB) from upstream + contrib components plus a thin `tracecoreai/tracecore-components` module containing only the moat (NCCL FlightRecorder receiver, OTTL processors with windowed semantics, pattern detectors). The M1 in-tree pipeline runtime + factory-based assembly is queued for deletion at v0.1.0 in favor of the OCB-generated boot path; the canonical `clockreceiver` + `stdoutexporter` examples ship for one PR cycle and then exit. Targeting v0.1.0 / v0.2.0 / v0.3.0 release boundaries per RFC-0013 §4.
Pre-alpha. **Distribution-first pivot adopted ([RFC-0013](docs/rfcs/0013-distro-first-pivot.md))** - binary now assembled via the OpenTelemetry Collector Builder (OCB) from upstream + contrib components plus a thin in-repo Go submodule at `module/` (path `github.com/tracecoreai/tracecore/module`) containing only the moat (NCCL FlightRecorder receiver, OTTL processors with windowed semantics, pattern detectors). The M1 in-tree pipeline runtime + factory-based assembly is queued for deletion at v0.1.0 in favor of the OCB-generated boot path; the canonical `clockreceiver` + `stdoutexporter` examples ship for one PR cycle and then exit. Targeting v0.1.0 / v0.2.0 / v0.3.0 release boundaries per RFC-0013 §4.

Pivot landed across three waves of PRs:
- Wave 1 (#166 RFC doc accepted, #168 delete kueue + kineto receivers, #169 pre-PR-A drift sweep + Helm security tighten, #170 containerstdout deletion explicit in §7, #171 PR-A OCB skeleton + `builder-config.yaml` + `make build-ocb`, #172 dedup gate execution, #173 rename check tiers + add PR body-artifact guard, #174 PR-C release pipeline → goreleaser stack + RFC supersession + top-level doc alignment, #175 wave-1 self-review fixes + delete archive folder).
@@ -15,6 +15,16 @@ Pivot landed across three waves of PRs:

Remaining v0.1.0 work: PR-F (delete `internal/{componentstatus,selftelemetry,telemetry}` + `components/receivers/{dcgm,kueue}`) deferred — chart default pipeline hardwires the to-be-deleted receivers, so deletion happens together with the v0.2.0 recipe migration (PR-K) to avoid an interim chart break. `clockreceiver` source deletion also part of PR-K per PR-E rationale.

**RFC-0013 §migration rescoped (doc-only).** Headline: **PR-I is now an in-repo Go submodule at `module/`, not an external `tracecoreai/tracecore-components` repo.** For an open-source project, keeping one fork, one CI, one issue tracker, and one DCO wins. Go submodule tags give the module an independent version line, and OCB `gomod:` plus a dev-loop `replaces: ./module` resolves identically to an external repo.
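
To illustrate the `gomod:` + `replaces:` pairing, a `builder-config.yaml` fragment might look roughly like the sketch below. The version `v0.2.0`, the `import:` line, and the exact `replaces` path are assumptions for illustration, not values taken from this PR. Relative `replaces` paths may resolve against OCB's generated output directory rather than the repo root, so verify that against the builder in use.

```yaml
# Sketch only: the version and paths are hypothetical placeholders.
receivers:
  # Module path as stated in this entry; the submodule would be tagged
  # with a directory-prefixed tag such as module/v0.2.0 (hypothetical version).
  - gomod: github.com/tracecoreai/tracecore/module v0.2.0
    # Assumed package location inside the submodule.
    import: github.com/tracecoreai/tracecore/module/receiver/ncclfrreceiver

# Dev-loop override: resolve the submodule from the working tree
# instead of a published tag.
replaces:
  - github.com/tracecoreai/tracecore/module => ./module
```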

Three sequencing findings from adversarial scoping of PR-B / PR-F / PR-I:

- **New PR-A2 introduced as sequencing gate.** Switching `cmd/tracecore` to OCB-generated main + deleting legacy boot wiring is the precondition for PR-B2 / PR-F / PR-I — they cannot delete or rewire `internal/pipeline` + `internal/consumer` while the legacy boot path is the live wiring.
- **PR-B splits into PR-B1 + PR-B2.** B1 (lands now): port nccl_fr off `internal/selftelemetry` + `internal/runtime/lifecycle`; helpers travel with the receiver as siblings using a receiver-scoped `MeterProvider` (instrument names `otelcol_receiver_nccl_fr_*` — receiver-scoped meter cannot collide with pipeline-runtime's own `otelcol_*` namespace). B2 (lands with or after PR-A2): mechanical import swap off `internal/pipeline` + `internal/consumer` to upstream `go.opentelemetry.io/collector/{component,receiver,consumer,pipeline}`.
- **PR-I subdivides into PR-I.1 + PR-I.2.** I.1: move surviving `nccl_fr` into `module/receiver/ncclfrreceiver/` after PR-B2 cleans its imports. I.2: build `rankjoinprocessor` + `patterndetectorprocessor` as net-new OTel processors wrapping `internal/synthesis/patterns/` logic, after PR-K severs the k8sevents dep that the pattern engine currently imports.

Files updated in this PR: `docs/rfcs/0013-distro-first-pivot.md` (§1, §6, §migration §4 v0.2.0 row, §migration §OQ #2, §migration v0.1.0 sequence renumbered 1-10 with PR-A1 / PR-A2 / PR-B1 / PR-B2 / PR-C / PR-D / PR-E / PR-F / PR-G / PR-H, §migration v0.2.0 PR-I body rewritten with in-repo layout + sub-sequencing), `docs/migration/v0.1-to-v0.2.md` (moat-components row reworded with explicit current-state vs future-state), `docs/rfcs/README.md` (new-receiver pointer), `docs/STRATEGY.md` (moat-location framing).

**PR-B reframed: self-tel metric rename (`tracecore.*` → `otelcol_*`) is a side-effect of the binary swap, not a caller rewrite.** Investigation found that `service/telemetry` + `componentstatus` upstream APIs are not drop-in replacements for the `IncError`/`IncEmissions`/`ObserveLatency`/`SetDegraded`/`MarkActivity` surface that `internal/selftelemetry/` provides today — the standard `otelcol_*` metrics RFC-0013 §2 promises are emitted by upstream `receiver/scraperhelper`, `exporter/exporterhelper`, and the OCB-generated pipeline runtime, NOT by `componentstatus` (which is a status-event surface). The rename therefore arrives automatically once PR-A's OCB binary boots with upstream receivers and PR-F deletes the in-tree receivers; no caller rewrite is needed in between. RFC-0013 §migration PR-B is collapsed into PR-F; the standalone PR-B step is documentation-only and lives in this CHANGELOG entry.

**PR-D landed: production container-image build moved to [`ko`](https://ko.build).** Root `Dockerfile` deleted; `.ko.yaml` at the repo root pins `gcr.io/distroless/static-debian12:nonroot` by digest, `defaultPlatforms: linux/{amd64,arm64}`, and the `cmd/tracecore` ldflags that match `.goreleaser.yaml` so the in-image binary is shape-identical to the goreleaser-published archive's binary. New `ko-publish` job in `.github/workflows/release.yml` runs after `goreleaser`, builds the multi-arch image, pushes by digest to `ghcr.io/tracecoreai/tracecore` (matching the chart default `image.repository`, so no chart values change is required), tags `:<version>` plus `:latest` on stable releases only (a stable release has no `-` pre-release suffix in its SemVer), cosign-signs the manifest keyless against the workflow's OIDC identity, and pushes an `actions/attest-build-provenance` attestation bound to the image digest into the registry. `scripts/base-digest-check.sh` now reads the base-image pin from `.ko.yaml::defaultBaseImage` instead of the deleted `Dockerfile`. Operator-visible: chart pull path unchanged; verification flow now uses `cosign verify <repo>@<digest>` + `gh attestation verify oci://<repo>@<digest>` (docs/reproducibility.md walkthrough updates carry-forward). The chart-local kind-CI reference `install/kubernetes/tracecore/Dockerfile` (used by `.github/workflows/{chart,install-bench}.yml`) is preserved; it builds a self-contained CI image without depending on the production build path.
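
The tagging rule for the `ko-publish` job can be sketched in a few lines of Go. This is an illustration, not the workflow's actual implementation: the helper name `imageTags` is hypothetical, and it assumes the version tag is always applied while `latest` is added only when the version string contains no `-`.

```go
package main

import (
	"fmt"
	"strings"
)

// imageTags is a hypothetical helper (not part of the repo) that returns
// the tags to apply to a published image for a given SemVer string.
// Assumption: the version tag is always applied; "latest" is added only
// for stable releases, i.e. versions with no "-" pre-release suffix.
func imageTags(version string) []string {
	tags := []string{version}
	if !strings.Contains(version, "-") {
		tags = append(tags, "latest")
	}
	return tags
}

func main() {
	// Print the tag set for one stable and one pre-release version.
	for _, v := range []string{"0.1.0", "0.1.0-rc.1"} {
		fmt.Println(v, imageTags(v))
	}
}
```

The check keys on `-` alone because that is how the entry defines a stable release; a stricter SemVer parser would be needed if build metadata ever appeared in the version string.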
12 changes: 6 additions & 6 deletions docs/STRATEGY.md
@@ -128,7 +128,7 @@ patterns *that are expensive to add and additive later*.

**`tracecoreai/tracecore` (this repo):**
- `builder-config.yaml` - OCB manifest pinning upstream + contrib +
`tracecore-components` versions per release cycle.
`module/` submodule tag versions per release cycle.
- `install/kubernetes/tracecore/` - Helm chart + OTTL normalization
layer (the customer-stable contract in RFC-0013 §3).
- `docs/integrations/` - bundled recipes wiring upstream receivers
@@ -137,8 +137,8 @@ patterns *that are expensive to add and additive later*.
- `.github/workflows/release.yml` - goreleaser + slsa-github-generator
+ cosign-installer + sbom-action integration glue.

**`tracecoreai/tracecore-components` (separate repo, separate Go
module):**
**`module/` (in-repo Go submodule,
path `github.com/tracecoreai/tracecore/module`):**
- `receiver/ncclfrreceiver/` - moat scope #1.
- `processor/rankjoinprocessor/` - moat scope #2 (windowed cross-signal
join).
@@ -153,7 +153,7 @@ posture):
(versions pinned per release cycle); the components moat evolves at
customer-driven cadence (new patterns, parser fixes). Separating
the modules lets each move independently.
- `tracecore-components` is the upstream-contribution staging ground:
- `module/` submodule is the upstream-contribution staging ground:
when upstream OTel-contrib accepts a component, it leaves this
repo with no ripple into the distro skeleton's `go.mod`.

@@ -224,8 +224,8 @@ harness lands in M5.

**Default answer: don't add a receiver to this repo.** Per RFC-0013
§6, in-house code is bounded to the four moat scopes. New
receivers/processors land in the separate `tracecoreai/tracecore-components`
Go module (which graduates components to upstream OTel-contrib when
receivers/processors land in the in-repo `module/` Go submodule
(which graduates components to upstream OTel-contrib when
mature). Only components matching a moat scope live in-tree.

When a new component is genuinely in scope (e.g. a second cross-signal
4 changes: 2 additions & 2 deletions docs/migration/v0.1-to-v0.2.md
@@ -20,7 +20,7 @@ v0.2.0 completes the RFC-0013 receiver swap. The in-tree custom receivers for ke
| Kueue scheduler metrics | `kueue` (in-tree, never shipped) | `prometheusreceiver` recipe with bearer-token + TLS | Opt-in via `kueue.recipe: prometheus`. |
| Heartbeat / install-bench primitive | `clockreceiver` (in-tree, chart default) | `hostmetricsreceiver` (loadscraper @ 1s, upstream OCB-bundled) | v0.1.x bench already swapped (PR-E). v0.2.0 flips the chart default — set `receivers.hostmetrics.enabled: true` + `receivers.clockreceiver.enabled: false` if you want to track the new default before the chart-default flip; otherwise no action until v0.2.0. `NOTES.txt` will surface a deprecation warning for one minor after the flip. |
| Kineto profiler | `kineto` (in-tree, deferred) | Deferred until OTel Profiles GA | No action; re-evaluation when contrib ships `pprofreceiver`. |
| `tracecoreai/tracecore-components` module | Lives inside this repo | Separate Go module pulled via OCB `gomod:` | No operator action. Module split is internal. |
| Moat components (nccl_fr + pattern engine) | Currently in `components/receivers/nccl_fr/` + `internal/synthesis/patterns/` under the single repo-root `go.mod` | Will live in an in-repo Go submodule at `module/` (path `github.com/tracecoreai/tracecore/module`) pulled via OCB `gomod:` + dev-loop `replaces: ./module` | No operator action. Submodule split is internal — same repo, same fork, same CI; OCB builds via `gomod:` like any other upstream module. |
| Helm values keys | Per-receiver `<name>.*` | Per-receiver `<name>.recipe: <upstream|legacy>` + per-recipe stanzas | One-minor compat. Migrate by setting `.recipe: upstream` per receiver. |
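
The opt-in keys named in the table can be collected into a single values override. This is a sketch: it assumes the dotted paths in the table map directly onto nested YAML keys in the chart's values, which the guide does not show explicitly.

```yaml
# Sketch of a values override that tracks the new defaults before the
# chart-default flip. The nesting is inferred from the dotted key paths.
receivers:
  hostmetrics:
    enabled: true      # upstream heartbeat replacement
  clockreceiver:
    enabled: false     # in-tree receiver, slated for deletion

kueue:
  recipe: prometheus   # opt in to the prometheusreceiver recipe
```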

## What's NOT changing
@@ -51,7 +51,7 @@ If recipe-toggle rollback doesn't help, pin the chart and image at the prior v0.

## Open items (fill in as PRs land)

- [ ] PR-I (separate `tracecoreai/tracecore-components` module) — link
- [ ] PR-I (in-repo Go submodule extraction at `module/`) — link
- [ ] PR-J (ship recipes for filelog + journald + k8sobjects + prometheus) — link
- [ ] PR-K (delete in-tree receivers) — link
- [ ] PR-L (this guide, full body) — link