feat(ocb): add builder-config + build-ocb target (RFC-0013 PR-A)#171

Merged
trilamsr merged 3 commits into
mainfrom
chore/pr-a1-ocb-skeleton-and-kineto-cleanup
May 31, 2026

@trilamsr trilamsr commented May 31, 2026

Contributor

What this PR does

Implements RFC-0013 PR-A skeleton: tracecore can now build via the OpenTelemetry Collector Builder alongside the legacy cmd/tracecore binary. Both targets coexist for one PR cycle per RFC-0013 PR-A side-by-side requirement.

  • New: builder-config.yaml at repo root — pins core OTel v0.110.0 + contrib v0.110.0 receivers/processors/exporters/extensions.
  • New: make build-ocb Makefile target, which runs go run go.opentelemetry.io/collector/cmd/builder@v0.110.0 --config=builder-config.yaml and writes the binary to ./_build/tracecore. No new go.mod dep; builder runs as a tool.
  • Updated: make build help text marks legacy target as retired at v0.2.0 (per RFC-0013 PR-F).
  • Updated: .gitignore adds /_build/ for OCB output dir.
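
As a rough illustration of the pinning described above, the sketch below writes a small OCB manifest and lists any `gomod` entry that is not pinned to the expected version. The component selection, the `dist` values, and the helper names `write_sketch_config` and `unpinned_modules` are assumptions made for this example; they are not the contents of the real builder-config.yaml.

```shell
#!/usr/bin/env bash
# Sketch only: the entries below are illustrative assumptions, not the
# actual builder-config.yaml from this PR.
write_sketch_config() {
  cat > "$1" <<'EOF'
dist:
  name: tracecore
  description: tracecore built via OCB (illustrative)
  output_path: ./_build
  version: 0.1.0

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.110.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/filelogreceiver v0.110.0

processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.110.0

exporters:
  - gomod: go.opentelemetry.io/collector/exporter/debugexporter v0.110.0

extensions:
  # zpagesextension is published from the core repository, not contrib.
  - gomod: go.opentelemetry.io/collector/extension/zpagesextension v0.110.0
EOF
}

# Print the module path of every gomod entry whose version differs from WANT.
# Assumes the "- gomod: <path> <version>" layout used in the sketch above.
unpinned_modules() {
  local config="$1" want="$2"
  awk -v want="$want" \
    '$1 == "-" && $2 == "gomod:" && $NF != want { print $3 }' "$config"
}
```

Running `unpinned_modules builder-config.yaml v0.110.0` against a manifest in this layout prints nothing when every entry is pinned to that version, which makes it easy to use as a quick pre-build check.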

Deviations from RFC-0013 §1 example shape

Three changes were needed for the example config to actually build (one path correction and two omissions):

  1. zpagesextension: RFC-0013 example put it under contrib, but the module path github.com/open-telemetry/opentelemetry-collector-contrib/extension/zpagesextension has no v0.110.0 tag. The module lives in core: go.opentelemetry.io/collector/extension/zpagesextension. Corrected here.
  2. telemetrygeneratorreceiver: contrib path lists the receiver, but it has no published v0.110.0 tag (pseudo-module). Omitted here with an inline comment; will be added in PR-E with the clockreceiver swap.
  3. tracecoreai/tracecore-components/* entries (ncclfrreceiver, rankjoinprocessor, patterndetectorprocessor): the separate module repo doesn't exist yet — created in PR-I (v0.2.0). Omitted from this skeleton.

Doc cleanup (caught during self-review of #170)

RFC-0013 §migration PR-K: removed stale kineto from the v0.2.0 delete list. Kineto was already deleted in PR-F per #168; it was double-listed because the RFC was drafted before #168 executed early deletion. PR-O retains the OTel Profiles GA re-evaluation hook.

Smoke test

$ make build-ocb
... INFO builder/main.go:131 Compiled {"binary": "./_build/tracecore"}

$ ./_build/tracecore --version
tracecore version 0.1.0

$ ./_build/tracecore components
# enumerates 5 receivers (filelog, journald, k8sobjects, otlp, prometheus)
# + 4 processors (batch, transform, filter, k8sattributes)
# + 4 exporters (otlphttp, debug, datadog, clickhouse)
# + 3 extensions (filestorage, healthcheck, zpages)
# with upstream stability tiers attached

$ make build  # legacy target still works
$ ls -la tracecore _build/tracecore
-rwxr-xr-x  50M  tracecore
-rwxr-xr-x 119M  _build/tracecore

Release notes

[FEATURE] Add `make build-ocb` target that assembles tracecore via the OpenTelemetry Collector Builder from `builder-config.yaml`. The legacy `make build` target continues to work; both binaries coexist for one PR cycle.

Test plan

  • make lint — pass (pre-commit hook)
  • make test — pass (pre-commit hook)
  • make build — legacy binary builds
  • make build-ocb — OCB binary builds, --version + components work
  • Side-by-side: both binaries coexist with no conflicts

Tri Lam added 2 commits May 30, 2026 03:19
Closes the implicit-vs-explicit gap that left issues #159-#163
ambiguous on the containerstdout v0.2.0 fate:

- §4 v0.2.0 row: add containerstdout to in-tree delete list
- §7 deletion table: add row containerstdout → filelogreceiver +
  container stanza + file_storage (v0.2.0, pending pilot audit)
- §migration PR-K: include containerstdout in delete list; note
  M19 cross-signal join test re-homes to processor/rankjoinprocessor
  integration suite against filelogreceiver + k8sobjectsreceiver inputs
- Open Question #1: add M15 containerstdout to pilot-audit set
  (alongside M9 kernelevents, M10 k8sevents, M13 pyspy Phase 2)

The §2 adoption matrix already implied this substitution; §7
deletion table is the authoritative source operators read for
the release-boundary contract, so the explicit row is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: Tri Lam <tri@maydow.com>
PR-A skeleton: tracecore can now build via the OpenTelemetry Collector
Builder alongside the legacy cmd/tracecore binary. Both targets coexist
for one PR cycle per RFC-0013 PR-A side-by-side requirement.

- `builder-config.yaml` at repo root: pins core OTel v0.110.0 +
  contrib v0.110.0 receivers/processors/exporters/extensions. Omits
  the `tracecoreai/tracecore-components/*` entries (PR-I creates that
  module in v0.2.0) and `telemetrygeneratorreceiver` (contrib
  pseudo-module with no v0.110.0 tag — added in PR-E with the
  clockreceiver swap).
- `zpagesextension` corrected to core module path
  `go.opentelemetry.io/collector/extension/zpagesextension`. RFC-0013
  example listed it under contrib, but it lives in core.
- `make build-ocb` target invokes `go run
  go.opentelemetry.io/collector/cmd/builder@v0.110.0
  --config=builder-config.yaml` → `./_build/tracecore`. No new go.mod
  dependency added — builder runs as a tool.
- Legacy `make build` retained; help text marks it as retired at v0.2.0.
- `.gitignore`: add `/_build/` for OCB output dir.

Doc cleanup (caught during self-review of PR #170):
- RFC-0013 §migration PR-K: drop stale `kineto` (already deleted in
  PR-F per #168). Note retains PR-O re-evaluation hook for OTel
  Profiles GA. No semantic change — kineto was double-listed.

Smoke test: `./_build/tracecore --version` → "tracecore version 0.1.0".
`./_build/tracecore components` enumerates 5 receivers + 4 processors
+ 4 exporters + 3 extensions with upstream stability tiers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Signed-off-by: Tri Lam <tri@maydow.com>
@trilamsr trilamsr enabled auto-merge (squash) May 31, 2026 00:04
…ton-and-kineto-cleanup

# Conflicts:
#	docs/rfcs/0013-distro-first-pivot.md
@trilamsr trilamsr merged commit 099a7ec into main May 31, 2026
10 of 11 checks passed
@trilamsr trilamsr deleted the chore/pr-a1-ocb-skeleton-and-kineto-cleanup branch May 31, 2026 00:23
trilamsr added a commit that referenced this pull request May 31, 2026
## What this PR does

Three small cleanups that consolidate the gate UX after PR #172 landed
the dedup split:

### 1. Rename gate tiers for clarity

| Old | New | Where | Time |
|---|---|---|---|
| `check-fast` (PR #172) | `check` | pre-commit | ~6s |
| `check-medium` (PR #172) | `verify` | pre-push | ~25s |
| Old `check` (had `test`) | **deleted** | — | — |
| `ci` | `ci` (unchanged) | manual / GitHub Actions | full |

PR #172 introduced `check-fast` + `check-medium` to dedup gate
execution, but the names leaked the implementation detail (tier).
Renaming to `check` / `verify` matches what they actually do — quick
self-check vs verify-before-push. The old `check` (which included
`test`) is removed; `make test` is still a target if you want it
standalone, but the slow test gate now runs only in CI.

Hooks (`.githooks/pre-commit`, `.githooks/pre-push`) point at the new
names; `make hooks` echo block updated.
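
The hook-to-gate mapping in the table above can be sketched as a small dispatcher. This is a hypothetical helper written for illustration (the name `gate_for_hook` is not from the repository); the real `.githooks/` scripts may simply call `make` directly.

```shell
#!/usr/bin/env bash
# Map a git hook stage to the make gate named in the rename table.
# Hypothetical helper; not taken from the repository's hook scripts.
gate_for_hook() {
  case "$1" in
    pre-commit) echo "check" ;;
    pre-push)   echo "verify" ;;
    *)
      echo "unknown hook: $1" >&2
      return 1
      ;;
  esac
}

# A hook script could then end with something like:
#   exec make "$(gate_for_hook "$(basename "$0")")"
```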

### 2. Split the .PHONY line into 8 logical groups

The Makefile had a single 60-target `.PHONY` line, ~1.2KB wide. Split
into:
- build, test, format, generate, coverage, policy gates, aggregate
gates, release+integration

Zero semantic change. Readability win for anyone editing the Makefile.

### 3. Add pr-lint workflow step that rejects shell-heredoc body artifacts

PRs #171 and #172 both shipped with literal backslash-backtick artifacts
in their descriptions because the bodies were composed inline via `gh pr
create --body "$(cat <<'EOF' ... EOF)"` and over-escaped backticks.
Renders as visible `\` on GitHub.

The new pr-lint step counts occurrences and fails when 3+ are present
(plural is the artifact signature; a single mention in prose describing
the bug is fine — this PR body itself contains one). Error message
points to `gh pr edit --body-file FILE.md` as the working pattern.
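
A minimal sketch of the counting rule described above, assuming the PR body arrives on stdin. The function names and the exact error wording are hypothetical; only the threshold of 3 and the `gh pr edit --body-file` pointer come from the description.

```shell
#!/usr/bin/env bash
# Count literal backslash-backtick pairs in text read from stdin.
count_escaped_backticks() {
  # -o prints each match on its own line; -F matches the pattern literally.
  # "|| true" keeps the pipeline clean when grep finds no match.
  { grep -oF '\`' || true; } | wc -l | tr -d ' '
}

# Fail when the count reaches THRESHOLD (default 3), mirroring the rule
# that a single mention in prose is acceptable.
check_pr_body() {
  local threshold="${1:-3}" count
  count="$(count_escaped_backticks)"
  if [ "$count" -ge "$threshold" ]; then
    echo "pr-lint: $count backslash-backtick artifacts found;" \
         "try: gh pr edit --body-file FILE.md" >&2
    return 1
  fi
  return 0
}
```

For example, `check_pr_body 3 < body.md` would exit non-zero for a body with three or more such pairs and succeed otherwise.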

## Root cause

- **Gate names:** tier was the wrong abstraction for the UX-facing name.
Speed-tier maps 1:1 to which hook fires it, so the hook stage (`check` =
before-commit, `verify` = before-push) is what users actually think in.
- **PR body artifact:** a quoted shell heredoc (`<<'EOF'`) already
preserves literals, so backticks inside it need no escaping. Escaping
them anyway (`\\\``), on the assumption that the shell would otherwise
expand the backtick, produced a literal backslash + backtick. The
workflow now blocks this class of artifact at lint time, scoped to 3+
occurrences so prose mentioning the pattern doesn't false-positive.

## Release notes

```release-notes
NONE
```

## Test plan

- [x] `make check` runs and passes (~6s observed)
- [x] `make verify` runs and passes (~25s observed)
- [x] Pre-commit hook fires `make check` on commit
- [x] Pre-push hook fires `make verify` on push
- [x] `make hooks` echo block reflects new names
- [x] `make actionlint` passes (new pr-lint step is YAML-valid)
- [x] New rule does NOT fire on this PR body (1 mention, threshold is 3)

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## What this PR does

Bundles three RFC-0013 PR slices that have zero file overlap with each
other.

### PR-C: release pipeline → goreleaser stack

- New `.goreleaser.yaml`: linux/amd64 + linux/arm64 builds; reproducible
via `SOURCE_DATE_EPOCH`; LDFLAGS shape matches the Makefile build
target.
- Rewritten `.github/workflows/release.yml`: invokes goreleaser,
`anchore/sbom-action`, `sigstore/cosign-installer`,
`slsa-framework/slsa-github-generator` (tag-pinned per SLSA OIDC subject
identity requirement; all other actions SHA-pinned per repo security
policy), `actions/attest-build-provenance`.
- Old `release.yml` moved to
`.github/workflows/archived/release.yml.legacy`.
- Goreleaser builds the **legacy** `cmd/tracecore` binary; OCB-output
migration deferred to PR-D (image build → ko), per inline comment in
`.goreleaser.yaml`.

### PR-G + PR-H: RFC supersession + top-level docs alignment

- Audit confirmed all 12 RFCs already carry the correct supersedence
  headers from prior pivot work (PRs #166/#168/#169/#170). Only two
  top-level docs needed alignment:
  - `NORTHSTARS.md` O1 caveat: replaced "own-binary architecture"
    assumption wording with OCB-distribution-posture wording; closed Open
    Question #1 by RFC-0013 ref.
  - `CHANGELOG.md`: appended pivot-wave-1 PR list
    (#166/#168/#169/#170/#171/#172/#173) citing PR-A as the prior step
    before this commit.
- No edits needed to
  README/STRATEGY/PRINCIPLES/MILESTONES/CONTRIBUTING/AGENTS/docs/README;
  all already aligned.

### PR-E: clockreceiver swap — BLOCKED

- `telemetrygeneratorreceiver` does not exist in
`opentelemetry-collector-contrib` at any version. Verified against the
Go module proxy, GitHub tree API at v0.95→v0.130, and the full receiver
listing at v0.110.0 (94 receivers; no `telemetrygenerator`, `loadgen`,
`mockreceiver`, `dummyreceiver`, or any `*generator*`). The RFC-0013 §1
example shape referenced it speculatively; it was never upstreamed.
- `builder-config.yaml`: replaced the misleading "no v0.110.0 tag"
omission comment with a verified TODO block describing the actual
blocker (receiver doesn't exist anywhere) and decision rationale.
- `bench/install/tracecore-values.yaml`: appended `[BLOCKED]` marker on
the clockreceiver→telgen mapping; bench continues to use in-tree
clockreceiver until PR-F deletes it (likely rewires to
`hostmetricsreceiver`).
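
The kind of module-proxy check mentioned above could be sketched as follows. The helper `has_version` is hypothetical; the commented wiring relies on `go list -m -versions`, which prints a module path followed by its known versions and needs network access, so it is left commented out here.

```shell
#!/usr/bin/env bash
# has_version WANT LIST...: succeed if WANT appears among the remaining
# arguments. Hypothetical helper for illustration.
has_version() {
  local want="$1" v
  shift
  for v in "$@"; do
    [ "$v" = "$want" ] && return 0
  done
  return 1
}

# Example wiring (requires network access to the module proxy):
#   versions="$(go list -m -versions go.opentelemetry.io/collector/extension/zpagesextension)"
#   # Intentionally unquoted so the version list splits into words.
#   has_version v0.110.0 $versions && echo "v0.110.0 is published"
```

A module path that resolves to nothing at all, as described for `telemetrygeneratorreceiver`, would be expected to make the lookup itself fail rather than return a version list missing one tag, which is the distinction the corrected TODO block draws.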

## Root cause (PR-E blocker)

RFC-0013 §1 listed `telemetrygeneratorreceiver` as the swap target
without verifying the receiver existed upstream. Reality: the OTel
contrib repo has no such module path at any tag. PR-E cannot complete
until either (a) the receiver lands upstream, or (b) a different
replacement is chosen (e.g., `hostmetricsreceiver` for heartbeat
semantics). Tracked in the in-file TODO block; revisit in PR-F (delete
clockreceiver) or as a separate followup.

## Release notes

```release-notes
[CHANGE] Release pipeline migrated to goreleaser + SBOM + SLSA provenance + cosign signing. The release.yml workflow now invokes goreleaser instead of building binaries directly. Operators consuming release artifacts: artifact shape (filename, archive contents, checksum file format) follows goreleaser defaults; see CHANGELOG.md for the migration note.
```

## Test plan

- [x] `make verify` runs and passes
- [x] `make actionlint` passes (new release.yml workflow +
suppression-block YAML valid)
- [x] `make zizmor` passes (SLSA reusable-workflow tag-pin justified
inline + accepted)
- [x] `make build` (legacy) still works
- [x] `make build-ocb` (OCB) still works
- [ ] Goreleaser dry-run in CI on first push to a tag (gated until a tag
exists)

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## What this PR does

Adds a `build-ocb` job to `ci.yml` that builds tracecore via the
OpenTelemetry Collector Builder on every CI run, then smoke-tests the
resulting binary with `--version` + `components`.
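
The smoke-test half of such a job might look like the sketch below. It assumes only the two invocations named above (`--version` and `components`); the function name and messages are hypothetical and this is not the actual CI step.

```shell
#!/usr/bin/env bash
# smoke_test BINARY: run BINARY with --version and with components, and
# fail if either invocation exits non-zero or prints nothing.
smoke_test() {
  local bin="$1" out
  out="$("$bin" --version)" || { echo "smoke: --version failed" >&2; return 1; }
  [ -n "$out" ] || { echo "smoke: --version printed nothing" >&2; return 1; }
  out="$("$bin" components)" || { echo "smoke: components failed" >&2; return 1; }
  [ -n "$out" ] || { echo "smoke: components printed nothing" >&2; return 1; }
  echo "smoke: ok"
}

# Typical use after a build:
#   make build-ocb && smoke_test ./_build/tracecore
```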

## Root cause

After PR #171 added `make build-ocb`, there was no continuous signal for
OCB build health. Upstream contrib could yank a module version, move a
path, or break a tag between our release cuts — and we'd only find out
on the next release tag push, well after the breakage landed.

The new job catches this class of failure on every PR.

## Cost

- ~2 min added to CI wall time (OCB build downloads ~30 OTel modules and
compiles them; first run on a fresh runner cache).
- No impact on the required `verify` aggregator (independent job).

## Release notes

```release-notes
NONE
```

## Test plan

- [x] `make actionlint` passes
- [x] `make zizmor` passes
- [x] `make build-ocb` runs locally
- [x] `./_build/tracecore --version` + `./_build/tracecore components`
work
- [ ] First CI run on this PR exercises the new job

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
…ft (#178)

## What this PR does

Sweeps three drift sites that surfaced after wave-2 PRs landed:

- `CHANGELOG.md`: replaces stale "Pivot wave 1 landed [...] PR-A is
next" prose (written before PR-A actually merged in #171) with the full
landed history through #176. Adds two paragraphs documenting the PR-E
blocker (upstream `telemetrygeneratorreceiver` doesn't exist) and the
PR-F deferral (chart default pipeline hardwires the to-be-deleted
receivers; deletion happens together with the v0.2.0 recipe migration in
PR-K to avoid an interim chart break).
- `bench/install/tracecore-values.yaml`: the PR-E status note pointed at
"PR-F deletion" as the rewire trigger. Corrected to PR-K since PR-F is
deferred.
- `.goreleaser.yaml`: header still referenced the deleted
`.github/workflows/archived/release.yml.legacy` path. Replaced with
"preserved in git history" (matches what other docs already say after PR
#175).

## Root cause

Wave-2 PRs landed faster than the in-tree status prose could keep up.
The CHANGELOG paragraph in particular was authored mid-pivot before PR-A
merged, and was never refreshed. Caught in a post-merge sweep.

## Release notes

```release-notes
NONE
```

## Test plan

- [x] `make verify` runs and passes
- [x] `grep -r "workflows/archived" .` returns only the (untouched) RFC
reference, which is binding-doc and out of scope for this sweep
- [x] CHANGELOG and bench values still parse as YAML / markdown

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

PR-H sliver per [RFC-0013
§migration](docs/rfcs/0013-distro-first-pivot.md#migration--rollout).
Sweeps `PRINCIPLES.md` + `CONTRIBUTING.md` for distribution-first pivot
drift. Net -128 lines.

## Root cause

Two unrelated drift accumulations, both fixed at source:

1. **PRINCIPLES.md §2 example was factually wrong.** The text "we
deferred GoReleaser, SBOM signing, eBPF integration, the Helm chart, the
OTel `pdata` import — none has cost us" predates the work that landed
since: PR-C (#174) shipped the goreleaser + SBOM + cosign stack; M5b
shipped the Helm chart; OCB adoption (#171 PR-A) pulls `pdata` in via
upstream. The illustrative example for "default to *not* adding" was
listing items we *did* add. Deleted the example; the principle's first
paragraph + bullet list above it carry the message without the
contradiction.

2. **CONTRIBUTING.md carried two layers of pivot drift.**
   - **External-repo references stale post-#181.** Three mentions of
     `tracecoreai/tracecore-components` (separate-repo framing) survived the
     PR #181 rescope to in-repo `module/` Go submodule. Updated to
     `github.com/tracecoreai/tracecore/module` with layout
     `module/receiver/<name>/`, `module/processor/<name>/`.
   - **Adding-a-component tutorial contradicted RFC-0013 policy.** ~130
     lines of tutorial taught contributors to add receivers under
     `components/` referencing `clockreceiver` / `dcgm` / `kernelevents` as
     canonical shapes, but RFC-0013 §6 forbids new in-tree components
     ("Nothing else is built in-house"), and all three canonical references
     are queued for v0.1.0 or v0.2.0 deletion per §7. The routing block 100
     lines above already declared the forbidding policy, so the tutorial was
     a live-policy contradiction with deleted-receiver examples. Replaced
     with a ~10-line routing block covering the three actual branches
     (upstream first / moat → `module/` / RFC for fifth scope) plus a
     one-liner on the surviving factory shape. This is also why the four
     explicit `clockreceiver` references (lines 139/142/149/152 pre-edit)
     were resolved by deletion-of-containing-section rather than name-swap:
     no candidate survivor exists (`dcgm` deletes v0.1.0,
     `clockreceiver`/`kernelevents`/`k8sevents`/`containerstdout` delete
     v0.2.0, `nccl_fr` is logs-only and moves out to `module/` in PR-I.1).

## Changes

- `PRINCIPLES.md` — delete one stale concrete-example paragraph (3 lines
net).
- `CONTRIBUTING.md`:
  - L21, L32: `tracecoreai/tracecore-components` → in-repo `module/`.
  - L103-L235 (old): collapse "Adding a component" tutorial → 7-line
    routing block.

## Scope-discipline notes

Per prompt scope-fence:
- `MILESTONES.md` / `CHANGELOG.md` deletion-table drift untouched
(historic / intentional per prompt).
- `docs/STRATEGY.md` untouched (out of scope per prompt).
- `PRINCIPLES.md` §16 already cites RFC-0013 — no additional edit
needed.
- `STYLE.md` lines 106/115 reference `clockreceiver` as a Go-import
example; left in place — code example for `package clockreceiver`
import-form is shape-illustrative (will rotate when source actually
deletes in PR-K). Flagged for the next sweep, not PR-H.
- `cmd/tracecore/`, `bench/install/`, `install/kubernetes/`,
`AGENTS.md`, `tools/components-gen/` all carry `clockreceiver`
references and are explicitly *out of PR-H scope* — those migrations
land in PR-K (test-fixture coordination) per CHANGELOG line 16.

## Test plan

- [x] `make check` — clean (golangci-lint 0 issues, vet clean,
mod-verify ok).
- [x] `make doc-check` — clean (505 markdown links resolve including new
`#upstream-contribution-policy` + `#rfc-process` cross-links;
banned-phrase lint clean across 109 markdown files).
- [x] `make ci` (via pre-push hook) — clean (actionlint + zizmor +
alert-check + chart-appversion-check + no-autoupdate-check + clean-tree
all pass).
- [x] `grep -n "clockreceiver\|tracecore-components" CONTRIBUTING.md
PRINCIPLES.md` — zero hits post-edit.
- [x] Anchor verification: `#upstream-contribution-policy` resolves to
`## Upstream contribution policy` (L15); `#rfc-process` resolves to `###
RFC process` (L39).

Doc-only diff; no Go test corpus changes.

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>