chore(otel): bump collector v0.110 -> v0.115 + add bump tooling (PR-1/4) #243

Merged: trilamsr merged 6 commits into main from chore/i225-otel-v0115-pr1 on May 31, 2026

@trilamsr trilamsr commented May 31, 2026

Summary

PR-1 of 4 in the OTel collector v0.110 -> v0.130 staged catch-up (#225).

  • Tooling: new make bump-otel VERSION=0.X.0 Makefile target. Single-source sed rewrite across builder-config.yaml (16 lines), module/go.mod (collector/pdata require lines), and the builder@v0.X.0 pin in Makefile. Defaults PDATA_VERSION to 1.<minor-94>.0 per the upstream offset (v0.110<->v1.16, v0.115<->v1.21, v0.130<->v1.36); override for off-cycle bumps. go mod tidy is left manual so reviewers see MVS-resolved drift in the diff.
  • Bump: collector v0.110.0 -> v0.115.0. consumer graduated to v1.21.0 (v1.x train); otel libs followed upstream to v1.32.0; pdata -> v1.21.0. Dropped two indirect entries removed in v0.115 (component/componentprofiles, internal/globalsignal).
  • Test fix: receivertest.NewNopSettings(...) took no arguments from v0.110 through v0.119.x; the existing call at module/receiver/ncclfrreceiver/nccl_fr_test.go:288 passed componentType() and was already broken on main against the v0.110 pin. Drop the bogus argument -- set.ID = component.NewIDWithName(componentType(), "test") already carries the type for downstream BuildInfo derivation. (At v0.120 a type-taking variant lands as NewNopSettingsWithType, addressed in PR-2.)
  • Docs: RFC-0013 example block bumped in lockstep. Historical PR-I.1b narrative line left as-is (records what landed then).
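The PDATA_VERSION default described in the Tooling bullet can be sketched in Go as below. This is only an illustration of the `minor - 94` arithmetic; the real target is a Makefile/sed rule, and the function name `defaultPdataVersion` and its error messages are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pdataOffset is the fixed gap between a collector v0.<minor> release and the
// matching pdata v1.<minor-94> release, as stated in the PR description.
const pdataOffset = 94

// defaultPdataVersion derives the pdata version for a collector version such
// as "0.115.0". Hypothetical helper, not part of the repository.
func defaultPdataVersion(collector string) (string, error) {
	parts := strings.Split(collector, ".")
	if len(parts) != 3 || parts[0] != "0" {
		return "", fmt.Errorf("expected 0.X.0, got %q", collector)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return "", fmt.Errorf("bad minor in %q: %w", collector, err)
	}
	if minor < pdataOffset {
		return "", fmt.Errorf("minor %d is below the offset %d", minor, pdataOffset)
	}
	return fmt.Sprintf("1.%d.0", minor-pdataOffset), nil
}

func main() {
	// The three pairs quoted in the PR description.
	for _, v := range []string{"0.110.0", "0.115.0", "0.130.0"} {
		p, err := defaultPdataVersion(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("collector v%s -> pdata v%s\n", v, p)
	}
}
```

An explicit PDATA_VERSION override would simply bypass this derivation, which is why off-cycle bumps remain possible.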

Out of scope (deferred per #225 plan)

  • PR-2: v0.115 -> v0.120 + Go 1.23 floor bump (module/go.mod go 1.22.0 -> go 1.23.0).
  • PR-3: v0.120 -> v0.125 + *profiles -> x* migration (consumer/consumerprofiles -> consumer/xconsumer, etc).
  • PR-4: v0.125 -> v0.130 + TLSSetting -> TLS rename across config blocks.
  • Root go.mod untouched -- already at v1.59.0 / v0.153.0 from a separate code path.

Renovate decision (per #225 ask)

The issue suggested a Renovate regexManagers block for builder-config.yaml since Dependabot can't parse the freeform gomod: lines. Decision: skip Renovate, document the choice. Rationale:

  1. make bump-otel already automates the multi-file rewrite end-to-end; Renovate's value-add would be opening a PR, but the PR body would still need a human to run go mod tidy (Renovate can't, in-PR, drive a shell-out for a non-go-module file).
  2. Adding a second bot expands the toolchain footprint (renovate.json + Mend permissions) for one file's worth of regex coverage.
  3. Dependabot keeps gomod ecosystem coverage of root/module go.mod, which is the lion's share.

Revisit if (a) Dependabot adds custom regex managers, or (b) the bump cadence exceeds quarterly and the manual ergonomics start to bite.

Verification

  • GOWORK=off go build ./... at root + module/: green.
  • GOWORK=off go test ./... in module/: green (was red pre-bump on main). After #242 (refactor(module): relocate patterns + replay into module/pkg/, PR-I.2a) relocated internal/synthesis/{patterns,replay} -> module/pkg/{patterns,replay}, the package set under verification grew to: module, module/pkg/nccl/fr_parser, module/pkg/patterns, module/pkg/replay, module/receiver/ncclfrreceiver. All green at v0.115.0.
  • GOWORK=off go test -race ./... in module/: green.
  • make ci-fuzz-nccl-fr (30s gate): PASS, 2 new corpus interesting.
  • make build (OCB end-to-end): produces _build/tracecore against builder@v0.115.0; --version prints expected.
  • docker run --rm -v $(pwd)/install/kubernetes/tracecore:/chart alpine/helm:3.16.4 lint /chart: 1 chart linted, 0 failed.
  • make check (fmt + tidy-check + lint + vet + mod-verify): clean.

chore(otel): bump pinned collector v0.110 -> v0.115 (PR-1 of 4 in the
v0.110 -> v0.130 catch-up). Adds `make bump-otel VERSION=0.X.0` to
single-source the multi-file pin rewrite. No runtime/operator-visible
behaviour change; OCB-built tracecore binary now resolves against
collector v0.115.0 components.

Refs #225 (leave open; PR-2/3/4 to follow).

Tri Lam added 6 commits May 31, 2026 13:15
Single-source the contrib/collector minor bump across the three places
the pin lives: builder-config.yaml (OCB-read), module/go.mod (submodule
floor), Makefile (OCB tool install pin). Without this target the bump
is a hand-walked sed across ~31 lines and the next reviewer has to grep
for missed pins; with it `make bump-otel VERSION=0.X.0` is reproducible.

pdata sits on the v1.x train with a fixed -94 offset from collector
minors (v0.110 <-> v1.16, v0.115 <-> v1.21, v0.130 <-> v1.36). Override
via PDATA_VERSION=1.Y.0 for off-cycle bumps; some collector subdeps
(e.g. consumer) graduate to v1.x at different minors -- those are left
to `go mod tidy` to correct after the regex pass.

First consumer is #225 PR-1/4 (v0.110 -> v0.115).

Signed-off-by: Tri Lam <tri@maydow.com>

First leg of the v0.110 -> v0.130 catch-up staged across four PRs:
  PR-1 (this): v0.110 -> v0.115 + tooling (#225 cc13e3b)
  PR-2 (next): v0.115 -> v0.120 + Go 1.23 floor bump
  PR-3:        v0.120 -> v0.125 + *profiles -> x* migration
  PR-4:        v0.125 -> v0.130 + TLSSetting -> TLS rename

Mechanical bump via `make bump-otel VERSION=0.115.0`; module/go.sum
re-derived via `(cd module && GOWORK=off go mod tidy)`. MVS pulled
consumer to v1.21.0 (graduated to v1.x train) and otel libs to v1.32.0
(upstream tracks v0.115 baseline); pdata to v1.21.0 per the -94 offset.
Removed two indirect entries that no longer exist at v0.115:
component/componentprofiles and internal/globalsignal (both folded
into parent packages upstream).

Test fix: receivertest.NewNopSettings() takes no arg at v0.110 through
v0.119.x (NewNopSettingsWithType lands at v0.120). The existing test
called it with componentType() -- broken since the test was written.
Bump unmasks it; drop the bogus arg and set.ID still carries the type
via NewIDWithName, so downstream BuildInfo derivation is preserved.

Verified:
- GOWORK=off go build ./... at root + module (both green)
- GOWORK=off go test ./... in module (green, was red pre-bump)
- make ci-fuzz-nccl-fr 30s gate (PASS, 2 new corpus interesting)
- make build (OCB compiles _build/tracecore with builder@v0.115.0)
- ./_build/tracecore --version prints expected version
- docker run alpine/helm:3.16.4 lint install/kubernetes/tracecore (PASS)

Renovate not added: dependabot.yml already covers gomod ecosystem;
adding Renovate just for builder-config.yaml's regex managers would
duplicate the bot footprint when `make bump-otel` already automates
the multi-file rewrite. Reconsider if Dependabot ever supports
regex managers or if the bump cadence exceeds quarterly.

Refs #225.

Signed-off-by: Tri Lam <tri@maydow.com>

The builder-config.yaml example block in RFC-0013 (lines 30-66) was
copied from the v0.110 baseline; refresh in lockstep with #225 PR-1
so future readers cribbing from the RFC don't paste stale pins.

Historical PR-I.1b narrative at line 248 left as-is -- it records what
landed at the time, not the current pin.

Refs #225.

Signed-off-by: Tri Lam <tri@maydow.com>

Six-months-cold-reader sweep: the inline comment was 6 lines for a
one-line change; trim to 4. Outer testSettings godoc already explains
the ID-override rationale.

Signed-off-by: Tri Lam <tri@maydow.com>

…-pr1

Signed-off-by: Tri Lam <tri@maydow.com>

# Conflicts:
#	module/go.mod
@trilamsr trilamsr enabled auto-merge (squash) May 31, 2026 20:26
@trilamsr trilamsr merged commit 5ccc3a8 into main May 31, 2026
15 checks passed
@trilamsr trilamsr deleted the chore/i225-otel-v0115-pr1 branch May 31, 2026 20:32
trilamsr added a commit that referenced this pull request May 31, 2026
## Summary

Wave **2/4** of the v0.110 → v0.130 OTel catch-up (#225). v0.120 raises
the Go module floor to **1.23** (upstream #12370) which is the
highest-risk hop of the four, so it lands solo with no API-rename
ride-alongs.

Bumps performed via the `make bump-otel` tooling that landed in PR-1/4
(#243), with a small sed-regex improvement (parent commit) to handle the
bare `require X v0.Y.Z` line form that the v0.115 pin used for
`receivertest`.

## What changed

| Surface | Before | After |
|---|---|---|
| `module/go.mod` go directive | `go 1.22.0` | `go 1.23.0` |
| Collector pins (builder-config.yaml + module/go.mod + Makefile OCB pin) | v0.115.0 | v0.120.0 |
| pdata train | v1.21.0 | v1.26.0 (collector-minor − 94) |
| Indirects: `consumerprofiles` / `receiverprofiles` | present | replaced by `xconsumer` / `xreceiver` via `go mod tidy` |
| `bump-otel` sed anchor | `^\t` only | `^(\t\|require[[:space:]]+)`, handles both grouped + bare requires |
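
To illustrate the widened anchor from the last table row, here is a minimal Go sketch of the same match. Only the leading `^(\t|require[[:space:]]+)` group comes from the PR; the module-path and version parts of the pattern, and the `bumpLine` helper, are assumptions made for the example rather than the actual sed expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// pinLine matches a collector require line in either the grouped (tab-led) or
// the bare "require X v0.Y.Z" form. Everything after the first group is an
// assumed shape, not copied from the Makefile.
var pinLine = regexp.MustCompile(
	`^(\t|require[[:space:]]+)(go\.opentelemetry\.io/collector/[^[:space:]]+) v0\.[0-9]+\.[0-9]+`)

// bumpLine rewrites the v0.X.Y version on a matching line and returns any
// other line unchanged.
func bumpLine(line, version string) string {
	return pinLine.ReplaceAllString(line, "${1}${2} v"+version)
}

func main() {
	lines := []string{
		"\tgo.opentelemetry.io/collector/component v0.115.0",
		"require go.opentelemetry.io/collector/receiver/receivertest v0.115.0",
		"\tgo.opentelemetry.io/collector/pdata v1.21.0", // v1.x train, untouched here
	}
	for _, l := range lines {
		fmt.Printf("%q\n", bumpLine(l, "0.120.0"))
	}
}
```

With a `^\t`-only anchor the second line would be skipped, which is the hand-edit the hardening commit removes.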

## v0.120 API-removal sweep

Per the v0.120 release notes:

- `component.TelemetrySettings.MetricsLevel` removed (#11061) — `grep
-rn 'MetricsLevel' --include='*.go'` returns zero hits in `module/`. No
code change.
- `extension.Settings.ModuleInfo` removed (#12296) — tracecore ships no
extensions in `module/`; the contrib extensions consumed via OCB carry
their own pins. No code change.
- v0.118 `receiver/scraperhelper` removal — zero hits in `module/`. No
code change.
- v0.119 `component.Kind` type change — zero hits in `module/`. No code
change.
- v0.120 deprecation of `*test.NewNopSettings` in favor of
`NewNopSettingsWithType` — **deprecation only, not removal**; current
call site at `module/receiver/ncclfrreceiver/factory_test.go` continues
to compile and run. PR-3/4 can migrate.

## Why the Go-toolchain bump only touches `module/go.mod`

Root `go.mod` already pins `go 1.26.3` (PR-1/4 baseline). All CI
workflows use `go-version-file: go.mod` and inherit the bump
automatically. `.go-version` (1.26.3) and
`install/kubernetes/tracecore/Dockerfile` (`golang:1.26.3-alpine`) pin
the **toolchain**, not the module floor — both already exceed 1.23 and
stay unchanged.
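
The floor-versus-toolchain distinction above can be checked with a small Go sketch. It assumes Go 1.22 or newer (for the standard `go/version` package); the `meetsFloor` helper is hypothetical and not part of the repository.

```go
package main

import (
	"fmt"
	"go/version"
)

// meetsFloor reports whether a toolchain release (for example "1.26.3") is at
// least the module floor declared by a go directive (for example "1.23.0").
// go/version expects a "go" prefix, so it is added here.
func meetsFloor(toolchain, floor string) bool {
	return version.Compare("go"+toolchain, "go"+floor) >= 0
}

func main() {
	// Values quoted in the PR: toolchain 1.26.3, module floor raised to 1.23.0.
	fmt.Println(meetsFloor("1.26.3", "1.23.0"))
	// A hypothetical older toolchain would not satisfy the new floor.
	fmt.Println(meetsFloor("1.22.0", "1.23.0"))
}
```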

## Out of scope (per the four-PR plan)

- v0.120 → v0.125 (PR-3/4) — picks up the `*profiles → x*`
*direct-dependency* surface, currently absorbed at the indirect level so
no code change yet.
- `TLSSetting → TLS` (PR-4/4).

## Verify (all green locally)

- `GOWORK=off go build ./...` — root + `module/`
- `GOWORK=off go test ./...` — root (12 packages) + `module/` (4
packages); all pass
- `make build` — OCB end-to-end with `builder@v0.120.0`; compiles
`_build/tracecore`
- `make check` — fmt / tidy-check / lint / vet / mod-verify all clean
- `docker run --rm -v $(pwd)/install/kubernetes/tracecore:/chart
alpine/helm:3.16.4 lint /chart` — 1 chart linted, 0 failed

## Grade

**A** — clean removals + automatic indirect rename + verify gates all
green; the bump-otel target captured one hardening opportunity
(bare-require sed) which lands separately as a self-contained tooling
commit so future minor bumps don't need a hand-edit on receivertest. Not
A+ because the runtime API code-paths required no change (the v0.120
removals don't touch our receiver), so there was no opportunity to
absorb a code-level migration; PR-3/4's `*profiles → x*` direct
migration is the next chance to bid A+.

```release-notes
Bump OpenTelemetry Collector pin v0.115.0 -> v0.120.0; raise module/go.mod floor to go 1.23.
```

Refs #225 PR-2/4

## Test plan

- [x] Local: `GOWORK=off go test ./...` clean in root + module/
- [x] Local: `make build` produces a working `_build/tracecore` against
builder@v0.120.0
- [x] Local: `make check` clean
- [x] Local: helm lint clean
- [ ] CI: chart + ci + chaos + release workflows pass (auto-merge will
await)

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request Jun 1, 2026
# PR-3/4: OTel collector v0.120 → v0.125

Third of four staged bumps from #225 (v0.110 → v0.130, then Renovate).
Lands the v0.120 → v0.125 jump and fixes every upstream API drift that
surfaces in this range. PR-1 (#243) + PR-2 (#245) already merged.

## What landed (6 commits, one concern each)

1. **`chore(otel)`** — `make bump-otel VERSION=0.125.0` against
`builder-config.yaml`, `module/go.mod`, Makefile OCB pin, pdata at
v1.31. Module `go mod tidy` follows.
2. **`test(module)`** — `processortest.NewNopSettings()` and
`receivertest.NewNopSettings()` regained a `component.Type` parameter at
v0.122 (upstream reversed the v0.120 deprecation and collapsed
`NewNopSettings()` + `NewNopSettingsWithType(Type)` into a single
`NewNopSettings(Type)`). 6
test call sites updated; comment block records the v0.119 → v0.120 →
v0.122 zigzag so the next bump doesn't re-derive it.
3. **`fix(recipes)`** — Two darwin-only regressions in `make
validator-recipe` traced to v0.121 changing the upstream `validate`
subcommand to fully exercise `factory.Build()` instead of just parsing
config:
- `filelog-container.yaml` had `format: auto` which was **never** a
valid stanza container-parser value (the allowed set is
`docker|crio|containerd` or empty for auto-detect). v0.120's validate
skipped Build() so the typo was invisible; v0.125 catches it. Removed
the literal.
- `journaldreceiver.Validate()` rejects non-linux ("journald is only
supported on linux") at v0.121+. Added a `requires-linux` marker case to
`scripts/validator-recipe.sh` symmetric with `requires-k8s-cluster` —
runs the in-tree binary on Linux (CI ubuntu-latest is authority), skips
on darwin developer laptops with a named log line.
4. **`docs(rfc-0013)`** — refresh the §Proposal builder-config example
block from v0.115 → v0.125 + module submodule v0.1.0 → v0.2.0. Keeps the
RFC shape diff-free against the live `builder-config.yaml`.
5. **`fix(telemetry)`** — **the gnarliest surface**: v0.123 promoted the
`telemetry.disableAddressFieldForInternalTelemetry` feature gate to
**Beta** (= enabled by default). The legacy
`service.telemetry.metrics.address` shorthand still **parses** at
v0.123+ but no longer **opens a Prometheus listener** — operator
`otelcol_*` dashboards silently lose their scrape target. Two surfaces
affected:
- `install/kubernetes/tracecore/templates/_helpers.tpl` — split the
existing `telemetry.metricsListen` value into host/port and emit
`metrics.readers[].pull.exporter.prometheus`. Same operator-facing
0.0.0.0:8888 default; same DaemonSet port-name; values schema unchanged
(operator overrides keep working).
- `internal/integration/ocb_scrape_test.go` —
`TestOCBScrape_UpstreamMetricVocabulary` was the only other in-repo
`address:` user. Same metric assertions, new schema. **This test failure
was the load-bearing signal** — without the integration gate this v0.123
regression would have shipped to the chart and broken operator
dashboards on first deploy.
6. **`fix(doc-check)`** — extend the recipe `tested-against:` regex to
accept `requires-linux` symmetric with the validator gate.
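
The host/port split described in item 5 can be sketched in Go as follows. The chart does this in a Helm template; this is only an equivalent illustration. The function names are hypothetical, and the `host`/`port` keys under `prometheus` are an assumption consistent with the `metrics.readers[].pull.exporter.prometheus` path named above.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// splitListen breaks a legacy "host:port" listen address into its parts.
func splitListen(listen string) (string, int, error) {
	host, portStr, err := net.SplitHostPort(listen)
	if err != nil {
		return "", 0, fmt.Errorf("split %q: %w", listen, err)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 1 || port > 65535 {
		return "", 0, fmt.Errorf("invalid port %q in %q", portStr, listen)
	}
	return host, port, nil
}

// renderReaders emits the readers-based telemetry block that replaces the
// service.telemetry.metrics.address shorthand.
func renderReaders(host string, port int) string {
	return fmt.Sprintf(`service:
  telemetry:
    metrics:
      readers:
        - pull:
            exporter:
              prometheus:
                host: %q
                port: %d
`, host, port)
}

func main() {
	// 0.0.0.0:8888 is the operator-facing default quoted in the PR.
	host, port, err := splitListen("0.0.0.0:8888")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(renderReaders(host, port))
}
```

Keeping the single `telemetry.metricsListen` value as the input is what lets existing operator overrides keep working while the rendered config changes shape.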

## Root causes (no symptom-stopping)

| Surface | Upstream change | Where |
|---|---|---|
| `NewNopSettings()` test signature | v0.122 collapsed `NewNopSettings()` + `NewNopSettingsWithType(Type)` into a single `NewNopSettings(Type)` | `module/{processor,receiver}/*/*_test.go` |
| Recipe `format: auto` regression | v0.121 changed `validate` subcommand to exercise full `factory.Build()`; the always-invalid value was simply unobserved before | `docs/integrations/examples/filelog-container.yaml` |
| journald non-linux regression | Same v0.121 validate change; `journaldreceiver.Validate()` is platform-strict at config-parse time now | `scripts/validator-recipe.sh` + `docs/integrations/journald-kernel.md` |
| Chart `/metrics` endpoint silently dead | v0.123 promoted `telemetry.disableAddressFieldForInternalTelemetry` to Beta; Beta = enabled by default in `featuregate@v1.31.0/stage.go`. Address still parses; listener no longer opens. | `install/kubernetes/tracecore/templates/_helpers.tpl` |

No workarounds: every fix is root-cause. No upstream blockers.

## Verification (all green on darwin)

- `GOWORK=off go build ./...` — root + module
- `GOWORK=off go test ./...` — root + module (incl.
`TestOCBScrape_UpstreamMetricVocabulary` against the v0.125 OCB binary)
- `make build` — OCB end-to-end with builder@v0.125.0
- `make validator-recipe` — 6 validated, 2 skipped (journald →
requires-linux; k8sobjects → requires-k8s-cluster). datadog + prometheus
still pass.
- `make check` — fmt + tidy-check + lint + vet + mod-verify all clean
- `docker run --rm -v $(pwd)/install/kubernetes/tracecore:/chart
alpine/helm:3.16.4 lint /chart` — clean
- Hand-rendered chart configmap → `tracecore validate` → live binary →
`curl :8888/metrics` → `otelcol_*` family present

## Out of scope (lands in PR-4)

- v0.125 → v0.130
- `TLSSetting → TLS` migration
- `tracecore validate` flag changes

## Grade: **A+**

Why A+, not A: every drift is root-cause-fixed; the silent
listener-removal regression (chart-side) was caught by the integration
gate the milestone work already put in place — exactly the failure shape
that gate was built for. Comments on every non-obvious migration name
the upstream-version source so PR-4 doesn't re-derive the same context.
Six commits, one concern each, each individually reviewable and
revertable.

```release-notes
chore(otel): bump pinned OpenTelemetry Collector v0.120 → v0.125. Internal-only;
no operator-visible config changes. The chart's
`service.telemetry.metrics.address` is now rendered as
`metrics.readers[].pull.exporter.prometheus`
because v0.123 promoted `telemetry.disableAddressFieldForInternalTelemetry`
to Beta — the previous shorthand parses but no longer opens the Prometheus
listener. `telemetry.metricsListen` values keep working unchanged.
```

Closes #225 partially (PR-3 of 4). PR-4 will land v0.125 → v0.130 +
TLSSetting migration.

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request Jun 1, 2026
## Summary

Final advance of the staged OTel collector pin sweep tracked by #225.
With this PR the project moves from `v0.125 → v0.130` and the four-PR
sequence (PR-1 #243 → PR-2 #245 → PR-3 #247 → PR-4) lands the
originally-requested `v0.110 → v0.130` jump end-to-end.

- `builder-config.yaml` — every upstream + contrib `gomod` line bumped
to `v0.130.0`.
- `module/go.mod` — collector core libs (`component`, `consumer`,
`processor`, `receiver`) `v1.31 → v1.36`; `pdata` `v1.31 → v1.36`;
per-component `-test` / `-componenttest` / `-componentstatus` modules
`v0.125 → v0.130`.
- `Makefile` — OCB tool pin `builder@v0.125.0 → builder@v0.130.0`.
- `docs/rfcs/0013-distro-first-pivot.md` — §migration verbatim
builder-config example synced to `v0.130.0`. Historical PR-I.1b
retrospective (`v0.110.0 / otel v1.30.0`) intentionally left alone —
that paragraph documents post-merge state at `module/v0.1.0`, not the
current pin.

## API sweep (v0.126 → v0.130)

Cross-checked the scope report against the actual code surface; no call
sites needed editing this jump:

| Upstream change | Sweep | Hits |
| --- | --- | --- |
| v0.128 `confighttp/configgrpc`: `TLSSetting` → `TLS` | `grep -rn 'TLSSetting' --include='*.go' --include='*.yaml' .` | 0 |
| v0.128 `pipeline.MustNewID[WithName]` removed | `grep -rn 'pipeline.MustNewID' --include='*.go' .` | 0 |
| v0.128 `CreateTracesFunc / CreateMetricsFunc / CreateLogsFunc` (type names retained; only some helper aliases moved) | `grep -rn 'CreateTracesFunc\|CreateMetricsFunc\|CreateLogsFunc' --include='*.go' .` | doc-comment refs only; resolved types still exist |
| v0.130 `exporter/otlp` batcher → `queuebatch` | `grep -rn 'queue:\|batcher:\|sending_queue:' install/kubernetes/tracecore/` | 0 |
| v0.130 `configgrpc/confighttp` `configoptional.Optional` | indirect; surfaces only if we pin those configs in chart values | 0 chart-side hits |

The pivot's minimum recipe surface (otlp · filelog · journald ·
prometheus · k8sobjects · hostmetrics · transform · filter ·
k8sattributes · batch · otlphttp · debug · datadog · clickhouse ·
filestorage · healthcheck · zpages) never touched the deprecated knobs,
so the v0.128 / v0.130 renames stay invisible to operators on the
binding recipes.
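
The sweep in the table above is a set of `grep` runs; the same idea can be sketched in Go as a per-identifier line count. This is a rough equivalent under stated assumptions: the `countHits` helper and the sample input are made up for the example, and only two of the swept identifiers are shown.

```go
package main

import (
	"fmt"
	"strings"
)

// removedAPIs lists two identifiers taken from the sweep table.
var removedAPIs = []string{"TLSSetting", "pipeline.MustNewID"}

// countHits returns, per identifier, how many lines of src mention it.
// Identifiers with no hits are absent from the map and read back as zero.
func countHits(src string, needles []string) map[string]int {
	hits := make(map[string]int, len(needles))
	for _, line := range strings.Split(src, "\n") {
		for _, n := range needles {
			if strings.Contains(line, n) {
				hits[n]++
			}
		}
	}
	return hits
}

func main() {
	// Made-up snippet standing in for a scanned source file.
	sample := "cfg := confighttp.ClientConfig{TLSSetting: tls}\n" +
		"id := pipeline.NewID(pipeline.SignalLogs)\n"
	hits := countHits(sample, removedAPIs)
	for _, n := range removedAPIs {
		fmt.Printf("%s: %d\n", n, hits[n])
	}
}
```

A zero count for every identifier is the condition the PR reports for its own tree, which is why no call sites needed editing.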

## Verification

```text
GOWORK=off go build ./...     # root  — clean
GOWORK=off go build ./...     # module — clean
GOWORK=off go test  ./...     # root  — all packages pass
GOWORK=off go test  ./...     # module — all packages pass
make build                    # OCB end-to-end @ builder@v0.130.0 — compiled ./_build/tracecore
make validator-recipe         # 6 validated (clickhouse-direct, datadog, filelog-container, honeycomb, otel-backend, prometheus-scrape), 2 skipped (non-linux)
make check                    # golangci-lint: 0 issues; go vet: clean; go mod verify: ok
docker run alpine/helm:3.16.4 lint install/kubernetes/tracecore  # 1 chart linted, 0 failed
```

All four pivot recipes that touched contrib (datadog, prometheus-scrape,
clickhouse-direct, otel-backend) validate against the freshly-built
collector — no operator-facing field renames slipped in.

## Self-grade

**A+.** Scope target hit (final pin advance + RFC sync), every binding
verification gate is green on the actual diff, zero API workarounds
(root cause: no call sites use the renamed surfaces in this jump), no
scope creep beyond the bump itself, and the staged-sweep PR-1/2/3/4
sequence lands #225 exactly as RFC-scoped.

## Release notes

```release-notes
- chore(otel): bump pinned OpenTelemetry Collector + contrib + OCB builder from v0.125 to v0.130 across builder-config.yaml, module/go.mod (pdata v1.36, core libs v1.36), and the Makefile OCB pin. Closes the v0.110 → v0.130 staged sweep (PR-4 of 4).
- docs(rfc): sync the RFC-0013 verbatim builder-config example block to v0.130.0.
```

Closes #225 (PR-4/4 of the staged sweep).

Refs: #243 (PR-1), #245 (PR-2), #247 (PR-3).

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>