[m15] containerstdout: Kubernetes container stdout tail receiver (alpha)#158

Merged
trilamsr merged 35 commits into main from feat/m15-containerstdout
May 22, 2026
@trilamsr trilamsr commented May 21, 2026

Contributor

Summary

Implements M15 (RFC-0010): Kubernetes container stdout tail receiver. Tails /var/log/pods/**/*.log per node, parses CRI text format, attributes records to training rank via Pod informer + body-match fallback, joins with M19 pod-evicted via EvictionMatchWindow, emits to consumer.Logs.
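For readers unfamiliar with the CRI text format mentioned above, each physical line in a kubelet pod log has the shape `<RFC3339Nano timestamp> <stream> <tag> <body>`, where the tag is `P` (partial) or `F` (full). The following is a minimal sketch of parsing and stitching such lines, not the PR's actual parser: the `criLine` type and the `parseCRILine` and `stitch` names are hypothetical, and the sketch ignores per-stream interleaving and the size ceiling a real stitcher would need.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// criLine is a hypothetical type for this sketch; it is not the PR's Record.
type criLine struct {
	Time    time.Time
	Stream  string // "stdout" or "stderr"
	Partial bool   // true for tag "P": the logical line continues in the next entry
	Body    string
}

var errMalformedCRI = errors.New("malformed CRI line")

// parseCRILine splits one physical line of the form
// "<RFC3339Nano timestamp> <stream> <tag> <body>".
func parseCRILine(line string) (criLine, error) {
	parts := strings.SplitN(line, " ", 4)
	if len(parts) < 3 {
		return criLine{}, errMalformedCRI
	}
	ts, err := time.Parse(time.RFC3339Nano, parts[0])
	if err != nil {
		return criLine{}, fmt.Errorf("%w: %v", errMalformedCRI, err)
	}
	if parts[1] != "stdout" && parts[1] != "stderr" {
		return criLine{}, errMalformedCRI
	}
	// The tag field may carry several colon-separated tags; the first is P or F.
	tag, _, _ := strings.Cut(parts[2], ":")
	if tag != "P" && tag != "F" {
		return criLine{}, errMalformedCRI
	}
	body := ""
	if len(parts) == 4 {
		body = parts[3]
	}
	return criLine{Time: ts, Stream: parts[1], Partial: tag == "P", Body: body}, nil
}

// stitch joins consecutive partial entries into full logical lines.
// A trailing partial with no closing "F" entry is held back (not emitted).
func stitch(entries []criLine) []string {
	var out []string
	var buf strings.Builder
	for _, e := range entries {
		buf.WriteString(e.Body)
		if !e.Partial {
			out = append(out, buf.String())
			buf.Reset()
		}
	}
	return out
}

func main() {
	raw := []string{
		"2026-05-21T10:00:00.000000001Z stdout P rank=3 step=10 ",
		"2026-05-21T10:00:00.000000002Z stdout F loss=0.42",
	}
	var entries []criLine
	for _, l := range raw {
		e, err := parseCRILine(l)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		entries = append(entries, e)
	}
	fmt.Println(stitch(entries))
}
```

Malformed lines are skipped rather than treated as fatal in this sketch; whether the receiver counts, drops, or forwards them is not stated in the PR description.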

Alpha - opt-in via receivers.containerstdout.enabled=true (default off in Helm values).

Phases

  • Phase 1: doc.go, kind.go (7 Kinds), record.go (typed Record + SchemaURLv0 + wire attrs), pattern_consumer_test
  • Phase 2: Config + Validate (RE2 + ceiling + rank_source whitelist)
  • Phase 3: Factory + components.yaml registration
  • Phase 4: CRI parser + Stitcher + path resolver
  • Phase 5: AttributionCache LRU + InformerSource
  • Phase 5.3: process_rank_regex body-match fallback
  • Phase 6: DataloaderExtractor (data_time_s / iter_time_s named captures)
  • Phase 7: Egress rate-limit (token bucket, LRU, namespace budgets)
  • Phase 8: Cursor persistence (atomic write + fsync)
  • Phase 10: Stdlib file Tailer (rotation + truncation detection)
  • Phase 11: Pod informer (node + namespace scoped via filteredIndexer)
  • Phase 12: Receiver lifecycle wiring (lc.Add + recomputeDegraded OR aggregator)
  • Phase 13: 15 failure-mode test identifier pins (AGENTS.md lesson #8)
  • Phase 13.5: 4 hot-path benchmarks (alloc/lookup/regex/rate-limit)
  • Phase 13.6: 10 falsifying failure-mode tests backfilled
  • Phase 14: End-to-end pipeline integration (emit, JSON splat, tailer pool, lines/s flush)
  • Phase 15: Helm chart (RBAC + DaemonSet + values) + RUNBOOK + FAILURE-MODES + prometheus alerts + conftest policy

Deferred to FOLLOWUP

  • Phase 17: Vendored pkg/stanza/fileconsumer swap - currently uses stdlib Tailer (RFC-0010 §FOLLOWUPS)
  • Bench targets: HotPath 5 vs 4 allocs/op; RegexExtraction 698 vs 500 ns/op - flagged for Phase 17 optimization

Test plan

  • CI: make check passes (fmt + tidy + lint + race tests across full repo)
  • Unit: go test -race -count=2 ./components/receivers/containerstdout/... - all green
  • Integration: 8 TestPipeline_E2E_* end-to-end tests pass
  • Failure-mode pins: 10 PASS, 5 SKIP-deferred (the 5 skipped pins depend on Phase 14 integration paths that are not yet exercised end-to-end)
  • Helm: helm template --set receivers.containerstdout.enabled=true renders DaemonSet + RBAC + NODE_NAME env
  • Helm: default render does NOT include containerstdout resources (opt-in verified)
  • Conftest: 3 operational-invariant policies (RBAC, volumes, NODE_NAME env)
  • Prometheus: 8 alerts cover every Kind + composite degraded
  • RUNBOOK: 7 Kind sections + recovery/rollback procedures

Rollback

helm upgrade --set receivers.containerstdout.enabled=false removes DaemonSet, RBAC, and config - verified via template diff.

Add Kubernetes container-stdout tail receiver (M15, RFC-0010): tails /var/log/pods, parses CRI text format, attributes lines to training rank via Pod informer + body-match fallback, joins with M19 pod-evicted. Alpha — opt-in via receivers.containerstdout.enabled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

trilamsr and others added 30 commits May 20, 2026 16:17
Signed-off-by: Tri Lam <tri@maydow.com>
Adds DataloaderExtractor parsing data_time_s / iter_time_s named captures.
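As an illustration of what extraction via named captures can look like in Go, here is a minimal sketch. Only the capture names `data_time_s` and `iter_time_s` come from the commit message; the regular expression, the sample log line, and the `extractTimes` function are assumptions made for this example, since the PR takes the pattern from configuration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// dataloaderRE is a hypothetical pattern; the real pattern is user-configured.
// Go's RE2 syntax spells named groups as (?P<name>...).
var dataloaderRE = regexp.MustCompile(
	`data_time_s=(?P<data_time_s>[0-9]+(?:\.[0-9]+)?).*?iter_time_s=(?P<iter_time_s>[0-9]+(?:\.[0-9]+)?)`,
)

// extractTimes returns the two timings in seconds, or ok=false when the body
// does not match or a captured value is not a valid float.
func extractTimes(body string) (dataTime, iterTime float64, ok bool) {
	m := dataloaderRE.FindStringSubmatch(body)
	if m == nil {
		return 0, 0, false
	}
	// SubexpIndex maps a capture name to its position in the match slice.
	d, errD := strconv.ParseFloat(m[dataloaderRE.SubexpIndex("data_time_s")], 64)
	i, errI := strconv.ParseFloat(m[dataloaderRE.SubexpIndex("iter_time_s")], 64)
	if errD != nil || errI != nil {
		return 0, 0, false
	}
	return d, i, true
}

func main() {
	d, i, ok := extractTimes("step=10 data_time_s=0.25 iter_time_s=1.5")
	fmt.Println(d, i, ok)
}
```

Looking captures up by name rather than by position keeps the extractor working when a user-supplied pattern reorders the groups or adds unrelated ones.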

Signed-off-by: Tri Lam <tri@maydow.com>
Adds the chart wiring for the containerstdout receiver (M15, alpha):

- receivers.containerstdout block in values.yaml — opt-in alpha
  (enabled: false), with structured knobs for include globs,
  namespaces, max_log_size, rank_source, egress_rate_limit, plus
  chart-only rbac and hostPath keys.
- templates/containerstdout-rbac.yaml — ClusterRole +
  ClusterRoleBinding granting core/v1.pods get,list,watch +
  core/v1.nodes get on the chart's ServiceAccount; render gated
  on enabled AND rbac.create.
- templates/daemonset.yaml — conditional pod-level root override
  (kubelet CRI symlinks under /var/log/pods are root-owned),
  conditional /var/log/pods, /var/log/containers, and cursor
  hostPath volume mounts, and automountServiceAccountToken flips
  true so the per-node Pod informer authenticates.
- templates/_helpers.tpl — omits chart-only rbac/hostPath keys
  from the rendered tracecore config so config.Load does not
  reject the output with an unknown-field error.
- templates/NOTES.txt — operator-facing warning when the receiver
  is enabled (root, ClusterRole, RUNBOOK link).
- ci/containerstdout-on-values.yaml — happy-path render fixture
  for the chart CI gate.

RFC contract: docs/rfcs/0010-containerstdout-receiver-scope.md.

Signed-off-by: Tri Lam <tri@maydow.com>
Operator-facing docs for the M15 containerstdout receiver:

- components/receivers/containerstdout/RUNBOOK.md — per-Kind
  triage (KindRotationStalled, KindBackpressureDrop,
  KindCursorWriteFailed, KindWatch, KindFingerprintCardinality,
  KindAttributionCardinality, KindRateLimitCardinality) with
  alert names, root-cause taxonomy, diagnostic commands,
  mitigation, and the TestFailure_ identifier that pins each
  failure mode. Plus Recovery and Rollback sections.
- components/receivers/containerstdout/README.md — receiver
  overview, configuration reference, operational notes,
  per-Kind alert mapping, RBAC + Helm pointer, and limitations.
- docs/FAILURE-MODES.md — adds containerstdout to the
  per-component RUNBOOK link list AND the Alert -> RUNBOOK
  index (eight new rows, one per containerstdout alert).

Every Kind from components/receivers/containerstdout/kind.go now
has a RUNBOOK entry + linked test name; FAILURE-MODES routes
operators to it.

Signed-off-by: Tri Lam <tri@maydow.com>
- components/receivers/containerstdout/prometheus-alerts.example.yaml —
  eight alert rules, one per Kind in kind.go plus the composite
  ContainerStdoutDegraded. Per-Kind severity table in the YAML
  comment block (cursor_write_failed = critical; rotation /
  backpressure / watch = warning; cardinality kinds = info).
  Rules filter on component_id=~"containerstdout/.*" and stamp
  receiver_id=containerstdout for dashboard disambiguation of
  Kinds shared with k8sevents (KindWatch, KindBackpressureDrop)
  per RFC-0010 § Kind aliasing.
- install/kubernetes/tracecore/policies/conftest/tracecore.rego —
  carves a containerstdout-enabled exemption to the runAsNonRoot /
  runAsUser==0 / runAsGroup==0 deny rules (gated on the presence
  of the containerstdout-pod-logs hostPath volume, NOT a values
  flag), then adds three new rules enforcing operational
  invariants when the receiver is enabled: required
  containerstdout-pod-logs volume, required containerstdout-cursor
  volume, and required K8S_NODE_NAME downward-API env from
  spec.nodeName.

helm + conftest not on PATH locally; templates verified by
visual inspection of expected render shape under both
enabled=false (default) and enabled=true (ci fixture).

Signed-off-by: Tri Lam <tri@maydow.com>
Tri Lam added 5 commits May 20, 2026 23:56
Marks the M15 container stdout receiver acceptance criteria done
for items the alpha behind containerstdout.enabled=true ships:
attribution, JSON detect, dataloader extract, rotation, lines/s
feed, cursor, degraded-mode, egress rate-limit, multi-tenancy,
back-pressure, fd hygiene, security, containerd #11149 caveat,
panic recovery, shutdown.

Partial (⧗): 5s p99 pod→watcher latency and end-to-end overhead
budget (≤0.10% CPU, ≤0.3 Mbps egress, ≤20 MB RSS) — unit
benchmarks ship; e2e budget assertion deferred to Phase 17.

NOTE: vendored pkg/stanza/fileconsumer swap deferred to Phase 17
(RFC-0010 §FOLLOWUPS). Current stdlib Tailer handles rotation +
truncation correctly; the swap is an optimization carry-forward.

PR link will be patched in a follow-up commit once gh pr create
returns the URL.

Signed-off-by: Tri Lam <tri@maydow.com>
Signed-off-by: Tri Lam <tri@maydow.com>
@trilamsr trilamsr merged commit 119580a into main May 22, 2026
20 checks passed
@trilamsr trilamsr deleted the feat/m15-containerstdout branch May 22, 2026 02:58
trilamsr pushed a commit that referenced this pull request May 30, 2026
shellcheck SC2221/SC2222 flagged the path-filter case statement
because `*.md` already covers CHANGELOG.md / README.md / NORTHSTARS.md
etc; listing the top-level docs explicitly created overlapping
patterns. Same semantics with the simpler two-arm match.

Also: verify-test failed on TestTailer_TruncationWithoutRotation
(components/receivers/containerstdout/tailer_test.go:347, "timed out
waiting for tailer line"). That test arrived via PR #158 (M15
containerstdout, merged on main earlier today) and is not introduced
by this PR. Flaky timeout on the slow ubuntu-latest runner. Triaging
separately via re-run; if it persists, a follow-up will be registered in
docs/FLAKY-TESTS.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Tri Lam <tri@maydow.com>
trilamsr added a commit that referenced this pull request May 30, 2026
## What this PR does

Lands [RFC-0013](docs/rfcs/0013-distro-first-pivot.md) as the binding
decision doc for tracecore's distribution-first pivot, and propagates it
across the doc surface in fourteen scoped commits. After this PR, every
doc that names a soon-to-be-deleted receiver, the hand-rolled release
pipeline, or the custom self-telemetry surface carries a Status banner
pointing at the upstream replacement and the release boundary (v0.1.0 /
v0.2.0 / v0.3.0).

No code paths change. No deletions yet. The next eleven PRs (per
RFC-0013 §Migration) do the receiver removals, the OCB skeleton swap,
the goreleaser stack adoption, the receivers-only module split, and the
migration guides.

### Scope summary

| Concern | Adopt | Delete | When |
|---|---|---|---|
| GPU telemetry | `dcgm-exporter` (NV) + `ROCm/device-metrics-exporter` (AMD) + `xpumanager` (Intel) + Habana exporter via `prometheusreceiver` | `components/receivers/dcgm/` (cgo stub) | v0.1.0 |
| Container stdout | `filelogreceiver` + container stanza + `file_storage` | `components/receivers/containerstdout/` (M15 alpha, PR #158) | v0.2.0 |
| Kernel events | `journaldreceiver` + `filelogreceiver` + OTTL Xid transform | `components/receivers/kernelevents/` | v0.2.0 |
| K8s events | `k8sobjectsreceiver` + OTTL `k8s.event.hint` transform | `components/receivers/k8sevents/` | v0.2.0 |
| Kueue scheduler | `prometheusreceiver` + bearer-token | `components/receivers/kueue/` (never shipped) | v0.1.0 |
| Python profiling | `parca-agent` (eBPF) | `components/receivers/pyspy/` + `python/tracecore_pyspy/` + `tools/pyspy-lint/` | v0.3.0 |
| Kineto | Deferred pending OTel Profiles GA | `components/receivers/kineto/` (partial) | v0.1.0 |
| Heartbeat | `telemetrygeneratorreceiver` | `components/receivers/clockreceiver/` | v0.1.0 |
| Self-telemetry | upstream `componentstatus` + `service/telemetry` + standard `otelcol_*` | `internal/componentstatus/` + `internal/selftelemetry/` + `internal/telemetry/` | v0.1.0 |
| Release | goreleaser + slsa-github-generator + cosign-installer + anchore/sbom-action + actions/attest-build-provenance | hand-rolled `.github/workflows/release.yml` | v0.1.0 |
| Image publish | `ko` (push) + `Renovate` (pull) | (refactor) | v0.1.0 |
| Datadog / ClickHouse | OCB bundles `datadogexporter` + `clickhouseexporter` directly | "intermediate otelcol-contrib" recipe | v0.1.0 |

### Customer-stable contracts preserved across the pivot

`k8s.event.hint` 11-entry enum, `kernelevents.xid`, `gpu.id`,
`gpu.vendor`, `gen_ai.training.{rank,job_id}`, NCCL FlightRecorder span
schema, pattern detector outputs (M17/M18/M19). Normalized back to the
stable surface via the OTTL `transform` processor in the bundled recipe.
Operator alerts written against these survive the receiver swap.

### What tracecore still builds (the moat)

Bounded by RFC-0013 §6: NCCL FlightRecorder receiver (`ncclfrreceiver`),
windowed cross-signal join processor (`rankjoinprocessor`), pattern
engine + replay corpus (`patterndetectorprocessor`), install/overhead
bench harness. Everything else comes from upstream + contrib.

### Upstream contribution policy

Now first-class. Tracecore contributes patches upstream first; forks
only when upstream rejects. When a contribution is in-flight, tracecore
ships against a `replace` directive in `go.mod` pointing at the
contribution branch; the replace is removed when the upstream tag lands.

## Commits in this PR

1. `1fe08c3` - RFC-0013 binding doc + index updates
2. `2047abe` - research notes superseded headers
3. `81dc29e` - patterns input-source references at adopted receivers
4. `ca9fbc6` - RFC bodies 0001-0012 supersede/revise headers (0004
archived)
5. `d06c929` - status banners on queued-for-deletion files (15 files)
6. `d116775` - receiver READMEs + RUNBOOKs status banners (14 files)
7. `c250509` - followups triage against RFC-0013 (21 files)
8. `3edd226` - top-level orienting docs reframed (9 files)
9. `cb01365` - integration recipes + operator docs for OCB distro (10
files)
10. `946af13` - `golang.org/x/net` v0.54.0 → v0.55.0 (GO-2026-5026,
blocking pre-push)
11. `4eb03c0` - em-dash + en-dash → hyphen across pivot surface (88 hits
in 32 files); also fixes 3 broken link refs to archived/0004 and one
fabricated test name in FAILURE-MODES.md
12. `de585ba` - language tightening + dedupe paraphrase + ancillary doc
Status banners (STYLE.md, bench/install/tracecore-values.yaml,
docs/examples/*.yaml, docs/integrations/examples/*.yaml, docs/notes/*,
docs/proposals/*)
13. `9ceeaef` - merge `origin/main` (resolves M15 containerstdout (PR
#158) alongside RFC-0013 v0.2.0 deletion banner; FAILURE-MODES.md gains
containerstdout alert rows with upstream replacement column; go.mod
takes both x/sys v0.45.0 and new x/time v0.14.0)

## Linked issue(s)

_No linked issue._

## Test plan

- [x] `make doc-check` clean on every modified file
- [x] `make ci` (full pre-push) passed locally — lint + race tests + 30s
fuzz + govulncheck + build + doc-check + release-doc-parity
- [x] `govulncheck ./...` reports zero vulnerabilities after the x/net
bump
- [x] `gh attestation` / `cosign` flag set unchanged (parity gate intact
per RFC-0013 §Migration PR-C)
- [ ] **CI expected to fail on `validator-recipe`**: integration docs
route Datadog and ClickHouse through tracecore's own `validate`
subcommand (no intermediate otelcol-contrib), but the in-tree binary on
this branch does not yet have `datadogexporter` / `clickhouseexporter`
registered. Lands in the follow-on OCB-skeleton PR (RFC-0013 §Migration
PR-A). Expected, not a regression.

### Known follow-ups (not in this PR; tracked for next PRs per RFC-0013 §Migration)

- `docs/integrations/examples/*.yaml` Status comments now reflect
OCB-bundled posture (commit `de585ba`); operator placeholders unchanged.
- `docs/FAILURE-MODES.md` retains references to `internal/pipeline/*`
test paths that will be deleted with the v0.1.0 PR-F cut.
- The `tracecoreai/tracecore-components` separate-repo module does not
exist yet; created in PR-I per v0.2.0.

## Release notes

```release-notes
[CHANGE] RFC-0013 adopted: tracecore pivots to a distribution-first posture. The binary is assembled via OpenTelemetry Collector Builder (OCB) from upstream + contrib components plus a thin tracecore-components module containing only the moat (NCCL FlightRecorder receiver, OTTL join processor, pattern engine, bench harness). Seven in-tree receivers and three internal self-telemetry packages are queued for deletion across v0.1.0 / v0.2.0 / v0.3.0; the release pipeline migrates to goreleaser + SLSA + cosign + ko. Customer-stable telemetry contracts (k8s.event.hint enum, kernelevents.xid, gpu.id, gpu.vendor, NCCL span schema) are preserved across the pivot via an OTTL normalization layer in the bundled recipe; operator alerts written against these survive the swap.
```

## Checklist

- [x] Tests added or updated (doc-only PR; doc-check + link-check gates
updated transitively)
- [x] `make check` runs green; pre-push hook passes
- [x] Commits are signed off (`git commit -s`)
- [x] For new components, follows the layout required by
[`STYLE.md`](STYLE.md) - n/a (no new components)

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>