docs(multi-cluster): v0 read-only federation recipe (roadmap A18)#291

Merged: trilamsr merged 2 commits into main from docs/multi-cluster-v0 on Jun 1, 2026

trilamsr commented on Jun 1, 2026

Summary

Documents the v0 multi-cluster federation pattern: N source-cluster tracecore collectors stamp cluster.id via the OTTL transformprocessor, run patterndetectorprocessor locally, and forward OTLP/HTTP to a central aggregation collector that fans out to backends (Loki, Tempo, etc.).

  • Read-only roll-up contract: aggregation tier forwards verdicts without re-detecting; cross-cluster verdict dedup is deferred to v1+ (roadmap C4).
  • Adopt-over-build, strict: every config primitive is upstream OTel collector core or contrib (otlpreceiver, otlphttpexporter, transformprocessor). No tracecore-specific federation code — RFC-0013 §1, §2 adoption matrix.
  • cluster.id injection point: transformprocessor running at context: resource BEFORE patterndetectorprocessor in every source-cluster pipeline. OTTL stamps the attribute once per resource in each batch, and it travels with every downstream verdict.
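
The source-cluster role described above can be sketched roughly as follows. This is a minimal illustration, not the shipped multi-cluster-source.yaml: the receiver choice, the `logs` pipeline, the endpoint hostname, the `cluster-a` value, and the `patterndetector` component key with an empty config are all assumptions made for the example. Only the ordering (transform at `context: resource` before the detector, then OTLP/HTTP export) comes from the description.

```yaml
# Hypothetical source-cluster config sketch; names and endpoints are placeholders.
receivers:
  otlp:                      # assumed ingest path for this sketch
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  # Stamp cluster.id on the resource so every record under it carries the value.
  transform/cluster_id:
    error_mode: ignore
    log_statements:
      - context: resource
        statements:
          - set(attributes["cluster.id"], "cluster-a")   # placeholder cluster name
  # tracecore detector; the component key and empty config are assumptions.
  patterndetector: {}

exporters:
  otlphttp:
    endpoint: https://aggregation.example.com:4318       # placeholder aggregation endpoint

service:
  pipelines:
    logs:
      receivers: [otlp]
      # Order matters: the transform runs before detection.
      processors: [transform/cluster_id, patterndetector]
      exporters: [otlphttp]
```

Because the statement runs in the resource context, `attributes` refers to resource attributes, which is why the value is set once per resource rather than once per log record.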

Files:

  • docs/multi-cluster.md — topology diagram, role-by-role config, verdict-routing FAQ, Helmfile 3-cluster topology example, failure modes.
  • docs/integrations/examples/multi-cluster-source.yaml — source-cluster role config (validates).
  • docs/integrations/examples/multi-cluster-aggregation.yaml — aggregation-cluster role config (validates).
  • docs/README.md — index row under Top-level.
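
For orientation, the aggregation-cluster role might look something like the sketch below. It is not the content of multi-cluster-aggregation.yaml: the backend hostnames, the exporter names, and the split of logs to Loki and traces to Tempo are assumptions. It only illustrates the read-only contract from the summary, meaning an OTLP receiver, no detector in the pipeline, and fan-out through upstream exporters.

```yaml
# Hypothetical aggregation-cluster config sketch; hostnames are placeholders.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318   # plain HTTP here; TLS and auth are omitted in this sketch

exporters:
  otlphttp/loki:
    endpoint: http://loki.example.internal:3100/otlp    # assumed Loki OTLP ingest path
  otlphttp/tempo:
    endpoint: http://tempo.example.internal:4318        # assumed Tempo OTLP/HTTP port

service:
  pipelines:
    # No detection processor: verdicts arrive already stamped with cluster.id
    # and are forwarded as-is.
    logs:
      receivers: [otlp]
      exporters: [otlphttp/loki]
    traces:
      receivers: [otlp]
      exporters: [otlphttp/tempo]
```

Adding another backend would mean adding one more exporter and listing it in the relevant pipeline; the source clusters would not need to change.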

Scope guard: no Go code, no install/kubernetes/tracecore/values.yaml changes (chart schema for first-class federation is a v0.4 deliverable in a separate PR). No write-path / dedup (v1+ roadmap C4).

Test plan

  • bash scripts/doc-check.sh — clean (tested-against + last-verified markers, banned-phrase lint, link integrity, comment-noise diff gate).
  • make checkfmt tidy-check lint vet mod-verify all clean.
  • ./_build/tracecore validate --config=docs/integrations/examples/multi-cluster-source.yaml — exit 0.
  • ./_build/tracecore validate --config=docs/integrations/examples/multi-cluster-aggregation.yaml — exit 0.
docs(multi-cluster): v0 read-only federation recipe — N source clusters stamp `cluster.id` via OTTL transform, forward OTLP/HTTP to a central aggregation collector. Every component is upstream OTel collector core or contrib.

Document the operator pattern for running tracecore across N source
clusters with a central aggregation collector. Read-only roll-up
contract: source clusters detect verdicts locally + stamp cluster.id
via OTTL transform; aggregation cluster forwards via OTLP/HTTP without
re-running detection. Cross-cluster verdict dedup is deferred to v1+
(roadmap C4).

Every config primitive is upstream OTel collector core/contrib:
otlpreceiver, otlphttpexporter, transformprocessor (RFC-0013 §1, §2
adoption matrix). No tracecore-specific federation code.

Ships:
- docs/multi-cluster.md (topology diagram, role-by-role config,
  verdict-routing FAQ, Helmfile example, failure modes)
- docs/integrations/examples/multi-cluster-source.yaml (validates)
- docs/integrations/examples/multi-cluster-aggregation.yaml (validates)
- docs/README.md index row

Verified: bash scripts/doc-check.sh, make check, ./_build/tracecore
validate on both example configs.

Signed-off-by: Tri Lam <tri@maydow.com>
Per fresh-context review of #291:

- Move docs/multi-cluster.md to docs/integrations/multi-cluster.md so
  the scripts/validator-recipe.sh + doc-check integration-index gates
  pick it up. Validator now sees 9 recipes (was 8).
- Add docs/README.md row under backend recipes section.
- Fix typo patternddetector → patterndetector in doc + aggregation YAML.
- Add 3 failure-mode rows: aggregation cluster down > 300s,
  source-collector restart loses in-memory queue, naked-HTTP receiver.
- Auth-extension example (bearer + mTLS) deferred to v0.4 per #297.

Signed-off-by: Tri Lam <tri@maydow.com>
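
Two of the failure-mode rows named in the review commit above (aggregation cluster down > 300s, source-collector restart losing the in-memory queue) appear to correspond to upstream exporter settings. The fragment below is a hedged sketch of those knobs on a source-cluster exporter, not configuration taken from the recipe. The 300s figure matches the upstream default for `retry_on_failure.max_elapsed_time`; whether the recipe relies on that default is an assumption, as is the availability of the contrib `file_storage` extension in the tracecore build. The directory and endpoint are placeholders.

```yaml
# Hypothetical fragment for a source-cluster collector; merge into a full config.
extensions:
  file_storage:
    directory: /var/lib/otelcol/queue    # placeholder path; must exist and be writable

exporters:
  otlphttp:
    endpoint: https://aggregation.example.com:4318   # placeholder aggregation endpoint
    retry_on_failure:
      enabled: true
      max_elapsed_time: 300s   # retries stop after this window and the data is dropped
    sending_queue:
      enabled: true
      storage: file_storage    # persist the queue on disk so a restart does not empty it

service:
  extensions: [file_storage]
```

Without the `storage` setting the sending queue lives in memory only, which is the restart failure mode the commit describes.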

trilamsr enabled auto-merge (squash) June 1, 2026 05:31
trilamsr merged commit 184ed38 into main Jun 1, 2026
11 checks passed
trilamsr deleted the docs/multi-cluster-v0 branch June 1, 2026 05:39
trilamsr pushed a commit that referenced this pull request Jun 1, 2026
Three references to a docs/integrations/cert-manager-mtls.md recipe that has never existed in the tree were blocking doc-check. Replaced with inline upstream pointers + 'per-operator until a dedicated recipe lands' wording so the operational guidance survives the link removal.

Side cleanup to unblock the chore/v1-rc1-knowledge-gaps push; the root cause is that #291 (multi-cluster v0 federation) shipped forward references to a recipe that was deferred.

Signed-off-by: Tri Lam <tri@maydow.com>