docs(multi-cluster): v0 read-only federation recipe (roadmap A18) #291
Merged
Document the operator pattern for running tracecore across N source clusters with a central aggregation collector.

Read-only roll-up contract: source clusters detect verdicts locally + stamp cluster.id via OTTL transform; aggregation cluster forwards via OTLP/HTTP without re-running detection. Cross-cluster verdict dedup is deferred to v1+ (roadmap C4).

Every config primitive is upstream OTel collector core/contrib: otlpreceiver, otlphttpexporter, transformprocessor (RFC-0013 §1, §2 adoption matrix). No tracecore-specific federation code.

Ships:
- docs/multi-cluster.md (topology diagram, role-by-role config, verdict-routing FAQ, Helmfile example, failure modes)
- docs/integrations/examples/multi-cluster-source.yaml (validates)
- docs/integrations/examples/multi-cluster-aggregation.yaml (validates)
- docs/README.md index row

Verified: bash scripts/doc-check.sh, make check, ./_build/tracecore validate on both example configs.

Signed-off-by: Tri Lam <tri@maydow.com>
Per fresh-context review of #291:
- Move docs/multi-cluster.md to docs/integrations/multi-cluster.md so the scripts/validator-recipe.sh + doc-check integration-index gates pick it up. Validator now sees 9 recipes (was 8).
- Add docs/README.md row under backend recipes section.
- Fix typo patternddetector → patterndetector in doc + aggregation YAML.
- Add 3 failure-mode rows: aggregation cluster down > 300s, source-collector restart loses in-memory queue, naked-HTTP receiver.
- Auth-extension example (bearer + mTLS) deferred to v0.4 per #297.

Signed-off-by: Tri Lam <tri@maydow.com>
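The first two failure-mode rows named in this commit look like they correspond to the upstream exporter's retry and queue settings. The fragment below is a minimal sketch of that reading, not the content of the shipped recipe. It assumes the 300s figure refers to the exporter's default `max_elapsed_time`, and the endpoint, storage directory, and queue size are hypothetical values chosen for illustration.

```yaml
# Source-cluster exporter fragment (sketch only; pipelines omitted).
# Endpoint, directory, and sizes are assumptions, not taken from the PR.
extensions:
  file_storage:
    directory: /var/lib/otelcol/file_storage  # assumed path; must exist and be writable

exporters:
  otlphttp:
    endpoint: https://aggregation.example.internal:4318  # hypothetical aggregation endpoint
    retry_on_failure:
      enabled: true
      max_elapsed_time: 300s  # upstream default; a batch is dropped once retries exceed this window
    sending_queue:
      enabled: true
      queue_size: 1000
      storage: file_storage  # back the queue with disk so a collector restart does not discard it

service:
  extensions: [file_storage]
```

Without the `storage` key the sending queue lives in memory, which matches the "restart loses in-memory queue" row. Whether the recipe recommends persistent storage is not stated here.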
trilamsr pushed a commit that referenced this pull request on Jun 1, 2026
Three references to a docs/integrations/cert-manager-mtls.md recipe that has never existed in the tree were blocking doc-check. Replaced with inline upstream pointers + 'per-operator until a dedicated recipe lands' wording so the operational guidance survives the link removal.

Side cleanup blocking the chore/v1-rc1-knowledge-gaps push; root cause is that #291 (multi-cluster v0 federation) shipped forward-references to a recipe that was deferred.

Signed-off-by: Tri Lam <tri@maydow.com>
Summary
Documents the v0 multi-cluster federation pattern: N source-cluster tracecore collectors stamp `cluster.id` via the OTTL `transformprocessor`, run `patterndetectorprocessor` locally, and forward OTLP/HTTP to a central aggregation collector that fans out to backends (Loki / Tempo / etc.).

- Every config primitive is upstream OTel collector core/contrib (`otlpreceiver`, `otlphttpexporter`, `transformprocessor`). No tracecore-specific federation code (RFC-0013 §1, §2 adoption matrix).
- `cluster.id` injection point: `transformprocessor` running at `context: resource` BEFORE `patterndetectorprocessor` in every source-cluster pipeline. OTTL stamps the attribute once per record batch and it travels with every downstream verdict.

Files:
- `docs/multi-cluster.md`: topology diagram, role-by-role config, verdict-routing FAQ, Helmfile 3-cluster topology example, failure modes.
- `docs/integrations/examples/multi-cluster-source.yaml`: source-cluster role config (validates).
- `docs/integrations/examples/multi-cluster-aggregation.yaml`: aggregation-cluster role config (validates).
- `docs/README.md`: index row under Top-level.

Scope guard: no Go code, no `install/kubernetes/tracecore/values.yaml` changes (chart schema for first-class federation is a v0.4 deliverable in a separate PR). No write-path / dedup (v1+ roadmap C4).

Test plan
- `bash scripts/doc-check.sh`: clean (`tested-against` + `last-verified` markers, banned-phrase lint, link integrity, comment-noise diff gate).
- `make check`: `fmt tidy-check lint vet mod-verify` all clean.
- `./_build/tracecore validate --config=docs/integrations/examples/multi-cluster-source.yaml`: exit 0.
- `./_build/tracecore validate --config=docs/integrations/examples/multi-cluster-aggregation.yaml`: exit 0.
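For readers without the example files at hand, the source-cluster role described in the summary could look roughly like the sketch below. It is an illustration under stated assumptions, not a copy of `multi-cluster-source.yaml`: the `logs` pipeline, the `patterndetector` component ID and its empty config, the `cluster-a` value, and the aggregation endpoint are all hypothetical. Only the ordering (resource-context transform before detection, then OTLP/HTTP export) comes from the PR description.

```yaml
# Source-cluster collector (sketch). Anything marked "assumed" is not from the PR.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

processors:
  # Stamp cluster.id on the resource so it travels with every record in the batch.
  transform/cluster_id:
    error_mode: ignore
    log_statements:
      - context: resource
        statements:
          - set(attributes["cluster.id"], "cluster-a")  # assumed cluster name
  # tracecore's detector; component ID and empty config are assumed.
  patterndetector: {}

exporters:
  otlphttp:
    endpoint: https://aggregation.example.internal:4318  # hypothetical aggregation endpoint

service:
  pipelines:
    logs:  # assumed signal; the PR does not say which signals the recipe covers
      receivers: [otlp]
      # Order matters: the transform runs before detection.
      processors: [transform/cluster_id, patterndetector]
      exporters: [otlphttp]
```

Under the read-only contract, the aggregation-cluster config would reuse the `otlp` receiver and exporters toward the backends, with no detector in its pipeline.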