[docs] MILESTONES: sync stale statuses + adopt per-rubric ☑ convention + currency rule #53
Merged
…rule M3 (PR #28), M4b (PR #30), M5b (PR #29) merged and verified on main; M10 (PR #32) and M11 (PR #31) shipped as alpha. Status lines and the lane table were stale, masking how much of the v0.1.0 critical path is already done. Strip rubric blocks per the documented convention ("unstarted / in-progress only") and replace with a one-paragraph Reference summary plus any Carry-forward bullet, matching the M1/M2/M9 pattern. Add a "Keeping this document current" section requiring every milestone-touching PR to flip the corresponding status and lane-table annotation in the same commit. Cross-reference from CONTRIBUTING.md "Before you start" so contributors land on the rule. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Original commit scoped the rule to MILESTONES.md. The same drift problem applies to docs/FOLLOWUPS.md: opportunistic items get landed, advanced, or promoted to milestones without the tracker being updated. Expand the rule to cover both docs explicitly, with a separate bullet list for follow-up transitions (complete/advance/discover/promote). Update the CONTRIBUTING.md cross-reference to mention both files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first iteration stripped rubric blocks when flipping a milestone to shipped, on the precedent of M1/M2/M9 and the doc's "rubrics — unstarted/in-progress only" convention. That convention loses the audit record: the rubric is exactly the falsifiable, citable claim of what "shipped" meant, and PRINCIPLES §6 asks for falsifiable claims, not prose summaries.

Restore the rubric blocks for M3/M4b/M5b/M10/M11 and prefix each bullet with ☐ / ⧗ / ☑ to mark per-rubric verification state. Bullets that retain *(unverified)* keep that tag: shipping a gate doesn't retroactively verify a claim that needs a live cluster or hardware.

Update the "How to read" preamble + "Keeping this document current" rule to match the new convention:
- rubrics persist through the milestone's lifecycle
- each bullet carries its own status prefix
- top-line ☑ requires every rubric ☑
- M1/M2/M4/M9 noted as pre-convention legacy entries (deferred backfill, not a blocker for new milestones)

Add a **Landed:** field naming the primary artifact for shipped/alpha milestones, so readers don't have to grep the rubric block for the file path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "How to read" and "Keeping this document current" sections mentioned `*(unverified)*` literally in inline code spans to describe the marker convention. scripts/doc-check.sh greps for `(unverified` across MILESTONES.md and counts every match, including meta-text about the marker — that's how doc-check sees 9 markers when only 7 are actual unverified claims. Reword the prose to refer to the unverified tag descriptively (and point at docs/.unverified-baseline for the count gate) instead of quoting the literal marker form. Count back to 7; CI green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
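The count gate described in this commit can be pictured roughly as follows. This is a minimal sketch, not the contents of `scripts/doc-check.sh`: the function names `count_unverified` and `check_unverified_baseline` are hypothetical, and it assumes the baseline file holds a single integer and that every occurrence (not every line) is counted. It illustrates why prose that quotes the literal marker inflates the total: the grep cannot tell a real claim from meta-text.

```shell
# Hypothetical sketch; not the real scripts/doc-check.sh.

# Print how many times the literal "(unverified" occurs in a file.
count_unverified() {
  local file="$1"
  # -o emits one line per match, -F treats the pattern as a fixed string.
  { grep -oF '(unverified' "$file" || true; } | wc -l | tr -d ' '
}

# Return 1 when the count differs from the integer stored in the baseline file.
check_unverified_baseline() {
  local file="$1" baseline_file="$2"
  local actual expected
  actual="$(count_unverified "$file")"
  expected="$(tr -d '[:space:]' < "$baseline_file")"
  if [ "$actual" != "$expected" ]; then
    echo "unverified markers: got $actual, baseline $expected" >&2
    return 1
  fi
  return 0
}
```

Under these assumptions, `check_unverified_baseline MILESTONES.md docs/.unverified-baseline` would fail as soon as explanatory prose quotes the marker, which matches the 9-versus-7 mismatch the commit message reports.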
trilamsr added a commit that referenced this pull request on May 18, 2026
Ratify the current posture as a permanent stance: the tracecore binary contains no in-binary self-update mechanism, no background fetcher, no remote control plane. Operators pull releases via their existing delivery tooling (Flux / Argo CD / RenovateBot / kubectl set image); the trust root is the operator's, not ours.

RFC-0008 at Status: accepted, covering:
- which component classes may auto-update (none, in-binary)
- the supported update path (operator-pulled artifacts with cosign / SBOM / SLSA verification on the operator side)
- what the collector commits to (immutable digests, lockstep appVersion / binary, no mid-version mutation)
- what it explicitly does not commit to (remote channel, phoning-home, vendored update library)
- five rejected alternatives with one-sentence rationale each
- a CI grep gate enforcing the no-fetcher invariant

Adjacent changes in the same PR (per M23 rubrics):
- NORTHSTARS Open Question #2 closed; pointer to RFC-0008
- scripts/no-autoupdate-check.sh wired into `make ci` to fail build on `go-update` / `self-update` / `auto-update` / `AutoUpdate` / `UpdateCheck` / `FetchLatest` identifiers under cmd|components|internal
- install/kubernetes/tracecore/README.md § "Upgrade posture" points operators at RFC-0008 for the contract
- MILESTONES.md M23 flipped to ☑ with per-rubric ☑ prefixes (matches the convention adopted in PR #53)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
trilamsr added a commit that referenced this pull request on May 19, 2026
## Summary

Follow-up to merged PR #53. Self-review found the per-rubric `☑` markers were too generous: a code merge doesn't satisfy a rubric that requires measurement, and a CI gate only satisfies a rubric if the gate is actually fail-closed. This PR audits each shipped rubric against its file:line evidence and downgrades the ones that need it.

## Re-grading

**M4b (failure-injection harness):** top-line `☑ delivered` → `☑ partial`. The `nccl-hang` CLI shim is a stub returning `ErrPending` (`tools/failure-inject/ncclhang/ncclhang.go`); the underlying capability exists in `pkg/nccl/fr_parser/synthesize.go` from M11. Three rubrics flipped to `⧗`:
- nccl-hang byte-identical round-trip (CLI not wired)
- nccl-hang safe-opcode-only (CLI not wired)
- determinism on amd64 + arm64 (single-arch SHA gate; cross-arch equality is carry-forward)

**M5b (Helm chart):** top-line stays `☑ delivered`. One rubric flipped to `⧗`: the ≤5-min hero-KPI median across 10 CI runs is satisfied by a single-run ≤300s gate; 10-run aggregation is the open work.

**M10 (k8s events receiver):** top-line stays `☑ alpha`. Overhead-budget rubric flipped to `⧗`: `BenchmarkEmitOne` + `BenchmarkConvertOne` + `rusage_linux_test.go` exist, but an end-to-end 1k-events/min run asserting CPU + egress + RSS budgets together is the open work.

**M11 (NCCL FlightRecorder):** top-line stays `☑ alpha`. Five rubrics flipped to `⧗`:
- nccl-hang CLI reachability (blocked on M4b carry-forward)
- 5s dump-watcher emit timing assertion
- 2.31-drift fixture + parser-diff CI gate
- overhead bench promoted from advisory to fail-closed gate
- `make generate-fixtures` byte-identical regen gate

**M3 (reproducible-build CI):** all 11 rubrics keep `☑`. Every gate is wired into `release.yml` and fail-closed; the lack of a real `v0.X.Y` tag doesn't invalidate the rubrics because they gate on workflow existence, not on past published releases.

## Side cleanup

The "Foundation entries (M1/M2/M4/M9) predate this convention" line in the MILESTONES preamble belonged in FOLLOWUPS per the currency rule. Moved to `docs/FOLLOWUPS.md` § Documentation as an opportunistic backfill item.

## Test plan

- [x] `bash scripts/doc-check.sh` exits 0: unverified marker count stable at 7
- [ ] CI green
- [ ] MILESTONES.md renders correctly on GitHub
- [ ] Every re-graded rubric has its evidence-gap noted inline in italics

## Note

This PR was the third commit on the (now-merged) PR #53 branch and didn't make it into the squash. Re-applying as a focused follow-up against post-merge main.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
trilamsr added a commit that referenced this pull request on May 19, 2026
… lints (#56)

## Summary

Lands two of the three open M6 carry-forward items:

1. **`docs/maintainership.md`**: one-page governance reference covering commit access, RFC sponsorship, and security disclosure, each section cross-referencing the document that actually owns it (`CODEOWNERS`, `docs/rfcs/`, `SECURITY.md`).
2. **`scripts/doc-check.sh` extensions**: banned-phrase lint over every tracked `.md` (per `STYLE-docs.md` §2) and a required-H2 assertion for `docs/maintainership.md` so a future rename / split breaks the build instead of breaking the governance contract silently.

Integration recipes (`docs/integrations/{datadog,honeycomb,otel-backend,clickhouse-direct}.md`) need real version-pinned testing against each backend and are intentionally deferred to a per-recipe PR series.

## What this PR changes

- **New file:** `docs/maintainership.md`, with three required H2 sections (`Commit access`, `RFC sponsorship`, `Security disclosure`), each leading with its answer per `STYLE-docs.md` §3. Pointer-heavy by design; the load-bearing facts live in the documents the sections link to.
- **`docs/README.md`:** registers the new file under "Top-level".
- **`scripts/doc-check.sh`:**
  - Banned-phrase lint over every tracked `.md` outside `docs/rfcs/`, `docs/research/`, and `docs/STYLE-docs.md` (the canonical source of the list itself). Catches `production-grade`, `world-class`, `best-in-class`, `industry-leading`, `cutting-edge`, `lightning-fast`, `battle-tested`, `enterprise-grade`, `rock-solid`, `blazing-fast` and their space-separated variants.
  - Required-H2 assertion for `docs/maintainership.md`. Exits 1 with the missing-heading list if any of the three are absent.
- **`README.md` line 41:** reworded to drop `production-grade` (caught by the new gate; meaning preserved).
- **`MILESTONES.md` M6:** status stays `⧗` (integration recipes remain). Per-rubric `☑` prefixes flip on the rubrics this PR closes; carry-forward bullet rewritten to name only what remains.

## Why

Two of the three open carry-forward items have been queued since M1.6. The maintainership page is a one-page deliverable with no dependencies on receivers, hardware, or backend integration; landing it now closes a governance gap without expanding the PR series further. The banned-phrase + section gates fall out of the same effort and make every future markdown PR self-policing.

## Test plan

- [x] `bash scripts/doc-check.sh` exits 0 on this branch
- [x] `make ci` includes `doc-check`; this PR doesn't change the wiring
- [ ] `docs/maintainership.md` renders correctly on GitHub
- [ ] All three required H2 sections present (verified by the new gate)
- [ ] Banned-phrase lint clean across 77 markdown files

## Note on PR ordering

This PR's `MILESTONES.md` edit uses the per-rubric `☑` convention introduced in PR #53. If PR #53 lands first, this merges clean. If this merges first, PR #53's "How to read" preamble still reads correctly with the prefixes already in place.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

```release-notes
NONE
```

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
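The two gates described in this PR could look something like the sketch below. It is illustrative only and not the code in `scripts/doc-check.sh`: the function names `lint_banned_phrases` and `require_h2` are hypothetical, the `[- ]` character class is one assumed way to cover the space-separated variants, and case-insensitive matching is an assumption as well.

```shell
# Hypothetical sketch; not the real scripts/doc-check.sh.

# Phrases listed in the PR description; "[- ]" also matches the space-separated variant.
BANNED_PHRASES='production[- ]grade|world[- ]class|best[- ]in[- ]class|industry[- ]leading|cutting[- ]edge|lightning[- ]fast|battle[- ]tested|enterprise[- ]grade|rock[- ]solid|blazing[- ]fast'

# Return 1 if any of the given files contains a banned phrase, 2 if grep itself fails.
lint_banned_phrases() {
  local rc=0
  grep -nHiE "$BANNED_PHRASES" "$@" >&2 || rc=$?
  # grep exits 0 on a match (lint failure), 1 on no match (lint pass), 2+ on error.
  case "$rc" in
    0) return 1 ;;
    1) return 0 ;;
    *) return 2 ;;
  esac
}

# Return 1 and list the missing headings if any required H2 is absent from the file.
require_h2() {
  local file="$1"
  shift
  local heading missing=0
  for heading in "$@"; do
    # -x matches the whole line, -F treats the heading as a fixed string.
    if ! grep -qxF "## $heading" "$file"; then
      echo "missing H2 in $file: $heading" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A call such as `require_h2 docs/maintainership.md "Commit access" "RFC sponsorship" "Security disclosure"` would then exit non-zero if a heading is renamed or the file is split, which is the failure mode the PR says it wants to surface at build time.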
trilamsr added a commit that referenced this pull request on May 19, 2026
)

## Summary

Files RFC-0008 at `Status: accepted`, ratifying tracecore's current posture as a permanent stance: the binary contains no in-binary self-update mechanism, no background fetcher, no remote update channel. Operators pull releases via their existing delivery tooling (Flux, Argo CD, RenovateBot, `kubectl set image` from CI), and the cryptographic trust root (cosign keyless verification, SBOM, SLSA v1.0 Build L1 provenance from M3) is theirs, not ours.

Closes NORTHSTARS § "Open questions tracked as RFCs" entry 2 ("Auto-update boundary").

## What this PR changes

- **New RFC:** `docs/rfcs/0008-auto-update-boundary.md` (Status: accepted): concrete proposal across receiver / processor / exporter / runtime / binary classes; five rejected alternatives with one-sentence rationale each; risks led by RFC-number-collision per `STYLE-docs.md` §3; crosslinks to PRINCIPLES §1 §2 §6 §11 to show the boundary does not weaken any of them.
- **NORTHSTARS.md:** Open Question #2 closed; replaced with pointer to RFC-0008 + supersession bar ("a production-operator ask that operator-side delivery automation cannot serve").
- **CI grep gate:** `scripts/no-autoupdate-check.sh` greps `cmd/ components/ internal/` for banned identifiers (`go-update`, `self-update`, `auto-update`, `AutoUpdate`, `UpdateCheck`, `FetchLatest`); wired into `make ci`. Run locally: green.
- **Chart README:** `install/kubernetes/tracecore/README.md` adds an "Upgrade posture" subsection under § Upgrade pointing operators at RFC-0008 for the contract.
- **MILESTONES.md:** M23 flipped `☐` → `☑ delivered`; every functional + non-functional rubric bullet carries `☑` (rubric-preservation convention adopted in PR #53).

## Why

The "default off until a real ask appears" stance was a placeholder. Operators in this segment already run delivery pipelines with cryptographic provenance gates they control. Replicating that machinery inside a workload-adjacent collector duplicates an existing strength, badly. PRINCIPLES §2 ("Reversibility before optionality") settles the trade: prefer no mechanism over an off-by-default mechanism, because an off-by-default fetcher still has to exist in the binary, and an opt-out flag is a frequent supply-chain accident.

## Test plan

- [x] `bash scripts/no-autoupdate-check.sh` exits 0 on this branch
- [x] `bash scripts/doc-check.sh` passes: link integrity green, unverified-marker count stable
- [ ] RFC renders correctly on GitHub
- [ ] CI green (`make ci` includes both gates above + license-check + lint + build)

## Note on PR ordering

The MILESTONES.md edit here uses the per-rubric `☑` convention introduced in PR #53. If PR #53 lands first, this merges clean. If this merges first, PR #53's "How to read" updates remain compatible: the convention reads correctly with or without the preamble already in place.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

```release-notes
NONE
```

---------

Signed-off-by: Tri Lam <trilamsr@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
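For readers who want a feel for the CI grep gate, here is a minimal sketch under stated assumptions. It is not the contents of `scripts/no-autoupdate-check.sh`: the function name `no_autoupdate_check` and the variable `BANNED_IDENTIFIERS` are hypothetical, and the choices to skip missing directories and to treat a grep error as a failure are assumptions made to keep the sketch fail-closed.

```shell
# Hypothetical sketch; not the real scripts/no-autoupdate-check.sh.

# Identifiers the PR description lists as banned.
BANNED_IDENTIFIERS='go-update|self-update|auto-update|AutoUpdate|UpdateCheck|FetchLatest'

# Scan each directory; return 1 if a banned identifier is found, 2 if grep errors out.
no_autoupdate_check() {
  local dir hits rc found=0
  for dir in "$@"; do
    # Assumption: a directory that does not exist is skipped rather than reported.
    if [ ! -d "$dir" ]; then
      continue
    fi
    rc=0
    hits="$(grep -rnE "$BANNED_IDENTIFIERS" "$dir")" || rc=$?
    if [ "$rc" -eq 0 ]; then
      # grep exit 0 means at least one match: record the violation and keep scanning.
      echo "$hits" >&2
      found=1
    elif [ "$rc" -ge 2 ]; then
      # A grep error must not read as "clean"; fail the gate instead.
      echo "grep failed on $dir" >&2
      return 2
    fi
  done
  return "$found"
}
```

Run from the repository root as `no_autoupdate_check cmd components internal`, a non-zero exit would fail the build in the same way the PR describes for `make ci`.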
trilamsr added a commit that referenced this pull request on May 21, 2026
…149)

## Summary

Two commits, both follow-on from PR #147:

1. **`[ci] integration paths: add substrate to kernelevents + pyspy`**: closes the workflow-paths audit gap surfaced in PR #147's `docs/followups/otlphttp.md` "Workflow paths trigger" row.
2. **`[docs] MILESTONES: backfill rubric blocks for M1, M2, M4, M9`**: closes `docs/followups/M3.md` "Backfill Foundation milestone rubrics" row.

Both follow-up rows are marked `[x]` in their respective shards with strike-through and ship-evidence.

## Commit 1: integration workflow paths

PR #147's audit found that `.github/workflows/kernelevents-integration.yml` and `pyspy-integration.yml` `paths:` filters cover only:

- `components/receivers/<name>/**`
- `internal/runtime/lifecycle/**`

So a change to `cmd/tracecore` factory wiring, `internal/pipeline` contract, or `internal/selftelemetry` surface could land without re-running either integration suite, even though receiver behavior depends on all three substrates.

This commit adds the three substrate path patterns to both `push:` and `pull_request:` filters on both workflows. Symmetric with `install-bench.yml` (P3-Rev1 #10 fix) and `chart.yml`. `chaos.yml` was audited and intentionally not changed; its substrate coupling runs via `tools/failure-inject/**` + `internal/synthesis/**` only.

`actionlint` clean on both workflows.

## Commit 2: MILESTONES rubric backfill

M1, M2, M4, M9 predate the per-rubric `☑` convention adopted in PR #53 and shipped as prose-only delivery summaries. This commit reformats each section to match M3 / M5b / M10+ shape:

- **Functional rubrics:** block with `☑` bullets citing RFC sections or shipped file paths.
- **Non-functional rubrics:** block for budget / policy / overhead guarantees.

Every claim was extracted from the existing prose summary; no new guarantees added. Source citations:

- **M1**: RFC-0003 (Component / Host / Factory contracts, two-phase shutdown, push-based consumers, factory map, `safe.Call`, operator UX).
- **M2**: RFC-0006 (`/metrics` + `/healthz` + `/readyz`, `selftelemetry.Receiver`, O2 SLO gauges, three OTel divergences closed).
- **M4**: `.golangci.yml` + `Makefile` + `scripts/` (no RFC; convention is the tooling files themselves).
- **M9**: RFC-0007 (`/dev/kmsg` + journald via one `source` interface, NVRM Xid extraction, RE2 filters compile at Start, trace context propagation, non-Linux stubs, overhead budget).

Also fixed three stale `docs/FOLLOWUPS.md` references that survived the shard split (PR #132):

- MILESTONES.md L210 M21 carry-forward → `docs/followups/M3.md`
- MILESTONES.md L269 benchstat → `docs/followups/opportunistic.md`
- MILESTONES.md L552 M8 carry-forward → `docs/followups/M8.md`

## Files changed

| File | Commit | LOC |
|---|---|---|
| `.github/workflows/kernelevents-integration.yml` | 1 | +6 |
| `.github/workflows/pyspy-integration.yml` | 1 | +6 |
| `docs/followups/otlphttp.md` | 1 | +5/-5 (strike + audit closure) |
| `MILESTONES.md` | 2 | +48/-12 (4 rubric blocks + 3 link fixes) |
| `docs/followups/M3.md` | 2 | +5/-4 (strike) |

## Release notes

```release-notes
NONE
```

## Test plan

- [x] `make ci` green (verify, verify-lint, verify-static, verify-test, build, vet, golangci-lint, zizmor, actionlint, govulncheck, fuzz 30s).
- [x] `bash scripts/doc-check.sh` green (em-dash + en-dash diff gate clean, comment-noise diff gate clean, 452 markdown links resolve).
- [x] `actionlint` clean on both modified workflows.
- [x] `bash scripts/no-autoupdate-check_test.sh` 10/10 assertions pass.
- [x] `release-doc-parity` clean, `chart-appversion-check` clean, `alert-check` clean.
- [ ] CI green on this branch.

## Sequencing

Builds on PR #147 (merged) which surfaced both follow-up items. Independent of any currently-open PRs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary

`MILESTONES.md` had drifted from `main`: M3, M4b, M5b, M10, M11 are all already merged but the doc still showed them as `☐`/`⧗`, masking how much of the M21 v0.1.0 path is done. This PR brings the doc back in sync, hardens the convention so the drift can't recur, and tightens the audit record of what "shipped" actually means.

Three coherent changes:

1. Status sync. Flip M3 (PR #28), M4b (PR #30), M5b (PR #29) to `☑ delivered`; flip M10 (PR #32) and M11 (PR #31) to `☑ alpha`. Update the Lane structure table to match (`(shipped)`, `(alpha)`, `carry-fwd` annotations).
2. Per-rubric `☑` convention. Rubrics stay in the doc for the milestone's full lifecycle. Each functional / non-functional bullet now carries a `☐` (planned), `⧗` (in progress), or `☑` (verified) prefix. A milestone's top-line `☑` requires every rubric bullet to be `☑`. Bullets explicitly tagged `*(unverified)*` ship as `☑` when their gate lands; the `*(unverified)*` tag persists in the bullet text so the open evidence is visible. Rationale: the rubric is the falsifiable, citable claim of what shipping meant; stripping it on ship (the previous convention) loses the audit trail PRINCIPLES §6 requires. Foundation entries M1 / M2 / M4 / M9 predate this convention and remain prose-only; backfilling them is deferred cleanup, not a blocker.
3. Currency rule covering both tracking docs. New "Keeping this document current" section in `MILESTONES.md` requiring every PR that starts, advances, ships, or re-scopes work in either `MILESTONES.md` or `docs/FOLLOWUPS.md` to update the corresponding entry in the same PR. Per-rubric transition rules spelled out (☐→⧗ on start, ☐/⧗→☑ on verification, never strip on ship). Follow-up transitions covered too (complete / advance / discover / promote-to-milestone). Cross-referenced from `CONTRIBUTING.md` § "Before you start".

Plus: a one-line **Landed:** field on each shipped/alpha milestone names the primary artifact (workflow, package, chart path) so readers don't have to grep the rubric block to find the entry point.

## Why

Status drift made it hard to identify the actually unstarted parallel work (M21 already has M3 + M5b + three alpha receivers landed; only M6 docs and one more receiver gate the release). Preserving rubrics keeps the falsifiable audit chain intact across the milestone's full life. Covering `FOLLOWUPS.md` in the same rule prevents opportunistic items from disappearing into the same drift.

## Test plan

- `MILESTONES.md` renders correctly on GitHub
- … `☑` with PR citation and a Landed: pointer
- … `☑` prefix on every bullet
- … `*(unverified)*` still carry that tag
- `CONTRIBUTING.md` link to `MILESTONES.md#keeping-this-document-current` resolves
- … `make doc-check` if run)

🤖 Generated with Claude Code
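The rule that a top-line `☑` requires every rubric bullet to be `☑` lends itself to a mechanical check. The following is a minimal sketch only: it assumes each rubric bullet is a markdown list item whose text starts with one of the three status glyphs, and that a single milestone's rubric block has been extracted into its own file. Both that layout and the function names `count_prefix` and `top_line_may_be_verified` are assumptions; nothing like this is described as part of the PR.

```shell
# Hypothetical sketch; the bullet layout and function names are assumptions.

# Print how many rubric bullets in the file start with the given status glyph.
count_prefix() {
  local prefix="$1" file="$2"
  { grep -E "^[[:space:]]*- ${prefix} " "$file" || true; } | wc -l | tr -d ' '
}

# Return 0 only if at least one rubric is verified and none is planned or in progress.
top_line_may_be_verified() {
  local file="$1"
  local planned in_progress verified
  planned="$(count_prefix '☐' "$file")"
  in_progress="$(count_prefix '⧗' "$file")"
  verified="$(count_prefix '☑' "$file")"
  if [ "$verified" -gt 0 ] && [ "$planned" -eq 0 ] && [ "$in_progress" -eq 0 ]; then
    return 0
  fi
  echo "top-line verified blocked: $planned planned, $in_progress in progress" >&2
  return 1
}
```

A bullet that still carries the unverified tag counts as `☑` here as long as its prefix is `☑`, which mirrors the convention in item 2 of the summary: the prefix tracks the gate, and the tag keeps the open evidence visible.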