Merged
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -42,6 +42,8 @@ jobs:
run: make lint
- name: nccl_fr RCE gate
run: make nccl-fr-rce-gate
- name: register-lint
run: make register-lint
- name: test (race) + coverage-check
run: make coverage-check
- name: 30s fuzz (nccl_fr parser)
7 changes: 5 additions & 2 deletions Makefile
@@ -1,4 +1,4 @@
.PHONY: help build run test test-extras test-extras-sustained test-extras-fuzz test-extras-fuzz-kmsg test-extras-fuzz-journald test-extras-fuzz-nccl-fr test-extras-race bench bench-check fmt fmt-fix vet lint lint-fix tidy tidy-check mod-verify license-check license-fix govulncheck dco-check hooks clean check ci ci-fuzz-nccl-fr nccl-fr-rce-gate generate generate-check generate-fixtures coverage coverage-check doc-check smoke build-tags
.PHONY: help build run test test-extras test-extras-sustained test-extras-fuzz test-extras-fuzz-kmsg test-extras-fuzz-journald test-extras-fuzz-nccl-fr test-extras-race bench bench-check fmt fmt-fix vet lint lint-fix tidy tidy-check mod-verify license-check license-fix govulncheck dco-check hooks clean check ci ci-fuzz-nccl-fr nccl-fr-rce-gate register-lint generate generate-check generate-fixtures coverage coverage-check doc-check smoke build-tags

BIN := tracecore
PKG := ./cmd/tracecore
@@ -193,7 +193,10 @@ doc-check: ## Verify test identifiers referenced in rot-prone docs exist in the
	@scripts/doc-check.sh
	@scripts/alert-check.sh

ci: license-check generate-check vet build-tags tidy-check mod-verify lint nccl-fr-rce-gate coverage-check ci-fuzz-nccl-fr govulncheck doc-check build ## Everything CI runs. Run before opening a PR.
register-lint: ## Verify `func Register*` symbols live only under components/** (or an explicit allowlist). Enforces STRATEGY.md §"Each component owns its own Factory var".
	@scripts/register-lint.sh

ci: license-check generate-check vet build-tags tidy-check mod-verify lint nccl-fr-rce-gate register-lint coverage-check ci-fuzz-nccl-fr govulncheck doc-check build ## Everything CI runs. Run before opening a PR.

smoke: build ## End-to-end smoke test: validate the dcgm example config, run the binary for 1.5s, kill, assert lifecycle logs appear. No hardware required; receiver degrades cleanly on macOS/CI.
	@scripts/smoke.sh
11 changes: 6 additions & 5 deletions docs/FOLLOWUPS.md
@@ -550,11 +550,12 @@ predicate. Documented so they aren't re-litigated.

### Tooling

- [ ] **`make register-lint`** — fail CI if `func Register*(`
appears outside `components/**` (or a `Register*` call to a
centralized registry). Converts STRATEGY.md's "Each component
package owns its own Factory var" rule from policy into
enforcement. *Target:* opportunistic, ~1 hour.
<!-- `make register-lint` closed: shipped as scripts/register-lint.sh
+ Makefile `register-lint` target, wired into `make ci` and CI's
`verify` job. Allowlist scopes two OTel-instrument registration
sites (internal/telemetry/{build_info,slo}.go) that match the
`Register*` prefix but are not the banned plugin-registry pattern. -->
- *Closed (see comment above): `make register-lint` shipped and CI-gated.*
- [ ] **OSS-Fuzz integration.** Tracecore fuzz targets currently run
only inside `go test`. Continuous fuzzing is ~1 day of setup
but premature pre-v0. *Trigger:* v0 cut — integrate or write
89 changes: 89 additions & 0 deletions scripts/register-lint.sh
@@ -0,0 +1,89 @@
#!/usr/bin/env bash
# register-lint.sh — enforce that `func Register*` symbols live only
# under `components/**` (or an explicit allowlist).
#
# Converts STRATEGY.md §"Each component package owns its own Factory
# var" from policy into enforcement. The hard rule there:
#
# "PRs that introduce a centralized Register*() API, a global
# factory map outside components.yaml + the generated
# components.go, or anything that looks like a plugin loading
# mechanism MUST be rejected without an accepted RFC."
#
# This gate trips the moment such an API surface re-appears outside
# `components/**`, instead of waiting for a reviewer to spot it.
#
# Scope: Go source files only (*.go), excluding vendor/, .git/, and
# nested worktrees under .claude/worktrees/. Test files are in scope
# — a test helper named `RegisterFactories` would be the same drift.
#
# Allowlist: files whose `Register*` functions are OTel-instrument
# registration helpers (different verb domain — they register metric
# instruments on a MeterProvider, not component factories). Keep this
# list narrow; each entry needs a one-line rationale.
#
# Exits 0 if no violations; exits 1 with a list of offenders.

set -euo pipefail

# Allowlist — paths (relative to repo root) where `func Register*` is
# legitimate. Each entry is OTel-instrument registration on a
# MeterProvider, NOT component-factory registration. Adding here
# requires a one-line rationale in this list AND review attention.
allowlist=(
  # Registers `tracecore.build.info` observable gauge on a MeterProvider.
  'internal/telemetry/build_info.go'
  # Registers tracecore.exporter.failure_rate / queue.depth_ratio /
  # component.restart_count_per_hour gauges on a MeterProvider.
  'internal/telemetry/slo.go'
)

# Find every Go file outside vendor/, .git/, .claude/worktrees/, and
# components/. The `find` flags exclude paths before grep ever sees them
# — cheaper than letting grep walk and prune later. Use a while-read
# loop instead of `mapfile` so the script runs on macOS bash 3.2.
violations=""
while IFS= read -r f; do
  [ -n "$f" ] || continue
  if ! grep -q '^func Register' "$f"; then
    continue
  fi
  # Is this file allowlisted?
  is_allowed=0
  for allowed in "${allowlist[@]}"; do
    if [ "$f" = "$allowed" ]; then
      is_allowed=1
      break
    fi
  done
  if [ "$is_allowed" -eq 1 ]; then
    continue
  fi
  # Capture the offending lines for the error message. Backslashes are
  # doubled so the later `printf '%b'` prints the source text verbatim
  # instead of interpreting escapes (e.g. `\n` inside a Go string literal).
  hits=$(grep -n '^func Register' "$f" | sed 's/\\/\\\\/g')
  violations="${violations}${f}:\n${hits}\n"
done < <(
  find . -name '*.go' \
    -not -path './.git/*' \
    -not -path './vendor/*' \
    -not -path './.claude/worktrees/*' \
    -not -path './components/*' \
  | sed 's|^\./||' \
  | sort
)

if [ -n "$violations" ]; then
  echo "register-lint: 'func Register*' found outside components/ (and outside the allowlist):"
  printf '%b' "$violations" | sed 's/^/ /'
  echo
  echo "Per STRATEGY.md §\"Each component package owns its own Factory var\","
  echo "a centralized Register*() API for components is banned without an RFC."
  echo "If this is OTel-instrument registration (not component-factory"
  echo "registration), add the file to the allowlist in scripts/register-lint.sh"
  echo "with a one-line rationale."
  exit 1
fi

# Success: count the allowlisted files for visibility.
allow_count=${#allowlist[@]}
echo "register-lint: no 'func Register*' outside components/ (allowlist: $allow_count file(s))"
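
Reviewer note: the gate's matching logic can be exercised in isolation against a throwaway tree. The following is a minimal sketch and is not part of this PR. The function name `list_register_offenders` and the fixture paths under the temp directory are hypothetical; the `find` exclusions and the `^func Register` pattern are copied from `scripts/register-lint.sh`, and the allowlist step is omitted for brevity.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the core find + grep check from register-lint.sh
# against a temporary fixture tree. All names below are illustrative.
set -euo pipefail

# Prints, one per line, every Go file under $1 (outside the excluded
# directories) that declares a package-level `func Register*`.
list_register_offenders() {
  local root="$1"
  (
    cd "$root"
    find . -name '*.go' \
      -not -path './.git/*' \
      -not -path './vendor/*' \
      -not -path './.claude/worktrees/*' \
      -not -path './components/*' \
      | sed 's|^\./||' \
      | sort \
      | while IFS= read -r f; do
          if grep -q '^func Register' "$f"; then
            printf '%s\n' "$f"
          fi
        done
  )
}

# Build a small fixture: one file in each interesting location.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/components/foo" "$tmp/internal/registry" "$tmp/vendor/x"
printf 'package foo\nfunc RegisterFoo() {}\n' > "$tmp/components/foo/foo.go"
printf 'package registry\nfunc RegisterAll() {}\n' > "$tmp/internal/registry/r.go"
printf 'package registry\nfunc (r *R) Register() {}\n' > "$tmp/internal/registry/m.go"
printf 'package x\nfunc RegisterX() {}\n' > "$tmp/vendor/x/x.go"

offenders=$(list_register_offenders "$tmp")
echo "$offenders"
```

In this fixture only `internal/registry/r.go` is reported. The files under `components/` and `vendor/` are excluded by `find`, and the method in `m.go` is not matched because the anchored pattern requires `Register` to follow `func ` directly, so declarations with a receiver fall outside the check as written.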