[docs] AGENTS.md: aggregator bypass + perf-regex flake lessons by trilamsr · Pull Request #81 · TraceCoreAI/tracecore

trilamsr · 2026-05-19T07:58:29Z

Summary

Adds two load-bearing lessons to AGENTS.md from this session's CI work. Both prevent a future contributor from repeating the same trap.

Aggregator bypass. GitHub Actions short-circuits an aggregator job's needs: to SKIPPED on any sub-job failure, and treats SKIPPED required checks as satisfied. PR #73 silently merged past a failed verify-test because the aggregator from PR #72's verify split was SKIPPED rather than FAILURE. The fix shape (if: always() + needs.*.result check) shipped in PR #74; this lesson documents the trap and the fix so anyone splitting CI jobs in the future doesn't repeat it.

Perf-budget regex flake class. require.Regexp with implicit upper bounds (e.g. 0\.0[0-9]+) on values whose only invariant is >0 flake on slow CI runners. Two of these hit in one session: TestReceiver_SLIBudget (emit-latency, observed 539ms) and TestReceiver_SetDegraded (degraded-seconds, observed 0.126s). The fix shape is the same in both — relax to any positive value (\d+\.[0-9]*[1-9]) or use baseline-relative comparisons.

File goes from 128 to 148 lines (cap is 150, with 2 lines of remaining headroom — next addition should consider demoting an older entry to a topic note per the file's own promotion rule).

Test plan

wc -l AGENTS.md reports 148, under the 150-line cap.
make doc-check clean (banned-phrase lint, 250 links resolve, (unverified) count = 7 baseline).
Capture-flow format check (learn-from-mistakes skill): banned vocabulary absent, no first-person AI phrasing, no AI attribution, both entries carry Anchor: citations.
CI on this PR exercises the same gates plus the aggregator that's now itself an anchor of lesson ci(deps): bump the gh-actions group with 5 updates #1.

NONE — documentation only. Adds two load-bearing lessons to `AGENTS.md` covering GitHub Actions aggregator semantics and a recurring perf-budget regex flake class. No runtime behavior change.

Two universal rules learned at real cost this session: 1. GitHub Actions aggregator jobs under default `needs:` semantics silently bypass branch protection. PR #73 merged past a failed `verify-test` because the aggregator was SKIPPED and GitHub treats SKIPPED required checks as satisfied. Fix shape lives already in main (PR #74); the lesson now lives where the next contributor splitting CI jobs will see it before repeating. 2. `require.Regexp` with implicit upper bounds (e.g. `0\.0[0-9]+`) is a flake class. Two assertions of identical shape hit in one session (`TestReceiver_SLIBudget`, `TestReceiver_SetDegraded`). Fix shape: assert only the actual invariant. File at 148 lines (cap is 150). `make doc-check` clean. Signed-off-by: Tri Lam <trilamsr@gmail.com>

…83) ## Summary Flake-pattern audit follow-up to PR #76 + #78. Two assertions in `tools/failure-inject/cpusteal/cpusteal_test.go` match the same shape we fixed in `TestReceiver_SLIBudget` and `TestReceiver_SetDegraded`: hard absolute upper bound on observed timing, calibrated to fast-runner expectations. | Before | After | What changed | |---|---|---| | `require.Less(elapsed, 500ms)` for 100ms request | `require.Less(elapsed, 2s)` | Hang sentinel, not perf bound — busy-loop scheduler delay under contention can run a 100ms request to 300-400ms | | `require.Less(elapsed, 250ms)` for cancel response | `require.Less(elapsed, 2s)` | Same — context-cancellation latency varies by an order of magnitude under contention | The lower-bound assertion on `TestRun_HonorsDuration` (`elapsed >= 95ms`) still pins the real contract (busy-loop runs for the requested time). The upper bounds only catch "never returned." This matches the lesson landed in `AGENTS.md` via PR #81 — *match perf-budget assertions by the invariant only*. ## Test plan - [x] Local: `go test -race -count=3 -v ./tools/failure-inject/cpusteal/` — all 4 tests PASS each iteration. - [x] `make lint` clean. - [x] `make vet` clean. - [x] Audit completeness verified: broader grep sweep (`require.Less.*Millisecond`, `assert.Less.*Millisecond`, `elapsed > N*time.X`, `WithinDuration`, `Budget` callsites, `isRaceBuild` callsites) found no other instances of the same shape outside the kernelevents SLI test we already covered. - [ ] CI on this PR. ## Rollback Single Edit to restore the original numeric bounds. No dependents; the bounds are local to two test functions. ```release-notes NONE — test stability only. Relaxes two absolute-time assertions in cpusteal's test to hang sentinels rather than performance bounds, matching the pattern landed in PR #76 and #78. No production behavior change. ``` Signed-off-by: Tri Lam <trilamsr@gmail.com>

## Summary Three observations from the recent session that didn't fit the structured surfaces (already used for the load-bearing AGENTS.md entries in PR #81 and the agent-internal notes in PR #82). Each captured via the `learn-from-mistakes` flow and lands in its existing topic note. **`.claude/notes/automation.md`** — *Memory captures rationale; hooks enforce.* The pre-PR checklist personal memory landed mid-session was followed by a lint failure shipped to CI within the hour. The same gap closed reliably by the `PreToolUse` hook installed shortly after. For any "always do X before Y" pattern, prefer the hook; the memory documents *why* the hook exists. **`docs/notes/ci.md`** — *Frame CI / perf projections as ranges, not single numbers.* PR #72's 155s wall-time projection vs 242s actual cost an investigation round (later landed in PR #77) because the projection's setup-go-cache amortization assumption was unverified. Either verify assumptions empirically before publishing the number, or frame as a range. **`.claude/notes/review-patterns.md`** — *Self-rate work, then write criteria for the next grade up.* Forces articulation of measurable improvements rather than free-form "anything else?". PR #76's B+ → A → A+ came from two iterations of this exact pattern; each iteration closed real structural gaps. A fourth lesson — "fix existing tools before proposing new ones" — was captured to personal memory (no PR, lives in `~/.claude/projects/.../memory/`), not the repo, because it's a judgment heuristic about my own decision-making rather than a repo-resident convention. ## Test plan - [x] `make doc-check` clean (banned-phrase lint, link resolution, all gates). - [x] `learn-from-mistakes` format check: banned vocabulary absent, no first-person AI phrasing, no AI attribution, all three entries carry `Anchor:` lines pointing at concrete PRs. - [ ] CI on this PR exercises `doc-check` + `pr-lint`. ## Rollback Each entry is a self-contained `### title` + body + `Anchor:` block at the top of its file. No dependents elsewhere; reverting is a single Edit per file. ```release-notes NONE — documentation only. Three meta-lessons from a recent session retrospective land in their existing topic notes (`automation.md`, `ci.md`, `review-patterns.md`). No runtime behavior change. ``` Signed-off-by: Tri Lam <trilamsr@gmail.com>

trilamsr merged commit 9201b67 into main May 19, 2026
8 checks passed

trilamsr deleted the docs/agents-md-aggregator-and-flake-lessons branch May 19, 2026 08:02

trilamsr mentioned this pull request May 19, 2026

[ci] cpusteal_test: relax hang-sentinel bounds (flake-pattern audit) #83

Merged

5 tasks

trilamsr mentioned this pull request May 19, 2026

[docs] session retrospective: three meta-lessons captured #84

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[docs] AGENTS.md: aggregator bypass + perf-regex flake lessons#81

[docs] AGENTS.md: aggregator bypass + perf-regex flake lessons#81
trilamsr merged 1 commit into
mainfrom
docs/agents-md-aggregator-and-flake-lessons

trilamsr commented May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

trilamsr commented May 19, 2026

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant