[docs] AGENTS.md: aggregator bypass + perf-regex flake lessons #81
Merged
Two universal rules learned at real cost this session:

1. GitHub Actions aggregator jobs under default `needs:` semantics silently bypass branch protection. PR #73 merged past a failed `verify-test` because the aggregator was SKIPPED and GitHub treats SKIPPED required checks as satisfied. The fix shape is already in main (PR #74); the lesson now lives where the next contributor splitting CI jobs will see it before repeating the mistake.
2. `require.Regexp` with implicit upper bounds (e.g. `0\.0[0-9]+`) is a flake class. Two assertions of identical shape hit in one session (`TestReceiver_SLIBudget`, `TestReceiver_SetDegraded`). Fix shape: assert only the actual invariant.

File at 148 lines (cap is 150). `make doc-check` clean.

Signed-off-by: Tri Lam <trilamsr@gmail.com>
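The aggregator trap in rule 1 can be sketched as a toy model in Go. This is not the workflow from PR #74: `aggregatorResult` and `requiredCheckSatisfied` are hypothetical names, and the model assumes the fixed aggregator fails on any needed job that did not succeed. It only illustrates why SKIPPED lets a merge through while an always-run aggregator blocks it.

```go
package main

import "fmt"

// Result strings mirror the values GitHub Actions exposes as needs.<job>.result.
const (
	success = "success"
	failure = "failure"
	skipped = "skipped"
)

// aggregatorResult models the aggregator job's own result (hypothetical helper).
// alwaysRun == false: default `needs:` semantics, so the job is skipped as soon
// as any needed job did not succeed.
// alwaysRun == true: the job runs regardless (as with `if: always()`) and fails
// unless every needed job succeeded. Treating every non-success as a failure is
// an assumption of this sketch.
func aggregatorResult(needs []string, alwaysRun bool) string {
	allOK := true
	for _, r := range needs {
		if r != success {
			allOK = false
		}
	}
	switch {
	case allOK:
		return success
	case !alwaysRun:
		return skipped
	default:
		return failure
	}
}

// requiredCheckSatisfied models branch protection: a skipped required check
// counts as satisfied, exactly like a successful one.
func requiredCheckSatisfied(result string) bool {
	return result == success || result == skipped
}

func main() {
	// One sub-job passed, one (think verify-test) failed.
	needs := []string{success, failure}
	for _, alwaysRun := range []bool{false, true} {
		r := aggregatorResult(needs, alwaysRun)
		fmt.Printf("alwaysRun=%v aggregator=%s mergeAllowed=%v\n",
			alwaysRun, r, requiredCheckSatisfied(r))
	}
}
```

With the default semantics the model reports a skipped aggregator and an allowed merge; with the always-run variant it reports a failed aggregator and a blocked merge.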
5 tasks
trilamsr added a commit that referenced this pull request on May 19, 2026
…83)

## Summary

Flake-pattern audit follow-up to PR #76 + #78. Two assertions in `tools/failure-inject/cpusteal/cpusteal_test.go` match the same shape we fixed in `TestReceiver_SLIBudget` and `TestReceiver_SetDegraded`: hard absolute upper bound on observed timing, calibrated to fast-runner expectations.

| Before | After | What changed |
|---|---|---|
| `require.Less(elapsed, 500ms)` for 100ms request | `require.Less(elapsed, 2s)` | Hang sentinel, not perf bound: busy-loop scheduler delay under contention can run a 100ms request to 300-400ms |
| `require.Less(elapsed, 250ms)` for cancel response | `require.Less(elapsed, 2s)` | Same: context-cancellation latency varies by an order of magnitude under contention |

The lower-bound assertion on `TestRun_HonorsDuration` (`elapsed >= 95ms`) still pins the real contract (busy-loop runs for the requested time). The upper bounds only catch "never returned."

This matches the lesson landed in `AGENTS.md` via PR #81: *match perf-budget assertions by the invariant only*.

## Test plan

- [x] Local: `go test -race -count=3 -v ./tools/failure-inject/cpusteal/`, all 4 tests PASS each iteration.
- [x] `make lint` clean.
- [x] `make vet` clean.
- [x] Audit completeness verified: broader grep sweep (`require.Less.*Millisecond`, `assert.Less.*Millisecond`, `elapsed > N*time.X`, `WithinDuration`, `Budget` callsites, `isRaceBuild` callsites) found no other instances of the same shape outside the kernelevents SLI test we already covered.
- [ ] CI on this PR.

## Rollback

Single Edit to restore the original numeric bounds. No dependents; the bounds are local to two test functions.

```release-notes
NONE: test stability only. Relaxes two absolute-time assertions in cpusteal's test to hang sentinels rather than performance bounds, matching the pattern landed in PR #76 and #78. No production behavior change.
```

Signed-off-by: Tri Lam <trilamsr@gmail.com>
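The split between a contract-pinning lower bound and a hang-sentinel upper bound can be sketched as follows. `checkElapsed` is a hypothetical stdlib-only helper, not code from `cpusteal_test.go`, and the 600ms observation is an invented slow-runner value used only for illustration; the 95ms, 500ms, and 2s figures come from the commit message above.

```go
package main

import (
	"fmt"
	"time"
)

// checkElapsed is a hypothetical helper that separates the two roles a timing
// assertion can play:
//   - lowerBound pins the real contract (the work ran for the requested time);
//   - hangSentinel only catches "never returned" and is deliberately loose.
func checkElapsed(elapsed, lowerBound, hangSentinel time.Duration) error {
	if elapsed < lowerBound {
		return fmt.Errorf("returned early: elapsed %v < lower bound %v", elapsed, lowerBound)
	}
	if elapsed >= hangSentinel {
		return fmt.Errorf("possible hang: elapsed %v >= sentinel %v", elapsed, hangSentinel)
	}
	return nil
}

func main() {
	// Assumed observation: a 100ms request stretched to 600ms on a contended runner.
	observed := 600 * time.Millisecond

	// Old shape: 500ms upper bound calibrated to fast runners.
	fmt.Println("500ms bound:", checkElapsed(observed, 95*time.Millisecond, 500*time.Millisecond))

	// New shape: 2s hang sentinel; the lower bound still enforces the contract.
	fmt.Println("2s sentinel:", checkElapsed(observed, 95*time.Millisecond, 2*time.Second))
}
```

In this sketch the tight bound rejects a slow but correct run, while the sentinel accepts it and would still reject a run that finishes before the 95ms lower bound.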
3 tasks
trilamsr added a commit that referenced this pull request on May 19, 2026
## Summary

Three observations from the recent session that didn't fit the structured surfaces (already used for the load-bearing AGENTS.md entries in PR #81 and the agent-internal notes in PR #82). Each was captured via the `learn-from-mistakes` flow and lands in its existing topic note.

**`.claude/notes/automation.md`**: *Memory captures rationale; hooks enforce.* The pre-PR checklist personal memory that landed mid-session was followed by a lint failure shipped to CI within the hour. The same gap was closed reliably by the `PreToolUse` hook installed shortly after. For any "always do X before Y" pattern, prefer the hook; the memory documents *why* the hook exists.

**`docs/notes/ci.md`**: *Frame CI / perf projections as ranges, not single numbers.* PR #72's 155s wall-time projection vs 242s actual cost an investigation round (later landed in PR #77) because the projection's setup-go-cache amortization assumption was unverified. Either verify assumptions empirically before publishing the number, or frame it as a range.

**`.claude/notes/review-patterns.md`**: *Self-rate work, then write criteria for the next grade up.* This forces articulation of measurable improvements rather than a free-form "anything else?". PR #76's B+ → A → A+ came from two iterations of this exact pattern; each iteration closed real structural gaps.

A fourth lesson, "fix existing tools before proposing new ones", was captured to personal memory (no PR, lives in `~/.claude/projects/.../memory/`), not the repo, because it's a judgment heuristic about my own decision-making rather than a repo-resident convention.

## Test plan

- [x] `make doc-check` clean (banned-phrase lint, link resolution, all gates).
- [x] `learn-from-mistakes` format check: banned vocabulary absent, no first-person AI phrasing, no AI attribution, all three entries carry `Anchor:` lines pointing at concrete PRs.
- [ ] CI on this PR exercises `doc-check` + `pr-lint`.

## Rollback

Each entry is a self-contained `### title` + body + `Anchor:` block at the top of its file. No dependents elsewhere; reverting is a single Edit per file.

```release-notes
NONE: documentation only. Three meta-lessons from a recent session retrospective land in their existing topic notes (`automation.md`, `ci.md`, `review-patterns.md`). No runtime behavior change.
```

Signed-off-by: Tri Lam <trilamsr@gmail.com>
Summary
Adds two load-bearing lessons to `AGENTS.md` from this session's CI work. Both prevent a future contributor from repeating the same trap.

1. **Aggregator bypass.** GitHub Actions short-circuits an aggregator job's `needs:` to SKIPPED on any sub-job failure, and treats SKIPPED required checks as satisfied. PR #73 silently merged past a failed `verify-test` because the aggregator from PR #72's verify split was SKIPPED rather than FAILURE. The fix shape (`if: always()` + `needs.*.result` check) shipped in PR #74; this lesson documents the trap and the fix so anyone splitting CI jobs in the future doesn't repeat it.
2. **Perf-budget regex flake class.** `require.Regexp` with implicit upper bounds (e.g. `0\.0[0-9]+`) on values whose only invariant is `>0` flakes on slow CI runners. Two of these hit in one session: `TestReceiver_SLIBudget` (emit-latency, observed 539ms) and `TestReceiver_SetDegraded` (degraded-seconds, observed 0.126s). The fix shape is the same in both: relax to any positive value (`\d+\.[0-9]*[1-9]`) or use baseline-relative comparisons.

File goes from 128 to 148 lines (cap is 150, leaving 2 lines of headroom; the next addition should consider demoting an older entry to a topic note per the file's own promotion rule).
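A minimal Go sketch of why the bounded pattern flakes, using only the standard library. The pattern `0\.0[0-9]+` and the value `0.126` come from the description above; `0.042` is an assumed fast-runner value, and `isPositive` is a hypothetical helper showing one way to assert only the `>0` invariant (parsing the number instead of matching its digits), not necessarily the approach taken in the actual tests.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// brittle is the quoted pattern shape. It encodes an implicit upper bound:
// it only matches when the rendered value contains "0.0" followed by digits,
// i.e. values below 0.1.
var brittle = regexp.MustCompile(`0\.0[0-9]+`)

// isPositive is a hypothetical helper that checks only the real invariant
// (the value is greater than zero) by parsing the rendered number.
func isPositive(rendered string) (bool, error) {
	v, err := strconv.ParseFloat(rendered, 64)
	if err != nil {
		return false, err
	}
	return v > 0, nil
}

func main() {
	// "0.042": assumed fast-runner value. "0.126": slow-runner value from the PR.
	for _, s := range []string{"0.042", "0.126"} {
		ok, err := isPositive(s)
		fmt.Printf("value=%s brittleMatch=%v positive=%v err=%v\n",
			s, brittle.MatchString(s), ok, err)
	}
}
```

Both values are valid positive measurements, but only the fast-runner one matches the bounded pattern, which is the flake.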
Test plan
- `wc -l AGENTS.md` reports 148, under the 150-line cap.
- `make doc-check` clean (banned-phrase lint, 250 links resolve, `(unverified)` count = 7 baseline).
- Format check (`learn-from-mistakes` skill): banned vocabulary absent, no first-person AI phrasing, no AI attribution, both entries carry `Anchor:` citations.