[docs] FOLLOWUPS: record CI wall-time investigation post-PR #72#77
Merged
Conversation
Two entries under "Open — opportunistic / Tooling": 1. `coverage-check` sharding to close the projection gap. Investigation on 2026-05-18 confirmed all three verify-* jobs already share the same setup-go cache key and hit it. The gap is verify-test bounded at 235s by race-on test execution, not cache misses. Closing it requires sharding `go test ./...` across jobs — a real refactor, triggered by PR throughput becoming binding. 2. Compile-work dedupe between verify-lint and verify-static. `make build-tags` and `make build` repeat work verify-lint already did. Moving both saves ~17% runner-minutes, zero wall-time. Triggered by runner-minute pressure or unrelated verify-split work. Records both so future investigations don't re-chase setup-go cache sharing as a wall-time lever. Signed-off-by: Tri Lam <trilamsr@gmail.com>
trilamsr
added a commit
that referenced
this pull request
May 19, 2026
Doc-only PR (335 changed lines, all markdown). Branch is `worktree-*` so the rubric anchor auto-resolves to `(no-milestone)` per the skill; binding rubric is PRINCIPLES.md / NORTHSTARS.md / STYLE.md / docs/STYLE-docs.md / MEMORY.md. The MILESTONES.md M13 amendments are the diff content, not the rubric anchor for this review. Pre-Phase-1 fix: prior commit on this branch was amended (8e488bc) to remove the Co-Authored-By trailer and add Signed-off-by, so the loop runs on a DCO-compliant branch. PR body cleanup deferred to Phase 5 per skill orchestration. This commit defers one new follow-up to docs/FOLLOWUPS.md (P1.6, non-CPython interpreter support boundary). No code changes; doc-only diff means no TDD code cycle. Test-discipline maps to doc-check banned-phrase lint, link integrity, and the grep-based evidence map in the PR body — all clean. Findings: P1.1 [BLOCKER] DCO + AI-vocab pre-conditions on prior commit and PR body Beneficiary: repo-long-term (feedback_no_ai_mentions_in_repo) Location: PR-level Description: Prior commit dfe1f6b carried Co-Authored-By for AI and lacked Signed-off-by. PR body has "Generated with Claude Code" footer and references the review-loop skill by name. Proof: `git log -1 --format='%(trailers)' dfe1f6b` showed trailer pre-fix; `git log -1 --format='%(trailers)' 8e488bc` shows only Signed-off-by post-fix. Contradiction: ran. Same rules apply across the repo per feedback_no_ai_mentions_in_repo; no exception for draft PRs. TDD record: amend verified via the trailers grep above; force-push landed (8e488bc replaces dfe1f6b on origin/worktree-*). PR body sync deferred to Phase 5. Action: applied 8e488bc (commit), deferred Phase-5 (PR body) Rationale: standard DCO repair flow on a draft PR with no reviewers. P1.2 [NIT] External-link rot defense absent Beneficiary: repo-long-term Location: docs/rfcs/0009-pyspy-receiver-scope.md §References plus M13 citation corrections at MILESTONES.md:444-445 Description: doc-check verifies on-disk links only; external GitHub URLs would rot silently if upstream issues move. Proof: doc-check output "252 markdown link(s) resolve to on-disk files" — no counter for external URLs. Contradiction: link-checker infrastructure is heavier than warranted for v0.1; industry standard accepts external link rot as best-effort. Action: explicitly-skipped Rationale: cost > value; manual review during reads suffices. PRINCIPLES §6 met (every cited claim has a URL). P1.3 [NIT] Internal section-anchor brittleness Beneficiary: repo-long-term Location: MILESTONES.md M13 references "RFC-0009 §Design overview", "§Wire protocol", "§Safety properties" Description: doc-check verifies file existence, not anchor existence. Future RFC heading rename silently breaks back-references. Proof: `grep -c "RFC-0009 §" MILESTONES.md` returns 3; no automated test asserts those anchors resolve. Contradiction: at land time anchors resolve; future rename touches the author who is expected to grep for back-references. Action: deferred (same family as existing FOLLOWUPS rows on docs- parity helpers; no new row added) Rationale: low-frequency failure mode, manual mitigation sufficient. P1.4 [NIT] Concurrent FOLLOWUPS.md edit conflict Beneficiary: repo-long-term Location: docs/FOLLOWUPS.md new section at file tail Description: My M13 section sits at end-of-file; concurrent PRs appending elsewhere merge cleanly; same-tail concurrent appends conflict. Proof: PR #77 recently edited FOLLOWUPS.md; my branch was rebased onto current main. Contradiction: standard git merge handles this; conflict is expected workflow, not a defect. Action: explicitly-skipped Rationale: not a defect. P1.5 [NIT] fnv128a collision math n=10^7 not workload-cited Beneficiary: repo-long-term Location: docs/rfcs/0009-pyspy-receiver-scope.md §Design overview Description: The (2.7e-6, 1.5e-25) numbers cite the birthday-bound formula `n^2 / (2 * 2^k)` but the `n=10^7 distinct (rank, stack) pairs per fleet-day` upper bound is asserted without measurement. Proof: Python repl: `1e7**2/(2*2**64) = 2.71e-6`; `1e7**2/(2*2**128) = 1.47e-25`. Math is correct; n is an estimate. Contradiction: reader can re-derive for any n trivially from the inline formula. RFC text is falsifiable in shape. Action: explicitly-skipped Rationale: PRINCIPLES §6 met — formula provided; n is a reasonable upper-bound estimate, tightening is polish not correctness. P1.6 [CONCERN] Non-CPython interpreter support boundary undocumented Beneficiary: customer-adopter Location: docs/rfcs/0009-pyspy-receiver-scope.md §Degraded modes Description: `faulthandler_missing` posture covers CPython-built- without-faulthandler but the RFC does not enumerate PyPy / MicroPython / GraalPy. An adopter on a non-CPython runtime would hit a confusing pip-install failure rather than a clean error. Proof: PyPy older versions: `import faulthandler` raises ImportError; GraalPy support is divergent across versions. Contradiction: ML training is overwhelmingly CPython; non-CPython in training is rare; Phase 4 operator docs are the right venue. TDD record: deferred — defended by operator docs (Phase 4). Action: deferred docs/FOLLOWUPS.md M13 section (this commit) Rationale: real constraint, low-frequency, doc-mitigation correct venue. Validation-cycle stats: - Findings rejected during contradict: 0 - Findings whose hard-proof did not reproduce: 0 TDD discipline stats: - New code changes landed via failing-test-first: 0 (doc-only PR) - Tests with mutation-verify outcome recorded: 0 - Tests reproducibility-verified: 0 - doc-check runs in draft cycle: 6 (all clean) Edge cases enumerated: - Internal link rot → defended by doc-check (252 links resolve) - External link rot → P1.2 explicitly-skipped - Anchor brittleness → P1.3 deferred - Concurrent edits → P1.4 explicitly-skipped (standard merge) - Collision math → P1.5 explicitly-skipped (formula inline) - Non-CPython runtime → P1.6 deferred to FOLLOWUPS - Rubric drift → covered by P1.4 Rubric additions proposed in Phase 1: none. Initial rubric (repo- standards) covers the doc-only diff. Signed-off-by: Tri Lam <trilamsr@gmail.com>
3 tasks
trilamsr
added a commit
that referenced
this pull request
May 19, 2026
## Summary Three observations from the recent session that didn't fit the structured surfaces (already used for the load-bearing AGENTS.md entries in PR #81 and the agent-internal notes in PR #82). Each captured via the `learn-from-mistakes` flow and lands in its existing topic note. **`.claude/notes/automation.md`** — *Memory captures rationale; hooks enforce.* The pre-PR checklist personal memory landed mid-session was followed by a lint failure shipped to CI within the hour. The same gap closed reliably by the `PreToolUse` hook installed shortly after. For any "always do X before Y" pattern, prefer the hook; the memory documents *why* the hook exists. **`docs/notes/ci.md`** — *Frame CI / perf projections as ranges, not single numbers.* PR #72's 155s wall-time projection vs 242s actual cost an investigation round (later landed in PR #77) because the projection's setup-go-cache amortization assumption was unverified. Either verify assumptions empirically before publishing the number, or frame as a range. **`.claude/notes/review-patterns.md`** — *Self-rate work, then write criteria for the next grade up.* Forces articulation of measurable improvements rather than free-form "anything else?". PR #76's B+ → A → A+ came from two iterations of this exact pattern; each iteration closed real structural gaps. A fourth lesson — "fix existing tools before proposing new ones" — was captured to personal memory (no PR, lives in `~/.claude/projects/.../memory/`), not the repo, because it's a judgment heuristic about my own decision-making rather than a repo-resident convention. ## Test plan - [x] `make doc-check` clean (banned-phrase lint, link resolution, all gates). - [x] `learn-from-mistakes` format check: banned vocabulary absent, no first-person AI phrasing, no AI attribution, all three entries carry `Anchor:` lines pointing at concrete PRs. - [ ] CI on this PR exercises `doc-check` + `pr-lint`. ## Rollback Each entry is a self-contained `### title` + body + `Anchor:` block at the top of its file. No dependents elsewhere; reverting is a single Edit per file. ```release-notes NONE — documentation only. Three meta-lessons from a recent session retrospective land in their existing topic notes (`automation.md`, `ci.md`, `review-patterns.md`). No runtime behavior change. ``` Signed-off-by: Tri Lam <trilamsr@gmail.com>
trilamsr
added a commit
that referenced
this pull request
May 19, 2026
Doc-only PR (335 changed lines, all markdown). Branch is `worktree-*` so the rubric anchor auto-resolves to `(no-milestone)` per the skill; binding rubric is PRINCIPLES.md / NORTHSTARS.md / STYLE.md / docs/STYLE-docs.md / MEMORY.md. The MILESTONES.md M13 amendments are the diff content, not the rubric anchor for this review. Pre-Phase-1 fix: prior commit on this branch was amended (8e488bc) to remove the Co-Authored-By trailer and add Signed-off-by, so the loop runs on a DCO-compliant branch. PR body cleanup deferred to Phase 5 per skill orchestration. This commit defers one new follow-up to docs/FOLLOWUPS.md (P1.6, non-CPython interpreter support boundary). No code changes; doc-only diff means no TDD code cycle. Test-discipline maps to doc-check banned-phrase lint, link integrity, and the grep-based evidence map in the PR body — all clean. Findings: P1.1 [BLOCKER] DCO + AI-vocab pre-conditions on prior commit and PR body Beneficiary: repo-long-term (feedback_no_ai_mentions_in_repo) Location: PR-level Description: Prior commit dfe1f6b carried Co-Authored-By for AI and lacked Signed-off-by. PR body has "Generated with Claude Code" footer and references the review-loop skill by name. Proof: `git log -1 --format='%(trailers)' dfe1f6b` showed trailer pre-fix; `git log -1 --format='%(trailers)' 8e488bc` shows only Signed-off-by post-fix. Contradiction: ran. Same rules apply across the repo per feedback_no_ai_mentions_in_repo; no exception for draft PRs. TDD record: amend verified via the trailers grep above; force-push landed (8e488bc replaces dfe1f6b on origin/worktree-*). PR body sync deferred to Phase 5. Action: applied 8e488bc (commit), deferred Phase-5 (PR body) Rationale: standard DCO repair flow on a draft PR with no reviewers. P1.2 [NIT] External-link rot defense absent Beneficiary: repo-long-term Location: docs/rfcs/0009-pyspy-receiver-scope.md §References plus M13 citation corrections at MILESTONES.md:444-445 Description: doc-check verifies on-disk links only; external GitHub URLs would rot silently if upstream issues move. Proof: doc-check output "252 markdown link(s) resolve to on-disk files" — no counter for external URLs. Contradiction: link-checker infrastructure is heavier than warranted for v0.1; industry standard accepts external link rot as best-effort. Action: explicitly-skipped Rationale: cost > value; manual review during reads suffices. PRINCIPLES §6 met (every cited claim has a URL). P1.3 [NIT] Internal section-anchor brittleness Beneficiary: repo-long-term Location: MILESTONES.md M13 references "RFC-0009 §Design overview", "§Wire protocol", "§Safety properties" Description: doc-check verifies file existence, not anchor existence. Future RFC heading rename silently breaks back-references. Proof: `grep -c "RFC-0009 §" MILESTONES.md` returns 3; no automated test asserts those anchors resolve. Contradiction: at land time anchors resolve; future rename touches the author who is expected to grep for back-references. Action: deferred (same family as existing FOLLOWUPS rows on docs- parity helpers; no new row added) Rationale: low-frequency failure mode, manual mitigation sufficient. P1.4 [NIT] Concurrent FOLLOWUPS.md edit conflict Beneficiary: repo-long-term Location: docs/FOLLOWUPS.md new section at file tail Description: My M13 section sits at end-of-file; concurrent PRs appending elsewhere merge cleanly; same-tail concurrent appends conflict. Proof: PR #77 recently edited FOLLOWUPS.md; my branch was rebased onto current main. Contradiction: standard git merge handles this; conflict is expected workflow, not a defect. Action: explicitly-skipped Rationale: not a defect. P1.5 [NIT] fnv128a collision math n=10^7 not workload-cited Beneficiary: repo-long-term Location: docs/rfcs/0009-pyspy-receiver-scope.md §Design overview Description: The (2.7e-6, 1.5e-25) numbers cite the birthday-bound formula `n^2 / (2 * 2^k)` but the `n=10^7 distinct (rank, stack) pairs per fleet-day` upper bound is asserted without measurement. Proof: Python repl: `1e7**2/(2*2**64) = 2.71e-6`; `1e7**2/(2*2**128) = 1.47e-25`. Math is correct; n is an estimate. Contradiction: reader can re-derive for any n trivially from the inline formula. RFC text is falsifiable in shape. Action: explicitly-skipped Rationale: PRINCIPLES §6 met — formula provided; n is a reasonable upper-bound estimate, tightening is polish not correctness. P1.6 [CONCERN] Non-CPython interpreter support boundary undocumented Beneficiary: customer-adopter Location: docs/rfcs/0009-pyspy-receiver-scope.md §Degraded modes Description: `faulthandler_missing` posture covers CPython-built- without-faulthandler but the RFC does not enumerate PyPy / MicroPython / GraalPy. An adopter on a non-CPython runtime would hit a confusing pip-install failure rather than a clean error. Proof: PyPy older versions: `import faulthandler` raises ImportError; GraalPy support is divergent across versions. Contradiction: ML training is overwhelmingly CPython; non-CPython in training is rare; Phase 4 operator docs are the right venue. TDD record: deferred — defended by operator docs (Phase 4). Action: deferred docs/FOLLOWUPS.md M13 section (this commit) Rationale: real constraint, low-frequency, doc-mitigation correct venue. Validation-cycle stats: - Findings rejected during contradict: 0 - Findings whose hard-proof did not reproduce: 0 TDD discipline stats: - New code changes landed via failing-test-first: 0 (doc-only PR) - Tests with mutation-verify outcome recorded: 0 - Tests reproducibility-verified: 0 - doc-check runs in draft cycle: 6 (all clean) Edge cases enumerated: - Internal link rot → defended by doc-check (252 links resolve) - External link rot → P1.2 explicitly-skipped - Anchor brittleness → P1.3 deferred - Concurrent edits → P1.4 explicitly-skipped (standard merge) - Collision math → P1.5 explicitly-skipped (formula inline) - Non-CPython runtime → P1.6 deferred to FOLLOWUPS - Rubric drift → covered by P1.4 Rubric additions proposed in Phase 1: none. Initial rubric (repo- standards) covers the doc-only diff. Signed-off-by: Tri Lam <trilamsr@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
After PR #72's verify split (wall 427s → 242s actual, vs 155s projected), investigated whether setup-go cache sharing could close the gap. It can't: all three
verify-*jobs already share the same cache key and hit it. The gap isverify-testbounded at 235s bymake coverage-check(race-ongo test ./...across 50+ packages). Records two follow-ups indocs/FOLLOWUPS.mdso future work doesn't re-chase setup-go:coverage-checksharding across multiple jobs — real wall-time win (~150s target) but a real refactor (split test execution + coverage aggregation). Triggered by PR throughput becoming binding.build-tagsandbuildfrom verify-static into verify-lint — saves ~17% runner-minutes via dedupe of compile work, zero wall-time. Cheap, but no speed benefit; triggered by runner-minute pressure.Test plan
make doc-checkclean (banned-phrase, link-resolution,(unverified)baseline).