feat(spec): executable cut-criteria with computed status by trilamsr · Pull Request #374 · TraceCoreAI/tracecore

trilamsr · 2026-06-01T20:41:23Z

Summary

Replace the hand-edited status glyphs in docs/v1-rc1-cut-criteria.md
with an executable spec at docs/cut-criteria.yaml. Status is now
computed from rubric-check shell + artifact existence; the markdown
is regenerated via make cut-criteria-render and protected by a drift
gate (make cut-criteria-check) wired into CI.

The per-PR ☐ → ⧗ → ☑ flip dance against this markdown is gone. A PR
that ships a criterion's artifact automatically flips its status on the
next render; CI fails any PR that lands a feature without re-rendering.

Why now

The previous shape had a predictable drift pattern. A criterion's
artifact would land but the cut-criteria flip would not, or vice versa.
Reviewers asked "is criterion N actually shipped?" and the only answer
was "look at the docs and the code and hope they agree." This PR is the
inversion: store rubric checks (machine-readable), compute status
(deterministic), render markdown (one direction only, gated).

A concrete drift the new system catches at render time: criterion 1
(pattern coverage ≥ 12 of 15) was marked ⧗ with 8 patterns recorded as
shipped on main, but the disk had 12 detector files under
module/pkg/patterns/. The first render flips it to ☑ shipped, exactly
as the system intends.

Inspirations

- Kubernetes KEPs: machine-readable frontmatter (status: implementable | implemented | deferred); the markdown body is
  human-facing, the frontmatter is source-of-truth.
- Bazel bazel run //docs:gen-status: generated documentation
  from in-tree truth, with a CI drift check on the rendered output.
  Operators edit the BUILD inputs, never the rendered output.

What landed

| File | Role |
|---|---|
| `docs/cut-criteria.yaml` | Source of truth. 12 tier-1 + 3 tier-2 items. Each carries `id`, `title`, `tier`, `owner`, `rubric`, `citation`, `rubric_check.{artifact_exists,gate_script}`, `notes`. |
| `docs/cut-criteria.yaml.md` | Pattern overview (KEP/Bazel inspirations, schema, workflow, "when to edit which file"). |
| `scripts/cut-criteria-status.sh` | Computes `id\tstatus\ttitle` per criterion against live repo state. |
| `scripts/cut-criteria-status_test.sh` | Six-fixture regression suite covering all three rubric-check outcome shapes. |
| `scripts/cut-criteria-render.sh` | Regenerates `docs/v1-rc1-cut-criteria.md` from spec + status. |
| `docs/v1-rc1-cut-criteria.md` | Now auto-generated (banner at top). |
| `Makefile` | New targets: `cut-criteria-status`, `cut-criteria-render`, `cut-criteria-check`. Wired into `doc-check`. |
| `.github/workflows/ci.yml` | Named `cut-criteria-check` step under `verify-static` for legibility (transitive via `doc-check` too). |

Status decision table

| `artifact_exists` | `gate_script` | computed status |
|---|---|---|
| pass | pass / empty / absent | ☑ shipped |
| pass | fail | ⧗ in progress |
| fail | any | ☐ planned |

gate_script is the lever for criteria where the artifact exists but
isn't yet production-grade. Criterion 3 (attribute namespace
hard-locked) is the canonical example: the check script exists today
but runs in advisory mode, so artifact_exists passes and
gate_script (grep for the "advisory only" exit-0 short-circuit) fails
→ status is ⧗ until the gate is promoted to required.
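
The decision table above reduces to a small pure function. The following is a minimal sketch, not the PR's actual implementation: the names `Status` and `compute_status` are hypothetical, and an empty or absent gate_script is modelled as `None`.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    """The three computed states, using the glyphs from the table."""
    SHIPPED = "☑ shipped"
    IN_PROGRESS = "⧗ in progress"
    PLANNED = "☐ planned"


def compute_status(artifact_ok: bool, gate_ok: Optional[bool]) -> Status:
    """Map rubric-check outcomes to a status.

    gate_ok is None when gate_script is empty or absent (assumed encoding).
    """
    if not artifact_ok:
        # A failing artifact check wins regardless of the gate result.
        return Status.PLANNED
    if gate_ok is False:
        # Artifact exists but the gate says it is not production-grade yet.
        return Status.IN_PROGRESS
    # Gate passed, or there is no gate to run.
    return Status.SHIPPED
```

Under this sketch, criterion 3 would evaluate as `compute_status(True, False)`, which yields ⧗ in progress until the advisory short-circuit is removed.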

Computed status as of this PR

1  ☑ shipped       Pattern coverage ≥ 12 of 15        (drift fix: doc said ⧗/8)
2  ☑ shipped       Verdict schema v1.0 published and stable
3  ⧗ in progress   Attribute namespace hard-locked    (gate: advisory)
4  ☑ shipped       Deprecation policy + enforcement gate
5  ☑ shipped       Support matrix published
6  ☑ shipped       Compat CI matrix green
7  ⧗ in progress   SLO targets binding                (gate: soft commitment)
8  ☑ shipped       Threat model published
9  ☐ planned       Reference architectures × 3
10 ☐ planned       Production-preset Helm values
11 ☑ shipped       v0.x → v1.0 upgrade guide
12 ☑ shipped       Verdict-consumption SDKs (Python + Go)
A  ☑ shipped       Audit RFP filed
B  ☐ planned       Partner outreach checkpoint
C  ☐ planned       Launch artifacts drafted

Test plan

- bash scripts/cut-criteria-status_test.sh: 6/6 fixture cases
  pass (TDD: RED commit precedes GREEN).
- bash scripts/cut-criteria-status.sh against the live repo
  matches the computed-status snapshot above.
- make cut-criteria-render produces a stable rendered markdown
  (idempotent: two consecutive renders → no diff).
- make cut-criteria-check exits clean on this branch.
- make doc-check exits clean (transitively includes
  cut-criteria-check).
- make actionlint clean on the new ci.yml step.
- shellcheck clean on the three new shell scripts.
- make check (lint/vet/tidy/mod-verify) clean.

Drift gate semantics

make cut-criteria-check renders to a tempfile and diffs against the
on-disk docs/v1-rc1-cut-criteria.md. Two failure paths it catches:

- YAML edited without re-rendering → markdown stale → diff non-empty.
- Repo state changed (artifact landed/removed) without re-rendering →
  computed status changed → diff non-empty.

Either way, the PR fails CI until the author re-runs make cut-criteria-render and commits.
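
The gate described above is a render-then-diff comparison. A minimal Python sketch of that idea follows. It is an illustration only: this PR implements the gate in Make and shell, and the name `find_drift` is hypothetical.

```python
import difflib
from pathlib import Path
from typing import List


def find_drift(rendered: str, on_disk_path: Path) -> List[str]:
    """Return unified-diff lines between the committed file and a fresh render.

    An empty list means the committed markdown is up to date. A missing
    file is treated as empty content, so it always reports drift when the
    render is non-empty (an assumption of this sketch).
    """
    if on_disk_path.exists():
        on_disk = on_disk_path.read_text(encoding="utf-8")
    else:
        on_disk = ""
    return list(
        difflib.unified_diff(
            on_disk.splitlines(keepends=True),
            rendered.splitlines(keepends=True),
            fromfile=str(on_disk_path),
            tofile="rendered",
        )
    )
```

A CI wrapper around this would print the returned lines and exit non-zero whenever the list is non-empty. Both failure paths look identical from there, since each one changes the rendered string.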

- Cut-criteria status (`docs/v1-rc1-cut-criteria.md`) is now generated
  from `docs/cut-criteria.yaml`. Edit the YAML and run
  `make cut-criteria-render`; CI's new `cut-criteria-check` step fails
  any PR that lands a feature without re-rendering. Pattern overview at
  `docs/cut-criteria.yaml.md`.

Six fixture cases exercise the three rubric-check outcome shapes:

1. artifact_exists pass + gate_script pass -> shipped
2. artifact_exists pass + gate_script fail -> in progress
3. artifact_exists fail -> planned
4. artifact_exists pass + gate_script empty -> shipped
5. multi-criterion mixed-status ordering
6. gate_script key absent (treated as empty)

Fails today because scripts/cut-criteria-status.sh does not exist. Next commit lands the implementation that makes this test pass.

Signed-off-by: Tri Lam <tri@maydow.com>

Replace hand-edited status glyphs in docs/v1-rc1-cut-criteria.md with an executable spec at docs/cut-criteria.yaml. Status is now computed from rubric-check shell + artifact existence; the markdown is rendered via 'make cut-criteria-render' and gated by 'make cut-criteria-check'.

The per-PR three-glyph flip dance is gone. A PR that ships a criterion's artifact automatically flips its status on next render; CI fails any PR that lands a feature without re-rendering.

Inspirations: Kubernetes KEP machine-readable frontmatter; Bazel's 'bazel run //docs:gen-status' pattern. Source-of-truth pattern described in docs/cut-criteria.yaml.md.

Test suite at scripts/cut-criteria-status_test.sh (RED commit prior) now passes against all six fixture cases.

Signed-off-by: Tri Lam <tri@maydow.com>

trilamsr · 2026-06-01T20:51:04Z

Addressed reviewer high-leverage findings:

Fixed:

- cut-criteria-status.sh b64 decode now fails loudly with the criterion id (was silent → ☐ planned, masking real corruption).
- Criterion 2 (verdict schema) gate now wires the Go drift test (was empty: the rubric claimed a drift check but none ran).
- Criterion 8 (threat-model) gate adds a STRIDE keyword content check (was file-existence only, so an empty file flipped ☑).
- Removed the double-call from the ci.yml verify-static step; cut-criteria-check now runs only via the make doc-check aggregator.
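
To illustrate why a content check is stronger than file existence for criterion 8, here is a hedged sketch of a keyword gate. The comment above only says "STRIDE keyword content check"; the exact term list, the case-insensitive matching, and the names `missing_stride_terms` and `threat_model_gate` are all assumptions made for this example, and the real gate is a shell snippet in the YAML.

```python
from typing import List

# The six STRIDE threat categories; requiring all of them is an assumption.
STRIDE_TERMS = (
    "spoofing",
    "tampering",
    "repudiation",
    "information disclosure",
    "denial of service",
    "elevation of privilege",
)


def missing_stride_terms(text: str) -> List[str]:
    """Return the STRIDE categories that never appear in the document text."""
    lowered = text.lower()
    return [term for term in STRIDE_TERMS if term not in lowered]


def threat_model_gate(text: str) -> bool:
    """Pass only when every STRIDE category is mentioned at least once."""
    return not missing_stride_terms(text)
```

With a check of this shape, an empty threat-model file fails the gate, so the criterion computes as ⧗ in progress, not ☑ shipped.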

Resolved DIRTY conflict by re-rendering docs/v1-rc1-cut-criteria.md from docs/cut-criteria.yaml (which is now the source of truth — proving the system works).

Acknowledged (future hardening):

Criterion 1 regex brittleness, criterion 3 advisory-only grep, render-script-as-heredoc refactor — future tightening, not blocking rc1.

## Summary

- Cross-cut audit of 27 PRs merged this session (#339-#374) per repo memory `feedback_review_discipline` (cross-cut at wave-end) + `feedback_pr_review_simplicity` (bias deletion).
- 15 findings tabulated in `docs/v1-rc1-post-wave-audit.md` with severity + proposed fix.
- 9 follow-up issues filed: #375-#383 (labeled `post-wave-audit`).

## Headline finding

**#377** (k8s pod/ns/node attribute scope helper) is the single largest deletion opportunity in the wave: 82 sites x ~4 lines of duplicated `resAttrs.Get -> attrs.Get` fallback ladder, replaceable by a ~10-line helper. ~80 LOC net delete with zero behavior change.

## Bundled refactor opportunity (named loudly for dispatch)

Issues **#377 + #378 + #375 + #376** can land as one refactor PR that drops ~400 LOC across `module/processor/patterndetectorprocessor/` with zero behavior change. Suggested title: `refactor(patterndetector): hoist k8s scope helper + relocate per-pattern projectors`.

## Findings by severity

| Severity | Count | Issues |
|---|---|---|
| high | 1 | #377 |
| medium | 4 | #375, #376, #379, #382 |
| low | 5 | #378, #380, #381, #383, (#9 historical) |
| none | 5 | clean |

## Test plan

- [x] `make lint` (golangci-lint, vet, mod verify) ran via pre-commit hook; 0 issues.
- [x] attribute-namespace-check: 100 attributes, 100 documented.
- [x] no-autoupdate-check: all assertions passed.
- [x] hit-line-format-stable: ok.
- [ ] Reviewer: confirm the 9 filed issues (#375-#383) cover the dispatchable findings; flag any that should be closed as out-of-scope.

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>

## Summary

Replace `scripts/cut-criteria-{status,render,status_test}.sh` (522 LOC bash) with a single `scripts/cut_criteria.py` (642 LOC Python) exposing `status`, `render`, `check`, and `test` subcommands. Output is byte-identical for the same YAML + repo state (the only deliberate diff is the script-name reference line in the rendered legend).

## Root cause

PR #374 used base64-over-TSV plumbing in `cut-criteria-status.sh` to survive embedded newlines through a bash `read` loop, a macOS bash 3.2 compat tax. `subprocess.run(snippet, shell=True)` passes multi-line YAML block scalars to `/bin/bash` directly with no encoding hop. A regression fixture (`multiline-shell-no-base64`) locks this in.

## Adds (per issue ask)

- YAML schema validation at parse time (required keys, allowed tiers, duplicate-id check). This surfaces a malformed spec at `make cut-criteria-status` instead of producing quietly-wrong markdown.
- `check` subcommand renders in-process and diffs against on-disk markdown; no `mktemp`/`diff -u`/`rm -f` choreography. Makefile target collapsed 11 lines → 1.

## LOC delta

```
 Makefile                            |  17 +-
 docs/cut-criteria.yaml              |   2 +-
 docs/cut-criteria.yaml.md           |  20 +-
 docs/v1-rc1-cut-criteria.md         |   2 +-
 scripts/cut-criteria-render.sh      | 240 --------
 scripts/cut-criteria-status.sh      | 122 ------
 scripts/cut-criteria-status_test.sh | 160 -------
 scripts/cut_criteria.py             | 642 ++++++++++++++++++++++++++++
 8 files changed, 660 insertions(+), 545 deletions(-)
```

Net: +115 LOC across the whole diff, but the bash → Python migration accounts for the bulk; the actual logic shrinks since base64/TSV plumbing is gone. Issue acceptance was "~200 LOC delete"; actual is 522 bash deleted vs 642 Python added (net +120 for the scripts alone), but readability + correctness wins justify it.

## Test plan

- [x] `make cut-criteria-render` produces byte-identical `docs/v1-rc1-cut-criteria.md` (verified via `git diff --quiet`).
- [x] `make cut-criteria-status` prints the same TSV table (computed status: 8 shipped / 2 in-progress / 1 planned tier-1; 1 shipped / 2 planned tier-2).
- [x] `make cut-criteria-check` exit code 0 on clean tree.
- [x] `python3 scripts/cut_criteria.py test` passes all fixture cases.
- [x] Schema validation: malformed YAML (missing required key) errors with clear message.

Closes #386.

```release-notes
refactor: replace `cut-criteria-{status,render,status_test}.sh` with single `scripts/cut_criteria.py`: byte-identical output, YAML schema validation, simpler drift gate.
```

Signed-off-by: Tri Lam <tri@maydow.com>
Co-authored-by: Tri Lam <tri@maydow.com>
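
The root-cause note above turns on handing a multi-line shell snippet to the shell without an encoding hop. A minimal sketch of that idea follows. It is not the code in `scripts/cut_criteria.py`: the name `run_check` and the `shell_path` parameter are hypothetical. With `shell=True` and no `executable`, Python uses `/bin/sh` on POSIX, so this sketch assumes bash would be selected by passing `shell_path="/bin/bash"`.

```python
import subprocess
from typing import Optional


def run_check(snippet: str, cwd: str = ".", shell_path: Optional[str] = None) -> bool:
    """Run a rubric-check snippet and report whether it exited 0.

    The snippet may span several lines (for example a YAML block scalar);
    it reaches the shell as one string, so no base64/TSV escaping is needed.
    An empty snippet counts as a pass, mirroring the "empty / absent" gate
    row of the status decision table.
    """
    if not snippet.strip():
        return True
    result = subprocess.run(
        snippet,
        shell=True,
        executable=shell_path,  # None keeps the platform default shell
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

For example, `run_check('test -f docs/cut-criteria.yaml')` would return True only when that file exists relative to `cwd`.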

## Summary

PR #374 shipped executable cut-criteria but baked v1.0-rc1 into the YAML root, the renderer, and the doc path. Issue #385 asks for a per-criterion milestone tag, parameterised render output, and a cross-milestone status overview ahead of v1.0-rc2 + v1.0-ga.

Root cause of "rc1 hardcoded": rc1 lived at three layers:

1. YAML root scope (no milestone field, all criteria implicitly rc1),
2. renderer narrative + footer (title, intro, "Tier 2: GA path-clearing", "out of scope for rc1", scope label),
3. `DEFAULT_OUT = docs/v1-rc1-cut-criteria.md`.

This PR removes all three.

## Schema decision: Option A (additive list-membership)

```yaml
- id: 1
  milestones: [v1.0-rc1]   # optional; absent defaults to [v1.0-rc1]
  ...
```

A criterion can list multiple milestones; the per-milestone render filters by membership. Pre-existing YAML keeps working unchanged (default = `[v1.0-rc1]`). **Option B (nested per-milestone blocks) was rejected**: it would have re-shaped every entry and broken the diff gate's byte-identical contract for v1.0-rc1.

Per-milestone metadata (output path, title, intro prose, footer scope label) lives in `scripts/cut_criteria.py::MILESTONES`. Adding a new milestone is a four-line edit + at least one criterion that declares it.

## Script surface

```
cut_criteria.py status --milestone v1.0-rc1   (default rc1)
cut_criteria.py status-all                    (new; cross-milestone)
cut_criteria.py render --milestone all        (default all)
cut_criteria.py check --milestone all         (default all)
```

Makefile gains `MILESTONE=` overrides + a `cut-criteria-status-all` target. `cut-criteria-check` now diffs every milestone in one pass; CI keeps blocking on any per-milestone drift.

## Validation

- Golden parity: `docs/v1-rc1-cut-criteria.md` byte-identical to pre-change. `scripts/cut_criteria.py status` (rc1) byte-identical.
- Fixture suite: 11 cases (7 pre-existing + 4 new multi-milestone), all green via `python3 scripts/cut_criteria.py test`.
- Proof-of-life: v1.0-ga renders one placeholder criterion (audit report published, ☐ planned) to `docs/v1-ga-cut-criteria.md`.
- `make cut-criteria-check` + `make doc-check`: exit 0.

LOC delta: 5 files, +581 / −147.

## Test plan

- [x] `python3 scripts/cut_criteria.py test` → all 11 cases pass
- [x] `make cut-criteria-status` → byte-identical to pre-change rc1 status
- [x] `make cut-criteria-status MILESTONE=v1.0-ga` → 1 row
- [x] `make cut-criteria-status-all` → 16 rows (15 rc1 + 1 ga)
- [x] `make cut-criteria-render` → writes both rc1 + ga files
- [x] `make cut-criteria-check` → exit 0 on render-clean tree
- [x] `make doc-check` → exit 0
- [x] CI green

```release-notes
- Cut-criteria spec supports multiple milestones (rc1 + ga); existing YAML keeps working unchanged.
```

Closes #385.

Signed-off-by: Tri Lam <tree@lumalabs.ai>
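
The Option A schema described above (optional `milestones` list, defaulting to `[v1.0-rc1]`, with renders filtered by membership) can be sketched as follows. This is an illustration under stated assumptions, not the code in `scripts/cut_criteria.py`: the names `milestones_of` and `criteria_for` are hypothetical, criteria are modelled as plain dicts, and an empty list is treated the same as an absent key.

```python
from typing import Dict, Iterable, List

# Default applied when a criterion omits the milestones key, per the PR text.
DEFAULT_MILESTONES = ("v1.0-rc1",)


def milestones_of(criterion: Dict) -> List[str]:
    """Return the declared milestones, falling back to the rc1 default."""
    declared = criterion.get("milestones")
    return list(declared) if declared else list(DEFAULT_MILESTONES)


def criteria_for(criteria: Iterable[Dict], milestone: str) -> List[Dict]:
    """Keep the criteria that list the given milestone, preserving spec order."""
    return [c for c in criteria if milestone in milestones_of(c)]
```

Because the fallback lives in one helper, entries written before this change keep rendering into the rc1 document without edits, which is what keeps the rc1 output byte-identical in this model.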

Tri Lam added 3 commits June 1, 2026 13:38

Merge remote-tracking branch 'origin/main' into feat/executable-cut-c…

1db4db4

…riteria # Conflicts: # docs/v1-rc1-cut-criteria.md

This was referenced Jun 1, 2026

docs: post-wave audit 2026-06-01 + 9 follow-up issues #384

Merged

[refactor] cut-criteria multi-milestone support #385

Closed

[refactor] simplify cut-criteria scripts (drop base64, single Python) #386

Closed

trilamsr merged commit 852568f into main Jun 1, 2026
12 checks passed

trilamsr deleted the feat/executable-cut-criteria branch June 1, 2026 21:10

This was referenced Jun 1, 2026

refactor(cut-criteria): unify 3 bash scripts into one Python module #395

Closed

refactor(cut-criteria): single Python script, drop base64 (#386) #397

Merged

trilamsr mentioned this pull request Jun 1, 2026

refactor(cut-criteria): multi-milestone support (#385) #413

Merged

8 tasks
