A16: per-detector allocs/event bench + ratchet gate #302

Description

@trilamsr

Roadmap

Item A16 from memory/project_horizon_roadmap.md. Work stopped mid-session, after the agent had measured the baseline and drafted the gate.

Scope

Per-detector allocs/event bench + regression gate via make bench-check. Baselines measured this session (Apple M1 Max):

Detector          N     allocs/op  allocs/event
pod_evicted       1024  15635      15.27
nccl_hang         800   3177       3.97
xid_correlation   200   819        4.09
hbm_ecc           200   1622       8.11
thermal_throttle  240   892        3.72
pcie_aer          200   2722       13.61

All exceed the 2 allocs/event NORTHSTAR target. Plan: set ceilings at the measured baseline and file per-detector optimization issues. Ceilings ratchet DOWNWARD as optimization PRs land; they never go up without adversarial review.
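The normalization and the ratchet rule can be sketched in a few lines of Go. This is an illustrative sketch only: `AllocsPerEvent`, `Ratchet`, and `ErrCeilingRaised` are hypothetical names, and the assumption is that allocs/event is simply allocs/op divided by the N events fed to one benchmark operation, which matches the table.

```go
package main

import (
	"errors"
	"fmt"
)

// northstarTarget is the 2 allocs/event target quoted in the issue.
const northstarTarget = 2.0

// AllocsPerEvent normalizes a benchmark's allocs/op by the number of
// events (N) processed in one benchmark operation.
func AllocsPerEvent(allocsPerOp int64, eventsPerOp int) float64 {
	return float64(allocsPerOp) / float64(eventsPerOp)
}

// ErrCeilingRaised signals an attempt to loosen a ceiling.
var ErrCeilingRaised = errors.New("ceiling may only ratchet downward")

// Ratchet accepts a proposed ceiling only if it does not exceed the
// current one. Raising a ceiling is left to a manual review path that
// this sketch does not model.
func Ratchet(current, proposed int) (int, error) {
	if proposed > current {
		return current, ErrCeilingRaised
	}
	return proposed, nil
}

func main() {
	perEvent := AllocsPerEvent(15635, 1024) // pod_evicted row from the table
	fmt.Printf("pod_evicted: %.2f allocs/event (target %.0f)\n", perEvent, northstarTarget)

	if _, err := Ratchet(16, 17); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Keeping the ratchet check separate from the measurement means a ceiling change can be validated without rerunning any benchmark.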

Deliverables

  • module/pkg/patterns/{detector}_bench_test.go per detector (5 files; pod_evicted exists already)
  • scripts/bench-registry.sh: extend bench_entries + add allocs_gate array (bench_name|N|ceiling)
  • scripts/bench-allocs-check.sh (new): reads allocs_gate, extracts the median allocs/op from module/pkg/patterns/testdata/bench-baseline.txt, and fails if any ceiling is exceeded
  • Wire into make bench-check
  • Regenerate module/pkg/patterns/testdata/bench-baseline.txt with count=10 for benchstat
  • Per-detector optimization tracking issues (5; pod_evicted is skipped because it is already at its native limit, pending re-investigation)
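The gate logic described for scripts/bench-allocs-check.sh can be sketched as follows. The deliverable itself is a shell script; this Go version only illustrates the steps (parse a `bench_name|N|ceiling` record, take the median allocs/op across repeated runs, compare). It assumes the ceiling is expressed in allocs/event, that the baseline file holds standard `go test -bench -benchmem` output, and that `BenchmarkPodEvicted` is the benchmark name; all three are assumptions, and the ns/op and B/op values in the sample are placeholders.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// GateEntry mirrors one assumed allocs_gate record: bench_name|N|ceiling.
type GateEntry struct {
	Bench   string
	N       int
	Ceiling float64 // assumed unit: allocs/event
}

// ParseGate splits a "bench_name|N|ceiling" record.
func ParseGate(rec string) (GateEntry, error) {
	parts := strings.Split(rec, "|")
	if len(parts) != 3 {
		return GateEntry{}, fmt.Errorf("malformed gate record %q", rec)
	}
	n, err := strconv.Atoi(parts[1])
	if err != nil || n <= 0 {
		return GateEntry{}, fmt.Errorf("bad N in %q", rec)
	}
	ceiling, err := strconv.ParseFloat(parts[2], 64)
	if err != nil {
		return GateEntry{}, fmt.Errorf("bad ceiling in %q", rec)
	}
	return GateEntry{Bench: parts[0], N: n, Ceiling: ceiling}, nil
}

// benchName strips a trailing "-<GOMAXPROCS>" suffix if one is present.
func benchName(field string) string {
	if i := strings.LastIndex(field, "-"); i > 0 {
		if _, err := strconv.Atoi(field[i+1:]); err == nil {
			return field[:i]
		}
	}
	return field
}

// MedianAllocsPerOp collects every allocs/op sample for bench and
// returns the median.
func MedianAllocsPerOp(output, bench string) (float64, error) {
	var samples []float64
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) == 0 || benchName(f[0]) != bench {
			continue
		}
		// The value precedes its unit, e.g. "15635 allocs/op".
		for i := 2; i < len(f); i++ {
			if f[i] != "allocs/op" {
				continue
			}
			v, err := strconv.ParseFloat(f[i-1], 64)
			if err != nil {
				return 0, err
			}
			samples = append(samples, v)
		}
	}
	if len(samples) == 0 {
		return 0, fmt.Errorf("no allocs/op samples for %s", bench)
	}
	sort.Float64s(samples)
	mid := len(samples) / 2
	if len(samples)%2 == 1 {
		return samples[mid], nil
	}
	return (samples[mid-1] + samples[mid]) / 2, nil
}

// Check returns an error when the median allocs/event exceeds the ceiling.
func Check(output string, g GateEntry) error {
	median, err := MedianAllocsPerOp(output, g.Bench)
	if err != nil {
		return err
	}
	if perEvent := median / float64(g.N); perEvent > g.Ceiling {
		return fmt.Errorf("%s: %.2f allocs/event exceeds ceiling %.2f", g.Bench, perEvent, g.Ceiling)
	}
	return nil
}

func main() {
	// Placeholder timing and byte counts; only allocs/op comes from the issue.
	sample := "BenchmarkPodEvicted-10 100 1000 ns/op 2048 B/op 15635 allocs/op\n"
	g, err := ParseGate("BenchmarkPodEvicted|1024|16")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("gate result:", Check(sample, g))
}
```

Using the median across the count=10 runs keeps a single noisy sample from tripping or masking the gate.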

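Initial ceilings (measured baseline rounded up to the next whole alloc/event, i.e. under 1 alloc/event of slack)

  • pod_evicted: 16 allocs/event at N = 1024
  • nccl_hang: 4 allocs/event at N = 800
  • xid_correlation: 5 allocs/event at N = 200
  • hbm_ecc: 9 allocs/event at N = 200
  • thermal_throttle: 4 allocs/event at N = 240
  • pcie_aer: 14 allocs/event at N = 200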
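The ceilings listed above are consistent with rounding each measured allocs/event figure up to the next integer. A minimal sketch of that derivation, using the baseline numbers from the table, is shown below; `Baseline` and `Ceiling` are hypothetical names, and the per-op figure is just the per-event ceiling multiplied by N, in case the gate ends up comparing raw allocs/op.

```go
package main

import "fmt"

// Baseline holds one row of the measured table.
type Baseline struct {
	Detector    string
	N           int // events per benchmark operation
	AllocsPerOp int
}

// Baselines copies the measurements quoted in the issue.
var Baselines = []Baseline{
	{"pod_evicted", 1024, 15635},
	{"nccl_hang", 800, 3177},
	{"xid_correlation", 200, 819},
	{"hbm_ecc", 200, 1622},
	{"thermal_throttle", 240, 892},
	{"pcie_aer", 200, 2722},
}

// Ceiling rounds allocs/event up to the next integer using integer
// arithmetic, and also returns the equivalent allocs/op bound.
func Ceiling(b Baseline) (perEvent, perOp int) {
	perEvent = (b.AllocsPerOp + b.N - 1) / b.N
	return perEvent, perEvent * b.N
}

func main() {
	for _, b := range Baselines {
		perEvent, perOp := Ceiling(b)
		fmt.Printf("%-17s %2d allocs/event (%d allocs/op at N=%d)\n",
			b.Detector, perEvent, perOp, b.N)
	}
}
```

Integer ceiling division avoids any floating-point rounding surprises when a baseline sits close to a whole number.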

Acceptance

  • make bench-check green w/ committed baseline
  • Synthetic 10x allocation-regression fixture trips the gate
  • Per-detector optimization issues filed before ceiling-ratchet PRs
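One way to build the synthetic fixture from the acceptance list, assuming it works by inflating allocation counts tenfold, is sketched below with the standard library's `testing.AllocsPerRun`. The `allocate`, `MeasureAllocs`, and `TripsGate` helpers are hypothetical, and the ceiling of 5 is chosen only for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so each allocation escapes to the heap.
var sink []byte

// allocate performs n heap allocations, standing in for the work a
// detector does per event.
func allocate(n int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, 64)
	}
}

// MeasureAllocs reports the average number of allocations made by one
// call to allocate(n).
func MeasureAllocs(n int) float64 {
	return testing.AllocsPerRun(100, func() { allocate(n) })
}

// TripsGate reports whether a measurement exceeds the ceiling.
func TripsGate(allocsPerEvent, ceiling float64) bool {
	return allocsPerEvent > ceiling
}

func main() {
	const ceiling = 5.0 // illustrative ceiling, not a committed value

	normal := MeasureAllocs(4)     // healthy path
	regressed := MeasureAllocs(40) // synthetic 10x regression

	fmt.Println("normal trips gate:   ", TripsGate(normal, ceiling))
	fmt.Println("regressed trips gate:", TripsGate(regressed, ceiling))
}
```

A fixture like this checks the gate itself: if the regressed case ever passes, the gate is not measuring what it claims to.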

Adopt-over-build

  • stdlib testing.B + -benchmem only
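A per-detector benchmark built only on the standard library might look like the sketch below. The detector here is a toy stand-in (`toyDetect`, `event`, and `BenchmarkToyDetector` are hypothetical, not code from module/pkg/patterns); the point is the shape: build N events once, call `b.ReportAllocs()`, reset the timer, and process the whole batch per iteration so that allocs/op divided by N gives allocs/event. To keep the block runnable on its own, `main` drives it through `testing.Benchmark` instead of `go test`.

```go
package main

import (
	"fmt"
	"testing"
)

// eventsPerOp is the N column: events processed per benchmark operation.
const eventsPerOp = 200

type event struct {
	Node string
	Code int
}

// sinkHits keeps the result alive so the compiler cannot elide the work.
var sinkHits []string

// toyDetect is a stand-in detector that flags events with a non-zero code.
func toyDetect(events []event) []string {
	var hits []string
	for _, e := range events {
		if e.Code != 0 {
			hits = append(hits, e.Node+":flagged")
		}
	}
	return hits
}

// BenchmarkToyDetector processes one batch of eventsPerOp events per iteration.
func BenchmarkToyDetector(b *testing.B) {
	events := make([]event, eventsPerOp)
	for i := range events {
		events[i] = event{Node: fmt.Sprintf("node-%d", i), Code: i % 2}
	}
	b.ReportAllocs()
	b.ResetTimer() // exclude fixture setup from timing and alloc counts
	for i := 0; i < b.N; i++ {
		sinkHits = toyDetect(events)
	}
}

// AllocsPerEventOf runs a benchmark and normalizes allocs/op by n events.
func AllocsPerEventOf(bench func(*testing.B), n int) float64 {
	r := testing.Benchmark(bench)
	return float64(r.AllocsPerOp()) / float64(n)
}

func main() {
	fmt.Printf("toy detector: %.2f allocs/event\n",
		AllocsPerEventOf(BenchmarkToyDetector, eventsPerOp))
}
```

In a real `_bench_test.go` file the benchmark function would live in the package under test and be run with something like `go test -run '^$' -bench . -benchmem -count=10`, whose output benchstat can then summarize.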

Out of scope

  • Detector hot-path optimization itself (separate PRs per optimization issue)

Labels: enhancement
