fix(fixtures): pin pattern.id to detector consts (audit #421)#425

Merged: trilamsr merged 1 commit into main from fix/audit-421-fixture-pattern-id-drift on Jun 1, 2026

@trilamsr trilamsr commented Jun 1, 2026

Summary

Audit issue #421 finding-2 root-fix. PR #398 landed the canonical v1.0-rc1 SDK fixture at docs/schemas/fixtures/shipped-patterns-v1.0.0-rc1.json with five rows carrying wrong pattern.id values — the fixture was hand-typed by sequential counter (..14, 15, 16, 17, 18, 19, 20, 21) instead of being copied from the PatternID* constants in module/pkg/patterns/*.go. The envelope schema only constrains pattern.id to type:string, minLength:1, so per-pattern envelope tests round-tripped the drift silently.

Before / after pattern.id

| fixture row | before | canonical (module/pkg/patterns/*.go) |
| --- | --- | --- |
| hbm_ecc | "17" | PatternIDHBMECC = "3" |
| thermal_throttle | "18" | PatternIDThermalThrottle = "4" |
| pcie_aer | "19" | PatternIDPCIeAER = "5" |
| cuda_oom | "20" | PatternIDCUDAOOM = "10" |
| ib_link_flap | "21" | PatternIDIBLinkFlap = "2" |

pod_evicted ("14"), nccl_hang ("15"), xid_correlation ("16"), silent_data_corruption ("13") were already correct.
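To illustrate why the envelope schema could not catch these rows, here is a minimal sketch of the only constraint it places on pattern.id (type:string, minLength:1). The function name validatePatternID is hypothetical and is not part of the repository; it only mirrors the constraint as described in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// validatePatternID is a hypothetical stand-in for the schema-level check:
// pattern.id must be a string with at least one character. It knows nothing
// about which id belongs to which detector.
func validatePatternID(id string) error {
	if len(id) < 1 {
		return errors.New("pattern.id must be a non-empty string")
	}
	return nil
}

func main() {
	// Drifted ("17") and canonical ("3") values for hbm_ecc both satisfy
	// the constraint, so a schema-only test cannot tell them apart.
	for _, id := range []string{"17", "3"} {
		fmt.Printf("pattern.id=%q schema error: %v\n", id, validatePatternID(id))
	}
}
```

Both values pass, which is why a check against the PatternID* constants is needed on top of schema validation.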

Scoped-keys delta

The ib_link_flap row also used unscoped "node" while IBLinkFlapVerdict.Node serializes as "k8s.node.name" (customer-stable namespace per docs/ATTRIBUTES.md line 152-154 + docs/patterns/pattern-2-ib-link-flap.md). Renamed in the fixture to mirror what the Go struct actually emits. The other unscoped keys on this row (hca_device, port, transition_count) match the Go struct's json:"..." tags as-is — those are the envelope key names; the operator-facing dashboard scoped names (tracecore.alert.ib_link_flap.transition_count, hw.network.ib.device, hw.network.ib.port.num) are the OTel log-record attributes the processor promotes alongside the JSON payload, a separate surface.
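The envelope key names come straight from the struct's json tags, which a small sketch can make concrete. The struct below is a hypothetical stand-in for IBLinkFlapVerdict: the key names are the ones quoted in this PR, but the Go field types and the sample values are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ibLinkFlapVerdict is a hypothetical stand-in for IBLinkFlapVerdict.
// Key names follow the PR description; field types are assumed.
type ibLinkFlapVerdict struct {
	Node            string `json:"k8s.node.name"`
	HCADevice       string `json:"hca_device"`
	Port            int    `json:"port"`
	TransitionCount int    `json:"transition_count"`
}

// envelopeKeys marshals the verdict and returns the top-level JSON keys,
// i.e. the names a fixture row has to use to mirror the struct.
func envelopeKeys(v ibLinkFlapVerdict) (map[string]json.RawMessage, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var keys map[string]json.RawMessage
	if err := json.Unmarshal(raw, &keys); err != nil {
		return nil, err
	}
	return keys, nil
}

func main() {
	// Sample values are illustrative only.
	keys, err := envelopeKeys(ibLinkFlapVerdict{Node: "node-a", HCADevice: "mlx5_0", Port: 1, TransitionCount: 4})
	if err != nil {
		panic(err)
	}
	_, hasScoped := keys["k8s.node.name"]
	_, hasUnscoped := keys["node"]
	fmt.Println(hasScoped, hasUnscoped) // prints "true false"
}
```

Under these assumptions the payload never contains a bare "node" key, so a fixture row that uses one cannot have been produced by the struct.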

Drift-prevention test

TestCanonicalShippedFixtures_PatternIDsMatchDetectorConsts in module/pkg/patterns/verdict_envelope_schema_test.go pins each fixture row's pattern.id to the PatternID* const in the detector package and asserts symmetry (every const has a fixture, every fixture matches a const). Confirmed RED with the buggy fixture (5 failures matching the audit prediction exactly), GREEN after the fix.
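The pin-and-symmetry idea can be sketched as follows. This is not the code of TestCanonicalShippedFixtures_PatternIDsMatchDetectorConsts: the fixture row shape (a "name" field next to "pattern.id"), the helper checkPatternIDs, and the map of constants are all assumptions chosen to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// fixtureRow is a hypothetical, reduced view of one fixture row.
type fixtureRow struct {
	Name    string `json:"name"`
	Pattern struct {
		ID string `json:"id"`
	} `json:"pattern"`
}

// checkPatternIDs compares every fixture row against the expected ids and
// reports mismatches plus orphans on either side (the symmetry guard).
func checkPatternIDs(fixture []byte, consts map[string]string) ([]string, error) {
	var rows []fixtureRow
	if err := json.Unmarshal(fixture, &rows); err != nil {
		return nil, err
	}
	var problems []string
	seen := map[string]bool{}
	for _, r := range rows {
		seen[r.Name] = true
		want, ok := consts[r.Name]
		switch {
		case !ok:
			problems = append(problems, fmt.Sprintf("%s: fixture row has no detector const", r.Name))
		case r.Pattern.ID != want:
			problems = append(problems, fmt.Sprintf("%s: pattern.id %q, want %q", r.Name, r.Pattern.ID, want))
		}
	}
	for name := range consts {
		if !seen[name] {
			problems = append(problems, fmt.Sprintf("%s: detector const has no fixture row", name))
		}
	}
	sort.Strings(problems) // deterministic order despite map iteration
	return problems, nil
}

func main() {
	// In a real test the map values would be the exported PatternID* consts.
	consts := map[string]string{"hbm_ecc": "3", "ib_link_flap": "2"}
	fixture := []byte(`[{"name":"hbm_ecc","pattern":{"id":"17"}}]`)
	problems, err := checkPatternIDs(fixture, consts)
	if err != nil {
		panic(err)
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```

With this input the sketch reports one id mismatch (hbm_ecc) and one constant without a fixture row (ib_link_flap), the two kinds of drift the real test is described as guarding against.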

Correct `pattern.id` values for five rows in the v1.0-rc1 canonical SDK fixture and add a drift-prevention test pinning fixtures to detector-package constants.

Refs #421
Refs #398

Test plan

  • go test ./pkg/patterns/... PASS (new test + existing envelope suite)
  • go test ./sdk/verdict/... PASS (Go SDK round-trip)
  • python3 -m pytest python/tracecore_verdict/test_decode.py 22/22 PASS
  • make vet clean
  • make lint 0 issues
  • RED→GREEN cycle confirmed on TestCanonicalShippedFixtures_PatternIDsMatchDetectorConsts

Five rows in the canonical SDK fixture
(docs/schemas/fixtures/shipped-patterns-v1.0.0-rc1.json) carried wrong
pattern.id values: hbm_ecc "17"->"3", thermal_throttle "18"->"4",
pcie_aer "19"->"5", cuda_oom "20"->"10", ib_link_flap "21"->"2".
The envelope schema only constrains pattern.id to be a non-empty
string, so per-pattern envelope tests round-tripped the drift silently.

The ib_link_flap row also used the unscoped key "node" while the
Go struct serializes Node as "k8s.node.name" (the customer-stable
attribute namespace per docs/ATTRIBUTES.md). Renamed to match.

Drift-prevention test
TestCanonicalShippedFixtures_PatternIDsMatchDetectorConsts pins each
fixture row's pattern.id to the PatternID* const exported by the
detector package, and asserts every detector const has a fixture row
(symmetry guard against orphans on either side).

Refs #421
Refs #398

trilamsr commented Jun 1, 2026

Review: SHIP

Dual-surface verification: PASSED

  • transition_count unscoped in JSON envelope (ib_link_flap.go:121 json:"transition_count")
  • Promoted as scoped tracecore.alert.ib_link_flap.transition_count in OTel log-record attributes (ATTRIBUTES.md)
  • All other envelope keys (hca_device, port, k8s.node.name) match struct tags correctly

Pattern.id corrections: All 5 match detector consts:

  • hbm_ecc "17" → "3" ✓ (PatternIDHBMECC)
  • thermal_throttle "18" → "4" ✓ (PatternIDThermalThrottle)
  • pcie_aer "19" → "5" ✓ (PatternIDPCIeAER)
  • cuda_oom "20" → "10" ✓ (PatternIDCUDAOOM)
  • ib_link_flap "21" → "2" ✓ (PatternIDIBLinkFlap)

Test coverage: LOAD-BEARING

  • TestCanonicalShippedFixtures_PatternIDsMatchDetectorConsts passes all 9 fixture rows + 1 detector const
  • Symmetry guard catches orphans (fixtures without consts, consts without fixtures)
  • Mutation test confirms: mutating fixture pattern.id fails the test before SDK round-trip silently swallows drift
  • All existing tests PASS (patterns + Go SDK + Python SDK 22/22)

Scoped-key audit: node → k8s.node.name is correct per docs/patterns/02-ib-link-flap.md; the other unscoped keys are intentional on the envelope surface.

No findings. Ready to merge.

@trilamsr trilamsr enabled auto-merge (squash) June 1, 2026 23:20
@trilamsr trilamsr merged commit 9f13c0e into main Jun 1, 2026
23 checks passed
@trilamsr trilamsr deleted the fix/audit-421-fixture-pattern-id-drift branch June 1, 2026 23:27