Commit cbfa2b0

Add investigate skill for systematic debugging (#576)
## Summary

- Adds a `/investigate` skill with structured 5-phase root cause debugging
- Enforces the Iron Law: no fixes without root cause investigation first
- Includes pattern matching table for common bug categories
- Pure methodology — no special tooling dependencies

## Test plan

- [ ] Verify skill appears in `/` command list
- [ ] Test with a real bug investigation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

2 parents 4aacc9b + 4d55023 commit cbfa2b0

1 file changed

Lines changed: 95 additions & 0 deletions

.claude/commands/investigate.md

@@ -0,0 +1,95 @@
# investigate: Systematic Root Cause Debugging

Structured debugging methodology that finds root causes before applying fixes. Use when
a bug is reported, a test fails unexpectedly, or something "just stopped working."

**Iron Law: No fixes without root cause investigation first.**

## When to Use

- Bug reports from users or QA
- Test failures you don't immediately understand
- "It was working yesterday" situations
- Production errors or crashes
- Performance regressions

## Phase 1: Gather & Reproduce

Before touching any code, understand the problem:

1. **Collect symptoms** — What exactly is failing? Error messages, stack traces, screenshots, user reports.
2. **Reproduce the issue** — Can you trigger it reliably? What are the exact steps?
3. **Check recent changes** — `git log --oneline -20` and `git diff HEAD~5` — did something change recently?
4. **Narrow the scope** — Is it one endpoint, one page, one function? Or widespread?

If you cannot reproduce after 3 attempts, stop and ask the user for more context.

## Phase 2: Analyze

Match the symptoms against known patterns:

| Pattern | Indicators |
|---------|------------|
| Race condition | Intermittent, timing-dependent, works in debugger |
| Null/undefined propagation | TypeError, "cannot read property of null/undefined" |
| State corruption | Works on first load, fails on subsequent interactions |
| Data mismatch | Works with some data, fails with other data |
| Environment issue | Works locally, fails in CI/staging/prod |
| Dependency change | Worked before package update, lockfile changed |
| Migration issue | DB-related errors after schema change |
| Cache staleness | Works after hard refresh or cache clear |
| Auth/session issue | Works when freshly logged in, fails later |
| Concurrency issue | Works with one user, fails under load |

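As a rough illustration of how the table above could be applied mechanically, the sketch below scores a free-text symptom description against keyword hints. The `PATTERN_HINTS` keywords and the `candidate_patterns` function are hypothetical and chosen only for this example; the skill itself relies on judgment rather than string matching.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword hints loosely derived from the "Indicators" column.
# They are illustrative assumptions, not part of the skill.
PATTERN_HINTS: Dict[str, Tuple[str, ...]] = {
    "Race condition": ("intermittent", "timing", "works in debugger"),
    "Null/undefined propagation": ("typeerror", "of null", "of undefined"),
    "State corruption": ("first load", "subsequent"),
    "Data mismatch": ("some data", "other data"),
    "Environment issue": ("works locally", "fails in ci", "staging"),
    "Dependency change": ("package update", "lockfile"),
    "Migration issue": ("schema change", "migration"),
    "Cache staleness": ("hard refresh", "cache clear"),
    "Auth/session issue": ("logged in", "session"),
    "Concurrency issue": ("one user", "under load"),
}


def candidate_patterns(symptoms: str) -> List[str]:
    """Return pattern names whose hints appear in the text, best match first."""
    text = symptoms.lower()
    scored = []
    for pattern, hints in PATTERN_HINTS.items():
        hits = sum(1 for hint in hints if hint in text)
        if hits:
            scored.append((hits, pattern))
    # Stable sort: more hits first, ties keep the table's order.
    scored.sort(key=lambda pair: -pair[0])
    return [pattern for _, pattern in scored]
```

A match here is only a starting point for Phase 3; it suggests which hypothesis to test first and does not confirm a root cause.
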
## Phase 3: Hypothesize & Test

1. **Form a hypothesis** — "I think X is happening because Y"
2. **Design a test** — How can you prove or disprove this? Add targeted logging, write a minimal reproduction, check specific state.
3. **Test the hypothesis** — Run the test. Does it confirm or refute?
4. **If refuted** — Form a new hypothesis. Do NOT fix something that isn't the root cause.
5. **3-strike rule** — If 3 hypotheses fail, stop and escalate. Share what you've tried.

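The hypothesis loop and the 3-strike rule above can be tracked with a small record, sketched here. `HypothesisLog` and its method names are hypothetical, introduced only to make the stopping condition explicit.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_FAILED_HYPOTHESES = 3  # the 3-strike rule from the list above


@dataclass
class HypothesisLog:
    """Records each tested hypothesis so an escalation can share what was tried."""

    refuted: List[str] = field(default_factory=list)
    confirmed: Optional[str] = None

    def record(self, hypothesis: str, confirmed: bool) -> None:
        # A confirmed hypothesis is the root cause; anything else is a strike.
        if confirmed:
            self.confirmed = hypothesis
        else:
            self.refuted.append(hypothesis)

    @property
    def should_escalate(self) -> bool:
        # Escalate only when nothing is confirmed and the strike limit is hit.
        return self.confirmed is None and len(self.refuted) >= MAX_FAILED_HYPOTHESES

    def summary(self) -> str:
        lines = [f"refuted: {h}" for h in self.refuted]
        if self.confirmed:
            lines.append(f"confirmed: {self.confirmed}")
        return "\n".join(lines)
```

When `should_escalate` becomes true, `summary()` gives the list of refuted hypotheses to include in the request for help.
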
### Sanitize Before Searching

When searching for errors online or in the codebase:
- Strip specific values (IDs, paths, timestamps)
- Keep the error structure and type
- Example: `TypeError: Cannot read property 'id' of undefined at UserService.getUser` → search for `TypeError: Cannot read property of undefined UserService`

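The stripping step can be approximated with a few regular expressions, as in the sketch below. The rule set and the `sanitize_error` name are assumptions made for illustration; real error formats vary, and this version keeps slightly more of the message than the hand-written query in the example above (it does not drop `at` or the method name).

```python
import re

# Each rule removes one kind of instance-specific value.
# Order matters: timestamps and paths go before bare numbers.
_RULES = [
    # ISO-like timestamps, e.g. 2024-01-15T10:22:31Z
    re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"),
    # UUIDs
    re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b"),
    # Unix-style paths with optional :line:column suffix
    re.compile(r"(?:/[\w.\-]+){2,}(?::\d+)*"),
    # Quoted literals such as 'id' or "name"
    re.compile(r"'[^']*'|\"[^\"]*\""),
    # Numeric IDs of three or more digits
    re.compile(r"\b\d{3,}\b"),
]


def sanitize_error(message: str) -> str:
    """Strip specific values while keeping the error type and structure."""
    for pattern in _RULES:
        message = pattern.sub("", message)
    # Collapse the whitespace left behind by removed values.
    return " ".join(message.split())
```
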
## Phase 4: Fix

Only after root cause is confirmed:

1. **Fix the root cause, not the symptom** — If a null value crashes downstream, fix where null is introduced, not where it crashes.
2. **Minimal diff** — Change only what's necessary. Don't refactor while fixing.
3. **Write a regression test** — A test that would have caught this bug before the fix, and passes after.
4. **Verify the fix** — Run the full test suite. Manually reproduce the original steps and confirm the bug is gone.
5. **Check blast radius** — Does this fix affect other code paths? Run `git diff --stat` — if >5 files changed, flag it.

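The blast-radius check in step 5 can be automated along these lines. This sketch uses `git diff --name-only` in place of `git diff --stat` because its output is one path per line and easier to count; the function names and the `FILE_LIMIT` constant are hypothetical.

```python
import subprocess
from typing import List

FILE_LIMIT = 5  # threshold from step 5: flag when more than 5 files change


def parse_changed_files(name_only_output: str) -> List[str]:
    """Split `git diff --name-only` output into a list of non-empty paths."""
    return [line.strip() for line in name_only_output.splitlines() if line.strip()]


def exceeds_blast_radius(files: List[str], limit: int = FILE_LIMIT) -> bool:
    """True when the fix touches more files than the limit allows."""
    return len(files) > limit


def changed_files_in_worktree() -> List[str]:
    """Ask git for unstaged changed paths; assumes the cwd is inside a repository."""
    result = subprocess.run(
        ["git", "diff", "--name-only"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_changed_files(result.stdout)
```

Calling `exceeds_blast_radius(changed_files_in_worktree())` before committing gives a yes/no answer for whether to pause and discuss the change.
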
## Phase 5: Report

After fixing, write a brief debug report:

```
## Debug Report

**Issue:** [one-line description]
**Root cause:** [what was actually wrong]
**Fix:** [what was changed and why]
**Regression test:** [test file:line that prevents recurrence]
**Blast radius:** [what else might be affected]
**Time spent:** [how long the investigation took]
```

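If reports are generated by a script, the template above maps naturally onto a small data class. The `DebugReport` class below is a hypothetical sketch; its field names mirror the template labels and nothing else is implied about how the skill produces reports.

```python
from dataclasses import dataclass


@dataclass
class DebugReport:
    """One field per line of the debug report template."""

    issue: str
    root_cause: str
    fix: str
    regression_test: str
    blast_radius: str
    time_spent: str

    def to_markdown(self) -> str:
        # Same labels and order as the template.
        lines = [
            "## Debug Report",
            "",
            f"**Issue:** {self.issue}",
            f"**Root cause:** {self.root_cause}",
            f"**Fix:** {self.fix}",
            f"**Regression test:** {self.regression_test}",
            f"**Blast radius:** {self.blast_radius}",
            f"**Time spent:** {self.time_spent}",
        ]
        return "\n".join(lines)
```

Making every field required means a report cannot be constructed without stating a root cause, which matches the Iron Law.
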
## Important Rules

1. **Never apply unverified fixes.** "Maybe this will work" is not a fix — it's a guess. Verify first.
2. **Read before writing.** Understand the code path before changing it.
3. **One fix at a time.** Don't combine multiple fixes — you won't know which one worked.
4. **Escalate early.** After 3 failed hypotheses, stop. Share findings and ask for help.
5. **Flag large blast radius.** If a fix touches >5 files, pause and discuss with the user.
6. **Don't optimize while debugging.** Fix the bug. Optimization is a separate task.
7. **Check the obvious first.** Typos, wrong variable names, missing imports, incorrect config.
8. **Trust error messages.** Read them carefully. They usually tell you exactly what's wrong.
9. **Git blame is your friend.** When did this code change? Who changed it? What was the commit message?
10. **Environment matters.** Check env vars, config files, database state, API versions.
