Repository Guidelines

What Kilroy Is

Kilroy is a local-first Go CLI for running software-factory pipelines in a Git repository. A skill converts English requirements into DOT graphs; Kilroy then validates graph semantics and executes stages with checkpoint commits and a run history backed by cxdb. The foundational specs are in docs/strongdm/attractor.

Use Kilroy in this order: build the binary, generate or write a graph, validate it, then run it with a config file. Typical flow: go build -o ./kilroy ./cmd/kilroy, ./kilroy attractor ingest -o pipeline.dot "<requirements>", ./kilroy attractor validate --graph pipeline.dot, then ./kilroy attractor run --graph pipeline.dot --config run.yaml.

What you're doing here - the Prime Directive.

If you can see this message, then you are not here to use Kilroy - YOU ARE HERE TO IMPROVE KILROY. If Kilroy fails to build a project:

  • Don't fix the project
  • Don't fix the dotfile
  • Don't fix the system so it works for this project

Use the knowledge you've gained from the failure to make the system more robust for every project. Your changes should work for every project, every language, every system. Of course, specific user instructions may override this, or any other section.

Think like a user

Think about a blank-slate agent that is trying to create a dotfile with the dotfile skill and then run it with the attractor. What mistakes would it make? What ergonomics would steer it away from those mistakes? How can you make that robust for every possible project the attractor could work on, not just this one? How can you do that without asking it to know the impossible, like how hard a problem is or how long something might take?

Canonical Specs

These three specs are the true north for system design. If you are making a change that relates to one of their areas, you must consult the relevant spec first to see what the idiomatic solution is.

  • Unified LLM Spec (docs/strongdm/attractor/unified-llm-spec.md): Provider-agnostic LLM client — a single Client interface across LLM endpoints with unified types, retry/backoff, streaming, and tool calling. Key implementation: internal/llm/ (client, types, errors, retry, generate, streaming) and internal/llm/providers/ (per-provider adapters).

  • Attractor Spec (docs/strongdm/attractor/attractor-spec.md): DOT-graph pipeline runner — parses Graphviz DOT into a directed graph of stages (LLM tasks, tool gates, human gates, parallel fan-out/fan-in) and executes them with conditional edge routing, checkpoint/resume, model stylesheets, and retry policies. Key implementation: internal/attractor/dot/ (parser, lexer), internal/attractor/engine/ (execution, handlers, parallel, resume, failure policy), internal/attractor/validate/ (graph linting), internal/attractor/style/ (CSS-like model stylesheet), internal/attractor/runtime/ (checkpoint, context, status).

  • Coding Agent Loop Spec (docs/strongdm/attractor/coding-agent-loop-spec.md): Turn-based agentic loop — pairs an LLM with developer tools (file edit, shell, search, glob, grep) through repeated LLM-call → tool-execution cycles with context truncation, subagent spawning, and event-driven observation. Key implementation: internal/agent/ (session.go for the loop, tool_registry.go for tool dispatch, profile.go for provider-specific toolsets, env_local.go for filesystem/shell execution, events.go for the event bus).

Project Structure & Module Organization

  • cmd/kilroy/: CLI entrypoint and the attractor subcommands (run, resume, status, stop, validate, ingest).
  • internal/attractor/: core engine/runtime, graph validation, config loading, and model metadata handling.
  • internal/agent/, internal/cxdb/, internal/llmclient/: coding-agent loop, CXDB integration, and provider client/env wiring.
  • scripts/: operational helpers (e2e.sh, e2e-guardrail-matrix.sh, start-cxdb.sh, run_benchmarks.sh).
  • demo/, docs/, skills/: sample graphs, architecture/spec references, and ingestion skills.

Skill Symlink Layout

  • Canonical skill content lives under skills/<name>/.
  • .claude/skills/<name> must be a symlink to ../../skills/<name>.
  • .agents/skills/<name> must be a symlink to ../../.claude/skills/<name>.
  • When adding/removing/renaming repo skills, update both symlink directories in the same change.
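
The layout rules above can be checked mechanically. The following is a minimal sketch, not part of Kilroy: the function names skillLinks and checkSkillLinks and the skill name "example-skill" are hypothetical, chosen only for illustration. It derives the two expected symlinks for a skill and reports any that are missing or point elsewhere.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// skillLinks maps each required symlink path (slash-separated, relative to
// the repo root) to the target it must point at, per the layout rules.
func skillLinks(name string) map[string]string {
	return map[string]string{
		".claude/skills/" + name: "../../skills/" + name,
		".agents/skills/" + name: "../../.claude/skills/" + name,
	}
}

// checkSkillLinks returns one message per link that is missing, is not a
// symlink, or points at an unexpected target. An empty result means OK.
func checkSkillLinks(repoRoot, name string) []string {
	var problems []string
	for link, want := range skillLinks(name) {
		got, err := os.Readlink(filepath.Join(repoRoot, filepath.FromSlash(link)))
		if err != nil {
			problems = append(problems, fmt.Sprintf("%s: %v", link, err))
			continue
		}
		if got != want {
			problems = append(problems, fmt.Sprintf("%s points to %s, want %s", link, got, want))
		}
	}
	sort.Strings(problems) // map iteration order is random; keep output stable
	return problems
}

func main() {
	// "example-skill" is a hypothetical skill name used for illustration.
	for _, p := range checkSkillLinks(".", "example-skill") {
		fmt.Println(p)
	}
}
```

Run from the repo root, this prints nothing when both links for the named skill are in place. It compares the raw link text, so an absolute symlink that resolves to the right directory would still be reported.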

Build, Test, and Development Commands

  • go build -o ./kilroy ./cmd/kilroy: build the local CLI binary.
  • go test ./...: run the full Go test suite.
  • ./scripts/e2e.sh: smoke check (tests, build, and graph validation).
  • ./scripts/e2e-guardrail-matrix.sh: run targeted engine guardrail regression tests.
  • ./kilroy attractor validate --graph <file.dot>: validate graph structure/semantics before execution.

Pre-Commit CI Checklist

Run all of these before every commit. They mirror the exact CI steps in .github/workflows/ci.yml and all must pass before pushing:

# 1. Format — CI uses gofmt -l . on the full repo checkout (no worktrees there)
gofmt -l . | grep -v '^\./\.claude/' | grep -v '^\.claude/'
# must produce no output; fix with: gofmt -w <file>

# 2. Vet
go vet ./...

# 3. Build
go build ./cmd/kilroy/

# 4. Test
go test ./...

# 5. Validate demo graphs
for f in demo/**/*.dot; do echo "Validating $f"; ./kilroy attractor validate --graph "$f"; done

Common pitfall: gofmt -w ./cmd ./internal misses files in other packages. Always run gofmt -l . (excluding .claude/ worktrees) to match CI exactly.
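
For readers who prefer to see the exclusion logic spelled out, here is a small Go sketch of what the two grep -v filters in step 1 do. It is an illustration only: the function name unformatted and the sample paths are hypothetical, and CI itself runs plain gofmt -l .

```go
package main

import (
	"fmt"
	"strings"
)

// unformatted takes the lines printed by `gofmt -l .` and drops blank lines
// plus any path under a top-level .claude/ directory (with or without a
// leading "./"), mirroring the two grep -v filters in the checklist.
func unformatted(paths []string) []string {
	var out []string
	for _, p := range paths {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		if strings.HasPrefix(strings.TrimPrefix(p, "./"), ".claude/") {
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	// Hypothetical sample of gofmt -l output.
	sample := []string{
		"./.claude/worktrees/a/main.go",
		".claude/worktrees/b/main.go",
		"internal/attractor/engine/engine.go",
	}
	for _, p := range unformatted(sample) {
		fmt.Println(p) // anything printed here still needs gofmt -w
	}
}
```

An empty result corresponds to the "must produce no output" condition in step 1.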

Kilroy Agent Rules

Production Safety (Strict)

NEVER start a production run except precisely as the user requested, and only after an explicit user request for that production run. Production runs are expensive. Any routing decision (provider, model, reasoning depth, or API vs CLI) has cost implications and must be explicitly approved by the user.

For production runs (llm.cli_profile=real), execute only the exact command the user explicitly approved. Do not change flags, env, config, paths, --run-id, --detach, or add overrides like --force-model unless explicitly approved. If the run fails, stop immediately, report the error, and wait for explicit approval of a new exact command.

Running Attractor

Launch Modes: Production vs Test

Use explicit run configs and flags so the mode is unambiguous:

  • Production run (real providers, real cost):
    • llm.cli_profile must be real
    • Do not use --allow-test-shim
    • Example:
./kilroy attractor run --detach --graph <graph.dot> --config <run_config_real.json> --run-id <run_id> --logs-root <logs_root>
  • Test run (fake/shim providers):
    • llm.cli_profile must be test_shim
    • Provider executable overrides are expected in config
    • --allow-test-shim is required
    • Example:
./kilroy attractor run --detach --graph <graph.dot> --config <run_config_test_shim.json> --allow-test-shim --run-id <run_id> --logs-root <logs_root>
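
The pairing between llm.cli_profile and --allow-test-shim can be summarized as a small check. This is a sketch of the rules as stated above, not Kilroy's actual validation code: checkLaunchMode is a hypothetical name, and rejecting any profile other than real or test_shim is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// checkLaunchMode returns an error when the profile and the
// --allow-test-shim flag disagree with the launch-mode rules.
func checkLaunchMode(cliProfile string, allowTestShim bool) error {
	switch cliProfile {
	case "real":
		if allowTestShim {
			return errors.New("production run: do not pass --allow-test-shim")
		}
	case "test_shim":
		if !allowTestShim {
			return errors.New("test run: --allow-test-shim is required")
		}
	default:
		// Assumption: only the two profiles named in this guide are accepted.
		return fmt.Errorf("unrecognized llm.cli_profile %q", cliProfile)
	}
	return nil
}

func main() {
	combos := []struct {
		profile string
		shim    bool
	}{
		{"real", false}, {"real", true}, {"test_shim", true}, {"test_shim", false},
	}
	for _, c := range combos {
		fmt.Printf("profile=%s allow-test-shim=%v: %v\n", c.profile, c.shim, checkLaunchMode(c.profile, c.shim))
	}
}
```

A check like this only catches mode mismatches. It does not replace the rule that a production run needs explicit user approval of the exact command.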

Binary Freshness

  • Before running ./kilroy attractor run, ensure ./kilroy is built from current repo HEAD.
  • If stale-build detection triggers, rebuild with go build -o ./kilroy ./cmd/kilroy and rerun.
  • Use --confirm-stale-build only when intentionally running a stale binary.

Long Runs (Detached)

For long attractor run/resume jobs, launch detached so the parent shell/session ending does not kill Kilroy:

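# export is required: the single-quoted command is expanded by the detached shell, which only sees exported variables
export RUN_ROOT=/path/to/run_root
setsid -f bash -lc 'cd /home/user/code/kilroy-wt-state-isolation-watchdog && ./kilroy attractor resume --logs-root "$RUN_ROOT/logs" >> "$RUN_ROOT/resume.out" 2>&1'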

Checking Run Status

Runs live under ~/.local/state/kilroy/attractor/runs/<run_id>/. Key files:

  • final.json — exists only when the run finished; status is success or fail.
  • checkpoint.json — last completed node, retry counts, failure_reason (if any).
  • live.json — most recent engine event (retries, errors, current node).
  • progress.ndjson — full event log (stage starts/ends, edge selections, LLM retries).
  • manifest.json — run metadata (goal, graph, repo, base SHA).
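
As a rough illustration of how these files can be used, the sketch below decides whether a run has finished by looking for final.json. It assumes the status is stored under a top-level "status" key, which this guide implies but does not spell out; the names finalStatus and runState and the "unfinished" label are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// finalStatus extracts the status from the contents of final.json.
// Assumption: the value lives under a top-level "status" key.
func finalStatus(data []byte) (string, error) {
	var doc struct {
		Status string `json:"status"`
	}
	if err := json.Unmarshal(data, &doc); err != nil {
		return "", err
	}
	if doc.Status == "" {
		return "", errors.New("final.json has no status field")
	}
	return doc.Status, nil
}

// runState reports "unfinished" while final.json is absent, and otherwise
// the status recorded in it.
func runState(runDir string) (string, error) {
	data, err := os.ReadFile(filepath.Join(runDir, "final.json"))
	if errors.Is(err, fs.ErrNotExist) {
		return "unfinished", nil
	}
	if err != nil {
		return "", err
	}
	return finalStatus(data)
}

func main() {
	runDir := "." // pass a run directory as the first argument to override
	if len(os.Args) > 1 {
		runDir = os.Args[1]
	}
	state, err := runState(runDir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(state)
}
```

For an unfinished run, checkpoint.json and live.json are the next places to look. Their schemas are not described here, so the sketch does not parse them.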

Agent Backend Configuration

Agent nodes (shape=box, agent_tool="claude") require specific backend and handler configuration for proper agent log capture:

  • backend: cli in the run config — invokes the actual CLI binary (claude, codex, opencode) with --output-format stream-json, producing agent_output.jsonl with full conversation logs (tool calls, thinking, responses). The server parses this into structured agent events for the UI.
  • backend: api — uses the Anthropic HTTP API directly. Produces events.ndjson in a different format. The server does NOT currently parse this into UI-visible agent events. Use backend: cli for runs where you want the UI to show agent conversation detail.
  • --tmux flag — required for agent nodes that use CLI backends. Registers TmuxAgentHandler which runs agent CLIs in tmux sessions for reliable headless execution. Without --tmux, the default AgentHandler is used (API-only path).
  • --package flag — points to a workflow package directory (e.g., workflows/pr-review/). Copies scripts, prompts, and graph into the worktree at .kilroy/package/.

Example production PR review launch:

./kilroy attractor run --detach --tmux \
  --package workflows/pr-review \
  --config run.yaml \
  --no-cxdb --skip-cli-headless-warning \
  --input '{"pr_repo": "owner/repo", "pr_number": 123}'

The run config must specify backend: cli for providers used by agent nodes:

llm:
  cli_profile: real
  providers:
    anthropic:
      backend: cli

PR Review Process

For PRs we want to accept: check out the PR branch into a worktree, review, add fix-up commits, then non-squash merge — this preserves contributor credit while maintaining code quality.

Coding Style & Naming Conventions

  • Follow idiomatic Go and run formatter before commit: gofmt -w . (or target specific files). Check with gofmt -l . | grep -v '\.claude/' — must produce no output.
  • Keep packages domain-focused (for example engine, validate, modeldb) and avoid cross-package leakage.
  • Use lowercase package names, CamelCase for exported symbols, and colocated *_test.go files.
  • Prefer explicit config over implicit behavior; this codebase favors deterministic runtime contracts.

Testing Guidelines

  • Add tests next to the changed code (internal/.../*_test.go).
  • Use table-driven tests for validation, parsing, and routing logic.
  • For engine/runtime changes, run targeted package tests plus go test ./....
  • Include regression coverage for bug fixes, not just happy-path assertions.

Commit & Pull Request Guidelines

  • Follow commit patterns seen in history: area: summary or type(scope): summary (for example engine/runtime: ..., docs(plan): ..., feat(modeldb): ...).
  • Keep commits narrow and include docs/tests when behavior or contracts change.
  • PRs should include: intent, key files touched, commands run for validation, and config/runtime impact.
  • For CLI behavior changes, include example invocations and representative output.
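
The two commit subject patterns mentioned above can be approximated with a regular expression. This is a sketch under assumptions: the repository does not define an official pattern here, so the allowed character set (lowercase letters, digits, and / _ . -) and the names subjectRE and validSubject are illustrative choices.

```go
package main

import (
	"fmt"
	"regexp"
)

// subjectRE accepts "area: summary" and "type(scope): summary".
// Assumption: areas, types, and scopes use lowercase letters, digits,
// and the characters / _ . - only.
var subjectRE = regexp.MustCompile(`^[a-z0-9][a-z0-9/_.-]*(\([a-z0-9/_.-]+\))?: \S.*$`)

// validSubject reports whether a commit subject line fits either pattern.
func validSubject(subject string) bool {
	return subjectRE.MatchString(subject)
}

func main() {
	// Hypothetical subjects, checked in a small table.
	cases := []string{
		"engine/runtime: retry checkpoint write",
		"docs(plan): outline resume flow",
		"feat(modeldb): add lookup",
		"Fixed some stuff",
	}
	for _, c := range cases {
		fmt.Printf("%-45q %v\n", c, validSubject(c))
	}
}
```

A looser or stricter pattern may suit the project better; the history itself remains the reference for what is acceptable.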