Skip to content

Add performance instrumentation framework for CLI operations#614

Draft
pfleidi wants to merge 9 commits intomainfrom
performance-logging-investigation
Draft

Add performance instrumentation framework for CLI operations#614
pfleidi wants to merge 9 commits intomainfrom
performance-logging-investigation

Conversation

@pfleidi
Copy link
Contributor

@pfleidi pfleidi commented Mar 5, 2026

Summary

Add a perf package with span-based performance instrumentation for the Entire CLI's git hook pipeline. This replaces the ad-hoc LogDuration approach with structured, hierarchical timing that captures the full cost breakdown of each hook invocation in a single log line.

Why

Hook latency directly impacts developer experience — every git commit blocks on our prepare-commit-msg and post-commit hooks. When users report slowness, we need to know which step inside a hook is slow. The previous LogDuration call only captured total wall time with no per-step breakdown, making it impossible to diagnose bottlenecks from logs alone.

More on this can be found here: https://github.com/entirehq/company-knowledge/pull/59

What changed

New perf package (perf/span.go, perf/context.go)

  • Span-based API inspired by OpenTelemetry: perf.Start(ctx, name) returns a child span that nests under the current context's span
  • Root span emits a single structured DEBUG log line on End() with duration_ms and flat steps.<name>_ms keys for each child
  • RecordError(err) marks a span as errored (first-error-wins semantics); root span output includes error: true and steps.<name>_err: true flags for failed steps
  • Measure(name, fn) convenience for timing a closure as a child span

Hook registry (hook_registry.go)

  • Replaced LogDuration with a root perf.Start(ctx, hookName) span — all child spans in lifecycle handlers and strategy methods automatically nest under it

Lifecycle handlers (lifecycle.go)

  • Instrumented 11 substeps across handleLifecycleTurnStart and handleLifecycleTurnEnd: transcript preparation, copy, metadata extraction, file change detection, path normalization, step save, etc.
  • Added RecordError on all error-return paths (13 call sites)

Strategy hooks (manual_commit_hooks.go)

  • Instrumented 16 substeps across PrepareCommitMsg, PostCommit, and postCommitProcessSession: repository open, session lookup, content filtering, tree resolution, session processing, shadow branch cleanup, carry-forward, state save
  • Added RecordError on error paths (10 call sites)

Strategy git operations (manual_commit_git.go)

  • Instrumented 5 substeps in SaveStep: repository open, state load, shadow branch migration, checkpoint write, state update
  • Added RecordError on error paths (5 call sites)

Strategy push (manual_commit_push.go)

  • Instrumented 2 substeps in PrePush: checkpoints branch push, trails branch push
  • Added RecordError on error paths (2 call sites)

Example log output

DEBUG perf component=perf op=post-commit duration_ms=142 steps.open_repository_and_head_ms=12 steps.find_sessions_for_worktree_ms=3 steps.resolve_commit_trees_ms=8 steps.process_sessions_ms=115 steps.cleanup_shadow_branches_ms=4

With an error:

DEBUG perf component=perf op=stop duration_ms=45 error=true steps.prepare_and_validate_transcript_ms=2 steps.prepare_and_validate_transcript_err=true

Test plan

  • Unit tests for perf package: span creation, nesting, duration recording, idempotent End, auto-end children, RecordError (nil no-op, first-error-wins, child error flag in output, no flag by default)
  • mise run fmt && mise run lint && mise run test:ci passes (unit + integration tests with race detector)

pfleidi and others added 8 commits March 4, 2026 14:43
Introduce the perf package at repo root level for measuring operation
latencies. Provides Start/End/Measure API with automatic parent-child
span nesting via context propagation. Root spans emit a single DEBUG
log line through the existing logging package with the full timing tree
flattened as slog attributes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 7995b4435733
Replace the manual time.Now()/LogDuration timing pattern in
newAgentHookVerbCmdWithLogging with a perf.Start/span.End root span.
This root span will serve as the parent for all child spans added
in subsequent instrumentation of lifecycle handlers and strategy methods.

Update the corresponding test to check for the perf span log line
(msg="perf", op=hookName, duration_ms present) instead of the removed
"hook completed" LogDuration entry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 665747ffa1ac
Add perf spans to SessionStart, TurnStart, and TurnEnd handlers.
Span variable names reflect what they measure (e.g., detectSpan,
extractSpan) and span operation names are derived from the actual
code being measured rather than comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 09854502885a
Add perf spans to measure substeps of the prepare-commit-msg hook:
open_repository, find_sessions_for_worktree, filter_sessions_with_content,
read_commit_message, resolve_session_metadata, and write_commit_message.
TTY confirmation prompts are intentionally excluded from spans since they
block on user input and would skew timing measurements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: fe39488c5e4b
Add perf spans to measure substeps of SaveStep: open_repository,
load_session_state, migrate_shadow_branch, write_temporary_checkpoint,
and update_session_state. These nest as grandchildren of the root
hook span via the build_and_save_step parent span.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 9b052733f137
Instrument the PostCommit hook and its per-session helper with
descriptive perf spans to measure substep timing. PostCommit gets
spans for open_repository_and_head, find_sessions_for_worktree,
resolve_commit_trees, process_sessions, and cleanup_shadow_branches.
postCommitProcessSession gets spans for resolve_shadow_branch,
check_session_content, transition_and_condense, carry_forward_files,
and save_session_state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 4b92e096b173
Add child spans for the two push operations in PrePush:
- push_checkpoints_branch wrapping pushSessionsBranchCommon()
- push_trails_branch wrapping PushTrailsBranch()

These appear as substeps under the root hook span from hook_registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 5814d9978c73
Entire-Checkpoint: f76dd0cc9ae4
Copilot AI review requested due to automatic review settings March 5, 2026 01:23
@cursor
Copy link

cursor bot commented Mar 5, 2026

PR Summary

Medium Risk
Touches the git hook and lifecycle/strategy hot paths by adding new context-based instrumentation; behavior should be unchanged, but missing/incorrect span handling could affect logging or add minor overhead in latency-sensitive hooks.

Overview
Adds a new perf span framework that records hierarchical timings in a single structured DEBUG log line (msg="perf", op, duration_ms, steps.*_ms, and per-step error flags).

Replaces ad-hoc duration logging in agent hook execution with a root span per hook invocation, and instruments key lifecycle/strategy paths (session start/turn start/turn end, ManualCommitStrategy PrepareCommitMsg/PostCommit/SaveStep/PrePush) with named subspans plus RecordError on early-return error paths. Updates tests to assert the new perf log entry is emitted during a hook run and adds unit tests for span nesting, idempotent End, auto-ending children, and error recording.

Written by Cursor Bugbot for commit 64dd5d9. Configure here.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Span ended before error check loses error recording
    • In SaveStep, the write_temporary_checkpoint span now records WriteTemporary errors before ending the span and returning.
  • ✅ Fixed: Span ended before error check in PostCommit
    • In PostCommit, the find_sessions_for_worktree span now records and handles lookup errors before ending, with End() left on the success path.

Create PR

Or push these changes by commenting:

@cursor push abf8b32088
Preview (abf8b32088)
diff --git a/cmd/entire/cli/strategy/manual_commit_git.go b/cmd/entire/cli/strategy/manual_commit_git.go
--- a/cmd/entire/cli/strategy/manual_commit_git.go
+++ b/cmd/entire/cli/strategy/manual_commit_git.go
@@ -111,10 +111,12 @@
 		AuthorEmail:       step.AuthorEmail,
 		IsFirstCheckpoint: isFirstCheckpointOfSession,
 	})
-	writeCheckpointSpan.End()
 	if err != nil {
+		writeCheckpointSpan.RecordError(err)
+		writeCheckpointSpan.End()
 		return fmt.Errorf("failed to write temporary checkpoint: %w", err)
 	}
+	writeCheckpointSpan.End()
 
 	// If checkpoint was skipped due to deduplication (no changes), return early
 	if result.Skipped {

diff --git a/cmd/entire/cli/strategy/manual_commit_hooks.go b/cmd/entire/cli/strategy/manual_commit_hooks.go
--- a/cmd/entire/cli/strategy/manual_commit_hooks.go
+++ b/cmd/entire/cli/strategy/manual_commit_hooks.go
@@ -793,14 +793,23 @@
 
 	// Find all active sessions for this worktree
 	sessions, err := s.findSessionsForWorktree(ctx, worktreePath)
+	if err != nil {
+		findSessionsSpan.RecordError(err)
+		findSessionsSpan.End()
+		logging.Warn(logCtx, "post-commit: no active sessions despite trailer",
+			slog.String("strategy", "manual-commit"),
+			slog.String("checkpoint_id", checkpointID.String()),
+		)
+		return nil
+	}
 	findSessionsSpan.End()
 
-	if err != nil || len(sessions) == 0 {
+	if len(sessions) == 0 {
 		logging.Warn(logCtx, "post-commit: no active sessions despite trailer",
 			slog.String("strategy", "manual-commit"),
 			slog.String("checkpoint_id", checkpointID.String()),
 		)
-		return nil //nolint:nilerr // Intentional: hooks must be silent on failure
+		return nil
 	}
 
 	// Build transition context
This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

AuthorEmail: step.AuthorEmail,
IsFirstCheckpoint: isFirstCheckpointOfSession,
})
writeCheckpointSpan.End()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Span ended before error check loses error recording

Medium Severity

writeCheckpointSpan.End() is called unconditionally on line 114, before the error from store.WriteTemporary is checked on line 115. If WriteTemporary returns an error, RecordError is never called, so the span's error flag won't appear in the perf log output. Every other span in this PR follows the pattern of calling RecordError(err) then End() on error paths before the happy-path End().

Fix in Cursor Fix in Web

// Find all active sessions for this worktree
sessions, err := s.findSessionsForWorktree(ctx, worktreePath)
findSessionsSpan.End()

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Span ended before error check in PostCommit

Medium Severity

In PostCommit, findSessionsSpan.End() is called on line 796 immediately after findSessionsForWorktree, before the error is checked on line 798. If findSessionsForWorktree returns an error, RecordError is never called. Contrast this with the same span pattern in PrepareCommitMsg (lines 351–361), which correctly calls RecordError(err) then End() inside the error branch.

Fix in Cursor Fix in Web

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new perf span-based instrumentation package and wires it into the CLI hook pipeline to produce structured, per-step timing logs for diagnosing slow git hook operations.

Changes:

  • Introduces perf package (context-attached spans, root span emits a single aggregated DEBUG log line).
  • Replaces ad-hoc hook duration logging with a root perf span in the hook registry.
  • Adds span instrumentation and RecordError calls across lifecycle + manual-commit strategy hook paths.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
perf/context.go Stores/retrieves the active perf span in context.Context.
perf/span.go Implements spans, step aggregation, and perf log emission.
perf/span_test.go Unit tests for span lifecycle, nesting, durations, and error recording.
cmd/entire/cli/hook_registry.go Starts a root perf span per hook invocation (replacing LogDuration).
cmd/entire/cli/hook_registry_test.go Updates log assertions to expect the perf span log line.
cmd/entire/cli/lifecycle.go Adds perf spans around lifecycle substeps + error recording on return paths.
cmd/entire/cli/strategy/manual_commit_hooks.go Instruments prepare-commit-msg/post-commit flows with per-step spans.
cmd/entire/cli/strategy/manual_commit_git.go Instruments SaveStep with spans for major sub-operations.
cmd/entire/cli/strategy/manual_commit_push.go Instruments pre-push branch push steps with spans.
Comments suppressed due to low confidence (1)

cmd/entire/cli/strategy/manual_commit_hooks.go:804

  • findSessionsSpan is ended before checking err, and if s.findSessionsForWorktree fails the span never calls RecordError(err). This means perf logs will show the step as successful even when it errored. Consider moving End() after the if err != nil check and recording the error before returning.
	// Find all active sessions for this worktree
	sessions, err := s.findSessionsForWorktree(ctx, worktreePath)
	findSessionsSpan.End()

	if err != nil || len(sessions) == 0 {
		logging.Warn(logCtx, "post-commit: no active sessions despite trailer",
			slog.String("strategy", "manual-commit"),
			slog.String("checkpoint_id", checkpointID.String()),
		)
		return nil //nolint:nilerr // Intentional: hooks must be silent on failure
	}

@@ -153,13 +157,6 @@ func newAgentHookVerbCmdWithLogging(agentName types.AgentName, hookName string)
}
// Other pass-through hooks (nil event, no special handling) are no-ops

Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The root perf span in newAgentHookVerbCmdWithLogging is always ended via defer span.End(), but it never records hookErr. If DispatchLifecycleEvent/handleClaudeCodePostTodo returns an error, the root perf log won’t show error=true unless errors are also propagated from children. Consider calling span.RecordError(hookErr) before returning when hookErr != nil.

Suggested change
if hookErr != nil {
span.RecordError(hookErr)
}

Copilot uses AI. Check for mistakes.
return nil //nolint:nilerr // Hook must be silent on failure
openRepoSpan.RecordError(err)
openRepoSpan.End()
return nil
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These hook code paths intentionally swallow errors (return nil) but the nilerr linter is enabled in this repo; without //nolint:nilerr (or another linter-friendly pattern), CI will fail. Please restore the nolint:nilerr annotation (as previously used) for this return nil on error.

Suggested change
return nil
return nil //nolint:nilerr // intentional: do not block commit if repository cannot be opened

Copilot uses AI. Check for mistakes.
return nil //nolint:nilerr // Hook must be silent on failure
writeCommitMessageSpan.RecordError(err)
writeCommitMessageSpan.End()
return nil
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nilerr issue: os.WriteFile failure returns nil without //nolint:nilerr, which will be flagged by the nilerr linter. Add back the nolint (or refactor) to keep the hook silent while passing lint.

Suggested change
return nil
return fmt.Errorf("failed to write commit message: %w", err)

Copilot uses AI. Check for mistakes.
Comment on lines 768 to 773
commit, err := repo.CommitObject(head.Hash())
if err != nil {
return nil //nolint:nilerr // Hook must be silent on failure
openRepoSpan.RecordError(err)
openRepoSpan.End()
return nil
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nilerr issue: repo.CommitObject error path returns nil without //nolint:nilerr. Add back the linter suppression or adjust the pattern.

Copilot uses AI. Check for mistakes.
return nil //nolint:nilerr // Hook must be silent on failure
findSessionsSpan.RecordError(err)
findSessionsSpan.End()
return nil
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nilerr issue: paths.WorktreeRoot failure returns nil inside an error check but lacks //nolint:nilerr. This is expected to be flagged by the nilerr linter.

Suggested change
return nil
return nil //nolint:nilerr // Intentional: hooks must be silent on failure

Copilot uses AI. Check for mistakes.
Comment on lines +752 to 758
_, openRepoSpan := perf.Start(ctx, "open_repository_and_head")
repo, err := OpenRepository(ctx)
if err != nil {
return nil //nolint:nilerr // Hook must be silent on failure
openRepoSpan.RecordError(err)
openRepoSpan.End()
return nil
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nilerr issue in PostCommit: returning nil on an OpenRepository error inside if err != nil will be flagged by nilerr. Add //nolint:nilerr (or refactor) so CI passes while keeping hooks fail-open.

Copilot uses AI. Check for mistakes.
Comment on lines 761 to 766
head, err := repo.Head()
if err != nil {
return nil //nolint:nilerr // Hook must be silent on failure
openRepoSpan.RecordError(err)
openRepoSpan.End()
return nil
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nilerr issue: repo.Head() error path returns nil without a //nolint:nilerr annotation. This will likely fail golangci-lint.

Copilot uses AI. Check for mistakes.
Comment on lines +114 to 117
writeCheckpointSpan.End()
if err != nil {
return fmt.Errorf("failed to write temporary checkpoint: %w", err)
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeCheckpointSpan.End() is called before checking err from store.WriteTemporary, so failures won't be captured via RecordError and the span ends even when the operation errors. Record the error (if any) and end the span after the call (or defer End and record before returning).

Suggested change
writeCheckpointSpan.End()
if err != nil {
return fmt.Errorf("failed to write temporary checkpoint: %w", err)
}
if err != nil {
writeCheckpointSpan.RecordError(err)
writeCheckpointSpan.End()
return fmt.Errorf("failed to write temporary checkpoint: %w", err)
}
writeCheckpointSpan.End()

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +91
if s.err != nil {
attrs = append(attrs, slog.Bool("error", true))
}

// Add child step durations (and error flags) as flat keys
for _, child := range s.children {
// Auto-end children that were not explicitly ended
if !child.ended {
child.ended = true
child.duration = time.Since(child.start)
}
key := fmt.Sprintf("steps.%s_ms", child.name)
attrs = append(attrs, slog.Int64(key, child.duration.Milliseconds()))
if child.err != nil {
errKey := fmt.Sprintf("steps.%s_err", child.name)
attrs = append(attrs, slog.Bool(errKey, true))
}
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Root span logging only sets error=true when s.err is non-nil, but child spans can record errors without propagating to the root. This can make failed operations look successful at the root level (no error=true) even though steps.*_err flags exist. Consider setting the root error flag when any child span has an error (or propagating the first child error up).

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would make things quite confusing to handle: There might be child-steps that can fail and report errors but doing so isn't going to automatically fail the step that called them. I personally think these things should stay disconnected for the time being and assume that we'd manually report errors on the root level when we need to.

Comment on lines +105 to +106
fn()
child.End()
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Span.Measure doesn’t defer child.End(), so if fn() panics the child span may never be ended (and won’t have a meaningful duration/error state). Using defer child.End() (and optionally re-panicking) would make the helper robust.

Suggested change
fn()
child.End()
defer child.End()
fn()

Copilot uses AI. Check for mistakes.
…nvestigation

# Conflicts:
#	cmd/entire/cli/lifecycle.go
#	cmd/entire/cli/strategy/manual_commit_hooks.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants