
fix: stabilize PR107 tmux spawn, offline buffering, and bun runtime#13

Closed
leeroybrun wants to merge 44 commits into main from
slopus/pr/tmux-fix-2026-01-12

Conversation


@leeroybrun leeroybrun commented Jan 17, 2026

PR 107
fix: stabilize PR107 tmux spawn, offline buffering, and bun runtime

Context / Goal

This PR is a follow-up to slopus/happy-cli#107 (profiles/env expansion + tmux + runtime/offline changes).
The goal is to make tmux spawning correct and deterministic, align the spawn RPC contract with reality, restore reliable offline buffering, and harden logging.

Summary (what this PR does)

  • Fixes tmux env injection (no value mutation), makes empty session selection deterministic, and properly wires TMUX_TMPDIR for tmux client invocations.
  • Restores offline buffering for socket sends (removes PR107 early-return drops).
  • Makes Bun subprocess spawning coherent and consistent across normal spawn and tmux spawn.
  • Updates the spawn RPC typing to accept arbitrary env var maps (what the GUI actually sends).
  • Tightens DEBUG gating/redaction for logging and makes offline stubs safer.

What slopus#107 introduced (and what we’re fixing here)

tmux correctness

TMUX_TMPDIR contract gap

Offline behavior (“don’t send if disconnected”)

Bun runtime support

Provider config objects (ghost overrides / leftovers?)

RPC typing drift

What this series fixes / improves

1) Stop unintended profile application from CLI settings
  • Problem: the daemon implicitly fell back to settings.activeProfileId whenever the GUI didn’t provide env vars, so profiles could affect sessions even when the GUI profiles feature was disabled or unused (e.g. the "standard" non-wizard new-session modal with the wizard disabled in settings).
  • Fix: profile env vars are only applied when explicitly provided in SpawnSessionOptions.environmentVariables.
2) Persist profile identity per session (cross-device visibility)
  • Problem: without persistence, a session started with a profile on machine A couldn’t reliably show “which profile” on machine B.
  • Fix: thread profileId?: string | null through spawn → daemon → child env (HAPPY_SESSION_PROFILE_ID) → session metadata (non-secret, used for display/debugging).
3) Align CLI profile schema + env template support
  • Fixes:
    • Accept URL-or-template strings for provider endpoints (e.g. ${VAR} / ${VAR:-default})
    • Accept non-UUID ids for built-in profiles (string min(1))
    • Support ${VAR:=default} expansion (CLI env-template parity)
4) Prevent secret leakage in daemon + RPC logs
  • Problem: logging spawn params/options could leak tokens and env var values into ~/.happy/logs/*.
  • Fix: log only env var keys/counts and token presence, never values.
5) Prevent secret leakage in happy doctor
  • Problem: doctor printed settings.json verbatim, including profile env var values.
  • Fix: redact/mask secret-like fields; keep ${VAR} templates visible; avoid printing legacy caches.
6) Remove unwired / misleading functionality (strict cleanup)
  • Changes:
    • Removed startupBashScript from the CLI profile schema (it was never executed/passed to spawn by the UI and not wired anywhere in the CLI).
    • Removed localEnvironmentVariables from CLI settings (schema bump + migration deletes legacy key).
    • Removed unused helper APIs that relied on the cache (no call sites existed).
7) Tmux: optional session-env persistence, but only for profile-defined env
  • Behavior:
    • The spawned agent always gets the needed env vars via tmux new-window -e KEY=VALUE (spawn-time).
    • If TMUX_UPDATE_ENVIRONMENT is set (truthy), we additionally persist only profile-provided env vars into the tmux session via tmux set-environment.
    • We intentionally do not persist daemon-injected auth/session markers into the tmux session environment.
8) Tmux: added optional integration tests that validate the real behavior against a real tmux process
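To make the redaction policy in (4) and (5) concrete, a masking helper along these lines could hide secret-like values while keeping ${VAR} templates visible. This is an illustrative sketch, not the shipped maskValue from src/ui/doctor.ts:

```typescript
// Illustrative doctor-style masking (hypothetical implementation): secret
// values are masked, while ${VAR} templates are shown as-is because they
// contain no secret material.
function maskValue(value: string): string {
    // Pure templates like "${ANTHROPIC_AUTH_TOKEN}" are safe to display.
    if (/^\$\{[^}]+\}$/.test(value)) return value;
    // Very short values: mask entirely.
    if (value.length <= 4) return '****';
    // Keep a short prefix so users can recognise which credential it is.
    return `${value.slice(0, 4)}…(${value.length} chars)`;
}
```

With a policy like this, happy doctor can still show which endpoint template a profile uses without ever printing a resolved token.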

Links to the detailed issues in slopus/happy#107 (links go to the PR diff)

  1. tmux new-window -e env injection corrupts values (quotes/backslashes become literal)
  2. tmux $TMUX parsing is wrong (treats server pid as “session.window”)
  3. tmux “empty session name” selection is “first session from list-sessions” (nondeterministic)
  4. tmux command string built via args.join(' ') (breaks quoting/spacing)
  5. tmux runner treats exitCode=null (killed by signal) as success
  6. TMUX_TMPDIR “contract gap”: profile mapping sets it, but tmux client selection is only via -S socketPath
  7. Daemon falls back to CLI-local activeProfileId when GUI doesn’t supply env vars
  8. localEnvironmentVariables cache stored in settings (can contain secrets)
  9. Profile schema strictness (UUID ids + URL-only endpoints)
  10. expandEnvVars logs expanded values/defaults and uses non-bash empty semantics
  11. Bun runtime selection is incoherent in subprocess spawning
  12. SpawnSessionOptions.environmentVariables type is too narrow vs reality
  13. Deterministic message drop on disconnect for Claude session messages (early return)
  14. Offline stub is a plain object cast to ApiSessionClient
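Items 4 and 5 above can be illustrated with two small helpers. This is a hedged sketch of the general technique (POSIX single-quoting and signal-aware exit codes) with hypothetical names, not the actual src/utils/tmux.ts code:

```typescript
// Item 4: quote each argument with POSIX single quotes (an embedded quote
// becomes '\'' ) instead of naively joining with args.join(' ').
function shellQuote(arg: string): string {
    return `'${arg.replace(/'/g, `'\\''`)}'`;
}

// Item 5: a process killed by a signal reports exitCode === null; map that
// to a non-zero code so it is never mistaken for success.
function normalizeExitCode(exitCode: number | null): number {
    return exitCode ?? 1;
}

// Example: building a command line where quoting survives odd values.
const cmd = ['tmux', 'new-window', '-e', "KEY=va'lue"].map(shellQuote).join(' ');
```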

Summary by CodeRabbit

  • Bug Fixes

    • Fixed startup/env guards, missing-command checks, subprocess invocation, idle detection, and exit-code handling; added safer directory existence checks and signal-driven shutdown.
  • New Features

    • Environment preview RPC with secret-masking policies; subprocess invocation builder; new tmux spawn/window helpers.
  • Improvements

    • Centralized retry/session-update handling, clearer permission-response events, profileId propagation to session metadata, and improved redaction/logging of env/profile data.
  • Tests

    • Many new and updated unit and integration tests for env expansion, tmux, CLI invocation, sessions, and permission flows.


leeroybrun and others added 30 commits January 13, 2026 08:07
- Extract session update handlers into separate module (sessionUpdateHandlers.ts)
  - handleAgentMessageChunk, handleToolCallUpdate, handleToolCall, etc.
  - Reduces AcpBackend.ts from ~1350 to ~900 lines

- Add retry logic for ACP init/newSession operations
  - 3 attempts with exponential backoff (1s, 2s, 4s)
  - Improves reliability on slow starts

- Move timeouts to configurable constants
  - Add getIdleTimeout() to TransportHandler interface
  - Export GEMINI_TIMEOUTS from GeminiTransport
  - DEFAULT_IDLE_TIMEOUT_MS, DEFAULT_TOOL_CALL_TIMEOUT_MS

- Improve tool name detection in GeminiTransport
  - ExtendedToolPattern with inputFields and emptyInputDefault
  - Remove fragile context-based heuristics
  - Cleaner, more declarative pattern matching

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Align happy-cli profile schema with the app: profiles are env-var based only. Migrate any legacy provider config fields into environmentVariables during parsing to avoid data loss.
Add a single runtime invocation builder for node vs bun and allow overriding via HAPPY_CLI_SUBPROCESS_RUNTIME.
- Treat TMUX_TMPDIR as a directory via tmux client env (not -S socket path)

- Pass per-window env via new-window -e KEY=value without shell-style quoting

- Make empty sessionName resolve deterministically (attached/most-recent)

- Preserve failure semantics for signal-terminated tmux commands
Remove the hard send guard that dropped messages while disconnected and rely on socket.io client buffering (tests updated).
Make debugLargeJson a true DEBUG-only path and avoid logging expanded env values/defaults; tighten doctor masking for default templates.
The GUI sends a profile-derived env var map that may include provider-specific keys and tmux knobs; type it as Record<string,string>.
Handle codex --version output variations without misreporting 'not installed' and remove stdout logging of elicitation payloads.
Provide an EventEmitter-compatible stub and a focused unit test so offline mode can't crash on basic session events.
…2026-01-12

# Conflicts:
#	src/api/apiSession.ts
#	src/utils/offlineSessionStub.ts
@coderabbitai

coderabbitai Bot commented Jan 17, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough


Extracted ACP session-update handling into a new module, added unified retry/backoff for ACP, centralized environment variable/profile handling and schema bump, refactored tmux utilities and spawn lifecycle, added session/process cleanup and many tests, plus assorted CLI/env and transport tweaks.

Changes

Cohort / File(s) / Summary

Build & Initialization
`bin/happy-dev.mjs`
Move unconditional HAPPY_HOME_DIR & HAPPY_VARIANT init to top-level; remove duplicates.

ACP Backend & Handlers
`src/agent/acp/AcpBackend.ts`, `src/agent/acp/sessionUpdateHandlers.ts`, `src/agent/acp/index.ts`
Add withRetry exponential backoff and RETRY_CONFIG; extract session-update handlers into sessionUpdateHandlers.ts; introduce HandlerContext, refactor per-update handlers, add respondToPermission, re-export handlers.

Transport & Gemini
`src/agent/transport/TransportHandler.ts`, `src/agent/transport/handlers/GeminiTransport.ts`
Add optional getIdleTimeout() to transport interface; export GEMINI_TIMEOUTS; extend tool-pattern metadata and revise tool-name/empty-input logic.

Gemini Factory & Runtime
`src/agent/factories/gemini.ts`, `src/agent/factories/index.ts`, `src/gemini/runGemini.ts`
createGeminiBackend returns GeminiBackendResult (backend, model, modelSource); run loop adds message-processing synchronization and applies pending session swaps; model source is canonical.

Tmux & Spawn Lifecycle
`src/utils/tmux.ts`, `src/daemon/run.ts`, `src/daemon/*.test.ts`, `src/utils/tmux.*.test.ts`
Major tmux refactor: POSIX quoting, normalizeExitCode, TmuxEnvironment now has socket_path/server_pid/pane; TmuxUtilities accepts tmuxCommandEnv/tmuxSocketPath; spawnInTmux returns session/window; add buildTmuxWindowEnv & buildTmuxSpawnConfig, CODEX_HOME per-PID cleanup, masked env logging, many tests (including real tmux integration).

Profile, Persistence & Env Preview
`src/persistence.ts`, `src/modules/common/registerCommonHandlers.ts`, `src/persistence.profileSchema.test.ts`, `src/modules/common/registerCommonHandlers.previewEnv.test.ts`
Unify provider configs into environmentVariables; add normalizeLegacyProfileConfig & mergeEnvironmentVariables; bump SUPPORTED_SCHEMA_VERSION to 3; remove CLI-local env cache; add preview-env RPC; add SpawnSessionOptions.profileId.

Spawn Helpers & Env Expansion
`src/utils/spawnHappyCLI.ts`, `src/utils/spawnHappyCLI.invocation.test.ts`, `src/utils/expandEnvVars.ts`, `src/utils/expandEnvVars.test.ts`, `src/utils/spawnHappyCLI.*`
Add buildHappyCliSubprocessInvocation and runtime detection (node/bun); expandEnvironmentVariables supports ${VAR}, ${VAR:-default}, ${VAR:=default} with options { warnOnUndefined? }.

API & Sessions
`src/api/apiMachine.ts`, `src/api/apiSession.ts`, `src/api/types.ts`, tests
Forward profileId to spawnSession; make Machine.metadata nullable and add Metadata.profileId?; suppress repeated disconnected-send logs with one-time flag and reset on connect; tests updated to reliably restore console mocks and add socket-disconnected behavior.

Offline Sessions & Tests
`src/utils/offlineSessionStub.ts`, `src/claude/`, `src/claude/utils/sessionScanner.test.ts`, many `src/**/*.test.ts`
Replace inline offline-session object with OfflineSessionStub class (EventEmitter + RPC manager); add -r resume short flag, AbortController wiring for signal cleanup; update scanner tests to use CLAUDE_CONFIG_DIR and polling helper; many new/updated tests across modules.

Codex MCP & Elicitation
`src/codex/codexMcpClient.ts`, `src/codex/codexMcpClient.test.ts`
Broaden version parsing and fallback; add createCodexElicitationRequestHandler and CodexPermissionHandlerProvider type; centralize elicitation permission handling; add tests.

CLI & Scripts
`scripts/env-wrapper.cjs`, `scripts/ripgrep_launcher.cjs`, `scripts/claude_version_utils.cjs`, `scripts/test-continue-fix.sh`, tests
Require <command> in env-wrapper; ripgrep launcher captures stdout and tightens exit handling and arg validation; narrow Windows npm detection logic; add existence check for project dir in test helper.

Conversation & Permission Utilities
`src/gemini/utils/conversationHistory.ts`, `src/gemini/utils/permissionHandler.ts`
Add message deduplication for consecutive identical role messages; expose updateSession override on GeminiPermissionHandler for type/visibility.

UI, Logging & Doctor
`src/ui/doctor.ts`, `src/ui/logger.ts`, tests
Add maskValue and redactSettingsForDisplay used by doctor; debugLargeJson no-ops when DEBUG unset; add tests for masking and logger behavior.

Misc Tests & Small Fixes
many `src/**/*.test.ts`
Numerous new and adjusted unit/integration tests (logger, tmux, expandEnvVars, spawn invocation, ripgrep, preview-env, profile migration, etc.).
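The ${VAR:=default} assignment semantics mentioned in the env-expansion row can be sketched as follows. This is an illustrative reimplementation under assumed semantics, not the shipped src/utils/expandEnvVars.ts:

```typescript
// Expand ${VAR}, ${VAR:-default}, and ${VAR:=default} in a template string.
// ':=' additionally remembers the default in a local map so later ${VAR}
// references in the same pass resolve (bash-like assignment semantics).
function expandTemplate(
    value: string,
    env: Record<string, string | undefined>
): string {
    const local: Record<string, string | undefined> = { ...env };
    return value.replace(/\$\{([^}]+)\}/g, (match, expr: string) => {
        const m = /^([A-Za-z_][A-Za-z0-9_]*)(?::([-=])(.*))?$/s.exec(expr);
        if (!m) return match;                      // malformed: keep placeholder
        const [, name, op, def] = m;
        const current = local[name];
        if (current !== undefined && current !== '') return current;
        if (op === undefined) return current ?? match; // plain ${VAR}
        if (op === '=') local[name] = def;         // ${VAR:=default} assigns
        return def;                                // ':-' and ':=' both substitute
    });
}
```

Note that, as in bash, both forms treat a set-but-empty variable as missing when a default is given.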

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant GeminiRuntime as Gemini Runtime
    participant Backend as GeminiBackend
    participant Transport as GeminiTransport
    participant AgentStream as AgentMessage Stream

    User->>GeminiRuntime: send message / request session swap
    GeminiRuntime->>GeminiRuntime: if isProcessingMessage queue swap (pendingSessionSwap)
    GeminiRuntime->>Backend: process message (mark isProcessingMessage)
    Backend->>Transport: stream session updates (tool calls, thinking, message chunks)
    Transport->>AgentStream: emit agent messages / tool-call events / thinking
    Backend-->>GeminiRuntime: turn complete
    GeminiRuntime->>GeminiRuntime: applyPendingSessionSwap() -> Backend.update permission handler
```
```mermaid
sequenceDiagram
    participant AcpBackend as ACP Backend
    participant Retry as withRetry
    participant RPC as Remote RPC
    participant Handlers as SessionUpdateHandlers
    participant AgentStream as AgentMessage Stream

    AcpBackend->>Retry: call Initialize/NewSession (with RETRY_CONFIG)
    Retry->>RPC: attempt RPC
    alt success
        RPC-->>Retry: result
    else failure
        RPC-->>Retry: error
        Retry->>Retry: backoff & retry (onRetry callback)
    end
    Retry-->>AcpBackend: final result or throw
    AcpBackend->>Handlers: dispatch HandlerContext + SessionUpdate
    Handlers->>AgentStream: emit AgentMessage events (tool calls, thinking, outputs)
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 I tunneled through sessions, retries in tow,
I nudged envs and tmux where pinwheels blow.
Handlers split out, backoff set to spin,
Profiles stitched up, tests hopped right in.
A carrot of code — hop, merge, and grin!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 53.03%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): the title clearly describes the main focus of the PR: stabilizing tmux spawn behavior, offline buffering, and bun runtime support.


@gemini-code-assist

Summary of Changes

Hello @leeroybrun, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the stability, correctness, and security of the CLI's core functionalities. It addresses several critical issues identified in a previous release, particularly concerning tmux integration, offline behavior, and runtime execution. The changes aim to provide a more robust and predictable user experience by ensuring consistent environment handling, secure logging practices, and resilient agent communication.

Highlights

  • Tmux Stability and Correctness: Addressed several issues related to tmux spawning, including environment variable injection, deterministic empty session selection, proper wiring of TMUX_TMPDIR, and robust command string building to prevent quoting/spacing problems. This ensures more reliable and predictable behavior when interacting with tmux sessions.
  • Offline Buffering Restoration: Reinstated reliable offline buffering for socket.io sends, removing previous early-return drops that caused complete message loss when disconnected. Messages will now be buffered and sent upon reconnection.
  • Bun Runtime Support: Improved Bun subprocess spawning to be coherent and consistent across both normal and tmux-based invocations, ensuring correct runtime selection and flag usage.
  • RPC Typing and Logging Hardening: Updated the spawn RPC typing to accurately accept arbitrary environment variable maps, aligning with actual GUI behavior. Additionally, logging has been tightened with DEBUG gating and redaction to prevent sensitive information leakage.
  • Profile Management Refinements: Implemented changes to prevent unintended profile application from CLI settings, ensuring profile environment variables are only applied when explicitly provided. Session identity is now persisted per session for cross-device visibility and debugging.
  • Schema Alignment and Secret Redaction: Aligned the CLI profile schema to support URL-or-template strings for provider endpoints and non-UUID IDs for built-in profiles. Critical updates were made to prevent secret leakage in daemon and happy doctor logs by redacting sensitive values.
  • Cleanup of Unused Functionality: Removed unwired and misleading functionalities such as startupBashScript and localEnvironmentVariables from the CLI profile schema and settings, simplifying the codebase and user configuration.
  • ACP Backend Retry Logic: Introduced retry logic with exponential backoff for initialize and newSession calls in the ACP backend, improving resilience against transient connection issues with agent processes.
  • Modularized ACP Session Update Handling: Refactored ACP session update processing into a new sessionUpdateHandlers.ts module, enhancing maintainability and testability of the agent backend logic.
  • Improved Gemini Tool Name Determination: Enhanced the GeminiTransport to more accurately determine real tool names from toolCallId patterns and input field signatures, especially when Gemini reports 'other' or 'Unknown tool'.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Supported commands:
  • /gemini review: Performs a code review for the current pull request in its current state.
  • /gemini summary: Provides a summary of the current pull request in its current state.
  • @gemini-code-assist: Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • /gemini help: Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request is a comprehensive stabilization effort that addresses several critical issues across tmux integration, offline buffering, and runtime consistency. The fixes for tmux command construction and environment handling are particularly well-done, resolving numerous correctness and security bugs from the previous implementation. The restoration of offline message buffering by removing the early-return logic is a key reliability improvement. I've also noted a few important fixes, including preventing secret leakage in daemon logs and fixing a race condition in Codex auth file creation. The extensive refactoring, such as extracting ACP session update handlers and unifying profile configuration, significantly improves the maintainability of the codebase.

Comment thread src/daemon/run.ts
Comment on lines +238 to +252
```ts
// Do NOT log raw options: it may include secrets (token / env vars).
const envKeys = options.environmentVariables && typeof options.environmentVariables === 'object'
    ? Object.keys(options.environmentVariables as Record<string, unknown>)
    : [];
logger.debugLargeJson('[DAEMON RUN] Spawning session', {
    directory: options.directory,
    sessionId: options.sessionId,
    machineId: options.machineId,
    approvedNewDirectoryCreation: options.approvedNewDirectoryCreation,
    agent: options.agent,
    profileId: options.profileId,
    hasToken: !!options.token,
    environmentVariableCount: envKeys.length,
    environmentVariableKeys: envKeys,
});
```


Severity: critical (security)

This is an excellent security improvement. The previous implementation logged the entire options object, which could leak sensitive tokens and environment variable values into the daemon logs. Redacting this information by logging only the count and keys of environment variables is the correct approach to prevent secret leakage.

Comment thread src/daemon/run.ts

```diff
 // Write the token to the temporary directory
-fs.writeFile(join(codexHomeDir.name, 'auth.json'), options.token);
+await fs.writeFile(join(codexHomeDir.name, 'auth.json'), options.token);
```


Severity: high

Good catch! The previous code was missing an await here, which created a race condition: the CODEX_HOME environment variable could be used by the spawned process before the auth.json file was fully written, leading to authentication failures. Awaiting the write before spawning fixes the issue.

Comment thread src/utils/offlineSessionStub.ts Outdated
Comment on lines +65 to +66
```diff
 export function createOfflineSessionStub(sessionTag: string): ApiSessionClient {
-    return {
-        sessionId: `offline-${sessionTag}`,
-        sendCodexMessage: () => {},
-        sendAgentMessage: () => {},
-        sendClaudeSessionMessage: () => {},
-        keepAlive: () => {},
-        sendSessionEvent: () => {},
-        sendSessionDeath: () => {},
-        updateLifecycleState: () => {},
-        requestControlTransfer: async () => {},
-        flush: async () => {},
-        close: async () => {},
-        updateMetadata: () => {},
-        updateAgentState: () => {},
-        onUserMessage: () => {},
-        rpcHandlerManager: {
-            registerHandler: () => {}
-        }
-    } as unknown as ApiSessionClient;
+    return new OfflineSessionStub(`offline-${sessionTag}`) as unknown as ApiSessionClient;
```


Severity: high

This is a great fix. The previous implementation returned a plain object cast to ApiSessionClient, which was missing the EventEmitter methods. This would have caused a crash if any code tried to register an event listener (e.g., session.on(...)) while in offline mode. By creating a class that extends EventEmitter, you've made the offline stub robust and contract-compliant.
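The shape of such a stub might look like the following sketch. This is a reduced, hypothetical surface for illustration; the real ApiSessionClient interface has more methods and richer types:

```typescript
import { EventEmitter } from 'node:events';

// Hedged sketch of an EventEmitter-backed offline stub.
class OfflineSessionStub extends EventEmitter {
    readonly rpcHandlerManager = { registerHandler: (..._args: unknown[]): void => {} };

    constructor(readonly sessionId: string) {
        super();
    }

    // Offline: drop outbound traffic silently instead of crashing.
    sendAgentMessage(..._args: unknown[]): void {}
    keepAlive(..._args: unknown[]): void {}
    flush(): Promise<void> { return Promise.resolve(); }
    close(): Promise<void> { return Promise.resolve(); }
}

// Unlike a plain object literal cast to ApiSessionClient, event
// registration now works: stub.on(...) no longer throws.
const stub = new OfflineSessionStub('offline-demo');
stub.on('userMessage', () => {});
```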


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/utils/tmux.ts (1)

990-1014: Return the trimmed session name to avoid mismatched identifiers.
You create the session using trimmedSessionName but return the original sessionName; leading/trailing whitespace would yield a non-existent identifier.

🔧 Suggested fix
```diff
-            const sessionIdentifier: TmuxSessionIdentifier = {
-                session: sessionName,
+            const sessionIdentifier: TmuxSessionIdentifier = {
+                session: trimmedSessionName,
                 window: windowName
             };
```
src/utils/expandEnvVars.ts (1)

37-78: Implement bash assignment semantics for ${VAR:=default} or clarify if single-variable expansion is sufficient.

The current implementation treats := identically to :-, so ${MISSING:=x}-${MISSING} leaves the second placeholder unresolved. If bash semantics are intended (as the comments suggest), add a local tracking map to store := assignments during expansion. If single-variable templates are sufficient for your use cases, update the comments to clarify that assignment semantics are not supported.

🛠️ Proposed fix (local assignment map)

```diff
 export function expandEnvironmentVariables(
     envVars: Record<string, string>,
     sourceEnv: NodeJS.ProcessEnv = process.env
 ): Record<string, string> {
     const expanded: Record<string, string> = {};
     const undefinedVars: string[] = [];
+    const effectiveEnv: Record<string, string | undefined> = { ...sourceEnv };

     for (const [key, value] of Object.entries(envVars)) {
         // Replace all ${VAR}, ${VAR:-default}, and ${VAR:=default} references with actual values from sourceEnv
         const expandedValue = value.replace(/\$\{([^}]+)\}/g, (match, expr) => {
             // Support bash parameter expansion: ${VAR:-default} and ${VAR:=default}
             // Example: ${Z_AI_BASE_URL:-https://api.z.ai/api/anthropic}
             const colonDashIndex = expr.indexOf(':-');
             const colonEqIndex = expr.indexOf(':=');
             let varName: string;
             let defaultValue: string | undefined;
+            const isAssignDefault = colonEqIndex !== -1;

             if (colonDashIndex !== -1 || colonEqIndex !== -1) {
                 // Split ${VAR:-default} or ${VAR:=default} into varName and defaultValue
                 const idx = colonDashIndex !== -1 ? colonDashIndex : colonEqIndex;
                 varName = expr.substring(0, idx);
                 defaultValue = expr.substring(idx + 2);
             } else {
                 // Simple ${VAR} reference
                 varName = expr;
             }

-            const resolvedValue = sourceEnv[varName];
+            const resolvedValue = effectiveEnv[varName];
             const shouldTreatEmptyAsMissing = defaultValue !== undefined;
             const isMissing = resolvedValue === undefined || (shouldTreatEmptyAsMissing && resolvedValue === '');

             if (!isMissing) {
                 // Variable found in source environment - use its value
                 if (process.env.DEBUG) {
                     logger.debug(`[EXPAND ENV] Expanded ${varName} from daemon env`);
                 }

                 // Warn if empty string (common mistake)
                 if (resolvedValue === '') {
                     logger.warn(`[EXPAND ENV] WARNING: ${varName} is set but EMPTY in daemon environment`);
                 }

                 return resolvedValue;
             } else if (defaultValue !== undefined) {
                 // Variable not found but default value provided - use default
+                if (isAssignDefault) {
+                    effectiveEnv[varName] = defaultValue;
+                }
                 if (process.env.DEBUG) {
                     logger.debug(`[EXPAND ENV] Using default value for ${varName}`);
                 }
                 return defaultValue;
             } else {
                 // Variable not found and no default - keep placeholder and warn
                 undefinedVars.push(varName);
                 return match;
             }
         });

         expanded[key] = expandedValue;
     }
```
src/agent/acp/AcpBackend.ts (1)

658-745: Use AbortSignal to properly cancel in-flight initialize/newSession requests on timeout.

The current Promise.race pattern leaves requests in-flight after timeout; retries can create multiple duplicate sessions. The @agentclientprotocol/sdk supports AbortSignal cancellation for both initialize() and newSession(). Create an AbortController per retry attempt and pass { signal: controller.signal } to cleanly cancel the request when the timeout fires, preventing duplicate sessions from slow responses arriving after timeout but before the next retry.
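A per-attempt controller could be wired roughly as follows. This is a hypothetical sketch of the suggested pattern, not the actual AcpBackend code, and it assumes the callee forwards the signal to the SDK call:

```typescript
// Each attempt gets its own AbortController; the per-attempt timeout aborts
// the in-flight request so a late response cannot race the next retry.
async function withRetry<T>(
    fn: (signal: AbortSignal) => Promise<T>,
    attempts = 3,
    baseDelayMs = 1000,
    attemptTimeoutMs = 30_000
): Promise<T> {
    let lastError: unknown;
    for (let attempt = 0; attempt < attempts; attempt++) {
        const controller = new AbortController();
        const timer = setTimeout(() => controller.abort(), attemptTimeoutMs);
        try {
            // Callee passes the signal through, e.g. initialize(..., { signal }).
            return await fn(controller.signal);
        } catch (err) {
            lastError = err;
            if (attempt < attempts - 1) {
                // Exponential backoff: 1s, 2s, 4s with the defaults above.
                await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
            }
        } finally {
            clearTimeout(timer);
        }
    }
    throw lastError;
}
```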

🤖 Fix all issues with AI agents
In `@src/agent/acp/sessionUpdateHandlers.ts`:
- Around line 356-432: The failToolCall function currently logs the full
tool-call payload (variable content) for investigation tools; change it to never
log full content by creating a safe, redacted/truncated representation (e.g.,
use or extend extractErrorDetail to produce safeContent or call a new
redactAndTruncate(content, maxChars) helper) and replace the
logger.debug(`[AcpBackend] 🔍 Investigation tool FAILED - full content:`,
JSON.stringify(content, null, 2)); line with a single-line debug that logs only
the redacted/truncated string (and limit length, e.g. first 500 chars) and
ensure any other debug that may reveal payloads uses the same safeContent; keep
other behavior (timing, timeout, emit) unchanged.

In `@src/codex/codexMcpClient.test.ts`:
- Around line 41-72: The test is passing an array for codex_command which fails
validation and prevents the real logic from running; update the test so the
elicitation params match the schema (pass codex_command as a string, e.g. "echo
hi") and then assert that the permissionHandler.handleToolCall mock was called
(verify (permissionHandler.handleToolCall).toHaveBeenCalledWith(...) or at least
toHaveBeenCalled()) in addition to keeping the console.log spy assertion; locate
and modify the test around createCodexElicitationRequestHandler, handler, and
permissionHandler to make these changes.

In `@src/codex/codexMcpClient.ts`:
- Around line 57-65: The failing test sends codex_command as an array but the
schema CodexBashElicitationParamsSchema defines codex_command as z.string() and
downstream code (the call to handleToolCall / permission handler) expects a
string; update the test payload to pass codex_command as a single string (e.g.,
'echo hi' or the intended shell command string) so it conforms to
CodexBashElicitationParams and can be forwarded as the command property to
handleToolCall.

In `@src/utils/offlineSessionStub.ts`:
- Around line 17-44: The OfflineSessionStub currently diverges from the real
ApiSessionClient: its methods have no parameters, it exposes extra methods
(updateLifecycleState, requestControlTransfer), and it lacks sendUsageData,
allowing silent API drift via the as unknown as ApiSessionClient cast; fix this
by having OfflineSessionStub explicitly implement the ApiSessionClient interface
(e.g., class OfflineSessionStub implements ApiSessionClient) and update all stub
method signatures to exactly match ApiSessionClient (notably keepAlive(thinking,
mode), sendAgentMessage(provider, body), sendCodexMessage(...),
sendClaudeSessionMessage(...), sendSessionEvent(...), sendSessionDeath(...),
flush(), close(), updateMetadata(...), updateAgentState(...),
onUserMessage(...), and add sendUsageData(...)); remove updateLifecycleState and
requestControlTransfer from the class and ensure
rpcHandlerManager.registerHandler remains typed as in the diff so TypeScript
will enforce future API changes.
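The `implements` fix can be illustrated with a stand-in interface; the real ApiSessionClient surface is larger, and the names below are only for demonstration:

```typescript
// Stand-in interface: NOT the real ApiSessionClient, just enough surface to
// show that `implements` makes TypeScript flag signature drift, which an
// `as unknown as ApiSessionClient` cast silently hides.
interface SessionClientLike {
    keepAlive(thinking: boolean, mode: string): void;
    flush(): Promise<void>;
}

class OfflineSessionStubSketch implements SessionClientLike {
    keepAlive(_thinking: boolean, _mode: string): void {
        // no-op while offline
    }
    async flush(): Promise<void> {
        // nothing buffered in this sketch
    }
}
```

Renaming or re-signaturing a method on the interface now becomes a compile error in the stub rather than a runtime surprise.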

In `@src/utils/spawnHappyCLI.invocation.test.ts`:
- Around line 3-37: The tests mutate process.env.HAPPY_CLI_SUBPROCESS_RUNTIME
but only the second test restores it; add an afterEach that always resets
process.env.HAPPY_CLI_SUBPROCESS_RUNTIME back to the captured
originalRuntimeOverride so each spec is isolated. Use the existing
originalRuntimeOverride constant declared at the top of the suite and in
afterEach restore by deleting the env var if originalRuntimeOverride is
undefined or setting it back otherwise; this keeps
buildHappyCliSubprocessInvocation tests independent.
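The restore logic the prompt asks for can be factored as a small framework-agnostic helper (sketch; wire the returned function into vitest's afterEach):

```typescript
// Capture the original value once and restore it exactly - deleting the key
// when it was originally unset, since unset and '' differ for env vars.
function makeEnvRestorer(key: string): () => void {
    const original = process.env[key];
    return () => {
        if (original === undefined) {
            delete process.env[key];
        } else {
            process.env[key] = original;
        }
    };
}

// In the test file (sketch):
// const restore = makeEnvRestorer('HAPPY_CLI_SUBPROCESS_RUNTIME');
// afterEach(restore);
```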

In `@src/utils/tmux.socketPath.test.ts`:
- Around line 1-41: The test currently mocks child_process.spawn via vi.mock
(spawnMock, getLastSpawnCall in tmux.socketPath.test.ts); remove that mocking
and either (A) move the existing spawn-argument verification into a real
integration test file (tmux.real.integration.test.ts) that runs only under an
opt-in env guard and invokes the real spawn, or (B) refactor the unit test to
extract the logic that builds the spawn arguments into a pure function (e.g.,
createTmuxSpawnArgs or similar) and test that function directly without mocking
child_process; ensure you delete the vi.mock('child_process') block and update
tests to call the new pure function (or the guarded real integration test) and
assert the expected args instead of relying on spawnMock/getLastSpawnCall.
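Option (B) might look like this; the argument layout is an assumption for illustration, not the real tmux.ts logic:

```typescript
// Pure argument builder: testable without mocking child_process.
// A hypothetical caller would do spawn('tmux', createTmuxSpawnArgs(...)).
function createTmuxSpawnArgs(
    socketPath: string | undefined,
    sessionName: string,
    command: string,
): string[] {
    const args: string[] = [];
    if (socketPath) {
        args.push('-S', socketPath); // explicit socket comes before the subcommand
    }
    args.push('new-session', '-d', '-s', sessionName, command);
    return args;
}
```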

In `@src/utils/tmux.test.ts`:
- Around line 476-517: The FakeTmuxUtilities.executeTmuxCommand method is typed
as Promise<any>; change its signature to match the base class by using
Promise<TmuxCommandResult | null> instead of any (do not use ReturnType or
double-wrap a Promise). Update the method declaration in class FakeTmuxUtilities
(executeTmuxCommand) to return Promise<TmuxCommandResult | null> and ensure the
TS type (TmuxCommandResult) is imported or in scope so the returned objects
conform to that type.

In `@src/utils/tmux.ts`:
- Around line 153-166: The new validator in src/utils/tmux.ts now allows spaces
in session and window names but helper parsers still use \w-style regexes;
update the other parsers (e.g., extractSessionAndWindow and listWindows) to
accept the same character set (/^[a-zA-Z0-9._ -]+$/) used in the session/window
validation so names with spaces round-trip correctly, or if you prefer a
documented limitation, add a clear comment and user-facing note in those
functions explaining they currently reject space-containing names; ensure the
same named-character-class is used consistently across
TmuxSessionIdentifierError checks and parsing helpers.
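The shared character class would behave like this (regex taken from the comment; the function name is illustrative):

```typescript
// One pattern for both validation and parsing so names round-trip.
// Allows letters, digits, dot, underscore, space, and hyphen.
const TMUX_NAME_PATTERN = /^[a-zA-Z0-9._ -]+$/;

function isValidTmuxName(name: string): boolean {
    return TMUX_NAME_PATTERN.test(name);
}
```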
🧹 Nitpick comments (15)
scripts/ripgrep_launcher.cjs (2)

88-98: Consider preserving signal termination as a non-zero exit code.

result.status is null when the process is killed by a signal. The || 0 fallback treats signal termination as success, which may hide issues.

Suggested improvement
 function createRipgrepWrapper(binaryPath) {
     return {
         ripgrepMain: (args) => {
             const { spawnSync } = require('child_process');
             const result = spawnSync(binaryPath, args, {
                 stdio: 'inherit',
                 cwd: process.cwd()
             });
-            return result.status || 0;
+            // Killed by a signal: return a non-zero code (a flat 128 here;
+            // map to 128 + signal number via os.constants.signals if needed)
+            if (result.status === null) {
+                return result.signal ? 128 : 1;
+            }
+            return result.status;
         }
     };
 }
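If the full 128 + N convention is wanted instead of the flat code above, Node's os.constants.signals maps signal names to numbers (sketch, not part of the suggested diff):

```typescript
import { constants } from 'node:os';

// Map a spawnSync result.signal to a POSIX-style exit code (128 + signal number).
function signalExitCode(signal: NodeJS.Signals | null): number {
    if (!signal) return 1; // killed without signal info: generic failure
    const num = constants.signals[signal];
    return num !== undefined ? 128 + num : 1;
}
```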

170-177: Consider a clearer error message when no arguments are provided.

If the script is invoked without arguments, args[0] is undefined and JSON.parse(undefined) throws "Unexpected token 'u'", which is cryptic. A guard would improve debuggability.

Suggested improvement
 let parsedArgs;
 try {
+    if (!args[0]) {
+        console.error('Usage: ripgrep_launcher.cjs <JSON-encoded-args>');
+        process.exit(1);
+    }
     parsedArgs = JSON.parse(args[0]);
 } catch (error) {
     console.error('Failed to parse arguments:', error.message);
     process.exit(1);
 }
src/ui/doctor.test.ts (1)

4-9: Consider expanding test coverage for maskValue.

The test correctly validates template handling, but maskValue also handles empty strings and plain secret values. Consider adding test cases for:

  • maskValue('') → '<empty>'
  • maskValue('sk-live-secret') → '<14 chars>'
  • maskValue(undefined) → undefined

Also, the dynamic import on line 5 adds unnecessary complexity. Since maskValue is a pure function with no side effects requiring isolation, a static import would be simpler.

Suggested improvement
-import { describe, it, expect } from 'vitest';
+import { describe, it, expect } from 'vitest';
+import { maskValue } from '@/ui/doctor';
 
 describe('doctor redaction', () => {
     it('does not treat ${VAR:-default} templates as safe', async () => {
-        const doctorModule = (await import('@/ui/doctor')) as typeof import('@/ui/doctor');
-        expect(doctorModule.maskValue('${SAFE_TEMPLATE}')).toBe('${SAFE_TEMPLATE}');
-        expect(doctorModule.maskValue('${LEAK:-sk-live-secret}')).toMatch(/^\$\{LEAK:-<\d+ chars>\}$/);
-        expect(doctorModule.maskValue('${LEAK:=sk-live-secret}')).toMatch(/^\$\{LEAK:=<\d+ chars>\}$/);
+        expect(maskValue('${SAFE_TEMPLATE}')).toBe('${SAFE_TEMPLATE}');
+        expect(maskValue('${LEAK:-sk-live-secret}')).toMatch(/^\$\{LEAK:-<\d+ chars>\}$/);
+        expect(maskValue('${LEAK:=sk-live-secret}')).toMatch(/^\$\{LEAK:=<\d+ chars>\}$/);
+    });
+
+    it('masks plain values and handles edge cases', () => {
+        expect(maskValue('')).toBe('<empty>');
+        expect(maskValue('sk-live-secret')).toBe('<14 chars>');
+        expect(maskValue(undefined)).toBe(undefined);
     });
 });
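For reference, a hypothetical maskValue consistent with the assertions above (the real implementation in src/ui/doctor.ts may differ in details):

```typescript
// Sketch: pure ${VAR} templates pass through, templates carrying a :- / :=
// default embed a literal that must be masked, and plain values are masked
// to a length indicator.
function maskValue(value: string | undefined): string | undefined {
    if (value === undefined) return undefined;
    if (value === '') return '<empty>';
    const templateMatch = value.match(/^\$\{([A-Z0-9_]+)(:[-=])(.*)\}$/);
    if (templateMatch) {
        const [, name, op, def] = templateMatch;
        return `\${${name}${op}<${def.length} chars>}`;
    }
    if (/^\$\{[A-Z0-9_]+\}$/.test(value)) return value; // safe template
    return `<${value.length} chars>`;
}
```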
src/codex/codexMcpClient.ts (1)

67-120: Well-structured elicitation handler factory with proper error handling.

The factory pattern with late-binding permission handler resolution is clean and allows for dynamic handler updates. Error handling covers all paths (no handler, invalid params, handler errors) with appropriate logging.

Minor observation: the return type could be made explicit for better documentation:

export function createCodexElicitationRequestHandler(
    permissionHandlerProvider: CodexPermissionHandlerProvider,
): (request: { params: unknown }) => Promise<{ decision: 'denied' | 'approved'; reason?: string }>
src/codex/codexMcpClient.test.ts (1)

3-39: Consider reducing mocking per project guidelines.

Per coding guidelines, tests should use "Vitest with no mocking - tests should make real API calls." While mocking external SDKs for unit tests of internal logic is sometimes justified, consider:

  1. Integration tests with a real Codex CLI (if available) for critical paths
  2. Alternatively, document why mocking is necessary here (e.g., "Codex CLI not available in CI")

If mocking is intentional for this file, a brief comment explaining the deviation from guidelines would help future maintainers.

src/utils/tmux.commandEnv.test.ts (1)

11-41: Consider guideline alignment: mocking vs real calls.

The coding guidelines specify "Write unit tests using Vitest with no mocking - tests should make real API calls." This test uses vi.mock to stub child_process.spawn. However, given that tmux.real.integration.test.ts provides opt-in real tmux integration tests, having this mocked unit test as a fast-feedback complement is reasonable—it verifies the env-merging logic without requiring tmux to be installed.

If alignment with the guideline is strict, consider moving this assertion into the real integration test suite instead. Otherwise, this is acceptable as a lightweight unit-level sanity check.

src/utils/tmux.real.integration.test.ts (2)

151-152: Consider adding a type for the dump script payload.

Using as any for the parsed JSON payload loses type safety. A simple interface would improve readability and catch typos in property access.

🔧 Optional: Add payload type
// Add near imports or within the describe block
type DumpPayload = {
    argv: string[];
    env: {
        FOO?: string;
        BAR?: string;
        TMUX?: string;
        TMUX_PANE?: string;
        TMUX_TMPDIR?: string;
    };
};

// Then use in tests:
const payload = JSON.parse(readFileSync(outFile, 'utf8')) as DumpPayload;

Also applies to: 193-193, 229-229, 277-277


160-163: Verify kill-server cleanup is reliable.

The void runTmux(['-S', socketPath, 'kill-server']) pattern ignores the result. While this is intentional for cleanup (we don't want test failures from cleanup issues), consider logging failures in debug mode for troubleshooting flaky tests.

🔧 Optional: Log cleanup failures for debugging
function cleanupTmuxServer(socketPath?: string, tmuxTmpDir?: string): void {
    const args = socketPath ? ['-S', socketPath, 'kill-server'] : ['kill-server'];
    const env = tmuxTmpDir ? { TMUX_TMPDIR: tmuxTmpDir } : undefined;
    const result = runTmux(args, { env });
    if (result.status !== 0 && process.env.DEBUG) {
        console.error(`tmux cleanup failed: ${result.stderr}`);
    }
}

Also applies to: 197-200, 234-237, 279-284

src/gemini/utils/permissionHandler.ts (1)

34-39: Consider if explicit override is necessary.

This override simply delegates to super.updateSession(newSession) without additional logic. If the base class method is already public, this is technically redundant. However, if the intent is to make the method explicitly visible in the class's public API surface for IDE discoverability or documentation purposes, this is acceptable.

src/utils/expandEnvVars.test.ts (1)

137-187: Add coverage for := reuse within the same expansion.

If you plan to mirror bash semantics, add a test that ensures ${MISSING:=default} is reused by subsequent references in the same value (or later keys).

✅ Suggested test
     it('treats empty string as missing for ${VAR:=default}', () => {
         const envVars = {
             TARGET: '${EMPTY_VAR:=default-value}',
         };
         const sourceEnv = {
             EMPTY_VAR: '',
         };

         const result = expandEnvironmentVariables(envVars, sourceEnv);
         expect(result).toEqual({
             TARGET: 'default-value',
         });
     });
+
+    it('reuses := default for subsequent references', () => {
+        const envVars = {
+            TARGET: '${MISSING_VAR:=default-value}-${MISSING_VAR}',
+        };
+        const sourceEnv = {};
+
+        const result = expandEnvironmentVariables(envVars, sourceEnv);
+        expect(result).toEqual({
+            TARGET: 'default-value-default-value',
+        });
+    });
src/utils/spawnHappyCLI.invocation.test.ts (1)

1-1: Add a brief file header comment.

The TS guidelines request a short file-level header describing responsibilities.

📝 Suggested header
+/**
+ * Tests for buildHappyCliSubprocessInvocation runtime/argv selection.
+ */
 import { describe, it, expect } from 'vitest';

As per coding guidelines.

src/persistence.profileSchema.test.ts (1)

1-32: LGTM! Clean test coverage for legacy profile migration.

The tests effectively validate the migration of legacy provider config into environmentVariables and the precedence behavior. The filter + length check pattern in the second test is a good approach.

Consider adding test coverage for other legacy providers (e.g., anthropicConfig, azureOpenAIConfig, togetherAIConfig) if they follow the same migration path.

src/agent/acp/AcpBackend.ts (1)

34-66: Move RETRY_CONFIG below the import block.

There’s an import statement after runtime code; the project requires all imports at the top of the file.

♻️ Proposed fix
-import { logger } from '@/ui/logger';
-import { delay } from '@/utils/time';
-import packageJson from '../../../package.json';
-
-/**
- * Retry configuration for ACP operations
- */
-const RETRY_CONFIG = {
-  /** Maximum number of retry attempts for init/newSession */
-  maxAttempts: 3,
-  /** Base delay between retries in ms */
-  baseDelayMs: 1000,
-  /** Maximum delay between retries in ms */
-  maxDelayMs: 5000,
-} as const;
-import {
-  type TransportHandler,
-  type StderrContext,
-  type ToolNameContext,
-  DefaultTransport,
-} from '../transport';
+import { logger } from '@/ui/logger';
+import { delay } from '@/utils/time';
+import packageJson from '../../../package.json';
+import {
+  type TransportHandler,
+  type StderrContext,
+  type ToolNameContext,
+  DefaultTransport,
+} from '../transport';
 import {
   type SessionUpdate,
   type HandlerContext,
   DEFAULT_IDLE_TIMEOUT_MS,
   DEFAULT_TOOL_CALL_TIMEOUT_MS,
   handleAgentMessageChunk,
   handleAgentThoughtChunk,
   handleToolCallUpdate,
   handleToolCall,
   handleLegacyMessageChunk,
   handlePlanUpdate,
   handleThinkingUpdate,
 } from './sessionUpdateHandlers';
+
+/**
+ * Retry configuration for ACP operations
+ */
+const RETRY_CONFIG = {
+  /** Maximum number of retry attempts for init/newSession */
+  maxAttempts: 3,
+  /** Base delay between retries in ms */
+  baseDelayMs: 1000,
+  /** Maximum delay between retries in ms */
+  maxDelayMs: 5000,
+} as const;

As per coding guidelines, keep all imports at the top.

src/persistence.ts (1)

19-79: Avoid any in legacy profile normalization.

The migration helper uses any for raw entries; prefer a typed legacy shape + unknown to keep strict typing and avoid unsafe access.

♻️ Proposed fix
+type LegacyProfile = {
+  environmentVariables?: unknown;
+  anthropicConfig?: { baseUrl?: string; authToken?: string; model?: string };
+  openaiConfig?: { apiKey?: string; baseUrl?: string; model?: string };
+  azureOpenAIConfig?: { apiKey?: string; endpoint?: string; apiVersion?: string; deploymentName?: string };
+  togetherAIConfig?: { apiKey?: string; model?: string };
+} & Record<string, unknown>;
+
 function normalizeLegacyProfileConfig(profile: unknown): unknown {
   if (!profile || typeof profile !== 'object') return profile;
 
-  const raw = profile as Record<string, any>;
+  const raw = profile as LegacyProfile;
   const additions: Record<string, string | undefined> = {
     ANTHROPIC_BASE_URL: raw.anthropicConfig?.baseUrl,
     ANTHROPIC_AUTH_TOKEN: raw.anthropicConfig?.authToken,
     ANTHROPIC_MODEL: raw.anthropicConfig?.model,
     OPENAI_API_KEY: raw.openaiConfig?.apiKey,
     OPENAI_BASE_URL: raw.openaiConfig?.baseUrl,
     OPENAI_MODEL: raw.openaiConfig?.model,
     AZURE_OPENAI_API_KEY: raw.azureOpenAIConfig?.apiKey,
     AZURE_OPENAI_ENDPOINT: raw.azureOpenAIConfig?.endpoint,
     AZURE_OPENAI_API_VERSION: raw.azureOpenAIConfig?.apiVersion,
     AZURE_OPENAI_DEPLOYMENT_NAME: raw.azureOpenAIConfig?.deploymentName,
     TOGETHER_API_KEY: raw.togetherAIConfig?.apiKey,
     TOGETHER_MODEL: raw.togetherAIConfig?.model,
   };
-      const name = (entry as any).name;
-      const value = (entry as any).value;
+      const name = (entry as { name?: unknown }).name;
+      const value = (entry as { value?: unknown }).value;

As per coding guidelines, avoid untyped any in TypeScript.

src/agent/acp/sessionUpdateHandlers.ts (1)

235-315: Log the actual tool-call timeout value.

The log claims a 10‑minute timeout before computing the real timeout; if the transport doesn’t override it, the default is 120s. Logging the computed value avoids misleading diagnostics.

♻️ Proposed tweak
-  if (isInvestigation) {
-    logger.debug(`[AcpBackend] 🔍 Investigation tool detected - extended timeout (10min) will be used`);
-  }
-
-  // Set timeout for tool call completion
-  const timeoutMs = ctx.transport.getToolCallTimeout?.(toolCallId, toolKindStr) ?? DEFAULT_TOOL_CALL_TIMEOUT_MS;
+  // Set timeout for tool call completion
+  const timeoutMs = ctx.transport.getToolCallTimeout?.(toolCallId, toolKindStr) ?? DEFAULT_TOOL_CALL_TIMEOUT_MS;
+  if (isInvestigation) {
+    logger.debug(
+      `[AcpBackend] 🔍 Investigation tool detected - timeout set to ${(timeoutMs / 60000).toFixed(1)}min`
+    );
+  }

Comment on lines +356 to +432
/**
 * Fail a tool call
 */
export function failToolCall(
  toolCallId: string,
  status: 'failed' | 'cancelled',
  toolKind: string | unknown,
  content: unknown,
  ctx: HandlerContext
): void {
  const startTime = ctx.toolCallStartTimes.get(toolCallId);
  const duration = startTime ? Date.now() - startTime : null;
  const toolKindStr = typeof toolKind === 'string' ? toolKind : 'unknown';
  const isInvestigation = ctx.transport.isInvestigationTool?.(toolCallId, toolKindStr) ?? false;
  const hadTimeout = ctx.toolCallTimeouts.has(toolCallId);

  // Log detailed timing for investigation tools BEFORE cleanup
  if (isInvestigation) {
    const durationStr = formatDuration(startTime);
    const durationMinutes = formatDurationMinutes(startTime);
    logger.debug(`[AcpBackend] 🔍 Investigation tool ${status.toUpperCase()} after ${durationMinutes} minutes (${durationStr})`);

    // Check for 3-minute timeout pattern (Gemini CLI internal timeout)
    if (duration) {
      const threeMinutes = 3 * 60 * 1000;
      const tolerance = 5000;
      if (Math.abs(duration - threeMinutes) < tolerance) {
        logger.debug(`[AcpBackend] 🔍 ⚠️ Investigation tool failed at ~3 minutes - likely Gemini CLI timeout, not our timeout`);
      }
    }

    logger.debug(`[AcpBackend] 🔍 Investigation tool FAILED - full content:`, JSON.stringify(content, null, 2));
    logger.debug(`[AcpBackend] 🔍 Investigation tool timeout status BEFORE cleanup: ${hadTimeout ? 'timeout was set' : 'no timeout was set'}`);
    logger.debug(`[AcpBackend] 🔍 Investigation tool startTime status BEFORE cleanup: ${startTime ? `set at ${new Date(startTime).toISOString()}` : 'not set'}`);
  }

  // Cleanup
  ctx.activeToolCalls.delete(toolCallId);
  ctx.toolCallStartTimes.delete(toolCallId);

  const timeout = ctx.toolCallTimeouts.get(toolCallId);
  if (timeout) {
    clearTimeout(timeout);
    ctx.toolCallTimeouts.delete(toolCallId);
    logger.debug(`[AcpBackend] Cleared timeout for ${toolCallId} (tool call ${status})`);
  } else {
    logger.debug(`[AcpBackend] No timeout found for ${toolCallId} (tool call ${status}) - timeout may not have been set`);
  }

  const durationStr = formatDuration(startTime);
  logger.debug(`[AcpBackend] ❌ Tool call ${status.toUpperCase()}: ${toolCallId} (${toolKindStr}) - Duration: ${durationStr}. Active tool calls: ${ctx.activeToolCalls.size}`);

  // Extract error detail
  const errorDetail = extractErrorDetail(content);
  if (errorDetail) {
    logger.debug(`[AcpBackend] ❌ Tool call error details: ${errorDetail.substring(0, 500)}`);
  } else {
    logger.debug(`[AcpBackend] ❌ Tool call ${status} but no error details in content`);
  }

  // Emit tool-result with error
  ctx.emit({
    type: 'tool-result',
    toolName: toolKindStr,
    result: errorDetail
      ? { error: errorDetail, status }
      : { error: `Tool call ${status}`, status },
    callId: toolCallId,
  });

  // If no more active tool calls, emit idle
  if (ctx.activeToolCalls.size === 0) {
    ctx.clearIdleTimeout();
    logger.debug('[AcpBackend] All tool calls completed/failed, emitting idle status');
    ctx.emitIdleStatus();
  }
}

⚠️ Potential issue | 🟠 Major

Avoid logging full tool-call content (secrets + oversized logs).

Investigation tool failures currently dump full content; tool payloads can include secrets and large blobs. Truncate/redact before logging to align with logging guidance.

🐛 Suggested mitigation
-    logger.debug(`[AcpBackend] 🔍 Investigation tool FAILED - full content:`, JSON.stringify(content, null, 2));
+    const contentStr = JSON.stringify(content, null, 2);
+    const truncated = contentStr.length > 2000 ? `${contentStr.slice(0, 2000)}…` : contentStr;
+    logger.debug(`[AcpBackend] 🔍 Investigation tool FAILED - content (truncated): ${truncated}`);

As per coding guidelines, truncate large JSON in logs and avoid secret leakage.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/utils/expandEnvVars.ts (1)

31-94: := defaults are not being assigned; subsequent variables in the same pass won't see them.

Currently ${VAR:=default} behaves identically to ${VAR:-default} — the default is returned but never stored. In bash, := should assign the value to the variable so that later references to ${VAR} within the same expansion resolve to that assigned value. This breaks profiles relying on := to seed values for reuse.

The fix introduces a local resolvedEnv overlay to track assignments without mutating process.env. When := is matched and a default is applied, assign it to the overlay so subsequent expansions see it.

Proposed fix
 export function expandEnvironmentVariables(
     envVars: Record<string, string>,
     sourceEnv: NodeJS.ProcessEnv = process.env,
     options?: {
         warnOnUndefined?: boolean;
     }
 ): Record<string, string> {
     const expanded: Record<string, string> = {};
     const undefinedVars: string[] = [];
+    const resolvedEnv: Record<string, string | undefined> = { ...sourceEnv };

     for (const [key, value] of Object.entries(envVars)) {
         const expandedValue = value.replace(/\$\{([^}]+)\}/g, (match, expr) => {
             const colonDashIndex = expr.indexOf(':-');
             const colonEqIndex = expr.indexOf(':=');
             let varName: string;
             let defaultValue: string | undefined;
+            let isAssignment = false;

             if (colonDashIndex !== -1 || colonEqIndex !== -1) {
                 const idx = colonDashIndex !== -1 ? colonDashIndex : colonEqIndex;
+                isAssignment = idx === colonEqIndex;
                 varName = expr.substring(0, idx);
                 defaultValue = expr.substring(idx + 2);
             } else {
                 varName = expr;
             }

-            const resolvedValue = sourceEnv[varName];
+            const resolvedValue = resolvedEnv[varName];
             const shouldTreatEmptyAsMissing = defaultValue !== undefined;
             const isMissing = resolvedValue === undefined || (shouldTreatEmptyAsMissing && resolvedValue === '');

             if (!isMissing) {
                 if (process.env.DEBUG) {
                     logger.debug(`[EXPAND ENV] Expanded ${varName} from daemon env`);
                 }
                 if (resolvedValue === '') {
                     logger.warn(`[EXPAND ENV] WARNING: ${varName} is set but EMPTY in daemon environment`);
                 }
                 return resolvedValue;
             } else if (defaultValue !== undefined) {
                 if (process.env.DEBUG) {
                     logger.debug(`[EXPAND ENV] Using default value for ${varName}`);
                 }
+                if (isAssignment) {
+                    resolvedEnv[varName] = defaultValue;
+                }
                 return defaultValue;
             } else {
                 undefinedVars.push(varName);
                 return match;
             }
         });

         expanded[key] = expandedValue;
     }
🧹 Nitpick comments (2)
src/modules/common/registerCommonHandlers.previewEnv.test.ts (2)

1-5: Add a brief file-level JSDoc header describing test responsibilities.
As per coding guidelines, each TS file should include a header comment that explains its purpose.


6-27: Avoid any/casts in the RPC helper and add explicit return types.
Tighten the helper typing and remove as any to keep this test strictly typed. As per coding guidelines, keep explicit types and avoid untyped code.

♻️ Suggested refactor
+type RpcRequest = { method: string; params: string };
+type PreviewEnvDisplay = 'full' | 'redacted' | 'hidden' | 'unset';
+type PreviewEnvValue = { value: string | null; isSet: boolean; isSensitive: boolean; display: PreviewEnvDisplay };
+type PreviewEnvResponse = { policy: 'none' | 'redacted' | 'full'; values: Record<string, PreviewEnvValue> };
+
-function createTestRpcManager(params?: { scopePrefix?: string }) {
+function createTestRpcManager(params: { scopePrefix?: string } = {}): { call: <T>(method: string, request: unknown) => Promise<T> } {
     const encryptionKey = new Uint8Array(32).fill(7);
     const encryptionVariant = 'legacy' as const;
-    const scopePrefix = params?.scopePrefix ?? 'machine-test';
+    const scopePrefix = params.scopePrefix ?? 'machine-test';
@@
-        const encryptedResponse = await manager.handleRequest({
-            method: `${scopePrefix}:${method}`,
-            params: encryptedParams,
-        } as any);
+        const requestPayload: RpcRequest = { method: `${scopePrefix}:${method}`, params: encryptedParams };
+        const encryptedResponse = await manager.handleRequest(requestPayload);
         const decrypted = decrypt(encryptionKey, encryptionVariant, decodeBase64(encryptedResponse));
         return decrypted as T;
     }


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/codex/codexMcpClient.test.ts`:
- Around line 5-43: Replace the loose any typing and clarify testing strategy:
type the mock instance array as StdioClientTransport[] by changing const
instances: any[] = []; to const instances: StdioClientTransport[] = []; and
export/__transportInstances should be typed accordingly so consumers know it's
an array of StdioClientTransport; keep the class name StdioClientTransport and
its constructor signature as-is. Additionally, add an optional integration test
suite (separate file) that runs only when a CI env var like
RUN_CODEX_INTEGRATION=true is set and update the project testing
guideline/README to allow gated integration tests for the Codex CLI rather than
unconditionally requiring real API calls.
🧹 Nitpick comments (4)
scripts/__tests__/ripgrep_launcher.test.ts (1)

96-104: Hoist fs/path imports to the top-level.

Inline require calls inside the test body conflict with the project’s import-at-top rule and add repetition across tests. Consider hoisting the imports once and reusing them here. As per coding guidelines, keep imports at the top of the file.

♻️ Proposed refactor
-import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import * as fs from 'node:fs';
+import * as path from 'node:path';
-        expect(() => {
-            const fs = require('fs');
-            const path = require('path');
-            const content = fs.readFileSync(path.join(__dirname, '../ripgrep_launcher.cjs'), 'utf8');
+        expect(() => {
+            const content = fs.readFileSync(path.join(__dirname, '../ripgrep_launcher.cjs'), 'utf8');
src/persistence.ts (1)

252-261: Remove stale placeholder comment.

The comment on line 261 (// if (fromVersion < 3) { ... }) is now stale since v3 migration is implemented above it. This should be updated to reference future versions (e.g., // if (fromVersion < 4) { ... }).

✏️ Suggested fix
   // Future migrations go here:
-  // if (fromVersion < 3) { ... }
+  // if (fromVersion < 4) { ... }
src/utils/tmux.ts (1)

963-968: Global instance doesn't propagate new constructor parameters.

The getTmuxUtilities function only accepts sessionName but the TmuxUtilities constructor now accepts tmuxCommandEnv and tmuxSocketPath. Callers needing these parameters must instantiate directly, which may be intentional but could be a footgun for consumers expecting the global helper to support all options.

Consider documenting this limitation or updating the signature:

export function getTmuxUtilities(
    sessionName?: string,
    tmuxCommandEnv?: Record<string, string>,
    tmuxSocketPath?: string
): TmuxUtilities {
    // ... caching logic may need adjustment for additional params
}
src/utils/tmux.real.integration.test.ts (1)

149-149: Consider reducing the 20-second timeout.

While integration tests need reasonable timeouts, 20 seconds seems generous for tests that typically complete in under 5 seconds. A shorter timeout (e.g., 10 seconds) would fail faster if something goes wrong.

Comment on lines +5 to +43
// NOTE: This test suite uses mocks because the real Codex CLI / MCP transport
// is not guaranteed to be available in CI or local test environments.
vi.mock('child_process', () => ({
execSync: vi.fn(),
}));

vi.mock('@modelcontextprotocol/sdk/types.js', () => ({
ElicitRequestSchema: {},
}));

vi.mock('@modelcontextprotocol/sdk/client/stdio.js', () => {
const instances: any[] = [];

class StdioClientTransport {
public command: string;
public args: string[];
public env: Record<string, string>;

constructor(opts: { command: string; args: string[]; env: Record<string, string> }) {
this.command = opts.command;
this.args = opts.args;
this.env = opts.env;
instances.push(this);
}
}

return { StdioClientTransport, __transportInstances: instances };
});

vi.mock('@modelcontextprotocol/sdk/client/index.js', () => {
class Client {
setNotificationHandler() { }
setRequestHandler() { }
async connect() { }
async close() { }
}

return { Client };
});

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's see the full test file to understand what's being tested
cat -n src/codex/codexMcpClient.test.ts

Repository: leeroybrun/happy-cli

Length of output: 3858


🏁 Script executed:

# Check if there's a real MCP server setup or CI configuration
find . -type f -name "*.yml" -o -name "*.yaml" -o -name "*.json" | grep -E "(ci|github|\.env|setup)" | head -20

Length of output: 98


🏁 Script executed:

# Look for other test files to understand the testing pattern in this project
fd -e test.ts -e test.tsx -e spec.ts | head -20

Length of output: 822


🏁 Script executed:

# Check for any environment-based test skipping or integration test markers
rg -A 3 -B 3 "describe\|it\|test" src/codex/codexMcpClient.test.ts

Length of output: 46


🏁 Script executed:

# Check one of the other test files to see if they follow the "no mocking" rule
head -50 src/api/api.test.ts

Length of output: 1357


🏁 Script executed:

# Let's look at the actual implementation being tested
cat src/codex/codexMcpClient.ts | head -80

Length of output: 3393


🏁 Script executed:

# Check the smoke test workflow to see if it provides the Codex CLI
cat .github/workflows/smoke-test.yml

Length of output: 6344


🏁 Script executed:

# Look for any setup or installation scripts
find . -type f \( -name "setup.sh" -o -name "install.sh" -o -name "Dockerfile" \) | head -10

Length of output: 46


Tests use extensive mocking despite the "no mocking / real API calls" guideline, though this appears necessary due to external dependency constraints.

The test suite mocks the MCP transport, child_process, and SDK components, which conflicts with the stated testing guideline. However, this is a pragmatic choice because the Codex CLI is not guaranteed to be available in CI or local test environments. If real API calls are truly required per the project's standard, consider:

  1. Adding integration tests (optionally gated by an environment variable) that run only when Codex CLI is available
  2. Refining the guideline to clarify which external dependencies are acceptable to mock when unavailable

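The env-var gate from point 1 could be as simple as a helper checked at suite setup. A minimal sketch; the variable name RUN_CODEX_INTEGRATION and helper name are illustrative, not existing project conventions:

```typescript
// Gate integration tests behind an explicit opt-in environment variable,
// so the suite only runs where the Codex CLI is actually installed.
export function shouldRunCodexIntegration(
    env: Record<string, string | undefined> = process.env,
): boolean {
    return env.RUN_CODEX_INTEGRATION === 'true';
}

// In a vitest suite this could gate the whole file, e.g.:
//   describe.skipIf(!shouldRunCodexIntegration())('codex integration', () => { /* ... */ });
```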
Separately, replace the any types with proper typing to maintain strict TypeScript discipline:

Type the mock instances
-    const instances: any[] = [];
+    type MockTransportInstance = {
+        command: string;
+        args: string[];
+        env: Record<string, string>;
+    };
+    const instances: MockTransportInstance[] = [];
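The instance-capture pattern the typed diff refers to (each constructed transport pushes itself into a shared array so tests can inspect constructor arguments without spawning anything) can be sketched framework-free; the names mirror the mock but are illustrative:

```typescript
// Framework-free sketch of the instance-capture pattern used in the mock.
type TransportOpts = { command: string; args: string[]; env: Record<string, string> };

const instances: FakeStdioTransport[] = [];

class FakeStdioTransport {
    command: string;
    args: string[];
    env: Record<string, string>;

    constructor(opts: TransportOpts) {
        this.command = opts.command;
        this.args = opts.args;
        this.env = opts.env;
        instances.push(this); // record every instance for later inspection
    }
}

// Code under test constructs a transport...
new FakeStdioTransport({ command: 'codex', args: ['mcp'], env: { FOO: 'bar' } });

// ...and assertions can then read back exactly what was passed.
const last = instances[instances.length - 1];
```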
🤖 Prompt for AI Agents
In `@src/codex/codexMcpClient.test.ts` around lines 5 - 43, replace the loose any
typing and clarify the testing strategy: type the mock instance array as
StdioClientTransport[] by changing const instances: any[] = []; to const
instances: StdioClientTransport[] = [];, and type the exported __transportInstances
accordingly so consumers know it is an array of StdioClientTransport;
keep the class name StdioClientTransport and its constructor signature as-is.
Additionally, add an optional integration test suite (in a separate file) that runs
only when a CI env var like RUN_CODEX_INTEGRATION=true is set, and update the
project testing guideline/README to allow gated integration tests for the Codex
CLI rather than unconditionally requiring real API calls.

@leeroybrun leeroybrun closed this Jan 17, 2026
3 participants