diff --git a/.plans/17-claude-code.md b/.plans/17-claude-code.md new file mode 100644 index 0000000000..822dbd806b --- /dev/null +++ b/.plans/17-claude-code.md @@ -0,0 +1,474 @@ +# Plan: Claude Code Integration (Orchestration Architecture) + +> **Note -- Multi-provider scope:** This plan was originally written for the Claude Code adapter, but the patterns described here (adapter shape, canonical runtime mapping, resume cursor ownership, provider registry wiring, and orchestration integration) apply equally to the full multi-provider adapter infrastructure now implemented in this PR: **ClaudeCodeAdapter**, **CopilotAdapter**, **OpenCodeAdapter**, **GeminiCliAdapter**, **KiloAdapter**, and **AmpAdapter**. Where the text says "Claude adapter", read it as the reference implementation; every other adapter follows the same contract surface. + +## Why this plan was rewritten + +The previous plan targeted a pre-orchestration architecture (`ProviderManager`, provider-native WS event methods, and direct provider UI wiring). The current app now routes everything through: + +1. `orchestration.dispatchCommand` (client intent) +2. `OrchestrationEngine` (decide + persist + publish domain events) +3. `ProviderCommandReactor` (domain intent -> `ProviderService`) +4. `ProviderService` (adapter routing + canonical runtime stream) +5. `ProviderRuntimeIngestion` (provider runtime -> internal orchestration commands) +6. `orchestration.domainEvent` (single push channel consumed by web) + +Claude integration must plug into this path instead of reintroducing legacy provider-specific flows. + +--- + +## Current constraints to design around (post-Stage 1) + +1. Provider runtime ingestion expects canonical `ProviderRuntimeEvent` shapes, not provider-native payloads. +2. Start input now uses typed `providerOptions` and generic `resumeCursor`; top-level provider-specific fields were removed. +3. 
`resumeCursor` is intentionally opaque outside adapters and must never be synthesized from `providerThreadId`. +4. `ProviderService` still requires adapter `startSession()` to return a `ProviderSession` with `threadId`. +5. Checkpoint revert currently calls `providerService.rollbackConversation()`, so Claude adapter needs a rollback strategy compatible with current reactor behavior. +6. Web currently marks Claude as unavailable (`"Claude Code (soon)"`) and model picker is Codex-only. + +--- + +## Architecture target + +Add Claude as a first-class provider adapter that emits canonical runtime events and works with existing orchestration reactors without adding new WS channels or bypass paths. + +Key decisions: + +1. Keep orchestration provider-agnostic; adapt Claude inside adapter/layer boundaries. +2. Use the existing canonical runtime stream (`ProviderRuntimeEvent`) as the only ingestion contract. +3. Keep provider session routing in `ProviderService` and `ProviderSessionDirectory`. +4. Add explicit provider selection to turn-start intent so first turn can start Claude session intentionally. + +--- + +## Phase 1: Contracts and command shape updates + +### 1.1 Provider-aware model contract + +Update `packages/contracts/src/model.ts` so model resolution can be provider-aware instead of Codex-only. + +Expected outcomes: + +1. Introduce provider-scoped model lists (Codex + Claude). +2. Add helpers that resolve model by provider. +3. Preserve backwards compatibility for existing Codex defaults. + +### 1.2 Turn-start provider intent + +Update `packages/contracts/src/orchestration.ts`: + +1. Add optional `provider: ProviderKind` to `ThreadTurnStartCommand`. +2. Carry provider through `ThreadTurnStartRequestedPayload`. +3. Keep existing command valid when provider is omitted. + +This removes the implicit “Codex unless session already exists” behavior as the only path. 
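A minimal sketch of how an optional `provider` on the turn-start command could resolve. It assumes an explicit value wins over an existing session binding and that Codex stays the fallback when both are absent. The command shape is simplified, and `resolveTurnProvider` and its `boundProvider` parameter are hypothetical names, not the actual contract in `packages/contracts/src/orchestration.ts`.

```ts
// Simplified, hypothetical shapes; the real contracts may differ.
type ProviderKind = "codex" | "claudeCode";

interface ThreadTurnStartCommand {
  readonly type: "thread.turn.start";
  readonly threadId: string;
  readonly text: string;
  // Optional so commands that omit it remain valid.
  readonly provider?: ProviderKind;
}

// Assumed precedence: explicit provider on the command, then the provider
// already bound to the thread, then the legacy Codex default.
function resolveTurnProvider(
  command: ThreadTurnStartCommand,
  boundProvider: ProviderKind | undefined,
): ProviderKind {
  return command.provider ?? boundProvider ?? "codex";
}
```

Keeping `provider` optional means existing callers fall through to the last branch, so current Codex behavior is unchanged.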
+ +### 1.3 Provider session start input for Claude runtime knobs (completed) + +Update `packages/contracts/src/provider.ts`: + +1. Move provider-specific start fields into typed `providerOptions`: + - `providerOptions.codex` + - `providerOptions.claudeCode` +2. Keep `resumeCursor` as the single cross-provider resume input in `ProviderSessionStartInput`. +3. Deprecate/remove `resumeThreadId` from the generic start contract. +4. Treat `resumeCursor` as adapter-owned opaque state. + +### 1.4 Contract tests (completed) + +Update/add tests in `packages/contracts/src/*.test.ts` for: + +1. New command payload shape. +2. Provider-aware model resolution behavior. +3. Breaking-change expectations for removed top-level provider fields. + +--- + +## Phase 2: Claude adapter implementation + +### 2.1 Add adapter service + layer + +Create: + +1. `apps/server/src/provider/Services/ClaudeCodeAdapter.ts` +2. `apps/server/src/provider/Layers/ClaudeCodeAdapter.ts` + +Adapter must implement `ProviderAdapterShape`. + +### 2.1.a SDK dependency and baseline config + +Add server dependency: + +1. `@anthropic-ai/claude-agent-sdk` + +Baseline adapter options to support from day one: + +1. `cwd` +2. `model` +3. `pathToClaudeCodeExecutable` (from `providerOptions.claudeCode.binaryPath`) +4. `permissionMode` (from `providerOptions.claudeCode.permissionMode`) +5. `maxThinkingTokens` (from `providerOptions.claudeCode.maxThinkingTokens`) +6. `resume` +7. `resumeSessionAt` +8. `includePartialMessages` +9. `canUseTool` +10. `hooks` +11. `env` and `additionalDirectories` (if needed for sandbox/workspace parity) + +### 2.1.b Credential management and resource limits + +Each provider manages its own authentication externally: + +1. **Environment variables and CLI auth** -- Credentials are resolved via provider-native mechanisms (e.g. `ANTHROPIC_API_KEY` for Claude, `OPENAI_API_KEY` for Codex, `gh auth` for Copilot). 
The adapter layer never stores or brokers credentials directly; it relies on the underlying CLI/SDK picking them up from the environment. +2. **Per-provider rate limiting** -- Each server manager (`codexAppServerManager`, `claudeCodeServerManager`, etc.) is responsible for honoring its provider's rate limits. Adapters should surface rate-limit errors as `ProviderAdapterProcessError` so orchestration can report them cleanly. +3. **Concurrent session limits** -- The number of simultaneous provider sessions is bounded by system resources (open processes, file descriptors, memory). `ProviderSessionDirectory` tracks active sessions but does not enforce hard caps; operators should monitor resource usage when running multiple providers concurrently. +4. **Credential leakage prevention** -- Error messages, logs, and serialized `ProviderAdapterProcessError` payloads must never include raw API keys or tokens. Adapters should redact secrets before surfacing diagnostics. +5. **Secure environment propagation** -- When spawning child processes (CLI binaries, SDK sub-processes), pass an explicit environment whitelist rather than forwarding the entire `process.env`. This limits accidental exposure of unrelated secrets to the child. +6. **Secret rotation** -- Rotating a provider API key or token requires restarting all active sessions for that provider. Document this operational requirement; there is no hot-reload path for credentials. + +### 2.2 Claude runtime bridge + +Implement a Claude runtime bridge (either directly in adapter layer or via dedicated manager file) that wraps Agent SDK query lifecycle. + +Required capabilities: + +1. Long-lived session context per adapter session. +2. Multi-turn input queue. +3. Interrupt support. +4. Approval request/response bridge. +5. Resume support via opaque `resumeCursor` (parsed inside Claude adapter only). + +#### 2.2.a Agent SDK details to preserve + +The adapter should explicitly rely on these SDK capabilities: + +1. 
`query()` returns an async iterable message stream and control methods (`interrupt`, `setModel`, `setPermissionMode`, `setMaxThinkingTokens`, account/status helpers). +2. Multi-turn input is supported via async-iterable prompt input. +3. Tool approval decisions are provided via `canUseTool`. +4. Resume support uses `resume` and optional `resumeSessionAt`, both derived by parsing adapter-owned `resumeCursor`. +5. Hooks can be used for lifecycle signals (`Stop`, `PostToolUse`, etc.) when we need adapter-originated checkpoint/runtime events. + +#### 2.2.b Effect-native session lifecycle skeleton + +```ts +import { query } from "@anthropic-ai/claude-agent-sdk"; +import { Effect } from "effect"; + +const acquireSession = (input: ProviderSessionStartInput) => + Effect.acquireRelease( + Effect.tryPromise({ + try: async () => { + const claudeOptions = input.providerOptions?.claudeCode; + const resumeState = readClaudeResumeState(input.resumeCursor); + const abortController = new AbortController(); + const result = query({ + prompt: makePromptAsyncIterable(), + options: { + cwd: input.cwd, + model: input.model, + permissionMode: claudeOptions?.permissionMode, + maxThinkingTokens: claudeOptions?.maxThinkingTokens, + pathToClaudeCodeExecutable: claudeOptions?.binaryPath, + resume: resumeState?.threadId, + resumeSessionAt: resumeState?.sessionAt, + abortController, + includePartialMessages: true, + canUseTool: makeCanUseTool(), + hooks: makeClaudeHooks(), + }, + }); + return { abortController, result }; + }, + catch: (cause) => + new ProviderAdapterProcessError({ + provider: "claudeCode", + sessionId: "pending", + detail: "Failed to start Claude runtime session.", + cause, + }), + }), + ({ abortController }) => Effect.sync(() => abortController.abort()), + ); +``` + +#### 2.2.c AsyncIterable -> Effect Stream integration + +Preferred when available in the pinned Effect version: + +```ts +const sdkMessageStream = Stream.fromAsyncIterable( + session.result, + 
(cause) => + new ProviderAdapterProcessError({ + provider: "claudeCode", + sessionId, + detail: "Claude runtime stream failed.", + cause, + }), +); +``` + +Portable fallback (already aligned with current server patterns): + +```ts +const sdkMessageStream = Stream.async((emit) => { + let cancelled = false; + void (async () => { + try { + for await (const message of session.result) { + if (cancelled) break; + emit.single(message); + } + emit.end(); + } catch (cause) { + emit.fail( + new ProviderAdapterProcessError({ + provider: "claudeCode", + sessionId, + detail: "Claude runtime stream failed.", + cause, + }), + ); + } + })(); + return Effect.sync(() => { + cancelled = true; + }); +}); +``` + +### 2.3 Canonical event mapping + +Claude adapter must translate Agent SDK output into canonical `ProviderRuntimeEvent`. + +Initial mapping target: + +1. assistant text deltas -> `content.delta` +2. final assistant text -> `item.completed` and/or `turn.completed` +3. approval requests -> `request.opened` +4. approval results -> `request.resolved` +5. system lifecycle -> `session.*`, `thread.*`, `turn.*` +6. errors -> `runtime.error` +7. plan/proposed-plan content when derivable + +Implementation note: + +1. Keep raw Claude message on `raw` for debugging. +2. Prefer canonical item/request kinds over provider-native enums. +3. If Claude emits extra event kinds we do not model yet, map them to `tool.summary`, `runtime.warning`, or `unknown`-compatible payloads instead of dropping silently. + +### 2.4 Resume cursor strategy + +Define Claude-owned opaque resume state, e.g.: + +```ts +interface ClaudeResumeCursor { + readonly version: 1; + readonly threadId?: string; + readonly sessionAt?: string; +} +``` + +Rules: + +1. Serialize only adapter-owned state into `resumeCursor`. +2. Parse/validate only inside Claude adapter. +3. Store updated cursor when Claude runtime yields enough data to resume safely. +4. Never overload orchestration thread id as Claude thread id. 
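A rough sketch of the parse/serialize pair behind these rules, assuming the cursor travels as a JSON string. The `provider` tag in the serialized envelope and `serializeClaudeResumeCursor` are illustrative additions. `readClaudeResumeState` is the helper name used in the 2.2.b skeleton, but this body is only one possible implementation.

```ts
interface ClaudeResumeCursor {
  readonly version: 1;
  readonly threadId?: string;
  readonly sessionAt?: string;
}

// Assumption: cursors are stored as JSON strings tagged with the owning provider.
const serializeClaudeResumeCursor = (cursor: ClaudeResumeCursor): string =>
  JSON.stringify({ provider: "claudeCode", ...cursor });

// Returns undefined for anything that is not a valid Claude-owned cursor,
// so a cursor written by another adapter is rejected rather than misused.
function readClaudeResumeState(resumeCursor: unknown): ClaudeResumeCursor | undefined {
  if (typeof resumeCursor !== "string") return undefined;
  let parsed: unknown;
  try {
    parsed = JSON.parse(resumeCursor);
  } catch {
    return undefined;
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const record = parsed as Record<string, unknown>;
  if (record["provider"] !== "claudeCode" || record["version"] !== 1) return undefined;
  const threadId = typeof record["threadId"] === "string" ? record["threadId"] : undefined;
  const sessionAt = typeof record["sessionAt"] === "string" ? record["sessionAt"] : undefined;
  return {
    version: 1,
    ...(threadId !== undefined ? { threadId } : {}),
    ...(sessionAt !== undefined ? { sessionAt } : {}),
  };
}
```

Because parsing fails closed, orchestration can keep passing the value through untouched and only the Claude adapter ever interprets it.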
+ +### 2.5 Interrupt and stop semantics + +Map orchestration stop/interrupt expectations onto SDK controls: + +1. `interruptTurn()` -> active query interrupt. +2. `stopSession()` -> close session resources and prevent future sends. +3. `rollbackThread()` -> see Phase 4. + +--- + +## Phase 3: Provider service and composition + +### 3.1 Register Claude adapter + +Update provider registry layer to include Claude: + +1. add `claudeCode` -> `ClaudeCodeAdapter` +2. ensure `ProviderService.listProviderStatuses()` reports Claude availability + +### 3.2 Persist provider binding + +Current `ProviderSessionDirectory` already stores provider/thread binding and opaque `resumeCursor`. + +Required validation: + +1. Claude bindings survive restart. +2. resume cursor remains opaque and round-trips untouched. +3. stopAll + restart can recover Claude sessions when possible. + +### 3.3 Provider start routing + +Update `ProviderCommandReactor` / orchestration flow: + +1. If a thread turn start requests `provider: "claudeCode"`, start Claude if no active session exists. +2. If a thread already has Claude session binding, reuse it. +3. If provider switches between Codex and Claude, explicitly stop/rebind before next send. + +--- + +## Phase 4: Checkpoint and revert strategy + +Claude does not necessarily expose the same conversation rewind primitive as Codex app-server. Current architecture expects `providerService.rollbackConversation()`. + +Pick one explicit strategy: + +### Option A: provider-native rewind + +If SDK/runtime supports safe rewind: + +1. implement in Claude adapter +2. keep `CheckpointReactor` unchanged + +### Option B: session restart + state truncation shim + +If no native rewind exists: + +1. Claude adapter returns successful rollback by: + - stopping current Claude session + - clearing/rewriting stored Claude resume cursor to last safe resumable point + - forcing next turn to recreate session from persisted orchestration state +2. 
Document that rollback is “conversation reset to checkpoint boundary”, not provider-native turn deletion. + +Whichever option is chosen: + +1. behavior must be deterministic +2. checkpoint revert tests must pass under orchestration expectations +3. user-visible activity log should explain failures clearly when provider rollback is impossible + +### Decision criteria + +Choose the rollback strategy as follows: + +1. If the Agent SDK exposes a native rewind/rollback API that can truncate conversation history to an arbitrary checkpoint, use **Option A** (provider-native rewind). This gives the cleanest UX and avoids session restart overhead. +2. If no native rewind API exists or it cannot target the exact checkpoint boundary orchestration requires, use **Option B** (session restart + state truncation shim). +3. **Time-box rule:** if investigation into Option A takes longer than 2 working days without a reliable prototype, default to Option B and move on. Option A can be revisited as a follow-up enhancement once the base integration is stable. + +--- + +## Phase 5: Web integration + +### 5.1 Provider picker and model picker + +Update web state/UI: + +1. allow choosing Claude as thread provider before first turn +2. show Claude model list from provider-aware model helpers +3. preserve existing Codex default behavior when provider omitted + +Likely touch points: + +1. `apps/web/src/store.ts` +2. `apps/web/src/components/ChatView.tsx` +3. `apps/web/src/types.ts` +4. `packages/shared/src/model.ts` + +### 5.2 Settings for Claude executable/options + +Add app settings if needed for: + +1. Claude binary path +2. default permission mode +3. default max thinking tokens + +Do not hardcode provider-specific config into generic session state if it belongs in app settings or typed `providerOptions`. + +### 5.3 Session rendering + +No new WS channel should be needed. Claude should appear through existing: + +1. thread messages +2. activities/worklog +3. approvals +4. 
session state +5. checkpoints/diffs + +--- + +## Phase 6: Testing strategy + +### 6.1 Contract tests + +Cover: + +1. provider-aware model schemas +2. provider field on turn-start command +3. provider-specific start options schema + +### 6.2 Adapter layer tests + +Add `ClaudeCodeAdapter.test.ts` covering: + +1. session start +2. event mapping +3. approval bridge +4. resume cursor parse/serialize +5. interrupt behavior +6. rollback behavior or explicit unsupported error path + +Use SDK-facing layer tests/mocks only at the boundary. Do not mock orchestration business logic in higher-level tests. + +### 6.3 Provider service integration tests + +Extend provider integration coverage so Claude is exercised through `ProviderService`: + +1. start Claude session +2. send turn +3. receive canonical runtime events +4. restart/recover using persisted binding + +### 6.4 Orchestration integration tests + +Add/extend integration tests around: + +1. first-turn provider selection +2. Claude approval requests routed through orchestration +3. Claude runtime ingestion -> messages/activities/session updates +4. checkpoint revert behavior under Claude +5. stopAll/restart recovery + +These should validate real orchestration flows, not just adapter behavior. + +### 6.5 Multi-provider test scenarios + +Cover cross-provider interactions that single-adapter tests miss: + +1. **Provider switching mid-conversation** -- Start a thread on Codex, then switch to Claude (or any other provider) for the next turn. Verify the old session is stopped, bindings are updated, and the new adapter receives the correct `providerOptions`. +2. **Concurrent active sessions** -- Run sessions on two or more different providers simultaneously. Verify events from each session are routed to the correct orchestration thread without cross-contamination. +3. **Resume cursor isolation** -- Persist resume cursors from two different providers, then attempt to resume each. 
Confirm that one provider's cursor cannot accidentally be used to resume another provider's session (adapter parse should reject mismatched cursors). +4. **Provider health monitoring** -- Simulate a provider becoming unavailable (process crash, binary missing). Verify `listProviderStatuses()` reflects the degraded state and that orchestration surfaces a clear error to the client rather than hanging. +5. **Performance under load** -- Run 10+ concurrent provider sessions across mixed adapters. Monitor memory usage, open file descriptors, and event-delivery latency to ensure the server remains responsive and does not leak resources. +6. **Chaos scenarios** -- Forcibly kill provider child processes and inject network timeouts mid-stream. Verify that orchestration detects the failure, emits a clear `runtime.error`, and cleans up session resources without leaving zombie processes. +7. **Resume after ungraceful shutdown** -- Terminate the server (SIGKILL) while sessions are active, then restart. Validate that persisted resume cursors allow sessions to recover and that no corrupted state prevents new sessions from starting. + +--- + +## Phase 7: Rollout order + +Recommended implementation order: + +1. contracts/provider-aware models +2. provider field on turn-start +3. Claude adapter skeleton + start/send/stream +4. canonical event mapping +5. provider registry/service wiring +6. orchestration recovery + checkpoint strategy +7. web provider/model picker +8. full integration tests + +--- + +## Non-goals + +1. Reintroducing provider-specific WS methods/channels. +2. Storing provider-native thread ids as orchestration ids. +3. Bypassing orchestration engine for Claude-specific UI flows. +4. Encoding Claude resume semantics outside adapter-owned `resumeCursor`. 
diff --git a/.plans/18-cursor-agent-provider.md b/.plans/18-cursor-agent-provider.md new file mode 100644 index 0000000000..452592e68d --- /dev/null +++ b/.plans/18-cursor-agent-provider.md @@ -0,0 +1,327 @@ +# Plan: Cursor ACP (`agent acp`) Provider Integration + +## Goal + +Add Cursor as a first-class provider in T3 Code using ACP (`agent acp`) over JSON-RPC 2.0 stdio, with robust session lifecycle handling and canonical `ProviderRuntimeEvent` projection. + +--- + +## 1) Exploration Findings (from live ACP probes) + +### 1.1 Core invocation and transport + +1. Binary is `agent` on PATH (`2026.02.27-e7d2ef6` observed). +2. ACP server command is `agent acp`. +3. Transport is newline-delimited JSON-RPC 2.0 over stdio. +4. Messages: + - client -> server: requests and responses to server-initiated requests + - server -> client: responses, notifications (`session/update`), and server requests (`session/request_permission`) + +### 1.2 Handshake and session calls observed + +1. `initialize` returns: + - `protocolVersion` + - `agentCapabilities` (`loadSession`, `mcpCapabilities`, `promptCapabilities`) + - `authMethods` (includes `cursor_login`) +2. `authenticate { methodId: "cursor_login" }` returns `{}` when logged in. +3. `session/new` returns: + - `sessionId` + - `modes` (`agent`, `plan`, `ask`) +4. `session/load` works and requires `sessionId`, `cwd`, `mcpServers`. +5. `session/prompt` returns terminal response `{ stopReason: "end_turn" | "cancelled" }`. + +Important sequence note: +1. ACP currently allows `session/new` even without explicit `initialize`/`authenticate` when local auth already exists. +2. For adapter consistency and forward compatibility, we should still send `initialize` and `authenticate` during startup. + +### 1.3 `session/update` event families observed + +Observed `params.update.sessionUpdate` values: + +1. `available_commands_update` +2. `agent_thought_chunk` +3. `agent_message_chunk` +4. `tool_call` +5. 
`tool_call_update` + +Observed payload behavior: + +1. `agent_*_chunk` provides `content: { type: "text", text: string }`. +2. `tool_call` may be emitted multiple times for same `toolCallId`: + - initial generic form (`title: "Terminal"`, `rawInput: {}`) + - enriched form (`title: "\`pwd\`"`, `rawInput: { command: "pwd" }`) +3. `tool_call_update` statuses observed: + - `in_progress` + - `completed` +4. `tool_call_update` on completion may include `rawOutput`: + - terminal: `{ exitCode, stdout, stderr }` + - search/find: `{ totalFiles, truncated }` + +### 1.4 Permission flow observed + +1. ACP server sends `session/request_permission` (JSON-RPC request with `id`). +2. Request shape includes: + - `params.sessionId` + - `params.toolCall` + - `params.options` (`allow-once`, `allow-always`, `reject-once`) +3. Client must respond on same `id` with: + - `{ outcome: { outcome: "selected", optionId: "" } }` +4. Reject path still results in tool lifecycle completion events (`tool_call_update status: completed`), typically without `rawOutput`. + +### 1.5 Error and capability quirks + +1. `session/cancel` currently returns: + - JSON-RPC error `-32601` Method not found +2. Error shape examples: + - unknown auth method: `-32602` + - `session/load` missing/invalid params: `-32602` + - `session/prompt` unknown session: `-32603` with details +3. Parallel prompts on same session are effectively single-flight: + - second prompt can cause first to complete with `stopReason: "cancelled"`. +4. `session/new` accepts a `model` field (no explicit echo in response). + +Probe artifacts: +1. `.tmp/acp-probe/*/transcript.ndjson` +2. `.tmp/acp-probe/*/summary.json` +3. `scripts/cursor-acp-probe.mjs` + +--- + +## 2) Integration Constraints for T3 + +1. T3 adapter contract still requires: + - `startSession`, `sendTurn`, `interruptTurn`, `respondToRequest`, `readThread`, `rollbackThread`, `stopSession`, `listSessions`, `hasSession`, `stopAll`, `streamEvents`. +2. 
Orchestration consumes canonical `ProviderRuntimeEvent` only. +3. `ProviderCommandReactor` provider precedence fix remains required (respect explicit provider on turn start). +4. ACP now supports external permission decisions, so Cursor can participate in T3 approval UX via adapter-managed request/response plumbing. + +--- + +## 3) Proposed Architecture + +### 3.1 New server components + +1. `apps/server/src/provider/Services/CursorAdapter.ts` (service contract/tag + ACP event schemas). +2. `apps/server/src/provider/Layers/CursorAdapter.ts` (single implementation unit; owns ACP process lifecycle, JSON-RPC routing, runtime projection). +3. No manager indirection; keep logic in layer implementation. + +### 3.2 Session model + +1. One long-lived ACP child process per T3 Cursor provider session. +2. Track: + - `providerSessionId` (T3 synthetic ID) + - `acpSessionId` (from `session/new` or restored via `session/load`) + - `cwd`, `model`, in-flight turn state + - pending permission requests by JSON-RPC request id +3. Resume support: + - persist `acpSessionId` in provider resume metadata and call `session/load` on reattach. + +### 3.3 Command strategy + +1. `startSession`: + - spawn `agent acp` + - `initialize` + - `authenticate(cursor_login)` (best-effort, typed failure handling) + - `session/new` or `session/load` +2. `sendTurn`: + - send `session/prompt { sessionId, prompt: [...] }` + - consume streaming `session/update` notifications until terminal prompt response +3. `interruptTurn`: + - no native `session/cancel` today; implement fallback: + - terminate ACP process + restart + `session/load` for subsequent turns + - mark in-flight turn as interrupted/failed in canonical events +4. `respondToRequest`: + - map T3 approval decision -> ACP `optionId` + - reply to exact JSON-RPC request id from `session/request_permission` + +### 3.4 Effect-first implementation style (required) + +1. Keep logic inside `CursorAdapterLive`. +2. 
Use Effect primitives: + - `Queue` + `Stream.fromQueue` for event fan-out + - `Ref` / `Ref.Synchronized` for session/process/request state + - scoped fibers for stdout/stderr read loops +3. Typed JSON decode at boundary: + - request/response envelopes + - `session/update` union schema + - permission-request schema +4. Keep adapter errors in typed error algebra with explicit mapping at process/protocol boundaries. + +--- + +## 4) Canonical Event Mapping Plan (ACP -> ProviderRuntimeEvent) + +1. `session/update: agent_message_chunk` + - emit `message.delta` for assistant stream +2. prompt terminal response (`session/prompt` result `stopReason: end_turn`) + - emit `message.completed` + `turn.completed` +3. `session/update: agent_thought_chunk` + - initial mapping: emit thinking activity (or ignore if we keep current canonical surface minimal) +4. `session/update: tool_call` + - first-seen `toolCallId` emits `tool.started` + - subsequent `tool_call` for same ID treated as metadata update (no duplicate started event) +5. `session/update: tool_call_update` + - `in_progress`: optional progress activity + - `completed`: emit `tool.completed` with summarized `rawOutput` when present +6. `session/request_permission` + - emit `approval.requested` with mapped options + - when client decision sent, emit `approval.resolved` +7. protocol/process error + - emit `runtime.error` + - fail active turn/session as appropriate + +Synthetic IDs: +1. `turnId`: T3-generated UUID per `sendTurn`. +2. `itemId`: + - assistant stream: `${turnId}:assistant` + - tools: `${turnId}:${toolCallId}` + +--- + +## 5) Approval, Resume, and Rollback Behavior + +### 5.1 Approvals + +1. Cursor ACP permission requests are externally controllable; implement full `respondToRequest` path in v1. +2. Decision mapping: + - allow once -> `allow-once` + - allow always -> `allow-always` + - reject -> `reject-once` + +### 5.2 Resume + +1. 
`session/load` is available and should be first-class for adapter restart/reconnect. +2. Must send required params: `sessionId`, `cwd`, `mcpServers`. + +### 5.3 Rollback / thread read + +1. ACP currently has no observed rollback API. +2. Plan for v1: + - `readThread`: adapter-maintained snapshot projection + - `rollbackThread`: explicit unsupported error +3. Product guard: + - disable checkpoint revert for Cursor threads in UI until rollback exists. + +--- + +## 6) Required Contract and Runtime Changes + +### 6.1 Contracts + +1. Add `cursor` to `ProviderKind`. +2. Add Cursor provider start options (`providerOptions.cursor`), ACP-oriented: + - optional `binaryPath` + - optional auth/mode knobs if needed later +3. Extend model options for Cursor list and traits mapping. +4. Add schemas for ACP-native event union in Cursor adapter service file. + +### 6.2 Server orchestration and registry + +1. Register `CursorAdapter` in provider registry and server layer wiring. +2. Update provider-kind persistence decoding for `cursor`. +3. Fix `ProviderCommandReactor` precedence to honor explicit provider in turn-start command. + +### 6.3 Web + +1. Cursor in provider picker and model picker (already partially done). +2. Trait controls map to concrete Cursor model identifiers. +3. Surface unsupported rollback behavior in UX. + +--- + +## 7) Implementation Phases + +### Phase A: ACP process and protocol skeleton + +1. Implement ACP process lifecycle in `CursorAdapterLive`. +2. Implement JSON-RPC request/response multiplexer. +3. Implement `initialize`/`authenticate`/`session/new|load` flow. +4. Wire `streamEvents` from ACP notifications. + +### Phase B: Runtime projection and approvals + +1. Map `session/update` variants to canonical runtime events. +2. Implement permission-request bridging to `respondToRequest`. +3. Implement dedupe for repeated `tool_call` on same `toolCallId`. + +### Phase C: Turn control and interruption + +1. 
Implement single in-flight prompt protection per session. +2. Implement interruption fallback (process restart + reload) because `session/cancel` unavailable. +3. Ensure clean state recovery on ACP process crash. + +### Phase D: Orchestration + UX polish + +1. Provider routing precedence fix. +2. Cursor-specific UX notes for unsupported rollback. +3. End-to-end smoke and event log validation. + +--- + +## 8) Test Plan + +Follow project rule: backend external-service integrations tested via layered fakes, not by mocking core business logic. + +### 8.1 Unit tests (`CursorAdapter`) + +1. JSON-RPC envelope parsing: + - response matching by id + - server request handling (`session/request_permission`) + - notification decode (`session/update`) +2. Event projection: + - `agent_message_chunk` / `agent_thought_chunk` + - `tool_call` + `tool_call_update` dedupe/lifecycle + - permission request -> approval events +3. Error mapping: + - unknown session + - method-not-found (`session/cancel`) + - invalid params + +### 8.2 Provider service/routing tests + +1. Registry resolves `cursor`. +2. Session directory persistence reads/writes `cursor`. +3. ProviderService fan-out ordering with Cursor ACP events. + +### 8.3 Orchestration tests + +1. `thread.turn.start` with `provider: cursor` routes to Cursor adapter. +2. approval response command maps to ACP permission response. +3. checkpoint revert on Cursor thread returns controlled unsupported failure. + +### 8.4 Optional live smoke + +1. Env-gated ACP smoke: + - start session + - run prompt + - observe deltas + completion + - exercise permission request path with one tool call + +--- + +## 9) Operational Notes + +1. Keep one in-flight turn per ACP session. +2. Keep per-session ACP process logs/NDJSON artifacts for debugging. +3. Treat `session/cancel` as unsupported until Cursor ships it; avoid relying on it. +4. Preserve resume metadata (`acpSessionId`) for crash recovery. + +--- + +## 10) Open Questions + +1. 
Should we call `authenticate` always, or only after auth-required errors? +2. Should model selection be passed at `session/new` only, or can/should we support model switching mid-session if ACP adds API? +3. For interruption UX, do we expose “hard interrupt” semantics (process restart) explicitly? + +--- + +## 11) Delivery Checklist + +1. Plan/documentation switched from headless `agent -p` to ACP `agent acp`. +2. Contracts updated (`ProviderKind`, Cursor options, model/trait mapping). +3. Cursor ACP adapter layer implemented and registered. +4. Provider precedence fixed in orchestration router. +5. Approval response path wired through ACP permission requests. +6. Tests added for protocol decode, projection, approval flow, and routing. +7. Lint + tests green. diff --git a/README.md b/README.md index 03e81b5fb9..11acc65777 100644 --- a/README.md +++ b/README.md @@ -1,28 +1,99 @@ # T3 Code -T3 Code is a minimal web GUI for coding agents. Currently Codex-first, with Claude Code support coming soon. +T3 Code is a minimal web GUI for coding agents made by [Pingdotgg](https://github.com/pingdotgg). This project is a downstream fork of [T3 Code](https://github.com/pingdotgg/t3code) customised to my utility and includes various PRs/feature additions from the upstream repo. Thanks to the team and its maintainers for keeping it OSS and an upstream to look up to. -## How to use +It supports Codex, Claude Code, Cursor, Copilot, Gemini CLI, Amp, Kilo, and OpenCode. -> [!WARNING] -> You need to have [Codex CLI](https://github.com/openai/codex) installed and authorized for T3 Code to work. +(NOTE: Amp /mode free is not supported, as Amp Code doesn't support it in headless mode - since they need to show ads for that business model to work.) + +## Why the fork? +This fork is designed to keep up a faster rate of development customised to my needs (and if you want, _yours_ as well -> Submit an issue and I'll make a PR for it). 
There are certain features that will (rightly) remain out of scope or low priority for the project at its scale, but that someone like me might still need. + +### Multi-provider support +Adds full provider adapters (server managers, service layers, runtime layers) for agents that are not yet on the upstream roadmap: + +| Provider | What's included | +| --- | --- | +| Amp | Adapter + `ampServerManager` for headless Amp sessions | +| Copilot | Adapter + CLI binary resolution + text generation layer | +| Cursor | Adapter + ACP probe integration + usage tracking | +| Gemini CLI | Adapter + `geminiCliServerManager` with full test coverage | +| Kilo | Adapter + `kiloServerManager` + OpenCode-style server URL config | +| OpenCode | Adapter + `opencodeServerManager` with hostname/port/workspace config | +| Claude Code | Full adapter with permission mode, thinking token limits, and SDK typings | + +### UX enhancements + +| Feature | Description | +| --- | --- | +| Settings page | Dedicated route (`/settings`) for theme, accent color, and custom model slug configuration | +| Accent color system | Preset palette with contrast-safe terminal color injection across the entire UI | +| Theme support | Light / dark / system modes with transition suppression | +| Command palette | `Cmd+K` / `Ctrl+K` palette for quick actions, script running, and thread navigation | +| Sidebar search | Normalized thread title search with instant filtering | +| Plan sidebar | Dedicated panel for reviewing, downloading, or saving proposed agent plans | +| Terminal drawer | Theme-aware integrated terminal with accent color styling | + +### Branding & build +- Custom abstract-mark app icon with macOS icon composer support +- Centralized branding constants for easy identity swaps +- Desktop icon asset generation pipeline from SVG source + +### Developer tooling +- `sync-upstream-pr-tracks` script for tracking cherry-picked upstream PRs +- `cursor-acp-probe` for testing Cursor's Agent Client Protocol (ACP) integration +-
Custom alpha workflow playbook (`docs/custom-alpha-workflow.md`) +- Upstream PR tracking config (`config/upstream-pr-tracks.json`) + +## Getting started + +### Quick install (recommended) + +Run the interactive installer. It detects your OS, checks prerequisites (git, Node.js ≥ 24, bun ≥ 1.3.9), installs missing tools, and lets you choose between development/production and desktop/web builds: ```bash -npx t3 +# macOS / Linux / WSL +bash <(curl -fsSL https://raw.githubusercontent.com/aaditagrawal/t3code/main/scripts/install.sh) +``` + +```bash +# Windows (Git Bash, MSYS2, or WSL) +bash <(curl -fsSL https://raw.githubusercontent.com/aaditagrawal/t3code/main/scripts/install.sh) ``` -You can also just install the desktop app. It's cooler. +The installer detects **npm, yarn, pnpm, bun, and deno**, and will auto-install bun if no suitable package manager is found. It provides OS-specific install instructions for any missing prerequisites (Homebrew on macOS, apt/dnf/pacman on Linux, winget on Windows). -Install the [desktop app from the Releases page](https://github.com/pingdotgg/t3code/releases) +### Manual build -## Some notes + > [!WARNING] + > You need at least one supported coding agent installed and authorized. See the supported agents list below. -We are very very early in this project. Expect bugs. + ```bash + # Prerequisites: Bun >=1.3.9, Node >=24.13.1 + git clone https://github.com/aaditagrawal/t3code.git + cd t3code + bun install + bun run dev + ``` -We are not accepting contributions yet. +## Supported agents -## If you REALLY want to contribute still....
read this first + - [Codex CLI](https://github.com/openai/codex) (requires v0.37.0 or later) + - [Claude Code](https://github.com/anthropics/claude-code) — **not yet working in the desktop app** + - [Cursor](https://cursor.sh) + - [Copilot](https://github.com/features/copilot) + - [Gemini CLI](https://github.com/google-gemini/gemini-cli) + - [Amp](https://ampcode.com) + - [Kilo](https://kilo.dev) + - [OpenCode](https://opencode.ai) + +## Contributing Read [CONTRIBUTING.md](./CONTRIBUTING.md) before opening an issue or PR. Need support? Join the [Discord](https://discord.gg/jn4EGJjrvv). + +## Notes + + - This project is very early in development. Expect bugs. (Especially with my fork) + - Maintaining a custom fork or alpha branch? See [docs/custom-alpha-workflow.md](docs/custom-alpha-workflow.md). diff --git a/apps/desktop/resources/icon.icns b/apps/desktop/resources/icon.icns index da16d12a0c..d5e4392f3b 100644 Binary files a/apps/desktop/resources/icon.icns and b/apps/desktop/resources/icon.icns differ diff --git a/apps/desktop/resources/icon.ico b/apps/desktop/resources/icon.ico index 8298f70d8b..947f6d57b9 100644 Binary files a/apps/desktop/resources/icon.ico and b/apps/desktop/resources/icon.ico differ diff --git a/apps/desktop/resources/icon.png b/apps/desktop/resources/icon.png index 37f3f756a5..d140f4b9cf 100644 Binary files a/apps/desktop/resources/icon.png and b/apps/desktop/resources/icon.png differ diff --git a/apps/desktop/scripts/dev-electron.mjs b/apps/desktop/scripts/dev-electron.mjs index 5d8bbe1116..6945835a62 100644 --- a/apps/desktop/scripts/dev-electron.mjs +++ b/apps/desktop/scripts/dev-electron.mjs @@ -22,6 +22,7 @@ await waitOn({ const childEnv = { ...process.env }; delete childEnv.ELECTRON_RUN_AS_NODE; +const electronPath = await resolveElectronPath(); let shuttingDown = false; let restartTimer = null; @@ -52,7 +53,7 @@ function startApp() { } const app = spawn( - resolveElectronPath(), + electronPath, [`--t3code-dev-root=${desktopDir}`, 
"dist-electron/main.js"], { cwd: desktopDir, diff --git a/apps/desktop/scripts/electron-launcher.mjs b/apps/desktop/scripts/electron-launcher.mjs index 9d7c522781..46560c173e 100644 --- a/apps/desktop/scripts/electron-launcher.mjs +++ b/apps/desktop/scripts/electron-launcher.mjs @@ -16,10 +16,12 @@ import { createRequire } from "node:module"; import { dirname, join, resolve } from "node:path"; import { fileURLToPath } from "node:url"; +import { generateAssetCatalogForIcon } from "../../../scripts/lib/macos-icon-composer.ts"; + const isDevelopment = Boolean(process.env.VITE_DEV_SERVER_URL); const APP_DISPLAY_NAME = isDevelopment ? "T3 Code (Dev)" : "T3 Code (Alpha)"; const APP_BUNDLE_ID = "com.t3tools.t3code"; -const LAUNCHER_VERSION = 1; +const LAUNCHER_VERSION = 2; const __dirname = dirname(fileURLToPath(import.meta.url)); export const desktopDir = resolve(__dirname, ".."); @@ -43,16 +45,12 @@ function setPlistString(plistPath, key, value) { throw new Error(`Failed to update plist key "${key}" at ${plistPath}: ${details}`.trim()); } -function patchMainBundleInfoPlist(appBundlePath, iconPath) { +function patchMainBundleInfoPlist(appBundlePath) { const infoPlistPath = join(appBundlePath, "Contents", "Info.plist"); setPlistString(infoPlistPath, "CFBundleDisplayName", APP_DISPLAY_NAME); setPlistString(infoPlistPath, "CFBundleName", APP_DISPLAY_NAME); setPlistString(infoPlistPath, "CFBundleIdentifier", APP_BUNDLE_ID); setPlistString(infoPlistPath, "CFBundleIconFile", "icon.icns"); - - const resourcesDir = join(appBundlePath, "Contents", "Resources"); - copyFileSync(iconPath, join(resourcesDir, "icon.icns")); - copyFileSync(iconPath, join(resourcesDir, "electron.icns")); } function patchHelperBundleInfoPlists(appBundlePath) { @@ -97,21 +95,66 @@ function readJson(path) { } } -function buildMacLauncher(electronBinaryPath) { +function resolveIconSourceMetadata(desktopResourcesDir) { + const iconComposerPath = join(desktopResourcesDir, "icon.icon"); + if 
(existsSync(iconComposerPath)) { + return { + iconAssetKind: "icon", + iconMtimeMs: statSync(iconComposerPath).mtimeMs, + }; + } + + const legacyIconPath = join(desktopResourcesDir, "icon.icns"); + return { + iconAssetKind: "icns", + iconMtimeMs: statSync(legacyIconPath).mtimeMs, + }; +} + +async function stageMainBundleIcons(appBundlePath, desktopResourcesDir) { + const resourcesDir = join(appBundlePath, "Contents", "Resources"); + const iconComposerPath = join(desktopResourcesDir, "icon.icon"); + const legacyIconPath = join(desktopResourcesDir, "icon.icns"); + + if (existsSync(iconComposerPath)) { + const compiled = await generateAssetCatalogForIcon(iconComposerPath); + const infoPlistPath = join(appBundlePath, "Contents", "Info.plist"); + + setPlistString(infoPlistPath, "CFBundleIconName", "Icon"); + writeFileSync(join(resourcesDir, "Assets.car"), compiled.assetCatalog); + writeFileSync(join(resourcesDir, "icon.icns"), compiled.icnsFile); + writeFileSync(join(resourcesDir, "electron.icns"), compiled.icnsFile); + return { + iconAssetKind: "icon", + iconMtimeMs: statSync(iconComposerPath).mtimeMs, + }; + } + + copyFileSync(legacyIconPath, join(resourcesDir, "icon.icns")); + copyFileSync(legacyIconPath, join(resourcesDir, "electron.icns")); + return { + iconAssetKind: "icns", + iconMtimeMs: statSync(legacyIconPath).mtimeMs, + }; +} + +async function buildMacLauncher(electronBinaryPath) { const sourceAppBundlePath = resolve(electronBinaryPath, "../../.."); const runtimeDir = join(desktopDir, ".electron-runtime"); const targetAppBundlePath = join(runtimeDir, `${APP_DISPLAY_NAME}.app`); const targetBinaryPath = join(targetAppBundlePath, "Contents", "MacOS", "Electron"); - const iconPath = join(desktopDir, "resources", "icon.icns"); + const desktopResourcesDir = join(desktopDir, "resources"); const metadataPath = join(runtimeDir, "metadata.json"); mkdirSync(runtimeDir, { recursive: true }); + const iconMetadata = resolveIconSourceMetadata(desktopResourcesDir); + const 
expectedMetadata = { launcherVersion: LAUNCHER_VERSION, sourceAppBundlePath, sourceAppMtimeMs: statSync(sourceAppBundlePath).mtimeMs, - iconMtimeMs: statSync(iconPath).mtimeMs, + ...iconMetadata, }; const currentMetadata = readJson(metadataPath); @@ -125,14 +168,18 @@ function buildMacLauncher(electronBinaryPath) { rmSync(targetAppBundlePath, { recursive: true, force: true }); cpSync(sourceAppBundlePath, targetAppBundlePath, { recursive: true }); - patchMainBundleInfoPlist(targetAppBundlePath, iconPath); + patchMainBundleInfoPlist(targetAppBundlePath); + const refreshedIconMetadata = await stageMainBundleIcons(targetAppBundlePath, desktopResourcesDir); patchHelperBundleInfoPlists(targetAppBundlePath); - writeFileSync(metadataPath, `${JSON.stringify(expectedMetadata, null, 2)}\n`); + writeFileSync( + metadataPath, + `${JSON.stringify({ ...expectedMetadata, ...refreshedIconMetadata }, null, 2)}\n`, + ); return targetBinaryPath; } -export function resolveElectronPath() { +export async function resolveElectronPath() { const require = createRequire(import.meta.url); const electronBinaryPath = require("electron"); @@ -140,5 +187,5 @@ export function resolveElectronPath() { return electronBinaryPath; } - return buildMacLauncher(electronBinaryPath); + return await buildMacLauncher(electronBinaryPath); } diff --git a/apps/desktop/scripts/start-electron.mjs b/apps/desktop/scripts/start-electron.mjs index bf93adb6b0..79132b1178 100644 --- a/apps/desktop/scripts/start-electron.mjs +++ b/apps/desktop/scripts/start-electron.mjs @@ -4,8 +4,9 @@ import { desktopDir, resolveElectronPath } from "./electron-launcher.mjs"; const childEnv = { ...process.env }; delete childEnv.ELECTRON_RUN_AS_NODE; +const electronPath = await resolveElectronPath(); -const child = spawn(resolveElectronPath(), ["dist-electron/main.js"], { +const child = spawn(electronPath, ["dist-electron/main.js"], { stdio: "inherit", cwd: desktopDir, env: childEnv, diff --git a/apps/marketing/public/apple-touch-icon.png 
b/apps/marketing/public/apple-touch-icon.png index 9c593ce352..3ed96e897f 100644 Binary files a/apps/marketing/public/apple-touch-icon.png and b/apps/marketing/public/apple-touch-icon.png differ diff --git a/apps/marketing/public/favicon-16x16.png b/apps/marketing/public/favicon-16x16.png index f85017caf7..44c99d8447 100644 Binary files a/apps/marketing/public/favicon-16x16.png and b/apps/marketing/public/favicon-16x16.png differ diff --git a/apps/marketing/public/favicon-32x32.png b/apps/marketing/public/favicon-32x32.png index fae2d285b7..c0ed3eddf7 100644 Binary files a/apps/marketing/public/favicon-32x32.png and b/apps/marketing/public/favicon-32x32.png differ diff --git a/apps/marketing/public/favicon.ico b/apps/marketing/public/favicon.ico index e3ab4ae5e0..947f6d57b9 100644 Binary files a/apps/marketing/public/favicon.ico and b/apps/marketing/public/favicon.ico differ diff --git a/apps/marketing/public/icon.png b/apps/marketing/public/icon.png index 0a6e1cbfcf..073ad811c2 100644 Binary files a/apps/marketing/public/icon.png and b/apps/marketing/public/icon.png differ diff --git a/apps/marketing/src/pages/index.astro b/apps/marketing/src/pages/index.astro index 425626c458..0c2e01d3ce 100644 --- a/apps/marketing/src/pages/index.astro +++ b/apps/marketing/src/pages/index.astro @@ -22,7 +22,7 @@