Skip to content

review: all fork features (1.14.39-ttk.1) — Codex review only, do not merge#10

Closed
TTK95 wants to merge 93 commits into
codex-review/baselinefrom
dev
Closed

review: all fork features (1.14.39-ttk.1) — Codex review only, do not merge#10
TTK95 wants to merge 93 commits into
codex-review/baselinefrom
dev

Conversation

@TTK95
Copy link
Copy Markdown
Owner

@TTK95 TTK95 commented May 6, 2026

Purpose

This PR exists only to give Codex a single diff to review covering all fork-specific changes since the last upstream sync. Do not merge.

  • Base: codex-review/baseline — pinned at upstream merge-base 25ecf0af (fix: retry server_is_overloaded errors)
  • Head: dev (7314c500)
  • Scope: 89 commits (64 non-merge), ~8,282 insertions across 106 files

Fork-feature summary

Major fork-only work bundled here:

  1. Fork release pipeline (feat(installation): add fork self-update via github-release method, chore(release): 1.14.x-ttk.*, fork-release CI workflow)
  2. Container mode end-to-end — sandbox ruleset wiring, --container copy/mount, /container toast, preBoot diagnostics, host-fs external_directory gate, container badge in TUI sidebar/footer (most recent: fix(container): gate external_directory in sandbox mode)
  3. YOLO mode — TUI /yolo toggle + sidebar badge, server-side disable path
  4. Permission rules — auto-approve sandboxed bash/edit, refined evaluator
  5. Sync-upstream automation — guards against open PRs, ls-remote distinction, JS-in-bash heredoc fixes
  6. Windows install/upgrade — github-release method, locked-exe replace via rename-aside, doc updates
  7. Docs — fork install guide (EN+DE), container-mode docs (EN+DE), prerequisites section
  8. CI — gate upstream-only deploy job, release-fork workflow

What I'd especially like Codex to flag

  • Permission-evaluation correctness around the new sandbox ruleset (race conditions / precedence between evaluate* calls in runtime-state.ts and permission server-side, vs the prepended sandbox rules in prompt.ts:sandboxRuleset())
  • The external_directory gate: any caller of host-fs reads (Read/Glob/Grep/Edit/Write/apply_patch/lsp) that bypasses assertExternalDirectory?
  • Bash permission-pattern bypass via shell-variable assignment (upstream issue Agents can bypass permission pattern with variable assignment anomalyco/opencode#26042) — does our bash allow rule combined with sandbox auto-allow widen exposure?
  • instance-store.ts worktree fallback in copy mode — symlink/junction edge cases?
  • Windows-specific upgrade path (fix(installation): make Windows github-release upgrade actually replace the running exe) — race conditions if multiple opencode processes are running?
  • Sync-upstream workflow guards — anywhere a gh call still treats "no refs" as a hard error?

@codex review

Thorsten.Tegetmeyer-Kleine and others added 30 commits April 20, 2026 14:28
Adds RateLimit module that ticks on every outbound request, parses any
known rate-limit response headers (x-ratelimit-*, anthropic-ratelimit-*,
IETF ratelimit-*), and on the first 429 from a provider, writes the
observed per-minute and per-day counts to opencode.json under
provider.<id>.options.rateLimit. This gives proxies like the RWTH Aachen
API — which do not expose rate-limit headers — a persisted limit after
the first hit.
- Inline minute/day cutoffs per style guide.
- Deduplicate the parsed-headers object construction.
- Drop the existsSync TOCTOU in readJsonSafe; handle ENOENT from
  readFileSync instead.
Extends the existing per-provider sliding-window rate limiter with token
windows and a pre-flight check() that blocks requests before they hit
the wire. Stops RWTH (and other) rate-limited providers from surfacing
generic "An internal error occurred" messages after a 429 — the limit
is learned on the first 429, persisted to opencode.json, and on
subsequent runs the pre-flight gate throws a RateLimitError with a
friendly "retry in Ns" message that the retry layer honours via resetAt.

- rate-limit.ts: token windows (minute + day), check() gate, recordUsage,
  estimateRequestTokens, configure() to seed from opencode.json options,
  onRateLimitError persists token limits too
- provider.ts fetch wrapper: pre-flight check, RateLimit.configure from
  options.rateLimit, tick with token estimate
- session/processor.ts: call RateLimit.recordUsage on finish-step so
  the pending estimate is replaced with actual usage
- session/retry.ts: RateLimitError is retryable; delay() honours resetAt
- provider/error.ts: friendly message for 429 responses; mark retryable
- session/message-v2.ts: RateLimitError NamedError, converts to APIError
  shape in fromError so the SDK type surface is unchanged
- config/provider.ts: tokensPerMinute / tokensPerDay schema fields
- cli/cmd/stats.ts: RATE LIMITS section reads persisted limits

https://claude.ai/code/session_01S3gS4AgfQQBNHvgZqkuiRa
Adds a terminal-first stats screen reachable via /usage (alias /stats).
The screen renders a GitHub-style year heatmap plus KPI block on the
Overview tab, and a stacked per-day tokens chart with per-model
breakdown on the Models tab. Date range (All time, Last 7 days, Last 30
days) cycles with `r` and filters all KPIs; the heatmap stays
year-wide to match the mockup. ctrl+s copies a plain-text summary to
the clipboard. Pure derivation logic lives in util/usage-stats.ts and
is covered by unit tests.
- .opencode/opencode.jsonc: register RWTH/KI Connect NRW provider with
  fast (mistral-small-4), code (gpt-5.2), reasoning (gpt-oss-120b),
  and gpt-5.4-mini models.
- provider.ts: add rwth custom loader (forces Chat Completions),
  default bare model IDs to the rwth/ prefix, alias legacy gpt-5-mini
  to gpt-5.4-mini.
- sidebar/context.tsx: surface Total/User/Assistant message counts in
  the context box (RWTH limits are per-message, not per-token).
- Tests covering getMessageStats and the new parseModel behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- hoist HEIGHT/ROW_INDICES/LEGEND_LEVELS out of render to avoid
  allocating fresh arrays every frame
- render the heatmap month header as a single <text> instead of one
  element per character
- drop the unused `filtered`/`range` props from Overview and the
  unused `sub` branch from summarizeStats
- fold the two identical formatMultiplier branches
- replace the redundant `const assistant = msg` alias in
  recordsFromMessages
- deduplicate `selectedForeground` into one `contrastingFg` helper
- remove the undocumented `q` keybind so only the footer hints are
  live (escape still exits)
Drop the Status/Config/Usage/Stats top tab row and render a single bold
"Usage" title instead. Keep the Overview/Models sub-tabs, date-range
pills, keyboard handlers, and footer unchanged. Reword user-visible
"stats" strings to "usage" in the load toast, copy-confirm toast, empty
state, and loading indicator.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Captures the dev workflow: one-time symlinked install via
npm install -g packages/opencode/dist/opencode-windows-x64, the rebuild
command with OPENCODE_VERSION stamp, the opencode-ai nested-shadow
gotcha and fix, fresh-launch path verification, and the
OPENCODE_BIN_PATH escape hatch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds foundations for four Claude-Code-style features identified in
~/.claude/plans/check-out-what-features-shimmering-moler.md. Each item
type-checks cleanly (`tsgo --noEmit` on both packages/plugin and
packages/opencode apart from the pre-existing TuiThemeCurrent sidebar
error) and existing bash + registry test suites still pass.

1. Lifecycle & prompt hooks (packages/plugin/src/index.ts)
   Adds five Hooks entries: session.start, session.end,
   chat.prompt.submit, chat.stop, notify. session.start/end are bridged
   from existing session.created/session.deleted bus events in
   packages/opencode/src/plugin/index.ts so plugin authors can subscribe
   by name. chat.prompt.submit and chat.stop are triggered from
   packages/opencode/src/session/prompt.ts at the entry and exit of the
   prompt pipeline; plugins can inject context parts or block the
   prompt via the submit hook. notify is types-only — the TUI/desktop
   consumer is follow-up.

2. Background bash (+ bash_output / kill_shell)
   New packages/opencode/src/shell/background.ts service manages a
   registry of detached child processes with rolling 1 MB per-shell
   buffers. The bash tool grows a run_in_background parameter that
   routes to the service and returns a shell_id. Two new tools read
   incremental output (bash_output) and terminate (kill_shell). The
   three test layers that build ToolRegistry were updated to provide
   BackgroundShell.defaultLayer.

3. Task worktree isolation
   packages/opencode/src/tool/task.ts grows an isolation: "worktree"
   option. When set on a fresh (non-resumed) task, the tool runs
   `git worktree add -b <branch> <tmp> HEAD`, prepends the worktree
   path/branch to the subagent's prompt so the child uses workdir: on
   bash/edit tools, and after completion auto-removes the worktree if
   clean or reports path+branch back so the parent can act on it.
   Note: full filesystem-level isolation (rerouting Instance.directory
   per-session) is out of scope for this change; the current approach
   relies on the subagent respecting the prompt's workdir guidance.

4. statusLine + experimental.defer_tools config
   Adds config.statusLine ({ type: "command", command, padding? }) and
   config.experimental.defer_tools to the config schema. The schemas
   and documentation are in place; TUI rendering for statusLine and
   the ToolSearch registry indirection for defer_tools are intentionally
   scoped as follow-up — both require large, independently-testable
   changes (TUI status bar, tool registry refactor) that were better
   landed separately.

Follow-up tickets this change implies:
- Wire notify hook to TUI/desktop notification surface
- Consume config.statusLine in the TUI status bar with throttled refresh
- Build the ToolSearch indirection that respects experimental.defer_tools
- Consider rerouting Instance.directory per-session so Task worktree
  isolation enforces the boundary in the filesystem layer rather than
  relying on subagent cooperation

https://claude.ai/code/session_01XTJvnZWsDnZigTBLi2U1Ud
…ll_id

Code review of the initial port found several quality issues; fixing
them before opening the PR.

- Drop dead re-export of Shell from background.ts — only killTree is used.
- Drop unused list() method and ListResult type — no caller.
- Drop unused cleanup() from the public Interface; the instance-scope
  teardown now happens via Effect.addFinalizer inside InstanceState.make,
  which kills any live child processes (via killTree) when the scope
  closes. This was the behaviour the original commit's docstring
  promised but didn't actually wire up.
- Drop unused cwd field from Handle.
- Convert ShellNotFound from a plain Error (which wouldn't match
  Effect.catchTag by _tag) to Schema.TaggedErrorClass so the tools can
  catch it structurally.
- bash_output / kill_shell: instead of Effect.orDie on a missing
  shell_id, catch ShellNotFound and return a friendly tool result so
  the model can recover instead of producing a hard defect.

Typecheck clean on both @opencode-ai/plugin and opencode (only the
pre-existing TuiThemeCurrent sidebar error remains). bash.test.ts (21)
and registry.test.ts (3) still pass.

https://claude.ai/code/session_01XTJvnZWsDnZigTBLi2U1Ud
Conflicts resolved:
- packages/opencode/src/session/message-v2.ts: adopt upstream's
  Schema-based error migration (namedSchemaError, Schema.*); preserve
  our RateLimitError re-export (moved import to top with other imports).
- packages/opencode/test/session/prompt-effect.test.ts: upstream
  consolidated this file into prompt.test.ts; our BackgroundShell layer
  addition ported to the new file, old file deleted.

Versions bumped: packages/opencode, packages/web, packages/plugin
1.14.19 -> 1.14.21-dev_ttk
…ota fall through

The post-merge rate-limit detector intercepted bare insufficient_quota
responses as rate-limit errors before the switch statement below could
label them "Quota exceeded". The original intent was only to catch
rate-limit-typed SSE envelopes (type: "rate_limit_error"); pure
insufficient_quota with no rate-limit signal should still be treated as
a quota error.

Removes the insufficient_quota code check; user_quota_exceeded and
type=rate_limit_error still cover the nested rate-limit case.
Thorsten.Tegetmeyer-Kleine and others added 10 commits May 6, 2026 00:38
The previous commit added the badges to routes/session/{footer,sidebar}.tsx, but those files only render on the /stats route — the live sidebar and bottom bar are produced by internal feature-plugins that win the sidebar_footer / home_footer slots.

Move the badges into the feature-plugin views:
- feature-plugins/sidebar/footer.tsx: full badges above the OpenCode version line
- feature-plugins/home/footer.tsx: compact badges right of the spacer, before version

Host hooks (useProject/useSync/useTheme/useRoute) work inside plugin
slot views (other internal plugins use them the same way).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs in the container/yolo badges:

1. Container badge always showed `off` even when launched with
   --container mount. The HttpApi server's listenHttpApi() builds
   layers with `Layer.makeMemoMapUnsafe()` — a fresh per-listener
   memo map — instead of the shared `memoMap` used by AppRuntime
   and the legacy webHandler. That created a separate
   InstanceStore.Service singleton with its own cache. When
   preBootContainer() registered the runtime via WithInstance.provide
   (which goes through AppRuntime), the entry landed in AppRuntime's
   store, but path.get HTTP handlers read from the server's store —
   cache miss, ctx.container undefined, mode reported as "off".

   Fix: import the shared memoMap and pass it to buildWithMemoMap.

2. YOLO badge showed `ON` for any session whose rules included
   {permission:"*",pattern:"*",action:"allow"} regardless of later
   overrides. A common config pattern is
       {"*":"allow","bash":"ask","edit":"ask"}
   where the wildcard exists for read/etc. but bash and edit are
   explicitly asked. Rule-form matching said YOLO; effective
   permission says not.

   Fix: derive YOLO state by evaluating the actual action for `bash`
   and `edit` against the rules (findLast wildcard match). Only if
   both effectively allow do we treat the session as YOLO.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The container badge has been showing `off` even when launched with
--container mount because the InstanceStore.Service that
preBootContainer() populates and the one the HttpApi getPath handler
reads have apparently not been the same instance, despite the
shared-memoMap fix in d02c4e4.

Add a defensive process-wide container registry as a fallback path
for the HTTP handler:

- packages/opencode/src/container/registry.ts (new) — module-level
  Map<directory, Container.Runtime>, keyed via AppFileSystem.resolve.
  register / unregister / lookup / entries.

- worker.ts:preBootContainer registers in the registry after creating
  the runtime; the existing disposer also unregisters on dispose so
  cleanup stays unified with destroy.

- handlers/instance.ts:getPath falls back to ContainerRegistry.lookup
  when ctx.container is undefined. Both preBootContainer and the
  handler run in the same worker process, so the registry is always
  reachable.

Add diagnostic output to the /container slash-command to reveal what
session.permission actually contains at runtime — needed for
diagnosing the YOLO badge state, which is correctly computed from
the rules but reading the rules requires a runtime dump.

Refactor: move evaluateAction + Rule type + useContainerMode +
useYoloActive + useSessionRules into runtime-state.ts so the badges
and the slash-command share the same logic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make the container preBoot path observable from three angles so we
can finally diagnose why the badge shows off:

1. stderr trace — preBootContainer prints state transitions to stderr
   before the TUI takes over the terminal:
     opencode container: preparing mount (image=node:22-alpine, cwd=...)
     opencode container: ready mount (id=abc123, dir=...)
     opencode container: FAILED — <reason>
   These appear in the user's pre-TUI scrollback.

2. Diagnostic state plumbed through — registry.ts now also tracks a
   Diagnostic record (status, envValue, resolvedMode, configMode,
   configImage, cwd, directory, containerID, error). preBootContainer
   updates it at every state transition. The path API exposes it as
   ContainerInfo.diagnostic. SDK regenerated.

3. Startup toast — App.tsx fires a one-time toast when sync completes
   that reflects the preBoot result:
     - succeeded: green "Container sandbox ready" toast
     - failed:    red 12s toast with the error message
     - off:       brief info toast

Plus /container slash-command now includes a `preBoot:` line with
status + env + error. Combined with the existing rules dump this
makes the toast a one-stop diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause finally found. There are TWO /path route handlers in the
opencode server:
  1. packages/opencode/src/server/routes/instance/httpapi/handlers/instance.ts
     (new HttpApi backend) — already returned container + diagnostic
     since v1.14.42
  2. packages/opencode/src/server/routes/instance/index.ts (legacy Hono
     adapter) — returned the OLD shape without container, ever since

The TUI's SDK request hits the Hono path, which is why the badge
always showed off and `/container` reported preBoot:<unknown> — the
diagnostic field never made it into the path response, regardless of
how correctly the worker preBoot ran or how robust the registry
fallback got.

Mirror the HttpApi handler in the Hono route: same ContainerRegistry
fallback, same diagnostic surface, same response shape. Update the
zod schema accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… badge

The HttpApi session.update handler always merged permission rules into
the existing set, ignoring permissionMode. The TUI's /yolo disable path
sends `permission: [], permissionMode: "replace"` expecting the server
to clear the wildcard rule, but the field was missing from the OpenAPI
schema (silently dropped at the SDK layer) and the handler had no
replace branch. Result: pressing /yolo to disable showed the toast but
session.permission stayed unchanged, badge remained ON.

Three coordinated changes:

1. Server (HttpApi): UpdatePayload accepts `permissionMode`; handler
   uses `Permission.merge` when "merge" (default) and overwrites when
   "replace". Mirrors the legacy Hono route that already did this.
   SDK regenerated.

2. TUI sync: `sync.session.sync(sessionID, { force: true })` bypasses
   the per-session cache so the post-toggle re-fetch actually pulls
   the new permission set. /yolo's `sync` callback uses force=true.

3. Container registry (copy mode): register the runtime under both
   the host cwd (what HTTP handlers query) and the agent-side tempDir
   (what InstanceStore caches under). Without this, --container copy
   showed `container: off` in the badge because the registry lookup
   missed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch from `1.14.47-dev_ttk` (non-semver — underscore is invalid in
the pre-release tag, would crash semver.major() in the upgrade flow)
to a semver-valid scheme anchored on the synced upstream version.

  fork = <upstream-version>-ttk.<counter>

Currently synced to upstream 1.14.39 → fork starts at 1.14.39-ttk.0.
Counter only increments when something concrete is shipped. When
upstream is synced to 1.14.40, fork resets to 1.14.40-ttk.0.

Updates LOCAL_REINSTALL.md (new "Version scheme" section), and the
example versions in docs/install-fork.md(.de.md). Channel string
`OPENCODE_CHANNEL=dev_ttk` is unchanged — it's an internal label
independent of the version field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tainer toast

Container mode auto-allowed `external_directory` along with `bash` + `edit`,
but Read/Glob/Grep run on the host filesystem regardless of container state —
so copy mode silently let the agent read arbitrary host paths, defeating the
isolation users expected. Drop `external_directory` from `SANDBOX_PERMISSIONS`
so out-of-project reads always go through the normal permission gate.

Also fix the `/container` slash toast: it evaluated only session rules and
hid the runtime sandbox auto-allows, so it reported `bash=ask edit=ask` when
the server was actually allowing both. Synthesize the same sandbox ruleset
locally, merge it with session rules, and surface `external_directory` on
the effective line too.

Docs (EN + DE) updated to reflect the carve-out and the host-fs read gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The deploy workflow runs `bun sst deploy` against Cloudflare/Planetscale/
Stripe/Sentry, none of which the fork has secrets for, so every push to dev
shows a red `deploy / deploy` check. Match the gate already in publish.yml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Hey! Your PR title review: all fork features (1.14.39-ttk.1) — Codex review only, do not merge doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@TTK95
Copy link
Copy Markdown
Owner Author

TTK95 commented May 6, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

Docker.sweepStale(SESSION_LABEL).catch(() => {})

P1 Badge Stop sweeping all labeled containers at sandbox startup

Container.prepare() unconditionally calls Docker.sweepStale(SESSION_LABEL), but sweepStale() deletes every container matching that label via docker ps -aq --filter label=... and docker rm -f ... (including currently running ones). This means starting a new sandbox session can terminate other active opencode sandbox sessions in parallel terminals/processes, causing abrupt command failures and data loss risk for those sessions.


return yield* loop({ sessionID: input.sessionID }).pipe(
Effect.tap((result) => stop(result.info.id, false)),
Effect.tapError(() => stop(message.info.id, true).pipe(Effect.ignore)),

P1 Badge Prevent chat.stop hook errors from failing successful prompts

The success path runs plugin.trigger("chat.stop", ...) inside Effect.tap(...) without error suppression, so if any plugin throws in that hook, the entire prompt call fails even after loop() already produced a valid assistant response. In practice, a flaky/non-critical plugin can turn otherwise successful completions into user-visible errors; mirror the error path behavior and ignore/log hook failures here.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

1. container.sweepStale: only target non-running containers (created/exited/
   dead) so a new sandbox session no longer kills concurrent ones in other
   terminals via `docker rm -f`. Drops the `-f` since stale-only never needs
   a force kill.

2. session.prompt: wrap the success-path chat.stop hook in Effect.ignore,
   matching the error path. A flaky plugin hook can no longer turn a
   successful completion into a user-visible failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TTK95
Copy link
Copy Markdown
Owner Author

TTK95 commented May 6, 2026

@codex review

Fixed both P1 findings in c6b4eaf:

  1. sweepStale now filters by status (created/exited/dead), can no longer kill running concurrent sandboxes
  2. Success-path chat.stop wrapped in Effect.ignore to match error path

@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

const sdk = createOpencodeClient({ baseUrl: "http://opencode.internal", fetch: fetchFn })

P1 Badge Pass bound instance directory to local run SDK client

When run is executed with --container copy, bootstrap() can rebind the active instance to a temporary copy workspace, but this local client is created without a directory override. The server middleware falls back to process.cwd() when no x-opencode-directory is provided, so requests can be routed to the host-directory instance instead of the copy-bound one. In that case the container runtime/isolation is bypassed and bash/edit run against the host checkout, which defeats the safety guarantees of copy mode.


let worktreeNote = ""
if (worktree) {
const { clean, removed } = yield* Effect.promise(() => cleanupWorktree(Instance.directory, worktree))

P2 Badge Always cleanup temporary task worktrees on failure

createWorktree() may allocate a temp worktree and branch, but cleanup is only performed in the success path after ops.prompt(...) returns. If prompt execution fails, is aborted, or is interrupted, the acquire/release finalizer only handles cancellation/listener removal and never calls cleanupWorktree, leaving orphaned temp worktrees and branches behind. This will accumulate stale refs/directories over repeated failed task runs.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

P1 (run.ts): Pass the bootstrapped Instance.directory to createOpencodeClient
so requests carry x-opencode-directory and route to the bound instance. In
container copy mode bootstrap rebinds the active directory to a temp copy
workspace, but the local SDK client was created without `directory`, so the
server middleware fell back to process.cwd() — silently bypassing copy-mode
isolation and running bash/edit against the host checkout.

P2 (task.ts): Always cleanup temp worktrees/branches on failure or abort.
The success path still owns its cleanup (so it can format the worktreeNote
output); a `cleanupAttempted` flag tells the release callback to skip the
duplicate. On any other exit (failure, interrupt) the release callback now
calls cleanupWorktree as a best-effort, swallowing errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TTK95
Copy link
Copy Markdown
Owner Author

TTK95 commented May 6, 2026

@codex review

Fixed both findings in 67a8a2b:

  1. P1 run.ts: SDK client now passes directory: Instance.directory so the bound instance (incl. copy-mode temp workspace) is the routing target.
  2. P2 task.ts: cleanup runs on failure/abort via the release callback, gated by a cleanupAttempted flag so the success path's note formatting still owns the primary cleanup.

@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

const worktree =
params.isolation === "worktree" && !session
? yield* Effect.promise(() => createWorktree(Instance.directory, next.name))
: undefined

P1 Badge Abort when worktree isolation setup fails

When isolation: "worktree" is requested in a non-git directory (or any git worktree add failure), createWorktree(...) returns undefined and execution silently falls back to params.prompt without isolation. That means the subagent proceeds against the primary checkout even though the caller explicitly asked for isolation, which can unintentionally modify the main workspace. This path should fail the task instead of degrading to non-isolated execution.


await fs.rm(wt.dir, { recursive: true, force: true }).catch(() => {})
}
await git(baseDir, ["branch", "-D", wt.branch]).catch(() => undefined)
return { clean: true, removed: true }

P2 Badge Report worktree removal failure accurately

If git worktree remove fails (for example due to locked files), the fallback --force removal and fs.rm are attempted but their outcomes are ignored, and the function still returns removed: true. Callers then print the "worktree ... has been removed" success note even when cleanup actually failed, leaving orphaned temp worktrees/branches that are hard to discover. The return value should reflect real cleanup success.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

P1 (task.ts createWorktree): Fail the task when isolation=worktree was
requested but createWorktree returns undefined (non-git directory or
`git worktree add` failure). Previously the run silently degraded to the
primary checkout, which could unintentionally modify the main workspace.

P2 (task.ts cleanupWorktree): Return removed=false when actual cleanup
fails. The previous flow tried `git worktree remove` → `--force` → `fs.rm`
but unconditionally returned removed=true, so callers printed a "removed"
note even on locked/broken worktrees, leaving orphaned tmp dirs that were
hard to find. Now the return value reflects whether removal really
succeeded; the branch is force-deleted only when the worktree is actually
gone, and the fs.rm fallback treats `git worktree prune` success as the
authoritative signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TTK95
Copy link
Copy Markdown
Owner Author

TTK95 commented May 6, 2026

@codex review

Fixed both findings in 64fb13e:

  1. P1 task.ts createWorktree: now fails the task with a clear error when isolation=worktree was requested but worktree creation failed. No more silent degradation onto the primary checkout.
  2. P2 task.ts cleanupWorktree: returns removed: false when removal actually failed. Tracks each fallback (rm → --force → fs.rm + worktree prune) and only deletes the branch when the worktree is truly gone.

@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

RateLimit.tick(model.providerID, estimate)
const res = await fetchFn(input, {

P1 Badge Move local rate-limit tick after successful request dispatch

RateLimit.tick is executed before fetchFn runs, so any transport failure (DNS outage, timeout, TLS error, aborted request, etc.) is still counted as a sent request/tokens. With configured or learned limits, these failed attempts can trip RateLimit.check and block subsequent calls even though the provider never received them, causing avoidable self-throttling during outages. Tick should happen only after the request is successfully dispatched (or be rolled back on fetch failure).

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Previously RateLimit.tick ran before fetchFn, so DNS/TLS/timeout/abort
errors all counted against the local limiter even though the provider
never received the request. During an outage the limiter could trip and
self-throttle recovery attempts. Tick only after the request was
actually dispatched. recordResponse stays paired with the successful
fetch.

Codex P1 finding on PR #10. Code is upstream-origin but the fix applies
regardless.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@TTK95
Copy link
Copy Markdown
Owner Author

TTK95 commented May 6, 2026

@codex review

Fixed in 0c49129: RateLimit.tick moved to after fetchFn resolves with a Response. Transport failures (DNS/TLS/timeout/abort) no longer count against the local limiter, so an outage can no longer self-throttle recovery attempts.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. More of your lovely PRs please.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@TTK95
Copy link
Copy Markdown
Owner Author

TTK95 commented May 6, 2026

Codex gab grünes Licht in #10 (comment) ("Didn't find any major issues"). Schließe diesen Review-only-PR — die 4 Fix-Commits sind bereits auf dev.

@TTK95 TTK95 closed this May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants