fix(app + tui): clear stale running session indicators by nexxeln · Pull Request #17593 · anomalyco/opencode

nexxeln · 2026-03-15T12:10:09Z

problem

desktop could stay active from two paths: stale bootstrap session status entries and old incomplete assistant messages in history
tui could also keep later messages queued from an old incomplete assistant

fix

bootstrap now reconciles session status so stale busy entries are cleared
desktop now uses live session status for running state and only uses the trailing assistant to pick the active turn
backend now finalizes assistant messages when prompt setup fails

tui and desktop

desktop inlines the small running-state checks
tui now only treats the trailing assistant as pending
includes the bootstrap reconcile fix from fix(app): use reconcile for session_status in bootstrap to clear stale entries #17658 by @ualtinok

testing

bun test src/context/global-sync/session-cache.test.ts src/context/global-sync/event-reducer.test.ts
bun typecheck

Use live session status for the running UI and finalize assistant messages when prompt setup fails so crashed runs do not leave sessions stuck as active.

Inline the app-side running checks and make the tui only treat the trailing assistant as pending so old crashed messages do not keep sessions active or queued.

…e entries (cherry picked from commit 912ba2c)

rekram1-node · 2026-03-20T03:28:44Z

  const pending = createMemo(() => {
-    return messages().findLast((x) => x.role === "assistant" && !x.time.completed)?.id
+    const last = messages().findLast((x) => x.role === "assistant")
+    if (!last || last.time.completed) return
+    return last.id


Can u explain this?

Document the user-reported symptom (stuck 'Thinking' shimmer and stuck stop button on the input bar long after work finishes), the investigation, and the proposed three-commit fix sequence. Root cause: client-side derivations in message-timeline.tsx (added by our commit 17), session.tsx::busy(), sidebar-items.tsx::SessionItem, and session-turn.tsx all use 'message.findLast(incomplete assistant)' as a working-state signal. That signal is permanently TRUE for any session with a previously-crashed assistant message in DB — regardless of current activity. session_status is the correct source of truth and is reliably idle (verified by code-walk: processor.ts halt path, run-state.ts cancel path, runner.ts onIdle). Empirical confirmation: 12 incomplete assistant messages persist in the live aai.db. The most recent is in the session the user reported the bug from, persisted at 12:21:23 today. Plan: three commits. 1. Trust session_status only on the client. Drop message.findLast working-state derivations in message-timeline.tsx, session.tsx, sidebar-items.tsx; tighten session-turn.tsx::pending to last-msg-only. This alone fixes the user-visible symptom. 2. Server-side: wrap the prompt-setup phase (lines 1430-1528 of prompt.ts) in an Effect tap-error that finalizes the in-flight assistant message before re-throwing. Stops new residue from accumulating. 3. Optional one-shot DB sweep on boot to finalize old residue. Plus's prior fix (Dax's 6c9b2c3) is already in current via upstream commits 51d8219 and the equivalent run-state.cancel logic. Plus's fix targets the same area but is server-status-only; it does not address the message-list-derivation bug, which is what's biting now. Upstream PR anomalyco#17593 (open) does address it but predates the Effect migration; we lift the ideas, not the diff.

Stuck 'Thinking' shimmer on the timeline progress glide. Our commit 17 (128a2c9) added `!!pending() || sessionStatus().type !== "idle"` to the working memo, intending it as a self-heal. It's actually a permanent false positive: pending() reads the DB-persisted message list, and any assistant message that ever crashed mid-stream stays incomplete (time.completed = undefined) forever. The result is that sessions which previously crashed show 'working' forever even after session_status correctly reports idle. session_status reliably goes to idle on: - processor.ts halt path (status.set idle on assistant error) - run-state.ts cancel path (status.set idle when no runner exists) - runner.ts onIdle (fires on any fiber exit: success/failure/interrupt) Trust status only. The pending() memo is still used for activeMessageID derivation (commit 17's last-msg-only change is preserved there). Empirical: 12 incomplete assistant messages persist in opencode-aai.db, including one in the actively-debugged session created today. This commit eliminates the user-visible symptom for the timeline. Companion commits will apply the same trust-status-only pattern to session.tsx::busy(), sidebar-items.tsx::SessionItem.isWorking, and tighten session-turn.tsx::pending. Each is intentionally split so it can be reverted independently if upstream PR anomalyco#17593 lands. Refs upstream PR anomalyco#17593.

Same root cause as the message-timeline fix in 3e7acb8. The busy() helper used to OR session_status against `some(incomplete assistant)` on the message list. Sessions with previously-crashed assistant messages stay 'busy' forever per that derivation, even when no work is running. Trust status only. Aligns with the helper inlined in upstream PR anomalyco#17593's second commit (6bfce60). Refs upstream PR anomalyco#17593.

Same root cause as 3e7acb8 (timeline) and 91ae391 (session.busy). The sidebar's per-session 'working' badge was OR-ing session_status against `findLast(incomplete assistant)`, leaving stale-residue sessions perpetually showing the working indicator in the sidebar list. Trust status only. Aligns with upstream PR anomalyco#17593. Refs upstream PR anomalyco#17593.

session-turn.tsx::pending used findLast over the full message list, treating any older assistant message without time.completed as the pending one. Combined with active() and pendingUser() derivations, that meant the wrong turn could show as 'active' (drove the 'Thinking' shimmer in session-turn) for sessions with DB residue from a previously-crashed stream. Tighten to last-message-only: only the final message in the list can legitimately be the pending assistant message. Matches the pattern in upstream PR anomalyco#17593's commit 060f482. Combined with 3e7acb8, 91ae391, and 347095d, this closes the client-side stuck-Thinking / stuck-busy bug end-to-end: - timeline working derivation: trusts status (3e7acb8) - session.tsx::busy() helper: trusts status (91ae391) - sidebar SessionItem.isWorking: trusts status (347095d) - session-turn.tsx::pending: last-message-only (this commit) The persistent DB residue (12 incomplete assistant messages in the live aai.db) is now ignored by all UI working-state derivations. Server-side fix to stop NEW residue from accumulating is a separate follow-up (see docs/plans/stuck-session-status.md commit 2). Refs upstream PR anomalyco#17593.

The aai.1 release shipped four UI commits that made working-state derivations trust session_status only and tightened pending detection to last-message-only (3e7acb8, 91ae391, 347095d, ef4e181). That correctly silenced stuck Thinking from historical DB residue. User reports stuck Thinking still appears on individual turns. SQL probe of opencode-aai.db at the report time found 13 incomplete assistant messages (time.completed undefined) — including TWO created in the running aai.1 binary, in different sessions, both without 'error' set (so neither went through processor.halt). Diagnosis: prompt-setup-phase throws between sessions.updateMessage (prompt.ts:1425) and the Effect.ensuring(processor.cleanup) scope inside handle.process (begins at line 1487) leave the persisted assistant message permanently incomplete in DB. processor.cleanup DOES set time.completed at processor.ts:519, but its scope only covers the inner stream-drain block — anything that throws BEFORE handle.process is yielded never reaches that finalizer. Specifically: - processor.create's snapshot.track() at processor.ts:112 (git op) - resolveTools at prompt.ts:1436 - plugin.trigger experimental.chat.messages.transform at line 1476 - Effect.all([sys.skills, sys.environment, instruction.system, MessageV2.toModelMessagesEffect]) at line 1478 Each of these can throw with msg already in DB and time.completed undefined. The last-message-only pending derivation introduced in ef4e181 then renders that turn as 'Thinking' permanently. Fix: hoist sessions.updateMessage(msg) and processor.create({...}) into an outer Effect.gen and apply Effect.onError to the whole thing. The finalizer: - Returns immediately if msg.time.completed is already set (idempotence — processor.cleanup may have run first when handle.process was reached and then errored). - Otherwise sets msg.time.completed = Date.now() and persists via sessions.updateMessage(msg).pipe(Effect.catchCause(() => Effect.void)). Why Effect.onError vs alternatives: - tapErrorCause is functionally equivalent for this purpose. onError matches the codebase's existing finalizer idiom (see runner.ts:82 Effect.onExit) more closely. - onExit fires on every exit including success, which would generate redundant updateMessage event publishes on every successful turn (the idempotence guard prevents the DB-side double-write but not the event publish). - Effect v4 finalizers (onError, ensuring, addFinalizer) all run uninterruptibly by default, so the cleanup body cannot be interrupted mid-write. Why Effect.catchCause(() => Effect.void) instead of Effect.ignore: - sessions.updateMessage wraps Effect.sync(SyncEvent.run(...)). SyncEvent.run can throw synchronously on aggregate-missing or stale-version conditions, surfacing as a Die cause (defect) rather than a typed Fail. Effect.ignore in Effect v4 only converts typed Fail causes; defect causes can re-propagate and mask the original prompt-setup error. catchCause swallows ALL non-success Causes including defects, which is what we need to keep the cleanup truly infallible. Why instruction.clear stays inside the inner gen: - instruction.clear(handle.message.id) requires handle which only exists after processor.create resolves. Moving it to the outer scope would lose access to handle. Pre-change behavior was the same: if processor.create threw, instruction.clear never ran (handle was undefined). New structure preserves that. Why runLoop only (not handleSubtask or shellImpl): - Empirical evidence: the 13 residue messages are all in runLoop territory (from agents 'build' and 'plan', tool/text/step-start parts only, never subtask parts). handleSubtask at line 528 and shellImpl at line 721 already have their own finalize paths via Effect.ensuring and explicit time.completed sets. No residue traceable to those paths. - If subsequent residue traceable to handleSubtask or shellImpl appears, a follow-up commit applies the same pattern there. Why no boot-time DB sweep for the 13 historical messages: - With the parent plan's UI commits, those 13 messages are only visible as 'Thinking' on their respective last-message turns in their respective sessions — they no longer leak into other derivations. Cosmetic debt, not active spam. Bulk DB mutation of user state without explicit opt-in violates the project's principle of not silently changing data the user can see. Plan: docs/plans/stuck-session-status-followup.md (added in this commit). Dual-reviewed by code-review-opus and code-review-gpt5 both at plan stage and after implementation. Plan-stage review caught two blockers (scope too narrow excluding processor.create; Effect.ignore insufficient for defect Causes) and several issues (verification realism, upstream-PR linkage, plan section naming); all addressed before implementation. Implementation-stage review returned 0 blockers, 0 issues, 3 optional nits (no action needed). bun test from packages/opencode passes (2166 pass, 0 fail). bun typecheck clean. Refs upstream PR anomalyco#17593 (open) — same diagnosis ('backend now finalizes assistant messages when prompt setup fails'). Predates the Effect migration in this codebase, so we lift the idea, not the diff.

rekram1-node · 2026-05-15T03:20:05Z

Automated PR Cleanup

Thank you for contributing to opencode.

Due to the high volume of PRs from users and AI agents, we periodically close older PRs using automated criteria so maintainers can focus review time on the most active and community-supported contributions.

This PR was closed because it matched the following cleanup criteria:

The PR was created more than 1 month ago
The PR had fewer than 2 positive reactions
Positive reactions are counted as thumbs-up, heart, celebration, or rocket reactions on the PR

PRs created within the last month are not affected by this cleanup.

If you believe this PR was closed incorrectly, or if you are still actively working on it, please leave a comment explaining why it should be reopened. A maintainer can review and reopen it if appropriate.

Thanks again for taking the time to contribute.

fix(app): clear stale running session indicators

060f482

Use live session status for the running UI and finalize assistant messages when prompt setup fails so crashed runs do not leave sessions stuck as active.

nexxeln requested a review from adamdotdevin as a code owner March 15, 2026 12:10

fix(session): ignore stale pending ui state

6bfce60

Inline the app-side running checks and make the tui only treat the trailing assistant as pending so old crashed messages do not keep sessions active or queued.

nexxeln changed the title ~~fix(app): clear stale running session indicators~~ fix(app + tui): clear stale running session indicators Mar 15, 2026

fix(app): use reconcile for session_status in bootstrap to clear stal…

5a7a868

…e entries (cherry picked from commit 912ba2c)

nexxeln mentioned this pull request Mar 16, 2026

fix(app): use reconcile for session_status in bootstrap to clear stale entries #17658

Closed

6 tasks

nexxeln and others added 2 commits March 16, 2026 21:02

merge origin/dev into fix/stale-running-session-ui

c19cbcc

Merge branch 'dev' into fix/stale-running-session-ui

2ef6c88

rekram1-node reviewed Mar 20, 2026

View reviewed changes

dzianisv mentioned this pull request Mar 24, 2026

Sessions permanently stuck after server restart or stream interruption — no startup recovery for orphaned messages/tool parts #19023

Open

opencode-agent Bot force-pushed the dev branch from 2f2ed11 to 9f201d6 Compare April 16, 2026 21:55

rekram1-node closed this May 15, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(app + tui): clear stale running session indicators#17593

fix(app + tui): clear stale running session indicators#17593
nexxeln wants to merge 5 commits into
devfrom
fix/stale-running-session-ui

nexxeln commented Mar 15, 2026 •

edited

Loading

Uh oh!

rekram1-node Mar 20, 2026

Uh oh!

rekram1-node commented May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

nexxeln commented Mar 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

problem

fix

tui and desktop

testing

Uh oh!

rekram1-node Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

rekram1-node commented May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nexxeln commented Mar 15, 2026 •

edited

Loading