fix(app + tui): clear stale running session indicators#17593
Closed
nexxeln wants to merge 5 commits into
Closed
Conversation
Use live session status for the running UI and finalize assistant messages when prompt setup fails so crashed runs do not leave sessions stuck as active.
Inline the app-side running checks and make the tui only treat the trailing assistant as pending so old crashed messages do not keep sessions active or queued.
…e entries (cherry picked from commit 912ba2c)
6 tasks
Comment on lines
141
to
+144
| const pending = createMemo(() => { | ||
| return messages().findLast((x) => x.role === "assistant" && !x.time.completed)?.id | ||
| const last = messages().findLast((x) => x.role === "assistant") | ||
| if (!last || last.time.completed) return | ||
| return last.id |
rmk40
added a commit
to rmk40/opencode
that referenced
this pull request
Apr 26, 2026
Document the user-reported symptom (stuck 'Thinking' shimmer and stuck stop button on the input bar long after work finishes), the investigation, and the proposed three-commit fix sequence. Root cause: client-side derivations in message-timeline.tsx (added by our commit 17), session.tsx::busy(), sidebar-items.tsx::SessionItem, and session-turn.tsx all use 'message.findLast(incomplete assistant)' as a working-state signal. That signal is permanently TRUE for any session with a previously-crashed assistant message in DB — regardless of current activity. session_status is the correct source of truth and is reliably idle (verified by code-walk: processor.ts halt path, run-state.ts cancel path, runner.ts onIdle). Empirical confirmation: 12 incomplete assistant messages persist in the live aai.db. The most recent is in the session the user reported the bug from, persisted at 12:21:23 today. Plan: three commits. 1. Trust session_status only on the client. Drop message.findLast working-state derivations in message-timeline.tsx, session.tsx, sidebar-items.tsx; tighten session-turn.tsx::pending to last-msg-only. This alone fixes the user-visible symptom. 2. Server-side: wrap the prompt-setup phase (lines 1430-1528 of prompt.ts) in an Effect tap-error that finalizes the in-flight assistant message before re-throwing. Stops new residue from accumulating. 3. Optional one-shot DB sweep on boot to finalize old residue. Plus's prior fix (Dax's 6c9b2c3) is already in current via upstream commits 51d8219 and the equivalent run-state.cancel logic. Plus's fix targets the same area but is server-status-only; it does not address the message-list-derivation bug, which is what's biting now. Upstream PR anomalyco#17593 (open) does address it but predates the Effect migration; we lift the ideas, not the diff.
rmk40
added a commit
to rmk40/opencode
that referenced
this pull request
Apr 26, 2026
Stuck 'Thinking' shimmer on the timeline progress glide. Our commit 17 (128a2c9) added `!!pending() || sessionStatus().type !== "idle"` to the working memo, intending it as a self-heal. It's actually a permanent false positive: pending() reads the DB-persisted message list, and any assistant message that ever crashed mid-stream stays incomplete (time.completed = undefined) forever. The result is that sessions which previously crashed show 'working' forever even after session_status correctly reports idle. session_status reliably goes to idle on: - processor.ts halt path (status.set idle on assistant error) - run-state.ts cancel path (status.set idle when no runner exists) - runner.ts onIdle (fires on any fiber exit: success/failure/interrupt) Trust status only. The pending() memo is still used for activeMessageID derivation (commit 17's last-msg-only change is preserved there). Empirical: 12 incomplete assistant messages persist in opencode-aai.db, including one in the actively-debugged session created today. This commit eliminates the user-visible symptom for the timeline. Companion commits will apply the same trust-status-only pattern to session.tsx::busy(), sidebar-items.tsx::SessionItem.isWorking, and tighten session-turn.tsx::pending. Each is intentionally split so it can be reverted independently if upstream PR anomalyco#17593 lands. Refs upstream PR anomalyco#17593.
rmk40
added a commit
to rmk40/opencode
that referenced
this pull request
Apr 26, 2026
Same root cause as the message-timeline fix in 3e7acb8. The busy() helper used to OR session_status against `some(incomplete assistant)` on the message list. Sessions with previously-crashed assistant messages stay 'busy' forever per that derivation, even when no work is running. Trust status only. Aligns with the helper inlined in upstream PR anomalyco#17593's second commit (6bfce60). Refs upstream PR anomalyco#17593.
rmk40
added a commit
to rmk40/opencode
that referenced
this pull request
Apr 26, 2026
Same root cause as 3e7acb8 (timeline) and 91ae391 (session.busy). The sidebar's per-session 'working' badge was OR-ing session_status against `findLast(incomplete assistant)`, leaving stale-residue sessions perpetually showing the working indicator in the sidebar list. Trust status only. Aligns with upstream PR anomalyco#17593. Refs upstream PR anomalyco#17593.
rmk40
added a commit
to rmk40/opencode
that referenced
this pull request
Apr 26, 2026
session-turn.tsx::pending used findLast over the full message list, treating any older assistant message without time.completed as the pending one. Combined with active() and pendingUser() derivations, that meant the wrong turn could show as 'active' (drove the 'Thinking' shimmer in session-turn) for sessions with DB residue from a previously-crashed stream. Tighten to last-message-only: only the final message in the list can legitimately be the pending assistant message. Matches the pattern in upstream PR anomalyco#17593's commit 060f482. Combined with 3e7acb8, 91ae391, and 347095d, this closes the client-side stuck-Thinking / stuck-busy bug end-to-end: - timeline working derivation: trusts status (3e7acb8) - session.tsx::busy() helper: trusts status (91ae391) - sidebar SessionItem.isWorking: trusts status (347095d) - session-turn.tsx::pending: last-message-only (this commit) The persistent DB residue (12 incomplete assistant messages in the live aai.db) is now ignored by all UI working-state derivations. Server-side fix to stop NEW residue from accumulating is a separate follow-up (see docs/plans/stuck-session-status.md commit 2). Refs upstream PR anomalyco#17593.
rmk40
added a commit
to rmk40/opencode
that referenced
this pull request
Apr 27, 2026
The aai.1 release shipped four UI commits that made working-state derivations trust session_status only and tightened pending detection to last-message-only (3e7acb8, 91ae391, 347095d, ef4e181). That correctly silenced stuck Thinking from historical DB residue. User reports stuck Thinking still appears on individual turns. SQL probe of opencode-aai.db at the report time found 13 incomplete assistant messages (time.completed undefined) — including TWO created in the running aai.1 binary, in different sessions, both without 'error' set (so neither went through processor.halt). Diagnosis: prompt-setup-phase throws between sessions.updateMessage (prompt.ts:1425) and the Effect.ensuring(processor.cleanup) scope inside handle.process (begins at line 1487) leave the persisted assistant message permanently incomplete in DB. processor.cleanup DOES set time.completed at processor.ts:519, but its scope only covers the inner stream-drain block — anything that throws BEFORE handle.process is yielded never reaches that finalizer. Specifically: - processor.create's snapshot.track() at processor.ts:112 (git op) - resolveTools at prompt.ts:1436 - plugin.trigger experimental.chat.messages.transform at line 1476 - Effect.all([sys.skills, sys.environment, instruction.system, MessageV2.toModelMessagesEffect]) at line 1478 Each of these can throw with msg already in DB and time.completed undefined. The last-message-only pending derivation introduced in ef4e181 then renders that turn as 'Thinking' permanently. Fix: hoist sessions.updateMessage(msg) and processor.create({...}) into an outer Effect.gen and apply Effect.onError to the whole thing. The finalizer: - Returns immediately if msg.time.completed is already set (idempotence — processor.cleanup may have run first when handle.process was reached and then errored). - Otherwise sets msg.time.completed = Date.now() and persists via sessions.updateMessage(msg).pipe(Effect.catchCause(() => Effect.void)). Why Effect.onError vs alternatives: - tapErrorCause is functionally equivalent for this purpose. onError matches the codebase's existing finalizer idiom (see runner.ts:82 Effect.onExit) more closely. - onExit fires on every exit including success, which would generate redundant updateMessage event publishes on every successful turn (the idempotence guard prevents the DB-side double-write but not the event publish). - Effect v4 finalizers (onError, ensuring, addFinalizer) all run uninterruptibly by default, so the cleanup body cannot be interrupted mid-write. Why Effect.catchCause(() => Effect.void) instead of Effect.ignore: - sessions.updateMessage wraps Effect.sync(SyncEvent.run(...)). SyncEvent.run can throw synchronously on aggregate-missing or stale-version conditions, surfacing as a Die cause (defect) rather than a typed Fail. Effect.ignore in Effect v4 only converts typed Fail causes; defect causes can re-propagate and mask the original prompt-setup error. catchCause swallows ALL non-success Causes including defects, which is what we need to keep the cleanup truly infallible. Why instruction.clear stays inside the inner gen: - instruction.clear(handle.message.id) requires handle which only exists after processor.create resolves. Moving it to the outer scope would lose access to handle. Pre-change behavior was the same: if processor.create threw, instruction.clear never ran (handle was undefined). New structure preserves that. Why runLoop only (not handleSubtask or shellImpl): - Empirical evidence: the 13 residue messages are all in runLoop territory (from agents 'build' and 'plan', tool/text/step-start parts only, never subtask parts). handleSubtask at line 528 and shellImpl at line 721 already have their own finalize paths via Effect.ensuring and explicit time.completed sets. No residue traceable to those paths. - If subsequent residue traceable to handleSubtask or shellImpl appears, a follow-up commit applies the same pattern there. Why no boot-time DB sweep for the 13 historical messages: - With the parent plan's UI commits, those 13 messages are only visible as 'Thinking' on their respective last-message turns in their respective sessions — they no longer leak into other derivations. Cosmetic debt, not active spam. Bulk DB mutation of user state without explicit opt-in violates the project's principle of not silently changing data the user can see. Plan: docs/plans/stuck-session-status-followup.md (added in this commit). Dual-reviewed by code-review-opus and code-review-gpt5 both at plan stage and after implementation. Plan-stage review caught two blockers (scope too narrow excluding processor.create; Effect.ignore insufficient for defect Causes) and several issues (verification realism, upstream-PR linkage, plan section naming); all addressed before implementation. Implementation-stage review returned 0 blockers, 0 issues, 3 optional nits (no action needed). bun test from packages/opencode passes (2166 pass, 0 fail). bun typecheck clean. Refs upstream PR anomalyco#17593 (open) — same diagnosis ('backend now finalizes assistant messages when prompt setup fails'). Predates the Effect migration in this codebase, so we lift the idea, not the diff.
Collaborator
|
Automated PR Cleanup Thank you for contributing to opencode. Due to the high volume of PRs from users and AI agents, we periodically close older PRs using automated criteria so maintainers can focus review time on the most active and community-supported contributions. This PR was closed because it matched the following cleanup criteria:
PRs created within the last month are not affected by this cleanup. If you believe this PR was closed incorrectly, or if you are still actively working on it, please leave a comment explaining why it should be reopened. A maintainer can review and reopen it if appropriate. Thanks again for taking the time to contribute. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
problem
fix
tui and desktop
testing