Voice input not isolated per session + captures external audio #867

@Ch1ronWang

Problem

Voice input has two isolation issues:

1. No session isolation — voice is global across all sessions

activeVoice is a module-level variable in packages/opencode/src/cli/cmd/tui/component/prompt/index.tsx:91:

// Module-level voice state: survives component remounts and route changes
let activeVoice: {
  handle: Voice.StreamingHandle
  ...
}

When multiple sessions are open, they all share this single global voice handle. Speaking into the mic in one session writes text into all sessions' input boxes, because appendText/setText references get overwritten by whichever prompt component last mounted.
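One possible direction, sketched under assumptions: key the voice state by session ID instead of keeping a single module-level slot. Everything below is hypothetical and not taken from the codebase: `VoiceRegistry`, `SessionVoice`, and the `sessionID` strings are illustrative names, and `StreamingHandle` is a stand-in for `Voice.StreamingHandle`, whose real shape is not shown in this issue.

```typescript
// Hypothetical stand-in for Voice.StreamingHandle (real type not shown in the issue).
interface StreamingHandle {
  stop(): void
}

// Per-session voice state: the handle plus the callback of the prompt that owns it.
interface SessionVoice {
  handle: StreamingHandle
  appendText: (text: string) => void
}

// Assumed design: one entry per session instead of one shared module-level variable.
class VoiceRegistry {
  private readonly bySession = new Map<string, SessionVoice>()

  // Register voice for a session, stopping any handle that session already had.
  start(sessionID: string, voice: SessionVoice): void {
    this.stop(sessionID)
    this.bySession.set(sessionID, voice)
  }

  // Stop and forget the voice for one session; other sessions are untouched.
  stop(sessionID: string): void {
    const existing = this.bySession.get(sessionID)
    if (!existing) return
    existing.handle.stop()
    this.bySession.delete(sessionID)
  }

  // Route transcribed text only to the session it was recorded for.
  deliver(sessionID: string, text: string): void {
    this.bySession.get(sessionID)?.appendText(text)
  }

  isActive(sessionID: string): boolean {
    return this.bySession.has(sessionID)
  }
}
```

With a registry like this, a remounting prompt component would re-register its callbacks under its own session ID rather than overwriting a shared reference, so the state can still survive remounts and route changes.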

2. No audio source isolation — external sounds are captured

The recorder (packages/opencode/src/cli/cmd/tui/util/voice.ts:22-31) uses arecord or sox to capture from the default audio input device. There is no distinction between:

  • The user's voice (intended input)
  • System audio from playing videos, music, or other applications

The VAD (vad.ts) detects speech activity from whatever audio comes through, so any audio source with speech-like characteristics gets transcribed and written to the input box.
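A possible mitigation on PulseAudio, sketched under an assumption this issue does not confirm: that the unwanted audio arrives through a monitor source (PulseAudio exposes each output as a source whose name ends in `.monitor`). If playback is instead picked up acoustically by the microphone, source selection will not help. The sketch parses the tab-separated output of `pactl list short sources` and picks the first non-monitor source. `parsePactlSources` and `pickMicrophone` are hypothetical helper names, not existing functions in `voice.ts`.

```typescript
// One row of `pactl list short sources` (tab-separated: index, name, driver, spec, state).
interface PulseSource {
  index: number
  name: string
  isMonitor: boolean
}

// Parse the command output; only the index and name columns are used here.
function parsePactlSources(output: string): PulseSource[] {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [index, name] = line.split("\t")
      const sourceName = name ?? ""
      return {
        index: Number(index),
        name: sourceName,
        // Monitor sources mirror an output device and carry system audio.
        isMonitor: sourceName.endsWith(".monitor"),
      }
    })
}

// Return the first source that is not a monitor, or undefined if none exists.
function pickMicrophone(sources: PulseSource[]): PulseSource | undefined {
  return sources.find((source) => !source.isMonitor)
}
```

The chosen source name could then be passed to a recorder that accepts an explicit PulseAudio source, for example `parecord --device=<name>`. Whether the existing arecord/sox invocation can be pointed at it the same way depends on the local ALSA/PulseAudio setup.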

Expected behavior

  1. Each session should have its own independent voice instance — enabling/disabling voice in one session should not affect others.
  2. Voice should only capture from the microphone input, not from system/desktop audio output.

Environment

  • MiMoCode CLI TUI
  • Linux (WSL2) with PulseAudio
  • Multiple sessions open simultaneously
