feat(voice): add send/edit/cancel confirmation buttons for voice transcription #135

Open
DovyLive173 wants to merge 3 commits into grinev:main from DovyLive173:feat/voice-confirm-buttons

@DovyLive173

Summary

Replace the auto-send flow for voice transcriptions with an interactive confirmation step. After transcription, the user now sees a message with inline buttons to Send, Edit, or Cancel the recognized text before it is sent to OpenCode.

Changes

New file: src/bot/handlers/stt-confirm.ts

  • showSttConfirmation() — displays the recognized text with Send/Edit/Cancel inline keyboard
  • handleSttConfirmCallback() — handles button callbacks:
    • Send — applies the optional STT_NOTE_PROMPT and forwards to processUserPrompt
    • Edit — transitions the interaction to expectedInput: "mixed" (accepts both text and a new voice recording) and prompts the user to send corrections
    • Cancel — clears the interaction and marks the voice message as cancelled
  • handleSttEditText() — receives corrected text (or a re-recording routed through the voice handler) and sends it to OpenCode via processUserPrompt
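
To make the button handling above concrete, here is a minimal sketch of how the Send/Edit/Cancel keyboard and its callback data could be built and parsed. The prefix `stt_confirm:`, the helper names `buildSttKeyboard` and `parseSttCallback`, and the label map are assumptions for illustration and are not taken from this PR; only the `inline_keyboard` / `callback_data` shape follows the Telegram Bot API.

```typescript
// Hypothetical action names and prefix; the PR does not show the real callback strings.
type SttAction = "send" | "edit" | "cancel";
const STT_CALLBACK_PREFIX = "stt_confirm:";

// Plain-object shape of a Telegram inline keyboard.
interface InlineButton {
  text: string;
  callback_data: string;
}
interface InlineKeyboardMarkup {
  inline_keyboard: InlineButton[][];
}

// Build a single row with one button per action, using already-translated labels.
function buildSttKeyboard(labels: Record<SttAction, string>): InlineKeyboardMarkup {
  const actions: SttAction[] = ["send", "edit", "cancel"];
  return {
    inline_keyboard: [
      actions.map((action) => ({
        text: labels[action],
        callback_data: `${STT_CALLBACK_PREFIX}${action}`,
      })),
    ],
  };
}

// Return the encoded action, or null so that other callback handlers can take over.
function parseSttCallback(data: string): SttAction | null {
  if (!data.startsWith(STT_CALLBACK_PREFIX)) return null;
  const action = data.slice(STT_CALLBACK_PREFIX.length);
  return action === "send" || action === "edit" || action === "cancel" ? action : null;
}
```

Returning `null` for foreign callback data is one way to let a handler sit in a chain without swallowing callbacks that belong to other features.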

Modified: src/bot/handlers/voice.ts

  • Simplified the handler: removed the inline processPrompt call and the processPrompt dependency injection in favor of showSttConfirmation
  • Removed unused FilePartInput import and processPrompt option from VoiceMessageDeps

Modified: src/interaction/guard.ts

  • Inside the expectedInput === "mixed" block: added an exception to allow inputType === "other" (voice/audio messages) when the active interaction is a custom STT interaction (kind === "custom" with sttTranscript metadata). This enables re-recording during edit without being blocked by the guard.
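
The guard exception described above might look roughly like the following sketch. The `Interaction` shape, the function names, and the assumption that "mixed" already accepts text and callback input are illustrative guesses; only `expectedInput === "mixed"`, `inputType === "other"`, `kind === "custom"`, and the `sttTranscript` metadata come from the description.

```typescript
// Assumed input categories; the PR text only confirms "other" for voice/audio.
type InputType = "text" | "callback" | "other";

// Assumed minimal interaction shape for this sketch.
interface Interaction {
  kind: string;
  expectedInput: string;
  metadata?: { sttTranscript?: string };
}

// A custom interaction counts as an STT interaction when it carries a transcript.
function isSttCustomInteraction(interaction: Interaction): boolean {
  return interaction.kind === "custom" && typeof interaction.metadata?.sttTranscript === "string";
}

// Decide whether an input passes the guard while the interaction is in "mixed" mode.
function isAllowedInMixedMode(interaction: Interaction, inputType: InputType): boolean {
  if (interaction.expectedInput !== "mixed") return false; // other modes are out of scope here
  // Assumption: "mixed" accepts text and callback input regardless of interaction kind.
  if (inputType === "text" || inputType === "callback") return true;
  // The exception: voice/audio is let through only for STT custom interactions.
  return isSttCustomInteraction(interaction);
}
```

Keeping the exception tied to the `sttTranscript` metadata means other custom interactions in "mixed" mode would still reject voice input.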

Modified: src/bot/index.ts

  • Registered handleSttConfirmCallback in the callback query handler chain
  • Registered handleSttEditText before processUserPrompt in the text handler to intercept edited text
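
The registration order matters because the first handler that claims a message stops the chain. The sketch below illustrates that idea with synchronous stand-ins; `ChainState`, `dispatchText`, and both stand-in handlers are hypothetical, and the real handlers in src/bot/index.ts are presumably asynchronous and wired through the bot framework rather than a plain array.

```typescript
// Hypothetical shared state for the sketch.
interface ChainState {
  sttEditPending: boolean;
  log: string[];
}

// A handler returns true when it has consumed the message.
type TextHandler = (text: string, state: ChainState) => boolean;

// Stand-in for handleSttEditText: only claims text while an STT edit is pending.
const sttEditTextStandIn: TextHandler = (text, state) => {
  if (!state.sttEditPending) return false;
  state.sttEditPending = false;
  state.log.push(`stt-edit:${text}`);
  return true;
};

// Stand-in for the default prompt handler: claims everything that reaches it.
const userPromptStandIn: TextHandler = (text, state) => {
  state.log.push(`prompt:${text}`);
  return true;
};

// Run handlers in registration order and stop at the first one that claims the text.
function dispatchText(handlers: TextHandler[], text: string, state: ChainState): void {
  for (const handler of handlers) {
    if (handler(text, state)) return;
  }
}
```

With the edit handler registered first, corrected text is intercepted while an edit is pending, and ordinary messages fall through to the default handler.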

i18n

  • Added 9 new keys (stt.confirm_message, stt.confirm_send, stt.confirm_edit, stt.confirm_cancel, stt.confirm_sending, stt.confirm_edit_prompt, stt.confirm_edit_sending, stt.confirm_cancelled, stt.confirm_inactive) to all supported languages: en, de, es, fr, ru, zh
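
Since the same nine keys must exist in six locales, a small completeness check can catch a missed translation. This is a sketch only: `findMissingSttKeys` and the flat `Dictionary` shape are assumptions, and the project's actual i18n storage format is not shown in this PR. The key and locale lists are copied from the description above.

```typescript
// Keys and locales as listed in the PR description.
const STT_CONFIRM_KEYS = [
  "stt.confirm_message",
  "stt.confirm_send",
  "stt.confirm_edit",
  "stt.confirm_cancel",
  "stt.confirm_sending",
  "stt.confirm_edit_prompt",
  "stt.confirm_edit_sending",
  "stt.confirm_cancelled",
  "stt.confirm_inactive",
] as const;
const STT_LOCALES = ["en", "de", "es", "fr", "ru", "zh"] as const;

// Assumed shape: one flat key-to-string map per locale.
type Dictionary = Record<string, string>;

// Return "locale:key" entries for every key that is absent or empty.
function findMissingSttKeys(dictionaries: Partial<Record<string, Dictionary>>): string[] {
  const missing: string[] = [];
  for (const locale of STT_LOCALES) {
    const dictionary: Dictionary = dictionaries[locale] ?? {};
    for (const key of STT_CONFIRM_KEYS) {
      if (!dictionary[key]) missing.push(`${locale}:${key}`);
    }
  }
  return missing;
}
```

A check like this could run in a unit test so that adding a tenth key later fails fast for any locale that lacks it.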

Tests

  • Updated tests/bot/handlers/voice.test.ts:
    • Removed tests for the old auto-send flow
    • Added verification that the confirmation interaction is created after transcription
    • Updated existing tests to check for confirmation interaction state instead of direct processPrompt calls

How it works

User sends voice message
  → Voice handler transcribes audio
  → Status message updated with recognized text + inline buttons
  → User chooses:
     • Send → prompt sent to OpenCode (respects STT_NOTE_PROMPT)
     • Edit → user can send corrected text OR a new voice recording
               → New recording is re-transcribed and shows confirmation again
               → Corrected text is sent directly to OpenCode
     • Cancel → transcription discarded
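
The flow above can be read as a small state machine. The following sketch models it as a pure transition function; the phase names, event names, and `nextSttState` are hypothetical, the STT_NOTE_PROMPT step is left out, and events the description does not cover (for example Cancel while editing) are simply ignored here.

```typescript
// Hypothetical phases of one voice confirmation.
type SttState =
  | { phase: "confirming"; transcript: string }
  | { phase: "editing"; transcript: string }
  | { phase: "sent"; prompt: string }
  | { phase: "cancelled" };

// Hypothetical events: three buttons plus the two kinds of edit input.
type SttEvent =
  | { type: "send" }
  | { type: "edit" }
  | { type: "cancel" }
  | { type: "text"; text: string }
  | { type: "voice"; transcript: string };

// Pure transition function; unknown combinations leave the state unchanged.
function nextSttState(state: SttState, event: SttEvent): SttState {
  switch (state.phase) {
    case "confirming":
      if (event.type === "send") return { phase: "sent", prompt: state.transcript };
      if (event.type === "edit") return { phase: "editing", transcript: state.transcript };
      if (event.type === "cancel") return { phase: "cancelled" };
      return state;
    case "editing":
      // Corrected text goes straight out; a re-recording returns to confirmation.
      if (event.type === "text") return { phase: "sent", prompt: event.text };
      if (event.type === "voice") return { phase: "confirming", transcript: event.transcript };
      return state;
    default:
      return state; // "sent" and "cancelled" are terminal in this sketch
  }
}
```

Modelling the flow this way makes the re-recording loop explicit: Edit followed by a new voice message lands back in the confirming phase with the new transcript.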

Testing

  • npm run build — passes
  • npm run lint — passes (0 warnings)
  • npm test — passes (942 tests, 111 files, all passing)
  • Manual testing with a Telegram dev bot verified:
    • Full confirmation flow (Send / Edit / Cancel)
    • Text correction during edit
    • Voice re-recording during edit (new transcription shows confirmation again)
    • Interaction expiry and inactive confirmation handling

…scription

Replace the auto-send flow for voice messages with an interactive
confirmation step. After transcription, the user now sees a message with
inline buttons to Send, Edit, or Cancel the recognized text.

- New stt-confirm.ts handler with showSttConfirmation, callback handling,
  and text/voice re-recording during edit
- Interaction guard allows voice messages during expectedInput='mixed'
  for STT custom interactions (re-recording during edit)
- Bot index registers the new callback and text handlers
- i18n strings added for all supported languages (en/de/es/fr/ru/zh)
- Voice handler tests updated to verify confirmation flow
- Unused processPrompt dependency removed from voice.ts