feat(huddle): sentence-at-a-time voice-mode guidelines for lower TTS latency #996
Merged
…latency

The desktop already speaks each agent kind:9 as it arrives (queued, in order, deduped), so an agent that posts its first sentence immediately and the rest as separate messages cuts time-to-first-audio from "full reply generated" to "first sentence generated". This is the prompt-level equivalent of token streaming, with no harness or transport changes.

Rewrites the kind:48106 voice-mode guidelines to:
- instruct sentence-per-message delivery, first sentence ASAP
- say one short sentence before running a tool, so tool turns are not silent gaps
- treat a new human message mid-reply as an interruption: drop unsent sentences instead of finishing the stale tail

Prompt-only change; non-compliance degrades gracefully (TTS already splits multi-sentence messages internally, so we only lose the head start).

Co-authored-by: Tyler Longwell <tlongwell@squareup.com>
Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
…voice guidelines

Per review: a model could read "send each sentence as its own message" as "compose the full reply, then split it", or fail to connect "message" to a `buzz messages send` tool call. Spell both out: reply immediately without composing the full reply first, and send one sentence per separate `buzz messages send` call.

Co-authored-by: Tyler Longwell <tlongwell@squareup.com>
Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
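The delivery and interruption rules in these two commit messages can be pictured as a small loop. This is only a sketch of the behavior the prompt asks the model for, not code from the PR: `deliver_reply`, `send`, and `interrupted` are hypothetical names, with `send` standing in for one `buzz messages send` tool call and `interrupted` for "a new human message has arrived".

```python
from typing import Callable, Iterable


def deliver_reply(
    sentences: Iterable[str],
    send: Callable[[str], None],
    interrupted: Callable[[], bool],
) -> int:
    """Send one sentence per call; stop as soon as an interruption is seen.

    `send` is a hypothetical stand-in for a single `buzz messages send` call.
    `interrupted` is a hypothetical check for a new human message.
    Returns the number of sentences actually sent.
    """
    sent = 0
    for sentence in sentences:
        # Check before every send so the stale tail is dropped, not finished.
        if interrupted():
            break
        send(sentence)
        sent += 1
    return sent


if __name__ == "__main__":
    outbox = []
    # Simulate a human speaking up after the agent has sent two sentences.
    count = deliver_reply(
        ["On it.", "Checking the logs now.", "This tail is stale."],
        send=outbox.append,
        interrupted=lambda: len(outbox) >= 2,
    )
    print(count, outbox)
```

Passing `sentences` as a lazy iterable matters here: if the sentences come from a generator, nothing past the interruption point is ever composed.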
tlongwell-block pushed a commit that referenced this pull request Jun 12, 2026

Co-authored-by: npub1mprnacetjua2xx3p5eddmhxyk6wv929ymm5py8kd2xfxurxahspqqlgyta <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: npub1mprnacetjua2xx3p5eddmhxyk6wv929ymm5py8kd2xfxurxahspqqlgyta <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co>

* origin/main: (35 commits)
  feat(huddle): sentence-at-a-time voice-mode guidelines for lower TTS latency (#996)
  Shard desktop Playwright CI jobs (#992)
  chore(release): release version 0.3.18 (#995)
  Video Player Improvements (#993)
  Improve first-run welcome setup (#970)
  fix(release): use legacy updater key secret (#991)
  Replace built-in personas with Fizz (#987)
  docs(buzz-acp): rewrite Communication Patterns for mention accuracy and threading clarity (#982)
  chore(justfile): build git-credential-nostr in dev and staging recipes (#980)
  Fix Buzz command migration for saved agents (#979)
  fix(desktop): resolve effective model and prompt from persona in display path (#972)
  docs: clean up remaining Buzz references (#977)
  chore(release): release version 0.3.17 (#976)
  fix(onboarding): skip onboarding when relay already has a profile (#973)
  docs: finish Buzz rename cleanup (#974)
  fix(desktop): let channel members bypass mention agent gate (#965)
  Rename desktop app to Buzz (#960)
  feat(desktop): open profile panel from MembersSidebar rows (#962)
  feat(desktop): per-event notification sounds and alert controls (#968)
  fix(desktop): make header chrome zoom-correct and tidy split-pane (#941)
  ...

# Conflicts:
#   crates/buzz-agent/README.md
#   crates/buzz-agent/src/config.rs
wpfleger96 pushed a commit that referenced this pull request Jun 12, 2026

…session-new

* origin/main:
  fix(huddle): Pocket TTS quality overhaul — reference parity + cross-message pipelining (#997)
  Add manual ACP session rotation command (#932)
  fix(desktop): heal stale persona_team_dir paths in release builds (#1003)
  ci(docker): publish public ghcr.io/block/buzz image (native multi-arch) (#986)
  fix(buzz-agent): cap tool-result text at 50 KiB with middle elision (#952)
  feat(huddle): sentence-at-a-time voice-mode guidelines for lower TTS latency (#996)
  Shard desktop Playwright CI jobs (#992)
  chore(release): release version 0.3.18 (#995)
  Video Player Improvements (#993)
  Improve first-run welcome setup (#970)
  fix(release): use legacy updater key secret (#991)

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>

# Conflicts:
#   crates/buzz-acp/src/lib.rs
#   crates/buzz-agent/src/config.rs
tellaho added a commit that referenced this pull request Jun 12, 2026

…tate

* origin/main:
  Add relay disconnect UX: friendly errors, reconnect, cached identity (#1004)
  feat(agents): add active turn indicators to Agents Menu (#1005)
  ci: add fork guards to docker, release, and auto-tag workflows (#1007)
  docs(nip-rs): add optional thread read context scheme (#1006)
  fix(huddle): Pocket TTS quality overhaul — reference parity + cross-message pipelining (#997)
  Add manual ACP session rotation command (#932)
  fix(desktop): heal stale persona_team_dir paths in release builds (#1003)
  ci(docker): publish public ghcr.io/block/buzz image (native multi-arch) (#986)
  fix(buzz-agent): cap tool-result text at 50 KiB with middle elision (#952)
  feat(huddle): sentence-at-a-time voice-mode guidelines for lower TTS latency (#996)
  Shard desktop Playwright CI jobs (#992)
  chore(release): release version 0.3.18 (#995)
  Video Player Improvements (#993)
  Improve first-run welcome setup (#970)
  fix(release): use legacy updater key secret (#991)
  Replace built-in personas with Fizz (#987)
What
Rewrites the kind:48106 voice-mode guidelines (posted to the ephemeral huddle channel at huddle start) to instruct agents to reply one sentence per message, sending the first sentence the moment it's formed.
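To make "sending the first sentence the moment it's formed" concrete, here is a minimal sketch of emitting sentences incrementally from streamed text. It is an illustration, not code from this PR: the guidelines are prompt text, and `sentences_as_formed` plus the boundary rule (`.`, `!`, or `?` followed by whitespace) are assumptions chosen for the example.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: terminal punctuation followed by whitespace.
# It will mis-split abbreviations such as "e.g. "; fine for a sketch.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def sentences_as_formed(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each sentence as soon as its boundary has been seen."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # All parts except the last are complete sentences.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be growing; keep it buffered.
        buffer = parts[-1]
    tail = buffer.strip()
    if tail:
        # End of input: whatever remains is the final sentence.
        yield tail


if __name__ == "__main__":
    stream = ["Sure, I can", " help. Let me", " check the logs. One", " moment."]
    for sentence in sentences_as_formed(stream):
        # Each yielded sentence would become its own message.
        print(sentence)
```

Because the function is a generator, the first sentence is available to post before the later chunks have been produced, which is the head start the guidelines are after.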
Why
Today the dominant term in mic-stop → first-audio is waiting for the agent's complete reply: TTS can't start until the full kind:9 lands, so the entire LLM generation (2–10s) sits in the silent gap.
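A back-of-the-envelope model shows why the generation time dominates. The numbers below are illustrative assumptions, not measurements from this PR (a 150-token reply, a 15-token first sentence, 30 tokens per second), and `time_to_first_audio` is a hypothetical helper that ignores transport and TTS synthesis time.

```python
def time_to_first_audio(
    tokens_before_first_post: int,
    tokens_per_second: float,
    overhead_s: float = 0.0,
) -> float:
    """Seconds until the first message can be posted and therefore spoken.

    Only models LLM generation time plus an optional fixed overhead.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return overhead_s + tokens_before_first_post / tokens_per_second


# Illustrative assumptions only.
REPLY_TOKENS = 150
FIRST_SENTENCE_TOKENS = 15
RATE = 30.0

# Whole reply in one message: wait for all 150 tokens.
whole_reply_s = time_to_first_audio(REPLY_TOKENS, RATE)  # 5.0
# Sentence per message: wait only for the first 15 tokens.
first_sentence_s = time_to_first_audio(FIRST_SENTENCE_TOKENS, RATE)  # 0.5

if __name__ == "__main__":
    print(whole_reply_s, first_sentence_s)
```

Under these assumed numbers the silent gap shrinks from 5.0 s to 0.5 s; the real saving depends on the model's speed and on how long its first sentence is.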
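The desktop already speaks each agent kind:9 as it arrives: queued (depth 8), in order, deduped by event ID (useTtsSubscription → speak_agent_message). So an agent that posts its first sentence immediately, then the rest as separate messages, cuts time-to-first-audio from "full reply generated" to "first sentence generated". This is the prompt-level equivalent of token streaming, with zero harness or transport changes. It is the deliberate alternative to SSE streaming in buzz-agent, which we're keeping minimal.
Guideline changes:
- Deliver one sentence per message, sending the first sentence as soon as possible.
- Say one short sentence before running a tool, so tool turns are not silent gaps.
- Treat a new human message mid-reply as an interruption: drop unsent sentences instead of finishing the stale tail.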
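The playback behavior described in this section (a queue of depth 8, in-order playback, dedupe by event ID) can be sketched as a small data structure. This is an assumption-laden illustration, not the desktop's actual implementation: `SpeechQueue` and its methods are hypothetical, and the overflow policy (reject the newest message when full) is a guess, since the description only states the depth.

```python
from collections import deque
from typing import Deque, Optional, Set, Tuple


class SpeechQueue:
    """Bounded FIFO of messages to speak, deduplicated by event ID."""

    def __init__(self, max_depth: int = 8) -> None:
        self._max_depth = max_depth
        self._pending: Deque[Tuple[str, str]] = deque()
        self._seen: Set[str] = set()

    def offer(self, event_id: str, text: str) -> bool:
        """Queue a message; return False if it is a duplicate or the queue is full."""
        if event_id in self._seen:
            return False  # same event delivered twice: speak it only once
        if len(self._pending) >= self._max_depth:
            return False  # assumed policy: reject the newest when full
        self._seen.add(event_id)
        self._pending.append((event_id, text))
        return True

    def next_utterance(self) -> Optional[str]:
        """Pop the oldest queued text, or None if nothing is waiting."""
        if not self._pending:
            return None
        _, text = self._pending.popleft()
        return text


if __name__ == "__main__":
    queue = SpeechQueue()
    queue.offer("evt-1", "On it.")
    queue.offer("evt-1", "On it.")  # duplicate delivery, ignored
    queue.offer("evt-2", "Checking the logs now.")
    print(queue.next_utterance(), queue.next_utterance(), queue.next_utterance())
```

With this shape, sentence-per-message delivery needs nothing new on the playback side: each short message is just another queue entry spoken in arrival order.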
Risk
Prompt-only; degrades gracefully. If a model ignores the instruction and posts a multi-sentence message, TTS already splits sentences internally and plays it normally — we only lose the head start on that turn. No error mode.
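The graceful-degradation claim can be checked with a toy model: if the speaker splits every message into sentences before playback, a single multi-sentence message and the same sentences sent separately produce the same utterances in the same order. The splitter below is an assumed stand-in (terminal punctuation followed by whitespace); the real TTS splitting rules are not shown in this PR, and `split_sentences` and `utterances` are hypothetical names.

```python
import re
from typing import List


def split_sentences(message: str) -> List[str]:
    """Assumed splitter: break after ., !, or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", message.strip()) if s]


def utterances(messages: List[str]) -> List[str]:
    """Flatten a sequence of messages into the sentences that get spoken."""
    spoken: List[str] = []
    for message in messages:
        spoken.extend(split_sentences(message))
    return spoken


if __name__ == "__main__":
    compliant = ["On it.", "Checking the logs now.", "Found the error."]
    non_compliant = ["On it. Checking the logs now. Found the error."]
    # Same audio content either way; only the start time differs.
    print(utterances(compliant) == utterances(non_compliant))
```

What the non-compliant case loses is timing, not content: playback cannot begin until the whole message has been generated and posted.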
Verification
- `cargo test` (desktop-tauri, full): 518/518 ✓
- `cargo clippy -D warnings` ✓
Follow-ups (separate PRs)
Context: discussed in #sprout-conversational-agents — replaces the Concierge-destination approach from #910 (closed) with huddle-first latency work.