u/bhalothia

OSS to win - VoiceBox is here
▲ 6 r/VoiceAutomationAI+1 crossposts

OSS app replaces ElevenLabs & WisprFlow, runs 100% locally.

→ Clone a voice from 3s of audio
→ 7 TTS engines in one
→ 23 languages, including Arabic, Hindi, and Japanese
→ Built-in MCP server so Claude Code/Cursor/Cline can speak in the cloned voice
→ Local LLM rewrites text in character before TTS

u/bhalothia — 4 days ago

There's a 4,000-word article going around about voice AI latency benchmarks.
It's well-researched. It's also mostly useless in production.

Here's what we actually track at kolsetu dot com after running hundreds of thousands of real voice agent calls. Some learnings:

1. Correlate your metrics per turn or they're meaningless

2. Track cancelled compute

3. Connection pool health is worth more than model benchmarks - benchmarks don't always match reality

4. Split interruptions from backchannels

5. The barge-in config that saved our UX - there's a right time to interrupt, figure that out

6. Silence handling is its own subsystem

7. Our SLO is 1.5s p95, not 800ms - 800ms isn't realistic and isn't required

8. Dual mode: pipeline AND realtime - you will thank me for this

Curious to know what's working for you guys. What do you measure?
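To make points 1, 2 and 7 concrete, here's a rough sketch of what per-turn tracking could look like. This is not our production code: the `TurnMetrics` fields, the STT/LLM/TTS stage split, and the `summarize` helper are all made up for illustration, and the only number taken from the list above is the 1.5s p95 target.

```python
from dataclasses import dataclass

SLO_P95_SECONDS = 1.5  # target from point 7; everything else here is hypothetical


@dataclass
class TurnMetrics:
    """One record per conversational turn, so stage timings stay correlated."""
    turn_id: str
    stt_s: float             # speech-to-text time for this turn
    llm_s: float             # LLM generation time for this turn
    tts_s: float             # text-to-speech time for this turn
    cancelled: bool = False  # True if the user barged in and the reply was discarded

    @property
    def total_s(self) -> float:
        return self.stt_s + self.llm_s + self.tts_s


def p95(values):
    """Nearest-rank 95th percentile (integer arithmetic for the rank)."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def summarize(turns):
    """Report p95 over completed turns plus compute spent on cancelled ones."""
    completed = [t.total_s for t in turns if not t.cancelled]
    cancelled_compute = sum(t.llm_s + t.tts_s for t in turns if t.cancelled)
    latency_p95 = p95(completed)
    return {
        "p95_s": latency_p95,
        "slo_met": latency_p95 <= SLO_P95_SECONDS,
        "cancelled_compute_s": cancelled_compute,
    }
```

Because every stage timing hangs off one `turn_id`, a slow turn can be traced to the stage that caused it instead of being averaged away, and cancelled turns are reported separately rather than polluting the latency percentile.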

u/bhalothia — 16 days ago


...and that too while the task was in progress. Claude is disappointing me these days...

For context: I'm just reprompting something that was already generated; it was simply to edit the content.

u/bhalothia — 25 days ago