🆕 What's New

Text-to-Speech streaming with timestamps: TTS streaming now supports timestamps for improved synchronization and developer workflows.
Expanded phoneme control documentation: Added phoneme control support for English, Japanese, and Chinese in the developer documentation.
Improved ASR stability and latency: Reduced ASR response delays for a faster and more stable experience.
ASR model & pricing visibility: The Developer → Control page now displays ASR model information and pricing ($0.36 per audio hour).
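
For a rough sense of what the listed ASR rate means in practice, here is a minimal Python sketch. It assumes cost scales linearly with audio duration. The helper name and the rounding are illustrative only and are not part of any Fish Audio API, and actual billing granularity may differ.

```python
# Rate as listed on the Developer → Control page.
ASR_PRICE_PER_AUDIO_HOUR_USD = 0.36


def estimate_asr_cost(audio_seconds: float) -> float:
    """Hypothetical helper: estimate ASR cost in USD for a given audio length.

    Assumption: billing is linear in duration (no minimum charge, no rounding
    up to whole minutes or hours). Check your account for real billing rules.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    hours = audio_seconds / 3600
    return round(hours * ASR_PRICE_PER_AUDIO_HOUR_USD, 4)


if __name__ == "__main__":
    for minutes in (10, 60, 600):
        cost = estimate_asr_cost(minutes * 60)
        print(f"{minutes} min of audio: about ${cost:.2f}")
```

Under these assumptions, 10 hours of audio comes to about $3.60.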

🛠️ Bug Fixes

Got something cool to share? Post it here on our subreddit or drop it in the 🎧│voice-models channel on our Discord and tag us! We're looking to feature projects from our community.

Stay creative 🎵

— The Fish Audio Team