
Supertone's Supertonic is a compact, 99M-parameter on-device text-to-speech engine that runs via ONNX for cross-platform inference.
Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.
Highlights
- ⚡ Blazingly Fast: Low-latency, real-time synthesis across desktop, browser, mobile, and edge, fast enough to turn an entire webpage into audio in under a second
- 🌍 31-Language Multilingual: Synthesize directly from text across 31 languages, or pass `lang="na"` to let Supertonic process the text language-agnostically when you don't know the input language; no separate language adapters needed
- 🪶 99M-Parameter Open-Weight Model: A compact, fully open-weight checkpoint, a fraction of the size of 0.7B–2B class open TTS systems, for smaller downloads, faster cold starts, and a lower memory footprint
- 📱 Edge-Device Ready: Runs locally on desktop, mobile, browsers, and resource-constrained hardware like Raspberry Pi or e-readers, with zero network dependency, complete privacy, and no GPU required
- 🔊 44.1kHz High-Quality Audio: Outputs studio-grade 44.1kHz 16-bit WAV directly, ready for production playback without any external upsampler
- 🎭 Expression Tags: 10 inline tags (e.g. `<laugh>`, `<breath>`, `<sigh>`) bring natural human nuance into generated speech without prompt engineering or reference audio
- 🛠️ Multi-Runtime SDKs: Ready-to-use examples through ONNX Runtime across Python, Node.js, Browser (WebGPU), Java, C++, C#, Go, Swift, iOS, Rust, and Flutter
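To make the 44.1kHz 16-bit WAV output format concrete, here is a minimal sketch that writes mono float samples as 16-bit PCM using only Python's standard `wave` module. The names `write_wav_16bit` and `sine_tone` are hypothetical helpers for illustration and are not part of Supertonic; a test tone stands in for synthesized speech, and the real SDK examples may write audio differently.

```python
import math
import struct
import wave

SAMPLE_RATE = 44_100  # Hz, the output rate named in the highlights


def write_wav_16bit(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file.

    Hypothetical helper: this is not Supertonic's API, only an illustration
    of the container format the highlights describe.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))


def sine_tone(freq_hz=440.0, seconds=0.5, sample_rate=SAMPLE_RATE):
    """Generate a half-amplitude sine wave as a stand-in for model output."""
    n = int(seconds * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

Calling `write_wav_16bit("tone.wav", sine_tone())` produces a file that standard players open at 44.1kHz without resampling, which is the property the highlight describes.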
🌍 Supported Languages (31)
Arabic (ar), Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Turkish (tr), Ukrainian (uk), Vietnamese (vi)
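As a rough sketch of how an application might choose the `lang` value, the helper below checks a user-supplied code against the 31 codes listed above and returns `"na"` otherwise. The function `resolve_lang` is hypothetical and not part of Supertonic. Falling back to `"na"` for unsupported codes, and reducing region-tagged codes such as `en-US` to their base language, are assumptions of this example; the source only describes `"na"` for cases where the input language is unknown.

```python
# The 31 language codes listed above.
SUPPORTED_LANGS = frozenset({
    "ar", "bg", "hr", "cs", "da", "nl", "en", "et", "fi", "fr", "de",
    "el", "hi", "hu", "id", "it", "ja", "ko", "lv", "lt", "pl", "pt",
    "ro", "ru", "sk", "sl", "es", "sv", "tr", "uk", "vi",
})

LANG_AGNOSTIC = "na"  # value the highlights describe for unknown input language


def resolve_lang(code):
    """Return a supported language code, or "na" if unknown or unsupported.

    Hypothetical helper: the fallback policy is an assumption of this sketch,
    not documented Supertonic behavior.
    """
    if not code:
        return LANG_AGNOSTIC
    # Normalize "en-US" / "en_US" / " EN " down to the base code "en".
    base = code.strip().lower().replace("_", "-").split("-")[0]
    return base if base in SUPPORTED_LANGS else LANG_AGNOSTIC
```

For example, `resolve_lang("en-US")` yields `"en"`, while an unlisted code such as `"zh"` or a missing value yields `"na"`.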
Best of all, it is fully open source and released under the MIT license.