
audiobooks
I wanted a way to turn ebooks into audiobooks without paying anyone or uploading text to a cloud, so I wrote a small wrapper around Kokoro-82M.
What it does: drop your text into book.txt, run ./collector.sh, get audiobook.mp3. That's it.
What I actually cared about while building it:
- Resumable. Pending sentences sit in a working file that shrinks from the top as chunks finish. Kill the process at any point, rerun, it picks up exactly where it stopped. No duplicates, no lost audio.
- Web UI on 127.0.0.1:8765 to pause / resume / stop while it's running. Useful when the GPU is needed for something else.
- ~8× realtime on GPU, also runs on CPU if you're patient. Works on old Maxwell cards (GTX 750 Ti / 9xx) with the CUDA 12.1 torch build.
- ffmpeg concatenates everything into a single MP3 with configurable silence between sentences.
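
For anyone curious how a "working file that shrinks from the top" can stay crash-safe, here is a minimal sketch of the idea. It is not the code from the repo: the function names, the `index<TAB>sentence` line format, the `.wav` chunk naming, and the `synthesize` callback are all assumptions made for illustration. The two things it tries to show are that chunk names derive from a stored index (so redoing a sentence overwrites rather than duplicates) and that the working file is swapped atomically with `os.replace`.

```python
import os
from pathlib import Path
from typing import Callable


def write_pending(pending: Path, sentences: list[str]) -> None:
    """Create the working file: one 'index<TAB>sentence' line per sentence.

    Assumes each sentence is a single line of text (hypothetical format).
    """
    lines = [f"{i}\t{s}\n" for i, s in enumerate(sentences)]
    pending.write_text("".join(lines), encoding="utf-8")


def drain(pending: Path, out_dir: Path,
          synthesize: Callable[[str, Path], None]) -> int:
    """Process the working file top-down; return how many chunks finished."""
    out_dir.mkdir(parents=True, exist_ok=True)
    done = 0
    while True:
        with pending.open(encoding="utf-8") as f:
            lines = f.readlines()
        if not lines:
            return done
        index, text = lines[0].rstrip("\n").split("\t", 1)
        # The chunk name comes from the stored index, so redoing a sentence
        # after a kill overwrites the same file instead of adding a duplicate.
        synthesize(text, out_dir / f"{int(index):06d}.wav")
        # Drop the finished line by writing a new file and swapping it in.
        # If the process dies before the swap, the old working file survives.
        tmp = pending.with_suffix(".tmp")
        tmp.write_text("".join(lines[1:]), encoding="utf-8")
        os.replace(tmp, pending)
        done += 1
```

Here `synthesize(text, path)` stands in for whatever actually calls the TTS model and writes the audio. Rewriting the whole file per sentence is wasteful for very long books, but it keeps the resume logic trivial: whatever is at the top of the file is the next thing to do.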
Voice quality comes from Kokoro-82M: surprisingly natural for an 82M-parameter model, and much better than I expected from something this small.
Stack: Python + Kokoro + ffmpeg + espeak-ng. MIT licensed.
Repo: https://github.com/arpecop/kokobook
Caveat: text-cleaning regexes are tuned for one ebook export format, so you'll likely need to tweak build_clean_text() for your source. PRs welcome.
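
To give a rough idea of the kind of tweaks involved, here is a sketch of a few common cleanup rules for text exported from an ebook. This is not the repo's build_clean_text(); the function name `clean_text_sketch` and every regex in it are assumptions chosen for illustration, and your export format may need quite different rules.

```python
import re


def clean_text_sketch(raw: str) -> str:
    """Illustrative cleanup rules; hypothetical, not the repo's function."""
    text = raw.replace("\r\n", "\n")
    # Drop lines that contain nothing but a number (typical page numbers).
    text = re.sub(r"(?m)^[ \t]*\d+[ \t]*$\n?", "", text)
    # Rejoin words hyphenated across a line break: "exam-\nple" -> "example".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat single newlines as soft wraps; keep blank lines as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


sample = "Chapter one be-\ngins here and\ncontinues.\n12\n\nNext paragraph."
print(clean_text_sketch(sample))
```

Both the page-number and the hyphen rules are blunt: the first also deletes a legitimate line that is just a number, and the second merges real compounds that happen to break at the hyphen ("well-\nknown" becomes "wellknown"). Check the output against a few pages of your own source before trusting it.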