u/Fabulous_Tip_8539

Open-sourcing an iOS PoC for offline streaming ASR with NVIDIA Nemotron 3.5 + Core ML

I’m open-sourcing a small iOS proof of concept for offline, on-device speech recognition using NVIDIA Nemotron-3.5-ASR Streaming 0.6B via Core ML.
It supports live microphone streaming and offline file transcription on physical iPhone/iPad hardware. The app can also run without the model files installed, so you can still test mic capture, 16 kHz resampling, chunking, and the benchmark UI before setting up the model.
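For anyone curious what the resampling and chunking stages amount to, here is a rough, language-agnostic sketch in Python. It is not the app's code (the app is an iOS project). The function names, the linear-interpolation resampler, and the 160 ms chunk length are all assumptions made for illustration; a production pipeline would normally use a proper filtered resampler.

```python
from typing import Iterator, List

TARGET_RATE = 16_000  # Hz; the sample rate mentioned in the post


def resample_linear(samples: List[float], src_rate: int,
                    dst_rate: int = TARGET_RATE) -> List[float]:
    """Resample by linear interpolation.

    Illustration only: there is no anti-aliasing filter here.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def chunk(samples: List[float], chunk_ms: int = 160,
          rate: int = TARGET_RATE) -> Iterator[List[float]]:
    """Yield fixed-size chunks, zero-padding the last one.

    chunk_ms=160 is a hypothetical value, not taken from the repo.
    """
    size = rate * chunk_ms // 1000
    for start in range(0, len(samples), size):
        block = samples[start:start + size]
        if len(block) < size:
            block = block + [0.0] * (size - len(block))
        yield block
```

With these assumptions, one second of 48 kHz audio becomes 16,000 samples and is split into seven chunks of 2,560 samples, the last one padded with silence.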

I tested it on my iPhone 15 Pro, and live transcription runs close to real time, especially for English.

Repo focus:

* Offline/private ASR on iOS
* Core ML inference
* Live mic transcription
* Offline file transcription
* Audio pipeline tests
* Benchmark and latency tracking
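As a rough idea of what latency tracking can mean for streaming ASR, the Python sketch below records per-chunk processing time and derives a real-time factor (processing time divided by audio duration, so values below 1 mean faster than real time). The `LatencyTracker` class and its fields are hypothetical and are not taken from the repo's benchmark code.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class LatencyTracker:
    """Hypothetical helper: tracks how long each audio chunk takes to process."""
    chunk_seconds: float                      # audio duration of one chunk
    latencies: List[float] = field(default_factory=list)

    def record(self, seconds: float) -> None:
        self.latencies.append(seconds)

    def measure(self, fn: Callable[..., Any], *args: Any) -> Any:
        """Run fn(*args), record its wall-clock time, and return its result."""
        start = time.perf_counter()
        result = fn(*args)
        self.record(time.perf_counter() - start)
        return result

    @property
    def real_time_factor(self) -> float:
        """Total processing time divided by total audio duration."""
        if not self.latencies:
            raise ValueError("no chunks recorded yet")
        total_audio = len(self.latencies) * self.chunk_seconds
        return sum(self.latencies) / total_audio
```

For example, wrapping each inference call in `tracker.measure(...)` and reading `tracker.real_time_factor` after a run gives a single number that is easy to compare across devices.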

It’s still a PoC, but both main inference paths are working end-to-end on device. Feedback, testing on different iPhones, and contributions are welcome.

github.com
u/Fabulous_Tip_8539 — 8 hours ago