
Has anyone made this before? Real-time speech detection and transcription in Termux
I built a project called Termux-STT that does real-time speech detection + transcription directly inside Termux on Android.
What it does
- Continuously listens through the microphone
- Detects when speech starts/stops
- Records only active speech
- Transcribes locally using Whisper
- Runs fully inside Termux without needing a full Android app
Current focus is low-latency real-time usage rather than batch transcription.
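For anyone wondering what the "detect when speech starts/stops, record only active speech" part looks like in principle, here is a minimal sketch of a simple energy-threshold detector. It is not the code from the repo, just one common way to do it. The names `rms`, `segment_speech`, `threshold`, `start_frames` and `hangover_frames` are made up for illustration, and the default values are guesses that would need tuning per device. Microphone capture and the Whisper call are left out on purpose; the sketch only works on frames that are already lists of float samples in [-1.0, 1.0].

```python
import math
from typing import Iterable, Iterator, List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def segment_speech(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.02,      # assumed energy cutoff, needs tuning
    start_frames: int = 3,        # loud frames in a row needed to start
    hangover_frames: int = 10,    # quiet frames in a row needed to stop
) -> Iterator[List[Sequence[float]]]:
    """Yield one list of frames per detected utterance."""
    recording = False
    pending: List[Sequence[float]] = []   # loud frames before speech is confirmed
    segment: List[Sequence[float]] = []
    quiet_run = 0

    for frame in frames:
        loud = rms(frame) >= threshold
        if not recording:
            if loud:
                pending.append(frame)
                if len(pending) >= start_frames:
                    # Enough consecutive loud frames: treat this as speech.
                    recording = True
                    segment, pending = pending, []
                    quiet_run = 0
            else:
                # A short click or pop does not count as speech.
                pending = []
        else:
            segment.append(frame)
            if loud:
                quiet_run = 0
            else:
                quiet_run += 1
                if quiet_run >= hangover_frames:
                    # Speech ended: drop the trailing quiet frames and emit.
                    yield segment[: len(segment) - quiet_run]
                    recording = False
                    segment = []
                    quiet_run = 0

    # Stream ended mid-utterance: emit what we have, minus trailing quiet.
    if recording and segment:
        yield segment[: len(segment) - quiet_run]


if __name__ == "__main__":
    silence = [0.0] * 160
    voice = [0.5] * 160
    stream = [silence] * 5 + [voice] * 4 + [silence] * 12 + [voice] * 5
    for i, utterance in enumerate(segment_speech(stream)):
        # In a real pipeline each utterance would be handed to the transcriber here.
        print(f"utterance {i}: {len(utterance)} frames")
```

The `start_frames` check keeps short noises from triggering a recording, and `hangover_frames` stops a brief pause between words from cutting an utterance in two. Both trade latency for robustness: smaller values react faster but misfire more.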
Why I made it
A lot of projects already exist for:
- Desktop Whisper setups
- Android speech-to-text apps
- Voice assistants
- Server-side STT pipelines
But I could find almost no projects that do clean real-time speech activity detection + local transcription purely inside Termux.
The closest things I found were assistant experiments or server-client systems, not standalone local real-time STT running directly in the terminal environment.
Questions
- Has anyone already built something similar?
- Any older/open-source projects I missed?
- Suggestions for reducing latency?
- Better approaches for speech activity detection in noisy environments?
Repo
👉 https://github.com/opsonusdh/Termux-STT
Modern phones contain absurd amounts of computing power and we collectively decided: “terminal emulator + real-time AI transcription on Android.”
Strangely beautiful.