Has anyone made this before? Real-time speech detection and transcription in Termux

I built a project called Termux-STT that does real-time speech detection + transcription directly inside Termux on Android.

What it does

Continuously listens through microphone
Detects when speech starts/stops
Records only active speech
Transcribes locally using Whisper
Runs fully inside Termux without needing a full Android app
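
To make the "detects when speech starts/stops" step concrete, here is a rough sketch of one common approach: energy-based voice activity detection with a short hangover. This is not taken from the Termux-STT code; the function names, the `threshold` of 500 and the `hangover` of 3 frames are all hypothetical values for illustration, and the frames are assumed to be lists of 16-bit PCM samples.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech_segments(
    frames: Iterable[Sequence[int]],
    threshold: float = 500.0,  # hypothetical loudness cutoff
    hangover: int = 3,         # quiet frames tolerated before closing a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of speech, with end exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None
    quiet = 0
    index = -1
    for index, frame in enumerate(frames):
        loud = rms(frame) >= threshold
        if start is None:
            if loud:
                start, quiet = index, 0  # speech begins
        elif loud:
            quiet = 0  # still talking, reset the silence counter
        else:
            quiet += 1
            if quiet >= hangover:
                # close the segment at the first of the trailing quiet frames
                segments.append((start, index - quiet + 1))
                start = None
    if start is not None:
        # stream ended mid-speech; drop any trailing quiet frames
        segments.append((start, index + 1 - quiet))
    return segments
```

Only the frames inside the returned ranges would need to be recorded and handed to Whisper. A fixed threshold like this tends to struggle with background noise, which is part of why I'm asking about better approaches.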

Current focus is low-latency real-time usage rather than batch transcription.

Why I made it

A lot of projects already exist for:

Desktop Whisper setups
Android speech-to-text apps
Voice assistants
Server-side STT pipelines

But I could barely find projects doing clean real-time speech activity detection + local transcription purely inside Termux.

Closest things I found were assistant experiments or server-client systems, not standalone local real-time STT running directly in the terminal environment.

Questions

Has anyone already built something similar?
Any older/open-source projects I missed?
Suggestions for reducing latency?
Better approaches for speech activity detection in noisy environments?

Repo

👉 https://github.com/opsonusdh/Termux-STT

Modern phones contain absurd amounts of computing power and we collectively decided: “terminal emulator + real-time AI transcription on Android.”

Strangely beautiful.

u/Flashy-Abalone-9212 — 6 days ago

▲ 9 r/termux

I built a sandboxed autonomous AI agent for Termux (now I need your help)

Hey everyone,

I've been working on a project called Termux-AI.

It's basically a sandboxed autonomous AI agent designed specifically for Termux.

The AI can:

execute shell commands
inspect files
reason step-by-step
create projects
interact with the internet
maintain memory/logs
operate inside a restricted filesystem sandbox

The idea was to create something practical and controllable instead of pretending to build AGI in a weekend and accidentally inventing malware with motivational speeches.

Features

Autonomous shell execution
Multi-step reasoning loop
Gemini SDK integration
Sandboxed filesystem
Persistent memory system
Command logging
Tool calling support
Protected core directory
Internet access with controlled execution

Sandbox Model

The AI:

can read files globally
can write inside ~/ai_root
cannot modify ~/ai_root/core without permission
cannot modify system files automatically

So the agent stays useful without immediately becoming a digital raccoon inside your filesystem.
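
The write rules above can be sketched as a small path check. This is an illustration under my own assumptions, not the code in the repo: `may_write`, `core_permission` and the helper name are hypothetical, and the check relies on resolving symlinks and `..` before comparing paths.

```python
import os

# Sandbox root and protected core directory, as described in the rules above.
AI_ROOT = os.path.realpath(os.path.expanduser("~/ai_root"))
CORE_DIR = os.path.join(AI_ROOT, "core")


def _is_inside(path: str, directory: str) -> bool:
    """True if the resolved path is the directory itself or lies beneath it."""
    return os.path.commonpath([path, directory]) == directory


def may_write(path: str, core_permission: bool = False) -> bool:
    """Decide whether the agent may write to `path`."""
    # realpath collapses ".." and follows symlinks, so escapes are caught
    resolved = os.path.realpath(os.path.expanduser(path))
    if not _is_inside(resolved, AI_ROOT):
        return False  # outside the sandbox: read-only
    if _is_inside(resolved, CORE_DIR) and not core_permission:
        return False  # core is protected unless the user allows it
    return True
```

Comparing with `os.path.commonpath` rather than a string prefix avoids treating a sibling such as `~/ai_root_evil` as part of the sandbox.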

Example Flow

User:

create a cli game in python

AI:

mkdir ~/ai_root/game
printf "print('hello')" > ~/ai_root/game/main.py

The command output is fed back into the reasoning loop automatically.
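
A minimal sketch of that feedback loop, assuming the model is wrapped in a callable that sees the history so far and returns the next shell command (or `None` when it is done). `agent_loop`, `next_command` and `max_steps` are hypothetical names; in the real project the callable would be the Gemini call, which is left out here.

```python
import subprocess
from typing import Callable, List, Optional


def run_command(command: str, timeout: float = 30.0) -> str:
    """Run one shell command and return its exit code plus captured output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return f"exit={result.returncode}\n{result.stdout}{result.stderr}"


def agent_loop(
    task: str,
    next_command: Callable[[List[str]], Optional[str]],  # stand-in for the model
    max_steps: int = 5,
) -> List[str]:
    """Alternate between asking for a command and feeding its output back."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        command = next_command(history)
        if command is None:  # the model signals it has finished
            break
        history.append(f"$ {command}")
        history.append(run_command(command))  # output becomes new context
    return history
```

The `max_steps` cap is there so a confused model cannot loop forever; a real version would also run each command through the sandbox checks before executing it.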

Tech Stack

Python
Gemini API
Termux
Shell tools (grep, find, sed, etc.)

Repo

https://github.com/opsonusdh/Termux-AI

Contribution Ideas

Would love help with:

sandbox improvements
better shell parsing
memory optimization
safer execution
internet tools
UI integration
context trimming
performance improvements

Final Note

I came up with the core idea and built most of the project before I knew about OpenClaw (it hadn't been released yet).
I'm uploading this now because I genuinely need help improving it.
The long-term goal is to integrate this AI into another project of mine called Termux-TUI.

So if the project interests you, contributions, ideas, criticism, or testing would genuinely help a lot.

I hope you'll be generous enough to help.

u/Flashy-Abalone-9212 — 10 days ago

▲ 23 r/termux

Made a modern TUI dashboard for Termux. Need ideas and contributors

A while ago, I made a post about a project I was building for Termux. Since then, the project has evolved quite a bit, and I’d really appreciate feedback, ideas, and contributions from the community.

The project is called "Termux-TUI" (https://github.com/opsonusdh/Termux-TUI).

It’s a futuristic Jarvis-style terminal dashboard for Termux built entirely in Python. The goal is to make the Termux experience more interactive and visually modern while still staying lightweight and fully terminal-based. It includes features like:

Live system stats
One-tap utility actions
File browsing
Termux API integrations
A clean TUI-focused workflow

The project is still actively being developed, and I’m looking for:

Feature ideas
UI/UX improvements
Performance suggestions
Contributors interested in Python/TUI development
General feedback from Termux users

If you work with Termux, terminal tools, Python TUIs, or just enjoy building weirdly futuristic command-line interfaces because apparently humans decided terminals should cosplay as sci-fi movies, I’d love your input.

GitHub Repository: https://github.com/opsonusdh/Termux-TUI

u/Flashy-Abalone-9212 — 11 days ago

▲ 1 r/learnprogramming

I’ve been working on a small Android app using Kivy that basically acts as a YouTube MP3 player/downloader.

Repo: https://github.com/opsonusdh/Ytmp3

Current setup:

Kivy app running on Android
Using yt-dlp to extract stream URLs
FFmpeg is bundled and working
Audio playback via Kivy SoundLoader

The problem:

I can extract URLs, but they’re direct Googlevideo links (expiring, sometimes video instead of audio), and Kivy often fails with:

SoundLoader could not open stream

Logs:

[INFO   ] [Logger      ] Record log in /storage/emulated/0/.kivy/logs/kivy_26-04-28_3.txt

[INFO ] [Kivy ] v2.3.1
[INFO ] [Kivy ] Installed at "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/kivy/__init__.py"
[INFO ] [Python ] v3.13.2 (main, Mar 31 2025, 08:14:59) [GCC 11.4.0]
[INFO ] [Python ] Interpreter at "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/bin/python3"
[INFO ] [Logger ] Purge log fired. Processing...
[INFO ] [Logger ] Purge finished!
[INFO ] [Factory ] 195 symbols loaded
[INFO ] [Image ] Providers: img_tex, img_dds, img_sdl2, img_pil (img_ffpyplayer ignored)
[INFO ] [Audio ] Providers: audio_sdl2 (audio_android, audio_ffpyplayer ignored)
[INFO ] [Window ] Provider: sdl2
[INFO ] [GL ] Using the "OpenGL ES 2" graphics system
[INFO ] [GL ] Backend used <sdl2>
[INFO ] [GL ] OpenGL version <b'OpenGL ES 3.2 v1.r38p1'>
[INFO ] [GL ] OpenGL vendor <b'ARM'>
[INFO ] [GL ] OpenGL renderer <b'Mali-G57 MC2'>
[INFO ] [GL ] OpenGL parsed version: 3, 2
[INFO ] [GL ] Texture max size <16383>
[INFO ] [GL ] Texture max units <16>
[INFO ] [Window ] auto add sdl2 input provider
[INFO ] [Window ] virtual keyboard allowed, single mode, docked
[INFO ] [Text ] Provider: sdl2
[INFO ] [GL ] NPOT texture support is available
[INFO ] [Loader ] using a thread pool of 2 workers
[WARNING] [Base ] Unknown <android> provider
[INFO ] [Base ] Start application main loop
[yt-dlp] [youtube] YQHsXMglC9A: ios client https formats require a GVS PO Token which was not provided. They will be skipped as they may yield HTTP Error 403. You can manually pass a GVS PO Token for this client with --extractor-args "youtube:po_token=ios.gvs+XXX". For more information, refer to https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide
[yt-dlp] [youtube] YQHsXMglC9A: ios client hls formats require a GVS PO Token which was not provided. They will be skipped as they may yield HTTP Error 403. You can manually pass a GVS PO Token for this client with --extractor-args "youtube:po_token=ios.gvs+XXX". For more information, refer to https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide
[yt-dlp] Only images are available for download. use --list-formats to see them
[yt-dlp ERROR] ERROR: [youtube] YQHsXMglC9A: Requested format is not available. Use --list-formats for a list of available formats
[Player] stream extraction error: ERROR: [youtube] YQHsXMglC9A: Requested format is not available. Use --list-formats for a list of available formats
Traceback (most recent call last):
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1698, in wrapper
    return func(self, *args, **kwargs)
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1854, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1913, in process_ie_result
    ie_result = self.process_video_result(ie_result, download=download)
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 3058, in process_video_result
    raise ExtractorError(
        'Requested format is not available. Use --list-formats for a list of available formats',
        expected=True, video_id=info_dict['id'], ie=info_dict['extractor'])
yt_dlp.utils.ExtractorError: [youtube] YQHsXMglC9A: Requested format is not available. Use --list-formats for a list of available formats

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/user/0/ru.iiec.pydroid3/files/temp_iiec_codefile.py", line 342, in _get_stream_url
    info = ydl.extract_info(url, download=False)
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1687, in extract_info
    return self.__extract_info(url, self.get_info_extractor(key), download, extra_info, process)
           ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1716, in wrapper
    self.report_error(str(e), e.format_traceback())
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1154, in report_error
    self.trouble(f'{self._format_err("ERROR:", self.Styles.ERROR)} {message}', *args, **kwargs)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1093, in trouble
    raise DownloadError(message, exc_info)
yt_dlp.utils.DownloadError: ERROR: [youtube] YQHsXMglC9A: Requested format is not available. Use --list-formats for a list of available formats
[Player Error] Could not extract audio: Adele - Hello (Official Music Video)
[yt-dlp] [youtube] fazMSCZg-mw: ios client https formats require a GVS PO Token which was not provided. They will be skipped as they may yield HTTP Error 403. You can manually pass a GVS PO Token for this client with --extractor-args "youtube:po_token=ios.gvs+XXX". For more information, refer to https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide
[yt-dlp] [youtube] fazMSCZg-mw: ios client hls formats require a GVS PO Token which was not provided. They will be skipped as they may yield HTTP Error 403. You can manually pass a GVS PO Token for this client with --extractor-args "youtube:po_token=ios.gvs+XXX". For more information, refer to https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide
[yt-dlp] Only images are available for download. use --list-formats to see them
[yt-dlp ERROR] ERROR: [youtube] fazMSCZg-mw: Requested format is not available. Use --list-formats for a list of available formats
[Player] stream extraction error: ERROR: [youtube] fazMSCZg-mw: Requested format is not available. Use --list-formats for a list of available formats
Traceback (most recent call last):
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1698, in wrapper
    return func(self, *args, **kwargs)
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1854, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 1913, in process_ie_result
    ie_result = self.process_video_result(ie_result, download=download)
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.13/site-packages/yt_dlp/YoutubeDL.py", line 3058, in process_video_result
    raise ExtractorError(
        'Requested format is not available. Use --list-formats for a list of available formats',
        expected=True, video_id=info_dict['id'], ie=info_dict['extractor'])
yt_dlp.utils.ExtractorError: [youtube] fazMSCZg-mw: Requested format is not available. Use --list-formats for a list of available formats

During handling of the above exception, another exception occurred:

I understand now that:

yt-dlp gives temporary URLs
streaming directly is unreliable
audio-only extraction is inconsistent in my current setup
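
Given those three points, the pattern I'm leaning toward is "download first, then play the local file". Below is a rough sketch of it, not my current code: `pick_audio_format` and `download_audio` are hypothetical helper names, the option values are assumptions, and it does not address the PO Token messages in the log. The filter skips entries where both codecs are "none", which I assume is what the "Only images are available" case looks like in the format list.

```python
from typing import Any, Dict, List, Optional


def pick_audio_format(formats: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the audio-only format with the highest bitrate, or None."""
    audio_only = [
        f for f in formats
        # audio-only entries report vcodec "none" but a real acodec
        if f.get("vcodec") == "none" and f.get("acodec") not in (None, "none")
    ]
    if not audio_only:
        return None
    # "abr" can be missing or None, so treat that as 0
    return max(audio_only, key=lambda f: f.get("abr") or 0)


def download_audio(url: str, target_dir: str) -> str:
    """Download the best audio stream into target_dir and return the file path."""
    import yt_dlp  # imported lazily so the helper above works without it

    options = {
        "format": "bestaudio/best",  # fall back to a muxed format if needed
        "outtmpl": f"{target_dir}/%(id)s.%(ext)s",
        "noplaylist": True,
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)
```

The returned path could then be passed to `SoundLoader.load`, so playback never depends on an expiring URL. `pick_audio_format` can be run on `info["formats"]` from `extract_info(url, download=False)` to check what is actually available before downloading.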

What I need help with:

Best way to reliably get audio-only streams using yt-dlp in Python (Android context)
Whether I should stream or always download first
Any clean pattern for integrating yt-dlp + Kivy without these failures
If there’s a better playback approach than SoundLoader for this use case.

I’ve already tried:

Filtering formats manually
Using bestaudio formats
Different extractor_args (android/web clients)

Still getting unstable results. Any advice, patterns, or examples would help a lot.

Note: my only source of knowledge has been AI tools, so I can't be sure I've understood everything correctly.

u/Flashy-Abalone-9212 — 25 days ago

▲ 62 r/termux

guys, navigating in termux was driving me insane. i kept forgetting commands, had to install XFCE just to get any kind of interface, and half of my storage went up in smoke.

github: https://github.com/opsonusdh

so i built something about it.

it's called **TermuxDash** — a fully interactive terminal dashboard that runs *inside* termux itself. no X11, no XFCE, no root, no display server. just Python and a single script.

here's what it does:

**🏠 Home tab**

live clock, battery %, memory usage — all auto-updating
pulls live ASCII weather from wttr.in
reads your bash history and shows your most-used tools as clickable buttons
built-in command input so you never have to leave the dashboard
battery alert — border flashes red when you're below 20%
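
for the curious, the "most-used tools" part can be done with a simple word count over the history file. this is a simplified sketch rather than the exact code in the script; `most_used_commands` and `top_n` are names i made up for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_used_commands(history_lines: Iterable[str], top_n: int = 5) -> List[Tuple[str, int]]:
    """Count the first word of each history line and return the top entries."""
    counts: Counter = Counter()
    for line in history_lines:
        words = line.split()
        if words:  # skip blank lines
            counts[words[0]] += 1
    return counts.most_common(top_n)


# Typical use (the path is the usual bash default, adjust if yours differs):
# with open(os.path.expanduser("~/.bash_history"), errors="replace") as fh:
#     buttons = most_used_commands(fh)
```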

**📦 Packages tab**

20+ pre-configured tools
one tap installs even the annoying multi-step ones like APKTool where you normally have to wget the jar, chmod it, symlink it manually
live install log streams every step

**⚙️ System tab**

shortcuts for all termux API commands — battery, wifi info, location, telephony, camera, sensors, public IP, running processes
JSON responses auto-parsed into readable key → value pairs instead of raw blobs
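
the key → value view is basically a recursive flatten over the parsed JSON. here's a minimal sketch of the idea, not the exact code from the script; `flatten` and `format_json` are illustrative names, and the dotted/indexed key style is just one way to show nesting.

```python
import json
from typing import Any, Iterator, List, Tuple


def flatten(value: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (key_path, leaf_value) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        for key, item in value.items():
            yield from flatten(item, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for i, item in enumerate(value):
            yield from flatten(item, f"{prefix}[{i}]")
    else:
        yield prefix, value


def format_json(raw: str) -> List[str]:
    """Turn a raw JSON string into readable 'key → value' lines."""
    return [f"{key} → {leaf}" for key, leaf in flatten(json.loads(raw))]
```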

**📁 Files tab**

clickable file browser. tap a folder to open it, tap a file to read it
file type icons, sizes, safe handling for large files
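
"safe handling for large files" here just means never reading a whole file into memory for the preview. roughly like this sketch (the `read_preview` name and the 64 KiB cap are assumptions for the example, not necessarily what the script uses):

```python
from typing import Tuple


def read_preview(path: str, max_bytes: int = 64 * 1024) -> Tuple[str, bool]:
    """Read at most max_bytes of a file and report whether it was cut off."""
    with open(path, "rb") as handle:
        # read one extra byte so we can tell if more data follows
        chunk = handle.read(max_bytes + 1)
    truncated = len(chunk) > max_bytes
    # errors="replace" keeps binary files from crashing the viewer
    text = chunk[:max_bytes].decode("utf-8", errors="replace")
    return text, truncated
```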

the whole thing has a Jarvis-style aesthetic — cyan and matrix green on near-black, double borders, boots with an ASCII splash screen.

would love feedback. what would you add to something like this?

u/Flashy-Abalone-9212 — 27 days ago