u/JudgePhobos

▲ 3 r/foss

I built a fully offline, FOSS Push-to-Talk dictation tool for Windows because the good ones are Mac-exclusive (C#)

https://reddit.com/link/1tgta4x/video/rabket39ix1h1/player

Hey everyone,

I’ve been watching a lot of workflow videos lately, and there’s a recurring theme: typing is becoming the main bottleneck when interacting with AI or doing heavy code reviews. I wanted a local speech-to-text tool to fix this, but the landscape on Windows is frustrating:

  • Built-in tools are clunky.
  • The best local FOSS tools are macOS-exclusive.
  • The rest are heavy Electron/Python apps that eat up too much RAM just to run in the background.

So, I built Echo - a completely free, open-source C# utility.

What it does: It's a minimal Push-to-Talk app. You hold a global hotkey, speak, and Echo transcribes your audio using a local Whisper model. If enabled in the settings, it instantly pastes the text into your active window (IDE, browser, Teams, Slack).

Why I think it fits here:

  • 100% Private & Offline: Zero telemetry, no cloud APIs. Audio never leaves your machine.
  • Zero UI Bloat: It runs quietly in the console.
  • VAD Built-in: Uses Voice Activity Detection to drop empty audio, which avoids the hallucinated phrases Whisper tends to produce on silence.
  • Hardware Choice: Supports CUDA/Vulkan for fast inference, with a CPU fallback.
  • Translation: You can speak in your native language, and Whisper will translate it and type the result in English.
  • Highly Customizable: Tweak hotkeys, volume, VAD sensitivity, and Whisper settings via config. You also choose which GGML model size fits your hardware (check the comparison table on the repo!).
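
For anyone wondering what a VAD gate looks like in principle, here is a minimal sketch of a simple energy-based one. It is written in Python for brevity and is not Echo's actual C# code; the function names, the 480-sample frame (30 ms at an assumed 16 kHz sample rate), and the threshold values are all illustrative assumptions, and a real VAD may well use a trained model instead of raw energy.

```python
import math
from typing import Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def contains_speech(
    samples: Sequence[float],
    frame_size: int = 480,       # assumption: 30 ms at 16 kHz
    threshold: float = 0.02,     # assumption: tune per microphone
    min_voiced_frames: int = 5,  # assumption: ~150 ms of sound in total
) -> bool:
    """Return True once enough frames exceed the energy threshold."""
    voiced = 0
    # Walk over full frames only; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        if rms(samples[start:start + frame_size]) >= threshold:
            voiced += 1
            if voiced >= min_voiced_frames:
                return True
    return False
```

The idea is that a recording which fails this check is never handed to Whisper at all, so there is nothing for the model to hallucinate on.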

System Requirements:

  • OS: Windows 10 or 11.
  • Hardware: Highly scalable. You can comfortably run smaller models (base or small) strictly on your CPU. However, for near-instant inference with larger models, a dedicated GPU is recommended (CUDA and Vulkan are fully supported).
  • Memory (RAM/VRAM): Depends entirely on the model you choose — ranging from ~500 MB for the tiny model up to ~6 GB+ for large.
  • Storage: The app itself is incredibly lightweight, but expect to use between 150 MB and 3 GB of disk space for the downloaded Whisper .bin models.

Links:

  • GitHub Repo: https://github.com/GithubPhobos/Echo

Feel free to grab the code, tear it apart, or use it for your daily workflow!

reddit.com
u/JudgePhobos — 4 days ago

I built an open-source Windows dictation app that uses local Whisper models to type directly into any active window.

Short demonstration

Hey everyone!

I initially started looking for a voice-to-text tool to help me write long LLM prompts faster. But now, I actually use it mostly for daily communication (chats, messengers) and leaving detailed code review comments. Typing all of that manually was becoming a real bottleneck in my workflow.

The built-in Windows dictation (Win+H) often struggles with technical jargon, and the best local alternatives seem to be macOS-exclusive. So, I decided to build a simple Windows alternative myself. It's called Echo.

It’s a completely offline utility written in C#. It has zero UI bloat - it just runs quietly in the console.

How it works: You hold down a hotkey of your choice, speak your thoughts, and release. Echo records your voice, runs it through a local Whisper model right on your machine, and types the transcribed text into whatever window is currently active (browser, IDE, chat, etc.), provided you have enabled that option in appsettings.json.
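
The hold, speak, release flow can be sketched as a tiny state machine. This is a rough Python illustration under stated assumptions, not Echo's C# implementation: `PushToTalk`, `transcribe`, `paste`, and `auto_paste` are hypothetical names, and the real hotkey hook, audio capture, and Whisper call are replaced by plain callbacks.

```python
from typing import Callable, List, Optional, Sequence


class PushToTalk:
    """Buffers audio while the hotkey is held and transcribes on release."""

    def __init__(
        self,
        transcribe: Callable[[List[float]], str],  # stand-in for the Whisper call
        paste: Callable[[str], None],              # stand-in for typing into the active window
        auto_paste: bool = True,                   # stand-in for the config option
    ) -> None:
        self._transcribe = transcribe
        self._paste = paste
        self._auto_paste = auto_paste
        self._buffer: Optional[List[float]] = None  # None means "not recording"

    def on_key_down(self) -> None:
        # Holding a key produces repeated key-down events; only the first starts a recording.
        if self._buffer is None:
            self._buffer = []

    def on_audio(self, chunk: Sequence[float]) -> None:
        # Audio arriving while the key is up is discarded.
        if self._buffer is not None:
            self._buffer.extend(chunk)

    def on_key_up(self) -> Optional[str]:
        if self._buffer is None:
            return None
        samples, self._buffer = self._buffer, None
        if not samples:
            return None
        text = self._transcribe(samples).strip()
        if text and self._auto_paste:
            self._paste(text)
        return text or None
```

Keeping the hotkey, audio, and paste pieces behind callbacks like this also makes the flow easy to test without a microphone.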

Key Features:

  • 100% Offline & Private: Your data never leaves your machine.
  • VAD: I built in a Voice Activity Detection (VAD) filter. It drops empty audio streams, so the AI doesn't type random words (like "Thank you") when there's just background noise.
  • Hardware Acceleration: Supports CUDA and Vulkan to get fast inference speeds depending on your GPU.
  • Customizable Models: You can drop in your preferred compatible Whisper GGML models (Base, Small, Medium, etc.) depending on your hardware limits.

GitHub Repo: https://github.com/GithubPhobos/Echo

I originally built this just for my own daily workflow, but I polished it up to share with the community. I'd love for you to try it out and let me know what you think!

u/JudgePhobos — 6 days ago
▲ 23 r/csharp

I built a completely offline Push-to-Talk dictation tool using local Whisper models (C# / .NET)

Hey everyone!

I was looking for a native, fast dictation tool for Windows to speed up writing complex LLM prompts and code review comments. Most existing solutions were either cloud-based, macOS-only (like MacWhisper), or bloated Python/Electron apps. I wanted something lightweight that just sits in the background, so I built it myself in C#.

It’s called Echo, and it’s a fully open-source console application.

You just launch it, minimize it, and whenever you hold a designated hotkey, it records your voice, runs it through a local Whisper .bin model, and types the text directly into whatever window is currently active.

The Tech Stack & Implementation Details:

  • Audio Capture & VAD: I implemented a Voice Activity Detection (VAD) pre-filter to drop empty audio streams. This prevents Whisper from hallucinating those weird phrases (like "Thank you for watching") when there's only background noise.
  • Global Keyboard Hooks: It uses low-level keyboard hooks to handle the Push-to-Talk functionality seamlessly across the entire OS, without stealing focus.
  • Hardware Acceleration: Under the hood, it supports CUDA for NVIDIA GPUs (inference takes roughly 400 ms) and Vulkan for AMD/Intel.
  • Zero UI Bloat: It runs entirely in the console (I tried to make the console output as readable as possible). Configuration (models, hotkeys, hardware backends) is handled via a simple appsettings.json.
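
To illustrate the general idea of choosing a hardware backend with a CPU fallback, here is a small Python sketch. It is not how Echo does it internally: `pick_backend`, the backend names, and the probe callbacks are hypothetical, and a real probe would try to initialize the corresponding runtime.

```python
from typing import Callable, Dict, Sequence


def pick_backend(
    preferred: Sequence[str],
    probes: Dict[str, Callable[[], bool]],
) -> str:
    """Return the first preferred backend whose probe succeeds, else "cpu"."""
    for name in preferred:
        probe = probes.get(name)
        if probe is None:
            continue  # no way to test this backend, so skip it
        try:
            if probe():
                return name
        except Exception:
            # A failing probe (missing driver, no device) just means "not available".
            continue
    return "cpu"
```

With a preference order such as `["cuda", "vulkan"]`, a machine without an NVIDIA GPU would fall through to Vulkan, and a machine with neither would end up on the CPU.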

It has been surprisingly fun figuring out the optimal way to manage audio streams and inference in .NET without memory leaks.

GitHub Repo: https://github.com/GithubPhobos/Echo

Feel free to check out the code! I’d love to hear any feedback on the architecture, answer questions about integrating Whisper in C#, or review PRs if anyone wants to contribute (system tray support is definitely on the wishlist).

u/JudgePhobos — 7 days ago