An index of Linux-friendly voice technology tools
Tag
10 posts
A curated index of 100+ voice technology tools accessible to Linux desktop users, from real-time dictation to dev frameworks.
A curated resource list of multimodal AI models with native audio support: models that process audio tokens directly rather than only transcribing speech.
Comparing 8 STT models on a 27-minute podcast. Local Whisper wins on word accuracy, but cloud APIs dominate punctuation.
A short curated list of the best Whisper fine-tuning resources: tutorials, notebooks, and managed compute examples.
Evaluating whether fine-tuning Whisper improves transcription accuracy. Spoiler: it depends on model size and use case.
A script for fine-tuning OpenAI's Whisper speech recognition models using Modal's serverless GPU infrastructure.
A voice-controlled Linux virtual keyboard using Deepgram's Flux turn-taking STT API, built in Rust.
A GUI tool for collecting audio training data for ASR fine-tuning, with LLM-generated prompts and Hugging Face integration.
A desktop transcription app that sends audio directly to multimodal AI models for single-pass transcription and formatting.
A local voice typing app for Linux/Wayland using NVIDIA's Parakeet model. No cloud, no GPU, built-in punctuation.