Handy is a free, open-source desktop application for reliable, private speech-to-text transcription without cloud services. Built with Tauri using Rust and TypeScript, it runs entirely on-device and transcribes audio without sending data to remote servers. This suits privacy-conscious individuals, developers, and professionals working in sensitive environments where data leakage is a concern. Handy prioritizes accessibility and extensibility, offering a simple interface that pastes transcriptions directly into any active text field while supporting multiple transcription models and hardware configurations.
The app fills a critical gap in the open-source ecosystem by providing an extendable, modular foundation for local speech recognition. Unlike commercial alternatives that lock users into proprietary systems or subscriptions, Handy is designed to be forked, customized, and improved by the community. Its architecture separates audio capture, voice activity detection, and transcription into independent components, enabling developers to swap models or optimize performance for specific hardware without rewriting the entire application.
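The separation of voice activity detection and transcription described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the names `PcmFrame`, `SpeechDetector`, `SpeechTranscriber`, and `runDictationPipeline` are hypothetical and are not Handy's actual API, the detector is a toy energy threshold rather than a real VAD model, and the pipeline is kept synchronous for brevity.

```typescript
// Hypothetical stage interfaces; names are illustrative, not Handy's real API.
type PcmFrame = { samples: Float32Array; sampleRate: number };

interface SpeechDetector {
  // Returns true when the frame is likely to contain speech.
  isSpeech(frame: PcmFrame): boolean;
}

interface SpeechTranscriber {
  // Converts buffered speech frames into text.
  transcribe(frames: PcmFrame[]): string;
}

// Toy detector: a frame counts as speech when its mean absolute
// amplitude exceeds a threshold. A real VAD would use a trained model.
class EnergyDetector implements SpeechDetector {
  threshold: number;
  constructor(threshold: number) {
    this.threshold = threshold;
  }
  isSpeech(frame: PcmFrame): boolean {
    if (frame.samples.length === 0) return false;
    const total = frame.samples.reduce((acc, s) => acc + Math.abs(s), 0);
    return total / frame.samples.length > this.threshold;
  }
}

// Stand-in transcriber so the sketch runs without a model.
class FrameCountTranscriber implements SpeechTranscriber {
  transcribe(frames: PcmFrame[]): string {
    return "[" + frames.length + " speech frame(s)]";
  }
}

// The pipeline depends only on the two interfaces, so either stage
// can be replaced without touching the other.
function runDictationPipeline(
  frames: PcmFrame[],
  detector: SpeechDetector,
  transcriber: SpeechTranscriber,
): string {
  const speech = frames.filter((f) => detector.isSpeech(f));
  return transcriber.transcribe(speech);
}
```

Because `runDictationPipeline` only sees the interfaces, swapping in a different model means writing another `SpeechTranscriber` implementation and passing it in; the capture and detection code stays unchanged.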
What You Get
- Offline Speech-to-Text - Transcribes speech using local Whisper or Parakeet models without uploading audio to the cloud, which keeps recordings on your machine and can simplify compliance with data regulations.
- Cross-Platform Support - Runs natively on Windows, macOS (Intel and Apple Silicon), and Linux with full system integration for microphone access and global hotkeys.
- Multiple Transcription Models - Choose between Whisper (Small/Medium/Turbo/Large) for high accuracy or Parakeet V3 for CPU-optimized performance with automatic language detection.
- Configurable Keyboard Shortcuts - Set custom global hotkeys to start/stop transcription; supports macOS Globe key and Linux signal-based control via SIGUSR2.
- Manual Model Installation - Download and install Whisper .bin files or Parakeet .tar.gz archives manually for use behind firewalls or in air-gapped environments with detailed directory structure requirements.
- Debug Mode & Logging - Access advanced debugging tools via Cmd+Shift+D (macOS) or Ctrl+Shift+D (Windows/Linux), including logs and model diagnostics for troubleshooting.
- Linux Text Input Compatibility - Supports xdotool (X11), wtype, or dotool for reliable text pasting; includes explicit setup instructions for Wayland and input group permissions.
- Extensible Architecture - Modular backend with whisper-rs, transcription-rs, cpal, and vad-rs libraries; designed for developers to fork, extend, or integrate into custom workflows.
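As a rough illustration of the Linux text input item above, the sketch below picks a typing tool from what is installed. It assumes xdotool is used on X11, wtype on Wayland, and dotool as a fallback on either; the preference order and the names `SessionKind`, `TypingTool`, and `pickTypingTool` are hypothetical and do not describe Handy's actual selection logic.

```typescript
type SessionKind = "x11" | "wayland";
type TypingTool = "xdotool" | "wtype" | "dotool";

// Preference order per session type. This ordering is an assumption
// made for the sketch, not documented Handy behavior.
const TOOL_PREFERENCE: Record<SessionKind, TypingTool[]> = {
  x11: ["xdotool", "dotool"],
  wayland: ["wtype", "dotool"],
};

// Returns the first preferred tool that is installed, or undefined
// when none of the candidates for this session type is available.
function pickTypingTool(
  session: SessionKind,
  installed: TypingTool[],
): TypingTool | undefined {
  for (const tool of TOOL_PREFERENCE[session]) {
    if (installed.indexOf(tool) !== -1) return tool;
  }
  return undefined;
}
```

For example, `pickTypingTool("wayland", ["xdotool", "dotool"])` returns `"dotool"`, since xdotool is not in the Wayland preference list in this sketch.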
Common Use Cases
- Building a privacy-first accessibility tool - A developer creating an offline transcription solution for users with disabilities who cannot rely on cloud-based services due to data privacy laws or network restrictions.
- Creating a secure dictation workflow for legal/medical professionals - Transcribing patient notes or legal testimony on air-gapped systems where cloud uploads are prohibited, using Parakeet V3 for fast CPU-based transcription.
- Replacing slow or costly cloud STT - Users frustrated with the API costs or network latency of cloud-based speech services switch to Handy for free, local transcription on their own hardware, with no network round trips.
- Deploying STT in restricted enterprise environments - IT or DevOps teams roll out Handy across Windows and Linux workstations with pre-downloaded models, avoiding outbound network calls while keeping transcription available.
Under The Hood
Handy is a cross-platform desktop application built with Tauri that delivers speech-to-text functionality with extensive customization and system integration capabilities. It combines a React frontend with Rust backend components to provide a feature-rich transcription tool tailored for accessibility and user control.
Architecture
Handy follows a layered architecture that separates frontend UI from backend system operations, enabling clean modularity and maintainability.
- Uses Tauri as the core framework for building native desktop applications across platforms
- Implements a component-based UI structure with well-defined sections and settings modules
- Separates business logic from presentation through custom hooks and store-based state management
- Employs a modular settings system that dynamically renders based on user preferences
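The last two points, store-based state and settings sections that render according to preferences, can be sketched as follows. This is a minimal, self-contained stand-in rather than Handy's code: the hand-rolled `createPrefsStore` takes the place of a real state library, and the preference fields and section ids are hypothetical.

```typescript
// Hypothetical preference shape; field names are illustrative only.
type DictationPrefs = { debugMode: boolean; postProcessing: boolean };

type SettingsSection = {
  id: string;
  title: string;
  // Decides whether the section is shown for the given preferences.
  isVisible: (prefs: DictationPrefs) => boolean;
};

const SETTINGS_SECTIONS: SettingsSection[] = [
  { id: "general", title: "General", isVisible: () => true },
  { id: "post-processing", title: "Post Processing", isVisible: (p) => p.postProcessing },
  { id: "debug", title: "Debug", isVisible: (p) => p.debugMode },
];

// Minimal store with get/set/subscribe, standing in for a state library.
function createPrefsStore(initial: DictationPrefs) {
  let current = initial;
  const listeners: Array<(prefs: DictationPrefs) => void> = [];
  return {
    get: () => current,
    set(patch: Partial<DictationPrefs>) {
      // Merge the patch into a new object, then notify subscribers.
      current = { ...current, ...patch };
      listeners.forEach((listener) => listener(current));
    },
    subscribe(listener: (prefs: DictationPrefs) => void) {
      listeners.push(listener);
    },
  };
}

// Presentation code only needs this pure function of the current prefs.
function visibleSectionIds(prefs: DictationPrefs): string[] {
  return SETTINGS_SECTIONS.filter((s) => s.isVisible(prefs)).map((s) => s.id);
}
```

A UI layer would subscribe to the store and re-render with `visibleSectionIds(store.get())`; turning on `debugMode` then adds the `debug` section without any section-specific logic in the view.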
Tech Stack
Handy leverages a modern tech stack that blends web development practices with system-level performance.
- Built with React, TypeScript, and Tailwind CSS for a responsive and accessible frontend
- Utilizes Tauri for cross-platform desktop capabilities with Rust backend for performance-critical tasks
- Integrates i18next for internationalization and Zustand for efficient state handling
- Employs Vite for fast development and builds, alongside Bun for scripting and formatting
Code Quality
The codebase reflects solid organizational practices with consistent patterns and strong type safety.
- Comprehensive TypeScript typing ensures robust type checking and developer clarity
- Modular component structure with clear separation of concerns enhances maintainability
- Extensive error handling through try/catch blocks and result patterns improves reliability
- Well-organized directory structure supports long-term scalability and code navigation
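To make the "result patterns" point concrete, here is one common way to express it in TypeScript with a discriminated union. It is a generic sketch under assumed names (`Outcome`, `attempt`, `parseSettingsJson`) and is not taken from the Handy codebase.

```typescript
// A result type: either a value or an error, distinguished by `ok`.
type Outcome<T, E = string> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Wraps a throwing function so callers receive an Outcome
// instead of having to catch an exception themselves.
function attempt<T>(fn: () => T): Outcome<T> {
  try {
    return { ok: true, value: fn() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// Hypothetical use: parsing a stored settings blob.
function parseSettingsJson(raw: string): Outcome<unknown> {
  return attempt(() => JSON.parse(raw) as unknown);
}
```

Callers branch on `ok`, and the compiler narrows the type so that `value` is only reachable on success and `error` only on failure, which makes a forgotten error path harder to miss than with a bare try/catch.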
What Makes It Unique
Handy stands out through its integration with system-level permissions and its modular customization features.
- Deep integration with macOS accessibility permissions enables enhanced system interaction and functionality
- Offers extensive customization via modular settings sections and runtime keyboard binding options
- Provides multi-model support with download, management, and status tracking capabilities
- Implements a post-processing system that supports external AI service integrations