OpenWhispr is a privacy-first, cross-platform voice-to-text dictation application for developers, journalists, and other professionals who need to transcribe speech quickly and securely. It addresses two problems: typing is slow, and cloud-only transcription services can compromise data privacy. OpenWhispr processes audio fully locally with Whisper and NVIDIA Parakeet, and also supports cloud APIs such as OpenAI, Claude, and Gemini for users who prioritize speed over offline use. Built with Electron, React, and TypeScript, it integrates with native OS features and runs on macOS, Windows, and Linux.
The app leverages whisper.cpp and sherpa-onnx for on-device speech recognition, uses better-sqlite3 for local storage of transcripts and notes, and connects to Neon’s serverless Postgres for cloud sync. Its modular architecture allows users to choose between local AI inference (no internet required) or cloud-based models via API keys, with full transparency through open-source code and no telemetry.
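The choice between local and cloud inference can be pictured as a small settings-to-backend mapping. The sketch below is illustrative only: `TranscriptionSettings`, `Backend`, and `selectBackend` are hypothetical names, not taken from the OpenWhispr codebase, and the real configuration shape may differ.

```typescript
// Hypothetical settings shape; the real OpenWhispr configuration may differ.
type LocalEngine = "whisper.cpp" | "sherpa-onnx";
type CloudProvider = "openai" | "claude" | "gemini";

interface TranscriptionSettings {
  mode: "local" | "cloud";
  localEngine?: LocalEngine;
  cloudProvider?: CloudProvider;
  apiKey?: string;
}

type Backend =
  | { kind: "local"; engine: LocalEngine; needsNetwork: false }
  | { kind: "cloud"; provider: CloudProvider; apiKey: string; needsNetwork: true };

// Resolve user settings into a concrete backend description.
// Local mode never needs a network; cloud mode requires a provider and a key.
function selectBackend(settings: TranscriptionSettings): Backend {
  if (settings.mode === "local") {
    return {
      kind: "local",
      engine: settings.localEngine ?? "whisper.cpp", // assumed default
      needsNetwork: false,
    };
  }
  if (!settings.cloudProvider || !settings.apiKey) {
    throw new Error("Cloud mode requires a provider and an API key");
  }
  return {
    kind: "cloud",
    provider: settings.cloudProvider,
    apiKey: settings.apiKey,
    needsNetwork: true,
  };
}
```

Under these assumptions, `selectBackend({ mode: "local" })` resolves to the whisper.cpp engine with no network requirement, while a cloud configuration without an API key is rejected up front.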
What You Get
- Voice Dictation with Global Hotkey - Press a system-wide hotkey to dictate into any application (Slack, Google Docs, Teams, etc.), with automatic text pasting at dictation speeds of around 150 WPM.
- Local Speech-to-Text with Whisper & Parakeet - Run Whisper Tiny to Turbo and NVIDIA Parakeet models entirely offline using whisper.cpp and sherpa-onnx; no audio leaves your device.
- AI Agent for Voice Commands - Use natural language commands like ‘clean this up’ or ‘draft an email to Mike’ to edit text, summarize, or rewrite content using local or cloud LLMs (GPT-5, Claude, Gemini, Groq).
- Meeting Transcription with Speaker Diarization - Auto-detect Zoom, Teams, and FaceTime calls; perform on-device speaker labeling and voice fingerprinting without cloud dependency.
- AI-Powered Notes with Semantic Search - Create, organize, and search notes using semantic search powered by embeddings; sync across devices via Neon Postgres cloud backend.
- Custom Vocabulary Learning - Auto-learn and store domain-specific terms (e.g., ‘Kubernetes’, ‘PostgreSQL’) from user corrections to improve transcription accuracy over time.
- 100+ Language Support with Real-Time Detection - Transcribe speech in over 100 languages, including switching mid-sentence; auto-detects language without manual configuration.
- Public API & MCP Server Integration - Programmatically access transcriptions and notes via a documented REST API and connect external AI assistants using the MCP server protocol.
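To make the semantic search feature concrete, here is a minimal sketch of ranking notes by cosine similarity between embedding vectors. It is a brute-force, in-memory stand-in: the `Note` shape and the `searchNotes` function are hypothetical, the embeddings are assumed to be precomputed, and the actual app may delegate this work to a vector store instead.

```typescript
// Hypothetical note shape with a precomputed embedding vector.
interface Note {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every note against the query embedding and return the topK best matches.
function searchNotes(query: number[], notes: Note[], topK = 3): Note[] {
  return notes
    .map((note) => ({ note, score: cosineSimilarity(query, note.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((result) => result.note);
}
```

A linear scan like this is fine for a few thousand notes; larger collections usually call for an approximate nearest-neighbor index.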
Common Use Cases
- Running a fast-paced engineering standup - A Scrum Master uses OpenWhispr to transcribe daily standups with speaker diarization, then generates action items and decisions from the transcript using AI cleanup.
- Writing technical documentation on a plane - A developer dictating API specs in a no-internet environment uses local Whisper Turbo to transcribe speech into Markdown notes without uploading audio.
- Transcribing client interviews for legal compliance - A paralegal records and transcribes confidential client calls using local Parakeet so that no audio leaves the device, which helps with GDPR and HIPAA obligations.
- Creating blog content while commuting - A content creator uses voice dictation to draft articles in Google Docs while driving, with custom vocabulary for industry jargon and AI-powered editing to polish drafts.
Under The Hood
Architecture
- The codebase exhibits a clear separation of concerns, with distinct layers for UI, data handling, and backend logic.
- A modular structure is apparent, organizing components and helpers into logical groupings.
- Dependencies are injected through props and context rather than through a formal container.
- State management follows a store pattern, and build and download tasks live in a dedicated directory.
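The store pattern mentioned above can be sketched in a few lines. This is a generic, framework-free illustration, not the project's actual store: `createStore` and the state fields used in the note below are hypothetical.

```typescript
type Listener<T> = (state: T) => void;

// Minimal observable store: holds one state object, merges partial updates,
// and notifies subscribers after every change.
function createStore<T extends object>(initial: T) {
  let state = initial;
  const listeners = new Set<Listener<T>>();

  return {
    getState(): T {
      return state;
    },
    setState(update: Partial<T>): void {
      // Shallow merge produces a new object so consumers can compare by reference.
      state = { ...state, ...update };
      listeners.forEach((listener) => listener(state));
    },
    subscribe(listener: Listener<T>): () => void {
      listeners.add(listener);
      // The returned function removes the listener again.
      return () => {
        listeners.delete(listener);
      };
    },
  };
}
```

For example, a dictation store might be created as `createStore({ isRecording: false, transcript: "" })`, with UI components subscribing to it and the recording logic calling `setState({ isRecording: true })`. Passing the store down through props or context is one way to get dependency injection without a container.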
Tech Stack
- Built on a modern stack including React, TypeScript, and Electron, emphasizing type safety and component-based development.
- Utilizes a robust build pipeline with Vite, ESLint, and Prettier for development and quality assurance.
- Integrates multiple AI/ML components like whisper.cpp, llama-server, sherpa-onnx, and qdrant.
- Leverages UI component libraries like Radix UI and shadcn-ui for accessibility and rapid development.
Code Quality
- Code is well-organized into logical modules with a clear directory structure.
- Error handling uses a custom error class hierarchy alongside a defensive programming style.
- Naming conventions are generally consistent and descriptive, enhancing readability.
- A comprehensive testing strategy is in place, including unit and integration tests focused on edge cases.
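As an illustration of what a custom error class hierarchy looks like in TypeScript, consider the sketch below. The class names (`AppError`, `ModelDownloadError`, `TranscriptionError`) and error codes are invented for this example and are not taken from the OpenWhispr source.

```typescript
// Base class for application errors; carries a machine-readable code.
class AppError extends Error {
  constructor(message: string, readonly code: string) {
    super(message);
    this.name = new.target.name;
    // Keep instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical subclass for failed model downloads.
class ModelDownloadError extends AppError {
  constructor(readonly modelName: string, message: string) {
    super(message, "MODEL_DOWNLOAD_FAILED");
  }
}

// Hypothetical subclass for failures during transcription.
class TranscriptionError extends AppError {
  constructor(message: string) {
    super(message, "TRANSCRIPTION_FAILED");
  }
}

// Defensive handler: accepts unknown values and narrows from most to least specific.
function describeError(err: unknown): string {
  if (err instanceof ModelDownloadError) {
    return `Could not download ${err.modelName}: ${err.message}`;
  }
  if (err instanceof AppError) {
    return `[${err.code}] ${err.message}`;
  }
  return "Unexpected error";
}
```

Typing the handler's parameter as `unknown` forces every call site to narrow before use, which fits the defensive style described above.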
What Makes It Unique
- Integrates local LLMs with a robust download and management infrastructure.
- Focuses on privacy-preserving speech-to-text functionality.
- Demonstrates a sophisticated approach to voice activity detection and echo leak detection.
- Includes a custom TLS trust mechanism for integrating system CAs and addresses Wayland compatibility issues with automatic XWayland fallback.
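For readers unfamiliar with voice activity detection, the sketch below shows the simplest possible variant: an energy threshold over fixed-size frames. This is a deliberately naive stand-in and not the approach OpenWhispr uses; the function names, the 480-sample frame (30 ms at an assumed 16 kHz sample rate), and the threshold value are all assumptions chosen for illustration.

```typescript
// Root-mean-square energy of one audio frame (samples in the range -1..1).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i];
  }
  return Math.sqrt(sum / frame.length);
}

// Split the signal into non-overlapping frames and flag each frame whose
// energy exceeds the threshold. Trailing samples that do not fill a whole
// frame are ignored.
function detectSpeechFrames(
  samples: Float32Array,
  frameSize = 480, // 30 ms at 16 kHz (assumed sample rate)
  threshold = 0.02 // assumed value; real systems tune or adapt this
): boolean[] {
  const flags: boolean[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    const frame = samples.subarray(start, start + frameSize);
    flags.push(rms(frame) > threshold);
  }
  return flags;
}
```

Energy thresholds break down in noisy rooms, which is one reason production systems tend to rely on trained VAD models and additional checks such as echo detection.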