Overview
Osaurus is a native macOS runtime designed for developers and AI enthusiasts who want to run local and cloud-based large language models on Apple Silicon with seamless integration into their workflow. It bridges the gap between on-device inference using MLX and remote APIs like OpenAI, Anthropic, and Ollama, while providing a unified interface for AI agents through the Model Context Protocol (MCP). Built with Swift and optimized for Apple Silicon, Osaurus enables always-on AI capabilities without relying solely on web-based services. It’s ideal for users who prioritize privacy, low-latency responses, and deep macOS integration.
Osaurus isn’t just another LLM frontend—it’s a full-fledged AI infrastructure layer for macOS. It includes a built-in server, tooling ecosystem, persona system, and voice capabilities powered by WhisperKit. Whether you’re building AI agents with Cursor or Claude Desktop, automating research tasks, or experimenting with local models like Llama 3.2 or Qwen, Osaurus provides the runtime environment to make it happen—without leaving your Mac.
What You Get
- Local LLM Inference with MLX - Run models like Llama 3.2, Qwen, Gemma, and Mistral directly on Apple Silicon using the optimized MLX framework, with commands like `osaurus run llama-3.2-3b-instruct-4bit` to download and execute models locally.
- OpenAI & Anthropic API Compatibility - Expose local and remote models via standard `/v1/chat/completions` (OpenAI) and `/messages` (Anthropic) endpoints, allowing existing tools and clients to integrate without modification (see the example after this list).
- MCP Server Integration - Act as a Model Context Protocol server with endpoints like `/mcp/health`, `/mcp/tools`, and `/mcp/call` to enable AI agents (e.g., Cursor, Claude Desktop) to access your Mac’s tools and models.
- Remote MCP Provider Aggregation - Connect to external MCP servers and aggregate their tools into Osaurus, with namespaced tool access (e.g., `provider_toolname`) and secure token storage.
- System Tools & Plugin System - Access built-in tools such as file system operations (`read_file`, `list_directory`), browser automation (`browser_navigate`), git commands (`git_status`), web search, and more; additional tool plugins are installed with commands like `osaurus tools install osaurus.browser`.
- Custom Personas - Create AI assistants with unique system prompts, tool sets, visual themes, and model settings; export and share personas as JSON files via the Management window.
- Skills Import & Management - Import reusable AI capabilities from GitHub repositories or local files (.md, .json, .zip) using the Agent Skills specification; includes pre-installed skills like Research Analyst and Debug Assistant.
- Scheduled AI Tasks - Automate recurring prompts with custom instructions, assign personas to schedules, and view historical results—ideal for daily journaling or weekly reporting.
- Multi-Window Chat - Run multiple independent chat sessions with different personas, themes, and contexts simultaneously, opening new windows with ⌘N or by right-clicking the session history.
- Voice Input with WhisperKit - Use real-time on-device speech-to-text via WhisperKit, enabling voice commands and transcription without internet dependency.
- VAD Mode & Transcription Hotkey - Enable always-on listening with wake-word activation for persona access, or use a global hotkey to transcribe speech directly into any focused text field.
- Model Manager - Download, manage, and switch between local models from Hugging Face using the command line or UI.
- Developer Tools & API Explorer - Inspect running endpoints, test APIs live, and debug model behavior with built-in server insights and request inspection tools.
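Because the chat endpoint follows the OpenAI wire format, any HTTP client can talk to a running Osaurus server. The Swift sketch below posts a chat completion request with plain URLSession; the port (1337) and model name are placeholders rather than guaranteed defaults, so substitute whatever your local instance reports.

```swift
import Foundation

// Minimal sketch: call Osaurus's OpenAI-compatible chat endpoint.
// The port and model name below are placeholders; adjust them to match
// your local configuration.
struct ChatMessage: Codable { let role: String; let content: String }
struct ChatRequest: Codable { let model: String; let messages: [ChatMessage] }

func chat(_ prompt: String) async throws -> String {
    let url = URL(string: "http://127.0.0.1:1337/v1/chat/completions")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        ChatRequest(model: "llama-3.2-3b-instruct-4bit",
                    messages: [ChatMessage(role: "user", content: prompt)]))
    let (data, _) = try await URLSession.shared.data(for: request)
    // Return the raw JSON; a real client would decode the `choices` array.
    return String(decoding: data, as: UTF8.self)
}
```

The same compatibility means existing OpenAI SDKs can usually be pointed at Osaurus simply by overriding their base URL.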
Common Use Cases
- Building a private AI coding assistant - Use Osaurus to run Llama 3.2 locally via MLX, expose tools like git and file system access via MCP, then connect Cursor to it for a fully offline code companion with context-aware suggestions.
- Creating a research workflow with automated summaries - Define a persona that uses web search and file reading tools, schedule it to run weekly on a set of PDFs in your Documents folder, and have it generate annotated summaries with citations.
- Integrating AI into existing macOS workflows - Use the global transcription hotkey to dictate notes directly into Notion, Obsidian, or Xcode without switching apps, powered by WhisperKit with no internet connection required.
- DevOps teams managing local AI pipelines - Deploy Osaurus on Apple Silicon workstations to host private LLMs, serve them via OpenAI-compatible endpoints for CI/CD scripts, and connect MCP clients to automate documentation or code review tasks.
- Developing AI plugins for macOS apps - Use the plugin system to create custom Swift tools that expose app-specific functionality (e.g., reading calendar events or controlling audio) to AI agents via Osaurus’s MCP server; see the illustrative sketch after this list.
- Running Claude or GPT-4o alongside local models in one interface - Configure both Anthropic and OpenAI providers alongside a locally running Qwen model, then toggle between them per chat window for cost, speed, or privacy trade-offs.
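As an illustration of the plugin use case above, the sketch below uses Apple’s EventKit to gather today’s calendar events, the kind of app-specific capability a custom tool could expose to an agent. The function name is hypothetical and the Osaurus plugin registration API is deliberately not shown; this only demonstrates the native macOS side of such a tool.

```swift
import EventKit

// Hypothetical helper a custom tool might wrap: list today's calendar
// events via EventKit (requires calendar permission; macOS 14+ API shown).
func todaysEvents() async throws -> [String] {
    let store = EKEventStore()
    guard try await store.requestFullAccessToEvents() else { return [] }
    let start = Calendar.current.startOfDay(for: Date())
    let end = Calendar.current.date(byAdding: .day, value: 1, to: start)!
    let predicate = store.predicateForEvents(withStart: start, end: end, calendars: nil)
    return store.events(matching: predicate).map { event in
        "\(event.startDate.formatted(date: .omitted, time: .shortened)) \(event.title ?? "Untitled")"
    }
}
```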
Under The Hood
Osaurus is a macOS-native AI assistant platform designed to bridge command-line and graphical user interfaces, offering a unified backend for both interaction modes. It emphasizes extensibility through plugin support and standardized communication protocols.
Architecture
The application adopts a modular, layered structure that separates core logic from UI components and external integrations. This design promotes clear responsibilities and maintainable code.
- The codebase is organized into distinct packages such as OsaurusCLI and OsaurusCore, each managing specific functional domains.
- Core operations are handled by managers, controllers, and services that encapsulate configuration, chat sessions, plugins, and model interactions.
- Dependency injection and service-oriented architecture are applied to ensure loose coupling between components.
- Shared models and command structures enable seamless communication between the CLI and GUI layers, as sketched below.
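To make the shared-backend idea concrete, here is an illustrative Swift sketch (the type names are invented, not Osaurus’s actual ones): both the CLI and the GUI depend on the same service protocol, so either front end can be wired to the same implementation through dependency injection.

```swift
import SwiftUI

// Illustrative only: a service abstraction shared by CLI and GUI layers.
protocol ChatService {
    func send(_ prompt: String, model: String) async throws -> String
}

// The CLI command depends solely on the protocol...
struct ChatCommand {
    let service: ChatService
    func run(prompt: String) async throws {
        print(try await service.send(prompt, model: "llama-3.2-3b-instruct-4bit"))
    }
}

// ...and a SwiftUI view model injects the very same implementation.
@MainActor
final class ChatViewModel: ObservableObject {
    @Published var reply = ""
    private let service: ChatService
    init(service: ChatService) { self.service = service }
    func submit(_ prompt: String) async {
        reply = (try? await service.send(prompt, model: "llama-3.2-3b-instruct-4bit")) ?? ""
    }
}
```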
Tech Stack
The project is built using Swift, leveraging native macOS frameworks and a suite of modern Swift libraries for enhanced functionality.
- The primary language is Swift, with SwiftUI and AppKit used for UI development and native macOS integration.
- It pulls in third-party Swift packages for OpenAI and Hugging Face support, along with Apple’s CryptoKit, to extend capabilities.
- Build automation is handled via a Makefile, and Swift formatting standards are enforced with a `.swift-format` configuration.
- XCTest is used for unit and UI testing, organized into dedicated test targets.
Code Quality
The codebase reflects a moderate level of quality with consistent Swift practices but limited test coverage and some stylistic inconsistencies.
- Error handling is present throughout the codebase, though patterns vary across modules.
- While type annotations are used consistently, naming and style conventions show some inconsistency.
- Technical debt is visible in the form of minimal automated testing and duplicated logic in certain areas.
What Makes It Unique
Osaurus introduces unique approaches to modular AI assistant development, particularly in plugin architecture and protocol integration.
- The platform enables modular design with independent CLI and GUI components that share a common backend.
- Swift Package Manager is leveraged for plugin management, allowing dynamic extensions without recompilation.
- Integration with the Model Context Protocol (MCP) SDK enables standardized agent communication and remote provider support (see the sketch after this list).
- Comprehensive documentation and developer tooling are provided to facilitate plugin creation and configuration.
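As a quick way to see the MCP surface in action, the sketch below probes the `/mcp/health` and `/mcp/tools` endpoints of a local instance. The port is again a placeholder, and the response bodies are printed raw rather than decoded, since their exact shape is not assumed here.

```swift
import Foundation

// Minimal sketch: probe a local Osaurus MCP server. The port (1337) is a
// placeholder; swap in the port your instance is configured to use.
func probeMCP() async throws {
    for path in ["/mcp/health", "/mcp/tools"] {
        let url = URL(string: "http://127.0.0.1:1337\(path)")!
        let (data, response) = try await URLSession.shared.data(from: url)
        let status = (response as? HTTPURLResponse)?.statusCode ?? -1
        print("GET \(path) -> \(status)")
        print(String(decoding: data, as: UTF8.self))
    }
}
```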