Overview: Jan is an open-source desktop application that lets users run large language models (LLMs) entirely offline on their personal computers. Designed as a privacy-focused alternative to cloud-based AI services like ChatGPT, Jan enables users to download and execute models from Hugging Face — including Llama, Gemma, Qwen, and other open-weight models — without sending data to external servers. Built with Tauri and TypeScript, it provides a native desktop experience while leveraging llama.cpp for efficient on-device inference. Jan is ideal for developers, privacy advocates, and organizations that need to keep AI interactions within their local environment due to compliance, security, or latency requirements.
What You Get
- Local AI Models - Download and run LLMs from Hugging Face directly on your machine, including Llama, Gemma, Qwen, and GPT-oss models, with no internet connection required after download.
- OpenAI-Compatible API - Expose a local inference server at http://localhost:1337 that accepts standard OpenAI API requests, enabling integration with existing tools and scripts without a cloud dependency.
- Custom Assistants - Create tailored AI assistants for specific tasks by configuring prompts, memory, and model parameters through a user-friendly interface.
- Multi-Platform Support - Install via native packages for Windows (.exe), macOS (.dmg), Linux (deb, AppImage), or build from source using Make and Yarn.
- Model Context Protocol (MCP) - Connect models to external tools and data sources through MCP, enabling agentic workflows and advanced automation scenarios.
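Because the local server speaks the standard OpenAI wire format, existing clients typically need only a base-URL change. A minimal TypeScript sketch of building such a request follows; the model id and the `/v1/chat/completions` path are assumptions based on OpenAI API conventions, so adjust them to match your local setup:

```typescript
// Sketch of a request against Jan's OpenAI-compatible endpoint.
// The model id "llama3.2-1b" is a placeholder; use whichever model
// you have downloaded in Jan.
const JAN_BASE_URL = "http://localhost:1337/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(model: string, messages: ChatMessage[]) {
  return {
    url: `${JAN_BASE_URL}/chat/completions`,
    init: {
      method: "POST" as const,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Usage (requires Jan's local server to be running):
// const { url, init } = buildChatRequest("llama3.2-1b", [
//   { role: "user", content: "Summarize this contract clause." },
// ]);
// const res = await fetch(url, init);
// const data = await res.json();
```

Since no API key leaves the machine, the same script works in fully offline or air-gapped environments.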
Common Use Cases
- Building a private AI assistant for sensitive data - A legal firm uses Jan to analyze confidential documents with Llama 3 without uploading files to cloud APIs, ensuring GDPR and HIPAA compliance.
- Creating a local coding assistant for air-gapped environments - A defense contractor deploys Jan on an offline workstation to generate code snippets and documentation using a locally hosted Qwen model.
- Avoiding API costs and rate limits - Developers tired of OpenAI API fees, rate limits, or outages install Jan, download a 7B-parameter model, and get unlimited responses on their laptop with no network round-trip latency.
- Team workflow for secure AI prototyping - A startup’s R&D team uses Jan to test multiple LLMs in parallel on a single machine, comparing performance and output quality without exposing prompts to third parties.
Under The Hood
The Jan AI project is a modular, extensible framework designed to support flexible AI chat applications with a focus on platform-agnostic functionality and plugin-driven architecture. It emphasizes type safety, event-driven communication, and a layered structure that enables easy integration and expansion.
Architecture
This project adopts a modular, layered architecture with clear separation of concerns and extensibility at its core.
- The system is organized into distinct modules such as core, extensions, and documentation, each with defined responsibilities.
- Extension-based plugins allow for flexible and scalable functionality without tight coupling.
- Event-driven communication patterns facilitate loose coupling between components and enhance maintainability.
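The event-driven pattern above can be illustrated with a small typed event bus; the event names and payload shapes below are hypothetical examples for illustration, not Jan's actual extension API:

```typescript
// Minimal typed event bus illustrating loose coupling between components.
// Event names and payload types are hypothetical, not Jan's real API surface.
type Events = {
  "model:loaded": { modelId: string };
  "chat:message": { role: string; content: string };
};

type Handler<T> = (payload: T) => void;

class EventBus {
  private handlers: { [K in keyof Events]?: Handler<Events[K]>[] } = {};

  // Subscribe a handler to a typed event.
  on<K extends keyof Events>(event: K, handler: Handler<Events[K]>): void {
    (this.handlers[event] ??= []).push(handler);
  }

  // Notify all subscribers; the emitter knows nothing about them.
  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    this.handlers[event]?.forEach((h) => h(payload));
  }
}

// A hypothetical extension subscribes without touching the emitter's internals:
const bus = new EventBus();
const loaded: string[] = [];
bus.on("model:loaded", ({ modelId }) => loaded.push(modelId));
bus.emit("model:loaded", { modelId: "llama3-8b" });
```

Because subscribers register against event names rather than concrete modules, extensions can be added or removed without changes to the core.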
Tech Stack
The project leverages modern TypeScript-based tooling and frameworks to support a robust and extensible AI application framework.
- Built primarily in TypeScript, with React and Next.js powering the frontend and documentation layers respectively.
- Integrates Rust for performance-critical operations, complemented by libraries like Zustand, Tailwind CSS, and Radix UI for UI development.
- Employs Rolldown for bundling, the TypeScript compiler for type checking, and Vitest for testing, with ESLint and Prettier enforcing code quality.
Code Quality
The project maintains a high level of code organization and testing practices that support long-term maintainability.
- Vitest is used for both unit and UI testing, providing comprehensive coverage and ensuring reliability.
- Strong emphasis on type safety through TypeScript, reducing runtime errors and improving developer experience.
- Code linting and formatting tools are consistently applied across the codebase, promoting consistency.
What Makes It Unique
The project distinguishes itself through its innovative plugin architecture and extensibility model tailored for AI applications.
- Its extension-based core enables platform-agnostic deployment and seamless integration of new features or models.
- A rich type system and event-driven design provide a solid foundation for extensibility and dynamic behavior.
- The combination of React, Next.js, and Rust creates a hybrid ecosystem that balances performance, scalability, and developer experience.