Onyx is an open-source AI platform designed to bring advanced large language model (LLM) capabilities into a unified, self-hostable chat interface. It bridges the gap between generic LLM APIs and production-grade AI applications by offering built-in tools for retrieval-augmented generation (RAG), web search, agent workflows, and secure document management. Built with Python and Next.js, Onyx is engineered for teams that need control over data privacy, scalability, and integration with enterprise systems—without sacrificing ease of use. Whether you’re an individual developer experimenting with AI or a large organization deploying AI at scale, Onyx provides the infrastructure to run advanced AI chat applications securely and efficiently.
Onyx supports all major LLM providers, including OpenAI, Anthropic, and Google Gemini, as well as self-hosted models served via Ollama or vLLM. Its modular architecture lets users plug in custom knowledge sources, deploy on Docker or Kubernetes, and enforce enterprise-grade security policies such as SSO, RBAC, and encrypted credentials, all while maintaining a sleek, intuitive UI. The platform is designed to eliminate the complexity of building AI chat systems from scratch, making it ideal for developers and DevOps teams seeking a production-ready foundation.
What You Get
- 🤖 Custom Agents - Build AI agents with custom instructions, knowledge bases, and actions to perform specialized tasks like customer support automation or internal knowledge answering.
- 🌍 Web Search - Enable real-time web browsing using Google PSE, Exa, Serper, Firecrawl, or an in-house scraper to fetch up-to-date information during conversations.
- 🔍 RAG - Implement hybrid search (vector + keyword) with knowledge graphs to retrieve relevant content from uploaded documents and connected data sources with high accuracy (a minimal retrieval sketch follows this list).
- 🔄 Connectors - Integrate with 40+ external applications (e.g., Notion, Slack, Google Drive, GitHub) to pull documents, metadata, and access permissions directly into the RAG system.
- 🔬 Deep Research - Trigger multi-step, agentic research workflows that autonomously search, summarize, and synthesize information across multiple sources for complex queries.
- ▶️ Actions & MCP - Allow AI agents to execute external actions via the Model Context Protocol (MCP), such as triggering APIs, updating databases, or sending emails.
- 💻 Code Interpreter - Run Python code within the chat interface to analyze datasets, generate visualizations, and create downloadable files without leaving the conversation.
- 🎨 Image Generation - Generate images from text prompts using integrated image models, enabling multimodal AI interactions.
- 👥 Collaboration - Share chats with team members, collect user feedback, manage multiple users, and track usage analytics through a built-in dashboard.
- Enterprise Search - Index and retrieve from tens of millions of documents with custom indexing pipelines optimized for performance and relevance.
- Security - Enforce SSO (OIDC/SAML/OAuth2), role-based access control (RBAC), and encrypted storage of API keys and credentials.
- Management UI - Assign roles (basic, curator, admin) to control access levels and manage knowledge sources, agents, and user permissions through a web interface.
- Document Permissioning - Automatically mirror access controls from external systems (e.g., Google Drive permissions) to ensure RAG results respect user-level security policies.
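The RAG and Document Permissioning items above describe a single retrieval path: candidate chunks are scored by both vector similarity and keyword overlap, then filtered against the querying user's access rights. The sketch below only illustrates that idea; the `Chunk` fields, the `allowed_groups` mirroring, and the blending weight are assumptions, not Onyx's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    doc_id: str
    text: str
    vector_score: float          # cosine similarity from the vector index
    keyword_score: float         # normalized keyword/BM25-style score
    allowed_groups: set[str] = field(default_factory=set)  # mirrored from the source system

def hybrid_permission_aware_search(
    chunks: list[Chunk],
    user_groups: set[str],
    alpha: float = 0.6,          # assumed weight of vector score vs. keyword score
    top_k: int = 5,
) -> list[Chunk]:
    """Blend vector and keyword scores, keeping only chunks the user may see."""
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    ranked = sorted(
        visible,
        key=lambda c: alpha * c.vector_score + (1 - alpha) * c.keyword_score,
        reverse=True,
    )
    return ranked[:top_k]

# Example: an engineering-only chunk is never returned to an HR user.
chunks = [
    Chunk("eng-handbook", "Deploy with blue/green releases...", 0.82, 0.40, {"engineering"}),
    Chunk("hr-policy", "PTO accrues at 1.5 days per month...", 0.78, 0.55, {"hr", "engineering"}),
]
print([c.doc_id for c in hybrid_permission_aware_search(chunks, user_groups={"hr"})])
# -> ['hr-policy']
```

In a real deployment the scoring itself is handled by the document index (e.g., Vespa/OpenSearch) rather than in application code, but the permission filter works the same way: chunks a user cannot see are excluded before the LLM ever reads them.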
Common Use Cases
- Building a secure internal knowledge assistant - A company ingests its HR policies, engineering docs, and product manuals into Onyx using connectors to Notion and Google Drive; employees query the system with natural language and receive answers backed by verified documents, with access restricted by department.
- Creating a customer support chatbot with live web search - An e-commerce team deploys Onyx to handle tier-1 support queries; the AI uses web search to find current product details or outage notices, then generates accurate responses without needing manual updates.
- Consolidating fragmented knowledge sources - Teams using multiple tools (Slack, Confluence, GitHub) struggle with inconsistent RAG responses from disjointed knowledge. Onyx consolidates these sources into a unified RAG system with permission-aware retrieval, ensuring consistent and secure answers.
- DevOps teams managing AI in air-gapped environments - A government or healthcare team deploys Onyx on-premises using Docker Compose to run LLMs locally (via Ollama) while connecting to internal document repositories, ensuring compliance with data sovereignty laws.
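For the air-gapped scenario above, nothing leaves the host: Onyx talks to a local Ollama server instead of a hosted API. A pre-flight check along these lines (using Ollama's default HTTP endpoint on port 11434; the model name is only an example) can confirm that a model is pulled and answering before it is wired into the platform.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"   # default Ollama port; adjust for your deployment
MODEL = "llama3.1"                      # example model name; pull it first with `ollama pull`

def list_local_models() -> list[str]:
    """Ask the local Ollama server which models are already pulled."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        return [m["name"] for m in json.load(resp)["models"]]

def test_generation(prompt: str) -> str:
    """Run a single non-streaming completion to verify the model answers locally."""
    payload = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    print("models:", list_local_models())
    print(test_generation("Reply with the single word: ready"))
```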
Under The Hood
Onyx is a multi-language, backend-first platform designed for building AI-powered applications with a strong emphasis on document processing, LLM integration, and enterprise-grade multi-tenant support. It combines a modular architecture with extensible tooling to enable flexible deployment and robust data handling across diverse use cases.
Architecture
Onyx adopts a monolithic yet modular structure that supports both single-tenant and multi-tenant deployments, emphasizing clear separation of concerns across backend, frontend, and tooling components.
- The system uses a layered architecture with distinct modules for document indexing, LLM orchestration, and configuration management
- Design patterns like factory methods and strategy patterns are applied to support extensible document indices and embedding models (sketched after this list)
- Middleware-style tracing and configuration-driven workflows enhance maintainability and scalability
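A minimal sketch of the factory/strategy combination mentioned above, with invented class names rather than Onyx's real interfaces: callers depend on a single embedding interface, and configuration picks the concrete backend.

```python
from abc import ABC, abstractmethod

class EmbeddingModel(ABC):
    """Strategy interface: every embedding backend exposes the same method."""

    @abstractmethod
    def embed(self, texts: list[str]) -> list[list[float]]: ...

class HostedEmbedding(EmbeddingModel):
    def embed(self, texts: list[str]) -> list[list[float]]:
        # Call a hosted embedding API here (omitted in this sketch).
        raise NotImplementedError

class LocalEmbedding(EmbeddingModel):
    def embed(self, texts: list[str]) -> list[list[float]]:
        # Run a local embedding model here (omitted in this sketch).
        raise NotImplementedError

def get_embedding_model(provider: str) -> EmbeddingModel:
    """Factory method: configuration decides which strategy is instantiated."""
    registry = {"hosted": HostedEmbedding, "local": LocalEmbedding}
    if provider not in registry:
        raise ValueError(f"Unknown embedding provider: {provider!r}")
    return registry[provider]()

# Indexing code depends only on the EmbeddingModel interface:
model = get_embedding_model("local")
```

With this shape, swapping a hosted embedding API for a local model (or one document index for another) becomes a configuration change rather than a code change, which is what the configuration-driven workflows above rely on.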
Tech Stack
The platform is built primarily with Python 3.11+ and leverages a rich ecosystem of tools and frameworks to support its functionality.
- FastAPI powers the backend API, while React and Next.js handle the web frontend and Tauri enables native desktop integration
- Core dependencies include SQLAlchemy for database operations, Vespa/OpenSearch for indexing, and Alembic for migrations (a minimal FastAPI + SQLAlchemy sketch follows this list)
- The project integrates with CI/CD pipelines using Docker and uv, and enforces code quality via pre-commit hooks
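To give a feel for how these pieces compose, here is a deliberately minimal FastAPI route backed by SQLAlchemy 2.0. The table and field names are hypothetical, and SQLite is used only to keep the sketch self-contained; this is not Onyx's schema.

```python
from fastapi import FastAPI, Depends
from sqlalchemy import create_engine, select, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

class Base(DeclarativeBase):
    pass

class ChatSession(Base):
    __tablename__ = "chat_session"           # hypothetical table for illustration
    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str] = mapped_column(String(200))

engine = create_engine("sqlite:///demo.db")  # SQLite only to keep the example runnable
Base.metadata.create_all(engine)

app = FastAPI()

def get_db():
    # Dependency that opens a session per request and closes it afterwards.
    with Session(engine) as session:
        yield session

@app.get("/chat-sessions")
def list_chat_sessions(db: Session = Depends(get_db)) -> list[dict]:
    rows = db.execute(select(ChatSession)).scalars().all()
    return [{"id": r.id, "title": r.title} for r in rows]
```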
Code Quality
Onyx demonstrates a balanced approach to code quality with strong testing practices and structured error handling, although some inconsistencies remain.
- A comprehensive test suite covers unit, integration, and end-to-end scenarios, with extensive mocking for external services (illustrated after this list)
- Error handling is centralized and consistent, with clear validation and logging practices across components
- Code linting and type safety are enforced through TypeScript and Python tooling, though style uniformity varies
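The mocking practice above typically looks like the sketch below. The service and client names are hypothetical; the relevant pattern is swapping the external LLM client for a mock so tests stay offline and deterministic.

```python
from unittest.mock import MagicMock

# --- code under test (hypothetical; a thin wrapper around an external LLM client) ---
class QAService:
    def __init__(self, llm_client):
        self.llm_client = llm_client   # a hosted or local LLM client in production

    def answer(self, question: str) -> str:
        return self.llm_client.complete(f"Answer concisely: {question}")

# --- test: the external client is replaced with a mock, so no network call happens ---
def test_answer_uses_llm_completion():
    fake_llm = MagicMock()
    fake_llm.complete.return_value = "42"

    service = QAService(llm_client=fake_llm)

    assert service.answer("What is 6 * 7?") == "42"
    fake_llm.complete.assert_called_once_with("Answer concisely: What is 6 * 7?")
```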
What Makes It Unique
Onyx distinguishes itself through its enterprise-grade extensibility and integration capabilities, particularly in multi-tenant environments.
- The platform supports custom connectors and permission-sync strategies that decouple content ingestion from access control logic (see the sketch after this list)
- Alembic migrations manage complex, versioned database schemas in multi-tenant setups
- Built-in federated connector support and external permission systems (e.g., GitHub, Google) allow flexible content ingestion and access control
- Unified admin APIs and UI components provide a cohesive interface for managing LLM models, chat sessions, and assistant personalization
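The connector/permission-sync decoupling called out in the first bullet can be pictured as two independent interfaces: one yields documents, the other yields who may see them. The class and method names below are illustrative assumptions, not Onyx's actual connector API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator

@dataclass
class Document:
    id: str
    title: str
    text: str

class Connector(ABC):
    """Pulls content from an external system (Notion, Slack, Google Drive, ...)."""

    @abstractmethod
    def load_documents(self) -> Iterator[Document]: ...

class PermissionSync(ABC):
    """Mirrors who may see each document, independently of how content is fetched."""

    @abstractmethod
    def external_access(self, doc_id: str) -> set[str]: ...

def index_source(connector: Connector, perms: PermissionSync, index) -> None:
    """Ingestion pipeline: content and access control come from separate strategies.

    `index` is any object exposing add(doc, allowed_groups=...).
    """
    for doc in connector.load_documents():
        index.add(doc, allowed_groups=perms.external_access(doc.id))
```

Because the two concerns are separate strategies, a new source can ship with content ingestion first and gain permission sync later without touching the indexing pipeline.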