Open WebUI

The extensible, privacy-first AI platform that runs Ollama, OpenAI, and any LLM backend behind a polished, feature-packed web interface.

126K stars
17.8K forks
Other
Python

Open WebUI is a self-hosted, extensible AI platform that puts a full-featured web interface in front of any LLM backend—Ollama, OpenAI-compatible APIs, or local models—while keeping every conversation and document entirely on your own infrastructure. It goes far beyond a simple chat UI: built-in Retrieval Augmented Generation, voice and video calling, image generation, Python function calling, and a Pipelines plugin framework make it a complete AI deployment solution rather than a thin wrapper.

The backend is a FastAPI Python application served by Uvicorn, with SQLAlchemy for relational data and support for 13+ vector database backends for RAG (ChromaDB, PGVector, Qdrant, Milvus, Elasticsearch, OpenSearch, Pinecone, S3Vector, Oracle 23ai, Weaviate, Valkey, OpenGauss, and MariaDB Vector). The SvelteKit frontend communicates over REST and Socket.IO, enabling real-time streaming responses with fine-grained DOM updates. Redis backs session management and WebSocket coordination for horizontal scaling across multiple nodes.

Open WebUI ships with enterprise-oriented capabilities, including SCIM 2.0 automated provisioning, LDAP/Active Directory integration, SSO via OAuth providers or trusted headers, OpenTelemetry observability, and role-based access control down to the individual model and feature level. A Progressive Web App mode enables mobile installation with offline access on localhost. The project releases multiple times per week and has accumulated roughly 126,000 GitHub stars, making it one of the most widely adopted open-source LLM frontends.

What You Get

  • Ollama & OpenAI-Compatible API Integration - Connect to local Ollama models or any OpenAI-compatible API endpoint (LMStudio, GroqCloud, Mistral, OpenRouter) and switch between them from a single interface.
  • Multi-Vector Database RAG - Perform Retrieval Augmented Generation using your choice of ChromaDB, PGVector, Qdrant, Milvus, Elasticsearch, OpenSearch, Pinecone, S3Vector, Oracle 23ai, Weaviate, Valkey, OpenGauss, or MariaDB Vector with multiple document parsers including Tika, Docling, Mistral OCR, and PaddleOCR.
  • Web Search for RAG (15+ Providers) - Inject live web results from SearXNG, Brave, Google PSE, Kagi, Tavily, Perplexity, Bing, DuckDuckGo, Exa, Jina, and others directly into chat context.
  • Voice & Video Chat with Multiple Providers - Enable hands-free interaction using local Whisper or cloud STT providers (OpenAI, Deepgram, Azure) paired with TTS engines (ElevenLabs, Azure, OpenAI, Transformers, WebAPI).
  • Python Function Calling (BYOF) - Write and deploy custom Python functions directly in the UI workspace to extend LLM capabilities with external APIs, databases, or automation scripts.
  • Image Generation & Editing - Create and edit images using DALL-E, Gemini, ComfyUI (local), and AUTOMATIC1111 (local) with prompt-based editing workflows.
  • Pipelines Plugin Framework - Deploy a separate Pipelines server to add function calling, rate limiting, usage monitoring via Langfuse, real-time translation via LibreTranslate, content filtering, and custom logic without touching the core codebase.
  • Enterprise Authentication Stack - Full LDAP/Active Directory integration, SCIM 2.0 automated user provisioning, SSO via OAuth (Okta, Azure AD, Google Workspace) or trusted headers, and fine-grained RBAC.
  • Persistent Artifact Storage - Built-in key-value storage API for journals, trackers, leaderboards, and collaborative tools with personal and shared data scopes across sessions.
  • Horizontal Scalability with Redis - Redis-backed session management and Socket.IO coordination for multi-worker and multi-node deployments behind load balancers.
  • Production Observability with OpenTelemetry - Built-in traces, metrics, and logs exportable to Prometheus, Grafana, Jaeger, or any OpenTelemetry-compatible backend.
  • Progressive Web App (PWA) - Install Open WebUI as a native-like app on mobile with offline localhost access and a fully responsive UI.
  • Cloud Storage Integration - Native file picker support for Google Drive and OneDrive/SharePoint for seamless document import into knowledge bases.
  • Multi-Model Parallel Conversations - Run the same prompt against multiple models simultaneously and compare responses side by side.
  • Notes with Real-Time Collaboration - Built-in collaborative notes editor backed by CRDT (pycrdt/Y.js) with WebSocket sync across multiple users.
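As a rough illustration of the Python function calling item in the list above, the sketch below shows the general shape such a function might take: a plain class whose typed, documented methods can be described to a model as callable tools. The class name `Tools`, its two methods, and the helper `describe_tools` are assumptions made for illustration, not a confirmed Open WebUI API.

```python
import inspect
from typing import get_type_hints


class Tools:
    """Hypothetical tool container: each public method is one callable tool."""

    def word_count(self, text: str) -> int:
        """Count whitespace-separated words in the given text."""
        return len(text.split())

    def celsius_to_fahrenheit(self, celsius: float) -> float:
        """Convert a temperature from Celsius to Fahrenheit."""
        return celsius * 9 / 5 + 32


def describe_tools(tools: object) -> list[dict]:
    """Build a simple spec (name, description, parameters) for each public method."""
    specs = []
    # inspect.getmembers returns members sorted by name.
    for name, method in inspect.getmembers(tools, predicate=inspect.ismethod):
        if name.startswith("_"):
            continue
        hints = get_type_hints(method)
        params = {k: v.__name__ for k, v in hints.items() if k != "return"}
        specs.append(
            {
                "name": name,
                "description": inspect.getdoc(method) or "",
                "parameters": params,
            }
        )
    return specs
```

The type hints and docstrings do double duty here: they document the function for humans and supply the metadata a model would need to decide when and how to call it.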

Common Use Cases

  • Private AI assistant for regulated industries - A healthcare organization deploys Open WebUI with local Ollama models and RAG over internal clinical documentation, ensuring patient data never leaves their on-premises servers while giving staff an intuitive chat interface.
  • Centralized LLM gateway for engineering teams - A software company runs Open WebUI as a shared internal portal, connecting to multiple model providers under one RBAC-governed interface so teams can experiment with GPT-4o, Claude, and local models without managing separate API keys per developer.
  • Research document analysis pipeline - A university lab ingests hundreds of academic PDFs into ChromaDB via Open WebUI’s knowledge base, then uses RAG-augmented chat to let researchers query findings across the corpus using natural language.
  • Voice-first AI workstation - A field operations team installs Open WebUI as a PWA on tablets, using Whisper-based STT and ElevenLabs TTS to interact with AI entirely by voice while keeping hands free for physical tasks.
  • Custom AI agent development platform - A startup uses Open WebUI’s Pipelines framework and Python function calling to build specialized agents that pull live data from internal APIs, run calculations, and return structured results—all without exposing the underlying model infrastructure to end users.

Under The Hood

Architecture

Open WebUI follows a clean full-stack separation: a Python FastAPI backend handles all AI provider communication, authentication, storage, and business logic, while a SvelteKit frontend owns the UI layer and communicates exclusively via REST and Socket.IO. Backend routing is organized into distinct router modules per domain (chats, models, retrieval, authentication, images, audio, tools), with clear service boundaries and no cross-cutting logic leaking between concerns. Real-time capabilities including streaming responses, collaborative notes (backed by pycrdt/Y.js CRDT), and live chat synchronization flow through a Socket.IO layer that Redis coordinates across multiple worker nodes. Configuration is entirely environment-driven, enabling clean Docker Compose overlays for different deployment profiles without touching application code. The vector database abstraction is a textbook factory pattern: a single VectorDBBase interface with 13 concrete implementations selected at startup time, making it straightforward to swap backends without touching RAG logic.
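The factory pattern described for the vector database layer can be sketched as follows. Only the name `VectorDBBase` comes from the text above; the method names `upsert` and `search`, the toy `InMemoryVectorDB` backend, the `BACKENDS` registry, and the use of a `VECTOR_DB` environment variable are assumptions for illustration and may not match the project's actual code.

```python
import math
import os
from abc import ABC, abstractmethod
from typing import Optional


class VectorDBBase(ABC):
    """Interface every backend implements (method names are illustrative)."""

    @abstractmethod
    def upsert(self, doc_id: str, vector: list[float]) -> None: ...

    @abstractmethod
    def search(self, query: list[float], k: int = 3) -> list[str]: ...


class InMemoryVectorDB(VectorDBBase):
    """Toy backend: brute-force cosine similarity over a dict."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, doc_id: str, vector: list[float]) -> None:
        self._vectors[doc_id] = vector

    def search(self, query: list[float], k: int = 3) -> list[str]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self._vectors,
            key=lambda doc_id: cosine(query, self._vectors[doc_id]),
            reverse=True,
        )
        return ranked[:k]


# Registry mapping a backend name to its class; other backends would register here.
BACKENDS: dict[str, type] = {"memory": InMemoryVectorDB}


def get_vector_db(name: Optional[str] = None) -> VectorDBBase:
    """Select one backend at startup, by argument or the VECTOR_DB variable."""
    name = name or os.environ.get("VECTOR_DB", "memory")
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"Unknown vector DB backend: {name!r}") from None
```

Because calling code only ever sees `VectorDBBase`, swapping backends in this sketch means adding one entry to the registry and changing one setting, which is the property the paragraph above attributes to the real abstraction.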

Tech Stack

The backend runs Python 3.11 with FastAPI and Uvicorn, using SQLAlchemy’s async engine for relational persistence (SQLite with optional encryption, or PostgreSQL) and Alembic for migrations. Peewee handles legacy migration paths. The frontend is SvelteKit with TypeScript, built via Vite and deployed as static assets served by the Python backend. AI integrations span the official OpenAI, Anthropic, and Google GenAI SDKs for cloud providers, plus Ollama’s API for local inference. The ML stack includes Hugging Face Transformers, sentence-transformers, ONNX Runtime, and accelerate for on-device embedding and inference. Socket.IO with Redis adapters provides horizontally scalable WebSocket support. Playwright handles browser-based web scraping via a sidecar container, and Docker multi-stage builds produce :cuda, :ollama, and :slim image variants for different hardware targets.

Code Quality

The test suite is sparse: E2E coverage via Playwright is configured through a Docker Compose sidecar, but unit and integration tests are largely absent from the repository itself, which is unusual for a project of this scale. The backend Python code is well-structured with consistent module conventions and uses loguru for logging, but error handling relies primarily on FastAPI’s HTTP exception mechanism rather than typed domain errors, making failure modes implicit in some paths. The Svelte components follow a consistent naming and structure pattern, though TypeScript coverage is uneven: many components lack explicit type annotations. Pre-commit hooks are referenced in pyproject.toml and a Makefile exists for common tasks, but formal linting enforcement across the frontend is limited. The project compensates with extremely active community contribution (391+ contributors, 15,000+ commits) and rapid release cadence (multiple releases per week).

What Makes It Unique

The breadth of integrations in a single self-hosted application is exceptional: 13 vector database backends, 15+ web search providers, 6+ STT engines, 4+ TTS engines, 4 image generation backends, and direct SDK support for OpenAI, Anthropic, and Google, all governed by a single RBAC layer and deployable via one Docker command. The Pipelines framework adds an extensibility architecture that lets operators inject Python middleware between user requests and model responses without forking the core application. The CRDT-backed collaborative notes, built on pycrdt and Y.js, bring real-time document collaboration of a kind typically found only in dedicated tools. The addition of an Anthropic Messages API proxy endpoint means tools like Claude Code can authenticate through Open WebUI and have their requests routed to any configured backend, a particularly clever interoperability bridge.
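To make the idea of Python middleware sitting between user requests and model responses more concrete, here is a minimal sketch. The hook names `inlet` and `outlet`, the `RedactEmailFilter` class, the `run_chain` helper, and the `echo_model` stand-in are all assumptions chosen for illustration; the real Pipelines interface may differ.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class RedactEmailFilter:
    """Hypothetical filter: rewrites the request on the way in, tags the reply on the way out."""

    async def inlet(self, body: dict, user: Optional[dict] = None) -> dict:
        # Replace any word containing "@" before the request reaches the model.
        for message in body.get("messages", []):
            words = message["content"].split()
            message["content"] = " ".join("[email]" if "@" in w else w for w in words)
        return body

    async def outlet(self, body: dict, user: Optional[dict] = None) -> dict:
        # Record which filter processed the response.
        body["filtered_by"] = type(self).__name__
        return body


async def run_chain(
    filters: list,
    body: dict,
    call_model: Callable[[dict], Awaitable[dict]],
) -> dict:
    """Pass the request through each inlet, call the model, then each outlet in reverse."""
    for f in filters:
        body = await f.inlet(body)
    response = await call_model(body)
    for f in reversed(filters):
        response = await f.outlet(response)
    return response


async def echo_model(body: dict) -> dict:
    """Stand-in for a real model call: echoes the last user message."""
    return {"reply": body["messages"][-1]["content"]}
```

A chain like this could be exercised with `asyncio.run(run_chain([RedactEmailFilter()], request, echo_model))`. Running outlets in reverse order mirrors how middleware stacks usually unwind, so the first filter to see the request is the last to see the response.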

Self-Hosting

Open WebUI uses a custom license (not SPDX-classified) that is broadly permissive for small deployments but includes a notable branding restriction: you may not alter, remove, or replace “Open WebUI” branding in any deployment serving more than 50 users within a 30-day period without explicit written permission from the copyright holder or a paid enterprise license. For deployments of 50 or fewer users, modification and redistribution are allowed subject to standard attribution conditions. In practice this makes the project source-available rather than open source under the OSI definition, and commercial deployments at scale that want to white-label the interface will need an enterprise agreement.

The infrastructure needed to run Open WebUI yourself depends heavily on your use case. A minimal single-user deployment on a laptop with Docker is simple: one docker run command gets you a working interface. Production deployments for teams add complexity: you will need to provision and maintain a PostgreSQL database (instead of the default SQLite), a Redis instance for WebSocket coordination across workers, a vector database server for RAG (Chroma runs embedded, but Qdrant or Milvus for production workloads need separate services), and object storage if you want cloud-native file handling. GPU nodes with CUDA drivers are required if you want local inference via Ollama at meaningful speed. Kubernetes deployment is supported via Helm, but orchestrating the full stack (web app, database, vector store, Redis, optional Pipelines server) represents a genuine operational investment.
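The gap between a laptop setup and a team deployment can be expressed as a small pre-flight check over environment settings. This is a sketch only: the variable names `DATABASE_URL`, `REDIS_URL`, and `VECTOR_DB`, and the function `production_gaps`, are assumptions for illustration and should be checked against the project's own configuration reference.

```python
from typing import Mapping


def production_gaps(env: Mapping[str, str]) -> list[str]:
    """Return warnings for settings that suit a laptop but not a multi-node team deployment."""
    gaps = []

    # Relational store: anything other than a PostgreSQL URL is treated as the SQLite default.
    db_url = env.get("DATABASE_URL", "")
    if not db_url.startswith(("postgresql://", "postgres://")):
        gaps.append("database: SQLite default in use; point DATABASE_URL at PostgreSQL")

    # Redis is needed to coordinate WebSocket traffic across workers.
    if not env.get("REDIS_URL"):
        gaps.append("redis: no REDIS_URL set; workers cannot coordinate WebSockets")

    # Embedded Chroma is fine locally; production RAG needs a separate service.
    if env.get("VECTOR_DB", "chroma") == "chroma":
        gaps.append("vector store: embedded Chroma; consider a separate Qdrant or Milvus service")

    return gaps
```

Calling `production_gaps(os.environ)` before a rollout would list which of the three services still point at single-node defaults; an empty list means none of these particular gaps were found, not that the deployment is otherwise production-ready.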

The hosted Open WebUI Cloud and Enterprise tiers (available via the openwebui.com website) add custom theming and branding rights, formal SLA support, Long-Term Support (LTS) release channels with extended maintenance windows, and direct access to the development team. Self-hosters get community support via Discord and GitHub issues, which are very active given the project’s scale, but there is no guaranteed response time. Enterprise plan holders also get managed upgrade paths and compatibility guarantees that self-hosted deployments must handle independently through Alembic migrations and Docker image updates.
