The open-source BI platform that lets anyone ask questions and build dashboards without writing SQL — with an embedded analytics SDK and AI-powered query assistant included.
Metabase is an open-source Business Intelligence platform that empowers every team member — not just data analysts — to explore data, create visualizations, and share insights without writing a single line of SQL. Its visual query builder transforms drag-and-drop selections into queries executed across more than 20 supported databases, while Metabot, the built-in AI assistant, lets users ask questions in plain English and receive trusted, reproducible answers backed by real database queries.
The platform is built around a semantic query layer called MBQL (Metabase Query Language), an EDN-based intermediate representation that abstracts over all connected data sources. This enables cross-database features like nested queries, implicit joins, and field-level fingerprinting that auto-classifies timestamps, categories, URLs, and IDs to intelligently suggest the right visualization type without manual configuration.
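To make the fingerprinting idea concrete, here is a minimal sketch in Python of how sampled column values could be mapped to a semantic type and then to a default chart. It is not Metabase's actual algorithm. The function names `classify_column` and `suggest_visualization`, the thresholds, and the chart mapping are all assumptions chosen for illustration.

```python
import re
from datetime import datetime

URL_RE = re.compile(r"^https?://\S+$", re.IGNORECASE)


def _is_timestamp(value: str) -> bool:
    """Return True if the value parses as an ISO 8601 date or datetime."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def classify_column(name: str, samples: list[str]) -> str:
    """Guess a semantic type from a column name and sample values (toy heuristic)."""
    values = [s for s in samples if s]
    if not values:
        return "unknown"
    # Hypothetical naming convention: "id" or a "_id" suffix marks a key column.
    if name.lower() == "id" or name.lower().endswith("_id"):
        return "id"
    if all(_is_timestamp(v) for v in values):
        return "timestamp"
    if all(URL_RE.match(v) for v in values):
        return "url"
    # Few distinct values relative to the sample size suggests a category.
    if len(set(values)) <= max(1, len(values) // 2):
        return "category"
    return "text"


def suggest_visualization(semantic_type: str) -> str:
    """Map a semantic type to a default chart (illustrative mapping only)."""
    return {"timestamp": "line", "category": "bar"}.get(semantic_type, "table")
```

A real fingerprinter would work from database statistics and many more signals. The point of the sketch is only that classification happens once per field, so every later query can reuse it.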
For SaaS companies, Metabase’s embedding capabilities stand out: a React SDK and iframe embedding mode share the same permission and session model, enabling customer-facing dashboards with per-row data isolation from a single Metabase instance. The Data Studio layer allows teams to define canonical metrics, transform raw tables into analytics-ready models, and track downstream dependencies — bringing data modeling and BI into one interface.
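Customer-facing embedding of this kind generally relies on the host application vouching for the viewer's identity. The sketch below shows one common way to do that: an HS256-signed, short-lived token built with only the Python standard library. The claim names (`sub`, `tenant_id`) and the helpers `sign_embed_token` and `verify_embed_token` are hypothetical. Consult the Metabase embedding documentation for the payload it actually expects.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_embed_token(secret: str, user_id: str, tenant_id: str,
                     ttl_seconds: int = 600, now: Optional[float] = None) -> str:
    """Build an HS256 JWT carrying the viewer and tenant (hypothetical claims)."""
    issued = int(now if now is not None else time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "tenant_id": tenant_id, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_embed_token(secret: str, token: str, now: Optional[float] = None) -> dict:
    """Check the signature and expiry, then return the decoded payload."""
    signing_input, _, signature = token.rpartition(".")
    expected = _b64url(hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                                hashlib.sha256).digest())
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    payload_b64 = signing_input.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    current = now if now is not None else time.time()
    if payload["exp"] < current:
        raise ValueError("token expired")
    return payload
```

The tenant identifier travels inside the signed payload, so the browser cannot alter it. The server can then use it as the attribute that drives row-level filtering.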
Self-hosting is straightforward — a single JAR or Docker image with H2 bundled for quick starts and PostgreSQL/MySQL recommended for production. Releases ship on a weekly cadence with long-term support tracks maintained in parallel, and the official Metabase Cloud option offloads all operational concerns for teams who prefer managed infrastructure.
Architecture
Metabase is structured as a full-stack JVM/React application with a purpose-built query abstraction at its core. The backend processes queries through a composable middleware pipeline: preprocessing normalizes and enriches the query with schema metadata and permission checks, compilation translates MBQL to database-native SQL via pluggable driver multimethods, execution dispatches to the correct JDBC or native driver, and postprocessing applies formatting and result shaping. Enterprise features live in a cleanly isolated top-level enterprise/ directory whose namespaces overlay OSS counterparts when the ee Clojure alias is active, so the open-source core has zero coupling to commercial code. The frontend is a React monorepo with independently built packages — main app, embedding SDK, static visualizations, and iframe embed — each with dedicated rspack configurations and TypeScript declarations.
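The four pipeline stages described above can be sketched as plain function composition. This is a toy model in Python, not the Clojure implementation: the stage names mirror the description, but the query shape, the generated SQL, and the canned rows returned by `execute` are assumptions made for illustration.

```python
from functools import reduce
from typing import Callable

Query = dict
Stage = Callable[[Query], Query]


def preprocess(query: Query) -> Query:
    """Normalize the table name and attach a default row limit."""
    return {**query, "table": query["table"].lower(), "limit": query.get("limit", 100)}


def compile_to_sql(query: Query) -> Query:
    """Translate the abstract query into a SQL string (table name assumed trusted)."""
    sql = f'SELECT * FROM {query["table"]} LIMIT {query["limit"]}'
    return {**query, "sql": sql}


def execute(query: Query) -> Query:
    """Stand-in for a driver call: returns canned rows instead of hitting a database."""
    return {**query, "rows": [(1, "alice"), (2, "bob")]}


def postprocess(query: Query) -> Query:
    """Shape raw rows into the result the caller sees."""
    return {"sql": query["sql"], "row_count": len(query["rows"]), "rows": query["rows"]}


def run_pipeline(query: Query, stages: list[Stage]) -> Query:
    """Thread the query through each stage in order."""
    return reduce(lambda acc, stage: stage(acc), stages, query)


result = run_pipeline({"table": "Orders"},
                      [preprocess, compile_to_sql, execute, postprocess])
```

Because each stage takes and returns the same kind of value, a cross-cutting concern such as a permission check can be added as one more stage without touching the others.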
Tech Stack
The backend is written in Clojure on the JVM, using Malli for runtime schema validation throughout the query processor and API layer, HikariCP for connection pooling, Liquibase for database migrations, Quartzite for scheduled job execution, and Ring with Compojure for HTTP routing. Supported application databases are H2 (bundled for development), PostgreSQL, and MySQL. The frontend uses React 18 with Redux Toolkit and react-query for state management, CodeMirror 6 for the SQL editor, rspack for builds, and a mix of custom SVG rendering and ECharts for the visualization engine. Database drivers for Snowflake, BigQuery, Athena, Redshift, MongoDB, Oracle, SQL Server, ClickHouse, and others ship as separately bundled Clojure modules. Docker is the primary deployment target, with a published JAR for non-container environments.
Code Quality
Test coverage is comprehensive across all layers: nearly 2,000 Jest unit and component tests cover the frontend including the visualization engine, React hooks, and embedding SDK; 424 Cypress end-to-end tests exercise complete user workflows; and extensive Clojure test namespaces cover query processing pipelines, API endpoints, driver behavior, permission enforcement, and database migrations. ESLint enforces module boundary constraints via a dedicated configuration that prevents cross-package imports. Codecov tracks per-directory thresholds for both frontend and backend. Cross-version compatibility tests verify that database driver upgrades and API changes don’t break existing deployments. The main limitation is that some visualization logic uses loosely typed interfaces given the breadth of supported chart configurations.
What Makes It Unique
Metabase’s most significant technical differentiator is MBQL, a database-agnostic query intermediate representation that enables features that are hard to build on top of raw SQL strings. Semantic field fingerprinting auto-classifies columns as timestamps, geographic coordinates, URLs, or categorical dimensions to drive intelligent visualization defaults without user configuration. Implicit join resolution traverses schema relationships automatically, so users never need to express JOINs in the visual builder. The permission sandbox system rewrites queries at the MBQL layer before SQL compilation, so row-level filters apply whether the query originated from the visual builder or an embedded dashboard. Queries written directly in the SQL editor do not pass through MBQL, so sandboxed groups should not be given native query access. The Agent API is a newer addition that exposes this same query pipeline to external LLMs, letting AI systems query production databases through Metabase’s governance layer rather than granting direct database access.
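Rewriting a structured query before it is compiled can be illustrated with a small Python sketch. The query dictionary here is a simplified stand-in, not real MBQL syntax. The policy format and the helpers `apply_sandbox` and `compile_sql` are hypothetical names introduced only for this example.

```python
import copy


def apply_sandbox(query: dict, policies: dict, user_attrs: dict) -> dict:
    """Return a copy of the query with the user's row filter appended, if a policy exists."""
    policy = policies.get(query["source_table"])
    if policy is None:
        return query
    sandboxed = copy.deepcopy(query)
    clause = ("=", policy["column"], user_attrs[policy["attribute"]])
    sandboxed.setdefault("filters", []).append(clause)
    return sandboxed


def compile_sql(query: dict) -> tuple[str, list]:
    """Compile to parameterized SQL; identifiers are assumed trusted in this toy."""
    sql = f'SELECT * FROM {query["source_table"]}'
    conditions, params = [], []
    for op, column, value in query.get("filters", []):
        conditions.append(f"{column} {op} ?")
        params.append(value)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


policies = {"orders": {"column": "tenant_id", "attribute": "tenant"}}
user_query = {"source_table": "orders", "filters": [("=", "status", "paid")]}
sql, params = compile_sql(apply_sandbox(user_query, policies, {"tenant": 42}))
```

Because the filter is added to the structured query rather than spliced into SQL text, the same rewrite works for every database dialect the compiler can target.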
Metabase uses a dual-license model. The core application — everything outside the enterprise/ directory — is released under the GNU Affero General Public License (AGPL v3). This means you can freely run, modify, and distribute Metabase for any purpose, including commercial use, as long as any modifications to the AGPL source are made available under the same license. The AGPL’s network copyleft clause applies to the server software itself, not to dashboards or queries you create with it, so your business data and analytics assets are your own. The enterprise/ directory is licensed separately under the Metabase Commercial License, which requires a paid plan to use in production.
Running Metabase yourself is operationally straightforward for small to medium deployments. A single Docker image or JAR file contains everything needed — you bring a PostgreSQL or MySQL application database, configure an SMTP server for email alerts, and optionally set up Slack or webhook integrations. Metabase runs as a single JVM process, so there’s no microservices coordination to manage. Memory requirements start around 2 GB RAM for a small team; larger installations with many concurrent users and complex queries should plan for dedicated database capacity and horizontal scaling behind a load balancer. You are responsible for backup of the application database (which stores all questions, dashboards, and configuration) and for applying upgrades, which ship on a weekly release cadence with separate long-term support branches maintained in parallel.
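As a rough illustration of the single-container setup, the Python helper below assembles a `docker run` command that points Metabase at an external PostgreSQL application database through its `MB_DB_*` environment variables. The helper name `metabase_docker_command`, the container name, and the example host and credentials are assumptions. Check the image tag and variables against the official installation guide before use.

```python
import shlex


def metabase_docker_command(db_host: str, db_name: str, db_user: str, db_pass: str,
                            db_type: str = "postgres", db_port: int = 5432,
                            host_port: int = 3000,
                            image: str = "metabase/metabase:latest") -> list[str]:
    """Build the argv for running Metabase against an external application database."""
    env = {
        "MB_DB_TYPE": db_type,
        "MB_DB_DBNAME": db_name,
        "MB_DB_PORT": str(db_port),
        "MB_DB_USER": db_user,
        "MB_DB_PASS": db_pass,
        "MB_DB_HOST": db_host,
    }
    # Metabase listens on port 3000 inside the container.
    args = ["docker", "run", "-d", "--name", "metabase", "-p", f"{host_port}:3000"]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args


command = metabase_docker_command("db.internal", "metabaseappdb", "metabase", "change-me")
print(shlex.join(command))  # copy-pasteable shell form of the same command
```

In practice the password would come from a secret store rather than a literal argument. The sketch keeps it inline only to show which settings the application database needs.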
The Metabase Cloud offering adds meaningful operational benefits for teams that prefer not to manage infrastructure: automated upgrades, managed backups, built-in SMTP, SOC 2 Type II compliance documentation, and first-class support SLAs. The Pro and Enterprise tiers, available both on Cloud and self-hosted, unlock the features covered by the commercial license: advanced permission sandboxing, SSO with SAML and LDAP, audit logs, multi-environment config management via Git sync, white-labeling options for the embedding SDK, and customer-facing embedded analytics with multi-tenant data isolation. Self-hosters on the free open-source tier retain full functionality for internal team analytics; the commercial tier primarily adds governance, compliance, and embedding features aimed at SaaS companies building customer-facing products.