PostgresML transforms PostgreSQL into a full-stack ML/AI platform by embedding machine learning models and large language models directly within the database. It’s designed for data engineers and ML teams who want to avoid the latency, complexity, and security risks of moving data to external model servers. By running inference and training inside PostgreSQL, it enables real-time AI operations on live data without ETL pipelines.
Built as a PostgreSQL extension written in Rust, PostgresML integrates with pgvector for vector search and supports GPU-accelerated inference via ONNX Runtime. It works with existing PostgreSQL tooling like psql and psycopg, and offers both a managed cloud service and self-hosted Docker deployment. The ecosystem includes specialized libraries like Korvus and pgcat for RAG pipelines and database scaling.
What You Get
- In-Database ML/AI - Train models and run inference directly in PostgreSQL using SQL functions like pgml.train and pgml.predict, eliminating external API calls.
- GPU Acceleration - Leverage NVIDIA GPUs for model inference via ONNX Runtime, achieving 8-40X faster performance than HTTP-based serving.
- Large Language Models from Hugging Face - Load and run state-of-the-art LLMs (e.g., Llama, Mistral, GPT-2) directly in PostgreSQL using pgml.transform for text generation, summarization, and translation.
- RAG Pipeline Functions - Built-in SQL functions: pgml.chunk (text splitting), pgml.embed (vector generation), pgml.rank (cross-encoder re-ranking), and pgml.transform (text generation) for end-to-end RAG in a single query.
- Vector Search with pgvector - Perform efficient approximate nearest neighbor (ANN) searches on embeddings using PostgreSQL’s pgvector extension for semantic search and recommendation systems.
- 47+ ML Algorithms - Access 47+ built-in classification and regression algorithms, including XGBoost, Random Forest, SVM, and Logistic Regression, via pgml.train without external dependencies.
- High-Performance Inference - Execute millions of ML predictions per second with horizontal scaling via pgcat, reducing latency to microseconds for real-time applications.
- Seamless PostgreSQL Integration - Use standard SQL and existing tools like psql, psycopg3, and Django ORM to run AI queries without changing your data stack.
- Security & Privacy - Keep sensitive data and models co-located inside the database, avoiding data exfiltration risks from external API calls to OpenAI or other cloud providers.
- Self-Hosted & Cloud Options - Deploy via Docker on your infrastructure or use the serverless PostgresML Cloud with free GPU access and preloaded LLMs.
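The pgml.train / pgml.predict workflow above can be sketched in two statements. This is a minimal illustration, not the library's canonical example: the project, table, and column names are hypothetical, and it assumes a database with the pgml extension installed.

```sql
-- Train an XGBoost classifier on an existing table.
-- 'fraud_detector' is an arbitrary project name; 'transactions'
-- and 'is_fraud' are hypothetical table and label columns.
SELECT pgml.train(
    project_name  => 'fraud_detector',
    task          => 'classification',
    relation_name => 'transactions',
    y_column_name => 'is_fraud',
    algorithm     => 'xgboost'
);

-- Score a new observation with the most recently deployed model.
-- The array holds feature values in the training columns' order.
SELECT pgml.predict('fraud_detector', ARRAY[102.50, 3, 0.87]);
```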
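A pgml.transform call for the summarization case might look like the following sketch. The task object follows the Hugging Face pipeline convention; the model identifier is illustrative, and any supported model from the Hub could be substituted.

```sql
-- Summarize text with a Hugging Face model loaded in-database.
SELECT pgml.transform(
    task   => '{
        "task": "summarization",
        "model": "sshleifer/distilbart-cnn-12-6"
    }'::JSONB,
    inputs => ARRAY[
        'PostgresML runs machine learning models inside PostgreSQL, ...'
    ]
);
```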
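The RAG building blocks (pgml.chunk, pgml.embed, pgml.rank) compose in plain SQL. A hedged sketch of each step, with a hypothetical documents table and illustrative model identifiers; exact return shapes may vary by PostgresML version.

```sql
-- 1) Split a document body into overlapping chunks.
SELECT pgml.chunk('recursive_character', body,
                  '{"chunk_size": 1500, "chunk_overlap": 40}')
FROM documents;  -- 'documents' is a hypothetical table

-- 2) Embed a chunk for vector retrieval.
SELECT pgml.embed('intfloat/e5-small-v2', 'a chunk of text');

-- 3) Re-rank candidate chunks against the user query with a
--    cross-encoder before generation.
SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1',
                 'How do I reset my password?',
                 ARRAY['candidate chunk one', 'candidate chunk two'],
                 '{"return_documents": true, "top_k": 2}'::JSONB);
```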
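Semantic search with pgvector reduces to an ORDER BY over a distance operator. A sketch assuming a hypothetical items table with a vector-typed embedding column populated by pgml.embed:

```sql
-- Embed the query text and find the five nearest items by cosine
-- distance (pgvector's <=> operator). pgml.embed returns an array,
-- so it is cast to vector for the comparison.
SELECT id, title
FROM items
ORDER BY embedding <=> pgml.embed(
    'intfloat/e5-small-v2',
    'wireless noise-cancelling headphones'
)::vector
LIMIT 5;
```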
Common Use Cases
- Building RAG-powered chatbots - A developer uses pgml.chunk, pgml.embed, and pgml.rank to build a knowledge-base chatbot that retrieves and generates answers from internal documents using Hugging Face LLMs, all within PostgreSQL.
- Real-time fraud detection - A fintech company trains an XGBoost model on transaction data inside PostgreSQL and uses pgml.predict to score transactions in real time with sub-millisecond latency.
- Personalized content recommendation - An e-commerce platform embeds product descriptions using pgml.embed and uses vector search to recommend similar items based on user behavior, all in SQL.
- Automated document summarization - A legal firm uses pgml.transform with a summarization LLM to auto-generate summaries of case files stored in PostgreSQL, reducing manual review time.
- Multilingual customer support automation - A global SaaS company uses pgml.transform with translation models to auto-translate and respond to customer tickets in 20+ languages without external APIs.
- High-throughput NLP pipelines - A media company processes 10M+ articles daily using pgml.transform for topic classification and entity extraction, running entirely within their existing PostgreSQL cluster.
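For the real-time fraud-scoring scenario, prediction is just another SQL expression, so rows can be scored as they are read. A sketch with hypothetical table and column names, assuming a model was previously trained under a 'fraud_detector' project:

```sql
-- Score the last day's transactions with the deployed model; the
-- feature array must match the columns the model was trained on.
SELECT t.id,
       pgml.predict('fraud_detector',
                    ARRAY[t.amount, t.merchant_risk, t.velocity])
           AS fraud_score
FROM transactions t
WHERE t.created_at > now() - interval '1 day';
```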
Under The Hood
Architecture
- Hybrid Python-Rust architecture with tightly coupled database extensions and service layers; data access, business logic, and API concerns are not clearly separated
- Absence of dependency injection, inversion of control, or modular plugin systems results in direct imports and rigid component coupling
- No unified abstraction for routing, middleware, or authentication, leading to fragmented API layering
- Directory structure lacks conventional module boundaries, creating a monolithic feel despite multi-language components
Tech Stack
- Rust-based core engine integrated as PostgreSQL extensions for high-performance ML execution via pgml and pgvector
- Python backend powered by FastAPI with Pydantic for data validation and pgml library for model lifecycle management
- PostgreSQL extended with custom ML and vector functions serving as the central data and inference layer
- Modern frontend using HTMX and Alpine.js for dynamic interactions, SCSS for styling, and minimal JavaScript
- Build system leveraging Cargo and Poetry with Docker for consistent development and deployment environments
- Comprehensive testing infrastructure using pytest, PostgreSQL test containers, and TypeScript-based SDK validation
Code Quality
- Extensive test coverage spanning unit, integration, and end-to-end scenarios across Rust, Python, and JavaScript
- Strong type safety enforced through TypeScript interfaces and rigorous Rust assertions
- Consistent naming conventions aligned with domain-driven design and platform-specific conventions
- Robust error handling via platform-native exceptions, though custom error types are underutilized in source code
- Comprehensive linting, build tooling, and test discipline with timeouts and environment presets
- Modular organization enables database extensions and client APIs to be tested and deployed independently of one another
What Makes It Unique
- ML operations like embedding and prediction are implemented as native PostgreSQL functions, eliminating external microservices and reducing latency
- Rust-powered server-side components generate dynamic, type-safe documentation via Sailfish templating
- Automatic icon mapping system links ML functions to UI icons using static metadata, enabling self-documenting interfaces
- Turbo-powered navigation with Stimulus maintains state across page transitions without full reloads
- Embedded documentation auto-generates from code-defined pipelines, turning docs into living validation artifacts