Timeplus Proton is a high-performance, single-binary SQL engine designed for real-time stream processing, ETL pipelines, observability, and AI/ML feature engineering. Built in C++ with ClickHouse as its core, it avoids the operational complexity of JVM-based systems like Apache Flink or ksqlDB by offering native SQL over streaming data with no external dependencies. It supports direct ingestion from Kafka, Redpanda, and other sources, with built-in materialized views, windowing functions, and sinks to ClickHouse or external databases. Its lightweight design allows deployment on minimal infrastructure, even a t2.nano AWS instance, which suits teams seeking low-latency, high-throughput stream processing without operational overhead.
Proton is tailored for developers and data engineers who need to build real-time analytics pipelines without managing complex distributed systems. Whether you’re filtering telemetry data, computing live metrics for dashboards, or generating ML features from streaming events, Proton uses SQL as the single interface, with support for UDFs in Python and JavaScript, multi-stream joins, and incremental view maintenance. It’s not a full data platform but a focused, fast engine for streaming SQL workloads that need to run anywhere, from laptops to cloud instances.
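To make "incremental view maintenance" concrete, here is a minimal Python sketch of the general idea: each new event is folded into a small per-key state instead of recomputing the result from all raw data. This is only an illustration of the technique, not Proton's implementation; the class `IncrementalAvgView` and its methods are hypothetical names chosen for this example.

```python
from collections import defaultdict


class IncrementalAvgView:
    """Toy model of an incrementally maintained 'average per key' view.

    Hypothetical illustration only; not Proton's internal design.
    """

    def __init__(self):
        # Per-key running state: [sum, count]. Raw events are not stored.
        self._state = defaultdict(lambda: [0.0, 0])

    def on_event(self, key, value):
        """Fold one new event into the view and return the updated average."""
        acc = self._state[key]
        acc[0] += value
        acc[1] += 1
        return acc[0] / acc[1]

    def snapshot(self):
        """Return the current view contents as {key: average}."""
        return {k: s / n for k, (s, n) in self._state.items()}


view = IncrementalAvgView()
for device, temperature in [("dev1", 20.0), ("dev2", 30.0), ("dev1", 22.0)]:
    view.on_event(device, temperature)

print(view.snapshot())  # {'dev1': 21.0, 'dev2': 30.0}
```

The cost of each update is constant per event, which is why this style of maintenance avoids the delay of periodic batch recomputation.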
What You Get
- Single-binary SQL engine - A standalone C++ binary under 500 MB with no JVM, ZooKeeper, or external dependencies. Deploy via curl install, Homebrew, or Docker with zero configuration overhead.
- Native streaming SQL support - Execute real-time SQL queries on Kafka, Redpanda, and Pulsar streams using CREATE EXTERNAL STREAM. Supports windowed aggregations (tumble, hop, session), watermarks, and CDC.
- ClickHouse-powered analytics - Leverage ClickHouse’s 1000+ SQL functions and columnar storage for fast materialized views, aggregations over billions of rows, and low-latency analytical queries directly from streaming data.
- Multi-source ingestion & multi-sink output - Ingest from Kafka, Redpanda, ClickHouse tables, and the REST API; sink to Kafka, ClickHouse, or other Proton instances. Supports AWS MSK with IAM authentication and secure connections.
- Incremental materialized views - Maintain aggregated results (e.g., avg temperature per device) in real time, with automatic backfill and windowed group-by operations, eliminating batch ETL delays.
- UDFs in Python and JavaScript - Extend SQL with custom logic using user-defined functions, enabling complex transformations like data masking or enrichment without leaving the SQL interface.
- Low-resource deployment - Runs on minimal infrastructure: 0.5 GiB RAM, 1 vCPU (e.g., AWS t2.nano), and supports Docker containers for easy CI/CD integration.
- REST API & SDK integrations - Connect via REST endpoints or Python/Java/Go SDKs for programmatic access to streaming SQL results, enabling integration with custom dashboards or ML pipelines.
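The windowed aggregations mentioned above (tumble, hop) differ mainly in how an event timestamp maps to windows. The following Python sketch shows that mapping under simple assumptions: timestamps are integer seconds, and the helper names `tumble_start`, `hop_starts`, and `tumble_avg` are hypothetical, chosen for illustration rather than taken from Proton.

```python
from collections import defaultdict


def tumble_start(ts, size):
    """Start of the single fixed, non-overlapping window containing ts."""
    return ts - (ts % size)


def hop_starts(ts, size, slide):
    """Starts of every overlapping window [start, start + size) containing ts."""
    start = ts - (ts % slide)  # latest window start that is <= ts
    starts = []
    while start > ts - size:
        starts.append(start)
        start -= slide
    return sorted(starts)


def tumble_avg(events, size):
    """Average value per (window_start, key) over tumbling windows."""
    acc = defaultdict(lambda: [0.0, 0])  # (window_start, key) -> [sum, count]
    for ts, key, value in events:
        slot = acc[(tumble_start(ts, size), key)]
        slot[0] += value
        slot[1] += 1
    return {k: s / n for k, (s, n) in acc.items()}


# (timestamp_seconds, device, temperature)
events = [(1, "dev1", 20.0), (4, "dev1", 24.0), (11, "dev1", 30.0)]
print(tumble_avg(events, size=10))        # {(0, 'dev1'): 22.0, (10, 'dev1'): 30.0}
print(hop_starts(12, size=10, slide=5))   # [5, 10]
```

With a tumbling window each event belongs to exactly one window, while with a hopping window an event at second 12 falls into both the [5, 15) and [10, 20) windows. Session windows, which close after a gap in activity, are not covered by this sketch.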
Common Use Cases
- Building real-time telemetry dashboards - Ingest logs and metrics from Kafka topics, filter noise with SQL WHERE clauses, compute moving averages over 10-second windows, and sink results to ClickHouse for live dashboards in Grafana or DBeaver.
- Real-time feature engineering for ML models - Compute features like ‘avg temperature per device over last 5 minutes’ or ‘device failure rate in sliding window’ directly from live sensor data streams, then write to a feature store for model inference.
- Replacing Flink/ksqlDB with lightweight SQL - Teams tired of JVM memory tuning and complex cluster management use Proton to run the same streaming ETL logic (joins, windows, aggregations) in a single binary, with a claimed 10x lower latency and no external dependencies to manage.
- DevOps teams monitoring microservices at scale - Deploy Proton on edge nodes to aggregate logs and metrics from containers, alert on anomalies (e.g., error rate > 5% in 30s), and forward to S3 or OpenSearch without running a full data pipeline stack.
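The anomaly rule in the last use case (error rate > 5% in 30s) can be sketched as a sliding-window check. The Python below is an illustrative model, not Proton code: `ErrorRateMonitor` is a hypothetical name, and it assumes events arrive with non-decreasing timestamps in seconds.

```python
from collections import deque


class ErrorRateMonitor:
    """Sliding-window error-rate check (hypothetical illustration)."""

    def __init__(self, window_seconds=30, threshold=0.05):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = deque()  # (timestamp, is_error), oldest first
        self._errors = 0        # number of errors currently in the window

    def observe(self, ts, is_error):
        """Record one request; return True if the alert condition holds."""
        self._events.append((ts, is_error))
        self._errors += int(is_error)
        # Evict events that are no longer inside (ts - window, ts].
        while self._events[0][0] <= ts - self.window_seconds:
            _, old_error = self._events.popleft()
            self._errors -= int(old_error)
        return self._errors / len(self._events) > self.threshold


monitor = ErrorRateMonitor()
# 24 successful requests, one per second, followed by two errors.
alerts = [monitor.observe(t, False) for t in range(24)]
alerts.append(monitor.observe(24, True))  # 1 of 25 requests failed (4%)
alerts.append(monitor.observe(25, True))  # 2 of 26 requests failed (~7.7%)
print(alerts[-2:])  # [False, True]
```

In a streaming SQL engine the same rule would typically be expressed as a windowed aggregation with a filter on the computed rate; the sketch only shows the logic such a query would evaluate.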
Under The Hood
Timeplus-io/proton is a high-performance streaming SQL engine built on ClickHouse’s foundation and optimized for real-time analytics on streaming and time-series workloads. It pairs the query and storage capabilities inherited from ClickHouse with container-friendly deployment.
Architecture
The system adopts a monolithic yet modular architecture that emphasizes layered design and cross-platform support. It demonstrates clear separation of concerns across core components such as database logic, testing infrastructure, and deployment mechanisms.
- Modular organization with distinct layers for data ingestion, query execution, and storage
- Configuration-driven builds that support cross-compilation and multi-platform deployment
- Integration with containerized environments for seamless cloud-native operations
- Use of design patterns that enable extensibility and dynamic plugin loading
Tech Stack
The project is primarily developed in C++ with extensive Python integration for automation, testing, and tooling. It leverages a wide range of system-level and database-specific libraries to achieve performance and scalability.
- Built predominantly in C++ with Python for testing, configuration, and build scripts
- Relies on system-level libraries for performance, and on ClickHouse-derived test helpers such as ClickHouseCluster for integration testing
- Employs CMake for cross-platform builds, Docker for containerization, and CI/CD automation
- Integrates pytest and custom test runners to support functional, integration, and fuzzer-based testing
Code Quality
The codebase reflects a mature development approach with strong emphasis on testing, linting, and cross-platform compatibility. It includes automated checks and comprehensive build configurations to ensure consistency and reliability.
- Comprehensive test coverage with Docker-based environments for multi-platform validation
- Code linting and type annotations in place to maintain consistency and reduce errors
- Automated checks and build configurations that support scalable deployment across architectures
- Extensive API documentation and structured code organization for maintainability
What Makes It Unique
Timeplus-io/proton introduces several innovative features that distinguish it from conventional database systems, particularly in real-time analytics and cloud-native adaptability.
- Stream processing integrated directly into the database engine, giving low-latency analytics on live data
- Modular query engine that supports dynamic plugin loading and custom function extensions for tailored use cases
- Cloud-native deployment tooling with Docker and Kubernetes-native support for modern infrastructure
- Advanced compression and partitioning strategies optimized specifically for time-series data workloads