Coroot is an open-source APM and observability tool designed for SREs and DevOps teams managing complex microservice architectures. It addresses fragmented telemetry by automatically collecting metrics, logs, traces, and continuous profiles via eBPF, which removes the need for code changes or manual instrumentation. This enables 100% service coverage and immediate visibility into performance issues across applications, databases, infrastructure, and cloud costs.
Built with Go and leveraging eBPF, OpenTelemetry, and ClickHouse, Coroot provides a unified platform that ingests, correlates, and analyzes telemetry data in real time. It deploys as a Docker container or Kubernetes Helm chart and integrates with Prometheus, Grafana, and cloud providers like AWS, GCP, and Azure without requiring access to cloud credentials.
What You Get
- Zero-instrumentation observability - Coroot uses eBPF to automatically collect metrics, logs, traces, and profiles without requiring code changes or SDKs, enabling full visibility into legacy and third-party services.
- Service Map - Automatically generates a live, dependency-aware service map covering 100% of your system, powered by eBPF and distributed tracing data.
- AI-powered Root Cause Analysis - Coroot’s AI analyzes telemetry data to identify over 80% of issues automatically, providing actionable insights and suggested fixes instead of raw alerts.
- Continuous Profiling - One-click CPU and memory profiling down to the exact line of code, with baseline comparisons to detect anomalies caused by deployments or configuration changes.
- Log Pattern Clustering - Clusters log events into patterns out of the box and correlates them with traces, using ClickHouse for fast search and pattern detection without manual rule creation.
- Deployment Tracking - Automatically detects and monitors every Kubernetes rollout, compares performance before and after each release, and flags regressions without CI/CD integration.
- Cost Monitoring - Tracks cloud spending per application on AWS, GCP, and Azure without requiring cloud account access, correlating resource usage with application performance.
- SLO-Based Alerting - Sends a single, actionable alert when an SLO is breached, including all relevant inspection results, instead of overwhelming teams with noise.
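The SLO-based alerting idea can be illustrated with a minimal Go sketch. This is not Coroot's implementation: the `SLO` and `Window` types and the `Breached` and `BuildAlert` functions are hypothetical names chosen for illustration, and the breach rule (observed error rate above the allowed error budget) is an assumed, simplified criterion. It shows how one alert can carry every inspection finding instead of one alert per finding.

```go
package main

import (
	"fmt"
	"strings"
)

// SLO is a hypothetical availability objective: the fraction of requests
// that must succeed over the evaluation window (0.99 means 99%).
type SLO struct {
	Name      string
	Objective float64
}

// Window holds request counts observed over one evaluation window.
type Window struct {
	Total  int
	Failed int
}

// Breached reports whether the observed error rate exceeds the error
// budget implied by the objective. A window with no traffic never breaches.
func Breached(s SLO, w Window) bool {
	if w.Total == 0 {
		return false
	}
	errorRate := float64(w.Failed) / float64(w.Total)
	return errorRate > 1-s.Objective
}

// BuildAlert returns one message that bundles all findings, or false
// when the SLO is still being met and no alert should be sent.
func BuildAlert(s SLO, w Window, findings []string) (string, bool) {
	if !Breached(s, w) {
		return "", false
	}
	msg := fmt.Sprintf("SLO %q breached: %d of %d requests failed", s.Name, w.Failed, w.Total)
	if len(findings) > 0 {
		msg += "\n- " + strings.Join(findings, "\n- ")
	}
	return msg, true
}

func main() {
	slo := SLO{Name: "checkout availability", Objective: 0.99}
	findings := []string{"postgres: high query latency", "checkout: CPU throttling"}
	if msg, ok := BuildAlert(slo, Window{Total: 1000, Failed: 20}, findings); ok {
		fmt.Println(msg)
	}
}
```

Keeping the breach decision separate from message construction means the findings list can grow without changing when an alert fires.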
Common Use Cases
- Debugging production outages in microservices - SREs use Coroot’s service map and distributed tracing to follow a latency spike across 50+ services without instrumenting each one manually.
- Reducing MTTR after Kubernetes deployments - DevOps teams correlate deployment events with performance regressions and cost spikes using Coroot’s automatic deployment tracking and profiling.
- Monitoring legacy applications without code access - Teams observing Java or .NET apps in regulated environments use Coroot’s eBPF-based tracing to capture HTTP and database calls without modifying binaries.
- Optimizing cloud costs in multi-cloud environments - Engineering leads identify which microservices are driving AWS or GCP spend spikes using Coroot’s cost monitoring, right from the dashboard.
Under The Hood
Architecture
- Go-based monolithic backend with clear package-level modularity isolating concerns such as API routing, data collection, authentication, and gRPC serving
- Dependency injection via constructor-based service initialization ensures loose coupling and testability
- Static asset serving through embed.FS eliminates runtime file system dependencies and simplifies deployment
- Centralized configuration system dynamically initializes database backends and external data sources based on environment
- gRPC and HTTP APIs coexist in a unified process with the API layer acting as a clean facade over internal modules
- Event-driven watcher system decouples real-time alerting from data ingestion, enabling responsive incident detection
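Two of the architectural points above, constructor-based dependency injection and static assets served from an embedded file system, can be sketched together in Go. The code below is an illustrative assumption, not Coroot's source: `Store`, `memStore`, `Server`, `NewServer`, `demoHandler`, and `get` are hypothetical names. In a real build the assets would typically come from an `embed.FS` populated by a `//go:embed` directive; here `fstest.MapFS` stands in so the example runs without any files on disk, which works because both satisfy `fs.FS`.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// Store is a hypothetical dependency of the API layer.
type Store interface {
	Health() string
}

// memStore is a trivial in-memory Store used for the demo.
type memStore struct{}

func (memStore) Health() string { return "ok" }

// Server receives its collaborators through the constructor, so tests
// can pass fakes and production code can pass real implementations.
type Server struct {
	store  Store
	assets fs.FS
}

// NewServer wires the dependencies; nothing is looked up globally.
func NewServer(store Store, assets fs.FS) *Server {
	return &Server{store: store, assets: assets}
}

// Handler exposes the API and the static assets from a single mux.
func (s *Server) Handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, s.store.Health())
	})
	mux.Handle("/", http.FileServer(http.FS(s.assets)))
	return mux
}

// demoHandler builds a server with an in-memory file system standing in
// for an embed.FS (assumption made so the sketch needs no files).
func demoHandler() http.Handler {
	assets := fstest.MapFS{"app.js": {Data: []byte("console.log('ui')")}}
	return NewServer(memStore{}, assets).Handler()
}

// get performs an in-process request and returns the status and body.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	body, _ := io.ReadAll(rec.Result().Body)
	return rec.Code, string(body)
}

func main() {
	h := demoHandler()
	fmt.Println(get(h, "/api/health"))
	fmt.Println(get(h, "/app.js"))
}
```

Because `Server` depends only on the `Store` and `fs.FS` interfaces, swapping the in-memory stand-ins for a real database client or an embedded asset bundle requires no change to the handler code.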
Tech Stack
- Go 1.23 backend with native tooling for code quality, dependency management, and build automation
- Dockerized deployment using minimal base images with non-root execution and embedded data volumes for security and portability
- React and TypeScript frontend with npm-based linting and formatting workflows
- Unified Makefile orchestrates both backend and frontend build processes in a single, consistent workflow
- Minimalist infrastructure design avoids external frameworks, relying on Go’s standard library and direct HTTP APIs
Code Quality
- Extensive unit test coverage across core modules with testify for validating edge cases in time series, caching, and query parsing
- Well-defined packages with focused responsibilities and minimal cross-package dependencies
- Robust error handling with explicit assertions and meaningful failure contexts in tests
- Consistent Go idioms in naming and structure, enhancing readability and maintainability
- Strong type safety enforced through custom types for time series and labels, with JSON serialization validated via tests
- Proactive data integrity safeguards in JSON utilities that sanitize malformed inputs
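The points about custom time series types and sanitized JSON can be made concrete with a short Go sketch. The `Value` and `Series` types and the `Missing` and `encode` helpers are hypothetical and are not taken from Coroot's code base. The sketch relies on one known behavior of the standard library: `encoding/json` refuses to encode NaN or infinite `float64` values, so a custom `MarshalJSON` that writes `null` for them keeps a series with gaps serializable.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
)

// Value is a hypothetical sample type; NaN marks a missing point.
type Value float64

// Missing returns the marker used for a gap in the series.
func Missing() Value { return Value(math.NaN()) }

// MarshalJSON writes null for NaN and infinite values, which plain
// float64 encoding would reject with an UnsupportedValueError.
func (v Value) MarshalJSON() ([]byte, error) {
	f := float64(v)
	if math.IsNaN(f) || math.IsInf(f, 0) {
		return []byte("null"), nil
	}
	return strconv.AppendFloat(nil, f, 'g', -1, 64), nil
}

// Series pairs a label set with its samples.
type Series struct {
	Labels map[string]string `json:"labels"`
	Points []Value           `json:"points"`
}

// encode serializes a series and returns the JSON as a string.
func encode(s Series) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	s := Series{
		Labels: map[string]string{"app": "api"},
		Points: []Value{1, Missing(), 0.5},
	}
	out, err := encode(s)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out) // prints {"labels":{"app":"api"},"points":[1,null,0.5]}
}
```

Putting the sanitizing rule on the type itself means every caller that serializes a `Series` gets the same behavior, and a single unit test on `Value` covers it.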
What Makes It Unique
- Native integration of Prometheus, ClickHouse, and cloud cost analytics into a single observability stack for cost-to-incident correlation
- Dynamic alerting engine that unifies PromQL, log patterns, and inspection results into a declarative, UI-managed system
- Granular RBAC with anonymous role support and bootstrap admin flow for secure zero-trust deployments
- Embedded cloud cost attribution engine that directly links infrastructure spend to performance anomalies
- Frontend label visualization system with rich, dependency-free rendering of Kubernetes and service metadata
- End-to-end traceability from incident detection to root cause analysis with deployment-aware alert suppression