highlight.io is an open-source, full-stack monitoring platform for developers who need a unified view of their application’s health. Where traditional tools silo error tracking, logs, and performance metrics, highlight.io combines session replay, error monitoring, logging, and distributed tracing in one system. It is built for teams that want to understand not just what went wrong but why, by seeing exactly what users did before an error occurred. With support for self-hosting and a growing suite of SDKs, it suits organizations that prioritize data privacy, customization, or cost control over hosted SaaS alternatives. The platform is written in TypeScript and Go, with a focus on ease of deployment and extensibility.
What You Get
- Session Replay - High-fidelity browser session replays powered by rrweb, capturing DOM changes, user interactions, console logs, and outgoing network requests so frontend bugs can be reproduced as the user experienced them.
- Error Monitoring - Automatic error collection from frontend and backend SDKs with customizable grouping rules and alerting, linked directly to associated sessions for context-driven debugging.
- Logging - Centralized server-side log ingestion with powerful search, automatic property extraction, and deep integration with session replay and error data for cross-cutting analysis.
- Distributed Tracing - End-to-end trace visibility across services with automatic property collection and embedded links to related logs, errors, and sessions to diagnose performance bottlenecks.
- Self-Hosted Deployment - Full control over your data with Docker-based deployment options for hobby and enterprise use, including a one-line install command and configurable resource requirements.
- SDK Support - Official SDKs for JavaScript, React, Node.js, Python, Go, and more, all documented in the GitHub sdk/ directory with clear integration paths.
- Integrations - Connect highlight.io to popular tools like Slack, PagerDuty, and GitHub via built-in integrations for alert routing and workflow automation.
Common Use Cases
- Building a multi-tenant SaaS dashboard with real-time user feedback - Use session replay to identify UX friction points in specific customer segments, then correlate with errors and logs to prioritize fixes that impact revenue-critical paths.
- Debugging intermittent frontend crashes in a high-traffic e-commerce platform - When users report checkout failures, replay their exact session to see the network requests, console errors, and DOM state leading up to the crash, which removes the guesswork from reproducing it.
- Debugging slow API responses across microservices - Use distributed traces to visualize latency hotspots, then drill into associated logs and errors to pinpoint faulty services or misconfigured timeouts.
- Managing hybrid cloud deployments as a DevOps team - Self-host highlight.io on-prem or in private clouds to maintain compliance, then unify monitoring data from Kubernetes, AWS, and Azure services into a single dashboard with real-time alerting.
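For the slow-API use case above, the idea of a "latency hotspot" can be illustrated with a small aggregation over span data. This is only a sketch: the `Span` shape and the `latencyHotspots` helper are hypothetical names, not part of highlight.io, and real trace data carries many more fields.

```typescript
// Span is a hypothetical, simplified shape used only for illustration.
interface Span {
  service: string;
  name: string;
  durationMs: number;
}

interface ServiceLatency {
  service: string;
  totalMs: number;
  slowestSpan: string;
}

// Groups spans by service and ranks services by the total time they account for.
// Durations are summed as-is, so nested spans would be double-counted;
// a real analysis would work with self-time instead.
function latencyHotspots(spans: Span[]): ServiceLatency[] {
  const byService: Record<string, { totalMs: number; slowest: Span }> = {};
  for (const span of spans) {
    const entry = byService[span.service];
    if (entry === undefined) {
      byService[span.service] = { totalMs: span.durationMs, slowest: span };
    } else {
      entry.totalMs += span.durationMs;
      if (span.durationMs > entry.slowest.durationMs) entry.slowest = span;
    }
  }
  return Object.keys(byService)
    .map((service) => ({
      service,
      totalMs: byService[service].totalMs,
      slowestSpan: byService[service].slowest.name,
    }))
    .sort((a, b) => b.totalMs - a.totalMs);
}

const spans: Span[] = [
  { service: "gateway", name: "GET /checkout", durationMs: 40 },
  { service: "payments", name: "POST /charge", durationMs: 900 },
  { service: "payments", name: "db.query", durationMs: 350 },
  { service: "inventory", name: "GET /stock", durationMs: 120 },
];

const hotspots = latencyHotspots(spans);
console.log(hotspots[0]); // the payments service ranks first in this sample
```

Once the slowest service and span are known, the logs and errors linked to that span are the natural next place to look.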
Under The Hood
The project is an observability platform built on a hybrid Go and TypeScript codebase, designed to support real-time log analysis, distributed tracing, and extensible alerting across diverse environments. Its modular design and cross-platform compatibility let it integrate with external monitoring tools and frontend dashboards.
Architecture
This system adopts a layered architecture that clearly separates backend services from frontend interfaces, promoting maintainability and scalability.
- The architecture uses a modular structure where telemetry and alerting components are decoupled, supporting flexible integration.
- Strategy and adapter design patterns are implemented in alert destination handling, allowing for easy addition of new notification channels.
- Middleware and API layers facilitate communication between backend systems and frontend UIs, enabling trace propagation across platforms.
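The strategy and adapter patterns mentioned for alert destinations can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `AlertPayload`, `AlertDestination`, `SlackDestination`, `WebhookDestination`, and `AlertRouter` are all hypothetical names, and the sketch returns formatted strings where a real system would send network requests.

```typescript
// All names here are hypothetical and are not taken from the highlight.io codebase.
interface AlertPayload {
  title: string;
  message: string;
  sessionUrl?: string; // optional link back to a related session
}

// Strategy interface: every notification channel implements the same contract.
interface AlertDestination {
  readonly name: string;
  format(alert: AlertPayload): string;
}

// Adapter for a Slack-style channel: renders the alert as marked-up text.
class SlackDestination implements AlertDestination {
  readonly name = "slack";
  format(alert: AlertPayload): string {
    const link = alert.sessionUrl ? ` <${alert.sessionUrl}|View session>` : "";
    return `*${alert.title}*: ${alert.message}${link}`;
  }
}

// Adapter for a generic webhook: renders the alert as a JSON body.
class WebhookDestination implements AlertDestination {
  readonly name = "webhook";
  format(alert: AlertPayload): string {
    return JSON.stringify(alert);
  }
}

// The router depends only on the interface, so adding a channel needs no change here.
class AlertRouter {
  private destinations: AlertDestination[] = [];

  register(destination: AlertDestination): void {
    this.destinations.push(destination);
  }

  // Returns the formatted output keyed by destination name.
  dispatch(alert: AlertPayload): Record<string, string> {
    const out: Record<string, string> = {};
    for (const destination of this.destinations) {
      out[destination.name] = destination.format(alert);
    }
    return out;
  }
}

const router = new AlertRouter();
router.register(new SlackDestination());
router.register(new WebhookDestination());

const dispatched = router.dispatch({
  title: "Error spike",
  message: "5xx rate above threshold",
});
console.log(dispatched);
```

Adding another channel in this sketch means writing one more class that implements `AlertDestination` and registering it; the router itself stays untouched.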
Tech Stack
The project leverages a polyglot tech stack combining TypeScript, Go, and modern web frameworks to deliver robust observability features.
- Built primarily in TypeScript with Go components, utilizing React for frontend and Express-based services for backend logic.
- Employs OpenTelemetry for distributed tracing, Apollo for GraphQL, and the @highlight-run packages for monitoring and session replay.
- Tools like Turbo, Yarn, ESLint, and Prettier support monorepo management and code consistency across the codebase.
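To make the distributed tracing item more concrete: OpenTelemetry commonly carries trace context between services in the W3C Trace Context `traceparent` header. The sketch below parses and re-emits that header without using the OpenTelemetry API. Whether each highlight.io SDK propagates context through this exact header is an assumption here, and `TraceContext`, `parseTraceparent`, `formatTraceparent`, and `childContext` are hypothetical helper names.

```typescript
// W3C Trace Context `traceparent` layout:
//   version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" trace-flags (2 hex)
// This sketch accepts only version "00" and lowercase hex.
interface TraceContext {
  traceId: string;
  parentId: string;
  sampled: boolean;
}

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (match === null) return null;
  const traceId = match[1];
  const parentId = match[2];
  const flags = match[3];
  // All-zero trace or parent IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // The lowest bit of trace-flags is the "sampled" flag.
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? "01" : "00"}`;
}

// A downstream call keeps the trace ID and uses the current span's ID as the new parent.
function childContext(incoming: TraceContext, newSpanId: string): TraceContext {
  return { traceId: incoming.traceId, parentId: newSpanId, sampled: incoming.sampled };
}

const incoming = parseTraceparent(
  "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
);
const outgoing =
  incoming !== null
    ? formatTraceparent(childContext(incoming, "b7ad6b7169203331"))
    : null;
console.log(outgoing);
// prints "00-4bf92f3577b34da6a3ce929d0e0e4736-b7ad6b7169203331-01"
```

Because the trace ID is unchanged across the hop, spans emitted by both services can be stitched into a single trace.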
Code Quality
Code quality is mixed: testing is structured in places, while generated artifacts show signs of technical debt.
- Tests span Cypress, Go’s native testing package, and unit and integration tests for core modules.
- Error handling is present in generated files but is applied inconsistently in core logic components.
- Code style is uneven: conventions are followed in places, but formatting and naming are inconsistent.
What Makes It Unique
What sets the system apart is its handling of log and trace processing, search query parsing, and UI scaffolding for observability tools.
- An extensible alerting system supports Discord, Slack, Teams, and Webhooks through modular integration points.
- A custom GraphQL backend with ClickHouse integration enables high-performance log and trace processing with migration support.
- ANTLR-based search query parsing turns user queries into structured filters over observability data, rather than relying on plain text matching.
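To show what structured query parsing buys over plain text matching, here is a small hand-written parser. It stands in for a generated ANTLR parser and does not reproduce highlight.io's grammar: the query syntax (`key:value`, `key:"quoted value"`, and bare words) is assumed for illustration, and `ParsedQuery` and `parseQuery` are hypothetical names.

```typescript
// Assumed syntax: whitespace-separated terms, each either key:value,
// key:"quoted value", or a bare word treated as free text.
interface ParsedQuery {
  filters: Record<string, string[]>; // key -> all values given for that key
  text: string[]; // free-text terms
}

function parseQuery(input: string): ParsedQuery {
  const result: ParsedQuery = { filters: {}, text: [] };
  // Alternatives are tried in order: quoted filter, plain filter, bare word.
  const termRe = /(\w+):"([^"]*)"|(\w+):(\S+)|(\S+)/g;
  let m: RegExpExecArray | null;
  while ((m = termRe.exec(input)) !== null) {
    const key = m[1] !== undefined ? m[1] : m[3];
    const value = m[2] !== undefined ? m[2] : m[4];
    if (key !== undefined && value !== undefined) {
      if (result.filters[key] === undefined) result.filters[key] = [];
      result.filters[key].push(value);
    } else if (m[5] !== undefined) {
      result.text.push(m[5]);
    }
  }
  return result;
}

const parsed = parseQuery('level:error service:"checkout api" timeout');
console.log(JSON.stringify(parsed));
// prints {"filters":{"level":["error"],"service":["checkout api"]},"text":["timeout"]}
```

A grammar-driven parser goes further than this regex approach, for example by handling operators and nesting and by reporting precise syntax errors, which is the usual reason to reach for a tool like ANTLR.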