Gatus is a lightweight, self-hosted monitoring tool designed for developers and DevOps teams to proactively track the health of critical services. Unlike traditional metrics-based monitors that rely on incoming traffic, Gatus actively probes endpoints using HTTP, ICMP, TCP, DNS, gRPC, SSH, and TLS to detect failures before users notice them. It provides a clean dashboard with real-time status updates and integrates with over 30 alerting platforms to notify teams via their preferred channels.
Built in Go, Gatus runs as a single binary or Docker container and supports Kubernetes, Helm, and Terraform deployments. It includes a REST API, customizable badges, metrics export, OIDC authentication, and dynamic configuration reloading, which makes it a good fit for infrastructure teams managing microservices, APIs, and cloud-native applications.
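To make the idea of active probing concrete, here is a minimal Go sketch of a single HTTP check that records the status code and response time. It is not Gatus's implementation: the names ProbeResult, probeHTTP, and Healthy, as well as the 2-second timeout and 300 ms latency budget, are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// Illustrative limits; neither value comes from Gatus itself.
const (
	probeTimeout = 2 * time.Second
	maxLatency   = 300 * time.Millisecond
)

// ProbeResult records what a single active check observed.
type ProbeResult struct {
	Status       int           // HTTP status code, 0 if the request failed
	ResponseTime time.Duration // time from sending the request to receiving the response headers
	Err          error         // transport-level failure, if any
}

// probeHTTP sends one GET request to url and measures how long it takes.
func probeHTTP(url string, timeout time.Duration) ProbeResult {
	client := &http.Client{Timeout: timeout}
	start := time.Now()
	resp, err := client.Get(url)
	elapsed := time.Since(start)
	if err != nil {
		return ProbeResult{ResponseTime: elapsed, Err: err}
	}
	defer resp.Body.Close()
	return ProbeResult{Status: resp.StatusCode, ResponseTime: elapsed}
}

// Healthy reports whether the probe returned 200 within the latency budget.
func (r ProbeResult) Healthy(budget time.Duration) bool {
	return r.Err == nil && r.Status == http.StatusOK && r.ResponseTime < budget
}

func main() {
	// A local stand-in for a monitored service, so the sketch runs offline.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	res := probeHTTP(srv.URL, probeTimeout)
	fmt.Printf("status=%d healthy=%t\n", res.Status, res.Healthy(maxLatency))
}
```

Because the probe originates from the monitor rather than from user traffic, a failing or slow service is noticed even when nobody is currently using it.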
What You Get
- Multi-protocol health checks - Monitor HTTP, ICMP, TCP, DNS, gRPC, SSH, STARTTLS, TLS, and WebSocket endpoints with customizable conditions on status code, response time, response body, certificate expiration, and IP address.
- Customizable conditions - Define conditions with placeholder-based expressions (e.g., "[STATUS] == 200", "[BODY].status == UP", "[RESPONSE_TIME] < 300") to validate service behavior beyond simple uptime.
- 30+ alerting integrations - Send alerts to Slack, PagerDuty, Discord, Twilio, Google Chat, Mattermost, Teams (Workflow), GitHub, GitLab, Datadog, Opsgenie, Sentry, Ntfy, Signal, Telegram, and more via native or custom webhooks.
- Real-time status dashboard - Interactive UI with dark mode, endpoint grouping, response time trends, and live health indicators for immediate visibility into service status.
- Status badges - Embed uptime and response time badges (SVG) in READMEs, wikis, or dashboards using public URLs like /api/v1/endpoints/{key}/uptimes/7d/badge.svg, where {key} identifies the endpoint by its group and name.
- API and metrics endpoint - Expose raw health data through the REST API under /api/v1/endpoints and Prometheus-compatible metrics at /metrics for integration with Grafana, Prometheus, or custom monitoring pipelines.
- OIDC and basic auth - Secure the dashboard with OpenID Connect (Auth0, Keycloak, etc.) or HTTP Basic Authentication for internal or private deployments.
- Dynamic config reloading - Gatus picks up changes to its configuration file automatically while running, so monitoring rules can be updated without restarting the service, including from CI/CD pipelines.
- Low resource footprint - Single binary written in Go with minimal memory and CPU usage, suitable for running on low-power servers or edge devices.
- Badge customization - Customize response time badge color thresholds (e.g., green below 200ms, yellow below 500ms, red at 500ms and above) to match your SLOs.
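The condition expressions listed above follow a "placeholder, operator, expected value" shape. The Go sketch below shows one way such a condition could be evaluated against observed values. It is a simplified illustration, not Gatus's evaluator: the evaluate function, the map of observed values, and the restriction to three whitespace-separated tokens are all assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// evaluate checks one "<placeholder> <operator> <value>" condition against
// the values observed during a check. Hypothetical helper for illustration.
func evaluate(condition string, observed map[string]string) (bool, error) {
	parts := strings.Fields(condition)
	if len(parts) != 3 {
		return false, fmt.Errorf("expected 3 tokens in %q, got %d", condition, len(parts))
	}
	placeholder, op, expected := parts[0], parts[1], parts[2]

	actual, found := observed[placeholder]
	if !found {
		return false, fmt.Errorf("unknown placeholder %s", placeholder)
	}

	// Equality operators compare the raw strings.
	switch op {
	case "==":
		return actual == expected, nil
	case "!=":
		return actual != expected, nil
	}

	// The remaining operators compare numerically.
	a, err := strconv.ParseFloat(actual, 64)
	if err != nil {
		return false, fmt.Errorf("observed value %q is not numeric", actual)
	}
	b, err := strconv.ParseFloat(expected, 64)
	if err != nil {
		return false, fmt.Errorf("expected value %q is not numeric", expected)
	}
	switch op {
	case "<":
		return a < b, nil
	case "<=":
		return a <= b, nil
	case ">":
		return a > b, nil
	case ">=":
		return a >= b, nil
	}
	return false, fmt.Errorf("unsupported operator %q", op)
}

func main() {
	// Values a check might have observed; response time is in milliseconds.
	observed := map[string]string{
		"[STATUS]":        "200",
		"[RESPONSE_TIME]": "142",
		"[BODY].status":   "UP",
	}
	conditions := []string{"[STATUS] == 200", "[BODY].status == UP", "[RESPONSE_TIME] < 300"}
	for _, c := range conditions {
		ok, err := evaluate(c, observed)
		fmt.Printf("%-24s ok=%t err=%v\n", c, ok, err)
	}
}
```

An endpoint would count as healthy only when every one of its conditions evaluates to true.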
Common Use Cases
- Monitoring microservices in Kubernetes - A DevOps engineer uses Gatus to check health endpoints of 20+ microservices, alerting on failed health checks via Slack before users experience downtime.
- Proactive API uptime tracking - A SaaS company monitors their public REST API with Gatus, triggering PagerDuty alerts when response time exceeds 500ms or status code is not 200.
- Internal tool health checks - A team monitors their internal CI/CD portal and database proxies using ICMP and TCP checks to ensure network connectivity and service availability.
- Developer UAT automation - A frontend developer configures Gatus to validate JSON response structure and status codes of their API endpoints as part of automated user acceptance testing.
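In alerting scenarios like the ones above, a single failed check is often too noisy to page someone. The Go sketch below assumes an alert should fire only after a number of consecutive failures and resolve after a number of consecutive successes. The AlertTracker type, its fields, and the threshold values are hypothetical and only illustrate the pattern.

```go
package main

import "fmt"

// AlertTracker debounces check results into alert events.
// Hypothetical type for illustration; thresholds count consecutive results.
type AlertTracker struct {
	FailureThreshold int // consecutive failures needed to trigger
	SuccessThreshold int // consecutive successes needed to resolve

	failures  int
	successes int
	triggered bool
}

// Observe records one check result and returns "TRIGGERED", "RESOLVED",
// or an empty string when no alert event should be sent.
func (t *AlertTracker) Observe(healthy bool) string {
	if healthy {
		t.failures = 0
		t.successes++
		if t.triggered && t.successes >= t.SuccessThreshold {
			t.triggered = false
			return "RESOLVED"
		}
		return ""
	}
	t.successes = 0
	t.failures++
	if !t.triggered && t.failures >= t.FailureThreshold {
		t.triggered = true
		return "TRIGGERED"
	}
	return ""
}

func main() {
	tracker := &AlertTracker{FailureThreshold: 3, SuccessThreshold: 2}
	results := []bool{true, false, false, false, false, true, true}
	for i, healthy := range results {
		if event := tracker.Observe(healthy); event != "" {
			fmt.Printf("check %d: %s\n", i+1, event)
		}
	}
	// Prints:
	// check 4: TRIGGERED
	// check 7: RESOLVED
}
```

The returned event is where a notification to Slack, PagerDuty, or another provider would be dispatched.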
Under The Hood
Architecture
- Clear separation of concerns through distinct, single-responsibility packages for configuration, HTTP routing, metrics, health monitoring, and storage
- Shared components wired through package-level initialization and singleton accessors, which keeps explicit coupling between packages low
- Service-layer pattern isolates HTTP controllers from monitoring and telemetry logic, ensuring stateless API design
- Modular storage abstraction enables pluggable backends (in-memory, SQLite, PostgreSQL) that are selected at runtime from configuration
- Configuration-driven workflow decouples endpoint definitions from UI and persistence layers, enabling flexible monitoring pipelines
- Clean frontend-backend separation with API endpoints as the sole interface, supporting independent development and deployment
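The storage abstraction described above can be pictured as an interface with interchangeable implementations chosen at startup. The following Go sketch is an assumption-laden illustration: the Store interface, the Result type, newStore, and the 100-entry history limit are invented here and do not mirror Gatus's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

// Result is a simplified record of one health check (hypothetical shape).
type Result struct {
	Endpoint string
	Success  bool
}

// Store is the abstraction the rest of the application would depend on.
type Store interface {
	Insert(r Result)
	History(endpoint string) []Result
}

// memoryStore keeps a bounded history per endpoint in process memory.
type memoryStore struct {
	mu      sync.RWMutex
	results map[string][]Result
	limit   int
}

func newMemoryStore(limit int) *memoryStore {
	return &memoryStore{results: make(map[string][]Result), limit: limit}
}

// Insert appends a result and drops the oldest entries beyond the limit.
func (s *memoryStore) Insert(r Result) {
	s.mu.Lock()
	defer s.mu.Unlock()
	history := append(s.results[r.Endpoint], r)
	if len(history) > s.limit {
		history = history[len(history)-s.limit:]
	}
	s.results[r.Endpoint] = history
}

// History returns a copy so callers cannot mutate the stored slice.
func (s *memoryStore) History(endpoint string) []Result {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]Result, len(s.results[endpoint]))
	copy(out, s.results[endpoint])
	return out
}

// newStore picks a backend from a configuration value at runtime.
// Only the in-memory backend is implemented in this sketch.
func newStore(kind string) (Store, error) {
	switch kind {
	case "", "memory":
		return newMemoryStore(100), nil
	default:
		return nil, fmt.Errorf("storage type %q is not implemented in this sketch", kind)
	}
}

func main() {
	store, err := newStore("memory")
	if err != nil {
		panic(err)
	}
	store.Insert(Result{Endpoint: "api", Success: true})
	store.Insert(Result{Endpoint: "api", Success: false})
	fmt.Println(len(store.History("api")), "results stored for api")
}
```

Because callers only see the Store interface, adding a SQL-backed implementation would not require changes to the monitoring or API code.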
Tech Stack
- Go backend leveraging standard library for HTTP, metrics, and concurrency, with minimal external dependencies
- Lightweight Docker deployment using scratch image and statically compiled binary with bundled certificates
- Makefile-driven development workflow for consistent builds, tests, and containerization
- Vue 3 frontend with Tailwind CSS and Lucide Vue Next for a responsive, component-based UI
- Configuration-driven state management using YAML and in-memory storage by default, so no external database is required
- CI/CD pipeline supported through Dockerized multi-stage builds and automated artifact generation
Code Quality
- Extensive test coverage with table-driven unit and integration tests, including edge cases and external provider mocks
- Robust error handling with domain-specific error types that enable precise classification and propagation
- Strong type safety through structured configs, pointer-based optionals, and runtime validation to enforce invariants
- Consistent naming and modular organization following Go idioms, with clear boundaries between alerting, config, and storage
- Comprehensive validation and mock-based testing for alerting integrations and JSON payload behavior
- Well-structured codebase with clear package boundaries and dependency isolation via interfaces
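As an illustration of domain-specific error types and table-driven checks in Go, the sketch below validates a made-up endpoint definition and classifies failures with errors.Is and errors.As. The names ErrMissingName, InvalidConditionError, validateEndpoint, and classify are hypothetical and are not taken from the Gatus codebase. In a real project the table would normally live in a _test.go file and use the testing package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrMissingName is a sentinel error for a fixed failure mode.
var ErrMissingName = errors.New("endpoint name must not be empty")

// InvalidConditionError carries the offending condition for diagnostics.
type InvalidConditionError struct {
	Condition string
}

func (e *InvalidConditionError) Error() string {
	return fmt.Sprintf("invalid condition %q", e.Condition)
}

// validateEndpoint performs simplified validation: a name is required and
// each condition must consist of exactly three whitespace-separated tokens.
func validateEndpoint(name string, conditions []string) error {
	if name == "" {
		return ErrMissingName
	}
	for _, c := range conditions {
		if len(strings.Fields(c)) != 3 {
			// %w wraps the typed error so callers can still detect it.
			return fmt.Errorf("endpoint %q: %w", name, &InvalidConditionError{Condition: c})
		}
	}
	return nil
}

// classify maps an error to a category without inspecting its message.
func classify(err error) string {
	var condErr *InvalidConditionError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrMissingName):
		return "missing-name"
	case errors.As(err, &condErr):
		return "bad-condition"
	default:
		return "unknown"
	}
}

func main() {
	// Table of cases: each row is one input and its expected classification.
	cases := []struct {
		name       string
		conditions []string
		want       string
	}{
		{"api", []string{"[STATUS] == 200"}, "ok"},
		{"", nil, "missing-name"},
		{"api", []string{"[STATUS] =="}, "bad-condition"},
	}
	for _, tc := range cases {
		got := classify(validateEndpoint(tc.name, tc.conditions))
		fmt.Printf("name=%q want=%s got=%s\n", tc.name, tc.want, got)
	}
}
```

Wrapping with %w preserves the typed error through added context, which is what allows classification to survive propagation across package boundaries.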
What Makes It Unique
- Native SVG badge generation for uptime and response metrics enables fully self-hosted, customizable status displays
- Dynamic configuration reloading allows real-time updates to monitoring rules without service interruption
- Built-in Prometheus metrics exposure at /metrics, enabled through configuration, for integration with existing observability stacks
- Unified storage abstraction decouples configuration parsing from persistence, enabling flexible backend support
- Client-side UI state persistence with localStorage and system-theme awareness enhances user experience across sessions
- Automatic status aggregation with config-defined defaults ensures UI remains functional during initial setup or data gaps
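Self-hosted badge generation comes down to choosing a color from thresholds and emitting a small SVG document. The Go sketch below uses the example thresholds mentioned earlier (green under 200 ms, yellow under 500 ms, red otherwise). The badgeColor and renderBadge functions and the badge layout are assumptions for illustration and do not reproduce Gatus's badge output.

```go
package main

import (
	"fmt"
	"html"
)

// threshold maps an exclusive upper bound in milliseconds to a color.
type threshold struct {
	underMs int
	color   string
}

// Example thresholds only; real deployments would take these from configuration.
var thresholds = []threshold{
	{underMs: 200, color: "green"},
	{underMs: 500, color: "yellow"},
}

// fallbackColor applies when the response time meets or exceeds every threshold.
const fallbackColor = "red"

// badgeColor returns the color of the first threshold the value falls under.
func badgeColor(responseTimeMs int) string {
	for _, t := range thresholds {
		if responseTimeMs < t.underMs {
			return t.color
		}
	}
	return fallbackColor
}

// renderBadge builds a minimal two-part SVG badge: a gray label on the left
// and the colored response time on the right.
func renderBadge(label string, responseTimeMs int) string {
	return fmt.Sprintf(
		`<svg xmlns="http://www.w3.org/2000/svg" width="170" height="20">`+
			`<rect width="100" height="20" fill="#555"/>`+
			`<rect x="100" width="70" height="20" fill="%s"/>`+
			`<text x="6" y="14" fill="#fff" font-family="sans-serif" font-size="11">%s</text>`+
			`<text x="106" y="14" fill="#fff" font-family="sans-serif" font-size="11">%dms</text>`+
			`</svg>`,
		badgeColor(responseTimeMs),
		html.EscapeString(label), // escape the label so it cannot break the markup
		responseTimeMs,
	)
}

func main() {
	for _, ms := range []int{120, 350, 900} {
		fmt.Printf("%4dms -> %s\n", ms, badgeColor(ms))
	}
	fmt.Println(renderBadge("response time", 120))
}
```

Serving the returned string with the image/svg+xml content type is enough for it to render in a README or wiki.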