Hatchet

A Postgres-backed orchestration engine for background tasks, AI agents, and durable workflows that replaces Redis queues and multi-datastore durable execution platforms with a single self-hostable service.

7.5K stars
441 forks
MIT License
Go

Hatchet is an orchestration engine for background tasks, AI agents, and durable workflows, built by Hatchet Technologies Inc. (Y Combinator W2024) as a Postgres-native alternative to Redis-backed task queues and multi-datastore durable execution platforms like Temporal. Instead of splitting the queue, the durability log, and the observability store across separate technologies, Hatchet uses Postgres for both the task runtime and the run/event history, which the maintainers say is what makes self-hosting comparatively simple.

The engine (cmd/hatchet-engine) runs as a set of internal services — a gRPC dispatcher, a scheduler, a ticker for cron and scheduled runs, an ingestor, an OLAP-style observability pipeline, and admin, health, and metrics endpoints — coordinated with an errgroup and instrumented with OpenTelemetry and Prometheus from the start. Task assignment goes through a dedicated scheduling package (pkg/scheduling/v1) that implements optimistic scheduling, per-tenant lease management, worker slots, and rate limiting, so a single Postgres-backed queue can still support priority, concurrency policies, and worker affinity at scale.
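The "coordinated with an errgroup" pattern can be sketched with the standard library alone. The block below is a minimal approximation, not Hatchet's code: it swaps golang.org/x/sync/errgroup for a sync.WaitGroup plus a cancellable context, and the `controller` type, `runAll`, and the simulated services are hypothetical names chosen for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// controller is a hypothetical stand-in for an engine service such as the
// dispatcher or ticker; run blocks until ctx is cancelled or it fails.
type controller struct {
	name string
	run  func(ctx context.Context) error
}

// runAll starts every controller in its own goroutine, cancels the shared
// context as soon as one returns a non-nil error, waits for all of them to
// exit, and returns the first error observed (errgroup-like behavior).
func runAll(parent context.Context, cs []controller) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, c := range cs {
		wg.Add(1)
		go func(c controller) {
			defer wg.Done()
			if err := c.run(ctx); err != nil {
				once.Do(func() {
					firstErr = fmt.Errorf("%s: %w", c.name, err)
					cancel() // tell the other controllers to shut down
				})
			}
		}(c)
	}
	wg.Wait()
	return firstErr
}

// untilCancelled simulates a healthy long-running service.
func untilCancelled(ctx context.Context) error {
	<-ctx.Done()
	return nil
}

// demo runs three simulated controllers, one of which fails shortly after start.
func demo() error {
	return runAll(context.Background(), []controller{
		{name: "dispatcher", run: untilCancelled},
		{name: "scheduler", run: untilCancelled},
		{name: "ticker", run: func(ctx context.Context) error {
			time.Sleep(10 * time.Millisecond)
			return errors.New("simulated failure")
		}},
	})
}

func main() {
	fmt.Println(demo())
}
```

In this sketch a failure in any one service brings the whole process down cleanly, which is the usual reason to group long-running services this way.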

Application code is written against official SDKs for Python, TypeScript, Go, and Ruby, all vendored in the same monorepo alongside dozens of runnable examples — DAGs, fanout, durable sleep, event waits, streaming, cron, and dedicated AI-agent patterns like a multi-step support-ticket triage workflow with human-in-the-loop replies. A React/Vite/Radix dashboard gives teams a real-time view of runs, workers, cron jobs, and rate limits, and a terminal UI shipped in hatchet-cli covers much of the same surface from the command line.

The project is MIT licensed with no enterprise-gated code paths in the open-source engine. Hatchet Cloud is offered as a hosted option with autoscaling, multi-region deployment, and SSO on top of the same engine, but self-hosters get the full dispatcher, scheduler, and dashboard for free.

What You Get

  • A Postgres-native durable task queue that doubles as the observability store — no separate Redis, Cassandra, or Elasticsearch cluster to operate.
  • A real-time web dashboard (React, Vite, Radix UI) for monitoring runs, workers, cron jobs, and rate limits as they happen.
  • Official SDKs for Python, TypeScript, Go, and Ruby, each shipped with dozens of runnable examples covering DAGs, fanout, durable sleep, and streaming.
  • A modular engine binary (hatchet-engine) composed of dispatcher, scheduler, ticker, ingestor, and OLAP services that can run as one process or scale independently.
  • Built-in OpenTelemetry tracing and Prometheus metrics wired into the engine and SDKs from the start, not bolted on later.
  • A terminal UI (hatchet-cli) for browsing runs, workers, cron jobs, and webhooks without leaving the shell.

Common Use Cases

  • Coordinating multi-step AI agent workflows that need to pause for tool calls or human replies and resume exactly where they left off.
  • Replacing a Redis-backed queue like Celery or BullMQ with a durable, Postgres-backed alternative that keeps full execution history.
  • Running document and data-processing pipelines as DAGs with fanout, concurrency limits, and per-key rate limiting.
  • Triggering workflows from upstream systems over authenticated webhooks instead of polling.
  • Adopting durable execution as a self-hosted, Postgres-only alternative to Temporal or DBOS.
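Per-key rate limiting, mentioned in the pipeline use case above, is commonly built on a token bucket kept separately for each key. The sketch below shows that generic technique only; it is not taken from Hatchet, and `keyedLimiter`, `allow`, and the tenant keys are hypothetical. It is also not safe for concurrent use without adding a mutex.

```go
package main

import (
	"fmt"
	"time"
)

// bucket holds the remaining tokens for one key and its last refill time.
type bucket struct {
	tokens float64
	last   time.Time
}

// keyedLimiter is a hypothetical per-key token bucket: each key may burst up
// to `burst` calls and refills at `perSecond` tokens per second.
type keyedLimiter struct {
	perSecond float64
	burst     float64
	buckets   map[string]*bucket
}

func newKeyedLimiter(perSecond, burst float64) *keyedLimiter {
	return &keyedLimiter{perSecond: perSecond, burst: burst, buckets: map[string]*bucket{}}
}

// allow reports whether a call for key may proceed at time now,
// consuming one token if it may.
func (l *keyedLimiter) allow(key string, now time.Time) bool {
	b, ok := l.buckets[key]
	if !ok {
		b = &bucket{tokens: l.burst, last: now} // new keys start with a full bucket
		l.buckets[key] = b
	}
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * l.perSecond
		if b.tokens > l.burst {
			b.tokens = l.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demoDecisions replays a fixed sequence of calls against explicit timestamps.
func demoDecisions() []bool {
	l := newKeyedLimiter(1, 2) // 1 token per second, burst of 2, per key
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return []bool{
		l.allow("tenant-a", t0),
		l.allow("tenant-a", t0),
		l.allow("tenant-a", t0),                  // bucket for tenant-a is empty
		l.allow("tenant-b", t0),                  // separate key, separate bucket
		l.allow("tenant-a", t0.Add(time.Second)), // one token has refilled
		l.allow("tenant-a", t0.Add(time.Second)),
	}
}

func main() {
	fmt.Println(demoDecisions())
}
```

Passing the timestamp in explicitly keeps the example deterministic; a production limiter would more likely read the clock itself and persist bucket state somewhere shared.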

Under The Hood

Architecture: Hatchet’s server side is a modular, service-oriented monolith. cmd/hatchet-engine/engine/run.go wires up independent controllers (dispatcher, gRPC service, scheduler, ticker, ingestor, an OLAP observability pipeline, retention, partition, health, and metrics) and runs them concurrently under a single errgroup, so the same binary can run as one process or have controllers split out for horizontal scaling. Data access is centralized behind a repository layer (pkg/repository, with sqlc-generated queries in pkg/repository/sqlcv1) that both the engine and the admin/API layers depend on, and task assignment is delegated to a dedicated scheduling package (pkg/scheduling/v1) with its own lease manager, worker-slot pool, and rate limiter rather than being embedded in the dispatcher. The Go client SDK that used to live in-repo under pkg/v1 is now marked deprecated in favor of a separate sdks/go module, which shows a monorepo mid-migration from a single generics-based client to independently versioned per-language SDKs. One coupling risk remains: the controllers that read assignment results sit directly on top of the scheduling/lease abstraction in pkg/scheduling/v1 rather than behind a narrower interface, so a change to its shape would ripple through every one of them.
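To make the idea of a worker-slot pool concrete, here is a minimal sketch of slot accounting: each worker advertises a number of slots, assignment reserves one, and completion returns it. The `slotPool` type, its methods, and the "most free slots first" policy are assumptions made for illustration and are not drawn from pkg/scheduling/v1.

```go
package main

import (
	"fmt"
	"sync"
)

// slotPool tracks how many free slots each worker has. The type and its
// methods are hypothetical and only illustrate slot accounting.
type slotPool struct {
	mu   sync.Mutex
	free map[string]int // worker ID -> free slots
}

func newSlotPool(capacity map[string]int) *slotPool {
	free := make(map[string]int, len(capacity))
	for id, n := range capacity {
		free[id] = n
	}
	return &slotPool{free: free}
}

// acquire reserves one slot on the worker with the most free slots,
// breaking ties by the smaller worker ID so the choice is deterministic.
// It returns false when every worker is full.
func (p *slotPool) acquire() (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()

	best, bestFree := "", 0
	for id, n := range p.free {
		if n > bestFree || (n == bestFree && n > 0 && id < best) {
			best, bestFree = id, n
		}
	}
	if bestFree == 0 {
		return "", false
	}
	p.free[best]--
	return best, true
}

// release returns one slot to a known worker after its task finishes.
func (p *slotPool) release(workerID string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, ok := p.free[workerID]; ok {
		p.free[workerID]++
	}
}

func main() {
	pool := newSlotPool(map[string]int{"worker-a": 2, "worker-b": 1})
	for i := 0; i < 4; i++ {
		id, ok := pool.acquire()
		fmt.Printf("assigned=%q ok=%v\n", id, ok)
	}
	pool.release("worker-a")
	id, ok := pool.acquire()
	fmt.Printf("after release: assigned=%q ok=%v\n", id, ok)
}
```

Keeping this accounting in its own type, separate from whatever delivers tasks to workers, mirrors the separation described above between the scheduling package and the dispatcher.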

Tech Stack: The engine is written in Go and serves HTTP via the Echo v4 framework (with OpenTelemetry’s otelecho middleware) alongside a separate gRPC service for worker connections, with request/response types generated from OpenAPI specs via oapi-codegen. Persistence is Postgres accessed through pgx v5 and sqlc-generated queries, fronted by PgBouncer for connection pooling in the local dev stack, with goose used for migrations; the message queue layer supports both a Postgres-native outbox and RabbitMQ as pluggable backends. The CLI and terminal UI use Cobra, Viper, and the Charm ecosystem (bubbletea, lipgloss, huh); observability runs through OpenTelemetry and Prometheus, and encryption uses Google’s Tink library with optional GCP KMS integration. The dashboard is a Vite + React app using Radix UI primitives, a typed query-key factory for data fetching, Monaco for embedded code editing, and Cypress for end-to-end tests, while a separate Next.js app serves the documentation site, with code snippets generated directly from the SDK examples by a small Python script.

Code Quality: Both sides of the monorepo are heavily tested. The core engine has a large set of Go test files (repository logic, scheduler, lease manager, rate limiter), and the Python, TypeScript, and Ruby SDKs carry a comparable volume of tests, with real Postgres and RabbitMQ containers spun up for integration tests rather than mocked out. Error handling favors explicit, wrapped errors over silent failures, and Go code is gated by a comprehensive linter configuration covering static analysis, security scanning, and style/correctness checks, enforced in both pre-commit hooks and CI. The repository runs a large matrix of separate CI workflows (per-language SDK test suites, CLI end-to-end tests, a vulnerability scanner, spelling checks, and enforced conventional commits), indicating a mature, actively gated pipeline rather than a single catch-all build script.

What Makes It Unique: Hatchet’s core technical bet is using Postgres as the single source of truth for both task durability and the observability/monitoring system, rather than pairing a broker with a separate durability store (as some durable-execution platforms do) or forgoing durability entirely (as traditional Redis-backed task queues do). The scheduling package implements this with optimistic scheduling, per-tenant lease management, and worker-slot accounting to keep a single Postgres-backed queue competitive on throughput while still supporting DAGs, durable sleeps, event-based waits, worker affinity, and dynamic rate limits as first-class primitives. None of these primitives is individually unique (Postgres-backed queues and durable-execution engines both exist elsewhere), but combining full workflow/DAG semantics, built-in tracing and metrics, and official SDKs across four languages on a single-datastore architecture is a genuine simplification of the usual multi-system stack rather than a novel algorithm.

Self-Hosting

Licensing Model: MIT licensed. The engine, dashboard, CLI, and all client SDKs are open source with no license keys or feature gates required for self-hosting.

Self-Hosting Restrictions: None found in the source. The repository has no ee/, enterprise/, or pro/ directory and no license-check code paths; every controller (dispatcher, scheduler, ticker, OLAP, admin) ships in the open-source engine binary.

Enterprise Features: Not applicable. There is no separate paid, source-available tier; all engine functionality documented in the README is available in the OSS release.

Cloud vs Self-Hosted: Hatchet Cloud, the hosted offering, adds operational conveniences on top of the same open-source engine: autoscaling and pay-as-you-go pricing, multi-region deployment, and SSO. The changelog also references a Scale-plan audit-logs API and tenant-scoped Prometheus metrics as Cloud-only additions. These are hosting-layer conveniences, not gated OSS functionality.

License Key Required: No.
