Cozystack is a free, open-source Platform-as-a-Service (PaaS) framework that turns bare metal servers into a fully functional cloud environment. It targets infrastructure teams, DevOps engineers, and organizations that want to build private or public clouds on proven open-source technologies without vendor lock-in. Cozystack reduces the complexity of managing heterogeneous cloud resources by unifying Kubernetes, KubeVirt, Talos, and Cluster API under a single REST API.
Built on cloud-native open-source projects, many of them hosted by the CNCF, Cozystack uses Kubernetes as its core control plane, with declarative YAML manifests driving the provisioning of VMs, databases, load balancers, and caching layers. It is deployed with talos-bootstrap (PXE or ISO), integrates with FluxCD for package management, and provides both a REST API and a built-in UI. The platform is designed for self-hosted environments and supports scalable, low-overhead cloud infrastructure with tenant isolation.
What You Get
- REST API for Cloud Provisioning - Exposes a standard Kubernetes-native REST API to programmatically spawn Kubernetes clusters, virtual machines (via KubeVirt), databases, load balancers, and HTTP caching services using YAML manifests.
- Talos-based Bare Metal Bootstrapping - Uses Talos Linux via talos-bootstrap to install and provision Cozystack on bare metal servers using PXE or ISO, ensuring immutable, consistent OS images.
- KubeVirt Integration - Enables full virtual machine management alongside Kubernetes clusters, allowing VMs to be provisioned, scaled, and monitored using Kubernetes primitives.
- Built-in Monitoring & Alerts - Automatically deploys pre-configured dashboards and alerting rules for every service instance, with support for tenant-specific or centralized monitoring hubs.
- FluxCD-Powered Package Management - All platform components are delivered as YAML packages managed by FluxCD, enabling GitOps-based deployment, versioning, and rollback of cloud services.
- Multi-Tenant Resource Isolation - Implements a unique tenant model that efficiently allocates control plane resources while maintaining security and cost-efficiency across isolated user environments.
- Web UI for App Management - Provides a visual dashboard to deploy and manage applications, making the platform accessible to users who prefer a GUI to the CLI or API.
- Standardized, Non-Vendor-Locked Architecture - Built on industry-standard tools (Kubernetes, Cluster API, KubeVirt, Talos) to avoid proprietary lock-in and ensure interoperability.
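The declarative provisioning described in the list above can be sketched as follows. This is a minimal illustration, not Cozystack's actual schema: the API group `apps.example.com/v1alpha1`, the `Postgres` kind, the `tenant-demo` namespace, and the `replicas` and `size` fields are all hypothetical placeholders. The sketch uses only Python's standard library and emits JSON, which the Kubernetes API accepts alongside YAML.

```python
import json


def build_manifest(kind, name, namespace, spec):
    """Return a Kubernetes-style object as a plain dict."""
    if not name or not namespace:
        raise ValueError("name and namespace are required")
    return {
        # Hypothetical API group and version, used only for illustration.
        "apiVersion": "apps.example.com/v1alpha1",
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": dict(spec),
    }


# Hypothetical managed-database request for a single tenant.
manifest = build_manifest(
    kind="Postgres",
    name="orders-db",
    namespace="tenant-demo",
    spec={"replicas": 2, "size": "10Gi"},
)

# Serialize deterministically so the output is easy to diff in Git.
rendered = json.dumps(manifest, indent=2, sort_keys=True)
print(rendered)
```

A file produced this way could be submitted with `kubectl apply -f`, assuming the cluster serves a resource type that matches the manifest.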
Common Use Cases
- Building a private cloud for internal teams - An enterprise IT team uses Cozystack to provision Kubernetes clusters and VMs on bare metal servers, enabling developers to self-service environments without relying on public cloud vendors.
- Launching a public cloud service - A hosting provider uses Cozystack as a backend to offer managed Kubernetes and VM instances to customers, integrating billing through the REST API and relying on tenant isolation to keep customers separated.
- Deploying Kubernetes on bare metal at the edge - A manufacturing firm deploys Cozystack on on-site server racks to run Kubernetes workloads with low latency, using Talos for immutable, secure edge node management.
- Creating a development environment platform - A startup uses Cozystack to give engineers instant access to isolated Kubernetes clusters and databases for testing, reducing onboarding time and infrastructure costs.
Under The Hood
Architecture
- Modular monorepo structure enforces clear domain boundaries between control plane components and application-specific packages, promoting separation of concerns
- Operator pattern implemented via custom controllers using Kubernetes controller-runtime, managing declarative CRDs for databases, backups, and lineage tracking
- Configuration injected through Helm values and Kustomize overlays, enabling environment-specific settings without code changes
- API-driven control plane with OpenAPI-generated specs and versioned CRDs ensures consistent contract enforcement between components
- Build pipelines integrated with Makefiles and pre-commit hooks automate CRD generation, manifest assembly, and asset bundling for operational consistency
- Admission webhooks and Helm templating enforce cluster policies and dynamic configuration, embodying infrastructure-as-code principles
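The operator pattern mentioned above rests on a reconcile loop: compare the declared state with the observed state and act on the difference. The following is a toy sketch of that idea in plain Python, not the project's Go controllers; `DatabaseSpec`, `reconcile`, and `apply_actions` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseSpec:
    """Hypothetical desired state of one database instance."""
    replicas: int
    version: str


def reconcile(desired, observed):
    """Return the (verb, name) actions needed to move observed toward desired."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    # Anything observed but no longer declared is removed.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


def apply_actions(actions, desired, observed):
    """Simulate executing the actions and return the new observed state."""
    result = dict(observed)
    for verb, name in actions:
        if verb == "delete":
            result.pop(name, None)
        else:
            result[name] = desired[name]
    return result


desired = {"db-a": DatabaseSpec(3, "16"), "db-b": DatabaseSpec(1, "16")}
observed = {"db-a": DatabaseSpec(2, "16"), "db-old": DatabaseSpec(1, "15")}

actions = reconcile(desired, observed)
print(actions)

# A second pass after applying the actions finds nothing left to do.
converged = apply_actions(actions, desired, observed)
print(reconcile(desired, converged))
```

The second call returning an empty list shows the property that makes this style robust: running the loop again on a converged system is a no-op.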
Tech Stack
- Go-based control plane leveraging Kubernetes controller-runtime and kubebuilder for custom operator development
- Helm charts with variant support for Talos, generic K8s, and hosted environments enable flexible, environment-aware deployments
- Kubernetes-native infrastructure stack including Cilium, Kube-OVN, LINSTOR, MetalLB, Multus, and Kamaji for networking, storage, and hosted control planes
- Custom tooling pipeline using yq, helm, and go generate to automate manifest generation, OpenAPI spec extraction, and CRD compilation
- Multi-platform Go binaries built via Makefile for the cozypkg CLI, supporting Linux, macOS (Darwin), and Windows
- Pre-commit hooks ensure code generation consistency and prevent configuration drift across distributed components
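A multi-platform Go build like the one described for the cozypkg CLI usually comes down to iterating over `GOOS` and `GOARCH` pairs. The sketch below generates such a command matrix. `GOOS` and `GOARCH` are real Go environment variables, but the architecture list, the `dist/` output layout, and the `./cmd/cozypkg` package path are assumptions, not taken from the project's Makefile.

```python
from itertools import product

GOOS = ["linux", "darwin", "windows"]  # operating systems named in the text
GOARCH = ["amd64", "arm64"]            # assumed target architectures


def build_commands(binary, package):
    """Return one cross-compilation command per OS/architecture pair."""
    commands = []
    for goos, goarch in product(GOOS, GOARCH):
        # Windows executables conventionally carry an .exe suffix.
        suffix = ".exe" if goos == "windows" else ""
        output = f"dist/{binary}-{goos}-{goarch}{suffix}"
        commands.append(
            f"GOOS={goos} GOARCH={goarch} go build -o {output} {package}"
        )
    return commands


# Hypothetical package path; the real repository layout may differ.
commands = build_commands("cozypkg", "./cmd/cozypkg")
for command in commands:
    print(command)
```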
Code Quality
- Extensive testing with Helm unittest and Go test frameworks validates template rendering, resource structure, and conditional logic across components
- Clear separation of operator logic, manifest utilities, and test suites promotes maintainability and modular development
- Robust error handling through explicit validation in test assertions and structured Kubernetes client interactions
- Consistent naming conventions aligned with Kubernetes and Helm standards enhance readability and reduce cognitive load
- Strong type safety enforced via Go’s type system and Kubernetes client libraries, with runtime verification of object structure
- No explicit linter configuration is visible; the rigor of linting and validation can only be inferred from the edge-case test coverage
What Makes It Unique
- Native CRD-based architecture for database and backup orchestration enables fully declarative, operator-driven lifecycle management without external dependencies
- Unified backup framework with pluggable drivers and raw extension support allows application-agnostic snapshots across diverse workloads
- Deep integration of Restic encryption and S3 backup strategies directly into database CRDs eliminates the need for separate backup operators
- Automated resource preset system with validated enums ensures consistent, production-grade resource allocation across clusters
- Application-aware backup status tracking with typed references and metadata enables precise recovery of complex stateful applications
- Built-in validation and code generation pipelines enforce API contract integrity, reducing configuration drift and operational errors
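The resource preset idea above, a closed set of named sizes that is validated before use, can be sketched as follows. The preset names and the CPU and memory values are hypothetical and chosen only for illustration; they are not the values the platform ships.

```python
from enum import Enum


class Preset(str, Enum):
    """Hypothetical closed set of allowed preset names."""
    NANO = "nano"
    SMALL = "small"
    LARGE = "large"


# Illustrative resource values keyed by preset.
PRESETS = {
    Preset.NANO: {"cpu": "250m", "memory": "128Mi"},
    Preset.SMALL: {"cpu": "1", "memory": "512Mi"},
    Preset.LARGE: {"cpu": "4", "memory": "2Gi"},
}


def resolve_preset(name):
    """Map a preset name to resource requests, rejecting unknown names."""
    try:
        preset = Preset(name)
    except ValueError:
        allowed = ", ".join(p.value for p in Preset)
        raise ValueError(f"unknown preset {name!r}; allowed: {allowed}") from None
    # Return a copy so callers cannot mutate the shared table.
    return dict(PRESETS[preset])


print(resolve_preset("small"))

try:
    resolve_preset("huge")
except ValueError as err:
    print(err)
```

Rejecting unknown names at the boundary keeps free-form resource values out of the declared state, which is the consistency benefit the bullet describes.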