Clarity AI is an open-source AI-powered image upscaler designed for developers, digital artists, and photographers who need to enhance low-resolution images with photorealistic detail. It solves the problem of pixelation and loss of detail in image enlargement by leveraging advanced diffusion models like Stable Diffusion and Flux. Built on Cog and compatible with ComfyUI and AUTOMATIC1111 WebUI, it supports custom checkpoints, LoRAs, and multi-step upscaling for professional-grade results.
Clarity AI is available through several deployment options: a web app (ClarityAI.co), a REST API, ComfyUI nodes, and Stable Diffusion WebUI plugins. It runs its models in Python, outputs PNG, JPG, or WebP, and offers advanced features such as fractality, pattern upscaling, and pre-downscaling to preserve image integrity during enlargement.
What You Get
- 13Kx13K Resolution Upscaling - Capable of enhancing images to ultra-high resolutions up to 13,000x13,000 pixels while preserving fine details and textures.
- Flux Upscaling Support - Specialized upscaling engine optimized for faces, text, and art using Flux LoRAs, available via ClarityAI.co/flux-upscaler.
- LoRA Model Support - Allows users to apply custom LoRA models for style control, facial enhancement, and artistic rendering during upscaling.
- Custom Safetensors Checkpoints - Users can load and use their own Stable Diffusion model checkpoints (.safetensors) for personalized upscaling results.
- Multi-Step Upscaling - Enables progressive enhancement through multiple upscaling stages to reduce artifacts and improve detail fidelity.
- ComfyUI Node Integration - Official ComfyUI node with API key support for seamless integration into existing Stable Diffusion workflows.
- Pre-Downscaling - Reduces input image size before upscaling to improve model performance and reduce noise amplification.
- Fractality Enhancement - Applies fractal-based detail generation to create natural-looking textures and patterns in enlarged images.
- Pattern Upscaling - Detects and enhances repeating patterns in images (e.g., fabrics, wallpapers) without distortion.
- Sharpen Image Filter - Post-processing sharpening to enhance edge clarity and fine details after upscaling.
- WebP/JPG/PNG Output Formats - Flexible output options with support for lossless and compressed formats for web and print use.
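To make the pre-downscaling and multi-step ideas concrete, here is a minimal sketch of the size arithmetic only. It is not Clarity AI's implementation: the function names, the `max_side` limit, and the 2x per-stage cap are assumptions chosen for illustration.

```python
def pre_downscale_size(width, height, max_side):
    """Shrink (width, height) so the longer side is at most max_side.

    Hypothetical helper: the aspect ratio is kept, and images that are
    already small enough are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    ratio = max_side / longest
    return round(width * ratio), round(height * ratio)


def plan_stage_sizes(width, height, total_scale, max_step=2.0):
    """Split one large upscale into equal stages of at most max_step each.

    Returns the (width, height) reached after every stage. Sizes are always
    derived from the original dimensions to avoid accumulating rounding error.
    """
    if total_scale < 1 or max_step <= 1:
        raise ValueError("total_scale must be >= 1 and max_step must be > 1")

    # Smallest number of stages whose combined factor reaches total_scale.
    stages = 1
    while max_step ** stages < total_scale - 1e-9:
        stages += 1

    per_stage = total_scale ** (1.0 / stages)
    sizes = []
    for i in range(1, stages + 1):
        # Use the exact total on the last stage so the final size is precise.
        factor = total_scale if i == stages else per_stage ** i
        sizes.append((round(width * factor), round(height * factor)))
    return sizes


if __name__ == "__main__":
    w, h = pre_downscale_size(3000, 2000, max_side=1500)
    print((w, h), plan_stage_sizes(w, h, total_scale=4))
```

Each planned size would then be handed to the actual upscaling model; keeping every stage at a modest factor is the usual motivation for doing the enlargement progressively.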
Common Use Cases
- Restoring old or low-res photos - A family historian uses Clarity AI to upscale and enhance faded 1980s family photos to 8K resolution for printing and digital archives.
- Creating high-res art prints - A digital artist upscales their 1024x1024 digital paintings to 13K resolution for gallery-quality prints using custom LoRAs for stylistic consistency.
- Enhancing product images for e-commerce - An e-commerce seller uses the API to automatically upscale low-res supplier photos to the 3000px size they target for Amazon listings.
- AI-assisted photo restoration - A photographer uses pre-downscaling and fractality to clean up and upscale grainy concert photos while preserving lighting and texture.
- Integrating upscaling into automated workflows - A developer embeds Clarity AI’s API into a content management system to auto-upscale user-uploaded images before publishing.
- Anime and illustration enhancement - An anime fan uses the dedicated anime upscaling mode to upscale fan art and manga panels to 4K for digital displays.
Under The Hood
Architecture
- Built as a modular extension of Stable Diffusion WebUI, integrating ControlNet, ADetailer, and AnimateDiff as configurable plugins via JSON-based orchestration, enabling flexible pipeline composition
- Employs a clean pipeline pattern with distinct components for feature extraction (VQModelInterface) and upscaling (UpsampleOneStep), decoupling core logic from implementation details
- Uses configuration-driven dependency declaration and environment isolation instead of explicit dependency injection, maintaining clean boundaries between components
- Implements a stateful cache system with metadata-driven asset resolution, separating model loading from inference execution
- Relies on a composite architecture that integrates third-party tools like GFPGAN and RealESRGAN as external modules, avoiding monolithic design
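As a rough illustration of JSON-based orchestration, the sketch below assembles a request payload in which each extension is switched on or off by configuration. It assumes the `alwayson_scripts` convention used by the AUTOMATIC1111 WebUI API; the function name and every extension argument value are placeholders, not Clarity AI's real configuration.

```python
import json


def build_payload(base, extensions):
    """Merge per-extension settings into a WebUI-style request payload.

    `extensions` maps an extension name to its argument list, or to None
    when that extension should be left out of the pipeline.
    """
    payload = dict(base)  # shallow copy so the caller's dict is not mutated
    scripts = {}
    for name, args in extensions.items():
        if args is None:
            continue  # extension disabled in this configuration
        scripts[name] = {"args": list(args)}
    if scripts:
        payload["alwayson_scripts"] = scripts
    return payload


if __name__ == "__main__":
    payload = build_payload(
        {"denoising_strength": 0.35},            # illustrative value
        {
            "ControlNet": [{"enabled": True}],   # placeholder arguments
            "ADetailer": None,                   # switched off
        },
    )
    print(json.dumps(payload, indent=2))
```

Because the pipeline is described as data, adding or removing a stage is a configuration change and does not touch the calling code.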
Tech Stack
- Powered by Python 3.10 with PyTorch and CUDA-accelerated inference, enhanced by xformers for attention optimization and RealESRGAN for high-fidelity upscaling
- Built atop Stable Diffusion WebUI with deep API-level integration of extensions, using JSON payloads to dynamically control advanced features like Refiner and Hires. fix
- Deployed via Cog container with preconfigured GPU support, ensuring reproducible inference across environments
- Frontend state is managed through JSON-based configuration, enabling dynamic parameterization of upscaling workflows
- Enforces code quality with Ruff and ESLint across Python and JavaScript, and uses pytest for API endpoint testing
- Manages models through safetensors metadata and SHA-256 hashing, with structured support for LoRA and checkpoint files
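The model-management point can be sketched with the standard library alone. The example below hashes a model file with SHA-256 and reads the optional `__metadata__` entry from a safetensors header (an 8-byte little-endian length followed by a JSON header). The function names are hypothetical, and this is a sketch under those assumptions, not the project's own code.

```python
import hashlib
import json
import struct


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte checkpoints fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_safetensors_metadata(path):
    """Return the `__metadata__` dict from a safetensors header, or {}."""
    with open(path, "rb") as f:
        raw_length = f.read(8)
        if len(raw_length) != 8:
            raise ValueError("file too short to be a safetensors file")
        (header_length,) = struct.unpack("<Q", raw_length)
        header = json.loads(f.read(header_length).decode("utf-8"))
    return header.get("__metadata__", {})
```

A hash computed this way can serve as a stable identifier for a checkpoint or LoRA file, independent of its filename.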
Code Quality
- Features a comprehensive test suite covering core API endpoints with reusable fixtures and parameterized inputs
- Lacks custom error handling, structured logging, or meaningful assertions on output quality, resulting in opaque failure modes
- Test structure follows DRY principles with clear separation of concerns between image encoding and API testing
- Consistent naming conventions improve readability
- Absence of type annotations and static type checking reduces long-term maintainability and increases the risk of runtime errors
What Makes It Unique
- Dynamically generates API schemas from internal processing classes, eliminating manual schema maintenance and keeping the API in step with the processing code
- Implements intelligent base64 image handling with built-in URL validation and local request blocking for secure ingestion
- Uses script name-to-index resolution with dynamic error mapping to integrate custom modules without hardcoding
- Auto-generates Pydantic models from class annotations, reducing boilerplate and preventing API-UI drift
- Integrates Gradio and SD modules through metadata extraction rather than hardcoded endpoints, enabling extensibility without redesign
- Preserves EXIF metadata and enforces network safety policies at the image decoding layer for context-aware upscaling
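The image-ingestion safeguards can be pictured with a short sketch. It accepts either a URL or base64 data and rejects URLs that point at loopback, private, or link-local addresses. The function names and the exact blocking policy are assumptions made for illustration; the project's own checks may differ.

```python
import base64
import ipaddress
from urllib.parse import urlparse


def is_blocked_host(host):
    """Return True for hosts that should never be fetched from a server."""
    if host.lower() == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: a production guard would also resolve it and
        # check the resolved address, which this sketch does not do.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


def classify_image_input(value):
    """Return ("url", url) or ("base64", raw_bytes); raise ValueError if unsafe."""
    if value.startswith(("http://", "https://")):
        host = urlparse(value).hostname
        if not host or is_blocked_host(host):
            raise ValueError("refusing to fetch from a local or private address")
        return "url", value

    if value.startswith("data:image/"):
        # Strip a data-URI prefix such as "data:image/png;base64,".
        _, separator, value = value.partition(";base64,")
        if not separator:
            raise ValueError("data URI is not base64-encoded")

    # validate=True rejects characters outside the base64 alphabet;
    # the resulting binascii.Error is a subclass of ValueError.
    return "base64", base64.b64decode(value, validate=True)
```

Doing the check at the point where the input is decoded means every endpoint that accepts an image inherits the same policy.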