Jaaz bills itself as the world’s first open-source multimodal creative assistant, a privacy-first alternative to commercial tools like Canva and Manus. Users create images, videos, and storyboards on a visual canvas where the AI interprets sketches, arrows, and simple instructions, so complex prompts are not required. Built for designers, content creators, and teams who need creative control without surrendering data to cloud services, Jaaz supports both local execution (via ComfyUI and Ollama) and hybrid cloud model integration. The tool prioritizes data sovereignty: processing runs on your machine by default, and enterprise-grade private deployment options are available for teams with compliance and security requirements.
Unlike traditional AI image generators that rely on text prompts, Jaaz introduces ‘Magic Canvas’ and ‘Magic Video’ features that allow users to draw, annotate, and arrange elements visually. The AI interprets these actions in real time, enabling a tactile, iterative workflow similar to working with physical materials. This makes it accessible to non-technical users while still offering powerful control for professionals who need full ownership over their creative assets.
What You Get
- Magic Canvas & Magic Video - Create images and videos by sketching, drawing arrows, or placing elements on an infinite canvas; the AI interprets these visual cues without requiring text prompts. Ideal for rapid ideation and non-linear creative workflows.
- One-Prompt Image & Video Generation - Generate high-quality images or videos from a single natural language prompt, with auto-optimized outputs and multi-turn refinement capabilities. Supports GPT-4o, Midjourney, VEO3, Kling, and Seedance via API integration.
- Infinite Canvas & Visual Storyboarding - Design multi-scene visual narratives with unlimited workspace. Link layouts, manage media assets visually, and maintain context across frames—perfect for video production and advertising storyboards.
- Smart AI Agent System - Chat with an integrated agent to insert objects, apply styles, or adjust composition. The system maintains multi-character coherence across scenes and works with both local models (ComfyUI) and cloud APIs.
- Flexible Local & Hybrid Deployment - Run completely offline using Ollama or local ComfyUI instances. Pre-built media and prompt libraries are included. Supports Windows and macOS with no cloud dependency.
- Privacy & Data Ownership - All processing occurs locally by default. No user data is uploaded or tracked. Open-source code ensures transparency and compliance for commercial use.
Common Use Cases
- Building a multi-scene visual campaign for social media - A marketer sketches a character and draws arrows to place them in front of landmarks like the Eiffel Tower, Taj Mahal, and Statue of Liberty; Jaaz generates six harmoniously lit, high-quality 9:16 images for Instagram Reels.
- Creating viral short-form videos from a single idea - A content creator types ‘a cat riding a skateboard through Tokyo at night’ and then draws motion lines on the canvas; Jaaz generates a 10-second video with dynamic camera movement and lighting transitions.
- Designing brand assets without relying on SaaS tools - A small design team uses Jaaz’s local install to generate logos, product mockups, and social graphics without uploading any assets to third-party servers, which simplifies GDPR compliance.
- DevOps teams managing AI creative workflows - Teams deploy Jaaz on-premises using Docker or source code to unify image/video generation pipelines, integrating with existing ComfyUI deployments and controlling access via internal authentication.
Under The Hood
The project is a cross-platform desktop application that combines an Electron shell, a React and TypeScript frontend, Python backend services, and integrations with AI tools such as ComfyUI. It gives developers a single interactive interface for managing and interacting with complex AI workflows.
Architecture
This project adopts a monolithic architecture with clear module boundaries, separating desktop logic, UI components, and backend services. It leverages design patterns such as dependency injection and event-driven communication to manage interactions across layers.
- Modular organization supports feature development with well-defined UI, API, and Electron-specific components
- Event-driven communication keeps the frontend and backend services loosely coupled
- Cross-platform deployment is supported through Electron’s abstraction layer
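The event-driven communication described above can be sketched in a few lines. This is a minimal, in-process illustration only, not Jaaz's actual implementation: the `EventBus` class, the `generation.finished` event name, and the payload fields are all hypothetical, and a real Electron plus Python setup would cross a process boundary rather than call handlers directly.

```python
from collections import defaultdict
from typing import Any, Callable

# A handler receives the event payload and returns nothing.
Handler = Callable[[dict[str, Any]], None]


class EventBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        """Register a handler to be called whenever `event` is published."""
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: dict[str, Any]) -> int:
        """Deliver payload to every handler for `event`; return how many ran."""
        handlers = list(self._handlers.get(event, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
received: list[dict[str, Any]] = []

# Hypothetical UI-layer listener that records finished generations.
bus.subscribe("generation.finished", received.append)

# Hypothetical backend service announcing that an image is ready.
delivered = bus.publish("generation.finished", {"asset_id": "img-1", "kind": "image"})
```

The publisher never references the listener directly, which is the property that lets UI and backend modules evolve separately inside a single codebase.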
Tech Stack
Built using a modern web stack, the application integrates Electron for desktop functionality, React and TypeScript for UI, and Python for backend automation and tooling.
- TypeScript and React form the core frontend with Zod, Zustand, and Lucide React as key libraries
- Python backend services are integrated with Node.js ecosystem tools for system operations and API handling
- Vite, Electron Builder, and npm tooling support efficient building and packaging workflows
- Vitest is used for unit and integration testing across core components
Code Quality
Code quality is mixed: testability and error handling receive consistent attention, but some technical debt remains.
- Error handling is applied through try/catch blocks and structured fallbacks across components
- Type safety is enforced with TypeScript and Pydantic models, improving runtime reliability
- Linting and CI/CD pipelines are in place to support code consistency and automated checks
- Testing is present but not comprehensive, covering core components with room for expansion
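One way to picture the "try/catch with structured fallbacks" pattern mentioned above is a provider chain that returns a typed result instead of raising. The sketch below is an assumption about how such a pattern could look, not code from the project: `GenerationResult`, `generate_with_fallback`, and the two provider functions are hypothetical names, and a standard-library dataclass stands in for a Pydantic model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class GenerationResult:
    """Structured outcome so callers never have to catch provider errors."""
    ok: bool
    provider: Optional[str] = None
    output: Optional[str] = None
    error: Optional[str] = None


Provider = Callable[[str], str]


def generate_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> GenerationResult:
    """Try each provider in order and return the first success."""
    errors: list[str] = []
    for name, run in providers:
        try:
            return GenerationResult(ok=True, provider=name, output=run(prompt))
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    return GenerationResult(ok=False, error="; ".join(errors) or "no providers configured")


def local_provider(prompt: str) -> str:
    # Hypothetical local backend that is currently unavailable.
    raise ConnectionError("local backend not reachable")


def cloud_provider(prompt: str) -> str:
    # Hypothetical cloud backend that succeeds.
    return f"generated:{prompt}"


result = generate_with_fallback(
    "a red bicycle", [("local", local_provider), ("cloud", cloud_provider)]
)
```

Collecting the per-provider error messages keeps the failure path debuggable even though no exception reaches the caller.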
What Makes It Unique
This project distinguishes itself through its integration of desktop, React, and Python environments to streamline AI workflow management.
- Real-time control and configuration of ComfyUI workflows through a unified desktop interface
- Custom abstraction layers that simplify complex AI tooling setup and model management
- Strong type consistency across mixed-language environments using TypeScript and Python type hints
- Dynamic, canvas-based agent studio with chat and magic generation capabilities for interactive AI development
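To illustrate what type consistency across a TypeScript frontend and a Python backend can mean in practice, the sketch below validates a JSON message at the Python boundary against a shape the frontend would also declare. Everything here is assumed for illustration: the `CanvasElement` fields, the `parse_element` helper, and the sample payload are hypothetical, and a standard-library dataclass is used in place of the Pydantic models the project is said to use.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CanvasElement:
    # Hypothetical shape; a TypeScript frontend would declare the same fields, e.g.
    #   interface CanvasElement { id: string; kind: "image" | "video"; x: number; y: number }
    id: str
    kind: str
    x: float
    y: float

    def __post_init__(self) -> None:
        # Dataclasses do not check types at runtime, so validate explicitly.
        if not isinstance(self.id, str) or not self.id:
            raise ValueError("id must be a non-empty string")
        if self.kind not in ("image", "video"):
            raise ValueError(f"unsupported kind: {self.kind!r}")
        for name in ("x", "y"):
            value = getattr(self, name)
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{name} must be a number")


def parse_element(raw: str) -> CanvasElement:
    """Parse a JSON message into a validated CanvasElement."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        return CanvasElement(**data)
    except TypeError as exc:  # missing or unexpected fields
        raise ValueError(f"malformed element: {exc}") from exc


element = parse_element('{"id": "el-1", "kind": "image", "x": 10, "y": 20.5}')
```

Rejecting malformed messages at the boundary means the rest of the backend can rely on the same field names and types the frontend was compiled against.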