Khoj is a personal AI assistant designed to extend human cognition by integrating semantic search, RAG, and LLM capabilities across your local documents and the web. It’s built for researchers, writers, and knowledge workers who need to synthesize information from PDFs, Notion, Markdown, and live web sources without sacrificing privacy. Khoj scales from a lightweight on-device AI to a cloud-based enterprise solution.
Built in Python with support for Llama3, Qwen, Gemini, GPT, Claude, and Mistral via API or local inference (llama.cpp), Khoj supports deployment via Docker, PyPI, or cloud. It integrates with Obsidian, Emacs, WhatsApp, and desktop apps, offering a unified interface for AI-powered research and automation.
What You Get
- Multi-source semantic search - Search across PDFs, Markdown, Notion, Word, and org-mode files using vector embeddings for context-aware results.
- Custom AI agents - Build agents with personalized knowledge, personas, and toolsets to automate research, writing, or data analysis tasks.
- Web and document RAG - Ask questions and get answers grounded in your documents or live web results using retrieval-augmented generation.
- Cross-platform access - Use Khoj via web browser, Obsidian plugin, Emacs integration, desktop app, or WhatsApp for seamless AI access anywhere.
- Image generation - Generate images directly from prompts using integrated diffusion models, with output saved to your local storage.
- Automated newsletters and notifications - Schedule AI-generated summaries and research digests delivered to your email inbox based on your interests.
- Local LLM support - Run Llama3, Qwen, Mistral, and other models offline using llama.cpp for private, low-latency inference without cloud dependency.
- Enterprise deployment options - Deploy Khoj on-premises, in hybrid cloud environments, or via managed cloud service for teams and organizations.
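The semantic-search feature above boils down to embedding both the query and every document, then ranking by vector similarity. The sketch below uses a toy `embed` function (a deterministic word-bucketing hash) as a stand-in for a real sentence-transformer model; it is illustrative, not Khoj's actual API.

```python
import math

def embed(text: str) -> list[float]:
    # Toy stand-in for a sentence-transformer: bucket words into a small
    # fixed-size vector by character sum. A real deployment would call
    # something like model.encode(text) instead.
    vec = [0.0] * 8
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % 8] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Rank documents by similarity to the query embedding.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

notes = [
    "meeting notes about quarterly budget planning",
    "recipe for sourdough bread",
    "budget forecast for the next quarter",
]
print(search("quarterly budget", notes, k=2))
```

The key property this demonstrates: ranking happens in vector space, so results are ordered by semantic overlap rather than exact keyword match.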
Common Use Cases
- Researching academic papers - A PhD student uses Khoj to ingest 200+ PDFs from their library, ask questions about trends, and generate annotated summaries with citations.
- Building a personal knowledge base - A writer syncs their Obsidian vault with Khoj to query past notes and generate new content using their own writing style.
- Automating competitive intelligence - A product manager sets up an agent to monitor industry blogs, summarize key updates, and email daily briefs.
- Running AI on a private server - A privacy-conscious developer deploys Khoj with Llama3 on a local machine to avoid sending sensitive data to third-party APIs.
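The retrieval-augmented generation flow behind these workflows can be sketched as: score chunks against the question, keep the top k, and ground the LLM prompt in them. Both the word-overlap scorer and the prompt template here are illustrative assumptions, not Khoj's internal format.

```python
def score(query: str, chunk: str) -> float:
    # Illustrative relevance score: fraction of query words present in the
    # chunk. A real system compares vector embeddings instead.
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / len(q_words)

def build_rag_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    # Keep the k most relevant chunks and ground the prompt in them.
    top = sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]
    context = "\n".join(f"- {c}" for c in top)
    return (
        "Answer using only the context below. Cite the lines you used.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

chunks = [
    "Transformer models dominate NLP benchmarks since 2018.",
    "Sourdough requires a mature starter culture.",
    "Attention mechanisms let models weigh input tokens by relevance.",
]
print(build_rag_prompt("How do attention mechanisms work in transformer models?", chunks))
```

Because only retrieved chunks enter the prompt, the model's answer stays grounded in the user's own documents rather than its training data.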
Under The Hood
Architecture
- Monolithic core backend combining Django and FastAPI over a shared database layer, which blurs the boundary between ORM and API responsibilities
- Modular service design with distinct components for conversation processing, database adapters, and routing, though these remain tightly coupled to Django models
- Dependency injection implemented via state singletons, which introduces hidden dependencies and reduces testability
- Auxiliary search, sandbox, and automation services run as separate containers communicating with the core over HTTP, leaving the system partially microservice-ready despite the monolithic core
- React-based web interface using Next.js and Tailwind, leveraging hooks and server-side rendering for dynamic UI components
- Mature data modeling, with Django migrations managing schema evolution and database-backed rate limiting that preserves relational integrity
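The state-singleton pattern flagged above can be illustrated in miniature; the class and function names here are hypothetical, not Khoj's actual code. A module-level `state` object lets any function reach shared resources, but nothing in the function signature reveals that dependency, so tests must patch the global.

```python
class State:
    # Module-level singleton holding shared resources.
    embeddings_model = None

state = State()

def answer(query: str) -> str:
    # Hidden dependency: the signature does not mention state, yet the
    # function fails unless the global was initialized elsewhere.
    if state.embeddings_model is None:
        raise RuntimeError("model not initialized")
    return f"answered {query!r} with {state.embeddings_model}"

def answer_injected(query: str, model: str) -> str:
    # Explicit injection: the dependency is visible and trivially testable.
    return f"answered {query!r} with {model}"

state.embeddings_model = "stub-model"
print(answer("hello"))
print(answer_injected("hello", "test-model"))
```

The contrast shows the testability cost: `answer` requires global setup before it can run, while `answer_injected` can be exercised with any stub passed as an argument.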
Tech Stack
- Python 3.10+ backend powered by Django and FastAPI, with PostgreSQL and pgvector for semantic search
- React and Next.js frontend built with Bun, Radix UI, and Phosphor Icons, bundled into a single Docker image
- Containerized deployment via Docker Compose orchestrating SearxNG, Terrarium, and Khoj Computer for web search, sandboxing, and local automation
- AI/ML stack built on PyTorch, sentence-transformers, and Hugging Face Transformers, supporting multiple LLM providers
- Developer workflow enforced with pre-commit hooks, Hatch for packaging, and pytest-django for test coverage
- Extended tooling includes E2B, Whisper, and OCR for code interpretation, audio transcription, and image text extraction
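The PostgreSQL + pgvector pairing above typically reduces to an `ORDER BY` on a vector-distance operator. The sketch below builds a generic pgvector-style query string; the table and column names are hypothetical, not Khoj's actual schema (`<=>` is pgvector's cosine-distance operator, where lower means more similar).

```python
def nearest_entries_sql(limit: int = 5) -> str:
    # Generic pgvector similarity query with a psycopg-style placeholder
    # for the query embedding. Table/column names are illustrative.
    return (
        "SELECT id, heading, raw_text "
        "FROM entries "
        "ORDER BY embedding <=> %(query_embedding)s "
        f"LIMIT {limit}"
    )

print(nearest_entries_sql(3))
```

Pushing similarity ranking into the database this way avoids loading every embedding into application memory and lets Postgres use a vector index for the scan.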
Code Quality
- Extensive test suite covering unit, integration, and end-to-end scenarios with robust async and API validation
- Clear separation of concerns across frontend, backend, and Emacs interface with domain-specific test organization
- Structured error handling with user-facing messages, though lacking custom error classes for deeper traceability
- Consistent, intent-driven naming conventions across tests and components, enhancing readability and maintainability
- Strong type safety in frontend code via TypeScript and React hooks with proper prop typing and comprehensive assertions
- Well-organized linting and test structure with pytest markers enabling targeted execution and long-term maintainability
What Makes It Unique
- Native Magika integration enables content-aware file type detection without relying on file extensions
- Unified indexing pipeline harmonizes diverse sources like GitHub, Markdown, and Org-mode into a single semantic model
- Context-aware suggestion system delivers color-coded, intent-driven prompts tied to user input states
- Client-side dynamic loading of Excalidraw preserves performance while enabling rich visual collaboration
- Granular, real-time model selection allows seamless switching between local and remote LLMs without reloads
- Bi-directional sync between web and mobile interfaces creates a unified, cross-platform knowledge workspace
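The unified indexing pipeline described above can be sketched as a normalize step that maps heterogeneous formats into one entry shape before embedding; the `Entry` fields and parsing rules here are assumptions for illustration, not Khoj's actual data model.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    source: str   # e.g. "markdown" or "org"
    heading: str
    text: str

def normalize(raw: str, fmt: str) -> list[Entry]:
    # Map a format-specific document into a list of uniform entries.
    # Markdown headings start with "#", org-mode headings with "*".
    marker = {"markdown": "#", "org": "*"}[fmt]
    entries: list[Entry] = []
    heading, body = "", []
    for line in raw.splitlines():
        if line.startswith(marker):
            if body:
                entries.append(Entry(fmt, heading, " ".join(body)))
                body = []
            heading = line.lstrip(marker + " ")
        elif line.strip():
            body.append(line.strip())
    if body:
        entries.append(Entry(fmt, heading, " ".join(body)))
    return entries

md = "# Notes\nVectors encode meaning.\n# Todo\nIndex the vault."
org = "* Notes\nOrg files work the same way."
index = normalize(md, "markdown") + normalize(org, "org")
print([(e.source, e.heading) for e in index])
```

Once every source is reduced to the same entry shape, a single embedding and search pipeline can serve all of them, which is the point of harmonizing sources into one semantic model.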