Supermemory is a memory and context layer for AI agents that solves the problem of AI systems forgetting everything between conversations. It automatically extracts facts from interactions, maintains evolving user profiles, and delivers personalized context via a single API — enabling AI systems to remember preferences, projects, and past discussions. Built for developers and end-users alike, it integrates with major AI frameworks and supports real-time data sync from Google Drive, GitHub, Notion, and more.
Built with TypeScript, Remix, Drizzle ORM, and deployed on Cloudflare Workers and Pages, Supermemory combines a graph-based memory engine with hybrid search (RAG + user memory) and multi-modal file processing. It offers both a no-code app for end-users and a production-ready API for developers, with first-class integrations for Vercel AI SDK, LangChain, Mastra, and Claude Code.
What You Get
- Memory Engine - Extracts facts from conversations, resolves contradictions, tracks temporal changes, and auto-forgets expired information using a graph-based knowledge structure.
- User Profiles - Automatically builds static (permanent) and dynamic (recent) user context in ~50ms, injected into AI prompts to personalize responses without manual configuration.
- Hybrid Search - Combines RAG (document retrieval) and user memory in a single query, returning both external docs and personalized facts from user history.
- Connectors - Real-time sync from Google Drive, Gmail, Notion, OneDrive, and GitHub via webhooks — documents are auto-chunked and indexed without manual setup.
- Multi-modal Extractors - Processes PDFs, images (OCR), videos (transcription), and code (AST-aware chunking) into searchable, structured context.
- MCP Server & Plugins - Open-source MCP server and plugins for Claude, Cursor, VS Code, and OpenCode that enable persistent memory in AI tools with one install.
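The hybrid search described above amounts to merging two scored result streams — document retrieval and user memory — into one ranked list. A minimal TypeScript sketch of that merge; the `Hit` type and scoring are illustrative assumptions, not Supermemory's actual API:

```typescript
// A scored result from either retrieval source (illustrative shape).
interface Hit {
  source: "document" | "memory";
  text: string;
  score: number; // normalized relevance in [0, 1]
}

// Merge RAG document hits and user-memory facts into one ranked list —
// the shape a hybrid query returns: external docs and personal context together.
function hybridMerge(docHits: Hit[], memoryHits: Hit[], limit = 5): Hit[] {
  return [...docHits, ...memoryHits]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const merged = hybridMerge(
  [{ source: "document", text: "Q3 roadmap draft", score: 0.82 }],
  [{ source: "memory", text: "User prefers concise summaries", score: 0.91 }],
);
```

A single query thus serves both knowledge-base answers and personalization, rather than requiring two round trips.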
Common Use Cases
- Building personalized AI assistants - A developer uses Supermemory to give their customer support bot persistent memory of each user’s past issues, preferences, and project context across sessions.
- Running AI-powered knowledge bases - A team syncs their Notion and Google Drive docs into Supermemory to enable RAG with user-specific context for internal AI tools.
- Enhancing AI coding assistants - A software engineer installs the Supermemory plugin in Cursor to have their AI assistant remember their coding style, preferred libraries, and past debugging patterns.
- Scaling AI agents with user profiles - A startup uses Supermemory’s profile API to personalize onboarding bots for 10,000+ users, reducing prompt engineering overhead by 80%.
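The profile-driven personalization in these use cases boils down to injecting static and dynamic facts into the system prompt before each model call. A hedged sketch, assuming a profile shape with `static` and `dynamic` fact lists (names are illustrative, not the actual profile API):

```typescript
// Hypothetical profile shape: permanent facts plus recent context
// (mirrors the static/dynamic split described under "User Profiles").
interface UserProfile {
  static: string[];  // long-lived facts, e.g. role, timezone
  dynamic: string[]; // recent context, e.g. current project
}

// Build a system prompt that injects the profile ahead of the task
// instructions, personalizing every call without per-user prompt engineering.
function buildSystemPrompt(profile: UserProfile, instructions: string): string {
  const facts = [...profile.static, ...profile.dynamic]
    .map((f) => `- ${f}`)
    .join("\n");
  return `You are a helpful assistant.\nKnown about this user:\n${facts}\n\n${instructions}`;
}

const prompt = buildSystemPrompt(
  { static: ["Works in TypeScript"], dynamic: ["Debugging an onboarding flow"] },
  "Answer support questions concisely.",
);
```

Because the profile is fetched rather than hand-written, one prompt template can serve every user.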
Under The Hood
Architecture
- Monorepo structure powered by Turbo enforces clear boundaries between frontend and backend modules, promoting modular development and independent deployment
- Backend services follow clean architecture principles with Hono and Drizzle-ORM, isolating business logic from HTTP handlers and database concerns through well-defined service layers
- API contracts and data-access types stay in sync via Zod-openapi and Drizzle-zod, reducing coupling between route schemas and the database layer
- Authentication and routing are centralized using Better-Auth and Hono-OpenAPI, enabling consistent middleware pipelines and auto-generated API documentation
- Frontend components leverage React Server Components and CSS-in-JS with custom state management hooks, decoupling UI logic from data flow
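The service-layer separation described above can be illustrated framework-free: handlers depend on a service, the service depends on a repository interface, and the database (Drizzle-ORM in the real stack) sits behind that interface. The names below are illustrative, not taken from the codebase:

```typescript
// Repository interface isolates database concerns behind a contract.
interface MemoryRepository {
  findByUser(userId: string): Promise<string[]>;
}

// Service layer holds business logic and depends only on the interface,
// so HTTP handlers (Hono routes in the real stack) stay thin.
class MemoryService {
  constructor(private repo: MemoryRepository) {}
  async recentMemories(userId: string, limit: number): Promise<string[]> {
    const all = await this.repo.findByUser(userId);
    return all.slice(0, limit);
  }
}

// An in-memory fake stands in for the database, which is what makes
// the business logic testable without HTTP or Postgres.
const fakeRepo: MemoryRepository = {
  async findByUser() {
    return ["prefers dark mode", "works on the billing service", "uses pnpm"];
  },
};
const service = new MemoryService(fakeRepo);
```

Swapping the fake for a Drizzle-backed implementation changes nothing in the service or the routes.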
Tech Stack
- TypeScript monorepo with Turbo for optimized incremental builds, Next.js for server-side rendering and API routes, and Bun as the package manager
- Backend services use Hono and Drizzle-ORM with PostgreSQL, leveraging type-safe schema generation and runtime validation through Zod
- AI integration spans multiple LLM providers via @ai-sdk/openai, @ai-sdk/anthropic, and @google/genai, orchestrated with LangChain-core and deployed via Cloudflare Workers for edge compute
- Infrastructure is defined through Wrangler and Drizzle-Kit for deployment and migrations, with a clean separation between frontend, backend, and AI service layers
- Comprehensive tooling includes Biome for linting and formatting, Sentry for observability, and Node.js 20+ as the runtime baseline
Code Quality
- Extensive test coverage across layers with unit, integration, and end-to-end tests using Vitest and pytest, emphasizing structural validation and deterministic mock data
- Strong separation of concerns with dedicated packages for agent frameworks, SDKs, middleware, and memory graph components, each with well-defined interfaces
- Robust error handling through custom error classes and environment validation, ensuring configuration integrity before system initialization
- Consistent naming, type safety, and contract enforcement across TypeScript and Python modules, enhancing cross-language interoperability
- Integrated linting and test scaffolding enforce standards, with middleware wrappers enabling memory integration into third-party LLM frameworks without modifying the frameworks themselves
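The middleware-wrapper pattern mentioned above is essentially a higher-order function around a framework's chat call: remembered facts are prepended as a system message, and the underlying framework is untouched. A minimal sketch under that assumption (the function shapes are illustrative):

```typescript
// Generic shape of a chat-completion function from any LLM framework.
type ChatFn = (messages: { role: string; content: string }[]) => Promise<string>;

// Higher-order wrapper: prepends remembered facts as a system message
// before delegating, so the wrapped framework needs no changes.
function withMemory(
  chat: ChatFn,
  recall: (userId: string) => string[],
): (userId: string, messages: { role: string; content: string }[]) => Promise<string> {
  return async (userId, messages) => {
    const facts = recall(userId).join("; ");
    return chat([{ role: "system", content: `User context: ${facts}` }, ...messages]);
  };
}

// Stub model echoes its input so the wrapping is visible.
const echoModel: ChatFn = async (messages) =>
  messages.map((m) => m.content).join(" | ");
const chatWithMemory = withMemory(echoModel, () => ["prefers Rust examples"]);
```

The same wrapper works for any framework whose call site matches `ChatFn`, which is why no framework modification is needed.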
What Makes It Unique
- Native Open Graph parser embedded in the link ingestion pipeline enables automatic memory enrichment from URLs without external dependencies
- Raycast extension integration with dynamic API key generation tied to organizational context creates a seamless desktop-to-memory workflow
- Unified memory ingestion architecture across browser extension and web app ensures consistent semantic capture with minimal code duplication
- Custom Radix UI component library with data-slot attributes and theming hooks enables deep customization while preserving accessibility and visual consistency
- Decoupled document parsing and memory embedding layer supports extensible content types including structured metadata and user annotations, not just plain text
- Real-time preview engine with intelligent URL normalization and fallback strategies preserves user intent during content capture
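URL normalization of the kind mentioned above typically canonicalizes a captured link so the same page always maps to the same memory. A sketch using the standard WHATWG `URL` API; the tracking-parameter list and the exact rules are illustrative assumptions, not Supermemory's actual implementation:

```typescript
// Tracking parameters commonly stripped during normalization (illustrative list).
const TRACKING_PARAMS = ["utm_source", "utm_medium", "utm_campaign", "ref", "fbclid"];

// Canonicalize a captured URL: drop tracking params and the fragment,
// and trim the trailing slash on bare roots, so duplicates collapse.
function normalizeUrl(raw: string): string {
  const url = new URL(raw); // also lowercases the hostname
  url.hash = "";
  for (const p of TRACKING_PARAMS) url.searchParams.delete(p);
  let out = url.toString();
  if (out.endsWith("/") && url.pathname === "/") out = out.slice(0, -1);
  return out;
}
```

Normalizing before lookup means `https://Example.com/post?utm_source=tw#top` and `https://example.com/post` dedupe to one entry instead of two.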