MemPalace

Name: MemPalace
Rating: 5 (56956 reviews)

Local-first AI memory with verbatim storage, pluggable backends, and 96.6% retrieval recall on LongMemEval — no API key required.

57K stars

7.4K forks

MIT License

Python

MemPalace is a local-first AI memory system that stores conversation history and project content as verbatim text and retrieves it with semantic search. Unlike summarization-based memory tools, MemPalace never paraphrases or extracts — every drawer holds the original content exactly as written, which is the key to its benchmark-leading retrieval accuracy.

The palace uses a spatial metaphor to organize knowledge: people and projects become wings, topics become rooms, and the original verbatim content lives in drawers. Searches can be scoped to a specific wing or room rather than run against a flat corpus, dramatically improving precision. The hybrid search pipeline combines BM25 keyword matching with vector semantic similarity, with closet pointers providing an additional ranking signal.
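
To make the scoping idea concrete, here is a minimal sketch rather than MemPalace's actual code. The `Drawer` class, the `search` function, the `alpha` blend weight, and the toy two-dimensional embeddings are all hypothetical. It filters drawers by wing and room first, then blends a simple BM25 score with cosine similarity. The closet-pointer signal is omitted because the description above does not say how it is computed.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Drawer:
    wing: str               # person or project
    room: str               # topic
    text: str               # verbatim content, never rewritten
    embedding: list[float]  # hypothetical precomputed vector


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a plain BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(drawers, query, query_vec, wing=None, room=None, alpha=0.5, top_k=5):
    """Restrict to the requested wing/room, then rank by a blended score."""
    scoped = [
        d for d in drawers
        if (wing is None or d.wing == wing) and (room is None or d.room == room)
    ]
    if not scoped:
        return []
    keyword = bm25_scores(query, [d.text for d in scoped])
    max_kw = max(keyword) or 1.0  # normalise BM25 into [0, 1]
    ranked = [
        (alpha * (kw / max_kw) + (1 - alpha) * cosine(query_vec, d.embedding), d)
        for d, kw in zip(scoped, keyword)
    ]
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]
```

Calling `search(drawers, "auth tokens", query_vec, wing="alice")` in this sketch never considers drawers from other wings, which is the precision benefit the scoped search is meant to provide.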

The retrieval layer is pluggable through a well-defined backend contract (RFC 001). ChromaDB is the default, but alternative backends — SQLite exact-vector, Qdrant (REST), and pgvector (Postgres) — can be swapped in without touching the rest of the system. Embeddings are generated locally using an ONNX-based model (embeddinggemma-300m for multilingual support, or MiniLM for English-only), with hardware acceleration available via CUDA, CoreML, and DirectML.
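
As an illustration of what a backend contract of this kind can look like, the sketch below defines a small protocol and a registry. It is not the RFC 001 interface: `VectorBackend`, `InMemoryBackend`, `register_backend`, and `get_backend` are hypothetical names, and the two-method surface is an assumption chosen for brevity.

```python
from typing import Callable, Protocol, Sequence


class VectorBackend(Protocol):
    """Hypothetical contract; the real one is defined by RFC 001."""

    def upsert(self, ids: Sequence[str], vectors: Sequence[Sequence[float]]) -> None: ...

    def query(self, vector: Sequence[float], top_k: int) -> list[str]: ...


class InMemoryBackend:
    """Toy exact-search backend that satisfies the protocol structurally."""

    def __init__(self) -> None:
        self._rows: dict[str, list[float]] = {}

    def upsert(self, ids, vectors):
        for id_, vec in zip(ids, vectors):
            self._rows[id_] = list(vec)

    def query(self, vector, top_k):
        def dot(vec):
            return sum(a * b for a, b in zip(vector, vec))

        # Exact search: score every stored vector and keep the best top_k ids.
        ranked = sorted(self._rows, key=lambda id_: dot(self._rows[id_]), reverse=True)
        return ranked[:top_k]


_REGISTRY: dict[str, Callable[[], VectorBackend]] = {}


def register_backend(name, factory):
    _REGISTRY[name] = factory


def get_backend(name):
    if name not in _REGISTRY:
        raise ValueError(f"unknown backend: {name!r}")
    return _REGISTRY[name]()
```

Because callers depend only on the protocol, swapping one backend for another in this sketch changes the name passed to `get_backend` and nothing else.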

Beyond file mining, MemPalace includes a temporal entity-relationship knowledge graph backed by SQLite, 33 MCP server tools for integration with Claude Code and other AI tools, auto-save hooks for Claude Code/Codex CLI/Cursor IDE, and multi-agent support where each specialist agent gets its own wing and diary in the palace.
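
A temporal knowledge graph with validity windows can be sketched with the standard `sqlite3` module. The table layout, column names, and helper functions below are assumptions made for illustration and are not taken from `knowledge_graph.py`. Dates are stored as ISO 8601 strings so that string comparison matches chronological order.

```python
import sqlite3


def open_graph(path=":memory:"):
    """Create a hypothetical triples table with a validity window per fact."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS triples (
            subject    TEXT NOT NULL,
            predicate  TEXT NOT NULL,
            object     TEXT NOT NULL,
            valid_from TEXT NOT NULL,   -- inclusive start, ISO 8601
            valid_to   TEXT             -- exclusive end, NULL means still valid
        )
        """
    )
    return conn


def add_fact(conn, subject, predicate, obj, valid_from, valid_to=None):
    conn.execute(
        "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
        (subject, predicate, obj, valid_from, valid_to),
    )


def facts_as_of(conn, subject, as_of):
    """Return (predicate, object) pairs that were valid on the given date."""
    return conn.execute(
        """
        SELECT predicate, object FROM triples
        WHERE subject = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR valid_to > ?)
        ORDER BY predicate, object
        """,
        (subject, as_of, as_of),
    ).fetchall()
```

Closing a fact by setting `valid_to` instead of deleting the row keeps history queryable, which is the point of a validity window.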

What You Get

Verbatim drawer storage that preserves original text exactly — no summarization or paraphrasing ever occurs
Hybrid BM25 + vector semantic search with closet-pointer ranking boosts for precision without false negatives
Pluggable storage backends: ChromaDB (default), SQLite exact-vector, Qdrant REST, and pgvector Postgres
33 MCP server tools covering palace reads/writes, knowledge-graph operations, cross-wing navigation, and agent diaries
Temporal entity-relationship knowledge graph with validity windows backed by local SQLite
Auto-save hooks for Claude Code, Codex CLI, and Cursor IDE that snapshot sessions before context compression
Multi-ingest modes: project files, conversation exports (Claude Code, ChatGPT, Slack), and binary documents (PDF, DOCX, PPTX, XLSX, RTF, EPUB)
Hardware-accelerated local embeddings via ONNX Runtime with CUDA, CoreML, and DirectML support
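
The hardware-acceleration item above can be illustrated with a provider preference list. This is a sketch under assumptions: the `pick_providers` helper and the preference order are hypothetical, while the provider names are the identifiers ONNX Runtime uses for CUDA, CoreML, DirectML, and CPU execution.

```python
# Preference order is an assumption for illustration, not MemPalace's own.
PREFERRED_PROVIDERS = [
    "CUDAExecutionProvider",    # NVIDIA GPUs
    "CoreMLExecutionProvider",  # Apple hardware
    "DmlExecutionProvider",     # DirectML on Windows
    "CPUExecutionProvider",     # always-available fallback
]


def pick_providers(available):
    """Keep the preferred providers that are available, ending with CPU."""
    chosen = [p for p in PREFERRED_PROVIDERS if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen
```

With ONNX Runtime installed, the `available` list would typically come from `onnxruntime.get_available_providers()`, and the result can be passed as the `providers` argument of `onnxruntime.InferenceSession`.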

Common Use Cases

Mining Claude Code session transcripts so an AI assistant can recall decisions, context, and rationale from previous conversations
Indexing large codebases so developers can search for architectural decisions, past bug fixes, and implementation choices in natural language
Building a personal knowledge palace from documents and notes that any MCP-compatible AI tool can query at runtime
Running multi-agent workflows where each specialist agent maintains its own wing and diary without polluting shared context
Recovering project context quickly after a context window reset by running mempalace wake-up at session start
Cross-language knowledge retrieval using the embeddinggemma-300m multilingual model across 100+ languages

Under The Hood

Architecture

MemPalace follows a layered, modular architecture with clear separation between the CLI entry point (cli.py), domain logic, and storage backends. The core flow starts at cli.py, which routes to miner.py (project files), convo_miner.py (conversation exports), or format_miner.py (binary documents). All three miners converge on palace.py for shared palace operations (collection access, embedder identity enforcement, closet upserts, and FTS5 validation), which in turn delegates to the pluggable backend registry in backends/. The knowledge graph (knowledge_graph.py) is a self-contained SQLite module that sits orthogonal to the vector store, joined only at the MCP server layer (mcp_server.py). Thread safety is handled by explicit per-file and per-palace mining locks in palace.py, and the MCP server protects stdout at the file-descriptor level before importing chromadb to prevent banner output from corrupting JSON-RPC streams.
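
The stdout protection mentioned above can be sketched with `os.dup` and `os.dup2`. This is one plausible way to do it, not a copy of `mcp_server.py`, and the `guarded_stdout` name is hypothetical. File descriptor 1 is pointed at stderr, so anything a noisy import prints goes there, while a private handle to the real stdout is kept for protocol messages.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def guarded_stdout():
    """Redirect fd 1 to stderr and yield a private writer for the real stdout."""
    sys.stdout.flush()
    real_out = os.dup(1)   # private handle to the original stdout
    os.dup2(2, 1)          # anything written to fd 1 now lands on stderr
    channel = os.fdopen(os.dup(real_out), "w")
    try:
        yield channel
    finally:
        channel.close()
        sys.stdout.flush()
        os.dup2(real_out, 1)  # restore the original stdout
        os.close(real_out)
```

A server would import its noisy dependencies inside the `with guarded_stdout() as channel:` block and write JSON-RPC messages only through `channel`. Working at the descriptor level matters because output from native code bypasses `sys.stdout` and would not be caught by reassigning it.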

Tech Stack

MemPalace is written in Python 3.9+ and distributed as a PyPI package built with Hatchling. The default vector store is ChromaDB 1.5.x; alternative backends (Qdrant via REST, pgvector via psycopg3, SQLite exact-vector) are registered through Python entry points under mempalace.backends. Embeddings are generated locally using ONNX Runtime with two model options: all-MiniLM-L6-v2 (~30 MB, English-only) or onnx-community/embeddinggemma-300m-ONNX (~300 MB, 100+ languages), lazy-downloaded from HuggingFace Hub on first use. Binary document extraction uses MarkItDown with per-format sub-extras for PDF, DOCX, PPTX, and XLSX. Development tooling is ruff 0.15.15 for linting and formatting, pytest with pytest-cov enforcing 85% coverage minimum, hypothesis for property-based testing, mypy for static type checking, and pre-commit for local gate enforcement.
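
Entry-point registration can be inspected with the standard library. The sketch below lists whatever is registered under the `mempalace.backends` group named above. The `discover_backends` helper is hypothetical, and the version branch is there because the `group=` keyword of `importlib.metadata.entry_points` is only available from Python 3.10.

```python
import sys
from importlib import metadata


def discover_backends(group="mempalace.backends"):
    """Map entry-point names to EntryPoint objects for the given group."""
    if sys.version_info >= (3, 10):
        found = metadata.entry_points(group=group)
    else:
        # Python 3.9 returns a mapping of group name to entry points.
        found = metadata.entry_points().get(group, [])
    return {ep.name: ep for ep in found}
```

Calling `.load()` on one of the returned entry points imports the target object, so a backend's dependencies are only imported when that backend is actually selected.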

Code Quality

The test suite is comprehensive, with over 100 test files covering individual modules (test_searcher.py, test_miner.py, test_palace.py), backend conformance (_backend_conformance.py run against all four backends), MCP server behavior, hook integration, and even benchmark claim verification (test_readme_claims.py). Error handling is explicit and typed throughout: the backends module defines a rich error hierarchy (BackendError, PalaceNotFoundError, CollectionNotInitializedError, DimensionMismatchError, UnsupportedCapabilityError) and callers distinguish between them rather than catching broadly. Inline comment density is very high, with detailed rationale comments in pyproject.toml, palace.py, and miner.py explaining non-obvious decisions. The codebase enforces a coverage floor of 85% in [tool.coverage.report] and CI gates on ruff.
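
The value of a typed error hierarchy is easiest to see from the caller's side. In the sketch below the class names come from the description above, but the inheritance layout (every error deriving directly from `BackendError`) and the `check_dimension` and `run_query` helpers are assumptions made for illustration.

```python
class BackendError(Exception):
    """Assumed common base class for backend failures."""


class PalaceNotFoundError(BackendError):
    pass


class CollectionNotInitializedError(BackendError):
    pass


class DimensionMismatchError(BackendError):
    pass


class UnsupportedCapabilityError(BackendError):
    pass


def check_dimension(expected, actual):
    """Hypothetical guard: reject vectors whose size differs from the index."""
    if expected != actual:
        raise DimensionMismatchError(f"expected {expected} dimensions, got {actual}")


def run_query(operation):
    """Map specific failures to specific recovery hints instead of catching broadly."""
    try:
        return operation()
    except PalaceNotFoundError:
        return "create the palace first"
    except DimensionMismatchError:
        return "re-embed with the palace's embedding model"
    except BackendError as exc:
        return f"backend failure: {exc}"
```

Ordering the `except` clauses from most specific to most general lets the final `BackendError` handler act as a catch-all for backend problems while unrelated exceptions still propagate.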

What Makes It Unique

Most AI memory systems summarize or extract facts before storage, which introduces irreversible information loss and shifts recall accuracy to depend on the quality of that transformation. MemPalace inverts this: verbatim storage is the invariant, and the retrieval pipeline (hybrid BM25 + vector + closet-pointer ranking) is optimized to find the right original text. This design choice, combined with the spatial hierarchy enabling scoped searches rather than flat-corpus queries, produces the benchmark-leading 96.6% R@5 on LongMemEval with zero API calls. The pluggable backend system formalized via RFC 001, with a published conformance test suite that any third-party backend must pass, is also distinctive: it makes the storage substrate genuinely swappable without coupling the entire system to ChromaDB’s API surface.
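
For readers unfamiliar with the R@5 figure, the sketch below shows one common way to compute recall at k. It assumes recall is the fraction of relevant items that appear in the top k results, averaged over questions. The benchmark's exact scoring rules may differ, and the function names are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of relevant ids that appear among the first k ranked ids."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs, k=5):
    """Average recall over (ranked_ids, relevant_ids) pairs, one per question."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(recall_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)
```

Under this definition, a score of 96.6% would mean that, on average, almost all of the evidence for a question shows up within the first five retrieved items.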

Self-Hosting

Licensing Model

MIT licensed: all features are available in self-hosted deployments, with no restrictions or license keys required.

Repository Health

Pre-computed score based on development activity, maintenance, community, maturity, and trend momentum.

83/100 Excellent

Development Activity: 100
Maintenance: 100
Community: 76
Maturity: 16
Momentum: 40

Growing community support
Very active development
Well-maintained with consistent updates
Rapidly growing project

Technical Analysis

84/100 Excellent

Architecture: 85
Code Quality: 88
Innovation: 82
Learning Curve: 80

Repository Stats

Contributors: 116
Total Commits: 1,450
Monthly Commits: 481
Watchers: 326
Repo Age: 3 months
Last Commit: 2 days ago

Built With

Python 93.3%

Recent Releases

12 total

~4.0 releases/month

Alternative To

Mem0

Topics

ai chromadb llm mcp memory python

Related Apps

claw-code

Rust 95% · MIT

AI Agents · AI Code Assistants

194,567

A Rust-built CLI agent harness for Claude AI with persistent sessions, MCP tool integration, plugin hooks, and multi-provider support — designed to run autonomous coding workflows without human babysitting.
