Memgraph

High-performance in-memory graph database for AI context and real-time analytics

228 forks

Memgraph is an open-source, in-memory graph database designed to power AI applications requiring real-time, structured context—such as GraphRAG, AI memory, and agentic workflows. It solves the limitation of vector-only retrieval by enabling hybrid queries that combine semantic similarity with deep graph traversals in a single atomic operation. Built for performance, it targets developers and enterprises building AI systems that need traceable, connected knowledge graphs with sub-millisecond latency.

Memgraph is written in C++ and supports Cypher, the same query language as Neo4j, with native integrations for Kafka, Parquet, JSONL, and Python/Rust/C++ custom modules. It runs on Docker, Kubernetes, WSL, and cloud platforms, with enterprise-grade features like high availability, multi-tenancy, and SSO. The ecosystem includes Memgraph Lab for visualization, MAGE for graph algorithms, and the AI Toolkit for agentic frameworks.

What You Get

  • Built-in vector and text indexes - Hybrid queries combine similarity search with graph traversal in a single Cypher statement, eliminating the need for external vector databases or post-processing.
  • MAGE algorithm library - 40+ native C++ graph algorithms including PageRank, community detection, GNN-based link prediction, temporal graph networks, and embeddings—all callable via Cypher.
  • Atomic GraphRAG - Full GraphRAG pipelines (pivot search, graph expansion, ranking, prompt assembly) expressed as single Cypher queries instead of fragmented application logic.
  • LLM utility module - Automatically formats graph results into structured context prompts compatible with LLMs like GPT, Claude, or Llama for improved reasoning.
  • Native Parquet & JSONL loading - Direct ingestion from Parquet and JSONL files on S3, HTTP, or local disk without ETL pipelines.
  • Real-time streaming from Kafka/Pulsar/Redpanda - Dynamic graph updates triggered by streaming data with built-in graph algorithms that react to events in real time.
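
To make the hybrid retrieval idea concrete, here is a minimal pure-Python sketch of the pattern the list describes: a similarity-based pivot search, a bounded graph expansion, and prompt assembly. It is an illustration only and does not use Memgraph or Cypher; the toy nodes, embeddings, and the function names `pivot_search`, `expand`, and `build_context` are all assumptions made up for this example.

```python
import math
from collections import deque

# Toy in-memory graph: node id -> (text, embedding). All data is made up.
NODES = {
    "a": ("Memgraph stores data in memory", [1.0, 0.0]),
    "b": ("Cypher is the query language", [0.9, 0.1]),
    "c": ("Kafka streams feed the graph", [0.0, 1.0]),
    "d": ("MAGE provides graph algorithms", [0.1, 0.9]),
}
# Directed adjacency list (hypothetical relationships between the nodes).
EDGES = {"a": ["b"], "b": ["d"], "c": ["d"], "d": []}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def pivot_search(query_vec, k=1):
    """Step 1: pick the k nodes whose embeddings are most similar to the query."""
    ranked = sorted(NODES, key=lambda n: cosine(NODES[n][1], query_vec), reverse=True)
    return ranked[:k]


def expand(pivots, max_hops=1):
    """Step 2: breadth-first expansion from the pivots, up to max_hops edges away."""
    hops = {p: 0 for p in pivots}
    queue = deque(pivots)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in EDGES[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


def build_context(query_vec, k=1, max_hops=1):
    """Steps 3 and 4: rank by hop distance, then assemble a prompt-ready context."""
    hops = expand(pivot_search(query_vec, k), max_hops)
    ordered = sorted(hops, key=lambda n: (hops[n], n))
    return "\n".join(f"- {NODES[n][0]}" for n in ordered)


if __name__ == "__main__":
    print(build_context([1.0, 0.0], k=1, max_hops=1))
```

In a database with built-in vector indexes, these steps would be expressed in a single query rather than in application code; the sketch only shows how the pieces relate to one another.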

Common Use Cases

  • Running GraphRAG pipelines - AI engineers use Memgraph to replace fragmented RAG systems by connecting documents, entities, and relationships in a single graph query with sub-millisecond latency.
  • Building AI memory systems - Developers store semantic, episodic, and procedural memory as a unified graph so LLMs can recall structured, context-aware information across sessions.
  • Real-time fraud detection - Financial institutions analyze transaction networks in real time to detect multi-hop fraud patterns that rule-based systems miss.
  • Enterprise knowledge graphs - Organizations unify siloed data from SQL, PDFs, and CSVs into a queryable graph for cross-departmental search and decision-making.
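
As one illustration of a multi-hop pattern from the fraud detection use case, the sketch below looks for money that leaves an account and returns to it within a bounded number of transfers. It is plain Python over a made-up transfer list, not Memgraph code; the account names and the helpers `build_graph` and `find_cycles` are hypothetical.

```python
from collections import defaultdict

# Hypothetical transfers as (sender, receiver) pairs. Account names are made up.
TRANSFERS = [
    ("acct1", "acct2"),
    ("acct2", "acct3"),
    ("acct3", "acct1"),  # closes a 3-hop loop back to acct1
    ("acct3", "acct4"),
]


def build_graph(transfers):
    """Build a directed adjacency list from (sender, receiver) pairs."""
    graph = defaultdict(list)
    for sender, receiver in transfers:
        graph[sender].append(receiver)
    return graph


def find_cycles(graph, start, max_hops):
    """Return every simple path that leaves `start` and returns within max_hops edges."""
    cycles = []

    def dfs(node, path):
        hops_used = len(path) - 1
        if hops_used >= max_hops:
            return  # no edge budget left to extend or close the path
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.append(path + [start])
            elif nxt not in path:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return cycles


if __name__ == "__main__":
    graph = build_graph(TRANSFERS)
    for cycle in find_cycles(graph, "acct1", max_hops=3):
        print(" -> ".join(cycle))
```

A rule that inspects one transfer at a time cannot see this loop, because each individual transfer looks ordinary; the pattern only appears when the path is followed across several hops.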

Under The Hood

Architecture

  • The codebase demonstrates a clear attempt at separation of concerns, organizing functionality into distinct components like storage, query processing, replication, and system management.
  • Layers are well-defined, with interfaces and RPC mechanisms facilitating communication between components.
  • Dependency injection is utilized, though its application isn’t entirely consistent across the project.
  • Extensive use of namespaces aids in code organization, though deep nesting can sometimes obscure relationships.

Tech Stack

  • The project is built primarily with a combination of C++ and Python, leveraging the strengths of each language.
  • Code formatting and import organization are enforced through black and isort for Python and clang-format for C++.
  • Pre-commit hooks are extensively used to maintain code quality and consistency.
  • Conan serves as the central dependency manager, working alongside the CMake build system.

Code Quality

  • Testing is comprehensive, covering various aspects like end-to-end functionality, correctness, and integration.
  • Error handling is deliberate, utilizing a combination of exception handling and custom exception classes.
  • Naming conventions are generally consistent, enhancing readability.
  • The CMake build system is robust, managing dependencies and configurations effectively.

What Makes It Unique

  • Cypher support built into the C++ core, combined with a focus on graph algorithms, distinguishes this project.
  • Sophisticated indexing strategies, such as the TextEdgeIndex which incorporates a text search engine, enable efficient property-based searches.
  • Unique approaches to parallelizing graph algorithms, like the ChainChunk and ChainPopulation classes, suggest performance optimizations for large graphs.
