A pre-indexed code knowledge graph that cuts AI tool calls by 58% and token costs by 16% — auto-syncing, 100% local, works with Claude Code, Cursor, Codex, and more.
CodeGraph is a local code intelligence layer that builds a persistent, pre-indexed knowledge graph of your entire codebase using tree-sitter AST parsing across 20+ languages. Instead of having AI agents repeatedly grep files and read source to answer architecture questions, CodeGraph lets them query a SQLite graph of every symbol, call edge, and file relationship — delivering answers in one tool call at sub-millisecond speed.
Benchmarked across seven real-world open-source codebases in seven languages, CodeGraph reduces token consumption by 47% on average, cuts tool calls by 58%, and shortens wall-clock session time by 22%, with cost reductions reaching 40% on smaller repositories like Alamofire and OkHttp. These are not synthetic benchmarks: each run used Claude Opus running headlessly to answer architecture questions with and without the graph, across VS Code, Django, Tokio, Gin, Excalidraw, OkHttp, and Alamofire.
The MCP server integrates with Claude Code, Cursor, Codex CLI, OpenCode, Hermes Agent, Gemini CLI, Antigravity IDE, and Kiro via a single codegraph install command that auto-detects which agents are present and wires in the MCP configuration. Once initialized in a project with codegraph init, a file watcher keeps the index fresh within roughly one second of any file change — meaning the graph is never stale while an agent is actively editing code.
The project is entirely self-contained: no Node.js installation is required on the host machine since the CLI bundles its own runtime. Cross-platform installers for macOS, Linux, and Windows cover the common setup path, and npm installation remains available for teams already running Node. At version 1.0.1 with over 50,000 stars and 126 commits per month, CodeGraph has rapidly become a popular tool for developers seeking to reduce the cost and latency of AI-assisted code exploration.
- codegraph_explore, codegraph_search, codegraph_node, codegraph_callers, and related tools that agents call instead of reading files
- codegraph install, which auto-detects Claude Code, Cursor, Codex CLI, OpenCode, Hermes Agent, Gemini CLI, Antigravity, and Kiro and wires the MCP config into each
- codegraph_callers, which maps every call site including callback registrations, enabling agents to see what would break before making a change
- codegraph upgrade, an in-place updater that detects whether the CLI was installed via bundle, npm, or npx and updates accordingly
- codegraph_explore, which takes natural language questions and returns the verbatim source of relevant symbols in one round-trip instead of navigating directories manually
- codegraph_callers, which returns every call site including callback registrations before a single line is touched
- .codegraph/ index per project, so switching tools does not require re-indexing
Architecture
CodeGraph follows a clean separation between its extraction pipeline, graph storage layer, sync daemon, and MCP transport, with each concern isolated so that tree-sitter grammar loading, SQLite writes, file watching, and MCP request handling can evolve independently. The CLI entry point dispatches into discrete command handlers for install, init, index, sync, and serve, with the MCP daemon running as a long-lived sidecar process managed through a registry of active sessions. Tool handlers are intentionally lazy-loaded off the MCP startup path to avoid pulling in the SQLite layer before the daemon binds. Error conditions that should stop an agent from retrying (security path refusals) are distinguished from recoverable guidance responses (unindexed projects), preventing the common failure mode where an early isError: true teaches agents that the toolset is broken.
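The split between terminal and recoverable failures can be sketched as follows. This is a minimal illustration, not CodeGraph's actual handler: the class names NotIndexedError and PathRefusalError appear in the project description, but their fields, the toToolResult helper, and the message wording are assumptions. The result shape is a small subset of an MCP tool result (a content array plus an optional isError flag).

```typescript
// Hypothetical error classes: only the names come from the project description.
class NotIndexedError extends Error {
  readonly projectRoot: string;
  constructor(projectRoot: string) {
    super(`No index found for ${projectRoot}`);
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "NotIndexedError";
    this.projectRoot = projectRoot;
  }
}

class PathRefusalError extends Error {
  readonly requestedPath: string;
  constructor(requestedPath: string) {
    super(`Refusing to access path outside the project: ${requestedPath}`);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "PathRefusalError";
    this.requestedPath = requestedPath;
  }
}

// Minimal subset of an MCP tool result.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical helper that turns a thrown value into a tool result.
function toToolResult(err: unknown): ToolResult {
  if (err instanceof NotIndexedError) {
    // Recoverable: answer with guidance and leave isError unset, so the agent
    // does not conclude that the whole toolset is broken.
    const text = `${err.message}. Run "codegraph init" in the project, then retry.`;
    return { content: [{ type: "text", text }] };
  }
  if (err instanceof PathRefusalError) {
    // Terminal: retrying the same request will never succeed.
    return { content: [{ type: "text", text: err.message }], isError: true };
  }
  // Anything unrecognized is reported as an error rather than hidden.
  const message = err instanceof Error ? err.message : String(err);
  return { content: [{ type: "text", text: message }], isError: true };
}
```

Under these assumptions, an unindexed project yields an ordinary text response that tells the agent what to do next, while a refused path is the only expected case that sets isError.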
Tech Stack
The core runtime is TypeScript compiled to CommonJS with strict mode enabled, targeting Node.js 20-24. Symbol extraction uses tree-sitter via WebAssembly (web-tree-sitter with per-language WASM grammars), with a dedicated worker thread that is recycled every 250 files to reclaim WASM linear memory, which cannot shrink under the WebAssembly specification. The knowledge graph is stored in a local SQLite database using a schema with nodes, edges, files, and unresolved-refs tables, backed by a full-text search virtual table (FTS5) for name queries and composite indexes on source/target/kind for graph traversal. The MCP server implements the Model Context Protocol transport layer, and the installer uses @clack/prompts for interactive terminal UI. The bundled binary packages the compiled output with a Node.js runtime using a custom bundling pipeline described in BUNDLING.md.
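Because WASM linear memory can grow but never shrink, the only way to hand it back is to discard the instance that owns it, which is what recycling the worker achieves. The sketch below shows one way to express that policy. The names Terminable and RecyclingSlot and the spawn callback are hypothetical, and the code assumes files are parsed one at a time, so no parse is in flight when a worker is retired; only the 250-file interval comes from the description above.

```typescript
// Anything that can be shut down. A worker_threads Worker fits this shape,
// since its terminate() returns a promise.
interface Terminable {
  terminate(): unknown;
}

// Hypothetical helper: hands out one worker and replaces it after `limit` uses.
class RecyclingSlot<W extends Terminable> {
  private readonly spawn: () => W;
  private readonly limit: number;
  private current: W | null = null;
  private uses = 0;

  constructor(spawn: () => W, limit = 250) {
    this.spawn = spawn;
    this.limit = limit;
  }

  // Returns the worker to use for the next file.
  acquire(): W {
    if (this.current !== null && this.uses >= this.limit) {
      // Dropping the worker releases its WASM instance and all memory it grew.
      this.current.terminate();
      this.current = null;
    }
    if (this.current === null) {
      this.current = this.spawn();
      this.uses = 0;
    }
    this.uses += 1;
    return this.current;
  }

  // Shuts down the current worker, if any.
  close(): void {
    if (this.current !== null) {
      this.current.terminate();
      this.current = null;
    }
  }
}
```

With a limit of 250, indexing 500 files in sequence would use two workers and terminate the first one before file 251 is parsed.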
Code Quality
The project has an extensive test suite with over 80 test files covering extraction, graph queries, MCP daemon lifecycle, installer targets, sync hooks, worktree detection, security path validation, telemetry redaction, and evaluation harnesses for real-world benchmarking. Tests run under Vitest with a Node environment and include both unit tests for individual modules and integration tests that spawn real CLI and MCP server processes. TypeScript strict mode is enforced throughout, and the codebase consistently uses typed error classes (NotIndexedError, PathRefusalError) to distinguish recoverable from terminal failure modes. Comments throughout the source explain non-obvious decisions — especially around WASM memory management, lazy loading, and the rationale for composite database indexes.
What Makes It Unique
Most code intelligence tools are built for IDE navigation or CI analysis, not for reducing AI agent token consumption at runtime. CodeGraph’s core insight is treating the pre-built graph as a cached intelligence layer that agents query directly, eliminating the grep-find-read discovery loop that dominates token spend in large codebase sessions. The adaptive output budget system scales explore call results to project size: tighter caps on small repos to avoid dumping whole files into context, larger caps on multi-thousand-file codebases where discovery cost would otherwise dominate. Blast-radius awareness via callback registration tracking extends caller analysis to dynamic dispatch patterns that static grep cannot reach, and the worker-thread recycling strategy for WASM grammars is a direct response to the WebAssembly linear memory constraint that most tree-sitter integrations silently leak.
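To make the blast-radius idea concrete, here is a small in-memory sketch of caller analysis that treats a callback registration as an edge alongside direct calls. It is an illustration under assumptions, not CodeGraph's query: the edge kind names, the Edge shape, and the blastRadius function are hypothetical, and walking callers transitively is one reasonable reading of "what would break", not a documented behavior.

```typescript
// Hypothetical edge kinds: a direct call, or passing a function as a callback.
type EdgeKind = "calls" | "registers_callback";

interface Edge {
  source: string; // the symbol that calls or registers
  target: string; // the symbol being called or registered
  kind: EdgeKind;
}

// Collects every symbol that reaches `symbol` through incoming edges.
function blastRadius(edges: readonly Edge[], symbol: string): string[] {
  // Index edges by target so each lookup is a map hit instead of a full scan.
  const incoming = new Map<string, Edge[]>();
  for (const edge of edges) {
    const list = incoming.get(edge.target);
    if (list) list.push(edge);
    else incoming.set(edge.target, [edge]);
  }

  // Breadth-first walk; `seen` also guards against cycles.
  const seen = new Set<string>();
  const queue: string[] = [symbol];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const edge of incoming.get(current) ?? []) {
      if (edge.source !== symbol && !seen.has(edge.source)) {
        seen.add(edge.source);
        queue.push(edge.source);
      }
    }
  }
  return [...seen].sort();
}

const exampleEdges: Edge[] = [
  { source: "main", target: "startServer", kind: "calls" },
  // startServer never calls handleRequest directly; it only registers it.
  { source: "startServer", target: "handleRequest", kind: "registers_callback" },
  { source: "handleRequest", target: "parseBody", kind: "calls" },
  { source: "main", target: "formatLog", kind: "calls" },
];

console.log(blastRadius(exampleEdges, "parseBody"));
// → [ 'handleRequest', 'main', 'startServer' ]
```

A text search for calls to handleRequest would find none in this example, so startServer and main would be missed; recording the registration as an edge is what brings them into the result.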
CodeGraph is released under the MIT License, which is one of the most permissive open-source licenses available. You can use it commercially, modify it freely, embed it in proprietary products, and redistribute it without any copyleft obligation — the only requirement is retaining the copyright notice. There are no dual-license restrictions, no open-core enterprise tiers hidden behind the MIT layer, and no usage-based fees associated with the open-source release.
Running CodeGraph yourself is operationally lightweight. Each project maintains a .codegraph/ directory containing the SQLite database, and the file watcher runs as a daemon sidecar process managed by the CLI. There are no external services, network calls, or cloud dependencies — everything is local. Infrastructure requirements are minimal: a supported Node.js runtime (20-24) or the bundled binary, and disk space proportional to the codebase being indexed. The daemon handles its own lifecycle, and codegraph upgrade handles in-place updates. Teams deploying this in CI or on shared developer machines should be aware that indexing large codebases takes time on first run and that the daemon needs to remain alive for auto-sync to function.
At present, CodeGraph is entirely open-source, with no paid tier or managed cloud offering available. The README notes that a hosted ‘CodeGraph Platform’ is in development, promising per-PR impact analysis, business logic validation, and test scope recommendations, with a public waitlist at getcodegraph.com. Self-hosted users today get all current functionality with no feature gates, but there is no SLA, enterprise support contract, or managed upgrade path. Once the hosted platform ships, some advanced features may land there first, so teams with strong operational requirements may want to monitor the roadmap before building deep workflow dependencies on the current CLI.