codegraph

Name: codegraph
Rating: 5 (57577 reviews)

A pre-indexed code knowledge graph that cuts AI tool calls by 58% and token costs by 16% — auto-syncing, 100% local, works with Claude Code, Cursor, Codex, and more.

57.6K stars

3.5K forks

MIT License

TypeScript


CodeGraph is a local code intelligence layer that builds a persistent, pre-indexed knowledge graph of your entire codebase using tree-sitter AST parsing across 20+ languages. Instead of having AI agents repeatedly grep files and read source to answer architecture questions, CodeGraph lets them query a SQLite graph of every symbol, call edge, and file relationship — delivering answers in one tool call at sub-millisecond speed.

Benchmarked across seven real-world open-source codebases in seven languages, CodeGraph reduces token consumption by an average of 47%, cuts tool calls by 58%, and shortens wall-clock session time by 22%, with cost reductions reaching 40% on smaller repositories like Alamofire and OkHttp. These are not synthetic benchmarks: each run used Claude Opus running headlessly to answer architecture questions with and without the graph, across VS Code, Django, Tokio, Gin, Excalidraw, OkHttp, and Alamofire.

The MCP server integrates with Claude Code, Cursor, Codex CLI, OpenCode, Hermes Agent, Gemini CLI, Antigravity IDE, and Kiro via a single codegraph install command that auto-detects which agents are present and wires in the MCP configuration. Once initialized in a project with codegraph init, a file watcher keeps the index fresh within roughly one second of any file change — meaning the graph is never stale while an agent is actively editing code.
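One common way for a file watcher to stay within about a second of live edits is to coalesce bursts of change events and re-index a file only once it has gone quiet. The sketch below illustrates that general technique; it is not taken from CodeGraph's source, and the class name, the `quiet_seconds` setting, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeCoalescer:
    """Collects changed paths and releases them once edits go quiet.

    quiet_seconds is a hypothetical setting, not a documented CodeGraph option.
    """

    quiet_seconds: float = 0.5
    _pending: dict[str, float] = field(default_factory=dict)

    def record(self, path: str, now: float) -> None:
        # Repeated saves of the same file collapse into one pending entry.
        self._pending[path] = now

    def due(self, now: float) -> list[str]:
        # Release only paths that have been quiet for at least quiet_seconds.
        ready = sorted(
            p for p, t in self._pending.items() if now - t >= self.quiet_seconds
        )
        for p in ready:
            del self._pending[p]
        return ready


coalescer = ChangeCoalescer(quiet_seconds=0.5)
coalescer.record("src/a.ts", now=0.0)
coalescer.record("src/a.ts", now=0.2)  # second save of the same file
coalescer.record("src/b.ts", now=0.6)
first_batch = coalescer.due(now=0.8)   # a.ts has been quiet long enough, b.ts has not
second_batch = coalescer.due(now=1.2)  # b.ts is now due
```

Passing `now` explicitly keeps the sketch deterministic; a real watcher would read a monotonic clock and hand each released batch to the indexer.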

The project is entirely self-contained: no Node.js installation is required on the host machine since the CLI bundles its own runtime. Cross-platform installers for macOS, Linux, and Windows cover the common setup path, and npm installation remains available for teams already running Node. At version 1.0.1 with over 50,000 stars and 126 commits per month, CodeGraph has rapidly become a popular tool for developers seeking to reduce the cost and latency of AI-assisted code exploration.

What You Get

A local SQLite knowledge graph storing every symbol, call edge, import, and file relationship across your codebase, queryable at sub-millisecond speed
An MCP server exposing codegraph_explore, codegraph_search, codegraph_node, codegraph_callers, and related tools that agents call instead of reading files
Automatic file watcher that keeps the index within ~1 second of live file edits — no manual re-indexing needed while agents are editing code
A one-command installer (codegraph install) that auto-detects Claude Code, Cursor, Codex CLI, OpenCode, Hermes Agent, Gemini CLI, Antigravity, and Kiro and wires the MCP config into each
A self-contained CLI binary with bundled Node.js runtime — no Node installation required on the host machine, works the same across macOS, Linux, and Windows
Blast-radius analysis via codegraph_callers that maps every call site including callback registrations, enabling agents to see what would break before making a change
An in-place updater (codegraph upgrade) that detects whether the CLI was installed via bundle, npm, or npx and updates accordingly
Benchmarked performance results: on average 47% fewer tokens, 58% fewer tool calls, and 22% faster sessions across seven real-world open-source codebases

Common Use Cases

Reducing AI session costs — teams using Claude Code or Cursor on large monorepos replace expensive file-scanning sub-agents with single graph queries, cutting per-session token spend by 16-40%
Architecture onboarding — developers joining a new codebase query codegraph_explore with natural language questions and receive verbatim source of relevant symbols in one round-trip instead of navigating directories manually
Refactor planning — engineers about to rename or change a core abstraction call codegraph_callers to get every call site including callback registrations before touching a single line
AI-assisted debugging — agents answering ‘how does X reach Y’ queries follow the call path through dynamic dispatch hops that grep cannot trace, surfacing the exact code involved
Multi-agent workflows — orchestrators querying large codebases keep sub-agents’ tool call budgets low by routing codebase questions through the graph rather than spawning read-heavy Explore agents
Cross-IDE portability — developers switching between Claude Code, Cursor, and Codex CLI share the same .codegraph/ index per project, so switching tools does not require re-indexing
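The refactor-planning case above amounts to a reverse traversal of the call graph: start at the symbol being changed and walk caller edges until no new symbols appear. The following is a minimal sketch of that idea, assuming call edges are available as (caller, callee) pairs. The edge data and the `blast_radius` helper are hypothetical and do not reflect how codegraph_callers is implemented.

```python
from collections import defaultdict, deque

# Hypothetical call edges as (caller, callee) pairs; a real graph would come from the index.
CALL_EDGES = [
    ("handle_request", "parse_body"),
    ("handle_upload", "parse_body"),
    ("main", "handle_request"),
    ("register_routes", "handle_upload"),  # e.g. a callback registration
    ("main", "log_startup"),
]


def blast_radius(target: str, edges: list[tuple[str, str]]) -> list[str]:
    """Return every symbol that directly or transitively calls `target`."""
    callers_of = defaultdict(set)
    for caller, callee in edges:
        callers_of[callee].add(caller)

    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers_of[current]:
            # Skip symbols already visited and guard against recursive cycles.
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)


affected = blast_radius("parse_body", CALL_EDGES)
```

Here `affected` contains both direct callers and the symbols that reach them, which is the set an engineer would review before changing `parse_body`.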

Under The Hood

Architecture

CodeGraph follows a clean separation between its extraction pipeline, graph storage layer, sync daemon, and MCP transport, with each concern isolated so that tree-sitter grammar loading, SQLite writes, file watching, and MCP request handling can evolve independently. The CLI entry point dispatches into discrete command handlers for install, init, index, sync, and serve, with the MCP daemon running as a long-lived sidecar process managed through a registry of active sessions. Tool handlers are intentionally lazy-loaded off the MCP startup path to avoid pulling in the SQLite layer before the daemon binds. Error conditions that should stop an agent from retrying (security path refusals) are distinguished from recoverable guidance responses (unindexed projects), preventing the common failure mode where an early isError: true teaches agents that the toolset is broken.
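The split between terminal errors and recoverable guidance can be pictured as a small mapping from exception type to tool result. This is a sketch under assumptions: the exception names match those mentioned on this page, but the `to_tool_result` helper, the result shape, and the message wording are hypothetical and not drawn from the CodeGraph source.

```python
class NotIndexedError(Exception):
    """Recoverable: the project has no index yet."""


class PathRefusalError(Exception):
    """Terminal: the requested path failed a security check."""


def to_tool_result(run_tool) -> dict:
    """Run a tool handler and shape its outcome as an MCP-style result dict."""
    try:
        text = run_tool()
        return {"isError": False, "content": [{"type": "text", "text": text}]}
    except NotIndexedError:
        # Guidance rather than failure: tell the agent how to proceed.
        msg = "Project is not indexed yet. Run `codegraph init` and retry."
        return {"isError": False, "content": [{"type": "text", "text": msg}]}
    except PathRefusalError as exc:
        # Terminal: flag the error so the agent does not keep retrying.
        return {"isError": True, "content": [{"type": "text", "text": f"Refused: {exc}"}]}


def unindexed() -> str:
    raise NotIndexedError()


def escaping_path() -> str:
    raise PathRefusalError("path escapes the project root")


guidance = to_tool_result(unindexed)
refusal = to_tool_result(escaping_path)
```

Only the refusal carries `isError: True`, so an agent that first touches an unindexed project sees instructions instead of a failure signal.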

Tech Stack

The core runtime is TypeScript compiled to CommonJS with strict mode enabled, targeting Node.js 20-24. Symbol extraction uses tree-sitter via WebAssembly (web-tree-sitter with per-language WASM grammars), with a dedicated worker thread that is recycled every 250 files to reclaim WASM linear memory, which cannot shrink under the WebAssembly specification. The knowledge graph is stored in a local SQLite database using a schema with nodes, edges, files, and unresolved-refs tables, backed by a full-text search virtual table (FTS5) for name queries and composite indexes on source/target/kind for graph traversal. The MCP server implements the Model Context Protocol transport layer, and the installer uses @clack/prompts for interactive terminal UI. The bundled binary packages the compiled output with a Node.js runtime using a custom bundling pipeline described in BUNDLING.md.
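To make the storage layout more concrete, here is a minimal sketch of a nodes/edges schema with a composite index and a direct-callers query. The table and column names are assumptions for illustration only. CodeGraph's actual DDL is not shown on this page, and the files table, unresolved-refs table, and FTS5 virtual table are omitted.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative, not CodeGraph's actual DDL.
SCHEMA = """
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL,          -- e.g. 'function', 'class'
    file TEXT NOT NULL
);
CREATE TABLE edges (
    source INTEGER NOT NULL REFERENCES nodes(id),
    target INTEGER NOT NULL REFERENCES nodes(id),
    kind   TEXT NOT NULL         -- e.g. 'calls', 'imports'
);
-- Composite index so "who points at this node with this edge kind" is an index lookup.
CREATE INDEX edges_target_kind ON edges (target, kind, source);
"""


def direct_callers(db: sqlite3.Connection, symbol: str) -> list[str]:
    """Return the names of symbols with a 'calls' edge into `symbol`."""
    rows = db.execute(
        """
        SELECT caller.name
        FROM edges
        JOIN nodes AS callee ON callee.id = edges.target
        JOIN nodes AS caller ON caller.id = edges.source
        WHERE callee.name = ? AND edges.kind = 'calls'
        ORDER BY caller.name
        """,
        (symbol,),
    ).fetchall()
    return [name for (name,) in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany(
    "INSERT INTO nodes (id, name, kind, file) VALUES (?, ?, ?, ?)",
    [
        (1, "parse_body", "function", "src/http.ts"),
        (2, "handle_request", "function", "src/server.ts"),
        (3, "handle_upload", "function", "src/upload.ts"),
    ],
)
db.executemany(
    "INSERT INTO edges (source, target, kind) VALUES (?, ?, ?)",
    [(2, 1, "calls"), (3, 1, "calls"), (3, 2, "imports")],
)
callers = direct_callers(db, "parse_body")
```

Filtering on edge kind inside the query keeps the import edge out of the caller list, which is the kind of lookup a composite index on target and kind is meant to serve.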

Code Quality

The project has an extensive test suite with over 80 test files covering extraction, graph queries, MCP daemon lifecycle, installer targets, sync hooks, worktree detection, security path validation, telemetry redaction, and evaluation harnesses for real-world benchmarking. Tests run under Vitest with a Node environment and include both unit tests for individual modules and integration tests that spawn real CLI and MCP server processes. TypeScript strict mode is enforced throughout, and the codebase consistently uses typed error classes (NotIndexedError, PathRefusalError) to distinguish recoverable from terminal failure modes. Comments throughout the source explain non-obvious decisions, especially around WASM memory management, lazy loading, and the rationale for composite database indexes.

What Makes It Unique

Most code intelligence tools are built for IDE navigation or CI analysis, not for reducing AI agent token consumption at runtime. CodeGraph’s core insight is treating the pre-built graph as a cached intelligence layer that agents query directly, eliminating the grep-find-read discovery loop that dominates token spend in large codebase sessions. The adaptive output budget system scales explore call results to project size: tighter caps on small repos to avoid dumping whole files into context, larger caps on multi-thousand-file codebases where discovery cost would otherwise dominate. Blast-radius awareness via callback registration tracking extends caller analysis to dynamic dispatch patterns that static grep cannot reach, and the worker-thread recycling strategy for WASM grammars is a direct response to the WebAssembly linear memory constraint that most tree-sitter integrations silently leak.
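The recycling strategy can be illustrated with a loop that replaces its worker after a fixed number of files, so memory that can only grow is released by discarding the worker. This is a sketch of the general pattern, not CodeGraph's code. `ParserWorker` and `parse_all` are hypothetical stand-ins, and only the 250-file threshold comes from the description above.

```python
from typing import Iterable


class ParserWorker:
    """Stand-in for a worker that holds parser memory for its lifetime."""

    instances_created = 0

    def __init__(self) -> None:
        ParserWorker.instances_created += 1
        self.files_parsed = 0

    def parse(self, path: str) -> str:
        self.files_parsed += 1
        return f"parsed:{path}"

    def close(self) -> None:
        # In a real worker, terminating here is what releases its memory.
        pass


def parse_all(paths: Iterable[str], recycle_every: int = 250) -> list[str]:
    """Parse every path, replacing the worker after `recycle_every` files."""
    results = []
    worker = ParserWorker()
    for path in paths:
        if worker.files_parsed >= recycle_every:
            worker.close()
            worker = ParserWorker()
        results.append(worker.parse(path))
    worker.close()
    return results


outputs = parse_all([f"src/file_{i}.ts" for i in range(600)])
workers_used = ParserWorker.instances_created
```

With 600 files and a threshold of 250, the loop goes through three workers, and no single worker ever holds state for more than 250 files.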

Self-Hosting

CodeGraph is released under the MIT License, which is one of the most permissive open-source licenses available. You can use it commercially, modify it freely, embed it in proprietary products, and redistribute it without any copyleft obligation — the only requirement is retaining the copyright notice. There are no dual-license restrictions, no open-core enterprise tiers hidden behind the MIT layer, and no usage-based fees associated with the open-source release.

Running CodeGraph yourself is operationally lightweight. Each project maintains a .codegraph/ directory containing the SQLite database, and the file watcher runs as a daemon sidecar process managed by the CLI. There are no external services, network calls, or cloud dependencies — everything is local. Infrastructure requirements are minimal: a supported Node.js runtime (20-24) or the bundled binary, and disk space proportional to the codebase being indexed. The daemon handles its own lifecycle, and codegraph upgrade handles in-place updates. Teams deploying this in CI or on shared developer machines should be aware that indexing large codebases takes time on first run and that the daemon needs to remain alive for auto-sync to function.

At present, CodeGraph is entirely open-source, with no paid tier or managed cloud offering available. The README notes that a hosted ‘CodeGraph Platform’ is in development, promising per-PR impact analysis, business logic validation, and test scope recommendations, with a public waitlist at getcodegraph.com. Self-hosted users today get all current functionality with no feature gates, but there is no SLA, enterprise support contract, or managed upgrade path. Once the hosted platform ships, some advanced features may land there first, so teams with strong operational requirements may want to monitor the roadmap before building deep workflow dependencies on the current CLI.

Repository Health

Pre-computed score based on development activity, maintenance, community, maturity, and trend momentum.

81/100 Excellent

Development Activity: 96

Maintenance: 100

Community: 64

Maturity: 24

Momentum: 40

Growing community support
Very active development
Well-maintained with consistent updates
Rapidly growing project

Technical Analysis

85/100 Excellent

Architecture: 85

Code Quality: 88

Innovation: 90

Learning Curve: 75

Repository Stats

Contributors

Total Commits: 650

Monthly Commits: 158

Watchers: 133

Repo Age: 6 months

Last Commit: 2 days ago

Built With

TypeScript 92.9%

Recent Releases

25 total

~4.6 releases/month
