The batteries-included Python agent harness — planning, sub-agents, filesystem, shell, memory, and skills bundled in, built on LangGraph.
Deep Agents is an opinionated, open-source agent harness built on top of LangGraph. Where LangGraph gives you the graph runtime and LangChain’s create_agent gives you a minimal loop, Deep Agents assembles everything above that: sub-agent delegation, pluggable filesystem backends, context summarization, shell access, persistent memory, human-in-the-loop approval, and a skills system — all wired together and ready to run out of the box.
The framework is model-agnostic: any LLM that supports tool calling works, whether that’s a frontier API (Anthropic, OpenAI, Google), an open-weight model on Baseten or Fireworks, or a locally-hosted model via Ollama or vLLM. Provider and harness profiles let you tune prompting, tool visibility, and middleware per model spec without forking the core.
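The per-model profile idea can be pictured as a small registry keyed by model spec. The following is a minimal sketch under stated assumptions: HarnessProfile, register_profile, resolve_profile, visible_tools, and the "provider:model" spec format are hypothetical names chosen for illustration, not the Deep Agents API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HarnessProfile:
    """Hypothetical per-model overrides (not the real Deep Agents type)."""

    system_prompt_suffix: str = ""
    hidden_tools: frozenset[str] = field(default_factory=frozenset)


# Registry keyed by "provider:model" (exact match) or "provider" (fallback).
_PROFILES: dict[str, HarnessProfile] = {}


def register_profile(spec: str, profile: HarnessProfile) -> None:
    """Store overrides for a model spec or for a whole provider."""
    _PROFILES[spec] = profile


def resolve_profile(spec: str) -> HarnessProfile:
    """Prefer an exact model match, then the provider, then empty defaults."""
    provider = spec.split(":", 1)[0]
    for key in (spec, provider):
        if key in _PROFILES:
            return _PROFILES[key]
    return HarnessProfile()


def visible_tools(spec: str, tools: list[str]) -> list[str]:
    """Filter the tool list according to the resolved profile."""
    hidden = resolve_profile(spec).hidden_tools
    return [tool for tool in tools if tool not in hidden]
```

With a layout like this, application code only ever asks for the resolved profile, so tuning a model means registering a new entry rather than editing the agent itself.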
Every piece is designed to be overridden without touching the internals. Swap the filesystem backend for a sandboxed or remote variant, plug in a custom LangGraph CompiledStateGraph as a sub-agent, define your own tools or point at any MCP server, and layer in extra middleware for logging, rate limiting, or evaluation. Deep Agents also ships deepagents-code, a pre-built coding agent for the terminal powered by the same harness — comparable to Claude Code or Cursor but backed by any LLM you choose.
create_deep_agent() factory that wires planning, filesystem, sub-agent dispatch, context summarization, and permissions middleware into a single deployable LangGraph graph
task() tool
deepagents-code bundles the harness into a terminal-based coding agent (like Claude Code) that can read, edit, and refactor codebases using any LLM
Architecture
Deep Agents is structured as a layered middleware composition system on top of LangGraph’s compiled state graph runtime. The create_deep_agent() factory assembles an ordered pipeline of middleware — including filesystem, sub-agent dispatch, context summarization, skills loading, memory injection, and human-in-the-loop interrupts — each implementing a common protocol that wraps the model’s request/response cycle. This produces a single CompiledStateGraph with a well-defined DeepAgentState schema, allowing the result to itself be slotted in as a sub-agent elsewhere. The design enforces a clear separation between the graph runtime (owned by LangGraph), the agent loop (owned by LangChain’s create_agent), and the harness middleware (owned by Deep Agents), with extension points at each layer rather than requiring forks of any layer.
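The ordered-pipeline idea described above can be sketched in a few lines. This is a simplified stand-in, not the Deep Agents or LangChain implementation: the Middleware protocol, the dict-based request and response, and the InjectMemory and CountCalls classes are assumptions made for illustration.

```python
from typing import Callable, Protocol

Request = dict   # simplified stand-in for a model request
Response = dict  # simplified stand-in for a model response
Handler = Callable[[Request], Response]


class Middleware(Protocol):
    """Common protocol: wrap one model call and decide how to delegate."""

    def wrap_model_call(self, request: Request, handler: Handler) -> Response: ...


class InjectMemory:
    """Appends a (hypothetical) memory note before the model sees the request."""

    def wrap_model_call(self, request: Request, handler: Handler) -> Response:
        system = request.get("system", "") + "\n[memory] user prefers concise answers"
        return handler({**request, "system": system})


class CountCalls:
    """Counts how many model calls pass through the pipeline."""

    def __init__(self) -> None:
        self.calls = 0

    def wrap_model_call(self, request: Request, handler: Handler) -> Response:
        self.calls += 1
        return handler(request)


def _bind(mw: Middleware, next_handler: Handler) -> Handler:
    # A helper function gives each layer its own closure over mw and next_handler.
    def wrapped(request: Request) -> Response:
        return mw.wrap_model_call(request, next_handler)

    return wrapped


def compose(middleware: list[Middleware], model: Handler) -> Handler:
    """Wrap the model so the first middleware in the list is the outermost layer."""
    handler = model
    for mw in reversed(middleware):
        handler = _bind(mw, handler)
    return handler


def echo_model(request: Request) -> Response:
    """Fake model that reports the system prompt it received."""
    return {"system_seen": request["system"]}
```

Because every layer shares the same signature, reordering or dropping a concern is a change to the list passed to compose, not to the loop itself.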
Tech Stack
Deep Agents is written in Python 3.11+ and built directly on LangGraph for graph execution, checkpointing, and streaming, and on LangChain for the agent loop and tool abstractions. It ships with first-class support for Claude models via langchain-anthropic and Gemini via langchain-google-genai, but any LangChain-compatible chat model that supports tool calling works. Filesystem operations use a protocol-backed backend system with concrete implementations for the local filesystem (using wcmatch for glob), Modal, Daytona, RunLoop, and Vercel Sandbox environments. The shell backend similarly abstracts local and sandboxed subprocess execution. LangSmith is the integrated observability layer for tracing, evaluation, and deployment. The monorepo also contains a TypeScript sibling (deepagentsjs), an ACP integration, an evaluations suite, and partner integration packages, all managed with uv and ruff.
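A protocol-backed backend means callers depend on a small interface rather than a concrete filesystem. The sketch below is illustrative only: FilesystemBackend, InMemoryBackend, and summarize_tree are hypothetical names, and the glob uses the standard library's fnmatch instead of wcmatch to stay dependency-free.

```python
from fnmatch import fnmatch
from typing import Protocol


class FilesystemBackend(Protocol):
    """Hypothetical minimal interface a backend would need to satisfy."""

    def read(self, path: str) -> str: ...
    def write(self, path: str, content: str) -> None: ...
    def glob(self, pattern: str) -> list[str]: ...


class InMemoryBackend:
    """Toy backend that keeps files in a dict; useful for tests."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}

    def read(self, path: str) -> str:
        return self._files[path]

    def write(self, path: str, content: str) -> None:
        self._files[path] = content

    def glob(self, pattern: str) -> list[str]:
        # fnmatch's "*" also matches "/", so "*.py" matches nested paths.
        return sorted(p for p in self._files if fnmatch(p, pattern))


def summarize_tree(backend: FilesystemBackend, pattern: str) -> dict[str, int]:
    """Works with any backend that implements the protocol."""
    return {path: len(backend.read(path)) for path in backend.glob(pattern)}
```

A sandboxed or remote variant would implement the same three methods, and summarize_tree would not need to change.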
Code Quality
The codebase demonstrates high discipline: comprehensive type annotations with strict ruff linting (ALL rules enabled, minimal ignores), Google-style docstrings, and an extensive test suite organized into unit, integration, and benchmark directories. Tests use pytest with pytest-asyncio, pytest-socket for network isolation, and pytest-codspeed for performance regression detection. CI runs lint, type-checking with ty, unit tests, and benchmarks in separate GitHub Actions workflows. Deprecation handling uses a dedicated deprecation.py module with explicit warnings rather than silent removal. The middleware module shows careful API discipline: internal state fields are name-mangled, and required scaffolding middleware is protected at construction time with clear error messages.
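Explicit deprecation warnings of the kind mentioned above are commonly built on Python's standard warnings module. The following is a generic sketch, not the contents of the project's deprecation.py: the deprecated decorator, old_loader, new_loader, and the version string are hypothetical.

```python
import functools
import warnings


def deprecated(since: str, alternative: str):
    """Decorator that emits a DeprecationWarning each time the function is called."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since {since}; use {alternative} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def new_loader(path: str) -> str:
    """Hypothetical replacement API."""
    return f"loaded:{path}"


@deprecated(since="0.0.0-example", alternative="new_loader")
def old_loader(path: str) -> str:
    """Hypothetical legacy API that keeps working but warns."""
    return new_loader(path)
```

The old entry point keeps its behavior, so callers get a migration window instead of a silent break.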
What Makes It Unique
Deep Agents occupies a distinct position in the ecosystem by being explicitly opinionated at a higher layer than LangGraph or LangChain’s create_agent, while remaining fully composable with them. Its layered middleware composition model — where each concern (filesystem, sub-agents, memory, skills) is a self-contained, orderable, and optionally excludable unit — allows practitioners to customize the harness without forking it, a meaningful improvement over frameworks that bake behavior into the core loop. The skills system, which implements a portable SKILL.md specification with progressive-disclosure loading from layered sources, enables behavior sharing across agent deployments in a way analogous to how plugins work in development tools. The harness profile registry, which allows per-model-spec tuning of prompting and middleware without touching application code, is a pragmatic answer to the real-world problem that different frontier models need substantially different prompting and tool-use strategies.
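Progressive-disclosure loading can be illustrated with a short sketch: index only each skill's name and description up front, and read the full body when the skill is actually used. The code below rests on assumptions that are not confirmed by the text above: that a SKILL.md file starts with a simple "key: value" frontmatter block between --- lines, that each skill lives in its own directory, and that later sources override earlier ones. SkillStub, discover_skills, and load_skill_body are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillStub:
    """Cheap metadata kept in context; the body stays on disk until needed."""

    name: str
    description: str
    path: Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a minimal `key: value` block delimited by --- lines (assumed format)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(sources: list[Path]) -> dict[str, SkillStub]:
    """Index skills from layered sources; later sources win on name clashes."""
    index: dict[str, SkillStub] = {}
    for source in sources:
        for skill_file in sorted(source.glob("*/SKILL.md")):
            meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
            name = meta.get("name", skill_file.parent.name)
            index[name] = SkillStub(name, meta.get("description", ""), skill_file)
    return index


def load_skill_body(stub: SkillStub) -> str:
    """Read the full instructions only when the skill is selected."""
    text = stub.path.read_text(encoding="utf-8")
    parts = text.split("---", 2)
    if text.startswith("---") and len(parts) == 3:
        return parts[2].strip()
    return text.strip()
```

Only the stubs would be placed in the prompt, so the context cost grows with the number of skills in use rather than the number installed.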
Deep Agents is released under the MIT License, one of the most permissive open-source licenses available. This means you are free to use it commercially, modify it, incorporate it into proprietary products, and redistribute it without any copyleft obligations. There are no usage restrictions, no open-core tiers, and no license keys required for self-hosted deployments. The only obligation is to include the copyright notice and license text in any distribution.
Running Deep Agents yourself means operating a Python service and the LangGraph runtime it depends on. The infrastructure footprint is modest: the library itself is a pure Python dependency installable via uv add deepagents or pip. Production deployments, however, need to provision model API credentials, configure checkpointers and stores for persistence (LangGraph supports Redis, Postgres, and in-memory backends), and choose and secure filesystem/shell backends. For sandboxed code execution, integrations with Modal, Daytona, RunLoop, and Vercel Sandbox shift the sandbox management burden to those providers, but you still own orchestration, scaling, and uptime for your own service layer. Updates are frequent (the project ships multiple releases per week), so staying current requires a deliberate upgrade cadence.
LangChain, Inc. offers LangSmith, a paid managed platform for tracing, evaluation, monitoring, and deployment of agents built on LangGraph and Deep Agents. Self-hosters can use LangSmith Cloud or self-host LangSmith, but the latter requires its own infrastructure investment. The hosted option provides a managed tracing backend, dataset management for evals, prompt versioning, and an online evaluation suite — capabilities that take significant engineering effort to replicate on your own. LangGraph Cloud (also managed) adds deployment, autoscaling, and high-availability hosting for the graph runtime, removing the need to manage your own agent servers. If your team’s core value is not in operating AI infrastructure, the managed LangSmith + LangGraph Cloud combination reduces operational burden substantially compared to running everything in-house.