A fully local web research assistant that iteratively searches, summarizes, and refines markdown reports using any Ollama or LMStudio model—no cloud or API keys required.
Local Deep Researcher is an open-source autonomous research agent that runs entirely on your hardware. You give it a topic, and it generates a targeted web search query, retrieves results from your chosen search provider, and uses a locally hosted LLM to summarize what it found. It then reflects on its own summary to identify knowledge gaps, generates a follow-up query, and repeats the process for a configurable number of iterations.
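The cycle described above can be sketched in plain Python. This is a minimal illustration, not the project's code: `ResearchState`, `run_research`, the prompt strings, and the `llm` and `search` callables are all hypothetical stand-ins for the real LLM and search provider.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ResearchState:
    """Hypothetical container for what the loop accumulates."""
    topic: str
    query: str = ""
    summary: str = ""
    sources: list[str] = field(default_factory=list)
    loops: int = 0


def run_research(
    topic: str,
    llm: Callable[[str], str],
    search: Callable[[str], list[str]],
    max_loops: int = 3,
) -> ResearchState:
    """Run the search -> summarize -> reflect cycle a bounded number of times."""
    state = ResearchState(topic=topic)
    state.query = llm(f"Write a web search query about: {topic}")
    while state.loops < max_loops:
        results = search(state.query)
        state.sources.extend(results)
        # Fold the new results into the running summary.
        state.summary = llm(
            f"Update this summary: {state.summary}\nNew results: {results}"
        )
        state.loops += 1
        # Reflection: ask the model what is missing and use it as the next query.
        state.query = llm(f"Name a knowledge gap as a follow-up query: {state.summary}")
    return state
```

Passing the LLM and search provider in as callables keeps the loop itself independent of any particular backend, which mirrors the swappable-provider design described in the text.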
The project is built on LangGraph and exposes its research loop as a stateful graph. Each step of an iteration (query generation, web search, summarization, and reflection) is an explicit node in the graph, and routing between them is an explicit conditional edge, making the control flow transparent and inspectable in LangGraph Studio. The final output is a well-structured markdown report with deduplicated citations to every source consulted.
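Citation deduplication of the kind mentioned above might look like the following sketch. It assumes each source is a dict with `title` and `url` keys and that the URL is the identity of a source; the function name `format_sources` and the bullet format are illustrative, not taken from the project.

```python
def format_sources(sources: list[dict[str, str]]) -> str:
    """Return one bullet per unique URL, in first-seen order."""
    seen: set[str] = set()
    lines: list[str] = []
    for source in sources:
        url = source["url"]
        if url in seen:
            continue  # same page returned again by a later search iteration
        seen.add(url)
        lines.append(f"* {source['title']} : {url}")
    return "\n".join(lines)
```

Keeping first-seen order means the citation list reads in the order the research actually progressed.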
Local Deep Researcher supports multiple LLM backends (Ollama and LMStudio) and multiple search providers (DuckDuckGo by default, plus Tavily, Perplexity, and SearXNG). The model and search API are swappable at runtime via environment variables or the LangGraph Studio configuration tab, requiring no code changes. Tool calling mode is available as an alternative to JSON mode for models that support it.
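As a rough sketch of how settings could be resolved without code changes, the example below reads each field from an environment variable first, then from a runtime dict (standing in for values set in the LangGraph Studio configuration tab), then from a default. The class name, field names, default values, and the precedence order are assumptions for illustration; check the project's README for the actual variable names and priority.

```python
import os
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass(frozen=True)
class ResearchConfig:
    """Hypothetical settings; names and defaults are illustrative."""
    llm_provider: str = "ollama"
    local_llm: str = "llama3.2"
    search_api: str = "duckduckgo"
    max_web_research_loops: int = 3


def load_config(runtime: Optional[dict[str, Any]] = None) -> ResearchConfig:
    """Resolve each field from env var, then runtime value, then default."""
    runtime = runtime or {}
    values: dict[str, Any] = {}
    for f in fields(ResearchConfig):
        # Assumed precedence: environment variable wins over the runtime value.
        raw = os.environ.get(f.name.upper(), runtime.get(f.name))
        if raw is not None:
            # Environment variables are strings, so coerce to the field's type.
            values[f.name] = type(f.default)(raw)
    return ResearchConfig(**values)
```

With this shape, `load_config({"search_api": "tavily"})` switches the search provider for one run, while exporting `SEARCH_API` changes it for every run.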
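Because the LLM runs locally, your prompts, retrieved content, and summaries are never sent to a hosted model provider. Search queries do still go to whichever search provider you choose, unless you point the tool at a self-hosted SearXNG instance. This makes it suitable for privacy-sensitive research workflows, restricted networks with an internal search backend, or anyone who wants full control over which LLM handles their data.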
langgraph dev or Docker
Architecture
Local Deep Researcher is structured as a directed cyclic graph using LangGraph’s StateGraph primitive. The graph has five named nodes (generate_query, web_research, summarize_sources, reflect_on_summary, and finalize_summary) connected by deterministic edges and a single conditional routing function. The routing function reads research_loop_count from the shared SummaryState dataclass and decides whether to continue researching or finalize, creating an explicit bounded loop. Nodes do not mutate state in place: each node receives the current state and returns a dict of updates, which LangGraph merges into the state before the next node runs, so the research history accumulated so far is inspectable at every step. This approach cleanly separates concerns: LLM calls, search API calls, and state updates live in their own nodes, and routing logic lives in its own function, making the system easy to extend without touching unrelated code.
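The routing decision described above can be illustrated with a small standalone function. Only `SummaryState`, `research_loop_count`, and the two node names come from the text; the function name `route_research`, the `max_loops` parameter, and the exact comparison are assumptions, and the real state carries more fields than shown here.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class SummaryState:
    """Reduced version of the shared state; only the loop counter is needed here."""
    research_loop_count: int = 0


def route_research(
    state: SummaryState, max_loops: int = 3
) -> Literal["web_research", "finalize_summary"]:
    """Return the name of the next node based on how many loops have run."""
    if state.research_loop_count < max_loops:
        return "web_research"
    return "finalize_summary"
```

In a LangGraph application, a function of this shape would typically be attached as a conditional edge after the reflection node, so the loop bound lives in one place rather than being spread across nodes.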
Tech Stack
The project is written in Python 3.10+ and built on LangGraph for graph orchestration and LangChain Core for LLM abstractions. It supports two local LLM runtimes—Ollama via langchain-ollama and LMStudio via a custom ChatLMStudio wrapper that speaks the OpenAI-compatible API. Search is pluggable: DuckDuckGo through duckduckgo-search, Tavily through tavily-python, Perplexity via direct HTTPX calls, and SearXNG through langchain-community’s SearxSearchWrapper. Web content is converted to clean markdown using markdownify. Configuration is a Pydantic BaseModel that merges environment variables and LangGraph’s RunnableConfig at runtime. The package is built with setuptools and managed with uv; a Dockerfile wraps everything for containerized deployment.
Code Quality
The codebase is compact and well-organized—seven source files totalling under 700 lines. Each node function is a pure function with typed parameters and a Google-style docstring. The project uses ruff for linting and import sorting with an explicit rule selection (E, F, I, D, T201, UP). Type hints are present throughout using standard Python typing. No test files are included in the repository, which is a gap for a project that integrates multiple external APIs. Error handling is pragmatic: structured output failures fall back to a sensible default query rather than crashing the loop, and thinking-token stripping is applied defensively. The README is thorough with setup instructions for multiple platforms, LLM providers, and search APIs.
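The fallback behavior described above could be implemented along these lines. This is a sketch under assumptions: the function name `extract_query`, the `query` key, and the wording of the default query are illustrative rather than copied from the repository.

```python
import json


def extract_query(llm_output: str, topic: str) -> str:
    """Pull the "query" field from JSON output, or fall back to a default."""
    fallback = f"Tell me more about {topic}"
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError:
        return fallback  # the model did not return valid JSON
    if not isinstance(data, dict):
        return fallback  # valid JSON, but not an object
    query = data.get("query")
    # Accept only a non-empty string; anything else uses the default.
    return query if isinstance(query, str) and query.strip() else fallback
```

Returning a usable query in every case means one malformed model response costs a weaker search, not a failed run.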
What Makes It Unique
What distinguishes Local Deep Researcher from other research agents is the combination of a fully local execution model with a graph-native iterative loop. Unlike RAG pipelines that retrieve once, this tool explicitly models the “reflect to identify gaps, then search again” cycle as a first-class control flow construct in the graph—making the research loop transparent, debuggable, and extensible. The dual support for JSON mode and tool calling (switchable at runtime) allows it to work with models that have inconsistent structured-output behavior, including reasoning models like DeepSeek R1 that produce <think> tokens before their JSON output. The LangGraph Studio integration means the full graph state is visually inspectable mid-execution, which is unusual for research automation tools aimed at local deployment.
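Stripping reasoning tokens before parsing can be done with a short regular expression. The sketch below assumes the model wraps its reasoning in matched `<think>` and `</think>` tags; the helper name and pattern are illustrative, and unclosed tags are not handled.

```python
import re

# Non-greedy match so that multiple separate blocks are removed individually.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_thinking_tokens(text: str) -> str:
    """Remove every <think>...</think> block and surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()
```

Running this on the raw model output first leaves only the JSON payload for the structured-output parser.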
Local Deep Researcher is released under the MIT License. This is one of the most permissive open-source licenses available: you can use, modify, distribute, and incorporate it into commercial products with minimal restrictions. There are no copyleft obligations, so you are not required to open-source derivative works. The only requirement is that the copyright notice and license text be included in any distribution.
Running Local Deep Researcher yourself means you are responsible for the full operational stack. At minimum, you need a machine capable of running a local LLM—typically 8 GB of GPU VRAM for a 7–8B parameter model, significantly more for larger ones. You must separately install and maintain Ollama or LMStudio, pull and manage model weights, and keep the search API credentials (if using Tavily or Perplexity) rotated and secured. The LangGraph dev server is designed for local development use; running it in production requires additional hardening. There is no built-in persistence, authentication, rate limiting, or horizontal scaling—those are your responsibility to add.
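As a back-of-the-envelope check on the VRAM figure above, the memory needed for model weights alone is roughly the parameter count times the bytes per weight. The helper below is a hypothetical illustration; it ignores the KV cache, activations, and runtime overhead, and the 4-bit case assumes the model is served in a quantized format.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


print(weight_memory_gb(8, 4))   # 4-bit quantized 8B model
print(weight_memory_gb(8, 16))  # the same model at 16-bit precision
```

An 8B model needs about 4 GB for weights at 4 bits per weight, which leaves headroom on an 8 GB card, while the same model at 16 bits needs about 16 GB for weights alone.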
There is no hosted or commercial version of this specific project. LangSmith (the LangChain observability platform) can be connected for tracing and debugging, and LangGraph Cloud offers managed deployment of LangGraph applications if you want to move beyond local execution—but neither is required. Compared to fully managed research APIs like Perplexity or services built on GPT-4, self-hosting means no SLA, no managed upgrades, and no support beyond GitHub issues; in return, you get full data privacy, no per-query costs, and complete control over which model and search provider your research pipeline uses.
Automation · Productivity · AI Assistants
Build, deploy, and run autonomous AI agents that automate complex multi-step workflows using a visual block-based graph editor.
No Code Platforms · AI Development · Developer Tools
Visual LLM workflow platform with RAG pipelines, agent capabilities, and model management for building production AI applications.
AI Code Assistants · AI Development
Orchestrate an army of AI coding agents—Claude Code, Codex, Gemini CLI, and more—running simultaneously in isolated git worktrees from a single Electron desktop app.