FastGPT

Build, debug, and deploy knowledge-based AI agents with a visual workflow editor, RAG retrieval, and support for any OpenAI-compatible LLM.

27.3K stars
7K forks
Other
TypeScript

FastGPT is an open-source AI Agent platform built on Next.js and TypeScript that enables teams to create production-grade chatbots and knowledge assistants through a visual drag-and-drop workflow editor. It targets developers, enterprise teams, and AI product builders who need to ship domain-specific AI applications without deep LLM engineering expertise. The platform handles everything from document ingestion and vector embedding to multi-step workflow orchestration and real-time debugging, making it practical to deploy AI assistants that reason over private data.

The architecture is a monorepo with three clearly separated layers: shared global types and Zod schemas, a backend service package with MongoDB models and a workflow execution engine, and a Next.js full-stack application that ties them together. FastGPT integrates natively with any OpenAI-compatible API endpoint, including Claude, DeepSeek, and Qwen, and supports multiple vector database backends including PostgreSQL/pgvector, Milvus, and Zilliz. An optional AI Proxy layer provides load balancing and model aggregation across providers.

What sets FastGPT apart operationally is its bidirectional MCP (Model Context Protocol) support: it can consume tools from external MCP servers and simultaneously expose its own workflows as MCP endpoints, making any FastGPT app composable into larger AI systems. The platform ships with a separate code-sandbox microservice using Bun and Hono for isolated JavaScript execution, a standalone MCP server project, and Helm charts for Kubernetes deployment. A cloud-hosted version at fastgpt.io is available alongside full Docker self-hosting.

FastGPT has accumulated more than 27,000 GitHub stars since its 2023 launch and ships multiple releases per month. It has 148 contributors and an active Chinese-speaking community, supported by Feishu groups and official documentation in six languages.

What You Get

  • Visual Flow Workflow Editor - Drag-and-drop node graph for building AI workflows with nodes for LLM chat, knowledge base search, HTTP calls, code execution, conditional branching, loops, and parallel runs.
  • Multi-Format Knowledge Base - Import PDF, DOCX, PPTX, TXT, Markdown, CSV, XLSX, and web URLs with automated chunking, QA pair extraction, vector embedding, and hybrid search with reranking.
  • OpenAI-Compatible API - Full /v1/chat/completions endpoint that exposes any FastGPT app as a drop-in GPT replacement, enabling integration with existing clients, Discord bots, and third-party tools.
  • Bidirectional MCP Integration - Consume external MCP server tools inside workflows and expose any FastGPT app as an MCP endpoint so other AI systems can invoke it as a composable service.
  • RAG Debugging and Evaluation - Inspect the full call chain for each response, view retrieved chunks, modify or delete references inline, and evaluate answer quality with built-in scoring metrics.
  • Code Execution Sandbox - Run user-defined JavaScript inside isolated Bun processes with network restrictions and resource limits, enabling dynamic data transformation within workflows.
  • Template Marketplace - Pre-built AI agent templates for customer support, HR assistants, and product documentation that can be deployed and customized without starting from scratch.
  • Multi-Model and Multi-Provider Support - Connect any LLM with an OpenAI-compatible API, including OpenAI, Claude, DeepSeek, and Qwen, with optional AI Proxy for load balancing across providers.
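
To make the OpenAI-compatible endpoint concrete, the sketch below builds a request for /v1/chat/completions. It is an illustration rather than official client code: the base URL, the API key, and the `buildChatRequest` helper are hypothetical, and the body follows the standard OpenAI chat-completions shape. How a given FastGPT app interprets the `model` field is not covered here.

```typescript
// Minimal sketch, assuming the endpoint accepts the standard OpenAI
// chat-completions request shape. All names here are illustrative.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

function buildChatRequest(
  baseUrl: string, // assumption: the root under which /v1/chat/completions is served
  apiKey: string, // assumption: sent as a Bearer token, as OpenAI clients do
  model: string,
  messages: ChatMessage[],
  stream = false
): ChatRequest {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const url = `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}
```

In a runtime with a global `fetch` (for example Node 18+), the result can be sent with `fetch(req.url, req.init)`. Keeping the builder free of network calls makes it easy to unit-test and to reuse with any HTTP client.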

Common Use Cases

  • Customer support knowledge assistant - A SaaS team imports product documentation into FastGPT, builds a workflow that hybrid-searches the knowledge base and generates responses with citation references, then embeds it via iframe on their support portal.
  • Internal HR policy chatbot - An enterprise HR team ingests onboarding guides, policy documents, and benefits FAQs into a FastGPT knowledge base and deploys a Slack-connected assistant that routes employee questions through a classification node before answering.
  • Research paper summarization pipeline - A university lab imports academic PDFs, uses QA pair extraction to structure key findings, and builds a workflow that retrieves relevant papers and synthesizes summaries in plain language on demand.
  • Automated CRM data analysis agent - A financial firm connects FastGPT to their internal CRM via the HTTP node, builds a workflow that fetches client records, passes them through an LLM for analysis, and generates personalized summaries in a structured output format.
  • Multi-step technical support escalation - A software company creates a workflow that first classifies incoming support tickets, queries a knowledge base for known solutions, attempts automated resolution, and routes unresolved cases to a human via a webhook node.
  • API-driven document generation - A legal team uses FastGPT’s OpenAPI endpoint to programmatically submit contract parameters, run them through a workflow that retrieves relevant clauses and fills a template, and receive structured JSON output for downstream systems.

Under The Hood

Architecture

FastGPT follows a strictly layered monorepo pattern with three package tiers enforced via pnpm workspaces and TypeScript module boundaries. The packages/global layer contains shared Zod v4 schemas, constants, and typed enums that flow top-down into both packages/service (backend models, workflow engine, AI configuration) and packages/web (shared React components and hooks). The main application in projects/app is a Next.js full-stack app that consumes both service and web packages without any circular dependencies. Workflow nodes are defined as declarative templates with typed IO contracts: inputs and outputs carry Zod schemas with bilingual metadata. The workflow execution engine processes them as a directed graph with support for parallel branches, loop nodes, and context-aware Agent mode for long-horizon task decomposition. Independent microservices (code sandbox via Bun/Hono, MCP server, marketplace) are separately deployable and loosely coupled to the main app.
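
The idea of running declarative node templates as a directed graph can be sketched in a few lines. This is not FastGPT's engine: `NodeTemplate` and `runWorkflow` are hypothetical names, plain TypeScript types stand in for the Zod schemas, and nodes run synchronously. Nodes whose upstream outputs are all available form a "wave", which is the set a real engine could execute in parallel.

```typescript
// Hypothetical shapes for illustration; not FastGPT's actual types.
interface NodeTemplate {
  id: string;
  inputs: string[]; // ids of upstream nodes whose outputs this node reads
  run: (inputs: Record<string, unknown>) => unknown;
}

interface WorkflowResult {
  outputs: Map<string, unknown>;
  waves: string[][]; // node ids grouped by the step in which they became runnable
}

function runWorkflow(nodes: NodeTemplate[]): WorkflowResult {
  const outputs = new Map<string, unknown>();
  const waves: string[][] = [];
  let pending = [...nodes];

  while (pending.length > 0) {
    // A node is ready once every upstream output has been produced.
    const ready = pending.filter((n) => n.inputs.every((id) => outputs.has(id)));
    if (ready.length === 0) {
      throw new Error("workflow has a cycle or references a missing node");
    }
    // Nodes in one wave do not depend on each other, so a real engine
    // could run them concurrently; here they run one after another.
    const results = ready.map((n) => {
      const args: Record<string, unknown> = {};
      for (const id of n.inputs) args[id] = outputs.get(id);
      return { id: n.id, value: n.run(args) };
    });
    for (const r of results) outputs.set(r.id, r.value);
    waves.push(ready.map((n) => n.id));
    pending = pending.filter((n) => !ready.includes(n));
  }
  return { outputs, waves };
}
```

Because readiness is derived from the declared inputs, the order in which nodes are listed does not matter, and a cycle is reported as an error instead of looping forever. Loop nodes and Agent mode would need extra state that this sketch leaves out.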

Tech Stack

The platform runs on Next.js 16 with the Rspack bundler for substantially faster local development. The frontend is built with Chakra UI and SCSS modules, with i18next and next-i18next providing internationalization across six languages. The backend persists data in MongoDB via Mongoose, and vector search is pluggable across PostgreSQL/pgvector, Milvus, Zilliz, OceanBase, and OpenGauss. Object storage uses a MinIO-compatible API. The isolated code execution sandbox uses Bun as the JavaScript runtime with Hono as the HTTP server inside sandboxed process pools. Observability is handled by OpenTelemetry via LogTape with configurable OTLP export. The AI layer wraps the OpenAI SDK and routes calls through either direct provider endpoints or an optional AI Proxy service for load balancing. Deployment targets include Docker Compose for single-node and Helm charts for Kubernetes.

Code Quality

The project maintains an extensive test suite with over 470 test files spanning unit, integration, security, and boundary tests, executed via Vitest with mongodb-memory-server for in-memory database fixtures. Zod v4 is used throughout for runtime schema validation at API boundaries, with schemas shared between frontend and backend to prevent type drift. TypeScript strict mode is enabled across all packages. API routes use a centralized NextAPI middleware wrapper for consistent error handling and authentication enforcement. Security practices are concrete: SSRF protection in HTTP tool nodes with IP validation, sandboxed code execution with network restrictions, and permission-aware authorization checks that enforce team and ownership scope at the data layer rather than only at endpoints.
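
As an illustration of the IP-validation part of SSRF protection, the sketch below rejects IPv4 addresses in private, loopback, link-local, and other non-public ranges. It is a generic example, not FastGPT's implementation: `isBlockedIPv4` is a hypothetical helper, and a production check would also need to cover IPv6, resolve hostnames before validating, and re-check after redirects.

```typescript
// Minimal sketch of an IPv4 deny-list check; the function name is illustrative.
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split(".");
  // Fail closed: anything that is not a dotted quad is treated as blocked.
  if (parts.length !== 4) return true;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return true;

  const [a, b] = octets as [number, number, number, number];
  return (
    a === 0 || // 0.0.0.0/8, "this network"
    a === 10 || // 10.0.0.0/8, private
    a === 127 || // 127.0.0.0/8, loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10, carrier-grade NAT
    (a === 169 && b === 254) || // 169.254.0.0/16, link-local (cloud metadata lives here)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12, private
    (a === 192 && b === 168) || // 192.168.0.0/16, private
    a >= 224 // multicast and reserved space
  );
}
```

The function deliberately returns `true` for malformed input, so a parsing surprise blocks the request instead of letting it through.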

What Makes It Unique

FastGPT’s most technically distinctive contribution is bidirectional MCP integration: workflows can consume tools from external MCP servers and simultaneously expose any FastGPT app as an MCP endpoint, making it a first-class participant in multi-agent AI systems rather than a terminal consumer. The RAG pipeline goes beyond retrieval by exposing chunk-level references in the chat UI with inline edit and delete capability, enabling real-time knowledge base correction during conversations. The workflow runtime supports a context-aware Agent mode designed for long-horizon task decomposition alongside parallel execution branches and loop nodes with state tracking, a combination not commonly found in comparable open-source platforms. The Zod-schema-driven node template system generates configuration UI panels automatically from type definitions, eliminating the need to maintain separate frontend form specifications for workflow nodes.
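
The schema-driven UI idea can be illustrated without Zod. In the sketch below, a small hand-written descriptor type stands in for a Zod schema, and `toFormFields` derives a form specification from it. All names (`FieldSchema`, `FormField`, `toFormFields`, the widget names) are hypothetical and do not reflect FastGPT's actual node template format.

```typescript
// A plain descriptor type used here in place of Zod schemas (an assumption
// made to keep the example dependency-free).
type FieldSchema =
  | { kind: "string"; label: string; multiline?: boolean }
  | { kind: "number"; label: string; min?: number; max?: number }
  | { kind: "enum"; label: string; options: string[] };

interface FormField {
  key: string;
  label: string;
  widget: "input" | "textarea" | "numberInput" | "select";
  props: Record<string, unknown>;
}

// Derive one form field per schema entry, so the type definition is the
// single source of truth and no separate form spec has to be maintained.
function toFormFields(schema: Record<string, FieldSchema>): FormField[] {
  return Object.entries(schema).map(([key, field]): FormField => {
    switch (field.kind) {
      case "string":
        return { key, label: field.label, widget: field.multiline ? "textarea" : "input", props: {} };
      case "number":
        return { key, label: field.label, widget: "numberInput", props: { min: field.min, max: field.max } };
      case "enum":
        return { key, label: field.label, widget: "select", props: { options: field.options } };
    }
  });
}
```

Adding a new input to a node then only means adding one entry to its schema; the form picks it up on the next render.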

Self-Hosting

FastGPT uses a modified Apache License 2.0 with additional commercial restrictions. The core restriction is that running a multi-tenant SaaS service similar to fastgpt.io requires a separate commercial license obtainable from the team at dennis@sealos.io. Using FastGPT as a backend service within your own applications, or deploying it internally for enterprise use, is permitted without a commercial license. You also cannot remove or modify the FastGPT logo and copyright notices in the product console without a commercial license. For most self-hosting teams building internal tools or powering their own products, the open-source license is sufficient; the restriction targets businesses attempting to resell fastgpt.io-equivalent SaaS platforms.

Running FastGPT yourself is a meaningful infrastructure commitment. The stack requires MongoDB, a vector database (pgvector is the most common self-hosted choice, but Milvus or Zilliz work for larger deployments), and MinIO or compatible object storage. The application runs as multiple Docker services including the main Next.js app, a code sandbox service, and optionally an AI Proxy for model aggregation. Kubernetes deployments use the provided Helm charts. You are responsible for MongoDB backups, vector database scaling as knowledge bases grow, and keeping up with a fast release cadence that averages multiple versions per month — each release may require running a migration script via HTTP API before the new version starts correctly.

Compared with the hosted fastgpt.io cloud service, self-hosters give up managed infrastructure, automatic upgrades, and official SLA-backed uptime. The cloud tier also offers a commercial version with additional features (exact capabilities are documented at doc.fastgpt.io/guide/version/commercial), and the team provides scenario-specific onboarding support for commercial customers. The self-hosted community receives support mainly through GitHub issues and Feishu group chats, which are active but largely Chinese-language. For teams without dedicated DevOps capacity, the operational overhead of running FastGPT’s multi-service stack at scale warrants careful evaluation before committing to self-hosting.
