An open-source context database that gives AI agents a unified filesystem for memory, resources, and skills with hierarchical tiered retrieval.
OpenViking is a purpose-built context database for AI agents, developed by ByteDance’s Volcengine team. It replaces the fragmented approach of storing memories in code, resources in vector databases, and skills in scattered files with a single unified filesystem paradigm. By organizing context as a hierarchy of directories and files with URI-addressable nodes, agents can manage their knowledge base the same way a developer manages a local filesystem.
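To make the filesystem analogy concrete, here is a minimal sketch of path-based addressing over an in-memory tree. The `viking://` scheme comes from the project; the `parse_viking_uri` helper, the `ContextTree` class, and the example paths are hypothetical illustrations, not OpenViking's actual API.

```python
SCHEME = "viking://"


def parse_viking_uri(uri: str) -> tuple:
    """Split a viking:// URI into its path segments (hypothetical helper)."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"not a viking URI: {uri!r}")
    return tuple(part for part in uri[len(SCHEME):].split("/") if part)


class ContextTree:
    """Toy in-memory tree keyed by URI segments (illustration only)."""

    def __init__(self) -> None:
        self._nodes = {}

    def write(self, uri: str, content: str) -> None:
        self._nodes[parse_viking_uri(uri)] = content

    def read(self, uri: str) -> str:
        return self._nodes[parse_viking_uri(uri)]

    def ls(self, uri: str) -> list:
        """List the immediate children of a directory URI."""
        prefix = parse_viking_uri(uri)
        depth = len(prefix)
        children = {
            path[depth]
            for path in self._nodes
            if len(path) > depth and path[:depth] == prefix
        }
        return sorted(children)


tree = ContextTree()
tree.write("viking://resources/docs/api.md", "API reference")
tree.write("viking://resources/docs/guide.md", "User guide")
tree.write("viking://memories/user/prefs.md", "Prefers concise answers")
```

Listing `viking://resources/docs` on this toy tree returns the two file names under that directory, much as `ls` would on a local path.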
The core innovation is OpenViking’s three-tier context loading system (L0/L1/L2), where abstracts, overviews, and full content are stored at different levels and loaded on demand. Because full content is loaded only when the shorter tiers indicate it is relevant, this uses far fewer tokens than naive RAG approaches that retrieve full documents indiscriminately. Retrieval traverses the directory tree recursively, combining directory-level positioning with dense and sparse semantic vector search and reranking for high-precision context acquisition.
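The on-demand tier selection can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the `ContextNode` class, the relevance thresholds, the whitespace-based token estimate, and the fallback rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContextNode:
    uri: str
    abstract: str   # L0: shortest summary
    overview: str   # L1: medium-length overview
    content: str    # L2: full content

    def tier(self, level: int) -> str:
        return (self.abstract, self.overview, self.content)[level]


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def choose_level(relevance: float) -> int:
    """Assumed thresholds: more relevant nodes earn a deeper tier."""
    if relevance >= 0.8:
        return 2
    if relevance >= 0.5:
        return 1
    return 0


def load_context(scored_nodes, budget: int):
    """Load the deepest tier that relevance allows and the budget can fit."""
    loaded = []
    remaining = budget
    for node, relevance in sorted(scored_nodes, key=lambda p: p[1], reverse=True):
        level = choose_level(relevance)
        # Fall back to a shallower tier when the preferred one is too large.
        while level >= 0 and estimate_tokens(node.tier(level)) > remaining:
            level -= 1
        if level < 0:
            continue  # not even the abstract fits
        text = node.tier(level)
        remaining -= estimate_tokens(text)
        loaded.append((node.uri, level, text))
    return loaded
```

With a generous budget the most relevant node is loaded in full; as the budget shrinks, the same nodes degrade to overviews and abstracts instead of being dropped.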
OpenViking ships with automatic session management that compresses conversation history, extracts long-term memories from agent interactions, and maintains a hotness-weighted scoring system so frequently accessed context surfaces faster over time. The system integrates natively with MCP (Model Context Protocol), LangChain, and LangGraph, and supports a wide range of VLM and embedding providers including OpenAI, Volcengine Doubao, Kimi, GLM, and local Ollama models.
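One plausible way to express a hotness-weighted score is to combine access frequency with an exponential recency decay and blend the result into the semantic score. The formula, the one-day half-life, and the blend weight below are assumptions chosen for illustration; the source does not specify how OpenViking computes hotness.

```python
import math


def hotness(access_count: int, seconds_since_access: float,
            half_life_s: float = 86_400.0) -> float:
    """Frequency (log-damped) times recency (halves every half_life_s)."""
    frequency = math.log1p(access_count)
    recency = 0.5 ** (seconds_since_access / half_life_s)
    return frequency * recency


def blended_score(semantic: float, hot: float, weight: float = 0.2) -> float:
    """Mix a semantic score in [0, 1] with hotness squashed into [0, 1)."""
    squashed = hot / (1.0 + hot)
    return (1.0 - weight) * semantic + weight * squashed
```

Under this sketch, two contexts with the same semantic score are ordered by how often and how recently they were accessed, while a never-accessed context receives no boost.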
Beyond core context storage, OpenViking includes a Rust-based CLI tool and companion filesystem layer (RAGFS) compiled as a Python extension, a FastAPI-based REST server with OAuth and API key authentication, OpenTelemetry tracing, a visual web studio for browsing the context tree, and a vikingbot subsystem that connects agents to messaging platforms like Feishu, Telegram, Slack, DingTalk, and WeChat.
viking:// URI namespace, enabling intuitive path-based addressing
Architecture
OpenViking is organized as a layered, modular system built around a virtual filesystem abstraction. The central VikingFS layer translates viking:// URIs into storage operations, routing reads and writes through an async AGFS (Agent Filesystem) client compiled from Rust into a Python extension. Above VikingFS sits the retrieval layer, which implements hierarchical traversal using a HierarchicalRetriever: it fans out vector searches across directories, narrows the candidate set over multiple rounds until it converges, and applies rerank scoring with hotness weighting. The session layer wraps both: it intercepts incoming messages, externalizes tool outputs, triggers background compression jobs via APScheduler, and writes extracted memories back into the filesystem. This separation of concerns means context storage, retrieval strategy, and session lifecycle are independently composable, so an operator can swap vector backends or tune retrieval parameters without touching session logic.
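The round-by-round narrowing can be illustrated with a toy traversal. This sketch substitutes simple keyword overlap for vector search and reranking, and the `Dir` class, `fanout` parameter, and `retrieve` function are hypothetical names, not the HierarchicalRetriever's real interface.

```python
from dataclasses import dataclass, field


@dataclass
class Dir:
    name: str
    summary: str
    files: dict = field(default_factory=dict)     # file name -> text
    subdirs: list = field(default_factory=list)   # child Dir objects


def overlap(query: str, text: str) -> float:
    """Keyword overlap in [0, 1]; a stand-in for semantic similarity."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve(root: Dir, query: str, fanout: int = 2, top_k: int = 3):
    """Descend level by level, following only the best-matching directories."""
    hits = []
    frontier = [("viking://" + root.name, root)]
    while frontier:
        next_frontier = []
        for path, directory in frontier:
            for fname, text in directory.files.items():
                score = overlap(query, text)
                if score > 0:
                    hits.append((score, f"{path}/{fname}"))
            ranked = sorted(directory.subdirs,
                            key=lambda d: overlap(query, d.summary),
                            reverse=True)
            for sub in ranked[:fanout]:
                if overlap(query, sub.summary) > 0:
                    next_frontier.append((f"{path}/{sub.name}", sub))
        frontier = next_frontier
    hits.sort(key=lambda h: (-h[0], h[1]))
    return hits[:top_k]
```

The design point this shows is that directory-level scores prune the search: files inside a directory whose summary does not match the query are never scored at all, which keeps each round small.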
Tech Stack
The project is a Python 3.10+ package built with setuptools and maturin (for Rust extensions), using FastAPI and Uvicorn as the REST server framework and Pydantic v2 for all data models. Vector storage is abstracted behind pluggable adapter classes supporting VikingDB, Qdrant, local disk, and OpenGauss backends. Embeddings and VLM calls are dispatched through litellm, enabling transparent support for OpenAI, Volcengine Doubao, Kimi, GLM, Gemini, and Ollama. The CLI is a Rust binary wrapped in a thin Python entry point. OpenTelemetry (with OTLP gRPC and HTTP exporters) provides distributed tracing throughout. Tree-sitter parsers for ten languages handle code-aware chunking, and the web studio frontend is bundled as static assets served by the FastAPI app.
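A pluggable adapter layer of the kind described for vector storage might look like the sketch below. The `VectorBackend` protocol, its method names, and the `InMemoryBackend` class are hypothetical; the actual adapter interface is not documented in this text, and a real backend would delegate to VikingDB, Qdrant, or another store.

```python
import math
from typing import Optional, Protocol


class VectorBackend(Protocol):
    """Assumed minimal contract that every storage adapter would satisfy."""

    def upsert(self, uri: str, vector: list) -> None: ...

    def search(self, query: list, top_k: int) -> list: ...


def _cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryBackend:
    """Brute-force cosine search, useful only for tests and illustration."""

    def __init__(self) -> None:
        self._vectors = {}

    def upsert(self, uri: str, vector: list) -> None:
        self._vectors[uri] = vector

    def search(self, query: list, top_k: int) -> list:
        scored = [(uri, _cosine(query, vec)) for uri, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def nearest_uri(backend: VectorBackend, query: list) -> Optional[str]:
    """Caller code depends only on the protocol, not on a concrete backend."""
    hits = backend.search(query, top_k=1)
    return hits[0][0] if hits else None
```

Because `nearest_uri` is typed against the protocol, swapping the in-memory class for another adapter requires no change to the calling code.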
Code Quality
The codebase demonstrates strong engineering discipline with a comprehensive test suite organized by subsystem under a top-level tests/ directory, covering retrievers, session compression, storage adapters, observability, and CLI behavior. Tests use pytest with pytest-asyncio for async coverage and pytest-xdist for parallel execution. Ruff enforces code style with isort, pyflakes, and bugbear rules; mypy type-checking is configured but set to permissive mode, meaning type annotations are present but not exhaustively enforced. Loguru provides structured logging throughout. Error handling is explicit, with typed exception classes (NotFoundError, PermissionDeniedError, FailedPreconditionError) mapped to HTTP status codes at the server boundary rather than swallowed silently.
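The pattern of mapping typed exceptions to status codes at the boundary can be sketched without any web framework. The three exception names come from the text above; the `ContextError` base class, the specific status codes chosen, and the `to_http_response` function are assumptions for illustration.

```python
from http import HTTPStatus


class ContextError(Exception):
    """Hypothetical common base class for domain errors."""


class NotFoundError(ContextError):
    pass


class PermissionDeniedError(ContextError):
    pass


class FailedPreconditionError(ContextError):
    pass


# Assumed mapping; the project's real status codes may differ.
STATUS_BY_ERROR = {
    NotFoundError: HTTPStatus.NOT_FOUND,
    PermissionDeniedError: HTTPStatus.FORBIDDEN,
    FailedPreconditionError: HTTPStatus.PRECONDITION_FAILED,
}


def to_http_response(exc: Exception):
    """Translate an exception into (status code, JSON-like body)."""
    status = HTTPStatus.INTERNAL_SERVER_ERROR
    # Walk the MRO so subclasses inherit their parent's mapping.
    for cls in type(exc).__mro__:
        if cls in STATUS_BY_ERROR:
            status = STATUS_BY_ERROR[cls]
            break
    return int(status), {"error": type(exc).__name__, "message": str(exc)}
```

Keeping the mapping in one table at the server edge lets inner layers raise domain errors without knowing anything about HTTP.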
What Makes It Unique
OpenViking’s distinguishing contribution is applying the filesystem metaphor to agent context rather than treating it as a flat bag of vectors. The L0/L1/L2 tiered abstraction (where each context node maintains an automatically generated abstract and overview at shorter lengths alongside its full content) allows a retriever to decide how much detail to load based on relevance, rather than always fetching everything. The hierarchical recursive retriever with directory dominance scoring (a directory’s relevance must exceed a ratio of its best child’s score to be promoted) is a novel approach to multi-granularity retrieval not found in standard RAG frameworks. Combined with hotness-weighted memory lifecycle management that boosts recently and frequently accessed contexts, the system is designed to self-improve as agents use it rather than remaining static.
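The directory dominance rule stated above can be written as a small predicate. This is a sketch of the stated condition only; the ratio value of 1.2, the function names, and the behavior for a directory without children are assumptions.

```python
def is_dominant(dir_score: float, child_scores: list,
                dominance_ratio: float = 1.2) -> bool:
    """True when the directory outscores its best child by the given ratio."""
    if not child_scores:
        return True  # assumed: a directory with no scored children stands alone
    return dir_score > dominance_ratio * max(child_scores)


def select_results(dir_uri: str, dir_score: float, children: dict,
                   dominance_ratio: float = 1.2) -> list:
    """Return the directory itself if dominant, else its children by score."""
    if is_dominant(dir_score, list(children.values()), dominance_ratio):
        return [dir_uri]
    return sorted(children, key=lambda uri: children[uri], reverse=True)
```

The effect is that a broadly relevant directory is returned as one coarse result, while a directory that is merely a container for one strong match yields that match instead.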
OpenViking is released under the GNU Affero General Public License v3.0 (AGPL-3.0). This means you can freely use, modify, and distribute the software, including for commercial purposes, but any modifications you deploy in a networked service must also be released under AGPL-3.0. For organizations that need to keep their modifications proprietary — common when embedding OpenViking into a commercial product or SaaS platform — this license requires either open-sourcing those changes or negotiating a separate commercial license with ByteDance/Volcengine. Pure internal use (not exposed as a public service) carries fewer obligations, but legal counsel is advisable for commercial deployments.
Running OpenViking yourself requires Python 3.10+, a Rust toolchain for building the RAGFS and CLI components from source, and a C++ compiler (GCC 9+ or Clang 11+). In production you will also need a vector database backend (VikingDB, Qdrant, or OpenGauss), embedding model endpoints, and optionally a VLM provider for document parsing and image understanding. The project ships a Docker Compose file and Kubernetes Helm chart examples in the repository, but operating the full stack — managing database uptime, coordinating background compression jobs, handling model API rate limits, and keeping the Rust extension compatible across Python versions — represents a meaningful operational investment for small teams.
Volcengine, ByteDance’s cloud platform and the team behind OpenViking, provides a hosted version with enterprise SLAs, managed upgrades, and native integration with Doubao models and VikingDB. Self-hosters forgo this managed upgrade path, SLA guarantees, cloud-native autoscaling, and first-party support channels. The hosted tier also adds features like cross-tenant isolation enforcement and enterprise SSO that would require additional engineering effort to replicate on a self-managed deployment. For teams already in the Volcengine ecosystem, the managed path will carry significantly less operational overhead than self-hosting.