The safe, private runtime that lets autonomous AI agents operate in sandboxed environments governed by declarative YAML policies — blocking data exfiltration, credential leaks, and unauthorized network activity before they happen.
OpenShell is an open-source security runtime built by NVIDIA specifically for autonomous AI agents. As AI coding agents like Claude Code, Codex, and Copilot gain the ability to run arbitrary shell commands, browse the web, and call external APIs, the question of what they are allowed to do, and what they can exfiltrate, becomes critical. OpenShell answers this by wrapping each agent in a container sandbox with a multi-layer policy engine that enforces network egress rules, filesystem boundaries, and process constraints from the application layer down to the kernel.
At its core, OpenShell operates through three components: a Gateway that serves as the authenticated control plane managing sandbox lifecycle and policy delivery; Supervisors that run inside each sandbox enforcing policy where process identity, filesystem access, and outbound connections are visible; and a Policy Engine backed by OPA (Open Policy Agent) and an SMT-based formal prover using Z3 that can mathematically verify whether a given policy configuration could allow data exfiltration paths before you ever run an agent.
Policies are declarative YAML files with two classes of controls: static sections (filesystem access paths, process restrictions via Landlock and seccomp) that are locked at sandbox creation, and dynamic sections (network egress rules, inference routing) that can be hot-reloaded on running sandboxes without restart. The privacy-aware inference router strips caller credentials and injects controlled backend credentials when agents call model APIs, ensuring API keys never leak through the agent’s network traffic.
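The split between locked and hot-reloadable sections can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names, field names, and the `reload` method are hypothetical and do not reflect OpenShell's actual policy schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPolicy:
    """Hypothetical static section: fixed when the sandbox is created."""
    read_only_paths: tuple[str, ...] = ()
    read_write_paths: tuple[str, ...] = ()


@dataclass(frozen=True)
class DynamicPolicy:
    """Hypothetical dynamic section: may be swapped on a running sandbox."""
    allowed_hosts: frozenset[str] = frozenset()


class Sandbox:
    """Toy sandbox that accepts dynamic updates and rejects static ones."""

    def __init__(self, static: StaticPolicy, dynamic: DynamicPolicy) -> None:
        self.static = static
        self.dynamic = dynamic
        self.version = 1

    def reload(self, static: StaticPolicy, dynamic: DynamicPolicy) -> None:
        # Static controls are locked at creation, so any difference is an error.
        if static != self.static:
            raise ValueError("static sections are locked at sandbox creation")
        # Dynamic controls are replaced in place, without a restart.
        self.dynamic = dynamic
        self.version += 1
```

Under these assumptions, widening the network allowlist succeeds and bumps the version, while submitting a policy with different filesystem paths fails and would require creating a new sandbox.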
The project is currently in alpha, targeting single-developer single-environment setups, with a stated roadmap toward multi-tenant enterprise deployments. Compute backends include Docker, Podman, MicroVM, and Kubernetes, with experimental GPU passthrough support for local inference workloads. OpenShell is built agent-first: the project itself ships with agent skills for gateway troubleshooting, policy generation, and diagnostics.
openshell term provides a live debugging interface for monitoring sandbox activity, policy decisions, and network traffic.
Architecture
OpenShell implements a strict separation between a stateful control plane and a stateless, policy-enforcing data plane that runs inside each sandbox. The Gateway owns durable state (sandbox records, policy versions, provider credentials, inference configuration) and exposes this through a gRPC/HTTP API that the CLI, SDK, and TUI consume. Each sandbox runs a Supervisor process that is the actual security boundary: it forks the agent as a restricted child, establishes a network namespace that forces all traffic through a local policy proxy, and applies Landlock filesystem restrictions and seccomp process controls at startup. This design means the gateway never makes per-request egress decisions; enforcement is local to the sandbox where process identity and network source are unambiguous. Compute backends (Docker, Podman, MicroVM, Kubernetes) sit behind an adapter boundary so the core gateway and sandbox model stay independent of infrastructure specifics.
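To make the idea of local, identity-aware enforcement concrete, here is a minimal default-deny sketch of the kind of decision a per-sandbox proxy could make. The `EgressRule` and `Connection` types and the `decide` function are hypothetical names invented for this illustration; the real Supervisor evaluates policy through OPA rather than a Python function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    """Hypothetical rule: this binary may connect to this host and port."""
    binary: str
    host: str
    port: int


@dataclass(frozen=True)
class Connection:
    """An outbound attempt as seen from inside the sandbox."""
    binary: str  # path of the process that opened the socket
    host: str
    port: int


def decide(rules: list[EgressRule], conn: Connection) -> bool:
    """Default deny: allow only if some rule matches all three fields."""
    return any(
        rule.binary == conn.binary
        and rule.host == conn.host
        and rule.port == conn.port
        for rule in rules
    )


rules = [EgressRule("/usr/bin/git", "github.com", 443)]
git_fetch = Connection("/usr/bin/git", "github.com", 443)
curl_same_host = Connection("/usr/bin/curl", "github.com", 443)
```

With these rules, `decide(rules, git_fetch)` is allowed while `decide(rules, curl_same_host)` is denied even though the destination is identical. Distinguishing the two requires knowing which process opened the connection, which is only visible from inside the sandbox.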
Tech Stack
OpenShell is written entirely in Rust using the 2024 edition, organized as a Cargo workspace with nineteen crates. The async runtime is Tokio; the gateway HTTP layer uses Axum with Tower middleware for CORS, tracing, and request IDs; inter-component communication is gRPC over Tonic with Protobuf-defined message schemas. The network policy proxy is backed by OPA (Open Policy Agent) loaded via the opa crate, with L7 inspection supporting REST method/path matching, GraphQL operation filtering, WebSocket message inspection, and TLS termination with an ephemeral per-sandbox CA generated using rcgen. The formal prover encodes policy constraints as Z3 SMT formulas and runs reachability queries against binary capability registries embedded at compile time. Persistence uses SQLx against PostgreSQL or SQLite. The TUI is built with Ratatui and Crossterm. Distribution ships as a static binary, a PyPI package installable via uv, and a Helm chart published to GHCR as an OCI artifact.
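REST method/path matching at L7 can be sketched in a few lines. This is an illustration only: the rule shape (a set of methods plus a glob pattern) and the `l7_allowed` helper are assumptions, not OpenShell's rule format, and glob matching via Python's `fnmatch` stands in for whatever matcher the proxy actually uses.

```python
from fnmatch import fnmatchcase

# Hypothetical rule shape: (allowed HTTP methods, glob pattern for the path).
Rule = tuple[frozenset[str], str]


def l7_allowed(rules: list[Rule], method: str, path: str) -> bool:
    """Allow a request only if one rule matches both its method and its path."""
    for methods, pattern in rules:
        # fnmatchcase is case-sensitive; note that "*" also matches "/".
        if method.upper() in methods and fnmatchcase(path, pattern):
            return True
    return False


rules: list[Rule] = [
    (frozenset({"GET"}), "/repos/*"),
    (frozenset({"GET", "POST"}), "/v1/chat/completions"),
]
```

Here a `GET /repos/nvidia/openshell` request passes, while a `DELETE` to the same path is rejected because no rule lists that method.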
Code Quality
The codebase applies clippy at the pedantic and nursery levels project-wide, enforcing consistent error handling through miette for user-facing diagnostics and thiserror for typed library errors. Module-level documentation comments explain design intent across the crates, with the policy, prover, and architecture modules being particularly well-annotated. Test coverage exists through an e2e directory with structured end-to-end scenarios and unit tests embedded in core modules using #[cfg(test)] blocks. The workspace enforces unsafe_code = warn, rust_2018_idioms, and disallows trivial casts. Continuous integration runs on every push. The project is still in alpha and some test coverage is sparse relative to the surface area, but the type system does substantial work in preventing classes of errors at compile time.
What Makes It Unique
OpenShell's most distinctive feature is the formal policy prover backed by a Z3 SMT solver. Rather than testing policies empirically by running agents and observing blocked requests, the prover encodes the entire policy, credential set, and binary capability registry as logical constraints and checks whether any combination of allowed binaries and credentials could reach attacker-controlled endpoints. This transforms policy auditing from a dynamic, trial-and-error process into a static verification step. The privacy inference router is also unusual: it operates as a credential-stripping proxy specifically for LLM API traffic, allowing organizations to let agents call model APIs without those calls carrying actual API keys that could be exfiltrated through prompt injection or logging. Together these features target a threat model, AI agents as an insider threat vector, that no general-purpose container security tool is designed around.
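The shape of the prover's question can be shown with a toy stand-in. The sketch below deliberately swaps the Z3 SMT encoding for plain exhaustive enumeration, which is only workable for tiny inputs, and every name in it (`exfil_paths`, `can_send`, `cred_scopes`, `open_hosts`) is a hypothetical simplification rather than OpenShell's data model.

```python
from itertools import product
from typing import Optional


def exfil_paths(
    allowed: dict[str, set[str]],      # binary -> hosts the policy lets it reach
    can_send: set[str],                # binaries able to transmit arbitrary data
    cred_scopes: dict[str, set[str]],  # credential -> hosts it grants writes on
    open_hosts: set[str],              # hosts that accept data with no credential
) -> list[tuple[str, Optional[str], str]]:
    """List every (binary, credential, host) combination that can move data out."""
    paths = []
    creds: list[Optional[str]] = [None, *sorted(cred_scopes)]
    for binary in sorted(allowed):
        if binary not in can_send:
            continue  # e.g. a binary with no ability to upload data
        for host, cred in product(sorted(allowed[binary]), creds):
            if cred is None:
                writable = host in open_hosts
            else:
                writable = host in cred_scopes[cred]
            if writable:
                paths.append((binary, cred, host))
    return paths


allowed = {
    "/usr/bin/curl": {"api.github.com", "pypi.org"},
    "/usr/bin/cat": {"api.github.com"},
}
findings = exfil_paths(
    allowed,
    can_send={"/usr/bin/curl"},
    cred_scopes={"GITHUB_TOKEN": {"api.github.com"}},
    open_hosts={"paste.example"},
)
```

In this toy setup the only finding is `curl` combined with `GITHUB_TOKEN` against `api.github.com`; removing the token from the sandbox yields an empty list. An SMT solver answers the same reachability question symbolically instead of enumerating each combination.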
OpenShell is released under the Apache License 2.0, a permissive open-source license. You can use it commercially, modify it, redistribute it, and sublicense it, subject to the license's notice conditions. There are no copyleft implications: you are not required to open-source software you build on top of OpenShell. The license does require preserving copyright notices and the license text in distributions, and marking modified files as changed, but imposes no royalties or usage fees.
Running OpenShell yourself requires a host capable of running Docker, Podman, or a MicroVM hypervisor, or a Kubernetes cluster for multi-sandbox deployments. The gateway needs a database (SQLite for local use, PostgreSQL for multi-user deployments) and a network configuration that allows the gateway to reach sandbox containers. You are responsible for gateway availability, database backups, TLS certificate management, and keeping the gateway updated as NVIDIA releases new versions. The project is explicitly in alpha, targeting single-developer single-environment setups first, so multi-tenant production deployments will encounter rough edges and should expect breaking changes as the API stabilizes. The embedded policy prover requires a system libz3 installation or the bundled-z3 feature flag for self-contained builds.
NVIDIA does not currently offer a managed cloud service or paid tier for OpenShell — it is fully self-hosted. What you give up compared to a hypothetical managed offering is primarily operational support: no SLA, no managed upgrades, no high-availability cluster setup guidance, and no enterprise support channel. Documentation is published at docs.nvidia.com/openshell and is actively maintained alongside releases, which ship at near-daily cadence. Community support is available through GitHub Issues and Discussions, with a vouch-based contributor model that slows external contributions but signals maintainer involvement in the project’s trajectory.