OpenShell


The safe, private runtime that lets autonomous AI agents operate in sandboxed environments governed by declarative YAML policies — blocking data exfiltration, credential leaks, and unauthorized network activity before they happen.

7.4K stars

914 forks

Apache License 2.0

Rust



OpenShell is an open-source security runtime built by NVIDIA specifically for autonomous AI agents. As AI coding agents like Claude Code, Codex, and Copilot gain the ability to run arbitrary shell commands, browse the web, and call external APIs, the question of what they are allowed to do, and what they can exfiltrate, becomes critical. OpenShell answers this by wrapping each agent in a container sandbox with a multi-layer policy engine that enforces network egress rules, filesystem boundaries, and process constraints from the application layer down to the kernel.

At its core, OpenShell operates through three components: a Gateway that serves as the authenticated control plane managing sandbox lifecycle and policy delivery; Supervisors that run inside each sandbox enforcing policy where process identity, filesystem access, and outbound connections are visible; and a Policy Engine backed by OPA (Open Policy Agent) and an SMT-based formal prover using Z3 that can mathematically verify whether a given policy configuration could allow data exfiltration paths before you ever run an agent.

Policies are declarative YAML files with two classes of controls: static sections (filesystem access paths, process restrictions via Landlock and seccomp) that are locked at sandbox creation, and dynamic sections (network egress rules, inference routing) that can be hot-reloaded on running sandboxes without restart. The privacy-aware inference router strips caller credentials and injects controlled backend credentials when agents call model APIs, ensuring API keys never leak through the agent’s network traffic.
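The split between locked and hot-reloadable sections can be illustrated with a small model. This is a minimal sketch, not OpenShell's implementation or policy schema: the class names, field names, and the `reload` method are all hypothetical, chosen only to show a reload that accepts changes to dynamic sections and rejects changes to static ones.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StaticPolicy:
    """Sections fixed at sandbox creation (hypothetical field names)."""
    read_write_paths: tuple[str, ...]
    blocked_syscalls: tuple[str, ...]


@dataclass
class DynamicPolicy:
    """Sections that may change while the sandbox runs (hypothetical)."""
    allowed_hosts: set[str] = field(default_factory=set)
    inference_backend: str = "local"


class Sandbox:
    """Toy sandbox record holding one static and one dynamic policy part."""

    def __init__(self, static: StaticPolicy, dynamic: DynamicPolicy):
        self.static = static
        self.dynamic = dynamic

    def reload(self, static: StaticPolicy, dynamic: DynamicPolicy) -> None:
        """Accept a new policy only if its static part is unchanged."""
        if static != self.static:
            raise ValueError("static sections are locked; recreate the sandbox")
        self.dynamic = dynamic  # swapped in place, no restart
```

Under these assumptions, widening `allowed_hosts` on a running sandbox succeeds, while a reload that changes `read_write_paths` raises an error and leaves the sandbox untouched.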

The project is currently in alpha, targeting single-developer, single-environment setups, with a stated roadmap toward multi-tenant enterprise deployments. Compute backends include Docker, Podman, MicroVM, and Kubernetes, with experimental GPU passthrough support for local inference workloads. OpenShell is built agent-first: the project itself ships with agent skills for gateway troubleshooting, policy generation, and diagnostics.

What You Get

Sandboxed agent execution — each agent runs in an isolated container with its own network namespace, preventing lateral movement between sandboxes
Declarative YAML policy engine — define network egress rules at method/path level (REST, GraphQL, WebSocket) with hot-reload support so policies update without restarting the agent
Formal policy prover — Z3 SMT solver encodes policies as logical constraints and checks reachability queries to detect data exfiltration paths and write-bypass violations before runtime
Privacy-aware inference router — strips agent credentials from LLM API calls and injects controlled backend credentials, keeping API keys out of the agent’s observable network traffic
Credential provider system — named credential bundles injected as environment variables at runtime; credentials never written to sandbox filesystem
Multi-compute backend support — run sandboxes on Docker, Podman, MicroVM, or Kubernetes without changing policy definitions
GPU passthrough — pass host NVIDIA GPUs into sandboxes for local inference or fine-tuning workloads using CDI or Docker’s NVIDIA GPU request path
Real-time terminal UI — openshell term provides a live debugging interface for monitoring sandbox activity, policy decisions, and network traffic
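The method/path-level egress control listed above can be pictured as a default-deny match over host, HTTP method, and path. The sketch below is illustrative only: `EgressRule`, its fields, and the glob-style path matching are assumptions made for this example, not OpenShell's rule format or matching semantics.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class EgressRule:
    """One allow rule (hypothetical shape)."""
    host: str
    methods: frozenset[str]
    path_glob: str  # shell-style glob; note that '*' also matches '/'


def is_allowed(rules: list[EgressRule], host: str, method: str, path: str) -> bool:
    """Default-deny: permit a request only if one rule matches all three parts."""
    return any(
        host == rule.host
        and method.upper() in rule.methods
        and fnmatchcase(path, rule.path_glob)
        for rule in rules
    )


# Example rule set: read-only access to repository endpoints on one host.
RULES = [EgressRule("api.github.com", frozenset({"GET"}), "/repos/*")]
```

With this rule set, a `GET` to `/repos/nvidia/openshell` on `api.github.com` is permitted, while a `POST` to the same path or any request to another host is denied.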

Common Use Cases

Running AI coding agents safely — wrap Claude Code, Codex, or Copilot in a sandbox that prevents them from accessing credentials outside their project directory or calling APIs that are not on the allowlist
CI/CD agent pipelines — run automated coding agents in Kubernetes with policies that restrict outbound access to only the package registries and APIs the job legitimately needs
Multi-agent isolation — deploy multiple agents on the same host with network namespace isolation ensuring they cannot communicate with each other or access each other’s credentials
Policy verification before deployment — use the SMT prover to mathematically check that a proposed sandbox policy cannot leak secrets to attacker-controlled endpoints before approving it
Local LLM inference with access control — route agent inference calls through the privacy router to a locally-hosted model, preventing prompts containing sensitive code from reaching cloud APIs
Security research and red-teaming — observe what an agent actually tries to do by running it in a sandbox with verbose audit logging on all blocked network requests
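The credential handling attributed to the inference router (strip what the agent sent, inject what the backend needs) can be sketched as a header rewrite. This is a hedged illustration rather than the router's actual logic: the function name, the set of header names treated as sensitive, and the use of a bearer token are all assumptions.

```python
def rewrite_inference_headers(agent_headers: dict[str, str], backend_key: str) -> dict[str, str]:
    """Drop any credentials the caller supplied, then attach the backend's own."""
    # Assumed list of credential-bearing headers; a real proxy may use another.
    sensitive = {"authorization", "x-api-key", "api-key", "cookie"}
    cleaned = {
        name: value
        for name, value in agent_headers.items()
        if name.lower() not in sensitive
    }
    # The agent never sees this value; it is added after the request leaves it.
    cleaned["Authorization"] = f"Bearer {backend_key}"
    return cleaned
```

The point of the design is that the key used upstream exists only on the proxy side, so nothing the agent can read or log contains it.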

Under The Hood

Architecture

OpenShell implements a strict separation between a stateful control plane and a stateless, policy-enforcing data plane that runs inside each sandbox. The Gateway owns durable state (sandbox records, policy versions, provider credentials, inference configuration) and exposes this through a gRPC/HTTP API that the CLI, SDK, and TUI consume. Each sandbox runs a Supervisor process that is the actual security boundary: it forks the agent as a restricted child, establishes a network namespace that forces all traffic through a local policy proxy, and applies Landlock filesystem restrictions and seccomp process controls at startup. This design means the gateway never makes per-request egress decisions; enforcement is local to the sandbox where process identity and network source are unambiguous. Compute backends (Docker, Podman, MicroVM, Kubernetes) sit behind an adapter boundary so the core gateway and sandbox model stay independent of infrastructure specifics.

Tech Stack

OpenShell is written entirely in Rust using the 2024 edition, organized as a Cargo workspace with nineteen crates. The async runtime is Tokio; the gateway HTTP layer uses Axum with Tower middleware for CORS, tracing, and request IDs; inter-component communication is gRPC over Tonic with Protobuf-defined message schemas. The network policy proxy is backed by OPA (Open Policy Agent) loaded via the opa crate, with L7 inspection supporting REST method/path matching, GraphQL operation filtering, WebSocket message inspection, and TLS termination with an ephemeral per-sandbox CA generated using rcgen. The formal prover encodes policy constraints as Z3 SMT formulas and runs reachability queries against binary capability registries embedded at compile time. Persistence uses SQLx against PostgreSQL or SQLite. The TUI is built with Ratatui and Crossterm. Distribution ships as a static binary, a PyPI package installable via uv, and a Helm chart published to GHCR as an OCI artifact.

Code Quality

The codebase applies clippy at the pedantic and nursery levels project-wide, enforcing consistent error handling through miette for user-facing diagnostics and thiserror for typed library errors. Module-level documentation comments explain design intent across the crates, with the policy, prover, and architecture modules being particularly well-annotated. Test coverage exists through an e2e directory with structured end-to-end scenarios and unit tests embedded in core modules using #[cfg(test)] blocks. The workspace enforces unsafe_code = warn, rust_2018_idioms, and disallows trivial casts. Continuous integration runs on every push. The project is still in alpha and some test coverage is sparse relative to the surface area, but the type system does substantial work in preventing classes of errors at compile time.

What Makes It Unique

OpenShell’s most distinctive feature is the formal policy prover backed by a Z3 SMT solver. Rather than testing policies empirically by running agents and observing blocked requests, the prover encodes the entire policy, credential set, and binary capability registry as logical constraints and checks whether any combination of allowed binaries and credentials could reach attacker-controlled endpoints. This transforms policy auditing from a dynamic, trial-and-error process into a static verification step. The privacy inference router is also unusual: it operates as a credential-stripping proxy specifically for LLM API traffic, allowing organizations to let agents call model APIs without those calls carrying actual API keys that could be exfiltrated through prompt injection or logging. Together these features target a threat model (AI agents as an insider threat vector) that no general-purpose container security tool is designed around.
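The kind of question the prover answers can be shown on a toy model. The sketch below deliberately swaps the Z3 encoding for plain exhaustive enumeration, which is only feasible because the example is tiny; the function name, the capability labels `read_file` and `net_send`, and the idea of a separate trusted-host set are assumptions made for illustration and do not reflect OpenShell's registry format.

```python
from itertools import product


def exfiltration_paths(allowed_binaries, capabilities, readable_secrets,
                       allowed_hosts, trusted_hosts):
    """List (binary, secret, host) triples that form a possible leak.

    A triple is reported when an allowed binary can both read files and send
    network data, the secret is readable in the sandbox, and the host is
    permitted by policy but is not in the trusted set.
    """
    risky_hosts = sorted(set(allowed_hosts) - set(trusted_hosts))
    paths = []
    for binary, secret, host in product(
        sorted(allowed_binaries), sorted(readable_secrets), risky_hosts
    ):
        caps = capabilities.get(binary, set())
        if {"read_file", "net_send"} <= caps:
            paths.append((binary, secret, host))
    return paths


# Hypothetical capability registry for two binaries.
CAPABILITIES = {"curl": {"read_file", "net_send"}, "cat": {"read_file"}}
```

An empty result means no leak exists within this simplified model; a non-empty result names the concrete combination to fix before any agent runs. An SMT solver asks the same reachability question symbolically, which scales to policies where enumeration would not.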

Self-Hosting

OpenShell is released under the Apache License 2.0, a permissive open-source license. You can use it commercially, modify it, redistribute it, and sublicense it, and the license includes an express patent grant from contributors. There are no copyleft implications: you are not required to open-source software you build on top of OpenShell. The license does require preserving copyright notices and the license text in distributions, and marking files you have modified, but imposes no royalties or usage fees.

Running OpenShell yourself requires a host capable of running Docker, Podman, or a MicroVM hypervisor, or a Kubernetes cluster for multi-sandbox deployments. The gateway needs a database (SQLite for local use, PostgreSQL for multi-user deployments) and a network configuration that allows the gateway to reach sandbox containers. You are responsible for gateway availability, database backups, TLS certificate management, and keeping the gateway updated as NVIDIA releases new versions. The project is explicitly in alpha, targeting single-developer, single-environment setups first, so multi-tenant production deployments will encounter rough edges and should expect breaking changes as the API stabilizes. The embedded policy prover requires a system libz3 installation or the bundled-z3 feature flag for self-contained builds.

NVIDIA does not currently offer a managed cloud service or paid tier for OpenShell; it is fully self-hosted. What you give up compared to a hypothetical managed offering is primarily operational support: no SLA, no managed upgrades, no high-availability cluster setup guidance, and no enterprise support channel. Documentation is published at docs.nvidia.com/openshell and is actively maintained alongside releases, which ship at a near-daily cadence. Community support is available through GitHub Issues and Discussions, with a vouch-based contributor model that slows external contributions but signals maintainer involvement in the project’s trajectory.
