Run LLM-generated Python code safely inside your agent—no containers, no CPython, no compromise—with sub-microsecond startup.
Monty is a minimal, secure Python interpreter written entirely in Rust, purpose-built for one job: executing Python code produced by AI agents. Instead of spinning up Docker containers, spawning CPython subprocesses, or risking direct host execution, Monty runs a curated subset of Python inside a hermetic sandbox that boots in under one microsecond and gives developers precise control over every external call the code can make.
The interpreter implements its own execution engine on top of Ruff’s Python parser, so it has zero dependency on CPython or any C extension. Filesystem reads, environment variable lookups, and network calls are all routed through explicit host callbacks that developers register; everything else is blocked by default. Memory usage, stack depth, and wall-clock time can each be capped per run through a typed ResourceTracker interface.
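The deny-by-default callback model and per-run limits described above can be sketched in plain Python. This is only an illustration of the idea, not Monty's API: `Host`, `ExternalCallDenied`, `ResourceLimitExceeded`, and every method and field name below are hypothetical, and the `ResourceTracker` class here merely borrows the name of the interface mentioned above.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


class ExternalCallDenied(Exception):
    """Raised when sandboxed code asks for a capability the host never registered."""


class ResourceLimitExceeded(Exception):
    """Raised when a run goes over one of its configured caps."""


@dataclass
class ResourceTracker:
    # Hypothetical limits for one run; field names are illustrative only.
    max_memory_bytes: int = 1_000_000
    max_stack_depth: int = 64
    max_seconds: float = 1.0
    allocated: int = 0
    started: float = field(default_factory=time.monotonic)

    def on_allocate(self, nbytes: int) -> None:
        # Count every allocation against the memory cap.
        self.allocated += nbytes
        if self.allocated > self.max_memory_bytes:
            raise ResourceLimitExceeded("memory limit exceeded")

    def on_call(self, depth: int) -> None:
        # Reject call stacks deeper than the configured maximum.
        if depth > self.max_stack_depth:
            raise ResourceLimitExceeded("stack depth limit exceeded")

    def check_time(self) -> None:
        # Compare elapsed wall-clock time with the cap.
        if time.monotonic() - self.started > self.max_seconds:
            raise ResourceLimitExceeded("wall-clock limit exceeded")


class Host:
    """Registry of the only external functions the sandboxed code may reach."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._callbacks[name] = fn

    def call(self, name: str, *args: Any) -> Any:
        # Anything that was not registered explicitly is refused.
        if name not in self._callbacks:
            raise ExternalCallDenied(name)
        return self._callbacks[name](*args)
```

A host would register, say, a `read_file` callback and nothing else; a request for an unregistered name such as `getenv` then fails with `ExternalCallDenied` instead of touching the real environment.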
Monty ships as a Rust library, a Python package (pydantic-monty), and a JavaScript/TypeScript package (@pydantic/monty), each backed by the same Rust worker-pool runtime. The Python and JS bindings run Monty workers as isolated subprocesses, so even a memory-safety violation triggered by adversarial code kills only the worker — the host process stays alive and receives a MontyCrashedError. A WebAssembly build is also available for browser environments where subprocess isolation is impossible.
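The crash-isolation property can be demonstrated with nothing more than the Python standard library. The sketch below is not how Monty's worker pool is implemented (that runtime is written in Rust and uses protobuf IPC), and the child process here is ordinary CPython, so it is not a sandbox. `WorkerCrashed` and `run_in_worker` are hypothetical names standing in for the MontyCrashedError behaviour described above.

```python
import subprocess
import sys


class WorkerCrashed(Exception):
    """Hypothetical stand-in for the error the host sees when a worker dies."""


def run_in_worker(source: str, timeout: float = 5.0) -> str:
    # Run the code in a separate OS process so that a crash there
    # cannot take the calling (host) process down with it.
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        # Non-zero exit or death by signal: report it, keep the host alive.
        raise WorkerCrashed(f"worker exited with code {proc.returncode}")
    return proc.stdout
```

Calling `run_in_worker("import os; os._exit(70)")` raises `WorkerCrashed` in the host, which can then log the failure and carry on with the next request.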
Built by the Pydantic team and designed to power code-mode in Pydantic AI, Monty represents a practical answer to programmatic tool calling: let the LLM write Python instead of issuing sequential JSON tool calls, execute that Python safely inside the agent loop, and handle the results with the same type-checked Python machinery you already use.
- Built-in ty type checker (from Astral/Ruff), enabling pre-execution validation of LLM-written code against developer-supplied type stubs
- Serializable execution state: MontyRun and RunProgress can be dumped to bytes and restored later, enabling checkpoint/resume across database rows or message queues
- Per-run limits on memory, stack depth, and wall-clock time through the ResourceTracker trait
- A curated set of standard-library modules (sys, os, typing, asyncio, re, datetime, json) plus support for modern Python type hints and async/await
Architecture
Monty is organized as a Cargo workspace of focused crates: monty (the core interpreter), monty-pool (the worker-pool runtime), monty-proto (protobuf IPC between host and worker processes), monty-python and monty-js (PyO3 and napi-rs bindings), monty-type-checking (ty integration), and monty-typeshed (bundled type stubs).
The core interpreter in crates/monty/src/ follows a clean pipeline: parse.rs calls Ruff’s parser, prepare.rs lowers the AST into an internal bytecode-like representation in bytecode/, and run.rs exposes MontyRun as the serializable entry point. Execution proceeds through a stack-based VM in bytecode/ that dispatches to individual expression, statement, and built-in handlers.
Heap management lives in heap/, with stable_heap.rs and free_list.rs providing a typed arena; all object graph mutations go through a HeapReader / DropWithHeap lifetime protocol enforced at compile time via Rust’s borrow checker, eliminating entire classes of memory safety bugs without unsafe. External function calls surface as RunProgress::FunctionCall variants, pausing the VM and returning control to the host. This is the seam that both enables sandboxing and supports snapshotting.
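To make the suspension seam concrete, here is a toy stack-based VM in Python that pauses at every external call, hands control back to the host, and can be serialized while paused. It is a sketch under stated assumptions rather than Monty's design: the four opcodes, the `VMState`, `FunctionCall`, and `Complete` classes, and the use of JSON for snapshots are all invented for illustration (the project itself is described as using serde + postcard).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, List, Union


@dataclass
class VMState:
    code: List[List[Any]]              # instructions such as ["PUSH", 5]
    pc: int = 0                        # index of the next instruction
    stack: List[Any] = field(default_factory=list)

    def dump(self) -> bytes:
        # Snapshot the whole paused VM as bytes (JSON keeps the sketch simple).
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def load(cls, data: bytes) -> "VMState":
        return cls(**json.loads(data.decode("utf-8")))


@dataclass
class FunctionCall:
    """The VM is paused: the host must answer this external call."""
    name: str
    args: List[Any]
    state: VMState


@dataclass
class Complete:
    """The program finished with this value."""
    value: Any


def run(state: VMState) -> Union[FunctionCall, Complete]:
    while True:
        op, *operands = state.code[state.pc]
        state.pc += 1
        if op == "PUSH":
            state.stack.append(operands[0])
        elif op == "ADD":
            right = state.stack.pop()
            left = state.stack.pop()
            state.stack.append(left + right)
        elif op == "CALL_EXTERNAL":
            name, argc = operands
            split = len(state.stack) - argc
            args = state.stack[split:]
            del state.stack[split:]
            # Suspend here; the host decides whether and how to run the call.
            return FunctionCall(name, args, state)
        elif op == "RETURN":
            return Complete(state.stack.pop())
        else:
            raise ValueError(f"unknown opcode: {op}")


def resume(state: VMState, return_value: Any) -> Union[FunctionCall, Complete]:
    # The host's answer becomes the result of the external call.
    state.stack.append(return_value)
    return run(state)


# Toy program equivalent to: fetch_price("widget") + 5
program = [["PUSH", "widget"], ["CALL_EXTERNAL", "fetch_price", 1],
           ["PUSH", 5], ["ADD"], ["RETURN"]]
paused = run(VMState(code=program))          # FunctionCall for fetch_price
snapshot = paused.state.dump()               # could be stored in a database row
finished = resume(VMState.load(snapshot), 37)
print(finished)                              # Complete(value=42)
```

Because the VM only ever stops at an external call, the same point serves as both the security boundary (the host can refuse the call) and the checkpoint boundary (the paused state is plain data).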
Tech Stack
The project is pure Rust (edition 2024, MSRV 1.95) with no C or CPython FFI in the core crate. Ruff’s ruff_python_parser, ruff_python_ast, and ruff_python_stdlib are pinned at a specific git revision to ensure deterministic parser output; ty_python_semantic and related crates provide the type-checking engine. Serialization uses serde + postcard for compact binary snapshots and prost for the protobuf IPC protocol between pool host and worker subprocesses. The Python binding is built with maturin + PyO3 (no CPython runtime dependency at execution time); the JS binding uses napi-rs and ships platform-specific npm packages plus a WASM sub-path. CI runs on GitHub Actions with codspeed for performance regression tracking and codecov for coverage reporting. The dev toolchain uses uv for Python dependencies and ruff + basedpyright for linting and type checking Python glue code.
Code Quality
The test suite is unusually comprehensive for an experimental interpreter: 489 Python test case files in crates/monty/test_cases/ cover argument validation, arithmetic edge cases, type coercions, async/await, exception handling, and error messages — each file is a runnable Python snippet whose output is snapshot-tested with insta. CI enforces Clippy pedantic lints (with explicit allow-list exceptions documented in Cargo.toml), Ruff formatting and import sorting on all Python glue code, and strict basedpyright + mypy stubtest checks on the public Python API. Error handling is explicit throughout: the VM returns typed RunResult/MontyException values; no panics are used on normal error paths. CodSpeed tracks performance regressions on every push. The repository has abundant inline documentation, Rust doc-comments on all public items, and three worked examples (web_scraper, sql_playground, expense_analysis) demonstrating real-world integration patterns.
What Makes It Unique
Monty’s central innovation is implementing a self-contained Python execution environment in safe Rust with zero CPython dependency: not a transpiler or a sandboxed CPython fork, but a purpose-built interpreter for a curated subset of Python that treats every external call as an explicit suspension point. This design targets three goals that are hard to combine in other approaches: sub-microsecond cold start (no interpreter bootstrap), hard sandbox boundaries enforced at the language implementation level (not by OS-level container primitives), and serializable mid-execution state for checkpoint/resume. The integration of ty type checking into the same binary means developers can validate LLM-generated code against their host API’s type stubs before committing to execution. The multi-language embedding story (identical behavior whether called from Rust, Python, or JavaScript, including a WASM build) is a direct consequence of the CPython-free architecture and makes Monty well suited to edge deployment.
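The idea of validating generated code against the host API before running it can be illustrated with a much weaker check than a real type checker. The sketch below does not use ty and says nothing about how Monty integrates it; it only parses the code with the standard `ast` module and verifies that every call to a bare function name targets a known stub with a compatible argument count. `fetch_price` and `check_calls` are hypothetical names, and calls to methods, builtins, or locally defined functions are outside what this toy handles.

```python
import ast
import inspect
from typing import Callable, Dict, List


def fetch_price(item: str) -> int:
    """Hypothetical host function stub; only its signature matters here."""
    ...


def check_calls(source: str, stubs: Dict[str, Callable]) -> List[str]:
    """Return a list of problems found without executing any of the code."""
    errors: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        name = node.func.id
        if name not in stubs:
            errors.append(f"line {node.lineno}: unknown function {name!r}")
            continue
        # Bind the call's arguments to the stub's signature; the AST nodes
        # act as placeholders, so only arity and keyword names are checked.
        keywords = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        try:
            inspect.signature(stubs[name]).bind(*node.args, **keywords)
        except TypeError as exc:
            errors.append(f"line {node.lineno}: {name}: {exc}")
    return errors
```

With `stubs = {"fetch_price": fetch_price}`, the snippet `fetch_price("widget")` passes, while `fetch_price()` and `delete_all()` are each reported before anything runs.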
Licensing Model
MIT licensed: all features are available in self-hosted and embedded deployments with no restrictions, license keys, or paid tiers required.