Superagent

An open-source SDK that blocks prompt injections, redacts PII and secrets, scans repositories for AI-targeted attacks, and red-teams production agents.

6.7K stars
962 forks
MIT License
TypeScript

Superagent is an open-source SDK for AI agent safety, packaged as TypeScript and Python clients plus a CLI and MCP server. It ships four capabilities: Guard, which detects and blocks prompt injections and unsafe tool calls at runtime; Redact, which strips PII, PHI, and secrets from text; Scan, which analyzes a GitHub repository for AI-agent-targeted attacks like repo poisoning and malicious instructions; and Test, a red-teaming feature (in progress) for running adversarial scenarios against a deployed agent.

The SDK and CLI code is MIT licensed and open on GitHub, but the guard/redact/scan classification itself runs through Superagent’s hosted API: using the SDK requires signing up at superagent.sh for an API key, and there is no classification engine to run locally. In that sense Superagent is closer to an open-source client for a hosted safety API than a fully self-hostable safety engine.

The company behind the project is Y Combinator-backed (W2024). The project ships an MCP server alongside the SDKs so the same guard/redact/scan tools can be called directly from MCP-compatible agent clients, not just through SDK integration.

What You Get

  • A Guard function that detects and blocks prompt injections and unsafe tool calls at runtime, in both TypeScript and Python.
  • A Redact function that strips PII, PHI, and secrets from text before it reaches a model or gets logged.
  • A Scan function that analyzes a GitHub repository for AI-agent-targeted attacks such as repo poisoning and hidden malicious instructions.
  • An MCP server so the same Guard/Redact/Scan tools can be invoked directly by MCP-compatible agent clients.
  • A CLI for running guard, redact, and scan operations outside of direct SDK integration.
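
To make the Guard and Redact roles concrete, here is a minimal sketch of how an application might chain them in front of a model call. The `SafetyClient` interface, its method names, and the result fields are assumptions for illustration, not the SDK's documented API. The in-memory `stubClient` uses toy rules and only stands in for the hosted service so the example runs offline.

```typescript
// Hypothetical result shapes; the real SDK's response fields may differ.
interface GuardResult {
  blocked: boolean;
  reason?: string;
}

interface RedactResult {
  redacted: string;
}

// Hypothetical client surface mirroring the Guard and Redact capabilities.
interface SafetyClient {
  guard(input: string): Promise<GuardResult>;
  redact(input: string): Promise<RedactResult>;
}

// Offline stub standing in for the hosted API (toy rules, illustration only).
const stubClient: SafetyClient = {
  async guard(input) {
    const suspicious = /ignore (all )?previous instructions/i.test(input);
    return suspicious
      ? { blocked: true, reason: "possible prompt injection" }
      : { blocked: false };
  },
  async redact(input) {
    const emailPattern = /[\w.+-]+@[\w-]+(\.[\w-]+)+/g;
    return { redacted: input.replace(emailPattern, "<EMAIL>") };
  },
};

type Prepared =
  | { status: "allowed"; text: string }
  | { status: "blocked"; reason: string };

// Guard first, then redact: blocked input never reaches the model,
// and allowed input is stripped of sensitive values before use.
async function prepareForModel(
  client: SafetyClient,
  input: string
): Promise<Prepared> {
  const verdict = await client.guard(input);
  if (verdict.blocked) {
    return { status: "blocked", reason: verdict.reason ?? "blocked" };
  }
  const { redacted } = await client.redact(input);
  return { status: "allowed", text: redacted };
}
```

Coding against an interface rather than a concrete client keeps the pipeline testable without network access or an API key, and the stub can be swapped for a real client wrapper later.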

Common Use Cases

  • Blocking prompt injection attacks before they reach a production LLM agent’s tool-calling logic.
  • Automatically redacting emails, SSNs, and other PII/secrets from text passed to or logged by an AI system.
  • Scanning a third-party or contributor repository for AI-agent-targeted attacks before pulling it into an agent’s toolchain.
  • Wiring agent safety checks into an MCP-compatible client without writing custom guard logic.

Under The Hood

Architecture: Superagent is organized as a monorepo with clearly separated concerns. sdk/typescript and sdk/python provide near-identical client APIs (createClient/create_client) that wrap HTTP calls to Superagent’s hosted classification API; cli/ wraps the SDK for command-line use; and mcp/ exposes the same guard/redact/scan operations as MCP tools for agent clients. Each surface (SDK, CLI, MCP server) is a thin client around the same hosted API contract rather than an independent implementation, so the core detection logic lives outside the open-source repo.
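
The thin-client idea can be sketched as a single request builder that every surface reuses. This is an illustration under stated assumptions, not the repository's code: the base URL, the path per operation, the `Bearer` header scheme, and the payload field names (`input`, `repo`) are all hypothetical.

```typescript
// All names below (paths, header scheme, body shape) are hypothetical.
type Operation = "guard" | "redact" | "scan";

interface HttpRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

interface ClientOptions {
  apiKey: string;
  baseUrl: string; // placeholder value; not the real endpoint
}

// Translate an operation plus payload into one authenticated HTTP request.
function buildRequest(
  options: ClientOptions,
  operation: Operation,
  payload: Record<string, unknown>
): HttpRequest {
  if (options.apiKey.trim() === "") {
    throw new Error("An API key is required for every operation");
  }
  return {
    method: "POST",
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${options.baseUrl.replace(/\/+$/, "")}/${operation}`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${options.apiKey}`,
    },
    body: JSON.stringify(payload),
  };
}

// A CLI command and an MCP tool become thin adapters over the same builder.
const cliGuard = (opts: ClientOptions, text: string): HttpRequest =>
  buildRequest(opts, "guard", { input: text });
const mcpScanTool = (opts: ClientOptions, repo: string): HttpRequest =>
  buildRequest(opts, "scan", { repo });
```

Because the adapters only shape arguments, none of them contains detection logic, which matches the observation that the classification itself sits behind the hosted API.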

Tech Stack: The SDKs are written in TypeScript and Python with typed client interfaces; the CLI and MCP server are also TypeScript. Guard, Redact, and Scan calls are metered and billed through the hosted API (usage and cost are returned in SDK responses), and authentication uses an API key issued at signup.
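
Since usage and cost come back with each response, a caller could keep a running ledger per capability. The sketch below assumes a response shape with `tokens` and `costUsd` fields; those names, and the `UsageLedger` class itself, are hypothetical and not taken from the SDK.

```typescript
// Hypothetical usage block; the actual response field names may differ.
interface Usage {
  tokens: number;
  costUsd: number;
}

interface MeteredResponse {
  operation: "guard" | "redact" | "scan";
  usage: Usage;
}

// Running totals per operation, so spend can be attributed to each capability.
class UsageLedger {
  private totals = new Map<string, Usage>();

  record(response: MeteredResponse): void {
    const current = this.totalFor(response.operation);
    this.totals.set(response.operation, {
      tokens: current.tokens + response.usage.tokens,
      costUsd: current.costUsd + response.usage.costUsd,
    });
  }

  // Operations with no recorded calls report zero usage.
  totalFor(operation: MeteredResponse["operation"]): Usage {
    return this.totals.get(operation) ?? { tokens: 0, costUsd: 0 };
  }

  // True while total spend across all operations is strictly below the cap.
  withinBudget(capUsd: number): boolean {
    let spent = 0;
    this.totals.forEach((usage) => {
      spent += usage.costUsd;
    });
    return spent < capUsd;
  }
}
```

A check like `withinBudget` before each call is one simple way to keep a metered dependency from producing a surprise bill.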

Code Quality: Each surface (sdk/typescript, sdk/python, cli, mcp) has its own tests/ directory with unit tests, including edge cases such as SSRF protection and file-based guard input. This suggests deliberate security-mindedness in the client code itself, not just in the hosted classification behind it.
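
For readers unfamiliar with what an SSRF edge case covers, the following is a generic sketch of the kind of host check such tests usually exercise. It is not the repository's implementation; the function names are hypothetical, and it only handles literal hostnames and IPv4 addresses.

```typescript
// Returns true when an IPv4 literal falls in a loopback, private,
// link-local, or "this network" range that a server-side fetch should avoid.
function isBlockedIPv4(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false; // not an IPv4 literal
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => isNaN(o) || o > 255)) return false;
  const [a, b] = octets;
  return (
    a === 0 || // 0.0.0.0/8 ("this network")
    a === 10 || // 10.0.0.0/8 (private)
    a === 127 || // 127.0.0.0/8 (loopback)
    (a === 169 && b === 254) || // 169.254.0.0/16 (link-local, cloud metadata)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 (private)
    (a === 192 && b === 168) // 192.168.0.0/16 (private)
  );
}

// Combines the IPv4 check with well-known loopback names.
function isBlockedHost(host: string): boolean {
  const h = host.toLowerCase();
  return (
    h === "localhost" ||
    h.endsWith(".localhost") ||
    h === "::1" ||
    h === "[::1]" ||
    isBlockedIPv4(h)
  );
}
```

A production check would also need to resolve DNS names before comparing addresses, cover the remaining IPv6 ranges, and re-validate after redirects; this sketch leaves those out for brevity.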

What Makes It Unique: Most AI-safety tools address a single concern. Superagent’s differentiation is bundling four distinct safety primitives (runtime guardrails, redaction, repo-level attack scanning, and agent red-teaming) behind one client and exposing Guard, Redact, and Scan as MCP tools, so an MCP-compatible agent can call its own safety checks without custom glue code.
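
As a rough picture of what exposing these operations as MCP tools involves, the sketch below lists three tool descriptors in the name/description/inputSchema shape that MCP servers advertise, plus a small argument check. The tool names, parameter names, and descriptions are assumptions for illustration and may not match the actual server.

```typescript
// Shape of a tool entry as advertised by an MCP server:
// a name, a description, and a JSON Schema for the input.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Hypothetical tool names and parameters; the real server's may differ.
const safetyTools: ToolDescriptor[] = [
  {
    name: "guard",
    description: "Check text for prompt injection or unsafe tool calls",
    inputSchema: {
      type: "object",
      properties: { text: { type: "string", description: "Input to check" } },
      required: ["text"],
    },
  },
  {
    name: "redact",
    description: "Strip PII, PHI, and secrets from text",
    inputSchema: {
      type: "object",
      properties: { text: { type: "string", description: "Input to redact" } },
      required: ["text"],
    },
  },
  {
    name: "scan",
    description: "Scan a GitHub repository for AI-agent-targeted attacks",
    inputSchema: {
      type: "object",
      properties: { repo: { type: "string", description: "Repository to scan" } },
      required: ["repo"],
    },
  },
];

// Returns a list of problems; an empty list means the call is well-formed.
function validateToolCall(
  name: string,
  args: Record<string, unknown>
): string[] {
  const tool = safetyTools.find((t) => t.name === name);
  if (!tool) return [`unknown tool: ${name}`];
  return tool.inputSchema.required
    .filter((key) => typeof args[key] !== "string")
    .map((key) => `missing or non-string argument: ${key}`);
}
```

Because the agent discovers tools from descriptors like these, adding a safety check becomes a matter of connecting to the server rather than writing integration code per capability.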

Self-Hosting

Licensing Model: MIT licensed for the SDKs, CLI, and MCP server code. No separate enterprise-gated open-source tier was found.

Self-Hosting Restrictions: The open-source parts are thin clients: Guard, Redact, and Scan requests are sent to Superagent’s hosted API and require an API key from superagent.sh. There is no self-hosted classification engine in this repository, so self-hosters get the client/CLI/MCP code but still depend on Superagent’s hosted backend for the actual safety checks.

Enterprise Features: Not documented in the repository beyond standard API usage and billing.

Cloud vs Self-Hosted: There is effectively one deployment model: open-source clients calling a hosted API. This differs from typical self-hosted OSS apps in this directory, and is worth knowing before adopting it for a fully self-hosted stack.

License Key Required: An API key from superagent.sh is required for all Guard/Redact/Scan/Test operations.
