LLM Gateway is an open-source API gateway designed to simplify interaction with multiple Large Language Model providers such as OpenAI, Anthropic, and Google Vertex AI. It acts as a middleware layer that abstracts provider-specific APIs into a single, standardized OpenAI-compatible interface. This lets developers and teams centralize LLM access, reduce vendor lock-in, and gain visibility into usage patterns, costs, and performance across all connected models. The tool suits engineering teams building AI-powered applications that need to manage multiple LLM providers, track token consumption, and optimize for cost and latency without rewriting application code.
Both self-hosted and hosted versions are available, suiting startups that need a quick setup as well as enterprises that require full data control. The codebase is a TypeScript monorepo: Hono powers the API gateway, Drizzle ORM handles data persistence, and Next.js serves the dashboards, a structure chosen for scalability and maintainability.
What You Get
- Unified API Interface - LLM Gateway exposes a fully OpenAI-compatible endpoint (/v1/chat/completions), so existing applications can migrate by pointing their OpenAI SDK at the gateway's base URL. You switch providers by updating configuration, not application logic; see the sketch after this list.
- Multi-provider Support - Connect to OpenAI, Anthropic (Claude), Google Vertex AI, and other LLM providers through a single gateway with provider-specific configuration.
- API Key Management - Centralize storage and rotation of API keys for multiple LLM providers in one place, reducing secret sprawl and improving security posture.
- Usage Analytics - Track real-time metrics including number of requests, tokens consumed per model, response latency, and estimated costs across all connected LLM providers.
- Performance Monitoring - Compare model performance (speed, cost per token) side-by-side to make data-driven decisions on which provider or model to use for specific workloads.
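Because the endpoint is OpenAI-compatible, existing OpenAI SDK clients only need a new base URL. A minimal sketch, assuming a self-hosted gateway on localhost; the port, environment variable, and model name are illustrative, not taken from the project's documentation:

```typescript
import OpenAI from "openai";

// Point the standard OpenAI SDK at the gateway instead of api.openai.com.
// The base URL and key source are placeholders for your own deployment.
const client = new OpenAI({
  baseURL: "http://localhost:4001/v1",
  apiKey: process.env.LLM_GATEWAY_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "gpt-4o", // switch providers by changing this string, nothing else
  messages: [{ role: "user", content: "Summarize our Q3 support tickets." }],
});

console.log(completion.choices[0].message.content);
```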
Common Use Cases
- Building multi-LLM applications - A startup developing a customer support chatbot routes prompts to GPT-4o for complex queries and Claude 3 for cost-sensitive ones, using LLM Gateway to balance performance and budget (a routing sketch follows this list).
- Enterprise AI platform integration - A company with existing OpenAI integrations adds Anthropic models without modifying 20+ microservices by pointing all LLM calls to the local LLM Gateway instance.
- Cost optimization - Problem: uncontrolled OpenAI usage leads to unexpected bills. Solution: LLM Gateway tracks token usage per team and model, enabling budget caps and alerts before overspending.
- DevOps teams managing cloud-based LLMs - Teams deploying AI features across AWS, GCP, and Azure use LLM Gateway to standardize access to Vertex AI, OpenAI, and custom endpoints through a single authenticated endpoint.
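For the chatbot scenario above, routing can live entirely in a small helper, because the gateway only looks at the model field of each request. A sketch with illustrative model names and an intentionally naive heuristic:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:4001/v1", // same hypothetical gateway as above
  apiKey: process.env.LLM_GATEWAY_API_KEY,
});

// Naive heuristic: long or high-stakes queries go to a stronger (pricier)
// model, everything else to a cheaper one. Tune for your own workload.
export function pickModel(query: string): string {
  const looksComplex = query.length > 500 || /refund|contract|escalate/i.test(query);
  return looksComplex ? "gpt-4o" : "claude-3-haiku";
}

const userQuery = "How do I reset my password?";
const reply = await client.chat.completions.create({
  model: pickModel(userQuery),
  messages: [{ role: "user", content: userQuery }],
});
console.log(reply.choices[0].message.content);
```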
Under The Hood
The project is a monorepo that houses the gateway itself together with its supporting surfaces: an admin interface, API services, and observability tooling. It emphasizes structured development practices with a focus on type safety, component modularity, and scalable deployment patterns.
Architecture
This system adopts a monorepo structure with distinct applications and shared packages, combining a unified codebase with modular boundaries. The architecture follows layered principles, with clear separation between frontend, backend, and shared components; a plausible layout is sketched after the list below.
- Modular monorepo design enables scalable development across multiple applications
- Layered architecture with distinct concerns for authentication, logging, and instrumentation
- Strong emphasis on separation of concerns and component-based organization
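One plausible layout for such a monorepo, inferred from the stack described in this document; the directory names are illustrative, not the repository's actual tree:

```
apps/
  gateway/     # Hono API service exposing the /v1/* endpoints
  dashboard/   # Next.js admin and analytics UI
packages/
  db/          # Drizzle ORM schema and migrations
  auth/        # shared better-auth configuration
  telemetry/   # OpenTelemetry instrumentation helpers
```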
Tech Stack
Built with TypeScript and modern React frameworks, the project leverages a wide range of contemporary tools to support full-stack development and developer experience.
- Primary language is TypeScript, with Next.js and React 19 powering frontend and API layers
- Integrates Hono for API routing (a route sketch follows this list), better-auth for authentication, and OpenTelemetry for observability
- Uses pnpm, Turbo, Vite, and ESLint to support efficient builds, testing, and code quality
- Comprehensive test suite powered by Vitest across core modules and services
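As referenced above, a minimal sketch of how a Hono service might expose the completions route; the handler body and the forwardToProvider helper are hypothetical, not the project's actual code:

```typescript
import { Hono } from "hono";

const app = new Hono();

// Accept an OpenAI-style body and dispatch on its "model" field.
app.post("/v1/chat/completions", async (c) => {
  const body = await c.req.json();
  const response = await forwardToProvider(body); // hypothetical helper
  return c.json(response);
});

// Stub so the sketch is self-contained; a real gateway would route to
// OpenAI, Anthropic, Vertex AI, etc. based on body.model.
async function forwardToProvider(body: { model: string }) {
  return { model: body.model, choices: [] };
}

export default app;
```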
Code Quality
Code quality is maintained through structured testing, consistent error handling, and adherence to style guidelines. While the codebase shows good organization and maintainability, some areas present opportunities for improvement in robustness.
- Extensive test coverage includes unit and integration tests across key modules (a representative test is sketched after this list)
- Error handling is consistently implemented with try/catch blocks and custom error messages
- Code style and naming conventions are mostly consistent, though some technical debt exists
- Moderate learning curve due to multi-layered architecture and tool integration
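A hypothetical Vitest unit test in the style that coverage implies, exercising the pickModel helper sketched earlier; ./routing is an assumed module path, not a project export:

```typescript
import { describe, it, expect } from "vitest";
import { pickModel } from "./routing"; // assumed path to the earlier helper

describe("pickModel", () => {
  it("routes short, simple queries to the cheaper model", () => {
    expect(pickModel("How do I reset my password?")).toBe("claude-3-haiku");
  });

  it("routes high-stakes queries to the stronger model", () => {
    expect(pickModel("Please escalate this contract dispute")).toBe("gpt-4o");
  });
});
```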
What Makes It Unique
This project distinguishes itself by combining admin-driven control, API orchestration, and observability in a single platform.
- Modular monorepo structure enables seamless integration of admin UI, API, and instrumentation layers
- Admin-driven control over LLM routing gives operators a central place to manage AI workflows
- Strong focus on observability and telemetry through OpenTelemetry integration (a tracing sketch follows)
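To make the telemetry point concrete, here is a sketch of per-request tracing with the OpenTelemetry JavaScript API; the span and attribute names are illustrative choices, not ones confirmed by the project:

```typescript
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("llm-gateway"); // illustrative tracer name

// Wrap a provider call in a span so latency, model, and token counts
// show up in whatever backend the deployment exports traces to.
export async function tracedCompletion<T extends { usage?: { total_tokens: number } }>(
  model: string,
  call: () => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan("llm.completion", async (span) => {
    span.setAttribute("llm.model", model);
    try {
      const result = await call();
      if (result.usage) span.setAttribute("llm.tokens.total", result.usage.total_tokens);
      return result;
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```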