Rasa Open Source

Rasa Open Source is a Python machine learning framework for building contextual, multi-turn chatbots and voice assistants that understand natural language and maintain conversation state.

21.2K stars

4.9K forks

Apache License 2.0

Python

Rasa Open Source is a production-grade Python framework for building AI-powered conversational assistants that handle complex, multi-turn dialogues. Unlike simple FAQ bots or rule-based systems, Rasa trains custom NLU models to recognize user intents and extract entities from free-form text, then applies learned dialogue policies to choose the response at each turn. This lets assistants handle interruptions, slot-filling, and context shifts naturally.

The framework is built around a graph-based execution engine where NLU components (tokenizers, featurizers, classifiers, entity extractors) and dialogue policies are defined as pluggable GraphComponents wired together in a directed graph. This makes the pipeline fully customizable: swap in a spaCy tokenizer, a BERT-based featurizer, or a custom action that queries an external database without touching the rest of the system.
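The wiring idea can be illustrated with a small, framework-independent sketch. Everything in it is hypothetical (the `Node` dataclass, `run_graph`, and the three toy components); it does not use Rasa's actual GraphComponent classes and only mimics the principle that each node names its parents and receives their outputs as inputs.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List


@dataclass
class Node:
    """Hypothetical graph node: a callable plus the names of its parent nodes."""
    fn: Callable[..., Any]
    needs: List[str] = field(default_factory=list)


def run_graph(nodes: Dict[str, Node], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Run each node after its parents, passing parent outputs as arguments."""
    results = dict(inputs)
    sorter = TopologicalSorter({name: node.needs for name, node in nodes.items()})
    for name in sorter.static_order():
        if name in results:  # externally supplied input, nothing to compute
            continue
        node = nodes[name]
        results[name] = node.fn(*(results[parent] for parent in node.needs))
    return results


# Toy stand-ins for a tokenizer, a featurizer, and an intent classifier.
def whitespace_tokenizer(text: str) -> List[str]:
    return text.lower().split()


def bag_of_words(tokens: List[str]) -> Dict[str, int]:
    counts: Dict[str, int] = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


def keyword_classifier(features: Dict[str, int]) -> str:
    return "greet" if features.get("hello") or features.get("hi") else "other"


pipeline = {
    "tokens": Node(whitespace_tokenizer, ["message"]),
    "features": Node(bag_of_words, ["tokens"]),
    "intent": Node(keyword_classifier, ["features"]),
}
result = run_graph(pipeline, {"message": "Hello there"})
print(result["intent"])  # greet
```

Replacing the `tokens` entry with a different callable changes tokenization without editing the featurizer or classifier nodes, which is the property the paragraph above describes.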

Rasa ships with native connectors for Slack, Facebook Messenger, Telegram, Twilio, Microsoft Bot Framework, Mattermost, and Rocket.Chat, along with a REST channel for custom integrations. Tracker stores persist conversation history to PostgreSQL, Redis, or MongoDB for production-scale deployments. For high-throughput event processing, a Kafka broker integration enables decoupled, stream-based architectures.
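As a rough illustration of the REST channel, the sketch below builds the HTTP request a custom frontend might send. It assumes a Rasa server reachable at `http://localhost:5005` with the REST channel enabled; the helper names `build_rest_request` and `send_message`, the sender ID, and the message text are made up for this example.

```python
import json
from urllib.request import Request, urlopen


def build_rest_request(base_url: str, sender: str, message: str) -> Request:
    """Build a POST request for the REST channel webhook (assumed to be enabled)."""
    payload = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/webhooks/rest/webhook",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(base_url: str, sender: str, message: str, timeout: float = 10.0) -> list:
    """Send one user message and return the decoded JSON list of bot replies."""
    request = build_rest_request(base_url, sender, message)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; sending it does.
request = build_rest_request("http://localhost:5005", "user-42", "hello")
print(request.full_url)
```

With a server running, `send_message("http://localhost:5005", "user-42", "hello")` would return the bot's replies for that sender; the same `sender` value should be reused across turns so the conversation stays attached to one tracker.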

Rasa Open Source is now in maintenance mode as Rasa shifts focus to its CALM (Conversational AI with Language Models) approach. The 3.6.x line nonetheless remains one of the most widely deployed open source conversational AI frameworks, with over 21,000 GitHub stars and 25 million downloads.

What You Get

Graph-Based NLU Pipeline - A directed acyclic graph of pluggable components (tokenizers, featurizers, DIETClassifier, CRFEntityExtractor) that trains intent classification and entity extraction in a single configurable pass, with support for spaCy, HuggingFace Transformers, and MITIE backends.
TEDPolicy Dialogue Management - The Transformer Embedding Dialogue policy maps full conversation histories into dense embeddings to predict next actions, learning from example stories without requiring exhaustive rule trees for every conversation path.
Custom Action Server - A separate Python REST service where developers write business logic triggered by dialogue state, enabling database queries, API calls, and dynamic slot-filling that integrate with CRMs, ticketing systems, and enterprise backends.
Production Messaging Channel Connectors - Native, OAuth-ready integrations with Slack, Facebook Messenger, Telegram, Twilio, Microsoft Bot Framework, Rocket.Chat, Mattermost, and Webex Teams, plus a generic REST channel for custom frontends.
Event-Sourced Tracker Stores - Conversation history is stored as an ordered event log in a SQL database such as PostgreSQL (via SQLAlchemy) or in Redis, MongoDB, or DynamoDB (via native adapters), enabling full replay and auditability of every interaction.
Kafka and RabbitMQ Event Brokers - Production event streaming to Confluent Kafka (with SASL/SCRAM and TLS support) or RabbitMQ via pika, decoupling downstream analytics and monitoring pipelines from the core bot server.
Rasa CLI - A unified command-line interface with commands for training models, running interactive learning sessions, testing conversations end-to-end, and serving the REST API, all configurable via YAML domain and pipeline files.
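To make the custom action server item above more concrete, here is a minimal sketch of the kind of JSON exchange such a service handles. It deliberately avoids the real SDK: in practice actions are usually written against the rasa_sdk package, and the field names used here (`next_action`, `tracker`, `events`, `responses`) are assumptions about the webhook payload that should be checked against the official documentation. The action name, slot names, and `FAKE_ORDERS` lookup are invented for illustration.

```python
from typing import Any, Dict

# Stand-in for a real backend such as a CRM or order database (hypothetical data).
FAKE_ORDERS = {"A100": "shipped", "B200": "processing"}


def handle_action_call(body: Dict[str, Any]) -> Dict[str, Any]:
    """Turn an incoming action request into events and bot responses."""
    if body.get("next_action") != "action_check_order_status":
        raise ValueError(f"unknown action: {body.get('next_action')!r}")

    # Read the slot that an earlier form or entity extraction is assumed to have filled.
    slots = body.get("tracker", {}).get("slots", {})
    order_id = slots.get("order_id")
    status = FAKE_ORDERS.get(order_id)

    if status is None:
        return {"events": [], "responses": [{"text": "I could not find that order."}]}

    return {
        # Store the lookup result in a slot so later turns can refer to it.
        "events": [{"event": "slot", "name": "order_status", "value": status}],
        "responses": [{"text": f"Order {order_id} is {status}."}],
    }


reply = handle_action_call(
    {"next_action": "action_check_order_status",
     "tracker": {"slots": {"order_id": "A100"}}}
)
print(reply["responses"][0]["text"])  # Order A100 is shipped.
```

Keeping the business logic in a plain function like this makes it easy to unit test separately from whichever HTTP server hosts it.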

Common Use Cases

Customer support automation - A SaaS company deploys a Rasa bot on their website chat to handle password resets, subscription inquiries, and billing questions, routing complex edge cases to human agents via a handoff action.
Banking and fintech assistants - A bank connects Rasa to WhatsApp (for example through the Twilio connector) to let customers check balances, report lost cards, and schedule branch appointments with context-aware dialogue flows connected to core banking APIs.
Internal IT helpdesk bots - An enterprise wires Rasa into Slack to handle employee requests for software access, VPN troubleshooting, and onboarding checklists, triggering Jira tickets and LDAP lookups via custom actions.
Healthcare appointment scheduling - A clinic runs a HIPAA-conscious Rasa deployment for appointment booking, prescription refill requests, and symptom triage with slot-filling forms collecting patient information before routing to staff.
E-commerce shopping assistants - An online retailer uses Rasa to power a conversational product discovery experience that understands user preferences, checks inventory via API, and handles order status queries across Telegram and Messenger.

Under The Hood

Architecture - Rasa’s core is a directed acyclic graph execution engine where every NLU and dialogue component implements the GraphComponent abstract interface, declaring lifecycle hooks for creation, loading, and fingerprinting. Nodes in the graph are typed SchemaNode dataclasses specifying how their constructor and primary function receive inputs from parent nodes, enabling Dask-based parallel execution during training and eager loading during inference. The system separates concerns cleanly: NLU (tokenizers, featurizers, classifiers, entity extractors) lives under one sub-graph, dialogue policies (TEDPolicy, MemoizationPolicy, RulePolicy) under another, and action execution runs in a completely separate process via REST. Conversation state is maintained in an event-sourced DialogueStateTracker that replays stored events to reconstruct context, decoupling storage from in-memory state. As a result, any component can be replaced or extended without touching surrounding logic.
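The event-sourcing idea can be sketched in a few lines. The example below is a simplified illustration, not the DialogueStateTracker itself: the `replay` function, the state layout, and the flat `intent` field on user events are assumptions chosen for brevity.

```python
from typing import Any, Dict, List


def replay(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Rebuild conversation state by folding over an ordered event log."""
    state: Dict[str, Any] = {"slots": {}, "latest_intent": None, "turns": 0}
    for event in events:
        kind = event["event"]
        if kind == "user":
            # Each user message advances the turn counter and updates the intent.
            state["latest_intent"] = event.get("intent")
            state["turns"] += 1
        elif kind == "slot":
            # Later slot events overwrite earlier values for the same slot.
            state["slots"][event["name"]] = event["value"]
        elif kind == "restart":
            # A restart discards everything accumulated so far.
            state = {"slots": {}, "latest_intent": None, "turns": 0}
    return state


# The stored log is the source of truth; in-memory state is derived from it.
event_log = [
    {"event": "user", "intent": "greet"},
    {"event": "slot", "name": "city", "value": "Berlin"},
    {"event": "user", "intent": "book_trip"},
    {"event": "slot", "name": "city", "value": "Paris"},
]
state = replay(event_log)
print(state)
```

Because state is always derived from the log, any backend that can return the events in order can serve as storage, and replaying a prefix of the log reconstructs the conversation as it stood at an earlier turn.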

Tech Stack - Rasa is a Python 3.8-3.10 framework using asyncio throughout, with a Sanic-based async REST API server and a Dask-based graph runner for parallel component execution during training. For NLU, TensorFlow 2.x powers the custom DIETClassifier, a Dual Intent and Entity Transformer that combines transformer-based understanding with CRF entity tagging in a single multi-task model. spaCy provides pretrained tokenizers and word vectors; HuggingFace Transformers are supported via LanguageModelFeaturizer. Persistence uses SQLAlchemy with PostgreSQL or SQLite for tracker stores, and Redis for the session lock store. Event streaming supports Confluent Kafka with SASL/SCRAM and TLS, and RabbitMQ via pika. Dependency management is via Poetry, with Black for formatting, mypy in strict mode for type checking, and pytest for testing.

Code Quality - The test suite is comprehensive, with over 200 test files and more than a thousand test functions spanning core dialogue management, NLU components, channel integrations, and the engine layer, using pytest fixtures, parametrize, and extensive mocking. Type annotations are thorough throughout the codebase, with mypy configured in strict mode enforcing typed defs and calls at every level. Error handling follows a custom exception hierarchy (RasaException, GraphComponentException, GraphSchemaException) with descriptive messages and targeted assertions rather than broad catches. Black and import sorting enforce consistent style; GitHub Actions CI runs full test suites, secret scanning via Trivy, and automated changelog management via towncrier. Some accumulated complexity exists in the API server and tracker store layers, reflecting years of feature additions.

What Makes It Unique - Rasa’s primary innovation is the Dual Intent and Entity Transformer (DIET), a custom multi-task architecture that trains a single transformer to simultaneously classify intents and extract entities, outperforming separate specialized models when training data is limited. TEDPolicy maps full conversation histories into dense embedding spaces to predict next actions without requiring exhaustive rule trees, learning generalizable patterns from example stories. The graph-based execution model introduced in Rasa 3.0 replaced linear pipelines with a first-class DAG where components declare dependencies explicitly and the runner prunes and caches subgraphs during incremental retraining. A pluggy-based plugin hook system allows enterprise extensions to integrate into the lifecycle without forking core code.
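The caching behaviour during incremental retraining can be approximated with a fingerprint per component. The sketch below is an assumption about the general technique, not Rasa's implementation: `fingerprint`, `CachingTrainer`, and the component configs are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def fingerprint(name: str, config: Dict[str, Any], parent_keys: List[str]) -> str:
    """Hash a component's name, config, and parents' fingerprints together."""
    blob = json.dumps(
        {"name": name, "config": config, "parents": parent_keys}, sort_keys=True
    )
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class CachingTrainer:
    """Trains a component only when its fingerprint has not been seen before."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}
        self.trained: List[str] = []  # record of components that actually ran

    def train(self, name: str, config: Dict[str, Any], parent_keys: List[str]) -> str:
        key = fingerprint(name, config, parent_keys)
        if key not in self.cache:
            self.cache[key] = f"artifact:{name}"  # placeholder for a trained model
            self.trained.append(name)
        return key


def train_pipeline(trainer: CachingTrainer, classifier_epochs: int) -> None:
    tokenizer = trainer.train("tokenizer", {"lowercase": True}, [])
    featurizer = trainer.train("featurizer", {"ngrams": 2}, [tokenizer])
    trainer.train("classifier", {"epochs": classifier_epochs}, [featurizer])


trainer = CachingTrainer()
train_pipeline(trainer, classifier_epochs=10)  # first run trains all three
train_pipeline(trainer, classifier_epochs=20)  # only the classifier changed
print(trainer.trained)  # ['tokenizer', 'featurizer', 'classifier', 'classifier']
```

Because each fingerprint includes the parents' fingerprints, a config change in an upstream component would invalidate everything downstream of it, while unrelated branches stay cached.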

Self-Hosting

Rasa Open Source is released under the Apache License 2.0, one of the most permissive open source licenses available. You can use it freely in commercial products, modify the source, and distribute your changes without any copyleft obligations: you are not required to open-source your own conversational assistant built on top of it. Attribution in notices and preservation of the license file in distributions are the primary requirements.

Running Rasa yourself is a meaningful operational commitment. A production deployment requires an always-on Python application server (Sanic), a separate custom action server hosting your business logic, a PostgreSQL or Redis tracker store for conversation history, and optionally a Kafka broker for event streaming to analytics pipelines. You own uptime, model retraining, database backups, scaling under load, and all security patching. The Docker multi-stage builds and docker-compose examples simplify initial setup, but horizontal scaling, SSL termination, and session lock management under concurrent users require deliberate infrastructure decisions. The framework’s maintenance-mode status (as of 2025) means security patches arrive infrequently and new feature development has moved to the CALM-based Hello Rasa platform.

Rasa Technologies offers a commercial Rasa Platform (previously Rasa X / Rasa Enterprise) that adds a conversation review UI, team collaboration tools, analytics dashboards, managed model deployment, and enterprise support SLAs. Self-hosters forgo the model annotation and review interface that makes continuous improvement practical at scale, as well as managed upgrades and high-availability configurations. For teams without dedicated MLOps capacity, the gap between running Rasa locally and operating it reliably in production for thousands of concurrent users is considerable. The hosted commercial offering addresses precisely the operational concerns that the open source framework leaves to the deploying team.
