An AI-powered personal news aggregator that filters multi-source feeds through LLMs and delivers curated, noise-free summaries to your Notion workspace.
Auto-News is an open-source personal content aggregator designed to combat information overload in the AI era. It pulls from Twitter, RSS feeds, YouTube, Reddit, web articles, and personal journal notes, then routes each item through a configurable LLM backend (OpenAI ChatGPT, Google Gemini, or locally run Ollama models via LangChain) to rank, categorize, and summarize content based on your personal interests.
The pipeline runs on Apache Airflow, making it fully schedulable and observable. Each source type has its own dedicated operator that handles pulling, deduplication against Redis, LLM summarization, and publishing results into Notion databases. The result is a unified RSS-reader-style inbox where noisy content is filtered out before it reaches you, with only the highest-ranked insights surfacing for human review.
Auto-News also includes a Weekly Top-k Recap feature that automatically generates periodic digests, as well as an experimental multi-agent Deepdive mode powered by AutoGen that lets you explore any topic across the web through an autonomous search agent. A hosted, managed version (Dots Agent) is available as a commercial offering on iOS, Android, and the web for those who prefer not to self-host.
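The top-k selection behind a weekly recap can be illustrated with a minimal sketch. It assumes each item already carries a numeric relevance score assigned earlier in the pipeline; the `Item` class, the `score` field, and the `top_k` helper are hypothetical names for illustration, not the project's actual code.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    score: float  # hypothetical relevance score assigned by the LLM ranking step


def top_k(items, k):
    """Return the k highest-scoring items, best first."""
    return heapq.nlargest(k, items, key=lambda item: item.score)


# Made-up sample data standing in for one week of collected items.
week = [
    Item("Paper on retrieval", 4.5),
    Item("Celebrity gossip", 0.5),
    Item("New Airflow release", 3.8),
    Item("LLM benchmark thread", 4.1),
]

recap = top_k(week, k=2)
print([item.title for item in recap])
```

`heapq.nlargest` avoids sorting the full week of items when only a handful are needed, which is one reasonable choice when k is small relative to the input.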
Architecture
Auto-News follows a pipeline-operator pattern with Apache Airflow DAGs as the top-level orchestration layer. Each DAG represents a workflow (news pulling, journal processing, weekly recap, deepdive) and is composed of sequential BashOperator tasks that invoke dedicated Python scripts. Below the DAG layer sits a family of Operator classes, one per source type (RSS, YouTube, Twitter, Reddit, articles, journal). Each implements a consistent pull/dedup/summarize/publish contract inherited from a shared OperatorBase. State for deduplication and caching is stored in Redis using template-based key naming, while Notion serves as the output and reading layer. This design separates scheduling concerns from business logic, so supporting a new source type means adding a new Operator without touching the DAG structure.
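The pull/dedup/summarize/publish contract described above might look roughly like the following sketch. Only the `OperatorBase` name comes from the description; the method signatures, the `dedup:{source}:{item_id}` key template, the `RSSOperator` subclass, and the `InMemoryStore` stand-in for Redis are all assumptions made so the example runs without external services.

```python
import hashlib
from abc import ABC, abstractmethod

# Hypothetical key template; the project's actual templates may differ.
DEDUP_KEY_TEMPLATE = "dedup:{source}:{item_id}"


class InMemoryStore:
    """Stand-in for Redis so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def set_if_absent(self, key, value):
        # Same idea as a Redis SET with the NX option: write only if the key is new.
        if key in self._data:
            return False
        self._data[key] = value
        return True


class OperatorBase(ABC):
    source = "base"

    def __init__(self, store):
        self.store = store

    @abstractmethod
    def pull(self):
        """Return raw items as dicts with 'url' and 'text' keys."""

    def dedup(self, items):
        fresh = []
        for item in items:
            item_id = hashlib.sha256(item["url"].encode("utf-8")).hexdigest()[:16]
            key = DEDUP_KEY_TEMPLATE.format(source=self.source, item_id=item_id)
            if self.store.set_if_absent(key, "1"):
                fresh.append(item)
        return fresh

    def summarize(self, items):
        # Placeholder for the LLM call: truncate the text instead of summarizing it.
        return [{**item, "summary": item["text"][:40]} for item in items]

    def publish(self, items):
        # Placeholder for the Notion write: report which URLs would be sent.
        return [item["url"] for item in items]

    def run(self):
        return self.publish(self.summarize(self.dedup(self.pull())))


class RSSOperator(OperatorBase):
    source = "rss"

    def pull(self):
        # Hard-coded sample entries; a real operator would fetch a feed here.
        return [
            {"url": "https://example.com/a", "text": "First article body"},
            {"url": "https://example.com/a", "text": "First article body"},
            {"url": "https://example.com/b", "text": "Second article body"},
        ]


store = InMemoryStore()
first_run = RSSOperator(store).run()   # duplicate URL within the batch is dropped
second_run = RSSOperator(store).run()  # everything was already seen
print(first_run)
print(second_run)
```

Under this shape, a new source type only overrides `pull` (and `source`), which is the property the architecture description attributes to the operator pattern.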
Tech Stack
The backend is written in Python 3.9+ and orchestrated by Apache Airflow, deployed via Docker Compose or via Helm on Kubernetes. LLM integration is handled through LangChain 0.3, with pluggable backends supporting OpenAI (via the openai SDK), Google Gemini (langchain-google-genai), and local inference via Ollama. Content ingestion uses feedparser for RSS, tweepy for Twitter, the YouTube Transcript API and yt-dlp for video, and WebBaseLoader/BeautifulSoup for web articles. Redis acts as the deduplication and caching layer, MySQL stores structured state, and Notion serves as the human-facing reading interface via the notion-client SDK. Vector storage is supported through ChromaDB, Milvus, and Pinecone for the experimental embedding and semantic search features.
Code Quality
The codebase has no automated test suite: no test files, no testing framework configuration, and no CI test step beyond a basic build badge. Error handling is generally present via try/except blocks with traceback printing, but exceptions are frequently swallowed or logged without propagating failures to the Airflow task level. The operator pattern provides reasonable structural consistency, and the LLM prompt library is centralized in a single module. Type annotations are absent throughout the source. The Airflow DAG definitions and operator implementations are well separated, but the overall quality reflects a personal productivity tool that has grown organically rather than a production-grade platform with enforced quality gates.
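To make the point about swallowed exceptions concrete, here is a minimal sketch of a wrapper that still prints the traceback but also turns the failure into a non-zero exit code. Airflow's BashOperator treats a non-zero exit status as a task failure, so a script that ends with this code would surface the error in the DAG run. The `run_step`, `healthy_pull`, and `failing_pull` names are hypothetical.

```python
import traceback


def run_step(step):
    """Run one pipeline step and return a process exit code."""
    try:
        step()
    except Exception:
        # Keep the traceback in the task log, but do not swallow the failure.
        traceback.print_exc()
        return 1
    return 0


def healthy_pull():
    # Stand-in for a step that completes normally.
    pass


def failing_pull():
    # Stand-in for a step that hits an upstream error.
    raise RuntimeError("upstream feed returned HTTP 500")


codes = [run_step(healthy_pull), run_step(failing_pull)]
print(codes)
# In a script invoked by a BashOperator, the last line would pass the code
# to sys.exit() so that Airflow marks the task as failed.
```

By contrast, a bare `except` that only prints the traceback lets the script exit with status 0, and the task is recorded as a success.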
What Makes It Unique
Auto-News stands out by combining heterogeneous multi-source feed aggregation with personalized LLM-based noise filtering in a single self-hostable pipeline, a combination most RSS readers and read-later apps do not attempt. The interest-based ranking, which filters out over 80% of content before it reaches the user, is a meaningful differentiator from simple feed aggregators. The experimental multi-agent Deepdive mode, which uses AutoGen to autonomously search and synthesize reports, goes beyond passive aggregation into active research assistance. The choice to use Notion as the reading front-end rather than a custom web UI is pragmatic and lowers the barrier to use for Notion-first knowledge workers.
Auto-News is released under the MIT License, one of the most permissive open-source licenses available. This means you can use, modify, distribute, and commercially deploy the software, provided you include the original copyright and license notice. There are no copyleft obligations: you are not required to open-source any modifications or applications built on top of it. This makes it a straightforward choice for both personal and commercial self-hosting scenarios.
Running Auto-News yourself requires a meaningful infrastructure footprint. The recommended setup calls for 8 CPU cores, 16 GB of RAM, and 100 GB of disk space; the minimum needed to function is 2 cores and 6 GB of RAM. The stack involves Apache Airflow (with its own scheduler, webserver, and worker processes), Redis, MySQL, and optionally Milvus or another vector database. You are responsible for container orchestration (Docker Compose or Kubernetes via Helm), secret management for API keys (Notion, OpenAI, Twitter), upgrades, and ensuring the Airflow DAGs continue to run on schedule. External API credentials for Twitter, Reddit, and your chosen LLM provider must be obtained and rotated independently.
A managed commercial offering called Dots Agent is available from the same team, with web, iOS, and Android clients. The hosted version removes all infrastructure burden and is described as the quickest path to using the functionality. The self-hosted path gives you full data privacy, the ability to run local LLMs via Ollama instead of paying per-token to OpenAI or Google, and complete control over filtering logic—at the cost of operating a multi-service stack yourself. There is no documented SLA, enterprise support tier, or high-availability deployment guide for the open source version.