timesfm


A pretrained decoder-only foundation model by Google Research that delivers zero-shot time series forecasting with calibrated quantile prediction intervals — no training required.

26.5K stars

2.6K forks

Apache License 2.0

Python


TimesFM (Time Series Foundation Model) is a pretrained foundation model developed by Google Research that brings the zero-shot transfer paradigm from NLP to time series forecasting. Rather than requiring per-dataset model training, TimesFM accepts a univariate time series as input and returns point forecasts alongside calibrated quantile forecasts, covering the 10th through 90th percentiles, without any fine-tuning.

Version 2.5 of the model has 200 million parameters and supports context windows of up to 16,384 time steps, well beyond the 2,048-step limit of earlier versions. The model natively handles variable-length inputs, NaN imputation, and batch inference across hundreds of series at once. An optional quantile head (30M parameters) enables continuous quantile forecasting up to a 1,000-step horizon, and a covariate extension (XReg) lets users layer dynamic or static exogenous signals, such as promotional calendars, day-of-week indicators, or external pricing data, on top of the base model.

TimesFM ships as an installable Python package with separate extras for the PyTorch and Flax (JAX) backends, enabling deployment on CPU, CUDA GPU, TPU, or Apple Silicon. The model weights are distributed through Hugging Face Hub, and the package follows the Hugging Face from_pretrained pattern for one-line model loading. Beyond the open-source library, Google offers TimesFM commercially through BigQuery ML, Connected Sheets in Google Sheets, and Vertex AI Model Garden, so teams that want SQL-level or spreadsheet-level access to the same model can use those managed interfaces.
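As a rough illustration of the one-line loading pattern, the sketch below follows the usage shown in the upstream README at the time of writing. The class name `TimesFM_2p5_200M_torch`, the `ForecastConfig` fields, and the checkpoint id are taken from that README and are assumptions here; they may differ in the version you install. The helper name `zero_shot_forecast` is hypothetical, and the import is deferred so that defining the helper does not require the package.

```python
def zero_shot_forecast(series, horizon=12):
    """Forecast each 1-D array in `series` with the TimesFM 2.5 PyTorch backend.

    Names below follow the upstream README and are assumptions; check them
    against the release you have installed.
    """
    import timesfm  # assumes the package was installed with its torch extra

    # Checkpoint id as listed in the upstream README (assumption: unchanged).
    model = timesfm.TimesFM_2p5_200M_torch.from_pretrained(
        "google/timesfm-2.5-200m-pytorch"
    )

    # compile() is called before forecast(); the limits here are illustrative.
    model.compile(
        timesfm.ForecastConfig(
            max_context=1024,
            max_horizon=256,
            normalize_inputs=True,
            use_continuous_quantile_head=True,
        )
    )

    # Inputs may have different lengths; one forecast row is returned per series.
    point_forecast, quantile_forecast = model.forecast(
        horizon=horizon, inputs=list(series)
    )
    return point_forecast, quantile_forecast
```

Per the README, passing two NumPy arrays with `horizon=12` should yield a point forecast of shape (2, 12) and a quantile forecast of shape (2, 12, 10), where the last axis holds the mean followed by the 10th through 90th percentiles.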

The codebase supports optional fine-tuning via Hugging Face Transformers and PEFT LoRA adapters, letting practitioners adapt the pretrained base to domain-specific distributions without retraining from scratch. A growing set of worked examples covers anomaly detection via quantile intervals, covariate-driven forecasting, and global temperature prediction.
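To make the LoRA idea concrete, here is a minimal NumPy sketch of a low-rank adapter on a single weight matrix. It is not the PEFT API and not the project's fine-tuning code; the rank, scaling factor, and variable names are hypothetical, and only the 1280 width comes from this page.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 1280  # embedding width quoted on this page
rank = 8        # hypothetical LoRA rank
alpha = 16.0    # hypothetical LoRA scaling factor

W = rng.normal(size=(d_model, d_model))      # frozen pretrained weight
A = rng.normal(size=(rank, d_model)) * 0.01  # trainable, small random init
B = np.zeros((d_model, rank))                # trainable, zero init


def lora_forward(x):
    """Apply the frozen weight plus the scaled low-rank update B @ A."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


x = rng.normal(size=(4, d_model))
base_out = x @ W.T
adapted_out = lora_forward(x)  # equals base_out until B is trained

full_params = W.size           # 1,638,400 weights in the dense matrix
lora_params = A.size + B.size  # 20,480 trainable weights in the adapter
print(full_params, lora_params)
```

Because `B` starts at zero, the adapted layer reproduces the pretrained output exactly before training, and only about 1.25% as many parameters are updated for this one matrix.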

What You Get

Zero-shot forecasting for any univariate time series using a pretrained 200M-parameter model — no training data required
Calibrated probabilistic quantile forecasts (10th–90th percentile) via an optional 30M-parameter continuous quantile head
Dual backend support — PyTorch (torch) and Flax/JAX — installable as separate pip extras for GPU, TPU, or Apple Silicon acceleration
Covariate support via XReg extension, enabling fusion of dynamic numerical/categorical and static exogenous variables with the base model
Batch inference API that processes hundreds of variable-length series in a single call, with automatic NaN interpolation and zero-padding
Fine-tuning pathway using Hugging Face Transformers + PEFT LoRA adapters for domain adaptation without retraining from scratch
Integration with BigQuery ML, Google Sheets, and Vertex AI Model Garden for SQL-level and no-code access to the same pretrained checkpoint
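The batch-inference item above mentions NaN interpolation and zero-padding of variable-length series. The following is a sketch of what such preprocessing could look like in NumPy; it is not the library's implementation, and the function names and the left-padding choice are assumptions made for illustration.

```python
import numpy as np


def interpolate_nans(values):
    """Linearly interpolate interior NaNs; edge NaNs take the nearest valid value."""
    values = np.asarray(values, dtype=float)
    valid = ~np.isnan(values)
    if not valid.any():
        return np.zeros_like(values)  # nothing to interpolate from
    idx = np.arange(values.size)
    return np.interp(idx, idx[valid], values[valid])


def pad_batch(series_list, context_len):
    """Left-pad with zeros (or keep only the most recent points) per series.

    Returns the padded batch and a mask that is True where a value is padding.
    """
    batch = np.zeros((len(series_list), context_len))
    is_padding = np.ones((len(series_list), context_len), dtype=bool)
    for row, series in enumerate(series_list):
        clean = interpolate_nans(series)[-context_len:]
        if clean.size:
            batch[row, -clean.size:] = clean
            is_padding[row, -clean.size:] = False
    return batch, is_padding


batch, is_padding = pad_batch(
    [[1.0, np.nan, 3.0], [5.0, 6.0, 7.0, 8.0, 9.0]], context_len=4
)
print(batch)
print(is_padding)
```

The short series has its gap filled and is padded on the left, while the long series is truncated to its four most recent values, so both fit one rectangular array.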

Common Use Cases

Demand forecasting — predict product or service demand across hundreds of SKUs without per-SKU model training using batch inference
Anomaly detection: flag time series values that fall outside the model's 10th to 90th percentile band (an 80% prediction interval) as statistically unusual events
Energy and sensor forecasting — forecast electricity consumption, IoT sensor readings, or industrial telemetry with a single pretrained checkpoint
Financial time series: generate point forecasts and probabilistic interval forecasts for revenue, transaction volume, or stock price series
Covariate-augmented retail forecasting — incorporate promotional calendars, day-of-week indicators, and pricing covariates via XReg to improve accuracy over the base model
Scientific measurement prediction — forecast weather, climate, or laboratory time series with flexible context lengths up to 16k data points
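The anomaly-detection use case above can be sketched as a simple band check. The example uses synthetic numbers in place of real model output, and it assumes a quantile array whose last axis holds the mean at index 0 followed by the 10th through 90th percentiles at indices 1 to 9; verify that layout against the version you use. The function name `flag_anomalies` is hypothetical.

```python
import numpy as np


def flag_anomalies(actuals, quantile_forecast, lower_idx=1, upper_idx=9):
    """Return True where an actual value falls outside the chosen quantile band.

    Assumed layout of the last axis: index 0 is the mean, indices 1..9 are
    the 10th..90th percentiles.
    """
    actuals = np.asarray(actuals, dtype=float)
    lower = quantile_forecast[..., lower_idx]
    upper = quantile_forecast[..., upper_idx]
    return (actuals < lower) | (actuals > upper)


# Synthetic stand-in for model output: a flat forecast of 10 with a fixed spread.
horizon = 5
offsets = np.array([0.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
quantile_forecast = 10.0 + np.tile(offsets, (horizon, 1))  # shape (5, 10)

actuals = np.array([10.5, 13.9, 14.5, 5.0, 6.0])
flags = flag_anomalies(actuals, quantile_forecast)
print(flags)  # [False False  True  True False]
```

Since a well-calibrated 10th to 90th percentile band leaves roughly 20% of ordinary observations outside it, a check like this is better treated as a first-pass screen than as a final alert rule.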

Under The Hood

Architecture

TimesFM implements a decoder-only transformer purpose-built for time series. It mirrors decoder-only language models structurally but operates on numerical patches rather than token embeddings. Input series are sliced into fixed-length patches of 32 time steps, projected into a 1280-dimensional embedding space by a residual-block tokenizer, and passed through 20 stacked transformer layers with 16 attention heads and rotary positional embeddings. The model autoregressively decodes output patches of 128 steps, and an optional second head decodes 1,024-step quantile spread vectors for continuous probabilistic output. Keeping the point forecast and quantile heads separate is intended to avoid quantile crossing, and JIT compilation via torch.compile or XLA enables efficient batched inference. What sets the design apart is the application of the decoder-only, patch-based transformer idiom to general-purpose time series at foundation-model scale.
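The patching arithmetic described above can be sketched in a few lines of NumPy. This is an illustration of the idea only, not the model's code: the left-padding convention and the helper names `patchify` and `decode_steps` are assumptions, while the patch lengths of 32 and 128 come from the description.

```python
import math

import numpy as np

INPUT_PATCH_LEN = 32    # input patch length from the description
OUTPUT_PATCH_LEN = 128  # output patch length from the description


def patchify(series, patch_len=INPUT_PATCH_LEN):
    """Left-pad with zeros to a multiple of patch_len, then reshape to patches."""
    series = np.asarray(series, dtype=float)
    n_patches = max(1, math.ceil(series.size / patch_len))
    padded = np.zeros(n_patches * patch_len)
    if series.size:
        padded[-series.size:] = series
    return padded.reshape(n_patches, patch_len)


def decode_steps(horizon, output_patch_len=OUTPUT_PATCH_LEN):
    """Number of autoregressive decode steps needed to cover the horizon."""
    return math.ceil(horizon / output_patch_len)


patches = patchify(np.arange(100))
print(patches.shape)      # (4, 32)
print(decode_steps(256))  # 2
print(decode_steps(300))  # 3
```

A 100-point series therefore becomes four 32-step patches, each of which would be embedded as one 1280-dimensional token, and a 300-step horizon needs three 128-step decode passes with the surplus steps discarded.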

Tech Stack

The core implementation is pure Python 3.10+ with PyTorch and Flax/JAX as parallel backends that share one framework-agnostic config layer expressed as frozen Python dataclasses. Model weights are stored in safetensors format on Hugging Face Hub and loaded through the Hugging Face Hub client, with PyTorchModelHubMixin providing from_pretrained integration. The covariate extension (XReg) adds a ridge regression layer that runs on JAX, relies on scikit-learn, and can be combined with either the torch or the flax base model. The build system uses setuptools with uv for environment management, and the package publishes to PyPI with optional dependency extras ([torch], [flax], [xreg]) to keep installs lean.
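One way to picture the covariate layer is as a ridge regression on the exogenous signals whose residuals are then forecast by the base model. The sketch below shows that idea in plain NumPy rather than JAX or scikit-learn, with a mean-repeating placeholder where the pretrained model would go. Every name in it is hypothetical, the data is synthetic, and the actual XReg fusion logic may differ.

```python
import numpy as np


def fit_ridge(X, y, l2=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    penalty = l2 * np.eye(X1.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(X1.T @ X1 + penalty, X1.T @ y)


def predict_ridge(coef, X):
    """Evaluate the fitted linear model on new covariates."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def stand_in_forecaster(history, horizon):
    """Hypothetical placeholder for the pretrained model: repeats the mean."""
    return np.full(horizon, history.mean())


rng = np.random.default_rng(0)
n_hist, horizon = 200, 14

# Synthetic promotion flag that is known for both the history and the future.
promo = rng.integers(0, 2, size=n_hist + horizon).astype(float)
y_hist = 50.0 + 20.0 * promo[:n_hist] + rng.normal(0.0, 1.0, n_hist)

X_hist, X_future = promo[:n_hist, None], promo[n_hist:, None]
coef = fit_ridge(X_hist, y_hist, l2=1.0)

# Covariate effect from the regression, remaining signal from the forecaster.
residual_hist = y_hist - predict_ridge(coef, X_hist)
forecast = predict_ridge(coef, X_future) + stand_in_forecaster(residual_hist, horizon)
print(coef, forecast.shape)
```

The split keeps the covariate effect interpretable through the regression coefficients while leaving the pretrained model to handle whatever temporal structure remains in the residuals.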

Code Quality

The codebase is well structured, with a clear separation between framework-agnostic configs (frozen dataclasses), backend-specific layer implementations, and the high-level model classes. Unit tests added in April 2026 use pytest to cover core layer shapes, normalization correctness, config validation, and utility functions. However, CI only runs a package build step and does not execute the tests, so regressions are caught only by local runs. Code style is enforced with ruff (88-character line limit, 2-space indentation), and the dual-backend structure keeps cross-backend consistency high. Error handling in the inference path is explicit, with checked assertions on batch alignment and compile-before-forecast guards. Type annotations are partial but meaningful on public APIs.
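The frozen-dataclass config and compile-before-forecast guard mentioned above might look roughly like the following. This is a hypothetical sketch of the pattern, not code from the repository: the class names, fields, and error messages are invented for illustration, and the forecast itself is a stand-in that repeats the last value.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class SketchForecastConfig:
    """Hypothetical config; field names are illustrative, not the library's."""

    max_context: int = 1024
    max_horizon: int = 256

    def __post_init__(self):
        # Validate once at construction; frozen instances cannot drift later.
        if self.max_context <= 0 or self.max_horizon <= 0:
            raise ValueError("max_context and max_horizon must be positive")


class SketchModel:
    """Hypothetical wrapper showing a compile-before-forecast guard."""

    def __init__(self):
        self._config = None

    def compile(self, config):
        self._config = config

    def forecast(self, horizon, inputs):
        if self._config is None:
            raise RuntimeError("call compile() before forecast()")
        if horizon > self._config.max_horizon:
            raise ValueError("horizon exceeds max_horizon")
        # Stand-in output: repeat each series' last value.
        return [[series[-1]] * horizon for series in inputs]


model = SketchModel()
model.compile(SketchForecastConfig(max_context=512, max_horizon=64))
print(model.forecast(3, [[1.0, 2.0], [5.0]]))  # [[2.0, 2.0, 2.0], [5.0, 5.0, 5.0]]
```

Freezing the config means a compiled model cannot be silently invalidated by a later attribute change; a new config has to be created and compiled again.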

What Makes It Unique

The defining innovation is adapting the decoder-only transformer design that powers large language models to general-purpose time series forecasting, using patches of values in place of tokens, at 200M-parameter scale. This enables zero-shot transfer to unseen domains without task-specific training. Unlike traditional forecasting models such as ARIMA or Prophet that require per-series fitting, or specialized deep learning models that demand domain-specific training data, TimesFM is designed to generalize across domains out of the box. The combination of a context window of up to 16k steps, a separate continuous quantile head aimed at preventing quantile crossing, and the XReg covariate layer places it between pure statistical methods and heavyweight end-to-end deep learning pipelines.

Self-Hosting

TimesFM is released under the Apache License 2.0, one of the most permissive open-source licenses available. You are free to use it commercially, modify the source code, distribute it, and integrate it into proprietary products without any copyleft obligations, so you do not need to open-source your downstream application. The main conditions are that you include the license, preserve copyright and attribution notices (including any NOTICE file) in distributions, and mark files you have modified. The model weights distributed through Hugging Face Hub carry the same Apache 2.0 terms.

Running TimesFM yourself requires a Python 3.10+ environment with either PyTorch or JAX installed. The 2.5 model weighs roughly 800 MB on disk and needs approximately 1.5 GB of RAM when running on CPU, or about 1 GB of GPU VRAM. For batch workloads across hundreds of series, a CUDA-capable GPU or a TPU substantially reduces wall-clock time. You are responsible for provisioning hardware, upgrading as new checkpoints are released to Hugging Face, and scaling inference. The covariate extension (XReg) adds a JAX dependency even when using the PyTorch base model, which can complicate environment management.
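The on-disk figure is consistent with simple arithmetic on the parameter count. The sketch below assumes the weights are stored as 32-bit floats, which the page does not state, and ignores file-format overhead; the half-precision line is a hypothetical comparison rather than a claim about any published checkpoint.

```python
def weight_bytes(n_params, bytes_per_param):
    """Storage needed for the raw weights, ignoring file-format overhead."""
    return n_params * bytes_per_param


N_PARAMS = 200_000_000  # parameter count quoted on this page

fp32_mb = weight_bytes(N_PARAMS, 4) / 1e6  # 32-bit floats (assumed storage type)
bf16_mb = weight_bytes(N_PARAMS, 2) / 1e6  # hypothetical half-precision copy

print(f"float32: {fp32_mb:.0f} MB")   # float32: 800 MB
print(f"bfloat16: {bf16_mb:.0f} MB")  # bfloat16: 400 MB
```

Runtime memory is higher than the weight size alone because activations, input batches, and framework overhead are added on top, which fits the larger RAM figure quoted for CPU inference.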

Google offers managed access to the same underlying model through BigQuery ML (SQL interface, no infrastructure), Google Sheets via Connected Sheets (spreadsheet interface), and Vertex AI Model Garden (dockerized REST endpoint). These managed tiers add enterprise SLAs, horizontal scaling, and Google Cloud support, all of which you give up when self-hosting. The open-source repository explicitly notes that it is not an officially supported Google product, so production users relying on the open library should plan for the absence of a guaranteed support channel or formal SLA.