Distributed ARIMA models for ultra-long time series

Xiaoqian Wang, Yanfei Kang, Rob J. Hyndman, Feng Li

Research output: Contribution to journal › Article › Research › peer-review

18 Citations (Scopus)


Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle the challenges of forecasting ultra-long time series using the industry-standard MapReduce framework. The proposed model combination approach retains the local time dependency. It uses a straightforward split across samples to facilitate distributed forecasting: the local estimators of time series models delivered from worker nodes are combined by minimizing a global loss function. Instead of unrealistically assuming that the data generating process (DGP) of an ultra-long time series stays invariant, we only make assumptions on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed approach with AutoRegressive Integrated Moving Average (ARIMA) models using a real data application as well as numerical simulations. Compared to directly fitting the whole data with ARIMA models, our approach improves forecasting accuracy and computational efficiency in point forecasts and prediction intervals, especially for longer forecast horizons. Moreover, we explore some potential factors that may affect the forecasting performance of our approach.
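
The split-fit-combine idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several simplifying assumptions: a pure AR(p) model fitted by ordinary least squares stands in for a full ARIMA model, the "worker nodes" are simulated by a serial loop rather than a MapReduce cluster, and the global loss is taken to be a precision-weighted squared distance between the combined and local estimates. All function names (`simulate_ar`, `fit_local_ar`, `combine`, `distributed_ar`) are hypothetical.

```python
import numpy as np


def simulate_ar(coefs, n, seed=0):
    """Simulate n observations from a zero-mean AR(p) process (illustrative data only)."""
    coefs = np.asarray(coefs, dtype=float)
    rng = np.random.default_rng(seed)
    p = len(coefs)
    y = np.zeros(n + p)
    noise = rng.standard_normal(n + p)
    for t in range(p, n + p):
        # y[t - p:t][::-1] is (y[t-1], y[t-2], ..., y[t-p])
        y[t] = coefs @ y[t - p:t][::-1] + noise[t]
    return y[p:]


def fit_local_ar(y, p):
    """OLS fit of AR(p) on one subseries; returns (theta_k, precision_k)."""
    # Lagged design matrix: column j holds y[t-1-j] for t = p, ..., len(y)-1
    X = np.column_stack([y[p - 1 - j:len(y) - 1 - j] for j in range(p)])
    target = y[p:]
    xtx = X.T @ X
    theta = np.linalg.solve(xtx, X.T @ target)
    resid = target - X @ theta
    sigma2 = resid @ resid / (len(target) - p)
    # Precision = inverse of the estimated OLS covariance sigma2 * (X'X)^{-1}
    return theta, xtx / sigma2


def combine(local_fits):
    """Minimize sum_k (theta - theta_k)' P_k (theta - theta_k) over theta."""
    total_precision = sum(prec for _, prec in local_fits)
    weighted_sum = sum(prec @ theta for theta, prec in local_fits)
    return np.linalg.solve(total_precision, weighted_sum)


def distributed_ar(y, p, n_splits):
    """Split the series into contiguous subseries, fit each locally, then combine."""
    chunks = np.array_split(y, n_splits)               # contiguous blocks keep local time order
    local_fits = [fit_local_ar(c, p) for c in chunks]  # "map" step (one fit per worker)
    return combine(local_fits)                         # "reduce" step


if __name__ == "__main__":
    series = simulate_ar([0.5, -0.3], n=20_000, seed=1)
    print(distributed_ar(series, p=2, n_splits=10))
```

Because each subseries is a contiguous block, the lag structure inside every block is preserved, and only the local estimates and their precision matrices need to be sent to the combining step, not the raw data.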

Original language: English
Pages (from-to): 1163-1184
Number of pages: 22
Journal: International Journal of Forecasting
Issue number: 3
Publication status: Published - Jul 2023


Keywords

  • ARIMA models
  • Distributed forecasting
  • Least squares approximation
  • MapReduce
  • Ultra-long time series
