TY - JOUR
T1 - Distributed ARIMA models for ultra-long time series
AU - Wang, Xiaoqian
AU - Kang, Yanfei
AU - Hyndman, Rob J.
AU - Li, Feng
N1 - Funding Information:
Yanfei Kang is supported by the National Natural Science Foundation of China (No. 72171011) and Feng Li is supported by the Emerging Interdisciplinary Project of CUFE, China and the Beijing Universities Advanced Disciplines Initiative, China (No. GJJ2019163). This research is supported by Alibaba Group through the Alibaba Innovative Research Program and the high-performance computing (HPC) resources at Beihang University.
Publisher Copyright:
© 2022 Elsevier GmbH
PY - 2023/7
Y1 - 2023/7
AB - Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle the challenges of forecasting ultra-long time series using the industry-standard MapReduce framework. The proposed model combination approach retains the local time dependency. It utilizes a straightforward splitting across samples to facilitate distributed forecasting by combining the local estimators of time series models delivered from worker nodes and minimizing a global loss function. Instead of unrealistically assuming the data generating process (DGP) of an ultra-long time series stays invariant, we only make assumptions on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed approach with AutoRegressive Integrated Moving Average (ARIMA) models using a real data application as well as numerical simulations. Our approach improves forecasting accuracy and computational efficiency in point forecasts and prediction intervals, especially for longer forecast horizons, compared to directly fitting the whole data with ARIMA models. Moreover, we explore some potential factors that may affect the forecasting performance of our approach.
KW - ARIMA models
KW - Distributed forecasting
KW - Least squares approximation
KW - MapReduce
KW - Ultra-long time series
UR - https://www.scopus.com/pages/publications/85135898583
U2 - 10.1016/j.ijforecast.2022.05.001
DO - 10.1016/j.ijforecast.2022.05.001
M3 - Article
AN - SCOPUS:85135898583
SN - 0169-2070
VL - 39
SP - 1163
EP - 1184
JO - International Journal of Forecasting
JF - International Journal of Forecasting
IS - 3
ER -