A note on the validity of cross-validation for evaluating autoregressive time series prediction

Research output: Contribution to journal › Article › Research › peer-review

233 Citations (Scopus)

Abstract

One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). However, in time series forecasting its application is not straightforward because of the inherent serial correlation and potential non-stationarity of the data, and practitioners often replace it with an out-of-sample (OOS) evaluation. It is shown that for purely autoregressive models, the use of standard K-fold CV is possible provided the models considered have uncorrelated errors. Such a setup occurs, for example, when the models nest a more appropriate model. This is very common when Machine Learning methods are used for prediction, where CV can also control for overfitting the data. Theoretical insights supporting these arguments are presented, along with a simulation study and a real-world example. It is shown empirically that K-fold CV performs favourably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.
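The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the simulated AR(2) process, the helper names (`embed`, `fit_ar`, `kfold_cv_mse`, `oos_mse`, `simulate_ar2`), the least-squares fit, and the 80/20 OOS split are all assumptions made for illustration. The series is embedded into lagged rows, an AR(3) model (which nests the true AR(2) process, so its errors should be uncorrelated) is fitted, and the error estimate from shuffled K-fold CV is compared with a single OOS split.

```python
import numpy as np


def embed(series, p):
    """Build (X, y): each row of X holds the p values preceding the target in y."""
    n = len(series)
    # Column k holds lag k+1 of the target.
    X = np.column_stack([series[p - k - 1 : n - k - 1] for k in range(p)])
    y = series[p:]
    return X, y


def fit_ar(X, y):
    """Least-squares fit of a linear autoregression with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ar(coef, X):
    return coef[0] + X @ coef[1:]


def kfold_cv_mse(X, y, k=5, seed=0):
    """Standard K-fold CV on the embedded rows, with random fold assignment."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        coef = fit_ar(X[train_mask], y[train_mask])
        resid = y[test_idx] - predict_ar(coef, X[test_idx])
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))


def oos_mse(X, y, test_fraction=0.2):
    """Out-of-sample evaluation: train on the first part, test on the last part."""
    split = int(len(y) * (1 - test_fraction))
    coef = fit_ar(X[:split], y[:split])
    resid = y[split:] - predict_ar(coef, X[split:])
    return float(np.mean(resid ** 2))


def simulate_ar2(n, phi=(0.5, -0.3), seed=0, burn_in=200):
    """Simulate a stationary AR(2) process with unit-variance Gaussian noise."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(2, n + burn_in):
        x[t] = phi[0] * x[t - 1] + phi[1] * x[t - 2] + eps[t]
    return x[burn_in:]


series = simulate_ar2(1000)
X, y = embed(series, p=3)  # AR(3) nests the true AR(2) process
cv_error = kfold_cv_mse(X, y, k=5)
oos_error = oos_mse(X, y)
print(f"5-fold CV MSE: {cv_error:.3f}")
print(f"OOS MSE:       {oos_error:.3f}")
```

Both estimates target the one-step-ahead prediction error, which in this simulation is the unit noise variance. K-fold CV uses every embedded row for testing once, whereas the OOS estimate relies only on the final 20% of the rows.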

Original language: English
Pages (from-to): 70-83
Number of pages: 14
Journal: Computational Statistics and Data Analysis
Volume: 120
Publication status: Published - 1 Apr 2018

Keywords

  • Autoregression
  • Cross-validation
  • Time series
