Abstract
We present a nonparametric method for forecasting a seasonal time series and propose four dynamic updating methods to improve point forecast accuracy. Our forecasting and dynamic updating methods are data-driven and computationally fast, and are therefore feasible to apply in practice. We demonstrate the effectiveness of these methods using the monthly El Niño time series from 1950 to 2008 (http://www.cpc.noaa.gov/data/indices/sstoi.indices).

Let $\{Z_w, w \in [0, \infty)\}$ be a seasonal univariate time series that has been observed at $N$ equispaced times. Aneiros-Pérez & Vieu (2008) assume that $N$ can be written as $N = np$, where $n$ is the number of samples and $p$ is the dimensionality. For the El Niño time series from 1950 to 2008, we have $N = 708$, $n = 59$ and $p = 12$. The observed time series $\{Z_1, \cdots, Z_{708}\}$ can thus be divided into 59 successive paths of length 12, $y_t = \{Z_w, w \in (p(t-1), pt]\}$ for $t = 1, \cdots, 59$. The problem is to forecast future processes, denoted $y_{n+h}$ with $h > 0$, from the observed data.

To solve this problem, we apply a nonparametric method known as principal component analysis (PCA) to decompose the complete $12 \times 59$ data matrix $Y = [y_1, \cdots, y_{59}]$ into a number of principal components and their associated principal component scores. That is,

$$Y = \mu + \phi_1' \beta_1 + \cdots + \phi_K' \beta_K + \varepsilon,$$

where $\mu = [\mu_1, \cdots, \mu_{12}]'$ is the pointwise mean vector; $\phi_1, \cdots, \phi_K \in R^{12}$, with $\phi_k = [\phi_{1,k}, \cdots, \phi_{12,k}]$, are the estimated principal components; $\beta_1, \cdots, \beta_K$, with $\beta_k = [\beta_{1,k}, \cdots, \beta_{59,k}]'$, are uncorrelated principal component scores satisfying $\sum_{k=1}^{K} \beta_k^2 < \infty$ for $k = 1, \cdots, K$; $\varepsilon$ is assumed to be a zero-mean $12 \times 59$ residual matrix; and $K < 12$ is the optimal number of components. Since $\beta_1, \cdots, \beta_K$ are uncorrelated, each can be forecast separately using a univariate time series (TS) method, such as exponential smoothing (Hyndman et al., 2008).
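The slicing and decomposition described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the series is synthetic rather than the El Niño data, `K = 3` is chosen arbitrarily rather than optimally, the helper names (`slice_series`, `pca_decompose`, `ses_forecast`) are hypothetical, and simple exponential smoothing with a fixed `alpha` stands in for the automatically selected exponential smoothing models of Hyndman et al. (2008).

```python
import numpy as np

def slice_series(z, p):
    """Cut a series of length N = n*p into n paths of length p (a p x n matrix)."""
    z = np.asarray(z, dtype=float)
    if z.size % p != 0:
        raise ValueError("series length must be a multiple of p")
    return z.reshape(-1, p).T  # column t holds the path y_t

def pca_decompose(Y, K):
    """Return the mean (p,), components Phi (p x K), scores B (n x K), residual (p x n)."""
    mu = Y.mean(axis=1)                    # pointwise mean over the n paths
    Yc = Y - mu[:, None]                   # centred data matrix
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    Phi = U[:, :K]                         # first K principal components
    B = Yc.T @ Phi                         # principal component scores
    resid = Yc - Phi @ B.T                 # what the K components do not explain
    return mu, Phi, B, resid

def ses_forecast(x, alpha=0.3):
    """Simple exponential smoothing; the forecast is flat for every horizon h > 0."""
    level = x[0]
    for value in x[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

# Synthetic stand-in for a monthly seasonal series (NOT the El Nino data).
rng = np.random.default_rng(0)
n, p, K = 59, 12, 3                        # K = 3 is an arbitrary illustrative choice
months = np.arange(n * p)
z = 25 + 2 * np.sin(2 * np.pi * months / p) + rng.normal(scale=0.5, size=n * p)

Y = slice_series(z, p)                     # 12 x 59 data matrix
mu, Phi, B, resid = pca_decompose(Y, K)

# Forecast each score series separately, then rebuild the next path.
beta_next = np.array([ses_forecast(B[:, k]) for k in range(K)])
y_next = mu + Phi @ beta_next              # point forecast of the next 12 values
```

Because the sample scores obtained this way are mutually orthogonal, forecasting each column of `B` on its own loses no cross-correlation information, which is the reason a univariate method suffices.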
Conditioning on the observed data $I$ and the fixed principal components $\phi = (\phi_1, \cdots, \phi_K)$, the forecasted curves are given by

$$\hat{y}_{n+h|n} = E(y_{n+h} \mid I, \phi) = \mu + \phi_1' \beta_{1,n+h|n} + \cdots + \phi_K' \beta_{K,n+h|n}, \tag{1}$$

where $\beta_{k,n+h|n}$, $k = 1, \cdots, K$, are the forecasted principal component scores.

An interesting problem arises when $N \neq np$, which violates the assumption $N = np$ made in Aneiros-Pérez & Vieu (2008). In other words, the final year is only partially observed. This motivates us to develop four dynamic updating methods, not only to update our point forecasts, but also to remove the assumption in Aneiros-Pérez & Vieu (2008). The four dynamic updating methods are block moving (BM), ordinary least squares (OLS), penalized least squares (PLS), and ridge regression (RR). The BM approach rearranges the observed data to form a complete data matrix by sacrificing some observations in the first year, so that (1) can still be applied. The OLS method treats the partially observed data in the final year as responses and regresses them against the corresponding principal components, but it does not take historical data into account. The PLS method combines the advantages of the TS and OLS methods, while the RR method is a well-known shrinkage method for solving ill-posed problems.
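The following sketch illustrates how the four updating ideas might look in code. The abstract does not state the estimators explicitly, so the formulas here are assumptions: OLS is the usual least-squares fit on the observed months, RR adds a quadratic penalty that shrinks the scores toward zero, and the PLS variant is assumed to shrink toward a TS forecast of the scores. All function names, the synthetic data, the penalty `lam`, and the naive choice of `beta_ts` (the last observed scores) are hypothetical.

```python
import numpy as np

def fit_pca(Y, K):
    """Mean (p,), first K principal components (p x K) and scores (n x K)."""
    mu = Y.mean(axis=1)
    Yc = Y - mu[:, None]
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    Phi = U[:, :K]
    return mu, Phi, Yc.T @ Phi

def update_scores(y_e, mu, Phi, lam=0.0, beta_ts=None):
    """Estimate the scores of the partially observed year from its first m0 values.

    lam == 0               -> OLS
    lam > 0, beta_ts None  -> ridge regression (shrinks toward zero)
    lam > 0, beta_ts given -> assumed PLS form (shrinks toward the TS forecast)
    """
    m0, K = len(y_e), Phi.shape[1]
    Phi_e = Phi[:m0]                                  # components on observed months
    target = np.zeros(K) if beta_ts is None else np.asarray(beta_ts, dtype=float)
    A = Phi_e.T @ Phi_e + lam * np.eye(K)
    b = Phi_e.T @ (y_e - mu[:m0]) + lam * target
    return np.linalg.solve(A, b)

def forecast_rest(mu, Phi, beta, m0):
    """Point forecast of the unobserved months m0+1, ..., p of the final year."""
    return mu[m0:] + Phi[m0:] @ beta

def block_moving(z, p):
    """Drop the earliest observations so the rest fills complete blocks of length p."""
    z = np.asarray(z, dtype=float)
    return z[z.size % p:].reshape(-1, p).T

# Synthetic stand-in: 59 complete years plus m0 = 5 months of the following year.
rng = np.random.default_rng(1)
n, p, K, m0 = 59, 12, 3, 5
t = np.arange(n * p + m0)
z = 25 + 2 * np.sin(2 * np.pi * t / p) + rng.normal(scale=0.5, size=t.size)

Y = z[: n * p].reshape(n, p).T        # complete 12 x 59 data matrix
y_e = z[n * p:]                       # partially observed final year
mu, Phi, B = fit_pca(Y, K)
beta_ts = B[-1]                       # naive placeholder for a TS forecast of the scores

beta_ols = update_scores(y_e, mu, Phi)
beta_rr = update_scores(y_e, mu, Phi, lam=1.0)
beta_pls = update_scores(y_e, mu, Phi, lam=1.0, beta_ts=beta_ts)
rest_pls = forecast_rest(mu, Phi, beta_pls, m0)

Y_bm = block_moving(z, p)             # BM: blocks now start at month m0 + 1
```

Note that the OLS step needs at least `K` observed months for the normal equations to be solvable; with fewer observations the problem is ill-posed, which is where a penalty term becomes useful. After block moving, `Y_bm` is a complete matrix again, so the same decomposition and forecasting step can be applied to it directly.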
Original language | English |
---|---|
Title of host publication | 18th World IMACS Congress and MODSIM09 International Congress on Modelling and Simulation: Interfacing Modelling and Simulation with Mathematical and Computational Sciences, Proceedings |
Pages | 1552-1558 |
Number of pages | 7 |
Publication status | Published - 2009 |
Event | International Congress on Modelling and Simulation 2009: Interfacing Modelling and Simulation with Mathematical and Computational Sciences - Cairns, Australia; Duration: 13 Jul 2009 → 17 Jul 2009; Conference number: 18th; https://www.mssanz.org.au/modsim09/ |
Conference
Conference | International Congress on Modelling and Simulation 2009 |
---|---|
Abbreviated title | MODSIM 2009 |
Country/Territory | Australia |
City | Cairns |
Period | 13/07/09 → 17/07/09 |
Internet address | https://www.mssanz.org.au/modsim09/ |
Keywords
- El Niño time series
- Penalized least squares
- Principal component regression