Time Series Forecasting (Part II)



Duong Tuan Anh
Faculty of Computer Science and Engineering
September 2011

Outline
- Stationary and nonstationary processes
- Autocorrelation function
- Autoregressive (AR) models
- Moving average (MA) models
- ARMA models
- ARIMA models
- Estimating and checking ARIMA models (Box-Jenkins methodology)

Stochastic Processes
The time series models in this part all rest on one important assumption: that the series to be forecasted has been generated by a stochastic process. We assume that each value $X_1, X_2, \ldots, X_T$ in the series is drawn randomly from a probability distribution. In modeling such a process, we try to describe the characteristics of its randomness. We can regard the observed series as drawn from a set of random variables, denoted by $\{X_t, t \in T\}$, where $T$ is the set of time indices.

Stationary and Nonstationary Processes
We want to know whether the underlying stochastic process that generated the time series is invariant with respect to time. If the characteristics of the stochastic process change over time, i.e., if the process is nonstationary, it will be difficult to represent the time series by a simple algebraic model. If the stochastic process is fixed in time, i.e., if it is stationary, then one can model the process via an equation with fixed coefficients that can be estimated from past data.
The models described here represent stochastic processes that are assumed to be in equilibrium about a constant mean level. The probability of a given fluctuation in the process from that mean level is assumed to be the same at any point in time.

Stationary processes
Mathematically, a stochastic process is called stationary if its first and second moments are fixed and do not change over time. The first moment is the mean, $E[X_t]$, and the second moment is the covariance between $X_t$ and $X_{t+k}$. A covariance taken between values of the same random variable at different times is called auto-covariance.
The variance of a process, $\mathrm{Var}[X_t]$, is the special case of the auto-covariance at lag $k = 0$. Therefore, a process is called stationary if:

Mean: $E[X_t] = \mu$

[...]

[...] $p$ the covariances are determined from:

$$\gamma_k = \phi_1\gamma_{k-1} + \phi_2\gamma_{k-2} + \cdots + \phi_p\gamma_{k-p} \qquad (9)$$

Now by dividing the left-hand and right-hand sides of the equations in (8) by $\gamma_0$, we can derive a set of $p$ equations that together determine the first $p$ values of the ACF:

$$\rho_1 = \phi_1 + \phi_2\rho_1 + \cdots + \phi_p\rho_{p-1}$$
$$\rho_2 = \phi_1\rho_1 + \phi_2 + \cdots + \phi_p\rho_{p-2}$$
$$\vdots \qquad (10)$$
$$\rho_p = \phi_1\rho_{p-1} + \phi_2\rho_{p-2} + \cdots + \phi_p$$

For displacement $k > p$ we have, from Eq. (9):

$$\rho_k = \phi_1\rho_{k-1} + \phi_2\rho_{k-2} + \cdots + \phi_p\rho_{k-p} \qquad (11)$$

The equations in Eq. (10) are the Yule-Walker equations. If $\rho_1, \rho_2, \ldots, \rho_p$ are known, then the equations can be solved for $\phi_1, \phi_2, \ldots, \phi_p$, and vice versa.

Unfortunately, solving the Yule-Walker equations requires knowledge of $p$, the order of the AR process. Therefore, we solve the Yule-Walker equations for successive values of $p$. Suppose we begin by hypothesizing that $p = 1$. Then Eqs. (10) reduce to $\rho_1 = \phi_1$ or, using the sample autocorrelations, $\rho'_1 = \phi'_1$. Thus, if the calculated value $\phi'_1$ is significantly different from 0, we know that the AR process is at least of order 1. Let us denote this value $\phi'_1$ by $a_1$.

Now let us consider the hypothesis that $p = 2$ and solve Eqs. (10) for $p = 2$. Doing this gives us a new set of estimates $\phi'_1$ and $\phi'_2$. If $\phi'_2$ is significantly different from 0, we can conclude that the AR process is at least of order 2, while if $\phi'_2$ is approximately 0, we can conclude that $p = 1$. Let us denote the value $\phi'_2$ by $a_2$.

We now repeat this process for successive values of $p$. For $p = 3$, we obtain an estimate $\phi'_3$, which we denote by $a_3$, and so on. We call the series $a_1, a_2, a_3, \ldots$ the partial autocorrelation function (PACF) and note that we can infer the order of the autoregressive process from its behavior. In particular, if the true order of the process is $p$, we should observe that $a_j \approx 0$ for $j > p$. In other words, for an AR(p) series, the sample PACF cuts off at lag $p$.
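The successive Yule-Walker fits described above can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own code: the helper names `solve`, `ar2_acf`, and `pacf_from_acf` are hypothetical, the linear systems are solved by plain Gaussian elimination, and the theoretical ACF of an AR(2) process with assumed coefficients 1.69 and -0.8 stands in for sample autocorrelations so that the cut-off is exact.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ar2_acf(phi1, phi2, max_lag):
    """Theoretical ACF of a stationary AR(2): rho_1 from Eq. (10), then recursion (11)."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for k in range(2, max_lag + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho


def pacf_from_acf(rho, max_lag):
    """Return [a_1, ..., a_max_lag]; a_p is the last coefficient of the order-p Yule-Walker fit."""
    pacf = []
    for p in range(1, max_lag + 1):
        big_r = [[rho[abs(i - j)] for j in range(p)] for i in range(p)]
        small_r = [rho[k] for k in range(1, p + 1)]
        pacf.append(solve(big_r, small_r)[-1])
    return pacf


rho = ar2_acf(1.69, -0.8, 6)
for lag, a in enumerate(pacf_from_acf(rho, 4), start=1):
    print(f"a_{lag} = {a:+.4f}")
```

For this AR(2) input, $a_2$ equals the second AR coefficient (-0.8) and the values at lags 3 and 4 vanish up to rounding error, which is the cut-off at lag $p$ described above. With real data, `rho` would hold sample autocorrelations and the higher-lag values would only be approximately zero.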
Let us look at the second-order autoregressive process AR(2):

$$y_t = 1.69\,y_{t-1} - 0.8\,y_{t-2} + \varepsilon_t$$

Figure: The graph of 120 observations on a series generated by the AR(2) process $y_t = 1.69\,y_{t-1} - 0.8\,y_{t-2} + \varepsilon_t$, together with the theoretical and empirical ACFs (middle) and the theoretical and empirical PACFs (bottom). The theoretical values correspond to the solid bars.

How to check whether $a_j$ is nonzero
To test whether a particular $a_j$ is zero, we can use the fact that it is approximately normally distributed, with mean zero and variance $1/T$ ($T$ is the number of data points in the time series). Hence, we can check whether it is statistically significant at, say, the 5 percent level by determining whether it exceeds $2/\sqrt{T}$ in magnitude.

How to estimate parameters of AR(p)
For an AR(p) process, the difference equation for its autocorrelation function is given by:

$$\rho_k = \phi_1\rho_{k-1} + \phi_2\rho_{k-2} + \cdots + \phi_p\rho_{k-p}$$

We can rewrite this equation as a set of $p$ simultaneous linear equations relating the parameters $\phi_1, \ldots, \phi_p$ to $\rho_1, \ldots, \rho_p$:

$$\rho_1 = \phi_1 + \phi_2\rho_1 + \cdots + \phi_p\rho_{p-1}$$
$$\rho_2 = \phi_1\rho_1 + \phi_2 + \cdots + \phi_p\rho_{p-2}$$
$$\vdots$$
p= 1p-1 + 2p-2 +..+ p Using these Yule-Walker equations to solve for the parameters 1 ,, p in terms of the estimated values of the autocorrelation function, we arrive at the estimates of the parameters 1, 2,,p.24Moving Average (MA) Models Time series is called a moving average process of order q (MA(q)) if each observation can be written as the following equation: yt = t - 1t-1 - 2t-2 - qt-q (3.10) where random disturbance component {t } is white noise process with 0 mean, constant variance 2 and auto-covariance k = 0 for k  0.White noise processes may not occur very commonly, but weighted sums of a white noise process can provide a good representation of processes that are non-white noise.So, in the MA(q) each observation yt is generated by a weighted average of random disturbances going back q periods.25Lag operatorWe can rewrite equation (3.10) in the form: yt = (B)t where: (B)= 1 - 1B - 2B2 - - qBq is a polynomial of order q in B, and B is the lag operator which is used to describe the lag in time. Bt = t-1The mean of the moving average process is independent of time since E[yt] =  and  = 0.Each t is assumed to be generated by the same white noise process, so that E[t] = 0, variance 2 and covariance k = 0 for k  0 26Stationary MA(q)Let us now look at the variance, denoted by 0 of the moving average process of order q: Var[yt] = 0 = E[(yt - )2] =E(t2 +12t2 + q2 t-q2 - 21tt-1-) = 2(1 + 12 + 22 ++ q2)From the above equation, we see that if MA(q) is the realization of a stationary random process, it must satisfy the following conditions: 1 + 12 + 22 ++ q2 1Thus the MA(1) process has a covariance of 0 when the displacement is more than 1 period.We now can determine the autocorrelation function for the process MA(1): k = k/0 = - 1/(1+ 12) for k = 1 = 0 for k > 129MA(2)Now let us examine the moving average process of order 2. 
The process is denoted by MA(2), and its equation is:

$$y_t = \mu + \varepsilon_t - \theta_1\varepsilon_{t-1} - \theta_2\varepsilon_{t-2}$$

This process has mean $\mu$ and variance $\gamma_0 = \sigma_\varepsilon^2(1 + \theta_1^2 + \theta_2^2)$, and covariances given by

$$\gamma_1 = E[(\varepsilon_t - \theta_1\varepsilon_{t-1} - \theta_2\varepsilon_{t-2})(\varepsilon_{t-1} - \theta_1\varepsilon_{t-2} - \theta_2\varepsilon_{t-3})] = -\theta_1\sigma_\varepsilon^2 + \theta_2\theta_1\sigma_\varepsilon^2 = -\theta_1(1 - \theta_2)\sigma_\varepsilon^2$$
$$\gamma_2 = E[(\varepsilon_t - \theta_1\varepsilon_{t-1} - \theta_2\varepsilon_{t-2})(\varepsilon_{t-2} - \theta_1\varepsilon_{t-3} - \theta_2\varepsilon_{t-4})] = -\theta_2\sigma_\varepsilon^2$$

and $\gamma_k = 0$ for $k > 2$.
The process MA(2) has a memory of exactly two periods, so that the value of $y_t$ is influenced only by events that took place in the current period, one period back, and two periods back.

MA(q)
The MA process of order $q$ has a memory of exactly $q$ periods. The autocorrelation function of the MA process of order $q$ is given by:

$$\rho_k = \begin{cases} 1 & \text{if } k = 0 \\ \dfrac{-\theta_k + \theta_1\theta_{k+1} + \cdots + \theta_{q-k}\theta_q}{1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2} & \text{if } k = 1, 2, \ldots, q \\ 0 & \text{if } k > q \end{cases}$$

So we can see why the sample ACF can be useful in specifying the order of a moving average process.
An MA(q) series is only linearly related to its first $q$ lagged values and hence is a "finite memory" model.

An example of a second-order MA process might be:

$$y_t = \varepsilon_t + 0.9\,\varepsilon_{t-1} + 0.8\,\varepsilon_{t-2}$$

Figure 2. The graph of 120 observations on a series generated by the MA(2) process $y_t = \varepsilon_t + 0.9\,\varepsilon_{t-1} + 0.8\,\varepsilon_{t-2}$, together with the theoretical and empirical ACFs (bottom) and the theoretical and empirical PACFs (middle). The theoretical values correspond to the solid bars.

Durbin's method for estimating MA(q)
Given the MA(q) process with the equation:

$$y_t = \varepsilon_t - \theta_1\varepsilon_{t-1} - \theta_2\varepsilon_{t-2} - \cdots - \theta_q\varepsilon_{t-q}$$

the method for estimating MA(q) consists of two steps.
Step 1: Fit an AR model of order $m > q$ to $\{y_t\}$. Once $m$ has been specified, the estimated AR parameters $\{\phi'_k\}$ ($k = 1, \ldots, m$) can be obtained via the Yule-Walker estimator.
Hence the estimates $\{\varepsilon'_t\}$ of the noise sequence $\{\varepsilon_t\}$ can be derived, using the equation:

$$y_t = \phi'_1 y_{t-1} + \phi'_2 y_{t-2} + \cdots + \phi'_m y_{t-m} + \varepsilon_t$$

Step 2: Using $\{\varepsilon'_t\}$, and keeping the sign convention of the MA(q) equation, we can write:

$$y_t - \varepsilon'_t = -\theta_1\varepsilon'_{t-1} - \theta_2\varepsilon'_{t-2} - \cdots - \theta_q\varepsilon'_{t-q} \quad \text{for } t = 0, \ldots, N-1$$

Durbin's method (cont.)
From this, the estimated values $\{\theta'_k\}$ of $\{\theta_k\}$ can be obtained by solving the equation system.
The order $m$ can be selected via the AIC or BIC. However, a more expedient rule for selecting $m$ is $m = 2q$.
The term Moving Average is historical and should not be confused with the moving average smoothing procedures.

Summary
A brief summary of AR and MA models is in order. We have the following properties:
- For MA models, the ACF is useful in specifying the order because the ACF cuts off at lag $q$ for an MA(q) series.
- For AR models, the PACF is useful in order determination because the PACF cuts off at lag $p$ for an AR(p) process.
- An MA series is always stationary, but for an AR series to be stationary, all of its characteristic roots must be less than 1 in modulus.
- For a stationary series, the multistep-ahead forecasts converge to the mean of the series and the variance of the forecast errors converges to the variance of the series.

ARMA Models
Many stationary random processes cannot be modeled as purely MA or as purely AR, since they have the qualities of both types of processes. The logical extension of the MA and AR models is the mixed autoregressive-moving average process of order $(p, q)$. We denote this process as ARMA(p,q) and represent it by:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \delta + \varepsilon_t - \theta_1\varepsilon_{t-1} - \theta_2\varepsilon_{t-2} - \cdots - \theta_q\varepsilon_{t-q}$$

where $\delta$ is a constant term.
Why bother with the mixed model?
The answer is parsimony: there are fewer parameters to be estimated.
Note: In practice, the values of $p$ and $q$ each rarely exceed 2.

Summary on AR, MA and ARMA

| Model | ACF | PACF |
|---|---|---|
| AR(p) | Die out | Cut off after the order p of the process |
| MA(q) | Cut off after the order q of the process | Die out |
| ARMA(p,q) | Die out | Die out |

In this context, "die out" means "tend to zero gradually" and "cut off" means "disappear" or "is zero".

ARIMA Models
In practice, many of the time series we work with are nonstationary, so that the characteristics of the underlying stochastic process change over time. We now construct models for those nonstationary series that can be transformed into stationary series by differencing one or more times.
We say that $y_t$ is homogeneous nonstationary of order $d$ if

$$w_t = \Delta^d y_t \qquad (3.32)$$

is a stationary series. Here $\Delta$ denotes differencing, i.e.,

$$\Delta y_t = y_t - y_{t-1}, \qquad \Delta^2 y_t = \Delta y_t - \Delta y_{t-1}$$

After differencing the time series to obtain a stationary series $w_t$, we can model $w_t$ as an ARMA process.

ARIMA models (cont.)
If $w_t = \Delta^d y_t$ and $w_t$ is an ARMA(p,q) process, then we say that $y_t$ is an integrated autoregressive-moving average process of order $(p, d, q)$, or simply ARIMA(p,d,q).
We can state the equation for the process ARIMA(p,d,q), using the lag operator (backward shift operator), as:

$$\phi(B)\,\Delta^d y_t = \theta(B)\,\varepsilon_t \qquad (3.33)$$

with

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \quad \text{and} \quad \theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$$

We call $\phi(B)$ the autoregressive operator and $\theta(B)$ the moving average operator.
ARIMA models are a class of linear models capable of representing stationary as well as nonstationary time series.
Note that ARIMA(p,0,q) = ARMA(p,q).

Estimating and checking ARIMA models
We have seen that any homogeneous nonstationary time series can be modeled as an ARIMA process of order $(p, d, q)$. The practical problem is to choose the most appropriate values for $p$, $d$, and $q$, that is, to specify the ARIMA model.
This problem is partly resolved by examining both the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the time series of concern.

The process for determining an ARIMA model consists of the following steps:
- Check whether the time series is stationary. If it is not stationary, determine $d$, the number of times that the series must be differenced to produce a stationary series.
- After $d is determined, find possible values for $p$ and $q$.
- For an MA(q) model, the ACF will cut off after the order $q$ of the process while the PACF will die out gradually.
- For an AR(p) model, the ACF will die out gradually while the PACF will cut off after the order $p$ of the process.
If both $p$ and $q$ are nonzero, it is difficult to determine the exact orders of the AR and MA parts. Therefore, we apply an iterative approach called the Box-Jenkins methodology (1972). This model-building methodology involves a cycle consisting of three stages: model selection (identification), model estimation, and model checking.

Box-Jenkins methodology
The cycle might have to be repeated several times and, at the end, there might be more than one model of the same time series.
The Box-Jenkins methodology uses an iterative approach as follows:
- An initial model is selected from the general class of ARIMA models, based on an examination of the time series and of its autocorrelations for several time lags.
- The chosen model is then checked against the historical data to see whether it accurately describes the series: the model fits well if the residuals are generally small, randomly distributed, and contain no useful information.
- If the specified model is not satisfactory, the process is repeated using a new model designed to improve on the original one.
- Once a satisfactory model is found, it can be used for forecasting.
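The identification stage above (difference until the series looks stationary, then inspect the autocorrelations) can be sketched as follows. This is only an illustration under stated assumptions: the function names `difference` and `sample_acf` are hypothetical, the data are a simulated random walk rather than a real series, and the band $2/\sqrt{T}$ is borrowed from the significance test for the PACF as a rough guide.

```python
import math
import random


def difference(y, d=1):
    """Apply the differencing operator d times: w_t = y_t - y_{t-1} at each pass."""
    w = list(y)
    for _ in range(d):
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    return w


def sample_acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag (r_0 = 1), computed about the sample mean."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]


# Simulated random walk: nonstationary, but its first difference is white noise.
random.seed(0)
y = [0.0]
for _ in range(199):
    y.append(y[-1] + random.gauss(0.0, 1.0))

for d in (0, 1):
    w = difference(y, d)
    bound = 2.0 / math.sqrt(len(w))  # rough 5 percent band (assumption, see lead-in)
    acf = sample_acf(w, 5)
    print(f"d={d}: band={bound:.3f}, acf lags 1-5={[round(r, 2) for r in acf[1:]]}")
```

A slowly decaying ACF at $d = 0$ together with small autocorrelations at $d = 1$ would point to $d = 1$; the choice of $p$ and $q$ then follows from the ACF and PACF of the differenced series.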
Notes on model selection (identification)
The process of choosing the optimal $(p, d, q)$ in an ARIMA model is known as model selection (or identification).
Hannan & Rissanen have suggested a 3-step procedure:
1. Determine the maximum length of lag for an AR model.
2. Use the AIC (Akaike Information Criterion) to determine the maximum length of lag in an AR model.
3. Use the SC (Schwarz Criterion) to determine the maximum length of lags for a mixed ARMA model.

AIC and SC
With AIC, $k$ is chosen to minimize [...], where $\sum \varepsilon_t^2$ is the sum of the squared residuals, $p$ is the maximum degree of ACF, and $T$ is the number of observations.
With SC, $k$ is chosen to minimize [...].

Estimation of ARMA models
Whenever an MA(1) or higher-order process is used, a nonlinear estimation procedure is often utilized. This procedure is an optimization algorithm that attempts to minimize the sum of squared residuals through an iterative procedure:

$$S = \sum \varepsilon_t^2$$

The same situation applies to ARMA models.
With the common nonlinear algorithms, the answer obtained may differ depending on the initial guesses for the parameter values.
Any nonlinear algorithm could produce an incorrect answer for two reasons:
- It could converge to a local optimum.
- It could fail to converge at all.

Estimation of ARMA models using software packages
Parameter estimation of ARMA models can be performed automatically by sophisticated software packages.
In some software packages, the user can choose the estimation method that is most appropriate for the problem at hand.
Software packages for time series analysis and forecasting:
- SPSS
- SAS
- Minitab
- R
- EViews
- S-Plus

Model Checking
After a time-series model has been specified and its parameters have been estimated, one must test whether the original specification was correct. The process of model checking involves two steps.
1. The autocorrelation function for the simulated series can be compared with the sample autocorrelation function of the original series.
If the two autocorrelation functions seem very different, one needs to re-specify the model.
2. If the two are not markedly different, one can analyze the residuals of the model. Remember that we have assumed that the random error terms $\varepsilon_t$ in the actual process are normally distributed and independent. Then, if the model has been specified correctly, the residuals $\varepsilon'_t$ should resemble a white noise process.

Note: Before using the model for forecasting, it must be checked for adequacy. Basically, a model is adequate if the residuals cannot be used to improve the forecasts, i.e.,
- The residuals should be random and normally distributed.
- The individual residual autocorrelations should be small. Significant residual autocorrelations at low lags or seasonal lags suggest that the model is inadequate.

References
- J. E. Hanke & D. W., Business Forecasting, 8th Edition, Pearson Prentice Hall, 2005.
- M. K. Evans, Practical Business Forecasting, Blackwell Publishers, 2001.
- F. X. Diebold, Elements of Forecasting, 4th Edition, Thomson South-Western, 2007.
- R. S. Pindyck & D. L. Rubinfeld, Econometric Models and Economic Forecasts, 3rd Edition, McGraw-Hill, 1991.
- R. S. Tsay, Analysis of Financial Time Series, 2nd Edition, Wiley, 2005.

Appendix: Parameter estimation of AR(p) by the least squares method
For an AR(p) model, the least squares method, which starts with the $(p+1)$th observation, is often used to estimate the parameters. Specifically, conditioning on the first $p$ observations, we have:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + a_t, \quad t = p+1, \ldots, T$$

which is in the form of a multiple linear regression and can be estimated by the least squares method. Denote the estimate of $\phi_i$ by $\phi'_i$. The fitted model is

$$y'_t = \phi'_1 y_{t-1} + \phi'_2 y_{t-2} + \cdots + \phi'_p y_{t-p}$$

and the associated residual is $a'_t = y_t - y'_t$.
The series $\{a'_t\}$ is called the residual series, from which we obtain the variance of the series: [...]
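The conditional least squares fit described in the appendix can be sketched in plain Python. The names `solve` and `fit_ar_least_squares` are hypothetical, the regression is solved through the normal equations, and the residual variance divides by the number of usable observations minus the number of estimated parameters, which is an assumption because the variance formula itself is not given in the text.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ar_least_squares(y, p):
    """Conditional least squares for AR(p) without intercept.

    Returns (phi, residuals, residual_variance).
    """
    # Regress y_t on (y_{t-1}, ..., y_{t-p}), conditioning on the first p observations.
    rows = [[y[t - i] for i in range(1, p + 1)] for t in range(p, len(y))]
    target = [y[t] for t in range(p, len(y))]
    # Normal equations (X'X) phi = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(p)]
    phi = solve(xtx, xty)
    fitted = [sum(c * x for c, x in zip(phi, r)) for r in rows]
    resid = [v - f for v, f in zip(target, fitted)]
    dof = len(target) - p  # assumption: usable observations minus estimated parameters
    return phi, resid, sum(e * e for e in resid) / dof


# Simulated AR(2) series with assumed coefficients 0.5 and -0.3 and unit-variance noise.
random.seed(1)
series = [0.0, 0.0]
for _ in range(500):
    series.append(0.5 * series[-1] - 0.3 * series[-2] + random.gauss(0.0, 1.0))

phi_hat, residuals, sigma2_hat = fit_ar_least_squares(series, 2)
print("estimated phi:", [round(c, 3) for c in phi_hat])
print("residual variance:", round(sigma2_hat, 3))
```

On a simulated series like this one, the estimates should land near the coefficients used to generate the data, and the residual series can then be passed to the model checking steps described earlier.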
