Unit 3-4-5-6
Differencing
The backward shift operator B is a useful notational device when working with time series
lags:
Byt = yt−1.
(Some references use L for “lag” instead of B for “backshift”.)
In other words, B, operating on yt, has the effect of shifting the data back one period. Two
applications of B to yt shift the data back two periods:
B(Byt) = B²yt = yt−2.
For monthly data, if we wish to consider “the same month last year,” the notation is
B¹²yt = yt−12.
The backward shift operator is convenient for describing the process of differencing. A
first difference can be written as
y′t = yt − yt−1 = yt − Byt = (1 − B)yt.
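As a minimal sketch (assuming pandas; the data values are illustrative), the shift and diff operations play the roles of B and (1 − B):

import pandas as pd

y = pd.Series([112, 118, 132, 129, 121, 135])

# shift(1) acts like the backshift operator B: (B y)_t = y_{t-1}
By = y.shift(1)

# first difference y_t - y_{t-1}; diff(1) does the same in one call
y_diff = y - By
assert y_diff.equals(y.diff(1))

# for monthly data, "same month last year" would be y.shift(12), i.e. B^12 y_t
print(y_diff)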
Differencing
❑ In a multiple regression model, we forecast the variable of interest using a linear combination
of predictors. In an autoregression model, we forecast the variable of interest using a linear
combination of its own past values:
yt = c + ϕ1yt−1 + ϕ2yt−2 + ⋯ + ϕpyt−p + εt,
where εt is white noise. This is like a multiple regression but with lagged values of yt as
predictors. We refer to this as an AR(p) model, an autoregressive model of order p.
Auto-Regression
❑ Autoregressive models are remarkably flexible at handling a wide range of different time
series patterns.
❑ The two series in the figure show data from an AR(1) model and an AR(2) model.
❑ Changing the parameters ϕ1,…,ϕp results in different time series patterns.
❑ The variance of the error term εt will only change the scale of the series, not the patterns.
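As an illustration (the parameter values here are assumed for the sketch, not taken from the slides), simulating an AR(1) and a cyclic AR(2) shows how changing ϕ changes the pattern:

import numpy as np

rng = np.random.default_rng(42)
n = 200
eps = rng.normal(size=n)

# AR(1): y_t = phi1 * y_{t-1} + eps_t  (phi1 = 0.8 assumed)
y1 = np.zeros(n)
for t in range(1, n):
    y1[t] = 0.8 * y1[t - 1] + eps[t]

# AR(2): y_t = phi1*y_{t-1} + phi2*y_{t-2} + eps_t
# (phi1 = 1.3, phi2 = -0.7 assumed; complex roots give a cyclic pattern)
y2 = np.zeros(n)
for t in range(2, n):
    y2[t] = 1.3 * y2[t - 1] - 0.7 * y2[t - 2] + eps[t]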
Auto-Regression
Rather than using past values of the forecast variable in a regression, a moving average model
uses past forecast errors in a regression-like model:
yt = c + εt + θ1εt−1 + θ2εt−2 + ⋯ + θqεt−q,
where εt is white noise. We refer to this as an MA(q) model, a moving average model of order q.
❑ Notice that each value of yt can be thought of as a weighted moving average of the past
few forecast errors
❑ However, moving average models should not be confused with moving average smoothing
❑ A moving average model is used for forecasting future values, while moving average
smoothing is used for estimating the trend-cycle of past values.
Moving Average Models
❑ The figure shows some data from an MA(1) model and an MA(2) model.
❑ Changing the parameters θ1,…,θq results in different time series patterns.
❑ As with autoregressive models, the variance of the error term εt will only change the scale
of the series, not the patterns.
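A matching sketch (θ values assumed for illustration) that simulates MA(1) and MA(2) series, each value being a weighted average of recent errors:

import numpy as np

rng = np.random.default_rng(0)
n = 200
eps = rng.normal(size=n)

# MA(1): y_t = eps_t + theta1 * eps_{t-1}  (theta1 = 0.8 assumed)
y1 = eps[1:] + 0.8 * eps[:-1]

# MA(2): y_t = eps_t + theta1*eps_{t-1} + theta2*eps_{t-2}
# (theta1 = -1.0, theta2 = 0.8 assumed)
y2 = eps[2:] - 1.0 * eps[1:-1] + 0.8 * eps[:-2]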
Moving Average Models
It is possible to write any stationary AR(p) model as an MA(∞) model. For example, using repeated
substitution, we can demonstrate this for an AR(1) model:
yt = ϕ1yt−1 + εt
   = ϕ1(ϕ1yt−2 + εt−1) + εt
   = ϕ1²yt−2 + ϕ1εt−1 + εt
   = ϕ1³yt−3 + ϕ1²εt−2 + ϕ1εt−1 + εt
   = ⋯
Provided −1 < ϕ1 < 1, the value of ϕ1^k will get smaller as k gets larger. So eventually we obtain
yt = εt + ϕ1εt−1 + ϕ1²εt−2 + ϕ1³εt−3 + ⋯,
an MA(∞) process.
The reverse result holds if we impose some constraints on the MA parameters. Then the MA model is
called invertible. That is, we can write any invertible MA(q) process as an AR(∞) process. For
example, an MA(1) process is invertible when −1 < θ1 < 1.
The invertibility constraints for other models are similar to the stationarity constraints.
ARMA Models
Determining the appropriate values for p and q is crucial for building an effective ARMA
model. This can be done using the following methods (see the sketch after this list):
1. Partial Autocorrelation Function (PACF):
   1. PACF is used to determine the order p of the AR part. It measures the correlation
      between observations at different lags, excluding the influence of intermediate lags.
   2. The order p is determined by the lag at which the PACF plot cuts off.
2. Autocorrelation Function (ACF):
   1. ACF is used to determine the order q of the MA part. It measures the correlation
      between observations at different lags.
   2. The order q is determined by the lag at which the ACF plot cuts off.
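A minimal sketch (assuming statsmodels) of producing the two plots for a simulated AR(2) series with assumed coefficients:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(1)
y = np.zeros(300)
for t in range(2, 300):
    y[t] = 1.3 * y[t - 1] - 0.7 * y[t - 2] + rng.normal()  # AR(2), assumed

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
plot_acf(y, ax=axes[0], lags=20)    # MA order q: where the ACF cuts off
plot_pacf(y, ax=axes[1], lags=20)   # AR order p: where the PACF cuts off
plt.show()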
ARMA Models
• Causality of a stationary time series means that the current value depends only on past/lagged
values and on current and past noise, never on future values.
• Invertibility refers to a linear stationary process that behaves like an infinite-order
autoregressive representation. In other words, it is the property possessed by a moving average
process that can be written as an AR(∞). Invertibility resolves the non-uniqueness of the
autocorrelation function of a moving average process.
Non-seasonal ARIMA models
Combining differencing with autoregression and a moving average model, we obtain a
non-seasonal ARIMA model:
y′t = c + ϕ1y′t−1 + ⋯ + ϕpy′t−p + θ1εt−1 + ⋯ + θqεt−q + εt,
where y′t is the differenced series (it may have been differenced more than once).
The “predictors” on the right hand side include both lagged values of yt and lagged errors. We
call this an ARIMA(p,d,q) model, where
p = order of the autoregressive part;
d = degree of first differencing involved;
q = order of the moving average part.
Once we start combining components in this way to form more complicated models, it is
much easier to work with the backshift notation.
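In backshift notation, the same ARIMA(p,d,q) model can be written compactly as
(1 − ϕ1B − ⋯ − ϕpB^p)(1 − B)^d yt = c + (1 + θ1B + ⋯ + θqB^q)εt,
where the first factor is the AR(p) part, (1 − B)^d applies d differences, and the final factor is the MA(q) part.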
Non-seasonal ARIMA models
The constant c has an important effect on the long-term forecasts obtained from these models.
•If c=0 and d=0, the long-term forecasts will go to zero.
•If c=0 and d=1, the long-term forecasts will go to a non-zero constant.
•If c=0 and d=2, the long-term forecasts will follow a straight line.
•If c≠0 and d=0, the long-term forecasts will go to the mean of the data.
•If c≠0 and d=1, the long-term forecasts will follow a straight line.
•If c≠0 and d=2, the long-term forecasts will follow a quadratic trend.
❑ The value of d also has an effect on the prediction intervals — the higher the value of d, the
more rapidly the prediction intervals increase in size.
❑ For d=0, the long-term forecast standard deviation will go to the standard deviation of the
historical data, so the prediction intervals will all be essentially the same.
This behaviour is seen in the forecast figure, where d=0 and c≠0. In this figure, the prediction intervals are almost the same
width for the last few forecast horizons, and the final point forecasts are close to the mean of the data.
ACF and PACF plots
❑ The ACF plot shows the autocorrelations, which measure the relationship between yt and yt−k for
different values of k.
❑ Now if yt and yt−1 are correlated, then yt−1 and yt−2 must also be correlated.
❑ However, then yt and yt−2 might be correlated, simply because they are both connected to yt−1,
rather than because of any new information contained in yt−2 that could be used in
forecasting yt.
❑ To overcome this problem, we can use partial autocorrelations. These measure the
relationship between yt and yt−k after removing the effects of lags 1,2,3,…,k−1.
❑ So the first partial autocorrelation is identical to the first autocorrelation, because there is
nothing between them to remove.
❑ Each partial autocorrelation can be estimated as the last coefficient in an autoregressive
model.
❑ Specifically, αk, the kth partial autocorrelation coefficient, is equal to the estimate of ϕk in an
AR(k) model.
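A small sketch (statsmodels assumed; data simulated with an assumed ϕ1) checking that the lag-k partial autocorrelation matches the last coefficient of a fitted AR(k):

import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(2)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.6 * y[t - 1] + rng.normal()   # AR(1), phi1 = 0.6 assumed

k = 3
ar_k = AutoReg(y, lags=k).fit()
alpha_k = ar_k.params[-1]                  # last AR coefficient, estimate of phi_k

# the regression-based PACF should agree closely at lag k
print(alpha_k, pacf(y, nlags=k, method="ols")[k])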
ACF and PACF plots
❑ If the data are from an ARIMA(p,d,0) or ARIMA(0,d,q) model, then the ACF and PACF
plots can be helpful in determining the value of p or q.
❑ If p and q are both positive, then the plots do not help in finding suitable values of p and q.
❑ The data may follow an ARIMA(p,d,0) model (that is, an AR(p) model for the differenced
data) if the ACF and PACF plots of the differenced data show the following patterns:
✓ the ACF is exponentially decaying or sinusoidal;
✓ there is a significant spike at lag p in the PACF, but none beyond lag p.
ACF and PACF plots
Calculate/check confidence intervals: spikes outside the bounds ±1.96/√T (where T is the
length of the series) are treated as significantly different from zero.
In the figure, we see that there is a decaying sinusoidal pattern in the ACF, and the PACF
shows the last significant spike at lag 4. This is what we would expect from an
ARIMA(4,0,0) model.
ACF and PACF plots
❑ The data may follow an ARIMA(0,d,q) model (that is, an MA(q) model for the differenced
data) if the ACF and PACF plots of the differenced data show the following patterns:
✓ the PACF is exponentially decaying or sinusoidal;
✓ there is a significant spike at lag q in the ACF, but none beyond lag q.
Estimation and order selection
❑ Once the model order has been identified (i.e., the values of p, d and q), we need to estimate
the parameters c, ϕ1,…,ϕp, θ1,…,θq.
❑ Maximum likelihood estimation (MLE) is a technique that finds the values of the parameters
which maximise the probability of obtaining the data that we have observed.
❑ For ARIMA models, MLE is similar to the least squares estimates that would be obtained by
minimising the sum of squared residuals, ∑ εt² (summed over t = 1, …, T).
Order selection - Information Criteria
❑ Akaike’s Information Criterion (AIC) is useful for determining the order of an ARIMA
model. It can be written as
AIC = −2 log(L) + 2(p + q + k + 1),
where L is the likelihood of the data, k=1 if c≠0 and k=0 if c=0. Note that the last term in
parentheses is the number of parameters in the model (including σ2, the variance of the
residuals).
The corrected AIC for ARIMA models can be written as
AICc = AIC + [2(p + q + k + 1)(p + q + k + 2)] / (T − p − q − k − 2),
and the Bayesian Information Criterion as
BIC = AIC + [log(T) − 2](p + q + k + 1).
Good models are obtained by minimising the AIC, AICc or BIC. Our preference is to use the AICc.
Calculating AICc
Fit the ARIMA Model: Suppose you fit an ARIMA(1,1,1) model to your time series data. After
fitting the model, you obtain the following:
•Number of observations n=100
•Number of parameters k=3 (for ARIMA(1,1,1), this includes 1 AR parameter, 1 MA parameter,
and 1 for the constant term)
•Log-Likelihood of the model = -150.0
Calculate AIC: Using the log-likelihood and the number of parameters, compute the AIC:
AIC = −2⋅(−150.0) + 2⋅3 = 300 + 6 = 306
Calculate AICc: Apply the small-sample correction to the AIC:
AICc = AIC + [2k(k+1)] / (n − k − 1) = 306 + (2⋅3⋅4)/(100 − 3 − 1) = 306 + 0.25 = 306.25
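The same arithmetic as a quick Python check (values from the example above):

loglik = -150.0   # log-likelihood of the fitted ARIMA(1,1,1)
k = 3             # number of estimated parameters
n = 100           # number of observations

aic = -2 * loglik + 2 * k
aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
print(aic, aicc)  # 306.0, 306.25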
It is important to note that these information criteria tend not to be good guides to
selecting the appropriate order of differencing (d) of a model, but only for selecting the
values of p and q.
This is because the differencing changes the data on which the likelihood is computed,
making the AIC values between models with different orders of differencing not
comparable.
So we need to use some other approach to choose d, and then we can use the AICc to
select p and q.
How ARIMA modelling works
When fitting an ARIMA model to a set of (non-seasonal) time series data, the following
procedure provides a useful general approach.
1. Plot the data and identify any unusual observations.
2. If necessary, transform the data (using a Box-Cox transformation) to stabilize the variance.
3. If the data are non-stationary, take first differences of the data until the data are stationary.
4. Examine the ACF/PACF: Is an ARIMA(p,d,0) or ARIMA(0,d,q) model appropriate?
5. Try your chosen model(s), and use the AICc to search for a better model.
6. Check the residuals from your chosen model by plotting the ACF of the residuals, and doing a
portmanteau test of the residuals. If they do not look like white noise, try a modified model.
7. Once the residuals look like white noise, calculate forecasts.
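A minimal end-to-end sketch of this procedure (statsmodels assumed; the data and the (1,1,1) order are placeholders, not recommendations):

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# steps 1-3: plot the data, transform if needed, difference to stationarity
y = pd.Series(np.random.default_rng(3).normal(size=200)).cumsum()
y_diff = y.diff().dropna()              # first differences (d = 1)

# steps 4-5: examine ACF/PACF of y_diff, fit candidates, compare AICc
fit = ARIMA(y, order=(1, 1, 1)).fit()   # placeholder orders
print(fit.aicc)

# step 6: residual diagnostics via a Ljung-Box portmanteau test
print(acorr_ljungbox(fit.resid, lags=[10]))  # want a large p-value

# step 7: once residuals look like white noise, forecast
print(fit.forecast(steps=5))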
Hyndman-Khandakar algorithm
a) The number of differences d is chosen using repeated KPSS tests. Four initial models are then
fitted: ARIMA(0,d,0), ARIMA(2,d,2), ARIMA(1,d,0) and ARIMA(0,d,1), each with a constant unless
d=2; if d≤1, an ARIMA(0,d,0) model without a constant is also fitted.
b) The best model (with the smallest AICc value) fitted in step (a) is set to be the “current model”.
c) Variations on the current model are considered:
o vary p and/or q from the current model by ±1;
o include/exclude c from the current model.
The best model considered so far (either the current model or one of these variations) becomes the
new current model.
d) Repeat step (c) until no lower AICc can be found.
Hyndman-Khandakar algorithm
We fit both an ARIMA(2,1,0) and an ARIMA(0,1,3) model along with two automated model
selections, one using the default stepwise procedure, and one working harder to search a
larger model space.
The four models have almost identical AICc values. Of the models fitted, the full search
gives the lowest AICc value, with an ARIMA(3,1,0) model.
#> (model comparison table: Country, `Model name`, Orders, AICc)
The ACF plot of the residuals from the ARIMA(3,1,0) model shows that all autocorrelations are
within the threshold limits, indicating that the residuals are behaving like white noise.
A portmanteau test (setting K=3) returns a large p-value, also suggesting that the
residuals are white noise.
#> (test output: Country, .model, lb_stat, lb_pvalue)
Steps:
1. Fit a model to your time series data (e.g., ARIMA).
2. Calculate the residuals from the fitted model.
3. Conduct the portmanteau test on these residuals to check for autocorrelation.
Interpretation:
•A low p-value (commonly less than 0.05) indicates that you should reject the null hypothesis, suggesting
that there are significant autocorrelations in the residuals.
•A high p-value indicates that the residuals behave like white noise, and the model may be adequate.
The portmanteau test, also known as the Ljung-Box test
We’ll generate data, fit an ARIMA model, and perform the Ljung-Box test step by step, including calculations.
1. Calculate autocorrelations:
   1. Compute the sample autocorrelations rk for k = 1, 2, …, m, where m is the number of lags.
   2. Let’s say we calculate r1, r2, r3, r4.
Calculating Q: the Ljung-Box statistic is
Q = n(n + 2) ∑ rk² / (n − k), summed over k = 1, …, m.
Assume we find:
• r1 = 0.2
• r2 = 0.1
• r3 = 0.05
• r4 = 0.03
• number of observations n = 10
The p-value can be derived from the chi-squared distribution with degrees of
freedom equal to m.
Through software (Python with scipy, for example):
p_value = 1 - chi2.cdf(Q, df=m)   # here m = 4
If we calculate it, we might find a p-value that is likely greater than
0.05, indicating that we do not reject the null hypothesis, suggesting no
significant autocorrelation in the residuals.
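Putting the slide’s numbers through the formula (a minimal Python sketch; the rk values are the assumed ones above):

from scipy.stats import chi2

n = 10                        # number of observations
r = [0.2, 0.1, 0.05, 0.03]    # assumed sample autocorrelations r1..r4
m = len(r)                    # number of lags tested

# Q = n(n+2) * sum_{k=1}^{m} r_k^2 / (n - k)
Q = n * (n + 2) * sum(rk**2 / (n - k) for k, rk in enumerate(r, start=1))

p_value = 1 - chi2.cdf(Q, df=m)
print(round(Q, 3), round(p_value, 3))   # Q ≈ 0.744, p ≈ 0.946, so do not reject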
Forecasts
Although we have calculated forecasts from the ARIMA models in our examples, we have not
yet explained how they are obtained. Point forecasts can be calculated using the following
three steps.
1. Expand the ARIMA equation so that yt is on the left hand side and all other terms are on the
right.
2. Rewrite the equation by replacing t with T+h.
3. On the right hand side of the equation, replace future observations with their forecasts, future
errors with zero, and past errors with the corresponding residuals.
Beginning with h=1, these steps are then repeated for h=2,3,… until all forecasts have been
calculated.
Forecasts
For example, consider an ARIMA(3,1,1) model. In backshift notation,
(1 − ϕ1B − ϕ2B² − ϕ3B³)(1 − B)yt = (1 + θ1B)εt.
Expanding the left hand side and moving everything except yt to the right gives
yt = (1 + ϕ1)yt−1 − (ϕ1 − ϕ2)yt−2 − (ϕ2 − ϕ3)yt−3 − ϕ3yt−4 + εt + θ1εt−1.
This completes the first step. While the equation now looks like an ARIMA(4,0,1), it is still the
same ARIMA(3,1,1) model we started with. It cannot be considered an ARIMA(4,0,1) because
the coefficients do not satisfy the stationarity conditions.
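In practice the h-step recursion is done by software. A minimal sketch (statsmodels assumed, illustrative data):

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

y = pd.Series(np.random.default_rng(4).normal(size=300)).cumsum()

fit = ARIMA(y, order=(3, 1, 1)).fit()
print(fit.forecast(steps=10))           # point forecasts for h = 1..10
print(fit.get_forecast(10).conf_int())  # prediction intervals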
Forecasts
Seasonal ARIMA
A seasonal ARIMA model is formed by including additional seasonal terms in the ARIMA
models we have seen so far. It is written as
ARIMA (p, d, q) (P, D, Q)m,
where the first set of parentheses holds the non-seasonal part of the model, the second set holds
the seasonal part, and m = the seasonal period (e.g., number of observations per year). We use
uppercase notation for the seasonal parts of the model, and lowercase notation for the
non-seasonal parts of the model.
Seasonal ARIMA
❑ The seasonal part of the model consists of terms that are similar to the non-seasonal
components of the model, but involve backshifts of the seasonal period.
❑ For example, an ARIMA(1,1,1)(1,1,1)4 model (without a constant) is for quarterly data
(m=4), and can be written as
(1 − ϕ1B)(1 − Φ1B⁴)(1 − B)(1 − B⁴)yt = (1 + θ1B)(1 + Θ1B⁴)εt.
The additional seasonal terms are simply multiplied by the non-seasonal terms.
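A minimal sketch (statsmodels assumed, illustrative quarterly data) of fitting this model:

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

y = pd.Series(np.random.default_rng(5).normal(size=120)).cumsum()

# ARIMA(1,1,1)(1,1,1)[4]: seasonal_order = (P, D, Q, m)
fit = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 4)).fit(disp=False)
print(fit.summary())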
Seasonal ARIMA
The seasonal part of an AR or MA model will be seen in the seasonal lags of the PACF and ACF.
For example, an ARIMA(0,0,0)(0,0,1)12 model will show a spike at lag 12 in the ACF but no other
significant spikes, and exponential decay in the seasonal lags of the PACF. In considering the
appropriate seasonal orders for a seasonal ARIMA model, restrict attention to the seasonal lags.
Seasonal ARIMA
The residuals for the best model are shown in the figure. One small but significant spike
(at lag 11) out of 36 is still consistent with white noise. To be sure, we use a Ljung-Box test.
❑ In a regression with ARIMA errors, yt = β0 + β1x1,t + ⋯ + βkxk,t + ηt, where the error
series ηt follows an ARIMA model whose own (white noise) errors are denoted εt.
❑ When we estimate the parameters from the model, we need to minimise the sum
of squared εt values.
❑ If we minimise the sum of squared ηt values instead, then several problems arise.
Regression with ARIMA errors
❑ If all of the variables in the model are stationary, then we only need to consider an
ARMA process for the errors.
❑ A regression model with ARIMA errors is equivalent to a regression model in
differences with ARMA errors.
❑ For example, if the regression model yt = β0 + β1xt + ηt with ARIMA(1,1,1) errors,
(1 − ϕ1B)(1 − B)ηt = (1 + θ1B)εt, is differenced, we obtain the model
y′t = β1x′t + η′t, where (1 − ϕ1B)η′t = (1 + θ1B)εt,
which is a regression in differences with ARMA(1,1) errors.
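A minimal sketch (statsmodels assumed, illustrative data) of a regression with ARIMA errors, passing the regressor as exog:

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(6)
x = pd.Series(rng.normal(size=200)).cumsum()
y = 2.0 * x + pd.Series(rng.normal(size=200)).cumsum()  # illustrative only

# regression on x with ARIMA(1,1,1) errors
fit = SARIMAX(y, exog=x, order=(1, 1, 1)).fit(disp=False)
print(fit.params)   # regression coefficient plus the ARMA parameters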
Regression with ARIMA errors
We can recover estimates of both the ηt and εt series using the residuals() function.
It is the ARIMA estimated errors (the innovation residuals) that should resemble a
white noise series.
Regression with ARIMA errors