
TIME SERIES ANALYSIS

Unit 3-4-5-6
Differencing

The backward shift operator B is a useful notational device when working with time series
lags:
Byt = yt−1.
(Some references use L for “lag” instead of B for “backshift”.)

In other words, B, operating on yt, has the effect of shifting the data back one period. Two
applications of B to yt shift the data back two periods:
B(Byt) = B²yt = yt−2.

For monthly data, if we wish to consider “the same month last year,” the notation is
B¹²yt = yt−12.

The backward shift operator is convenient for describing the process of differencing. A
first difference can be written as
y′t = yt − yt−1 = yt − Byt = (1 − B)yt.
Differencing

So a first difference can be represented by (1 − B). Similarly, if second-order differences
have to be computed, then:
y″t = yt − 2yt−1 + yt−2 = (1 − 2B + B²)yt = (1 − B)²yt.

In general, a dth-order difference can be written as
(1 − B)^d yt.
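As a quick sketch (the numbers are made-up illustration data, and pandas is an assumed tool choice, not something the slides prescribe), differencing maps directly onto .diff():

# Minimal differencing sketch; the series values are made-up illustration data.
import pandas as pd

y = pd.Series([112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0])

first_diff = y.diff()          # y′t = (1 − B) yt
second_diff = y.diff().diff()  # y″t = (1 − B)² yt
# For monthly data, "same month last year" would be y.diff(12), i.e. (1 − B¹²) yt.

print(first_diff.tolist())
print(second_diff.tolist())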


Auto-Regression

❑ In a multiple regression model, we forecast the variable of interest using a linear combination
of predictors.

❑ In an autoregression model, we forecast the variable of interest using a linear combination
of past values of the variable.
❑ The term autoregression indicates that it is a regression of the variable against itself.
❑ Thus, an autoregressive model of order p can be written as
yt = c + ϕ1yt−1 + ϕ2yt−2 + ⋯ + ϕpyt−p + εt,
where εt is white noise.

This is like a multiple regression but with lagged values of yt as predictors. We refer to this as
an AR(p) model, an autoregressive model of order p.
Auto-Regression

❑ Autoregressive models are remarkably flexible at handling a wide range of different time
series patterns.
❑ The two series in Figure show series from an AR(1) model and an AR(2) model.
❑ Changing the parameters ϕ1,…,ϕp results in different time series patterns.
❑ The variance of the error term εt will only change the scale of the series, not the patterns.
Auto-Regression

For an AR(1) model, yt = c + ϕ1yt−1 + εt:


❑ when ϕ1=0 and c=0, yt is equivalent to white noise;
❑ when ϕ1=1 and c=0, yt is equivalent to a random walk;
❑ when ϕ1=1 and c≠0, yt is equivalent to a random walk with drift;
❑ when ϕ1<0, yt tends to oscillate around the mean.

We normally restrict autoregressive models to stationary data, in which case some constraints
on the values of the parameters are required.
• For an AR(1) model: −1 < ϕ1 < 1.
• For an AR(2) model: −1 < ϕ2 < 1, ϕ1 + ϕ2 < 1, ϕ2 − ϕ1 < 1.
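A brief simulation can make this concrete. The sketch below uses synthetic data and arbitrary parameter values satisfying the constraints above; it shows how changing the ϕ coefficients changes the pattern of an AR series, while the error variance would only change the scale:

# Simulation sketch (assumed synthetic data, assumed parameter values).
import numpy as np

rng = np.random.default_rng(42)

def simulate_ar(phis, c=0.0, n=200, sigma=1.0):
    # Simulate y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t.
    p = len(phis)
    y = np.zeros(n + p)
    eps = rng.normal(0.0, sigma, n + p)
    for t in range(p, n + p):
        y[t] = c + sum(phis[i] * y[t - 1 - i] for i in range(p)) + eps[t]
    return y[p:]

ar1 = simulate_ar([0.8])        # smooth, slowly wandering series
ar2 = simulate_ar([1.3, -0.7])  # oscillatory series (the AR(2) constraints hold)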
Moving Average Models

Rather than using past values of the forecast variable in a regression, a moving average model
uses past forecast errors in a regression-like model:
yt = c + εt + θ1εt−1 + θ2εt−2 + ⋯ + θqεt−q,

where εt is white noise. We refer to this as an MA(q) model, a moving average model of order q.

❑ Notice that each value of yt can be thought of as a weighted moving average of the past
few forecast errors
❑ However, moving average models should not be confused with the moving
average smoothing
❑ A moving average model is used for forecasting future values, while moving average
smoothing is used for estimating the trend-cycle of past values.
Moving Average Models

❑ Figure shows some data from an MA(1) model and an MA(2) model.
❑ Changing the parameters θ1,…,θq results in different time series patterns.
❑ As with autoregressive models, the variance of the error term εt will only change the scale
of the series, not the patterns.
Moving Average Models

It is possible to write any stationary AR(p) model as an MA(∞) model. For example, using repeated
substitution, we can demonstrate this for an AR(1) model:
yt = ϕ1yt−1 + εt
   = ϕ1(ϕ1yt−2 + εt−1) + εt
   = ϕ1²yt−2 + ϕ1εt−1 + εt
   = ϕ1³yt−3 + ϕ1²εt−2 + ϕ1εt−1 + εt, etc.

Provided −1 < ϕ1 < 1, the value of ϕ1^k will get smaller as k gets larger. So eventually we obtain
yt = εt + ϕ1εt−1 + ϕ1²εt−2 + ϕ1³εt−3 + ⋯,

an MA(∞) process.
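This repeated substitution can be checked numerically. The sketch below (assumed synthetic εt and an arbitrary ϕ1 = 0.6) builds the same series twice, once by the AR(1) recursion and once from the truncated MA(∞) weights ϕ1^k, and shows that the two agree:

# Numeric check that AR(1) equals a truncated MA(infinity) when |phi| < 1.
import numpy as np

phi = 0.6
rng = np.random.default_rng(0)
eps = rng.normal(size=500)

# AR(1) by recursion: y_t = phi * y_{t-1} + eps_t
y_ar = np.zeros_like(eps)
for t in range(1, len(eps)):
    y_ar[t] = phi * y_ar[t - 1] + eps[t]

# MA(infinity) truncated at k_max lags: y_t ~ sum_k phi^k * eps_{t-k}
k_max = 50
y_ma = np.array([
    sum(phi**k * eps[t - k] for k in range(min(k_max, t) + 1))
    for t in range(len(eps))
])

# After a burn-in, the two constructions agree to within roughly phi^k_max.
print(np.max(np.abs(y_ar[100:] - y_ma[100:])))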

The reverse result holds if we impose some constraints on the MA parameters. Then the MA model is
called invertible. That is, we can write any invertible MA(q) process as an AR(∞) process.

The invertibility constraints for other models are similar to the stationarity constraints.
ARMA Models

❑ The ARMA model is a combination of both AR and MA components.


❑ An ARMA(p, q) model, where p is the number of lagged observations (AR part) and q is
the number of lagged forecast errors (MA part), is represented as:
yt = c + ϕ1yt−1 + ⋯ + ϕpyt−p + θ1εt−1 + ⋯ + θqεt−q + εt.

Determining the appropriate values for p and q is crucial for building an effective ARMA
model. This can be done using the following methods:
1. Partial Autocorrelation Function (PACF):
   • PACF is used to determine the order p of the AR model. It measures the correlation
     between observations at different lags, excluding the influence of intermediate lags.
   • The order p is determined by the lag at which the PACF plot cuts off.
2. Autocorrelation Function (ACF):
   • ACF is used to determine the order q of the MA model. It measures the correlation
     between observations at different lags.
   • The order q is determined by the lag at which the ACF plot cuts off.
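As a sketch of this identification step in Python (synthetic AR(1) data; statsmodels and matplotlib are assumed tool choices):

# Plot ACF and PACF for a simulated AR(1); expect a decaying ACF and a PACF
# that cuts off after lag 1.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(1)
eps = rng.normal(size=300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.7 * y[t - 1] + eps[t]

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
plot_acf(y, lags=20, ax=axes[0])    # suggests the MA order q
plot_pacf(y, lags=20, ax=axes[1])   # suggests the AR order p
plt.show()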
ARMA Models

• Causality of a stationary time series indicates that the time series depends on its past/lag
values.

• Invertibility refers to a linear stationary process that behaves like an infinite autoregressive
representation. In other words, it is a property possessed by a moving average process.
Invertibility resolves the non-uniqueness of the autocorrelation function of a moving average.
Non-seasonal ARIMA models

If we combine differencing with autoregression and a moving average model, we obtain a
non-seasonal ARIMA model. ARIMA is an acronym for AutoRegressive Integrated Moving
Average.
y′t = c + ϕ1y′t−1 + ⋯ + ϕpy′t−p + θ1εt−1 + ⋯ + θqεt−q + εt        (9.1)

where y′t is the differenced series (it may have been differenced more than once).

The “predictors” on the right hand side include both lagged values of yt and lagged errors. We
call this an ARIMA(p,d,q) model, where

p = order of the autoregressive part;
d = degree of first differencing involved;
q = order of the moving average part.
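A minimal fitting sketch, assuming synthetic data and the statsmodels library (the order (1,1,1) is an arbitrary choice for illustration, not a recommendation from the slides):

# Fit a non-seasonal ARIMA(p,d,q) and produce point forecasts.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(7)
y = np.cumsum(rng.normal(0.5, 1.0, 200))  # synthetic non-stationary series

fit = ARIMA(y, order=(1, 1, 1)).fit()     # p=1, d=1, q=1
print(fit.summary())
print(fit.forecast(steps=10))             # point forecasts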
Non-seasonal ARIMA models

Special cases of ARIMA models:

White noise              ARIMA(0,0,0) with no constant
Random walk              ARIMA(0,1,0) with no constant
Random walk with drift   ARIMA(0,1,0) with a constant
Autoregression           ARIMA(p,0,0)
Moving average           ARIMA(0,0,q)

Once we start combining components in this way to form more complicated models, it is
much easier to work with the backshift notation.
Non-seasonal ARIMA models

This is an ARIMA(2,0,1) model:
yt = c + ϕ1yt−1 + ϕ2yt−2 + θ1εt−1 + εt,

where εt is white noise with a standard deviation of 2.837 = √8.046.

Forecasts from the model are shown in the figure below.


Understanding ARIMA models

The constant c has an important effect on the long-term forecasts obtained from these models.
• If c=0 and d=0, the long-term forecasts will go to zero.
• If c=0 and d=1, the long-term forecasts will go to a non-zero constant.
• If c=0 and d=2, the long-term forecasts will follow a straight line.
• If c≠0 and d=0, the long-term forecasts will go to the mean of the data.
• If c≠0 and d=1, the long-term forecasts will follow a straight line.
• If c≠0 and d=2, the long-term forecasts will follow a quadratic trend.

❑ The value of d also has an effect on the prediction intervals — the higher the value of d, the
more rapidly the prediction intervals increase in size.
❑ For d=0, the long-term forecast standard deviation will go to the standard deviation of the
historical data, so the prediction intervals will all be essentially the same.

This behaviour is seen in the forecast figure, where d=0 and c≠0. In this figure, the prediction intervals are almost the same
width for the last few forecast horizons, and the final point forecasts are close to the mean of the data.
ACF and PACF plots

❑ ACF plot shows the autocorrelations which measure the relationship between yt and yt−k for
different values of k.
❑ Now if yt and yt−1 are correlated, then yt−1 and yt−2 must also be correlated.
❑ However, then yt and yt−2 might be correlated, simply because they are both connected to yt−1,
rather than because of any new information contained in yt−2 that could be used in
forecasting yt.
❑ To overcome this problem, we can use partial autocorrelations. These measure the
relationship between yt and yt−k after removing the effects of lags 1,2,3,…,k−1.
❑ So the first partial autocorrelation is identical to the first autocorrelation, because there is
nothing between them to remove.
❑ Each partial autocorrelation can be estimated as the last coefficient in an autoregressive
model.
❑ Specifically, αk, the kth partial autocorrelation coefficient, is equal to the estimate of ϕk in an
AR(k) model.
ACF and PACF plots

❑ If the data are from an ARIMA(p,d,0) or ARIMA(0,d,q) model, then the ACF and PACF
plots can be helpful in determining the value of p or q.
❑ If p and q are both positive, then the plots do not help in finding suitable values of p and q.

❑ For an AR(1) model:
✓ the autocorrelations decay exponentially;
✓ there is a single significant partial autocorrelation.

❑ The data may follow an ARIMA(p,d,0) model (i.e., an AR(p) model for the differenced data)
if the ACF and PACF plots of the differenced data show the following patterns:
✓ the ACF is exponentially decaying or sinusoidal;
✓ there is a significant spike at lag p in the PACF, but none beyond lag p.
ACF and PACF plots

Calculate/check confidence intervals: for large sample sizes, use ±1.96/√n to set the
significance bounds.

In the figure, we see that there is a decaying sinusoidal pattern in the ACF, and the PACF
shows the last significant spike at lag 4. This is what you would expect from an
ARIMA(4,0,0) model.
ACF and PACF plots

❑ For an MA(1) model:
✓ the PACF decays exponentially;
✓ there is a single significant spike in the ACF.

❑ The data may follow an ARIMA(0,d,q) model (i.e., an MA(q) model for the differenced data)
if the ACF and PACF plots of the differenced data show the following patterns:
✓ the PACF is exponentially decaying or sinusoidal;
✓ there is a significant spike at lag q in the ACF, but none beyond lag q.
Estimation and order selection

❑ Once the model order has been identified (i.e., the values of p, d and q), we need to estimate
the parameters c, ϕ1,…,ϕp, θ1,…,θq.
❑ Maximum likelihood estimation (MLE) is a technique that finds the values of the parameters
which maximise the probability of obtaining the data that we have observed.
❑ For ARIMA, MLE is similar to the least squares estimates that would be obtained by
minimising the sum of squared errors, ∑t εt² (summed over t = 1, …, T).
Order selection - Information Criteria

❑ Akaike’s Information Criterion (AIC) is useful for determining the order of an ARIMA
model. It can be written as
AIC = −2 log(L) + 2(p + q + k + 1),

where L is the likelihood of the data, k=1 if c≠0 and k=0 if c=0. Note that the last term in
parentheses is the number of parameters in the model (including σ², the variance of the
residuals).

The corrected AIC and the Bayesian Information Criterion can then be written as
AICc = AIC + [2(p + q + k + 1)(p + q + k + 2)] / (T − p − q − k − 2),
BIC = AIC + [log(T) − 2](p + q + k + 1),

where T (elsewhere written n; the two are the same) is the number of observations.

Good models are obtained by minimising the AIC, AICc or BIC. Our preference is to use the AICc.
Calculating AICc

Fit the ARIMA Model: Suppose you fit an ARIMA(1,1,1) model to your time series data. After
fitting the model, you obtain the following:
•Number of observations n=100
•Number of parameters k=3 (for ARIMA(1,1,1), this includes 1 AR parameter, 1 MA parameter,
and 1 for the constant term)
•Log-Likelihood of the model = -150.0

Calculate AIC: Using the log-likelihood and the number of parameters, compute the AIC:
AIC = −2⋅(−150.0) + 2⋅3 = 300 + 6 = 306

Calculate AICc: Substitute the values into the AICc formula:
AICc = AIC + [2k(k + 1)] / (n − k − 1) = 306 + (2⋅3⋅4)/(100 − 3 − 1) = 306 + 0.25 = 306.25
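The same arithmetic in runnable form (values copied from the example above; the BIC line uses the generic k-parameter formula, which is an addition here since the slide stops at AICc):

# Reproduce the AIC/AICc arithmetic from the worked example.
import math

loglik, n, k = -150.0, 100, 3

aic = -2 * loglik + 2 * k                      # 306.0
aicc = aic + (2 * k * (k + 1)) / (n - k - 1)   # 306 + 24/96 = 306.25
bic = -2 * loglik + k * math.log(n)            # about 313.8 (generic formula)
print(aic, aicc, bic)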


Order selection - Information Criteria

 It is important to note that these information criteria tend not to be good guides to
selecting the appropriate order of differencing (d) of a model, but only for selecting the
values of p and q.
 This is because the differencing changes the data on which the likelihood is computed,
making the AIC values between models with different orders of differencing not
comparable.
 So we need to use some other approach to choose d, and then we can use the AICc to
select p and q.
How does ARIMA work

When fitting an ARIMA model to a set of (non-seasonal) time series data, the following
procedure provides a useful general approach.
1. Plot the data and identify any unusual observations.
2. If necessary, transform the data (using a Box-Cox transformation) to stabilize the variance.
3. If the data are non-stationary, take first differences of the data until the data are stationary.
4. Examine the ACF/PACF: Is an ARIMA(p,d,0) or ARIMA(0,d,q) model appropriate?
5. Try your chosen model(s), and use the AICc to search for a better model.
6. Check the residuals from your chosen model by plotting the ACF of the residuals, and doing a
portmanteau test of the residuals. If they do not look like white noise, try a modified model.
7. Once the residuals look like white noise, calculate forecasts.
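A sketch of this procedure in Python, assuming a positive-valued synthetic series and statsmodels/scipy as the tooling; the order (1, d, 1) stands in for whatever the ACF/PACF actually suggest:

# End-to-end sketch of the 7-step procedure on an assumed synthetic series.
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.tsa.stattools import kpss
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(3)
y = pd.Series(np.exp(np.cumsum(rng.normal(0.01, 0.05, 300))))  # step 1: plot/inspect

y_bc, lam = stats.boxcox(y)                    # step 2: Box-Cox to stabilise variance

d, series = 0, pd.Series(y_bc)                 # step 3: difference until stationary
while kpss(series.dropna(), regression="c", nlags="auto")[1] < 0.05 and d < 2:
    series, d = series.diff(), d + 1

fit = ARIMA(y_bc, order=(1, d, 1)).fit()       # steps 4-5: candidate model, compare AICc

print(acorr_ljungbox(fit.resid, lags=[10]))    # step 6: portmanteau test of residuals

print(fit.forecast(steps=5))                   # step 7: forecasts (on the transformed scale)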
Hyndman-Khandakar algorithm

Hyndman-Khandakar algorithm for automatic ARIMA modelling


1. The number of differences 0 ≤ d ≤ 2 is determined using repeated KPSS tests.
2. The values of p and q are then chosen by minimising the AICc after differencing the data d times. Rather
than considering every possible combination of p and q, the algorithm uses a stepwise search to traverse the
model space.

a) Four initial models are fitted:


o ARIMA(0,d,0),
o ARIMA(2,d,2),
o ARIMA(1,d,0),
o ARIMA(0,d,1).
A constant is included unless d=2. If d≤1, an additional model is also fitted:
o ARIMA(0,d,0) without a constant.

b) The best model (with the smallest AICc value) fitted in step (a) is set to be the “current model”.
c) Variations on the current model are considered:
o vary p and/or q from the current model by ±1;
o include/exclude c from the current model.
The best model considered so far (either the current model or one of these variations) becomes the
new current model.
d) Repeat Step 2(c) until no lower AICc can be found
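A search in this spirit is available in Python via pmdarima's auto_arima, shown here as a hedged sketch on synthetic data (pmdarima follows the Hyndman-Khandakar approach, though its defaults may differ from the description above):

# Stepwise automatic ARIMA selection on an assumed synthetic series.
import numpy as np
import pmdarima as pm

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(size=250))

model = pm.auto_arima(
    y,
    d=None,              # let unit-root tests choose the number of differences
    start_p=0, start_q=0, max_p=5, max_q=5,
    stepwise=True,       # neighbourhood search instead of a full grid
    seasonal=False,
)
print(model.order)       # the selected (p, d, q)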
Hyndman-Khandakar algorithm

❑ The grid covers combinations of ARMA(p,q) orders starting from


the top-left corner with an ARMA(0,0), with the AR order increasing
down the vertical axis, and the MA order increasing across the
horizontal axis.
❑ The orange cells show the initial set of models considered by the
algorithm. In this example, the ARMA(2,2) model has the lowest
AICc value amongst these models.
❑ This is called the “current model” and is shown by the black circle.
❑ The algorithm then searches over neighbouring models, as shown by the blue arrows in the
illustrative example of the Hyndman-Khandakar stepwise search process.
❑ If a better model is found then this becomes the new “current
model”.
❑ In this example, the new “current model” is the ARMA(3,3) model.
❑ The algorithm continues in this fashion until no better model can be
found. In this example the model returned is an ARMA(4,2) model.
The default procedure will switch to a new “current model” as soon as a better model is identified, without going through all
the neighbouring models. This may not find the best possible model, but it usually finds a good one.
How does ARIMA work

 The time plot shows some non-stationarity, with an overall decline.
 To address the non-stationarity, we will take a first difference
of the data.

❑ The PACF shown in Figure 9.14 is suggestive of an AR(2) model; so an initial candidate
model is an ARIMA(2,1,0).
❑ The ACF suggests an MA(3) model; so an alternative candidate is an ARIMA(0,1,3).
How does ARIMA work

 We fit both an ARIMA(2,1,0) and an ARIMA(0,1,3) model along with two automated model
selections, one using the default stepwise procedure, and one working harder to search a
larger model space.
The four models have almost identical AICc values. Of the models fitted, the full search has
found that an ARIMA(3,1,0) gives the lowest AICc value.

#> Country                    `Model name` Orders
#> <fct>                      <chr>        <model>
#> 1 Central African Republic arima210     <ARIMA(2,1,0)>
#> 2 Central African Republic arima013     <ARIMA(0,1,3)>
#> 3 Central African Republic stepwise     <ARIMA(2,1,2)>
#> 4 Central African Republic search       <ARIMA(3,1,0)>

#> .model     sigma2 log_lik   AIC  AICc   BIC
#> <chr>       <dbl>   <dbl> <dbl> <dbl> <dbl>
#> 1 search     6.52   -133.  274.  275.  282.
#> 2 arima210   6.71   -134.  275.  275.  281.
#> 3 arima013   6.54   -133.  274.  275.  282.
#> 4 stepwise   6.42   -132.  274.  275.  284.
How does ARIMA work

The ACF plot of the residuals from the ARIMA(3,1,0) model shows that all autocorrelations are
within the threshold limits, indicating that the residuals are behaving like white noise.

A portmanteau test (setting K=3) returns a large p-value, also suggesting that the residuals
are white noise.

#> Country                    .model lb_stat lb_pvalue
#> <fct>                      <chr>    <dbl>     <dbl>
#> 1 Central African Republic search    5.75     0.569
The Portmanteau test, also known as the Ljung-Box test
❑ The Portmanteau test, also known as the Ljung-Box test, is a statistical test used to determine
whether there are significant autocorrelations in a time series.
❑ It assesses the null hypothesis that the residuals from a fitted model (like an ARIMA model)
are independently distributed.
❑ The test computes a statistic based on the autocorrelations of the residuals up to a certain lag.
❑ If the statistic is significantly large, it suggests that there are autocorrelations present.

Steps:
1. Fit a model to your time series data (e.g., ARIMA).
2. Calculate the residuals from the fitted model.
3. Conduct the Portmanteau test on these residuals to check for autocorrelation.

Interpretation:
• A low p-value (commonly less than 0.05) indicates that you should reject the null hypothesis, suggesting
that there are significant autocorrelations in the residuals.
• A high p-value indicates that the residuals behave like white noise, and the model may be adequate.
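In Python, these steps are a few lines with statsmodels (a hedged sketch; the "residuals" here are stand-in white noise rather than the output of a real fitted model):

# Ljung-Box test on a residual series; large p-values mean the residuals
# look like white noise.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(11)
resid = rng.normal(size=200)                 # stand-in for residuals of a fitted ARIMA

print(acorr_ljungbox(resid, lags=[5, 10]))   # columns lb_stat and lb_pvalue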
The Portmanteau test, also known as the Ljung-Box test
We’ll generate data, fit an ARIMA model, and perform the Ljung-Box test step by step, including calculations.

Assume we have the following synthetic time series data:

Let's say we fit an ARIMA(1, 1, 1) model to the data.


Calculation of Residuals
Assuming we fit the model and get the following estimated parameters:
• AR(1) coefficient = 0.5
• MA(1) coefficient = -0.3
• Constant = 1.5
The fitted values and residuals might look like this:
The Portmanteau test, also known as the Ljung-Box test
We will use the residuals to perform the Ljung-Box test. For our example, we calculate the test statistic
manually.
Step-by-Step Calculation:

1. Calculate autocorrelations:
   • Compute the sample autocorrelations rk for k = 1, 2, …, m, where m is the number of lags.
   • Let’s say we calculate r1, r2, r3, r4.

2. Calculate the test statistic:
   Q = n(n + 2) ∑k=1..m [rk² / (n − k)]

Calculating Q, assume we find:
• r1 = 0.2
• r2 = 0.1
• r3 = 0.05
• r4 = 0.03
• Number of observations n = 10

Q = 10⋅(10 + 2)⋅(0.2²/9 + 0.1²/8 + 0.05²/7 + 0.03²/6) ≈ 0.744

The p-value can be derived from the chi-squared distribution with degrees of freedom equal to m.
Through software:

from scipy.stats import chi2
p_value = 1 - chi2.cdf(Q, df=m)   # here m = 4

Calculating it gives a p-value of about 0.95, which is greater than 0.05, so we do not reject the
null hypothesis: there is no significant autocorrelation in the residuals.
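The full manual calculation, wrapped in runnable form (values copied from the example above):

# Compute Q = n(n+2) * sum_k r_k^2 / (n - k) and its chi-squared p-value.
from scipy.stats import chi2

n, m = 10, 4
r = [0.2, 0.1, 0.05, 0.03]

Q = n * (n + 2) * sum(rk**2 / (n - k) for k, rk in enumerate(r, start=1))
p_value = 1 - chi2.cdf(Q, df=m)
print(Q, p_value)   # Q ~ 0.744, p ~ 0.95: do not reject the white-noise null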
Forecasts

Although we have calculated forecasts from the ARIMA models in our examples, we have not
yet explained how they are obtained. Point forecasts can be calculated using the following
three steps.
1. Expand the ARIMA equation so that yt is on the left hand side and all other terms are on the
right.
2. Rewrite the equation by replacing t with T+h.
3. On the right hand side of the equation, replace future observations with their forecasts, future
errors with zero, and past errors with the corresponding residuals.

Beginning with h=1, these steps are then repeated for h=2,3,… until all forecasts have been
calculated.
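A tiny sketch of this recursion for an AR(1) model (the coefficients are arbitrary assumed values): future errors are set to zero, so each forecast feeds the next.

# Point-forecast recursion for y_t = c + phi*y_{t-1} + eps_t, with future eps = 0.
c, phi = 2.0, 0.6   # assumed AR(1) parameters
y_T = 10.0          # last observed value

forecasts, prev = [], y_T
for h in range(1, 6):
    prev = c + phi * prev        # yhat_{T+h} = c + phi * yhat_{T+h-1}
    forecasts.append(prev)
print(forecasts)                 # converges towards the mean c/(1-phi) = 5.0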
Forecasts

For example, expanding an ARIMA(3,1,1) model, (1 − ϕ1B − ϕ2B² − ϕ3B³)(1 − B)yt = (1 + θ1B)εt,
gives
yt = (1 + ϕ1)yt−1 − (ϕ1 − ϕ2)yt−2 − (ϕ2 − ϕ3)yt−3 − ϕ3yt−4 + εt + θ1εt−1.

This completes the first step. While the equation now looks like an ARIMA(4,0,1), it is still the
same ARIMA(3,1,1) model we started with. It cannot be considered an ARIMA(4,0,1) because
the coefficients do not satisfy the stationarity conditions.
Seasonal ARIMA

A seasonal ARIMA model is formed by including additional seasonal terms in the ARIMA
models we have seen so far. It is written as ARIMA(p,d,q)(P,D,Q)m,

where m is the seasonal period (e.g., the number of observations per year). We use uppercase
notation for the seasonal parts of the model, and lowercase notation for the non-seasonal parts
of the model.
Seasonal ARIMA

❑ The seasonal part of the model consists of terms that are similar to the non-seasonal
components of the model, but involve backshifts of the seasonal period.
❑ For example, an ARIMA(1,1,1)(1,1,1)4 model (without a constant) is for quarterly data
(m=4), and can be written as
(1 − ϕ1B)(1 − Φ1B⁴)(1 − B)(1 − B⁴)yt = (1 + θ1B)(1 + Θ1B⁴)εt.

The additional seasonal terms are simply multiplied by the non-seasonal terms.
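The same model can be fitted in Python with statsmodels' SARIMAX (a sketch on assumed synthetic quarterly data):

# Fit an ARIMA(1,1,1)(1,1,1)_4 to synthetic quarterly data.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(9)
seasonal = np.tile([5.0, -2.0, 3.0, -6.0], 30)          # quarterly pattern
y = seasonal + np.cumsum(rng.normal(0.1, 1.0, 120))     # plus a stochastic trend

fit = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 4)).fit(disp=False)
print(fit.summary())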
Seasonal ARIMA

The seasonal part of an AR or MA model will be seen in the seasonal lags of the PACF and ACF.

For example, an ARIMA(0,0,0)(0,0,1)12 model will show:


• a spike at lag 12 in the ACF but no other significant spikes;
• exponential decay in the seasonal lags of the PACF (i.e., at lags 12, 24, 36, …).

Similarly, an ARIMA(0,0,0)(1,0,0)12 model will show:


• exponential decay in the seasonal lags of the ACF;
• a single significant spike at lag 12 in the PACF.

In considering the appropriate seasonal orders for a seasonal ARIMA model, restrict attention to
the seasonal lags.
Seasonal ARIMA

The data are clearly non-stationary, with strong seasonality and a nonlinear trend, so we will
first take a seasonal difference. The seasonally differenced data are shown in the figure.
Seasonal ARIMA

❑ Our aim now is to find an appropriate ARIMA model based on the ACF and PACF shown
in the figure.
❑ The significant spike at lag 2 in the ACF suggests a non-seasonal MA(2) component. The
significant spike at lag 12 in the ACF suggests a seasonal MA(1) component.
❑ Consequently, we begin with an ARIMA(0,1,2)(0,1,1)12 model, indicating a first difference,
a seasonal difference, and non-seasonal MA(2) and seasonal MA(1) components.
❑ If we had started with the PACF, we may have selected an ARIMA(2,1,0)(0,1,1)12 model,
using the PACF to select the non-seasonal part of the model and the ACF to select the
seasonal part of the model.
Seasonal ARIMA

The residuals for the best model are shown in the figure. One small but significant spike
(at lag 11) out of 36 is still consistent with white noise. To be sure, we use a Ljung-Box test.
The large p-value confirms that the residuals are similar to white noise.
Regression with ARIMA errors

We considered regression models of the form
yt = β0 + β1x1,t + ⋯ + βkxk,t + εt,

where yt is a linear function of the k predictor variables (x1,t,…,xk,t), and εt is usually
assumed to be an uncorrelated error term (i.e., it is white noise).

❑ Here, we will allow the errors from a regression to contain autocorrelation.
❑ To emphasise this change in perspective, we will replace εt with ηt in the equation:
yt = β0 + β1x1,t + ⋯ + βkxk,t + ηt.
The error series ηt is assumed to follow an ARIMA model.

For example, if ηt follows an ARIMA(1,1,1) model, we can write
(1 − ϕ1B)(1 − B)ηt = (1 + θ1B)εt,

where εt is a white noise series.

The model now has two error terms: the error from the regression model, which we denote by ηt,
and the error from the ARIMA model, which we denote by εt. Only the ARIMA model errors are
assumed to be white noise.
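In Python this can be estimated by passing the predictors as exog to statsmodels' SARIMAX (a hedged sketch; the data and the AR(1) error structure are assumptions for illustration):

# Regression y_t = beta0 + beta1*x_t + eta_t with AR(1) errors eta_t.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
eps = rng.normal(scale=0.5, size=n)
eta = np.zeros(n)
for t in range(1, n):
    eta[t] = 0.7 * eta[t - 1] + eps[t]      # autocorrelated regression errors
y = 2.0 + 1.5 * x + eta

fit = SARIMAX(y, exog=x, order=(1, 0, 0), trend="c").fit(disp=False)
print(fit.params)   # intercept, x coefficient, AR(1) coefficient, sigma2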
Regression with ARIMA errors

❑ When we estimate the parameters from the model, we need to minimise the sum
of squared εt values.
❑ If we minimise the sum of squared ηt values instead, then several problems arise.
Regression with ARIMA errors

❑ An important consideration when estimating a regression with ARMA errors is that all of
the variables in the model must first be stationary.
❑ Thus, we first have to check that yt and all of the predictors (x1,t,…,xk,t) appear to
be stationary.
❑ If we estimate the model when any of these are non-stationary, the estimated
coefficients will not be consistent estimates (and therefore may not be meaningful).
❑ We therefore first difference the non-stationary variables in the model.
❑ It is common to difference all of the variables if any of them need differencing.
Regression with ARIMA errors

❑ If all of the variables in the model are stationary, then we only need to consider an
ARMA process for the errors.
❑ A regression model with ARIMA errors is equivalent to a regression model in
differences with ARMA errors.
❑ If the above regression model with ARIMA(1,1,1) errors is differenced, we obtain the model
y′t = β1x′1,t + ⋯ + βkx′k,t + η′t, where (1 − ϕ1B)η′t = (1 + θ1B)εt.
Regression with ARIMA errors

This is equivalent to the model

where ηt is an ARIMA(1,1,0) error.

The constant term disappears due to the differencing.


❑ The final model will be expressed in terms of the original variables, even if it has
been estimated using differenced variables.
❑ The AICc is calculated for the final model, and this value can be used to
determine the best predictors.
❑ The procedure should be repeated for all subsets of predictors to be considered,
and the model with the lowest AICc value selected.
Regression with ARIMA errors

The data are clearly already stationary, so we can fit the regression with ARMA errors
directly. The fitted model is shown in the accompanying output.
Regression with ARIMA errors

We can recover estimates of both the ηt and εt series using the residuals() function.

It is the ARIMA estimated errors (the innovation residuals) that should resemble a white
noise series.
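A sketch of the Python analogue (the parameter names "intercept" and "x" follow statsmodels' usual naming and are assumptions here): the regression errors ηt come from subtracting the fitted regression part, while fit.resid gives the innovation residuals εt.

# Recover eta_t (regression errors) and eps_t (innovation residuals) from a
# SARIMAX regression-with-ARIMA-errors fit; only eps_t should be white noise.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(8)
n = 300
x = pd.Series(rng.normal(size=n), name="x")
eps = rng.normal(scale=0.3, size=n)
eta = np.zeros(n)
for t in range(1, n):
    eta[t] = 0.8 * eta[t - 1] + eps[t]
y = pd.Series(1.0 + 2.0 * x + eta, name="y")

fit = SARIMAX(y, exog=x, order=(1, 0, 0), trend="c").fit(disp=False)

eta_hat = y - fit.params["intercept"] - fit.params["x"] * x  # regression errors
eps_hat = fit.resid                                          # innovation residuals
print(eta_hat.autocorr(), eps_hat.autocorr())  # eta autocorrelated, eps close to 0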