Chapter 1 - Lecture Notes

MH4500 TIME SERIES ANALYSIS

Chapter 1: Introduction

A time series is a sequence of observations taken sequentially in time. Many sets of data appear as time series: a monthly sequence of the quantity of goods shipped from a factory, a weekly series of the number of road accidents, hourly observations made on the yield of a chemical process, and so on. An intrinsic feature of a time series is that, typically, adjacent observations are dependent. Time series analysis is concerned with techniques for the analysis of this dependence.

1 Time Series

DEFINITION 1 A time series is a sequence of observations over time.

Time series occur in a variety of fields. In business and economics, we observe weekly interest rates, daily closing stock prices, monthly price indices, yearly sales figures, and so forth. In engineering, we observe sound, electric signals and voltage. In medical studies, we measure electroencephalogram (EEG) and electrocardiogram (EKG) tracings. In the social sciences, we study annual birth rates, mortality rates, accident rates and various crime rates. The list of areas in which time series are studied is virtually endless.
EXAMPLE 1 Figure 1 displays a time series plot of the annual rainfall amounts recorded in Los Angeles, California, over more than 100 years. The plot shows considerable variation in rainfall amount over the years: some years are low, some high, and many are in-between in value. The year 1883 was an exceptionally wet year for Los Angeles, while 1983 was quite dry. For analysis and modeling purposes we are interested in whether or not consecutive years are related in some way. If so, we might be able to use one year's rainfall value to help forecast next year's rainfall amount.

EXAMPLE 2 Our second example concerns the annual abundance of Canadian hare. Fig 2 gives the time series plot of this abundance over about 30 years. Neighboring values here are very closely related. Large changes in abundance do not occur from one year to the next. We see an upward trend in the plot: low values tend to be followed by low values in the next year, middle-sized values by middle-sized values, and high values by high values.

Figure 1: Plot of Annual Rainfall

Figure 2: Plot of Canadian hare abundance

DEFINITION 2 A time series is said to be discrete when the set T0 of times at which observations are taken is a discrete set. A time series is said to be continuous when observations are recorded continuously over some time interval, e.g. when T0 = [0, 1].

Remarks: We are mainly interested in discrete-time time series observed at fixed, equally spaced time intervals, e.g. observations made monthly, weekly or daily.

2 More Representative Time Series


EXAMPLE 3 Unemployment Rate (%) in Singapore

year  rate    year  rate    year  rate
1973  4.4     1984  2.7     1995  2.7
1974  3.9     1985  4.1     1996  3.0
1975  4.5     1986  6.5     1997  2.4
1976  4.4     1987  4.7     1998  3.2
1977  3.9     1988  3.3     1999  4.6
1978  3.6     1989  2.2     2000  4.4
1979  3.3     1990  1.7     2001  3.4
1980  3.5     1991  1.9     2002  5.2
1981  2.9     1992  2.7     2003  5.4
1982  2.6     1993  2.7
1983  3.2     1994  2.6

Figure 3: Plot of Example 3 (unemployment rate in Singapore, 1973-2003)
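
A minimal Python sketch (assuming matplotlib is available; variable names are illustrative, not from the notes) of how one might draw a time plot like Figure 3 from the table above:

import matplotlib.pyplot as plt

# Unemployment rate (%) in Singapore, 1973-2003, entered from the table above
years = list(range(1973, 2004))
rates = [4.4, 3.9, 4.5, 4.4, 3.9, 3.6, 3.3, 3.5, 2.9, 2.6, 3.2,
         2.7, 4.1, 6.5, 4.7, 3.3, 2.2, 1.7, 1.9, 2.7, 2.7, 2.6,
         2.7, 3.0, 2.4, 3.2, 4.6, 4.4, 3.4, 5.2, 5.4]

plt.plot(years, rates, marker="o")      # time plot: observations against time
plt.xlabel("year")
plt.ylabel("unemployment rate (%)")
plt.title("Unemployment rate in Singapore, 1973-2003")
plt.show()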

Figure 4: Canadian Lynx captured (1828-1934); number of lynx against time (year)

Figure 5: Daily temperature in Hong Kong (1994-1997)

Figure 6: Number of patients with respiratory problems in Hong Kong, daily (1994)

Figure 7: Measles cases in London, weekly (1944-1978)

Figure 8: S&P 500 stock index, daily (1990-2004)

More time series [What can you observe in the time series?]

3 Purpose of Time Series Analysis
Time series data are often examined in the hope of discovering a historical pattern that can be exploited in the preparation of a forecast.
More specifically, the first objective of time series analysis is to understand or model the stochastic mechanism that gives rise to the observed series. When presented with a time series, the first step in the analysis is usually to plot the observations against time to give what is called a time plot, and then to obtain simple descriptive measures of the main properties of the series. The power of the time plot is illustrated in Fig 4, which clearly shows a regular cyclical pattern in the lynx series.
After an appropriate family of models has been chosen and estimated, the next major objective of time series analysis is to forecast or predict future values of the series.

DEFINITION 3 Predictions of future events and conditions are called forecasts, and the act of making such predictions is called forecasting.

For example,

◃ What will be the unemployment rate next year?

◃ Is there a trend in global temperature?

◃ What is the seasonal effect?

◃ What is the relationship between GDP and the interest rate?

Broadly speaking, forecasting methods can be divided into two basic types:

◃ Qualitative forecasting methods: use the opinions of experts to predict future events subjectively.

◃ Quantitative forecasting methods: based on historical data, use statistical methods to predict future values of a variable.

4 The difference between time series and i.i.d. data
Time series data are dependent:

◃ there is a natural time order for the observations of a time series;

◃ time series data are dependent, e.g. this month's exchange rate will be correlated with last month's.

The difficulty with dependence:

Consider the i.i.d. case first. Suppose that X1, X2, · · · , Xn are an i.i.d. random sample with mean µ and variance σ². We then estimate µ by

µ̂ = (X1 + X2 + · · · + Xn)/n.

The variance of µ̂ is

var(µ̂) = σ²/n.

It follows that var(µ̂) → 0 as n → ∞.
We now consider the situation where all the Xi's are "perfectly" correlated, i.e.

cov(Xi, Xj) = σ² for all i ≠ j.

We still estimate µ by
µ̂ = (X1 + X2 + · · · + Xn)/n.
Its variance becomes

var(µ̂) = (1/n²) var(X1 + X2 + · · · + Xn)
        = (1/n²) { ∑_{i=1}^{n} var(Xi) + ∑_{i≠j} cov(Xi, Xj) }
        = (1/n²) · n² σ²
        = σ².

Thus n observations are equivalent to only one observation!
This example shows that dependence requires more observations, that more powerful (and more complicated) mathematics is needed for the statistical analysis, and that we need to distinguish different types of dependence.
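
The contrast can also be seen in a small simulation. The following Python sketch (using numpy; the sample size, σ and number of replications are arbitrary choices for illustration) compares the variance of the sample mean under i.i.d. sampling with the perfectly correlated case, where every observation in a sample equals the same draw:

import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 50, 2.0, 20000

# i.i.d. case: each replication is a fresh sample of n independent observations.
iid_means = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)

# Perfectly correlated case: X1 = X2 = ... = Xn, so the sample mean equals
# the single common draw and the extra observations add no information.
corr_means = rng.normal(0.0, sigma, size=reps)

print(np.var(iid_means))   # close to sigma^2 / n = 0.08
print(np.var(corr_means))  # close to sigma^2     = 4.0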

5 Components of a time series


In order to identify the pattern of time series data, it is often convenient to think
of a time series as consisting of several components.

DEFINITION 4 The components of a time series are: Trend, Cycle, Seasonal variations and Irregular fluctuations.

(i) Trend refers to the upward or downward movement that characterizes a time series over a period of time. Trend reflects the long-run growth or decline in the time series.
Reasons for trend: technological improvement; changes in consumer tastes; increases in per capita income; growth of the total population; market growth; inflation or deflation.

Plot: International Airline Passengers, Monthly Totals, 1949-1960 (number of passengers against time in months)

(ii) Cycle refers to recurring up and down movements around trend levels. These fluctuations can have a duration of anywhere from two to ten years or even longer, measured from peak to peak or trough to trough.
Typical examples of cycles: the "business cycle (Karl Marx's explanation)"; "natural phenomena"; the "interaction of two variables". Explaining cycles is often very difficult.

Plot: Wolfer Sunspot Numbers, Annual, 1770-1869 (against time in years)

(iii) Seasonal variations or Seasonality are periodic patterns in a time series that complete themselves within a calendar year and are then repeated on a yearly basis.
Typical examples are the average monthly temperature, the number of monthly housing starts and department store sales. Reasons for seasonality: weather; customs.
(iv) Irregular fluctuations represent what is "left over" in a time series after trend, cycle and seasonal variations have been accounted for. They are the random movement in a time series. Some irregular fluctuations are caused by "unusual" events that cannot be forecast, such as earthquakes, accidents, hurricanes, wars and the like.

These components may be combined in different ways. It is usually assumed that they are multiplied or added. This leads to the following possible structures for a time series:

(i) Additive

Yt = Tt + Ct + St + Rt

(ii) Multiplicative

Yt = Tt × Ct × St × Rt

where Tt is the trend component (or factor), Ct the cyclical component, St the seasonal component and Rt the irregular component in time period (or at time point) t.
The multiplicative model can be changed into an additive model by taking logarithms.
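
As a quick illustration, here is a simulated Python sketch (using numpy; only trend, seasonal and irregular factors are generated, and their particular forms are invented for illustration) showing that taking logarithms turns the multiplicative structure into an additive one term by term:

import numpy as np

rng = np.random.default_rng(1)
t = np.arange(120)                                      # ten years of monthly data
trend = 100 + 2.0 * t                                   # Tt: linear upward trend
seasonal = 1 + 0.3 * np.sin(2 * np.pi * t / 12)         # St: pattern repeating every 12 periods
irregular = np.exp(rng.normal(0.0, 0.05, size=t.size))  # Rt: random factor close to 1

y = trend * seasonal * irregular                        # multiplicative model: Yt = Tt * St * Rt

# On the log scale the same series is exactly additive:
# log Yt = log Tt + log St + log Rt
print(np.allclose(np.log(y), np.log(trend) + np.log(seasonal) + np.log(irregular)))  # True

This is why a log transformation is a standard first step when the seasonal swings grow with the level of the series.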

6 Errors in Forecasting
I. Types of Forecasts: we consider two types of forecasts: the point forecast and
the prediction interval forecast.

(i) A point forecast is a single number that represents our best prediction (or guess) of the actual value being forecast. Based on {y1, y2, ..., ys}, any estimate of the value at time s + m with m > 0 is a point forecast, denoted by ŷs+m.
(ii) A prediction interval forecast is an interval (or range) of numbers calculated so that we are very confident (usually 95% confident) that it contains the actual value being forecast. Based on {y1, y2, ..., ys}, any such interval estimate of the value at time s + m with m > 0 is a prediction interval.

Such a prediction is called an m-step-ahead prediction.


EXAMPLE 4 Prediction of unemployment

Plots: unemployment rate in Singapore, 1990-2006 (predicted values marked with *)

II. Measuring forecasting errors

All forecasting situations involve some degree of uncertainty. We recognize this fact by including an irregular component in the description of a time series. The presence of this irregular component, which represents unexplained or unpredictable fluctuations in the data, means that some error in forecasting must be expected.
We now consider the problem of measuring forecasting errors. Denote the actual value of the variable of interest at time t by yt and the predicted value of yt by ŷt. We then introduce the following concepts.

DEFINITION 5

forecast error: et = yt − ŷt;
forecast error plot: a plot of et against t;
absolute deviation: |et| = |yt − ŷt|;
squared error: et² = (yt − ŷt)².

An examination of forecast errors over time can often indicate whether the forecasting technique being used does or does not match the pattern of the data. For example, if a forecasting technique is accurately forecasting the trend, seasonal, or cyclical components that are present in a time series, the forecast errors should reflect only the irregular component. In such a case, the forecast errors should be purely random. See the following figures; we will demonstrate more in the subsequent lectures.
EXAMPLE 5 {yt : 0.1001, 1.6900, 2.3156, 2.7119, 3.7902, 3.6686, 4.6908, 2.7975,
4.4802, 4.8433, 3.8959, 6.2573, 5.4435, 8.4151, 6.6949, 8.5287, 8.7193, 8.0781,
7.3293, 9.9408}

Plots: the series yt against time (left) and the random forecast (prediction) errors et against time (right)

If the forecasting errors over time indicate that the forecasting methodology
is appropriate (random distribution of errors), it is important to measure the
magnitude of the errors so that we can determine whether accurate forecasting
is possible.
(i) mean prediction error:

    (1/n) ∑_{t=1}^{n} et;

(ii) mean absolute deviation (MAD):

    (1/n) ∑_{t=1}^{n} |et|;

(iii) mean squared error (MSE):

    (1/n) ∑_{t=1}^{n} et².
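
A minimal Python sketch of these three measures (the function names are my own), applied to the three forecast errors of Example 6 below:

def mean_error(errors):
    return sum(errors) / len(errors)

def mad(errors):
    # mean absolute deviation
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    # mean squared error
    return sum(e * e for e in errors) / len(errors)

errors = [3, -2, -1]                      # et = yt - y_hat_t from Example 6
print(mean_error(errors))                 # 0.0
print(mad(errors))                        # 2.0
print(round(mse(errors), 2))              # 4.67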

EXAMPLE 6

Actual value   Predicted value   Error           Absolute deviation   Squared error
yt             ŷt                et = yt − ŷt    |et|                 et²
25             22                 3              3                    9
28             30                -2              2                    4
29             30                -1              1                    1
mean                              0              2                    4.67

Therefore, for the above three predictions,

mean prediction error = 0,
mean absolute deviation = 2,
mean squared error = 4.67.

The mean prediction error cannot be used to assess a prediction method, because the positive and negative errors, no matter how large or small, cancel each other out.
MAD and MSE, however, can be used as criteria for assessing a prediction method. Such an assessment is carried out on the observed data, as the following example illustrates.

EXAMPLE 7 One can propose the following two simple methods for the prediction of yt.
Method A:
    ŷt = yt−1
Method B:
    ŷt = (yt−1 + yt−2)/2

Actual value   Predicted value of A   Predicted value of B   Absolute deviation of A   Absolute deviation of B
yt             ŷA,t                   ŷB,t                   |eA,t|                    |eB,t|
25             –                      –                      –                         –
28             25                     –                      3                         –
29             28                     26.5                   1                         2.5
30             29                     28.5                   1                         1.5
27             30                     29.5                   3                         2.5
23             27                     28.5                   4                         5.5
20             23                     25                     3                         5

We have

MADA = (3 + 1 + 1 + 3 + 4 + 3)/6 = 2.5,
MSEA = (3² + 1² + 1² + 3² + 4² + 3²)/6 = 7.5,

and

MADB = (2.5 + 1.5 + 2.5 + 5.5 + 5)/5 = 3.4,
MSEB = (2.5² + 1.5² + 2.5² + 5.5² + 5²)/5 = 14.0.

Both criteria suggest that method A is more accurate than method B.
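
The calculations in Example 7 can be checked with a short, self-contained Python sketch (variable names are illustrative):

y = [25, 28, 29, 30, 27, 23, 20]

# Method A: y_hat_t = y_{t-1};  Method B: y_hat_t = (y_{t-1} + y_{t-2}) / 2
errors_a = [y[t] - y[t - 1] for t in range(1, len(y))]
errors_b = [y[t] - (y[t - 1] + y[t - 2]) / 2 for t in range(2, len(y))]

mad_a = sum(abs(e) for e in errors_a) / len(errors_a)   # 2.5
mse_a = sum(e * e for e in errors_a) / len(errors_a)    # 7.5
mad_b = sum(abs(e) for e in errors_b) / len(errors_b)   # 3.4
mse_b = sum(e * e for e in errors_b) / len(errors_b)    # 14.0
print(mad_a, mse_a, mad_b, mse_b)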

7 Comparison of MAD and MSE
We now describe how these two measures differ: extreme prediction errors have a bigger influence on MSE than on MAD.
EXAMPLE 8 Suppose we have two other methods and their predictions are

Actual value   Predicted value 1   Predicted value 2   Absolute deviation 1   Absolute deviation 2
yt             ŷ1,t                ŷ2,t                |e1,t|                 |e2,t|
25             22                  24                  3                      1
28             30                  27                  2                      1
29             30                  30                  1                      1
30             30                  30                  0                      0
27             30                  28                  3                      1
23             27                  30                  4                      7
20             22                  22                  2                      2

For method 1:

MAD1 = (1/7) ∑_{t=1}^{7} |e1,t| = 2.14;
MSE1 = (1/7) ∑_{t=1}^{7} e1,t² = 6.14.

For method 2:

MAD2 = (1/7) ∑_{t=1}^{7} |e2,t| = 1.86;
MSE2 = (1/7) ∑_{t=1}^{7} e2,t² = 8.14.

Based on MAD, method 2 is better than method 1; based on MSE, method 1 is better than method 2.

It is commonly believed that MAD is a better criterion than MSE. However, mathematically MSE is more convenient than MAD.
For example, consider the time series

{1.81, 2.73, 1.41, 4.18, 1.86, 2.11, 3.07, 2.06, 1.90, 1.17}.

Denote the observations by y1, ..., y10 and suppose that they follow the model

yt = β + εt.

The prediction is then
    ŷt = β.
The aim is to estimate β. Using MAD as the criterion, we need to estimate β by minimizing

    (1/n) ∑_{t=1}^{n} |yt − β|.

No simple closed-form answer! [Numerical calculation suggests β = 1.93 as a solution; in fact any value between the two middle observations, 1.90 and 2.06, i.e. any sample median, minimizes this criterion.]

However, using MSE as the criterion, we need to estimate β by minimizing

    (1/n) ∑_{t=1}^{n} (yt − β)².

We can get the solution explicitly (How? Set the derivative with respect to β to zero.):

    β̂ = (1/n) ∑_{t=1}^{n} yt = 2.23.
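
A quick numerical check in Python (a brute-force grid search; the grid and names are my own choices) confirms that the MSE criterion is minimized at the sample mean 2.23, while the MAD criterion is minimized by any sample median and is flat between 1.90 and 2.06:

import numpy as np

y = np.array([1.81, 2.73, 1.41, 4.18, 1.86, 2.11, 3.07, 2.06, 1.90, 1.17])

def mad_crit(beta):            # (1/n) * sum |yt - beta|
    return np.mean(np.abs(y - beta))

def mse_crit(beta):            # (1/n) * sum (yt - beta)^2
    return np.mean((y - beta) ** 2)

betas = np.linspace(1.0, 3.0, 2001)                        # brute-force grid search
print(betas[np.argmin([mse_crit(b) for b in betas])])      # about 2.23, the sample mean
print(betas[np.argmin([mad_crit(b) for b in betas])])      # some value between 1.90 and 2.06
print(round(mad_crit(1.93), 3), round(mad_crit(2.23), 3))  # 0.6 0.658: MAD prefers the median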

8 A Scientific view towards statistical forecasting


(i) Statistical forecasting must be based on a very strong assumption: the future behavior of the time series has the same kind of "statistical properties" as the observed past.
(ii) Usually, statistical methods can only be applied to “short term” forecasts.
(iii) Statistical forecasting must be combined with other prediction methods
in practice.
