Chapter 1 - Lecture Notes
Chapter 1: Introduction
1 Time Series
Figure 1: Plot of Annual Rainfall
Figure 2: Plot of Canadian hare
DEFINITION 2 A time series is said to be discrete when the set $T_0$ of times
at which observations are taken is a discrete set. A time series is said to
be continuous when observations are recorded continuously over some time
interval, e.g. when $T_0 = [0, 1]$.
[Figure: time plot, horizontal axis 1970 to 2005]
[Figure: time plot against time (year)]
[Figure: daily temperature in HK]
[Figure: no. of patients against time (daily)]
[Figure: cases of measles against time (week)]
[Figure: SP500 index against time (daily)]
More time series [What can you observe in the time series?]
3 Purpose of Time Series Analysis
Time series data are often examined in hopes of discovering a historical pattern
that can be exploited in the preparation of a forecast.
Specifically speaking, the first objective of time series analysis is to understand
or model the stochastic mechanism that gives rise to the observed series. When
presented with a time series, the first step in the analysis is usually to plot the
observations against time to give what is called a time plot, and then to obtain
simple descriptive measures of the main properties of the series. The power of
the time plot is illustrated in Fig 4, which clearly shows that there is a regular
seasonal effect.
After an appropriate family of models has been chosen and estimated, the next
major objective of time series analysis is to forecast or predict future values of
the series.
For example,
Broadly speaking, forecasting methods can be divided into two basic types:
◃ Time series data are dependent, e.g. this month’s exchange rate will be
correlated with last month’s.
The difficulty with dependence:
Consider the i.i.d. case first. Suppose that $X_1, X_2, \dots, X_n$ is an i.i.d. random
sample with mean $\mu$ and variance $\sigma^2$. We then estimate $\mu$ by
$$\hat\mu = (X_1 + X_2 + \cdots + X_n)/n.$$
The variance of $\hat\mu$ is
$$\mathrm{var}(\hat\mu) = \frac{\sigma^2}{n}.$$
It follows that $\mathrm{var}(\hat\mu) \to 0$ as $n \to \infty$.
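We now consider the situation where all the $X_i$’s are “perfectly” correlated, i.e.
$$\mathrm{cov}(X_i, X_j) = \sigma^2 \quad \text{for all } i, j.$$
We still estimate $\mu$ by
$$\hat\mu = (X_1 + X_2 + \cdots + X_n)/n.$$
Its variance becomes
$$\mathrm{var}(\hat\mu) = \frac{1}{n^2}\,\mathrm{var}(X_1 + X_2 + \cdots + X_n)
= \frac{1}{n^2}\Big\{\sum_{i=1}^{n}\mathrm{var}(X_i) + \sum_{i \neq j}\mathrm{cov}(X_i, X_j)\Big\}
= \frac{1}{n^2}\{n\sigma^2 + n(n-1)\sigma^2\}
= \frac{1}{n^2}\,n^2\sigma^2 = \sigma^2.$$
$n$ observations $\equiv$ only one observation!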
This example shows that dependence calls for more observations and for more
powerful (and more complicated) mathematics in the statistical analysis, and that
we need to distinguish different types of dependence.
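The two variance calculations above can be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates $\mathrm{var}(\hat\mu) = n^{-2}\sum_i\sum_j \mathrm{cov}(X_i, X_j)$ directly from a covariance matrix. The function names and the value $\sigma^2 = 4$ are assumptions chosen for illustration.

```python
def variance_of_sample_mean(cov):
    """Variance of the sample mean, given the n x n covariance matrix `cov`.

    Uses var(mean) = (1 / n^2) * sum_i sum_j cov(X_i, X_j).
    """
    n = len(cov)
    return sum(sum(row) for row in cov) / n ** 2


def iid_cov(n, sigma2):
    """Covariance matrix of n uncorrelated variables with variance sigma2."""
    return [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]


def perfectly_correlated_cov(n, sigma2):
    """Covariance matrix when cov(X_i, X_j) = sigma2 for every pair (i, j)."""
    return [[sigma2] * n for _ in range(n)]


sigma2 = 4.0  # hypothetical variance, for illustration only
for n in (1, 10, 100):
    v_iid = variance_of_sample_mean(iid_cov(n, sigma2))
    v_dep = variance_of_sample_mean(perfectly_correlated_cov(n, sigma2))
    print(n, v_iid, v_dep)
```

In the uncorrelated case the printed variance shrinks like $\sigma^2/n$, while in the perfectly correlated case it stays at $\sigma^2$ however large $n$ becomes.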
(i) Trend refers to the upward or downward movement that characterizes
a time series over a period of time. Trend reflects the long-run growth
or decline in the time series.
Reasons for trend: technological improvement; changes in consumer
tastes; increases in per capita income; growth of the total population;
market growth; inflation or deflation.
[Figure: number of passengers against time (month), 1948 to 1962]
(ii) Cycle refers to recurring up and down movements around trend levels.
These fluctuations can have a duration of anywhere from two to ten
years or even longer measured from peak to peak or trough to trough.
Typical examples of cycles: “business cycle (Karl Marx’s explanation)”;
“natural phenomena”; “interaction of two variables”. Explaining a
cycle is very difficult.
[Figure: time plot against time (year), 1770 to 1870]
of monthly housing starts and department store sales. Reasons for
seasonality: weather; customs.
(iv) Irregular fluctuations represent what is "left over" in a time series
after trend, cycle and seasonal variations have been accounted for. They
are the random movements in a time series. Some irregular fluctuations in
a time series are caused by "unusual" events that cannot be forecast,
such as earthquakes, accidents, hurricanes, wars and the like.
(i) Additive
$$Y_t = T_t + C_t + S_t + R_t$$
(ii) Multiplicative
$$Y_t = T_t \times C_t \times S_t \times R_t$$
where $T_t$ is the trend component (or factor) in time period (or point) $t$; $S_t$ is
the seasonal component (or factor) in time period (or point) $t$; $C_t$ is the cyclical
component (or factor) in time period (or point) $t$; $R_t$ is the irregular component
(or factor) in time period (or point) $t$.
The second type can be changed into an additive model by taking logarithms
(provided all components are positive):
$$\log Y_t = \log T_t + \log C_t + \log S_t + \log R_t.$$
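The effect of taking logarithms can be illustrated with a small numerical sketch. All component values below are hypothetical and chosen only for illustration; they are not data from these notes. The sketch builds a multiplicative series and checks that its logarithm equals the sum of the logarithms of the components, i.e. an additive model on the log scale.

```python
import math

# Hypothetical positive components for four time periods (illustration only).
trend = [100.0, 102.0, 104.0, 106.0]
cycle = [1.00, 1.02, 1.01, 0.99]
seasonal = [0.90, 1.10, 1.05, 0.95]
irregular = [1.01, 0.98, 1.00, 1.02]


def multiplicative_series(trend, cycle, seasonal, irregular):
    """Y_t = T_t * C_t * S_t * R_t for each period t."""
    return [t * c * s * r for t, c, s, r in zip(trend, cycle, seasonal, irregular)]


def sum_of_log_components(trend, cycle, seasonal, irregular):
    """log T_t + log C_t + log S_t + log R_t for each period t."""
    return [
        math.log(t) + math.log(c) + math.log(s) + math.log(r)
        for t, c, s, r in zip(trend, cycle, seasonal, irregular)
    ]


y = multiplicative_series(trend, cycle, seasonal, irregular)
log_sum = sum_of_log_components(trend, cycle, seasonal, irregular)

# On the log scale the multiplicative model is additive.
for y_t, z_t in zip(y, log_sum):
    print(round(math.log(y_t), 6), round(z_t, 6))
```

The transformation only applies when every component is strictly positive, which is why the hypothetical factors above are all positive.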
6 Errors in Forecasting
I. Types of Forecasts: we consider two types of forecasts: the point forecast and
the prediction interval forecast.
(i) A point forecast is a single number that represents our best prediction
(or guess) of the actual value being forecasted. Based on $\{y_1, y_2, \dots, y_s\}$,
any estimate of the value at time $s + m$ with $m > 0$ is a point forecast.
It is denoted by $\hat y_{s+m}$.
(ii) A prediction interval forecast is an interval (or range) of numbers that is
calculated so that we are very confident (usually, with 95% confidence) that
the actual value will be contained in the interval.
Based on $\{y_1, y_2, \dots, y_s\}$, any interval estimate of the value at time
$s + m$ with $m > 0$ is a prediction interval.
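To make the two forecast types concrete, here is a minimal sketch under strong assumptions that are not taken from these notes: the series is assumed to fluctuate around a constant level with independent, roughly normal errors. Under that assumption the sample mean can serve as a point forecast for any horizon $m$, and mean $\pm\, z \cdot s\sqrt{1 + 1/n}$ gives an approximate 95% prediction interval (a $t$ quantile would be more accurate for small $n$). The data and the function name are hypothetical.

```python
import math
import statistics


def mean_forecast_with_interval(series, level=0.95):
    """Point forecast and approximate prediction interval for a future value.

    Assumes (illustration only) a constant level plus independent,
    roughly normal errors; uses a normal quantile rather than a t quantile.
    """
    n = len(series)
    point = statistics.mean(series)   # point forecast
    s = statistics.stdev(series)      # sample standard deviation
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s * math.sqrt(1 + 1 / n)
    return point, (point - half_width, point + half_width)


observed = [10.0, 12.0, 11.0, 13.0, 9.0]  # hypothetical observations y_1, ..., y_s
point, (lower, upper) = mean_forecast_with_interval(observed)
print(point, lower, upper)
```

For series with trend or seasonality this constant-level assumption is not appropriate; the sketch only illustrates the difference between a single-number forecast and an interval forecast.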
[Figure: unemployment rate in Singapore, 1990 to 2006 (two panels)]
DEFINITION 5
An examination of forecast errors over time can often indicate whether the
forecasting technique being used does or does not match the pattern of the data.
For example, if a forecasting technique is accurately forecasting the trend,
seasonal, or cyclical components that are present in a time series, the forecast errors
should reflect only the irregular component. In such a case, the forecast errors
should be purely random. See the following figures; we will demonstrate
more in subsequent lectures.
EXAMPLE 5 {yt : 0.1001, 1.6900, 2.3156, 2.7119, 3.7902, 3.6686, 4.6908, 2.7975,
4.4802, 4.8433, 3.8959, 6.2573, 5.4435, 8.4151, 6.6949, 8.5287, 8.7193, 8.0781,
7.3293, 9.9408}
[Figure: two panels plotted against times; the second panel shows random forecast errors]
If the forecasting errors over time indicate that the forecasting methodology
is appropriate (random distribution of errors), it is important to measure the
magnitude of the errors so that we can determine whether accurate forecasting
is possible.
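(i) mean of estimation errors:
$$n^{-1}\sum_{t=1}^{n} e_t;$$
(ii) mean absolute deviation (MAD):
$$n^{-1}\sum_{t=1}^{n} |e_t|;$$
(iii) mean squared error (MSE):
$$n^{-1}\sum_{t=1}^{n} e_t^2.$$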
EXAMPLE 6
Actual value   Predicted value   Error             Absolute deviation   Squared error
y_t            ŷ_t               e_t = y_t − ŷ_t   |e_t|                e_t²
25             22                3                 3                    9
28             30                -2                2                    4
29             30                -1                1                    1
mean                             0                 2                    4.67
Therefore, for the above three predictions, the mean error is 0, MAD = 2 and MSE = 4.67.
“Mean prediction error” cannot be used to assess a prediction method because
the positive and negative errors, no matter how large or small, will cancel each
other out.
MAD and MSE can be used as criteria for assessing a prediction method. The
assessment can be carried out on the observed data.
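The summary row of the table in Example 6 can be reproduced with a few lines of code. This is a minimal sketch; the function name is an assumption made for illustration, while the actual and predicted values are those of Example 6.

```python
def error_measures(actual, predicted):
    """Return (mean error, MAD, MSE) for paired actual and predicted values."""
    errors = [y - y_hat for y, y_hat in zip(actual, predicted)]  # e_t = y_t - yhat_t
    n = len(errors)
    mean_error = sum(errors) / n
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    return mean_error, mad, mse


actual = [25, 28, 29]      # actual values from Example 6
predicted = [22, 30, 30]   # predicted values from Example 6

mean_error, mad, mse = error_measures(actual, predicted)
print(mean_error, mad, round(mse, 2))  # prints: 0.0 2.0 4.67
```

The mean error is zero even though no single prediction is exact, which is the cancellation effect described above.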
EXAMPLE 7 One can propose the following two simple methods for prediction of
$y_t$.
Method A:
$$\hat y_t = y_{t-1}$$
Method B:
$$\hat y_t = \frac{1}{2}(y_{t-1} + y_{t-2})$$
we have
and
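Methods A and B can be compared on any observed series by computing MAD and MSE for each. The sketch below is an illustration only: using the series of Example 5 as input is an assumption (the notes do not say which data the comparison was run on), and both methods are evaluated from the third observation onward so that they are scored on the same time points. Function names are hypothetical.

```python
def forecasts_method_a(y):
    """Method A: yhat_t = y_{t-1}, for 0-based positions t = 2, ..., len(y) - 1."""
    return [y[t - 1] for t in range(2, len(y))]


def forecasts_method_b(y):
    """Method B: yhat_t = (y_{t-1} + y_{t-2}) / 2, for the same positions."""
    return [(y[t - 1] + y[t - 2]) / 2 for t in range(2, len(y))]


def mad_and_mse(actual, predicted):
    """Mean absolute deviation and mean squared error of the forecast errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return sum(abs(e) for e in errors) / n, sum(e ** 2 for e in errors) / n


# Series of Example 5, used here as an assumed input for illustration.
y = [0.1001, 1.6900, 2.3156, 2.7119, 3.7902, 3.6686, 4.6908, 2.7975,
     4.4802, 4.8433, 3.8959, 6.2573, 5.4435, 8.4151, 6.6949, 8.5287,
     8.7193, 8.0781, 7.3293, 9.9408]

targets = y[2:]  # values both methods are asked to predict
for name, method in (("A", forecasts_method_a), ("B", forecasts_method_b)):
    mad, mse = mad_and_mse(targets, method(y))
    print("Method", name, "MAD =", round(mad, 3), "MSE =", round(mse, 3))
```

The method with the smaller MAD (or MSE) would be preferred under that criterion; the two criteria need not agree.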
7 Comparison of MAD and MSE
We now describe how these two measures differ. Extreme prediction errors
have a bigger influence on MSE than on MAD.
EXAMPLE 8 Suppose we have two other methods and their predictions are
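For method 1:
$$\mathrm{MAD}_1 = \frac{1}{7}\sum_{t=1}^{7} |e_{1,t}| = 2.14;$$
$$\mathrm{MSE}_1 = \frac{1}{7}\sum_{t=1}^{7} e_{1,t}^2 = 6.14;$$
For method 2:
$$\mathrm{MAD}_2 = \frac{1}{7}\sum_{t=1}^{7} |e_{2,t}| = 1.86;$$
$$\mathrm{MSE}_2 = \frac{1}{7}\sum_{t=1}^{7} e_{2,t}^2 = 8.14.$$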
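Method 2 has the smaller MAD but the larger MSE, which is what happens when one method makes mostly small errors plus a single large one. The sketch below illustrates this with hypothetical absolute errors: they are not the original data of Example 8, and were merely chosen so that they reproduce the four summary values reported above.

```python
def mad(errors):
    """Mean absolute deviation of a list of forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean squared error of a list of forecast errors."""
    return sum(e ** 2 for e in errors) / len(errors)


# Hypothetical errors (NOT the original table): moderate errors throughout.
errors_1 = [0, 1, 2, 2, 3, 3, 4]
# Hypothetical errors (NOT the original table): small errors and one extreme error.
errors_2 = [0, 1, 1, 1, 1, 2, 7]

print(round(mad(errors_1), 2), round(mse(errors_1), 2))  # prints: 2.14 6.14
print(round(mad(errors_2), 2), round(mse(errors_2), 2))  # prints: 1.86 8.14
```

The single error of size 7 contributes 7 to the sum of absolute errors but 49 to the sum of squared errors, so it dominates the MSE while barely affecting the ranking by MAD.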
Denote them by $y_1, \dots, y_{10}$. Suppose that they follow the model
$$y_t = \beta + \varepsilon_t.$$
The prediction is then
$$\hat y_t = \beta.$$
The aim is to estimate $\beta$. By MAD, we need to estimate $\beta$ by minimizing
$$n^{-1}\sum_{t=1}^{n} |y_t - \beta|.$$
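It is a standard fact that the sample median minimizes the mean absolute deviation, whereas the sample mean minimizes the mean squared error. The following minimal sketch checks this numerically by a simple grid search. The data, the grid step and the function names are assumptions made for illustration and are not taken from these notes.

```python
import statistics


def mad_loss(beta, ys):
    """Mean absolute deviation of the observations from the constant beta."""
    return sum(abs(y - beta) for y in ys) / len(ys)


def mse_loss(beta, ys):
    """Mean squared error of the observations from the constant beta."""
    return sum((y - beta) ** 2 for y in ys) / len(ys)


def grid_minimizer(loss, ys, step=0.5):
    """Return the grid point between min(ys) and max(ys) with the smallest loss."""
    lo, hi = min(ys), max(ys)
    n_steps = int(round((hi - lo) / step))
    candidates = [lo + k * step for k in range(n_steps + 1)]
    return min(candidates, key=lambda beta: loss(beta, ys))


ys = [1, 2, 3, 4, 100]  # hypothetical observations with one extreme value

beta_mad = grid_minimizer(mad_loss, ys)
beta_mse = grid_minimizer(mse_loss, ys)
print(beta_mad, statistics.median(ys))  # MAD estimate next to the sample median
print(beta_mse, statistics.mean(ys))    # MSE estimate next to the sample mean
```

With one extreme observation in the data, the MAD estimate stays near the bulk of the observations while the MSE estimate is pulled toward the extreme value, in line with the comparison of the two measures in this section.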