Chap1 Introduction


1. INTRODUCTION
Data obtained from observations collected sequentially over time are extremely
common. For example:
In business, we observe weekly interest rates, daily closing stock prices, monthly
price indices, yearly sales figures.
In meteorology, we observe daily high and low temperatures, annual
precipitation and drought indices, hourly wind speeds.
In agriculture, we record annual figures for crop and livestock production, soil
erosion and export sales.
In biological sciences, we observe the electrical activity of the heart at
millisecond intervals.
In ecology, we record the abundance of animal species.

The purpose of time series analysis is generally twofold: to understand or model the
stochastic mechanism that gives rise to an observed series and to predict or forecast
the future values of a series based on the history of that series and, possibly, other
related series or factors.

When a variable is measured sequentially in time at a fixed interval, known as the
sampling interval, the resulting data form a time series.

A time series is a set of observations that have been collected over fixed sampling
intervals. The sampling intervals are usually equal, e.g. daily, monthly, quarterly or
annually.

Thus, $y_1, y_2, \ldots, y_n$ or $\{y_t : t = 1, 2, \ldots, n\}$ is an observed series and may
be taken to be a realisation of the underlying random process

$\{Y_t : t = 1, 2, \ldots, n\}$ or $\{Y_t : t = 1, 2, \ldots\}$

The $y$'s are not a random sample. The $y$'s will be related to one another, and it is
this relationship, or serial dependence, that is of interest.

Time series can be discrete or continuous in time, written $y_t$ or $y(t)$ respectively.

The following are examples of time series data.

Figure 1: Time series plots

[Six panels: (a) International air passengers in the USA, 1949 to 1960 (Passengers in 1000's against Time); (b) LA annual rainfall (Inches against Year); (c) Abundance of Canadian hare (Abundance against Year); (d) Monthly average temperature in Dubuque, Iowa (Temperature in °F against Time); (e) US monthly unemployment rate, Jan 1996 to Oct 2006 (rate in % against Time); (f) USD$/JMD$ exchange rate, 1972 to 2010 (daily closing values in JMD$ against Year).]

In practice an observed time series can be:


(i) Inherently discrete:- e.g. the closing stock price on each trading day
(ii) Sampled:- a continuous series observed at discrete time points
(iii) Aggregated:- there is no instantaneous value; the quantity is accumulated over a time interval

1.2 Time Plot
The first step in any time-series analysis is to plot the observations against time. This
will show up important features of the series, such as trend, seasonality, outliers and
discontinuities. The plot is important, as seen in Figure 1, both to describe the
data and to help in formulating a sensible model.

Plotting is not as easy as it sounds. The choice of scales, the size of the intercept and the
way that the points are plotted (e.g. as a continuous line or as separate dots or crosses)
may substantially affect the way the plot ‘looks’, and so the analyst must exercise care
and judgement.

The following plot is a time series plot of the annual number of earthquakes in the
world with seismic magnitude over 7.0, for 99 consecutive years.

Some features of the plot:-

There is no consistent trend (upward or downward) over the entire time span.
The series appears to slowly wander up and down. The horizontal line drawn at
quakes = 20.2 indicates the mean of the series.
Notice that the series tends to stay on the same side of the mean (above or
below) for a while and then wanders to the other side.

By definition, there is no seasonality as the data are annual data

There are no obvious outliers

It is difficult to judge whether the variance is constant or not.

1.3 Testing Randomness
We may wish to test whether an observed series $(y_1, y_2, \ldots, y_n)$ could have occurred in this
order at random, i.e. whether it is white noise, or whether a more complicated model is required.

There are several traditional tests:- the turning point test, runs test, difference test, portmanteau test,
records test and rank correlation test, to mention a few.

1.3.1 Turning Point Test


This is a very simple diagnostic, which examines a series $\{Y_t\}$ to test whether it is purely
random. The idea is that if $\{Y_t\}$ is purely random then three successive values are
equally likely to occur in any of the six possible orders.

In four cases, there is a turning point in the middle. Thus, in a series of $n$ points we
might expect $\frac{2}{3}(n - 2)$ turning points.

Idea: $y_i$ is a turning point (TP) if either

$y_i > y_{i-1}$ and $y_i > y_{i+1}$ (a peak), or
$y_i < y_{i-1}$ and $y_i < y_{i+1}$ (a trough).

Using an indicator variable

$$X_i = \begin{cases} 1 & \text{if } y_i \text{ is a TP} \\ 0 & \text{if not} \end{cases}$$

Test statistic: $P$ is the number of TPs, i.e.

$$P = \sum_{i=2}^{n-1} X_i$$

It can be shown that for large $n$, under $H_0$: the $y$'s are random,

$$P \sim N\left( \frac{2(n-2)}{3}, \; \frac{16n - 29}{90} \right)$$
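The test above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `turning_point_test` and the data in `series` are made up for the example, and the standardised value `z` relies on the large-$n$ normal approximation, which is rough for a series as short as this one.

```python
import math

def turning_point_test(y):
    """Return (P, E(P), Var(P), z) for the turning point test."""
    n = len(y)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # Count interior points that are strict peaks or strict troughs.
    p = sum(
        1
        for i in range(1, n - 1)
        if (y[i] > y[i - 1] and y[i] > y[i + 1])
        or (y[i] < y[i - 1] and y[i] < y[i + 1])
    )
    expected = 2 * (n - 2) / 3        # E(P) under H0
    variance = (16 * n - 29) / 90     # Var(P) under H0
    z = (p - expected) / math.sqrt(variance)
    return p, expected, variance, z

series = [3, 5, 4, 6, 2, 7, 8, 1, 9, 10]  # made-up illustrative data
p, mean_p, var_p, z = turning_point_test(series)
print(p, round(mean_p, 3), round(var_p, 3), round(z, 3))
```

A value of `z` far from zero (for example, beyond about ±1.96 at the 5% level) would cast doubt on randomness: a large positive `z` suggests a series that fluctuates more than a random one, a large negative `z` a smoother one.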

Derivation
Consider 3 values: 1, 2, 3 for convenience

There are 3! = 6 possible orderings

If the series is random, these are equally likely

1, 2, 3 → $X_i = 0$
1, 3, 2 → $X_i = 1$
2, 1, 3 → $X_i = 1$
2, 3, 1 → $X_i = 1$
3, 1, 2 → $X_i = 1$
3, 2, 1 → $X_i = 0$

4 of the 6 give TPs. So, $E(X_i) = \frac{4}{6} = \frac{2}{3}$

$\therefore E(P) = \frac{2}{3}(n - 2)$

Consider what the series may be like when

P is large – rough with many fluctuations

P is small – smooth with few fluctuations

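$$\operatorname{Var}(P) = \sum_{i=2}^{n-1} \operatorname{Var}(X_i) + 2 \sum_{i=2}^{n-2} \sum_{j=i+1}^{n-1} \operatorname{Cov}(X_i, X_j)$$

$$= (n-2)\operatorname{Var}(X_i) + 2(n-3)\operatorname{Cov}(X_i, X_{i+1}) + 2(n-4)\operatorname{Cov}(X_i, X_{i+2})$$

Only the consecutive terms (lag 1) and the lag-2 terms remain, because $X_i$ and $X_j$ involve no common observations when $j - i > 2$ and so are independent under $H_0$.

For $E(X_i X_{i+1})$, consider 4 values with 4! = 24 orderings. For $E(X_i X_{i+2})$, consider 5 values with 5! = 120 orderings.

Hence,

$$E(X_i^2) = \frac{2}{3}, \qquad E(X_i X_{i+1}) = \frac{5}{12}, \qquad E(X_i X_{i+2}) = \frac{9}{20}$$

$$\operatorname{Var}(X_i) = \frac{2}{3} - \left(\frac{2}{3}\right)^2 = \frac{2}{9} \quad \text{since } X_i^2 = X_i$$

$$\operatorname{Cov}(X_i, X_{i+1}) = E(X_i X_{i+1}) - E(X_i)E(X_{i+1}) = \frac{5}{12} - \left(\frac{2}{3}\right)\left(\frac{2}{3}\right) = -\frac{1}{36}$$

$$\operatorname{Cov}(X_i, X_{i+2}) = E(X_i X_{i+2}) - E(X_i)E(X_{i+2}) = \frac{9}{20} - \frac{4}{9} = \frac{1}{180}$$

$$\text{So } \operatorname{Var}(P) = \frac{2(n-2)}{9} - \frac{n-3}{18} + \frac{n-4}{90} = \frac{16n - 29}{90}$$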
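The expectations used in this derivation can be checked by brute force. The sketch below enumerates all orderings of 3, 4 and 5 distinct values and uses exact fractions. The helper names (`is_tp`, `prob`, `var_p`) are made up for the illustration, and the variance expression assumes $n \ge 4$ so that lag-1 and lag-2 pairs exist.

```python
from fractions import Fraction
from itertools import permutations

def is_tp(a, b, c):
    """True if the middle value b is a strict peak or trough."""
    return (b > a and b > c) or (b < a and b < c)

def prob(m, event):
    """Exact probability of `event` over all equally likely orderings of 1..m."""
    perms = list(permutations(range(1, m + 1)))
    hits = sum(1 for s in perms if event(s))
    return Fraction(hits, len(perms))

e_x = prob(3, lambda s: is_tp(*s))                             # E(X_i)
e_lag1 = prob(4, lambda s: is_tp(*s[0:3]) and is_tp(*s[1:4]))  # E(X_i X_{i+1})
e_lag2 = prob(5, lambda s: is_tp(*s[0:3]) and is_tp(*s[2:5]))  # E(X_i X_{i+2})

var_x = e_x - e_x ** 2   # X_i is 0/1, so E(X_i^2) = E(X_i)
cov1 = e_lag1 - e_x ** 2
cov2 = e_lag2 - e_x ** 2

def var_p(n):
    """Var(P) assembled from the variance and the two non-zero covariances."""
    return (n - 2) * var_x + 2 * (n - 3) * cov1 + 2 * (n - 4) * cov2

print(e_x, e_lag1, e_lag2, var_p(10))
```

For any $n \ge 4$, `var_p(n)` should agree with $(16n - 29)/90$.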

1.4 Trend Component
Here we consider describing (estimating) the trend, also called “smoothing”. Recall that
trend is a smooth ‘long-term’ movement. We wish to estimate the trend component
and perhaps remove it to explore other components.

A. Global Fitting
We could fit a suitable function, for example linear, polynomial or other, to the
whole series using least squares. But there are drawbacks:
It is rarely appropriate for a single function to fit the entire series
It is impractical for updating

B. Local Fitting
A possibility is to fit a piecewise linear model where the trend line is locally
linear but with change points where the slope and intercept change (abruptly).
It often seems more sensible to look at models that allow a smooth transition
between the different sub-models.

Other general approaches to describing trend: splines, non-linear methods, state-space
models.

1.4.1 Filtering
A second procedure for dealing with a trend is to use a linear filter, which converts
one time series, $\{y_t\}$, into another, $\{x_t\}$, by the linear operation

$$x_t = \sum_{i=-k}^{+s} w_i y_{t+i} = w_{-k} y_{t-k} + \ldots + w_0 y_t + \ldots + w_s y_{t+s}$$

Remarks
$x_t$ is the smoothed or filtered series

$\{w_i\}$ is a set of weights

In order to smooth out local fluctuations and estimate the local mean, we should
choose the weights so that $\sum w_i = 1$; the operation is then often referred to as
a moving average.
The simple moving average is not generally recommended by itself for
measuring trend, although it can be useful for removing seasonal variation.

The $w_i$'s are symmetric about the middle value: moving averages are often symmetric,
with $s = k$ and $w_j = w_{-j}$.
The simplest example of a symmetric smoothing filter is the simple moving
average, for which

$$w_i = \frac{1}{2k+1} \quad \text{for } i = -k, \ldots, +k$$

and the smoothed value is given by

$$x_t = \frac{1}{2k+1} \sum_{i=-k}^{+k} y_{t+i}$$

For a fixed window, the least-squares weights $w_i$ are the same for fitting a polynomial
of even degree $p = 2m$ as for the next odd degree $p = 2m + 1$

For a filter with a central weight and $k$ weights to either side, the filter length is
$2k + 1$
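As an illustration of the remarks above, here is a minimal sketch of a linear filter and the simple moving average. The function names `linear_filter` and `simple_moving_average` and the data are made up for the example, and exact fractions are used so the arithmetic can be checked by hand. Values of $t$ for which some $y_{t+i}$ is unavailable are simply dropped, which is one (assumed) way of handling the ends of the series.

```python
from fractions import Fraction

def linear_filter(y, weights, k):
    """Apply x_t = sum_{i=-k}^{s} w_i * y_{t+i}, where weights[0] is w_{-k}.

    Only those t for which every y_{t+i} exists are returned, so the
    output has k fewer values at the start and s fewer at the end.
    """
    s = len(weights) - 1 - k
    return [
        sum(w * y[t + i] for w, i in zip(weights, range(-k, s + 1)))
        for t in range(k, len(y) - s)
    ]

def simple_moving_average(y, k):
    """Symmetric filter with equal weights 1/(2k+1)."""
    w = [Fraction(1, 2 * k + 1)] * (2 * k + 1)
    return linear_filter(y, w, k)

y = [2, 4, 6, 8, 10, 9, 8]           # made-up illustrative data
x = simple_moving_average(y, 1)      # 3-point moving average
print([float(v) for v in x])
```

With $k = 1$ the seven observations give only five smoothed values, because the first and last points have no neighbour on one side.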

Spencer’s 15-point moving average
This was developed by an actuary in 1904 for smoothing mortality statistics to
get life tables.

This covers 15 consecutive points with $k = 7$, and the symmetric weights are

$$\frac{1}{320}\,[-3, -6, -5, 3, 21, 46, 67, 74, \ldots]$$

Henderson moving average
This is widely used, for example in the X-11 and X-12 seasonal adjustment packages. This
moving average aims to follow a cubic polynomial trend without distortion, and
the choice of $k$ depends on the degree of irregularity.

Whenever a symmetric filter is chosen, there is likely to be an end-effects
problem, since the smoothed value cannot be computed in the same way for the first $k$
and last $k$ time points.

1.4.2 Filters in series


Useful filters can be built up by applying simple filters one after another.

Suppose filter 1, with weights $\{a_r\}$, acts on $\{x_t\}$ to produce $\{y_t\}$

Then filter 2, with weights $\{b_j\}$, acts on $\{y_t\}$ to produce $\{z_t\}$

Now

$$z_t = \sum_j b_j y_{t+j} = \sum_j b_j \sum_r a_r x_{t+j+r} = \sum_k c_k x_{t+k} \qquad (\text{let } k = j + r)$$

where $c_k = \sum_{r=-\infty}^{\infty} a_r b_{k-r}$ are the weights for the overall filter.

The weights $\{c_k\}$ are obtained by convolution, i.e., $\{c_k\} = \{a_r\} * \{b_j\}$, where the
symbol ‘*’ represents the convolution operator.

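Example:

$$\left\{\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{1}{4}\right\} = \left\{\tfrac{1}{2}, \tfrac{1}{2}\right\} * \left\{\tfrac{1}{2}, \tfrac{1}{2}\right\}$$

Slide one pair of weights across the other and add the products of the overlapping weights:

Shift to left: one weight of each filter overlaps, giving $\tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{4}$
Centre: both weights overlap, giving $\tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}$
Shift to right: one weight of each filter overlaps, giving $\tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{4}$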
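The convolution of two sets of weights can be sketched directly from the formula $c_k = \sum_r a_r b_{k-r}$. The function name `convolve` is made up for this illustration, and the weights are stored as plain lists, so list position 0 corresponds to the left-most weight rather than to a particular time offset.

```python
from fractions import Fraction

def convolve(a, b):
    """Full discrete convolution: c[k] = sum over r of a[r] * b[k - r]."""
    c = [Fraction(0)] * (len(a) + len(b) - 1)
    for r, a_r in enumerate(a):
        for j, b_j in enumerate(b):
            c[r + j] += a_r * b_j   # product contributes to index k = r + j
    return c

half = [Fraction(1, 2), Fraction(1, 2)]
c = convolve(half, half)
print(c)   # weights of the combined filter

# Applying a 3-point simple moving average twice (illustrative).
third = [Fraction(1, 3)] * 3
c33 = convolve(third, third)
print(c33)
```

Because each input filter has weights summing to 1, the combined weights also sum to 1, so the result is again a moving average.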

1.4.3 Derivation of Moving Average
For convenience, consider the $(2k + 1)$ values
$y_{-k}, \ldots, y_{-1}, y_0, y_1, \ldots, y_k$

and fit $y_t = a_0 + a_1 t + a_2 t^2 + \ldots + a_p t^p$, a polynomial of degree $p$

Using least squares:

$$S = \sum_{t=-k}^{k} \left( y_t - \sum_{i=0}^{p} a_i t^i \right)^2$$

Set up the equations $\dfrac{\partial S}{\partial a_i} = 0$, $i = 0, 1, 2, \ldots, p$, and solve for the $a_i$,

$$\text{i.e.} \quad a_0 \sum_{t=-k}^{k} t^{i} + a_1 \sum_{t=-k}^{k} t^{i+1} + \ldots + a_p \sum_{t=-k}^{k} t^{i+p} = \sum_{t=-k}^{k} y_t t^{i}$$

In fact we need only $a_0$: we seek $m_0$, the trend value at $t = 0$, which is simply $a_0$
(that’s the convenience)

Remarks
$a_0$ turns out to be a weighted average of the $y$'s:

$$m_0 = \hat{a}_0 = \sum_{t=-k}^{k} w_t y_t$$

The $w_t$'s are functions of $k$ and $p$ only and are the same for any set of $(2k + 1)$
consecutive observations

In general, for $y_{t-k}, \ldots, y_{t-1}, y_t, y_{t+1}, \ldots, y_{t+k}$:

$$m_t = \sum_{i=-k}^{k} w_i y_{t+i}$$

Example 1.4.3
Find a 7 point least squares cubic moving average formula.

Solution 1.4.3

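Here $k = 3$ ($2k + 1 = 7$) and $p = 3$

Equations: for $i = 0, 1, 2, 3$,

$$a_0 \left( \sum_{t=-3}^{3} t^{i} \right) + a_1 \left( \sum_{t=-3}^{3} t^{i+1} \right) + a_2 \left( \sum_{t=-3}^{3} t^{i+2} \right) + a_3 \left( \sum_{t=-3}^{3} t^{i+3} \right) = \sum_{t=-3}^{3} t^{i} y_t$$

The sums of odd powers of $t$ vanish, while $\sum 1 = 7$, $\sum t^2 = 28$, $\sum t^4 = 196$ and $\sum t^6 = 1588$, so

$$i = 0: \quad 7a_0 + 28a_2 = \sum_{t=-3}^{3} y_t \qquad (1)$$

$$i = 1: \quad 28a_1 + 196a_3 = \sum_{t=-3}^{3} t\,y_t \qquad (2)$$

$$i = 2: \quad 28a_0 + 196a_2 = \sum_{t=-3}^{3} t^2 y_t \qquad (3)$$

$$i = 3: \quad 196a_1 + 1588a_3 = \sum_{t=-3}^{3} t^3 y_t \qquad (4)$$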

Eliminate $a_2$ from the 1st and 3rd equations (7 × (1) − (3)):

$$21a_0 = 7 \sum_{t=-3}^{3} y_t - \sum_{t=-3}^{3} t^2 y_t$$

$$= 7\,(y_{-3} + y_{-2} + \ldots + y_2 + y_3) - (9y_{-3} + 4y_{-2} + \ldots + 9y_3)$$

$$\text{Hence, } a_0 = \frac{1}{21} \left\{ -2y_{-3} + 3y_{-2} + 6y_{-1} + 7y_0 + 6y_1 + 3y_2 - 2y_3 \right\}$$

Notation: $\frac{1}{21}[-2, 3, 6, 7, 6, 3, -2]$ or, using symmetry, $\frac{1}{21}[-2, 3, 6, 7]$

$$\text{So: } m_t = \frac{1}{21}[-2, 3, 6, 7]\, y_t = \frac{1}{21} \left( -2y_{t-3} + 3y_{t-2} + 6y_{t-1} + 7y_t + 6y_{t+1} + 3y_{t+2} - 2y_{t+3} \right)$$
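The weights in this example can be reproduced numerically by solving the normal equations for general $k$ and $p$. The sketch below is one possible way to do it, using exact fractions and a small Gauss-Jordan solver. The names `solve` and `ls_moving_average_weights` are made up for the illustration. It uses the fact that the matrix of the normal equations is symmetric, so solving it against the unit vector $(1, 0, \ldots, 0)$ gives coefficients $u_i$ with $w_t = \sum_i u_i t^i$.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination on Fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

def ls_moving_average_weights(k, p):
    """Weights w_{-k}, ..., w_k with a0_hat = sum_t w_t y_t for a degree-p fit."""
    ts = range(-k, k + 1)
    # Normal-equation matrix: entry (i, j) is the sum over t of t^(i+j).
    A = [[sum(t ** (i + j) for t in ts) for j in range(p + 1)]
         for i in range(p + 1)]
    e0 = [1] + [0] * p
    u = solve(A, e0)                 # first row of the inverse of A
    return [sum(u_i * t ** i for i, u_i in enumerate(u)) for t in ts]

w = ls_moving_average_weights(3, 3)  # 7-point cubic moving average
print([str(v) for v in w])
```

Calling `ls_moving_average_weights(3, 2)` gives the same weights as the cubic case, and `ls_moving_average_weights(3, 1)` reduces to the 7-point simple moving average.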

1.5 Time series patterns

(i) Trend
A trend exists when there is a long-term increase or decrease in the data. It does
not have to be linear. When a series goes from an increasing trend to a
decreasing trend, this is usually referred to as the trend “changing direction”.

(ii) Seasonal
A seasonal pattern exists when a series is influenced by seasonal factors (e.g.
quarter of the year, month, day of week). Seasonality is always of a fixed and
known period.

(iii) Cyclic
A cyclic pattern exists when data exhibit rises and falls that are not of fixed
period. The duration of these fluctuations is usually of at least two years.

Remark
Cyclic behaviour is different from seasonal behaviour. If the fluctuations are
not of fixed period then they are cyclic. If the period is unchanging and
associated with some aspect of the calendar, then the pattern is seasonal.

The average length of cycles is longer than the length of a seasonal pattern.

The magnitude of cycles tends to be more variable than the magnitude of
seasonal patterns.

1. The monthly housing sales (top left) show strong seasonality within each year, as
well as some strong cyclic behaviour with period about 6–10 years. There is no
apparent trend in the data over this period.

2. The US treasury bill contracts (top right) show results from the Chicago market for
100 consecutive trading days in 1981. Here there is no seasonality, but an obvious
downward trend. Possibly, if we had a much longer series, we would see that this
downward trend is actually part of a long cycle, but when viewed over only 100
days it appears to be a trend.

3. The Australian monthly electricity production (bottom left) shows a strong
increasing trend, with strong seasonality. There is no evidence of any cyclic
behaviour here.

4. The daily change in the Dow Jones index (bottom right) has no trend, seasonality
or cyclic behaviour. There are random fluctuations which do not appear to be very
predictable, and no strong patterns that would help with developing a forecasting
model.
