Econometrics Module
ECONOMETRICS
A TEACHING MATERIAL FOR DISTANCE STUDENTS MAJORING IN ECONOMICS
Prepared by:
Bedru Babulo
Seid Hassen
Department of Economics
Faculty of Business and Economics
Mekelle University
Mekelle
June, 2005
Econometrics
Module I
Introduction to the Module
The principal objective of the course, “Introduction to Econometrics”, is to provide an elementary but comprehensive introduction to the art and science of econometrics. It enables students to see how economic theory, statistical and mathematical methods are combined in the analysis of economic data, with the purpose of giving empirical content to economic theories and verifying or refuting them.
Module I of the course includes the first three chapters. The first chapter introduces students to the definition and some fundamental concepts of econometrics. In chapter two a fairly detailed treatment of the simple classical linear regression model is given. In this chapter students will be introduced to the basic logic, concepts, assumptions, estimation methods, and interpretations of the simple classical linear regression model and its applications in economic science. Chapter three, which deals with Multiple Regression Models, is basically an extension of the simple regression model, but in chapter three attempts will be made to expand the linear regression model by incorporating more than one explanatory variable or regressor into the model. In both of these chapters (chapters two and three), due attention will be given to the basics of the ordinary least squares (OLS) method of estimation and to investigating the statistical properties of the parameter estimates, which are summarized by the Gauss-Markov BLUE (Best Linear Unbiased Estimator) properties.
Chapter 1. Introduction
Contents of the Module
Definition and scope of econometrics
Economic models vs. econometric models
Methodology of econometrics
Desirable properties of an econometric model
Goals of econometrics
Chapter 2. The Classical Regression Analysis: The Simple Linear regression Models
Stochastic and non-stochastic relationships
The Simple Regression model
The basic assumptions of the Classical Regression Model
OLS Method of Estimation
Properties of OLS Estimators
Inferences/Predictions
Chapter 3. The Classical Regression Analysis: The Multiple Linear Regression Models
Assumptions
Ordinary Least Squares (OLS) estimation
Matrix Approach to Multiple Regression Model
Properties of the OLS estimators
Inferences/Predictions
Chapter One
Introduction
guidance for economic policy making, we also need to know the quantitative relationships between the different economic variables. We obtain these quantitative measurements from data taken from the real world. The field of knowledge which helps us to carry out such an evaluation of economic theories in empirical terms is econometrics.
Distance students! Having given this background to our attempt at defining ‘ECONOMETRICS', we may now formally define what econometrics is.
WHAT IS ECONOMETRICS?
Literally interpreted, econometrics means “economic measurement”, but the scope of econometrics is much broader, as described by leading econometricians. Various econometricians have used different wordings to define econometrics, but if we distill the fundamental features and concepts of all these definitions, we may obtain the following definition.
the “metric” part of the word econometrics signifies ‘measurement', and hence
econometrics is basically concerned with measuring of economic relationships.
charts them, and attempts to describe the pattern in their development over time
and perhaps detect some relationship between various economic magnitudes.
Economic statistics is mainly a descriptive aspect of economics. It does not
provide explanations of the development of the various variables and it does not
provide measurements of the coefficients of economic relationships.
Example: Economic theory postulates that the demand for a commodity depends
on its price, on the prices of other related commodities, on consumers' income and
on tastes. This is an exact relationship which can be written mathematically as:
Q = b0 + b1P + b2P0 + b3Y + b4t
The above demand equation is exact. However, many more factors may affect demand. In econometrics the influence of these ‘other' factors is taken into account by introducing a random variable into the economic relationship. In our example, the demand function studied with the tools of econometrics would be of the stochastic form:
Q = b0 + b1P + b2P0 + b3Y + b4t + u
where u stands for the random factors which affect the quantity demanded.
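To see the difference between the exact and the stochastic formulation, the following short Python sketch generates quantities from both forms; the coefficient values and the data below are purely illustrative assumptions, not figures from this module. Only the stochastic quantities scatter around the exact relationship.

import numpy as np

rng = np.random.default_rng(0)
n = 20
P  = rng.uniform(1, 10, n)     # own price (hypothetical data)
P0 = rng.uniform(1, 10, n)     # price of a related commodity
Y  = rng.uniform(50, 100, n)   # consumer income
t  = rng.uniform(0, 1, n)      # taste index

b0, b1, b2, b3, b4 = 100, -2.0, 0.5, 0.3, 5.0   # illustrative coefficients

Q_exact      = b0 + b1*P + b2*P0 + b3*Y + b4*t   # exact relationship
u            = rng.normal(0, 3, n)               # random disturbance
Q_stochastic = Q_exact + u                       # stochastic relationship

print(np.allclose(Q_exact, Q_stochastic))        # False: observed Q deviates from the exact line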
Specification of the model is the most important and the most difficult stage of any
econometric research. It is often the weakest point of most econometric
applications. In this stage there exists enormous degree of likelihood of
meaningful and statistically and econometrically correct for the sample period for which the model has been estimated; yet it may not be suitable for forecasting due to various factors. Therefore, this stage involves the investigation of the stability of the estimates and their sensitivity to changes in the size of the sample. Consequently, we must establish whether the estimated function performs adequately outside the sample of data, i.e. we must test the extra-sample performance of the model.
Review questions
Chapter Two
THE CLASSICAL REGRESSION ANALYSIS
[The Simple Linear Regression Model]
Economic theories are mainly concerned with the relationships among various
economic variables. These relationships, when phrased in mathematical terms, can
predict the effect of one variable on another. The functional relationships of these
variables define the dependence of one variable upon the other variable (s) in the
specific form. The specific functional forms may be linear, quadratic, logarithmic,
exponential, hyperbolic, or any other form.
Assuming that the supply for a certain commodity depends on its price (other
determinants taken to be constant) and the function being linear, the relationship
can be put as:
Q = f(P) = α + βP ……………………(2.1)
The above relationship between P and Q is such that for a particular value of P,
there is only one corresponding value of Q. This is, therefore, a deterministic (non-
stochastic) relationship since for each price there is always only one corresponding
quantity supplied. This implies that all the variation in Y is due solely to changes
in X, and that there are no other factors affecting the dependent variable.
If this were true all the points of price-quantity pairs, if plotted on a two-
dimensional plane, would fall on a straight line. However, if we gather
observations on the quantity actually supplied in the market at various prices and
we plot them on a diagram we see that they do not fall on a straight line.
The deviation of the observations from the line may be attributed to several factors:
a. Omission of variables from the function
b. Random behavior of human beings
c. Imperfect specification of the mathematical form of the model
d. Error of aggregation
e. Error of measurement
Thus a stochastic model is a model in which the dependent variable is not only
determined by the explanatory variable(s) included in the model but also by others
which are not included in the model.
2.2. Simple Linear Regression model.
The above stochastic relationship (2.2), Yi = α + βXi + ui, with one explanatory variable is called a simple linear regression model.
The true relationship which connects the variables involved is split into two parts: a part represented by a line and a part represented by the random term ‘u'.
The scatter of observations represents the true relationship between Y and X. The
line represents the exact part of the relationship and the deviation of the
observation from the line represents the random component of the relationship.
- Were it not for the errors in the model, we would observe all the points on the line Y′1, Y′2, …, Y′n corresponding to X1, X2, …, Xn. However, because of the random disturbances, the observed values deviate from the line. Each observation can be decomposed as
Yi = (α + βXi) + ui
where Yi is the dependent variable, (α + βXi) is the regression line and ui is the random variable.
- The first component in the bracket is the part of Y explained by the changes in X and the second is the part of Y not explained by X; that is to say, the change in Y is due to the random influence of ui.
The classical econometricians made important assumptions in their analysis of regression. The most important of these assumptions are discussed below.
Dear distance students! Check yourself whether the following models satisfy the
above assumption and give your answer to your tutor.
a. ln Yi² = α + β ln Xi² + Ui
b. Yi = α + βXi + Ui
This means that the value which u may assume in any one period depends on
chance; it may be positive, negative or zero. Every value has a certain probability
of being assumed by u in any particular instance.
For all values of X, the u's will show the same dispersion around their mean. In Fig. 2.c this assumption is denoted by the fact that the values that u can assume lie within the same limits, irrespective of the value of X. For X1, u can assume any value within the range AB; for X2, u can assume any value within the same range. Mathematically;
Var(Ui) = E[Ui − E(Ui)]² = E(Ui²) = σ²   (since E(Ui) = 0).
This constant variance is called homoscedasticity.
E[(Xi − E(Xi))(Ui − E(Ui))] = E[Xi(Ui − E(Ui))]   , given E(Ui) = 0
= E(XiUi) − E(Xi)E(Ui)
= E(XiUi)
= Xi E(Ui) = 0
8. The explanatory variables are measured without error
- U absorbs the influence of omitted variables and possibly errors of
measurement in the y's. i.e., we will assume that the regressors are error
free, while y values may or may not include errors of measurement.
Dear students! We can now use the above assumptions to derive the following
basic concepts.
Proof:
Mean: E(Yi) = E(α + βXi + ui) = α + βXi   , since E(ui) = 0
Variance: Var(Yi) = E[Yi − E(Yi)]² = E[α + βXi + ui − (α + βXi)]² = E(ui²) = σ²   (since E(ui²) = σ²)
var(Yi) = σ² ……………………(2.8)
The xi are a set of fixed values by assumption 5 and therefore do not affect the shape of the distribution of Yi. Hence
Yi ~ N(α + βXi, σ²)
Proof:
Cov(Yi, Yj) = E{[Yi − E(Yi)][Yj − E(Yj)]}
            = E(UiUj) = 0   (since Yi = α + βXi + Ui and Yj = α + βXj + Uj, and E(UiUj) = 0)
Therefore, Cov(Yi, Yj) = 0.
The parameters α and β are called the true parameters since they are estimated from the population values of Y and X. But it is difficult to obtain the population values of Y and X because of technical or economic reasons, so we are forced to take sample values of Y and X. The parameters estimated from the sample values of Y and X are called the estimators of the true parameters α and β and are symbolized as α̂ and β̂.
The estimates are chosen so as to minimize the sum of squared residuals,
Σei² = Σ(Yi − α̂ − β̂Xi)² ……………………(2.7)
To find the values of α̂ and β̂ that minimize this sum, we partially differentiate Σei² with respect to α̂ and β̂ and set the partial derivatives equal to zero.
1.  ∂Σei²/∂α̂ = −2Σ(Yi − α̂ − β̂Xi) = 0 ……………………(2.8)
Rearranging this expression we will get:
ΣYi = nα̂ + β̂ΣXi ……………………(2.9)
If you divide (2.9) by ‘n' and rearrange, we get
α̂ = Ȳ − β̂X̄ ……………………(2.10)
2.  ∂Σei²/∂β̂ = −2ΣXi(Yi − α̂ − β̂Xi) = 0 ……………………(2.11)
Note at this point that the term in parentheses in equations (2.8) and (2.11) is the residual, ei = Yi − α̂ − β̂Xi. Hence it is possible to rewrite (2.8) and (2.11) as −2Σei = 0 and −2ΣXiei = 0. It follows that;
Σei = 0 and ΣXiei = 0 ……………………(2.12)
Rearranging (2.11) gives the second normal equation
ΣYiXi = α̂ΣXi + β̂ΣXi² ……………………(2.13)
Equations (2.9) and (2.13) are called the Normal Equations. Substituting the value of α̂ from (2.10) into (2.13), we get:
ΣYiXi = ΣXi(Ȳ − β̂X̄) + β̂ΣXi²
      = ȲΣXi − β̂X̄ΣXi + β̂ΣXi²
ΣYiXi − ȲΣXi = β̂(ΣXi² − X̄ΣXi)
ΣXiYi − nX̄Ȳ = β̂(ΣXi² − nX̄²)
β̂ = (ΣXiYi − nX̄Ȳ)/(ΣXi² − nX̄²) ……………………(2.14)
Equation (2.14) can be rewritten in terms of deviations from the means. Note that
Σ(X − X̄)(Y − Ȳ) = ΣXY − nX̄Ȳ  and
Σ(X − X̄)² = ΣX² − nX̄² ……………………(2.16)
Substituting these in (2.14), we get
β̂ = Σ(X − X̄)(Y − Ȳ)/Σ(X − X̄)²
Now, denoting (Xi − X̄) as xi and (Yi − Ȳ) as yi, we get;
β̂ = Σxiyi/Σxi² ……………………(2.17)
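As a quick check of formulas (2.10) and (2.17), here is a minimal Python sketch that computes β̂ = Σxy/Σx² and α̂ = Ȳ − β̂X̄; the data vectors are hypothetical, used only for illustration.

import numpy as np

# hypothetical sample on X and Y
X = np.array([1., 2., 3., 4., 5., 6.])
Y = np.array([2., 4., 5., 8., 9., 12.])

x = X - X.mean()          # deviations from the mean
y = Y - Y.mean()

beta_hat  = (x * y).sum() / (x * x).sum()    # equation (2.17)
alpha_hat = Y.mean() - beta_hat * X.mean()   # equation (2.10)

print(alpha_hat, beta_hat)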
Subject to: α̂ = 0
The composite function then becomes
Z = Σ(Yi − α̂ − β̂Xi)² − λα̂ ,  where λ is a Lagrange multiplier.
Minimizing Z and solving gives the estimator of the regression through the origin:
β̂ = ΣXiYi/ΣXi² ……………………(2.18)
This formula involves the actual values (observations) of the variables and not their deviations from their means, as in equation (2.17).
To compare estimators obtained by different methods, we take repeated samples of size ‘n'; we compute the estimates β̂ from each sample, and for each econometric method we form their sampling distribution. We next compare the means (expected values) and the variances of these distributions and we choose among the alternative estimates the one whose distribution is concentrated as close as possible around the population parameter.
According to this theorem, under the basic assumptions of the classical linear regression model, the least squares estimators are linear, unbiased and have minimum variance (i.e. are best of all linear unbiased estimators). Sometimes the theorem is referred to as the BLUE theorem, i.e. Best Linear Unbiased Estimator. An estimator is called BLUE if it is:
a. Linear: a linear function of a random variable, such as the dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: it has minimum variance in the class of linear and unbiased estimators. An unbiased estimator with the least variance is known as an efficient estimator.
According to the Gauss-Markov theorem, the OLS estimators possess all the BLUE properties. The detailed proofs of these properties are presented below. Dear colleague, let us prove these properties one by one.
a. Linearity: (for β̂)
Proposition: α̂ and β̂ are linear in Y.
Proof: From (2.17),
β̂ = Σxiyi/Σxi² = Σxi(Y − Ȳ)/Σxi² = (ΣxiY − ȲΣxi)/Σxi²
(but Σxi = Σ(X − X̄) = ΣX − nX̄ = nX̄ − nX̄ = 0)
β̂ = ΣxiY/Σxi² ;  Now, let xi/Σxi² = Ki  (i = 1, 2, …, n)
β̂ = ΣKiY ……………………(2.19)
⇒ β̂ is linear in Y.
Similarly, it can be shown that α̂ = Σ(1/n − X̄Ki)Yi, which establishes the linear relationship between α̂ and Y.
b. Unbiasedness:
Proposition: α̂ and β̂ are the unbiased estimators of the true parameters α and β.
From your statistics course, you may recall that if θ̂ is an estimator of θ, then E(θ̂) − θ is the amount of bias, and if θ̂ is an unbiased estimator of θ then the bias = 0.
In our case, α̂ and β̂ are estimators of the true parameters α and β. To show that they are the unbiased estimators of their respective parameters means to prove that:
E(β̂) = β  and  E(α̂) = α
We know that β̂ = ΣkiYi = Σki(α + βXi + Ui) = αΣki + βΣkiXi + ΣkiUi,
but Σki = 0 and ΣkiXi = 1, because:
Σki = Σxi/Σxi² = Σ(X − X̄)/Σxi² = (ΣX − nX̄)/Σxi² = (nX̄ − nX̄)/Σxi² = 0
⇒ Σki = 0 ……………………(2.20)
ΣkiXi = ΣxiXi/Σxi² = Σ(X − X̄)Xi/Σxi² = (ΣX² − X̄ΣX)/Σxi² = (ΣX² − nX̄²)/(ΣX² − nX̄²) = 1
⇒ ΣkiXi = 1 ……………………(2.21)
Therefore, β̂ = β + Σkiui ……………………(2.22)
and, since the ki are fixed and E(ui) = 0, E(β̂) = β, i.e. β̂ is an unbiased estimator of β.
From the proof of the linearity property under 2.2.2.3 (a), we know that:
α̂ = Σ(1/n − X̄ki)Yi
  = Σ(1/n − X̄ki)(α + βXi + ui)
  = α + βX̄ + (1/n)Σui − αX̄Σki − βX̄ΣkiXi − X̄Σkiui
  = α + (1/n)Σui − X̄Σkiui   (using Σki = 0 and ΣkiXi = 1)
⇒ α̂ − α = Σ(1/n − X̄ki)ui ……………………(2.23)
E(α̂) = α ……………………(2.24)
⇒ α̂ is an unbiased estimator of α.
To show that the OLS estimators possess the minimum variance property, we first obtain the variances of α̂ and β̂ and then establish that each has the minimum variance in comparison with the variances of other linear and unbiased estimators obtained by any econometric method other than OLS.
a. Variance of β̂
var(β̂) = E[β̂ − E(β̂)]² = E(β̂ − β)² ……………………(2.25)
Substituting (2.22) in (2.25) we get
var(β̂) = E(Σkiui)²
       = E[k1²u1² + k2²u2² + … + kn²un² + 2k1k2u1u2 + … + 2kn-1knun-1un]
       = Σki²E(ui²) + 2Σkikj E(uiuj) , i ≠ j
       = σ²Σki²   (since E(uiuj) = 0)
Since ki = xi/Σxi², we have ki² = xi²/(Σxi²)², so Σki² = Σxi²/(Σxi²)² = 1/Σxi². Hence
var(β̂) = σ²Σki² = σ²/Σxi² ……………………(2.26)
b. Variance of α̂
var(α̂) = E[α̂ − E(α̂)]² = E(α̂ − α)² ……………………(2.27)
Substituting (2.23) in (2.27), we get
var(α̂) = E[Σ(1/n − X̄ki)ui]²
       = Σ(1/n − X̄ki)² E(ui²)
       = σ²Σ(1/n − X̄ki)²
       = σ²Σ(1/n² − (2/n)X̄ki + X̄²ki²)
       = σ²(1/n − (2/n)X̄Σki + X̄²Σki²)
       = σ²(1/n + X̄²/Σxi²)   , since Σki = 0 and Σki² = 1/Σxi²
Again:
1/n + X̄²/Σxi² = (Σxi² + nX̄²)/(nΣxi²) = ΣXi²/(nΣxi²)
Therefore,
var(α̂) = σ²(1/n + X̄²/Σxi²) = σ²ΣXi²/(nΣxi²) ……………………(2.28)
Dear student! We have computed the variances of the OLS estimators. Now, it is time to check whether these OLS estimators possess the minimum variance property compared to the variances of other estimators of the true α and β, other than α̂ and β̂.
To establish that α̂ and β̂ possess the minimum variance property, we compare their variances with those of some other alternative linear and unbiased estimators of α and β, say α* and β*. Now, we want to prove that any other linear and unbiased estimator of the true population parameter obtained from any other econometric method has a larger variance than the OLS estimators.
Let us first show the minimum variance of β̂ and then that of α̂.
1. Minimum variance of β̂
Suppose β* is an alternative linear and unbiased estimator of β, and let
β* = ΣwiYi ……………………(2.29)
where the wi are arbitrary weights with wi ≠ ki; write wi = ki + ci, where the ci are arbitrary constants.
β* = Σwi(α + βXi + ui)   since Yi = α + βXi + Ui
   = αΣwi + βΣwiXi + Σwiui
Unbiasedness of β* requires Σwi = 0 and ΣwiXi = 1.
Therefore, Σci = Σwi − Σki = 0, since Σki = Σwi = 0.
Again, ΣwiXi = ΣkiXi + ΣciXi = 1, and since ΣkiXi = 1, it follows that ΣciXi = 0.
Since Σcixi = Σci(Xi − X̄) = ΣciXi − X̄Σci, it also follows that Σcixi = 0.
Thus, from the above calculations we can summarize the following results:
Σwi = 0,  ΣwiXi = 1,  Σci = 0,  Σcixi = 0
To prove whether β̂ has minimum variance or not, let us compute var(β*) and compare it with var(β̂).
var(β*) = var(ΣwiYi) = Σwi² var(Yi) = σ²Σwi²
        = σ²Σ(ki + ci)² = σ²(Σki² + Σci² + 2Σkici)
But Σkici = Σcixi/Σxi² = 0, so
var(β*) = σ²Σki² + σ²Σci² = var(β̂) + σ²Σci²
Given that ci is an arbitrary constant, σ²Σci² is positive, i.e. it is greater than zero. Thus var(β*) > var(β̂). This proves that β̂ possesses the minimum variance property. In a similar way we can prove that the least squares estimate of the constant intercept (α̂) possesses minimum variance.
2. Minimum Variance of α̂
We take a new estimator α*, which we assume to be a linear and unbiased estimator of α. The least squares estimator α̂ is given by:
α̂ = Σ(1/n − X̄ki)Yi
By analogy with the proof of the minimum variance property of β̂, let's use the weights wi = ki + ci. Consequently;
α* = Σ(1/n − X̄wi)Yi
Since we want α* to be an unbiased estimator of the true α, that is, E(α*) = α, we substitute Yi = α + βXi + ui in α* and find the expected value of α*:
α* = Σ(1/n − X̄wi)(α + βXi + ui)
   = Σ(α/n + βXi/n + ui/n − αX̄wi − βX̄wiXi − X̄wiui)
   = α + βX̄ + Σui/n − αX̄Σwi − βX̄ΣwiXi − X̄Σwiui
For α* to be unbiased we must again have Σwi = 0 and ΣwiXi = 1; then E(α*) = α.
var(α*) = var[Σ(1/n − X̄wi)Yi]
        = Σ(1/n − X̄wi)² var(Yi)
        = σ²Σ(1/n − X̄wi)²
        = σ²Σ(1/n² + X̄²wi² − (2/n)X̄wi)
        = σ²(1/n + X̄²Σwi²)   , since Σwi = 0
but Σwi² = Σki² + Σci², so
var(α*) = σ²(1/n + X̄²Σki² + X̄²Σci²)
        = σ²(1/n + X̄²/Σxi²) + σ²X̄²Σci²
        = var(α̂) + σ²X̄²Σci²
⇒ var(α*) > var(α̂)
Therefore, we have proved that the least squares estimators of the linear regression model are best, linear and unbiased (BLU) estimators.
To use σ̂² in the expressions for the variances of α̂ and β̂, we have to prove whether σ̂² = Σei²/(n − 2) is an unbiased estimator of σ², i.e. whether E(σ̂²) = E[Σei²/(n − 2)] = σ². To do so we need the expressions for Y, Ŷ, y, ŷ and ei.
Proof:
Yi = α̂ + β̂Xi + ei   and   Ŷi = α̂ + β̂Xi
⇒ Yi = Ŷi + ei ……………………(2.31)
⇒ ei = Yi − Ŷi ……………………(2.32)
Summing (2.31) over the sample values and dividing both sides by ‘n' gives (since Σei = 0)
Ȳ = Ŷ̄ ……………………(2.33)
Subtracting (2.33) from (2.31):
(Yi − Ȳ) = (Ŷi − Ŷ̄) + ei
yi = ŷi + ei ……………………(2.34)
From (2.34):
ei = yi − ŷi ……………………(2.35)
From:
Yi = α + βXi + Ui
Ȳ = α + βX̄ + Ū
We get, by subtraction
yi = (Yi − Ȳ) = β(Xi − X̄) + (Ui − Ū) = βxi + (Ui − Ū)
yi = βxi + (Ui − Ū) ……………………(2.36)
Note that we assumed earlier that E(u) = 0, i.e. in taking a very large number of samples we expect U to have a mean value of zero, but in any particular single sample Ū is not necessarily zero.
Similarly, from:
Ŷi = α̂ + β̂Xi
Ȳ = α̂ + β̂X̄
We get, by subtraction
Ŷi − Ȳ = β̂(Xi − X̄)
ŷi = β̂xi ……………………(2.37)
From (2.35), (2.36) and (2.37):
ei = yi − ŷi = βxi + (ui − ū) − β̂xi = (ui − ū) − (β̂ − β)xi
The summation over the n sample values of the squares of the residuals yields:
Σei² = Σ[(ui − ū) − (β̂ − β)xi]²
     = Σ(ui − ū)² + (β̂ − β)²Σxi² − 2(β̂ − β)Σxi(ui − ū)
Taking expected values we have:
E(Σei²) = E[Σ(ui − ū)²] + E[(β̂ − β)²]Σxi² − 2E[(β̂ − β)Σxi(ui − ū)] ……………………(2.38)
The right-hand side terms of (2.38) may be evaluated as follows.
a. E[Σ(ui − ū)²] = E(Σui² − nū²)
   = ΣE(ui²) − nE(ū²)
   = nσu² − n·E[(1/n)Σui]²
   = nσu² − (1/n)E(u1 + u2 + … + un)²   , since E(ui²) = σu²
   = nσu² − (1/n)E(Σui² + 2Σi≠j uiuj)
   = nσu² − (1/n)(nσu² + 2ΣE(uiuj))
   = nσu² − σu²   (given E(uiuj) = 0)
E[Σ(ui − ū)²] = σu²(n − 1) ……………………(2.39)
b. E[(β̂ − β)²Σxi²] = Σxi²·E(β̂ − β)²
Given that the X's are fixed in all samples and we know that E(β̂ − β)² = var(β̂) = σu²/Σxi²,
Σxi²·E(β̂ − β)² = Σxi²·σu²/Σxi² = σu² ……………………(2.40)
c. −2E[(β̂ − β)Σxi(ui − ū)] = −2E[(β̂ − β)(Σxiui − ūΣxi)]
   = −2E[(β̂ − β)Σxiui]   (since Σxi = 0)
Since β̂ − β = Σkiui = Σxiui/Σxi², we will get:
−2E[(β̂ − β)Σxiui] = −2E[(Σxiui)²/Σxi²]
   = −(2/Σxi²)E[Σxi²ui² + 2Σi≠j xixjuiuj]
   = −(2/Σxi²)[Σxi²E(ui²) + 2Σi≠j xixjE(uiuj)]
   = −(2/Σxi²)σu²Σxi²   (given E(uiuj) = 0)
   = −2σu² ……………………(2.41)
Consequently, equation (2.38) can be written in terms of (2.39), (2.40) and (2.41) as follows:
E(Σei²) = (n − 1)σu² + σu² − 2σu² = (n − 2)σu² ……………………(2.42)
From this result,
E[Σei²/(n − 2)] = σu²
Thus, σ̂² = Σei²/(n − 2) is an unbiased estimate of the true variance of the error term (σ²).
Dear student! The conclusion that we can draw from the above proof is that we can substitute σ̂² = Σei²/(n − 2) for σ² in the variance expressions of α̂ and β̂, since E(σ̂²) = σ². Hence the formulas for the estimated variances become:
Var(β̂) = σ̂²/Σxi² = Σei²/[(n − 2)Σxi²] ……………………(2.44)
Var(α̂) = σ̂²ΣXi²/(nΣxi²) = ΣXi²Σei²/[n(n − 2)Σxi²] ……………………(2.45)
Note: Σei² can be computed as Σei² = Σyi² − β̂Σxiyi.
Dear Student! Do not worry about the derivation of this expression; we will perform the derivation in a subsequent subtopic.
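Continuing the earlier sketch, the unbiased estimate σ̂² = Σe²/(n − 2) and the standard errors implied by (2.44) and (2.45) can be computed as follows; the data are again hypothetical.

import numpy as np

X = np.array([1., 2., 3., 4., 5., 6.])
Y = np.array([2., 4., 5., 8., 9., 12.])
n = len(Y)

x = X - X.mean()
y = Y - Y.mean()
beta_hat  = (x * y).sum() / (x * x).sum()
alpha_hat = Y.mean() - beta_hat * X.mean()

e         = Y - (alpha_hat + beta_hat * X)                    # residuals
sigma2    = (e ** 2).sum() / (n - 2)                          # unbiased estimate of sigma^2
var_beta  = sigma2 / (x ** 2).sum()                           # equation (2.44)
var_alpha = sigma2 * (X ** 2).sum() / (n * (x ** 2).sum())    # equation (2.45)

print(np.sqrt(var_beta), np.sqrt(var_alpha))                  # SE(beta_hat), SE(alpha_hat)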
Figure ‘d'. Actual and estimated values of the dependent variable Y. (The figure plots Y against X, showing the fitted line Ŷ = α̂ + β̂X, the sample mean Ȳ and, for a typical observation, the decomposition Yi − Ȳ = (Ŷi − Ȳ) + ei.)
As can be seen from fig. (d) above, Yi − Ȳ measures the variation of the sample observation value of the dependent variable around the mean. However, the variation in Y that can be attributed to the influence of X (i.e. the regression line) is given by the vertical distance Ŷi − Ȳ. The part of the total variation in Y about Ȳ that is left unexplained by the regression line is the residual ei = Yi − Ŷi.
Now, we may write the observed Y as the sum of the predicted value (Ŷi) and the residual term (ei):
Yi = Ŷi + ei   (observed Yi = predicted Yi + residual)
From equation (2.34) the same relation holds in deviation form: yi = ŷi + ei. Squaring and summing both sides, we obtain the following expression:
Σyi² = Σ(ŷi + ei)² = Σŷi² + Σei² + 2Σŷiei
But Σŷiei = Σβ̂xiei = β̂Σxiei = 0   (since Σei = 0 and Σxiei = 0)
⇒ Σŷiei = 0 ……………………(2.46)
Therefore;
Σyi² = Σŷi² + Σei² ……………………(2.47)
(Total variation = Explained variation + Unexplained variation)
OR,
i.e.
Σŷ² = β̂²Σx² ……………………(2.50)
so that R² = Σŷ²/Σy² = β̂²Σx²/Σy². Since β̂ = Σxy/Σx², substituting gives
R² = (Σxy/Σx²)²·(Σx²/Σy²)
R² = (Σxy·Σxy)/(Σx²·Σy²) ……………………(2.52)
The limits of R²: the value of R² falls between zero and one, i.e. 0 ≤ R² ≤ 1.
Interpretation of R²
Suppose R² = 0.9; this means that the regression line gives a good fit to the observed data since this line explains 90% of the total variation of the Y values around their mean. The remaining 10% of the total variation in Y is unaccounted for by the regression line and is attributed to the factors included in the disturbance variable ui.
Let r²yŷ be the square of the correlation coefficient between Y and Ŷ; it is given by:
r²yŷ = (Σyŷ)²/(Σy²·Σŷ²)
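The two expressions for R² above give the same number; the following minimal sketch (hypothetical data again) verifies this numerically.

import numpy as np

X = np.array([1., 2., 3., 4., 5., 6.])
Y = np.array([2., 4., 5., 8., 9., 12.])

x, y = X - X.mean(), Y - Y.mean()
beta_hat = (x * y).sum() / (x * x).sum()

R2_explained = beta_hat**2 * (x**2).sum() / (y**2).sum()          # ESS / TSS
R2_corr      = (x*y).sum()**2 / ((x**2).sum() * (y**2).sum())     # (Σxy)² / (Σx²Σy²)

print(round(R2_explained, 10) == round(R2_corr, 10))              # True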
var(β̂) = σ̂²/Σx²
var(α̂) = σ̂²ΣX²/(nΣx²)
σ̂² = Σe²/(n − 2) = RSS/(n − 2)
For the purpose of estimating the parameters the assumption of normality is not used, but we use this assumption to test the significance of the parameter estimators, because the testing methods and procedures are based on the assumption of normality of the disturbance term. Hence, before we discuss the various testing methods, it is important to see whether the parameter estimators are normally distributed or not.
We have already assumed that the error term is normally distributed with mean zero and variance σ², i.e. Ui ~ N(0, σ²). Similarly, we also proved that Yi ~ N(α + βXi, σ²). Now, we want to show the following:
1. β̂ ~ N(β, σ²/Σx²)
2. α̂ ~ N(α, σ²ΣX²/(nΣx²))
The OLS estimates α̂ and β̂ are obtained from a sample of observations on Y and X. Since sampling errors are inevitable in all estimates, it is necessary to apply tests of significance in order to measure the size of the error and determine the degree of confidence in the validity of these estimates. This can be done by using various tests. The most common ones are:
i) Standard error test ii) Student’s t-test iii) Confidence interval
All of these testing procedures reach on the same conclusion. Let us now see these
testing methods one by one.
i) Standard error test
This test helps us decide whether the estimates α̂ and β̂ are significantly different from zero, i.e. whether the sample from which they have been estimated might have come from a population whose true parameters are zero: α = 0 and/or β = 0.
Formally, we test the null hypothesis H0: βi = 0 against the alternative hypothesis H1: βi ≠ 0.
The standard error test may be outlined as follows.
First: Compute the standard errors of the parameters.
SE(β̂) = √var(β̂)
SE(α̂) = √var(α̂)
Decision rule:
If SE(β̂i) > ½|β̂i|, accept the null hypothesis and reject the alternative hypothesis; we conclude that β̂i is statistically insignificant.
If SE(β̂i) < ½|β̂i|, reject the null hypothesis and accept the alternative hypothesis; we conclude that β̂i is statistically significant.
The acceptance or rejection of the null hypothesis has a definite economic meaning. Namely, the acceptance of the null hypothesis β = 0 (the slope parameter is zero) implies that the explanatory variable to which this estimate relates does not in fact influence the dependent variable Y and should not be included in the function, since the conducted test provided evidence that changes in X leave Y unaffected. In other words, acceptance of H0 implies that the relationship between Y and X is in fact Y = α + (0)X, i.e. there is no relationship between X and Y.
Numerical example: Suppose that from a sample of size n = 30, we estimate the following supply function:
Q = 120 + 0.6P + ei
SE:   (1.7)   (0.025)
Test the significance of the slope parameter at the 5% level of significance using the standard error test.
SE(β̂) = 0.025
β̂ = 0.6, so ½β̂ = 0.3
Since SE(β̂) = 0.025 < 0.3 = ½β̂, we reject the null hypothesis: the slope parameter is statistically significant at the 5% level of significance.
Note: The standard error test is an approximated test (which is approximated from
the z-test and t-test) and implies a two tail test conducted at 5% level of
significance.
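A small Python sketch of the standard error test applied to the supply-function example above (β̂ = 0.6, SE(β̂) = 0.025):

beta_hat = 0.6
se_beta  = 0.025

# standard error test: compare SE(beta_hat) with half of |beta_hat|
if se_beta < 0.5 * abs(beta_hat):
    print("reject H0: the slope is statistically significant")
else:
    print("accept H0: the slope is statistically insignificant")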
ii) Student's t-test
Like the standard error test, this test is also important for testing the significance of the parameters. From your statistics course, recall that any (normally distributed) variable X can be transformed into t using:
t = (X̄ − μ)/s_X̄ ,  with n − 1 degrees of freedom,
where
 μ = value of the population mean
 s_X̄ = sample estimate of the standard deviation of X̄, based on s = √[Σ(X − X̄)²/(n − 1)]
 n = sample size
We can derive the t-values of the OLS estimates in the same way:
t_β̂ = (β̂ − β)/SE(β̂)
t_α̂ = (α̂ − α)/SE(α̂)
with n − k degrees of freedom, where SE is the standard error and k is the number of parameters in the model.
Since we have two parameters in simple linear regression with an intercept different from zero, our degrees of freedom are n − 2. Like the standard error test, we formally test the hypothesis H0: βi = 0 against the alternative H1: βi ≠ 0 for the slope parameter, and H0: α = 0 against the alternative H1: α ≠ 0 for the intercept.
Since the alternative hypothesis is of the form βi ≠ 0, this is a two-tail test: if the level of significance is 5%, divide it by two to obtain the critical value of t from the t-table. (With a one-sided alternative we would instead use the whole significance level in one tail.)
Step 5: Compare t* (the computed value of t) and tc (critical value of t)
If t*> tc , reject H0 and accept H1. The conclusion is ˆ is statistically
significant.
If t*< tc , accept H0 and reject H1. The conclusion is ˆ is statistically
insignificant.
Numerical Example:
Suppose that from a sample of size n = 20 we estimate the following consumption function:
C = 100 + 0.70Y + e
     (75.5)  (0.21)
The values in the brackets are standard errors. We want to test the null hypothesis H0: βi = 0 against the alternative H1: βi ≠ 0 using the t-test at the 5% level of significance.
a. The t-value for the test statistic is:
t* = (β̂ − 0)/SE(β̂) = β̂/SE(β̂) = 0.70/0.21 ≈ 3.3
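The same t-test in Python for the consumption-function example (β̂ = 0.70, SE = 0.21, n = 20); scipy is used here only to look up the critical value, which could equally be read from a t-table.

from scipy import stats

beta_hat, se_beta, n, k = 0.70, 0.21, 20, 2
t_star = beta_hat / se_beta                  # computed t-value, about 3.3
t_crit = stats.t.ppf(1 - 0.05 / 2, n - k)    # two-tail critical value at 5%, 18 df (about 2.10)

print(t_star, t_crit, abs(t_star) > t_crit)  # True: beta_hat is statistically significant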
b. Since the computed t* = 3.3 exceeds the critical value of t at the 5% level with 18 degrees of freedom (about 2.10), we reject H0 and conclude that β̂ is statistically significant. Note, however, that rejecting H0 does not mean that α̂ or β̂ is the correct estimate of the true population parameter α or β. It simply means that our estimate comes from a sample drawn from a population whose parameter is different from zero.
In order to define how close the estimate is to the true parameter, we must construct a confidence interval for the true parameter; in other words, we must establish limiting values around the estimate within which the true parameter is expected to lie with a certain “degree of confidence”. In this respect we say that with a given probability the population parameter will be within the defined confidence interval (confidence limits).
A 95% confidence interval, if constructed for each possible sample, would include the true population parameter in 95% of the cases. In the other 5% of the cases the population parameter will fall outside the confidence interval.
In a two-tail test at the α level of significance, the probability of obtaining the specific t-value of either −tc or tc is α/2 at n − 2 degrees of freedom. The probability of obtaining a value of t between −tc and tc is therefore 1 − α, i.e.
Pr(−tc < t* < tc) = 1 − α ,
but
t* = (β̂ − β)/SE(β̂) ……………………(2.58)
Substituting and rearranging, the limits within which the true β lies at a (1 − α)% degree of confidence are:
β̂ ± tc·SE(β̂)
where tc is the critical value of t at the α/2 level of significance and n − 2 degrees of freedom.
Decision rule (for H0: β = 0 against H1: β ≠ 0): If the hypothesized value of β in the null hypothesis is within the confidence limits, accept H0 and reject H1; if the hypothesized value of β in the null hypothesis is outside the limits, reject H0 and accept H1. This indicates that β̂ is statistically significant.
Numerical Example:
Suppose we have estimated the following regression line from a sample of 20 observations:
Y = 128.5 + 2.88X + e
      (38.2)  (0.85)
The values in the brackets are standard errors. The 95% confidence interval for the slope is
β̂ ± SE(β̂)·tc, with β̂ = 2.88, SE(β̂) = 0.85 and tc ≈ 2.10 at 18 degrees of freedom, i.e. 2.88 ± 1.79, or roughly (1.09, 4.67).
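A short sketch computing this 95% confidence interval in Python (β̂ = 2.88, SE = 0.85, n = 20):

from scipy import stats

beta_hat, se_beta, n = 2.88, 0.85, 20
t_crit = stats.t.ppf(0.975, n - 2)   # about 2.10 at 18 degrees of freedom

lower = beta_hat - t_crit * se_beta
upper = beta_hat + t_crit * se_beta
print(lower, upper)                  # roughly (1.09, 4.67); zero lies outside, so beta is significant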
regression coefficients together with their standard errors and the value of R2. It
has become customary to present the estimated equations with standard errors
placed in parenthesis below the estimated parameter values. Sometimes, the
estimated coefficients, the corresponding standard errors, the p-values, and some
other indicators are presented in tabular form.
These results are usually supplemented by R², written to the right of the regression equation.
Example:  Y = 128.5 + 2.88X ,  R² = 0.93
                (38.2)   (0.85)
The numbers in the parentheses below the parameter estimates are the standard errors. Some econometricians report the t-values of the estimated coefficients in place of the standard errors.
Review Questions
1. Econometrics deals with the measurement of economic relationships which are stochastic
or random. The simplest form of economic relationships between two variables X and Y
can be represented by:
3. The following data refers to the price of a good ‘P' and the quantity of the good supplied,
‘S'.
P 2 7 5 1 4 8 2 8
S 15 41 32 9 28 43 17 40
a. Estimate the linear regression line S = α + βP.
4. For data on sales (Y) and price (X), the following sums were computed: ΣXiYi = 1,296,836 and ΣYi² = 539,512.
i) Estimate the regression line of sale on price and interpret the results
ii) What is the part of the variation in sales which is not explained by the
regression line?
iii) Estimate the price elasticity of sales.
5. The following table gives the GNP (X) and the demand for food (Y) for a country over a ten-year period.
year 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
Y 6 7 8 10 8 9 10 9 11 10
X 50 52 55 59 57 58 62 65 68 70
a. Estimate the food function
b. Compute the coefficient of determination and find the explained and unexplained
variation in the food expenditure.
c. Compute the standard error of the regression coefficients and conduct test of
significance at the 5% level of significance.
Y
Y Y
2
i
21.9 86.9
i X 215.4
2
i i
X 186.2
X
X i X Yi Y 106.4
a. Estimate and
b. Calculate the variance of our estimates
c. Estimate the conditional mean of Y corresponding to a value of X fixed at X=10.
7. Suppose that a researcher estimates a consumption function and obtains the following results:
C = 15 + 0.81Yd      n = 19
    (3.1)  (18.7)    R² = 0.99
where C = consumption, Yd = disposable income, and the numbers in the parentheses are the ‘t-ratios'.
a. Test the significance of Yd statistically using the t-ratios
b. Determine the estimated standard deviations of the parameter estimates
8. State and prove the Gauss-Markov theorem.
9. Given the model:
Yi = β0 + β1Xi + Ui
with the usual OLS assumptions, derive the expression for the error variance.
Chapter Three
THE CLASSICAL REGRESSION ANALYSIS
[The Multiple Linear Regression Model]
3.1 Introduction
In simple regression we study the relationship between a dependent variable and a
single explanatory (independent variable). But it is rarely the case that economic
relationships involve just two variables. Rather a dependent variable Y can
depend on a whole series of explanatory variables or regressors. For instance, in
demand studies we study the relationship between quantity demanded of a good
and price of the good, price of substitute goods and the consumer's income. The
model we assume is:
Yi = β0 + β1P1 + β2P2 + β3Xi + ui --------------------------------(3.1)
where Xki (k = 1, 2, 3, …, K) are explanatory variables, Yi is the dependent variable, βj (j = 0, 1, 2, …, K) are unknown parameters and ui is the disturbance term. The disturbance term is of a similar nature to that in simple regression, reflecting:
- the basic random nature of human responses
- errors of aggregation
- errors of measurement
- errors in specification of the mathematical form of the model
The error terms are assumed to be uncorrelated across observations, i.e.
E(uiuj) = 0  for i ≠ j
We can't exhaustively list all the assumptions, but the above assumptions are some of the basic assumptions that enable us to proceed with our analysis.
The model:
Y = β0 + β1X1 + β2X2 + Ui ……………………(3.3)
is a multiple regression with two explanatory variables. The expected value of the above model is called the population regression equation, i.e.
E(Y) = β0 + β1X1 + β2X2 ,  since E(Ui) = 0 ……………………(3.4)
where the βi are the population parameters. β0 is referred to as the intercept, and β1 and β2 are also sometimes known as the regression slopes of the regression. Note that β2, for example, measures the effect on E(Y) of a unit change in X2 when X1 is held constant.
The sample counterpart of (3.4) is
Ŷ = β̂0 + β̂1X1 + β̂2X2 ……………………(3.5)
where the β̂j are estimates of the βj and Ŷ is known as the predicted value of Y.
Writing the sample relationship with the residual term,
Yi = β̂0 + β̂1X1i + β̂2X2i + ei ……………………(3.6)
ei = Yi − Ŷi = Yi − β̂0 − β̂1X1i − β̂2X2i ……………………(3.7)
To obtain the OLS estimators we minimize Σei²: we partially differentiate Σei² with respect to β̂0, β̂1 and β̂2 and set the partial derivatives equal to zero.
∂Σei²/∂β̂0 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 ……………………(3.8)
∂Σei²/∂β̂1 = −2ΣX1i(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 ……………………(3.9)
∂Σei²/∂β̂2 = −2ΣX2i(Yi − β̂0 − β̂1X1i − β̂2X2i) = 0 ……………………(3.10)
Rearranging gives the normal equations:
ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i ……………………(3.11)
ΣX1iYi = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i ……………………(3.12)
ΣX2iYi = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i² ……………………(3.13)
From (3.11) we obtain β̂0:
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2 ……………………(3.14)
Substituting (3.14) into (3.12) gives
ΣX1iYi = (Ȳ − β̂1X̄1 − β̂2X̄2)ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i
ΣX1iYi − nX̄1Ȳ = β̂1(ΣX1i² − nX̄1²) + β̂2(ΣX1iX2i − nX̄1X̄2) ……………………(3.15)
We know that
ΣXiYi − nX̄iȲ = Σ(Xi − X̄i)(Yi − Ȳ) = Σxiyi
ΣXi² − nX̄i² = Σ(Xi − X̄i)² = Σxi²
Hence the normal equation (3.15) can be written in deviation form as follows:
Σx1y = β̂1Σx1² + β̂2Σx1x2 ……………………(3.16)
Using the same procedure, if we substitute (3.14) into (3.13), we get
Σx2y = β̂1Σx1x2 + β̂2Σx2² ……………………(3.17)
Let's bring (3.16) and (3.17) together:
Σx1y = β̂1Σx1² + β̂2Σx1x2 ……………………(3.18)
Σx2y = β̂1Σx1x2 + β̂2Σx2² ……………………(3.19)
β̂1 and β̂2 can easily be solved using matrices:
[Σx1²    Σx1x2] [β̂1]   [Σx1y]
[Σx1x2   Σx2² ] [β̂2] = [Σx2y] ……………………(3.20)
Solving this system gives
β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2)/(Σx1²·Σx2² − (Σx1x2)²)
β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2)/(Σx1²·Σx2² − (Σx1x2)²)
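A minimal Python sketch (with hypothetical data) of solving the two normal equations (3.18)-(3.19) for β̂1 and β̂2 and recovering β̂0 from (3.14):

import numpy as np

# hypothetical sample
Y  = np.array([10., 12., 15., 14., 18., 20.])
X1 = np.array([ 2.,  3.,  4.,  4.,  6.,  7.])
X2 = np.array([ 1.,  1.,  2.,  3.,  3.,  4.])

y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()

A = np.array([[(x1*x1).sum(), (x1*x2).sum()],
              [(x1*x2).sum(), (x2*x2).sum()]])
b = np.array([(x1*y).sum(), (x2*y).sum()])

beta1, beta2 = np.linalg.solve(A, b)                     # equations (3.18)-(3.19)
beta0 = Y.mean() - beta1*X1.mean() - beta2*X2.mean()     # equation (3.14)
print(beta0, beta1, beta2)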
As in the simple regression model, the coefficient of determination measures the goodness of fit:
R² = ESS/TSS = 1 − RSS/TSS = 1 − Σei²/Σyi² ……………………(3.25)
Note that
Σei² = Σei(yi − β̂1x1i − β̂2x2i)
     = Σeiyi − β̂1Σeix1i − β̂2Σeix2i
     = Σeiyi ,   since Σeix1i = Σeix2i = 0
     = Σyi(yi − β̂1x1i − β̂2x2i)
i.e.  Σei² = Σyi² − β̂1Σx1iyi − β̂2Σx2iyi
⇒ Σyi² = β̂1Σx1iyi + β̂2Σx2iyi + Σei² ……………………(3.26)
(Total variation = Explained variation + Unexplained variation)
R² = ESS/TSS = (β̂1Σx1iyi + β̂2Σx2iyi)/Σyi² ……………………(3.27)
Since the correlation coefficient measures the linear association between two variables, if R² is high there is a close association between the values of Yt and the values predicted by the model, Ŷt; in this case the model is said to “fit” the data well. If R² is low, there is little association between the values of Yt and the values predicted by the model, Ŷt, and the model does not fit the data well.
So far we have discussed regression models containing one or two explanatory variables. Let us now generalize the model assuming that it contains k variables. It will be of the form:
Y = β0 + β1X1 + β2X2 + … + βkXk + U
There will be k + 1 normal equations, in which the unknowns are the parameters β0, β1, …, βk and the known terms are the sums of squares and the sums of products of all the variables in the structural equation.
Least squares estimators of the unknown parameters are obtained by minimizing the sum of the squared residuals,
Σei² = Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − … − β̂kXki)² ,
with respect to β̂j (j = 0, 1, 2, …, k):
∂Σei²/∂β̂0 = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − … − β̂kXki) = 0
……………………………………………………………………
∂Σei²/∂β̂k = −2Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − … − β̂kXki)(Xki) = 0
The general form of the above equations (except the first) may be written as:
∂Σei²/∂β̂j = −2Σ(Yi − β̂0 − β̂1X1i − … − β̂kXki)(Xji) = 0 ,  where j = 1, 2, …, k
Rearranging yields the normal equations:
ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i + … + β̂kΣXki
ΣYiX1i = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i + … + β̂kΣX1iXki
ΣYiX2i = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i² + … + β̂kΣX2iXki
   :        :          :            :               :
ΣYiXki = β̂0ΣXki + β̂1ΣX1iXki + β̂2ΣX2iXki + … + β̂kΣXki²
Solving the above normal equations will result in algebraic complexity. But we
can solve this easily using matrix. Hence in the next section we will discuss the
matrix approach to linear regression model.
where (i = 1, 2, 3, …, n), β0 is the intercept, β1 to βk are the partial slope coefficients, U is the stochastic disturbance term and i is the i-th observation, ‘n' being the size of the sample. Since i represents the i-th observation, we shall have ‘n' equations with ‘n' observations on each variable:
Y1 = β0 + β1X11 + β2X21 + β3X31 + … + βkXk1 + U1
Y2 = β0 + β1X12 + β2X22 + β3X32 + … + βkXk2 + U2
Y3 = β0 + β1X13 + β2X23 + β3X33 + … + βkXk3 + U3
…………………………………………………...
Yn = β0 + β1X1n + β2X2n + β3X3n + … + βkXkn + Un
Writing these equations in matrix form,
[Y1]   [1  X11  X21  …  Xk1] [β0]   [U1]
[Y2]   [1  X12  X22  …  Xk2] [β1]   [U2]
[Y3] = [1  X13  X23  …  Xk3] [β2] + [U3]
[ :]   [:   :    :        : ] [ :]   [ :]
[Yn]   [1  X1n  X2n  …  Xkn] [βk]   [Un]
In short,
Y = Xβ + U ……………………(3.29)
The orders of the matrices and vectors involved are: Y is (n × 1), X is (n × (k + 1)), β is ((k + 1) × 1) and U is (n × 1).
We have to minimize
Σei² = e1² + e2² + e3² + … + en² = e′e ,
where e′ = [e1, e2, …, en] and e = Y − Xβ̂, so that
Σei² = e′e = (Y − Xβ̂)′(Y − Xβ̂) = Y′Y − 2β̂′X′Y + β̂′X′Xβ̂.
Differentiating with respect to β̂ and setting the result equal to zero gives the normal equations (X′X)β̂ = X′Y, hence
β̂ = (X′X)⁻¹X′Y
Let C = (X′X)⁻¹X′, so that
β̂ = CY ……………………(3.33)
Substituting Y = Xβ + U,
E(β̂) = E[(X′X)⁻¹X′(Xβ + U)] = β + (X′X)⁻¹X′E(U) = β ,  since E(U) = 0.
Thus, the least squares estimators are unbiased.
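A minimal numpy sketch of the matrix formula β̂ = (X′X)⁻¹X′Y; the data are hypothetical and used only for illustration.

import numpy as np

# hypothetical data: n = 6 observations, two regressors plus a constant column
Y = np.array([10., 12., 15., 14., 18., 20.])
X = np.column_stack([np.ones(6),
                     [2., 3., 4., 4., 6., 7.],
                     [1., 1., 2., 3., 3., 4.]])

beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ Y)   # OLS: (X'X)^(-1) X'Y
print(beta_hat)                                 # [beta0_hat, beta1_hat, beta2_hat]

# in practice np.linalg.solve(X.T @ X, X.T @ Y) is numerically preferable to the explicit inverse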
3. Minimum variance
Before showing that all the OLS estimators are best (possess the minimum variance property), it is important to derive their variances.
We know that var(β̂) = E[(β̂ − β)(β̂ − β)′], the variance-covariance matrix whose principal diagonal contains the variances E(β̂0 − β0)², E(β̂1 − β1)², …, E(β̂k − βk)² and whose off-diagonal elements are the covariances E[(β̂i − βi)(β̂j − βj)].
From β̂ = (X′X)⁻¹X′Y = (X′X)⁻¹X′(Xβ + U) we get
β̂ − β = (X′X)⁻¹X′U ……………………(3.36)
Hence
var(β̂) = E[(β̂ − β)(β̂ − β)′] = E[(X′X)⁻¹X′UU′X(X′X)⁻¹] = (X′X)⁻¹X′E(UU′)X(X′X)⁻¹ ,
and since E(UU′) = σu²In, we obtain
var(β̂) = σu²(X′X)⁻¹ ……………………(3.37)
where
        [n       ΣX1i      …   ΣXki   ]
X′X =  [ΣX1i    ΣX1i²     …   ΣX1iXki]
        [ :        :             :     ]
        [ΣXki   ΣX1iXki   …   ΣXki²  ]
and the X's are in their absolute (original) form. We can therefore obtain the variance of any estimator, say β̂1, by taking the corresponding term on the principal diagonal of (X′X)⁻¹ and multiplying it by σu².
When the x's are in deviation form we can write the multiple regression in matrix form as
β̂ = (x′x)⁻¹x′y
where
     [β̂1]                [Σx1²     Σx1x2   …   Σx1xk]
β̂ = [β̂2]    and  x′x = [Σx2x1    Σx2²    …   Σx2xk]
     [ :]                [  :        :             :  ]
     [β̂k]                [Σxkx1    Σxkx2   …   Σxk² ]
The above column vector β̂ does not include the constant term β̂0. Under such conditions the variances of the slope parameters are given by var(β̂) = σu²(x′x)⁻¹ (the proof is the same as for (3.37) above). In general, we can illustrate the variances of the parameters by taking two explanatory variables.
The multiple regression, written in deviation form with two explanatory variables, is
ŷi = β̂1x1 + β̂2x2
and var(β̂) = E[(β̂ − β)(β̂ − β)′].
In this model,
(β̂ − β) = [β̂1 − β1]
           [β̂2 − β2]
so that
(β̂ − β)(β̂ − β)′ = [(β̂1 − β1)²             (β̂1 − β1)(β̂2 − β2)]
                    [(β̂2 − β2)(β̂1 − β1)     (β̂2 − β2)²        ]
and
var(β̂) = [var(β̂1)         cov(β̂1, β̂2)]
           [cov(β̂1, β̂2)    var(β̂2)     ]
In the case of two explanatory variables, x in deviation form is
     [x11  x21]
x =  [x12  x22]     and   x′x = [Σx1²    Σx1x2]
     [ :    : ]                  [Σx1x2   Σx2² ]
     [x1n  x2n]
so that
σu²(x′x)⁻¹ = σu²/[Σx1²Σx2² − (Σx1x2)²] · [ Σx2²     −Σx1x2]
                                          [−Σx1x2     Σx1² ]
i.e.
var(β̂1) = σu²Σx2²/[Σx1²Σx2² − (Σx1x2)²] ……………………(3.39)
var(β̂2) = σu²Σx1²/[Σx1²Σx2² − (Σx1x2)²] ……………………(3.40)
cov(β̂1, β̂2) = −σu²Σx1x2/[Σx1²Σx2² − (Σx1x2)²] ……………………(3.41)
The unknown σu² in the above formulas is estimated by the residual sum of squares divided by the degrees of freedom, where
Σei² = Σy² − β̂1Σx1y − β̂2Σx2y − … − β̂kΣxky ……………………(3.42)
for k explanatory variables. For two explanatory variables,
Σei² = Σy² − β̂1Σx1y − β̂2Σx2y ……………………(3.43)
This is all about the variance covariance of the parameters. Now it is time to see
the minimum variance property.
Minimum variance of β̂
We must prove that the variances obtained in (3.37) are the smallest amongst all other possible linear unbiased estimators. We follow the same procedure as in the case of the single-explanatory-variable model, where we first assumed an alternative linear unbiased estimator and then established that its variance is greater than that of the OLS estimator.
Assume an alternative linear estimator
β* = [(X′X)⁻¹X′ + B]Y
where B is a (k × n) matrix of known constants.
β* = [(X′X)⁻¹X′ + B](Xβ + U)
E(β*) = (X′X)⁻¹X′Xβ + (X′X)⁻¹X′E(U) + BXβ + BE(U)
      = β + BXβ ,   [since E(U) = 0] ……………………(3.44)
Since β* is assumed to be an unbiased estimator of β, E(β*) should be equal to β; in other words, BX should be a null matrix (BX = 0). Then
β* − β = (X′X)⁻¹X′U + BU   and   (β* − β)′ = U′X(X′X)⁻¹ + U′B′
var(β*) = E[(β* − β)(β* − β)′]
        = E{[(X′X)⁻¹X′ + B]UU′[X(X′X)⁻¹ + B′]}
        = σu²[(X′X)⁻¹X′ + B][X(X′X)⁻¹ + B′]   (since E(UU′) = σu²In)
        = σu²[(X′X)⁻¹X′X(X′X)⁻¹ + BX(X′X)⁻¹ + (X′X)⁻¹X′B′ + BB′]
        = σu²[(X′X)⁻¹ + BB′]   (since BX = 0 and X′B′ = (BX)′ = 0)
        = σu²(X′X)⁻¹ + σu²BB′
Since BB′ is positive semi-definite, var(β*) ≥ var(β̂) = σu²(X′X)⁻¹, so the OLS estimators have minimum variance.
We know that
Σei² = e′e = Y′Y − 2β̂′X′Y + β̂′X′Xβ̂ = Y′Y − β̂′X′Y ,   since (X′X)β̂ = X′Y.
In deviation form, yi = Yi − Ȳ and
Σyi² = ΣYi² − (1/n)(ΣYi)²
In matrix notation
Σyi² = Y′Y − (1/n)(ΣYi)² ……………………(3.48)
Equation (3.48) gives the total sum of squares (total variation) in the model.
Explained sum of squares = Σyi² − Σei²
                         = Y′Y − (1/n)(ΣYi)² − e′e
                         = β̂′X′Y − (1/n)(ΣYi)² ……………………(3.49)
Since R² = Explained sum of squares / Total sum of squares,
R² = [β̂′X′Y − (1/n)(ΣYi)²] / [Y′Y − (1/n)(ΣYi)²] = (β̂′X′Y − nȲ²)/(Y′Y − nȲ²) ……………………(3.50)
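Continuing the earlier matrix sketch, the residual sum of squares, the variance-covariance matrix σ̂u²(X′X)⁻¹ and the R² of (3.50) can be computed as follows (same hypothetical data; the divisor n − p assumes p estimated parameters including the intercept):

import numpy as np

Y = np.array([10., 12., 15., 14., 18., 20.])
X = np.column_stack([np.ones(6),
                     [2., 3., 4., 4., 6., 7.],
                     [1., 1., 2., 3., 3., 4.]])
n, p = X.shape                                  # p = number of parameters (incl. intercept)

XtX_inv  = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ (X.T @ Y)

e      = Y - X @ beta_hat                       # residuals
ete    = e @ e                                  # e'e = Y'Y - beta'X'Y
sigma2 = ete / (n - p)                          # estimate of the error variance
vcov   = sigma2 * XtX_inv                       # variance-covariance matrix of beta_hat

ybar = Y.mean()
R2   = (beta_hat @ X.T @ Y - n * ybar**2) / (Y @ Y - n * ybar**2)   # equation (3.50)
print(np.sqrt(np.diag(vcov)), R2)               # standard errors and R-squared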
Dear Students! We hope that from the discussion made so far on the multiple regression model you can make the following summary of results.
(i) Model: Y = Xβ + U
(ii) Estimators: β̂ = (X′X)⁻¹X′Y
(iii) Statistical properties: BLUE
(iv) Variance-covariance: var(β̂) = σu²(X′X)⁻¹
(v) Estimation of the error variance: σ̂u² = e′e/(n − k), where k is the number of estimated parameters
(vi) Coefficient of determination: R² = (β̂′X′Y − nȲ²)/(Y′Y − nȲ²)
We can use the standard error test to test a hypothesis about any individual partial regression coefficient. To illustrate, consider the following example.
Let Y = β̂0 + β̂1X1 + β̂2X2 + ei ……………………(3.51)
A. H0: β1 = 0
   H1: β1 ≠ 0
B. H0: β2 = 0
   H1: β2 ≠ 0
The null hypothesis (A) states that, holding X2 constant, X1 has no (linear) influence on Y. Similarly, hypothesis (B) states that, holding X1 constant, X2 has no influence on the dependent variable Yi. To test these null hypotheses we will use the following tests:
i- Standard error test: under this and the following testing methods we test only β̂1; the test for β̂2 is done in the same way.
SE(β̂1) = √var(β̂1) = √[σ̂²Σx2i²/(Σx1i²Σx2i² − (Σx1ix2i)²)] ,  where σ̂² = Σei²/(n − 3).
If SE(β̂1) > ½|β̂1|, we accept the null hypothesis, i.e. β̂1 is not statistically significant; if SE(β̂1) < ½|β̂1|, we reject the null hypothesis and conclude that β̂1 is statistically significant.
Note: The smaller the standard errors, the stronger the evidence that the estimates are statistically reliable.
ii. The student's t-test: We compute the t-ratio for each β̂i:
t* = β̂i/SE(β̂i) ~ t(n−k) ,
where n is the number of observations and k is the number of parameters. If we have 3 parameters, the degrees of freedom will be n − 3. So,
t* = β̂2/SE(β̂2) ,  with n − 3 degrees of freedom.
Under the null hypothesis β2 = 0, the t* becomes:
t* = β̂2/SE(β̂2)
If t* < t (tabulated), we accept the null hypothesis, i.e. we conclude that β̂2 is not significant and hence the regressor does not appear to influence the dependent variable significantly; if t* > t (tabulated), we reject the null hypothesis and conclude that β̂2 is statistically significant.
In this section we extend this idea to a joint test of the relevance of all the included explanatory variables. Now consider the model:
Y = β0 + β1X1 + β2X2 + … + βkXk + Ui
H0: β1 = β2 = β3 = … = βk = 0
This null hypothesis is a joint hypothesis that β1, β2, …, βk are jointly or simultaneously equal to zero. Can this joint hypothesis be tested by testing the significance of the β̂i's individually as above? The answer is no, and the reasoning is as follows.
In testing the individual significance of β̂2 under the null hypothesis that β2 = 0, it was tacitly assumed that the testing was based on a different sample from the one used in testing the significance of β̂3 under the null hypothesis that β3 = 0. But in testing the joint hypothesis above with the same sample, we shall be violating the assumption underlying the test procedure.
The test procedure for any set of hypothesis can be based on a comparison of the
sum of squared errors from the original, the unrestricted multiple regression
model to the sum of squared errors from a regression model in which the null
hypothesis is assumed to be true. When a null hypothesis is assumed to be true,
we in effect place conditions or constraints, on the values that the parameters can
take, and the sum of squared errors increases. The idea of the test is that if these
sum of squared errors are substantially different, then the assumption that the joint
null hypothesis is true has significantly reduced the ability of the model to fit the
data, and the data do not support the null hypothesis.
If the null hypothesis is true, we expect that the data are compatible with the conditions placed on the parameters. Thus, there would be little change in the sum of squared errors when the null hypothesis is assumed to be true.
Let the Restricted Residual Sum of Squares (RRSS) be the sum of squared errors in the model obtained by assuming that the null hypothesis is true, and let URSS be the sum of squared errors of the original unrestricted model, i.e. the unrestricted residual sum of squares. It is always true that RRSS − URSS ≥ 0.¹
¹ Gujarati, Basic Econometrics, 3rd ed.
Consider the unrestricted model Ŷ = β̂0 + β̂1X1 + β̂2X2 + … + β̂kXk.
We know that Ŷi = β̂0 + β̂1X1i + β̂2X2i + … + β̂kXki,
ei = Yi − Ŷi
Σei² = Σ(Yi − Ŷi)²
This sum of squared errors is called the unrestricted residual sum of squares (URSS). This is the case when the null hypothesis is not true. If the null hypothesis is assumed to be true, i.e. when all the slope coefficients are zero,
Y = β̂0 + ei
β̂0 = ΣYi/n = Ȳ   (applying OLS) ……………………(3.52)
ei = Yi − β̂0 ,  but β̂0 = Ȳ
ei = Yi − Ȳ
Σei² = Σ(Yi − Ȳ)² = Σyi² = TSS
The sum of squared errors when the null hypothesis is assumed to be true is called the Restricted Residual Sum of Squares (RRSS), and this is equal to the total sum of squares (TSS).
The ratio
F = [(RRSS − URSS)/(k − 1)] / [URSS/(n − k)]  ~  F(k−1, n−k) ……………………(3.53)
has an F-distribution with k − 1 and n − k degrees of freedom for the numerator and denominator respectively.
RRSS = TSS
URSS = Σei² = Σy² − β̂1Σx1y − β̂2Σx2y − … − β̂kΣxky = RSS
Therefore,
F = [(TSS − RSS)/(k − 1)] / [RSS/(n − k)]
F = [ESS/(k − 1)] / [RSS/(n − k)] ……………………(3.54)
If the computed value of F is greater than the critical value of F (k-1, n-k), then the
parameters of the model are jointly significant or the dependent variable Y is
linearly related to the independent variables included in the model.
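A short sketch of the overall F-test (3.54) in Python; the R² and sample size below are assumed purely for illustration, and scipy supplies the critical value.

from scipy import stats

R2, n, k = 0.24, 25, 3                     # k = number of parameters incl. the intercept
F_star = (R2 / (k - 1)) / ((1 - R2) / (n - k))
F_crit = stats.f.ppf(0.95, k - 1, n - k)   # 5% critical value, F(2, 22)

print(F_star, F_crit, F_star > F_crit)     # True: reject H0, the regression is jointly significant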
Table 2.1. Numerical example for the computation of the OLS estimators (n = 10; the first seven observations are reproduced below; yi, x1, x2, x3 denote deviations from the respective means).
n    Y    X1   X2   X3    yi   x1   x2   x3
1    49   35   53   200   -3   -7   -9    0
2    40   35   53   212  -12   -7   -9   12
3    41   38   50   211  -11   -4  -12   11
4    46   40   64   212   -6   -2    2   12
5    52   40   70   203    0   -2    8    3
6    59   42   68   194    7    0    6   -6
7    53   44   59   194    1    2   -3   -6
The deviation sums satisfy Σyi = Σx1 = Σx2 = Σx3 = 0; the sums of squares and cross-products (Σx1², Σx2², Σx3², Σx1x2, Σx1x3, Σx2x3, Σx1yi, Σx2yi, Σx3yi, Σyi²) enter the matrices (x′x) and (x′y) used in the solution below.
From the table, the means of the variables are: Ȳ = 52, X̄1 = 42, X̄2 = 62, X̄3 = 200.
Based on the above table and model answer the following question.
i. Estimate the parameter estimators using the matrix approach
ii. Compute the variance of the parameters.
iii. Compute the coefficient of determination (R2)
iv. Report the regression result.
Solution:
In matrix notation β̂ = (x′x)⁻¹x′y (when we use the data in deviation form), where
     [β̂1]        [x11  x21  x31]
β̂ = [β̂2] ,  x = [x12  x22  x32] ,  so that
     [β̂3]        [ :    :    : ]
                  [x1n  x2n  x3n]
        [Σx1²    Σx1x2   Σx1x3]            [Σx1y]
x′x =  [Σx1x2   Σx2²    Σx2x3]   and x′y = [Σx2y]
        [Σx1x3   Σx2x3   Σx3² ]            [Σx3y]
(i) Substituting the relevant quantities from Table 2.1 we have
        [270  240  330]           [319]
x′x =  [240  630  420]    x′y = [492]
        [330  420  750]           [625]
Note: the calculations may be made easier by taking 30 as a common factor from all the elements of the matrix (x′x); this will not affect the final results.
        |270  240  330|
|x′x| = |240  630  420| = 34,668,000
        |330  420  750|
            [ 0.0085  −0.0012  −0.0031]
(x′x)⁻¹ =  [−0.0012   0.0027  −0.0009]
            [−0.0031  −0.0009   0.0032]
     [β̂1]                 [ 0.0085  −0.0012  −0.0031] [319]   [0.2063]
β̂ = [β̂2] = (x′x)⁻¹x′y = [−0.0012   0.0027  −0.0009] [492] = [0.3309]
     [β̂3]                 [−0.0031  −0.0009   0.0032] [625]   [0.5572]
and
Ŷ = β̂0 + β̂1X1 + β̂2X2 + β̂3X3 ,  with β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2 − β̂3X̄3.
(ii) The elements on the principal diagonal of (x′x)⁻¹, when multiplied by σ̂u², give the variances of the regression parameters, i.e.
var(β̂1) = σ̂u²(0.0085)
var(β̂2) = σ̂u²(0.0027)      σ̂u² = Σei²/(n − k) = 17.11/6 = 2.851
var(β̂3) = σ̂u²(0.0032)
var(β̂1) = 0.0243  ⇒ SE(β̂1) = 0.1560
var(β̂2) = 0.0077  ⇒ SE(β̂2) = 0.0877
var(β̂3) = 0.0093  ⇒ SE(β̂3) = 0.0962
(iii)
R² = (β̂1Σx1y + β̂2Σx2y + β̂3Σx3y)/Σy² = 575.98/594 = 0.97
We can test the significance of the individual parameters using the student's t-test, t*i = β̂i/SE(β̂i): t*1 = 0.2063/0.1560 = 1.32, t*2 = 0.3309/0.0877 = 3.77 and t*3 = 0.5572/0.0962 = 5.79. Comparing these with the critical value t0.025(6) = 2.447, β̂1 is statistically insignificant while β̂2 and β̂3 are statistically significant.
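The matrix computations of this example can be verified with numpy, using the sums given above; small differences from the rounded figures in the text are due to rounding.

import numpy as np

xtx = np.array([[270., 240., 330.],
                [240., 630., 420.],
                [330., 420., 750.]])
xty = np.array([319., 492., 625.])

beta_hat = np.linalg.solve(xtx, xty)        # approx. [0.2063, 0.3309, 0.5572]

sigma2 = 17.11 / 6                          # sum of squared residuals over n - k
se     = np.sqrt(sigma2 * np.diag(np.linalg.inv(xtx)))   # standard errors of the slopes
R2     = beta_hat @ xty / 594.0             # explained variation over total variation

print(beta_hat, se, R2)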
Example 2. The following matrix gives the sums of squares and cross-products (in deviation form) of the three variables:
        y       x1      x2
y     7.59    3.12    26.99
x1            29.16   30.80
x2                    133.00
The entry in the first row and first column is Σy², the entry in the first row and second column is Σyx1, and so on, where y = Y − Ȳ, x1 = X1 − X̄1 and x2 = X2 − X̄2.
a. Estimate β̂1 and β̂2
b. Compute the variances of β̂1 and β̂2
c. Compute the coefficient of determination
d. Report the regression result.
Solution: It is difficult to estimate the original (multiplicative) model as it is; to estimate it easily, let's take the natural log of the model:
ln Y1 = ln A + β1 ln Y2 + β2 ln Y3 + Vi
And letting β0 = ln A, Y = ln Y1, X1 = ln Y2 and X2 = ln Y3, the above model becomes:
Y = β0 + β1X1 + β2X2 + Vi
The above matrix is based on the transformed model. Using the values in the matrix we can now answer the questions.
We know that β̂ = (x′x)⁻¹x′y.
In the present question:
     [β̂1]        [x11  x21]
β̂ = [β̂2] , x = [x12  x22] ,   x′x = [Σx1²    Σx1x2]   and   x′y = [Σx1y]
                  [ :    : ]          [Σx1x2   Σx2² ]               [Σx2y]
                  [x1n  x2n]
(a)
(x′x)⁻¹ = 1/(29.16 × 133.00 − 30.80²) [ 133.00  −30.80] = (1/2929.64) [ 133.00  −30.80] = [ 0.0454  −0.0105]
                                       [−30.80    29.16]               [−30.80    29.16]   [−0.0105   0.0099]
     [β̂1]                 [ 0.0454  −0.0105] [ 3.12]   [−0.1421]
β̂ = [β̂2] = (x′x)⁻¹x′y = [−0.0105   0.0099] [26.99] = [ 0.2358]
(b) The elements on the principal diagonal of (x′x)⁻¹, when multiplied by σ̂u², give the variances of β̂1 and β̂2, where σ̂u² = Σei²/(n − 3).
(c) R² = (β̂1Σx1y + β̂2Σx2y)/Σy² = [(−0.1421)(3.12) + (0.2358)(26.99)]/7.59 ≈ 0.78
Σei² = (1 − R²)(Σy²) = 1.6680 ,   σ̂u² = 1.6680/17 = 0.0981
Hence var(β̂1) = 0.0981 × 0.0454 = 0.0045 (SE(β̂1) ≈ 0.067) and var(β̂2) = 0.0981 × 0.0099 = 0.0010 (SE(β̂2) ≈ 0.031).
The (constant) food price elasticity is negative but the income elasticity is positive. Also, the income elasticity is highly significant. About 78 percent of the variation in the consumption of food is explained by its price and the income of the consumer.
Example 3:
Consider the model:
Yi = α + β1X1i + β2X2i + Ui
On the basis of the information given below, answer the following questions.
ΣX1² = 3,200    ΣX1X2 = 4,300    ΣX2 = 400
ΣX2² = 7,300    ΣX1Y = 8,400     ΣX2Y = 13,500
ΣY = 800        ΣX1 = 250        n = 25
ΣY² = 28,000
a. Estimate the parameters α, β1 and β2.
The estimators in deviation form are
β̂1 = (Σx1y·Σx2² − Σx2y·Σx1x2)/(Σx1²·Σx2² − (Σx1x2)²)
β̂2 = (Σx2y·Σx1² − Σx1y·Σx1x2)/(Σx1²·Σx2² − (Σx1x2)²)
Since the x's and y's in the above formulas are in deviation form, we have to find the corresponding deviation forms of the given values.
We know that:
Σx1x2 = ΣX1X2 − nX̄1X̄2 = 4300 − (25)(10)(16) = 300
Σx1y = ΣX1Y − nX̄1Ȳ = 8400 − 25(10)(32) = 400
Σx2y = ΣX2Y − nX̄2Ȳ = 13500 − 25(16)(32) = 700
Σx1² = ΣX1² − nX̄1² = 3200 − 25(10)² = 700
Σx2² = ΣX2² − nX̄2² = 7300 − 25(16)² = 900
Now we can compute the parameters:
β̂1 = [(400)(900) − (700)(300)]/[(700)(900) − (300)²] = 150,000/540,000 = 0.278
β̂2 = [(700)(700) − (400)(300)]/[(700)(900) − (300)²] = 370,000/540,000 = 0.685
α̂ = Ȳ − β̂1X̄1 − β̂2X̄2 = 32 − (0.278)(10) − (0.685)(16) = 18.26
b. var(β̂1) = σ̂²Σx2²/[Σx1²Σx2² − (Σx1x2)²]
σ̂² = Σei²/(n − k), where k is the number of parameters. In our case k = 3, so σ̂² = Σei²/(n − 3).
Σy² = ΣY² − nȲ² = 28,000 − 25(32)² = 2,400
Σei² = Σy² − β̂1Σx1y − β̂2Σx2y = 2400 − (0.278)(400) − (0.685)(700) = 1809.3
σ̂² = 1809.3/22 = 82.24
var(β̂1) = (82.24)(900)/540,000 = 0.137   ⇒ SE(β̂1) = 0.370
var(β̂2) = σ̂²Σx1²/[Σx1²Σx2² − (Σx1x2)²] = (82.24)(700)/540,000 = 0.1067   ⇒ SE(β̂2) = 0.327
c. The significance of the estimates is tested by comparing the computed value of t with the critical value of t obtained from the table at the α/2 level of significance and n − k degrees of freedom.
Hence t* = β̂1/SE(β̂1) = 0.278/0.370 = 0.751
The critical value of t from the t-table at α/2 = 0.05/2 = 0.025 level of significance and 22 degrees of freedom is 2.074.
tc = 2.074
t* = 0.751
⇒ t* < tc
The decision is therefore to accept the null hypothesis that β1 is equal to zero. The conclusion is that β̂1 is statistically insignificant, or the sample we used to estimate β̂1 is drawn from a population of Y and X1 in which there is no relationship between Y and X1 (i.e. β1 = 0).
d.
R² = ESS/TSS = 1 − RSS/TSS
We know that RSS = Σe², TSS = Σy² and ESS = Σŷ² = β̂1Σx1y + β̂2Σx2y + … + β̂kΣxky.
R² = 1 − RSS/TSS = 1 − 1809.3/2400 = 0.24
⇒ 24% of the total variation in Y is explained by the regression line.
Adjusted R² = 1 − [Σei²/(n − k)]/[Σy²/(n − 1)] = 1 − (1 − R²)(n − 1)/(n − k)
            = 1 − (0.754)(24)/22 = 0.178
e. Let's first set up the joint hypothesis:
H0: β1 = β2 = 0
against
H1: at least one of the slope parameters is different from zero.
The joint hypothesis is tested using the F-test given below:
F*(k−1, n−k) = [ESS/(k − 1)]/[RSS/(n − k)] = [R²/(k − 1)]/[(1 − R²)/(n − k)]
From (d), R² = 0.24 and k = 3, so
F*(2, 22) = (0.24/2)/(0.76/22) = 3.47
This is the computed value of F. Let's compare it with the critical value of F at the 5% level of significance with 2 and 22 degrees of freedom for the numerator and denominator respectively: Fc(2, 22) at the 5% level of significance = 3.44.
F*(2, 22) = 3.47
Fc(2, 22) = 3.44
Since F* > Fc, the decision rule is to reject H0 and accept H1. We can say that the model is significant, i.e. the dependent variable is, at least, linearly related to one of the explanatory variables.
Instructions:
Read the following instructions carefully.
Make sure that your exam paper contains 4 pages.
The exam has four parts. Attempt:
 All questions of part one
 Only two questions from part two
 One question from part three
 And the question in part four.
The maximum weight of the exam is 40%.
Part One: Attempt all of the following questions (15 pts).
1. Discuss briefly the goals of econometrics.
2. A researcher is using data for a sample of 10 observations to estimate the relation between consumption expenditure and income. Preliminary analysis of the sample data produces the following:
Σxy = 700 ,  Σx² = 1000 ,  X̄ = 100 ,  Ȳ = 200
where xi = Xi − X̄ and yi = Yi − Ȳ.
i
a. Use the above information to compute OLS estimates of the intercept and slope
coefficients and interpret the result
b. Calculate the variance of the slope parameter
c. Compute the value R2 (coefficient of determination) and interpret the result
d. Compute 95% confidence interval for the slope parameter
e. Test the significance of the slope parameter at 5% level of confidence using t-test
3. The model Yi = β0 + β1X1i + β2X2i + β3X3i + Ui is to be estimated from a sample of 20 observations using the semi-processed data given below in deviation form:
10,000
0.1 0.12 0.03
(x / x)1 0.12 0.04 0.02 20,300
X Y 10,100
0.03 0.02 0.08
30,200
X̄1 = 400 ,  X̄2 = 200 ,  and  X̄3 = 600
where xi = Xi − X̄i and yi = Yi − Ȳ.
2. In a study of 100 firms, the total cost (C) was assumed to be dependent on the rate of output (X1) and the rate of absenteeism (X2). The means were C̄ = 6, X̄1 = 3 and X̄2 = 4. The matrix showing the sums of squares and cross-products adjusted for the means is
        c     x1    x2
c     100    50    40
x1     50    50   -70        where xi = Xi − X̄i and c = Ci − C̄
x2     40   -70   900
Estimate the linear relationship between C and the other two variables. (10 points)