
ECONOMETRIC THEORY

MODULE – III
Lecture - 9
Multiple Linear Regression Analysis

Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

We now consider the problem of regression when the study variable depends on more than one explanatory (or independent) variable; this is called the multiple linear regression model. This model generalizes the simple linear regression model in two ways: it allows the mean function E(y) to depend on more than one explanatory variable, and it allows shapes other than a straight line, although it does not allow for arbitrary shapes.

The multiple linear regression model


Let y denote the dependent (or study) variable that is linearly related to k independent (or explanatory) variables X1, X2, …, Xk through the parameters β1, β2, …, βk, and we write

y = β1X1 + β2X2 + … + βkXk + ε.

This is called the multiple linear regression model. The parameters β1, β2, …, βk are the regression coefficients associated with X1, X2, …, Xk respectively, and ε is the random error component reflecting the difference between the observed and fitted linear relationship. There can be various reasons for such a difference, e.g., the joint effect of variables not included in the model, random factors that cannot be accounted for in the model, etc.

The jth regression coefficient βj represents the expected change in y per unit change in the jth independent variable Xj. Assuming E(ε) = 0,

βj = ∂E(y)/∂Xj.

Linear model

A model is said to be linear when it is linear in the parameters. In such a case ∂y/∂βj (or equivalently ∂E(y)/∂βj) should not depend on any β's. For example:

i) y = β1 + β2X is a linear model, as it is linear in the parameters.

ii) y = β1X^β2 can be written as

log y = log β1 + β2 log X
y* = β1* + β2x*

which is linear in the parameters β1* and β2 but nonlinear in the variables y* = log y, x* = log x. So it is a linear model (see the sketch after this list).

iii) y = β1 + β2X + β3X² is linear in the parameters β1, β2 and β3 but nonlinear in the variable X. So it is a linear model.

iv) y = β1 + β2/(X − β3) is nonlinear in both the parameters and the variables. So it is a nonlinear model.

v) y = β1 + β2X^β3 is nonlinear in both the parameters and the variables. So it is a nonlinear model.

vi) y = β1 + β2X + β3X² + β4X³ is a cubic polynomial model which can be written as

y = β1 + β2X2 + β3X3 + β4X4,

which is linear in the parameters β1, β2, β3, β4 and linear in the variables X2 = X, X3 = X², X4 = X³. So it is a linear model.
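As an illustration of case ii), the following sketch (not part of the lecture; the data and coefficient values are simulated purely for illustration) fits y = β1X^β2 by regressing log y on log X:

    # Sketch: fitting the intrinsically linear model y = beta1 * X**beta2
    # by regressing log(y) on log(X); data simulated for illustration only.
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.uniform(1.0, 10.0, size=200)
    y = 2.0 * X**1.5 * np.exp(rng.normal(scale=0.1, size=200))  # multiplicative error

    # log y = log(beta1) + beta2 * log X is linear in the parameters
    A = np.column_stack([np.ones_like(X), np.log(X)])
    coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    beta1_hat, beta2_hat = np.exp(coef[0]), coef[1]
    print(beta1_hat, beta2_hat)  # close to the simulated values 2.0 and 1.5

Note that the error is taken to be multiplicative here, so that it becomes additive on the log scale.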

Example

The income and education of a person are related. It is expected that, on average, a higher level of education provides a higher income. So a simple linear regression model can be expressed as

income = β1 + β2 education + ε.

Note that β2 reflects the change in income with respect to a per-unit change in education, and β1 reflects the income when education is zero, as it is expected that even an illiterate person can have some income.

Further, this model neglects the fact that most people earn more when they are older than when they are young, regardless of education. So β2 will overstate the marginal impact of education. If age and education are positively correlated, then the regression model will associate all of the observed increase in income with an increase in education. So a better model is

income = β1 + β2 education + β3 age + ε.

Usually it is observed that income tends to rise less rapidly in the later earning years than in the early years. To accommodate such a possibility, we might extend the model to

income = β1 + β2 education + β3 age + β4 age² + ε.

This is how we proceed with regression modeling in real-life situations. One needs to consider the experimental conditions and the phenomenon under study before deciding how many variables to include, why, and how to choose the dependent and independent variables.
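As a hedged sketch (the lecture provides no data set, so all values below are simulated and purely illustrative), the extended model can be fitted by least squares after adding age² as an extra column:

    # Sketch: fitting income = b1 + b2*education + b3*age + b4*age**2 + error
    # on simulated data; all coefficients and scales are illustrative.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 500
    education = rng.integers(0, 21, size=n).astype(float)  # years of schooling
    age = rng.uniform(18.0, 65.0, size=n)
    income = (10.0 + 2.5 * education + 1.2 * age - 0.01 * age**2
              + rng.normal(scale=5.0, size=n))

    X = np.column_stack([np.ones(n), education, age, age**2])
    b, *_ = np.linalg.lstsq(X, income, rcond=None)
    print(b)  # approximately [10, 2.5, 1.2, -0.01]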

Model set up

Let an experiment be conducted n times, and the data obtained as follows:

Observation number    Response (y)    Explanatory variables
                                      X1     X2     …     Xk
1                     y1              x11    x12    …     x1k
2                     y2              x21    x22    …     x2k
⋮                     ⋮               ⋮      ⋮            ⋮
n                     yn              xn1    xn2    …     xnk

Assuming that the model is

y = β1X1 + β2X2 + … + βkXk + ε,

the n-tuples of observations are also assumed to follow the same model. Thus they satisfy

y1 = β1x11 + β2x12 + … + βkx1k + ε1
y2 = β1x21 + β2x22 + … + βkx2k + ε2
⋮
yn = β1xn1 + β2xn2 + … + βkxnk + εn.
These n equations can be written as

[ y1 ]   [ x11  x12  …  x1k ] [ β1 ]   [ ε1 ]
[ y2 ] = [ x21  x22  …  x2k ] [ β2 ] + [ ε2 ]
[ ⋮  ]   [  ⋮    ⋮        ⋮  ] [ ⋮  ]   [ ⋮  ]
[ yn ]   [ xn1  xn2  …  xnk ] [ βk ]   [ εn ]

or

y = Xβ + ε

where y = (y1, y2, …, yn)' is an n × 1 vector of n observations on the study variable,

    [ x11  x12  …  x1k ]
X = [ x21  x22  …  x2k ]
    [  ⋮    ⋮        ⋮  ]
    [ xn1  xn2  …  xnk ]

is an n × k matrix of n observations on each of the k explanatory variables, β = (β1, β2, …, βk)' is a k × 1 vector of regression coefficients, and ε = (ε1, ε2, …, εn)' is an n × 1 vector of random error components or disturbance terms.

If an intercept term is present, take the first column of X to be (1, 1, …, 1)', so that

    [ 1  x11  x12  …  x1k ]
X = [ 1  x21  x22  …  x2k ].
    [ ⋮   ⋮    ⋮        ⋮  ]
    [ 1  xn1  xn2  …  xnk ]

In this case, there are (k − 1) explanatory variables and one intercept term.
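A minimal sketch of this set-up in matrix software (the data values are hypothetical, not from the lecture):

    # Sketch: assembling the design matrix X with an intercept column,
    # matching the layout above (first column all ones).
    import numpy as np

    # hypothetical raw data: n = 4 observations on (k - 1) = 2 explanatory variables
    raw = np.array([[1.2, 3.4],
                    [0.7, 2.9],
                    [1.9, 3.1],
                    [1.4, 2.5]])
    n = raw.shape[0]
    X = np.column_stack([np.ones(n), raw])  # columns: intercept, X2, X3
    print(X.shape)  # (4, 3)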

Assumptions in multiple linear regression model

Some assumptions are needed in the model y = Xβ + ε for drawing statistical inferences. The following assumptions are made:

(i) E(ε) = 0

(ii) E(εε') = σ²In

(iii) rank(X) = k

(iv) X is a non-stochastic matrix.

(v) ε ~ N(0, σ²In)

These assumptions are used to study the statistical properties of the estimators of the regression coefficients. The following assumption is required particularly to study the large-sample properties of the estimators:

(vi) lim_{n→∞} (X'X/n) = Δ exists and is a non-stochastic and nonsingular matrix (with finite elements).

The explanatory variables can also be stochastic in some cases. We assume that X is non-stochastic unless stated otherwise.

We consider the problems of estimation and testing of hypotheses on the regression coefficient vector under the stated assumptions.
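Assumptions (iii) and (vi) are easy to inspect numerically for a given design matrix; a minimal sketch, with simulated data and assumed variable names:

    # Sketch: checking the full-rank assumption rank(X) = k, and forming
    # X'X / n, whose limit is assumed to exist and be nonsingular.
    import numpy as np

    rng = np.random.default_rng(3)
    n, k = 50, 4
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

    print(np.linalg.matrix_rank(X) == k)  # assumption (iii)
    print(X.T @ X / n)                    # finite-sample analogue of Delta in (vi)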

Estimation of parameters

A general procedure for the estimation of the regression coefficient vector is to minimize

Σ_{i=1}^{n} M(εi) = Σ_{i=1}^{n} M(yi − xi1β1 − xi2β2 − … − xikβk)

for a suitably chosen function M.

Some examples of the choice of M are

M(x) = |x|
M(x) = x²
M(x) = |x|^p, in general.

We consider the principle of least squares, which corresponds to M(x) = x², and the method of maximum likelihood estimation for the estimation of parameters.
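The effect of the choice of M can be explored by direct numerical minimization; a sketch with simulated data (not from the lecture), contrasting M(x) = x², which gives least squares, with M(x) = |x|, which gives least absolute deviations:

    # Sketch: minimizing sum_i M(y_i - x_i' beta) for two choices of M.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(4)
    n, k = 200, 3
    X = rng.normal(size=(n, k))
    beta = np.array([1.0, -2.0, 0.5])
    y = X @ beta + rng.normal(scale=0.5, size=n)

    def loss(b, M):
        return np.sum(M(y - X @ b))

    b_ls = minimize(loss, np.zeros(k), args=(np.square,)).x
    # |x| is not differentiable at 0, so use a derivative-free method here
    b_lad = minimize(loss, np.zeros(k), args=(np.abs,), method="Nelder-Mead").x
    print(b_ls, b_lad)  # both close to the simulated beta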



Principle of ordinary least squares (OLS)

Let B be the set of all possible vectors β. If there is no further information, then B is the k-dimensional real Euclidean space. The objective is to find a vector b' = (b1, b2, …, bk) from B that minimizes the sum of squared deviations of the εi's, i.e.,

S(β) = Σ_{i=1}^{n} εi² = ε'ε = (y − Xβ)'(y − Xβ)

for given y and X. A minimum will always exist, as S(β) is a real-valued, convex and differentiable function. Write

S(β) = y'y + β'X'Xβ − 2β'X'y.

Differentiating S(β) with respect to β,

∂S(β)/∂β = 2X'Xβ − 2X'y
∂²S(β)/∂β∂β' = 2X'X (at least non-negative definite).

The normal equations are

∂S(β)/∂β = 0
⇒ X'Xb = X'y

where the following result is used:


Result: If f(z) = z'Az is a quadratic form, where z is an m × 1 vector and A is any m × m symmetric matrix, then ∂f(z)/∂z = 2Az.

Since it is assumed that rank(X) = k (full rank), X'X is positive definite and the unique solution of the normal equations is

b = (X'X)⁻¹X'y,

which is termed the ordinary least squares estimator (OLSE) of β.

Since ∂²S(β)/∂β∂β' is at least non-negative definite, b minimizes S(β).
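A sketch of the computation on simulated data (in practice one solves the normal equations rather than explicitly inverting X'X):

    # Sketch: computing the OLSE b by solving the normal equations X'X b = X'y.
    import numpy as np

    rng = np.random.default_rng(5)
    n, k = 100, 3
    X = rng.normal(size=(n, k))
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

    b = np.linalg.solve(X.T @ X, X.T @ y)         # normal equations
    b_check, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(np.allclose(b, b_check))                # True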

In case X is not of full rank, then

b = (X'X)⁻X'y + [I − (X'X)⁻X'X]ω,

where (X'X)⁻ is a generalized inverse of X'X and ω is an arbitrary vector. The generalized inverse (X'X)⁻ of X'X satisfies

X'X(X'X)⁻X'X = X'X
X(X'X)⁻X'X = X
X'X(X'X)⁻X' = X'.
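A sketch of the rank-deficient case, using the Moore–Penrose inverse (one particular generalized inverse, computed here with np.linalg.pinv); it illustrates that b depends on the arbitrary ω while the fitted values Xb do not:

    # Sketch: X is rank deficient (second column = 2 * first column).
    # Every choice of omega gives a solution b of the normal equations,
    # and all of them yield the same fitted values X b.
    import numpy as np

    rng = np.random.default_rng(6)
    n = 30
    x1 = rng.normal(size=n)
    X = np.column_stack([x1, 2.0 * x1, rng.normal(size=n)])  # rank 2, not 3
    y = rng.normal(size=n)

    G = np.linalg.pinv(X.T @ X)  # a generalized inverse of X'X
    for _ in range(3):
        omega = rng.normal(size=3)
        b = G @ X.T @ y + (np.eye(3) - G @ X.T @ X) @ omega
        print(np.allclose(X.T @ X @ b, X.T @ y),  # solves the normal equations
              np.round((X @ b)[:2], 4))           # fitted values are identical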
