L10.2_2023
Multicollinearity
What Happens If the Regressors Are Correlated?
Introduction
• Assumption 8 of the classical linear regression model (CLRM) is
that there is no multicollinearity among the regressors included
in the regression model.
Introduction
We attempt to answer the following questions:
• What is the nature of multicollinearity?
• Is multicollinearity really a problem?
• What are its practical consequences?
• How does one detect it?
• What remedial measures can be taken to alleviate the problem?
The Nature of Multicollinearity
• The term multicollinearity is attributed to Ragnar Frisch.
• Originally, multicollinearity meant the existence of a “perfect,” or exact, linear relationship among some or all explanatory variables of a regression model:
λ1X1i + λ2X2i + ··· + λkXki = 0
where λ1, λ2, ... , λk are constants such that not all of them are zero simultaneously.
The Ballentine view of multicollinearity
Multicollinearity
• Multicollinearity refers only to linear relationships among the X variables.
Multicollinearity
• Why does the classical linear regression model assume that there is no multicollinearity among the X’s?
• If multicollinearity is perfect, the regression coefficients of the X variables are indeterminate and their standard errors are infinite. If multicollinearity is less than perfect but high, the coefficients can be determined, but they possess large standard errors and so cannot be estimated with great precision.
Multicollinearity
• There are several sources of multicollinearity:
• The data collection method employed
• Constraints on the model or in the population being sampled
• Model specification
• An over-determined model.
Estimation in the Presence of Perfect Multicollinearity
Multicollinearity
• Assume that X3i = λX2i, where λ is a nonzero constant
Multicollinearity
• β̂2: It gives the rate of change in the average value of Y as X2 changes by a unit, holding X3 constant.
• But if X3 and X2 are perfectly collinear, there is no way X3 can be kept constant: as X2 changes, so does X3 by the factor λ.
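• A short sketch of why the coefficients become indeterminate under perfect collinearity (assuming the three-variable model $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$): substituting X3i = λX2i gives
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3(\lambda X_{2i}) + u_i = \beta_1 + (\beta_2 + \lambda\beta_3) X_{2i} + u_i$
so OLS can estimate only the composite coefficient α = β2 + λβ3; with one estimate and two unknowns, β2 and β3 cannot be recovered separately.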
Estimation in the Presence of “High” but
“Imperfect” Multicollinearity
• Instead of exact multicollinearity, if we have
X3i = λX2i + vi
where λ ≠ 0 and vi is a stochastic error term, then X2 and X3 are highly, but not perfectly, correlated, and estimation of the individual coefficients remains possible.
Theoretical Consequences of Multicollinearity
• Even if multicollinearity is very high, as in the case of near multicollinearity, the
OLS estimators still retain the property of BLUE.
Multicollinearity
• Even in the case of near multicollinearity the OLS estimators are unbiased.
Practical Consequences of Multicollinearity
1. Although BLUE, the OLS estimators have large variances and covariances,
making precise estimation difficult.
2. Because of consequence 1, the confidence intervals tend to be much wider,
leading to the acceptance of the “zero null hypothesis” (i.e., the true population
coefficient is zero) more readily.
3. Also because of consequence 1, the t ratio of one or more coefficients tends to
be statistically insignificant.
4. Although the t ratio of one or more coefficients is statistically insignificant, R2,
the overall measure of goodness of fit, can be very high.
5. The OLS estimators and their standard errors can be sensitive to small changes
in the data.
Large Variances and Covariances of OLS Estimators
• For the three-variable model Yi = β1 + β2X2i + β3X3i + ui, the following formulas are used:
var(β̂2) = σ² / [Σx2i²(1 − r23²)]
var(β̂3) = σ² / [Σx3i²(1 − r23²)]
cov(β̂2, β̂3) = −r23 σ² / [(1 − r23²) √(Σx2i² Σx3i²)]
where r23 is the coefficient of correlation between X2 and X3 and the lowercase x’s are deviations from their means.
• As r23 approaches 1, the variances and the (absolute) covariance of the estimators grow without bound.
Variance-Inflating Factor (VIF)
• The speed with which variances and covariances increase can be seen with the variance-inflating factor (VIF), which is defined as
VIF = 1 / (1 − r23²)
• The VIF shows how the variance of an estimator is inflated by the presence of multicollinearity; as r23² approaches 1, the VIF approaches infinity.
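• A quick illustrative calculation with hypothetical values of r23 (not taken from any of the examples below):
$r_{23} = 0.90 \Rightarrow \text{VIF} = 1/(1 - 0.81) \approx 5.3$
$r_{23} = 0.95 \Rightarrow \text{VIF} = 1/(1 - 0.9025) \approx 10.3$
$r_{23} = 0.99 \Rightarrow \text{VIF} = 1/(1 - 0.9801) \approx 50.3$
so moving from moderate to very high collinearity multiplies the variance of β̂2 roughly tenfold.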
Regression model
• Using the VIF, we can also write
var(β̂2) = (σ² / Σx2i²) · VIF  and  var(β̂3) = (σ² / Σx3i²) · VIF
which shows that the variances of β̂2 and β̂3 are directly proportional to the VIF.
Multicollinearity
• The Effect of Increasing Collinearity on the 95% Confidence Interval
Tolerance (TOL)
• The inverse of the VIF is called tolerance (TOL). That is,
TOLj = 1 / VIFj = 1 − Rj²
where Rj² is the R2 from the regression of Xj on the other regressors.
Wider Confidence Intervals
• Because of the large standard errors, the confidence intervals for the relevant population parameters tend to be much wider.
“Insignificant” t Ratios
• In cases of high collinearity the estimated standard errors increase dramatically,
thereby making the t values smaller.
• Therefore, in such cases, one will increasingly accept the null hypothesis that the
relevant true population value is zero.
A High R2 but Few Significant t Ratios
• Consider the k-variable linear regression model:
Yi = β1 + β2X2i + β3X3i + ··· + βkXki + ui
• In cases of high collinearity, it is possible to find that one or more of the partial slope coefficients are individually statistically insignificant on the basis of the t test.
• Yet the R2 in such situations may be so high, say, in excess of 0.9, that on the basis of the F test one can convincingly reject the hypothesis that β2 = β3 = ··· = βk = 0.
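• For reference, the overall F test used here is the standard test of joint significance (k counts the parameters including the intercept, n is the sample size):
$F = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
which, under H0: β2 = β3 = ··· = βk = 0, follows the F distribution with k − 1 and n − k degrees of freedom; a very high R2 makes this statistic large even when the individual t ratios are small.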
Sensitivity of OLS Estimators and Their Standard Errors to Small Changes in Data
• Consider the following data set.
• Estimate the regression model using the command: reg y x2 x3
• We obtain the following multiple regression:
• With the new data set, obtained after changing the 3rd and 4th values of X3, we re-estimate the same model; even this small change produces noticeably different coefficient estimates and standard errors (see the sketch below).
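• A minimal Stata sketch of the comparison, assuming the data above are in memory as y, x2 and x3 (the altered X3 values are not reproduced here):
reg y x2 x3            // regression on the original data
corr x2 x3             // correlation between the regressors
estat vif              // variance-inflation factors after the fit
* ...edit the 3rd and 4th observations of x3, then re-estimate:
reg y x2 x3            // compare coefficients and standard errors with the first fit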
Consequences of Micronumerosity
• Micronumerosity (Arthur Goldberger’s term for a small sample size) has consequences analogous to those of multicollinearity: with little information in the data, the coefficients can be estimated only imprecisely.
Example 2
• EXAMPLE: Consumption Expenditure in Relation to Income and Wealth
Graphical presentation
Example
• If instead of regressing Y on X2, we regress it on X3, we obtain:
Detection of Multicollinearity
1. High R2 but few significant t ratios.
2. High pair-wise correlations among regressors.
3. Examination of partial correlations.
Detection of Multicollinearity
4. Auxiliary regressions.
• Regress each Xi on the remaining X variables and compute the corresponding R2, designated as Ri²; each one of these regressions is called an auxiliary regression.
Detection of Multicollinearity
• Klein’ rule of thumb suggests that multicollinearity may be a troublesome
problem only if the R2 obtained from an auxiliary regression is greater than the
overall R2, that is, that obtained from the regression of Y on all the regressors.
Detection of Multicollinearity
5. Eigenvalues and Condition Index.
• From the eigenvalues of the X′X matrix we can compute the condition number k = λmax/λmin and the condition index CI = √(λmax/λmin) = √k.
Detection of Multicollinearity
• Thumb rule: If k is between 100 and 1000 there is moderate to strong multicollinearity, and if it exceeds 1000 there is severe multicollinearity. Equivalently, if the CI (= √k) is between 10 and 30 there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity.
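• An illustrative calculation with hypothetical eigenvalues λmax = 150 and λmin = 0.01:
$k = \lambda_{\max}/\lambda_{\min} = 150/0.01 = 15{,}000, \qquad CI = \sqrt{k} \approx 122$
Both values are far beyond the thresholds above, indicating severe multicollinearity.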
Detection of Multicollinearity
6. Tolerance and variance inflation factor.
• The larger the value of VIFj, the more “troublesome” or collinear the variable Xj.
• As a rule of thumb, if the VIF of a variable exceeds 10, which will happen if Rj² exceeds 0.90, that variable is said to be highly collinear.
• The closer TOLj is to zero, the greater the degree of collinearity of that variable
with the other regressors.
• On the other hand, the closer TOLj is to 1, the greater the evidence that Xj is not
collinear with the other regressors.
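• A minimal Stata sketch of computing VIFj and TOLj from an auxiliary regression (variable names follow the earlier reg y x2 x3 example; estat vif after the main regression reports the same quantities):
quietly reg x2 x3          // auxiliary regression of x2 on the other regressor(s)
display 1 - e(r2)          // TOL for x2
display 1/(1 - e(r2))      // VIF for x2; a value above 10 signals strong collinearity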
Detection of Multicollinearity
7. Scatterplot.
Remedial Measures
• Two broad approaches: the do-nothing approach, and the rule-of-thumb procedures discussed next.
Rule-of-Thumb Procedures
1. A priori information.
Suppose we consider the consumption model Yi = β1 + β2X2i + β3X3i + ui, where Y = consumption expenditure, X2 = income, and X3 = wealth.
• But suppose we believe that β3 = 0.10β2; that is, the rate of change of
consumption with respect to wealth is one-tenth the corresponding rate with
respect to income.
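• A short sketch of how the prior restriction is imposed: substituting β3 = 0.10β2 gives
$Y_i = \beta_1 + \beta_2 X_{2i} + 0.10\,\beta_2 X_{3i} + u_i = \beta_1 + \beta_2 X_i + u_i, \qquad X_i = X_{2i} + 0.10\,X_{3i}$
Once β̂2 is obtained from the regression of Y on the composite variable Xi, β̂3 follows as 0.10 β̂2, so the collinearity between income and wealth no longer hampers estimation.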
Rule-of-Thumb Procedures
2. Combining cross-sectional and time series data.
Rule-of-Thumb Procedures
3. Dropping a variable(s) and specification bias.
• Specification bias arises from incorrect specification of the model used in the
analysis.
• Thus, if economic theory says that income and wealth should both be included in the
model explaining the consumption expenditure, dropping the wealth variable would
constitute specification bias.
• Omitting a variable may seriously mislead us as to the true values of the parameters.
Rule-of-Thumb Procedures
4. Transformation of variables.
• A commonly used transformation is the first-difference form of the regression model.
Rule-of-Thumb Procedures
• Problems with first difference models
• For instance, the error term vt in the transformed equation may not satisfy the assumption that the disturbances are serially uncorrelated.
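• For concreteness, a sketch of the first-difference transformation referred to here, assuming the level model Yt = β1 + β2X2t + β3X3t + ut:
$Y_t - Y_{t-1} = \beta_2 (X_{2t} - X_{2,t-1}) + \beta_3 (X_{3t} - X_{3,t-1}) + v_t, \qquad v_t = u_t - u_{t-1}$
Even if the original disturbances ut are serially uncorrelated, vt is serially correlated, since consecutive differences share a common ut term; this is the problem noted above.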
Rule-of-Thumb Procedures
5. Additional or new data.
• As the sample size increases, Σx2i² will generally increase.
• Therefore, for any given r23, the variance of β̂2 will decrease, thus decreasing the standard error, which will enable us to estimate β2 more precisely.
Rule-of-Thumb Procedures
6. Reducing collinearity in polynomial regressions.
• In polynomial regression models the successive powers of the explanatory variable tend to be highly correlated; expressing the variable in deviation-from-mean form (or using orthogonal polynomials) often reduces this collinearity substantially (see the sketch below).
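• A minimal Stata sketch of the mean-centering idea for a quadratic term (the variable names x and y are hypothetical):
quietly summarize x
generate xc = x - r(mean)      // express x in deviation-from-mean form
generate xc2 = xc^2            // squared term built from the centered variable
reg y xc xc2                   // the centered powers are usually far less collinear than x and x^2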
Rule-of-Thumb Procedures
7. Other methods of remedying multicollinearity.
• Multivariate statistical techniques such as factor analysis and principal components analysis, as well as techniques such as ridge regression, are sometimes employed.
Is Multicollinearity Necessarily Bad?
• If the objective is prediction only, it may not pose any problem.
Example: The Longley Data
• The Longley data, a classic benchmark for highly collinear regressors, relate employment (Y) to six macroeconomic variables, X1 through X6, for the years 1947–1962.
Example
• Correlation matrix
x1 x2 x3 x4 x5 x6
x1 1
x2 0.9916 1
x3 0.6206 0.6043 1
x4 0.4647 0.4464 -0.1774 1
x5 0.9792 0.9911 0.6866 0.3644 1
x6 0.9911 0.9953 0.6683 0.4172 0.994 1
• Several of these pair-wise correlations are quite high, suggesting that
there may be a severe collinearity problem.
• However, such pair-wise correlations may be a sufficient but not a necessary condition
for the existence of multicollinearity.
Example
• To shed further light on the nature of the multicollinearity problem, let us run
the auxiliary regressions.
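• A hedged Stata sketch of one such auxiliary regression, using the variable names from the correlation matrix above (the remaining auxiliary regressions follow the same pattern):
quietly reg x1 x2 x3 x4 x5 x6   // regress X1 on the other five regressors
display e(r2)                   // auxiliary R-squared, to be compared with the overall R-squared (Klein’s rule) or turned into a VIF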
Example
• Finally, we obtain the following table of auxiliary-regression results.
Example
• What “remedial” actions can we take?
• Consider the original model.
• First of all, we could express GNP not in nominal terms, but in real terms, which we
can do by dividing nominal GNP by the implicit price deflator.
• Second, since non-institutional population over 14 years of age grows over time
because of natural population growth, it will be highly correlated with time, the
variable X6 in our model.
• Therefore, instead of keeping both these variables, we will keep the variable X5 and
drop X6.
• Third, there is no compelling reason to include X3, the number of people
unemployed; perhaps the unemployment rate would have been a better measure of
labor market conditions.
Example
• To make these changes, generate the RGNP (real GNP) variable, drop X3 and X6, and re-estimate the model; a sketch follows below.
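• A hedged Stata sketch of these changes, assuming (as the discussion above suggests) that x1 is the implicit price deflator and x2 is nominal GNP; the exact variable definitions should be checked against the data documentation:
generate rgnp = (x2/x1)*100     // real GNP: nominal GNP deflated by the price index
reg y rgnp x4 x5                // regressors retained after dropping x3 and x6 and replacing x1, x2 with rgnp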
Example
• Although the R2 value has declined slightly compared with the original R2, it is
still very high.
• Now all the estimated coefficients are significant and the signs of the coefficients
make economic sense.