Chapter 4. Violation of Classical Assumptions and Diagnostic Testing

5. Violation of Assumptions
The classical assumptions do not always hold. If one or more of these assumptions are violated (for example, in the presence of autocorrelation, heteroscedasticity and/or multicollinearity):
• Estimates of the parameters will not be accurate.
• The OLS estimators may no longer possess the BLUE property.
• Tests of hypotheses using the standard t and F statistics will no longer be valid.
• Conclusions/inferences drawn will be misleading.
Formal tests are therefore required to identify whether these assumptions are satisfied or not.
5.1 Multicollinearity
One of the assumptions of the classical linear regression model is that there is no multicollinearity among the explanatory variables, the X’s. Broadly interpreted, multicollinearity refers to the situation where there is either an exact or a less-than-exact linear relationship among the explanatory (X) variables.
Multicollinearity can be perfect or less than perfect. When there is perfect collinearity among the explanatory variables, the regression coefficients are indeterminate and their standard errors are infinite. There is perfect multicollinearity between two explanatory variables if one can be expressed as a constant multiple of the other. Suppose we have the model:

Yᵢ = β₀ + β₁X₁ + β₂X₂ + uᵢ ,   where X₁ and X₂ are explanatory variables.
Then, if X₁ can be expressed as a constant multiple of X₂, we say there is perfect multicollinearity between them:

X₁ = λX₂ ,   where λ is a constant number, λ ≠ 0.

If instead X₁ = λX₂ ± v, where v is any random term (any random number), there is less-than-perfect (imperfect) collinearity between X₁ and X₂, as illustrated in the sketch below.
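As a minimal illustration (the data are simulated and all numbers, including λ = 2 and the sample size, are assumptions made for this sketch), the following Python snippet shows that under perfect collinearity the design matrix loses rank, so the OLS coefficients are indeterminate, while under imperfect collinearity estimation remains possible:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)

# Perfect collinearity: X2 is an exact constant multiple of X1 (here lambda = 2).
x2_perfect = 2.0 * x1
X_perfect = np.column_stack([np.ones(n), x1, x2_perfect])
# Rank is 2 rather than 3: X'X is singular, so the OLS coefficients are indeterminate.
print(np.linalg.matrix_rank(X_perfect))

# Imperfect collinearity: X2 = 2*X1 + v, where v is a small random term.
x2_imperfect = 2.0 * x1 + rng.normal(scale=0.1, size=n)
X_imperfect = np.column_stack([np.ones(n), x1, x2_imperfect])
# Full rank (3): the coefficients can be estimated, but their standard errors are large.
print(np.linalg.matrix_rank(X_imperfect))
```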
If multicollinearity is less than perfect, even if it is very high, the OLS estimators still retain the BLUE property: they are linear, unbiased and best (minimum variance). Estimation of the regression coefficients is possible; however, their standard errors tend to be large. As a result, the population values of the coefficients cannot be estimated precisely.
Under multicollinearity, the standard errors of the estimators become very sensitive to even the slightest change in the data. Note that multicollinearity concerns only linear relationships between two or more explanatory variables; it is not about nonlinear relationships between variables.
5.1.1 Causes/Sources of multicollinearity
1. Low variability in the values of the explanatory variables, i.e., sampling over a limited range of the values taken by the regressors in the population.
2. Constraints/restrictions imposed on the model or on the population being sampled.
3. Model specification error, for example adding polynomial terms to a regression model, especially when the range of the X variable is small.
4. An overdetermined model. This happens when the model has a large number of explanatory variables but only a few observations. It can occur in medical research, where information may be collected on a large number of variables for a small number of patients.
5.1.2 Consequences of Multicollinearity
• In cases of near or high multicollinearity, one is likely to encounter the following consequences. As long as multicollinearity among the explanatory variables is not exact/perfect, we can estimate the coefficients and the OLS estimators retain their BLUE property. Less-than-perfect multicollinearity does not destroy the BLUE property of the OLS estimators, but this does not mean that multicollinearity causes no problems.
1. If there is perfect collinearity among the explanatory variables, the coefficients are indeterminate and their standard errors are not defined. However, even if collinearity is high (but not perfect), estimation of the regression coefficients is possible; their standard errors will nevertheless be large, and as a result the population values of the coefficients cannot be estimated precisely.
2. The variances, covariances and standard errors of the OLS estimators become larger under multicollinearity. This has serious statistical consequences.
3. Underestimation of the t-statistics of individual coefficients,

t(β̂ᵢ) = (β̂ᵢ − βᵢ) / se(β̂ᵢ).

The computed t-values/ratios become low because of the high standard errors. This leads to accepting the null hypothesis more easily (i.e., the true population coefficient can be declared zero, or statistically insignificant, more easily and more frequently).
4. Wrong confidence-interval estimation of the coefficients. Because of the large standard errors, the confidence intervals for the relevant population parameters tend to be wider.
5. Hypothesis testing and confidence-interval estimation will be biased and lead to faulty inferences/conclusions.
6. The OLS estimators and their standard errors are highly sensitive to small changes in the data.
• Multicollinearity inflates the variances, covariances and standard errors of the OLS estimators. Using a model with two explanatory variables, we can show how multicollinearity affects the variances and standard errors of the coefficients:

Yᵢ = β₀ + β₁X₁ + β₂X₂ + uᵢ

• The variances and covariance of the coefficients can be computed as follows (as discussed in Chapter 3). Let r₁₂ denote the correlation between the explanatory variables X₁ and X₂.
r₁₂ = Σx₁x₂ / √(Σx₁² · Σx₂²) ,   the correlation between x₁ and x₂

var(β̂₁) = σ² / [Σx₁² (1 − r₁₂²)] = (σ² / Σx₁²) × VIF

var(β̂₂) = σ² / [Σx₂² (1 − r₁₂²)] = (σ² / Σx₂²) × VIF

cov(β̂₁, β̂₂) = −r₁₂σ² / [(1 − r₁₂²) √(Σx₁² · Σx₂²)] = [−r₁₂σ² / √(Σx₁² · Σx₂²)] × VIF

where VIF = 1 / (1 − r₁₂²) > 1.
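A minimal Python sketch of these formulas on simulated data (the data-generating process, σ = 1 and all variable names are assumptions made for illustration): it computes r₁₂ and the VIF, and shows how var(β̂₁) is inflated relative to the no-collinearity case.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
sigma = 1.0                                      # assumed error standard deviation

x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=n)    # highly (but not perfectly) collinear with x1

# Work with deviations from the mean, as in the formulas above.
x1d, x2d = x1 - x1.mean(), x2 - x2.mean()

r12 = np.sum(x1d * x2d) / np.sqrt(np.sum(x1d**2) * np.sum(x2d**2))
vif = 1.0 / (1.0 - r12**2)

var_b1_if_r12_zero = sigma**2 / np.sum(x1d**2)   # var(b1_hat) if there were no collinearity
var_b1 = var_b1_if_r12_zero * vif                # var(b1_hat) under the actual collinearity

print(f"r12 = {r12:.3f}, VIF = {vif:.2f}")
print(f"var(b1_hat): {var_b1_if_r12_zero:.5f} without collinearity, {var_b1:.5f} with it")
```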
Here r₁₂ denotes the pairwise correlation between the explanatory variables X₁ and X₂, with 0 ≤ r₁₂² ≤ 1.

r₁₂ = Σx₁x₂ / √(Σx₁² · Σx₂²) = 1 when there is perfect collinearity between X₁ and X₂.
The Variance Inflation Factor (VIF)
The speed with which the variances and covariances increase can be seen from the Variance Inflation Factor (VIF), which is defined as

VIF = 1 / (1 − r₁₂²) ≥ 1 ,   where r₁₂ is the pairwise correlation between x₁ and x₂.

The VIF shows how the variances and covariances of the estimators are inflated by the presence of multicollinearity. As r₁₂² approaches 1, the VIF approaches infinity (becomes very large). That is, as the extent of collinearity increases, the variance of an estimator increases, and in the limit it can become infinite. As can readily be seen, if there is no collinearity between X₁ and X₂, then r₁₂² = 0 and VIF = 1.
If r₁₂ = 0, then VIF = 1. As r₁₂ approaches zero, that is, as the collinearity between X₁ and X₂ decreases, the VIF approaches its minimum of 1 and the variances of the coefficients get smaller and smaller.
As r₁₂ approaches one, that is, as collinearity increases, the VIF becomes a large number, the variances of the estimators increase, and in the limit, when r₁₂ = 1 (perfect collinearity), the VIF becomes infinite. It is equally clear that as r₁₂ increases towards 1, the covariance of the two estimators also increases in absolute value. Note also that a large variance implies a large standard error. The small sketch below illustrates how fast the VIF grows.
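A small sketch (the r₁₂ values are chosen purely for illustration) showing how quickly the VIF grows as r₁₂ approaches 1:

```python
# Illustrative r12 values only; not taken from any dataset in the text.
for r12 in (0.0, 0.5, 0.8, 0.9, 0.95, 0.99):
    vif = 1.0 / (1.0 - r12**2)
    print(f"r12 = {r12:4.2f}  ->  VIF = {vif:6.2f}")
# The VIF rises from 1.00 at r12 = 0 to roughly 50 at r12 = 0.99.
```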
“Insignificant” t-statistics. In hypothesis testing and confidence-interval estimation we compute t-statistics for the individual coefficients, and in computing the t-statistic of a coefficient we use its standard error. When there is high collinearity between the explanatory variables, the estimated standard errors of the estimators increase dramatically, so the computed t-values become smaller:

t(β̂ᵢ) = (β̂ᵢ − βᵢ) / se(β̂ᵢ)
A smaller computed t-statistic is always in favor of the null hypothesis. Therefore, in such cases one will increasingly accept the null hypothesis that the true population value is zero (we fail to reject the null hypothesis because the computed t-statistic becomes smaller due to the high standard error); we make a Type II error, accepting a false null hypothesis. A short numerical illustration is given below.
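As an illustrative worked example (the numbers are assumed, not taken from the text): suppose β̂ᵢ = 2 and we test H₀: βᵢ = 0. With se(β̂ᵢ) = 0.5 the computed t-ratio is 2/0.5 = 4, which is significant at conventional levels; if multicollinearity inflates the standard error to se(β̂ᵢ) = 2 (a sixteen-fold increase in the variance, i.e. VIF = 16), the t-ratio falls to 2/2 = 1 and the same coefficient would be declared statistically insignificant.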
VIF when there is a large number of explanatory variables:

VIFⱼ = 1 / (1 − Rⱼ²) > 1

Rⱼ² is obtained by regressing one of the explanatory variables (Xⱼ) on the other explanatory variables. What we have discussed can easily be extended to models with a larger number of explanatory variables (k − 1 > 2). In such a model, the variance of the coefficient of the jth explanatory variable can be expressed as:

var(β̂ⱼ) = (σ² / Σxⱼ²) × 1 / (1 − Rⱼ²) ,   j = 1, 2, 3, …, k − 1 (the number of regressors)
where β̂ⱼ is the (estimated) partial regression coefficient of the jth regressor and Rⱼ² is the R-squared obtained from the auxiliary regression of Xⱼ on the remaining k − 2 regressors. [Note: there are k − 1 regressors in a k-variable regression model.]

VIFⱼ = 1 / (1 − Rⱼ²)   and   var(β̂ⱼ) = (σ² / Σxⱼ²) × VIFⱼ

VIFⱼ is the variance inflation factor obtained from the regression of the jth variable on the other regressors.
var(β̂ⱼ) therefore depends on three factors: (1) σ², (2) VIFⱼ and (3) Σxⱼ². The last one, Σxⱼ², implies that the larger the variability in a regressor, the smaller the variance of the coefficient of that regressor and the greater the precision with which that coefficient can be estimated. A larger σ² reduces the precision with which the coefficient can be estimated; similarly, a high VIFⱼ reduces the precision with which the coefficient can be estimated. A sketch of computing VIFⱼ through auxiliary regressions is given below.
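A minimal sketch of how VIFⱼ could be computed through auxiliary regressions, using numpy on simulated data (the function name vif, the data-generating process and the sample size are assumptions made for this illustration, not part of the text):

```python
import numpy as np

def vif(X):
    """VIF_j for each column of X (an n x p matrix of regressors, without a constant),
    computed by regressing that column on the remaining columns plus an intercept."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])  # auxiliary regressors
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)                # auxiliary OLS fit
        resid = y - Z @ beta
        r2_j = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)    # R_j^2 of the auxiliary regression
        out[j] = 1.0 / (1.0 - r2_j)                                 # VIF_j = 1 / (1 - R_j^2)
    return out

# Simulated example: x3 is nearly a linear combination of x1 and x2.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.2, size=n)
print(vif(np.column_stack([x1, x2, x3])))   # VIFs for x1, x2, x3; collinear columns are far above 1
```

A packaged routine (for example, the variance_inflation_factor function in statsmodels) could be used instead; the manual version above is only meant to make the auxiliary-regression logic explicit.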
The inverse of the VIF is called the Tolerance Level (TOL). That is,

TOLⱼ = 1 / VIFⱼ = 1 − Rⱼ² = RSSⱼ / TSSⱼ

where RSSⱼ and TSSⱼ are the residual and total sums of squares from the auxiliary regression of Xⱼ on the other regressors.
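As an illustrative worked example (numbers assumed): if the auxiliary regression of Xⱼ on the other regressors gives Rⱼ² = 0.90, then VIFⱼ = 1/(1 − 0.90) = 10 and TOLⱼ = 1/VIFⱼ = 0.10. A common informal rule of thumb treats a VIFⱼ above about 10 (TOLⱼ below about 0.10) as a signal of serious multicollinearity.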