Chapter 4. Violation of Classical Assumptions and Diagnostic Testing
5. Violation of Assumptions
The classical assumptions do not always hold. If one or more of them are violated (say, in the presence of autocorrelation, heteroscedasticity and/or multicollinearity):
• Estimates of the parameters will not be accurate.
• The OLS estimators may lose the BLUE property.
• Hypothesis tests based on the standard t and F statistics will no longer be valid.
• The conclusions/inferences drawn will be misleading.
Formal tests are therefore required to identify whether these assumptions are satisfied.
5.1 Multicollinearity
One of the assumptions of the classical linear regression model is that there is no multicollinearity among the explanatory variables, the X's. Broadly interpreted, multicollinearity refers to the situation where there is either an exact or a less-than-exact linear relationship among the explanatory (X) variables.
Multicollinearity can be perfect or less than perfect. When there is perfect collinearity among the explanatory variables, the regression coefficients are indeterminate and their standard errors are infinite. There is perfect multicollinearity between two explanatory variables if one can be expressed as a constant multiple of the other. Suppose we have the model:
Yᵢ = β₀ + β₁X₁ + β₂X₂ + uᵢ, where X₁ and X₂ are explanatory variables.
Then, if X₁ can be expressed as a constant multiple of X₂, we say there is perfect multicollinearity between them:
X₁ = λX₂, where λ is a non-zero constant (λ ≠ 0).
If X₁ = λX₂ ± Vᵢ, where Vᵢ is a random term, there is no perfect correlation; this implies less-than-perfect (imperfect) collinearity between X₁ and X₂.
If multicollinearity is less than perfect, even if it is very high, the OLS estimators still retain the BLUE property: linear, unbiased, best (minimum variance). Estimation of the regression coefficients is possible; however, their standard errors tend to be large. As a result, the population values of the coefficients cannot be estimated precisely.
Under multicollinearity, the standard errors of the estimators become very sensitive to even the slightest change in the data. Note that multicollinearity is only about linear relationships between two or more explanatory variables; it is not about nonlinear relationships between variables.
5.1.1 Causes/Sources of Multicollinearity
1. Low variability in the values of the explanatory variables, i.e., sampling over a limited range of the values taken by the regressors in the population.
2. Constraints/restrictions imposed on the model or on the population being sampled.
3. Model specification error, for example adding polynomial terms to a regression model, especially when the range of the X variable is small.
4. An overdetermined model. This happens when the model has a large number of explanatory variables but only a few observations. It can occur in medical research, where information on a large number of variables may be collected for a small number of patients.
5.1.2 Consequences of Multicollinearity
In cases of near or high multicollinearity, one is likely to encounter the following consequences. As long as multicollinearity among the explanatory variables is not exact/perfect, we can estimate the coefficients and the OLS estimators retain their BLUE property: less-than-perfect multicollinearity does not destroy the BLUE property of the OLS estimators. But this does not mean that multicollinearity causes no problems.
1. If there is perfect collinearity among the explanatory variables, the coefficients are indeterminate and their standard errors are not defined. If collinearity is high (but not perfect), estimation of the regression coefficients is possible, but their standard errors are large; as a result, the population values of the coefficients cannot be estimated precisely.
2. The variances, covariances and standard errors of the OLS estimators become larger under multicollinearity. This has serious statistical consequences.
3. Underestimation of the t-statistics of the individual coefficients, t = (β̂ᵢ − βᵢ)/se(β̂ᵢ). The computed t-ratios become low because of the high standard errors. This leads to acceptance of the null hypothesis more easily (i.e., the true population coefficient can be declared zero, or statistically insignificant, more easily and frequently).
4. Wrong confidence-interval estimation of the coefficients. Because of the large standard errors, the confidence intervals for the relevant population parameters tend to be wider.
5. Hypothesis testing and confidence-interval estimation will be biased and lead to faulty inferences/conclusions.
6. The OLS estimators and their standard errors become highly sensitive to small changes in the data.
Multicollinearity inflates the variances, covariances and standard errors of the OLS estimators. Using a model with two explanatory variables, we can show how multicollinearity affects the variances and standard errors of the coefficients:
Yᵢ = β₀ + β₁X₁ + β₂X₂ + uᵢ
The variances and covariance of the coefficients can be computed as follows (as discussed in Chapter 3). Let r₁₂ denote the correlation between the explanatory variables X₁ and X₂:
r₁₂ = Σx₁x₂ / √(Σx₁² Σx₂²)
var(β̂₁) = σ² / [Σx₁²(1 − r₁₂²)] = (σ²/Σx₁²) · VIF
var(β̂₂) = σ² / [Σx₂²(1 − r₁₂²)] = (σ²/Σx₂²) · VIF
cov(β̂₁, β̂₂) = −r₁₂σ² / [(1 − r₁₂²)√(Σx₁² Σx₂²)] = [−r₁₂σ² / √(Σx₁² Σx₂²)] · VIF
where VIF = 1/(1 − r₁₂²) > 1.
Here r₁₂ denotes the pairwise correlation between the explanatory variables X₁ and X₂, with 0 < r₁₂² < 1, and
r₁₂ = Σx₁x₂ / √(Σx₁² Σx₂²) = 1 when there is perfect collinearity between X₁ and X₂.
The Variance Inflation Factor (VIF)
The speed with which the variances and covariances increase can be seen with the Variance Inflation Factor (VIF), which is defined as
VIF = 1/(1 − r₁₂²) ≥ 1, where r₁₂ is the pairwise correlation between X₁ and X₂.
The VIF shows how the variances and covariances of the estimators are inflated by the presence of multicollinearity. As r₁₂² approaches 1, the VIF approaches infinity (becomes very large). That is, as the extent of collinearity increases, the variance of an estimator increases, and in the limit it can become infinite. As can be readily seen, if there is no collinearity between X₁ and X₂, then r₁₂² = 0 and VIF = 1.
If r₁₂ = 0, then VIF = 1. As r₁₂ approaches zero, that is, as collinearity between X₁ and X₂ decreases, the VIF approaches its minimum of 1 and the variances of the coefficients get lower and lower.
As r₁₂ approaches one, that is, as collinearity increases, the VIF becomes a large number and the variances of the estimators increase; in the limit, when r₁₂ = 1 (perfect collinearity), the VIF becomes infinite. It is equally clear that as r₁₂ increases to 1, the covariance of the two estimators also increases in absolute value. Note also that a large variance implies a large standard error.
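As a quick numeric illustration of this relationship (a minimal sketch, not part of the original slides; the correlation values below are hypothetical), the VIF can be tabulated directly from r₁₂:

```python
def vif_from_r(r12: float) -> float:
    """Variance Inflation Factor implied by a pairwise correlation r12."""
    return 1.0 / (1.0 - r12 ** 2)

# Illustrative correlation values (hypothetical, not taken from the text)
for r12 in [0.0, 0.5, 0.8, 0.9, 0.95, 0.99]:
    print(f"r12 = {r12:4.2f}  ->  VIF = {vif_from_r(r12):7.2f}")
```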
"Insignificant" t-statistics: in hypothesis testing and confidence-interval estimation we compute t-statistics for the individual coefficients, and these t-statistics use the standard errors of the coefficients.
When there is high collinearity between explanatory variables, the estimated standard errors of the estimators increase dramatically, making the computed t-values smaller:
t = (β̂ᵢ − βᵢ)/se(β̂ᵢ)
A smaller computed t-statistic is always in favor of the null hypothesis. Therefore, in such cases, one will increasingly accept the null hypothesis that the true population value is zero (we fail to reject the null hypothesis because the computed t-statistic becomes smaller due to the high standard error); we make a Type II error, accepting a wrong null hypothesis.
VIF when there is a large number of explanatory variables:
VIF_j = 1/(1 − R_j²) > 1
where R_j² is obtained from regressing one of the explanatory variables (X_j) on the other explanatory variables. What we have discussed can easily be extended to models with a larger number of explanatory variables (k − 1 > 2). In such a model, the variance of the coefficient of the jth explanatory variable can be expressed as:
var(β̂_j) = (σ²/Σx_j²) × 1/(1 − R_j²),  j = 1, 2, 3, …, k − 1 (the number of regressors)
Here β̂_j is the (estimated) partial regression coefficient of the jth regressor and R_j² is the R-squared obtained from the auxiliary regression of X_j on the remaining k − 2 regressors. [Note: there are k − 1 regressors in a k-variable regression model.]
VIF_j = 1/(1 − R_j²)  and  var(β̂_j) = (σ²/Σx_j²) × VIF_j
VIF_j is the variance inflation factor obtained from the regression of the jth explanatory variable on the other regressors.
var(β̂_j) depends on three factors: (1) σ², (2) VIF_j, and (3) Σx_j².
The last one, Σx_j², implies that the greater the variability in a regressor, the smaller the variance of its coefficient and the greater the precision with which that coefficient can be estimated. A larger σ² reduces the precision with which the coefficient can be estimated; similarly, a high VIF_j reduces that precision.
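The general VIF_j can be computed directly from the auxiliary-regression R_j² described above. The following is a minimal Python sketch of that calculation; the data, variable names and the near-collinear construction of x2 are hypothetical illustrations, assuming statsmodels is available:

```python
import numpy as np
import statsmodels.api as sm

def vif_j(X: np.ndarray, j: int) -> float:
    """VIF of regressor j, via the auxiliary regression of X_j on the other columns of X."""
    y_aux = X[:, j]
    X_aux = sm.add_constant(np.delete(X, j, axis=1))
    r2_j = sm.OLS(y_aux, X_aux).fit().rsquared
    return 1.0 / (1.0 - r2_j)

# Hypothetical data: x2 is nearly a multiple of x1, so their VIFs should be large
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 2 * x1 + rng.normal(scale=0.1, size=100)   # near-perfect collinearity with x1
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])
print([round(vif_j(X, j), 1) for j in range(X.shape[1])])
```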
The inverse of the VIF is called the Tolerance Level (TOL). That is,
TOL_j = 1/VIF_j = 1 − R_j² = RSS_j/TSS_j
When R_j² = 1 (perfect collinearity), TOL_j = 0; when R_j² = 0 (no collinearity whatsoever), TOL_j = 1. Because of the intimate connection between VIF and TOL, one can use them interchangeably.
Detection of multicollinearity: although there are no sure methods of detecting collinearity, there are several indicators of it, which are as follows:
A) High R² and insignificant coefficients. The clearest sign of multicollinearity is a very high R² together with insignificant computed t-statistics for most regression coefficients. A high R² also implies a large F-statistic. Hence, the F-test will in most cases reject the null hypothesis that the partial slope coefficients are simultaneously equal to zero, while the individual t-tests may show that none of the partial slope coefficients is significant.
B) High pairwise correlations among regressors.
• If there are only two explanatory variables, intercorrelation can be measured by the pairwise (simple) correlation coefficient.
• If the pairwise correlation coefficient between two regressors is high, say in excess of 0.75, then multicollinearity is a serious problem. In models involving just two explanatory variables, a fairly good idea of collinearity can be obtained by examining the simple correlation coefficient between the two variables: if this correlation is high, multicollinearity is present.
C) A high R² but very low partial correlations between the dependent and explanatory variables can also indicate the existence of multicollinearity. For models with three or more explanatory variables, intercorrelation can be measured by partial correlation coefficients and by the R² obtained from regressing one of the X variables on all the other X variables.
D) F-test on auxiliary regressions. An auxiliary model is a model derived from the original regression model, with some modification. Multicollinearity arises if one or more of the regressors are linear combinations of the other regressors.
This is checked by running an auxiliary regression: regress one of the explanatory variables, X_j, on the remaining explanatory (X) variables in the model and obtain the corresponding R_j². Such regressions are called auxiliary regressions, auxiliary (supplementary) to the main regression of Y on the X's. Then we test the hypothesis using the F-statistic computed as follows:
F = [R_j²/(k − 1)] / [(1 − R_j²)/(n − k)] = [R_j²/(1 − R_j²)] × [(n − k)/(k − 1)] ~ F(k − 1, n − k)
where j = 1, 2, …; k is the number of variables in the auxiliary model; and k − 1 is the number of explanatory variables in the auxiliary regression and the numerator degrees of freedom.
The degrees of freedom for the denominator are n − k from the auxiliary model, where n is the sample size.
R_j² is the R-squared from the auxiliary regression, i.e., the regression of the jth explanatory variable (X_j) on the remaining explanatory variables.
1 − R_j² = RSS_j/TSS_j, obtained from the auxiliary regression, with df = n − k.
Decision: if the computed F-statistic > the F-critical value at the chosen level of significance, it is taken to mean that the particular X_j is collinear with the other X's; in that case we have to decide whether that X variable should be dropped from the model. If it is less than the F-critical value, we say that X_j is not collinear with the other X's and hence we may retain that variable in the model.
Illustration: suppose we have the following five-variable regression model (with four regressors):
Yᵢ = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄ + uᵢ
where k = 5, j = 1, 2, 3, 4 and k − 1 = 4.
If we want to detect whether X₄ is collinear with the other explanatory variables, we construct and run the following auxiliary regression of X₄ on X₁, X₂, X₃:
X₄ = b₀ + b₁X₁ + b₂X₂ + b₃X₃ + vᵢ  (auxiliary model)
Note that we have k − 2 = 3 explanatory variables in the auxiliary model.
Hypotheses:
H₀: b₁ = b₂ = b₃ = 0 (no multicollinearity between X₄ and the remaining regressors)
H₁: at least one of b₁, b₂, b₃ ≠ 0 (there is multicollinearity between X₄ and the other regressors)
Run the auxiliary model (regress X₄ on X₁, X₂ and X₃), compute R_j² = R₄² of the auxiliary model and the F-statistic as shown above, and make a decision based on the F-critical value. The F-critical value for a given level of alpha is obtained from the F-table at (k − 1, n − k) degrees of freedom.
Decision rule: if the computed F-statistic > the F-critical value, reject H₀, since there is significant correlation between X₄ and one or more of the remaining explanatory variables. Hence, we should drop X₄ from the original model and regress Y on X₁, X₂, X₃.
If the computed F-statistic < the F-critical value, accept H₀, since there is no evidence of statistically significant collinearity between X₄ and the remaining explanatory variables; hence, retain X₄ in the original model.
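A sketch of this auxiliary-regression F-test in Python follows. The data are simulated purely for illustration (X₄ is deliberately built to be collinear with X₁), and the degrees of freedom follow the (k − 1, n − k) convention used in this chapter; treat it as an assumption-laden example rather than a definitive recipe:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

# Simulated data: column 3 (X4) is built to be collinear with column 0 (X1)
rng = np.random.default_rng(1)
n = 50
X = rng.normal(size=(n, 4))
X[:, 3] = 0.9 * X[:, 0] + 0.1 * rng.normal(size=n)

# Auxiliary regression: X4 on X1, X2, X3
aux = sm.OLS(X[:, 3], sm.add_constant(X[:, :3])).fit()

# F-statistic and critical value using this chapter's df convention (k - 1, n - k), k = 5
k = 5
F = (aux.rsquared / (k - 1)) / ((1 - aux.rsquared) / (n - k))
F_crit = stats.f.ppf(0.95, k - 1, n - k)
print(f"F = {F:.2f}, F-critical(5%) = {F_crit:.2f}, X4 collinear: {F > F_crit}")
```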
E) When R_j² > R². Instead of formally testing all auxiliary R² values, one may adopt what is called Klein's rule of thumb, which suggests that multicollinearity is a troublesome problem only if an R_j² obtained from an auxiliary regression is greater than the overall R² obtained from the regression of Y on all the regressors in the model, i.e., R_j² > R². Of course, like all other rules of thumb, this one should be used judiciously.
F) Tolerance (TOL) and the Variance Inflation Factor (VIF) as indicators of multicollinearity.
• We have already introduced TOL and VIF. As R_j², the coefficient of determination in the regression of the jth explanatory variable on the remaining regressors, increases, the VIF also increases; that is, the collinearity of X_j with the other regressors increases.
If the VIF of a variable is greater than 5, which happens when R_j² > 0.80, that variable is said to be highly collinear with the other variables in the model. Of course, one can use TOL_j as a measure of multicollinearity in view of its intimate connection with VIF_j.
The closer TOL_j is to zero, the greater the degree of collinearity of that variable with the other regressors. On the other hand, the closer TOL_j is to 1, the greater the evidence that X_j is not collinear with the other regressors.
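As one way to carry out this VIF/TOL screening in practice (a sketch, not part of the original text), statsmodels provides variance_inflation_factor; the data frame below is hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical design matrix with one collinear pair (X2 is roughly 3 * X1)
rng = np.random.default_rng(42)
df = pd.DataFrame({"X1": rng.normal(size=200), "X3": rng.normal(size=200)})
df["X2"] = 3 * df["X1"] + rng.normal(scale=0.2, size=200)

exog = sm.add_constant(df[["X1", "X2", "X3"]])
for i, name in enumerate(exog.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(exog.values, i)
    print(f"{name}: VIF = {vif:7.1f}, TOL = {1 / vif:.3f}")
```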
How to get rid of the problem?
Again, there are no sure methods, only a few rules of thumb. Some of these are as follows:
• Omitting/dropping collinear variables,
• Using extraneous or prior information,
• Combining cross-sectional and time-series data (pooling the data),
• Obtaining additional or new data (increasing the number of observations),
• Transforming the data, such as regressing in first-difference form.
5.2. Heteroscedasticity
• One of the assumptions of the CLRM is homoscedasticity of the error term, that is, the variance of the error terms must be constant. In practice, the variance of the error term may increase or decrease with the dependent variable or with the independent variables.
• The assumption of homoscedasticity is needed for hypothesis tests (t-tests, F-tests) and confidence intervals based on OLS estimation of the linear regression model to be reliable.
• The problem of heteroscedasticity is likely to be more common in cross-sectional data than in time-series data.
E(uᵢ²|Xᵢ) = σ²: the conditional variance of the error term is the same for all values of X. Even as the explanatory variable changes, the variance of the error terms corresponding to each value of X remains the same.
Sources of heteroscedasticity:
1. Existence of outliers. Heteroscedasticity can arise as a result of the presence of outliers; "outliers" are observations in the data set that are very different from the others, i.e., very high or very low values relative to the other observations in the sample.
2. Omission of relevant variables (specification error of the regression model). Heteroscedasticity can arise due to the omission of important variables from the model.
3. Heteroscedasticity can also arise because of incorrect data transformation (e.g., ratio or first-difference transformations) and incorrect functional form (e.g., linear versus log-linear models).
5.2.1 Consequences of Heteroscedasticity
The OLS estimators remain linear and unbiased in the presence of heteroscedasticity; heteroscedasticity does not cause bias or inconsistency in the OLS estimators. The adverse effects are briefly highlighted below.
1. In the presence of heteroscedasticity the OLS estimators cannot be BLUE (cannot be efficient/best estimators). As long as the variance of the error terms is not constant, OLS is no longer BLUE: the OLS estimators do not satisfy the minimum-variance condition.
2. The variances and standard errors of the OLS estimators will be larger when heteroscedasticity is present.
The variance of a regression coefficient, var(β̂ᵢ), is directly related to the variance of the error terms: var(β̂ᵢ) changes as the variance of the error terms changes, and will become larger relative to the homoscedastic variances of the coefficients.
3. Test statistics and confidence intervals constructed for the parameters are incorrect in the presence of heteroscedasticity. This is due to the bias in the standard errors of the coefficients used in computing the test statistics and the intervals.
4. Faulty statistical inferences. Since testing hypotheses is such an important component of any econometric analysis, and the usual OLS inference is generally faulty in the presence of heteroscedasticity, the test statistics (t, χ², F) so computed will be biased and unreliable for hypothesis testing.
Tests for Heteroscedasticity
i) The White test for heteroscedasticity (χ² test)
It is a chi-square test. Given the following three-variable regression function:
Yᵢ = β₀ + β₁X₁ + β₂X₂ + uᵢ
the White test proceeds as follows.
Step 1: Given the data, regress Yᵢ on X₁ and X₂, obtain the residuals ûᵢ and square them (ûᵢ²).
Step 2 (auxiliary regression): Regress the squared residuals (ûᵢ²) obtained in Step 1 on the original regressors (X₁ and X₂), their squares (X₁² and X₂²) and their cross product (X₁X₂):
ûᵢ² = b₀ + b₁X₁ + b₂X₂ + b₃X₁² + b₄X₂² + b₅X₁X₂ + vᵢ  (auxiliary regression)
Here vᵢ is the error term; the auxiliary model has six variables (k = 6) and five regressors (k − 1 = 5).
The hypotheses to be tested are:
H₀: b₁ = b₂ = b₃ = b₄ = b₅ = 0 (the variance is homoscedastic)
H₁: at least one bⱼ ≠ 0 (the variance is not homoscedastic)
The null hypothesis asserts that the variance of the error term is constant (homoscedasticity), while the alternative asserts there is a heteroscedasticity problem (the variance of the error terms is not constant). Accepting the null means the variance of the error terms is constant, while rejecting H₀ implies the existence of heteroscedasticity.
Step 3: Obtain R² from Step 2 and compute the following χ²-statistic:
χ² = n·R² ~ χ²(df), where df = k − 1, the number of regressors in the auxiliary regression (in this case five regressors: X₁, X₂, X₁², X₂² and X₁X₂), and n is the sample size.
n·R² asymptotically follows the χ² (chi-square) distribution with degrees of freedom equal to the number of regressors in the auxiliary regression. The χ²-critical value is obtained from the χ² table at the given degrees of freedom and level of alpha.
Decision: if the computed χ²-statistic > the χ²(df) critical value, we reject H₀ and conclude that the variances of the error terms are not homoscedastic, i.e., there is a problem of heteroscedasticity.
• If the computed χ²-statistic < the χ²(df) critical value, accept H₀ and conclude there is no heteroscedasticity problem: the variance of the error term is constant.
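A sketch of the White test using statsmodels' het_white is shown below; the simulated data (with error variance tied to X₁) and all parameter values are illustrative assumptions:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

# Hypothetical data with error variance growing in X1 (to trigger the test)
rng = np.random.default_rng(0)
n = 200
X1, X2 = rng.uniform(1, 10, n), rng.uniform(1, 10, n)
u = rng.normal(scale=0.5 * X1)          # heteroscedastic errors
y = 2 + 1.5 * X1 - 0.8 * X2 + u

exog = sm.add_constant(np.column_stack([X1, X2]))
resid = sm.OLS(y, exog).fit().resid

# het_white regresses resid**2 on the regressors, their squares and cross products
lm_stat, lm_pval, f_stat, f_pval = het_white(resid, exog)
print(f"n*R^2 = {lm_stat:.2f}, p-value = {lm_pval:.4f}")  # small p-value -> reject H0
```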
ii) Koenker–Bassett (KB) test (t-test)
The KB test is also based on the squared residuals ûᵢ², but instead of being regressed on one or more regressors, the squared residuals are regressed on the squared estimated values of the regressand. This test is useful when there are many regressors in the model and we do not know which regressor is causing the problem. The test procedure is described briefly below.
Suppose the original model is: Yᵢ = β₀ + β₁X₁ + β₂X₂ + … + β_kX_k + uᵢ.
From the estimation of this model we obtain ûᵢ and Ŷᵢ, and then run:
ûᵢ² = ρ₀ + ρ₁Ŷᵢ² + vᵢ, where Ŷᵢ are the estimated values of Y from the first model.
We use a t-statistic to test the following hypotheses:
H₀: ρ₁ = 0 (no heteroscedasticity problem in the data)
H₁: ρ₁ ≠ 0 (the data contain heteroscedasticity)
H₀ says the variance is constant (homoscedasticity, no heteroscedasticity problem in the data), while the alternative says the data contain heteroscedasticity (the variance is not constant).
If the computed t-statistic > the t-critical value at a given level of α, we reject H₀ and conclude the data contain heteroscedasticity. If the computed t-statistic < the t-critical value, accept H₀ and conclude there is no heteroscedasticity problem.
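A minimal sketch of the KB auxiliary regression, implemented by hand with statsmodels OLS; the simulated data and coefficient values are hypothetical:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical heteroscedastic data (same spirit as the White-test sketch above)
rng = np.random.default_rng(3)
n = 150
X = sm.add_constant(rng.uniform(1, 10, size=(n, 3)))
y = X @ np.array([1.0, 0.5, -0.3, 0.8]) + rng.normal(scale=0.4 * X[:, 1])

fit = sm.OLS(y, X).fit()
# KB auxiliary regression: squared residuals on squared fitted values
aux = sm.OLS(fit.resid ** 2, sm.add_constant(fit.fittedvalues ** 2)).fit()
t_stat, p_val = aux.tvalues[1], aux.pvalues[1]   # test rho1 = 0
print(f"t = {t_stat:.2f}, p-value = {p_val:.4f}")  # small p-value -> heteroscedasticity
```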
iii) Goldfeld–Quandt (GQ) test (F-test)
This test is applicable for large samples, where the number of observations is at least twice the number of explanatory variables. In addition, the test assumes normality and no autocorrelation. Goldfeld and Quandt suggest the following steps for a two-variable regression function Yᵢ = β₀ + β₁Xᵢ + uᵢ:
Step 1: Rearrange the observations in ascending order of the explanatory variable (lowest to highest values).
Step 2: Divide the observations into three parts: n₁ observations in the first part, p middle observations, and n₂ observations in the second part. Usually p is taken to be about one-sixth of n, and n₁ + n₂ + p = n.
Step 3: Fit an OLS regression to the first n₁ observations and obtain RSS₁ (from the lower observations, those which lie below the omitted middle observations).
Similarly, regress the remaining n₂ observations and obtain RSS₂ (from the upper values, those which lie above the central observations).
Step 4: Compute the F-statistic as follows:
F = (RSS₂/df₂) / (RSS₁/df₁), where df₁ = n₁ − k and df₂ = n₂ − k; when n₁ = n₂, df₁ = df₂ and the statistic reduces to F = RSS₂/RSS₁.
Hypotheses to be tested:
H₀: homoscedasticity (constant variance): σ₁² = σ₂²
H₁: heteroscedasticity (the variance is not constant): σ₁² ≠ σ₂²
• It is a test of whether the variance in the two data sets (the lower and upper values) is the same or not. Accepting H₀ implies the variance is the same in the two data sets.
• Decision rule: reject H₀ (homoscedastic variance) if the F-statistic > the F-critical value at the given level of significance and degrees of freedom (df₁ and df₂); we then conclude there is a heteroscedasticity problem in the data (the variance of the error term is not constant).
• If the F-statistic < the F-critical value, we accept the null and conclude there is no heteroscedasticity problem.
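Below is a hedged sketch of the GQ procedure using statsmodels' het_goldfeldquandt, which fits the two sub-regressions and forms the RSS ratio internally; the data and the one-sixth drop fraction are illustrative assumptions:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Hypothetical two-variable data whose error variance grows with X
rng = np.random.default_rng(7)
n = 120
x = np.sort(rng.uniform(1, 20, n))                # Step 1: ascending order of X
y = 3 + 0.7 * x + rng.normal(scale=0.3 * x)       # heteroscedastic errors
exog = sm.add_constant(x)

# Steps 2-4: drop roughly the middle sixth, fit the two sub-regressions, form the F ratio
f_stat, p_val, _ = het_goldfeldquandt(y, exog, drop=1 / 6)
print(f"GQ F = {f_stat:.2f}, p-value = {p_val:.4f}")  # small p-value -> reject H0
```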
iv) Other tests used to detect heteroscedasticity include:
• Spearman's rank correlation test,
• The Breusch–Pagan test,
• The Glejser test.
A sketch of the Breusch–Pagan variant is given below.
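For completeness, the Breusch–Pagan test can be run via statsmodels' het_breuschpagan; the data here are again hypothetical:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Hypothetical data in the same style as the earlier heteroscedastic examples
rng = np.random.default_rng(11)
n = 200
x1, x2 = rng.uniform(1, 10, n), rng.uniform(1, 10, n)
y = 1 + 0.5 * x1 + 0.2 * x2 + rng.normal(scale=0.3 * x1)

exog = sm.add_constant(np.column_stack([x1, x2]))
resid = sm.OLS(y, exog).fit().resid
lm_stat, lm_pval, _, _ = het_breuschpagan(resid, exog)   # regresses resid^2 on exog
print(f"BP LM = {lm_stat:.2f}, p-value = {lm_pval:.4f}")
```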
Solution to the Heteroscedasticity Problem
1) Generalized Least Squares (GLS) or Weighted Least Squares (WLS) method
The GLS method is used to eliminate the problem of heteroscedasticity and is capable of producing estimators that are BLUE.
GLS transforms the original variables in such a way that the transformed variables satisfy the assumptions of the classical model, and then applies OLS to them. GLS is OLS on transformed variables that satisfy the standard least-squares assumptions. The estimators thus obtained are known as GLS estimators, and it is these estimators that are BLUE.
Consider a two-variable regression function:
Yᵢ = β₀ + β₁Xᵢ + uᵢ  (two-variable model)
Now, assume that the heteroscedastic variances σᵢ² are known. Divide the above equation by σᵢ to obtain:
Yᵢ/σᵢ = β₀(1/σᵢ) + β₁(Xᵢ/σᵢ) + uᵢ/σᵢ
or, in transformed variables, Yᵢ* = β₀*X₀ᵢ + β₁*X₁ᵢ* + uᵢ*. We run this transformed model to generate the estimators; the transformed error term uᵢ* = uᵢ/σᵢ now has constant variance equal to 1.
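A sketch of this WLS/GLS transformation in Python, under the assumption that σᵢ is known (here it is constructed as proportional to X purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical setup where the error standard deviation is proportional to X
rng = np.random.default_rng(5)
n = 200
x = rng.uniform(1, 10, n)
sigma_i = 0.5 * x                                   # assumed known form of heteroscedasticity
y = 2 + 1.2 * x + rng.normal(scale=sigma_i)

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
wls = sm.WLS(y, X, weights=1.0 / sigma_i ** 2).fit()  # weight 1/sigma_i^2 = divide model by sigma_i
print("OLS s.e.:", np.round(ols.bse, 3), " WLS s.e.:", np.round(wls.bse, 3))
```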
2) Log transformation of the data
Given Yᵢ = β₀ + β₁Xᵢ + uᵢ, transform the variables into logs and run the regression in log form:
ln Yᵢ = β₀ + β₁ ln Xᵢ + uᵢ
Such a transformation often reduces heteroscedasticity, because taking logs compresses the scales in which the variables are measured, thereby reducing a tenfold difference between two values to a twofold difference.
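A minimal sketch of the log-log regression; the positive-valued data are simulated for illustration only:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical positive-valued data; both Y and X must be > 0 for the log form
rng = np.random.default_rng(9)
x = rng.uniform(1, 50, 150)
y = np.exp(0.8 + 0.6 * np.log(x) + rng.normal(scale=0.2, size=150))

loglog = sm.OLS(np.log(y), sm.add_constant(np.log(x))).fit()
print(loglog.params)   # the slope in the log-log form is interpreted as an elasticity
```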
5.3 Autocorrelation / Serial Correlation
• Another violation of the classical assumptions occurs when two or more consecutive error terms are correlated with each other. The no-autocorrelation assumption says that the error term at time t should not be correlated with the error term at any other point in time. This means that when observations are made over time, the effect of the disturbance occurring in one period does not carry over into another period.
Autocorrelation between error terms is a more common problem in time-series data than in cross-sectional data.
E(u_t u_s) = cov(u_t, u_s) = 0 for t ≠ s  (no autocorrelation)
The violation of this assumption implies a non-zero covariance or correlation:
E(u_t u_s) = cov(u_t, u_s) ≠ 0
Causes of Autocorrelation
1. Inertia effect: most time-series economic variables, such as GNP, price indices, unemployment, and production, exhibit cycles; the value of the series at one point in time tends to be greater than (or less than) its previous value.
2. Non-stationarity of the time series: a time series is stationary if its mean, variance and covariances are time-invariant (constant). When the mean, variance and covariances of a time-series variable are not constant over time, it is called non-stationary. A non-stationary time series can cause an autocorrelation problem.
3. Specification bias (incorrect functional form): for example, if the "true" or correct model in a cost-output study is nonlinear in output but a linear form is estimated, the omitted systematic component ends up in the error term and induces autocorrelation.
4. Omission of variables: most economic variables tend to be correlated with each other. If we omit such explanatory variables, their influence is captured by the random term, leading to serial correlation among the error terms. Suppose the correct demand model is:
[a] Y_t = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + u_t
but we instead run the following demand model:
[b] Y_t = β₀ + β₁X₁ + β₂X₂ + v_t
Now, if [a] is the correct model (the true relation), then the error term from running [b] will be:
v_t = β₃X₃ + u_t, where the omission of X₃ leads to autocorrelation among the residuals.
5. Manipulation and transformation of the data may also induce autocorrelation into the data, even if the original data do not contain serial correlation.
Consequences of Autocorrelation
Using OLS in the presence of serial correlation between the error terms has the following consequences:
1. Although the coefficients remain linear, unbiased, and asymptotically normally distributed in the presence of autocorrelation, the OLS estimators cannot be BLUE.
2. When there is autocorrelation between the error terms, the estimated variance of the residuals is biased and underestimates the true variance: var(ûᵢ) < σ².
3. As a consequence, the variances and standard errors of the OLS estimators are underestimated, and test statistics computed from these standard errors are invalid.
4. Since the variance of the residuals is underestimated, R² and the regression F-statistic are overestimated.
5. The underestimation of the standard errors of the coefficients leads to overestimation of the t-values of the individual coefficients. Hence, a coefficient is declared statistically significant (H₀: βᵢ = 0 is rejected) too easily.
6. The confidence intervals for individual coefficients, based on these underestimated standard errors, are likely to be narrower than they should be.
7. Hypothesis testing using the usual t, F, and χ² statistics may not be valid and could lead to faulty inferences; these test statistics are biased when serial correlation is present and can lead to misleading conclusions.
Tests for Autocorrelation
1. The Durbin–Watson test
The most celebrated test for detecting serial correlation was developed by the statisticians Durbin and Watson. It is popularly known as the Durbin–Watson d-statistic and is used when the error terms are generated by an AR(1) process. The test assumes the disturbance terms u_t are generated by the first-order autoregressive scheme, AR(1):
u_t = ρu_{t−1} + v_t
It cannot be used to detect higher-order autoregressive schemes.
ρ̂ = Σû_t û_{t−1} / Σû_t², with −1 ≤ ρ ≤ 1
The Durbin–Watson d-statistic is computed as:
d = 2(1 − ρ̂), with −1 ≤ ρ ≤ 1 and 0 ≤ d ≤ 4
• If ρ = 1, the d-statistic is 0: there is perfect positive autocorrelation.
• If ρ = −1, the d-statistic is 4: there is perfect negative autocorrelation.
• If ρ = 0, d = 2: there is neither positive nor negative autocorrelation.
The d-critical values have (k − 1, n) degrees of freedom, where n is the sample size and k − 1 is the number of regressors; they are obtained from the Durbin–Watson table.
The hypotheses to be tested are:
H₀: no autocorrelation   H₁: there is autocorrelation
The Durbin–Watson statistical table provides two critical d-values, a lower and an upper value, denoted d_L (lower) and d_U (upper), for a given level of significance (usually α = 0.05), sample size n and number of regressors (k − 1).
• Example: at α = 0.05, n = 20, k − 1 = 1, i.e., df = (1, 20), the two critical d-values are d_L = 1.201 and d_U = 1.411.
Decision rules: see the table below.
Null hypothesis | Decision | If
No positive autocorrelation | Reject | 0 < d < d_L
No positive autocorrelation | No decision | d_L ≤ d ≤ d_U
No negative autocorrelation | Reject | 4 − d_L < d < 4
No negative autocorrelation | No decision | 4 − d_U ≤ d ≤ 4 − d_L
No autocorrelation, positive or negative | Do not reject | d_U < d < 4 − d_U
Example: n = 40, r² = 0.9584, d = 0.1229, σ̂² = 2.6755, α = 0.05.
At α = 0.05 and df = (1, 40), where 1 is the number of regressors and 40 the number of observations, the critical d-values are d_L = 1.442 and d_U = 1.544.
Determine whether the error terms contain serial correlation at the 5% level of significance.
Solution: since the computed d-statistic 0.1229 < d_L, we reject H₀ (no positive autocorrelation); there is statistically significant evidence of positive serial correlation in the residuals.
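A sketch of computing the d-statistic with statsmodels' durbin_watson on simulated AR(1) errors (ρ = 0.8 and the trend coefficients are illustrative assumptions):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical time-series data with AR(1) errors (rho = 0.8)
rng = np.random.default_rng(2)
n = 40
x = np.arange(n, dtype=float)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()
y = 5 + 0.3 * x + u

fit = sm.OLS(y, sm.add_constant(x)).fit()
d = durbin_watson(fit.resid)
print(f"d = {d:.3f}")   # compare with dL = 1.442 and dU = 1.544 for n = 40, k - 1 = 1
```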
2. Breusch–Godfrey (BG) test: χ² test statistic
The BG test is also known as the LM (Lagrange Multiplier) test. The Durbin–Watson test is applicable only to AR(1), an autoregressive process of order one. The BG test can be used to test for an AR process of any order, AR(m), where m denotes the maximum number of lags used in the auxiliary model. The test proceeds as follows for the model
Y_t = β₀ + β₁X_t + u_t
Assume the error terms u_t follow an AR process of order m:
u_t = ρ₁u_{t−1} + ρ₂u_{t−2} + ρ₃u_{t−3} + … + ρ_m u_{t−m} + v_t
Step 1: Regress Y_t on X_t using OLS and obtain the residuals û_t.
Step 2: Run the auxiliary model; regress û_t on X_t and on û_{t−1}, û_{t−2}, û_{t−3}, û_{t−4} (assuming m = 4):
û_t = b₀ + b₁X_t + ρ₁û_{t−1} + ρ₂û_{t−2} + ρ₃û_{t−3} + ρ₄û_{t−4} + v_t
Obtain R² from this auxiliary regression.
Step 3 (hypotheses to be tested):
H₀: ρ₁ = ρ₂ = ρ₃ = ρ₄ = 0   and   H₁: at least one ρⱼ ≠ 0
H₀ states there is no autocorrelation; H₁ states there is serial correlation between the error terms.
Step 4: Compute the χ²-statistic (chi-square statistic):
χ² = (n − m)R² ~ χ²(m)
which follows the χ² distribution with df = m (the maximum number of lags used), where R² is obtained from the auxiliary regression.
Decision: reject H₀ if the χ²-statistic > the χ²-critical value at the chosen level of significance; this indicates the existence of a significant autocorrelation problem. Accept H₀ if the χ²-statistic < the χ²-critical value at the chosen level of significance and m degrees of freedom; this indicates the absence of autocorrelation between the error terms.
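A sketch of the BG test via statsmodels' acorr_breusch_godfrey, with simulated AR(2) errors as a hypothetical example (m = 4 lags, matching the illustration above):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Hypothetical time series with AR(2) errors, so the DW test alone would not be enough
rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(2, n):
    u[t] = 0.5 * u[t - 1] + 0.3 * u[t - 2] + rng.normal()
y = 1 + 0.8 * x + u

results = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(results, nlags=4)  # m = 4 lags
print(f"LM = {lm_stat:.2f}, p-value = {lm_pval:.4f}")   # small p-value -> autocorrelation
```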
