CH 04 Wooldridge 5e PPT
Analysis: Inference
Chapter 4
© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Multiple Regression
Analysis: Inference
Statistical inference in the regression model
Hypothesis tests about population parameters
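Assumption MLR.6 (Normality of error terms)
$u \sim \text{Normal}(0, \sigma^2)$, independently of $x_1, x_2, \dots, x_k$
It follows that:
$y \mid \mathbf{x} \sim \text{Normal}(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \sigma^2)$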
Discussion of the normality assumption
The error term is the sum of "many" different unobserved factors
Sums of many independent factors are approximately normally distributed (CLT)
Problems:
• How many different factors? Number large enough?
• Possibly very heterogeneous distributions of individual factors
• How independent are the different factors?
The normality of the error term is an empirical question
At least the error distribution should be "close" to normal
In many cases, normality is questionable or impossible by definition
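The CLT argument above can be illustrated with a small simulation. This is only a sketch and not part of the original slides: the choice of 30 uniform factors, the number of draws, and the function names are assumptions made for illustration.

```python
import random
import statistics


def simulate_error_terms(n_factors=30, n_draws=5000, seed=0):
    """Draw error terms that are sums of independent uniform 'factors'.

    The uniform(-1, 1) factors are a hypothetical choice; real unobserved
    factors may be far more heterogeneous or dependent.
    """
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_factors))
        for _ in range(n_draws)
    ]


def share_within_1_96_sd(draws):
    """Share of draws within 1.96 sample standard deviations of the mean."""
    mean = statistics.fmean(draws)
    sd = statistics.stdev(draws)
    return sum(abs(d - mean) <= 1.96 * sd for d in draws) / len(draws)


errors = simulate_error_terms()
# For a normal distribution this share would be about 0.95.
print(round(share_within_1_96_sd(errors), 3))
```

Reducing `n_factors` to one or two, or making the factors strongly dependent, is a quick way to see how the approximation can deteriorate.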
Discussion of the normality assumption (cont.)
Examples where normality cannot hold:
• Wages (nonnegative; also: minimum wage)
• Number of arrests (takes on a small number of integer values)
• Unemployment (indicator variable, takes on only 1 or 0)
In some cases, normality can be achieved through transformations
of the dependent variable (e.g. use log(wage) instead of wage)
Under normality, OLS is the best unbiased estimator, even among nonlinear estimators
Important: For the purposes of statistical inference, the assumption
of normality can be replaced by a large sample size
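Terminology
MLR.1 – MLR.5: "Gauss-Markov assumptions"
MLR.1 – MLR.6: "Classical linear model (CLM) assumptions"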
Testing hypotheses about a single population parameter
Theorem 4.2 (t-distribution for standardized estimators)
Under assumptions MLR.1 – MLR.6:
$(\hat\beta_j - \beta_j) / \text{se}(\hat\beta_j) \sim t_{n-k-1}$
Note: The t-distribution is close to the standard normal distribution if n-k-1 is large.
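t-statistic (or t-ratio)
$t_{\hat\beta_j} = \hat\beta_j / \text{se}(\hat\beta_j)$
The t-statistic will be used to test the null hypothesis $H_0: \beta_j = 0$.
The farther the estimated coefficient is from zero, the
less likely it is that the null hypothesis holds true. But what
does "far" from zero mean? The t-statistic answers this by measuring the distance in units of the estimated standard error.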
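As a minimal sketch of the ratio just described, the t-statistic for the null hypothesis of a zero coefficient is the estimate divided by its standard error. The function name and the numbers below are hypothetical and are not taken from any example in these slides.

```python
def t_statistic(coef, se):
    """t-ratio for H0: beta_j = 0, i.e. the estimate in standard-error units."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return coef / se


# Hypothetical estimate of 0.5 with standard error 0.25:
print(t_statistic(0.5, 0.25))  # 2.0, i.e. two standard errors away from zero
```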
Testing against one-sided alternatives (greater than zero)
Test $H_0: \beta_j = 0$ against $H_1: \beta_j > 0$.
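A one-sided test against a positive alternative rejects the null hypothesis when the t-statistic exceeds a critical value. The sketch below assumes the degrees of freedom are large enough for the standard normal approximation, so it takes the critical value from the normal distribution rather than the exact t-distribution. The function name and inputs are illustrative.

```python
from statistics import NormalDist


def reject_one_sided_greater(t_stat, alpha=0.05):
    """Decide H0: beta_j = 0 against H1: beta_j > 0 at significance level alpha.

    Uses the standard normal critical value, which is an approximation
    that is only appropriate when n - k - 1 is large.
    """
    critical = NormalDist().inv_cdf(1.0 - alpha)
    return t_stat > critical, critical


reject, critical = reject_one_sided_greater(2.0)
print(reject, round(critical, 3))  # True 1.645
```

With few degrees of freedom the exact t critical value is larger, so this approximation would reject too often.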
Example: Wage equation
Test whether, after controlling for education and tenure, higher work
experience leads to higher hourly wages
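[Estimated wage equation; standard errors are reported in parentheses below the coefficients]
Test $H_0: \beta_{\text{experience}} = 0$ against $H_1: \beta_{\text{experience}} > 0$.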
One would either expect a positive effect of experience on hourly wage or no effect at all.
Example: Wage equation (cont.)
t-statistic
Degrees of freedom: $n - k - 1$; here they are large enough that the standard normal approximation applies
Testing against one-sided alternatives (less than zero)
Test $H_0: \beta_j = 0$ against $H_1: \beta_j < 0$.
Example: Student performance and school size
Test whether smaller school size leads to better student performance
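Variables in the regression:
• Percentage of students passing the maths test (dependent variable)
• Average annual teacher compensation
• Staff per one thousand students
• School enrollment (= school size)
Test $H_0: \beta_{\text{enrollment}} = 0$ against $H_1: \beta_{\text{enrollment}} < 0$.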
Example: Student performance and school size (cont.)
t-statistic
Degrees of freedom: $n - k - 1$; here they are large enough that the standard normal approximation applies
One cannot reject the hypothesis that there is no effect of school size on
student performance (not even for a lax significance level of 15%).
Example: Student performance and school size (cont.)
Alternative specification of functional form:
Test $H_0$: the coefficient on school size equals zero, against $H_1$: it is negative.
Example: Student performance and school size (cont.)
t-statistic
(small effect)
Testing against two-sided alternatives
Test $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$.
Example: Determinants of college GPA
[Estimated equation; one explanatory variable is the number of lectures missed per week]
"Statistically significant“ variables in a regression
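If a regression coefficient is significantly different from zero in a two-sided test, the
corresponding variable is said to be "statistically significant"
If the number of degrees of freedom is large enough so that the normal
approximation applies, the following rules of thumb apply:
• |t-ratio| > 1.645: statistically significant at the 10% level
• |t-ratio| > 1.96: statistically significant at the 5% level
• |t-ratio| > 2.576: statistically significant at the 1% level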
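Under the normal approximation, the usual two-sided critical values are 1.645, 1.96 and 2.576 for the 10%, 5% and 1% levels. The following sketch turns these rules of thumb into a small helper; the function name and return convention are assumptions made for illustration.

```python
def significance_level(t_ratio):
    """Return the strictest conventional level at which |t| is significant.

    Returns 0.01, 0.05 or 0.10, or None if the coefficient is not
    significant even at the 10% level. Relies on standard normal
    critical values, so it presumes large degrees of freedom.
    """
    abs_t = abs(t_ratio)
    for threshold, level in ((2.576, 0.01), (1.96, 0.05), (1.645, 0.10)):
        if abs_t > threshold:
            return level
    return None


for t in (3.0, -2.0, 1.7, 1.0):  # hypothetical t-ratios
    print(t, significance_level(t))
```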
Guidelines for discussing economic and statistical significance
If a variable is statistically significant, discuss the magnitude of the
coefficient to get an idea of its economic or practical importance
The fact that a coefficient is statistically significant does not necessarily
mean it is economically or practically significant!
If a variable is statistically and economically important but has the
"wrong" sign, the regression model might be misspecified
If a variable is statistically insignificant at the usual levels (10%, 5%,
1%), one may think of dropping it from the regression
If the sample size is small, effects might be imprecisely estimated so
that the case for dropping insignificant variables is less strong
Testing more general hypotheses about a regression coefficient
Null hypothesis: $H_0: \beta_j = a_j$
where $a_j$ is the hypothesized value of the coefficient
t-statistic: $t = (\hat\beta_j - a_j) / \text{se}(\hat\beta_j)$
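For a general hypothesized value, the estimate is centered at that value before dividing by the standard error. The sketch below assumes a two-sided alternative and the standard normal approximation; the function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def t_test_against_value(coef, se, hypothesized, alpha=0.05):
    """Two-sided test of H0: beta_j = hypothesized.

    Returns the t-statistic and whether H0 is rejected at level alpha,
    using a standard normal critical value (large degrees of freedom).
    """
    t_stat = (coef - hypothesized) / se
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return t_stat, abs(t_stat) > critical


# Hypothetical estimate 1.5 with standard error 0.25, tested against 1.0:
t_stat, reject = t_test_against_value(1.5, 0.25, 1.0)
print(t_stat, reject)  # 2.0 True
```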
Example: Campus crime and enrollment
An interesting hypothesis is whether crime increases by one percent
if enrollment is increased by one percent, i.e. whether the elasticity of crime with respect to enrollment equals one
The hypothesis is rejected at the 5% level
Computing p-values for t-tests
If the significance level is made smaller and smaller, there will be a
point where the null hypothesis cannot be rejected anymore
The reason is that lowering the significance level makes one increasingly
unwilling to commit the error of rejecting a correct H0
The smallest significance level at which the null hypothesis is still
rejected is called the p-value of the hypothesis test
A small p-value is evidence against the null hypothesis because one
would reject the null hypothesis even at small significance levels
A large p-value means the data provide little evidence against the null hypothesis
P-values are more informative than tests at fixed significance levels
How the p-value is computed (here: two-sided test)
[Figure: density of the t-distribution; the marked critical values are those for a 5% significance level]
In the two-sided case, the p-value is thus the probability that the t-distributed variable takes
on a larger absolute value than the realized value of the test statistic:
$p\text{-value} = P(|T| > |t|)$
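A minimal sketch of the two-sided p-value computation follows. It assumes the degrees of freedom are large, so the standard normal distribution stands in for the t-distribution; with small degrees of freedom the exact t-distribution would give a somewhat larger p-value. The function name is illustrative.

```python
from statistics import NormalDist


def two_sided_p_value(t_stat):
    """P(|T| > |t|) under the standard normal approximation."""
    upper_tail = 1.0 - NormalDist().cdf(abs(t_stat))
    return 2.0 * upper_tail


# A realized t-statistic of 1.96 gives a p-value of roughly 0.05.
print(round(two_sided_p_value(1.96), 3))
```

Comparing the returned value with a chosen significance level reproduces the fixed-level test: the null hypothesis is rejected whenever the p-value is below that level.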
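Confidence intervals
Simple manipulation of the result in Theorem 4.2 implies that
$P\big(\hat\beta_j - c \cdot \text{se}(\hat\beta_j) \le \beta_j \le \hat\beta_j + c \cdot \text{se}(\hat\beta_j)\big) = 1 - \alpha$
where $c$ is the critical value of the two-sided test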
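The resulting interval is the estimate plus and minus the two-sided critical value times the standard error. The sketch below again assumes the normal approximation for the critical value; the function names and numbers are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(coef, se, level=0.95):
    """Interval coef +/- c * se, with c the two-sided normal critical value."""
    c = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return coef - c * se, coef + c * se


def rejects_value(interval, hypothesized):
    """A hypothesized value outside the interval is rejected (two-sided)."""
    lower, upper = interval
    return not (lower <= hypothesized <= upper)


ci = confidence_interval(1.0, 0.5)  # hypothetical estimate and standard error
print(tuple(round(x, 2) for x in ci))  # (0.02, 1.98)
print(rejects_value(ci, 0.0))          # True: zero lies outside the interval
```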
Confidence intervals for typical confidence levels
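Relationship to hypothesis tests: if $a_j$ is not in the confidence interval, reject $H_0: \beta_j = a_j$ in favor of $H_1: \beta_j \neq a_j$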
Example: Model of firms' R&D expenditures
Variables: spending on R&D (dependent variable), annual sales, profits as a percentage of sales
The effect of sales on R&D is relatively precisely estimated as the interval is narrow. Moreover, the effect is significantly different from zero because zero is outside the interval.
The effect of profits is imprecisely estimated as the interval is very wide. It is not even statistically significant because zero lies in the interval.
Testing hypotheses about a linear combination of parameters
Example: Return to education at 2 year vs. at 4 year colleges
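Explanatory variables include years of education at 2 year colleges (coefficient $\beta_1$) and years of education at 4 year colleges (coefficient $\beta_2$)
Test $H_0: \beta_1 - \beta_2 = 0$ against $H_1: \beta_1 - \beta_2 < 0$.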
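A t-statistic for the difference between the two estimated coefficients, $(\hat\beta_1 - \hat\beta_2) / \text{se}(\hat\beta_1 - \hat\beta_2)$, is impossible to compute with standard regression output because
$\text{se}(\hat\beta_1 - \hat\beta_2) = \sqrt{\widehat{Var}(\hat\beta_1) + \widehat{Var}(\hat\beta_2) - 2\,\widehat{Cov}(\hat\beta_1, \hat\beta_2)}$
and the covariance term is usually not reported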
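The standard error of a difference of two estimates depends on their covariance, which is why the two reported standard errors alone are not enough. The sketch below shows the calculation when the covariance is available; the function name and all numbers are hypothetical.

```python
import math


def se_of_difference(var_1, var_2, cov_12):
    """Standard error of (b1 - b2) from estimated variances and covariance."""
    variance = var_1 + var_2 - 2.0 * cov_12
    if variance < 0:
        raise ValueError("inputs do not form a valid covariance matrix")
    return math.sqrt(variance)


# Hypothetical variances 0.09 and 0.16:
print(round(se_of_difference(0.09, 0.16, 0.0), 3))    # 0.5 if the estimates are uncorrelated
print(round(se_of_difference(0.09, 0.16, 0.045), 3))  # 0.4 with positive covariance
```

Ignoring a positive covariance would overstate the standard error and understate the t-statistic for the difference.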
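Alternative method: with $\beta_1$ and $\beta_2$ the returns to 2 year and 4 year colleges, define $\theta_1 = \beta_1 - \beta_2$ and test $H_0: \theta_1 = 0$; substituting $\beta_1 = \theta_1 + \beta_2$ into the model yields a new regressor, total years of college
Estimation results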
Testing multiple linear restrictions: The F-test
Testing exclusion restrictions
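Variables: salary of major league baseball player (dependent variable), years in the league, average number of games per year, batting average, home runs per year, runs batted in per year
Test $H_0$: the coefficients on batting average, home runs per year and runs batted in per year are all zero, against $H_1$: $H_0$ is not true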
Estimation of the unrestricted model
Idea: How would the model fit change if these variables were dropped from the regression?
Estimation of the restricted model
The sum of squared residuals necessarily increases, but is the increase statistically significant?
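Whether the increase in the sum of squared residuals is large enough is judged with the F statistic, which compares the relative increase per restriction with the unrestricted residual variance. The sketch below uses the standard formula $F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}$; the function name and the numbers are hypothetical and do not come from the baseball example.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for q exclusion restrictions.

    df_unrestricted is n - k - 1 in the unrestricted model.
    """
    if ssr_restricted < ssr_unrestricted:
        raise ValueError("restricted SSR cannot be below unrestricted SSR")
    numerator = (ssr_restricted - ssr_unrestricted) / n_restrictions
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator


# Hypothetical sums of squared residuals, 2 restrictions, 50 degrees of freedom:
print(f_statistic(120.0, 100.0, 2, 50))  # 5.0
```

The statistic is then compared with a critical value of the F distribution with (q, n-k-1) degrees of freedom, which has to come from a table or a statistics library.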
Rejection rule (Figure 4.7)
Test decision in example
[The F statistic is compared with the F distribution whose degrees of freedom are the number of restrictions to be tested and the degrees of freedom in the unrestricted model]
Discussion
The three variables are "jointly significant"
They were not significant when tested individually
The likely reason is multicollinearity between them
Test of overall significance of a regression
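For the test of overall significance, the null hypothesis is that all k slope coefficients are zero, and the F statistic can be written in terms of the R-squared as $F = \frac{R^2/k}{(1-R^2)/(n-k-1)}$. The sketch below implements that standard formula; the function name and the numbers are hypothetical.

```python
def overall_f_statistic(r_squared, n_obs, n_regressors):
    """F statistic for H0: all slope coefficients are zero."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    df = n_obs - n_regressors - 1
    if df <= 0:
        raise ValueError("need more observations than parameters")
    return (r_squared / n_regressors) / ((1.0 - r_squared) / df)


# Hypothetical regression: R-squared 0.5, 23 observations, 2 regressors.
print(round(overall_f_statistic(0.5, 23, 2), 2))  # 10.0
```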
Testing general linear restrictions with the F-test
Example: Test whether house price assessments are rational
Variables include: actual house price; the assessed housing value (before the house was sold); size of lot (in feet)
Unrestricted regression
Test statistic
The null hypothesis cannot be rejected
Regression output for the unrestricted regression