ECON 107
Due Tuesday, September 27
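1. (Bonus) Derive the asymptotic distribution for $b_0$. Hint: use $b_0 = \bar{Y} - \bar{X} b_1$ and $\sqrt{n}(b_1 - \beta_1) = W_{n2}/a_n$,
where $W_{n2} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \bar{X}) U_i$ and $a_n = n^{-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$. Check equation 4.22 in S&W.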
2. Suppose the true specification is $Y_i = \beta_1 X_i^2 + U_i$, $E(U_i | X_i) = 0$. However, you assume that $E(Y | X) =
\beta_0 + \beta_1 X_i$ and run the univariate OLS anyway, i.e. regress $Y_i$ on $(1, X_i)$ and obtain the OLS estimator
$(b_0, b_1)$. Show that
$$b_1 \to_p \beta_1 \, \frac{E X^3 - E X \, E X^2}{\mathrm{Var}(X)}.$$
3. You are interested in studying the factors that influence a person's decision of whether to go to college.
Therefore, you have collected data from 3796 high-school graduates, 6 years after they graduated from
high school. You can assume you have an iid sample. In particular, you observe their total years
of education (yrsed), which ranges from 12 to 18, and whether or not at least one of their parents
graduated from college.
(a) Out of the 3796 people in your dataset, 954 have at least one parent who graduated from college.
The average years of education ($\overline{\mathit{yrsed}}_c$) for this group is 14.8 years with a sample standard
deviation ($s_c$) of 1.74. The remaining 2842 people in the sample have parents who did not
graduate from college. In this group, the average years of education ($\overline{\mathit{yrsed}}_{nc}$) is 13.5 years with
a sample standard deviation ($s_{nc}$) of 1.72. Using this information, construct 95% confidence
intervals for the population means of years of education for each group.
(b) Now, let's analyze the data using a univariate regression. Using the same dataset, you construct a
dummy variable (parcol), which is equal to 1 if at least one of the person's parents graduated from
college and 0 if not. The following simple regression was estimated using data on all n = 3796 observations:
$\mathit{yrsed} = \beta_0 + \beta_1 \, \mathit{parcol} + u$
$\widehat{\mathit{yrsed}} = 13.5 + 1.30 \, \mathit{parcol}$, $R^2 = 0.095$
What is the interpretation of the constant in this regression? Does it have a meaningful interpretation in this model?
(c) What is the interpretation of $\beta_1$ in this regression? Using the regression results, what is the
predicted mean of yrsed for people with at least one parent who graduated from college?
(d) Can you test the null hypothesis that there is no difference in the mean of yrsed between the two
groups of people using a 1% level of significance? What is your conclusion?
(e) In addition to the information on parents' college status, you have also collected information on
the distance to the nearest college (dist) in tens of miles (dist has a range of 0 to 16). You decide
to run another regression, this time using dist as the regressor. Here are the results:
$\widehat{\mathit{yrsed}} = 13.9 - 0.073 \, \mathit{dist}$, $R^2 = 0.01$
What is the interpretation of the constant in this regression? What is the interpretation of $\beta_1$?
Is the slope statistically significant at the 5% level?
(f) What is the interpretation of $R^2$ in this regression? Does the low value of $R^2$ imply that the
coefficient on dist is not statistically significant at the 1% level?
(g) Using the regression results, what is the predicted mean years of education for a person who lives
13 miles from the nearest college? How about a person who lives 100 miles from the nearest college?
(h) Construct a 95% confidence interval for the expected decrease in mean years of education associated with moving 20 miles farther away from the nearest college.