Panel 2
Panel 2
Panel 2
2 / 26
Preliminary:
I use the following package
lfe package.
3 / 26
Panel Data Regression
I use the dataset Fatalities in AER package.
See https://www.rdocumentation.org/packages/AER/versions/1.2-6/topics/Fatalities for
details.
library(AER)
data(Fatalities)
str(Fatalities)
5 / 26
As a preliminary analysis, let's plot the relationship between fatality rate and beer tax in
1998.
Fatalities %>%
mutate(fatal_rate = fatal / pop * 10000) %>%
filter(year == "1988") -> data
plot(x = data$beertax,
y = data$fatal_rate,
xlab = "Beer tax (in 1988 dollars)",
ylab = "Fatality rate (fatalities per 10000)",
main = "Traffic Fatality Rates and Beer Taxes in 1988",
pch = 20,
col = "steelblue")
6 / 26
Positive correlation between alcohol tax and traffic accident. Possibly due to omitted
variable bias.
7 / 26
Run fixed effect regression using felm command in lfe package.
https://www.rdocumentation.org/packages/lfe/versions/2.8-3/topics/felm
library("lfe")
Fatalities %>%
mutate(fatal_rate = fatal / pop * 10000) -> data
# OLS
result_ols <- felm( fatal_rate ~ beertax | 0 | 0 | 0, data = data )
summary(result_ols, robust = TRUE)
8 / 26
##
## Call:
## felm(formula = fatal_rate ~ beertax | 0 | 0 | 0, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.09060 -0.37768 -0.09436 0.28548 2.27643
##
## Coefficients:
## Estimate Robust s.e t value Pr(>|t|)
## (Intercept) 1.85331 0.04713 39.324 < 2e-16 ***
## beertax 0.36461 0.05285 6.899 2.64e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5437 on 334 degrees of freedom
## Multiple R-squared(full model): 0.09336 Adjusted R-squared: 0.09065
## Multiple R-squared(proj model): 0.09336 Adjusted R-squared: 0.09065
## F-statistic(full model, *iid*):34.39 on 1 and 334 DF, p-value: 1.082e-08
## F-statistic(proj model): 47.59 on 1 and 334 DF, p-value: 2.643e-11
9 / 26
# State FE
result_stateFE <- felm( fatal_rate ~ beertax | state | 0 | state, data = data )
summary(result_stateFE, robust = TRUE)
##
## Call:
## felm(formula = fatal_rate ~ beertax | state | 0 | state, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.58696 -0.08284 -0.00127 0.07955 0.89780
##
## Coefficients:
## Estimate Cluster s.e. t value Pr(>|t|)
## beertax -0.6559 0.2919 -2.247 0.0294 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1899 on 287 degrees of freedom
## Multiple R-squared(full model): 0.905 Adjusted R-squared: 0.8891
## Multiple R-squared(proj model): 0.04074 Adjusted R-squared: -0.1197
## F-statistic(full model, *iid*):56.97 on 48 and 287 DF, p-value: < 2.2e-16
## F-statistic(proj model): 5.05 on 1 and 47 DF, p-value: 0.02936
10 / 26
# State and Year FE
result_bothFE <- felm( fatal_rate ~ beertax | state + year | 0 | state, data = data )
summary(result_bothFE, robust = TRUE)
##
## Call:
## felm(formula = fatal_rate ~ beertax | state + year | 0 | state, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.59556 -0.08096 0.00143 0.08234 0.83883
##
## Coefficients:
## Estimate Cluster s.e. t value Pr(>|t|)
## beertax -0.6400 0.3539 -1.809 0.0769 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1879 on 281 degrees of freedom
## Multiple R-squared(full model): 0.9089 Adjusted R-squared: 0.8914
## Multiple R-squared(proj model): 0.03606 Adjusted R-squared: -0.1492
## F-statistic(full model, *iid*):51.93 on 54 and 281 DF, p-value: < 2.2e-16
## F-statistic(proj model): 3.271 on 1 and 47 DF, p-value: 0.07692
11 / 26
Report results using texreg. Note that
Setting "robust" option TRUE reports Heteroskedasticity-robust SE for the first column.
Automatically report Cluster-Robust SE for the second and the third columns.
library(texreg)
12 / 26
##
## ===================================================
## fatal_rate
## ----------------------------
## Model 1 Model 2 Model 3
## ---------------------------------------------------
## (Intercept) 1.853
## (0.047)
## beertax 0.365 -0.656 -0.640
## (0.053) (0.292) (0.354)
## ---------------------------------------------------
## State FE No Yes Yes
## Year FE No No Yes
## Num. obs. 336 336 336
## Adj. R^2 (full model) 0.091 0.889 0.891
## Adj. R^2 (proj model) 0.091 -0.120 -0.149
## Num. groups: state 48 48
## Num. groups: year 7
## ===================================================
13 / 26
What if we do not use the cluster-robust standard error?
# State FE w. CRS
result_w_CRS <- felm( fatal_rate ~ beertax | state | 0 | state, data = data )
14 / 26
##
## =========================================
## fatal_rate
## ------------------
## Model 1 Model 2
## -----------------------------------------
## beertax -0.656 -0.656
## (0.203) (0.292)
## -----------------------------------------
## Num. obs. 336 336
## Adj. R^2 (full model) 0.889 0.889
## Adj. R^2 (proj model) -0.120 -0.120
## Num. groups: state 48 48
## =========================================
## SE of `Model 1` is "Heteroskedasticity-Robust", while one of `Model 2` is "Cluster-Robust."
15 / 26
Panel + IV
16 / 26
Panel Data with Instrumental Variables
Revisit the demand for Cigaretts
Consider the following model
where
Qit is the number of packs per capita in state i in year t,
Pit is the after-tax average real price per pack of cigarettes, and
incomeit is the real income per capita. This is demand shifter.
17 / 26
As an IV for the price, we use the followings:
SalesT ax : the proportion of taxes on cigarettes arising from the general sales tax.
it
18 / 26
# load the data set and get an overview
library(AER)
data("CigarettesSW")
CigarettesSW %>%
mutate( rincome = (income / population) / cpi) %>%
mutate( rprice = price / cpi ) %>%
mutate( salestax = (taxs - tax) / cpi ) %>%
mutate( cigtax = tax/cpi ) -> Cigdata
19 / 26
Run IV regression with panel data.
# OLS
result_1 <- felm( log(packs) ~ log(rprice) + log(rincome) | 0 | 0 | state, data = Cigdata )
# State FE
result_2 <- felm( log(packs) ~ log(rprice) + log(rincome) | state | 0 | state, data = Cigdata )
# IV without FE
result_3 <- felm( log(packs) ~ log(rincome) | 0 | (log(rprice) ~ salestax + cigtax) |
state, data = Cigdata )
# IV with FE
result_4 <- felm( log(packs) ~ log(rincome) | state | (log(rprice) ~ salestax + cigtax) |
state, data = Cigdata )
20 / 26
##
## =========================================================
## log(packs)
## ----------------------------------
## Model 1 Model 2 Model 3 Model 4
## ---------------------------------------------------------
## (Intercept) 10.067 9.736
## (0.464) (0.555)
## log(rprice) -1.334 -1.210
## (0.174) (0.143)
## log(rincome) 0.318 0.121 0.257 0.204
## (0.212) (0.218) (0.204) (0.238)
## `log(rprice)(fit)` -1.229 -1.268
## (0.183) (0.162)
## ---------------------------------------------------------
## Num. obs. 96 96 96 96
## Adj. R^2 (full model) 0.542 0.929 0.539 0.929
## Adj. R^2 (proj model) 0.542 0.793 0.539 0.792
## Num. groups: state 48 48
## =========================================================
21 / 26
felm command
22 / 26
Report heteroskedasticity robust standard error
# Run felm command without specifying cluster.
result_1 <- felm( log(packs) ~ log(rprice) + log(rincome) | 0 | 0 | state, data = Cigdata )
screenreg(l = list(result_1),
digits = 3,
# caption = 'title',
custom.model.names = c(" Model 1 "),
custom.coef.names = NULL, # add a class, if you want to change the names of variables.
include.ci = T,
include.rsquared = FALSE, include.adjrs = TRUE, include.nobs = TRUE,
include.pvalues = FALSE, include.df = FALSE, include.rmse = FALSE,
robust = F, # robust standard error
custom.header = list("log(packs)" = 1),
stars = numeric(0),
)
23 / 26
##
## ==================================
## log(packs)
## -----------
## Model 1
## ----------------------------------
## (Intercept) 10.067
## (0.464)
## log(rprice) -1.334
## (0.174)
## log(rincome) 0.318
## (0.212)
## ----------------------------------
## Num. obs. 96
## Adj. R^2 (full model) 0.542
## Adj. R^2 (proj model) 0.542
## ==================================
24 / 26
How to conduct F test after felm
# Run felm command without specifying cluster.
result_1 <- felm( packs ~ rprice + rincome | 0 | 0 | 0, data = Cigdata )
## F
## 49.22642
25 / 26
# The following tests H0: _b[rincome] - 1 = 0 & _b[rprice] = 0
ftest2 = waldtest(result_1, ~ rincome - 1 | rprice )
ftest2
## F
## 49.22642
26 / 26