Interpretation
-------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
bedrooms | -52357.9 6557.047 -7.98 0.000 -65212.55 -39503.25
heatsqft | 322.2198 6.214718 51.85 0.000 310.0363 334.4033
unemp | -56232.35 6460.177 -8.70 0.000 -68897.09 -43567.6
1.built01plus | -89030.65 7139.838 -12.47 0.000 -103027.8 -75033.47
_cons | 165190.9 38456.95 4.30 0.000 89798.61 240583.2
-------------------------------------------------------------------------------
How shall we interpret the estimated coefficients of this regression?
Everything else constant (ceteris paribus) we can conclude that on average:
• each additional bedroom leads to a price decrease of 52357.9 dollars
• each additional square foot leads to a price increase of 322.2 dollars
• an additional percentage point in the unemployment rate leads to a price decrease of
56232.3 dollars
• houses built after 2000 have a price that is 89030.7 dollars lower
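Because the model is linear, the effect of a larger change is just a multiple of the coefficient. A minimal sketch, assuming the housing data used above is in memory: lincom rescales the heatsqft coefficient to 100 extra square feet (roughly 32,222 dollars, given the coefficient of 322.2198) and reports the matching standard error and confidence interval.

```stata
* Assumption: the housing data from the cells above is in memory
quietly regress price bedrooms heatsqft unemp i.built01plus

* Effect of 100 additional square feet, with standard error and confidence interval
lincom 100*heatsqft

* Same point estimate computed directly from the stored coefficient
display "100 sq ft -> " 100*_b[heatsqft]
```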
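Since our model is of the type
$$ Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + u $$
the partial effect of each regressor is its coefficient, $\partial Y / \partial X_j = \beta_j$. The
marginal effects computed by margins are: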
-------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
bedrooms | -52357.9 6557.047 -7.98 0.000 -65212.55 -39503.25
heatsqft | 322.2198 6.214718 51.85 0.000 310.0363 334.4033
unemp | -56232.35 6460.177 -8.70 0.000 -68897.09 -43567.6
1.built01plus | -89030.65 7139.838 -12.47 0.000 -103027.8 -75033.47
-------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
As expected, the results are the same as those obtained with regress. This is because dy/dx
measures the impact of $\Delta X$ on $\Delta Y$ (for simplicity we omit the expectation operator on $Y$),
which in a linear model is the coefficient itself. But what if we want to look at the impact of
$\Delta X$ on $\Delta Y / Y$ (semi-elasticities)? With margins that is quite easy.
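For reference, the four margins options that combine levels and logs can be written as below for a generic regressor $X_j$. This is a sketch in the notation of this section, not a quotation from the Stata manual; in the linear model above $\partial Y / \partial X_j = \beta_j$.

```latex
\begin{aligned}
\texttt{dydx}:&\quad \frac{\partial Y}{\partial X_j} \\
\texttt{eydx}:&\quad \frac{\partial \log Y}{\partial X_j} = \frac{1}{Y}\,\frac{\partial Y}{\partial X_j} \\
\texttt{dyex}:&\quad \frac{\partial Y}{\partial \log X_j} = X_j\,\frac{\partial Y}{\partial X_j} \\
\texttt{eyex}:&\quad \frac{\partial \log Y}{\partial \log X_j} = \frac{X_j}{Y}\,\frac{\partial Y}{\partial X_j}
\end{aligned}
```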
[5]: margins, eydx(heatsqft unemp)
------------------------------------------------------------------------------
| Delta-method
| ey/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
heatsqft | -.0374337 228.2588 -0.00 1.000 -447.5235 447.4486
unemp | 6.532762 39834.55 0.00 1.000 -78086.42 78099.49
------------------------------------------------------------------------------
Recall that in this case the estimated semi-elasticity is given by $\beta_j / Y$. The question is which
$Y$ to use. margins uses the predicted value of $Y$, does this computation for each observation, and
then averages all values. We can replicate the result:
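[6]: * Refit the model quietly and store the fitted values in double precision
qui regress price bedrooms heatsqft unemp i.built01plus
qui predict double yhat
* 1/yhat for each observation; summarize leaves its mean in r(mean)
gen double invy=1/yhat
qui sum invy
* Average of b/yhat over all observations
di "The semi-elasticity is -> " _b[heatsqft]*r(mean)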
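The ey/dx table for heatsqft shown earlier has a negative point estimate and an enormous standard error even though the coefficient is positive. Since the estimate is the coefficient times the mean of $1/\hat{Y}$, that mean must be negative, so some fitted prices are below zero, and fitted values close to zero make $1/\hat{Y}$ explode. A minimal check, assuming the variables yhat and invy created in the cell above are still in memory:

```stata
* Assumption: yhat and invy exist from the replication cell above

* How many fitted prices are zero or negative?
count if yhat <= 0

* Extreme values of 1/yhat dominate its mean; the detailed summary shows the tails
summarize yhat invy, detail
```

If only a handful of observations drive the result, evaluating the semi-elasticity at a single representative point is usually easier to interpret than averaging over all observations.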
------------------------------------------------------------------------------
| Delta-method
| ey/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
heatsqft | .0010875 .0000245 44.32 0.000 .0010394 .0011356
------------------------------------------------------------------------------
The estimate is quite different, and far more precise, but the interpretation would be the same.
Everything else constant, an increase of one square foot in a house would on average lead to a price
increase of about 0.109% (100 × 0.0010875). Note that the change in $Y$ is relative, so the estimate
must be multiplied by 100 to be read as a percentage.
Finally, what if we wanted to compute the elasticity?
[8]: margins, eyex(heatsqft) atmean
------------------------------------------------------------------------------
| Delta-method
| ey/ex Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
heatsqft | 2.155189 .0486256 44.32 0.000 2.059862 2.250516
------------------------------------------------------------------------------
and we can conclude that, ceteris paribus, an increase of 1 percent in square footage will lead to
an increase of about 2.16 percent in price.
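The elasticity at the means can also be replicated by hand. The sketch below relies on a property of OLS with a constant: the prediction at the sample means of the regressors equals the sample mean of the dependent variable, so the elasticity is the coefficient times mean(heatsqft)/mean(price). The scalar names xbar and ybar are illustrative, and the housing data is assumed to be in memory.

```stata
* Assumption: the housing data from the cells above is in memory
quietly regress price bedrooms heatsqft unemp i.built01plus

* Sample means over the estimation sample (xbar and ybar are illustrative names)
quietly summarize price if e(sample)
scalar ybar = r(mean)
quietly summarize heatsqft if e(sample)
scalar xbar = r(mean)

* Elasticity at the means: b * xbar / ybar
display "Elasticity at the means -> " _b[heatsqft]*scalar(xbar)/scalar(ybar)
```

The displayed value should match the ey/ex estimate that margins reports with the atmean option.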
Suppose now that to the specification above we decide to add a quadratic term on heatsqft. We
could create a new variable and add it to the regression
[9]: gen heat2=heatsqft*heatsqft
regress price bedrooms heatsqft heat2 unemp i.built01plus
Residual | 2.7326e+14 5,059 5.4015e+10 R-squared = 0.4955
-------------+---------------------------------- Adj R-squared = 0.4950
Total | 5.4160e+14 5,064 1.0695e+11 Root MSE = 2.3e+05
-------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
bedrooms | -27927.4 6246.967 -4.47 0.000 -40174.16 -15680.64
heatsqft | -119.9291 18.30153 -6.55 0.000 -155.808 -84.05014
heat2 | .0855498 .0033553 25.50 0.000 .078972 .0921275
unemp | -50821.73 6085.551 -8.35 0.000 -62752.05 -38891.41
1.built01plus | -61857.04 6805.673 -9.09 0.000 -75199.1 -48514.97
_cons | 536362.9 39021.86 13.75 0.000 459863.1 612862.6
-------------------------------------------------------------------------------
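Since heat2 is the square of heatsqft, the partial effect of heatsqft needs to take this into account.
This is a model of the type
$$ Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \dots + u $$
so the partial effect is $\partial Y / \partial X = \beta_1 + 2 \beta_2 X$, which depends on $X$. Stata cannot know this
when heat2 is entered as a separate variable. Declaring the square with factor-variable notation,
as in the output below, lets margins apply the formula.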
-------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
bedrooms | -27927.4 6246.967 -4.47 0.000 -40174.16 -15680.64
heatsqft | -119.9291 18.30153 -6.55 0.000 -155.808 -84.05013
|
c.heatsqft#|
c.heatsqft | .0855498 .0033553 25.50 0.000 .078972 .0921275
|
unemp | -50821.73 6085.551 -8.35 0.000 -62752.05 -38891.41
1.built01plus | -61857.04 6805.673 -9.09 0.000 -75199.1 -48514.97
_cons | 536362.9 39021.86 13.75 0.000 459863.1 612862.6
-------------------------------------------------------------------------------
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
heatsqft | 219.1616 7.111178 30.82 0.000 205.2206 233.1026
------------------------------------------------------------------------------
and we can conclude that, everything else constant, on average, an increase of one square foot leads
to a price increase of 219 dollars.
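The average partial effect can be checked by hand, because the derivative $\beta_1 + 2\beta_2 X$ is linear in $X$ and its average therefore equals its value at the mean of heatsqft. The sketch below assumes the model was fit with factor-variable notation, as the output above suggests (the original command is not shown), and that the housing data is in memory. The sizes 1000, 2000 and 3000 square feet are illustrative values.

```stata
* Assumption: housing data in memory; the regress call mirrors the output shown above
quietly regress price bedrooms c.heatsqft##c.heatsqft unemp i.built01plus

* Average partial effect by hand: b1 + 2*b2*mean(heatsqft)
quietly summarize heatsqft if e(sample)
display "AME by hand -> " _b[heatsqft] + 2*_b[c.heatsqft#c.heatsqft]*r(mean)

* Partial effect at a few illustrative house sizes
margins, dydx(heatsqft) at(heatsqft=(1000 2000 3000))
```

With the coefficients reported above, the partial effect is about 51 dollars at 1000 square feet, 222 at 2000 and 393 at 3000, so the single average of 219 dollars hides a lot of variation.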
Similarly, if we want to include any interaction of variables we should use Stata's factor-variable
syntax. For example, suppose that for some reason we want to introduce as a regressor the interaction
between unemp and heatsqft. We could simply do
[11]: regress price bedrooms c.heatsqft##c.unemp i.built01plus
-------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
bedrooms | -53765.35 6496.683 -8.28 0.000 -66501.66 -41029.04
heatsqft | 779.3125 46.59624 16.72 0.000 687.9636 870.6613
unemp | 117899.9 18723.03 6.30 0.000 81194.64 154605.1
|
c.heatsqft#|
c.unemp | -89.233 9.016719 -9.90 0.000 -106.9097 -71.55633
|
1.built01plus | -92614.85 7081.681 -13.08 0.000 -106498 -78731.69
_cons | -721585.9 97367.25 -7.41 0.000 -912467.9 -530703.9
-------------------------------------------------------------------------------
and because we “told” Stata how the interaction variable was constructed margins would know
how to correctly compute the partial effects.
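With the interaction, the partial effect of heatsqft is no longer a single coefficient: from the estimates above it is 779.3125 − 89.233 × unemp. A minimal sketch of how margins can report it, assuming the housing data is in memory; the quartiles of unemp are just convenient evaluation points.

```stata
* Assumption: the housing data from the cells above is in memory
quietly regress price bedrooms c.heatsqft##c.unemp i.built01plus

* Average partial effect of heatsqft: b1 + b3*mean(unemp)
margins, dydx(heatsqft)

* Partial effect of heatsqft at the lower and upper quartiles of unemp
margins, dydx(heatsqft) at((p25) unemp) at((p75) unemp)
```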
2 The log linear model
Suppose that instead of the above regression we used as a dependent variable the log of price.
[12]: gen lprice=log(price)
regress lprice bedrooms heatsqft unemp i.built01plus
-------------------------------------------------------------------------------
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
bedrooms | -.1117128 .0105461 -10.59 0.000 -.1323878 -.0910378
heatsqft | .0007175 1.00e-05 71.79 0.000 .000698 .0007371
unemp | -.1607762 .0103903 -15.47 0.000 -.1811458 -.1404067
1.built01plus | -.0918013 .0114835 -7.99 0.000 -.1143138 -.0692887
_cons | 12.16847 .0618528 196.73 0.000 12.04721 12.28973
-------------------------------------------------------------------------------
This is a model of the type
$$ \log(Y) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + u $$
whose coefficients have a direct interpretation as semi-elasticities. Based on this model we would
conclude that, everything else constant, and on average
• an increase of one bedroom leads to a price decrease of 11.17 percent
• an increase of one square foot leads to an increase of 0.07 percent in the price of a house
• an increase of a percentage point in the unemployment rate leads to a decrease of 16.1 percent
in price
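Reading $100 \times \beta$ as a percentage change is an approximation that works well for small coefficients. The exact change implied by a one-unit increase follows from $\Delta \log(Y) = \beta$, and the worked numbers below use the coefficients reported in the table above.

```latex
\begin{aligned}
\Delta \log(Y) = \beta \;&\Rightarrow\; \frac{Y_1}{Y_0} = e^{\beta}
  \;\Rightarrow\; \%\Delta Y = 100\,(e^{\beta} - 1) \\
\text{bedrooms}:&\quad 100\,(e^{-0.1117128} - 1) \approx -10.57 \\
\text{heatsqft}:&\quad 100\,(e^{0.0007175} - 1) \approx 0.0718 \\
\text{unemp}:&\quad 100\,(e^{-0.1607762} - 1) \approx -14.85
\end{aligned}
```

The approximation is essentially exact for heatsqft, but it overstates the decrease for bedrooms and unemp by roughly 0.6 and 1.2 percentage points.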
What about the coefficient on “1.built01plus”? We can conclude that, ceteris paribus, the price of
houses built after 2000 is multiplied by a factor of $\exp(-0.0918) = 0.9122864$ or, in other words, is
8.77 percent lower ($0.9122864 - 1 = -0.0877136$). When working with dummies, if the dependent
variable is in logs, the partial effect is usually calculated as $(e^{\beta} - 1) \times 100\%$. A simpler
alternative is to express the change in log points. In this case we could simply say that houses built
after 2000 are 9.2 log points cheaper (we are reading the coefficient directly).
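The transformed effect of the dummy can be obtained with a delta-method standard error using nlcom. This is a minimal sketch, assuming lprice and the housing data are in memory; the label pct_built is an illustrative name.

```stata
* Assumption: lprice and the housing data from the cells above are in memory
quietly regress lprice bedrooms heatsqft unemp i.built01plus

* Exact percentage effect of the dummy: 100*(exp(b) - 1), with delta-method inference
nlcom (pct_built: 100*(exp(_b[1.built01plus]) - 1))
```

The point estimate should reproduce the 8.77 percent decrease computed above.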
[13]: gen lbedrooms=log(bedrooms)
gen lheatsqft=log(heatsqft)
gen lunemp=log(unemp)
regress lprice lbedrooms lheatsqft lunemp i.built01plus
-------------------------------------------------------------------------------
lprice | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
lbedrooms | -.4608738 .0375345 -12.28 0.000 -.5344576 -.3872899
lheatsqft | 1.52488 .0219449 69.49 0.000 1.481858 1.567901
lunemp | -.8749995 .055821 -15.68 0.000 -.9844328 -.7655662
1.built01plus | -.1287332 .0118935 -10.82 0.000 -.1520496 -.1054168
_cons | 2.900198 .167044 17.36 0.000 2.572719 3.227676
-------------------------------------------------------------------------------
Now we have a model of the type (ignoring the dummy variable)
$$ \log(Y) = \beta_0 + \beta_1 \log(X_1) + \dots + \beta_k \log(X_k) + u $$
and the coefficients have a direct interpretation as elasticities. In this particular case it doesn’t
make much sense to take the log of bedrooms or unemp, so we will only interpret the coefficient for
heatsqft. Based on this model we would conclude that, everything else constant, on average when
the square footage increases by 1 percent the price increases by 1.52 percent.
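The elasticity reading is exact only for small changes. As a worked example with the reported coefficient of 1.52488, a 10 percent increase in square footage implies:

```latex
\begin{aligned}
\frac{Y_1}{Y_0} &= \left(\frac{X_1}{X_0}\right)^{\beta} = 1.10^{\,1.52488}
  = \exp(1.52488 \times \log 1.10) \approx \exp(0.14534) \approx 1.1564 \\
\%\Delta Y &\approx 15.6 \quad \text{(compared with } 10 \times 1.52488 \approx 15.2 \text{ from the linear approximation)}
\end{aligned}
```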
Note that Stata does not “know” that the variables are in logs. So if using the margins command
the dydx option would produce estimates for the elasticities.
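A minimal sketch of that point, assuming the log variables created above are in memory: because the regressors enter linearly in logs, dydx simply returns the coefficient on lheatsqft, which is the elasticity.

```stata
* Assumption: lprice, lbedrooms, lheatsqft and lunemp exist from the cell above
quietly regress lprice lbedrooms lheatsqft lunemp i.built01plus

* dydx is taken with respect to lheatsqft, so the result is the elasticity
margins, dydx(lheatsqft)
```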
4 Remarks
In general we do not include among the explanatory variables the log of variables that are already
expressed as percentages (such as unemp). It is also not common to take logs of variables that take a
small number of discrete values (such as bedrooms). We also have to be careful with interactions,
because it may become difficult to provide a meaningful interpretation. When adding interactions
it is also a good idea to include the original variables by themselves.
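In factor-variable notation the last remark is the difference between the # and ## operators: # adds only the product term, while ## adds the product together with both original variables. A minimal sketch, assuming the housing data is in memory; the two commands fit the same model.

```stata
* Assumption: the housing data from the cells above is in memory

* ## expands to both main effects plus their interaction
regress price bedrooms c.heatsqft##c.unemp i.built01plus

* Equivalent specification, spelled out term by term
regress price bedrooms heatsqft unemp c.heatsqft#c.unemp i.built01plus
```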