Non-linear Regression Functions

• Topics
• Nonlinear functions
• Interaction effects
1. Nonlinear functions
• Non-linear transformations of X
• The regression models so far have been linear in X-variables
• But the linear model may not always be appropriate
• The multiple regression model can handle functions that are
nonlinear in one or more X-variables.
Nonlinear regression functions

• If a relation between Y and X is nonlinear:

• The marginal effect of X on Y is not constant, and it depends
on the value of X and/or Y

• A linear model is mis-specified: the functional form is wrong,
and the estimator of the effect of X on Y is biased

• The solution is to estimate a regression function that is
nonlinear in X
Nonlinear functions of a single variable
We will look at two types of transformations:
i. Polynomials in X: The regression model is approximated by
a quadratic, cubic, or higher-degree polynomial
ii. Logarithmic transformations: Y and/or X is transformed by
the natural logarithm. This allows one to evaluate relative
changes in percent of X and/or Y, which is relevant in many
applications (e.g., inflation and growth rates)
(i) Polynomials in X
• Approximate the population regression function by a second-
degree polynomial:
Yi = β0 + β1Xi + β2Xi² + ui
• This is just the linear multiple regression model – except that
the three coefficients refer to a quadratic function of X
• Estimation, hypothesis testing, etc. proceed as in the multiple
regression model using OLS

• Marginal effect of X on Y: dY/dX = β1 + 2∙β2∙X,
which is a function of X
(ii) Logarithmic specifications

Specification Regression model

A. linear-log Yi = β0 + β1ln(Xi) + ui
B. log-linear ln(Yi) = β0 + β1Xi + ui
C. log-log ln(Yi) = β0 + β1ln(Xi) + ui

• Interpretation of the coefficients is different in each model

• Each model has a natural interpretation of marginal changes in X

Linear-log model
• ln-transformation of X:
Y = β0 + β1 ln(X)

• Differentiate Y with respect to X to find marginal effects:

dY = β1∙(1/X)∙dX = β1∙(dX/X)

where dX/X is a relative change in X. If you increase X by 1
percent (dX/X = 0.01), then Y changes by approximately 0.01∙β1 units
Log-linear model
• ln-transformation of Y:
ln(Y) = β0 + β1 X

• Differentiate Y with respect to X to find marginal effects:

(1/Y)∙dY = dY/Y = β1∙dX

where dY/Y is a relative change in Y. If you increase X by 1 unit,
then Y changes by approximately β1∙100 percent
Log-log model
• ln-transformation of Y and X:
ln(Y) = β0 + β1 ln(X)

• Differentiate Y with respect to X to find marginal effects:

(1/Y)∙dY = dY/Y = β1∙(1/X)∙dX = β1∙(dX/X)

where dY/Y and dX/X are relative changes in Y and X.
If you increase X by 1 percent, then Y changes by approximately β1 percent
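
As a worked check on the three interpretations, the exact changes can be written out and compared with the approximations. This is a sketch in LaTeX notation that looks only at the deterministic part of each model (the error term u is held fixed); the approximations rely on ln(1.01) ≈ 0.01 and on β1 being small.

```latex
% Linear-log: X rises by 1 percent (X -> 1.01 X)
\Delta Y = \beta_1 \ln(1.01\,X) - \beta_1 \ln(X) = \beta_1 \ln(1.01) \approx 0.01\,\beta_1

% Log-linear: X rises by 1 unit, so ln(Y) rises by beta_1
\frac{Y_{\text{new}}}{Y_{\text{old}}} = e^{\beta_1}
\quad\Rightarrow\quad
\%\Delta Y = 100\,(e^{\beta_1} - 1) \approx 100\,\beta_1

% Log-log: X rises by 1 percent, so ln(Y) rises by beta_1 ln(1.01)
\%\Delta Y = 100\,(1.01^{\beta_1} - 1) \approx \beta_1
```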

Non-linear functions in Stata
• Please open Lecture_4.do in Stata
• Data: Risk_Field.dta
Example: linear function of X
• crra = b0 + b1∙age
. regress crra age

Source | SS df MS Number of obs = 842

-------------+---------------------------------- F(1, 840) = 36.45
Model | 20.6550559 1 20.6550559 Prob > F = 0.0000
Residual | 475.985953 840 .566649944 R-squared = 0.0416
-------------+---------------------------------- Adj R-squared = 0.0404
Total | 496.641009 841 .590536277 Root MSE = .75276

crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
age | -.0109613 .0018155 -6.04 0.000 -.0145248 -.0073978
_cons | 1.14752 .0857724 13.38 0.000 .9791665 1.315873

Model: crra = 1.148 − 0.011∙age

Marginal effect of age on crra: d(crra)/d(age) = − 0.011
Example: quadratic function of X
• crra = b0 + b1∙age + b2∙age²
. regress crra age age_sq

Source | SS df MS Number of obs = 842

-------------+---------------------------------- F(2, 839) = 22.04
Model | 24.7954662 2 12.3977331 Prob > F = 0.0000
Residual | 471.845542 839 .562390396 R-squared = 0.0499
-------------+---------------------------------- Adj R-squared = 0.0477
Total | 496.641009 841 .590536277 Root MSE = .74993

crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
age | -.0421672 .0116423 -3.62 0.000 -.0650187 -.0193157
age_sq | .0003414 .0001258 2.71 0.007 .0000944 .0005883
_cons | 1.790814 .2520151 7.11 0.000 1.29616 2.285468

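Model: crra = 1.791 − 0.042∙age + 0.0003∙age²

crra (at 20 years): crra = 1.7908 − 0.04217∙20 + 0.000341∙20² ≈ 1.084
(the rounded coefficients in the model line give about 1.07, so the prediction uses more decimals)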

Marginal effect of age on crra: d(crra)/d(age) = − 0.042 + 2∙0.0003∙age
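
The quadratic results can be checked by hand or with a few lines of code. The Python sketch below is only an illustration and not part of Lecture_4.do: it copies the coefficients from the Stata output above, and the function names are made up for this example.

```python
# Coefficients copied from the Stata output for: regress crra age age_sq
B0 = 1.790814    # _cons
B1 = -0.0421672  # age
B2 = 0.0003414   # age_sq


def predicted_crra(age):
    """Fitted value from the quadratic model."""
    return B0 + B1 * age + B2 * age ** 2


def marginal_effect(age):
    """Derivative of the fitted value with respect to age: b1 + 2*b2*age."""
    return B1 + 2 * B2 * age


# Age at which the fitted curve is flat (marginal effect equals zero).
turning_point = -B1 / (2 * B2)

print(round(predicted_crra(20), 3))   # about 1.084, as on the slide
print(round(marginal_effect(20), 4))
print(round(marginal_effect(60), 4))
print(round(turning_point, 1))
```

With these estimates the marginal effect is negative at age 20 and close to zero around age 62, where the fitted curve turns.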
• Estimate a linear-log regression model of crra on age
• Is the estimated coefficient on ln(age) significantly different from 0?
• What is the marginal effect of age on crra?

• Estimate a log-linear regression model of crra on age

• Is the estimated coefficient on age significantly different from 0?
• What is the marginal effect of age on crra?

• Estimate a log-log regression model of crra on age

• Is the estimated coefficient on ln(age) significantly different from 0?
• What is the marginal effect of age on crra?
Exercise: linear-log model
• crra = b0 + b1∙ln(age)
. regress crra ln_age

Source | SS df MS Number of obs = 842

-------------+---------------------------------- F(1, 840) = 40.38
Model | 22.7804322 1 22.7804322 Prob > F = 0.0000
Residual | 473.860577 840 .564119734 R-squared = 0.0459
-------------+---------------------------------- Adj R-squared = 0.0447
Total | 496.641009 841 .590536277 Root MSE = .75108

crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
ln_age | -.4827743 .0759712 -6.35 0.000 -.63189 -.3336587
_cons | 2.465469 .2862441 8.61 0.000 1.903631 3.027306

Model: crra = 2.465 − 0.483∙ln(age)

Increase in age by 1% gives change in crra by 0.01∙(−0.483) = −0.00483
Exercise: log-linear model
• ln(crra) = b0 + b1∙age
. regress ln_crra age

Source | SS df MS Number of obs = 718

-------------+---------------------------------- F(1, 716) = 9.56
Model | 8.05700589 1 8.05700589 Prob > F = 0.0021
Residual | 603.450445 716 .842807884 R-squared = 0.0132
-------------+---------------------------------- Adj R-squared = 0.0118
Total | 611.507451 717 .852869527 Root MSE = .91805

ln_crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
age | -.007402 .002394 -3.09 0.002 -.0121021 -.0027019
_cons | -.1227536 .1104286 -1.11 0.267 -.3395562 .0940489

Model: ln(crra) = −0.123 − 0.007∙age

Increase in age by 1 year gives change in crra by −0.007∙100% = −0.7%
Exercise: log-log model
• ln(crra) = b0 + b1∙ln(age)
. regress ln_crra ln_age

Source | SS df MS Number of obs = 718

-------------+---------------------------------- F(1, 716) = 10.98
Model | 9.23723118 1 9.23723118 Prob > F = 0.0010
Residual | 602.270219 716 .841159524 R-squared = 0.0151
-------------+---------------------------------- Adj R-squared = 0.0137
Total | 611.507451 717 .852869527 Root MSE = .91715

ln_crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
ln_age | -.3294411 .0994137 -3.31 0.001 -.5246183 -.134264
_cons | .7794874 .371791 2.10 0.036 .0495566 1.509418

Model: ln(crra) = 0.779 − 0.329∙ln(age)

Increase in age by 1% gives change in crra by −0.329%
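
The percent interpretations in the three exercises are approximations. A small Python sketch, using only the coefficients reported above (variable names are chosen for this example), compares each approximation with the exact change implied by the fitted model.

```python
import math

B_LINLOG = -0.4827743   # coefficient on ln_age in: regress crra ln_age
B_LOGLIN = -0.007402    # coefficient on age in: regress ln_crra age
B_LOGLOG = -0.3294411   # coefficient on ln_age in: regress ln_crra ln_age

# Linear-log: change in crra (in units) when age rises by 1 percent.
linlog_approx = 0.01 * B_LINLOG
linlog_exact = B_LINLOG * math.log(1.01)

# Log-linear: percent change in crra when age rises by 1 year.
loglin_approx = 100 * B_LOGLIN
loglin_exact = 100 * (math.exp(B_LOGLIN) - 1)

# Log-log: percent change in crra when age rises by 1 percent.
loglog_approx = B_LOGLOG
loglog_exact = 100 * (1.01 ** B_LOGLOG - 1)

for name, approx, exact in [
    ("linear-log", linlog_approx, linlog_exact),
    ("log-linear", loglin_approx, loglin_exact),
    ("log-log", loglog_approx, loglog_exact),
]:
    print(name, round(approx, 5), round(exact, 5))
```

For coefficients this small the approximate and exact values differ only in the third or fourth decimal, which is why the simple rule of thumb is used on the slides.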
2. Interaction effects
• For example, is the effect of age on risk aversion different for
men and women?

That is, d(crra) / d(age) might be different for men and women

• How do we model such interaction effects between two (or
more) independent variables?
A. Two binary variables
Yi = β0 + β1D1i + β2D2i + ui
• D1 and D2 are binary dummy variables
• d(Y)/d(D1) = β1
• d(Y)/d(D2) = β2

• One can also include an interaction term D1×D2 between the two
binary dummy variables:

Yi = β0 + β1D1i + β2D2i + β3(D1i×D2i) + ui

• Marginal effects:
• d(Y)/d(D1) = β1 + β3∙D2 - marginal effect of D1 depends on D2
• d(Y)/d(D2) = β2 + β3∙D1 - marginal effect of D2 depends on D1
Example with two binary variables
• Linear regression model with interaction effects between two
dummy variables (2×2 combination)

. regress crra female young female_young

crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
female | .0350242 .0572137 0.61 0.541 -.077274 .1473224
young | .3523575 .1061799 3.32 0.001 .1439493 .5607658
female_young | .0036123 .1394793 0.03 0.979 -.2701556 .2773803
_cons | .5731764 .0398438 14.39 0.000 .4949715 .6513813

crra, young female = 0.573 + 0.352 + 0.035 + 0.004 = 0.964

crra, young male = 0.573 + 0.352 = 0.925
crra, older female = 0.573 + 0.035 = 0.608
crra, older male = 0.573 = 0.573
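
The four group predictions follow mechanically from the coefficients. The Python sketch below reproduces them from the Stata output above; the function name is an assumption made for this illustration, and small differences from the slide values are due to rounding.

```python
# Coefficients from: regress crra female young female_young
B0 = 0.5731764        # _cons (older male, the reference group)
B_FEMALE = 0.0350242
B_YOUNG = 0.3523575
B_INTERACT = 0.0036123


def predicted_crra(female, young):
    """Fitted crra for one of the four groups; female and young are 0 or 1."""
    return B0 + B_FEMALE * female + B_YOUNG * young + B_INTERACT * female * young


for female in (1, 0):
    for young in (1, 0):
        label = ("female" if female else "male") + ", " + ("young" if young else "older")
        print(label, round(predicted_crra(female, young), 4))
```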

Marginal effects of female and young:

d(crra)/d(female) = 0.035 + 0.004∙young
d(crra)/d(young) = 0.352 + 0.004∙female
Alternative model with two binary variables
• Linear regression model with interaction effects between two
dummy variables (2×2 combination)

. regress crra male_young male_old female_young female_old, noconstant robust

| Robust
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
male_young | .9255339 .0927203 9.98 0.000 .7435439 1.107524
male_old | .5731764 .0323822 17.70 0.000 .5096171 .6367357
female_young | .9641705 .0739627 13.04 0.000 .8189976 1.109343
female_old | .6082006 .0489453 12.43 0.000 .5121315 .7042697

crra, young female = 0.964

crra, young male = 0.925
crra, older female = 0.608
crra, older male = 0.573

Marginal effect of female for young = 0.964 − 0.925 = 0.039

Marginal effect of female for older = 0.608 − 0.573 = 0.035
Marginal effect of young for female = 0.964 − 0.608 = 0.356
Marginal effect of young for male = 0.925 − 0.573 = 0.352
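
The two parameterizations carry the same information. As a hedged illustration (plain Python arithmetic on the reported group means, with variable names chosen for this example), the coefficients of the model with a constant and an interaction term can be recovered from the four group means; the interaction coefficient is the difference between the two "female" effects.

```python
# Group means from:
# regress crra male_young male_old female_young female_old, noconstant robust
MALE_YOUNG = 0.9255339
MALE_OLD = 0.5731764
FEMALE_YOUNG = 0.9641705
FEMALE_OLD = 0.6082006

# Recover the coefficients of: regress crra female young female_young
b0 = MALE_OLD                        # reference group: older men
b_female = FEMALE_OLD - MALE_OLD     # effect of female among the older group
b_young = MALE_YOUNG - MALE_OLD      # effect of young among men
# Difference in differences: female effect for young minus female effect for older.
b_interact = (FEMALE_YOUNG - MALE_YOUNG) - (FEMALE_OLD - MALE_OLD)

print(round(b0, 4), round(b_female, 4), round(b_young, 4), round(b_interact, 4))
```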
B. Binary and continuous variables
Yi = β0 + β1Di + β2Xi + ui
• D is binary, X is continuous

• The marginal effect of D is β1, which does not depend on X

• The marginal effect of X is β2, which does not depend on D

• One can also include a new variable, which is an interaction term D×X
between the dummy and continuous variable:

Yi = β0 + β1Di + β2Xi + β3(Di×Xi) + ui

• Marginal effects:
• d(Y)/d(D) = β1 + β3∙X - marginal effect of D depends on X
• d(Y)/d(X) = β2 + β3∙D - marginal effect of X depends on D
Example with binary and continuous variables
• Linear regression model with interaction effects between one
dummy (female) and one continuous (age) variable

. regress crra female age female_age, robust

Linear regression Number of obs = 842

F(3, 838) = 12.45
Prob > F = 0.0000
R-squared = 0.0451
Root MSE = .75226

| Robust
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
female | .2945737 .1720918 1.71 0.087 -.0432079 .6323553
age | -.0081911 .0021792 -3.76 0.000 -.0124685 -.0039138
female_age | -.0066059 .003923 -1.68 0.093 -.014306 .0010942
_cons | 1.015739 .1051843 9.66 0.000 .8092837 1.222195
Example with binary and continuous variables
• Model: crra = β0 + β1female + β2age + β3(female×age)

Predicted crra for women = (1.0157 + 0.2946) + (− 0.0082 − 0.0066)∙age

Predicted crra for men = (1.0157) + (− 0.0082)∙age

Marginal effect of age on crra for women = − 0.0082 − 0.0066 = − 0.0148

Marginal effect of age on crra for men = − 0.0082
Marginal effect of female on crra = 0.2946 − 0.0066∙age
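
The two fitted lines and the marginal effects can be evaluated at any age. The Python sketch below uses the coefficients reported above; the function names are illustrative assumptions. It also computes the age at which the two fitted lines cross, which is where the marginal effect of female equals zero. Because the interaction coefficient has p = 0.093, that crossing point is imprecisely estimated.

```python
# Coefficients from: regress crra female age female_age, robust
B0 = 1.015739
B_FEMALE = 0.2945737
B_AGE = -0.0081911
B_INTERACT = -0.0066059


def predicted_crra(female, age):
    """Fitted crra; female is 0 (men) or 1 (women)."""
    return B0 + B_FEMALE * female + B_AGE * age + B_INTERACT * female * age


def age_slope(female):
    """Marginal effect of age: b2 + b3*female."""
    return B_AGE + B_INTERACT * female


def female_effect(age):
    """Marginal effect of female: b1 + b3*age."""
    return B_FEMALE + B_INTERACT * age


# Age at which the fitted lines for men and women cross.
crossing_age = -B_FEMALE / B_INTERACT

print(round(age_slope(0), 4), round(age_slope(1), 4))
print(round(female_effect(20), 4), round(female_effect(60), 4))
print(round(crossing_age, 1))
```

With these estimates the lines cross at roughly age 44.6: the fitted crra is higher for women below that age and higher for men above it.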
[Figure: Predicted crra values across age for men and women. Horizontal axis ticks: 20, 40, 60, 80. Series: Males, Females.]
C. Two continuous variables
Yi = β0 + β1X1,i + β2X2,i + ui
• X1 and X2 are continuous

• The marginal effect of X1 is β1, which does not depend on X2

• The marginal effect of X2 is β2, which does not depend on X1

• One can also include an interaction term X1×X2 between the two
continuous variables:

Yi = β0 + β1X1,i + β2X2,i + β3(X1,i×X2,i) + ui

• Marginal effects:
• d(y)/d(X1) = β1 + β3∙X2
• d(y)/d(X2) = β2 + β3∙X1
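
A quick numerical check of these formulas: the Python sketch below uses hypothetical coefficient values (they are not estimates from Risk_Field.dta) and compares the analytical marginal effect of X1 with a finite-difference approximation at two different values of X2.

```python
# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2, B3 = 1.0, 0.5, -0.2, 0.1


def predict(x1, x2):
    """Regression function with an interaction between two continuous variables."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2


def marginal_effect_x1(x2):
    """Analytical marginal effect of X1: b1 + b3*X2."""
    return B1 + B3 * x2


def numeric_effect_x1(x1, x2, step=1e-6):
    """Finite-difference approximation of the derivative with respect to X1."""
    return (predict(x1 + step, x2) - predict(x1, x2)) / step


for x2 in (0.0, 3.0):
    print(x2, marginal_effect_x1(x2), round(numeric_effect_x1(2.0, x2), 6))
```

The effect of X1 changes with X2 (0.5 at X2 = 0 and 0.8 at X2 = 3 with these made-up values), which is what the interaction term adds to the model.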
• Generate an interaction term between female and city, and
estimate a linear regression model with crra as a function
of female and city (including the interaction term)
• Are the estimated coefficients in the model significantly
different from 0?
• What is the marginal effect of female on crra?
• What is the marginal effect of city on crra?
Exercise: interaction effects
• crra = b0 + b1∙female + b2∙city + b3∙female∙city

. regress crra female city female_city

Source | SS df MS Number of obs = 846

-------------+---------------------------------- F(3, 842) = 0.80
Model | 1.41797543 3 .472658477 Prob > F = 0.4924
Residual | 495.67142 842 .588683397 R-squared = 0.0029
-------------+---------------------------------- Adj R-squared = -0.0007
Total | 497.089396 845 .588271474 Root MSE = .76726

crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
female | .0800496 .0666721 1.20 0.230 -.0508135 .2109126
city | .0822605 .078421 1.05 0.294 -.0716631 .2361841
female_city | -.0624956 .10921 -0.57 0.567 -.2768514 .1518603
_cons | .5937362 .0466075 12.74 0.000 .5022556 .6852168

d(crra)/d(female) = 0.080 − 0.062∙city

d(crra)/d(city) = 0.082 − 0.062∙female
Exercise: interaction effects
. regress crra male_city male_rural female_city female_rural, noconstant robust

Linear regression Number of obs = 846

F(4, 842) = 165.06
Prob > F = 0.0000
R-squared = 0.4218
Root MSE = .76726

| Robust
crra | Coef. Std. Err. t P>|t| [95% Conf. Interval]
male_city | .6759966 .0514091 13.15 0.000 .5750915 .7769018
male_rural | .5937362 .0393311 15.10 0.000 .5165377 .6709347
female_city | .6935506 .0687072 10.09 0.000 .5586931 .8284081
female_rural | .6737857 .0536792 12.55 0.000 .5684249 .7791465

crra, male_city = 0.676

crra, male_rural = 0.594
crra, female_city = 0.694
crra, female_rural = 0.674
• Learning outcomes
• Know how to run non-linear regression models in Stata
• Understand marginal effects in regression models with non-
linear transformations of single variables
• Construct interaction variables
• Understand marginal effects in regression models with
interaction variables
Summary: linear regression models
• Linear regression model
• Some variable Y is a linear function of some variable X
• Use OLS to estimate the coefficients in the linear model
• Single test of one hypothesis
• Robust standard errors: control for heteroskedasticity

• Multiple regression model

• Add multiple independent variables (with no perfect multicollinearity)
• Joint tests of several hypotheses

• Non-linear functions
• Non-linear transformations of variables
• Interaction effects
Extra exercises
• Use Risk_Lab.dta
• Run a regression of crra on age and age²
• Are the estimated coefficients significant? Are they jointly
significant?
• Construct a 95% confidence interval for the estimated coefficients
on age and age²
• What is the marginal effect of age on crra?
• What is the predicted crra value for a 20 year old and a 30 year
old?
• Generate a dummy variable for each risk task.
• Are individual risk attitudes constant across the four decision tasks?
• Is the effect of female on crra significantly different across the four
decision tasks?
Short summary of Appendix 8.1
• Nonlinear least squares estimation
• Suppose Y is a non-linear function of the parameters in the model
• One example is the logistic function (we discuss the so-called logit
model later)
• One can extend the OLS estimation method to account for non-
linear functions such as the logistic function
• Non-linear least squares estimation minimizes the sum of squared
residuals, just like ordinary least squares estimation
• Typically we use maximum likelihood estimation to estimate the
model parameters (more on that later)
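
To make the idea concrete, here is a minimal Python sketch of non-linear least squares. Everything in it is an assumption made for illustration: the model Y = exp(b∙X) (not the logistic function), the noise-free data, and the simple ternary search used to minimize the sum of squared residuals. Statistical software uses more general numerical optimizers.

```python
import math


def ssr(b, xs, ys):
    """Sum of squared residuals for the model y = exp(b * x)."""
    return sum((y - math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def fit_b(xs, ys, lo=-5.0, hi=5.0, iterations=200):
    """Minimize the SSR over b by ternary search.

    Assumes the SSR has a single minimum inside [lo, hi].
    """
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if ssr(m1, xs, ys) < ssr(m2, xs, ys):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# Illustrative data generated without noise from a known parameter value.
TRUE_B = 0.7
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(TRUE_B * x) for x in xs]

b_hat = fit_b(xs, ys)
print(round(b_hat, 4))
```

The parameter b enters the model non-linearly, so there is no closed-form OLS solution; the estimate is found by searching for the value of b that gives the smallest sum of squared residuals.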
Short summary of Appendix 8.2
• Slopes and elasticities
• Identify the model, Y as a function of X
• When you derive the marginal effect (slope) you differentiate Y with
respect to X
• The elasticity is defined in the usual way as the percentage change
in Y divided by the percentage change in X
• If we are interested in elasticities we typically use log-transformations
of both Y and X, and the marginal effect is then the same as the
elasticity

