MLR - 2023
Econometrics
FE 2106 – BSc in Financial Engineering, Level II

Lesson plan
• Correlation and scatter plots
• Multiple linear regression
• Smoothing
Multiple Linear Regression
• Describe multiple simultaneous associations of independent variables with one continuous outcome.
Ø Estimation
Ø Variable selection
Ø Assessing model fit
Dummy variable
SLR/MLR in matrices
The model: $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$
Estimates:
$\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{Y}$
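A minimal sketch of this estimate for simple linear regression, using made-up data and pure Python (no libraries); the coefficients come from the normal equations with the explicit $2\times 2$ inverse of $\mathbf{X}^{\top}\mathbf{X}$:

```python
# Simple linear regression via the normal equations (illustrative data).
# The design matrix X has a column of 1s (intercept) and one predictor column,
# so X'X is 2x2 and can be inverted in closed form.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Elements of X'X and X'Y for the design matrix [1, x_i]
sx = sum(x)
sxx = sum(xi * xi for xi in x)
sy = sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))

det = n * sxx - sx * sx           # det(X'X) = n * sum((x_i - xbar)^2)
b0 = (sxx * sy - sx * sxy) / det  # intercept estimate
b1 = (n * sxy - sx * sy) / det    # slope estimate

print(b0, b1)  # approximately 0.05 and 1.99 for this data
```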
Singularity
For simple linear regression,
$(\mathbf{X}^{\top}\mathbf{X})^{-1} = \dfrac{1}{n\sum (X_i - \bar{X})^2}\begin{pmatrix} \sum X_i^2 & -\sum X_i \\ -\sum X_i & n \end{pmatrix}, \qquad \det(\mathbf{X}^{\top}\mathbf{X}) = n\sum (X_i - \bar{X})^2$

$\mathbf{X}^{\top}\mathbf{X}$ is singular when its inverse does not exist; $(\mathbf{X}^{\top}\mathbf{X})^{-1}$ blows up when the determinant is zero.
The least squares procedure will then not give a unique solution, but many alternative solutions.
Reason: the data are inadequate for fitting the model, or the model is too complex for the available data.
Solution: collect more data or use a simpler model.
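A quick numeric illustration of the singularity condition, with made-up data: a constant predictor makes $\sum(X_i-\bar{X})^2 = 0$, so the determinant of $\mathbf{X}^{\top}\mathbf{X}$ is zero and no unique solution exists:

```python
# Singularity sketch: a constant predictor gives sum((x_i - xbar)^2) = 0,
# so det(X'X) = n*sum(x_i^2) - (sum(x_i))^2 = 0 and the inverse does not exist.
x = [3.0, 3.0, 3.0, 3.0]  # illustrative constant predictor
n = len(x)
det = n * sum(xi * xi for xi in x) - sum(x) ** 2
print(det)  # 0.0 -> X'X is singular; least squares has no unique solution
```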
Example: singularity
Obtain the simple linear regression estimates using the data analysis tool and using matrix calculations.
Model with two predictors
• $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i$
• Assume $E(\epsilon_i) = 0$, so that $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
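A sketch of fitting this two-predictor model by solving the normal equations $(\mathbf{X}^{\top}\mathbf{X})\mathbf{b} = \mathbf{X}^{\top}\mathbf{Y}$. The data are invented so that $Y = 1 + 2X_1 + 3X_2$ exactly, and the small Gaussian-elimination solver is a pure-Python stand-in for a linear algebra library:

```python
# Two-predictor model fit by solving the normal equations (X'X) b = X'Y
# with Gaussian elimination (illustrative data; pure-Python sketch).

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))  # partial pivot
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Design matrix with intercept column; Y generated from Y = 1 + 2*X1 + 3*X2
X = [[1.0, 1.0, 2.0], [1.0, 2.0, 1.0], [1.0, 3.0, 4.0], [1.0, 4.0, 3.0]]
Y = [9.0, 8.0, 19.0, 18.0]

XtX = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(3)]
       for r in range(3)]
XtY = [sum(X[i][r] * Y[i] for i in range(len(X))) for r in range(3)]
beta = solve(XtX, XtY)
print(beta)  # approximately [1.0, 2.0, 3.0]
```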
General procedure
• for each predictor, verify through a data plot that a linear relation is likely to be appropriate;
• estimate the MLR model;
• assess through diagnostics whether the model provides an appropriate fit to the data;
• if so, use the model to draw inferences about the regression coefficients;
• reduce the model by removing nonsignificant predictors, if appropriate for the study goals; and
• reassess through diagnostics whether the model provides an appropriate fit to the data.
Model with more than two predictors
General linear regression model:
Ø Qualitative predictor variables: these encompass variables like gender or disability status that take values 0, 1 to identify the classes of a qualitative variable.
Qualitative predictor variables
Example: Consider a regression analysis to predict the length of hospital stay (Y) based on the age (X1) and gender (X2) of the patient.
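One common coding for this example builds the dummy column directly into the design matrix (a hypothetical sketch; the patient records are invented). With $X_2 = 1$ for one class and $0$ for the other, the intercept is $\beta_0$ for the 0-class and $\beta_0 + \beta_2$ for the 1-class:

```python
# Hypothetical patients: (age, gender), with gender coded as a 0/1 dummy.
patients = [(25, "female"), (40, "male"), (33, "female")]

# Design matrix rows [1, age, dummy]; dummy = 1 for male, 0 for female
X = [[1, age, 1 if g == "male" else 0] for age, g in patients]
print(X)  # [[1, 25, 0], [1, 40, 1], [1, 33, 0]]
```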
General Linear Regression (Matrix Form)
$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where $E(\boldsymbol{\epsilon}) = \mathbf{0}$ and $V(\boldsymbol{\epsilon}) = \sigma^2\mathbf{I}$
General Linear Regression
• Least Squares Properties
1. The fitted values are obtained by $\hat{\mathbf{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$.
2. The vector of residuals is obtained by $\hat{\boldsymbol{\epsilon}} = \mathbf{Y} - \hat{\mathbf{Y}}$.
3. $V(\mathbf{b}) = (\mathbf{X}^{\top}\mathbf{X})^{-1}\sigma^2$ provides the variances (diagonal terms) and covariances (off-diagonal terms) of the estimates.
4. Suppose $\mathbf{X}_0^{\top}$ is a specified $1 \times p$ vector whose elements are of the same form as a row of $\mathbf{X}$, so that $\hat{Y}_0 = \mathbf{X}_0^{\top}\hat{\boldsymbol{\beta}} = \hat{\boldsymbol{\beta}}^{\top}\mathbf{X}_0$ is the fitted value at a specified location (the predicted value at $\mathbf{X}_0$ by the regression equation).
5. Basic ANOVA

| Source | df | SS | MS |
|---|---|---|---|
| Regression | $p-1$ | $SSR = \hat{\boldsymbol{\beta}}^{\top}\mathbf{X}^{\top}\mathbf{Y} - \frac{1}{n}\mathbf{Y}^{\top}\mathbf{J}\mathbf{Y}$ | $MSR = SSR/(p-1)$ |
| Residual | $n-p$ | $SSE = \mathbf{Y}^{\top}\mathbf{Y} - \hat{\boldsymbol{\beta}}^{\top}\mathbf{X}^{\top}\mathbf{Y}$ | $MSE = SSE/(n-p)$ |
| Total | $n-1$ | $SST = \mathbf{Y}^{\top}\mathbf{Y} - \frac{1}{n}\mathbf{Y}^{\top}\mathbf{J}\mathbf{Y}$ | |

$\mathbf{J}$ is an $n \times n$ matrix of 1s.
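The ANOVA quantities can be sketched numerically for a simple regression on made-up data; SSR is obtained here as $SST - SSE$, which agrees with the identities in the table:

```python
# ANOVA decomposition sketch for a simple regression (illustrative data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least squares fit via closed-form SLR formulas
sx, sy = sum(x), sum(y)
b1 = (n * sum(xi * yi for xi, yi in zip(x, y)) - sx * sy) / \
     (n * sum(xi * xi for xi in x) - sx * sx)
b0 = (sy - b1 * sx) / n
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - sy / n) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
ssr = sst - sse                                         # regression sum of squares
r2 = ssr / sst                                          # coefficient of determination
print(round(sst, 4), round(sse, 4), round(r2, 4))
```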
General Linear Regression
• Coefficient of multiple determination ($R^2$)
As in simple regression,
$R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}, \qquad 0 \le R^2 \le 1$
Note:
• A large value of $R^2$ does not necessarily imply that the fitted model is useful.
• Adding more X variables to the regression model can only increase $R^2$, never reduce it, because SSE can never become larger with more X variables, while SST stays the same for a given set of responses.
$H_0: \beta_k = 0$
$H_a: \beta_k \neq 0$
Test statistic:
$t^* = \dfrac{b_k}{s.d.(b_k)}$, where $s.d.(b_k) = \sqrt{V(b_k)}$
Decision rule:
If $|t^*| \le t_{1-\alpha/2,\, n-p}$, fail to reject $H_0$.
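The decision rule can be sketched for the slope of a simple regression on made-up data; the critical value $t_{0.975,\,3} \approx 3.182$ is taken from a t-table, since the standard library has no t-distribution:

```python
# t-test sketch for the slope in simple regression (illustrative data):
# t* = b1 / s.d.(b1), with s.d.(b1) = sqrt(MSE / sum((x_i - xbar)^2)).
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)           # residual mean square, df = n - p = n - 2
sd_b1 = math.sqrt(mse / sxx)  # standard error of the slope
t_star = b1 / sd_b1

t_crit = 3.182  # t_{0.975, 3} from a t-table (alpha = 0.05, two-sided)
print(t_star, abs(t_star) > t_crit)  # True here -> reject H0: beta1 = 0
```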