Jaggia Chapter 7 2
Advanced
Regression Analysis
Business Analytics, 1e
By Sanjiv Jaggia, Alison Kelly, Kevin Lertwachara, and Leida Chen
Copyright © 2021 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior
written consent of McGraw-Hill Education.
8/16/2020
7.3: Linear Probability and Logistic Regression Models
• The response variable thus far has been quantitative.
• Binary choice (classification) models have a binary response
variable.
• Examples
– Whether or not to buy a house
– Whether or not to join a health club
– Whether or not to approve a loan
• The binary choice is related to predictor variables.
• We will consider two types of models.
– The linear probability regression model
– The logistic regression model
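For reference, the two models are commonly written in the forms sketched below. The symbols \beta_0, \ldots, \beta_k and the number of predictors k are notation assumed here for illustration. The linear probability model treats the probability as a linear function of the predictors, so its predictions can fall outside the interval [0, 1]; the logistic form keeps them strictly between 0 and 1.

```latex
% Linear probability model (assumed notation): P(y = 1) is linear in the predictors
\[
P(y = 1) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\]

% Logistic regression model: the same linear index passed through the logistic function
\[
P(y = 1) = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}
                {1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}
\]
```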
With $x_1 = 20$ and $x_2 = 30$, the predicted probability is
\[
\hat{p} = \frac{\exp(-9.3671 + 0.1349 \times 20 + 0.1782 \times 30)}{1 + \exp(-9.3671 + 0.1349 \times 20 + 0.1782 \times 30)} = 0.2103
\]
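The predicted probability for x1 = 20 and x2 = 30 can be checked numerically. The sketch below uses only the three estimates quoted in the calculation; the names b0, b1, b2 and logistic_probability are illustrative and do not come from the slides.

```python
import math

# Estimated coefficients copied from the worked calculation.
# The variable names are assumptions made for this sketch.
b0, b1, b2 = -9.3671, 0.1349, 0.1782


def logistic_probability(x1, x2):
    """Return exp(z) / (1 + exp(z)) for the linear index z = b0 + b1*x1 + b2*x2."""
    z = b0 + b1 * x1 + b2 * x2
    return math.exp(z) / (1 + math.exp(z))


p_hat = logistic_probability(20, 30)
print(round(p_hat, 4))  # 0.2103
```

Because the logistic function maps any value of the linear index into the interval (0, 1), the result can always be read as a probability.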
A. Partition the sample into two parts: training and validation sets.
B. Use the training set to estimate competing models.
C. Use the estimates from the training set to predict the response in the
validation set.
D. Calculate the RMSE (or accuracy rate) in the validation set for each competing
model. The preferred model has the smallest RMSE (or the highest accuracy rate).
• We would like the model with the best performance in the training set to
also have the best performance in the validation set.
• Conflicting results are a sign that a model has overfit the training set.
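Steps A through D can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the data are synthetic, the 70/30 split is arbitrary, and the two competing models (an intercept-only model and a simple linear regression) are stand-ins for whatever models are actually being compared.

```python
import math
import random

random.seed(1)  # fixed seed so the illustrative split is reproducible

# Synthetic data (assumed for illustration): y is linear in x plus noise.
data = [(x, 2.0 + 0.5 * x + random.gauss(0, 1)) for x in range(100)]

# A. Partition the sample into training and validation sets (70/30 here).
random.shuffle(data)
train, valid = data[:70], data[70:]


# B. Estimate the competing models using the training set only.
def fit_mean(rows):
    """Model 1: intercept only; always predicts the training mean of y."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def fit_line(rows):
    """Model 2: simple linear regression estimated by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


models = {"mean_only": fit_mean(train), "linear": fit_line(train)}


# C and D. Predict the response in each set and compute the RMSE.
def rmse(model, rows):
    return math.sqrt(sum((y - model(x)) ** 2 for x, y in rows) / len(rows))


results = {name: (rmse(m, train), rmse(m, valid)) for name, m in models.items()}
for name, (rmse_train, rmse_valid) in results.items():
    print(f"{name}: training RMSE = {rmse_train:.3f}, validation RMSE = {rmse_valid:.3f}")

# The preferred model has the smallest RMSE in the validation set.
best = min(results, key=lambda name: results[name][1])
print("preferred model:", best)
```

Printing both the training and validation RMSE makes it easy to spot the conflicting rankings that signal overfitting. For a binary response, the same steps apply with the accuracy rate in place of RMSE and the highest value preferred.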