Multiple Regression
10th Edition
Chapter 14
Introduction to Multiple Regression
Learning Objectives
In this chapter, you learn:
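Multiple regression model with k independent variables:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
($\beta_1, \ldots, \beta_k$: population slopes; $\varepsilon_i$: random error)
Estimated multiple regression equation:
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki}$
($b_0$: estimated intercept)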
In this chapter we will always use Excel to obtain the
regression slope coefficients and other regression
summary measures.
Two-variable model:
$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$
[Figure: plot of Y against the X1 and X2 axes, annotated "Slope for variable X1" and "Slope for variable X2"]
Example:
2 Independent Variables
Dependent variable:
Pie sales (units per week)
Independent variables: Price (in $)
Advertising ($100s)
  Week   Pie Sales   Price ($)   Advertising ($100s)
     1         350        5.50                   3.3
     2         460        7.50                   3.3
     3         350        8.00                   3.0
     4         430        8.00                   4.5
     5         350        6.80                   3.0
     6         380        7.50                   4.0
     7         430        4.50                   3.0
     8         470        6.40                   3.7
     9         450        7.00                   3.5
    10         490        5.00                   4.0
    11         340        7.20                   3.5
    12         300        7.90                   3.2
    13         440        5.90                   4.0
    14         450        5.00                   3.5
    15         300        7.00                   2.7

Sales = b0 + b1 (Price) + b2 (Advertising)
Using Minitab
Stat | Regression | Regression
Enter appropriate variables for Response and Predictors
Minitab
Multiple Regression Output

  Regression Analysis: Pie Sales versus Price, Advertising

  The regression equation is
  Pie Sales = 307 - 25.0 Price + 74.1 Advertising

  Predictor       Coef   SE Coef       T       P
  Constant       306.5     114.3    2.68   0.020
  Price         -24.98     10.83   -2.31   0.040
  Advertising    74.13     25.97    2.85   0.014

  S = 47.4634   R-Sq = 52.1%   R-Sq(adj) = 44.2%

  Analysis of Variance
  Source            DF      SS      MS      F       P
  Regression         2   29460   14730   6.54   0.012
  Residual Error    12   27033    2253
  Total             14   56493

  Source         DF   Seq SS
  Price           1    11100
  Advertising     1    18360
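As a cross-check on the coefficients in the output above, the following is a minimal sketch that fits the same two-variable model by solving the least-squares normal equations in plain Python. The function name `fit_two_predictors` and the deviation-from-the-mean formulation are choices made for this illustration; this is not how Minitab computes the fit internally.

```python
# Pie sales data from the example: (sales, price in $, advertising in $100s)
data = [
    (350, 5.50, 3.3), (460, 7.50, 3.3), (350, 8.00, 3.0), (430, 8.00, 4.5),
    (350, 6.80, 3.0), (380, 7.50, 4.0), (430, 4.50, 3.0), (470, 6.40, 3.7),
    (450, 7.00, 3.5), (490, 5.00, 4.0), (340, 7.20, 3.5), (300, 7.90, 3.2),
    (440, 5.90, 4.0), (450, 5.00, 3.5), (300, 7.00, 2.7),
]


def fit_two_predictors(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 (illustrative helper)."""
    n = len(rows)
    y_bar = sum(r[0] for r in rows) / n
    x1_bar = sum(r[1] for r in rows) / n
    x2_bar = sum(r[2] for r in rows) / n
    # Sums of squares and cross-products of deviations from the means
    s11 = sum((r[1] - x1_bar) ** 2 for r in rows)
    s22 = sum((r[2] - x2_bar) ** 2 for r in rows)
    s12 = sum((r[1] - x1_bar) * (r[2] - x2_bar) for r in rows)
    s1y = sum((r[1] - x1_bar) * (r[0] - y_bar) for r in rows)
    s2y = sum((r[2] - x2_bar) * (r[0] - y_bar) for r in rows)
    # Solve the 2x2 normal equations for the slopes by Cramer's rule
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # The fitted equation passes through the point of means
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(data)
print(f"Pie Sales = {b0:.1f} + ({b1:.2f}) Price + ({b2:.2f}) Advertising")
```

The printed coefficients should agree with the Coef column of the Minitab output up to rounding.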
Predicted sales: 428.6 pies
Predictions in Minitab
Select Stat | Regression | Regression, then click on the Options box
Enter the desired value for each independent variable
Check the Confidence limits and Prediction limits boxes
Minitab output:

  Predicted Y value:
  New Obs     Fit   SE Fit   95% CI             95% PI
  1         428.6     17.2   (391.1, 466.1)     (318.6, 538.6)

  Input values:
  New Obs   Price
  1          5.50
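The fit and both intervals in this output can be reproduced, up to rounding, from numbers already shown in the chapter. The sketch below uses the rounded coefficients, S, SE Fit, and the t value 2.1788 for 12 d.f. that appears later in the chapter. The advertising value of 3.5 ($350) is an assumption: it does not survive in the output above, and is used here only because it reproduces the fit of 428.6.

```python
import math

# Values copied from the Minitab output in this chapter
b0, b1, b2 = 306.5, -24.98, 74.13   # rounded coefficients
s = 47.4634                         # standard error of the estimate (S)
se_fit = 17.2                       # SE Fit for the new observation
t_crit = 2.1788                     # t(alpha/2) with 12 d.f., alpha = .05

price = 5.50
advertising = 3.5   # ASSUMPTION: $350, inferred because it reproduces Fit = 428.6

# Point prediction from the estimated regression equation
fit = b0 + b1 * price + b2 * advertising

# 95% confidence interval for the mean response at these X values
ci = (fit - t_crit * se_fit, fit + t_crit * se_fit)

# 95% prediction interval for a single week: adds the error variance S^2
se_pred = math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - t_crit * se_pred, fit + t_crit * se_pred)

print(f"fit = {fit:.1f}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"95% PI = ({pi[0]:.1f}, {pi[1]:.1f})")
```

Because the inputs are rounded, the endpoints may differ from Minitab's in the last digit. The prediction interval is wider than the confidence interval because it covers an individual week rather than the mean response.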
Coefficient of
Multiple Determination
$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
$r^2 = \frac{SSR}{SST} = \frac{29460}{56493} = .521$ (reported as R-Sq = 52.1% in the Minitab output)
Adjusted r2
$r^2_{adj} = 1 - \left[(1 - r^2)\left(\frac{n - 1}{n - k - 1}\right)\right]$
(where n = sample size, k = number of independent variables)
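A short sketch of both formulas, using the sums of squares from the Analysis of Variance table with n = 15 and k = 2 for the pie sales example. The variable names are illustrative.

```python
# Sums of squares from the Analysis of Variance table
ssr = 29460   # regression sum of squares
sst = 56493   # total sum of squares
n, k = 15, 2  # sample size, number of independent variables

# Coefficient of multiple determination
r2 = ssr / sst

# Adjusted r2 penalizes for the number of independent variables
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(f"r2 = {r2:.3f}, adjusted r2 = {r2_adj:.3f}")
```

Adjusted r2 is smaller than r2 here because the factor (n - 1)/(n - k - 1) = 14/12 inflates the unexplained proportion.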
$r^2_{adj} = .442$ (reported as R-Sq(adj) = 44.2% in the Minitab output)
Hypotheses:
$H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$ (no linear relationship)
$H_1$: at least one $\beta_i \neq 0$ (at least one independent variable affects Y)
Test statistic:
$F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$
where F has (numerator) = k and (denominator) = (n - k - 1) degrees of freedom
$F = \frac{MSR}{MSE} = \frac{14730}{2253} = 6.54$
P-value for the F Test: 0.012 (see the Analysis of Variance table in the Minitab output)
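$H_0$: $\beta_1 = \beta_2 = 0$
$H_1$: $\beta_1$ and $\beta_2$ not both zero
$\alpha = .05$, df1 = 2, df2 = 12
Critical Value: $F_{.05} = 3.885$
Test Statistic: $F = \frac{MSR}{MSE} = 6.54$
Decision: Since F = 6.54 > 3.885, reject H0 at $\alpha = .05$
Conclusion: There is evidence that at least one independent variable affects Y
[Figure: F distribution with the "Do not reject H0" region below $F_{.05} = 3.885$ and the "Reject H0" region of area $\alpha = .05$ above it]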
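The overall F test can be sketched from the Analysis of Variance numbers. The critical value 3.885 is taken from the text rather than computed. The p-value line is an addition for illustration: it relies on a closed form for the upper-tail probability that holds only when the numerator degrees of freedom equal 2, which happens to be the case in this example.

```python
# Sums of squares from the Analysis of Variance table
ssr, sse = 29460, 27033
n, k = 15, 2

msr = ssr / k                # mean square regression
mse = sse / (n - k - 1)      # mean square error
f_stat = msr / mse

f_crit = 3.885               # F(.05) with 2 and 12 d.f., as given in the text
reject_h0 = f_stat > f_crit

# Upper-tail p-value. ASSUMPTION: this closed form is valid only when the
# numerator d.f. is 2; it is not a general F-distribution routine.
df2 = n - k - 1
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(f"F = {f_stat:.2f}, reject H0: {reject_h0}, p-value = {p_value:.3f}")
```

For other numerator degrees of freedom, a statistics library or an F table would be needed for the p-value.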
Hypotheses:
$H_0$: $\beta_j = 0$
$H_1$: $\beta_j \neq 0$
Test statistic:
$t = \frac{b_j - 0}{S_{b_j}}$ (df = n - k - 1)
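$H_0$: $\beta_j = 0$
$H_1$: $\beta_j \neq 0$
d.f. = 15 - 2 - 1 = 12
$\alpha = .05$, $t_{\alpha/2} = 2.1788$

                  Coef   SE Coef       T       P
  Price         -24.98     10.83   -2.31   0.040
  Advertising    74.13     25.97    2.85   0.014

[Figure: t distribution with "Reject H0" regions of area $\alpha/2 = .025$ in each tail, below $-t_{\alpha/2} = -2.1788$ and above $t_{\alpha/2} = 2.1788$, and "Do not reject H0" between them]
Decision: Reject H0 for each variable, since |t| exceeds 2.1788 for both Price (-2.31) and Advertising (2.85)
Conclusion: There is evidence that both Price and Advertising affect pie sales at $\alpha = .05$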
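The individual t tests amount to dividing each coefficient by its standard error and comparing the result with the critical value. A minimal sketch using the rounded Coef and SE Coef values from the Minitab output (the dictionary layout is illustrative):

```python
# (coefficient, standard error) pairs from the Minitab output
coefs = {
    "Price": (-24.98, 10.83),
    "Advertising": (74.13, 25.97),
}
t_crit = 2.1788  # t(alpha/2) with 12 d.f., alpha = .05, as given in the text

results = {}
for name, (b, se) in coefs.items():
    t = (b - 0) / se                 # test statistic under H0: beta_j = 0
    reject = abs(t) > t_crit         # two-tail test
    results[name] = (t, reject)
    print(f"{name}: t = {t:.2f}, reject H0: {reject}")
```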
$b_j \pm t_{n-k-1} S_{b_j}$
where t has (n - k - 1) d.f.
d.f. = 15 - 2 - 1 = 12, $\alpha = .05$, $t_{\alpha/2} = 2.1788$

                  Coef   SE Coef
  Price         -24.98     10.83
  Advertising    74.13     25.97

Interpretation:
Weekly sales are estimated to be reduced by between 1.37 and 48.58 pies for each increase of $1 in the selling price, holding advertising constant
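The interval behind this interpretation can be sketched from the rounded Coef and SE Coef values for Price. Because those inputs are rounded to two decimals, the upper endpoint may differ from the 1.37 quoted above in the second decimal place.

```python
b_price = -24.98    # slope for Price (rounded, from the Minitab output)
se_price = 10.83    # standard error of the Price slope
t_crit = 2.1788     # t(alpha/2) with 12 d.f., alpha = .05

# Confidence interval estimate for the population slope
margin = t_crit * se_price
lower = b_price - margin
upper = b_price + margin

print(f"95% CI for the Price slope: ({lower:.2f}, {upper:.2f})")
```

Both endpoints are negative, which is consistent with rejecting H0 for Price in the t test.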
Assumptions of Regression
Use the acronym LINE:
Linearity
Independence of Errors
Normality of Error
Equal Variance
Residual Analysis
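$e_i = Y_i - \hat{Y}_i$
[Figures: residual plots for checking the assumptions. Linearity: residuals plotted for a "Not Linear" and a "Linear" case. Independence: residuals plotted against X for an "Independent" case. Normality: plot with Residual on the horizontal axis. Equal variance: residuals plotted against x for a "Non-constant variance" and a "Constant variance" case.]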
Measuring Autocorrelation:
The Durbin-Watson Statistic
Autocorrelation
$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
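A minimal sketch of the Durbin-Watson statistic as a function of the residuals in time order. The function name and the two short residual series are hypothetical and chosen only to show how the statistic behaves.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that alternate in sign (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals that stay on one side in runs (positive autocorrelation)
runs = [1.0, 1.0, -1.0, -1.0]

print(durbin_watson(alternating))  # larger D
print(durbin_watson(runs))         # smaller D
```

Residuals that persist in sign from one period to the next produce small successive differences and therefore a small D.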
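[Figure: decision regions for the Durbin-Watson test of positive autocorrelation along the D axis: reject H0 below dL, inconclusive between dL and dU, do not reject H0 above dU]
Is there autocorrelation?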
Durbin-Watson Calculations
  Sum of Squared Difference of Residuals   3296.18
  Sum of Squared Residuals                 3279.98
  Durbin-Watson Statistic                  1.00494

$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \frac{3296.18}{3279.98} = 1.00494$
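[Figure: decision regions with dL = 1.29 and dU = 1.45: reject H0 below dL, inconclusive between dL and dU, do not reject H0 above dU]
Decision: Since D = 1.00494 < dL = 1.29, reject H0 and conclude that significant positive autocorrelation exists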
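The decision rule for the test of positive autocorrelation can be sketched as a three-way comparison. The function name `dw_decision` is hypothetical; the inputs are the two sums and the critical values dL = 1.29 and dU = 1.45 quoted above.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic for the positive-autocorrelation test."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"


# Sums from the Durbin-Watson calculations above
d = 3296.18 / 3279.98

decision = dw_decision(d, d_lower=1.29, d_upper=1.45)
print(f"D = {d:.5f}: {decision}")
```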
Influence Analysis
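The Studentized Deleted Residuals $t_i$
Expressed as a t statistic
Cook's $D_i$ statistic
$D_i = \frac{e_i^2}{k \, MSE} \left[ \frac{h_i}{(1 - h_i)^2} \right]$
where $e_i$ is the residual and $h_i$ is the hat matrix diagonal element for observation i; k and MSE are as defined for the F test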
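A small sketch of the Cook's distance formula as a function. The function name and the input values are hypothetical and are not taken from the pie sales example; `k` is used exactly as it appears in the formula above.

```python
def cooks_distance(e_i, h_i, k, mse):
    """Cook's D_i = e_i^2 / (k * MSE) * h_i / (1 - h_i)^2."""
    return (e_i ** 2 / (k * mse)) * (h_i / (1 - h_i) ** 2)


# Hypothetical inputs for illustration only
d_i = cooks_distance(e_i=2.0, h_i=0.5, k=2, mse=4.0)
print(d_i)
```

For a fixed residual, D_i grows quickly as h_i approaches 1, so an observation with both a large residual and a large hat matrix element stands out.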
Collinearity
Detecting Collinearity
(Variance Inflationary Factor)
VIFj is used to measure collinearity:
$VIF_j = \frac{1}{1 - R_j^2}$
where $R_j^2$ is the coefficient of determination of variable $X_j$ with all other X variables
  Predictor         T       P   VIF
  Constant       2.68   0.020
  Price         -2.31   0.040   1.0
  Advertising    2.85   0.014   1.0
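With only two independent variables, the coefficient of determination of one X on the other is the squared correlation between them, so both VIF values are the same. The sketch below computes it from the Price and Advertising columns of the pie sales data. The helper name `r_squared` is illustrative, and the shortcut applies only to the two-variable case.

```python
# Price ($) and Advertising ($100s) columns of the pie sales data
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
         7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
               3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7]


def r_squared(x, y):
    """Squared correlation between x and y (R^2 of a one-predictor regression)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Two-variable case only: R_j^2 is the same for both predictors
r2_j = r_squared(price, advertising)
vif = 1 / (1 - r2_j)
print(f"VIF = {vif:.1f}")
```

A VIF this close to 1 indicates that Price and Advertising are nearly uncorrelated in this sample.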
Chapter Summary