TUTORIAL 3 KKKQ2023 ENGINEERING STATISTICS ATHIF2023
QUESTION NO. 1
An article in Technometrics by S. C. Narula and J. F. Wellington (“Prediction, Linear Regression, and a
Minimum Sum of Relative Errors,” Vol. 19, 1977) presents data on the selling price (y) and annual
taxes (x) for 24 houses. The taxes include local, school and county taxes. The data are shown in the
following table.
Sale price (y)  Taxes (x)
25.9 4.9176
29.5 5.0208
27.9 4.5429
25.9 4.5573
29.9 5.0597
29.9 3.891
30.9 5.898
28.9 5.6039
35.9 5.8282
31.5 5.3003
31 6.2712
30.9 5.9592
30 5.05
36.9 8.2464
41.9 6.6969
40.5 7.7841
43.9 9.0384
37.5 5.9894
37.9 7.5422
44.5 8.7951
37.9 6.0831
38.9 8.3607
36.9 8.14
45.8 9.1416
OUTPUT FROM SIMPLE LINEAR REGRESSION USING EXCEL
           Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  13.32018      2.571722        5.179479  3.42E-05  7.986755   18.6536    7.986755     18.6536
Taxes      3.324371      0.390276        8.517998  2.05E-08  2.514988   4.133754   2.514988     4.133754
(a) Calculate the least squares estimates of the slope and intercept.
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = 13.32018 + 3.324371x \]
\[ \hat{\beta}_1 = 3.324 \text{ (rounded to 3 decimal places)} \]
\[ \hat{\beta}_0 = 13.32 \text{ (rounded to 2 decimal places)} \]
(b) Find the mean selling price given that the taxes paid are x = 6.6.
\[ \hat{y} = 13.32018 + 3.324371(6.6) = 35.2610286 \approx 35.26 \text{ (2 d.p.)} \]
(c) Calculate the fitted value of y corresponding to x = 8.2464 (observation #14). Find the
corresponding residual.
Fitted value of y corresponding to x = 8.2464:
\[ \hat{y} = 13.32018 + 3.324371(8.2464) = 40.73427301 \approx 40.73 \text{ (2 d.p.)} \]
Corresponding residual:
\[ e = y - \hat{y} = 36.9 - 40.73 = -3.83 \text{ (2 d.p.)} \]
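Parts (b) and (c) can be checked with a few lines of Python. This is only a minimal sketch: the names `B0`, `B1`, `fitted` and `residual` are illustrative and not part of the tutorial, and the coefficients are copied from the Excel output for this question.

```python
# Coefficients copied from the Excel output for Question 1.
B0 = 13.32018   # intercept estimate
B1 = 3.324371   # slope estimate


def fitted(x):
    """Return the fitted value y-hat = B0 + B1 * x."""
    return B0 + B1 * x


def residual(x, y):
    """Return the residual e = y - y-hat for a single observation."""
    return y - fitted(x)


mean_price = fitted(6.6)          # part (b)
fit_14 = fitted(8.2464)           # part (c), observation 14
res_14 = residual(8.2464, 36.9)   # observed sale price is 36.9

print(round(mean_price, 2), round(fit_14, 2), round(res_14, 2))
```

Here the residual is computed from the unrounded fitted value; it still rounds to the −3.83 obtained above.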
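ALTERNATIVE SOLUTION
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \]
\[ S_{xy} = \sum_{i=1}^{24} x_i y_i - \frac{\left(\sum_{i=1}^{24} x_i\right)\left(\sum_{i=1}^{24} y_i\right)}{24} = 5511.925 - \frac{(153.718)(830.7)}{24} = 191.360725 \]
\[ S_{xx} = \sum_{i=1}^{24} x_i^2 - \frac{\left(\sum_{i=1}^{24} x_i\right)^2}{24} = 1042.114 - \frac{(153.718)^2}{24} = 57.56301983 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{191.360725}{57.56301983} = 3.32436911 \approx 3.324 \text{ (3 d.p.)} \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \frac{830.7}{24} - (3.324)\frac{153.718}{24} = 13.322557 \approx 13.32 \text{ (2 d.p.)} \]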
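The same S_xy / S_xx calculation can be run directly on the 24 observations. The sketch below uses only the Python standard library; the function name `least_squares` and the list names are assumptions made for illustration. Because it works from the unrounded sums, it should agree with the Excel output to about three decimal places rather than digit for digit with the hand calculation.

```python
# Selling price (y) and taxes (x) for the 24 houses, copied from the table.
sale = [25.9, 29.5, 27.9, 25.9, 29.9, 29.9, 30.9, 28.9, 35.9, 31.5, 31.0, 30.9,
        30.0, 36.9, 41.9, 40.5, 43.9, 37.5, 37.9, 44.5, 37.9, 38.9, 36.9, 45.8]
taxes = [4.9176, 5.0208, 4.5429, 4.5573, 5.0597, 3.891, 5.898, 5.6039, 5.8282,
         5.3003, 6.2712, 5.9592, 5.05, 8.2464, 6.6969, 7.7841, 9.0384, 5.9894,
         7.5422, 8.7951, 6.0831, 8.3607, 8.14, 9.1416]


def least_squares(x, y):
    """Return (slope, intercept) of the simple linear regression of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sums of cross-products and squares.
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    s_xx = sum(a * a for a in x) - sum_x ** 2 / n
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


slope, intercept = least_squares(taxes, sale)
print(round(slope, 3), round(intercept, 2))
```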
QUESTION 2
Part 1
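Calculate the least squares estimate of the slope. (Round your answer to 3 decimal places.)
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \]
\[ S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{14} = 1697.80 - \frac{(43)(572)}{14} = -59.05714286 \]
\[ S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{14} = 157.42 - \frac{(43)^2}{14} = 25.34857143 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-59.05714286}{25.34857143} = -2.329801623 \approx -2.330 \text{ (3 d.p.)} \]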
Part 2
Calculate the least squares estimate of the intercept. (Round your answer to 3 decimal places.)
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \frac{572}{14} - (-2.329801623)\frac{43}{14} = 48.01296213 \approx 48.013 \text{ (3 d.p.)} \]
Part 3
Use the equation of the fitted line to predict what permeability would be observed when the
compressive strength is x = 4.3. (Round your answer to 2 decimal places.)
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = 48.01296213 + (-2.329801623)(4.3) = 37.99481515 \approx 37.99 \text{ (2 d.p.)} \]
Part 4
Give a point estimate of the mean permeability when compressive strength is x = 3.7. (Round your
answer to 2 decimal places.)
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = 48.01296213 + (-2.329801623)(3.7) = 39.39269612 \approx 39.39 \text{ (2 d.p.)} \]
Part 5
Suppose that the observed value of permeability at x = 3.7 is y = 46.1. Calculate the value of the
corresponding residual. (Round your answer to 2 decimal places.)
\[ e_i = y_i - \hat{y}_i = 46.1 - 39.39 = 6.71 \text{ (2 d.p.)} \]
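All five parts of this question follow from the summary sums, so they can be reproduced in one pass. The sketch below is illustrative only: `fit_from_sums` and the variable names are assumptions, and the sums are the ones that appear in the working above.

```python
def fit_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (slope, intercept) from the usual summary sums."""
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


# Summary sums as they appear in the Question 2 working.
b1, b0 = fit_from_sums(n=14, sum_x=43, sum_y=572, sum_xy=1697.80, sum_x2=157.42)

pred_43 = b0 + b1 * 4.3     # Part 3: predicted permeability at x = 4.3
mean_37 = b0 + b1 * 3.7     # Part 4: estimated mean permeability at x = 3.7
res_37 = 46.1 - mean_37     # Part 5: residual for the observation y = 46.1

print(round(b1, 3), round(b0, 3), round(pred_43, 2), round(mean_37, 2), round(res_37, 2))
```

The unrounded slope is carried through the intercept calculation, which is what yields 48.01296213. Substituting the rounded value −2.330 would give about 48.0136 instead, which rounds to 48.014 rather than 48.013.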
QUESTION 3
(a) Fit the simple linear regression model using the method of least squares. Find the least
squares estimates of the intercept and slope in the simple linear regression model. Find an
estimate of $\sigma^2$.
OUTPUT FROM SIMPLE LINEAR REGRESSION USING EXCEL
ANOVA
            df  SS        MS        F         Significance F
Regression  1   136.6829  136.6829  1.767522  0.241116
Residual    5   386.6513  77.33027
Total       6   523.3343

           Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  55.62561      32.10539        1.732594  0.14371   -26.9039   138.1551   -26.9039     138.1551
Temp       -0.03416      0.025693        -1.32948  0.241116  -0.1002    0.031888   -0.1002      0.031888
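Least squares estimate of the intercept:
\[ \hat{\beta}_0 = 55.62561 \approx 55.626 \text{ (3 d.p.)} \]
Least squares estimate of the slope:
\[ \hat{\beta}_1 = -0.03416 \text{ (5 d.p.)} \]
Estimate of $\sigma^2$:
\[ \hat{\sigma}^2 = \frac{SS_E}{n-2} = \frac{386.6513}{7-2} = 77.33026 \approx 77.33 \text{ (2 d.p.)} \]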
(b) Estimate the mean porosity for a temperature of 1431 °C.
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = 55.62561 + (-0.03416)(1431) = 6.74265 \approx 6.74 \text{ (2 d.p.)} \]
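The ANOVA entries used in part (a) are linked by simple arithmetic, which can be confirmed as follows. This is a small sketch with illustrative variable names; the sums of squares and coefficients are copied from the Excel output for this question.

```python
# Values copied from the Excel ANOVA table for Question 3.
ss_regression = 136.6829
ss_residual = 386.6513
n = 7                      # number of observations

df_regression = 1
df_residual = n - 2

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual     # this is the estimate of sigma^2
f0 = ms_regression / ms_residual            # F statistic for the regression

# Part (b): estimated mean porosity at 1431 degrees C.
mean_porosity = 55.62561 + (-0.03416) * 1431

print(round(ms_residual, 2), round(f0, 4), round(mean_porosity, 2))
```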
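ALTERNATIVE SOLUTION
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \]
\[ S_{xy} = \sum_{i=1}^{7} x_i y_i - \frac{\left(\sum_{i=1}^{7} x_i\right)\left(\sum_{i=1}^{7} y_i\right)}{7} = 110590 - \frac{(8700)(92.2)}{7} = -4001.428571 \]
\[ S_{xx} = \sum_{i=1}^{7} x_i^2 - \frac{\left(\sum_{i=1}^{7} x_i\right)^2}{7} = 10930000 - \frac{(8700)^2}{7} = 117142.8571 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-4001.428571}{117142.8571} = -0.034158536 \approx -0.03416 \text{ (5 d.p.)} \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \frac{92.2}{7} - (-0.034158536)\frac{8700}{7} = 55.62560903 \approx 55.626 \text{ (3 d.p.)} \]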
QUESTION 4
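\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \]
\[ S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{14} = 1697.80 - \frac{(43)(572)}{14} = -59.05714286 \]
\[ S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{14} = 157.42 - \frac{(43)^2}{14} = 25.34857143 \]
\[ S_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{14} = 23530 - \frac{(572)^2}{14} = 159.7142857 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-59.05714286}{25.34857143} = -2.329801623 \approx -2.330 \text{ (3 d.p.)} \]
\[ f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)} \]
\[ SS_R = \hat{\beta}_1 S_{xy} = (-2.329801623)(-59.05714286) = 137.5914273 \approx 137.59 \text{ (2 d.p.)} \]
\[ SS_T = SS_R + SS_E \]
\[ SS_E = SS_T - SS_R = S_{yy} - SS_R = 159.7142857 - 137.5914273 = 22.1228584 \]
\[ f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)} = \frac{137.5914273/1}{22.1228584/(14-2)} = 74.63308302 \approx 74.63 \text{ (2 d.p.)} \]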
Because $f_0 = 74.63 > f_{0.05,1,12} = 4.75$, reject $H_0: \beta_1 = 0$ and conclude that compressive strength is significant in predicting
the intrinsic permeability of concrete at $\alpha = 0.05$. The model therefore specifies a
useful linear relationship between these two variables.
(a) Test for significance of regression using $\alpha = 0.05$.
Reject the null hypothesis.
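The chain from S_xy, S_xx and S_yy to the test statistic can be written as one short function. This is a sketch under stated assumptions: the function name `regression_f_statistic` is illustrative, and the critical value 4.75 is the tabulated figure quoted in the solution rather than something the code computes.

```python
def regression_f_statistic(s_xy, s_xx, s_yy, n):
    """Return f0 = MS_R / MS_E for a simple linear regression."""
    slope = s_xy / s_xx
    ss_r = slope * s_xy          # regression sum of squares (1 degree of freedom)
    ss_e = s_yy - ss_r           # error sum of squares, since SS_T = S_yy
    ms_e = ss_e / (n - 2)
    return ss_r / ms_e


F_CRITICAL = 4.75                # tabulated critical value quoted in the solution

n = 14
s_xy = 1697.80 - 43 * 572 / n
s_xx = 157.42 - 43 ** 2 / n
s_yy = 23530 - 572 ** 2 / n

f0 = regression_f_statistic(s_xy, s_xx, s_yy, n)
reject_h0 = f0 > F_CRITICAL

print(round(f0, 2), reject_h0)
```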
(b) Estimate $\sigma^2$.
\[ \hat{\sigma}^2 = MS_E = \frac{SS_E}{n-2} = \frac{22.1228584}{14-2} = 1.843571533 \approx 1.844 \text{ (3 d.p.)} \]
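(d) In this model, what is the standard error of the intercept?
\[ se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{(1.843571533)\left[\frac{1}{14} + \frac{(43/14)^2}{25.34857143}\right]} = 0.904313862 \approx 0.904 \text{ (3 d.p.)} \]
What is the standard error of the slope?
\[ se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{1.843571533}{25.34857143}} = 0.269682802 \approx 0.270 \text{ (3 d.p.)} \]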
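Both standard errors involve a square root over the whole expression, which is easy to drop in a hand calculation. The sketch below makes that explicit; the function name `standard_errors` and its parameter names are assumptions for illustration, and the inputs are the values from this question.

```python
import math


def standard_errors(sigma2_hat, n, x_bar, s_xx):
    """Return (se_intercept, se_slope) for a simple linear regression."""
    se_slope = math.sqrt(sigma2_hat / s_xx)
    se_intercept = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / s_xx))
    return se_intercept, se_slope


# Inputs taken from the Question 4 working.
se_b0, se_b1 = standard_errors(
    sigma2_hat=1.843571533, n=14, x_bar=43 / 14, s_xx=25.34857143
)

print(round(se_b0, 3), round(se_b1, 3))
```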
QUESTION 5
OUTPUT FROM SIMPLE LINEAR REGRESSION USING EXCEL
ANOVA
            df  SS        MS        F        Significance F
Regression  1   92.93353  92.93353  53.5015  8.57E-07
Residual    18  31.26647  1.737026
Total       19  124.2

           Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  -10.1315      1.9949          -5.07872  7.83E-05  -14.3227   -5.94041   -14.3227     -5.94041
x          0.174294      0.023829        7.314472  8.57E-07  0.124232   0.224356   0.124232     0.224356
\[ f_0 = \frac{MS_R}{MS_E} = \frac{92.93353}{1.737026} = 53.50151926 \approx 53.50 \text{ (2 d.p.)} \]
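OR ALTERNATIVELY, (CALCULATE BY FORMULA)
\[ S_{xy} = \sum_{i=1}^{20} x_i y_i - \frac{\left(\sum_{i=1}^{20} x_i\right)\left(\sum_{i=1}^{20} y_i\right)}{20} = 7654 - \frac{(1656)(86)}{20} = 533.2 \]
\[ S_{xx} = \sum_{i=1}^{20} x_i^2 - \frac{\left(\sum_{i=1}^{20} x_i\right)^2}{20} = 140176 - \frac{(1656)^2}{20} = 3059.2 \]
\[ S_{yy} = \sum_{i=1}^{20} y_i^2 - \frac{\left(\sum_{i=1}^{20} y_i\right)^2}{20} = 494 - \frac{(86)^2}{20} = 124.2 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{533.2}{3059.2} = 0.174293933 \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \frac{86}{20} - (0.174293933)\frac{1656}{20} = -10.13153765 \]
\[ SS_R = \hat{\beta}_1 S_{xy} = (0.174293933)(533.2) = 92.9335251 \]
\[ SS_T = SS_R + SS_E \]
\[ SS_E = SS_T - SS_R = S_{yy} - SS_R = 124.2 - 92.9335251 = 31.2664749 \]
\[ f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)} = \frac{92.9335251/1}{31.2664749/(20-2)} = 53.50150464 \approx 53.50 \text{ (2 d.p.)} \]
\[ \hat{\sigma}^2 = MS_E = \frac{SS_E}{n-2} = \frac{31.2664749}{20-2} = 1.737026383 \]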
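(a) Test for significance of regression using $\alpha = 0.05$. What is the P-value for this test? Round to 7
decimal places.
P-value = 8.57E-07 = 0.000000857 ≈ 0.0000009 (7 d.p.)
Since the P-value is less than $\alpha = 0.05$, reject $H_0: \beta_1 = 0$; the regression is significant.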
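(b) Estimate the standard errors of the slope and intercept.
Standard error of the slope:
\[ se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{1.737026383}{3059.2}} = 0.02382864 \approx 0.0238 \text{ (4 d.p.)} \]
Standard error of the intercept:
\[ se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{(1.737026383)\left[\frac{1}{20} + \frac{(1656/20)^2}{3059.2}\right]} = 1.994899887 \approx 1.995 \text{ (3 d.p.)} \]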
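(c) Test $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \neq 0$ using $\alpha = 0.05$. Find the P-value for this test.
\[ H_0: \beta_0 = 0, \qquad H_1: \beta_0 \neq 0, \qquad \alpha = 0.05 \]
\[ t_0 = \frac{\hat{\beta}_0 - \beta_{0,0}}{\sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]}} = \frac{\hat{\beta}_0 - \beta_{0,0}}{se(\hat{\beta}_0)} = \frac{-10.13153765 - 0}{1.994899887} = -5.078719849 \approx -5.079 \]
Reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-2}$.
Since $|t_0| = 5.079 > t_{\alpha/2,\,n-2} = t_{0.025,18} = 2.101$, we reject $H_0$.
P-value = 7.83E-05 = 0.0000783 ≈ 0.000078 (6 d.p.)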
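The P-value reported by Excel can be cross-checked without statistical tables by numerically integrating the Student t density. The sketch below uses only the standard library and Simpson's rule; `t_pdf`, `two_sided_p_value` and the `steps` parameter are illustrative names, and this is an approximation rather than the method Excel itself uses.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t0, df, steps=4000):
    """Approximate P(|T| > |t0|) using Simpson's rule on [0, |t0|]; steps must be even."""
    upper = abs(t0)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2      # Simpson weights: 4 on odd nodes, 2 on even
        total += weight * t_pdf(k * h, df)
    central_area = total * h / 3        # integral of the density over [0, |t0|]
    return 1 - 2 * central_area         # the density is symmetric about zero


t0_intercept = -10.13153765 / 1.994899887
p_intercept = two_sided_p_value(t0_intercept, df=18)
print(f"{p_intercept:.2e}")
```

Applied to the slope's t statistic from the Excel output, 7.314472, the same function should reproduce the Significance F value, because in simple linear regression the F test and the two-sided t test on the slope are equivalent.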
QUESTION 6
OUTPUT FROM SIMPLE LINEAR REGRESSION USING EXCEL
ANOVA
            df  SS        MS        F         Significance F
Regression  1   45.15415  45.15415  40.38286  0.000383
Residual    7   7.827059  1.118151
Total       8   52.98121

            Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept   32.04867      2.885174        11.10806  1.07E-05  25.22632   38.87102   25.22632     38.87102
Stress (x)  -0.27712      0.043608        -6.35475  0.000383  -0.38023   -0.174     -0.38023     -0.174
P-value = 0.000383 < $\alpha = 0.01$
Thus, we reject H0.
There is evidence of a linear relationship between these two variables.
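(b) Estimate the standard errors of the slope and intercept.
\[ S_{xx} = \sum_{i=1}^{9} x_i^2 - \frac{\left(\sum_{i=1}^{9} x_i\right)^2}{9} = 39397 - \frac{(591)^2}{9} = 588 \]
\[ \hat{\sigma}^2 = MS_E = 1.11815130026455 \]
\[ se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{1.11815130026455}{588}} = 0.043607543 \approx 0.044 \text{ (3 d.p.)} \]
\[ se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{(1.11815130026455)\left[\frac{1}{9} + \frac{(591/9)^2}{588}\right]} = 2.885173569 \approx 2.885 \text{ (3 d.p.)} \]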
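OR ALTERNATIVELY, (CALCULATE BY FORMULA)
\[ S_{xy} = \sum_{i=1}^{9} x_i y_i - \frac{\left(\sum_{i=1}^{9} x_i\right)\left(\sum_{i=1}^{9} y_i\right)}{9} = 8023.26 - \frac{(591)(124.663)}{9} = -162.9436667 \]
\[ S_{xx} = \sum_{i=1}^{9} x_i^2 - \frac{\left(\sum_{i=1}^{9} x_i\right)^2}{9} = 39397 - \frac{(591)^2}{9} = 588 \]
\[ S_{yy} = \sum_{i=1}^{9} y_i^2 - \frac{\left(\sum_{i=1}^{9} y_i\right)^2}{9} = 1779.744 - \frac{(124.663)^2}{9} = 52.98138122 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-162.9436667}{588} = -0.277115079 \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \frac{124.663}{9} - (-0.277115079)\frac{591}{9} = 32.04866803 \]
\[ SS_R = \hat{\beta}_1 S_{xy} = (-0.277115079)(-162.9436667) = 45.15414715 \]
\[ SS_T = SS_R + SS_E \]
\[ SS_E = SS_T - SS_R = S_{yy} - SS_R = 52.98138122 - 45.15414715 = 7.827234047 \]
\[ \hat{\sigma}^2 = MS_E = \frac{SS_E}{n-2} = \frac{7.827234047}{9-2} = 1.118176292 \]
\[ se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{1.118176292}{588}} = 0.04360803 \approx 0.044 \text{ (3 d.p.)} \]
\[ se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{(1.118176292)\left[\frac{1}{9} + \frac{(591/9)^2}{588}\right]} = 2.885205813 \approx 2.885 \text{ (3 d.p.)} \]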