Materi 5.1-Inference in Simple Linear Regression

Inference in Simple Linear Regression

Linear Regression
• Explanatory and response variables are numeric
• Relationship between the mean of the response variable and the level of the explanatory variable is assumed to be approximately linear (straight line)
• Model:
$$Y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma)$$
  • $\beta_1 > 0 \Rightarrow$ Positive Association
  • $\beta_1 < 0 \Rightarrow$ Negative Association
  • $\beta_1 = 0 \Rightarrow$ No Association
Least Squares Estimation of $\beta_0$, $\beta_1$
• $\beta_0$: mean response when $x = 0$ (y-intercept)
• $\beta_1$: change in mean response when $x$ increases by 1 unit (slope)
• $\beta_0$, $\beta_1$ are unknown parameters (like $\mu$)
• $\beta_0 + \beta_1 x$: mean response when the explanatory variable takes on the value $x$
• Goal: choose values (estimates) that minimize the sum of squared errors (SSE) of observed values to the straight line:
$$\hat{y} = \hat\beta_0 + \hat\beta_1 x \qquad SSE = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 = \sum_{i=1}^{n} \left(y_i - \left(\hat\beta_0 + \hat\beta_1 x_i\right)\right)^2$$
Example - Pharmacodynamics of LSD
• Response (y) - Math score (mean among 5 volunteers)

• Predictor (x) - LSD tissue concentration (mean of 5 volunteers)
• Raw Data and scatterplot of Score vs LSD concentration:
Score (y)   LSD Conc (x)
78.93       1.17
58.20       2.97
67.47       3.26
37.47       4.69
45.65       5.83
32.92       6.00
29.97       6.41

[Figure: scatterplot of SCORE (vertical axis, 20 to 80) against LSD_CONC (horizontal axis, 1 to 7)]
Source: Wagner, et al (1968)
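Least Squares Computations
$$S_{xx} = \sum (x - \bar{x})^2 \qquad S_{xy} = \sum (x - \bar{x})(y - \bar{y}) \qquad S_{yy} = \sum (y - \bar{y})^2$$
$$\hat\beta_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}$$
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$$
$$s^2 = \frac{\sum (y - \hat{y})^2}{n - 2} = \frac{SSE}{n - 2}$$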
Score (y)  LSD Conc (x)  x - xbar  y - ybar  (x - xbar)^2  (x - xbar)(y - ybar)  (y - ybar)^2
    78.93          1.17    -3.163    28.843     10.004569            -91.230409    831.918649
    58.20          2.97    -1.363     8.113      1.857769            -11.058019     65.820769
    67.47          3.26    -1.073    17.383      1.151329            -18.651959    302.168689
    37.47          4.69     0.357   -12.617      0.127449             -4.504269    159.188689
    45.65          5.83     1.497    -4.437      2.241009             -6.642189     19.686969
    32.92          6.00     1.667   -17.167      2.778889            -28.617389    294.705889
    29.97          6.41     2.077   -20.117      4.313929            -41.783009    404.693689
   350.61         30.33    -0.001     0.001     22.474943           -202.487243   2078.183343
(Column totals given in bottom row of table; the totals of the last three columns are Sxx, Sxy, and Syy.)
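$$\bar{y} = \frac{350.61}{7} = 50.087 \qquad \bar{x} = \frac{30.33}{7} = 4.333$$
$$\hat\beta_1 = \frac{-202.4872}{22.4749} = -9.01 \qquad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 50.09 - (-9.01)(4.33) = 89.10$$
$$\hat{y} = 89.10 - 9.01x \qquad s^2 = 50.72$$
Note: these values use rounded intermediate results. Carrying full precision gives $\hat\beta_0 = 89.12$ (as in the SPSS output) and $s^2 = 50.78$ (the error mean square in the ANOVA table).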
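The least squares computations can be checked with a short script. The following is a minimal sketch in plain Python using the LSD data from the slide; the function name `least_squares` and the variable names are illustrative choices, not part of the original material.

```python
# LSD data from the slide (Wagner et al., 1968)
score = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]  # response y
lsd_conc = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]      # predictor x


def least_squares(x, y):
    """Return (b0, b1, s2, sxx, sxy, syy) for the regression of y on x.

    Hypothetical helper written for this example; it follows the
    S_xx, S_xy, S_yy formulas from the slide.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx                  # slope estimate
    b0 = y_bar - b1 * x_bar         # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)              # residual variance estimate
    return b0, b1, s2, sxx, sxy, syy


b0, b1, s2, sxx, sxy, syy = least_squares(lsd_conc, score)
print(f"Sxx={sxx:.3f}  Sxy={sxy:.3f}  Syy={syy:.3f}")
print(f"b1={b1:.2f}  b0={b0:.2f}  s^2={s2:.2f}")
```

Because the script does not round intermediate values, it gives an intercept near 89.12 and a residual variance near 50.78, slightly different from the hand computation (89.10 and 50.72), which rounds at each step.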
SPSS Output and Plot of Equation
Math Score vs LSD Concentration (SPSS)
[Figure: scatterplot of score (vertical axis, 30.00 to 80.00) against lsd_conc (horizontal axis, 1.00 to 6.00) with the fitted linear regression line]
score = 89.12 + -9.01 * lsd_conc
R-Square = 0.88
Inference Concerning the Slope ($\beta_1$)
• Parameter: slope in the population model ($\beta_1$)
• Estimator: least squares estimate $\hat\beta_1$
• Estimated standard error: $\hat\sigma_{\hat\beta_1} = s / \sqrt{S_{xx}}$
• Methods of making inference regarding the population:
  • Hypothesis tests (2-sided or 1-sided)
  • Confidence intervals
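Hypothesis Test for $\beta_1$
2-Sided Test
• $H_0: \beta_1 = 0$
• $H_A: \beta_1 \neq 0$
• T.S.: $t_{obs} = \hat\beta_1 / \hat\sigma_{\hat\beta_1}$
• R.R.: $|t_{obs}| \geq t_{\alpha/2,\,n-2}$
• P-val: $2P(t \geq |t_{obs}|)$
1-Sided Test
• $H_0: \beta_1 = 0$
• $H_A^+: \beta_1 > 0$ or $H_A^-: \beta_1 < 0$
• T.S.: $t_{obs} = \hat\beta_1 / \hat\sigma_{\hat\beta_1}$
• R.R.: $t_{obs} \geq t_{\alpha,\,n-2}$ (for $H_A^+$) or $t_{obs} \leq -t_{\alpha,\,n-2}$ (for $H_A^-$)
• P-val: $P(t \geq t_{obs})$ (for $H_A^+$) or $P(t \leq t_{obs})$ (for $H_A^-$)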
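$$n = 7 \qquad \hat\beta_1 = -9.01 \qquad s = \sqrt{50.72} = 7.12 \qquad S_{xx} = 22.475$$
$$\hat\sigma_{\hat\beta_1} = \frac{7.12}{\sqrt{22.475}} = 1.50$$
• Testing $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$
$$\text{T.S.: } t_{obs} = \frac{-9.01}{1.50} = -6.01 \qquad \text{R.R.: } |t_{obs}| \geq t_{.025,5} = 2.571$$
Since $|t_{obs}| = 6.01 \geq 2.571$, $H_0$ is rejected at the 5% level.
• 95% Confidence Interval for $\beta_1$:
$$-9.01 \pm 2.571(1.50) \equiv -9.01 \pm 3.86 \equiv (-12.87, -5.15)$$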

Analysis of Variance in Regression
• Goal: partition the total variation in y into variation “explained” by x and random variation
$$(y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$$
$$\sum (y_i - \bar{y})^2 = \sum (y_i - \hat{y}_i)^2 + \sum (\hat{y}_i - \bar{y})^2$$
• These three sums of squares and degrees of freedom are:
  • Total ($S_{yy}$): $df_{Total} = n-1$
  • Error (SSE): $df_{Error} = n-2$
  • Model (SSR): $df_{Model} = 1$
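The partition of the total sum of squares can be verified numerically. The following sketch fits the least squares line to the LSD data from the earlier example and checks that the total sum of squares equals the error plus model sums of squares; `partition_variation` is a hypothetical helper name introduced for illustration.

```python
import math


def partition_variation(x, y):
    """Return (syy, sse, ssr) for the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    syy = sum((yi - y_bar) ** 2 for yi in y)                 # total
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # error
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # model
    return syy, sse, ssr


# LSD data from the slide (Wagner et al., 1968)
lsd_conc = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]
score = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]

syy, sse, ssr = partition_variation(lsd_conc, score)
# The identity holds up to floating-point rounding
assert math.isclose(syy, sse + ssr, rel_tol=1e-9)
print(f"Syy={syy:.3f}  SSE={sse:.3f}  SSR={ssr:.3f}")
```

The identity holds exactly only for the least squares line; for any other line the cross-product term does not vanish and the two pieces no longer add up to the total.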
Analysis of Variance in Regression
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square         F
Model                 SSR              1                    MSR = SSR/1         F = MSR/MSE
Error                 SSE              n-2                  MSE = SSE/(n-2)
Total                 Syy              n-1
• Analysis of Variance - F-test
• $H_0: \beta_1 = 0$   $H_A: \beta_1 \neq 0$
$$\text{T.S.: } F_{obs} = \frac{MSR}{MSE}$$
$$\text{R.R.: } F_{obs} \geq F_{\alpha,\,1,\,n-2}$$
$$\text{P-val: } P(F \geq F_{obs})$$
• Total Sum of Squares:
$$S_{yy} = \sum (y_i - \bar{y})^2 = 2078.183 \qquad df_{Total} = 7 - 1 = 6$$
• Error Sum of Squares:
$$SSE = \sum (y_i - \hat{y}_i)^2 = 253.890 \qquad df_{Error} = 7 - 2 = 5$$
• Model Sum of Squares:
$$SSR = \sum (\hat{y}_i - \bar{y})^2 = 2078.183 - 253.890 = 1824.293 \qquad df_{Model} = 1$$
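Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Model                 1824.293         1                    1824.293      35.93
Error                 253.890          5                    50.778
Total                 2078.183         6
• Analysis of Variance - F-test
• $H_0: \beta_1 = 0$   $H_A: \beta_1 \neq 0$
$$\text{T.S.: } F_{obs} = \frac{MSR}{MSE} = 35.93$$
$$\text{R.R.: } F_{obs} \geq F_{.05,\,1,\,5} = 6.61$$
$$\text{P-val: } P(F \geq 35.93)$$
Since $F_{obs} = 35.93 \geq 6.61$, $H_0$ is rejected at the 5% level.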
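The ANOVA table entries follow directly from the total and error sums of squares. The sketch below rebuilds the mean squares and the F statistic from the values on the slide; the critical value 6.61 is copied from the slide rather than computed, and the variable names are illustrative.

```python
# Sums of squares from the slide
n = 7
syy = 2078.183           # total sum of squares
sse = 253.890            # error sum of squares
ssr = syy - sse          # model sum of squares

df_model = 1
df_error = n - 2
df_total = n - 1

msr = ssr / df_model     # model mean square
mse = sse / df_error     # error mean square
f_obs = msr / mse

f_crit = 6.61            # F_{.05, 1, 5}, copied from the slide
reject_h0 = f_obs >= f_crit

# Proportion of total variation explained by the model
r_squared = ssr / syy

print(f"Model  SS={ssr:.3f}  df={df_model}  MS={msr:.3f}  F={f_obs:.2f}")
print(f"Error  SS={sse:.3f}  df={df_error}  MS={mse:.3f}")
print(f"Total  SS={syy:.3f}  df={df_total}")
print(f"reject H0: {reject_h0}  R-square={r_squared:.2f}")
```

The ratio SSR/Syy is about 0.88, which agrees with the R-Square value shown on the SPSS plot.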
Example - SPSS Output
[SPSS output table: contents not recoverable from the extracted text]
