Bivariate Linear Regression
Dr Menaal Kaushal
JR II
Department of S P M
S N Medical College, Agra
1 22-11-2013
Statistical analysis can be:
Univariate: when only one variable is studied, e.g.
heights of all the IV graders, ages of mothers
delivering at a DH, etc. (measures of central
tendency, measures of dispersion)
Bivariate: when the relationship between two variables
is studied, e.g. the relationship between the height and
weight of every child in the IV grade, or between a
mother’s age and the birth weight of her baby.
Multivariate: when the relationships among more than
two variables are studied, e.g. the relationship between
height, weight and MAC of every child in the IV grade.
Bivariate Regression
The “Football” Bivariate
Normal Scatter Plot
Can you identify any
difference?
How Tightly Clustered
Are these Data?
Calculating the Correlation
Coefficient
So, How to Calculate r
Formula of Correlation
Coefficient
Let's simplify:
Convert the data into standard units.
Multiply the corresponding standard-unit values
of x and y.
r is the mean of these products.
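The three steps above can be sketched in Python. This is only an illustration, not part of the original slides: the function names `standard_units` and `correlation` and the height/weight values are made up, and the SD is assumed to be the population SD (dividing by n), which is what makes r exactly the mean of the products.

```python
from statistics import mean, pstdev

def standard_units(values):
    """Convert a list to standard units: (value - mean) / SD."""
    m = mean(values)
    sd = pstdev(values)  # population SD (divides by n); an assumption of this sketch
    return [(v - m) / sd for v in values]

def correlation(xs, ys):
    """r = mean of the products of the paired standard-unit values."""
    zx = standard_units(xs)
    zy = standard_units(ys)
    return mean([a * b for a, b in zip(zx, zy)])

# Hypothetical data: heights (cm) and weights (kg) of five children
heights = [120, 125, 130, 135, 140]
weights = [22, 26, 25, 30, 32]

r = correlation(heights, weights)
print(round(r, 3))
```

If the sample SD (dividing by n - 1) were used instead, the sum of the products would have to be divided by n - 1 rather than n to give the same r.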
Properties of Correlation
Coefficient
The calculation uses only standard units, so r is a pure
number with no units.
-1 ≤ r ≤ 1
Adding a constant to one of the lists just slides the
scatter diagram so r stays the same
Heteroscedastic Curve
What r cannot tell
Association is not causation: r does not tell us “why”.
Beware of:
Outliers
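How much a single outlier can distort r is easy to see with a small, invented data set. In the sketch below (hypothetical values, hypothetical helper name `correlation`, population SD assumed), six points lie exactly on a line, and one extra point far from the rest is enough to flip the sign of r.

```python
from statistics import mean, pstdev

def correlation(xs, ys):
    """r as the mean of the products of standard-unit values."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)  # population SDs, assumed for this sketch
    return mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

# Six hypothetical points lying exactly on a rising line
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 3, 4, 5, 6, 7]
r_clean = correlation(xs, ys)

# The same data plus one outlier at (20, 1)
r_with_outlier = correlation(xs + [20], ys + [1])

print(round(r_clean, 3), round(r_with_outlier, 3))
```

Plotting the scatter diagram before computing r is the simplest safeguard, since the outlier is obvious to the eye but invisible in the single number.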
Deal with the outliers
Can you find the outlier?
Avoid “Ecological
Correlation”:
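An ecological correlation is one computed from group averages instead of individuals, and it tends to overstate how strongly the variables are related at the individual level. The sketch below uses invented data (three hypothetical groups of three individuals each, a hypothetical helper `correlation`, population SD assumed) to show the gap.

```python
from statistics import mean, pstdev

def correlation(xs, ys):
    """r as the mean of the products of standard-unit values."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)  # population SDs, assumed for this sketch
    return mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

# Hypothetical individuals, stored as (x, y) pairs within three groups
groups = {
    "A": [(-2, 2), (2, -2), (0, 0)],
    "B": [(1, 5), (5, 1), (3, 3)],
    "C": [(4, 8), (8, 4), (6, 6)],
}

# Correlation across all nine individuals
individuals = [pair for members in groups.values() for pair in members]
r_individual = correlation([x for x, _ in individuals],
                           [y for _, y in individuals])

# Ecological correlation: computed from the three group averages only
avg_x = [mean([x for x, _ in members]) for members in groups.values()]
avg_y = [mean([y for _, y in members]) for members in groups.values()]
r_ecological = correlation(avg_x, avg_y)

print(round(r_individual, 3), round(r_ecological, 3))
```

Averaging removes the spread within each group, so the group averages line up far more tightly than the individuals do.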
Regression
Each estimate is at the center of the vertical strip
The slope of the green line = r
The Equation of Regression
Estimate of y = r × given x (both in standard units)
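The equation can be sketched in Python as follows. The function name `regression_estimate` and the mother's age / birth weight figures are invented for illustration, and the population SD is assumed. The given x is converted to standard units, multiplied by r, and the result is converted back to the original units of y.

```python
from statistics import mean, pstdev

def regression_estimate(xs, ys, new_x):
    """Estimate y for new_x using: estimate of y = r * x (in standard units)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)  # population SDs, assumed for this sketch
    r = mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])
    z_x = (new_x - mx) / sx   # the given x in standard units
    z_y = r * z_x             # the estimate of y in standard units
    return my + z_y * sy      # back to the original units of y

# Hypothetical data: mother's age (years) and baby's birth weight (kg)
mothers_age = [20, 24, 28, 32, 36]
birth_weight = [2.6, 2.9, 2.8, 3.2, 3.3]

estimate = regression_estimate(mothers_age, birth_weight, 30)
print(round(estimate, 3))
```

Because r lies between -1 and 1, a given x that is 1 SD above average yields an estimate of y that is only r SDs above average.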
Why call it “Regression”?
Sir Francis Galton (1822-1911): “The Galton Effect”
“Those who have high values in one variable tend to
be not as high in the second variable”
A eugenicist who pioneered the ideas of correlation and regression
“Fathers who are tall tend to have sons who are not
quite that tall on average”
All data regress towards “mediocrity”,
i.e. towards the mean
The Regression Fallacy or Sophomore Slump
[Figure: Univariate Normal (68% of values within 1 SD of µx) and Bivariate Normal (68% of points within 1 r.m.s. error of the regression line)]
Residual Plot
Questions?