10-Correlation and Linear Regression
Regression Analysis
Introduction
General multiple regression equation:
$\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots + b_k x_k$
Multiple standard error of estimate
What is correlation analysis?
o Correlation analysis is a group of techniques to measure the relationship between
two variables
o The basic idea is to report the strength and direction of that relationship
o The usual first step is to plot the data in a scatter diagram
Example: The sales manager of North American Copier Sales, which has a large sales
force throughout the United States and Canada, wants to determine whether there is
a relationship between the number of sales calls made in a month and the number of
copiers sold that month. Data were collected for a sample of 15 sales representatives.
o Observation: As the number of sales calls increases, it appears the number of
copiers sold also increases.
o Dependent variable: the variable that is being predicted or estimated
o Independent variable: provides the basis for estimating or predicting the dependent
variable
o In this example, the independent variable is the number of sales calls and the
dependent variable is the number of copiers sold
The scatter diagram shows graphically that the sales representatives who make
more calls tend to sell more copiers.
The Correlation Coefficient
Correlation coefficient: A measure of the strength of the linear relationship between
two variables.
Originated by Karl Pearson about 1900, the correlation coefficient describes the
strength of the relationship between two sets of interval-scaled or ratio-scaled variables.
Designated r, it is often referred to as Pearson’s r and as the Pearson product-moment
correlation coefficient. It can assume any value from −1.00 to +1.00 inclusive. A correlation
coefficient of −1.00 or +1.00 indicates perfect correlation.
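Interpretation (r = 0.865):
o the relationship is direct (positive): copiers sold tend to increase as sales calls
increase
o the association is strong, because r is close to +1.00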
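As a minimal sketch of how r can be computed from paired observations, the function below uses the standard product-moment formula. The function name `pearson_r` and the toy data are assumptions made for illustration; they are not the copier sample, which is not listed in these notes.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data (NOT the copier sample): points lying exactly on a line
calls = [10, 20, 30, 40, 50]
sold_up = [3, 5, 7, 9, 11]      # rises with calls
sold_down = [11, 9, 7, 5, 3]    # falls with calls

print(pearson_r(calls, sold_up))    # 1.0  (perfect direct relationship)
print(pearson_r(calls, sold_down))  # -1.0 (perfect inverse relationship)
```

The two toy cases reproduce the endpoints of the range described above: +1.00 for a perfect direct relationship and −1.00 for a perfect inverse one.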
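Testing the significance of the correlation coefficient
Step 1: State the hypotheses: $H_0: \rho = 0$ and $H_1: \rho \neq 0$
Step 2: Level of significance: α = 0.05
Step 3: The test statistic is t, because the population standard deviation σ is
unknown and the sample size is less than 30
Step 4: Degrees of freedom: n − 2 = 15 − 2 = 13
Step 5: Compute the test statistic:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.865\sqrt{15-2}}{\sqrt{1-0.865^2}} = 6.216$$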
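Step 6:
Because the computed t of 6.216 exceeds the critical value of 2.160 (two-tailed,
α = 0.05, 13 degrees of freedom), H0 is rejected. There is evidence that the correlation
in the population is not zero. This indicates to the sales manager that there is a
correlation between the number of sales calls made and the number of copiers sold in
the population of salespeople.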
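The test can be checked numerically with a short sketch that uses the values from the example (r = 0.865, n = 15). The function name `correlation_t_statistic` is an assumption for illustration, and the critical value 2.160 is taken from a standard t table (two-tailed, α = 0.05, 13 degrees of freedom) rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.865   # sample correlation from the example
n = 15      # sample size from the example
df = n - 2  # degrees of freedom

t = correlation_t_statistic(r, n)

# Assumption: critical value read from a standard t table
# (two-tailed, alpha = 0.05, df = 13), not derived in code.
T_CRITICAL = 2.160
reject_h0 = abs(t) > T_CRITICAL

print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
# t = 6.216, df = 13, reject H0: True
```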
Regression analysis
Regression equation: An equation that expresses the linear relationship between two
variables.
Regression analysis: The technique used to develop the regression equation and
provide estimates of the dependent variable (Y).
In regression analysis, our objectives are to:
o use the data to position a line that best represents the relationship between the
two variables
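o calculate the values of a (y-intercept) and b (slope of the line) to develop a linear
equation ($\hat{y} = bx + a$) that best fits the data
Slope of the regression line:
$$b = r\,\frac{s_y}{s_x} = 0.865 \times \frac{12.89}{42.76} = 0.2608$$
Fitted regression equation:
$$\hat{y} = 0.2608x + 19.9632$$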
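The slope calculation and the use of the fitted line for prediction can be sketched as follows. The values of r, the two standard deviations, and the intercept come from the example; the function names and the input of 100 sales calls are hypothetical, chosen only to illustrate a prediction.

```python
def slope_from_correlation(r, s_y, s_x):
    """Least-squares slope b = r * (s_y / s_x)."""
    return r * s_y / s_x

def predict(x, a, b):
    """Estimated y for a given x using y_hat = a + b * x."""
    return a + b * x

r = 0.865     # sample correlation from the example
s_y = 12.89   # standard deviation of copiers sold
s_x = 42.76   # standard deviation of sales calls
a = 19.9632   # y-intercept as reported in the example

b = round(slope_from_correlation(r, s_y, s_x), 4)
print(b)  # 0.2608

# Hypothetical input: a representative who makes 100 sales calls
calls = 100
estimated_copiers = predict(calls, a, b)
print(f"{estimated_copiers:.4f}")
```

The slope is rounded to four decimals here so that the prediction uses the same coefficients as the fitted equation.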
Step 4: Degrees of freedom: n − 2 = 15 − 2 = 13
Step 6:
There is evidence that the slope is greater than 0. The
independent variable, number of sales calls, is useful in
estimating copier sales.
References
Lind, D.A., Marchal, W.G. & Wathen, S.A. (2015) Statistical Techniques in Business and
Economics, 17th Edition. McGraw-Hill