LR24 Correlation

Uploaded by

Bernard Banal
Copyright
© © All Rights Reserved

What Is This Module About?

Research studies are done to describe variation in the world. Variation is simply a difference, which can occur naturally or arise from manipulation. In Mathematics, the quantities that vary are called variables.

A variable is an attribute or characteristic that may take more than one value. Some examples of variables are the height and weight of students, the number of hours students spend playing online games, and the daily expenses of students. From these variables, information is collected and analyzed.

In this module, you will learn about the following:

Lesson 1 – Illustrating the Nature of Bivariate Data

Lesson 2 – Describing the Shape (Form), Trend (Direction), and Variation (Strength) Based on a Scatter Plot

Lesson 3 – Solving Problems Involving Correlation Analysis

What Will You Learn from This Module?

After studying this module, you will be able to:


● illustrate the nature of bivariate data,

● describe shape (form), trend (direction), and variation (strength) based on a scatter

plot, and

● solve problems involving correlation analysis.

Let’s See What You Already Know


Answer the following exercises.

A. Determine the variable/s involved in the following situations and identify whether they involve univariate or bivariate data.

1. Sean surveyed his classmates’ shoe sizes and height.

2. Joebelle conducted a survey to determine the number of male household members in their
barangay.

3. Shaun interviewed 12 of his friends to determine their daily money allowance and weight.

4. A Barangay Health Worker finds out the number of hours of sleep of 25 children and their
weight in kilograms.

5. Jarein recorded the daily time spent by her 5 friends in playing computer games.

B. Choose the correct answer to the given questions or statements.

1. Data that involve one variable are called ______________.

A. Univariate B. Bivariate
C. Multivariate D. None of these

2. Which of the following refers to data that involve two variables?

A. Univariate B. Bivariate
C. Multivariate D. None of these

3. The statistical procedure used to describe the relationship of the variables of bivariate data
is called _______________.

A. Measures of central tendency B. Measures of variation
C. Descriptive statistics D. Correlation analysis

4. What correlation is described if the points on a scatter plot follow a trend of rising from
left to right?

A. positive B. negative
C. No/Negligible correlation D. cannot be determined

5. Given the scatter plot below, describe the correlation of the variables involved.

A. positive B. negative
C. No/Negligible correlation D. cannot be determined

6. The strength of the correlation is associated with the ______ of the points around the trend
line on a scatter plot.

A. closeness B. form
C. direction D. number

7. Austin observed that the points on the scatter plot follow a trend rising from right to left and that the points are clustered closely around the trend line. What is the correlation of the variables involved?

A. strong positive B. strong negative
C. weak positive D. weak negative

8. What conclusion can you draw about the correlation from the scatter plot at the right?

A. strong positive B. strong negative
C. weak positive D. weak negative

9. Shane needs to analyze the strength of the relationship between two variables. What
statistical analysis does she need to describe the relationship?

A. Spearman Correlation B. Pearson r
C. z-test D. t-test

10. Which of the following values of r describes a weak positive correlation between two
variables?
A. +1 B. +0.56
C. +0.15 D. +0.47

Lesson 1
NATURE OF BIVARIATE DATA

In Statistical Research, choosing which variables to measure is central to good

experimental design. You need to know which variables you are working with to choose the

appropriate statistical tests and interpret the results of your study.

LET’S LEARN
In research, data collected on only one variable are called univariate data. Univariate data are often described using measures of central tendency (mean or average, median, and mode), measures of variation, or other descriptive statistics.
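As a small illustration of describing univariate data, the sketch below uses Python's standard `statistics` module on a made-up list of weekly allowances. The values and variable names are hypothetical and are not taken from this module.

```python
import statistics

# Hypothetical weekly allowances (in pesos) of 8 classmates; illustrative only.
allowances = [500, 450, 500, 600, 550, 500, 400, 700]

mean_allowance = statistics.mean(allowances)      # arithmetic average
median_allowance = statistics.median(allowances)  # middle value of the sorted data
mode_allowance = statistics.mode(allowances)      # most frequent value
value_range = max(allowances) - min(allowances)   # a simple measure of variation

print(mean_allowance, median_allowance, mode_allowance, value_range)
```

Only one variable (the allowance) is summarized here, so nothing is said about how it relates to any other variable.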

The following are sample situations involving univariate data.

| SITUATION | VARIABLE |
| --- | --- |
| 1. Michael interviewed 8 of his classmates regarding their weekly money allowance. | weekly money allowance |
| 2. Jane recorded the number of rice cakes she sells per barangay. | number of rice cakes |
| 3. A Barangay Health Worker recorded the number of nursing mothers visiting the Health Center for check-up. | number of nursing mothers |
| 4. The Brigada Eskwela Coordinator conducted a survey to determine the number of senior high school students qualified to receive a learning kit. | number of senior high school students |
| 5. A school teacher finds out that 40% of her students have their own personal computer. | number of students |

Some research studies involve two variables. One of these two variables is called the independent variable and the other is the dependent variable. The independent variable is the variable presumed to cause or influence a change in the dependent variable. The dependent variable is the variable that is influenced or affected by the independent variable.

Data that involve two variables are called bivariate data.

The following are sample situations involving bivariate data.

| SITUATION | VARIABLES |
| --- | --- |
| 1. A researcher observed the number of minutes it takes for students to answer a worded problem in Math and the number of hours they spend in studying the subject. | Number of minutes and number of hours |
| 2. A researcher recorded the number of infected Covid-19 patients and the number of days they spent in the hospital before recovering from the disease. | Number of infected patients and number of days spent in the hospital |
| 3. A group of researchers found out that long hours spent by students on the TikTok application have a negative effect on their academic grades. | Hours spent on TikTok and academic grades |
| 4. The owner of a mobile phone store in Ligao City, Albay hired a new seller and told him to record the number of mobile phones sold per day for the past 20 days. | Number of mobile phones sold and number of days |
| 5. A biologist collected data on total rainfall and total number of plants in different regions. | Total rainfall and number of plants |

Let’s Try This

UNIVARIATE or BIVARIATE?

Determine the variables in the following situations and identify whether they involve univariate or bivariate data.

| SITUATION | VARIABLE/S | UNIVARIATE or BIVARIATE |
| --- | --- | --- |
| 1. A security guard of a shopping mall estimates that on the average, the number of customers entering the mall premises is 70. | | |
| 2. A mother asked her children to minimize their water consumption so their monthly water bill will not be high. | | |
| 3. A student researcher concluded that the number of hours of sleep is highly related to the blood count of students. | | |
| 4. A businessman collected data for 10 consecutive quarters to determine the total money spent for advertising and the total revenue. | | |
| 5. A doctor recorded data of his patients of different ages and their resting heart rate. | | |

Let’s Remember
| UNIVARIATE DATA | BIVARIATE DATA |
| --- | --- |
| involving a single variable | involving two variables |
| does not deal with causes or relationships | deals with causes or relationships |
| the statistical treatments used to describe this data are measures of variation and central tendency | the statistical analysis that can be used is correlation analysis |
| the major purpose of analysis is to describe | the major purpose of analysis is to explain |

Lesson 2
SHAPE, TREND, and VARIATION based on a Scatter Plot

Bivariate data always come in pairs. For instance, a researcher wants to find out if there is a relationship between height and weight. Here, height is the independent variable and weight is the dependent variable. If a person gets taller, his weight may increase, but an increase in his weight will not make the person taller. A relationship between the two variables does not by itself mean that one causes the other; it simply means that there is an association between them. The heights of the students, which may be in centimeters, and the weights of the students, which may be in kilograms, are the bivariate data.

LET’S LEARN
Scatter Plots are diagrams that are used to show the degree and pattern of relationship

between the two sets of data. They are constructed on the xy coordinate plane. Each data point on

the scatter plot represents two values (x , y).

EXAMPLE: The table below shows the time in hours (x) spent by six grade 11 students in studying their lessons and their scores (y) on a test. Construct a scatter plot.

| Time Spent (x) | 1 | 2 | 3 | 4 | 5 | 6 |
| Score (y) | 5 | 15 | 10 | 15 | 30 | 35 |

SOLUTION: To construct the scatter plot,

Step 1: Draw the horizontal axis (x) and the vertical axis (y).

Step 2: Plot the data points (x, y).

The points plotted on the xy coordinate plane seem to follow a straight line that points upward to the right. This indicates that the two variables are to some extent linearly related and that the relationship between them is positive. The trend is positive because as the time spent studying increases, the scores also increase.
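A scatter plot is normally drawn on graphing paper or with a plotting tool. As a rough sketch only, the Python snippet below pairs the x and y values from the table and prints a coarse text version of the plot. The grid layout (one row per 5 points of score) is an assumption made for this illustration and works here because every score is a multiple of 5.

```python
# Data from the example: hours spent studying (x) and test scores (y).
hours = [1, 2, 3, 4, 5, 6]
scores = [5, 15, 10, 15, 30, 35]

# Each point on the scatter plot is one (x, y) pair.
points = list(zip(hours, scores))

# Build a coarse text grid: one row per score level (step of 5), top row first.
rows = []
for level in range(35, 0, -5):
    cells = ["*" if (x, level) in points else "." for x in hours]
    rows.append(f"{level:>2} | " + " ".join(cells))

# Label the horizontal axis with the x-values.
rows.append("     " + " ".join(str(x) for x in hours))

print("\n".join(rows))
```

Reading the printed grid from left to right, the stars generally climb toward the upper right, which is the positive trend described above.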

Correlation between two variables can be described according to its shape (form), trend

(direction), and variation (strength) based on a scatter plot.

The form of correlation can be determined by the shape of points on a scatter plot

categorized as Linear or Curvilinear.

The form of correlation is linear if the points on a scatter plot follow the trend of a straight line. However, if the points follow the trend of a curved line, the form is categorized as curvilinear.

Observe the scatter plots below.

[Figures: a scatter plot with a linear form and a scatter plot with a curvilinear form]

The following Scatter Plots describe the Trend (Direction) of Correlation

POSITIVE CORRELATION
A positive correlation exists when high
values of one variable correspond to high
values of another variable or low values
of one variable correspond to low values
of another variable. The points follow a
trend rising from left to right.

NEGATIVE CORRELATION
A negative correlation exists when high
values of one variable correspond to low
values of another variable or low values
of one variable correspond to high values
of another variable. The points follow a
trend rising from right to left.

NO/NEGLIGIBLE CORRELATION
No/Negligible correlation exists when
high values of one variable correspond to
either high or low values of another
variable. The points are neither rising
from left to right nor from right to left.

The closeness of the points to the trend line determines the variation (strength) of the correlation between the variables involved. The closer the points are to the trend line, the stronger the correlation between the variables.
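One way to make "closeness to the trend line" concrete is to fit a line and measure how far each point lies from it. The sketch below assumes a least-squares line as the trend line, which is a common choice but is not specified in this module, and reuses the study-time data from the earlier example.

```python
# Data from the earlier example: hours studied (x) and scores (y).
x = [1, 2, 3, 4, 5, 6]
y = [5, 15, 10, 15, 30, 35]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept (an assumed choice of trend line).
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

# Vertical distance of each point from the trend line.
distances = [abs(yi - (intercept + slope * xi)) for xi, yi in zip(x, y)]
average_distance = sum(distances) / n

print(round(slope, 2), round(intercept, 2), round(average_distance, 2))
```

A positive slope matches a trend rising from left to right, and a smaller average distance means the points hug the line more tightly, which corresponds to a stronger correlation.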

The strength of the correlation between two variables can be perfect, strong, moderate, weak, or no/negligible correlation.

Consider the following scatter plots:

● Strong positive correlation: the dots are concentrated around the trend line rising from left to right.

● Strong negative correlation: the dots are concentrated around the trend line rising from right to left.

● Weak positive correlation: the dots are widely spread around the trend line rising from left to right.

● Weak negative correlation: the dots are widely spread around the trend line rising from right to left.

● Moderately positive correlation: the dots are not close to, but not too far from, the trend line rising from left to right.

● Moderately negative correlation: the dots are not close to, but not too far from, the trend line rising from right to left.

EXAMPLE 1: Construct a scatter plot based on the table below. Describe the shape, trend, and variation of the correlation between the two variables.

| x | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
| y | 5 | 10 | 5 | 25 | 10 | 15 | 20 | 10 |

SOLUTION:

Given the table in Example 1, the points on the scatter plot seem to follow a straight line

upward to the right at certain intervals. However, it cannot be said that the correlation is perfect

positive, not even moderately positive. There is a weak correlation between the two variables

because the data points are widely spread and far from the straight line.
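The visual judgment above can be cross-checked numerically. The sketch below, written in Python with illustrative variable names, applies the Pearson r formula introduced in Lesson 3 (with numerator n∑xy − (∑x)(∑y)) to the table in Example 1.

```python
import math

# Table from Example 1.
x = [2, 4, 6, 8, 10, 12, 14, 16]
y = [5, 10, 5, 25, 10, 15, 20, 10]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Pearson r from the raw sums.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r, 3))
```

The result, about 0.41, falls in the +0.31 to +0.50 band that the Lesson 3 interpretation table labels a weak positive correlation, in line with the reading of the scatter plot.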

EXAMPLE 2: Miguel Devela owns a private resort. He decided to record the temperature for one week and the number of guests who came to his resort that week. The data are shown below.

Construct a scatter plot and determine the trend and strength of correlation between the variables.

| Day | Temperature in ℃ | Number of Guests |
| --- | --- | --- |
| 1 | 27 | 520 |
| 2 | 28 | 955 |
| 3 | 29 | 1400 |
| 4 | 31 | 2300 |
| 5 | 26 | 800 |
| 6 | 30.5 | 1000 |
| 7 | 31 | 2500 |

SOLUTION:

In Example 2, the independent variable is the temperature in ℃ while the dependent

variable is the number of guests. As shown in the scatter plot, the points seem to follow a straight

line rising from left to right. There is a moderate positive correlation between the two variables

because the points are not too far from the trend line.

Let’s Try This

A. Describe the shape, trend, and variation of the variables based on the following scatter plots:

1. [Scatter plot 1]

2. [Scatter plot 2]

B. The table below shows the number of hours spent by 10 grade 11 students in using the Internet and their scores on a Mathematics test.

| Student | Hours using the Internet | Scores |
| --- | --- | --- |
| 1 | 1 | 30 |
| 2 | 2 | 25 |
| 3 | 3 | 20 |
| 4 | 2 | 30 |
| 5 | 0.5 | 26 |
| 6 | 2.5 | 15 |
| 7 | 2 | 32 |
| 8 | 3 | 28 |
| 9 | 1.5 | 25 |
| 10 | 0.5 | 18 |

1. Identify the independent and the dependent variable.

2. Construct a scatter plot.

3. Determine the direction of the straight line that the data points seem to follow.

4. Determine the strength of correlation between the two variables.

What Have You Learned

Complete the following statements.

1. Univariate data consists of only ___________ variable.

2. Data that involve two variables are called ____________.

3. The statistical treatments used to describe univariate data are measures of variation and ___________.

4. The statistical analysis that can be used in bivariate data is ___________.

5. If the data given in an experiment can only be described by the measures of central tendency and variation, then the type of data given is ____________.

6. When the points on a scatter plot follow a trend that rises from left to right, the data indicate a _____________.

7. Two variables have _____________ when the points on the scatter plot are neither rising from left to right nor from right to left.

8. _____________ are diagrams used to show the degree and pattern of relationship between two sets of data.

9. If the points on a scatter plot do not follow the trend of a straight line, the form is categorized as _____________.

10. Correlation between two variables can be described according to its shape, trend, and _____________ based on a scatter plot.

Lesson 3
CORRELATION ANALYSIS

Correlation Analysis is the Statistical procedure used to determine and describe the

relationship between two variables.

LET’S LEARN

The Pearson Product Moment Correlation Coefficient (also known as Pearson r), denoted by r, is a test statistic that measures the strength and direction of the linear relationship between two variables. To find r, the following formula is used.

\[
r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\]

where:

n = number of paired values

∑x = sum of x-values

∑y = sum of y-values

∑xy = sum of the products of paired values x and y

∑x² = sum of squared x-values

∑y² = sum of squared y-values
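The quantities defined above translate directly into a short function. The following is a minimal Python sketch, not an official implementation; the function name `pearson_r` is hypothetical, and the numerator is taken as n∑xy − (∑x)(∑y), which is the form used in the worked examples of this lesson.

```python
import math

def pearson_r(xs, ys):
    """Return Pearson r for paired values using the raw-sums formula."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of values")
    n = len(xs)                                   # number of paired values
    sum_x = sum(xs)                               # sum of x-values
    sum_y = sum(ys)                               # sum of y-values
    sum_xy = sum(a * b for a, b in zip(xs, ys))   # sum of products of pairs
    sum_x2 = sum(a * a for a in xs)               # sum of squared x-values
    sum_y2 = sum(b * b for b in ys)               # sum of squared y-values

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return numerator / denominator

print(pearson_r([1, 2, 3], [2, 4, 6]))
```

The guard on a zero denominator is an added safety check: if all x-values (or all y-values) are equal, the formula would divide by zero and r is not defined.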

Let’s Remember

The value of r ranges from +1 to -1.


If r = +1, then the variables have a perfect positive correlation
If r = -1, then the variables have a perfect negative correlation
A value of r that is close to +1 indicates a strong positive correlation, and
A value of r that is close to -1 indicates a strong negative correlation.

The following table can be used in interpreting the degree of linear relationship existing between the two variables.

| Value of r | Strength of Correlation |
| --- | --- |
| +1 | Perfect positive correlation |
| +0.71 to +0.99 | Strong positive correlation |
| +0.51 to +0.70 | Moderately positive correlation |
| +0.31 to +0.50 | Weak positive correlation |
| +0.01 to +0.30 | Negligible positive correlation |
| 0 | No correlation |
| -0.01 to -0.30 | Negligible negative correlation |
| -0.31 to -0.50 | Weak negative correlation |
| -0.51 to -0.70 | Moderately negative correlation |
| -0.71 to -0.99 | Strong negative correlation |
| -1 | Perfect negative correlation |
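The interpretation table can also be expressed as a small lookup function. This is a sketch under stated assumptions: the name `interpret_r` is hypothetical, and values that fall between two listed bands (for example 0.705) are assigned to the stronger band, which the table itself does not specify.

```python
def interpret_r(r):
    """Map a value of r to the label used in the interpretation table."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "No correlation"

    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "Perfect"
    elif size > 0.70:
        strength = "Strong"
    elif size > 0.50:
        strength = "Moderately"
    elif size > 0.30:
        strength = "Weak"
    else:
        strength = "Negligible"
    return f"{strength} {direction} correlation"

print(interpret_r(0.962))
print(interpret_r(-0.245))
```

The sign of r gives the direction and its absolute value gives the strength, so the same thresholds serve both the positive and the negative rows of the table.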

EXAMPLE 1: The table below shows the time in hours spent studying (x) by six grade 11 students and their scores on a test (y). Solve for Pearson’s Sample Correlation Coefficient, r.

| x | 1 | 2 | 3 | 4 | 5 | 6 |
| y | 5 | 10 | 15 | 15 | 25 | 35 |

| x | y | xy | x² | y² |
| --- | --- | --- | --- | --- |
| 1 | 5 | 5 | 1 | 25 |
| 2 | 10 | 20 | 4 | 100 |
| 3 | 15 | 45 | 9 | 225 |
| 4 | 15 | 60 | 16 | 225 |
| 5 | 25 | 125 | 25 | 625 |
| 6 | 35 | 210 | 36 | 1225 |
| ∑x = 21 | ∑y = 105 | ∑xy = 465 | ∑x² = 91 | ∑y² = 2425 |

SOLUTION:

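Using the formula,

\[
\begin{aligned}
r &= \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} \\[4pt]
  &= \frac{6(465) - (21)(105)}{\sqrt{\left[6(91) - (21)^2\right]\left[6(2425) - (105)^2\right]}} \\[4pt]
  &= \frac{2790 - 2205}{\sqrt{[546 - 441][14\,550 - 11\,025]}} \\[4pt]
  &= \frac{585}{\sqrt{[105][3525]}} \\[4pt]
  &= \frac{585}{\sqrt{370\,125}} \\[4pt]
  &= 0.96157 \text{ or } 0.962
\end{aligned}
\]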
The value r = 0.962 is between +0.71 and +0.99 in the table for interpretation of r.

It indicates that there is a strong positive correlation between the time in hours spent in studying

and the scores on a test.
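The table columns and the arithmetic in this solution can be reproduced with a few lines of Python. This is only a checking aid with illustrative variable names; it follows the same steps as the hand computation.

```python
import math

# Data from Example 1.
x = [1, 2, 3, 4, 5, 6]
y = [5, 10, 15, 15, 25, 35]
n = len(x)

# Build the xy, x^2 and y^2 columns of the computation table.
xy = [a * b for a, b in zip(x, y)]
x_sq = [a ** 2 for a in x]
y_sq = [b ** 2 for b in y]

numerator = n * sum(xy) - sum(x) * sum(y)            # 6(465) - (21)(105)
denominator = math.sqrt(
    (n * sum(x_sq) - sum(x) ** 2) * (n * sum(y_sq) - sum(y) ** 2)
)
r = numerator / denominator

print(sum(x), sum(y), sum(xy), sum(x_sq), sum(y_sq))
print(round(r, 3))
```

The printed sums should agree with the last row of the table, and the rounded value of r with the result of the solution.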

Note: In simplifying complicated numerical expressions with more than one binary operation, follow the GEMDAS Rule:
G – Grouping Symbols ( ), [ ], { }
E – Exponents
M – Multiply
D – Divide
A – Add
S – Subtract

EXAMPLE 2: Sam and Jorge traveled from City A to City B at a constant rate of 40 kilometers per hour. The distance between City A and City B is 280 kilometers. Jorge decided to record on her smartphone the distance they still needed to travel after 1 hour, 2 hours, 3 hours, and so on until they reached City B. These are shown in the following table, where x is the number of hours traveled and y is the remaining distance in kilometers. Solve for the Pearson Product Moment Correlation Coefficient.

| x | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| y | 240 | 200 | 160 | 120 | 80 | 40 | 0 |

| x | y | xy | x² | y² |
| --- | --- | --- | --- | --- |
| 1 | 240 | 240 | 1 | 57 600 |
| 2 | 200 | 400 | 4 | 40 000 |
| 3 | 160 | 480 | 9 | 25 600 |
| 4 | 120 | 480 | 16 | 14 400 |
| 5 | 80 | 400 | 25 | 6 400 |
| 6 | 40 | 240 | 36 | 1 600 |
| 7 | 0 | 0 | 49 | 0 |
| ∑x = 28 | ∑y = 840 | ∑xy = 2 240 | ∑x² = 140 | ∑y² = 145 600 |

SOLUTION:

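Using the formula,

\[
\begin{aligned}
r &= \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} \\[4pt]
  &= \frac{7(2240) - (28)(840)}{\sqrt{\left[7(140) - (28)^2\right]\left[7(145\,600) - (840)^2\right]}} \\[4pt]
  &= \frac{15\,680 - 23\,520}{\sqrt{[980 - 784][1\,019\,200 - 705\,600]}} \\[4pt]
  &= \frac{-7840}{\sqrt{[196][313\,600]}} \\[4pt]
  &= \frac{-7840}{\sqrt{61\,465\,600}} \\[4pt]
  &= -1
\end{aligned}
\]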
The value r = -1 shows a perfect negative correlation between the two variables.
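Because the remaining distance falls by exactly 40 km every hour, every point lies on one straight line, which is why r comes out to exactly -1. A minimal Python sketch, assuming the relationship y = 280 − 40x implied by the problem statement, regenerates the table and repeats the computation; the variable names are illustrative.

```python
import math

hours = list(range(1, 8))                    # x: 1, 2, ..., 7 hours traveled
remaining = [280 - 40 * h for h in hours]    # y: kilometers left to travel

n = len(hours)
sum_x, sum_y = sum(hours), sum(remaining)
sum_xy = sum(h * d for h, d in zip(hours, remaining))
sum_x2 = sum(h * h for h in hours)
sum_y2 = sum(d * d for d in remaining)

# Pearson r from the raw sums, as in the hand computation.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(remaining, r)
```

Any data set whose points all lie on a single falling straight line gives r = -1 in the same way, and a single rising straight line gives r = +1.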

Let’s Try This

A. Interpret the degree of linear relationship existing between the two variables given the following values of r.

1. r = -0.245

2. r = + 1

3. r = +0.563

4. r = -0.891

5. r = 0.422

B. Compute and interpret the Pearson Product Moment Correlation Coefficient, r.

1. The following are the heights in centimeters of 8 employees and their weights in kilograms.

| Employee | Height in centimeters | Weight in kilograms |
| --- | --- | --- |
| 1 | 160 | 52 |
| 2 | 162 | 57 |
| 3 | 167 | 61 |
| 4 | 158 | 54 |
| 5 | 167 | 64 |
| 6 | 170 | 68 |
| 7 | 163 | 59 |
| 8 | 164 | 64 |

What Have You Learned


Fill in the blanks with the correct answer.

1. The Statistical procedure used to determine and describe the relationship between two

variables is the _________________.

2. The test statistic that measures the strength of the relationship between two variables

is the ___________________.

3. The Pearson Product Moment Correlation Coefficient is also known as

_________________.

4. In Pearson r, r ranges from __________________.

5. When the value of r is +1, the strength of correlation is _________________.

Let’s Sum Up

Let us summarize what you have learned in this module.


❖ In research studies, univariate data involve a single variable while bivariate data involve two variables.

❖ Based on a scatter plot, the correlation between two variables can be described based on the following:

1. Shape (form) of the correlation between two variables can be linear or curvilinear.
2. Trend (direction) of the correlation between two variables can be positive, negative, or no/negligible correlation.
3. Variation (strength) of the correlation between two variables can be perfect, strong, moderate, weak, or no/negligible correlation.

❖ In analyzing the correlation between two variables, the formula for Pearson’s Sample Correlation Coefficient, r, is used.

\[
r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\]

Let’s See What You Have Learned

Try answering the following exercises.

A. Choose the correct answer to the given statements.

1. Which of the following types of data involves two variables?

A. Univariate B. Bivariate

C. Multivariate D. None of these

2. Correlation Analysis is the statistical procedure used to describe the relationship of the

variables of ________________.

A. Univariate data B. Bivariate data

C. Multivariate data D. None of these

3. What correlation is described if the trend of the points on a scatter plot is neither rising from left to right nor from right to left?

A. Positive B. negative

C. No/Negligible D. cannot be determined

4. Reign observed that the points on the scatter plot follow a trend of rising from

left to right and are scattered widely from the trend line. What conclusion can you

draw from the correlation of the variables based on scatter plot described?

A. Strong positive B. strong negative

C. weak positive D. weak negative

5. Which of the following values of r corresponds to a perfect negative correlation?

A. 0 B. -0.50

C. -0.75 D. -1

B. The table below shows the number of cups of coffee that 8 persons had in one week

and their systolic blood pressure number after one week.

| Person | Number of Cups of Coffee | Systolic Blood Pressure Number |
| --- | --- | --- |
| 1 | 5 | 110 |
| 2 | 4 | 115 |
| 3 | 7 | 100 |
| 4 | 10 | 120 |
| 5 | 14 | 130 |
| 6 | 6 | 110 |
| 7 | 12 | 110 |
| 8 | 16 | 120 |

1. Determine the independent variable and the dependent variable.

2. Construct a scatter plot.

3. Determine the direction of the straight line that the data points seem to follow.

4. Determine the strength of correlation between the two variables.

C. The table below shows the number of selfies (x) posted online and the scores (y) obtained from a Math test. Solve for the Pearson Product Moment Correlation Coefficient, r.

| x | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| y | 40 | 45 | 50 | 65 | 55 | 60 | 75 |

ANSWER KEY

Let’s See What You Already Know

A.

1. Shoe sizes and height / bivariate

2. Number of male household members / univariate

3. Daily money allowance and weight / bivariate

4. Number of hours of sleep & weight / bivariate

5. Daily time spent / univariate

B. 1. A 6. A

2. B 7. B

3. D 8. A

4. A 9. B

5. C 10. D

Lesson 1: Let’s Try This

1. Number of customers / univariate

2. Water consumption and monthly water bill / bivariate

3. Number of hours of sleep and blood count / bivariate

4. Money spent on advertising and total revenue / bivariate

5. Age and resting heart rate / bivariate

Lesson 2: Let’s Try This

A. 1. Linear / positive / strong positive

2. linear / negative / moderately negative

B. 1. Hours using the internet (independent variable)


Scores (dependent variable)

2.

3. neither rising from left to right nor from right to left

4. no/negligible correlation

What Have you Learned

1. One
2. Bivariate data
3. Central tendency
4. Correlation analysis
5. Univariate
6. Positive correlation
7. No/negligible correlation
8. Scatter plots
9. Curvilinear
10. Variation

Lesson 3: Let’s Try This

A. 1. Negligible negative correlation

2. perfect positive correlation
3. moderately positive correlation
4. strong negative correlation
5. weak positive correlation

B. r = 0.9104 (there is a strong positive correlation between the height and weight of 8
employees)

What Have you Learned

1. Correlation Analysis
2. Pearson Product Moment Correlation Coefficient
3. Pearson r
4. +1 to -1
5. Perfect positive

Let’s See What You Have Learned

A. 1. B
2. B
3. C
4. C
5. D

B. 1. Number of cups of coffee (independent variable)


Systolic blood pressure number (dependent variable)

2.

3. rising from left to right (positive correlation)

4. moderately positive correlation

C. r = 0.896

