Ge 4 - Midterm Learning Material


GE 4 – MATHEMATICS IN THE MODERN WORLD
(Midterm)
Secillano, Patrick Joseph N.
UST-LEGAZPI PRAYER

Lord, in our weakness and vulnerability, bless us
with your grace to soar beyond limits. Enlighten
our vision and guide our mission that we may
clearly see and fully realize our quest. Keep our
passion for the truth burning and our
compassion for humanity bright that we may live
truly and lovingly.

Keep us in harmony with the universe that we
may be joyfully one with your creation. Yet
above and before all, Lord, grant us the grace to
love you with all our mind and soul and with all
our heart and strength that we may praise, bless
and preach according to your will.

Make us, Legazpi Thomasians, whole as a
person and as a community in Your wondrous
Name, this we ask and pray with a happy and
grateful memory. Amen.
INTRODUCTION
TO STATISTICS
What is Statistics?
• Statistics - The science of collecting, analyzing,
presenting, and interpreting data. Data are the facts
and figures that are collected, analyzed, and
summarized for presentation and interpretation. Data
may be classified as either quantitative or qualitative.
Quantitative data measure either how much or how
many of something, and qualitative data provide
labels, or names, for categories of like items.
Two types of Statistics
• Descriptive statistics is a method for organizing and
summarizing data. For example, tables or graphs are
used to organize data, and descriptive values such as
the average score are used to summarize data.
• A descriptive value for a population is called a
parameter and a descriptive value for a sample is
called a statistic.
• Example: Measures of Central Tendency,
Dispersion, Position, etc.
Two types of Statistics
•Inferential statistics is a method for using
sample data to make general conclusions
(inferences) about populations.
•Because a sample is typically only a part of the
whole population, sample data provide only
limited information about the population.
• Example: Hypothesis testing, Regression
Analysis, etc.
What are variables in Statistics?
•A variable is a characteristic or
condition that can change or take on
different values. Most research begins
with a general question about the
relationship between two variables for a
specific group of individuals.
OBSERVE
What is a Population?
•The entire group of individuals is called the
population. For example, a researcher may
be interested in the relation between class
size (variable 1) and academic performance
(variable 2) for the population of freshmen
students of an institution.
Why do we need to get a sample from
a population?
•Usually populations are so large that a
researcher cannot examine the entire group.
Therefore, a sample is selected to represent
the population in a research study. The goal
is to use the results obtained from the sample
to help answer questions about the
population.
Types of Quantitative Variables
•Variables can be classified as discrete or
continuous.
• Discrete variables (such as class size) consist of
indivisible categories.
•Continuous variables (such as time or weight) are
infinitely divisible into whatever units a researcher
may choose. For example, time can be measured to
the nearest minute, second, half-second, etc.
Measuring Variables
•To establish relationships between variables,
researchers must observe the variables and record
their observations. This requires that the variables
be measured.
•The process of measuring a variable requires a set
of categories called levels of measurement and a
process that classifies each individual into one
category.
LEVELS OF MEASUREMENT
• In descriptive statistics, there are four scales
of measurement that can be used to explain
data:
NOMINAL
ORDINAL
INTERVAL
RATIO
LEVELS OF MEASUREMENT
Each scale is an incremental level of measurement,
meaning each scale fulfills the function of the
previous scale. All survey question scales, such as
Likert, Semantic Differential, and Dichotomous, are
derived from these four fundamental levels of
variable measurement. Before discussing the four
levels of measurement in detail, with examples,
let's take a brief look at what these scales
represent.
QUALITATIVE DATA (Categorical)
• NOMINAL - Data created by assigning
observations into various independent
categories and then counting the
frequency of occurrence within each of
the categories. (e.g. Name, Gender,
Address, etc.)
QUALITATIVE DATA (Categorical)
•ORDINAL - A scale in which scores
indicate only relative amounts or rank
order. (e.g. Government Position,
Status of employment, etc.)
QUANTITATIVE DATA (MEASURABLE)
•INTERVAL - A scale in which equal
differences in scores represent equal
differences in amount of the property
measured, but with an arbitrary zero
point. Distance is meaningful. (e.g. temperature
in Celsius or Fahrenheit, calendar year, IQ score, etc.)
QUANTITATIVE DATA (MEASURABLE)
•RATIO - All the properties of an
interval scale with the additional
property of a true (absolute) zero,
indicating a total absence of the property
being measured. (e.g. Height, weight, etc.)
SUMMARY
EXERCISE 1: CLASSIFICATION OF DATA
Instruction: Classify each of the given variables by type of data and by the
corresponding level of measurement. For Qualitative – Nominal or Ordinal; for
Quantitative – Interval or Ratio; then determine if discrete or continuous.
1. Driver’s License Number
2. Likert Scale (Excellent, Very Satisfactory, …, Poor)
3. Test Scores (IQ)
4. Temperature in Kelvin
5. Age
SAMPLE AND
SAMPLING
TECHNIQUES
WHAT IS A SAMPLE SIZE?
•The sample size is the number of individuals from the
general population who take part in the study. In
most cases, it's important that the sample chosen
is representative of the wider population, so that
any conclusions drawn from the study can be
reasonably extrapolated to individuals who did
not directly take part.
FORMULA IN SOLVING FOR THE
SAMPLE SIZE
• If you take a sample from a population, you need a formula to figure out
what sample size to take. Sometimes you know something
about a population, which can help you determine a sample size. When
nothing else is known about the population, you can use Slovin’s formula
to figure out what sample size you need to take, which is written as:
$$n = \frac{N}{1 + N e^2}$$
Where:
• n =Sample Size,
• N = Total population and
• e = margin of error; e = 1 – confidence level
EXAMPLE
• How many samples do we need from a population of 1,000 people with a 95%
confidence level in a survey to be conducted in Barangay Lamba, Legazpi City?
• Step 1: Let e = 1 – confidence level
e = 1 – 0.95 = 0.05 or 5%
• Step 2: Plug your data into the formula. In this example, we will use a 5% (0.05)
margin of error with a population size of 1,000.
Solve:
$$n = \frac{1000}{1 + (1000)(0.05)^2} = \frac{1000}{3.5} = 285.714286\ldots \approx 286$$
NOTE: Rule: Always round a sample size calculation up. Rounding up is the
conservative choice: the sample is never smaller than the size the formula calls for.
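The computation above can be sketched in a few lines of Python. This is only an illustration: the function name `slovin_sample_size` is made up for this example, and it assumes the rounding-up rule stated in the note.

```python
import math

def slovin_sample_size(population: int, margin_of_error: float) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up to a whole unit."""
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)  # round up, following the rule in the note

# Example from the slide: N = 1,000 and e = 0.05
print(slovin_sample_size(1000, 0.05))  # 286
```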
PROBABILITY SAMPLING TECHNIQUES
There are two broad sampling techniques: probability and
non-probability. In probability sampling techniques,
every unit in the population has a chance of being selected in
the sample, and this chance can be determined. Sample
statistics produced are unbiased estimates of population
parameters if the sampled units are weighted according to
their probability of selection. Results are generalizable to the
population. The following are the probability sampling
techniques.
SIMPLE RANDOM SAMPLING
Simple Random Sampling - All members of the population have an equal chance of
being included in the sample. Examples are lottery sampling and using the
table of random numbers.
[Figure: excerpt from a table of random numbers, with rows such as 33658965431, 43538463782, and 83906213564]
Note: The digits read from the table of random numbers are determined by the
population size: no selected unit may exceed the population size, and selection
stops once the solved sample size is reached. Example: N = 4500
and n = 367; find the 1st and 6th sample units: 1st = 3,365 and 6th = 3,906.
SYSTEMATIC SAMPLING
Systematic Sampling −The sampling frame is ordered according to some
criteria and elements are selected at regular intervals through that ordered list.
Involves random start and then proceeds with the selection of every kth element
from that point onwards, where k = N/n. It ensures that there is no
overrepresentation of large or small firms in the sample, but rather firms of all
sizes are generally uniformly represented.
Formula: k = population size (N) / sample size (n)

For the middlemost term/s:
If n is odd – the (n+1)/2 th term
If n is even – the (n)/2 th and (n+2)/2 th terms

Formula for the nth unit: nth = r + (n-1) k, where r is the random start


EXAMPLE 1
Find the Middle term unit/s
where K = 15, n = 86 and the
random start is 3.
Solve:
43rd = 3 + (43-1)15
43rd = 633
44th = 3 + (44-1) 15
44th= 648
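As a rough sketch, the same steps can be written in Python. The helper names `middle_positions` and `systematic_unit` are hypothetical and only mirror the formulas above.

```python
def systematic_unit(position: int, random_start: int, k: int) -> int:
    """Unit selected at a given position: r + (position - 1) * k."""
    return random_start + (position - 1) * k

def middle_positions(n: int) -> list[int]:
    """Middlemost position(s): one if n is odd, two if n is even."""
    if n % 2 == 1:
        return [(n + 1) // 2]
    return [n // 2, (n + 2) // 2]

# Example 1: K = 15, n = 86, random start = 3
positions = middle_positions(86)
units = [systematic_unit(p, random_start=3, k=15) for p in positions]
print(positions, units)  # [43, 44] [633, 648]
```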
EXAMPLE 2
Example: Find the middlemost term/s if the sample size is 103, where r = 4 and e = 5%.
Step 1: Solve for N
$$103 = \frac{N}{1 + N(0.05)^2}$$
$$103 + 103N(0.05)^2 = N$$
$$103 = N - 103N(0.05)^2$$
$$103 = N[1 - 103(0.05)^2]$$
$$N = \frac{103}{1 - 103(0.05)^2} = 138.72\ldots, \text{ taken as } N = 138$$
Step 2: Solve for K
$$K = \frac{138}{103} = 1.339\ldots = 1$$
Step 3: Solve for the middlemost term
n is odd, so the middlemost term is the (103+1)/2 = 52nd term.
52nd = 4 + (52 – 1)(1)
52nd = 4 + 51 = 55
EXAMPLE 3
Example: Find the first two units, the middle term unit/s, and the last two units
where N = 1290, e = 5% and the random start is 3.
Solve 1:
$$n = \frac{1290}{1 + 1290(0.05)^2} = 305.32\ldots$$
n = 306
Solve 2: k = N/n
k = 1290/306 = 4.21… = 4
1st unit = random start = 3
2nd unit = r + k = 3 + 4 = 7
Solve 3: Middle terms: (n)/2 and (n+2)/2
(306)/2 = 153rd
153rd = 3 + (153 - 1)4
153rd = 611
(306+2)/2 = 154th
154th = 3 + (154 - 1)4
154th = 3 + 612 = 615
Solve 4: Last two units
305th = 3 + (305 - 1)4
305th = 1219
306th = 3 + (306 - 1)4
306th = 1223
Persons 3, 7, 611, 615, 1219 and 1223 are the first two units, middle term units
and last two units, respectively, in the conducted study.
STRATIFIED SAMPLING
Stratified Random Sampling - This method is used when the
population is too big to handle as a whole, so N is divided into subgroups,
which are called strata. Samples per stratum are then
randomly selected, taking the sizes of the subgroups into account
when setting the size of each random sample.

Formula:
$$n_i = \frac{N_i}{N} \times n$$
or, if the standard deviation (sd) of each stratum is given,
$$n_i = \frac{N_i \sigma_i}{N_i \sigma_i + N_{(i+1)} \sigma_{(i+1)} + \cdots + N_{(i+x)} \sigma_{(i+x)}} \times n$$
EXAMPLE
CASE 1
Example: CHS - 200, CASE - 120, CBMA - 450
Step 1: Compute for “N”
N = 200 + 120 + 450 = 770
Then use Slovin's formula (use the indicated “e”; e = 5%)
n = 264
Step 2: Solve for the number of samples per stratum.
CHS = (200/770) x 264 = 69
CASE = (120/770) x 264 = 41
CBMA = (450/770) x 264 = 154
CASE 2
Just follow the flow in Case 1, but multiply each stratum size by its sd.
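A minimal Python sketch of the Case 1 allocation follows. The function name `proportional_allocation` is hypothetical, and rounding each share to the nearest whole number is an assumption; with other data the rounded shares may not add up exactly to n.

```python
def proportional_allocation(strata: dict[str, int], sample_size: int) -> dict[str, int]:
    """Share of the sample for each stratum: n_i = (N_i / N) * n."""
    total = sum(strata.values())  # N, the whole population
    return {
        name: round(size / total * sample_size)  # nearest whole unit (assumed)
        for name, size in strata.items()
    }

# Case 1: N = 770 and n = 264 from Slovin's formula with e = 5%
allocation = proportional_allocation({"CHS": 200, "CASE": 120, "CBMA": 450}, 264)
print(allocation)  # {'CHS': 69, 'CASE': 41, 'CBMA': 154}
```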
NON-PROBABILITY SAMPLING TECHNIQUES
• Non-probability sampling techniques have the
following characteristics: some units of the population have
zero chance of selection or the probability of selection
cannot be accurately determined; units are selected based
on non-random criteria, e.g. quota or convenience; they do not
allow the estimation of sampling error and may be subject
to sampling bias; they are suitable if generalizability of results is
not that important for the study. (e.g. convenience sampling, quota
sampling, expert sampling, snowball sampling, etc.)
NON-PROBABILITY SAMPLING TECHNIQUES
• OTHERS: Purposive sampling, also known as judgmental,
selective, or subjective sampling, is a form of non-probability
sampling in which researchers rely on their own judgment
when choosing members of the population to participate in
their surveys.
• Researchers use purposive sampling when they want to
access a particular subset of people, as all participants of a
survey are selected because they fit a particular profile.
PRESENTATION
OF DATA
FREQUENCY DISTRIBUTION TABLE
• Frequency distribution in statistics provides the
information of the number of occurrences (frequency)
of distinct values distributed within a given period of
time or interval, in a list, table, or graphical
representation. Grouped and Ungrouped are two types
of Frequency Distribution. Data is a collection of
numbers or values and it must be organized for it to be
useful. Let us take a look at data and its frequency
distribution.
CONSTRUCTING A FREQUENCY
DISTRIBUTION TABLE
120 133 180 138
140 150 170 153
161 149 124 168
148 139 161 142
130 143 137 147
156 151 128 118
165 138 147 167
146 150 149 129
142 158 152 130
175 148 142 159
Step 1: Determine the Range
A. Get the lowest (L) and highest (H) values in the raw data: L = 118 and H = 180
R = H – L
R = 180 – 118
R = 62
Step 2: Solve for the number of class
A. Count the number of raw data values (n); n = 40

Formula: C = 1 + 3.322 log(n), where log is the base-10 logarithm
C = 1 + 3.322 log(40)
C = 6.32204…
C = 6
Note: If the classes do not yet reach the highest value in the raw data,
add another class.
Step 3: Solve for the Class Size ( K )
Formula: K = Range (R) / Number of classes(C)
K = 62/[1+3.322log(40)]
K = 9.806955…
K = 10
Or K = 62/6 = 10.333…
K =10
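Steps 2 and 3 can be checked with a short Python sketch. It assumes the logarithm is base 10 (which matches C = 6.32204) and that results are rounded to the nearest whole number; the function name `sturges_classes` is made up for this example.

```python
import math

def sturges_classes(n: int) -> float:
    """Number of classes by the formula C = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

n, data_range = 40, 62            # n values, R = 180 - 118
c = sturges_classes(n)            # about 6.32204
class_size = round(data_range / c)

print(round(c, 5), round(c), class_size)  # 6.32204 6 10
```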
Step 4: In creating the class intervals always
start with the lowest value
CLASS INTERVALS
118 – 127
128 – 137
138 – 147
148 – 157
158 – 167
168 – 177
178 – 187
Step 5: Tally the frequency of each class interval
CLASS INTERVALS FREQUENCY
118 – 127 3
128 – 137 6
138 – 147 11
148 – 157 10
158 – 167 6
168 – 177 3
178 - 187 1
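The tally can be reproduced with a short Python sketch. The function name `tally` is hypothetical; the class limits follow Step 4 (start at the lowest value, class size 10, seven classes).

```python
scores = [
    120, 133, 180, 138, 140, 150, 170, 153, 161, 149,
    124, 168, 148, 139, 161, 142, 130, 143, 137, 147,
    156, 151, 128, 118, 165, 138, 147, 167, 146, 150,
    149, 129, 142, 158, 152, 130, 175, 148, 142, 159,
]

def tally(values, lowest, class_size, num_classes):
    """Return (lower limit, upper limit, frequency) for each class interval."""
    table = []
    for j in range(num_classes):
        lower = lowest + j * class_size
        upper = lower + class_size - 1
        count = sum(lower <= v <= upper for v in values)
        table.append((lower, upper, count))
    return table

table = tally(scores, lowest=min(scores), class_size=10, num_classes=7)
for lower, upper, count in table:
    print(f"{lower} - {upper}: {count}")
```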
Note: Try to make color codes
Step 6: Solve for the class mark
• In solving for the classmark (midpoint of the class interval): (lower
limit + higher limit)/2
CLASS INTERVALS CLASSMARK
118 – 127 (118+127)/2 = 122.5
128 – 137 132.5
138 – 147 142.5
148 – 157 152.5
158 – 167 162.5
168 – 177 172.5
178 - 187 182.5
Step 7: Solve for the relative frequency
• In solving for the relative frequency: frequency per class/ (n)

CLASS INTERVALS RELATIVE FREQUENCY
118 – 127 7.5%
128 – 137 15%
138 – 147 27.5%
148 – 157 25%
158 – 167 15%
168 – 177 7.5%
178 - 187 2.5%
Step 8: Solve for Lower and Upper
Boundaries
• Lower boundary: Subtract 0.5 from the lower limit of each class
• Upper boundary: Add 0.5 to the upper limit of each class
CLASS LOWER UPPER
INTERVALS BOUNDARY BOUNDARY
118 – 127 118 – 0.5 = 117.5 127 + 0.5 = 127.5
128 – 137 127.5 137.5
138 – 147 137.5 147.5
148 – 157 147.5 157.5
158 – 167 157.5 167.5
168 – 177 167.5 177.5
178 - 187 177.5 187.5
Step 9: Solve for the cumulative frequency
Frequency <cf >cf
3 3 40
6 9 37
11 20 31
10 30 20
6 36 10
3 39 4
1 40 1
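The two cumulative columns can be generated from the frequency column. The sketch below uses `itertools.accumulate` from the Python standard library; the variable names are chosen only for this example.

```python
from itertools import accumulate

frequencies = [3, 6, 11, 10, 6, 3, 1]

# <cf: running total from the first class downward
less_than_cf = list(accumulate(frequencies))

# >cf: running total from the last class upward
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)     # [3, 9, 20, 30, 36, 39, 40]
print(greater_than_cf)  # [40, 37, 31, 20, 10, 4, 1]
```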
GRAPHICAL
PRESENTATIONS
(These will be presented in the MS Excel)
DESCRIPTIVE
STATISTICS
Measures of Central Tendency and Measures of Dispersion
for GROUPED DATA
DEFINITION OF TERMS
• The mean (or average) is the most popular and well-known measure
of central tendency. It can be used with both discrete and continuous
data, although its use is most often with continuous data.
• The median is the middle score for a set of data that has been
arranged in order of magnitude. The median is less affected by outliers
and skewed data.
• The mode is the most frequent score in our data set. On a bar chart or
histogram, it is represented by the highest bar. You can,
therefore, sometimes consider the mode as being the most popular
option.
DEFINITION OF TERMS
• The range is the difference between the largest and smallest
values in a set of values. R= H-L
• The standard deviation is a quantity calculated to indicate the
extent of deviation for a group as a whole. The Standard
Deviation is a measure of how spread out numbers are. SD is
just the square root of the variance. The standard deviation
measures how concentrated the data are around the mean; the
more concentrated, the smaller the standard deviation.
• Coefficient of Variation is a measure of relative variability. It
is the ratio of the standard deviation to the mean (average).
RECALL - Sigma Notation
$$\sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n$$
Where:
•a_i = function (the term being summed)
•i = index of summation, running from the lower bound 1 to the upper bound n
Determine the sum
$$\sum_{k=1}^{4} (k + 2) = (1 + 2) + (2 + 2) + (3 + 2) + (4 + 2) = 18$$
$$\sum_{k=3}^{5} 3k = 3(3) + 3(4) + 3(5) = 36$$
$$\sum_{k=0}^{4} (-1)^k (2k + 1) = (-1)^0[2(0) + 1] + (-1)^1[2(1) + 1] + (-1)^2[2(2) + 1] + (-1)^3[2(3) + 1] + (-1)^4[2(4) + 1] = 5$$
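The three sums can be checked in Python, where `range(a, b + 1)` plays the role of the lower and upper bounds. This assumes the third sum is the alternating sum of (2k + 1) for k = 0 to 4, which matches the printed result of 5.

```python
# Sum of (k + 2) for k = 1 to 4
sum_a = sum(k + 2 for k in range(1, 5))

# Sum of 3k for k = 3 to 5
sum_b = sum(3 * k for k in range(3, 6))

# Sum of (-1)^k (2k + 1) for k = 0 to 4 (assumed reading of the third sum)
sum_c = sum((-1) ** k * (2 * k + 1) for k in range(0, 5))

print(sum_a, sum_b, sum_c)  # 18 36 5
```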
FORMULA: MEASURES OF CENTRAL
TENDENCY
Mean
$$\text{Mean} = \frac{\sum fx}{n}$$
Where:
f – frequency
x – class mark / midpoint
n – total number of frequency

Median
$$Md = L_{md} + \left( \frac{\frac{n}{2} - f_{m-1}}{f_m} \right)(i)$$
Where:
L_md – is the lower class boundary of the median group
n – is the total number of frequency
f_(m-1) – is the cumulative frequency of the groups before the median group
f_m – is the frequency of the median group
i – is the class width/size

Mode
$$Mo = l_{mo} + \left( \frac{f_o - f_1}{2f_o - f_1 - f_2} \right)(i)$$
Where:
l_mo – is the lower class boundary of the modal class
f_1 – is the frequency of the group before the modal class
f_o – is the frequency of the modal class
f_2 – is the frequency of the group after the modal class
i – is the class width/size
Given: List of pre-board examination scores of 40 BSN
graduating students of University of Santo Tomas – Legazpi.
Solve for the descriptive statistical measures and interpret the
results in two to three sentences.
120 133 180 138
140 150 170 153
161 149 124 168
148 139 161 142
130 143 137 147
156 151 128 118
165 138 147 167
146 150 149 129
142 158 152 130
175 148 142 159
MEASURES OF CENTRAL TENDENCY
Step 1: Solve for the Mean
CLASS INTERVALS FREQUENCIES (f) CLASS MARK (x) fx
118 – 127 3 122.5 367.5
128 – 137 6 132.5 795
138 – 147 11 142.5 1567.5
148 – 157 10 152.5 1525
158 – 167 6 162.5 975
168 – 177 3 172.5 517.5
178 - 187 1 182.5 182.5
n = 40 Σfx = 5930
$$\text{Mean} = \frac{\sum fx}{n} = \frac{5{,}930}{40} = 148.25 \text{ or } 148$$
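The grouped mean can be verified with a few lines of Python. The lists simply restate the class mark and frequency columns of the table; the variable names are chosen for this sketch.

```python
class_marks = [122.5, 132.5, 142.5, 152.5, 162.5, 172.5, 182.5]
frequencies = [3, 6, 11, 10, 6, 3, 1]

n = sum(frequencies)                                         # total frequency
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))
mean = sum_fx / n

print(n, sum_fx, mean)  # 40 5930.0 148.25
```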
MEASURES OF CENTRAL TENDENCY
Step 2: Solve for the Median
CLASS INTERVALS FREQUENCIES (f) LOWER BOUNDARY <cf
118 – 127 3 118 – 0.5 = 117.5 3
128 – 137 6 127.5 9
138 – 147 11 137.5 20
148 – 157 10 147.5 30
158 – 167 6 157.5 36
168 – 177 3 167.5 39
178 – 187 1 177.5 40
n = 40
1. Determine the Median class: n/2 = 40/2 = 20; Median Class: 138 – 147
2. Determine the lower boundary of the median class: L = 137.5
3. Determine cumulative frequency BEFORE the median class: <cf = 9
4. Determine the frequency of the median class: f = 11
5. Determine the class size: i = 10
$$Md = L_{md} + \left( \frac{\frac{n}{2} - f_{m-1}}{f_m} \right)(i) = 137.5 + \left( \frac{20 - 9}{11} \right)(10) = 137.5 + 10 = 147.5$$
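A rough Python sketch of the same five steps is shown below. The function name `grouped_median` is hypothetical, and it takes the median class to be the first class whose cumulative frequency reaches n/2, as in the steps above.

```python
def grouped_median(lower_boundaries, frequencies, class_size):
    """Median for grouped data: L + ((n/2 - cf_before) / f) * i."""
    n = sum(frequencies)
    half = n / 2
    cumulative = 0  # cumulative frequency BEFORE the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:  # first class whose <cf reaches n/2
            return lower + (half - cumulative) / f * class_size
        cumulative += f
    raise ValueError("frequency table is empty")

lower_boundaries = [117.5, 127.5, 137.5, 147.5, 157.5, 167.5, 177.5]
frequencies = [3, 6, 11, 10, 6, 3, 1]
median = grouped_median(lower_boundaries, frequencies, class_size=10)
print(median)  # 147.5
```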
MEASURES OF CENTRAL TENDENCY
Step 3: Solve for the Mode
CLASS INTERVALS FREQUENCIES (f)
118 – 127 3
128 – 137 6
138 – 147 11
148 – 157 10
158 – 167 6
168 – 177 3
178 – 187 1
n = 40

1. Determine the Modal class: Get the highest frequency; Modal Class: 138 – 147
2. Determine the lower boundary of the modal class: L = 137.5
3. Determine the frequency before the modal class: f_1 = 6
4. Determine the frequency of the modal class: f_o = 11
5. Determine the frequency after the modal class: f_2 = 10
6. Determine the class size: i = 10
$$Mo = l_{mo} + \left( \frac{f_o - f_1}{2f_o - f_1 - f_2} \right)(i) = 137.5 + \left( \frac{11 - 6}{2(11) - 6 - 10} \right)(10) = 137.5 + \frac{5}{6}(10) = 137.5 + 8.33\ldots = 145.833\ldots$$
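The six steps can be sketched in Python as follows. The function name `grouped_mode` is hypothetical. Two choices here are assumptions, not part of the slides: a missing neighbouring class is treated as frequency 0, and ties for the highest frequency are resolved by taking the first such class.

```python
def grouped_mode(lower_boundaries, frequencies, class_size):
    """Mode for grouped data: L + ((f0 - f1) / (2*f0 - f1 - f2)) * i."""
    m = frequencies.index(max(frequencies))   # modal class (first one on ties)
    f0 = frequencies[m]
    f1 = frequencies[m - 1] if m > 0 else 0   # class before (assumed 0 if none)
    f2 = frequencies[m + 1] if m < len(frequencies) - 1 else 0  # class after
    return lower_boundaries[m] + (f0 - f1) / (2 * f0 - f1 - f2) * class_size

lower_boundaries = [117.5, 127.5, 137.5, 147.5, 157.5, 167.5, 177.5]
frequencies = [3, 6, 11, 10, 6, 3, 1]
mode = grouped_mode(lower_boundaries, frequencies, class_size=10)
print(round(mode, 3))  # 145.833
```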
NORMAL
DISTRIBUTION
How would you describe a Normal Distribution?
•A normal distribution is a bell-shaped frequency
distribution curve. Most of the data values in a
normal distribution tend to cluster around the
mean. The further a data point is from the mean,
the less likely it is to occur. There are many
things, such as intelligence, height, and blood
pressure, that approximately follow a normal
distribution.
What are the Characteristics of a Normal
Distribution?
•Normal distributions are
symmetric, unimodal, and
asymptotic, and the mean,
median, and mode are all
equal. A normal distribution
is perfectly symmetrical
around its center.
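EMPIRICAL RULE
[Figure: normal curve over z = -5 to 5, showing that about 68.26% of the values lie within 1 standard deviation of the mean, 95.44% within 2 standard deviations, and 99.74% within 3 standard deviations]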
CONVERSION OF RAW SCORE TO Z-SCORE
• The standard score or z-score measures how many standard deviations a
given value (x) is above or below the mean. The z-scores are useful in
comparing observed values. A positive z-score indicates that the score or
observed value is above the mean, whereas a negative z-score indicates
that the score or observed value is below the mean.

• Formula for sample:
$$z = \frac{x - \bar{x}}{s}$$
• Formula for population:
$$z = \frac{x - \mu}{\sigma}$$
EXAMPLE
• On a sample final examination in integral calculus, the mean
was 75 and the standard deviation was 12. Determine the
standard score of a student who received a score of 60
assuming that the scores are normally distributed.
SOLUTION
• Solve:
$$z = \frac{x - \bar{x}}{s} = \frac{60 - 75}{12} = -1.25$$
• This indicates that 60 is 1.25 standard deviations below the
mean.
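The conversion is a one-line computation. The sketch below uses a hypothetical helper named `z_score`.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

print(z_score(60, 75, 12))  # -1.25
```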
AREA OF THE NORMAL CURVE (Z-SCORES)
• The total area under the normal curve is equal to 1.
• The probability that a normal random variable X equals any
particular value is 0.
• The probability that X is greater than a equals the area under the
normal curve bounded by a and plus infinity (as indicated by the
non-shaded area in the figure).
• The probability that X is less than a equals the area under the
normal curve bounded by a and minus infinity (as indicated by the
shaded area in the figure).
CASES IN SOLVING THE AREA
OF A NORMAL CURVE
(The z-table used here gives the area between z = 0 and the given z-score.)
CASE 1
z = 0 and z = ± a
Note: Just determine the area on the z table.

CASE 2
z = a and z = b or z = -a and z = -b (both on the same side)
Note: Subtract the areas of two z-scores.

CASE 3
z = a and z = -b or z = -a and z = b (on different sides)
Note: Add their areas.

CASE 4
z = a (to the right) or z = -a (to the left)
Note: Subtract the area from 0.5.

CASE 5
z = -a (to the right) or z = a (to the left)
Note: Add 0.5 to the area.
EXAMPLE 1: Find the area between z = -1.5 and z = - 2.5
Step 1: Use CASE 2: z scores are both on the same side.
Step 2: Sketch the normal curve and plot the z-scores.

-2.5 -1.5 0
Step 3: Look for the area of the z-scores in the z-table: Note: Negative sign in z-scores is
just a notation that they are plotted on the left side of the curve.
Area of -1.5 = 0.4332 and Area of -2.5 = 0.4938

Step 4: Solve: USE CASE 2

A = 𝐴2 − 𝐴1
A = 0.4938 – 0.4332
A = 0.0606 or 6.06%
Step 5: Interpret: The area between z = -1.5 and z =-2.5 is 0.0606 or 6.06%
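Instead of a printed z-table, the areas can be approximated with `math.erf` from the Python standard library. This is a sketch under the assumption that the z-table lists the area between z = 0 and the z-score; the helper name `area_from_mean` is made up for this example.

```python
import math

def area_from_mean(z: float) -> float:
    """Area under the standard normal curve between 0 and |z| (z-table value)."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

a1 = area_from_mean(-1.5)
a2 = area_from_mean(-2.5)
area = a2 - a1  # CASE 2: both z-scores on the same side, so subtract

print(round(a1, 4), round(a2, 4), round(area, 4))  # 0.4332 0.4938 0.0606
```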
EXAMPLE 2: The mean height of 2nd Year 2B students at UST-Legazpi is 164
cm and the standard deviation is 10 centimeters. Assuming the heights are
normally distributed, what percent of the heights is greater than 168
centimeters?
Step 1: Convert 168 to z-score
$$z = \frac{x - \bar{x}}{s} = \frac{168 - 164}{10} = 0.4$$
Step 2: Sketch the normal curve:
[Figure: normal curve with z = 0 at 164 cm and z = 0.4 at 168 cm]
Step 3: Find the area of z = 0.4 in the z table.
Area of 0.4 = 0.1554

Step 4: Solve using CASE 4: Subtract the area from 0.5

A = 0.5 - (area of z = 0.4)
A = 0.5 - 0.1554
A = 0.3446 or 34.46%
Step 5: 34.46% of the heights are greater than 168 centimeters.
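The same answer can be approximated without a z-table by using `math.erfc` from the Python standard library. The helper name `area_right_of` is hypothetical.

```python
import math

def area_right_of(z: float) -> float:
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = (168 - 164) / 10       # z-score of 168 cm
share = area_right_of(z)   # CASE 4: tail area beyond z

print(z, round(share, 4))  # 0.4 0.3446
```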
CORRELATION
How would you define Correlation?
• Correlation refers to the statistical relationship
between two entities. In other words, it's how two
variables move in relation to one another.
Correlation can be used for various data sets, as well.
In some cases, you might have predicted how things
will correlate, while in others, the relationship will be a
surprise to you. It's important to understand that
correlation does not mean the relationship is causal.
PEARSON PRODUCT MOMENT
CORRELATION COEFFICIENT
•On a scatter plot, the strength of the relationship can only
be estimated and described visually, based on the dots plotted
on the xy coordinate plane. The Pearson Product
Moment Correlation Coefficient, denoted by r,
measures the strength of the linear relationship.
To find r, the following formula is used.
PEARSON PRODUCT MOMENT
CORRELATION COEFFICIENT
$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$
Where:
n = number of paired values
𝛴𝑥 = sum of x values
𝛴𝑦 = sum of y values
𝛴𝑥𝑦 = sum of the products of paired values x and y
𝛴𝑥 2 = sum of squared x-values
𝛴𝑦 2 = sum of squared y-values
PEARSON PRODUCT MOMENT
CORRELATION COEFFICIENT

The following table for interpretation of r can
be used in interpreting the degree of linear
relationship existing between the two
variables.
PEARSON PRODUCT MOMENT
CORRELATION COEFFICIENT
Value of r Strength of Correlation
+1 Perfect positive correlation
+0.71 to +0.99 Strong positive correlation
+0.51 to +0.70 Moderately positive correlation
+0.31 to +0.50 Weak positive correlation
+0.01 to +0.30 Negligible positive correlation
0 No correlation
-0.01 to -0.30 Negligible negative correlation
-0.31 to -0.50 Weak negative correlation
-0.51 to -0.70 Moderately negative correlation
-0.71 to -0.99 Strong negative correlation
-1 Perfect negative correlation
SPEARMAN’S RANK
CORRELATION COEFFICIENT
•The Spearman rank-order correlation coefficient
(Spearman-rho) is a non-parametric measure of
the strength and direction of association that
exists between two variables measured on at
least an ordinal scale. It is denoted by the symbol
rs (or the Greek letter ρ, pronounced rho). The
formula is given below:
SPEARMAN’S RANK
CORRELATION COEFFICIENT

$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
Where:

𝑑 = difference in the ranks of each pair


𝑛 = number of pairs being correlated
SPEARMAN’S RANK
CORRELATION COEFFICIENT
Value of r Strength of Correlation
+1 Perfect positive correlation
+0.71 to +0.99 Strong positive correlation
+0.51 to +0.70 Moderately positive correlation
+0.31 to +0.50 Weak positive correlation
+0.01 to +0.30 Negligible positive correlation
0 No correlation
-0.01 to -0.30 Negligible negative correlation
-0.31 to -0.50 Weak negative correlation
-0.51 to -0.70 Moderately negative correlation
-0.71 to -0.99 Strong negative correlation
-1 Perfect negative correlation
Example 1: The table shows the time in hours spent in studying (x) by six sophomore architecture
students and their scores on a test (y). Solve for the Pearson Product Moment Correlation (r).

x 1 2 3 4 5 6
y 5 10 15 15 25 35
SOLUTION:
Step 1: Construct a table of values.

x y xy 𝑥2 𝑦2
1 5 5 1 25
2 10 20 4 100
3 15 45 9 225
4 15 60 16 225
5 25 125 25 625
6 35 210 36 1,225
Σx = 21 Σy=105 Σxy=465 Σ𝑥 2 = 91 Σ𝑦 2 =2,425
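Step 2: Use the formula, where n = 6

$$r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$$
$$r = \frac{6(465) - (21)(105)}{\sqrt{[6(91) - (21)^2][6(2{,}425) - (105)^2]}}$$
$$r = \frac{585}{\sqrt{370{,}125}} = 0.96157 \text{ or } 0.962$$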
Interpretation: It indicates that there is a strong positive correlation between the time in hours spent in
studying and the scores on a test.
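The table of sums and the formula can be combined into one short Python function. This is only a sketch that follows the computational formula above; the name `pearson_r` is chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

hours = [1, 2, 3, 4, 5, 6]
test_scores = [5, 10, 15, 15, 25, 35]
r = pearson_r(hours, test_scores)
print(round(r, 3))  # 0.962
```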
Example 2: In a regional finals for the mathematical device, two judges were asked to rank eight
contestants (A, B, C,…, H) based on their over-all performance. Calculate Spearman’s rank correlation
coefficient and determine how strong the correlation is between the scores of the two judges. The table
shows the resulting ranks.

Contestant A B C D E F G H
First Judge (x) 5 2 4 3 6 1 8 7
Second Judge (y) 3 4 5 2 6 1 7 8
Solution: n =8

Contestants x y d d²
A 5 3 2 4
B 2 4 -2 4
C 4 5 -1 1
D 3 2 1 1
E 6 6 0 0
F 1 1 0 0
G 8 7 1 1
H 7 8 -1 1
Σd² = 12
$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
$$\rho = 1 - \frac{6(12)}{8(8^2 - 1)}$$
$$\rho = 1 - \frac{72}{504} = 0.8571$$
Interpretation: It indicates that there is a strong positive correlation between the scores of the two
judges.
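A minimal Python sketch of this computation follows. The function name `spearman_rho` is hypothetical, and the simplified formula used here assumes there are no tied ranks, which holds for the two judges in this example.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), assuming no ties."""
    n = len(rank_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

first_judge = [5, 2, 4, 3, 6, 1, 8, 7]    # contestants A to H
second_judge = [3, 4, 5, 2, 6, 1, 7, 8]
rho = spearman_rho(first_judge, second_judge)
print(round(rho, 4))  # 0.8571
```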
FIN…
PRAYER FOR OUR COUNTRY
Almighty God, bless our nation and make it true
to the ideals of freedom and justice and
brotherhood for all who make it great. Guard us
from war, from fire and wind, from compromise
and disease from fear and confusion. Be close to
our president and statesmen; give them vision
and courage, as they ponder decisions affecting
peace and the future of the world. Make us more
deeply aware of our heritage; realizing not only
our rights but also our duties and responsibilities
as citizens. Make this great land and all its people
know clearly Your will, that we may fulfill the
destiny ordained for us in the salvation of the
nations, and the restoring of all things in Christ.
Amen.
