Statistics MS. ORDONIO
CHAPTER 5: STATISTICS
Objectives:
a. Identify and differentiate Patterns in Nature.
b. Understand the Fibonacci Sequence.
c. Appreciate the beauty of Mathematics in terms of Patterns and Number in Nature and
in the World.
The mean is the part of the distribution around which the values balance.
Symbol for mode: $\hat{X}$, read as “x hat”
Definition:
MODULE MATHEMATICS IN THE MODERN WORLD
The mode is the number that occurs most often in a set of data. A set of
data can have more than one mode. If all the numbers appear the
same number of times, there is no mode for that data set.
A. The Mean
The most widely used average, the arithmetic mean, is defined as the sum
of the observations divided by the number of observations.
$\bar{X} = \frac{\sum X_i}{N}$
Example:
A motorist records the time it takes him to travel to work by car during
peak-hour traffic for a 10-day period. The times (to the nearest minute) are as
follows:
36,33,28,28,32,29,33,34,32,33
Find the mean time it took him to get to work during the two weeks (ten working
days).
Solution:
$\bar{X} = \frac{\sum X_i}{N}$
$= \frac{36+33+28+28+32+29+33+34+32+33}{10} = \frac{318}{10}$
= 31.8 minutes
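The computation above can be checked with a few lines of Python. This is only a sketch; the function name `arithmetic_mean` is our own choice, not part of the module.

```python
# Travel times (in minutes) from the motorist example.
times = [36, 33, 28, 28, 32, 29, 33, 34, 32, 33]

def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

mean_time = arithmetic_mean(times)
print(mean_time)
```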
Weighted Mean
$\bar{X} = \frac{\sum fX}{N}$
Example:
Assignments 10%
Project 20%
100 %
Solution:
$\bar{X} = \frac{\sum fX}{N}$
$= \frac{85(10)+74(20)+76(30)+87(40)}{100}$
= 80.9%
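A minimal sketch of the same weighted mean in Python follows. The grades and weights are taken from the solution above; the function name `weighted_mean` is an assumption made for illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

grades = [85, 74, 76, 87]     # grade earned in each component
weights = [10, 20, 30, 40]    # percentage weight of each component

final_grade = weighted_mean(grades, weights)
print(round(final_grade, 1))
```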
$\bar{X} = \frac{\sum fX_m}{N}$
Example:
95-99 1
90-94 9
85-89 8
80-84 14
75-79 11
70-74 5
65-69 2
Solution:
Add two columns for the class mark ( Xm ) and fXm in the given table. Find
the summations of f and fXm.
Class Interval   Frequency (f)   Xm    fXm
95-99            1               97    97
90-94            9               92    828
85-89            8               87    696
80-84            14              82    1148
75-79            11              77    847
70-74            5               72    360
65-69            2               67    134
                 $\sum f = 50$         $\sum fX_m = 4110$
$\bar{X} = \frac{\sum fX_m}{N} = \frac{4110}{50} = 82.2$
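The class-mark method can be sketched in Python as shown here. The representation of each class as a `(lower limit, upper limit, frequency)` tuple and the name `grouped_mean` are assumptions made for this illustration.

```python
# Each class is (lower limit, upper limit, frequency), from the table above.
classes = [
    (95, 99, 1), (90, 94, 9), (85, 89, 8), (80, 84, 14),
    (75, 79, 11), (70, 74, 5), (65, 69, 2),
]

def grouped_mean(classes):
    """Mean of grouped data: sum of f * class mark, divided by N."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lo + hi) / 2 for lo, hi, f in classes)  # (lo + hi) / 2 is the class mark
    return total / n

mean_score = grouped_mean(classes)
print(mean_score)
```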
B. The Median
The median is the value of the middle observation when the data are arranged
in the form of an array. Thus, the median is the value that divides the array
so that there is an equal number of observations on either side of it. The median is
often used when describing educational and sociological data, such as ages,
income, family size, etc.
If there is an odd number (n) of observations, then the median is the value
of the $\frac{n+1}{2}$th observation. If n is an even number, the median is usually
taken as the mean of the two middle observations.
Example 1:
Solution:
Median waiting time = $\frac{11+1}{2}$th observation
= 6th observation
= 17 minutes
Example 2:
A sample of 50 students was given an inventory test (based on a
possible score of 0-5). The results are as follows:
Score 0 1 2 3 4 5
Frequency 5 9 12 16 6 2
Solution:
Construct a cumulative frequency distribution.
to determine the class interval which contains the $\frac{n}{2}$th score. This
distribution.
The class interval that contains the $\frac{n}{2}$th score is called the median class
of the distribution. To calculate the median, we use the formula:
$\tilde{X} = X_{LB} + \left(\frac{\frac{N}{2} - cf_b}{f_m}\right) i$
Where:
$\tilde{X}$ = median
$X_{LB}$ = the lower boundary or true lower limit of the median class
$N$ = total frequency
$cf_b$ = cumulative frequency before the median class
$f_m$ = frequency of the median class
$i$ = size of the class interval
Example:
Calculate the median score of 50 junior students in an achievement test in
Math given in the table below.
Solution:
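Achievement Test Results in Math of 50 Junior Students
Class Interval   Frequency (f)
95-99            1
90-94            9
85-89            8
80-84            14
75-79            11
70-74            5
65-69            2
$\frac{N}{2} = \frac{50}{2}$ = 25th score
Counting up from the lowest class, the cumulative frequency before the class
80-84 is 2 + 5 + 11 = 18, so the 25th score lies in that class. Hence
$X_{LB} = 79.5$, $cf_b = 18$, $f_m = 14$ and $i = 5$.
$\tilde{X} = X_{LB} + \left(\frac{\frac{N}{2} - cf_b}{f_m}\right) i$
$\tilde{X} = 79.5 + \left(\frac{25 - 18}{14}\right) 5$
$\tilde{X} = 79.5 + 2.5$
$\tilde{X} = 82.0$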
This means that 50 percent of the students got scores below 82.
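The grouped-median computation can be sketched in Python as follows. The sketch assumes integer class limits (so each boundary lies 0.5 beyond the limit) and equal class widths; the function name `grouped_median` and the tuple layout are our own choices.

```python
# Each class is (lower limit, upper limit, frequency).
classes = [
    (95, 99, 1), (90, 94, 9), (85, 89, 8), (80, 84, 14),
    (75, 79, 11), (70, 74, 5), (65, 69, 2),
]

def grouped_median(classes):
    """Median of grouped data: X_LB + ((N/2 - cf_b) / f_m) * i."""
    ordered = sorted(classes)              # lowest class first
    n = sum(f for _, _, f in ordered)
    cf_before = 0                          # cumulative frequency before the current class
    for lo, hi, f in ordered:
        if cf_before + f >= n / 2:         # this is the median class
            lower_boundary = lo - 0.5
            width = hi - lo + 1
            return lower_boundary + (n / 2 - cf_before) / f * width
        cf_before += f

median_score = grouped_median(classes)
print(median_score)
```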
C. The Mode
The mode is defined as the observation which occurs most often in a
set of data. This is the observation which has the largest frequency. It is
frequently used to determine which products are in greatest demand.
Example:
Find the mode of the following scores:
14,17,17,17,18,18,19,20,21,21,23
Solution:
By inspection, the mode is 17 since it occurs 3 times in the distribution.
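Finding the mode by inspection can also be sketched in Python. The helper `modes` below is a hypothetical name; it returns every most-frequent value, and an empty list when all values occur equally often, following the definition of the mode given earlier in this chapter.

```python
from collections import Counter

def modes(values):
    """Return all values with the highest frequency, or [] if there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    # If every distinct value appears the same number of times, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

scores = [14, 17, 17, 17, 18, 18, 19, 20, 21, 21, 23]
print(modes(scores))
```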
Example:
An ice cream parlor sells 6 flavors of ice cream. The numbers of each flavor
sold on a particular day are shown below.
Cheese        16
Chocolate     12
Vanilla       22
Macapuno      26
Strawberry    23
Fruit Salad   18
Determine the most popular flavor of ice cream for that day.
Solution:
The most popular flavor is that which is most frequently sold. The highest
frequency is 26. Hence, the modal (most popular) flavor for that day is macapuno.
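In the computation of the mode given a frequency distribution, the first step is to
get the modal class. The modal class is the class interval with the highest
frequency. To compute the mode, we use the formula:
$\hat{X} = X_{LB} + \left(\frac{d_1}{d_1 + d_2}\right) i$
Where:
$\hat{X}$ = mode
$X_{LB}$ = lower boundary of the modal class
$d_1$ = difference between the frequency of the modal class and the frequency of the next lower class
$d_2$ = difference between the frequency of the modal class and the frequency of the next higher class
$i$ = size of the class interval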
Example:
Find the mode for the following grouped frequency distribution.
The modal class is the class interval 80-84 since it has the highest
frequency. Therefore,
$X_{LB} = 79.5$, $d_1 = 14 - 11 = 3$, $d_2 = 14 - 8 = 6$, $i = 5$
$\hat{X} = 79.5 + \left(\frac{3}{3+6}\right) 5$
$\hat{X} = 79.5 + 1.67$
$\hat{X} = 81.17$
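A possible Python sketch of the grouped-mode computation follows. It assumes the distribution is the same 50-score table used in the mean and median examples (this matches the modal class 80-84 and the differences used above), integer class limits, and a hypothetical function name `grouped_mode`.

```python
# Assumed table: (lower limit, upper limit, frequency).
classes = [
    (95, 99, 1), (90, 94, 9), (85, 89, 8), (80, 84, 14),
    (75, 79, 11), (70, 74, 5), (65, 69, 2),
]

def grouped_mode(classes):
    """Mode of grouped data: X_LB + (d1 / (d1 + d2)) * i."""
    ordered = sorted(classes)                      # lowest class first
    freqs = [f for _, _, f in ordered]
    k = freqs.index(max(freqs))                    # position of the modal class
    lo, hi, f = ordered[k]
    below = freqs[k - 1] if k > 0 else 0           # frequency of the next lower class
    above = freqs[k + 1] if k < len(freqs) - 1 else 0  # frequency of the next higher class
    d1, d2 = f - below, f - above
    return (lo - 0.5) + d1 / (d1 + d2) * (hi - lo + 1)

mode_score = grouped_mode(classes)
print(round(mode_score, 2))
```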
For more about measures of central tendency, please check the
links provided:
http://onlinestatbook.com/2/summarizing_distributions/measures.html
https://study.com/academy/lesson/central-tendency-measures-definition examples.html
REMEMBER
a set of data
ACTIVITY:
Choose 10 of your classmates and ask each of them how much money is left in their pocket.
From the data you collect, compute the mean, median and mode.
The measure of central tendency is not in itself sufficient to adequately describe a set
of data. In addition, a measure of dispersion (or spread) of data is also required. This
measure describes the extent to which individual observations vary above and below
the average. The need for a measure of dispersion is just as important as the average.
A measure of dispersion gives an indication of the reliability of the average value.
The most commonly used measures of dispersion are: the range, the quartile deviation, the
mean deviation, the variance and the standard deviation.
A. The Range
The easiest and simplest measure of dispersion is the range. The range is
defined as the difference between the highest score (H.S.) and the lowest
score (L.S.). It shows the extreme scores of a set of data.
For grouped data, the range is calculated by subtracting the lower boundary
(L.B.) of the lowest class interval from the upper boundary (U.B.) of the
highest class interval. That is,
$R = H.S. - L.S.$ for ungrouped data, and $R = U.B. - L.B.$ for grouped data.
Example 1:
a. The range of the set of scores 12, 14, 14, 16, 16 is 16 − 12, or 4.
b. The range of the set of scores 10, 14, 14, 18, 25 is 25 − 10, or 15.
Example 2:
Class Interval   Frequency (f)
38-39            1
36-37            3
34-35            3
32-33            3
30-31            6
28-29            6
26-27            8
24-25            6
22-23            10
20-21            14
Solution:
Range = U.B. − L.B.
Range = 39.5 − 19.5
Range = 20
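Both versions of the range can be sketched in Python. The function names are our own, and the grouped version assumes integer class limits, so that each boundary lies 0.5 beyond the limit.

```python
def data_range(values):
    """Range of ungrouped data: highest score minus lowest score."""
    return max(values) - min(values)

def grouped_range(class_limits):
    """Range of grouped data: upper boundary of the highest class
    minus lower boundary of the lowest class."""
    lowest = min(lo for lo, _ in class_limits)
    highest = max(hi for _, hi in class_limits)
    return (highest + 0.5) - (lowest - 0.5)

# Example 1
print(data_range([12, 14, 14, 16, 16]))
print(data_range([10, 14, 14, 18, 25]))

# Example 2: only the lowest and highest classes affect the result,
# but all limits are listed for completeness.
limits = [(38, 39), (36, 37), (34, 35), (32, 33), (30, 31),
          (28, 29), (26, 27), (24, 25), (22, 23), (20, 21)]
print(grouped_range(limits))
```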
• 2nd quartile ($Q_2$). There are 50% of the observations below $Q_2$ and 50% of
the observations above $Q_2$. The second quartile is also the median.
• 3rd quartile ($Q_3$). There are 75% of the observations below $Q_3$ and 25% of
the observations above $Q_3$.
If there are n observations in a set of data, then $Q_1$ can be identified as the
$\frac{n+1}{4}$th observation and $Q_3$ as the $\frac{3(n+1)}{4}$th observation.
Example:
Mr. Basanez is interested in the amount of time it takes his bank tellers to
service customers. One particular morning, he records the service times for 15
customers. The times (to the nearest minute) are given below.
6,9,7,5,16,11,9,7,4,9,7,11,10,8,6
a. Find the median time
b. Find Q1and Q3 of the service times.
Solution:
The number of observations is n = 15. Arrange the data in an array:
4, 5, 6, 6, 7, 7, 7, 8, 9, 9, 9, 10, 11, 11, 16
Here $Q_1$ is the 4th value, $Q_2$ is the 8th value and $Q_3$ is the 12th value.
a. The median or $Q_2$ = $\frac{n+1}{2}$th observation
= 8th observation
= 8 minutes
b. The first quartile: $Q_1$ = $\frac{n+1}{4}$th observation
= 4th observation
= 6 minutes
c. The third quartile: $Q_3$ = $\frac{3(n+1)}{4}$th observation
= 12th observation
= 10 minutes
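The positional rule used in this solution can be sketched in Python. The function name `quartile` is hypothetical. When $\frac{k(n+1)}{4}$ is not a whole number, the sketch interpolates linearly between the two neighboring observations; that choice is an assumption, since the module only shows the whole-number case. At least three observations are assumed.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) as the k(n+1)/4-th observation of the sorted data."""
    data = sorted(values)
    position = k * (len(data) + 1) / 4      # 1-based position in the array
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return data[whole - 1]
    # Assumption: interpolate between neighbors when the position is fractional.
    return data[whole - 1] + fraction * (data[whole] - data[whole - 1])

service_times = [6, 9, 7, 5, 16, 11, 9, 7, 4, 9, 7, 11, 10, 8, 6]
q1, q2, q3 = (quartile(service_times, k) for k in (1, 2, 3))
print(q1, q2, q3)
```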
$Q_1 = X_{LB} + \left(\frac{\frac{N}{4} - cf_b}{f_{q1}}\right) i$
$Q_3 = X_{LB} + \left(\frac{\frac{3N}{4} - cf_b}{f_{q3}}\right) i$
Where: $X_{LB}$ = lower boundary of the $Q_3$ class
$N$ = total frequency
$cf_b$ = cumulative frequency before the $Q_3$ class
$f_{q3}$ = frequency of the $Q_3$ class
$i$ = size of the class interval
Example: From the given frequency distribution table, compute $Q_1$, $Q_2$ and $Q_3$.
Class interval f
28-32 3
23-27 8
18-22 15
13-17 12
8-12 5
3-7 2
Solution:
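a. The first step is to add the entries in the column for cf.
Class interval   f    cf
28-32            3    45
23-27            8    42
18-22            15   34
13-17            12   19
8-12             5    7
3-7              2    2
b. For $Q_1$: $\frac{N}{4} = \frac{45}{4} = 11.25$, which falls in the class 13-17.
$Q_1 = X_{LB} + \left(\frac{\frac{N}{4} - cf_b}{f_{q1}}\right) i = 12.5 + \left(\frac{11.25 - 7}{12}\right) 5$
= 12.5 + 1.77
= 14.27
c. For $Q_2$: $\frac{N}{2} = \frac{45}{2} = 22.5$, which falls in the class 18-22.
$Q_2 = 17.5 + \left(\frac{22.5 - 19}{15}\right) 5$
= 17.5 + 1.17
= 18.67
d. For $Q_3$: $\frac{3N}{4} = \frac{3(45)}{4} = 33.75$, which also falls in the class 18-22.
$Q_3 = 17.5 + \left(\frac{33.75 - 19}{15}\right) 5$
= 17.5 + 4.92
= 22.42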
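The three grouped-quartile computations follow one pattern, which can be sketched in Python. The function name `grouped_quartile` and the tuple layout are assumptions made for illustration; integer class limits are assumed so that the lower boundary is the lower limit minus 0.5.

```python
# Each class is (lower limit, upper limit, frequency), from the table above.
classes = [(28, 32, 3), (23, 27, 8), (18, 22, 15), (13, 17, 12), (8, 12, 5), (3, 7, 2)]

def grouped_quartile(classes, k):
    """k-th quartile of grouped data: X_LB + ((kN/4 - cf_b) / f) * i."""
    ordered = sorted(classes)               # lowest class first
    n = sum(f for _, _, f in ordered)
    target = k * n / 4
    cf_before = 0                           # cumulative frequency before the current class
    for lo, hi, f in ordered:
        if cf_before + f >= target:         # class containing the kN/4-th score
            return (lo - 0.5) + (target - cf_before) / f * (hi - lo + 1)
        cf_before += f

for k in (1, 2, 3):
    print(k, round(grouped_quartile(classes, k), 2))
```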
Interquartile Range
$I.R. = Q_3 - Q_1$
The interquartile range is the distance between $Q_3$ and $Q_1$. Half of this
distance is called the quartile deviation (Q.D.), and the formula is given by:
Quartile Deviation
$Q.D. = \frac{Q_3 - Q_1}{2}$
Example:
A farmer has his corn crop spread over 15 fields each of equal size. The
output (in cubic meters) for each of the fields is
226,174,185,203,193,216,164,228,244,208,235,200,216,196,188
a. Find the range of the output.
b. Find the quartile deviation of the outputs.
Solution:
Arrange the data in an array:
164, 174, 185, 188, 193, 196, 200, 203, 208, 216, 216, 226, 228, 235, 244
a. The range of the output = 244 − 164 = 80 cubic meters
b. The first quartile = 4th observation
= 188 cubic meters
The third quartile = 12th observation
= 226 cubic meters
The interquartile range = 226 − 188
= 38 cubic meters
The quartile deviation = $\frac{\text{interquartile range}}{2} = \frac{226 - 188}{2}$
= $\frac{38}{2}$
= 19 cubic meters
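The farmer example can be reproduced with a short Python sketch. It assumes, as in the example, that n + 1 is divisible by 4 so that the quartiles are actual observations; the function name `quartile_deviation` is our own.

```python
outputs = [226, 174, 185, 203, 193, 216, 164, 228, 244,
           208, 235, 200, 216, 196, 188]

def quartile_deviation(values):
    """Half the distance between Q3 and Q1, using the (n+1)/4 positional rule."""
    data = sorted(values)
    n = len(data)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch only handles n + 1 divisible by 4")
    q1 = data[(n + 1) // 4 - 1]          # (n+1)/4-th observation
    q3 = data[3 * (n + 1) // 4 - 1]      # 3(n+1)/4-th observation
    return (q3 - q1) / 2

output_range = max(outputs) - min(outputs)
qd = quartile_deviation(outputs)
print(output_range, qd)
```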
C. The Mean Deviation
A measure of dispersion which takes into account each observation is the mean
deviation. It is more reliable than the range and the quartile deviation because
each of those uses only two values in the distribution: the two most
extreme values for the range, and $Q_3$ and $Q_1$ for the quartile deviation. The formula
for the mean deviation of ungrouped data is:
$M.D. = \frac{\sum |X - \bar{X}|}{N}$
Where:
$X$ = the scores of the distribution
$\bar{X}$ = the mean
$N$ = the number of observations
The formula tells us to follow these steps:
1. Calculate the mean of the data.
2. Add a column for $|X - \bar{X}|$.
3. Get the sum of this column and divide it by N.
Solution:
a. $\bar{X} = \frac{\sum X}{N} = \frac{24}{3} = 8$
b. Add the column for $|X - \bar{X}|$.
X     $|X - \bar{X}|$
5     3
8     0
11    3
c. $\sum |X - \bar{X}| = 6$
d. $M.D. = \frac{6}{3} = 2$
$M.D. = \frac{\sum f|X - \bar{X}|}{N}$ or $M.D. = \frac{\sum f|X_m - \bar{X}|}{N}$
X f
20 5
18 3
16 7
14 15
12 12
10 8
N= 50
Solution:
a. Calculate the mean by using the formula $\bar{X} = \frac{\sum fX}{N}$. This means we
are going to add the entries in the column for fX.
X     f     fX
20    5     100
18    3     54
16    7     112
14    15    210
12    12    144
10    8     80
      N = 50   $\sum fX = 700$
$\bar{X} = \frac{\sum fX}{N}$
= $\frac{700}{50}$
= 14
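b. Add columns for $|X - \bar{X}|$ and $f|X - \bar{X}|$.
X     f     $|X - \bar{X}|$    $f|X - \bar{X}|$
20    5     6     30
18    3     4     12
16    7     2     14
14    15    0     0
12    12    2     24
10    8     4     32
      N = 50      $\sum f|X - \bar{X}| = 112$
c. Divide $\sum f|X - \bar{X}|$ by N.
$M.D. = \frac{112}{50} = 2.24$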
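A single Python function can cover both mean-deviation formulas. In this sketch the name `mean_deviation` and the optional `freqs` argument are our own; when no frequencies are given, every score is counted once.

```python
def mean_deviation(values, freqs=None):
    """Mean deviation: sum of f * |X - mean| divided by N."""
    if freqs is None:
        freqs = [1] * len(values)         # ungrouped data: each score occurs once
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n

# Ungrouped example: scores 5, 8, 11
md_simple = mean_deviation([5, 8, 11])

# Frequency distribution example
md_freq = mean_deviation([20, 18, 16, 14, 12, 10], [5, 3, 7, 15, 12, 8])
print(md_simple, md_freq)
```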
The variance is defined as the quotient of the sum of the squared deviations
from the mean divided by N-1 while the standard deviation is the square root of
the variance. The formulas are given below.
$S^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$ and $S = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$
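These formulas use the mean deviation method and tell us to follow these
steps:
1. Calculate the mean.
2. Get the difference of each score and the mean, then get the square of
this difference.
3. Get the sum of the squared deviations in Step 2.
4. Divide the sum by N − 1 to get the variance, $S^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$, and take
the square root to get the standard deviation, $S = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$.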
Example: Find the variance and standard deviation of the following distribution:
X 5 8 11
Solution:
X     $(X - \bar{X})^2$
5     9
8     0
11    9
$\sum (X - \bar{X})^2 = 18$
$s^2 = \frac{18}{3 - 1} = \frac{18}{2} = 9$
$s = \sqrt{9} = 3$
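The steps can be sketched in Python as follows. The function names are our own; the standard library's `statistics.variance` and `statistics.stdev`, which also divide by N − 1, are used only as a cross-check.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

scores = [5, 8, 11]
print(sample_variance(scores), sample_std(scores))

# Cross-check against the standard library.
print(statistics.variance(scores), statistics.stdev(scores))
```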
The following are important points to remember regarding the calculations of the
standard deviation.
This method of computing the variance and standard deviation is called the raw score
method. The formulas are given below.
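For ungrouped data:
$S^2 = \frac{N\sum X^2 - (\sum X)^2}{N(N-1)}$ and $S = \sqrt{\frac{N\sum X^2 - (\sum X)^2}{N(N-1)}}$
For a frequency distribution:
$S^2 = \frac{N\sum fX^2 - (\sum fX)^2}{N(N-1)}$ and $S = \sqrt{\frac{N\sum fX^2 - (\sum fX)^2}{N(N-1)}}$
For grouped data, using the class marks $X_m$:
$S^2 = \frac{N\sum fX_m^2 - (\sum fX_m)^2}{N(N-1)}$ and $S = \sqrt{\frac{N\sum fX_m^2 - (\sum fX_m)^2}{N(N-1)}}$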
Example: Find the variance and the standard deviation of the following
distribution.
X 5 8 11
Solution:
a. Get $\sum X$.
X
5
8
11
$\sum X = 24$
b. Add a column for $X^2$, square all the scores and get their sum.
X     $X^2$
5     25
8     64
11    121
$\sum X = 24$     $\sum X^2 = 210$
c. Substitute into the formula.
$S^2 = \frac{N\sum X^2 - (\sum X)^2}{N(N-1)}$
$S^2 = \frac{3(210) - (24)^2}{3(3-1)}$
$S^2 = \frac{630 - 576}{6}$
$S^2 = \frac{54}{6}$
$S^2 = 9$
$S = \sqrt{9}$
$S = 3$
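The raw score method needs only the sum of the scores and the sum of their squares. A minimal Python sketch follows; the function name `raw_score_variance` is our own.

```python
import math

def raw_score_variance(values):
    """Variance by the raw score method: (N*sum(X^2) - (sum X)^2) / (N(N-1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

scores = [5, 8, 11]
variance = raw_score_variance(scores)
std_dev = math.sqrt(variance)
print(variance, std_dev)
```

Both methods give the same variance; the raw score form avoids computing each deviation from the mean.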
REMEMBER
• The standard deviation allows us to immediately compare the
spread of different sets of scores and also enables us to interpret
the scores of a given set of data.
ACTIVITY:
A measure of position is a method by which the position that a particular data value has
within a given data set can be identified. As with other types of measures, there is more
than one approach to defining such a measure.
The standard score (often called the z-score) for a given data value x is the
number of standard deviations that x is above or below the mean of the data. The
following formulas show how to calculate the z-score for a data value x in a population
and sample.
Population: $z_x = \frac{x - \mu}{\sigma}$          Sample: $z_x = \frac{x - \bar{x}}{s}$
To compute a standard score, only the mean and standard deviation are
required. However, since both of those quantities do depend on every value in the data
set, a small change in one data value will change every z-score.
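As an illustration, the sample z-score formula can be sketched in Python. The data set 5, 8, 11 is borrowed from the variance examples in this chapter (mean 8, standard deviation 3); the function name `z_scores` is our own.

```python
import statistics

def z_scores(sample):
    """z-score of every value in a sample: (x - mean) / s, with s using N - 1."""
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)
    return [(x - mean) / s for x in sample]

print(z_scores([5, 8, 11]))

# Changing a single data value shifts the mean and s, so every z-score changes.
print(z_scores([5, 8, 14]))
```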
Example:
Solution:
$z = -0.833$
Example:
Solution:
$z = \frac{8.17 - 8}{0.1}$
$z = 1.7$
• Percentiles
A value x is called the pth percentile of a data set provided p% of the data
values are less than x.
Example:
In a recent year, the median annual salary for a physical therapist was
Php 74,480. If the 90th percentile for the annual salary of a physical therapist was
Php 105,900, find the percent of physical therapists whose annual salary was
a. more than Php 74,480.
b. less than Php 105,900.
c. between Php 74,480 and Php 105,900.
a. By definition, the median is the 50th percentile. Therefore, 50% of the physical
therapists earned more than Php 74,480 per year.
b. Because Php 105,900 is the 90th percentile, 90% of all physical therapists
earned less than Php 105,900.
c. From parts a and b,
90%-50% = 40%
40% of the physical therapists earned between Php 74,480 and Php 105,900.
Example:
Solution:
Percentile = $\frac{\text{number of data values less than } x}{\text{total number of data values}} \cdot 100$
= 64
• Quartile
The three numbers $Q_1$, $Q_2$, and $Q_3$ that partition a ranked data set into four
equal groups are called quartiles.
- 1st quartile ($Q_1$). There are 25% of the observations below $Q_1$ and 75%
of the observations above $Q_1$.
- 2nd quartile ($Q_2$). There are 50% of the observations below $Q_2$ and 50%
of the observations above $Q_2$. The second quartile is also the median.
- 3rd quartile ($Q_3$). There are 75% of the observations below $Q_3$ and 25%
of the observations above $Q_3$.
The Median procedure for Finding Quartiles
The following table lists the calories per 100 milliliters of 25 popular
sodas. Find the quartiles for the data.
43 37 42 40 53 62 36 32 50 49
26 53 73 48 45 39 45 48 40 56
41 36 58 42 39
Solution:
Step 1: Rank the data as shown below.
1) 26   2) 32   3) 36   4) 36   5) 37   6) 39   7) 39   8) 40   9) 40
10) 41  11) 42  12) 42  13) 43  14) 45  15) 45  16) 48  17) 48  18) 49
19) 50  20) 53  21) 53  22) 56  23) 58  24) 62  25) 73
Step 2: The median of these 25 data values has a rank of 13. Thus, the
median is 43. The second quartile $Q_2$ is the median of the data, so $Q_2 = 43$.
Step 3: There are 12 data values less than the median and 12 data values
greater than the median. The first quartile is the median of the data values
less than the median. Thus, $Q_1$ is the mean of the data values with ranks of
6 and 7.
$Q_1 = \frac{39 + 39}{2} = 39$
The third quartile is the median of the data values greater than the
median. Thus, $Q_3$ is the mean of the data values with ranks of 19 and 20.
$Q_3 = \frac{50 + 53}{2} = 51.5$
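The median procedure can be sketched in Python as shown here. The function names are our own choices; the lower and upper halves exclude the median itself when the number of data values is odd, as in the solution above.

```python
def median(sorted_data):
    """Median of data that is already sorted."""
    n = len(sorted_data)
    mid = n // 2
    if n % 2:
        return sorted_data[mid]
    return (sorted_data[mid - 1] + sorted_data[mid]) / 2

def quartiles_by_median(values):
    """Q1, Q2, Q3 using the median procedure."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]           # values below the median position
    upper_half = data[(n + 1) // 2:]      # values above the median position
    return median(lower_half), median(data), median(upper_half)

calories = [43, 37, 42, 40, 53, 62, 36, 32, 50, 49,
            26, 53, 73, 48, 45, 39, 45, 48, 40, 56,
            41, 36, 58, 42, 39]
print(quartiles_by_median(calories))
```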
To construct a box-and-whisker plot:
1. Draw a horizontal scale that extends from the minimum data value to the maximum
data value.
2. Above the scale, draw a rectangle (a box) with its left side at $Q_1$ and its right side at
$Q_3$.
3. Draw a vertical line segment across the rectangle at the median, $Q_2$.
4. Draw a horizontal line segment, called a whisker, that extends from $Q_1$ to the
minimum and another whisker that extends from $Q_3$ to the maximum.
Example:
Solution:
Median (middle value) = 22
Lower quartile (middle value of the lower half) = 12
Upper quartile (middle value of the upper half) = 36
(If there is an even number of data items, then we need to get the average
of the middle numbers.)
Step 3: Draw a number line that will include the smallest and the largest data.
Step 4: Draw three vertical lines at the lower quartile (12), median (22) and the
upper quartile (36), just above the number line.
Step 5: Join the lines for the lower quartile and the upper quartile to form a box.
Step 6: Draw a line from the smallest value (5) to the left side of the box and
draw a line from the right side of the box to the biggest value (53).
REMEMBER
ACTIVITY:
69 93 70 53 92 75 85 70 68 76 88 70 77 82 85 82 80 100 96 85
b. Find the z-score of each measurement in the following sample data set.
−5 6 2 −1 0
Properties of Normal Distribution
▪ The graph is symmetric about a vertical line through the mean of the distribution.
▪ The mean, median and mode are equal.
▪ The y-value of each point on the curve is the percent (expressed as a decimal) of
the data at the corresponding x-value.
▪ Areas under the curve that are symmetric about the mean are equal.
▪ The total area under the curve is 1.
Example:
The weights of adorable, fluffy kittens are normally distributed with a
mean of 3.6 pounds and a standard deviation of 0.4 pounds.
First, draw your Empirical curve with the 4 percentages! (Steps 1-3 are
completed below.)
What percent of adorable, fluffy kittens weigh between 2.8 and 4.8
pounds?
What percent of adorable, fluffy kittens weigh less than 2.4 pounds?
0.15%
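The kitten questions can be worked with the Empirical Rule percentages (34%, 13.5%, 2.35% and 0.15% on each side of the mean). The Python sketch below is only an illustration: the names `BANDS`, `percent_below` and `percent_between` are our own, and the helper only handles weights that lie a whole number of standard deviations (up to 3) from the mean.

```python
# Percent of data in each one-standard-deviation band on one side of the mean:
# 0 to 1 sd, 1 to 2 sd, 2 to 3 sd, beyond 3 sd.
BANDS = [34.0, 13.5, 2.35, 0.15]

MEAN, SD = 3.6, 0.4   # kitten weights in pounds

def percent_below(x):
    """Percent of data below x, where x is a whole number of sds from the mean."""
    k = round((x - MEAN) / SD)            # number of standard deviations from the mean
    if k >= 0:
        return 50 + sum(BANDS[:k])
    return 50 - sum(BANDS[:-k])

def percent_between(low, high):
    return percent_below(high) - percent_below(low)

print(round(percent_between(2.8, 4.8), 2))   # 2.8 is 2 sd below, 4.8 is 3 sd above the mean
print(round(percent_below(2.4), 2))          # 2.4 is 3 sd below the mean
```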
For finding the area under the curve, the table is shown below:
Example:
Solution:
Because the curve is symmetrical, the same table can be used for values
going either direction, so a negative 0.45 also has an area of 0.1736
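Where a printed table is not at hand, the area between the mean and a z-score can be approximated in Python with the error function from the standard `math` module. The function name `area_from_mean` is our own; this is a sketch of the table lookup, not the module's prescribed method.

```python
import math

def area_from_mean(z):
    """Area under the standard normal curve between 0 and z.
    The curve is symmetric, so the sign of z does not matter."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

print(round(area_from_mean(0.45), 4))
print(round(area_from_mean(-0.45), 4))
```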
Example:
Solution:
From 0 to +2 is:
Linear Regression
Linear regression finds the line that best fits the data points. There are actually a
number of different definitions of "best fit," and therefore a number of different methods
of linear regression that fit somewhat different lines. By far the most common is
"ordinary least-squares regression"; when someone just says "least-squares
regression" or "linear regression" or "regression," they mean ordinary least-squares
regression.
There are many names for a regression’s dependent variable. It may be called
an outcome variable, criterion variable, endogenous variable, or regressand. The
independent variables can be called exogenous variables, predictor variables, or
regressors.
First, the regression might be used to identify the strength of the effect that the
independent variable(s) have on a dependent variable.
Second, it can be used to forecast effects or impact of changes. That is, the regression
analysis helps us to understand how much the dependent variable changes with a
change in one or more independent variables.
y = a x + b where
a and b are given by
$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$ and $b = \frac{1}{n}\left(\sum y - a\sum x\right)$
Figure 2. Formulas for the constants a and b included in the linear regression
Example:
Solution:
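x     y     xy    $x^2$
-2    -1    2     4
1     1     1     1
3     2     6     9
$\sum x = 2$   $\sum y = 2$   $\sum xy = 9$   $\sum x^2 = 14$
We now use the above formulas to calculate a and b as follows:
$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{(3)(9) - (2)(2)}{(3)(14) - (2)^2} = \frac{23}{38}$
$b = \frac{1}{n}\left(\sum y - a\sum x\right) = \frac{1}{3}\left(2 - \frac{23}{38}(2)\right) = \frac{5}{19}$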
points.
The Least-Square Regression Line for a set of bivariate data is the line that
minimizes the sum of the squares of the vertical deviations from each data point to the
line.
But for better accuracy let's see how to calculate the line using Least Squares
Regression.
The Line
Our aim is to calculate the values m (slope) and b (y-intercept) in the equation of a line :
y = mx + b
Where:
• y = how far up
• x = how far along
• m = Slope or Gradient (how steep the line is)
• b = the Y Intercept (where the line crosses the Y axis)
Steps
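Step 1: For each (x, y) point, calculate $x^2$ and xy.
Step 2: Sum all x, y, $x^2$ and xy, which gives us $\sum x$, $\sum y$, $\sum x^2$ and $\sum xy$.
Step 3: Calculate the slope m:
$m = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2}$
Step 4: Calculate the intercept b:
$b = \frac{\sum y - m\sum x}{N}$
Step 5: Assemble the equation of the line:
y = mx + b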
Example:
Sam found how many hours of sunshine vs how many ice creams were sold at
the shop from Monday to Friday:
"x" "y"
Hours of Sunshine   Ice Creams Sold
2                   4
3                   5
5                   7
7                   10
9                   15
Let us find the best m (slope) and b (y-intercept) that suits that data
y = mx + b
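Step 1: For each (x, y), calculate $x^2$ and xy:
x     y     $x^2$   xy
2     4     4     8
3     5     9     15
5     7     25    35
7     10    49    70
9     15    81    135
Step 2: Sum all x, y, $x^2$ and xy, which gives us $\sum x$, $\sum y$, $\sum x^2$ and $\sum xy$:
x     y     $x^2$   xy
2     4     4     8
3     5     9     15
5     7     25    35
7     10    49    70
9     15    81    135
$\sum x = 26$   $\sum y = 41$   $\sum x^2 = 168$   $\sum xy = 263$
Step 3: Calculate the slope m:
$m = \frac{N\sum xy - \sum x \sum y}{N\sum x^2 - (\sum x)^2} = \frac{5(263) - (26)(41)}{5(168) - (26)^2} = \frac{249}{164} = 1.5183...$
Step 4: Calculate the intercept b:
$b = \frac{\sum y - m\sum x}{N} = \frac{41 - 1.5183 \times 26}{5} = 0.3049...$
Step 5: Assemble the equation of the line:
y = mx + b
y = 1.518 x + 0.305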
x     y     y = 1.518x + 0.305   error
2     4     3.34                 −0.66
3     5     4.86                 −0.14
5     7     7.89                 0.89
7     10    10.93                0.93
9     15    13.97                −1.03
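The least-squares steps can be sketched in Python using the sunshine and ice cream data. The function name `least_squares_line` is our own; the formulas are the slope and intercept formulas from the steps above.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b

hours = [2, 3, 5, 7, 9]          # hours of sunshine
sold = [4, 5, 7, 10, 15]         # ice creams sold

m, b = least_squares_line(hours, sold)
print(round(m, 4), round(b, 4))

# Predicted sales and the error (predicted minus actual) for each day.
for x, y in zip(hours, sold):
    predicted = m * x + b
    print(x, y, round(predicted, 2), round(predicted - y, 2))
```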
Linear Correlation
For the n ordered pairs $(x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_n, y_n)$, the linear
correlation coefficient r is given by
$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2} \cdot \sqrt{n\sum y^2 - (\sum y)^2}}$
Example:
Therefore, $r_{xy} = 1$
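The correlation formula can be sketched in Python as follows. The function name `correlation` is our own. Because the data for the example above are not reproduced here, the sketch uses a hypothetical set of points lying exactly on a line (which gives r = 1) and the sunshine and ice cream data from the regression example.

```python
import math

def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical points on the line y = 2x + 1: a perfect positive relationship.
r_perfect = correlation([1, 2, 3, 4], [3, 5, 7, 9])

# Hours of sunshine vs. ice creams sold.
r_ice_cream = correlation([2, 3, 5, 7, 9], [4, 5, 7, 10, 15])
print(round(r_perfect, 4), round(r_ice_cream, 4))
```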
REMEMBER
• The linear correlation coefficient measures the strength
and direction of the linear relationship between two
variables x and y.
• The sign of the linear correlation coefficient indicates the
direction of the linear relationship between x and y.
ACTIVITY:
Solve the following problem.
https://www.abs.gov.au/websitedbs/a3121120.nsf/home/statistical+language+-+measures+of+central+tendency
https://www.toppr.com/guides/business-mathematics-and-statistics/measures-of-centraltendency-and-dispersion/measure-of-dispersion/
https://stattrek.com/descriptive-statistics/measures-of-position.aspx
https://statisticsbyjim.com/basics/normal-distribution/