Collection of data
Organize the data - tally
Presentation of data – graphs and table
Analysis of the data
Interpretation of the data
Respondents    f     %
Boys          15    30
Girls         35    70
Learning Outcomes:
1. Solve and interpret the measures of central tendency for ungrouped data.
2. Solve and interpret the range, variance, standard deviation, coefficient of variation
and skewness.
3. Apply the correlation to determine the relationship between two variables.
4. Use linear regression to predict the value of a variable given certain conditions.
5. Use a variety of statistical tools to process and manage numerical data.
MEAN ($\bar{x}$)
- computational average
- the sum of all n values divided by the total number of observations
● Arithmetic Mean
$\bar{x} = \frac{\sum x}{n}$
Where: x represents the value of an observation
n represents the total number of observations
● Weighted Mean
$\bar{x}_w = \frac{\sum wx}{\sum w}$
Where: x represents each of the item values
w represents the weight of each item value
$\bar{x}_w = \frac{\sum fx}{n}$
Where: f represents the frequency
n represents the sample size
MEDIAN ($\tilde{x}$)
- Positional average
- the centermost or middlemost observation or value (when n is odd) or
the average of the two middle values (when n is even) when the data are
arranged (in either ascending or descending order)
- divides the set of data into two equal parts (half of the observations belong to
the higher 50%, while the other half belong to the lower 50% of the group)
MODE ($\hat{x}$)
- Nominal average
- the most frequently occurring score in a distribution
- the observation or value which appears the most number of times in the set of
values
Examples:
Find the mean, median and mode of the following set of data.
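1. 17 25 34 25 27 19 24
$\bar{x} = \frac{171}{7} \approx 24.43$
$\tilde{x} \Rightarrow$ 17, 19, 24, 25, 25, 27, 34
$\tilde{x} = 25$
$\hat{x} = 25$
2. 40 52 50 48 56 60 37 65 40 50 65
$\bar{x} = \frac{563}{11} \approx 51.18$
$\tilde{x} \Rightarrow$ 37, 40, 40, 48, 50, 50, 52, 56, 60, 65, 65
$\tilde{x} = 50$
$\hat{x} = 40$, 50 and 65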
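3. 87 94 36 56 54 76 87 54 87 36
$\bar{x} = \frac{667}{10} = 66.7$
$\tilde{x} \Rightarrow$ 36, 36, 54, 54, 56, 76, 87, 87, 87, 94
$\tilde{x} = \frac{56 + 76}{2} = \frac{132}{2} = 66$
$\hat{x} = 87$
4. 21 23 16 15 26 27 19 24
$\bar{x} = \frac{171}{8} = 21.375 \approx 21.38$
$\tilde{x} \Rightarrow$ 15, 16, 19, 21, 23, 24, 26, 27
$\tilde{x} = \frac{21 + 23}{2} = \frac{44}{2} = 22$
$\hat{x} =$ no mode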
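The computations in Examples 3 and 4 can be checked with a short Python sketch. It assumes Python 3.8 or later and uses the standard `statistics` module; the helper name `describe` is illustrative only and is not part of the lesson. Following Example 4, a data set in which every value occurs equally often is reported as having no mode.

```python
from statistics import mean, median, multimode


def describe(values):
    """Return (mean, median, modes) for a list of ungrouped data."""
    modes = multimode(values)  # every value tied for the highest frequency
    distinct = set(values)
    # If all distinct values occur equally often, treat the data as having no mode.
    if len(distinct) > 1 and len(modes) == len(distinct):
        modes = []
    return mean(values), median(values), sorted(modes)


example_3 = [87, 94, 36, 56, 54, 76, 87, 54, 87, 36]
example_4 = [21, 23, 16, 15, 26, 27, 19, 24]

for label, data in (("Example 3", example_3), ("Example 4", example_4)):
    data_mean, data_median, data_modes = describe(data)
    print(label, round(data_mean, 2), data_median, data_modes or "no mode")
```

An empty list of modes stands for "no mode"; a list with several entries means the data are multimodal, as in Example 2.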
⮚ Weighted Mean
$\bar{x}_w = \frac{\sum wx}{\sum w} = \frac{41.25}{18} \approx 2.29$
2. If 8 000 books of Algebra were sold at ₱320 each, 1 500 Business Mathematics at
₱380 each, 1 000 Mathematics of Investment at ₱300 each and 3 500 Statistics at
₱340 each, find the weighted mean sales for the four books.
$\bar{x}_w = \frac{\sum wx}{\sum w} = \frac{4\,620\,000}{14\,000} =$ ₱330.00
3. Miss Z has 21 students in a specific subject. These students were asked how
often Miss Z gives assignments. Of these students, 18 answered (4) very often, 2
answered (3) often, 1 answered (2) seldom and nobody answered (1) never.
$\bar{x}_w = \frac{\sum wx}{\sum w} = \frac{18(4) + 2(3) + 1(2) + 0(1)}{21} = \frac{80}{21} \approx 3.81$ (very often)
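The weighted mean in Examples 2 and 3 can be reproduced with the minimal Python sketch below. The function name `weighted_mean` and the variable names are illustrative assumptions; the numbers are taken from the two examples.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired item values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Example 2: price of each book (x) weighted by the number of copies sold (w)
book_prices = [320, 380, 300, 340]
copies_sold = [8000, 1500, 1000, 3500]

# Example 3: rating scale values (x) weighted by the number of students (w)
ratings = [4, 3, 2, 1]
responses = [18, 2, 1, 0]

print(weighted_mean(book_prices, copies_sold))
print(round(weighted_mean(ratings, responses), 2))
```

The denominator is the sum of the weights (14 000 copies, 21 students), not the sum of the item values.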
Name:______________________________________Score:_________________
Section:_____________________________________Date:__________________
Activity 1
Measures of Central Tendency
a. 21 10 36 42 39 52 30 25 26
$\bar{x} =$ _________  $\tilde{x} =$ _________  $\hat{x} =$ _________
b. 21 55 25 30 26 36 42 39 36 25
$\bar{x} =$ _________  $\tilde{x} =$ _________  $\hat{x} =$ _________
c.
$\bar{x} =$ _________  $\tilde{x} =$ _________  $\hat{x} =$ _________
d. 31 21 16 15 21 27 19 18
$\bar{x} =$ _________  $\tilde{x} =$ _________  $\hat{x} =$ _________
e. 87 94 36 56 54 76 87 85 68 56 78 88
$\bar{x} =$ _________  $\tilde{x} =$ _________  $\hat{x} =$ _________
f. A student gets the following grades in his five subjects: 87 for Calculus, 82
for Physics, 79 for Chemistry, 81 for English and 83 for History. Compute
his mean grade if the weights for the five subjects are 5.0, 4.0, 4.0, 3.0 and
3.0, respectively. $\bar{x}_w =$ _________
g. It was recorded that 5 brands of ballpen with tag prices of ₱7.50, ₱8.00,
₱9.00, ₱10.00 and ₱12.50 were bought by 16, 5, 4, 12 and 6 students, respectively.
Find the mean sale. $\bar{x}_w =$ _________
h. Jessie Salvador, an Engineering student, got 88%, 85%, 91% and 93% in four
of his subjects. What grade must he get in his fifth subject in order to obtain
an average of 90%? $x =$ _________
i. The table below shows the number of respondents who answered 5, 4, 3, 2
and 1 on three questions. Compute the weighted mean and give the mean
interpretation using the scale below:

Mean           Interpretation
1.00 – 1.79    To a Very Slight Extent (VSE)
1.80 – 2.59    To a Slight Extent (SE)
2.60 – 3.39    To a Moderate Extent (ME)
3.40 – 4.19    To a Great Extent (GE)
4.20 – 5.00    To a Very Great Extent (VGE)

Question                                                                         5    4    3    2    1    $\bar{x}_w$    Interpretation
To what extent do you think Statistics will help you in your chosen career?     15   20    5    0    0
To what extent do you think Statistics will help you in doing research?         10   25    3    2    0
To what extent do you think Statistics will help you in real-life situations?   11   16    8    5    0
MEASURES OF VARIABILITY OR DISPERSION
RANGE (R)
The range, which is the simplest measure to compute, is the difference between
the highest and the lowest values in a set of numerical data. It is a poor and
unstable measure of variation, particularly for a large number of values,
because it depends only on the two extreme values. It is the least reliable
measure and should be used only when a quick measure of variation is needed.
The variance and the standard deviation are the generally accepted measures
of dispersion, especially in discussions and reports containing basic
statistics. The standard deviation is more widely used than the variance
since its value is expressed in the same unit as the observations and the mean.
Take note: The higher the standard deviation, the more spread or more dispersed
the data are. The smaller the standard deviation, the less spread and
less dispersed, the more homogeneous, more consistent or more
uniform the data are.
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ or $s^2 = \frac{n\sum x^2 - (\sum x)^2}{n(n - 1)}$
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\frac{n\sum x^2 - (\sum x)^2}{n(n - 1)}}$
Examples:
1. Find the value of the range, variance and standard deviation of the set of data:
Data: 17, 25, 24, 18, 20
Arranged: 17, 18, 20, 24, 25    mean = 20.8
R = HV – LV = 25 – 17 = 8

x        $x - \bar{x}$           $(x - \bar{x})^2$       $x^2$
17       17 – 20.8 = –3.8    (–3.8)² = 14.44     289
18       18 – 20.8 = –2.8    (–2.8)² = 7.84      324
20       20 – 20.8 = –0.8    (–0.8)² = 0.64      400
24       24 – 20.8 = 3.2     (3.2)² = 10.24      576
25       25 – 20.8 = 4.2     (4.2)² = 17.64      625
Total    104                 50.8                2 214

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{50.8}{5 - 1} = \frac{50.8}{4} = 12.7$ or
$s^2 = \frac{n\sum x^2 - (\sum x)^2}{n(n - 1)} = \frac{5(2\,214) - (104)^2}{5(5 - 1)} = \frac{254}{20} = 12.7$
$s = \sqrt{12.7} \approx 3.56$
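As a cross-check of Example 1, the sketch below computes the range and then the sample variance in three ways: by the definitional formula, by the computational formula, and with the standard `statistics` module (Python 3.8 or later is assumed). The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

data = [17, 25, 24, 18, 20]
n = len(data)

# Range: highest value minus lowest value
value_range = max(data) - min(data)

# Definitional formula: sum of squared deviations from the mean over (n - 1)
mean_value = sum(data) / n
s2_definition = sum((x - mean_value) ** 2 for x in data) / (n - 1)

# Computational formula: [n * sum(x^2) - (sum x)^2] / [n * (n - 1)]
s2_shortcut = (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))

# Library versions, which also use the n - 1 (sample) denominator
s2_library = variance(data)
s_library = stdev(data)

print(value_range, round(s2_definition, 2), round(s2_shortcut, 2), round(s2_library, 2))
print(round(sqrt(s2_shortcut), 2), round(s_library, 2))
```

All three variance values should agree up to floating-point rounding, which is why either formula may be used in hand computation.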
$R_A$ = 30 – 14 = 16        $R_B$ = 24 – 18 = 6

         Secretary A          Secretary B
         x       $x^2$        x       $x^2$
         14      196          18      324
         16      256          18      324
         18      324          20      400
         20      400          22      484
         22      484          24      576
         24      576          24      576
         26      676          24      576
         28      784          24      576
         30      900          24      576
Total    198     4 596        198     4 412

Secretary A: $s^2 = \frac{n\sum x^2 - (\sum x)^2}{n(n - 1)} = \frac{9(4\,596) - (198)^2}{9(9 - 1)} = \frac{2\,160}{72} = 30$;  $s = \sqrt{30} \approx 5.48$
Secretary B: $s^2 = \frac{n\sum x^2 - (\sum x)^2}{n(n - 1)} = \frac{9(4\,412) - (198)^2}{9(9 - 1)} = \frac{504}{72} = 7$;  $s = \sqrt{7} \approx 2.65$
Activity 2
Measures of Variability or Dispersion
a. The monthly number of cars sold by a car dealer from January to October for a
particular year are: 20 24 12 10 18 4 15 6 11 19.
b. Sample annual salaries, in thousands of pesos, for Manila and Makati are listed.
Manila: 34 25 17 17 27 25 29 33 26
Makati: 26 23 27 28 25 26 18 26 31
*Compute the range, variance and standard deviation, and interpret the result.
*In which area are salaries more consistent?
COEFFICIENT OF VARIATION
When the units of measurement are different, the coefficient of variation, a
measure of relative dispersion, may be used to compare the variability of sets of
numerical data. For instance, you may compare the variability of the ages of 9
children, whose mean age is 10 years with a standard deviation of 2 years, with that
of their weights, whose mean is 45 pounds with a standard deviation of 5 pounds, by
calculating their measures of relative dispersion. While it is not logical to compare
the values of their standard deviations, inasmuch as they are expressed in different
units of measure, it is nevertheless reasonable to determine measures that indicate
the amounts of their variation relative to their means.
$CV = \frac{s}{\bar{x}} \times 100\%$    Where: s = standard deviation and $\bar{x}$ = mean
Examples:
1. A dealer sells two classes of quality lamps, A and B. Lamp A has a mean life
span of 2000 hours with a standard deviation of 200 hours, while Lamp B has a
mean life span of 2500 hours with a standard deviation of 300 hours. Compare
the dispersion.
Lamp A: $CV = \frac{s}{\bar{x}} \times 100\% = \frac{200}{2000} \times 100\% = 10\%$
Lamp B: $CV = \frac{s}{\bar{x}} \times 100\% = \frac{300}{2500} \times 100\% = 12\%$
Interpretation:
● Lamp B (CV = 12%) has greater relative dispersion; it is more variable and
more dispersed than Lamp A (CV = 10%).
● Lamp A has lesser relative dispersion; it is more consistent, more uniform,
more homogeneous and therefore better than Lamp B.
Company A: $CV = \frac{s}{\bar{x}} \times 100\% = \frac{15}{105} \times 100\% \approx 14.29\%$
Company B: $CV = \frac{s}{\bar{x}} \times 100\% = \frac{40}{333} \times 100\% \approx 12.01\%$
Interpretation:
● Company B is more consistent than Company A.
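The coefficient of variation is a one-line computation. The sketch below applies it to the lamp example and to the ages and weights of the 9 children mentioned in the discussion above; the function name `coefficient_of_variation` is an illustrative assumption.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Return the standard deviation as a percentage of the mean."""
    if mean_value == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean_value


lamp_a = coefficient_of_variation(200, 2000)   # life span in hours
lamp_b = coefficient_of_variation(300, 2500)   # life span in hours
ages = coefficient_of_variation(2, 10)         # years
weights = coefficient_of_variation(5, 45)      # pounds

print(lamp_a, lamp_b)
print(ages, round(weights, 2))

# The set with the smaller CV is the more consistent one.
more_consistent_lamp = "Lamp A" if lamp_a < lamp_b else "Lamp B"
print(more_consistent_lamp)
```

Because the units cancel in the ratio, the CV of the ages (in years) can be compared directly with the CV of the weights (in pounds).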
Activity 3
Coefficient of Variation
● The weight (CV = 9.52%) has greater relative dispersion; it is more variable and
more dispersed than the score (CV = 8.97%).
● The score has lesser relative dispersion; it is more consistent, more uniform,
more homogeneous and therefore better than the weight.
● The height (CV = 15.79%) has greater relative dispersion; it is more variable and
more dispersed than the weight (CV = 11.17%).
● The weight has lesser relative dispersion; it is more consistent, more uniform,
more homogeneous and therefore better than the height.
3. Two employees, A and B, compare their daily work routines. A can finish
his job in an average of 1.5 hours with a standard deviation of 0.025 hour,
whereas B can finish the job in an average of 4 hours with a standard deviation
of 0.01 hour. Who is more consistent?
Another statistical measure, like the central tendency (average) and the
dispersion (variation), is the skewness (symmetry). Skewness (sk) is the degree of
symmetry, or departure from symmetry, of a set of data. A skewed distribution is
similar in shape to a normal distribution except that it is not symmetrical: the left
half of the polygon is not a mirror image of the right half.
$sk = \frac{3(\bar{x} - \tilde{x})}{s}$
2. Positively Skewed
- skewed to the right (longer right tail)
- the mean is greater than the median and mode
- sk > 0
3. Negatively Skewed
- skewed to the left (longer left tail)
- the mean is less than the median and mode
- sk < 0
Examples:
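i. $sk = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(40 - 38)}{4} = 1.5$ (positively skewed)
ii. $sk = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(320 - 350)}{40} = -2.25$ (negatively skewed)
iii. $\bar{x} = 70$  $\tilde{x} = 70$  s = 10
$sk = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(70 - 70)}{10} = 0$ (symmetrical)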
2. A physician conducted medical research on the spread of cancer using a group
of patients. The results reveal that the mean is 70 days with a standard deviation
of 44 days and a median of 65 days. What is the coefficient of skewness?
$sk = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(70 - 65)}{44} \approx 0.34$ (positively skewed)
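The skewness examples above all follow the same two steps: compute sk from the mean, median and standard deviation, then read its sign. A minimal Python sketch, with illustrative function names that are not part of the lesson:

```python
def skewness_coefficient(mean_value, median_value, std_dev):
    """Return sk = 3 * (mean - median) / s."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return 3 * (mean_value - median_value) / std_dev


def describe_skewness(sk):
    """Classify a distribution by the sign of its skewness coefficient."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "symmetrical"


# (mean, median, standard deviation) taken from the worked examples
cases = [(40, 38, 4), (320, 350, 40), (70, 70, 10), (70, 65, 44)]
for mean_value, median_value, std_dev in cases:
    sk = skewness_coefficient(mean_value, median_value, std_dev)
    print(round(sk, 2), describe_skewness(sk))
```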
Activity 4
Skewness
1. Determine the coefficient of skewness for each of the following sets of data and
describe the result.
a. $\bar{x} = 50$    $\tilde{x} = 40$    s = 4.5
b. $\bar{x} = 100$   $\tilde{x} = 120$   s = 11.5
c. $\bar{x} = 75$    $\tilde{x} = 85$    s = 6.2
d. $\bar{x} = 295$   $\tilde{x} = 250$   s = 35
2. At Saint Mary’s Academy, the mean age of the students is 19.2 years, with a
standard deviation of 1.2 years. The median age is 18.6 years. Compute the
coefficient of skewness. Describe the skewness.
CORRELATION
The investigation of two or more variables requires not only procedures for
defining and measuring the variables under study, but also for describing the nature
of relations between them. A procedure that may be used to determine the
relationship between variables is the correlation.
The most widely used measure of correlation is the Pearson Product Moment
Correlation Coefficient, or Pearson r, which was developed by Karl Pearson. This
statistic is used for interval and ratio types of data. If two variables, X and Y, are
under investigation, the correlation coefficient is determined by:
$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
Example:
Determine the degree of relationship between the midterm and final grade of
10 students at a certain university.
Student    Midterm Grade (X)    Final Grade (Y)    XY        $X^2$     $Y^2$
A          84                   85                 7 140     7 056     7 225
B          88                   89                 7 832     7 744     7 921
C          78                   86                 6 708     6 084     7 396
D          79                   83                 6 557     6 241     6 889
E          91                   88                 8 008     8 281     7 744
F          84                   87                 7 308     7 056     7 569
G          77                   81                 6 237     5 929     6 561
H          83                   86                 7 138     6 889     7 396
I          85                   82                 6 970     7 225     6 724
J          86                   85                 7 310     7 396     7 225
Total      835                  852                71 208    69 901    72 650

$\sum X = 835$; $\sum Y = 852$; $\sum XY = 71\,208$; $\sum X^2 = 69\,901$; $\sum Y^2 = 72\,650$
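$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$
$r = \frac{10(71\,208) - (835)(852)}{\sqrt{[10(69\,901) - (835)^2][10(72\,650) - (852)^2]}} = \frac{660}{\sqrt{(1\,785)(596)}} \approx 0.64$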
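The Pearson r for the midterm and final grades can be verified with the sketch below, which follows the computational formula term by term. The function name `pearson_r` and the list names are illustrative; the grades are those of Students A to J in the table.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from raw sums, mirroring the computational formula."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of observations")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


midterm = [84, 88, 78, 79, 91, 84, 77, 83, 85, 86]
final = [85, 89, 86, 83, 88, 87, 81, 86, 82, 85]

r = pearson_r(midterm, final)
print(round(r, 2))
```

Working from the raw lists rather than from hand-copied column totals makes it easy to spot a mistyped entry in the XY, X² or Y² columns.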
Example:
Compute for the value of Spearman rho and determine the degree of
relationship between capital and profit of dried fish.
$\rho = 1 - \frac{6\sum D^2}{n(n^2 - 1)} = 1 - \frac{6(7)}{10(10^2 - 1)} = 1 - \frac{42}{990} \approx 0.96$
Interpretation: There is a high positive correlation between the capital and profit of
10 businessmen.
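The Spearman rho computation above needs only the sum of squared rank differences and the number of pairs. The sketch below uses the values from the example ($\sum D^2 = 7$, n = 10); the function names are illustrative assumptions, and this simplified formula is appropriate only when there are no tied ranks.

```python
def sum_squared_rank_differences(ranks_x, ranks_y):
    """Return the sum of D^2, where D is the difference between paired ranks."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    return sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))


def spearman_rho(sum_d_squared, n):
    """Return rho = 1 - 6 * sum(D^2) / (n * (n^2 - 1)); assumes no tied ranks."""
    if n < 2:
        raise ValueError("at least two pairs of ranks are needed")
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(7, 10)
print(round(rho, 2))
```

When the ranks themselves are available, `sum_squared_rank_differences` supplies the first argument of `spearman_rho`.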
Activity 5
Correlation
1. The heights and weights of 10 basketball players in the PBA are randomly
selected from different teams. Calculate the value of Pearson r and interpret the
result.
2. Compute for the value of Spearman rho and determine the degree of relationship
between weight and height of bottle–fed infants using the same brand of milk.
The least-squares regression equation can be formed from a set of sample
data using the formula:
y = a + bx
$a = \frac{\sum Y(\sum X^2) - \sum X(\sum XY)}{n\sum X^2 - (\sum X)^2}$
$b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}$ or $b = \frac{\sum XY - n\bar{x}\bar{y}}{\sum X^2 - n\bar{x}^2}$
Note: The constants a and b in the regression equation are called the regression
coefficients.
Example:
The number of hours 13 students spent studying for a test and their scores on that
test are shown below. What would be the estimated score if a student studies for 6.5
hours?

Hours spent studying, X     0    1    2    4    4    5    5    5    6    6    7    7    8
Test score, Y              40   41   51   48   64   69   73   75   68   93   84   90   95
Solution:
From the data above: $\sum X = 60$; $\sum Y = 891$; $\sum XY = 4\,620$ and $\sum X^2 = 346$.
$b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2} = \frac{13(4\,620) - (60)(891)}{13(346) - (60)^2} = \frac{6\,600}{898} \approx 7.35$
$a = \frac{\sum Y(\sum X^2) - \sum X(\sum XY)}{n\sum X^2 - (\sum X)^2} = \frac{891(346) - (60)(4\,620)}{13(346) - (60)^2} = \frac{31\,086}{898} \approx 34.62$
Regression equation: y = 34.62 + 7.35x
For x = 6.5 hours: y = 34.62 + 7.35(6.5) ≈ 82.4
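The regression coefficients for the study-hours example can be checked, and the requested estimate at 6.5 hours obtained, with the sketch below. It applies the formulas for a and b directly; the function name `least_squares_line` and the variable names are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (a, b) of the least-squares line y = a + b*x."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of observations")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are equal; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    return a, b


hours = [0, 1, 2, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8]
scores = [40, 41, 51, 48, 64, 69, 73, 75, 68, 93, 84, 90, 95]

a, b = least_squares_line(hours, scores)
predicted_score = a + b * 6.5  # estimated score after 6.5 hours of study

print(round(a, 2), round(b, 2), round(predicted_score, 1))
```

The prediction uses the unrounded coefficients, so it may differ in the second decimal place from a hand computation that substitutes a ≈ 34.62 and b ≈ 7.35.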
Activity 6
Linear Regression
1. The table below shows the monthly income (X) and the monthly expenses (Y) of
7 families in a certain barangay in Makati. Estimate the monthly expenditures of a
family whose income is ₱ 8 250.
Family No.    Monthly Income (X)    Monthly Expenses (Y)    XY    $X^2$
1             6 600                 4 980
2             5 875                 4 680
3             7 250                 5 650
4             4 925                 3 700
5             5 678                 5 668
6             5 975                 4 260
7             6 950                 6 380