Measures of Central Tendency and Variability


Measures of Central Tendency and Dispersion/Variability

THE MEASURES OF CENTRAL TENDENCY

The mean, mode and median are all valid measures of
central tendency, but under different conditions one
measure becomes more appropriate than the others.
For example, if the data include extremely high or
extremely low scores, the median is a better measure
of central tendency, since the mean is affected by
extreme scores.
The Mean (Arithmetic)

The mean (or average, or arithmetic mean) is the most
popular and best-known measure of central tendency.
The mean is equal to the sum of all the values in the
data set divided by the number of values in the data set.
Example
10 students in a Graduate School class got the
following scores in a 100-item test: 70, 72, 75, 77, 78,
80, 84, 87, 90, 92. The mean score of the group of 10
students is the sum of all their scores divided by 10.

The mean, therefore, is 805/10 = 80.5, so 80.5 is
the average score of the group.

There are 6 scores below the average score (mean) of
the group (70, 72, 75, 77, 78, and 80) and 4 scores
above it (84, 87, 90, and 92).
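
The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative only; the scores are the ten listed above.

```python
from statistics import mean

# Scores of the 10 students from the example
scores = [70, 72, 75, 77, 78, 80, 84, 87, 90, 92]

total = sum(scores)              # sum of all values
average = total / len(scores)    # divided by the number of values

# The standard library gives the same result
assert average == mean(scores)

# Count how many scores fall on each side of the mean
below = [s for s in scores if s < average]
above = [s for s in scores if s > average]

print(total, average, len(below), len(above))  # 805 80.5 6 4
```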
When Not to Use the Mean
The mean has one main disadvantage: it is
particularly susceptible to the influence of outliers.

No.     1   2   3   4   5   6   7   8   9  10
Score   5  38  56  60  67  70  73  78  79  95

• The mean score is 62.1.
• The mean is skewed by the extremely low
and extremely high scores.
• The median would be the better measure of
central tendency in this situation.
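
A small sketch, using the ten scores above, of how the two measures react to the extreme values. Dropping the lowest and highest score is done here only for illustration; it is not a procedure the slides recommend.

```python
from statistics import mean, median

scores = [5, 38, 56, 60, 67, 70, 73, 78, 79, 95]

print(mean(scores))    # 62.1
print(median(scores))  # 68.5 (average of the two middle scores, 67 and 70)

# Illustration only: remove the lowest and highest score and recompute
trimmed = sorted(scores)[1:-1]

print(mean(trimmed))    # 65.125 -> the mean moves by about 3 points
print(median(trimmed))  # 68.5   -> the median does not move
```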
Median
The median is the middle score for a
set of scores arranged from lowest to
highest. The median is less affected by
extremely low and extremely high scores
than the mean.
How do we find the median?

Suppose we have the following data:

65 55 89 56 35 14 56 55 87 45 92
To determine the median, first we have
to rearrange the scores into order of
magnitude (from smallest to largest):

14 35 45 55 55 56 56 65 87 89 92

Our median is the score at the middle of
the distribution. With 11 scores, that is
the 6th score, 56.
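
The procedure can be sketched in Python as follows. The helper name `manual_median` is hypothetical; the even-length branch (averaging the two middle scores) is the usual convention and is included so the function also works when there is no single middle score.

```python
from statistics import median

def manual_median(values):
    """Sort the values and return the middle one (or the mean of the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd count: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle scores

data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

print(sorted(data))         # [14, 35, 45, 55, 55, 56, 56, 65, 87, 89, 92]
print(manual_median(data))  # 56
assert manual_median(data) == median(data)
```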
Mode
The mode is the most frequent score in
our data set. On a histogram or bar chart it
is represented by the highest bar. If the scores
are counts of the number of times each option is
chosen in a multiple-choice test, you can,
therefore, sometimes consider the mode as being
the most popular option.

14 35 45 55 55 56 56 65 87 89

There are two most frequent scores, 55 and 56
- two modes, bimodal distribution
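
A minimal sketch of finding the most frequent scores in the data set above. It assumes Python 3.8 or later, where `statistics.multimode` is available; `Counter` is used only to show the underlying frequencies.

```python
from collections import Counter
from statistics import multimode

scores = [14, 35, 45, 55, 55, 56, 56, 65, 87, 89]

# Frequency of each score
counts = Counter(scores)
print(counts[55], counts[56])  # 2 2

# multimode returns every value that shares the highest frequency
modes = multimode(scores)
print(modes)  # [55, 56]
```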
NORMAL AND SKEWED
DISTRIBUTIONS
A score distribution of a sample has a "normal
distribution" when most of the values are clustered
around the mean, and the number of values
decreases as you move below or above the mean.
OUTCOME-BASED TEACHING-
LEARNING AND SCORE DISTRIBUTION
If teachers teach in accordance with the
principles of outcome-based teaching-learning, and so
align content and assessment with the intended
learning outcomes and re-teach until mastery what has
not been understood, as revealed by the formative
assessment process, then student scores in the
assessment phase of the lesson will tend to congregate
at the higher end of the score distribution.
MEASURES OF DISPERSION OR
VARIABILITY
If the measures of central tendency
indicate where scores congregate, the
measures of variability indicate how spread out
a group of scores is or how varied the scores
are. Common measures of dispersion or
variability are range, variance and standard
deviation.
What is variability?

Variability refers to how "spread out" a
group of scores is. Here are two score
distributions:
A - 5, 5, 5, 5, 6, 6, 6, 6, 6, 6 - Mean is 5.6
B - 1, 3, 4, 5, 5, 6, 7, 8, 8, 9 - Mean is 5.6

The two score distributions have equal mean
scores, and yet they differ in how varied the
scores are. Score distribution A shows scores
that are less varied than score distribution B.
[Figures 16 and 17. Bar charts of two quizzes, Quiz 1 and Quiz 2; the horizontal axes run from 4 to 10.]

http://onlinestatbook.com/2/summarizing_distributions/variability.html
RANGE
The range is the simplest measure of
variability. It is the highest score
minus the lowest score.
Example
What is the range of the following group of
scores: 10, 2, 5, 6, 7, 3, 4?

Highest number is 10
Lowest number is 2
10 – 2 = 8.
The range is 8.
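
The same calculation as a short sketch. The helper name `score_range` is hypothetical. It is also applied to score distributions A and B from the variability section, which share a mean of 5.6 but differ in spread.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

example = [10, 2, 5, 6, 7, 3, 4]
print(score_range(example))  # 8

# Distributions A and B from the variability section
a = [5, 5, 5, 5, 6, 6, 6, 6, 6, 6]
b = [1, 3, 4, 5, 5, 6, 7, 8, 8, 9]
print(score_range(a), score_range(b))  # 1 8
```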
VARIANCE
Variability can also be defined in terms
of how close the scores in the distribution are
to the middle of the distribution. Using the
mean as the measure of the middle of the
distribution, the variance is defined as the
average squared difference of the scores from
the mean.
Table. Calculation of Variance for Quiz 1 scores

Scores   Deviation from Mean   Squared Deviation
9         2                    4
9         2                    4
9         2                    4
8         1                    1
8         1                    1
8         1                    1
8         1                    1
7         0                    0
7         0                    0
7         0                    0
7         0                    0
7         0                    0
6        -1                    1
6        -1                    1
6        -1                    1
6        -1                    1
6        -1                    1
6        -1                    1
5        -2                    4
5        -2                    4
Means
7         0                    1.5
One thing that is important to notice is that the
mean deviation from the mean is 0. This will
always be the case. The mean of the squared
deviations is 1.5. Therefore, the variance is 1.5.

The formula for the variance is:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where σ² is the variance, μ is the mean, and N is the number of scores.
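
A sketch of the variance calculation for the Quiz 1 scores. It assumes the data consist of 20 scores (three 9s, four 8s, five 7s, six 6s and two 5s), which is what the stated mean of 7 and variance of 1.5 imply.

```python
from statistics import pvariance

# Assumed Quiz 1 data: 20 scores consistent with mean 7 and variance 1.5
quiz1 = [9] * 3 + [8] * 4 + [7] * 5 + [6] * 6 + [5] * 2

mu = sum(quiz1) / len(quiz1)                  # mean of the scores
deviations = [x - mu for x in quiz1]          # deviation of each score from the mean
squared = [d ** 2 for d in deviations]        # squared deviations

print(mu)                                     # 7.0
print(sum(deviations))                        # 0.0 (deviations from the mean always sum to 0)

variance = sum(squared) / len(quiz1)          # mean of the squared deviations
print(variance)                               # 1.5

# statistics.pvariance computes the same population variance
assert variance == pvariance(quiz1)
```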
STANDARD DEVIATION
To calculate the standard deviation of a set of
numbers:
1. Work out the Mean (the simple average of
the numbers).
2. Then for each number: subtract the Mean
and square the result.
3. Then work out the mean of those squared
differences.
4. Take the square root of that and we are
done!
Example: Sam has 20 rose bushes.
The number of flowers on each bush is 9, 2, 5, 4, 12, 7,
8, 11, 9, 3, 7, 4, 12, 5, 4,10,9,6,9,4

Step 1. Work out the mean

The mean is:

(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4)/20
= 140/20
= 7
So: μ = 7
Step 2. Then for each number: subtract the Mean
and square the result.
This is the part of the formula that says: (x_i - μ)²

Example (continued):
(9-7)² = (2)² = 4
(2-7)² = (-5)² = 25
(5-7)² = (-2)² = 4
(4-7)² = (-3)² = 9
(12-7)² = (5)² = 25
(7-7)² = (0)² = 0
(8-7)² = (1)² = 1
and so on for the remaining 13 values.
Step 3. Then work out the mean of those squared
differences.
To work out the mean, add up all the values then
divide by how many. First add up all the values from
the previous step. We use "Sigma" (Σ) for the sum:

Example (continued):

$$\sum_{i=1}^{N} (x_i - \mu)^2$$

This sums all values from (x_1 - 7)² to (x_N - 7)².
We already calculated (x_1 - 7)² = 4 etc. in the previous step,
so just sum them up:
4 + 25 + 4 + 9 + 25 + 0 + 1 + 16 + 4 + 16 + 0 + 9 + 25 + 4 + 9 + 9 + 4 + 1 + 4 + 9 = 178
But that isn't the mean yet; we need to divide by how
many, which is done by multiplying by 1/N (the same as
dividing by N):
Example (continued):

$$\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Mean of squared differences = (1/20) x 178 = 8.9
Step 4. Take the square root of that:
Example (concluded):

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

σ = √(8.9)
= 2.983...
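
The four steps can be traced in a short Python sketch using the 20 flower counts from the example. Variable names are illustrative; `statistics.pstdev` is the standard-library function for the population standard deviation and is used only as a cross-check.

```python
import math
from statistics import pstdev

flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
n = len(flowers)

# Step 1: work out the mean
mu = sum(flowers) / n                             # 7.0

# Step 2: subtract the mean from each number and square the result
squared_diffs = [(x - mu) ** 2 for x in flowers]

# Step 3: mean of the squared differences (divide by N)
total = sum(squared_diffs)                        # 178.0
mean_squared = total / n                          # 8.9

# Step 4: take the square root
sigma = math.sqrt(mean_squared)

print(mu, total, round(sigma, 3))                 # 7.0 178.0 2.983
assert math.isclose(sigma, pstdev(flowers))
```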
Sample Standard Deviation

Example: Sam has 20 rose bushes, but only
counted the flowers on 6 of them! The
"population" is all 20 rose bushes, and the
"sample" is the 6 bushes that Sam counted among
the 20.

The formula for Sample Standard Deviation:

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$
Bessel's correction
• The important change is "N-1" instead of "N".
• The symbols also change to reflect that we
are working on a sample instead of the whole
population: the mean is now x̄ (for sample
mean) instead of μ (the population mean),
and the answer is s (for Sample Standard
Deviation) instead of σ. But that does not
affect the calculations. Only N-1 instead of N does.
Step 1. Work out the mean

Example 2: Using sampled values 9, 2, 5, 4, 12, 7

The mean is (9+2+5+4+12+7)/6
= 39/6
= 6.5
x̄ = 6.5
Step 2.Then for each number: subtract the
Mean and square the result

Example 2(continued):

(9-6.5)² = (2.5)² =6.25


(2-6.5)² = (-4.5)² =20.25
(5-6.5)² = (-1.5)² =2.25
(4-6.5)² =(-2.5)² =6.25
(12-6.5)² =(5.5)² =30.25
(7-6.5)² =(0.5)² =0.25

Step 3. Then work out the mean of those squared
differences.
To work out the mean, add up all the values then
divide by how many. But hang on... we are
calculating the Sample Standard Deviation, so
instead of dividing by how many (N), we will divide
by N-1.

Example 2(continued):
Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 =
65.5
Divide by N-1: (1/5) x 65.5 = 13.1
Step 4. Take the square root of that:
Example 2 (concluded):

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$

s = √(13.1)
= 3.619...
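
The sample calculation can be sketched the same way. The only change from the population version is the division by N - 1; `statistics.stdev` is the standard-library sample standard deviation and serves as a cross-check.

```python
import math
from statistics import stdev

sample = [9, 2, 5, 4, 12, 7]
n = len(sample)

# Step 1: sample mean
x_bar = sum(sample) / n                              # 6.5

# Step 2: squared differences from the sample mean
squared_diffs = [(x - x_bar) ** 2 for x in sample]   # [6.25, 20.25, 2.25, 6.25, 30.25, 0.25]

# Step 3: divide the sum by N - 1 instead of N (Bessel's correction)
total = sum(squared_diffs)                           # 65.5
sample_variance = total / (n - 1)                    # 13.1

# Step 4: take the square root
s = math.sqrt(sample_variance)

print(x_bar, total, round(s, 3))                     # 6.5 65.5 3.619
assert math.isclose(s, stdev(sample))
```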
COMPARING
When we used the whole population we got:
Mean = 7, Standard Deviation = 2.983...
When we used the sample we got:
Sample Mean = 6.5, Sample Standard Deviation = 3.619...
Our Sample Mean was off by about 7%, and our Sample
Standard Deviation was off by about 21%.
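
A sketch of where the two percentages come from, taking the error as the absolute difference relative to the population value. The sampled values 9, 2, 5, 4, 12, 7 happen to be the first six entries of the population list, so the sample is sliced from it here.

```python
from statistics import mean, pstdev, stdev

population = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
sample = population[:6]  # 9, 2, 5, 4, 12, 7

pop_mean, pop_sd = mean(population), pstdev(population)   # 7 and 2.983...
smp_mean, smp_sd = mean(sample), stdev(sample)            # 6.5 and 3.619...

# Relative error of each sample statistic, in percent
mean_error_pct = abs(smp_mean - pop_mean) / pop_mean * 100
sd_error_pct = abs(smp_sd - pop_sd) / pop_sd * 100

print(round(mean_error_pct), round(sd_error_pct))  # 7 21
```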
MORE NOTES ON STANDARD DEVIATION
The standard deviation is simply the square root
of the variance. The standard deviation is an especially
useful measure of variability when the distribution is
normal or approximately normal because the
proportion of the distribution within a given number
of standard deviations from the mean can be
calculated.
Example

In a normal distribution, approximately 68% of the
distribution is within one standard deviation of the
mean and approximately 95% of the distribution is
within two standard deviations of the mean. Therefore,
if you had a normal distribution with a mean of
50 and a standard deviation of 10, then about 68% of
the distribution would be between 50 - 10 = 40
and 50 + 10 = 60. Similarly, about 95% of the
distribution would be between 50 - 2 x 10 = 30
and 50 + 2 x 10 = 70.
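
These proportions can be checked with `statistics.NormalDist` (Python 3.8 or later), a sketch using the mean of 50 and standard deviation of 10 from the example. The exact values are slightly different from the rounded 68% and 95%.

```python
from statistics import NormalDist

dist = NormalDist(mu=50, sigma=10)

# Proportion of the distribution between two scores = difference of the CDF values
within_one_sd = dist.cdf(60) - dist.cdf(40)   # between 40 and 60
within_two_sd = dist.cdf(70) - dist.cdf(30)   # between 30 and 70

print(round(within_one_sd, 4))  # 0.6827
print(round(within_two_sd, 4))  # 0.9545
```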
The symbol for the population standard
deviation is σ.
In the accompanying figure, the distribution drawn with
a bold line has a mean of 40 and a standard deviation
of 5; the other distribution has a mean of 60 and a
standard deviation of 10. For the bold-line distribution,
68% of the distribution is between 35 and 45; for the
other distribution, 68% is between 50 and 70.
Interpretation of Standard Deviation
The lower the standard deviation, the more consistent the
data are.

Example - Two bowlers, Katie and Mike, have the scores given
below:
Katie's Scores 189 146 200 241 231
Mike's Scores 235 201 217 168 186

Both sets of data have a mean (x̄) = 201.4.

Katie has a standard deviation of SD = 37.6470 and Mike has
a standard deviation of SD = 26.1017. Since Mike has a
smaller standard deviation, he is a more consistent bowler
than Katie, i.e. Mike is more likely to get a score close to 201.4.
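
A sketch that recomputes the bowlers' figures. The reported standard deviations match the sample formula (dividing by N - 1), so `statistics.stdev` is used here rather than `pstdev`.

```python
from statistics import mean, stdev

katie = [189, 146, 200, 241, 231]
mike = [235, 201, 217, 168, 186]

# Both bowlers have the same mean score
print(mean(katie), mean(mike))   # 201.4 201.4

# Sample standard deviation (N - 1 in the denominator)
katie_sd = stdev(katie)
mike_sd = stdev(mike)
print(round(katie_sd, 4), round(mike_sd, 4))  # 37.647 26.1017

# The smaller standard deviation marks the more consistent bowler
more_consistent = "Mike" if mike_sd < katie_sd else "Katie"
print(more_consistent)  # Mike
```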
