Chapter 2
1 author:
Raid Salha
Islamic University of Gaza
All content following this page was uploaded by Raid Salha on 22 July 2021.
2.1 Central Tendency Measures
1. The Mean
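The most commonly used index of central tendency is the mean, the term
used in statistics for the arithmetic average. The equation for calculating the
mean is as follows:
\[ \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} \]
where 𝑁 is the number of data values and 𝑋𝑖 are the data values. For
example, for eight data values whose sum is 80:
\[ \bar{X} = \frac{80}{8} = 10 \]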
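As a minimal sketch, the mean calculation can be checked in Python. The eight values below are hypothetical (the original data set is not reproduced here); they were chosen only so that they sum to 80, which matches the worked result 80/8 = 10.

```python
from statistics import mean

# Hypothetical sample of N = 8 values whose sum is 80 (illustrative only).
values = [6, 8, 9, 10, 10, 11, 12, 14]

total = sum(values)      # sum of all X_i
n = len(values)          # N, the number of data values
x_bar = total / n        # mean = (sum of X_i) / N

print(total, n, x_bar)

# The standard-library helper agrees with the manual formula.
print(mean(values) == x_bar)
```

Computing the sum and the count separately mirrors the two parts of the formula; in practice `statistics.mean` does both in one call.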
2. The Median
To calculate the median, the data values must first be sorted in ascending
order from the smallest to the largest value or in descending order from the
largest value to the smallest.
1. If the data size N is odd, then the median is the observation whose
rank is (N+1)/2.
2. If the data size N is even, then the median is the mean of the two
observations whose ranks are N/2 and (N+2)/2.
9 10 15 5 10 8 6 14 7
Median = 9
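Example 3: Find the median of the following data.
9 10 15 5 10 8 6 14 7 13
Sorted in ascending order, the data are
5 6 7 8 9 10 10 13 14 15
Since N = 10 is even, the median is the mean of the 5th and 6th values:
\[ \text{Median} = \frac{9 + 10}{2} = 9.5 \]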
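The ranking rule can be sketched in Python as follows. The helper name `median_by_rank` is hypothetical. The even-N branch follows the rule stated above (ranks N/2 and (N+2)/2); the odd-N branch assumes the standard rule of taking the value at rank (N+1)/2.

```python
from statistics import median

def median_by_rank(values):
    """Median from sorted ranks: middle value for odd N,
    mean of the values at ranks N/2 and (N+2)/2 for even N."""
    data = sorted(values)
    n = len(data)
    if n % 2 == 1:
        return data[(n + 1) // 2 - 1]    # rank (N+1)/2 as a 0-based index
    lower = data[n // 2 - 1]             # rank N/2
    upper = data[(n + 2) // 2 - 1]       # rank (N+2)/2
    return (lower + upper) / 2

odd_sample = [9, 10, 15, 5, 10, 8, 6, 14, 7]        # N = 9
even_sample = [9, 10, 15, 5, 10, 8, 6, 14, 7, 13]   # N = 10

print(median_by_rank(odd_sample))    # 9
print(median_by_rank(even_sample))   # 9.5
```

Both results agree with `statistics.median`, which applies the same two cases internally.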
3. The Mode
Example 6: Using SPSS, find the central tendency measures for the data in
Example 2 (Heart rate data).
Statistics
heartrate
N        Valid      100
         Missing    0
Mean                65.21
Median              66.00
Mode                66
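The same three measures can also be obtained outside SPSS. The sketch below uses Python's standard `statistics` module on the ten values from Example 3, because the 100 heart rate values are not listed here; it is an illustration, not a reproduction of the SPSS output.

```python
from statistics import mean, median, multimode

# Data from Example 3 (the heart rate data set is not reproduced here).
data = [9, 10, 15, 5, 10, 8, 6, 14, 7, 13]

data_mean = mean(data)          # arithmetic average
data_median = median(data)      # middle of the sorted values
data_modes = multimode(data)    # most frequent value(s), returned as a list

print(data_mean, data_median, data_modes)   # 9.7 9.5 [10]
```

`multimode` is used rather than `mode` so that a data set with several equally frequent values reports all of them.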
2.2 Variability Measures
Consider, for example, the two distributions in Figure 1. This figure shows
body weight data for two hypothetical samples, both of which have means of
150 pounds, but, clearly, the two samples differ markedly. In sample A,
there is great diversity: Some people weigh as little as 100 pounds, while
others weigh up to 200 pounds. In sample B, by contrast, there are few
people at either extreme: The weights cluster more tightly around the mean
of 150. We can verbally describe a sample's variability. We can say, for
example, that sample A is more variable than sample B with regard to
weight.
Figure 1: Two distributions with the same mean and different variability
Variability indexes
Statisticians have developed indexes that express the extent to which data
values on quantitative variables deviate from one another in a distribution.
Four of these indexes are described here.
1. The Range
The range, the simplest measure of variability, is the difference between the
highest data value and the lowest data value in the distribution.
Note: In research reports, the range is often shown as the minimum and
maximum value, without the subtracted difference score.
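A short Python sketch of the range, using the ten values from Example 3 as illustrative data. It prints the minimum and maximum as well as the difference, since reports often show only the two endpoints.

```python
# Data from Example 3, used here only for illustration.
data = [9, 10, 15, 5, 10, 8, 6, 14, 7, 13]

lowest, highest = min(data), max(data)
data_range = highest - lowest    # range = highest value - lowest value

print(f"minimum = {lowest}, maximum = {highest}, range = {data_range}")
```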
2. Interquartile Range
Note: The median is the second quartile, the point below which 50% of the
data lie.
Example 7: Use SPSS to find the following for the heart rate data:
a. The three quartiles and 𝐼𝑄𝑅.
b. The percentile points in steps of 10% (10%, 20%, ..., 90%).
c. The 15%, 25%, 45%, 50%, 60%, 75%, and 95% percentile points.
a.
Statistics
heartrate
N            Valid      100
             Missing    0
Percentiles  25         62.00
             50         66.00
             75         68.00
𝐼𝑄𝑅 = 68 − 62 = 6
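For readers without SPSS, the quartiles and the IQR can be sketched with Python's standard library. The data are the ten values from Example 3, since the heart rate values are not listed here. The `method="exclusive"` setting places the p-th quantile at position (N+1)p in the sorted data; other software may use a different percentile definition, so small differences from SPSS output are possible.

```python
from statistics import quantiles

# Data from Example 3, used here only for illustration.
data = [9, 10, 15, 5, 10, 8, 6, 14, 7, 13]

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1    # interquartile range = third quartile - first quartile

print(q1, q2, q3, iqr)   # 6.75 9.5 13.25 6.5
```

Note that the second quartile (9.5) equals the median found in Example 3.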
b.
Statistics
heartrate
N            Valid      100
             Missing    0
Percentiles  10         59.00
             20         61.00
             30         63.00
             40         64.00
             50         66.00
             60         66.60
             70         68.00
             80         69.00
             90         71.00
c.
Statistics
heartrate
N            Valid      100
             Missing    0
Percentiles  15         60.00
             25         62.00
             45         65.00
             50         66.00
             60         66.60
             75         68.00
             95         72.00
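To show how a single percentile point can be located by hand, here is a minimal sketch of one common rule: take position (N+1)p/100 in the sorted data and interpolate linearly between the two neighbouring values. The function name `percentile` and the choice of rule are assumptions made for illustration; statistical packages offer several percentile definitions, and their results may differ slightly.

```python
def percentile(values, p):
    """p-th percentile (0 < p < 100) at position (N + 1) * p / 100,
    with linear interpolation between neighbouring sorted values."""
    data = sorted(values)
    n = len(data)
    pos = (n + 1) * p / 100          # 1-based, possibly fractional, position
    if pos <= 1:
        return data[0]               # below the first value: use the minimum
    if pos >= n:
        return data[-1]              # beyond the last value: use the maximum
    lower = int(pos)                 # 1-based rank of the lower neighbour
    frac = pos - lower               # fractional distance to the next value
    return data[lower - 1] + frac * (data[lower] - data[lower - 1])

# Data from Example 3, used here only for illustration.
data = [9, 10, 15, 5, 10, 8, 6, 14, 7, 13]

for p in (25, 50, 75, 95):
    print(p, percentile(data, p))
```

With these ten values the 25th, 50th, and 75th percentiles are 6.75, 9.5, and 13.25, and the 95th percentile falls back to the maximum, 15, because its position lies beyond the last value.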
3. The Variance
Another index of variability is the variance (often abbreviated
as Var). The variance is based on the differences between every data value
and the mean. The formula for the variance is:
\[ Var = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}, \]
where
𝑁 is the sample size.
𝑋̅ is the sample mean.
𝑋𝑖 are the data values.
𝑋𝑖      𝑋𝑖 − 𝑋̅     (𝑋𝑖 − 𝑋̅)²
110     -40        1600
120     -30        900
130     -20        400
140     -10        100
150     0          0
150     0          0
160     10         100
170     20         400
180     30         900
190     40         1600
Sum:
1500    0          6000
\[ \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{1500}{10} = 150. \]
\[ Var = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1} = \frac{6000}{9} = 666.67 \]
Note: Because the variance is not in the same measurement units as the
original data (in this example, it is in pounds squared), the variance is rarely
used as a descriptive statistic.
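The variance calculation for the ten body weights can be reproduced step by step in Python. This is a sketch that follows the formula with N − 1 in the denominator; the variable names are chosen here for illustration.

```python
from statistics import variance

# Body weights (pounds) from the worked example.
weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]

n = len(weights)
x_bar = sum(weights) / n                              # mean = 1500 / 10
squared_devs = [(x - x_bar) ** 2 for x in weights]    # (X_i - mean)^2
sum_sq = sum(squared_devs)                            # sum of squared deviations
var = sum_sq / (n - 1)                                # divide by N - 1

print(x_bar, sum_sq, round(var, 2))   # 150.0 6000.0 666.67

# statistics.variance also uses the N - 1 denominator.
print(abs(variance(weights) - var) < 1e-9)
```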
4. The Standard Deviation
The most widely used index of variability is the standard deviation (often
abbreviated as SD). The standard deviation is the square root of the variance:
\[ SD = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}} = \sqrt{Var}. \]
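As an illustration, the standard deviation of the ten body weights used in the variance example can be computed in Python. The sketch takes the square root of the variance and compares it with `statistics.stdev`, which uses the same N − 1 denominator.

```python
from math import sqrt
from statistics import stdev

# Body weights (pounds) from the variance example.
weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]

n = len(weights)
x_bar = sum(weights) / n
var = sum((x - x_bar) ** 2 for x in weights) / (n - 1)   # 6000 / 9
sd = sqrt(var)                                           # SD = sqrt(Var)

print(round(sd, 2))   # 25.82
print(abs(stdev(weights) - sd) < 1e-9)
```

Unlike the variance, the result is in the original units: about 25.82 pounds.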
Example 10: Using SPSS, find the variability measures for the heart rate
data.
Statistics
heartrate
N        Valid      100
         Missing    0
Std. Deviation      4.495
Variance            20.208
Range               19
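As a quick consistency check on this output (a worked step added for illustration, not part of the SPSS output), the reported standard deviation should equal the square root of the reported variance:
\[ SD = \sqrt{Var} = \sqrt{20.208} \approx 4.495 \]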
Example 11: Using SPSS, find the central tendency and variability
measures for the Heart rate data.
Statistics
heartrate
N        Valid      100
         Missing    0
Mean                65.21
Median              66.00
Mode                66
Std. Deviation      4.495
Variance            20.208
Range               19
2.3 Outliers
Example 12: Are there any outliers in the heart rate data? If yes, find them.
• A mild lower outlier would be any value between 44 and 53, and a
mild upper outlier would be any value between 77 and 86.
There are no mild outliers.
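The bounds quoted above can be reproduced from the quartiles. The sketch below assumes the common convention that mild outliers lie between 1.5 and 3 IQRs beyond the quartiles and extreme outliers lie more than 3 IQRs beyond them; with Q1 = 62 and Q3 = 68 this yields exactly 44, 53, 77, and 86. The function names and the treatment of values that fall exactly on a fence are assumptions made for illustration.

```python
def outlier_fences(q1, q3):
    """Return the four fences used to classify outliers from the quartiles."""
    iqr = q3 - q1
    return {
        "extreme_lower": q1 - 3 * iqr,
        "mild_lower": q1 - 1.5 * iqr,
        "mild_upper": q3 + 1.5 * iqr,
        "extreme_upper": q3 + 3 * iqr,
    }

def classify(value, q1, q3):
    """Label a value as 'extreme', 'mild', or 'not an outlier'.
    Values exactly on a fence are treated as inside it (an assumption)."""
    f = outlier_fences(q1, q3)
    if value < f["extreme_lower"] or value > f["extreme_upper"]:
        return "extreme"
    if value < f["mild_lower"] or value > f["mild_upper"]:
        return "mild"
    return "not an outlier"

# Quartiles of the heart rate data from Example 7.
fences = outlier_fences(62, 68)
print(fences)

# Hypothetical values, used only to show the three possible labels.
for value in (65, 80, 90):
    print(value, classify(value, 62, 68))
```

For the hypothetical values 65, 80, and 90 the labels are "not an outlier", "mild", and "extreme" respectively.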
Figure 4: The boxplot graph for the heart rate data
Example 13:
see if they are legitimate values, or reflect errors in data entry. If they are
true values, researchers can decide on whether it is appropriate to make
adjustments, such as trimming the mean.
Figure 5: The boxplot graph for the heart rate data after adding six extreme
values
Exercises
41 27 32 24 21 28 22 25 35 27
31 40 23 27 29 33 42 30 26 30
27 39 26 34 28 38 29 36 24 37
If the values of these indexes are not the same, discuss what they suggest
about the shape of the distribution.
(a) 1 5 7 8 9
(b) 3 5 6 8 9 10
(c) 3 4 4 4 6 20
(d) 2 4 5 5 8 9
Q4. The following ten data values are systolic blood pressure readings.
Compute the mean, the range, the SD, and the variance for these data.
130 110 160 120 170 120 150 140 160 140
Q5. The following data represent blood pressure of 30 people
a. Calculate IQR.
b. Are there any outliers in the blood pressure data? If yes, find them and
classify them as mild or extreme outliers.
c. Graph the boxplot for the data.