Statistics
•Descriptive Statistics: collecting and describing data.
•Inferential Statistics: making decisions based on sample data.
Descriptive Statistics
•Collect data, e.g. survey
Inferential Statistics
•Estimation
•Hypothesis testing
Types of Data
•Categorical
•Numerical: Discrete or Continuous
• Solution:
$$\bar{X} = \frac{3 + 8 + 6 + 14 + 0 + (-4) + 0 + 12 + (-7) + 0 + (-10)}{11} = \frac{22}{11} = 2$$
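MEDIAN
1. Find the median of the following data set:
3 8 6 14 0 -4 2 12 -7 -1 -10
Solution: We need to arrange the data set in order. The ordered set is as follows:
-10 -7 -4 -1 0 2 3 6 8 12 14
Median = 2 (the middle value, 6th of the 11 ordered values)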
2. Find the median for the ages of the following eight
college students:
23 19 32 25 26 22 24 20
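The two median exercises above can be checked with a short script. This is a minimal sketch: the helper name `median_by_hand` is an assumption made for illustration, and the standard-library `statistics.median` is used only as a cross-check.

```python
from statistics import median

values = [3, 8, 6, 14, 0, -4, 2, 12, -7, -1, -10]  # exercise 1 (n = 11, odd)
ages = [23, 19, 32, 25, 26, 22, 24, 20]            # exercise 2 (n = 8, even)


def median_by_hand(data):
    """Sort the data, then take the middle value (odd n)
    or the mean of the two middle values (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_hand(values))  # 2
print(median_by_hand(ages))    # 23.5

# Cross-check against the standard library.
print(median(values), median(ages))
```

With an even number of observations there is no single middle value, so the median is the average of the 4th and 5th ordered ages (23 and 24).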
Raw scores (N = 50):
84 80 68 87 86 70 79 90 67 80
82 62 85 86 61 86 87 91 78 86
72 96 89 84 78 88 78 78 82 76
70 86 85 88 70 79 75 89 73 86
72 68 82 89 81 69 77 81 77 83
1. Find the RANGE (subtract the lowest score from the highest score)
$$\text{Range} = H - L = 96 - 61 = 35$$
2. Determine the class interval (class size) by dividing the range by the desired number of classes (10-15). Using 11 classes:
$$i = \frac{35}{11} \approx 3.18 \approx 3$$
3. Determine the lowest interval. Its lower limit should be a multiple of the class interval (here, 60)
Lowest interval: 60-62
4. Record the limits of the intervals, tally raw
scores and convert each tally to frequency
X Frequency
96-98 1
93-95 0
90-92 2
87-89 7
84-86 10
81-83 6
78-80 8
75-77 4
72-74 3
69-71 4
66-68 3
63-65 0
60-62 2
N= 50
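Steps 1 to 4 can be sketched in code as follows. The variable names are assumptions made for illustration; the divisor 11 (desired number of classes) is taken from the worked example, and `collections.Counter` does the tallying.

```python
from collections import Counter

scores = [
    84, 80, 68, 87, 86, 70, 79, 90, 67, 80,
    82, 62, 85, 86, 61, 86, 87, 91, 78, 86,
    72, 96, 89, 84, 78, 88, 78, 78, 82, 76,
    70, 86, 85, 88, 70, 79, 75, 89, 73, 86,
    72, 68, 82, 89, 81, 69, 77, 81, 77, 83,
]

# Step 1: range = highest score - lowest score
highest, lowest = max(scores), min(scores)
score_range = highest - lowest                 # 96 - 61 = 35

# Step 2: class size = range / desired number of classes, rounded
class_size = round(score_range / 11)           # 35 / 11 is about 3.18, so 3

# Step 3: lowest class starts at a multiple of the class size
start = (lowest // class_size) * class_size    # 60

# Step 4: tally each score into its class and convert tallies to frequencies
tally = Counter((s - start) // class_size for s in scores)
n_classes = (highest - start) // class_size + 1

table = []
for k in range(n_classes):
    lower = start + k * class_size
    table.append((lower, lower + class_size - 1, tally.get(k, 0)))

# Print highest class first, as in the table above.
for lower, upper, freq in reversed(table):
    print(f"{lower}-{upper} {freq}")
print("N =", sum(freq for _, _, freq in table))
```

Note that a class size of 3 starting at 60 yields 13 classes for these scores, still within the suggested 10-15.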
5. Set the class boundaries (integral class limits ± 0.5) and the cumulative frequencies
X Frequency Class Boundaries Cumulative Freq.
96-98 1 95.5-98.5 50
93-95 0 92.5-95.5 49
90-92 2 89.5-92.5 49
87-89 7 86.5-89.5 47
84-86 10 83.5-86.5 40
81-83 6 80.5-83.5 30
78-80 8 77.5-80.5 24
75-77 4 74.5-77.5 16
72-74 3 71.5-74.5 12
69-71 4 68.5-71.5 9
66-68 3 65.5-68.5 5
63-65 0 62.5-65.5 2
60-62 2 59.5-62.5 2
N= 50
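The class boundaries and cumulative frequencies in this table can be derived mechanically from the class limits. A minimal sketch, assuming integer class limits so that each boundary lies 0.5 beyond the limit; the tuple layout is an illustrative choice.

```python
# (lower limit, upper limit, frequency), lowest class first
classes = [
    (60, 62, 2), (63, 65, 0), (66, 68, 3), (69, 71, 4), (72, 74, 3),
    (75, 77, 4), (78, 80, 8), (81, 83, 6), (84, 86, 10), (87, 89, 7),
    (90, 92, 2), (93, 95, 0), (96, 98, 1),
]

rows = []
running_total = 0
for lower, upper, freq in classes:
    running_total += freq  # cumulative frequency up to and including this class
    rows.append((lower, upper, freq, lower - 0.5, upper + 0.5, running_total))

# Print highest class first, matching the table layout.
for lower, upper, freq, lb, ub, cum in reversed(rows):
    print(f"{lower}-{upper} {freq} {lb}-{ub} {cum}")
```

The cumulative frequency of the highest class always equals N, which is a quick check on the tally.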
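GET THE MEAN (GROUP)
$$\text{Mean} = \frac{\sum f\,cm}{N}$$
where f is the class frequency, cm is the class mark (midpoint), and the sum is written Sfcm in the table.
$$\text{Mean} = \frac{3995}{50} = 79.9 \approx 80$$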
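The grouped mean can be reproduced from the frequency table alone. A minimal sketch, assuming each class mark is the midpoint of its class limits; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency), lowest class first
classes = [
    (60, 62, 2), (63, 65, 0), (66, 68, 3), (69, 71, 4), (72, 74, 3),
    (75, 77, 4), (78, 80, 8), (81, 83, 6), (84, 86, 10), (87, 89, 7),
    (90, 92, 2), (93, 95, 0), (96, 98, 1),
]

n = sum(freq for _, _, freq in classes)

# Sum of frequency x class mark over all classes (Sfcm in the table).
sum_fcm = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)

grouped_mean = sum_fcm / n
print(n, sum_fcm, round(grouped_mean, 1))  # 50 3995.0 79.9
```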
GET THE MEDIAN (GROUP)
$$\text{Median} = L + \left(\frac{\frac{N}{2} - Cf}{f}\right) i$$
L = lower class boundary of the median class
N = total number of scores
Cf = cumulative frequency below the median class
f = frequency of the median class
i = class size
X Frequency Class Boundaries Cumulative Freq. Class Mark (Midpoint) fcm
96-98 1 95.5-98.5 50 97 97
93-95 0 92.5-95.5 49 94 0
90-92 2 89.5-92.5 49 91 182
87-89 7 86.5-89.5 47 88 616
84-86 10 83.5-86.5 40 85 850
81-83 6 80.5-83.5 30 82 492
78-80 8 77.5-80.5 24 79 632
75-77 4 74.5-77.5 16 76 304
72-74 3 71.5-74.5 12 73 219
69-71 4 68.5-71.5 9 70 280
66-68 3 65.5-68.5 5 67 201
63-65 0 62.5-65.5 2 64 0
60-62 2 59.5-62.5 2 61 122
N= 50 Sfcm = 3995
GET THE MEDIAN (GROUP)
N/2 = 50/2 = 25
Median class (Md): 81-83, with class boundaries 80.5-83.5. It is the first class whose cumulative frequency (30) reaches N/2 = 25, so L = 80.5, Cf = 24, f = 6 and i = 3.
$$\text{Median} = 80.5 + \left(\frac{\frac{50}{2} - 24}{6}\right) 3 = 80.5 + 0.5 = 81$$
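The grouped-median computation can be sketched as a small function. This is an illustration under stated assumptions: the function name `grouped_median` is hypothetical, class limits are integers (so the lower boundary is the lower limit minus 0.5), and the median class is taken as the first class whose cumulative frequency reaches N/2.

```python
# (lower limit, frequency), lowest class first; class size i = 3
classes = [
    (60, 2), (63, 0), (66, 3), (69, 4), (72, 3), (75, 4), (78, 8),
    (81, 6), (84, 10), (87, 7), (90, 2), (93, 0), (96, 1),
]
class_size = 3


def grouped_median(classes, class_size):
    """Median = L + ((N/2 - Cf) / f) * i for a grouped frequency table."""
    n = sum(freq for _, freq in classes)
    target = n / 2
    cum_below = 0  # Cf: cumulative frequency below the current class
    for lower_limit, freq in classes:
        if freq > 0 and cum_below + freq >= target:
            lower_boundary = lower_limit - 0.5
            return lower_boundary + (target - cum_below) / freq * class_size
        cum_below += freq
    raise ValueError("frequency table is empty")


print(f"{grouped_median(classes, class_size):.2f}")  # 81.00
```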
GET THE MODE (GROUP)
Mo = 3Md − 2M, where Md is the median (81) and M is the mean (79.9)
Mo = 3(81) − 2(79.9)
= 83.2
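The relation Mo = 3Md − 2M is an empirical approximation, so it is easy to check numerically. A minimal sketch; the function name `empirical_mode` is an assumption made for illustration.

```python
def empirical_mode(median, mean):
    """Empirical estimate of the mode: Mo = 3 * Md - 2 * M."""
    return 3 * median - 2 * mean


mode_estimate = empirical_mode(median=81, mean=79.9)
print(round(mode_estimate, 1))  # 83.2
```

For comparison, the class with the highest frequency in the table is 84-86 (f = 10, class mark 85), so the estimate of 83.2 falls just below the modal class.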
Measures of Variability
Range = 1005
The Variance and Standard Deviation
$$S^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
• The formula says that you subtract the mean from each data
value and square the differences, then you add these values
and divide by the sample size minus 1.
Do not let the formula frighten you. We will build a table to help
compute the variance.
What is the variance for the following sample
values?
3 8 6 14 0 11
• Solution: First of all, we need to compute the sample mean:
$$\bar{X} = \frac{3 + 8 + 6 + 14 + 0 + 11}{6} = \frac{42}{6} = 7$$
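Table Used in Helping to Compute the Sample Variance
x (x − x̄) (x − x̄)²
3 -4 16
8 1 1
6 -1 1
14 7 49
0 -7 49
11 4 16
Sum 0 132
$$S^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{132}{6 - 1} = \frac{132}{5} = 26.4$$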
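The sample-variance computation can be followed step by step in code. A minimal sketch with illustrative variable names; `statistics.variance` (which also divides by n − 1) serves as a cross-check.

```python
from statistics import variance

sample = [3, 8, 6, 14, 0, 11]

n = len(sample)
mean = sum(sample) / n                        # 42 / 6 = 7.0

# Subtract the mean from each value and square the difference.
squared_devs = [(x - mean) ** 2 for x in sample]

# Add the squared differences and divide by the sample size minus 1.
sum_squares = sum(squared_devs)               # 132.0
sample_variance = sum_squares / (n - 1)

print(mean, sum_squares, round(sample_variance, 1))  # 7.0 132.0 26.4
print(round(variance(sample), 1))                    # 26.4
```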
The Variance (Group)
$$\text{Variance} = i^2\left[\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2\right]$$
d = deviation of each class from the assumed-mean class (78-80), counted in class intervals; fd² means f × d².
X Frequency Class Boundaries Cumulative Freq. Class Mark (Midpoint) fcm d fd fd²
96-98 1 95.5-98.5 50 97 97 6 6 36
93-95 0 92.5-95.5 49 94 0 5 0 0
90-92 2 89.5-92.5 49 91 182 4 8 32
87-89 7 86.5-89.5 47 88 616 3 21 63
84-86 10 83.5-86.5 40 85 850 2 20 40
81-83 6 80.5-83.5 30 82 492 1 6 6
78-80 8 77.5-80.5 24 79 632 0 0 0
75-77 4 74.5-77.5 16 76 304 -1 -4 4
72-74 3 71.5-74.5 12 73 219 -2 -6 12
69-71 4 68.5-71.5 9 70 280 -3 -12 36
66-68 3 65.5-68.5 5 67 201 -4 -12 48
63-65 0 62.5-65.5 2 64 0 -5 0 0
60-62 2 59.5-62.5 2 61 122 -6 -12 72
N= 50 Sfcm = 3995 Sfd = 15 Sfd² = 349
$$\text{Variance} = 3^2\left[\frac{349}{50} - \left(\frac{15}{50}\right)^2\right] = 9\,(6.98 - 0.09) = 62.01$$
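A sketch of the coded (step-deviation) variance for the grouped scores, with a cross-check against the variance computed directly from the class marks. Assumptions made here: fd² is read as f × d², the assumed mean is the class mark 79 (class 78-80), the bracketed term is scaled by i², and N (not N − 1) is the divisor as in the grouped formula.

```python
# (class mark, frequency), lowest class first; class size i = 3
classes = [
    (61, 2), (64, 0), (67, 3), (70, 4), (73, 3), (76, 4), (79, 8),
    (82, 6), (85, 10), (88, 7), (91, 2), (94, 0), (97, 1),
]
class_size = 3
assumed_mean = 79  # class mark of the class chosen as d = 0

n = sum(freq for _, freq in classes)

# d = number of class intervals between each class and the assumed-mean class
deviations = [(mark - assumed_mean) // class_size for mark, _ in classes]

sum_fd = sum(freq * d for (_, freq), d in zip(classes, deviations))
sum_fd2 = sum(freq * d * d for (_, freq), d in zip(classes, deviations))

coded_variance = class_size ** 2 * (sum_fd2 / n - (sum_fd / n) ** 2)

# Cross-check: variance computed directly from the class marks.
mean = assumed_mean + class_size * sum_fd / n
direct_variance = sum(freq * (mark - mean) ** 2 for mark, freq in classes) / n

print(sum_fd, sum_fd2)               # 15 349
print(round(coded_variance, 2))      # 62.01
print(round(direct_variance, 2))     # 62.01
```

The two routes agree, which is the point of the coded method: it gives the same variance with smaller numbers to handle by hand.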
Standard Deviation
When to Use Standard Deviation
• When the mean is the measure of central tendency (mean counterpart)
• When a reliable/stable measure of variability is needed
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
1. The sample standard deviation measures the typical distance of the observations from their mean. It is comparable to, but never smaller than, the mean absolute deviation (MAD).
2. If all of the observations have the same value, the sample standard deviation will
be zero. That is, there is no variability in the data set.
3. The variance (standard deviation) is influenced by outliers (very small or very
large values) in the data set.
4. The unit for the standard deviation is the same as that for the raw data, so it is
preferable to use the standard deviation instead of the variance as a measure of
variability.
What is the standard deviation for the following sample values?
3 8 6 14 0 11
• Solution:
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{132}{5}} = \sqrt{26.4} \approx 5.14$$
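The standard deviation of the same six values can be checked in code, together with the mean absolute deviation (MAD) that property 1 compares it with. A minimal sketch with illustrative variable names; `statistics.stdev` is used as a cross-check.

```python
import math
from statistics import stdev

sample = [3, 8, 6, 14, 0, 11]

n = len(sample)
mean = sum(sample) / n

# Sample standard deviation: square root of the sample variance.
s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))

# Mean absolute deviation: average distance of the observations from the mean.
mad = sum(abs(x - mean) for x in sample) / n

print(round(s, 2))              # 5.14
print(mad)                      # 4.0
print(round(stdev(sample), 2))  # 5.14
```

Both figures are in the same units as the raw data, which is why they can be compared directly.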
Quartile Deviation
$$Q_1 = L + \left(\frac{\frac{N}{4} - Cf}{f}\right) i$$
$$Q_3 = L + \left(\frac{\frac{3N}{4} - Cf}{f}\right) i$$
X Frequency Class Boundaries Cumulative Freq.
96-98 1 95.5-98.5 50
93-95 0 92.5-95.5 49
90-92 2 89.5-92.5 49
87-89 7 86.5-89.5 47
84-86 10 83.5-86.5 40
81-83 6 80.5-83.5 30
78-80 8 77.5-80.5 24
75-77 4 74.5-77.5 16
72-74 3 71.5-74.5 12
69-71 4 68.5-71.5 9
66-68 3 65.5-68.5 5
63-65 0 62.5-65.5 2
60-62 2 59.5-62.5 2
N= 50
Quartile Deviation (Group)
$$Q_1 = 74.5 + \left(\frac{\frac{50}{4} - 12}{4}\right) 3$$
$$Q_3 = 83.5 + \left(\frac{\frac{3(50)}{4} - 30}{10}\right) 3$$
Measures of Relationship
•Degree of relationship or correlation between two variables
•Correlation coefficient: ranges from -1 through 0 to +1
Pearson r
• Most appropriate
• Stable
• Data are interval or ratio type
• When the relationship between the two variables is linear
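Quartile Deviation (Group)
$$Q_1 = 74.5 + \left(\frac{\frac{50}{4} - 12}{4}\right) 3 = 74.875$$
$$Q_3 = 83.5 + \left(\frac{\frac{3(50)}{4} - 30}{10}\right) 3 = 85.75$$
$$\text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} = \frac{85.75 - 74.875}{2} = 5.4375$$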
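The quartile computations follow the same pattern as the grouped median, with N/4 and 3N/4 in place of N/2. A minimal sketch: the helper name `grouped_quantile` and its `fraction` parameter are assumptions made for illustration, and class limits are assumed to be integers so that the lower boundary is the lower limit minus 0.5.

```python
# (lower limit, frequency), lowest class first; class size i = 3
classes = [
    (60, 2), (63, 0), (66, 3), (69, 4), (72, 3), (75, 4), (78, 8),
    (81, 6), (84, 10), (87, 7), (90, 2), (93, 0), (96, 1),
]
class_size = 3


def grouped_quantile(classes, class_size, fraction):
    """L + ((fraction * N - Cf) / f) * i for the class containing the quantile."""
    n = sum(freq for _, freq in classes)
    target = fraction * n
    cum_below = 0  # Cf: cumulative frequency below the current class
    for lower_limit, freq in classes:
        if freq > 0 and cum_below + freq >= target:
            return (lower_limit - 0.5) + (target - cum_below) / freq * class_size
        cum_below += freq
    raise ValueError("frequency table is empty")


q1 = grouped_quantile(classes, class_size, 0.25)
q3 = grouped_quantile(classes, class_size, 0.75)
quartile_deviation = (q3 - q1) / 2

print(q1, q3, quartile_deviation)  # 74.875 85.75 5.4375
```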
PERCENTILE RANKS
$$z = \frac{X - M}{S}$$
z = standard score
X = score
M = mean
S = standard deviation
Which of the following two scores has a
better relative position?
a. A score of 59 on a test with a mean of 48 and
standard deviation of 11
b. A score of 59 on a test with a mean of 48 and
standard deviation of 6
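Solution:
a. $z = \frac{59 - 48}{11} = 1$
b. $z = \frac{59 - 48}{6} \approx 1.83$
Score b has the better relative position: it lies 1.83 standard deviations above its mean, while score a lies only 1 standard deviation above its mean.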
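The comparison of the two scores can be sketched in a few lines. The function name `z_score` is an assumption made for illustration.

```python
def z_score(score, mean, sd):
    """Standard score: number of standard deviations a score lies from the mean."""
    return (score - mean) / sd


z_a = z_score(59, mean=48, sd=11)
z_b = z_score(59, mean=48, sd=6)

print(round(z_a, 2), round(z_b, 2))  # 1.0 1.83
print("b" if z_b > z_a else "a")     # b
```

The raw score and the mean are identical in both cases; only the spread of the test differs, so the smaller standard deviation gives the higher standard score.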
A distribution is left-skewed (or negatively skewed) if the values are more spread out on the left, meaning that some low values are likely to be outliers.
Skewed Distribution of Test Scores
•Negatively Skewed Distribution
A distribution is right-skewed (or positively skewed) if the values are more spread out on the right. It has a tail pulled toward the right.
•Positively Skewed Distribution
Skewed Distribution of Test Scores
•Difficult
•Easy
•Average/moderately difficult
•Partly easy-partly difficult