RM Module 3
Descriptive Statistics
RESEARCH METHODOLOGY
Statistics may be defined as the science of collecting and analyzing data.
3/5/2024
• Secondary data: Data that have already been collected by someone else and have already passed through the statistical process.
3. Based on Variables
• Univariate Data: Univariate data involves only one variable. Example: the height of all students in a class.
Mean
Numerical Example:
Calculate the arithmetic mean of the marks obtained by 9 students, given below:
Numerical Example:
The weights, recorded to the nearest gram, of 60 apples picked at random from a consignment are given below:
Solution: Using the short-cut method formula for the arithmetic mean of grouped data:
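The direct and short-cut calculations of the arithmetic mean can be sketched in Python as follows. This is a minimal illustration, assuming the usual short-cut formula, mean = A + Σ(f·d) / Σf with d = x − A, where A is an assumed mean. The function names and the data are hypothetical and are not the tables from the slides.

```python
def arithmetic_mean(values):
    """Direct method: sum of the values divided by their count."""
    return sum(values) / len(values)


def shortcut_mean(midpoints, freqs, assumed_mean):
    """Short-cut method for grouped data: A + sum(f * d) / sum(f), with d = x - A."""
    total_f = sum(freqs)
    total_fd = sum(f * (x - assumed_mean) for x, f in zip(midpoints, freqs))
    return assumed_mean + total_fd / total_f


# Hypothetical data for illustration only (not the tables from the slides).
marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]
midpoints = [70, 90, 110, 130, 150]   # class midpoints
freqs = [5, 10, 20, 15, 10]           # class frequencies, 60 items in total

print(arithmetic_mean(marks))                             # 40.0
print(shortcut_mean(midpoints, freqs, assumed_mean=110))  # 115.0
```

Any assumed mean A gives the same result; choosing a midpoint near the centre of the table only keeps the deviations small for hand calculation.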
Geometric Mean:
“The nth root of the product of “n” positive values is called geometric mean”
Numerical example of geometric mean for both grouped and ungrouped data:
Calculate the geometric mean of the marks obtained by 9 students, given below:
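The definition of the geometric mean can be sketched in Python as follows. The sketch uses logarithms rather than multiplying all values directly, which is an implementation choice made here to avoid overflow. The grouped version assumes each class is represented by its midpoint. Function names are hypothetical.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values, computed via logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def grouped_geometric_mean(midpoints, freqs):
    """Geometric mean for grouped data: exp(sum(f * ln x) / sum(f))."""
    if any(x <= 0 for x in midpoints):
        raise ValueError("the geometric mean requires positive values")
    weighted_logs = sum(f * math.log(x) for x, f in zip(midpoints, freqs))
    return math.exp(weighted_logs / sum(freqs))


# Small check: the geometric mean of 2 and 8 is sqrt(16) = 4.
print(geometric_mean([2, 8]))
```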
Given the following frequency distribution of weights of 60 apples, calculate the geometric mean for grouped data.
Harmonic Mean:
“The reciprocal of the arithmetic mean of the reciprocals of the values is called the harmonic mean”
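The definition of the harmonic mean translates directly into code. The sketch below is a minimal illustration with hypothetical function names. The grouped version assumes each class is represented by its midpoint and weighted by its frequency.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals: n / sum(1 / x)."""
    return len(values) / sum(1 / v for v in values)


def grouped_harmonic_mean(midpoints, freqs):
    """Harmonic mean for grouped data: sum(f) / sum(f / x)."""
    return sum(freqs) / sum(f / x for x, f in zip(midpoints, freqs))


# Small check: the harmonic mean of 40 and 60 is 2 / (1/40 + 1/60) = 48.
print(harmonic_mean([40, 60]))
```

Both functions require non-zero values, since a zero would make a reciprocal undefined.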
Numerical example of harmonic mean for both grouped and ungrouped data:
Calculate the harmonic mean of the marks obtained by 9 students, given below:
Given the following frequency distribution of weights of 60 apples, calculate the harmonic mean for grouped data.
Median
When the observations are arranged in ascending or descending order, the value that divides the distribution into two equal parts is called the median.
Numerical example of median for both grouped and ungrouped data:
Calculate the median of the marks obtained by 9 students, given below:
Arrange the data in ascending order:
Calculate the median of the marks obtained by 10 students, given below:
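The median of ungrouped data, for both an odd and an even number of observations, can be sketched as follows. The data are hypothetical and are not the marks from the slides. The even case assumes the common convention of averaging the two middle values.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical marks for illustration only.
nine_marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]
ten_marks = nine_marks + [50]

print(median(nine_marks))  # 39
print(median(ten_marks))   # 40.0
```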
Numerical examples:
The following distribution relates to the number of assistants in 50 retail establishments.
The number of values above the median equals the number of values below it, i.e. 50% of the data fall above the median and 50% fall below it.
Numerical example: Find the median for the distribution of examination marks given below:
Mode
A mode is defined as the value that has the highest frequency in a given set of values. It is the value that appears the greatest number of times.
Example: In the data set 2, 4, 5, 5, 6, 7, the mode is 5, since it appears in the set twice.
The value occurring most frequently in a set of observations is its mode. In other words, the mode of data is the observation having the highest frequency in a set of data.
Example: The following table represents the number of wickets taken by a bowler in 10 matches. Find the mode of the given set of data.
It can be seen that 2 wickets is the count the bowler took most frequently across the matches. Hence, the mode of the given data is 2.
Mode For Grouped Data
• In the case of a grouped frequency distribution, the mode cannot be found just by looking at the frequencies. To determine the mode in such cases we first identify the modal class. The mode lies inside the modal class and is given by the formula:
Mode = l + [(f1 − f0) / (2f1 − f0 − f2)] × h
Where,
• l = lower limit of the modal class
• h = size of the class interval
• f1 = frequency of the modal class
• f0 = frequency of the class preceding the modal class
• f2 = frequency of the class succeeding the modal class
Example 1: Find the mode of the given data set: 3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48.
Solution: In the list of numbers (3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48), 15 is the mode, since it appears more times than any other number in the set.
Example 2: Find the mode of the data set 4, 4, 4, 9, 15, 15, 15, 27, 37, 48.
Solution: Given: 4, 4, 4, 9, 15, 15, 15, 27, 37, 48 is the data set.
A data set can have more than one mode if more than one value occurs with the same highest frequency.
Hence, here both 4 and 15 are modes of the set.
Example 3: Find the mode of 3, 6, 9, 16, 27, 37, 48.
Solution: If no value in a data set appears more than once, then the set has no mode.
Hence, for the set 3, 6, 9, 16, 27, 37, 48, there is no mode.
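Examples 1 to 3 can be checked with a short Python sketch. The function name is hypothetical. It follows the convention used in these examples: every value tied for the highest frequency is a mode, and a set in which no value repeats has no mode.

```python
from collections import Counter


def modes(values):
    """Return all values with the highest frequency; [] if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                     # no value appears more than once
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([3, 3, 6, 9, 15, 15, 15, 27, 27, 37, 48]))  # [15]
print(modes([4, 4, 4, 9, 15, 15, 15, 27, 37, 48]))      # [4, 15]
print(modes([3, 6, 9, 16, 27, 37, 48]))                 # []
```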
Example 4: In a class of 30 students, the marks obtained in mathematics (out of 50) are tabulated below. Calculate the mode of the given data.
Solution:
The maximum class frequency is 12 and the class interval corresponding to this frequency is 20 – 30. Thus, the modal class is 20 – 30.
Lower limit of the modal class (l) = 20
Size of the class interval (h) = 10
Frequency of the modal class (f1) = 12
Frequency of the class preceding the modal class (f0) = 5
Frequency of the class succeeding the modal class (f2) = 8
Substituting these values in the formula, we get:
Mode = 20 + [(12 − 5) / (2 × 12 − 5 − 8)] × 10 = 20 + 70/11 ≈ 26.36
Bimodal, Trimodal & Multimodal (More than one mode)
When there are two modes in a data set, then the set is called bimodal.
For example, the modes of Set A = {2,2,2,3,4,4,5,5,5} are 2 and 5, because both 2 and 5 are repeated three times in the given set.
When there are three modes in a data set, then the set is called trimodal.
For example, the modes of set A = {2,2,2,3,4,4,5,5,5,7,8,8,8} are 2, 5 and 8.
When there are four or more modes in a data set, then the set is called multimodal.
If no value in the given set of observations appears more than once, the set is said to have no mode.
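The grouped-data calculation of Example 4 can be reproduced with a short sketch. It assumes the standard interpolation formula Mode = l + [(f1 − f0) / (2f1 − f0 − f2)] × h, written with the symbols defined for the modal class. The function and parameter names are hypothetical.

```python
def grouped_mode(lower, width, f1, f0, f2):
    """Mode for grouped data: l + (f1 - f0) / (2*f1 - f0 - f2) * h.

    lower : lower limit of the modal class (l)
    width : size of the class interval (h)
    f1    : frequency of the modal class
    f0    : frequency of the class preceding the modal class
    f2    : frequency of the class succeeding the modal class
    """
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Values from Example 4: modal class 20 - 30, f1 = 12, f0 = 5, f2 = 8.
print(round(grouped_mode(lower=20, width=10, f1=12, f0=5, f2=8), 2))  # 26.36
```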
g1 = [Σ (Yi − Ȳ)³ / N] / s³
where Ȳ is the mean, s is the standard deviation, and N is the number of data points.
The above formula for skewness is referred to as the Fisher-Pearson coefficient of skewness.
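The Fisher-Pearson coefficient can be sketched in Python as follows. The sketch assumes the usual form of the coefficient, the average cubed deviation from the mean divided by s³, with s computed using N (not N − 1) in the denominator. The function name and the sample data are hypothetical.

```python
import math


def fisher_pearson_skewness(data):
    """Skewness g1 = [sum((y - mean)^3) / N] / s^3, with s computed using N."""
    n = len(data)
    mean = sum(data) / n
    # Assumption: population-style standard deviation (divide by N).
    s = math.sqrt(sum((y - mean) ** 2 for y in data) / n)
    third_moment = sum((y - mean) ** 3 for y in data) / n
    return third_moment / s ** 3


print(fisher_pearson_skewness([1, 2, 3, 4, 5]))   # symmetric data: 0.0
print(fisher_pearson_skewness([1, 1, 1, 10]))     # long right tail: positive
```

The function is undefined when all values are identical, because s is then zero.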
Skewness describes the asymmetry of a dataset about its mean, that is, the degree to which the distribution deviates from symmetry.
When the random variable is discrete with N equally likely values, the equation for the variance becomes
σ² = (1/N) Σ (xi − μ)²
For the case of a discrete random variable with exactly N equally likely values [that is, p(xi) = 1/N], the equation for the mean of a discrete distribution reduces to
μ = (1/N) Σ xi
Standard Deviation
The standard deviation is a measure of spread or scatter in the population expressed in the original units.
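For N equally likely values, the mean, variance and standard deviation can be sketched as follows. The sketch assumes the usual discrete definitions μ = Σ xi p(xi) and σ² = Σ (xi − μ)² p(xi) with p(xi) = 1/N. The function name and the fair-die example are hypothetical.

```python
import math


def equally_likely_stats(values):
    """Mean, variance and standard deviation when each value has probability 1/N."""
    n = len(values)
    mu = sum(values) / n                               # mean
    variance = sum((x - mu) ** 2 for x in values) / n  # divide by N, not N - 1
    sigma = math.sqrt(variance)                        # same units as the data
    return mu, variance, sigma


# Hypothetical example: the six faces of a fair die.
mu, variance, sigma = equally_likely_stats([1, 2, 3, 4, 5, 6])
print(mu, variance, sigma)
```

Taking the square root brings the measure of spread back to the original units of the data, which is why the standard deviation is often easier to interpret than the variance.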
C(a, b) = a! / [b! (a − b)!] is the number of combinations of a items taken b at a time.
The probability of finding one or fewer nonconforming items in the sample is
Note that the mean and variance of the Poisson distribution are both equal to the parameter λ.
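This property of the Poisson distribution can be checked numerically. The sketch below assumes the standard probability function p(x) = e^(−λ) λ^x / x! and sums a truncated series. The function names, the choice λ = 4 and the truncation at 100 terms are illustrative assumptions.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability p(x) = exp(-lam) * lam**x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_mean_and_variance(lam, terms=100):
    """Approximate the mean and variance by summing the first `terms` probabilities."""
    probs = [poisson_pmf(x, lam) for x in range(terms)]
    mean = sum(x * p for x, p in enumerate(probs))
    variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, variance


mean, variance = poisson_mean_and_variance(lam=4.0)
print(mean, variance)  # both approximately equal to 4
```

The truncation is harmless for small λ because the tail probabilities beyond 100 are negligible. A larger λ would need more terms.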
Probability Distribution