Chap 3
BASIC STATISTICS
3.0 Frequency Distributions
Raw data are collected data that have not been organized numerically. An example is the set of
trigonometry test scores of 93 students in Table 3.1, obtained from an alphabetical listing of two sections in the
College of Engineering.
Table 3.1
TRIGONOMETRY TEST SCORES OF 93 STUDENTS
39 10 6 13 9 17 11
21 12 14 9 12 36 18
23 15 7 12 12 11 13
26 10 7 17 10 12 10
29 13 23 10 11 16 14
20 15 18 19 15 13 13
15 23 23 9 12 12 27
16 13 15 13 11 14 38
21 19 12 21 13 7 34
12 9 18 9 16 14 31
14 18 30 25 6 16 15
9 7 10 17 14 8 41
39 42 15 47 42 48 37
15 48
An ungrouped frequency distribution is simply an arrangement of the data, usually from the highest to
the lowest value, that shows the frequency of occurrence of each value of the variable. It is used when there is
a small number of observations.
Example 3.2: Table 3.2 shows an ungrouped frequency distribution of the ages
of 30 contractual employees of Jollibee in Baliuag.
Table 3.2
Ages of 30 Contractual Employees
of Jollibee in Baliuag
Age Frequency
22 3
21 5
20 5
19 11
18 6
Total 30
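The tallying behind an ungrouped frequency distribution can be sketched in Python. This is only an illustration: the raw ages are not listed in the text, so they are rebuilt here from the frequencies in Table 3.2, and the function name `ungrouped_frequency` is hypothetical.

```python
from collections import Counter

def ungrouped_frequency(values):
    """Return (value, frequency) pairs ordered from highest to lowest value."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)

# Assumed raw data: ages rebuilt from the frequencies of Table 3.2.
ages = [22] * 3 + [21] * 5 + [20] * 5 + [19] * 11 + [18] * 6

table = ungrouped_frequency(ages)
for age, freq in table:
    print(age, freq)
print("Total", sum(freq for _, freq in table))
```

Sorting in descending order follows the "highest to lowest" arrangement described above; the total is printed as a check against the number of observations.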
A grouped frequency distribution is an arrangement of data that shows the frequency of occurrence of
values falling within arbitrarily defined ranges of the variable known as class intervals. An example is given in
Table 3.3.
Table 3.3
Heights of 100 Male Students at B.U.
Height (in) Number of Students
60-62 5
63-65 18
66-68 42
69-71 27
72-74 8
Total 100
Class Intervals, Class Limits, and Class Boundaries
A symbol defining a class, such as 60-62 in Table 3.3, is called a class interval.
The end numbers, 60 and 62, are called class limits; the smaller number (60) is the lower class limit, and
the larger number (62) is the upper class limit.
A class interval that has either no upper class limit or no lower class limit indicated is called an open
class interval.
If heights are recorded to the nearest inch, the class interval 60-62 theoretically includes all
measurements from 59.5000 to 62.5000 in. These numbers are called class boundaries, or true class limits; the
smaller number (59.5) is the lower class boundary, and the larger number (62.5) is the upper class boundary.
The class boundaries are obtained by adding the upper limit of one class interval to the lower limit of the next
higher interval and dividing by 2.
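As a rough sketch, the class boundaries of Table 3.3 can be computed as follows. The helper `class_boundaries` is a hypothetical name, and it assumes equally spaced class intervals listed in ascending order, so that every boundary lies half the gap between adjacent limits beyond each class limit. For interior boundaries this is the same as averaging the upper limit of one class and the lower limit of the next.

```python
def class_boundaries(intervals):
    """Return (lower, upper) class boundaries for ascending class intervals.

    `intervals` is a list of (lower_limit, upper_limit) pairs, assumed to be
    equally spaced.
    """
    # Gap between one upper limit and the next lower limit (1 in Table 3.3).
    gap = intervals[1][0] - intervals[0][1]
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in intervals]

# Class limits taken from Table 3.3.
heights = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
boundaries = class_boundaries(heights)
for (lo, hi), (lb, ub) in zip(heights, boundaries):
    print(f"{lo}-{hi}: {lb} to {ub}")
```

For the first class this gives 59.5 to 62.5, matching the values stated above.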
Class Size and Class Mark/Midpoint
The size of the class interval is the difference between the lower and upper class boundaries and is also
referred to as the class size. It is also equal to the difference between two successive lower class limits or two
successive upper class limits. For the data of Table 3.3, the class size c = 62.5 – 59.5 = 3.
The class mark is the midpoint of the class interval and is obtained by adding the lower and upper class
limits and dividing by 2. Thus the midpoint or class mark of the interval 60 – 62 is (60 + 62)/2 = 61.
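The two definitions can be checked with a short Python sketch using the class limits of Table 3.3. The function names `class_mark` and `class_size` are illustrative only.

```python
def class_mark(lower_limit, upper_limit):
    """Midpoint of a class interval: (lower limit + upper limit) / 2."""
    return (lower_limit + upper_limit) / 2

def class_size(lower_boundary, upper_boundary):
    """Class size: upper class boundary minus lower class boundary."""
    return upper_boundary - lower_boundary

# Class limits from Table 3.3.
lower_limits = [60, 63, 66, 69, 72]
upper_limits = [62, 65, 68, 71, 74]

marks = [class_mark(lo, hi) for lo, hi in zip(lower_limits, upper_limits)]
size = class_size(59.5, 62.5)

print(marks)  # class marks of the five classes
print(size)   # class size from the boundaries of the first class
# The class size also equals the difference between successive lower limits.
print(lower_limits[1] - lower_limits[0])
```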
1. Determine the range, that is, the difference between the highest and lowest values in the raw data.
2. Determine the class interval size (i) by dividing the range into a convenient number of class intervals.
The number of class intervals is usually taken between 5 and 20, depending on the data. The result, if
not exact, may be rounded up to the next unit if the scores to be grouped are expressed as whole numbers.
3. Class intervals are also chosen so that the class marks (or midpoints) coincide with the actually observed
data.
4. Determine the number of observations falling into each class interval; that is, find the class frequencies.
This is best done using a tally, or score sheet.
5. Get the sum of the frequency column and check it against the total number of observations or cases.
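The steps above can be sketched in Python for the scores of Table 3.1. This is one possible grouping, not the only valid one. It rests on several assumptions that are not in the text: the scores are whole numbers, nine class intervals are requested, and the lowest class starts at the lowest score. The function name `grouped_frequency` is hypothetical.

```python
import math

# Trigonometry test scores of 93 students (Table 3.1).
scores = [
    39, 10, 6, 13, 9, 17, 11,
    21, 12, 14, 9, 12, 36, 18,
    23, 15, 7, 12, 12, 11, 13,
    26, 10, 7, 17, 10, 12, 10,
    29, 13, 23, 10, 11, 16, 14,
    20, 15, 18, 19, 15, 13, 13,
    15, 23, 23, 9, 12, 12, 27,
    16, 13, 15, 13, 11, 14, 38,
    21, 19, 12, 21, 13, 7, 34,
    12, 9, 18, 9, 16, 14, 31,
    14, 18, 30, 25, 6, 16, 15,
    9, 7, 10, 17, 14, 8, 41,
    39, 42, 15, 47, 42, 48, 37,
    15, 48,
]

def grouped_frequency(data, num_classes):
    """Return (lower_limit, upper_limit, frequency) rows for whole-number data."""
    lowest, highest = min(data), max(data)
    data_range = highest - lowest                  # range of the data
    size = math.ceil(data_range / num_classes)     # interval size, rounded up
    table = []
    lower = lowest                                 # assumption: start at the lowest score
    while lower <= highest:
        upper = lower + size - 1
        freq = sum(1 for x in data if lower <= x <= upper)   # class frequency
        table.append((lower, upper, freq))
        lower += size
    return table

table = grouped_frequency(scores, num_classes=9)   # 9 classes is an assumed choice
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")

# Check the frequency total against the number of observations.
print("Total (n) =", sum(freq for _, _, freq in table))
```

With these choices the interval size is 5, an odd number, so every class mark is a whole number and can coincide with an observed score. A different number of classes or starting point gives a different but equally acceptable table.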
Example 3.3: Follow the preceding steps in constructing a grouped frequency distribution for the
trigonometry test scores of 93 students shown in Table 3.1.
1. Determine the range:
Total (n) =
Example 3.4: Follow the preceding steps to complete the derived frequency distributions in the following
table for the trigonometry test scores of 93 students shown in Table 3.4.
n = 93
2. A frequency polygon is a line graph of the class frequency plotted against the class mark. It can be
obtained by connecting the midpoints of the tops of the rectangles in the histogram.
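As an illustration, the points of a frequency polygon can be computed from a grouped frequency distribution. The sketch below uses the heights of Table 3.3 rather than the test scores, and the name `polygon_points` is hypothetical. Joining the resulting points with straight lines gives the polygon.

```python
def polygon_points(intervals, frequencies):
    """Return (class_mark, frequency) points for a frequency polygon."""
    return [((lo + hi) / 2, f) for (lo, hi), f in zip(intervals, frequencies)]

# Class limits and frequencies from Table 3.3.
intervals = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
frequencies = [5, 18, 42, 27, 8]

points = polygon_points(intervals, frequencies)
for mark, freq in points:
    print(mark, freq)
```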
Example 3.5.1: Plot a frequency polygon and a histogram for the data in Example 3.3.
[Graph: Frequency (vertical axis) against Test Score (horizontal axis)]
An ogive is the graph of a cumulative frequency distribution. For the “less than” type, the
graph is constructed by plotting the cumulative frequencies against the upper class boundaries. For
the “greater than” type, the cumulative frequencies are plotted against the
lower class boundaries.
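The points of both ogives can be sketched in Python. The example uses the class boundaries and frequencies of Table 3.3 for illustration; the variable names are assumptions, with `cf_less` and `cf_greater` standing for the cf< and cf> columns.

```python
from itertools import accumulate

# Class boundaries and frequencies from Table 3.3.
lower_boundaries = [59.5, 62.5, 65.5, 68.5, 71.5]
upper_boundaries = [62.5, 65.5, 68.5, 71.5, 74.5]
frequencies = [5, 18, 42, 27, 8]

# "Less than" cumulative frequencies: running total from the lowest class up.
cf_less = list(accumulate(frequencies))

# "Greater than" cumulative frequencies: running total from the highest class down.
cf_greater = list(accumulate(reversed(frequencies)))[::-1]

# Points to plot: cf< against upper boundaries, cf> against lower boundaries.
less_than_ogive = list(zip(upper_boundaries, cf_less))
greater_than_ogive = list(zip(lower_boundaries, cf_greater))

print(less_than_ogive)
print(greater_than_ogive)
```

The last "less than" value and the first "greater than" value both equal the total number of observations, which is a quick check on the arithmetic.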
Example 3.5.2: Plot the cf< and cf> ogives for the data in Example 3.4.
[Graph: Frequency (vertical axis) against Test Score (horizontal axis)]
EXERCISES:
1. (a) Arrange the numbers 17, 45, 38, 27, 6, 48, 11, 57, 34, and 22 in an array.
(b) Determine the range of these numbers.
2. Give the class mark, the class boundaries, and the interval size for each of the following:
(a) 10 – 19
(b) 16 – 18
(c) 1.5 – 5.0
(d) 1.5 – 4.99
(e) 12.85 – 13.43
3. The prices of a certain commodity are gathered from different stores and grouped into the following classes:
8.00 – 8.49
7.50 – 7.99
7.00 – 7.49
6.50 – 6.99
6.00 – 6.49
5.50 – 5.99
5.00 – 5.49
Determine
(a) the class boundaries
(b) the class marks
(c) the interval size
4. The midpoints of the class intervals of a frequency distribution are 14, 21, 28, 35, 42, and 49. Find the
following:
(a) interval size
(b) class limits
(c) class boundaries
5. The following entrance test scores were obtained by a sample of 100 freshmen at Baliuag University:
48 41 31 65 57 59 39 38 53 66
44 37 48 33 54 47 26 55 35 47
62 21 69 35 31 51 40 45 22 41
57 34 62 55 27 47 32 46 43 68
43 67 60 46 50 53 59 71 32 39
56 40 50 55 47 29 53 52 44 56
62 51 35 60 58 47 39 36 64 33
34 32 28 80 63 52 84 56 67 56
47 35 28 57 61 36 46 44 58 43
72 34 53 39 61 43 63 37 54 44