Chap 3



BASIC STATISTICS
3.0 Frequency Distributions

3.1 Raw Data and Arrays

Raw data are collected data that have not been organized numerically. An example is the set of
Trigonometry test scores of 93 students in Table 3.1, obtained from an alphabetical listing of two sections in the
college of engineering.

Table 3.1
TRIGONOMETRY TEST SCORES OF 93 STUDENTS

39 10 6 13 9 17 11
21 12 14 9 12 36 18
23 15 7 12 12 11 13
26 10 7 17 10 12 10
29 13 23 10 11 16 14
20 15 18 19 15 13 13
15 23 23 9 12 12 27
16 13 15 13 11 14 38
21 19 12 21 13 7 34
12 9 18 9 16 14 31
14 18 30 25 6 16 15
9 7 10 17 14 8 41
39 42 15 47 42 48 37
15 48

An array is an arrangement of raw numerical data in ascending or descending order of magnitude.

From an array, one can easily identify the range, which is the difference between the largest and smallest
numbers. Moreover, the mid-value and the most frequently occurring value can be determined by mere inspection.
Example 3.1:
The ages, arranged in ascending order, of the 12 employees of the personnel department of a company
are as follows:
20, 21, 23, 23, 26, 29, 32, 36, 38, 39, 45, 48
Analysis of the data: Of the twelve employees, six are in their 20’s, four are in their 30’s, and two in their 40’s.
The youngest employee is 20 years old and the oldest is 48. Moreover, four employees are below 25 years and ten are
below 40.
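
The inspection in Example 3.1 can also be carried out mechanically. The following is a minimal Python sketch, assuming the twelve ages are held in a list; the variable names are illustrative and not part of the text.

```python
from statistics import median, mode

# Ages of the 12 employees from Example 3.1.
ages = [20, 21, 23, 23, 26, 29, 32, 36, 38, 39, 45, 48]

array = sorted(ages)                # array in ascending order
data_range = array[-1] - array[0]   # largest minus smallest
mid_value = median(array)           # mean of the two middle values when n is even
most_frequent = mode(array)         # most frequently occurring value

# Counts quoted in the analysis of the data.
below_25 = sum(1 for a in array if a < 25)
below_40 = sum(1 for a in array if a < 40)

print(data_range, mid_value, most_frequent, below_25, below_40)
```

Sorting first mirrors the definition of an array; the range, mid-value, and most frequent value are then read from the ordered list.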

3.2 Ungrouped and Grouped Frequency Distributions

An ungrouped frequency distribution is an arrangement of the data, usually from the highest to
the lowest value, that shows the frequency of occurrence of each value of the variable. It is used when there is
a small number of observations.

Example 3.2: Table 3.2 shows an ungrouped frequency distribution of the ages
of 30 contractual employees of Jollibee in Baliuag.
Table 3.2
Ages of 30 Contractual Employees
of Jollibee in Baliuag

Age Frequency

22 3
21 5
20 5
19 11
18 6

Total 30
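
A tally of this kind can be sketched in Python with a counter. The raw listing of the 30 ages is not given in the text, so the list below is rebuilt from the frequencies of Table 3.2 and its order is an assumption; only the counts matter.

```python
from collections import Counter

# Raw ages rebuilt from Table 3.2 (order assumed; the original listing is not given).
ages = [22] * 3 + [21] * 5 + [20] * 5 + [19] * 11 + [18] * 6

frequency = Counter(ages)

# Distinct values from highest to lowest, each with its frequency.
table = sorted(frequency.items(), reverse=True)
total = sum(frequency.values())

for age, f in table:
    print(f"{age:>3} {f:>3}")
print("Total", total)
```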

A grouped frequency distribution is an arrangement of data that shows the frequency of occurrence of
values falling within arbitrarily defined ranges of the variable known as class intervals. An example is given in
Table 3.3.
Table 3.3
Heights of 100 Male Students at B.U.

Height (in) Number of Students

60-62 5
63-65 18
66-68 42
69-71 27
72-74 8

Total 100
Class Intervals, Class Limits, and Class Boundaries
A symbol defining a class, such as 60-62 in Table 3.3, is called a class interval.
The end numbers, 60 and 62 are called class limits; the smaller number (60) is the lower class limit, and
the larger number (62) is the upper class limit.
A class interval that has either no upper class limit or no lower class limit indicated is called an open
class interval.
If heights are recorded to the nearest inch, the class interval 60-62 theoretically includes all
measurements from 59.5000 to 62.5000 in. These numbers are called class boundaries, or true class limits; the
smaller number (59.5) is the lower class boundary, and the larger number (62.5) is the upper class boundary.
The class boundaries are obtained by adding the upper limit of one class interval to the lower limit of the next
higher interval and dividing by 2.
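
As a rough illustration, the class boundaries of Table 3.3 can be computed from the class limits. The sketch below assumes equally spaced classes, so half the gap between adjacent classes (here 0.5) is applied below the first class and above the last one as well; the function name `class_boundaries` is hypothetical.

```python
# Class limits (lower, upper) from Table 3.3, heights in inches.
class_limits = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

def class_boundaries(limits):
    """Return (lower boundary, upper boundary) for each class interval.

    Assumes equally spaced classes, so the gap between the first two
    classes also applies at both ends of the table.
    """
    gap = limits[1][0] - limits[0][1]   # 63 - 62 = 1
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]

boundaries = class_boundaries(class_limits)
print(boundaries)
```

Adding half the gap to an upper limit gives the same value as averaging that upper limit with the next lower limit, for example (62 + 63)/2 = 62.5.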
Class Size and Class Mark/Midpoint
The size of the class interval is the difference between the lower and upper class boundaries and is also
referred to as the class size. It is also equal to the difference between two successive lower class limits or two
successive upper class limits. For the data of Table 3.3, the class size c = 62.5 – 59.5 = 3.

The class mark is the midpoint of the class interval and is obtained by adding the lower and upper class
limits and dividing by 2. Thus the midpoint or class mark of the interval 60 – 62 is (60 + 62)/2 = 61.
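
The same arithmetic can be applied to every class of Table 3.3. This is a small sketch under the assumption that all classes have the same size; the variable names are illustrative.

```python
# Class limits (lower, upper) from Table 3.3.
class_limits = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

# Class mark: mean of the lower and upper class limits.
class_marks = [(lo + hi) / 2 for lo, hi in class_limits]

# Class size: difference between two successive lower class limits.
class_size = class_limits[1][0] - class_limits[0][0]

print(class_marks)
print(class_size)
```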

3.3 General Rules for Forming Frequency Distributions


1. Determine the largest and smallest numbers in the raw data and thus find the range (the difference between
the largest and smallest numbers).

2. Determine the class interval size (i) by dividing the range into a convenient number of class intervals.
The number of class intervals is usually taken between 5 and 20, depending on the data. The result, if
not exact, may be rounded up to the next unit if the scores to be grouped are expressed as whole numbers.

3. Class intervals are also chosen so that the class marks (or midpoints) coincide with the actually observed
data.

4. Determine the number of observations falling into each class interval; that is, find the class frequencies.
This is best done using a tally, or score sheet.

5. Get the sum of the frequency column and check it against the total number of observations or cases.

Example 3.3: Follow the preceding steps in constructing a grouped frequency distribution for the
trigonometry test scores of 93 students shown in Table 3.1.
1. Determine the range:

2. Determine the class interval size:

3. Complete the grouped frequency distribution in the following table:

Class Interval (X) Tally Midpoint (Xm) Frequency (f)

Total (n) =
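
One way to check a completed table is to let a short program do the tally. The Python sketch below follows the rules of Section 3.3 for the scores of Table 3.1. The class size of 5 and the first lower limit of 5 are assumptions chosen so that the class marks are whole numbers; other choices are equally valid and give a different table.

```python
import math

# Trigonometry test scores of 93 students (Table 3.1).
scores = [
    39, 10, 6, 13, 9, 17, 11,
    21, 12, 14, 9, 12, 36, 18,
    23, 15, 7, 12, 12, 11, 13,
    26, 10, 7, 17, 10, 12, 10,
    29, 13, 23, 10, 11, 16, 14,
    20, 15, 18, 19, 15, 13, 13,
    15, 23, 23, 9, 12, 12, 27,
    16, 13, 15, 13, 11, 14, 38,
    21, 19, 12, 21, 13, 7, 34,
    12, 9, 18, 9, 16, 14, 31,
    14, 18, 30, 25, 6, 16, 15,
    9, 7, 10, 17, 14, 8, 41,
    39, 42, 15, 47, 42, 48, 37,
    15, 48,
]

# Step 1: range.
largest, smallest = max(scores), min(scores)
score_range = largest - smallest

# Step 2 (assumed choices, not prescribed by the text).
class_size = 5
first_lower = 5
num_classes = math.ceil((largest - first_lower + 1) / class_size)

# Class limits (lower, upper) and midpoints.
classes = [(first_lower + k * class_size,
            first_lower + (k + 1) * class_size - 1)
           for k in range(num_classes)]
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Steps 4 and 5: class frequencies and their total.
frequencies = [sum(1 for s in scores if lo <= s <= hi) for lo, hi in classes]

for (lo, hi), xm, f in zip(classes, midpoints, frequencies):
    print(f"{lo:>2}-{hi:<2}  {xm:>4}  {f:>3}")
print("Total (n) =", sum(frequencies))
```

The final line is the check in rule 5: the total of the frequency column should equal the number of scores.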

3.4 Derived Frequency Distribution


There are three types of frequency distributions that may be derived from a simple frequency table:
1. Relative Frequency Distribution. It indicates what percentage of the observations falls within each class. The relative
frequencies are obtained by dividing the class frequencies by n and then multiplying by 100%;
that is,
rf = (f / n) × 100%
2. Cumulative Frequency Distribution. When the successive frequencies are added from the smallest to the
largest class interval, we obtain a “less than cumulative frequency distribution” (cf<). When the
frequencies are cumulated starting from that of the largest class interval, the result is a “greater than
cumulative frequency distribution” (cf>).
3. Cumulative Percentage Frequency Distribution. It is derived using one of the following methods:
(a) Cumulate the relative frequencies
(b) Divide the cumulative frequencies by n and multiply by 100%; that is,
cpf = (cf / n) × 100%
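
The three derived distributions can be illustrated with the heights of Table 3.3. The following is a minimal Python sketch; the names `cf_less`, `cf_greater`, `cpf_less`, and `cpf_greater` are illustrative stand-ins for cf<, cf>, cpf<, and cpf>.

```python
from itertools import accumulate

# Class intervals and frequencies from Table 3.3.
classes = ["60-62", "63-65", "66-68", "69-71", "72-74"]
f = [5, 18, 42, 27, 8]
n = sum(f)

# Relative frequencies, in percent.
rf = [100 * fi / n for fi in f]

# "Less than" cumulative frequencies: add from the smallest class upward.
cf_less = list(accumulate(f))

# "Greater than" cumulative frequencies: add from the largest class downward.
cf_greater = list(accumulate(reversed(f)))[::-1]

# Cumulative percentage frequencies, method (b).
cpf_less = [100 * c / n for c in cf_less]
cpf_greater = [100 * c / n for c in cf_greater]

for row in zip(classes, f, cf_less, cf_greater, rf, cpf_less, cpf_greater):
    print(*row)
```

Because n = 100 here, each percentage equals the corresponding frequency, which makes the table easy to verify by hand.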

Example 3.4: Follow the preceding steps to complete the derived frequency distributions in the following
table for the trigonometry test scores of 93 students shown in Table 3.4.

X f cf< cf> rf cpf< cpf>

n = 93

3.5 Graphical Representations of Frequency Distributions


Histograms and frequency polygons are two graphic representations of frequency distributions.
1. A histogram, or frequency histogram, consists of a set of rectangles having (a) bases on a horizontal axis
(the x-axis), with centers at the class marks and lengths equal to the class interval sizes, and (b) areas
proportional to the class frequencies.

2. A frequency polygon is a line graph of the class frequency plotted against the class mark. It can be
obtained by connecting the midpoints of the tops of the rectangles in the histogram.
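
Before drawing either graph it can help to list the coordinates involved. The sketch below uses Table 3.3 and only computes the rectangles and points; it does not draw anything, and the tuple layouts are an assumption made for illustration.

```python
# Table 3.3: class boundaries (in inches) and class frequencies.
boundaries = [(59.5, 62.5), (62.5, 65.5), (65.5, 68.5), (68.5, 71.5), (71.5, 74.5)]
frequencies = [5, 18, 42, 27, 8]

# Histogram: one rectangle per class as (left edge, width, height).
# With equal class sizes, heights equal to the frequencies give areas
# proportional to the frequencies.
rectangles = [(lo, hi - lo, f) for (lo, hi), f in zip(boundaries, frequencies)]

# Frequency polygon: points (class mark, frequency), to be joined by lines.
polygon = [((lo + hi) / 2, f) for (lo, hi), f in zip(boundaries, frequencies)]

print(rectangles)
print(polygon)
```

Each polygon point lies at the midpoint of the top of the corresponding rectangle, which is why the polygon can be traced directly over the histogram.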
Example 3.5.1: Plot a frequency polygon and a histogram for the data in Example 3.3.

[Blank grid for the graph: vertical axis Frequency, horizontal axis Test Score]

An ogive is the graph of a cumulative frequency distribution. For the “less than” type, the
graph is constructed by plotting the cumulative frequencies against the upper class boundaries. For
the “greater than” type, the cumulative frequencies are plotted against the
lower class boundaries.
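
The points to be plotted for each ogive can be listed in the same way. This sketch again uses Table 3.3 and assumes the class boundaries are already known; the variable names are illustrative.

```python
from itertools import accumulate

# Table 3.3: class boundaries (in inches) and class frequencies.
boundaries = [(59.5, 62.5), (62.5, 65.5), (65.5, 68.5), (68.5, 71.5), (71.5, 74.5)]
frequencies = [5, 18, 42, 27, 8]

cf_less = list(accumulate(frequencies))
cf_greater = list(accumulate(reversed(frequencies)))[::-1]

# "Less than" ogive: cumulative frequency against the upper class boundary.
less_than_ogive = [(hi, c) for (_, hi), c in zip(boundaries, cf_less)]

# "Greater than" ogive: cumulative frequency against the lower class boundary.
greater_than_ogive = [(lo, c) for (lo, _), c in zip(boundaries, cf_greater)]

print(less_than_ogive)
print(greater_than_ogive)
```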
Example 3.5.2: Plot the cf< and cf> ogives for the data in Example 3.4.

[Blank grid for the graph: vertical axis Frequency, horizontal axis Test Score]

EXERCISES:
1. (a) Arrange the numbers 17, 45, 38, 27, 6, 48, 11, 57, 34, and 22 in an array.
(b) Determine the range of these numbers.

2. Give the class mark, the class boundaries, and the interval size for each of the following:
(a) 10 – 19
(b) 16 – 18
(c) 1.5 – 5.0
(d) 1.5 – 4.99
(e) 12.85 – 13.43

3. The prices of a certain commodity are gathered from different stores and grouped into the following classes:
8.00 – 8.49
7.50 – 7.99
7.00 – 7.49
6.50 – 6.99
6.00 – 6.49
5.50 – 5.99
5.00 – 5.49
Determine
(a) the class boundaries
(b) the class marks
(c) the interval size

4. The midpoints of the class intervals of a frequency distribution are 14, 21, 28, 35, 42, and 49. Find the
following:
(a) interval size
(b) class limits
(c) class boundaries

5. The following entrance test scores were obtained by a sample of 100 freshmen in Baliuag University:

48 41 31 65 57 59 39 38 53 66
44 37 48 33 54 47 26 55 35 47
62 21 69 35 31 51 40 45 22 41
57 34 62 55 27 47 32 46 43 68
43 67 60 46 50 53 59 71 32 39
56 40 50 55 47 29 53 52 44 56
62 51 35 60 58 47 39 36 64 33
34 32 28 80 63 52 84 56 67 56
47 35 28 57 61 36 46 44 58 43
72 34 53 39 61 43 63 37 54 44

(a) Construct a frequency distribution using 10 as the number of class intervals.
(b) Derive the cumulative frequency distributions cf< and cf>.
(c) Derive rf and cpf.
(d) Construct a histogram and a frequency polygon.
(e) Construct the cf< and cf> ogives.
