Unit 1.2
Continuous variable: Continuous variables can take on an unlimited number of values between the
lowest and highest points of measurement
Frequency
Frequency distribution: Tabular summary of the data showing the frequency (or relative frequency)
of items in each of several non-overlapping classes.
Relative frequency:
Relative frequency of a value = number of times the value occurs (f) / number of observations in
the data set (N)
R.F. = f/N
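The formula R.F. = f/N can be sketched in a few lines of Python. This is only an illustration: the function name relative_frequencies and the small sample are hypothetical and do not come from these notes.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {value: f / N} for every distinct value in `values`."""
    n = len(values)            # N: number of observations
    counts = Counter(values)   # f: number of times each value occurs
    return {value: f / n for value, f in counts.items()}

# Hypothetical sample, for illustration only.
sample = [2, 3, 3, 5, 3, 2, 5, 3]
rf = relative_frequencies(sample)
print(rf[3])  # 3 occurs 4 times out of 8 observations
```

The relative frequencies of all distinct values always add up to 1, which is a quick check on the calculation.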
A histogram is a plot that lets you discover, and show, the underlying frequency
distribution (shape) of a set of continuous data. This allows the inspection of the data for
its underlying distribution (e.g., normal distribution), outliers, skewness, etc. An example
of a histogram, and the raw data it was constructed from, is shown below:
36 25 38 46 55 68 72 55 36 38
67 45 22 48 91 46 52 61 58 55
How do you construct a histogram from a continuous variable?
To construct a histogram from a continuous variable you first need to split the data into
intervals, called bins (or classes). In the example above, age has been split into bins, with
each bin representing a 10-year period starting at 20 years. Each bin contains
the number of occurrences of scores in the data set that fall within that
bin. For the above data set, the frequencies in each bin are tabulated
along with the scores that contributed to the frequency in each bin:
Bin          Frequency   Scores included in bin
20 to 30     2           25, 22
30 to 40     4           36, 38, 36, 38
40 to 50     4           46, 45, 48, 46
50 to 60     5           55, 55, 52, 58, 55
60 to 70     3           68, 67, 61
70 to 80     1           72
80 to 90     0           (none)
90 to 100    1           91
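The tabulation for the age data can be reproduced with a short Python sketch. The function name bin_scores is hypothetical, and the sketch assumes that a score lying exactly on a boundary goes into the upper interval (none of the scores in this data set lie on a boundary, so the choice does not affect the result here).

```python
ages = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
        67, 45, 22, 48, 91, 46, 52, 61, 58, 55]

def bin_scores(values, start=20, width=10):
    """Group values into intervals [start, start+width), [start+width, ...).

    Assumes every value is >= start.
    """
    n_intervals = (max(values) - start) // width + 1
    groups = [[] for _ in range(n_intervals)]
    for v in values:
        groups[(v - start) // width].append(v)
    return groups

groups = bin_scores(ages)
for i, scores in enumerate(groups):
    low = 20 + 10 * i
    # Interval, frequency, and the scores that contributed to it.
    print(f"{low} to {low + 10}: {len(scores)} {scores}")
```

The 80 to 90 interval comes out empty, which corresponds to an "absent" bar in the histogram.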
Notice that, unlike a bar chart, there are no "gaps" between the bars (although some
bars might be "absent", reflecting zero frequency). This is because a histogram
represents a continuous data set, and as such, there are no gaps in the data (although
you will have to decide whether to round up or round down scores that fall on the boundaries
of bins, i.e. class intervals).
There is no hard and fast rule for the number of classes; still, a reasonable rule of thumb is
After determining the frequencies and relative frequencies, calculate the height of each
rectangle:
Rectangle height = relative frequency of the class / class width
The resulting rectangle heights are usually called densities, and the vertical scale is the
density scale. This also gives a correct picture when class widths are equal.
Relative frequency = class width × density = rectangle width × rectangle height
= rectangle area
In a histogram, it is the area of a bar (its height multiplied by the width of the bin, or
class interval) that indicates the frequency of occurrences within that bin, not the height
alone. The height of the bars is often incorrectly read as the frequency because many
histograms have equally wide bins, and under these circumstances the height of a bar does
reflect the frequency.
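A minimal Python sketch of the density calculation for classes of unequal width follows. The function name densities and the class boundaries in edges are assumptions chosen for illustration; the ages are the 20 scores listed earlier in these notes, and a score on a boundary is assumed to belong to the upper class.

```python
ages = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
        67, 45, 22, 48, 91, 46, 52, 61, 58, 55]

def densities(values, edges):
    """Return rectangle heights: relative frequency / class width per class."""
    n = len(values)
    heights = []
    for low, high in zip(edges, edges[1:]):
        f = sum(1 for v in values if low <= v < high)  # class frequency
        rel_freq = f / n                                # relative frequency
        heights.append(rel_freq / (high - low))         # density
    return heights

# Hypothetical class boundaries: four 10-year classes and one 40-year class.
edges = [20, 30, 40, 50, 60, 100]
heights = densities(ages, edges)

# Area of each rectangle = width * height = relative frequency of the class.
areas = [(high - low) * h for (low, high), h in zip(zip(edges, edges[1:]), heights)]
print(heights)
print(sum(areas))
```

The wide last class holds the same number of scores as the 50 to 60 class, yet its rectangle is only a quarter as tall, because the same relative frequency is spread over four times the width. The areas still add up to 1.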
Shapes of histogram
Symmetric
Positively skewed
Negatively skewed
Unit 1.3
Measures of location
Mean / simple mean / arithmetic mean: (x1 + x2 + x3 + ... + xN)/N
Median: Middle value of the ordered series
The ((N+1)/2)th item in an odd series
Average of the (N/2)th item and the ((N/2)+1)th item in an even series
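The mean and median can be sketched in Python as follows. The function names are illustrative, the data are the 20 ages listed in Unit 1.2, and the median follows the usual convention: the middle item of the sorted series when N is odd, and the average of the (N/2)th and ((N/2)+1)th items when N is even.

```python
def mean(xs):
    """Arithmetic mean: sum of the observations divided by N."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle value of the sorted series."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                  # the ((N+1)/2)th item, counting from 1
    return (s[mid - 1] + s[mid]) / 2   # average of (N/2)th and ((N/2)+1)th items

ages = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
        67, 45, 22, 48, 91, 46, 52, 61, 58, 55]
print(mean(ages))    # sum is 1014 over N = 20 observations
print(median(ages))  # average of the 10th and 11th sorted values, 48 and 52
```

Python's standard library offers statistics.mean and statistics.median, which can be used to cross-check these results.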
Trimmed Mean
2.0 2.4 2.5 2.6 2.6 2.7 2.7 2.8 3.0 3.1 3.2 3.3
3.4 3.4 3.6 3.6 3.6 3.6 3.7 4.4 4.6 4.7 4.8 5.3
N= 26