Statistics
One does not need to be a statistical wizard to grasp the basic mathematical
concepts needed to understand major measurement issues.
Statistics
In the singular sense, the branch of science which deals with the collection, presentation,
analysis, and interpretation of data
Areas of Statistics
Inferential Statistics comprise those methods concerned with the analysis of sample
data leading to predictions or inferences about the population
Data
Classification of Data
Examples:
Color - red, blue, yellow, green
Sex - male, female
Statistics EASSESS1
Examples:
weight - 160 lbs, 25 kg, 77 mg, etc.
height - 34 in., 5 cm, 5ft. 6 in., etc
Array: data arranged either from highest to lowest or from lowest to highest
Measurement
Scales of Measurement
Nominal Scales:
• Examples: Gender - Female =1, Male = 2; Eye Color - Brown =1, Blue =2,
Green = 3.
Ordinal Scales:
• contains the properties of the nominal level, but the numbers assigned to
categories of any variable may be ranked or ordered in some low-to-high
manner
Interval level
• contains the properties of the ordinal level but the distances between any two
numbers on the scale are of known sizes
• the number zero does not imply the absence of the characteristic under
consideration (thus, the zero point is arbitrary)
Ratio level
• contains the properties of the interval level but it has a true zero point, that is, the
number zero indicates the absence of the characteristic under consideration
Examples:
Distributions
• Frequency Distributions
refers to the tabular arrangement of data by classes or categories together with the
number of observations falling within each class
1. Single-value grouping
• a form of frequency distribution where the distinct values are used as classes
Example:
The following data represent the number of school-age children from a sample of
30 families in a certain residential area.
0 0 3 2 0 2
0 1 4 4 1 1
0 0 3 3 0 0
2 1 1 0 2 0
2 0 0 2 1 2
Number of children   Frequency   Relative frequency
0                    12          0.40
1                    6           0.20
2                    7           0.23
3                    3           0.10
4                    2           0.07
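The single-value grouping above can be reproduced with a short Python sketch. It only tallies the 30 values listed in the example; the variable names are illustrative and not part of the notes.

```python
from collections import Counter

# Number of school-age children in the 30 sampled families (from the example)
children = [
    0, 0, 3, 2, 0, 2,
    0, 1, 4, 4, 1, 1,
    0, 0, 3, 3, 0, 0,
    2, 1, 1, 0, 2, 0,
    2, 0, 0, 2, 1, 2,
]

frequency = Counter(children)   # each distinct value is its own class
n = len(children)               # total number of observations

# One row per distinct value: (value, frequency, relative frequency)
table = [
    (value, frequency[value], round(frequency[value] / n, 2))
    for value in sorted(frequency)
]

for value, f, rf in table:
    print(f"{value:>5} {f:>10} {rf:>10.2f}")
```

The frequencies should add up to the total number of observations, which is a quick check on the tally.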
Definitions:
Class limits: the smallest and the largest values that can fall in a given class
Class boundaries: numbers that are halfway between the upper limit of a class and the
lower limit of the next class
Class size: length of the class interval; computed by taking the difference
between two successive upper (or two successive lower) class boundaries or class
limits
Class mark: midpoint of an interval; computed by taking the average of the lower
and upper class limits of a given class interval
Relative frequency: obtained by dividing the class frequency by the total number of
observations
R = highest – lowest
C* = R/K
4. Determine the class size (C) by rounding-off C* to a number that is easy to work
with.
7. Sum the frequency column and check against the total number of observations.
Example:
The following are the scores of 4th year high school students in a certain
achievement test in Mathematics.
11 14 14 14 16 17 20 24
25 25 28 30 30 31 31 33
34 34 35 35 37 37 37 38
39 41 41 42 44 44 44 45
47 47 47 47 51 53 53 54
54 55 55 56 56 56 57 57
58 58 58 58 59 60 60 60
61 62 62 62 65 66 66 66
66 67 68 68 74 75 76 76
81 87 92 92 97
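A minimal sketch of the surviving steps (R = highest – lowest, C* = R/K, round C* to a convenient C, then tally and check the total), applied to the scores above. The number of classes K = 9 and the starting lower limit of 10 are assumptions made only for illustration; the notes do not state which values to use.

```python
scores = [
    11, 14, 14, 14, 16, 17, 20, 24,
    25, 25, 28, 30, 30, 31, 31, 33,
    34, 34, 35, 35, 37, 37, 37, 38,
    39, 41, 41, 42, 44, 44, 44, 45,
    47, 47, 47, 47, 51, 53, 53, 54,
    54, 55, 55, 56, 56, 56, 57, 57,
    58, 58, 58, 58, 59, 60, 60, 60,
    61, 62, 62, 62, 65, 66, 66, 66,
    66, 67, 68, 68, 74, 75, 76, 76,
    81, 87, 92, 92, 97,
]

R = max(scores) - min(scores)   # range: highest - lowest
K = 9                           # ASSUMED number of classes
C_star = R / K                  # approximate class size C*
C = 10                          # C* rounded to a number that is easy to work with
lowest_limit = 10               # ASSUMED lower limit of the first class

# Class limits: 10-19, 20-29, ..., 90-99
classes = [(lo, lo + C - 1) for lo in range(lowest_limit, max(scores) + 1, C)]

# Tally the number of scores falling within each class
frequencies = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo:>3}-{hi:<3} {f:>3}")

# Final step: the frequency column must sum to the number of observations
total = sum(frequencies)
```

A different choice of K or of the first lower limit gives a different, equally legitimate table.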
DESCRIPTIVE STATISTICS
The three most commonly used measures of central tendency are the mean, the median,
and the mode.
Properties of Mean
1. The mean is sensitive to the exact values of all the scores in the distribution
2. The sum of the deviations about the mean is zero.
3. The mean is very sensitive to the extreme scores when the scores are not
balanced at both ends of the distribution
4. The sum of the squared deviations of all the scores about the mean is a
minimum.
5. Among the measures of central tendency, the mean is least subject to sampling
variation.
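Properties 2 and 4 can be checked numerically. The sketch below uses a small hypothetical set of scores (not taken from the notes), chosen so that the arithmetic is exact.

```python
# Hypothetical scores, used only to illustrate the properties
scores = [2, 4, 6, 8, 10]
mean = sum(scores) / len(scores)

# Property 2: the deviations about the mean sum to zero
sum_of_deviations = sum(x - mean for x in scores)


def sum_of_squares_about(center):
    """Sum of squared deviations of the scores about any chosen value."""
    return sum((x - center) ** 2 for x in scores)


# Property 4: the sum of squares is smallest when taken about the mean
about_mean = sum_of_squares_about(mean)
about_five = sum_of_squares_about(5)
about_seven = sum_of_squares_about(7)

print(mean, sum_of_deviations, about_mean, about_five, about_seven)
```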
When to Use
$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$$
Where: X - scores
N - number of cases
Weighted Mean ($\bar{X}_w$)
Sometimes the values in a distribution are not equally important. To give each value
its proper importance, it is necessary to assign weights and then calculate the
weighted mean.
$$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_N X_N}{w_1 + w_2 + \dots + w_N} = \frac{\sum_{i=1}^{N} w_i X_i}{\sum_{i=1}^{N} w_i}$$
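A short sketch of the weighted mean. The function name and the sample grades and weights are hypothetical, chosen only to show the calculation.

```python
def weighted_mean(values, weights):
    """Sum of weight-times-value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical grades, with the first one counted twice as heavily
grades = [80, 90, 70]
weights = [2, 1, 1]
print(weighted_mean(grades, weights))
```

When all weights are equal, the result reduces to the ordinary mean.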
1. Using Classmark
$$\bar{X} = \frac{\sum fX}{N}$$
where: X - classmark
$$\bar{X} = X_0 + \left(\frac{\sum fu}{N}\right) i$$
where: u - code
f - frequency
i - class width
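The two grouped-data formulas should give the same mean. The sketch below checks this on the 77 Mathematics scores from the earlier example, grouped into classes of width 10 starting at 10-19. That grouping is an assumption made here for illustration, as is taking X0 to be the class mark of the class coded 0.

```python
# ASSUMED grouping of the 77 scores: classes 10-19, 20-29, ..., 90-99
class_marks = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
frequencies = [6, 5, 14, 11, 17, 15, 4, 2, 3]
n = sum(frequencies)

# Using class marks: sum of f*X divided by N
mean_classmark = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Using codes: X0 is the class mark of the class coded 0 (ASSUMED: 50-59)
x0 = 54.5
i = 10  # class width
codes = [round((x - x0) / i) for x in class_marks]  # -4, -3, ..., 4
mean_coded = x0 + (sum(f * u for f, u in zip(frequencies, codes)) / n) * i

print(round(mean_classmark, 2), round(mean_coded, 2))
```

Any class can serve as the origin; choosing one near the middle keeps the codes small.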
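Median ($\tilde{X}$)
The median is the middle value when the data are arranged in ascending or
descending order.
To determine the position of the median, use $\frac{N+1}{2}$.
$$\tilde{X} = LB + \left(\frac{\frac{N}{2} - F}{f}\right) i$$
where: LB - lower boundary of the median class
F - cumulative frequency below the median class
f - frequency of the median class
N - number of cases
i - class width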
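A minimal sketch of the grouped-data median formula. The function name is hypothetical, and the sample distribution is an assumed grouping of the 77 Mathematics scores into classes of width 10 (10-19, 20-29, and so on).

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Median of grouped data: LB + ((N/2 - F) / f) * i."""
    n = sum(frequencies)
    half = n / 2
    cumulative = 0  # F: cumulative frequency below the current class
    for lb, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            # This is the median class: the first class whose cumulative
            # frequency reaches N/2
            return lb + ((half - cumulative) / f) * width
        cumulative += f
    raise ValueError("the distribution has no observations")


# ASSUMED grouping: lower boundaries 9.5, 19.5, ..., 89.5
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
frequencies = [6, 5, 14, 11, 17, 15, 4, 2, 3]
median = grouped_median(lower_boundaries, frequencies, 10)
print(round(median, 2))
```

The grouped median is an estimate; it need not equal the middle value of the raw scores.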
Mode
When to use
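$$\hat{X} = LB + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i$$
where: LB - lower boundary of the modal class
$\Delta_1$ - difference between the frequency of the modal class and the frequency of the class
just above it
$\Delta_2$ - difference between the frequency of the modal class and the frequency of the class
just below it
i - class width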
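A sketch of the grouped-data mode formula. It follows the common textbook convention in which the first difference uses the class immediately preceding the modal class (lower values) and the second uses the class immediately following it. Whether "above" and "below" in the notes mean the same thing is not stated, so treat that mapping as an assumption. The function name and the grouping of the scores are also illustrative.

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """Mode of grouped data: LB + (d1 / (d1 + d2)) * i."""
    # Modal class: the class with the highest frequency (first one on ties)
    m = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f_modal = frequencies[m]
    # ASSUMED convention: d1 uses the preceding class, d2 the following class;
    # a missing neighbour is treated as having frequency 0
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    d1 = f_modal - f_prev
    d2 = f_modal - f_next
    if d1 + d2 == 0:
        raise ValueError("no single modal class: neighbouring frequencies are equal")
    return lower_boundaries[m] + (d1 / (d1 + d2)) * width


# ASSUMED grouping of the 77 Mathematics scores into classes of width 10
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
frequencies = [6, 5, 14, 11, 17, 15, 4, 2, 3]
mode = grouped_mode(lower_boundaries, frequencies, 10)
print(mode)
```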
True Mode
$$\hat{X} = 3\,\text{Median} - 2\,\text{Mean} = 3\tilde{X} - 2\bar{X}$$
Crude Mode
Measures of Variability
A measure of variability is a value that describes how far apart the scores are spread.
A deviation score or value tells us how far away the raw score or value departs from the
mean.
1. Range. It is the difference between the highest score and the lowest score in the
distribution.
R= highest score – lowest score
= HS-LS
2. Mean Deviation. It is the average distance between the mean and the scores in the
distribution.
$$MD = \frac{\sum \left| X - \bar{X} \right|}{N}$$
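The range and the mean deviation can be computed directly from their definitions. The scores below are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical scores, used only for illustration
scores = [2, 4, 6, 8, 10]

# Range: highest score minus lowest score
score_range = max(scores) - min(scores)

# Mean deviation: average absolute distance of the scores from the mean
mean = sum(scores) / len(scores)
mean_deviation = sum(abs(x - mean) for x in scores) / len(scores)

print(score_range, mean_deviation)
```

The absolute value matters: without it the deviations would cancel and sum to zero.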
3. Variance. It is computed by squaring each deviation from the mean, adding them up,
and dividing by the number of cases.
Sample Variance:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
Population Variance:
$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$
Ungrouped Data
Sample SD
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
Population SD
$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$
Alternate Formulas
Sample SD
$$s = \sqrt{\frac{\sum X^2 - \frac{\left(\sum X\right)^2}{n}}{n - 1}}$$
Population SD
$$\sigma = \frac{1}{N}\sqrt{N \sum X^2 - \left(\sum X\right)^2}$$
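As a check, the definitional and alternate formulas can be compared on a small data set. The sketch uses Python's standard `statistics` module and hypothetical scores that are not from the notes.

```python
import math
import statistics

# Hypothetical scores, used only for illustration
scores = [2, 4, 6, 8, 10]
n = len(scores)

# Definitional formulas, via the standard library
sample_variance = statistics.variance(scores)        # divides by n - 1
population_variance = statistics.pvariance(scores)   # divides by N
sample_sd = statistics.stdev(scores)
population_sd = statistics.pstdev(scores)

# Alternate (computational) formulas, using only the sums of X and X squared
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
alt_sample_sd = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
alt_population_sd = math.sqrt(n * sum_x2 - sum_x ** 2) / n

print(sample_variance, population_variance)
print(round(sample_sd, 4), round(alt_sample_sd, 4))
print(round(population_sd, 4), round(alt_population_sd, 4))
```

The sample versions divide by n - 1, so they are always slightly larger than the population versions on the same data.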
MEASURES OF LOCATION
numbers below which a specified amount or percentage of data must lie and are
oftentimes used to find the position of a specific piece of data in relation to the entire set
of data
Percentiles
◼ values that divide an ordered set of data into 100 equal parts
◼ the ith percentile (i=1,2,...,99) , denoted by Pi, is a value below which i% of the data
must lie
ii. If ni/100 is a whole number, Pi is the mean of the (ni/100)th and (ni/100
+ 1)th ordered values.
iii. If ni/100 is not a whole number, Pi is the kth ordered value where k is the closest
whole number greater than ni/100.
Deciles
• the ith decile (i=1,2,...,9) , denoted by Di, is a value below which 10i% of the data
must lie
Quartiles
• the ith quartile (i=1,2,3) , denoted by Qi, is a value below which 25i% of the data
must lie
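The percentile rule (whole number versus not a whole number) can be sketched as follows, with deciles and quartiles obtained as the 10i-th and 25i-th percentiles. The function names are hypothetical, and the data are the 30 family counts from the single-value grouping example.

```python
def percentile(data, i):
    """Return P_i (i = 1, ..., 99) using the n*i/100 rule."""
    if not 1 <= i <= 99:
        raise ValueError("i must be between 1 and 99")
    if not data:
        raise ValueError("data must not be empty")
    ordered = sorted(data)              # the values must be in order first
    n = len(ordered)
    k, remainder = divmod(n * i, 100)   # integer arithmetic avoids float error
    if remainder == 0:
        # n*i/100 is whole: mean of the k-th and (k+1)-th ordered values
        return (ordered[k - 1] + ordered[k]) / 2
    # n*i/100 is not whole: take the next whole number, i.e. the (k+1)-th value
    return ordered[k]


def decile(data, i):
    return percentile(data, 10 * i)


def quartile(data, i):
    return percentile(data, 25 * i)


children = [
    0, 0, 3, 2, 0, 2, 0, 1, 4, 4, 1, 1, 0, 0, 3,
    3, 0, 0, 2, 1, 1, 0, 2, 0, 2, 0, 0, 2, 1, 2,
]
print(quartile(children, 1), percentile(children, 50), quartile(children, 3))
```

Other textbooks and software use slightly different interpolation rules, so results may differ from library functions.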
MEASURES OF SKEWNESS
[Figure: distribution curve with the mode (Mo) and median (Md) marked]
Correlation (r)
• Size - a correlation of 0.0 indicates the absence of a linear relationship; the closer the
correlation gets to 1.0 in absolute value (that is, to +1.0 or -1.0), the stronger the
relationship; a correlation of +1.0 or -1.0 indicates a perfect relationship.
Scatterplots
• Scatterplots: a graph depicting the relationship between two variables (X & Y). Each
mark in the scatterplot actually represents two scores, an individual’s scores on
the X and the Y variables.
General Guidelines:
• 0.70 Strong
Mr. Valid established the reliability of his test using the test-retest method. He administered
the same test to the same students with a one-month interval. The following are the scores
of the students:
Student   First administration   Second administration
A         23                     23
B         24                     26
C         15                     14
D         24                     25
E         17                     18
F         18                     17
G         23                     23
H         24                     25
I         25                     27
J         34                     35
K         23                     23
L         22                     24
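The correlation between the two administrations can be computed with the standard Pearson product-moment formula. The notes do not give a formula for r, so the choice of Pearson's r and the function name are assumptions made here.

```python
import math

# Scores of students A to L on the two administrations (from the table above)
first = [23, 24, 15, 24, 17, 18, 23, 24, 25, 34, 23, 22]
second = [23, 26, 14, 25, 18, 17, 23, 25, 27, 35, 23, 24]


def pearson_r(x, y):
    """Pearson product-moment correlation (computational formula)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(first, second)
print(round(r, 2))
```

A value of r close to 1.0 means that students kept nearly the same relative standing across the two administrations.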