Chapter 4

WHAT IS DATA?
Data management is the development,
execution and supervision of plans, policies,
programs and practices that control, protect,
deliver and enhance the value of data and
information assets.
Statistics is a branch of mathematics dealing
with the collection, organization, presentation,
analysis and interpretation of data.
1. DATA GATHERING
• Direct or interview method
• Indirect or questionnaire method
• Registration method
• Observation method
• Experimental method
2. DATA ORGANIZATION AND
PRESENTATION
Data collected or obtained in
whatever manner are called
raw data.
Data collected can be classified
according to the scale of
measurement used.
4 Levels of Measurements
Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale
• Nominal scale assigns names or
labels to observations in arbitrary
order.
• Ordinal scale assigns numbers or
labels to observations with
implied ordering.
• Interval scale assigns real numbers
to observations to reflect the distance
between rank positions of the
respondents or objects in equal units.
• Ratio scale assigns numbers to
observations to reflect the existence
of a true absolute zero point as its
origin.
TEXTUAL FORM
• Makes use of words, sentences and
paragraphs in presentation.
TABULAR FORM OR
PRESENTATION
• Is a systematic presentation of data in rows and
columns.
• It is used when related numerical facts need to be arrayed.
• It should be simple.
• It should focus the reader's attention on the data rather
than on the form.
• It should make the meaning and significance of the
information being presented clear.
PARTS OF STATISTICAL TABLES
Heading – shows the table number, title and head note.
Title – is a brief statement of the nature, classification and time reference of the
information presented and the area to which the statistics refer.
Head Note – is a statement enclosed in brackets between the title and the top
rule of the table that provides …
Box Head – is the portion that contains the column heads, which describe the
data in each column.
Stub – is the first column on the left, which describes the data on the given row.
Footnote – is a statement inserted at the bottom of the table.
Source Note – is the exact citation of the source of data, which is usually included to
acknowledge the origin of the data.
Table 1. Relationship Between Academic Performance and the Identified Variables.
Variables   Correlation Coefficient   Significance   Remarks
GPA         0.7461                    0.000          HS
MI          0.4015                    0.000          S
IQ          0.9891                    0.000          S
Gender      0.1452                    0.084          NS
NS – Not Significant
S – Significant
HS – Highly Significant
GRAPHICAL
FORM/PRESENTATION
• Shows numerical values or
relationships in a pictorial form. It
makes use of graphs, symbols or
visual aids.
Properties of a Good Chart
• Accurate – the dimensional aspects should reflect the
highest degree of accuracy possible within the practical
limits imposed by the expert draftsman or the electronic
computer being used.
• Simple – the basic design should be simple and straightforward
and not loaded with irrelevant or trivial symbols
and ornamentation.
• Clear – it should be easily understood.
• Attractive – it should be designed and constructed to
attract and hold attention by having a neat,
dignified and professional appearance. It should be stylish.
LINE GRAPH
• Is used when:
(1) data cover a long period of time;
(2) several series are compared;
(3) movements are to be emphasized;
(4) trends are to be established;
(5) estimates are to be forecast.
[Figure: line graph of Number of Graduates by Year, 2012 to 2017]
BAR GRAPH
• Is used when numerical values of an item over a period of time
are compared. It consists of rectangular bars where the height of
each bar represents the quantity or frequency for each category.
[Figure: bar graph by Year, 2012 to 2017]
PIE GRAPH
• Is used to show percentages or the composition of a whole
by its parts.
[Figure: pie graph with one slice per year, 2012 to 2017]
PICTOGRAPH OR PICTOGRAM
• Is used to immediately suggest the nature
of data.
Organizing collected numerical data
can be done in two ways:
1. ARRAY – is an arrangement of the numerical data/values according to order of
magnitude, either ascending or descending.
2. FREQUENCY DISTRIBUTION TABLE – is a condensed version of an array. It
categorizes the numerical data into intervals or classes. It has the following parts:
• Classes are mutually exclusive categories defined by a lower limit and an upper limit
with equal intervals.
• Class frequency is the number of observations in each class.
• Class mark or class midpoint is the midpoint of a class. It is used in computing the mean and some measures of
variability.
• Cumulative frequency tells the sum of the frequencies up to (<CF) or from (>CF) a particular class of interest.
• Relative frequency tells the percentage of observations in a particular class of
interest.
Steps in Constructing a Frequency Distribution
with Equal Class Size
1. Determine the range R of the numerical data.
R = Highest value – Lowest value
2. Determine the number of classes K into which the data are to be grouped using
Sturges' Approximation:
K = 1 + 3.322 log N
where N is the total number of values to be grouped and log is the base-10 logarithm.
3. Determine the class size C.
C = R/K
4. Determine the lower limit of the first class.
5. Construct the frequency distribution table.
Remarks
1. Sturges' Approximation is just a guide and a flexible rule.
2. The number of classes should be large enough to demonstrate the
major characteristics of the data, yet not so large as to lose
the advantage of summarizing raw data. For instance, where the
highest observed value fails to be included in the classes constructed,
the number of classes should not be increased just to accommodate
the highest value; increase the class size instead.
3. The number of classes is usually taken between 5 and 20, depending on the
nature of the data, when Sturges' Approximation is not used.
4. Class intervals are chosen so that the class marks coincide with
actually observed data. However, class boundaries should not
coincide with actually observed data.
144 112 156 122 168 172 141 159 127 154
156 145 134 137 123 149 144 160 136 154
142 138 159 151 147 150 126 152 147 136
135 132 146 133 150 122 139 149 152 129
131 155 116 140 145 135 160 125 172 163

1. The range R = 172 – 112 = 60
2. K = 1 + 3.322 log 50 = 6.643978 ≈ 7
3. C = 60/7 = 8.571428571 ≈ 9
4. The lower limit is 112
5. Construct the Frequency Distribution Table.
Class Intervals   Frequency   Class Mark   Class Boundary   Relative Frequency (%)   <CF   >CF
112 – 120         2           116          111.5 – 120.5    4                        2     50
121 – 129         7           125          120.5 – 129.5    14                       9     48
130 – 138         10          134          129.5 – 138.5    20                       19    41
139 – 147         12          143          138.5 – 147.5    24                       31    31
148 – 156         11          152          147.5 – 156.5    22                       42    19
157 – 165         5           161          156.5 – 165.5    10                       47    8
166 – 174         3           170          165.5 – 174.5    6                        50    3
Total             50
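The five steps can be sketched in Python as follows. This is a minimal illustration, not the chapter's own procedure: it assumes integer data, assumes K and C are rounded up, and the variable names are made up for the example. Tallying the 50 values as printed gives 11 and 12 for the classes 139 – 147 and 148 – 156, the reverse of the table, so one of the printed values may be a transcription slip.

```python
import math

data = [
    144, 112, 156, 122, 168, 172, 141, 159, 127, 154,
    156, 145, 134, 137, 123, 149, 144, 160, 136, 154,
    142, 138, 159, 151, 147, 150, 126, 152, 147, 136,
    135, 132, 146, 133, 150, 122, 139, 149, 152, 129,
    131, 155, 116, 140, 145, 135, 160, 125, 172, 163,
]

n = len(data)
value_range = max(data) - min(data)          # Step 1: R = 172 - 112 = 60
k = math.ceil(1 + 3.322 * math.log10(n))     # Step 2: Sturges, rounded up (assumption)
class_size = math.ceil(value_range / k)      # Step 3: C = R / K, rounded up (assumption)
first_lower = min(data)                      # Step 4: lower limit of the first class

# Step 5: tally each class; for integer data the upper limit is lower + C - 1.
rows = []
cumulative = 0
for i in range(k):
    lower = first_lower + i * class_size
    upper = lower + class_size - 1
    freq = sum(lower <= x <= upper for x in data)
    cumulative += freq
    class_mark = (lower + upper) / 2
    relative = 100 * freq / n
    rows.append((lower, upper, freq, class_mark, relative, cumulative))

for lower, upper, freq, mark, rel, cf in rows:
    print(f"{lower}-{upper}  f={freq:2d}  mark={mark:5.1f}  rel={rel:4.1f}%  <CF={cf}")
```

The greater-than cumulative frequency can be obtained the same way by accumulating the frequencies from the last class backwards.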
Graphical Presentation of Frequency
Distributions with Equal Class Size
Steps in Constructing Frequency Charts
1. Label either class limits or class marks along the horizontal axis.
2. Plot the frequency of each class along the vertical axis above the class mark of the
corresponding class.
3. The vertical scale must always include zero.
4. The horizontal scale must include only the range of the observed data and one extra
interval at each end.
5. The vertical axis height should be approximately ¾ the length of horizontal axis.
FREQUENCY HISTOGRAM
- is a set of vertical bars whose areas are proportional to the frequencies
represented.
[Figure: frequency histogram of No. of Students by Grades, with class boundaries from 49.5 to 99.5]
FREQUENCY POLYGON
- is a line chart plotted along the same scale as the histogram. The class
frequency is plotted against the class mark.
LESS THAN OGIVE
- the less than cumulative frequency is plotted against the upper class
limit.
GREATER THAN OGIVE
- the greater than cumulative frequency is plotted against the
lower class limit.
3. Data Analysis and Interpretation
• Is the process of making sense of numerical data
that has been collected, analyzed, and
presented.
• The method of describing the characteristics of
individual objects or groups of individuals under
study is known as descriptive statistics.
• Analyzing and interpreting data is known as
inferential statistics.
Descriptive Statistics gives a single value which
represents the set of values.
There are three methods of describing a set of
values:
• Measures of central tendency
• Measures of dispersion
• Measures of skewness and kurtosis
Inferential Statistics
• Are techniques wherein samples can be used to
make generalizations about the population from
which the samples were drawn.
• Inferential statistics arise out of the fact that
sampling naturally incurs sampling error and thus
a sample is not expected to perfectly represent the
population. There are two methods of inferential
statistics: estimation of parameter(s) and
hypothesis testing.
B. Measures of Central Tendency
• Are measures indicating the center of a set of data
arranged in order of magnitude.
There are three measures of central
tendency:
• Mean
• Median
• Mode
Mean or Arithmetic mean
• Is the most popular and well-known measure of
central tendency.
• It can be used with both discrete and continuous
data.
For Ungrouped data: The mean is the most
frequently used measure of central tendency. The
mean is denoted by the symbol µ (read as “mu”) for the population and x̄
(read as “x bar”) for the sample.
$$\mu = \frac{\sum X_i}{N} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N}$$
Where:
µ = population mean
Xi = ith observed value in the population
N = total number of observations
$$\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$
Where:
x̄ = sample mean
xi = ith observed value in the sample
∑ = sum of all values
n = total number of observations
Example 1. The items listed below represent the scores of seven
BS Applied Statistics students during the final examination.
Compute the mean score.
89, 75, 90, 85, 78, 87 and 80
$$\bar{x} = \frac{\sum x_i}{n} = \frac{89 + 75 + 90 + 85 + 78 + 87 + 80}{7} = \frac{584}{7} \approx 83.43$$
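As a quick check of Example 1, the same arithmetic can be written in a few lines of Python. The variable names are illustrative only.

```python
scores = [89, 75, 90, 85, 78, 87, 80]

total = sum(scores)                 # sum of all observed values
sample_mean = total / len(scores)   # x-bar = (sum of x_i) / n

print(total, round(sample_mean, 2))
```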
For Grouped Data: The mean for grouped data is
denoted by µG or x̄G for population and sample
respectively.
$$\mu_G = \frac{\sum f_i x_i}{N} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \cdots + f_k x_k}{N}$$
Where:
µG = mean from the grouped data
xi = class mark of the ith class
fi = frequency of the ith class
N = total number of observations
Example: The table below represents the scores of 64 students
in a long quiz.
Class Interval   Frequency   Class Mark   fi xi
5 – 9            7           7            49
10 – 14          10          12           120
15 – 19          13          17           221
20 – 24          18          22           396
25 – 29          8           27           216
30 – 34          5           32           160
35 – 39          3           37           111
Total            N = 64                   ∑ fi xi = 1273
$$\mu_G = \frac{\sum f_i x_i}{N} = \frac{7(7) + 10(12) + 13(17) + 18(22) + 8(27) + 5(32) + 3(37)}{64} = \frac{1273}{64} \approx 19.89$$
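The grouped mean in this example can be reproduced with a short Python sketch. The class marks and frequencies are taken from the table; the names are illustrative.

```python
class_marks = [7, 12, 17, 22, 27, 32, 37]
frequencies = [7, 10, 13, 18, 8, 5, 3]

n_total = sum(frequencies)                                       # N
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))    # sum of f_i * x_i
grouped_mean = sum_fx / n_total

print(n_total, sum_fx, round(grouped_mean, 2))
```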
Weighted Mean
• Is denoted by µw or x̄w for population and sample
respectively.
$$\mu_w = \frac{\sum w_i X_i}{\sum w_i} = \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + \cdots + w_k X_k}{w_1 + w_2 + w_3 + \cdots + w_k}$$
Where:
µw = weighted mean
Xi = ith quantity
wi = weight of the ith quantity
Example: Consider the grades of a freshman student during the first
semester.
Subjects           Units (wi)   Grades (Xi)   wi Xi
Purposive Comm     3            2.25          6.750
STS                3            1.75          5.250
MMW                3            2.00          6.000
Panitikan          3            2.00          6.000
PHED 1             2            1.50          3.000
Military Science   1.5          1.25          1.875
Total              ∑ wi = 15.5                ∑ wi Xi = 28.875
$$\mu_w = \frac{\sum w_i X_i}{\sum w_i} = \frac{3(2.25) + 3(1.75) + 3(2.00) + 3(2.00) + 2(1.50) + 1.5(1.25)}{3 + 3 + 3 + 3 + 2 + 1.5} = \frac{28.875}{15.5} \approx 1.863$$
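The weighted mean of the grades can be checked with the sketch below, where the units act as the weights. The dictionary layout is only one possible way to hold the table.

```python
# subject: (units, grade), copied from the table
grades = {
    "Purposive Comm": (3, 2.25),
    "STS": (3, 1.75),
    "MMW": (3, 2.00),
    "Panitikan": (3, 2.00),
    "PHED 1": (2, 1.50),
    "Military Science": (1.5, 1.25),
}

sum_w = sum(units for units, _ in grades.values())             # sum of weights
sum_wx = sum(units * grade for units, grade in grades.values())  # sum of w_i * X_i
weighted_mean = sum_wx / sum_w

print(sum_w, sum_wx, round(weighted_mean, 3))
```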

Properties of the mean
1. The sum of the deviations of the observations from the mean is zero. The deviation of
the ith observation from the mean is denoted by
di = Xi – µ
Given the following observed values: 3, 8 and 4. The mean is 5.
d1 = X1 – 5 = 3 – 5 = –2
d2 = X2 – 5 = 8 – 5 = 3
d3 = X3 – 5 = 4 – 5 = –1
∑ di = d1 + d2 + d3 = (–2) + 3 + (–1) = 0

2. The sum of the squared deviations of the observations from the mean is a
minimum.
∑(Xi – µ)² = ∑(Xi – 5)² = (3 – 5)² + (8 – 5)² + (4 – 5)² = 14
∑(Xi – X1)² = ∑(Xi – 3)² = (3 – 3)² + (8 – 3)² + (4 – 3)² = 26
∑(Xi – X2)² = ∑(Xi – 8)² = (3 – 8)² + (8 – 8)² + (4 – 8)² = 41
∑(Xi – X3)² = ∑(Xi – 4)² = (3 – 4)² + (8 – 4)² + (4 – 4)² = 17
(Hence, the sum of the squared deviations of the observations from the
mean has the minimum value.)
3. The mean reflects the magnitude of every observation, since every
observation contributes to the value of the mean.
4. The mean can be easily affected by the presence
of an extreme value, hence it is not a good measure of
central tendency when an extreme value occurs.
• Adding the extreme value 50 to the previous data gives 3, 8, 4, and 50. The mean
is 16.25.
5. The means of subgroups may be combined when
properly weighted; the combined mean is called
the weighted mean.
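Properties 1, 2 and 4 can be verified numerically for the values 3, 8 and 4. The helper name `sum_squared_deviations` is made up for this sketch.

```python
values = [3, 8, 4]
mean = sum(values) / len(values)          # 5.0

# Property 1: deviations from the mean sum to zero.
deviations = [x - mean for x in values]

# Property 2: the sum of squared deviations is smallest about the mean.
def sum_squared_deviations(center):
    return sum((x - center) ** 2 for x in values)

about_mean = sum_squared_deviations(mean)
about_each_value = [sum_squared_deviations(x) for x in values]

# Property 4: one extreme value pulls the mean strongly.
mean_with_extreme = sum(values + [50]) / (len(values) + 1)

print(sum(deviations), about_mean, about_each_value, mean_with_extreme)
```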
2. Median is the middle score for a set of data
arranged in order of magnitude. The median is
best used when the data have extreme entries.
Symbolically, a given set of data is denoted by X1, X2, …, XN
and the array is denoted by X(1), X(2), …, X(N). The median is
$$Md = \begin{cases} X_{\left(\frac{N+1}{2}\right)} & \text{if } N \text{ is odd} \\ \dfrac{X_{\left(\frac{N}{2}\right)} + X_{\left(\frac{N}{2}+1\right)}}{2} & \text{if } N \text{ is even} \end{cases}$$
Example: Suppose an MA Math program has 10 graduate students
and their heights (in cm) are as follows: 170, 165, 155, 160, 150,
149, 152, 161, 163, 175
Arrange the data in ascending or descending order of magnitude.
X(1) = 149 X(2) = 150 X(3) = 152 X(4) = 155 X(5) = 160
X(6) = 161 X(7) = 163 X(8) = 165 X(9) = 170 X(10) = 175
$$Md = \frac{X_{\left(\frac{N}{2}\right)} + X_{\left(\frac{N}{2}+1\right)}}{2} = \frac{X_{(5)} + X_{(6)}}{2} = \frac{160 + 161}{2} = 160.5 \text{ cm}$$
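The odd and even cases of the median formula translate directly into code. The function below is a small sketch with an illustrative name; it uses 0-based list indices, so the ordered value X(k) is `ordered[k - 1]`.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if N is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # X((N+1)/2)
    return (ordered[mid - 1] + ordered[mid]) / 2  # (X(N/2) + X(N/2 + 1)) / 2

heights = [170, 165, 155, 160, 150, 149, 152, 161, 163, 175]
print(median(heights))
```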

For Grouped data: The median from grouped data can be
calculated using the formula
$$Md_G = L_{md} + C\left[\frac{\frac{N}{2} - F_b}{f_{md}}\right]$$
where:
Lmd – Lower CB of the median class
C – Class size
Fb – <CF before the median class
fmd – frequency of the median class
N – total number of observations
Example: The table below represents the scores of 64 students
in a long quiz.
Class Interval   Frequency   Class Mark   Class Boundary   <CF   Array
5 – 9            7           7            4.5 – 9.5        7     X(1), X(2), …, X(7)
10 – 14          10          12           9.5 – 14.5       17    X(8), X(9), …, X(17)
15 – 19          13          17           14.5 – 19.5      30    X(18), X(19), …, X(30)
20 – 24          18          22           19.5 – 24.5      48    X(31), X(32), …, X(48)
25 – 29          8           27           24.5 – 29.5      56    X(49), X(50), …, X(56)
30 – 34          5           32           29.5 – 34.5      61    X(57), X(58), …, X(61)
35 – 39          3           37           34.5 – 39.5      64    X(62), X(63), X(64)
Total            64
The middle value is the X(32) observation and it falls under the class interval 20 – 24.
Lmd = 19.5 C = 5 Fb = 30 fmd = 18 N = 64
$$Md_G = 19.5 + 5\left[\frac{\frac{64}{2} - 30}{18}\right] = 20.05555 \approx 20.06$$
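A minimal sketch of the grouped-median computation follows. It assumes integer class limits, so each lower class boundary is the lower limit minus 0.5, and it takes the median class to be the first class whose cumulative frequency reaches N/2. The function name is illustrative.

```python
def grouped_median(lower_limits, frequencies, class_size):
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cumulative_before = 0                      # Fb: <CF before the current class
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and cumulative_before + freq >= n / 2:
            lower_boundary = lower - 0.5       # Lmd, assuming integer class limits
            return lower_boundary + class_size * (n / 2 - cumulative_before) / freq
        cumulative_before += freq

lower_limits = [5, 10, 15, 20, 25, 30, 35]
frequencies = [7, 10, 13, 18, 8, 5, 3]

md_g = grouped_median(lower_limits, frequencies, class_size=5)
print(round(md_g, 2))
```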
Properties of Median
1. The median is a positional value and hence is not affected by the presence of an extreme
value, unlike the mean.
2. The sum of the absolute deviations from a point, say “a”, is minimum when a = Md, that
is, ∑ |xi – Md| is minimum.
3. The median is not amenable to further computation and hence medians of
subgroups cannot be combined in the same manner as the mean.
4. The median of grouped data can be calculated even with open-ended
intervals, provided the median class is not open-ended.
3. Mode is the most frequent score in the data set. It is sometimes considered as the
most popular option.
For Ungrouped data: The mode is the value which occurs most often or the most
frequently occurring observation. The mode is denoted by Mo.
Example 1. Consider the data set 1, 2, 2, 2, 8, 1, 4, 10. The most frequently occurring
observation is 2, which appeared thrice. Thus, the mode is 2, and since there is only
one mode, the distribution is unimodal.
For Grouped data: The mode from grouped data can be approximated using the
formula
$$Mo_G = L_{mo} + C\left[\frac{f_{mo} - f_b}{2f_{mo} - f_a - f_b}\right]$$
where:
Lmo – Lower CB of the modal class
C – Class size
fmo – frequency of the modal class
fb – frequency before the modal class
fa – frequency after the modal class
Note: The modal class is the class with the highest frequency.
Example: The table below represents the scores of 64 students in a long
quiz.
Class Interval   Frequency   Class Mark   Class Boundary
5 – 9            7           7            4.5 – 9.5
10 – 14          10          12           9.5 – 14.5
15 – 19          13          17           14.5 – 19.5
20 – 24          18          22           19.5 – 24.5
25 – 29          8           27           24.5 – 29.5
30 – 34          5           32           29.5 – 34.5
35 – 39          3           37           34.5 – 39.5
Total            64
The modal class is the interval 20 – 24.
Lmo = 19.5 C = 5 fb = 13 fmo = 18 fa = 8
$$Mo_G = 19.5 + 5\left[\frac{18 - 13}{2(18) - 13 - 8}\right] = 21.166666 \approx 21.17$$
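Both mode computations can be sketched in Python. The ungrouped case uses `statistics.mode` from the standard library; the grouped case is a direct transcription of the approximation formula, with an illustrative function name and the assumption that the modal class is neither the first nor the last class.

```python
from statistics import mode

# Ungrouped data: the most frequently occurring observation.
ungrouped_mode = mode([1, 2, 2, 2, 8, 1, 4, 10])

def grouped_mode(lower_boundaries, frequencies, class_size):
    """Approximate mode; assumes the modal class has a neighbor on each side."""
    i = frequencies.index(max(frequencies))    # modal class = highest frequency
    f_mo, f_b, f_a = frequencies[i], frequencies[i - 1], frequencies[i + 1]
    return lower_boundaries[i] + class_size * (f_mo - f_b) / (2 * f_mo - f_a - f_b)

lower_boundaries = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5, 34.5]
frequencies = [7, 10, 13, 18, 8, 5, 3]

mo_g = grouped_mode(lower_boundaries, frequencies, class_size=5)
print(ungrouped_mode, round(mo_g, 2))
```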
Properties of Mode
1. May not be the center of the data.
2. It does not make use of all observations.
3. It’s difficult to manipulate algebraically.
4. It’s ideal for qualitative type of data.
C. Measures of Dispersion
1. Range
is the simplest measure of dispersion. It is the difference between the highest
and lowest score.
For Ungrouped data: The range of a set of data is the absolute difference between the highest
and the lowest value in the set. The range is denoted by R.
R = |HV – LV|
where
R – Range
HV – Highest value
LV – Lowest value
For Grouped data: The range for grouped data is denoted by RG.
RG = |ULHC – LLLC|
where:
RG – Range
ULHC – Upper Limit of the Highest Class
LLLC – Lower Limit of the Lowest Class
Example: The table below represents the scores of 64 students in a long quiz.
Class Interval   Frequency   Class Mark
5 – 9            7           7
10 – 14          10          12
15 – 19          13          17
20 – 24          18          22
25 – 29          8           27
30 – 34          5           32
35 – 39          3           37
Total            64
RG = |ULHC – LLLC| = |39 – 5| = 34
2. Variance is a measure of dispersion that takes into
account the variation or spread of all items in a series about the point of central
tendency. It is the mean of the squared deviations from the mean.
For Ungrouped data: Given the set of values X1, X2, X3, …, XN,
the deviation of the ith observation from the mean is Xi – µ. The
population variance, σ², is
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + (X_3 - \mu)^2 + \cdots + (X_N - \mu)^2}{N}$$
The computational formula of the variance is
$$\sigma^2 = \frac{\sum X_i^2}{N} - \mu^2 = \frac{X_1^2 + X_2^2 + X_3^2 + \cdots + X_N^2}{N} - \mu^2$$
Example: The following data represent the scores of 7 BS Applied Statistics students in a quiz: X1 = 4, X2 =
7, X3 = 8, X4 = 2, X5 = 2, X6 = 9, X7 = 3
Compute the population variance.
$$\mu = \frac{4 + 7 + 8 + 2 + 2 + 9 + 3}{7} = 5$$
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{(4-5)^2 + (7-5)^2 + (8-5)^2 + (2-5)^2 + (2-5)^2 + (9-5)^2 + (3-5)^2}{7}$$
$$\sigma^2 = \frac{(-1)^2 + 2^2 + 3^2 + (-3)^2 + (-3)^2 + 4^2 + (-2)^2}{7} = \frac{52}{7} = 7.42857 \approx 7.43$$
Using the computational formula
$$\sigma^2 = \frac{\sum X_i^2}{N} - \mu^2 = \frac{4^2 + 7^2 + 8^2 + 2^2 + 2^2 + 9^2 + 3^2}{7} - 5^2 = 7.4285714 \approx 7.43$$
Note: Using the definitional or the computational formula, the
population variance is the same. But the computational formula is faster
and easier to apply than the definitional formula.
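The sample variance is
$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + (X_3 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$$
The computational formula of the variance is
$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)}$$
Given a random sample of size n = 10:
X1 = 4, X2 = 7, X3 = 8, X4 = 2, X5 = 2, X6 = 8, X7 = 9, X8 = 2, X9 = 5, X10 = 7
$$\bar{X} = \frac{\sum X_i}{n} = \frac{4 + 7 + 8 + 2 + 2 + 8 + 9 + 2 + 5 + 7}{10} = \frac{54}{10} = 5.4$$
Using the definitional formula
$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{(4 - 5.4)^2 + (7 - 5.4)^2 + (8 - 5.4)^2 + \cdots + (7 - 5.4)^2}{9} = \frac{68.4}{9} = 7.6$$
Using the computational formula
$$\sum X_i^2 = X_1^2 + X_2^2 + X_3^2 + \cdots + X_n^2 = 4^2 + 7^2 + 8^2 + \cdots + 7^2 = 360$$
$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)} = \frac{10(360) - (54)^2}{10(10 - 1)} = \frac{684}{90} = 7.6$$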
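Both variance examples can be cross-checked with Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by n – 1. The computational formulas are evaluated alongside for comparison; variable names are illustrative.

```python
from statistics import pvariance, variance

population = [4, 7, 8, 2, 2, 9, 3]
sample = [4, 7, 8, 2, 2, 8, 9, 2, 5, 7]

# Definitional results from the standard library.
pop_var = pvariance(population)      # divides by N
samp_var = variance(sample)          # divides by n - 1

# Computational formulas from the text.
N = len(population)
mu = sum(population) / N
pop_var_comp = sum(x * x for x in population) / N - mu ** 2

n = len(sample)
sum_x = sum(sample)
sum_x2 = sum(x * x for x in sample)
samp_var_comp = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

print(round(pop_var, 2), round(samp_var, 2))
```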
For Grouped data: The variance from grouped data can be obtained using the formulas
$$\sigma_G^2 = \frac{\sum f_i X_i^2}{N} - \mu_G^2 = \frac{f_1 X_1^2 + f_2 X_2^2 + f_3 X_3^2 + \cdots + f_k X_k^2}{N} - \mu_G^2$$
$$S^2 = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)}$$
Where
fi – the frequency of the ith class
Xi – the class mark of the ith class
µG – the mean from the grouped data
The table below represents the scores of 64 students in a long quiz.
Class Interval   Frequency   Class Mark   fiXi   fiXi²
5 – 9            7           7            49     343
10 – 14          10          12           120    1440
15 – 19          13          17           221    3757
20 – 24          18          22           396    8712
25 – 29          8           27           216    5832
30 – 34          5           32           160    5120
35 – 39          3           37           111    4107
Total            64                       1273   29311
$$\sigma_G^2 = \frac{\sum f_i X_i^2}{N} - \mu_G^2 = \frac{29311}{64} - \left(\frac{1273}{64}\right)^2 = 62.347412 \approx 62.35$$
$$S^2 = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)} = \frac{64(29311) - (1273)^2}{64(64 - 1)} = 63.3370536 \approx 63.34$$
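The grouped variance can be recomputed from the class marks and frequencies with the sketch below, which evaluates both the population and the sample formula. Names are illustrative.

```python
class_marks = [7, 12, 17, 22, 27, 32, 37]
frequencies = [7, 10, 13, 18, 8, 5, 3]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))        # sum of f_i * X_i
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, class_marks))   # sum of f_i * X_i^2

grouped_mean = sum_fx / n
population_variance = sum_fx2 / n - grouped_mean ** 2
sample_variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

print(sum_fx, sum_fx2, round(population_variance, 2), round(sample_variance, 2))
```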
Correlation is a relationship or association between two variables.
Correlation Coefficient
The linear correlation coefficient, denoted by ρ (rho) for the population and r for the sample, is a measure of the strength of the linear
relationship existing between two variables, X and Y, which is independent of their respective
scales of measurement.
Correlation Coefficient   Interpretation
–1.00                     Perfect Negative Correlation
–0.76 to –0.99            Very High Negative Correlation
–0.51 to –0.75            High Negative Correlation
–0.26 to –0.50            Moderately Small Negative Correlation
–0.01 to –0.25            Very Small Negative Correlation
0.00                      No Correlation
0.01 to 0.25              Very Small Positive Correlation
0.26 to 0.50              Moderately Small Positive Correlation
0.51 to 0.75              High Positive Correlation
0.76 to 0.99              Very High Positive Correlation
1.00                      Perfect Positive Correlation
Student   Entrance Score (X)   GPA (Y)   XY      X²      Y²
1         68                   85        5780    4624    7225
2         56                   80        4480    3136    6400
3         79                   85        6715    6241    7225
4         53                   79        4187    2809    6241
5         46                   86        3956    2116    7396
6         80                   87        6960    6400    7569
7         40                   78        3120    1600    6084
8         69                   83        5727    4761    6889
9         34                   76        2584    1156    5776
10        26                   75        1950    676     5625
11        76                   88        6688    5776    7744
12        85                   95        8075    7225    9025
13        52                   78        4056    2704    6084
14        30                   77        2310    900     5929
15        49                   81        3969    2401    6561
Total     843                  1233      70557   52525   101773
$$r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}$$
$$r = \frac{15(70557) - (843)(1233)}{\sqrt{\left[15(52525) - (843)^2\right]\left[15(101773) - (1233)^2\right]}} = 0.858083 \approx 0.858$$
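The correlation coefficient for the 15 students can be recomputed from the X and Y columns alone. This sketch applies the sums-based formula for r directly; the variable names are illustrative.

```python
import math

entrance = [68, 56, 79, 53, 46, 80, 40, 69, 34, 26, 76, 85, 52, 30, 49]   # X
gpa = [85, 80, 85, 79, 86, 87, 78, 83, 76, 75, 88, 95, 78, 77, 81]        # Y

n = len(entrance)
sum_x, sum_y = sum(entrance), sum(gpa)
sum_xy = sum(x * y for x, y in zip(entrance, gpa))
sum_x2 = sum(x * x for x in entrance)
sum_y2 = sum(y * y for y in gpa)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))
```

Recomputing the products row by row is also a convenient way to catch arithmetic slips in the XY column.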

D. Measures of Relative Position
Measures of position identify the rank or position occupied by a value in an array of collected data.
1. Percentiles are values that divide a set of observations into 100 equal parts. These values, denoted by P1,
P2, P3, …, P99, mean that 1% of the data fall below P1, 2% fall below P2, …, and 99% fall below P99. The position occupied by each
score in an array of collected data is based on hundredths when the scores are arranged from highest to
lowest or vice versa.
To determine the kth percentile Pk, the formula (k/100)n gives the number of observations
below the percentile; then counting from 1 to (k/100)n in the data arranged in ascending order gives the
percentile.
2. Deciles are values that divide a set of observations into 10 equal parts. These values, denoted by D1,
D2, D3, …, D9, indicate that 10% of the data fall below D1, 20% fall below D2, …, and 90% fall below D9. The position
occupied by each score in an array of collected data is based on tenths when the scores are
arranged from highest to lowest or vice versa.
To determine the kth decile Dk, the formula (k/10)n gives the number of
observations below the decile; then counting from 1 to (k/10)n in the data arranged in ascending order gives
the decile.

3. Quartiles are values that divide a set of observations into 4 equal parts.

• The 1st Quartile, Q1, also called the lower quartile, is equivalent to P25. To determine the 1st quartile, the
formula Q1 = n/4 gives the number of observations below the quartile; then counting from 1 to n/4 in the
data arranged in ascending order gives the quartile.
• The 2nd Quartile, Q2, is the middlemost score or the median and is equivalent to the 50th percentile. To
determine the 2nd quartile, the formula Q2 = 2n/4 = n/2 gives the number of observations below the quartile;
then counting from 1 to n/2 in the data arranged in ascending order gives the quartile.
• The 3rd Quartile, Q3, also called the upper quartile, is equivalent to the 75th percentile. To determine the
3rd quartile, the formula Q3 = 3n/4 gives the number of observations below the quartile; then counting from
1 to 3n/4 in the data arranged in ascending order gives the quartile.

Example: The scores of ten students in a 20-point Math quiz are as follows:

6, 12, 18, 8, 9, 15, 17, 15, 9, 10
Find the values of Q1, Q2, D1, D5, P10, P25, P50. Interpret the values.
Scores   Position
6        1
8        2
9        3
9        4
10       5
12       6
15       7
15       8
17       9
18       10
n = 10
• Q1 = n/4 = 10/4 = 2.5 ≈ 3. This implies that the value is located at the 3rd position, and that is 9. Thus, Q1 = 9. This means
that 25% of the students got scores equal to or below 9, or 75% of the students got scores equal to or above 9.
• Q2 = 2n/4 = 20/4 = 5. This implies that the value is located at the 5th position, and that is 10. Thus, Q2 = 10. This means
that 50% of the students got scores equal to or below 10, or 50% of the students got scores equal to or above 10.
• D1 = n/10 = 10/10 = 1. This implies that the value is located at the 1st position, and that is 6. Thus, D1 = 6. This means
that 10% of the students got scores equal to or below 6, or 90% of the students got scores equal to or above 6.
• D5 = 5n/10 = 50/10 = 5. This implies that the value is located at the 5th position, and that is 10. Thus, D5 = 10. This means
that 50% of the students got scores equal to or below 10, or 50% of the students got scores equal to or above 10.
• P10 = 10n/100 = 100/100 = 1. This implies that the value is located at the 1st position, and that is 6. Thus, P10 = 6. This means
that 10% of the students got scores equal to or below 6, or 90% of the students got scores equal to or above 6.
• P25 = 25n/100 = 250/100 = 2.5 ≈ 3. This implies that the value is located at the 3rd position, and that is 9. Thus, P25 = 9. This means
that 25% of the students got scores equal to or below 9, or 75% of the students got scores equal to or above 9.
• P50 = 50n/100 = 500/100 = 5. This implies that the value is located at the 5th position, and that is 10. Thus, P50 = 10. This means
that 50% of the students got scores equal to or below 10, or 50% of the students got scores equal to or above 10.
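The counting rule used in this example can be sketched as a single helper that serves quartiles, deciles and percentiles alike. The function name is hypothetical, and rounding a fractional position up (2.5 to 3) is an assumption taken from the worked values; other textbooks and software interpolate instead, so results from library functions may differ.

```python
import math

scores = sorted([6, 8, 9, 9, 10, 12, 15, 15, 17, 18])

def positional_value(data, k, parts):
    """k-th of `parts` divisions (4 = quartiles, 10 = deciles, 100 = percentiles)."""
    position = math.ceil(k * len(data) / parts)   # fractional positions round up (assumption)
    return data[position - 1]                     # positions are 1-based in the text

q1 = positional_value(scores, 1, 4)
q2 = positional_value(scores, 2, 4)
d1 = positional_value(scores, 1, 10)
d5 = positional_value(scores, 5, 10)
p10 = positional_value(scores, 10, 100)
p25 = positional_value(scores, 25, 100)
p50 = positional_value(scores, 50, 100)

print(q1, q2, d1, d5, p10, p25, p50)
```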

• For Grouped data: The formulas for quartiles, deciles and percentiles are derived from the
formula of the median, i.e.
$$Q_k = L_{Q_k} + C\left[\frac{\frac{kN}{4} - F_b}{f_{Q_k}}\right]$$
where
LQk – Lower CB of the quartile class
C – Class size
Fb – <CF before the quartile class
N – total number of observations
fQk – frequency of the quartile class
$$D_k = L_{D_k} + C\left[\frac{\frac{kN}{10} - F_b}{f_{D_k}}\right]$$
where
LDk – Lower CB of the decile class
C – Class size
Fb – <CF before the decile class
N – total number of observations
fDk – frequency of the decile class
$$P_k = L_{P_k} + C\left[\frac{\frac{kN}{100} - F_b}{f_{P_k}}\right]$$
where
LPk – Lower CB of the percentile class
C – Class size
Fb – <CF before the percentile class
N – total number of observations
fPk – frequency of the percentile class
Characteristics of the normal Distribution
1. The normal distribution is a continuous distribution in which the random
variable X can assume any value between –∞ and +∞.
Some examples of continuous random variables are weight, length,
height, distance traveled, and volume of liquid.
2. The two parameters that describe the normal distribution are the mean µ and the
variance σ².
3. The normal distribution is a symmetric, bell-shaped probability distribution. Since it
is symmetric, P(X > µ) = P(X < µ) = 0.50. The distribution is symmetric about the mean, where the
three measures of central tendency coincide.
4. The total area under the normal curve and above the x-axis is one.
5. The normal curve approaches the horizontal axis asymptotically as the normal
curve extends in either direction from the mean.
Standard Normal Distribution

The values of a normal random variable X
are usually expressed in terms of how many standard deviations they are away
from the mean.
With the standard normal distribution, different values of the mean and
variance no longer require completely different curves.
The normal random variable can be transformed into a standard
normal random variable Z using the transformation formula
$$Z = \frac{X - \mu}{\sigma}$$
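The transformation to a standard normal variable is a one-line computation. In the sketch below the mean, standard deviation and observed value are hypothetical numbers chosen only for illustration; they do not come from the chapter.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical values: mean 50, standard deviation 10, observed value 65.
z = z_score(65, mu=50, sigma=10)
print(z)
```

A positive Z means the value lies above the mean and a negative Z means it lies below.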
Spearman Rank Correlation Coefficient

• The Spearman rank correlation coefficient is the best-known measure of
relationship between two variables based on ranks (ordinal scale). It is applicable when
quantitative measurements of the variables are not normally distributed and can be ranked in
two ordered series. Its formula is given by
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
where di is the difference between the ranks of the ith pair, and n is the total number of paired
measurements.
Finding Areas Under the Normal Curve
• The following are special cases to consider in finding areas under the normal curve.

1. Because the normal curve is symmetric, the chance that X will fall below the
mean is 0.50, and the chance that X will fall above the mean is also 0.50.
2. Approximately 99.7% of the area of the normal distribution is contained within 3 standard
deviations of the mean.
P(µ – 3σ < X < µ + 3σ) ≈ 0.997
3. Approximately 95% of the area of the normal distribution is contained within 2 standard deviations
of the mean.
P(µ – 2σ < X < µ + 2σ) ≈ 0.95
4. Approximately 68% of the area of the normal distribution is contained within one standard
deviation of the mean.
P(µ – σ < X < µ + σ) ≈ 0.68
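The areas within one, two and three standard deviations of the mean can be computed without a table, using the identity P(µ – kσ < X < µ + kσ) = erf(k/√2) and Python's standard `math.erf`. The function name is illustrative.

```python
import math

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The computed areas are about 0.6827, 0.9545 and 0.9973 for k = 1, 2 and 3.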
Testing for the Significance of the Spearman Rank Correlation

1. Ho: rs = 0, or there is no significant relation between self and supervisors' evaluation
Ha: rs ≠ 0, or there is a significant relation between self and supervisors' evaluation
2. Level of significance = 0.05 and sample size n = 10
3. Test Statistic: t-test
4. Critical Region: Reject Ho if |t| > 2.306
5. Computations: The computed t-test statistic is
t = 3.3034108 ≈ 3.303
6. Decision: Since |t| = 3.303 > 2.306, reject Ho and conclude that there is a significant relation between
the supervisors' evaluation and the self-evaluation of the faculty members.
