Selected Statistical Tests
Copyright 2006 New Age International (P) Ltd., Publishers
Published by New Age International (P) Ltd., Publishers
ISBN : 978-81-224-2429-4
Statistics is used in research and in the analysis of data in almost all fields. Official government
statistics are among our oldest records and serve as historical evidence. Many people have contributed to the
refinement of the statistical methods that we use today in various fields; their development has been a long process.
Today we have many statistical tools for the analysis of data in fields such as
business, medicine, engineering, agriculture and management. Many people find it difficult to decide which
statistical technique is to be applied and where. Even though computer software has reduced the
work involved, a basic knowledge of the techniques is a must for their proper application.
This book presents the important and widely used statistical tests, with worked examples
and exercises drawn from real-life applications. It is written in a simple and easily understood manner. It
will help researchers apply these tests in their data analysis, and statisticians will also find
it useful for easy reference. It is a good companion for all who need statistical tools in their field.
The author is greatly indebted to the Authorities of Annamalai University for permission to
publish this book.
V. Rajagopalan
CONTENTS
Preface
1. INTRODUCTION
2. PARAMETRIC TESTS
Test 1 Test for a Population Proportion
Test 2 Test for a Population Mean (Population variance is known)
Test 3 Test for a Population Mean (Population variance is unknown)
Test 4 Test for a Population Variance (Population mean is known)
Test 5 Test for a Population Variance (Population mean is unknown)
Test 6 Test for Goodness of Fit
Test 7 Test for Equality of two Population Proportions
Test 8 Test for Equality of two Population Means (Population variances are equal and known)
Test 9 Test for Equality of two Population Means (Population variances are unequal and known)
Test 10 Test for Equality of two Population Means (Population variances are equal and unknown)
Test 11 Test for Paired Observations
Test 12 Test for Equality of two Population Standard Deviations
Test 13 Test for Equality of two Population Variances
Test 14 Test for Consistency in a 2 × 2 table
Test 15 Test for Homogeneity of Several Population Proportions
Test 16 Test for Homogeneity of Several Population Variances (Bartlett's test)
Test 17 Test for Homogeneity of Several Population Means
Test 18 Test for Independence of Attributes
Test 19 Test for Population Correlation Coefficient Equals Zero
Test 20 Test for Population Correlation Coefficient Equals a Specified Value
Test 21 Test for Population Partial Correlation Coefficient
Test 22 Test for Equality of two Population Correlation Coefficients
Test 23 Test for Multiple Correlation Coefficient
INTRODUCTION
Testing of statistical hypotheses is an important part of statistical theory; it helps us to make
decisions in the face of uncertainty. There are many real-life situations in which we would like
to take a decision on further action. There are also problems in which we would like to
determine whether a claim is acceptable or not. Suppose that we are interested in testing the following
claims:
1. The average consumption of electricity in city A is 175 units per month.
2. Bath soap B reduces the rate of skin infections by 50%.
3. Oral polio vaccine is more potent than parenteral polio vaccine.
4. A new variety of paddy yields 16.5 tonnes per hectare.
5. Drug C produces less drug dependence than drug D.
6. Health drink E improves weight gain by 25% for children.
7. A plant produced by cloning grows 50% faster than an ordinary one.
8. A door-to-door campaign increases the sales of a washing powder by 20%.
9. Machine F produces more items within specifications than Machine G.
10. The proportion of defective items in a large consignment of coconuts is less than 4%.
These are a few of the many kinds of problems that can be solved only with the help of
statisticians. To solve such problems, we need the following basic and important concepts of statistical
theory.
1. POPULATION
In any statistical investigation, the interest usually lies in assessing the general magnitude of
one or more characteristics of the individuals belonging to a group. Such a group of individuals
under study is called a population. The number of units in a population is known as the population size,
which may be either finite or infinite. In a finite population, the size is denoted by N. Thus, in
statistics, a population is an aggregate of objects, animate or inanimate, under study.
In a statistical survey, complete enumeration of the population is tedious if the population size is too
large or infinite. In some situations, even though 100% inspection is possible, the units are destroyed
during the course of inspection. Because complete enumeration is subject to various constraints,
namely manpower, time and expenditure, we take the help of sampling.
2. SAMPLE
A finite, small subset of the units of a population is called a sample; the number of units in a sample is
called the sample size and is denoted by n. The process of selecting a sample is known as sampling.
Every member of a sample is called a sample unit, and the numerical values of the sample units are
called observations. If each unit of the population has an equal chance of being included in the sample, then the
sample is called a random sample. A sample of n observations is denoted by $X_1, X_2, \ldots, X_n$.
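The idea of a random sample can be illustrated with a minimal Python sketch. The population of 100 unit labels and the helper name `simple_random_sample` are hypothetical, chosen only for illustration; the draw is made without replacement so that every unit has the same chance of inclusion.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; each unit has an equal chance of being included."""
    rng = random.Random(seed)          # seeded generator so the draw can be repeated
    return rng.sample(list(population), n)

# Hypothetical finite population of N = 100 unit labels (an assumption for illustration).
population = list(range(1, 101))
sample = simple_random_sample(population, n=10, seed=1)
print(sample)                          # the n = 10 selected sample units
```

Fixing the seed only makes the illustration reproducible; in a real survey the draw would not be seeded.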
3. PARAMETERS
Statistical measures such as the mean, standard deviation, variance and correlation coefficient are
called parameters when they are calculated from the whole population. If the population information is not
completely available, or the population is not finite, the parameters cannot be evaluated. In such cases, the parameters are
said to be unknown.
4. STATISTICS
Statistical measures obtained from the sample alone are called statistics. Any
function of the sample observations is also known as a statistic.
The following is the list of standard symbols used for parameters and statistics:
Statistical measure        Parameter     Statistic
Mean                       $\mu$         $\bar{X}$
Median                     $M$           $m$
Standard deviation         $\sigma$      $s$
Variance                   $\sigma^2$    $s^2$
Proportion                 $P$           $p$
Correlation coefficient    $\rho$        $r$
Regression coefficient     $\beta$       $b$
5. SAMPLING ERROR
Errors arise because only a part of the population (i.e., a sample) is used to estimate the parameters and
to draw inferences about the population. Such an error is called sampling error.
6. STATISTICAL INFERENCE
The process of arriving at valid conclusions about the population based on a sample or
samples is called statistical inference. It has two major divisions, namely estimation and testing of
hypotheses.
7. ESTIMATION
When the parameters are unknown, they are estimated by the corresponding statistics based on the
samples. Such a process is called estimation. A statistic used to estimate an unknown parameter
is called an estimator. For example, the sample mean is an estimator of the population mean.
A specific value of the estimator, computed from a sample, is called an estimate. Estimation is broadly
classified into two types, namely point estimation and interval estimation.
9. TESTING OF HYPOTHESIS
Hypothesis testing begins with an assumption, or hypothesized value, that we make about the unknown
population parameter. Sample data are collected and sample statistics are obtained from them. These
statistics are used to test whether the assumption we made about the parameter is correct. The difference
between the hypothesized value and the actual value of the sample statistic is determined. Then we
decide whether the difference is significant or not. The smaller the difference, the greater the likelihood
that our hypothesized value is correct. We cannot accept or reject the hypothesized value of a
population parameter simply by intuition. The statistical tests for testing the significance of the difference
between the hypothesized value and the actual value of the sample statistic, or the difference between
any set of sample statistics, are called tests of significance.
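No.   Statistic                    Standard error
2     $p$                          $\sqrt{PQ/n}$
3     $s$                          $\sqrt{\sigma^2/(2n)}$
4     $s^2$                        $\sigma^2\sqrt{2/n}$
5     $r$                          $(1-\rho^2)/\sqrt{n}$
6     $\bar{X}_1 - \bar{X}_2$      $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
7     $s_1 - s_2$                  $\sqrt{\sigma_1^2/(2n_1) + \sigma_2^2/(2n_2)}$
8     $p_1 - p_2$                  $\sqrt{P_1Q_1/n_1 + P_2Q_2/n_2}$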
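Two of the large-sample standard errors listed above, $\sqrt{PQ/n}$ for a sample proportion and $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ for a difference of two sample means, can be evaluated with a short sketch. The function names are hypothetical, and the numerical inputs are assumptions chosen only to show the arithmetic.

```python
from math import sqrt

def se_proportion(P, n):
    """Standard error of a sample proportion: sqrt(P * Q / n), with Q = 1 - P."""
    return sqrt(P * (1 - P) / n)

def se_diff_means(var1, var2, n1, n2):
    """Standard error of a difference of two sample means, population variances known."""
    return sqrt(var1 / n1 + var2 / n2)

# Illustrative inputs (assumptions, not taken from a table in the text).
print(round(se_proportion(0.3, 500), 4))            # 0.0205
print(round(se_diff_means(4.0, 9.0, 4, 9), 4))      # 1.4142
```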
The following are the test procedures that we adopt in studying the parametric tests in a systematic
manner:
The acceptance or rejection of $H_0$ depends on the test criterion that is used in hypothesis testing. In
any hypothesis test, we would like to control both Type-I and Type-II errors. The probability of
committing a Type-I error is denoted by $\alpha$ and the probability of committing a Type-II error is denoted by $\beta$.
sided or right-sided) is called a one-sided test. For example, a test of the population mean,
$H_0: \mu = \mu_0$, against the alternative hypothesis $H_1: \mu < \mu_0$ (left-sided) or $H_1: \mu > \mu_0$ (right-sided) is a one-sided test, whereas
a test of $H_0$ against $H_1: \mu \neq \mu_0$ is known as a two-sided test.
11.8 Conclusion
By comparing the two values, namely the observed value of the test statistic and the critical value, the
conclusion is arrived at.
If $|Z| \le Z_\alpha$, we conclude that there is no evidence against the null hypothesis $H_0$, and hence it may
be accepted.
If $|Z| > Z_\alpha$, we conclude that there is evidence against the null hypothesis $H_0$ and in favor of $H_1$.
Hence, $H_0$ is rejected and $H_1$ is accepted.
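The comparison of an observed statistic with a critical value can be sketched as follows for a statistic that follows the standard normal distribution. The function name `decide` and the `alternative` labels are assumptions made for this illustration; the critical values are computed with `statistics.NormalDist` from the Python standard library rather than read from a printed table, so they may differ from table values in the last digit.

```python
from statistics import NormalDist

def decide(z, alpha, alternative="two-sided"):
    """Compare an observed Z with the standard normal critical value."""
    std_normal = NormalDist()                       # mean 0, standard deviation 1
    if alternative == "two-sided":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":                  # right-sided test
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":                     # left-sided test
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return "reject H0" if reject else "do not reject H0"

print(decide(-1.77, 0.05))                # do not reject H0 (1.77 < 1.96)
print(decide(-2.86, 0.01, "less"))        # reject H0 (-2.86 < -2.33)
```

Keeping the direction of the alternative explicit avoids applying a two-sided cut-off to a one-sided question.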
12.1 Treatments
The various factors or methods that are compared in an experiment are termed treatments. For
example, in field experiments, different varieties of paddy seeds, different kinds of fertilizers and different
methods of cultivation are called treatments.
12.3 Blocks
In field experiments, the experimental material is first divided into relatively homogeneous divisions,
known as blocks. Each block is further divided into small plots, called experimental units.
12.4 Replication
The repetition of the treatments on several experimental units in the investigation is
called replication. In agricultural experiments, each block receives all the treatments, so
each treatment is repeated as many times as there are blocks. Hence, in the analysis,
the number of blocks is the same as the number of replications.
12.5 Randomization
The allocation of the treatments to the experimental units in a random manner is called randomization.
Different kinds of randomization are adopted in the ANOVA tests, namely complete randomization,
randomization within blocks, row-wise, column-wise etc., according to the type of experimental design.
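Randomization within blocks can be sketched in a few lines. The treatment labels, the number of blocks and the function name `randomize_within_blocks` are hypothetical; the sketch assumes that every block receives each treatment exactly once, so the number of blocks equals the number of replications.

```python
import random

def randomize_within_blocks(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)   # every block receives every treatment once
        rng.shuffle(order)         # random allocation of treatments to the plots of the block
        layout.append(order)
    return layout

# Hypothetical experiment: four treatments A-D replicated in three blocks.
layout = randomize_within_blocks(["A", "B", "C", "D"], n_blocks=3, seed=42)
for number, block in enumerate(layout, start=1):
    print(f"Block {number}: {block}")
```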
PARAMETRIC TESTS
TEST 1
Aim
To test whether the population proportion P can be regarded as $P_0$, based on a random sample. That is, to
investigate the significance of the difference between the observed sample proportion p and the assumed
population proportion $P_0$.
Source
If X is the number of occurrences of an event in n independent trials with constant probability P
of occurrence of that event in each trial, then $E(X) = nP$ and $V(X) = nPQ$, where $Q = 1 - P$ is the
probability of non-occurrence of that event. It has been proved that, for large n, the binomial distribution
tends to the normal distribution. Hence, the normal test can be applied. In a random sample of size n, let X
be the number of persons possessing the given attribute. Then the observed proportion in the sample is
$X/n = p$ (say), with
$$E(p) = P \quad \text{and} \quad \mathrm{S.E.}(p) = \sqrt{\mathrm{Var}(p)} = \sqrt{\frac{P(1-P)}{n}}.$$
Assumption
The sample size must be sufficiently large (i.e., n > 30) to justify the normal approximation to
the binomial distribution.
Null Hypothesis
$H_0$: The population proportion (P) is regarded as $P_0$. That is, there is no significant difference
between the observed sample proportion p and the assumed population proportion $P_0$, i.e., $H_0: P = P_0$.
Alternative Hypotheses
$H_1(1): P \neq P_0$
$H_1(2): P > P_0$
$H_1(3): P < P_0$
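[Figures: critical regions under the standard normal curve for the two-sided test (area $\alpha/2$ in each tail, cut off at $-Z_{\alpha/2}$ and $Z_{\alpha/2}$), the right-sided test (cut off at $Z_\alpha$) and the left-sided test (cut off at $-Z_\alpha$).]
Critical Values ($Z_\alpha$)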
Test Statistic
$$Z = \frac{p - P}{\sqrt{P(1-P)/n}} \quad (\text{under } H_0: P = P_0)$$
The statistic Z follows the standard normal distribution.
Conclusions
1. If $|Z| \le Z_\alpha$, we conclude that the data do not provide us any evidence against the null
hypothesis $H_0$. Hence, it may be accepted at the $\alpha$% level of significance. Otherwise, reject $H_0$
or accept $H_1(1)$.
2. If $Z \le Z_\alpha$, we conclude that the data do not provide us any evidence against the null
hypothesis $H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject
$H_0$ or accept $H_1(2)$.
3. If $Z \ge -Z_\alpha$, we conclude that the data do not provide us any evidence against the null
hypothesis $H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject
$H_0$ or accept $H_1(3)$.
Example 1
Hindustan Lever Ltd. (HLL) expects that more than 30% of the households in New Delhi will
consume its product if it manufactures a new face cream. A random sample of 500 households from
the city is surveyed, and 163 are in favor of manufacturing the product. Examine whether the expectation
of the company would be met at the 2% level.
Solution
Aim: To test whether the new face cream manufactured by the HLL Company will be consumed
by 30% of the households in New Delhi or by more.
$H_0$: The new face cream manufactured by the HLL Company will be consumed by
30% of the households in New Delhi, i.e., $H_0: P = 0.3$.
$H_1$: The new face cream manufactured by the HLL Company will be consumed by
more than 30% of the households in New Delhi, i.e., $H_1: P > 0.3$.
Level of Significance: $\alpha = 0.02$ and Critical Value: $Z_\alpha = 2.05$
Based on the above data, we observe that n = 500 and p = 163/500 = 0.326.
Test Statistic:
$$Z = \frac{p - P}{\sqrt{P(1-P)/n}} \ (\text{under } H_0: P = 0.3) = \frac{0.326 - 0.3}{\sqrt{(0.3)(0.7)/500}} = 1.27$$
Conclusion: Since $Z < Z_\alpha$, we conclude that the data do not provide us any evidence against the
null hypothesis $H_0$. Hence, accept $H_0$ at the 2% level of significance. That is, the new face cream
manufactured by the HLL Company will be consumed by 30% of the households in New
Delhi.
Example 2
A plastic surgery department wants to know the necessity of mesh repair of hernia. It believes
that only 15% of hernia patients need mesh. In a sample of 250 hernia patients from hospitals, only 42
needed mesh. Test at the 2% level of significance whether the expectation of the department for mesh
repair of hernia patients is true.
Solution
Aim: To test whether the necessity of hernia repair with mesh is 15% or not.
$H_0$: The necessity of mesh repair of hernia is 15%, i.e., $H_0: P = 0.15$
$H_1$: The necessity of mesh repair of hernia is not 15%, i.e., $H_1: P \neq 0.15$
Level of Significance: $\alpha = 0.02$ and Critical Value: $Z_\alpha = 2.33$
Based on the above data, we observe that n = 250 and p = 42/250 = 0.168.
Test Statistic:
$$Z = \frac{p - P}{\sqrt{P(1-P)/n}} \ (\text{under } H_0: P = 0.15) = \frac{0.168 - 0.15}{\sqrt{(0.15)(0.85)/250}} = 0.80$$
Conclusion: Since $|Z| < Z_\alpha$, we conclude that the data do not provide us any evidence against the
null hypothesis $H_0$. Hence, accept $H_0$ at the 2% level of significance. That is, the necessity of mesh repair
of hernia, expected by the plastic surgery department to be 15%, is true.
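The arithmetic of the two worked examples can be checked with a minimal sketch of the test statistic $Z = (p - P_0)/\sqrt{P_0(1-P_0)/n}$. The function name `proportion_z` is hypothetical; the inputs are the counts, sample sizes and hypothesized proportions stated in Examples 1 and 2.

```python
from math import sqrt

def proportion_z(x, n, p0):
    """Z statistic for H0: P = p0, given x occurrences in a sample of size n."""
    p = x / n                                  # observed sample proportion
    return (p - p0) / sqrt(p0 * (1 - p0) / n)

z_face_cream = proportion_z(163, 500, 0.30)    # Example 1: p = 0.326
z_mesh = proportion_z(42, 250, 0.15)           # Example 2: p = 0.168
print(round(z_face_cream, 2), round(z_mesh, 2))   # 1.27 0.8
```

Both values lie below the critical values used in the examples, so neither null hypothesis is rejected.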
EXERCISES
1. A random sample of 400 apples was taken from a large consignment and 35 were found to be bad.
Examine, at the 1% level, whether the proportion of bad items in the lot is 7%.
2. 150 people were attacked by a disease, of whom 5 died. Will you reject the hypothesis that the death
rate, if attacked by this disease, is 3% against the hypothesis that it is more, at the 5% level?
TEST 2
Aim
To test whether the population mean $\mu$ can be regarded as $\mu_0$, based on a random sample. That is, to investigate
the significance of the difference between the sample mean $\bar{X}$ and the assumed population mean $\mu_0$.
Source
Let $\bar{X}$ be the mean of a random sample of n independent observations drawn from a population
whose mean $\mu$ is unknown and whose variance $\sigma^2$ is known.
Assumptions
(i) The population from which the sample is drawn is assumed to be normally distributed.
(ii) The population variance $\sigma^2$ is known.
Null Hypothesis
$H_0$: The sample has been drawn from a population with mean $\mu_0$. That is, there is no
significant difference between the sample mean $\bar{X}$ and the assumed population mean $\mu_0$, i.e.,
$H_0: \mu = \mu_0$.
Alternative Hypotheses
$H_1(1): \mu \neq \mu_0$
$H_1(2): \mu > \mu_0$
$H_1(3): \mu < \mu_0$
Test Statistic
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \quad (\text{under } H_0: \mu = \mu_0)$$
The statistic Z follows the standard normal distribution.
Example 1
The daily wages of a factory's workers are assumed to be normally distributed. A random
sample of 50 workers has an average daily wage of rupees 120. Test whether the average daily wage
of that factory can be regarded as rupees 125, with a standard deviation of rupees 20, at the 5% level of
significance.
Solution
Aim: Our aim is to test the null hypothesis that the average daily wage of the factory's workers
can be regarded as rupees 125, with a standard deviation of rupees 20.
$H_0$: The average daily wage of the factory's workers is 125 rupees, i.e., $H_0: \mu = 125$.
$H_1$: The average daily wage of the factory's workers is not 125 rupees, i.e., $H_1: \mu \neq 125$.
Level of Significance: $\alpha = 0.05$ and Critical Value: $Z_\alpha = 1.96$
Test Statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ (\text{under } H_0: \mu = 125) = \frac{120 - 125}{20/\sqrt{50}} = -1.77$$
Conclusion: Since the observed value of the test statistic, |Z| = 1.77, is smaller than the critical
value 1.96 at the 5% level of significance, the data do not provide us any evidence against the null hypothesis
$H_0$. Hence it is accepted, and we conclude that the average daily wage of the factory's workers can be
regarded as rupees 125 with a standard deviation of rupees 20.
Example 2
A bulb manufacturing company hypothesizes that the average life of its product is 1,450 hours.
It knows that the standard deviation of the bulbs' life is 210 hours. From a sample of 100 bulbs, the
company finds a sample mean of 1,390 hours. At the 1% level of significance, should the company
conclude that the average life of the bulbs is less than the hypothesized 1,450 hours?
Solution
Aim: Our aim is to test whether the average life of the bulbs can be regarded as 1,450 hours or less.
$H_0$: The average life of the bulbs is 1,450 hours, i.e., $H_0: \mu = 1450$.
$H_1$: The average life of the bulbs is below 1,450 hours, i.e., $H_1: \mu < 1450$.
Level of Significance: $\alpha = 0.01$ and Critical Value: $Z_\alpha = 2.33$
Test Statistic:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ (\text{under } H_0: \mu = 1450) = \frac{1390 - 1450}{210/\sqrt{100}} = -2.86$$
Conclusion: Since the observed value of the test statistic, Z = -2.86, is smaller than the critical
value -2.33 at the 1% level of significance, the data provide us evidence against the null hypothesis $H_0$ and
in favor of $H_1$. Hence, $H_1$ is accepted, and we conclude that the average life of the bulbs is significantly
less than the hypothesized 1,450 hours.
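The two worked examples can be reproduced with a short sketch of $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$. The function name `mean_z` is hypothetical. In both examples the sample mean lies below the hypothesized mean, so the computed statistic carries a negative sign.

```python
from math import sqrt

def mean_z(sample_mean, mu0, sigma, n):
    """Z statistic for H0: mu = mu0 when the population standard deviation is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

z_wages = mean_z(120, 125, 20, 50)        # Example 1: daily wages
z_bulbs = mean_z(1390, 1450, 210, 100)    # Example 2: life of bulbs
print(round(z_wages, 2), round(z_bulbs, 2))   # -1.77 -2.86
```

For the two-sided wage example the comparison uses |Z| = 1.77 against 1.96; for the left-sided bulb example Z = -2.86 is compared with -2.33.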
EXERCISES
1. A film producer knows that his movies ran for an average of 100 days in each city of Tamilnadu, and
the corresponding standard deviation was 8 days. A researcher randomly chose 80 theatres in the
southern districts and found that they ran the movie for an average of 86 days. Test the hypothesis at
the 2% significance level.
2. A sample of 50 children observed from rural areas of a district has an average birth weight of 2.85 kg.
The past record shows that the standard deviation of birth weight in the district is 0.3 kg. Can we
expect that the average birth weight of the children in the district will be more than 3 kg at the 5% level?
TEST 3
Aim
To test whether the population mean $\mu$ can be regarded as $\mu_0$, based on a random sample. That is, to
investigate the significance of the difference between the sample mean $\bar{X}$ and the assumed population
mean $\mu_0$.
Source
A random sample of n observations $X_i$ (i = 1, 2, ..., n) is drawn from a population whose mean $\mu$
and variance $\sigma^2$ are unknown.
Assumptions
(i) The population from which the sample is drawn is normally distributed.
(ii) The population variance $\sigma^2$ is unknown. (Since $\sigma^2$ is unknown, it is replaced by its unbiased
estimate $S^2$.)
Null Hypothesis
$H_0$: The sample has been drawn from a population with mean $\mu_0$. That is, there is no
significant difference between the sample mean $\bar{X}$ and the assumed population mean $\mu_0$, i.e.,
$H_0: \mu = \mu_0$.
Alternative Hypotheses
$H_1(1): \mu \neq \mu_0$
$H_1(2): \mu > \mu_0$
$H_1(3): \mu < \mu_0$
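[Figures: critical regions under the t distribution with n-1 degrees of freedom for the two-sided test (area $\alpha/2$ in each tail, cut off at $-t_{\alpha/2,\,n-1}$ and $t_{\alpha/2,\,n-1}$), the right-sided test (cut off at $t_{\alpha,\,n-1}$) and the left-sided test (cut off at $-t_{\alpha,\,n-1}$).]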
Test Statistic
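$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \quad (\text{under } H_0: \mu = \mu_0)$$
where
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2.$$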
Conclusions
1. If $|t| \le t_\alpha$, we conclude that the data do not provide us any evidence against the null
hypothesis $H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject
$H_0$ or accept $H_1(1)$.
2. If $t \le t_\alpha$, we conclude that the data do not provide us any evidence against the null hypothesis
$H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject $H_0$ or accept
$H_1(2)$.
3. If $t \ge -t_\alpha$, we conclude that the data do not provide us any evidence against the null hypothesis
$H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject $H_0$ or accept
$H_1(3)$.
Example 1
A sample of 12 students from a school has the following scores in an I.Q. test: 89 87 76 78 79 86
74 83 75 71 76 92. Do these data support the claim that the mean I.Q. mark of the school students is 80? Test at
the 5% level.
Solution
Aim: To test whether the mean I.Q. mark of the school students can be regarded as 80 or not.
$H_0$: The mean I.Q. mark of the school students is 80, i.e., $H_0: \mu = 80$.
$H_1$: The mean I.Q. mark of the school students is not 80, i.e., $H_1: \mu \neq 80$.
Level of Significance: $\alpha = 0.05$ and Critical Value: $t_{0.05,11} = 2.20$
From the data, $\bar{X} = 966/12 = 80.5$ and $S = \sqrt{495/11} = 6.71$.
Test Statistic:
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \ (\text{under } H_0: \mu = 80) = \frac{80.5 - 80}{6.71/\sqrt{12}} = 0.26$$
Conclusion: Since |t| < 2.20, we conclude that the data do not provide us any evidence against the
null hypothesis $H_0$. Hence, accept $H_0$ at the 5% level of significance. That is, the mean I.Q. mark of the
school students can be regarded as 80.
Example 2
The average breaking strength of steel rods is specified as 22.25 kg. To test this, a sample of 20
rods was examined. The mean and standard deviation obtained were 21.35 kg and 2.25 kg respectively.
Is the result of the experiment significant at the 5% level?
Solution
Aim: To test whether the specified average breaking strength of steel rods, 22.25 kg, is true or not.
$H_0$: The average breaking strength of steel rods specified as 22.25 kg is true, i.e., $H_0: \mu = 22.25$.
$H_1$: The average breaking strength of steel rods specified as 22.25 kg is not true, i.e.,
$H_1: \mu \neq 22.25$.
Level of Significance: $\alpha = 0.05$ and Critical Value: $t_{0.05,19} = 2.09$
Test Statistic:
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \ (\text{under } H_0: \mu = 22.25) = \frac{21.35 - 22.25}{2.31/\sqrt{20}} = -1.74$$
The value S = 2.31 used here is consistent with converting the reported standard deviation to the unbiased estimate, $S = 2.25\sqrt{20/19} \approx 2.31$.
Conclusion: Since |t| < 2.09, we conclude that the data do not provide us any evidence against
the null hypothesis $H_0$, and hence it may be accepted at the 5% level of significance. That is, the average
breaking strength of steel rods specified as 22.25 kg is true.
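The t statistic can be computed directly from raw data with the Python standard library. The sketch below uses the twelve I.Q. scores of Example 1; the function name `one_sample_t` is hypothetical, and the critical value 2.20 is taken from the example rather than computed. `statistics.stdev` uses the divisor n-1, which matches the unbiased estimate S defined in the test statistic. Evaluated on the listed scores, it gives S of about 6.71 and t of about 0.26; |t| remains far below 2.20.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (sample mean, S, t) for H0: mu = mu0; S uses the divisor n - 1."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                       # square root of the unbiased variance estimate
    return x_bar, s, (x_bar - mu0) / (s / sqrt(n))

scores = [89, 87, 76, 78, 79, 86, 74, 83, 75, 71, 76, 92]   # data of Example 1
x_bar, s, t = one_sample_t(scores, 80)
print(x_bar, round(s, 2), round(t, 2))
print("reject H0" if abs(t) > 2.20 else "do not reject H0")  # 2.20 is the table value used in Example 1
```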
EXERCISES
1. A salesperson says that the average sales of pickle in a week will be 120 units. A sample of
sales on 8 weeks was observed as 112 124 110 114 108 114 115 118 125 126. Examine whether the claim
of the salesperson is true at the 1% significance level.
2. A sample of 10 coconut trees from a grove has the following yields of coconuts in a season: 68 56 47
52 62 70 56 54 63 60. Shall we conclude that the average yield of coconuts from the grove is 65? Test
at the 2% level.
TEST 4
Aim
To test whether the population variance $\sigma^2$ can be regarded as $\sigma_0^2$, based on a random sample. That is, to
investigate the significance of the difference between the assumed population variance $\sigma_0^2$ and the
sample variance $s^2$.
Source
A random sample of n observations $X_i$ (i = 1, 2, ..., n) is drawn from a normal population with
known mean $\mu$ and unknown variance $\sigma^2$.
Assumption
The population from which the sample is drawn is normally distributed.
Null Hypothesis
$H_0$: The population variance $\sigma^2$ is $\sigma_0^2$. That is, there is no significant difference between the
assumed population variance $\sigma_0^2$ and the sample variance $s^2$, i.e., $H_0: \sigma^2 = \sigma_0^2$.
Alternative Hypotheses
$H_1(1): \sigma^2 \neq \sigma_0^2$
$H_1(2): \sigma^2 > \sigma_0^2$
$H_1(3): \sigma^2 < \sigma_0^2$
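[Figure: critical region of the two-sided test, with area $\alpha/2$ in each tail of the $\chi^2$ distribution, cut off at $\chi^2_{1-(\alpha/2),\,n}$ and $\chi^2_{(\alpha/2),\,n}$.]
(2) $\chi^2 > \chi^2_{\alpha,\,n}$ such that $P\{\chi^2 > \chi^2_{\alpha,\,n}\} = \alpha$
[Figure: critical region of the right-sided test, cut off at $\chi^2_{\alpha,\,n}$.]
(3) $\chi^2 < \chi^2_{1-\alpha,\,n}$ such that $P\{\chi^2 < \chi^2_{1-\alpha,\,n}\} = \alpha$.
[Figure: critical region of the left-sided test, cut off at $\chi^2_{(1-\alpha),\,n}$.]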
The critical values of the left-sided and right-sided tests, marked as a and b, are obtained from
Table 3.
Test Statistic
$$\chi^2 = \frac{\sum_{i=1}^{n} (X_i - \mu)^2}{\sigma_0^2}$$
Conclusions
1. If $\chi^2_{1-(\alpha/2)} \le \chi^2 \le \chi^2_{(\alpha/2)}$, we conclude that the data do not provide us any evidence against
the null hypothesis $H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise,
reject $H_0$ or accept $H_1(1)$.
2. If $\chi^2 \le \chi^2_{\alpha}$, we conclude that the data do not provide us any evidence against the null
hypothesis $H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject
$H_0$ or accept $H_1(2)$.
3. If $\chi^2 \ge \chi^2_{1-\alpha}$, we conclude that the data do not provide us any evidence against the null
hypothesis $H_0$, and hence it may be accepted at the $\alpha$% level of significance. Otherwise, reject
$H_0$ or accept $H_1(3)$.
Example 1
An agriculturist expects that, in a coconut grove, the average yield is 63 coconuts per tree per year and the variance is
20.25. A random sample of 10 coconut trees has the following yields in
a year: 76 65 64 56 58 54 62 68 76 78. Test whether the variance differs significantly from 20.25 at the 5% level of significance.
Solution
Aim: To test whether the variance of the yield of coconut in the grove can be regarded as 20.25
or not.
$H_0$: The variance of the yield of coconut in the grove is 20.25, i.e., $H_0: \sigma^2 = 20.25$
$H_1$: The variance of the yield of coconut in the grove is not 20.25, i.e., $H_1: \sigma^2 \neq 20.25$
Level of Significance: $\alpha = 0.05$
Critical Values: $\chi^2_{(.975),\,10} = 3.247$ and $\chi^2_{(.025),\,10} = 20.483$
Critical Region: $P(\chi^2 < 3.247) + P(\chi^2 > 20.483) = 0.05$
Test Statistic:
$$\chi^2 = \frac{\sum_{i=1}^{n} (X_i - \mu)^2}{\sigma_0^2} = \frac{749}{20.25} = 36.99$$
Here $\sum (X_i - 63)^2 = 169 + 4 + 1 + 49 + 25 + 81 + 1 + 25 + 169 + 225 = 749$.
Conclusion: Since $\chi^2 > \chi^2_{(\alpha/2)}$, we conclude that the data provide us evidence
against the null hypothesis $H_0$ and in favor of $H_1$. Hence, $H_0$ is rejected at the 5% level of significance. That is, the variance
of the yield of coconut in the grove cannot be regarded as 20.25.
Example 2
The variation of birth weight (as measured by the variance) of children in a region is expected to
be more than 0.16. The mean birth weight is known to be 2.4 kg. A sample of 11 children
is selected, whose birth weights are as follows.
Weight (in kg): 2.7 2.5 2.6 2.6 2.7 2.5 2.5 2.3 2.4 2.3 2.5
Set up the hypotheses and test the expectation at the 5% level of significance.
Solution
Aim: To test whether the variance of the birth weight of the children is 0.16 or more.
$H_0$: The variance of the birth weight of children in the region is 0.16, i.e., $H_0: \sigma^2 = 0.16$
$H_1$: The variance of the birth weight of children in the region is more than 0.16, i.e., $H_1: \sigma^2 > 0.16$
Level of Significance: $\alpha = 0.05$ and Critical Value: $\chi^2_{0.05,\,11} = 19.675$
Test Statistic:
$$\chi^2 = \frac{\sum_{i=1}^{n} (X_i - \mu)^2}{\sigma_0^2} = \frac{0.32}{0.16} = 2.00$$
Conclusion: Since $\chi^2 < \chi^2_{\alpha}$, we conclude that the data do not provide us any evidence against the
null hypothesis $H_0$. Hence, $H_0$ is accepted at the 5% level of significance. That is, the variance of the birth
weight of children in the region is 0.16.
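When the population mean is known, the statistic $\chi^2 = \sum (X_i - \mu)^2/\sigma_0^2$ can be evaluated directly from the observations. The sketch below applies it to the eleven birth weights of Example 2. The function name is hypothetical, and the critical value 19.675 (the upper 5% point of the $\chi^2$ distribution with 11 degrees of freedom) is hard-coded from standard tables rather than computed. Evaluated on the listed weights, the sum of squared deviations from 2.4 is 0.32, giving $\chi^2 = 2.00$.

```python
def chi_square_known_mean(data, mu, sigma0_sq):
    """Chi-square statistic for H0: variance = sigma0_sq when the population mean mu is known."""
    return sum((x - mu) ** 2 for x in data) / sigma0_sq

weights = [2.7, 2.5, 2.6, 2.6, 2.7, 2.5, 2.5, 2.3, 2.4, 2.3, 2.5]   # data of Example 2
chi2 = chi_square_known_mean(weights, mu=2.4, sigma0_sq=0.16)
critical = 19.675                       # table value for 11 d.f. at the 5% level (right-sided test)
print(round(chi2, 2))                   # 2.0
print("reject H0" if chi2 > critical else "do not reject H0")
```

The degrees of freedom equal n here, not n-1, because the deviations are taken from the known mean rather than from the sample mean.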
EXERCISES
1. A psychologist is aware of studies showing that the mean and variability (measured as variance)
of the attention spans of 5-year-olds can be summarized as 80 and 64 minutes respectively. She wants
to study whether the variability of the attention span of 6-year-olds is different. A sample of 20 6-year-olds
has the following attention spans in minutes: 86 89 84 78 75 74 85 71 84 71 75 68 75 71 82 85 81
78 79 78. State explicit null and alternative hypotheses and test at the 5% level.
2. The average and variance of the daily expenditure of office-going women are known to be Rs.30 and Rs.10
respectively. A sample of 10 office-going women is selected, whose daily expenditure is obtained
as 35 33 40 30 25 28 35 28 35 40. Test whether the variance of the daily expenditure of office-going
women is 10 at the 1% level of significance.
TEST 5
Aim
To test whether the population variance $\sigma^2$ can be regarded as $\sigma_0^2$, based on a random sample. That is, to
investigate the significance of the difference between the assumed population variance $\sigma_0^2$ and the
sample variance $s^2$.
Source
A random sample of n observations $X_i$ (i = 1, 2, ..., n) is drawn from a normal population with
mean $\mu$ and variance $\sigma^2$ (both unknown). The unknown population mean $\mu$ is estimated by its
unbiased estimate $\bar{X}$.
Assumption
The population from which the sample is drawn is normally distributed.
Null Hypothesis
$H_0$: The population variance $\sigma^2$ is $\sigma_0^2$. That is, there is no significant difference between the
assumed population variance $\sigma_0^2$ and the sample variance $s^2$, i.e., $H_0: \sigma^2 = \sigma_0^2$.
Alternative Hypotheses
$H_1(1): \sigma^2 \neq \sigma_0^2$
$H_1(2): \sigma^2 > \sigma_0^2$
$H_1(3): \sigma^2 < \sigma_0^2$
Test Statistic
$$\chi^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma_0^2}$$
The statistic $\chi^2$ follows the $\chi^2$ distribution with (n-1) degrees of freedom.
Example 1
A Statistics Professor conducted an examination for a class of 31 freshmen and sophomores.
The mean score was 72.7 and the sample standard deviation was 15.9. Past experience leads the Professor
to believe that a standard deviation of about 13 points on a 100-point examination indicates that the
exam does a good job. Does this exam meet his goodness criterion at the 10% level?
Solution
Aim: To test whether the examination meets the professor's goodness criterion or not.
$H_0$: The variance of the score on the exam is regarded as $13^2$ (= 169), i.e., $H_0: \sigma^2 = 169$
$H_1$: The variance of the score on the exam is not 169, i.e., $H_1: \sigma^2 \neq 169$
Level of Significance: $\alpha = 0.10$
Critical Values: $\chi^2_{(.95),\,30} = 18.493$ and $\chi^2_{(.05),\,30} = 43.773$
Critical Region: $P(\chi^2 < 18.493) + P(\chi^2 > 43.773) = 0.10$
Test Statistic:
$$\chi^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma_0^2} = \frac{n s^2}{\sigma_0^2} = \frac{31 \times (15.9)^2}{13^2} = 46.37$$
Conclusion: Since $\chi^2 > \chi^2_{(\alpha/2)}$, we conclude that the data provide us evidence against the null
hypothesis $H_0$ and in favor of $H_1$. Hence, $H_1$ is accepted at the 10% level of significance. That is, this
examination does not meet his goodness criterion of a standard deviation of 13.
Example 2
The variation of daily sales in a vegetable mart (as measured by the variance) is reported as Rs.100. A sample of 20 days was
observed, with a variance of Rs.160. Test whether the variance of the sales in the vegetable mart can be
regarded as Rs.100 or not at the 5% level of significance.
Solution
Aim: To test whether the variance of the sales in the vegetable mart can be regarded as Rs.100 or not.
$H_0$: The variance of the sales in the vegetable mart is Rs.100, i.e., $H_0: \sigma^2 = 100$
$H_1$: The variance of the sales in the vegetable mart is not Rs.100, i.e., $H_1: \sigma^2 \neq 100$
Level of Significance: $\alpha = 0.05$
Critical Values: $\chi^2_{(.975),\,19} = 8.907$ and $\chi^2_{(.025),\,19} = 32.852$
Test Statistic:
$$\chi^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma_0^2} = \frac{n s^2}{\sigma_0^2} = \frac{20 \times 160}{100} = \frac{3200}{100} = 32$$
Conclusion: Since $\chi^2_{1-(\alpha/2)} < \chi^2 < \chi^2_{(\alpha/2)}$, we conclude that the data do not provide us any evidence
against the null hypothesis $H_0$. Hence, $H_0$ is accepted at the 5% level of significance. That is, the variance
of the sales in the vegetable mart is Rs.100.
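When only summary figures are reported, the statistic can be evaluated as $n s^2/\sigma_0^2$, where $s^2$ is the sample variance with divisor n, so that $n s^2$ equals the sum of squared deviations from the sample mean. The following sketch reproduces both worked examples under that assumption. The function name is hypothetical, and the critical values are the table values quoted in the examples, not computed.

```python
def chi_square_from_summary(n, sample_var, sigma0_sq):
    """Chi-square statistic n * s^2 / sigma0^2, where s^2 uses the divisor n (assumption)."""
    return n * sample_var / sigma0_sq

# Example 1: n = 31, s = 15.9, hypothesized standard deviation 13.
chi2_exam = chi_square_from_summary(31, 15.9 ** 2, 13 ** 2)
# Example 2: n = 20, s^2 = 160, hypothesized variance 100.
chi2_sales = chi_square_from_summary(20, 160, 100)

print(round(chi2_exam, 2), chi2_sales)            # 46.37 32.0
print(chi2_exam > 43.773)                         # True: outside the acceptance region (30 d.f.)
print(8.907 < chi2_sales < 32.852)                # True: inside the acceptance region (19 d.f.)
```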
EXERCISES
1. A manufacturer claims that the lifetime of a certain brand of batteries produced by his company
has a variance of more than 6800 hours. A sample of 20 batteries selected from the production
department of that company has a variance of 5000 hours. Test the manufacturer's claim at 5%
level.
2. A manufacturer recorded the cut-off bias (volt) of a sample of 10 tubes as follows: 21.9 22.2 22.2
22.1 22.3 21.8 22.0 22.4 22.0 22.1. The variability of cut-off bias for tubes of a standard type, as
measured by the standard deviation, is 0.210 volts. Is the variability of the new tube with respect to
cut-off bias less than that of the standard type at 1% level?
TEST 6
Aim
To test whether the observed frequencies are a good fit to the theoretical frequencies. That is, to
investigate the significance of the difference between the observed frequencies and the expected
frequencies, arranged in K classes.
Source
Let $O_i$ (i = 1, 2, ..., K) be a set of observed frequencies on K classes based on any experiment and
$E_i$ (i = 1, 2, ..., K) be the corresponding set of expected (theoretical or hypothetical) frequencies.
Assumptions
(i) The observed frequencies in the K classes should be independent.
(ii) $\sum_{i=1}^{K} O_i = \sum_{i=1}^{K} E_i = N$.
(iii) The total frequency, N should be sufficiently large (i.e., N > 50).
(iv) Each expected frequency in the K classes should be at least 5.
Null Hypothesis
H0: The observed frequencies are a good fit to the theoretical frequencies. That is, there is
no significant difference between the observed frequencies and the expected frequencies, arranged in
K classes.
Alternative Hypothesis
H1: The observed frequencies are not a good fit to the theoretical frequencies. That is, there
is a significant difference between the observed frequencies and the expected frequencies, arranged in
K classes.
Test Statistic
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}$$
The statistic $\chi^2$ follows the $\chi^2$ distribution with $(K-1)$ degrees of freedom.
Conclusion
If $\chi^2 \leq \chi^2_{\alpha,(K-1)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0 and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0 or accept
H1.
Example 1
The sales of milk from a milk booth are varying from day-to-day. A sample of one-week sales
(Number of Liters) is observed as follows.
Day: Monday Tuesday Wednesday Thursday Friday Saturday Sunday
Sales: 154 145 152 140 135 165 173
Examine whether the sales of milk are the same over the entire week at 1% level of significance.
Solution
Aim: To test whether the sale of milk is the same over the entire week or not.
H0: The sale of milk is the same over the entire week.
H1: The sale of milk is not the same over the entire week.
Level of Significance: $\alpha = 0.01$
Critical value: $\chi^2_{0.01,6} = 16.812$
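Under H0 each expected frequency is $E_i = 1064/7 = 152$.

Day          Observed ($O_i$)   Expected ($E_i$)   $(O_i - E_i)^2$   $(O_i - E_i)^2 / E_i$
Monday       154                152                4                 0.0263
Tuesday      145                152                49                0.3224
Wednesday    152                152                0                 0.0000
Thursday     140                152                144               0.9474
Friday       135                152                289               1.9013
Saturday     165                152                169               1.1118
Sunday       173                152                441               2.9013
Total        1064               1064               1096              7.2105

Test Statistic:
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} = 7.2105$$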
Conclusion: Since $\chi^2 < \chi^2_{\alpha,(K-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0. Hence, H0 is accepted at 1% level of significance. That is, the sales of
milk are the same over the entire week.
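The goodness-of-fit statistic for the milk-sales data can be reproduced as follows. This is an illustrative sketch only: the helper name `chi_square_gof` is hypothetical, and the critical value 16.812 is the tabulated figure quoted in the solution.

```python
def chi_square_gof(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Daily sales, Monday to Sunday, from Example 1.
sales = [154, 145, 152, 140, 135, 165, 173]

# Under H0 the sales are the same every day, so each E_i is the weekly mean.
expected = [sum(sales) / len(sales)] * len(sales)

stat = chi_square_gof(sales, expected)
critical = 16.812  # tabulated chi-square value, alpha = 0.01, 6 d.f.
decision = "reject H0" if stat > critical else "do not reject H0"
print(f"chi-square = {stat:.4f}: {decision}")
```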
Example 2
In an experiment on pea breeding, Mendel obtained the following frequencies of seeds from 560
seeds: 312 round and yellow (RY), 104 wrinkled and yellow (WY), 112 round and green (RG), 32
wrinkled and green (WG). Theory predicts that the frequencies should be in the proportion 9:3:3:1
respectively. Set up the hypothesis and test it at 1% level.
Solution
Aim: To test whether the observed frequencies of the pea breeding are in the ratio 9:3:3:1.
H0: The observed frequencies of the pea breeding are in the ratio 9:3:3:1.
H1: The observed frequencies of the pea breeding are not in the ratio 9:3:3:1.
Level of Significance: $\alpha = 0.01$
Critical value: $\chi^2_{0.01,3} = 11.345$
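Under H0 the expected frequencies are 560 × 9/16, 560 × 3/16, 560 × 3/16 and 560 × 1/16.

Seed type   Observed ($O_i$)   Expected ($E_i$)   $(O_i - E_i)^2$   $(O_i - E_i)^2 / E_i$
RY          312                315                9                 0.0286
WY          104                105                1                 0.0095
RG          112                105                49                0.4667
WG          32                 35                 9                 0.2571
Total       560                560                68                0.7619

Test Statistic:
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} = 0.7619$$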
Conclusion: Since $\chi^2 < \chi^2_{\alpha,(K-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0. Hence, H0 is accepted at 1% level of significance. That is, the observed
frequencies of the pea breeding are in the ratio 9:3:3:1.
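When the theory is stated as a ratio, the expected frequencies must first be derived from it. The sketch below shows one way to do this for the pea-breeding data; both function names are hypothetical helpers introduced for illustration.

```python
def expected_from_ratio(total, ratio):
    """Split a total frequency according to a theoretical ratio such as 9:3:3:1."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_square_gof(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [312, 104, 112, 32]          # RY, WY, RG, WG
expected = expected_from_ratio(sum(observed), [9, 3, 3, 1])

stat = chi_square_gof(observed, expected)
critical = 11.345  # tabulated chi-square value, alpha = 0.01, 3 d.f.
decision = "reject H0" if stat > critical else "do not reject H0"
print(expected)
print(f"chi-square = {stat:.4f}: {decision}")
```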
EXERCISES
1. A chemical extract plant processes seawater to collect sodium chloride and magnesium. It is
known that seawater contains sodium chloride, magnesium and other elements in the ratio of
62:4:34. A sample of 300 tonnes of seawater has resulted in 195 tonnes of sodium chloride
and 9 tonnes of magnesium. Are these data consistent with the known composition of seawater at
10% level?
2. Among 80 offspring of a certain cross between guinea pigs, 42 were red, 16 were black and 22
were white. According to the genetic model, these numbers should be in the ratio 9:3:4. Are these
consistent with the model at 1% level of significance?
TEST 7
Aim
To test whether the two population proportions $P_1$ and $P_2$ are equal, based on two random samples. That
is, to investigate the significance of the difference between the two sample proportions $p_1$ and $p_2$.
Source
From a random sample of $n_1$ observations, $X_1$ observations possess an attribute A, so the
sample proportion $p_1$ is $X_1/n_1$. Let the corresponding proportion in the population be denoted by $P_1$,
which is unknown. From another sample of $n_2$ observations, $X_2$ observations possess the attribute
A, so the sample proportion $p_2$ is $X_2/n_2$. Let the corresponding proportion in the population be denoted
by $P_2$, which is unknown.
Assumption
The sample sizes of the two samples are sufficiently large (i.e., $n_1, n_2 \geq 30$) to justify the normal
approximation to the binomial.
Null Hypothesis
H0: The two population proportions $P_1$ and $P_2$ are equal. That is, there is no significant difference
between the two sample proportions $p_1$ and $p_2$. i.e., H0: $P_1 = P_2$.
Alternative Hypotheses
H1(1) : $P_1 \neq P_2$
H1(2) : $P_1 > P_2$
H1(3) : $P_1 < P_2$
Test Statistic
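$$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\hat{P}(1 - \hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \quad \text{(Under H0: } P_1 = P_2\text{)}$$
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$
The statistic Z follows Standard Normal distribution.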
Example 1
Random samples of 300 male and 400 female students were asked whether they would like the
CBCS system to be introduced in their university. 160 male and 230 female students were in favor of the proposal. Test the
hypothesis that the proportions of male and female students in favor of the proposal are equal, at 2% level.
Solution
Aim: To test whether the proportions of male and female students in favour of introducing the CBCS
system in their university are equal or not.
H0: The proportions of male ($P_1$) and female ($P_2$) students in favour of the proposal of
introducing CBCS system in their university are equal. i.e., H0: $P_1 = P_2$.
H1: The proportions of male and female students in favour of the proposal of introducing
CBCS system in their university are not equal. i.e., H1: $P_1 \neq P_2$
Level of Significance: $\alpha = 0.02$ and Critical Value: $Z_{\alpha/2} = 2.33$
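Based on the data, we observed that $n_1 = 300$, $p_1 = \frac{160}{300} = 0.53$, $n_2 = 400$, $p_2 = \frac{230}{400} = 0.58$
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{(300 \times 0.53) + (400 \times 0.58)}{300 + 400} = 0.56$$
Test Statistic:
$$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\hat{P}(1 - \hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \quad \text{(Under H0: } P_1 = P_2\text{)}$$
$$Z = \frac{0.53 - 0.58}{\sqrt{0.56 \times 0.44 \left(\frac{1}{300} + \frac{1}{400}\right)}} = -1.32$$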
Conclusion: Since $|Z| < Z_{\alpha/2}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0 and hence it is accepted at 2% level of significance. That is, the proportions of male
and female students in favour of the proposal of introducing CBCS system in their university are equal.
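The two-proportion statistic can also be evaluated directly from the counts. The sketch below is an illustration under stated assumptions: `two_proportion_z` is a hypothetical helper, and the critical value is obtained from the standard library's `statistics.NormalDist` instead of a printed table. Because the code works with unrounded proportions, its Z (about -1.10) differs from a hand calculation that rounds $p_1$, $p_2$ and the pooled proportion to two decimals, but the decision at the 2% level is the same.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: P1 = P2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # equals (n1*p1 + n2*p2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


alpha = 0.02
z = two_proportion_z(160, 300, 230, 400)       # counts from Example 1
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
decision = "reject H0" if abs(z) > z_crit else "do not reject H0"
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}: {decision}")
```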
Example 2
From a random sample of 1000 children selected from rural areas of a district in Tamilnadu, it is
found that five are affected by polio. In another sample of 1500 children from urban areas of the same district,
three are affected. Will it be reasonable to claim that the proportion of polio-affected children in
rural areas is more than in urban areas at 1% level?
Solution
Aim: To test whether the proportion of polio-affected children in rural areas is the same as in urban areas or more
than in urban areas.
H0: The proportions of polio-affected children in rural ($P_1$) and urban ($P_2$) areas are equal. i.e.,
H0: $P_1 = P_2$.
H1: The proportion of polio-affected children in rural areas is more than in urban areas. i.e.,
H1: $P_1 > P_2$.
Level of Significance: $\alpha = 0.01$ and Critical Value: $Z_{\alpha} = 2.33$
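Based on the data, we observed that $n_1 = 1000$, $p_1 = \frac{5}{1000} = 0.005$, $n_2 = 1500$, $p_2 = \frac{3}{1500} = 0.002$
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{(1000 \times 0.005) + (1500 \times 0.002)}{1000 + 1500} = 0.0032$$
Test Statistic:
$$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\hat{P}(1 - \hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \quad \text{(Under H0: } P_1 = P_2\text{)}$$
$$Z = \frac{0.005 - 0.002}{\sqrt{0.0032 \times 0.9968 \left(\frac{1}{1000} + \frac{1}{1500}\right)}} = 1.30$$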
Conclusion: Since $Z < Z_{\alpha}$, we conclude that the data do not provide us any evidence against
the null hypothesis H0 and hence it is accepted at 1% level of significance. That is, the proportions of
polio-affected children in rural and urban areas are equal.
EXERCISES
1. From a sample of 300 pregnancies in city-A in a year, 163 births are females. From another sample of 250
pregnancies in city-B in the same year, 132 births are females. Test whether the proportions of female births in
both cities are equal at 1% level of significance.
2. In a sample of 500 persons selected from a city in Tamilnadu, 210 are tea drinkers. In another
sample of 300 persons from a city in Kerala, 160 persons are tea drinkers. Test the hypothesis that
the proportion of tea drinkers in Tamilnadu is less than that in Kerala at 10% level.
TEST 8
Aim
To test whether the two population means are equal, based on two random samples. That is, to investigate
the significance of the difference between the two sample means $\bar{X}_1$ and $\bar{X}_2$.
Source
A random sample of $n_1$ observations with mean $\bar{X}_1$ is drawn from a population with unknown
mean $\mu_1$. A random sample of $n_2$ observations with mean $\bar{X}_2$ is drawn from another population
with unknown mean $\mu_2$.
Assumptions
(i) The populations from which the two samples are drawn are assumed to be Normal distributions.
(ii) The two population variances are equal and known, and their common value is denoted by $\sigma^2$.
Null Hypothesis
H0: The two population means $\mu_1$ and $\mu_2$ are equal. That is, there is no significant difference
between the two sample means $\bar{X}_1$ and $\bar{X}_2$.
i.e., H0: $\mu_1 = \mu_2$
Alternative Hypotheses
H1(1) : $\mu_1 \neq \mu_2$
H1(2) : $\mu_1 > \mu_2$
H1(3) : $\mu_1 < \mu_2$
Test Statistic
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
Example 1
TVS Company wanted to compare the mileage of its two wheelers with that of other brands. A
random sample of 125 TVS two wheelers gave an average mileage of 90 km. A random sample of 150 two wheelers of
all other brands gave an average mileage of 80 km. It is known that the standard deviation of mileage for both TVS Company
and all other brands was 12 km. At 5% level of significance, do TVS vehicles give a better mileage?
Solution
Aim: To test whether the average mileage of TVS two-wheelers is equal to or more than that of other brands.
H0: The average mileage of TVS two-wheelers ($\mu_1$) and all other brands ($\mu_2$) are equal. i.e.,
H0: $\mu_1 = \mu_2$.
H1: The average mileage of TVS two-wheelers is more than that of all other brands. i.e.,
H1: $\mu_1 > \mu_2$.
Level of Significance: $\alpha = 0.05$ and Critical Value: $Z_{\alpha} = 1.645$.
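Test Statistic:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$= \frac{90 - 80}{12\sqrt{\frac{1}{125} + \frac{1}{150}}} = 6.88$$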
Conclusion: Since the observed value of the test statistic, Z = 6.88, is larger than the critical value
1.645 at 5% level of significance, the data provide us evidence against the null hypothesis H0 and in
favor of H1. Hence, H1 is accepted and it is concluded that the average mileage of TVS two wheelers is
more than that of all other brands.
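A short Python sketch of this calculation is given below. The function name is hypothetical, and the one-tailed critical value is taken from `statistics.NormalDist` in the standard library rather than from a table.

```python
import math
from statistics import NormalDist


def two_mean_z_common_sigma(mean1, n1, mean2, n2, sigma):
    """Z statistic for H0: mu1 = mu2 when both populations share a known sigma."""
    return (mean1 - mean2) / (sigma * math.sqrt(1 / n1 + 1 / n2))


alpha = 0.05
z = two_mean_z_common_sigma(90, 125, 80, 150, 12)  # figures from Example 1
z_crit = NormalDist().inv_cdf(1 - alpha)           # right-tailed critical value
decision = "reject H0" if z > z_crit else "do not reject H0"
print(f"Z = {z:.2f}, critical value = {z_crit:.3f}: {decision}")
```

For a two-tailed alternative, the comparison would instead use `abs(z)` against `NormalDist().inv_cdf(1 - alpha / 2)`.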
Example 2
A random sample of 1000 persons from Chennai city has an average height of 67 inches and
another random sample of 1200 persons from Mumbai city has an average height of 68 inches. Can
the average heights of persons from both cities be regarded as equal, given a common standard
deviation of 5 inches? Test at 2% level of significance.
Solution
Aim: To test whether the average heights of persons from the cities Chennai and Mumbai are equal or not.
H0: The average heights of persons from the cities Chennai ($\mu_1$) and Mumbai ($\mu_2$) are equal. i.e.,
H0: $\mu_1 = \mu_2$.
H1: The average heights of persons from the cities Chennai and Mumbai are not equal. i.e.,
H1: $\mu_1 \neq \mu_2$.
Level of Significance: $\alpha = 0.02$ and Critical Value: $Z_{\alpha/2} = 2.33$
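Test Statistic:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$= \frac{67 - 68}{5\sqrt{\frac{1}{1000} + \frac{1}{1200}}} = -4.67$$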
Conclusion: Since the observed value of the test statistic, |Z| = 4.67, is larger than the critical value
2.33 at 2% level of significance, the data provide us evidence against the null hypothesis H0 and in
favor of H1. Hence, H1 is accepted and it is concluded that the average heights of persons from the cities
Chennai ($\mu_1$) and Mumbai ($\mu_2$) are not equal.
EXERCISES
1. A sample of 100 households from Chidambaram has an average monthly income of Rs. 6000 and
a sample of 125 households from Cuddalore has an average of Rs. 5400. It is known that the standard deviation of
monthly income in those two places is Rs. 500. Is it reasonable to say that the average monthly
income of Chidambaram is more than that of Cuddalore at 10% level?
2. Two research laboratories have independently produced drugs that provide relief to arthritis
sufferers. The first drug was tested on a group of 85 arthritis sufferers, producing an average of 6.8
hours of relief. The second drug was tested on 95 arthritis sufferers, producing an average of 7.2
hours of relief. The standard deviation of hours of relief is the same for both drugs and equals 2
hours. At 1% level of significance, does the first drug provide a significantly shorter period of
relief?
TEST 9
Aim
To test whether the two population means are equal, based on two random samples. That is, to investigate
the significance of the difference between the two sample means $\bar{X}_1$ and $\bar{X}_2$.
Source
A random sample of $n_1$ observations with mean $\bar{X}_1$ is drawn from a population with unknown
mean $\mu_1$ and known variance $\sigma_1^2$. A random sample of $n_2$ observations with mean $\bar{X}_2$ is drawn
from another population with unknown mean $\mu_2$ and known variance $\sigma_2^2$.
Assumptions
(i) The populations from which the two samples are drawn are Normal distributions.
(ii) The population variances $\sigma_1^2$ and $\sigma_2^2$ are known.
Null Hypothesis
H0: The two population means $\mu_1$ and $\mu_2$ are equal. That is, there is no significant difference
between the two sample means $\bar{X}_1$ and $\bar{X}_2$.
i.e., H0: $\mu_1 = \mu_2$
Alternative Hypotheses
H1(1) : $\mu_1 \neq \mu_2$
H1(2) : $\mu_1 > \mu_2$
H1(3) : $\mu_1 < \mu_2$
Test Statistic
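$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
When the samples are large and the population variances are replaced by the sample variances $s_1^2$ and $s_2^2$, the statistic becomes
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$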
Example 1
The average daily wage of a sample of 140 workers in Factory-A was Rs. 120 with a standard
deviation of Rs. 15. The average daily wage of a sample of 190 workers in Factory-B was Rs. 125 with
a standard deviation of Rs. 20. Can we conclude that the daily wages paid by Factory-A are lower than
those paid by Factory-B at 5% level?
Solution
Aim: To test whether the average daily wage of Factory-A is equal to or less than that of Factory-B.
H0: The average daily wages of Factory-A ($\mu_1$) and Factory-B ($\mu_2$) are equal. i.e., H0: $\mu_1 = \mu_2$
H1: The average daily wage of Factory-A is less than that of Factory-B. i.e., H1: $\mu_1 < \mu_2$
Level of Significance: $\alpha = 0.05$ and Critical Value: $Z_{\alpha} = 1.645$
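Test Statistic:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$= \frac{120 - 125}{\sqrt{\frac{(15)^2}{140} + \frac{(20)^2}{190}}} = -2.60$$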
Conclusion: Since |Z| = 2.60 is larger than the critical value 1.645 at 5% level of significance, the data provide
us evidence against the null hypothesis H0 and in favor of H1. Hence H1 is accepted and it is concluded that
the average daily wage of Factory-A is less than that of Factory-B.
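The calculation with separate standard deviations can be sketched in Python as follows. The function name is a hypothetical helper; the left-tailed critical value comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def two_mean_z_unequal_sd(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for H0: mu1 = mu2 with separate standard deviations."""
    std_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / std_error


alpha = 0.05
# Factory-A: mean 120, s.d. 15, n = 140; Factory-B: mean 125, s.d. 20, n = 190.
z = two_mean_z_unequal_sd(120, 15, 140, 125, 20, 190)
z_crit = NormalDist().inv_cdf(alpha)  # left-tailed critical value (negative)
decision = "reject H0" if z < z_crit else "do not reject H0"
print(f"Z = {z:.2f}, critical value = {z_crit:.3f}: {decision}")
```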
Example 2
In a survey of buying habits, 390 women shoppers are chosen at random in super market-A
located at Calcutta. Their average weekly food expenditure is Rs. 500 with a standard deviation of
Rs. 60. From a random sample of 240 women shoppers chosen from super market-B of the same city,
the average weekly food expenditure is Rs. 520 with a standard deviation of Rs. 75. Can we agree that
the average weekly food expenditure of the women shoppers from two super markets is equal at 2%
level?
Solution
Aim: To test whether the average weekly food expenditures of women shoppers from the two super markets A
and B are equal or not.
H0: The average weekly food expenditures of women shoppers from super market-A ($\mu_1$) and
super market-B ($\mu_2$) are equal. i.e., H0: $\mu_1 = \mu_2$.
H1: The average weekly food expenditures of women shoppers from super market-A and super
market-B are not equal. i.e., H1: $\mu_1 \neq \mu_2$
Level of Significance: $\alpha = 0.02$ and Critical Value: $Z_{\alpha/2} = 2.33$
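Test Statistic:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$= \frac{500 - 520}{\sqrt{\frac{(60)^2}{390} + \frac{(75)^2}{240}}} = -3.50$$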
Conclusion: Since the observed value of the test statistic, |Z| = 3.50, is larger than the critical
value 2.33 at 2% level of significance, the data provide us evidence against the null hypothesis H0 and
in favor of H1. Hence, H1 is accepted and it is concluded that the average weekly food expenditures of
women shoppers from the two super markets A and B are not equal.
EXERCISES
1. Suppose that the number of hours spent watching television in a day by middle-aged
women is normally distributed with a standard deviation of 30 minutes in urban areas and 45 minutes
in rural areas. From a sample of 75 women in urban areas and 100 women in rural areas, the average
number of hours spent by them watching television is 6 hours and 7 hours respectively per
day. Can you claim that the average number of hours spent by middle-aged women in rural and
urban areas is equal at 1% level?
2. The marks obtained by students from Public schools and Matriculation schools in a city are
normally distributed with standard deviations of 12 and 15 marks respectively. A random sample
of 60 students from Public schools has a mean mark of 84 and a random sample of 80 students from Matriculation
schools has an average of 90 marks. Can we claim that the students of Public schools get lower marks
than those of Matriculation schools at 1% level?
TEST 10
Aim
To test the null hypothesis that the means of the two populations are equal, based on two random
samples. That is, to investigate the significance of the difference between the two sample means $\bar{X}_1$
and $\bar{X}_2$.
Source
Let a random sample of $n_1$ observations $X_{1i}$ (i = 1, 2, ..., $n_1$) be drawn from a population with
unknown mean $\mu_1$. Let a random sample of $n_2$ observations $X_{2j}$ (j = 1, 2, ..., $n_2$) be drawn from another
population with unknown mean $\mu_2$.
Assumptions
(i) The populations from which the two samples are drawn are Normal distributions.
(ii) The two population variances are equal and unknown, and their common value is denoted by $\sigma^2$ (since $\sigma^2$ is
unknown, it is replaced by its unbiased estimate $S^2$).
Null Hypothesis
H0: The two population means $\mu_1$ and $\mu_2$ are equal. That is, there is no significant difference
between the two sample means $\bar{X}_1$ and $\bar{X}_2$.
i.e., H0: $\mu_1 = \mu_2$
Alternative Hypotheses
H1(1) : $\mu_1 \neq \mu_2$
H1(2) : $\mu_1 > \mu_2$
H1(3) : $\mu_1 < \mu_2$
Test Statistic
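$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$\bar{X}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} X_{1i}, \quad \bar{X}_2 = \frac{1}{n_2}\sum_{j=1}^{n_2} X_{2j} \quad \text{and} \quad S^2 = \frac{\sum_{i=1}^{n_1}(X_{1i} - \bar{X}_1)^2 + \sum_{j=1}^{n_2}(X_{2j} - \bar{X}_2)^2}{n_1 + n_2 - 2}.$$
The statistic t follows t distribution with $(n_1 + n_2 - 2)$ degrees of freedom.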
Example 1
The gain in weight of two random samples of chicks on two different diets A and B are given
below. Examine whether the difference in mean increases in weight is significant.
Diet A: 2.5 2.25 2.35 2.60 2.10 2.45 2.5 2.1 2.2
Diet B: 2.45 2.50 2.60 2.77 2.60 2.55 2.65 2.75 2.45 2.50
Solution
Aim: To test whether the mean increases in weight by diet-A ($\mu_1$) and diet-B ($\mu_2$) are equal or not.
H0: The mean increases in weight by both diets are equal. i.e., H0: $\mu_1 = \mu_2$
H1: The mean increases in weight by both diets are not equal. i.e., H1: $\mu_1 \neq \mu_2$
Level of significance: $\alpha = 0.05$ (say) and Critical value: $t_{0.05}$ for 17 d.f. = 2.11
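Test Statistic:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$= \frac{2.34 - 2.58}{0.16\sqrt{\frac{1}{9} + \frac{1}{10}}} = -3.26$$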
Conclusion: Since $|t| > t_{\alpha}$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 5% level of significance. That is, the mean
increases in weight by the two diets A and B are not equal.
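The pooled-variance t statistic can be computed directly from the raw observations. The following sketch uses a hypothetical helper `pooled_t` and the diet data of Example 1. Because it keeps full precision for the means and for S, the value it returns (about -3.48) can differ from a hand calculation based on rounded intermediate figures, but |t| exceeds the tabulated 2.11 in this example.

```python
import math
from statistics import mean


def pooled_t(sample1, sample2):
    """Return (t, S, degrees of freedom) for H0: mu1 = mu2 with a pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)  # sum of squares about the mean
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    s = math.sqrt((ss1 + ss2) / df)            # pooled estimate of sigma
    t = (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, s, df


diet_a = [2.5, 2.25, 2.35, 2.60, 2.10, 2.45, 2.5, 2.1, 2.2]
diet_b = [2.45, 2.50, 2.60, 2.77, 2.60, 2.55, 2.65, 2.75, 2.45, 2.50]

t, s, df = pooled_t(diet_a, diet_b)
critical = 2.11  # tabulated two-tailed t value, alpha = 0.05, 17 d.f.
decision = "reject H0" if abs(t) > critical else "do not reject H0"
print(f"t = {t:.2f}, S = {s:.3f}, d.f. = {df}: {decision}")
```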
Example 2
A researcher is interested in knowing whether the performance in a public examination of students
from schools in the Tsunami affected area is poorer than that of other students. A random sample
of 10 students from coastal area schools is selected, whose marks are: 68 72 64 65 56 72
64 56 60 73. Another sample of 8 students from non-coastal area schools has the following marks: 76
78 68 72 83 85 88 78. Test the hypothesis at 1% level.
Solution
Aim: To test whether the performance in a public examination of students of schools from the Tsunami
affected area is equal to or less than that of other students.
H0: The performance in a public examination of students of schools from the Tsunami affected area
($\mu_1$) is equal to that of other students ($\mu_2$). i.e., H0: $\mu_1 = \mu_2$
H1: The performance in a public examination of students of schools from the Tsunami affected area
is less than that of other students. i.e., H1: $\mu_1 < \mu_2$
Level of Significance: $\alpha = 0.01$ and Critical value: $t_{0.01}$ for 16 d.f. = 2.58
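Test Statistic:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{(Under H0: } \mu_1 = \mu_2\text{)}$$
$$= \frac{65 - 78.5}{6.88\sqrt{\frac{1}{10} + \frac{1}{8}}} = -4.13$$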
Conclusion: Since $|t| > |t_{\alpha}|$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 1% level of significance. That is, the
performance in a public examination of students of schools from the Tsunami affected area is less than
that of other students.
EXERCISES
1. A paper company produces covers on two machines. The average
number of items produced by the two machines per hour is 250 and 280, with standard deviations 16
and 20 respectively, based on records of 50 hours of production. Can we expect that the two machines
are equally efficient at 10% level of significance?
2. The yields of two varieties of brinjal on two independent samples of 10 and 12 plants are given
below. Test whether the yield of Variety-A is more than that of Variety-B at 2% level of significance.
Variety-A: 18 15 16 20 22 20 23 18 20 25
Variety-B: 12 14 16 13 16 20 22 24
TEST 11
Aim
To test whether the treatment applied is effective or not, based on a random sample. That is, to investigate
the significance of the difference between before and after the treatment in the sample.
Source
Let $X_i$ (i = 1, 2, ..., n) be the observations made initially from n individuals as a random sample of
size n. A treatment is applied to the above individuals and observations are made after the treatment and
are denoted by $Y_i$ (i = 1, 2, ..., n). That is, $(X_i, Y_i)$ denotes the pair of observations obtained from the
ith individual, before and after the treatment applied. Let $\mu_X$ be the unknown population mean before the
treatment and $\mu_Y$ be the unknown population mean after the treatment.
Assumptions
(i) The observations for the two samples must be obtained in pairs.
(ii) The population from which the sample is drawn is normal.
Null Hypothesis
H0: The treatment applied is ineffective. That is, there is no significant difference between before
and after the treatment applied.
i.e., H0: $\mu_d = \mu_X - \mu_Y = 0$.
Alternative Hypotheses
H1(1) : $\mu_d \neq 0$
H1(2) : $\mu_d > 0$
H1(3) : $\mu_d < 0$
Test Statistic
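$$t = \frac{\bar{d} - \mu_d}{S_d/\sqrt{n}} \quad \text{(Under H0: } \mu_d = 0\text{)}$$
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}, \quad d_i = X_i - Y_i, \quad S_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2$$
The statistic t follows t distribution with $(n-1)$ degrees of freedom.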
Example 1
A health spa has advertised a weight-reducing program and has claimed that the average participant
in the program loses more than 5 kgs. A random sample of 10 participants has the following weights
before and after the program. Test its claim at 5% level of significance.
Weights before: 80 78 75 86 90 87 95 78 86 90
Weights after: 76 75 70 80 84 83 91 72 83 83
Solution
Aim: To test whether the claim of the health spa, that the average weight reduction is more than five kgs, is true.
H0: The average weight reduction is only 5 kgs. i.e., H0: $\mu_d = \mu_X - \mu_Y = 5$
H1: The average weight reduction is more than 5 kgs. i.e., H1: $\mu_d > 5$.
Level of Significance: $\alpha = 0.05$ and Critical value: $t_{0.05,9} = 1.83$
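The differences $d_i$ = (weight before) - (weight after) are 4, 3, 5, 6, 6, 4, 4, 6, 3, 7, so that $\bar{d} = 4.8$ and $S_d = 1.40$.
Test Statistic:
$$t = \frac{\bar{d} - \mu_d}{S_d/\sqrt{n}} \quad \text{(Under H0: } \mu_d = 5\text{)}$$
$$= \frac{4.8 - 5}{1.40/\sqrt{10}} = -0.45$$
Conclusion: Since $t < t_{\alpha}$, we conclude that the data do not provide us any evidence against the null hypothesis
H0. Hence, H0 is accepted at 5% level of significance. That is, the data do not support the claim that
the average weight reduction is more than 5 kgs.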
Example 2
A manufacturer claims that infants will attain a significant gain in weight if they are given a new
variety of health drink marketed by him. A sample of 10 babies was selected and given the above
diet for a month, and the weights were observed before (A) and after (B) the diet was given. Examine
whether the claim of the manufacturer is true at 2% level of significance.
A : 3.50 3.75 3.65 4.10 3.65 3.55 3.60 4.20 3.80 3.50
B : 3.80 4.20 3.90 4.50 3.75 4.20 3.60 4.35 4.20 3.40
Solution
Aim: To test whether the claim of the manufacturer, that the new variety of health drink will
promote weight gain, is true or not.
H0: The claim of the manufacturer that the new variety of health drink will promote
weight gain is not true. i.e., H0: $\mu_d = 0$.
H1: The claim of the manufacturer that the new variety of health drink will promote
weight gain is true. i.e., H1: $\mu_d \neq 0$.
Level of Significance: $\alpha = 0.02$ and Critical value: $t_{0.02,9} = 2.82$
Test Statistic:
$$t = \frac{\bar{d} - \mu_d}{S_d/\sqrt{n}} \quad \text{(Under H0: } \mu_d = 0\text{)}$$
$$= \frac{0.26}{0.24/\sqrt{10}} = 3.43$$
Conclusion: Since $|t| > t_{\alpha}$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 2% level of significance. That is, the claim
of the manufacturer that the new variety of health drink will promote weight gain is true.
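The paired t statistic can be checked from the raw before and after weights. The sketch below is illustrative: `paired_t` is a hypothetical helper, its `mu0` parameter is the hypothesised mean difference (0 here), and differences are taken as after minus before so that a gain is positive. Working at full precision gives $S_d$ of about 0.227 and t of about 3.62, slightly different from the rounded hand calculation, with the same decision against the tabulated 2.82.

```python
import math
from statistics import mean, stdev


def paired_t(before, after, mu0=0.0):
    """Return (t, d_bar, S_d) for paired observations; d = after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation, divisor n - 1
    t = (d_bar - mu0) / (s_d / math.sqrt(len(diffs)))
    return t, d_bar, s_d


weights_a = [3.50, 3.75, 3.65, 4.10, 3.65, 3.55, 3.60, 4.20, 3.80, 3.50]  # before
weights_b = [3.80, 4.20, 3.90, 4.50, 3.75, 4.20, 3.60, 4.35, 4.20, 3.40]  # after

t, d_bar, s_d = paired_t(weights_a, weights_b)
critical = 2.82  # tabulated t value quoted in the solution, 9 d.f.
decision = "reject H0" if abs(t) > critical else "do not reject H0"
print(f"d_bar = {d_bar:.2f}, S_d = {s_d:.3f}, t = {t:.2f}: {decision}")
```

Passing a non-zero `mu0` (for example `mu0=5`) tests a claim about a specific mean difference instead of no difference.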
EXERCISES
1. The following data show the additional hours of sleep gained by 15 patients in an experiment to
test the effect of a drug. Do these data show evidence that the drug produces additional
hours of sleep at 2% level?
Hours gained : 2.5 3.0 2.25 3.25 1.75 1.5 2.5 2.25 3.0 3.25 3.0 2.5 2.75 3.25 3.75.
2. A coaching centre for the civil service examination claims that its coaching produces a significant
improvement in the scores obtained by students. A random sample of 12 students was selected.
They were examined before and after the coaching, and their scores are given below. Test
the claim of the coaching centre at 1% level of significance.
Student: 1 2 3 4 5 6 7 8 9 10 11 12
Score Before Coaching : 68 72 74 67 79 78 82 78 77 77 80 78
Score After Coaching : 78 75 78 80 80 85 80 75 90 92 95 90
TEST 12
Aim
To test whether the standard deviations of the two populations, $\sigma_1$ and $\sigma_2$, are equal, based on two random
samples. That is, to investigate the significance of the difference between the two sample standard
deviations $s_1$ and $s_2$.
Source
A random sample of $n_1$ observations is drawn from a population whose mean $\mu_1$ and standard
deviation $\sigma_1$ are unknown. A random sample of $n_2$ observations is drawn from another population
whose mean $\mu_2$ and standard deviation $\sigma_2$ are unknown. Let $s_1$ and $s_2$ be the sample standard deviations of
the respective samples.
Assumptions
(i) The two samples are independently drawn from two normal populations.
(ii) The sample sizes are sufficiently large.
(iii) Since the population standard deviations $\sigma_1$ and $\sigma_2$ are unknown, they are replaced by their
estimates $s_1$ and $s_2$.
Null Hypothesis
H0: The two population standard deviations $\sigma_1$ and $\sigma_2$ are equal. That is, there is no significant
difference between the two sample standard deviations $s_1$ and $s_2$. i.e., H0: $\sigma_1 = \sigma_2$.
Alternative Hypotheses
H1(1) : $\sigma_1 \neq \sigma_2$
H1(2) : $\sigma_1 > \sigma_2$
H1(3) : $\sigma_1 < \sigma_2$
Test Statistic
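$$Z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}}$$
$$s_1 = \sqrt{\frac{1}{n_1}\sum_{i=1}^{n_1} X_i^2 - (\bar{X})^2}, \quad s_2 = \sqrt{\frac{1}{n_2}\sum_{i=1}^{n_2} Y_i^2 - (\bar{Y})^2}$$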
Example 1
Two types of rods are manufactured by an industry for a specific task.
A random sample of 50 items of rod-1 has a standard deviation 0.85 and a sample of 80 items of
rod-2 has a standard deviation 0.72. Test whether the two types of rods are equal in their variation of
specifications at 5% level of significance.
Solution
Aim: To test whether the two types of rods are equal in their variation of specifications or not.
H0: The two types of rods are equal in their variation of specifications. i.e., H0: $\sigma_1 = \sigma_2$
H1: The two types of rods are not equal in their variation of specifications. i.e., H1: $\sigma_1 \neq \sigma_2$
Level of Significance: $\alpha = 0.05$ and Critical value: $Z_{\alpha/2} = 1.96$
Test Statistic:
$$Z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}} = \frac{0.85 - 0.72}{\sqrt{\frac{0.85^2}{2 \times 50} + \frac{0.72^2}{2 \times 80}}} = 1.27$$
Conclusion: Since the observed value of the test statistic, |Z| = 1.27, is smaller than the critical
value 1.96 at 5% level of significance, the data do not provide us evidence against the null hypothesis
H0. Hence, H0 is accepted and it is concluded that the two types of rods are equal in their variation of
specifications.
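A brief Python sketch of the large-sample comparison of two standard deviations follows. The function name is hypothetical; the two-tailed critical value is taken from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def two_sd_z(s1, n1, s2, n2):
    """Large-sample Z statistic for H0: sigma1 = sigma2."""
    std_error = math.sqrt(s1 ** 2 / (2 * n1) + s2 ** 2 / (2 * n2))
    return (s1 - s2) / std_error


alpha = 0.05
z = two_sd_z(0.85, 50, 0.72, 80)               # rod-1 and rod-2 from Example 1
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
decision = "reject H0" if abs(z) > z_crit else "do not reject H0"
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}: {decision}")
```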
Example 2
In a random sample of 100 students from a private school, the standard deviation of marks in a
competitive examination is 12.35. In another sample of 150 students from a government school, the
standard deviation of marks in the same examination is 10.25. Test whether the standard deviations of
marks in the two schools are equal at 5% level of significance.
Solution
Aim: To test whether the standard deviations of marks in a competitive examination in the two schools are equal
or not.
H0: The standard deviations of marks in a competitive examination in the two schools are equal. i.e.,
H0: $\sigma_1 = \sigma_2$
H1: The standard deviations of marks in a competitive examination in the two schools are not equal.
i.e., H1: $\sigma_1 \neq \sigma_2$
Level of Significance: $\alpha = 0.05$ and Critical value: $Z_{\alpha/2} = 1.96$
Test Statistic:
$$Z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}} = \frac{12.35 - 10.25}{\sqrt{\frac{(12.35)^2}{2 \times 100} + \frac{(10.25)^2}{2 \times 150}}} = 1.99$$
Conclusion: Since the observed value of the test statistic, |Z| = 1.99, is greater than the critical
value 1.96 at 5% level of significance, the data provide us evidence against the null hypothesis H0 and
in favor of H1. Hence, H1 is accepted and it is concluded that the standard deviations of marks in a competitive
examination in the two schools are not equal.
EXERCISES
1. A random sample of 1500 adult males is selected from France, whose mean height (in inches) is 72.25
with a standard deviation of 6.5. Another sample of 1200 adult males is selected from Japan, whose
mean height (in inches) is 58.75 with a standard deviation of 7.25. Examine whether the standard
deviations of heights of adult males in the two countries are equal or not.
2. A large organization produces electric bulbs in each of its two factories. It is suspected that the efficiency
in the two factories is not the same, so a test is carried out by ascertaining the variability of the life of the
bulbs produced by each factory. The data are as follows:

                                 Factory-A    Factory-B
Number of bulbs in the sample    150          250
Average life                     1200 hrs     950 hrs
Standard deviation               250 hrs      200 hrs

Based on the above data, determine whether the difference between the variability of life of bulbs
from each sample is significant at 1 percent level of significance.
TEST 13
Aim
To test whether the variances of the two populations are equal, based on two random samples. That is, to
investigate the significance of the difference between the two sample variances.
Source
Let $X_{1i}$ (i = 1, 2, ..., $n_1$) be a random sample of $n_1$ observations drawn from a population with
unknown variance $\sigma_1^2$. Let $X_{2j}$ (j = 1, 2, ..., $n_2$) be a random sample of $n_2$ observations drawn from
another population with unknown variance $\sigma_2^2$.
Assumption
The populations from which the samples are drawn are normal distributions.
Null Hypothesis
H0: The two population variances $\sigma_1^2$ and $\sigma_2^2$ are equal. That is, there is no significant difference
between the two sample variances $s_1^2$ and $s_2^2$. i.e., H0: $\sigma_1^2 = \sigma_2^2$.
Alternative Hypotheses
H1(1) : $\sigma_1^2 \neq \sigma_2^2$
H1(2) : $\sigma_1^2 > \sigma_2^2$
H1(3) : $\sigma_1^2 < \sigma_2^2$
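The critical region for right tailed test is $F > F_{\alpha,(n_1-1,\,n_2-1)}$, for left tailed test is $F < F_{(1-\alpha),(n_1-1,\,n_2-1)}$ and for two tailed test is $F > F_{(\alpha/2),(n_1-1,\,n_2-1)}$ and
$F < F_{(1-\alpha/2),(n_1-1,\,n_2-1)}$. We have the following reciprocal relation between the upper and lower significant
points of F-distribution:
$$F_{\alpha}(n_1, n_2) = \frac{1}{F_{1-\alpha}(n_2, n_1)}, \quad \text{i.e.,} \quad F_{\alpha}(n_1, n_2) \cdot F_{1-\alpha}(n_2, n_1) = 1.$$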
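Critical Regions
[Figure: critical regions of the F distribution. Two tailed test: area $\alpha/2$ in each tail, bounded by $F_{(1-\alpha/2),(n_1-1,\,n_2-1)}$ and $F_{(\alpha/2),(n_1-1,\,n_2-1)}$. Right tailed test: region beyond $F_{\alpha,(n_1-1,\,n_2-1)}$. Left tailed test: region below $F_{(1-\alpha),(n_1-1,\,n_2-1)}$.]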
Test Statistic
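$$F = \frac{S_1^2}{S_2^2}$$
$$\bar{X}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} X_{1i}, \quad \bar{X}_2 = \frac{1}{n_2}\sum_{j=1}^{n_2} X_{2j},$$
$$S_1^2 = \frac{\sum_{i=1}^{n_1}(X_{1i} - \bar{X}_1)^2}{n_1 - 1}, \quad S_2^2 = \frac{\sum_{j=1}^{n_2}(X_{2j} - \bar{X}_2)^2}{n_2 - 1}$$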
Conclusions
Example 1
A quality control supervisor for an automobile manufacturer is concerned with uniformity in the
number of defects in cars coming off the assembly line. If one assembly line has significantly more
variability in the number of defects, then changes have to be made. The supervisor has obtained the
following data.
Number of Defects
Assembly Line-A Assembly Line-B
Mean 12 14
Variance 20 13
Sample size 16 20
Does assembly line A have significantly more variability in the number of defects? Test at 5%
level of significance.
Solution
Aim: To test whether assembly line A has significantly more variability than assembly line B in the
number of defects.
H0: There is no significant difference in variability between assembly line A and assembly line B in
the number of defects. i.e., $H_0: \sigma_1^2 = \sigma_2^2$.
H1: Assembly line A has significantly more variability than assembly line B in the number of
defects. i.e., $H_1: \sigma_1^2 > \sigma_2^2$.
Level of Significance: $\alpha = 0.05$ and Critical value: $F_{0.05,(16-1,\,20-1)} = 2.23$
Test Statistic: $F = \dfrac{S_1^2}{S_2^2} = \dfrac{20}{13} = 1.54$
Conclusion: Since $F < F_{\alpha,(n_1-1,\,n_2-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0, and hence it is accepted at 5% level of significance. That is, there is no
significant difference in variability between assembly line A and assembly line B in the number of
defects.
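The arithmetic of Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `variance_ratio` is hypothetical, and the critical value is the tabulated figure quoted in the example rather than something the code computes.

```python
def variance_ratio(var_numerator, var_denominator):
    """Return F = S_num^2 / S_den^2 for two unbiased sample variances."""
    if var_denominator <= 0:
        raise ValueError("denominator variance must be positive")
    return var_numerator / var_denominator


# Example 1: line A has variance 20 (n = 16), line B has variance 13 (n = 20).
f_stat = variance_ratio(20, 13)

# Tabulated F_{0.05,(15,19)} as quoted in the example (an input, not computed here).
f_critical = 2.23

# Right-tailed test, H1: sigma_1^2 > sigma_2^2.
reject_h0 = f_stat > f_critical
print(round(f_stat, 2), reject_h0)
```

For a left-tailed alternative the same function can be called with the variances swapped, as Example 2 does by placing $S_2^2$ in the numerator.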
Example 2
An insurance company is interested in the length of hospital-stays for various illnesses. The
company has selected 15 patients from hospital A and 10 from hospital B who were treated for the
same ailment. The amount of time spent in hospital A had an average of 2.6 days with a standard
deviation of 0.8 day. The treatment time in hospital B averaged 2.2 days with a standard deviation of
1.2 days. Do patients in hospital A have significantly less variability in their recovery time? Test at 1%
level of significance.
Solution
Aim: To test whether the patients in hospital A have significantly less variability in their recovery
time than the patients in hospital B.
H0: There is no significant difference in the variability of recovery time between the patients in hospital
A and hospital B. i.e., $H_0: \sigma_1^2 = \sigma_2^2$.
H1: The patients in hospital A have significantly less variability in their recovery time than the
patients in hospital B.
i.e., $H_1: \sigma_1^2 < \sigma_2^2$, or equivalently $H_1: \sigma_2^2 > \sigma_1^2$.
Level of Significance: $\alpha = 0.01$ and Critical value: $F_{0.01,(10-1,\,15-1)} = 4.03$.
Test Statistic: $F = \dfrac{S_2^2}{S_1^2} = \dfrac{1.44}{0.64} = 2.25$
Conclusion: Since $F < F_{\alpha,(n_2-1,\,n_1-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0, and hence it is accepted at 1% level of significance. That is, patients at
hospital A do not have significantly less variability in their recovery times.
EXERCISES
1. Two brand managers were in disagreement over the issue of whether urban homemakers had
greater variability in grocery shopping patterns than did rural homemakers. To test their conflicting
ideas, they took random samples of 25 homemakers from urban areas and 15 homemakers from
rural areas. They found that the variance for the urban homemaker was 4.25 and rural homemaker
was 3.5. Is the difference in the variances in days between shopping visits significant at 5% level?
2. The diameters of two random samples, each of size 10, of bullets produced by two machines have
standard deviations 0.012 and 0.018. Test the hypothesis that the two machines are equally
consistent in diameters at 1% level of significance.
TEST 14
Aim
To test whether two given attributes, each classified into two classes, are independent, based on the
observed frequencies obtained from a sample survey.
Source
A random sample of size N is classified into 2 classes by attribute-A and 2 classes by attribute-B.
The observed frequencies can be expressed in the following table, known as a $2 \times 2$ contingency
table.
                      Attribute-A
Attribute-B      Class 1    Class 2    Total
Class 1             a          b       a + b
Class 2             c          d       c + d
Total             a + c      b + d       N
Assumptions
(i) The sample size N should be sufficiently large (i.e., N > 20).
(ii) The cell frequencies should be independent.
(iii) Each cell frequency should be at least 3.
Null Hypothesis
H0: The two attributes are independent.
Alternative Hypothesis
H1: The two attributes are not independent.
Test Statistic
$\chi^2 = \dfrac{N(ad - bc)^2}{(a + b)(a + c)(b + d)(c + d)}$
The statistic $\chi^2$ follows the $\chi^2$ distribution with one degree of freedom.
Conclusion
If $\chi^2 \le \chi^2_{\alpha,(1)}$, we conclude that the data do not provide us any evidence against the null hypothesis
H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0 or accept H1.
Example 1
Out of 5000 households in a town, 3200 are self-employed. Out of the 2200 graduate households,
1400 are self-employed. Examine whether there is any association between graduation and nature of
employment at 5% level of significance.
Solution
Aim: To test whether the two attributes, graduation and nature of employment, are independent.
H0: Graduation and nature of employment are independent.
H1: Graduation and nature of employment are dependent.
Level of Significance: $\alpha = 0.05$ and Critical value: $\chi^2_{0.05,1} = 3.841$
                        Employment
Graduation        Self-employed    Others    Total
Graduates              1400          800      2200
Non-graduates          1800         1000      2800
Total                  3200         1800      5000
Test Statistic: $\chi^2 = \dfrac{N(ad - bc)^2}{(a + b)(a + c)(b + d)(c + d)}$
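As a hedged illustration, the $2 \times 2$ statistic can be evaluated for the graduation table with a short Python sketch. The function name `chi_square_2x2` is hypothetical, and the critical value is the tabulated figure quoted in the example.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic N(ad - bc)^2 / [(a+b)(a+c)(b+d)(c+d)] for a 2 x 2 table."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (a + c) * (b + d) * (c + d)
    return numerator / denominator


# Graduates: 1400 self-employed, 800 others; non-graduates: 1800 and 1000.
chi2 = chi_square_2x2(1400, 800, 1800, 1000)

# Tabulated chi-square value for 1 degree of freedom at the 5% level.
chi2_critical = 3.841

# H0 (independence) is retained when the statistic does not exceed the critical value.
independent = chi2 <= chi2_critical
print(round(chi2, 4), independent)
```

For these frequencies the sketch evaluates to roughly 0.2255, which is below 3.841, so under these figures the hypothesis of independence would not be rejected at the 5% level.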
EXERCISES
1. In an experiment on immunization of cattle from tuberculosis, the following data were obtained.
Affected Unaffected Total
Inoculated 12 68 80
Not Inoculated 98 22 120
Total 110 90 200
Examine the effect of vaccine in controlling the incidence of the disease at 2% level.
2. A sample survey of 500 students was conducted to learn their response to the
introduction of the CBCS system in the university. The following data were obtained:
Favor Against Total
Male 135 115 250
Female 120 130 250
Total 255 245 500
Test whether the opinion about the introduction of CBCS system depends on the gender of the
students at 2% level of significance.
TEST 15
Aim
To test whether k population proportions are equal, based on k independent samples. That is, to investigate
the significance of the difference among the k sample proportions.
Source
Let there be k populations from which k independent random samples are drawn. Let $O_i$ be the
observed frequency of a specific kind obtained from the $i$th sample of $n_i$ observations, $i = 1, 2, \ldots, k$.
Null Hypothesis
H0: The k population proportions are equal. That is, there is no significant difference among the
k sample proportions.
i.e., $H_0: P_1 = P_2 = \cdots = P_k$.
Alternative Hypothesis
$H_1: P_1 \ne P_2 \ne \cdots \ne P_k$.
Test Statistic
$\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - n_i p)^2}{n_i p q}$
where $p = \dfrac{\sum O_i}{\sum n_i}$ and $q = 1 - p$.
Conclusion
If $\chi^2_{1-(\alpha/2),(k-1)} \le \chi^2 \le \chi^2_{(\alpha/2),(k-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise
reject H0 or accept H1.
Example 1
In an experiment on the efficiency of different insecticides in the control of mottle streak disease
in finger millet, 50 plants were selected at random from the field, from each group. The number of
plants affected from the disease in each group was observed as follows:
No.   Insecticide         Number of diseased plants
1     Endosulfan                     8
2     Methyl dematon                 7
3     Monocrotophos                  5
4     Phosphamidon                   6
5     Dimethoate                     4
Test whether the proportions of diseased plants affected by various insecticides are equal at 5%
level of significance.
Solution
Aim: To test whether the proportions of diseased plants affected by various insecticides are equal.
H0: The proportions of diseased plants affected by various insecticides are equal.
i.e., $H_0: P_1 = P_2 = P_3 = P_4 = P_5$.
H1: The proportions of diseased plants affected by various insecticides are not equal.
i.e., $H_1: P_1 \ne P_2 \ne P_3 \ne P_4 \ne P_5$.
Level of Significance: $\alpha = 0.05$
Critical Values: $\chi^2_{(0.975),4} = 0.484$ and $\chi^2_{(0.025),4} = 11.143$
Critical Region: $P(\chi^2 < 0.484) + P(\chi^2 > 11.143) = 0.05$
$p = \dfrac{\sum O_i}{\sum n_i} = \dfrac{30}{250} = 0.12$ and $q = 1 - p = 0.88$
Insecticide   Number of diseased   Sample size   $n_i p$   $(O_i - n_i p)^2 / (n_i p q)$
number        plants ($O_i$)       ($n_i$)
1                    8                 50           6        0.7576
2                    7                 50           6        0.1894
3                    5                 50           6        0.1894
4                    6                 50           6        0.0000
5                    4                 50           6        0.7576
Total               30                250          30        1.8940
Test Statistic: $\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - n_i p)^2}{n_i p q} = 1.894$
Conclusion: Since $0.484 < \chi^2 < 11.143$, we conclude that the data do not provide us any evidence
against the null hypothesis H0, and hence it is accepted at 5% level of significance. That is, the proportions
of diseased plants affected by various insecticides are equal.
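The tabular calculation in this example can be reproduced with a minimal Python sketch. The function name `chi_square_proportions` is an assumption introduced for illustration; the pooled proportion p and the per-sample terms follow the test statistic defined for this test.

```python
def chi_square_proportions(observed, sizes):
    """Sum of (O_i - n_i p)^2 / (n_i p q) with p pooled over all samples."""
    p = sum(observed) / sum(sizes)   # pooled proportion
    q = 1 - p
    return sum((o - n * p) ** 2 / (n * p * q) for o, n in zip(observed, sizes))


# Example 1: diseased plants out of 50 for each of the five insecticides.
chi2_insecticides = chi_square_proportions([8, 7, 5, 6, 4], [50] * 5)

# The statistic is compared with the tabulated values 0.484 and 11.143 quoted above.
within_limits = 0.484 < chi2_insecticides < 11.143
print(round(chi2_insecticides, 3), within_limits)
```

The same function accepts unequal sample sizes, for instance the village samples of sizes 60, 70, 80 and 90 in Example 2.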
Example 2
A sample survey was conducted in 4 villages to study about the consumption of tobacco product.
A random sample was selected from each of the village and the number of smokers is observed as
follows. Examine whether the proportion of smokers in all the four villages are same at 2% level of
significance.
Village Sample size No.of smokers
A 60 14
B 70 16
C 80 17
D 90 13
Solution
Aim: To test whether the proportions of smokers in all the four villages are equal.
H0: The proportions of smokers in all the four villages are equal.
i.e., $H_0: P_1 = P_2 = P_3 = P_4$.
H1: The proportions of smokers in all the four villages are not equal.
i.e., $H_1: P_1 \ne P_2 \ne P_3 \ne P_4$.
Level of Significance: $\alpha = 0.02$
Critical Values: $\chi^2_{(0.99),3} = 0.115$ and $\chi^2_{(0.01),3} = 11.345$
Critical Region: $P(\chi^2 < 0.115) + P(\chi^2 > 11.345) = 0.02$
$p = \dfrac{\sum O_i}{\sum n_i} = \dfrac{60}{300} = 0.2$ and $q = 1 - p = 0.8$
Village   Number of smokers   Sample size   $n_i p$   $(O_i - n_i p)^2 / (n_i p q)$
          ($O_i$)             ($n_i$)
A               14                60           12        0.4167
B               16                70           14        0.3571
C               17                80           16        0.0781
D               13                90           18        1.7361
Total           60               300           60        2.5880
Test Statistic: $\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - n_i p)^2}{n_i p q} = 2.5880$
Conclusion: Since $0.115 < \chi^2 < 11.345$, we conclude that the data do not provide us any
evidence against the null hypothesis H0, and hence it is accepted at 2% level of significance. That is, the
proportions of smokers in all the four villages are equal.
EXERCISES
1. The number of defective items was observed from 4 lots of fruits by taking random samples as
follows. Can we regard that the proportion of defective items in all four varieties of fruits are same
at 5% level.
Fruits   Number of defectives ($O_i$)   Sample size ($n_i$)
A                   12                         100
B                   17                         100
C                   10                         100
D                   11                         100
2. A clinical survey was conducted at four taluks of Thanjavur district to study the attack of
filariasis. The following data were obtained. Test whether the ratio of filariasis is same in all the
four taluks at 10% level of significance.
TEST 16
Aim
To test whether the variances of k populations are equal, based on k random samples. That is, to
investigate the significance of the differences among k sample variances.
Source
Let $X_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n_i)$ be the observations of k random samples, the $i$th having $n_i$
observations, drawn from k independent populations whose variances are respectively $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$.
Let $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ be the means of the k samples.
Assumptions
(i) The populations from which the k samples are drawn are normally distributed.
(ii) The unknown variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$ are estimated by their respective unbiased estimates
$S_1^2, S_2^2, \ldots, S_k^2$.
Null Hypothesis
H0: The variances of the k populations $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$ are equal. That is, there is no significant
difference among the k unbiased estimates of the population variances $S_1^2, S_2^2, \ldots, S_k^2$. i.e.,
$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$.
Alternative Hypothesis
$H_1: \sigma_1^2 \ne \sigma_2^2 \ne \cdots \ne \sigma_k^2$.
Test Statistic
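$\chi^2 = \dfrac{\sum_{i=1}^{k} \nu_i \log\left(S^2 / S_i^2\right)}{1 + \dfrac{1}{3(k-1)}\left[\sum_{i=1}^{k}\dfrac{1}{\nu_i} - \dfrac{1}{\nu}\right]}$
where $\nu_i = n_i - 1$, $\nu = \sum_{i=1}^{k} \nu_i$,
$S_i^2 = \dfrac{1}{\nu_i}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)^2$, $S^2 = \dfrac{\sum_{i=1}^{k} \nu_i S_i^2}{\nu}$,
and log denotes the natural logarithm.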
Conclusion
If $\chi^2_{1-(\alpha/2),(k-1)} \le \chi^2 \le \chi^2_{(\alpha/2),(k-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise
reject H0 or accept H1.
Example 1
Three experts interviewed the candidates and assigned marks independently. A
random sample of 5 candidates is selected whose marks are as follows. Examine whether there exists
variation among the experts in assigning the marks at 5% level of significance.
              Candidates
Experts    1     2     3     4     5
A         64    78    86    65    92
B         68    72    80    74    80
C         70    75    78    70    85
Solution
Aim: To test whether the variances among the experts in assigning the marks are equal.
H0: The variances among the experts in assigning the marks are equal.
H1: The variances among the experts in assigning the marks are not equal.
Level of Significance: $\alpha = 0.05$
Critical Values: $\chi^2_{(0.975),2} = 0.0506$ and $\chi^2_{(0.025),2} = 7.378$
Critical Region: $P(\chi^2 < 0.0506) + P(\chi^2 > 7.378) = 0.05$
Calculations:
$\nu_i = n_i - 1 = 5 - 1 = 4$ for all $i = 1, 2, 3$; $\nu = \sum_{i=1}^{k} \nu_i = 12$; $k - 1 = 3 - 1 = 2$
$S_i^2 = \dfrac{1}{\nu_i}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)^2$; $S_1^2 = 193.75$; $S_2^2 = 75.9993$; $S_3^2 = 49.125$
$S^2 = \dfrac{\sum \nu_i S_i^2}{\nu} = \dfrac{4(193.75 + 75.9993 + 49.125)}{12} = 106.29$; $\log S^2 = 4.6662$
$\sum \nu_i \log S_i^2 = 53.9666$
Test Statistic:
$\chi^2 = \dfrac{\nu \log S^2 - \sum_{i=1}^{k} \nu_i \log S_i^2}{1 + \dfrac{1}{3(k-1)}\left[\sum \dfrac{1}{\nu_i} - \dfrac{1}{\nu}\right]} = \dfrac{(12 \times 4.6662) - 53.9666}{1 + \dfrac{1}{3 \times 2}\left[\dfrac{3}{4} - \dfrac{1}{12}\right]} = 1.825$
Conclusion: Since $\chi^2_{0.975,2} < \chi^2 < \chi^2_{0.025,2}$, we conclude that the data do not provide us any
evidence against the null hypothesis H0, and hence it may be accepted at 5% level of significance. That
is, the variances among the experts in assigning the marks are equal.
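The final step of this calculation can be sketched in Python as follows. The function name `bartlett_statistic` is hypothetical, natural logarithms are assumed (consistent with $\log 106.29 = 4.6662$), and the sample variances are passed in as printed in Example 1 rather than recomputed from the marks.

```python
import math


def bartlett_statistic(variances, dfs):
    """Statistic for equality of k variances from unbiased estimates S_i^2 and degrees of freedom nu_i."""
    k = len(variances)
    nu = sum(dfs)
    # Pooled variance S^2 = sum(nu_i * S_i^2) / nu
    pooled = sum(v * s2 for v, s2 in zip(dfs, variances)) / nu
    numerator = nu * math.log(pooled) - sum(v * math.log(s2) for v, s2 in zip(dfs, variances))
    correction = 1 + (sum(1 / v for v in dfs) - 1 / nu) / (3 * (k - 1))
    return numerator / correction


# Variances and degrees of freedom as given in the worked calculation above.
chi2_experts = bartlett_statistic([193.75, 75.9993, 49.125], [4, 4, 4])
print(round(chi2_experts, 3))
```

Taking the variances and degrees of freedom as inputs keeps the sketch usable for unequal sample sizes, such as the four varieties in Example 2.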
Example 2
An agricultural experiment was carried out to compare the yield of four varieties of brinjals.
The following are the yields (in kgs.) of the four varieties of brinjals grown in different plots:
Variety   Sample size   Yield
A              4        12.50  16.25  14.50  16.50
B              5        10.50  12.75  14.50  13.25  14.25
C              6         8.50   9.50   9.75  16.75  15.50  10.50
D              7        16.50  15.65  15.35  14.25  16.25  15.55  16.75
Test whether the variances of the yield of the four varieties of brinjals are equal at 2% level of
significance.
Solution
Aim: To test whether the variances of the yield of four varieties of brinjals are equal.
H0: The variances of the yield of four varieties of brinjals are equal.
H1: The variances of the yield of four varieties of brinjals are not equal.
Level of Significance: $\alpha = 0.02$
Critical Values: $\chi^2_{(0.99),3} = 0.115$ and $\chi^2_{(0.01),3} = 11.345$
Critical Region: $P(\chi^2 < 0.115) + P(\chi^2 > 11.345) = 0.02$
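Calculations:
$\nu_i = n_i - 1$: $\nu_1 = 3$, $\nu_2 = 4$, $\nu_3 = 5$, $\nu_4 = 6$; $\nu = \sum_{i=1}^{4} \nu_i = 18$
$S_i^2 = \dfrac{1}{\nu_i}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)^2$; $S^2 = \dfrac{\sum \nu_i S_i^2}{\nu}$
$\sum \nu_i \log S_i^2 = 26.5684$
Test Statistic:
$\chi^2 = \dfrac{\nu \log S^2 - \sum_{i=1}^{k} \nu_i \log S_i^2}{1 + \dfrac{1}{3(k-1)}\left[\sum \dfrac{1}{\nu_i} - \dfrac{1}{\nu}\right]}$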
Conclusion: Since $\chi^2 > \chi^2_{0.01,3}$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence H1 is accepted at 2% level of significance. That is, the
variances of the yield of four varieties of brinjals are not equal.
EXERCISES
1. A manufacturer produces three types of iron rods. Random samples are drawn from each type,
whose lengths (in mm) are as follows. Test whether the variances of the three types are equal at 5%
level of significance.
Type Sample size Length of rods
A 6 22 24 22 21 23 24
B 5 20 25 26 21 22
C 6 20 26 22 21 25 27
2. A sample survey was conducted in three localities from 10 households each, whose monthly
expenditure on food is as follows. Do these samples agree with the claim that the variation of monthly food
expenses is the same in these three localities? Test at 5% significance level.
TEST 17
Aim
To test whether the means of k populations are equal, based on k independent random samples. That is,
to investigate the significance of the difference among the k sample means.
Source
Let $X_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n_i)$ be the observations of k random samples, the $i$th having $n_i$
observations, drawn from k independent populations whose means $\mu_1, \mu_2, \ldots, \mu_k$ are unknown and whose
variances are equal but unknown. Let $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ be the means of the k samples. Let
$n_1 + n_2 + \cdots + n_k = n$.
Assumptions
(i) The populations from which the k samples are drawn are normally distributed.
(ii) Each observation is drawn independently.
Null Hypothesis
H0: The means of the k populations $\mu_1, \mu_2, \ldots, \mu_k$ are equal. That is, there is no significant difference
among the k sample means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$. i.e., $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$.
Alternative Hypothesis
$H_1: \mu_1 \ne \mu_2 \ne \cdots \ne \mu_k$
Method
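Calculate the following, based on the sample observations.
$T_i = \sum_{j=1}^{n_i} X_{ij}$ (total of the $i$th sample), $G = \sum_{i=1}^{k} T_i$ (grand total)
Correction Factor, $CF = G^2 / n$
Total Sum of Squares, $TSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}^2 - CF$
Sum of Squares between samples, $SSS = \sum_{i=1}^{k} \dfrac{T_i^2}{n_i} - CF$
Error Sum of Squares, $ESS = TSS - SSS$
Test Statistic
$F = \dfrac{SSS/(k-1)}{ESS/(n-k)}$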
Conclusion
If $F \le F_{\alpha,(k-1,\,n-k)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0 or accept
H1.
Note: This test is the same as the test for a completely randomized design with unequal numbers of replications
on k treatments, where the $i$th treatment has $n_i$ replications.
Example 1
The following data are obtained from three independent samples of students selected from three
batches, and denote their marks in an examination. Test whether the mean marks of
the three batches of students are equal at 5% level of significance.
Batch A: 62 68 64 76
Batch B: 82 88 74 86 80
Batch C: 83 87 80
Solution
Aim: To test whether the mean marks of all the three batches of students in the examination are
equal.
H0: The mean marks of all the three batches of students in the examination are equal. i.e.,
$H_0: \mu_1 = \mu_2 = \mu_3$
H1: The mean marks of all the three batches of students in the examination are not equal. i.e.,
$H_1: \mu_1 \ne \mu_2 \ne \mu_3$
Level of Significance: $\alpha = 0.05$ and Critical Value: $F_{0.05,(2,9)} = 4.26$
Calculations:
Number of Samples $k = 3$; $n_1 = 4$, $n_2 = 5$, $n_3 = 3$
$n = 12$; $T_1 = 270$, $T_2 = 410$, $T_3 = 250$; $G = 930$
Correction Factor, $CF = 930^2/12 = 72075$
Total Sum of Squares, $TSS = 62^2 + \cdots + 80^2 - CF = 863$
Sum of Squares between samples, $SSS = \dfrac{270^2}{4} + \dfrac{410^2}{5} + \dfrac{250^2}{3} - 72075 = 603.33$
Error Sum of Squares, $ESS = TSS - SSS = 259.67$
ANOVA Table:
Sources of variation   Degrees of freedom   Sum of squares   Mean sum of squares
Samples                        2                603.33             301.67
Error                          9                259.67              28.85
Total                         11                863
Test Statistic: $F = \dfrac{SSS/(k-1)}{ESS/(n-k)} = \dfrac{301.67}{28.85} = 10.46$
Conclusion: Since $F > F_{0.05,(2,9)} = 4.26$, we conclude that the data provide us evidence against
the null hypothesis H0 and in favor of H1. Hence, H1 is accepted at 5% level of significance. That is,
the mean marks of all the three batches of students in the examination are not equal.
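The sums of squares in this example can be checked with a short Python sketch. The function name `one_way_anova` is an assumption made for illustration; it follows the correction-factor method used in the calculations (CF, TSS, SSS, ESS) rather than any particular library routine.

```python
def one_way_anova(samples):
    """Return (SSS, ESS, F) for k independent samples using the correction-factor method."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)            # G
    cf = grand_total ** 2 / n                             # correction factor
    tss = sum(x * x for s in samples for x in s) - cf     # total sum of squares
    sss = sum(sum(s) ** 2 / len(s) for s in samples) - cf # between-samples sum of squares
    ess = tss - sss                                       # error sum of squares
    f = (sss / (k - 1)) / (ess / (n - k))
    return sss, ess, f


# Marks of batches A, B and C from Example 1.
batches = [[62, 68, 64, 76], [82, 88, 74, 86, 80], [83, 87, 80]]
sss, ess, f_stat = one_way_anova(batches)
print(round(sss, 2), round(ess, 2), round(f_stat, 2))
```

The resulting F is then compared with the tabulated $F_{0.05,(2,9)} = 4.26$ quoted in the example.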
Example 2
The following data denotes the life of electric bulbs of four varieties. Test, whether the average
life of four varieties of bulbs is homogeneous at 5% level of significance.
Variety Sample size Life of the electric bulbs in hours
I 8 1560 1670 1580 1650 1640 1680 1600 1650
II 9 1450 1460 1480 1450 1460 1440 1450 1480 1470
III 9 1430 1440 1450 1440 1430 1420 1410 1450 1470
IV 8 1540 1570 1550 1560 1570 1580 1530 1590
Solution
Aim: To test whether the average life of the four varieties of bulbs is equal.
H0: The average life of the four varieties of bulbs is equal. i.e., $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$.
H1: The average life of the four varieties of bulbs is not equal. i.e., $H_1: \mu_1 \ne \mu_2 \ne \mu_3 \ne \mu_4$.
Level of Significance: $\alpha = 0.05$ and Critical Value: $F_{0.05,(3,30)} = 2.92$
Calculations
Shifting the origin to 1410 and then dividing by 10, the above data reduce to
I     15 26 17 24 23 27 19 24
II    04 05 07 04 05 03 04 07 06
III   02 03 04 03 02 01 00 04 06
IV    13 16 14 15 16 17 12 18
Number of Samples $k = 4$; $n_1 = 8$, $n_2 = 9$, $n_3 = 9$, $n_4 = 8$
$n = 34$; $T_1 = 175$, $T_2 = 45$, $T_3 = 25$, $T_4 = 121$; $G = 366$
Correction Factor, $CF = 366^2/34 = 3939.88$
Total Sum of Squares, $TSS = 15^2 + \cdots + 18^2 - CF = 2216.12$
Sum of Squares between samples, $SSS = \dfrac{175^2}{8} + \dfrac{45^2}{9} + \dfrac{25^2}{9} + \dfrac{121^2}{8} - 3939.88 = 2012.81$
Error Sum of Squares, $ESS = TSS - SSS = 203.31$
ANOVA Table:
Sources of variation   Degrees of freedom   Sum of squares   Mean sum of squares
Samples                        3               2012.81             670.94
Error                         30                203.31               6.78
Total                         33               2216.12
EXERCISES
1. Three varieties of coal were analyzed by four chemists and the ash content in the varieties was
obtained as follows.
                Chemists
Varieties    1    2    3    4
A            6    7    7    8
B            7    6    8    7
C            4    3    5    6
Do the varieties differ significantly in their ash-content?
2. Three processes A, B and C are tested to see whether their outputs are equivalent. The following
observations of output are made:
A 12 15 17 18 15 17 16
B 14 17 18 14 16 14
C 14 18 17 15 15 19 17 19
Examine whether the outputs of these three processes differ significantly at 1% level of significance.
TEST 18
Aim
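To test whether two given attributes are independent, based on the observed frequencies obtained
from a sample survey.
Source
A random sample of N observations is classified into m classes by attribute-A and n
classes by attribute-B. The observed frequencies can be expressed in the following table, known
as an $m \times n$ contingency table.
[Table: $m \times n$ contingency table with one row for each of the m classes of attribute-A and one column for each of the n classes of attribute-B (1, 2, ..., j, ..., n), containing the cell frequencies $O_{ij}$, the row totals $O_{i.}$, the column totals $O_{.j}$ and the grand total N.]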
Assumptions
(i) The sample size N should be sufficiently large.
(ii) The cell frequencies $O_{ij}$ should be independent.
(iii) Each cell frequency $O_{ij}$ should be at least 5.
Null Hypothesis H0
The two attributes are independent.
Alternative Hypothesis H1
The two attributes are dependent.
Level of Significance ($\alpha$) and Critical Region
$\chi^2 > \chi^2_{\alpha,(m-1)(n-1)}$ such that $P\{\chi^2 > \chi^2_{\alpha,(m-1)(n-1)}\} = \alpha$
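Test Statistic
$\chi^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $E_{ij} = \dfrac{O_{i.}\, O_{.j}}{N}$,
with $O_{i.}$ the total of the $i$th row and $O_{.j}$ the total of the $j$th column.
The statistic $\chi^2$ follows the $\chi^2$ distribution with $(m-1)(n-1)$ degrees of freedom.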
Conclusion
If $\chi^2 \le \chi^2_{\alpha,(m-1)(n-1)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0 or
accept H1.
Example 1
A newspaper publisher, trying to pinpoint his market's characteristics, wondered whether
newspaper readership in the community is related to readers' educational achievement. A survey
questioned adults in the area on their level of education and their frequency of readership. The results
are shown in the following table.
Frequency of               Level of educational achievement
readership        Post graduate   Graduate   Secondary   Primary   Total
Never                  15            18          22         25       80
Sometimes              16            24          15         25       80
Morn or Even           22            14          18         16       70
Both Editions          27            14          15         14       70
Total                  80            70          70         80      300
Solution
Aim: To test whether the frequency of readership of the newspaper is independent of the level of educational
achievement.
H0: The frequency of readership of the newspaper is independent of the level of educational achievement.
H1: The frequency of readership of the newspaper depends on the level of educational achievement.
Level of Significance: $\alpha = 0.05$
Critical Value: $\chi^2_{0.05,(4-1)(4-1)} = \chi^2_{0.05,9} = 16.919$
Calculations: $E_{ij} = \dfrac{O_{i.}\, O_{.j}}{N}$
Test Statistic: $\chi^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}} = 20.8926$
Conclusion: Since $\chi^2 > \chi^2_{0.05,9}$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 5% level of significance. That is, the
frequency of readership of the newspaper depends on the level of educational achievement.
Example 2
In a survey, a random sample of 200 farms was classified into three classes according to tenure
status as owned, rented and mixed. They were also classified according to the level of soil fertility as
highly fertile, moderately fertile and low fertile farms. The results are given below. Test whether tenure
status is independent of soil fertility at 1% level of significance.
                      Tenure status
Soil fertility    Owned    Rented    Mixed    Total
High                45       15        10       70
Moderate            20       10        15       45
Low                 20       25        40       85
Total               85       50        65      200
Solution
Aim: To test whether the tenure status is independent of soil fertility.
H0: The tenure status and soil fertility are independent of each other.
H1: The tenure status depends on soil fertility.
Test Statistic: $\chi^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}} = 28.9115$
Conclusion: Since $\chi^2 > \chi^2_{0.01,4}$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 1% level of significance. That is, the tenure
status depends on soil fertility.
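The expected frequencies and the statistic for an $m \times n$ table can be sketched in Python as below. The function name `chi_square_independence` is hypothetical, and the critical value 13.277 for 4 degrees of freedom at the 1% level is taken from standard chi-square tables as an assumption, since it is not quoted in the example.

```python
def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an m x n table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # E_ij = O_i. * O_.j / N
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Example 2: rows are soil fertility (High, Moderate, Low), columns are tenure status.
tenure = [[45, 15, 10], [20, 10, 15], [20, 25, 40]]
chi2_tenure, dof_tenure = chi_square_independence(tenure)

# Assumed tabulated chi-square value for 4 degrees of freedom at the 1% level.
dependent = chi2_tenure > 13.277
print(round(chi2_tenure, 2), dof_tenure, dependent)
```

On the table as printed, the statistic comes out at about 28.9 with 4 degrees of freedom, which exceeds the assumed critical value and so agrees with the example's conclusion.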
EXERCISES
1. Two researchers adopted different sampling techniques while investigating the same group of
students to find the number of students falling in different intelligence levels. The data is as
follows. Can you say that the sampling techniques adopted by the two researchers are significantly
different?
                           Level of students
Researcher    Below average    Average    Above average    Genius
A                  64             42            36            24
B                  56             58            44            26
2. In an organization, a random sample of 100 employees were selected whose educational level and
their employment status was observed. Examine whether the employment status depends on their
level of education at 10% level of significance.
Employment Level of education
status Primary Secondary Graduates
Assistants 15 14 5
Clerical 12 18 8
Supervisors 8 8 12
TEST 19
Aim
To test whether the population correlation coefficient is zero, based on a bivariate random sample. That is,
to investigate the significance of the difference between the sample correlation coefficient r and zero.
Source
Let $(X_i, Y_i)$ $(i = 1, 2, \ldots, n)$ be a random sample of n pairs of observations drawn from a bivariate
normal population whose correlation coefficient $\rho$ is unknown. Let r be the correlation coefficient
based on the above sample.
Assumptions
(i) The population from which the sample is drawn is a bivariate normal population.
(ii) The relationship between X and Y is linear.
Null Hypothesis
H0: The population correlation coefficient $\rho$ is zero. That is, there is no significant difference
between the sample correlation coefficient r and zero. i.e., $H_0: \rho = 0$
Alternative Hypothesis
$H_1: \rho \ne 0$
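Test Statistic
$t = \dfrac{r}{\sqrt{1 - r^2}}\sqrt{n - 2}$
where
$r = \dfrac{\sum XY - \frac{1}{n}\sum X \sum Y}{\sqrt{\left[\sum X^2 - \frac{1}{n}\left(\sum X\right)^2\right]\left[\sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2\right]}}$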
Conclusion
If $|t| \le t_{\alpha,(n-2)}$, we conclude that the data do not provide us any evidence against the null hypothesis
H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0 or accept H1.
Example 1
The marks of a random sample of 10 students in Mathematics and English are given below. Test whether
a correlation exists between the marks in the two subjects at 2% level of significance.
Marks in Mathematics: 68 54 78 75 76 85 54 68 87 75
Marks in English: 59 68 72 67 72 78 64 58 68 74
Solution
Aim: To test whether the correlation coefficient between the marks in Mathematics and English is
zero.
H0: The correlation coefficient between the marks in Mathematics and English is zero. i.e.,
$H_0: \rho = 0$
H1: The correlation coefficient between the marks in Mathematics and English is not zero. i.e.,
$H_1: \rho \ne 0$
Level of Significance: $\alpha = 0.02$ and Critical Value: $t_{0.02,8} = 2.896$
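Based on the data,
$\sum X = 720$; $\sum Y = 680$; $\sum X^2 = 52984$; $\sum Y^2 = 46606$; $\sum XY = 49293$
$r = \dfrac{\sum XY - \frac{1}{n}\sum X \sum Y}{\sqrt{\left[\sum X^2 - \frac{1}{n}\left(\sum X\right)^2\right]\left[\sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2\right]}} = \dfrac{49293 - \frac{1}{10}(720 \times 680)}{\sqrt{\left[52984 - \frac{1}{10}\,720^2\right]\left[46606 - \frac{1}{10}\,680^2\right]}} = 0.51$
Test Statistic: $t = \dfrac{r}{\sqrt{1 - r^2}}\sqrt{n - 2} = 0.51 \times 2.83/0.86 = 1.68$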
Conclusion: Since $|t| < t_{\alpha,(n-2)}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0. Hence, H0 is accepted at 2% level of significance. That is, the correlation coefficient
between the marks in Mathematics and English is not significantly different from zero.
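The computation of r and t for this example can be sketched in Python. The function name `correlation_t_test` is an assumption for illustration, and the critical value is the tabulated $t_{0.02,8} = 2.896$ quoted above.

```python
import math


def correlation_t_test(x, y):
    """Return (r, t) where t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    sxx = sum(a * a for a in x) - sum_x ** 2 / n
    syy = sum(b * b for b in y) - sum_y ** 2 / n
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return r, t


maths = [68, 54, 78, 75, 76, 85, 54, 68, 87, 75]
english = [59, 68, 72, 67, 72, 78, 64, 58, 68, 74]
r, t = correlation_t_test(maths, english)

# H0 (rho = 0) is retained when |t| does not exceed the tabulated value.
retain_h0 = abs(t) <= 2.896
print(round(r, 4), round(t, 2), retain_h0)
```

Without intermediate rounding the sketch gives r close to 0.5146 and t close to 1.70, slightly above the 1.68 obtained when r is rounded to 0.51; the decision is unchanged.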
Example 2
A random sample of 10 students is selected from a kindergarten school whose height (in cms)
and weight (in kgs) are given below. Test whether the height and weight of the students of that school
are correlated at 1% level of significance.
Height: 92 96 88 96 98 95 89 96 90 90
Weight: 18.50 19.25 17.75 19.50 19.00 19.25 18.00 19.50 18.50 18.75
Solution
Aim: To test whether the correlation coefficient between the height and weight of the students is
zero.
H0: The correlation coefficient between the height and weight of the students is zero. i.e.,
$H_0: \rho = 0$
H1: The correlation coefficient between the height and weight of the students is not zero. i.e.,
$H_1: \rho \ne 0$
Level of Significance: $\alpha = 0.01$ and Critical Value: $t_{0.01,8} = 3.355$
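Based on the data,
$\sum X = 930$; $\sum Y = 188$; $\sum X^2 = 86606$; $\sum Y^2 = 3537.75$; $\sum XY = 17501.25$
$r = \dfrac{\sum XY - \frac{1}{n}\sum X \sum Y}{\sqrt{\left[\sum X^2 - \frac{1}{n}\left(\sum X\right)^2\right]\left[\sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2\right]}} = \dfrac{17501.25 - \frac{1}{10}(930 \times 188)}{\sqrt{\left[86606 - \frac{1}{10}\,930^2\right]\left[3537.75 - \frac{1}{10}\,188^2\right]}} = 0.8848$
Test Statistic: $t = \dfrac{r}{\sqrt{1 - r^2}}\sqrt{n - 2} = 0.8848 \times 2.83/0.4659 = 5.3745$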
Conclusion: Since $t > t_{\alpha,(n-2)}$, we conclude that the data provide us evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 1% level of significance. That is, the
correlation coefficient between the height and weight of the students is not zero.
EXERCISES
1. The following bivariate data is obtained from a sample of five households whose monthly income (in
rupees) and their electricity consumption (in units). Examine whether the monthly income and the electricity
consumption for the households are correlated at 5% level of significance.
Income: 12150 16500 17610 10800 16300
Electricity: 165 174 180 170 185
Income: 15300 14800 16500 14800 16800
Electricity: 155 168 188 175 185
2. A random sample of 15 students is selected; the correlation coefficient between their IQ and their English
aptitude is obtained as 0.68. Examine whether, in general, IQ and English aptitude are correlated or not at 1%
level of significance.
TEST 20
Aim
To test whether the correlation coefficient in the population can be regarded as $\rho_0$ (assumed value), based on
a bivariate random sample. That is, to investigate the significance of the difference between the assumed
population correlation coefficient $\rho_0$ and the sample correlation coefficient r.
Source
Let $(X_i, Y_i)$ $(i = 1, 2, \ldots, n)$ be a random sample of n pairs of observations drawn from a bivariate
normal population whose correlation coefficient $\rho$ is unknown. Let r be the correlation coefficient
based on the above sample.
Assumptions
(i) The population from which, the sample drawn, is a bivariate normal population.
(ii) The relationship between X and Y is linear.
(iii) The variance in the Y values is independent of the X values.
Null Hypothesis
H0: The population correlation coefficient $\rho$ is $\rho_0$. That is, there is no significant difference
between the sample correlation coefficient r and the assumed population correlation coefficient $\rho_0$.
i.e., $H_0: \rho = \rho_0$
Alternative Hypothesis
$H_1: \rho \ne \rho_0$
Test Statistic
$Z = \dfrac{U - \xi}{1/\sqrt{n - 3}}$ (Under $H_0: \rho = \rho_0$)
where $U = \dfrac{1}{2}\log_e \dfrac{1 + r}{1 - r}$ and $\xi = \dfrac{1}{2}\log_e \dfrac{1 + \rho_0}{1 - \rho_0}$.
The statistic Z follows the Standard Normal distribution.
Conclusion
If $|Z| \le Z_{\alpha}$, we conclude that the data do not provide us any evidence against the null hypothesis
H0, and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0 or accept H1.
Example 1
The past record of the correlation coefficient between age (X) and height (Y) of children reveals
that it is 0.83. For a random sample of 50 children, age and height are observed and the correlation
coefficient is obtained as 0.88. Test whether the sample information is consistent with the past record
at 2% level.
Solution
Aim: To test whether the correlation coefficient between the age and height of the children in the sample
is consistent with the past record.
H0: The correlation coefficient between the age and height of the children is 0.83. i.e.,
$H_0: \rho = 0.83$.
H1: The correlation coefficient between the age and height of the children is not 0.83. i.e.,
$H_1: \rho \ne 0.83$.
Level of Significance: $\alpha = 0.02$ and Critical Value: $Z_{\alpha} = 2.33$
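Calculations:
$U = \dfrac{1}{2}\log_e \dfrac{1 + r}{1 - r} = \dfrac{1}{2}\log_e \dfrac{1 + 0.88}{1 - 0.88} = 1.3757$
and $\xi = \dfrac{1}{2}\log_e \dfrac{1 + \rho_0}{1 - \rho_0} = \dfrac{1}{2}\log_e \dfrac{1 + 0.83}{1 - 0.83} = 1.1881$
Test Statistic: $Z = \dfrac{U - \xi}{1/\sqrt{n - 3}} = \dfrac{1.3757 - 1.1881}{1/\sqrt{50 - 3}} = 1.29$ (Under $H_0: \rho = 0.83$)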
Conclusion: Since $|Z| < Z_{\alpha}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0, and hence accept H0 at 2% level of significance. That is, the correlation coefficient
between the age and height of the children is 0.83.
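The transformation used in this example can be sketched in Python. The function name `fisher_z_test` and the variable name `xi` (for the transformed hypothesised value) are assumptions introduced for illustration; natural logarithms are used, as in the definition of U.

```python
import math


def fisher_z_test(r, rho0, n):
    """Return (U, xi, Z) with Z = (U - xi) * sqrt(n - 3)."""
    u = 0.5 * math.log((1 + r) / (1 - r))          # transformed sample correlation
    xi = 0.5 * math.log((1 + rho0) / (1 - rho0))   # transformed hypothesised value
    z = (u - xi) * math.sqrt(n - 3)                # same as (U - xi) / (1 / sqrt(n - 3))
    return u, xi, z


# Example 1: r = 0.88 from n = 50 children, hypothesised value 0.83.
u, xi, z = fisher_z_test(0.88, 0.83, 50)

# Two-tailed comparison with the critical value 2.33 quoted for the 2% level.
retain_h0 = abs(z) <= 2.33
print(round(u, 4), round(xi, 4), round(z, 2), retain_h0)
```

Because `math.log` is the natural logarithm, the sketch avoids mixing logarithm bases, which would change the transformed values.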
Example 2
The sellers expect the correlation coefficient between sales of textile cloths and advertising expenditure
to be 0.65 during the festival season. For a random sample of 30 sellers, the amount of sales and the
expenditure on advertisement are observed and the correlation coefficient between them is obtained as 0.52.
Examine whether the expectation of the sellers is true or not at 1% level.
Solution
Aim: To test whether the expectation of the sellers is true, that the correlation coefficient between
sales of textile cloths and advertising expenditure is 0.65.
H0: The expectation of the sellers is true, that the correlation coefficient between sales of textile
cloths and advertising expenditure is 0.65. i.e., $H_0: \rho = 0.65$
H1: The expectation of the sellers is not true, that is, the correlation coefficient between sales of textile
cloths and advertising expenditure is not 0.65.
$H_1: \rho \ne 0.65$
Level of Significance: $\alpha = 0.01$ and Critical Value: $Z_{\alpha} = 2.58$
Calculations:
$$U = \frac{1}{2}\log_e\frac{1+r}{1-r} = \frac{1}{2}\log_e\frac{1+0.52}{1-0.52} = 0.5763$$
and
$$\zeta = \frac{1}{2}\log_e\frac{1+\rho}{1-\rho} = \frac{1}{2}\log_e\frac{1+0.65}{1-0.65} = 0.7753$$
Test Statistic (under H0: ρ = 0.65):
$$Z = \frac{U-\zeta}{\sqrt{\dfrac{1}{n-3}}} = \frac{0.5763-0.7753}{\sqrt{\dfrac{1}{30-3}}} = -1.03$$
Conclusion: Since |Z| = 1.03 < Zα, we conclude that the data do not provide any evidence against the
null hypothesis H0, and hence accept H0 at 1% level of significance. That is, the expectation of the
sellers is true, that the correlation coefficient between sales of textile cloths and advertising
expenditure is 0.65.
EXERCISES
1. The medical record reveals that the correlation between the age of the mother and the birth weight
of her first child is 0.24. The ages of a random sample of eight mothers and the birth weights of
their first children are observed as follows.
Age of the Mother: 35 28 24 26 29 30 34 32
Birth weight of Child: 2.85 3.25 3.50 3.25 3.00 2.75 2.90 3.00
Examine whether the medical record provides the true information at 1% level of significance.
2. The ages of husbands and their wives in India are correlated, with correlation coefficient 0.75. A
random sample of 9 pairs is selected, whose ages are given below. Test at 5% level of significance
whether these data are consistent with a population correlation coefficient of 0.75.
Age of Husband: 58 54 46 49 37 36 35 28 29
Age of Wife: 53 52 40 42 35 32 30 24 26
TEST 21
Aim
To test whether the population partial correlation coefficient ρ12.34…(k+2) can be regarded as zero,
based on a random sample. That is, to investigate the significance of the difference between zero and
the partial correlation coefficient of order k (< n), r12.34…(k+2), observed in a sample of size n from
a multivariate normal population.
Assumption
The sample is drawn from a multivariate normal population.
Source
A random sample of n observations is drawn from a multivariate normal population, and its
sample partial correlation coefficient of order k is r12.34…(k+2).
Null Hypothesis
H0: The population partial correlation coefficient ρ12.34…(k+2) = 0. That is, there is no significant
difference between the sample partial correlation coefficient r12.34…(k+2) and zero.
Alternative Hypothesis
H1: ρ12.34…(k+2) ≠ 0
Test Statistic
$$t = \frac{r_{12.34\ldots(k+2)}}{\sqrt{1 - r_{12.34\ldots(k+2)}^{2}}}\,\sqrt{n-k-2}$$
The statistic t follows t distribution with (n-k-2) degrees of freedom.
Example
An agricultural experiment was conducted to study the effect of some factors that influence
the yield of paddy. The yield of paddy (Y) depends on factors such as fertilizer used (X1), irrigation
(X2), pesticides (X3) and seed type (X4). A sample study was conducted in 20 experimental units, and it
was found that the sample partial correlation coefficient between irrigation and fertilizer used was 0.23.
Test whether the partial correlation coefficient of irrigation and fertilizer used in the yield of paddy is
zero or not at 5% level of significance.
Solution
H0: The partial correlation coefficient of irrigation and fertilizer used in the yield of paddy is zero,
i.e., H0: ρ12.34 = 0.
H1: The partial correlation coefficient of irrigation and fertilizer used in the yield of paddy is not zero,
i.e., H1: ρ12.34 ≠ 0.
Level of significance: α = 0.05 and Critical value: t0.05,16 = 2.120 (with n = 20 and k = 2, the
degrees of freedom are n-k-2 = 16)
Test Statistic:
$$t = \frac{r_{12.34}}{\sqrt{1 - r_{12.34}^{2}}}\,\sqrt{n-k-2} = \frac{0.23\,\sqrt{20-2-2}}{\sqrt{1-(0.23)^{2}}} = 0.9453$$
Conclusion: Since |t| < t0.05,16, H0 is accepted, and we conclude that the partial correlation coefficient
of irrigation and fertilizer used in the yield of paddy may be regarded as zero.
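As a check on the computation, the following Python sketch evaluates the test statistic from the quantities stated in the example (r = 0.23, n = 20 experimental units, and order k = 2 because two variables, X3 and X4, are held fixed). The function name partial_corr_t is an illustrative choice, not something defined in the text.

```python
import math

def partial_corr_t(r_partial, n, k):
    """t statistic and degrees of freedom for H0: partial correlation of order k is zero."""
    df = n - k - 2
    t = r_partial * math.sqrt(df) / math.sqrt(1 - r_partial ** 2)
    return t, df

# Values stated in the example: r12.34 = 0.23, n = 20, k = 2
t, df = partial_corr_t(r_partial=0.23, n=20, k=2)
print(f"t = {t:.4f} on {df} degrees of freedom")
```

The resulting |t| is then compared with the tabulated two-sided t value for the returned degrees of freedom.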
TEST 22
Aim
To test whether the two population correlation coefficients ρ1 and ρ2 are equal, based on two independent
bivariate random samples. That is, to investigate the significance of the difference between the two
sample correlation coefficients r1 and r2.
Source
A random sample of n1 pairs of observations is drawn from a bivariate population whose correlation
coefficient ρ1 is unknown. A random sample of n2 pairs of observations is drawn from another
bivariate population whose correlation coefficient ρ2 is unknown. The sample correlation coefficients
of those two samples are r1 and r2 respectively.
Assumptions
(i) The populations from which the samples are drawn are bivariate normal populations.
(ii) The relationship between X and Y is linear.
(iii) The variance in the Y values is independent of the X values.
Null Hypothesis
H0: The two population correlation coefficients ρ1 and ρ2 are equal. That is, there is no significant
difference between the sample correlation coefficients r1 and r2, i.e., H0: ρ1 = ρ2
Alternative Hypothesis
H1: ρ1 ≠ ρ2
Test Statistic
$$Z = \frac{(U_1 - U_2) - (\zeta_1 - \zeta_2)}{\sqrt{\dfrac{1}{n_1-3} + \dfrac{1}{n_2-3}}}$$
(Under H0: ρ1 = ρ2, so that ζ1 = ζ2), where
$$U_1 = \frac{1}{2}\log_e\frac{1+r_1}{1-r_1},\quad U_2 = \frac{1}{2}\log_e\frac{1+r_2}{1-r_2},\quad \zeta_1 = \frac{1}{2}\log_e\frac{1+\rho_1}{1-\rho_1}\quad\text{and}\quad \zeta_2 = \frac{1}{2}\log_e\frac{1+\rho_2}{1-\rho_2}$$
The statistic Z follows Standard Normal distribution.
Conclusion
If |Z| ≤ Zα, we conclude that the data do not provide any evidence against the null hypothesis
H0, and hence it may be accepted at α% level of significance. Otherwise, reject H0 and accept H1.
Example
A random sample of 29 children in City-A has the correlation coefficient between age and weight
0.72. Another sample of 29 children in City-B has the correlation coefficient between age and weight
0.8. Test whether the correlation coefficients between the age and weight of the children in the two
cities are equal at 5% level of significance.
Solution
H0: The correlation coefficients between the age and weight of the children in the two cities are equal,
i.e., H0: ρ1 = ρ2.
H1: The correlation coefficients between the age and weight of the children in the two cities are not
equal, i.e., H1: ρ1 ≠ ρ2.
Level of Significance: α = 0.05 and Critical value: Z0.05 = 1.96.
Calculations:
$$U_1 = \frac{1}{2}\log_e\frac{1+r_1}{1-r_1} = \frac{1}{2}\log_e\frac{1+0.72}{1-0.72} = 0.9076$$
$$U_2 = \frac{1}{2}\log_e\frac{1+r_2}{1-r_2} = \frac{1}{2}\log_e\frac{1+0.80}{1-0.80} = 1.0986$$
Test Statistic (under H0: ρ1 = ρ2, so that ζ1 = ζ2):
$$Z = \frac{(U_1 - U_2) - (\zeta_1 - \zeta_2)}{\sqrt{\dfrac{1}{n_1-3} + \dfrac{1}{n_2-3}}} = \frac{0.9076 - 1.0986}{\sqrt{\dfrac{1}{29-3} + \dfrac{1}{29-3}}} = -0.689$$
Conclusion: Since |Z| = 0.689 < Z0.05, H0 is accepted, and we conclude that the correlation coefficients
between the age and weight of the children in the two cities are equal.
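The two-sample comparison can be sketched in Python as follows. The helper names are illustrative and are not taken from the text; under H0 the term (ζ1 - ζ2) is zero, so it is omitted from the numerator.

```python
import math

def fisher_z(r):
    """Fisher's transformation: 0.5 * log_e((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def two_sample_correlation_z(r1, n1, r2, n2):
    """Z statistic for H0: rho1 = rho2 from two independent samples."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se

# Inputs of the example: r1 = 0.72, r2 = 0.80, n1 = n2 = 29
z = two_sample_correlation_z(0.72, 29, 0.80, 29)
print(f"Z = {z:.3f}; |Z| < 1.96: {abs(z) < 1.96}")
```

Because the transformed values are not rounded before the subtraction, the result may differ in the last decimal place from a hand calculation that rounds U1 and U2 first.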
TEST 23
Aim
To test whether the multiple correlation coefficient in the population is zero, based on a sample multiple
correlation coefficient. That is, to investigate the significance of the difference between the observed
sample multiple correlation coefficient and zero.
Source
A random sample of size n is drawn from a (k+1)-variate population, with multiple correlation
coefficient R. That is, R is the observed multiple correlation coefficient of a variate (say, X1) with k
other variates (say, X2, X3, …, Xk+1). Let ρ be the corresponding multiple correlation coefficient in the
population.
Assumptions
(i) The population from which the sample is drawn is a (k+1)-variate normal population.
(ii) The relationships among X1, X2, …, Xk+1 are linear.
Null Hypothesis
H0: The population multiple correlation coefficient, ρ, is zero. That is, there is no significant
difference between the sample multiple correlation coefficient R and zero, i.e., H0: ρ = 0.
Alternative Hypothesis
H1: ρ ≠ 0.
Test Statistic
$$F = \frac{R^{2}}{1-R^{2}}\cdot\frac{n-k-1}{k}$$
The statistic F follows F distribution with (k, n-k-1) degrees of freedom.
Conclusion
If F ≤ Fα, we conclude that the data do not provide any evidence against the null hypothesis
H0, and hence it may be accepted at α% level of significance. Otherwise, reject H0 and accept H1.
Example
A random sample of 15 students was selected from a school and their marks in three subjects
were recorded. The multiple correlation coefficient of the first subject on the other two subjects
for the 15 students is found to be 0.65. Test whether the multiple correlation coefficient of the first
subject on the other two subjects in the school students is zero or not at 5% level of significance.
Solution
H0: The multiple correlation coefficient of the first subject on the other two subjects in the school
students is zero.
H1: The multiple correlation coefficient of the first subject on the other two subjects in the school
students is not zero.
Level of Significance: α = 0.05 and Critical value: F0.05,(2,12) = 3.89
Test Statistic (with n = 15 and k = 2, since the first subject is related to two other subjects):
$$F = \frac{R^{2}}{1-R^{2}}\cdot\frac{n-k-1}{k} = \frac{(0.65)^{2}}{1-(0.65)^{2}}\cdot\frac{15-2-1}{2} = 4.39$$
Conclusion: Since F > F0.05,(2,12), H0 is rejected, and we conclude that the multiple correlation
coefficient of the first subject on the other two subjects in the school students is not zero.
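The F statistic for a multiple correlation coefficient is easy to evaluate directly. The Python sketch below uses the quantities stated in the example (R = 0.65, n = 15) and takes k = 2 on the assumption that k counts the other variates, here the other two subjects; the function name multiple_corr_f is illustrative.

```python
def multiple_corr_f(R, n, k):
    """F statistic and degrees of freedom (k, n - k - 1) for H0: multiple correlation is zero."""
    df1, df2 = k, n - k - 1
    f = (R ** 2 / (1 - R ** 2)) * (df2 / df1)
    return f, (df1, df2)

# Values stated in the example: R = 0.65, n = 15, k = 2 other subjects
f, df = multiple_corr_f(R=0.65, n=15, k=2)
print(f"F = {f:.2f} with degrees of freedom {df}")
```

The computed F is compared with the tabulated F value for the returned pair of degrees of freedom.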
TEST 24
Aim
To test whether the population regression coefficient of Y on X, denoted by β, can be regarded as zero,
based on a bivariate random sample. That is, to investigate the significance of the difference between the
sample regression coefficient of Y on X, b, and zero.
Source
Let (Xi, Yi), (i = 1, 2, …, n) be a random sample of n pairs of observations drawn from a
bivariate normal population whose regression coefficient of Y on X is β. The sample regression coefficient
of Y on X is denoted by b.
Assumptions
(i) The population from which the sample is drawn is a bivariate normal population.
(ii) The relationship between X and Y is linear.
Null Hypothesis
H0: The population regression coefficient of Y on X, β, is zero. That is, there is no significant
difference between the sample regression coefficient of Y on X, b, and zero, i.e., H0: β = 0.
Alternative Hypothesis
H1: β ≠ 0
Test Statistic
$$t = (b-\beta)\sqrt{\frac{(n-2)\sum_{i}(X_i-\bar{X})^{2}}{\sum_{i}(Y_i-\hat{y}_i)^{2}}}\qquad(\text{Under } H_0: \beta = 0)$$
where
$$b = \frac{\sum_{i}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i}(X_i-\bar{X})^{2}}$$
and $\hat{y}_i = \bar{Y} + b(X_i-\bar{X})$ is the estimate of Y for a given value Xi of X from the
regression line of Y on X (for the given sample). The statistic t follows t distribution with
(n-2) degrees of freedom.
Example
A sample study was conducted on weight (Y) and age (X) of a sample of 8 children from a city.
The regression coefficient of Y on X is found to be 0.665, and the sum of squares of deviations from
the mean is 44 for Y and 36 for X. Test whether the regression coefficient of weight on age of the
children in the city is zero or not at 5% level of significance.
Solution
H0: The regression coefficient of weight on age of the children in the city is zero, i.e., β = 0.
H1: The regression coefficient of weight on age of the children in the city is not zero, i.e.,
β ≠ 0.
Level of significance: α = 0.05 and Critical value: t0.05,6 = 2.45
Calculations: The sum of squares of deviations of Y from the fitted values is
$$\sum_{i}(Y_i-\hat{y}_i)^{2} = \sum_{i}(Y_i-\bar{Y})^{2} - b^{2}\sum_{i}(X_i-\bar{X})^{2} = 44 - (0.665)^{2}(36) = 28.08$$
Test Statistic:
$$t = (b-\beta)\sqrt{\frac{(n-2)\sum_{i}(X_i-\bar{X})^{2}}{\sum_{i}(Y_i-\hat{y}_i)^{2}}} = 0.665\sqrt{\frac{(8-2)(36)}{28.08}} = 1.844$$
Conclusion: Since |t| < t0.05,6, H0 is accepted, and we conclude that the regression coefficient of
weight on age of the children in the city may be regarded as zero.
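A short Python sketch of this test from summary quantities is given below. It assumes, as the example states, that the two sums of squares given are deviations from the means, so the residual sum of squares is first obtained from the least-squares identity Σ(Y - ŷ)² = Σ(Y - Ȳ)² - b² Σ(X - X̄)². The function name slope_t_test and its argument names are illustrative.

```python
import math

def slope_t_test(b, ss_x, ss_y_total, n, beta0=0.0):
    """t statistic (n - 2 df) for H0: beta = beta0, from sums of squares about the means."""
    # Residual sum of squares about the fitted line
    rss = ss_y_total - b ** 2 * ss_x
    t = (b - beta0) * math.sqrt((n - 2) * ss_x / rss)
    return t, rss

# Values stated in the example: b = 0.665, SS of X = 36, SS of Y = 44, n = 8
t, rss = slope_t_test(b=0.665, ss_x=36, ss_y_total=44, n=8)
print(f"residual SS = {rss:.2f}, t = {t:.3f}")
```

If the residual sum of squares is already known, it can be used directly in place of the identity in the first step.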
TEST 25
Aim
To test whether the regression passes through the origin. That is, to investigate the significance of the
difference between the intercept of a regression and zero.
Source
A random sample of size n is drawn from a bivariate population. The intercept of the regression
in the population is denoted by α. The regression with α = 0 is known as regression through the origin.
The linear regression in the sample is y = a + bx, where a is the intercept and b is the slope of the linear
regression.
Assumptions
(i) The population from which the sample is drawn is a bivariate normal population.
(ii) The relationship between Y and X is linear.
Null Hypothesis
H0: The intercept of the regression in the population is zero. That is, there is no significant
difference between the intercept of the linear regression in the sample and zero, i.e., H0: α = 0.
Alternative Hypothesis
H1: α ≠ 0.
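Method
For the given bivariate data on n observations, with Y as the dependent variable and X as the
independent variable, calculate the following:
(i) $\sum y$, $\sum y^{2}$, $\sum x$, $\sum x^{2}$, $\sum xy$, $\bar{x}$ and $\bar{y}$.
(ii) Sum of Squares of the observations y = SS(Y) = $\sum y^{2} - \dfrac{(\sum y)^{2}}{n}$.
(iii) Sum of Squares of the observations x = SS(X) = $\sum x^{2} - \dfrac{(\sum x)^{2}}{n}$.
(iv) Sum of Products of the observations x and y = SP(XY) = $\sum xy - \dfrac{(\sum x)(\sum y)}{n}$.
(v) The regression coefficient, b = $\dfrac{SP(XY)}{SS(X)}$.
(vi) The intercept of the regression, a = $\bar{y} - b\bar{x}$.
(vii) Sum of Squares due to regression b = SS(b) = $\dfrac{[SP(XY)]^{2}}{SS(X)}$.
(viii) Error Sum of Squares, ESS = SS(Y) - SS(b).
(ix) Error Mean Square, $s_e^{2} = \dfrac{ESS}{n-2}$.
Test Statistic
$$t = \frac{a - 0}{\sqrt{s_e^{2}\left[\dfrac{1}{n} + \dfrac{\bar{x}^{2}}{SS(X)}\right]}}$$
The statistic t follows t distribution with (n-2) degrees of freedom.
Conclusion
If |t| ≤ tα, we conclude that the data do not provide any evidence against the null hypothesis
H0, and hence it may be accepted at α% level of significance. Otherwise, reject H0 and accept H1.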
Example
From a Sorghum field, 36 plants were selected at random. The length of panicles (x) and the
number of grains per panicle (y) of the selected plants were recorded. The results are given below. Fit
a regression line of Y on X and test whether the intercept is zero at 5% level of significance.
y x y x y x
Solution
H0: The intercept of the regression in the population is zero. That is, there is no significant
difference between the intercept of the linear regression in the sample and zero, i.e., H0: α = 0.
H1: α ≠ 0.
Level of Significance: α = 0.05 and Critical value: t0.05,34 = 2.04
Calculations:
(i) $\sum y = 4174$, $\sum y^{2} = 496258$, $\sum x = 822.9$, $\sum x^{2} = 18876.83$, so that
$\bar{y} = 115.94$ and $\bar{x} = 22.86$.
(ii) Sum of Squares of the observations y = SS(Y) = $\sum y^{2} - \dfrac{(\sum y)^{2}}{n}$ = 12305.89.
(iii) Sum of Squares of the observations x = SS(X) = $\sum x^{2} - \dfrac{(\sum x)^{2}}{n}$ = 66.7075.
(iv) Sum of Products of the observations x and y = SP(XY) = $\sum xy - \dfrac{(\sum x)(\sum y)}{n}$ = 772.7167.
(v) The regression coefficient, b = $\dfrac{SP(XY)}{SS(X)}$ = 11.5837.
(vi) The intercept of the regression, a = $\bar{y} - b\bar{x}$ = -148.8396.
(vii) Sum of Squares due to regression b = SS(b) = $\dfrac{[SP(XY)]^{2}}{SS(X)}$ = 8950.884.
(viii) ESS = SS(Y) - SS(b) = 3355.0048.
(ix) Error Mean Square, $s_e^{2} = \dfrac{ESS}{n-2} = \dfrac{3355.0048}{34}$ = 98.6766.
Test Statistic:
$$t = \frac{a - 0}{\sqrt{s_e^{2}\left[\dfrac{1}{n} + \dfrac{\bar{x}^{2}}{SS(X)}\right]}} = \frac{-148.8396 - 0}{\sqrt{98.6766\left[\dfrac{1}{36} + \dfrac{(22.86)^{2}}{66.7075}\right]}} = -5.34$$
Conclusion: Since |t| = 5.34 > t0.05,34, H0 is rejected, and we conclude that the intercept is significantly
different from zero. In other words, the regression does not pass through the origin.
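Steps (ii) to (ix) and the test statistic can be reproduced from the summary sums with the Python sketch below. The function name intercept_t_test and its argument names are illustrative; the sum of products SP(XY) is passed in as given in the calculations because the raw sum of xy is not stated, and the error mean square is taken with n - 2 degrees of freedom, which is an assumption consistent with the critical value t0.05,34 used for n = 36.

```python
import math

def intercept_t_test(n, sum_x, sum_y, sum_x2, sum_y2, sp_xy):
    """Intercept a and its t statistic for H0: intercept = 0, from summary sums."""
    x_bar, y_bar = sum_x / n, sum_y / n
    ss_x = sum_x2 - sum_x ** 2 / n      # SS(X)
    ss_y = sum_y2 - sum_y ** 2 / n      # SS(Y)
    b = sp_xy / ss_x                    # regression coefficient
    a = y_bar - b * x_bar               # intercept
    ess = ss_y - sp_xy ** 2 / ss_x      # error sum of squares
    se2 = ess / (n - 2)                 # error mean square
    t = a / math.sqrt(se2 * (1 / n + x_bar ** 2 / ss_x))
    return a, t

# Summary sums from the Sorghum example
a, t = intercept_t_test(n=36, sum_x=822.9, sum_y=4174,
                        sum_x2=18876.83, sum_y2=496258, sp_xy=772.7167)
print(f"a = {a:.2f}, t = {t:.2f}")
```

The sign of t follows the sign of the intercept; the decision uses its absolute value.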
CHAPTER 3
ANALYSIS OF VARIANCE TESTS
TEST 26
Aim
To test the significance of the t treatment effects based on the observations from n experimental
units.
Source
Let yij, (i = 1, 2, …, t; j = 1, 2, …, r) be the observations of t treatments, each replicated
r times (an equal number of replications) in n experimental units (i.e., n = tr). In this design,
treatments are allocated at random to the experimental units over the entire experimental material.
That is, the entire experimental material is divided into n experimental units and the treatments are
distributed completely at random over the units.
Linear Model
The linear model is yij = μ + αi + εij; (i = 1, 2, …, t; j = 1, 2, …, r),
where yij is the observation from the jth replication of the ith treatment, μ is the overall mean effect, αi
is the effect due to the ith treatment and εij is the error effect due to chance causes.
Assumptions
(i) The population from which the observations are drawn is Normal.
(ii) The observations are independent.
(iii) The various effects are additive in nature.
(iv) εij are independently and identically distributed as Normal with mean zero and variance σ².
Null Hypothesis
H0: The t treatments have equal effect, i.e., H0: α1 = α2 = … = αt.
Alternative Hypothesis
H1: The t treatments do not have equal effect,
i.e., H1: not all of α1, α2, …, αt are equal.
Method
Calculate the following, based on the observations:
1. Grand Total, $G = \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}$
2. Correction Factor, $CF = \dfrac{G^{2}}{n}$
3. Total Sum of Squares, $TSS = \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}^{2} - CF$
4. Sum of Squares between Treatments, $SST = \dfrac{1}{r}\sum_{i=1}^{t} T_i^{2} - CF$,
where Ti is the total of the ith treatment observations from all the replications.
5. Error Sum of Squares (Sum of Squares within treatments), ESS = TSS - SST
Test Statistic
$$F = \frac{SST/(t-1)}{ESS/(n-t)}$$
The statistic F follows F distribution with (t-1, n-t) degrees of freedom.
Conclusion
If F ≤ Fα,(t-1, n-t), we conclude that the data do not provide any evidence against the null
hypothesis H0, and hence it may be accepted at α% level of significance. Otherwise, reject H0 and accept
H1.
Example 1
The following data give the weight gains of 20 chicks on which four tropical feedstuffs A, B, C, D
were tried. All the twenty chicks are treated alike in all respects except the feeding treatments, and each
feeding treatment is given to five chicks. Test whether all the four feedstuffs are alike in weight gain of
the chicks at 5% level of significance.
A: 55 49 42 21 52
B: 61 112 30 89 63
C: 42 97 81 95 92
D: 169 137 169 85 154
Solution
Aim: To test whether all the four feedstuffs are equal in weight gain of chicks.
H0: The four feedstuffs are equal in weight gain of chicks.
H1: The four feedstuffs are not equal in weight gain of chicks.
Level of Significance: α = 0.05 and Critical value: F0.05,(3,16) = 3.24
Calculations: Number of treatments, t = 4; n = 20
T1 = 219, T2 = 355, T3 = 407, T4 = 714; Grand Total, G = 1695
CF = 1695²/20 = 143651.25
TSS = 55² + … + 154² - CF = 181445 - 143651.25 = 37793.75
SST = (1/5)(219² + … + 714²) - CF = 26234.95
ESS = TSS - SST = 11558.80
ANOVA Table:
Sources of      Degrees of    Sum of        Mean sum
variation       freedom       squares       of squares
Treatments      3             26234.95      8744.98
Error           16            11558.80      722.42
Total           19            37793.75
Test Statistic:
$$F = \frac{SST/(t-1)}{ESS/(n-t)} = \frac{8744.98}{722.42} = 12.105$$
Conclusion: Since F > F0.05,(3,16), we conclude that the data provide evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 5% level of significance. That is, the four
feedstuffs are not equal in weight gain of chicks.
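The sums of squares and the F ratio of a completely randomized design can be verified with a few lines of Python. This is only a sketch under the equal-or-unequal replication formula SST = Σ Ti²/ri - CF; the function name crd_anova is illustrative, and the data are those of Example 1.

```python
def crd_anova(groups):
    """One-way (CRD) analysis of variance; groups is a list of lists of observations."""
    n = sum(len(g) for g in groups)
    t = len(groups)
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n                            # correction factor
    tss = sum(y * y for g in groups for y in g) - cf     # total sum of squares
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cf # between treatments
    ess = tss - sst                                      # within treatments
    f = (sst / (t - 1)) / (ess / (n - t))
    return {"TSS": tss, "SST": sst, "ESS": ess, "F": f, "df": (t - 1, n - t)}

# Weight gains for feedstuffs A, B, C and D (Example 1)
feeds = [
    [55, 49, 42, 21, 52],
    [61, 112, 30, 89, 63],
    [42, 97, 81, 95, 92],
    [169, 137, 169, 85, 154],
]
result = crd_anova(feeds)
print({k: (round(v, 2) if isinstance(v, float) else v) for k, v in result.items()})
```

The same function applies to Example 2 by passing the five lists of sesame yields.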
Example 2
In order to study the yield of five types of sesame, say, A, B, C, D, E, an experiment was
conducted using CRD with four pots per type. The outputs are given below. Examine whether all the
five types of sesame are equal in their yield at 1% level of significance.
A: 25 21 21 18
B: 25 28 24 25
C: 24 24 16 21
D: 20 17 16 19
E: 14 15 13 11
Solution
Aim: To test whether all the five types of sesame are equal in their yields.
H0: The five types of sesame are equal in their yields.
H1: The five types of sesame are not equal in their yields.
Level of Significance: α = 0.01 and Critical value: F0.01,(4,15) = 4.89
Calculations: Number of treatments, t = 5; n = 20; Grand Total, G = 397
T1 = 85, T2 = 102, T3 = 85, T4 = 72, T5 = 53
CF = 397²/20 = 7880.45
TSS = 25² + … + 11² - CF = 8307 - 7880.45 = 426.55
SST = (1/4)(85² + … + 53²) - CF = 331.30
ESS = TSS - SST = 95.25
ANOVA Table:
Sources of      Degrees of    Sum of        Mean sum
variation       freedom       squares       of squares
Treatments      4             331.30        82.825
Error           15            95.25         6.35
Total           19            426.55
Test Statistic:
$$F = \frac{SST/(t-1)}{ESS/(n-t)} = \frac{82.825}{6.35} = 13.04$$
Conclusion: Since F > F0.01,(4,15), we conclude that the data provide evidence against the null
hypothesis H0 and in favor of H1. Hence, H1 is accepted at 1% level of significance. That is, the five
types of sesame are not equal in their yields.
EXERCISES
1. To test the effect of small proportion of coal in the sand used for manufacturing concrete, several
batches were mixed under identical conditions except for the variation in the percentage of coal.
From each batch, several cylinders were made and tested for breaking strength. The results
obtained are given below.
Test whether all the five cylinders show equal breaking strength.
2. A varietal trial on green gram was conducted in a CRD with five varieties. The results are given
below. Test whether all the five varieties of green gram are equal in their yields at 1% level of
significance.
Varieties
1 2 3 4 5
12.5 14.2 14.6 15.2 13.5
14.2 13.5 14.3 14.8 14.2
13.2 12.8 13.8 15.6 14.6
14.3 12.9 12.9 14.9 15.2
15.2 13.2 14.2 15.3 14.9
TEST 27
Aim
To test the significance of the treatment effects and the significance of the regression coefficient
of Y on X, based on the observations from n experimental units.
Source
Let (Yij, Xij) (i = 1, 2, …, t; j = 1, 2, …, r) be the observations on two variables Y and X from an
experiment consisting of t treatments, each replicated r times. The observations on the auxiliary or
concomitant variable X, apart from the main variable Y under study, are available for each of the
experimental units. When Y and X are associated, a part of the variation of Y is due to variation in values
of X. After eliminating the effects of treatments, one can estimate a relationship
between Y and X and use that relationship to predict the value of Y for a given value of X. This test is
used for assessing the significance of the relationship between X and Y. If there is a significant association
between X and Y, one may calculate the adjusted treatment sum of squares and perform the test for the
homogeneity of treatment effects. Let n = t × r. The observed data are arranged as follows:
                            Treatments
          1               2          …          t
       Y     X         Y     X               Y     X
      Y11   X11       Y21   X21      …      Yt1   Xt1
      Y12   X12       Y22   X22      …      Yt2   Xt2
       …     …         …     …       …       …     …
      Y1r   X1r       Y2r   X2r      …      Ytr   Xtr
Treatment totals
      TY1   TX1       TY2   TX2      …      TYt   TXt
Linear Model
Assumptions
(i) The population from which the observations are drawn is Normal.
(ii) The observations are independent.
(iii) The various effects are additive in nature.
(iv) εij are independently and identically distributed as Normal with mean zero and variance σ².
Null Hypotheses
H0(1): The regression coefficient b is insignificant.
H0(2): The t treatments have equal effect,
i.e., H0(2): α1 = α2 = … = αt.
Alternative Hypotheses
H1(1): The regression coefficient b is significant.
H1(2): The t treatments do not have equal effect,
i.e., H1(2): not all of α1, α2, …, αt are equal.
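Method
Calculate the following, based on the observations.
For variable Y
1. Grand Total, $G_Y = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}$
2. Correction Factor, $CF_Y = \dfrac{G_Y^{2}}{n}$
3. Total Sum of Squares, $G_{YY} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^{2} - CF_Y$
4. Treatment Sum of Squares, $T_{YY} = \dfrac{1}{r}\sum_{i=1}^{t} T_{Yi}^{2} - CF_Y$,
where TYi is the total of the ith treatment observations of Y.
5. Error Sum of Squares, $E_{YY} = G_{YY} - T_{YY}$
For variable X
6. Grand Total, $G_X = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}$
7. Correction Factor, $CF_X = \dfrac{G_X^{2}}{n}$
8. Total Sum of Squares, $G_{XX} = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}^{2} - CF_X$
9. Treatment Sum of Squares, $T_{XX} = \dfrac{1}{r}\sum_{i=1}^{t} T_{Xi}^{2} - CF_X$,
where TXi is the total of the ith treatment observations of X, from all the replications.
10. Error Sum of Squares, $E_{XX} = G_{XX} - T_{XX}$
11. Correction Factor, $CF_{YX} = \dfrac{G_Y G_X}{n}$
12. Total Sum of Products of Y and X, $G_{YX} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}X_{ij} - CF_{YX}$
13. Treatment Sum of Products of Y and X, $T_{YX} = \dfrac{1}{r}\sum_{i=1}^{t} T_{Yi}T_{Xi} - CF_{YX}$
14. Error Sum of Products, $E_{YX} = G_{YX} - T_{YX}$
15. The regression coefficient within treatments, $b = E_{YX}/E_{XX}$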
Test Statistic
$$F_1 = \frac{\left(E_{YX}^{2}/E_{XX}\right)/1}{\left(E_{YY} - E_{YX}^{2}/E_{XX}\right)/(n-t-1)}$$
Conclusion
If F1 ≤ Fα,(1, n-t-1), accept H0(1) and conclude that the regression coefficient of Y on X is insignificant.
If F1 > Fα,(1, n-t-1), reject H0(1), accept H1(1) and conclude that the regression coefficient of Y on X
is significant, and proceed to make adjustments for the variate.
Calculate the following adjusted values for the variable Y:
$$G'_{YY} = G_{YY} - \frac{G_{YX}^{2}}{G_{XX}};\qquad E'_{YY} = E_{YY} - \frac{E_{YX}^{2}}{E_{XX}};\qquad T'_{YY} = G'_{YY} - E'_{YY}$$
One degree of freedom is lost in error due to fitting a regression line. The above calculations are
provided as a single table as follows:
TAR denotes the Treatment Adjusted for the average Regression within Treatments.
Sources of      Degrees of    Sum of        Mean sum
variation       freedom       squares       of squares
TAR             t-1           T'YY          T'YY/(t-1)
Error           n-t-1         E'YY          E'YY/(n-t-1)
Total           n-2           G'YY
Test Statistic
$$F_2 = \frac{T'_{YY}/(t-1)}{E'_{YY}/(n-t-1)}$$
Conclusion
If F2 ≤ Fα,(t-1, n-t-1), we conclude that the data do not provide any evidence against the null
hypothesis H0(2), and hence it may be accepted at α% level of significance. Otherwise, reject H0(2) and
accept H1(2).
Example
The following data show the age, X (in months), and weight, Y (in kg), of samples of children
from three states, namely Tamilnadu (A), Kerala (B) and Karnataka (C). Test whether the regression
coefficient of Y on X is significant and whether the children from the three states are homogeneous.
A B C
Y X Y X Y X
7.25 9 10.5 10 8.5 8
8.65 10 12.5 11 12.5 9
12.5 12 7.5 6 18.5 15
15.5 14 15.5 12 16.5 13
16.5 15 16.5 14 13.5 10
Solution
H0(1): The regression coefficient of weight on age, b, is insignificant.
H0(2): The children from the three states are homogeneous.
H1(1): The regression coefficient of weight on age, b, is significant.
H1(2): The children from the three states are not homogeneous.
Level of Significance: α = 0.05
Critical Values: F0.05,(1,11) = 4.84 and F0.05,(2,11) = 3.98
Calculations: Here t = 3, r = 5 and n = 15. The treatment totals are TY1 = 60.4, TY2 = 62.5, TY3 = 69.5
and TX1 = 60, TX2 = 53, TX3 = 55.
For variable Y
1. GY = 192.4;  2. CFY = GY²/n = (192.4)²/15 = 2467.85
3. GYY = ΣΣ Yij² - CFY = 2656.635 - 2467.85 = 188.785
4. TYY = (1/r) Σ TYi² - CFY = 2476.932 - 2467.85 = 9.082
5. EYY = GYY - TYY = 188.785 - 9.082 = 179.703
For variable X
6. GX = ΣΣ Xij = 168;  7. CFX = GX²/n = (168)²/15 = 1881.6
8. GXX = ΣΣ Xij² - CFX = 1982 - 1881.6 = 100.4
9. TXX = (1/r) Σ TXi² - CFX = 1886.8 - 1881.6 = 5.2
10. EXX = GXX - TXX = 100.4 - 5.2 = 95.2
11. CFYX = GY GX/n = (192.4)(168)/15 = 2154.88
12. GYX = ΣΣ Yij Xij - CFYX = 2278.25 - 2154.88 = 123.37
13. TYX = (1/r) Σ TYi TXi - CFYX = 2151.8 - 2154.88 = -3.08
14. EYX = GYX - TYX = 123.37 - (-3.08) = 126.45
15. b = EYX/EXX = 126.45/95.2 = 1.328
Test Statistic:
$$F_1 = \frac{\left(E_{YX}^{2}/E_{XX}\right)/1}{\left(E_{YY} - E_{YX}^{2}/E_{XX}\right)/(n-t-1)} = \frac{15989.6025/95.2}{(179.703 - 167.958)/11} = \frac{167.958}{1.0677} = 157.30$$
Conclusion: Since F1 > F0.05,(1,11), reject H0(1), accept H1(1) and conclude that the regression
coefficient of Y on X is significant. That is, the regression coefficient of weight on age of the children
is significant.
Calculate the following adjusted values for the variable Y:
$$G'_{YY} = G_{YY} - \frac{G_{YX}^{2}}{G_{XX}} = 188.785 - \frac{(123.37)^{2}}{100.4} = 37.1898$$
$$E'_{YY} = E_{YY} - \frac{E_{YX}^{2}}{E_{XX}} = 179.703 - \frac{(126.45)^{2}}{95.2} = 11.745$$
$$T'_{YY} = G'_{YY} - E'_{YY} = 37.1898 - 11.745 = 25.4448$$
ANOCOVA Table:
Sources of      Degrees of    Sum of squares and products
variation       freedom       Y           X          YX
Treatments      2             9.082       5.2        -3.08
Error           12            179.703     95.2       126.45
Total           14            188.785     100.4      123.37
TAR denotes the treatment adjusted for the average regression within treatments.
Sources of      Degrees of    Sum of        Mean sum
variation       freedom       squares       of squares
TAR             2             25.4448       12.7224
Error           11            11.745        1.0677
Total           13            37.1898
Test Statistic:
$$F_2 = \frac{T'_{YY}/(t-1)}{E'_{YY}/(n-t-1)} = \frac{12.7224}{1.0677} = 11.92$$
Conclusion: Since F2 > F0.05,(2,11), we conclude that the data provide evidence against the null
hypothesis H0(2) and in favor of H1(2). Hence H1(2) is accepted at 5% level of significance. That is,
the children in the three states are not homogeneous in their weights after adjusting for age.
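Because the analysis of covariance involves many intermediate sums, a direct computation from the raw data is a useful check. The Python sketch below follows steps 1 to 15 and the adjusted sums of squares for a CRD with equal replication; the function name ancova_crd and the dictionary keys are illustrative, not notation from the text.

```python
def ancova_crd(groups):
    """ANOCOVA for a CRD. groups: one list of (y, x) pairs per treatment, equal replication."""
    t, r = len(groups), len(groups[0])
    n = t * r
    pairs = [p for g in groups for p in g]
    gy = sum(y for y, _ in pairs)
    gx = sum(x for _, x in pairs)
    ty = [sum(y for y, _ in g) for g in groups]   # treatment totals of Y
    tx = [sum(x for _, x in g) for g in groups]   # treatment totals of X

    # Total, treatment and error sums of squares and products
    g_yy = sum(y * y for y, _ in pairs) - gy ** 2 / n
    g_xx = sum(x * x for _, x in pairs) - gx ** 2 / n
    g_yx = sum(y * x for y, x in pairs) - gy * gx / n
    t_yy = sum(v * v for v in ty) / r - gy ** 2 / n
    t_xx = sum(v * v for v in tx) / r - gx ** 2 / n
    t_yx = sum(a * b for a, b in zip(ty, tx)) / r - gy * gx / n
    e_yy, e_xx, e_yx = g_yy - t_yy, g_xx - t_xx, g_yx - t_yx

    df_err = n - t - 1
    reg_ss = e_yx ** 2 / e_xx                # sum of squares due to regression
    e_adj = e_yy - reg_ss                    # adjusted error sum of squares
    g_adj = g_yy - g_yx ** 2 / g_xx          # adjusted total sum of squares
    t_adj = g_adj - e_adj                    # adjusted treatment sum of squares
    return {
        "b": e_yx / e_xx,
        "F1": reg_ss / (e_adj / df_err),
        "F2": (t_adj / (t - 1)) / (e_adj / df_err),
        "E_adj": e_adj, "T_adj": t_adj, "G_adj": g_adj,
    }

# (weight, age) pairs for states A, B and C
states = [
    [(7.25, 9), (8.65, 10), (12.5, 12), (15.5, 14), (16.5, 15)],
    [(10.5, 10), (12.5, 11), (7.5, 6), (15.5, 12), (16.5, 14)],
    [(8.5, 8), (12.5, 9), (18.5, 15), (16.5, 13), (13.5, 10)],
]
out = ancova_crd(states)
print({k: round(v, 3) for k, v in out.items()})
```

F1 is referred to the F table with (1, n-t-1) degrees of freedom and F2 with (t-1, n-t-1) degrees of freedom.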
TEST 28
Aim
To test the significance of the t treatment effects and the significance of the r block effects based
on the observations from n experimental units.
Source
Let yij, (i = 1, 2, …, t; j = 1, 2, …, r) be the observations of t treatments, each applied
r times (an equal number of replications) in n experimental units. In this design, the entire experimental
material is divided into r homogeneous blocks, and each block is further divided into t sub-units such that
t × r = n. The t treatments are allocated at random within each of the r blocks. That is,
randomization is restricted within blocks.
Linear Model
The linear model is yij = μ + αi + βj + εij; (i = 1, 2, …, t; j = 1, 2, …, r)
where yij is the observation from the jth block of the ith treatment, μ is the overall mean effect, αi is the
effect due to the ith treatment, βj is the effect due to the jth block and εij is the error effect due to
chance causes.
Assumptions
(i) The population from which the observations are drawn is Normal.
(ii) The observations are independent.
(iii) The various effects are additive in nature.
(iv) εij are independently and identically distributed as Normal with mean zero and variance σ².
Null Hypotheses
H0(1): The t treatments have equal effect, i.e., H0: α1 = α2 = … = αt.
H0(2): The r blocks have equal effect, i.e., H0: β1 = β2 = … = βr.
Alternative Hypotheses
H1(1): The t treatments do not have equal effect,
i.e., H1: not all of α1, α2, …, αt are equal.
H1(2): The r blocks do not have equal effect,
i.e., H1: not all of β1, β2, …, βr are equal.
Method
Calculate the following, based on the observations.
1. Grand Total, $G = \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}$
2. Correction Factor, $CF = \dfrac{G^{2}}{n}$
3. Total Sum of Squares, $TSS = \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}^{2} - CF$
4. Sum of Squares between Treatments, $SST = \dfrac{1}{r}\sum_{i=1}^{t} T_i^{2} - CF$,
where Ti is the total of the ith treatment observations.
5. Sum of Squares between Blocks, $SSB = \dfrac{1}{t}\sum_{j=1}^{r} B_j^{2} - CF$,
where Bj is the total of the jth block observations.
6. Error Sum of Squares, ESS = TSS - SST - SSB
ANOVA Table:
Sources of      Degrees of      Sum of        Mean sum
variation       freedom         squares       of squares
Treatments      t-1             SST           SST/(t-1)
Blocks          r-1             SSB           SSB/(r-1)
Error           (t-1)(r-1)      ESS           ESS/[(t-1)(r-1)]
Total           n-1             TSS
Test Statistics
$$(1)\quad F_1 = \frac{SST/(t-1)}{ESS/[(t-1)(r-1)]}\qquad\qquad (2)\quad F_2 = \frac{SSB/(r-1)}{ESS/[(t-1)(r-1)]}$$
The statistic F1 follows F distribution with (t-1), (t-1)(r-1) degrees of freedom and the
statistic F2 follows F distribution with (r-1), (t-1)(r-1) degrees of freedom.
Conclusions
If F1 ≤ Fα,(t-1),(t-1)(r-1), we conclude that the data do not provide any evidence against the null
hypothesis H0(1), and hence it may be accepted at α% level of significance. Otherwise, reject H0(1) and
accept H1(1).
If F2 ≤ Fα,(r-1),(t-1)(r-1), we conclude that the data do not provide any evidence against the null
hypothesis H0(2), and hence it may be accepted at α% level of significance. Otherwise, reject H0(2) and
accept H1(2).
Example 1
The following results show the yields of three varieties of paddy, each grown in four blocks using an
RBD layout.
                  Paddy Varieties
Block      ADT36     IR20     PONNI     Total
I           46.2     48.5      54.3       149
II          48.4     52.6      57.0       158
III         44.3     51.4      53.3       149
IV          49.1     53.5      51.4       154
Total        188      206       216       610
Solution
Aim: 1. To test whether the yields of all the three varieties of paddy are equal.
2. To test whether the yields in all the four blocks are equal.
H0(1): The yields of all the three varieties of paddy are homogeneous.
H0(2): The yields in all the four blocks are homogeneous.
H1(1): The yields of all the three varieties of paddy are not homogeneous.
H1(2): The yields in all the four blocks are not homogeneous.
Level of Significance: α = 0.05
Critical values: F0.05,(2,6) = 5.14 and F0.05,(3,6) = 4.76
Calculations:
No. of treatments, t = 3; No. of Blocks, r = 4; Grand total, G = 610
CF = 610²/12 = 31008.33
TSS = 46.2² + … + 51.4² - CF = 31153.86 - 31008.33 = 145.53
SST = (1/4)(188² + 206² + 216²) - CF = 100.67
SSB = (1/3)(149² + 158² + 149² + 154²) - CF = 19.003
ESS = TSS - SST - SSB = 25.857
ANOVA Table:
Sources of      Degrees of    Sum of        Mean sum
variation       freedom       squares       of squares
Treatments      2             100.67        50.335
Blocks          3             19.003        6.334
Error           6             25.857        4.3095
Total           11            145.53
Test Statistics:
$$1.\quad F_1 = \frac{SST/(t-1)}{ESS/[(t-1)(r-1)]} = \frac{50.335}{4.3095} = 11.68$$
$$2.\quad F_2 = \frac{SSB/(r-1)}{ESS/[(t-1)(r-1)]} = \frac{6.334}{4.3095} = 1.47$$
Conclusions: Since F1 > F0.05,(2,6), H0(1) is rejected, and we conclude that the yields of the three
varieties of paddy are not homogeneous. Since F2 < F0.05,(3,6), H0(2) is accepted, and we conclude that the
yields in the four blocks are homogeneous.
Example 2
A varietal trial was conducted on four varieties of sorghum at a research station. The design
adopted was five randomized blocks of four plots each. The yield in lb. per plot obtained from the
experiment is as follows. Analyze the data and comment on your findings.
                     Varieties
Blocks      T1       T2       T3       T4      Total
I          22.5     28.2     32.5     26.8       110
II         27.6     29.6     36.8     24.0       118
III        24.4     27.4     34.2     25.0       111
IV         28.6     30.8     35.3     26.3       121
V          25.9     31.0     36.2     23.9       117
Total       129      147      175      126       577
Solution
Aim: 1. To test whether the yields of all the four varieties of sorghum are equal.
2. To test whether the yields in all the five blocks are equal.
H0(1): The yields of all the four varieties of sorghum are homogeneous.
H0(2): The yields in all the five blocks are homogeneous.
H1(1): The yields of all the four varieties of sorghum are not homogeneous.
H1(2): The yields in all the five blocks are not homogeneous.
Level of Significance: α = 0.05
Critical values: F0.05,(3,12) = 3.49 and F0.05,(4,12) = 3.26
Calculations:
No. of treatments, t = 4; No. of Blocks, r = 5; Grand total, G = 577
CF = 577²/20 = 16646.45
TSS = 22.5² + … + 23.9² - CF = 17002.74 - 16646.45 = 356.29
SST = (1/5)(129² + 147² + 175² + 126²) - CF = 303.75
SSB = (1/4)(110² + 118² + 111² + 121² + 117²) - CF = 22.3
ESS = TSS - SST - SSB = 30.24
ANOVA Table:
Sources of      Degrees of    Sum of        Mean sum
variation       freedom       squares       of squares
Treatments      3             303.75        101.25
Blocks          4             22.3          5.575
Error           12            30.24         2.52
Total           19            356.29
Test Statistics:
$$1.\quad F_1 = \frac{SST/(t-1)}{ESS/[(t-1)(r-1)]} = \frac{101.25}{2.52} = 40.18$$
$$2.\quad F_2 = \frac{SSB/(r-1)}{ESS/[(t-1)(r-1)]} = \frac{5.575}{2.52} = 2.21$$
Conclusions: Since F1 > F0.05,(3,12), H0(1) is rejected, and we conclude that the yields of the four
varieties of sorghum are not homogeneous. Since F2 < F0.05,(4,12), H0(2) is accepted, and we conclude that
the yields in the five blocks are homogeneous.
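The randomized block calculations can be checked with a short Python sketch. The function name rbd_anova and the layout of the input (one row per block, one column per treatment) are illustrative assumptions; the data are those of Example 2.

```python
def rbd_anova(rows):
    """RBD analysis of variance. rows[j][i] is the observation for block j and treatment i."""
    r, t = len(rows), len(rows[0])
    n = r * t
    grand_total = sum(sum(row) for row in rows)
    cf = grand_total ** 2 / n                                   # correction factor
    tss = sum(y * y for row in rows for y in row) - cf          # total
    sst = sum(sum(col) ** 2 for col in zip(*rows)) / r - cf     # between treatments
    ssb = sum(sum(row) ** 2 for row in rows) / t - cf           # between blocks
    ess = tss - sst - ssb                                       # error
    mse = ess / ((t - 1) * (r - 1))
    return {
        "TSS": tss, "SST": sst, "SSB": ssb, "ESS": ess,
        "F1": (sst / (t - 1)) / mse,    # treatments
        "F2": (ssb / (r - 1)) / mse,    # blocks
    }

# Sorghum yields: rows are blocks I to V, columns are varieties T1 to T4
sorghum = [
    [22.5, 28.2, 32.5, 26.8],
    [27.6, 29.6, 36.8, 24.0],
    [24.4, 27.4, 34.2, 25.0],
    [28.6, 30.8, 35.3, 26.3],
    [25.9, 31.0, 36.2, 23.9],
]
res = rbd_anova(sorghum)
print({k: round(v, 2) for k, v in res.items()})
```

F1 is compared with the F table value for (t-1), (t-1)(r-1) degrees of freedom and F2 with the value for (r-1), (t-1)(r-1) degrees of freedom.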
EXERCISE
1. An experiment was conducted to test the effect of different treatments of warp beams on the warp
breakage rates during weaving. Four warp beams A, B, C and D were treated differently and were
woven simultaneously on four looms over four days. At the end of each day, the warp beams
were interchanged between the four experimental looms in such a manner as to ensure that, after
completion of the experiment, each warp beam had worked on each of the four looms for one day.
The plan of the experiment and the warp breakage rates are given in the following table. Analyze
the data and draw your conclusions.
Day of weaving
Loom 1 2 3 4
1 4.37(D) 5.24(C) 6.31(B) 6.28(A)
2 6.54(C) 6.58(B) 5.85(A) 5.94(D)
3 5.68(B) 6.12(A) 6.55(D) 5.85(C)
4 6.15(A) 5.85(D) 5.75(C) 6.25(B)
TEST 29
Aim
To test the significance of the t treatment effects and the significance of the r block effects and
the interaction between treatments and blocks based on the observations from n experimental units.
Source
Let $y_{ijk}$ ($i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, r$; $k = 1, 2, \ldots, m$) be the $k$th observation in the $i$th treatment
and in the $j$th block. Let $n = t \times r \times m$.
Linear Model
The linear model is $y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$,
where $\mu$ is the overall mean effect, $\alpha_i$ is the effect due to the $i$th treatment, $\beta_j$ is the effect due to
the $j$th block, $\gamma_{ij}$ is the interaction effect between the $i$th treatment and the $j$th block and $\varepsilon_{ijk}$ is the error effect.
Assumptions
(i) The population from which the observations are drawn follows the Normal distribution.
(ii) The observations are independent.
(iii) The various effects are additive in nature.
(iv) $\varepsilon_{ijk}$ are independently and identically distributed as Normal with mean zero and variance $\sigma^2$.
(v) $\sum_{i=1}^{t} \alpha_i = \sum_{j=1}^{r} \beta_j = 0$.
(vi) $\sum_{i=1}^{t} \gamma_{ij} = 0$ for all $j$.
(vii) $\sum_{j=1}^{r} \gamma_{ij} = 0$ for all $i$.
Null Hypotheses
H0(1): The t treatments have equal effect, i.e., H0(1): $\alpha_1 = \alpha_2 = \cdots = \alpha_t$.
H0(2): The r blocks have equal effect, i.e., H0(2): $\beta_1 = \beta_2 = \cdots = \beta_r$.
H0(3): The interaction effect between treatments and blocks is insignificant, i.e., H0(3): $\gamma_{ij} = 0$
for all i and j. That is, treatment effects and block effects are independent of each other.
Alternative Hypotheses
H1(1): The t treatments do not have equal effect, i.e., H1(1): at least two of the $\alpha_i$ differ.
H1(2): The r blocks do not have equal effect, i.e., H1(2): at least two of the $\beta_j$ differ.
H1(3): The interaction effect between treatments and blocks is significant, i.e., H1(3): $\gamma_{ij} \neq 0$ for
at least one pair (i, j). That is, treatment effects and block effects interact with each other.
Method
Calculate the following, based on the observations:
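1. Grand total, $G = \sum_{i=1}^{t}\sum_{j=1}^{r}\sum_{k=1}^{m} y_{ijk}$
2. Correction Factor, $CF = G^2/n$
3. Total Sum of Squares, $TSS = \sum_{i=1}^{t}\sum_{j=1}^{r}\sum_{k=1}^{m} y_{ijk}^2 - CF$
4. Sum of Squares between Treatments, $SST = \frac{1}{rm}\sum_{i=1}^{t} T_i^2 - CF$,
where $T_i$ is the total of the $i$th treatment observations.
5. Sum of Squares between Blocks, $SSB = \frac{1}{tm}\sum_{j=1}^{r} B_j^2 - CF$,
where $B_j$ is the total of the $j$th block observations.
6. Sum of Squares due to interaction, $SSI = \frac{1}{m}\sum_{i=1}^{t}\sum_{j=1}^{r} T_{ij}^2 - CF - SST - SSB$,
where $T_{ij}$ is the total of the m observations in the (i, j)th cell.
7. Error Sum of Squares, $ESS = TSS - SST - SSB - SSI$.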
Test Statistics
1. $F_1 = \dfrac{SST/(t-1)}{ESS/[tr(m-1)]}$
2. $F_2 = \dfrac{SSB/(r-1)}{ESS/[tr(m-1)]}$
3. $F_3 = \dfrac{SSI/[(t-1)(r-1)]}{ESS/[tr(m-1)]}$
Conclusions
If $F_1 \le F_{\alpha,(t-1),\,tr(m-1)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(1), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(1) or
accept H1(1).
If $F_2 \le F_{\alpha,(r-1),\,tr(m-1)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(2), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(2) or
accept H1(2).
If $F_3 \le F_{\alpha,(t-1)(r-1),\,tr(m-1)}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0(3), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject
H0(3) or accept H1(3).
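The method can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's program: the function name `two_way_anova`, the nested-list layout `y[i][j]` (the m observations for treatment i in block j) and the small data set are all hypothetical.

```python
def two_way_anova(y):
    """Two-way ANOVA with m observations per cell; y[i][j] lists the cell's observations."""
    t, r, m = len(y), len(y[0]), len(y[0][0])
    n = t * r * m
    values = [v for row in y for cell in row for v in cell]
    cf = sum(values) ** 2 / n                                   # correction factor
    tss = sum(v * v for v in values) - cf
    cell = [[sum(c) for c in row] for row in y]                 # cell totals T_ij
    treat = [sum(row) for row in cell]                          # treatment totals T_i
    block = [sum(cell[i][j] for i in range(t)) for j in range(r)]  # block totals B_j
    sst = sum(x * x for x in treat) / (r * m) - cf
    ssb = sum(x * x for x in block) / (t * m) - cf
    ssi = sum(x * x for row in cell for x in row) / m - cf - sst - ssb
    ess = tss - sst - ssb - ssi
    ems = ess / (t * r * (m - 1))                               # error mean square
    return {
        "SST": sst, "SSB": ssb, "SSI": ssi, "ESS": ess,
        "F1": (sst / (t - 1)) / ems,
        "F2": (ssb / (r - 1)) / ems,
        "F3": (ssi / ((t - 1) * (r - 1))) / ems,
    }


# Hypothetical data: 2 treatments x 2 blocks x 2 observations per cell.
example = [[[1, 2], [3, 4]], [[5, 6], [7, 9]]]
anova = two_way_anova(example)
print({k: round(v, 3) for k, v in anova.items()})
```

Each F value is then compared with the tabulated critical value for its degrees of freedom, as described in the Conclusions.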
Example
The following data shows the birth weights of babies born, classified according to the age of
mother and order of gravida, there being three observations per cell. Test whether the age of mother
and order of gravida significantly affect the birth weight of children.
Order of     Age group of mother
gravida      15-20          20-25          25-30          30-35          > 35
1            5.1 5.0 4.8    5.0 5.1 5.3    5.1 5.1 4.9    4.9 4.9 5.0    5.0 5.0 5.0
2            5.2 5.2 5.4    5.3 5.3 5.5    5.3 5.2 5.2    5.2 5.0 5.5    5.1 5.3 5.9
3            5.8 5.7 5.9    6.0 5.9 6.2    5.8 5.9 5.9    5.8 5.5 5.5    5.9 5.4 5.5
4            6.0 6.0 5.9    6.2 6.5 6.0    6.0 6.1 6.0    6.0 5.8 5.5    5.8 5.6 5.5
5 & above    6.0 6.0 6.0    6.0 6.1 6.3    5.9 6.0 5.8    5.9 6.0 5.5    5.5 6.0 6.2
Solution
H0(1): The order of gravida is insignificant.
H0(2): The age of mother is insignificant.
H0(3): There is no interaction between the age of mother and the order of gravida in their effect on
the birth weight of children.
H1(1): The order of gravida is significant.
H1(2): The age of mother is significant.
H1(3): There is interaction between the age of mother and the order of gravida.
Level of Significance: $\alpha = 0.05$.
Critical values: $F_{0.05,(4,50)} = 2.57$ and $F_{0.05,(16,50)} = 2.13$
Calculations:
Order of     Age group of mother                        Total
gravida      15-20   20-25   25-30   30-35   > 35      T_i..    T_i..^2
1            14.9    15.4    15.1    14.8    15.0      75.2     5655.04
2            15.8    16.1    15.7    15.7    15.4      78.7     6193.69
3            17.4    18.1    17.6    16.8    16.8      86.7     7516.89
4            17.9    18.7    18.1    17.3    16.9      88.9     7903.21
5            18.0    18.4    17.7    17.4    17.7      89.2     7956.64
Total T.j.   84.0    86.7    84.2    82.0    81.8      418.7    35225.5
$SSG = \frac{1}{5 \times 3}\sum_i T_{i..}^2 - CF = 10.96$; $SSM = \frac{1}{5 \times 3}\sum_j T_{.j.}^2 - CF = 1.12$
$SSI = \frac{1}{3}\sum_i\sum_j T_{ij.}^2 - CF - SSG - SSM = (7049.33/3) - 2337.40 - 10.96 - 1.12 = 0.30$
ANOVA Table:
Sources of Degrees of Sum of Mean sum
variation freedom squares of squares
Order of gravida 4 10.96 2.74
Mother's age 4 1.12 0.28
Interaction 16 0.30 0.02
Error 50 1.41 0.03
Total 74 13.79
Test Statistics:
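1. $F_1 = \dfrac{SSG/(t-1)}{ESS/[tr(m-1)]} = \dfrac{2.74}{0.03} = 91.33$
2. $F_2 = \dfrac{SSM/(r-1)}{ESS/[tr(m-1)]} = \dfrac{0.28}{0.03} = 9.33$
3. $F_3 = \dfrac{SSI/[(t-1)(r-1)]}{ESS/[tr(m-1)]} = \dfrac{0.02}{0.03} = 0.67$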
Conclusions:
Since $F_1 > F_{0.05,(4,50)}$, we conclude that the data provide us evidence against the null hypothesis
H0(1) and in favor of H1(1). Hence H1(1) is accepted at 5% level of significance. That is, the order of
gravida is significant.
Since $F_2 > F_{0.05,(4,50)}$, we conclude that the data provide us evidence against the null hypothesis
H0(2) and in favor of H1(2). Hence H1(2) is accepted at 5% level of significance. That is, the mother's
age is significant.
Since $F_3 < F_{0.05,(16,50)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(3), and hence it is accepted at 5% level of significance. That is, there is no significant
interaction between the age of mother and the order of gravida.
TEST 30
Aim
To test the significance of the treatment effects and the significance of the regression coefficient
of Y on X, based on the observations from n experimental units under randomized block design.
Source
Let $(Y_{ij}, X_{ij})$ ($i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, r$) be the observations on two variables Y and X from an
experiment consisting of t treatments, each replicated in r blocks. Observations on the auxiliary
or concomitant variable X, apart from the main variable Y under study, are available for each of the
experimental units. When Y and X are associated, a part of the variation of Y is due to variation in the values
of X. After eliminating the effects of blocks and treatments, one can estimate a relationship
between Y and X and use that relationship to predict the value of Y for a given value of X. This test is
used for assessing the significance of the relationship between X and Y. If there is a significant association
between X and Y, one may calculate the adjusted treatment sum of squares and perform the test for the
homogeneity of treatment effects. Let $n = t \times r$. The observed data are arranged as follows:
Blocks       Treatments                                  Block totals
             1            2            ...   t
             Y     X      Y     X      ...   Y     X     Y      X
1            Y11   X11    Y21   X21    ...   Yt1   Xt1   BY1    BX1
2            Y12   X12    Y22   X22    ...   Yt2   Xt2   BY2    BX2
...
r            Y1r   X1r    Y2r   X2r    ...   Ytr   Xtr   BYr    BXr
Treatment
totals       TY1   TX1    TY2   TX2    ...   TYt   TXt   GY     GX
Linear Model
Assumptions
(i) The population from which, the observations drawn is Normal distribution.
(ii) The observations are independent.
(iii) The various effects are additive in nature.
(iv) $\varepsilon_{ij}$ are independently and identically distributed as Normal with mean zero and variance $\sigma^2$.
Null Hypotheses
H0(1): The regression coefficient b is insignificant.
H0(2): The t treatments have equal effect.
That is, H0(2): $\alpha_1 = \alpha_2 = \cdots = \alpha_t$.
Alternative Hypotheses
H1(1): The regression coefficient b is significant.
H1(2): The t treatments do not have equal effect.
That is, H1(2): at least two of the $\alpha_i$ differ.
Method
Calculate the following, based on the observations.
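For variable Y
1. Grand total, $G_Y = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}$
2. Correction Factor, $CF_Y = G_Y^2/n$
3. Total Sum of Squares (TSS), $G_{YY} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij}^2 - CF_Y$
4. Treatment Sum of Squares (SST), $T_{YY} = \frac{1}{r}\sum_{i=1}^{t} T_{Yi}^2 - CF_Y$,
where $T_{Yi}$ is the total of the $i$th treatment observations of Y.
5. Block Sum of Squares (BSS), $B_{YY} = \frac{1}{t}\sum_{j=1}^{r} B_{Yj}^2 - CF_Y$,
where $B_{Yj}$ is the total of the $j$th block observations of Y.
6. Error Sum of Squares (ESS), $E_{YY} = G_{YY} - T_{YY} - B_{YY}$
For variable X
7. Grand total, $G_X = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}$
8. Correction Factor, $CF_X = G_X^2/n$
9. Total Sum of Squares (TSS), $G_{XX} = \sum_{i=1}^{t}\sum_{j=1}^{r} X_{ij}^2 - CF_X$
10. Treatment Sum of Squares (SST), $T_{XX} = \frac{1}{r}\sum_{i=1}^{t} T_{Xi}^2 - CF_X$,
where $T_{Xi}$ is the total of the $i$th treatment observations of X, from all the replications.
11. Block Sum of Squares (BSS), $B_{XX} = \frac{1}{t}\sum_{j=1}^{r} B_{Xj}^2 - CF_X$
12. Error Sum of Squares (ESS), $E_{XX} = G_{XX} - T_{XX} - B_{XX}$
For the products of Y and X
13. Correction Factor, $CF_{YX} = G_Y G_X / n$
14. Total Sum of Products of Y and X (TSP), $G_{YX} = \sum_{i=1}^{t}\sum_{j=1}^{r} Y_{ij} X_{ij} - CF_{YX}$
15. Treatment Sum of Products (SPT), $T_{YX} = \frac{1}{r}\sum_{i=1}^{t} T_{Yi} T_{Xi} - CF_{YX}$
16. Block Sum of Products (BSP), $B_{YX} = \frac{1}{t}\sum_{j=1}^{r} B_{Yj} B_{Xj} - CF_{YX}$
17. Error Sum of Products (ESP), $E_{YX} = G_{YX} - T_{YX} - B_{YX}$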
Test Statistic
$F_1 = \dfrac{\left(E_{YX}^2/E_{XX}\right)/1}{\left(E_{YY} - E_{YX}^2/E_{XX}\right)/[(t-1)(r-1)-1]}$
Conclusion
If $F_1 \le F_{\alpha,(1,\,(t-1)(r-1)-1)}$, accept H0(1) and conclude that the regression coefficient of Y on X is
insignificant.
If $F_1 > F_{\alpha,(1,\,(t-1)(r-1)-1)}$, reject H0(1) or accept H1(1), conclude that the regression coefficient of
Y on X is significant and proceed to make adjustments for the variate.
Calculate the following adjusted values for the variable Y:
$E'_{YY} = E_{YY} + T_{YY}$; $E'_{YX} = E_{YX} + T_{YX}$; $E'_{XX} = E_{XX} + T_{XX}$
$b = E_{YX}/E_{XX}$; $\tilde{b} = E'_{YX}/E'_{XX}$
$E = E_{YY} - b\,E_{YX}$; $E_1 = E'_{YY} - \tilde{b}\,E'_{YX}$
One degree of freedom is lost in error due to fitting a regression line. The above calculations are
provided as a single table as follows
TAR denotes the Treatment Adjusted for the average Regression within treatments and R.C
denotes the regression coefficients.
Test Statistic
$F_2 = \dfrac{(E_1 - E)/(t-1)}{E/[(t-1)(r-1)-1]}$
Conclusion
If $F_2 \le F_{\alpha,(t-1),\,(t-1)(r-1)-1}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0(2), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject
H0(2) or accept H1(2).
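The sums of squares and products and the two F statistics can be computed with one helper applied to the pairs (Y, Y), (X, X) and (Y, X). The following Python sketch is illustrative only: the names `sums_of_products` and `ancova_rbd` and the layout `y[i][j]` (treatment i, block j) are assumptions, not notation from the text.

```python
def sums_of_products(u, v):
    """Total (G), treatment (T), block (B) and error (E) sums of products of u and v."""
    t, r = len(u), len(u[0])
    cf = sum(map(sum, u)) * sum(map(sum, v)) / (t * r)          # correction factor
    total = sum(u[i][j] * v[i][j] for i in range(t) for j in range(r)) - cf
    treat = sum(sum(u[i]) * sum(v[i]) for i in range(t)) / r - cf
    block = sum(
        sum(u[i][j] for i in range(t)) * sum(v[i][j] for i in range(t))
        for j in range(r)
    ) / t - cf
    return {"G": total, "T": treat, "B": block, "E": total - treat - block}


def ancova_rbd(y, x):
    """Analysis of covariance in a randomized block design; y[i][j] is treatment i, block j."""
    t, r = len(y), len(y[0])
    yy, xx, yx = sums_of_products(y, y), sums_of_products(x, x), sums_of_products(y, x)
    error_df = (t - 1) * (r - 1) - 1            # one d.f. lost to the regression
    regression_ss = yx["E"] ** 2 / xx["E"]
    e = yy["E"] - regression_ss                 # error SS adjusted for regression
    # Treatment + error line, adjusted with its own regression coefficient.
    e1 = (yy["E"] + yy["T"]) - (yx["E"] + yx["T"]) ** 2 / (xx["E"] + xx["T"])
    return {
        "b": yx["E"] / xx["E"],
        "E": e,
        "E1": e1,
        "F1": regression_ss / (e / error_df),
        "F2": ((e1 - e) / (t - 1)) / (e / error_df),
    }
```

Calling `ancova_rbd(y, x)` with two t-by-r lists returns the regression coefficient together with F1 (regression) and F2 (adjusted treatments); F2 is meaningful only when F1 is significant.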
Example
A fertilizer trial on ADT-31 paddy was conducted in RBD. The grain yield was the primary
variable, Y. The number of productive tillers per hill was observed as mean of ten hills and it was the
covariate, X. The outputs are given below. Analyze the data and state your comments.
Treatment    Block I       Block II      Block III     Block IV      Total
             Y     X       Y     X       Y     X       Y     X       Y      X
Control      7.0   5.1     6.4   5.5     8.0   5.0     6.9   5.5     28.3   21.1
AN1          10.8  6.5     9.0   6.3     10.5  6.7     9.6   6.5     39.9   26.0
AN2          13.0  7.6     12.6  7.6     12.0  7.3     13.0  8.6     50.6   31.1
AN3          15.0  8.5     14.8  8.9     14.0  9.5     14.0  9.5     57.8   36.4
AN4          14.8  10.4    15.0  9.5     13.0  9.7     14.1  10.1    56.9   39.7
UN1          9.9   6.3     10.5  6.4     9.0   6.3     9.6   6.2     39.0   25.2
UN2          13.1  7.5     11.9  7.1     12.9  7.8     12.5  7.9     50.4   30.3
UN3          14.4  8.1     14.2  9.5     13.5  9.5     14.1  8.8     56.2   35.9
UN4          15.0  9.2     14.8  10.1    13.8  10.4    12.8  9.9     56.4   39.6
Total        113.0 69.2    109.2 70.9    106.7 72.2    106.6 73.0    435.5  285.3
Analysis for X
$CF = (285.3)^2/36 = 2261.0025$
$TSS = G_{XX} = (5.1)^2 + (6.5)^2 + \cdots + (9.9)^2 - CF = 93.8875$
$BSS = B_{XX} = \frac{1}{9}[(69.2)^2 + (70.9)^2 + (72.2)^2 + (73.0)^2] - CF = 0.9186$
$SST = T_{XX} = \frac{1}{4}[(21.1)^2 + (26.0)^2 + \cdots + (39.6)^2] - CF = 88.89$
$ESS = E_{XX} = 4.0789$
Analysis for YX
$CF = (435.5)(285.3)/36 = 3451.3375$
$TSP = G_{YX} = (7.0)(5.1) + (10.8)(6.5) + \cdots + (12.8)(9.9) - CF = 130.7625$
$BSP = B_{YX} = \frac{1}{9}[(113)(69.2) + (109.2)(70.9) + (106.7)(72.2) + (106.6)(73)] - CF$
$= 3449.7133 - 3451.3375 = -1.6242$
$SPT = T_{YX} = \frac{1}{4}[(28.3)(21.1) + (39.9)(26.0) + \cdots + (56.4)(39.6)] - CF$
$= 3582.9950 - 3451.3375 = 131.6575$
$ESP = E_{YX} = TSP - BSP - SPT = 0.7292$
ANOCOVA Table:
Sources of Degrees of Sum of squares and products
variation freedom YY XX YX
Blocks 3 3.003 0.9186 -1.6242
Treatments 8 214.7272 88.8900 131.6575
Error 24 9.8795 4.0789 0.7292
Treat + Error 32 224.6067 92.9689 132.3867
Total 35 227.6097 93.8875 130.7625
For the covariate X, Treatment Mean Square, TMS = 88.89/8 = 11.1112
Error Mean Square, EMS = 4.0789/24 = 0.17
F = 11.1112/0.17 = 65.36
Since F is significant at 1% level of significance, we conclude that the covariate is also affected
by the treatments.
The regression coefficient within treatments, $b = E_{YX}/E_{XX} = 0.7292/4.0789 = 0.1788$
$E = E_{YY} - E_{YX}^2/E_{XX} = 9.8795 - (0.7292)^2/4.0789 = 9.8795 - 0.13036 = 9.74914$
Test Statistic: $F_1 = \dfrac{\left(E_{YX}^2/E_{XX}\right)/1}{\left(E_{YY} - E_{YX}^2/E_{XX}\right)/[(t-1)(r-1)-1]} = \dfrac{0.13036/1}{9.74914/23} = 0.3075$
Conclusion: Since $F_1 < F_{0.05,(1,23)}$, $F_1$ is not significant and hence b is not significant. Since b is
not significant, the covariate does not significantly reduce the error.
TEST 31
Aim
To test the significance of the m treatment effects, m row effects and m column effects based on
the observations from m square (m2) experimental units.
Source
Let $y_{ijk}$ ($i, j, k = 1, 2, \ldots, m$) be the observations on m treatments, each replicated
m times in $m^2$ experimental units. In this design, the entire experimental material is
divided into $m^2$ experimental units arranged in a square so that each row and each column contains m
units. The m treatments are allocated at random to these rows and columns in such a way that every
treatment occurs once and only once in each row and in each column.
The advantage of this design is that the treatment effect and the two orthogonal
effects, namely the row and column effects, can be studied simultaneously in $m^2$ experimental units.
Linear Model
The linear model is $y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$; ($i, j, k = 1, 2, \ldots, m$)
where $y_{ijk}$ is the observation of the $i$th treatment obtained from the $j$th row and $k$th column, $\mu$ is the
overall mean effect, $\alpha_i$ is the effect due to the $i$th treatment, $\beta_j$ is the effect due to the $j$th row, $\gamma_k$ is the
effect due to the $k$th column and $\varepsilon_{ijk}$ is the error effect due to chance causes.
Assumptions
(i) The population from which, the observations drawn is Normal distribution.
(ii) The observations are independent.
(iii) The various effects are additive in nature.
(iv) $\varepsilon_{ijk}$ are independently and identically distributed as Normal with mean zero and
variance $\sigma^2$.
Null Hypotheses
H0(1): The m treatments have equal effect, i.e., H0(1): $\alpha_1 = \alpha_2 = \cdots = \alpha_m$.
Alternative Hypotheses
Method
Calculate the following, based on the observations.
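1. Grand total, $G = \sum_{j=1}^{m}\sum_{k=1}^{m} y_{ijk}$
2. Correction Factor, $CF = G^2/m^2$
3. Total Sum of Squares, $TSS = \sum_{j=1}^{m}\sum_{k=1}^{m} y_{ijk}^2 - CF$
4. Sum of Squares between Treatments, $SST = \frac{1}{m}\sum_{i=1}^{m} T_i^2 - CF$,
where $T_i$ is the total of the $i$th treatment observations.
5. Sum of Squares between Rows, $SSR = \frac{1}{m}\sum_{j=1}^{m} R_j^2 - CF$,
where $R_j$ is the total of the $j$th row observations.
6. Sum of Squares between Columns, $SSC = \frac{1}{m}\sum_{k=1}^{m} C_k^2 - CF$,
where $C_k$ is the total of the $k$th column observations.
7. Error Sum of Squares, $ESS = TSS - SST - SSR - SSC$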
Test Statistics
1. $F_1 = \dfrac{SST/(m-1)}{ESS/[(m-1)(m-2)]}$
2. $F_2 = \dfrac{SSR/(m-1)}{ESS/[(m-1)(m-2)]}$
3. $F_3 = \dfrac{SSC/(m-1)}{ESS/[(m-1)(m-2)]}$
The statistics $F_1$, $F_2$, $F_3$ each follow the F distribution with $(m-1)$, $(m-1)(m-2)$ degrees of freedom.
Conclusions
If $F_i \le F_{\alpha,(m-1),(m-1)(m-2)}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0(i), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(i)
or accept H1(i) for i = 1, 2, 3.
Example
1. An experiment was carried out to determine the effect of claying the ground on the yield of
barley grains; the amounts of clay used were as follows. A: No clay, B: Clay at 100 per acre,
C: Clay at 200 per acre, D: Clay at 300 per acre. The yields were in plots of 10 square meters
and the layout and yields were as follows. Analyze all the effects at 5% level of significance.
Row      Column I   Column II   Column III   Column IV   Total
I        D 34.7     A 35.6      B 38.2       C 35.5      144
II       C 38.2     D 34.4      A 42.8       B 37.6      153
III      A 36.4     B 37.2      C 41.7       D 36.7      152
IV       B 39.7     C 38.8      D 40.3       A 38.2      157
Total    149        146         163          148         606
Solution
H0(1): The yields under four types of clay are equal.
H0(2): All the four rows have equal yields.
H0(3): All the four columns have equal yields.
H1(1): The yields under four types of clay are not equal.
H1(2): All the four rows do not have equal yields.
H1(3): All the four columns do not have equal yields.
Level of Significance: $\alpha = 0.05$ and Critical value: $F_{0.05,(3,6)} = 4.76$
Calculations:
m = No. of treatments = No. of rows = No. of columns = 4
No. of experimental units, n = 16. Treatment totals: $T_1 = 153$ (A), $T_2 = 152.7$ (B), $T_3 = 154.2$ (C), $T_4 = 146.1$ (D)
1. $G = \sum_{j=1}^{m}\sum_{k=1}^{m} y_{ijk} = 606$
2. $CF = G^2/m^2 = 606^2/4^2 = 22952.25$
3. $TSS = \sum_{j=1}^{m}\sum_{k=1}^{m} y_{ijk}^2 - CF = 23038.58 - CF = 86.33$
4. $SST = \frac{1}{m}\sum_{i=1}^{m} T_i^2 - CF = \frac{1}{4}(153^2 + 152.7^2 + 154.2^2 + 146.1^2) - CF = 10.035$
5. $SSR = \frac{1}{m}\sum_{j=1}^{m} R_j^2 - CF = \frac{1}{4}(144^2 + 153^2 + 152^2 + 157^2) - CF = 22.25$
6. $SSC = \frac{1}{m}\sum_{k=1}^{m} C_k^2 - CF = \frac{1}{4}(149^2 + 146^2 + 163^2 + 148^2) - CF = 45.25$
7. $ESS = TSS - SST - SSR - SSC = 8.795$
ANOVA Table:
Sources of Degrees of Sum of Mean sum of
variation freedom squares squares
Treatments 3 10.035 3.345
Rows 3 22.25 7.4167
Columns 3 45.25 15.08
Error 6 8.795 1.4658
Total 15 86.33
Test Statistics:
1. $F_1 = \dfrac{SST/(m-1)}{ESS/[(m-1)(m-2)]} = 2.28$
2. $F_2 = \dfrac{SSR/(m-1)}{ESS/[(m-1)(m-2)]} = 5.06$
3. $F_3 = \dfrac{SSC/(m-1)}{ESS/[(m-1)(m-2)]} = 10.29$
Conclusions: Since $F_1 < F_{0.05,(3,6)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0(1), and hence it may be accepted at 5% level of significance. That is, all
the four types of clay have equal yields.
Since $F_2, F_3 > F_{0.05,(3,6)}$, we conclude that the data provide us evidence against the null hypotheses
H0(2) and H0(3) and in favor of H1(2) and H1(3). Hence, H1(2) and H1(3) are accepted at 5% level of
significance. That is, the four rows do not have equal yields and the four columns do not have equal
yields.
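The calculations of this example can be reproduced with a short script. The sketch below is illustrative Python, not part of the original text; the function name `latin_square_anova` and the two parallel lists `layout` and `yields` (rows I to IV, columns I to IV) are assumptions made for the illustration.

```python
# Layout and yields from the example: rows I-IV, columns I-IV.
layout = [["D", "A", "B", "C"],
          ["C", "D", "A", "B"],
          ["A", "B", "C", "D"],
          ["B", "C", "D", "A"]]
yields = [[34.7, 35.6, 38.2, 35.5],
          [38.2, 34.4, 42.8, 37.6],
          [36.4, 37.2, 41.7, 36.7],
          [39.7, 38.8, 40.3, 38.2]]


def latin_square_anova(layout, y):
    """ANOVA for an m x m Latin square; layout[j][k] names the treatment in row j, column k."""
    m = len(y)
    cf = sum(map(sum, y)) ** 2 / (m * m)                  # correction factor
    tss = sum(v * v for row in y for v in row) - cf
    row_totals = [sum(row) for row in y]
    col_totals = [sum(row[k] for row in y) for k in range(m)]
    treat_totals = {}
    for j in range(m):
        for k in range(m):
            name = layout[j][k]
            treat_totals[name] = treat_totals.get(name, 0.0) + y[j][k]
    sst = sum(v * v for v in treat_totals.values()) / m - cf
    ssr = sum(v * v for v in row_totals) / m - cf
    ssc = sum(v * v for v in col_totals) / m - cf
    ess = tss - sst - ssr - ssc
    ems = ess / ((m - 1) * (m - 2))                       # error mean square
    return {
        "SST": sst, "SSR": ssr, "SSC": ssc, "ESS": ess,
        "F1": (sst / (m - 1)) / ems,
        "F2": (ssr / (m - 1)) / ems,
        "F3": (ssc / (m - 1)) / ems,
    }


table = latin_square_anova(layout, yields)
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The three F values are compared with the same critical value, $F_{0.05,(3,6)} = 4.76$, because treatments, rows and columns all have $m - 1 = 3$ degrees of freedom.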
TEST 32
Aim
To test the significance of the main effects and the interaction effect based on an experiment consisting of
two factors, each with two levels.
Source
In this design, let there be two treatments (factors), say A and B, called simple treatments,
whose effects can be tested at two levels, say 0 (absent) and 1 (present). That is, we study the
individual effects of A and B as well as their combined effect, called the interaction. This $2^2$ factorial
design consists of 4 treatment combinations, namely $A_0B_0$, $A_1B_0$, $A_0B_1$, $A_1B_1$, which are denoted by 1 (both
at 0 level, indicating no application of either factor), a, b and ab; the factorial effects are the main effect A, the
main effect B and the interaction AB. It can be
tested in r blocks (replications), so that it requires $r \times 2^2 = 4r = n$ experimental units. [1], [a], [b] and
[ab] are called treatment totals and denote, respectively, the totals of the observations of the treatments 1, a, b and
ab from all the r blocks.
Null Hypotheses
H0(1): All the r blocks have equal effect.
H0(2): The main effect A is insignificant.
H0(3): The main effect B is insignificant.
H0(4): The interaction AB is insignificant.
Alternative Hypotheses
H1(1): All the r blocks do not have equal effect.
H1(2): The main effect A is significant.
H1(3): The main effect B is significant.
H1(4): The interaction AB is significant.
Method
Calculate the following
1. Factorial effect total for the main effect A, $[A] = [ab] + [a] - [b] - [1]$
2. Factorial effect total for the main effect B, $[B] = [ab] + [b] - [a] - [1]$
3. Factorial effect total for the interaction AB, $[AB] = [ab] - [a] - [b] + [1]$
4. Sum of Squares due to main effect A, $SS[A] = [A]^2/4r$
5. Sum of Squares due to main effect B, $SS[B] = [B]^2/4r$
6. Sum of Squares due to interaction AB, $SS[AB] = [AB]^2/4r$
7. Calculation of G, CF, TSS, SSB are same as in RBD.
8. $ESS = TSS - SSB - SS[A] - SS[B] - SS[AB]$
Test Statistics
$F_1 = \dfrac{SSB/(r-1)}{ESS/[3(r-1)]}$
$F_2 = \dfrac{SS[A]/1}{ESS/[3(r-1)]}$
$F_3 = \dfrac{SS[B]/1}{ESS/[3(r-1)]}$
$F_4 = \dfrac{SS[AB]/1}{ESS/[3(r-1)]}$
Conclusions
If $F_1 \le F_{\alpha,(r-1),3(r-1)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(1), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(1) or
accept H1(1).
If $F_i \le F_{\alpha,(1,3(r-1))}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(i), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(i) or
accept H1(i) for i = 2, 3, 4.
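The factorial effect totals and F ratios above can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `factorial_2x2`, the dictionary keyed by "1", "a", "b", "ab" and the small data set are hypothetical and are not taken from the text.

```python
def factorial_2x2(obs):
    """2^2 factorial in r blocks; obs maps '1', 'a', 'b', 'ab' to r block observations."""
    r = len(obs["1"])
    total = {name: sum(values) for name, values in obs.items()}   # [1], [a], [b], [ab]
    cf = sum(total.values()) ** 2 / (4 * r)                       # correction factor
    tss = sum(y * y for values in obs.values() for y in values) - cf
    block_totals = [sum(obs[name][i] for name in obs) for i in range(r)]
    ssb = sum(b * b for b in block_totals) / 4 - cf
    effect = {
        "A": total["ab"] + total["a"] - total["b"] - total["1"],
        "B": total["ab"] + total["b"] - total["a"] - total["1"],
        "AB": total["ab"] - total["a"] - total["b"] + total["1"],
    }
    ss = {name: e * e / (4 * r) for name, e in effect.items()}
    ess = tss - ssb - sum(ss.values())
    ems = ess / (3 * (r - 1))                                     # error mean square
    f = {"Blocks": (ssb / (r - 1)) / ems}
    f.update({name: value / ems for name, value in ss.items()})
    return {"SSB": ssb, "SS": ss, "ESS": ess, "F": f}


# Hypothetical yields for r = 2 blocks.
data = {"1": [1, 2], "a": [3, 5], "b": [2, 4], "ab": [6, 7]}
out = factorial_2x2(data)
print(out["SS"], out["ESS"], out["F"])
```

Each main effect and the interaction carries one degree of freedom, so their F ratios are all referred to $F_{\alpha,(1,3(r-1))}$.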
Example
An experiment was planned to study the effect of urea and potash on the yield of tomatoes. All the
combinations of two levels of urea [0 cent (p0) and 5 cent (p1) per acre] and two levels of potash
[0 cent (k 0) and 5 cent (k 1) per acre] were studied in an RBD design with four replications each. The
following are the yields. Analyze the data and state your conclusions.
Solution
H0(1): All the four blocks have equal effect.
H0(2): The main effect p is insignificant.
H0(3): The main effect k is insignificant.
H0(4): The interaction pk is insignificant.
H1(1): All the four blocks do not have equal effect.
H1(2): The main effect p is significant.
H1(3): The main effect k is significant.
H1(4): The interaction pk is significant.
Level of Significance: $\alpha = 0.05$.
Test Statistics:
$F_1 = \dfrac{SSB/(r-1)}{ESS/[3(r-1)]} = 0.77$
$F_2 = \dfrac{SS[A]/1}{ESS/[3(r-1)]} = 2.45$
$F_3 = \dfrac{SS[B]/1}{ESS/[3(r-1)]} = 1.20$
$F_4 = \dfrac{SS[AB]/1}{ESS/[3(r-1)]} = 1.20$
Conclusions: Since F 1 < F 0.01, (3,9), we conclude that the data do not provide us any evidence
against the null hypothesis H0(1), and hence it is accepted at 1% level of significance. That is, all the
four blocks have equal effect.
Since F i < F 0.01, (1,9), for i = 2, 3, 4, we conclude that the data do not provide us any evidence
against the null hypothesis H0(i), and hence it is accepted at 1% level of significance. That is, the main
effects p, k and the interaction effect pk are insignificant.
TEST 33
Aim
To test the significance of the main effects and interaction effects based on an experiment consisting of
three factors, each with two levels.
Source
In this design, let there be three treatments (factors), say A, B and C, called simple treatments,
whose effects can be tested at two levels, say 0 (absent) and 1 (present). That is, we study the
individual effects of A, B and C as well as their combined effects, called interactions. This $2^3$
factorial design consists of 8 treatment combinations, namely $A_0B_0C_0$, $A_1B_0C_0$, $A_0B_1C_0$, $A_0B_0C_1$, $A_1B_1C_0$,
$A_1B_0C_1$, $A_0B_1C_1$ and $A_1B_1C_1$, which are denoted by 1 (all at 0 level, indicating no application of any factor),
a, b, c, ab, ac, bc and abc; the factorial effects are the main
effects A, B, C and the interactions AB, AC, BC, ABC. It can be tested in r blocks (replications), so that it
requires $r \times 2^3 = 8r = n$ experimental units. [1], [a], [b], [c], [ab], [ac], [bc] and [abc] are called
treatment totals and denote, respectively, the totals of the observations of the treatments 1, a, b, c, ab, ac,
bc and abc from all the r blocks.
Null Hypotheses
H0(1): All the r blocks have equal effect.
H0(2): The main effect A is insignificant.
H0(3): The main effect B is insignificant.
H0(4): The main effect C is insignificant.
H0(5): The interaction AB is insignificant.
H0(6): The interaction AC is insignificant.
H0(7): The interaction BC is insignificant.
H0(8): The interaction ABC is insignificant.
Alternative Hypotheses
H1(1): All the r blocks do not have equal effect.
H1(2): The main effect A is significant.
Method
Yates' method of totals and sums of squares of factorial effects in a $2^3$ factorial experiment
Test Statistics
$F_7 = \dfrac{SSBC/1}{ESS/[7(r-1)]}$, $F_8 = \dfrac{SSABC/1}{ESS/[7(r-1)]}$
Conclusions
If $F_1 \le F_{\alpha,(r-1),7(r-1)}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(1), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(1) or
accept H1(1).
If $F_m \le F_{\alpha,(1,7(r-1))}$, we conclude that the data do not provide us any evidence against the null
hypothesis H0(m), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject H0(m)
or accept H1(m) for m = 2, 3, 4, 5, 6, 7, 8.
Example
The following data show the layout and results of a $2^3$ factorial design laid out in four replicates
(blocks). The purpose of the experiment is to determine the effect of different kinds of fertilizers,
Nitrogen (N), Potash (K) and Phosphate (P), on potato crop yield.
Block-I
nk kp p np 1 k n nkp
291 391 312 373 101 265 106 450
Block-II
kp p k nk n nkp np 1
407 324 272 306 89 449 338 106
Block-III
p 1 np kp nk k n nkp
323 87 324 423 334 279 128 471
Block-IV
np nk n p k 1 nkp kp
361 272 103 324 302 131 437 445
Solution
H0: All the treatments as well as blocks have homogeneous effect.
H1: All the treatments and blocks effects are significant.
Level of Significance: $\alpha = 0.05$
Critical values: $F_{0.05,(3,21)} = 3.07$ and $F_{0.05,(1,21)} = 4.32$
Calculations:
n = 32; G = 9324; $CF = 9324^2/32 = 2716780.5$
Block totals: B1 = 2289, B2 = 2291, B3 = 2369, B4 = 2375
Treatment totals: 1 = 425; n = 426; k = 1118; nk = 1203;
p = 1283; np = 1396; kp = 1666; nkp = 1807.
$TSS = (291)^2 + (391)^2 + \cdots + (445)^2 - CF = 3182118 - 2716780.5 = 465337.5$
$BSS = \frac{1}{8}[(2289)^2 + \cdots + (2375)^2] - CF = 843$
$SST = \frac{1}{4}[(425)^2 + \cdots + (1807)^2] - CF = 456955.5$
$ESS = TSS - BSS - SST = 7539$
Yates' method of totals and sums of squares of factorial effects in a $2^3$ factorial experiment.
Test Statistic:
$F_1 = \dfrac{BSS/(r-1)}{ESS/[7(r-1)]} = \dfrac{843/(4-1)}{7539/[7(4-1)]} = 0.78$
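Yates' method for the factorial effect totals can be sketched in Python using the treatment totals of this example. The function name `yates` and the variable names are hypothetical; the sketch assumes the totals are listed in standard order 1, n, k, nk, p, np, kp, nkp (n, k and p taken as the first, second and third factor) and takes ESS = 7539 from the calculations above.

```python
def yates(totals):
    """Yates' algorithm; totals must be in standard order (1), a, b, ab, c, ac, bc, abc."""
    column = list(totals)
    factors = len(column).bit_length() - 1          # 8 totals -> 3 factors
    for _ in range(factors):
        pairs = [(column[i], column[i + 1]) for i in range(0, len(column), 2)]
        # Upper half: sums of successive pairs; lower half: second minus first.
        column = [lo + hi for lo, hi in pairs] + [hi - lo for lo, hi in pairs]
    return column


labels = ["G", "N", "K", "NK", "P", "NP", "KP", "NKP"]
totals = [425, 426, 1118, 1203, 1283, 1396, 1666, 1807]
r = 4                                               # number of blocks
ess, error_df = 7539, 7 * (r - 1)                   # error SS and its d.f.

effects = dict(zip(labels, yates(totals)))          # factorial effect totals
sums_of_squares = {k: v * v / (8 * r) for k, v in effects.items() if k != "G"}
f_ratios = {k: ss / (ess / error_df) for k, ss in sums_of_squares.items()}

for name in labels[1:]:
    print(f"{name}: effect total {effects[name]}, SS {sums_of_squares[name]:.1f}, "
          f"F {f_ratios[name]:.2f}")
```

As a check, the seven factorial sums of squares add up to the treatment sum of squares SST = 456955.5, and the first entry of the final column is the grand total G = 9324.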
TEST 34
Aim
To test the significance of the effect of main plot treatments and the effect of sub plot treatments.
Source
Suppose we are interested in testing two factors a and b, factor a being at p levels $a_1, a_2, \ldots, a_p$
and factor b at q levels $b_1, b_2, \ldots, b_q$. The different types of treatments are allotted at random to their
respective plots. Such an arrangement is called a split-plot design. In this design, the larger plots are called main
plots and the smaller plots within the larger plots are called sub-plots. The factor levels
allotted to the main plots are called main plot treatments and the factor levels allotted to the sub-plots are
called sub-plot treatments. The factor that requires greater precision is assigned to the sub-plots. Each
replication is divided into a number of main plots equal to the number of main plot treatments. Each main
plot is divided into sub-plots according to the number of sub-plot treatments.
Hence, there are p main plot treatments, q sub plot treatments and r blocks (replications), so that
there are $rpq = n$ experimental units in total. The observations are arranged in a three-way table.
Linear Model
The model for this experiment in randomized blocks is
$Y_{ijk} = \mu + b_i + m_j + m_{ij} + s_k + \gamma_{jk} + \varepsilon_{ijk}$
($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, p$; $k = 1, 2, \ldots, q$)
where
$Y_{ijk}$ is the observation of the $i$th block, $j$th main plot and $k$th sub plot.
$\mu$ is the overall mean effect.
$b_i$ is the effect due to the $i$th block.
$m_j$ is the effect due to the $j$th main plot treatment.
$m_{ij}$ is the main plot error or error (A).
$s_k$ is the effect due to the $k$th sub plot treatment.
$\gamma_{jk}$ is the effect due to interaction between main and sub plots.
and $\varepsilon_{ijk}$ is the error effect due to sub plot and interaction or error (B).
Assumptions
1. The main plot treatments are allocated randomly to each of the blocks.
2. The sub plot treatments are allocated randomly within the main plot treatments.
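3. $b_i$, $m_{ij}$ and $\varepsilon_{ijk}$ are independently normally distributed, each with mean zero and variances
$\sigma_b^2$, $\sigma_m^2$ and $\sigma^2$ respectively.
4. $\sum_j m_j = 0$, $\sum_k s_k = 0$, $\sum_k \gamma_{jk} = 0$ for all j, $\sum_j \gamma_{jk} = 0$ for all k.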
Null Hypotheses
H0(1): The p main plot treatments have equal effect, i.e., H0(1): $m_1 = m_2 = \cdots = m_p$.
H0(2): The q sub plot treatments have equal effect, i.e., H0(2): $s_1 = s_2 = \cdots = s_q$.
H0(3): There is no interaction between main and sub plot treatments, i.e., H0(3): $\gamma_{jk} = 0$ for all j
and k.
Alternative Hypotheses
H1(1): The p main plot treatments do not have equal effect, i.e., H1(1): at least two of the $m_j$ differ.
H1(2): The q sub plot treatments do not have equal effect, i.e., H1(2): at least two of the $s_k$ differ.
H1(3): There is interaction between main and sub plot treatments, i.e., H1(3): $\gamma_{jk} \neq 0$ for at least
one pair (j, k).
Method
Calculate the following, based on the observations.
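1. Grand total, $G = \sum_{i=1}^{r}\sum_{j=1}^{p}\sum_{k=1}^{q} Y_{ijk}$
2. Correction Factor, $CF = G^2/n$
3. Total Sum of Squares, $TSS = \sum_{i=1}^{r}\sum_{j=1}^{p}\sum_{k=1}^{q} Y_{ijk}^2 - CF$
4. Form a two-way table (BM table) of Blocks $\times$ Main plot treatments, with cell totals $Y_{ij.}$.
5. Sum of Squares in BM table, $SSBM = \frac{1}{q}\sum_i\sum_j Y_{ij.}^2 - CF$
6. Sum of Squares between Blocks, $SSB = \frac{1}{pq}\sum_i B_i^2 - CF$
7. Sum of Squares between Main plot treatments, $SSM = \frac{1}{rq}\sum_j M_j^2 - CF$
8. Error Sum of Squares (Error (A)), $ESS(A) = SSBM - SSB - SSM$
9. Form a two-way table (MS table) of Main plot treatments $\times$ Sub plot treatments, with cell totals $Y_{.jk}$.
10. Sum of Squares in MS table, $SSMS = \frac{1}{r}\sum_j\sum_k Y_{.jk}^2 - CF$
11. Sum of Squares between Sub plot treatments, $SSS = \frac{1}{rp}\sum_k S_k^2 - CF$
12. Sum of Squares of Interaction, $SSI = SSMS - SSM - SSS$
13. Error Sum of Squares (Error (B)), $ESS(B) = TSS - SSBM - SSS - SSI$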
Test Statistics
1. $F_1 = \dfrac{SSM/(p-1)}{ESS(A)/[(r-1)(p-1)]}$
2. $F_2 = \dfrac{SSS/(q-1)}{ESS(B)/[(r-1)p(q-1)]}$
3. $F_3 = \dfrac{SSI/[(p-1)(q-1)]}{ESS(B)/[(r-1)p(q-1)]}$
The statistics $F_1$, $F_2$, $F_3$ follow the F distribution with $[(p-1), (r-1)(p-1)]$, $[(q-1), (r-1)p(q-1)]$
and $[(p-1)(q-1), (r-1)p(q-1)]$ degrees of freedom respectively.
Conclusions
If $F_1 \le F_{\alpha,(p-1),(r-1)(p-1)}$, we conclude that the data do not provide us any evidence against the
null hypothesis H0(1), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject
H0(1) or accept H1(1).
If $F_2 \le F_{\alpha,(q-1),(r-1)p(q-1)}$, we conclude that the data do not provide us any evidence against
the null hypothesis H0(2), and hence it may be accepted at $\alpha$% level of significance. Otherwise reject
H0(2) or accept H1(2).
If $F_3 \le F_{\alpha,(p-1)(q-1),(r-1)p(q-1)}$, we conclude that the data do not provide us any evidence
against the null hypothesis H0(3), and hence it may be accepted at $\alpha$% level of significance. Otherwise
reject H0(3) or accept H1(3).
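The split-plot calculations can be sketched in Python as below. This is an illustration only: the function name `split_plot_anova`, the layout `y[i][j][k]` (block i, main plot treatment j, sub plot treatment k) and the small data set are hypothetical, and the two error terms are obtained in the usual way, Error (A) from the BM table and Error (B) as the remainder.

```python
def split_plot_anova(y):
    """Split-plot ANOVA; y[i][j][k] is block i, main plot treatment j, sub plot treatment k."""
    r, p, q = len(y), len(y[0]), len(y[0][0])
    values = [v for block in y for main in block for v in main]
    cf = sum(values) ** 2 / (r * p * q)                         # correction factor
    tss = sum(v * v for v in values) - cf
    bm = [[sum(y[i][j]) for j in range(p)] for i in range(r)]   # BM table
    ms = [[sum(y[i][j][k] for i in range(r)) for k in range(q)] for j in range(p)]  # MS table
    ssbm = sum(c * c for row in bm for c in row) / q - cf
    ssb = sum(sum(row) ** 2 for row in bm) / (p * q) - cf
    ssm = sum(sum(bm[i][j] for i in range(r)) ** 2 for j in range(p)) / (r * q) - cf
    ess_a = ssbm - ssb - ssm                                    # Error (A)
    ssms = sum(c * c for row in ms for c in row) / r - cf
    sss = sum(sum(ms[j][k] for j in range(p)) ** 2 for k in range(q)) / (r * p) - cf
    ssi = ssms - ssm - sss
    ess_b = tss - ssbm - sss - ssi                              # Error (B)
    ems_a = ess_a / ((r - 1) * (p - 1))
    ems_b = ess_b / ((r - 1) * p * (q - 1))
    return {
        "SSB": ssb, "SSM": ssm, "ESS_A": ess_a, "SSS": sss, "SSI": ssi, "ESS_B": ess_b,
        "F1": (ssm / (p - 1)) / ems_a,
        "F2": (sss / (q - 1)) / ems_b,
        "F3": (ssi / ((p - 1) * (q - 1))) / ems_b,
    }


# Hypothetical data: r = 2 blocks, p = 2 main plot and q = 2 sub plot treatments.
plots = [[[1, 2], [3, 5]], [[2, 4], [6, 7]]]
sp = split_plot_anova(plots)
print({k: round(v, 3) for k, v in sp.items()})
```

Main plot treatments are tested against Error (A), while sub plot treatments and the interaction are tested against Error (B), which is why the sub plot factor is estimated with greater precision.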
Example
An experiment was conducted in split plot design to study the effect of fertilizer (F ) and seed rate
(S) on the yield of paddy raised under semi-dry condition. The main plot treatments were the seed rates
75, 100 and 125 kg/ha denoted by s1, s2 and s3 respectively.
The sub-plot treatments were the fertilizer rates. They were N:P:K in the rate 75:15:20 = f 1;
75:15:40 = f 2; 75:15:60 = f 3; 75:30:20 = f 4; 75:30:40 = f 5; 75:30:60 = f 6; 75:45:20 = f 7; 45:45:40 = f 8;
75:45:60 = f 9 and 50:15:40 = f 10. The layout plan and grain yield of paddy in kg/plot are given in the
following table. Analyze the data and draw the conclusions.
Solution
H0(1): The seed rates have equal effect.
H0(2): The fertilizer rates have equal effect.
H0(3): There is no interaction between seed rate and fertilizer rate.
H1(1): The seed rates do not have equal effect.
H1(2): The fertilizer rates do not have equal effect.
H1(3): There is interaction between seed rate and fertilizer rate.
Level of Significance: $\alpha = 0.05$.
Critical Values: F 0.05,(2,4) = 6.94; F 0.05,(4,54) = 2.52; F 0.05,(18,54) = 1.79
Calculations:
n = 90; r = 3; m = 10; s = 3; G = 1131.61
CF = 14228.2355; TSS = 235.9742
Block X Main plot (BM) table:
Test Statistics:
1. $F_1 = \dfrac{SSM/(p-1)}{ESS(A)/[(r-1)(p-1)]} = \dfrac{172.7740/2}{1.5158/4} = 227.964$
Conclusions:
Since F 1 > F 0.05, (2, 4), we conclude that the data provide us evidence against the null hypothesis
H0(1) and in favor of H1(1). Hence H1(1) is accepted at 5% level of significance. That is, the seed rates
do not have equal effect.
Since $F_2 > F_{0.05,(4,54)}$, we conclude that the data provide us evidence against the null hypothesis
H0(2) and in favor of H1(2). Hence H1(2) is accepted at 5% level of significance. That is, the fertilizer
rates do not have equal effect.
Since F 3 > F 0.05, (18, 54), we conclude that the data provide us evidence against the null hypothesis
H0(3) and in favor of H1(3). Hence H1(3) is accepted at 5% level of significance. That is, there is an
interaction between seed rate and fertilizer rate.
TEST 35
Aim
To test the significance of the effect of main plot treatments and the effect of sub plot treatments
based on strip plot design.
Source
In this design, the main plot treatments are applied at random to rows and the sub plot treatments
are applied at random to columns. Suppose we are interested to test two factors a and b, factor a
being at p levels a1, a2, , ap and factor b at q levels b1, b2, , bq as in split plot design.
Hence, there are p main plot treatments, q sub plot treatments and r replications (blocks), so that
there are rpq = n experimental units in total. The observations are arranged in a three-way table.
Linear Model
The model for this experiment is
$Y_{ijk} = \mu + r_i + m_j + m_{ij} + s_k + e_{ik} + \gamma_{jk} + \varepsilon_{ijk}$
($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, p$; $k = 1, 2, \ldots, q$)
where
$Y_{ijk}$ is the observation of the $i$th block, $j$th main plot and $k$th sub plot.
$\mu$ is the overall mean effect.
$r_i$ is the effect due to the $i$th block.
$m_j$ is the effect due to the $j$th main plot treatment.
$m_{ij}$ is the main plot error or error (A).
$s_k$ is the effect due to the $k$th sub plot treatment.
$e_{ik}$ is the sub plot error or error (B).
$\gamma_{jk}$ is the effect due to interaction between main and sub plots.
and $\varepsilon_{ijk}$ is the error effect due to interaction or error (C).
Assumptions
1. The main plot treatments are allocated randomly to each rows of the block.
2. The sub plot treatments are allocated randomly to each columns of the block.
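3. $r_i$, $m_{ij}$, $e_{ik}$ and $\varepsilon_{ijk}$ are independently normally distributed, each with mean zero and variances
$\sigma_r^2$, $\sigma_m^2$, $\sigma_e^2$ and $\sigma^2$ respectively.
4. $\sum_j m_j = 0$, $\sum_k s_k = 0$, $\sum_k \gamma_{jk} = 0$ for all j, $\sum_j \gamma_{jk} = 0$ for all k.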
Null Hypotheses
H0(1): The p main plot treatments have equal effect, i.e., H0(1): $m_1 = m_2 = \cdots = m_p$.
H0(2): The q sub plot treatments have equal effect, i.e., H0(2): $s_1 = s_2 = \cdots = s_q$.
H0(3): There is no interaction between main and sub plot treatments, i.e., H0(3): $\delta_{jk} = 0$ for all j
and k.
Alternative Hypotheses
H1(1): The p main plot treatments do not have equal effect, i.e., H1(1): not all of $m_1, m_2, \ldots, m_p$ are equal.
H1(2): The q sub plot treatments do not have equal effect, i.e., H1(2): not all of $s_1, s_2, \ldots, s_q$ are equal.
H1(3): There is interaction between main and sub plot treatments, i.e., H1(3): $\delta_{jk} \neq 0$ for at least one
pair (j, k).
Method
Calculate the following, based on the observations:
2. Correction Factor, $CF = \dfrac{G^2}{n}$, where G is the grand total of all n observations.
3. Total Sum of Squares, $TSS = \sum_{i=1}^{r}\sum_{j=1}^{p}\sum_{k=1}^{q} y_{ijk}^2 - CF$
4. Form a two-way table (BM table) for Block × Main plot treatments as follows.
5. Sum of Squares in BM table, $SSBM = \dfrac{1}{q}\sum_{i}\sum_{j} Y_{ij.}^2 - CF$
6. Sum of Squares between Blocks, $SSB = \dfrac{1}{pq}\sum_{i} R_i^2 - CF$
7. Sum of Squares between Main plot treatments, $SSM = \dfrac{1}{rq}\sum_{j} M_j^2 - CF$
10. Sum of Squares in BS table, $SSBS = \dfrac{1}{p}\sum_{i}\sum_{k} Y_{i.k}^2 - CF$
11. Sum of Squares between Sub plot treatments, $SSS = \dfrac{1}{rp}\sum_{k} S_k^2 - CF$
14. Form a two-way table (MS table) for Main plot treatments × Sub plot treatments as follows:
15. Sum of Squares in MS table, $SSMS = \dfrac{1}{r}\sum_{j}\sum_{k} Y_{.jk}^2 - CF$
16. Sum of Squares of Interaction, $SSI = SSMS - SSM - SSS$
17. Error Sum of Squares (Error (C)),
$ESS(C) = TSS - SSB - SSM - ESS(A) - SSS - ESS(B) - SSI$.
Test Statistics
1. $F_1 = \dfrac{SSM/(p-1)}{ESS(A)/[(r-1)(p-1)]}$
2. $F_2 = \dfrac{SSS/(q-1)}{ESS(B)/[(r-1)(q-1)]}$
3. $F_3 = \dfrac{SSI/[(p-1)(q-1)]}{ESS(C)/[(r-1)(p-1)(q-1)]}$
The statistics $F_1$, $F_2$, $F_3$ follow F distributions with [(p − 1), (r − 1)(p − 1)], [(q − 1), (r − 1)(q − 1)]
and [(p − 1)(q − 1), (r − 1)(p − 1)(q − 1)] degrees of freedom respectively.
Conclusions
If $F_1 \le F_{\alpha,\,(p-1),\,(r-1)(p-1)}$, we conclude that the data do not provide any evidence against the
null hypothesis H0(1), and hence it may be accepted at the α% level of significance. Otherwise reject
H0(1) or accept H1(1).
If $F_2 \le F_{\alpha,\,(q-1),\,(r-1)(q-1)}$, we conclude that the data do not provide any evidence against the
null hypothesis H0(2), and hence it may be accepted at the α% level of significance. Otherwise reject
H0(2) or accept H1(2).
If $F_3 \le F_{\alpha,\,(p-1)(q-1),\,(r-1)(p-1)(q-1)}$, we conclude that the data do not provide any evidence
against the null hypothesis H0(3), and hence it may be accepted at the α% level of significance. Otherwise
reject H0(3) or accept H1(3).
Example
Use the data in test-9, apply strip plot design, and draw your conclusions.
Solution
The main plot analysis is the same as in the split plot design. Apart from this, we have to form a two-way
table (BS table) for block × sub plot treatment as follows:
\[ SSBS = \frac{1}{3}\left[(35.89)^2 + \cdots + (31.31)^2\right] - CF = \frac{42857.122}{3} - 14228.236 = 57.471 \]
SSS = 56.7606; SSI = 3.6699
ESS(B) = SSBS − SSS = 57.4710 − 56.7606 = 0.7104
ESS(C) = TSS − SSB − SSM − ESS(A) − SSS − ESS(B) − SSI
= 235.9742 − 0.4348 − 172.7740 − 1.5158 − 56.7606 − 0.7104 − 3.6699
= 0.1087
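The subtraction chain above is easy to mistype, so a small check can help. The following is only a sketch: the variable names are illustrative, and the values are the sums of squares quoted in this example.

```python
# Sums of squares quoted in the worked strip plot example.
tss = 235.9742
ssb = 0.4348
ssm = 172.7740
ess_a = 1.5158
sss = 56.7606
ssbs = 57.4710
ssi = 3.6699

# Error (B) as computed in the example: BS-table sum of squares minus SSS.
ess_b = ssbs - sss

# Error (C): what remains of TSS after removing every other component.
ess_c = tss - ssb - ssm - ess_a - sss - ess_b - ssi

print(f"ESS(B) = {ess_b:.4f}")
print(f"ESS(C) = {ess_c:.4f}")
```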
ANOVA Table:
Test Statistics:
1. $F_1 = \dfrac{SSM/(p-1)}{ESS(A)/[(r-1)(p-1)]} = 0.5737$
2. $F_2 = \dfrac{SSS/(q-1)}{ESS(B)/[(r-1)(q-1)]} = 159.66$
3. $F_3 = \dfrac{SSI/[(p-1)(q-1)]}{ESS(C)/[(r-1)(p-1)(q-1)]} = 67.97$
Conclusions:
Since $F_1 < F_{0.05,(2,4)}$, we conclude that the data do not provide evidence against the null
hypothesis H0(1). Hence H0(1) is accepted at the 5% level of significance. That is, the seed rates have
equal effect.
Since $F_2 > F_{0.05,(9,18)}$, we conclude that the data provide evidence against the null hypothesis
H0(2) and in favor of H1(2). Hence H1(2) is accepted at the 5% level of significance. That is, the fertilizer
rates do not have equal effect.
Since $F_3 > F_{0.05,(18,36)}$, we conclude that the data provide evidence against the null hypothesis
H0(3) and in favor of H1(3). Hence H1(3) is accepted at the 5% level of significance. That is, there is an
interaction between seed rates and fertilizer rates.
CHAPTER 4
MULTIVARIATE TESTS
TEST 36
Aim
To test whether the mean vector $\mu$ of a multivariate population can be regarded as $\mu_0$, based on a multivariate
random sample. That is, to investigate the significance of the difference between the assumed population
mean vector $\mu_0$ and the sample mean vector $\bar{X}$.
Source
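Let $X_{ij}$ (i = 1, 2, ..., p; j = 1, 2, ..., N) be a random sample of p-fold N observations drawn from a
p-variate normal population whose mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_p)^T$ is unknown and whose covariance
matrix
\[ \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix} \]
is known. The diagonal elements of $\Sigma$ are variances, the non-diagonal elements are covariances and the
matrix is symmetric. Let $\bar{X} = (\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_p)^T$, with $\bar{X}_i = \frac{1}{N}\sum_{j=1}^{N} X_{ij}$ (i = 1, 2, ..., p), be the sample mean
vector.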
Assumptions
(i) The population from which the sample is drawn is a p-variate normal population.
(ii) The covariance matrix $\Sigma$ is known.
Null Hypothesis
H0: The population mean vector $\mu$ can be regarded as $\mu_0$. That is, there is no significant difference
between the sample mean vector $\bar{X}$ and the assumed population mean vector $\mu_0$, i.e., H0: $\mu = \mu_0$.
Alternative Hypothesis
H1: $\mu \neq \mu_0$
Test Statistic
\[ \chi^2 = N(\bar{X} - \mu)^T \Sigma^{-1} (\bar{X} - \mu) \quad \text{(under H0: } \mu = \mu_0\text{)} \]
The statistic $\chi^2$ follows the $\chi^2$ distribution with p degrees of freedom.
Conclusion
If $\chi^2 \le \chi^2_p(\alpha)$, we conclude that the data do not provide any evidence against the null
hypothesis H0, and hence it may be accepted at the α% level of significance. Otherwise reject H0 or accept
H1.
Example
A random sample of 42 insects of a specific variety is selected, and the mean lengths of the left and
right antennae are observed as 0.564 inches and 0.603 inches. Test whether the mean vector of the left and right
antenna lengths of this variety can be regarded as $\mu_0 = (0.55,\ 0.60)^T$, with known covariance matrix
\[ \Sigma = \begin{pmatrix} 0.014 & 0.012 \\ 0.012 & 0.015 \end{pmatrix}, \]
at the 5% level of significance.
Solution
H0: The left and right antennae of the specific variety of insects have the mean lengths $(0.55,\ 0.60)^T$, i.e.,
H0: $\mu = (0.55,\ 0.60)^T$.
H1: The mean lengths of the left and right antennae of the specific variety of insects are not $(0.55,\ 0.60)^T$, i.e.,
H1: $\mu \neq (0.55,\ 0.60)^T$.
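As a rough illustration of how the statistic of this test can be evaluated for the antenna data, the sketch below computes $N(\bar{X} - \mu_0)^T \Sigma^{-1} (\bar{X} - \mu_0)$ for the bivariate case. The function name is an assumption made for illustration only, and the closed-form 2 × 2 inverse is used instead of a matrix library.

```python
def chi_square_known_cov_2d(n, xbar, mu0, sigma):
    """Return n * (xbar - mu0)^T Sigma^{-1} (xbar - mu0) for a 2x2 covariance matrix."""
    d1 = xbar[0] - mu0[0]
    d2 = xbar[1] - mu0[1]
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    # Quadratic form written out with the closed-form inverse of a 2x2 matrix.
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    return n * quad


chi2 = chi_square_known_cov_2d(
    42,
    (0.564, 0.603),
    (0.55, 0.60),
    ((0.014, 0.012), (0.012, 0.015)),
)
critical = 5.99  # tabulated chi-square value for 2 degrees of freedom at the 5% level

print(f"chi-square = {chi2:.4f}")
print("do not reject H0" if chi2 <= critical else "reject H0")
```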
TEST 37
Aim
To test the null hypothesis that the mean vector $\mu$ of a multivariate population can be regarded as $\mu_0$,
based on a multivariate random sample. That is, to investigate the significance of the difference between
the assumed population mean vector $\mu_0$ and the sample mean vector $\bar{X}$.
Source
Let $X_{ij}$ (i = 1, 2, ..., p; j = 1, 2, ..., N) be a sample of p-fold N observations drawn from a p-variate
normal population whose mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_p)^T$ and covariance matrix $\Sigma$ are unknown.
Let $\bar{X} = (\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_p)^T$ be the sample mean vector, which is an unbiased estimate of the population
mean vector $\mu$. The unknown covariance matrix $\Sigma$ is estimated by
\[ S = \frac{A}{N-1}, \qquad A = \sum_{j=1}^{N} (X_j - \bar{X})(X_j - \bar{X})^T, \]
where $X_j = (X_{1j}, X_{2j}, \ldots, X_{pj})^T$ is the jth observation vector.
The diagonal elements of S are variances, the non-diagonal elements are covariances, and the
matrix is symmetric.
Assumptions
(i) The population from which the sample is drawn is a p-variate normal population.
(ii) The covariance matrix $\Sigma$ is unknown.
Null Hypothesis
H0: The population mean vector $\mu$ can be regarded as $\mu_0$. That is, there is no significant difference
between the sample mean vector $\bar{X}$ and the assumed population mean vector $\mu_0$, i.e., H0: $\mu = \mu_0$.
Alternative Hypothesis
H1: $\mu \neq \mu_0$
Test Statistic
\[ T^2 = N(\bar{X} - \mu)^T S^{-1} (\bar{X} - \mu) \]
Under H0: $\mu = \mu_0$, this becomes
\[ T^2 = N(\bar{X} - \mu_0)^T \left(\frac{A}{N-1}\right)^{-1} (\bar{X} - \mu_0) \]
and
\[ F = \frac{T^2}{N-1} \cdot \frac{N-p}{p}. \]
The statistic F follows the F distribution with (p, N − p) degrees of freedom.
Conclusion
If $F \le F_{p,\,N-p}(\alpha)$, we conclude that the data do not provide any evidence against the null
hypothesis H0, and hence it may be accepted at the α% level of significance. Otherwise reject H0 or accept
H1.
Example
Perspiration from 20 healthy females was analyzed. Three components, $X_1$ = sweat rate,
$X_2$ = sodium content, and $X_3$ = potassium content, were measured and the data are given below:
Test the hypothesis H0: $\mu$ = [4 50 10] against H1: $\mu \neq$ [4 50 10] at the 10% level of significance.
Solution
H0: The average perspiration of the females ($\mu$) is [4 50 10], i.e., H0: $\mu$ = [4 50 10].
H1: The average perspiration of the females ($\mu$) is not [4 50 10], i.e., H1: $\mu \neq$ [4 50 10].
Level of Significance: α = 0.10; Critical Value: $F_{0.10,(3,17)} = 2.44$
Calculations:
Based on the above data,
Test Statistic:
\[ T^2 = N(\bar{X} - \mu_0)^T S^{-1} (\bar{X} - \mu_0) = 20\,[0.640 \;\; -4.600 \;\; -0.035] \begin{pmatrix} 0.467 \\ -0.042 \\ 0.160 \end{pmatrix} = 9.74 \]
\[ F = \frac{T^2}{N-1} \cdot \frac{N-p}{p} = \frac{9.74}{20-1} \cdot \frac{20-3}{3} = 2.9049 \]
Conclusion: Since $F > F_{0.10,(3,17)}$, H0 is rejected and we conclude that the average perspiration of
the females ($\mu$) is not [4 50 10].
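The conversion from $T^2$ to F used in this example can be sketched as a one-line helper. The function name is illustrative, not part of any library.

```python
def hotelling_t2_to_f(t2, n, p):
    """Convert a one-sample Hotelling T^2 into the F statistic with (p, n - p) d.f."""
    return (t2 / (n - 1)) * ((n - p) / p)


# Values from the perspiration example: T^2 = 9.74, N = 20, p = 3.
f_stat = hotelling_t2_to_f(9.74, 20, 3)
critical = 2.44  # tabulated F value for (3, 17) d.f. at the 10% level

print(f"F = {f_stat:.4f}")
print("reject H0" if f_stat > critical else "do not reject H0")
```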
TEST 38
Aim
To test whether the mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ of two multivariate populations are equal, based on two
multivariate random samples. That is, to investigate the significance of the difference between the
sample mean vectors.
Source
Let $X_{ij}^{(1)}$ (i = 1, 2, ..., p; j = 1, 2, ..., $N_1$) be a random sample of p-fold $N_1$ observations, called
sample-1, drawn from a p-variate normal population whose mean vector is $\mu^{(1)} = (\mu_1^{(1)}, \mu_2^{(1)}, \ldots, \mu_p^{(1)})^T$.
Let $X_{ij}^{(2)}$ (i = 1, 2, ..., p; j = 1, 2, ..., $N_2$) be a random sample of p-fold $N_2$ observations, called
sample-2, drawn independently from another p-variate normal population whose mean vector is
$\mu^{(2)} = (\mu_1^{(2)}, \mu_2^{(2)}, \ldots, \mu_p^{(2)})^T$. The mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ are unknown. The covariance matrices of
the two populations are equal and known, and the common matrix is denoted by
\[ \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix} \]
The diagonal elements of $\Sigma$ are variances, the non-diagonal elements are covariances and the
matrix is symmetric. Let $\bar{X}^{(1)} = (\bar{X}_1^{(1)}, \bar{X}_2^{(1)}, \ldots, \bar{X}_p^{(1)})^T$ be the sample mean vector of sample-1,
which is an unbiased estimate of the population mean vector $\mu^{(1)}$, and let $\bar{X}^{(2)} = (\bar{X}_1^{(2)}, \bar{X}_2^{(2)}, \ldots, \bar{X}_p^{(2)})^T$
be the sample mean vector of sample-2, which is an unbiased estimate of the population mean
vector $\mu^{(2)}$.
Assumptions
(i) The populations from which the samples are drawn are two independent p-variate normal
populations.
(ii) The covariance matrices of the two populations are equal and known, denoted by $\Sigma$.
Null Hypothesis
H0: The two population mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ are equal. That is, there is no significant
difference between the two sample mean vectors $\bar{X}^{(1)}$ and $\bar{X}^{(2)}$, i.e., H0: $\mu^{(1)} = \mu^{(2)}$.
Alternative Hypothesis
H1: $\mu^{(1)} \neq \mu^{(2)}$
Test Statistic
\[ \chi^2 = \frac{N_1 N_2}{N_1 + N_2}\left[(\bar{X} - \mu)^T \Sigma^{-1} (\bar{X} - \mu)\right], \qquad \bar{X} = \bar{X}^{(1)} - \bar{X}^{(2)}, \quad \mu = \mu^{(1)} - \mu^{(2)} \]
Under H0: $\mu^{(1)} = \mu^{(2)}$, the test statistic becomes
\[ \chi^2 = \frac{N_1 N_2}{N_1 + N_2}\,(\bar{X}^{(1)} - \bar{X}^{(2)})^T \Sigma^{-1} (\bar{X}^{(1)} - \bar{X}^{(2)}) \]
Conclusion
If $\chi^2 \le \chi^2_p(\alpha)$, we conclude that the data do not provide any evidence against the null hypothesis
H0, and hence it may be accepted at the α% level of significance. Otherwise reject H0 or accept H1.
Example
Fifty observations are taken from the population Iris versicolor (1) and fifty from the population
Iris setosa (2) on the characters sepal length ($X_1$), sepal width ($X_2$), petal length ($X_3$) and petal width
($X_4$), in centimeters, and the following sample mean vectors are obtained:
\[ \bar{X}^{(1)} = \begin{pmatrix} 5.936 \\ 2.770 \\ 4.260 \\ 1.326 \end{pmatrix}, \qquad \bar{X}^{(2)} = \begin{pmatrix} 5.006 \\ 3.428 \\ 1.462 \\ 0.246 \end{pmatrix} \]
with known covariance matrix
\[ \Sigma = \begin{pmatrix} 19.1434 & 9.0356 & 9.7634 & 3.2394 \\ 9.0356 & 11.8658 & 4.6232 & 2.4746 \\ 9.7634 & 4.6232 & 12.2978 & 3.8794 \\ 3.2394 & 2.4746 & 3.8794 & 2.4604 \end{pmatrix} \]
Test whether the mean vectors of the given four characters of the two populations are equal at the 5% level
of significance.
Solution
H0: The mean vectors of the given four characters of the two populations are equal, i.e., H0: $\mu^{(1)} = \mu^{(2)}$.
H1: The mean vectors of the given four characters of the two populations are not equal, i.e., H1: $\mu^{(1)} \neq \mu^{(2)}$.
Level of Significance: α = 0.05 and Critical value: $\chi^2_{0.05,(4)} = 9.49$
Test Statistic:
\[ \chi^2 = \frac{N_1 N_2}{N_1 + N_2}\,(\bar{X}^{(1)} - \bar{X}^{(2)})^T \Sigma^{-1} (\bar{X}^{(1)} - \bar{X}^{(2)}) = \frac{50 \times 50}{50 + 50}\, d^T \Sigma^{-1} d = 2580.732, \]
where
\[ d = \bar{X}^{(1)} - \bar{X}^{(2)} = \begin{pmatrix} 5.936 - 5.006 \\ 2.770 - 3.428 \\ 4.260 - 1.462 \\ 1.326 - 0.246 \end{pmatrix}. \]
Conclusion: Since $\chi^2 > \chi^2_{0.05,(4)}$, H0 is rejected and we conclude that the mean vectors of the given four
characters of the two populations are not equal.
TEST 39
Aim
To test whether the mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ of two multivariate populations are equal, based on two
multivariate random samples. That is, to investigate the significance of the difference between the two
sample mean vectors.
Source
Let $X_{ij}^{(1)}$ (i = 1, 2, ..., p; j = 1, 2, ..., $N_1$) be a random sample of p-fold $N_1$ observations, called
sample-1, drawn from a p-variate normal population whose mean vector is $\mu^{(1)} = (\mu_1^{(1)}, \mu_2^{(1)}, \ldots, \mu_p^{(1)})^T$.
Let $X_{ij}^{(2)}$ (i = 1, 2, ..., p; j = 1, 2, ..., $N_2$) be a random sample of p-fold $N_2$ observations, called
sample-2, drawn independently from another p-variate normal population whose mean vector is
$\mu^{(2)} = (\mu_1^{(2)}, \mu_2^{(2)}, \ldots, \mu_p^{(2)})^T$. The mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ are unknown. The covariance matrices of the
two populations are equal but unknown, and the common matrix is denoted by $\Sigma$. The estimate of $\Sigma$ is given by
\[ S = \frac{1}{N_1 + N_2 - 2}\left[\sum_{j=1}^{N_1} (X_j^{(1)} - \bar{X}^{(1)})(X_j^{(1)} - \bar{X}^{(1)})^T + \sum_{j=1}^{N_2} (X_j^{(2)} - \bar{X}^{(2)})(X_j^{(2)} - \bar{X}^{(2)})^T\right] \]
The diagonal elements of S are variances, the non-diagonal elements are covariances and the
matrix is symmetric. Let $\bar{X}^{(1)} = (\bar{X}_1^{(1)}, \bar{X}_2^{(1)}, \ldots, \bar{X}_p^{(1)})^T$ be the sample mean vector of sample-1,
which is an unbiased estimate of the population mean vector $\mu^{(1)}$, and let $\bar{X}^{(2)} = (\bar{X}_1^{(2)}, \bar{X}_2^{(2)}, \ldots, \bar{X}_p^{(2)})^T$
be the sample mean vector of sample-2, which is an unbiased estimate of the population mean
vector $\mu^{(2)}$.
Assumptions
(i) The populations from which the samples are drawn are two independent p-variate normal
populations.
(ii) The covariance matrices of the two populations are equal; the common matrix, denoted by $\Sigma$, is unknown.
Null Hypothesis
H0: The two population mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ are equal. That is, there is no significant
difference between the two sample mean vectors $\bar{X}^{(1)}$ and $\bar{X}^{(2)}$, i.e., H0: $\mu^{(1)} = \mu^{(2)}$.
Alternative Hypothesis
H1: $\mu^{(1)} \neq \mu^{(2)}$
Test Statistic
\[ T^2 = \frac{N_1 N_2}{N_1 + N_2}\,(\bar{X} - \mu)^T S^{-1} (\bar{X} - \mu), \qquad \bar{X} = \bar{X}^{(1)} - \bar{X}^{(2)}, \quad \mu = \mu^{(1)} - \mu^{(2)} \]
Under H0: $\mu^{(1)} = \mu^{(2)}$, the test statistic becomes
\[ T^2 = \frac{N_1 N_2}{N_1 + N_2}\,(\bar{X}^{(1)} - \bar{X}^{(2)})^T S^{-1} (\bar{X}^{(1)} - \bar{X}^{(2)}) \]
and
\[ F = \frac{T^2}{N_1 + N_2 - 2} \cdot \frac{N_1 + N_2 - p - 1}{p}. \]
The statistic F follows the F distribution with (p, $N_1 + N_2 - p - 1$) degrees of freedom.
Conclusion
If $F \le F_{p,\,N_1 + N_2 - p - 1}(\alpha)$, we conclude that the data do not provide any evidence against the
null hypothesis H0, and hence it may be accepted at the α% level of significance. Otherwise reject H0 or
accept H1.
Note: This test is also known as Hotelling's $T^2$ test.
Example
Two random samples of sizes 45 and 55 were observed from households in Chennai city
with and without air conditioning, respectively. Two measurements of electrical usage (in kilowatt
hours) were considered. The first is a measure of total on-peak consumption ($X_1$) during July and the
second is a measure of total off-peak consumption ($X_2$) during July. The resulting summary statistics
are
Solution
H0: The average consumption of electrical usage on both on-peak and off-peak is equal for the two groups,
i.e., H0: $\mu^{(1)} = \mu^{(2)}$.
H1: The average consumption of electrical usage on both on-peak and off-peak is not equal for the two groups,
i.e., H1: $\mu^{(1)} \neq \mu^{(2)}$.
Level of Significance: α = 0.05 and Critical value: $F_{0.05,(2,97)} = 3.10$
Calculations:
The pooled sample covariance matrix,
\[ T^2 = \frac{N_1 N_2}{N_1 + N_2}\,(\bar{X}^{(1)} - \bar{X}^{(2)})^T S^{-1} (\bar{X}^{(1)} - \bar{X}^{(2)}) = \frac{2475}{100}\,[0.001699 \;\; 0.002592] \begin{pmatrix} 74.4 \\ 201.6 \end{pmatrix} = 24.75 \times 0.6489528 = 16.0616 \]
and
\[ F = \frac{T^2}{N_1 + N_2 - 2} \cdot \frac{N_1 + N_2 - p - 1}{p} = \frac{16.0616}{45 + 55 - 2} \cdot \frac{45 + 55 - 2 - 1}{2} = 7.9488 \]
Conclusion: Since $F > F_{0.05,(2,97)}$, H0 is rejected and we conclude that the average consumption of
electrical usage on both on-peak and off-peak is not equal for the two groups.
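The arithmetic of this example can be retraced with a short sketch. The helper name is illustrative; the row vector [0.001699, 0.002592] and the mean difference (74.4, 201.6) are the values quoted in the calculation.

```python
def two_sample_t2_to_f(t2, n1, n2, p):
    """Convert a two-sample Hotelling T^2 into F with (p, n1 + n2 - p - 1) d.f."""
    return (t2 / (n1 + n2 - 2)) * ((n1 + n2 - p - 1) / p)


n1, n2, p = 45, 55, 2

# (mean difference)^T S^{-1}, as quoted in the example, times the mean difference.
row = (0.001699, 0.002592)
diff = (74.4, 201.6)
quad = sum(r * d for r, d in zip(row, diff))

t2 = (n1 * n2) / (n1 + n2) * quad
f_stat = two_sample_t2_to_f(t2, n1, n2, p)

print(f"T^2 = {t2:.4f}")
print(f"F   = {f_stat:.4f}")
```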
TEST 40
Aim
To test whether the mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ of two multivariate populations are equal, based on two
multivariate random samples. That is, to investigate the significance of the difference between the two
sample mean vectors.
Source
Let $X_{ij}^{(1)}$ (i = 1, 2, ..., p; j = 1, 2, ..., $N_1$) be a random sample of p-fold $N_1$ observations, called
sample-1, drawn from a p-variate normal population whose mean vector is $\mu^{(1)} = (\mu_1^{(1)}, \mu_2^{(1)}, \ldots, \mu_p^{(1)})^T$.
Let $X_{ij}^{(2)}$ (i = 1, 2, ..., p; j = 1, 2, ..., $N_2$) be a random sample of p-fold $N_2$ observations, called
sample-2, drawn independently from another p-variate normal population whose mean vector is
$\mu^{(2)} = (\mu_1^{(2)}, \mu_2^{(2)}, \ldots, \mu_p^{(2)})^T$. The mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ are unknown. The covariance matrices of
the two populations are unequal and unknown and are denoted by $\Sigma_1$ and $\Sigma_2$. In this case $\Sigma_1$ is
estimated by $S_1$ and $\Sigma_2$ is estimated by $S_2$, where $S_1$ and $S_2$ are the sample covariance matrices of the two
samples.
Let $\bar{X}^{(1)} = (\bar{X}_1^{(1)}, \bar{X}_2^{(1)}, \ldots, \bar{X}_p^{(1)})^T$ be the sample mean vector of sample-1, which is an
unbiased estimate of the population mean vector $\mu^{(1)}$, and let $\bar{X}^{(2)} = (\bar{X}_1^{(2)}, \bar{X}_2^{(2)}, \ldots, \bar{X}_p^{(2)})^T$ be the
sample mean vector of sample-2, which is an unbiased estimate of the population mean vector $\mu^{(2)}$.
Assumptions
(i) The populations from which the samples are drawn are two independent p-variate normal
populations.
(ii) The covariance matrices of the two populations, denoted by $\Sigma_1$ and $\Sigma_2$, are unequal and unknown.
Null Hypothesis
H0: The two population mean vectors $\mu^{(1)}$ and $\mu^{(2)}$ are equal. That is, there is no significant
difference between the two sample mean vectors $\bar{X}^{(1)}$ and $\bar{X}^{(2)}$, i.e., H0: $\mu^{(1)} = \mu^{(2)}$.
Alternative Hypothesis
H1: $\mu^{(1)} \neq \mu^{(2)}$
Test Statistic
\[ T^2 = \left[\bar{X}_1 - \bar{X}_2\right]^T \left[\frac{1}{N_1} S_1 + \frac{1}{N_2} S_2\right]^{-1} \left[\bar{X}_1 - \bar{X}_2\right] \]
The statistic $T^2$ follows the $\chi^2$ distribution with p degrees of freedom.
Conclusion
If $T^2 \le \chi^2_{\alpha,(p)}$, we conclude that the data do not provide any evidence against the null hypothesis
H0, and hence it may be accepted at the α% level of significance. Otherwise reject H0 or accept H1.
Example
For the problem given in Test 39, test whether the mean vectors of both samples can be regarded as
drawn from the same population at the 5% level of significance.
Solution
H0: The average consumption of electrical usage on both on-peak and off-peak is equal for the two groups, i.e.,
H0: $\mu^{(1)} = \mu^{(2)}$.
H1: The average consumption of electrical usage on both on-peak and off-peak is not equal for the two groups, i.e.,
H1: $\mu^{(1)} \neq \mu^{(2)}$.
Level of Significance: α = 0.05 and Critical value: $\chi^2_{0.05,(2)} = 5.99$
Calculations:
Given that
\[ \frac{1}{N_1} S_1 + \frac{1}{N_2} S_2 = \begin{pmatrix} 464.17 & 886.08 \\ 886.08 & 2642.15 \end{pmatrix} \]
Test Statistic:
\[ T^2 = \left[\bar{X}_1 - \bar{X}_2\right]^T \left[\frac{1}{N_1} S_1 + \frac{1}{N_2} S_2\right]^{-1} \left[\bar{X}_1 - \bar{X}_2\right] = \begin{pmatrix} 204.4 - 130.0 \\ 556.6 - 355.0 \end{pmatrix}^T \begin{pmatrix} 464.17 & 886.08 \\ 886.08 & 2642.15 \end{pmatrix}^{-1} \begin{pmatrix} 204.4 - 130.0 \\ 556.6 - 355.0 \end{pmatrix} \]
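The quadratic form written above can be evaluated numerically. The following is a minimal sketch, assuming only the mean vectors and the combined matrix quoted in the calculation; the function name is illustrative.

```python
def quadratic_form_2x2(d, m):
    """Return d^T m^{-1} d for a 2-vector d and a 2x2 matrix m."""
    (m11, m12), (m21, m22) = m
    det = m11 * m22 - m12 * m21
    # Closed-form inverse of a 2x2 matrix applied on both sides.
    return (m22 * d[0] ** 2 - (m12 + m21) * d[0] * d[1] + m11 * d[1] ** 2) / det


diff = (204.4 - 130.0, 556.6 - 355.0)              # difference of sample mean vectors
combined = ((464.17, 886.08), (886.08, 2642.15))   # S1/N1 + S2/N2 as given above

t2 = quadratic_form_2x2(diff, combined)
critical = 5.99  # chi-square critical value for 2 d.f. at the 5% level

print(f"T^2 = {t2:.2f}")
print("reject H0" if t2 > critical else "do not reject H0")
```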
NONPARAMETRIC TESTS
TEST 41
Aim
To test whether the population median M can be regarded as $M_0$.
Source
A random sample of n observations is drawn independently. Let $M_0$ be the hypothesized value of the
population median.
Assumption
Each observation in the sample should be independent of the others.
Null Hypothesis
H0 : $M = M_0$
Alternative Hypotheses
H1(1) : $M \neq M_0$
H1(2) : $M > M_0$
H1(3) : $M < M_0$
Method
1. Discard the sample observations whose value is equal to $M_0$.
2. Count the number of observations below and above $M_0$; these are denoted by
$n_1$ and $n_2$ respectively.
Test Statistic
T = Minimum ($n_1$, $n_2$)
Conclusion
1. If $T \ge T_\alpha$, accept H0, and if $T < T_\alpha$, reject H0 or accept H1.
Example
A random sample of 15 students is selected from a school; their heights (in cm) are given below.
Test whether the median height of the school students can be regarded as 135 cm or not. Test at the 5% level of
significance.
132 134 138 139 142 132 140 136 135 140 139 132 131 136 138
Solution
Aim: To test whether the median height of the school students is 135 cm or not.
H0 : The median height of the school students is 135 cm, i.e., H0: M = 135.
H1 : The median height of the school students is not 135 cm, i.e., H1: M ≠ 135.
Level of Significance: α = 0.05 and Critical Value: $T_{0.05,15} = 9$.
Calculations:
1. Discard the sample observation 135, as it is equal to the hypothesized median.
2. Number of observations below the median, $n_1 = 5$.
3. Number of observations above the median, $n_2 = 9$.
Test Statistic:
T = Minimum ($n_1$, $n_2$) = 5.
Conclusion: Since $T < T_{0.05,15}$, H0 is rejected and H1 is accepted. Hence, we conclude that the
median height of the school students is not 135 cm.
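The counting steps of this example can be sketched in a few lines. The variable names are illustrative; the data are the 15 heights listed above.

```python
heights = [132, 134, 138, 139, 142, 132, 140, 136, 135, 140, 139, 132, 131, 136, 138]
m0 = 135  # hypothesized median

# Step 1: discard observations equal to the hypothesized median.
kept = [h for h in heights if h != m0]

# Step 2: count observations below and above the hypothesized median.
n1 = sum(1 for h in kept if h < m0)
n2 = sum(1 for h in kept if h > m0)

# Test statistic: the smaller of the two counts.
t = min(n1, n2)

print(f"n1 = {n1}, n2 = {n2}, T = {t}")
```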
TEST 42
Aim
To test whether the population medians $M_1$ and $M_2$ are equal.
Source
Two random samples of n pairs of observations are drawn from two populations. The population
medians $M_1$ and $M_2$ are unknown.
Assumptions
(i) Each pair of observations should be taken under the same conditions.
(ii) The different pairs need not be taken under similar conditions.
Null Hypothesis
H0 : $M_1 = M_2$
Alternative Hypothesis
H1 : $M_1 \neq M_2$
Method
1. Let ($X_i$, $Y_i$), (i = 1, 2, ..., n) be the pairs of observations.
2. Find $X_i - Y_i$ for each of the n pairs.
3. Put a + sign, if $X_i - Y_i > 0$.
4. Put a − sign, if $X_i - Y_i < 0$.
5. Count the number of + signs and denote it by $T_+$.
6. Count the number of − signs and denote it by $T_-$.
Test Statistic
T = Min ($T_+$, $T_-$)
Conclusion
Solution
Aim: To test whether the median marks of the two examinations are equal or not.
H0: The median marks of the two examinations are equal.
H1: The median marks of the two examinations are not equal.
Level of Significance: α = 0.05 and Critical value: $T_{0.05,14} = 2$.
Calculations:
X:     85 89 78 72 68 65 78 75 79 78 82 85 84 73
Y:     88 79 85 80 75 62 79 80 85 75 80 88 85 75
X − Y:  −  +  −  −  −  +  −  −  −  +  +  −  −  −
$T_+ = 4$; $T_- = 10$.
Test Statistic:
T = Minimum ($T_+$, $T_-$) = 4
Conclusion: Since $T > T_{0.05,14}$, accept H0 and conclude that the median marks of the two
examinations are equal.
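The sign counts for the paired marks can be reproduced with a small sketch; the variable names are illustrative, and tied pairs (none occur here) would simply be skipped.

```python
x = [85, 89, 78, 72, 68, 65, 78, 75, 79, 78, 82, 85, 84, 73]
y = [88, 79, 85, 80, 75, 62, 79, 80, 85, 75, 80, 88, 85, 75]

# Sign of each paired difference X_i - Y_i.
signs = ["+" if a > b else "-" for a, b in zip(x, y) if a != b]

t_plus = signs.count("+")
t_minus = signs.count("-")
t = min(t_plus, t_minus)

print(" ".join(signs))
print(f"T+ = {t_plus}, T- = {t_minus}, T = {t}")
```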
TEST 43
MEDIAN TEST
Aim
To test whether the two samples are drawn from populations having the same median.
Source
A random sample of $n_1$ observations, arranged in order of magnitude as $X_1, X_2, \ldots, X_{n_1}$, is drawn
from a population with density function $f_1(.)$, and a random sample of $n_2$ observations, arranged in
order of magnitude as $Y_1, Y_2, \ldots, Y_{n_2}$, is drawn from another population with density function $f_2(.)$. The
medians of the two populations are unknown. Let $N = n_1 + n_2$.
Assumptions
(i) The two samples drawn are independent.
(ii) The observations must be at least ordinal.
(iii) The sample sizes should be sufficiently large.
Null Hypothesis
H0: The two samples are drawn from the populations having the same median.
Alternative Hypothesis
H1: The two samples are drawn from the populations having different medians.
Method
1. Combine the two samples and arrange the observations in order of magnitude, say, $X_1\,X_2\,Y_1\,X_3\,Y_2\,Y_3\,X_4\,Y_4\,X_5 \ldots$
such that $X_1 < X_2 < Y_1 < X_3 < Y_2 < Y_3 < X_4 < Y_4 < X_5 \ldots$
Let the combined ordered observations be $Z = \{Z_{(1)}, Z_{(2)}, \ldots, Z_{(n_1+n_2)}\}$ such that $Z_{(1)} < Z_{(2)} < \cdots < Z_{(n_1+n_2)}$,
where each $Z_{(i)}$ is either an X or a Y.
2. Calculate the median, M, of the combined sample.
3. Let $m_1$ be the number of Xs and $m_2$ be the number of Ys exceeding the median M.
4. Classify the frequencies $m_1$ and $m_2$ into the following 2 × 2 contingency table.
Test Statistic
\[ \chi^2 = \frac{N(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)} \]
Conclusion
\[ E(m_1) = \begin{cases} \dfrac{n_1}{2}, & \text{if } N \text{ is even,} \\[2ex] \dfrac{n_1 (N - 1)}{2N}, & \text{if } N \text{ is odd.} \end{cases} \]
\[ \mathrm{Var}(m_1) = \begin{cases} \dfrac{n_1 n_2}{4(N - 1)}, & \text{if } N \text{ is even,} \\[2ex] \dfrac{n_1 n_2 (N + 1)}{4N^2}, & \text{if } N \text{ is odd,} \end{cases} \]
which may be compared with Table 1, as the statistic Z follows the Standard Normal distribution.
Example
The following data give the lifetimes of bulbs of two different brands. A sample of 7 bulbs of
brand-I and a sample of 8 bulbs of brand-II are selected.
Brand-I (X): 80 100 90 110 125 130 70
Brand-II (Y): 100 120 80 140 130 160 115 120
Test whether the median lifetimes of the two brands of bulbs are equal or not at the 5% level of significance.
Solution
H0: The median lifetimes of the two brands of bulbs are equal.
H1: The median lifetimes of the two brands of bulbs are not equal.
Level of Significance: α = 0.10 and Critical value: $\chi^2_{0.10,1} = 1.82$
Calculations:
The combined sample in the ordered form is
70 80 80 90 100 100 110 115 120 120 125 130 130 140 160
Here the median, M = 120.
          Number of Xs    Number of Ys    Total
Total     7               8               15
Test Statistic:
\[ \chi^2 = \frac{N(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)} = \frac{15\,(2 \times 3 - 5 \times 5)^2}{7 \times 8 \times 7 \times 8} = 1.73 \]
Conclusion: Since $\chi^2 < \chi^2_1(\alpha)$, accept H0 and conclude that the median lifetimes of the two brands of
bulbs are equal.
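As a small sketch of the test statistic formula used in this example, the function below evaluates $N(ad - bc)^2 / [(a + c)(b + d)(a + b)(c + d)]$. The function name is illustrative, and the cell values a = 2, b = 5, c = 5, d = 3 are simply those that appear in the worked calculation.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic N(ad - bc)^2 / [(a+c)(b+d)(a+b)(c+d)] for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


# Cell values as they appear in the worked calculation.
chi2 = chi_square_2x2(2, 5, 5, 3)
print(f"chi-square = {chi2:.2f}")
```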
TEST 44
Aim
To test whether the two random samples could have come from two populations with the same frequency
distribution.
Source
Two independent random samples of sizes n1 and n2 are drawn.
Assumptions
The sample sizes of the two samples are sufficiently large.
Null Hypothesis
H0: The populations from which the two samples are drawn have the same frequency distribution.
Alternative Hypothesis
H1: The populations from which the two samples are drawn have different frequency distributions.
Method
1. The median of the combined sample of size $N = n_1 + n_2$ is found.
2. For each of the samples, find the number of observations below and above the median, then
form a 2 × 2 table as follows:
                Sample 1    Sample 2    Total
Below Median    a           b           a + b
Above Median    c           d           c + d
Total           a + c       b + d       N
Test Statistic
\[ \chi^2 = \frac{N\left(|ad - bc| - \frac{N}{2}\right)^2}{(a + b)(a + c)(b + d)(c + d)} \]
The statistic $\chi^2$ follows the $\chi^2$ distribution with one degree of freedom.
Conclusion
\[ = \frac{\left[\,|9^2 - 6^2| - 15\,\right]^2 \times 30}{15 \times 15 \times 15 \times 15} = 0.53 \]
Conclusion: Since $\chi^2 < \chi^2_{0.05,1}$, accept H0 and conclude that the populations from which the
two samples are drawn have the same frequency distribution.
TEST 45
Aim
To test whether the K random samples could have come from K populations with the same frequency distribution.
Source
K independent random samples of sizes n1, n2, , nk are drawn.
Assumptions
The sample sizes of the K samples are sufficiently large.
Null Hypothesis
H0: The populations from which the K samples are drawn have the same frequency distribution.
Alternative Hypothesis
H1: The populations from which the K samples are drawn have different frequency distributions.
Method
1. The median of the combined sample of size $N = n_1 + n_2 + \cdots + n_K$ is found.
2. For each of the samples, find the number of observations below and above the median, then
form a 2 × K table as follows:
                Samples
                1        2        ...   j        ...   K        Total
Above Median    a_11     a_12     ...   a_1j     ...   a_1K     A
Below Median    a_21     a_22     ...   a_2j     ...   a_2K     B
Total           a_1      a_2      ...   a_j      ...   a_K      N
In this table $a_{1j}$ represents the number of observations above the median and $a_{2j}$ is the number of
observations below the median in the jth sample (j = 1, 2, ..., K).
Test Statistic
\[ \chi^2 = \sum_{j=1}^{K} \frac{(a_{1j} - e_{1j})^2}{e_{1j}} + \sum_{j=1}^{K} \frac{(a_{2j} - e_{2j})^2}{e_{2j}} \]
Conclusion
Example
Five independent random samples are drawn with sizes 45, 65, 55, 85 and 62. The median of the
combined sample is found, and the numbers of observations above and below the median for each
sample are tabulated as follows. Examine whether the five random samples can be regarded
as drawn from five populations with the same frequency distribution. Test at the 5% level of significance.
                Samples
                1     2     3     4     5     Total
Above Median    20    30    25    40    30    145
Below Median    25    35    30    45    32    167
Total           45    65    55    85    62    312
H0: The populations from which the five samples are drawn have the same frequency distribution.
H1: The populations from which the five samples are drawn have different frequency distributions.
Level of Significance: α = 0.05 and Critical Value: $\chi^2_{0.05,4} = 9.49$
Calculations:
e11 = 145 × 45/312 = 20.91    e21 = 167 × 45/312 = 24.08
e12 = 145 × 65/312 = 30.21    e22 = 167 × 65/312 = 34.79
e13 = 145 × 55/312 = 25.56    e23 = 167 × 55/312 = 29.44
e14 = 145 × 85/312 = 39.50    e24 = 167 × 85/312 = 45.50
e15 = 145 × 62/312 = 28.80    e25 = 167 × 62/312 = 33.18
Test Statistic:
\[ \chi^2 = \sum_{j=1}^{K} \frac{(a_{1j} - e_{1j})^2}{e_{1j}} + \sum_{j=1}^{K} \frac{(a_{2j} - e_{2j})^2}{e_{2j}} \]
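The expected frequencies and the statistic for this 2 × 5 table can be evaluated with a short sketch. It assumes, as the calculations above do, that each expected frequency is (row total × column total) / N; the variable names are illustrative.

```python
above = [20, 30, 25, 40, 30]   # observations above the combined median, per sample
below = [25, 35, 30, 45, 32]   # observations below the combined median, per sample

col_totals = [a + b for a, b in zip(above, below)]
row_above = sum(above)
row_below = sum(below)
n_total = row_above + row_below

# Expected frequency = row total * column total / N.
expected_above = [row_above * c / n_total for c in col_totals]
expected_below = [row_below * c / n_total for c in col_totals]

observed = above + below
expected = expected_above + expected_below
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical = 9.49  # chi-square critical value for 4 d.f. at the 5% level
print(f"chi-square = {chi2:.4f}")
print("reject H0" if chi2 > critical else "do not reject H0")
```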
TEST 46
Aim
To test whether the two samples have been drawn from populations having the same density function.
Definition (RUN)
A run is defined as a sequence of letters of one type surrounded by a sequence of letters of the
other type, and the number of elements in a run is referred to as the length of the run.
Source
A random sample of $n_1$ observations, arranged in order of magnitude as $X_1, X_2, \ldots, X_{n_1}$, is drawn
from a population with density function $f_1(.)$, and a random sample of $n_2$ observations, arranged in
order of magnitude as $Y_1, Y_2, \ldots, Y_{n_2}$, is drawn from another population with density function $f_2(.)$.
Assumption
The two samples are drawn independently.
Null Hypothesis
H0: The populations from which the two samples are drawn have the same density function, i.e.,
H0: $f_1(.) = f_2(.)$.
Alternative Hypothesis
H1: $f_1(.) \neq f_2(.)$.
Method
1. Combine the two samples and arrange the observations in order of magnitude, say, $X_1\,X_2\,Y_1\,X_3\,Y_2\,Y_3\,X_4\,Y_4\,X_5 \ldots$
such that $X_1 < X_2 < Y_1 < X_3 < Y_2 < Y_3 < X_4 < Y_4 < X_5 \ldots$ Let the combined
ordered observations be $Z = \{Z_{(1)}, Z_{(2)}, \ldots, Z_{(n_1+n_2)}\}$ such that $Z_{(1)} < Z_{(2)} < \cdots < Z_{(n_1+n_2)}$, where
each $Z_{(i)}$ is either an X or a Y. Replacing each X by a 0 and each Y by a 1, one gets a sequence
of $n_1$ 0s and $n_2$ 1s in Z.
2. Let $r_1$ be the number of runs of 0s and $r_2$ be the number of runs of 1s.
Test Statistic
$U = r_1 + r_2$
Conclusion
Note
For sufficiently large $n_1$ and $n_2$ (i.e., $n_1 > 10$, $n_2 > 10$), the statistic becomes
\[ Z = \frac{U - E(U)}{\sqrt{\mathrm{Var}(U)}} \]
\[ E(U) = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \]
\[ \mathrm{Var}(U) = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)} \]
which may be compared with Table 1, as the statistic Z follows the Standard Normal distribution.
Example
A random sample of 8 households is selected from village A, whose daily expenses on milk are 11
15 17 19 25 27 31 33. Another sample of 9 households is selected from village B, whose expenses on
milk are 12 16 20 22 28 30 36 38 42. Test whether the households of the two villages spend the same
on daily milk expenses.
Solution
H0: The households of the two villages spend the same on daily milk expenses.
H1: The households of the two villages do not spend the same on daily milk expenses.
Level of Significance: α = 0.05 and Critical value: $U_{0.05,(8,9)} = 5$.
Calculations:
The pooled ordered observations are
11 12 15 16 17 19 20 22 25 27 28 30 31 33 36 38 42
The representation with 0 for Xs and 1 for Ys is
01010011001100111
Here 0 and 1 have 5 runs each, i.e., $r_1 = 5$ and $r_2 = 5$.
Test Statistic: $U = r_1 + r_2 = 10$.
Conclusion: Since $U > U_{0.05,(8,9)}$, H0 is accepted and we conclude that the households of the two
villages spend the same on daily milk expenses.
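The run counts in this example can be checked with a short sketch; the helper name `count_runs` is an illustrative choice, and the data are the two village samples given above.

```python
village_a = [11, 15, 17, 19, 25, 27, 31, 33]
village_b = [12, 16, 20, 22, 28, 30, 36, 38, 42]

# Pool the samples, label X as "0" and Y as "1", and sort by value.
pooled = sorted([(v, "0") for v in village_a] + [(v, "1") for v in village_b])
sequence = "".join(label for _, label in pooled)


def count_runs(seq, symbol):
    """Count maximal blocks of consecutive `symbol` characters in seq."""
    runs = 0
    previous = None
    for s in seq:
        if s == symbol and previous != symbol:
            runs += 1
        previous = s
    return runs


r1 = count_runs(sequence, "0")
r2 = count_runs(sequence, "1")
u = r1 + r2

print(sequence)
print(f"r1 = {r1}, r2 = {r2}, U = {u}")
```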
TEST 47
Aim
To test whether the K random samples are drawn from K populations having the same mean.
Source
K random samples, with sizes $n_i$ (i = 1, 2, ..., K), are drawn independently from K populations.
Let $n_1 + n_2 + \cdots + n_K = N$.
Assumptions
(i) The size of each sample should be at least 5.
(ii) The sample sizes need not be equal.
(iii) The frequency distributions of the K populations should be continuous.
Null Hypothesis
H0: The means of the K populations are equal.
Alternative Hypothesis
H1: The means of the K populations are not equal.
Method
1. Combine all the K samples and arrange the observations in increasing order of magnitude.
2. Assign ranks to the combined observations Z. If observations are equal, the mean of the
available rank numbers is assigned to each of them.
3. Find the rank sum of each of the K samples in the combined ordered sample.
4. Let $R_i$ be the rank sum of the ith sample.
Test Statistic
\[ H = \frac{12}{N(N + 1)} \sum_{i} \frac{R_i^2}{n_i} - 3(N + 1) \]
The statistic H follows the $\chi^2$ distribution with (K − 1) degrees of freedom.
Conclusion
Sample    1      1      1      1      1      1      1      1      1      2
Value     11.7   11.9   16.1   17.5   20.5   25.1   30.5   32.1   82.5   19.6
Rank      1      2      3      4      7      10.5   14     15     20     6

Sample    2      2      2      2      2      3      3      3      3      3
Value     21.8   25.2   33.2   33.2   34.1   18.4   22.9   25.1   29.7   33.5
Rank      8      12     16.5   16.5   19     5      9      10.5   13     18
Solution
H0: The mean weight of the children from the three populations is the same.
H1: The mean weight of the children from the three populations is not the same.
Level of Significance: α = 0.10 and Critical Value: $\chi^2_{0.10,2} = 4.61$
Calculations:
$n_1 = 9$; $n_2 = 6$; $n_3 = 5$; N = 20;
$R_1 = 76.5$, $R_2 = 78$, $R_3 = 55.5$
Test Statistic:
\[ H = \frac{12}{N(N + 1)} \sum_{i} \frac{R_i^2}{n_i} - 3(N + 1) = \frac{12}{20 \times 21}\left[\frac{76.5^2}{9} + \frac{78^2}{6} + \frac{55.5^2}{5}\right] - 3 \times 21 = 2.15 \]
Conclusion: Since $H < \chi^2_2(\alpha)$, H0 is accepted and we conclude that the mean weight of the children
from the three populations is the same.
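The ranking and the H statistic for the tabulated weights can be recomputed with the sketch below. The helper `average_rank` is an illustrative name; it assigns tied values the mean of the ranks they occupy, as the Method describes.

```python
samples = [
    [11.7, 11.9, 16.1, 17.5, 20.5, 25.1, 30.5, 32.1, 82.5],  # sample 1
    [19.6, 21.8, 25.2, 33.2, 33.2, 34.1],                    # sample 2
    [18.4, 22.9, 25.1, 29.7, 33.5],                          # sample 3
]

pooled = sorted(v for s in samples for v in s)


def average_rank(value):
    """Rank of value in the pooled sample; ties share the mean of their ranks."""
    first = pooled.index(value) + 1      # lowest rank occupied by this value
    count = pooled.count(value)          # number of tied observations
    return first + (count - 1) / 2


rank_sums = [sum(average_rank(v) for v in s) for s in samples]
n_total = len(pooled)

h = 12 / (n_total * (n_total + 1)) * sum(
    r * r / len(s) for r, s in zip(rank_sums, samples)
) - 3 * (n_total + 1)

print("rank sums:", rank_sums)
print(f"H = {h:.4f}")
```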
TEST 48
Aim
To test whether the two random samples are drawn from populations having the same mean, based on
the rank sum of the sample.
Source
A random sample of $n_1$ observations, arranged in order of magnitude as $X_1, X_2, \ldots, X_{n_1}$, is drawn
from a population with density function $f_1(.)$, and a random sample of $n_2$ observations, arranged in
order of magnitude as $Y_1, Y_2, \ldots, Y_{n_2}$, is drawn from another population with density function $f_2(.)$.
Assumptions
(i) The two samples drawn are independent.
(ii) The populations have continuous frequency distributions.
Null Hypothesis
H0: The populations from which the samples are drawn have the same mean.
Alternative Hypothesis
H1: The populations from which the samples are drawn have different means.
Method
1. Combine the two samples and arrange the observations in order of magnitude, say, $X_1\,X_2\,Y_1\,X_3\,Y_2\,Y_3\,X_4\,Y_4\,X_5 \ldots$
such that $X_1 < X_2 < Y_1 < X_3 < Y_2 < Y_3 < X_4 < Y_4 < X_5 \ldots$ Let the combined
ordered observations be $Z = \{Z_{(1)}, Z_{(2)}, \ldots, Z_{(n_1+n_2)}\}$ such that $Z_{(1)} < Z_{(2)} < \cdots < Z_{(n_1+n_2)}$, where
each $Z_{(i)}$ is either an X or a Y.
2. Assign ranks to the combined observations Z. If observations are equal, the mean of the
available rank numbers is assigned to each of them.
3. Find the rank sum of the smaller sample and denote it by $R^{(1)}$.
4. If the two samples are of equal size, then let $R^{(1)}$ be the smaller of the two rank sums.
5. Let n be the sample size of the smaller sample.
6. Let N be the sum of the two sample sizes, i.e., $N = n_1 + n_2$.
7. Calculate $R^{(2)} = n(N + 1) - R^{(1)}$.
Test Statistic
$R = \text{Min}(R^{(1)}, R^{(2)})$
Conclusion
Solution
H0: The mean weight of the adults from the two cities is the same.
H1: The mean weight of the adults from the two cities is not the same.
Level of Significance: α = 0.05 and Critical Value: R0.05,(10,9) = 69.
Calculations:
Combine the two samples, assign ranks to the observations, and rearrange the X and Y observations
with their ranks as follows.
MANN-WHITNEY-WILCOXON U-TEST
Aim
To test whether two random samples are drawn from populations having the same density
function.
Source
Assumptions
(i) The two samples drawn are independent.
(ii) The populations have continuous frequency distributions.
(iii) The sample sizes should be sufficiently large.
Null Hypothesis
H0: The populations from which the two samples are drawn have the same density function, i.e.,
H0: f1(.) = f2(.).
Alternative Hypothesis
Method
1. Combine the two samples and arrange the observations in order of magnitude, say, X1 X2 Y1
X3 Y2 Y3 X4 Y4 X5 such that X1 < X2 < Y1 < X3 < Y2 < Y3 < X4 < Y4 < X5. Let the combined
ordered observations be Z = {Z(1), Z(2), …, Z(n1 + n2)} such that Z(1) < Z(2) < … < Z(n1 + n2), where
each Z(i) is either an X or a Y.
2. Assign ranks to the combined observations Z. If observations are equal, the mean of the
available rank numbers is assigned.
3. Find the rank sum of the Ys in the combined ordered sample and denote it by T.
4. Calculate
$$U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - T$$
Test Statistic
$$Z = \frac{U - E(U)}{\sqrt{\mathrm{Var}(U)}}$$
$$E(U) = \frac{n_1 n_2}{2}, \qquad \mathrm{Var}(U) = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$
The statistic Z follows the standard normal distribution.
Conclusion
If |Z| ≤ Zα/2, accept H0, and if |Z| > Zα/2, reject H0 or accept H1.
Example
Two independent samples of 15 students each from two universities namely Annamalai University
(A) and Banaras Hindu University (B) are drawn. The scores obtained by students of the two universities
in an Aptitude test are given below. Test whether the two samples have been drawn from the populations
having the same distribution at 5% level of significance.
A: 920 840 780 850 830 930 800 860 760 730 740 680 670 540 710
B: 870 890 620 650 700 720 750 660 810 790 950 690 640 600 770
Solution
H0: The two samples come from the same population distribution.
H1: The population distributions of the two samples are not the same.
Level of Significance: α = 0.05, Critical Value: Zα/2 = 1.96
Calculations:
1. The two samples are combined, arranged in order of magnitude and assigned ranks as
follows:
Value  Rank  Sample    Value  Rank  Sample
540    1     A         760    16    A
600    2     B         770    17    B
620    3     B         780    18    A
640    4     B         790    19    B
650    5     B         800    20    A
660    6     B         810    21    B
670    7     A         830    22    A
680    8     A         840    23    A
690    9     B         850    24    A
700    10    B         860    25    A
710    11    A         870    26    B
720    12    B         890    27    B
730    13    A         920    28    A
740    14    A         930    29    A
750    15    B         950    30    B

A      Rank            B      Rank
920    28              870    26
840    23              890    27
780    18              620    3
850    24              650    5
830    22              700    10
930    29              720    12
800    20              750    15
860    25              660    6
760    16              810    21
730    13              790    19
740    14              950    30
680    8               690    9
670    7               640    4
540    1               600    2
710    11              770    17
Rank Sum 259           Rank Sum 206
$$U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - T = 15 \times 15 + \frac{15(15 + 1)}{2} - 206 = 139$$
$$E(U) = \frac{n_1 n_2}{2} = \frac{15 \times 15}{2} = 112.5$$
$$\mathrm{Var}(U) = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} = \frac{15 \times 15 \times (15 + 15 + 1)}{12} = 581.25$$
Test Statistic:
$$Z = \frac{U - E(U)}{\sqrt{\mathrm{Var}(U)}} = \frac{139 - 112.5}{\sqrt{581.25}} = 1.1$$
Conclusion: Since |Z| < Zα/2, we accept H0 and conclude that the two samples come from the same
population distribution.
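As a check on the arithmetic, the following Python sketch recomputes T, U and Z from the scores of the two universities. It relies on the fact that these data contain no ties, so a rank is simply the position in the sorted pooled sample; the variable names are illustrative.

```python
import math

a = [920, 840, 780, 850, 830, 930, 800, 860, 760, 730, 740, 680, 670, 540, 710]
b = [870, 890, 620, 650, 700, 720, 750, 660, 810, 790, 950, 690, 640, 600, 770]

# No ties occur here, so rank = 1-based position in the sorted pooled sample.
pooled = sorted(a + b)
rank = {value: position + 1 for position, value in enumerate(pooled)}

n1, n2 = len(a), len(b)
t = sum(rank[y] for y in b)              # rank sum T of the second sample (B)
u = n1 * n2 + n2 * (n2 + 1) / 2 - t      # U = n1 n2 + n2(n2 + 1)/2 - T
mean_u = n1 * n2 / 2
var_u = n1 * n2 * (n1 + n2 + 1) / 12
z = (u - mean_u) / math.sqrt(var_u)

print(t, u, round(z, 2))
```

With tied observations the dictionary lookup would no longer be valid and mean ranks would have to be assigned instead.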
TEST 50
Aim
To test whether the population distribution F(x) can be regarded as F0(x), based on a random sample.
Source
Let Xi (i = 1, 2, …, n) be a random sample of n observations drawn from a population. Let F0(x)
be the cumulative distribution function of a specified (given) population.
Null Hypothesis
H0: The population distribution F(x) is F 0(x).
Alternative Hypothesis
H1: The population distribution F(x) is not F 0(x).
Method
1. Calculate the cumulative distribution F0(x) based on the sample observations and the specified
(given) population distribution.
2. Obtain the cumulative distribution of the sample, Fn(x), the empirical distribution function
defined as a step function, Fn(x) = (Number of observations Xi ≤ x)/n.
3. Find the absolute difference |F0(x) − Fn(x)|.
Test Statistic
D = Max |F0(x) − Fn(x)|
Conclusion
Solution
H0: The given sample is drawn from a standard normal distribution.
H1: The given sample is not drawn from a standard normal distribution.
Level of Significance: α = 0.05 and Critical Value: D0.05,20 = 0.294
Calculations:
Aim
To test whether two population distributions are identical, based on the two sample distributions.
Source
Let Xi (i = 1, 2, …, n) be a random sample of n observations drawn from a population. Let Yi
(i = 1, 2, …, n) be a random sample of n observations drawn from another population.
Null Hypothesis
H0: The two population distributions are identical. i.e., There is no significant difference between
the two sample distributions.
Alternative Hypothesis
H1: The two population distributions are not identical. i.e., there is a significant difference between
the two sample distributions.
Method
1. Calculate the cumulative frequencies for each of the observations Xi and denote them by C(x),
and for each of the observations Yi and denote them by C(y).
2. Obtain the cumulative distributions of the two samples, Fn(x) and Fn(y), the empirical
distribution functions defined as step functions, Fn(x) = (Number of observations Xi ≤ x)/n
and Fn(y) = (Number of observations Yi ≤ y)/n.
3. Find the absolute difference |Fn(x) − Fn(y)|.
Test Statistic
D = Max |Fn(x) − Fn(y)|
Conclusion
Example
The following data denotes the lifetime of bulbs of two different brands. Test whether the brands
differ with respect to average life.
Brand-I: 80 100 90 110 125 130 70
Brand-II: 100 120 80 140 130 160 115 120
Solution
H0: The average lifetimes of two brands of bulbs are equal.
H1: The average lifetimes of two brands of bulbs are not equal.
Level of Significance: α = 0.10 and Critical Value: D0.10,7,8 = 33/56 = 0.5893
Calculations:
x      F7(x)   F8(y)   |F7(x) − F8(y)|
70     1/7     0       1/7
80     2/7     1/8     9/56
90     3/7     1/8     17/56
100    4/7     2/8     9/28
110    5/7     2/8     13/28
115    5/7     3/8     19/56
120    5/7     5/8     5/56
125    6/7     5/8     13/56
130    1       6/8     1/4
140    1       7/8     1/8
160    1       1       0
Test Statistic:
D = Max |Fn(x) − Fn(y)| = 13/28 = 0.4643
Conclusion: Since D < Dα, accept H0 and conclude that the average lifetimes of the two brands of
bulbs are equal.
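The table of differences can be reproduced with exact fractions. The sketch below evaluates both empirical distribution functions at every observed value and takes the largest absolute difference; the function name `ecdf` is illustrative.

```python
from fractions import Fraction

brand1 = [80, 100, 90, 110, 125, 130, 70]
brand2 = [100, 120, 80, 140, 130, 160, 115, 120]


def ecdf(sample, x):
    """Empirical distribution function: share of observations <= x."""
    return Fraction(sum(1 for v in sample if v <= x), len(sample))


# The step functions only change at observed values, so those points suffice.
points = sorted(set(brand1 + brand2))
differences = {x: abs(ecdf(brand1, x) - ecdf(brand2, x)) for x in points}
d = max(differences.values())

print(d, float(d))
```

Using `Fraction` avoids rounding, so each entry of `differences` can be compared directly with the corresponding row of the table.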
TEST 52
Aim
To test the existence of correlation between paired observations in the population, based
on a sample.
Source
Let (Xi, Yi), i = 1, 2, …, n be a random sample of n pairs of observations.
Assumptions
(i) The population distribution is continuous.
(ii) The observations should be obtained in pairs.
Null Hypothesis
H0: There is no correlation between the pairs (X, Y).
Alternative Hypothesis
H1: There exists correlation between the pairs (X, Y).
Method
1. Assign ranks to each of the observations Xi and Yi independently and denote them by r(Xi)
and r(Yi) respectively.
2. For each pair of observations, find the difference of the ranks di = r(Xi) − r(Yi),
i = 1, 2, …, n.
3. Calculate
$$r = \sum_{i=1}^{n} d_i^2$$
Test Statistic
$$R = 1 - \frac{6r}{n(n^2 - 1)}$$
Conclusion
Example
Two judges have ranked the ten competitors who attended a beauty competition as follows.
Test whether the rank correlation between the two judges is significant at 5% level of significance.
Judge-I: 2 4 7 8 3 1 5 9 10 6
Judge-II: 3 5 6 7 2 1 4 8 9 10
Solution
H0: There is no correlation between the two judges in the competition.
H1: There exists correlation between the two judges in the competition.
Level of Significance: α = 0.05 and Critical Value: R0.05,10 = 0.5515.
Calculations:
di = −1 −1 1 1 1 0 1 1 1 −4
$$r = \sum d_i^2 = 24, \qquad n = 10.$$
Test Statistic:
$$R = 1 - \frac{6r}{n(n^2 - 1)} = 0.8545$$
Conclusion: Since R > Rα, H0 is rejected, and it is concluded that there exists correlation between the
two judges in the competition.
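The rank differences and the value of R can be verified in a few lines of Python. The data are already ranks, so no ranking step is needed; the variable names are illustrative.

```python
judge1 = [2, 4, 7, 8, 3, 1, 5, 9, 10, 6]
judge2 = [3, 5, 6, 7, 2, 1, 4, 8, 9, 10]

n = len(judge1)
diffs = [x - y for x, y in zip(judge1, judge2)]   # d_i = r(X_i) - r(Y_i)
r = sum(d * d for d in diffs)                     # sum of squared differences
rank_corr = 1 - 6 * r / (n * (n * n - 1))         # R = 1 - 6r / (n(n^2 - 1))

print(diffs, r, round(rank_corr, 4))
```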
TEST 53
Aim
To test whether the order of the observations in a sample, obtained from any experiment, is random.
Source
A sample of n observations is drawn from any experiment.
Assumptions
(i) The sample observations are obtained under similar conditions.
(ii) The observations are retained in the order in which they occur. That is, Xi is the ith observation in
the outcome of an experiment.
Null Hypothesis
H0: The sample observations obtained are random.
Alternative Hypothesis
H1: The sample observations obtained are not random.
Method
1. Find the median for the given sample observations.
2. All the observations in the sample larger than the median value are assigned a + sign and
those below the median are assigned a − sign.
3. If the number of observations is odd, the median is deleted.
4. A succession of values with the same sign is called a run.
5. The number of runs in the sample, in the order in which they occur, is found and is denoted
by K.
Test Statistic
K = Number of runs in the sample, in the order in which they occur.
Conclusion
$$E(K) = n + 1, \qquad \mathrm{Var}(K) = \frac{n(2n - 2)}{2(2n - 1)}$$
where n here denotes the number of observations carrying each sign (12 in the example below),
which may be compared with Table 1, as the statistic Z follows the standard normal distribution.
Example
The following data denotes the length of iron rods (in cms.) of a sample of 24 units manufactured
by an industry. Test whether the sample drawn is random at 10% level of significance.
21.02 20.08 20.05 19.70 19.13 17.09
20.09 19.40 20.56 20.97 20.17 21.35
19.64 20.82 21.26 20.75 20.74 21.59
20.75 21.01 19.09 18.73 18.45 19.80
Solution
H0: The sample observations obtained are random.
H1: The sample observations obtained are not random.
Level of Significance: α = 0.10.
Critical Value: K0.10,12 = 8 (lower), 18 (upper).
Calculations:
Number of observations, n = 24. Median = 20.13
Number of observations above the median, n1 = 12
Number of observations below the median, n2 = 12
21.02  20.08  20.05  19.70  19.13  17.09
(+)    (−)    (−)    (−)    (−)    (−)
20.09  19.40  20.56  20.97  20.17  21.35
(−)    (−)    (+)    (+)    (+)    (+)
19.64  20.82  21.26  20.75  20.74  21.59
(−)    (+)    (+)    (+)    (+)    (+)
20.75  21.01  19.09  18.73  18.45  19.80
(+)    (+)    (−)    (−)    (−)    (−)
Test Statistic: K = Number of runs = 6.
Conclusion: Since K lies in the critical region, H0 is rejected, and it is concluded that the sample
observations drawn are not random.
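The sign pattern and the run count can be reproduced with the sketch below, which reads the 24 lengths row by row in the order given. It uses the standard library `statistics.median`; the variable names are illustrative.

```python
import statistics

lengths = [
    21.02, 20.08, 20.05, 19.70, 19.13, 17.09,
    20.09, 19.40, 20.56, 20.97, 20.17, 21.35,
    19.64, 20.82, 21.26, 20.75, 20.74, 21.59,
    20.75, 21.01, 19.09, 18.73, 18.45, 19.80,
]

median = statistics.median(lengths)

# '+' above the median, '-' below; values equal to the median are dropped.
signs = ["+" if x > median else "-" for x in lengths if x != median]

# A new run starts at every change of sign.
runs = 1 + sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)

print(round(median, 2), "".join(signs), runs)
```

The count K is then compared with the lower and upper critical values for the chosen level of significance.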
TEST 54
Aim
To test whether the fluctuations in a sample are random in nature.
Source
A sample of n observations is drawn as a time series data.
Null Hypothesis
H0: The fluctuation in the sample is random.
Alternative Hypothesis
H1: The fluctuation in the sample is not random.
Method
1. The observations in the sample are given serial numbers in the order in which they occur, and
they are denoted by Xi, i = 1, 2, …, n.
2. Ranks are given to the observations according to increasing order of magnitude and
are denoted by Yi, i = 1, 2, …, n.
3. Find di = Xi − Yi, i = 1, 2, …, n.
4. Find
$$r = \sum_{i=1}^{n} d_i^2$$
Test Statistic
$$Z = \frac{6r - n(n^2 - 1)}{n(n + 1)\sqrt{n - 1}}$$
The statistic Z follows the standard normal distribution.
Conclusion
If |Z| ≤ Zα/2, accept H0, and if |Z| > Zα/2, reject H0 or accept H1.
Example
The monthly rainfall (in cms) recorded by a meteorological station over a period of twelve months
in a city is given below. Test whether the rainfall is random over the entire year at 5% level of significance.
Month (X): 1 2 3 4 5 6 7 8 9 10 11 12
Rain (Y): 12.5 10.7 14.5 10.2 8.5 12.8 15.5 16.8 22.5 26.5 28.2 30.5
Solution
H0: The rainfall over the entire year is random in nature.
H1: The rainfall over the entire year is not random in nature.
Level of Significance: α = 0.05 and Critical Value: Zα/2 = 1.96.
Calculations: n = 12
RX: 1 2 3 4 5 6 7 8 9 10 11 12
RY: 4 3 6 2 1 5 7 8 9 10 11 12
$$r = \sum d_i^2 = 40$$
Test Statistic:
$$Z = \frac{6r - n(n^2 - 1)}{n(n + 1)\sqrt{n - 1}} = \frac{(6 \times 40) - 12(144 - 1)}{12 \times 13 \times \sqrt{11}} = \frac{240 - 1716}{517.39} = -2.85$$
Conclusion: Since |Z| > Zα/2, reject H0 and conclude that the rainfall over the entire year is not
random in nature.
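A short Python sketch of the same calculation follows. It ranks the rainfall figures (which contain no ties), forms the squared differences between month number and rank, and evaluates Z; the variable names are illustrative.

```python
import math

rain = [12.5, 10.7, 14.5, 10.2, 8.5, 12.8, 15.5, 16.8, 22.5, 26.5, 28.2, 30.5]
n = len(rain)

# Rank the rainfall values in increasing order (no ties in these data).
order = sorted(range(n), key=lambda i: rain[i])
rank = [0] * n
for position, index in enumerate(order):
    rank[index] = position + 1

# d_i = serial number of the month minus the rank of its rainfall.
r = sum((month - rank[month - 1]) ** 2 for month in range(1, n + 1))
z = (6 * r - n * (n * n - 1)) / (n * (n + 1) * math.sqrt(n - 1))

print(rank, r, round(z, 2))
```

The sign of Z is negative here because the ranks rise almost in step with the months, which makes r much smaller than its expected value under randomness.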
TEST 55
Aim
To test the significance of the differences in response for K treatments applied to n subjects.
Source
The data are obtained as a two-way table having n rows (subjects) and K columns (treatments).
Assumptions
(i) The response to one treatment by a subject is not affected by the same subject's response to
another treatment.
(ii) The response distribution is continuous for each subject.
Null Hypothesis
H0: The effects of the K treatments are the same.
Alternative Hypothesis
H1: The effects of the K treatments are not the same.
Method
1. The data are represented by a table of n rows and K columns.
2. The rank numbers 1, 2, …, K are assigned in increasing order of magnitude to the values in
each row.
3. The rank sum Rj (j = 1, 2, …, K) is calculated for each of the K columns.
Test Statistic
$$G = \frac{12}{nK(K + 1)} \sum_{j=1}^{K} R_j^2 - 3n(K + 1)$$
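The statistic G is easy to evaluate once the column rank sums are known. The sketch below is a minimal illustration that plugs in the rank sums 47, 39, 35, 29 and n = 15 quoted in the worked example of this test; it applies the formula exactly as stated here and makes no adjustment for tied values, which is an assumption of the sketch. The function name `friedman_g` is illustrative.

```python
def friedman_g(rank_sums, n):
    """G = 12 / (n K (K + 1)) * sum(R_j^2) - 3 n (K + 1), without tie correction."""
    k = len(rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)


rank_sums = [47, 39, 35, 29]   # column rank sums R_j from the worked example
n = 15                         # number of subjects (rows)

# The rank sums of a complete table must add up to n K (K + 1) / 2.
k = len(rank_sums)
total_ok = sum(rank_sums) == n * k * (k + 1) // 2

g = friedman_g(rank_sums, n)
print(total_ok, round(g, 2))
```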
Conclusion
Example
Four experts were appointed to an interview board, and fifteen candidates attended
the interview. The following are the points given to the candidates by the experts. Test whether the
points given by the experts to the candidates differ significantly at 5% level of significance.
C1 C2 C3 C4
1 8 8 10 10
2 7 9 9 9
3 10 8 8 10
4 8 8 10 10
5 9 9 9 10
6 9 9 10 9
7 8 9 8 9
8 8 8 8 8
9 9 9 10 9
10 9 9 9 9
11 9 10 9 10
12 7 9 9 9
13 10 10 10 10
14 9 9 9 10
15 7 10 10 10
Solution
H0: The points given by the experts to the candidates do not differ significantly.
H1: The points given by the experts to the candidates differ significantly.
Level of Significance: α = 0.05 and Critical Value: χ²(0.05, 3) = 7.81
Calculations:
Candidates (n)  Ranks
Rj  47  39  35  29
n = 15; K = 4.
$$S = \sum_{j} (R_j - \bar{R})^2 = 171$$
ti = Number of times any observation is repeated for each of the candidates.
fi = frequency of ti.
ti     fi    fi ti   fi ti³
1      7     7       7
2      10    20      80
3      7     21      189
4      3     12      192
Total                468
$$D = \sum f_i t_i^3 = 468$$
Test Statistic:
SEQUENTIAL TESTS
TEST 56
Aim
To test whether the mean of a population has a specified value, based on sequential observations.
Source
A random sample of observations is drawn sequentially as necessary.
Assumption
The observations drawn are independent and follow a normal distribution with known variance
σ².
Null Hypothesis
H0: The mean of a population, μ, has a specified value μ0,
i.e., H0: μ = μ0.
Alternative Hypothesis
H1: The mean of a population, μ, has a specified value μ1,
i.e., H1: μ = μ1.
Method
(i) Fix the probabilities of Type-I and Type-II errors, α and β, at a minimum level.
(ii) Choose c as a convenient value close to (μ0 + μ1)/2.
(iii) Calculate the following two boundary lines for every successive observation m:
$$a_m = \frac{\sigma^2}{\mu_1 - \mu_0} \log\frac{\beta}{1 - \alpha} + m\left(\frac{\mu_0 + \mu_1}{2} - c\right)$$
$$r_m = \frac{\sigma^2}{\mu_1 - \mu_0} \log\frac{1 - \beta}{\alpha} + m\left(\frac{\mu_0 + \mu_1}{2} - c\right)$$
(iv) Plot the above two lines in a graph.
(v) For each m, find the cumulative sum of (xi − c) and plot it in the graph.
(vi) For every stage of m, the following decision is made, as provided in the conclusion.
Conclusion
(i) Accept H0 if $\sum_{i=1}^{m} (x_i - c) \le a_m$
(ii) Accept H1 if $\sum_{i=1}^{m} (x_i - c) \ge r_m$
(iii) Continue sampling if $a_m < \sum_{i=1}^{m} (x_i - c) < r_m$
Example
An ancillary industry manufactures copper plates for major industries. Test whether the mean
length of their products can be considered as either 8.30 cms or 8.33 cms by taking the sample units
sequentially, given that the standard deviation of the length is 0.02 cms. Let α = β = 0.05. The successive
observations are 8.34 8.29 8.30 8.31 8.32 8.30.
Solution
H0: μ = 8.30. H1: μ = 8.33.
Given that α = β = 0.05, σ = 0.02, (μ0 + μ1)/2 = 8.315.
$$\frac{\sigma^2}{\mu_1 - \mu_0} \log\frac{\beta}{1 - \alpha} = -0.039; \qquad \frac{\sigma^2}{\mu_1 - \mu_0} \log\frac{1 - \beta}{\alpha} = 0.039$$
Critical boundary lines are:
m    1       2       3      4      5      6      7      8      9      10
am   −0.024  −0.009  0.006  0.021  0.036  0.051  0.066  0.081  0.096  0.111
rm   0.054   0.069   0.084  0.099  0.114  0.129  0.144  0.159  0.174  0.189
[Figure: cumulative sum plotted against sample size m (1 to 10), with the regions "Accept H1", "Cont. Ins." and "Accept H0" separated by the two boundary lines.]
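The sequential procedure for this example can be followed step by step in Python. The text does not state the value of c used; the sketch assumes c = 8.30, which is the value that reproduces the tabulated slope of 0.015 per observation. All variable names are illustrative.

```python
import math

mu0, mu1, sigma = 8.30, 8.33, 0.02
alpha = beta = 0.05
c = 8.30  # assumption: chosen so that (mu0 + mu1)/2 - c = 0.015, as in the table
observations = [8.34, 8.29, 8.30, 8.31, 8.32, 8.30]

scale = sigma ** 2 / (mu1 - mu0)
slope = (mu0 + mu1) / 2 - c
lower0 = scale * math.log(beta / (1 - alpha))    # intercept of a_m
upper0 = scale * math.log((1 - beta) / alpha)    # intercept of r_m

decision = "continue sampling"
cumulative = 0.0
for m, x in enumerate(observations, start=1):
    cumulative += x - c
    a_m = lower0 + m * slope
    r_m = upper0 + m * slope
    print(m, round(a_m, 3), round(cumulative, 3), round(r_m, 3))
    if cumulative <= a_m:
        decision = f"accept H0 at m = {m}"
        break
    if cumulative >= r_m:
        decision = f"accept H1 at m = {m}"
        break

print(decision)
```

Under this assumption the cumulative sum stays between the two lines for all six observations, so the sketch ends in the continuation region.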
TEST 57
Aim
To test whether the standard deviation of a population has a specified value, based on sequential
observations.
Source
A random sample of observations is drawn sequentially as necessary.
Assumption
The observations drawn are independent and follow a normal distribution with known mean μ.
Null Hypothesis
H0: The standard deviation of a population, σ, has a specified value σ0,
i.e., H0: σ = σ0.
Alternative Hypothesis
H1: The standard deviation of a population, σ, has a specified value σ1,
i.e., H1: σ = σ1.
Method
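(i) Fix the probabilities of Type-I and Type-II errors, α and β, at a minimum level.
(ii) Calculate the following two boundary lines for every successive observation m:
$$a_m = \frac{2\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{\beta}{1 - \alpha} + m\,\frac{\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{\sigma_1^2}{\sigma_0^2}$$
$$r_m = \frac{2\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{1 - \beta}{\alpha} + m\,\frac{\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{\sigma_1^2}{\sigma_0^2}$$
(vi) For every stage of m, the following decision is made, as provided in the conclusion.
Conclusion
(i) Accept H0 if $\sum_{i=1}^{m} (x_i - \mu)^2 \le a_m$
(ii) Accept H1 if $\sum_{i=1}^{m} (x_i - \mu)^2 \ge r_m$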
Example
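A sequential sample of observations is drawn from a N(μ = 2, σ²) population. Test whether the
variance of the population may be regarded as either 4 or 6, given that α = 0.15 and β = 0.25. Ten
successive observations are drawn: 2.15 1.85 1.65 2.35 2.55 1.75 1.85 2.45 1.45 2.75.
Solution
H0: σ² = 4; H1: σ² = 6.
Given that α = 0.15; β = 0.25; μ = 2; m = 10.
The critical boundary lines are:
$$a_m = \frac{2\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{\beta}{1 - \alpha} + m\,\frac{\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{\sigma_1^2}{\sigma_0^2} = \frac{2 \times 4 \times 6}{6 - 4} \log\frac{0.25}{1 - 0.15} + 10 \times \frac{4 \times 6}{6 - 4} \log\frac{6}{4} = 19.28$$
$$r_m = \frac{2\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{1 - \beta}{\alpha} + m\,\frac{\sigma_0^2 \sigma_1^2}{\sigma_1^2 - \sigma_0^2} \log\frac{\sigma_1^2}{\sigma_0^2} = \frac{2 \times 4 \times 6}{6 - 4} \log\frac{1 - 0.25}{0.15} + 10 \times \frac{4 \times 6}{6 - 4} \log\frac{6}{4} = 87.28$$
Conclusion: Accept H0 if $\sum (x_i - 2)^2 \le 19.28$, accept H1 if $\sum (x_i - 2)^2 \ge 87.28$, and continue
sampling as long as $19.28 < \sum (x_i - 2)^2 < 87.28$.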
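The boundary values for m = 10 and the observed sum of squares can be evaluated with the sketch below. It is a minimal illustration that assumes the boundaries follow from the log likelihood ratio of two normal densities with known mean, uses natural logarithms, and treats 4 and 6 as the two variances; the variable names are illustrative.

```python
import math

mu = 2.0
var0, var1 = 4.0, 6.0          # variances under H0 and H1
alpha, beta = 0.15, 0.25
x = [2.15, 1.85, 1.65, 2.35, 2.55, 1.75, 1.85, 2.45, 1.45, 2.75]
m = len(x)

# Common factor sigma0^2 sigma1^2 / (sigma1^2 - sigma0^2).
k = var0 * var1 / (var1 - var0)
drift = m * k * math.log(var1 / var0)

a_m = 2 * k * math.log(beta / (1 - alpha)) + drift   # acceptance boundary
r_m = 2 * k * math.log((1 - beta) / alpha) + drift   # rejection boundary

s = sum((xi - mu) ** 2 for xi in x)                  # observed sum of squares

if s <= a_m:
    decision = "accept H0"
elif s >= r_m:
    decision = "accept H1"
else:
    decision = "continue sampling"

print(round(a_m, 2), round(r_m, 2), round(s, 3), decision)
```

Under these assumptions the observed sum of squared deviations after ten observations is 1.745, which lies below the acceptance boundary.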
TEST 58
Aim
To test whether the parameter of a population has a specified value, based on sequential observations.
Source
A random sample of observations is drawn sequentially as necessary.
Assumption
The observations drawn are independent and follow a binomial distribution.
Null Hypothesis
H0: The parameter of a population, p has a specified value p0. i.e., H0: p = p0.
Alternative Hypothesis
H1: The parameter of a population, p has a specified value p1. i.e., H1: p = p1.
Method
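(i) Fix the probabilities of Type-I and Type-II errors, α and β, at a minimum level.
(ii) Calculate the following two boundary lines for every successive observation m and for the
number of defective items dm:
$$d_m\left[\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}\right] + m \log\frac{1 - p_1}{1 - p_0} = \log\frac{\beta}{1 - \alpha}$$
$$d_m\left[\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}\right] + m \log\frac{1 - p_1}{1 - p_0} = \log\frac{1 - \beta}{\alpha}$$
(iii) Plot the above two lines in a graph.
(iv) For every stage of m, the following decision is made, as provided in the conclusion.
Conclusion
(i) Accept H0 if
$$d_m\left[\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}\right] + m \log\frac{1 - p_1}{1 - p_0} \le \log\frac{\beta}{1 - \alpha}$$
(ii) Accept H1 if
$$d_m\left[\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}\right] + m \log\frac{1 - p_1}{1 - p_0} \ge \log\frac{1 - \beta}{\alpha}$$
(iii) Continue sampling if
$$\log\frac{\beta}{1 - \alpha} < d_m\left[\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}\right] + m \log\frac{1 - p_1}{1 - p_0} < \log\frac{1 - \beta}{\alpha}$$
for every sequential value of m.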
Example
A sequential sample is drawn from a large consignment of apples such that the good items are
denoted by 'a' and bad items are denoted by 'r'. Test whether the proportion of bad items in the
consignment may be regarded as either 0.10 or 0.20 by fixing α = 0.01 and β = 0.05 from the following
sample items.
a a a r a r a a r a a
a r r a r r a r
Solution
H0: p = p0 = 0.10; H1: p = p1 = 0.20; α = 0.01, β = 0.05.
$$\log\frac{p_1}{p_0} = \log\frac{0.20}{0.10} = 0.693; \qquad \log\frac{1 - p_1}{1 - p_0} = \log\frac{0.80}{0.90} = -0.118$$
$$\log\frac{\beta}{1 - \alpha} = \log\frac{0.05}{0.99} = -2.986; \qquad \log\frac{1 - \beta}{\alpha} = \log\frac{0.95}{0.01} = 4.554$$
The boundary lines are:
am: 0.811 dm − 0.118 m = −2.986
rm: 0.811 dm − 0.118 m = 4.554
If m = 0, the two boundary lines give dm1 = −3.68 and dm2 = 5.62.
If m = 30, the two boundary lines give dm1 = 0.68 and dm2 = 9.98.
The first one intersects the m-axis at m = 25.31. After the 21st observation, we can conclude that
H1 may be accepted. That is, the proportion of defective apples is more than 0.20 and hence the
consignment of apples may be rejected.
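The boundary constants and the path of the test statistic can be traced with the following Python sketch. It processes only the nineteen items listed in the example, uses natural logarithms, and updates the left-hand side of the boundary equations one item at a time; the variable names are illustrative.

```python
import math

p0, p1 = 0.10, 0.20
alpha, beta = 0.01, 0.05
items = "a a a r a r a a r a a a r r a r r a r".split()

step_bad = math.log(p1 / p0)                  # added for each bad item 'r'
step_good = math.log((1 - p1) / (1 - p0))     # added for each good item 'a'
lower = math.log(beta / (1 - alpha))          # acceptance constant
upper = math.log((1 - beta) / alpha)          # rejection constant

# statistic = d_m [log(p1/p0) - log((1-p1)/(1-p0))] + m log((1-p1)/(1-p0))
decision = "continue sampling"
statistic = 0.0
for m, item in enumerate(items, start=1):
    statistic += step_bad if item == "r" else step_good
    if statistic <= lower:
        decision = f"accept H0 at m = {m}"
        break
    if statistic >= upper:
        decision = f"accept H1 at m = {m}"
        break

print(round(lower, 3), round(upper, 3), round(statistic, 3), decision)
```

For the nineteen listed items the statistic remains between the two constants, so a decision depends on the observations that follow them.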
TEST 59
Aim
To test the parameter of the Bernoulli population by the sequential method.
Source
In any random experiment which produces only two mutually exclusive outcomes, namely
occurrence and non-occurrence of the event, the outcome follows the Bernoulli distribution,
whose probability function is as follows:
$$P(X = x) = \theta^x (1 - \theta)^{1 - x}, \qquad x = 0, 1$$
Assumption
The observations drawn are independent and follow a Bernoulli distribution.
Null Hypothesis
H0: The parameter of the Bernoulli population is θ0, i.e., H0: θ = θ0.
Alternative Hypothesis
H1: The parameter of the Bernoulli population is θ1, i.e., H1: θ = θ1.
Method
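(i) Fix the probabilities of Type-I and Type-II errors, α and β, at a minimum level.
(ii) Find the cumulative sum $\sum_{i=1}^{m} X_i$ of the observations at each stage m.
(iii) Calculate the following two numbers, namely am, the acceptance number, and rm, the rejection number,
for successive values of m:
$$a_m = \frac{\log\frac{\beta}{1 - \alpha}}{\log\frac{\theta_1}{\theta_0} + \log\frac{1 - \theta_0}{1 - \theta_1}} + m\,\frac{\log\frac{1 - \theta_0}{1 - \theta_1}}{\log\frac{\theta_1}{\theta_0} + \log\frac{1 - \theta_0}{1 - \theta_1}}$$
$$r_m = \frac{\log\frac{1 - \beta}{\alpha}}{\log\frac{\theta_1}{\theta_0} + \log\frac{1 - \theta_0}{1 - \theta_1}} + m\,\frac{\log\frac{1 - \theta_0}{1 - \theta_1}}{\log\frac{\theta_1}{\theta_0} + \log\frac{1 - \theta_0}{1 - \theta_1}}$$
(iv) For every stage of m, the following decision is made, as provided in the conclusion.
Conclusion
(i) Accept H0 if $\sum_{i=1}^{m} X_i \le a_m$
(ii) Accept H1 if $\sum_{i=1}^{m} X_i \ge r_m$
(iii) Continue sampling if $a_m < \sum_{i=1}^{m} X_i < r_m$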
Example
The quality control unit of an industry classifies its products into two divisions, namely within
specifications and out of specifications. Test whether the proportion of items which are out of
specifications is either 0.04 or 0.08 based on sequential sampling by fixing α = 0.15 and β = 0.25.
Solution
H0: θ = θ0 = 0.04; H1: θ = θ1 = 0.08; α = 0.15 and β = 0.25
$$\log\frac{\beta}{1 - \alpha} = \log\frac{0.25}{1 - 0.15} = -1.2238$$
$$\log\frac{1 - \beta}{\alpha} = \log\frac{1 - 0.25}{0.15} = 1.6094$$
$$\log\frac{\theta_1}{\theta_0} = \log\frac{0.08}{0.04} = 0.6931$$
$$\log\frac{1 - \theta_1}{1 - \theta_0} = \log\frac{1 - 0.08}{1 - 0.04} = -0.0426$$
$$\log\frac{1 - \theta_0}{1 - \theta_1} = \log\frac{1 - 0.04}{1 - 0.08} = 0.0426$$
The boundary lines are:
am = −1.66 + 0.058 m and rm = 2.19 + 0.058 m
Conclusion:
(i) Terminate the process by accepting H0 if $\sum_{i=1}^{m} X_i \le -1.66 + 0.058\,m$
(ii) Terminate the process by accepting H1 if $\sum_{i=1}^{m} X_i \ge 2.19 + 0.058\,m$
(iii) Continue the inspection by taking a further sample if $-1.66 + 0.058\,m < \sum_{i=1}^{m} X_i < 2.19 + 0.058\,m$
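The intercepts and the common slope of the two boundary lines can be evaluated directly from the acceptance and rejection number formulas of this test. The sketch below is a minimal illustration using natural logarithms; the function name `boundaries` is illustrative.

```python
import math

theta0, theta1 = 0.04, 0.08
alpha, beta = 0.15, 0.25

log_ratio = math.log(theta1 / theta0)
log_complement = math.log((1 - theta0) / (1 - theta1))
denominator = log_ratio + log_complement

slope = log_complement / denominator
accept_intercept = math.log(beta / (1 - alpha)) / denominator
reject_intercept = math.log((1 - beta) / alpha) / denominator


def boundaries(m):
    """Return (a_m, r_m), the acceptance and rejection numbers at stage m."""
    return accept_intercept + slope * m, reject_intercept + slope * m


print(round(accept_intercept, 2), round(reject_intercept, 2), round(slope, 3))
print(boundaries(50))
```

The slope of such boundary lines lies between θ0 and θ1, which gives a quick plausibility check on any computed value.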
TEST 60
Aim
Sequential test for the parameter of a population.
Source
Sample observations are drawn sequentially in any experiment.
Assumption
The sample observations are drawn from a population having the probability density function
f(x, θ).
Null Hypothesis
H0: The parameter of the population has a specified value θ0, i.e., H0: θ = θ0.
Alternative Hypothesis
H1: The parameter of the population has a specified value θ1, i.e., H1: θ = θ1.
Method
(i) The likelihood function of a sample x1, x2, …, xm from the population having the p.d.f. f(x, θ) is
given by
$$L_{1m} = \prod_{i=1}^{m} f(x_i, \theta_1) \quad \text{when H1 is true,}$$
$$L_{0m} = \prod_{i=1}^{m} f(x_i, \theta_0) \quad \text{when H0 is true,}$$
and the likelihood ratio λm is given by
$$\lambda_m = \frac{L_{1m}}{L_{0m}} = \prod_{i=1}^{m} \frac{f(x_i, \theta_1)}{f(x_i, \theta_0)}, \qquad m = 1, 2, \ldots$$
(ii) At each stage of the experiment (at the mth trial for any integral value m), the likelihood ratio
λm (m = 1, 2, …) is computed.
(iii) Fix α and β, the probabilities of Type-I and Type-II errors, at a minimum level.
(iv) Calculate the constants
$$A = \frac{1 - \beta}{\alpha} \quad \text{and} \quad B = \frac{\beta}{1 - \alpha}.$$
(v) For every stage of m, the following decision is made, as provided in the conclusion.
(vi) From the computational point of view, it is more convenient to find log λm rather than λm.
Conclusion
Example
A sequential sample of observations is drawn from a N(μ, σ²) distribution. We are interested in
testing whether the mean of the population is either 0.2 or 0.4 by fixing α = 0.25 and β = 0.35.
Solution
H0: μ = 0.2; H1: μ = 0.4; α = 0.25 and β = 0.35
$$A = \frac{1 - \beta}{\alpha} = 2.6 \quad \text{and} \quad B = \frac{\beta}{1 - \alpha} = 0.47; \qquad \log A = 0.9555, \quad \log B = -0.755$$
Conclusions:
(i) Terminate the process by accepting H0 if log λm ≤ −0.755
(ii) Terminate the process by accepting H1 if log λm ≥ 0.9555
(iii) Continue sampling if −0.755 < log λm < 0.9555.
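The two constants and the resulting decision rule can be written as a small Python sketch. It works with natural logarithms and with the unrounded value of B, so log B comes out slightly different from the figure obtained above from B rounded to 0.47; the function name `sprt_decision` is illustrative.

```python
import math

alpha, beta = 0.25, 0.35

a_const = (1 - beta) / alpha      # A = (1 - beta) / alpha
b_const = beta / (1 - alpha)      # B = beta / (1 - alpha)
log_a = math.log(a_const)
log_b = math.log(b_const)


def sprt_decision(log_lambda):
    """Decision for a given log likelihood ratio at the current stage."""
    if log_lambda <= log_b:
        return "accept H0"
    if log_lambda >= log_a:
        return "accept H1"
    return "continue sampling"


print(round(a_const, 2), round(b_const, 2), round(log_a, 4), round(log_b, 4))
print(sprt_decision(-1.0), sprt_decision(0.2), sprt_decision(1.0))
```

Working on the log scale turns the product of density ratios into a running sum, which is why the conclusions are stated in terms of log λm.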
CHAPTER 7
TABLES
Area
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0017 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0352 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0722 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
TABLE 1 (Contd.)
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
-0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
-0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
-0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
-0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9278 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
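Entries of the standard normal table can be cross-checked numerically. The sketch below uses the identity Φ(z) = ½[1 + erf(z/√2)] with the standard library `math.erf`; the function name `standard_normal_cdf` is illustrative.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-2.0, -1.0, 0.0, 1.0, 1.96):
    print(z, round(standard_normal_cdf(z), 4))
```

Rounding to four decimals matches the precision of the printed table, although the last digit can occasionally differ by one unit between published tables.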
Level of significance
n 0.10 0.05 0.025 0.01 0.005
1 3.078 6.314 12.706 31.821 63.657
2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
4 1.533 2.132 2.776 3.747 4.604
5 1.476 2.015 2.571 3.365 4.032
6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250
10 1.372 1.812 2.228 2.764 3.169
11 1.363 1.796 2.201 2.718 3.106
12 1.356 1.782 2.179 2.681 3.055
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977
15 1.341 1.753 2.131 2.602 2.947
16 1.337 1.746 2.120 2.583 2.921
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861
20 1.325 1.725 2.086 2.528 2.845
21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797
25 1.316 1.708 2.060 2.485 2.787
26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756
∞ 1.282 1.645 1.960 2.326 2.576
Source: Fisher and Yates, 1974.
TABLE 3: AREA IN THE RIGHT TAIL OF A CHI-SQUARE (χ²) DISTRIBUTION
Degrees of  Area in Right Tail
freedom 0.99 0.975 0.95 0.90 0.80 0.20 0.10 0.05 0.025 0.01
1 0.000 0.001 0.004 0.016 0.064 1.642 2.706 3.841 5.024 6.635
2 0.020 0.051 0.103 0.211 0.446 3.219 4.605 5.991 7.378 9.210
3 0.115 0.216 0.352 0.584 1.005 4.642 6.251 7.815 9.348 11.345
4 0.297 0.484 0.711 1.064 1.646 5.989 7.779 9.488 11.143 13.277
5 0.554 0.831 1.145 1.610 2.343 7.289 9.236 11.070 12.833 15.086
6 0.872 1.237 1.635 2.204 3.070 8.558 10.645 12.592 14.449 16.812
7 1.239 1.690 2.167 2.833 3.822 9.803 12.017 14.067 16.013 18.475
8 1.646 2.180 2.733 3.490 4.594 11.030 13.362 15.507 17.535 20.090
9 2.088 2.700 3.325 4.168 5.380 12.242 14.684 16.919 19.023 21.666
10 2.558 3.247 3.940 4.865 6.179 13.442 15.987 18.307 20.483 23.209
11 3.053 3.816 4.575 5.578 6.989 14.631 17.275 19.675 21.920 24.725
12 3.571 4.404 5.226 6.304 7.807 15.812 18.549 21.026 23.337 26.217
13 4.107 5.009 5.892 7.042 8.634 16.985 19.812 22.362 24.736 27.688
14 4.660 5.629 6.571 7.790 9.467 18.151 21.064 23.685 26.119 29.141
15 5.229 6.262 7.261 8.547 10.307 19.311 22.307 24.996 27.488 30.578
16 5.812 6.908 7.962 9.312 11.152 20.465 23.542 26.296 28.845 32.000
17 6.408 7.564 8.672 10.085 12.002 21.615 24.769 27.587 30.191 33.409
18 7.015 8.231 9.390 10.865 12.857 22.760 25.989 28.869 31.526 34.805
19 7.633 8.907 10.117 11.651 13.716 23.900 27.204 30.144 32.852 36.191
20 8.260 9.591 10.851 12.443 14.578 25.038 28.412 31.410 34.170 37.566
21 8.897 10.283 11.591 13.240 15.445 26.171 29.615 32.671 35.479 38.932
22 9.542 10.982 12.338 14.041 16.314 27.301 30.813 33.924 36.781 40.289
23 10.196 11.689 13.091 14.848 17.187 28.429 32.007 35.172 38.076 41.638
24 10.856 12.401 13.848 15.658 18.062 29.553 33.196 36.415 39.364 42.980
25 11.524 13.120 14.611 16.473 18.940 30.675 34.382 37.652 40.647 44.314
26 12.198 13.844 15.379 17.292 19.820 31.795 35.563 38.885 41.923 45.642
27 12.879 14.573 16.151 18.114 20.703 32.912 36.741 40.113 43.194 46.963
28 13.565 15.308 16.928 18.939 21.588 34.027 37.916 41.337 44.461 48.278
29 14.256 16.047 17.708 19.768 22.475 35.139 39.087 42.557 45.722 49.588
30 14.953 16.791 18.493 20.599 23.364 36.250 40.256 43.773 46.979 50.892
Source: Fisher, R.A., Statistical Methods for Research Workers, 14th edn. Hafner Press, 1972.
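Entries of the chi-square table can be checked numerically. The sketch below is one possible way to compute the right-tail area for an integer number of degrees of freedom with the Python standard library. It starts from the closed forms for 1 and 2 degrees of freedom and applies the recurrence Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) e^(-x/2) / Γ(v/2 + 1). The function name is illustrative and not taken from the book.

```python
import math

def chi2_right_tail(x, df):
    """Right-tail area P(chi-square with df degrees of freedom > x), x >= 0."""
    half = x / 2.0
    if df % 2 == 0:
        nu = 2
        q = math.exp(-half)                 # closed form for 2 degrees of freedom
    else:
        nu = 1
        q = math.erfc(math.sqrt(half))      # closed form for 1 degree of freedom
    # Step up two degrees of freedom at a time until df is reached.
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q

# Tabulated points: 3.841 (1 d.f.), 5.991 (2 d.f.), 23.209 (10 d.f.).
for x, df in [(3.841, 1), (5.991, 2), (23.209, 10)]:
    print(df, round(chi2_right_tail(x, df), 4))
```

Applied to a tabulated value, the function should return approximately the tail area at the head of that column.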
α = 0.05
Example: the tabulated value is 3.94 for n1 = 15, n2 = 6
n2 \ n1 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 40 60 120
1 161 200 216 225 230 234 237 239 241 242 244 246 248 249 250 251 252 253
2 18.5 19.0 19.2 19.2 19.3 19.3 19.4 19.4 19.4 19.4 19.4 19.4 19.4 19.5 19.5 19.5 19.5 19.5
3 10.1 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 8.74 8.70 8.66 8.64 8.62 8.59 8.57 8.55
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.91 5.86 5.80 5.77 5.75 5.72 5.69 5.66
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 4.68 4.62 4.56 4.53 4.50 4.46 4.43 4.40
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.00 3.94 3.87 3.84 3.81 3.77 3.74 3.70
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 3.57 3.51 3.44 3.41 3.38 3.34 3.30 3.27
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 3.28 3.22 3.15 3.12 3.08 3.04 3.01 2.97
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 3.07 3.01 2.94 2.90 2.86 2.83 2.79 2.75
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 2.91 2.85 2.77 2.74 2.70 2.66 2.62 2.58
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 2.79 2.72 2.65 2.61 2.57 2.53 2.49 2.46
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 2.69 2.62 2.54 2.51 2.47 2.43 2.38 2.34
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 2.60 2.53 2.46 2.42 2.38 2.34 2.30 2.25
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 2.53 2.46 2.39 2.35 2.31 2.27 2.22 2.18
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 2.48 2.40 2.33 2.29 2.25 2.20 2.16 2.11
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.42 2.35 2.28 2.24 2.19 2.15 2.11 2.06
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 2.38 2.31 2.23 2.19 2.15 2.10 2.06 2.01
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.34 2.27 2.19 2.15 2.11 2.06 2.02 1.97
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 2.31 2.23 2.16 2.11 2.07 2.03 1.98 1.93
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 2.28 2.20 2.12 2.08 2.04 1.99 1.95 1.90
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.25 2.18 2.10 2.05 2.01 1.96 1.92 1.87
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 2.23 2.15 2.07 2.03 1.98 1.94 1.89 1.84
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 2.20 2.13 2.05 2.01 1.96 1.91 1.86 1.81
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 2.18 2.11 2.03 1.98 1.94 1.89 1.84 1.79
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 2.16 2.09 2.01 1.96 1.92 1.87 1.82 1.77
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 2.09 2.01 1.93 1.89 1.84 1.79 1.74 1.68
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 2.00 1.92 1.84 1.79 1.74 1.69 1.64 1.58
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 1.92 1.84 1.75 1.70 1.65 1.59 1.53 1.47
120 3.92 3.07 2.68 2.45 2.29 2.18 2.09 2.02 1.96 1.91 1.83 1.75 1.66 1.61 1.55 1.50 1.43 1.35
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83 1.75 1.67 1.57 1.52 1.46 1.39 1.32 1.22
α = 0.01
Example: the tabulated value is 10.5 for n1 = 7, n2 = 5
n2 \ n1 1 2 3 4 5 6 7 8 9 10 12 15 20 24 30 40 60 120
1 4052 5000 5403 5625 5764 5859 5928 5982 6023 6056 6106 6157 6209 6235 6261 6287 6313 6339
2 98.5 99.0 99.2 99.2 99.3 99.3 99.4 99.4 99.4 99.4 99.4 99.4 99.4 99.5 99.5 99.5 99.5 99.5
3 34.1 30.8 29.5 28.7 28.2 27.9 27.7 27.5 27.3 27.2 27.1 26.9 26.7 26.6 26.5 26.4 26.3 26.2
4 21.2 18.0 16.7 16.0 15.5 15.2 15.0 14.8 14.7 14.5 14.4 14.2 14.0 13.9 13.8 13.7 13.7 13.6
5 16.3 13.3 12.1 11.4 11.0 10.7 10.5 10.3 10.2 10.1 9.89 9.72 9.55 9.47 9.38 9.29 9.20 9.11
6 13.7 10.9 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.72 7.56 7.40 7.31 7.23 7.14 7.06 6.97
7 12.2 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72 6.62 6.47 6.31 6.16 6.07 5.99 5.91 5.82 5.74
8 11.3 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91 5.81 5.67 5.52 5.36 5.28 5.20 5.12 5.03 4.95
9 10.6 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35 5.26 5.11 4.96 4.81 4.73 4.65 4.57 4.48 4.40
10 10.0 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94 4.85 4.71 4.56 4.41 4.33 4.25 4.17 4.08 4.00
11 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63 4.54 4.40 4.25 4.10 4.02 3.94 3.86 3.78 3.69
12 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39 4.30 4.16 4.01 3.86 3.78 3.70 3.62 3.54 3.45
13 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19 4.10 3.96 3.82 3.66 3.59 3.51 3.43 3.34 3.25
14 8.86 6.51 5.56 5.04 4.70 4.46 4.28 4.14 4.03 3.94 3.80 3.66 3.51 3.43 3.35 3.27 3.18 3.09
15 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.67 3.52 3.37 3.29 3.21 3.13 3.05 2.96
16 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69 3.55 3.41 3.26 3.18 3.10 3.02 2.93 2.84
17 8.40 6.11 5.19 4.67 4.34 4.10 3.93 3.79 3.68 3.59 3.46 3.31 3.16 3.08 3.00 2.92 2.83 2.75
18 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60 3.51 3.37 3.23 3.08 3.00 2.92 2.84 2.75 2.66
19 8.19 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43 3.30 3.15 3.00 2.92 2.84 2.76 2.67 2.58
20 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46 3.37 3.23 3.09 2.94 2.86 2.78 2.69 2.61 2.52
21 8.02 5.78 4.87 4.37 4.04 3.81 3.64 3.51 3.40 3.31 3.17 3.03 2.88 2.80 2.72 2.64 2.55 2.46
22 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26 3.12 2.98 2.83 2.75 2.67 2.58 2.50 2.40
23 7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21 3.07 2.93 2.78 2.70 2.62 2.54 2.45 2.35
24 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26 3.17 3.03 2.89 2.74 2.66 2.58 2.49 2.40 2.31
25 7.77 5.57 4.68 4.18 3.86 3.63 3.46 3.32 3.22 3.13 2.99 2.85 2.70 2.62 2.53 2.45 2.36 2.27
30 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07 2.98 2.84 2.70 2.55 2.47 2.39 2.30 2.21 2.11
40 7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.89 2.80 2.66 2.52 2.37 2.29 2.20 2.11 2.02 1.92
60 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63 2.50 2.35 2.20 2.12 2.03 1.94 1.84 1.73
120 6.85 4.79 3.95 3.48 3.17 2.96 2.79 2.66 2.56 2.47 2.34 2.19 2.03 1.95 1.86 1.76 1.66 1.53
∞ 6.63 4.61 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32 2.18 2.04 1.88 1.79 1.70 1.59 1.47 1.32
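A quick consistency check links the first column of the two tables above to the t table: an F variable with 1 and n2 degrees of freedom is the square of a t variable with n2 degrees of freedom, so the F value at level α should equal the square of the two-sided t value at the same level. The short Python sketch below uses numbers copied from the rows for 20 degrees of freedom. Reading 2.086 and 2.845 as the two-sided 0.05 and 0.01 points of t is an assumption about the t table's column headings.

```python
# Values copied from the tables (20 degrees of freedom).
t_two_sided_20 = {0.05: 2.086, 0.01: 2.845}   # assumed two-sided t points
f_1_and_20 = {0.05: 4.35, 0.01: 8.10}         # first column, row n2 = 20

differences = {}
for alpha, t_value in t_two_sided_20.items():
    squared = t_value ** 2
    differences[alpha] = abs(squared - f_1_and_20[alpha])
    print(alpha, round(squared, 3), f_1_and_20[alpha])
```

The small discrepancies come only from the rounding of the tabulated values.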
Two-sided 0.10 0.05 0.02 0.01 Two-sided 0.10 0.05 0.02 0.01
One-sided 0.05 0.025 0.01 0.005 One-sided 0.05 0.025 0.01 0.005
n n
1 - - - - 31 11 13 15 17
2 - - - - 32 12 14 16 16
3 - - - - 33 11 13 15 17
4 - - - - 34 12 14 16 16
5 5 - - - 35 11 13 15 17
6 6 6 - - 36 12 14 16 18
7 7 7 7 - 37 11 13 17 17
8 6 8 8 8 38 12 14 16 18
9 7 7 9 9 39 13 15 17 17
10 8 8 10 10 40 12 14 16 18
11 7 9 9 11 45 13 15 17 19
12 8 8 10 10 46 14 16 18 20
13 7 9 11 11 49 13 15 19 19
14 8 10 10 12 50 14 16 18 20
15 9 9 11 11 55 15 17 19 21
16 8 10 12 12 56 14 16 18 20
17 9 9 11 13 59 15 17 19 21
18 8 10 12 12 60 14 18 20 22
19 9 11 11 13 65 15 17 21 23
20 10 10 12 14 66 16 18 20 22
21 9 11 13 13 69 15 19 23 25
22 10 12 12 14 70 16 18 22 24
23 9 11 13 15 75 17 19 23 25
24 10 12 14 14 76 16 20 22 24
25 11 11 13 15 79 17 19 23 25
26 10 12 14 14 80 16 20 22 24
27 11 13 13 15 89 17 21 23 27
28 10 12 14 16 90 18 20 24 26
29 11 13 15 15 99 19 21 25 27
30 10 12 14 16 100 18 22 26 28
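The title of the table above did not survive, but its entries appear consistent with critical values for the sign test, expressed as the smallest absolute difference between the numbers of plus and minus signs that is significant at the stated level. Under that reading (an assumption, not something stated in the text), the entries can be recomputed from the binomial distribution with p = 1/2. The function name below is illustrative.

```python
from math import comb

def sign_test_critical_difference(n, alpha_two_sided):
    """Smallest |plus - minus| significant at the two-sided level, or None.

    Assumes n untied pairs and a null hypothesis of equal sign probabilities.
    """
    total = 2 ** n
    tail = 0        # number of outcomes with at most k signs of the rarer kind
    critical = None
    for k in range(0, n // 2 + 1):
        tail += comb(n, k)
        if 2 * tail / total <= alpha_two_sided:
            critical = n - 2 * k   # difference when the rarer sign occurs k times
        else:
            break
    return critical

for n in (5, 9, 10):
    print(n, [sign_test_critical_difference(n, a) for a in (0.10, 0.05, 0.02, 0.01)])
```

A return value of None corresponds to a dash in the table: no outcome is extreme enough to be significant at that level.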
8 1 0 0
10 1 1 0 0
12 2 2 1 0 0
14 3 2 1 0 0
16 4 3 1 0 0
18 5 4 3 2 1
20 5 5 3 3 2
22 6 5 4 4 3
25 7 7 5 5 4
30 10 9 7 6 5
35 12 11 9 8 7
40 14 13 11 10 9
45 16 15 13 12 11
50 18 17 15 15 13
55 20 19 17 17 15
n1 \ n2 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
2 - - - - - - - - - - 2 2 2 2 2 2 2 2 2
3 - - - - 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3
4 - - - 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4
5 - - 2 2 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5
6 - 2 2 3 3 3 3 4 4 4 4 5 5 5 5 5 5 6 6
7 - 2 2 3 3 3 4 4 5 5 5 5 5 6 6 6 6 6 6
8 - 2 3 3 3 4 4 5 5 5 6 6 6 6 6 7 7 7 7
9 - 2 3 3 4 4 5 5 5 6 6 6 7 7 7 7 8 8 8
10 - 2 3 3 4 5 5 5 6 6 7 7 7 7 8 8 8 8 9
11 - 2 3 4 4 5 5 6 6 7 7 7 8 8 8 9 9 9 9
12 2 2 3 4 4 5 6 6 7 7 7 8 8 8 9 9 9 10 10
13 2 2 3 4 5 5 6 6 7 7 8 8 9 9 9 10 10 10 10
14 2 2 3 4 5 5 6 7 7 8 8 9 9 9 10 10 10 11 11
15 2 3 3 4 5 6 6 7 7 8 8 9 9 10 10 11 11 11 12
16 2 3 3 4 5 6 6 7 8 8 9 9 10 10 11 11 11 12 12
17 2 3 4 4 5 6 7 7 8 9 9 10 10 11 11 11 12 12 13
18 2 3 4 5 5 6 7 8 8 9 9 10 10 11 11 12 12 13 13
19 2 3 4 5 6 6 7 8 8 9 10 10 11 11 12 12 13 13 13
20 2 3 4 5 6 6 7 8 9 9 10 10 11 12 12 13 13 13 14
Source: Swed, Frieda S., and Eisenhart, C. 1943. Tables for Testing Randomness of Grouping in a Sequence
of Alternatives, Ann. Math. Statist., 14, 83-86.
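To use a table of critical run counts, the observed number of runs and the two group sizes are needed. The following Python sketch shows one way to obtain them from a sequence of two kinds of symbols. The function name and the example sequence are illustrative only.

```python
from collections import Counter

def count_runs(sequence):
    """Return (number of runs, counts of each symbol) for a sequence."""
    items = list(sequence)
    if not items:
        return 0, Counter()
    runs = 1
    for previous, current in zip(items, items[1:]):
        if current != previous:     # a change of symbol starts a new run
            runs += 1
    return runs, Counter(items)

runs, sizes = count_runs("AABBBABB")
print(runs, dict(sizes))
```

The observed count is then compared with the tabulated value for the corresponding pair of sizes.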
TABLE 8: CRITICAL VALUES OF THE SMALLEST RANK SUM FOR THE WILCOXON-MANN-WHITNEY TEST
Two-sided 0.10 0.05 0.02 0.01
One-sided 0.05 0.025 0.01 0.005
n1 n2
3 2 3
3 3 7 6
4 2 3
4 3 7 6
4 4 13 11 10
5 2 4 3
5 3 8 7 6
5 4 14 12 11
5 5 20 19 17 15
6 2 4 3
6 3 9 8 7
6 4 15 13 12 10
6 5 22 20 18 16
6 6 30 28 26 23
7 2 4 3
7 3 10 8 7
7 4 16 14 13 10
7 5 23 21 20 16
7 6 32 29 27 24
7 7 41 39 36 32
8 2 5 4 3
8 3 11 9 8
8 4 17 15 14 11
8 5 25 23 21 17
8 6 34 31 29 25
8 7 44 41 38 34
8 8 55 51 49 43
9 1 1
9 2 5 4 3
9 3 11 9 8 6
9 4 19 16 14 11
9 5 27 24 22 18
9 6 36 33 31 26
9 7 46 43 40 35
9 8 58 54 51 45
9 9 70 66 62 56
10 1 1 - - -
10 2 6 4 3 -
10 3 12 10 9 6
10 4 20 17 15 12
10 5 28 26 23 19
10 6 38 35 32 27
10 7 49 45 42 37
10 8 60 56 53 47
10 9 73 69 65 58
10 10 87 82 78 71
11 1 1 - - -
11 2 6 4 3 -
11 3 13 11 9 6
11 4 21 18 16 12
11 5 30 27 24 20
11 6 40 37 34 28
11 7 51 47 44 38
11 8 63 59 55 49
11 9 76 72 68 61
11 10 91 86 81 73
11 11 106 100 96 87
12 1 1 - - -
12 2 7 5 4 -
12 3 14 11 10 7
12 4 22 19 17 13
12 5 32 28 26 21
12 6 42 38 35 30
12 7 54 49 46 40
12 8 66 62 58 51
12 9 80 75 71 63
12 10 94 89 84 76
12 11 110 104 99 90
12 12 127 120 115 105
13 1 1 - - -
13 2 7 5 4 -
13 3 15 12 10 7
13 4 23 20 18 14
13 5 33 30 27 22
13 6 44 40 37 31
13 7 56 52 48 41
13 8 69 64 60 53
13 9 83 78 73 65
13 10 98 92 88 79
13 11 114 108 103 93
13 12 131 125 119 109
13 13 149 142 136 125
14 1 1 - - -
14 2 7 5 4 -
14 3 16 13 11 7
14 4 25 21 19 14
14 5 35 31 28 22
14 6 46 42 38 32
14 7 59 54 50 43
14 8 72 67 62 54
14 9 86 81 76 67
14 10 102 96 91 81
14 11 118 112 106 96
14 12 136 129 123 112
14 13 154 147 141 129
14 14 174 166 160 147
15 1 1 - - -
15 2 8 6 4 -
15 3 16 13 11 8
15 4 26 22 20 15
15 5 37 33 29 23
15 6 48 44 40 33
15 7 61 56 52 44
15 8 75 69 65 56
15 9 90 84 79 69
15 10 106 99 94 84
15 11 123 116 110 99
15 12 141 133 127 115
15 13 159 152 145 133
15 14 179 171 164 151
15 15 200 192 184 171
16 1 1 - - -
16 2 8 6 4 -
16 3 17 14 12 8
16 4 27 24 21 15
16 5 38 34 30 24
16 6 50 46 42 34
16 7 64 58 54 46
16 8 78 72 67 58
16 9 93 87 82 72
16 10 109 103 97 86
16 11 127 120 113 102
16 12 145 138 131 119
16 13 165 156 150 136
16 14 185 176 169 155
16 15 206 197 190 175
16 16 229 219 211 196
17 1 1
17 2 9 6 5
17 3 18 15 12 8
17 4 28 25 21 16
17 5 40 35 32 25
17 6 52 47 43 36
17 7 66 61 56 47
17 8 81 75 70 60
17 9 97 90 84 74
17 10 113 106 100 89
17 11 131 123 117 105
17 12 150 142 135 122
17 13 170 161 154 140
17 14 190 182 174 159
17 15 212 203 195 180
17 16 235 225 217 201
17 17 259 249 240 223
18 1 1
18 2 9 7 5
18 3 19 15 13 8
18 4 30 26 22 16
18 5 42 37 33 26
18 6 55 49 45 37
18 7 69 63 58 49
18 8 84 77 72 62
18 9 100 93 87 76
18 10 117 110 103 92
18 11 135 127 121 108
18 12 155 146 139 125
18 13 175 166 158 144
18 14 196 187 179 163
18 15 218 208 200 184
18 16 242 231 222 206
18 17 266 255 246 228
18 18 291 280 270 252
19 1 2 1 - -
19 2 10 7 5 3
19 3 20 16 13 9
19 4 31 27 23 17
19 5 43 38 34 27
19 6 57 51 46 38
19 7 71 65 60 50
19 8 87 80 74 64
19 9 103 96 90 78
19 10 121 113 107 94
19 11 139 131 124 111
19 12 159 150 143 129
19 13 180 171 163 147
19 14 202 192 182 168
19 15 224 214 205 189
19 16 248 237 228 210
Source: Natrella, 1963.
n1 - Number of elements in the larger sample; n2 - Number of elements in the smaller sample
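The statistic compared with Table 8 is a rank sum. The Python sketch below is one way to compute it: the two samples are pooled and ranked (tied values receive the average of their ranks) and the ranks of the smaller sample are summed. It also returns the conjugate sum n2(n1 + n2 + 1) - T, on the assumption that the "smallest rank sum" is the smaller of the two in a two-sided test. That convention and the function name are assumptions, not taken from the book.

```python
def rank_sums_for_smaller_sample(x, y):
    """Return (T, conjugate T, min of the two) for the smaller sample."""
    smaller, larger = (x, y) if len(x) <= len(y) else (y, x)
    pooled = sorted(list(smaller) + list(larger))
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        i = j
    t = sum(rank_of[v] for v in smaller)
    conjugate = len(smaller) * (len(pooled) + 1) - t
    return t, conjugate, min(t, conjugate)

t, conjugate, smallest = rank_sums_for_smaller_sample(
    [1.1, 2.3, 3.0], [2.5, 4.1, 5.2, 6.0]
)
print(t, conjugate, smallest)
```

In this illustrative data set the smaller sample has 3 observations and the larger has 4, so the result would be looked up in the row n1 = 4, n2 = 3.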
One-sided test .10 .05 .025 .01 .005 One-sided test .10 .05 .025 .01 .005
Two-sided test .20 .10 .05 .02 .01 Two-sided test .20 .10 .05 .02 .01
n n
1 .900 .950 .975 .990 .995 21 .226 .259 .287 .321 .344
2 .684 .776 .842 .900 .929 22 .221 .253 .281 .314 .337
3 .565 .636 .708 .785 .829 23 .216 .247 .275 .307 .330
4 .493 .565 .624 .689 .734 24 .212 .242 .269 .301 .323
5 .447 .509 .563 .627 .669 25 .208 .238 .264 .295 .317
6 .410 .468 .519 .577 .617 26 .204 .233 .259 .290 .311
7 .381 .436 .483 .538 .576 27 .200 .229 .254 .284 .305
8 .358 .410 .454 .507 .542 28 .197 .225 .250 .279 .300
9 .339 .387 .430 .480 .513 29 .193 .221 .246 .275 .295
10 .323 .369 .409 .457 .489 30 .190 .218 .242 .270 .290
11 .308 .352 .391 .437 .468 31 .187 .214 .238 .266 .285
12 .296 .338 .375 .419 .449 32 .184 .211 .234 .262 .281
13 .285 .325 .361 .404 .432 33 .182 .208 .231 .258 .277
14 .275 .314 .349 .390 .418 34 .179 .205 .227 .254 .273
15 .266 .304 .338 .377 .404 35 .177 .202 .224 .251 .269
16 .258 .295 .327 .366 .392 36 .174 .199 .221 .247 .265
17 .250 .286 .318 .355 .381 37 .172 .196 .218 .244 .262
18 .244 .279 .309 .346 .371 38 .170 .194 .215 .241 .258
19 .237 .271 .301 .337 .361 39 .168 .191 .213 .238 .255
20 .232 .265 .294 .329 .352 40 .165 .189 .210 .235 .252
Source: Table 1 of Leslie H. Miller, Table of Percentage Points of Kolmogorov Statistics. J. Am. Stat.
Assoc. 51 (1956), 111-121.
This table gives the values of D+n,α and Dn,α for which P {D+n > D+n,α} ≤ α and P {Dn > Dn,α} ≤ α for some
selected values of n and α.
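The one-sided statistic D+n and the two-sided statistic Dn compared with this table can be computed from a sample and a fully specified continuous distribution function. The following Python sketch shows one way to do so. The function name, the example sample and the uniform distribution function are illustrative assumptions.

```python
def kolmogorov_statistics(sample, cdf):
    """Return (D+n, Dn) for a sample against a hypothesized continuous cdf."""
    xs = sorted(sample)
    n = len(xs)
    # Largest amount by which the empirical cdf exceeds the hypothesized cdf.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # Largest amount by which the hypothesized cdf exceeds the empirical cdf.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus, max(d_plus, d_minus)

def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

d_plus, d_n = kolmogorov_statistics([0.1, 0.4, 0.7], uniform_cdf)
print(round(d_plus, 4), round(d_n, 4))
```

The computed value is compared with the entry in the row for the sample size n and the column for the chosen level.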
TABLE 10: CRITICAL VALUES OF rs FOR THE SPEARMAN RANK CORRELATION TEST
Level of significance
n 0.001 0.005 0.010 0.025 0.050 0.100
4 - - - - 0.8000 0.8000
5 - - 0.9000 0.9000 0.8000 0.7000
6 - 0.9429 0.8857 0.8286 0.7714 0.6000
7 0.9643 0.8929 0.8571 0.7450 0.6786 0.5357
8 0.9286 0.8571 0.8095 0.6905 0.5952 0.4762
9 0.9000 0.8167 0.7667 0.6833 0.5833 0.4667
10 0.8667 0.7818 0.7333 0.6364 0.5515 0.4424
11 0.8455 0.7545 0.7000 0.6091 0.5273 0.4182
12 0.8182 0.7273 0.6713 0.5804 0.4965 0.3986
13 0.7912 0.6978 0.6429 0.5549 0.4780 0.3791
14 0.7670 0.6747 0.6220 0.5341 0.4593 0.3626
15 0.7464 0.6536 0.6000 0.5179 0.4429 0.3500
16 0.7265 0.6324 0.5824 0.5000 0.4265 0.3382
17 0.7083 0.6152 0.5637 0.4853 0.4118 0.3260
18 0.6904 0.5975 0.5480 0.4716 0.3994 0.3148
19 0.6737 0.5825 0.5333 0.4579 0.3895 0.3070
20 0.6586 0.5684 0.5203 0.4451 0.3789 0.2977
21 0.6455 0.5545 0.5078 0.4351 0.3688 0.2909
22 0.6318 0.5426 0.4963 0.4241 0.3597 0.2829
23 0.6186 0.5306 0.4852 0.4150 0.3518 0.2767
24 0.6070 0.5200 0.4748 0.4061 0.3435 0.2704
25 0.5962 0.5100 0.4654 0.3977 0.3362 0.2646
26 0.5856 0.5002 0.4564 0.3894 0.3299 0.2588
27 0.5757 0.4915 0.4481 0.3822 0.3236 0.2540
28 0.5660 0.4828 0.4401 0.3749 0.3175 0.2490
29 0.5567 0.4744 0.4320 0.3685 0.3113 0.2443
30 0.5479 0.4665 0.4251 0.3620 0.3059 0.2400
Source: Sachs, 1972.
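The statistic rs compared with Table 10 can be computed from the differences d between paired ranks as rs = 1 - 6Σd²/(n(n² - 1)). A minimal Python sketch follows. It assumes that there are no tied observations within either variable, and the function names and data are illustrative.

```python
def ranks_without_ties(values):
    """Rank values from 1 (smallest) to n; assumes all values are distinct."""
    order = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [order[v] for v in values]

def spearman_rs(x, y):
    """Spearman rank correlation for paired data without ties."""
    n = len(x)
    rank_x = ranks_without_ties(x)
    rank_y = ranks_without_ties(y)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))

rs = spearman_rs([10, 20, 30, 40, 50], [2, 1, 4, 3, 5])
print(round(rs, 4))
```

When ties are present, average ranks and a tie-corrected formula would be needed instead; that case is not covered by this sketch.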
TABLE 11: CRITICAL VALUES FOR THE RUN TEST (EQUAL SAMPLE SIZES)
Level of significance
Two-sided 0.10 0.05 0.02 0.01
One-sided 0.05 0.025 0.01 0.005
n1 = n2 a b a b a b a b
5 3 9 2 10
6 3 11 2 12
7 4 12 3 13
8 5 13 4 14
9 6 14 4 16
10 6 16 5 17
11 7 17 7 16 6 18 5 18
12 8 18 7 18 7 19 6 19
13 9 19 8 19 7 21 7 20
14 10 20 9 20 8 22 7 22
15 11 21 10 21 9 23 8 23
16 11 23 11 22 10 24 9 24
17 12 24 11 24 10 26 10 25
18 13 25 12 25 11 27 10 27
19 14 26 13 26 12 28 11 28
20 15 27 14 27 13 29 12 29
21 16 28 14 30
22 17 29 14 32
23 17 31 15 33
24 18 32 16 34
25 19 33 18 33 17 35 16 35
26 20 34 18 36
27 21 35 19 37
28 22 36 19 39
29 23 37 20 40
30 24 38 22 39 21 41 20 41
35 28 43 27 44 25 46 24 47
40 33 48 31 50 30 51 29 52
45 37 54 36 55 34 57 33 58
50 42 59 40 61 38 63 37 64
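For sample sizes beyond those tabulated, the number of runs is commonly approximated by a normal distribution with mean 2 n1 n2 / (n1 + n2) + 1 and variance 2 n1 n2 (2 n1 n2 - n1 - n2) / ((n1 + n2)² (n1 + n2 - 1)). The Python sketch below evaluates these standard formulas. The function name is illustrative, and no continuity correction is applied, which is a simplifying assumption.

```python
import math

def runs_normal_approximation(n1, n2, observed_runs):
    """Return (mean, standard deviation, z) for the number of runs."""
    n = n1 + n2
    mean = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2) / (n * n * (n - 1.0))
    sd = math.sqrt(variance)
    z = (observed_runs - mean) / sd
    return mean, sd, z

mean, sd, z = runs_normal_approximation(50, 50, 40)
print(round(mean, 2), round(sd, 3), round(z, 2))
```

The resulting z value is referred to the standard normal distribution at the chosen level.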
Parimal Mukhopadhyay (1996), Mathematical Statistics, New Central Book Agency (P) Ltd.,
Calcutta.
Parimal Mukhopadhyay (1999), Applied Statistics, New Central Book Agency (P) Ltd., Calcutta.
Rangasamy, R. (1995), A Text Book of Agricultural Statistics, New Age International Publishers
Ltd.
Rao, C.R. (1963), Linear Statistical Inference and Its Applications, John Wiley & Sons.
Rao, C.R. (1952), Advanced Statistical Methods in Biometric Research, John Wiley & Sons.
Richard I. Levin and David S. Rubin (2001), Statistics for Management, 7th ed., Prentice Hall of
India.
Rohatgi, V.K. (1976), An Introduction to Probability Theory and Mathematical Statistics, Wiley
Eastern.
Sachs, L. (1972), Statistische Methoden: ein Soforthelfer, Springer-Verlag, Berlin.
Scheffe, H. (1961), The Analysis of Variance, John Wiley & Sons, New York.
Searle, S.R. (1971), Linear Models, John Wiley & Sons, New York.
Wald, A. (1947), Sequential Analysis, John Wiley & Sons, New York.
Walpole, R.E. and Myers, R.H. (1989), Probability and Statistics for Engineers and Scientists, 4th
edn. Macmillan, New York.
Wijvekate, M.L. (1962), Verklarende Statistiek, Aula, Utrecht.