Statistics


UNIT-I

SINGLE RANDOM VARIABLES AND PROBABILITY DISTRIBUTION
Probability
Trial and Event: Consider an experiment which, though repeated under essentially identical
conditions, does not give a unique result but may result in any one of several possible
outcomes. The experiment is known as a Trial and an outcome is called an Event.

E.g. (1) Throwing a die and getting one of the numbers 1, 2, 3, 4, 5, 6 (event)

(2) Tossing a coin and getting a head or a tail (event)

Exhaustive Events:

The total number of possible outcomes in any trial is called the exhaustive number of events.

E.g.: (1) In tossing a coin there are two exhaustive events.

(2) In throwing n dice there are 6^n exhaustive events.

Favorable event:

The number of cases favorable to an event in a trial is the number of outcomes which entail the
happening of the event.

E.g. (1) In tossing a coin, there is one and only one case favorable to getting a head (or a tail).

Mutually exclusive Events: If two or more events cannot happen simultaneously in the same
trial, then the events are called mutually exclusive events.

E.g. In throwing a die, the events 1, 2, 3, ..., 6 are mutually exclusive events.

Equally likely Events: Outcomes are said to be equally likely if there is no reason for
one to be preferred over another. E.g. in throwing a die, the chance of getting 1, 2, 3, 4, 5 or 6 is equally likely.
Independent Event:

Several events are said to be independent if the happening or non-happening of any one of them is
not affected by the occurrence or non-occurrence of any of the remaining events.

An event that always happens is called a Certain event; it is denoted by S.

An event that never happens is called an Impossible event; it is denoted by ∅.

E.g.: In tossing a coin and throwing a die, getting a head or a tail is independent of getting the number 1, 2,
3, 4, 5 or 6.

Definition: probability (Mathematical Definition)

If a trial results in n exhaustive, mutually exclusive and equally likely cases and m of them are
favorable to the happening of an event E, then the probability of the event E is denoted by P(E)
and is defined as

P(E) = (number of cases favourable to E) / (total number of exhaustive cases) = m/n

Sample Space:

The set of all possible outcomes of a random experiment is called the Sample Space. The elements
of this set are called sample points. The sample space is denoted by S.

E.g. (1) In throwing two dice, the sample space S contains 36 sample points:

S = {(1,1), (1,2), ..., (1,6), ..., (6,1), (6,2), ..., (6,6)}

E.g. (2) In tossing two coins, S = {HH, HT, TH, TT}

A sample space is called discrete if it contains only finitely or countably infinitely many points which can
be arranged into a simple sequence w1, w2, ..., while a sample space containing a non-
denumerable number of points is called a continuous sample space.

Statistical or Empirical Probability:

If a trial is repeated a number of times under essentially homogeneous and identical conditions, then
the limiting value of the ratio of the number of times the event happens to the total number of trials, as
the number of trials becomes indefinitely large, is called the probability of happening of the
event. (It is assumed that the limit is finite and unique.)

Symbolically, if in n trials an event E happens m times, then the probability p of the
happening of E is given by p = P(E) = lim_{n→∞} m/n.

An event E is called an elementary event if it consists of only one element.

An event which is not elementary is called a compound event.

Random Variables

 A random variable X on a sample space S is a function X : S → R from S into the set of
real numbers R, which assigns a real number X(s) to each sample point s of S.
 Random variables (r.v.) are denoted by capital letters X, Y, Z, etc.
 A random variable is a single-valued function.
 The sum, difference and product of two random variables are also random variables. A finite linear
combination of r.v.'s is also a r.v. A scalar multiple of a random variable is also a random
variable.
 A random variable which takes at most a countable number of values is called a
discrete r.v. In other words, a real-valued function defined on a discrete sample space is
called a discrete r.v.
 A random variable X is said to be continuous if it can take all possible values between
certain limits. In other words, a r.v. is said to be continuous when its different values
cannot be put in 1-1 correspondence with a set of positive integers.
 A continuous r.v. is a r.v. that can be measured to any desired degree of accuracy. E.g.: age,
height, weight, etc.
 Discrete Probability distribution: Each outcome in a sample space has a certain probability of
occurrence. A formula giving all these probabilities which a discrete r.v. assumes
is known as the discrete probability distribution.
 The probability function or probability mass function (p.m.f.) of a discrete random
variable X is the function f(x) satisfying the following conditions:
i) f(x) ≥ 0

ii) Σ_x f(x) = 1

iii) P(X = x) = f(x)

 The cumulative distribution (or simply distribution) function of a discrete r.v. X is F(x) defined by

F(x) = P(X ≤ x) = Σ_{t ≤ x} f(t)  for −∞ < x < ∞
 If X takes on only a finite number of values x1, x2, ..., xn then the distribution function is given
by

F(x) = 0                              −∞ < x < x1
       f(x1)                          x1 ≤ x < x2
       f(x1) + f(x2)                  x2 ≤ x < x3
       ........
       f(x1) + f(x2) + ... + f(xn)    xn ≤ x < ∞

F(−∞) = 0, F(∞) = 1, 0 ≤ F(x) ≤ 1, F(x) ≤ F(y) if x < y

P(xk) = P(X = xk) = F(xk) − F(xk−1)

 For a continuous r.v. X, the function f(x) satisfying the following is known as the
probability density function (p.d.f.) or simply density function:

i) f(x) ≥ 0, −∞ < x < ∞

ii) ∫_{−∞}^{∞} f(x) dx = 1

iii) P(a < X < b) = ∫_a^b f(x) dx = area under f(x) between the ordinates x = a and x = b

 P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b)

(i.e.) In the case of a continuous r.v. it does not matter whether we include the end
points of the interval from a to b. This result in general is not true for a
discrete r.v.

 Probability at a point: P(X = a) = ∫_{a−Δx}^{a+Δx} f(x) dx in the limit as Δx → 0

 Cumulative distribution: for a continuous r.v. X with p.d.f. f(x), the cumulative
distribution F(x) is defined as

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt,  −∞ < x < ∞

It follows that F(−∞) = 0, F(∞) = 1, 0 ≤ F(x) ≤ 1 for −∞ < x < ∞,

f(x) = d/dx F(x) = F′(x) ≥ 0 and P(a < X < b) = F(b) − F(a)

 In the case of a discrete r.v. the probability at a point, i.e. P(X = c), is not zero for some fixed c;
however, in the case of a continuous random variable the probability at a point is always zero,
i.e. P(X = c) = 0 for all possible values of c.

 P(E) = 0 does not imply that the event E is null or impossible event.

 If X and Y are two discrete random variables, the joint probability function of X and Y is
given by P(X = x, Y = y) = f(x,y) and satisfies

(i) f(x,y) ≥ 0    (ii) Σ_x Σ_y f(x,y) = 1

The joint probability function for X and Y can be represented by a joint probability
table.

Table

    Y       y1          y2          ……    yn          Totals
 X
 x1         f(x1,y1)    f(x1,y2)    ……    f(x1,yn)    f1(x1) = P(X=x1)
 x2         f(x2,y1)    f(x2,y2)    ……    f(x2,yn)    f1(x2) = P(X=x2)
 ……         ……          ……          ……    ……          ……
 xm         f(xm,y1)    f(xm,y2)    ……    f(xm,yn)    f1(xm) = P(X=xm)
 Totals     f2(y1)      f2(y2)      ……    f2(yn)      1
            = P(Y=y1)   = P(Y=y2)         = P(Y=yn)

The probability of X = xj is obtained by adding all entries in the row corresponding to X = xj.

Similarly the probability of Y = yk is obtained by adding all entries in the column
corresponding to Y = yk.

f1(x) and f2(y) are called the marginal probability functions of X and Y respectively.

The joint distribution function of X and Y is defined by

F(x,y) = P(X ≤ x, Y ≤ y) = Σ_{u ≤ x} Σ_{v ≤ y} f(u,v)

 If X and Y are two continuous r.v.'s, the joint probability density function of the r.v.'s X and Y
satisfies

(i) f(x,y) ≥ 0    (ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) dx dy = 1

 P(a < X < b, c < Y < d) = ∫_{x=a}^{b} ∫_{y=c}^{d} f(x,y) dy dx

 The joint distribution function of X and Y is F(x,y) = P(X ≤ x, Y ≤ y) = ∫_{u=−∞}^{x} ∫_{v=−∞}^{y} f(u,v) du dv

 ∂²F/∂x∂y = f(x,y)

 The marginal distribution functions of X and Y are given by

P(X ≤ x) = F1(x) = ∫_{u=−∞}^{x} ∫_{v=−∞}^{∞} f(u,v) dv du  and  P(Y ≤ y) = F2(y) = ∫_{u=−∞}^{∞} ∫_{v=−∞}^{y} f(u,v) du dv

 The marginal density functions of X and Y are given by

f1(x) = ∫_{v=−∞}^{∞} f(x,v) dv  and  f2(y) = ∫_{u=−∞}^{∞} f(u,y) du

 Two discrete random variables X and Y are independent iff

P(X = x, Y = y) = P(X = x) P(Y = y) for all x, y   (or)

f(x,y) = f1(x) f2(y) for all x, y

 Two continuous random variables X and Y are independent iff

P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y) for all x, y   (or)

f(x,y) = f1(x) f2(y) for all x, y

If X and Y are two discrete r.v.'s with joint probability function f(x,y) then

P(Y = y | X = x) = f(x,y)/f1(x) = f(y|x)

Similarly, P(X = x | Y = y) = f(x,y)/f2(y) = f(x|y)

If X and Y are continuous r.v.'s with joint density function f(x,y) then f(x,y)/f1(x) = f(y|x) and
f(x,y)/f2(y) = f(x|y)

Expectation or mean or Expected value: The mathematical expectation or expected value of a
r.v. X is denoted by E(X) or μ and is defined as

E(X) = Σ_i x_i f(x_i)              if X is discrete

E(X) = ∫_{−∞}^{∞} x f(x) dx        if X is continuous

 If X is a r.v. then E[g(X)] =

Σ_x g(x) f(x)                      for discrete X

∫_{−∞}^{∞} g(x) f(x) dx            for continuous X

 If X, Y are r.v.'s with joint probability function f(x,y) then

E[g(X,Y)] = Σ_x Σ_y g(x,y) f(x,y)                         for discrete r.v.'s

          = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x,y) f(x,y) dx dy    for continuous r.v.'s

If X and Y are two continuous r.v.'s with joint density function f(x,y), the conditional
expectation (or conditional mean) of Y given X is E(Y | X = x) = ∫_{−∞}^{∞} y f(y|x) dy

Similarly, the conditional mean of X given Y is E(X | Y = y) = ∫_{−∞}^{∞} x f(x|y) dx


 Median is the point which divides the entire distribution into two equal parts. In the case of a
continuous distribution, the median is the point which divides the total area into two equal
parts. Thus, if M is the median then ∫_{−∞}^{M} f(x) dx = ∫_{M}^{∞} f(x) dx = 1/2. Solving either one of
these equations for M we get the value of the median. The median is unique.

 Mode: Mode is the value of x at which f(x) or P(x_i) attains its maximum.
For a continuous r.v. X the mode is the solution of f′(x) = 0 with f″(x) < 0,

provided it lies in the given interval. The mode may or may not be unique.

 Variance: Variance characterizes the variability in a distribution; two distributions with the same mean can
still have different dispersion of data about their means.

The variance of a r.v. X is denoted by Var(X) and is defined as

Var(X) = E[(X − μ)²] = Σ_x (x − μ)² f(x)              for discrete X

                     = ∫_{−∞}^{∞} (x − μ)² f(x) dx    for continuous X

where μ = E(X)

 If c is any constant then E(cX) = c E(X)

 If X and Y are two r.v.'s then E(X+Y) = E(X) + E(Y)

 If X, Y are two independent r.v.'s then E(XY) = E(X) E(Y)

 If X1, X2, ..., Xn are random variables then E(c1X1 + c2X2 + ... + cnXn) =
c1E(X1) + c2E(X2) + ... + cnE(Xn) for any scalars c1, c2, ..., cn, provided all expectations exist.

 If X1, X2, ..., Xn are independent r.v.'s then E(Π_{i=1}^{n} Xi) = Π_{i=1}^{n} E(Xi), provided all expectations
exist.

 Var(X) = E(X²) − [E(X)]²

 If c is any constant then Var(cX) = c² Var(X)

 The quantity E[(X − a)²] is minimum when a = E(X)

 If X and Y are independent r.v.'s then Var(X ± Y) = Var(X) + Var(Y)
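To make the definitions above concrete, the following short Python sketch (not part of the original notes; the function names and the example distribution are chosen here for illustration only) computes E(X), E(X²) and Var(X) for a discrete distribution given as (value, probability) pairs.

# Sketch: expectation and variance of a discrete random variable.
def expectation(dist, g=lambda x: x):
    """Return E[g(X)] = sum of g(x) * f(x) over the support."""
    return sum(g(x) * p for x, p in dist)

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mu = expectation(dist)
    return expectation(dist, lambda x: x * x) - mu ** 2

# Example: X = face shown by a fair die.
die = [(x, 1 / 6) for x in range(1, 7)]

assert abs(sum(p for _, p in die) - 1.0) < 1e-12   # p.m.f. sums to 1
print("E(X)   =", expectation(die))                # 3.5
print("Var(X) =", variance(die))                   # 35/12 = 2.9167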


Binomial Distribution

 A random variable X is said to follow the binomial distribution if it assumes only non-negative
values and its probability mass function is given by

P(X = x) = P(x) = C(n,x) p^x q^(n−x),   x = 0, 1, 2, 3, ..., n,  q = 1 − p
                = 0                     otherwise

where n and p are known as the parameters: n is the number of independent trials, p the probability of success
in each trial, and q the probability of failure.

 The binomial distribution is a discrete distribution.

 The notation X ~ B(n,p) means that the random variable X follows the binomial distribution with
parameters n and p.

 If n trials constitute an experiment and the experiment is repeated N times, the frequency
function of the binomial distribution is given by f(x) = N·P(x). The expected frequencies of
0, 1, 2, ..., n successes are the successive terms of the binomial expansion N(q + p)^n.

 The mean and variance of the binomial distribution are np and npq respectively.

 Mode of the binomial distribution: the mode depends upon the value of (n+1)p.

(i) If (n+1)p is not an integer then there exists a unique modal value for the binomial distribution
and it is m = integral part of (n+1)p.

(ii) If (n+1)p is an integer, say m, then the distribution is bi-modal and the two modal values are
m and m−1.

 Moment generating function of the binomial distribution: if X ~ B(n,p) then MX(t) = (q + pe^t)^n.

 The sum of two independent binomial variates is not, in general, a binomial variate. In other words,
the binomial distribution does not possess the additive or reproductive property in general.

 For the B.D., γ1 = √β1 = (1 − 2p)/√(npq) and γ2 = β2 − 3 = (1 − 6pq)/(npq).

 If X1 ~ B(n1,p) and X2 ~ B(n2,p) then X1 + X2 ~ B(n1 + n2, p). Thus the B.D. possesses the additive
or reproductive property when p is the same for both variates.
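A minimal Python sketch (not from the original notes; n and p below are example values only) that evaluates the binomial p.m.f. and checks the mean np, variance npq and the (n+1)p rule for the mode quoted above.

# Sketch: binomial probabilities P(X = x) = C(n,x) p^x q^(n-x).
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * probs[x] for x in range(n + 1))
var = sum(x * x * probs[x] for x in range(n + 1)) - mean ** 2

print(abs(mean - n * p) < 1e-9)           # True: mean = np
print(abs(var - n * p * (1 - p)) < 1e-9)  # True: variance = npq

# Mode: integral part of (n+1)p when (n+1)p is not an integer.
print(max(range(n + 1), key=lambda x: probs[x]), int((n + 1) * p))   # 3 3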

Poisson distribution

 The Poisson distribution is a limiting case of the binomial distribution under the following
conditions:

(i) n, the number of trials, is indefinitely large.

(ii) p, the constant probability of success in each trial, is indefinitely small.

(iii) np = λ is finite, where λ is a positive real number.

 A random variable X is said to follow a Poisson distribution if it assumes only non-negative
values and its p.m.f. is given by

P(x, λ) = P(X = x) = e^(−λ) λ^x / x!,   x = 0, 1, 2, 3, ...;  λ > 0
                   = 0                  otherwise

Here λ is known as the parameter of the distribution.

 We shall use the notation X ~ P(λ) to denote that X is a Poisson variate with parameter λ.

 The mean and variance of the Poisson distribution are both equal to λ.

 The coefficients of skewness and kurtosis of the Poisson distribution are γ1 = √β1 = 1/√λ and γ2 =
β2 − 3 = 1/λ. Hence the Poisson distribution is always a skewed distribution. Proceeding to the limit
as λ tends to infinity we get β1 = 0 and β2 = 3.

 Mode of the Poisson distribution: the mode depends upon the value of λ.

(i) When λ is not an integer, the distribution is uni-modal and the integral part of λ is the
unique modal value.

(ii) When λ = k is an integer, the distribution is bi-modal and the two modes are k−1 and k.

 The sum of independent Poisson variates is also a Poisson variate.

 The difference of two independent Poisson variates is not a Poisson variate.

 Moment generating function of the P.D.: if X ~ P(λ) then MX(t) = e^{λ(e^t − 1)}.

 Recurrence formula for the probabilities of the P.D. (fitting of the P.D.):

P(x+1) = [λ/(x+1)] P(x)

 Recurrence relation for the probabilities of the B.D. (fitting of the B.D.):

P(x+1) = [(n − x)/(x + 1)] (p/q) P(x)
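A short Python sketch (not part of the original notes; λ, N and the cut-off x_max are example values only) that fits a Poisson distribution using the recurrence formula above, producing the expected frequencies N·P(x).

# Sketch: fitting a Poisson distribution with the recurrence
# P(x+1) = lam/(x+1) * P(x), starting from P(0) = e^(-lam).
from math import exp

lam, N, x_max = 1.8, 1000, 7

p = exp(-lam)                 # P(0)
probs = [p]
for x in range(x_max):
    p = p * lam / (x + 1)     # recurrence formula
    probs.append(p)

expected = [N * q for q in probs]     # expected frequencies N * P(x)
for x, f in enumerate(expected):
    print(x, round(f, 1))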
Normal Distribution

 A random variable X is said to have a normal distribution with parameters μ (called the mean) and
σ² (called the variance) if its density function is given by the probability law

f(x; μ, σ) = [1/(σ√(2π))] exp[−(1/2)((x − μ)/σ)²],   −∞ < x < ∞, −∞ < μ < ∞, σ > 0

 That a r.v. X with mean μ and variance σ² follows the normal distribution is denoted by

X ~ N(μ, σ²)

 If X ~ N(μ, σ²) then Z = (X − μ)/σ is the standard normal variate with E(Z) = 0 and Var(Z) = 1, and
we write Z ~ N(0, 1).

 The p.d.f. of the standard normal variate Z is given by f(z) = [1/√(2π)] e^(−z²/2),  −∞ < z < ∞

 The distribution function is F(z) = P(Z ≤ z) = [1/√(2π)] ∫_{−∞}^{z} e^(−t²/2) dt

 F(−z) = 1 − F(z)

 P(a < Z ≤ b) = P(a ≤ Z < b) = P(a < Z < b) = P(a ≤ Z ≤ b) = F(b) − F(a)

 If X ~ N(μ, σ²) and Z = (X − μ)/σ, then P(a ≤ X ≤ b) = F((b − μ)/σ) − F((a − μ)/σ)

 N.D. is another limiting form of the B.D. under the following conditions:

i) n , the number of trials is infinitely large.

ii) Neither p nor q is very small

 Chief characteristics of the normal distribution and normal probability curve:

i) The curve is bell shaped and symmetrical about the line x = μ.

ii) The mean, median and mode of the distribution coincide.

iii) As x increases numerically, f(x) decreases rapidly.

iv) The maximum probability occurs at the point x = μ and is given by [p(x)]max = 1/(σ√(2π)).

v) β1 = 0 and β2 = 3.

vi) μ_{2r+1} = 0 (r = 0, 1, 2, ...) and μ_{2r} = 1·3·5···(2r−1) σ^{2r}.

vii) Since f(x), being a probability, can never be negative, no portion of the curve lies below the
x-axis.

viii) A linear combination of independent normal variates is also a normal variate.

ix) The x-axis is an asymptote to the curve.

x) The points of inflexion of the curve are given by x = μ ± σ, f(x) = [1/(σ√(2π))] e^(−1/2).

xi) Q.D. : M.D. : S.D. :: (2/3)σ : (4/5)σ : σ :: 2/3 : 4/5 : 1, or Q.D. : M.D. : S.D. :: 10 : 12 : 15.

xii) Area property: P(μ − σ < X < μ + σ) = 0.6826 = P(−1 < Z < 1)

P(μ − 2σ < X < μ + 2σ) = 0.9544 = P(−2 < Z < 2)

P(μ − 3σ < X < μ + 3σ) = 0.9973 = P(−3 < Z < 3)

P(|Z| > 3) = 0.0027

 m.g.f. of the N.D.: if X ~ N(μ, σ²) then MX(t) = e^{μt + σ²t²/2}

If Z ~ N(0,1) then MZ(t) = e^{t²/2}
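A small Python sketch (not from the original notes) that evaluates the standard normal distribution function through the error function, F(z) = (1/2)[1 + erf(z/√2)], and uses it to check the area property and the standardisation rule above; μ and σ in the last lines are example values only.

# Sketch: standard normal CDF via the error function.
from math import erf, sqrt

def phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) = F(k) - F(-k)."""
    return phi(k) - phi(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))   # 0.6827, 0.9545, 0.9973

# Standardisation: P(a <= X <= b) = F((b-mu)/sigma) - F((a-mu)/sigma)
mu, sigma = 140, 10
print(round(phi((148 - mu) / sigma) - phi((138 - mu) / sigma), 4))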

Continuity Correction:

 The N.D. applies to continuous random variables. It is often used to approximate distributions
of discrete r.v. Provided that we make the continuity correction.

 If we want to approximate its distribution with a N.D. we must spread its values over a
continuous scale. We do this by representing each integer k by the interval from k − 1/2 to k + 1/2;
"at least k" is represented by the interval to the right of k − 1/2, and "at most k" is represented by
the interval to the left of k + 1/2.

 Normal approximation to the B.D.:

If X ~ B(n, p) and Z = (X − np)/√(np(1 − p)), then Z ~ N(0,1) as n tends to infinity and

F(z) = P(Z ≤ z) = [1/√(2π)] ∫_{−∞}^{z} e^(−t²/2) dt,  −∞ < z < ∞

 Use the normal approximation to the B.D. only when (i) np and n(1 − p) are both greater than 15, or
(ii) n is small and p is close to 1/2.

 Poisson process: A Poisson process is a random process in which the number of events
(successes) x occurring in a time interval of length T is counted. It is a continuous-parameter,
discrete-state process. By dividing T into n equal parts of length Δt we have T = n·Δt.
Assume that (i) the probability of a success in a small interval is proportional to its length, p = α·Δt, (ii) the
occurrences of events are independent, and (iii) the probability of more than one success during a small
time interval Δt is negligible.

As n → ∞, the probability of x successes during a time interval T follows the P.D. with
parameter λ = np = αT, where α is the average (mean) number of successes per unit time.

PROBLEMS:

1: A random variable X has the following probability function:

x      0   1   2    3    4    5    6     7
P(x)   0   k   2k   2k   3k   k²   2k²   7k²+k

Find (i) k (ii) P(X < 6) (iii) P(X ≥ 6)

Solution:

(i) Since the total probability is unity, we have Σ_{x=0}^{7} p(x) = 1

i.e., 0 + k + 2k + 2k + 3k + k² + 2k² + 7k² + k = 1

i.e., 10k² + 9k − 1 = 0, so (10k − 1)(k + 1) = 0 and k = 1/10 (k = −1 is rejected since probabilities cannot be negative).

(ii) P(X < 6) = 0 + k + 2k + 2k + 3k + k² = 8k + k² = 0.8 + 0.01 = 0.81

(iii) P(X ≥ 6) = 2k² + 7k² + k = 9k² + k = 0.09 + 0.1 = 0.19

2. Let X denote the minimum of the two numbers that appear when a pair of fair dice is thrown
once. Determine (i) the discrete probability distribution (ii) Expectation (iii) Variance

Solution:

When two dice are thrown, the total number of outcomes is 6 × 6 = 36.

In this case the sample space is S = {(1,1), (1,2), ..., (1,6), (2,1), ..., (2,6), ..., (6,1), ..., (6,6)}.

If the random variable X assigns to each outcome the minimum of its two numbers, the possible
values of X are 1, 2, 3, 4, 5, 6.

For minimum 1, the favourable cases are 11 (all pairs containing at least one 1).

Therefore P(X = 1) = 11/36, and similarly

P(X = 2) = 9/36, P(X = 3) = 7/36, P(X = 4) = 5/36, P(X = 5) = 3/36, P(X = 6) = 1/36

The probability distribution is

X      1      2     3     4     5     6
P(x)   11/36  9/36  7/36  5/36  3/36  1/36

(ii) Expectation: mean μ = Σ p_i x_i

E(X) = 1(11/36) + 2(9/36) + 3(7/36) + 4(5/36) + 5(3/36) + 6(1/36)

     = (11 + 18 + 21 + 20 + 15 + 6)/36 = 91/36 = 2.5278

(iii) Variance = Σ p_i x_i² − μ²

E(X²) = 1(11/36) + 4(9/36) + 9(7/36) + 16(5/36) + 25(3/36) + 36(1/36) = 301/36 = 8.3611

Variance = 8.3611 − (2.5278)² = 1.9713
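As a quick cross-check of Problem 2, the following Python sketch (not part of the original solution; the seed and number of trials are arbitrary choices) estimates E(X) and Var(X) by simulating the minimum of two fair dice.

# Sketch: Monte Carlo cross-check of Problem 2 (minimum of two fair dice).
# Exact answers above: E(X) = 91/36 = 2.5278 and Var(X) = 1.9713.
import random

random.seed(1)
trials = 200_000
mins = [min(random.randint(1, 6), random.randint(1, 6)) for _ in range(trials)]

mean = sum(mins) / trials
var = sum((m - mean) ** 2 for m in mins) / trials
print(round(mean, 3), round(var, 3))   # close to 2.528 and 1.971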

3: A continuous random variable has the probability density function

f(x) = kx e^{−αx}   for x ≥ 0, α > 0
     = 0            otherwise

Determine (i) k (ii) Mean (iii) Variance

Solution:

(i) Since the total probability is unity, we have ∫_{−∞}^{∞} f(x) dx = 1

∫_{−∞}^{0} 0 dx + ∫_{0}^{∞} kx e^{−αx} dx = 1

i.e., ∫_{0}^{∞} kx e^{−αx} dx = 1

k [ x(−e^{−αx}/α) − e^{−αx}/α² ]_{0}^{∞} = 1, i.e. k/α² = 1, or k = α²

(ii) Mean of the distribution μ = ∫_{−∞}^{∞} x f(x) dx

= ∫_{0}^{∞} k x² e^{−αx} dx

= k [ x²(−e^{−αx}/α) − 2x e^{−αx}/α² − 2 e^{−αx}/α³ ]_{0}^{∞} = 2k/α³

= 2/α

(iii) Variance of the distribution σ² = ∫_{−∞}^{∞} x² f(x) dx − μ²

= ∫_{0}^{∞} k x³ e^{−αx} dx − 4/α²

= k [ x³(−e^{−αx}/α) − 3x² e^{−αx}/α² − 6x e^{−αx}/α³ − 6 e^{−αx}/α⁴ ]_{0}^{∞} − 4/α² = 6k/α⁴ − 4/α²

= 6/α² − 4/α² = 2/α²

4: Out of 800 families with 5 children each, how many would you expect to have (i) 3
boys (ii) 5 girls (iii) either 2 or 3 boys? Assume equal probabilities for boys and girls.

Solution:

(i) P(3 boys) = P(r = 3) = P(3) = 5C3 (1/2)^5 = 5/16 per family

Thus for 800 families the expected number of families having 3 boys =
800 × 5/16 = 250 families

(ii) P(5 girls) = P(no boys) = P(r = 0) = 5C0 (1/2)^5 = 1/32 per family

Thus for 800 families the expected number of families having 5 girls =
800 × 1/32 = 25 families

(iii) P(either 2 or 3 boys) = P(r = 2) + P(r = 3) = P(2) + P(3)

= 5C2 (1/2)^5 + 5C3 (1/2)^5 = 5/8 per family

Expected number of families with 2 or 3 boys = 800 × 5/8 = 500 families.

5: The average number of accidents on any day on a national highway is 1.8. Determine the
probability that the number of accidents is (i) at least one (ii) at most one.

Solution:

Mean = λ = 1.8

We have P(X = x) = p(x) = e^{−λ} λ^x / x! = e^{−1.8} (1.8)^x / x!

(i) P(at least one) = P(X ≥ 1) = 1 − P(X = 0)

= 1 − 0.1653

= 0.8347

(ii) P(at most one) = P(X ≤ 1)

= P(X = 0) + P(X = 1)

= 0.1653 + 0.2975 = 0.4628

6: The mean weight of 800 male students at a certain college is 140 kg and the standard deviation
is 10 kg. Assuming that the weights are normally distributed, find how many students weigh
i) between 138 and 148 kg ii) more than 152 kg.

Solution:

Let μ be the mean and σ the standard deviation. Then μ = 140 kg and σ = 10 kg.

(i) When x = 138, z = (x − μ)/σ = (138 − 140)/10 = −0.2 = z1

When x = 148, z = (x − μ)/σ = (148 − 140)/10 = 0.8 = z2

P(138 ≤ x ≤ 148) = P(−0.2 ≤ z ≤ 0.8)
= A(z2) + A(z1)
= A(0.8) + A(0.2) = 0.2881 + 0.0793 = 0.3674

Hence the number of students whose weights are between 138 kg and 148 kg
= 0.3674 × 800 = 294 (approximately)

(ii) When x = 152, z = (x − μ)/σ = (152 − 140)/10 = 1.2 = z1

Therefore P(x > 152) = P(z > z1) = 0.5 − A(z1)

= 0.5 − 0.3849 = 0.1151

Therefore the number of students whose weights are more than 152 kg = 800 × 0.1151 = 92 (approximately).
Exercise Problems:

1. Two coins are tossed simultaneously. Let X denote the number of heads; then find i)
E(X) ii) E(X²) iii) E(X³) iv) V(X)
2. If f(x) = k e^{−|x|} is a probability density function in the interval −∞ < x < ∞, then find i) k
ii) Mean iii) Variance iv) P(0 < x < 4)
3. Out of 20 tape recorders 5 are defective. Find the standard deviation of defectives in a
sample of 10 randomly chosen tape recorders. Find (i) P(X=0) (ii) P(X=1) (iii) P(X=2)
(iv) P(1 < X < 4).
4. In 1000 sets of trials of an event of small probability, the frequencies f of the number x
of successes are

x   0    1    2    3   4   5  6  7  Total
f   305  365  210  80  28  9  2  1  1000

Fit the expected frequencies.

5. If X is a normal variate with mean 30 and standard deviation 5, find the probabilities
i) P(26 ≤ X ≤ 40) ii) P(X ≥ 45)
6. The marks obtained in Statistics in a certain examination are found to be normally
distributed. If 15% of the students scored greater than or equal to 60 marks and 40% scored less than 30
marks, find the mean and standard deviation.
7. If a Poisson distribution is such that P(X = 1) = (3/2) P(X = 3), then find (i) P(X ≥ 1) (ii)
P(X ≤ 3) (iii) P(2 ≤ X ≤ 5).
8. A random variable X has the following probability function:

X     -2   -1  0    1   2    3
P(x)  0.1  K   0.2  2K  0.3  K

Then find (i) k (ii) mean (iii) variance (iv) P(0 < x < 3)
UNIT-II
MULTIPLE RANDOM VARIABLES
Joint Distributions: Two Random Variables

In real life, we are often interested in several random variables that are related to each other. For
example, suppose that we choose a random family, and we would like to study the number of
people in the family, the household income, the ages of the family members, etc. Each of these is
a random variable, and we suspect that they are dependent. In this chapter, we develop tools to
study joint distributions of random variables. The concepts are similar to what we have seen so
far. The only difference is that instead of one random variable, we consider two or more. In this
chapter, we will focus on two random variables, but once you understand the theory for two
random variables, the extension to n random variables is straightforward. We will first discuss joint
distributions of discrete random
variables and then extend the results to continuous random variables.

Joint Probability Mass Function (PMF)


Remember that for a discrete random variable X, we define the PMF as
PX(x)=P(X=x). Now, if we have two random variables X and Y, and we would like to
study them jointly, we define the joint probability mass function as follows:
The joint probability mass function of two discrete random variables X and Y is defined as
PXY(x,y)=P(X=x,Y=y).
Note that as usual, the comma means "and," so we can write
PXY(x,y)=P(X=x,Y=y)=P((X=x) and (Y=y)).
We can define the joint range for X and Y as
RXY={(x,y)|PXY(x,y)>0}.
In particular, if RX={x1,x2,...} and RY={y1,y2,...}, then we can always write
RXY⊂RX×RY={(xi,yj)|xi∈RX,yj∈RY}.
In fact, sometimes we define RXY=RX×RY to simplify the analysis. In this case, for
some pairs (xi,yj) in RX×RY, PXY(xi,yj) might be zero. For two discrete random
variables X and Y, we have
∑_{(xi,yj)∈RXY} PXY(xi,yj) = 1

Marginal PMFs
The joint PMF contains all the information regarding the distributions of X and Y. This means
that, for example, we can obtain PMF of X from its joint PMF with Y. Indeed, we can write

PX(x) = P(X = x) = ∑_{yj∈RY} P(X = x, Y = yj) = ∑_{yj∈RY} PXY(x, yj)   (law of total probability)

Here, we call PX(x) the marginal PMF of X. Similarly, we can find the marginal PMF of Y as
PY(y) = ∑_{xi∈RX} PXY(xi, y).

Marginal PMFs of X and Y

PX(x) = ∑_{yj∈RY} PXY(x, yj), for any x∈RX;   PY(y) = ∑_{xi∈RX} PXY(xi, y), for any y∈RY

 Correlation: In a bivariate distribution, if the change in one variable affects the
change in the other variable, then the variables are called correlated.
 Covariance between two random variables X and Y, denoted by Cov(X,Y), is
defined as E(XY) − E(X)E(Y).
 If X and Y are independent then Cov(X,Y) = 0.
 The Karl Pearson Correlation Coefficient between two r.v.'s X and Y, usually denoted by
r(X,Y) or simply rXY, is a numerical measure of the linear relationship between them
and is defined as r = r(X,Y) = Cov(X,Y)/(σX σY).
It is also called product moment correlation coefficient.

 If (xi, yi); i = 1, 2, ..., n is a bivariate distribution, then

Cov(X,Y) = E[{X − E(X)}{Y − E(Y)}]

= (1/n) Σ(xi − x̄)(yi − ȳ) = (1/n) Σ xi yi − x̄ ȳ

σX² = E[X − E(X)]² = (1/n) Σ(xi − x̄)² = (1/n) Σ xi² − x̄²

σY² = E[Y − E(Y)]² = (1/n) Σ(yi − ȳ)² = (1/n) Σ yi² − ȳ²

 Computational formula: r(X,Y) = [(1/n) Σ xy − x̄ ȳ] / √{[(1/n) Σ x² − x̄²][(1/n) Σ y² − ȳ²]}

 -1  r  1
 If r = 0 then X, Y are uncorrelated.
 If r = -1 then correlation is perfect and negative.
 If r = 1then the correlation is perfect and positive.
 r is independent of change of origin and scale
 Two independent variables are uncorrelated. Converse need not be true.
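A small Python sketch (not from the original notes) applying the computational formula for r given above; the x, y data are the values used in worked Problem 3 later in this unit.

# Sketch: Karl Pearson correlation coefficient from the computational formula
# r = [(1/n) sum(xy) - x̄·ȳ] / sqrt{[(1/n) sum(x²) - x̄²][(1/n) sum(y²) - ȳ²]}.
from math import sqrt

x = [12, 9, 8, 10, 11, 13, 7]
y = [14, 8, 6, 9, 11, 12, 13]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
vx = sum(a * a for a in x) / n - mx * mx
vy = sum(b * b for b in y) / n - my * my

r = cov / sqrt(vx * vy)
print(round(r, 2))   # about 0.43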
 The correlation coefficient for a bivariate frequency distribution:
The bivariate data on X and Y are presented in a two-way correlation table with n
classes of Y placed along the horizontal lines and m classes of X along the vertical lines,
and fij is the frequency of the individuals lying in the (i,j)-th cell.

Σ_x f(x,y) = g(y) is the sum of the frequencies along any row, and

Σ_y f(x,y) = f(x) is the sum of the frequencies along any column.

Σ_x Σ_y f(x,y) = Σ_y Σ_x f(x,y) = Σ_x f(x) = Σ_y g(y) = N

x̄ = (1/N) Σ_x x f(x),   ȳ = (1/N) Σ_y y g(y)

σX² = (1/N) Σ_x x² f(x) − x̄²   and   σY² = (1/N) Σ_y y² g(y) − ȳ²

Cov(X,Y) = (1/N) Σ_x Σ_y xy f(x,y) − x̄ ȳ

r = Cov(X,Y)/(σX σY)

 Rank Correlation: Let (xi, yi) for i = 1, 2, ..., n be the ranks of the i-th individual in the
characteristics A and B respectively. The Pearsonian coefficient of correlation between the xi
and the yi is called the rank correlation coefficient between A and B for that group of
individuals.
 The Spearman's rank correlation coefficient between the two variables X and Y taking the
values 1, 2, ..., n is denoted by ρ and is defined as

ρ = 1 − [6 Σ_{i=1}^{n} di²] / [n(n² − 1)]

where di = xi − yi (in general xi ≠ yi).

In case common ranks are given to repeated items, the common rank is the average of
the ranks which these items would have assumed if they were slightly different from each
other, and the next item will get the rank next to the ranks already assumed. An adjustment or
correction is then made in the rank correlation formula: we add the factor m(m² − 1)/12 to
Σ d², where m is the number of times an item is repeated. This correction factor is to be added
for each repeated value in both the X-series and the Y-series.

 −1 ≤ ρ ≤ 1
 Regression analysis is a mathematical measure of the average relationship between
two or more variables in terms of the original units of the data.
 The variable whose value is influenced or is to be predicted is called dependent
variable and the variable, which influences the values or is used for the prediction is
called independent variable. Independent variable is also known as regressor or
predictor or explanatory variable while the dependent variable is also known as
regressed or explained variable.
 If the variables in bivariate distributions are related we will find that the points in the
scatter diagram will cluster round some curve called the “ curve of regression”. If the
curve is a straight line, it is called line of regression and
there is said to be linear Regression between the variables, otherwise the regression
is said to be curvilinear. The line of regression is line of best fit and is obtained by
principle of least squares.

 In the bivariate distribution (xi, yi); i = 1, 2, ..., n, let Y be the dependent variable and X the
independent variable. The line of regression of Y on X is Y = a + bX,

i.e. Y − ȳ = r (σY/σX)(X − x̄)

Similarly the line of regression of X on Y is X = a + bY,

i.e. X − x̄ = r (σX/σY)(Y − ȳ)

If X and Y are any random variables the two regression lines are

Y − E(Y) = [Cov(X,Y)/σX²] [X − E(X)]

X − E(X) = [Cov(X,Y)/σY²] [Y − E(Y)]

 Both lines of regression pass through the point (x̄, ȳ), i.e. the mean values (x̄, ȳ)
can be obtained at the point of intersection of the regression lines.
 The slope of the regression line of Y on X is also called the regression coefficient of Y on X. It
represents the increment in the value of the dependent variable Y corresponding to a unit
change in the value of the independent variable X. We write bYX = regression
coefficient of Y on X = r σY/σX.

 Similarly the coefficient of regression of X on Y indicates the change in the value of
variable X corresponding to a unit change in the value of variable Y and is given by bXY
= regression coefficient of X on Y = r σX/σY.

 The correlation coefficient is the geometric mean of the two regression coefficients.

 The sign of the correlation coefficient is the same as that of the regression coefficients.
 If one of the regression coefficients is greater than unity, the other must be less than
unity.
 The modulus value of the arithmetic mean of the regression coefficients is not less than the
modulus value of the correlation coefficient r.
 Regression coefficients are independent of the change of origin but not of scale.
 If θ is the acute angle between the two lines of regression then

θ = tan⁻¹ [ ((1 − r²)/|r|) · σX σY/(σX² + σY²) ]

If r = 0 then the variables X and Y are uncorrelated. The lines of regression are Y = ȳ
and X = x̄, which are perpendicular to each other and are parallel to the x-axis and y-
axis respectively.

If r = ±1, the two lines of regression coincide.

 Regression Curves: The conditional mean E(Y | X = x) for a continuous distribution
is called the regression function of Y on X, and the graph of this function of x is known
as the regression curve of Y on X.
The regression function of X on Y is E(X | Y = y), and the graph of this function of y
is called the regression curve (of the mean) of X on Y.

 Multiple Regression analysis is an extension of (simple) regression analysis in which
two or more independent variables are used to estimate the value of the dependent
variable.

 Least squares regression planes: fitting of N data points (X1, X2, X3) in a three-
dimensional scatter diagram. The least squares regression plane of X1 on X2 and X3 is
X1 = a + bX2 + cX3, where a, b, c are determined by solving simultaneously the normal
equations:

Σ X1 = aN + b Σ X2 + c Σ X3

Σ X1X2 = a Σ X2 + b Σ X2² + c Σ X2X3

Σ X1X3 = a Σ X3 + b Σ X2X3 + c Σ X3²

Similarly for the regression plane of X2 on X1 and X3 and the regression plane of X3
on X1 and X2.

 The linear regression equation of X1 on X2, X3 and X4 can be written as

X1 = a + b X2 + c X3 + d X4
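A short Python sketch (not from the original notes) that fits the simple regression line y = a + bx by solving the two normal equations quoted earlier; the data are those of worked Problem 5 below.

# Sketch: fitting y = a + b x from the normal equations
#   sum(y) = n a + b sum(x),   sum(xy) = a sum(x) + b sum(x²).
x = [10, 12, 13, 16, 17, 20, 25]
y = [10, 22, 24, 27, 29, 33, 37]
n = len(x)

sx, sy = sum(x), sum(y)
sxx = sum(a * a for a in x)
sxy = sum(a * b for a, b in zip(x, y))

# Solve the 2x2 linear system for a and b.
b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a = (sy - b * sx) / n
print(round(a, 4), round(b, 4))   # about 0.799 and 1.561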

PROBLEMS:

1. Let X and Y be two random variables with joint probability density
function f(x,y) = e^{−y}, 0 < x < y; 0 otherwise. Find the marginal probability density
functions of X and Y.

Solution: Given that f(x,y) = e^{−y} for 0 < x < y, and 0 otherwise.

The marginal probability density function of X is

fX(x) = ∫ f(x,y) dy = ∫_{x}^{∞} e^{−y} dy = [−e^{−y}]_{x}^{∞} = e^{−x},  x > 0

The marginal probability density function of Y is

fY(y) = ∫ f(x,y) dx = ∫_{0}^{y} e^{−y} dx = [x e^{−y}]_{0}^{y} = y e^{−y},  y > 0

2. Determine b for the joint probability density function

f(x,y) = b e^{−(x+y)},  0 < x < a, 0 < y < ∞
       = 0              otherwise

Solution: Given f(x,y) = b e^{−(x+y)} for 0 < x < a, 0 < y < ∞, and 0 otherwise.

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) dx dy = 1

∫_{y=0}^{∞} ∫_{x=0}^{a} b e^{−(x+y)} dx dy = 1

∫_{y=0}^{∞} b e^{−y} [−e^{−x}]_{0}^{a} dy = 1

∫_{y=0}^{∞} b e^{−y} (1 − e^{−a}) dy = 1

b (1 − e^{−a}) [−e^{−y}]_{0}^{∞} = 1

b (1 − e^{−a}) = 1

b = 1/(1 − e^{−a})

3. Calculate the coefficient of correlation from the following data

x  12  9  8  10  11  13  7
y  14  8  6  9   11  12  13

Solution: Here X = x − x̄ and Y = y − ȳ

x̄ = Σxi/n = (12 + 9 + 8 + 10 + 11 + 13 + 7)/7 = 10

ȳ = Σyi/n = (14 + 8 + 6 + 9 + 11 + 12 + 13)/7 = 10.4

x    y    X = x − x̄   Y = y − ȳ   XY     X²   Y²
12   14   2           3.6         7.2    4    12.9
9    8    -1          -2.4        2.4    1    5.7
8    6    -2          -4.4        8.8    4    19.3
10   9    0           -1.4        0      0    1.9
11   11   1           0.6         0.6    1    0.3
13   12   3           1.6         4.8    9    2.5
7    13   -3          2.6         -7.8   9    6.7

ΣXY = 16,  ΣX² = 28,  ΣY² = 49.3

Correlation coefficient r = ΣXY / √(ΣX² · ΣY²)

= 16/√(28 × 49.3)

r = 0.43

r is positive.

4. The ranks of 16 students in Mathematics and Statistics are as follows:
(1,1),(2,10),(3,3),(4,4),(5,5),(6,7),(7,2),(8,6),(9,8),(10,11),(11,15),(12,9),(13,14),(14,12),(15,16),(16,13).
Calculate the rank correlation coefficient for the proficiencies of this group in Mathematics and Statistics.

Solution:

Ranks in Mathematics (X)   Ranks in Statistics (Y)   D = X − Y   D²
1                          1                         0           0
2                          10                        -8          64
3                          3                         0           0
4                          4                         0           0
5                          5                         0           0
6                          7                         -1          1
7                          2                         5           25
8                          6                         2           4
9                          8                         1           1
10                         11                        -1          1
11                         15                        -4          16
12                         9                         3           9
13                         14                        -1          1
14                         12                        2           4
15                         16                        -1          1
16                         13                        3           9

Σ D² = 136

Rank correlation coefficient ρ = 1 − 6 Σ D² / [N(N² − 1)]

= 1 − (6 × 136)/(16 × 255)

ρ = 0.8

5. Determine the regression equation which best fits the following data:

x  10  12  13  16  17  20  25
y  10  22  24  27  29  33  37

Solution: The regression equation of y on x is y = a + bx

The normal equations are

Σy = na + b Σx

Σxy = a Σx + b Σx²

x    y    x²    xy
10   10   100   100
12   22   144   264
13   24   169   312
16   27   256   432
17   29   289   493
20   33   400   660
25   37   625   925

Σx = 113,  Σy = 182,  Σx² = 1983,  Σxy = 3186

Substituting these values in the normal equations:

182 = 7a + 113b          ...(1)

3186 = 113a + 1983b      ...(2)

Solving equations (1) and (2) we get a = 0.7985 and b = 1.5611.

Substituting a and b in the regression equation, the regression equation of y on x is y = 0.7985 + 1.5611x

6. Given the following data, compute the multiple coefficient of correlation of X3 on X1
and X2.

X1  3   5   6   8   12  14
X2  16  10  7   4   3   2
X3  90  72  54  42  30  12

Solution: here n = 6, X̄1 = 48/6 = 8, X̄2 = 42/6 = 7, X̄3 = 300/6 = 50

Now we calculate the values of r12, r13 and r23 using

x1 = X1 − X̄1,  x2 = X2 − X̄2,  x3 = X3 − X̄3

S.No    X1   x1   x1²   X2   x2   x2²   X3   x3    x3²    x1x2   x2x3   x3x1
1       3    -5   25    16   9    81    90   40    1600   -45    360    -200
2       5    -3   9     10   3    9     72   22    484    -9     66     -66
3       6    -2   4     7    0    0     54   4     16     0      0      -8
4       8    0    0     4    -3   9     42   -8    64     0      24     0
5       12   4    16    3    -4   16    30   -20   400    -16    80     -80
6       14   6    36    2    -5   25    12   -38   1444   -30    190    -228
Total   48   0    90    42   0    140   300  0     4008   -100   720    -582

r12 = Σx1x2 / √(Σx1² Σx2²) = −100/√(90 × 140) = −0.89

r13 = Σx1x3 / √(Σx1² Σx3²) = −582/√(90 × 4008) = −0.97

r23 = Σx2x3 / √(Σx2² Σx3²) = 720/√(140 × 4008) = 0.96

R3.12 = √[(r13² + r23² − 2 r13 r23 r12)/(1 − r12²)] ≈ 0.99

Exercise Problems:

1. Calculate the Karl Pearson's coefficient of correlation from the following data
x  15  18  20  24  30  35  40  50
y  85  93  95  105 120 130 150 160
2. A sample of 12 fathers and their elder sons gave the following data.
Calculate the coefficient of rank correlation.

Fathers  65  63  67  64  68  62  70  66  68  67  69  71
Sons     68  66  68  65  69  66  68  65  71  67  68  70

3. Find the most likely production corresponding to a rainfall of 40 from the following data:

                                Rain fall (X)   Production (Y)
Average                         30              500 Kgs
Standard deviation              5               100 Kgs
Coefficient of correlation      0.8

4. If θ is the angle between the two regression lines, the S.D. of Y is twice the S.D. of X and
r = 0.25, find tan θ.

5. The joint probability density function is f(x,y) = A e^{xy}, 0 ≤ x ≤ y, 0 ≤ y ≤ ∞; 0 otherwise.
Determine A.

6. Determine the regression equation which best fits the following data:

x  10  12  13  16  17  20  25
y  10  22  24  27  29  33  37

UNIT-III
SAMPLING DISTRIBUTION AND
TESTING OF HYPOTHESIS
Sampling Distribution

 Population is the set or collection or totality of the objects, animate or inanimate, actual
or hypothetical under study. Thus, mainly population consists of set of numbers
measurements or observations, which are of interest.
 Size of the population N is the number of objects or observations in the population.
 Population may be finite or infinite.
 A finite sub-set of the population is known as Sample. Size of the sample is denoted by
n.
 Sampling is the process of drawing the samples from a given population.
 If n  30 the sampling is said to be large sampling.
 If n < 30 then the sampling is said to be Small sampling.
 Statistical inference deals with the methods of arriving at valid generalizations and
predictions about the population using the information contained in the sample.
 Parameters Statistical measures or constants obtained from the population are known as
population parameters or simply parameters.
 Population f(x) is a population whose probability distribution is f(x).If f(x) is binomial,
Poisson or normal then the corresponding population is known as Binomial Population,
Poisson population or normal Population.
 Samples must be representative of the population, sampling should be random.
 Random Sampling is one in which each member of the population has equal chances or
probability of being included in the sample.
 Sampling where each member of a population may be chosen, more than once is called
Sampling with replacement. A finite population, which is sampled with replacement,
can theoretically be considered infinite since samples of any size can be drawn with out
exhausting the population. For most practical purpose sampling from a finite population,
which is very large, can be considered as sampling from an infinite population.
 If each member cannot be chosen more than once it is called sampling with out
replacement.
 Any quantity obtained from a sample for the purpose of estimating a population
parameter is called a sample statistics or briefly Statistic. Mathematically a sample
statistic for a sample of size n can be defined as a function of the random variables X 1,
X2……Xn i.e., g(X1, X2……Xn). The function g(X1, X2……Xn) is another random
variable whose values can be represented by g(X1, X2……Xn). The word statistic is often
used for the r.v. or for its values.
 Random samples (Finite population): A set of observations X 1, X2……Xn, constitute a
random sample of size n from a finite population of size N, if its values are chosen so that
each subset of n of the N elements of the population has same probability if being
selected.
 Random sample (Infinite Population): A set of observations X 1, X2……Xn constitute a
random sample of size n from infinite population f(x) if:
(i) Each Xi is a r.v. whose distribution is given by f(x)
(ii) These n r.v.‟s are independent
 Sample Mean: if X1, X2, ..., Xn is a random sample of size n, the sample mean is a r.v.
defined by X̄ = (X1 + X2 + ... + Xn)/n
 Sample Variance: if X1, X2, ..., Xn is a random sample of size n, the sample variance is a r.v.
defined by S² = Σ_{i=1}^{n} (Xi − X̄)²/n and is a measure of the variability of the data about the mean.
 Sample Standard deviation is the positive square root of the sample variance.
 Degrees of freedom (d.o.f.) of a statistic is the positive integer, denoted by ν, equal to n − k,
where n is the number of independent observations of the random sample and k is the
number of population parameters which are calculated using the sample data. Thus d.o.f. ν =
n − k is the difference between n, the sample size, and k, the number of independent
constraints imposed on the observations in the sample.
 Sampling Distributions: The probability distribution of a sample statistic is often called
as sampling distribution of the statistic.
 The standard deviation of the sampling distribution of a statistic is called Standard
Error(S.E)
 The mean of the sampling distribution of means, denoted by μ_x̄, is given by E(X̄) = μ_x̄ =
μ, where μ is the mean of the population.
 If a population is infinite or if sampling is with replacement, then the variance of the
sampling distribution of means, denoted by σ_x̄², is given by E[(X̄ − μ)²] = σ_x̄² = σ²/n,
where σ² is the variance of the population.
 If the population is of size N, if sampling is without replacement, and if the sample size is
n ≤ N, then σ_x̄² = (σ²/n)·(N − n)/(N − 1)

 The factor (N − n)/(N − 1) is called the finite population correction factor; it is close to 1 (and can
be omitted for most practical purposes) unless the sample constitutes a substantial
portion of the population.
 (Central limit theorem) If X̄ is the mean of a sample of size n taken from a population
having mean μ and finite variance σ², then Z = (X̄ − μ)/(σ/√n) is a r.v. whose
distribution function approaches that of the standard normal distribution as n → ∞.

 If X̄ is the mean of a sample of size n taken from a finite population of size N
with mean μ and variance σ², then Z = (X̄ − μ)/[(σ/√n)√((N − n)/(N − 1))] is a r.v. whose distribution function
approaches that of the standard normal distribution as n → ∞.

 The normal distribution provides an excellent approximation to the sampling distribution of
the mean X̄ for n as small as 25 or 30.
 If the random samples come from a normal population, the sampling distribution of the
mean is normal regardless of the size of the sample.
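A short Python sketch (not from the original notes) illustrating E(X̄) = μ and Var(X̄) = σ²/n by enumerating every sample of size 2 drawn with replacement from the small population used in Problem 1 of this unit.

# Sketch: sampling distribution of the mean for samples of size 2 drawn
# with replacement from the population {2, 3, 6, 8, 11}.
from itertools import product

pop = [2, 3, 6, 8, 11]
N = len(pop)

mu = sum(pop) / N
sigma2 = sum((x - mu) ** 2 for x in pop) / N

means = [sum(s) / 2 for s in product(pop, repeat=2)]   # all 25 sample means
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

print(mu, sigma2)           # 6.0 and 10.8
print(mu_xbar, var_xbar)    # 6.0 and 5.4 = sigma^2 / n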

Estimation

 To find an unknown population parameter, a judgment or statement is made which is an


estimate.
 The method or rule to determine an unknown population parameter is called an
Estimator. For example sample mean is an estimator of population mean because sample
mean is a method of determining the population mean. A parameter can have many or 1,2
estimators. The estimators should be found so that they are very near to parameter values.
 An estimate of a population parameter given by a single number is called a point
estimate of the parameter. If we say that a distance in 5.28 mts , we are giving a point
estimate.
 An estimate of a population parameters given by two numbers between which the
parameter may be considered to lie is called an interval estimate of the parameter. The
distance lie between 5.25 and 5.31 mts.
 A statement of the error or precision of the estimate is called Reliability.
 Let θ be the parameter of interest and θ̂ be a statistic; the hat notation distinguishes
the sample-based quantity from the parameter.
 A statistic θ̂ is said to be an unbiased estimator, or its value an unbiased estimate, iff the
mean of the sampling distribution of the estimator equals θ.
 E(X̄) = μ and E(Ŝ²) = σ², so that X̄ and Ŝ² are unbiased estimators of the
population mean μ and variance σ².
 A statistic θ̂1 is said to be a more efficient unbiased estimator of the parameter θ than the
statistic θ̂2 if
(i) θ̂1 and θ̂2 are both unbiased estimators of θ
(ii) The variance of the sampling distribution of the first estimator is less than that of the
second. E.g. the sampling distributions of the mean and the median both have the same mean,
namely the population mean. However, the variance of the sampling distribution
of means is smaller than that of the sampling distribution of medians. Thus the
mean provides a more efficient estimate than the median.
 Maximum error E of a population mean μ by using a large sample mean is

E = Z_{α/2} σ/√n

 The most widely used values for 1 − α are 0.95 and 0.99, and the corresponding values
of Z_{α/2} are Z_{0.025} = 1.96 and Z_{0.005} = 2.575.

 Sample size n = [Z_{α/2} σ/E]²

 Confidence interval for μ (for large samples n ≥ 30), σ known:

x̄ − Z_{α/2} σ/√n < μ < x̄ + Z_{α/2} σ/√n

 If the sampling is without replacement from a population of finite size N, then the
confidence interval for μ with σ known is

x̄ − Z_{α/2} (σ/√n)√((N − n)/(N − 1)) < μ < x̄ + Z_{α/2} (σ/√n)√((N − n)/(N − 1))

 Large sample confidence interval for μ, σ unknown:

x̄ − Z_{α/2} S/√n < μ < x̄ + Z_{α/2} S/√n

 Large sample confidence interval for μ1 − μ2 (where σ1 and σ2 are unknown):

(x̄1 − x̄2) ± Z_{α/2} √(S1²/n1 + S2²/n2)
 The end points of the confidence interval are called Confidence Limits.
 In Bayesian estimation, prior feelings about the possible values of μ are combined with
the direct sample evidence to give the posterior distribution of μ, which is approximately
normal with

mean μ1 = (n x̄ σ0² + μ0 σ²)/(n σ0² + σ²) and standard deviation σ1 = √[σ0² σ²/(n σ0² + σ²)].

In the computation of μ1 and σ1, σ² is assumed to be known. When σ² is unknown, which is generally the
case, σ² is replaced by the sample variance S², provided n ≥ 30 (large sample).

 Bayesian Interval for μ:

The (1 − α)100% Bayesian interval for μ is given by

μ1 − Z_{α/2} σ1 < μ < μ1 + Z_{α/2} σ1
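A minimal Python sketch (not from the original notes) of the large-sample confidence interval x̄ ± Z_{α/2} σ/√n; the numbers are those of worked Problem 5 later in this unit.

# Sketch: large-sample (1 - alpha)100% confidence interval for the mean.
from math import sqrt

xbar, sigma, n = 212.3, 0.5, 100
z = 1.96                             # Z_{alpha/2} for 95% confidence

half_width = z * sigma / sqrt(n)     # maximum error E
lo, hi = xbar - half_width, xbar + half_width
print(round(lo, 3), round(hi, 3))    # 212.202 and 212.398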

Inferences concerning means

 Statistical decisions are decisions or conclusions about the population parameters on the
basis of a random sample from the population.
 Statistical hypothesis is an assumption or conjecture or guess about the parameters of the
population distribution
 Null Hypothesis (N.H) denoted by H0 is statistical hypothesis, which is to be actually
tested for acceptance or rejection. NH is the hypothesis, which is tested for possible
rejection under the assumption that it is true.
 Any Hypothesis which is complimentary to the N.H is called an Alternative Hypothesis
denoted by H1
 A Simple Hypothesis is a statistical hypothesis which completely specifies an exact
parameter value. The N.H. is always a simple hypothesis stated as an equality specifying an exact value
of the parameter, e.g. N.H.: H0: μ = μ0, or N.H.: H0: μ1 − μ2 = δ.
 A Composite Hypothesis is stated in terms of several possible values.
 The Alternative Hypothesis (A.H.) is a composite hypothesis involving statements expressed as
inequalities such as <, > or ≠:
i) A.H.: H1: μ > μ0 (right tailed)  ii) A.H.: H1: μ < μ0 (left tailed)

iii) A.H.: H1: μ ≠ μ0 (two tailed alternative)

 Errors in sampling
Type I error: Reject H0 when it is true

Type II error: Accept H0 when it is wrong (i.e) accept if when H1 is true.

Accept H0 Reject H0

H0 is True Correct Decision Type 1 error

H0 is False Type 2 error Correct Decision

 If P{Reject H0 when it is true} = P{Reject H0 | H0} = α and

P{Accept H0 when it is false} = P{Accept H0 | H1} = β, then α and β are called the sizes of
the Type I error and Type II error respectively. In practice, a Type I error amounts to rejecting a
lot when it is good, and a Type II error may be regarded as accepting the lot when it is bad.

 α and β are referred to as the producer's risk and consumer's risk respectively.


 A region (corresponding to a statistic t) in the sample space S that amounts to rejection of
H0 is called critical region of rejection.
 Level of significance is the size of the type I error ( or maximum producer‟s risk)
 The levels of significance usually employed in testing of hypothesis are 5% and 1% and
is always fixed in advance before collecting the test information.
 A test of any statistical hypothesis where the A.H. is one tailed (right tailed or left tailed) is
called a one-tailed test. If the A.H. is two-tailed, such as H0: μ = μ0 against the A.H. H1: μ ≠
μ0 (μ > μ0 and μ < μ0), it is called a Two-Tailed Test.
 The value of test statistics which separates the critical ( or rejection) region and the
acceptance region is called Critical value or Significant value. It depends upon (i) The
level of significance used and (ii) The Alternative Hypothesis, whether it is two-tailed or
single tailed
 From the normal probability tables we get:

Critical Value (Zα)      Level of significance (α)
                         1%              5%               10%
Two-Tailed test          −Zα/2 = −2.58   −Zα/2 = −1.96    −Zα/2 = −1.645
                         Zα/2 = 2.58     Zα/2 = 1.96      Zα/2 = 1.645
Right-Tailed test        Zα = 2.33       Zα = 1.645       Zα = 1.28
Left-Tailed Test         −Zα = −2.33     −Zα = −1.645     −Zα = −1.28

 When the size of the sample is increased, the probability of committing both types of
error I and II (i.e)  and  are small, the test procedure is good one giving good chance of
making the correct decision.
 P-value is the lowest level ( of significance) at which observed value of the test statistic is
significant.
 A test of Hypothesis (T. O.H) consists of
1. Null Hypothesis (NH) : H0
2. Alternative Hypothesis (AH) : H1
3. Level of significance: 
4. Critical Region pre determined by 
5. Calculation of test statistic based on the sample data.
6. Decision to reject NH or to accept it.
PROBLEMS:

1. A population consists of the five numbers 2, 3, 6, 8 and 11. Consider all possible
samples of size two which can be drawn with replacement from this population. Find

i) The mean of the population

ii) The standard deviation of the population

iii) The mean of the sampling distribution of means

iv) The standard deviation of the sampling distribution of means

Solution: Given that N = 5 and n = 2.

i. Mean of the population

μ = Σxi/N = (2 + 3 + 6 + 8 + 11)/5 = 30/5 = 6

ii. Variance of the population

σ² = Σ(xi − x̄)²/N = [(2−6)² + (3−6)² + (6−6)² + (8−6)² + (11−6)²]/5

= (16 + 9 + 0 + 4 + 25)/5

= 10.8, so σ = 3.29

Sampling with replacement (infinite population):

The total number of samples with replacement is N^n = 5² = 25.

The 25 samples that can be drawn are

(2,2)  (2,3)  (2,6)  (2,8)  (2,11)
(3,2)  (3,3)  (3,6)  (3,8)  (3,11)
(6,2)  (6,3)  (6,6)  (6,8)  (6,11)
(8,2)  (8,3)  (8,6)  (8,8)  (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)

The sample means are

2    2.5  4    5    6.5
2.5  3    4.5  5.5  7
4    4.5  6    7    8.5
5    5.5  7    8    9.5
6.5  7    8.5  9.5  11

iii. The mean of the sampling distribution of means is

μ_x̄ = (2 + 2.5 + 4 + 5 + 6.5 + ... + 11)/25 = 6

iv. The variance of the sampling distribution of means is

σ_x̄² = [(2 − 6)² + (2.5 − 6)² + ... + (11 − 6)²]/25 = 5.40

so σ_x̄ = 2.32

2. A population consists of the six numbers 4, 8, 12, 16, 20, 24. Consider all possible samples of
size two which can be drawn without replacement from this population. Find

i) The mean of the population

ii) The standard deviation of the population

iii) The mean of the sampling distribution of means

iv) The standard deviation of the sampling distribution of means

Solution: Given that N = 6 and n = 2.

i. Mean of the population

μ = Σxi/N = (4 + 8 + 12 + 16 + 20 + 24)/6 = 84/6 = 14

ii. Variance of the population

σ² = Σ(xi − x̄)²/N = [(4−14)² + (8−14)² + (12−14)² + (16−14)² + (20−14)² + (24−14)²]/6

= (100 + 36 + 4 + 4 + 36 + 100)/6

= 46.67, so σ = 6.83

Sampling without replacement (finite population):

The total number of samples without replacement is NCn = 6C2 = 15.

The 15 samples that can be drawn are

(4,8)  (4,12)  (4,16)  (4,20)  (4,24)
(8,12) (8,16)  (8,20)  (8,24)
(12,16) (12,20) (12,24)
(16,20) (16,24)
(20,24)

The sample means are

6  8  10  12  14
10 12 14  16
14 16 18
18 20
22

iii. The mean of the sampling distribution of means is

μ_x̄ = (6 + 8 + 10 + 12 + ... + 20 + 22)/15 = 14

iv. The variance of the sampling distribution of means is

σ_x̄² = [(6 − 14)² + (8 − 14)² + ... + (22 − 14)²]/15 = 18.67

so σ_x̄ = 4.32

3. A normal population has mean 155 cm and standard deviation 15 cm. Find the
probability that the mean of a random sample of size 36 drawn from this population is less than 157 cm.

Solution: Given mean of the population (μ) = 155 cm

Standard deviation of the population (σ) = 15 cm

Sample size (n) = 36

Sample mean (x̄) = 157 cm

Now Z = (x̄ − μ)/(σ/√n)

= (157 − 155)/(15/√36)

= 0.8

P(x̄ ≤ 157) = P(Z ≤ 0.8)

= 0.5 + P(0 ≤ Z ≤ 0.8)

= 0.5 + 0.2881

P(x̄ ≤ 157) = 0.7881

4. In a study of automobile insurance, a random sample of 80 body repair costs had a
mean of Rs. 472.36 and a standard deviation of Rs. 62.35. If x̄ is used as a point estimate
of the true average repair cost, with what confidence can we assert that the maximum
error does not exceed Rs. 10?

Solution: Given sample size (n) = 80

Standard deviation of sample (s) = 62.35

Mean of sample (x̄) = 472.36

Maximum Error (E) = 10

E = Z_{α/2} · s/√n

Z_{α/2} = E√n/s = 10√80/62.35 = 1.4345

The area between z = 0 and z = 1.43 from the tables is 0.4236

1 − α = 2 × 0.4236 = 0.8472

Confidence = (1 − α)100% = 84.72%

Hence we are 84.72% confident that the maximum error is Rs. 10.

5. Determine a 95% confidence interval for the mean of a normal distribution with variance
0.25, using a sample of 100 values with mean 212.3.

Solution: Given sample size (n) = 100

Standard deviation of the population (σ) = √0.25 = 0.5

Sample mean (x̄) = 212.3 and Z_{α/2} = 1.96 (for 95%)

Confidence interval = (x̄ − Z_{α/2} σ/√n, x̄ + Z_{α/2} σ/√n)

= (212.3 − 1.96 × 0.5/√100, 212.3 + 1.96 × 0.5/√100)

= (212.202, 212.398)
Exercise Problems:

1. Samples of size 2 are taken from the population 1, 2, 3, 4, 5, 6. Which can be drawn without
replacement? Find

i) The mean of the population

ii) The standard deviation of the population

iii) The mean of the sampling distribution of means

iv) The standard deviation of the sampling distribution of means

2. If a 1-gallon can of paint covers on an average 513 square feet with a standard deviation of
31.5 square feet, what is the probability that the mean area covered by a sample of 40 of these 1-
gallon cans will be anywhere from 510 to 520 square feet?

3. What is the size of the smallest sample required to estimate an unknown proportion to within a
maximum error of 0.06 with at least 95% confidence.

4. A random sample of 400 items is found to have mean 82 and standard deviation of 18. Find
the maximum error of estimation at 95% confidence interval. Find the confidence limits for the

mean if x =82.

5. A sample of size 300 was taken whose variance is 225 and mean is 54. Construct 95%
confidence interval for the mean.
UNIT - IV
LARGE SAMPLE TESTS
 Test statistics for T.O.H. in several cases are:
1. Statistic for test concerning a mean, σ known:

Z = (X̄ − μ0)/(σ/√n)

2. Statistic for large sample test concerning a mean with σ unknown:

Z = (X̄ − μ0)/(S/√n)

3. Statistic for test concerning the difference between two means:

Z = (X̄1 − X̄2 − δ)/√(σ1²/n1 + σ2²/n2) under the N.H. H0: μ1 − μ2 = δ against the A.H. H1: μ1 − μ2 > δ or H1:
μ1 − μ2 < δ or H1: μ1 − μ2 ≠ δ

4. Statistic for large samples concerning the difference between two means (σ1 and σ2 are
unknown):

Z = (X̄1 − X̄2 − δ)/√(S1²/n1 + S2²/n2)

Statistic for large sample test concerning one proportion:

Z = (X − np0)/√(np0(1 − p0)) under the N.H. H0: p = p0 against H1: p ≠ p0 or p > p0 or p < p0

Statistic for test concerning the difference between two proportions:

Z = (X1/n1 − X2/n2)/√[p̂(1 − p̂)(1/n1 + 1/n2)] with p̂ = (X1 + X2)/(n1 + n2), under the N.H. H0: p1 = p2
against the A.H. H1: p1 < p2 or p1 > p2 or p1 ≠ p2

 To determine whether a population follows a specified known theoretical distribution such as the
N.D., B.D. or P.D., the χ² (chi-square) test is used to ascertain how closely the actual distribution
approximates the assumed theoretical distribution. This test, based on how good a fit there is
between the observed frequencies and the expected frequencies, is known as the
"goodness-of-fit test".
 Large sample confidence interval for p:

x/n − Z_{α/2} √[(x/n)(1 − x/n)/n] < p < x/n + Z_{α/2} √[(x/n)(1 − x/n)/n], where the degree of confidence is
1 − α.

 Large sample confidence interval for the difference of two proportions (p1 − p2):

(x1/n1 − x2/n2) ± Z_{α/2} √[ (x1/n1)(1 − x1/n1)/n1 + (x2/n2)(1 − x2/n2)/n2 ]

 Maximum error of estimate E = Z_{α/2} √(p(1 − p)/n); with the observed value x/n substituted for p
we obtain an estimate of E.

 Sample size n = p(1 − p)[Z_{α/2}/E]² when p is known

n = (1/4)[Z_{α/2}/E]² when p is unknown

 A one sided confidence interval is of the form p < (1/2n)χ²_α with 2(x + 1) degrees of
freedom.
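A short Python sketch (not from the original notes) of the large-sample z-test for a mean, following the five-step procedure of Unit III; the numbers are those of worked Problem 1 below.

# Sketch: large-sample z-test for a mean (two-tailed, alpha = 0.05).
from math import sqrt

xbar, mu0, sigma, n = 40, 38, 10, 400
z_alpha = 1.96                       # two-tailed critical value at 5%

z = (xbar - mu0) / (sigma / sqrt(n))
print(round(z, 2))                   # 4.0

if abs(z) > z_alpha:
    print("Reject H0: mu =", mu0)
else:
    print("Accept H0: mu =", mu0)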
Problems:

1. A sample of 400 items is taken from a population whose standard deviation is
10. The mean of the sample is 40. Test whether the sample has come from a population
with mean 38. Also calculate a 95% confidence interval for the population mean.

Solution: Given n = 400, x̄ = 40, μ = 38 and σ = 10

1. Null hypothesis (H0): μ = 38
2. Alternative hypothesis (H1): μ ≠ 38

3. Level of significance: α = 0.05 and Zα = 1.96

4. Test statistic: Z = (x̄ − μ)/(σ/√n)

Z = (40 − 38)/(10/√400) = 4

|Z| = 4

5. Conclusion:

Since |Z| > Zα,

we reject the Null hypothesis.

Confidence interval = (x̄ − Zα σ/√n, x̄ + Zα σ/√n)

= (40 − 1.96 × 10/√400, 40 + 1.96 × 10/√400)

= (39.02, 40.98)

2. Samples of students were drawn from two universities and from their weights in
kilograms the means and S.D.s were calculated as shown below. Make a large sample test of the
significance of the difference between the means.

               MEAN   S.D.   SAMPLE SIZE
University-A   55     10     400
University-B   57     15     100

Solution: Given n1 = 400, n2 = 100, x̄1 = 55, x̄2 = 57,

S1 = 10 and S2 = 15

1. Null hypothesis (H0): μ1 = μ2
2. Alternative hypothesis (H1): μ1 ≠ μ2

3. Level of significance: α = 0.05 and Zα = 1.96

4. Test statistic: Z = (x̄1 − x̄2)/√(S1²/n1 + S2²/n2) = (55 − 57)/√(100/400 + 225/100) = −1.26

|Z| = 1.26

5. Conclusion:

Since |Z| < Zα,

we accept the Null hypothesis.

3. In a sample of 1000 people in Karnataka, 540 are rice eaters and the rest are wheat
eaters. Can we assume that both rice and wheat are equally popular in this state at the
1% level of significance?

Solution: Given n = 1000, x = 540

p = x/n = 540/1000 = 0.54

P = 1/2 = 0.5, Q = 0.5

1. Null hypothesis (H0): P = 0.5

2. Alternative hypothesis (H1): P ≠ 0.5
3. Level of significance: α = 1% and Zα = 2.58

4. Test statistic: Z = (p − P)/√(PQ/n)

Z = (0.54 − 0.5)/√(0.5 × 0.5/1000) = 2.532

|Z| = 2.532

5. Conclusion:

Since |Z| < Zα,

we accept the Null hypothesis.

4. Random samples of 400 men and 600 women were asked whether they would like to
have a flyover near their residence. 200 men and 325 women were in favour of the
proposal. Test the hypothesis that the proportions of men and women in favour of the
proposal are the same, at the 5% level.

Solution: Given n1 = 400, n2 = 600, x1 = 200 and x2 = 325

p1 = 200/400 = 0.5

p2 = 325/600 = 0.541

p = (n1p1 + n2p2)/(n1 + n2) = (200 + 325)/(400 + 600) = 0.525

q = 1 − p = 1 − 0.525 = 0.475

1. Null hypothesis (H0): p1 = p2

2. Alternative hypothesis (H1): p1 ≠ p2

3. Level of significance: α = 0.05 and Zα = 1.96

4. Test statistic: Z = (p1 − p2)/√[pq(1/n1 + 1/n2)]

= (0.5 − 0.541)/√[0.525 × 0.475 × (1/400 + 1/600)] = −1.28

|Z| = 1.28

5. Conclusion:

Since |Z| < Zα,

we accept the Null hypothesis.

Exercise Problems:

1. An ambulance service claims that it takes on the average 8.9 minutes to reach its destination in
emergency calls. To check on this claim, the agency which issues licenses to ambulance services
timed them on fifty emergency calls, getting a mean of 9.2 minutes with a standard deviation of 1.6 minutes.
What can they conclude at the 5% level of significance?

2. According to norms established for a mechanical aptitude test, persons who are 18 years old have
an average score of 73.2 with S.D. 8.6. If 40 randomly selected persons of that age averaged 76.7, test
the hypothesis H0: μ = 73.2 against the alternative hypothesis H1: μ > 73.2.

3. A cigarette manufacturing firm claims that its brand A line of cigarettes outsells its brand B by
8%. If it is found that 42 out of a sample of 200 smokers prefer brand A and 18 out of another
sample of 100 smokers prefer brand B, test whether the 8% difference is a valid claim.
