3.5.16 Probability Distribution
Structure
16.0 Objectives
16.1 Introduction
16.2 Elementary Concept of Random Variable
16.3 Probability Mass Function
16.4 Probability Density Function
16.5 Probability Distribution Function
16.6 Moments and Moment Generating Functions
16.7 Three Important Probability Distributions
16.7.1 Binomial Distribution
16.7.2 Poisson Distribution
16.7.3 Normal Distribution
16.8 Let Us Sum Up
16.9 Key Words
16.10 Some Useful Books
16.11 Answer or Hints to Check Your Progress
16.12 Exercises
16.0 OBJECTIVES
After going through this unit, you will be able to:
• understand random variables and how they are inseparable from probability distributions;
• appreciate moment generating functions and their role in probability distributions; and
• solve problems of probability which fit into the binomial, Poisson and normal distributions.
16.1 INTRODUCTION
In the previous unit on probability theory, we discussed deterministic and non-deterministic events and introduced random variables, which are outcomes of non-deterministic experiments. Such variables are always generated with a particular pattern of probability attached to them. Thus, based on the pattern of probability for the different values of a random variable, we can distinguish them. Once we know these probability distributions and their properties, and if a random variable fits a probability distribution, it will be possible to answer any question regarding the variable. In this unit, we define the random variable and make a broad distinction among probability distributions based on whether the random variable is continuous or not. We then discuss how the moments of a probability distribution describe the distribution completely, and how the technique of the moment generating function can be used to obtain the moments of any probability distribution.
Statistical Methods-1

In Section 16.7, we discuss the three most widely used probability distributions, viz., binomial, Poisson and normal.
(2) ∑_{i=1}^{n} f(x_i) = 1
Example: An unbiased coin is tossed until the first head is obtained. If the random variable X denotes the number of tails preceding the first head, then what is the probability distribution of X?

Values of X    f(X = x)
0              ½
1              (½)²
2              (½)³
3              (½)⁴
…              …

In general, f(x) = (½)^{x+1} for x = 0, 1, 2, …, which satisfies the prerequisites mentioned earlier, i.e., f(x) is always greater than '0' and ∑ f(x) = 1. Note that X is a countably infinite random variable.
Values of X f(X=x)
-3 0.216
-1 0.432
1 0.288
3 0.064
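Any tabulated p.m.f., such as the one above, can be checked in a few lines. The following is a Python sketch for illustration; the dictionary simply transcribes the table, and it verifies the two defining conditions of a p.m.f. and computes the expected value E(X):

```python
# Transcription of the tabulated p.m.f. above (values of X against f(X = x))
pmf = {-3: 0.216, -1: 0.432, 1: 0.288, 3: 0.064}

# the two defining conditions of a p.m.f.: non-negativity and summing to 1
assert all(p >= 0 for p in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-9

# expected value E(X) = sum of x * f(x)
mean = sum(x * p for x, p in pmf.items())
print(mean)
```

The same check applies to any finite p.m.f. given as a table of values and probabilities.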
The function f(x) is called the probability density function (p.d.f.) provided it satisfies the following two conditions:
1) f(x) ≥ 0
2) If the range of the continuous random variable is (a, b), then ∫_{a}^{b} f(x) dx = 1
The shaded area in the following figure represents the probability that the variable x will take values in the interval (c, d), whereas its range is (a, b). In the figure, we have taken the values of x on the horizontal axis and those of f(x) on the vertical axis. This is known as the probability curve. Since f(x) is a p.d.f., the total area under the probability curve is 1, and the curve cannot lie below the horizontal axis as f(x) cannot take negative values. Any function of a real variable that satisfies the above two conditions can serve as a probability density function.
Clearly, the range of the function given in the problem is −∞ to ∞ and, for every value of x within that interval, f(x) is non-negative provided that 'k' is positive. To satisfy the second condition,

∫_{−∞}^{∞} f(x) dx = 1
or, ∫_{−∞}^{0} f(x) dx + ∫_{0}^{∞} f(x) dx = 1
or, ∫_{0}^{∞} f(x) dx = 1
or, ∫_{0}^{∞} k·e^{−3x} dx = 1
or, k/3 = 1
or, k = 3

Thus, we get f(x) = 3e^{−3x} for x > 0 and 0 otherwise.

P(0.5 ≤ X ≤ 1) = ∫_{0.5}^{1} 3e^{−3x} dx = e^{−1.5} − e^{−3} = 0.173
16.5 PROBABILITY DISTRIBUTION FUNCTIONS
If X is a discrete random variable and the value of its probability at the point t is given by f(t), then the function given by

F(x) = ∑_{t ≤ x} f(t) for −∞ < x < ∞

is called the distribution function (or cumulative distribution function) of X. The distribution function of a continuous random variable has the same nice properties as that of a discrete random variable, viz.,
i) F(−∞) = 0, F(∞) = 1
ii) If a < b, then F(a) ≤ F(b), where a and b are any real numbers
iii) Furthermore, it follows directly from the definition that P(a ≤ X ≤ b) = F(b) − F(a), where a and b are real constants with a ≤ b
iv) f(x) = dF(x)/dx wherever the derivative exists
Example: Find the distribution function of the random variable X of the previous example and use it to re-evaluate P(0.5 ≤ X ≤ 1).

For all non-positive values of x, f(x) takes the value 0. Therefore,

F(x) = 0 for x ≤ 0.

For x > 0,

F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{0}^{x} 3e^{−3t} dt = 1 − e^{−3x}

Thus, F(x) = 0 for x ≤ 0, and F(x) = 1 − e^{−3x} for x > 0.

To determine P(0.5 ≤ X ≤ 1), we use the third property of the distribution function of a continuous random variable:

P(0.5 ≤ X ≤ 1) = F(1) − F(0.5) = (1 − e^{−3}) − (1 − e^{−1.5}) = e^{−1.5} − e^{−3} = 0.173
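The density and distribution function used here can be cross-checked numerically. The following is a Python sketch for illustration; the midpoint-rule quadrature and the step count are arbitrary choices:

```python
import math

f = lambda x: 3 * math.exp(-3 * x) if x > 0 else 0.0  # the density found above
F = lambda x: 1 - math.exp(-3 * x) if x > 0 else 0.0  # its distribution function

# P(0.5 <= X <= 1) via the distribution function
p = F(1) - F(0.5)

# the same probability via midpoint-rule integration of the density
n, a, b = 10_000, 0.5, 1.0
h = (b - a) / n
p_num = sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# both routes agree
assert abs(p - p_num) < 1e-6
print(round(p, 3))  # ≈ 0.173
```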
Check Your Progress 1
1) What is the difference between a probability mass function and a probability density function? What are the properties that a p.d.f. or p.m.f. must satisfy?
2) If X is a discrete random variable having the following p.m.f.,

X         0    1    2     3     4     5     6      7
P(X = x)  0    k    2k    2k    3k    k²    2k²    7k² + k

(i) determine the value of the constant k;
(ii) find P(X < 5);
(iii) find P(X > 5).
3) For each of the following, determine whether the given function can serve
as a p.m.f.
i) f(x) = (x - 2)/5 for x = 1,2,3,4,5
ii) f(x) = x2/30 for x = 0,1,2,3,4
iii) f(x) = 1/5 for x = 0,1,2,3,4,5
4) If X has the p.d.f.

f(x) = k·e^{−3x} for x > 0
     = 0 otherwise,

find k and P(0.5 ≤ X ≤ 1).

5) Find the distribution function for the above p.d.f. and use it to re-evaluate P(0.5 ≤ X ≤ 1).
Correspondingly, if X is a continuous random variable and f(x) gives the probability density at x, the expected value of X is given by

E(X) = ∫_{−∞}^{∞} x f(x) dx
In statistics as well as economics, the notion of mathematical expectation is very important. It is a special kind of moment. We will introduce the concept of moments and moment generating functions in what follows.

The rth order moment about the origin of a random variable X is denoted by µ'r and is given by the expected value of X^r.
Symbolically, for a discrete random variable, µ'r = ∑_{i=1}^{n} x_i^r f(x_i), for r = 1, 2, 3, …, and for a continuous random variable, µ'r = ∫_{−∞}^{∞} x^r f(x) dx.

It is interesting to note that the term moment comes from physics. If f(x) symbolizes quantities of point masses, where x is discrete, acting perpendicularly on the x axis at distance x from the origin, then µ'1 as defined earlier would give the centre of gravity, which is the first moment about the origin. Similarly, µ'2 gives the moment of inertia.
In statistics, µ'1 gives the mean of a random variable and is generally denoted by µ.

The special moments we shall define are of importance in statistics because they are useful in describing the shape of the distribution of a random variable, viz., the shape of its probability distribution or its probability density.
The rth moment of a random variable about the mean is denoted by µr. It is the expected value of (X − µ)^r; symbolically, µr = E((X − µ)^r) = ∑_{i=1}^{n} (x_i − µ)^r f(x_i), for r = 1, 2, 3, …, for a discrete random variable, and µr = ∫_{−∞}^{∞} (x − µ)^r f(x) dx for a continuous random variable.

The second moment about the mean is of special importance in statistics because
it gives an idea about the spread of the probability distribution of a random variable. Therefore, it is given a special symbol and a special name. It is called the variance of the random variable, and its positive square root is called the standard deviation. The variance of a variable is denoted by Var(X) or V(X) or simply by σ².
Probability distributions vary with the variance of the random variable: a high value of σ² means the distribution is spread out, with thick tails, while a low value of σ² implies the distribution is concentrated around the mean, with thin tails.
Similarly, the third order moment about the mean describes the symmetry or skewness (i.e., lack of symmetry) of the distribution of a random variable. We state a few important theorems on moments without going into details, as they have been covered in the unit on probability.
1) σ² = µ'2 − µ²
2) If the random variable X has variance σ², then Var(a·X + b) = a²σ²
3) Chebyshev's theorem: To see how σ or σ² is indicative of the spread or dispersion of the distribution of a random variable, Chebyshev's theorem is very useful. Here we will only state the theorem: if µ and σ² are the mean and variance of a random variable, say X, then for any constant k the probability is at least (1 − 1/k²) that X will take on a value within k standard deviations (k·σ) of the mean; symbolically,

P(|X − µ| < k·σ) ≥ 1 − 1/k²
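Chebyshev's bound can be illustrated empirically. The following Python sketch draws a sample from an exponential distribution (an arbitrary choice for illustration) and checks that the observed fraction within k standard deviations respects the bound for several values of k:

```python
import math
import random

random.seed(42)  # fixed seed so the illustration is reproducible
data = [random.expovariate(1.0) for _ in range(100_000)]

# sample mean and standard deviation
mu = sum(data) / len(data)
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))

for k in (1.5, 2.0, 3.0):
    within = sum(abs(x - mu) < k * sigma for x in data) / len(data)
    # Chebyshev: at least 1 - 1/k^2 of the mass lies within k sigma of the mean
    assert within >= 1 - 1 / k**2
    print(k, round(within, 3), round(1 - 1 / k**2, 3))
```

Note that the bound is typically loose: the observed fractions are usually well above 1 − 1/k², since Chebyshev's theorem holds for every distribution with a finite variance.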
The moment generating function of a random variable X is given by

Mx(t) = E(e^{tX}) = ∫_{−∞}^{∞} e^{tx} f(x) dx when X is continuous, and, when X is discrete,

Mx(t) = E(e^{tX}) = ∑_{i=1}^{n} e^{tx_i} f(x_i) = ∑_{i=1}^{n} (1 + tx_i + t²x_i²/2! + t³x_i³/3! + … + t^r x_i^r/r! + …)·f(x_i)

= ∑ f(x_i) + t·∑ x_i f(x_i) + (t²/2!)·∑ x_i² f(x_i) + (t³/3!)·∑ x_i³ f(x_i) + … + (t^r/r!)·∑ x_i^r f(x_i) + …

= 1 + t·µ + (t²/2!)·µ'2 + … + (t^r/r!)·µ'r + …
Thus, we can see that in the Maclaurin's series of the moment generating function of X, the coefficient of t^r/r! is µ'r, which is nothing but the rth order moment about the origin. In the continuous case, the argument is the same (readers may verify that).

To get the rth order moment about the origin, we differentiate Mx(t) r times with respect to t and put t = 0 in the expression obtained. Symbolically,

µ'r = d^r Mx(t)/dt^r |_{t=0}
An example will make the above clear.
Example: Find the moment generating function of the random variable whose probability density is given by

f(x) = e^{−x} for x > 0
     = 0 otherwise,

and use it to find the expression for µ'r.

By definition,

Mx(t) = E(e^{tX}) = ∫_{0}^{∞} e^{tx}·e^{−x} dx = ∫_{0}^{∞} e^{−x(1 − t)} dx = 1/(1 − t) for t < 1.

When |t| < 1, the Maclaurin's series for this moment generating function is

Mx(t) = 1 + t + t² + t³ + … + t^r + … = 1 + 1!·t/1! + 2!·t²/2! + 3!·t³/3! + … + r!·t^r/r! + …

Hence, µ'r = d^r Mx(t)/dt^r |_{t=0} = r! for r = 0, 1, 2, …
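The result µ'r = r! can be confirmed by computing the moments of f(x) = e^{−x} directly. The following Python sketch uses a simple midpoint-rule integration; the truncation point 50 and the step count are arbitrary choices, justified by the fast decay of the integrand:

```python
import math

def moment(r, upper=50.0, n=200_000):
    """Midpoint-rule approximation of the r-th moment of f(x) = e^{-x}, x > 0."""
    h = upper / n
    return sum(((i + 0.5) * h) ** r * math.exp(-(i + 0.5) * h) for i in range(n)) * h

for r in range(5):
    # the MGF argument above gives the r-th moment about the origin as r!
    assert abs(moment(r) - math.factorial(r)) < 1e-3
```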
If 'a' and 'b' are constants, then

1) M_{X+a}(t) = E(e^{t(X+a)}) = e^{at}·Mx(t)
2) M_{bX}(t) = E(e^{tbX}) = Mx(bt)
3) M_{(X+a)/b}(t) = E(e^{t(X+a)/b}) = e^{(a/b)t}·Mx(t/b)

Among the above three results, the third one is of special importance. When a = −µ and b = σ,

M_{(X−µ)/σ}(t) = E(e^{t(X−µ)/σ}) = e^{(−µ/σ)t}·Mx(t/σ)
Check Your Progress 2
1) Given that X has the probability distribution f(x) = (1/8)·3Cx for x = 0, 1, 2 and 3, find the moment generating function of this random variable and use it to determine µ'1 and µ'2.
………………………………………………………………………………
………………………………………………………………………………
………………………………………………………………………………
………………………………………………………………………………
………………………………………………………………………………
16.7.1 Binomial Distribution
Repeated trials play a very important role in probability and statistics, especially when the number of trials is fixed, the probability of success (or failure) remains the same in each trial, and the trials are independent.
The theory which we discuss in this section has many applications; for example, it applies to events like the probability of getting 5 heads in 12 flips of a coin, or the probability that 3 persons out of 10 having a tropical disease will recover. To apply the binomial distribution in these cases, the probability (of getting a head in each flip, or of recovering from the tropical disease for each person) should be exactly the same. More importantly, the coin tosses, and the recovery chances of the patients, should be independent of one another.
To derive a formula for the probability of getting 'x successes in n trials' under the stated conditions, we proceed as follows. Suppose the probability of getting a success is 'p'. Every trial has only two possible outcomes, a success or a failure (such trials are called Bernoulli trials), so the probability of failure is simply '1 − p', and the trials or experiments are independent of each other. The probability of getting 'x' successes and 'n − x' failures in a particular sequence of n trials is given by p^x·(1 − p)^{n−x}. The probabilities of success and failure are multiplied by virtue of the assumption that these experiments are independent. Since this probability applies to any sequence of n trials in which there are 'x' successes and 'n − x' failures, we have to count how many sequences of this kind are possible and then multiply p^x·(1 − p)^{n−x} by that number. Clearly, the number of ways in which we can have 'x' successes and 'n − x' failures is given by nCx. Therefore, the desired probability of getting 'x' successes in 'n' trials is given by nCx·p^x·(1 − p)^{n−x}. Remember that the binomial distribution is a discrete probability distribution with parameters n and p.
A random variable X is said to have a binomial distribution, and is referred to as a binomial random variable, if and only if its probability distribution is given by

b(x; n, p) = nCx·p^x·(1 − p)^{n−x} for x = 0, 1, 2, …, n
Example: Find the probability of getting 7 heads and 5 tails in 12 tosses of an unbiased coin.

Substituting x = 7, n = 12, p = ½ in the formula of the binomial distribution, we get the desired probability of getting 7 heads and 5 tails:

b(7; 12, ½) = 12C7·(½)^7·(1 − ½)^5 = 12C7·(½)^12
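This computation is easy to reproduce; the following Python sketch uses the standard-library `math.comb` for nCx:

```python
from math import comb

def binom_pmf(x, n, p):
    # b(x; n, p) = nCx * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# probability of 7 heads (and hence 5 tails) in 12 tosses of an unbiased coin
p7 = binom_pmf(7, 12, 0.5)
print(p7)  # 12C7 * (1/2)^12 = 792/4096
assert abs(p7 - comb(12, 7) / 2**12) < 1e-15
```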
There are a few important properties of the binomial distribution. While discussing these, we retain the notations used earlier in this unit.

• The mean of a random variable, say X, which follows the binomial distribution is µ = n·p, and the variance is σ² = n·p·(1 − p) = n·p·q, where q = 1 − p (the proofs of these properties are left as recap exercises).
• b(x; n, p) = b(n − x; n, 1 − p), since b(n − x; n, 1 − p) = nCn-x·(1 − p)^{n−x}·p^x = b(x; n, p) [as nCn-x = nCx].
• A binomial distribution may have either one or two modes. When (n + 1)p is not an integer, the mode is the largest integer contained therein. However, when (n + 1)p is an integer, there are two modes, given by (n + 1)p and {(n + 1)p − 1}.
• The skewness of the binomial distribution is given by (q − p)/√(n·p·q).
• The kurtosis of the binomial distribution is given by (1 − 6pq)/(npq).
• If X and Y are two independent random variables, where X follows a binomial distribution with parameters (n1, p) and Y follows a binomial distribution with parameters (n2, p), then the random variable (X + Y) also follows a binomial distribution with parameters (n1 + n2, p).
• The binomial distribution can be obtained as a limiting case of the hypergeometric distribution.
• The moment generating function of a binomial distribution is given by

Mx(t) = E(e^{tX}) = ∑_{x=0}^{n} e^{tx}·nCx·p^x·(1 − p)^{n−x} = ∑_{x=0}^{n} nCx·(pe^t)^x·(1 − p)^{n−x}

The summation is easily recognized as the binomial expansion of {pe^t + (1 − p)}^n, so Mx(t) = {pe^t + (1 − p)}^n.
• The distribution has skewness = 1/√λ and kurtosis = 1/λ; therefore, as λ becomes large, the distribution becomes increasingly symmetric.
The moment generating function of the Poisson distribution is obtained as

Mx(t) = E(e^{tX}) = ∑_{x=0}^{∞} e^{tx}·λ^x e^{−λ}/x! = e^{−λ}·∑_{x=0}^{∞} e^{tx}·λ^x/x! = e^{−λ}·∑_{x=0}^{∞} (λe^t)^x/x!

In the above expression, ∑_{x=0}^{∞} (λe^t)^x/x! can be recognized as the Maclaurin's series of e^z where z = λe^t. Thus, the moment generating function of the Poisson distribution is

Mx(t) = e^{−λ}·e^{λe^t} = e^{λ(e^t − 1)}
Differentiating Mx(t) twice with respect to t, we get

Mx′(t) = λ·e^t·e^{λ(e^t − 1)}
Mx″(t) = λ·e^t·e^{λ(e^t − 1)} + λ²·e^{2t}·e^{λ(e^t − 1)}

Therefore, µ'1 = Mx′(0) = λ and µ'2 = Mx″(0) = λ + λ², and we get µ = λ and σ² = µ'2 − (µ'1)² = λ + λ² − λ² = λ.
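These results (mean λ and variance λ) can be verified by direct summation of the Poisson p.m.f. The following Python sketch uses an arbitrary λ = 2.5 and truncates the infinite sums at 60 terms, where the tail is negligible:

```python
import math

lam = 2.5  # arbitrary choice of the parameter for illustration
pmf = lambda x: lam**x * math.exp(-lam) / math.factorial(x)

m1 = sum(x * pmf(x) for x in range(60))      # mu'_1 = E(X)
m2 = sum(x * x * pmf(x) for x in range(60))  # mu'_2 = E(X^2)

assert abs(m1 - lam) < 1e-9                  # mean = lambda
assert abs(m2 - (lam + lam**2)) < 1e-9       # mu'_2 = lambda + lambda^2
assert abs((m2 - m1**2) - lam) < 1e-9        # variance = lambda
```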
Example: Let X be a random variable following the Poisson distribution. If P(X = 1) = P(X = 2), find P(X = 0 or 1) and E(X).

For the Poisson distribution, the p.m.f. is given by P(X = x) = λ^x e^{−λ}/x!

Therefore, P(X = 1) = λ¹e^{−λ}/1! = λe^{−λ} and P(X = 2) = λ²e^{−λ}/2! = λ²e^{−λ}/2.

As P(X = 1) = P(X = 2), from the equation λe^{−λ} = λ²e^{−λ}/2, we get λ = 2. Therefore, E(X) = λ = 2 and

P(X = 0 or 1) = P(X = 0) + P(X = 1) = 2⁰e^{−2}/0! + 2¹e^{−2}/1! = 3e^{−2}.
Example: In a textile mill, on an average, there are 5 defects per 10 square feet of cloth produced. If we assume a Poisson distribution, what is the probability that a 15 square feet piece of cloth will have at least 6 defects?

Let X be a random variable denoting the number of defects in a 15 square feet piece of cloth. Since, on an average, there are 5 defects per 10 square feet of cloth, there will be on an average 7.5 defects per 15 square feet of cloth, i.e., λ = 7.5. We are to find P(X ≥ 6) = 1 − P(X ≤ 5).

X     P(X)
0     0.0006
1     0.0041
2     0.0156
3     0.0389
4     0.0729
5     0.1094

You are asked to verify the table for λ = 7.5 with the help of a calculator. From the table we obtain P(X ≤ 5) = 0.2415. Therefore, P(X ≥ 6) = 1 − 0.2415 = 0.7585.
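Instead of a calculator, the tabulated values can be checked in a few lines; the following Python sketch computes P(X ≤ 5) for λ = 7.5 and the resulting tail probability:

```python
import math

lam = 7.5  # expected defects per 15 sq. ft., given 5 per 10 sq. ft.
pmf = lambda x: lam**x * math.exp(-lam) / math.factorial(x)

p_le_5 = sum(pmf(x) for x in range(6))  # P(X <= 5)
p_ge_6 = 1 - p_le_5                     # P(X >= 6)

# agrees with the four-decimal table values above
assert abs(p_le_5 - 0.2415) < 1e-3
assert abs(p_ge_6 - 0.7585) < 1e-3
print(round(p_le_5, 4), round(p_ge_6, 4))
```

Note that summing the four-decimal table entries introduces small rounding differences in the last digit.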
f(x) = n(x; µ, σ) = (1/(σ√(2π)))·e^{−(1/2)((x − µ)/σ)²} for −∞ < x < ∞, where σ > 0
The shape of the normal distribution is like a cross-section of a bell, as shown below.
While defining the p.d.f. of the normal distribution, we have used the standard notations, where σ stands for the standard deviation and µ for the mean of the random variable X. Note that f(x) is positive as long as σ is positive, which is guaranteed by the fact that the standard deviation of a random variable is always positive. Since f(x) is a p.d.f. and X can assume any real value, integrating f(x) over −∞ to ∞ we should get the value 1. In other words, the area under the curve must be equal to 1. Let us prove that
∫_{−∞}^{∞} (1/(σ√(2π)))·e^{−(1/2)((x − µ)/σ)²} dx = 1

We substitute (x − µ)/σ = z in the L.H.S. of the above equation to get

∫_{−∞}^{∞} (1/(σ√(2π)))·e^{−(1/2)((x − µ)/σ)²} dx = (1/√(2π))·∫_{−∞}^{∞} e^{−z²/2} dz = (2/√(2π))·∫_{0}^{∞} e^{−z²/2} dz

[Since the integrand is symmetrical about 0, integrating the function from 0 to ∞ and doubling is the same as integrating the same function from −∞ to ∞]

= (2/√(2π)) × Γ(½)/√2 [since ∫_{0}^{∞} e^{−z²/2} dz = Γ(½)/√2]

= (2/√(2π)) × √π/√2 [since Γ(½) = √π]

= 1 ……………………… [proved]
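The same fact can be checked numerically. The following Python sketch integrates the normal density by the midpoint rule over a wide truncated range; the parameters µ = 2 and σ = 1.5 are arbitrary choices for illustration:

```python
import math

mu, sigma = 2.0, 1.5  # arbitrary parameters for illustration

def npdf(x):
    # n(x; mu, sigma) = (1/(sigma*sqrt(2*pi))) * exp(-0.5*((x - mu)/sigma)^2)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# midpoint-rule integration over mu +/- 10 sigma (the tails beyond are negligible)
a, b, n = mu - 10 * sigma, mu + 10 * sigma, 100_000
h = (b - a) / n
area = sum(npdf(a + (i + 0.5) * h) for i in range(n)) * h

assert abs(area - 1.0) < 1e-6  # total area under the probability curve is 1
```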
Normal distribution has many nice properties, which make it amenable to be
applied to many statistical as well as economic models.
Example: The height distribution of a group of 10,000 men is normal with mean height 64.5″ and standard deviation 4.5″. Find the number of men whose height is

a) less than 69″ but greater than 55.5″
b) less than 55.5″
c) more than 73.5″
The mean and standard deviation of the normal distribution are given by µ = 64.5″ and σ = 4.5″. We explain the problem graphically. From the figure below, we can easily comprehend what we are asked to do. We are to find the shaded regions, but as we know the area under a standard normal curve only, we have to reduce the given random variable into a standard normal variable. Let X be the continuous random variable measuring the height of each man. Then (X − 64.5)/4.5 = z is a standard normal variable.
The following table shows the values of z for the corresponding values of x.

X      z
55.5   −2
64.5   0
69     1
73.5   2

In standard tables, the area under the standard normal curve is given for positive values of the standard normal variable only; as the distribution is symmetrical, the area under the curve for negative values of the standard normal variable is easy to find out. For the standard normal variable z, the area under the curve to the left of a value z₁ is conventionally denoted by Φ(z₁). This is shown in the figure below.
a) P(55.5 < X < 69) = P(−2 < z < 1) = Φ(1) − Φ(−2) = 0.82. Therefore, the number of men of height less than 69″ but greater than 55.5″ is 10000 × 0.82 = 8200.
b) P(X < 55.5) = P(z < −2) = 0.02. Therefore, the number of men of height less than 55.5″ is 10000 × 0.02 = 200.
c) P(X > 73.5) = P(z > 2) = 1 − P(z < 2) = 1 − 0.98 = 0.02. Therefore, the number of men of height greater than 73.5″ is 10000 × 0.02 = 200.
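The three probabilities can also be computed without tables. The following Python sketch builds Φ from the standard identity Φ(z) = (1 + erf(z/√2))/2, using the standard-library `math.erf`:

```python
from math import erf, sqrt

Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))  # standard normal c.d.f.

mu, sigma, n = 64.5, 4.5, 10_000
z = lambda x: (x - mu) / sigma  # standardization

between = n * (Phi(z(69)) - Phi(z(55.5)))  # 55.5" < height < 69"
below = n * Phi(z(55.5))                   # height < 55.5"
above = n * (1 - Phi(z(73.5)))             # height > 73.5"

print(round(between), round(below), round(above))
```

At full precision these come out near 8186, 228 and 228, slightly different from the figures above, which use two-decimal table values.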
• If X and Y are two independent normal variables with means µ1 and µ2 and standard deviations σ1 and σ2, then (X + Y) is also a normal variable with mean (µ1 + µ2) and variance (σ1² + σ2²).
• The moment generating function of a normal variable is given by

Mx(t) = e^{µt + (1/2)σ²t²}

By definition,

Mx(t) = ∫_{−∞}^{∞} e^{tx}·(1/(σ√(2π)))·e^{−(1/2)((x − µ)/σ)²} dx

The above expression could be written, after some algebraic manipulation, as the following:

Mx(t) = e^{µt + (1/2)σ²t²} × ∫_{−∞}^{∞} (1/(σ√(2π)))·e^{−(1/2)((x − (µ + tσ²))/σ)²} dx = e^{µt + (1/2)σ²t²}

since the remaining integral is that of a normal density with mean (µ + tσ²) and standard deviation σ, and hence equals 1.
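The closed form of the normal MGF can be spot-checked numerically. The following Python sketch compares e^{µt + σ²t²/2} against a direct midpoint-rule integration of e^{tx} times the density; the values of µ, σ and t are arbitrary choices for illustration:

```python
import math

mu, sigma, t = 1.0, 2.0, 0.3  # arbitrary illustration values

def npdf(x):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# closed form: M_X(t) = exp(mu*t + sigma^2 * t^2 / 2)
mgf_closed = math.exp(mu * t + 0.5 * sigma**2 * t**2)

# direct integration of e^{tx} f(x) by the midpoint rule over a wide range
a, b, n = mu - 15 * sigma, mu + 15 * sigma, 200_000
h = (b - a) / n
mgf_num = sum(
    math.exp(t * (a + (i + 0.5) * h)) * npdf(a + (i + 0.5) * h) for i in range(n)
) * h

assert abs(mgf_num - mgf_closed) < 1e-6
```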
16.11 ANSWER OR HINTS TO CHECK YOUR PROGRESS

Check Your Progress 1

2) Using the second property, ∑_{i=1}^{n} f(x_i) = 1, we get 10k² + 9k − 1 = 0. It gives two values of k, viz., −1 and 1/10. Clearly, k cannot take the value −1 (as f(X = 1) = k and f(x) is always non-negative). Given k = 1/10, the rest is trivial algebra.
3) i) Cannot (f(1) = −1/5 is negative).
ii) Can (the values are non-negative and sum to 30/30 = 1).
iii) Cannot (the values sum to 6/5, not 1).
4) ∫_{0}^{∞} k·e^{−3x} dx = 1

or, k·[e^{−3x}/(−3)]_{0}^{∞} = 1
or, k·(1/3) = 1
or, k = 3
Check Your Progress 2

1) Mx(t) = E(e^{tX}) = ∑_{x=0}^{3} e^{tx}·(1/8)·3Cx = (1/8)(1 + 3e^t + 3e^{2t} + e^{3t}) = (1/8)(1 + e^t)³
2) As the formulae for the mean and variance of the binomial distribution are given by n·p and n·p·(1 − p) respectively, we get the following two equations:

n·p = 4 ………………………… (1)
n·p·(1 − p) = 8/3 …………….. (2)
5) Clearly, the random variable denoting the weight, say X, of the students follows a normal distribution with mean(X) = 151 and Var(X) = 15².

i) The proportion of students whose weight lies between 120 and 155 lbs equals the area under the standard normal curve between the vertical lines at the standardized values z = (120 − 151)/15 = −2.07 and z = (155 − 151)/15 = 0.27.

Since the area to the right of z = 0 is 0.5 and the area between z = 0 and z = 1.64 is given to be 0.45, the area to the right of z = 1.64 is 0.5 − 0.45 = 0.05. Thus, we get 10/σ = 1.64, or σ = 6.1.
16.12 EXERCISES
1) Show that if a random variable has the distribution

f(x) = ½·e^{−|x|} for −∞ < x < ∞,

its moment generating function is given by Mx(t) = 1/(1 − t²).

2) Find the mean and the standard deviation of the random variable with the moment generating function Mx(t) = e^{4(e^t − 1)}.
3) For each of the following find the value of ‘c’ so that the function can
serve as a probability distribution.
i) f(x) = c·x for x = 1, 2, 3, 4, 5
ii) f(x) = c·5Cx for x = 0, 1, 2, 3, 4, 5
iii) f(x) = c·x² for x = 1, 2, 3, 4, 5, …, k
iv) f(x) = c·(1/4)^x for x = 1, 2, 3, 4, 5, …
4) Find the probability distribution function for a random variable whose density function is given by the following:

F(x) = 0 for x ≤ 0
     = x for 0 < x < 1
     = 1 for x ≥ 1

and plot the graph of the distribution function as well as the density function.
5) Find the distribution function of the random variable X whose probability density is given by the following:

f(x) = x/2 for 0 < x ≤ 1
     = 1/2 for 1 < x ≤ 2
     = (3 − x)/2 for 2 < x < 3
     = 0 otherwise