03 - Probability Distributions and Estimation


Probability Distributions and Random Variables

Introduction
Probability is the measure of uncertainty and has always been an important aspect of the reliability assessment of industrial products and equipment.
Good product design is of course essential for products with high reliability. However, no matter how good the product design is, products deteriorate over time because they operate under stress or load in a real environment, often involving randomness.
Maintenance has therefore been introduced as an efficient way to assure a satisfactory level of reliability during the useful life of a physical asset.

Lecture Overview
Definitions
Mathematical Derivations
Examples / Tutorial

Probability definition?

Probability can be defined as the measure of uncertainty (used to represent the risk of uncertainty in engineering applications).
In other words, it can be used to quantify the likelihood or chance of an event occurring at a given time.
It can be interpreted as a degree of belief or as a relative frequency.

Random Variable
A random variable is defined as a function that assigns a real value to every possible outcome or event of an experiment or observation.
In many applications, in manufacturing engineering and elsewhere, the outcomes x1, x2, …, xn of events that constitute a sample over an interval of time (or space) take real numerical values. It is therefore convenient, and sometimes necessary, to express all events using numerical values on a real line. The functions that establish such transformations to a real line are called random variables.
Random variables can be classified into two types: discrete and continuous random variables.

Probability distribution

Definition: a probability distribution is a mathematical model that relates the value of a variable with the probability of occurrence of that value in the population.

The probability of an event is a number lying in the interval 0 ≤ p ≤ 1, with 0 corresponding to an event that never occurs and 1 to an event that is certain to occur.

For example, if we visualise the diameter of a piston ring as a random variable, because it takes on different values in the population according to some random mechanism, then the probability distribution of ring diameter gives the probability of occurrence of any value of ring diameter in the population.

Example

Suppose that an event E can happen in h ways out of n equally likely possible ways. Then the probability of occurrence of the event E is denoted by

p = P(E) = h / n.

The probability of non-occurrence of the event E is then denoted by

q = P(E^c) = (n − h) / n = 1 − h/n = 1 − p = 1 − P(E).

Example

If E1 and E2 are two events, the probability that event E2 occurs given that E1 has occurred is denoted by P(E2 | E1) (the conditional probability of E2 given that E1 has occurred). If the two events are mutually exclusive, then P(E1 ∩ E2) = 0. If E1 ∪ E2 denotes the event that either E1 or E2 or both occur, then

P(E1 ∪ E2) = P(E1) + P(E2) − P(E1 ∩ E2).

An extension to n mutually exclusive events with respective probabilities p1, p2, …, pn gives the result that the probability of occurrence of the union of all the events is the sum of the individual probabilities, Spiegel (1992).

Discrete distribution

Discrete distribution: when the measured parameter can assume only certain values, such as the integers 0, 1, 2, …, the probability distribution is called a discrete distribution.

If a discrete random variable X can take the values x1, x2, …, xn with probabilities p1, p2, …, pn, where p1 + p2 + … + pn = 1 and each pi ≥ 0, then this defines a discrete distribution for X. The probability that X will take a particular value x is denoted by P(X = x) or P(x).

Probability Mass function

Given the probability (mass) function f(x), the mean is defined as

μ = Σ_{all x} x · f(x)

and the variance is

σ² = Σ_{all x} (x − μ)² · f(x) = Σ_{all x} x² · f(x) − μ²

The standard deviation is simply the square root of the variance,

σ = √σ²
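These sums are easy to compute directly. Below is a minimal Python sketch; the pmf values are illustrative, not taken from the lecture.

```python
# Minimal sketch: mean and variance of a discrete distribution from its pmf.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}   # hypothetical values of P(X = x)

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
variance_alt = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2   # short-cut form
std_dev = variance ** 0.5

print(mean, variance, variance_alt, std_dev)
```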

Continuous distribution

A random variable X is continuous if its set of possible values is an entire interval of numbers (if A < B, then any number x between A and B is possible).

Continuous distribution: when a variable is measured and expressed on a continuous scale, its probability distribution is called a continuous distribution.

A probability distribution of X is then a function f(x) such that for any two numbers a and b,

P(a ≤ X ≤ b) = ∫_a^b f(x) dx

Properties of a probability density function are
1)  f(x) ≥ 0
2)  ∫_{−∞}^{∞} f(x) dx = 1

Probability density function


For f(x) to be a probability density function (pdf):

f(x) ≥ 0 for all values of x.

The area of the region between the graph of f and the x-axis is equal to 1 (Area = 1).

The Cumulative Distribution Function F(x)

The cumulative distribution function F(x) for a continuous random variable X is defined for every number x by

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(y) dy

For each x, F(x) is the area under the density curve to the left of x.

Using F(x) to Compute Probabilities

Let X be a continuous random variable with pdf f(x) and cdf F(x). Then for any number a,

P(X > a) = 1 − F(a)

and for any numbers a and b with a < b,

P(a ≤ X ≤ b) = F(b) − F(a)

Obtaining f(x) from F(x)

If X is a continuous random variable with pdf f(x) and cdf F(x), then at every number x for which the derivative F′(x) exists,

F′(x) = f(x).

Expected Value or Mean Value


The expected or mean value of a continuous random variable X with pdf f(x) is

μ_X = E(X) = ∫_{−∞}^{∞} x f(x) dx

Expected Value of h(X)


If X is a continuous random variable with pdf f(x) and h(X) is any function of X, then

E[h(X)] = μ_{h(X)} = ∫_{−∞}^{∞} h(x) f(x) dx

Standard Deviation and Variance

The variance of a continuous random variable X with pdf f(x) and mean μ is

σ²_X = V(X) = ∫_{−∞}^{∞} (x − μ)² f(x) dx = E[(X − μ)²]

The standard deviation is

σ_X = √V(X).

Short-cut Formula for Variance

V(X) = E(X²) − [E(X)]²

Important Distributions

Several distributions are used quite frequently in reliability analysis. They are:
Discrete distributions
The binomial distribution
The Poisson distribution
Continuous distributions
Normal distribution
Log-normal distribution
Exponential distribution
Weibull distribution
Gamma distribution

Binomial distribution
The binomial distribution with parameters n > 0 and 0 < p < 1 has the probability distribution

p(x) = C(n, x) p^x (1 − p)^(n−x),   x = 0, 1, …, n

The mean and variance of the binomial distribution are

Mean: μ = np
Variance: σ² = np(1 − p)
Standard deviation: σ = √(np(1 − p))

In the process of manufacturing a product, inspection may test whether the product is good or defective. The probability outcomes for analysing several products would follow a binomial distribution.

Binomial Example

Example 1. The probability of getting exactly 2 heads in 6 tosses of a fair coin is

p(2) = C(6, 2) (1/2)² (1/2)⁴ = [6! / (2! 4!)] (1/2)² (1/2)⁴ = 15/64 ≈ 0.23

Example 2. The probability of getting at least 4 heads in 6 tosses of a fair coin is

P(X ≥ 4) = C(6, 4) (1/2)⁴ (1/2)² + C(6, 5) (1/2)⁵ (1/2)¹ + C(6, 6) (1/2)⁶
         = 15/64 + 6/64 + 1/64 = 22/64 = 11/32 ≈ 0.34
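The two coin-toss results above can be checked quickly in Python. A minimal sketch using only the standard library; the helper binom_pmf is introduced here for illustration.

```python
from math import comb

n, p = 6, 0.5   # 6 tosses of a fair coin

def binom_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_exactly_2 = binom_pmf(2, n, p)                                  # 15/64 ≈ 0.23
p_at_least_4 = sum(binom_pmf(x, n, p) for x in range(4, n + 1))   # 22/64 ≈ 0.34

print(round(p_exactly_2, 3), round(p_at_least_4, 3))
```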

Binomial Exercise
To guard against spurious failures causing a plant outage, automatic protective systems are often designed with three protective channels. Any 2 out of 3 channels need to be in a failed state to initiate a system shutdown. Assuming that the reliability of each channel is 0.99, determine the probability of such an automatic system being in a failed state when an inspection is carried out on the system.

Solution

If the reliability R of a single channel is 0.99, then the probability p of it being in a failed state is (1 − 0.99) = 0.01. For the APS to be in a failed state on demand, two or more channels must have failed:

P(X = 2) = C(3, 2) p² (1 − p) = [3! / (2! 1!)] (0.01)² (0.99) = 2.97 × 10⁻⁴

P(X = 3) = C(3, 3) p³ = (0.01)³ = 1 × 10⁻⁶

Hence the probability of the system being in a failed state is

ps = P(X = 2) + P(X = 3) ≈ 2.98 × 10⁻⁴
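As a quick cross-check, the same 2-out-of-3 calculation can be done with the binomial pmf in a few lines of Python; a sketch assuming independent channel failures with p = 0.01.

```python
from math import comb

p = 1 - 0.99   # probability that a single channel is in a failed state
n = 3          # protective channels

# System is failed when 2 or more channels have failed.
p_system_failed = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2, n + 1))
print(p_system_failed)   # ≈ 2.98e-4, matching the hand calculation
```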

Conditions for Binomial distribution

The experiment consists of n repetitions or trials
Each trial can have only one of two possible outcomes
The probability of a given outcome is the same for each trial
The trials are independent (i.e. the probability of obtaining a given result does not depend upon the previous/other trials)

Poisson distribution

A useful distribution in statistical quality control is the Poisson distribution, defined as follows:

f(x) = e^(−λ) λ^x / x!,   x = 0, 1, …

The mean and variance of the Poisson distribution are

Mean = λ and Variance = λ

where the parameter λ (lambda) is greater than zero, and e is a constant equal to approximately 2.71828.
Note that the mean and variance of the Poisson distribution are equal.

Poisson Example
Suppose the average number of lions seen on a 1-day safari is 5. What is the probability that tourists will see fewer than four lions on the next 1-day safari?
Solution: This is a Poisson experiment in which we know the following:
λ = 5, since 5 lions are seen per safari, on average.
x = 0, 1, 2, or 3, since we want the likelihood that tourists will see fewer than 4 lions; that is, the probability that they will see 0, 1, 2, or 3 lions.
e = 2.71828.

Poisson Solutions
To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3 lions.
We need to calculate the sum of four probabilities: P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5).
To compute this sum, we use the Poisson formula:
P(X < 4; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)
P(X < 4; 5) = [(e⁻⁵)(5⁰) / 0!] + [(e⁻⁵)(5¹) / 1!] + [(e⁻⁵)(5²) / 2!] + [(e⁻⁵)(5³) / 3!]
P(X < 4; 5) = [0.006738] + [0.033690] + [0.084224] + [0.140375]
P(X < 4; 5) = 0.2650
The probability of seeing no more than 3 lions is 0.2650.
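The same sum follows directly from the Poisson formula; a minimal Python sketch, where the helper poisson_pmf is introduced here for illustration.

```python
from math import exp, factorial

lam = 5   # average number of lions per 1-day safari

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson(lam) random variable."""
    return exp(-lam) * lam**x / factorial(x)

p_fewer_than_4 = sum(poisson_pmf(x, lam) for x in range(4))   # P(X <= 3)
print(round(p_fewer_than_4, 4))   # ≈ 0.2650, matching the worked solution
```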

Poisson distribution
Characteristics of a Poisson experiment are:

The probability of an occurrence is the same over any two intervals of equal length
The occurrence or non-occurrence in any interval is independent of the occurrence or non-occurrence in any other interval

Normal Distribution

Most useful in modelling distributions of physical nature, e.g. measurement of the height of people in a population.
The probability density function is symmetrical and bell-shaped, with μ as mean and σ as standard deviation:

f(x) = [1 / (σ √(2π))] e^(−½ ((x − μ)/σ)²),   where −∞ < x < ∞.

Normal Distribution
Normal distributions are defined using 2 parameters, mean and standard deviation: N(μ, σ).
Bell shaped
Symmetrical
Mean, median and mode are equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: −∞ to +∞
Area under the bell curve = 1.00

Spread of S.D
(Figure: normal curves with different standard deviations, showing how σ determines the spread.)

Standard Normal Distributions

The normal distribution with parameter values μ = 0 and σ = 1 is called a standard normal distribution. The random variable is denoted by Z. The probability density function is

f(z; 0, 1) = [1 / √(2π)] e^(−z²/2)

The cumulative distribution function is

Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} f(y; 0, 1) dy

Standard Normal Cumulative Areas
(Figure: standard normal curve with the area to the left of z shaded; shaded area = Φ(z).)

Standard Normal

By substituting z = (x − μ)/σ, we transform the normal distribution into the standard normal distribution.
Standard normal distribution mean = 0,
Standard normal distribution variance = 1.
Standard Cumulative Normal Distribution Table:

Z     0.00  0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09
0.0   0.500 0.504 0.508 0.512 0.516 0.520 0.524 0.528 0.532 0.536
0.1   0.540 0.544 0.548 0.552 0.556 0.560 0.564 0.567 0.571 0.575
1.0   0.841 0.844 0.846 0.848 0.851 0.853 0.855 0.858 0.860 0.862
2.0   0.977 0.978 0.978 0.979 0.979 0.980 0.980 0.981 0.981 0.982
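Table values like these can be reproduced with the identity Φ(z) = ½[1 + erf(z/√2)]; a short Python sketch:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z) = P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# First column of the table above (z = 0.0, 0.1, 1.0, 2.0).
for z in (0.0, 0.1, 1.0, 2.0):
    print(z, f"{std_normal_cdf(z):.3f}")   # 0.500, 0.540, 0.841, 0.977
```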

Example

If X is distributed normally with mean, μ, of 100 and standard deviation, σ, of 50, the Z value for X = 200 is

Z = (200 − 100) / 50 = 2.0

This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.

Empirical Rule
μ ± 2σ covers about 95% of X values (95.44%)
μ ± 3σ covers about 99.7% of X values (99.72%)

Standard Normal Table

From the standard normal table we read the probability of being less than a desired value of Z (i.e., from negative infinity to Z), for example

P(Z < 2.0) = 0.9772

The Standardized Normal Table

The row shows the value of Z to the first decimal point; the column gives the value of Z to the second decimal point.
The value within the table gives the probability from Z = −∞ up to the desired Z value.
For example, the entry in row 2.0, column 0.00 gives P(Z < 2.00) = 0.9772.

Log-Normal distribution

It is a more versatile distribution than the Normal distribution as it has a range of shapes, and therefore it is often a better fit to reliability data, such as for populations with wear-out characteristics.
It does not have the disadvantage of the Normal distribution of extending below zero to −∞. The p.d.f. is given as

f(x) = [1 / (x σ √(2π))] e^(−½ ((ln x − μ)/σ)²)   for x > 0

Mean = e^(μ + σ²/2)

SD = √[(e^(σ²) − 1) e^(2μ + σ²)]
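For concreteness, here is a small Python sketch that evaluates these mean and SD formulas for illustrative (assumed) parameter values μ and σ:

```python
from math import exp, sqrt

mu, sigma = 1.0, 0.5   # illustrative log-normal parameters, not from the lecture

mean = exp(mu + sigma**2 / 2)
sd = sqrt((exp(sigma**2) - 1) * exp(2 * mu + sigma**2))

print(round(mean, 3), round(sd, 3))
```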

Exponential distribution

A continuous random variable X has an exponential distribution with parameter λ if the probability density function is

f(x) = λ e^(−λx),   x ≥ 0
f(x) = 0,           otherwise

The mean and variance of a random variable X having the exponential distribution are

Mean = 1/λ,   Variance = 1/λ²

Gamma distribution
A continuous random variable X has a gamma distribution if the probability density function is

f(x) = [1 / (β^α Γ(α))] x^(α−1) e^(−x/β),   x > 0
f(x) = 0,   otherwise

where the parameters satisfy α > 0, β > 0. The standard gamma distribution has β = 1.
For α > 0, Γ(α) is defined as

Γ(α) = ∫_0^∞ x^(α−1) e^(−x) dx

Mean: E(X) = αβ,   Variance: V(X) = αβ²

Weibull Distribution

The Weibull is a very flexible life distribution model with two parameters. It has probability density function given as

f(x) = (β / α^β) x^(β−1) e^(−(x/α)^β),   x > 0
f(x) = 0,   x ≤ 0

with parameters α > 0 (scale) and β > 0 (shape).

The mean and variance are

μ = α Γ(1 + 1/β)

σ² = α² [Γ(1 + 2/β) − (Γ(1 + 1/β))²]
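The gamma-function expressions for the Weibull mean and variance are easy to evaluate numerically; a sketch with assumed, illustrative scale and shape values:

```python
from math import gamma

alpha, beta = 100.0, 1.5   # illustrative scale and shape (e.g. characteristic life of 100 h)

mean = alpha * gamma(1 + 1 / beta)
variance = alpha**2 * (gamma(1 + 2 / beta) - gamma(1 + 1 / beta) ** 2)

print(round(mean, 2), round(variance, 2))
```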

Cumulative distribution function (cdf) F(t)

The cumulative distribution function (cdf) of a continuous random variable is defined by

F(t) = P(T ≤ t) = ∫_{−∞}^{t} f(s) ds

The cumulative distribution function equals 0 at −∞ and equals 1 at +∞. The relationship between the density and the distribution can also be expressed as

f(t) = dF(t)/dt.

Conditional probability

Based on the definition of conditional probability, the conditional probability density function f(t1 | t2) for a random variable T1 given another random variable T2 is given by

f(t1 | t2) = f(t1, t2) / f(t2)

where f(t1, t2) is the joint density function of T1 and T2, and f(t2) is the marginal density function of T2 given by

f(t2) = ∫ f(t1, t2) dt1.

Similarly, the conditional probability density for T2 given T1 = t1 is given by

f(t2 | t1) = f(t1, t2) / f(t1)

Joint density function

The joint density function can be obtained from a given joint cumulative distribution function by evaluating the partial derivative as follows:

f(t) = ∂F(t) / ∂t.

That is,

f(t1, t2, …, tn) = ∂ⁿ F(t1, t2, …, tn) / (∂t1 ∂t2 … ∂tn).

This concept simply offers a convenient way of modelling n random variables simultaneously.

Maximum Likelihood Estimator (MLE)

Suppose we have random variables T1, T2, …, Tn having a joint density, and we observe a random sample of observations t1, t2, …, tn from a population of interest with common density function

f(t1, t2, …, tn | θ)

where the form of f is known and θ is not known. Given observed values Ti = ti, where i = 1, 2, …, n, the likelihood L(θ) as a function of t1, t2, …, tn is defined by

L(θ; t1, t2, …, tn) = ∏_{i=1}^{n} f(ti | θ)

and we sometimes abbreviate L(θ; t1, t2, …, tn) to L(θ) for convenience.

Log-likelihood estimates

It is usually easier to maximize the natural logarithm of the likelihood rather than the likelihood function itself, Rice (1995). The log-likelihood is defined by

l(θ) = Σ_{i=1}^{n} ln f(ti | θ).

The maximum likelihood estimates may then be found from the log-likelihood by setting the derivative to zero. The solution sometimes involves numerical methods such as Newton-Raphson or quasi-Newton algorithms. Such methods are available in subroutine libraries such as NAG, Crowder et al. (1993). However, from a theoretical point of view, the maximum-likelihood estimation method ensures that most statistical problems of parameter estimation likely to arise in reliability contexts are easily dealt with.

Exponential MLE

For both complete and censored data the MLE of lambda is given as

λ̂ = r / T

where r is the number of failures and T is the cumulative test time (total time on test).

For the censored case, suppose r failures are observed at times t1, …, tr and the remaining n − r units are censored at time tr:

f(ti) = λ e^(−λ ti),   i = 1, 2, …, r
Pr(Ti > tr) = e^(−λ tr)   for each of the n − r censored units

L(t1, …, tr; λ) = λ^r exp(−λ Σ_{i=1}^{r} ti) [exp(−λ tr)]^(n−r) = λ^r exp(−λ [Σ_{i=1}^{r} ti + (n − r) tr])

And the natural logarithm of the likelihood function is

ln L(λ) = r ln λ − λ [Σ_{i=1}^{r} ti + (n − r) tr].   Therefore,

d ln L(λ)/dλ = r/λ − [Σ_{i=1}^{r} ti + (n − r) tr] = 0

Solving for lambda,

λ̂ = r / [Σ_{i=1}^{r} ti + (n − r) tr] = r / T

Exponential MLE

For an exponential distribution with complete data and r representing the number of failures,

L(λ) = ∏_{i=1}^{n} λ exp(−λ ti) = λ^r exp(−λ Σ_{i=1}^{n} ti)

Taking the logarithm,

ln L(λ) = r ln λ − λ Σ_{i=1}^{n} ti

d ln L(λ)/dλ = r/λ − Σ_{i=1}^{n} ti = 0

Solving for lambda,

λ̂ = r / Σ_{i=1}^{n} ti

This confirms that the MTTF for the exponential distribution can be estimated by taking the Total Time on Test and dividing by the number of failures.
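A minimal Python sketch of the estimator λ̂ = r / T for (possibly censored) exponential life data; the failure times, number of units and censoring time below are illustrative, not from the lecture.

```python
failure_times = [120.0, 340.0, 560.0, 800.0]   # r observed failure times (hours)
n_units = 10                                   # units placed on test
censor_time = 800.0                            # remaining units censored at t_r

r = len(failure_times)
total_time_on_test = sum(failure_times) + (n_units - r) * censor_time   # T

lam_hat = r / total_time_on_test       # MLE of the failure rate lambda
mttf_hat = total_time_on_test / r      # estimated MTTF = 1 / lam_hat

print(lam_hat, mttf_hat)
```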

Confidence Interval estimation


Confidence intervals give a plausible range of values for a population parameter.

Confidence intervals also give information about the precision of an estimate.

(When sampling variability is high, the confidence interval will be wide to reflect the uncertainty of the observation.)

Point and Interval Estimates

A point estimate is a single number
A confidence interval provides additional information about variability

(Figure: an interval running from the Lower Confidence Limit through the Point Estimate to the Upper Confidence Limit; the distance between the limits is the width of the confidence interval.)

Concepts of Confidence Intervals

A confidence interval is based on:
The value of a statistic (the mean, odds ratios, indices etc.)
The standard error (SE) of the sample
The desired width of the confidence interval (e.g., the 95% confidence interval or the 99% confidence interval)

General Format of Confidence Intervals

estimate ± (measure of how confident we want to be) × (standard error)

The estimate is the value of the statistic in the sample (e.g., mean, odds ratio, etc.)
The confidence multiplier comes from a Z table or a t table, depending on the sampling distribution of the statistic
The standard error is the standard error of the statistic

Point estimators

Characteristics

Unbiased: E(x̄) = μ
Efficiency: as n increases, s gets closer to σ
Standard error: σ / √n
Sampling error: the difference between the sample estimate and the population value

For a large sample, a confidence interval for a population mean is:

x̄ ± Z σ / √n

The mean, standard deviation, and n depend on the sample, and Z depends on the confidence level.

Confidence Interval for μ (σ Known)

Assumptions
Population standard deviation σ is known
Population is normally distributed
If population is not normal, use a large sample (n ≥ 30)

Confidence interval estimate:

x̄ ± Z σ / √n

(where Z is the normal distribution critical value)

Common Z levels of confidence

Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level    Z value
80%                 1.28
90%                 1.645
95%                 1.96
98%                 2.33
99%                 2.58
99.8%               3.08
99.9%               3.27
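Putting the pieces together, here is a short Python sketch of the σ-known interval x̄ ± Z σ/√n, using illustrative (assumed) sample values and the Z value from the table above:

```python
from math import sqrt

x_bar, sigma, n = 50.0, 8.0, 36   # illustrative sample mean, known sigma, sample size
z = 1.96                          # 95% confidence level

margin = z * sigma / sqrt(n)
ci = (x_bar - margin, x_bar + margin)
print(ci)   # approximately (47.39, 52.61)
```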

Confidence Interval for μ (σ Unknown)

If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S
This introduces extra uncertainty, since S is variable from sample to sample
So we use the t distribution instead of the normal distribution

Confidence Interval for μ (σ Unknown)

Assumptions
Population standard deviation, σ, is unknown
Population is normally distributed
If population is not normal, use a large sample

Use Student's t Distribution
Confidence interval estimate:

x̄ ± t_{n−1} S / √n

(where t is the critical value of the t distribution with n − 1 d.f.)

Example
Suppose we want to estimate the average age of kids that ride a particular roller coaster ride in Blackpool. We take a random sample of 8 kids exiting the ride and find that their ages are: 2, 3, 4, 5, 6, 6, 7, 7.
a. Calculate the sample mean.
b. Calculate the sample standard deviation.
c. Calculate the standard error of the mean.
d. Calculate the 99% confidence interval.

Answer (a, b)
a. To calculate the sample mean:

x̄ = (Σ_{i=1}^{8} Xi) / 8 = (2 + 3 + 4 + 5 + 6 + 6 + 7 + 7) / 8 = 40 / 8 = 5.0

b. To calculate the sample standard deviation:

s²_X = Σ_{i=1}^{8} (Xi − 5)² / (8 − 1) = [3² + 2² + 1² + 0² + 2(1²) + 2(2²)] / 7 = 24 / 7 ≈ 3.4

s_X = √3.4 ≈ 1.9

Answer (c, d)
c. Calculate the standard error of the mean:

s_x̄ = s_X / √n = 1.9 / √8 ≈ 0.67

d. Calculate the 99% confidence interval:

mean ± s_x̄ (t_{df, α/2}) = 5.0 ± 0.67 (3.50) = (2.65, 7.35)

where t comes from Student's t distribution and depends on the sample size through the degrees of freedom, n − 1.
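The whole example can be reproduced with the Python standard library; a minimal sketch, taking the t critical value 3.499 for 7 d.f. as given. Small differences from the slide result come from rounding s to 1.9 in the hand calculation.

```python
from statistics import mean, stdev
from math import sqrt

ages = [2, 3, 4, 5, 6, 6, 7, 7]
n = len(ages)

x_bar = mean(ages)            # 5.0
s = stdev(ages)               # sample SD with n - 1 denominator, ≈ 1.85
se = s / sqrt(n)              # standard error of the mean, ≈ 0.65
t_crit = 3.499                # 99% confidence, 7 degrees of freedom

ci = (x_bar - t_crit * se, x_bar + t_crit * se)
print(x_bar, round(s, 2), round(se, 2), tuple(round(v, 2) for v in ci))   # CI ≈ (2.71, 7.29)
```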

Confidence Interval

The confidence level tells us how sure we can be about our estimate.

It expresses how often the true value of the population parameter lies within the confidence interval.

The confidence level describes the uncertainty associated with a sampling method.

Revision: Possible Questions/Problems

Definitions: probability, random variables, confidence interval
Discrete and continuous probability distributions
Types of distribution: state the density function, mean and standard deviation of the Binomial, Exponential, Poisson and Normal distributions
Maximum Likelihood Estimator for the Exponential Distribution
Characteristics of a Normal distribution
Standard normal calculation using the table

QUESTIONS?

