PLSC214 Topic 5


PLANT SCIENCES 214.3
STATISTICAL METHODS

Topic 5
Discrete Random Variables and Their Probability
Distributions
Topic 5
A. Random variables
B. Probability distribution of a discrete random variable
   • Mean and standard deviation of a discrete random variable
C. Binomial probability distribution
   • Mean and standard deviation of a binomial probability distribution
D. Poisson probability distribution
   • Mean and standard deviation of a Poisson probability distribution
E. Review Questions
Tables Needed!
A. Random Variables
Random Variables
A random variable is a variable whose value is determined
by the outcome of a random experiment.
There are two types of random variables:

• Discrete Random Variable
• Continuous Random Variable
Test your knowledge: What type of variable is the following?
Number of cats owned by a sample of 100 people.

1. Discrete
2. Continuous
Test your knowledge: What type of variable is the following?
Weights of cats owned by a sample of 100 people.

1. Discrete
2. Continuous
B. Probability Distribution of a
Discrete Random Variable
Probability Distribution of a
Discrete Random Variable
The probability distribution of a discrete random variable lists all
the possible values that the random variable can assume and
their corresponding probabilities.
Table 1. Frequency and relative frequency distribution of the number of
vehicles per apartment in residence (maximum 4 students per residence).

Number of Vehicles    Frequency    Relative Frequency
0                     30
1                     470
2                     850
3                     490
4                     160
                      N = 2000     Sum = 1.000
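The empty Relative Frequency column of Table 1 can be filled in by dividing each frequency by N. The following is a minimal Python sketch of that calculation; the variable names are illustrative and not part of the course material.

```python
# Frequencies from Table 1 (number of vehicles -> number of apartments).
frequencies = {0: 30, 1: 470, 2: 850, 3: 490, 4: 160}

# N is the total number of apartments observed.
total = sum(frequencies.values())

# Relative frequency = class frequency / N.
relative_frequencies = {x: f / total for x, f in frequencies.items()}

for x, rf in relative_frequencies.items():
    print(f"{x} vehicles: {rf:.3f}")
print(f"N = {total}, Sum = {sum(relative_frequencies.values()):.3f}")
```

The resulting values are the ones that can be used as approximate probabilities.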
Test your knowledge: The relative frequency
approach to probability in Topic 4 taught us that:

1. The relative frequencies can be calculated and used as approximate
probabilities of the outcomes occurring.
2. The relative frequencies can NEVER be used as approximate
probabilities of the outcomes occurring.
3. I can’t remember anything about this from Topic 4.
Probability Distribution of a
Discrete Random Variable
Relative frequencies can be used as approximate probabilities.
Therefore:
Table 2. Probability distribution of the number of
vehicles per apartment.
Number of Vehicles    Probability
X                     P(X)
0                     0.015
1                     0.235
2                     0.425
3                     0.245
4                     0.080
                      ΣP(X) = 1.000
[Figure: bar graph of Probability (vertical axis) against Number of Vehicles (horizontal axis, 0 to 4)]
Test your knowledge: Probability from
Topic 4 stated this about outcomes:
1. Every probability for an outcome must be
between 0 and 1
2. The probabilities of all the possible outcomes
must add to 1.
3. Both of the above.
Two Characteristics of a Probability Distribution:

1. $0 \le P(x) \le 1$ (for each value of x)

2. $\sum P(x) = 1$ (if the probabilities are rounded, the sum may not be exactly 1)


Table 2. Probability distribution of the number of
vehicles per apartment.
Number of Vehicles    Probability
X                     P(X)
0                     0.015
1                     0.235
2                     0.425
3                     0.245
4                     0.080
                      ΣP(X) = 1.000

P(x=2) = P(2) =
P(x>2) = P(3)+P(4) =
P(x<4) = P(0)+P(1)+P(2)+P(3) =
OR
P(x<4) = 1 - P(4) =
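The three probabilities asked for above can be checked by summing the relevant rows of Table 2. This is a small Python sketch under that reading; the helper name `prob` is hypothetical.

```python
# Probability distribution from Table 2 (number of vehicles -> P(X)).
table2 = {0: 0.015, 1: 0.235, 2: 0.425, 3: 0.245, 4: 0.080}


def prob(condition):
    """Sum P(x) over every x in Table 2 that satisfies the condition."""
    return sum(p for x, p in table2.items() if condition(x))


p_exactly_2 = prob(lambda x: x == 2)       # P(x = 2)
p_more_than_2 = prob(lambda x: x > 2)      # P(3) + P(4)
p_less_than_4 = prob(lambda x: x < 4)      # P(0) + P(1) + P(2) + P(3)
p_less_than_4_alt = 1 - table2[4]          # complement: 1 - P(4)

print(f"P(x=2) = {p_exactly_2:.3f}")
print(f"P(x>2) = {p_more_than_2:.3f}")
print(f"P(x<4) = {p_less_than_4:.3f} (complement gives {p_less_than_4_alt:.3f})")
```

Both routes to P(x<4) agree because the probabilities in the table sum to 1.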
Mean of a Discrete Random Variable

The mean of a discrete random variable x is the value that is
expected to occur per repetition, on average, if an experiment is
repeated a large number of times.

It is denoted by µ and calculated as
$\mu = \sum x P(x)$

The mean of x is also called its expected value and is denoted by E(x):
$E(x) = \sum x P(x)$
Standard Deviation of a Discrete
Random Variable
The standard deviation of a discrete random variable x measures
the spread of its probability distribution. A high standard deviation
value for a discrete random variable indicates that x can assume
values over a larger range about the mean. A smaller value indicates
that most of the values of x are clustered around the mean.
It is convenient to use the shortcut formula:

$\sigma = \sqrt{\sum x^2 P(x) - \mu^2}$

Variance ($\sigma^2$) is the standard deviation squared.

Formula as seen in some textbooks: $\sigma^2 = \sum_i (x_i - \mu)^2 P(X = x_i)$
Table 3. The following table gives the probability distribution of
the number of cell phones sold on a given day at a store.

Cell phones sold    0      1      2      3      4      5      6
Probability         0.05   0.12   0.19   0.30   0.20   0.10   0.04

Calculate the mean and standard deviation.

x    P(x)    xP(x)    x²P(x)
0    0.05
1    0.12
2    0.19
3    0.30
4    0.20
5    0.10
6    0.04

     ΣxP(x) =         Σx²P(x) =
$\mu = \sum x P(x)$ =

$\sigma = \sqrt{\sum x^2 P(x) - \mu^2}$ =
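As a check on the hand calculation for Table 3, here is a minimal Python sketch that computes the mean and then the standard deviation in two ways: with the shortcut formula and with the definitional (textbook) formula. The variable names are illustrative only.

```python
from math import sqrt

# Probability distribution from Table 3 (cell phones sold -> probability).
table3 = {0: 0.05, 1: 0.12, 2: 0.19, 3: 0.30, 4: 0.20, 5: 0.10, 6: 0.04}

# Mean: mu = sum of x * P(x).
mu = sum(x * p for x, p in table3.items())

# Shortcut formula: sigma = sqrt(sum of x^2 * P(x) - mu^2).
sigma_shortcut = sqrt(sum(x**2 * p for x, p in table3.items()) - mu**2)

# Definitional formula: sigma^2 = sum of (x - mu)^2 * P(x).
sigma_definition = sqrt(sum((x - mu) ** 2 * p for x, p in table3.items()))

print(f"mu = {mu:.2f}")
print(f"sigma (shortcut)   = {sigma_shortcut:.4f}")
print(f"sigma (definition) = {sigma_definition:.4f}")
```

The two standard deviations should agree up to floating-point rounding, which is why the shortcut formula is safe to use by hand.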
Table 4. The following table gives the probability distribution of the
number of wallets sold on a given day at a department store.

Calculate the mean and standard deviation for this probability
distribution.

x     P(x)    xP(x)    x²P(x)
-1    0.25
1     0.50
10    0.25

$\mu = \sum x P(x)$ =

$\sigma = \sqrt{\sum x^2 P(x) - \mu^2}$ =
You start a new life insurance company for pet owners. If their pet dies accidentally,
you pay $1,000, and if the pet dies of disease or old age, you pay nothing. The
statistics of your consumer base have come in. Your findings show that 8 out of 10
pets in your area die of accidental causes. You decide that:

1. Your business should do well.
2. You had better charge a lot for insurance.
3. ???
What is an Actuary?
• These are people who put a price on risk,
estimating the likelihood and cost of rare events so
they can be insured.
Insurance Example
Insurance companies make bets. They bet you are going to live a long life;
you bet you are going to die sooner. The insurance company wants to stay
in business; you want to have insurance. An actuary will help find a “fair
price” for your bet. When an insurance company averages over enough
customers, it can make reasonably accurate estimates of the amount it can
expect to collect on a policy before it has to pay its benefit.

Example: An insurance company offers a “death and disability” policy that
pays up to $10,000 if you die and $5,000 if you are disabled. It charges
a premium of $50 a year. Is the company likely to make a profit?
You need to know the probability that clients will die or be disabled in any
year.
Example: Table 5. Probabilities of payouts for insurance policies.
Policy Holder Outcome    Payout     Probability
Death                    $10,000    0.001
Disability               $5,000     0.002
Neither                  $0         0.997

$\mu = \sum x P(x) = 10000(0.001) + 5000(0.002) + 0(0.997) = 20$

The company expects to pay $20 per policy and charges $50
per policy.

$\sigma = \sqrt{\sum x^2 P(x) - \mu^2} \approx 386.8$

The standard deviation in this case is quite large (and so is the risk). This indicates that making $50 on a policy
is no sure thing, even though the company would keep the full premium on about 997 out of 1000 policies.
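The insurance figures can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in Table 5; the expected profit line assumes profit per policy is simply the premium minus the expected payout, ignoring any other costs.

```python
from math import sqrt

# Payout distribution from Table 5 (payout in dollars -> probability).
payouts = {10_000: 0.001, 5_000: 0.002, 0: 0.997}
premium = 50  # dollars charged per policy per year

# Expected payout per policy: mu = sum of x * P(x).
expected_payout = sum(x * p for x, p in payouts.items())

# Standard deviation of the payout, using the shortcut formula.
sd_payout = sqrt(sum(x**2 * p for x, p in payouts.items()) - expected_payout**2)

# Assumption: profit per policy = premium - payout, with no other costs.
expected_profit = premium - expected_payout

print(f"Expected payout: ${expected_payout:.2f}")
print(f"Standard deviation of payout: ${sd_payout:.1f}")
print(f"Expected profit per policy: ${expected_profit:.2f}")
```

The standard deviation is roughly 19 times the expected payout, which is the numerical sense in which the risk on a single policy is large.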
C. The Binomial Probability
Distribution
The Binomial Probability Distribution

• Is a widely used discrete probability distribution.
• It is used for individuals that are divided into two
mutually exclusive groups.
  ◦ For example: dead or alive, left or right, etc.
Arbitrarily, one group is called a ‘success’ and the other a
‘failure’.
• If a random sample of n individuals is taken from
a population, the number of individuals falling into
the ‘success’ category follows a binomial distribution.
The Binomial Experiment
• The binomial probability distribution is a
probability distribution that describes
probability for experiments in which there are
two mutually exclusive outcomes:
  ◦ success or failure
• Therefore, a binomial experiment is an
experiment in which there are only two
possible outcomes.
Conditions of a Binomial Experiment:

A binomial experiment must satisfy the following four conditions:

1. There are n identical trials. The given experiment is repeated n
times, and all these repetitions are performed under identical
conditions (called Bernoulli trials).
2. Each trial has only two possible outcomes. These outcomes are
called a success and a failure.
3. The probabilities of the two outcomes remain constant. The
probability of success is denoted by p and the probability of failure
by q, and p + q = 1. The probabilities p and q remain constant for
each trial.
4. The trials are independent. The outcome of one trial does not
affect the outcome of another trial.
Binomial Formula: Find the probability

For a binomial experiment, the probability of exactly x
successes in n trials is given by the binomial formula:
$P(x) = {}_nC_x \, p^x q^{n-x}$
Where, n = total number of trials
p = probability of success
q = 1-p = probability of failure
x = number of successes in n trials
n-x = number of failures in n trials
We know,
${}_nC_x = \frac{n!}{x!(n-x)!}$
Therefore,
$P(x) = \frac{n!}{x!(n-x)!} \, p^x q^{n-x}$
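The binomial formula translates almost directly into code. The sketch below uses Python's standard `math.comb` for the nCx term; the function name `binomial_pmf` and the values n = 4, p = 0.5 are chosen for illustration only.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials: nCx * p^x * q^(n-x)."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


# Illustrative values: n = 4 trials with success probability p = 0.5.
n, p = 4, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities for x = 0, 1, ..., n should add up to 1.
total = sum(probabilities)
print(probabilities, total)
```

Summing over all x from 0 to n is a quick sanity check that the function satisfies the second characteristic of a probability distribution.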
Combinations
Combinations give the number of ways x elements can be
selected from n elements. The notation used to denote the
total number of combinations is

${}_nC_x$

where n denotes the total number of elements and x denotes the
number of elements selected per selection.

This is read as “the number of combinations of n
elements selected x at a time”.
Number of Combinations
The number of combinations for selecting x from n distinct elements is
given by the formula:
${}_nC_x = \frac{n!}{x!(n-x)!}$

Special cases: ${}_nC_n = 1$ and ${}_nC_0 = 1$

The symbol n!, read as “n factorial”, represents the product of all
the integers from n to 1.
$n! = n(n-1)(n-2)(n-3) \cdots 3 \cdot 2 \cdot 1$
Number of Combinations
Example: Three members of a group presentation will be randomly
selected from 5 people. How many combinations are possible?
Solution:
${}_nC_x = {}_5C_3 = \frac{5!}{3!(5-3)!} = 10$

“the number of combinations of n=5 elements selected x=3 at a time”

If the 5 people are denoted as A, B, C, D, and E, then the possible
combinations are
ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
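The list of groups above can also be generated rather than written out by hand. A minimal Python sketch, assuming the five people are labelled A to E as in the example, compares the enumerated groups with the factorial formula:

```python
from itertools import combinations
from math import comb, factorial

people = "ABCDE"

# Enumerate every group of 3 people (order within a group does not matter).
groups = ["".join(group) for group in combinations(people, 3)]

# Count from the formula n! / (x! (n - x)!) with n = 5 and x = 3.
count_by_formula = factorial(5) // (factorial(3) * factorial(5 - 3))

print(groups)
print(len(groups), count_by_formula, comb(5, 3))
```

All three counts agree, and the enumeration order matches the list in the example.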
Probability of Success and the Shape of the
Binomial Distribution
Table 1: Probability distribution of x for n = 4 and p = 0.50
x    P(x)
0    0.0625
1    0.2500
2    0.3750
3    0.2500
4    0.0625
[Fig. Bar graph for the probability distribution of Table 1: symmetric]

Table 2: Probability distribution of x for n = 4 and p = 0.30 (p < 0.5)
x    P(x)
0    0.2401
1    0.4116
2    0.2646
3    0.0756
4    0.0081
[Fig. Bar graph for the probability distribution of Table 2: skewed to the right]

Table 3: Probability distribution of x for n = 4 and p = 0.80 (p > 0.5)
x    P(x)
0    0.0016
1    0.0256
2    0.1536
3    0.4096
4    0.4096
[Fig. Bar graph for the probability distribution of Table 3: skewed to the left]
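The three tables can be regenerated from the binomial formula to see how the shape changes with p. This is an illustrative Python sketch; the dictionary `tables` and the text bar chart are conveniences introduced here, not part of the slides.

```python
from math import comb

n = 4  # number of trials used in all three tables


def binomial_distribution(n, p):
    """Return [P(0), P(1), ..., P(n)] for a binomial with parameters n and p."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


# One distribution per success probability shown in the tables.
tables = {p: binomial_distribution(n, p) for p in (0.50, 0.30, 0.80)}

for p, dist in tables.items():
    print(f"p = {p:.2f}")
    for x, prob in enumerate(dist):
        # A crude text bar: one '#' per 0.02 of probability.
        print(f"  x = {x}  P(x) = {prob:.4f}  {'#' * round(prob / 0.02)}")
```

With p = 0.50 the bars mirror each other around x = 2; with p below 0.5 the long tail is on the right, and with p above 0.5 it is on the left.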


Using the Binomial Formula

$P(x) = {}_nC_x \, p^x q^{n-x}$

Example: It is known that 2% of students will be done their labs before the first
midterm. What is the probability that exactly 1 out of 10 students will be done before
the first midterm?

$P(x = 1) = {}_nC_x \, p^x q^{n-x} = {}_{10}C_1 (0.02)^1 (0.98)^{10-1}$ =

Example: It is known that 2% of students will be done their labs before the first
midterm. What is the probability that at most 1 student out of 10 will be done
before the first midterm?

$P(x = 1) + P(x = 0) = 0.1667 + {}_{10}C_0 (0.02)^0 (0.98)^{10-0}$ =
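Both lab examples can be evaluated numerically. The following Python sketch plugs n = 10 and p = 0.02 into the binomial formula; the variable names are illustrative.

```python
from math import comb

n = 10     # students sampled
p = 0.02   # probability a student is done before the first midterm
q = 1 - p  # probability a student is not done

# Exactly 1 student done: 10C1 * 0.02^1 * 0.98^9.
p_exactly_1 = comb(n, 1) * p**1 * q ** (n - 1)

# Exactly 0 students done: 10C0 * 0.02^0 * 0.98^10.
p_exactly_0 = comb(n, 0) * p**0 * q ** (n - 0)

# "At most 1" means x = 0 or x = 1.
p_at_most_1 = p_exactly_0 + p_exactly_1

print(f"P(x = 1)  = {p_exactly_1:.4f}")
print(f"P(x = 0)  = {p_exactly_0:.4f}")
print(f"P(x <= 1) = {p_at_most_1:.4f}")
```

The value of P(x = 1) agrees with the 0.1667 used in the second example.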
Binomial Tables and Calculators
• For our class, we can use the tables provided.
• Excel can also be used.
• Many free online calculators are available.
Using the Table of Binomial
Probabilities
$P(x) = {}_nC_x \, p^x q^{n-x}$
The table gives the binomial probability given: n = total number of trials,
p = probability of success, and x = number of successes in n trials.

Example: It is known that 50% of dogs will dig holes in their owners' yards.
What is the probability that exactly 1 out of 5 dogs will dig a hole?
Using the table: n = 5, x = 1, p = 0.5
P(x = 1) =

Example: It is known that 50% of dogs will dig holes in their owners' yards.
What is the probability that at least 3 dogs out of 5 will dig holes?

P(x ≥ 3) = P(x = 3) + P(x = 4) + P(x = 5)

Using the table: P(x = 3) =      , P(x = 4) =      , P(x = 5) =

P(x = 3) + P(x = 4) + P(x = 5) =


Key Phrase and Math Symbols
Phrase Math Symbol
“at least” or “no less than” ≥
“more than” or “greater than” >
“fewer than” or “less than” <
“no more than” or “at most” ≤
“exactly” =
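Each key phrase corresponds to summing a different slice of the distribution. As a sketch, the Python below applies every phrase in the table to the dog example (n = 5, p = 0.5) with a cut-off of 3 dogs; the cut-off variable `k` is introduced here for illustration.

```python
from math import comb

n, p = 5, 0.5  # 5 dogs, each digs a hole with probability 0.5

# pmf[x] holds P(x) for x = 0, 1, ..., n.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

k = 3  # cut-off number of dogs

at_least_k = sum(pmf[k:])         # "at least 3":   P(x >= 3)
more_than_k = sum(pmf[k + 1:])    # "more than 3":  P(x > 3)
fewer_than_k = sum(pmf[:k])       # "fewer than 3": P(x < 3)
at_most_k = sum(pmf[: k + 1])     # "at most 3":    P(x <= 3)
exactly_k = pmf[k]                # "exactly 3":    P(x = 3)

print(at_least_k, more_than_k, fewer_than_k, at_most_k, exactly_k)
```

Note that "at least" and "fewer than" are complements, as are "more than" and "at most", so each pair sums to 1.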
Mean and Standard Deviation of a
Binomial Distribution
If a discrete random variable has a binomial distribution, then we can still
use the formulas we talked about earlier: $\mu = \sum x P(x)$

For a Binomial Distribution:

$\mu = np$, $\sigma = \sqrt{npq}$, $\sigma^2 = npq$

Where: n = total number of trials
p = probability of success
q = 1-p = probability of failure
Mean and Standard Deviation of a
Binomial Distribution
Example: 37% of people have the blood type O+. What
is the mean of a total of 40 trials (how many people in a
random sample of 40 people would you expect to have
O+ blood type) and what is the standard deviation?

$\mu = np$
$\sigma = \sqrt{npq}$
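For the blood type example, the two formulas only need n = 40 and p = 0.37. A minimal Python sketch of the calculation (variable names are illustrative):

```python
from math import sqrt

n = 40     # people in the random sample
p = 0.37   # probability a person has blood type O+
q = 1 - p  # probability a person does not have blood type O+

mu = n * p               # expected number of people with O+
variance = n * p * q     # sigma squared
sigma = sqrt(variance)   # standard deviation

print(f"mu = {mu:.1f}, variance = {variance:.3f}, sigma = {sigma:.3f}")
```

So in a sample of 40 people, about 15 would be expected to have O+ blood, give or take roughly 3.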
D. The Poisson Probability
Distribution
Poisson Probability Distribution
• An important probability distribution of a
discrete random variable.
• The only thing we have to know to specify the
Poisson distribution is the mean number of
occurrences, for which the symbol lambda (λ) is
often used.
Poisson Probability Distribution

• Conditions under which a Poisson
probability distribution holds:
  ◦ x is a discrete random variable
  ◦ occurrences are random
  ◦ occurrences are independent (one occurrence
does not influence another occurrence)
Poisson Distribution
• The Poisson distribution describes the number
of occurrences in blocks of time or space. The
occurrences must happen independently of
each other and with equal probability at every
point in time or space.
• Generally in the biological sciences, the Poisson
distribution is used to test whether occurrences
happen randomly. If the occurrences are too
clumped or too dispersed, then something other than
chance is happening.
Poisson Probability Distribution
• Examples that follow a Poisson Distribution:
  ◦ Number of insect parts in a chocolate bar.
  ◦ The number of heart attacks per day in a large city.
  ◦ The number of road kill found per unit length of road.
  ◦ The number of mutations in a given stretch of DNA after a certain
amount of radiation.
  ◦ The number of stars in a given volume of space.
• Examples that do not follow a Poisson Distribution:
  ◦ Outbreaks of contagious disease (clumped).
  ◦ Number of territorial animals per area (because they chase others
away from their territory).
Poisson Probability Distribution
Formula
The probability of x occurrences in an interval is:
$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
Where λ (lambda) is the mean number of occurrences in that
interval and the value of e (exponential number) is
approximately 2.71828.

λ is called the Poisson parameter and is the mean number of
occurrences in an interval.
Poisson Probability Distribution
Formula
Example: On average there are 9.5 mutations found per Gb
(giga base pairs). What is the probability that a Gb will
have 6 mutations?

$P(x) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{(9.5)^6 e^{-9.5}}{6!}$ =

Or use the table of Poisson probabilities and get the same answer!
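The Poisson formula can also be evaluated directly instead of using the table. Here is a minimal Python sketch for the mutation example; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean is lam (lambda)."""
    return lam**x * exp(-lam) / factorial(x)


# Mutation example: mean of 9.5 mutations per Gb, probability of exactly 6.
p_six_mutations = poisson_pmf(6, 9.5)
print(f"P(x = 6) = {p_six_mutations:.4f}")
```

Rounded to four decimal places, the result should match the entry in a table of Poisson probabilities for λ = 9.5 and x = 6.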
Poisson Probability Distribution
Example: A car salesperson sells on average 0.5 cars a day.
What is the probability that the salesperson will sell no more
than 1 car today?

Phrase                           Math Symbol
“at least” or “no less than”     ≥
“more than” or “greater than”    >
“fewer than” or “less than”      <
“no more than” or “at most”      ≤
“exactly”                        =

$\lambda = 0.5$, $P(X \le 1) = P(X = 0) + P(X = 1)$

$P(X \le 1)$ =
             =

Example: A car salesperson sells on average 0.5 cars a day.
What is the probability that the salesperson will sell 6 cars
today?
$\lambda = 0.5$, $x = 6$: the probability is

Does that mean that a salesperson cannot sell 6 cars today?
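Both car sales questions can be explored numerically. The Python sketch below evaluates the Poisson formula with λ = 0.5; it also shows why a four-decimal table entry can read 0.0000 even though the underlying probability is not exactly zero.

```python
from math import exp, factorial

lam = 0.5  # mean number of cars sold per day


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean is lam (lambda)."""
    return lam**x * exp(-lam) / factorial(x)


# "No more than 1 car" means x = 0 or x = 1.
p_at_most_1 = poisson_pmf(0, lam) + poisson_pmf(1, lam)

# Exactly 6 cars in one day.
p_six = poisson_pmf(6, lam)

print(f"P(X <= 1) = {p_at_most_1:.4f}")
print(f"P(X = 6) to 4 decimals = {p_six:.4f}")
print(f"P(X = 6) unrounded     = {p_six:.3e}")
```

The unrounded value is tiny but positive, so selling 6 cars in a day is extremely unlikely rather than impossible.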


Poisson Probability Distribution
Mean and Standard Deviation
For the Poisson Distribution the mean and the
variance are both equal to λ.

$\mu = \lambda$
$\sigma^2 = \lambda$
$\sigma = \sqrt{\lambda}$
Poisson Probability Distribution
Mean and Standard Deviation
For the Poisson Distribution the mean and the
variance are both equal to λ.

$\mu = \lambda$
$\sigma^2 = \lambda$
$\sigma = \sqrt{\lambda}$

Example: A car salesperson sells on average 0.5 cars a day. What is the mean
and standard deviation for that salesperson?

$\mu = \lambda$ =
$\sigma = \sqrt{\lambda}$ =
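The claim that the mean and variance both equal λ can be checked numerically against the general formulas for a discrete random variable. The Python sketch below does this for λ = 0.5; truncating the sum at x = 50 is an assumption made here, justified because the remaining probabilities are negligibly small.

```python
from math import exp, factorial, sqrt

lam = 0.5  # mean number of cars sold per day


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean is lam (lambda)."""
    return lam**x * exp(-lam) / factorial(x)


# Assumption: values of x above 50 contribute essentially nothing for lam = 0.5.
xs = range(0, 51)

# General formulas: mu = sum x P(x), sigma^2 = sum x^2 P(x) - mu^2.
mean_numeric = sum(x * poisson_pmf(x, lam) for x in xs)
variance_numeric = sum(x**2 * poisson_pmf(x, lam) for x in xs) - mean_numeric**2

# Poisson shortcut: sigma = sqrt(lambda).
sigma = sqrt(lam)

print(f"mean = {mean_numeric:.4f}, variance = {variance_numeric:.4f}")
print(f"sigma = {sigma:.4f}")
```

Both the numerical mean and variance come out equal to λ, and the standard deviation is its square root.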
