Reading Material 3
We have seen earlier that uncertainty is omnipresent in the business world and that it induces
variability. To model variability probabilistically, we need the concept of a random variable.
A random variable expresses the outcomes of a random experiment numerically and can
take different values with given probabilities.
Suppose that we have a random experiment with sample space 𝛺 . A function 𝑋 from 𝛺 into
another set 𝑇 is called a (𝑇-valued) random variable. Some typical examples are:
i. The return on an investment in a span (period) of one year.
ii. The closing price of a stock in NSE.
iii. The number of customers entering a shopping complex
iv. The sales volume of a store on a particular day
v. The turnover rate at your organization next year
In general, random variables that take only integer or rational values (a countable set) are
discrete, while those that can take any real value in an interval are continuous. In some cases,
numbers are not immediately associated with the outcomes of a random experiment. For example,
i. You may win a bid or lose
ii. After flipping, a coin may show a head or a tail
iii. A customer can be male or female
We often assign numbers such as 0 and 1 to the possible outcomes in such cases.
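The idea of a random variable as a function from the sample space to numbers can be sketched in a few lines of Python. The outcome labels, the 0/1 coding (head as 1, tail as 0), and the helper names `X` and `flip` are illustrative assumptions, not part of the original material.

```python
import random

# Sample space of a single coin flip; the labels are illustrative.
SAMPLE_SPACE = ["head", "tail"]

def X(outcome):
    """Random variable: maps each outcome to a number (head -> 1, tail -> 0)."""
    return 1 if outcome == "head" else 0

def flip(rng):
    """Run the random experiment once and return the raw outcome."""
    return rng.choice(SAMPLE_SPACE)

rng = random.Random(0)  # fixed seed so that the run is repeatable
outcomes = [flip(rng) for _ in range(10)]
values = [X(w) for w in outcomes]  # numerical realizations of X
print(outcomes)
print(values)
```

The experiment produces labels such as "head"; only after applying `X` do we obtain numbers that can be averaged or counted.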
Probability Distribution
A probability distribution describes the “randomness” of a random variable. Informally, the
probability distribution specifies the probability or likelihood for a random variable to assume
a particular value. Formally, let 𝑋 be a random variable and let 𝑥 be a possible value of X.
Then, we have two cases.
• Discrete: the probability mass function of X specifies 𝑃(𝑥) ≡ 𝑃(𝑋 = 𝑥) for all
possible values of 𝑥.
• Continuous: the probability density function of 𝑋 is a function 𝑓(𝑥) that is such that
𝑓(𝑥) · ℎ ≈ 𝑃(𝑥 < 𝑋 ≤ 𝑥 + ℎ) for small positive ℎ.
The probability mass function specifies the actual probability, while the probability density
function specifies the probability rate; both can be viewed as a measure of “likelihood.”
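To see why a density is a probability rate rather than a probability, the following sketch compares 𝑓(𝑥) · ℎ with the exact value of 𝑃(𝑥 < 𝑋 ≤ 𝑥 + ℎ) for shrinking ℎ. The standard normal distribution and the point 𝑥 = 1 are assumptions chosen only for illustration; any continuous distribution with a known density would do.

```python
import math

def normal_pdf(x):
    """Density f(x) of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """P(X <= x) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

x = 1.0
for h in (0.1, 0.01, 0.001):
    exact = normal_cdf(x + h) - normal_cdf(x)  # P(x < X <= x + h)
    approx = normal_pdf(x) * h                 # f(x) * h
    print(h, exact, approx)
```

As ℎ shrinks, the two columns agree to more and more digits, which is the sense in which 𝑓(𝑥) measures probability per unit length.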
A discrete probability distribution may have
• Finite support (the set of possible values is finite)
  • For example, the number of successes in 𝑁 bids, or the number of stocks among
    the 50 companies that form the NIFTY 50 Index whose closing
    prices were higher than their opening prices yesterday
• Infinite support (the set of possible values is countably infinite)
  • For example, the number of trials required to get 𝑟 successes
Discrete Probability Distribution
A probability mass function must satisfy the following two requirements:
i. 0 ≤ 𝑃(𝑥) ≤ 1 for all 𝑥
ii. ∑_{𝑥 ∈ 𝜒} 𝑃(𝑥) = 1, where 𝜒 is the set of all possible values of 𝑥.
Empirical data can be used to estimate the probability mass function. Consider, for example,
the number of TVs in a household.
For instance, the probability that 𝑋, the number of TVs in a randomly selected household,
equals 3 is 0.191. An observed value of 𝑋, such as 𝑋 = 3 in a random sample, is called a realization.
Similarly, the realization 𝑋 = 2 has a probability of 0.374. We can, therefore, compute the
population mean, variance, and so on. The results of such calculations are examples of
population parameters. Details of the estimation will be taken up later.
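The two requirements on a probability mass function, and the population mean and variance computed from it, can be sketched as follows. The pmf values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the TV-ownership figures. The helper names are likewise assumptions.

```python
import math

# Hypothetical pmf for illustration only; NOT the TV-ownership data.
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

def is_valid_pmf(pmf, tol=1e-9):
    """Check both requirements: each P(x) lies in [0, 1] and the total is 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    return in_range and math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)

def mean(pmf):
    """Population mean: sum of x * P(x) over all possible values."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Population variance: sum of (x - mean)^2 * P(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

print(is_valid_pmf(pmf), mean(pmf), variance(pmf))
```

For this hypothetical pmf the mean is 1.7 and the variance is 0.81, up to floating-point rounding.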
Bernoulli Trials
A sequence of trials is said to be Bernoulli trials if they satisfy the following three
assumptions:
I. Each trial has two possible outcomes, called success and failure in the
language of probability.
II. The trials are independent. Intuitively, one trial’s outcome does not influence another
trial’s outcome.
III. On each trial, the probability of success is 𝑝 and the probability of failure is 1 − 𝑝,
where 𝑝 ∈ [0, 1] is the success parameter of the process.
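The three assumptions translate directly into a short simulation. This is a minimal sketch: the function name `bernoulli_trials`, the seed, and the values 𝑛 = 10,000 and 𝑝 = 0.3 are assumptions made for illustration.

```python
import random

def bernoulli_trials(n, p, rng):
    """Return n independent trials: 1 = success (probability p), 0 = failure."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

rng = random.Random(42)  # fixed seed, chosen arbitrarily
trials = bernoulli_trials(10_000, 0.3, rng)

# The observed fraction of successes should be close to p = 0.3.
print(sum(trials) / len(trials))
```

Each call to `rng.random()` is made afresh, which is how the sketch models independence between trials; the same 𝑝 is used on every trial.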
Bernoulli trials are named after Jacob Bernoulli. He is known for his numerous contributions
to calculus and, along with his brother Johann, was one of the founders of the calculus of
variations. His most important contribution was in the field of probability, where he derived
the first version of the law of large numbers in his work Ars Conjectandi.
Probability Models from Bernoulli Trials
The binomial model describes the number of successes in a sequence of 𝑛 Bernoulli
trials. The Poisson model approximates the number of successes when the number of
Bernoulli trials is very large and the success probability on each trial is small. We shall
also consider models for the number of Bernoulli trials required to
achieve a specified number of successes.
Some Important Discrete Distributions
Problem 1: Suppose an organization had ten senior and fifteen junior managers. Out of those
25 managers, 5 left the organization in the last quarter. Suppose the managers acted
independently of each other, and it is equally likely for anyone to separate. What is the
probability that two of the five managers who left were senior managers?
General Version of Problem-1: Suppose an organization had 𝑁 managers; out of them, the
proportion of senior managers is 𝑝, and the rest are junior managers. Out of those 𝑁
managers, 𝑛 left the organization in the last quarter. If the managers acted independently of
each other, and it is equally likely for anyone to separate, what is the probability that 𝑥 out of
the 𝑛 managers who left were senior managers?
Solutions to such problems are connected to the hypergeometric distribution.
The hypergeometric distribution
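The pmf of the distribution is given by
$$f(x) = \frac{\binom{Np}{x}\binom{Nq}{n-x}}{\binom{N}{n}}, \qquad x = 0, 1, 2, \ldots, n,$$
where 0 ≤ 𝑝 ≤ 1 and 𝑞 = 1 − 𝑝. In practice, max(0, 𝑛 − 𝑁𝑞) ≤ 𝑥 ≤ min(𝑛, 𝑁𝑝), because at most
𝑁𝑝 senior managers and at most 𝑁𝑞 junior managers can be among the 𝑛 who left.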
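Problem 1 can be checked numerically with the hypergeometric pmf. The sketch below is a minimal Python illustration; the function name `hypergeom_pmf` and the parameter `K` (the number of senior managers, that is 𝑁𝑝) are assumptions introduced for this example.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    """P(exactly x of the n selected come from the K marked items among N)."""
    # Outside the feasible range the probability is zero.
    if x < max(0, n - (N - K)) or x > min(n, K):
        return 0.0
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Problem 1: N = 25 managers, K = Np = 10 senior, n = 5 left, x = 2 senior.
p = hypergeom_pmf(2, N=25, K=10, n=5)
print(round(p, 4))  # 0.3854
```

Here the three binomial coefficients are C(10, 2) = 45, C(15, 3) = 455, and C(25, 5) = 53130, so the probability is 20475 / 53130, or about 0.385.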
Problem-2. Suppose an organization has many employees, 20% of whom are rewarded with
one additional increment based on their performance appraisal. There are 24 employees in the
city branch of the organization. If each employee’s chance of getting a reward is independent
of the others, what is the probability that exactly 7 of the 24 employees of the branch will be
rewarded?
General Version of Problem 2: Suppose an organization has many employees, of which a
certain proportion 𝑝 of employees are rewarded with one additional increment based on
performance appraisal. There are 𝑛 employees in the city branch of the organization. If an
employee’s chance of getting a reward is independent of others, what is the probability that
exactly 𝑥 of the 𝑛 employees of the branch are rewarded?
Solutions to Problem 2 or its general version are connected to the binomial distribution.
The Binomial distribution
Figure: Binomial probability mass function for varying sample size, with the fraction 𝑝 fixed at 0.5.
Figure: Binomial probability mass function for varying fraction 𝑝, with the sample size fixed at 20.
Figure: Binomial cumulative distribution function.
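For reference, the standard binomial pmf for the number of successes 𝑋 in 𝑛 Bernoulli trials with success probability 𝑝 is $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$ for 𝑥 = 0, 1, …, 𝑛. The following sketch applies it to Problem 2; the function name `binomial_pmf` is an assumption made for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for the number of successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Problem 2: n = 24 employees, p = 0.20, exactly x = 7 rewarded.
prob = binomial_pmf(7, 24, 0.20)
print(round(prob, 4))  # 0.0998
```

So the chance that exactly 7 of the 24 branch employees are rewarded is roughly 10%.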
Poisson Distribution
The pmf of a Poisson distribution is given by
$$f(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad \lambda > 0, \quad x = 0, 1, 2, \ldots$$
This model can be derived from the pmf of a binomial distribution by taking limits over
𝑛 and 𝑝: 𝑛 tends to infinity and 𝑝 tends to zero in such a way that the product
𝑛𝑝 stays fixed at a finite value, say 𝜆.
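The limiting argument can be illustrated numerically: holding 𝑛𝑝 = 𝜆 fixed and letting 𝑛 grow, the binomial probability of 𝑥 successes approaches the Poisson probability. The values 𝜆 = 3 and 𝑥 = 2 and the function names are assumptions chosen for this sketch.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for the number of successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 3.0, 2
for n in (10, 100, 10_000):
    p = lam / n  # shrink p as n grows so that n * p stays equal to lam
    print(n, binomial_pmf(x, n, p), poisson_pmf(x, lam))
```

The gap between the two columns shrinks as 𝑛 increases, which is why the Poisson model is a convenient approximation for many trials with a small success probability.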
Figure: Poisson probability mass function. The horizontal axis is the index k, the number of
occurrences (denoted 𝑥 in the pmf). The pmf is defined only at integer values of k; the
connecting lines are only guides for the eye.
Figure: Poisson cumulative distribution function (CDF). The horizontal axis is the index k, the
number of occurrences. The CDF is discontinuous at integer values of k and flat everywhere
else because a Poisson-distributed variable takes on only integer values.
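Because the CDF of a discrete variable is a running sum of the pmf, probabilities of the form 𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) can be obtained by adding pmf values. The sketch below shows this for a Poisson variable; the rate 𝜆 = 2 and the function names are hypothetical choices made for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): running sum of the pmf from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

def poisson_between(a, b, lam):
    """P(a <= X <= b), with both end points included."""
    return sum(poisson_pmf(i, lam) for i in range(a, b + 1))

lam = 2.0  # hypothetical rate, for illustration only
print(poisson_cdf(2, lam))         # P(X <= 2)
print(poisson_between(1, 3, lam))  # P(1 <= X <= 3)
```

Note that `poisson_between(a, b, lam)` equals `poisson_cdf(b, lam) - poisson_cdf(a - 1, lam)`; the lower end point is included only if the CDF is evaluated at 𝑎 − 1.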
Exercises:
1. Suppose an organization has many employees, of which 65% are permanent, and the
rest are in fixed-term contractual appointments. The proportion is more or less the
same across all its branches. Suppose there are 50 employees in a city branch of the
organization. What is the probability that exactly 20 of the 50 branch employees are
permanent?
2. Suppose an organization has many employees, of which 55% are male, and the rest
are female. Further, suppose that the gender ratios are more or less the same across all
its branches. Suppose there are 100 employees in a city branch of the organization.
What is the probability that exactly 70 of the 100 branch employees are male?
3. A human resource manager receives job applications online. The
number of job applications she receives per hour varies from hour to hour. Suppose
the hour-to-hour fluctuations in the number of applications received are best
modeled by a Poisson distribution, and she receives applications
online at an average rate of 6 per hour. What is the probability that she
receives between 4 and 6 applications, both inclusive, in any given hour?