Statistical and Mathematical
Methods for Data Analysis
Dr. Faisal Bukhari
Punjab University College of Information Technology
(PUCIT)
Textbooks
Probability & Statistics for Engineers & Scientists,
Ninth Edition, Ronald E. Walpole, Raymond H.
Myers
Elementary Statistics: Picturing the World, 6th
Edition, Ron Larson and Betsy Farber
Elementary Statistics, 13th Edition, Mario F. Triola
Dr. Faisal Bukhari, PUCIT, PU, Lahore 2
Reference books
Probability and Statistical Inference, Ninth Edition,
Robert V. Hogg, Elliot A. Tanis, Dale L. Zimmerman
Probability Demystified, Allan G. Bluman
Practical Statistics for Data Scientists: 50 Essential
Concepts, Peter Bruce and Andrew Bruce
Schaum's Outline of Probability, Second Edition,
Seymour Lipschutz, Marc Lipson
Python for Probability, Statistics, and Machine
Learning, José Unpingco
References
Readings for these lecture notes:
Probability & Statistics for Engineers &
Scientists, Ninth Edition, Ronald E. Walpole,
Raymond H. Myers
These notes contain material from the above book.
Discrete Probability Distribution
The set of ordered pairs (x, f(x)) is a probability
function, probability mass function, or probability
distribution of the discrete random variable X if, for
each possible outcome x,
1. f(x) ≥ 0,
2. $\sum_x f(x) = 1$,
3. P(X = x) = f(x).
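The first two conditions can be checked mechanically for any finite table of values. The following is a minimal sketch, assuming the distribution is stored as a Python dict mapping each outcome x to f(x); the helper name `is_valid_pmf` and the fair-die example are illustrative and not part of the lecture material.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every f(x) >= 0 and the values sum to exactly 1."""
    values = list(pmf.values())
    return all(v >= 0 for v in values) and sum(values) == 1

# Hypothetical example: a fair six-sided die, f(x) = 1/6 for x = 1..6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(is_valid_pmf(die))

# These values sum to 3/4, so they do not form a probability distribution.
not_a_pmf = {0: Fraction(1, 2), 1: Fraction(1, 4)}
print(is_valid_pmf(not_a_pmf))
```

Exact fractions are used so that the sum-to-one test is not affected by floating-point rounding.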
Example: A shipment of 20 similar laptop computers to
a retail outlet contains 3 that are defective. If a school
makes a random purchase of 2 of these computers,
find the probability distribution for the number of
defectives.
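Let X represent the number of defective computers among those purchased.
N = 20 (computers in the shipment)
n = 2 (computers purchased)
k = 3 (defective computers in the shipment)
$P(X = x) = h(x; N, n, k) = \dfrac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad \max\{0,\, n-(N-k)\} \le x \le \min\{n,\, k\}$
$\max\{0,\, n-(N-k)\} = \max\{0,\, 2-(20-3)\} = \max\{0,\, -15\} = 0$
$\min\{n,\, k\} = \min\{2,\, 3\} = 2$
So x takes the values 0, 1, 2.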
Probability Distribution
x    P(X = x)
0    68/95
1    51/190
2    3/190
$\sum_x P(X = x) = 1$
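The table can be reproduced numerically. This is a small sketch using only the Python standard library; the function name `hypergeom_pmf` is chosen here for illustration, and the formula is the h(x; N, n, k) expression from the solution.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of x defectives in a sample of size n."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

N, n, k = 20, 2, 3
lo = max(0, n - (N - k))   # smallest possible number of defectives
hi = min(n, k)             # largest possible number of defectives

dist = {x: hypergeom_pmf(x, N, n, k) for x in range(lo, hi + 1)}
for x, p in dist.items():
    print(x, p)
print("total:", sum(dist.values()))
```

Using `Fraction` keeps the results in the same reduced form as the table, for example 136/190 is reported as 68/95.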
Example: If a car agency sells 50% of its inventory of a
certain foreign car equipped with side airbags, find a
formula for the probability distribution of the number
of cars with side airbags among the next 4 cars sold by
the agency.
$b(x; n, p) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
Here n = 4, p = 0.50, q = 0.50.
Let X denote the number of cars with side airbags among the next 4 cars sold.
$b(x; 4, 0.50) = \binom{4}{x} (0.50)^x (0.50)^{4-x}, \quad x = 0, 1, 2, 3, 4$
$= \binom{4}{x} (0.50)^4, \quad x = 0, 1, 2, 3, 4$
$b(x; 4, 0.50) = \dfrac{1}{16}\binom{4}{x}, \quad x = 0, 1, 2, 3, 4$
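As a quick check of the simplification, the general binomial formula can be evaluated for n = 4 and p = 1/2 and compared with (1/16) C(4, x). This is a sketch only; `binom_pmf` and `airbag_dist` are names introduced here for illustration.

```python
from fractions import Fraction
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

p = Fraction(1, 2)
airbag_dist = {x: binom_pmf(x, 4, p) for x in range(5)}

for x, prob in airbag_dist.items():
    # Compare the general formula with the simplified form (1/16) * C(4, x).
    print(x, prob, prob == Fraction(comb(4, x), 16))
```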
Cumulative Distribution Function
The cumulative distribution function F(x) of a discrete
random variable X with probability distribution f(x) is
$F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad \text{for } -\infty < x < \infty$
Example: A stockroom clerk returns three safety
helmets at random to three steel mill employees
who had previously checked them. If Smith, Jones, and
Brown, in that order, each receive one of the three
helmets, list the sample points for the possible orders
of returning the helmets, and find the value m of the
random variable M that represents the number of
correct matches.
If S, J, and B stand for Smith’s, Jones’s, and Brown’s
helmets, respectively, then the possible arrangements
in which the helmets may be returned and the number
of correct matches are
Sample space m
SJB 3
SBJ 1
JSB 1
BJS 1
JBS 0
BSJ 0
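The same table can be generated by enumerating all orderings. The sketch below assumes, as the phrase "at random" suggests, that the six orderings are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

owners = "SJB"  # Smith, Jones, Brown, in the order they receive helmets

# m = number of positions where the returned helmet belongs to its receiver.
matches = {
    "".join(order): sum(h == o for h, o in zip(order, owners))
    for order in permutations(owners)
}
for arrangement, m in matches.items():
    print(arrangement, m)

# Probability distribution of M under equally likely arrangements.
counts = Counter(matches.values())
pmf_M = {m: Fraction(c, len(matches)) for m, c in sorted(counts.items())}
print(pmf_M)
```

Note that m = 2 never occurs: if two helmets are returned correctly, the third must be correct as well.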
For the random variable M, the number of correct
matches in the previous example, we have
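$F(2) = P(M \le 2) = f(0) + f(1) = \dfrac{2}{6} + \dfrac{3}{6} = \dfrac{5}{6}$
The cumulative distribution function of M is
$F(m) = \begin{cases} 0, & \text{for } m < 0, \\ \dfrac{1}{3}, & \text{for } 0 \le m < 1, \\ \dfrac{5}{6}, & \text{for } 1 \le m < 3, \\ 1, & \text{for } m \ge 3. \end{cases}$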
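The step shape of F(m) follows directly from summing f(t) over all t ≤ m. A minimal sketch, assuming the distribution f(0) = 1/3, f(1) = 1/2, f(3) = 1/6 of the helmet example; the helper name `cdf` is illustrative.

```python
from fractions import Fraction

# Distribution of M, the number of correct matches.
pmf_M = {0: Fraction(1, 3), 1: Fraction(1, 2), 3: Fraction(1, 6)}

def cdf(pmf, x):
    """F(x): sum of f(t) over all support points t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

# F is defined for every real number, not only for values M can take.
for m in (-1, 0, 0.5, 1, 2, 3, 10):
    print(m, cdf(pmf_M, m))
```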
Example: Find the cumulative distribution function of
the random variable X with $f(x) = \dfrac{1}{16}\binom{4}{x}$, x = 0, 1, 2, 3, 4.
Using F(x), verify that f(2) = 3/8.
$f(x) = \dfrac{1}{16}\binom{4}{x}, \quad x = 0, 1, 2, 3, 4$
$f(0) = \dfrac{1}{16}$
$f(1) = \dfrac{4}{16}$
$f(2) = \dfrac{6}{16}$
$f(3) = \dfrac{4}{16}$
$f(4) = \dfrac{1}{16}$
$F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad \text{for } -\infty < x < \infty$
$F(0) = P(X \le 0) = f(0) = \dfrac{1}{16}$,
$F(1) = P(X \le 1) = f(0) + f(1) = \dfrac{1}{16} + \dfrac{4}{16} = \dfrac{5}{16}$,    (1)
$F(2) = P(X \le 2) = f(0) + f(1) + f(2) = \dfrac{1}{16} + \dfrac{4}{16} + \dfrac{6}{16} = \dfrac{11}{16}$,    (2)
$F(3) = P(X \le 3) = f(0) + f(1) + f(2) + f(3) = \dfrac{1}{16} + \dfrac{4}{16} + \dfrac{6}{16} + \dfrac{4}{16} = \dfrac{15}{16}$,
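$F(4) = P(X \le 4) = f(0) + f(1) + f(2) + f(3) + f(4) = \dfrac{1}{16} + \dfrac{4}{16} + \dfrac{6}{16} + \dfrac{4}{16} + \dfrac{1}{16} = \dfrac{16}{16} = 1$
$\therefore F(x) = \begin{cases} 0, & \text{for } x < 0, \\ \dfrac{1}{16}, & \text{for } 0 \le x < 1, \\ \dfrac{5}{16}, & \text{for } 1 \le x < 2, \\ \dfrac{11}{16}, & \text{for } 2 \le x < 3, \\ \dfrac{15}{16}, & \text{for } 3 \le x < 4, \\ 1, & \text{for } x \ge 4. \end{cases}$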
(2) − (1):
$f(2) = F(2) - F(1) = \dfrac{11}{16} - \dfrac{5}{16} = \dfrac{6}{16} = \dfrac{3}{8}$
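The whole calculation, running sums for F and a difference of consecutive values to recover f(2), can be sketched in a few lines. The list names `f` and `F` mirror the notation of the example and are otherwise arbitrary.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

# f[x] = (1/16) * C(4, x) for x = 0, 1, 2, 3, 4
f = [Fraction(comb(4, x), 16) for x in range(5)]

# Running sums give the CDF at the support points: F[x] = f[0] + ... + f[x]
F = list(accumulate(f))
print(F)

# For an integer-valued variable, f(x) = F(x) - F(x - 1).
f2 = F[2] - F[1]
print(f2)
```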
[Figure: Probability mass function plot vs. probability histogram]
[Figure: Discrete cumulative distribution function]
Continuous Probability Distributions
A continuous random variable has a probability of 0
of assuming exactly any of its values.
Consequently, its probability distribution cannot be
given in tabular form.
We shall concern ourselves with computing
probabilities for various intervals of continuous
random variables such as P(a < X < b), P(W ≥ c), and
so forth.
Note that when X is continuous,
P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b).
That is, it does not matter whether we include an
endpoint of the interval or not.
This is not true, though, when X is discrete.
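To see why the endpoint matters in the discrete case, consider the distribution f(x) = (1/16) C(4, x), x = 0, 1, 2, 3, 4, from the earlier example. The sketch below compares P(1 < X ≤ 2) with P(1 < X < 2); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Discrete distribution f(x) = (1/16) * C(4, x), x = 0, ..., 4
f = {x: Fraction(comb(4, x), 16) for x in range(5)}

# Including the endpoint x = 2 picks up the mass f(2) = 3/8 ...
p_closed = sum(p for x, p in f.items() if 1 < x <= 2)
# ... while the open interval (1, 2) contains no possible value of X.
p_open = sum(p for x, p in f.items() if 1 < x < 2)

print(p_closed, p_open)
```

The two probabilities differ by exactly P(X = 2), which is zero only when X is continuous.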
Because areas will be used to represent probabilities
and probabilities are nonnegative numerical values, the
density function can never fall below the x axis.
[Figure: Typical density functions.]
Probability Density Function
The function f(x) is a probability density function (pdf)
for the continuous random variable X, defined over the
set of real numbers, if
1. f(x) ≥ 0, for all x ∈ R.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
3. $P(a < X < b) = \int_{a}^{b} f(x)\,dx$.
Example: Suppose that the error in the reaction
temperature, in °C, for a controlled laboratory
experiment is a continuous random variable X having
the probability density function
$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2, \\ 0, & \text{elsewhere.} \end{cases}$
(a) Verify that f(x) is a density function.
(b) Find P(0 < X ≤ 1).
f(x) ≥ 0.
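$\int_{-\infty}^{\infty} f(x)\,dx = 1.$
$\text{LHS} = \int_{-1}^{2} \dfrac{x^2}{3}\,dx = \left[\dfrac{x^3}{9}\right]_{-1}^{2} = \dfrac{(2)^3 - (-1)^3}{9} = \dfrac{9}{9} = 1$
LHS = RHS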
$P(0 < X \le 1) = \int_{0}^{1} \dfrac{x^2}{3}\,dx = \left[\dfrac{x^3}{9}\right]_{0}^{1} = \dfrac{(1)^3 - (0)^3}{9} = \dfrac{1}{9}$
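Both integrals can be cross-checked numerically. The sketch below uses a simple midpoint rule written by hand so that it needs no external libraries; the function `midpoint_integral` and the step count are assumptions made for illustration, not part of the lecture.

```python
def f(x):
    """Density of the temperature error: x^2 / 3 on (-1, 2), 0 elsewhere."""
    return x**2 / 3 if -1 < x < 2 else 0.0

def midpoint_integral(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

total = midpoint_integral(f, -1, 2)   # should be close to 1
prob = midpoint_integral(f, 0, 1)     # should be close to 1/9
print(round(total, 6), round(prob, 6))
```

The numerical values agree with the exact results 1 and 1/9 up to a small discretization error.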
Cumulative Distribution Function
The cumulative distribution function F(x) of a
continuous random variable X with density function f(x)
is
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{for } -\infty < x < \infty$
$P(a < X < b) = F(b) - F(a)$ and $f(x) = \dfrac{dF(x)}{dx}$, if the
derivative exists.
Example: For the density function
$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2, \\ 0, & \text{elsewhere,} \end{cases}$
find F(x), and use it to evaluate P(0 < X ≤ 1).
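$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{for } -\infty < x < \infty$
For −1 < x < 2,
$F(x) = \int_{-1}^{x} \dfrac{t^2}{3}\,dt = \left[\dfrac{t^3}{9}\right]_{-1}^{x} = \dfrac{x^3 - (-1)^3}{9} = \dfrac{x^3 + 1}{9}$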
$F(x) = \begin{cases} 0, & \text{for } x < -1, \\ \dfrac{x^3 + 1}{9}, & \text{for } -1 \le x < 2, \\ 1, & \text{for } x \ge 2. \end{cases}$
$P(0 < X \le 1) = F(1) - F(0)$
$F(1) = \dfrac{1^3 + 1}{9} = \dfrac{2}{9}$
$F(0) = \dfrac{0^3 + 1}{9} = \dfrac{1}{9}$
$P(0 < X \le 1) = \dfrac{2}{9} - \dfrac{1}{9} = \dfrac{1}{9}$
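The piecewise CDF translates directly into a small function, and interval probabilities then become differences of its values. This is a sketch using exact fractions; the function name `F` simply mirrors the notation of the example.

```python
from fractions import Fraction

def F(x):
    """CDF of the temperature error: 0, (x^3 + 1)/9, or 1 by region."""
    x = Fraction(x)
    if x < -1:
        return Fraction(0)
    if x < 2:
        return (x**3 + 1) / 9
    return Fraction(1)

# P(0 < X <= 1) = F(1) - F(0)
prob = F(1) - F(0)
print(F(1), F(0), prob)
```

The result matches the value 1/9 obtained earlier by integrating the density over (0, 1).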
Example: The Department of Energy (DOE) puts
projects out on bid and generally estimates what a
reasonable bid should be. Call the estimate b. The DOE
has determined that the density function of the
winning (low) bid is
$f(y) = \begin{cases} \dfrac{5}{8b}, & \dfrac{2}{5}b \le y \le 2b, \\ 0, & \text{elsewhere.} \end{cases}$
Find F(y) and use it to determine the probability that
the winning bid is less than the DOE’s preliminary
estimate b.
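$F(y) = P(Y \le y) = \int_{-\infty}^{y} f(t)\,dt, \quad \text{for } -\infty < y < \infty$
For $\dfrac{2}{5}b \le y \le 2b$,
$F(y) = \int_{2b/5}^{y} \dfrac{5}{8b}\,dt = \left[\dfrac{5t}{8b}\right]_{2b/5}^{y} = \dfrac{5y}{8b} - \dfrac{5}{8b}\left(\dfrac{2}{5}b\right) = \dfrac{5y}{8b} - \dfrac{1}{4}$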
$F(y) = \begin{cases} 0, & y < \dfrac{2}{5}b, \\ \dfrac{5y}{8b} - \dfrac{1}{4}, & \dfrac{2}{5}b \le y < 2b, \\ 1, & y \ge 2b. \end{cases}$
To determine the probability that the winning bid is
less than the preliminary bid estimate b, we have
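$F(y) = \dfrac{5}{8b}\,y - \dfrac{1}{4}$
$\Rightarrow F(b) = \dfrac{5}{8b}\,b - \dfrac{1}{4} = \dfrac{5}{8} - \dfrac{1}{4}$
$\therefore P(Y \le b) = F(b) = \dfrac{5}{8} - \dfrac{1}{4} = \dfrac{3}{8}$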
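The answer 3/8 does not depend on the actual size of the estimate b, which can be confirmed by evaluating F(b) for a few values. In the sketch below the function name `bid_cdf` and the sample estimates are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def bid_cdf(y, b):
    """CDF of the winning bid Y for a DOE estimate b > 0."""
    y, b = Fraction(y), Fraction(b)
    if y < Fraction(2, 5) * b:
        return Fraction(0)
    if y < 2 * b:
        return 5 * y / (8 * b) - Fraction(1, 4)
    return Fraction(1)

# Hypothetical estimates; P(Y <= b) = F(b) is the same for each of them.
for b in (1, 40, 100):
    print(b, bid_cdf(b, b))
```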