
Lecture #14: Expectations, Affine Transformations, & Probability Distributions

14.0 Expectations & Correlation Coefficient


Basic Properties of Expectation & Variance
You may refer to Section 7.4 and Section 7.5 of Lecture Note #7.
Or you may refer to Expectations.pdf up to page 9 (up to the correlation part).
Topics concerning “Conditional expectations” and others will be discussed after the midterm.

Correlation Coefficient
The correlation coefficient measures the linear association between two variables X and Y, and is defined by:

Corr(X, Y) = ρ_{X,Y} = Cov(X, Y) / √(Var(X) Var(Y)) = E[(X − μ_X)(Y − μ_Y)] / (σ_X σ_Y) = [E(XY) − μ_X μ_Y] / (σ_X σ_Y) = σ_{X,Y} / (σ_X σ_Y)

where the equality in the denominator is a simple manipulation of the definition:

√(Var(X) Var(Y)) = √(σ_X^2 σ_Y^2) = √((σ_X σ_Y)^2) = σ_X σ_Y

And the equality in the numerator was found by solving Q3 of Assignment #3:

E[(X − μ_X)(Y − μ_Y)] = ⋯ = E(XY) − μ_X μ_Y

Some stylized facts about ρ_{X,Y}:

 It has no units of measurement; it captures association in terms of standardized deviations from the means.
 −1 ≤ ρ_{X,Y} ≤ 1.
 ρ_{X,Y} = 1: perfect positive linear relationship between X and Y.
 ρ_{X,Y} = −1: perfect negative linear relationship between X and Y.
 ρ_{X,Y} = 0: no linear relationship between X and Y.
 If X and Y are independent, then ρ_{X,Y} = 0.
 Does ρ_{X,Y} = 0 imply that X and Y are independent? Why?
We will talk about correlation coefficient more in detail when we get into inferential statistics.
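To illustrate the last two bullet points numerically (my own sketch, not part of the lecture), take Y = X^2 with X uniform on points symmetric around 0: the correlation comes out exactly 0 even though Y is completely determined by X.

```python
# Zero correlation does not imply independence: Y = X^2 with X symmetric.
def mean(vs):
    return sum(vs) / len(vs)

def corr(xs, ys):
    # Sample version of Corr(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y)
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    sx = mean([(x - mx) ** 2 for x in xs]) ** 0.5
    sy = mean([(y - my) ** 2 for y in ys]) ** 0.5
    return cov / (sx * sy)

xs = [-2, -1, 0, 1, 2]      # X uniform on these points
ys = [x * x for x in xs]    # Y = X^2: fully dependent on X
print(corr(xs, ys))         # 0.0, yet X and Y are clearly not independent
```

So the answer to the bullet-point question is no: zero correlation rules out only *linear* association.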
14.1 Affine Transformation of Gaussian Distribution Functions (Exercise #1)
We continue from example 1 from Section 10.5 of Lecture Note #10.

Linear Transformation of X ~ f_X(x)


Let X ~ f_X(x) and Y = aX + b for some a > 0 and b.
To find the PDF f_Y(y), we follow these steps:

F_Y(y) = P{Y ≤ y} = P{aX + b ≤ y} = P{X ≤ (y − b)/a} = F_X((y − b)/a)

Thus,

f_Y(y) = d[F_Y(y)]/dy = (1/a) f_X((y − b)/a)

As long as a ≠ 0, we may argue that, without loss of generality,

f_Y(y) = (1/|a|) f_X((y − b)/a).

Why? (Set a = −k for some k > 0 and try working it out.)


If a = 0, then there is no point in studying the distribution of Y, since Y = b with probability 1.
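The a < 0 case can also be checked numerically (a sketch of my own, taking X to be standard normal; `f_X` and `F_X` are just helper functions here): for a < 0 the inequality flips, so F_Y(y) = 1 − F_X((y − b)/a), and differentiating F_Y by finite differences recovers f_X((y − b)/a)/|a|.

```python
import math

a, b = -2.0, 3.0          # any a < 0 works for this check

def f_X(x):               # standard normal PDF
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def F_X(x):               # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def F_Y(y):               # P(aX + b <= y); the inequality flips since a < 0
    return 1 - F_X((y - b) / a)

y, h = 1.0, 1e-5
numeric = (F_Y(y + h) - F_Y(y - h)) / (2 * h)   # central-difference derivative
formula = f_X((y - b) / a) / abs(a)             # the claimed PDF
print(abs(numeric - formula) < 1e-8)            # True: the two agree
```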

Linear Transformation of X ~ N(μ_X, σ_X^2)


Let X ~ N(μ_X, σ_X^2) and Y = aX + b for some a ≠ 0 and b.
We already know the probability density function f_X(x):

f_X(x) = (1/(σ_X √(2π))) e^{−(x − μ_X)^2 / (2σ_X^2)}

Using this preliminary setting, can we compute the PDF of Y?


To find the PDF f_Y(y), we just follow through the same procedure:

F_Y(y) = P{Y ≤ y} = P{aX + b ≤ y} = P{X ≤ (y − b)/a} = F_X((y − b)/a)

Then, we can take the derivative of F_Y to compute the PDF f_Y(y):

f_Y(y) = d[F_Y(y)]/dy = (1/|a|) f_X((y − b)/a)

Now, recall that what's inside the parentheses of the function is the variable:

f_Y(y) = (1/|a|) f_X((y − b)/a)
       = (1/|a|) (1/(σ_X √(2π))) e^{−((y − b)/a − μ_X)^2 / (2σ_X^2)}

We manipulate what's outside of e^(∙) first (the part in front of the exponential):

f_Y(y) = (1/|a|) (1/(σ_X √(2π))) e^(∙)
       = (1/√(a^2)) (1/σ_X) (1/√(2π)) e^(∙)
       = (1/√(a^2 σ_X^2)) (1/√(2π)) e^(∙)
       = (1/(√(a^2 σ_X^2) √(2π))) e^(∙)

At this point, you can probably already see where this is going.


Ignore the e^(∙) part for now and compare f_X(x) and f_Y(y):

f_X(x) = (1/(σ_X √(2π))) e^(∙)        f_Y(y) = (1/(√(a^2 σ_X^2) √(2π))) e^(∙)
Now, forget about the similarities and finish up simplifying what's inside e^(∙):

e^{−((y − b)/a − μ_X)^2 / (2σ_X^2)} ⇒

−((y − b)/a − μ_X)^2 / (2σ_X^2) = −((y − b) − aμ_X)^2 / (2a^2 σ_X^2)
                                = −(y − b − aμ_X)^2 / (2a^2 σ_X^2)
                                = −(y − (b + aμ_X))^2 / (2a^2 σ_X^2) ⇒

e^{−(y − (b + aμ_X))^2 / (2a^2 σ_X^2)}

Again, let's compare the parts concerning e^(∙) for f_X(x) and f_Y(y):

f_X(x) = [∙] e^{−(x − μ_X)^2 / (2σ_X^2)}        f_Y(y) = [∙] e^{−(y − (b + aμ_X))^2 / (2a^2 σ_X^2)}

Putting all the pieces together:

f_X(x) = (1/(σ_X √(2π))) e^{−(x − μ_X)^2 / (2σ_X^2)}
f_Y(y) = (1/(√(a^2 σ_X^2) √(2π))) e^{−(y − (b + aμ_X))^2 / (2a^2 σ_X^2)}

The term √(a^2 σ_X^2) = |a|σ_X (highlighted in yellow in the original notes) indicates the standard deviation; the term b + aμ_X (highlighted in green) indicates the mean.

If we set b + aμ_X = μ_Y and a^2 σ_X^2 = σ_Y^2, we clearly see that:

f_X(x) = (1/(σ_X √(2π))) e^{−(x − μ_X)^2 / (2σ_X^2)}        f_Y(y) = (1/(σ_Y √(2π))) e^{−(y − μ_Y)^2 / (2σ_Y^2)}

Hence, we observe that any linear transformation of X ~ N(μ_X, σ_X^2) of the form Y = aX + b with a ≠ 0 is also Gaussian, with Y ~ N(aμ_X + b, a^2 σ_X^2).
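As a quick Monte Carlo sanity check of this conclusion (my own sketch; the parameter values are arbitrary):

```python
import random
import statistics

random.seed(0)
mu, sigma, a, b = 2.0, 3.0, -1.5, 4.0
# If X ~ N(mu, sigma^2), then Y = aX + b should be N(a*mu + b, a^2 * sigma^2).
ys = [a * random.gauss(mu, sigma) + b for _ in range(200_000)]
print(statistics.mean(ys))   # close to a*mu + b = 1.0
print(statistics.stdev(ys))  # close to |a|*sigma = 4.5
```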
14.2 PMF/CDF (Exercise #2)
Q1 (Easy PMF; Warm-up)
Suppose X is a random variable with the PMF defined by:
p_X(x) = (2x + 1)/25

for x = 0, 1, 2, 3, 4.

Part (A): What is p_X(5)?

Determine the following probabilities.

Part (B): P(X = 5)
Part (C): P(X ≤ 1)
Part (D): P(2 ≤ X < 4)
Part (E): P(X > −10)

Solution to Q1
Part (A): What is p_X(5)?
The simple answer would be p_X(5) = 0, since 5 ∉ B ≡ {0, 1, 2, 3, 4}.
A more detailed answer would confirm that result using the properties of probability:

Σ_{x∈B} p_X(x) = 1

⇒ Σ_{x∈B} (2x + 1)/25 = (1 + 3 + 5 + 7 + 9)/25 = 25/25 = 1

⇒ p_X(x) = 0 ∀x ∉ B.

Part (D): P(2 ≤ X < 4) = p_X(2) + p_X(3).
Part (E): P(X > −10) = P(−10 < X < 0) + P(0 ≤ X ≤ 4) + P(X > 4) = 0 + 1 + 0 = 1.

Fairly simple question.


There might be one problem in the exam like this one to ensure that you get a minimum point.
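These answers are easy to double-check by direct computation (my own sketch, using exact fractions):

```python
from fractions import Fraction

# PMF p_X(x) = (2x + 1)/25 on B = {0, 1, 2, 3, 4}, and 0 elsewhere
p = {x: Fraction(2 * x + 1, 25) for x in range(5)}
pmf = lambda x: p.get(x, Fraction(0))

assert sum(p.values()) == 1                    # valid PMF, so p_X(5) = 0
print(pmf(5))                                  # Parts (A)/(B): 0
print(pmf(0) + pmf(1))                         # Part (C): 4/25
print(pmf(2) + pmf(3))                         # Part (D): 12/25
print(sum(pmf(x) for x in range(5)))           # Part (E): P(X > -10) = 1
```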
Q2 (Identifying a distribution using a CDF)
The CDF of a random variable X is given by the following figure:

Part (A): Find p_X(x) using this CDF.


Part (B):
What is the distribution that X follows?
Find all the parameters of this distribution using your answer from Part (A).
Hint: We are dealing with PMF here. We covered Bernoulli, Binomial, Geometric, and Poisson.

Solution
Part (A):
We note that there are a total of 4 differently sized jumps in the CDF, located at x = 0, 1, 2, and 3.
Recall that the jump sizes indicate the probabilities.
Then, this can be translated using the following terminology:
p_X(x) = P(X = x) = the size of the jump of F_X at x
So, in sum, we should have:

p_X(x) = 0.064, x = 0
         0.288, x = 1
         0.432, x = 2
         0.216, x = 3
         0,     otherwise
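The differencing step can be sketched as follows (my own reconstruction; the CDF values are the cumulative sums implied by the jump sizes above):

```python
# CDF value reached at each jump point (cumulative sums of the jump sizes)
F = {-1: 0.0, 0: 0.064, 1: 0.352, 2: 0.784, 3: 1.0}

# p_X(x) = size of the jump of F_X at x
pmf = {x: round(F[x] - F[x - 1], 3) for x in range(4)}
print(pmf)   # {0: 0.064, 1: 0.288, 2: 0.432, 3: 0.216}
```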

Part (B):
Among all the PMFs that we discussed in class, only one can have the support B = {0, 1, 2, 3} with each element x having unequal probabilities (i.e., I am not talking about θ, but p_X(x), when I say probabilities).
This looks like the probability mass function of a binomial distribution.
To see if 𝑝 indeed follows a binomial distribution, we bring up the PMF from lecture 8:
p_X(x) = P(X = x) = C(n, x) θ^x (1 − θ)^{n−x} = [n! / (x!(n − x)!)] θ^x (1 − θ)^{n−x}

for x = 0, 1, 2, …, n

In this case, n = 3.
Now see if it makes sense by plugging in the values.

p_X(0) = [3!/(0! 3!)] θ^0 (1 − θ)^3 = (1 − θ)^3 = 0.064 = 64/1000 = (4 · 4 · 4)/(10 · 10 · 10) = (2/5)^3

⇒ 1 − θ = 2/5 ⇒ θ = 3/5.
To confirm, calculate p_X(1) = [3!/(1! 2!)] θ^1 (1 − θ)^2 = 3 (3/5)(2/5)^2 = 36/125 = 0.288.

Therefore, we have,

p_X(x) = P(X = x) = C(3, x) (3/5)^x (2/5)^{3−x} = [3!/(x!(3 − x)!)] (3/5)^x (2/5)^{3−x}

Or, simply,

X ~ Binomial(n = 3, θ = 3/5).
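A one-liner confirms that Binomial(n = 3, θ = 3/5) reproduces all four jump sizes (my own check, not part of the lecture):

```python
from math import comb

n, theta = 3, 0.6
# p_X(x) = C(n, x) * theta^x * (1 - theta)^(n - x)
pmf = [comb(n, x) * theta**x * (1 - theta)**(n - x) for x in range(n + 1)]
print([round(p, 3) for p in pmf])   # [0.064, 0.288, 0.432, 0.216]
```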
Q3 (Relationship between Binomial and Poisson; Difficult Questions)
Review Q2 and Q3 from Assignment #3.
Please take the time to review Section 8.5 of Lecture Note #8.

14.3 PDF/CDF (Exercise #3)


Q1 (CDF and Expectations)
Suppose that the CDF of random variable X is given by:

F_X(x) = 0,                x < −5
         (x + 5)^2 / 144,  −5 ≤ x < 7
         1,                x ≥ 7

Using this information, please answer the following questions.


Part (A): What is f_X(x)?
Part (B): What is E(X)?
Part (C): What is Var(X)?
Part (D): What is E(X^3)?

Solutions:
First, how do you know if X is a continuous random variable?
In this case, I hinted that it is by putting this problem under the PDF section, but how would you be able to tell if you weren't given this piece of information?
The way to verify that X is indeed a continuous RV is to check that F_X has no jumps, starting at the boundary x = −5:

F_X(−5) = (−5 + 5)^2 / 144 = 0,

which matches the value of F_X just to the left of −5.
Why does this matter? Because for a continuous random variable every point probability is 0, P(X = x) = 0, and P(X = x) equals the size of the jump of F_X at x; a jump-free CDF therefore means no point masses.
Are there any other jumps at the boundaries of −5 ≤ x < 7? Check x = 7: (7 + 5)^2 / 144 = 1, which matches F_X(x) = 1 for x ≥ 7.
So, it's safe to say that there's no discontinuity in F_X(x).
Part (A):
Note that f_X(x) can be found by just taking the derivative of F_X(x) for x ∈ (−5, 7), and f_X(x) = 0 for x < −5 and x > 7.
But what about at x = −5?
We need the left and right derivatives to be equal to each other for f_X(x) to be well defined at x = −5; here both equal 0, consistent with the continuity of F_X(x) at x = −5 that we pointed out earlier. Thus, f_X(x) is defined as the following:

f_X(x) = d[F_X(x)]/dx = (x + 5)/72,  x ∈ (−5, 7)
         0,                          otherwise

Part (B):

E(X) = ∫_{−5}^{7} x f(x) dx = ∫_{−5}^{7} x(x + 5)/72 dx = ⋯ = 3

Part (C):
You just need to compute,

E(X^2) = ∫_{−5}^{7} x^2 f(x) dx = ∫_{−5}^{7} x^2 (x + 5)/72 dx = ⋯ = 17

And then, apply the property of variance:

Var(X) = ⋯ = E(X^2) − [E(X)]^2 = 17 − 9 = 8.

Part (D):
This is just testing whether you can compute the integral:

E(X^3) = ∫_{−5}^{7} x^3 f(x) dx = ∫_{−5}^{7} x^3 (x + 5)/72 dx = ⋯ = 431/5
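All four answers can be verified with a crude numerical integration (my own sketch; the midpoint rule is plenty accurate for these polynomial integrands):

```python
def integrate(g, a, b, n=100_000):
    # Midpoint-rule approximation of the integral of g over [a, b]
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: (x + 5) / 72          # the PDF from Part (A), supported on (-5, 7)

EX  = integrate(lambda x: x * f(x), -5, 7)        # E(X)   -> 3
EX2 = integrate(lambda x: x**2 * f(x), -5, 7)     # E(X^2) -> 17
EX3 = integrate(lambda x: x**3 * f(x), -5, 7)     # E(X^3) -> 431/5 = 86.2
print(round(EX, 4), round(EX2, 4), round(EX2 - EX**2, 4), round(EX3, 4))
```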
Q2 (Uniform Distribution & Independence of Sub-intervals)
Please take the time to review Q3 of Assignment #3.

Q3 (Standard Gaussian Distribution; proving the symmetry property)


Suppose that 𝑋~𝑁(0,1).
Show that Φ(−𝑥) = 1 − Φ(𝑥).

Solution #1 (Easy way out; plug in some numbers):


Refer to Q4 of Assignment #3.

Solution #2 (Strictly speaking)

Φ(−x) = F_X(−x) = P(X ≤ −x) = 1 − P(X > −x) = 1 − P(X ≥ −x) = 1 − ∫_{−x}^{∞} f_X(t) dt

Definition of Even function:

f(x) = f(−x)
Example: f(x) = x^k ∀k = 0, 2, 4, 6, …

Definition of Odd function:

−f(x) = f(−x)

Example: f(x) = x^k ∀k = 1, 3, 5, 7, …

But what about the PDF of a standard normal?


f_X(x) = (1/(σ √(2π))) e^{−(x − μ)^2 / (2σ^2)}

Plug in 𝜇 = 0 and 𝜎 = 1 as per definition of standard normal distribution.


f_X(x) = (1/√(2π)) e^{−x^2/2}

f_X(−x) = (1/√(2π)) e^{−(−x)^2/2} = (1/√(2π)) e^{−x^2/2} = f_X(x)

So, the PDF of the standard normal is an even function, which implies that:

1 − ∫_{−x}^{∞} f_X(t) dt = 1 − [∫_{−x}^{0} f_X(t) dt + ∫_{0}^{∞} f_X(t) dt]

Refer to Section 11.2 from Lecture Note #11.

= 1 − [∫_{0}^{x} f_X(t) dt + ∫_{−∞}^{0} f_X(t) dt]

by the definition of an even function.

= 1 − ∫_{−∞}^{x} f_X(t) dt = 1 − F_X(x)

= 1 − Φ(x)

This might be too difficult, but it’s just a manipulation.


We have used nothing but properties given in Lecture 11.2 to prove the “symmetrical property”
of (standard) Gaussian Distribution.
Expect more numerical problems on the midterm than in this exercise.
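The symmetry property itself is easy to spot-check numerically (my own sketch, writing Φ via `math.erf`):

```python
import math

# Phi(x) for the standard normal, expressed via the error function
Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))

for x in (0.0, 0.5, 1.0, 1.96, 3.0):
    assert abs(Phi(-x) - (1 - Phi(x))) < 1e-12
print("Phi(-x) = 1 - Phi(x) holds at every test point")
```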
14.4 Additional Comments (Joint & Conditional Distribution; Expectations)
Joint Distribution Exercise
There will be at most one part (or none) of one question that asks you to find the marginal
PMF/PDF using the joint distribution p_{X,Y}(x, y) or f_{X,Y}(x, y).
Reviewing Lecture #12 should be sufficient.

Conditional Distribution Exercise


There will be at most one part (or none) of one question that asks you to find the conditional
distribution given other distribution functions (PMF/PDF/CDF).
Reviewing Lecture #13 should be sufficient.

Expectation (Mean, Variance, Covariance, Correlation) Exercises


Some of them will involve proofs like Q2 & Q3 from Assignment #3.
Others will ask you to compute these values using the given probability distribution functions
for the random variable, similar to what we have seen throughout the lectures on
probability distributions and in Assignment #3.

Schedule until Midterm (Chronological Order):


Zoom-Recitation (10/20 Class; Will be recorded & Uploaded on YouTube)
I won’t be making a lecture note for this class.
You do not need to come online if you don’t have any questions.
Zoom link: https://cau.zoom.us/j/3841860582
Midterm (take-home) will be uploaded at the start of this class.
1. Review Assignments (#1~#3)
a. Written solutions won’t be provided
2. Midterm Instructions
a. Deadline
b. Which topics will be included?
c. How many problems per topic?
i. Calculation
ii. Proof
d. Q&A’s regarding directions

Midterm Due (10/26 00:10 A.M.; No class on 10/25)
