Lecture 1. Introduction and Random Variables: Kwong-Yu Wong


Lecture 1.

Introduction and random variables

Kwong-Yu Wong

Kwong-Yu Wong Lecture 1. Introduction and random variables 1 / 43


Course Summary

▶ Instructor: Dr. WONG Kwong-Yu


▶ Lecture: Monday 4 - 6pm on Zoom.
▶ Link is available on Canvas under "Zoom" -> tab "Upcoming
Meetings", with a passcode: metrics
▶ Lectures are recorded; you can find the recordings under
"Zoom" -> tab "Cloud Recordings".
▶ Each lecture is around 80-90 mins. Some are longer, some are
shorter.
▶ Tutorial Sessions: The module has in-person tutorial sessions
led by TAs. You are expected to attend the tutorial sessions.



What do we learn in EC2303?

▶ We learn Foundations for Econometrics.


▶ Econometrics is the application of mathematics, statistical
methods, and computer science, to economic data and is
described as the branch of economics that aims to give empirical
content to economic relations.
▶ In the econometrics courses, you will learn how to describe,
summarize, and make statistical inferences about economic data.



What do we learn in EC2303?

To understand data, it is very important to understand the basic
concepts in probability theory and statistics.
▶ Before we talk about real data, we will learn probability
theory. This helps us understand how the data are generated at a
fundamental level.
▶ After that, we will learn the statistical methods for inference and
estimation.
▶ In more advanced econometrics courses, you will have more
chances to see real data.

[Warning!!] This module is about mathematics and statistics! You will
see a lot of numbers, figures and formulas in this course. :)



Lectures

▶ What do you need for lectures? Slides and notes.


▶ Slides: slides (download from Canvas) provide the big picture and
outline of lectures.
▶ Notes: for the details, you need to take your own notes.
▶ You need to study both slides and notes.
▶ No textbook is required. If you wish, you may refer to
▶ Probability and Statistics for Engineers and Scientists
By Walpole, Myers, Myers and Ye.
▶ Introduction to Econometrics
by Stock and Watson.
▶ Please do NOT post or distribute class materials (slides, problem
sets etc). All rights reserved.



Evaluation

▶ Problem Sets 48%


▶ 10 weekly problem sets starting from Week 3, each accounting for
4.5%.
▶ Problem set 0, due in Week 3, accounting for 3%.
▶ Participation: 12%
▶ Final exam 40%
▶ Examination: 21 Nov 2022, Monday, 5pm (in-person)
▶ Venue: MPSH-DA (tentative)



Problem set and Tutorial session

▶ Starting in Week 3, a problem set is due every Monday at 9am.


▶ Each problem set accounts for 4.5%. To reward the effort in
wrestling with the problem set, if most of your answers are empty
but you submitted your answers, you get 2%. A late submission¹
gets 2%, and you get 0% if you do not submit anything.
▶ During the tutorial sessions, the TAs discuss the problem set just
submitted. Partial answers will be posted in the following week.

¹ No answer sheet will be accepted later than 2 days after the deadline.

Problem set and Tutorial session

▶ Participation
▶ Each tutorial session accounts for 1%: half for showing up to
participate and half for proactive participation (e.g.
answering/asking questions).
▶ For an excused absence, please provide a medical certificate or a
statement of reasons to your TA to avoid deduction.
▶ In addition to the 11% from 11 tutorial sessions, the final 1% is
awarded to those who never miss any tutorial session, excluding
excused absences.



Communication

▶ Have questions on the lectures and problem sets?


▶ We will be available for questions after each class or tutorial
session and we’ll schedule office hours later (TBA).
▶ You may post questions about class materials on Canvas
Discussion.
▶ Correctly answering your classmates' questions can count as
proactive participation for that week. Please bring it up with your TA.
▶ Please do not send questions via email. Use emails for
administrative issues ONLY.



Communication

▶ Announcements are made on Canvas. Hence, it’s important to check
Canvas frequently.
▶ Have any other administrative/technical issues?
▶ Please contact your TA for any issue regarding tutorial sessions (to
submit statement of reasons, etc).
▶ For other admin issues, please contact me
(kwongyu.wong@nus.edu.sg) or our admin staff.
▶ Always use an email subject starting with "[EC2303 TWXX Student]"²
and cc your TA.

² TWXX is your tutorial session number.

Shall we start?



Economic data
▶ Examples: unemployment rates, interest rates, exchange rates,
consumption and income levels for households, Gross Domestic
Incomes etc.
▶ Daily stock prices (S&P500)



Probability Theory

▶ Probability theory provides an analytic tool to understand the
underlying structure of data.
▶ Let’s start from a simple example: you toss a coin once, and
assign 1 when heads comes up and −1 when tails comes up.

(1) How would you describe the experiment?


▶ After rolling a 6, what is the outcome of rolling a die?
(a) 6, or
(b) random
▶ What about the outcome of this die after rolling?
(2) How would you summarize the experiment?



How to describe? Example 1

▶ We distinguish by notation
▶ Random variable: Upper case (e.g. X)
▶ Potential outcome: Lower case (e.g. x)
▶ Let us define X (the random variable) as
\[
X = \begin{cases}
-1 & \text{if tail shows up} \\
1 & \text{if head shows up}
\end{cases}
\]

▶ In words, there is a half chance of obtaining 1 and another half
chance of obtaining −1.



How to describe? Example 1
▶ In picture,

▶ The function described in the picture is called Probability Mass
Function (PMF).
▶ In mathematical expression,
\[
f_X(x) = \begin{cases}
0.5 & \text{at } x = -1 \text{ (tail event)} \\
0.5 & \text{at } x = 1 \text{ (head event)} \\
0 & \text{otherwise}
\end{cases}
\]



How to describe? Example 1

Definition
Probability Mass Function (PMF) is defined as
\[ f_X(x) = P(X = x). \]
Note the use of x for the potential outcome as opposed to the random
variable X.

▶ In our Example 1, $P(X = -1) = 0.5 = f_X(-1)$ and
$P(X = 1) = 0.5 = f_X(1)$.
▶ Observe that
▶ $0 \le f_X(x) \le 1$
▶ $\sum_x P(X = x) = \sum_x f_X(x) = 1$
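
The two properties above can be checked mechanically. The sketch below is only an illustration: the helper name `is_valid_pmf` and the choice to store a PMF as a Python dictionary (value to probability) are assumptions made here, not part of the lecture.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in pmf.values())
    return in_range and sum(pmf.values()) == 1

# Example 1: X = -1 for tails and X = 1 for heads, each with probability 1/2.
coin_pmf = {-1: Fraction(1, 2), 1: Fraction(1, 2)}

print(is_valid_pmf(coin_pmf))             # True
print(is_valid_pmf({0: Fraction(1, 2)}))  # False: probabilities sum to 1/2
```

Exact fractions are used instead of floats so that the "sums to 1" check is not affected by rounding.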



How to describe? Example 2

▶ Suppose you roll a normal six-sided die once and define X to be
the number that comes up.
▶ We have an equal chance of obtaining each value in {1, 2, 3, 4, 5, 6}.
▶ Therefore, the probability of X = 1 is 1/6. The probability of
X = 2 is also 1/6. The probability of X = 3 is also 1/6... The
same for X = 4, X = 5, and X = 6.



How to describe? Example 2

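▶ The Probability Mass Function (PMF) is,
\[
f_X(x) = P(X = x) = \begin{cases}
1/6 & \text{at } x = 1 \\
1/6 & \text{at } x = 2 \\
1/6 & \text{at } x = 3 \\
1/6 & \text{at } x = 4 \\
1/6 & \text{at } x = 5 \\
1/6 & \text{at } x = 6 \\
0 & \text{otherwise}
\end{cases}
\]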



How to describe? Example 2
▶ The PMF is given as,

▶ What is P(X = 1)?
Answer: 1/6
▶ What is P(X = 2)?
Answer: 1/6
▶ What is P(X = 10)?
Answer: 0
▶ What is P(1 ≤ X ≤ 3)?
Answer: P(X = 1) + P(X = 2) + P(X = 3) = 1/2
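
The answers above all follow one rule: add up the PMF over the values that belong to the event. A minimal Python sketch of that rule is given here; the helper name `prob` and the dictionary representation of the PMF are illustrative assumptions.

```python
from fractions import Fraction

# PMF of a fair six-sided die (Example 2); values outside 1..6 have probability 0.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def prob(pmf, event):
    """P(X in event): add f_X(x) over the values x for which event(x) is true."""
    return sum((p for x, p in pmf.items() if event(x)), Fraction(0))

print(prob(die_pmf, lambda x: x == 1))       # 1/6
print(prob(die_pmf, lambda x: x == 10))      # 0
print(prob(die_pmf, lambda x: 1 <= x <= 3))  # 1/2
```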
How to describe? Example 3

▶ The PMF is given as,

▶ $f_X(x)$?
Answer:
\[
f_X(x) = P(X = x) = \begin{cases}
1/4 & \text{at } x = 0 \\
1/2 & \text{at } x = 1 \\
1/4 & \text{at } x = 2 \\
0 & \text{otherwise}
\end{cases}
\]



How to describe? Example 3
▶ The PMF is given as,

▶ What is P(X ≤ 0)?
Answer: 1/4
▶ What is P(X ≤ 1)?
Answer: P(X = 0) + P(X = 1) = 1/4 + 1/2 = 3/4
▶ What is P(X ≤ 2)?
Answer: P(X = 0) + P(X = 1) + P(X = 2) = 1/4 + 1/2 + 1/4 = 1
▶ The function P(X ≤ x) describes a random variable just as well as
P(X = x) does.

Another way of describing random variables

Definition
Cumulative Distribution Function (CDF) is defined as
\[ F_X(x) = P(X \le x). \]

▶ In our previous Example 3,
\[
F_X(x) = P(X \le x) = \begin{cases}
0 & \text{when } x < 0 \\
1/4 & \text{when } 0 \le x < 1 \\
3/4 & \text{when } 1 \le x < 2 \\
1 & \text{when } 2 \le x
\end{cases}
\]



PMF and CDF
Hence, given a discrete random variable X,

Definition 1
Probability Mass Function (PMF) is defined as
\[ f_X(x) = P(X = x). \]

Definition 2
Cumulative Distribution Function (CDF) is defined as
\[ F_X(x) = P(X \le x). \]



PMF and CDF

Recall the three examples we’ve seen so far.

1. Coin toss example with one coin
2. The example of rolling a die
3. Coin toss example with two coins



Example 1
Coin toss example: we assign 1 when heads comes up, −1 when
tails comes up.
\[
X = \begin{cases}
-1 & \text{with probability } 1/2 \\
1 & \text{with probability } 1/2
\end{cases}
\]
Question 1: Find the PMF of X.

Answer:
\[
f_X(x) = \begin{cases}
0.5 & \text{at } x = -1 \\
0.5 & \text{at } x = 1 \\
0 & \text{otherwise}
\end{cases}
\]



Example 1

Question 2: Find the CDF of X.

▶ The CDF $F_X(x)$ is defined as P(X ≤ x).
▶ Given a PMF, how can we find P(X ≤ x) for each x?
\[ F_X(x) = \sum_{y \le x} P(X = y) = \sum_{y \le x} f_X(y). \]
▶ As we move the point x, we sum the lengths of the red sticks
at or to the left of x.
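
The summation rule for the CDF can be sketched in a few lines of Python. This is only an illustration under the assumption that the PMF is stored as a dictionary; the function name `cdf` is chosen here and is not from the lecture.

```python
from fractions import Fraction

def cdf(pmf, x):
    """F_X(x) = P(X <= x): sum f_X(y) over all support points y <= x."""
    return sum((p for y, p in pmf.items() if y <= x), Fraction(0))

# Example 1: one coin toss with X = -1 (tails) or X = 1 (heads).
coin_pmf = {-1: Fraction(1, 2), 1: Fraction(1, 2)}

for x in (-2, -1, 0, 1, 3):
    print(x, cdf(coin_pmf, x))
# -2 0
# -1 1/2
# 0 1/2
# 1 1
# 3 1
```

The value of the CDF changes only at x = −1 and x = 1, the two points where the PMF is positive.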



Example 1

▶ For a fixed x,

▶ P(X ≤ x) is,



Example 1

▶ For another fixed x,

▶ P(X ≤ x) is,



Example 3
Two-coin toss example: we toss two coins and let X be the number of
heads. Recall that the sample space is,
Ω = {(T, T), (T, H), (H, T), (H, H)}.



Example 3
Two-coin toss example: we toss two coins and let X be the number of
heads.
Question 1: Find the PMF of X.

Answer:
\[
f_X(x) = \begin{cases}
0.25 & \text{at } x = 0 \\
0.5 & \text{at } x = 1 \\
0.25 & \text{at } x = 2 \\
0 & \text{otherwise}
\end{cases}
\]
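
One way to arrive at this PMF is to list the sample space Ω and count heads in each outcome. The sketch below does this in Python; it assumes fair, independent coins, so that each of the four outcomes has probability 1/4, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two coin tosses: (T,T), (T,H), (H,T), (H,H).
omega = list(product("TH", repeat=2))

# X counts heads. Assumption: every outcome in omega is equally likely.
counts = Counter(outcome.count("H") for outcome in omega)
pmf_x = {x: Fraction(n, len(omega)) for x, n in sorted(counts.items())}

for x, p in pmf_x.items():
    print(x, p)
# 0 1/4
# 1 1/2
# 2 1/4
```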



Example 3
Two-coin toss example: we toss two coins and let X be the number of
heads.
Question 2: Find the CDF of X.

Answer:
\[
F_X(x) = \begin{cases}
0 & \text{when } x < 0 \\
0.25 & \text{when } 0 \le x < 1 \\
0.75 & \text{when } 1 \le x < 2 \\
1 & \text{when } 2 \le x
\end{cases}
\]



A function of a random variable

▶ A function of a random variable can be thought of as a new
random variable.
▶ Consider X as the number of heads in the two-coin toss example.
\[
X = \begin{cases}
0 & \text{with probability } 0.25 \\
1 & \text{with probability } 0.5 \\
2 & \text{with probability } 0.25
\end{cases}
\]

▶ What if, for instance, we are interested in the square of the
number of heads?



A function of a random variable

We define a new random variable, say Y, as the square of the number
of heads in our two-coin toss experiment (i.e. $Y = X^2$).
What is the PMF of Y?
▶ X only takes the values 0, 1, 2 with positive probabilities.
▶ Y is a random variable which takes the values 0, 1, 4 with positive
probabilities.
\[
f_Y(y) = P(Y = y) = \begin{cases}
0.25 & \text{at } y = 0 \\
0.5 & \text{at } y = 1 \\
0.25 & \text{at } y = 4 \\
0 & \text{otherwise}
\end{cases}
\]
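
The step from the PMF of X to the PMF of Y can be sketched as follows: each value x carries its probability over to y = g(x). The helper name `pmf_of_function` and the dictionary representation are assumptions made for this illustration.

```python
from collections import defaultdict
from fractions import Fraction

def pmf_of_function(pmf_x, g):
    """PMF of Y = g(X): add f_X(x) over every x that maps to the same y."""
    pmf_y = defaultdict(Fraction)  # missing keys start at probability 0
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

# X = number of heads in the two-coin toss example.
pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
pmf_y = pmf_of_function(pmf_x, lambda x: x ** 2)

for y, p in sorted(pmf_y.items()):
    print(y, p)
# 0 1/4
# 1 1/2
# 4 1/4
```

Here squaring sends 0, 1, 2 to three different values, so the probabilities simply carry over. If g sends several values of x to the same y, their probabilities are added together.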



A function of a random variable

▶ Once you find the PMF, it is easy to find the CDF, $F_Y(y)$.
\[
F_Y(y) = \begin{cases}
0 & \text{when } y < 0 \\
0.25 & \text{when } 0 \le y < 1 \\
0.75 & \text{when } 1 \le y < 4 \\
1 & \text{when } 4 \le y
\end{cases}
\]



A function of a random variable

Let’s go back to our one-coin toss example where we defined X to be,
\[
X = \begin{cases}
-1 & \text{with probability } 0.5 \\
1 & \text{with probability } 0.5
\end{cases}
\]

Define $Y = 2X + 1$ and $Z = X^2$.
What are the PMF and CDF of Y and Z respectively?



Discrete and continuous random variables

So far, we have only considered the cases where the sample space
consists of discrete numbers (e.g. coin toss, die rolling).

There is another type of random variable which can take any value in
an interval.
▶ For example, for the response time to some stimulus, the sample
space is Ω = (0, ∞), and the possible values of the response time
are measured on a continuum.

We call the latter a CONTINUOUS random variable, while the former
is called a DISCRETE random variable.



Discrete and continuous random variables

Continuous or discrete?

1.

2.

1. Discrete
2. Discrete



Discrete and continuous random variables

Continuous or discrete?
▶ Time between failures of an electrical component
▶ Average temperature in Singapore tomorrow
▶ Weight of packages filled by a mechanical filling process
▶ The favorite integer of a student in EC2303 between 1 and 3
▶ The favorite real number of a student in EC2303 between 1 and 3
▶ The exam score of a student in EC2303



Continuous random variables
How do you describe continuous random variables?
▶ For a discrete random variable with a PMF

▶ P(X ≤ x) can be found as,



Continuous random variables
Similarly, we can think of functions $f_X(x)$ and $F_X(x)$ such that
▶ the shaded area (red) represents P(X ≤ x).

▶ The function $f_X(x)$: Probability Density Function (PDF)
▶ The function $F_X(x)$: Cumulative Distribution Function (CDF)
▶ $f_X(x)$ is displayed by the solid line in the graph, and $F_X(x)$ is a
function that represents the red area as we move around a point x.

Summary

Definition
If X is discrete, Probability Mass Function (PMF) $f_X(x)$ is
\[ f_X(x) = P(X = x). \]
If X is continuous, Probability Density Function (PDF) $f_X(x)$ is the
function satisfying, for all a < b,
\[ P(a < X < b) = \int_a^b f_X(x)\,dx. \]

▶ Since the probability of a random variable lying in any interval (a, b)
must be nonnegative, every pdf must be nonnegative: $f_X(x) \ge 0$ for all x.
▶ Also, the probability of a random variable lying anywhere on the real
line is equal to one, so we must have $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.
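
To see these properties at work numerically, the sketch below integrates a density with the midpoint rule. The density $f_X(x) = e^{-x}$ for x > 0 is a hypothetical choice made here for a response-time variable; it does not come from the lecture, and the helper name `integrate` is likewise illustrative.

```python
import math

def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pdf(x):
    """Hypothetical density for a response time: exp(-x) for x > 0, else 0."""
    return math.exp(-x) if x > 0 else 0.0

# P(1 < X < 2) is the area under the density between 1 and 2.
print(round(integrate(pdf, 1.0, 2.0), 4))   # 0.2325

# The total area is (numerically) one; 50 stands in for infinity here.
print(round(integrate(pdf, 0.0, 50.0), 4))  # 1.0
```

For this particular density the first integral can also be computed by hand as $e^{-1} - e^{-2} \approx 0.2325$, which agrees with the numerical value.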



Summary
When X is a continuous random variable,

Definition
Cumulative Distribution Function (CDF) is defined as
\[ F_X(x) = P(X \le x). \]

$F_X(x)$ is a nondecreasing and right-continuous function.

▶ When X is a discrete random variable, $F_X(x) = \sum_{y \le x} f_X(y)$ is a step
function with a certain number of jumps.
▶ When X is a continuous random variable, $F_X(x) = \int_{-\infty}^{x} f_X(y)\,dy$ is
a continuous function.
