STAT 552 Probability and Statistics II: Short Review of S551
PROBABILITY AND STATISTICS II
INTRODUCTION
Short review of S551
1
WHAT IS STATISTICS?
2
BASIC DEFINITIONS
• POPULATION: The collection of all items of interest in a particular study.
• SAMPLE: A set of data drawn from the population; a subset of the population available for observation.
• PARAMETER: A descriptive measure of the population, e.g., mean.
• STATISTIC: A descriptive measure of a sample.
• VARIABLE: A characteristic of interest about each element of a population or sample.
3
STATISTIC
• A statistic (or estimator) is any function of the random variables of a random sample that does not contain any unknown quantity. E.g.,
o $\sum_{i=1}^{n} X_i$, $\prod_{i=1}^{n} X_i$, $\sum_{i=1}^{n} X_i / n$, $\min_i(X_i)$, $\max_i(X_i)$ are statistics.
o $\sum_{i=1}^{n} (X_i - \mu)$, $\sum_{i=1}^{n} X_i / \sigma$ are NOT, since $\mu$ and $\sigma$ are unknown.
5
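For instance, a minimal Python sketch (the sample values below are made up purely for illustration): the first few quantities are statistics because they depend only on the data, while the last would require the unknown population mean.

```python
import numpy as np

# A hypothetical observed random sample (illustrative values only)
x = np.array([4.2, 5.1, 3.8, 6.0, 4.9])

# These are statistics: they depend only on the sample
print(x.sum())           # sum of the observations
print(x.prod())          # product of the observations
print(x.mean())          # sample mean
print(x.min(), x.max())  # smallest and largest observations

# NOT a statistic: it involves the unknown population mean mu
# np.sum(x - mu)   # cannot be computed without knowing mu
```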
RANDOM VARIABLES
• Variables whose observed value is determined
by chance
• A r.v. is a function defined on the sample space
S that associates a real number with each
outcome in S.
• Rvs are denoted by uppercase letters, and their
observed values by lowercase letters.
6
DESCRIPTIVE STATISTICS
• Descriptive statistics involves the
arrangement, summary, and presentation of
data, to enable meaningful interpretation, and to
support decision making.
• Descriptive statistics methods make use of
– graphical techniques
– numerical descriptive measures.
7
Types of data – examples
• Quantitative
  – Continuous: blood pressure, height, weight, age
  – Discrete: number of children; number of attacks of asthma per week
• Categorical (Qualitative)
  – Ordinal (ordered categories): grade of breast cancer; better, same, worse; disagree, neutral, agree
  – Nominal (unordered categories): sex (male/female); alive or dead; blood group O, A, B, AB
8
PROBABILITY
[Diagram: POPULATION and SAMPLE connected by STATISTICAL INFERENCE]
9
• PROBABILITY: A numerical value
expressing the degree of uncertainty
regarding the occurrence of an event. A
measure of uncertainty.
$$P : S \to [0, 1]$$
11
THE CALCULUS OF
PROBABILITIES
• If P is a probability function and A is any
set, then
a. $P(\emptyset) = 0$
b. $P(A) \le 1$
c. $P(A^C) = 1 - P(A)$
12
ODDS
• The odds of an event A are defined by
$$\text{odds}(A) = \frac{P(A)}{P(A^C)} = \frac{P(A)}{1 - P(A)}$$
13
ODDS RATIO
• The odds ratio (OR) is the ratio of two odds.
• Useful for comparing the odds under two
different conditions or for two different
groups, e.g. odds for males versus
females.
14
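As a quick illustration, a short Python sketch computing odds and an odds ratio; the `odds` helper and the two group probabilities are hypothetical and chosen only for illustration.

```python
def odds(p):
    """Odds of an event with probability p: P(A) / (1 - P(A))."""
    return p / (1 - p)

# Hypothetical probabilities of an event for two groups (illustrative only)
p_male, p_female = 0.40, 0.25

odds_male = odds(p_male)       # 0.40 / 0.60 = 0.667
odds_female = odds(p_female)   # 0.25 / 0.75 = 0.333
odds_ratio = odds_male / odds_female
print(odds_male, odds_female, odds_ratio)   # OR = 2.0
```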
CONDITIONAL PROBABILITY
• (Marginal) Probability: P(A): How likely is it
that an event A will occur when an
experiment is performed?
• Conditional Probability: P(A|B): How will
the probability of event A be affected by
the knowledge of the occurrence or
nonoccurrence of event B?
• If two events are independent, then
P(A|B)=P(A)
15
CONDITIONAL PROBABILITY
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{if } P(B) > 0$$
$$0 \le P(A \mid B) \le 1$$
$$P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$$
16
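A small numerical illustration (the values are made up): if $P(A \cap B) = 0.12$ and $P(B) = 0.30$, then
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.12}{0.30} = 0.40,$$
which may differ from the marginal $P(A)$ unless A and B are independent.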
BAYES THEOREM
• Suppose you have P(B|A), but need
P(A|B).
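As a reminder, the standard statement of the theorem (with the law of total probability in the denominator) is
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^C)\,P(A^C)}.$$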
17
Independence
• A and B are independent iff
– P(A|B)=P(A) or P(B|A)=P(B)
– P(AB)=P(A)P(B)
• A1, A2, …, An are mutually independent iff
$$P\Big(\bigcap_{i \in J} A_i\Big) = \prod_{i \in J} P(A_i) \quad \text{for every subset } J \text{ of } \{1, 2, \dots, n\}$$
19
Example
• Discrete Uniform distribution:
$$P(X = x) = \frac{1}{N}; \quad x = 1, 2, \dots, N; \quad N = 1, 2, \dots$$
• Example: throw a fair die.
P(X=1)=…=P(X=6)=1/6
20
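A quick simulation sketch of this example (the seed and sample size are arbitrary), checking that the empirical frequencies approach 1/6:

```python
import numpy as np

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)   # fair die: outcomes 1..6

# Empirical estimate of P(X = x) for each face
for face in range(1, 7):
    print(face, np.mean(rolls == face))    # each should be close to 1/6 ≈ 0.167
```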
CONTINUOUS RANDOM
VARIABLES
• When sample space is uncountable
(continuous)
• Example: Continuous Uniform(a,b)
$$f(x) = \frac{1}{b - a}, \quad a \le x \le b.$$
21
CUMULATIVE DISTRIBUTION FUNCTION (C.D.F.)
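As a reminder, for any random variable X the c.d.f. is
$$F_X(x) = P(X \le x), \quad -\infty < x < \infty,$$
a non-decreasing, right-continuous function with $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$.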
22
JOINT DISCRETE
DISTRIBUTIONS
• A function f(x1, x2,…, xk) is the joint pmf for
some vector valued rv X=(X1, X2,…,Xk) iff
the following properties are satisfied:
$f(x_1, x_2, \dots, x_k) \ge 0$ for all $(x_1, x_2, \dots, x_k)$, and
$\sum_{x_1} \cdots \sum_{x_k} f(x_1, x_2, \dots, x_k) = 1$.
23
MARGINAL DISCRETE
DISTRIBUTIONS
• If the pair (X1,X2) of discrete random
variables has the joint pmf f(x1,x2), then the
marginal pmfs of X1 and X2 are
$$f_1(x_1) = \sum_{x_2} f(x_1, x_2) \quad \text{and} \quad f_2(x_2) = \sum_{x_1} f(x_1, x_2)$$
24
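A minimal NumPy sketch (with a made-up joint pmf) showing that the marginals are obtained by summing over the other variable:

```python
import numpy as np

# Hypothetical joint pmf f(x1, x2): rows index x1, columns index x2
joint = np.array([[0.10, 0.20, 0.10],
                  [0.15, 0.25, 0.20]])
assert np.isclose(joint.sum(), 1.0)

f1 = joint.sum(axis=1)   # marginal pmf of X1: sum over x2
f2 = joint.sum(axis=0)   # marginal pmf of X2: sum over x1
print(f1)  # [0.4  0.6]
print(f2)  # [0.25 0.45 0.3]
```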
CONDITIONAL
DISTRIBUTIONS
• If X1 and X2 are discrete or continuous
random variables with joint pdf f(x1,x2),
then the conditional pdf of X2 given X1=x1
is defined by
$$f(x_2 \mid x_1) = \frac{f(x_1, x_2)}{f_1(x_1)}, \quad \text{for } x_1 \text{ such that } f_1(x_1) > 0; \quad 0 \text{ elsewhere.}$$
EXPECTED VALUE
$$E[g(X)] = \begin{cases} \sum_{x} g(x)\, f_X(x), & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx, & \text{if } X \text{ is continuous} \end{cases}$$
provided the sum or integral is finite ($< \infty$).
27
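Continuing the made-up joint pmf above, a short sketch computing a conditional pmf and an expectation of the form E[g(X)]; the support values for X2 are an arbitrary labelling.

```python
import numpy as np

joint = np.array([[0.10, 0.20, 0.10],
                  [0.15, 0.25, 0.20]])

# Conditional pmf of X2 given X1 = 0 (first row): f(x2 | x1) = f(x1, x2) / f1(x1)
f1 = joint.sum(axis=1)
cond_x2_given_x1_0 = joint[0] / f1[0]
print(cond_x2_given_x1_0)          # [0.25 0.5  0.25], sums to 1

# E[g(X2)] with g(x) = x**2, taking the support of X2 to be {0, 1, 2}
x2_values = np.array([0, 1, 2])
f2 = joint.sum(axis=0)
print(np.sum(x2_values**2 * f2))   # sum of g(x) * f_X2(x)
```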
Laws of Expected Value and Variance
Let X be a rv and c be a constant.
Laws of Expected Value        Laws of Variance
E(c) = c                      V(c) = 0
E(X + c) = E(X) + c           V(X + c) = V(X)
E(cX) = cE(X)                 V(cX) = c²V(X)
28
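These laws are easy to spot-check by simulation; a small sketch (the distribution, constant, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=200_000)   # any distribution will do
c = 3.0

print(np.mean(x + c), np.mean(x) + c)   # E(X + c) = E(X) + c
print(np.mean(c * x), c * np.mean(x))   # E(cX) = cE(X)
print(np.var(x + c), np.var(x))         # V(X + c) = V(X)
print(np.var(c * x), c**2 * np.var(x))  # V(cX) = c^2 V(X)
```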
EXPECTED VALUE
$$E\Big(\sum_{i=1}^{k} a_i X_i\Big) = \sum_{i=1}^{k} a_i E(X_i)$$
If X and Y are independent,
$$E[g(X)\,h(Y)] = E[g(X)]\,E[h(Y)]$$
The covariance of X and Y is defined as
$$\mathrm{Cov}(X, Y) = E\big[(X - E(X))(Y - E(Y))\big] = E(XY) - E(X)E(Y)$$
29
EXPECTED VALUE
If X and Y are independent,
$$\mathrm{Cov}(X, Y) = 0$$
30
EXPECTED VALUE
$$\mathrm{Var}(X_1 + X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2)$$
31
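A short simulation sketch of the variance-of-a-sum identity, using correlated data generated for illustration (the dependence structure and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=200_000)
x2 = 0.5 * x1 + rng.normal(size=200_000)   # X2 is correlated with X1

cov = np.cov(x1, x2, ddof=0)[0, 1]
lhs = np.var(x1 + x2)
rhs = np.var(x1) + np.var(x2) + 2 * cov
print(lhs, rhs)                            # the two should agree closely
```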
CONDITIONAL EXPECTATION
AND VARIANCE
$$E(Y \mid x) = \begin{cases} \sum_{y} y\, f(y \mid x), & \text{if } X \text{ and } Y \text{ are discrete} \\ \int_{-\infty}^{\infty} y\, f(y \mid x)\, dy, & \text{if } X \text{ and } Y \text{ are continuous} \end{cases}$$
$$\mathrm{Var}(Y \mid x) = E(Y^2 \mid x) - \big[E(Y \mid x)\big]^2$$
32
CONDITIONAL EXPECTATION
AND VARIANCE
$$E\big[E(Y \mid X)\big] = E(Y)$$
MOMENT GENERATING FUNCTION
$$M_X(t) = E(e^{tX}) = \begin{cases} \int_{-\infty}^{\infty} e^{tx} f(x)\, dx, & \text{if } X \text{ is continuous} \\ \sum_{\text{all } x} e^{tx} f(x), & \text{if } X \text{ is discrete} \end{cases}$$
36
Properties of m.g.f.
• M(0)=E[1]=1
• $E(X^k) = M^{(k)}(0)$, where $M^{(k)}$ is the $k$th derivative.
• The characteristic function is defined analogously as $\phi_X(t) = E(e^{itX})$, e.g., $\sum_{\text{all } x} e^{itx} f(x)$ if X is discrete.
38
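A SymPy sketch illustrating the moment property for the Exponential(1) distribution, whose m.g.f. is the standard result $1/(1-t)$ for $t < 1$ (the choice of distribution is just for illustration):

```python
import sympy as sp

t = sp.symbols('t')
M = 1 / (1 - t)                     # m.g.f. of an Exponential(1) random variable

print(M.subs(t, 0))                 # M(0)   = 1
print(sp.diff(M, t).subs(t, 0))     # M'(0)  = E[X]   = 1
print(sp.diff(M, t, 2).subs(t, 0))  # M''(0) = E[X^2] = 2
```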
Uniqueness
Theorem:
1. If two r.v.s have m.g.f.s that exist and are
equal, then they have the same
distribution.
2. If two r.v.s have the same distribution,
then they have the same m.g.f. (if they
exist)
Similar statements are true for c.h.f.
39
SOME DISCRETE PROBABILITY
DISTRIBUTIONS
• Please review: Degenerate, Uniform,
Bernoulli, Binomial, Poisson, Negative
Binomial, Geometric, Hypergeometric,
Extended Hypergeometric, Multinomial
40
SOME CONTINUOUS
PROBABILITY DISTRIBUTIONS
• Please review: Uniform, Normal (Gaussian), Exponential, Gamma, Chi-Square, Beta, Weibull, Cauchy, Log-Normal, t, F Distributions
41
TRANSFORMATION OF RANDOM
VARIABLES
• If X is an rv with pdf f(x), then Y=g(X) is also an
rv. What is the pdf of Y?
• If X is a discrete rv, substitute $x = g^{-1}(y)$ wherever you see x in the pmf f(x) to obtain the pmf of Y.
• If X is a continuous rv, do the same thing, but also multiply by the Jacobian, $\left| \frac{dx}{dy} \right|$.
• If the transformation is not 1-to-1, divide the region into sub-regions on which it is 1-to-1.
42
CDF method
• Example: Let $F_X(x) = 1 - e^{-2x}$ for $x > 0$. Consider $Y = e^X$. What is the p.d.f. of Y?
• Solution:
$$F_Y(y) = P(Y \le y) = P(e^X \le y) = P(X \le \ln y) = F_X(\ln y) = 1 - y^{-2} \quad \text{for } y > 1$$
$$f_Y(y) = \frac{d}{dy} F_Y(y) = 2y^{-3} \quad \text{for } y > 1$$
43
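A quick Monte Carlo check of this example (here X ~ Exponential with rate 2, so F(x) = 1 − e^{−2x}; the seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=0.5, size=200_000)   # rate 2  <=>  scale 1/2
y = np.exp(x)                                  # Y = e^X

# Compare the empirical cdf of Y with F_Y(y) = 1 - y**(-2) at a few points
for y0 in [1.5, 2.0, 5.0]:
    print(y0, np.mean(y <= y0), 1 - y0**(-2))
```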
M.G.F. Method
• If $X_1, X_2, \dots, X_n$ are independent random variables with MGFs $M_{X_i}(t)$, then the MGF of $Y = \sum_{i=1}^{n} X_i$ is $M_Y(t) = M_{X_1}(t) \cdots M_{X_n}(t)$.
44
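For example (a standard application of this result): if the $X_i$ are independent with $X_i \sim N(\mu_i, \sigma_i^2)$, then
$$M_Y(t) = \prod_{i=1}^{n} \exp\!\Big(\mu_i t + \tfrac{1}{2}\sigma_i^2 t^2\Big) = \exp\!\Big(\Big(\sum_{i=1}^{n}\mu_i\Big)t + \tfrac{1}{2}\Big(\sum_{i=1}^{n}\sigma_i^2\Big)t^2\Big),$$
which is the m.g.f. of a normal random variable, so $Y \sim N\big(\sum_i \mu_i, \sum_i \sigma_i^2\big)$.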
THE PROBABILITY INTEGRAL
TRANSFORMATION
• Let X have continuous cdf $F_X(x)$ and define the rv Y as $Y = F_X(X)$. Then
Y ~ Uniform(0,1); that is,
$P(Y \le y) = y$, for $0 < y < 1$.
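A brief simulation sketch of the probability integral transformation, using SciPy's normal cdf (the choice of N(0,1), the seed, and the sample size are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(size=100_000)   # X ~ N(0, 1)
y = stats.norm.cdf(x)          # Y = F_X(X)

# Y should look Uniform(0, 1): mean ~ 0.5, variance ~ 1/12, P(Y <= y0) ~ y0
print(y.mean(), y.var())
print(np.mean(y <= 0.3))       # should be close to 0.3
```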
SAMPLING FROM THE NORMAL DISTRIBUTION
c) $$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$
47
SAMPLING FROM THE NORMAL
DISTRIBUTION
If population variance is unknown, we use sample
variance:
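Presumably the familiar results intended here are: for a random sample from $N(\mu, \sigma^2)$,
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2, \quad\text{and replacing } \sigma \text{ with } S \text{ gives}\quad \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}.$$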
48
SAMPLING FROM THE NORMAL
DISTRIBUTION
• The F distribution allows us to compare
the variances by giving the distribution of
$$\frac{S_X^2 / S_Y^2}{\sigma_X^2 / \sigma_Y^2} = \frac{S_X^2 / \sigma_X^2}{S_Y^2 / \sigma_Y^2} \sim F_{n-1,\,m-1}$$
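A closing simulation sketch checking this sampling distribution (the choices n = 10, m = 15, N(0,1) populations, and the seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, m, reps = 10, 15, 50_000
x = rng.normal(size=(reps, n))
y = rng.normal(size=(reps, m))

# With sigma_X = sigma_Y = 1, the ratio of sample variances is F(n-1, m-1)
ratio = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)
print(np.mean(ratio <= 1.0), stats.f.cdf(1.0, n - 1, m - 1))   # should agree
```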