

Appendix A

Review of Basic Statistics


This appendix gives a brief review of the basic definitions and concepts in
statistics. The intent is to highlight some of the statistical concepts that
are frequently referred to in various chapters of the book. The review starts
with the definition of a random variable and of the commonly used
statistical functions.

A.1 Random Variables


A real valued function that is defined on the sample space S is called a
random variable (r.v.). A simple example of an r.v. is the outcome of rolling
a die. Here, the sample space is the set of numbers 1 through 6, i.e. all
possible outcomes of rolling a die.

A.2 The Distribution Function (d.f.), F(x)


It is a real valued function of a real number x, defined as:

$$F(x) = \Pr(X \le x), \qquad -\infty < x < \infty \tag{A.1}$$

$$0 \le F(x) \le 1 \tag{A.2}$$

where Pr stands for "probability of".


The distribution function, d.f. has several useful properties some of
which are listed below:

Property-1 F(x) is non-decreasing as x increases, i.e.

if $x_1 < x_2$, then $F(x_1) \le F(x_2)$



Property-2 Lower and upper limits of the d.f. are zero and one respectively,
i.e.

$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to +\infty} F(x) = 1$$

Property-3 A d.f. at a given point x is defined by its limit from the right,
i.e.

$$F(x) = F(x^{+}) \quad \text{for all } x.$$

A.3 The Probability Density Function (p.d.f.), f(x)

A non-negative function f, defined on the real line, is called a probability
density function if, for any interval T, the following is satisfied:

$$\Pr(X \in T) = \int_{T} f(x)\,dx$$

The p.d.f. has the following properties:

$$f(x) \ge 0 \tag{A.3}$$

$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \tag{A.4}$$

Note that when a random variable X has a continuous distribution, then
$\Pr(X = x) = 0$ for any single point x. Also, the distribution and density
functions are related as shown below:

$$\Pr(X \le x) = \int_{-\infty}^{x} f(t)\,dt \tag{A.5}$$

$$= F(x) \tag{A.6}$$

or, as long as the p.d.f. is continuous, the d.f. will be differentiable:

$$\frac{dF(x)}{dx} = f(x)$$
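As a quick numerical illustration of this relationship (a minimal sketch that is not part of the original text; the standard normal from scipy.stats is used only as an example distribution), a finite-difference derivative of the d.f. can be compared against the p.d.f.:

```python
import numpy as np
from scipy.stats import norm

# Any continuous distribution works; the standard normal is an arbitrary choice.
x = np.linspace(-3.0, 3.0, 601)
h = 1e-5

# Finite-difference approximation of dF(x)/dx
dF_dx = (norm.cdf(x + h) - norm.cdf(x - h)) / (2.0 * h)

# Should be very close to the density f(x) at every point
print(np.max(np.abs(dF_dx - norm.pdf(x))))
```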

A.4 Continuous Joint Distributions

Joint distributions are defined for two or more random variables. Consider
the case of two random variables, X and Y. A non-negative function f(x,y)
is called the joint probability density function of X and Y, if for a region T
in the x-y plane, the following holds true:

$$\Pr\big((X,Y) \in T\big) = \iint_{T} f(x,y)\,dx\,dy$$



Again, similar properties can be written for the joint p.d.f.:

$$f(x,y) \ge 0 \tag{A.7}$$

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1 \tag{A.8}$$

Similarly, the joint distribution function of two random variables X and Y
can be defined as:

$$F(x,y) = \Pr(X \le x \text{ and } Y \le y) \tag{A.9}$$

A.5 Independent Random Variables

X and Y are independent random variables if

$$f(x,y) = f_1(x)\,f_2(y) \qquad \text{and} \qquad F(x,y) = F_1(x)\,F_2(y)$$

where $f_1, F_1$ and $f_2, F_2$ denote the marginal density and distribution
functions of X and Y, respectively.

A.6 Conditional Distributions

The conditional probability density function g of a random variable X,
when another random variable Y is already known to have assumed a value
y, is given by:

$$g(x \mid y) = \frac{f(x,y)}{f_2(y)}, \qquad -\infty < x < \infty$$

A.7 Expected Value


Expected (or mean) value of a random variable X is denoted by $E(X)$ and
defined as:

$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

Expected value has the following properties:

1. Expected value of a random variable $Y = aX + b$ is given by
$E(Y) = a\,E(X) + b$.

2. If $\Pr(X \ge a) = 1$, then $E(X) \ge a$. If $\Pr(X \le b) = 1$, then
$E(X) \le b$.

3. $E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)$



A.8 Variance

Variance of a random variable X is denoted by $\sigma^2$ and defined as:

$$\sigma^2 = \mathrm{Var}(X) = E\big[(X - \mu)^2\big], \qquad \text{where } \mu = E(X)$$

Variance has the following properties:

1. $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

2. $\mathrm{Var}(X) = E(X^2) - \mu^2$

3. $\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i)$, for independent $X_i$'s

A.9 Median

The median of the distribution of a random variable X is defined as the
value m along the real line such that

$$\Pr(X \le m) \ge \frac{1}{2} \qquad \text{and} \qquad \Pr(X \ge m) \ge \frac{1}{2}$$

Also, note that:

$$\Pr(X > m) = 1 - \Pr(X \le m) \le \frac{1}{2} \tag{A.11}$$

$$\Pr(X > m) \le \frac{1}{2} \le \Pr(X \le m) \tag{A.12}$$

A.10 Mean Squared Error

It can be shown that the value of z that will minimize the expected value
of the squared error, i.e. $E\big[(X - z)^2\big]$, is the expected value of the random
variable X. Consider the expression

$$E\big[(X - z)^2\big] = E(X^2) - 2z\,E(X) + z^2$$

Choosing $z = E(X)$ will minimize the above expression, which is called the
mean squared error (MSE). Note also that, with the choice of z as $E(X)$,
the MSE will be identical to the definition of the variance of X:

$$E\Big[\big(X - E(X)\big)^2\Big] = \mathrm{Var}(X)$$



A.11 Mean Absolute Error

Similar to the case of the MSE, it can be shown that the value z that
minimizes the mean absolute error

$$E\big[\,|X - z|\,\big]$$

is the median m of the random variable X. In other words, if m is the
median and c is any other number, then

$$E(|X - c|) \ge E(|X - m|)$$

Proof: Let us arbitrarily assume that $m < c$. Then,

$$E(|X - c|) - E(|X - m|) = \int_{-\infty}^{\infty} \big(|x - c| - |x - m|\big) f(x)\,dx$$

$$= \int_{-\infty}^{m} (c - m)\,f(x)\,dx + \int_{m}^{c} (c + m - 2x)\,f(x)\,dx + \int_{c}^{\infty} (m - c)\,f(x)\,dx$$

$$\ge \int_{-\infty}^{m} (c - m)\,f(x)\,dx + \int_{m}^{\infty} (m - c)\,f(x)\,dx$$

since $c + m - 2x \ge m - c$ for $x \le c$. Therefore,

$$E(|X - c|) - E(|X - m|) \ge (c - m)\big[\Pr(X \le m) - \Pr(X > m)\big]$$

Since m is the median, Eq. (A.12) must hold true:

$$\Pr(X > m) \le \frac{1}{2} \le \Pr(X \le m)$$

Hence,

$$E(|X - c|) - E(|X - m|) \ge 0$$

The proof under the alternative assumption of $m > c$ is left to the reader.
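The two optimality results above (the mean minimizes the squared error, the median minimizes the absolute error) can be verified numerically. The sketch below is not part of the original text; it scans candidate values of z for an arbitrary skewed sample and locates the minimizers of the two error measures:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)       # an arbitrary, skewed sample

z_grid = np.linspace(0.0, 6.0, 2001)
mse = [np.mean((x - z) ** 2) for z in z_grid]      # E[(X - z)^2]
mae = [np.mean(np.abs(x - z)) for z in z_grid]     # E[|X - z|]

print("z minimizing MSE:", z_grid[np.argmin(mse)], "  sample mean:", x.mean())
print("z minimizing MAE:", z_grid[np.argmin(mae)], "  sample median:", np.median(x))
```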

A.12 Covariance

Covariance between two random variables X and Y is defined as:

$$\mathrm{cov}(X,Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big], \qquad \text{where } \mu_X = E(X),\ \mu_Y = E(Y)$$

Related to the covariance, one can define the correlation coefficient as:

$$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sigma_X\,\sigma_Y}$$

Properties of covariance:



1. $\mathrm{cov}(X,Y) = E(XY) - E(X)\,E(Y)$

2. If X, Y are independent random variables, then $\mathrm{cov}(X,Y) = 0$.

3. $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{cov}(X,Y)$

4. $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{cov}(X_i, X_j)$
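As a quick numerical check of these properties (a minimal sketch, not from the original text, using arbitrarily generated samples), sample covariances computed with numpy behave as the list above describes:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)          # correlated with x
z = rng.normal(size=n)                    # independent of x

cov_xy = np.cov(x, y, ddof=0)[0, 1]

# Property 1: cov(X, Y) = E(XY) - E(X)E(Y)
print(cov_xy, np.mean(x * y) - np.mean(x) * np.mean(y))

# Property 2: covariance of independent variables is (approximately) zero
print(np.cov(x, z, ddof=0)[0, 1])

# Property 3: Var(X + Y) = Var(X) + Var(Y) + 2 cov(X, Y)
print(np.var(x + y), np.var(x) + np.var(y) + 2.0 * cov_xy)
```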
A.13 Normal Distribution

A random variable X is said to have a Normal (Gaussian) distribution with
mean $\mu$ and variance $\sigma^2$, if it is distributed according to the following
function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \qquad -\infty < x < \infty$$

When plotted, the Normal distribution function looks like a bell as
shown in Figure A.1, hence it is frequently referred to as the bell shaped
distribution. A random variable X which has a Normal distribution with
mean $\mu$ and variance $\sigma^2$ is commonly denoted by $X \sim N(\mu, \sigma^2)$.

Figure A.1. Normal Distribution function (plotted with mean 0 and standard deviation 2).



Theorem

If $X \sim N(\mu, \sigma^2)$ and $Y = aX + b$, then Y will also have a Normal
distribution with mean $a\mu + b$ and variance $a^2\sigma^2$.

A.14 Standard Normal Distribution

A Normal distribution with mean 0 and variance 1 is called the Standard
Normal distribution. The p.d.f. of the Standard Normal distribution is given
by:

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

and the corresponding distribution function (d.f.) will be given by:

$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\,dt$$

Note that, due to distribution symmetry:

$$\Phi(-x) = 1 - \Phi(x)$$

Example 1.1:
A random variable X is distributed according to a Normal distribution with
a mean of 15 and variance of 9.
(a) Find the probability that X > 16.
(b) Find the probability that $|X - 15| > 4$.
(c) Find $x_0$ such that $\Pr(|X - 15| < x_0) = 0.90$.



Solution

(a)

$$\Pr(X > 16) = 1 - \Pr(X \le 16) = 1 - \Pr\left(\frac{X - 15}{3} \le \frac{16 - 15}{3}\right) = 1 - \Phi\left(\tfrac{1}{3}\right)$$

$$= 1 - 0.63 = 0.37$$

Note that the value 0.63 is looked up from the Standard Normal distribution
table corresponding to the value 1/3.

(b)

$$\Pr(|X - 15| > 4) = \Pr(X < 11) + \Pr(X > 19) = 2\,\Pr(X > 19)$$

$$= 2\big[1 - \Pr(X \le 19)\big] = 2\left[1 - \Phi\left(\frac{19 - 15}{3}\right)\right]$$

$$= 2\,[1 - 0.9082] = 0.1836$$



Solution for part (c)

(c) Write $\Pr(|X - 15| < x_0) = 0.90$ in terms of the upper tail, with $x_1 = 15 + x_0$:

$$\Pr(X > x_1) = 0.05$$

$$\Pr(X \le x_1) = 0.95$$

$$\Phi\left(\frac{x_1 - 15}{3}\right) = 0.95$$

From the Standard Normal table, look up the value corresponding to 0.95
$\rightarrow$ 1.645.

$$\frac{x_1 - 15}{3} = 1.645$$

$$x_1 = 19.935$$

$$x_0 = x_1 - 15 = 4.935$$
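For readers who prefer to verify the table lookups in software, the following minimal sketch (not part of the original text) reproduces parts (a) through (c) with scipy.stats.norm; small differences from the values above come from table rounding:

```python
from scipy.stats import norm

mu, sigma = 15.0, 3.0                                # X ~ N(15, 9), so the standard deviation is 3

# (a) Pr(X > 16)
print(1.0 - norm.cdf(16.0, loc=mu, scale=sigma))             # ~0.369

# (b) Pr(|X - 15| > 4) = 2 * Pr(X > 19)
print(2.0 * (1.0 - norm.cdf(19.0, loc=mu, scale=sigma)))     # ~0.182

# (c) x0 such that Pr(|X - 15| < x0) = 0.90
x1 = norm.ppf(0.95, loc=mu, scale=sigma)             # 95th percentile of X
print(x1 - mu)                                       # ~4.93
```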

A.15 Properties of Normally Distributed Random Variables

1. If $X_1, X_2, \ldots, X_n$ are independent r.v.'s, each with Normal distribution
$N(\mu_i, \sigma_i^2)$, then

$$X_1 + X_2 + \cdots + X_n \sim N\left(\sum_{i=1}^{n} \mu_i,\ \sum_{i=1}^{n} \sigma_i^2\right)$$

2. If $Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b$, where the $a_i$'s and b are constants,
then

$$Y \sim N\left(\sum_{i=1}^{n} a_i\,\mu_i + b,\ \sum_{i=1}^{n} a_i^2\,\sigma_i^2\right)$$


A.16 Distribution of Sample Mean

Given a sample $\{X_1, X_2, \ldots, X_n\}$ of random variables, the sample mean is
defined as:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

If the sample is taken from a Normal distribution with mean $\mu$ and variance
$\sigma^2$, then

$$\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

Example 1.2:
Determine the minimum value of n for which

$$\Pr\big(|\bar{X}_n - \mu| < 1\big) \ge 0.95$$

if the random sample is taken from a distribution $N(\mu, 9)$.

Solution
Make a change of variable to obtain $Z \sim N(0,1)$, i.e. the Standard Normal
distribution:

$$Z = \frac{\bar{X}_n - \mu}{3/\sqrt{n}}$$

Then,

$$\Pr\left(|Z| < \frac{\sqrt{n}}{3}\right) \ge 0.95$$

$$\Pr\left(Z > \frac{\sqrt{n}}{3}\right) \le 0.025$$

$$1 - \Phi\left(\frac{\sqrt{n}}{3}\right) \le 0.025$$

$$\Phi\left(\frac{\sqrt{n}}{3}\right) \ge 0.975$$

From the Standard Normal table:

$$\frac{\sqrt{n}}{3} \ge 1.96 \quad\Rightarrow\quad n \ge (3 \times 1.96)^2 \approx 34.6$$

Therefore, n should be at least 35.
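The same minimum sample size can be obtained programmatically. The sketch below is not part of the original text; it uses the Standard Normal quantile from scipy.stats:

```python
import math
from scipy.stats import norm

sigma = 3.0          # the sample comes from N(mu, 9)
half_width = 1.0     # we require Pr(|Xbar_n - mu| < 1) >= 0.95
conf = 0.95

z = norm.ppf(1.0 - (1.0 - conf) / 2.0)               # two-sided critical value, ~1.96
n_min = math.ceil((z * sigma / half_width) ** 2)
print(n_min)                                         # 35
```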



A.17 Likelihood Function and Maximum Likelihood Estimator

Consider the random variables $X_1, X_2, \ldots, X_n$ taken from a distribution
with a p.d.f. of $f(x \mid \theta)$, where $\theta$ is a vector of unknown parameters of this
distribution.

Assuming a parameter space $\Omega$ to which $\theta$ belongs, we try to find a region
in $\Omega$ in which $\theta$ is most likely to lie. An estimate for $\theta$ can be found by
observing the random variables $X_i$'s, and choosing the parameter $\theta$ that
will most likely yield the observed values. The joint p.d.f. of a set of
random observations $x = (x_1, x_2, \ldots, x_n)$ will be expressed as:

$$f_n(x \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$

This joint p.d.f. is referred to as the Likelihood Function, since it yields
the distribution of $\theta$ for a set of observed variables, x. Variation of the
distribution as the parameter $\theta$ is changed will indicate how likely the
chosen value of $\theta$ is for a given set of observations. The value of $\theta$ which
maximizes the function $f_n(x \mid \theta)$ will be called the Maximum Likelihood
Estimator (MLE) of $\theta$.

A.17.1 Properties of MLE's

1. If $\hat{\theta}$ is the MLE of $\theta$, then $g(\hat{\theta})$ will be the MLE of $g(\theta)$.
2. It may not always be possible to express the MLE as an explicit
algebraic function.
3. MLE's are consistent estimators, that is, the sequence of MLE's will
converge to the true unknown value of $\theta$ as the sample size n becomes
infinitely large.

In determining the MLE's, it is common to maximize the logarithm of
the likelihood function (the log likelihood function) instead of the likelihood
function itself, in order to simplify the algebra. Since the log function is
monotonically increasing, the solution of the maximization problem will not
be affected by this change of the objective function. The following example
shows the procedure of obtaining the MLE for the parameters of a Normal
distribution, namely the mean $\mu$ and the variance $\sigma^2$, based on a finite
number of observations.

Example 1.3:

Suppose $\{X_1, X_2, \ldots, X_n\}$ are samples taken from a Normal distribution with
unknown $\mu$ and $\sigma^2$. Find the MLE of these unknown parameters.



Solution
The likelihood function for n samples from $N(\mu, \sigma^2)$ can be expressed as:

$$f_n(x \mid \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x_i - \mu)^2}{2\sigma^2}\right]$$

The log likelihood function will then be given by:

$$L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$

Writing the first order optimality conditions for maximizing the log likelihood
function $L$ with respect to the unknown parameters $\mu$ and $\sigma^2$:

$$\frac{\partial L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0 \quad\Rightarrow\quad \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad \text{(sample mean)}$$

$$\frac{\partial L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0 \quad\Rightarrow\quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 \qquad \text{(sample variance)}$$

It is therefore to be noted that the sample mean and variance constitute the
MLE's for the unknown parameters of a Normal distribution.
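These closed-form MLE's are easy to confirm numerically. The sketch below is not part of the original text; the sample and its true parameters are arbitrary, and the analytical estimators are compared against a direct numerical maximization of the log likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=2.0, size=1000)    # arbitrary sample: true mu = 5, sigma = 2

# Closed-form MLE's: sample mean and (biased) sample variance
mu_hat = x.mean()
var_hat = np.mean((x - mu_hat) ** 2)

# Numerical maximization of the log likelihood (minimize its negative)
def neg_log_lik(params):
    mu, log_var = params                          # optimize log(variance) so it stays positive
    return -np.sum(norm.logpdf(x, loc=mu, scale=np.sqrt(np.exp(log_var))))

res = minimize(neg_log_lik, x0=[0.0, 0.0])
print(mu_hat, var_hat)
print(res.x[0], np.exp(res.x[1]))                 # should agree closely with the line above
```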

A.18 Central Limit Theorem for the Sample Mean

If the r.v.'s $X_1, X_2, \ldots, X_n$ form a random sample of size n taken from a
distribution whose mean is $\mu$ and variance is $\sigma^2$, then for any real number x
we can write the following:

$$\lim_{n \to \infty} \Pr\left(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x\right) = \Phi(x)$$

In other words, as the sample size grows, the sample mean of any distribution
will be distributed more and more like a Normal distribution.

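A short simulation illustrates the theorem. The sketch below is not part of the original text; the exponential source distribution is an arbitrary (non-Normal) choice, and the standardized sample mean is compared against the Standard Normal d.f.:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
mu, sigma = 1.0, 1.0        # mean and standard deviation of the exponential(1) distribution
n, reps = 50, 100_000       # sample size and number of simulated samples

# Standardized sample means: (Xbar_n - mu) / (sigma / sqrt(n))
xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

# Empirical probabilities should be close to the Standard Normal d.f.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, np.mean(z <= x), norm.cdf(x))
```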
