

Appendix A

Review of Basic Statistics


This appendix gives a brief review of the basic definitions and concepts in
statistics. The intent is to highlight some of the statistical concepts that
are frequently referred to in various chapters of the book. The review starts
with the definition of a random variable and of the commonly used
statistical functions.

A.1 Random Variables


A real valued function that is defined on the sample space S is called a
random variable (r.v.). A simple example of an r.v. is the outcome of rolling
a die. Here, the sample space is the set of numbers 1 through 6, i.e. all
possible outcomes of rolling a die.

A.2 The Distribution Function (d.f.), F(x)


It is a real valued function of a real number x, defined as:

$$F(x) = \Pr(X \le x), \qquad -\infty < x < \infty \tag{A.1}$$

$$0 \le F(x) \le 1 \tag{A.2}$$

where Pr stands for "probability of".


The distribution function, d.f. has several useful properties some of
which are listed below:

Property-1 F(x) is non-decreasing as x increases, i.e.

if $x_1 < x_2$, then $F(x_1) \le F(x_2)$



Property-2 Lower and upper limits of the d.f. are zero and one respectively,
i.e.

$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to +\infty} F(x) = 1$$

Property-3 A d.f. at a given point x is defined by its limit from the right,
i.e.

$$F(x) = F(x^{+}) \quad \text{for all } x.$$

A.3 The Probability Density Function (p.d.f.), f(x)

A non-negative function f, defined on the real line, is called a probability
density function if, for any interval T, the following is satisfied:

$$\Pr(X \in T) = \int_{T} f(x)\,dx$$

The p.d.f. has the following properties:

$$f(x) \ge 0 \tag{A.3}$$

$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \tag{A.4}$$

Note that when a random variable X has a continuous distribution, then
$\Pr(X = x) = 0$ for any single point x. Also, the distribution and density
functions are related as shown below:

$$\Pr(X \le x) = \int_{-\infty}^{x} f(t)\,dt \tag{A.5}$$

$$= F(x) \tag{A.6}$$

or, as long as the p.d.f. is continuous, the d.f. will be differentiable:

$$\frac{dF(x)}{dx} = f(x)$$
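As a quick numerical illustration of this relationship (a minimal sketch that is not part of the original text; the standard normal from scipy.stats is used only as an example distribution), a finite-difference derivative of the d.f. can be compared against the p.d.f.:

```python
import numpy as np
from scipy.stats import norm

# Any continuous distribution works; the standard normal is an arbitrary choice.
x = np.linspace(-3.0, 3.0, 601)
h = 1e-5

# Finite-difference approximation of dF(x)/dx
dF_dx = (norm.cdf(x + h) - norm.cdf(x - h)) / (2.0 * h)

# Should be very close to the density f(x) at every point
print(np.max(np.abs(dF_dx - norm.pdf(x))))
```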

A.4 Continuous Joint Distributions

Joint distributions are defined for two or more random variables. Consider
the case of two random variables, X and Y. A non-negative function f(x,y)
is called the joint probability density function of X and Y, if for a region T
in the x-y plane, the following holds true:

$$\Pr\big((X,Y) \in T\big) = \iint_{T} f(x,y)\,dx\,dy$$



Again, similar properties can be written for the joint p.d.f.:

$$f(x,y) \ge 0 \tag{A.7}$$

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,dx\,dy = 1 \tag{A.8}$$

Similarly, the joint distribution function of two random variables X and Y
can be defined as:

$$F(x,y) = \Pr(X \le x \text{ and } Y \le y) \tag{A.9}$$

A.5 Independent Random Variables

X and Y are independent random variables if

$$f(x,y) = f_1(x)\,f_2(y) \qquad \text{and} \qquad F(x,y) = F_1(x)\,F_2(y)$$

where $f_1, F_1$ and $f_2, F_2$ denote the marginal density and distribution
functions of X and Y, respectively.

A.6 Conditional Distributions

The conditional probability density function g of a random variable X,
when another random variable Y is already known to have assumed a value
y, is given by:

$$g(x \mid y) = \frac{f(x,y)}{f_2(y)}, \qquad -\infty < x < \infty$$

A.7 Expected Value


Expected (or mean) value of a random variable X is denoted by $E(X)$ and
defined as:

$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

Expected value has the following properties:

1. Expected value of a random variable $Y = aX + b$ is given by
$E(Y) = a\,E(X) + b$.

2. If $\Pr(X \ge a) = 1$, then $E(X) \ge a$. If $\Pr(X \le b) = 1$, then
$E(X) \le b$.

3. $E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)$



A.8 Variance

Variance of a random variable X is denoted by $\sigma^2$ and defined as:

$$\sigma^2 = \mathrm{Var}(X) = E\big[(X - \mu)^2\big], \qquad \text{where } \mu = E(X)$$

Variance has the following properties:

1. $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$

2. $\mathrm{Var}(X) = E(X^2) - \mu^2$

3. $\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i)$, for independent $X_i$'s

A.9 Median

The median of the distribution of a random variable X is defined as the
value m along the real line such that

$$\Pr(X \le m) \ge \frac{1}{2} \qquad \text{and} \qquad \Pr(X \ge m) \ge \frac{1}{2}$$

Also, note that:

$$\Pr(X > m) = 1 - \Pr(X \le m) \le \frac{1}{2} \tag{A.11}$$

$$\Pr(X > m) \le \frac{1}{2} \le \Pr(X \le m) \tag{A.12}$$

A.10 Mean Squared Error

It can be shown that the value of z that will minimize the expected value
of the squared error, i.e. $E\big[(X - z)^2\big]$, is the expected value of the random
variable X. Consider the expression

$$E\big[(X - z)^2\big] = E(X^2) - 2z\,E(X) + z^2$$

Choosing $z = E(X)$ will minimize the above expression, which is called the
mean squared error (MSE). Note also that, with the choice of z as $E(X)$,
the MSE will be identical to the definition of the variance of X:

$$E\Big[\big(X - E(X)\big)^2\Big] = \mathrm{Var}(X)$$



A.11 Mean Absolute Error

Similar to the case of the MSE, it can be shown that the value z that
minimizes the mean absolute error

$$E\big[\,|X - z|\,\big]$$

is the median m of the random variable X. In other words, if m is the
median and c is any other number, then

$$E(|X - c|) \ge E(|X - m|)$$

Proof: Let us arbitrarily assume that $m < c$. Then,

$$E(|X - c|) - E(|X - m|) = \int_{-\infty}^{\infty} \big(|x - c| - |x - m|\big) f(x)\,dx$$

$$= \int_{-\infty}^{m} (c - m)\,f(x)\,dx + \int_{m}^{c} (c + m - 2x)\,f(x)\,dx + \int_{c}^{\infty} (m - c)\,f(x)\,dx$$

$$\ge \int_{-\infty}^{m} (c - m)\,f(x)\,dx + \int_{m}^{\infty} (m - c)\,f(x)\,dx$$

since $c + m - 2x \ge m - c$ for $x \le c$. Therefore,

$$E(|X - c|) - E(|X - m|) \ge (c - m)\big[\Pr(X \le m) - \Pr(X > m)\big]$$

Since m is the median, Eq. (A.12) must hold true:

$$\Pr(X > m) \le \frac{1}{2} \le \Pr(X \le m)$$

Hence,

$$E(|X - c|) - E(|X - m|) \ge 0$$

The proof under the alternative assumption of $m > c$ is left to the reader.
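The two optimality results above (the mean minimizes the squared error, the median minimizes the absolute error) can be verified numerically. The sketch below is not part of the original text; it scans candidate values of z for an arbitrary skewed sample and locates the minimizers of the two error measures:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)       # an arbitrary, skewed sample

z_grid = np.linspace(0.0, 6.0, 2001)
mse = [np.mean((x - z) ** 2) for z in z_grid]      # E[(X - z)^2]
mae = [np.mean(np.abs(x - z)) for z in z_grid]     # E[|X - z|]

print("z minimizing MSE:", z_grid[np.argmin(mse)], "  sample mean:", x.mean())
print("z minimizing MAE:", z_grid[np.argmin(mae)], "  sample median:", np.median(x))
```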

A.12 Covariance

Covariance between two random variables X and Y is defined as:

$$\mathrm{cov}(X,Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big], \qquad \text{where } \mu_X = E(X),\ \mu_Y = E(Y)$$

Related to the covariance, one can define the correlation coefficient as:

$$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sigma_X\,\sigma_Y}$$

Properties of covariance:



1. $\mathrm{cov}(X,Y) = E(XY) - E(X)\,E(Y)$

2. If X, Y are independent random variables, then $\mathrm{cov}(X,Y) = 0$.

3. $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{cov}(X,Y)$

4. $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2\sum_{i<j} \mathrm{cov}(X_i, X_j)$
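As a quick numerical check of these properties (a minimal sketch, not from the original text, using arbitrarily generated samples), sample covariances computed with numpy behave as the list above describes:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)          # correlated with x
z = rng.normal(size=n)                    # independent of x

cov_xy = np.cov(x, y, ddof=0)[0, 1]

# Property 1: cov(X, Y) = E(XY) - E(X)E(Y)
print(cov_xy, np.mean(x * y) - np.mean(x) * np.mean(y))

# Property 2: covariance of independent variables is (approximately) zero
print(np.cov(x, z, ddof=0)[0, 1])

# Property 3: Var(X + Y) = Var(X) + Var(Y) + 2 cov(X, Y)
print(np.var(x + y), np.var(x) + np.var(y) + 2.0 * cov_xy)
```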
A.13 Normal Distribution

A random variable X is said to have a Normal (Gaussian) distribution with
mean $\mu$ and variance $\sigma^2$, if it is distributed according to the following
function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \qquad -\infty < x < \infty$$

When plotted, the Normal distribution function looks like a bell as
shown in Figure A.1, hence it is frequently referred to as the bell shaped
distribution. A random variable X which has a Normal distribution with
mean $\mu$ and variance $\sigma^2$ is commonly denoted by $X \sim N(\mu, \sigma^2)$.

Figure A.1. Normal Distribution function (plotted with mean 0 and standard deviation 2).



Theorem

If $X \sim N(\mu, \sigma^2)$ and $Y = aX + b$, then Y will also have a Normal
distribution with mean $a\mu + b$ and variance $a^2\sigma^2$.

A.14 Standard Normal Distribution

A Normal distribution with mean 0 and variance 1 is called the Standard
Normal distribution. The p.d.f. of the Standard Normal distribution is given
by:

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

and the corresponding distribution function (d.f.) will be given by:

$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\,dt$$

Note that, due to distribution symmetry:

$$\Phi(-x) = 1 - \Phi(x)$$

Example 1.1:
A random variable X is distributed according to a Normal distribution with
a mean of 15 and variance of 9.
(a) Find the probability that X > 16.
(b) Find the probability that $|X - 15| > 4$.
(c) Find $x_0$ such that $\Pr(|X - 15| < x_0) = 0.90$.



Solution

(a)

$$\Pr(X > 16) = 1 - \Pr(X \le 16) = 1 - \Pr\left(\frac{X - 15}{3} \le \frac{16 - 15}{3}\right) = 1 - \Phi\left(\tfrac{1}{3}\right)$$

$$= 1 - 0.63 = 0.37$$

Note that the value 0.63 is looked up from the Standard Normal distribution
table corresponding to the value 1/3.

(b)

$$\Pr(|X - 15| > 4) = \Pr(X < 11) + \Pr(X > 19) = 2\,\Pr(X > 19)$$

$$= 2\big[1 - \Pr(X \le 19)\big] = 2\left[1 - \Phi\left(\frac{19 - 15}{3}\right)\right]$$

$$= 2\,[1 - 0.9082] = 0.1836$$



Solution for part (c)

(c) Write $\Pr(|X - 15| < x_0) = 0.90$ in terms of the upper tail, with $x_1 = 15 + x_0$:

$$\Pr(X > x_1) = 0.05$$

$$\Pr(X \le x_1) = 0.95$$

$$\Phi\left(\frac{x_1 - 15}{3}\right) = 0.95$$

From the Standard Normal table, look up the value corresponding to 0.95
$\rightarrow$ 1.645.

$$\frac{x_1 - 15}{3} = 1.645$$

$$x_1 = 19.935$$

$$x_0 = x_1 - 15 = 4.935$$
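For readers who prefer to verify the table lookups in software, the following minimal sketch (not part of the original text) reproduces parts (a) through (c) with scipy.stats.norm; small differences from the values above come from table rounding:

```python
from scipy.stats import norm

mu, sigma = 15.0, 3.0                                # X ~ N(15, 9), so the standard deviation is 3

# (a) Pr(X > 16)
print(1.0 - norm.cdf(16.0, loc=mu, scale=sigma))             # ~0.369

# (b) Pr(|X - 15| > 4) = 2 * Pr(X > 19)
print(2.0 * (1.0 - norm.cdf(19.0, loc=mu, scale=sigma)))     # ~0.182

# (c) x0 such that Pr(|X - 15| < x0) = 0.90
x1 = norm.ppf(0.95, loc=mu, scale=sigma)             # 95th percentile of X
print(x1 - mu)                                       # ~4.93
```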

A.15 Properties of Normally Distributed Random Variables

1. If $X_1, X_2, \ldots, X_n$ are independent r.v.'s, each with Normal distribution
$N(\mu_i, \sigma_i^2)$, then

$$X_1 + X_2 + \cdots + X_n \sim N\left(\sum_{i=1}^{n} \mu_i,\ \sum_{i=1}^{n} \sigma_i^2\right)$$

2. If $Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b$, where the $a_i$'s and b are constants,
then

$$Y \sim N\left(\sum_{i=1}^{n} a_i\,\mu_i + b,\ \sum_{i=1}^{n} a_i^2\,\sigma_i^2\right)$$


A.16 Distribution of Sample Mean

Given a sample $\{X_1, X_2, \ldots, X_n\}$ of random variables, the sample mean is
defined as:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

If the sample is taken from a Normal distribution with mean $\mu$ and variance
$\sigma^2$, then

$$\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

Example 1.2:
Determine the minimum value of n for which

$$\Pr\big(|\bar{X}_n - \mu| < 1\big) \ge 0.95$$

if the random sample is taken from a distribution $N(\mu, 9)$.

Solution
Make a change of variable to obtain $Z \sim N(0,1)$, i.e. the Standard Normal
distribution:

$$Z = \frac{\bar{X}_n - \mu}{3/\sqrt{n}}$$

Then,

$$\Pr\left(|Z| < \frac{\sqrt{n}}{3}\right) \ge 0.95$$

$$\Pr\left(Z > \frac{\sqrt{n}}{3}\right) \le 0.025$$

$$1 - \Phi\left(\frac{\sqrt{n}}{3}\right) \le 0.025$$

$$\Phi\left(\frac{\sqrt{n}}{3}\right) \ge 0.975$$

From the Standard Normal table:

$$\frac{\sqrt{n}}{3} \ge 1.96 \quad\Rightarrow\quad n \ge (3 \times 1.96)^2 \approx 34.6$$

Therefore, n should be at least 35.
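The same minimum sample size can be obtained programmatically. The sketch below is not part of the original text; it uses the Standard Normal quantile from scipy.stats:

```python
import math
from scipy.stats import norm

sigma = 3.0          # the sample comes from N(mu, 9)
half_width = 1.0     # we require Pr(|Xbar_n - mu| < 1) >= 0.95
conf = 0.95

z = norm.ppf(1.0 - (1.0 - conf) / 2.0)               # two-sided critical value, ~1.96
n_min = math.ceil((z * sigma / half_width) ** 2)
print(n_min)                                         # 35
```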



A.17 Likelihood Function and Maximum Likelihood Estimator

Consider the random variables $X_1, X_2, \ldots, X_n$ taken from a distribution
with a p.d.f. of $f(x \mid \theta)$, where $\theta$ is a vector of unknown parameters of this
distribution.

Assuming a parameter space $\Omega$ to which $\theta$ belongs, we try to find a region
in $\Omega$ in which $\theta$ is most likely to lie. An estimate for $\theta$ can be found by
observing the random variables $X_i$'s, and choosing the parameter $\theta$ that
will most likely yield the observed values. The joint p.d.f. of a set of
random observations $x = (x_1, x_2, \ldots, x_n)$ will be expressed as:

$$f_n(x \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$

This joint p.d.f. is referred to as the Likelihood Function, since it yields
the distribution of $\theta$ for a set of observed variables, x. Variation of the
distribution as the parameter $\theta$ is changed will indicate how likely the
chosen value of $\theta$ is for a given set of observations. The value of $\theta$ which
maximizes the function $f_n(x \mid \theta)$ will be called the Maximum Likelihood
Estimator (MLE) of $\theta$.

A.17.1 Properties of MLE's

1. If $\hat{\theta}$ is the MLE of $\theta$, then $g(\hat{\theta})$ will be the MLE of $g(\theta)$.
2. It may not always be possible to express the MLE as an explicit
algebraic function.
3. MLE's are consistent estimators, that is, the sequence of MLE's will
converge to the true unknown value of $\theta$ as the sample size n becomes
infinitely large.

In determining the MLE's, it is common to maximize the logarithm of
the likelihood function (the log likelihood function) instead of the likelihood
function itself, in order to simplify the algebra. Since the log function is
monotonically increasing, the solution of the maximization problem will not
be affected by this change of the objective function. The following example
shows the procedure of obtaining the MLE for the parameters of a Normal
distribution, namely the mean $\mu$ and the variance $\sigma^2$, based on a finite
number of observations.

Example 1.3:

Suppose $\{X_1, X_2, \ldots, X_n\}$ are samples taken from a Normal distribution with
unknown $\mu$ and $\sigma^2$. Find the MLE of these unknown parameters.



Solution
The likelihood function for n samples from $N(\mu, \sigma^2)$ can be expressed as:

$$f_n(x \mid \mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x_i - \mu)^2}{2\sigma^2}\right]$$

The log likelihood function will then be given by:

$$L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$

Writing the first order optimality conditions for maximizing the log likelihood
function $L$ with respect to the unknown parameters $\mu$ and $\sigma^2$:

$$\frac{\partial L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0 \quad\Rightarrow\quad \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad \text{(sample mean)}$$

$$\frac{\partial L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0 \quad\Rightarrow\quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 \qquad \text{(sample variance)}$$

It is therefore to be noted that the sample mean and variance constitute the
MLE's for the unknown parameters of a Normal distribution.
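These closed-form MLE's are easy to confirm numerically. The sketch below is not part of the original text; the sample and its true parameters are arbitrary, and the analytical estimators are compared against a direct numerical maximization of the log likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=2.0, size=1000)    # arbitrary sample: true mu = 5, sigma = 2

# Closed-form MLE's: sample mean and (biased) sample variance
mu_hat = x.mean()
var_hat = np.mean((x - mu_hat) ** 2)

# Numerical maximization of the log likelihood (minimize its negative)
def neg_log_lik(params):
    mu, log_var = params                          # optimize log(variance) so it stays positive
    return -np.sum(norm.logpdf(x, loc=mu, scale=np.sqrt(np.exp(log_var))))

res = minimize(neg_log_lik, x0=[0.0, 0.0])
print(mu_hat, var_hat)
print(res.x[0], np.exp(res.x[1]))                 # should agree closely with the line above
```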

A.18 Central Limit Theorem for the Sample Mean

If the r.v.'s $X_1, X_2, \ldots, X_n$ form a random sample of size n taken from a
distribution whose mean is $\mu$ and variance is $\sigma^2$, then for any real number x
we can write the following:

$$\lim_{n \to \infty} \Pr\left(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x\right) = \Phi(x)$$

In other words, as the sample size grows, the sample mean of any distribution
will be distributed more and more like a Normal distribution.

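A short simulation illustrates the theorem. The sketch below is not part of the original text; the exponential source distribution is an arbitrary (non-Normal) choice, and the standardized sample mean is compared against the Standard Normal d.f.:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
mu, sigma = 1.0, 1.0        # mean and standard deviation of the exponential(1) distribution
n, reps = 50, 100_000       # sample size and number of simulated samples

# Standardized sample means: (Xbar_n - mu) / (sigma / sqrt(n))
xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

# Empirical probabilities should be close to the Standard Normal d.f.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, np.mean(z <= x), norm.cdf(x))
```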
