
Statistics for Engineers

Statistics 509, Fall 2010

Professor Edsel A. Peña


E-Mail: pena@stat.sc.edu

November 16, 2010

Lecture 05: Sampling, Likelihoods, and Estimation

▶ Reprise: Distributions as Population Models
▶ Need to Discover Values of Parameters of Distributions
▶ Experimentation and Sampling
▶ Samples, Statistics, and Sampling Distributions
▶ Central Limit Theorem
▶ Sample Mean and Sample Median
▶ Sample Variance and Sample Standard Deviation
▶ Sample Covariance and Sample Correlation Coefficient
▶ Likelihood Function: Information about Parameters from Sample Data
▶ The Problem of Parameter Estimation
▶ Method of Moments Principle
▶ Maximum Likelihood Estimation Principle

Problem of Inference

▶ The probability distributions (both discrete and continuous) serve as population models.
▶ If we know the parameters of the probability distribution
modeling our population, then we know how the population
behaves.
▶ Thus, it is important to discover the values of the parameters
of our probability distributions.
▶ How do we discover their values?
▶ We do experiments or take samples that will provide us with sample data.

Sampling and Experiments

▶ Consider a population where the variable of interest is denoted by X and which is postulated to have a distribution F (x; 𝜃), with PMF p(x; 𝜃) if X is discrete or with PDF f (x; 𝜃) if X is continuous, where 𝜃 is a parameter.
▶ A random sample of size n from this population is a subset of size n taken in such a way that every possible subset of size n is equally likely.
▶ The sample could be taken either with or without replacement. However, in this course we will assume sampling with replacement, which makes the observations independent of each other.
▶ The sample variables will be denoted by X1 , X2 , . . . , Xn .

Joint PMF and Likelihood for Discrete Population

Let X1 , X2 , . . . , Xn be a random sample from a discrete PMF p(x; 𝜃). The joint PMF of (X1 , X2 , . . . , Xn ) is

p(x1 , x2 , . . . , xn ; 𝜃) = ∏_{i=1}^{n} p(xi ; 𝜃).

Given the sample values (x1 , x2 , . . . , xn ), if we view the joint PMF at these values as a function of 𝜃, we obtain what is called the likelihood function, denoted by L(𝜃):

L(𝜃) = L(𝜃; x1 , x2 , . . . , xn ) = p(x1 , x2 , . . . , xn ; 𝜃) = ∏_{i=1}^{n} p(xi ; 𝜃).

For a given 𝜃, L(𝜃) is the probability of observing the data (x1 , x2 , . . . , xn ) under that value of 𝜃.

Joint PDF and Likelihood for Continuous Population

Let X1 , X2 , . . . , Xn be a random sample from a continuous PDF f (x; 𝜃). The joint PDF of (X1 , X2 , . . . , Xn ) is

f (x1 , x2 , . . . , xn ; 𝜃) = ∏_{i=1}^{n} f (xi ; 𝜃).

Given the sample values (x1 , x2 , . . . , xn ), the likelihood function is

L(𝜃) = L(𝜃; x1 , x2 , . . . , xn ) = f (x1 , x2 , . . . , xn ; 𝜃) = ∏_{i=1}^{n} f (xi ; 𝜃).

For a given 𝜃, L(𝜃) represents the ‘likelihood’ of getting the data (x1 , x2 , . . . , xn ) under the value 𝜃.
Remark: The likelihood function is the function that contains
information about the parameter 𝜃 provided by the sample data.

Example: Sampling from Bernoulli Population
▶ Consider a population with only two values: 0 and 1, a
Bernoulli population.
▶ Denote by 𝜃 the proportion of 1s in this population. This is
the parameter of interest and it will be in Θ = (0, 1).
▶ We sample, with replacement, n units from this population.
The sample data is X1 , X2 , . . . , Xn where each Xi takes either
the value of 0 or 1. We write

X1 , X2 , . . . , Xn ∼ IID p(x; 𝜃) = 𝜃^x (1 − 𝜃)^{1−x} for x = 0, 1.

▶ The likelihood function is

L(𝜃) = ∏_{i=1}^{n} 𝜃^{xi} (1 − 𝜃)^{1−xi} = 𝜃^{T(x1 ,x2 ,...,xn )} (1 − 𝜃)^{n−T(x1 ,x2 ,...,xn )}

with T(x1 , x2 , . . . , xn ) = ∑_{i=1}^{n} xi .
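As a quick illustration, here is a minimal R sketch that plots this likelihood for a simulated 0/1 sample; the seed, sample size, and success probability are illustrative assumptions, not part of the slides.

# Bernoulli likelihood for a simulated 0/1 sample (illustrative values)
set.seed(1)
x <- rbinom(20, size = 1, prob = 0.3)        # hypothetical sample, n = 20
Tstat <- sum(x); n <- length(x)
theta <- seq(0.01, 0.99, by = 0.01)
L <- theta^Tstat * (1 - theta)^(n - Tstat)   # L(theta) = theta^T (1 - theta)^(n - T)
plot(theta, L, type = "l", xlab = "theta", ylab = "L(theta)")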

Example: Sampling from an Exponential Population
▶ Consider a study to determine the lifetime properties of an
electronic component (say, electric bulbs of a certain brand).
▶ Assume that the lifetime distribution of a component is
exponential with parameter 𝜆, that is, the pdf is
f (t; 𝜆) = 𝜆 exp(−𝜆t) for t > 0.
▶ We perform a life-testing experiment consisting of n = 50
components where we observe the lifetimes of these 50
components.
▶ The sample data will be

T1 , T2 , . . . , T50 ∼ IID f (t; 𝜆) = 𝜆 exp(−𝜆t)


▶ The likelihood function is

L(𝜆) = 𝜆^{50} exp{ −𝜆 ∑_{i=1}^{50} Ti }
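The maximizer of this likelihood can also be found numerically. The R sketch below does so on simulated lifetimes; the seed and the true rate used to generate the data are assumptions for illustration.

# Numerically maximizing the exponential log-likelihood (simulated data)
set.seed(2)
lifetimes <- rexp(50, rate = 2)                 # hypothetical lifetimes
loglik <- function(lam) 50 * log(lam) - lam * sum(lifetimes)
optimize(loglik, interval = c(0.01, 10), maximum = TRUE)$maximum
1 / mean(lifetimes)                             # the analytic maximizer, for comparison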

Example: Sampling from a Normal Population
▶ Normal Population with mean 𝜇 and variance 𝜎 2 .
▶ Random sample of size n:
X1 , X2 , . . . , Xn ∼ IID N(𝜇, 𝜎²)
▶ Likelihood Function:

L(𝜇, 𝜎²) = [1 / ((𝜎²)^{n/2} (2𝜋)^{n/2})] exp{ −(1/(2𝜎²)) ∑_{i=1}^{n} (xi − 𝜇)² }

▶ Note that with

x̄ = (1/n) ∑_{i=1}^{n} xi ,

we have

∑_{i=1}^{n} (xi − 𝜇)² = ∑_{i=1}^{n} (xi − x̄)² + n(x̄ − 𝜇)².

Sample Statistics

▶ Consider a random sample X1 , X2 , . . . , Xn .
▶ Any function of X1 , X2 , . . . , Xn , together possibly with known constants, is called a sample statistic.
▶ Unlike parameters, whose values we will usually not know, the values of sample statistics can be known or computed since we observe the values of the Xi 's.
▶ Sample statistics will be the basis of inference about the
parameters of the population distribution.

Common Examples of Sample Statistics

▶ Sample Mean: X̄ = (1/n) ∑_{i=1}^{n} Xi
▶ When the values of the Xi 's are either 0 or 1, the sample mean is the sample proportion.
▶ Sample Median: This is a value that divides the ordered data set into two equal parts.
▶ kth Sample Moment: M′_k = (1/n) ∑_{i=1}^{n} Xi^k
▶ Sample Variance:

S² = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)² = (1/(n − 1)) [ ∑_{i=1}^{n} Xi² − n(X̄)² ]

▶ Sample Standard Deviation: S = +√S²

Sample Covariance and Correlation

Definition
Let (X1 , Y1 ), (X2 , Y2 ), . . . , (Xn , Yn ) be a bivariate random sample
from a joint bivariate distribution. The sample covariance is
defined to be
SXY = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)(Yi − Ȳ) = (1/(n − 1)) [ ∑_{i=1}^{n} Xi Yi − nX̄ Ȳ ].

The sample correlation coefficient is

r = SXY / (SX SY )

where SX and SY are the sample standard deviations of the Xi 's and Yi 's, respectively.

Commands in R to Compute Sample Statistics

▶ x below is the vector containing the values of x1 , x2 , . . . , xn ; y is another vector containing the values of y1 , y2 , . . . , yn .
▶ To Create a Histogram: hist(x)
▶ Sample Mean (X̄ ): mean(x)
▶ Sample Median: median(x)
▶ Sample Variance (S 2 ): var(x)
▶ Sample Standard Deviation (S): sd(x) or sqrt(var(x))
▶ Sample Covariance (SXY ): cov(x,y)
▶ Sample Correlation (r ): cor(x,y)
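To see these commands in action, here is a short, self-contained R session on simulated data; the seed, sample size, and distributions are purely illustrative.

# Trying the commands on simulated data (illustrative values)
set.seed(3)
x <- rnorm(100, mean = 20, sd = 5)
y <- 2 * x + rnorm(100)
mean(x); median(x); var(x); sd(x)   # sample mean, median, variance, std. deviation
cov(x, y); cor(x, y)                # sample covariance and correlation
all.equal(sd(x), sqrt(var(x)))      # S = +sqrt(S^2)
hist(x)                             # histogram of the x values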

Sampling Distributions of Sample Statistics

▶ Since the sample X1 , X2 , . . . , Xn consists of random variables, a sample statistic is also a random variable and hence has its own distribution function. Such a distribution function is called a sampling distribution.
▶ Thus, the sample mean, the sample median, and the sample variance each have their own sampling distribution, and each of these sampling distributions has its own mean and variance.
▶ For the sample mean, we will have the mean of the sample mean, denoted by 𝜇X̄ , and the variance of the sample mean, denoted by 𝜎²X̄ .
▶ The standard deviation of the sample mean, denoted 𝜎X̄ , is called the standard error of the sample mean.

Basic Results about Sampling Distributions
▶ Let X1 , X2 , . . . , Xn be a random sample from a population or distribution whose mean is 𝜇 and variance is 𝜎².
▶ The mean of the sample mean is

𝜇X̄ = E (X̄ ) = 𝜇.

▶ The variance of the sample mean is

𝜎²X̄ = Var (X̄ ) = 𝜎²/n.

▶ The standard error of the sample mean is

𝜎X̄ = SE (X̄ ) = 𝜎/√n.

▶ The mean of the sample variance is

𝜇S² = E (S²) = 𝜎².

Sampling Distribution of X̄ from Normal Population

Theorem
Let X1 , X2 , . . . , Xn be a random sample from a normal population/distribution with mean 𝜇 and variance 𝜎². Then the sampling distribution of the sample mean X̄ is normal with mean 𝜇X̄ = 𝜇 and variance 𝜎²X̄ = 𝜎²/n. That is,

X̄ ∼ N(𝜇, 𝜎²/n).

Equivalently,

Z = (X̄ − 𝜇)/(𝜎/√n) ∼ N(0, 1).

Chi-Square Distribution

Definition
A positive-valued random variable X is said to have a chi-squared distribution with k degrees of freedom if it has a gamma distribution with shape parameter 𝛼 = k/2 and rate parameter 𝜆 = 1/2. That is, its pdf is

f (x) = [1/(2^{k/2} Γ(k/2))] x^{k/2−1} exp(−x/2) for x ≥ 0.

We denote this by X ∼ 𝜒²_k . For such a random variable,

E (X ) = k and Var (X ) = 2k.

Sampling Distribution of S 2 under Normal Population

Theorem
Let X1 , X2 , . . . , Xn be a random sample from a normal
population/distribution with mean 𝜇 and variance 𝜎 2 . Let S 2 be
the sample variance. Then

V = (n − 1)S²/𝜎² ∼ 𝜒²_{n−1} .

Consequently, under a normal population,

E (S²) = 𝜎² and Var (S²) = 2𝜎⁴/(n − 1).
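The theorem is easy to check by simulation. The sketch below, with assumed values of n, 𝜇, and 𝜎, compares the simulated values of V with the chi-squared density with n − 1 degrees of freedom.

# Checking (n-1)S^2/sigma^2 ~ chi-square(n-1) by simulation (assumed settings)
set.seed(4)
n <- 10; sigma <- 5
v <- replicate(10000, (n - 1) * var(rnorm(n, mean = 20, sd = sigma)) / sigma^2)
c(mean(v), var(v))                         # should be near n - 1 = 9 and 2(n - 1) = 18
hist(v, freq = FALSE, breaks = 50)
curve(dchisq(x, df = n - 1), add = TRUE)   # overlay the chi-square(9) density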

Independence of X̄ and S 2 under Normality

Theorem
Let X1 , X2 , . . . , Xn be a random sample from a normal
population/distribution with mean 𝜇 and variance 𝜎 2 . Then the
sample mean X̄ and the sample variance S 2 are independent.

Remark: This implies that knowledge of the value of X̄ does not provide any additional information about the value of S²!

Central Limit Theorem

Theorem
Let X1 , X2 , . . . , Xn be a random sample from any population or
distribution with mean 𝜇 and variance 𝜎 2 . For large sample size n
(usually at least 30), the sampling distribution of the sample mean
X̄ is approximately normal with mean 𝜇X̄ = 𝜇 and variance
𝜎²X̄ = 𝜎²/n. In shorthand, for large n,

X̄ ∼ N(𝜇, 𝜎²/n) (approximately).

As such,

Z = (X̄ − 𝜇)/(𝜎/√n) ∼ N(0, 1) (approximately).

Importance: The normal distribution can be used to compute probabilities for the sample mean X̄ .
Simulated Demonstrations: Under Normality and the CLT

▶ Design of the Simulation Study.
▶ For different types of populations or distributions (Bernoulli,
Uniform, Exponential, Gamma, Normal) we will take samples
of size n and for each sample we compute the sample mean
(X̄ ) and the sample variance (S 2 ).
▶ We will choose n to be n = 2, n = 5, n = 10, n = 30, and
n = 100.
▶ For each combination of population type and sample size n,
we repeat the sampling process a total of MREPS = 10000
times. We then examine the empirical sampling distributions
of the sample mean and the sample variance.
▶ We examine the shape of their histograms (are they becoming more normal?), and their means and variances relative to what is expected under theory. A sketch of this simulation in R appears below.
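A minimal R version of this design, shown for the exponential population; the other populations are obtained by swapping the sampling line.

# Sketch of the simulation design (exponential population shown)
MREPS <- 10000
ns <- c(2, 5, 10, 30, 100)
results <- lapply(ns, function(n) {
  draws <- replicate(MREPS, {
    x <- rexp(n, rate = 1)   # swap in rbinom, runif, rgamma, or rnorm as needed
    c(mean = mean(x), var = var(x))
  })
  c(n = n,
    MeanOfSampMeans = mean(draws["mean", ]),
    VarOfSampMeans  = var(draws["mean", ]),
    MeanOfSampVars  = mean(draws["var", ]),
    VarOfSampVars   = var(draws["var", ]))
})
do.call(rbind, results)   # one row per sample size, as in the tables that follow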

For a Normal Population with 𝜇 = 20, 𝜎 2 = 25

n                        2.0000    10.0000    30.0000   100.0000
MeanOfSampMeans         19.9483    20.0024    19.9990    19.9952
PopnMean                20.0000    20.0000    20.0000    20.0000
VarOfSampMeans          12.3503     2.5105     0.8443     0.2498
TrueVarOfSampMean       12.5000     2.5000     0.8333     0.2500
MeanOfSampVars          24.9223    25.0382    25.0149    25.0041
PopnVar                 25.0000    25.0000    25.0000    25.0000
VarOfSampVars         1244.4987   137.0298    42.7614    12.7159

Sampling Distributions of X̄ under Normal Popn

[Figure: Histograms of the simulated sampling distribution of X̄ for n = 2, 10, 30, and 100 (Frequency versus sample mean).]

Sampling Distributions of S 2 under Normal Popn

[Figure: Histograms of the simulated sampling distribution of S² for n = 2, 10, 30, and 100 (Frequency versus sample variance).]

Independence of X̄ and S 2 under Normal Popn

[Figure: Scatterplots of the sample variance S² against the sample mean X̄ for n = 2, 10, 30, and 100 under a normal population.]

For a Bernoulli Population with 𝜃 = .5

n                      2.0000   10.0000   30.0000   100.0000
MeanOfSampMeans        0.5035    0.5003    0.5018     0.5003
PopnMean               0.5000    0.5000    0.5000     0.5000
VarOfSampMeans         0.1243    0.0250    0.0083     0.0024
TrueVarOfSampMean      0.1250    0.0250    0.0083     0.0025
MeanOfSampVars         0.2513    0.2499    0.2499     0.2500
PopnVar                0.2500    0.2500    0.2500     0.2500
VarOfSampVars          0.0625    0.0013    0.0001     0.0000

Sampling Distributions of X̄ for Bernoulli Popn

[Figure: Histograms of the simulated sampling distribution of X̄ for n = 2, 10, 30, and 100 under the Bernoulli population (Frequency versus sample mean).]

Sampling Distributions of S 2 for Bernoulli Popn

[Figure: Histograms of the simulated sampling distribution of S² for n = 2, 10, 30, and 100 under the Bernoulli population (Frequency versus sample variance).]

For a Uniform Population with (𝛼, 𝛽) = (0, 1)

n                      2.0000   10.0000   30.0000   100.0000
MeanOfSampMeans        0.5005    0.5000    0.5002     0.5006
PopnMean               0.5000    0.5000    0.5000     0.5000
VarOfSampMeans         0.0416    0.0084    0.0028     0.0008
TrueVarOfSampMean      0.0417    0.0083    0.0028     0.0008
MeanOfSampVars         0.0828    0.0830    0.0831     0.0833
PopnVar                0.0833    0.0833    0.0833     0.0833
VarOfSampVars          0.0098    0.0007    0.0002     0.0001

Sampling Distributions of X̄ under Uniform Popn

[Figure: Histograms of the simulated sampling distribution of X̄ for n = 2, 10, 30, and 100 under the uniform population (Frequency versus sample mean).]

Sampling Distributions of S 2 under Uniform Popn

[Figure: Histograms of the simulated sampling distribution of S² for n = 2, 10, 30, and 100 under the uniform population (Frequency versus sample variance).]

Non-Independence of X̄ and S 2 under Uniform Popn

[Figure: Scatterplots of the sample variance S² against the sample mean X̄ for n = 2, 10, 30, and 100 under a uniform population.]

For an Exponential Population with 𝜆 = 1

n                      2.0000   10.0000   30.0000   100.0000
MeanOfSampMeans        0.9958    0.9975    1.0011     0.9986
PopnMean               1.0000    1.0000    1.0000     1.0000
VarOfSampMeans         0.4830    0.0989    0.0329     0.0099
TrueVarOfSampMean      0.5000    0.1000    0.0333     0.0100
MeanOfSampVars         1.0014    0.9956    1.0067     0.9992
PopnVar                1.0000    1.0000    1.0000     1.0000
VarOfSampVars          4.9061    0.8075    0.2738     0.0796

Sampling Distributions of X̄ under Exponential Popn

[Figure: Histograms of the simulated sampling distribution of X̄ for n = 2, 10, 30, and 100 under the exponential population (Frequency versus sample mean).]

Sampling Distributions of S 2 under Exponential Popn

[Figure: Histograms of the simulated sampling distribution of S² for n = 2, 10, 30, and 100 under the exponential population (Frequency versus sample variance).]

For a Gamma Population with 𝛼 = .5, 𝜆 = 1

n                      2.0000   10.0000   30.0000   100.0000
MeanOfSampMeans        0.5067    0.5005    0.5010     0.4993
PopnMean               0.5000    0.5000    0.5000     0.5000
VarOfSampMeans         0.2551    0.0500    0.0167     0.0050
TrueVarOfSampMean      0.2500    0.0500    0.0167     0.0050
MeanOfSampVars         0.5093    0.5009    0.5003     0.5009
PopnVar                0.5000    0.5000    0.5000     0.5000
VarOfSampVars          2.1788    0.3430    0.1147     0.0352

Sampling Distributions of X̄ under Gamma Popn

[Figure: Histograms of the simulated sampling distribution of X̄ for n = 2, 10, 30, and 100 under the gamma population (Frequency versus sample mean).]

Sampling Distributions of S 2 under Gamma Popn

[Figure: Histograms of the simulated sampling distribution of S² for n = 2, 10, 30, and 100 under the gamma population (Frequency versus sample variance).]

The Problem of Parameter Estimation

▶ Statement of the Problem: Given sample data X1 , X2 , . . . , Xn from a population or distribution F (x; 𝜃), where 𝜃 belongs to a parameter space Θ, we would like to estimate a parametric function of 𝜃, such as the population mean or the population variance, based on the sample data.
▶ Ideally, we would like our estimate to be as close as possible
to the true value of the parametric function, whatever the
value of 𝜃 is.

A Special Case: Bernoulli Population

▶ Consider the problem of estimating the proportion of 1's ('successes') in a Bernoulli population. (For example, you would like to estimate the proportion of defective items being produced by a manufacturing process.)
▶ Recall that for a Bernoulli population the mean is 𝜇 = 𝜃 and the variance is 𝜎² = 𝜃(1 − 𝜃).
▶ A sample of size n from this population is

X1 , X2 , . . . , Xn ∼ IID Ber (𝜃).

▶ Note that the Xi 's will take 0/1 values.
▶ How do we estimate 𝜃 based on the sample data? If we can estimate 𝜃, then we can also estimate the variance.

The Method-of-Moments (MM) Approach

▶ The first population moment is 𝜇 = 𝜃.
▶ The first sample moment, which is the sample mean X̄ , has a mean that equals the first population moment.
▶ The idea, therefore, is that since this first sample moment will, on average, equal 𝜇, we can obtain an estimate of 𝜇 by equating it with the sample mean X̄ .
▶ Therefore, a method-of-moments (MM) estimate of 𝜇 = 𝜃 is the sample mean X̄ . We write

𝜃̂ = X̄ .

▶ To estimate the variance 𝜎² = 𝜃(1 − 𝜃), we substitute 𝜃̂ for 𝜃 to get

𝜎̂² = X̄ (1 − X̄ ).

The Maximum Likelihood (ML) Approach

▶ As discussed earlier, when sampling from a Bernoulli population, the likelihood function for 𝜃, given the sample data X1 , X2 , . . . , Xn , is

L(𝜃) = 𝜃^T (1 − 𝜃)^{n−T}

where T = ∑_{i=1}^{n} Xi .
▶ The likelihood function provides the probability of observing
the data given the value of 𝜃.
▶ The ML principle relies on the idea that the best estimate of 𝜃
is the value that will yield the largest possible likelihood.
▶ To obtain the estimate, we therefore maximize the likelihood
function, or equivalently, maximize the log-likelihood function.

ML Estimate for the Bernoulli Population

▶ The log-likelihood function for the Bernoulli population is

l (𝜃) = log L(𝜃) = T log(𝜃) + (n − T ) log(1 − 𝜃).

▶ The derivative of l (𝜃) with respect to 𝜃 is

l ′(𝜃) = dl/d𝜃 = T/𝜃 − (n − T)/(1 − 𝜃).

▶ Equating l ′(𝜃) to zero and solving for 𝜃 yields the maximizer

𝜃̂ = T/n = X̄ .

▶ For this Bernoulli population, the ML estimate coincides with the MM estimate.
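As a sanity check, the sketch below maximizes this log-likelihood numerically on simulated data and compares the result with X̄; the seed and the true 𝜃 used to simulate are illustrative assumptions.

# Numerical check that the Bernoulli MLE equals the sample mean
set.seed(5)
x <- rbinom(50, size = 1, prob = 0.4)       # hypothetical 0/1 sample
Tstat <- sum(x); n <- length(x)
loglik <- function(th) Tstat * log(th) + (n - Tstat) * log(1 - th)
optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)$maximum
mean(x)                                     # analytic MLE: T/n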

Remark About the Variance Estimate
▶ We have seen that for the Bernoulli population, the MM and ML estimate of the variance 𝜎² = 𝜃(1 − 𝜃) is

𝜎̂² = X̄ (1 − X̄ ).

▶ This estimate is related to the sample variance S² via

X̄ (1 − X̄ ) = ((n − 1)/n) S² = (1/n) ∑_{i=1}^{n} (Xi − X̄ )².

This follows from the fact that Xi = Xi² since each Xi takes the value 0 or 1.
▶ In practice, we usually use the sample variance, S 2 , to
estimate the variance since, on the average, it equals the
variance (we say that S 2 is unbiased for 𝜎 2 ); whereas
X̄ (1 − X̄ ) tends to underestimate 𝜎 2 .
Standard Error of the Estimate

▶ The MM or ML estimate of 𝜇 = 𝜃 is 𝜃̂ = X̄ . On average, this will equal the true value of 𝜃, that is, E (X̄ ) = 𝜃.
▶ As discussed earlier, the standard deviation of X̄ , called its standard error, is

SE (X̄ ) = √(𝜎²/n) = √(𝜃(1 − 𝜃)/n).
▶ This standard error represents the variability that is associated
with X̄ when we think of repeating the sampling process of
size n from this Bernoulli population and for each sample
computing the value of X̄ .

Estimate of the Standard Error

▶ Since we do not know the value of 𝜃 (that is why we are trying to estimate it), we also do not know the value of the standard error. We estimate it by substituting our estimate of the variance:

SÊ (X̄ ) = √( X̄ (1 − X̄ )/(n − 1) ).

▶ If the sampling distribution of X̄ is close to normal [justified by the Central Limit Theorem], which is the case when 𝜃 is neither too close to zero nor too close to one, we have

P{ 𝜃 − (1.96)SÊ (X̄ ) ≤ X̄ ≤ 𝜃 + (1.96)SÊ (X̄ ) } ≈ 0.95

Notion of a Confidence Interval
▶ This is equivalent to writing

P{ X̄ − (1.96)SÊ (X̄ ) ≤ 𝜃 ≤ X̄ + (1.96)SÊ (X̄ ) } ≈ 0.95

▶ That is, we will be approximately 95% confident that the value of 𝜃 is in the interval

[ X̄ − (1.96)SÊ (X̄ ), X̄ + (1.96)SÊ (X̄ ) ]

▶ Such an interval is called an approximate 95% confidence interval for 𝜃.
▶ The so-called estimated margin of error of X̄ is the quantity

MÊ (X̄ ) = (1.96)SÊ (X̄ ) = 1.96 √( X̄ (1 − X̄ )/(n − 1) ).

A Concrete Numerical Example for Bernoulli Population

▶ 𝜃: proportion of defective components produced by a process.
▶ X = 1 if a component is defective; X = 0 if a component is good.
▶ We take a random sample of size n = 100.
▶ Suppose 12 of these 100 sampled components are defective. That is,

T = ∑_{i=1}^{100} Xi = 12.

▶ The MM or ML estimate of 𝜃 is

𝜃̂ = X̄ = 12/100 = .12.

▶ An estimate of the variance 𝜎² is

𝜎̂² = (.12)(1 − .12) = .1056.

The Confidence Interval for Bernoulli Example

▶ The estimate of the standard error of X̄ is

SÊ (X̄ ) = √( (.12)(1 − .12)/(100 − 1) ) = .0327.

▶ The estimate of the margin of error of X̄ is

MÊ (X̄ ) = (1.96)(.0327) = .0641.

▶ An approximate 95% confidence interval for 𝜃 is therefore

[.12 − .0641, .12 + .0641] = [.0559, .1841].

▶ We would then say that we are approximately 95% confident that 𝜃 is between .0559 and .1841.
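For reference, the entire calculation takes a few lines in R, mirroring the numbers above:

# The Bernoulli example: estimate, standard error, margin of error, 95% CI
n <- 100; Tstat <- 12
theta_hat <- Tstat / n                                   # .12
se_hat <- sqrt(theta_hat * (1 - theta_hat) / (n - 1))    # approx .0327
me_hat <- 1.96 * se_hat                                  # approx .0641
c(theta_hat - me_hat, theta_hat + me_hat)                # approx [.0559, .1841]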

Measurement Model with Normal Errors
▶ Seek to measure a fixed value c, e.g., the speed of light; the
volume of a container.
▶ Measurement process not perfect; measured values are
contaminated with errors.
▶ Error Model:

Y = c + 𝜖,

where Y is the measurement, c is the true (but unknown) value, and 𝜖 is the error contamination, which is also not known.
▶ Assumption on the Error Term:
𝜖 ∼ N(0, 𝜎 2 )
where the variance 𝜎 2 is not known.
▶ To discover both c and 𝜎 2 , we make n measurements, so we
get the data Y1 , Y2 , . . . , Yn . Note that
Y1 , Y2 , . . . , Yn ∼ IID N(c, 𝜎²).
Least-Squares Approach: Estimating c

▶ A distance measure:

Q(c) = ∑_{i=1}^{n} 𝜖i² = ∑_{i=1}^{n} (Yi − c)²

▶ To get an estimate of c, we minimize Q(c). The first derivative of Q with respect to c is

dQ/dc = ∑_{i=1}^{n} 2(Yi − c)(−1) = −2( ∑_{i=1}^{n} Yi − nc ).

▶ Equating this to zero and solving for c, we obtain the estimator

ĉ = ( ∑_{i=1}^{n} Yi )/n = Ȳ .

Estimating the Variance 𝜎 2
▶ To get an estimator for the variance 𝜎², observe that 𝜎² = E [(Y − c)²], so that

E [ (1/n) ∑_{i=1}^{n} (Yi − c)² ] = 𝜎².

▶ If we knew c, we could then estimate 𝜎² using

(1/n) ∑_{i=1}^{n} (Yi − c)².

▶ However, since we do not know c, we cannot compute the above quantity. Thus we replace c by ĉ = Ȳ , which yields the estimator of 𝜎² given by

𝜎̂² = (1/n) ∑_{i=1}^{n} (Yi − Ȳ )².

Getting an Unbiased Estimator of 𝜎 2

▶ Interestingly, after this substitution of ĉ for c, the mean of 𝜎̂² is not 𝜎², but is instead (n − 1)𝜎²/n.
▶ To get an estimator whose mean is 𝜎², that is, an unbiased estimator, we multiply 𝜎̂² by n/(n − 1), and this leads to the sample variance S² as our estimator of 𝜎².
▶ Recall that the sample variance is

S² = (1/(n − 1)) ∑_{i=1}^{n} (Yi − Ȳ )² = (1/(n − 1)) [ ∑_{i=1}^{n} Yi² − nȲ ² ].

This is what we use in practice for estimating 𝜎 2 .

Method-of-Moments Approach
▶ Population Moments:

E (Y ) = c and E (Y ²) = 𝜎² + c²

▶ First two Sample Moments:

M′₁ = Ȳ and M′₂ = (1/n) ∑_{i=1}^{n} Yi²

▶ Equate population and sample moments:

c = Ȳ and 𝜎² + c² = (1/n) ∑_{i=1}^{n} Yi²

▶ Solving for c and 𝜎² (exercise!; the one-line algebra is sketched below), we get:

ĉ = Ȳ and 𝜎̂² = (1/n) ∑_{i=1}^{n} (Yi − Ȳ )²

▶ These are the same estimators as in the least-squares approach.
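For the record, the exercise is one substitution: putting c = Ȳ into the second equation and rearranging gives

𝜎̂² = (1/n) ∑_{i=1}^{n} Yi² − Ȳ ² = (1/n) ∑_{i=1}^{n} (Yi − Ȳ )²,

since expanding the square yields (1/n) ∑ (Yi − Ȳ )² = (1/n) ∑ Yi² − 2Ȳ · (1/n) ∑ Yi + Ȳ ² = (1/n) ∑ Yi² − Ȳ ².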
Maximum Likelihood Approach

▶ The likelihood function of (c, 𝜎²) is

L(c, 𝜎²) = [1/(2𝜋𝜎²)]^{n/2} exp{ −(1/(2𝜎²)) ∑_{i=1}^{n} (Yi − c)² }

▶ By maximizing this function or its logarithm with respect to c and 𝜎² [left as an exercise], we obtain again

ĉ = Ȳ and 𝜎̂² = (1/n) ∑_{i=1}^{n} (Yi − Ȳ )²

▶ Thus, the LS, MM, and ML estimators coincide under this normal measurement error model!

Example: Michelson’s Speed of Light Measurements
▶ See the website:
http://www.itl.nist.gov/div898/bayesian/datagall/michelso.dat

▶ Michelson’s 1879 Data: There were 101 observations (in millions of meters per second).
▶ A histogram of these 101 observations is on the next slide.
▶ Summary Statistics:

n = 101, Ȳ = 299.8524, S² = 0.006242667, S = 0.07901055

▶ Our estimate of the speed of light is therefore

ĉ = 299.8524 millions of meters per second.

▶ Estimate of the Standard Error of the Estimate:

SÊ (ĉ) = S/√n = 0.07901055/√101 = 0.007861843
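A sketch of how these numbers could be reproduced in R; note that the number of header lines to skip in the NIST file is an assumption here and should be checked against the actual file layout.

# Sketch: summary statistics for the Michelson data (file layout assumed)
url <- "http://www.itl.nist.gov/div898/bayesian/datagall/michelso.dat"
speed <- scan(url, skip = 25)       # 'skip = 25' is a hypothetical header length
n <- length(speed)
c(n = n, mean = mean(speed), var = var(speed), sd = sd(speed))
sd(speed) / sqrt(n)                 # estimated standard error of c-hat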

Histogram of Michelson’s Speed of Light Data

[Figure: Histogram of Michelson’s 1879 Speed of Light Data (Frequency versus SpeedOfLight, roughly 299.6 to 300.1 millions of meters per second).]

On Desirable Properties of Estimators

▶ An estimator T of 𝜃 is said to be unbiased if E (T ∣𝜃) = 𝜃 whatever 𝜃 is. That is, on the average it hits the ‘target.’ An unbiased estimator is usually preferred over a biased estimator, and an unbiased estimator is usually said to be accurate.
▶ Given two unbiased estimators of 𝜃, say T1 and T2 , the one
with a smaller variance is preferred over the other. We say
that if T1 has smaller variance than T2 , then T1 is more
precise.
▶ Theoretically, it turns out that ML estimators are usually the
ones that are unbiased and with small variance. Most
estimators we use are the ML estimators.
▶ Generally, MM estimators tend to be less precise than ML
estimators, though usually they are easier to compute.

Estimating Normal Mean: Battle of Three Estimators

▶ Population: Normal with mean 𝜇 = 20 and 𝜎 = 25.
▶ Sample Data: X1 , X2 , . . . , X10
▶ Estimator 1: Sample Mean, X̄ .
▶ Estimator 2: Sample Median, M, which is the value that
divides the arranged data into two equal parts.
▶ Estimator 3: Sample Midrange, MR, which is the average of
the smallest and largest observations.
▶ We take 10000 samples of size 10 each.
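A minimal R version of this comparison; the seed is an arbitrary choice, and swapping the sampling line for runif(10, 10, 20) gives the uniform-population battle of the later slides.

# 10000 samples of size 10 from N(20, sd = 25); compare three estimators of mu
set.seed(6)
est <- replicate(10000, {
  x <- rnorm(10, mean = 20, sd = 25)
  c(mean = mean(x), median = median(x), midrange = (min(x) + max(x)) / 2)
})
apply(est, 1, mean)   # all three are centered near mu = 20
apply(est, 1, var)    # under normality the sample mean has the smallest variance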

Performance of Three Estimators under Normal

[Figure: Boxplots and histograms of the sampling distributions of the sample mean, sample median, and sample midrange under the normal population.]

Estimating Uniform Mean: Battle of Three Estimators

▶ Population: Uniform[10, 20] with mean 𝜇 = 15.
▶ Sample Data: X1 , X2 , . . . , X10
▶ Estimator 1: Sample Mean, X̄ .
▶ Estimator 2: Sample Median, M, which is the value that
divides the arranged data into two equal parts.
▶ Estimator 3: Sample Midrange, MR, which is the average of
the smallest and largest observations.
▶ We take 10000 samples of size 10 each.

Sample Mean is Not the Best under Uniform

[Figure: Boxplots and histograms of the sampling distributions of the sample mean, sample median, and sample midrange under the Uniform[10, 20] population.]
