sp-sampling-lect-33

Introduction to Sampling Theory
Lecture 33
Cluster Sampling
Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
Slides can be downloaded from http://home.iitk.ac.in/~shalab/sp
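Estimation of Population Mean:
Consider the mean of all such cluster means as an estimator of the population mean:
$$\bar{y}_{cl} = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_i .$$
It is unbiased, since
$$E(\bar{y}_{cl}) = \frac{1}{n}\sum_{i=1}^{n}E(\bar{y}_i) = \bar{Y}.$$
Its variance is
$$Var(\bar{y}_{cl}) = E(\bar{y}_{cl}-\bar{Y})^2 = \frac{N-n}{Nn}S_b^2, \qquad S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N}(\bar{y}_i-\bar{Y})^2 ,$$
and an estimator of this variance is
$$\widehat{Var}(\bar{y}_{cl}) = \frac{N-n}{Nn}s_b^2, \qquad s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}(\bar{y}_i-\bar{y}_{cl})^2 .$$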
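The estimator and its estimated variance can be computed directly from the sampled clusters. The following is a minimal sketch, assuming equal cluster sizes; the function name `cluster_mean_estimate` and the data are hypothetical and chosen only for illustration.

```python
def cluster_mean_estimate(sample_clusters, N):
    """Return (ybar_cl, estimated variance of ybar_cl).

    sample_clusters: list of n sampled clusters, each a list of M values.
    N: total number of clusters in the population.
    """
    n = len(sample_clusters)
    cluster_means = [sum(c) / len(c) for c in sample_clusters]
    ybar_cl = sum(cluster_means) / n
    # s_b^2: mean square between the sampled cluster means (divisor n - 1).
    s_b2 = sum((m - ybar_cl) ** 2 for m in cluster_means) / (n - 1)
    var_hat = (N - n) / (N * n) * s_b2
    return ybar_cl, var_hat


# Hypothetical data: n = 3 clusters of size M = 3 out of N = 10 clusters.
sample = [[1, 2, 3], [2, 4, 6], [5, 7, 9]]
ybar_cl, var_hat = cluster_mean_estimate(sample, N=10)
print(ybar_cl, var_hat)
```

At least two clusters must be sampled, since $s_b^2$ uses the divisor n - 1.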
Comparison with SRS:
If an equivalent sample of nM units were to be selected from the population of NM units by SRSWOR, the variance of the mean per element would be
$$Var(\bar{y}_{nM}) = \frac{NM-nM}{NM}\cdot\frac{S^2}{nM} = \frac{f}{n}\cdot\frac{S^2}{M},$$
where
$$f = \frac{N-n}{N} \quad\text{and}\quad S^2 = \frac{1}{NM-1}\sum_{i=1}^{N}\sum_{j=1}^{M}(y_{ij}-\bar{Y})^2 .$$
Also,
$$Var(\bar{y}_{cl}) = \frac{N-n}{Nn}S_b^2 = \frac{f}{n}S_b^2 .$$
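Consider
$$(NM-1)S^2 = \sum_{i=1}^{N}\sum_{j=1}^{M}(y_{ij}-\bar{Y})^2$$
$$= \sum_{i=1}^{N}\sum_{j=1}^{M}\left[(y_{ij}-\bar{y}_i)+(\bar{y}_i-\bar{Y})\right]^2$$
$$= \sum_{i=1}^{N}\sum_{j=1}^{M}(y_{ij}-\bar{y}_i)^2 + \sum_{i=1}^{N}\sum_{j=1}^{M}(\bar{y}_i-\bar{Y})^2$$
$$= N(M-1)S_w^2 + M(N-1)S_b^2 .$$
The cross-product term vanishes because $\sum_{j=1}^{M}(y_{ij}-\bar{y}_i)=0$ for every cluster. Here
$$S_w^2 = \frac{1}{N}\sum_{i=1}^{N}S_i^2$$
is the mean sum of squares within clusters in the population, and
$$S_i^2 = \frac{1}{M-1}\sum_{j=1}^{M}(y_{ij}-\bar{y}_i)^2$$
is the mean sum of squares for the ith cluster.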
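The decomposition of the total sum of squares can be checked numerically. The sketch below uses a small hypothetical population (N = 4 clusters of size M = 3); the data and variable names are illustrative assumptions, not part of the lecture.

```python
# Hypothetical population: N = 4 clusters, each of size M = 3.
population = [[1, 2, 3], [2, 4, 6], [3, 3, 3], [5, 7, 9]]
N, M = len(population), len(population[0])

grand_mean = sum(sum(c) for c in population) / (N * M)
cluster_means = [sum(c) / M for c in population]

# Left-hand side: (NM - 1) S^2, the total sum of squares about the grand mean.
total_ss = sum((y - grand_mean) ** 2 for c in population for y in c)

# S_i^2 for each cluster, then S_w^2 as their average over the N clusters.
S_i2 = [sum((y - m) ** 2 for y in c) / (M - 1)
        for c, m in zip(population, cluster_means)]
S_w2 = sum(S_i2) / N

# S_b^2: mean square between the cluster means.
S_b2 = sum((m - grand_mean) ** 2 for m in cluster_means) / (N - 1)

# Right-hand side: N(M - 1) S_w^2 + M(N - 1) S_b^2.
rhs = N * (M - 1) * S_w2 + M * (N - 1) * S_b2
print(total_ss, rhs)  # the two agree up to floating-point rounding
```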
The efficiency of cluster sampling over SRSWOR is
$$E = \frac{Var(\bar{y}_{nM})}{Var(\bar{y}_{cl})} = \frac{S^2}{MS_b^2} = \frac{1}{NM-1}\left[\frac{N(M-1)}{M}\cdot\frac{S_w^2}{S_b^2} + (N-1)\right].$$
Thus the relative efficiency increases when $S_w^2$ is large and $S_b^2$ is small. So cluster sampling will be efficient if the clusters are formed so that the variation between the cluster means is as small as possible while the variation within the clusters is as large as possible.
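As an illustration, the two variances and their ratio E can be computed for a fully known population. This is a sketch under the assumption of equal cluster sizes; `cluster_efficiency` and the data are hypothetical names and values.

```python
def cluster_efficiency(population, n):
    """Return (Var_SRSWOR, Var_cluster, E) for equal-sized clusters.

    population: list of N clusters, each a list of M values.
    n: number of clusters sampled (the SRSWOR comparison uses n*M elements).
    """
    N, M = len(population), len(population[0])
    grand_mean = sum(sum(c) for c in population) / (N * M)
    means = [sum(c) / M for c in population]
    S2 = sum((y - grand_mean) ** 2 for c in population for y in c) / (N * M - 1)
    S_b2 = sum((m - grand_mean) ** 2 for m in means) / (N - 1)
    f = (N - n) / N
    var_srs = f / n * S2 / M   # variance of the SRSWOR mean of nM elements
    var_cl = f / n * S_b2      # variance of the mean of n cluster means
    return var_srs, var_cl, var_srs / var_cl


# Hypothetical population: N = 4 clusters of size M = 3, sample n = 2 clusters.
population = [[1, 2, 3], [2, 4, 6], [3, 3, 3], [5, 7, 9]]
var_srs, var_cl, E = cluster_efficiency(population, n=2)
print(var_srs, var_cl, E)
```

In this made-up population the cluster means differ a lot while values within a cluster are close, so E comes out below 1, in line with the remark above.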
Efficiency in Terms of Intraclass Correlation:
The intraclass correlation between the elements within a cluster is given by
$$\rho = \frac{E(y_{ij}-\bar{Y})(y_{ik}-\bar{Y})}{E(y_{ij}-\bar{Y})^2}; \qquad -\frac{1}{M-1}\le\rho\le 1$$
$$= \frac{\dfrac{1}{MN(M-1)}\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k(\ne j)=1}^{M}(y_{ij}-\bar{Y})(y_{ik}-\bar{Y})}{\dfrac{1}{MN}\sum_{i=1}^{N}\sum_{j=1}^{M}(y_{ij}-\bar{Y})^2}$$
$$= \frac{\dfrac{1}{MN(M-1)}\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k(\ne j)=1}^{M}(y_{ij}-\bar{Y})(y_{ik}-\bar{Y})}{\left(\dfrac{MN-1}{MN}\right)S^2}$$
$$= \frac{\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k(\ne j)=1}^{M}(y_{ij}-\bar{Y})(y_{ik}-\bar{Y})}{(MN-1)(M-1)S^2}.$$
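The last expression for $\rho$ translates directly into a triple loop over clusters and ordered pairs of distinct elements. The following sketch assumes equal cluster sizes; the function name and the data are hypothetical.

```python
def intraclass_correlation(population):
    """Intraclass correlation for N clusters of equal size M."""
    N, M = len(population), len(population[0])
    grand_mean = sum(sum(c) for c in population) / (N * M)
    S2 = sum((y - grand_mean) ** 2 for c in population for y in c) / (N * M - 1)
    # Sum of cross products over ordered pairs (j, k), j != k, within clusters.
    cross = 0.0
    for c in population:
        for j in range(M):
            for k in range(M):
                if j != k:
                    cross += (c[j] - grand_mean) * (c[k] - grand_mean)
    return cross / ((M * N - 1) * (M - 1) * S2)


# Hypothetical population: N = 4 clusters of size M = 3.
population = [[1, 2, 3], [2, 4, 6], [3, 3, 3], [5, 7, 9]]
rho = intraclass_correlation(population)
print(rho)
```

The function requires M > 1 and a non-constant population, since otherwise the denominator is zero.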
Consider
$$\sum_{i=1}^{N}(\bar{y}_i-\bar{Y})^2 = \sum_{i=1}^{N}\left[\frac{1}{M}\sum_{j=1}^{M}(y_{ij}-\bar{Y})\right]^2$$
$$= \sum_{i=1}^{N}\left[\frac{1}{M^2}\sum_{j=1}^{M}(y_{ij}-\bar{Y})^2 + \frac{1}{M^2}\sum_{j=1}^{M}\sum_{k(\ne j)=1}^{M}(y_{ij}-\bar{Y})(y_{ik}-\bar{Y})\right]$$
or
$$\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k(\ne j)=1}^{M}(y_{ij}-\bar{Y})(y_{ik}-\bar{Y}) = M^2\sum_{i=1}^{N}(\bar{y}_i-\bar{Y})^2 - \sum_{i=1}^{N}\sum_{j=1}^{M}(y_{ij}-\bar{Y})^2$$
or
$$\rho\,(MN-1)(M-1)S^2 = M^2(N-1)S_b^2 - (NM-1)S^2$$
or
$$S_b^2 = \frac{MN-1}{M^2(N-1)}\left[1+\rho(M-1)\right]S^2 .$$
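The relation between $S_b^2$, $S^2$ and $\rho$ can be verified on a small example. The sketch below uses a hypothetical population and computes the cross-product sum through the shortcut $(\sum_j d_{ij})^2 - \sum_j d_{ij}^2$, where $d_{ij} = y_{ij}-\bar{Y}$; the variable names are illustrative.

```python
# Hypothetical population: N = 4 clusters of size M = 3.
population = [[1, 2, 3], [2, 4, 6], [3, 3, 3], [5, 7, 9]]
N, M = len(population), len(population[0])

grand_mean = sum(sum(c) for c in population) / (N * M)
# Deviations d_ij = y_ij - Ybar, kept cluster by cluster.
dev = [[y - grand_mean for y in c] for c in population]

S2 = sum(d * d for c in dev for d in c) / (N * M - 1)
# sum(c) / M is the deviation of the cluster mean from the grand mean.
S_b2 = sum((sum(c) / M) ** 2 for c in dev) / (N - 1)

# Sum over j != k of d_ij * d_ik equals (sum_j d_ij)^2 - sum_j d_ij^2.
cross = sum(sum(c) ** 2 - sum(d * d for d in c) for c in dev)
rho = cross / ((M * N - 1) * (M - 1) * S2)

# Right-hand side of the identity for S_b^2.
rhs = (M * N - 1) / (M ** 2 * (N - 1)) * (1 + rho * (M - 1)) * S2
print(S_b2, rhs)  # the two agree up to floating-point rounding
```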
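The variance of $\bar{y}_{cl}$ now becomes
$$Var(\bar{y}_{cl}) = \frac{N-n}{Nn}S_b^2 = \frac{N-n}{Nn}\cdot\frac{MN-1}{N-1}\cdot\frac{S^2}{M^2}\left[1+(M-1)\rho\right].$$
For large N, $\frac{MN-1}{MN}\approx 1$, $\frac{N-1}{N}\approx 1$, $\frac{N-n}{N}\approx 1$, and so
$$Var(\bar{y}_{cl}) \approx \frac{S^2}{nM}\left[1+(M-1)\rho\right].$$
The variance of the sample mean under SRSWOR for large N is
$$Var(\bar{y}_{nM}) \approx \frac{S^2}{nM}.$$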
The relative efficiency for large N is now given by
$$E = \frac{Var(\bar{y}_{nM})}{Var(\bar{y}_{cl})} = \frac{\dfrac{S^2}{nM}}{\dfrac{S^2}{nM}\left[1+(M-1)\rho\right]} = \frac{1}{1+(M-1)\rho}; \qquad -\frac{1}{M-1}\le\rho\le 1.$$
• If M = 1, then E = 1, i.e., SRS and cluster sampling are equally efficient. Each cluster consists of one unit, so cluster sampling reduces to SRS.
• If M > 1, then cluster sampling is more efficient when E > 1, or $(M-1)\rho < 0$, or $\rho < 0$.
• If $\rho = 0$, then E = 1, i.e., there is no intraclass correlation, which means that the units in each cluster are arranged randomly. So the sample is heterogeneous.
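The three cases above can be checked with a few lines of code. This is a minimal sketch of the large-N formula $E = 1/[1+(M-1)\rho]$; the function name `large_n_efficiency` and the values of M and $\rho$ are hypothetical.

```python
def large_n_efficiency(M, rho):
    """Large-N relative efficiency of cluster sampling over SRSWOR."""
    denom = 1.0 + (M - 1) * rho
    if denom <= 0:
        # At rho = -1/(M - 1) the formula has no finite value.
        raise ValueError("rho must exceed -1/(M - 1) for a finite efficiency")
    return 1.0 / denom


print(large_n_efficiency(1, 0.3))    # M = 1: E = 1 whatever rho is
print(large_n_efficiency(5, 0.0))    # rho = 0: E = 1
print(large_n_efficiency(5, -0.1))   # rho < 0: E > 1
print(large_n_efficiency(5, 0.1))    # rho > 0: E < 1
```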
• In practice, $\rho$ is usually positive, and $\rho$ decreases as M increases, but the rate of decrease in $\rho$ is much lower than the rate of increase in M.
• The situation $\rho > 0$ arises when nearby units are grouped together to form clusters, which are then completely enumerated.
• There are also situations in which $\rho < 0$.
Estimation of Relative Efficiency:
The relative efficiency of cluster sampling relative to an equivalent SRSWOR is obtained as
$$E = \frac{S^2}{MS_b^2}.$$
An estimator of E can be obtained by substituting the estimates of $S^2$ and $S_b^2$.
Since $\bar{y}_{cl} = \frac{1}{n}\sum_{i=1}^{n}\bar{y}_i$ is the mean of n means $\bar{y}_i$ drawn by SRSWOR from a population of N means $\bar{y}_i$, i = 1, 2, ..., N, it follows from the theory of SRSWOR that
$$E(s_b^2) = E\left[\frac{1}{n-1}\sum_{i=1}^{n}(\bar{y}_i-\bar{y}_{cl})^2\right] = \frac{1}{N-1}\sum_{i=1}^{N}(\bar{y}_i-\bar{Y})^2 = S_b^2 .$$
Thus $s_b^2$ is an unbiased estimator of $S_b^2$.
Since $s_w^2 = \frac{1}{n}\sum_{i=1}^{n}S_i^2$ is the mean of n mean sums of squares $S_i^2$ drawn from the population of N mean sums of squares $S_i^2$, i = 1, 2, ..., N, it follows from the theory of SRSWOR that
$$E(s_w^2) = E\left[\frac{1}{n}\sum_{i=1}^{n}S_i^2\right] = \frac{1}{N}\sum_{i=1}^{N}S_i^2 = S_w^2 .$$
Thus $s_w^2$ is an unbiased estimator of $S_w^2$.
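Both unbiasedness results can be confirmed by brute force on a small population: under SRSWOR every set of n clusters is equally likely, so the expectation is the plain average over all possible samples. The sketch below uses a hypothetical population with N = 4, M = 3 and n = 2; the helper names `mean` and `var` are illustrative.

```python
from itertools import combinations

# Hypothetical population: N = 4 clusters of size M = 3; sample n = 2 clusters.
population = [[1, 2, 3], [2, 4, 6], [3, 3, 3], [5, 7, 9]]
N, M, n = len(population), len(population[0]), 2


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def var(xs):
    """Mean square about the mean, with divisor len(xs) - 1."""
    xs = list(xs)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


cluster_means = [mean(c) for c in population]
S_i2 = [var(c) for c in population]
S_b2 = var(cluster_means)  # population mean square between cluster means
S_w2 = mean(S_i2)          # population mean square within clusters

# Enumerate every possible SRSWOR sample of n clusters.
samples = list(combinations(range(N), n))
E_sb2 = mean(var(cluster_means[i] for i in s) for s in samples)
E_sw2 = mean(mean(S_i2[i] for i in s) for s in samples)

print(E_sb2, S_b2)  # average of s_b^2 over all samples vs. S_b^2
print(E_sw2, S_w2)  # average of s_w^2 over all samples vs. S_w^2
```

Full enumeration is only practical for very small N; for larger populations a Monte Carlo average over random samples would be the natural substitute.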
Consider
$$S^2 = \frac{1}{MN-1}\sum_{i=1}^{N}\sum_{j=1}^{M}(y_{ij}-\bar{Y})^2$$
or
$$(MN-1)S^2 = \sum_{i=1}^{N}\sum_{j=1}^{M}\left[(y_{ij}-\bar{y}_i)+(\bar{y}_i-\bar{Y})\right]^2$$
$$= \sum_{i=1}^{N}\sum_{j=1}^{M}\left[(y_{ij}-\bar{y}_i)^2+(\bar{y}_i-\bar{Y})^2\right]$$
$$= \sum_{i=1}^{N}(M-1)S_i^2 + M(N-1)S_b^2$$
$$= N(M-1)S_w^2 + M(N-1)S_b^2 .$$
An unbiased estimator of $S^2$ can be obtained as
$$\hat{S}^2 = \frac{1}{MN-1}\left[N(M-1)s_w^2 + M(N-1)s_b^2\right].$$
The estimated variances are
$$\widehat{Var}(\bar{y}_{cl}) = \frac{N-n}{Nn}s_b^2 ,$$
$$\widehat{Var}(\bar{y}_{nM}) = \frac{N-n}{Nn}\cdot\frac{\hat{S}^2}{M},$$
where
$$s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n}(\bar{y}_i-\bar{y}_{cl})^2 .$$
An estimate of the efficiency $E = \frac{S^2}{MS_b^2}$ is
$$\hat{E} = \frac{N(M-1)s_w^2 + M(N-1)s_b^2}{M(NM-1)s_b^2}.$$
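Putting the pieces together, the estimated efficiency can be computed from the sampled clusters alone. The following is a sketch assuming equal cluster sizes and at least two sampled clusters with differing means; `estimated_efficiency` and the data are hypothetical.

```python
def estimated_efficiency(sample_clusters, N):
    """Estimate E = S^2 / (M * S_b^2) from n sampled clusters of size M."""
    n, M = len(sample_clusters), len(sample_clusters[0])
    means = [sum(c) / M for c in sample_clusters]
    ybar_cl = sum(means) / n
    # s_b^2: mean square between the sampled cluster means.
    s_b2 = sum((m - ybar_cl) ** 2 for m in means) / (n - 1)
    # s_w^2: average of the within-cluster mean squares S_i^2.
    s_w2 = sum(
        sum((y - m) ** 2 for y in c) / (M - 1)
        for c, m in zip(sample_clusters, means)
    ) / n
    # Unbiased estimator of S^2 built from s_w^2 and s_b^2.
    S2_hat = (N * (M - 1) * s_w2 + M * (N - 1) * s_b2) / (M * N - 1)
    var_cl_hat = (N - n) / (N * n) * s_b2
    var_srs_hat = (N - n) / (N * n) * S2_hat / M
    return var_srs_hat / var_cl_hat


# Hypothetical data: n = 3 clusters of size M = 3 out of N = 10 clusters.
sample = [[1, 2, 3], [2, 4, 6], [5, 7, 9]]
E_hat = estimated_efficiency(sample, N=10)
print(E_hat)
```

The finite population correction cancels in the ratio, so the returned value equals the closed-form expression for the estimated efficiency.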
If N is large so that $M(N-1)\approx MN$ and $MN-1\approx MN$, then
$$E \approx \frac{1}{M} + \left(\frac{M-1}{M}\right)\frac{S_w^2}{MS_b^2}$$
and its estimate is
$$\hat{E} \approx \frac{1}{M} + \left(\frac{M-1}{M}\right)\frac{s_w^2}{Ms_b^2}.$$
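The large-N estimate needs only M and the two sample mean squares. A minimal sketch, with a hypothetical function name and hypothetical input values:

```python
def large_n_estimated_efficiency(s_w2, s_b2, M):
    """Large-N estimate of E: 1/M + ((M - 1)/M) * s_w^2 / (M * s_b^2)."""
    return 1 / M + (M - 1) / M * s_w2 / (M * s_b2)


# Hypothetical sample values: s_w^2 = 3, s_b^2 = 19/3, clusters of size M = 3.
E_hat_large_n = large_n_estimated_efficiency(s_w2=3.0, s_b2=19 / 3, M=3)
print(E_hat_large_n)
```

Because N does not appear in the formula, this version is convenient when the number of clusters in the population is large or only roughly known.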
