Generic TST Protocol Distributed Annexes KNCV
See: Lwanga SK, Lemeshow S. Sample size determination in health studies: a practical manual. WHO, Geneva, 1991.
P1 P2 P1-P2 V N
0.4 0.1 0.3 0.33 39
0.4 0.2 0.2 0.4 105
0.4 0.25 0.15 0.4275 200
0.4 0.3 0.1 0.45 473
0.4 0.35 0.05 0.4675 1964
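The V and N columns above can be reproduced with a few lines of code. This is a sketch under assumptions that the sheet itself does not state: V = P1(1-P1) + P2(1-P2), N = (z_alpha + z_beta)^2 * V / (P1-P2)^2 rounded up, a two-sided 5% significance level (z = 1.96) and 90% power (z = 1.28). These values were chosen because they match the tabulated N; the function name is illustrative.

```python
import math

Z_ALPHA = 1.96  # assumption: two-sided 5% significance level
Z_BETA = 1.28   # assumption: 90% power


def sample_size_two_proportions(p1, p2, z_alpha=Z_ALPHA, z_beta=Z_BETA):
    """Return (V, N) for detecting a difference between proportions p1 and p2."""
    v = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * v / (p1 - p2) ** 2
    return v, math.ceil(n)  # round up to a whole number of participants


for p2 in (0.1, 0.2, 0.25, 0.3, 0.35):
    v, n = sample_size_two_proportions(0.4, p2)
    print(f"P1=0.4  P2={p2}  P1-P2={0.4 - p2:.2f}  V={v:.4f}  N={n}")
```

In the usual formulation N is the number needed in each of the two groups being compared; the sheet does not say which reading is intended.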
N
17287
4322
1921
1080
691
14214
3553
1579
888
569
9604
2401
1067
600
384
DESIGN EFFECT:
The design effect (deff) is the variance of the outcome variable under the cluster design divided by the variance that would be obtained from a simple random sample of the same size.
STATA
With individual-level data, the design effect (deff) is calculated by default in STATA by the svy procedures.
Example of syntax:
use [dataset]
svymean [variable name], strata([name of district variable])
With aggregated data, the design effect (deff) can be calculated in STATA with blogit.
Example of syntax:
blogit nrpositive nrtotal, cluster(cluster)
blogit nrpositive nrtotal
(where nrpositive is the number of participants who were positive, and nrtotal is the total number of participants)
(the design effect is the ratio of the two variances, i.e. the square of the robust standard error from the first command divided by the square of the standard error from the second)
SPSS
In SPSS, the design effect (deff) is calculated with Analyze --> Complex Samples. Request the design effect with th
Excel
In Excel, the design effect (deff) can be calculated approximately as described hereafter:
Step 1: Calculation of the variance around the prevalence in a simple random sample (SRS)
district   sample size   no. positive   prevalence of TST infection
a          1200          98             0.0817
b          1100          81             0.0736
c          1050          72             0.0686
d          950           108            0.1137
e          1000          101            0.1010
f          900           87             0.0967
g          1150          83             0.0722
h          950           78             0.0821
i          1000          111            0.1110
j          900           98             0.1089
total      10200         917            0.0899
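Step 1 can be sketched in code as follows. The sheet does not show the formula it applies at this point, so the sketch assumes the usual binomial variance of a proportion, p(1-p)/n, applied to the pooled prevalence; the variable names are illustrative.

```python
# District data from the table above: (sample size, number TST positive).
districts = {
    "a": (1200, 98), "b": (1100, 81), "c": (1050, 72), "d": (950, 108),
    "e": (1000, 101), "f": (900, 87), "g": (1150, 83), "h": (950, 78),
    "i": (1000, 111), "j": (900, 98),
}

n_total = sum(n for n, _ in districts.values())
positives = sum(pos for _, pos in districts.values())
prevalence = positives / n_total  # pooled prevalence over all districts

# Assumed formula: variance of a proportion under simple random sampling.
var_srs = prevalence * (1 - prevalence) / n_total

print(f"total sample size = {n_total}")
print(f"total positive    = {positives}")
print(f"pooled prevalence = {prevalence:.4f}")
print(f"SRS variance      = {var_srs:.3e}")
```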
Step 2: Calculation of the variance around the prevalence in the cluster design
district   sample size   no. positive   prevalence of TST infection   (district prevalence - mean prevalence)^2
a          1200          98             0.081666666667                0.00006782
b          1100          81             0.073636363636                0.00026457
c          1050          72             0.068571428571                0.00045499
d          950           108            0.113684210526                0.00056560
e          1000          101            0.101000000000                0.00012317
f          900           87             0.096666666667                0.00004576
g          1150          83             0.072173913043                0.00031428
h          950           78             0.082105263158                0.00006079
i          1000          111            0.111000000000                0.00044513
j          900           98             0.108888888889                0.00036050
total      10200         917
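The two steps can be combined into an approximate design effect as sketched below. The sheet's own final formula is not shown here, so this rests on assumptions: the cluster-design variance is taken as the sum of the squared deviations divided by k(k-1) for k districts (a common approximation for the variance of a mean of cluster values), the mean prevalence is the pooled prevalence (which is what the squared deviations in the table are measured against), and the SRS variance is p(1-p)/n.

```python
# District data from the table above: (sample size, number TST positive).
districts = [
    (1200, 98), (1100, 81), (1050, 72), (950, 108), (1000, 101),
    (900, 87), (1150, 83), (950, 78), (1000, 111), (900, 98),
]

k = len(districts)
n_total = sum(n for n, _ in districts)
p = sum(pos for _, pos in districts) / n_total  # pooled (mean) prevalence

# Variance as if the data came from a simple random sample (assumed formula).
var_srs = p * (1 - p) / n_total

# Squared deviation of each district prevalence from the mean prevalence.
sq_dev = [(pos / n - p) ** 2 for n, pos in districts]

# Assumed approximation of the variance under the cluster design.
var_cluster = sum(sq_dev) / (k * (k - 1))

deff = var_cluster / var_srs
print(f"sum of squared deviations = {sum(sq_dev):.8f}")
print(f"cluster variance          = {var_cluster:.3e}")
print(f"SRS variance              = {var_srs:.3e}")
print(f"design effect (deff)      = {deff:.2f}")
```

With the data in the table this gives a design effect of roughly 3.7. A different variance estimator (for example one that weights districts by sample size) would give a somewhat different value.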
Step 0: Estimate the required number of children to be included in the survey, taking into account design effect, exclusions and if necessary BCG vaccination coverage
Assume 10,000 children will be included in the study
Step 1: list all the districts (administrative areas) in alphabetical order, with their population sizes, and calculate the cumulative population
District District population Cumulative population
A 556000 556000
B 125000 681000
C 245000 926000
D 73000 999000
E 156000 1155000
F 468000 1623000
G 74000 1697000
H 356000 2053000
I 64000 2117000
J 231000 2348000
K 639000 2987000
L 123000 3110000
M 54000 3164000
N 185000 3349000
O 354000 3703000
P 568000 4271000
Q 34000 4305000
Total 4305000
Step 2: decide on the number of clusters that will be sampled in the whole country
No. of clusters that will be sampled (the minimum recommended is 30): 30
Sampling interval is the total population / no. of clusters: 143500
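The random start number is a random number between 0 and 1 multiplied by the sampling interval (e.g. 0.84241947305981 x 143500 = 120887)
If we assume that the random number is 0.165547, then the random start number is 0.165547 x 143500 = 23756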
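The sampling interval and random start number can be computed as in the sketch below. The figures come from the worksheet; the variable names are illustrative, and the random number is fixed to the worksheet's example value so the result is reproducible (in a real survey it would be drawn once, for instance with random.random(), and recorded).

```python
total_population = 4_305_000  # sum of all district populations
n_clusters = 30               # number of clusters to be sampled

sampling_interval = total_population / n_clusters

# Example value from the worksheet; in practice draw it with random.random().
random_number = 0.165547
random_start = round(random_number * sampling_interval)

print(f"sampling interval   = {sampling_interval:.0f}")
print(f"random start number = {random_start}")
```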
Step 5: list the random start number, followed by the random start number plus successive multiples of the sampling interval:
random start number: 23756
sampling interval: 143500
Step 6: allocate the districts included in the sample on the basis of the random start number and sampling interval
random start number: 23756
random start number + sampling interval: 167256
random start number + 2*sampling interval: 310756
random start number + 3*sampling interval: 454256
random start number + 4*sampling interval: 597756
random start number + 5*sampling interval: 741256
random start number + 6*sampling interval: 884756
random start number + 7*sampling interval: 1028256
random start number + 8*sampling interval: 1171756
random start number + 9*sampling interval: 1315256
random start number + 10*sampling interval: 1458756
random start number + 11*sampling interval: 1602256
etc.
School 1 will be in the first district whose cumulative population is equal to or above 23756
School 2 will be in the first district whose cumulative population is equal to or above 167256
Step 7: apply the random start number to the cumulative population to determine the number of times a district will be sampled
District   District population   Cumulative population   Sampling order   # of samples per district
A          556000                556000                  1, 2, 3, 4       4
B          125000                681000                  5                1
C          245000                926000                  6, 7             2
D          73000                 999000                                   0
E          156000                1155000                 8                1
F          468000                1623000                 9, 10, 11, 12    4
G          74000                 1697000                 etc.             etc.
H          356000                2053000
I          64000                 2117000
J          231000                2348000
K          639000                2987000
L          123000                3110000
M          54000                 3164000
N          185000                3349000
O          354000                3703000
P          568000                4271000
Q          34000                 4305000
Total      4305000
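The allocation in Steps 5 to 7 is systematic sampling with probability proportional to size, and can be sketched as below. District populations, the number of clusters and the random start number are the worksheet's example values; the variable names are illustrative. Each sampling number is assigned to the first district whose cumulative population is equal to or above it.

```python
from bisect import bisect_left
from itertools import accumulate

# District populations in alphabetical order, as listed in the worksheet.
populations = {
    "A": 556000, "B": 125000, "C": 245000, "D": 73000, "E": 156000,
    "F": 468000, "G": 74000, "H": 356000, "I": 64000, "J": 231000,
    "K": 639000, "L": 123000, "M": 54000, "N": 185000, "O": 354000,
    "P": 568000, "Q": 34000,
}
n_clusters = 30
random_start = 23756  # worksheet example (random number 0.165547)

names = list(populations)
cumulative = list(accumulate(populations.values()))
interval = cumulative[-1] // n_clusters  # sampling interval

samples = {name: [] for name in names}
for order in range(1, n_clusters + 1):
    point = random_start + (order - 1) * interval
    # Index of the first district with cumulative population >= point.
    idx = bisect_left(cumulative, point)
    samples[names[idx]].append(order)

for name, cum in zip(names, cumulative):
    orders = ", ".join(map(str, samples[name]))
    print(f"{name}  {populations[name]:>7}  {cum:>8}  {len(samples[name])}  {orders}")
```

Large districts are selected several times and small districts may not be selected at all, which is the intended behaviour of this design.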
Step 8: randomly select the number of schools in district A needed to include the required number per cluster
If a total of 10,000 children will be included in the survey and 30 clusters are sampled, each cluster should deliver approximately 333 children.
District A is included 4 times, so 4*333=1332 children should be included in the survey in district A.
If the average number of eligible children in a school is 100 in district A, at least 14 schools should be sampled.
District B is included in the sample one time, so 333 children should be included in the survey in district B.
If the average number of eligible children in a school is 80 in district B, at least 5 schools should be sampled (333/80 = 4.2, rounded up).
etc.
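The arithmetic of Step 8 can be sketched as follows, assuming 10,000 children in total, 30 clusters, and the average school sizes quoted for districts A and B. The function name is illustrative, and the number of schools is rounded up so that the target number of children is reached.

```python
import math

total_children = 10_000
n_clusters = 30
children_per_cluster = round(total_children / n_clusters)  # about 333


def schools_needed(times_sampled, avg_eligible_per_school):
    """Return (children required, minimum number of schools) for a district."""
    children = times_sampled * children_per_cluster
    return children, math.ceil(children / avg_eligible_per_school)


for district, times, avg in (("A", 4, 100), ("B", 1, 80)):
    children, schools = schools_needed(times, avg)
    print(f"district {district}: {children} children, at least {schools} schools")
```

For district A this gives 1332 children and 14 schools; for district B, 333 children and 5 schools.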
Tubercle and Lung Disease 1996;77:Suppl 1-20
Example of BUDGET with most common budget components
Insert other components if necessary
Travelling expenses
Vehicle costs including maintenance and insurance
Fuel for planning visits
Fuel for field team during training
Fuel for field team during survey
Travel costs for supervision visits by investigator(s)
Travel costs for shifting camp between regions
Travel costs for trainer
Travel costs for epidemiologist/statistician
Travel costs to test smear-positives
Supplies
Tuberculin vials (training and survey)
Disposable 1 ml syringes (training and survey)
Transparent rulers
Cotton wool, alcohol, plasters
Vaccine carriers and icepacks for transportation of tuberculin
Waste buckets
Containers for used needles
Calculator
T shirts and caps for survey teams
Information leaflets (teachers and parents)
Stationery, printing and copy costs, stamps, etc.
Other office supplies (pens, staplers, envelopes, etc.)
Personal computer(s)
Printer(s)
Data storage devices (e.g. external hard disk, rewritable CDs, removable disk)
Incentives for children / schools
Refrigerator (plus gas cylinders)
Other costs
Costs of follow-up of children with induration (medicine, health care if not provided by NTP)
Insurance for field team and study population
Dissemination of results (meeting, printing of report, conference visit)
Total
Currency (US dollar, euro, etc.)