Research Methodology: NOVEMBER 26, 2010




Concept of Variables
• A magnitude that varies: formally, a mapping from the sample space S to the real numbers
• Discrete and Continuous Variables
• Qualitative and Quantitative Variables
• Dichotomous (Dummy) and Multiple
Response Variables (Ordered/Unordered)
• Dependent and Independent Variables
• Static and Time Varying Variables
• Count variables
Data Sources
• Secondary Data Sources
- Macro Level Data
- Reserve Bank of India: Covers information
on finance, banking and other
macroeconomic indicators (time series). It
also provides information about monetary
policy and operations, issues of
government securities, financial and
economic statistics, and the RBI museum.
a. www.rbi.org.in/
b. Handbook of Statistics on Indian Economy
Data Sources
• Census of India: Covers data on
demographic and related information
-website
-Various Census Reports and CD (2001
onwards)
Web: www.censusindia.gov.in/
Data Sources
• NSSO: Covers micro (individual and
household level) data on consumer
expenditure, employment, agriculture,
disability, health and morbidity,
unorganised manufacturing, service
sector, loan and payment etc.
- Reports:
http://www.mospi.gov.in/mospi_nsso_rept_pubn.htm
- Data CDs can be obtained for the
required rounds; the 64th round is the latest
Data Sources
• NFHS and DLHS: Cover micro data on
health, especially for children and women
aged 15-49
- Website: www.nfhsindia.org/
- Reports are also available
- NFHS-3 (2005-06) and DLHS (2007-08)
are the latest
Other Data Sources
• Reports and Data from Different Ministries,
Planning Commission
• Annual Survey of Industries (ASI) Data
• Capita Line
• CMIE reports and data: Infrastructure,
Road, Train, Health Expenditure, etc.
• Statistical abstracts
• Many NGOs and agencies collect data, e.g.
Crisil, OAF, MFIs
Designing Questionnaires

Step 1: Define the aims of the
study
• Write out the problem and primary and
secondary aims using one sentence per
aim. Formulate a plan for the statistical
analysis of each aim. For example,
prepare a set of dummy tables

Step 2: Define the variables
to be collected
• Write a detailed list of the information to be
collected and the concepts to be measured
in the study. Are you trying to identify:
– Attitudes
– Needs
– Behavior
– Demographics
– Some combination of these concepts

Step 2 contd..

• Translate these concepts into variables
that can be measured
• Define the role of each variable in the
statistical analysis:
– Predictor
– Qualitative/Quantitative
– Outcome

Step 3: Review the literature
• Review current literature to identify related
surveys and data collection instruments
that have measured concepts similar to
those related to your study’s aims.
• Reusing existing instruments saves
development time and allows for
comparison with other studies if used
appropriately.
Step 4: Specify the Type of
Interviewing method
• Determine how the questionnaire will be
administered: self-administered, telephone
survey, face-to-face interview, web-based
survey.
• At the top, clearly state:
– The purpose of the study
– How the data will be used
– Instructions on how to fill out the questionnaire
– Your policy on confidentiality
Step 5: Decide on the question
structure, wordings etc.
• There should be a logical progression
such that the informant is drawn into the
interview by awakening his/her interest.
For this, the introductory item should be as
attention-catching as possible; start with
neutral (but relevant) items rather than
controversial ones.
Step 5: Decide on the question
structure, wordings etc.
• Progress from simple to complex questions:
time-consuming questions should come
neither too early nor too late
• The respondent should not be affronted by an
early and sudden request for personal
information
• Avoid ambiguous and complex wordings:
avoid words such as usually, normally, often,
regularly
Step 5: Decide on the question
structure, wordings etc.
• Never ask a respondent to answer an
embarrassing question without giving
him/her an opportunity to explain. This should be
done whether or not the researcher is
interested in the “why” question. For
example, if the question “Did you
vote in the last local election?” is asked, it should be
followed by the question “If not, why did
you not vote?”
Step 5: Decide on the question
structure, wordings etc.
• Respondents must be brought as
smoothly as possible from one frame of
reference to another rather than made to
jump back and forth.
Step 5: Decide on the question
structure, wordings etc.
• No interview or questionnaire should be
completed without an expression of
appreciation for the effort put in by the
respondents.
Step 6: Design the Questions
• Question: How many cups of coffee or tea
do you drink in a day?
• Principle: Ask for an answer in only one
dimension.
• Solution: Separate the question into two
– (1) How many cups of coffee do you drink
during a typical day?
– (2) How many cups of tea do you drink during a
typical day?
Step 6: Design the Questions
• Question: What brand of computer do you own?
– (A) IBM PC
– (B) Apple
• Principle: Avoid hidden assumptions. Make sure to
accommodate all possible answers.
• Solution:
– (1) Make each response a separate dichotomous item
• Do you own an IBM PC? (Circle: Yes or No)
• Do you own an Apple computer? (Circle: Yes or No)
– (2) Add necessary response categories and allow for
multiple responses.
• What brand of computer do you own? (Circle all that apply)
– Do not own computer
– IBM PC
– Apple
– Other
Step 6: Design the Questions
• Question: Have you had pain in the last
week?
[ ] Never [ ] Seldom [ ] Often [ ] Very
often
• Principle: Make sure question and answer
options match.
• Solution: Reword either question or answer
to match.
– How often have you had pain in the last week?
[ ] Never [ ] Seldom [ ] Often [ ] Very Often
Step 6: Design the Questions
• Question: Where did you grow up?
– Country
– Farm
– City
• Principle: Avoid questions having non-
mutually exclusive answers.
• Solution: Design the question with
mutually exclusive options.
– Where did you grow up?
• House in the country
• Farm in the country
• City
Step 6: Design the Questions

• Question: Are you against drug abuse?
(Circle: Yes or No)
• Principle: Write questions that will
produce variability in the responses and
that do not lead the respondent toward a
particular answer (bias).
• Solution: Eliminate the question or
rephrase it appropriately.
Step 6: Design the Questions

• Question: Do you like to fly when
travelling short distances? (Circle: Yes or
No)
• Principle: Avoid implicit alternatives.
• Solution: State the alternative explicitly, for example:
Do you like to fly when travelling short
distances, or would you rather drive?
(Circle: Fly or Drive)
Step 6: Design the Questions
• Question: Are you in favour of a balanced budget?
(Circle: Yes or No)
• Principle: Avoid implicit assumptions (not stated in
questions).
• Solution: Rephrase the question, for example:
Are you in favour of a balanced budget:
(a) if it would result in an increase in the personal income
tax? (Circle: Yes or No)
(b) if it would result in a cut in defense expenditures?
(Circle: Yes or No)
(c) if it would result in a cut in social programs? (Circle:
Yes or No)
Step 6: Design the Questions
• Question: What amount of your household's
annual income comes from each of the
following sources?

Sources of income                | Amount in INR
Agriculture                      |
Agricultural wages               |
Livestock                        |
Non-agricultural wages           |
Salary                           |
Self-employed (business etc.)    |
Others (pensions, interest etc.) |
Step 6: Design the Questions
• Question: Which one of the following do you think
increases a person’s chance of having a heart attack the
most? (Check one.)
[ ] Smoking [ ] Being overweight [ ] Stress
• Principle: Encourage the respondent to consider each
possible response to avoid the uncertainty of whether a
missing item may represent either an answer that does not
apply or an overlooked item.
• Solution: Which of the following increases the chance of
having a heart attack?
– Smoking: [ ] Yes [ ] No [ ] Don’t know
– Being overweight: [ ] Yes [ ] No [ ] Don’t know
– Stress: [ ] Yes [ ] No [ ] Don’t know
Step 6: Design the Questions

• Question: What is the annual per capita
expenditure on groceries in your household?
• Principle: Avoid generalizations and
estimates.
• Solution: Break this question into two:
(a) What is the monthly expenditure on groceries
in your household?
(b) How many members are there in your
household?
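The figure originally asked for can then be derived by the researcher from the two answers. A minimal Python sketch, assuming the monthly figure is representative of the whole year; the function name and the example amounts are hypothetical:

```python
def annual_per_capita_expenditure(monthly_household_expenditure, household_members):
    """Derive annual per capita grocery expenditure from answers (a) and (b)."""
    if household_members <= 0:
        raise ValueError("household_members must be a positive number")
    # 12 months of household spending, shared equally among the members
    return 12 * monthly_household_expenditure / household_members


# Hypothetical answers: INR 6,000 per month, 4 household members
print(annual_per_capita_expenditure(6000, 4))  # 18000.0
```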
Step 6: Design the Questions
• Question:
– (1) Do you currently have a life insurance policy?
(Circle: Yes or No)
– If no, go to question 3.
– (2) How much is your annual life insurance
premium?
• Principle: Avoid branching as much as
possible to avoid confusing respondents.
• Solution: If possible, write as one question.
– How much did you spend last year for life
insurance? (Write 0 if none).
Step 6: Design the Questions

• Question: Is Colgate your favourite
toothpaste? (Circle: Yes or No)
• Principle: Bias in the response may arise
if clues are given. Respondents tend to
respond favourably toward the researcher.
• Solution: Rephrase the question as:
What is your favourite toothpaste brand?
(You may or may not provide response options.)
Step 7: Revise
• Shorten the set of questions for the study. If
a question does not address one of your
aims, discard it.
• Refine the questions and their wording by
testing them with a variety of respondents.
– Ensure the flow is natural.
– Verify that terms and concepts are familiar and
easy to understand for your target audience.
– Keep recall to a minimum and focus on the
recent past.
Step 8: Assemble the final
questionnaire
• Include identifying data on each page of a multi-
page questionnaire such as a respondent ID
number in case the pages separate.
• Group questions concerning major subject areas
together and introduce them by heading or short
descriptive statements.
• Ordering of questions should serve to stimulate
recall.
• Ordering and formatting of questions should be
unbiased and balanced.
• Include white space to make answers clear and to
help increase response rate.
Step 8: Assemble the final
questionnaire
• Space response scales widely enough so that it is
easy to circle or check the correct answer without
the mark accidentally including the answer above
or below.
– Open-ended questions: the space for the response
should be big enough to allow respondents with large
handwriting to write comfortably in the space.
– Closed-ended questions: line up answers vertically and
precede them with boxes or brackets to check, or by
numbers to circle, rather than open blanks.
• Use larger font size (e.g., 14) and high contrast
(black on white).
Conclusions
• You need plenty of time!
– Design your questionnaire from research hypotheses
that have been carefully studied and thought out.
– Discussing the research problem with colleagues and
subject matter experts is critical to developing good
questions.
– Review, revise and test the questions on an iterative
basis.
– Examine the questionnaire as a whole for flow and
presentation.

HOW TO COLLECT
PRIMARY DATA?

How to Collect Data from
survey?
Two ways of data collection
• Census
- is an attempt to contact every individual
in the entire population
- Consumes time and money; practical only for
very small populations
- Populations rarely stand still: the population
changes while you take the census
• Sampling (we will discuss in detail)
Defining Population
• Population: an aggregate of objects,
animate or inanimate, under study
- These can be countries, lab rats, light
bulbs, university students, banks, residents
of a particular area, regional health
authorities etc.
• The population for a study of infant health
might be all children born in India in the
1980s.
Defining Sample
• Sample: a finite subset of statistical
individuals selected from the population.
• Using the example of the study of infant health, the
sample might be all babies born on 7th May
in any of the years.
• A properly selected sample can represent
the population
Sampling and representativeness
[Figure labels: Target Population; Sampling or accessible Population; Sample]
Target Population → Sampling Population → Sample
Population and sample
[Figure: Population (parameters, unknown) → sampling process → Sample (data; statistics, known); the statistics are used to draw conclusions about (estimate) the parameters]
Population versus sample
• Population: The entire group of
individuals in which we are interested
but can’t usually assess directly
• Sample: The part of the population we
actually examine and for which we do have
data
• A parameter is a number describing a
characteristic of the population. The
parameter value for the population is a
fixed number but is sometimes unknown,
which is why we try to estimate it.
• A statistic is a number describing a
characteristic of a sample. The value of a
statistic is always known, as it comes from a
known sample, but its value changes from
sample to sample.
• We use the statistic to estimate the unknown
population parameter
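The distinction between a fixed parameter and a statistic that changes from sample to sample can be illustrated with a small simulation. This is only a sketch: the population is artificial (10,000 simulated values), and the seed and sample size are arbitrary choices made so that the run is repeatable.

```python
import random
import statistics

rng = random.Random(2010)  # fixed seed only so the sketch is repeatable

# Hypothetical population; in a real study it is not fully observed
population = [rng.gauss(50, 10) for _ in range(10_000)]
parameter = statistics.mean(population)  # a fixed number, usually unknown

# Three samples of 100 units each give three values of the statistic
sample_means = [statistics.mean(rng.sample(population, 100)) for _ in range(3)]

print(f"parameter (population mean): {parameter:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"statistic from sample {i}: {m:.2f}")
```

Each printed sample mean is a known number, but it differs from draw to draw, while the population mean stays the same.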
What is sampling?

Procedure by which some members of a
given population are selected as
representatives of the entire population
Why sampling not census ?

To get information from large populations
– At minimal cost in terms of both
money and time
– At maximum speed, as only a part of
the population has to be examined
Why sampling not census ?
To get information from large populations
- With increased accuracy, as in sampling
it is possible to determine the extent of the
sampling error, while non-sampling errors
can be controlled by employing qualified,
trained and experienced personnel, better
supervision and better equipment for
processing and analyzing the limited data
Why sampling not census ?

To get information from large populations
– With greater scope, as sampling remains practically
achievable when the population is too
large for a census; for example, for
trees in a jungle we are left with no way
but to resort to sampling.
Why sampling not census ?
In some situations census is not practical
• If testing is destructive, like testing the quality
of milk or chemical salt by analysis, testing
the breaking strength of chalks, testing
crackers and explosives, or testing the life
of an electric tube or bulb.
• If the population is hypothetical, as in the
coin-tossing problem where the process may
continue indefinitely, then sampling is
the only scientific method of estimating the
parameters of the universe.
Sampling: Balance between
Precision and cost

Precision
Cost

-More precision, more money and time required

Principal Steps in a sample
Survey
1. Define the objectives of the survey
2. Define the target Population
3. Select a sampling frame
4. Decide on the sampling technique/s
5. Select sampling units
6. Determine sample size
7. Select actual sampling units
8. Prepare Questionnaire or schedule
9. Select Method of Collecting information
10. Conduct field work
11. Preparing data for analysis
12. Analyze the collected data
13. Reporting of findings and conclusions
Step 1: Define the objectives of the survey

- Must be defined in clear, lucid and concrete
terms
- Without this, it is easy in a complex survey to
forget the objectives when engrossed in
details of planning, and to make decisions
that are at variance with the objectives.
- The sponsors of the survey should take care
that these objectives are commensurate with
the available resources in terms of money,
manpower and time limit required
Step 2. Define the target Population
- Must be defined in clear and unambiguous
terms.
- Must be relevant and operationally defined.
For example, in sampling of farms, clear-cut
rules must be framed to define a farm
regarding shape, size, etc., keeping in mind
the borderline cases, so as to enable the
investigator to decide in the field without
much hesitation whether or not to include a
given farm in the population.
- The sampled population should coincide
with the population about which information
is wanted (target population)
Step 3. Select a sampling frame

• A sampling frame is any list, map or other
material containing all the sampling units in
the population. For example, a list of
households taken from the census, school lists,
trade association lists, attendance registers
of students, a telephone directory etc.
- It determines the structure of the sample
survey.
- It should be almost or exactly identical to the
entire population. Ideally, a one-to-one
correspondence exists between the frame
units and the population units
Step 3. Select a sampling frame

Two problems of frames
1. Over-registration: the frame contains all the
target population units plus some additional units
2. Under-registration: the frame contains fewer
units than does the target population
- Frames should be carefully scrutinized and
examined to ensure that they are up to date.
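When both the frame and a reference list of target units are available, the two frame problems can be checked mechanically. A minimal Python sketch; the function name and the household IDs are hypothetical, and in practice a full reference list of the target population is often not available.

```python
def frame_coverage(target_units, frame_units):
    """Compare a sampling frame with the target population."""
    target, frame = set(target_units), set(frame_units)
    return {
        "over_registered": sorted(frame - target),   # in the frame, not in the target
        "under_registered": sorted(target - frame),  # in the target, missing from the frame
    }


# Hypothetical household IDs
target = ["H1", "H2", "H3", "H4"]
frame = ["H1", "H2", "H4", "H9"]
print(frame_coverage(target, frame))
# {'over_registered': ['H9'], 'under_registered': ['H3']}
```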
Step 4. Decide on the sampling technique/s

- Depends upon the nature of the data and
type of enquiry
- Method should be impersonal to avoid
favoritism by the sampler and voluntary
response
Types of Sampling

PROBABILITY SAMPLING

NONPROBABILITY SAMPLING

Probability Sampling
A probability provides a quantitative description
of the likely occurrence of a particular event.
Probability sampling is the scientific method of
selecting samples according to some law of
chance in which each unit in the population
has some definite, pre-assigned, known, non-zero
probability of being selected in the
sample.
The pre-assigned probabilities may be equal for
all units, may differ between units, or may be
proportional to the size of the unit.
Types of Probability Sampling

SIMPLE RANDOM SAMPLING

STRATIFIED SAMPLING

SYSTEMATIC RANDOM SAMPLING

CLUSTER SAMPLING

MULTISTAGE RANDOM SAMPLING


Simple Random Sampling
- is an EPSEM (equal probability of selection
method) technique of drawing a sample in
such a way that
- Each sample unit has an equal chance
of being selected
- Selection of each sample unit is
independent of others
In short,
Every possible combination of sampling units has an
equal and independent chance of being selected
Examples of SRS

• Suppose a mental health care trust wants
to assess the quality of service provided to
clients. A list of clients over the past year
(the sampling frame) can be obtained from the
office, and a simple random sample can then
be drawn from it.
[Figure: Simple Random Sampling, a random subsample drawn from the list of clients]
Types of SRS
SRS can be used in two ways
1. SRS with replacement (SRSWR): each unit drawn
is returned to the population, so that the
population size remains the same before every
draw
2. SRS without replacement (SRSWOR): a unit
once selected is not returned and cannot be
selected again
- SRSWOR provides better precision in
comparison to SRSWR
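The two schemes map onto two functions in Python's standard `random` module. The sketch below uses a hypothetical population of ten numbered units and an arbitrary seed.

```python
import random

rng = random.Random(26)          # arbitrary seed, for a repeatable illustration
population = list(range(1, 11))  # hypothetical units numbered 1..10

# SRSWR: each draw is made from the full population, so repeats are possible
srswr = rng.choices(population, k=5)

# SRSWOR: a selected unit cannot be drawn again, so all units are distinct
srswor = rng.sample(population, k=5)

print("with replacement:   ", srswr)
print("without replacement:", srswor)
```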
How to choose Random Sample
Procedure - use a table of random numbers, a computer
random number generator or a mechanical
device
Step I: Assign each unit within the
sampling frame a unique number
between 1 and N (sampled population).
Step II: Identify a random start in the random
number table.
Step III: Determine how the digits in the random
number table will be assigned to the
sampling frame.
Step IV: Select the sample elements from the
sampling frame.
A Practical Example
Problem: Draw a random sample without replacement (WOR) of size 10 from a
population of size 1000.
Solution:
Step I: Assign each unit within the
sampling frame a unique number between
1 and 1000.
Step II: Identify a random start from the random
number table say we start from.
Step III: Determine how the digits in the random
number table will be assigned to the
sampling frame.
Step IV: Select the sample elements from the sampling
frame.
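With a computer random number generator in place of the random number table, the same problem can be sketched in a few lines of Python. This swaps the table lookup of Steps II and III for the generator; the variable names are illustrative only.

```python
import random

N, n = 1000, 10
rng = random.Random()      # pass a seed, e.g. random.Random(42), to reproduce a draw

frame = range(1, N + 1)    # Step I: units numbered 1 to N
# Steps II-IV: the generator picks n distinct unit numbers (without replacement)
sample = sorted(rng.sample(frame, n))
print(sample)
```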
Merits and Demerits of SRS
Merits:
1. Eliminates the element of subjectivity or personal
bias, since every unit has an equal probability of selection
2. Simple statistical analysis

Demerits:
1. Problem with sampling frame: needs an up-to-date
frame, which is difficult to get
2. Useful only in small and/or homogeneous areas
3. Requires larger sample size
4. Time consuming/inefficient
Stratified Random Sampling

Useful in the following situations
1. When the population under study consists
of (i) literates and illiterates or (ii) people
living in institutions, hostels, hospitals etc.
and those living in ordinary homes.
2. When the population is heterogeneous in
nature and a sampling frame is available
Stratified Random Sampling
• A population is divided into mutually
exclusive subpopulations of known size, and
a simple random sample is selected in each
subpopulation.
• Each subpopulation is called a stratum
• The criterion which enables us to classify
various sampling units into different strata is
called the stratifying factor (s.f.). For example:
age, sex, educational or income level,
geographical area, economic status and so
on.
Stratified Random Sampling
• An s.f. is effective if it divides the given population
into different strata such that units in each stratum
are
- Homogeneous within themselves
- Heterogeneous/unlike between different
strata
• Suppose a farmer wishes to work out the average
milk yield of each cow type in his herd, which
consists of Ayrshire, Friesian, Galloway and Jersey
cows. He could divide up his herd into the four
sub-groups and take samples from these (Easton and
McColl 2004).
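The herd example can be sketched in Python: group the units by the stratifying factor, then draw a simple random sample without replacement inside each stratum. The herd data, the per-stratum sample size of 3 and the function name are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, sizes, rng):
    """Draw a simple random sample (without replacement) inside each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}


# Hypothetical herd of 40 cows: (cow id, breed), 10 cows per breed
breeds = ["Ayrshire", "Friesian", "Galloway", "Jersey"]
herd = [(i, breeds[i % 4]) for i in range(40)]

sizes = {breed: 3 for breed in breeds}  # assumed sample size per stratum
chosen = stratified_sample(herd, lambda cow: cow[1], sizes, random.Random(7))
for breed, cows in chosen.items():
    print(breed, cows)
```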
Stratified Random Sampling
[Figure: a population divided into four strata (Stratum 1 to Stratum 4) of sizes N1, N2, N3, N4, with samples of sizes n1, n2, n3, n4 drawn from them]
N = N1 + N2 + N3 + N4
n = n1 + n2 + n3 + n4
Allocation of Sample Size
• Proportional Allocation
Here the sample size allocated to each stratum is
directly proportional to the population size of that
stratum, i.e. the larger the population of a stratum,
the larger the sample drawn from it.
Suppose a population of size N (say 600) is divided
into 4 strata of sizes N1 (say 100), N2 (say 200), N3 (say
50) and N4 (say 250). Now, suppose a total of n (say 30) units has
to be drawn. Then the sample sizes for these strata will
be n1 = n*N1/N = 5, n2 = n*N2/N = 10, n3 = n*N3/N = 2.5
and n4 = n*N4/N = 12.5. Since sample sizes must be whole
numbers, these are rounded to n3 = 3 and n4 = 12, so that
5 + 10 + 3 + 12 = 30
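The allocation arithmetic can be sketched in Python. The rounding rule used here (largest remainder, with ties going to the earlier stratum) is an assumption chosen so that the allocations are whole numbers that sum to n; the slides do not state a rounding rule.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units to strata in proportion to stratum size."""
    N = sum(stratum_sizes)
    alloc = [n * N_h // N for N_h in stratum_sizes]      # whole part of n*N_h/N
    remainders = [n * N_h % N for N_h in stratum_sizes]  # fractional part, scaled by N
    leftover = n - sum(alloc)
    # Give the remaining units to the strata with the largest fractional parts;
    # the sort is stable, so ties go to the earlier stratum.
    order = sorted(range(len(stratum_sizes)), key=lambda h: -remainders[h])
    for h in order[:leftover]:
        alloc[h] += 1
    return alloc


print(proportional_allocation([100, 200, 50, 250], 30))  # [5, 10, 3, 12]
```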
Merits of Stratified Sampling
1. More representative: Rules out the possibility
of exclusion of any essential group of the
population.
2. Greater accuracy: Reduces variation among
sample units within strata
3. Administrative convenience: as samples are
more concentrated geographically.
4. More efficient: since variance differs between
the strata
5. Data can be interpreted separately by stratum or pooled
Demerits of Stratified Sampling

1. Statistical analysis more complex
2. May leave some areas within each
stratum unsampled (poor dispersion)
Systematic Random Sampling

- Systematic sampling, sometimes called
interval sampling, means that there is a
gap, or interval, between each selection.
- This consists in selecting only the first unit
at random, the rest being automatically
selected according to some predetermined
pattern involving regular
spacing of units
Systematic Random Sampling
Procedure
• Identify the total number of elements in the
population and number the units from 1 to N
• Decide on the desired sample size n
• Identify the sampling ratio (interval) k = N/n
• Randomly select a number from 1 to k
• Draw a sample by choosing every kth entry,
starting from the selected number
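The procedure can be sketched in Python. The sketch assumes N is an exact multiple of n, so that k = N/n is a whole number; the function name and seed are illustrative.

```python
import random


def systematic_sample(N, n, rng):
    """Select every kth unit from 1..N after a random start between 1 and k."""
    k = N // n                 # sampling interval (assumes N is a multiple of n)
    start = rng.randint(1, k)  # random start, 1 and k both included
    return list(range(start, N + 1, k))


sample = systematic_sample(100, 20, random.Random(4))
print(sample)
```

With N = 100 and n = 20 the interval is k = 5; if the random start happens to be 4, the selected units are 4, 9, 14 and so on up to 99.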
Systematic Sampling
[Figure: Sample 1, Sample 2, Sample 3]
Systematic Random Sampling
[Figure: units numbered 1 to 100 arranged in a grid]
N = 100
want n = 20
N/n = 5
select a random number from 1-5: chose 4
EXAMPLE OF
SYSTEMATIC
RANDOM SAMPLE

When to apply Systematic
Random Sampling?

Commonly employed technique if
- the population is randomly ordered, like a set
of files, trees in a jungle etc.
- a complete and up-to-date list of the
sampling units is available.
Use of Systematic Sampling
1. Often used in industry, where an item is selected for testing
from a production line (say, every fifteen minutes) to ensure
that machines and equipment are working to specification.
Alternatively, the manufacturer might decide to select every
20th item on a production line to test for defects and quality.
This technique requires the first item to be selected at
random as a starting point for testing and, thereafter, every
20th item is chosen.

2. Used when questioning people in surveys, e.g. a market
researcher selecting every 10th person who enters a
particular store, after selecting a person at random as a
starting point; interviewing occupants of every 5th house in a
street, after selecting a house at random as a starting point.

3. In surveys of trees in a jungle
Merits of Systematic Sampling
1. Operationally more convenient: units are
easily allocated
2. Good representativeness
3. Efficient
Demerits of Systematic Sampling
1. Biased if the systematic layout coincides with an
environmental or vegetation pattern
2. Not in general random, since the requirement of a
randomly arranged frame is rarely fulfilled
3. If N is not a multiple of n then the actual sample
size may differ from that required

Cautions while using Systematic Sampling
- Avoid using systematic random sampling if
there is some underlying pattern or periodicity in
your sampling frame (see the handout).
- Periodicity is rare, so don’t be paranoid about it!
Summary
[Figure: comparison of systematic, random and stratified sampling]
Cluster sampling
- is a simple random sample of groups or
clusters of elements, i.e. a sampling
technique where the entire population is
divided into groups, or clusters, and a
random sample of these clusters is
selected. All observations in the selected
clusters are included in the sample.
- Every element should have a specified
(equal) chance of being selected into the
final sample.
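Single-stage cluster sampling can be sketched in Python: clusters are drawn by simple random sampling and every element of a selected cluster enters the sample. The block names, household IDs and function name below are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters by SRS and keep every element in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical blocks, each listing its household IDs
blocks = {
    "Block A": ["A1", "A2", "A3"],
    "Block B": ["B1", "B2"],
    "Block C": ["C1", "C2", "C3", "C4"],
    "Block D": ["D1", "D2", "D3"],
}
selected = cluster_sample(blocks, 2, random.Random(1))
for block, households in selected.items():
    print(block, households)
```

Only a list of blocks is needed as the sampling frame; the households are enumerated only inside the selected blocks.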
Cluster sampling

[Figure: a population area divided into clusters: Block A, Block B, Block C, Block D, Block E, Block F]
Use of Cluster sampling
This procedure is useful when
- it is difficult and costly to develop a
complete list of the population members,
but a complete list of groups or
'clusters' of the population can be obtained
- the survey involves a large,
geographically dispersed population, as is
common in developing societies
- one wants to sample economically
while retaining the characteristics of a
probability sample.
Merits & Demerits of Cluster
sampling
Merits
1. Requires less prior information than
stratified sampling.

Demerits
1. Cluster sampling increases sampling error,
because there are probably similarities
among cluster members.
Cautions while using Cluster
sampling
1. Clusters should be as small as possible, consistent
with the cost and limitations of the survey.
2. The number of sampling units in each cluster
should be approximately the same.
3. Cluster sampling is not recommended when we are
sampling areas in a city where there are private
residential houses, business and industrial
complexes, apartment buildings etc. with widely
varying numbers of persons or households.
Multistage sampling
Sampling is carried out in stages.
- First-stage units are selected by some suitable
method of sampling.
- From among the selected first-stage units, a
sub-sample of second-stage units is drawn
by some suitable method of sampling, which
may be the same as or different from the method
used in selecting first-stage units, and so
on.
Any one of the previous sampling schemes
can be applied during each stage.
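A two-stage version can be sketched in Python, using simple random sampling without replacement at both stages (other schemes could be substituted at either stage). The village and household names, the sample sizes and the function name are hypothetical.

```python
import random


def two_stage_sample(first_stage_units, n_first, n_second, rng):
    """Stage 1: SRS of first-stage units. Stage 2: SRS within each selected unit."""
    chosen_first = rng.sample(sorted(first_stage_units), n_first)
    return {
        unit: rng.sample(first_stage_units[unit],
                         min(n_second, len(first_stage_units[unit])))
        for unit in chosen_first
    }


# Hypothetical frame: 10 villages with 20 households each
villages = {
    f"Village {v}": [f"V{v}-H{h}" for h in range(1, 21)] for v in range(1, 11)
}
sample = two_stage_sample(villages, 3, 5, random.Random(3))
for village, households in sample.items():
    print(village, households)
```

A household list is needed only for the three selected villages, which is the cost advantage of the multistage design.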
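Multi-Stage sampling
[Figure: three-stage design. Labels: Sampled Population (N); 1st stage (N’, SRSWOR); 2nd stage (N’’, SRSWOR); 3rd stage (stratified sampling: strata Str1 and Str2 of sizes N1 and N2, with samples n1 and n2); final sample n]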
Multistage sampling
• The procedure is useful in surveys of underdeveloped
areas or pockets where no up-to-date
and accurate frame is available for
subdivision of the material into reasonably
small sampling units.
• Examples of this kind of sampling are the Census and the
National Safety Belt Survey, where we need
national samples of households or people.
Merits and Demerits of
Multistage sampling
Merits
1. More flexible
2. Simple to carry out
3. Administrative convenience: by permitting the
field work to be concentrated and yet covering
a large area
4. Cost effective: a second-stage frame is needed only
for those units which are selected in the first-stage
sample
Demerits
1. Less efficient
Difference Between Cluster Sampling
and Simple Random Sample

1. A cluster sample is less efficient than a
simple random sample of the same size,
but it may cost considerably less.
2. Efficiency can be measured in terms of
the size of the standard error of the estimate; a
small standard error indicates high
efficiency.
Difference Between Cluster
Sampling and Stratified Sampling
Although both types of sample involve dividing the
population into groups, they involve opposite
sampling operations.
- In a stratified sample, we sample individuals within
every stratum. The sampling errors involve
variability within strata. Strata are supposed to be
as homogeneous as possible and as different as
possible from each other.
- In (single-stage) cluster sampling, we have no
source of sampling error within the clusters
because every case is being used. The variability is
between the clusters.
Types of Nonprobability Samples
• Convenience, accidental, haphazard

• Judgemental

• Quota

• Snowball

Accidental, Haphazard or
Convenience Sampling
- Attempts to obtain a sample of people or
units that are most convenient
Examples: “man on the street”, mental
health branch students, available or
accessible clients, volunteer samples etc.
• Least expensive, least time consuming;
sampling units are accessible, easily measured
and cooperative.
• Problem: we have no evidence of
representativeness
Judgemental Sampling
- A form of convenience sampling in which the
population elements are selected based on the
judgement of the researcher about some
appropriate characteristics required of the
sample members
Examples: test markets selected to determine the
potential of a new product, expert witnesses used
in court, purchase engineers selected in industrial
marketing research because they are considered
to be representative of the company
Quota Sampling
- May be viewed as two-stage restricted judgemental
sampling: the 1st stage consists of developing
control categories, or quotas, of population elements,
and in the 2nd stage sample elements are selected
based on convenience or judgement.
- Select people non-randomly according to some
quotas
Example: very effective in magazine readership studies
• Requires that the various subgroups in a population
are represented.
• It should not be confused with stratified sampling.
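
A minimal sketch of the two stages, under stated assumptions: the quotas by sex and the stream of passers-by are hypothetical, and the second stage simply takes people in arrival order (convenience) until every quota is full.

```python
def quota_sample(respondents, quotas, category_of):
    """Stage 1: quotas per control category are given.
    Stage 2: take respondents in arrival order until each quota is full."""
    remaining = dict(quotas)
    selected = []
    for person in respondents:
        cat = category_of(person)
        if remaining.get(cat, 0) > 0:
            selected.append(person)
            remaining[cat] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected, remaining

# Hypothetical stream of passers-by: (id, sex)
stream = [(1, "F"), (2, "F"), (3, "M"), (4, "F"), (5, "M"), (6, "M"), (7, "F")]
selected, unfilled = quota_sample(stream, {"F": 2, "M": 2}, lambda p: p[1])
print(selected)
```

Nothing in the selection is random, which is the difference from stratified sampling: the subgroups are represented, but the units within each subgroup are not drawn with known probabilities.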
Snowball Sampling
• An initial group of respondents is selected
randomly. Subsequent respondents are selected
based on the referrals or information provided by
the initial respondents.
• One person recommends another, who
recommends another, who recommends another,
etc.
• Good way to identify hard-to-reach populations,
for example homeless persons or drug users
• Advantage: it substantially increases the
likelihood of locating the desired characteristic in
the population, resulting in relatively low sampling
variance and cost.
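
The referral chain can be sketched as a breadth-first walk over a referral list. The respondents "A" to "E" and their referrals are made up, and the function name `snowball_sample` is illustrative.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from the seed respondents and follow the people each
    respondent refers, wave by wave, until max_size is reached."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:  # do not recruit anyone twice
                seen.add(referred)
                queue.append(referred)
    return sampled

# Hypothetical referral network.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["A"]}
wave = snowball_sample(["A"], referrals, max_size=4)
print(wave)
```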
Choosing Nonprobability vs.
Probability Sampling
Choice should be based on
1. Nature of research: In exploratory research,
the findings are treated as preliminary and the use
of probability sampling may not be warranted. On
the other hand, in conclusive research, where the
researcher wishes to use the results to estimate
overall market shares or the size of the total
market, probability sampling is favored.
2. Relative magnitude of nonsampling vs.
sampling errors: If nonsampling error is an
important factor, then nonprobability sampling may
be preferable, as the use of judgment may allow
greater control over the sampling process.
Choosing Nonprobability vs.
Probability Sampling
3. Variability in the population: A more
heterogeneous population would favor probability
sampling, as it would be more important to secure
a representative sample.
4. Statistical considerations: As probability
sampling is the basis of most common statistical
techniques, it is preferable from a statistical point
of view.
5. Operational considerations: Probability
sampling is sophisticated and requires statistically
trained researchers, costs more and takes longer,
so it is harder to use in practice.
Choosing Nonprobability vs.
Probability Sampling: Summary
Conditions favoring the use of:

Factors                                                 Non-probability sampling       Probability sampling
Nature of research                                      Exploratory                    Conclusive
Relative magnitude of sampling and nonsampling errors   Nonsampling errors are larger  Sampling errors are larger
Variability in the population                           Homogeneous (low)              Heterogeneous (high)
Statistical considerations                              Unfavorable                    Favorable
Operational considerations                              Favorable                      Unfavorable
Step 5: Determine sample size

Factors influencing sample size are
1. Importance of the decisions
2. Variability of the characteristics and the
predetermined accuracy desired for the estimates
of the characteristics under study: precision required
3. Probability level considered for the accuracy
of the estimate: confidence level
4. Cost of sampling
Data elements to determine sample size

Data elements needed to determine
sample size are
• Mean Value
• Standard Error
• Accuracy level
• Confidence Level
Formula for Determining
Sample Size with an Example

Example: Determining a sample size to
estimate the mean number of years of schooling
completed by persons with foreign-born
parents.

Accuracy level = ±0.1
Confidence level = 95%
σ (standard deviation of the population) = 2.5

$$\bar{X} \pm 1.96\,\frac{\sigma}{\sqrt{N}}$$

$$0.1 = 1.96\,\frac{2.5}{\sqrt{N}}$$

$$\sqrt{N} = 1.96\,\frac{2.5}{0.1} = 49$$

$$N = 2{,}401$$
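
The same arithmetic can be checked with a short Python sketch. The function name `sample_size_for_mean` is illustrative. It takes the population standard deviation as known and uses z = 1.96 for the 95% confidence level, as in the example.

```python
import math

def sample_size_for_mean(sigma, accuracy, z=1.96):
    """Smallest N such that z * sigma / sqrt(N) <= accuracy."""
    n_exact = (z * sigma / accuracy) ** 2
    # Round up to a whole respondent; the small tolerance guards against
    # floating-point noise pushing an exact answer up to the next integer.
    return math.ceil(n_exact - 1e-9)

n = sample_size_for_mean(sigma=2.5, accuracy=0.1)
print(n)  # 2401, matching the worked example
```

Because N depends on the square of σ divided by the accuracy, halving the accuracy level (for example from 0.2 to 0.1) roughly quadruples the required sample size.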

Sampling Variability
• When sampling from a population, statistics vary from
sample to sample.
• Ideally we would like the values of the statistic to
fluctuate randomly around the true parameter value,
i.e. not always to be higher or always lower than
the true parameter value.
• If the statistic does not fluctuate randomly around the
parameter, we say the statistic is biased.
– Bias is a consistent, repeated deviation of the
sample statistic from the population parameter in
the same direction when we take many samples.
We don't want a sample that favors one result.
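
Bias can be seen exactly, without simulation, by listing every possible sample from a tiny population and averaging a statistic over all of them. The four-value population below is made up. The sketch compares the sample mean, the variance with divisor n, and the variance with divisor n - 1, using exact fractions.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]                       # hypothetical population
N = len(population)
mu = Fraction(sum(population), N)               # population mean
sigma2 = sum((Fraction(x) - mu) ** 2 for x in population) / N  # population variance

def sample_mean(s):
    return Fraction(sum(s), len(s))

def var_divide_by_n(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / len(s)

def var_divide_by_n_minus_1(s):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / (len(s) - 1)

# Every possible ordered sample of size 2 drawn with replacement (16 samples).
samples = list(product(population, repeat=2))

def average_over_samples(statistic):
    return sum(statistic(s) for s in samples) / len(samples)

avg_mean = average_over_samples(sample_mean)
avg_var_n = average_over_samples(var_divide_by_n)
avg_var_n1 = average_over_samples(var_divide_by_n_minus_1)
print("population mean", mu, "| average of sample means", avg_mean)
print("population variance", sigma2, "| divisor n:", avg_var_n,
      "| divisor n-1:", avg_var_n1)
```

The sample mean and the divisor n - 1 variance average out to the population values, so they are unbiased. The divisor n variance averages to half the population variance here, a consistent deviation in the same direction.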
Sampling Variability

• Variability describes how spread out the values of the
sample statistic are. Large variability means that the
result of the sampling is not repeatable.
Sampling error
• Originates in sampling and arises because
only a part of the population has
been used to estimate population
parameters and draw inferences about
the population
• No sample is the exact mirror image of
the population
• Magnitude of error can be measured in
probability samples
Sampling error
• Expressed by standard error
– of mean, proportion, differences, etc.
• Function of
– amount of variability in the
factor of interest being measured
– sample size
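
The two inputs can be made explicit with the usual textbook formulas for the standard error of a mean and of a proportion under simple random sampling. The formulas are standard but are not spelled out in the slides, and the finite population correction is ignored. The numbers are illustrative.

```python
import math

def se_mean(sd, n):
    """Standard error of a sample mean: variability / sqrt(sample size)."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

# More variability gives a larger error; a larger sample gives a smaller one.
for n in (100, 400, 1600):
    print(n, round(se_mean(2.5, n), 4), round(se_proportion(0.5, n), 4))
```

Each fourfold increase in sample size halves the standard error, so precision improves more slowly than cost grows.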

Sources of sampling error

I. Faulty selection of the sample: non-representative
sample
II. Substitution: the characteristics
possessed by the substituted unit will usually
be different from those possessed by the unit
originally included in the sample
III. Faulty demarcation of sampling units:
significant in area surveys when dealing with
borderline cases
IV. Constant error due to improper choice of the
statistic for estimating the population
parameters
Sample size and Sampling error
• An inadequate, small sample size gives less
chance for the sample to truly capture the
characteristics of a population, leading to
imprecise results
• The larger the sample, the better, but collecting
large samples costs money and resources
• In reality, a balance needs to be struck between
collecting extensive samples at high cost in
money and resources, and saving money but
not having enough data to draw conclusions from
• Sampling error decreases as the size of the
sample increases
Non-Sampling error
Arises at the stages of observation,
ascertainment and processing of the data, and
can occur at every stage of the planning or
execution of a census or sample survey.
Sources of Non-sampling error
I. Faulty planning or definitions:
inadequate and inconsistent data
specification with respect to the objectives of
the survey; errors due to the location of the
units, the actual measurement of the
characteristics and the recording of the
measurements; an ill-designed
questionnaire; and a lack of trained, qualified
and adequate numbers of investigators and
supervisory staff.
II. Defective frame in a sample survey
Sources of Non-sampling error
III. Response errors: due to
misunderstanding by the respondent; prestige
bias, by virtue of which the respondent may
upgrade education, intelligence quotient,
occupation, income, etc., or downgrade age,
etc.; self-interest, such as understating
salary or production and overstating
expenses or requirements; bias due to an
interviewer who influences responses; and
failure of the respondent's memory of past
happenings or conditions.
Sources of Non-sampling error
IV. Non-response bias: arises if full information is
not obtained on all the sampling units, or
when selected individuals are not contacted
or do not respond
- usually 30%
- results in bias
V. Errors in coverage: inclusion of irrelevant
units and/or exclusion of relevant survey
units. For example, under-representation is
found in surveys of the poor, the homeless and
prison inmates, and in opinion polls over the
telephone, which miss the part of the
population that does not have a phone
Sources of Non-sampling error
VI. Interviewing skills - important not to introduce
bias
- types of questions asked
- attitude during interviewing
- wording of questions - confusing, misleading,
intimidating
VII. Compilation and publication errors: arise in data
editing, coding of the responses, entering data into
computers, and tabulating and summarizing the original
observations made in the survey. They can be
controlled through verification, consistency checks,
etc.
How to control Non-sampling
error
• In a sample survey, non-sampling errors can be
controlled better than in a complete census
by employing qualified, trained and
experienced personnel, better supervision
and better equipment for processing and
analyzing data
