K.Anandakumar: Lecturer Department of Management Studies Velammal Institute of Technology
ANANDAKUMAR/BRM 2010
UNIT 1 - INTRODUCTION
BUSINESS RESEARCH - DEFINITIONS AND
SIGNIFICANCE - THE RESEARCH PROCESS -
TYPES OF RESEARCH - EXPLORATORY &
CAUSAL RESEARCH - THEORETICAL &
EMPIRICAL RESEARCH - CROSS-SECTIONAL &
TIME-SERIES RESEARCH - RESEARCH
QUESTIONS/PROBLEMS - RESEARCH
OBJECTIVES - RESEARCH HYPOTHESIS
It develops focus
Since the days of the steam engine, research has continued to come up with more
powerful locomotives that can be operated with alternative sources of energy
such as diesel and electricity.
It reveals characteristics
These days, before a criminal is sentenced, efforts are made to study why he
turned to crime. This helps to develop an approach that creates opportunities
for criminals to change themselves and join the mainstream of life.
It determines frequency of occurrence
It tests hypotheses
It promotes better decision making
Research is the basis for innovation.
Research identifies problem areas.
It helps in forecasting, which is very useful for managers.
Research helps in the development of new products or in modifying existing
products and in understanding the competitive environment.
All progress is born of inquiry. Doubt is often better than overconfidence, for it
leads to inquiry, and inquiry leads to invention.
Research inculcates scientific and inductive thinking and it promotes the
development of logical habits of thinking and organization.
The increasingly complex nature of business and government has focused
attention on the use of research in solving operational problems.
Research, as an aid to economic policy, has gained added importance for both
government and business.
Research provides the basis for nearly all government policies in our economic
system.
Research has its special significance in solving various operational and planning
problems of business and industry.
Research is equally important for social scientists in studying social
relationships and in seeking answers to various social problems.
Research is the fountain of knowledge for the sake of knowledge and an
important source of providing guidelines for solving different business,
governmental and social problems.
It is a sort of formal training which enables one to understand the new
developments in one’s field in a better way.
Research replaces intuitive business decisions by more logical and scientific
decisions.
Increased amounts of research make progress possible.
This is a very crucial stage in research. It is at this stage that the researcher
makes himself familiar with all the previous studies and their findings relevant
to his field of work.
(4)Problem Definition:
After the interviews and the literature review, the researcher has to define the
issues of concern more clearly.
Problem definition or problem statement is a clear, precise and succinct
statement of the question or issue that is to be investigated with the goal of
finding an answer or solution.
Example
To what extent has the new advertising campaign been successful in creating
the high-quality, customer-centred corporate image that it was intended to
produce?
After conducting the interviews, completing a literature survey and defining the
problem, one is ready to develop a theoretical framework. A theoretical
framework is none other than the identification of the network of relationships
among the variables considered important to the study of any given
problem situation.
(6)Hypothesis Development:
Employees who are healthier will take sick leave less frequently.
Women are more motivated than men.
There is a relationship between age and job satisfaction.
This is a stage in which the researcher clearly spells out how he intends to carry
out his work.
In other words, a research design is a description of conceptual structure
within which the research will be conducted.
The researcher would indicate through the design whether he adopts
experimental design or formal design.
He would also state the purpose of his research work viz., descriptive,
diagnostic, explorative or experimental.
The researcher has to make a careful selection of a few elements from the
population, study them intensely and then reach conclusions.
The researcher should determine
The size of the sample,
The method of sampling,
The tests of sample etc.,
(9)Data Analysis:
This involves
Editing
Tabulating
Coding
Editing:
The data collected should be scanned to make sure that it is complete and that
all the instructions are followed. This process is called editing. Once these
forms have been edited, they must be coded.
Coding:
Tabulation:
The final step is called tabulation. It is the orderly arrangement of data in a
tabular form. Also, at the time of analysing the data, the statistical tests to be
used must be finalised, such as the Z-test, t-test, chi-square (χ²) test, ANOVA,
correlation and regression, along with the software to be used, such as the
SPSS package.
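The three data-preparation steps described here (editing, then coding, then tabulation) can be sketched in a few lines of Python. This is only an illustration: the questionnaire fields, the responses and the codebook are hypothetical and do not come from these notes.

```python
from collections import Counter

# Hypothetical raw questionnaire responses; None marks an unanswered item.
responses = [
    {"id": 1, "gender": "female", "satisfied": "yes"},
    {"id": 2, "gender": "male", "satisfied": None},
    {"id": 3, "gender": "male", "satisfied": "no"},
    {"id": 4, "gender": "female", "satisfied": "yes"},
]

# Editing: keep only forms on which every question was answered.
edited = [r for r in responses if all(v is not None for v in r.values())]

# Coding: map each answer category to a numeric code (the codebook is assumed).
codebook = {"gender": {"female": 1, "male": 2}, "satisfied": {"no": 0, "yes": 1}}
coded = [
    {
        "id": r["id"],
        "gender": codebook["gender"][r["gender"]],
        "satisfied": codebook["satisfied"][r["satisfied"]],
    }
    for r in edited
]

# Tabulation: count respondents in each (gender code, satisfied code) cell.
table = Counter((r["gender"], r["satisfied"]) for r in coded)
for cell, count in sorted(table.items()):
    print(cell, count)
```

In this sketch an incomplete form is simply dropped at the editing step; in practice the researcher might instead go back to the respondent or treat the item as missing.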
After collecting and analysing the data the researcher has to accomplish the task
of drawing inferences followed by report writing.
This has to be done very carefully, otherwise misleading conclusions may be
drawn and the whole purpose of doing research may get vitiated.
It is only through interpretation that the researcher can expose relations and
processes that underlie his findings.
Figure (research process flowchart; only these labels survive): Exploration; Stage 2: Research Proposal; (type, purpose, time, scope, environment); Stage 3: Data Collection Design, Sampling Design; Data Collection & Preparation; Stage 5.
1.6. Types of research:
From the perspective of application, research can be classified into two
broad categories:
Pure Research:
Involves developing and testing theories and hypotheses that are intellectually
challenging to the researcher but may or may not have practical application at
the present time or in the future.
(Text from an overlapping figure box:) Several ... may be formulated at this
stage. Each is an alternative action that management might take to solve the
management dilemma. Usually the most plausible action, or the one that offers
the greatest gain using the fewest resources, is researched first.
Applied Research:
Descriptive,
Correlational,
Explanatory or
Exploratory.
Descriptive:
Analytical Research:
The researcher has to use facts or information already available and analyze
these to make a critical evaluation of the material.
Correlation Research:
Examples:
What is the impact of an advertising campaign on the sale of a product?
What is the relation between technology and unemployment?
Are smoking and cancer related?
Explanatory:
Attempts to clarify why and how there is a relationship between two aspects of
a situation or phenomenon.
Exploratory:
(a) Qualitative:
Research involving analysis of data/information that is descriptive in nature and
not readily quantifiable.
The study is classified as qualitative if the purpose of the study is primarily to
describe a situation, phenomenon, problem or event.
Examples :
The description of an observed situation.
interviews
11 Feedback turnaround
Qualitative: Smaller sample sizes make data collection faster for shorter
possible turnaround. Insights are developed as the research progresses,
shortening data analysis.
Quantitative: Larger sample sizes lengthen data collection; internet
methodologies are shortening turnaround but are inappropriate for many studies.
Insight development follows data collection and entry, lengthening the research
process; interviewing software permits some tallying of responses as data
collection progresses.
12 Data Security
Qualitative: More absolute given use of restricted access facilities and smaller
sample sizes.
Quantitative: Act of research in progress is often known by competitors;
insights may be gleaned by competitors for some visible, field-based studies.
Causal Research:
Cross-sectional study:
A cross-sectional study utilizes different groups of people who differ in the
variables of interest, but share other characteristics such as socio-economic
status, educational background and ethnicity.
A cross-sectional study is an observational one. This means the researchers
record information about their subjects without manipulating the study
environment.
For example, we might measure the cholesterol levels of daily walkers and
non-walkers, along with any other characteristics that might be of interest to
us. We would not influence non-walkers to take up that activity, or advise daily
walkers to modify their behaviour. In short, we would try not to interfere.
Cross-sectional research takes a ‘slice’ of its target group and bases its overall
finding on the views or behaviours of those targeted, assuming them to be
typical of the whole group.
The defining feature of a cross-sectional study is that it can compare different
population groups at a single point in time.
Longitudinal study:
A research study for which data are gathered at several points in time to
answer a research question is called longitudinal study.
A longitudinal study, like a cross-sectional one, is an observational one.
The benefits of longitudinal study are that researchers are able to detect
developments or changes in the characteristics of the target population at both
the group and the individual level.
The key here is that longitudinal studies extend beyond a single moment in
time. As a result, they can establish sequences of events.
For example
We might choose to look at the change in cholesterol levels among women over
40 who walk daily for a period of 20 years. The Longitudinal study design
would account for cholesterol levels at the onset of a walking regime and the
walking behaviour continued over time.
The researcher might want to study employees’ behaviour before and after a
change in the top management, so as to know what effects the change
accomplished.
Action Research:
Historical Research:
A study of factors influencing the growth of location for cement plants in Tamil
Nadu is an historical research.
Cross-cultural:
Library Research:
Motivational:
Conceptual:
Empirical:
Clinical Research:
People
Problems
Programs
Phenomena
... population composition, profiles etc.
Program: contents, structure, outcomes, attributes, satisfaction, consumers,
service providers etc.
Phenomenon: cause-and-effect relationships, the study of a phenomenon itself etc.
Column note: information that you need to collect to find answers to your
research question.
The formulation of a problem is often more essential than its solution, which
may be merely a matter of mathematical or experimental skill.
If one wants to solve a problem, one must generally know what the problem is.
It can be said that a large part of the problem lies in knowing what one is trying
to do.
Step1:
If you are a social work student, inclined to work in the area of youth welfare,
refugees or domestic violence after graduation, you might take to research in
one of these areas. Or if you are studying marketing you might be interested in
researching consumer behaviour. Or, as a student of public health, intending to
work with patients who have HIV/AIDS, you might like to conduct research on
a subject area relating to HIV/AIDS. As far as the research journey goes these
are the broad research areas. It is imperative that you identify one that
interests you before undertaking your research journey.
Step2:
Step3:
Step4:
Step5:
Formulate objectives
Formulate your objectives and sub objectives. Your objectives grow out of your
research questions. The main difference between objectives and research
questions is the way in which they are written. Research questions are, obviously,
just that: questions. Objectives transform these questions into behavioural aims by
using action-oriented words such as to find out, to determine, to ascertain, and
to examine.
Step6:
Step7:
Double check
Go back and give final consideration to whether or not you are sufficiently
interested in the study, and have adequate resources to undertake it. Ask
yourself, “Am I really enthusiastic about this study?” and “Do I have enough
resources to undertake it?” Answer these questions thoughtfully and
realistically. If your answer to either of them is ‘no’, re-assess your objectives.
1.8.1. Definition:
(a) Null:
The null hypothesis is a proposition that states a definitive, exact relationship
between two variables. That is, it states that population correlation between two
variables is equal to zero. In general the null hypothesis is expressed as no
significant relationship between two variables or no significant difference
between two groups.
Example: There is no relationship between age and job satisfaction.
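Since the null hypothesis is stated in terms of a correlation of zero, a natural first step is to compute the sample correlation. The sketch below does this for the age and job satisfaction example; the data values are hypothetical, and deciding whether the resulting coefficient differs significantly from zero would need a significance test that is not shown here.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample: age in years and job satisfaction on a 1-10 scale.
age = [25, 32, 41, 48, 55]
satisfaction = [6, 7, 5, 8, 6]

r = pearson_r(age, satisfaction)
print(round(r, 3))
```

A value of r close to zero in the sample is consistent with the null hypothesis; a value far from zero is evidence against it.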
(b)Alternate hypothesis.
The negation of the null hypothesis is called the alternate hypothesis. (Or) The
complement of the null hypothesis is called the alternate hypothesis. (Or) The
conclusion we accept when the data fail to support the null hypothesis is called
the alternate hypothesis.
Example: Women are more motivated than men.
d. Refined hypothesis;
Refined hypothesis is one, which is more significant in research and the degree
of significance depends on the level of abstraction.
The refined hypothesis may be hypothesis that state the existence of empirical
uniformities, hypothesis that are concerned with complex ideal types and
hypothesis that are concerned with relation of analytical variables.
For example, a relationship such as “reduction of tax rates and extent of evasion”
would have been studied before formulating this hypothesis.
e. Working hypothesis;
Once the necessary data or facts are collected for the purpose of empirical
verification, this type of hypothesis becomes redundant.
For example, “monetary incentives act as great motivators” may be a
hypothesis formulated to facilitate focus on the collection of data regarding
monetary incentives and how these have improved production or sales, etc., in
a specific environment.
Once these data are collected, they could be analyzed and based on that, a
correct hypothesis may be formulated. In such a situation, the original
hypothesis becomes redundant.
f. Statistical hypothesis;
Statistical hypotheses are those which are formulated based on the sample data
or facts.
They serve the usual purpose of testing any expected relationship among
variables.
Once these hypotheses are tested or verified, the conclusion about the
population is drawn.
For example, with sample data, when a tentative statement is made it is tested
for acceptance or rejection. Once it is accepted with the sample data, it is used
for making inference and drawing conclusions.
For example, in a steel factory, a very large quantity of iron rods is cut to
specific size.
A few samples are selected over a few days and measured for their accuracy in
size.
Suppose the sample test reveals that there is no significant difference in the size
of iron rods cut.
Then, on this basis it may be inferred that in the bulk of iron rods cut, there will
be no significant difference in size.
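The iron rod example can be made concrete with a small calculation. The notes do not name a particular test; the sketch below assumes a one-sample t-test against a target length, and both the target (100.0 mm) and the ten measurements are hypothetical.

```python
import math
import statistics

# Hypothetical target length and sample measurements (in mm); not from the notes.
target_mm = 100.0
sample_mm = [100.2, 99.8, 100.1, 99.9, 100.0, 100.3, 99.7, 100.1, 99.9, 100.1]

n = len(sample_mm)
mean = statistics.mean(sample_mm)
sd = statistics.stdev(sample_mm)  # sample standard deviation (divides by n - 1)
t_stat = (mean - target_mm) / (sd / math.sqrt(n))

# Two-sided 5% critical value of Student's t with n - 1 = 9 degrees of freedom.
T_CRIT_DF9 = 2.262

reject_null = abs(t_stat) > T_CRIT_DF9
print("reject null hypothesis:", reject_null)
```

If the null hypothesis is not rejected for the sample, the inference drawn is that the bulk of rods does not differ significantly from the target size either.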
1. Purposiveness
2. Rigor
3. Testability
4. Replicability
5. Precision and confidence.
6. Objectivity
7. Generalizability
8. Parsimony
Guiding research:
2.1.1 Definition:
Research design constitutes the blueprint for the collection, measurement and
analysis of data.
Research design aids the researcher in the allocation of limited resources by
posing crucial choices in methodology.
Research design is the plan and structure of investigation, conceived so as to
obtain answers to research questions. The plan is the overall scheme or program
of the research. It includes an outline of what the investigator will do, from
writing hypotheses and their operational implications to the final analysis of data.
RD expresses both the structure of the research problem- the framework,
organization or configuration of the relationships among variables of a study-
and the plan of investigation used to obtain empirical evidence on those
relationships.
Question: “Did the change from selling in packs of two to free selection from
produce bins cause this sales increase?”
Could there be other variables that could have affected mango sales?
What would happen to the sales if the weather changed from rainy to fair?
Did the change take place during a festive season?
In this example, weather and the onset of the festive season etc. may be viewed
as extraneous variables, having an effect on the dependent variable. However,
these are not independent variables.
This example clearly shows that isolating the effects of independent variables
on dependent variables without controlling for the effects of the extraneous
variables is very difficult.
• Measure the sales for both groups before the experiment date and after
the experiment date.
Purely post-design
In this design, the dependent variable is measured after exposing the test units to
the experimental variable.
Example
Assume M/s Hindustan Lever Ltd wants to conduct an experiment on the
“Impact of free sample on the sale of toilet soaps”. Small samples of toilet soap are
mailed to selected customers in a locality. After one month, a coupon of 25
paise off on one cake of soap is mailed to each customer to whom free samples
were sent earlier. An equal number of these coupons are also mailed to people
in another locality in the neighbourhood. The coupons are coded to keep an
account of the number of coupons redeemed from each locality. Suppose, 400
coupons were redeemed from the experimental group and 250 coupons were
redeemed from the control group. The difference of 150 is supposed to be the
effect of free samples. In this method, the conclusion can be drawn only after
conducting the experiment.
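The arithmetic behind this after-only comparison is simply the experimental group's outcome minus the control group's outcome. A minimal sketch, using the coupon figures from the example (the function name is an illustrative choice):

```python
def after_only_effect(experimental_outcome, control_outcome):
    """Treatment effect in an after-only design: experimental minus control."""
    return experimental_outcome - control_outcome

# Figures from the example above: coupons redeemed in each locality.
effect = after_only_effect(400, 250)
print(effect)  # 150
```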
In this method, measurements are made before as well as after the treatment.
Example:
Let us say that, an experiment is conducted to test an advertisement which is
aimed at reducing alcoholism. Attitudes and perceptions towards consuming
liquor are measured before exposure to the advertisement. The group is exposed
to an advertisement, which tells them the consequences, and their attitudes are
again measured after several days. The difference, if any, shows the
effectiveness of that advertisement.
The above example of “Before-after” suffers from validity threat due to the
following.
It alerts the respondents to the fact that they are being studied. The respondents
may discuss the topics with friends and relatives and modify their
behaviour accordingly.
Instrumentation Effect
This can be due to two different instruments being used, one before and one
after. A change in the interviewers between the before and after measurements
also results in the instrumentation effect.
Factorial Design
Factorial design permits the researcher to test two or more variables at the same
time. Factorial design helps to determine the effect of each of the variables and
measure the interacting effect of many variables.
Example:
A departmental store wants to study the impact of price reduction of products.
Given that, there is also promotion (POP) being carried out in the stores (a) near
the entrance (b) at usual place, at the same time. Now assume that there are two
price levels namely regular price A1 and reduced price A2. Let there be three
types of POP namely B1, B2, and B3. There are 3×2=6 combinations possible.
The combinations possible are B1A1, B1A2, B2A1, B2A2, B3A1, B3A2. Which of
these combinations is best suited is what the researcher is interested in. Suppose
there are 60 departmental stores of the chain divided into groups of 10 stores
each. Now, randomly assign the above combination to each of these 10 stores as
follows:
Combination    Sales
B1A1           S1
B1A2           S2
B2A1           S3
B2A2           S4
B3A1           S5
B3A2           S6
S1 to S6 represent the sales resulting from each combination. The data gathered will
provide details on product sales on account of the two independent variables.
Is the display at the entrance more effective than the display at the usual
location?
Also, the research will tell us about the interaction effect of the two
variables.
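The factorial layout in this example (three POP types crossed with two price levels, with 60 stores split into six groups of 10) can be sketched as follows. The store numbering 1 to 60 and the fixed random seed are assumptions made only so the sketch is repeatable.

```python
import itertools
import random

price_levels = ["A1", "A2"]     # regular price, reduced price
pop_types = ["B1", "B2", "B3"]  # three point-of-purchase promotions

# Every POP type crossed with every price level: 3 x 2 = 6 cells.
combinations = [b + a for b, a in itertools.product(pop_types, price_levels)]

def assign_stores(store_ids, cells, seed=0):
    """Randomly split the stores into equal-sized groups, one group per cell."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    shuffled = list(store_ids)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(cells)
    return {cell: shuffled[i * size:(i + 1) * size] for i, cell in enumerate(cells)}

assignment = assign_stores(range(1, 61), combinations)
for cell, stores in assignment.items():
    print(cell, sorted(stores))
```

Comparing mean sales across the cells then gives the effect of each factor, and comparing how the price effect changes across POP types gives the interaction effect.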
The researcher chooses three shelf arrangements in three stores. He would like
to observe the sales generated in each of these stores at different periods. The
researcher must make sure that one type of shelf arrangement is used in each
store only once.
In the Latin Square Design, only one variable is tested. As an example of Latin
Square design, assume that a supermarket chain is interested in the effect of in-
store promotion on sale. Suppose there are 3 promotions considered as follows:
1. No promotion
2. Free sample with demonstration
3. Window display.
Which of the three will be effective? The outcome may be affected by the size
of the stores and the time period. If we choose three stores and three time
periods, the total number of combinations is 3×3=9. The arrangement is as
follows:
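One common way to build such a 3×3 arrangement is a cyclic Latin square, in which each promotion appears exactly once in every row and every column. The sketch below is an illustration of that construction, not necessarily the arrangement used in the original table; treating rows as time periods and columns as stores is an assumption.

```python
def latin_square(treatments):
    """Cyclic Latin square: row i is the treatment list rotated left by i."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

promotions = ["No promotion", "Free sample with demonstration", "Window display"]
square = latin_square(promotions)

# Rows are time periods, columns are stores (this labelling is an assumption).
for period, row in enumerate(square, start=1):
    print(f"Period {period}:", " | ".join(row))
```

In practice the rows and columns of the square would usually be shuffled at random before the promotions are assigned.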
This is a variation of “after only design”. The groups such as experiment and
control are identified only after they are exposed to the experiment.
Let us assume that a magazine publisher wants to ascertain the impact of an
advertisement on knitting in the Women’s Era periodical. The subscribers are
asked whether they have seen this advertisement on ‘knitting’. Both those who
have seen it and those who have not are then asked about the price, design, etc.,
of the product. The difference indicates the effectiveness of the advertisement. In this design, the
experimental group is set to receive the treatment rather than exposing it to the
treatment by its choice.
Category Options
2.4.3. Monitoring:
The studies in which the researcher inspects the activities of a subject or the
nature of some material without attempting to elicit responses from anyone.
2.4.5. Experiment:
The researcher attempts to control and manipulate the variables in the study.
Experimental design is appropriate when one wishes to discover whether certain
variables produce effects in other variables.
With an ex post facto design, investigators have no control over the variables
in the sense of being able to manipulate them. They can only report what has
happened or what is happening.
2.4.7. Descriptive:
If the research is concerned with finding out who, what, where, when or how
much then the study is descriptive.
2.4.8. Causal:
If the research is concerned with learning ‘why’ that is, how one variable
produces changes in another, it is causal.
Cross-sectional studies are carried out once and represent a snapshot of one
point in time.
Longitudinal studies are those in which an event or occurrence is measured again
and again over a period of time. This is also known as a time-series study.
2.4.12. Simulation:
In that sense, the simulation lies somewhere between a lab and a field
experiment, insofar as the environment is artificially created but not far different
from “reality”.
For example, in the study by Koolstra and Beentijes (1999), elementary
students participated in different television-based treatments in vacant school
rooms similar to their actual classrooms.
Examples:
Examples:
To determine, under field conditions, the impact of maternal and child health
services on the level of infant mortality.
To establish the effects of a counselling service on the extent of marital
problems.
To find out the effect of parental involvement on the level of academic
achievement of their children.
Studies focus on past trends in a phenomenon and study it into the future.
In a retrospective-prospective study a part of the data is collected
retrospectively from the existing records before the intervention is introduced
and then the study population is followed to ascertain the impact of the
intervention
Example:
Trend studies.
a) Cohort studies:
Cohort studies are based upon the existence of a common characteristic such as
year of birth, graduation or marriage, within a subgroup of a population.
Example:
Suppose you want to study the employment patterns of a batch of accountants
who graduated from a university in 1975 or to study the fertility behaviour of
women who were married in 1930.
b) Blind studies
In a blind study, the study population does not know whether it is getting real or
fake treatment.
The main objective of designing a blind study is to isolate the placebo effect.
The placebo effect is the psychological effect on the recovery process of a
patient’s knowledge that he/she is receiving the treatment.
c) Double-Blind studies:
In a double blind study neither the researcher nor the study participants know
who is receiving real and who is receiving fake treatment.
Example:
Pharmaceutical companies experimenting with the efficacy of newly developed
drugs in the prototype stage ensure that the subjects in the experimental and
control groups are kept unaware of who is given the drug, and who the placebo.
Such studies are called blind studies.
The experimental design that sets up two experimental groups and two control
groups, subjecting one experimental group and one control group to both the
pre-test and the post test, and the other experimental and control group to only
the post test.
E= (O2-O1)
E= (O2-O4)
E= (O5-O6)
E= (O5-O3)
E= [(O2-O1)-(O4-O3)]
If all Es are similar, the cause and effect relationship is highly valid.
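The five estimates of the treatment effect E listed above can be computed side by side and compared. The sketch assumes the usual labelling for this four-group design, since the original diagram is not reproduced here: O1 and O2 are the pre-test and post-test of the first experimental group, O3 and O4 the pre-test and post-test of the first control group, O5 the post-test of the second experimental group, and O6 the post-test of the second control group. The scores are hypothetical.

```python
def solomon_effects(o1, o2, o3, o4, o5, o6):
    """Return the five treatment-effect estimates listed above."""
    return {
        "O2-O1": o2 - o1,
        "O2-O4": o2 - o4,
        "O5-O6": o5 - o6,
        "O5-O3": o5 - o3,
        "(O2-O1)-(O4-O3)": (o2 - o1) - (o4 - o3),
    }

# Hypothetical group mean scores, invented for illustration only.
effects = solomon_effects(o1=50, o2=60, o3=50, o4=51, o5=60, o6=51)
for name, value in effects.items():
    print(name, value)

# A small spread between the largest and smallest estimate means the Es agree.
spread = max(effects.values()) - min(effects.values())
print("spread of estimates:", spread)
```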
History
Maturation
Testing
Instrumentation
Selection
Statistical regression
Experimental mortality
History:
History refers to those events which are external to the experiment, but occur at
the same time as experiment is being conducted. This may affect the result.
Example
Let us suppose that a manufacturer makes a 20% cut in the price of a product and
monitors sales in the coming weeks.
The purpose of the research is to learn about the impact of price on sales.
Meanwhile, if the production of the product declines due to a shortage of raw
materials, then the sales will not increase.
Therefore, we cannot conclude that the price cut had no influence on sales,
because an external event (history) occurred during the period and we cannot
control such an event. The event can only be identified.
Maturation
Maturation refers to the changes occurring within the test units and not due to
the effect of the experiment.
Maturation takes place due to passage of time.
It refers to the effect of people growing older.
Persons who use a particular product may discontinue using that product and
may switch over to an alternate product.
Example 1:
Pepsi is consumed when people are young. Due to passage of time, the
consumer might prefer to consume Diet Pepsi or even avoid it altogether.
Example2:
Assume that a training programme is conducted for salesmen and the company
wants to measure the impact of its training programme. If the company finds that
the sales have improved, it may not be due to its training programme. It may be
because their salesmen have gained more experience now and know the
customer better. Better understanding between salesmen and customer may be
the reason for increased sales.
Maturation effect is not limited to test units composed of people alone.
Organisations also change; dealers grow, become more successful, diversify, and
so on.
Testing
Pre-testing effect occurs when the same respondents are measured more than
once. Responses given during an earlier measurement will have a direct bearing
on the responses given at a later stage.
Example:
Consider a respondent, who is given an initial questionnaire, intended to
measure brand awareness.
After examining him, if a second questionnaire similar to the initial
questionnaire is given to the respondent, he will respond quite differently,
because of the respondent’s familiarity with the earlier questionnaire.
Instrumental variation:
The measurement in experiments will depend upon the instrument used for
measurements. Also, results may vary due to the application of instruments,
where there are several interviewers. Thus, it is very difficult to ensure that all
the interviewers will ask the same questions with the same tone and develop the
same rapport. There may be difference in response, because each interviewer
conducts the interview differently.
Experimental Mortality
Some members may leave the original group and some new members may join
the old group. This is because some members might migrate to another
geographical area. This change in composition of the members will alter the
composition of the group itself.
Example:
Assume that a vacuum cleaner manufacturer wants to introduce a new version.
He interviews hundred respondents who are currently using the older version.
Let us assume that, these 100 respondents have rated the existing vacuum
cleaner on a 10 point scale (1 for lowest and 10 for highest). Let the mean rating
of the respondents be 7.
Now the newer version is demonstrated to the same hundred respondents and
the equipment is left with them for two months. At the end of two months, only
80 participants respond, since the remaining 20 refused to answer. Now the
mean score of 80 respondents is 8 on the same 10 point scale. From this, can we
conclude that the new vacuum cleaner is better?
The answer to the above question depends on the composition of the 20 respondents
who dropped out. Suppose the 20 respondents who dropped out had a negative
reaction to the product; then the mean score of all 100 would not have been 8. It
could even have been lower than 7. The difference in mean rating does not give
the true picture. It does not by itself indicate that the new product is better
than the old one.
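The effect of the missing 20 respondents can be checked with a little arithmetic. The sketch below recomputes the mean for all 100 respondents under different assumed average ratings for the dropouts; the 80 responders and their mean of 8 come from the example, while the assumed dropout ratings are hypothetical.

```python
def full_group_mean(responders, responder_mean, dropouts, dropout_mean):
    """Mean rating if the dropouts had answered with the assumed mean rating."""
    total = responders * responder_mean + dropouts * dropout_mean
    return total / (responders + dropouts)

# 80 responders with mean 8 (from the example); dropout ratings are assumed.
for assumed in (2, 3, 5, 8):
    print(assumed, full_group_mean(80, 8, 20, assumed))
```

Under these figures the overall mean falls below the original 7 only if the dropouts would have rated the new version below 3 on average, which is why the composition of the dropout group matters.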
One might wonder why we do not drop those 20 respondents from the original group,
calculate the mean rating of the remaining 80 and compare the two. But
this method will also not solve the mortality effect. Mortality effect can occur
in an experiment irrespective of whether human beings are involved or not.
External validity refers to the degree to which the results of an experiment
can be generalised beyond the experimental situation to other populations.
Aptitude-Treatment-Interaction:
The sample may have certain features that may interact with the independent
variable, limiting generalizability.
For example, inferences based on comparative psychotherapy studies often
employ specific samples (e.g. volunteers, highly depressed, no comorbidity). If
psychotherapy is found effective for these sample patients, will it also be
effective for non-volunteers or the mildly depressed or patients with concurrent
other disorders?
Situation:
Pre-Test Effects:
If cause-effect relationships can only be found when pre-tests are carried out,
then this also limits the generality of the findings.
Post-Test Effects:
If cause-effect relationships can only be found when post-tests are carried out,
then this also limits the generality of the findings.
Rosenthal Effects:
Motivation.
2.7.1. Types:
Since the sales of the product can vary – can be low, medium or high – it is a
variable.
Since sales is the main focus of interest to the manager, it is the dependent
variable.
Independent variable
[Figure: New product success → Stock market price]
The moderating variable is one that has a strong contingent effect on the
independent variable – dependent variable relationship. That is, the presence of a
third variable (the moderating variable, MV) modifies the original relationship between the independent
and dependent variables.
E.g., a prevalent theory is that the diversity of the workforce (comprising
people of different ethnic origins, races and nationalities) contributes more to
organizational effectiveness because each group brings its own special expertise
and skills to the workplace. This synergy can be exploited, however, only if
managers know how to harness the special talents of the diverse work group;
otherwise they will remain untapped.
[Figure: Workforce diversity → Organizational effectiveness (DV), moderated by managerial expertise]
Intervening Variable:
[Figure: managerial effectiveness (moderating variable)]
Active Variables:
Attribute variables:
Those variables that cannot be manipulated, changed or controlled and that reflect
the characteristics of the study population (e.g. age, gender, education and
income).
Categorical variable:
Constant:
When a variable can have only one value or category, for example taxi, tree or
water, it is known as a constant variable.
Dichotomous:
When the variable can have only two categories, as in yes/no or good/bad, it is
known as a dichotomous variable.
Polytomous:
When a variable can be divided into more than two categories, for example
religion (Christian, Muslim, and Hindu) or political parties (labour, liberal, and
democrat), it is a polytomous variable.
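The distinction between constant, dichotomous and polytomous variables comes down to counting categories. The sketch below is illustrative only: the function name is hypothetical, and it counts the categories observed in the data, which is an assumption, since the definitions above refer to the categories a variable can take.

```python
def classify_by_categories(observed_values):
    """Label a categorical variable by its number of distinct categories."""
    distinct = len(set(observed_values))
    if distinct == 0:
        raise ValueError("no observations supplied")
    if distinct == 1:
        return "constant"
    if distinct == 2:
        return "dichotomous"
    return "polytomous"

print(classify_by_categories(["yes", "no", "yes"]))              # dichotomous
print(classify_by_categories(["Christian", "Muslim", "Hindu"]))  # polytomous
```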
Continuous variable:
2.8.2. Scaling
2.8.3. Scale:
Nominal scale:
Example:
Ordinal scale:
An ordinal scale not only categorizes the variables in such a way as to denote
the differences among the various categories; it also rank-orders the categories
in some meaningful way.
Example:
The interval scale is more powerful than the nominal and ordinal scales. The distance
given on the scale represents equal distance on the property being measured.
An interval scale may tell us “How far apart are the objects with respect to an
attribute?” It allows us to perform certain mathematical
operations on the data collected from the respondents.
Example:
Ratio scale:
The ratio scale is a special kind of interval scale that has a meaningful zero point.
With this scale, length, weight or distance can be measured. In this scale, it is
possible to say how many times greater or smaller one object is
compared to another.
Example:
Sales this year for product A are twice the sales of the same product last year.
Dichotomous scale
Category scale
Likert scale
Numerical scales
Semantic differential scale
Itemized rating scale
Fixed or constant sum rating scale
Stapel scale
Graphic rating scale
Consensus scale
Dichotomous Scale
E.g.
Category Scale
The category scale uses multiple items to elicit a single response as per the
following example. This also uses the nominal scale.
E.g.
1. North Bay
2. South Bay
3. East Bay
4. Peninsula
5. Other
Likert Scale
The Likert scale is designed to examine how strongly subjects agree or disagree
with statements on a 5-point scale with the following anchors:
1 2 3 4 5
Using the preceding Likert scale, state the extent to which you agree with
each of the following statements:
Several bipolar attributes are identified at the extremes of the scale, and
respondents are asked to indicate their attitudes, on what may be called a
semantic space, toward a particular individual, object, or event on each of the
attributes. The bipolar adjectives used, for instance, would employ such terms
as Good-Bad; Strong-Weak; Hot-Cold. The semantic differential scale is used
e.g.
Responsive - - - - - - Unresponsive
Beautiful - - - - - - Ugly
Courageous - - - - - - Timid
Numerical Scale
The numerical scale is similar to the semantic differential scale, with the
difference that numbers on a 5 point or 7 point scale are provided, with bipolar
adjectives at both ends, as illustrated below. This is also an interval scale.
E.g.
How pleased are you with your new real estate agent?
A 5-point or 7-point scale with anchors, as needed, is provided for each item
and the respondent states the appropriate number on the side of each item, or
circles the relevant number against each item, as per the example that follows. The
responses to the items are then summated. This uses an interval scale.
Respond to each item using the scale below, and indicate your response number
on the line by each item.
1 2 3 4 5
Very unlikely Unlikely Neither Unlikely Nor Likely Likely Very Likely
1. I will be changing my job within the next 12 months.
2. I will take on new assignments in the near future.
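The summation step mentioned above can be sketched as follows. The function name and the two sample answers are invented for illustration; only the 1 to 5 response range comes from the example.

```python
def summated_score(responses, low=1, high=5):
    """Add up one respondent's item responses after a range check."""
    for value in responses:
        if not low <= value <= high:
            raise ValueError(f"response {value} is outside the {low}-{high} scale")
    return sum(responses)

# Invented answers to the two items above: "Likely" (4) and "Unlikely" (2).
print(summated_score([4, 2]))  # 6
```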
E.g.
In choosing toilet soap, indicate the importance you attach to each of the
following five aspects by allotting points for each to total 100 in all.
Fragrance -
Color -
Shape -
Size -
Texture of lather -
Total points 100
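A constant sum response is only usable if the allotted points add up to the required total. A minimal check might look like this; the function name and the point allocation are hypothetical.

```python
def check_constant_sum(allocation, total=100):
    """Return True if all points are non-negative and add up to `total`."""
    if any(points < 0 for points in allocation.values()):
        return False
    return sum(allocation.values()) == total

# Invented allocation for the five aspects of toilet soap listed above.
answer = {"Fragrance": 30, "Color": 10, "Shape": 10,
          "Size": 20, "Texture of lather": 30}
print(check_constant_sum(answer))  # True
```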
Stapel scale:
This scale simultaneously measures both the direction and intensity of the
attitude toward the items under study. The characteristic of interest to the study
is placed at the centre, with a numerical scale ranging, say, from +3 to -3 on
either side of the item. This gives an idea of how close or
distant the individual response to the stimulus is, as shown in the example
below. Since this does not have an absolute zero point, this is an interval scale.
State how you would rate your supervisor’s abilities with respect to each of the characteristics mentioned
below, by circling the appropriate number.
            +3                        +3                      +3
            +2                        +2                      +2
            +1                        +1                      +1
Adopting Modern Technology    Product Innovation    Interpersonal Skills
            -1                        -1                      -1
            -2                        -2                      -2
            -3                        -3                      -3
On a scale of 1 to 10, how would you rate your supervisor?
10 Excellent
5 All right
1 Very bad
Reliability
Validity
Practicality
2.9.1. Reliability:
Reliability means the extent to which the measurement process is free from
errors.
It is an indication of the stability and consistency with which the instrument
measures the concept and helps to assess the goodness of a measure.
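As an equation,
$\dfrac{\text{test score}}{\text{retest score}} = 1$
Or
$\text{test score} - \text{retest score} = 0$
A ratio of 1 shows 100% reliability between the test score and the retest score.
Put another way, zero difference between the test and retest scores is an
indication of 100% reliability.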
In this procedure you construct two instruments that are intended to measure the
same phenomenon.
The two instruments are then administered to two similar populations.
The results obtained from one test are compared with those obtained from the
other.
If they are similar, it is assumed that the instruments are reliable.
The idea behind internal consistency procedures is that items measuring the
same phenomenon should produce similar results. The following method is
commonly used for measuring the reliability of an instrument.
A test is given and divided into halves that are scored separately; the scores of
one half are then compared with the scores of the other half to test
reliability (Kaplan & Saccuzzo, 2001).
1st - Divide the test into halves. The most common way to do this is to
assign odd-numbered items to one half of the test and even-numbered items to
the other; this is called odd-even reliability.
2nd - Find the correlation of scores between the two halves by using the Pearson r
formula.
Spearman-Brown formula
$r_{SB} = \dfrac{2r}{1 + r}$, where $r$ is the correlation between the two halves.
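The two steps and the Spearman-Brown correction can be sketched as follows. This is an illustrative sketch, not a prescribed procedure: the function names are hypothetical, and each respondent's answers are assumed to arrive as a list of item scores in item order.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Step up the half-test correlation to a full-test estimate."""
    return 2 * r_half / (1 + r_half)

def odd_even_reliability(item_scores):
    """item_scores holds one list of item scores per respondent."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))

print(round(spearman_brown(0.5), 3))  # 0.667
```

A half-test correlation of 0.5 is stepped up to about 0.667 because the full test is twice as long as either half.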
2.10. VALIDITY:
Content validity:
Content validity draws an inference from test scores to a large domain of items
similar to those on the test.
Content validity is the degree to which the content of the items
adequately represents the universe of all relevant items under study.
The more the scale items represent the domain or universe of the
concept being measured, the greater the content validity.
To put it differently, content validity is a function of how well the
dimensions and elements of a concept have been delineated.
Criterion validity:
For instance, scores on a simulated driving test are the predictor variable
while scores on the road test are the criterion variable. It is hypothesized that if
the test taker passes the simulation test, he/she should meet the criterion of being a
safe driver. In other words, if the simulation test scores can predict the road
test scores in a regression model, the simulation test is claimed to have a high
degree of criterion validity.
Construct validity:
The degree to which a research instrument is able to provide
evidence based on theory is called construct validity.
Because it is concerned with abstract and theoretical construct, construct validity is
also known as theoretical construct.
Concurrent Validity:
It allows you to show that your test is valid by comparing it with an already
valid test.
A new test of adult intelligence, for example, would have concurrent validity if
it had a high positive correlation with the Wechsler Adult Intelligence Scale
since the Wechsler is an accepted measure of the construct we call intelligence.
An obvious concern relates to the validity of the test against which you are
comparing your test.
Some assumptions must be made because there are many who argue the
Wechsler scales, for example, are not good measures of intelligence.
Predictive Validity:
In order for a test to be a valid screening device for some future behaviour, it
must have predictive validity.
The SAT is used by college screening committees as one way to predict college
grades.
The GMAT is used to predict success in business school. And the LSAT is
used as a means to predict law school performance.
The main concern with these and many other predictive measures is predictive
validity because without it, they would be worthless.
We determine predictive validity by computing a correlation coefficient
comparing SAT scores, for example, and college grades. If they are directly
related, then we can make a prediction regarding college grades based on SAT
score.
We can show that students who score high on the SAT tend to receive high
grades in college.
The figure below portrays the difference between reliability and validity.
If the purpose of the measurement is to hit the centre of the target, we see that
reliability looks like a tight pattern regardless of where it hits, because
reliability is a function of consistency. Validity, on the other hand, is a function
of shots being arranged around the bull’s eye. In statistical terms, if the
expected value is the bull’s eye, then it is valid; if the variations are small
relative to the entire target, then it is reliable.
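The statistical reading of the target analogy can be illustrated with a one-dimensional sketch. The measurements and the true value below are invented, and reducing the target to a single number is a simplifying assumption: the bias (mean minus true value) speaks to validity, and the spread speaks to reliability.

```python
from statistics import mean, pstdev

def describe_measurements(values, true_value):
    """Return (bias, spread): distance of the mean from the true value,
    and the standard deviation of the measurements."""
    bias = mean(values) - true_value
    spread = pstdev(values)
    return bias, spread

# A tight pattern that misses the bull's eye: reliable but not valid.
tight_but_off = [7.9, 8.0, 8.1]
bias, spread = describe_measurements(tight_but_off, true_value=5.0)
print(bias > 2, spread < 0.2)  # True True
```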
Practicality:
Economy:
Economy considerations suggest that some trade-off is needed between the ideal
research project and that which the budget can afford.
The length of measuring instrument is an important area where economic
pressures are quickly felt.
The choice of data collection is also often dictated by economic factors.
The rising cost of personal interviewing first led to an increased use of
telephone surveys and subsequently to the current rise in Internet surveys.
In standardized tests, the cost of test materials alone can be such a significant
expense that it encourages multiple reuses.
Convenience:
For this purpose one should give due attention to the proper layout of the
measuring instrument.
For instance, a questionnaire, with clear instructions (illustrated by examples),
is certainly more effective and easier to complete than one which lacks these
features.
Interpretability:
The data directly collected by the researcher, with respect to the problem under
study, is known as primary data.
Primary data is also the first hand data collected by the researcher for the
immediate purpose of the study.
OBSERVATION METHOD:
Structured or unstructured:
If the observation is characterised by a careful definition of the units to be observed,
the style of recording the observed information, standardised conditions of
observation and the selection of pertinent data of observation, then the
observation is called structured observation.
When the observation takes place without these characteristics being thought
of in advance, it is termed unstructured observation.
Example:
A manager of a hotel wants to know “how many of his customers visit the hotel
with their families and how many come as single customers”. Here, the
observation is structured, since it is clear “what is to be observed”. He may
instruct his waiters to record this. This information is required to decide
requirements of the chairs and tables and also the ambience.
Suppose the manager wants to know how single customers and those with
families behave and what their attitudes are like. This study is vague, and it
needs a non-structured observation.
It is easier to record structured observation than unstructured observation.
In disguised observation, the respondents do not know that they are being
observed.
In non-disguised observation, the respondents are well aware that they are
being observed.
When the observer is physically present and personally monitors and records the
behaviour of the participants, it is called direct observation.
When the recording of data is done by mechanical, photographic or electronic
means, it is called indirect observation.
Example:
3.2.
Controlled and non-controlled:
Unstructured interview:
The strength of unstructured interviews is the almost complete freedom they
provide in terms of content and structure.
You are free to order these in whatever sequence you wish.
You also have complete freedom in terms of the wording you use and the way
you explain questions to your respondents.
You may formulate questions and raise issues on the spur of the moment,
depending upon what occurs to you in the context of the discussion.
There are several types of unstructured interviewing...
In depth interviewing
Focus group interviewing
Narratives
Oral histories
In depth interviews:
Narratives:
The person tells his/her story about an incident or situation and the researcher
listens passively. Occasionally the researcher encourages the individual by using
the techniques of active listening.
Example:
Asking people who have been sexually abused to narrate their experiences and how they
have been affected.
Oral histories:
Oral histories are more commonly used for learning about a historical event or
episode that took place in the past, or for gaining information about a cultural
custom or story that has been passed from generation to generation.
Example:
Suppose you want to find out about the life after World War II in some regional
town of Western Australia or about living conditions of Aboriginal and Torres
Strait Islander people in the 1930s. You would talk to persons who were alive
during that period and ask them about life at that time.
Structured interviews:
Other types:
Focussed interview:
Clinical interview:
After deciding the topic of interest, the researcher tries to define the research
problem. This helps the researcher to focus on a narrower research area and
to study it appropriately.
the research problem. The results will depend on the exact measurements that
the researcher chooses and may be operationalized differently in another study
to test the main conclusions of the study.
explain why the contrary evidence. A poor ad hoc analysis may be seen as the
researcher's inability to accept that his/her hypothesis is wrong, while a great ad
hoc analysis may lead to more testing and possibly a significant discovery.
probability sampling
non-probability sampling
Simple random sampling
convenience sampling
stratified sampling
systematic sampling
cluster sampling
sequential sampling
disproportional sampling
judgmental sampling
snowball sampling
quota sampling
Pretest-Posttest Design
Checks whether the groups differ before the manipulation starts and measures the
effect of the manipulation. Pretests sometimes influence the effect.
Control Group
Control groups are designed to measure research bias and measurement effects,
such as the Hawthorne Effect or the Placebo Effect. A control group is a group
not receiving the same manipulation as the experimental group.
Randomized Controlled Trials
Randomized Sampling, comparison between an Experimental Group and a
Control Group and strict control/randomization of all other variables
Solomon Four-Group Design
Uses two control groups and two experimental groups. Half the groups have a
pretest and half do not. This tests both the effect itself and the
effect of the pretest.
Between Subjects Design
Grouping Participants to Different Conditions
Within Subject Design
Participants Take Part in the Different Conditions
Counterbalanced Measures Design
Testing the effect of the order of treatments when no control group is
available/ethical
Matched Subjects Design
Matching Participants to Create Similar Experimental- and Control-Groups
Double-Blind Experiment
Neither the researcher, nor the participants, know which is the control group.
The results can be affected if the researcher or participants know this.
Bayesian Probability
Using Bayesian probability to "interact" with participants is a more "advanced"
experimental design. It can be used for settings where there are many variables
which are hard to isolate. The researcher starts with a set of initial beliefs, and
tries to adjust them according to how participants have responded.
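Several of the designs listed above depend on assigning participants at random to an experimental group and a control group. A minimal sketch of that step, with a hypothetical function name and made-up participant labels, might look like this:

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups."""
    rng = random.Random(seed)      # a seed makes the split reproducible
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

people = [f"P{i}" for i in range(1, 11)]   # ten invented participants
experimental, control = randomly_assign(people, seed=42)
print(len(experimental), len(control))  # 5 5
```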
Pilot study
It may be wise to first conduct a pilot-study or two before you do the real
experiment. This ensures that the experiment measures what it should, and that
everything is set up right.
Minor errors, which could potentially destroy the experiment, are often found
during this process. With a pilot study, you can get information about errors and
problems, and improve the design, before putting a lot of effort into the real
experiment.
independent variable, affecting the experimental group. The effect that the
researcher is interested in, the dependent variable(s), is measured.
not want to influence the effects, is crucial to drawing a valid conclusion. This
is often done by controlling variables, if possible, or randomizing variables to
minimize effects that can be traced back to third variables. Researchers only
want to measure the effect of the independent variable(s) when conducting an
experiment, allowing them to conclude that this was the reason for the effect.
not prepared to be analyzed is called "raw data". The raw data is often
summarized as something called "output data", which typically consists of one
line per subject (or item). A cell of the output data is, for example, an average of
an effect in many trials for a subject. The output data is used for statistical
analysis, e.g. significance tests, to see if there really is an effect.
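The step from raw data to output data, with one line per subject holding an average over many trials, can be sketched as below. The row layout (subject, measured value) and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

def summarise_trials(raw_rows):
    """Collapse (subject, value) trial rows into one average per subject."""
    by_subject = defaultdict(list)
    for subject, value in raw_rows:
        by_subject[subject].append(value)
    return {subject: mean(values) for subject, values in by_subject.items()}

# Invented raw data: several trials per subject.
raw = [("s1", 2), ("s1", 4), ("s2", 10), ("s2", 12), ("s2", 14)]
print(summarise_trials(raw))  # {'s1': 3, 's2': 12}
```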
If the researcher suspects that the effect stems from a different variable than the
although it happens.
Examples of experiments
Social psychology
Genetics
Physics
3.4.2 Types:
Motivation research,
Projective techniques.
Non-structured and non-disguised: The purpose of the study is clear, but the
responses to the question are open-ended.
Example
Steps in a Questionnaire Development Process
[Figure: pre-design activities, design activities and post-design activities, with question evaluation by researcher and by client and pretesting of the questionnaire]
These are the questions where respondents are free to answer in their own
words.
Example:
State five things that are interesting and challenging in the job.
What factors do you consider while buying a suit?
A closed-ended question, in contrast, asks the respondents to make
choices among a set of alternatives given by the researcher.
Example:
Advantages
Since they do not restrict the respondent’s response, the widest scope of
response can be attained.
• Responses may often be used as direct quotes to bring realism and life to
the written report.
Disadvantages
Advantages
Disadvantages
These are the questions, in which the respondents can agree with one part of the
question, but not agree with the other or cannot answer without making a
particular assumption.
Example:
Do you feel that firms today are employee oriented and customer oriented?
Ambiguous Questions:
Some questions might require respondents to recall experiences from the past
that are hazy in their memory. Answers to such questions might have bias.
For example, if an employee who has had 30 years of service in the
organisation is asked to state when he first started working in a particular
department and for how long, he may not be able to give the correct answers
and may be off in his responses.
Leading questions:
A leading question is one that suggests the answer to the respondent. The
question itself will influence the answer, when respondents get an idea that the
data is being collected by a company.
Example:
How do you like the programme on Radio Mirchy?
Don’t you think that in these days of escalating costs of living, employees
should be given good pay raises?
Loaded questions:
Questions that would elicit highly biased emotional responses from subjects.
Example:
Do you think the civic body is incompetent?
To what extent do you think the management is likely to be vindictive if the
union decides to go on strike?
Here the words incompetent, strike and vindictive are loaded.
Funneling technique:
The questioning technique that consists of initially asking general and broad
questions, and gradually narrowing the focus thereafter to more specific
themes, is called the funneling technique.
3.4.8 Demerits:
The response rate for questionnaires is generally very poor compared
with that for schedules.
Bias of the respondents cannot be determined easily.
Respondents need to be educated.
Follow up on non response or unfilled questionnaire only adds to the cost and
time.
Accuracy of response cannot be ensured.
A lot of care is required to design and structure a questionnaire.
When the researcher is interested in a spontaneous response, this method is
unsuitable.
Any clarification required by the respondent regarding questions is not possible.
“Tell me what you think of when you think of Kellogg’s special K cereal”
Sentence completion:
Participants are asked to write the dialog for a cartoon-like picture.
“What will the customer comment when she sees the salesperson approaching
her in the new-car showroom?”
Component sorts:
Participants are presented with flash cards containing component features and
asked to create new combinations.
Sensory sorts:
Participants are presented with scents, textures and sounds usually verbalized on
cards and asked to arrange them by one or more criteria.
Personification:
“If brand X were a person, what type of person would brand X be?”
Semantic mapping:
Participants are presented with a four quadrant map where different variables
anchor the two different axes; they then spatially place brands, product
components or organisations within four quadrants.
Brand mapping:
Participants are presented with different brands and asked to talk about their
perceptions, usually in relation to several criteria. They may also be asked to
spatially place each brand on one or more semantic maps.
Participants are asked to imagine a brand as something else (e.g., a Tide dog
food or Marlboro cereal), describing its attributes and position.
This test consists of 45 ink blot cards which are based on colour, movement,
shading and other factors involved in inkblot perception.
Only one response per card is obtained from the respondent and the responses
of a respondent are interpreted at three levels of form appropriateness.
Form responses are interpreted for knowing the accuracy or inaccuracy of
respondent’s percepts; shading and colour for ascertaining his affectional and
emotional needs; and movement responses for assessing the dynamic aspects of
his life.
3.6 Sampling:
Population:
All items that have been chosen to study are called population.
Example :
Total number of families living in a city, total number of employees in an
organisation etc.
Sample:
Sample size:
Sample frame:
The sampling frame is the list of elements from which the sample is actually drawn.
In effect, the sampling frame is the correct list of the population.
Example:
Telephone directory
Yellow pages
Census:
Example
A researcher may be interested in contacting firms in iron and steel or
petroleum products industry. These industries are limited in number, so a census
will be suitable.
100% enumeration of all elements in the population is called census.
Lower cost
Greater accuracy of results
Greater speed of data collection
Availability of population elements
Probability Sampling:
Probability sampling is a method of sampling that ensures that every unit in the
population has a known non-zero chance of being included in the sample.
The different methods of random sampling are:
Systematic sampling:
In this method, the units are selected from the population at a uniform interval.
To facilitate this we arrange the items in numerical, alphabetical, geographical
or any other order. When a complete list of the population is available, this
method is used.
K=N/n.
Example:
If you want to take a systematic sample of n=40 from the population of
N=600 employees, the population of 600 would be partitioned into 40 groups
of K=600/40=15 employees each. For example, if the first number selected was 005, the next selections
would be 020, 035, 050, 065....
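The selection rule in this example can be sketched in a few lines. The function name and the numbering of employees from 1 to 600 are assumptions made for illustration; the sample size of 40 is taken from the 600/40 in the example.

```python
import random

def systematic_sample(population, n, start=None, seed=None):
    """Pick every K-th unit, where K = N // n, from a random or given start."""
    k = len(population) // n                      # sampling interval K = N / n
    if start is None:
        start = random.Random(seed).randrange(k)  # random start in first interval
    return [population[start + i * k] for i in range(n)]

employees = list(range(1, 601))                       # employees numbered 1 to 600
sample = systematic_sample(employees, n=40, start=4)  # index 4 is employee 005
print(sample[:5])  # [5, 20, 35, 50, 65]
```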
Cluster Sampling:
In cluster sampling, the population is divided into groups or clusters, such that
each cluster is representative of the population.
If a study has to be done to find out the number of children that each family in
Chennai has, then the city can be divided into several clusters and a few clusters
can be chosen at random. Every family in the chosen clusters can be a sample
unit.
While using cluster sampling, the following points should be noted.
(i) To get precise results, clusters should be as small as possible, consistent
with the cost and limitations of the survey.
(ii) The number of units in each cluster must be more or less equal.
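The cluster sampling procedure described above (divide the population into clusters, choose a few clusters at random, then take every unit in the chosen clusters) can be sketched as follows. The function name, zone names and families are invented for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose clusters at random and return every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical city divided into three clusters of two families each.
city = {
    "Zone A": ["family 1", "family 2"],
    "Zone B": ["family 3", "family 4"],
    "Zone C": ["family 5", "family 6"],
}
chosen, families = cluster_sample(city, n_clusters=2, seed=1)
print(len(chosen), len(families))  # 2 4
```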
Area sampling:
Double sampling:
In non-probability sampling, the selection of the sample units does not ensure a
known chance to the units being selected. In other words, the units are selected
without using the principle of probability. It is suitable for pilot studies and
exploratory research.
The methods of non-random sampling are,
In this sampling, the sample is selected with definite purpose in view and the
choice of the sampling units depends entirely on the discretion and the
judgments of the investigator.
Example:
If an investigator wants to give the picture that the standard of living has
increased in the city of Madurai, he may take the individuals in the sample from
the posh localities and ignore the localities where low and middle income group
families live.
Quota Sampling:
Expert Sampling:
Convenience sampling:
Researchers use this sampling method if the sample for the study is very rare or
is limited to a very small subgroup of the population. This type of sampling
technique works like chain referral. After observing the initial subject, the
researcher asks for assistance from the subject to help identify people with a
similar trait of interest.
The process of snowball sampling is much like asking your subjects to nominate
another person with the same trait as your next subject. The researcher then
observes the nominated subjects and continues in the same way until a
sufficient number of subjects is obtained.
For example, when obtaining subjects for a study of a rare
disease, the researcher may opt to use snowball sampling since it will be
difficult to obtain subjects. It is also possible that patients with the same
disease have a support group; being able to observe one of the members as your
initial subject will then lead you to more subjects for the study.
The chain referral process allows the researcher to reach populations that
are difficult to sample when using other sampling methods.
The process is cheap, simple and cost-efficient.
The researcher has little control over the sampling method. The subjects
that the researcher can obtain rely mainly on the previous subjects that were
observed.
Representativeness of the sample is not guaranteed. The researcher has no
idea of the true distribution of the population and of the sample.
Sampling bias is also a concern for researchers when using this sampling
technique. Initial subjects tend to nominate people that they know well. Because
of this, it is highly possible that the subjects share the same traits and
characteristics, so the sample that the researcher obtains may represent
only a small subgroup of the entire population.
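$n = \dfrac{z^2 \sigma^2}{E^2}$
$n = \dfrac{z^2 p q}{E^2}$
n = sample size
z = standard normal value for the chosen confidence level
σ = standard deviation
E = tolerable error
p = sample proportion
q = 1 - p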
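The two sample size formulas above can be tried out with a short sketch. The function names and the input values (z = 1.96 for 95% confidence, a standard deviation of 10, and the tolerable errors) are assumptions chosen for illustration; results are rounded up because a sample size must be a whole number.

```python
from math import ceil

def sample_size_for_mean(z, sigma, error):
    """Sample size from z^2 * sigma^2 / E^2, rounded up."""
    return ceil((z * sigma / error) ** 2)

def sample_size_for_proportion(z, p, error):
    """Sample size from z^2 * p * q / E^2 with q = 1 - p, rounded up."""
    return ceil(z ** 2 * p * (1 - p) / error ** 2)

print(sample_size_for_mean(1.96, sigma=10, error=2))         # 97
print(sample_size_for_proportion(1.96, p=0.5, error=0.05))   # 385
```

Halving the tolerable error roughly quadruples the required sample, since the error enters the formula squared.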
Golden rule:
The greater the sample size, the more accurately your findings will reflect the ‘true
picture’.
Introduction:
Data preparation includes editing, coding and data entry and is the
activity that ensures the accuracy of the data and their conversion from raw
form to reduced and classified forms that are more appropriate for analysis.
After collecting the data, the next task of the researcher is to analyze and
interpret the data.
The purpose of analysis is to draw conclusions.
There are two parts in processing the data:
Data analysis
Interpretation of data.
Analysis of the data involves organizing the data in a particular manner.
Interpretation of the data is a method for deriving conclusions from the
data analyzed.
Analysis of data is not complete, unless it is interpreted.
Steps in processing of DATA:
Interviews
Questionnaires
Observation
In-depth interviews
Focus group interviews
Secondary sources
Editing:
Editing is the process of ensuring that the data are clean, that is, free
from inconsistencies and incompleteness.
Editing detects errors and omissions, corrects them when possible, and certifies
that maximum data quality standards are achieved.
The editor’s purpose is to guarantee that data are:
Accurate
Consistent with the intent of the question and other information in the survey.
Uniformly entered
Arranged to simplify coding and tabulation.
Coding rules:
Four rules guide the pre and post coding and categorization of a data set. They
are
1) The best partitioning of the data for testing hypotheses and showing
relationships and
2) The availability of comparison of data.
Example:
Suppose the researcher is analyzing the inconvenience that a car owner is facing
with his present model. Therefore, the factor chosen for coding may be
inconvenience. Under this there could be 4 types
Example:
Sometimes the respondents might think that they belong to more than one
category. This is because sales personnel may be doing a sales job and therefore
should be placed under the sales category. Also, he may supervise the work of
other sales executives. In this case he is doing a managerial function. Viewed in
this context, he should be placed under the managerial category, which has a
different code. Therefore, he can only be put under one category, which is to be
decided. One way of deciding this could be to analyze “in which of the two
functions does he spend most time?”
Another scenario assumes that there is a salesman who is currently unemployed.
Under the occupation column he will tick sales, while under the current
employment column he will mark unemployed. How does one codify this?
Under which category should he be placed? One of the solutions is to have a
Types of Tabulation:
If the question has only one answer, the tabulation may be of the following.
Sometimes, respondents may give more than one answer to a given question. In this
case, there will be an overlap, and responses when tabulated, need not add to 100
percent.
Example:
What do you dislike about the car which you own at present?
Parameter No of respondents
Engine 10
Body 15
Mileage 15
Interior 06
Colour 18
Maintenance frequency 16
Inconvenience 20
Total 100
Example:
Popularity of a healthy drink among families having different incomes. Suppose 500
families are contacted and data collected is as follows:
Income        Number of children per family                     Total
              0     1     2     3     4     5     More than 5   families
<1000         5     0     8     9     11    15    25            73
1001-2000     10    5     8     10    13    18    27            91
2001-3000     20    10    12    14    20    22    32            130
3001-4000     12    3     6     7     13    20    30            91
4001-5000     6     2     6     5     10    15    20            64
>5000         6     1     4     5     7     10    18            51
Total         59    21    44    50    74    100   152           500
Note: The above table shows that consumption of a health drink not only depends on
income but also on the number of children per family.
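The row and column totals of such a two-way table can be checked with a short sketch. The counts are those given in the table above; the reading of the columns as "number of children per family" is taken from the note and is an assumption about the lost column heading.

```python
# Rows: income groups; columns: number of children (0, 1, 2, 3, 4, 5, more than 5).
income_groups = ["<1000", "1001-2000", "2001-3000", "3001-4000", "4001-5000", ">5000"]
counts = [
    [5, 0, 8, 9, 11, 15, 25],
    [10, 5, 8, 10, 13, 18, 27],
    [20, 10, 12, 14, 20, 22, 32],
    [12, 3, 6, 7, 13, 20, 30],
    [6, 2, 6, 5, 10, 15, 20],
    [6, 1, 4, 5, 7, 10, 18],
]

row_totals = [sum(row) for row in counts]            # families per income group
col_totals = [sum(col) for col in zip(*counts)]      # families per number of children
grand_total = sum(row_totals)

for group, total in zip(income_groups, row_totals):
    print(group, total)
print(col_totals, grand_total)
```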
A statistical table has at least four major parts and some other minor parts.
(1) The Title
(2) The Box Head (column captions)
(3) The Stub (row captions)
(4) The Body
(5) Prefatory Notes
(6) Foot Notes
(7) Source Notes
The general sketch of table indicating its necessary parts is shown below:
----THE TITLE----
----Prefatory Notes----
----Box Head----
----Stub (Row Captions)----     ----Column Captions----
----Stub Entries----            ----The Body----
Foot Notes…
Source Notes…
Before taking up summarizing, the data should be classified into (1) relevant
data and (2) irrelevant data. During the field study, the researcher collects a lot of
data which he may think would be of use. Summarizing the data includes:
Bases of Classification:
There are four important bases of classification:
(1) Qualitative Base
Types of Classification:
(1) First the data are classified and then they are presented in tables; classification
and tabulation in fact go together. So classification is the basis for tabulation.
Frequency Distribution
Grouped Data:
Data presented in the form of frequency distribution is called grouped data.
Array:
Numerical raw data arranged in ascending or descending order is called an
array.
Example:
Array the following data in ascending or descending order 6, 4, 13, 7, 10, 16, 19.
Solution:
Array in ascending order is 4, 6, 7, 10, 13, 16, 19
Array in descending order is 19, 16, 13, 10, 7, 6, 4
Class Limits:
The variant values of the classes or groups are called the class limits.
The smaller value of the class is called lower class limit and larger value of the class is
called upper class limit.
Class limits are also called inclusive classes.
For Example: Let us take the class 10 – 19, the smaller value 10 is lower class limit and
larger value 19 is called upper class limit.
Class Boundaries:
The true values, which describe the actual class limits of a class, are called class
boundaries.
The smaller true value is called the lower class boundary and the larger true value is
called the upper class boundary of the class.
It is important to note that the upper class boundary of a class coincides with the lower
class boundary of the next class.
For Example:
Weights in Kg     No of Students
60 – 65           8
65 – 70           12
70 – 75           5
Total             25
A student whose weight is at least 60 kg but less than 65 kg would be included in the
60 – 65 class. A student whose weight is 65 kg would be included in the next class, 65 – 70.
Open-end Classes:
A class that has either no lower class limit or no upper class limit in a frequency table is
called an open-end class. Open-end classes are best avoided in practice, because
they create problems in calculation.
For Example:
Weights (Pounds)     No of Persons
Below 110            6
110 – 120            12
120 – 130            20
130 – 140            10
140 and above        2
The class mark or mid point is the mean of the lower and upper class limits or boundaries.
It is obtained by dividing the sum of the lower and upper class limits (or class boundaries)
of a class by 2.
For Example: The class mark or midpoint of the class 60 – 69 is (60 + 69)/2 = 64.5
The difference between the upper and lower class boundaries (not between class limits)
of a class or the difference between two successive mid points is called size of class
interval.
Construct a frequency distribution with a suitable class interval size for the marks
obtained by the students of a class, given below:
23, 50, 38, 42, 63, 75, 12, 33, 26, 39, 35, 47, 43, 52, 56, 59, 64, 77, 15, 21, 51, 54, 72, 68,
36, 65, 52, 60, 27, 34, 47, 48, 55, 58, 59, 62, 51, 48, 50, 41, 57, 65, 54, 43, 56, 44, 30, 46,
67, 53
Solution:
Arrange the marks in ascending order as
12, 15, 21, 23, 26, 27, 30, 33, 34, 35, 36, 38, 39, 41, 42, 43, 43, 44, 46, 47, 47, 48, 48, 50,
50, 51, 51, 52, 52, 53, 54, 54, 55, 56, 56, 57, 58, 59, 59, 60, 62, 63, 64, 65, 65, 67, 68, 72,
75, 77
Minimum Value = 12, Maximum Value = 77
Range = Maximum Value – Minimum Value = 77 – 12 = 65
Number of Classes =
=
=
= = or approximate
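The construction can be sketched in code using the marks given in this example. The rule used for the number of classes, k = 1 + 3.322 log10(n) (often attributed to Sturges), is an assumption, since the formula in the original notes was lost; starting the first class at a multiple of the class width is likewise only one reasonable convention.

```python
import math

marks = [23, 50, 38, 42, 63, 75, 12, 33, 26, 39, 35, 47, 43, 52, 56, 59, 64,
         77, 15, 21, 51, 54, 72, 68, 36, 65, 52, 60, 27, 34, 47, 48, 55, 58,
         59, 62, 51, 48, 50, 41, 57, 65, 54, 43, 56, 44, 30, 46, 67, 53]

n = len(marks)
low, high = min(marks), max(marks)
data_range = high - low

# Assumed rule for the number of classes, rounded up to a whole number.
k = math.ceil(1 + 3.322 * math.log10(n))
width = math.ceil(data_range / k)        # class interval size
start = (low // width) * width           # first lower class limit

freq = {}
lower = start
while lower <= high:
    upper = lower + width - 1            # inclusive class limits, e.g. 10-19
    freq[(lower, upper)] = sum(lower <= m <= upper for m in marks)
    lower += width

for (lo_limit, up_limit), f in freq.items():
    print(f"{lo_limit}-{up_limit}: {f}")
```

Under these assumptions the sketch produces seven classes of width 10, starting at 10-19; a different rule for k would give a different but equally valid table.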
Note: For finding the class boundaries, we take half of the difference between the lower class
limit of the 2nd class and the upper class limit of the 1st class. This value is
subtracted from each lower class limit and added to each upper class limit to get the required
class boundaries.
Discrete data is generated by counting; each and every observation is exact. When
an observation is repeated, the number of times it occurs is called the frequency of
that observation. The class limits in discrete data are true class limits; there are no
class boundaries in discrete data.
Example:
The following are the number of female employees in different branches of
commercial banks. Make a frequency distribution.
2, 4, 6, 1, 3, 5, 3, 7, 8, 6, 4, 7, 4, 4, 2, 1, 3, 6, 4, 2, 5, 7, 9, 1, 2, 10, 1, 8, 9, 2, 3, 1, 2, 3, 4,
4, 4, 6, 6, 5, 5, 4, 5, 8, 5, 4, 3, 3, 2, 5, 0, 5, 9, 9, 8, 10, 0, 4, 10, 10, 1, 1, 2, 2, 1, 8, 6, 9, 10
Solution:
The involved variable is “the number of female employees” which is a discrete
variable. The largest and smallest values of the given data are 10 and 0 respectively.
Number of Employees (Classes)     Tally Marks     Number of Branches (Frequency)
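Since the body of this table did not survive, the tally can be reproduced with a short sketch over the data given in the example. `collections.Counter` is a standard-library class; the text-based tally marks are only an illustration.

```python
from collections import Counter

employees = [2, 4, 6, 1, 3, 5, 3, 7, 8, 6, 4, 7, 4, 4, 2, 1, 3, 6, 4, 2, 5, 7,
             9, 1, 2, 10, 1, 8, 9, 2, 3, 1, 2, 3, 4,
             4, 4, 6, 6, 5, 5, 4, 5, 8, 5, 4, 3, 3, 2, 5, 0, 5, 9, 9, 8, 10, 0,
             4, 10, 10, 1, 1, 2, 2, 1, 8, 6, 9, 10]

freq = Counter(employees)   # value -> number of branches with that many female employees

# One row per class, from the smallest to the largest observed value.
for value in range(min(employees), max(employees) + 1):
    print(f"{value:>2}  {'|' * freq[value]:<15} {freq[value]}")

print("Total", sum(freq.values()))
```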
The total frequency of all classes less than the upper class boundary of a given
class is called the cumulative frequency of that class. “A table showing the cumulative
frequencies is called a cumulative frequency distribution”. There are two types of
cumulative frequency distributions.
(1) "Less than" cumulative frequency distribution
(2) "Or more" cumulative frequency distribution
Types of Diagrams/Charts:
Histogram
Frequency Curve and Polygon
Lorenz Curve
Historigram
Example:
Draw simple bar diagram to represent the profits of a bank for 5 years.
[Table and figure: Years against Profit (million $)]
Simple bar chart showing the profit of a bank for 5 years.
A multiple bar diagram represents two or more sets of inter-related data (it
facilitates comparison between more than one phenomenon).
The technique of the simple bar chart is used to draw this diagram, but different
shades, colors, or dots are used to distinguish between the different phenomena.
Multiple bar charts are used when the total of the different phenomena is meaningless.
Example:
Draw a multiple bar chart to represent the import and export of Canada (values
in $) for the years 1991 to 1995.
Multiple bar chart showing the import and export of Canada from 1991 – 1995.
A sub-divided or component bar chart is used to represent data in which the total
magnitude is divided into different parts or components.
In this diagram, first we make simple bars for each class taking the total magnitude in that
class and then divide these simple bars into parts in the ratio of the various components.
This type of diagram shows the variation in different components within each class as
well as between different classes. A sub-divided bar diagram is also known as a
component bar chart or stacked chart.
Example:
The table below shows the quantity in hundred kgs of Wheat, Barley and Oats
produced on a certain farm during the years 1991 to 1994.
Solution:
To make the component bar chart, first of all we have to take year wise total
production.
A sub-divided bar chart may be drawn on a percentage basis. To do so, we express each
component as a percentage of its respective total. In drawing a percentage bar chart, bars
of length equal to 100 are drawn for each class in the first step and sub-divided in
proportion to the percentages of their components in the second step. The diagram so
obtained is called a percentage component bar chart or percentage stacked bar chart. This
type of chart is useful for comparing the components while holding the totals
constant.
Example:
The table below shows the quantity in hundred kgs of Wheat, Barley and Oats
produced on a certain farm during the years 1991 to 1994.
Solution:
Necessary computations for the construction of the percentage bar chart are given below:
Item       1991: %  Cum     1992: %  Cum     1993: %  Cum     1994: %  Cum
Wheat
Barley
Oats
Total
% indicates the percentage of each item.
Cum indicates the cumulative percentage.
Pie Chart
A pie chart can be used to compare the relation between the whole and its
components.
A pie chart is a circular diagram in which the areas of the sectors of a circle
represent the components.
Circles are drawn with radii proportional to the square root of the quantities they
represent, because the area of a circle is proportional to the square of its radius.
Example:
The following table gives the details of monthly budget of a family. Represent these
figures by a suitable diagram.
Family Budget
Item of Expenditure
Food
Clothing
House Rent
Fuel and Lighting
Miscellaneous
Total
Solution:
Family Budget
Items Expenditure $ Angle of Sectors Cumulative Angle
Food
Clothing
House Rent
Fuel and Lighting
Miscellaneous
Total
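The figures of this table did not survive, so the sketch below uses hypothetical expenditure values purely to illustrate how the columns are filled: each sector angle is the item's share of the total multiplied by 360 degrees, and the cumulative angle is the running sum.

```python
from itertools import accumulate

# Hypothetical monthly budget in $ (NOT the figures from the original table).
budget = {
    "Food": 600,
    "Clothing": 100,
    "House Rent": 400,
    "Fuel and Lighting": 100,
    "Miscellaneous": 300,
}

total = sum(budget.values())

# Angle of sector = (item expenditure / total expenditure) * 360 degrees.
angles = {item: 360 * amount / total for item, amount in budget.items()}
cumulative = list(accumulate(angles.values()))

for (item, angle), cum in zip(angles.items(), cumulative):
    print(f"{item:<18} {budget[item]:>5} {angle:>7.1f} {cum:>7.1f}")
```

The last cumulative angle should always come to 360 degrees, which is a quick check on the arithmetic.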
Mean
Median
Mode
Harmonic mean
Geometric mean
Mean:
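Method's Name                   Nature of Data
                                Ungrouped Data                                 Grouped Data
Direct Method                   $\bar{X} = \dfrac{\sum X}{n}$                  $\bar{X} = \dfrac{\sum fX}{\sum f}$
Indirect or Short-Cut Method    $\bar{X} = A + \dfrac{\sum D}{n}$              $\bar{X} = A + \dfrac{\sum fD}{\sum f}$
Method of Step-Deviation        $\bar{X} = A + \dfrac{\sum u}{n} \times c$     $\bar{X} = A + \dfrac{\sum fu}{\sum f} \times h$
Where
$X$ indicates values of the variable $X$.
$n$ indicates number of values of $X$.
$f$ indicates frequency of different groups.
$A$ indicates assumed mean.
$D = X - A$ indicates deviation from the assumed mean.
$u = \dfrac{X - A}{c}$ or $\dfrac{X - A}{h}$ indicates step-deviation, and $c$ indicates common divisor.
$h$ indicates size of class or class interval in case of grouped data.
$\sum$ indicates summation or addition.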
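A minimal sketch of the three methods for grouped data follows. The mid points, frequencies, assumed mean A and class size h are hypothetical values chosen for illustration; the point is only that the direct, short-cut and step-deviation methods give the same arithmetic mean.

```python
import math

# Hypothetical grouped data: class mid points and their frequencies.
mid_points = [5, 15, 25, 35, 45]
frequencies = [3, 7, 10, 6, 4]

A = 25   # assumed mean (any convenient value)
h = 10   # class size, used here as the common divisor

total_f = sum(frequencies)
pairs = list(zip(frequencies, mid_points))

# Direct method: mean = sum(f * X) / sum(f)
mean_direct = sum(f * x for f, x in pairs) / total_f

# Short-cut method: mean = A + sum(f * D) / sum(f), with D = X - A
mean_shortcut = A + sum(f * (x - A) for f, x in pairs) / total_f

# Step-deviation method: mean = A + (sum(f * u) / sum(f)) * h, with u = (X - A) / h
mean_step = A + sum(f * ((x - A) / h) for f, x in pairs) / total_f * h

print(mean_direct, mean_shortcut, mean_step)
```

The short-cut and step-deviation forms exist to keep hand calculation small; with a computer the direct method is usually sufficient.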
Example (1):
The one-sided train fare of five selected BS students is recorded as follows
, , , and . Calculate arithmetic mean of the following data.
Solution:
Let train fare is indicated by , then
Example (2):
Age (Years)
Number of Students
Solution:
The given distribution is grouped data and the variable involved is the
ages of first year students, while the numbers of students represent frequencies.
Total
Example (3):
The following data shows distance covered by persons to perform
their routine jobs.
Distance (Km)
Number of
Persons
Solution:
The given distribution is grouped data and the variable
involved is the "distance covered", while the "number of persons"
represents frequencies.
Number of
Distance (Km) Persons Mid Points
Total
Example (4):
The following data shows distance covered by persons to perform
their routine jobs.
Distance (Km)
Number of
Persons
Distance Number of
Covered in Persons Mid Points
(Km)
Total
Km
Explanation:
Here from the mid points ( ) it is very much clear that each mid point is
multiple of and there is also a gap of from mid point to mid point i.e.
class size or interval ( ). Keeping in view this, we should prefer to take method
of Step-Deviation instead of Direct Method.
Example (5):
The following frequency distribution shows the marks obtained by
students in statistics at a certain college. Find the arithmetic mean using (1)
Direct Method (2) Short-Cut Method (3) Step-Deviation.
Marks
Frequency
Solution:
Total
Marks
Where:
Example 1:
A student obtained 40, 50, 60, 80, and 45 marks in the subjects of Math,
Statistics, Physics, Chemistry and Biology respectively. Assuming weights 5, 2,
4, 3, and 1 respectively for the above mentioned subjects. Find Weighted
Arithmetic Mean per subject.
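Solution:
Taking the marks as $X$ and the weights as $w$:
$\bar{X}_w = \dfrac{\sum wX}{\sum w} = \dfrac{(5)(40) + (2)(50) + (4)(60) + (3)(80) + (1)(45)}{5 + 2 + 4 + 3 + 1} = \dfrac{825}{15} = 55$ marks/subject.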
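The weighted arithmetic mean of this example can be checked with a few lines of code. The marks and weights are those given in the example; the variable names are illustrative.

```python
# Marks in Math, Statistics, Physics, Chemistry, Biology and their weights.
marks = [40, 50, 60, 80, 45]
weights = [5, 2, 4, 3, 1]

# Weighted mean = sum(w * X) / sum(w)
weighted_sum = sum(w * x for w, x in zip(weights, marks))
weighted_mean = weighted_sum / sum(weights)

# Simple (unweighted) mean, for comparison.
simple_mean = sum(marks) / len(marks)

print(weighted_sum, sum(weights), weighted_mean, simple_mean)
```

The weighted mean and the simple mean coincide here only by chance; in general they differ whenever the heavily weighted subjects have marks above or below the average.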
Merits and Demerits of Arithmetic Mean
Merits:
It is rigidly defined.
It is easy to calculate and simple to follow.
It is based on all the observations.
It is determined for almost every kind of data.
It is finite and not indefinite.
It is readily put to algebraic treatment.
It is least affected by fluctuations of sampling.
Demerits:
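Geometric Mean:
The geometric mean of $n$ positive values is defined as the $n$th root of the product of
the $n$ values.
Hence, the geometric mean for a data set containing $n$ values such as $x_1, x_2, \dots, x_n$ is
$G.M. = \sqrt[n]{x_1 \, x_2 \cdots x_n}$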
Example 4:
Find the Geometric Mean of the values 10, 5, 15, 8, 12
Solution:
Example 6:
Find the Geometric Mean of the following Data
Solution:
We may write it as given below:
Here ,
, , , ,
Using the formula of geometric mean for grouped data, geometric mean in
this case will become:
The method explained above for the calculation of geometric mean is useful
when the number of values in the given data is small and an electronic calculator
is available. When a set of data contains a large number of values,
we need an alternative way of computing the geometric mean. The modified or
alternative way of computing geometric mean is given as under:
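The alternative formula itself did not survive in these notes. A common choice, assumed here, is the logarithmic form $G.M. = \text{antilog}\left(\dfrac{\sum \log x}{n}\right)$. The sketch below computes the geometric mean of the values 10, 5, 15, 8, 12 (used in the examples of this section) both as the nth root of the product and through logarithms.

```python
import math

values = [10, 5, 15, 8, 12]
n = len(values)

# Direct definition: nth root of the product of the n values.
gm_direct = math.prod(values) ** (1 / n)

# Assumed alternative: antilog of the mean of the logarithms (base 10).
mean_log = sum(math.log10(x) for x in values) / n
gm_log = 10 ** mean_log

print(round(gm_direct, 3), round(gm_log, 3))
```

The logarithmic route avoids forming a very large product, which is the practical difficulty the text mentions for long data sets.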
Example 7: Find the Geometric Mean of the values 10, 5, 15, 8, 12
Total
Example 8:
Find the Geometric Mean for the following distribution of students’ marks:
Marks
No. of Students
Solution:
Total
Example 9:
Calculate the harmonic mean of the numbers: 13.5, 14.5, 14.8, 15.2 and 16.1
Solution:
The harmonic mean is calculated as
below:
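Because the worked table is missing, the calculation is sketched here in code. It uses the standard definition of the harmonic mean for ungrouped data, $H.M. = \dfrac{n}{\sum (1/x)}$, applied to the five numbers of this example, and cross-checks the result against `statistics.harmonic_mean` from the Python standard library.

```python
import math
import statistics

values = [13.5, 14.5, 14.8, 15.2, 16.1]
n = len(values)

# Harmonic mean = n divided by the sum of the reciprocals.
sum_reciprocals = sum(1 / x for x in values)
harmonic_mean = n / sum_reciprocals

print(round(sum_reciprocals, 4), round(harmonic_mean, 2))
```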
Total
Example 10:
Given the following frequency distribution of first year students of a particular
college. Calculate the Harmonic Mean.
Age (Years)
Number of Students
Solution:
The given distribution is grouped data and the variable involved is the
ages of first year students, while the numbers of students represent frequencies.
Number of Students
Ages (Years)
Total
years.
Example 11:
Marks
Solution:
The necessary calculations are given below:
Marks
Total
Mode:
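x = x(:); % force a column vector so the concatenations below are valid
X = sort(x);
indices = find(diff([X; realmax]) > 0); % positions where a run of repeated values ends
[modeL, i] = max(diff([0; indices])); % modeL is the length of the longest run
modeValue = X(indices(i)); % most frequent value (renamed to avoid shadowing the built-in mode)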
Examples:
Answer: Since both 18 and 24 occur three times, the modes are 18
and 24 miles per hour. This data set is bimodal.
Answer: Since each value occurs only once in the data set, there is
no mode for this set of data.
Median
Half the numbers in the list will be less, and half the numbers will be greater.
To find the Median, place the numbers you are given in value order and find
the middle number.
Examples:
3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56
There are fifteen numbers. Our middle number will be the eighth number:
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56
And you can see that "half the numbers in the list are less, and half the numbers
are greater."
(Note that it didn't matter if we had some numbers the same in the list)
BUT, if there are an even amount of numbers things are slightly different.
In that case we need to find the middle pair of numbers, and then find the value
that would be half way between them.
3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56
There are now fourteen numbers and so we don't have just one middle number,
we have a pair of middle numbers:
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56
To find the value half-way between them, add them together and divide by 2:
21 + 23 = 44
44 ÷ 2 = 22
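The two cases above (odd and even count of numbers) can be sketched as a small function. The lists are the ones used in the examples; the function name is illustrative.

```python
def median(values):
    """Middle value of the sorted list; mean of the middle pair if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                            # single middle number
    return (ordered[middle - 1] + ordered[middle]) / 2    # half-way between the middle pair

odd_list = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
even_list = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]

median_odd = median(odd_list)
median_even = median(even_list)
print(median_odd, median_even)
```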
Type              Description                                                               Example                 Result
Arithmetic mean   Total sum divided by quantity of integers                                 (1+2+2+3+4+7+9) / 7     4
Median            Middle value that separates the greater and lesser halves of a data set   1, 2, 2, 3, 4, 7, 9     3
Mode              Most frequent number in a data set                                        1, 2, 2, 3, 4, 7, 9     2
Measures of dispersion:
A modern student of statistics is mainly interested in the study of variability
and uncertainty. In this section we shall discuss variability and its measures, and
uncertainty will be discussed in probability. We live in a changing world. Changes
are taking place in every sphere of life. A man of statistics does not show much
interest in those things which are constant. The total area of the earth may not be very
important to a research minded person but the area under different crops, area
covered by forests, area covered by residential and commercial buildings are figures
of great importance because these figures keep on changing from time to time and
from place to place. A very large number of experts are engaged in the study of changing
phenomena. Experts working in different countries of the world keep a watch on
forces which are responsible for bringing changes in the fields of human interest. The
agricultural, industrial and mineral production and their transportation from one part
to the other parts of the world are matters of great interest to the economists,
statisticians, and other experts. The changes in human population, the changes in
standard of living, the changes in literacy rate and the changes in price attract the
experts to make detailed studies about them and then correlate these changes with
human life. Thus variability or variation is something connected with human life and
its study is very important for mankind.
Dispersion:
The word dispersion has a technical meaning in statistics. The average
measures the centre of the data. It is one aspect of the observations. Another feature of
the observations is how they are spread about the centre. The
observations may be close to the centre or they may be spread away from the centre. If
the observations are close to the centre (usually the arithmetic mean or median), we
say that dispersion, scatter or variation is small. If the observations are spread away
from the centre, we say dispersion is large. Suppose we have three groups of students
who have obtained the following marks in a test. The arithmetic means of the three
groups are also given below:
In groups A and B the arithmetic means are equal, i.e. both are 50. But in group A the
observations are concentrated on the centre. All students of group A have almost the
same level of performance. We say that there is consistency in the observations in
group A. In group B the mean is 50 but the observations are not close to the centre.
One observation is as small as 30 and one observation is as large as 70. Thus there is
greater dispersion in group B. In group C the mean is 60 but the spread of the
observations with respect to the centre 60 is the same as the spread of the
observations in group B with respect to their own centre which is 50. Thus in group B
and C the means are different but their dispersion is the same. In group A and C the
means are different and their dispersions are also different. Dispersion is an important
feature of the observations and it is measured with the help of the measures of
dispersion, scatter or variation. The word variability is also used for this idea of
dispersion.
The study of dispersion is very important in statistical data. If in a certain
factory there is consistency in the wages of workers, the workers will be satisfied. But
if some workers have high wages and some have low wages, there will be unrest
among the low paid workers and they might go on strikes and arrange
demonstrations. If in a certain country some people are very poor and some are very
rich, we say there is economic disparity. It means that dispersion is large. The
idea of dispersion is important in the study of wages of workers, prices of
commodities, standard of living of different people, distribution of wealth,
distribution of land among farmers and various other fields of life. Some brief
definitions of dispersion are:
The degree to which numerical data tend to spread about an average value is
called the dispersion or variation of the data.
Dispersion or variation may be defined as a statistic signifying the extent of
the scatteredness of items around a measure of central tendency.
Dispersion or variation is the measurement of the scatter of the size of the
items of a series about the average.
For the study of dispersion, we need some measures which show whether the
dispersion is small or large. There are two types of measure of dispersion which are:
(a) Absolute Measure of Dispersion
(b) Relative Measure of Dispersion
The Range:
Range is defined as the difference between the maximum and the minimum
observation of the given data. If $X_m$ denotes the maximum observation and $X_0$ denotes the
minimum observation, then the range is defined as
Range $= X_m - X_0$
In case of grouped data, the range is the difference between the upper
boundary of the highest class and the lower boundary of the lowest class. It is also
calculated by using the difference between the mid points of the highest class and the
lowest class. It is the simplest measure of dispersion. It gives a general idea about the
total spread of the observations. It does not enjoy any prominent place in statistical
theory. But it has its application and utility in quality control methods which are used
to maintain the quality of the products produced in factories. The quality of products
is to be kept within a certain range of values.
The range is based on the two extreme observations. It gives no weight to the
central values of the data. It is a poor measure of dispersion and does not give a good
picture of the overall spread of the observations with respect to the center of the
observations. Let us consider three groups of the data which have the same range:
In all the three groups the range is 50 – 30 = 20. In group A there is
concentration of observations in the centre. In group B the observations are clustered
at the two extremes, and in group C the observations are almost equally
distributed in the interval from 30 to 50. The range fails to explain these differences
in the three groups of data. This defect in range cannot be removed even if we
calculate the coefficient of range which is a relative measure of dispersion. If we
calculate the range of a sample, we cannot draw any inferences about the range of the
population.
Coefficient of Range:
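It is a relative measure of dispersion and is based on the value of range. It is
also called range coefficient of dispersion. It is defined as:
Coefficient of Range $= \dfrac{X_m - X_0}{X_m + X_0}$
where $X_m$ is the maximum observation and $X_0$ is the minimum observation.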
Let us take two sets of observations. Set A contains marks of five students in
Mathematics out of 25 marks and set B contains marks of the same students in
English out of 100 marks.
Set A: (Mathematics)
Set B: (English)
In set A the range is 10 and in set B the range is 20. Apparently it seems as if
there is greater dispersion in set B. But this is not true. The range of 20 in set B is for
large observations and the range of 10 in set A is for small observations. Thus 20 and
10 cannot be compared directly. Their base is not the same. Marks in Mathematics
are out of 25 and marks of English are out of 100. Thus, it makes no sense to compare
10 with 20. When we convert these two values into coefficient of range, we see that
coefficient of range for set A is greater than that of set B. Thus there is greater
dispersion or variation in set A. The marks of students in English are more stable than
their marks in Mathematics.
Example 1:
Following are the wages of 8 workers of a factory. Find the range and the
coefficient of range. Wages in ($) 1400, 1450, 1520, 1380, 1485, 1495, 1575, 1440.
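Solution:
Here the maximum wage is 1575 and the minimum wage is 1380.
Range $= 1575 - 1380 = 195$
Coefficient of Range $= \dfrac{1575 - 1380}{1575 + 1380} = \dfrac{195}{2955} \approx 0.066$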
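The range and coefficient of range for the wages in Example 1 can be verified with a short sketch. The coefficient is taken as (maximum - minimum) / (maximum + minimum), the usual definition of the range coefficient of dispersion.

```python
wages = [1400, 1450, 1520, 1380, 1485, 1495, 1575, 1440]

highest = max(wages)
lowest = min(wages)

# Absolute measure: range = maximum - minimum (in $).
wage_range = highest - lowest

# Relative measure: a pure number, free of units.
coefficient_of_range = (highest - lowest) / (highest + lowest)

print(wage_range, round(coefficient_of_range, 3))
```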
Example 2:
The following distribution gives the numbers of houses and the number of
persons per house.
Number of
Persons
Number of
Houses
Calculate the range and coefficient of range.
Solution:
Range
Example 3:
Weights (Kg)
Number of Students
Calculate the range and coefficient of range.
Solution:
Method 1:
Method 2:
Quartile Deviation:
It is based on the lower quartile $Q_1$ and the upper quartile $Q_3$. The difference
$Q_3 - Q_1$ is called the inter quartile range. The difference $Q_3 - Q_1$ divided by 2 is
called semi-inter-quartile range or the quartile deviation. Thus
$Q.D. = \dfrac{Q_3 - Q_1}{2}$
The corresponding relative measure is the coefficient of quartile deviation,
Coefficient of $Q.D. = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
It is a pure number free of any units of measurement. It can be used for comparing the
dispersion in two or more than two sets of data.
Example 1:
The wheat production (in Kg) of 20 acres is given as: 1120, 1240, 1320, 1040,
1080, 1200, 1440, 1360, 1680, 1730, 1785, 1342, 1960, 1880, 1755, 1720, 1600, 1470,
1750, and 1885. Find the quartile deviation and coefficient of quartile deviation.
Solution:
After arranging the observations in ascending order, we get
1040, 1080, 1120, 1200, 1240, 1320, 1342, 1360, 1440, 1470, 1600, 1680, 1720, 1730,
1750, 1755, 1785, 1880, 1885, 1960.
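The rest of this solution is missing, so the calculation is sketched below. It assumes one common textbook convention: $Q_1$ is the value of the $(n+1)/4$th item and $Q_3$ the value of the $3(n+1)/4$th item of the ordered data, with linear interpolation between neighbouring items. Other quartile conventions give slightly different values.

```python
production = [1120, 1240, 1320, 1040, 1080, 1200, 1440, 1360, 1680, 1730,
              1785, 1342, 1960, 1880, 1755, 1720, 1600, 1470, 1750, 1885]

def quartile(ordered, p):
    """Value of the p*(n+1)th item (1-based) with linear interpolation (assumed convention)."""
    n = len(ordered)
    position = p * (n + 1)
    k = int(position)               # whole part of the position
    fraction = position - k         # fractional part of the position
    if k < 1:
        return ordered[0]
    if k >= n:
        return ordered[-1]
    return ordered[k - 1] + fraction * (ordered[k] - ordered[k - 1])

ordered = sorted(production)
q1 = quartile(ordered, 0.25)
q3 = quartile(ordered, 0.75)

quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, quartile_deviation, round(coefficient_qd, 4))
```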
Example 2:
Calculate the quartile deviation and coefficient of quartile deviation from the
following data:
Maximum Load (short-tons)     Number of Cables
Solution:
The necessary calculations are given below:
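For sample data in which the suitable average is the arithmetic mean $\bar{X}$, the mean
deviation ($M.D.$) is given by the relation:
$M.D. = \dfrac{\sum |X - \bar{X}|}{n}$
When the mean deviation is calculated about the median, the formula becomes
$M.D. = \dfrac{\sum |X - \text{Median}|}{n}$
The mean deviation about the mode is
$M.D. = \dfrac{\sum |X - \text{Mode}|}{n}$
For a population data the mean deviation about the population mean $\mu$ is
$M.D. = \dfrac{\sum |X - \mu|}{N}$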
The mean deviation is a better measure of absolute dispersion than the range
and the quartile deviation.
A drawback in the mean deviation is that we use the absolute deviations $|X - \bar{X}|$,
which does not seem logical. The reason for this is that $\sum (X - \bar{X})$ is
always equal to zero. Even if we use median or mode in place of $\bar{X}$, even then the
Example 1:
Calculate the mean deviation from (1) arithmetic mean (2) median (3) mode in
respect of the marks obtained by nine students given below and show that the mean
deviation from median is minimum.
Marks (out of 25): 7, 4, 10, 9, 15, 12, 7, 9, 7
Solution:
After arranging the observations in ascending order, we get
Marks: 4, 7, 7, 7, 9, 9, 10, 12, 15
Marks
Total
From the above calculations, it is clear that the mean deviation from the median
has the least value.
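Since the working table is missing, the three mean deviations can be reproduced with a short sketch. It uses the marks of this example and the definition "sum of absolute deviations from the chosen average, divided by the number of observations"; `statistics.median` and `statistics.mode` are standard-library functions.

```python
import statistics

marks = [7, 4, 10, 9, 15, 12, 7, 9, 7]

def mean_deviation(values, centre):
    """Average absolute deviation of the values from the given centre."""
    return sum(abs(x - centre) for x in values) / len(values)

mean_value = sum(marks) / len(marks)
median_value = statistics.median(marks)
mode_value = statistics.mode(marks)

md_mean = mean_deviation(marks, mean_value)
md_median = mean_deviation(marks, median_value)
md_mode = mean_deviation(marks, mode_value)

print(round(md_mean, 3), round(md_median, 3), round(md_mode, 3))
```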
Example 2:
Calculate the mean deviation from mean and its coefficients from the following
data.
Size of Items
Frequency
Solution:
The necessary calculation is given below:
Size of
Items
Total
Standard Deviation
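The standard deviation is defined as the positive square root of the mean of
the squared deviations taken from the arithmetic mean of the data.
For the sample data the standard deviation is denoted by $S$ and is defined as:
$S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$
For a population data the standard deviation is denoted by $\sigma$ (sigma) and is
defined as:
$\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$
For a frequency distribution the formulas become
$S = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{\sum f}}$
or
$\sigma = \sqrt{\dfrac{\sum f (X - \mu)^2}{\sum f}}$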
The standard deviation is in the same units as the units of the original
observations. If the original observations are in grams, the value of the standard
deviation will also be in grams.
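A minimal sketch of this definition follows, using the small data set 2, 4, 8, 6, 10, 12 that appears in the examples of this unit. It divides by n, following the definition "mean of the squared deviations"; many statistical packages divide by n - 1 for sample data, which gives a slightly larger value. The assumed mean A = 8 in the short-cut calculation is an arbitrary illustrative choice.

```python
import math
import statistics

data = [2, 4, 8, 6, 10, 12]
n = len(data)
mean = sum(data) / n

# Actual mean method: S = sqrt( sum((X - mean)^2) / n )
sd_actual = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Short-cut method with an assumed mean A: S = sqrt( sum(D^2)/n - (sum(D)/n)^2 ), D = X - A
A = 8
deviations = [x - A for x in data]
sd_shortcut = math.sqrt(sum(d ** 2 for d in deviations) / n - (sum(deviations) / n) ** 2)

print(round(sd_actual, 4), round(sd_shortcut, 4))
```

Both routes give the same value, and it matches `statistics.pstdev`, the standard-library function that also divides by n.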
The standard deviation plays a dominating role in the study of variation in
the data. It is a very widely used measure of dispersion. It stands like a tower among
the measures of dispersion. As far as the important statistical tools are concerned, the first
important tool is the mean $\bar{X}$ and the second important tool is the standard deviation
$S$. It is based on all the observations and is subject to mathematical treatment. It is of
great importance for the analysis of data and for the various statistical inferences.
However, some alternative methods are also available to compute the standard
deviation. The alternative methods simplify the computation. Moreover, in discussing
these methods we will confine ourselves to sample data only, because a statistician
mostly deals with sample data rather than the whole population.
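$S = \sqrt{\dfrac{\sum D^2}{n} - \left(\dfrac{\sum D}{n}\right)^2}$
Where $D = X - A$ and $A$ is any assumed mean other than zero. This method is also
known as short-cut method.
(b) If $A$ is considered to be zero then the above formulas are reduced to the
following formulas:
$S = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}$
(c) If we are in a position to simplify the calculation by taking some common
factor or divisor $c$ from the given data, so that $u = \dfrac{X - A}{c}$, the formulas for
computing standard deviation are:
$S = c \times \sqrt{\dfrac{\sum u^2}{n} - \left(\dfrac{\sum u}{n}\right)^2}$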
Method-II: Taking Assumed Mean as
Total
Total
Example 2:
Calculate standard deviation from the following distribution of marks by using all the
methods.
Solution:
Method-I: Actual Mean Method
Marks
Total
Marks
Marks
Total
Marks
Marks
Total
Marks
Marks
Total
Mark
Coefficient of Variation:
The most important of all the relative measures of dispersion is the coefficient of
variation. This word is variation not variance. There is no such thing as coefficient of
variance.
average of their wages. But the families consume meat in quite different quantities.
Some families use very small quantities of meat and some others use large quantities
of meat. We say that there is greater variation in their consumption of meat. The
observations about the quantity of meat are more dispersed or more variant.
Example 1:
Calculate the coefficient of standard deviation and coefficient of variation for
the following sample data: 2, 4, 8, 6, 10, and 12.
Solution:
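The worked solution is not reproduced here, so the following is only a minimal sketch of the calculation. It assumes the usual definitions, coefficient of standard deviation = S / mean and coefficient of variation = (S / mean) x 100, and it assumes that the standard deviation divides by n rather than n - 1. The function name is illustrative.

```python
import math


def coefficient_of_variation(data):
    """Return (mean, S, coefficient of S.D., coefficient of variation in %)."""
    n = len(data)
    mean = sum(data) / n
    # Assumption: the standard deviation divides by n, not n - 1.
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    coeff_sd = sd / mean
    return mean, sd, coeff_sd, coeff_sd * 100


mean, sd, coeff_sd, cv = coefficient_of_variation([2, 4, 8, 6, 10, 12])
print(f"mean = {mean:.2f}, S = {sd:.4f}")
print(f"coefficient of S.D. = {coeff_sd:.4f}, C.V. = {cv:.2f}%")
```

If the n - 1 divisor is preferred, only the line computing `sd` changes, and both coefficients become slightly larger.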
Example 2:
Calculate coefficient of standard deviation and coefficient of variation from
the following distribution of marks:
Solution:
Variance:
For sample data the variance is denoted by $S^2$ and the population variance is
denoted by $\sigma^2$.
$S^2 = \frac{\sum (X - \bar{X})^2}{n}$
where $\bar{X}$ is the sample mean and $n$ is the number of observations in the sample.
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
where $\mu$ is the mean of the population and $N$ is the number of observations in the data.
The sample variance is calculated and, if need be, it is used to make inferences
about the population variance.
Thus the unit of $S^2$ is the square of the units of the original measurement.
In simple words we can say that variance is the square of standard deviation.
Example 1:
Calculate the variance for the following sample data: 2, 4, 8, 6, 10, and 12.
Solution:
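As a hedged check of this example, the sketch below uses the standard-library `statistics` module. It assumes the variance divides by n (which is what `pvariance` does); `statistics.variance` would divide by n - 1 instead.

```python
from statistics import pstdev, pvariance

data = [2, 4, 8, 6, 10, 12]

# Assumption: divide by n, so the "population" functions are used.
variance = pvariance(data)
sd = pstdev(data)

print(f"variance = {variance:.4f}")
print(f"standard deviation squared = {sd ** 2:.4f}")
```

The two printed values agree, which illustrates the statement that variance is the square of the standard deviation.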
Example 2:
Calculate variance from the following distribution of marks:
No. of Students
Marks
Solution:
The following aspects are considered when examining the statistical relationship between two
or more variables:
Is there an association between two or more variables? If yes, what is the form
and degree of that relationship?
Is the relationship strong or significant enough to arrive at a desirable
conclusion?
Can the relationship be used for predictive purpose, that is, to predict the most
likely value of a dependent variable corresponding to the given value of independent
variable or variables?
There are two different techniques which are used for the study of two or more
variables. These are regression and correlation. Both study the behaviour of the
variables but they differ in their end results. Regression studies the relationship where
dependence is necessarily involved: one variable depends on a certain
number of other variables. Regression can be used for predicting the values of the variable
which depends upon other variables. The term regression was introduced by the
English biometrician, Sir Francis Galton (1822 - 1911). Correlation attempts to study
the strength of the mutual relationship between two variables. In correlation we assume
that the variables are random and dependence of any nature is not involved.
Linear Model
Regression involves the study of equations. First we talk about some simple
equations or linear models. The simplest mathematical model or equation is the
equation of straight line.
Example: Suppose a shopkeeper is selling pencils. He sells one pencil for 2 cents.
The table gives the number of pencils sold and the sale price of the pencils.
The information written above can be presented in some other forms as well.
For example, we can write an equation describing the above relation between X, the
number of pencils sold, and Y, the sale price in cents.
It is very simple to write the equation. The algebraic equation connecting X and Y is
$Y = 2X$.
It is called a mathematical equation or mathematical model in which Y depends
upon X. Here X is called the independent variable and Y is called the dependent variable.
For any number of pencils the sale is exactly 2X cents, neither less nor more. The above model is called a deterministic
mathematical model because we can determine the value of Y without any error by
putting the value of X in the equation. The sale Y is said to be a function of X. The graph of this
relation has the following properties:
The graph lies in the first quadrant because all the values of X and Y are
positive.
It is an exact straight line. But not all graphs are in the form of a straight line; a graph
could be some curve also.
All the points (pairs of X and Y) lie on the straight line.
The line passes through the origin.
Take any point on the line and draw a perpendicular line which joins it
with the X-axis. The ratio of the perpendicular (the value of Y) to the base (the value of X) is 2 at every such point.
It is called the slope of the line and in general it is denoted by “b”. The slope of
the line is the same at all points on the line. The slope “b” is equal to the change
in Y for a unit change in X. The relation $Y = 2X$ is also called the linear equation
between X and Y.
Example: Suppose a carpenter wants to make some wooden toys for small
children. He has purchased some wood and some other material for a fixed amount, and there is a fixed cost of
making each toy. The table gives the information about the number of toys made
and the cost of the toys.
Let X denote the number of toys and Y denote the cost of the toys. What is the
algebraic relation between X and Y? When X = 0, Y equals the amount spent on material. This is called the fixed or
starting cost and it may be denoted by “a”. For each additional toy, the cost rises by the cost of making one toy, which may be denoted by “b”.
Thus X and Y are connected through the following equation:
$Y = a + bX$
1. The line does not pass through the origin. It passes through a point on the
Y-axis. The distance between this point and the origin is called the intercept and is usually
denoted by “a”.
2. Take any point on the line and complete a triangle as shown in the figure.
Let us find the ratio between the perpendicular and the base of this triangle.
This ratio is denoted by “b” in the equation of the straight line. Thus the equation of the
straight line has the intercept a and slope b. In general, when the
values of intercept and slope are not known, we write the equation of the straight line as
$Y = a + bX$. It is also called the linear equation between X and Y, and the relation between
X and Y is called linear. The equation $Y = a + bX$ may also be called the exact linear
model between X and Y or simply the linear model between X and Y. The value of Y can
be determined exactly for any given value of X, so the equation is also
called the deterministic linear model between X and Y. In statistics, when we use
the term “Linear Model”, we shall not mean a mathematical model as described
above.
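The deterministic linear model can be sketched in a few lines. The function name `linear_model` is illustrative; the values a = 0 and b = 2 come from the pencil example (no fixed cost, 2 cents per pencil).

```python
def linear_model(x, a=0.0, b=1.0):
    """Deterministic linear model Y = a + bX (intercept a, slope b)."""
    return a + b * x


# Pencil example: a = 0 (line through the origin), b = 2 cents per pencil.
for pencils in range(1, 6):
    print(pencils, linear_model(pencils, a=0, b=2))

# The slope is the change in Y for a unit change in X, wherever it is measured.
slope = linear_model(4, a=0, b=2) - linear_model(3, a=0, b=2)
print("slope =", slope)
```

Because the model is deterministic, the same X always returns exactly the same Y; there is no error term.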
Non Linear Model
By putting the values of X in this equation, we find the values of Y
as given in the table below. The first and second differences are calculated in that
table.
The second differences are exactly constant. The general quadratic equation or
non-linear model is written as
$Y = a + bX + cX^2$
It is also called the second degree parabola or second degree curve. The graph of
the data is shown in the figure given below:
This figure is not a straight line. It is a curve, or we say that the model
is non-linear.
Readers are advised to remember that if, in certain observed data, the
second differences are constant or almost constant, a second degree
curve fits close to the observed data.
We shall face this type of situation in time series.
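The constant-second-difference property can be checked with a short sketch. The coefficients a, b and c below are hypothetical values chosen only for illustration; they are not the ones from the lost table.

```python
def differences(values):
    """Return successive differences values[i + 1] - values[i]."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical coefficients of Y = a + bX + cX^2, for illustration only.
a, b, c = 1, 2, 3
xs = range(0, 6)
ys = [a + b * x + c * x ** 2 for x in xs]

first = differences(ys)
second = differences(first)

print("Y values:          ", ys)
print("first differences: ", first)
print("second differences:", second)
```

For equally spaced X values one unit apart, every second difference equals 2c, while the first differences keep changing.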
Scatter Diagram
Suppose we have pairs of observations on two variables X and Y.
These points are plotted on a rectangular co-ordinate system taking the
independent variable on the X-axis and the dependent variable on the Y-axis.
Whatever be the name of the independent variable, it is to be taken on the X-axis.
Suppose the plotted points are as shown in figure (a).
Such a diagram is called a scatter diagram. In this figure, we see that when
X has a small value, Y is also small and when X takes a large value, Y also
takes a large value.
This is called direct or positive relationship between X and Y.
The plotted points cluster around a straight line.
It appears that if a straight line is drawn passing through the points, the
line will be a good approximation for representing the original data.
Suppose we draw a line AB to represent the scattered points.
The line AB rises from left to the right and has positive slope.
This line can be used to establish an approximate relation between the
random variable Y and the independent variable X.
It is a non-mathematical method in the sense that different persons may
draw different lines.
This line is called the regression line obtained by inspection or judgment.
Correlation
In other words, if all the points on the scatter diagram tend to lie near
a smooth curve, the correlation is said to be non-linear (curvilinear), as shown in the
figure.
Positive Correlation:
Correlation in the same direction is called positive correlation: if one
variable increases the other also increases, and if one variable decreases the other also
decreases. For example, the length of an iron bar will increase as the temperature
increases.
Negative Correlation:
Correlation in the opposite direction is called negative correlation: if one
variable increases the other decreases, and vice versa. For example, the volume of a gas
will decrease as the pressure increases, or the demand for a particular commodity
increases as the price of that commodity decreases.
If there is no relationship between the two variables, such that the value of one
variable changes while the other remains constant, it is called no or zero correlation.
Perfect Correlation
If any change in the value of one variable changes the value of the other variable
in a fixed proportion, the correlation between them is said to be perfect
correlation. It is indicated numerically as +1 or -1.
Coefficient of Correlation
The degree or level of correlation is measured with the help of the correlation
coefficient.
The Cov(X, Y) may be positive, negative or zero.
The covariance is expressed in the product of the units in which X and Y are measured.
For sample data the correlation coefficient, denoted by “r”, is a measure of the
strength of the linear relation between the X and Y variables, where “r” is a pure
number and lies between -1 and +1.
On the other hand, Karl Pearson’s coefficient of correlation is:
$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$
Examples of Correlation
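Example 1:
Calculate and analyze the correlation coefficient between the number of study
hours and the number of sleeping hours of different students.
Number of study hours       2    4    6    8    10
Number of sleeping hours    10   9    8    7    6
Solution:
Let X be the number of study hours and Y the number of sleeping hours, with
$x = X - \bar{X}$ and $y = Y - \bar{Y}$.
X     Y     x     y     xy    x^2   y^2
2     10    -4    +2    -8    16    4
4     9     -2    +1    -2    4     1
6     8     0     0     0     0     0
8     7     +2    -1    -2    4     1
10    6     +4    -2    -8    16    4
30    40    0     0     -20   40    10    (Total)
$\bar{X} = \frac{30}{5} = 6$ and $\bar{Y} = \frac{40}{5} = 8$
$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \frac{-20}{\sqrt{40 \times 10}} = \frac{-20}{20} = -1$
There is perfect negative correlation between the number of study hours and the
number of sleeping hours.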
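The calculation in Example 1 can be checked with a small sketch of the deviation form of Karl Pearson's coefficient. The function name `pearson_r` is illustrative and not part of the original notes.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the two arithmetic means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


study_hours = [2, 4, 6, 8, 10]
sleeping_hours = [10, 9, 8, 7, 6]

r = pearson_r(study_hours, sleeping_hours)
print("r =", r)
```

Every extra two hours of study goes with exactly one hour less sleep in this data, which is why the coefficient sits at the extreme of its range.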
Example 2:
From the following data, compute the coefficient of correlation between X and Y:
                              X Series    Y Series
Number of Items               15          15
Arithmetic Mean               25          18
Sum of Square Deviations      136         138
Summation of products of deviations of X and Y series from their arithmetic means = 122.
Solution:
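The worked solution is not reproduced here. As a hedged sketch, the summary figures given in the example can be substituted directly into the deviation form of Pearson's coefficient, r = (sum of products of deviations) / sqrt((sum of squared deviations of X) x (sum of squared deviations of Y)). The variable names are illustrative.

```python
import math

# Summary figures as stated in Example 2.
sum_sq_dev_x = 136       # sum of squared deviations of the X series
sum_sq_dev_y = 138       # sum of squared deviations of the Y series
sum_prod_dev = 122       # sum of products of deviations of X and Y

r = sum_prod_dev / math.sqrt(sum_sq_dev_x * sum_sq_dev_y)
print(f"r = {r:.4f}")
```

The number of items and the arithmetic means are not needed here, because the sums are already expressed as deviations from the means.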
Curve Fitting:
Curve fitting is the process of introducing a mathematical relationship between
dependent and independent variables in the form of an equation for a given set of
data.
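The following example explains how to find the equation of a straight line or
least squares line by using the method of least squares, which is very useful in statistics as
well as in mathematics.
Example:
Fit a least squares line to the following data. Also find the trend values and show that
$\sum (Y - \hat{Y}) = 0$.
X    1    2    3    4    5
Y    2    5    3    8    7
Solution:
X     Y     XY    X^2   Trend values $\hat{Y}$    $Y - \hat{Y}$
1     2     2     1     2.4                       -0.4
2     5     10    4     3.7                       +1.3
3     3     9     9     5.0                       -2.0
4     8     32    16    6.3                       +1.7
5     7     35    25    7.6                       -0.6
15    25    88    55    25.0                      0       (Total)
The equation of the least squares line is $\hat{Y} = a + bX$, and the normal equations are
$\sum Y = na + b \sum X$, that is, $25 = 5a + 15b$    (1)
$\sum XY = a \sum X + b \sum X^2$, that is, $88 = 15a + 55b$    (2)
To eliminate ‘a’ from equations (1) and (2), multiply equation (1) by 3 and subtract it from
equation (2). This gives $13 = 10b$, so $b = 1.3$, and substituting in equation (1) gives $a = 1.1$.
The least squares line is therefore $\hat{Y} = 1.1 + 1.3X$, which yields the trend values in the table.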
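The same fit can be sketched in code by solving the two normal equations for a straight line Y = a + bX. The function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Solve the normal equations of Y = a + bX and return (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]

a, b = least_squares_line(xs, ys)
trend = [a + b * x for x in xs]
residuals = [y - t for y, t in zip(ys, trend)]

print(f"a = {a:.2f}, b = {b:.2f}")
print("trend values:", [round(t, 1) for t in trend])
print("sum of residuals:", round(sum(residuals), 10))
```

The residuals sum to zero (up to floating-point rounding), which is a general property of a least squares line that includes an intercept.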
Linear Regression
Regression:
The word regression was used by Francis Galton in 1885. It is defined as
“the dependence of one variable upon another variable”. For example, weight
depends upon height, and the yield of wheat depends upon the amount of fertilizer.
In regression we can estimate the unknown values of one (dependent) variable from
known values of the other (independent) variable.
Linear Regression:
When the dependence of the variable is represented by a straight line then it is
called linear regression, otherwise it is said to be non-linear or curvilinear regression.
For example, if X is the independent variable and Y is the dependent variable, then the
relation Y = a + bX is linear regression.
Regression Line of Y on X:
Regression lines study the average relationship between two variables. In the
regression line of Y on X, we estimate the average value of Y for a given value of X:
Y = a + bX
where Y is the dependent and X is the independent variable. An alternate form of the
regression line of Y on X is:
$Y - \bar{Y} = b_{yx}(X - \bar{X})$
where $b_{yx}$ is the slope (regression coefficient) of Y on X.
Regression Line of X on Y:
In the regression line of X on Y we estimate the average value of X for a given value of
Y.
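The two regression lines can be compared with a short sketch. It reuses the X and Y values from the least squares example above and assumes the usual deviation formulas: slope of Y on X = sum(xy) / sum(x^2) and slope of X on Y = sum(xy) / sum(y^2), where x and y are deviations from the means. The names `b_yx` and `b_xy` are notation introduced here for illustration.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) for the two regression lines."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sum_xy / sum_xx, sum_xy / sum_yy


xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]
mean_x, mean_y, b_yx, b_xy = regression_coefficients(xs, ys)

# Y on X:  Y - mean_y = b_yx (X - mean_x)
# X on Y:  X - mean_x = b_xy (Y - mean_y)
print(f"Y on X: Y = {mean_y - b_yx * mean_x:.2f} + {b_yx:.2f} X")
print(f"X on Y: X = {mean_x - b_xy * mean_y:.2f} + {b_xy:.2f} Y")
print(f"product of the slopes (r squared) = {b_yx * b_xy:.2f}")
```

The two lines are different unless the correlation is perfect; the product of their slopes equals the square of the correlation coefficient.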
Qualitative Data:
Interview transcript
Field notes (notes taken in the field being studied)
Video
Audio recordings
Images
Documents (reports, meeting minutes, e-mails)
Such data usually involve people and their activities, signs, symbols, artefacts
and other objects they imbue with meaning. The most common forms of
qualitative data are what people have said or done.
The method you use will depend on your research topic, your personal
preferences and the time, equipment and finances available to you.
Also, qualitative data analysis is a very personal process, with few rigid rules
and procedures.
However, to be able to analyse your data you must first of all produce it in a
format that can be easily analysed.
It is useful to write memos and notes as soon as you begin to collect data as
these help to focus your mind and alert you to significant points which may be
coming from the data.
These memos and notes can be analysed along with your transcripts or
questionnaires.
You can think of the different types of qualitative data analysis as positioned on
a continuum.
At one end are the highly qualitative, reflective types of analysis, whereas
at the other end are those which treat the qualitative data in a quantitative way,
by counting and coding data.
For those at the highly qualitative end of the continuum, data analysis tends to
be an on-going process, taking place throughout the data collection process.
The researcher thinks about and reflects upon the emerging themes, adapting
and changing the methods if required.
However, during the three interviews she finds that the participants are raising
issues that she has not thought about previously.
So she refines her interview schedule to include these issues for the next few
interviews. This is data analysis.
She has thought about what has been said, analysed the words and refined her
schedule accordingly.
Thematic analysis
This type of analysis is highly inductive, that is, the themes emerge from the
data and are not imposed upon it by the researcher.
In this type of analysis, the data collection and analysis take place
simultaneously.
Even background reading can form part of the analysis process, especially if it
can help to explain an emerging theme.
Using this method, data from different people is compared and contrasted and
the process continues until the researcher is satisfied that no new issues are
arising.
Comparative and thematic analyses are often used in the same project, with the
researcher moving backwards and forwards between transcripts, memos, notes
and the research literature.
Content analysis
For those types of analyses at the other end of the qualitative data continuum,
the process is much more mechanical with the analysis being left until the data
has been collected.
Perhaps the most common method of doing this is to code by content. This is
called content analysis.
Using this method the researcher systematically works through each transcript
assigning codes, which may be numbers or words, to specific characteristics
within the text.
The researcher may already have a list of categories or she may read through
each transcript and let the categories emerge from the data.
This type of analysis can be used for open-ended questions which have been
added to questionnaires in large quantitative surveys, thus enabling the
researcher to quantify the answers.
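A very simple form of coding by content can be sketched as follows. The codebook, the keywords and the sample responses are all hypothetical, and keyword matching is only a crude stand-in for the judgment a researcher applies when assigning codes.

```python
from collections import Counter

# Hypothetical codebook: code label -> keywords that trigger the code.
CODEBOOK = {
    "PRICE": ["price", "cost", "expensive"],
    "SERVICE": ["service", "staff", "support"],
}


def code_response(text, codebook=CODEBOOK):
    """Return the list of codes whose keywords occur in the response."""
    lowered = text.lower()
    return [code for code, keywords in codebook.items()
            if any(keyword in lowered for keyword in keywords)]


# Hypothetical open-ended answers from a questionnaire.
responses = [
    "The price was too high",
    "Staff were helpful but it was expensive",
    "Great support",
]

counts = Counter(code for response in responses
                 for code in code_response(response))
print(counts)
```

Counting how often each code occurs is what allows answers to open-ended questions to be quantified alongside the rest of a survey.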
Discourse analysis
These methods look at patterns of speech, such as how people talk about a
particular subject, what metaphors they use, how they take turns in
conversation, and so on.
Much of this analysis is intuitive and reflective, but it may also involve some
form of counting, such as counting instances of turn-taking and their influence
on the conversation and the way in which people speak to others.
Multidimensional scaling
Uses of MDS:
There are two ways of collecting the input data to plot perceptual mapping:
Non-attribute method:
Here, the researcher asks the respondents to make a judgment about the objects
directly. In this method, the criteria for comparing the objects are decided by the
respondent himself.
Attribute method:
In this method, instead of the respondents selecting the criteria, they are asked to
compare the objects based on the criteria specified by the researcher.
Example 1:
1) Convenient locality
2) Courteous personal service
[Figure: perceptual map of the objects along an axis running from Inconvenient to Convenient locality]
Example 2:
For example, from the following MDS graph, it is observed that company A is
perceived to be taking more interest in the welfare of the staff than company B.
[Figure: MDS graph showing companies A and B against the axis "Interest of staff"]
Example 3:
The consultant collected a lot of relevant data, analyzed it and offered their
recommendations. In one of the presentations, they showed the following
diagram obtained through multi dimensional scaling technique. The diagram
shows the concerns of various zonal managers, indicated by letters A to F,
towards the organization and also towards the staff working under them.
[Figure: MDS diagram positioning zonal managers A to F by concern for the organization and concern for the staff]
It is observed that two zonal managers, viz. B and E, exhibit high concern for
both the organization and the staff. If these criteria are crucial to the
organization, then these two zonal managers could be the right candidates for
higher positions in the head office.
Multivariate Analysis
Definition:
For example, take the case of a college entrance examination, wherein a number of tests
are administered to candidates, and the candidates scoring high total marks across the
subjects are admitted.
This system, though apparently fair, may at times be biased in favour of
the subjects with larger standard deviations.
If the researcher is interested in making probability statements on the basis of
sampled multiple measurements, then the best strategy of data analysis is to use
some suitable multivariate statistical technique.
Multivariate techniques may be appropriately used in such situations for
developing norms as to who should be admitted in college.
The objective underlying multivariate techniques is to represent a collection of
massive data in a simplified way.
The main contribution of these techniques is in arranging a large amount of
complex information in the real data into a simplified visible form.
Multivariate procedure:
3) MANOVA
4) Conjoint analysis
Interdependence method:
In interdependence method, no single variable or group of variables is defined
as being independent or dependent.
The multivariate procedure here involves the analysis of all the variables in the
data set simultaneously.
The goal of interdependence method is to group respondents or objects together.
The most frequently used methods of interdependence techniques are
1) Cluster analysis
2) Factor analysis
3) Multidimensional scaling
Factor analysis:
Important terminologies:
Factor:
Factor loadings:
Factor loadings are those values which explain how closely the variables are
related to each one of the factors discovered.
They are also known as factor-variable correlations.
In fact, factor loadings work as key to understanding what the factors mean.
It is the absolute size of the loadings that is important in the interpretation of a
factor.
Communality (h²):
It shows how much of each variable is accounted for by the underlying factors
taken together.
A high value of communality means that not much of the variable is left over
after whatever the factors represent is taken into consideration.
It is worked out in respect of each variable as under:
h² of the ith variable = (ith factor loading of factor A)² + (ith factor loading of
factor B)² + …
When we take the sum of squared values of factor loadings relating to a factor,
then such a sum is referred to as the Eigen value or latent root.
The Eigen value indicates the relative importance of each factor in accounting for
the particular set of variables being analyzed.
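The two sums described above can be sketched from a table of factor loadings. The loadings below are hypothetical values for three variables on two factors, used only to show the arithmetic: communality sums the squared loadings across a row (one variable), while the Eigen value sums them down a column (one factor).

```python
# Hypothetical loadings: variable -> (loading on factor A, loading on factor B).
loadings = {
    "V1": (0.8, 0.1),
    "V2": (0.7, 0.3),
    "V3": (0.2, 0.9),
}
n_factors = 2

# Communality: sum of squared loadings of one variable across all factors.
communality = {
    variable: sum(value ** 2 for value in row)
    for variable, row in loadings.items()
}

# Eigen value: sum of squared loadings of all variables on one factor.
eigenvalues = [
    sum(row[j] ** 2 for row in loadings.values())
    for j in range(n_factors)
]

for variable, h2 in communality.items():
    print(f"{variable}: h2 = {h2:.2f}")
print("Eigen values:", [round(value, 2) for value in eigenvalues])
```

Both sets of sums add up to the same total, because they are row totals and column totals of the same table of squared loadings.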
Factor scores:
Factor scores represent the degree to which each respondent gets high scores
on the group of items that load high on each factor.
Factor scores can help explain what the factors mean. With such scores, several
other multivariate analyses can be performed.
The first step involved in conducting factor analysis is to define the problem
and identify the variables involved.
A correlation matrix is to be constructed and a method of factor analysis to be
performed is to be selected.
A decision regarding the number of factors to be extracted and the method of
rotation is made.
The rotated factors are interpreted. Depending upon the objective the factor
scores are calculated or surrogate variables selected so as to represent the
factors in subsequent multivariate analysis.
Finally the fit of the factor analysis model is determined.
1) Formulate the problem:
Problem formulation includes several tasks. The objectives of factor analysis
should be identified and the variables to be included in the factor analysis
should be specified based on the past research, theory and judgment of the
researcher. The variables should be appropriately measured in an interval or
ratio scale. An appropriate sample size should be identified. The sample size
should be at least four or five times the number of variables identified for study.
For example, if the study includes 20 variables, then the sample size should be a
minimum of 80 or 100. If the sample size is small and the ratio is not maintained,
the results should be interpreted cautiously.
The variables identified for the study should be correlated in order to conduct
the factor analysis. If the correlation between the variables is small, factor
analysis may not be appropriate. It can also be expected that the variables that
are highly correlated with each other would also highly correlate with the same
factor or factors. Formal statistics are available for testing the appropriateness of
the factor model. Bartlett’s test of sphericity can be used to test the null
hypothesis that the variables are uncorrelated in the population. If the null hypothesis cannot be
rejected, then the appropriateness of factor analysis should be questioned.
factor analysis is concerned only with the variance shared among all the
variables.
a) A priori determination
Due to prior knowledge, the researcher knows how many factors to extract and
thus can specify the number of factors to be extracted beforehand. The
extraction of factors is completed as soon as the desired number of factors is
extracted.
In the approach based on Eigen values, only factors with Eigen values greater than 1.0 are retained; the
other factors are not included in the model. An Eigen value represents the
amount of variance associated with the factor. Hence, only factors with variance
greater than 1.0 are included. If the number of variables is less than 20, this
approach will result in a conservative number of factors.
A scree plot is a plot of the Eigen values against the number of factors in order
of extraction. The shape of the plot is used to determine the number of factors. The
plot typically has a distinct break between the steep slope of factors with large
Eigen values and a gradual trailing off associated with the rest of the factors. This
gradual trailing off is referred to as the scree. The point at which the scree begins
denotes the true number of factors.
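The Eigen value rule can be sketched in a few lines. The Eigen values below are hypothetical (six variables, so they sum to 6); the drops between successive values are printed only as a numerical companion to reading a scree plot by eye.

```python
# Hypothetical Eigen values for six variables, in order of extraction.
eigenvalues = [3.1, 1.6, 0.9, 0.2, 0.1, 0.1]

# Eigen value rule: retain only factors with an Eigen value greater than 1.0.
retained = [value for value in eigenvalues if value > 1.0]

# Share of total variance explained by the retained factors.
explained = sum(retained) / sum(eigenvalues)

# Drops between successive Eigen values; a large drop followed by small
# ones suggests where the scree (the gradual trailing off) begins.
drops = [round(a - b, 2) for a, b in zip(eigenvalues, eigenvalues[1:])]

print("factors retained:", len(retained))
print(f"variance explained: {explained:.1%}")
print("successive drops:", drops)
```

The two criteria need not agree; when they differ, the choice of the number of factors remains a matter for the researcher's judgment.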
The sample is split in half and factor analysis is performed on each half. Only
factors with high correspondence of factor loadings across the two subsamples are
retained.
The two most commonly employed factor analysis procedures are
1) Principal component analysis
2) Common factor analysis
Example:
Method:
3) Comfort (C)
4) Spare parts availability (D)
5) Breakdown frequency (E)
6) Price (F)
A, B, D, E into factor 1
F into factor 2
For future analysis, while conducting a study to obtain
If the researcher wants to analyze the components of the main factor, common
factor analysis is used.
Example:
1) Leg room
2) Seat arrangement
3) Entering the rear seat
4) Inadequate dickey space
5) Door locking mechanism
Cluster analysis:
Objects within a cluster are similar and between the clusters are dissimilar.
The example below shows cluster analysis based on three dimensions age,
income and family size.
Cluster analysis is used to segment the car- buying population in a metro.
For example, cluster A might represent potential buyers of low-end cars, for example the
Maruti 800 (for the common man). These are the people who are graduating from
the two-wheeler market segment.
Cluster B may represent mid-population segment buying Zen, Santro, and Alto
etc.
Cluster C represents car buyers, who belong to upper strata of society. Buyers
of Lancer, Honda city etc.
Cluster D represents the super rich cluster i.e. Buyers of Benz, BMW etc.
[Figure: clusters of car buyers plotted on three dimensions: income, age and family size]
I) Hierarchical clustering:
a) Divisive clustering: starts with all the objects grouped in a single cluster.
a) Linkage methods:
Single linkage is based on minimum distance or the nearest neighbor rule. The first two
objects clustered are those that have the smallest distance between them. The
next shortest distance is identified and either the third object is clustered with
the first two, or a new two-object cluster is formed. At every stage, the distance
between two clusters is the distance between their two closest points as
illustrated below;
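The nearest neighbor rule can be sketched as a distance function between two clusters. The points are hypothetical and Euclidean distance is an assumption; any other distance measure could be substituted.

```python
import math
from itertools import product


def single_linkage_distance(cluster_a, cluster_b):
    """Distance between two clusters = distance between their closest points."""
    return min(math.dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical two-dimensional objects.
cluster_a = [(0, 0), (1, 0)]
cluster_b = [(4, 0), (9, 0)]

d = single_linkage_distance(cluster_a, cluster_b)
print("single linkage distance:", d)
```

Only the closest pair of points matters here, (1, 0) and (4, 0); the other members of each cluster do not affect the distance.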
b) Variance methods:
The variance method attempts to minimize the within-cluster variance. Ward's
procedure is a commonly used variance method. For each cluster, the means of
all the variables are computed; subsequently, for each object, the squared
Euclidean distance to the cluster means is calculated. The distances are
summed for all the objects. At each stage, the two clusters with the smallest
increase in the overall sum of squares within clusters are combined. This is
illustrated as follows:
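One merging step of Ward's procedure can be sketched as below. The three clusters and their points are hypothetical, and the helper names are illustrative; the sketch only shows how the increase in the within-cluster sum of squares decides which pair is combined.

```python
from itertools import combinations


def centroid(points):
    """Mean of every variable over the objects in a cluster."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def within_ss(points):
    """Sum of squared Euclidean distances of the objects to their centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def ward_increase(a, b):
    """Increase in the within-cluster sum of squares if a and b are merged."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)


# Hypothetical clusters of two-dimensional objects.
clusters = {
    "P": [(0, 0), (0, 2)],
    "Q": [(1, 1)],
    "R": [(10, 10), (10, 12)],
}

best_pair = min(
    combinations(clusters, 2),
    key=lambda pair: ward_increase(clusters[pair[0]], clusters[pair[1]]),
)
increase = ward_increase(clusters[best_pair[0]], clusters[best_pair[1]])
print("merge", best_pair, "increase in sum of squares:", round(increase, 4))
```

A full clustering would repeat this step, recomputing the candidate increases after every merge, until the desired number of clusters remains.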
c) Centroid methods:
In the centroid methods, the distance between two clusters is the distance
between their centroids, i.e. the means of all the variables. Every time objects are
grouped, a new centroid is computed.
The average linkage method and Ward's method are generally reported to perform better than the other
hierarchical procedures.
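The centroid rule can be sketched in the same way. The points are hypothetical and Euclidean distance between the centroids is assumed.

```python
import math


def centroid(points):
    """Mean of every variable over the objects in a cluster."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def centroid_distance(cluster_a, cluster_b):
    """Distance between two clusters = distance between their centroids."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))


# Hypothetical two-dimensional objects.
cluster_a = [(0, 0), (2, 0)]     # centroid (1, 0)
cluster_b = [(4, 4)]             # centroid (4, 4)

d = centroid_distance(cluster_a, cluster_b)
merged_centroid = centroid(cluster_a + cluster_b)

print("centroid distance:", d)
print("centroid after merging:", merged_centroid)
```

The last line reflects the statement in the text that a new centroid is computed every time objects are grouped.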
The relative sizes of the clusters should be meaningful, with each cluster having
a reasonable number of elements. It is not useful to have only one element in a cluster.
Interpreting and profiling clusters involves examining the cluster centroids. The
centroids represent the mean values of the objects contained in the cluster on
each of the variables. The centroids enable us to describe each cluster by
assigning it a name or label. It will be more helpful to profile the clusters in
terms of variables that are not used for clustering. The demographic,
psychographic, product usage, media usage or other variables can be used for
profiling. The variables that significantly differentiate between clusters can be
identified via discriminant analysis and one-way analysis of variance.
Several decisions are made on the basis of cluster analysis; hence clustering
solutions should not be accepted without assessing the reliability and validity.
The following procedure can be followed to provide adequate checks on the
quality of clustering results.
1. Perform cluster analysis on the same data using different distance measures.
Compare the results across measures to determine the stability of the solutions.
2. Use different methods of clustering and compare the results.
3. Split the data randomly into halves, perform clustering separately on each half
and compare the cluster centroids across the two sub samples.
4. Delete variables randomly. Perform clustering based on the reduced set of
variables. Compare the results with those obtained by clustering based on the
entire set of variables. In non-hierarchical clustering, the solution may depend
on the order of cases in the data set. Multiple runs using different order of cases
can be performed until solutions are stabilized.
DISCRIMINANT ANALYSIS
To test whether any significant differences exist between the mean values (all
predictor variables taken simultaneously) of two or more a priori defined groups.
To find the linear combinations of the predictor variables that enable us to
represent the groups by maximizing the ratio of the squared difference between
group means to the variance within the groups.
To establish procedures for assigning new observations to one of the groups,
assuming a priori that they belong to one of the defined groups.
(1) The objects (elements) of the population belong to two or more mutually
exclusive groups (the elementary units may be people, states or countries, the
economy at different points in time or of different regions, etc.)
(2) The discriminant function, a mathematical equation used for the purpose of
classification of the objects, is a linear function. These equations combine the
group characteristics in a way that allows one to identify the group with which an object
is most closely associated.
(3) The Discriminant variables, or the characteristics used to distinguish among the
groups, must be measured at an interval or ratio scale, so that the means and
variances can be calculated and they can be used in the analysis.
(4) It is assumed that each group is drawn from a multivariate normal population.
This allows the precise computation of tests of significance and probabilities of
each group membership.
Application:
b. Deciding the sample size needed for estimation of discriminant function and
c. Division of sample for validation purpose.
a) Selection of dependent and independent variable
To apply discriminant analysis the researcher should specify the dependent and
the independent variables.
The dependent variable should be categorical and the independent variables should be
metric.
The number of categories of the dependent variable can be two or more, but these
groups must be mutually exclusive, so that each object belongs to only one group.
The dependent variable in some cases may involve two groups, e.g., purchasers
and non-purchasers, or users and non-users of a product.
After the decision regarding the dependent variables, the researcher must decide
about the independent variables to be included in the analysis.
Independent variables can be selected in the following two ways.
1) Identifying the variables from the previous research or from the
theoretical model that is underlying the basis of research question.
2) The second approach is intuition, i.e. utilizing the researcher's
knowledge and intuitively selecting variables for which previous research is not
available.
b) Sample size
The ratio of sample size to the number of predictor variables should be
considered in discriminant analysis.
Many studies suggest a ratio of 20 observations for each predictor variable. If
an adequate sample is not maintained, the results become unstable.
The minimum size recommended is five observations per independent variable.
The ratio applies to all variables considered in the analysis, even if all of the
variables considered are not entered into the discriminant function.
In addition to the overall sample size, the researcher must also consider sample
size of each group.
The smallest group size must exceed the number of independent variables.
The practical guideline is that each group should have at least 20 observations.
c) Division of sample
The sample should be divided into two groups: the estimation or analysis
sample and the holdout or validation sample.
The analysis sample is used for estimation of the discriminant function.
The hold out or validation sample is reserved for validating the discriminant
function.
It is essential that each subsample should be of adequate size to support
conclusions from the results.
If the sample is large enough, it can be split in half. One half serves as the
analysis sample and the other is used for validation. The analysis sample is
used to develop the discriminant function and the validation sample is used to
test the Discriminant function.
This method of validating the function is referred to as the split-sample or
cross-validation approach.
The roles of the halves are then interchanged and the analysis is repeated. This
is called double cross-validation.
The distribution of the number of cases in the analysis and validation samples
should follow the distribution in the total sample.
For example, if the total sample contains 60 percent users and 40 percent non
users of the product, then the analysis and validation sample would each contain
60 percent users and 40 percent non users.
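The split described above can be sketched in a few lines. This is an illustrative sketch only: the helper names `stratified_halves` and `share`, the fixed random seed, and the 100 hypothetical cases are assumptions, not part of the notes. The split is done group by group so that each half keeps the 60/40 mix of the total sample.

```python
import random


def stratified_halves(cases, seed=0):
    """Split (case_id, group) pairs into two halves with the same group mix."""
    rng = random.Random(seed)
    by_group = {}
    for case_id, group in cases:
        by_group.setdefault(group, []).append(case_id)
    analysis, validation = [], []
    for group, ids in by_group.items():
        rng.shuffle(ids)              # random assignment within each group
        half = len(ids) // 2
        analysis += [(i, group) for i in ids[:half]]
        validation += [(i, group) for i in ids[half:]]
    return analysis, validation


def share(sample, group):
    """Proportion of the sample that belongs to the given group."""
    return sum(1 for _, g in sample if g == group) / len(sample)


# Hypothetical total sample: 60 users and 40 non-users.
cases = [(i, "user") for i in range(60)] + [(i, "non_user") for i in range(60, 100)]
analysis, validation = stratified_halves(cases)
print(share(analysis, "user"), share(validation, "user"))  # 0.6 0.6

# Double cross-validation: repeat the analysis with the roles interchanged.
rounds = [(analysis, validation), (validation, analysis)]
for estimate_on, validate_on in rounds:
    print(len(estimate_on), "cases to estimate,", len(validate_on), "cases to validate")
```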
3) Assumption
1) The objects (elements) of the population belong to two or
more mutually exclusive groups (the elementary units may be people, states or
countries, the economy at different points in time or of different regions, etc.)
2) The Discriminant function, a mathematical equation used
for the purpose of classifying the objects, is a linear function. These
equations combine the group characteristics in a way that allows one to identify
the group with which an object is most closely associated.
3) The Discriminant variables, or the characteristics used to
distinguish among the groups, must be measured at an interval or ratio scale, so
that the means and variances can be calculated and they can be used in the
analysis.
4) It is assumed that each group is drawn from a multivariate
normal population. This allows the precise computation of tests of significance
and probabilities of each group membership.
Assessing overall fit of the selected Discriminant function involves three tasks:
Conjoint analysis
Conjoint analysis is concerned with the measurement of the joint effect of two
or more attributes that are important from the customer’s point of view.
In a situation where the company would like to know the most desirable
attributes or their combination for a new product or service, the use of conjoint
analysis is most appropriate.
Example:
An airline would like to know, which is the most desirable combination of
attributes to a frequent traveller: (a) Punctuality (b) Air fare (c) Quality of food
served on the flight and (d) Hospitality and empathy shown.
Conjoint analysis is a multivariate technique that captures the exact levels of
utility that an individual customer places on various attributes of the product
offering. Conjoint analysis enables a direct comparison between attribute levels.
Example
A comparison between the utility of a price level of Rs.400 versus Rs.500, a
delivery period of 1 week versus 2 weeks, or an after-sales response of 24 hours
versus 48 hours.
Once we know the utility levels for each attribute (and at individual levels as
well), we can combine these to find the best combination of attributes that gives
the customer the highest utility, the second best combination that gives the
second highest utility, and so on. This information is then used to design a
product or service offering.
Application
Some examples of other areas where this technique can be used are:
Process
Design attributes for a product are first identified. For a shirt manufacturer,
these could be design, such as designer shirts versus plain shirts, and price,
such as Rs 400 versus Rs 800. The outlets can have exclusive distribution or
mass distribution.
All possible combinations of these attribute levels are then listed out. Each
design combination will be ranked by customers and used as input data for
conjoint analysis. Then the utility of the products relative to price can be
measured.
The output is a part-worth or utility for each level of each attribute. For
example, the designer shirt may get a utility level of 5 and the plain shirt,
7.5. Similarly, exclusive distribution may have a part utility of 2, and mass
distribution, 5.8. We then put together the part utilities and come up with a
total utility for any product combination we want to offer, and compare that
with the maximum utility combination for this customer segment.
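The step of putting the part utilities together can be illustrated with the figures quoted above. This is a minimal sketch that assumes a simple additive model; the names `part_worths` and `total_utility` are made up for the illustration, and price is left out because no part-worths for the price levels are given in the notes.

```python
from itertools import product

# Part-worths quoted in the notes (price levels are not given, so they are omitted).
part_worths = {
    "design": {"designer": 5.0, "plain": 7.5},
    "distribution": {"exclusive": 2.0, "mass": 5.8},
}


def total_utility(profile):
    """Additive model (assumed): total utility is the sum of the part-worths."""
    return sum(part_worths[attr][level] for attr, level in profile.items())


# Enumerate every combination of attribute levels.
attributes = list(part_worths)
profiles = [
    dict(zip(attributes, levels))
    for levels in product(*(part_worths[a] for a in attributes))
]

# Rank the combinations from highest to lowest total utility.
ranked = sorted(profiles, key=total_utility, reverse=True)
for profile in ranked:
    print(profile, round(total_utility(profile), 1))
```

With these numbers the plain shirt sold through mass distribution has the highest total utility (7.5 + 5.8), and the designer shirt with exclusive distribution has the lowest (5 + 2).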
This process clarifies for the marketer which attributes of the product or
service to focus on in the design.
If a retail store finds that the height of a shelf is an important attribute for selling
at a particular level, a well-designed shelf may result from this knowledge.
Similarly, a designer of clocks will benefit from knowing the utility attached by
customers to the dial size, background colours, and price range of the clocks.
Approach
From a discussion with the client, identify the design attributes to be studied and
the levels at which they can be offered. Then build a list of product concepts
to offer. These product concepts are then ranked by customers. Once this data is
available, use conjoint analysis to derive the part utilities of each attribute level.
This is then used to predict the best product design for the given customer
segment. Use the SPSS conjoint procedure to analyse the data.
For attribute selection, the market researcher can interview the customers
directly.
Combination             Rank
3kg, 2 hours, Lenovo    4
5kg, 4 hours, Dell      5
5kg, 2 hours, Lenovo    8
3kg, 4 hours, Lenovo    3
3kg, 2 hours, Dell      2
5kg, 4 hours, Lenovo    7
5kg, 2 hours, Dell      6
3kg, 4 hours, Dell      1
One combination, 3kg, 4 hours, Dell, clearly dominates, and 5kg, 2 hours, Lenovo
is least preferred.
Let us now take the average rank for the 3kg option = (4+3+2+1)/4 = 2.5
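The same averaging can be repeated for every attribute level in the table. The sketch below is illustrative; the variable names are assumptions, and the data are the eight ranked combinations listed above. Because rank 1 is the most preferred combination, a lower average rank indicates a more preferred level.

```python
from collections import defaultdict

# The eight ranked combinations from the table (rank 1 = most preferred).
rankings = [
    (("3kg", "2hours", "lenovo"), 4),
    (("5kg", "4hours", "dell"), 5),
    (("5kg", "2hours", "lenovo"), 8),
    (("3kg", "4hours", "lenovo"), 3),
    (("3kg", "2hours", "dell"), 2),
    (("5kg", "4hours", "lenovo"), 7),
    (("5kg", "2hours", "dell"), 6),
    (("3kg", "4hours", "dell"), 1),
]

# Collect the ranks of every combination in which each level appears.
ranks_by_level = defaultdict(list)
for levels, rank in rankings:
    for level in levels:
        ranks_by_level[level].append(rank)

# Average rank per attribute level.
average_rank = {level: sum(r) / len(r) for level, r in ranks_by_level.items()}
for level, avg in sorted(average_rank.items(), key=lambda item: item[1]):
    print(level, avg)
```

The 3kg option averages 2.5, as computed above, against 6.5 for the 5kg option.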
Canonical correlation:
Application:
Example:
1) Technology
2) Trained manpower
3) High quality
X= a1p1+a2p2+a3p3
Y represents the subject of interest namely market share, sales volume, brand
image etc.
Y=b1q1+b2q2+b3q3
Assumption
Introduction:
Statistical packages
Xlisp-stat
Yxilon
Public domain
BrightStat
CSPro
Epi Info
X-12-ARIMA
MINUIT
Freeware
BV4.1
GeoDA
WinBUGS - Bayesian analysis using Markov chain Monte Carlo methods
Winpepi - package of statistical programs for epidemiologists
WinIDAMS
Zaitun Time Series
Proprietary
Add-ons
XLfit add-on to Microsoft Excel for curve fitting and statistical analysis
MATLAB
In 2004, MathWorks claimed that MATLAB was used by more than one
million people across industry and the academic world. [2] MATLAB users come
from various backgrounds of engineering, science, and economics.
MATLAB
Written in C, Java
License Proprietary
MINITAB
Minitab produces two other products that complement Minitab 15: Quality
Trainer, an eLearning package that teaches statistical tools and concepts in the
context of quality improvement and integrates with Minitab 15 to
simultaneously develop the user's statistical knowledge and ability to use the
Minitab software; and Quality Companion 3, an integrated tool for managing
Six Sigma and Lean Manufacturing projects that allows Minitab 15 data to be
combined with project management and governance tools and documents.
Website http://www.minitab.com/
In addition, SAS has many business solutions that enable large-scale software
solutions for areas such as IT management, human resource management,
financial management, business intelligence, customer relationship management
and more.
Features
SPSS
SPSS is a computer program used for statistical analysis. Between 2009 and
2010 the premier software for SPSS was called PASW (Predictive Analytics
SoftWare) Statistics. [1] The company announced on July 28, 2009 that it was
being acquired by IBM for US$1.2 billion.[2] As of January 2010, it became
"SPSS: An IBM Company".
Platform Java
Website http://www.spss.com/
Statistics program
SPSS (originally, Statistical Package for the Social Sciences) was released in its
first version in 1968 after being developed by Norman H. Nie and C. Hadlai
Hull. Norman Nie was then a political science postgraduate at Stanford
University, and is now Research Professor in the Department of Political Science
at Stanford and Professor Emeritus of Political Science at the University of
Chicago.
SPSS is among the most widely used programs for statistical analysis in social
science.
It is used by market researchers, health researchers, survey companies,
government, education researchers, marketing organizations and others.
The original SPSS manual (Nie, Bent & Hull, 1970) has been described as one
of "sociology's most influential books".[4] In addition to statistical analysis, data
management (case selection, file reshaping, creating derived data) and data
documentation (a metadata dictionary is stored in the data file) are features of
the base software.
XLSTAT
XLSTAT is a data analysis and statistical add-in for Microsoft Excel.
From traditional statistical analysis of variance and predictive modelling to
exact methods and statistical visualization techniques, SAS/STAT software is
designed for both specialized and enterprise wide analytical needs. SAS/STAT
software provides a complete, comprehensive set of tools that can meet the data
analysis needs of the entire organization.
Benefits
Features
Analysis of variance
Mixed models
Regression
Categorical data analysis
Bayesian analysis
Multivariate analysis
Survival analysis
Psychometric analysis
Cluster Analysis
Nonparametric analysis
Survey data analysis
Multiple imputation for missing values
Research report:
The document that describes the research project, its findings, analysis of
findings, interpretations, conclusions, and, sometimes, recommendations is
called a research report.
Types of report:
Oral report
Written report
Oral report:
This type of reporting is required, when the researchers are asked to make an
oral presentation.
Making an oral presentation is somewhat difficult when compared to the written
report. This is because the reporter has to interact directly with the audience.
The presenter may have to face a barrage of questions from the audience.
In oral presentation, communication plays a big role.
Guidelines:
Written reports:
Types of report:
1) Short report
2) Long report
3) Formal report
4) Informal report
5) Government report
Short report:
Short reports are produced when the problem is very well defined and the
scope is limited.
A short report runs to about five pages.
It consists of a report on the progress made with respect to a particular product
in clearly specified geographical locations.
E.g. monthly sales report
Long report:
Technical report:
This will include the sources of data, research procedure, sample design, tools
used for gathering data, data analysis methods used, appendix, conclusion and
detailed recommendations with respect to specific findings. If any journal, paper
or periodical is referred, such references must be given for the benefit of reader.
Non-technical report:
This report is meant for those who are not technically qualified.
E.g. chief of the finance department.
He may be interested in financial implications only, such as margins, volumes
etc.
He may not be interested in the methodology.
Final report:
Informal report:
The report prepared by the supervisor by way of filling the shift log book, to be
used by his colleagues.
Government report:
I. Preliminary section
a) Title page
b) Certificate
c) Declaration
d) Acknowledgement
e) Preface
f) Foreword
g) Abstract
h) Table of contents
i) List of tables
j) List of figures
II. Main body of the report
1. Introduction
Structure your writing around the IMR&D framework and you will ensure a
beginning, middle and end to your report.
Bibliography:
Instructions:
1. Step 1
Create a page at the end of the paper. Call it either "Bibliography" or "Works
Cited."
2. Step 2
List, alphabetically by author's last name, all the sources used in writing the
paper.
3. Step 3
Write the last name of the author first, followed by a comma and his or her first
name, followed by a period.
4. Step 4
Write the name of the book in italics, followed by a period. You can also
underline the book title.
5. Step 5
Cite the name of an article, in quotation marks, in place of a book title. Then
write the name of the journal or magazine from which it came (italicize the
name or underline it). Include a volume number, if applicable.
6. Step 6
Write the name of the city in which the work was published, followed by a
colon.
7. Step 7
8. Step 8
9. Step 9
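Steps 2 to 6 above can be sketched as a small formatting routine. This is only an illustration under stated assumptions: the two sources are hypothetical placeholders, the function name `format_entry` is made up, and underscores stand in for italics because plain text cannot show them. The entry stops at the colon after the city, since the remaining steps are not spelled out in these notes.

```python
def format_entry(source):
    """Format one book entry as far as steps 3, 4 and 6 describe."""
    author = f"{source['last']}, {source['first']}."   # Step 3: Last, First.
    title = f"_{source['title']}_."                    # Step 4: title in italics
    place = f"{source['city']}:"                       # Step 6: city, then a colon
    return " ".join([author, title, place])


# Hypothetical sources, used only to show the ordering and layout.
sources = [
    {"last": "Rao", "first": "Asha", "title": "Sample Book Title", "city": "Chennai"},
    {"last": "Das", "first": "Vikram", "title": "Another Sample Title", "city": "Mumbai"},
]

# Step 2: list alphabetically by the author's last name.
bibliography = [format_entry(s) for s in sorted(sources, key=lambda s: s["last"].lower())]

print("Bibliography")   # Step 1: the heading of the final page
for entry in bibliography:
    print(entry)
```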
Ethics in research
Ethics-Definition:
Ethics are norms or standards of behaviour that guide moral choices about
our behaviour and our relationship with others.
The goal of ethics in research is to ensure that no one is harmed or suffers
adverse consequences from research activities.
According to Collins Dictionary, ethical means “in accordance with principles
of conduct that are considered correct, especially those of a given
profession or group”.
Ethics is nothing but the accepted code of conduct.
Ethics in business research is very much required and relevant in today’s
industrial scenario.
Seeking consent:
Competence, according to Schinke and Gilchrist, ‘is concerned with the legal
and mental capacities of participants to give permission’.
For example, people suffering from conditions that prevent them from making
informed decisions, people in crisis, people who cannot speak the language in
which the research is being carried out, people who are dependent upon you for
a service, and children are not considered to be competent.
Providing incentives:
Harm includes:
Maintaining confidentiality:
Rights to choose:
Rights to safety:
Rights to be informed:
Rights to privacy:
Avoiding bias:
Incorrect reporting:
Sometimes firms, for the sake of formality, call for quotations from a number of
market research agencies, even though they have already decided to whom the
project should be given. This is unethical practice in the matter of selection of
researchers.
Limited funds:
It may happen that such ambiguity causes the researcher to prepare his
proposal for nationwide research, but upon bagging the project, the funds
released are sufficient only to conduct research on a regional basis.
This may frustrate researchers; besides, it is an unethical practice.
Non-availability of data:
Some firms give projects to their researcher, but do not provide him with the
required sales and cost data.
Since this may be the basis for carrying out the research, the researcher feels
frustrated at not receiving the basic promised data.
This is unethical on the part of the client firm.
Pseudo-Pilot studies:
Some clients ask the research agencies to conduct pilot studies and promise that
if the researcher does a good job during the pilot study stages, there will be an
additional major contract immediately.
Most often, this comprehensive study never materialises and the research
agencies absorb a huge loss.
This is not an ethical practice.
Political research:
Most research in the social sciences is carried out using funds provided by
sponsoring organisations for a specific purpose.
The funds may be given to develop a program or evaluate it; to examine its
effectiveness and efficiency; to study the impact of a policy; to test a product; to
study the behaviour of a group or community; or to study a phenomenon, issue
or attitude.
Sometimes there may be direct or indirect controls exercised by sponsoring
organisations.
They may select the methodology, prohibit the publication of ‘what was found’
or impose other restrictions on the research that may stand in the way of
obtaining and disseminating accurate information.
Both the imposition and acceptance of these controls and restrictions are
unethical, as they constitute interference and could amount to the sponsoring
organisation tailoring research findings to meet its vested interests.