Research Methods
M.L.I.S
CONTENTS

Chapter No.	Topic
I	Introduction to Research Methodology
II	Types of Research
III	Scientific Methods
IV	Research Problem
V	Research Design
VI	Research Methods
VII	Sampling
VIII	Data Collection
IX	Data Analysis
X	Report Writing
Unit I
Research: Concept, Characteristics and Types - Pure, Applied, Action and Inter-Disciplinary Research - Logic and Scientific Investigation.
Unit II
Research Problem: Identification, Selection and Formulation of a Research Problem - Research Design - Literature Search and Review of Literature - Hypothesis: Definition, Types and Characteristics.
Unit III
Research Methods: Survey - Historical - Case Study - Experimental - Sampling: Definition, Types and Relevance.
Unit IV
Data Collection: Data Sources - Primary Sources and Secondary Sources - Data Collection Methods: Questionnaire, Interview, Observation.
Unit V
Data Analysis: Analysis and Interpretation - Statistical Tools and Techniques - Measures of Central Tendency - Frequency Distribution - Regression and Correlation - Scales - Report Writing: Style and Structure of Presentation of Data.
Books Recommended for Further Readings
1) Busha, Charles H. and Harter, Stephen P. Research Methods in Librarianship: Techniques and Interpretation. New York: Academic Press, 1980.
2) Gopal, M. H. An Introduction to Research Procedure in Social Science. Bombay: Asia, 1964.
3) Goode, William J. and Hatt, Paul K. Methods in Social Research. New York: McGraw-Hill, 1952.
4) Krishan Kumar. Research Methods in Library and Information Science. New Delhi: Vikas, 1992.
5) Line, Maurice B. Library Surveys: An Introduction to the Use, Planning Procedure and Presentation of Surveys. 2nd ed. London: Bingley, 1982.
17) Sehgal, R. L. Applied Statistics for Library Science Research. 2 vols. New Delhi: Ess Ess, 1998.
UNIT I
CHAPTER I
Introduction to Research Methodology
Research has proved to be an essential and powerful tool in leading man towards progress. Research is a matter of raising questions and then trying to find answers. In other words, research is a form of investigation in which some problem is examined in order to arrive at a generalization. Research is therefore the activity of solving problems, one which adds new knowledge, develops theory, and gathers evidence to test generalizations.
All significant research leads to progress in some field of life or the other. Each year
new products, new facts, new concepts and new ways of doing things come into our lives due
to ever increasing significant research in physical, biological, social and psychological fields.
Research activity is no longer confined to the science laboratory. Just as manufacturers, agricultural experts and archaeologists carry on research in their respective spheres, so also do sociologists, economists and educationists. The aim of all research is progress and a good life.
Decisions based on systematic research save money and energy, spare us much failure and frustration, and show us the path of progress. Thus it is not difficult to show that research is extremely necessary and very worthwhile. It is a careful search for solutions to the problems that plague and puzzle mankind.
Definition of Research
George G. Mouly of the University of Miami defines research as "the systematic and scholarly application of the scientific method, interpreted in its broader sense, to the solution of educational problems; conversely, any systematic study designed to promote the development of education as a science can be considered educational research."
Rusk writes: "Research is a point of view, an attitude of inquiry, or a frame of mind. It asks questions which have hitherto not been asked, and it seeks to answer them following a fairly definite procedure. It is not mere theorizing, but rather an attempt to elicit facts, and to face them once they have been assembled."
Francis G. Cornell observes: "To be sure, the best research is that which is reliable, verifiable and exhaustive, so that it provides information in which we have confidence. The main point here is that research is, literally speaking, a kind of human behaviour, an activity in which people engage. By this definition, all intelligent human behaviour involves some research."
Maduri Singh defines research as "essentially a state of mind - a friendly, welcoming attitude towards change."
Clifford Woody of the University of Michigan writes, in an article in the Journal of Educational Research, that research is "a careful inquiry or examination in seeking facts or principles, a diligent investigation to ascertain something."
C. C. Crawford of the University of Southern California writes that research "is simply a systematic and refined technique of thinking, employing specialized tools, instruments and procedures in order to obtain a more adequate solution of a problem than would be possible under ordinary means. It starts with a problem, collects data or facts, analyzes them critically, and reaches decisions based on the actual evidence. It involves original work instead of a mere exercise of personal opinion. It evolves from a genuine desire to know rather than a desire to prove something. It is quantitative, seeking to know not only what but how much, and measurement is therefore a central feature of it."
According to J. W. Best, "Research is considered to be the more formal, systematic, intensive process of carrying on the scientific method of analysis. It involves a more systematic structure of investigation, usually resulting in some sort of formal record of procedures and a report of results or conclusions."
According to P. M. Cook, research is "an honest, exhaustive, intelligent searching for facts and their meanings or implications with reference to a given problem."
W. S. Monroe defines research as "a method of studying problems whose solutions are to be derived partly or wholly from facts." The facts dealt with in research may be statements of opinion, historical facts, those contained in records and reports, the results of tests, answers to questionnaires, experimental data of any sort, and so forth. The final purpose of educational research is to ascertain principles and develop procedures for use in the field of education; it should therefore conclude by formulating principles or procedures. The mere collection and tabulation of facts is not enough, though it may be preliminary to research or even a part thereof.
Research is thus an original contribution to the existing stock of knowledge, making for its advancement. It is the pursuit of truth with the help of study, observation, comparison and experiment. In short, the search for knowledge through an objective and systematic method of finding solutions to a problem is research. Such systematic inquiry aims at exactness and at establishing the relationship of facts, and it calls for:
1. Careful recording
2. Critical observation
3. A constructive attitude
4. Condensed and compactly stated generalizations
Characteristics of Research
1. Research gathers new knowledge or data from primary or first hand sources
Research endeavors to reach the first hand source (primary source) of data instead of
serving its purpose with the data available from second hand sources (secondary source). It is
not research when one simply restates or reorganizes what is already known or what has been
written.
Though there is a specific purpose behind each research study, the objectives can be broadly classified as under:
To gain familiarity with a phenomenon.
To determine the association or independence of activities.
To determine the characteristics of an individual or a group of activities and the frequency of its (or their) occurrence.
Nature of Research
The goal of research is to improve the level of living in society. The word research carries an aura of respect, and this prestige places a corresponding demand on the researcher to take up problems that need to be solved, even when the situation lies far beyond the present scope of research. Some people think that research is a waste of time, effort and money, and further that pure research has little practical value. Personal opinion guided by prejudice or dogmatism may colour the way problems are conceived, but problems can be solved only on the basis of research evidence. The researcher has to exercise his value judgment effectively for the benefit of society. One can perhaps say that a researcher enjoys intellectual freedom and independent thinking and can have rewarding experiences; further, it is he, as adviser, who has to lead society along the right path of development.
Significance of Research
The significance of research may be analyzed as under:
1. Research has an important role in guiding social planning. Proper planning for development requires knowledge of society and of the cultural behaviour of its people, since knowledge and cultural behaviour are interdependent. Reliable, factual knowledge is needed to take planning decisions, and this is possible only by means of research. Social research is generally worth the investment, since the success of any plan depends ultimately on people's acceptance and participation. Any plan for economic development therefore needs to be grounded in research.
2. Knowledge is a kind of power with which one can foresee the implications of particular phenomena. It also dispels odd beliefs, superstitions, etc., throwing light on them for the sake of welfare and development. Thus social research may have the effect of promoting better understanding and social cohesion.
3. Research is charged with the responsibility of establishing facts reliably, and it thus affords a considerably sound basis for prediction. Without it, planning leads to failure-bound programmes which may have a serious impact on society; the Chernobyl nuclear plant disaster and the Bhopal gas disaster are examples. Sound prediction gives better control over phenomena and helps in successful planning, leading towards cherished goals.
4. It is the role of the researcher to effect constant improvement in the techniques of his trade, i.e., research. Working in varying spatial and temporal contexts, each posing its own challenges, he is faced with the need to improve upon his techniques. In other words, the techniques of research must be brought to ever greater perfection.
CHAPTER II
Types of Research
Research can be classified differently depending upon the approach, the purpose and the nature of a research activity. Broadly speaking, research is categorized as (i) Fundamental, Pure or Theoretical Research, (ii) Applied Research, and (iii) Action Research.
Fundamental/pure Research
Research motivated by the desire to know or understand for the sake of knowing is called fundamental research. It is mainly concerned with generalizations and with the formulation of a theory. In other words, gathering knowledge for the sake of knowledge is termed pure or basic research.
Research concerning some natural phenomenon, or relating to pure mathematics, is an example of fundamental research. Similarly, research studies concerning human behaviour, carried on with a view to making generalizations about human behaviour, are also examples of fundamental research.
Fundamental research often results in the discovery of a new theory. Such a discovery may have nothing to do with any existing theory: Galileo's or Newton's contributions are fundamental in character, and they form the basis for different theories.
Fundamental research also leads to the development of existing theory. It brings about improvement in an existing theory, either by relaxing some of its assumptions or by reinterpreting it. It may also call for altering assumptions or formulating a new set of assumptions, adding a new dimension to the existing theory. Relaxing assumptions, altering them, or making new ones altogether depends upon how a researcher views the existing theory. In a dynamic society a scholar may ascertain that earlier assumptions have become obsolete or were inadequately defined; the existing theory may then appear outdated and implausible under the prevailing conditions.
For example, Malthusian population theory became almost useless in Malthus's own country owing to new developments that invalidated the assumptions of his theory. Naturally, therefore, by dropping the invalid assumptions, researchers have come out with new theories of population behaviour. There have also been attempts to reinterpret the Malthusian doctrine and thus to retain its validity even now. Similarly, by questioning some of the assumptions of Keynesian theory, Friedman came out with new interpretations of monetary phenomena. Theories developed in capitalist countries have often been challenged by researchers of the socialist bloc, who have developed new theories or reinterpreted the existing ones.
The researcher engaged in pure research derives his greatest satisfaction from increasing his knowledge in a field of enquiry where many questions remain unanswered. To him the challenge of not knowing is paramount. If he can solve the problem, he is satisfied even though the results may or may not have any practical use. The pure scientist would probably argue that knowledge itself is always of practical use in the end. His governing principle is that scientific enquiry is noble in itself; it is its own reward. To keep digging away at the layers of intellectual questions is challenge enough for a pure scientist. To him knowledge is the highest good and truth the supreme value; all the rest is secondary and subordinate.
Applied Research
Applied research is based on the application of known theories and models to an actual operational field or population. Applied research is conducted to test the empirical content, the basic assumptions, or the very validity of a theory under given conditions. For example, Lewis's growth model for labour-surplus economies assumes that the real wage rate of labour remains constant until the surplus labour is completely wiped out; it may be of interest to a researcher to investigate whether this is so in every labour-surplus economy. Where a theory or model does not hold good, the researcher's interest may further be stimulated to find out why the model does not apply and what modifications would make it operational in that situation. This may also form the basis for developing an alternative strategy.
Applied research has practical utility. Its central aim is to discover a solution to some pressing practical problem. The applied scientist places his research in a practical context from the outset: he defines a problem as one about which some action could be taken to improve matters. The applied scientist is thus much more likely to be working within a certain set of values and norms of preference. Sociologists study a host of problems, viz., the problems of juvenile delinquency, of old people, of gangs, etc., and find out measures to remedy the situation.
Applied research often takes the form of a field investigation and aims at collecting the basic data for verifying the applicability of existing theories and models in a given situation. Naturally, therefore, the adequacy and accuracy of the data have considerable impact on the application of the model and the reliability of the results. Not only should the researcher's data be reliable, he must also be objective, scientific and sharp in identifying the field of application for a given theory. Like pure research, applied research also contributes to development in the following ways:
i)
ii)
iii)
iv)
In action research the researcher appears as a participant rather than an observer, and he is therefore actively and even emotionally involved in the results and their application. It is a special type of research in the sense that applications are tested in accordance with a given situation and modified to suit the prevailing local conditions. Another feature is that it adapts itself to the changes that have taken place in the particular community: it is the evolution of the entire community, rather than of individuals, that is considered when theory is applied. Action research is thus similar to applied research, but the course that the action process takes depends upon the type of feedback information supplied. Different phases of action research are used for information gathering. They are:
a)
A baseline survey, in which all available information relevant to the subject matter is collected to get an idea of the existing situation.
b)
c)
d)
Depending on the availability of information, a method must be selected for the action research, usually from among the personal interview method, the survey method, attitude measurement, and so on. Action research is not much different from the concurrent evaluation of a project. Examples of action research include the following: as stated earlier, a teacher may conduct action research to improve his own teaching. The practitioner attempts to study his problem scientifically for guiding, correcting and evaluating his actions.
The scope of action research is very vast. The action research approach to practical problems seems appropriate and promising for all kinds of professional workers in education, so long as they desire to improve their own practice. An administrator who is dissatisfied with his efforts to develop morale in his staff could approach this problem through action research.
The merits of action research are as follows:
1)
2)
3)
4)
5)
6)
Fundamental research and action research may be compared on a number of aspects: objectives, training, area, review of literature, sampling, experimental design, analysis of data, statistical treatment, and application of results. In fundamental research, complex analysis of the data is often called for and statistical significance is usually stressed; in applying its results, however, the lack of coordination between research workers and teachers generates serious practical problems, and the generalizations usually remain confined to books and research reports.
The difference between fundamental research and action research should not be magnified. The difference is there, but in emphasis only, not in method or spirit. Both types are committed to high standards of scientific objectivity and scholarship.
CHAPTER III
Scientific Methods
The overall approach to research of any variety is what is generally termed the scientific method. The scientific method comprises three steps, viz., observation, hypothesis and verification. According to Prof. Northrop, scientific methods are relative to the stage of enquiry and the type of problem: different stages of enquiry may call for different scientific methods, and a method that is scientific for one stage may not be so for another. As there are many sciences, so there are many rational methods. Prof. Caldin holds that there are several types of rational method, and draws a distinction between scientific technique and scientific method: the former is the instrument available for any research, while the latter is the special method of a particular science.
Scientific research requires a logical approach comprising several distinct activities, the essentials of which are the following: (1) description of the problem and a critical review of relevant research reports and literature; (2) mustering of previously produced facts by collecting pertinent information about the problem or topic; (3) careful study of the available evidence so that the research problem can be refined, specific hypotheses or exploratory questions can be posed, and solutions can be anticipated; (4) structuring and conduct of experiments or other careful studies to test the most feasible hypotheses in relation to the most crucial questions; (5) analysis and evaluation of the accumulated data and the drawing of relevant conclusions; (6) utilization of the research findings to predict effects so that new hypotheses can be generated; and (7) recording of methods, findings and conclusions in a written research report so that newly acquired insights and knowledge can be communicated to other persons.
The scientific approach has two constituent elements: procedural and personal.
Procedural Components:
Observation, hypothesis and verification are the three procedural components. Observation is based on the data currently available and depends on knowledge, material, and personal hunches. Observation helps to build up a hypothesis. It also serves as a technique for the collection of data, thereby helping verification.
Verification is the technique used for checking and counter-checking. Tools help in the careful definition of concepts and in the collection of comprehensive data. Even in social science, statistical and quantitative tools are coming into greater use.
Personal Components:
The researcher needs imagination, analytical ability, resourcefulness, skill, persistence and independence. The capacity to find the heart of the problem is part of the enquiry, and knowledge of the field of investigation is a facet of the personal component. The researcher's ability and attitude are more important than the method of approach. He should have an objective, scientific and unbiased view. He should have sufficient personal qualifications and professional training, apart from practical experience; his personal qualifications must enable him to assess the adequacy, relevance and value of the data. Personal quality also embraces integrity, honesty, truthfulness and sincerity of purpose. At every step the researcher requires reasoning. There is a great need for balance, or poise, between mental, physical and moral qualities; poise gives the ability to see things in their true perspective. Ambition, interest and perseverance are very much required to carry research through successfully. In other words, a worthy researcher should possess the objectivity of Socrates, the wisdom of Solomon, the courage of David, the strength of Samson, the patience of Job, the leadership of Moses, the strategy of Alexander, the tolerance of the carpenter of Nazareth, and, above all, an intimate knowledge of every branch of the natural, biological and social sciences.
Scientific Method: Meaning and its essentials
Scientific method is that branch of study which is concerned with observed facts systematically classified, and which includes trustworthy methods for the discovery of facts. Every fact is initially nothing but a proposition or problem, and at every stage of enquiry a hypothesis is necessary. A hypothesis may be defined as a suggested connection between imagined facts and actual facts.
As Wilkinson and Bhandarkar put it, "Science is an intellectual construction, a working thought-model of the world, and its aim is to describe and conceptualize the impersonal facts of experience in verifiable terms, as exactly as possible, as simply as possible, and as completely and meaningfully as possible."
The difference between common-sense generalization and the scientific method lies in the degree of formality, rigorousness, verifiability and general validity of the latter.
Scientific method is self-corrective in nature. It depends on methods of developing and testing hypotheses, and the method of enquiry itself may be tested and modified. Since scientific method is based on the most probable inference at a particular point of time, a scientific theory is more probable than any alternative theory. The method of science depends on evidence collected from empirical material on the basis of principle; in the process, doubtful matters are detected and clarified.
Assumption of scientific method
The following are some assumptions and objectives of the scientific method:
(1) Regularities
(2) Verification
(3) Techniques
(4) Quantification
(5) Values
(1) Regularities: Scientific method believes that the world is regular and phenomena
occur in patterns. Further there are certain discernible uniformities in political and
social behaviour, which can be expressed as generalizations which are capable of
explaining and predicting political and social phenomena.
(2) Verification: Scientific knowledge consists of propositions that have been subjected to empirical tests, and all evidence must rest on observation. Hence science is empirical; experience has come to represent the true test of science, and facts are accepted only insofar as we can experience them.
(3) Techniques: Science requires sound techniques for collecting and interpreting data. There is need for the use of statistical tools such as multivariate analysis, sample surveys, mathematical models, simulations, etc. These techniques make the researcher conscious of his methodology, which helps him in planning, executing and assessing his research work. The techniques must be validated for analyzing data.
No single discipline can capture behaviour in its true character; therefore, a complete and total perspective comes only through an interdisciplinary approach.
Steps in scientific method
The following are the steps involved in applying scientific method:
1. Formulation of hypothesis
2. Defining concepts
3. Establishing working definitions
4. Collection and analysis of data and
5. Relating the findings to the existing knowledge and generalization
Lundberg's four levels of research:
Lundberg has described the following four levels of research.
1. Random Observation:
2. Systematic Exploration:
3. Testing a Well-Defined Hypothesis: Suitable variables are correlated, and the nature and degree of their relationship is expressed in the form of a tentative generalization; the research effort is confined to testing its validity.
4. Experiment Directed by Systematic Theory: This is the most refined and advanced method of research. Empirical observations are made and the necessary data are collected for interpretation and generalization. Newton's principles and Einstein's Relativity Theory come under this category. A valid inference is one in which the conclusion follows reasonably from the premises; to reach valid conclusions, a science must depend on a logical method. Scientific method is therefore the persistent application of logic, the common feature of all systematic and reasoned knowledge.
Formal Logic
Logic is an instrument of reasoning, and its character is formal. Logical validity cannot ensure the correctness of the subject matter of a science. Logic is like grammar, which deals with the correctness of the form or structure of language: logic studies the forms or structures of reasoning and formalizes the conditions for the validity of reasoning, but it does not and cannot judge the material correctness or incorrectness of a premise or conclusion. Logic is formal precisely because it studies the formal structure of reasoning.
Logic and Scientific Methods
Logic involves reasoned language, and all sciences are applied logic. The universal feature of science is its general method, which consists in the persistent search for truth. But the search for truth depends on evidence, and the determination of evidence is what we call logic.
Nature of Scientific Method
The nature of scientific method depends upon the nature and objective of a particular
science. There are broadly two methods of sciences:
1. Technical (Technological) and
2. Logical
1. Technical Method: This method is concerned with the instruments of a science. The more developed the technical method, the more exact a science becomes in handling the data required for experiment. The use of technical methods makes a science progressive.
2. Logical Method:
2. When human behaviour is studied and analyzed by other human beings, the personal characteristics of the analysts come into the picture and distort the analysis.
3. Human behaviour does not readily admit of measurement, because it is psychological in nature.
4. Human behaviour is not uniform and predictable; it is therefore uncertain. Different people behave differently in similar circumstances, and even the same person may behave differently under similar circumstances.
5. Decision making about human behaviour is therefore difficult, and reliable scientific data cannot always be collected.
Limitations of Scientific Method
Though the scientific method has many uses in research, it has its own limitations:
1. It involves abstractness.
2. Its explanations are never complete.
3. The conclusions arrived at by this method are not final.
4. As each science is concerned with a particular area and is based on certain assumptions, it has limited scope.
5. Superstition, cherished beliefs, etc., are hostile to the growth of the scientific method.
6. Definitions and formal distinctions are often not used properly, and statistical information may be irrelevant or inconclusive.
7. Scientific judgment is difficult, and is sometimes impossible when immediate action is demanded.
8. Development of this method needs adequate time for reflection and material for experiment.
9. Scientific research suits only the optimists.
10. There is no guarantee of achieving the goal, and rigid adherence to method can prevent human life from being an adventure.
UNIT II
CHAPTER IV
Research Problem
Introduction
In the research process, the first and foremost step is that of selecting and properly defining a research problem. A researcher must find the problem and formulate it so that it becomes susceptible to research. Like a medical doctor, a researcher must examine all the symptoms (presented to him or observed by him) concerning a problem before he can diagnose correctly. To define a problem correctly, a researcher must first know what a problem is.
The term problem originates from the Greek word proballein, meaning anything thrown forward, a question proposed for solution, a matter stated for examination.
Definition of a Problem
R. S. Woodworth defines a problem as "a situation for which we have no ready and successful response by instinct or by previously acquired habit." In other words, it is a situation in which a ready solution is not available; the solution can be found only after an investigation.
John Dewey writes: "The need of clearing up confusion, of straightening out an ambiguity, of overcoming obstacles, of covering the gap between things as they are and as they may be when transformed, is, in germ, a problem."
Selection of a Research Problem
The research problem undertaken for study must be carefully selected. The selection of
a research problem is a very important job for a research worker. Before selecting a research problem, the researcher has to analyze it in the light of the following questions:
1. Is the problem relevant and important?
2. Does the subject area suit his interest?
3. Does it offer scope for originality and creativeness?
4. Does the problem require an extension of knowledge?
5. Is the problem feasible with respect to the time and data required for its solution?
6.
The two important factors that influence the choice of a research problem are the personal values of the researcher and the social conditions. The following are the five steps involved in research:
1. Choice of a topic
2. Data Collection
3. Formation of Hypothesis
4. Verification
5. Writing the thesis
Stages in the selection of a Problem
Broadly, the selection of research problem would involve the following three stages.
Stage 1: Selection of Problem Area
A problem that is not of significance to the nation or the profession is definitely not worth the consideration of the investigator. A research problem may be sponsored by an agency, or it may be conceived by the investigator himself. It is helpful for the investigator to keep in mind the following aspects while selecting the problem area for research:
1. The problem chosen should be meaningfully related to the interests of the investigator himself.
2. A problem allied to an existing chain of research thinking can be handled more confidently.
3. Ambitious problems covering a wide range of interests should be avoided; problems of manageable size and limits should be taken up.
4. An important consideration in selecting the problem area is its feasibility, in terms of the application of scientific techniques and the availability of resources: money, personnel and equipment.
Stage 2: Identification of the Problem
The researcher after having carefully understood the pattern of thinking in a particular
area of interest seeks to consider the following aspects for the selection of the problem for
study. These aspects can be classified as
1. External Factors
While considering the external factors, the following should be thoroughly explored:
a) Novelty
Novelty is one of the fundamental qualities needed in a research problem. The
problem should be from a new area so that duplication of work is avoided. To achieve
this quality, the researcher has to study the earlier work in the area. The data collected
should be recent. Even a problem that has been investigated previously is amenable to
further study using newer and better devices, tools and procedures.
b) Significance
The research problem should add to existing knowledge or improve current practices.
It should extend or modify previous findings in some way. The new findings should
have a significant role to play in future investigations. If this is not possible, the
effort made will be in vain.
c) Availability of resources
Facilities in respect of money, time and materials are needed for research. For a
good research problem, the following factors should be settled favourably:
1. financial commitment of the problem
2. time required to complete the research work
3. availability of necessary laboratory to carry out the research work
d) Techniques to be employed
e) Sponsorship
f) Working Conditions.
2. Internal Factors
In case of internal factors, the necessary consideration has to be given with respect to the
following:
a) Interest
The problem should be interesting to the investigator. During the course of the
investigation, the investigator has to face certain obstacles. To face these obstacles
boldly, the investigator should have a genuine interest in the problem, otherwise the
tendency to discontinue the research may develop. Hence intellectual interest must be
counted as one of the criteria for the successful completion of the research work.
b) Amenability
The problem chosen for research should be amenable to investigation. The researcher
should possess the required competence, knowledge, and understanding. He should be
skillful enough to develop, administer, and interpret the necessary data gathering devices
and procedures. He should possess a reasonable basic knowledge in the necessary statistical
techniques. The problem should match his special qualifications, training and experience.
c) Availability of data
After selecting a particular problem for research, the collection of data related to the topic
may sometimes be very difficult. In certain cases, the data may be confidential or sensitive.
So, for a successful completion of a research work, easy availability of data is a
prerequisite. Thus, the topic of the research should ensure the easy collection of data.
d) Availability of Co-operation
The co-operation of various agencies such as institutions, authorities and individuals is needed
for research work. During the course of the investigation, the researcher has to manage
personnel and materials and to conduct prolonged experiments. He must make sure that
the necessary permission and co-operation from authorities will be available when
required.
e) Availability of guidance
A researcher has to choose the problem in such a way that necessary guidance is available
for the research. Unless there is a competent qualified faculty member who would be
willing to supervise the research work, research cannot be carried out successfully.
f) Level of research
It is another criterion that helps in the selection of a problem. The nature and scope of a study
will be determined in the light of the level of research, such as a Master's degree, M.Phil. or
Ph.D. At higher levels such as Ph.D., the problem chosen for research should be
correspondingly more complex.
g) Intellectual Curiosity
h) Training
i) Temperament and Personal Characteristics
j) Cost involved
k) Risks
l) Timings
m) Motivation
Stage 3: Interpretation of the Problem (Analysis of the Problem)
The introductory explanation of the problem is usually followed by a detailed definition
and development of background concerning sub-problems, scope, the review of the related
literature, sources of data, explanation of terminology used, assumptions, etc. To analyze
the problem in its proper perspective, the researcher should ask five simple questions to
ensure its feasibility.
They are:
1. What do you want to know?
2. Where and how will you get the information?
3. Who will collect the information?
4. How will the information be analyzed?
5. What does it mean?
Scope and limitation
Whatever the topic chosen for a research work, its scope has to be limited with
reference to the availability of time, money, materials and space. The problem has to be
investigated with reference to a particular period, and that period should be specified; the
researcher shall concentrate his study on this period only. If the topic is related to a
geographical area, that area should be clearly defined, i.e., the coverage will usually be that of
a country, state, district, taluk, village or, in certain cases, a single institution. Availability
of materials can also be pointed out as one of the limitations of the study. In science and
technology fields, the investigation always concentrates on a narrow area, and the investigator
limits his experiments, or his theoretical pursuit, as the case may be, to a specific problem.
Common Errors
There are some common errors committed in selecting and formulating a research
problem. They are:
1. Naming a broad field or area of study instead of a specific problem.
2. Stating it in such a way that investigation is impossible.
3. Narrowing or localizing the topic too much.
4. Including in it terms of an unscientific, emotional or biased nature.
CHAPTER V
Research Design
Introduction
Having developed a research question, identified a reading list, and planned the outline
for the literature review, we now move on to quantitative research design.
The researcher should keep in mind that there are two key applications for
developing his knowledge of quantitative research design:
(i) to look at how the research question can be examined through a variety of different designs;
(ii) to understand and describe the research designs used in the empirical literature which he is
reading for the literature review.
Research design provides the glue that holds the research project together. A design is
used to structure the research, to show how all of the major parts of the research project -- the
samples or groups, measures, treatments or programs, and methods of assignment -- work
together to try to address the central research questions.
Research designs fall into two broad classes: quasi-experimental and experimental.
Experimental studies are characterized by the ability to randomize subjects into treatment and
control groups. This randomization goes a long way toward controlling for variables which are
not included explicitly in the study. Because the comparison groups in quasi-experimental
studies are not true randomized control groups, such studies have to control for confounding
variables explicitly through statistical techniques. For this reason, quasi-experimental studies
are sometimes labeled correlational designs.
Key Concepts and Terms
1. Experimental Designs
A design is experimental if subjects are randomly assigned to treatment groups and to control
(comparison) groups. Cook and Campbell (1979) mention ten types of experimental design.
Note that the control group may receive no treatment, or it may be a group receiving a standard
treatment (ex., students receiving computer-supported classes versus those receiving
conventional instruction). That is, the control group is not necessarily one to be labeled "no
treatment."
a) Classic experimental designs: randomization of subjects into control and treatment
groups is a classic experimental method, amenable to a variety of ANOVA designs. The
two broad classes of classic experimental design are:
b) Between subjects designs: In this type of design, the researcher is comparing between
subjects who experience different treatments. There are different subjects for each level
of the independent variable(s) (ex., for each different type of media exposure in a study
of the effect of political advertising). Any given subject is exposed to only one level and
comparisons are made between subjects' reactions or effects. The researcher relies on
randomization of subjects among the treatment groups to control for unmeasured
variables, though sometimes stratification of subjects is employed to guarantee
proportions on certain key variables (ex., race).
c) Factorial designs: This design uses categorical independent variables to establish
groups. For instance in a two factor design, the independent variables might be
information type (fiction, non-fiction) and media type (television, print, Internet),
generating 2 times 3 = 6 categories. An equal number of subjects would be assigned
randomly to each of the six possible groups (ex., to the fiction-television group). One
might then measure subjects on information retention. A null outcome would be indicated
by the average retention score being the same for all six groups of the factorial design.
Unequal mean retention scores would indicate a main effect of information type or media
type, and/or an interaction effect of both.
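The 2 × 3 factorial logic above can be sketched in code. The following is a minimal illustration, not a prescribed procedure: the group labels, the assignment helper, and the retention-score layout are assumptions made for the example.

```python
import random
import statistics

# Hypothetical 2 x 3 factorial design: information type x media type.
INFO = ["fiction", "non-fiction"]
MEDIA = ["television", "print", "internet"]

def assign_subjects(subjects, seed=0):
    """Randomly assign an equal number of subjects to each of the 6 cells."""
    rng = random.Random(seed)
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    cells = [(i, m) for i in INFO for m in MEDIA]
    per_cell = len(subjects) // len(cells)
    return {cell: shuffled[k * per_cell:(k + 1) * per_cell]
            for k, cell in enumerate(cells)}

def cell_means(scores):
    """scores: {(info, media): [retention scores]} -> mean score per cell."""
    return {cell: statistics.mean(vals) for cell, vals in scores.items()}

def marginal_means(scores, factor):
    """Collapse over the other factor to inspect a candidate main effect."""
    levels = INFO if factor == "info" else MEDIA
    idx = 0 if factor == "info" else 1
    return {lvl: statistics.mean(v for cell, vals in scores.items()
                                 if cell[idx] == lvl for v in vals)
            for lvl in levels}
```

A null outcome, as described above, would show the same mean retention in all six cells; unequal marginal means would point to a main effect of information type or media type.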
d) Fully-crossed vs. incomplete factorial designs: A design is fully crossed if there is a
study group for every possible combination of factors (independent variables). An
incomplete factorial design, leaving out some of the groups, may be preferred if some
combinations of values of factors are nonsensical or of no theoretical interest. Also,
when one of the factors is treatment vs. control (no treatment) and another factor is
types/levels of treatment, the control subjects by definition will not receive types/levels
of treatment so those cells in the factorial design remain empty.
e) Randomized block designs: These stratify the subjects and, for each stratum, a factorial
design is run. This is typically done when the researcher is aware of nuisance factors that
need to be controlled (example, there might be an air conditioned room stratum and a no
air conditioning stratum) or if there were other mitigating structural factors known in
advance (ex., strata might be different cities). That is, the blocking variables which
stratify the sample are factors which are considered to be control variables, not
independent variables as they would be in a simple factorial design. Randomized block
designs seek to control for the effects of main factors and their interactions, controlling
for the blocking variable(s).
f) In SPSS: Consider city to be the blocking variable and information type and media type
to be the main factors. In a simple factorial design, city would be an additional factor and
in SPSS one would ask for Analyze, General Linear Model, Univariate; the dependent
variable would be retention score; city, information type, and media type would be fixed
factors; the model would be "full factorial" (the default). In a randomized block design,
one would ask for Analyze, General Linear Model, Univariate; the dependent variable
would be retention score; information type, and media type would be fixed factors; the
blocking variable, city, would be entered as a random factor; click Model and select
Custom, then set "Build Term(s)" to "Main Effects" and move all three factors over to the
"Model:" box; uncheck "Include Intercept in Model."; Continue; OK. Note that this
procedure reflects the fact that in a randomized block design there are no interaction
effects, just main effects. Later, for multiple comparisons, repeat this procedure but click
the Post Hoc button and enter the main factors in the Post Hoc Tests box; also check the
type of test wanted (ex., Tukey's HSD).
g) Within subjects (repeated measures) designs: In this type of design, the researcher is
comparing measures for the same subjects (hence, "within subjects"). The same subjects
are used for each level of the independent variable, as in before-after studies or panel
studies. Since the subjects are the same for all levels of the independent variable(s), they
are their own controls (that is, subject variables are controlled). However, there is greater
danger to validity in the form of carryover effects due to exposure to earlier levels in the
treatment sequence (ex., practice, fatigue, attention) and there is danger of attrition in the
sample. Counterbalancing is a common strategy to address carryover effects: ex., half the
subjects get treatment A first, then B, while the other half get B first, then A, so that the
carryover effect washes out in the sense that it is counterbalanced in the overall sample.
Keep in mind that counterbalancing does not remove all effects - for instance, if there is a
practice effect in a test situation, with higher scores for the second-taken test, on the
average both tests will score higher in the overall sample than they would otherwise,
since for both tests half the sample had the benefit of a practice effect. Counterbalancing
in this situation only seeks that both test scores are biased equally upward, not that bias in
absolute scores is eliminated.
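The counterbalancing strategy described above can be sketched as follows; the treatment labels and the helper name are illustrative assumptions, not from the text.

```python
import random

def counterbalance(subjects, treatments=("A", "B"), seed=0):
    """Assign half of the subjects to each presentation order (A then B,
    or B then A) so that carryover effects cancel across the overall
    sample, not within any individual subject."""
    rng = random.Random(seed)
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for s in shuffled[:half]:
        orders[s] = list(treatments)            # A first, then B
    for s in shuffled[half:]:
        orders[s] = list(reversed(treatments))  # B first, then A
    return orders
```

As noted above, this only equalizes the bias between the two treatment scores; it does not remove a practice or fatigue effect from the absolute scores.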
h) Matched pairs designs: Compared to between-subjects designs, within-subjects designs
control for subject variables better but at the expense of greater threat to validity in the
form of contamination from influences arising from subjects going from one
experimental level (condition) to another. Another type of repeated measures design is
matched pairs, where the repeated measurement is not of the same subjects but of very
similar subjects matched to have like key attributes. While matched pairs designs avoid
some types of invalidity of within subjects designs, such as the threat of subject fatigue
across repeated tests, matched pairs designs control only for the matched attributes
whereas same-subject within-subjects designs control for both explicit and unmeasured
subject variables.
2. Quasi-Experimental Design
a) Nonequivalent Control Group Designs
A design is quasi-experimental if subjects are not randomly assigned to groups but statistical
controls are used instead. There may still be a control or comparison group. While subjects
are not randomly assigned, they are either randomly selected (sampled) or are all the relevant
cases. For instance, a random sample of cities with council-manager governments may be
compared with a random sample of cities with mayor-council governments. Cook and
Campbell (1979) outline 11 nonequivalent control group research designs. In each case, due
to the non-equivalency of the comparison group, threats to validity are much more possible
than in a randomized design and the researcher should consider checklist-style all the types
of validity threats.
1. One-Group Posttest-Only Design: Sometimes called the "one-shot case study," this design
lacks a pretest baseline or a comparison group, making it impossible to come to valid
conclusions about a treatment effect because only posttest information is available. The level
of the dependent variable may be due to treatment, or may be due to any number of causes of
invalidity such as history (other events coexisting with treatment), maturation (changes in
subjects which would have occurred anyway), experimenter expectation (subjects seeking to
provide responses known to be desired or simply reacting to the attention of being tested), or
other biases discussed in the section on validity. If this design is used, information must be
gathered on pretest conditions, if only through respondent recollections, which are often
subjective and unreliable.
2. Posttest-Only Design with Nonequivalent Comparison Groups Design: In this common
social science design, it is also impossible to come to valid conclusions about treatment effect
based solely on posttest information on two nonequivalent groups since effects may be due to
treatment or to nonequivalence between the groups. Strategies for improving validity center on
trying to create equivalency between groups by random assignment of subjects or matched-pair
assignment to groups. When such assignment is impossible, then attempts may be made to
control statistically by measuring and using as covariates all variables thought to affect the
dependent variable. Nonetheless, many of the same threats to validity exist as in one-group
posttest-only designs: history (concurrent events affect the two groups differently), maturation
(the two groups would have evolved differently anyway), testing (the two groups have
different reactions to testing itself), regression to the mean (the two groups tend to revert to
their respective means if starting from extreme levels), etc.
3. Posttest-Only Design with Predicted Higher-Order Interactions: Sometimes the
expectation of the treatment effect interacts with a third variable. Instead of the expectation that
treatment group subjects will be higher on the dependent, one has the expectation that the
subjects will be higher if in the upper half of third variable Y but lower (or not as high) if in the
bottom half of Y. For instance, training may lead to greater productivity for high education
employees but not for low education employees on the same tasks. The interaction creates two
or more expectations compared to the simple one-expectation one-group posttest only design.
Because there are more expectations, there is greater verification of the treatment effect.
However, this design is still subject to possible challenges to validity due to such factors as
history (subjects high in education had different experiences) -- it is just that the counterargument has to be more complex to account for the interaction, and therefore may be
somewhat less likely to be credible.
4. One-Group Pretest-Posttest Design: This is a common but flawed design in social science.
It is subject to such threats to validity as history (events intervening between pretest and
posttest), maturation (changes in the subjects that would have occurred anyway), regression
toward the mean (the tendency of extremes to revert toward averages), testing (the learning
effect on the posttest of having taken the pretest), and most challenges discussed in the separate
section on validity. Sometimes the pretest data is collected at the same time as the posttest data,
as when the researcher asks for recollection data of the "before" state. This is known as a proxy
pretest-posttest design and has additional validity problems since the pretest data are usually
significantly less reliable.
5. Two-Group Pretest-Posttest Design Using an Untreated Control Group (separate
pretest-posttest samples design): If a comparison group which does not receive treatment is
added to what otherwise would be a one-group pretest-posttest design, threats to validity are
greatly reduced. This design parallels the classic experimental design but lacks random
assignment. Since the groups are not equivalent,
there is still the possibility of selection (observed changes are due to selection of subjects, such
as working with more motivated volunteers in a treatment group -- see two-stage least squares
for a discussion of testing for selection bias). Much depends on the outcome. For instance, if
the treatment group starts below the comparison group and ends up above after treatment, a
stronger inference of a treatment effect exists than if both groups rise in performance, but the
treatment group more so (this might well be due to selection). A strongly recommended
modification to this design is to have more than one pre-test. Multiple pretests (at the same
interval as between the last pretest and the posttest) help establish the performance trends in
both the treatment group and the control group, and treatment should be revealed by a change
in the trend line for the treatment group but not the control group.
6. Double pretest designs. One can strengthen pretest-posttest designs by having two (or
more) pretest measures. This can establish if there is a trend in the data independent of the
treatment effect measured by the posttest. By seeing if there is a posttest effect over and above
the trend, one controls for maturation threats to study validity.
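The trend logic of the double-pretest design can be expressed as a small sketch: project the pretest trend one measurement interval forward, and treat any posttest excess over that projection as the candidate treatment effect. The linear projection and the score values are simplifying assumptions for illustration.

```python
def projected_score(pretests):
    """Extrapolate the pretest trend one measurement interval forward.
    With two pretests this is a simple linear projection; the slope is
    the maturation/trend component the design controls for."""
    slope = pretests[-1] - pretests[-2]
    return pretests[-1] + slope

def treatment_effect(pretests, posttest):
    """Posttest change over and above the pre-existing trend."""
    return posttest - projected_score(pretests)
```

For example, pretests of 50 and 52 project a score of 54 at the posttest occasion; a posttest of 60 would then suggest a treatment effect of 6 over and above maturation.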
7. Four-group Design with Pretest-Posttest and Posttest-Only Groups. Also known as the
"Solomon four-group design," this design has a treatment and control group with both pretests
and post-tests and has treatment and control groups with posttests only. This design strengthens
the two-group pretest-posttest design because, if the same effect difference is found for
treatment vs. control groups in the pretest-posttest set as for the posttest-only set, then the
researcher may rule out threats to validity having to do with repeated measurement (ex.,
learning effects from having taken the test before).
8. Nonequivalent Dependent Variables Pretest-Posttest Design: In this design, the
researcher identifies dependent variables related to the treatment-related variable, but where
treatment is predicted to have no effect. Then, if the variable thought to be affected by
treatment does in fact change in the predicted direction, but there is no change in the other
related dependent variables, again as predicted, then the inference is made that the change in
question is due to treatment, not some confounding cause such as test experience from the
pretest.
9. Removed-Treatment Pretest-Posttest Design: In some situations it is possible not only to
introduce a treatment but also to remove it. If the dependent variable goes up after treatment
and then goes down when treatment is removed, this is some evidence for the effect of
treatment. Of course, if the variable goes up after treatment, it might come down on its own
anyway due to a declining return or attrition effect. Cook and Campbell (1979) therefore
recommend at least two posttests after treatment and before removal of treatment, in order to
establish trend effects after treatment. The researcher also needs to beware of resentment
effects due to treatment removal, as these also might cause a decline in the variable measured,
depending on the situation.
10. Repeated-Treatment Design: This design is similar to the preceding one but follows a
pretest-treatment-posttest-removal of treatment-posttest-restoration of treatment-posttest
pattern. The expected treatment effect is for the dependent variable to increase after treatment,
decline after removal of treatment, then increase again with restoration of treatment. Even if
this outcome occurs, inference is not foolproof as the decline phase may be due to resentment
at removal of treatment rather than direct adverse effects of removal of treatment, and the
subsequent rise may be due not to restoration of treatment but removal of the source of
resentment. Also, subjects may more easily become aware of experimenter expectations in this
design, and may seek to meet (or react against) expectations, thereby contaminating the study.
11. Switching Replications Designs. In this research design, there are two comparison groups
and three measures. Both groups are measured under pretest conditions. The treatment is given
to one group but not the control group, and a first post-test measure taken. Then the treatment
is given to the control group but not the first group, and a second post-test measure is taken.
12. Reversed-Treatment Pretest-Posttest Nonequivalent Comparison Groups Design. This
design is one in which the nonequivalent comparison group receives the opposite treatment
(ex., the treatment group receives participative leadership while the comparison group receives
autocratic leadership). The expectation is that the posttest will show an increase for the
treatment group and a decrease for the comparison group. Cook and Campbell (1979) suggest
adding a no-treatment group and even a placebo group where appropriate. Multiple pretests will improve
this design by showing preexisting trends in the treatment and nonequivalent comparison
group.
13. Cohort Designs with Cyclical Turnover: This design refers to the study of groups as they
evolve over time, as in the study of a fourth-grade class in year 1, the corresponding fifth grade
class in year two, etc. The expectation is that the class average will increase in the posttest after
treatment. This design is liable to the same challenges to validity as simple pretest-posttest
designs, but it can be strengthened by partitioning the cohort into subgroups according to their
exposure to the treatment. In a study of the effects of television violence, for instance, the
cohort may be divided into groups of high, medium, and low exposure to violent television
shows. The expectation is that the partitions exposed more will show more change on the
dependent variable. Where partitioning is not possible, having multiple pretests and posttests
can establish trends to rebut "it would have happened anyway" arguments about the validity of
conclusions under this design.
actually led to the observed effect). There may be other problems, such as failure to
seasonally adjust data, confounding a seasonal effect with a treatment effect; selection
bias, due to non-random attrition of subjects in the posttest; instrumentation bias (the
posttest is not equivalent to the pretest); and testing (there may be a learning effect from
the pretest such that the observed effect is a test artifact rather than a treatment
effect).
accident rates on weekend nights when bars are open and accident rates at times when
bars are not open. The expectation was that accident rates would be significantly lower
on weekend nights because of the presence of the treatment. Counter-explanations for
lower accident rates (ex., safer cars, and stricter court treatment of offenders) must
explain not only the lower accident rate on weekend nights, but also the lack of effect at
other times. Of course, confounding factors may well exist, but they must be unique to
the dependent variable of interest.
Interrupted Time Series with Multiple Replications. This is simply the interrupted
time series with removed treatment design, except that treatment and removal occur
multiple times on a schedule. Circumstances rarely permit such a design, but it is
stronger yet. By timing the replications randomly, the researcher is able to minimize
contamination from cyclical factors. This design assumes one is dealing with a
treatment effect which dissipates in a timely manner before the next replication, without
carryover effects (otherwise there is "multiple treatment interference," meaning that
receiving earlier treatments adds to or multiplies the effect of receiving later
treatments).
Non-Experimental Designs
A design is non-experimental if the subjects are neither randomly assigned nor
randomly selected. There may still be comparison groups.
Assumptions
The researcher is assumed to have considered all threats to validity associated with
the design.
Hypothesis
Introduction
Ordinarily, when one talks about hypothesis, one simply means a mere assumption or
some other supposition to be proved or disproved. But for a researcher, hypothesis is a formal
question that he intends to resolve. Thus, a hypothesis may be defined as a proposition or a set
of propositions set forth as an explanation for the occurrence of some specified group of
phenomena.
Rummel
It has been defined as a tentative solution posed on cursory observation of known and
available data and adopted provisionally to explain certain events and to guide in the
investigation of others. It is, in fact, a possible solution to the problem.
M.H.Gopal
Function of Hypothesis
From the above definitions, it is inferred that the following are the three main functions
of hypothesis:
1. To Test Theories
2. To suggest Theories
3. To describe Social Phenomena.
1.
To Test Theories
The relation between theory and hypothesis is very close indeed. Theories are based on
facts which, if elaborated, become hypotheses. Theories need to be tested in order to
demonstrate their predictive value and their adequacy as tools of explanation for some
event. Theories lend themselves to empirical test if the facts, propositions and
assumptions implied in them are split up into specific hypotheses.
2.
To Suggest Theories
Another function of hypothesis is to suggest theories. Theory is derived from
hypothesis and may be considered an elaborated hypothesis. In other words, a hypothesis,
when refined, becomes a theory.
There is no sharp line of demarcation between hypothesis and theory; the basic
difference is one of complexity and the extent of testing against the evidence.
In its early stage of testing, a theory is usually called a hypothesis, but as the
hypothesis is checked against the data and its logical implications are carried toward a
successful conclusion, it may become known as a theory. Goode and Hatt say that every
worthwhile theory permits the formulation of additional hypotheses. These, when tested, are
either proved or disproved and in turn constitute further tests of the original theory.
3. To Describe Social Phenomena
When a hypothesis is tested, it explains the phenomenon associated with it. This
phenomenon may be new or not known earlier. Neither a theory nor a hypothesis is
involved in such a test. Any commonsense knowledge or belief is put to empirical test
to determine its validity. Hence new knowledge is gathered. Goode and Hatt observe
that a hypothesis looks forward: it may prove to be correct or incorrect, but in any
event the hypothesis is a question put in such a way that an answer of some kind can be
forthcoming.
Forms of Hypothesis
1. Hypothesis concerning law: This explains how an agent works to produce a particular
effect or event.
2. Hypothesis concerning an agent: The law of operation of an agent is known, but the
agent which is working to produce an effect may not be known. This hypothesis is framed
to find out the agent.
3. Hypothesis concerning collocation: Collocation means an arrangement of … seeing cinema.
The null hypothesis is more useful than other hypotheses because it is exact. It is easier
to disprove this category of hypothesis than to prove it with certainty.
Importance of Hypothesis
In addition to putting the theory to test, a hypothesis performs certain other functions:
1. serves as a guiding point for research
2. prevents a blind search and indiscriminate gathering of data
3. serves as a forerunner to data collection
4. saves a lot of time and keeps the researcher away from a considerable amount of
confusion
5. suggests solutions to problems
6. has practical value to society
7. helps in understanding social phenomena in their proper perspective and refutes
certain common-sense notions about human behavior.
Characteristics of Hypothesis
1. The hypothesis must be defined clearly
2. The hypothesis should have empirical reference
3. The hypothesis must be specific
4. The hypothesis should be related to available techniques
5. The hypothesis should be related to a body of theory.
Testing of Hypotheses
Knowledge or a fact can be accepted only when it has validity, and its validity can be
accepted only when it is tested with regard to its usefulness or truth. Hence, before
hypotheses, which are merely hunches or guesses, are accepted as facts, they are to be
tested. Testing hypotheses means subjecting them to some sort of empirical scrutiny to
determine whether they are supported or refuted by what the researcher observes.
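As a minimal illustration of subjecting a hypothesis to empirical scrutiny, the sketch below tests the null hypothesis that a coin is fair, using an exact two-sided binomial p-value. The coin example, the function names, and the 0.05 threshold are assumptions made for the example, not procedures from the text.

```python
from math import comb

def binomial_p_value(heads, flips, p=0.5):
    """Exact two-sided test of the null hypothesis that each flip lands
    heads with probability p. Sums the probability of every outcome at
    least as far from the expected count as the one observed."""
    expected = flips * p
    observed_dev = abs(heads - expected)
    total = 0.0
    for k in range(flips + 1):
        if abs(k - expected) >= observed_dev:
            total += comb(flips, k) * p**k * (1 - p)**(flips - k)
    return total

def reject_null(heads, flips, alpha=0.05):
    """Refute (reject) the null hypothesis when the p-value falls below
    the chosen significance level alpha."""
    return binomial_p_value(heads, flips) < alpha
```

For instance, 10 heads in 10 flips would refute the fair-coin hypothesis at the 0.05 level, while 6 heads in 10 flips would not.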
Robert Baes states that the following questions should be asked about hypotheses
before they are tested:
1. are the terms empirically specific, so that the concepts or variables can be
distinguished in concrete situations?
2. is the relationship between variables such that it could be verified or nullified by
means of empirical operations?
3. is there any prior evidence as to the truth or falseness of the relationship?
4. can an appropriate study design be devised?
5. are the variables context-bound, or could they be equally well applied to other
interaction situations?
6. are the generalizations culture-bound, or can they also be applied realistically
to other cultures?
7. if other relevant factors are subject to change in the course of the observations,
are they adequately specified and enumerated, so that the observer can ascertain
whether they have changed during the period of observation?
8. is the generalization a part of the theoretical system from which it could be
deduced as well as being verified by the proposed empirical induction?
9. is the empirical system that is constructed sufficiently precise and articulate to
permit predictions in concrete situations?
Problems in Formulating Hypotheses
A researcher comes across several problems in formulating hypotheses. It is as difficult as finding a research problem or identifying a suitable theory. The following are some of the difficulties in formulating hypotheses, as pointed out by Goode and Hatt in their book Methods in Social Research:
1. Absence of a theoretical framework
2. Lack of ability to utilize that theoretical framework logically
3. Failure to be acquainted with available research techniques so as to be able to phrase the hypothesis properly
The researcher must bear in mind that he has to formulate a good, definite and testable hypothesis. At the beginning stage, when one is not familiar with the various processes of research, the formulation of a sound hypothesis may be somewhat difficult. He may frame some formulations and call them hypotheses, but they may not be testable.
Uses of Hypothesis
a) Hypothesis forms the starting point of investigation
b) Hypothesis makes observation and experiment possible
c) Hypothesis is an aid to explanation
d) Hypothesis makes deduction possible
e) Hypothesis acts as a guide. In other words, it is the investigator's eye, a sort of guiding light in the world of research
f) It prevents blind research. It spells out the difference between precision and haphazardness, between fruitful and fruitless research
g) It provides direction to research by identifying what is relevant, and prevents irrelevant review of literature and the collection of useless or excess data
h) It focuses research; without it, research is like a random and aimless path
i) It links up related facts and information in a fully understandable form
j) It serves as a framework for drawing meaningful conclusions
UNIT III
CHAPTER VI
Historical Research
Historical research is the critical investigation of events, developments and experiences of the past, the careful weighing of evidence on the validity of sources of information on the past, and the interpretation of the weighed evidence. The historical researcher collects the data, evaluates it for its validity and finally interprets it. Historical research is also known as historiography and differs from other methods in the nature of its subject matter, which is usually based on past events. According to Sheik Ali, historical research is digging into the past in order to re-enact the past in its entirety, to reconstruct past events as fully as they must have happened, to explain the meaning and significance of those events, to correct wrong notions so long prevalent, if any, and to elaborate, analyse, synthesise and philosophise the ideas in the light of the knowledge we possess. In short, the critical examination of past events or happenings in order to know the truth, and later to arrive at generalizations, is known as historical research.
Aim of Historical Method
The aim of the historical method is to discover regularities among events of a given kind at a certain period of time in the past. History is concerned with an accurate account of particular happenings, stressing their location and date.
Thus the aim of the historical method is to present an accurate record of the how, when and where of events. The historian explains an event by describing the conditions which led up to it and how it grew. In addition, a historian may indicate some of the important results which have followed as a consequence of the event.
The historical method critically and reflectively produces a final reconstructed historical narrative, so that we may learn its bearing upon present phenomena, social institutions, and social or behavioural problems.
Uses of Historical Methods
The uses of historical methods are enumerated as follows:
i) It is used by surveyors and even by the most advanced social or behavioural research workers, both for designing the study and making inferences.
ii)
iii) initiating projects.
iv) It extends clues which might indicate the sources and origin of the problem and the degree of influence it has been exerting on behavioural aspects, and thus shows the way to controlling and harmonizing those sources and influences.
v)
vi) It is an enquiry or investigation into a subject in order to discover facts, revise the known facts and put the facts into theories. It is useful both for theoretical and practical purposes.
vii) It has made an important contribution to various branches of the social sciences. Examples of historical methods include the study of life histories and the history of various organizations and institutions.
viii)
Advantages
The advantages of historical approach are:
a) Some problems can be investigated only by this approach and do not lend themselves to other approaches. The historical approach therefore fills a big gap, making research possible and meaningful on problems that would otherwise have remained unexplored. Many a time it is of considerable interest to use time-series data for assessing the progress or impact of various policies and initiatives, and this can be done only by looking into historical records; for such problems, therefore, only the historical approach would suit.
b) Secondly, historical data are not repeatable under any circumstances and, therefore, the historical approach serves as a ready method for researchers whose problems depend on historical observations. It is fairly easy to repeat observations in laboratories under controlled conditions, but this cannot be done in the case of historical data. The historical approach, therefore, has the advantage of offering past data under the then-prevailing conditions and affords the researcher an opportunity to view these observations in their past setting.
c) Thirdly, historical records provide very useful information that goes a long way towards the solution of a research problem. In cases where time-series data are unavoidable, it is advantageous to follow the historical approach.
Library History
Library history is the systematic recounting of past events pertaining to the establishment, maintenance, and utilization of systematically arranged collections of recorded information or knowledge. Carefully conducted library history relates the causes and results of events; it also often recognizes the social, economic, political, intellectual, and cultural environment in which these events occurred. Furthermore, library history is sometimes considered an exposition of past incidents and developments and their impact on later times.
The term library history is commonly applied to an account of events that affected any library or group of libraries, as well as to the social and economic impacts of libraries on their communities.
For example, the contents of the Library Trends issue entitled American Library History: 1876-1976 illustrate the diversity of historical topics within the profession. Articles are included about the distribution of libraries over the nation, research collection development, library statistics, library buildings, the library profession, library associations, publishing in the profession, the organization of library resources, and service aspects of librarianship, all treated from a historical point of view.
Pierce Butler wrote in An Introduction to Library Science that librarianship can be fully appreciated only through an understanding of its historic origins. The specific value of library history, according to Jesse Shera, is that it allows librarians to synthesize and to make generalizations from reconstructions of the past; this process of synthesis and generalization will not only recreate the past but can serve as an aid in understanding the present. Libraries cannot function effectively (fulfil their social responsibility) when history is regarded merely as an esoteric aspect of knowledge.
Other examples of the historical approach in libraries are the following: Spencer's The Chicago Public Library: Origins and Backgrounds, Shera's Foundations of the Public Library, Whitehill's Boston Public Library, etc.
Librarians who have conducted historical studies have sometimes been interested in the general history of libraries or in library history in specific countries, states or regions. Other librarian-historians have concentrated on broad periods of time or ages: ancient, medieval or modern. A few history scholars have investigated aspects of large categories of librarianship, such as (a) libraries in the United States during the colonial period; (b) the period in which social libraries existed, between 1731 and 1865; or (c) the nineteenth-century development period of the tax-supported public library in the United States. A number of historians have conducted enquiries into the histories of particular types of libraries, such as academic, public or special.
Some library historians are investigating events and developments in the recent past, such as (a) the influence of legislators and of legislation such as Public Law 480, the Library Services and Construction Act, and the Higher Education Act; (b) the library utilization of a variety of new communication media and technological innovations such as tele-facsimile devices, television, microforms, data-processing equipment, computers, reprographic tools, and audio-visual materials; and (c) the impact of the Great Depression and World War II on libraries.
Other subjects of contemporary historical studies have been (a) library services to disadvantaged and culturally deprived citizens; (b) the roles of library associations and their impact on library development; (c) overseas library technical assistance; (d) the careers of black librarians; (e) women in positions of leadership in the profession; and (f) political leadership for library development.
Conditions Necessary for Historical Research
The research worker who plans to conduct a historical study must first determine whether a reasonable amount of evidence about the selected topic is readily available, as well as how and where access to it can be gained.
Good history cannot be written without adequate sources of information. The researcher should avoid selecting topics or problems that are too broad or complex, and should not pose research questions that require unduly extensive investigations.
The investigator should choose a subject area for investigation in which he or she feels most comfortable so far as personal ability, background, experience, knowledge, and aptitude are concerned. Difficulties are likely to arise when historians have not acquired an in-depth knowledge of the targeted subject area.
A thorough and accurate system of bibliographical control and note-taking procedures should be devised before a study is initiated. Moreover, literature searches are essential at an early stage of the investigative process.
Good historians realize that not all knowledge is absolute, nor is all knowledge scientific. Scientific inquiry involves raising questions, proposing answers, and testing the answers. The scientific approach in historical inquiries produces knowledge that is more certain.
The success or failure of historical inquiries will depend greatly upon the ability of the research worker to adequately conceptualize the purpose and problem of the research, to rigorously evaluate and categorize the collected evidence, and to analyse the data intelligently in view of the research objectives.
Operational Research
Operational Research (OR) is the use of scientific methods to study the functions or operations of an organization so as to develop better ways of planning and controlling changes. It can be viewed as a branch of management, engineering, or science, or as a special combination of these three. As part of the field of management, its purpose is to assist decision makers in choosing preferred future courses of action by systematically identifying and examining the alternatives which are available to the manager, and by predicting the possible outcomes of such actions. As a branch of engineering, operational research is concerned with the analysis and design of systems for automating and augmenting management-type functions, especially those concerned with information processing and control. Operational research is also viewed as a form of applied science, and is closely related to the fields of statistics, computer science, and optimization methods.
Operational research began with the efforts of small teams of British scientists who were mobilized at the beginning of World War II to help solve pressing military problems. Under the direction of Nobel Laureate P. M. S. Blackett, "Blackett's circus" quickly demonstrated the value of interdisciplinary scientific approaches to operational problems. Their success led to early imitation by U.S. and other Allied groups, and to the eventual recognition of operations research as an integral part of modern military management. The use of operational research techniques in business, industry and government grew rapidly during the 1950s and 1960s. Today, most large companies have special groups which specialize in the application of the OR approach to corporate problems. Many consulting firms provide such services to business, government and military organizations on a contractual basis, usually at the higher levels of decision making in the organization.
Special courses and academic programmes in operations research were started in the early 1950s at the Massachusetts Institute of Technology, the Johns Hopkins University, and Case Western Reserve University. By the end of the 1960s most universities were offering some kind of programme in the subject. Preparation for professional practice as an operations research specialist has been strongest in the engineering schools, business schools, and military academies. Departments of applied mathematics, industrial and systems engineering, and quantitative methods in business and economics are the kinds of faculties that offer advanced graduate study in operations research. Operations research techniques are now being taught widely as supplementary topics in many other disciplines. As an academic subject in its own right, however, emphasis is placed on the mathematical aspects of the subject and on those methods which have proven to be especially useful in describing and solving many kinds of decision problems. In practice, operations
research remains an eclectic art and is usually undertaken as a team effort combining the skills of many disciplines. Initially such teams were largely dominated by physical scientists who were interested in applying their scientific methodology to operational problems. More recently, their ranks have given way to engineers, economists, psychologists, mathematicians, and computer scientists. The common unifying theme of their work is the task of formulating real-world problems in such a way that they can be analysed in a systematic fashion and solutions can be proposed that have credibility through logic and experience. Frequently, it is found that formulations and solution techniques perfected in one problem area have considerable transferability to other problem areas. The two major professional societies in the United States are the Operations Research Society of America (ORSA) and The Institute of Management Sciences (TIMS).
The Operations Research Approach
Operations research begins with the study of the people who make decisions in an organization and seeks to make more explicit the alternatives they face and their reasons for making choices. There are four major steps:
a) Formulation of the problem
b) Construction of a model of the system under study
c) Derivation of a solution from the model
d) Testing the model and implementing the solution in the organization
The total process is a cyclic one; that is, the research team begins with an elementary formulation of the problem and the construction of as simple a model as possible to capture the essential features. This leads to a solution system that can be used in an experimental way in the organization to see how it affects the original problem situation. This usually leads to a much better clarification of the problem, enrichment of the model, refinement of the system, and a new round of experimentation in the organization. The process is intended to continue in this way, to evolve into a fully satisfactory control of the operations under study, and also to lead to higher levels of management decision making in the organization. It is possible to begin at various problem situations and arrive at the same level of understanding and control. However, it is usually most effective to begin with isolated problem situations and then work toward their convergence in a more common structure of decision making, rather than to develop a total management system at the outset.
The most widely used OR model is that of linear programming, which is a technique
for finding the best allocation of commonly used resources among the various activities
that use those resources. A different kind of modeling that has been especially useful to
information and communication systems and other types of service organizations is
concerned with the analysis of waiting lines or queues. Professor Morse is a noted expert
on queueing theory, and he shows that many aspects of the behaviour of users in a library
can be readily modeled as random processes. A more recent study by W.B.Rouse used this
technique to determine the optimal staffing of service desks in libraries and to predict the
total volume of circulation of a library collection.
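The cited studies use far richer models, but the flavour of such queueing analysis can be sketched with the basic single-server (M/M/1) formulas. The arrival and service rates below are invented for illustration, not taken from any of the studies mentioned:

```python
# Illustrative analytic model of a single library service desk as an M/M/1
# queue. The arrival and service rates are hypothetical example values.

def mm1_metrics(arrival_rate, service_rate):
    """Return utilization, mean number in system, and mean queueing time
    for a single-server queue with Poisson arrivals and exponential service."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrivals outpace service")
    rho = arrival_rate / service_rate           # fraction of time the desk is busy
    in_system = rho / (1 - rho)                 # mean number of patrons present
    wait = rho / (service_rate - arrival_rate)  # mean time queueing before service
    return rho, in_system, wait

# Example: 20 patrons per hour arrive; the desk can serve 30 per hour.
rho, in_system, wait = mm1_metrics(20, 30)
print(f"utilization {rho:.2f}, patrons present {in_system:.1f}, wait {wait * 60:.0f} min")
# prints: utilization 0.67, patrons present 2.0, wait 4 min
```

Even this toy calculation shows the managerial use of such models: as the arrival rate approaches the service rate, waiting time grows without bound, which argues for adding staff before the desk is fully saturated.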
An alternative method of studying systems which are subjected to random perturbations is computer simulation, whereby the computer is made to generate such occurrences by mathematical methods and then keep close track of the consequences of such events over a period of time. This method is often used when the characteristics of the system under study do not conform to the assumptions and other limitations in the use of queueing theory.
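A minimal sketch of this simulation idea is a single-server first-come, first-served desk with randomly generated arrival and service times. The rates used here are invented illustration values, not figures from the studies cited:

```python
import random

# Discrete-event sketch of a single service desk: exponential interarrival
# and service times, FIFO queue. Rates are hypothetical example values.

def simulate_desk(arrival_rate, service_rate, n_patrons, seed=42):
    """Return the average time patrons spend queueing before service (hours)."""
    rng = random.Random(seed)      # fixed seed so the run is reproducible
    clock = 0.0                    # arrival time of the current patron
    free_at = 0.0                  # time at which the desk next becomes free
    total_wait = 0.0
    for _ in range(n_patrons):
        clock += rng.expovariate(arrival_rate)   # next arrival
        start = max(clock, free_at)              # wait if the desk is busy
        total_wait += start - clock              # time spent queueing
        free_at = start + rng.expovariate(service_rate)
    return total_wait / n_patrons

avg_wait_hours = simulate_desk(20, 30, n_patrons=50_000)
print(f"simulated average wait: {avg_wait_hours * 60:.1f} minutes")
```

With 20 arrivals and 30 services per hour, the simulated wait settles near the four minutes that queueing theory predicts; the value of simulation is that the same loop still works when the assumptions behind the formulas do not hold.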
be experimented with in order to see what is most useful and what data are needed to make
a new synthesis of libraries and computers work most effectively.
Models of Document Usage in Libraries
Operations research models that have been developed within library settings tend to focus on the library as a document storage and retrieval centre. As early as 1953, Morse and his colleagues in the Operations Research Centre at the Massachusetts Institute of Technology began a series of studies of the institute's library system to develop methods which could be used in other libraries too. Some of these studies centred on the books and some on users' characteristics. The key model in Morse's study is one in which book usage over time can be predicted to follow a known mathematical form, settling down to a steady-state pattern after an initial period of popularity. The model is used to detect some interesting differences in the usage patterns of books in different subject fields, although the data are admittedly skimpy. The model is then used to make policy recommendations for such activities as book acquisition, circulation, retirement, and storage, in the light of the data obtained in the MIT libraries.
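Morse's fitted models are more sophisticated, but the settling-down behaviour described above can be illustrated with a toy linear recurrence; the coefficients and starting circulation here are hypothetical, not Morse's fitted values:

```python
# Simplified sketch of a Morse-style usage model: next year's expected
# circulation is a fraction of this year's plus a constant background demand.
# Coefficients a and b and the starting value are invented for illustration.

def project_circulation(first_year, a=0.6, b=1.0, years=10):
    """Project yearly circulation r(t+1) = a*r(t) + b, which settles
    toward the steady state b / (1 - a) after the initial popularity fades."""
    r = first_year
    history = [r]
    for _ in range(years):
        r = a * r + b
        history.append(r)
    return history

history = project_circulation(first_year=12.0)
steady_state = 1.0 / (1 - 0.6)   # = 2.5 loans per year in this toy example
print([round(x, 2) for x in history[:4]], "->", round(history[-1], 2))
```

A projection like this is what underlies the policy uses mentioned in the text: a title whose expected circulation has settled near its low steady state is a candidate for retirement or storage, while a duplicate copy may be justified during the high initial period.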
Experimental Method
An experiment is a research process used to establish some truth, principle, or effect. It differs from other investigative methods in that the observed phenomena are controlled to varying degrees by the investigator. Experiments are conducted under controlled conditions; attempts are made by experimenters to eliminate as many extraneous factors as possible.
A number of experimental procedures can be used by investigators; no single design can be characterized as the best for all inquiries. In selecting particular experimental techniques, competent investigators remain aware that the conceptual requirements of the research hypothesis must be met by controlled experimental conditions.
Basic Concepts
An experiment can be defined as a research situation in which investigators specify
exactly, or control, the conditions that will prevail in the investigation. The values of one
or more independent variables are then manipulated and the effect of the manipulation on
the values of the dependent variables with respect to one or more experimental groups is
observed. The effects of other factors that might possibly be relevant to the research
problem (i.e., affect the values of the dependent variable) are minimized through careful
experimental design. In this way, conceptual requirements of the research hypothesis are
met by controlled experimental conditions.
In librarianship, experiments can be used to test new techniques for developing, maintaining, and utilizing library collections; to identify ill-defined or previously unobserved library or informational phenomena; and to explore conditions under which certain phenomena in library and information science occur.
Subject of an Experimental Research
A subject of an experiment is the basic unit on which an experiment is performed. In agricultural experimentation, subjects might be plants of certain types or perhaps plots of planted ground. In drug research, subjects might be white rats or guinea pigs. In library science, information science, and other social science research, subjects are frequently persons: for example, patrons, librarians, or students.
Treatment in an Experimental Research
A treatment is the condition that is applied to an experimental group of subjects. In agricultural research, a treatment may be the application of fertilizer to growing plants or the regulation of the amount of moisture which the plants receive. In medical research, the administering of drugs to patients (subjects) or the use of a surgical technique are examples of treatments. In librarianship, a treatment that is being tested might be a system of indexing, a mode of instruction, a type of catalogue organization, or a method of book selection, among others.
Complete equivalence between experimental and control groups can seldom be obtained in research in the social sciences, however, because with human subjects there are always numerous environmental and hereditary factors that cannot normally be known. Thus, the groups should be made as equivalent as possible in motivation, intelligence, socio-economic status, age, and perhaps other qualities as well. It is easy to see why this is so; any one of these variables might have a differential effect on otherwise equivalent students. For example, if the students of Group A manifest more collective intelligence than Group B, then that factor itself may cause students in Group A to learn library skills more effectively than those in Group B. But by designing an experiment so that the groups are roughly equivalent in intelligence, this variable is controlled, and thus its effect should be the same for both groups. Because of the careful control that is exercised in its use, the experimental method is the most appropriate technique to test hypotheses which involve causal relationships.
Figure 1. The experimental group: treatment and post-test.
Field Experiment
In a field experiment, the investigator observes a phenomenon in a natural setting, and
at the same time, manipulates one or more variables are taken in field experiments to avoid
unnecessary disruptions of the natural conditions being observed. When an investigator
goes beyond the field experiment and attempts to construct a particular setting or a set of
circumstances that typifies a naturally occurring phenomenon, the research design could be
described as n experimental simulation. The electronic computer has been found to be a
powerful tool for such stimulations.
Laboratory Experiment
A laboratory experiment is a study conducted under highly controlled conditions.
The term laboratory is used as an indication that an experiment is not conducted in a
naturalistic setting but in some other convenient place.
While many experiments in library and information science are conducted to test specific hypotheses, the experimental method is also used advantageously in exploring other kinds of research problems. Experiments can be performed to test new techniques for acquiring, classifying, storing, and retrieving information, or to test new library or information services. Furthermore, experiments allow investigators in library and information science to explore conditions under which a phenomenon occurs. Other experiments might be conducted merely to satisfy curiosity about certain library or information phenomena.
Evaluation of an Experiment
Once an investigator has conducted an experiment, the results of the experiment (the experimental data) must be evaluated. Evaluation involves a number of concepts, including hypothesis testing, experimental error, sensitivity, internal validity, and external validity.
The evaluation of an experiment involves a test of a null hypothesis, a statement that no significant difference exists between the control and experimental groups. In other words, the null hypothesis asserts that, within specified limits of credibility, the control and experimental groups are essentially equivalent. Two groups of subjects will rarely perform identically; some variation will probably occur under any circumstances. The investigator, however, must determine whether differences between the performances of the control and experimental groups are statistically significant. A test of a null hypothesis is an attempt to disprove the assertion of no difference, to show that the experimental group has in fact been affected by the treatment in such a way as to significantly change the value of the dependent variable. An appropriate procedure to test the significance of a treatment using the four-cell experimental design may be the t-test, which is applied to the differences between pretest and post-test scores for the control and experimental groups. The chi-square test may also be appropriate. Analysis of variance is required for more complex designs.
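The t-test on gain scores can be worked through with a short program. The pretest/post-test gains below are invented illustration data, not results from any study, and the simple pooled-variance form shown assumes equal-sized groups:

```python
import math
from statistics import mean, stdev

# Worked sketch of a two-sample t-test on pretest/post-test gain scores.
# The gains below are invented illustration data.

control_gain      = [2, 3, 1, 4, 2, 3, 2, 1]   # post - pre, control group
experimental_gain = [5, 6, 4, 7, 5, 6, 4, 5]   # post - pre, experimental group

def two_sample_t(a, b):
    """Independent two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    # pooled sample variance of the two groups
    sp2 = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(b) - mean(a)) / math.sqrt(sp2 * (1 / na + 1 / nb))

t = two_sample_t(control_gain, experimental_gain)
df = len(control_gain) + len(experimental_gain) - 2
print(f"t = {t:.2f} with {df} df")
# prints: t = 5.80 with 14 df
```

A t this large comfortably exceeds the two-tailed critical value of about 2.14 at the .05 level for 14 degrees of freedom, so for these invented data the null hypothesis of no difference would be rejected.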
Sensitivity of an Experiment
The sensitivity of an experiment is its ability to detect relatively small effects. One way to increase the sensitivity of an experiment is to increase the number of subjects; this decreases the chance for random error (or experimental error) to affect the results in a significant way. This option always adds to the expense of conducting the experiment, however. Another way to reduce experimental error is to exert additional control over the experiment, to ensure that the control and experimental groups are composed of similar subjects. This is sometimes done
which it can be generalized? External validity is associated with questions concerning the extent to which a sample is representative of the target population. Can the results of an experiment be generalized to the real world? It is normally easier to justify such generalization for field experiments than for laboratory experiments because of the artificiality of the latter. Overall, experimental findings are evaluated in terms of the reliability of the data, the scientific importance of the results, and the extent to which the data can be generalized.
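The link between the number of subjects and sensitivity can be made concrete: the standard error of a group mean shrinks in proportion to one over the square root of the group size. The score standard deviation of 10 below is an assumed illustration value:

```python
import math

# Why more subjects increase sensitivity: the standard error of a group
# mean falls as 1/sqrt(n), so smaller treatment effects stand out against
# random error. The score standard deviation is an assumed value.

sigma = 10.0   # assumed standard deviation of individual scores

for n in (10, 40, 160):
    se = sigma / math.sqrt(n)   # standard error of the group mean
    print(f"n = {n:3d}: standard error of the mean = {se:.2f}")
# Each quadrupling of the group size halves the standard error.
```

This is also why the extra sensitivity comes at a price: halving the detectable effect requires four times as many subjects, which is the added expense the text refers to.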
Ex Post Facto Study
The Latin ex post facto means literally after the fact or retrospectively. Although an ex post facto study is not really an experimental design, such a study is sometimes confused with the experimental method and is frequently discussed with it. What is lacking in the ex post facto study is control. Rather than introducing an independent variable and controlling other variables, the ex post facto study seeks to analyse what has already happened in an effort to isolate the cause of the events. Mouly asserts that this is experimentation in reverse; the obvious weakness of such an experiment is that we have no control over the situations that have already occurred, and we can never be sure how many other circumstances might have been involved.
An ex post facto study is a type of quasi-experimental design. Consider the hypothesis that academic courses in research methods affect graduate library science students in such a way as to cause them to be better librarians than those who did not complete such a course.
An ex post facto analysis of graduated librarians with respect to an operational definition of success in librarianship (such as salary or job title) might well reveal that a strong relationship does indeed exist between completing a course in research methods and professional success in librarianship. But is the observed relationship causal in nature? Does the course cause the success?
The original hypothesis could be tested, albeit with some difficulty and over a period of years, by the experimental method. Such an approach might involve the random division of a group of equivalent students into two groups: a control group (students not permitted to take research methods) and an experimental group (students required to take research methods). The effect of the course on the graduates could then be observed over the students' careers. Any observed differences in success, operationally defined, could then be said to have been caused by the treatment, in this case completing or not completing the course in research methods.
Major Steps in Experimental Method
1.
2.
3.
4.
5.
6.
7.
8. Drawing up conclusions
9.
i) Investigating the needs in the field of action and deciding upon a problem.
ii)
recording of data completely depends on the skill and ability of the researcher. He should
also try to seek more information and derive special features from them.
Step 3: Interpretation of data
Interpretation is the search for the broader meaning of research findings. It is the deeper sense of investigation. It places a particular event in the larger flow of events. It is the colour, the atmosphere, the human element that gives meaning to a fact. It is, in short, setting, sequence and, above all, significance. The task of interpretation falls on the shoulders of the researcher himself. Mere presentation of the facts is not enough; the facts collected must be clarified, explained and interpreted. The interpretation must be in a logical and convenient form. Qualitative classification and building up a proper background theory are also essential. Good interpretation depends on the researcher's knowledge, imagination and wisdom.
Step 4: Reporting of data
The study should be reported in such a way that the reader can make a judgment as to
its adequacy.
Advantages
The main advantages of the case study approach are:
a) It produces new ideas and fresh suggestions
b) It helps in formulating a sound hypothesis; and
c) It may also help in exploring new areas of research.
Since the case study approach makes an in-depth study of a particular unit of investigation and is always approached with an open mind, it bestows upon the researcher a rich wealth of new ideas and suggestions for further exploration of research fields. An investigator of an institution may uncover fresh knowledge about problems that might not have occurred to the researcher before he undertook the investigation. He may also get new suggestions from the field of operation by intensively carrying out the examination of the case study.
Secondly, the case study approach is very useful in helping the researcher to develop and
formulate a scientifically sound hypothesis for further research on a broader level. As already
mentioned, a researcher may not start with a given hypothesis, but may undertake a case
study precisely in order to formulate such a hypothesis for further research. The case study
approach also has the advantage of permitting a multi-dimensional exploration of the same
unit, and this enriches the knowledge pertaining to a particular case for further use in policy
formulation.
Thirdly, when a case study is undertaken, some areas of research may not have
occurred to the researcher's mind, and the very case study may open up new avenues of
research where fruitful investigations can be undertaken, either by the same researcher or
by other researchers.
Thus, this approach allows a concentrated focus on a single phenomenon and the
utilization of a wide array of data-gathering methods. The overall purpose of a case study is
to obtain comprehensive information about the research object. Data-gathering methods
used in case studies are based primarily upon direct observation; both participant and
non-participant observation can be used. When necessary, these methods are supplemented by
structured techniques such as interviews and questionnaires.
CHAPTER VII
Sampling
Sampling is the act, process, or technique of selecting a suitable sample, or a representative
part of a population for the purpose of determining parameters or characteristics of the whole
population.
What is a sample?
A sample is a finite part of a statistical population whose properties are studied to gain
information about the whole (Webster, 1985). When dealing with people, it can be defined as a
set of respondents (people) selected from a larger population for the purpose of a survey.
A population is a group of individual persons, objects, or items from which samples are taken
for measurement; for example, a population of presidents, professors, books or students.
Purpose of sampling
To draw conclusions about populations from samples, we must use inferential statistics,
which enables us to determine a population's characteristics by directly observing only a
portion (or sample) of the population. We obtain a sample rather than a complete enumeration
(a census) of the population for many reasons. Obviously, it is cheaper to observe a part rather
than the whole, but we should prepare ourselves to cope with the dangers of using samples. In
this chapter, we will investigate various kinds of sampling procedures. Some are better than
others, but all may yield samples that are inaccurate and unreliable. We will learn how to
minimize these dangers, but some potential error is the price we must pay for the convenience
and savings that samples provide.
There would be no need for statistical theory if a census rather than a sample were always used
to obtain information about populations. But a census may not be practical and is almost never
economical. There are six main reasons for sampling instead of taking a census:
Economy
Timeliness
The large size of many populations
Inaccessibility of some of the population
Destructiveness of the observation
Accuracy
The sampling process comprises several stages:
Defining the population of concern
Specifying a sampling frame, a set of items or events possible to measure
Specifying a sampling method for selecting items or events from the frame
Determining the sample size
Implementing the sampling plan
Sampling and data collecting
Reviewing the sampling process
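As a minimal sketch, the stages above can be strung together in code; the population of 500 students, the frame of 480 reachable ones, and the sample size of 30 are all invented for illustration:

```python
import random

# Hypothetical illustration of the stages above: population, frame and
# sample size are invented for the example.
population = [f"student_{i}" for i in range(1, 501)]   # 1. population of concern

frame = population[:480]        # 2. sampling frame (20 students are unreachable)

def simple_random(frame, n, seed=42):
    """3.-4. Apply the chosen method and sample size to the frame."""
    rng = random.Random(seed)   # fixed seed so the example is repeatable
    return rng.sample(frame, n)

sample = simple_random(frame, n=30)   # 5. implement the sampling plan

# 6.-7. Data would now be collected from `sample` and the process reviewed.
assert len(sample) == 30 and all(s in frame for s in sample)
```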
1. Population definition
Successful statistical practice is based on focused problem definition. Typically, we seek to
take action on some population, for example when a batch of material from production must be
released to the customer or sentenced for scrap or rework.
Alternatively, we seek knowledge about the cause system of which the population is an
outcome, for example when a researcher performs an experiment on rats with the intention of
gaining insights into biochemistry that can be applied for the benefit of humans. In the latter
case, the population of concern can be difficult to specify, as it is in the case of measuring
some physical characteristic such as the electrical conductivity of copper.
However, in all cases, time spent in making the population of concern precise is often well
spent, because it raises many issues, ambiguities and questions that would otherwise have
been overlooked at this stage.
2 . Sampling frame
The sampling frame must be representative of the population and this is a question outside
the scope of statistical theory demanding the judgment of experts in the particular subject
matter being studied. All the above frames omit some people who will vote at the next election
and contain some people who will not. People not in the frame have no prospect of being
sampled. Statistical theory tells us about the uncertainties in extrapolating from a sample to the
frame. In extrapolating from frame to population, its role is motivational and suggestive.
There is, however, a strong division of views about the acceptability of representative sampling
across different domains of study. To the philosopher, the representative sampling procedure
has no justification whatsoever because it is not how truth is pursued in philosophy. "To the
scientist, however, representative sampling is the only justified procedure for choosing
individual objects for use as the basis of generalization, and is therefore usually the only
acceptable basis for ascertaining truth." (Andrew A. Marino) It is important to understand this
difference to steer clear of confusing prescriptions found in many web pages.
In defining the frame, practical, economic, ethical, and technical issues need to be
addressed. The need to obtain timely results may prevent extending the frame far into the
future.
The difficulties can be extreme when the population and frame are disjoint. This is a
particular problem in forecasting where inferences about the future are made from historical
data. In fact, in 1703, when Jacob Bernoulli proposed to Gottfried Leibniz the possibility of
using historical mortality data to predict the probability of early death of a living man,
Gottfried Leibniz recognized the problem in replying:
"Nature has established patterns originating in the return of events but only for the most
part. New illnesses flood the human race, so that no matter how many experiments you have
done on corpses, you have not thereby imposed a limit on the nature of events so that in the
future they could not vary."
Having established the frame, there are a number of ways for organizing it to improve
efficiency and effectiveness.
It is at this stage that the researcher should decide whether the sample is in fact to be
the whole population and would therefore be a census.
3. Sampling method
Within any of the types of frame identified above, a variety of sampling methods can be
employed, individually or in combination:
Quota sampling
Simple random sampling
Stratified sampling
Cluster sampling
Random sampling
Systematic sampling
Mechanical sampling
Convenience sampling
Line-intercept sampling
Quota sampling
In quota sampling, the population is first segmented into mutually exclusive subgroups, just as in stratified sampling. Then judgment is used to select the subjects or units from
each segment based on a specified proportion. For example, an interviewer may be told to
sample 200 females and 300 males between the ages of 45 and 60.
It is the second step which makes the technique one of non-probability sampling. In
quota sampling the selection of the sample is non-random. For example, interviewers might be
tempted to interview those who look most helpful. The problem is that these samples may be
biased because not everyone gets a chance of selection. This non-random element is its greatest
weakness, and quota versus probability sampling has been a matter of controversy for many
years.
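A sketch of the quota idea, with a hypothetical stream of respondents matching the example above; units are accepted in arrival order (non-randomly) until each quota is filled, which is exactly the step that makes the method non-probability:

```python
# Hedged sketch of quota sampling: the respondent stream and group names
# are hypothetical. Units are taken first-come, first-served until each
# quota is filled -- a non-random selection.
def quota_sample(stream, quotas):
    filled = {group: [] for group in quotas}
    for unit in stream:
        g = unit["group"]
        if g in quotas and len(filled[g]) < quotas[g]:
            filled[g].append(unit)
        if all(len(filled[g]) == quotas[g] for g in quotas):
            break   # every quota met; remaining units never get a chance
    return filled

respondents = ([{"group": "female"}] * 250) + ([{"group": "male"}] * 350)
result = quota_sample(respondents, {"female": 200, "male": 300})
# the quotas of 200 females and 300 males are filled exactly
```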
Simple random sampling
In a simple random sample of a given size, all such subsets of the frame are given an
equal probability. Each element of the frame thus has an equal probability of selection: the
frame is not subdivided or partitioned. It is possible that the sample will not be completely
random.
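In Python, a simple random sample of a hypothetical numbered frame can be drawn with the standard library:

```python
import random

# Minimal sketch: random.sample gives every subset of size n from the
# frame an equal chance of selection. The frame itself is hypothetical.
frame = list(range(1, 101))      # frame of 100 numbered elements
rng = random.Random(0)           # fixed seed so the example is repeatable
sample = rng.sample(frame, 10)   # simple random sample of size 10

assert len(sample) == len(set(sample)) == 10   # no element drawn twice
```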
Stratified sampling
Where the population embraces a number of distinct categories, the frame can be organized
by these categories into separate "strata." A sample is then selected from each "stratum"
separately, producing a stratified sample. The two main reasons for using a stratified sampling
design are
1. to ensure that particular groups within a population are adequately represented in the
sample, and
2. to improve efficiency by gaining greater control on the composition of the sample. In
the second case, major gains in efficiency (either lower sample sizes or higher
precision) can be achieved by varying the sampling fraction from stratum to stratum.
The sample size is usually proportional to the relative size of the strata. However, if
variances differ significantly across strata, sample sizes should be made proportional to the
stratum standard deviation. Disproportionate stratification can provide better precision than
proportionate stratification. Typically, strata should be chosen to:
have means which differ substantially from one another
minimize variance within strata and maximize variance between strata.
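A sketch of proportional allocation with invented strata: each stratum's sample size is made proportional to its share of the population, the usual default described above.

```python
import random

# Hypothetical strata sizes; allocation is proportional to stratum size.
strata = {"undergrad": list(range(600)),
          "postgrad": list(range(300)),
          "staff": list(range(100))}
total = sum(len(units) for units in strata.values())   # population of 1000
n = 50                                                  # overall sample size

rng = random.Random(1)
sample = {name: rng.sample(units, round(n * len(units) / total))
          for name, units in strata.items()}

# proportional allocation: 30 undergrads, 15 postgrads, 5 staff
assert [len(sample[s]) for s in ("undergrad", "postgrad", "staff")] == [30, 15, 5]
```

If stratum variances differed markedly, the allocation line would instead weight by the stratum standard deviations, as the text notes.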
Cluster sampling
Sometimes it is cheaper to 'cluster' the sample in some way e.g. by selecting
respondents from certain areas only, or certain time-periods only. (Nearly all samples are in
some sense 'clustered' in time - although this is rarely taken into account in the analysis.)
Cluster sampling is an example of 'two-stage sampling' or 'multistage sampling': in the
first stage a sample of areas is chosen; in the second stage a sample of respondents within
those areas is selected.
This can reduce travel and other administrative costs. It also means that one does not need a
sampling frame for the entire population, but only for the selected clusters. Cluster sampling
generally increases the variability of sample estimates above that of simple random sampling,
depending on how the clusters differ between themselves, as compared with the within-cluster
variation.
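The two stages can be sketched with invented areas and households:

```python
import random

# Two-stage (cluster) sketch with invented data: 10 areas of 20 households
# each. Stage 1 samples areas; stage 2 samples households within them.
rng = random.Random(2)
areas = {f"area_{a}": [f"hh_{a}_{h}" for h in range(20)] for a in range(10)}

stage1 = rng.sample(list(areas), 3)                    # sample of areas
stage2 = {a: rng.sample(areas[a], 5) for a in stage1}  # households within areas

sample = [hh for households in stage2.values() for hh in households]
assert len(sample) == 15    # 3 areas x 5 households each
```

Note that a frame is needed only for the 3 selected areas, not all 200 households, which is the cost advantage described above.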
Random sampling
In random sampling, also known as probability sampling, every combination of items
from the frame, or stratum, has a known probability of occurring, but these probabilities are not
necessarily equal. With any form of sampling there is a risk that the sample may not adequately
represent the population but with random sampling there is a large body of statistical theory
which quantifies the risk and thus enables an appropriate sample size to be chosen.
Furthermore, once the sample has been taken the sampling error associated with the measured
results can be computed. With non-random sampling there is no measure of the associated
sampling error. While such methods may be cheaper this is largely meaningless since there is
no measure of quality. There are several forms of random sampling. For example, in simple
random sampling, each element has an equal probability of being selected. Another form of
random sampling is Bernoulli sampling in which each element has an equal probability of
being selected, like in simple random sampling. However, Bernoulli sampling leads to a
variable sample size, while during simple random sampling the sample size remains constant.
Bernoulli sampling is a special case of Poisson sampling; in Poisson sampling, each element may have a
different probability of being selected.
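The contrast with simple random sampling can be sketched as follows; the frame and the inclusion probability are invented for illustration:

```python
import random

# Sketch of Bernoulli sampling: each element is included independently
# with probability p, so the realized sample size varies from run to run
# (unlike simple random sampling, where n is fixed in advance).
def bernoulli_sample(frame, p, seed):
    rng = random.Random(seed)
    return [x for x in frame if rng.random() < p]

frame = list(range(1000))
sizes = {len(bernoulli_sample(frame, 0.1, seed)) for seed in range(5)}
# the expected size is about 100, but it is not fixed across runs
```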
Mechanical sampling
Mechanical sampling is typically used in sampling solids, liquids and gases, using
devices such as grabs, scoops, thief probes, the COLIWASA and riffle splitter. Care is needed
in ensuring that the sample is representative of the frame.
Convenience sampling
Sometimes called grab or opportunity sampling, this is the method of choosing items
arbitrarily and in an unstructured manner from the frame. Though almost impossible to treat
rigorously, it is the method most commonly employed in many practical situations. In social
science research, snowball sampling is a similar technique, where existing study subjects are
used to recruit more subjects into the sample.
Line-intercept sampling
Line-intercept sampling is a method of sampling elements in a region whereby an
element is sampled if a chosen line segment, called a transect, intersects the element.
Sampling error
What can make a sample unrepresentative of its population? One of the most frequent
causes is sampling error. Sampling error comprises the differences between the sample and the
population that are due solely to the particular units that happen to have been selected.
For example, suppose that a sample of 100 American women is measured and all are
found to be taller than six feet. It is clear, even without any statistical proof, that this would
be a highly unrepresentative sample leading to invalid conclusions. This is a very unlikely
occurrence, because such rare cases are widely distributed among the population, but it can
occur. Fortunately, it is a very obvious error and can be detected easily.
The more dangerous error is the less obvious sampling error, against which nature offers
very little protection. An example would be a sample in which the average height is
overstated by only an inch or two, rather than by a foot, which would be far more obvious. It
is the unobvious error that is of most concern.
There are two basic causes of sampling error. One is chance: the error that occurs just
through bad luck. This may result in untypical choices. Unusual units in a population do exist,
and there is always a possibility that an abnormally large number of them will be chosen. For
example, in one study of the number of trees, a sample of households was selected randomly,
but, strangely enough, the two households in the whole population with the highest numbers
of trees (10,018 and 6,345) were both selected, making the sample average higher than it
should have been. The average with these two extremes removed was 828 trees. The main
protection against this kind of error is to use a large enough sample. The second cause of
sampling error is sampling bias.
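The protective effect of a larger sample can be illustrated with a small simulation; the population is modelled loosely on the tree example above (including the two extreme households), but the other numbers are invented:

```python
import random

# Illustration with invented numbers: a population whose mean is pulled
# up by two extreme units (modelled on the 10,018- and 6,345-tree
# households), sampled at two different sizes.
random.seed(3)
population = [random.randint(100, 1500) for _ in range(498)] + [10018, 6345]

def sample_mean(n, seed):
    rng = random.Random(seed)
    chosen = rng.sample(population, n)
    return sum(chosen) / n

small = [sample_mean(10, i) for i in range(200)]    # many small samples
large = [sample_mean(200, i) for i in range(200)]   # many large samples

def spread(xs):
    return max(xs) - min(xs)

# larger samples fluctuate far less when an extreme unit is drawn
assert spread(large) < spread(small)
```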
Sampling bias is a tendency to favour the selection of units that have particular
characteristics. It is usually the result of a poor sampling plan. The most notable is the bias of
non-response, when for some reason some units have no chance of appearing in the sample.
For example, take a hypothetical case where a survey was conducted by the Cornell Graduate
School to find out the level of stress that graduate students were experiencing. A mail
questionnaire was sent to 100 randomly selected graduate students. Only 52 responded, and
the results suggested that students were not under stress, when in fact it was the time of
highest stress for all students except those who were writing their theses at their own pace.
Apparently, this was the group that had the time to respond. The researcher went back to the
questionnaires to find out what the problem was and found that all those who had responded
were third- and fourth-year PhD students.
Bias can be very costly and has to be guarded against as much as possible. In this case,
$2,000.00 had been spent and there were no reliable results; in addition, it cost the researcher
his job, since his employer felt that a qualified researcher should have known this beforehand
and planned how to avoid it. A means of selecting the units of analysis must be designed to
avoid the more obvious forms of bias. Another example would be where you would like to
know the average income of some community and you decide to use telephone numbers to
select a sample of the total population, in a locality where only the rich and middle-class
households have telephone lines. You will end up with an overstated average income, which
will lead to wrong policy decisions.
CHAPTER VIII
Data Collection
Definition
Data collection is a process of gathering data systematically for a particular purpose
from various sources, including questionnaires, interviews, observation, existing records, and
electronic devices. The process is usually preliminary to statistical analysis of the data.
Kinds of Information
Different kinds of information are required in social research. These can be classified into
the following two types:
1. Primary Data
2. Secondary Data
Primary Data Collection Methods
In primary data collection, the data is collected by the researcher himself using methods
such as interviews and questionnaires. The key point here is that the data collected is unique to
the researcher and his research and, until he publishes it, no one else has access to it. There are
many methods of collecting primary data, and the main methods include:
questionnaires
interviews
focus group interviews
observation
case-studies
diaries
critical incidents
portfolios.
The primary data, which is generated by the above methods, may be qualitative in nature
(usually in the form of words) or quantitative (usually in the form of numbers or where we can
make counts of words used).
Questionnaires
Questionnaires are a popular means of collecting data, but are difficult to design and
often require many rewrites before an acceptable questionnaire is produced.
Advantages
Can be used as a method in its own right or as a basis for interviewing or a telephone
survey.
Can be posted, e-mailed or faxed.
Can cover a large number of people or organizations.
Wide geographic coverage.
Relatively cheap.
No prior arrangements are needed.
Avoids embarrassment on the part of the respondent.
Respondent can consider responses.
Possible anonymity of respondent.
No interviewer bias.
Disadvantages
Design problems.
Questions have to be relatively simple.
Historically low response rate (although inducements may help).
Time delay whilst waiting for responses to be returned.
Require a return deadline.
Several reminders may be required.
Assumes no literacy problems.
No control over who completes it.
Not possible to give assistance if required.
Problems with incomplete questionnaires.
Replies not spontaneous and independent of each other.
Respondent can read all questions beforehand and then decide whether to complete the
questionnaire or not, perhaps because it is too long, too complex, uninteresting, or too
personal.
Coding
If analysis of the results is to be carried out using a statistical package or spreadsheet it
is advisable to code non-numerical responses when designing the questionnaire, rather than
trying to code the responses when they are returned. An example of coding is:
Male [ ]
Female [ ]
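A sketch of how such pre-assigned codes might be applied at analysis time; the codes 1 and 2 are an assumption for illustration:

```python
# Hypothetical pre-assigned codes, decided when the questionnaire was
# designed, applied to the returned responses for statistical analysis.
codes = {"Male": 1, "Female": 2}

responses = ["Female", "Male", "Female"]
coded = [codes[r] for r in responses]
assert coded == [2, 1, 2]
```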
out if the respondent takes an annual holiday, and then secondly find out if they go to
Spain.
Do not ask two questions in one by using 'and'. For example, 'Did you watch television
last night and read a newspaper?'
Avoid double negatives. For example, 'Is it not true that you did not read a newspaper
yesterday?' Respondents may tackle a double negative by switching both negatives and
then assuming that the same answer applies. This is not necessarily valid.
State units required but do not aim for too high a degree of accuracy. For instance, use
an interval rather than an exact figure:
How much did you earn last year?
Less than 10,000 [ ]
10,000 but less than 20,000 [ ]
Avoid emotive or embarrassing words usually connected with race, religion, politics, sex,
money.
Types of questions
Closed questions
A question is asked and then a number of possible answers are provided for the
respondent. The respondent selects the answer which is appropriate. Closed questions are
particularly useful in obtaining factual information:
Sex:
Male [ ] Female [ ]
Yes [ ] No [ ]
Some Yes/No questions have a third category, 'Do not know'. Experience shows that, as long
as this alternative is not mentioned, people will make a choice. Also, the phrase 'Do not know'
is ambiguous:
Do you agree with the introduction of the EMU?
Coach [ ]
Motor bike [ ]
Train [ ]
With such lists the researcher should always include an other category, because not all
possible responses might have been included in the list of answers. Sometimes the respondent
can select more than one from the list. However, this makes analysis difficult.
Why have you visited the historic house? Tick the relevant answer(s). You may tick as
many as you like.
I enjoy visiting historic houses [ ]
Attitude questions
Frequently, questions are asked to find out the respondent's opinions or attitudes to a given
situation. A Likert scale provides a battery of attitude statements. The respondent then says
how much they agree or disagree with each one.
Read the following statements and then indicate by a tick whether you strongly agree, agree,
disagree or strongly disagree with each statement.

                      Strongly agree   Agree   Disagree   Strongly disagree
Good salary                [ ]          [ ]       [ ]            [ ]
Friendly colleagues        [ ]          [ ]       [ ]            [ ]
A semantic differential scale attempts to see how strongly an attitude is held by the
respondent. With these scales double-ended terms are given to the respondents who are asked
to indicate where their attitude lies on the scale between the terms. The response can be
indicated by putting a cross in a particular position or circling a number.
Work is: (circle the appropriate number)

Difficult     1 2 3 4 5 6 7     Easy
Useless       1 2 3 4 5 6 7     Useful
Interesting   1 2 3 4 5 6 7     Boring
For summary and analysis purposes, a score of 1 to 7 may be allocated to the seven
points of the scale, thus quantifying the various degrees of opinion expressed. This procedure
has some disadvantages. It is implicitly assumed that two people with the same strength of
feeling will mark the same point on the scale. This almost certainly will not be the case. When
faced with a semantic differential scale, some people will never, as a matter of principle, use
the two end indicators of 1 and 7. Effectively, therefore, they are using a five-point scale. Also
scoring the scale 1 to 7 assumes that they represent equidistant points on the continuous
spectrum of opinion. This again is probably not true. Nevertheless, within its limitations, the
semantic differential can provide a useful way of measuring and summarizing subjective
opinions.
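Scoring responses on such a 1-to-7 scale can be sketched as follows; the item set and the choice of which items to reverse-score are assumptions for illustration:

```python
# Sketch of scoring the semantic differential above. Items whose
# favourable end carries a low number (e.g. 1 = Interesting) are
# reverse-scored so that a high total always means a favourable attitude.
# The item directions here are an assumption for illustration.
items = {"difficult_easy": {"reverse": False},     # 7 = Easy (favourable)
         "useless_useful": {"reverse": False},     # 7 = Useful (favourable)
         "interesting_boring": {"reverse": True}}  # 1 = Interesting (favourable)

def score(responses):
    total = 0
    for item, raw in responses.items():
        total += (8 - raw) if items[item]["reverse"] else raw
    return total

# a respondent circling 6 (easy), 5 (useful) and 2 (interesting)
# scores 6 + 5 + (8 - 2) = 17
assert score({"difficult_easy": 6, "useless_useful": 5,
              "interesting_boring": 2}) == 17
```

The caveats in the text still apply: this arithmetic assumes equidistant scale points and comparable use of the end-points across respondents.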
Other types of questions to determine people's opinions or attitudes are:
Which one/two words best describes...?
Which of the following statements best describes...?
How much do you agree with the following statement...?
Open questions
An open question such as 'What are the essential skills a manager should possess?'
should be used as an adjunct to the main theme of the questionnaire and could allow the
respondent to elaborate upon an earlier, more specific question. Open questions inserted at the
end of major sections, or at the end of the questionnaire, can act as safety valves, and possibly
offer additional information. However, they should not be used to introduce a section since
there is a high risk of influencing later responses. The main problem of open questions is that
many different answers have to be summarized and possibly coded.
Testing pilot survey
Questionnaire design is fraught with difficulties and problems. A number of rewrites
will be necessary, together with refinement and rethinks on a regular basis. Do not assume that
you will write the questionnaire accurately and perfectly at the first attempt. If it is poorly
designed, the researcher will collect inappropriate or inaccurate data, and good analysis
cannot then rectify the situation.
To refine the questionnaire, the researcher needs to conduct a pilot survey. This is a
small-scale trial prior to the main survey that tests all his question planning. Amendments to
questions can be made. After making some amendments, the new version would be re-tested. If
this re-test produces more changes, another pilot would be undertaken and so on. For example,
perhaps responses to open-ended questions become closed; questions which are all answered
the same way can be omitted; difficult words replaced, etc.
It is usual to pilot the questionnaires personally so that the respondent can be observed
and questioned if necessary. By timing each question, the researcher can identify any questions
that appear too difficult, and he can also obtain a reliable estimate of the anticipated
completion time for inclusion in the covering letter. The result can also be used to test the
coding and analytical procedures to be performed later.
Distribution and return
The questionnaire should be checked for completeness to ensure that all pages are
present and that none is blank or illegible. It is usual to supply a prepaid addressed envelope
for the return of the questionnaire. The researcher needs to explain this in the covering letter
and reinforce it at the end of the questionnaire, after the 'Thank you'.
Finally, many organizations are approached continually for information. Many, as a
matter of course, will not respond in a positive way.
Interviews
Interviewing is a technique that is primarily used to gain an understanding of the
underlying reasons and motivations for people's attitudes, preferences or behaviour. Interviews
Planning an interview:
List the areas in which you require information.
Advantages
Relatively cheap.
Quick.
Can cover reasonably large numbers of people or organisations.
Wide geographic coverage.
High response rate: keep going until the required number is reached.
No waiting.
Spontaneous response.
Help can be given to the respondent.
Can tape answers.
Disadvantages
Often connected with selling.
Questionnaire required.
Not everyone has a telephone.
Repeat calls are inevitable: an average of 2.5 calls is needed to reach someone.
Time is wasted.
Straightforward questions are required.
Respondent has little time to think.
Cannot use visual aids.
Can cause irritation.
Good telephone manner is required.
Question of authority.
Getting started
Locate the respondent
The researcher may not know an individuals name or title so there is the
possibility of interviewing the wrong person.
The researcher can send an advance letter informing the respondent that he
will be telephoning. This can explain the purpose of the research.
The researcher needs to state concisely the purpose of the call, scripted and
similar to the introductory letter of a postal questionnaire.
Respondents will normally listen to this introduction before they decide to cooperate or refuse.
When contact is made, respondents may have questions or raise objections about
why they cannot participate. The researcher should be prepared for these.
Ensuring quality
Quality of questionnaire: follows the principles of questionnaire design. However, it
must be easy to move through, as the researcher cannot have long silences on the
telephone.
Ability of interviewer: follows the principles of face-to-face interviewing.
Smooth implementation
Interview schedule: each interview schedule should have a cover page with number,
name and address. The cover sheet should make provision to record which call it is, the
date and time, the interviewer, the outcome of the call, and space to note down specific
times at which a call-back has been arranged. Space should be provided to record the
final outcome of the call: was an interview refused, was contact never made, was the
number disconnected, etc.
Procedure for call-backs: a system for call-backs needs to be implemented. Interview
schedules should be sorted according to their status: weekday call-back, evening
call-back, weekend call-back, and specific-time call-back.
The table below compares the three common methods of postal, telephone and personal
interview surveys; it might help you to decide which one to use.
Criterion | Postal survey | Telephone survey | Personal interview
Cost (assuming a good response rate) | Often lowest | Usually in-between | Usually highest; may be considerable if a large area is involved
Ability to probe | No personal contact or observation | Some chance for gathering additional data through elaboration on questions, but no personal observation | Greatest opportunity for observation, building rapport, and additional probing
Respondent ability to complete at own convenience | Yes | No | Perhaps, if interview time is prearranged with respondent
Interview bias | No chance | Some | Greatest
Impersonality | Greatest | - | Least
Complex questions | - | - | More suitable
Visual aids | Little opportunity | No opportunity | Greatest opportunity
Potential negative respondent reaction | Junk mail | Junk calls | Invasion of privacy
Interviewer control over interview environment | Least | - | Greatest
Suitable types of questions | Simple, mostly dichotomous (yes/no) and multiple choice | - | -
Requirement for technical skills in conducting interview | Least | Medium | Greatest
Response rate | Low | Usually high | High
flexibility is needed in observation to identify key components of the problem and to develop
hypotheses. The potential for bias is high. Observation findings should be treated as hypotheses
to be tested rather than as conclusive findings.
Disguised or undisguised
In disguised observation, respondents are unaware they are being observed and thus
behave naturally. Disguise is achieved, for example, by hiding, or using hidden equipment or
people disguised as shoppers.
In undisguised observation, respondents are aware they are being observed. There is a
danger of the Hawthorne effect: people behave differently when they know they are being
observed.
Natural or contrived
Natural observation involves observing behaviour as it takes place in the environment,
for example, eating hamburgers in a fast food outlet.
In contrived observation, the respondent's behaviour is observed in an artificial
environment, for example, a food-tasting session.
Personal
In personal observation, a researcher observes actual behaviour as it occurs. The
observer may or may not normally attempt to control or manipulate the phenomenon being
observed. The observer merely records what takes place.
Mechanical
Mechanical devices (video, closed-circuit television) record what is being observed.
These devices may or may not require the respondents' direct participation. They are used for
continuously recording ongoing behaviour.
Non-participant
The observer does not normally question or communicate with the people being
observed. He or she does not participate.
Participant
In participant observation, the researcher becomes, or is, part of the group that is being
investigated. Participant observation has its roots in ethnographic studies (study of man and
races) where researchers would live in tribal villages, attempting to understand the customs and
practices of that culture. It has a very extensive literature, particularly in sociology
(development, nature and laws of human society) and anthropology (physiological and
psychological study of man). Organisations can be viewed as tribes with their own customs
and practices.
The role of the participant observer is not simple. There are different ways of classifying
the role:
Researcher as employee.
Researcher as an explicit role.
Interrupted involvement.
Observation alone.
Researcher as employee
The researcher works within the organization alongside other employees, effectively as
one of them. The role of the researcher may or may not be explicit and this will have
implications for the extent to which he or she will be able to move around and gather
information and perspectives from other sources. This role is appropriate when the researcher
needs to become totally immersed and experience the work or situation at first hand.
There are a number of dilemmas. Do you tell management and the unions? Friendships
may compromise the research. What are the ethics of the process? Can anonymity be
maintained? Skill and competence to undertake the work may be required. The research may
be over a long period of time.
The extent to which the researcher would be comfortable in the role: If the researcher
intends to keep his identity concealed, will he or she also feel able to develop the type
of trusting relationships that are important? What are the ethical issues?
The amount of time the researcher has at his disposal: some methods involve a
considerable amount of time. If time is a problem, alternative approaches will have to
be sought.
Case-studies
The term case-study usually refers to a fairly intensive examination of a single unit
such as a person, a small group of people, or a single company. Case-studies involve
measuring what is there and how it got there. In this sense, it is historical. It can enable the
researcher to explore, unravel and understand problems, issues and relationships. It cannot,
however, allow the researcher to generalize, that is, to argue that from one case-study the
results, findings or theory developed apply to other similar case-studies. The case looked at
may be unique and, therefore not representative of other instances. It is, of course, possible to
look at several case-studies to represent certain features of management that we are interested
in studying. The case-study approach is often done to make practical improvements.
Contributions to general knowledge are incidental.
The case-study method has four steps:
1. Determine the present situation.
2. Gather background information about the past and key variables.
3. Test hypotheses. The background information collected will have been analysed for
possible hypotheses. In this step, specific evidence about each hypothesis can be
gathered. This step aims to eliminate possibilities which conflict with the evidence
collected and to gain confidence for the important hypotheses. The culmination of this
step might be the development of an experimental design to test out more rigorously the
hypotheses developed, or it might be to take action to remedy the problem.
4. Take remedial action. The aim is to check that the hypotheses tested actually work out
in practice. Some action, correction or improvement is made and a re-check carried out
on the situation to see what effect the change has brought about.
The case-study enables rich information to be gathered from which potentially useful
hypotheses can be generated. It can be a time-consuming process. It is also inefficient for
researching situations which are already well structured and where the important variables
have been identified. Case-studies also lack utility when one is attempting to reach rigorous
conclusions or determine precise relationships between variables.
Diaries
A diary is a way of gathering information about the way individuals spend their time on
professional activities. They are not about records of engagements or personal journals of
thought! Diaries can record either quantitative or qualitative data, and in management research
can provide information about work patterns and activities.
Advantages
Useful for collecting information from employees.
Different writers compared and contrasted simultaneously.
Allows the researcher freedom to move from one organization to another.
Researcher not personally involved.
Diaries can be used as a preliminary or basis for intensive interviewing.
Used as an alternative to direct observation or where resources are limited.
Disadvantages
Subjects need to be clear about what they are being asked to do, why and what you plan
to do with the data.
Diarists need to be of a certain educational level.
Some structure is necessary to give the diarist focus, for example, a list of headings.
Encouragement and reassurance are needed as completing a diary is time-consuming
and can be irritating after a while.
Progress needs checking from time-to-time.
Confidentiality is required as content may be critical.
Analysis can be a problem, so you need to consider how responses will be coded before the
subjects start filling in diaries.
Critical incidents
The critical incident technique is an attempt to identify the more noteworthy aspects
of job behaviour and is based on the assumption that jobs are composed of critical and noncritical tasks. For example, a critical task might be defined as one that makes the difference
between success and failure in carrying out important parts of the job. The idea is to collect
reports about what people do that is particularly effective in contributing to good performance.
The incidents are scaled in order of difficulty, frequency and importance to the job as a whole.
The technique scores over the use of diaries as it is centred on specific happenings and
on what is judged as effective behaviour. However, it is laborious and does not lend itself to
objective quantification.
Portfolios
A measure of a manager's ability may be expressed in terms of the number and
duration of issues or problems being tackled at any one time. Compiling a problem portfolio
means recording information about how each problem arose, the methods used to solve it, the
difficulties encountered, and so on. This analysis also raises questions about the person's use of
time. What proportion of time is occupied in checking; in handling problems given by others; on
self-generated problems; on top-priority problems; on minor issues? The main problem with
this method, as with the use of diaries, is getting people to agree to record everything in
sufficient detail for you to analyse. It is very time-consuming!
Sampling
Collecting data is time consuming and expensive, even for relatively small amounts of
data. Hence, it is highly unlikely that a complete population will be investigated. Because of
the time and cost elements the amount of data you collect will be limited and the number of
people or organizations you contact will be small in number. You will, therefore, have to take a
sample and usually a small sample.
Sampling theory says a correctly taken sample of an appropriate size will yield results
that can be applied to the population as a whole. There is a lot in this statement but the two
fundamental questions to ensure generalization are:
1. How is a sample taken correctly?
2. How big should the sample be?
The answer to the second question is: as large as possible given the circumstances. It is like
answering the question, "How long is a piece of string?" It all depends on the circumstances.
Whilst we do not expect you to normally generalize your results and take a large
sample, we do expect that you follow a recognized sampling procedure, such that, if the sample
was increased generalization would be possible. You therefore need to know some of the
basics of sampling. This will be done by reference to the following example.
The theory of sampling is based on random samples where all items in the population
have the same chance of being selected as sample units. Random samples can be drawn in a
number of ways but are usually based on having some information about population members.
This information is usually in the form of an alphabetical list called the sampling frame.
Three types of random sample can be drawn a simple random sample (SRS), a
stratified sample and a systematic sample.
Simple random sampling
Simple random sampling can be carried out in two ways the lottery method and using
random numbers.
36 96 47 36 61
97 74 24 67 62
42 81 14 57 20
16 76 62 27 66
56 50 26 71 07
12 56 85 99 26
96 96 68 27 31
55 59 56 35 64
38 04 80 46 22
Random numbers tend to be written in pairs and in blocks of 5 by 5 to make reading easy.
However, care is needed when reading these tables. The numbers can be read in any direction,
but they should be read as a single string of digits, i.e. left to right as 3, 6, 9, 6, 4, 7, etc., or top
to bottom as 3, 6, 9, 7, 4, 2, etc. It is usual to read left to right.
The random number method involves:
Allocating a number to each person on the list (each number must consist of the same
number of digits so that the tables can be read consistently).
Finding a starting point at random in the tables (close your eyes and point).
The procedure is unbiased, but the sample itself may still be biased. For instance, if the 90
people on the list are a mixture of men and women and all men were selected, this would be
a biased sample.
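The simple random sampling procedure can be sketched in a few lines of Python; the 90-name frame and the function name are illustrative, not from the text:

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw a simple random sample of n units from a sampling frame.

    Every member of the frame has the same chance of selection,
    mirroring the lottery / random-number-table procedure.
    """
    rng = random.Random(seed)  # seeded only so the draw can be repeated
    return rng.sample(sampling_frame, n)

# A hypothetical frame of 90 people, numbered 01 to 90 as in the text.
frame = [f"person_{i:02d}" for i in range(1, 91)]
sample = simple_random_sample(frame, 9, seed=42)
print(len(sample))   # 9 distinct sample units
```

Fixing the seed simply makes the draw reproducible; in a real study the starting point would be chosen at random.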
To overcome this problem a stratified sample can be taken. In this the population is
divided into strata (men and women in this case) and the sample is drawn in proportion to the
size of each stratum. If the 90 people comprise 30 men and 60 women and a sample of 9 is
required, then 30/90 of the sample should be men,
i.e. 30/90 x 9 = 3 men,
and 60/90 of the sample should be women,
i.e. 60/90 x 9 = 6 women.
In a systematic sample the list is divided into equal groups (1 to 9, 10 to 19, etc., up to 80 to
89), a starting point is chosen at random within the first group, and the item in the same
position is then taken from each of the following groups.
Sources for the last two categories are many and varied. If your dissertation requires
these sources you need to conduct a more thorough search of your library and perhaps seek the
assistance of the librarian.
Conclusion
In this unit, we have covered the various methods used to collect data and the various
methods of sampling and assessing the data collection process. There is a lot to absorb and the
researcher needs to make decisions about the best methods to use to collect the data that the
researcher needs to answer his research questions. Each piece of research is very individual and
when he comes to writing up his methodology section he needs to be able to justify and
evaluate the methods he has used. So it is a good idea to have planned this aspect very
carefully!
References
http://brent.tvu.ac.uk/dissguide/hm1u2/hm1u2fra.htm
UNIT V
Data analysis
Introduction
Data analysis is the process of examining and summarizing data with the intent to
extract useful information and develop conclusions. Developments in the field of statistical
data analysis often parallel or follow advancements in other fields to which statistical methods
are fruitfully applied. Because practitioners of statistical analysis often address particular
applied decision problems, the development of methods is motivated by the search for better
decision making under uncertainty.
Statistical Data Analysis
Data are not Information!
To determine what statistical data analysis is, one must first define statistics. Statistics
is a set of methods that are used to collect, analyze, present, and interpret data. Statistical
methods are used in a wide variety of occupations and help people identify, study, and solve
many complex problems. In the business and economic world, these methods enable decision
makers and managers to make informed and better decisions about uncertain situations.
Vast amounts of statistical information are available in today's global and economic
environment because of continual improvements in computer technology. To compete
successfully globally, researchers and decision makers must be able to understand the
information and use it effectively. Statistical data analysis provides hands on experience to
promote the use of statistical thinking and techniques to apply in order to make educated
decisions in the research.
Computers play a very important role in statistical data analysis. The statistical software
package SPSS, which is used in this course, offers extensive data-handling capabilities and
numerous statistical analysis routines that can analyze small to very large data sets. The
computer assists in the summarization of data, but statistical data analysis focuses on the
interpretation of the output to make inferences and predictions.
Studying a problem through the use of statistical data analysis usually involves four basic
steps.
1. Defining the problem
2. Collecting the data
3. Analyzing the data
4. Reporting the results
Defining the Problem
An exact definition of the problem is imperative in order to obtain accurate data about
it. It is extremely difficult to gather data without a clear definition of the problem.
We live and work at a time when data collection and statistical computations have
become easy almost to the point of triviality. Paradoxically, the design of data collection, never
sufficiently emphasized in statistical data analysis textbooks, has been weakened by an
apparent belief that extensive computation can make up for any deficiencies in the design of
data collection. One must start with an emphasis on the importance of defining the population
about which we are seeking to make inferences; all the requirements of sampling and
experimental design must be met.
Designing ways to collect data is an important job in statistical data analysis. Two
important concepts in a statistical study are the population, the set of all the elements of interest
in a study, and the sample, a subset of the population. Statistical inference refers to extending
the knowledge obtained from a random sample to the whole population. This is known in
mathematics as inductive reasoning: knowledge of the whole from a particular. Its main
application is in hypothesis testing about a given population.
The purpose of statistical inference is to obtain information about a population from
information contained in a sample. It is simply not feasible to test the entire population, so a
sample is the only realistic way to obtain data, because of time and cost constraints. Data can
be either quantitative or qualitative. Qualitative data are labels or names used to identify an
attribute of each element. Quantitative data are always numeric and indicate either how much
or how many.
For the purpose of statistical data analysis, distinguishing between cross-sectional and
time series data is important. Cross-sectional data are data collected at the same or
approximately the same point in time. Time series data are data collected over several time
periods.
Data can be collected from existing sources or obtained through observation and
experimental studies designed to obtain new data. In an experimental study, the variable of
interest is identified. Then one or more factors in the study are controlled so that data can be
obtained about how the factors influence the variables. In observational studies, no attempt is
made to control or influence the variables of interest. A survey is perhaps the most common
type of observational study.
Types of Averages
The three most significant types of averages are
1. Arithmetic Mean
2. Median
3. Mode
1. Arithmetic Mean
The arithmetic average is also called the mean. It is the most common and most widely
used measure of central tendency. The arithmetic average of a series is the figure obtained by
dividing the total value of the various items by the number of items.
a) Individual Observation
The simple arithmetic mean (AM) of an individual series is equal to the sum of the values
divided by the number of observations. Let the values of the variable X be X1, X2, ..., Xn and
let X̄ denote the AM of X.
Step 1: Add up all the values of the variable X and find ΣX.
Step 2: Divide ΣX by the number of observations N.
That is, X̄ = (X1 + X2 + ... + Xn) / N = ΣX / N.
Example
Calculate the mean from the following data:
Roll No.:  1   2   3   4   5   6   7   8   9  10
Marks:    40  50  55  78  58  60  73  35  43  48
Solution:
Here, Marks is the variable X, and N = 10.
ΣX = 40 + 50 + 55 + 78 + 58 + 60 + 73 + 35 + 43 + 48 = 540
X̄ = ΣX / N = 540 / 10 = 54 marks.
Interpretation:
The arithmetic mean mark of the given 10 students is 54 marks.
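The calculation can be checked with a minimal Python sketch (the function name is illustrative):

```python
def arithmetic_mean(values):
    """AM of an individual series: sum of the values / number of observations."""
    return sum(values) / len(values)

marks = [40, 50, 55, 78, 58, 60, 73, 35, 43, 48]
print(arithmetic_mean(marks))   # 54.0
```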
b) Discrete Series
In a discrete series, each value x is multiplied by its frequency f, the products are summed
to give Σfx, and the total is divided by N (= Σf), the sum of the frequencies:
X̄ = Σfx / N.
Example
Calculate the mean from the following data:
Value (x):      1   2   3   4   5   6   7   8   9  10
Frequency (f): 21  30  28  40  26  34  40   9  15  57
Solution:
x :  1   2   3   4   5   6   7   8   9  10
f : 21  30  28  40  26  34  40   9  15  57
fx: 21  60  84 160 130 204 280  72 135 570
N = Σf = 300 and Σfx = 1716.
X̄ = Σfx / N = 1716 / 300 = 5.72.
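The discrete-series formula X̄ = Σfx / N translates directly into code; a sketch using the data above:

```python
def mean_discrete(values, freqs):
    """Mean of a discrete frequency distribution: sum(f*x) / sum(f)."""
    N = sum(freqs)
    total = sum(x * f for x, f in zip(values, freqs))
    return total / N

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
f = [21, 30, 28, 40, 26, 34, 40, 9, 15, 57]
print(mean_discrete(x, f))   # 5.72
```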
Continuous Series
In a continuous frequency distribution, the value of each individual item is unknown.
On the assumption that the frequency of each class interval is concentrated at its centre, the
mid point of each class interval is used in place of the individual values.
The following procedure is adopted for calculating the arithmetic mean in a
continuous series.
Step 1: Find the mid value of each group or class. The mid value, denoted by m, is
obtained by adding the lower limit and upper limit of the class and dividing the total by two.
For example, in the class interval 10-20, the mid value is (10 + 20) / 2 = 30 / 2 = 15.
Step 2: Multiply the mid value of each class by the frequency of the class. That is, find fm.
Step 3: Add up all the products to obtain Σfm.
Step 4: Divide Σfm by N. That is, X̄ = Σfm / N.
Example
Find the mean profit from the following data.
Profits per shop (Rs.): 100-200  200-300  300-400  400-500  500-600  600-700  700-800
Number of shops:             10       18       20       26       30       28       18
Solution
Profits (Rs.)   Mid value (m)    f       fm
100-200              150         10     1500
200-300              250         18     4500
300-400              350         20     7000
400-500              450         26    11700
500-600              550         30    16500
600-700              650         28    18200
700-800              750         18    13500
                            Σf = 150   Σfm = 72900
X̄ = Σfm / N = 72900 / 150 = 486.
Interpretation
The average profit is Rs. 486.
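The mid-value method can be expressed as a short sketch. Class limits are assumed to be exact (exclusive-type classes), with no correction applied:

```python
def mean_continuous(class_limits, freqs):
    """Grouped mean: use the mid value m of each class, then sum(f*m) / N."""
    mids = [(lo + hi) / 2 for lo, hi in class_limits]
    N = sum(freqs)
    return sum(f * m for f, m in zip(freqs, mids)) / N

classes = [(100, 200), (200, 300), (300, 400), (400, 500),
           (500, 600), (600, 700), (700, 800)]
shops = [10, 18, 20, 26, 30, 28, 18]
print(mean_continuous(classes, shops))   # 486.0
```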
Merits of Mean
Arithmetic mean is the simplest measurement of central tendency of a series. It is
widely used because
1. It is easy to understand.
2. It is easy to calculate.
3. It is used in further calculation.
4. It is rigidly defined.
5. It is based on the value of every item in the series.
6. It provides a good basis for comparison.
7. Arithmetic mean can be calculated if we know the number of items and aggregate. If
the mean and the number of items are known we can find the aggregate.
8. Its formula is rigidly defined. The mean is the same for the series who ever calculates
it.
9. It can be used for further analysis and algebraic treatment.
10. The mean is a more stable measure of central tendency (the ideal average).
Demerits of arithmetic mean
1. The mean is unduly affected by the extreme items.
2. It is unrealistic
3. It may lead to false conclusion.
4. It cannot be accurately determined if even one of the values is not known.
5. It is not useful for the study of qualities like intelligence, honesty and character.
6. It cannot be located by observation or the graphic method.
7. It gives greater importance to bigger items of a series and lesser importance to smaller
items.
Even though arithmetic mean is subject to various demerits, it is considered to be the best
of all averages.
Median
The median is the value of the item that divides the series into two equal parts, one half
containing values greater than it and the other half containing values less than it. The series
therefore has to be arranged in ascending or descending order before the median is found. In
other words, arraying is necessary to compute the median.
As distinct from the arithmetic mean, which is calculated from the value of every item
in the series, the median is what is called a positional average. The term position refers to the
place of a value in a series.
The definitions of median given by different authors are: The median is that value of
the variable which divides the group into two equal parts, one part comprising all values
greater, and the other, all values less than median.
L.R.Connor.
Median of a series is the value of the item, actual or estimated, when a series is arranged
in order of magnitude, which divides the distribution into two parts. - Secrist
The median is that value which divides a series so that one half or more of the items are equal
to or less than it and one half or more of the items are equal to or greater than it.
-
The median, as its name indicates, is the value of the middle item in a series, when items are
arranged according to magnitude.
Individual series
Step 1. Arrange the data in ascending or descending order.
Step 2. Apply the formula:
Median = Size of the ((N + 1) / 2)th item.
Odd number of observations
Example
The following are the marks scored by 7 students; find the median mark.
Roll numbers:  1   2   3   4   5   6   7
Marks:        45  32  18  57  65  28  58
Solution
Arrange the marks in ascending order: 18, 28, 32, 45, 57, 58, 65.
Median = Size of the ((N + 1) / 2)th item = (7 + 1) / 2 = 4th item = 45 marks.
Even number of observations
Example
The following are the marks scored by 8 students; find the median mark.
Marks: 58  61  42  38  65  72  66  57
Solution
Arrange the marks in ascending order: 38, 42, 57, 58, 61, 65, 66, 72.
Median = Size of the ((N + 1) / 2)th item = (8 + 1) / 2 = 4.5th item
= (4th item + 5th item) / 2 = (58 + 61) / 2 = 59.5 marks.
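Both cases, odd and even N, can be handled by one small function; a sketch reproducing the two examples above:

```python
def median(values):
    """Value of the ((N+1)/2)th item of the ordered series; for an even N
    this falls halfway between the two middle items."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2:                          # odd number of observations
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2   # even: average the middle pair

print(median([45, 32, 18, 57, 65, 28, 58]))       # 45
print(median([58, 61, 42, 38, 65, 72, 66, 57]))   # 59.5
```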
Discrete series
Step 1. Arrange the data in ascending or descending order.
Step 2. Find the cumulative frequencies.
Step 3. Apply the formula:
Median = Size of the ((N + 1) / 2)th item,
locating that item in the cumulative frequency column.
Example
Locate the median from the following:
Size of the shoes:  5   5.5   6   6.5   7   7.5   8
Frequency:         10   16   28   15   30   40   34
Solution
Arrange the data in ascending order.
Size of the shoes (x)   Frequency (f)   Cumulative frequency (cf)
5                            10                 10
5.5                          16                 26
6                            28                 54
6.5                          15                 69
7                            30                 99
7.5                          40                139
8                            34                173
Median = Size of the ((N + 1) / 2)th item = (173 + 1) / 2 = 87th item.
The 87th item falls within the cumulative frequency 99; therefore the median size of
shoes is 7.
Continuous series
Step 1: Find N / 2.
Step 2: Find the class in which the median lies.
Step 3: Apply the formula:
Median = L + ((N/2 - cf) / f) x i, where
L = lower limit of the median class,
cf = cumulative frequency of the class preceding the median class,
f = frequency of the median class, and
i = class interval.
Example
Calculate the median from the following data:
Marks:      10-25  25-40  40-55  55-70  70-85  85-100
Frequency:      6     20     44     26      3       1
Solution
Marks (x)   Frequency (f)   Cumulative frequency (cf)
10-25             6                  6
25-40            20                 26
40-55            44                 70
55-70            26                 96
70-85             3                 99
85-100            1                100
Median item = N / 2 = 100 / 2 = 50.
The median item lies in the 40-55 mark group. Therefore, the median is calculated as
follows.
Median = L + ((N/2 - cf) / f) x i
Here, L = 40; N/2 = 50; cf = 26; f = 44; i = 15.
Median = 40 + ((50 - 26) / 44) x 15
= 40 + 8.18
= 48.18 marks.
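The interpolation formula Median = L + ((N/2 - cf) / f) x i can be sketched as follows. The classes are assumed to be continuous and in ascending order:

```python
def median_grouped(class_limits, freqs):
    """Grouped median: L + ((N/2 - cf) / f) * i for the median class."""
    N = sum(freqs)
    target = N / 2
    cf = 0                                  # cumulative frequency so far
    for (lo, hi), f in zip(class_limits, freqs):
        if cf + f >= target:                # the median class is found
            return lo + (target - cf) / f * (hi - lo)
        cf += f

classes = [(10, 25), (25, 40), (40, 55), (55, 70), (70, 85), (85, 100)]
freqs = [6, 20, 44, 26, 3, 1]
print(round(median_grouped(classes, freqs), 2))   # 48.18
```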
Merits of median
6. Median can be calculated even from qualitative phenomena i.e. honesty, character etc.
7. Median can sometimes be known by simple inspection.
8. Its value generally lies in the distribution.
Demerits of median
1. Typical representative of the observations cannot be computed if the distribution of item is
irregular. For example, 1, 2, 3, 100 and 500, the median is 3.
2. When the number of items is large, the prerequisite process (Arraying the items) is a difficult
process.
3. It ignores the extreme items.
4. In the case of a continuous series, the median is estimated rather than calculated.
5. It is more affected by fluctuations of sampling than the mean.
6. Median is not amenable to further algebraic manipulation.
Mode
The mode is the most common item of a series; it is the value which occurs with the
greatest frequency. According to Croxton and Cowden, "The mode of a distribution is the
value at the point around which the items tend to be most heavily concentrated."
The chief feature of the mode is that it is the size of the item which has the maximum
frequency, and it is also affected by the frequencies of the neighbouring items.
Individual observation
The mode can often be found by mere inspection in the case of individual observations.
The data are arranged in the form of an array so that the value with the highest frequency can
be seen. For example, 11 persons have the following incomes:
Rs. 850, 750, 600, 825, 850, 900, 630, 850, 640, 530, 850
850 repeats four times; therefore the modal income is Rs. 850.
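For individual observations, counting frequencies does the work of inspection; a sketch using the income data above (collections.Counter is a standard-library tally):

```python
from collections import Counter

def mode(values):
    """Return the most frequently occurring value and its frequency."""
    counts = Counter(values)
    value, freq = counts.most_common(1)[0]
    return value, freq

incomes = [850, 750, 600, 825, 850, 900, 630, 850, 640, 530, 850]
print(mode(incomes))   # (850, 4)
```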
Discrete series
We cannot depend on the method of inspection to find out the mode. In such situations,
it is suggested to prepare a grouping table and an analysis table to find out the mode. First
we prepare grouping table and then an analysis table.
Step 1: Prepare a grouping table with 6 columns.
Step 2: Write the size of the item in the margin.
Step 3: In column 1, write the frequencies against the respective items.
Step 4: In column 2, the frequencies are grouped in twos (1 and 2, 3 and 4, 5 and 6, and
so on).
Step 5: In column 3, the frequencies are grouped in twos, leaving the first frequency (2 and
3, 4 and 5, 6 and 7, and so on).
Step 6: In column 4, the frequencies are grouped in threes (1, 2 and 3; 4, 5 and 6; 7, 8 and 9;
and so on).
Step 7: In column 5, the frequencies are grouped in threes, leaving the first frequency (2, 3
and 4; 5, 6 and 7; 8, 9 and 10; and so on).
Step 8: In column 6, the frequencies are grouped in threes, leaving the first two frequencies
(3, 4 and 5; 6, 7 and 8; and so on). In every column, mark the maximum
frequency in bold or with a circle.
Step 9: After the grouping table, an analysis table is prepared to show the exact
size which has the highest frequency.
Example
Calculate the mode from the following:
Size:      10  11  12  13  14  15  16  17  18
Frequency: 10  12  15  19  20   8   4   3   2
Solution
Grouping Table (sums of grouped frequencies; the maximum in each column is marked *):
Column 1 (single frequencies):            10, 12, 15, 19, 20*, 8, 4, 3, 2
Column 2 (twos):                          22, 34*, 28, 7
Column 3 (twos, leaving the first):       27, 39*, 12, 5
Column 4 (threes):                        37, 47*, 9
Column 5 (threes, leaving the first):     46*, 32, 5
Column 6 (threes, leaving the first two): 54*, 15
Analysis Table (sizes contained in the maximum of each column):
Column number   Sizes
1               14
2               12, 13
3               13, 14
4               13, 14, 15
5               11, 12, 13
6               12, 13, 14
Total: size 12 occurs 3 times, size 13 occurs 5 times, size 14 occurs 4 times.
The mode is 13, as this size occurs the maximum number of times (5) in the analysis
table. Through inspection alone it would seem that the mode is 14, because size 14 occurs 20
times; the analysis table reveals that this would be the wrong conclusion.
Continuous series
In a continuous series, to find out the mode, we need one step more than those used for
discrete series. As explained in the discrete series, modal class is determined by preparing
grouping table and analysis tables. Then we apply the following formula:
Z = L + ((f1 - f0) / (2 f1 - f0 - f2)) x i, where
Z = Mode
L = Lower limit of the modal class
f1 = Frequency of the modal class
f0 = Frequency of the class preceding the modal class
f2 = Frequency of the class succeeding the modal class
i = Class interval
Example
Compute the mode from the following data:
Size of item: 0-5  5-10  10-15  15-20  20-25  25-30  30-35  35-40  40-45
Frequency:     20    24     32     28     20     16     34     10      8
Solution
Grouping Table (sums of grouped frequencies):
Column 1 (single frequencies):            20, 24, 32, 28, 20, 16, 34, 10, 8
Column 2 (twos):                          44, 60, 36, 44
Column 3 (twos, leaving the first):       56, 48, 50, 18
Column 4 (threes):                        76, 64, 52
Column 5 (threes, leaving the first):     84, 70
Column 6 (threes, leaving the first two): 80, 60
Analysis Table
From the analysis table it is found that the modal class is 10-15, as this class occurs the
maximum number of times among the column maxima. The mode is estimated by the formula
Z = L + ((f1 - f0) / (2 f1 - f0 - f2)) x i
Here, L = 10; f1 = 32; f0 = 24; f2 = 28; i = 5.
Z = 10 + ((32 - 24) / (2(32) - 24 - 28)) x 5
= 10 + (8 / (64 - 24 - 28)) x 5
= 10 + (8 / 12) x 5
= 10 + 3.33
= 13.33
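The modal-class interpolation formula can be sketched directly. The caller is assumed to have identified the modal class first, e.g. by the grouping method:

```python
def mode_grouped(L, f1, f0, f2, i):
    """Z = L + (f1 - f0) / (2*f1 - f0 - f2) * i for the modal class,
    where f1 is the modal class frequency and f0, f2 are the
    frequencies of the preceding and succeeding classes."""
    return L + (f1 - f0) / (2 * f1 - f0 - f2) * i

# Modal class 10-15 from the example: L = 10, f1 = 32, f0 = 24, f2 = 28, i = 5.
print(round(mode_grouped(10, 32, 24, 28, 5), 2))   # 13.33
```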
Merits
1. It is easy to understand as well as easy to calculate. In certain cases, it can be found out
by inspection.
2. It is usually an actual value as it occurs most frequently in the series.
3. It is not affected by extreme values, as the mean is.
4. It is simple and precise.
5. It is the most representative average.
6. The value of mode can be determined by the graphic method.
7. Its value can be determined in an open end class-interval without ascertaining the class
limits.
Demerits
1. It is not suitable for further mathematical treatment.
2. It may not give weight to extreme items.
3. In a bimodal distribution there are two modal classes and it is difficult to determine the
value of the mode.
4. It is difficult to compute when there are both positive and negative items in a series and
when one or more items are zero.
5. It is stable only when the sample is large.
6. Mode is influenced by magnitude of the class-intervals.
7. It will not give the aggregate value as in average.
Relationship between Mean, Median and Mode
In a symmetrical distribution, the three averages are equal:
Mean = Median = Mode
In an asymmetrical (skewed) distribution,
Mean - Mode = 3 (Mean - Median).
Measures of Dispersion
While the measures of central tendency give us one single figure that represents the
entire data, the measures of dispersion, also known as measures of variation, attempt to
measure the degree of spread, or dispersion, around the central position.
According to Connor, "Dispersion is a measure of the extent to which the individual
items vary." Brooks and Dick define it as the degree of scatter or variation of the variables
about a central value. Spiegel writes that the degree to which numerical data tend to spread
about an average value is called the variation or dispersion of the data.
Objectives of the measures of dispersion
1. It is used to compare two or more series with regard to their variability.
2. It is used to control the variability itself.
3. It is used to judge the reliability of the measures of the central tendency.
4. It is used to facilitate the use of other statistical measures.
The important measures of dispersion are:
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation
Range
The Range is an easy way to indicate Dispersion. It is the difference between the largest
and the smallest value included in a distribution.
For example, the range of the series 51, 57, 31, 35, 95 is equal to
Largest value - Smallest value = 95 - 31 = 64.
Range is very simple to understand, and easy to compute. It is quite useful as a rough
measure of dispersion for a set of limited items.
Limitations
1. It is not rigidly defined.
2. It has no further algebraic properties.
3. It is not stable.
4. It does not take into account the entire distribution.
5. It cannot be computed from frequency distribution with open classes.
Quartile Deviation
Quartile Deviation, also known as Semi-Inter Quartile Range, is a measure of
dispersion based on the difference between the values of the first quartile and third quartile
divided by two. This is represented by the following formula:
Q.D. = (Q3 - Q1) / 2
The First Quartile or the Lower Quartile, denoted by Q1 in the above formula, is the
value which leaves one-fourth of the value below it, and the Third Quartile or the Upper
Quartile, denoted by Q3 in the above formula, is the value which leaves three-fourths of the
value below it.
Example
The following are the marks obtained by 11 students in a test, arranged in ascending
order:
25  28  33  37  42  45  48  51  54  57  60
Q1 = Size of the ((N + 1) / 4)th item = (11 + 1) / 4 = 3rd item = 33.
Q3 = Size of the (3 (N + 1) / 4)th item = 3 x (11 + 1) / 4 = 9th item = 54.
Q.D. = (Q3 - Q1) / 2 = (54 - 33) / 2 = 10.5 marks.
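A sketch of the quartile-deviation calculation; it assumes (N+1)/4 and 3(N+1)/4 fall on whole item positions, as they do for N = 11 (fractional positions would require interpolation):

```python
def quartile_deviation(values):
    """Q.D. = (Q3 - Q1) / 2, taking Q1 as the ((N+1)/4)th item and
    Q3 as the (3(N+1)/4)th item of the ordered series."""
    s = sorted(values)
    n = len(s)
    q1 = s[(n + 1) // 4 - 1]        # 3rd item when N = 11
    q3 = s[3 * (n + 1) // 4 - 1]    # 9th item when N = 11
    return (q3 - q1) / 2

marks = [25, 28, 33, 37, 42, 45, 48, 51, 54, 57, 60]
print(quartile_deviation(marks))   # 10.5
```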
Mean Deviation
Mean deviation is the arithmetic mean of the deviations of the items from either the
mean or the median or the mode, ignoring the sign of the deviation. This is represented by the
following formula:
M.D. = Σ|D| / N, where D is the deviation of each item from the chosen average and N is
the number of items.
Example
Calculate the mean deviation from the mean for the following data:
100  150  200  250  360  490  500  600  671
Solution
Mean X̄ = ΣX / N = 3321 / 9 = 369.
x      |D| = |x - 369|
100         269
150         219
200         169
250         119
360           9
490         121
500         131
600         231
671         302
Σx = 3321   Σ|D| = 1570
Mean Deviation from Mean = Σ|D| / N = 1570 / 9 = 174.44.
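Mean deviation from the mean can be sketched in a few lines (the function name is illustrative):

```python
def mean_deviation(values):
    """M.D. from the mean: sum of absolute deviations / N."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data = [100, 150, 200, 250, 360, 490, 500, 600, 671]
print(round(mean_deviation(data), 2))   # 174.44
```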
Merits
1. It is simple to understand and easy to compute.
Standard Deviation
Standard deviation is the square root of the arithmetic mean of the squared deviations
of the items from their mean. It is denoted by the Greek letter σ (sigma).
Individual Observation
Steps for calculating S.D.:
Step 1: Find the actual mean of the series (X̄).
Step 2: Find the deviation of each value from the mean: x = X - X̄.
Step 3: Square the deviations and take the total of the squared deviations, Σx².
Step 4: Divide the total Σx² by the number of observations N and take the square root:
σ = √(Σx² / N)
Example
Calculate the standard deviation from the following data:
14  22  9  15  20  17  12  11
Solution
Values (X)   x = X - 15   x²
14               -1         1
22                7        49
 9               -6        36
15                0         0
20                5        25
17                2         4
12               -3         9
11               -4        16
ΣX = 120               Σx² = 140
Mean: X̄ = ΣX / N = 120 / 8 = 15
Standard Deviation: σ = √(Σx² / N) = √(140 / 8) = √17.5 = 4.18
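A sketch of the population standard deviation as defined above:

```python
def standard_deviation(values):
    """Population S.D.: square root of the mean of squared
    deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return (sum((x - mean) ** 2 for x in values) / n) ** 0.5

data = [14, 22, 9, 15, 20, 17, 12, 11]
print(round(standard_deviation(data), 2))   # 4.18
```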
Discrete series
Step 1: Calculate the mean of the series.
Step 2: Find the deviation of each item from the mean: d = X - X̄.
Step 3: Square the deviations (d²) and multiply by the respective frequencies (f) to get fd².
Step 4: Sum the products and apply the formula σ = √(Σfd² / N).
Example
Calculate the standard deviation from the following:
Marks:           10  20  30  40  50  60
No. of students:  8  12  20  10   7   3
Solution
Marks (X)    f     fX     d = X - 30.8     d²        fd²
10            8     80       -20.8       432.64    3461.12
20           12    240       -10.8       116.64    1399.68
30           20    600        -0.8         0.64      12.80
40           10    400         9.2        84.64     846.40
50            7    350        19.2       368.64    2580.48
60            3    180        29.2       852.64    2557.92
         N = 60   ΣfX = 1850                  Σfd² = 10858.40
Mean: X̄ = ΣfX / N = 1850 / 60 = 30.8 marks
Standard Deviation: σ = √(Σfd² / N) = √(10858.40 / 60) = 13.45 marks
Continuous Series
In a continuous series, the method of calculating standard deviation is almost the
same as in a discrete series, but the mid values of the class intervals have to be found first.
Using the step-deviation method, the formula is
σ = √( Σfd'² / N - (Σfd' / N)² ) x C, where d' = (m - A) / C,
m is the mid value of a class, A is an assumed mean, and C is the common class interval.
Step 1: Find the mid value m of each class.
Step 2: Take the deviation of each mid value from the assumed mean A.
Step 3: If the class intervals are equal, take a common factor C. Divide each deviation by
the common factor and denote this column by d'.
Step 4: Multiply these deviations d' by the respective frequencies and get Σfd'.
Step 5: Multiply fd' again by d' and sum the products to get Σfd'².
Step 6: Substitute the values in the formula above to get the standard deviation.
Example
Compute the standard deviation from the following data.
Class (x):      0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency (f):     8     12     17     14      9      7      4
Solution
Taking A = 35 and C = 10, so that d' = (m - 35) / 10:
Class    Mid value (m)    f     d'    fd'    fd'²
0-10           5           8    -3    -24     72
10-20         15          12    -2    -24     48
20-30         25          17    -1    -17     17
30-40         35          14     0      0      0
40-50         45           9     1      9      9
50-60         55           7     2     14     28
60-70         65           4     3     12     36
              N = 71         Σfd' = -30   Σfd'² = 210
Standard Deviation:
σ = √( Σfd'² / N - (Σfd' / N)² ) x C
= √( 210 / 71 - (-30 / 71)² ) x 10
= √( 2.957 - 0.1785 ) x 10
= √2.7785 x 10
= 1.667 x 10 = 16.67
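The step-deviation formula can be sketched as follows (class limits are assumed continuous; the assumed mean A and common factor C are supplied by the caller):

```python
def sd_step_deviation(class_limits, freqs, assumed_mean, C):
    """Step-deviation S.D. for a continuous series:
    sigma = sqrt( sum(f*d'^2)/N - (sum(f*d')/N)^2 ) * C,
    where d' = (m - A) / C and m is the class mid value."""
    N = sum(freqs)
    sum_fd = sum_fd2 = 0.0
    for (lo, hi), f in zip(class_limits, freqs):
        d = ((lo + hi) / 2 - assumed_mean) / C   # d' from the mid value
        sum_fd += f * d
        sum_fd2 += f * d * d
    return (sum_fd2 / N - (sum_fd / N) ** 2) ** 0.5 * C

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [8, 12, 17, 14, 9, 7, 4]
print(round(sd_step_deviation(classes, freqs, 35, 10), 2))   # 16.67
```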
Correlation Analysis
Correlation refers to the relationship of two or more variables. We can find some
relationship between two variables; for example, there exists some relationship between the
height of a father and the height of the son, price and demand, wage and price index, yield
and rainfall, height and weight, and so on. Correlation is the statistical analysis which
measures and analyses the degree or extent to which two variables fluctuate with reference to
each other. It expresses the connection between the variables under observation: the
correlation measures the closeness of the relationship between the variables. The coefficient
of correlation is a ratio which expresses the extent to which changes in one variable are
accompanied by changes in the other variable.
Characteristics of Correlation
a) When two series vary together, a change in one series is accompanied by a corresponding
change in the other series.
b) The change may be in the same direction, that is, a positive change, or may be in the
opposite direction.
Types of Correlation
a) Positive and Negative Correlation
b) Simple and Multiple Correlation
c) Linear and Non-Linear Correlation
d) Partial and Total Correlation
Positive Correlation
When the change in one phenomenon results in a change in another phenomenon in the same direction, then the relationship is known as positive correlation.
Negative Correlation
On the other hand, if the change in one phenomenon results in a change in some other phenomenon in the opposite or reverse direction, then the relationship is said to be negative correlation.
Simple Correlation
Simple correlation shows the correlation between two variables only. It may be positive or negative.
Multiple Correlation
Multiple Correlation shows the correlation between more than two variables. It may be
positive or negative.
Linear Correlation
Linear correlation is said to exist when the amount of change in one variable tends to bear a constant ratio to the amount of change in the other variable.
Non-Linear Correlation
In non-linear correlation the ratio of the amount of change in one variable to the amount
of change in another variable is not constant.
Partial and Total Correlation
In partial correlation, the relationship between two variables is examined while the influence of the other relevant variables is held constant, that is, excluded from the calculation. The total correlation is based on all the relevant variables.
Measurement of Coefficient of Correlation
The degree of coefficient of correlation can be measured by the following mathematical
methods:
a) Karl Pearson's Coefficient of Correlation
b) Spearman's Ranking Method
c) Concurrent Deviation Method
d) Product Moment Method
As the Pearson Correlation Coefficient and the Spearman Correlation are the commonly used measures, we will study only these two methods.
Karl Pearson's Coefficient of Correlation
According to this method, the correlation coefficient between any two series of data is computed by dividing the sum of the products of the deviations from the respective arithmetic means by the product of the two standard deviations and the number of pairs.
The following is the formula for the Karl Pearson's Coefficient of Correlation applicable to two sets of variables (say X and Y) normally distributed:

    r = ( NΣXY − ΣX ΣY ) / √[ ( NΣX² − (ΣX)² ) ( NΣY² − (ΣY)² ) ]

where r = correlation coefficient and N = number of pairs of observations.

Example
Calculate the Karl Pearson's coefficient of correlation for the following data:

X:  12   9   8  10  11  13   7
Y:  14   8   6   9  11  12   3
Solution

 X      Y      X²     Y²     XY
12     14     144    196    168
 9      8      81     64     72
 8      6      64     36     48
10      9     100     81     90
11     11     121    121    121
13     12     169    144    156
 7      3      49      9     21
ΣX=70  ΣY=63  ΣX²=728  ΣY²=651  ΣXY=676

Therefore,

r = ( NΣXY − ΣX ΣY ) / √[ ( NΣX² − (ΣX)² ) ( NΣY² − (ΣY)² ) ]
  = ( 7 × 676 − 70 × 63 ) / √[ ( 7 × 728 − 4900 ) ( 7 × 651 − 3969 ) ]
  = ( 4732 − 4410 ) / √[ ( 5096 − 4900 ) ( 4557 − 3969 ) ]
  = 322 / √( 196 × 588 )
  = 322 / 339.48
  = + 0.95
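The same computation can be verified with a short Python sketch (an illustrative sketch, not part of the original text; it reproduces the N-multiplied form of Pearson's formula used in the example):

```python
from math import sqrt

X = [12, 9, 8, 10, 11, 13, 7]
Y = [14, 8, 6, 9, 11, 12, 3]
N = len(X)

# r = (N*SumXY - SumX*SumY) / sqrt((N*SumX2 - (SumX)^2)(N*SumY2 - (SumY)^2))
num = N * sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y)    # 322
den = sqrt((N * sum(x * x for x in X) - sum(X) ** 2) *
           (N * sum(y * y for y in Y) - sum(Y) ** 2))           # 339.48...
r = num / den
print(round(r, 2))   # 0.95
```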
Rank correlation is applicable only to individual observations. The result we get from this method is only an approximate one, because under the ranking method the original values are not taken into account. The formula for Spearman's rank correlation, which is denoted by P, is:

    P = 1 − 6ΣD² / ( N(N² − 1) )   or   P = 1 − 6ΣD² / ( N³ − N )

where
P = rank coefficient of correlation
ΣD² = sum of the squares of the differences between the ranks
N = number of pairs of observations.

Example
The ranks obtained by 10 students in Statistics and in Mathematics are given below. Calculate the rank correlation coefficient.
Solution

Rank of           Rank of            D = x − y    D²
Statistics (x)    Mathematics (y)
  1                  2                 −1           1
  2                  4                 −2           4
  3                  1                  2           4
  4                  5                 −1           1
  5                  3                  2           4
  6                  9                 −3           9
  7                  7                  0           0
  8                 10                 −2           4
  9                  6                  3           9
 10                  8                  2           4
                                             ΣD² = 40
P = 1 − 6ΣD² / ( N(N² − 1) )
  = 1 − ( 6 × 40 ) / ( 10(10² − 1) )
  = 1 − 240 / ( 10 × (100 − 1) )
  = 1 − 240 / 990
  = 1 − 0.24 = + 0.76
Example
A random sample of 5 college students is selected and their grades in Mathematics and
Statistics are found to be:
Student:        1    2    3    4    5
Mathematics:   85   60   73   40   90
Statistics:    93   75   65   50   80
Solution

Marks in          Ranks   Marks in         Ranks   Difference
Mathematics (X)    (x)    Statistics (Y)    (y)    D = x − y     D²
85                  2      93                1        1           1
60                  4      75                3        1           1
73                  3      65                4       −1           1
40                  5      50                5        0           0
90                  1      80                2       −1           1
                                                           ΣD² = 4
P = 1 − 6ΣD² / ( N(N² − 1) )
  = 1 − ( 6 × 4 ) / ( 5(5² − 1) )
  = 1 − 24 / ( 5 × 24 )
  = 1 − 1/5
  = 1 − 0.2
  = + 0.8
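For ranks without ties, the whole calculation can be sketched in Python (an illustrative sketch, not part of the original text; `ranks_desc` is an assumed helper, and the marks are those of the solution table above):

```python
def ranks_desc(values):
    # Rank 1 goes to the highest value; assumes no tied values
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

maths = [85, 60, 73, 40, 90]   # marks in Mathematics
stats = [93, 75, 65, 50, 80]   # marks in Statistics

rx, ry = ranks_desc(maths), ranks_desc(stats)
D2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of D^2 = 4
n = len(maths)
P = 1 - 6 * D2 / (n * (n * n - 1))
print(round(P, 2))   # 0.8
```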
Equal or repeated ranks
When two or more items have equal values, it is difficult to rank them. In that case, the items are given the average of the ranks they would have received if they were not tied. For example, if two individuals are tied for the seventh place, each is given the rank (7 + 8)/2 = 7.5, which is the common rank to be assigned, and the next rank will be 9. If three are ranked equal at the seventh place, they are given the rank (7 + 8 + 9)/3 = 8, the common rank to be assigned to each, and the next rank will be 10 in this case. The formula is

    P = 1 − 6[ ΣD² + (m³ − m)/12 + (m³ − m)/12 + … ] / ( N(N² − 1) )

where m is the number of items whose ranks are tied; the correction term (m³ − m)/12 is added once for each group of tied ranks.
Example
From the following data, calculate the rank correlation coefficient after making adjustment for tied ranks.

X:  48  33  40   9  16  16  65  24  16  57
Y:  13  13  24   6  15   4  20   9   6  19
Solution

 X    Rank (x)    Y    Rank (y)   D = x − y     D²
48       3       13      5.5        −2.5        6.25
33       5       13      5.5        −0.5        0.25
40       4       24      1           3          9.00
 9      10        6      8.5         1.5        2.25
16       8       15      4           4         16.00
16       8        4     10          −2          4.00
65       1       20      2          −1          1.00
24       6        9      7          −1          1.00
16       8        6      8.5        −0.5        0.25
57       2       19      3          −1          1.00
                                          ΣD² = 41

In the X series, 16 occurs three times (m = 3); in the Y series, 13 occurs twice (m = 2) and 6 occurs twice (m = 2).

Therefore,

P = 1 − 6[ ΣD² + (m³ − m)/12 + (m³ − m)/12 + (m³ − m)/12 ] / ( N(N² − 1) )
  = 1 − 6[ 41 + (3³ − 3)/12 + (2³ − 2)/12 + (2³ − 2)/12 ] / ( 10(10² − 1) )
  = 1 − 6[ 41 + 2 + 0.5 + 0.5 ] / 990
  = 1 − 264 / 990
  = 1 − 0.267
  = + 0.733
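The tied-rank adjustment can be sketched in Python (an illustrative sketch, not part of the original text; `avg_rank` and the descending ranking convention are assumptions chosen to match the worked example):

```python
from collections import Counter

def avg_rank(v, order):
    # Tied items share the average of the ranks they would occupy
    i = order.index(v)        # first position of v (0-based)
    m = order.count(v)        # number of items tied with v
    return i + (m + 1) / 2    # average of ranks i+1 .. i+m

X = [48, 33, 40, 9, 16, 16, 65, 24, 16, 57]
Y = [13, 13, 24, 6, 15, 4, 20, 9, 6, 19]
ox, oy = sorted(X, reverse=True), sorted(Y, reverse=True)
rx = [avg_rank(v, ox) for v in X]
ry = [avg_rank(v, oy) for v in Y]

D2 = sum((a - b) ** 2 for a, b in zip(rx, ry))             # 41.0
# One correction term (m^3 - m)/12 per group of tied values
CF = sum((m ** 3 - m) / 12 for m in Counter(X).values())
CF += sum((m ** 3 - m) / 12 for m in Counter(Y).values())  # total 3.0
n = len(X)
P = 1 - 6 * (D2 + CF) / (n * (n * n - 1))
print(round(P, 3))   # 0.733
```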
Merits
1. It is simple to understand and easy to calculate.
2. It is very useful in the case of data which are of qualitative nature, like intelligence,
honesty, beauty, efficiency, etc.
3. When only the ranks are given, this is the only method that can be used.
4. When the actual data are given, this method can also be applied.
Demerits
1. It cannot be used in the case of a bi-variate frequency distribution (grouped data).
2. If the number of items is greater than, say, 30, the calculation becomes tedious and
requires a lot of time. However, if the ranks themselves are given, the method can be
applied even though N exceeds 30.
Regression Analysis
In correlation, two variables are related and are studied in terms of the existing causal
relationship between them. However, we may also be interested in estimating or
predicting the value of one variable given the value of another. For example, if we know that
advertising and sales are correlated, we may find out the expected amount of sales for a given
advertising expenditure, or the required amount of expenditure for attaining a given amount of
sales. Similarly, if we know that the yield of rice and rainfall are closely related, we may find
out the amount of rain required to achieve a certain production figure. The statistical tool with
the help of which we can estimate the unknown values of one variable from the known values
of another is called regression. Thus regression reveals the average relationship between two
variables, and this makes estimation or prediction possible.
Definition
According to Blair, "Regression is the measure of the average relationship between two
or more variables in terms of the original units of the data."
According to Taro Yamane, "One of the most frequently used techniques in economics
and business research, to find a relation between two or more variables that are related causally,
is regression analysis."
According to Wallis and Roberts, "It is often more important to find out what the
relation actually is, in order to estimate or predict one variable (the dependent variable); the
statistical technique appropriate in such a case is called regression analysis."
According to Ya-Lun Chou, "Regression analysis is a statistical device. With the help
of regression analysis, we can estimate or predict the unknown values of one variable from
the known values of another variable." In regression analysis, the independent variable is
also known as the "regressor", "predictor" or "explanatory" variable, and the dependent
variable is known as the "regressed" or "explained" variable.
Uses of Regression Analysis
1. Regression analysis is used in all those fields where two or more related variables
have the tendency to go back to the average. It is used even more than correlation
analysis in many scientific studies, and it is widely used in the social sciences such
as economics as well as in the natural and physical sciences. It is used to estimate
the relation between two economic variables, such as income and expenditure, and
is thus a highly valuable tool in economics and business. Much of economics is based
on cause-and-effect relationships, and regression is very useful for prediction purposes.
In business also, it is very helpful for business predictions; for example, the cost of
production is affected by the volume of production and sales. Economists have derived
many predictions and theories on the basis of regression.
2. Regression analysis predicts the values of dependent variables from the values of
independent variables.
3. The regression line equation helps to estimate the value of the dependent variable
when the values of the independent variables are substituted in the equation.
4. We can calculate the coefficient of correlation (r) and the coefficient of determination
(r²) with the help of the regression coefficients.
5. Through regression analysis, demand curves, supply curves, production functions,
cost functions, consumption functions, etc., can be statistically estimated.
Regression
Regression means "going back"; it is a mathematical measure showing the average
relationship between two variables.
Regression Line
In graphical terms, a regression line is a straight line fitted to the data by the
method of least squares. It indicates the best possible mean value of one variable
corresponding to the mean value of the other. Since a regression line is the line of best fit, it
cannot be used conversely: therefore, there are always two regression lines constructed for
the relationship between two variables, say, X and Y. Thus, one regression line shows
regression of X upon Y, and the other shows the regression of Y upon X.
Regression Equations
The regression equation is an algebraic expression of the regression line. It can be
worked out, through the regression coefficients, for individual observations as well as for
grouped distributions.
Regression Equation of X on Y
The regression equation of X on Y is given by
    X = a + b Y.
To find out the values of a and b, the following equations, called normal equations, can be
used:
    ΣX = Na + bΣY
    ΣXY = aΣY + bΣY²
Regression Equation of Y on X
The regression equation of Y on X is given by
    Y = a + b X.
To find out the values of a and b, the following equations, called normal equations, can
be used:
    ΣY = Na + bΣX
    ΣXY = aΣX + bΣX²
Example
A librarian wants to find a suitable criterion, a convenient measure, to fix the
monthly wages of, let us say, semi-professionals working in the library. He collects data
about seven such semi-professionals as to their present monthly income and their years of
service.

Years of service (X):    11    7    9    5    8    6   10
Income (Y) (in 100s):    10    8    6    5    9    7   11
Solution
Regression Equation of Income (Y) on Years of Service (X):
The regression equation of Y on X is given by
    Y = a + b X.        ...(A)

 X      Y      X²     Y²     XY
11     10     121    100    110
 7      8      49     64     56
 9      6      81     36     54
 5      5      25     25     25
 8      9      64     81     72
 6      7      36     49     42
10     11     100    121    110
ΣX=56  ΣY=56  ΣX²=476  ΣY²=476  ΣXY=469
To find a and b:
The normal equations are
    ΣY = Na + bΣX            ...(1)
    ΣXY = aΣX + bΣX²         ...(2)
Substituting the values,
    56 = 7a + 56b            ...(1)
    469 = 56a + 476b         ...(2)
Multiplying equation (1) by 8 and subtracting it from equation (2),
    21 = 28b
    b = 21/28 = 0.75         ...(3)
Substituting b = 0.75 in equation (1),
    56 = 7a + 56 × 0.75
    56 − 42 = 7a
    14 = 7a
    a = 14/7 = 2
Substituting a = 2 and b = 0.75 in equation (A),
    Y = 2 + 0.75 X           ...(4)
Suppose the librarian wants to find the wage for a semi-professional whose service is 13
years; then X = 13.
Therefore, substituting X = 13 in equation (4), we have
    Y = 2 + 0.75 × 13
    Y = 2 + 9.75
    Y = 11.75
Hence, the wage to be fixed = 11.75 × Rs.100 = Rs.1175.00
Similarly, the librarian can use this method to predict the years of service when he
knows the income of a particular semi-professional, using the regression equation of X on Y.
Further, regression can be used to prepare the budget and to estimate the number of users in
the future.
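The normal-equation solution can be reproduced with the closed-form least-squares formulas in a short Python sketch (an illustrative sketch, not part of the original text; the data are those of the librarian example):

```python
X = [11, 7, 9, 5, 8, 6, 10]   # years of service
Y = [10, 8, 6, 5, 9, 7, 11]   # income (in hundreds of rupees)
n = len(X)

# b and a solve the two normal equations directly
b = (n * sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y)) / \
    (n * sum(x * x for x in X) - sum(X) ** 2)     # 147/196 = 0.75
a = (sum(Y) - b * sum(X)) / n                     # 14/7 = 2.0

print(a, b)          # 2.0 0.75
print(a + b * 13)    # 11.75 -> Rs. 1175 for 13 years of service
```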
CHAPTER X
Report Writing
Research reports are added to the literature of their subject and stored properly for
future timely retrieval. Scientists need to communicate with others so that the research
experience reaches its target users. The aim of a report is the dissemination of knowledge and
the promotion of further research. New findings, new methods or new data are made known
through reports. A report may also help a researcher by throwing new light on a problem,
leading to new hypotheses and new theories.
Target Users
While writing a research report, it is essential to keep in mind the level of knowledge of
the readers. Those who find the results of the particular research most useful are called its
target users. Target users for any research report may be any of the following:
1. Academic community
2. Sponsors of research
3. General public
While writing the report, the level of knowledge, experience and interest of the target
audience should be kept in mind.
Types of Research Reports
Depending upon the users, the research report may be divided into two categories.
These are as follows.
1. Technical Report
2. Popular Report
Technical Report:
It is generally intended for other researchers or for research managers. The report
should enable another researcher to critique the methodology, check the calculations and
accuracy, and follow everything that was done on a step-by-step basis.
Popular Report:
The popular report is intended for a more general audience. Compared to the technical
report, its presentation gives more attention to headlines, flow diagrams, charts, tables and
summaries, for the purpose of stressing major points.
The above format gives the minimum requirements of a research report and does not give
comprehensive coverage to all aspects in research reporting.
Shah gives a comprehensive list of contents for a research report, as shown below:
a. Title page
b. Foreword
c. Table of Contents
d. Introduction
e. Identification, selection and formulation of the problem
f. Research design and data collection
g. Data processing and analysis
h. Findings
i. Summary
j. Appendices
k. Bibliographical references
Of the many prescriptions available for research reporting, Shah's list of contents is the
most comprehensive, though he himself states that it is only a brief indication of the
organization of the various items to be included.
The United Nations Statistical Office has suggested that the research report should
contain the following major topics:
A. Introduction
1. Clear-cut statement as to the nature of the study
2. Aims
3. Sources of information
4. Scope of the study
B. Brief statement of the working hypothesis
C. Definitions of units of study
D. Brief statement of the techniques followed in the study
1. Types of observation used and conditions under which observations were made
2. Types of schedules formulated and conditions under which information was secured
3. Types of case histories secured; their sources, manner of presentation and preliminary
analysis made
The methodology adopted for social investigations should be described elaborately.
Style of Writing:
The basic qualities of good scientific writing are accuracy and clarity. The following
points should be remembered while writing the report:
1. Write clearly. Sentences must be simple; avoid complicated sentences and long
paragraphs. Appropriate subheadings should be provided wherever necessary.
2. Define scientific terms and use them consistently. The target audience and their
knowledge of technical terms should be considered.
3. Adequate attention should be paid to correct spelling and grammar.
4. As far as possible, the present tense should be used.
5. Direct and positive sentences should be used. Long, technical or unusual words or
phrases should be avoided.
6. All chapters, sections, subsections, tables, diagrams and charts should be labelled
adequately.
7. Use footnotes and label them serially.
Construction of Tables
In a research report, it is necessary to arrange and present the data in a certain logical
manner. The collected data must be edited, coded and checked so as to develop arrays of
values, particularly for variables. Tabulation is the art of arranging the data in columns and
rows in such a way as to facilitate comparison of the data and show the relationships between
them. It is a great help in the analysis and interpretation of data. There are two types of
tables:
1. General purpose table
2. Special purpose table
A general purpose table presents a large amount of data in an easily accessible,
convenient form. Repository tables, original tables, primary tables and reference tables
come under this category. A census report is a good example of a general purpose table.
A special purpose table is a secondary table derived from a general purpose table. The
data of a special purpose table may be grouped, averaged, rounded and derived for the purpose
of classification and emphasis. Presentation tables, analytical summary tables, and
interpretive, derivative or secondary tables come under this category.
Format of a Table
Preparing a good statistical table is an art. The following parts must be present in all tables.
1. Table number
2. Title
3. Head note
4. Caption
5. Stubs
6. Body of the table
7. Foot-note
8. Source-note
Table number: A table should always be numbered for identification and reference in
the future. Each column should also be numbered as shown in the illustration.
Title of the table: Each table should be given a suitable title. It must be written on the
top of table. It must describe the contents of the table. It must explain
1. what the data are
2. where the data are
3. time or period of data
4. how the data are classified etc
Head note: It is a statement, given below the title and enclosed in brackets. For example,
the unit of measurement is written as a head-note, such as in millions or in crores.
Captions: These are the headings for the vertical columns. They must be brief and
self-explanatory. They may have main headings and sub-headings.
Stubs: These are the headings or designations for the horizontal rows. Stubs are wider
than columns.
Body of the table: It contains the numerical information. It is the most important part of
the table. The arrangement in the body is generally from left to right in rows and from top to
bottom in columns.
Foot-note: If any explanation or elaboration regarding any item is necessary, foot notes
should be given.
Source-note: It refers to the source from where information has been taken. It is useful to
the reader to check the figures and gather additional information.
STRUCTURE OF A TABLE

                      Table Number
                          Title
                   (Head-note, if any)
+--------------+-----------------------------+-------+
| Stub Heading |           Caption           | Total |
+--------------+-----------------------------+-------+
| Stub entries |            Body             |       |
+--------------+-----------------------------+-------+
|    Total     |                             |       |
+--------------+-----------------------------+-------+
Foot-note:
Source-note:
General Rules for Tabulation
1. The table should be simple and compact. It should not be overloaded with details.
2. The captions and stubs in the table should be arranged in a systematic manner, so
that the important items are easy to read. The arrangement may be alphabetical,
chronological, geographical, conventional, etc.
3. The unit of measurement should be clearly defined and given in the table; for
example, height in metres, weight in kilograms, etc.
4. Figures may be rounded off to avoid unnecessary details in the table, but a foot-note
must be given to this effect.
5. Suitable approximation may be adopted.
6. A miscellaneous column may be added to accommodate unimportant items.
7. A table should be complete and self-explanatory.
8. A table should be attractive, to draw the attention of readers.
9. As it forms the basis for statistical analysis, it should be accurate and free from all
sorts of errors.
The final stage of discussing, interpreting and generalizing the findings follows that of
analyzing the data and reporting it in appropriate tabular and/or pictorial forms. It is not
usual, and perhaps not even desirable, to strictly compartmentalize these three closely related
operations.
Interpretation and generalization of findings, otherwise called conclusions and
recommendations, summarize the data so as to yield answers to the research questions. The
conclusions should be logical and clear; readers should not be left to infer their own
conclusions.
Implications and Suggestions for further Research
Research stimulated by theoretical considerations may lead to new theoretical issues,
which in turn may lead to further research. Keeping this in mind, an attempt should be made
to spell out the implications of the findings and the further research necessary to test them.
Proof Reading
Your manuscript may be neat and perfect, but it is the typed or printed version that counts.
If the final printed report contains serious errors of syntax and semantics, your reputation may
suffer heavy damage. Sometimes a wrong punctuation mark or a missing letter may change the
meaning of a concept: the word "burrow" means a hole in the ground, but if you don't add the
"w" at the end, "burro" means an ass. Hence proof reading is considered very important in the
process of research reporting.
Marking Corrections
Corrections should be marked twice: once at the point where the error occurs, and again
in the margin opposite. It is by the margin marks that a typist/printer identifies the errors.
Some common proof-correction marks are:

#               Leave space
trs.            Transpose
del.            Delete
N.P./F.P./P.    New paragraph / fresh paragraph / paragraph
Stet            Let it stand
L.C.            Lower case
U.C.            Upper case
Rom. fig.       Roman figure
Cap./Caps.      Capital/Capitals (e.g., CHENNAI)
Sp.             Spelling
(.)             Insert full stop
(-)             Insert hyphen
Conclusion
The ultimate aim of any research is to find solutions to problems or to find out the
truth, but communication is the goal of research reporting. A poorly reported research
result may be misinterpreted or may go unnoticed. Hence, reporting should follow, as
well as establish, a definite standard. The grammar of research reporting discussed above
should be able to give proper guidance. The chapters on the various aspects of research in the
previous units supplement and complement this chapter.
161
e. Statistical Survey
f. Questionnaire method
g. Experimental design.
162