Development and Validation of Research Instruments for Cross-Cultural Studies in Economics and Management
BORDEIANU Otilia-Maria
Teaching assistant, Faculty of Economics and Public Administration / Department of Economy, Business
Administration and Tourism, University “Stefan cel Mare” Suceava, Suceava, Romania, otilia@seap.usv.ro
MOROŞAN-DĂNILĂ Lucia
Teaching assistant, Faculty of Economics and Public Administration / Department of Accounting, Finance and
Economic Informatics, University “Stefan cel Mare” Suceava, Suceava, Romania, luciad@seap.usv.ro
Abstract: Beyond those aspects that are easily and necessarily subject to standardization, research
methodology may be regarded by some researchers as an ideal and intangible system that can open the path
towards innovation and discovery, or merely towards theoretical synthesis, practical solutions, and ideas for
adaptation and change. Organizations must learn to use the available methods and tools (created or adapted) to
exploit and add value to information. The purpose of this paper is to identify the specific steps in validating
research instruments in general and to highlight the specific issues raised by validating research instruments in a
cross-cultural context.
Key words: research methodology, research instruments, instrument validity and reliability, cross-cultural
studies
1. Introduction
The need for research in economics and management is more pressing than ever. Both disciplines
seek to find solutions and answers to the immediate problems of organizational practice, to achieve
reasonable theorizing, and to propose generalizations that are viable at least at the current level of
development of organizations.
According to Zaiţ (2006), a research instrument is the technical artifice through which a piece of
work is carried out or a research action is initiated. It usually has an actual physical form and is adapted
so as to get the research or action done and to facilitate its execution. Thus, the instrument is the
materialization of a method (e.g. the questionnaire is the instrument of the survey, the interview guide is
the instrument of the semi-structured interview, the observation guide is the instrument of the observation
method, etc.).
Tables, observation sheets, recording sheets, interview guides, indices, coefficients, elasticities and
measuring scales are all research instruments in economics and management.
For example, a subject of real interest is measuring learning outcomes within an organisation. The
measurement of learning outcomes includes measuring knowledge, attitudes, beliefs and skills (Pedler,
M., et al., 1997). Such measurements obviously involve the creation and validation of research
instruments.
In most studies a questionnaire or an interview form is commonly used. In general, the development
and validation of research instruments requires a systematic approach. These aspects are discussed in the
following sections.
5. pilot testing by administering draft 2 of the instrument, with subsequent revision (resulting in
draft 3 of the instrument);
6. conducting construct validation, with subsequent revision (resulting in draft 4 of the
instrument);
7. reliability testing, with subsequent revision (resulting in the fifth version of the
instrument);
8. assessing whether further revision of the instrument is required, followed by a second pilot test.
After completing step 8, the final version of the instrument should be ready for use in field
research. These 8 steps, together with their sub-stages, are presented in the table below:
Table 1: Steps in the design and validation of a research instrument
Step 4. Conducting content validation, with subsequent revision (draft 2 of the instrument)
Step 5. Pilot testing by administering draft 2 of the instrument, with subsequent revision (draft 3 of the
instrument)
Step 6. Conducting construct validation of draft 3, with subsequent revision (draft 4 of the instrument)
Step 7. Reliability testing of draft 4, with subsequent revision if needed (the fifth version of the
instrument)
Step 8. Revision of the instrument and a second pilot test, if needed
Source: Elliott, T., Regal, R., Elliott, B., Renier, C. – Design and Validation of Instruments to Measure
Knowledge, Fall 2001, ProQuest Central, p. 157
Content experts receive the instrument and auxiliary materials (a letter of intent, an agreement
form and instructions for filling in the form), together with the definition of the research purpose, the
objectives, the conceptual framework, a description of the target group and the specific evaluation
criteria. They are asked to comment on whether the instrument achieves its main goal and its specific
objectives; the relevance of the items to the target group and to the conceptual model is also analyzed.
Content experts are then required to complete a feedback form.
In a typical response form, content experts are asked to evaluate each item on the following rating
scale:
0 – the information is not useful;
1 – the information is useful, but not essential;
2 – the information is useful and essential.
This information is used later, in the development/refining stage of the instrument. Items that
receive low scores are deleted, while those with high scores are kept. Items with intermediate scores
should be reviewed and reassessed by the content experts. Since the content validation phase is rarely
supported by statistical analysis, the number of content experts needed is arbitrary (Elliott, T., et al., 2001).
Content validation is usually inferred from the experts' comments and the evaluation criteria rather than
from statistical tests. A description of the process and written evaluation criteria can provide structure at
this step. Following content validation, the initial instrument often requires revision. This effort yields
draft 2 of the instrument, which may be ready for the next step or may be returned to the content
validation committee.
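As a minimal sketch of how such expert ratings might be aggregated, assuming the 0/1/2 scale above (the cut-off values of 0.5 and 1.5 and the data are illustrative assumptions, not values prescribed by Elliott et al.):

```python
import numpy as np

# ratings[e, i]: rating given by expert e to item i on the 0/1/2 scale above
# (hypothetical data: three content experts rating five items)
ratings = np.array([
    [2, 2, 1, 0, 2],
    [2, 1, 1, 0, 2],
    [2, 2, 0, 1, 2],
])

means = ratings.mean(axis=0)                            # mean rating per item
keep = np.where(means >= 1.5)[0]                        # high scores: retain the item
drop = np.where(means < 0.5)[0]                         # low scores: delete the item
review = np.where((means >= 0.5) & (means < 1.5))[0]    # intermediate: send back to experts

print("keep:", keep, "drop:", drop, "review:", review)
```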
Criterion-related validity of the items within an instrument is examined through what is known as
item analysis. Criterion validity can be divided into concurrent validity and predictive validity. In the
case of concurrent validity, the results of the new instrument are compared with the results of another
instrument completed at the same time. In the case of predictive validity, the instrument is applied in a
prospective study and its results are compared longitudinally with the results of other measures or
assessments. Predictive validity is used very rarely (Elliott, T., et al., 2001).
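For illustration, concurrent validity can be quantified by correlating total scores on the new instrument with total scores on an established instrument completed at the same time; a minimal sketch with hypothetical data:

```python
import numpy as np
from scipy.stats import pearsonr

# total scores of the same respondents on the new instrument and on an
# established reference instrument, completed at the same time (hypothetical)
new_scores = np.array([12, 15, 9, 20, 17, 11, 14, 18])
ref_scores = np.array([10, 16, 8, 19, 18, 10, 13, 17])

r, p = pearsonr(new_scores, ref_scores)
print(f"concurrent validity: r = {r:.2f} (p = {p:.3f})")
```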
Factorial validity, or factor analysis, is used to describe the conceptual structures behind the
instrument. Using the response patterns for the items, factor analysis sorts items into groups that appear
to assess common themes and concepts. Factor analysis can also be used to validate a construct.
Construct validation is most often applied to instruments that measure knowledge, and is explained in
detail below.
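A minimal sketch of this idea, using the FactorAnalysis estimator from scikit-learn; the two-factor choice and the simulated response data are assumptions made only for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# responses: (n_respondents, n_items) matrix of item scores (simulated here)
rng = np.random.default_rng(42)
responses = rng.normal(size=(200, 8))

fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(responses)

# loadings[i, j]: loading of item i on factor j; items that load highly on the
# same factor are candidates for a common theme, concept or subscale
loadings = fa.components_.T
for i, row in enumerate(loadings, start=1):
    print(f"item {i}: loadings {np.round(row, 2)}")
```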
For instruments that measure knowledge, the concept of construct validation is often used.
Construct validation begins with a conceptual framework, or a construct to be measured. This construct
can be expressed through hypotheses indicating, for example, what kind of correlations should be obtained
with other instruments, what respondents should evaluate as important and less important, or what
other findings these results may provide. Convergent validity and discriminant validity are two
types of construct validation. Convergent validity examines the extent to which two instruments that
claim to assess the same subject or content produce similar results, while discriminant validity assesses
the extent to which an instrument is able to distinguish between groups that are expected to obtain
different results. This method is commonly used for validating instruments that measure knowledge,
particularly when no criterion measure is available.
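A sketch of a discriminant-validity check under these assumptions: two groups expected to differ (here, hypothetically, trained versus untrained respondents) are compared with an independent-samples t-test:

```python
from scipy.stats import ttest_ind

# total knowledge scores for two groups expected to differ (hypothetical data)
trained = [18, 20, 17, 19, 21, 16, 18]
untrained = [12, 10, 14, 11, 13, 9, 12]

t, p = ttest_ind(trained, untrained)
# a significant and meaningfully large difference supports discriminant validity
print(f"t = {t:.2f}, p = {p:.4f}")
```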
Construct validation cannot be generalized beyond the sample groups if the groups are not chosen
properly. The best construct-validation studies set hypotheses and test them with statistical tests to
determine whether the results show significant differences. Even when appropriate statistical tests
indicate significant differences, further interpretation is needed to establish whether an observed
difference is actually important. An example of construct validation for an instrument developed to
assess changes in knowledge is demonstrating its ability to differentiate between different types or levels
of students. Depending on the type of formal validation test used and the characteristics of the group
analyzed, preliminary validation may be done in the pilot testing phase or may require the composition of
other testing groups.
In summary, validity refers to the appropriateness and usefulness of an instrument for achieving
its objectives. Content validation verifies whether the items adequately describe the components of the
content area. Construct validation examines how well the instrument succeeds in measuring what it is
supposed to measure. Demonstrating construct validity requires clear definitions of the construct,
followed by hypothesis testing on the groups studied – correlational or criterion-based.
According to Elliott, T. (2001), this process helps to rule out alternative explanations of the
results of the instrument. Following the construct validation phase, the instrument may be revised,
generating a new version – draft 4 of the instrument.
Reliability testing. While validity determines the appropriateness of an instrument and its
veracity, reliability refers to the consistency of the instrument, its stability and repeatability. If an
instrument that assesses knowledge yields the same results in a group of respondents after a short period
of time in which no changes are expected, then the instrument proves to have a high level of reliability.
Reliability testing is concerned with evaluating accuracy, or the extent to which assessment results are
free of unexpected (random) errors. Sources of error include the inherent qualities of the instrument,
variables related to the respondent, administration techniques, errors in scoring and in the interpretation
of results, and other variables. Reliability is often easier to demonstrate than validity, so reliability
testing is performed frequently and is regularly reported. However, validity is more important, because
an invalid instrument may still have very high reliability; consistency alone does not guarantee that an
instrument measures what it should.
Instrument reliability can be determined in several ways and usually involves comparing two
administrations of the instrument on the same respondents. This step is followed by calculating a
correlation coefficient to demonstrate the degree of similarity between the two sets of answers. A high
correlation means there is less chance of error. Reliability coefficients range from 1, which means a
perfect correlation, to 0, which shows no correlation. Reliability coefficients can be determined by
several techniques, such as test–retest, equivalent forms (alternate or parallel forms) or split halves. The
best technique depends on the characteristics of the instrument and on the available resources, including
time and money.
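As one illustration of the split-halves technique mentioned above, a minimal sketch (the odd/even split and the data layout are assumptions): the items are split into two halves, the half-scores are correlated, and the Spearman-Brown formula steps the correlation up to the reliability of the full-length instrument.

```python
import numpy as np
from scipy.stats import pearsonr

def split_half_reliability(scores: np.ndarray) -> float:
    """Split-half reliability with the Spearman-Brown step-up correction.

    scores: (n_respondents, n_items) matrix of item scores.
    """
    odd = scores[:, 0::2].sum(axis=1)    # total over odd-numbered items
    even = scores[:, 1::2].sum(axis=1)   # total over even-numbered items
    r_half, _ = pearsonr(odd, even)      # correlation between the two halves
    return 2 * r_half / (1 + r_half)     # Spearman-Brown correction

# hypothetical data: 6 respondents answering 6 items scored 0-5
scores = np.array([
    [5, 4, 5, 4, 5, 4],
    [2, 1, 2, 2, 1, 2],
    [4, 4, 3, 4, 4, 3],
    [1, 2, 1, 1, 2, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 5, 4, 5, 4, 5],
])
print(f"split-half reliability: {split_half_reliability(scores):.2f}")
```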
Another method of determining reliability is to calculate the correlations between different items
or subscales of the instrument on a single administration. Internal consistency – how well all related
items measure the same theme or concept – is often reported. Although many methods are available, the
most widely used indicator of internal consistency is the Cronbach alpha coefficient. Increasing the
number of items within an instrument or subscale increases internal consistency, and hence reliability.
Shrout and Yager showed a strong increase in reliability up to roughly 10 items in a scale or subscale,
followed by gradually diminishing improvement with the inclusion of additional items. Although longer
instruments improve the results, the response rate decreases when the instrument is very long. Also,
where items are strongly inter-correlated, there is some redundancy in the instrument (Elliott, T., et al.,
2001).
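The Cronbach alpha coefficient itself is straightforward to compute from an item-score matrix; a minimal sketch in plain NumPy (the data layout and the sample values are assumptions):

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (n_respondents, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# hypothetical data: 5 respondents, 4 items on a 1-5 scale
scores = [[4, 5, 4, 4], [2, 2, 3, 2], [5, 5, 5, 4], [3, 3, 2, 3], [4, 4, 4, 5]]
print(f"Cronbach's alpha: {cronbach_alpha(np.array(scores)):.2f}")
```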
Reliability coefficients are designed to measure the ratio between variation in true scores and
variation in observed scores. As reliability increases, the confidence interval around the obtained scores
narrows. The desired reliability coefficient is determined by the purpose of the instrument. Evaluating
the characteristics of individuals requires higher reliability than studies focused on populations. Large
studies can tolerate lower standards of reliability than limited studies – for example, educational
interventions in the community versus classroom activities. Large samples reduce the measurement error
of estimated mean values. Complete information about each sampled person is not needed to draw useful
conclusions about the population; statistical sampling of a sufficient number of people reduces the errors
that occur in estimates of the population's characteristics, as long as the sampling is not subject to biases.
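In classical test theory, the relationships in this paragraph are usually written as follows (a standard formulation, not taken from Elliott et al.):

```latex
r_{xx} = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{observed}}},
\qquad
\mathrm{SEM} = \sigma_x \sqrt{1 - r_{xx}}
```

where $r_{xx}$ is the reliability coefficient, $\sigma_x$ is the standard deviation of the observed scores, and SEM is the standard error of measurement; as $r_{xx}$ approaches 1, SEM shrinks and the confidence interval around an obtained score narrows.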
In conclusion, reliability coefficients provide useful information about an instrument. The most
common reliability tests for instruments that assess knowledge are test–retest, which checks stability,
and internal consistency. However, for further analyses, in particular assessments of knowledge,
instrument reliability is less important than content and construct validity.
In summary, the easier an instrument is to use, the more practical it is. The form of the
instrument, the way the items are reviewed and read, the information investigators need in order to
complete the instrument, and its completion time are important, even critical, issues. Additional
determining factors are the scaling of the instrument, its processing, and how easily the administrator can
analyze the information. Errors and costs can be reduced, and the response rate may increase, if these
operations are kept simple.
Finally, valid, reliable and practical instruments are desirable for measuring the intended results.
A good instrument is supported by an appropriate theoretical model and measures what has to be
measured with reasonable accuracy, low cost and effort, and acceptable response rates.
By following the steps listed above, investigators can build valid and reliable instruments to
measure the results of organizational activities or learning outcomes. As a recommendation, a minimum
of 5 steps (steps 1 to 5) should be followed to create instruments for non-experimental work. For
experimental interventions, all 8 steps should be carried out to generate high-quality instruments.
Instruments lacking rigorous reliability testing have a very limited area of interpretability and can hardly
be generalized.
Table 2: A model for the process of adaptation of research instruments in different cultures
1. Investigation of conceptual and item equivalence: literature review; discussion with experts in the
field and members of the target population.
2. Translation of the original instrument by Translators I and II, working independently: each fluent in
the target language, with a good understanding of the original language.
3. A synthesized translated version, produced by Translator III: fluent in the target language, with a
good understanding of the original language.
4. Back-translation by Back-translators I and II, working independently: each fluent in the original
language, with a good understanding of the target language.
5. A synthesized back-translation version, produced by Back-translator III: fluent in the original
language, with a good understanding of the target language.
6. Review by a committee of experts, pre-testing of the instrument, and revision of the instrument.
7. Investigation of operational equivalence: literature review; discussion with experts in the field and
members of the target population.
8. Main study, with exploratory and confirmatory analysis, resulting in the final instrument.
Source: Gjersing et al. – Cross-cultural adaptation of research instruments: language, setting, time and
statistical considerations, BMC Medical Research Methodology, 2010, p. 2
The first step is to check whether the relationship between the questionnaire and the concept it is
intended to capture is the same in the original context and in the target context (Herdman, M., 1998).
Secondly, it is important to assess whether the items within the instrument are as relevant and
acceptable in the target population as they are in the original population.
Conceptual and item equivalence can be verified or identified through a review of the literature.
Findings from the literature review should be discussed with experts in the field and with members of
the target population (Herdman, M., 1998).
In the next step the instrument should be translated from the original language into the language
of the target population. At least two persons should produce the initial translations independently. The
translators should be fluent in the language of the target population, with a good understanding of the
original language (Beaton, D.E., et al., 2000).
The translated versions should then be synthesized into one version by a third, independent
translator. Next, the synthesized version should be back-translated independently by at least two
different persons. The back-translators should be fluent in the original language, with a good
understanding of the language of the target population (Wang, W., et al., 2006).
Then the synthesized translated version and the synthesized back-translated version should be
reviewed by an expert committee. The expert committee should consist of methodologists, professionals
in the field of research, language professionals, and the translators (forward and back-translators). The
expert committee assesses whether the words and phrases reflect the same ideas or have the same
meaning in both the original and the adapted versions of the questionnaire. This assessment ensures that
the items are correctly translated and are relevant in the new context (Wang, W., et al., 2006).
If there are uncertainties around the meaning of specific words or items, the developer of the
original instrument can be contacted for clarification. It is also suggested to return to the target
population and have experts in the field discuss the various translation proposals (Reichenheim, M.E.,
Moraes, C.L., 2007). The instrument must be adjusted until a consensus is reached.
After completing these steps, the instrument is pre-tested. A number of respondents is chosen for
the pre-test. Respondents are observed in order to assess their understanding of each question, its
acceptability and its emotional impact, identifying items that may be confusing or may cause
misunderstandings. One technique for checking the clarity and meaning of the items is to ask the
respondents to reformulate each item. Reichenheim and Moraes (2007) suggest that interviews should
continue until a high percentage of understanding of each item is reached (e.g. > 90%). Semantic
adjustments should then be made by the research group, based on the evidence from the pilot study
(Beaton, D.E., et al., 2000).
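A minimal sketch of how the > 90% understanding criterion might be tracked across pre-test interviews (the data structure and the sample values are assumptions for illustration):

```python
import numpy as np

# understood[r, i] = True if respondent r correctly reformulated item i
# (hypothetical pre-test results: 5 respondents, 4 items)
understood = np.array([
    [True,  True,  False, True],
    [True,  True,  True,  True],
    [True,  False, True,  True],
    [True,  True,  True,  True],
    [True,  True,  False, True],
])

rates = understood.mean(axis=0)          # share of respondents understanding each item
flagged = np.where(rates < 0.90)[0]      # items below the 90% criterion
print("comprehension rates:", np.round(rates, 2))
print("items needing semantic adjustment:", flagged)
```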
Operational equivalence of the instrument should be evaluated after the semantic adjustment.
Operational equivalence refers to the possibility of using the same questionnaire format, the same
instructions, the same routes of administration and the same methods of measurement in the target
population as were used in the original population. The literature review can provide information about
the use of instruments in the target context. It is also possible to contact experts in the field and members
of the target population to verify whether the format, instructions, method of administration and
measurement methods are appropriate. When consensus is reached regarding operational equivalence,
the methods are incorporated in the study (Herdman, M., 1998).
Finally, the instrument is administered to participants in a formal study. Based on the results
obtained in this study, the properties of the instrument should be tested using statistical methods (Gjersing
et al., 2010).
5. Conclusions
Interest in cross-cultural and comparative studies conducted in different organizations may
increase in the coming years, and in many parts of the world interest in the results of such studies is
likely to grow as well. These developments will have greater effect once the theoretical and practical
utility of such studies is broadly recognized.
Investigators and researchers are encouraged to report the methods used and the results of the
validation and reliability testing process when they make public the results of their interventions. The
role of the instrument determines what kind of evidence is required: validation, reliability, or both. Since
instruments that are both reliable and valid are the most desirable, it is necessary to balance the two
attributes. As shown above, there are many types of validity and reliability testing, and practical aspects
are equally important. For detailed information it is necessary to go deeper into the literature on the
development and validation of an instrument.
Cross-cultural research difficulties caused by a large number of methodological problems have
been described by several authors (Berry, 1980, 1992; Brislin, Lonner and Thorndike, 1973; Elder, 1976;
Irvine and Carroll, 1986; Lonner and Berry, 1986; Przeworski and Teune, 1970; Roberts, 1970).
To better understand the influence of "culture" within organizations, further research is needed
(Koopman, P., 1999). Unfortunately, this research area has been hampered by a fragmented approach,
numerous methodological flaws and a limited unification of research results. Comparative cross-cultural
research and other studies have been heavily criticized on these matters, and not without justification
(Berry et al., 1992; Roberts, 1970; Roberts and Boyacigiller, 1984).
6. References
Beaton, D.E., Bombardier, C., Guillemin, F., Ferraz, M.B. (2000) – Guidelines for the process of
cross-cultural adaptation of self-report measures, Spine, p. 3186
Drenth, P.J.D., Thierry, H., De Wolff, C.J. (1998) – A Handbook of Work and Organizational
Psychology, Psychology Press Ltd., p. 136
Elliott, T., Regal, R., Elliott, B., Renier, C. (2001) – Design and Validation of Instruments to
Measure Knowledge, ProQuest Central, p. 159
Gjersing, L., et al. (2010) – Cross-cultural adaptation of research instruments: language, setting,
time and statistical considerations, BMC Medical Research Methodology, p. 1
Herdman, M., Fox-Rushby, J., Badia, X. (1998) – A model of equivalence in the cultural
adaptation of HRQoL instruments: the universalist approach, Qual Life Res, p. 323
Koopman, P., et al. (1999) – Organizational culture: the focus questionnaire, European Journal
of Work and Organizational Psychology, no. 4/1999
Pedler, M., Burgoyne, J., Boydell, T. (1997) – The Learning Company: A Strategy for
Sustainable Development, 2nd ed., McGraw-Hill, London
Reichenheim, M.E., Moraes, C.L. (2007) – Operationalizing the cross-cultural adaptation of
epidemiological measurement instruments, Rev Saude Publica, p. 667
Wang, W., Lee, H., Fetzer, S.J. (2006) – Challenges and strategies of instrument translation,
West J Nurs Res, p. 310
Zaiţ, D., Spalanzani, A. (2006) – Cercetare în economie şi management. Repere epistemologice
şi metodologice, Ed. Economică, Bucureşti, pp. 11-15