Scale Construction
Outline
psychological assessment
criteria for valid assessment
scale development
phases of scale development
Psychological Measurement
systematic measurement of a person's behavior
different strategies to assess target persons
inferences (and clinical judgments) drawn from the results
I. Assessment Methods
direct observation
psychophysiological measurement
questionnaires
self- or other-reports on personality / disorders / emotional states / mood / well-being
single-shot studies
repeated measures
diary methods
ambulatory assessment
ISSAS 2009 - scale construction 4
Construct Definition
scope of the scale (level of abstraction)
what is to be measured?
broad or narrow construct?
well-being vs. specific emotion
Construct Definition
aspects of construct
continuum vs. categories
frequency / intensity of experiencing emotions (joy, anger)
scores on items can be combined using (weighted) means
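As a minimal sketch of combining item scores with a (weighted) mean: the function name `scale_score`, the three responses, and the weights are hypothetical and only illustrate the arithmetic.

```python
def scale_score(responses, weights=None):
    """Combine item responses into one scale score via a (weighted) mean."""
    if weights is None:
        # Unweighted mean: every item counts equally.
        weights = [1.0] * len(responses)
    if len(weights) != len(responses):
        raise ValueError("need exactly one weight per item")
    weighted_sum = sum(w * r for w, r in zip(weights, responses))
    return weighted_sum / sum(weights)

# Hypothetical responses of one person to three emotion items (1-5 scale).
joy_items = [4, 5, 3]
unweighted = scale_score(joy_items)
# Hypothetical weights that count the second item twice.
weighted = scale_score(joy_items, weights=[1, 2, 1])
```

With equal weights the score is the plain item mean; unequal weights only make sense if there is a substantive reason to let some items count more.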
Construct Definition
unidimensional vs. multidimensional
frequency of experiencing specific positive emotions
love, affection, intimacy, security
joy, happiness, cheerfulness, contentment
facets of a construct
positive affect may comprise
love, happiness, pride
Recommendation
write out a brief, formal description of the construct
relate it to other constructs
search literature for information about dimensionality / facets
may also help to avoid known problems with respect to unclear instructions, problematic response formats, etc.
Design Scale
item writing
operationalization of construct
sample systematically all content that is potentially relevant to the construct
one may always discard items...
Design Scale
the initial item pool
is broader than one's own theoretical view about the construct
includes multiple items for each potential facet / dimension
also includes items that will finally prove to be distinct from or tangential to the construct
search for aspects in literature (scientific, but also fiction, dictionaries, etc.)
ask friends and family
Design Scale
basic principles of item writing
simple, straightforward, and appropriate language
adequate to reading level no trendy expressions / colloquialisms
Design Scale
basic principles of item writing
one aspect at a time
"I feel happy and beloved"
"I do not insult people because it is morally wrong"
Design Scale
basic principles of item writing
avoid frequencies in item wording
Sometimes, I am happy
Semantic Opposites?
vs.
Semantic Opposites?
implies
Design Scale
response format
depends on introduction
"How often do you feel ...?"
frequency format (never, always)
Design Scale
response format
avoid middle category to increase variability
not always desirable
avoid too many categories; respondents cannot differentiate adequately (analog scale?)
avoid checklists
response bias
avoid forced-choice formats
they indicate the relative strength of alternatives and cannot be compared across individuals (no normative interindividual information)
Design Scale
standard metrical analysis techniques work well with five or more categories
However, labels from 1 to 5 do not guarantee a metric scale!
Pilot Test
small sample
critique scale
ambiguous or confusing items
mismatch of items and scale
Norms:
what are the properties of the distribution of scores for a given population?
Statistics
Before you start:
items recoded?
save recoded items using a new name (mmq01 → mmq01r)
look at distributions!
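The two preparation steps can be sketched as follows. The 1-5 response scale, the helper `reverse_code`, and the four data rows are assumptions for illustration; only the item names mmq01 and mmq01r come from the slide.

```python
from collections import Counter

def reverse_code(value, low=1, high=5):
    """Reverse-score a response on a scale running from low to high."""
    return low + high - value

# Hypothetical raw data: one dict per respondent.
data = [{"mmq01": 1}, {"mmq01": 4}, {"mmq01": 5}, {"mmq01": 4}]

for row in data:
    # Keep the original item and store the recoded version under a new name.
    row["mmq01r"] = reverse_code(row["mmq01"])

# Frequency table of the recoded item as a first look at its distribution.
frequencies = Counter(row["mmq01r"] for row in data)
```

Keeping the original column makes it possible to verify the recoding later and avoids recoding the same item twice by accident.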
[Figure: histograms (frequency) of the items Wut (anger) and Groll (resentment)]
Statistics
first simple analysis: item correlations with criterion
dichotomous with interval / dichotomous: Pearson
interval with interval: Pearson
dichotomous / ordinal with ordinal: Wilcoxon
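A minimal sketch of the Pearson cases, using only the standard library. The data and variable names are hypothetical. Pearson's r computed with a 0/1 variable is the point-biserial correlation, which is why the same function covers the dichotomous case.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical pilot data for five respondents.
item = [1, 2, 3, 4, 5]        # interval-scaled item
criterion = [2, 4, 5, 4, 5]   # interval-scaled criterion
diagnosis = [0, 0, 0, 1, 1]   # dichotomous criterion coded 0/1

r_interval = pearson(item, criterion)
# With a 0/1 variable, Pearson's r is the point-biserial correlation.
r_dichotomous = pearson(item, diagnosis)
```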
Internal Structure
interitem correlations
factor analysis for metrical outcomes
standard factor analysis:
exploratory factor analysis
confirmatory factor analysis
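Before any factor analysis, the interitem correlation matrix can be inspected directly. The sketch below assumes three hypothetical items answered by five respondents; item names and scores are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def interitem_correlations(items):
    """Return the full correlation matrix as a dict keyed by item-name pairs."""
    names = list(items)
    return {(a, b): pearson(items[a], items[b]) for a in names for b in names}

# Hypothetical item scores (one list per item, same respondents in each list).
items = {
    "joy": [4, 5, 3, 2, 4],
    "happiness": [4, 4, 3, 1, 5],
    "love": [2, 5, 4, 3, 3],
}
r = interitem_correlations(items)
```

Items that correlate near zero with all others are candidates for a separate facet or for removal; this is a heuristic first look, not a substitute for the factor analysis.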
Interitem Correlations
or
$Y_i = \alpha_i + \lambda_i \eta_1 + \varepsilon_i$
$Y_j = \alpha_j + \lambda_j \eta_2 + \varepsilon_j$
Scree Plot
Factor Loadings
(Sub-) Dimensions
Facets
simple example
[Figure: path diagram with two facet factors. LOVE: affection, love, intimacy, security. JOY: joy, fortune, happiness, contentment.]
Internal Consistency
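estimation: Cronbach's alpha (should be > .70)
$\alpha = \frac{k}{k-1} \cdot \frac{\sigma_T^2 - \sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}$
k: number of items; $\sigma_T^2$: variance of the sum score; $\sigma_i^2$: variance of item i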
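Cronbach's alpha can be computed from the item variances and the variance of the sum score. The sketch below uses the standard formula with sample variances; the three items and five respondents are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's scores
           for the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sum score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))

# Hypothetical scores of five respondents on three items.
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
alpha = cronbach_alpha(items)
meets_rule_of_thumb = alpha > 0.70
```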
Reliability of items
CTT (Classical Test Theory):
$Y_i = \alpha_i + \lambda_i \eta + \varepsilon_i$
reliability:
$Rel(Y_i) = \frac{\lambda_i^2 \, Var(\eta)}{Var(Y_i)}$
Validity
item pool built in such a way that it is content valid
reliability established
criterion validity
concurrent: correlation with another measure that shall be predicted
predictive: prediction (e.g., via regression) of a later event (test score)
construct validity
Validity
construct validity
Campbell & Fiske (1959): Multitrait-Multimethod (MTMM) Matrix
convergent validity
high associations between different measures of the same construct
discriminant validity
lower associations between measures of different constructs
trait-method-unit (TMU)
scores of a measure depend on underlying trait (construct) but also on method
Methods?
different raters
self-report, friend, acquaintance
teacher, pupils
different tests
BDI, Hamilton Scale
Multitrait-Multimethod Matrix
heuristic inspection (Campbell & Fiske, 1959)
discriminant validity
correlations beside the validity diagonal smaller than on the validity diagonal?
monotrait-heteromethod correlations (MHC) should be higher than heterotrait correlations of the same variables
correlations of different traits should be similar across mono- and heteromethod blocks (pattern of associations between traits should be similar across methods)
method effects
degree to which monomethod correlations are higher than heteromethod correlations
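The heuristic inspection can be sketched by sorting the correlations of an MTMM matrix into blocks and comparing block means. All correlations below are hypothetical (two traits T1, T2 measured with two methods M1, M2), and comparing means is a simplification of the pairwise comparisons described by Campbell and Fiske.

```python
from statistics import mean

# Hypothetical correlations between trait-method units, keyed by (trait, method) pairs.
correlations = {
    (("T1", "M1"), ("T1", "M2")): 0.60,
    (("T2", "M1"), ("T2", "M2")): 0.50,
    (("T1", "M1"), ("T2", "M1")): 0.30,
    (("T1", "M2"), ("T2", "M2")): 0.40,
    (("T1", "M1"), ("T2", "M2")): 0.20,
    (("T2", "M1"), ("T1", "M2")): 0.10,
}

def block_type(unit_a, unit_b):
    """Classify a correlation by whether trait and method are shared."""
    same_trait = unit_a[0] == unit_b[0]
    same_method = unit_a[1] == unit_b[1]
    if same_trait and same_method:
        raise ValueError("a unit cannot be correlated with itself")
    if same_trait:
        return "monotrait-heteromethod"   # validity diagonal
    if same_method:
        return "heterotrait-monomethod"
    return "heterotrait-heteromethod"

def block_means(correlations):
    blocks = {}
    for (unit_a, unit_b), r in correlations.items():
        blocks.setdefault(block_type(unit_a, unit_b), []).append(r)
    return {name: mean(values) for name, values in blocks.items()}

summary = block_means(correlations)
# Validity diagonal should exceed the heterotrait correlations.
validity_diagonal_highest = summary["monotrait-heteromethod"] > max(
    summary["heterotrait-monomethod"], summary["heterotrait-heteromethod"]
)
# Method effect: how much higher monomethod correlations are than heteromethod ones.
method_effect = summary["heterotrait-monomethod"] - summary["heterotrait-heteromethod"]
```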
factor analysis
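$Y_1 = \alpha_1 + \lambda_1 \eta + \varepsilon_1$
$Y_2 = \alpha_2 + \lambda_2 \eta + \varepsilon_2$
observed variables: $Y_1$, $Y_2$
$\eta$: latent variable (common factor)
$\varepsilon_1$, $\varepsilon_2$: residual variables [measurement error, uniqueness, unique factors]
loadings $\lambda_i$; intercepts $\alpha_i$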
factor analysis
reliability of an indicator:
$Rel(Y_i) = \frac{\lambda_i^2 \, Var(\eta)}{Var(Y_i)}$
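A worked instance of this reliability coefficient, as a sketch: it assumes that factor and residual are uncorrelated, so that the observed variance is the sum of the factor part and the residual variance. The parameter values are hypothetical.

```python
def indicator_reliability(loading, factor_variance, residual_variance):
    """Share of an indicator's variance that is explained by the common factor.

    Assumes Var(Y) = loading**2 * Var(factor) + Var(residual),
    i.e. factor and residual are uncorrelated.
    """
    explained = loading ** 2 * factor_variance
    return explained / (explained + residual_variance)

# Hypothetical estimates for one indicator.
rel = indicator_reliability(loading=0.8, factor_variance=1.0, residual_variance=0.36)
```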
2 factor model
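[Figure: path diagram of the two-factor model. Factors F1 (Extraversion) and F2 (Repair) with variances Var(F1) and Var(F2) and covariance Cov(F1, F2); indicators E1 and E2 load on F1, indicators R1 and R2 load on F2; residual variances $Var(\varepsilon_1)$ to $Var(\varepsilon_4)$.]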
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab \, Cov(X, Y)$
$Cov(a_1 X_1 + b_1 Y_1, a_2 X_2 + b_2 Y_2) = a_1 a_2 Cov(X_1, X_2) + a_1 b_2 Cov(X_1, Y_2) + b_1 a_2 Cov(Y_1, X_2) + b_1 b_2 Cov(Y_1, Y_2)$
X, Y: variables; a, b: constants
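The variance rule can be checked numerically. The sketch below uses population-style moments (dividing by n) on hypothetical data; the identity holds for any data and any constants.

```python
def cov(x, y):
    """Covariance of two equally long lists (dividing by n)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

def var(x):
    """Variance as the covariance of a variable with itself."""
    return cov(x, x)

# Hypothetical data and constants.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 2.0]
a, b = 2.0, -3.0

# Left-hand side: variance of the linear combination computed directly.
lhs = var([a * xi + b * yi for xi, yi in zip(x, y)])
# Right-hand side: variance rule from the slide.
rhs = a ** 2 * var(x) + b ** 2 * var(y) + 2 * a * b * cov(x, y)
```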
2 factor model
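sample covariances:
          EXTRA1  EXTRA2  REPAIR1  REPAIR2
EXTRA1    0.707
EXTRA2    0.434   0.584
REPAIR1   0.073   0.073   0.359
REPAIR2   0.113   0.114   0.256    0.320
model-implied variances and covariances:
$Var(E_1) = \lambda_{E1}^2 Var(F_1) + Var(\varepsilon_1)$
$Cov(E_2, E_1) = \lambda_{E2} \lambda_{E1} Var(F_1)$
$Var(E_2) = \lambda_{E2}^2 Var(F_1) + Var(\varepsilon_2)$
$Cov(R_1, E_1) = \lambda_{R1} \lambda_{E1} Cov(F_1, F_2)$
$Cov(R_1, E_2) = \lambda_{R1} \lambda_{E2} Cov(F_1, F_2)$
$Var(R_1) = \lambda_{R1}^2 Var(F_2) + Var(\varepsilon_3)$
$Cov(R_2, E_1) = \lambda_{R2} \lambda_{E1} Cov(F_1, F_2)$
$Cov(R_2, E_2) = \lambda_{R2} \lambda_{E2} Cov(F_1, F_2)$
$Cov(R_2, R_1) = \lambda_{R2} \lambda_{R1} Var(F_2)$
$Var(R_2) = \lambda_{R2}^2 Var(F_2) + Var(\varepsilon_4)$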
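The model-implied covariance matrix follows from loadings, factor (co)variances, and residual variances. The sketch below assumes uncorrelated residuals; all parameter values are hypothetical and are not estimates from the sample covariances shown on the slide.

```python
def implied_covariances(loadings, factor_cov, residual_var):
    """Model-implied covariance matrix of a factor model.

    loadings:     p x m list of lists (indicator by factor)
    factor_cov:   m x m covariance matrix of the factors
    residual_var: list of p residual variances (residuals uncorrelated)
    """
    p, m = len(loadings), len(factor_cov)
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            value = sum(
                loadings[i][a] * factor_cov[a][b] * loadings[j][b]
                for a in range(m)
                for b in range(m)
            )
            if i == j:
                # Residual variance only enters the diagonal.
                value += residual_var[i]
            sigma[i][j] = value
    return sigma

# Hypothetical parameters; rows are E1, E2 (on F1) and R1, R2 (on F2).
loadings = [[1.0, 0.0], [0.8, 0.0], [0.0, 1.0], [0.0, 0.9]]
factor_cov = [[0.5, 0.1], [0.1, 0.3]]
residual_var = [0.2, 0.25, 0.06, 0.08]

sigma = implied_covariances(loadings, factor_cov, residual_var)
```

Estimation then amounts to choosing the parameters so that this implied matrix is as close as possible to the sample covariance matrix.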
Implications
systematic variance in observed variables only caused by traits
correlations between latent variables = discriminant validity
reliability = consistency (convergent validity)
residuals consist of measurement error and method effects (specific to TMU)
Implications
systematic variance in observed variables only caused by traits
correlations between latent variables = discriminant validity
reliability = consistency (convergent validity)
residuals consist of measurement error and method effects (specific to TMU)
residuals belonging to one method may correlate (method effects generalize)
no separation of effects due to method and measurement error
consistency:
$CON(Y_{jk}) = \frac{\lambda_{Tjk}^2 Var(T_k)}{\lambda_{Tjk}^2 Var(T_k) + \lambda_{Mjk}^2 Var(M_j)}$
method specificity:
$MS(Y_{jk}) = \frac{\lambda_{Mjk}^2 Var(M_j)}{\lambda_{Tjk}^2 Var(T_k) + \lambda_{Mjk}^2 Var(M_j)}$
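Both coefficients share the same denominator, so they sum to 1. A small sketch with hypothetical loadings and variances (the function name is illustrative):

```python
def consistency_and_method_specificity(trait_loading, trait_variance,
                                       method_loading, method_variance):
    """Split the systematic variance of an indicator into trait and method shares."""
    trait_part = trait_loading ** 2 * trait_variance
    method_part = method_loading ** 2 * method_variance
    systematic = trait_part + method_part
    return trait_part / systematic, method_part / systematic

# Hypothetical parameter estimates for one trait-method unit.
con, ms = consistency_and_method_specificity(
    trait_loading=0.8, trait_variance=1.0,
    method_loading=0.6, method_variance=1.0,
)
```

A high consistency relative to method specificity indicates that the methods converge on the trait.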
Implications
systematic variance in observed variables caused by traits and method
each and every method deviates from the trait (no best method)
correlations between latent variables = discriminant validity
reliability = consistency (convergent validity) + method specificity
residuals consist of measurement error
method effects generalize across traits
consistency:
$CON(Y_{jk}) = \frac{\lambda_{Tjk}^2 Var(T_k)}{\lambda_{Tjk}^2 Var(T_k) + \lambda_{Mjk}^2 Var(M_j)}$
method specificity:
$MS(Y_{jk}) = \frac{\lambda_{Mjk}^2 Var(M_j)}{\lambda_{Tjk}^2 Var(T_k) + \lambda_{Mjk}^2 Var(M_j)}$
Implications
one method is gold standard (reference method)
contrast of other methods against reference
trait factor is true score of reference method (does not change if methods are added to the model; comprises trait and method effects of reference method)
method effects are residuals in a latent regression
variance components of trait and method effects estimable (consistency and method specificity)
TM-Model
latent variables represent TMU
separation of systematic and random influences
latent variables consist of trait and method influences
basis for further analysis
Interchangeable Raters
ANOVA:
mean is expected value in factor level
deviations from mean are independent
[Figure: panels for Method A (Met. A) and Method B (Met. B)]
CTUM
[Figure: path diagram of the CTUM model with indicators Y, trait factors T, method factors M, and residuals E; Rater 1 and Rater 2 are shown for each of Trait 1 to Trait 3.]
CTC(M-1)
[Figure: path diagram of the CTC(M-1) model with indicators Y111 to Y233, trait factors T111, T121, T131, and method factors M112, M113, M122, M123, M132, M133.]
CTUM model
CTC(M-1) model
Norms
The scale of measurement for most constructs is arbitrary
The meaning of scores can only be determined in relation to some frame of reference
Normative approaches use the distribution of scores (across and within individuals) as frame
large sample
representative sample (multiple samples)
descriptive statistics of scale scores
Norms for subpopulations (male and female; student and non-student samples; ethnic backgrounds; different cultures sharing the same language)
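How a norm sample serves as a frame of reference can be sketched with two common norm-referenced scores. The z-score and mid-rank percentile below are standard definitions that the slides do not name explicitly; the norm sample is hypothetical and far too small for real norms.

```python
from statistics import mean, stdev

def z_score(raw, norm_sample):
    """Distance of a raw score from the norm mean in standard deviation units."""
    return (raw - mean(norm_sample)) / stdev(norm_sample)

def percentile_rank(raw, norm_sample):
    """Percentage of the norm sample scoring below raw (ties count half)."""
    below = sum(1 for score in norm_sample if score < raw)
    equal = sum(1 for score in norm_sample if score == raw)
    return 100.0 * (below + 0.5 * equal) / len(norm_sample)

# Hypothetical scale scores of a norm sample.
norm_sample = [10, 12, 14, 16, 18]

z = z_score(16, norm_sample)
pr = percentile_rank(16, norm_sample)
```

Separate norm samples per subpopulation would simply be passed as different `norm_sample` lists.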
Outlook
Computer programs
Dichotomous / ordinal data
Cross-cultural research
Longitudinal data
Multilevel MTMM models
Computer Programs
Simple statistics, Cronbach's alpha, item-total correlations, exploratory factor analysis:
SPSS
R
S-Plus
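$P(Y_{ijk} \geq s \mid \eta_{ijk}) = \Phi(\eta_{ijk} - \kappa_{isjk}) = \int_{-\infty}^{\eta_{ijk} - \kappa_{isjk}} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx$
The parameter $\kappa_{isjk}$ is a difficulty parameter for each response category bound s (s > 0) for item i measuring trait j with method k, and $\Phi$ represents the distribution function of the standard normal distribution.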
[Figure: two panels labeled "great" and "good", each plotted against $\eta_{ijk}$ from unpleasant to pleasant. Upper panel: category characteristic curves $P(Y_{ijk} \geq s \mid \eta_{ijk})$ for s = 1, ..., 4. Lower panel: response probabilities $P(Y_{ijk} = s \mid \eta_{ijk})$ for the categories 0 (not at all) to 4 (very much so).]
Assumption: a latent response variable $Y^*_{ijk}$ underlies the observed categories:
$Y_{ijk} = \begin{cases} c_{ijk} - 1, & \text{for } \kappa_{i(c_{ijk}-1)jk} < Y^*_{ijk} \\ s, & \text{for } \kappa_{isjk} < Y^*_{ijk} \leq \kappa_{i(s+1)jk} \\ 0, & \text{for } Y^*_{ijk} \leq \kappa_{i1jk} \end{cases}$
It can be shown that the latent variable $Y^*_{ijk}$ is a function of the item-specific probit $\eta_{ijk}$ and a residual $\varepsilon^*_{ijk} = Y^*_{ijk} - \eta_{ijk}$ (see Eid, 1995). This yields the following variance decomposition:
$Var(Y^*_{ijk}) = Var(\eta_{ijk}) + Var(\varepsilon^*_{ijk})$
Because in the graded response model the category characteristic curves are defined on the inverse of the standard normal distribution, the variance of the residual variable has to be 1 (Eid, 1995). As a consequence, the variance of the latent variable depends only on the item-specific probit variable:
$Var(Y^*_{ijk}) = Var(\eta_{ijk}) + Var(\varepsilon^*_{ijk}) = Var(\eta_{ijk}) + 1$
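The category probabilities of the graded response model follow from differences of cumulative probabilities. The sketch below assumes the probit form P(Y >= s | eta) = Phi(eta - kappa_s) with increasing thresholds; the threshold values are hypothetical.

```python
from statistics import NormalDist

def category_probabilities(eta, thresholds):
    """Response probabilities P(Y = s | eta) for s = 0, ..., len(thresholds).

    thresholds: increasing category bounds kappa_1 < kappa_2 < ...
    """
    phi = NormalDist().cdf  # standard normal distribution function
    # Cumulative probabilities P(Y >= s | eta), framed by
    # P(Y >= 0) = 1 and P(Y >= number of categories) = 0.
    cumulative = [1.0] + [phi(eta - kappa) for kappa in thresholds] + [0.0]
    return [cumulative[s] - cumulative[s + 1] for s in range(len(thresholds) + 1)]

# Hypothetical thresholds for an item with five categories (0 to 4).
thresholds = [-1.5, -0.5, 0.5, 1.5]
probs_neutral = category_probabilities(0.0, thresholds)
probs_pleasant = category_probabilities(2.0, thresholds)
```

For a person with a higher probit value, probability mass shifts toward the upper categories, which is what the response probability curves display.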
CFA for ordered categorical data using Graded Response Theory (short)
There is an underlying dimension for the observed ordered categories (Graded Response and CFA)
Polychoric (tetrachoric) correlations can be used to estimate the model
Y* is normally distributed
WLSMV estimator in Mplus
Alternative
Using robust estimators
e.g., MLR in Mplus
five categories needed
Cross-Cultural Research
How can one determine if the same construct is measured in different cultures?
How can one determine if the questionnaire (the scale) works the same way in different cultures?
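[Figure: two-group factor model of Extraversion (groups F and D) with indicators E1 and E2, factor variances Var(F) and Var(D), and residual variances $Var(\varepsilon_1)$ and $Var(\varepsilon_2)$ in each group.]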
Cross-Cultural Research
Different languages: French- and German-speaking parts of Switzerland
Guidelines for the translation of instruments
translation and back translation
Determine if structure and discriminant validity of the questionnaire are the same across cultures
Longitudinal Data
measure of a stable trait or measure of a state?
changes in convergent and discriminant validity over time?
additional information about stability of trait and method effects available
Multiconstruct-LST Model
Clustered data?
IID assumption no longer met!
independent and identically distributed data
Multilevel CTUM-Model
[Figure: path diagram of the multilevel CTUM model. Level 2 (Target): trait factors Tt11 to Tt33 for Trait 1 to Trait 3. Level 1 (Rater): indicators Yrt11 to Yrt33 with method factor Mrt1 and residuals such as Ert11.]
Variance Components
References
Scale construction:
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309-319.
Haynes, S. N., Richard, D. C. S., & Kubany, E. S. (1995). Content validity in psychological assessment: A functional approach to concepts and methods. Psychological Assessment, 7, 238-247.
Spector, P. E. (1992). Summated Rating Scale Construction: An Introduction. Newbury Park, California: Sage Publications.
MTMM / SEM:
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Bollen, K. A., & Long, J. S. (1993). Testing Structural Equation Models. Sage.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Courvoisier, D. S., Nussbeck, F. W., Eid, M., Geiser, C., & Cole, D. A. (2008). Analyzing the convergent validity of states and traits: Development and application of multimethod latent state-trait models. Psychological Assessment, 20, 270-280.
Eid, M., Nussbeck, F. W., Geiser, C., Cole, D. A., Gollwitzer, M., & Lischetzke, T. (2008). Structural equation modeling of multitrait-multimethod data: Different models for different types of methods. Psychological Methods, 13, 230-253.
Finney, S. J., & DiStefano, C. (2006). Non-normal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course. Greenwich, CT: Information Age Publishing.
Hoyle, R. H. (Ed.) (1995). Structural equation modeling: Concepts, issues, and applications. Thousand Oaks: Sage.
Kline, R. B. (Ed.) (2004). Principles and practice of structural equation modelling (2nd ed.). New York: Guilford Press.
Loehlin, J. C. (2003). Latent variable models: An introduction to factor, path, and structural analysis (4th ed.). Mahwah: Erlbaum.
Nussbeck, F. W., Eid, M., & Lischetzke, T. (2006). Analysing MTMM data with SEM for ordinal variables applying the WLSMV estimator: What is the sample size needed for valid results? British Journal of Mathematical and Statistical Psychology, 59, 195-213.
Raykov, T., & Marcoulides, G. A. (2006). A First Course in Structural Equation Modeling (2nd ed.). Mahwah: Erlbaum.
Schermelleh-Engel, K., Moosbrugger, H., & Müller, H. (2003). Evaluating the fit of structural equation models: Test of significance and descriptive goodness-of-fit measures. Methods of Psychological Research - Online, 8, 23-74. Download: http://www.mpr-online.de/
Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural equation modelling (2nd ed.). Mahwah: Erlbaum.
Other:
Burns, G. L., & Haynes, S. N. (2006). Clinical Psychology: Construct validation with multiple sources of information and multiple settings. In M. Eid & E. Diener (Eds.), Handbook of Multimethod Measurement in Psychology. Washington, DC: American Psychological Association.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.
Steyer, R., Schwenkmezger, P., Notz, P., & Eid, M. (1994). Testtheoretische Analysen des Mehrdimensionalen Befindlichkeitsfragebogens. Diagnostica, 40, 320-328.