The Moral Injury and Distress Scale: Psychometric Evaluation and Initial Validation in Three High-Risk Populations
Objective: The concept of moral injury resonates with impacted populations, but research has been limited by existing measures, which have primarily focused on war veterans and asked about exposure to potentially morally injurious events (PMIEs) rather than PMIE exposure outcomes. Our goal was to develop and examine the psychometric properties of the Moral Injury and Distress Scale (MIDS), a new measure of the possible emotional, cognitive, behavioral, social, and/or spiritual sequelae of PMIE exposure. Method: The MIDS was validated by surveying three groups: military veterans, healthcare workers, and first responders (N = 1,232). Results: Most respondents (75.0%; n = 924) reported PMIE exposure. Analyses yielded 18 items that contributed to a single latent factor representing moral distress with fully or partially invariant configurations, loadings, and intercepts across occupational groups. The MIDS full-scale score demonstrated excellent internal consistency (α = .95) and moderate 2-week stability (r = .68, p < .001, n = 155). For convergent validity, associations between the MIDS and PMIE exposure measures, as well as putative indicators of moral injury (e.g., guilt, shame), were positive and large (r = .59–.69, p < .001), as were correlations with posttraumatic stress, depressive, and insomnia symptoms (r = .51–.67, p < .001). The MIDS was a stronger predictor of functioning than PMIE exposure measures, explaining seven times greater unique variance (9% vs. 1%–1.3%). Conclusions: The MIDS is the first scale to assess moral injury symptoms indexed to a specific PMIE that is validated across several high-risk populations.
In the most widely used and accepted definition of moral injury, it is defined as the lasting emotional, cognitive, behavioral, social, and/or spiritual impact of exposure to a potentially morally injurious event (PMIE) such as “perpetrating, failing to prevent, bearing witness to, or learning about acts that transgress deeply held moral beliefs or expectations” (Litz et al., 2009, p. 696). It is further conceptualized as the constellation of symptoms (e.g., guilt, disgust, inability to self-forgive) and related functional impairments (e.g., self-sabotaging behaviors, self-punishment) that can follow exposure to a PMIE (Jinkerson, 2016). Mixed methods empirical studies have supported this definition and conceptualization of moral injury (e.g., Griffin et al., 2019; Purcell et al., 2016). Other conceptualizations of moral injury are consistent with that of Litz and colleagues in regard to the core features and symptoms (Griffin et al., 2019), with the exception of Shay (2014), who proposed that betrayal by an authority or other in a position of power may also constitute a type of moral injury.
Research on moral injury has proliferated over the last decade, most commonly in studies of veterans exposed to war (e.g., Griffin et al., 2019; Maguen & Norman, 2022). Moral injury research has recently expanded from military veterans to include other populations, such as healthcare workers on the frontlines of the COVID-19 pandemic (Hines et al., 2021; Norman et al., 2021; Riedel et al., 2022). Research showing that moral injury is prevalent, distressing, and impairing across populations, particularly among those working in high-stress environments, has revealed a need for a psychometrically sound measure suitable for a broad range of populations. However, existing measures of moral injury have several limitations.
One limitation is that existing measures that purport to assess moral injury were developed and validated with veterans and query war-related situations (Bryan et al., 2016; Currier et al., 2020; Litz et al., 2022; Nash et al., 2013). There is a lack of measures validated with other impacted populations, including healthcare workers and first responders (emergency medical technicians, paramedics, firefighters, law enforcement, etc.), whose work routinely involves life-and-death situations, particularly during crises or disasters (Lentz et al., 2021). In some cases, scholars have adapted measures originally developed in veteran samples for use with other populations, although these are usually not validated for the new population (Khan et al., 2021; Koenig et al., 2019; Zhizhong et al., 2020). Moral injury assessments validated for use across populations and professions would allow for comparisons across impacted groups and enable the field to study the prevalence and impact of moral injury during disasters such as the COVID-19 pandemic (Williamson et al., 2020).
Another limitation is that many existing measures conflate exposure to a PMIE (e.g., witnessing the death of a civilian in war) with potential impacts of these exposures (e.g., the lasting psychological and spiritual distress) or measure only one or the other. A measure that clearly delineates exposure from the sequelae of the exposure would allow for a better understanding of whether certain types of exposure (e.g., perpetration vs. witnessing) lead to different moral injury responses. An ideal measure would assess a range of psychological and functional problems that characterize moral injury and link these impacts to a precipitating event or series of events, thus acknowledging that morally injurious exposures and outcomes are related but distinct constructs. When moral injury signs and symptoms are queried without indexing to a clearly defined PMIE, outcomes may reflect general distress, rather than moral injury specifically. For example, individuals may feel guilt or disgust about general life stressors (e.g., ruptured interpersonal relationships, unmet personal goals) that are not specific to a PMIE. In contrast, measuring PMIE exposure without the hallmark symptoms makes it challenging to fully understand symptoms and functional impairment related to these exposures. Relatedly, to accurately assess the efficacy of moral injury treatments, it is important to have measures that assess symptoms on a continuum from mild to severe and can identify changes in symptoms and functioning. Thus, a psychometrically sound scale is needed that comprehensively assesses PMIE exposure in addition to hallmark indicators of moral injury, including psychological, behavioral, social, and spiritual outcomes.
Another concern with existing measures is inconsistent factor structure, sometimes within the same measure (Bryan et al., 2016; Nash et al., 2013; Richardson et al., 2020), which obfuscates construct validity. One reason for these mixed findings may be that betrayal has been conceptualized in a number of different ways in relation to moral injury, and questions about whether it is a type of PMIE exposure, a separate or co-occurring construct, or a resulting symptom of a PMIE exposure remain (Norman et al., 2022). Early in the conceptualization of moral injury, Shay (2014) proposed that betrayal by leadership was a type of PMIE. Subsequently, some moral injury measures included betrayal from a variety of sources (e.g., others in the military, others outside the military; Nash et al., 2013) as types of PMIEs or symptoms of moral injury. These inconsistent uses of betrayal yielded inconsistent findings, likely because the items were often broad, went beyond the scope of PMIEs (feelings of betrayal were generally endorsed rather than tied to a specific PMIE event), and were not part of a preexisting theory (e.g., Litz et al., 2009; Shay, 2014). Some empirical studies have demonstrated that betrayal may be more associated with posttraumatic stress disorder (PTSD) than moral injury and may need to be considered separately from moral injury due to its unique signs and symptoms (Borges et al., 2021; Jordan et al., 2017; Maguen & Norman, 2022; Maguen et al., 2020). This uncertainty has led to inconsistencies, which have consequently limited the ability to accurately measure moral injury.
Additionally, some measures of moral injury utilize a bifactor model with a general factor representing overall distress and two specific factors representing distress directed toward either oneself or others (Currier et al., 2020). In practice, however, item responses on these and other measures of moral injury are typically summed to produce a total score without evidence demonstrating that the items are better measures of the overall general factor than the domain-specific factors (Chesnut et al., 2022; Koenig et al., 2018; Litz et al., 2022). Because unidimensionality is a critical assumption of most scoring models, a unidimensional scale to assess moral distress is needed.
To address the aforementioned limitations, our aim in the current study was to use prior moral injury research, expert opinion, and rigorous psychometric methods to develop and test the Moral Injury and Distress Scale (MIDS), a new measure of moral injury that could be used across PMIEs and populations and that separately measures PMIEs and reactions to these events. Based on Litz et al.’s (2009) definition of moral injury, this measure was designed to include information about index event(s) that met the threshold for exposure to PMIEs, as well as symptoms and related functional impacts of such exposures. We evaluated the psychometric properties and validity of
the MIDS in three populations at risk of moral injury, including combat veterans, healthcare workers, and first responders.
Method
Participants
PMIEs (rather than general life stressors). Only those who denied PMIE exposure or reported a valid PMIE exposure were included in the analyses (N = 953).
We report demographics for the full sample and each occupational subgroup in Table 1. Combat veterans (n = 302) reported serving in the Army (39.1%), Navy (25.2%), Air Force (18.5%), Marines (7.6%), or other branch (9.6%, e.g., Coast Guard). Time in service ranged from 5 years or less (49.0%), between 6 and 19 years (19.9%), and 20 or more years (31.1%), and all reported at least one deployment to a warzone. Veterans reported “moderate” or greater levels of being exposed to and bothered by witnessing others’ perceived transgressions (49.3%) and transgressing their own values by what they did (36.4%) or failed to do (27.5%).
Healthcare workers (n = 356) were nurses (43.8%; nurse practitioner, registered nurse, licensed practical nurse, etc.), allied health professionals (24.7%; pharmacist, psychologist, respiratory/physical/occupational/speech therapist, etc.), other nonclinical staff (19.1%; support staff, technician, administrator, volunteer), and physicians or physician assistants (7.0%); 5.3% did not provide details regarding their occupation. We were inclusive when conceptualizing healthcare workers, knowing that at certain times, such as during the COVID-19 pandemic, some may be asked to take on duties that are outside of their specialization or normal role. Time in occupation ranged from 5 years or less (21.3%), between 6 and 19 years (35.7%), 20 or more years (42.1%), and prefer not to answer (0.8%). Most healthcare workers (77.5%) were currently working in their role; 22.5% were no longer working in healthcare (e.g., retired). Healthcare workers reported “moderate” or greater levels of being exposed to and bothered by witnessing others’ perceived transgressions (39.0%) and transgressing their own values by what they did (23.9%) or failed to do (22.0%).
First responders (n = 295) were law enforcement or corrections personnel (26.1%), emergency medical technicians or paramedics (18.3%), fire service or hazmat personnel (16.3%), and other first responders (39.3%; e.g., dispatcher, humanitarian/disaster worker, public works safety inspector). Time in occupation ranged from 5 years or less (24.4%), between 6 and 19 years (31.5%), 20 or more years (43.4%), and prefer not to answer (0.7%). Most (78.6%) were currently working in their role; 21.4% no longer worked as a first responder (e.g., retired). First responders reported “moderate” or greater levels of being exposed to and bothered by witnessing others’ perceived transgressions (45.1%) and transgressing their own values by what they did (21.7%) or failed to do (16.3%).
drugs more”). Similarly, five items appeared to assess nonspecific cynicism and disillusionment that were not necessarily associated with PMIEs (e.g., “I feel disillusioned by the world I live in,” “I wonder what kind of world I live in”) or not well aligned with Shay’s (2014) definition positing betrayal by a “person who holds legitimate authority” (i.e., “I feel betrayed by people in my community”). Consistent with prior work in this area (Currier et al., 2018), we decided to eliminate these items prior to analysis because including them would artificially inflate correlations with mental health symptom measures and introduce a lack of clarity into our conceptualization of moral injury.
Phase two involved constructing the scale by pretesting questions (Step 3), administering the survey (Step 4), reducing the number of items (Step 5), and exploring the factor structure (Step 6). To pretest the questions, several of the authors (Shira Maguen, Sonya B. Norman, Robert H. Pietrzak, and Carmen McLean) conducted a series of interviews with veterans, healthcare workers, and first responders who provided feedback on the content validity and acceptability of the items. Veterans provided feedback in the context of a stakeholder engagement panel (n = 4), and healthcare workers (n = 6) and first responders (n = 3) were interviewed individually. The veteran panel meets regularly to give input to the National Center for PTSD investigators on research questions and educational products. Healthcare workers and first responders were colleagues who were interested in the topic of moral injury and offered to give input on the draft measure. We revised items as suggested by the various stakeholders. The items were then administered to respondents in two rounds. All respondents completed the baseline survey; a smaller subsample completed a follow-up survey 2 weeks later (n = 155). Finally, phase three involved testing the dimensionality (Step 7), reliability (Step 8), and validity (Step 9) of the newly developed scale. Data and study materials will be made available to qualified investigators by contacting the authors. A copy of the finalized scale is available in the online supplemental material 1.
Measures
Moral Injury and Distress Scale
Measures of Moral Injury and Putative Indicators of Moral Injury
We included existing moral injury measures as well as measures of constructs considered to be core components of moral injury. Given that existing moral injury measures were designed for use with veterans, some of the language was altered to make them applicable for a civilian population. The Morally Injurious Events Scale (MIES; Nash et al., 2013) is a nine-item self-report questionnaire used to assess exposure to and subjective distress from PMIEs. The scale assesses exposure by (a) witnessing, (b) perpetrating (through acts of commission or omission), or (c) being betrayed by leaders, peers, and others. Respondents indicated how much they agreed or disagreed with each statement (1 = strongly disagree, 6 = strongly agree). Item responses were averaged, such that higher scores on the MIES correspond to greater intensity of exposure and distress. Internal consistency of scores on the MIES was excellent in the current sample (α = .91).
The Expressions of Moral Injury Scale-Military Version-Short Form (EMIS-M-SF; Currier et al., 2020) is a four-item self-report questionnaire based on the full-length Expressions of Moral Injury Scale-Military Version (EMIS; Currier et al., 2018). Two items capture self-directed moral injury, one asking about guilt and the other about shame in relation to events that happened during military service. The other two items ask about other-directed moral injury and query disgust and witnessing the moral failures of others. Respondents indicated how much they agreed or disagreed with each statement (1 = strongly disagree, 5 = strongly agree). Item responses were averaged to create a total score, with higher scores indicating greater levels of emotions and beliefs associated with moral injury. Internal consistency of participants’ scores on the EMIS was good in the current sample (α = .87).
Measures of constructs theorized to be core components of moral injury included the Trauma-Related Shame Inventory (TRSI), the Global Guilt scale of the Trauma-Related Guilt Inventory (TRGI), and selected subscales from the Religious and Spiritual Struggles Scale (RSSS). The Global Guilt scale of the TRGI (Kubany et al., 1996) consists of four items that measure trauma-related guilt (e.g., “I experience intense guilt related to what happened.”). Respondents indicated how they felt about each statement (1 = never/not at all true to 5 = always/extremely true). Item responses were averaged, such that higher scores indicated more trauma-related guilt. Internal consistency of participants’ scores on the TRGI Global Guilt scale was excellent in the current sample (α = .93).
The TRSI-24 (Øktedalen et al., 2014) is a 24-item self-report questionnaire used to measure trauma-related shame, conceptualized by four categories (internal condemnation, internal affective-behavioral, external condemnation, and external affective-behavioral). Respondents indicated how true a statement was for them (0 = not true of me, 4 = completely true of me). Item responses were averaged, such that higher scores on the TRSI indicated greater levels of trauma-related shame. Internal consistency of participants’ scores on the TRSI was excellent in the current sample (α = .98).
We utilized three subscales from the RSSS (Exline et al., 2014). The original scale is a 26-item self-report questionnaire that measures religious and spiritual struggles across six domains (divine, demonic, interpersonal, moral, doubt, and ultimate meaning). We used the divine (five items; “Felt as though God had abandoned me”), moral (four items; “Worried that my actions were morally or spiritually wrong”), and doubt (four items; “Struggled to figure out what I really believe about religion/spirituality”) subscales. Respondents indicated the extent to which they have had each experience (1 = not at all, 5 = a great deal). Item responses were averaged, such that higher scores indicated greater levels of religious/spiritual distress. Internal consistency of participants’ scores on the divine (α = .94), moral (α = .91), and doubt (α = .95) subscales for the current sample was excellent.
Measures of Mental Health
We assessed psychological problems including depression, posttraumatic stress, and posttraumatic cognitions. The Patient Health Questionnaire-9 (PHQ-9; Kroenke et al., 2001) is a nine-item self-report questionnaire that assesses depressive symptoms from the Diagnostic and Statistical Manual, Fourth Edition (DSM-IV) criteria. Respondents indicated how often they have been bothered by problems over the past 2 weeks (0 = not at all to 3 = nearly every day). Internal consistency of participants’ scores on PHQ-9 items was excellent in the current sample (α = .91).
The PTSD Checklist for DSM-5 (PCL-5; Bovin et al., 2016) is a 20-item self-report questionnaire that measures posttraumatic stress symptoms corresponding to the DSM-5 diagnostic criteria. Respondents indicated how often they have had each problem or symptom over the past month (0 = not at all, 4 = extremely), with higher scores representing greater levels of posttraumatic stress symptoms. This version does not ask participants to anchor responses to a specific trauma. Internal consistency of participants’ scores on PCL-5 items was excellent in the current sample (α = .96). Additionally, we administered the Posttraumatic Cognitions Inventory-Short Form (PTCI; Foa et al., 1999). The PTCI is a nine-item self-report questionnaire used to assess thoughts after a traumatic experience. Respondents indicated how much they agreed or disagreed with each statement (1 = totally disagree, 7 = totally agree). Internal consistency of participants’ scores on the PTCI was good in the current sample (α = .85).
We also queried about insomnia and alcohol use. The Insomnia Severity Index (ISI; Morin et al., 2011) is a seven-item self-report questionnaire that assesses the severity of daytime and nighttime features of insomnia. Participants indicated the extent to which each issue was a problem for them (0 = none, 4 = very much). Item responses were summed with higher scores indicating higher levels of insomnia. Internal consistency of participants’ scores on the ISI was excellent in the current sample (α = .91). The Alcohol Use Disorders Identification Test (AUDIT; Saunders et al., 1993) is a 10-item self-report questionnaire that assesses drinking behaviors in terms of hazardous use and dependence symptoms. Item responses were summed with higher scores indicating greater levels of problematic alcohol use. Internal consistency of participants’ scores on the AUDIT was good in the current sample (α = .83).
Other Convergent and Discriminant Validity Measures
The Brief Inventory of Psychosocial Functioning (B-IPF; Kleiman et al., 2020) is a seven-item self-report questionnaire used to measure impairments in psychosocial functioning related to stress. Respondents indicated their level of trouble over the past month (0 = not at all, 6 = very much, with an additional n/a option)
for domains including work/education, self-care, and social relationships. Item responses were summed, divided by the maximum possible domain scale score for the applicable items, and multiplied by 100. Higher scores indicated greater impairment. The B-IPF demonstrated good-to-excellent internal consistency in the current sample (α = .87).
Conversely, the Connor-Davidson Resilience Scale (CD-RISC; Connor & Davidson, 2003) is a 25-item self-report questionnaire designed to assess adaptability or the propensity to “bounce-back” from a stressor. Respondents indicated how much each statement applied over the past month (0 = not true at all, 4 = true nearly all the time). Item responses were summed, with higher scores indicating greater resilience. Internal consistency of the CD-RISC was excellent in the current sample (α = .92).
The Big 5 Inventory Short Form (Rammstedt & John, 2007) is an abbreviated version of the full-length Big 5 Inventory designed to measure five dimensions of personality, including neuroticism, openness to experience, conscientiousness, extraversion, and agreeableness. The short form is a 10-item (two items per dimension) self-report questionnaire on which respondents indicated the extent to which they agree that each statement describes them (1 = disagree strongly, 5 = agree strongly). Item responses were averaged, such that higher scores indicated a greater presence of each trait. Consistent with Eisinga et al.’s (2013) recommendation for assessing the reliability of two-item scales, we calculated the Spearman–Brown coefficient for each subscale. Findings for neuroticism were fair (ρ = .58; roughly similar to α = .57). The remaining subscales were excluded due to low reliability.
Data Analysis
Analyses were conducted using IBM SPSS Statistics, Version 27 and Mplus, Version 8.4 (IBM Corp., 2020; Muthén & Muthén, 2017). Missing data diagnostics revealed that <1% of the data (0.4%) were missing for <5% of cases (4.7%); thus, bias associated with incomplete data was not considered to be problematic. Once data collection was complete, we used a random number generator to divide the baseline sample into two subsamples: an initial validation subsample composed of 60% of the cases (n = 582) and a cross-validation subsample composed of 40% of the cases (n = 371). We used the initial validation subsample to reduce the number of items and examine the initial factor structure. This involved creating an item discrimination index by summing all items and separating cases in the 25th percentile or below from those in the 75th percentile or above. We calculated the difference in the proportion of respondents who endorsed a moderate or greater level of distress for each item, and we compared rates of endorsement between those in the highest and lowest quartiles. Items were dropped if they failed to discriminate (difference in proportion <0.20) between those in the highest and lowest quartiles. Then, we calculated interitem correlations. When a dyad of items was highly correlated (r = .70), suggesting redundancy in item content, we retained the item with greater discriminatory ability as indicated by the item discrimination index and eliminated the item with less discriminatory ability.
Next, we examined the factor structure by conducting exploratory factor analysis (EFA) using principal axis factoring. This included a parallel analysis to compare eigenvalues from the existing dataset against eigenvalues from randomly generated datasets with similar dimensions (Hayton et al., 2004). To do this, we calculated the mean eigenvalue of each factor from the randomly generated datasets and plotted them against the eigenvalues for each factor from the observed MIDS data. We retained factors in the observed data with eigenvalues that exceeded eigenvalues of factors extracted from the random data. Then, we conducted a confirmatory factor analysis (CFA) in the cross-validation subsample to determine whether the observed factor structure replicated in an independent sample. Model fit was assessed using the χ² value and a three-index strategy (Fan & Sivo, 2005). Values of 0.90 or above for the comparative fit index (CFI; Tucker & Lewis, 1973), .06 or below for the root mean square error of approximation (RMSEA; Browne & Cudeck, 1992), and 0.08 or below for the standardized root mean square residual (SRMR; Kline, 2005) were interpreted as acceptable model fit.
To further test the dimensionality of the scale across occupational subgroups, we combined the initial validation and cross-validation subsamples into a full sample and specified a series of multigroup confirmatory factor analyses (MGCFA) to evaluate measurement invariance across occupational groups. This involved testing whether the number of factors (configural model), factor loadings (metric model), and intercepts (scalar model) were consistent across veterans, healthcare workers, and first responders. We selected the first responder group as the reference group, because the first responder group was more heterogeneous than the other groups in terms of participants’ genders and ages (i.e., as expected, veterans had a higher representation of older men and healthcare workers were more likely to be women). If evidence initially supported full invariance, we compared latent means across the groups. Otherwise, we established partial scalar invariance by comparing nested MGCFA models in which we sequentially freed intercept constraints prior to comparing latent means. Nested models were evaluated using χ², RMSEA, and CFI difference tests. We interpreted nonsignificant χ² difference tests, RMSEA difference values less than or equal to .01, and CFI difference values less than or equal to 0.01 as evidence of equivalence (Chen, 2007; Cheung & Rensvold, 2002).
Next, we tested the reliability of scores by calculating Cronbach’s alpha (α) for the full sample and within each occupational subgroup, and we tested the stability of scores using Pearson correlation on a subsample of respondents who completed a follow-up survey containing the MIDS items 2 weeks after the baseline survey. The validity of participants’ scores on the MIDS was examined by calculating Pearson correlations between the MIDS and validation measures that assessed guilt/shame, religious/spiritual struggle, mental health symptoms, psychosocial functioning, resilience, and personality traits. We used Cohen’s (1988) guidelines for interpreting effects as small (r = .10–.30), medium (r = .30–.50), and large (r ≥ .50). Finally, we tested whether scores on the MIDS predicted variance in functional impairment above and beyond variance explained by individual differences (e.g., gender, age, and race) and two existing measures of PMIE exposure using hierarchical multiple regression.
Results
Item Reduction
Using the initial validation subsample (n = 582), we calculated the item discrimination index and eliminated two items that failed to distinguish between those who scored in the highest and lowest quartiles on a sum of all items in the initial pool (e.g., “My spiritual
life has changed for the worse”). Additionally, we dropped seven items that were highly correlated with at least one other item, suggesting that the items’ contents were redundant. For example, we removed “My life feels less meaningful” and retained “My life feels like it has less purpose.” When two items were highly correlated, we kept the item with greater discriminatory ability. In sum, 21 items met criteria for inclusion and were carried forward to factor extraction.
Factor Extraction
We conducted an EFA using principal axis factoring to identify the factor structure of the retained items in the initial validation subsample. Parallel analysis supported a single-factor solution, such that only one factor in the observed data had an eigenvalue (λ = 11.25) that exceeded the mean eigenvalue of the largest factor obtained from the randomly generated dataset options (λ = 1.39). The single factor explained 53.6% of the total item variance in the observed data. To further eliminate redundancy in items, we calculated interitem partial correlations that controlled for the sum score of the selected items and eliminated three items with residuals that were moderately to strongly correlated with another retained item (Funk & Rogge, 2007). For instance, we eliminated “I don’t take care of my basic needs as well as I used to” and kept “I don’t take good care of myself.” We ran the EFA again with the retained 18 items to estimate factor loadings, which ranged between 0.62 and 0.76 (Table 2).
Because we dropped items based on characteristics of only the initial validation subsample, replication of the factor structure in an independent sample was needed. For this reason, we conducted CFA using the maximum likelihood estimator with robust standard errors in the cross-validation subsample (n = 371). We specified a single factor with the 18 retained items as indicators. Model fit was acceptable: χ²(135, N = 371) = 337.46, p < .001, RMSEA = .064 (95% CI [0.055, 0.072], p = .005), CFI = 0.88, SRMR = 0.055. Inspection of the modification indices revealed that fit could be improved by allowing the residuals of “I don’t feel like I deserve to be happy” and “I should not be forgiven” to covary. We allowed these residuals to covary because the items are both positively valenced, unlike the majority of negatively valenced items that comprise the MIDS. The revised model fit the data very well: χ²(134, N = 371) = 298.04, p < .001, RMSEA = .057 (95% CI [0.049, 0.066], p = .080), CFI = 0.91, SRMR = 0.052. Factor loadings ranged between 0.56 and 0.83 and are displayed with item-level descriptive statistics in Table 2. Model comparison using the Satorra–Bentler scaled chi-square difference test (TRd) revealed that the model with correlated residuals for the aforementioned items fit the data better than did the model with no residual covariances (TRd = 77.16, p < .001). Overall, these findings supported Litz et al.’s (2009) model of moral injury by extracting and replicating a single latent construct representing moral injury sequelae with psychological/behavioral, social, and spiritual/existential indicators.
Tests of Dimensionality and Measurement Invariance
To further test the dimensionality of the scale across occupational subgroups, after combining the initial validation and cross-validation subsamples into a full sample, we specified a series of MGCFA to evaluate measurement invariance across occupational groups (Table S2 in the online supplemental materials). The Veteran–First Responder configural model fit the data well, χ²(270, N = 597) = 611.67, p < .001, RMSEA = .065 (95% CI [0.058, 0.072], p < .001), CFI = 0.870, SRMR = 0.058, indicating that the number of factors was consistent across groups. Factor loadings were also equivalent; chi-square, RMSEA, and CFI tests revealed no difference in fit
Table 2
Descriptive Statistics and Factor Loadings for MIDS Items by Subsample
Item; Freq (initial validation subsample, n = 582); Loading (initial validation subsample); Freq (cross-validation subsample, n = 371); Loading (cross-validation subsample)
I think about how I should have been able to do more. 19.4 0.74 20.2 0.78
I have withdrawn from others more often. 15.5 0.71 11.3 0.70
I feel guilty. 11.7 0.76 10.0 0.72
I doubt my own judgment. 7.4 0.75 8.6 0.80
I do not feel like I deserve to be happy. 5.3 0.76 5.1 0.72
I self-sabotage things in my life more often (relationships, things at work). 6.7 0.73 6.7 0.73
I feel helpless. 11.0 0.75 11.3 0.83
My life feels like it has less purpose. 7.7 0.73 8.6 0.84
I am worried that bad things will happen to me or my loved ones. 11.3 0.73 10.5 0.75
I have punished myself. 6.5 0.74 5.4 0.78
I feel disgusted. 10.0 0.71 10.2 0.69
I do not seek support because I feel like I do not deserve it. 5.0 0.72 4.6 0.73
I do not seek support because I worry others would not understand. 10.8 0.71 11.9 0.72
I feel betrayed by leaders or institutions. 25.6 0.62 22.4 0.62
I feel powerless. 11.3 0.70 10.2 0.80
I should not be forgiven. 6.4 0.66 7.8 0.56
My spirituality/faith is no longer a source of comfort. 7.4 0.65 8.1 0.64
I do not take good care of myself. 10.1 0.64 12.1 0.63
Note. Frequencies (Freq) are percentages of respondents who endorsed “moderately” or higher for each item. Frequency is reported as
binary in this table to enhance interpretation but was analyzed as ordered categorical (0–4) in exploratory and confirmatory factor
analyses. MIDS = Moral Injury and Distress Scale.
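The invariance comparisons reported in this section are chi-square difference tests between nested models. As a minimal sketch, and assuming each reported Δχ² is already the difference statistic that is referred to a chi-square distribution with the stated degrees of freedom, the p values can be rechecked with the Python standard library alone. The function name `chi2_sf` and the comparison labels are illustrative and are not part of the original analysis.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability P(X >= stat) for X ~ chi-square(df).

    Uses the power series for the regularized lower incomplete gamma
    function P(a, x) with a = df / 2 and x = stat / 2, then returns 1 - P.
    """
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    term = 1.0 / a  # n = 0 term of the series
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= x / (a + n)
        total += term
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    return max(0.0, 1.0 - total * math.exp(log_prefactor))


# (label, delta chi-square, delta df) as reported in the text;
# the labels are shorthand introduced for this sketch.
COMPARISONS = [
    ("Veteran vs. first responder: metric vs. configural", 21.97, 17),
    ("Veteran vs. first responder: scalar vs. metric", 29.83, 17),
    ("Veteran vs. first responder: partial scalar vs. metric", 22.16, 16),
    ("Healthcare worker vs. first responder: metric vs. configural", 16.85, 17),
    ("Healthcare worker vs. first responder: scalar vs. metric", 11.90, 17),
]

for label, delta_chi2, delta_df in COMPARISONS:
    p = chi2_sf(delta_chi2, delta_df)
    verdict = "worse fit" if p < 0.05 else "no detectable loss of fit"
    print(f"{label}: p = {p:.3f} ({verdict})")
```

Under these assumptions, only the fully constrained scalar model for veterans and first responders falls below the .05 threshold, which is the comparison that motivated freeing one intercept.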
between the metric and configural models, Δχ²(17) = 21.97, p = .186, ΔRMSEA = .002, ΔCFI = 0.000. Although there was no evidence of noninvariance based on the CFI and RMSEA difference tests, the chi-square difference test indicated that the scalar model fit the data significantly worse than the metric model, Δχ²(17) = 29.83, p = .028, ΔRMSEA = .002, ΔCFI = 0.006. When we freed the intercept for the item “I should not be forgiven” (Int_veteran = 0.36, Int_first responder = 0.22), we found evidence of partial scalar invariance, Δχ²(16) = 22.16, p = .138, ΔRMSEA = .000, ΔCFI = 0.005. Thus, we compared latent means and found no significant differences between veterans and first responders on the MIDS overall factor score (Δm = 0.05, p = .425).

The Healthcare Worker-First Responder configural model fit the data well, χ²(266, N = 651) = 647.27, p < .001, RMSEA = .065 (95% CI [0.060, 0.073], p < .001), CFI = 0.876, SRMR = 0.055, indicating that the number of factors was consistent across groups. Again, factor loadings were equivalent between groups. Chi-square, RMSEA, and CFI tests demonstrated no differences in fit between the metric and configural models, Δχ²(17) = 16.85, p = .464, ΔRMSEA = .002, ΔCFI = 0.001. Intercepts were also equivalent between groups; chi-square, RMSEA, and CFI tests demonstrated no differences in fit between the scalar and metric models, Δχ²(17) = 11.90, p = .806, ΔRMSEA = .001, ΔCFI = 0.004. Thus, we compared latent means and found no significant differences between healthcare workers and first responders on the MIDS overall factor score (Δm = 0.02, p = .670).

Tests of Reliability, Stability, and Descriptive Statistics

Next, we performed tests of reliability and stability. Internal consistency of MIDS scores was excellent for the full sample (α = .95) and for veterans (α = .94), healthcare workers (α = .95), and first responders (α = .94). Scores were moderately correlated across a 2-week interval among participants who were randomly selected to complete the MIDS follow-up survey (r = .68, p < .001, n = 155). We calculated descriptive statistics for the full sample (M = 7.22, SD = 11.11) and for veterans (M = 7.76, SD = 11.00), healthcare workers (M = 7.04, SD = 11.56), and first responders (M = 6.88, SD = 10.67). Independent samples t-tests revealed no evidence of crude mean differences in MIDS total scores between occupational subgroups (ps = .321–.860). Taken together, MIDS scores showed evidence of excellent internal consistency among relevant military and civilian populations and good stability across repeated administrations, while also being sensitive to potential fluctuations over time.

Tests of Convergent and Discriminant Validity

To evaluate convergent validity (Table 3), we examined associations between participants’ scores on the MIDS and measures that purport to assess exposure to PMIEs. Scores on the MIDS were positively and strongly correlated with scores on the MIES (r = .64, p < .001) and EMIS (r = .59, p < .001). Associations between scores on the MIDS and measures of constructs theorized to be core components of moral injury, including trauma-related shame (r = .68, p < .001) and guilt (r = .69, p < .001), were positive and large. Similarly, scores on the MIDS were positively related to religious/spiritual struggles such as worry about moral wrongdoing (r = .60, p < .001), as well as feeling abandoned/punished by God/divine (r = .35, p < .001) and doubting one’s beliefs (r = .39, p < .001) to a lesser extent.

Because prior research documents moderate and positive correlations between PMIE exposure and poor mental health (for a meta-analytic review, see McEwen et al., 2021), we expected to find an even stronger association between measures of morally injurious outcomes (i.e., the MIDS) and mental/behavioral health symptoms. Consistent with this theorizing, associations between scores on the MIDS and measures of depression (r = .60, p < .001) and posttraumatic stress (r = .67, p < .001) were positive and large. Associations between scores on the MIDS and behavioral health measures including insomnia (r = .51, p < .001) and hazardous alcohol use (r = .18, p < .001) were also positive and small to moderate in magnitude. Also, we observed a strong positive association between participants’ scores on the MIDS and a multidomain composite of impaired psychosocial functioning (r = .56, p < .001) and a moderate negative association between scores on the MIDS and scores on a measure of general resilience (r = −.40, p < .001).

In terms of divergent validity, it is essential that moral injury be differentiated from other responses to traumatic events including PTSD. To that end, scores on the MIDS explained about 45% of the variation in scores on the aforementioned measure of posttraumatic stress (i.e., PCL-5; r = .67, p < .001) and about 28% of the variation on a measure of posttraumatic cognitions (r = .53, p < .001). These results suggest that moral injury and PTSD share some overlap, possibly to the extent that both occur in response to highly stressful events, though the two appear to result in unique cognitive interpretations and affective reactions. Furthermore, MIDS scores were only moderately associated with neuroticism (r = .34, p < .001), which suggests that the emotional and social sequelae of moral injury are not likely explained by a general tendency toward negative feelings or emotional instability.

Tests of Incremental Validity

Finally, to determine whether MIDS scores explained variance in outcomes beyond what was explained by existing measures of PMIE exposure, we conducted a hierarchical multiple regression with the severity of impairment in psychosocial functioning as the dependent variable. Independent variables included gender (β = .05, p = .068), age (β = −.16, p < .001), and minority race/ethnicity (β = .03, p = .303) entered in Step 1, EMIS (β = .05, p = .209) and MIES (β = .17, p < .001) total scores entered in Step 2, and MIDS (β = .41, p < .001) total scores entered in Step 3. The full model predicted 36.6% of the variance in functioning scores, F(6, 911) = 87.69, p < .001, R² = .37. After adjusting for gender, age, and race, adding EMIS and MIES scores explained 22.8% of the variance in functional impairment, ΔF(2, 912) = 143.62, p < .001, ΔR² = .23. MIDS scores explained an additional 9.0% of the variation in functioning, after accounting for variance explained by demographics and existing moral injury measures, ΔF(1, 911) = 129.58, p < .001, ΔR² = .09. When all independent variables were included in the model, MIDS scores uniquely explained 9.0% of variation in impairment versus the MIES and EMIS, which respectively explained 1.3% and <1.0% of the variance in impairment.

Discussion

The goal of this study was to develop and evaluate the psychometric properties and validity of the MIDS, the first measure designed to
Table 3
Bivariate Associations Among Study Variables
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1. MIDS —
2. MIES .64* —
3. EMIS .59* .66* —
4. TRSI .68* .49* .47* —
5. TRGI .69* .52* .52* .57* —
6. RSS divine .35* .27* .25* .40* .31* —
7. RSS moral .60* .52* .48* .54* .51* .38* —
8. RSS doubt .39* .31* .29* .40* .32* .64* .46* —
9. PHQ-9 .60* .45* .42* .59* .46* .37* .44* .36* —
10. PCL-5 .67* .54* .50* .62* .55* .40* .48* .39* .74* —
11. PTCI .53* .47* .45* .55* .47* .33* .46* .38* .52* .50* —
12. ISI .51* .43* .40* .48* .41* .35* .41* .35* .75* .65* .44* —
13. AUDIT .18* .18* .18* .19* .15* .40* .19* .16* .16* .17* .15* .15* —
14. CD-RISC −.40* −.25* −.32* −.36* −.36* −.26* −.30* −.27* −.43* −.39* −.34* −.40* −.03 —
15. B-IPF .56* .46* .42* .54* .43* .37* .50* .40* .69* .65* .54* .58* .19* −.39* —
16. BIG 5-N .34* .25* .31* .32* .34* .25* .28* .26* .37* .39* .35* .34* .11* −.49* .38* —
Note. MIDS = Moral Injury and Distress Scale; MIES = Moral Injury Events Scale; EMIS = Expressions of Moral Injury Scale; TRSI = Trauma-Related
Shame Inventory; TRGI = Trauma-Related Guilt Inventory; RSS Divine, Moral, and Doubt subscales = Religious & Spiritual Struggles Scale; PHQ-9 =
Patient Health Questionnaire-9; PCL-5 = PTSD Symptom Checklist-5; PTSD = posttraumatic stress disorder; PTCI = Posttraumatic Cognitions Inventory;
ISI = Insomnia Severity Index; AUDIT = Alcohol Use Disorders Identification Test; CD-RISC = Connor-Davidson Resilience Scale; B-IPF = Brief
Inventory of Psychosocial Functioning; N = Big 5 Personality Dimensions of Neuroticism.
* p < .001.
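Two derived quantities in the validity analyses can be rechecked from the published summary statistics. The sketch below squares a correlation from Table 3 to obtain shared variance and recomputes the F statistics for the hierarchical regression from the reported R² values. The function names are illustrative, and the Step 2 cumulative R² of .276 is an assumption obtained by subtracting the Step 3 increment (.090) from the full-model value (.366). Because the inputs are rounded, small discrepancies from the reported F values are expected.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation."""
    return r * r


def f_change(delta_r2: float, r2_after: float, df_num: int, df_den: int) -> float:
    """F statistic for an R-squared increment in hierarchical regression.

    delta_r2: increase in R-squared when the step's predictors are added
    r2_after: cumulative R-squared once those predictors are in the model
    df_num:   number of predictors added in the step
    df_den:   residual degrees of freedom after the step
    """
    return (delta_r2 / df_num) / ((1.0 - r2_after) / df_den)


# Shared variance between the MIDS and two trauma-related measures (Table 3).
for name, r in [("PCL-5", 0.67), ("PTCI", 0.53)]:
    print(f"MIDS with {name}: r = {r:.2f}, r^2 = {shared_variance(r):.3f}")

# Full model: all six predictors tested against zero, so delta R^2 equals R^2.
print(f"Full model F(6, 911) ~ {f_change(0.366, 0.366, 6, 911):.2f}")

# Step 2 (EMIS and MIES added). ASSUMPTION: cumulative R^2 = .366 - .090 = .276.
print(f"Step 2 change F(2, 912) ~ {f_change(0.228, 0.276, 2, 912):.2f}")

# Step 3 (MIDS added); cumulative R^2 is the full-model value.
print(f"Step 3 change F(1, 911) ~ {f_change(0.090, 0.366, 1, 911):.2f}")
```

The same `f_change` call covers both the omnibus test and the step tests because the omnibus F is the special case in which the increment equals the cumulative R².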
assess both PMIE exposure and a comprehensive set of sequelae and symptoms and to measure moral injury exposures and outcomes across professions and populations. As hypothesized, the MIDS showed excellent internal consistency and good stability across repeated administrations. Associations of scores on the MIDS with other measures of moral injury and putative indicators of moral injury (e.g., guilt, shame) were positive and large, as were correlations with measures of the most closely related posttraumatic psychopathology, including PTSD symptoms, depression, insomnia, and psychosocial impairment. The MIDS also operated as expected with measures of discriminant validity (i.e., neuroticism). Examination of incremental validity showed that the MIDS was a stronger predictor of impaired functioning than the EMIS or MIES, uniquely explaining 9% of the variation in functioning, suggesting it has additional, unique explanatory utility over other commonly used moral injury measures.

While interest in moral injury has increased over the past decade, research has been hindered by limitations in our ability to effectively measure the construct. For example, because of the lack of appropriate moral injury measures, treatment studies have examined related outcomes such as changes in PTSD symptoms, guilt, or functioning (e.g., Litz et al., 2021; Norman et al., 2022), which obscures whether interventions are truly alleviating moral injury. The MIDS was designed to address these limitations. Based on the most widely accepted conceptual definition (Litz et al., 2009), the MIDS queries exposure to PMIEs, the extent to which one is bothered by those events, as well as emotional, cognitive, behavioral, and/or spiritual sequelae proposed to be common moral injury reactions. This structure allows for a clear delineation of events, symptoms, and other sequelae. It also allows clinicians and investigators to know whether reactions are linked to exposure to a PMIE or are associated with distress stemming from other types of trauma or stressors. The MIDS is also the first measure validated across military/civilian populations and types of PMIEs. Benefits of having such a measure include the ability to compare the prevalence and severity of moral injury across populations and PMIE types and being prepared to measure moral injury under a wide variety of circumstances. For example, at the start of the pandemic, there was much speculation about moral injury among healthcare providers, but no validated measures to assess prevalence. The MIDS allows us to quickly measure moral injury when events involving new populations or PMIEs occur.

Although moral injury has many facets (e.g., emotional, cognitive, behavioral), we found an underlying single factor that explained the majority of the total item variance. This is not surprising because all these facets relate to reactions associated with a specific PMIE; they are all parts of a unified response and thus manifest as a single factor. Perhaps if the scale were longer (e.g., ten items each of spiritual aspects, emotional aspects, etc.), we would have found a multidimensional factor structure similar to Koenig et al.’s (2018) 45-item Moral Injury Symptom Scale, but given the purposeful brevity of the MIDS, the single factor is not surprising. While our intention in developing the MIDS was to briefly and broadly assess the potential impacts of moral injury across domains identified by Litz et al. (2009), the impact of PMIE exposure on specific domains (e.g., religiousness/spirituality) may be better assessed using more tailored instrumentation.

Notably, an ongoing controversy in the literature has been about how betrayal relates to moral injury; specifically, whether betrayal is a PMIE, a common correlate of PMIE exposure, or a resultant symptom of moral injury following a PMIE (Griffin et al., 2019; Maguen & Norman, 2022). After carefully weighing past literature and consulting with other experts, and guided by Litz et al.’s (2009) and Shay’s (2014) conceptual models, we included feelings of betrayal by authority as a moral injury symptom that can result from a PMIE. For example, a service member may see an authority figure make decisions that the service member believes put others in harm’s way (a “witnessing” PMIE) and as a result feel betrayed. We decided not to include more all-encompassing types of betrayal
included in some other measures because it is often not clear if those types of items are related to PMIEs or are feelings that result from other types of general stressors or negative life events (e.g., “I feel betrayed by others outside of the military”). Further highlighting the complicated role of betrayal in moral injury, findings reported in Table 2 show that participants were more likely to endorse “feeling betrayed by leaders or institutions” to a moderate or greater degree than any other item on the MIDS; however, the item was overall less strongly associated with the underlying latent factor representing moral injury in comparison to other items on the scale. Thus, betrayal may be a more ubiquitous experience and context in which PMIEs are likely to occur, though feeling betrayed in and of itself appears to be a poor indicator of the overall severity of morally injurious outcomes. Whether all betrayal from trauma should be considered part of the moral injury construct remains an important theoretical and empirical question.

Limitations

Given that moral injury is a relatively new and evolving construct, there may be alterations to this construct as additional research emerges. Consequently, the MIDS may need to be modified over time. Also, our sample was community-dwelling rather than

psychometrically sound measure of moral injury that deserves further study. A psychometrically sound measure of morally injurious outcomes like the MIDS equips researchers to understand the severity and course of individuals’ moral distress, with the goal of identifying those whose distress is likely to cause clinically significant functional deficits and who may therefore benefit from treatment. A next step in evaluating the psychometric properties of the MIDS is to test whether it can measure symptom change. This will require data on treatment-seeking samples, before and after treatment and ideally with posttreatment follow-up timepoints. In addition, the MIDS should be evaluated in other populations such as those who work in child protective services, border patrol agents, medical students, displaced populations, teachers, and those exposed to mass violence events such as school shootings. Similarly, the MIDS will need to be validated with international and diverse samples to ensure it is appropriate to use across cultures. Future work is also needed to identify whether there are differences by race, ethnicity, gender, or other potentially contributing variables. Epidemiologic work can inform estimates of the prevalence of moral injury in the general population. The MIDS can be included in a wide variety of studies to accelerate our understanding of the prevalence and course of moral injury.
