CITATION
Auerswald, M., & Moshagen, M. (2019, January 21). How to Determine the Number of Factors to Retain in Exploratory Factor Analysis: A Comparison of Extraction Methods Under Realistic Conditions. Psychological Methods. Advance online publication. http://dx.doi.org/10.1037/met0000200

Psychological Methods
© 2019 American Psychological Association. ISSN 1082-989X.
Abstract

Exploratory factor analyses are commonly used to determine the underlying factors of multiple observed variables. Many criteria have been suggested to determine how many factors should be retained. In this study, we present an extensive Monte Carlo simulation to investigate the performance of extraction criteria under varying sample sizes, numbers of indicators per factor, loading magnitudes, and underlying multivariate distributions of observed variables, as well as how the performance of the extraction criteria is influenced by the presence of cross-loadings and minor factors for unidimensional, orthogonal, and correlated factor models. We compared several variants of traditional parallel analysis (PA), the Kaiser-Guttman Criterion, and sequential χ² model tests (SMT) with 4 recently suggested methods: revised PA, comparison data (CD), the Hull method, and the Empirical Kaiser Criterion (EKC). No single extraction criterion performed best for every factor model. In unidimensional and orthogonal models, traditional PA, EKC, and Hull consistently displayed high hit rates even in small samples. Models with correlated factors were more challenging, where CD and SMT outperformed other methods, especially for shorter scales. Whereas the presence of cross-loadings generally increased accuracy, non-normality had virtually no effect on most criteria. We suggest researchers use a combination of SMT and either Hull, the EKC, or traditional PA, because the number of factors was almost always correctly retrieved if those methods converged. When the results of this combination rule are inconclusive, traditional PA, CD, and the EKC performed comparatively well. However, disagreement also suggests that factors will be harder to detect, increasing sample size requirements to N ≥ 500.
Translational Abstract

Exploratory factor analysis (EFA) is a statistical tool commonly used in psychological research to determine the underlying factors of questionnaire items. One of the key issues in EFA is deciding how many underlying factors researchers need to assume to account for different responses to these items. In this simulation study, we compared different extraction criteria, designed to determine this number, under conditions that are realistic in empirical practice. We investigated conditions with one underlying factor, multiple uncorrelated factors, and multiple correlated factors. In addition, we violated two assumptions of the extraction criteria. First, we included conditions with minor underlying factors that represent systematic measurement errors, for example, when different questionnaire items are phrased in a similar way. Second, many extraction criteria assume a normal distribution of responses to the questionnaire items, and we included conditions where this distribution was non-normal. We found that (1) some criteria perform better in conditions with one factor or multiple uncorrelated factors, whereas other criteria perform well in conditions with multiple correlated factors, (2) the latter criteria perform worse when minor factors are present, and (3) non-normality did not impact the performance of most criteria. We suggest that researchers use two criteria in conjunction, one suited for single/uncorrelated factors and one suited for correlated factors. If both criteria suggest the same number of factors, the result is likely correct. Otherwise, the sample size should be at least 500 because the number of underlying factors is harder to detect.
Exploratory factor analysis (EFA) is a widely used statistical method to study the underlying latent structure of a large number of observed variables, especially if there is no strong a priori justification for a particular theoretical model. EFA determines the underlying structure using a data-driven approach assuming a common factor model (Thurstone, 1947). In this model, each observed variable is conceptualized as the weighted sum of a set of (potentially correlated) factor variables and a single unique factor. The common factors account for covariances among the observed variables and, thus, are the factors of theoretical interest. Unique factors, on the other hand, exclusively account for the variances of single observed variables, which is considered to reflect measurement error with regard to the common factors.

Author note: Max Auerswald and Morten Moshagen, Institute of Psychology and Education, Department of Quantitative Methods in Psychology, Ulm University. Correspondence concerning this article should be addressed to Max Auerswald, Research Methods, Institute of Psychology and Education, Ulm University, Albert-Einstein-Allee 47, 89081 Ulm, Germany. E-mail: max.auerswald@uni-ulm.de
One of the key issues in EFA is deciding how many latent factors need to be extracted. Both under- and overestimating the number of factors (referred to as under- and overextraction, respectively) have detrimental effects on the quality of EFA (Comrey, 1978). Underextraction results in substantial error on all factor loadings, irrespective of their weight in a correctly specified model (Wood, Tataryn, & Gorsuch, 1996), and deteriorates the factor scores compared with factor scores in a correctly specified model (Fava & Velicer, 1996). In contrast, overextraction typically results in lower biases in factor scores and loadings (Fava & Velicer, 1992; Wood et al., 1996). However, overextraction can lead to factor splitting, such that manifest variables with population loadings on one factor are split on multiple factors after the rotation.

The Common Factor Model

The common factor model (for an overview, see, e.g., Jöreskog, 2007) assumes a set of m latent common factors η_1, …, η_m that explain variations in the p manifest (and standardized) random variables x_1, …, x_p. A single manifest variable x_i is assumed to be a linear combination of η_1, …, η_m and one unique factor ε_i, similar to a linear regression:

x_i = l_i1 η_1 + l_i2 η_2 + … + l_im η_m + ε_i,  1 ≤ i ≤ p,  (1)

where ε_i is uncorrelated with all η_1, …, η_m and all ε_i′ for which i ≠ i′, and l_ij is the loading of the i-th item on factor j. The goal is thus to find common latent factors, fewer in number than the

Figure 1. A common factor model with two common latent factors and seven observed variables. The common factors can be correlated whereas the unique factors are independent from other unique factors and the latent factors. The arrows from the common factors to the observed variables indicate the loadings l_i1, l_i2 for 1 ≤ i ≤ 7 (see Equation 1).
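To make the model concrete, here is a small numerical sketch of a two-factor, seven-variable structure like the one in Figure 1. The loading values (.7 and .6) and the factor correlation (.5) are illustrative assumptions, not values from the article; the code simulates data according to Equation 1 and checks that the sample correlations approach the model-implied values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical loadings: seven standardized variables, the first four
# loading on factor 1, the last three on factor 2 (values are made up).
L = np.zeros((7, 2))
L[:4, 0] = 0.7
L[4:, 1] = 0.6
Phi = np.array([[1.0, 0.5], [0.5, 1.0]])  # correlated common factors

n = 100_000
# Draw correlated common factors eta and independent unique factors eps
eta = rng.multivariate_normal(np.zeros(2), Phi, size=n)
# Unique variances chosen so that Var(x_i) = 1 (standardized variables)
uniqueness = 1 - np.einsum('ij,jk,ik->i', L, Phi, L)
eps = rng.normal(scale=np.sqrt(uniqueness), size=(n, 7))

# Equation 1: each x_i is a weighted sum of the factors plus a unique factor
X = eta @ L.T + eps
R_sample = np.corrcoef(X, rowvar=False)

# Sample correlations approach the model-implied matrix L Phi L^T (+ diagonal)
R_model = L @ Phi @ L.T
np.fill_diagonal(R_model, 1.0)
print(np.abs(R_sample - R_model).max())  # small sampling error
```

With a large simulated sample, the maximum absolute deviation between the sample and model-implied correlations is on the order of the sampling error of a correlation coefficient.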
We denote the correlation matrix between the common factors as Φ (= E(ηηᵀ)) and the covariance matrix of the unique factors ε as Δ (= E(εεᵀ)). Because η and ε are independent, the model expresses the correlation matrix as

R = ΛΦΛᵀ + Δ.  (7)

The common factor model thus becomes a statement about the correlation matrix, where the matrices Λ and Φ are only determined up to a rotation (Browne, 2001).

The matrix Δ in Equation 7 is a diagonal matrix because the common factor model assumes that all unique factors ε_i, ε_i′ are independent for i ≠ i′. The entries δ_i of Δ are called uniqueness factors and represent the part of the variance of the manifest variable x_i that is independent of the latent factors. The communalities are their counterpart, that is, the part of the variance of x_i that can be explained by the latent factors.¹ The common factor model estimates Λ such that

R̂_C ≈ ΛΛᵀ,  (8)

where R̂_C is the matrix that results when replacing the diagonal elements of R with the communalities. One least squares solution to Equation 8 estimates the loadings Λ proportional to the so-called eigenvectors of R̂_C (Jöreskog, 2007). In general, eigenvectors are vectors ν for which

Aν = λν,  ν ≠ 0  (9)

holds for an arbitrary square matrix A of size p × p, a vector ν of length p, and a scalar λ, the corresponding eigenvalue. Symmetric, positive semidefinite matrices like covariance matrices or R_C always have p (not necessarily distinct) non-negative eigenvalues. Most importantly, the j-th largest eigenvalue of R_C corresponds to the variance explained by the j-th factor in a common factor model (see the Appendix for a more technical explanation of this fact).

Principal component analysis (PCA) is also often used as a substitute for EFA based on the common factor model. However, PCA is primarily a data reduction technique that does not differentiate between common and unique variance. If the goal of the analysis is to uncover a latent structure that addresses the covariances among observed variables measured with some random error, which is the more realistic case in psychological research, EFA is usually preferred (e.g., Bentler & Kano, 1990; de Winter & Dodou, 2016). The main difference with regard to the eigenvalues is that PCA eigenvalues are calculated based on the correlation matrix R instead of R_C. However, a hypothetical population common factor model fully determines the correlation matrix R (see Equation 7) and, therefore, the associated eigenvalues of R as well (see, e.g., Braeken & van Assen, 2017, p. 465). Furthermore, there is evidence that EFA and PCA often yield comparable results in practice (Stevens, 2009). We return to this issue later in this article in the context of which matrix to choose in PA.

¹ The problem of communalities refers to the difficulty of simultaneously estimating the proportion of variance that can be explained by common factors and the common factor model itself. The common factor model approximates a correlation matrix with communalities on the diagonal, but the communalities are only known after the model is estimated (see, e.g., Harman, 1976).

General Issues in Factor Extraction

Regardless of the particular criterion employed to determine the number of retained factors, conditions can be identified that will simplify or complicate recovery of the correct number of factors. Most importantly, the correct number of factors is typically harder to detect when factor saturation is low (due to, e.g., low factor loadings, a small number of items per factor, or high factor intercorrelations), because the eigenvalues associated with true factors are numerically closer to the remaining eigenvalues (Braeken & van Assen, 2017). Given that the eigenvalues can be immediately derived from a population common factor model (see Equations 7 and 8), we can
analytically derive the effects of different factor models on the expected eigenvalues.² For a hypothetical population common factor model with m factors, p items per factor, standardized factor loadings l, and factor intercorrelation r, the eigenvalues can be derived from the model-implied correlation matrix R as

λ_1 = 1 + (p − 1)l² + (m − 1)prl²
λ_2 = … = λ_m = 1 + (p − 1)l² − prl²  (10)
λ_m+1 = … = λ_mp = 1 − l²

(Braeken & van Assen, 2017). Similarly, we can derive the eigenvalues of R_C as

λ_1 = pl² + (m − 1)prl²
λ_2 = … = λ_m = pl² − prl²  (11)
λ_m+1 = … = λ_mp = 0

Figure 2 illustrates the effects of factor loadings, number of items per factor, and factor intercorrelation on the eigenvalues of R and R_C assuming a factor model with three common factors. As can be seen, eigenvalues of true factors will generally increase with the loading magnitude and the number of items per factor because both (p − 1)l² (Equation 10) and pl² (Equation 11) will be larger. In contrast, high factor intercorrelations increase the term prl², implying that the first eigenvalue will increase with r, whereas the remaining eigenvalues associated with true factors decrease with r. Except for the first factor, highly correlated factors therefore present a difficult condition for factor extraction.

Eigenvalues estimated from a sample will deviate from the population eigenvalues of Equations 10 and 11. Braeken and van Assen (2017) demonstrated that sample dispersions tend to affect eigenvalues of various examples in three relevant respects. First, the eigenvalues λ_1, …, λ_⌊m/2⌋ (the first half of eigenvalues associated with true factors) tend to increase because the extracted factors capitalize on chance correlations in the sample correlation matrix. Second, λ_⌈m/2⌉, …, λ_m (the second half of eigenvalues associated with true factors) tend to decrease, because a larger portion of variance has already been explained by the first factors. Third, the first half of remaining eigenvalues (λ_m+1, …, λ_⌊m+(p−m)/2⌋) again tend to increase. Similar to the effect for λ_1, …, λ_⌊m/2⌋, additional factors capitalize on remaining chance correlations that were not addressed by the previous factors. Overall, this deviation leads to a more ambiguous pattern of eigenvalues, because the difference between λ_m and λ_m+1 decreases.

In sum, the above illustrates that any extraction criterion that explicitly or implicitly relies on the sample eigenvalues will perform well when there are many, high-loading indicators and when factor correlations are weak. In contrast, low loadings, few indicators per factor, and strong factor correlations pose serious challenges, especially when the sample size is small.

Methods to Decide the Number of Retained Factors

Kaiser-Guttman Criterion

One of the most prominent heuristics to determine the number of factors to retain is the KGC (Guttman, 1954; Kaiser, 1960), which extracts all factors with corresponding sample eigenvalues greater than 1. The rationale behind this rule is that a factor should at least explain as much variance as a single item. However, because sampling error leads to eigenvalues that exceed 1 even in the absence of any factor, the KGC severely overestimates the number of factors (e.g., Hakstian, Rogers, & Cattell, 1982; Lance, Butts, & Michels, 2006; Zwick & Velicer, 1986). Despite this substantial bias, the KGC is commonly used (Henson & Roberts, 2006) and is the default in several statistics programs such as SPSS (IBM Corp, 2015).

Scree Test

Cattell's (1966) scree test is a graphical method based on the plot of the successive eigenvalues in descending order (the so-called scree plot). The test is performed by searching for an elbow, a point at which the eigenvalues decrease abruptly. The method suggests extracting all factors up to the factor corresponding to the eigenvalue preceding the sharpest decline. Being a graphical approach, the method is obviously subjective and therefore rarely evaluated systematically. Furthermore, scree plots can be ambiguous, either lacking any clear elbow or showing multiple elbows in the same scree plot (Ruscio & Roche, 2012). Raîche, Riopel, and Blais (2006) suggested nongraphical solutions for Cattell's scree test that rely on the change in slope of adjacent eigenvalues. Both methods clearly outperformed the Kaiser criterion, but tended to underestimate the number of factors and were inferior to other approaches, such as PA (Raîche, Walls, Magis, Riopel, & Blais, 2013; Ruscio & Roche, 2012).

Traditional and Revised Parallel Analysis

PA (Horn, 1965) compares the empirical eigenvalues with the mean of eigenvalues obtained from random samples based on uncorrelated variables. The random samples have the same number of observations and variables as the empirical data, so the eigenvalues of the random samples take sampling error into account. PA extracts all factors with eigenvalues that exceed the average corresponding eigenvalue of the random samples (see Figure 3 for an example).

The eigenvalues in PA are typically based on the correlation matrix R of observed and random samples (PA_PCA; e.g., Finch & West, 1997; Steger, 2006), similar to a PCA. As we discussed above, a common factor model fully determines both the eigenvalues of R and R_C, so PA can also be based on the correlation matrix R_C with communalities on the diagonal, reflecting the EFA eigenvalues (PA_EFA; Humphreys & Ilgen, 1969). There has been some controversy as to which eigenvalues are appropriate for PA (Mulaik, 2010). Garrido, Abad, and Ponsoda (2013) argued that the common factor model is inappropriate for PA because the
Figure 2. Population eigenvalues of R (PCA) and R_C (EFA). The first panel displays eigenvalues for a common factor model with three orthogonal factors, four items per factor, and standardized loadings l = .6. Compared with the first panel, the second panel shows the effect of increased loadings (l = .8), the third panel displays increased items per factor (p = 6), and the fourth panel illustrates three correlated factors (r = .5).
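The closed-form population eigenvalues of Equation 10 can be checked numerically. The sketch below uses parameter values matching the fourth panel of Figure 2 (three factors, four items each, l = .6, r = .5), builds the model-implied R, and compares its spectrum with the closed-form expressions:

```python
import numpy as np

m, p, l, r = 3, 4, 0.6, 0.5   # factors, items per factor, loading, intercorrelation

# Model-implied correlation matrix R = Lambda Phi Lambda^T + Delta
Lam = np.kron(np.eye(m), np.full((p, 1), l))   # simple-structure loadings
Phi = np.full((m, m), r)
np.fill_diagonal(Phi, 1.0)
R = Lam @ Phi @ Lam.T
np.fill_diagonal(R, 1.0)                        # Delta fills the diagonal to 1

eig = np.sort(np.linalg.eigvalsh(R))[::-1]

# Closed-form eigenvalues from Equation 10 (Braeken & van Assen, 2017)
lam_first = 1 + (p - 1) * l**2 + (m - 1) * p * r * l**2
lam_other = 1 + (p - 1) * l**2 - p * r * l**2
lam_rest  = 1 - l**2

print(eig[:4])  # lam_first once, lam_other (m - 1) times, then lam_rest
```

The first eigenvalue grows with the intercorrelation r while the remaining true-factor eigenvalues shrink, which is exactly the pattern the text describes for correlated factors.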
random samples have uncorrelated variables with EFA communalities of h² = 0 in the population, whereas the common factor model assumes a common cause behind the observed variables. PCA, on the other hand, does not account for unique variance and might overestimate the explained variance of common factors. However, this overestimation is similar for empirical and random sample eigenvalues, resulting in no meaningful bias once these eigenvalues are compared with each other (Garrido et al., 2013). Furthermore, the performance of PA_EFA is also affected by the method of estimating the communalities. Crawford et al. (2010) found a higher hit rate for PA_PCA, unless factors were moderately or highly correlated, compared with a PA where communalities are estimated as the sample multiple R² between the variables and all remaining variables. Based on existing evidence (Garrido et al., 2013), PA_PCA seems to produce better results than PA_EFA.

PA is supported by strong evidence from simulation studies (Hubbard & Allen, 1987; Humphreys & Montanelli, 1975; Peres-Neto et al., 2005; Velicer et al., 2000; Zwick & Velicer, 1986) and is generally considered to be the method of choice (e.g., Hayton et al., 2004; Schmitt, 2011). However, there are two weaknesses associated with PA, initially suggested by Horn (1965). The first stems from the fact that sampling error can lead to eigenvalues above the average eigenvalue of random samples. For example, if all manifest variables are uncorrelated in the population (such that there is no common factor), the first empirical eigenvalue would exceed the first average eigenvalue from random samples in approximately 50% of all samples, which would lead to overestimation of the number of factors for PA. One possible solution is to use the 95th percentile of the eigenvalues obtained from random samples as a threshold instead of the mean (Glorfeld, 1995).

The second weakness of PA involves the choice of the reference eigenvalues for the second and following factors (Turner, 1998). Assume that the empirical data set has a single underlying factor that explains a large portion of the item covariances. Any remaining factor can only explain a fraction of the yet unexplained covariances. However, the items in the random samples that con-
Figure 3. Parallel analysis on a simulated sample with N = 100, 40 manifest variables, and five underlying factors. The filled dots represent the sorted eigenvalues of the sample correlation matrix. The empty dots represent the average eigenvalues of correlation matrices from 100 independent random samples. The solid line depicts the threshold for the Kaiser-Guttman Criterion. Parallel analysis correctly identifies the number of factors as five, while the scree test suggests either one or three. The Kaiser-Guttman Criterion suggests 14 factors and thus overestimates the number of factors severely.
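The procedure illustrated in Figure 3 can be sketched in a few lines. This is a minimal PA_PCA variant with mean reference eigenvalues; the simulated five-factor model below is an arbitrary example (orthogonal factors, loadings .6), not the exact data behind Figure 3. The Kaiser-Guttman count is shown for comparison:

```python
import numpy as np

rng = np.random.default_rng(7)

def parallel_analysis(X, n_random=100, percentile=None):
    """Horn's PA: retain factors whose sample eigenvalue exceeds the mean
    (or a percentile) of eigenvalues from uncorrelated random data."""
    n, p = X.shape
    emp = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    sims = np.empty((n_random, p))
    for s in range(n_random):
        Z = rng.normal(size=(n, p))  # same n and p as the empirical data
        sims[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False)))[::-1]
    ref = np.mean(sims, axis=0) if percentile is None \
        else np.percentile(sims, percentile, axis=0)
    # count leading eigenvalues above their reference (stop at first failure)
    k = 0
    while k < p and emp[k] > ref[k]:
        k += 1
    return k, emp

# Five orthogonal factors, eight items per factor, loadings .6, N = 100
m, p_per = 5, 8
L = np.kron(np.eye(m), np.full((p_per, 1), .6))
eta = rng.normal(size=(100, m))
eps = rng.normal(scale=np.sqrt(1 - .36), size=(100, m * p_per))
X = eta @ L.T + eps

k_pa, emp = parallel_analysis(X)
k_kgc = int(np.sum(emp > 1))   # Kaiser-Guttman: all eigenvalues above 1
print(k_pa, k_kgc)             # KGC typically retains many more factors
```

Passing `percentile=95` yields Glorfeld's (1995) stricter variant mentioned in the text.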
ated by differing portions of unexplained covariance, might leave a second factor undetected and underestimate the number of factors. These two weaknesses can counteract each other, so only correcting for one weakness can lead to lower accuracy in some conditions. For example, Cho, Li, and Bandalos (2009) showed that PA was more accurate if the average eigenvalues were used as a criterion compared to using the 95th percentile. However, there is no guarantee that these deficiencies have effects to the same extent but in opposing directions, so traditional PA is typically biased because one weakness is more significant than the other.

RMSE = √( Σ_{i=1}^{p} (λ_emp,i − λ_sim,i)² ⁄ p )  (12)

for p items. This step results in two RMSEs, one for each number of underlying factors.

3. Repeat Steps 1 and 2, for example, 500 times.

4. Assess if the RMSE is significantly lower in the condition with two factors via a (one-sided) Wilcoxon's test with α = .30.

5. If the difference in RMSEs is not significant, CD suggests
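A stripped-down version of the comparison-data logic in the steps above can be sketched as follows. This is a simplification under stated assumptions: the comparison data here come from a plain simple-structure factor model with a fixed loading, whereas Ruscio and Roche's (2012) generator reproduces the empirical data far more faithfully, and a rank-sum test stands in for the Wilcoxon test mentioned in the text:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(3)

def eigs(X):
    return np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

def simulate(n, m, p_per, loading):
    """Simple-structure factor data (a crude stand-in for Ruscio & Roche's
    comparison-data generator)."""
    L = np.kron(np.eye(m), np.full((p_per, 1), loading))
    eta = rng.normal(size=(n, m))
    eps = rng.normal(scale=np.sqrt(1 - loading**2), size=(n, m * p_per))
    return eta @ L.T + eps

# "Empirical" sample: two factors, six items each, loadings .7
X = simulate(500, 2, 6, .7)
emp = eigs(X)
n, p = X.shape

def rmse_dist(m_candidate, reps=100):
    """RMSE between empirical and comparison-data eigenvalues (Equation 12),
    over repeated comparison samples with m_candidate factors."""
    out = np.empty(reps)
    for s in range(reps):
        sim = eigs(simulate(n, m_candidate, p // m_candidate, .7))
        out[s] = np.sqrt(np.mean((emp - sim) ** 2))
    return out

r1, r2 = rmse_dist(1), rmse_dist(2)
# Move to the larger number only if its RMSEs are significantly lower (alpha = .30)
pval = ranksums(r2, r1, alternative='less').pvalue
suggest = 2 if pval < .30 else 1
print(suggest)
```

Because the one-factor comparison data produce a grossly different eigenvalue profile, the two-factor RMSEs are clearly lower and the procedure moves on to the two-factor solution.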
instead of using the eigenvalues relative to the number of factors, the Hull method relies on goodness-of-fit indices relative to the model degrees of freedom of the proposed model. More specifically, the method finds the number of factors in four steps:

1. The method calculates a goodness-of-fit index GOF_j and model degrees of freedom df_j of various models with an increasing number of factors j up to a prespecified maximum J (0 ≤ j ≤ J). Figure 4 depicts the comparative fit index (CFI; Bentler, 1990) for solutions with zero to seven factors, corresponding model degrees of freedom, and a simulated sample with five underlying factors.

2. A solution s_j is considered to be unviable if a less complex model (indicating a lower number of factors) with a higher (better) fit index exists. The j-th solution is thus unviable if there is a solution s_j′ with j′ < j and GOF_j′ > GOF_j. In Figure 4, no solution is excluded at this point.

3. The remaining solutions are further identified as unviable if GOF_j is below the line connecting adjacent viable solutions in a plot of fit indices and model degrees of freedom. By this rule, Solutions 2 and 4 are excluded in Figure 4. This step is repeated until no remaining solutions can be identified as unviable.

4. The Hull method then suggests the number of factors j for which

st_j = ((GOF_j − GOF_j−1) ⁄ (df_j − df_j−1)) ⁄ ((GOF_j+1 − GOF_j) ⁄ (df_j+1 − df_j))  (13)

obtains its maximum and j is a viable solution.³ The Hull method correctly identifies the number of factors as five in the example in Figure 4.

The elbow is identified as the value where, relative to the change in the model df, model fit increases considerably compared to a lower number of factors (j − 1) but is only barely lower than the model fit associated with a higher number of factors. This criterion value is based on every viable fit value relative to both its preceding and subsequent fit values. Note that the suggested factor solution therefore cannot be the first or last factor in the range for which the model fit is estimated (unless all other solutions are unviable). This range typically includes a zero-factor model as a minimum. In order to avoid overextraction, the maximum number of factors is typically set to the number of factors extracted based on traditional PA with the 95th percentile as a criterion, plus one (Lorenzo-Seva et al., 2011). If the maximum is 1 (e.g., in the case of a zero-factor model), the Hull method cannot be applied and implicitly relies on traditional PA to identify the correct number of factors.

Figure 4. (CFI plotted against model degrees of freedom for solutions with zero to seven factors; see Step 1.)

Lorenzo-Seva et al. (2011) compared the Hull method with various goodness-of-fit indices to other selection criteria. The design of the simulation study incorporated both major and minor factors, where major factors constituted the factors of interest. Minor factors were associated with (random) loadings that accounted for 15% of the variance on average. While no method consistently outperformed the other approaches, the Hull method based on the CFI was superior to other methods, including traditional PA, in conditions where the number of observed variables or the sample size was large. However, the method has not yet been

with uncorrelated factors, but may fall short when factors are highly correlated or when some factors only account for a small
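The four Hull steps can be sketched in code. The CFI and df values below are made-up numbers shaped like the Figure 4 example (five underlying factors), not the article's actual values:

```python
import numpy as np

def hull_select(gof, df):
    """Hull method selection given goodness-of-fit values and model degrees
    of freedom for solutions with 0..J factors (sketch of Steps 2-4)."""
    keep = list(range(len(gof)))

    # Step 2: drop solutions for which a simpler model fits strictly better
    keep = [j for j in keep
            if not any(gof[i] > gof[j] for i in keep if i < j)]

    # Step 3: drop solutions below the line joining their viable neighbours
    # in the df-vs-fit plot, repeating until none remain
    changed = True
    while changed:
        changed = False
        for k in range(1, len(keep) - 1):
            i, j, l = keep[k - 1], keep[k], keep[k + 1]
            interp = gof[i] + (gof[l] - gof[i]) * (df[j] - df[i]) / (df[l] - df[i])
            if gof[j] < interp:
                del keep[k]
                changed = True
                break

    # Step 4: scree ratio st_j (Equation 13) over interior viable solutions
    best, best_st = None, -np.inf
    for k in range(1, len(keep) - 1):
        i, j, l = keep[k - 1], keep[k], keep[k + 1]
        st = ((gof[j] - gof[i]) / (df[j] - df[i])) / \
             ((gof[l] - gof[j]) / (df[l] - df[j]))
        if st > best_st:
            best, best_st = j, st
    return best

# Hypothetical CFI values resembling Figure 4 (elbow at five factors);
# df decreases as factors are added
cfi = [0.00, 0.55, 0.70, 0.85, 0.90, 0.99, 0.992, 0.993]
dof = [28, 21, 15, 10, 6, 3, 1, 0]
print(hull_select(cfi, dof))
```

Note that because df decreases with each added factor, the sign of the df differences cancels between numerator and denominator of the scree ratio, so the ratio behaves as described in the text.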
λ_1,ref = (1 + √(p/N))²,  (14)

for N observations and p items. Subsequent eigenvalues are corrected by the explained variance, expressed as the eigenvalues of previous factors. The j-th reference eigenvalue is

Braeken and van Assen (2017) derived theoretical conditions for scale reliability, number of observations, number of factors, and factor correlation under which the EKC is expected to correctly identify the number of factors. For example, for orthogonal factors, EKC is predicted to work if⁴

The fit of common factor models is often assessed with the likelihood ratio test statistic (Lawley, 1940) using maximum likelihood estimation (ML), which tests whether the model-implied covariance matrix is equal to the population covariance matrix. The associated test statistic asymptotically follows a χ² distribution if the observed variables follow a multivariate normal distribution and other assumptions are met (e.g., Bollen, 1989). This test can be sequentially applied to factor models with increasing numbers of factors, starting with a zero-factor model. If the χ² test statistic is statistically significant (with, e.g., p < .05), a model with one additional factor, in this case a unidimensional factor model, is estimated and tested. The procedure continues until a nonsignificant result is obtained, at which point the number of common factors is identified.

Simulation studies investigating the performance of sequential χ² model tests (SMT) as an extraction criterion have shown

assumptions if the number of factors for the test exceeds the true number of factors. For example, if a test of three factors is applied to samples from a population with two underlying factors, the likelihood ratio test statistic will no longer follow a χ² distribution. Note that the tests are applied sequentially, so a three-factor test is

We considered 11 methods for determining the number of retained factors.

Kaiser-Guttman Criterion (KGC). The KGC has been implemented using the eigenvalues of either the input correlation matrix (KGC_PCA) or the correlation matrix with communalities on the diagonal (KGC_EFA).

⁴ Note that assumptions about the underlying factor structure, reliabilities, and factor correlation could be used for a power analysis for EFA to determine a sample size under which EKC is expected to work.
superior to every other implementation in Lorenzo-Seva et al. (2011).

Comparison data (CD). CD (Ruscio & Roche, 2012) was implemented using an alpha level of .30 and 500 resamples, in line with the recommendations of Ruscio and Roche (2012).

Empirical Kaiser Criterion (EKC). The EKC (Braeken & van Assen, 2017) was implemented using the eigenvalues of the input correlation matrix.

Sequential χ² model tests (SMT). We implemented SMT based on the hypothesis of perfect fit with α = .05 and ML estimation.

Experimental Conditions

We attempted to cover a wide range of data conditions plausibly occurring in empirical factor analysis studies.

Number of observations. The number of observations was set to 100, 200, 500, or 1,000, thereby covering the sample sizes used in most empirical studies (Fabrigar et al., 1999; Jackson, Gillaspy, & Purc-Stephenson, 2009; Worthington & Whittaker, 2006).

Number of latent factors. Manifest variables were generated with one, three, or five underlying factors, representing the dimensionality of scales most common in psychometric measurement (DiStefano & Hess, 2005; Jackson et al., 2009).

Factor intercorrelation. The intercorrelation among latent factors was set to 0, .25, .50, or .75, covering the range from independent to highly correlated scales.

Indicators per latent factor. We examined four, eight, or 12 indicators per latent factor. While the majority of scales in psychological assessment comprise four to eight indicators (DiStefano & Hess, 2005; Fabrigar et al., 1999; Jackson et al., 2009), factor extraction criteria are especially important in the initial development of a measurement instrument. The process of constructing a scale typically involves the elimination of indicators, so a condition involving 12 indicators per factor was realized to represent a scale before the elimination process.

Loading magnitude. The standardized loadings of the observed variables on the latent factors were set to either (.65, .55, .45, .35) or (.8, .7, .6, .5) for each set of four variables (i.e., every loading was assigned three times when a factor was measured by 12 indicators). The resulting average loadings were .50 or .65, which is typical for psychological research (DiStefano & Hess, 2005). The implied McDonald's reliability coefficients (McDonald, 1999) are presented in Table 1. Whereas Fabrigar, Wegener, MacCallum, and Strahan (1999) reported slightly higher reliabili-

Presence of cross-loadings. Cross-loadings also often occur in empirical data sets (e.g., DiStefano & Hess, 2005) and were simulated with two levels in this simulation: present and absent. In the condition without cross-loadings, the loading pattern matrix only contained the primary loadings (as described above). The condition with cross-loadings included additional standardized loadings of l_1 = .2 and l_2 = −.2 for the second and fourth indicator out of each set of four indicators. For example, if a factor was based on 12 indicators, the second, sixth, and 10th indicator had cross-loadings of l_1 = .2, whereas the fourth, eighth, and 12th indicator had cross-loadings of l_2 = −.2. The cross-loading l_1 was on the first succeeding factor and l_2 was on the second succeeding factor. Note that the effect of cross-loadings potentially depends on both the number and the magnitude of the primary loadings, so that the chosen approach to use a fixed magnitude for the secondary loadings might have a stronger effect when the primary loadings are low. However, the inclusion of this condition does allow us to identify whether cross-loadings have any effect on the accuracy of factor extraction methods at all.

Presence of minor factors. We included conditions with only major factors, as described above, and two conditions with a single additional minor factor. Here, we define latent factors to be minor when they represent systematic variance that is irrelevant with respect to the factors of theoretical interest. For example, this could include minor correlations among indicators due to phrasing different items in the same direction. Specifically, we defined minor factors as factors with uniformly distributed standardized loadings on every indicator. The loadings were within the range (−.1, .1) in the condition with weak minor factors and between (−.11, −.09) or (.09, .11) for moderate minor factors (Lorenzo-Seva et al., 2011). The advantage of this conceptualization is that the explained variance of minor factors is on average equal across conditions at 0.33% or 1%. Furthermore, the random loading pattern and low explained variance ensure that such factors are not meaningful and should not be extracted when performing EFA with empirical data.

Note that the present simulation study defined minor factors in a way that they account for only a small or moderate amount of the variance. We pursued this particular approach to ensure that the additional source of systematic variance would unambiguously be considered as irrelevant in empirical practice. For example, one condition in the simulation by Green et al. (2012) realized a common factor model with two factors, loadings of l = .40 each, the same number of indicators per factor, and a factor correlation of φ = .80. In this condition, the (unrotated) second factor explains 1.6% of the common variance. By comparison, the minor factors that methods were supposed to ignore in the study conducted

was assigned two or three times (as was done for loading magni-
by Lorenzo-Seva et al. (2011) on average accounted for 15% of tudes, see above). The applied set of quantile mixtures resulted in
the common variance. Clearly, we cannot expect one statistical non-normal distributions exhibiting an average skewness of ␥3,F1 ⫽
method to differentiate between these conditions and therefore 0 共SD␥3,F1 ⫽ 0.27兲, ␥3,F2 ⫽ 0.69 共SD␥3,F2 ⫽ 0.10兲, ␥3,F3 ⫽ 0
defined minor factors as rather small sources of systematic vari- 共SD␥3,F3 ⫽ 0.49兲, ␥3, F4 ⫽ ⫺1.25 共SD␥3,F4 ⫽ 0.19兲 in both non-
ance. normality conditions. Kurtosis was on average ␥4 ⫽ 12 across all
Multivariate distribution. Three types of distributions were quantile mixtures and non-normality conditions 共SD␥4, f1 ⫽ 10.95,
used (normal, non-normal based on non-normal errors, non-normal SD␥4, f2 ⫽ 3.70, SD␥4, f3 ⫽ 3.78, SD␥4, f4 ⫽ 10.72兲. The realized levels
based on non-normal latent factors). The two types of non-normal of skewness and kurtosis are well within the boundaries commonly
distributions were included because recent evidence suggests that occurring in psychological assessment (Blanca, Arnau, López-
the performance of factor-based models may vary depending on Montiel, Bono, & Bendayan, 2013; Cain et al., 2017; Micceri, 1989).
whether the non-normality in the observed variables arises from
non-normal correlated variables (such as factors) or from non-
Data Generation and Analysis
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Moshagen, 2015; Foldnes & Grønneberg, 2015; Mair, Satorra, & In total, the design involved 4 (number of observations) ⫻ 3
Bentler, 2012). Normally distributed data were generated using (number of latent factors) ⫻ 4 (factor correlation) ⫻ 3 (number of
Cholesky decomposition. Non-normal random variables Xi were indicators per factor) ⫻ 2 (loading magnitude) ⫻ 2 (cross-load-
generated as the sum of two (standardized) random variables Li, Ei, ings) ⫻ 3 (minor factors) ⫻ 3 (underlying distribution) ⫽ 5,184
and a scalar c, with: conditions. For every condition, 500 independent random samples
were generated, leading to a total of 2,592,000 data sets. The data
Xi ⫽ cLi ⫹ Ei, 1 ⱕ i ⱕ p, (17)
sets were analyzed by all 11 extraction methods under scrutiny.
so that the resulting correlation matrix of Xi is equal to the model Analyses were performed in the statistical computing language
implied correlation matrix on the population level. The random R (R Core Team, 2016). All EFA methods used maximum likeli-
variables L1, . . . , Lp are correlated, whereas E1, . . . , Ep are hood estimation based on the package psych (Revelle, 2015). For
independent and thus uncorrelated. Furthermore, all Li are required the Hull method, we calculated the CFI based on the 2 provided
to be independent from all Ei (1 ⱕ i ⱕ p). by the psych package. We used R code provided by Ruscio and
In the condition with non-normal errors, the independent ran- Roche (2012) for the CD approach and custom implementations of
dom variables Ei are non-normal, Li ⬃ N共0, 1兲, and c ⫽ 2, whereas the KGC, traditional PA, and the EKC.5
the condition with non-normal latent factors incorporated non- We recorded the suggested number of factors for each simulated
normal Li, Ei ⬃ N共0, 1兲, and c ⫽ 2.5. The non-normal Ei and Li data set and each method. Combining this information with the
were in turn generated using the NORTA approach (Cario & population values defining each data set, we determined the bias
Nelson, 1997). As inverse cumulative distribution functions F⫺1 toward over- or underextraction for each data set. Bias was defined
for NORTA, we estimated quantile mixture distributions (Auer- as the number of suggested factors minus the actual number of
swald, 2017) with weights ai, 0 ⱕ ai ⱕ 1, 1 ⱕ i ⱕ 4, for each set factors in the population. Thus, negative values indicate underex-
of four indicators: traction, positive values indicate overextraction, and zero indicates
no bias.
• F⫺1 ⫺1 ⫺1
1 ⫽ a1FX5⫹X3 ⫹ 共1 ⫺ a1兲FN共0,1兲, where X ⬃ N共0, 1兲
• F⫺1
2 ⫽ a F ⫺1
2 Lognormal共0,1兲 ⫹ 共1 ⫺ ⫺1
a2兲FN共0,1兲,
⫺1 ⫺1 ⫺1
• F3 ⫽ a3FX ⫹ 共1 ⫺ a3兲FN共0,1兲, where X has a discrete Results
probability distribution with probability mass function fX
and Due to the complexity of our design, we evaluate the perfor-
mance of extraction criteria separately for designs with (a) only
⫺10 with probability p ⫽ .01
冦
one underlying factor, (b) multiple orthogonal factors, (c) multiple
⫺0.1 with probability p ⫽ .49 correlated factors, and (d) factor models with minor factors. The
f X(x) ⫽
0.1 with probability p ⫽ .49 latter is considered separately because the identification of a minor
10 with probability p ⫽ .01 factor would not necessarily be an argument against the theoretical
validity of a method. In each section, we emphasize three main
• F⫺1 ⫺1 ⫺1
4 ⫽ a4FY ⫹ 共1 ⫺ a4兲FN共0,1兲,
results: First, we report the accuracy and bias across all extraction
where Y is
criteria to determine which conditions are more challenging for
Y⫽ 再 X4 if X ⱖ 0
兹X if X ⬍ 0
EFA in general. Second, we determine which criteria perform
better than others under different conditions. Third, we evaluate
the change in accuracy for each extraction criterion as both N and
and X ⬃ N共0, 1兲. factor determinacy (i.e., loading magnitude and number of indica-
We chose quantile mixture distributions that allow the estima- tors per factor) increases. When reporting on average accuracies
tion of the weights ai so that the univariate kurtosis was 12 for each and biases, we use abbreviations for the different data conditions
resulting indicator. Using different quantile mixture distributions as indicated in Table 2.
ensures that the resulting marginal distributions are different for
each indicator Xi while still exhibiting the same kurtosis. When a 5
The syntax is available under https://osf.io/gqma2/?view_only⫽
factor was based on eight or 12 indicators, each quantile mixture d03efba1fd0f4c849a87db82e6705668.
Table 2
Abbreviations for Different Data Conditions
In each section, we report estimates of logistic regressions predicting whether each method suggested the correct number of factors (defined as accuracy) to quantify the effects. In these logistic regressions, all applicable conditions were effect-coded, with PAPCA-M, N = 500, three latent variables, orthogonal factors, eight indicators per factor, average loadings of .5, no cross-loadings, no minor factors, and normal distribution as the reference categories. Thus, an odds ratio (OR = e^β) of 2 would indicate that the odds of identifying the correct number of factors in this specific condition are twice as high as at the grand mean, all else being equal. In each case, the logistic regression included main effects and all possible interactions. Due to the large number of conditions (and the large number of regression terms), we applied the logistic regression to the average accuracy of each cell in our design and calculated the regression coefficients directly. Similarly, we computed a linear regression model predicting the extraction biases from the average of each cell. The linear model incorporated the same predictors as the logistic model with the same reference conditions, but used the extraction bias as the criterion.

Finally, we assessed the performance of combination rules, as we assumed that no one method would outperform every other method in all conditions. The combination rules consisted of pairs of extraction criteria, for which we calculated the degree to which the methods suggested the same number of factors and the accuracy
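The cell-level logistic regression can be illustrated with a toy two-level condition. This is a hedged sketch, not the authors' code, and the two accuracies are made up: with effect coding, a saturated model's coefficients are simple functions of the cell logits, and exponentiating a coefficient yields the reported odds ratio.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

# Hypothetical cell accuracies for one effect-coded, two-level condition
# (e.g., small vs. large N), averaged over all other design cells.
acc_low, acc_high = 0.90, 0.92

# With effect coding (-1 / +1), a saturated logistic model has
# intercept = mean of the cell logits and slope = half their difference.
b0 = (logit(acc_high) + logit(acc_low)) / 2
b1 = (logit(acc_high) - logit(acc_low)) / 2
odds_ratio = math.exp(b1)  # OR > 1: higher odds of a hit than at the grand mean
```

Plugging the coefficients back through the inverse logit reproduces the cell accuracies exactly, which is why the coefficients can be "calculated directly" from cell means.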
Figure 5. Accuracy of factor extraction criteria for unidimensional factor models depending on the number of indicators per factor and sample size. PAPCA-M = traditional parallel analysis based on the average of PCA eigenvalues; PAPCA-95 = traditional parallel analysis based on the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion based on PCA eigenvalues; PAEFA-M = traditional parallel analysis based on the average of EFA eigenvalues; PAEFA-95 = traditional parallel analysis based on the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion based on EFA eigenvalues; PA-R = revised parallel analysis; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
Figure 6. Accuracy of factor extraction criteria for orthogonal factor models depending on number of indicators per factor and sample size. PAPCA-M = traditional parallel analysis based on the average of PCA eigenvalues; PAPCA-95 = traditional parallel analysis based on the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion based on PCA eigenvalues; PAEFA-M = traditional parallel analysis based on the average of EFA eigenvalues; PAEFA-95 = traditional parallel analysis based on the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion based on EFA eigenvalues; PA-R = revised parallel analysis; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
given prior agreement. In doing so, we relied only on parameters known to investigators.

Unidimensional Factor Models

Figure 5 shows the average accuracies for unidimensional factor models. Overall, most methods displayed high accuracies (acc = 91%) and low biases (bias = 0.08). As expected, the performance of most methods increased with factor determinacy (acc_{#x=12} = 94%, OR_{#x=12} = 1.37; acc_{λ=.65} = 95%, OR_{λ=.65} = 1.64). In contrast, accuracies only marginally increased with sample size (acc_{N=1,000} = 92%, OR_{N=1,000} = 1.08; acc_{N=100} = 90%). Non-normal distributions did not negatively affect factor extraction criteria in general (acc_{Lat-NN} = 90%, OR_{Lat-NN} = 0.87; acc_{Err-NN} = 92%, OR_{Err-NN} = 1.07).

Four methods displayed very high accuracy for unidimensional factor models: PAPCA-M (acc_{PAPCA-M} = 100%), PAPCA-95 (acc_{PAPCA-95} = 100%, OR_{PAPCA-95} = 45.36), Hull (acc_{Hull} = 100%, OR_{Hull} = 55.03), and the EKC (acc_{EKC} = 100%, OR_{EKC} = 28.16). These results are in line with the theoretical expectations developed for the EKC, which predict the EKC to work in all unidimensional factor conditions in this simulation. SMT correctly identified the number of factors in 93% of unidimensional factor models, comparable with the average performance of the extraction criteria under scrutiny (OR_{χ²} = 0.29), and showed a slight tendency to overextract (bias_{χ²} = 0.04). Consistent with previous results, SMT was less accurate if data were based on non-normal latent factors (acc_{Lat-NN,χ²} = 90%, OR_{Lat-NN,χ²} = 0.74) as compared with non-normal errors (acc_{Err-NN,χ²} = 95%, OR_{Err-NN,χ²} = 1.15) or normal distributions (acc_{Normal,χ²} = 95%). As was to be expected, KGCPCA displayed low accuracies (acc_{KGCPCA} = 79%, OR_{KGCPCA} = 0.08) and consistently overestimated the number of factors (bias_{KGCPCA} = 0.29, β_{KGCPCA} = 0.21). In contrast, KGCEFA performed comparatively well (acc_{KGCEFA} = 94%, OR_{KGCEFA} = 0.30), but displayed a slight tendency to underextract
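For readers unfamiliar with the reference method family, traditional PA based on PCA eigenvalues (PAPCA-M and PAPCA-95) can be sketched as follows. This is an illustrative Python implementation under common defaults, not the custom R implementation used in the study.

```python
import numpy as np

def parallel_analysis_pca(X, n_sim=100, quantile=None, seed=0):
    """Traditional PA with PCA eigenvalues: retain components whose sample
    eigenvalue exceeds the mean (PA-M) or a quantile (e.g., .95 for PA-95)
    of eigenvalues from random normal data of the same dimensions."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    sims = np.empty((n_sim, p))
    for s in range(n_sim):
        R = np.corrcoef(rng.standard_normal((n, p)), rowvar=False)
        sims[s] = np.sort(np.linalg.eigvalsh(R))[::-1]
    ref = sims.mean(axis=0) if quantile is None else np.quantile(sims, quantile, axis=0)
    exceeds = obs > ref
    # Count leading eigenvalues above the reference (stop at the first failure).
    return p if exceeds.all() else int(np.argmin(exceeds))
```

For one-factor data with reasonably high loadings, both the mean and the 95th-quantile reference typically suggest a single factor, matching the high unidimensional hit rates reported above.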
Figure 7 (opposite). Accuracy of factor extraction criteria for correlated factor models depending on number of indicators per factor, factor correlation, and sample size. The top panels display the accuracy for low factor correlations (ρ = .25), the middle panels for medium factor correlations (ρ = .50), and the bottom panels for high factor correlations (ρ = .75). PAPCA-M = traditional parallel analysis based on the average of PCA eigenvalues; PAPCA-95 = traditional parallel analysis based on the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion based on PCA eigenvalues; PAEFA-M = traditional parallel analysis based on the average of EFA eigenvalues; PAEFA-95 = traditional parallel analysis based on the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion based on EFA eigenvalues; PA-R = revised parallel analysis; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
tract 共bias #xⱖ8,PAPCA-M ⫽ 99%, acc
very high accuracies 共acc #xⱖ8,PAPCA-95 ⫽
KGCEFA ⫽ ⫺0.03, KGCEFA ⫽ ⫺0.11). All PA methods
based on EFA eigenvalues were inferior to their PCA-based counter- #xⱖ8,Hull ⫽ 100%, acc
100%, acc #xⱖ8,EKC ⫽ 99%兲. PAPCA-M and
PAEFA-M ⫽ 72%, ORPAEFA-M ⫽ 0.05, acc
parts 共acc PAEFA-95 ⫽ PAPCA-95 outperformed all other methods in conditions with four
92%, ORPAEFA-95 ⫽ 0.25兲. In line with our expectations, PA-R #x⫽4,PAPCA-M ⫽ 96%, acc
indicators 共acc #x⫽4,PAPCA-95 ⫽ 95%兲, where
outperformed other PA methods based on EFA eigenvalues Hull and EKC displayed lower hit rates 共acc #x⫽4,Hull ⫽ 90%,
OR#x⫽4,Hull ⫽ 0.26, acc
#x⫽4,EKC ⫽ 87%, OR#x⫽4,EKC ⫽ 0.29兲 and
PA-R ⫽ 94%, ORPA-R ⫽ 0.31兲. However, accuracies were con-
共acc
sistently lower compared with PCA-based PA, especially if the underestimated the number of factors 共bias #x⫽4,Hull ⫽ ⫺0.16,
bias#x⫽4,EKC ⫽ ⫺0.21兲. SMT displayed high accuracies in condi-
number of indicators per factor was small 共acc #x⫽4,PA-R ⫽
tions with larger sample sizes and at least eight indicators per
81%, OR#x⫽4,PA-R ⫽ 0.04兲. Whereas PAEFA-95 slightly underes-
factor 共acc
Nⱖ200,#xⱖ8,SMT ⫽ 92%兲, but exhibited worse results with
timated the number of factors on average 共bias PAEFA-95 ⫽ small sample sizes and short scales 共acc N⫽100,#x⫽4,SMT ⫽ 63%兲
⫺0.05,  PAEFA-95 ⫽ ⫺0.12兲, PAEFA-M regularly overextracted
where the number of factors was underestimated 共bias N⫽100,#x⫽4,SMT ⫽
共bias PAEFA-M ⫽ 0.45,  PAEFA-M ⫽ 0.37兲. CD exhibited lower than ⫺0.42兲. KGCPCA exhibited low accuracies and strongly overesti-
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
average overall accuracies 共acc CD ⫽ 81%, ORCD ⫽ 0.09) and mated the number of orthogonal factors 共acc KGCPCA ⫽ 38%,
⫽ 0.23,  ⫽ 0.15兲, especially if the
This document is copyrighted by the American Psychological Association or one of its allied publishers.
Figure 8 (opposite). Accuracy of factor extraction criteria for factor models with or without minor factors. The top panels display the accuracy for no minor factors, the middle panels for weak minor factors, and the bottom panels for moderate minor factors. PAPCA-M = traditional parallel analysis based on the average of PCA eigenvalues; PAPCA-95 = traditional parallel analysis based on the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion based on PCA eigenvalues; PAEFA-M = traditional parallel analysis based on the average of EFA eigenvalues; PAEFA-95 = traditional parallel analysis based on the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion based on EFA eigenvalues; PA-R = revised parallel analysis; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
[...] of cross-loadings (acc_{cross} = 56%, acc_{¬cross} = 55%), whereas the underlying distribution had virtually no effect overall (acc_{Normal} = 57%, acc_{Lat-NN} = 54%, acc_{Err-NN} = 57%).

When factor correlations were low and the sample size was larger, PAPCA-M and PAPCA-95 retrieved the number of factors with very high accuracy (acc_{ρ=.25,N≥200,PAPCA-M} = 99%, acc_{ρ=.25,N≥200,PAPCA-95} = 98%). Although the accuracy of PAPCA-M was lower with fewer indicators per factor and smaller sample sizes (acc_{ρ=.25,N=100,#x=4,PAPCA-M} = 72%), no other method outperformed PAPCA-M under these conditions. In line with our expectations, most of these errors were underextractions (bias_{ρ=.25,N=100,#x=4,PAPCA-M} = −0.27).

In conditions where ρ = .50, no single extraction criterion [...] KGCPCA and KGCEFA, all methods underestimated the number of factors (bias_{ρ=.50,KGCPCA} = 3.14, bias_{ρ=.50,KGCEFA} = 0.21, all other bias_{ρ=.50} < −0.06), again reflecting the effects of lower factor determinacy. With only four indicators per factor, SMT exhibited the best performance of all methods under scrutiny (acc_{ρ=.50,#x=4,SMT} = 70%). For large sample sizes, SMT displayed moderate to high accuracies (acc_{#x=4,N≥500,ρ=.50,SMT} = 92%) and virtually no bias (bias_{#x=4,N≥500,ρ=.50,SMT} = 0.03). In conditions with shorter scales, CD also performed comparatively well (acc_{ρ=.50,#x=4,CD} = 61%).

The accuracy of all extraction criteria was very low in conditions with highly correlated factors, especially with few indicators per factor (OR_{ρ=.75} = 0.23, OR_{#x=4} = 0.41, acc_{ρ=.75,#x≤8} = 17%). Only SMT achieved acceptable accuracies in these conditions, provided that the sample size was at least 500 (acc_{ρ=.75,#x≤8,N≥500,SMT} = 73%, compared with acc_{ρ=.75,#x≤8,N≤200,SMT} = 27%). For scales that consisted of 12 indicators per factor, performance was generally higher if the sample size was at least N = 500 (OR_{N=1,000,#x=12} = 1.27), particularly if PAPCA-M, PAPCA-95, or CD were employed (acc_{ρ=.75,#x=12,N≥500,PAPCA-M} = 87%, acc_{ρ=.75,#x=12,N≥500,PAPCA-95} = 82%, acc_{ρ=.75,#x=12,N≥500,CD} = 86%).

KGCPCA showed lower accuracies for increasing numbers of indicators per factor (acc_{ρ≥.25,#x=12,KGCPCA} = 12%, compared with acc_{ρ≥.25,#x=4,KGCPCA} = 60%). Irrespective of the underlying correlation, most of these errors were overextractions (bias_{ρ≥.25,#x=12,KGCPCA} = 6.05, compared with bias_{ρ≥.25,#x=4,KGCPCA} = 0.25). KGCEFA displayed decreasing performance with increasing sample size for high factor correlations and eight indicators per factor (acc_{ρ=.25,#x=8,N≤200,KGCEFA} = 45%, compared with acc_{ρ=.25,#x=8,N≥500,KGCEFA} = 16%). PAEFA-M again showed decreasing accuracies for increased sample sizes in conditions with weakly correlated factors and fewer indicators (acc_{ρ=.25,#x=4,N≤200,PAEFA-M} = 50%, compared with acc_{ρ=.25,#x=4,N≥500,PAEFA-M} = 28%).

Models With Minor Factors

Figure 8 displays the results for the conditions involving minor factors, summarized for conditions with unidimensional, orthogonal, and correlated factors. The presence of weak minor factors had no impact on average accuracies (acc_{wm} = 73%, acc_{nm} = 72%) and did not lead to overextractions (bias_{wm} = −0.12, bias_{nm} = −0.11). The performance of every extraction criterion was virtually identical in the presence of weak minor factors, even if the sample size was very large.

Accuracies generally decreased in the condition with moderate minor factors (acc_{mm} = 69%, OR_{mm} = 0.65), especially for orthogonal factor models with large sample sizes (acc_{mm,ort,N≥500} = 73%, compared with acc_{nm,ort,N≥500} = 89%, OR_{mm,ort} = 0.65). As was to be expected, most of these errors were overextractions (bias_{mm,ort,N≥500} = 0.35). Compared with the other methods, SMT and CD were particularly affected by the presence of moderate minor factors (OR_{mm,SMT} = 0.37, OR_{mm,CD} = 0.72), whereas the EKC, Hull, and PA appeared to be more robust (OR_{mm,EKC} = 1.51, OR_{mm,Hull} = 1.12; for any traditional PA method: OR_{mm,PA} ≥ 1.10).

Combination Rules

The results presented thus far indicate that PAPCA-M and PAPCA-95 displayed the highest accuracies across conditions. However, most of the other methods outperformed PAPCA-M and PAPCA-95 in at least some conditions: EKC and Hull provided very high hit rates for unidimensional or orthogonal factor models even when the sample size was small, and SMT and CD were more suitable when factors were highly correlated. The question arises whether extraction methods can be beneficially used in conjunction to determine the number of retained factors. However, a complication is that investigators obviously have no access to information regarding the true number of factors, the correlation between the factors, or the average loading magnitude before applying EFA and deciding how many factors to extract. We thus attempted to determine combination rules only considering information that is immediately available to researchers conducting an EFA, namely the number of observations, the average correlation among the observed variables, and the results of all factor extraction criteria.6 Importantly, note that the resulting overall accuracies do not necessarily reflect a method's true performance in empirical practice, because the conditions realized in our study are not necessarily equally likely to occur in the real world.7

The results using various extraction criteria can be combined according to very different schemes. In what follows, we consider a combination rule based on the idea that evidence to extract a particular number of factors is strongest when two criteria agree with respect to the suggested number of retained factors.8 Two criteria are of importance when considering the performance of combination rules: the coverage rate (cr), which expresses the degree to which the two extraction criteria involved in a combination rule agree on the number of factors, and the conditional hit rate, which is the hit rate of this particular combination, given that they agreed on the suggested number of factors. A satisfactory combination rule requires both a high conditional hit rate and a high coverage rate, because this indicates that the combination rule tends to be both correct and widely applicable. Tables 3 and 4

6 We also considered the number of indicators, but found that the resulting accuracies and coverage rates were comparable for all combinations of extraction criteria.
7 We thank Jamie DeCoster and Marcel van Assen for their insightful comments on this issue.
8 We also compared all triplets of factor extraction criteria where the resulting number of retrieved factors was equal to the median of the suggested number of each triplet. The highest overall accuracy resulted for PAPCA-M, PA-R, and SMT (83%), which was equal to PAPCA-M alone (83%).
Table 3
Percentage of Correctly Identified Factors for Pairs of Methods Given That Both Methods Agree on the Number of Factors
(Percentage of Cases for Which Pairs of Methods Agree on the Number of Factors) and N ≤ 200
Extraction method PAPCA-M PAPCA-95 KGCPCA PAEFA-M PAEFA-95 KGCEFA PA-R CD Hull EKC SMT
PAPCA-95 81 (91)
KGCPCA 99 (28) 100 (27)
PAEFA-M 83 (76) 84 (72) 87 (27)
PAEFA-95 81 (87) 79 (87) 98 (26) 81 (76)
KGCEFA 89 (58) 88 (58) 99 (25) 87 (50) 86 (58)
PA-R 83 (80) 80 (81) 99 (26) 83 (70) 78 (85) 84 (59)
CD 83 (74) 82 (74) 92 (26) 82 (65) 82 (71) 87 (51) 82 (68)
Hull 85 (81) 81 (86) 100 (27) 86 (67) 82 (80) 90 (55) 79 (79) 85 (67)
EKC 83 (81) 78 (87) 100 (26) 85 (66) 79 (81) 87 (56) 77 (79) 82 (69) 77 (86)
SMT 91 (73) 91 (72) 95 (30) 90 (64) 91 (70) 90 (54) 91 (68) 86 (68) 93 (67) 91 (67)
Single method 77 74 32 67 72 55 70 65 70 68 74
Note. PAPCA-M = traditional PA with mean PCA eigenvalues; PAPCA-95 = traditional PA with the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion with PCA eigenvalues; PAEFA-M = traditional PA with mean EFA eigenvalues; PAEFA-95 = traditional PA with the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion with EFA eigenvalues; PA-R = revised PA; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
show the coverage rates and conditional hit rates of all pairs of extraction criteria, separately for small (N ≤ 200) and large (N ≥ 500) sample sizes. As can be seen, all methods exhibited higher accuracies when used in conjunction, thereby illustrating the benefit of combining the information provided by various criteria.

Combination rules that involved SMT and either one of PAPCA-M, PAPCA-95, Hull, or the EKC provided both high accuracies and relatively high coverage rates (acc_{SMT,PAPCA-M} = 95%, cr_{SMT,PAPCA-M} = 74%; acc_{SMT,PAPCA-95} = 95%, cr_{SMT,PAPCA-95} = 72%; acc_{SMT,Hull} = 97%, cr_{SMT,Hull} = 68%; acc_{SMT,EKC} = 95%, cr_{SMT,EKC} = 68%). In large sample sizes and in conditions with above-average correlations among the observed variables (r̄ = .23), all hit rates of combination rules involving SMT and one of the aforementioned criteria were close to 100% (acc_{SMT,PAPCA-M,r>r̄} = 99%, acc_{SMT,PAPCA-95,r>r̄} = 99%, acc_{SMT,Hull,r>r̄} = 100%, acc_{SMT,EKC,r>r̄} = 100%). For larger samples, combinations of CD with either PAPCA-M, PAPCA-95, Hull, or the EKC also resulted in accuracies close to 100% and high coverage rates. Other combination rules that utilized criteria relying on similar methods (e.g., PAPCA-M and PAPCA-95) achieved higher coverage rates (cr_{PAPCA-M,PAPCA-95} = 94%) at the expense of overall accuracy (acc_{PAPCA-M,PAPCA-95} = 86%). Across conditions, combination rules involving KGCPCA consistently displayed very high hit rates even in small sample sizes. However, these hit rates were accompanied by exceptionally low coverage rates (cr_{N≤200,PAPCA-95,KGCPCA} = 27%), indicating that KGCPCA only identifies the correct number of factors if factor determinacy is high. Thus, whereas agreement between KGCPCA and (almost) any other extraction criterion allows for high confidence that the number of suggested factors is correct, combinations involving KGCPCA do not cover a sufficient number of cases to be considered a useful general combination rule. Taken together, combining SMT and either one of PAPCA-M, PAPCA-95, Hull, or the EKC consistently provided excellent hit rates (beyond what can be achieved by considering any one criterion in isolation) and covered a relatively wide range of conditions.

While concurrence between SMT and either PAPCA-M, PAPCA-95, Hull, or the EKC reliably indicated that the suggested number of factors is correct, we also examined the conditions in which these methods disagreed to evaluate whether there is an optimal strategy in situations where the proposed combination rules provide conflicting results (see Table 5). As can be seen, these conditions were associated with low hit rates for all methods.9 Particularly low hit rates were evident for small sample sizes (N ≤ 200), where the overall accuracy of the considered extraction criteria was ≤44%. Under these conditions, the highest hit rate can be obtained by relying on any of the variants of traditional PA, CD, or PA-R. Clearly, however, determining the number of factors to retain is difficult under these conditions, so increasing the sample size would be advisable. Larger sample sizes not only increase the accuracy and coverage rate of the proposed combination rules, but also improve hit rates of single extraction criteria when the combination rules provide conflicting results. In particular, in situations where SMT and either PAPCA-M, PAPCA-95, Hull, or the EKC disagree, the results of PAPCA-M, PAPCA-95, CD, or the EKC can be used to inform factor extraction (acc_{N≥500,PAPCA-M} ≥ 63%, acc_{N≥500,PAPCA-95} ≥ 60%, acc_{N≥500,CD} ≥ 58%, acc_{N≥500,EKC} ≥ 51%).

Discussion

Psychological researchers often need to determine the number of latent factors underlying multiple observed variables. EFA is often employed to this end. An important issue when performing an EFA is the number of latent factors required to adequately describe the covariance structure among the observed data. The present study subjected a large number of traditional and modern extraction criteria to a critical test by examining their performance under data conditions that are often encountered in psychological research, systematically varying the number of factors, the factor correlations, the number of indicators, the magnitude of loadings, the underlying multivariate distribution of manifest variables, as well as the presence of cross-loadings and minor factors.

9 The average reliability in these conditions was .77 and close to the average reliability of scales in empirical research (e.g., Fabrigar et al., 1999).
18 AUERSWALD AND MOSHAGEN
Table 4
Percentage of Correctly Identified Factors for Pairs of Methods Given That Both Methods Agree on the Number of Factors
(Percentage of Cases for Which Pairs of Methods Agree on the Number of Factors) and N ≥ 500
Extraction method PAPCA-M PAPCA-95 KGCPCA PAEFA-M PAEFA-95 KGCEFA PA-R CD Hull EKC SMT
PAPCA-95 91 (97)
KGCPCA 100 (60) 100 (60)
PAEFA-M 93 (63) 92 (63) 91 (46)
PAEFA-95 94 (77) 93 (77) 95 (55) 79 (72)
KGCEFA 92 (79) 92 (80) 100 (49) 90 (58) 86 (77)
PA-R 92 (78) 91 (79) 95 (55) 82 (64) 81 (82) 85 (76)
CD 97 (75) 98 (74) 94 (54) 86 (60) 91 (68) 98 (61) 92 (66)
Hull 94 (88) 92 (89) 100 (58) 92 (61) 91 (77) 89 (79) 87 (79) 98 (68)
EKC 93 (91) 91 (93) 100 (58) 92 (62) 91 (78) 90 (81) 90 (78) 98 (70) 91 (89)
SMT 99 (74) 99 (73) 93 (59) 87 (61) 91 (70) 99 (61) 91 (69) 92 (72) 100 (69) 100 (70)
Single method 90 89 63 60 74 74 73 77 82 85 80
Note. PAPCA-M = traditional PA with mean PCA eigenvalues; PAPCA-95 = traditional PA with the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion with PCA eigenvalues; PAEFA-M = traditional PA with mean EFA eigenvalues; PAEFA-95 = traditional PA with the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion with EFA eigenvalues; PA-R = revised PA; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
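The conditional statistics reported in Table 4 are straightforward to compute from per-data-set simulation results. A minimal sketch in Python (the function name and the toy vectors below are illustrative, not the study's simulation output):

```python
# For two extraction criteria, compute how often they agree on the number of
# factors (the parenthesized percentages in Table 4) and how often the
# agreed-upon number is correct (the leading percentages).

def pairwise_agreement_stats(suggested_a, suggested_b, true_k):
    """Return (percent agreement, percent correct given agreement)."""
    agreed = [(a, t) for a, b, t in zip(suggested_a, suggested_b, true_k) if a == b]
    pct_agree = 100 * len(agreed) / len(true_k)
    pct_correct = (100 * sum(a == t for a, t in agreed) / len(agreed)
                   if agreed else float("nan"))
    return pct_agree, pct_correct

# Toy example: factor counts suggested by two hypothetical criteria on six data sets
pa_95 = [3, 3, 2, 4, 3, 1]
smt   = [3, 2, 2, 4, 3, 2]
true  = [3, 3, 2, 4, 2, 1]

agree, correct_given_agree = pairwise_agreement_stats(pa_95, smt, true)
print(round(agree, 1), round(correct_given_agree, 1))  # 66.7 75.0
```

As in Table 4, the accuracy conditional on agreement (here 75%) can exceed either criterion's unconditional accuracy, at the cost of covering only the data sets on which the two criteria concur.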
The performance of all extraction criteria varied considerably Factor Determinacy and Sample Size
depending on factor determinacy and sample size. In unidimen-
The present study showed that the performance of all extraction
sional and orthogonal factor designs, traditional PA (based on the
criteria strongly depends on the signal-to-noise ratio in the data. In
sample correlation matrix and either the mean or 95th percentile of
line with previous simulation studies, extraction criteria generally
eigenvalues), Hull, and the EKC consistently retrieved the correct
deteriorated in conditions where the (expected) explained variance
number of factors. For highly correlated scales, the accuracies of of a common factor was low, either due to low loading magni-
all extraction criteria was lower due to frequent underextractions, tudes, high factor correlations, or few indicators per factor. These
which is consistent with theoretical expectations regarding sample conditions correspond to common factor models in which the
and population eigenvalues (Braeken & van Assen, 2017). In these eigenvalues associated with true underlying factors are numeri-
conditions, PAPCA-M and PAPCA-95 displayed high accuracies, cally closer to the remaining eigenvalues (Braeken & van Assen,
provided there was a sufficiently large number of indicators per 2017). Because most extraction criteria rely on the pattern of
factor and larger sample sizes. Unlike all other approaches, CD sample eigenvalues to determine how many factors to extract,
and SMT performed comparatively well in conditions with short, conditions with low factor determinacy are necessarily challeng-
highly correlated scales. ing, especially if the sample size is small. At the same time, our
Table 5
Percentage of Correctly Identified Number of Factors Given That SMT and Either PAPCA-M, PAPCA-95, Hull, or the EKC Provide
Different Solutions
N ≤ 200    N ≥ 500
SMT disagree with SMT disagree with
Extraction method PAPCA-M PAPCA-95 Hull EKC PAPCA-M PAPCA-95 Hull EKC
PAPCA-M 37 39 43 44 63 64 67 66
PAPCA-95 34 31 35 36 62 60 64 63
KGCPCA 16 17 16 18 42 42 40 43
PAEFA-M 37 39 42 42 34 33 34 33
PAEFA-95 36 35 39 38 46 45 44 42
KGCEFA 25 25 26 27 55 53 53 50
PA-R 37 37 40 39 41 40 42 40
CD 34 35 37 38 58 59 62 61
Hull 28 26 22 28 52 50 44 49
EKC 25 23 24 20 59 57 57 51
SMT 26 30 34 38 26 29 37 35
Note. PAPCA-M = traditional PA with mean PCA eigenvalues; PAPCA-95 = traditional PA with the 95th quantile of PCA eigenvalues; KGCPCA = Kaiser-Guttman Criterion with PCA eigenvalues; PAEFA-M = traditional PA with mean EFA eigenvalues; PAEFA-95 = traditional PA with the 95th quantile of EFA eigenvalues; KGCEFA = Kaiser-Guttman Criterion with EFA eigenvalues; PA-R = revised PA; CD = comparison data; Hull = Hull method; EKC = Empirical Kaiser Criterion; SMT = sequential χ² model tests.
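The decision rule that emerges from Tables 4 and 5 can be summarized in code. A hedged sketch (the function, labels, and N ≥ 500 threshold are a simplified illustration of the recommendation, not a published algorithm; the per-criterion factor counts are assumed to come from an actual EFA toolchain):

```python
# If SMT and a robust criterion (e.g., PAPCA-95, Hull, or the EKC) agree,
# accept that number of factors; otherwise, for large samples, fall back to
# CD, the EKC, or traditional PAPCA, and for small samples treat the
# dimensionality as undecided.

def combine_criteria(k_smt, k_robust, k_fallback, n):
    """Return (suggested number of factors or None, confidence label)."""
    if k_smt == k_robust:
        return k_smt, "high"           # agreement: most often the correct number
    if n >= 500:
        return k_fallback, "moderate"  # disagreement, large N: moderate accuracy
    return None, "low"                 # disagreement, small N: no confident call

print(combine_criteria(3, 3, 4, 200))  # (3, 'high')
print(combine_criteria(2, 3, 3, 800))  # (3, 'moderate')
print(combine_criteria(2, 3, 3, 150))  # (None, 'low')
```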
DETERMINING THE NUMBER OF FACTORS 19
results also indicate that lower factor determinacy adversely affects extraction criteria that are based on the fit of a structural equation model, such as SMT, to a lesser extent.

The beneficial effect of cross-loadings, implemented as additional standardized loadings on a different latent factor, on the accuracy of factor recovery can also be explained by increased factor determinacy. The additional loadings increased the explained variance of a common factor, making the pattern of sample eigenvalues more distinct and easier to identify. Consequently, the results suggest that the investigated extraction criteria can be safely applied if cross-loadings are present.

Given the overall advantage of higher factor determinacy and larger samples, it is also noteworthy that some extraction criteria displayed decreased accuracies under these conditions. The KGC severely overextracted the number of factors if unidimensional or orthogonal factors were indicated by a large number of items (see also Cattell & Vogelmann, 1977; Hakstian et al., 1982; Zwick & Velicer, 1986). The overextraction bias occurs because the KGC does not take sampling variation into account, which can lead to eigenvalues greater than one even in the absence of common factors. CD, which performed well in challenging conditions involving highly correlated factors, consistently overestimated the number of factors in unidimensional models with four indicators, especially if the sample was large. Similarly, variants of parallel analysis that relied upon the eigenvalues of an EFA model (PAEFA-M, PAEFA-95, and PA-R) did not improve with increasing sample size in conditions with fewer indicators per factor. As a result, the aforementioned methods should be applied in conjunction with other extraction criteria to protect against potential overextraction biases.

Minor Factors

The performance of all extraction criteria under scrutiny was virtually unaffected by the presence of weak minor factors, whereas moderate minor factors primarily affected SMT and CD in conditions with large sample sizes and orthogonal factors. These results are in line with the expectations derived by Braeken and van Assen (2017) in that overextractions are likely to occur only if the explained variance of a minor factor sufficiently changes the pattern of population eigenvalues. Consequently, the methods that display higher performance when factors are highly correlated, such as SMT and CD, were also more strongly affected by minor factors. Future research should consider whether the results extend to other sources of systematic variance as well, such as correlated unique factors. We expect that, similar to minor factors, correlated unique factors affect the hit rates of extraction criteria only if they represent a substantial proportion of the systematic variation, for example in conditions with few indicators.

Non-Normal Multivariate Distributions

Previous studies investigating non-normality only evaluated traditional PA and only varied the marginal distributions, without considering other extraction criteria or manipulating the multivariate distribution itself. The present study showed that most extraction criteria were highly robust under commonly observed amounts of skewness and kurtosis in the manifest variables, thereby replicating and extending previous results (Dinno, 2009; Garrido et al., 2013; Glorfeld, 1995; Peres-Neto et al., 2005). This is particularly noteworthy concerning the Hull method and the EKC, given that these methods explicitly assume a normal distribution (Braeken & van Assen, 2017; Lorenzo-Seva et al., 2011). SMT was the only extraction criterion that was adversely affected by non-normality in the latent variables in some conditions, consistent with evidence from confirmatory factor analysis (e.g., Auerswald & Moshagen, 2015; Foldnes & Grønneberg, 2015; Mair et al., 2012). In the present study, we only considered SMT using uncorrected ML-based χ² statistics, so a natural extension is to investigate SMT under non-normality with appropriate corrections, such as the Satorra-Bentler correction (Satorra & Bentler, 1994). Overall, however, the results of the present study indicate that the investigated extraction criteria can be applied safely under a wide range of distributional properties of the observed data.

Issues in Implementing PA

When PA is used to determine the number of factors, two choices need to be made. The first choice pertains to how to summarize the random reference eigenvalues to which the empirical eigenvalues are compared. Our results show that both variants of traditional PA displayed very similar hit rates for unidimensional, orthogonal, and correlated factor designs. Given the disadvantages of average-based PA in conditions with uncorrelated variables and the comparable hit rates otherwise, we recommend PA be used with the 95th percentile as a reference value.

The second choice pertains to the matrix from which the empirical and sampled eigenvalues are derived. The eigenvalues can be obtained either from the correlation matrix, corresponding to a PCA, or from a matrix in which the diagonal of the correlation matrix is replaced with the item communalities estimated by a common factor model. Because the primary purpose of empirical studies often is to uncover a set of latent variables that explain covariations among observed variables, the common factor model is usually recommended over PCA (e.g., Fabrigar et al., 1999; McArdle, 1990; Widaman, 1993). Traditional PA, on the other hand, typically uses the eigenvalues of the correlation matrix as a criterion, which could be considered inconsistent, because EFA is derived from the common factor model (Ford, MacCallum, & Tait, 1986; Humphreys & Montanelli, 1975). However, a common factor model determines both the eigenvalues used in a PCA and EFA. Indeed, Braeken and van Assen (2017) derived the distribution of eigenvalues of the correlation matrix for normally distributed observed variables from a common factor model. In contrast, the eigenvalues of a common factor model additionally depend on the method that estimates the communalities. Our simulations suggest that PA should be based on the PCA eigenvalues (see also Garrido et al., 2013). In conditions with few indicators per factor, PAEFA displayed lower hit rates than PAPCA and did not consistently improve with sample size. We see two factors that contribute to this finding. First, the comparison samples for PA seem inappropriate for EFA eigenvalues. Whereas EFA assumes common variance among the observed variables, the variables of the comparison samples are perfectly uncorrelated in the population. This leads to communality estimates close to zero for the comparison samples, but not for the empirical sample. The EFA eigenvalues, which are based on the correlation matrix with communalities on the diagonal, are adversely affected by the inappropriate comparison. Second, while PCA tends to overestimate the explained variance of common factors (Widaman, 1993), this overestimation affects both the empirical and the simulated sample alike. As we discussed in the section on general issues in factor extraction, the deciding factor appears to be the numerical difference between the last eigenvalue associated with a true factor and the first remaining eigenvalue, which is the same for EFA and PCA population eigenvalues. Note that we only used ML to estimate communalities for variants of PA based on a common factor model. Future studies should systematically vary the estimation procedure, including minimum rank factor analysis (Garrido et al., 2013) or estimating communalities as the multiple R² between one variable and the remaining variables (Crawford et al., 2010).

The inferiority of a common factor PA also explains why revised PA exhibited lower accuracies than traditional PAPCA in our study. Following the recommendations of Green et al. (2012), we implemented revised PA based on the common factor model. This difference likely resulted in the relatively low overall performance of revised PA despite its theoretical advantages. Future research should consider whether implementing revised PA based on PCA eigenvalues improves the performance of this approach.

Combination Rules

Given that no single approach displayed the highest accuracy in all conditions, we investigated whether performance can be maximized by jointly considering the outcomes of different extraction criteria. Indeed, performance increased considerably when multiple factor extraction criteria were used simultaneously. When SMT and either PAPCA-M, PAPCA-95, Hull, or the EKC agree (which occurred in 74%, 72%, 68%, and 68% of all simulated data sets, respectively), the correct number of factors is consistently identified. In the data sets where these methods disagreed, confident judgments could only be made when sample sizes were large (N ≥ 500). In these cases, CD, the EKC, or one of the variants of traditional PAPCA correctly identified the number of factors with moderate accuracies, whereas all other approaches performed more poorly.

SMT performed especially well in conditions with highly correlated factor models, but tended to overextract in the presence of moderate minor factors and displayed lower performance when data were not normally distributed. As such, a useful complement to SMT would be an extraction criterion that does not tend to overextract and is robust against both minor factors and non-normality. PAPCA-M performed well when data were non-normal or based on minor factors, but was reported to overextract in previous simulation studies (Glorfeld, 1995). Because PAPCA-95, Hull, and the EKC were both robust and did not overextract, we recommend combination rules consisting of SMT and one of the aforementioned criteria.

Limitations

The results of Monte Carlo studies should only be interpreted within the bounds of the realized conditions. One limitation of our study is that we only considered continuous response variables because we were also interested in the effect of non-normality in the observed variables. Changes in the distribution of the latent variables translate to changes in the distribution of the observed variables in a nontrivial way. Specifically, the same values of skewness and kurtosis in the observed ordinal variables can result from different skewness and kurtosis in the underlying continuous variables, depending on the thresholds chosen to obtain the ordinal variables. Nevertheless, future studies should also examine the performance of other factor extraction criteria for ordinal or dichotomous observed variables.

A second limitation pertains to the selection of the extraction criteria examined in the simulations. While we included a number of modern techniques that have not yet been thoroughly investigated, we did not consider methods that have been shown to be inferior to traditional PA in previous simulation studies (Peres-Neto et al., 2005; Raîche et al., 2013; Ruscio & Roche, 2012; Zwick & Velicer, 1986). These include fit indices of different structural equation models (e.g., Ruscio & Roche, 2012), the minimum average partial method (Velicer, 1976), and several nongraphical solutions for Cattell's scree test (e.g., Raîche et al., 2013). Overall, there are more than 40 criteria to assess the dimensionality of observed variables (Peres-Neto et al., 2005; Raîche et al., 2013; Ruscio & Roche, 2012), and our selection was based on their relevance for factor analysis in psychology and their performance in previous simulation studies. However, it might be possible that a criterion not considered here may improve combined hit rates when used in conjunction with another criterion.

Finally, the extraction criteria investigated in this study can be implemented in different ways. For example, the Hull method evaluates solutions based on a certain goodness-of-fit index, so different fit indices may lead to different results. In line with the recommendation by Lorenzo-Seva et al. (2011), we implemented the Hull method using the CFI. However, performance might improve when relying on another index (Moshagen & Auerswald, 2018). In particular, we would recommend studying the behavior of the Hull method with goodness-of-fit indices that do not account for model parsimony, such as the SRMR or McDonald's mc (McDonald, 1989). Furthermore, the chi-square statistic used in the SMT was directly derived from ML estimation. Given that the chi-square statistic is inflated in the presence of non-normal data, it would be interesting to investigate whether robust statistics (such as the SB-correction; Satorra & Bentler, 1994) improve performance in conditions involving non-normality. Likewise, the tendency of SMT to overextract in the presence of minor factors might be counteracted by altering the testing strategy, such as using balanced error probabilities (Moshagen & Erdfelder, 2016) or equivalence testing (Yuan, Chan, Marcoulides, & Bentler, 2016).

Conclusion

We investigated the performance of various criteria to determine the number of retained factors in EFA. Our results indicate that the highest accuracy can be obtained when considering the outcomes of several criteria simultaneously. In particular, we recommend that investigators compare the results of sequential χ² model tests and either PAPCA-95, Hull, or the EKC. If both methods suggest the same number of factors, this most often reflects the correct number of underlying factors. If the methods disagree, CD, the EKC, or one of the variants of traditional PAPCA are viable extraction criteria, provided that the sample is large. However, these conditions are generally associated with greater difficulties in identifying the number of factors for all approaches we investigated, so larger sample sizes are required to make confident decisions. In the suggested decision rule, disagreement between SMT and either PAPCA-95, Hull, or the EKC can thus serve as an indicator that the latent structure will be difficult to uncover.

The present study also investigated the effects of minor factors, cross-loadings, and non-normal distributions. Minor factors adversely affected extraction criteria only if they explained at least a moderate amount of variance. Cross-loadings increased the explained variance of true factors and therefore tended to increase the performance of extraction criteria. Non-normal distributions that were based on non-normal latent distributions led to decreased accuracy for SMT, while all other extraction criteria were virtually unaffected.

References

Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245–276. http://dx.doi.org/10.1207/s15327906mbr0102_10
Cattell, R. B., & Vogelmann, S. (1977). A comprehensive trial of the scree and KG criteria for determining the number of factors. Multivariate Behavioral Research, 12, 289–325. http://dx.doi.org/10.1207/s15327906mbr1203_2
Ceulemans, E., & Kiers, H. A. (2006). Selecting among three-mode principal component models of different types and complexities: A numerical convex hull based method. British Journal of Mathematical and Statistical Psychology, 59, 133–150. http://dx.doi.org/10.1348/000711005X64817
Cho, S.-J., Li, F., & Bandalos, D. (2009). Accuracy of the parallel analysis procedure with polychoric correlations. Educational and Psychological Measurement.
Glorfeld, L. W. (1995). An improvement on Horn's parallel analysis methodology for selecting the correct number of factors to retain. Educational and Psychological Measurement, 55, 377–393. http://dx.doi.org/10.1177/0013164495055003002
Green, S. B., Levy, R., Thompson, M. S., Lu, M., & Lo, W.-J. (2012). A proposed solution to the problem with using completely random data to assess the number of factors with parallel analysis. Educational and Psychological Measurement, 72, 357–374. http://dx.doi.org/10.1177/0013164411422252
Green, S. B., Thompson, M. S., Levy, R., & Lo, W.-J. (2015). Type I and type II error rates and overall accuracy of the revised parallel analysis method for determining the number of factors. Educational and Psychological Measurement, 75, 428–457. http://dx.doi.org/10.1177/0013164414546566
Guttman, L. (1954). Some necessary conditions for common-factor analysis. Psychometrika, 19, 149–161. http://dx.doi.org/10.1007/BF02289162
Hakstian, A. R., Rogers, W. T., & Cattell, R. B. (1982). The behavior of number-of-factors rules with simulated data. Multivariate Behavioral Research, 17, 193–219. http://dx.doi.org/10.1207/s15327906mbr1702_3
Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago, IL: University of Chicago Press.
Hayashi, K., Bentler, P. M., & Yuan, K.-H. (2007). On the likelihood ratio test for the number of factors in exploratory factor analysis. Structural Equation Modeling, 14, 505–526. http://dx.doi.org/10.1080/10705510701301891
Hayton, J. C., Allen, D. G., & Scarpello, V. (2004). Factor retention decisions in exploratory factor analysis: A tutorial on parallel analysis. Organizational Research Methods, 7, 191–205. http://dx.doi.org/10.1177/1094428104263675
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393–416. http://dx.doi.org/10.1177/0013164405282485
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185. http://dx.doi.org/10.1007/BF02289447
Hubbard, R., & Allen, S. J. (1987). An empirical comparison of alternative methods for principal component extraction. Journal of Business Research, 15, 173–190. http://dx.doi.org/10.1016/0148-2963(84)90047-X
Humphreys, L. G., & Ilgen, D. R. (1969). Note on a criterion for the number of common factors. Educational and Psychological Measurement, 29, 571–578. http://dx.doi.org/10.1177/001316446902900303
Humphreys, L. G., & Montanelli, R. G. (1975). An investigation of the parallel analysis criterion for determining the number of common factors. Multivariate Behavioral Research, 10, 193–205. http://dx.doi.org/10.1207/s15327906mbr1002_5
IBM Corp. (2015). IBM SPSS Statistics for Windows, Version 23.0. Armonk, NY: Author.
Jackson, D. L., Gillaspy, J. A., & Purc-Stephenson, R. (2009). Reporting practices in confirmatory factor analysis: An overview and some recommendations. Psychological Methods, 14, 6. http://dx.doi.org/10.1037/a0014694
Jöreskog, K. G. (2007). Factor analysis and its extensions. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 47–77). Mahwah, NJ: Erlbaum.
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141–151. http://dx.doi.org/10.1177/001316446002000116
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9, 202–220. http://dx.doi.org/10.1177/1094428105284919
Lawley, D. N. (1940). The estimation of factor loadings by the method of maximum likelihood. Proceedings of the Royal Society of Edinburgh, 60, 64–82. http://dx.doi.org/10.1017/S037016460002006X
Linn, R. L. (1968). A Monte Carlo approach to the number of factors problem. Psychometrika, 33, 37–71.
Lorenzo-Seva, U., Timmerman, M. E., & Kiers, H. A. (2011). The Hull method for selecting the number of common factors. Multivariate Behavioral Research, 46, 340–364. http://dx.doi.org/10.1080/00273171.2011.564527
Mair, P., Satorra, A., & Bentler, P. M. (2012). Generating nonnormal multivariate data using copulas: Applications to SEM. Multivariate Behavioral Research, 47, 547–565. http://dx.doi.org/10.1080/00273171.2012.692629
Marčenko, V. A., & Pastur, L. A. (1967). Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik, 1, 457–483. http://dx.doi.org/10.1070/SM1967v001n04ABEH001994
McArdle, J. J. (1990). Principles versus principals of structural factor analyses. Multivariate Behavioral Research, 25, 81–87. http://dx.doi.org/10.1207/s15327906mbr2501_10
McDonald, R. P. (1989). An index of goodness-of-fit based on noncentrality. Journal of Classification, 6, 97–103. http://dx.doi.org/10.1007/BF01908590
McDonald, R. P. (1999). Test theory: A unified treatment. Hillsdale, NJ: Erlbaum.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166.
Moshagen, M., & Auerswald, M. (2018). On congruence and incongruence of measures of fit in structural equation modeling. Psychological Methods, 23, 318–336. http://dx.doi.org/10.1037/met0000122
Moshagen, M., & Erdfelder, E. (2016). A new strategy for testing structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 23, 54–60. http://dx.doi.org/10.1080/10705511.2014.950896
Mulaik, S. A. (2010). Foundations of factor analysis (2nd ed.). Boca Raton, FL: Chapman & Hall/CRC.
Peres-Neto, P. R., Jackson, D. A., & Somers, K. M. (2005). How many principal components? Stopping rules for determining the number of non-trivial axes revisited. Computational Statistics & Data Analysis, 49, 974–997. http://dx.doi.org/10.1016/j.csda.2004.06.015
Raîche, G., Riopel, M., & Blais, J.-G. (2006, June). Nongraphical solutions for the Cattell's scree test. Paper presented at the annual meeting of the Psychometric Society, Montreal, Quebec, Canada.
Raîche, G., Walls, T. A., Magis, D., Riopel, M., & Blais, J.-G. (2013). Non-graphical solutions for Cattell's scree test. Methodology, 9, 23–29. http://dx.doi.org/10.1027/1614-2241/a000051
R Core Team. (2016). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria: Author. Retrieved from https://www.R-project.org/
Revelle, W. (2015). psych: Procedures for psychological, psychometric, and personality research (R package version 1.5.8) [Computer software manual]. Retrieved from http://CRAN.R-project.org/package=psych
Ruscio, J., & Kaczetow, W. (2008). Simulating multivariate nonnormal data using an iterative algorithm. Multivariate Behavioral Research, 43, 355–381. http://dx.doi.org/10.1080/00273170802285693
Ruscio, J., & Roche, B. (2012). Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure. Psychological Assessment, 24, 282–292. http://dx.doi.org/10.1037/a0025697
Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. Eye & C. C. Clogg (Eds.), Latent variable analysis: Applications for developmental research (pp. 399–419). Thousand Oaks, CA: Sage.
Schmitt, T. A. (2011). Current methodological considerations in exploratory and confirmatory factor analysis. Journal of Psychoeducational Assessment, 29, 304–321. http://dx.doi.org/10.1177/0734282911406653
Schönemann, P. H., & Wang, M.-M. (1972). Some new results on factor indeterminacy. Psychometrika, 37, 61–91. http://dx.doi.org/10.1007/BF02291413
Steger, M. F. (2006). An illustration of issues in factor extraction and identification of dimensionality in psychological assessment data. Journal of Personality Assessment, 86, 263–272. http://dx.doi.org/10.1207/s15327752jpa8603_03
Stevens, J. P. (2009). Applied multivariate statistics for the social sciences. New York, NY: Taylor & Francis.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago, IL: University of Chicago Press.
Turner, N. E. (1998). The effect of common variance and structure pattern on random data eigenvalues: Implications for the accuracy of parallel analysis. Educational and Psychological Measurement, 58, 541–568. http://dx.doi.org/10.1177/0013164498058004001
Velicer, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41, 321–327. http://dx.doi.org/10.1007/BF02293557
Velicer, W. F., Eaton, C. A., & Fava, J. L. (2000). Construct explication through factor or component analysis: A review and evaluation of alternative procedures for determining the number of factors or components. In R. D. Goffin & E. Helmes (Eds.), Problems and solutions in human assessment: Honoring Douglas N. Jackson at seventy (pp. 41–71). Boston, MA: Kluwer Academic.
Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential bias in representing model parameters? Multivariate Behavioral Research, 28, 263–311. http://dx.doi.org/10.1207/s15327906mbr2803_1
Wood, J. M., Tataryn, D. J., & Gorsuch, R. L. (1996). Effects of under- and overextraction on principal axis factor analysis with varimax rotation. Psychological Methods, 1, 354–365. http://dx.doi.org/10.1037/1082-989X.1.4.354
Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34, 806–838. http://dx.doi.org/10.1177/0011000006288127
Yuan, K.-H., Chan, W., Marcoulides, G. A., & Bentler, P. M. (2016). Assessing structural equation models by equivalence testing with adjusted fit indexes. Structural Equation Modeling: A Multidisciplinary Journal, 23, 319–330. http://dx.doi.org/10.1080/10705511.2015.1065414
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of components to retain. Psychological Bulletin, 99, 432–442. http://dx.doi.org/10.1037/0033-2909.99.3.432
Appendix
Eigenvalues and Explained Variance
The goal of this section is to explain the correspondence between the explained variance of the common factor model and the eigenvalues of the matrix of correlations $R_C$ with communalities on the diagonal, assuming that the (hypothetical) data fit the common factor model perfectly. Note that the explained variance in a PCA can be derived similarly if we use the correlation matrix $R$ instead of $R_C$.

Suppose we have standardized observed variables $X = (x_1, \ldots, x_p)^T$ with correlation matrix $R$ from which we partial out the uniqueness $\Delta$. We denote the resulting variables as $X_C = (x_{C1}, \ldots, x_{Cp})^T$, so that

$$x_i = x_{Ci} + \delta_i, \quad 1 \le i \le p, \tag{18}$$

which implies

$$E(XX^T) = E(X_C X_C^T) + \Delta. \tag{19}$$

$X$ is standardized, so

$$E(XX^T) = R, \tag{20}$$

with

$$R = R_C + \Delta. \tag{21}$$

As can be seen from Equations 19 and 21, the covariance matrix of $X_C$ is equal to $R_C$, because

$$R_C = R - \Delta \tag{22}$$
$$= E(XX^T) - \Delta \tag{23}$$
$$= E(X_C X_C^T). \tag{24}$$

We denote the values of $X_C$ associated with the observations in $X$ as $x_{Ck}$, $1 \le k \le N$ for $N$ observations, and try to find factors that linearly explain the variation in $X_C$. This is equivalent to finding lines onto which we project each $x_{Ck}$ such that the variance of the length of the projections is maximal (and the variance of the distances to the line is minimal). A line is a set of points that satisfy

$$x = \alpha\eta, \tag{25}$$

where $\eta$ is a vector of length $p$ and $\alpha \in \mathbb{R}$. The length of the projection of $x_{Ck}$ on this line is

$$\frac{\langle x_{Ck}, \eta\rangle}{\lVert\eta\rVert}. \tag{26}$$

Note that the length of $\eta$ does not change the line in Equation 25, so that we can set $\lVert\eta\rVert = 1$ without loss of generality. The length of the projections then is $\langle x_{Ck}, \eta\rangle$.
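The relations in Equations 18 through 24 can be illustrated numerically. The following sketch (the loadings are illustrative values chosen for the example, not values from the article) builds the model-implied correlation matrix of a one-factor model and checks that subtracting the uniquenesses $\Delta$ yields the reduced correlation matrix $R_C$ with the communalities on the diagonal:

```python
import numpy as np

# Illustrative one-factor model: 4 standardized indicators
# with loadings 0.8, 0.7, 0.6, 0.5 (example values only).
loadings = np.array([0.8, 0.7, 0.6, 0.5])

# Model-implied correlation matrix R with unit diagonal, and the
# diagonal matrix of uniquenesses Delta with entries 1 - h_i^2.
R = np.outer(loadings, loadings)
np.fill_diagonal(R, 1.0)
Delta = np.diag(1.0 - loadings**2)

# Reduced correlation matrix R_C = R - Delta (Equation 22):
# the communalities h_i^2 = 0.64, 0.49, 0.36, 0.25 replace the unit diagonal.
R_C = R - Delta
print(np.diag(R_C))

# Sanity check of Equation 21: R = R_C + Delta.
assert np.allclose(R, R_C + Delta)
```
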
In order to maximize the variance of $\langle x_{Ck}, \eta\rangle$, we first obtain the average of the projections. In this step, we utilize that the vector $\eta$ is part of an orthonormal basis of our vector space $\mathbb{R}^p$. An orthonormal basis is a set of $p$ linearly independent vectors (each of length 1) that can express any vector $\eta^* \in \mathbb{R}^p$ as a linear combination of elements of the basis. We denote the orthonormal basis that contains $\eta$ as

$$\{\eta, \eta_2', \ldots, \eta_p'\}. \tag{27}$$

We can rewrite every $x_{Ck}$ as

$$x_{Ck} = \alpha_{1k}\eta + \alpha_{2k}\eta_2' + \cdots + \alpha_{pk}\eta_p', \tag{28}$$

because $\{\eta, \eta_2', \ldots, \eta_p'\}$ is an orthonormal basis, so that

$$x_{Ck} = \alpha_{1k}\eta + \sum_{i=2}^{p} \alpha_{ik}\eta_i' \tag{29}$$

$$\Rightarrow \quad \eta^T x_{Ck} = \eta^T \alpha_{1k}\eta + \eta^T \sum_{i=2}^{p} \alpha_{ik}\eta_i' \tag{30}$$

$$\Rightarrow \quad \eta^T x_{Ck} = \alpha_{1k}\,\eta^T\eta + \sum_{i=2}^{p} \alpha_{ik}\,\eta^T\eta_i' \tag{31}$$

$$\Rightarrow \quad \eta^T x_{Ck} = \alpha_{1k}. \tag{32}$$

In the last step, we used that $\eta \perp \eta_i'$, $2 \le i \le p$, and $\eta^T\eta = \lVert\eta\rVert = 1$. The mean of the projections therefore is

$$\frac{1}{N}\sum_{k=1}^{N} \alpha_{1k} = \frac{1}{N}\sum_{k=1}^{N} \eta^T x_{Ck} \tag{33}$$

$$= \eta^T\left(\frac{1}{N}\sum_{k=1}^{N} x_{Ck}\right) \tag{34}$$

$$= 0 \tag{35}$$

because $x_{Ck}$ is centered. We can therefore obtain the variance of the length of the projections of $x_{Ck}$ as

$$\frac{1}{N-1}\sum_{k=1}^{N} \langle x_{Ck}, \eta\rangle^2 = \frac{1}{N-1}\sum_{k=1}^{N} (x_{Ck}^T \eta)^2 \tag{36}$$

$$= \frac{1}{N-1}\sum_{k=1}^{N} \eta^T x_{Ck} x_{Ck}^T \eta \tag{37}$$

$$= \eta^T\left(\frac{1}{N-1}\sum_{k=1}^{N} x_{Ck} x_{Ck}^T\right)\eta \tag{38}$$

$$= \eta^T R_C\, \eta. \tag{39}$$

The variance of the length of the projections is $\eta^T R_C\, \eta$, and we try to obtain its maximum. We denote the eigenvectors of $R_C$ as $e_1, \ldots, e_p$ and the corresponding eigenvalues as $\lambda_1, \ldots, \lambda_p$, such that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. If we choose $\eta = e_1$, the variance is

$$e_1^T R_C e_1 = e_1^T(\lambda_1 e_1) = \lambda_1. \tag{40}$$

The first eigenvalue corresponds to the explained variance if we choose the eigenvector $e_1$ as a projection line. Suppose we choose any other vector $\eta$ as a projection line. The eigenvectors $e_1, \ldots, e_p$ form an orthonormal basis of our space. We can therefore rewrite $\eta$ as

$$\eta = \langle e_1, \eta\rangle e_1 + \langle e_2, \eta\rangle e_2 + \cdots + \langle e_p, \eta\rangle e_p = \sum_{i=1}^{p}\langle e_i, \eta\rangle e_i. \tag{41}$$

The variance of the length of the projections for $\eta$ then is

$$\left(\sum_{i=1}^{p}\langle e_i, \eta\rangle e_i\right)^T R_C \left(\sum_{i=1}^{p}\langle e_i, \eta\rangle e_i\right) = \left(\sum_{i=1}^{p}\langle e_i, \eta\rangle e_i\right)^T \left(\sum_{i=1}^{p}\langle e_i, \eta\rangle R_C e_i\right) \tag{42}$$

$$= \left(\sum_{i=1}^{p}\langle e_i, \eta\rangle e_i\right)^T \left(\sum_{i=1}^{p}\langle e_i, \eta\rangle \lambda_i e_i\right) \tag{43}$$

$$= \sum_{i=1}^{p}\langle e_i, \eta\rangle^2 \lambda_i \lVert e_i\rVert^2. \tag{44}$$

In the last step, we used that $e_i \perp e_{i'}$ for $1 \le i, i' \le p$ and $i \ne i'$. Note that the eigenvectors are standardized, so that $\lVert e_i\rVert = 1$. Further note that $\langle e_i, \eta\rangle^2 \ge 0$ and

$$\sum_{i=1}^{p}\langle e_i, \eta\rangle^2 = 1 \tag{45}$$

because $\lVert\eta\rVert = 1$. Therefore, the variance of the length of the projections for $\eta$ is a weighted sum of eigenvalues in which the weights are all non-negative and sum to one, such that

$$\sum_{i=1}^{p}\langle e_i, \eta\rangle^2 \lambda_i \le \lambda_1. \tag{46}$$

Hence, $\eta = e_1$ obtains a maximum of explained variance. If we choose a second factor, we choose a line orthogonal to $e_1$ and, by analogy, arrive at the conclusion that $\eta = e_2$ with corresponding explained variance $\lambda_2$. For $m$ extracted factors, the explained variance is

$$\sum_{j=1}^{m} \lambda_j. \tag{47}$$

Received March 13, 2017
Revision received August 12, 2018
Accepted August 21, 2018 ■
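The maximization argument in Equations 40 through 47 can also be checked numerically. The following sketch (again with illustrative loadings, not values from the article) verifies that the projection variance $\eta^T R_C\, \eta$ equals a weighted sum of eigenvalues for any unit vector $\eta$ and is maximized by the first eigenvector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Reduced correlation matrix for a perfectly fitting one-factor model:
# R_C = loadings loadings^T, so its diagonal holds the communalities and
# it has rank 1 (lambda_1 equals the total common variance, the rest are 0).
loadings = np.array([0.8, 0.7, 0.6, 0.5])  # illustrative values
R_C = np.outer(loadings, loadings)

# Eigendecomposition; eigh returns ascending order, so sort descending
# to obtain lambda_1 >= ... >= lambda_p.
eigvals, eigvecs = np.linalg.eigh(R_C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
e1 = eigvecs[:, 0]

# Equation 40: the variance along e1 equals the first eigenvalue.
assert np.isclose(e1 @ R_C @ e1, eigvals[0])

# Equations 41-46: for any unit vector eta, eta^T R_C eta is a weighted
# sum of eigenvalues with non-negative weights summing to 1, hence <= lambda_1.
for _ in range(200):
    eta = rng.normal(size=loadings.size)
    eta /= np.linalg.norm(eta)
    weights = (eigvecs.T @ eta) ** 2
    assert np.isclose(eta @ R_C @ eta, weights @ eigvals)
    assert eta @ R_C @ eta <= eigvals[0] + 1e-12

# Equation 47: explained variance of the first m factors is the sum of
# the m largest eigenvalues (here m = 1 suffices for a one-factor model).
print(eigvals)
```

Because the example data fit a one-factor model perfectly, the first eigenvalue equals the sum of the communalities and the remaining eigenvalues are zero; with sampled data or additional factors, the same weighted-sum bound applies unchanged.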