@@ -1,12 +1,12 @@
#' Principal Component Analysis (PCA) and Factor Analysis (FA)
#'
- #' The functions `principal_components()` and `factor_analysis()` can
- #' be used to perform a principal component analysis (PCA) or a factor analysis
- #' (FA). They return the loadings as a data frame, and various methods and
- #' functions are available to access / display other information (see the
- #' Details section).
+ #' The functions `principal_components()` and `factor_analysis()` can be used to
+ #' perform a principal component analysis (PCA) or a factor analysis (FA). They
+ #' return the loadings as a data frame, and various methods and functions are
+ #' available to access / display other information (see the 'Details' section).
#'
- #' @param x A data frame or a statistical model.
+ #' @param x A data frame or a statistical model. For `closest_component()`, the
+ #' output of the `principal_components()` function.
#' @param n Number of components to extract. If `n="all"`, then `n` is set as
#' the number of variables minus 1 (`ncol(x)-1`). If `n="auto"` (default) or
#' `n=NULL`, the number of components is selected through [`n_factors()`]
@@ -19,12 +19,29 @@
#' @param rotation If not `"none"`, the PCA / FA will be computed using the
#' **psych** package. Possible options include `"varimax"`, `"quartimax"`,
#' `"promax"`, `"oblimin"`, `"simplimax"`, or `"cluster"` (and more). See
- #' [`psych::fa()`] for details.
+ #' [`psych::fa()`] for details. The default is `"none"` for PCA, and
+ #' `"oblimin"` for FA.
+ #' @param factor_method The factoring method to be used. Passed to the `fm`
+ #' argument in `psych::fa()`. Defaults to `"minres"` (minimum residual). Other
+ #' options include `"uls"`, `"ols"`, `"wls"`, `"gls"`, `"ml"`, `"minchi"`,
+ #' `"minrank"`, `"old.min"`, and `"alpha"`. See `?psych::fa` for details.
#' @param sparse Whether to compute sparse PCA (SPCA, using [`sparsepca::spca()`]).
#' SPCA attempts to find sparse loadings (with few nonzero values), which improves
#' interpretability and avoids overfitting. Can be `TRUE` or `"robust"` (see
#' [`sparsepca::robspca()`]).
#' @param sort Sort the loadings.
+ #' @param n_obs An integer or a matrix.
+ #' - **Integer:** Number of observations in the original data set if `x` is a
+ #' correlation matrix. Required to compute correct fit indices.
+ #' - **Matrix:** A matrix where each cell `[i, j]` specifies the number of
+ #' pairwise complete observations used to compute the correlation between
+ #' variable `i` and variable `j` in the input `x`. It is crucial when `x` is
+ #' a correlation matrix (rather than raw data), especially if that matrix
+ #' was derived from a dataset containing missing values using pairwise
+ #' deletion. Providing a matrix allows `psych::fa()` to accurately calculate
+ #' statistical measures, such as chi-square fit statistics, by accounting
+ #' for the varying sample sizes that contribute to each individual
+ #' correlation coefficient.
#' @param threshold A value between 0 and 1 indicates which (absolute) values
#' from the loadings should be removed. An integer higher than 1 indicates the
#' n strongest loadings to retain. Can also be `"max"`, in which case it will
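
Note on the matrix form of `n_obs` documented in this hunk: a matrix of pairwise complete observation counts can be derived from raw data with base R alone. The following is a minimal sketch under stated assumptions. The toy data frame `dat` and the names `observed`, `n_pairs`, and `r` are illustrative, not part of the change. The final call is left commented out because it presumes a version of `factor_analysis()` that already includes the `n_obs` argument introduced here.

```r
# Toy data with scattered missing values (hypothetical, for illustration only)
dat <- data.frame(
  a = c(1, 2, NA, 4, 5, 6),
  b = c(2, NA, 3, 5, 4, 7),
  c = c(1, 3, 2, NA, 6, 5)
)

# 1 where a value is observed, 0 where it is missing
observed <- (!is.na(as.matrix(dat))) + 0L

# Cell [i, j] counts the rows in which both variable i and variable j are observed
n_pairs <- crossprod(observed)

# Correlation matrix computed with pairwise deletion
r <- cor(dat, use = "pairwise.complete.obs")

print(n_pairs)

# Assumed usage, following the documentation in this diff (not run here):
# factor_analysis(r, n = 1, n_obs = n_pairs)
```

The diagonal of `n_pairs` holds the number of non-missing values per variable, and the off-diagonal cells hold the sample size behind each individual correlation.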
@@ -46,7 +63,6 @@
#' with missing values from the original data, hence the number of rows of
#' predicted data and original data is equal.
#' @param ... Arguments passed to or from other methods.
- #' @param pca_results The output of the `principal_components()` function.
#' @param digits Argument for `print()`, indicates the number of digits
#' (rounding) to be used.
#' @param labels Argument for `print()`, character vector of same length as
@@ -83,7 +99,7 @@
#' values, so it matches the original data frame.
#'
#' - `performance::item_omega()` is a convenient wrapper around `psych::omega()`,
- #' which provides some additioal methods to work seamleassly within the
+ #' which provides some additional methods to work seamlessly within the
#' *easystats* framework.
#'
#' - [`performance::check_normality()`] checks residuals from objects returned
@@ -134,14 +150,15 @@
#'
#' ## Computing Item Scores
#' Use [`get_scores()`] to compute scores for the "subscales" represented by the
- #' extracted principal components. `get_scores()` takes the results from
- #' `principal_components()` and extracts the variables for each component found
- #' by the PCA. Then, for each of these "subscales", raw means are calculated
- #' (which equals adding up the single items and dividing by the number of items).
- #' This results in a sum score for each component from the PCA, which is on the
- #' same scale as the original, single items that were used to compute the PCA.
- #' One can also use `predict()` to back-predict scores for each component,
- #' to which one can provide `newdata` or a vector of `names` for the components.
+ #' extracted principal components or factors. `get_scores()` takes the results
+ #' from `principal_components()` or `factor_analysis()` and extracts the
+ #' variables for each component found by the PCA. Then, for each of these
+ #' "subscales", raw means are calculated (which equals adding up the single
+ #' items and dividing by the number of items). This results in a sum score for
+ #' each component from the PCA, which is on the same scale as the original,
+ #' single items that were used to compute the PCA. One can also use `predict()`
+ #' to back-predict scores for each component, to which one can provide `newdata`
+ #' or a vector of `names` for the components.
#'
#' ## Explained Variance and Eingenvalues
#' Use `summary()` to get the Eigenvalues and the explained variance for each
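
Note on the "raw means" wording in this hunk: the arithmetic can be sketched in base R as below. The item data, the `assignment` vector, and the component names are hypothetical stand-ins. This is not the implementation of `get_scores()`, only an illustration of the computation the documentation describes.

```r
# Hypothetical item responses on a 1-5 scale
items <- data.frame(
  q1 = c(1, 2, 3),
  q2 = c(3, 4, 5),
  q3 = c(5, 5, 5)
)

# Hypothetical assignment of each item to the component it belongs to
assignment <- c(q1 = "Component_1", q2 = "Component_1", q3 = "Component_2")

# For each component, average its items row by row
# (sum of the items divided by the number of items)
scores <- sapply(
  split(names(assignment), assignment),
  function(vars) rowMeans(items[, vars, drop = FALSE])
)

print(scores)
```

Because each score is a mean of its items, it stays on the same scale as the items themselves, which is the property the documentation points out.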
@@ -213,9 +230,9 @@
#'
#' # Factor Analysis (FA) ------------------------
#'
- #' factor_analysis(mtcars[, 1:7], n = "all", threshold = 0.2)
- #' factor_analysis(mtcars[, 1:7], n = 2, rotation = "oblimin", threshold = "max", sort = TRUE)
- #' factor_analysis(mtcars[, 1:7], n = 2, threshold = 2, sort = TRUE)
+ #' factor_analysis(mtcars[, 1:7], n = "all", threshold = 0.2, rotation = "Promax")
+ #' factor_analysis(mtcars[, 1:7], n = 2, threshold = "max", sort = TRUE)
+ #' factor_analysis(mtcars[, 1:7], n = 2, rotation = "none", threshold = 2, sort = TRUE)
#'
#' efa <- factor_analysis(mtcars[, 1:5], n = 2)
#' summary(efa)
@@ -234,9 +251,9 @@ principal_components <- function(x, ...) {

#' @rdname principal_components
#' @export
- rotated_data <- function(pca_results, verbose = TRUE) {
-   original_data <- attributes(pca_results)$dataset
-   rotated_matrix <- insight::get_predicted(attributes(pca_results)$model)
+ rotated_data <- function(x, verbose = TRUE) {
+   original_data <- attributes(x)$dataset
+   rotated_matrix <- insight::get_predicted(attributes(x)$model)
  out <- NULL

  if (is.null(original_data) || is.null(rotated_matrix)) {
@@ -246,7 +263,7 @@ rotated_data <- function(pca_results, verbose = TRUE) {
    return(NULL)
  }

-   compl_cases <- attributes(pca_results)$complete_cases
+   compl_cases <- attributes(x)$complete_cases
  if (is.null(compl_cases) && nrow(original_data) != nrow(rotated_matrix)) {
    if (verbose) {
      insight::format_warning("Could not retrieve information about missing data.")