12.5 Interpreting output
Let’s now look at how to interpret the output of a factor analysis, including how to make sense of the main numbers you get from a basic EFA/PCA output.
12.5.1 Extraction methods
PCA only has one method of deriving the eigenvalues of the components, and so it will give the same answer every time you run it on the same dataset. EFA, however, has multiple possible ways of estimating factors, which are typically termed extraction methods. We won’t go into the details of how exactly they work, but the key thing to know is that the extraction method changes how the factor loadings are estimated, meaning that different methods can give somewhat different solutions.
There are three extraction methods available in Jamovi:
- Maximum likelihood. One of the most common options; it provides the most generalisable and robust estimates. However, ML methods assume normality and generally require large samples.
- Principal axis factoring. This method does not make particular assumptions about the normality or distribution of the data, meaning that it is good at handling more complex datasets.
- Minimum residuals. The minimum residual method is something of a middle ground between the two, but isn’t as commonly used in psychological research.
Under ideal conditions, maximum likelihood (ML) and principal axis (PA) methods will generally give very similar estimates of the factors. When data are severely non-normal (or you anticipate that they will be), it is better to go with PA in the first instance. Otherwise, ML estimates are generally the way to go.
To run a factor analysis in R, we use the fa() function from the psych package. At minimum, we must specify the following:
- The dataset as the first argument
- The number of factors we want to extract
- The type of rotation - for now set this to “none”, as we’ll talk about this later
- The factoring method, fm - as above, "ml" stands for maximum likelihood, "pa" stands for principal axis factoring and "minres" stands for minimum residuals.
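Putting these together, a call looks like the following. This is a minimal sketch: it assumes the data are in a data frame called saq (as in the output below) and that the fitted object is saved under the name efa, which is our choice of name rather than anything required.

```r
library(psych)

# EFA with 3 factors, no rotation, maximum likelihood extraction
efa <- fa(saq, nfactors = 3, rotate = "none", fm = "ml")

# Printing the object shows the factor matrix and fit statistics
efa
```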
12.5.2 Interpreting output
Below is the main output of our EFA. This is what we call a factor matrix:
## Factor Analysis using method = ml
## Call: fa(r = saq, nfactors = 3, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 ML3 h2 u2 com
## q01 0.59 0.31 -0.12 0.46 0.54 1.6
## q02 -0.25 0.25 0.32 0.23 0.77 2.8
## q04 0.62 0.23 -0.02 0.44 0.56 1.3
## q05 0.56 0.19 -0.06 0.36 0.64 1.3
## q06 0.55 -0.21 0.32 0.45 0.55 2.0
## q14 0.63 -0.12 0.10 0.42 0.58 1.1
## q15 0.55 -0.15 0.10 0.34 0.66 1.2
## q19 -0.37 0.21 0.22 0.23 0.77 2.3
## q22 -0.28 0.31 0.24 0.24 0.76 2.9
##
## ML1 ML2 ML3
## SS loadings 2.33 0.47 0.35
## Proportion Var 0.26 0.05 0.04
## Cumulative Var 0.26 0.31 0.35
## Proportion Explained 0.74 0.15 0.11
## Cumulative Proportion 0.74 0.89 1.00
##
## Mean item complexity = 1.8
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 36 with the objective function = 1.43 with Chi Square = 3674.74
## df of the model are 12 and the objective function was 0.01
##
## The root mean square of the residuals (RMSR) is 0.01
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic n.obs is 2571 with the empirical chi square 28.51 with prob < 0.0046
## The total n.obs was 2571 with Likelihood Chi Square = 33.87 with prob < 0.00071
##
## Tucker Lewis Index of factoring reliability = 0.982
## RMSEA index = 0.027 and the 90 % confidence intervals are 0.016 0.037
## BIC = -60.35
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML1 ML2 ML3
## Correlation of (regression) scores with factors 0.89 0.65 0.59
## Multiple R square of scores with factors 0.79 0.43 0.35
## Minimum correlation of possible factor scores 0.59 -0.15 -0.31
What do these numbers mean? Firstly, each value in each factor column (labelled ML1, ML2 etc.) gives us our loadings. Note that, as per the common factor model we described before, we estimate a loading for every variable on every factor. However, this can be messy to interpret, so we generally choose to suppress (not remove) loadings below a certain threshold; by default, Jamovi will hide loadings below 0.3. We can also sort the items based on their loadings. To do this, we use the fa.sort() function and feed in our EFA object. We can then use print() to clean up the output with two arguments: digits to control rounding, and cut to suppress values below a certain size. This gives us a nicer output:
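As a minimal sketch of those two steps (assuming the fitted EFA object from earlier is saved as efa):

```r
# Sort items by their strongest loadings, then print with
# 3 decimal places and loadings below 0.3 suppressed
print(fa.sort(efa), digits = 3, cut = 0.3)
```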
## Factor Analysis using method = ml
## Call: fa(r = saq, nfactors = 3, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 ML3 h2 u2 com
## q14 0.631 0.420 0.580 1.12
## q04 0.622 0.441 0.559 1.27
## q01 0.589 0.311 0.458 0.542 1.61
## q05 0.562 0.358 0.642 1.26
## q15 0.551 0.337 0.663 1.23
## q06 0.546 0.324 0.448 0.552 1.97
## q19 -0.371 0.229 0.771 2.27
## q22 0.314 0.238 0.762 2.88
## q02 0.319 0.229 0.771 2.84
##
## ML1 ML2 ML3
## SS loadings 2.333 0.474 0.352
## Proportion Var 0.259 0.053 0.039
## Cumulative Var 0.259 0.312 0.351
## Proportion Explained 0.738 0.150 0.111
## Cumulative Proportion 0.738 0.889 1.000
##
## Mean item complexity = 1.8
## Test of the hypothesis that 3 factors are sufficient.
##
## df null model = 36 with the objective function = 1.432 with Chi Square = 3674.737
## df of the model are 12 and the objective function was 0.013
##
## The root mean square of the residuals (RMSR) is 0.012
## The df corrected root mean square of the residuals is 0.021
##
## The harmonic n.obs is 2571 with the empirical chi square 28.515 with prob < 0.00465
## The total n.obs was 2571 with Likelihood Chi Square = 33.874 with prob < 0.000706
##
## Tucker Lewis Index of factoring reliability = 0.982
## RMSEA index = 0.0266 and the 90 % confidence intervals are 0.0163 0.0374
## BIC = -60.351
## Fit based upon off diagonal values = 0.998
## Measures of factor score adequacy
## ML1 ML2 ML3
## Correlation of (regression) scores with factors 0.892 0.652 0.587
## Multiple R square of scores with factors 0.795 0.425 0.345
## Minimum correlation of possible factor scores 0.590 -0.150 -0.310
Statistically speaking, each loading in a factor matrix is a regression coefficient for the latent factor predicting the observed variable, and we can interpret it much as we would a normal regression coefficient. For example, the loading for q14 on factor 1 is 0.631. This means that for every 1-unit increase on latent factor 1, scores on Q14 increase by 0.631 units (these are standardised loadings, so the units here are standardised ones).
Jamovi typically only gives you the uniqueness column, u2, which is the proportion of variance in each item that is not explained by the factors we have chosen. In this instance, 58% of the variance in Q14 (u2 = 0.580) is not explained by the three factors.
Generally though, it’s easier to think in terms of communalities, the values in the h2 column, which give the proportion of variance that is explained by the factors. A communality is simply 1 minus the uniqueness; thus, the communality of Q14 is 0.420, which indicates that 42% of the variance in Q14 is explained by the three factors. Higher communalities indicate that the factors collectively explain more variance in the observed variable.
How is this value calculated? Communalities are the sum of the squared factor loadings. Therefore, we can look at Q14’s factor loadings in the first (pre-sorted) output table, and calculate the communality as:
\[ h^2 = .631^2 + (-.116)^2 + .0963^2 \]
Which gives us an answer of approximately 0.420:
## [1] 0.4208907
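We can reproduce this calculation directly in R, using the same (more precise) loadings as in the equation above:

```r
# Communality of q14 = sum of its squared loadings
# across the three factors
sum(c(0.631, -0.116, 0.0963)^2)
#> [1] 0.4208907
```

In practice, the communalities can also be pulled straight from the fitted object via its communality element (e.g. efa$communality, assuming the object is named efa as earlier).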
12.5.3 Total variance explained
The output of psych will also give you a brief table of the total amount of variance explained by each factor. This is shown in the output of fa, but can also be accessed on its own, as below. It is generally useful to at least report the total cumulative variance explained by all factors. In this case, the three factors collectively explain 35.1% of the total variance in the data (shown by the row labelled “Cumulative Var”).
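A sketch of how to pull this table out directly (assuming the fitted object is named efa; psych stores this table in the Vaccounted element of the fa object):

```r
# Variance-explained table: SS loadings, proportion and
# cumulative variance for each factor
efa$Vaccounted
```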
## ML1 ML2 ML3
## SS loadings 2.3325379 0.47409809 0.35202983
## Proportion Var 0.2591709 0.05267757 0.03911443
## Cumulative Var 0.2591709 0.31184845 0.35096287
## Proportion Explained 0.7384567 0.15009441 0.11144890
## Cumulative Proportion 0.7384567 0.88855110 1.00000000