15 Performing Factor Analysis

For a data analyst, the goal of a factor analysis is to reduce the number of variables needed to explain and interpret the results. This can be accomplished in two steps:

  1. factor extraction
  2. factor rotation

Factor extraction involves making a choice about the type of model as well as the number of factors to extract. Factor rotation comes after the factors are extracted, with the goal of achieving simple structure in order to improve interpretability.

Extracting Factors

There are two approaches to factor extraction, which stem from different approaches to variance partitioning: (a) principal components analysis and (b) common factor analysis.

Principal Components Analysis

Unlike factor analysis, principal components analysis (PCA) assumes that there is no unique variance: the total variance is equal to the common variance. Recall that variance can be partitioned into common and unique variance. If there is no unique variance, then common variance takes up the total variance. Additionally, if the total variance is 1, then the common variance is equal to the communality.
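In symbols: if h² denotes an item's common variance (its communality) and u² its unique variance, then for a standardized item h² + u² = 1. PCA's assumption is that u² = 0, so that h² = 1.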

The goal of a PCA is to replicate the correlation matrix using a set of components that are fewer in number than, and linear combinations of, the original set of items. Although the following analysis defeats the purpose of doing a PCA, we will begin by extracting as many components as possible, both as a teaching exercise and so that we can later decide on the optimal number of components to extract.

First go to Analyze – Dimension Reduction – Factor. Move all the observed variables over to the Variables: box to be analyzed.

Under Extraction – Method, pick Principal components and make sure to Analyze the Correlation matrix. We also request the Unrotated factor solution and the Scree plot. Under Extract, choose Fixed number of factors, and under Factors to extract enter 8. We also bumped up the Maximum Iterations for Convergence to 100.

The equivalent SPSS syntax is shown below:

FACTOR
 /VARIABLES q01 q02 q03 q04 q05 q06 q07 q08
 /MISSING LISTWISE
 /ANALYSIS q01 q02 q03 q04 q05 q06 q07 q08
 /PRINT INITIAL EXTRACTION
 /PLOT EIGEN
 /CRITERIA FACTORS(8) ITERATE(100)
 /EXTRACTION PC
 /ROTATION NOROTATE
 /METHOD=CORRELATION.

Eigenvalues and Eigenvectors

Before we get into the SPSS output, let’s understand a few things about eigenvalues and eigenvectors.

Eigenvalues represent the total amount of variance that can be explained by a given principal component.  They can be positive or negative in theory, but in practice they explain variance which is always positive.

  • If eigenvalues are greater than zero, then it’s a good sign.
  • Since variance cannot be negative, negative eigenvalues imply the model is ill-conditioned.
  • Eigenvalues close to zero imply there is item multicollinearity, since all the variance can be taken up by the first component.

Eigenvalues are also the sum of squared component loadings across all items for each component, which represent the amount of variance in each item that can be explained by the principal component.

Eigenvectors represent a weight for each eigenvalue. The eigenvector times the square root of the eigenvalue gives the component loadings, which can be interpreted as the correlation of each item with the principal component. For this particular PCA of the SAQ-8, the eigenvector weight associated with Item 1 on the first component is 0.377, and the eigenvalue of the first component is 3.057. We can calculate the loading of Item 1 on the first component as (0.377)(√3.057) = 0.659.

In this case, we can say that the correlation of the first item with the first component is 0.659. Let's now move on to the component matrix.
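For readers who want to see this relationship concretely, here is a minimal sketch in Python with NumPy (not part of the SPSS workflow; the 3×3 correlation matrix is made up purely for illustration) showing that loadings are eigenvectors scaled by the square roots of their eigenvalues:

import numpy as np

# Hypothetical correlation matrix, for illustration only
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

# eigh is appropriate for symmetric matrices; eigenvalues come back ascending
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]            # sort components largest-first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loading of item i on component j = eigenvector element * sqrt(eigenvalue)
loadings = eigvecs * np.sqrt(eigvals)

# With all components retained, the loadings reproduce R exactly
assert np.allclose(loadings @ loadings.T, R)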

Component Matrix

The components can be interpreted as the correlation of each item with the component. Each item has a loading corresponding to each of the 8 components. For example, Item 1 is correlated 0.659 with the first component, 0.136 with the second component and −0.398 with the third, and so on.

The square of each loading represents the proportion of variance (think of it as an R² statistic) explained by a particular component. For Item 1, (0.659)² = 0.434, or 43.4% of its variance, is explained by the first component. Subsequently, (0.136)² = 0.018, or 1.8%, of the variance in Item 1 is explained by the second component. The total variance explained by both components is thus 43.4% + 1.8% = 45.2%. If you keep adding the squared loadings cumulatively across the components, you find that the sum is 1, or 100%. This is also known as the communality, and in a PCA the communality for each item is equal to the total variance.

Component Matrixa
Item Component
1 2 3 4 5 6 7 8
1 0.659 0.136 -0.398 0.160 -0.064 0.568 -0.177 0.068
2 -0.300 0.866 -0.025 0.092 -0.290 -0.170 -0.193 -0.001
3 -0.653 0.409 0.081 0.064 0.410 0.254 0.378 0.142
4 0.720 0.119 -0.192 0.064 -0.288 -0.089 0.563 -0.137
5 0.650 0.096 -0.215 0.460 0.443 -0.326 -0.092 -0.010
6 0.572 0.185 0.675 0.031 0.107 0.176 -0.058 -0.369
7 0.718 0.044 0.453 -0.006 -0.090 -0.051 0.025 0.516
8 0.568 0.267 -0.221 -0.694 0.258 -0.084 -0.043 -0.012

Extraction Method: Principal Component Analysis.

a. 8 components extracted.

Summing the squared component loadings across the components (columns) gives you the communality estimates for each item, and summing each squared loading down the items (rows) gives you the eigenvalue for each component. For example, to obtain the first eigenvalue we calculate:

(0.659)² + (−0.300)² + (−0.653)² + (0.720)² + (0.650)² + (0.572)² + (0.718)² + (0.568)² = 3.057

You will get eight eigenvalues for eight components, which leads us to the next table.
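As a quick numerical check of these two identities, we can enter the loadings from the Component Matrix above into Python with NumPy (a sketch for verification only; SPSS does all of this for you):

import numpy as np

# Loadings from the 8-component Component Matrix above (items in rows)
L = np.array([
    [ 0.659, 0.136, -0.398,  0.160, -0.064,  0.568, -0.177,  0.068],
    [-0.300, 0.866, -0.025,  0.092, -0.290, -0.170, -0.193, -0.001],
    [-0.653, 0.409,  0.081,  0.064,  0.410,  0.254,  0.378,  0.142],
    [ 0.720, 0.119, -0.192,  0.064, -0.288, -0.089,  0.563, -0.137],
    [ 0.650, 0.096, -0.215,  0.460,  0.443, -0.326, -0.092, -0.010],
    [ 0.572, 0.185,  0.675,  0.031,  0.107,  0.176, -0.058, -0.369],
    [ 0.718, 0.044,  0.453, -0.006, -0.090, -0.051,  0.025,  0.516],
    [ 0.568, 0.267, -0.221, -0.694,  0.258, -0.084, -0.043, -0.012],
])

eigenvalues = (L**2).sum(axis=0)    # sum squared loadings down the items
communalities = (L**2).sum(axis=1)  # sum squared loadings across components

print(eigenvalues.round(3))    # first entry is about 3.057
print(communalities.round(3))  # each is about 1.0 with all 8 components kept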

Total Variance Explained in the 8-component PCA

Recall that the eigenvalue represents the total amount of variance that can be explained by a given principal component. Starting from the first component, each subsequent component is obtained by partialling out the previous components. Therefore the first component explains the most variance, and the last component explains the least. Looking at the Total Variance Explained table, you will see the total variance explained by each component. For example, the eigenvalue for Component 1 is 3.057, which is 3.057/8 = 38.21% of the total variance. Because we extracted the same number of components as the number of items, the Initial Eigenvalues column is the same as the Extraction Sums of Squared Loadings column.

Total Variance Explained
Component Initial Eigenvalues Extraction Sums of Squared Loadings
Total % of Variance Cumulative % Total % of Variance Cumulative %
1 3.057 38.206 38.206 3.057 38.206 38.206
2 1.067 13.336 51.543 1.067 13.336 51.543
3 0.958 11.980 63.523 0.958 11.980 63.523
4 0.736 9.205 72.728 0.736 9.205 72.728
5 0.622 7.770 80.498 0.622 7.770 80.498
6 0.571 7.135 87.632 0.571 7.135 87.632
7 0.543 6.788 94.420 0.543 6.788 94.420
8 0.446 5.580 100.000 0.446 5.580 100.000
Extraction Method: Principal Component Analysis.

Choosing the number of components to extract

Since the goal of running a PCA is to reduce our set of variables down, it would be useful to have a criterion for selecting the optimal number of components, which is of course smaller than the total number of items. One criterion is to choose components that have eigenvalues greater than 1. Under the Total Variance Explained table, we see that the first two components have an eigenvalue greater than 1. This can be confirmed by the Scree Plot, which plots the eigenvalue (total variance explained) against the component number. Recall that we checked the Scree Plot option under Extraction – Display, so the scree plot should be produced automatically.

The first component will always have the highest total variance and the last component will always have the least, but where do we see the largest drop? If you look at Component 2, you will see an “elbow” joint. This marks the point beyond which further component extraction is perhaps not too beneficial. There are some conflicting definitions of how to interpret the scree plot, but some say to take the number of components to the left of the “elbow”. Following this criterion, we would pick only one component. A more subjective interpretation of the scree plot suggests that any number of components between 1 and 4 would be plausible, and further corroborating evidence would be helpful.

Some criteria say that the total variance explained by all components should be between 70% and 80%, which in this case would mean about four to five components. The authors of the book note that this may be untenable for social science research, where extracted factors usually explain only 50% to 60% of the variance. Picking the number of components is a bit of an art and requires input from the whole research team. Let's suppose we talked to the principal investigator and she believes the two-component solution makes sense for the study, so we will proceed with the analysis.
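The two rules of thumb just described are easy to apply programmatically. Here is a small sketch (Python with NumPy, using the eigenvalues from the table above; this is illustrative, not an SPSS feature):

import numpy as np

eigenvalues = np.array([3.057, 1.067, 0.958, 0.736, 0.622, 0.571, 0.543, 0.446])

# Kaiser criterion: keep components with eigenvalue greater than 1
kaiser_k = int((eigenvalues > 1).sum())            # -> 2

# Cumulative percent of total variance (8 items, so total variance = 8)
cum_pct = 100 * eigenvalues.cumsum() / eigenvalues.sum()
variance_k = int(np.argmax(cum_pct >= 70)) + 1     # first k reaching 70% -> 4

print(kaiser_k, variance_k, cum_pct.round(1))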

Running a PCA with 2 components in SPSS

Running the two component PCA is just as easy as running the 8 component solution. The only difference is under Fixed number of factors – Factors to extract you enter 2.

We will focus on the differences in the output between the eight- and two-component solutions. Under Total Variance Explained, we see that the Extraction Sums of Squared Loadings column no longer mirrors the full Initial Eigenvalues column: there are only two rows of extracted eigenvalues, and the cumulative percent of variance goes up to 51.54%.

Total Variance Explained
Component Initial Eigenvalues Extraction Sums of Squared Loadings
Total % of Variance Cumulative % Total % of Variance Cumulative %
1 3.057 38.206 38.206 3.057 38.206 38.206
2 1.067 13.336 51.543 1.067 13.336 51.543
3 0.958 11.980 63.523
4 0.736 9.205 72.728
5 0.622 7.770 80.498
6 0.571 7.135 87.632
7 0.543 6.788 94.420
8 0.446 5.580 100.000
Extraction Method: Principal Component Analysis.

Similarly, you will see that the Component Matrix has the same loadings as the eight-component solution, but with two columns instead of eight.

Component Matrixa
Item Component
1 2
1 0.659 0.136
2 -0.300 0.866
3 -0.653 0.409
4 0.720 0.119
5 0.650 0.096
6 0.572 0.185
7 0.718 0.044
8 0.568 0.267
Extraction Method: Principal Component Analysis.
a. 2 components extracted.

Again, we interpret Item 1 as having a correlation of 0.659 with Component 1. From glancing at the solution, we see that Item 4 has the highest correlation with Component 1 and Item 2 the lowest. Similarly, we see that Item 2 has the highest correlation with Component 2 and Item 7 the lowest.

Quick check:

True or False

  1. The elements of the Component Matrix are correlations of the item with each component.
  2. The sum of the squared eigenvalues is the proportion of variance under Total Variance Explained.
  3. The Component Matrix can be thought of as correlations and the Total Variance Explained table can be thought of as R².

1. T, 2. F (it is the sum of squared loadings), 3. T

Communalities of the 2-component PCA

The communality is the sum of the squared component loadings up to the number of components you extract. In the SPSS output you will see a table of communalities.

Communalities
Item Initial Extraction
1 1.000 0.453
2 1.000 0.840
3 1.000 0.594
4 1.000 0.532
5 1.000 0.431
6 1.000 0.361
7 1.000 0.517
8 1.000 0.394
Extraction Method: Principal Component Analysis.

Since PCA is an iterative estimation process, it starts with 1 as the initial estimate of the communality (since this is the total variance across all 8 components), and then proceeds with the analysis until a final communality is extracted. Notice that the Extraction column is smaller than the Initial column because we only extracted two components. As an exercise, let's manually calculate the first communality from the Component Matrix. The first ordered pair is (0.659, 0.136), which represents the correlations of the first item with Component 1 and Component 2. Recall that squaring the loadings and summing across the components (columns) gives us the communality:

h₁² = (0.659)² + (0.136)² = 0.453

Going back to the Communalities table, if you sum down all 8 items (rows) of the Extraction column, you get 4.123. If you go back to the Total Variance Explained table and sum the first two eigenvalues, you get 3.057 + 1.067 = 4.124. Is that surprising? Basically, summing the communalities across all items is the same as summing the eigenvalues across all extracted components.
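To verify both facts at once, here is a short sketch (Python with NumPy, again just for checking the arithmetic):

import numpy as np

# Two-component loadings from the Component Matrix above
L2 = np.array([
    [ 0.659, 0.136],
    [-0.300, 0.866],
    [-0.653, 0.409],
    [ 0.720, 0.119],
    [ 0.650, 0.096],
    [ 0.572, 0.185],
    [ 0.718, 0.044],
    [ 0.568, 0.267],
])

communalities = (L2**2).sum(axis=1)  # first item: 0.659**2 + 0.136**2 = 0.453
print(communalities.round(3))
print(communalities.sum().round(3))  # about 4.12, matching 3.057 + 1.067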

Quiz

  1. In a PCA, when would the communality for the Initial column be equal to the Extraction column?

Answer: When you run an 8-component PCA.

True or False

  1. The eigenvalue represents the communality for each item.
  2. For a single component, the sum of squared component loadings across all items represents the eigenvalue for that component.
  3. The sum of eigenvalues for all the components is the total variance.
  4. The sum of the communalities down the components is equal to the sum of eigenvalues down the items.

Answers:

  1. F, the eigenvalue is the total communality across all items for a single component, 2. T, 3. T, 4. F (you can only sum communalities across items, and sum eigenvalues across components, but if you do that they are equal).

Common Factor Analysis

The partitioning of variance is what differentiates principal components analysis from what we call common factor analysis. Both methods try to reduce the dimensionality of the dataset down to a smaller number of unobserved variables, but whereas PCA assumes that common variance takes up all of the total variance, common factor analysis assumes that total variance can be partitioned into common and unique variance. It is usually more reasonable to assume that you have not measured your set of items perfectly. The unobserved or latent variable that makes up the common variance is called a factor, hence the name factor analysis. The other main difference between PCA and factor analysis lies in the goal of your analysis. If your goal is simply to reduce your variable list down to a linear combination of smaller components, then PCA is the way to go. However, if you believe there is some latent construct that defines the interrelationship among items, then factor analysis may be more appropriate. In this case, we assume that there is a construct called SPSS Anxiety that explains why you see correlations among all the items on the SAQ-8. We acknowledge, however, that SPSS Anxiety cannot explain all the shared variance among items in the SAQ, so we model the unique variance as well. Based on the results of the PCA, we will start with a two-factor extraction.
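In the usual notation of the common factor model, each standardized item x_i is written as x_i = λ_i F + e_i, where λ_i is the item's loading on the common factor F and e_i carries the unique variance; PCA, by contrast, drops the e_i term.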

Source: “Marketing Research and Analysis-II” by NPTEL is licensed under CC BY-NC-SA 4.0

Running a Common Factor Analysis with 2 factors in SPSS

To run a factor analysis, use the same steps as running a PCA (Analyze – Dimension Reduction – Factor) except under Method choose Principal axis factoring. Note that we continue to set Maximum Iterations for Convergence at 100 and we will see why later.

Pasting the syntax into the SPSS Syntax Editor we get:

FACTOR
 /VARIABLES q01 q02 q03 q04 q05 q06 q07 q08
 /MISSING LISTWISE
 /ANALYSIS q01 q02 q03 q04 q05 q06 q07 q08
 /PRINT INITIAL EXTRACTION
 /PLOT EIGEN
 /CRITERIA FACTORS(2) ITERATE(100)
 /EXTRACTION PAF
 /ROTATION NOROTATE
 /METHOD=CORRELATION.

Note that the main difference is that under /EXTRACTION we list PAF for Principal Axis Factoring instead of PC for Principal Components. We will get three tables of output: Communalities, Total Variance Explained, and Factor Matrix. Let's go over each of these and compare them to the PCA output.

Communalities of the 2-factor PAF

Communalities
Item Initial Extraction
1 0.293 0.437
2 0.106 0.052
3 0.298 0.319
4 0.344 0.460
5 0.263 0.344
6 0.277 0.309
7 0.393 0.851
8 0.192 0.236
Extraction Method: Principal Axis Factoring.

The most striking difference between this Communalities table and the one from the PCA is that the initial communality is no longer 1. Recall that for a PCA we assume the total variance is completely taken up by the common variance or communality, and therefore we pick 1 as our best initial guess. Principal axis factoring, instead of guessing 1 as the initial communality, uses the squared multiple correlation coefficient R² of each item regressed on all the others. To see this in action for Item 1, run a linear regression where Item 1 is the dependent variable and Items 2–8 are independent variables. Go to Analyze – Regression – Linear and enter q01 under Dependent and q02 to q08 under Independent(s).

Pasting the syntax into the Syntax Editor gives us:

REGRESSION
 /MISSING LISTWISE
 /STATISTICS COEFF OUTS R ANOVA
 /CRITERIA=PIN(.05) POUT(.10)
 /NOORIGIN
 /DEPENDENT q01
 /METHOD=ENTER q02 q03 q04 q05 q06 q07 q08.

The output we obtain from this analysis is

Model Summary
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .541a 0.293 0.291 0.697

Note that the R Square value of 0.293 matches the initial communality estimate for Item 1. We could run seven more linear regressions to get the remaining communality estimates, but SPSS already does that for us. Like PCA, factor analysis uses an iterative estimation process to obtain the final estimates under the Extraction column. Finally, summing all the rows of the Extraction column gives approximately 3.01, which represents the total common variance shared among all items for a two-factor solution.
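Incidentally, all eight squared multiple correlations can be obtained in one step from the inverse of the item correlation matrix, rather than through separate regressions. Here is a sketch (Python with NumPy; the variable corr is assumed to hold the 8×8 item correlation matrix, which is not printed in this chapter, so the call is left commented out):

import numpy as np

def squared_multiple_correlations(corr):
    """SMC of each item regressed on all the others: 1 - 1/diag(inv(corr))."""
    return 1 - 1 / np.diag(np.linalg.inv(np.asarray(corr)))

# smc = squared_multiple_correlations(corr)   # smc[0] would be about 0.293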

Total Variance Explained (2-factor PAF)

The next table we will look at is Total Variance Explained. Comparing this to the table from the PCA, we notice that the Initial Eigenvalues are exactly the same and include 8 rows, one for each “factor”. In fact, SPSS simply borrows this information from the PCA analysis for use in the factor analysis, so the “factors” in the Initial Eigenvalues column are actually components. The main difference now is in the Extraction Sums of Squared Loadings. We notice that each corresponding row in the Extraction column is lower than in the Initial column. This is expected because we assume that total variance can be partitioned into common and unique variance, which means the common variance explained will be lower. Factor 1 explains 31.38% of the variance, whereas Factor 2 explains 6.24%. Just as in PCA, the more factors you extract, the less variance is explained by each successive factor.

Total Variance Explained
Factor Initial Eigenvalues Extraction Sums of Squared Loadings
Total % of Variance Cumulative % Total % of Variance Cumulative %
1 3.057 38.206 38.206 2.511 31.382 31.382
2 1.067 13.336 51.543 0.499 6.238 37.621
3 0.958 11.980 63.523
4 0.736 9.205 72.728
5 0.622 7.770 80.498
6 0.571 7.135 87.632
7 0.543 6.788 94.420
8 0.446 5.580 100.000
Extraction Method: Principal Axis Factoring.

A subtle note that may be easily overlooked is that when SPSS produces the scree plot or applies the eigenvalues-greater-than-1 criterion (Analyze – Dimension Reduction – Factor – Extraction), it bases them on the Initial and not the Extraction solution. This is important because the criterion here assumes no unique variance, as in PCA, which means this is the total variance explained, not accounting for specific or measurement error. Note that in the Extraction Sums of Squared Loadings column the second factor has an eigenvalue that is less than 1, but it is still retained because the Initial value is 1.067. If you want to apply this criterion to the common variance explained, you would need to modify the criterion yourself.

Quick Quiz

  1. In theory, when would the percent of variance in the Initial column ever equal the Extraction column?
  2. True or False, in SPSS when you use the Principal Axis Factor method the scree plot uses the final factor analysis solution to plot the eigenvalues.

Answers: 1. When there is no unique variance (PCA assumes this whereas common factor analysis does not, so this is in theory and not in practice), 2. F, it uses the initial PCA solution and the eigenvalues assume no unique variance.

Factor Matrix (2-factor PAF)

Factor Matrixa
Item Factor
1 2
1 0.588 -0.303
2 -0.227 0.020
3 -0.557 0.094
4 0.652 -0.189
5 0.560 -0.174
6 0.498 0.247
7 0.771 0.506
8 0.470 -0.124
Extraction Method: Principal Axis Factoring.
a. 2 factors extracted. 79 iterations required.

First note the annotation that 79 iterations were required. If we had simply used the default 25 iterations in SPSS, we would not have obtained an optimal solution. This is why in practice it’s always good to increase the maximum number of iterations. Now let’s get into the table itself. The elements of the Factor Matrix table are called loadings and represent the correlation of each item with the corresponding factor. Just as in PCA, squaring each loading and summing down the items (rows) gives the total variance explained by each factor. Note that they are no longer called eigenvalues as in PCA. Let’s calculate this for Factor 1:

(0.588)² + (−0.227)² + (−0.557)² + (0.652)² + (0.560)² + (0.498)² + (0.771)² + (0.470)² = 2.51

This number matches the first row under the Extraction column of the Total Variance Explained table. We can repeat this for Factor 2 and get matching results for the second row. Additionally, we can get the communality estimates by summing the squared loadings across the factors (columns) for each item. For example, for Item 1:

(0.588)² + (−0.303)² = 0.437

Note that these results match the value of the Communalities table for Item 1 under the Extraction column. This means that the sum of squared loadings across factors represents the communality estimates for each item.
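As with the PCA output, these sums are easy to verify numerically (Python with NumPy, for checking only):

import numpy as np

# Loadings from the two-factor Factor Matrix above
F = np.array([
    [ 0.588, -0.303],
    [-0.227,  0.020],
    [-0.557,  0.094],
    [ 0.652, -0.189],
    [ 0.560, -0.174],
    [ 0.498,  0.247],
    [ 0.771,  0.506],
    [ 0.470, -0.124],
])

ssl = (F**2).sum(axis=0)             # sums of squared loadings per factor
communalities = (F**2).sum(axis=1)   # extraction communalities per item

print(ssl.round(3))            # about [2.511, 0.499]
print(communalities.round(3))  # first entry about 0.437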

Source: “Exploratory factor analysis/Quiz” by wikiversity is licensed under CC BY-SA 3.0

License


Business Research Methods by Icfai Business School is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
