# 17 Comparing Common Factor Analysis versus Principal Components

As we mentioned before, the main difference between common factor analysis and principal components is that factor analysis assumes total variance can be partitioned into common and unique variance, whereas principal components assumes common variance takes up all of total variance (i.e., no unique variance). For both methods, when you assume total variance is 1, the common variance becomes the communality. The communality is unique to each item, so if you have 8 items, you will obtain 8 communalities; and it represents the common variance explained by the factors or components. However in the case of principal components, the communality is the total variance of each item, and summing all 8 communalities gives you the total variance across all items. In contrast, common factor analysis assumes that the communality is a portion of the total variance, so that summing up the communalities represents the total common variance and not the total variance. In summary, for PCA, total common variance is equal to total variance explained, which in turn is equal to the total variance, but in common factor analysis, total common variance is equal to total variance explained but does not equal total variance.
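As a compact restatement of the paragraph above, the partition can be written out for standardized items. The symbols $h_i^2$ (communality of item $i$), $u_i^2$ (unique variance), $\lambda_{ij}$ (loading of item $i$ on factor or component $j$) and $m$ (number of factors or components) are notation assumed here for illustration, and the sum-of-squared-loadings form of $h_i^2$ assumes uncorrelated factors:

$$
\begin{aligned}
\text{each item } i:&\quad 1 = h_i^2 + u_i^2, \qquad h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2 \\
\text{PCA (all 8 components):}&\quad u_i^2 = 0 \;\Rightarrow\; \sum_{i=1}^{8} h_i^2 = 8 \\
\text{common factor analysis:}&\quad u_i^2 > 0 \;\Rightarrow\; \sum_{i=1}^{8} h_i^2 < 8
\end{aligned}
$$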

Quiz

True or False

The following applies to the SAQ-8 when theoretically extracting 8 components or factors for 8 items:

1. For each item, when the total variance is 1, the common variance becomes the communality.
2. In principal components, each communality represents the total variance across all 8 items.
3. In common factor analysis, the communality represents the common variance for each item.
4. The communality is unique to each factor or component.
5. For both PCA and common factor analysis, the sum of the communalities represents the total variance explained.
6. For PCA, the total variance explained equals the total variance, but for common factor analysis it does not.

Answers: 1. T, 2. F, the total variance for each item, 3. T, 4. F, communality is unique to each item (shared across components or factors), 5. T, 6. T.

Rotation Methods

After deciding on the number of factors to extract and which analysis model to use, the next step is to interpret the factor loadings. Factor rotations help us interpret factor loadings. There are two general types of rotations, orthogonal and oblique.

• orthogonal rotation assumes the factors are independent or uncorrelated with each other
• oblique rotation assumes the factors are not independent and are correlated

The goal of factor rotation is to improve the interpretability of the factor solution by reaching simple structure.

Simple structure

Without rotation, the first factor is the most general factor onto which most items load and explains the largest amount of variance. This may not be desired in all cases. Suppose you wanted to know how well a set of items load on each factor; simple structure helps us to achieve this.

The definition of simple structure is that in a factor loading matrix:

1. Each row should contain at least one zero.
2. For m factors, each column should have at least m zeroes (e.g., three factors, at least 3 zeroes per factor).

For every pair of factors (columns),

1. there should be several items for which entries approach zero in one column but large loadings on the other.
2. a large proportion of items should have entries approaching zero.
3. only a small number of items have two non-zero entries.

The following table is an example of simple structure with three factors:

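| Item | Factor 1 | Factor 2 | Factor 3 |
|------|----------|----------|----------|
| 1 | 0.8 | 0 | 0 |
| 2 | 0.8 | 0 | 0 |
| 3 | 0.8 | 0 | 0 |
| 4 | 0 | 0.8 | 0 |
| 5 | 0 | 0.8 | 0 |
| 6 | 0 | 0.8 | 0 |
| 7 | 0 | 0 | 0.8 |
| 8 | 0 | 0 | 0.8 |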

Let’s go down the checklist of criteria to see why it satisfies simple structure:

1. each row contains at least one zero (exactly two in each row)
2. each column contains at least three zeros (since there are three factors)
3. for every pair of factors, most items have zero on one factor and non-zeros on the other factor (e.g., looking at Factors 1 and 2, Items 1 through 6 satisfy this requirement)
4. for every pair of factors, all items have zero entries
5. for every pair of factors, none of the items have two non-zero entries

An easier set of criteria from Pedhazur and Schmelkin (1991) states that

1. each item has high loadings on one factor only, and
2. each factor has high loadings for only some of the items.

Quiz

For the following factor matrix, explain why it does not conform to simple structure using both the conventional and Pedhazur test.

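| Item | Factor 1 | Factor 2 | Factor 3 |
|------|----------|----------|----------|
| 1 | 0.8 | 0 | 0.8 |
| 2 | 0.8 | 0 | 0.8 |
| 3 | 0.8 | 0 | 0 |
| 4 | 0.8 | 0 | 0 |
| 5 | 0 | 0.8 | 0.8 |
| 6 | 0 | 0.8 | 0.8 |
| 7 | 0 | 0.8 | 0.8 |
| 8 | 0 | 0.8 | 0 |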

Solution: Using the conventional test, although Criteria 1 and 2 are satisfied (each row has at least one zero, each column has at least three zeroes), Criterion 3 fails because for Factors 2 and 3, only 3/8 rows have 0 on one factor and non-zero on the other. Additionally, for Factors 2 and 3, Items 5 through 7 (3/8 rows) have non-zero loadings on both factors (fails Criteria 4 and 5 simultaneously). Using the Pedhazur test, Items 1, 2, 5, 6, and 7 have high loadings on two factors (fails the first criterion) and Factor 3 has high loadings on a majority (5/8) of the items (fails the second criterion).
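The Pedhazur test lends itself to a quick mechanical check. The sketch below is illustrative only: the function name `pedhazur_check`, the 0.4 cutoff for a "high" loading, and the reading of "only some of the items" as "at least one item but not a majority" are assumptions made here, not part of the original criteria.

```python
# Hypothetical helper: the 0.4 cutoff and the "not a majority" rule
# are assumptions chosen for illustration.
HIGH = 0.4

def pedhazur_check(loadings, high=HIGH):
    """Return (items_ok, factors_ok) for a list of per-item loading rows."""
    n_items = len(loadings)
    n_factors = len(loadings[0])
    # Criterion 1: each item loads highly on exactly one factor.
    items_ok = all(
        sum(abs(x) >= high for x in row) == 1 for row in loadings
    )
    # Criterion 2: each factor loads highly on only some of the items
    # (here: at least one item, and no more than half of them).
    factors_ok = True
    for j in range(n_factors):
        n_high = sum(abs(row[j]) >= high for row in loadings)
        if n_high == 0 or n_high > n_items / 2:
            factors_ok = False
    return items_ok, factors_ok

# The simple-structure example table.
simple = [
    [0.8, 0, 0], [0.8, 0, 0], [0.8, 0, 0],
    [0, 0.8, 0], [0, 0.8, 0], [0, 0.8, 0],
    [0, 0, 0.8], [0, 0, 0.8],
]
# The quiz table that violates simple structure.
quiz = [
    [0.8, 0, 0.8], [0.8, 0, 0.8], [0.8, 0, 0], [0.8, 0, 0],
    [0, 0.8, 0.8], [0, 0.8, 0.8], [0, 0.8, 0.8], [0, 0.8, 0],
]

print(pedhazur_check(simple))  # (True, True)
print(pedhazur_check(quiz))    # (False, False)
```

With real loadings, which are rarely exactly zero, the result depends heavily on the chosen cutoff, so this is best treated as a screening aid rather than a verdict.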

Orthogonal Rotation (2 factor PAF)

We know that the goal of factor rotation is to rotate the factor matrix so that it can approach simple structure in order to improve interpretability. Orthogonal rotation assumes that the factors are not correlated. The benefit of doing an orthogonal rotation is that loadings are simple correlations of items with factors, and standardized solutions can estimate unique contribution of each factor. The most common type of orthogonal rotation is Varimax rotation. We will walk through how to do this in SPSS.

Running a two-factor solution (PAF) with Varimax rotation in SPSS

The steps to running a two-factor Principal Axis Factoring are the same as before (Analyze – Dimension Reduction – Factor – Extraction), except that under Rotation – Method we check Varimax. Make sure under Display to check Rotated Solution and Loading plot(s), and under Maximum Iterations for Convergence enter 100.

Pasting the syntax into the SPSS editor you obtain:

    FACTOR
      /VARIABLES q01 q02 q03 q04 q05 q06 q07 q08
      /MISSING LISTWISE
      /ANALYSIS q01 q02 q03 q04 q05 q06 q07 q08
      /PRINT INITIAL EXTRACTION ROTATION
      /PLOT ROTATION
      /CRITERIA FACTORS(2) ITERATE(100)
      /EXTRACTION PAF
      /CRITERIA ITERATE(100)
      /ROTATION VARIMAX
      /METHOD=CORRELATION.

Let’s first talk about what tables are the same or different from running a PAF with no rotation. First, we know that the unrotated factor matrix (Factor Matrix table) should be the same. Additionally, since the  common variance explained by both factors should be the same, the Communalities table should be the same. The main difference is that we ran a rotation, so we should get the rotated solution (Rotated Factor Matrix) as well as the transformation used to obtain the rotation (Factor Transformation Matrix). Finally, although the total variance explained by all factors stays the same, the total variance explained by each factor will be different.


Rotated Factor Matrix (2-factor PAF Varimax)

Rotated Factor Matrix (a)

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.646 | 0.139 |
| 2 | -0.188 | -0.129 |
| 3 | -0.490 | -0.281 |
| 4 | 0.624 | 0.268 |
| 5 | 0.544 | 0.221 |
| 6 | 0.229 | 0.507 |
| 7 | 0.275 | 0.881 |
| 8 | 0.442 | 0.202 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

The Rotated Factor Matrix table tells us what the factor loadings look like after rotation (in this case Varimax). Kaiser normalization is a method to obtain stability of solutions across samples. Before rotation, each item’s loadings are divided by the square root of that item’s communality so that every item has unit length; after rotation, the loadings are rescaled back to their proper size. This means that equal weight is given to all items when performing the rotation. The only drawback is that if the communality is low for a particular item, Kaiser normalization will weight that item equally with items that have high communality. As such, Kaiser normalization is preferred when communalities are high across all items. You can turn off Kaiser normalization by specifying

/CRITERIA NOKAISER
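To make the "equal weight" idea concrete, here is a minimal sketch of only the scaling and unscaling steps of Kaiser normalization; the rotation itself is left out. The function names are hypothetical, and the input is the Varimax-rotated loadings from the table above, used purely as example numbers (communalities are unchanged by an orthogonal rotation, so these rows give the same communalities as the unrotated solution).

```python
import math

# Loadings for Items 1-8 on Factors 1 and 2, taken from the Rotated
# Factor Matrix above and used here only as example input.
loadings = [
    [0.646, 0.139], [-0.188, -0.129], [-0.490, -0.281], [0.624, 0.268],
    [0.544, 0.221], [0.229, 0.507], [0.275, 0.881], [0.442, 0.202],
]

def kaiser_normalize(rows):
    """Scale each item's loadings to unit length.

    Returns the scaled rows and the square roots of the communalities
    that were divided out.
    """
    roots = [math.sqrt(sum(x * x for x in row)) for row in rows]
    scaled = [[x / h for x in row] for row, h in zip(rows, roots)]
    return scaled, roots

def kaiser_denormalize(rows, roots):
    """Undo the scaling by multiplying each row by its stored root."""
    return [[x * h for x in row] for row, h in zip(rows, roots)]

normalized, roots = kaiser_normalize(loadings)
# A rotation such as Varimax would be applied to `normalized` here.
restored = kaiser_denormalize(normalized, roots)

# Every normalized item has squared length 1, whatever its communality.
row_lengths = [sum(x * x for x in row) for row in normalized]
# Scale-up factor per item: largest for the low-communality Item 2.
scale_factors = [1 / h for h in roots]
print([round(f, 2) for f in scale_factors])
```

The scale-up factor is largest for Item 2, whose communality is the lowest, which is exactly the situation the paragraph above warns about.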

Here is what the Varimax rotated loadings look like without Kaiser normalization. Compared to the rotated factor matrix with Kaiser normalization, the patterns look similar if you flip Factors 1 and 2; this may be an artifact of the rescaling. Another possible reason for the stark differences is the low communalities for Item 2 (0.052) and Item 8 (0.236). Kaiser normalization weights these items equally with the other high communality items.

Rotated Factor Matrix (a)

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.207 | 0.628 |
| 2 | -0.148 | -0.173 |
| 3 | -0.331 | -0.458 |
| 4 | 0.332 | 0.592 |
| 5 | 0.277 | 0.517 |
| 6 | 0.528 | 0.174 |
| 7 | 0.905 | 0.180 |
| 8 | 0.248 | 0.418 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax without Kaiser Normalization.
a. Rotation converged in 3 iterations.

In the table above, the absolute loadings that are higher than 0.4 are highlighted in blue for Factor 1 and in red for Factor 2. We can see that Items 6 and 7 load highly onto Factor 1 and Items 1, 3, 4, 5, and 8 load highly onto Factor 2. Item 2 does not seem to load highly on any factor. Looking more closely at Item 6 “My friends are better at statistics than me” and Item 7 “Computers are useful only for playing games”, we don’t see a clear construct that defines the two. Item 2, “I don’t understand statistics”, may be too general an item and isn’t captured by SPSS Anxiety. It’s debatable at this point whether to retain a two-factor or one-factor solution; at the very minimum we should see if Item 2 is a candidate for deletion.

The Factor Transformation Matrix tells us how the Factor Matrix was rotated. In SPSS, you will see a matrix with two rows and two columns because we have two factors.

Factor Transformation Matrix

| Factor | 1 | 2 |
|--------|---|---|
| 1 | 0.773 | 0.635 |
| 2 | -0.635 | 0.773 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization.

How do we interpret this matrix? Well, we can see it as the way to move from the Factor Matrix to the Rotated Factor Matrix. From the Factor Matrix we know that the loading of Item 1 on Factor 1 is $0.588$ and the loading of Item 1 on Factor 2 is $-0.303$, which gives us the pair $(0.588, -0.303)$; but in the Rotated Factor Matrix the new pair is $(0.646, 0.139)$. How do we obtain this new transformed pair of values? We can do what’s called matrix multiplication. The steps are essentially to start with one column of the Factor Transformation Matrix, view it as another ordered pair and multiply matching ordered pairs. To get the first element, we can multiply the ordered pair in the Factor Matrix $(0.588, -0.303)$ with the matching ordered pair $(0.773, -0.635)$ in the first column of the Factor Transformation Matrix:

$$(0.588)(0.773) + (-0.303)(-0.635) = 0.455 + 0.192 = 0.647.$$

To get the second element, we can multiply the ordered pair in the Factor Matrix $(0.588, -0.303)$ with the matching ordered pair $(0.635, 0.773)$ from the second column of the Factor Transformation Matrix:

$$(0.588)(0.635) + (-0.303)(0.773) = 0.373 - 0.234 = 0.139.$$

Voila! We have obtained the new transformed pair with some rounding error. The figure below summarizes the steps we used to perform the transformation.

The Factor Transformation Matrix can also tell us the angle of rotation if we take the inverse cosine of the diagonal element. In this case, the angle of rotation is $\cos^{-1}(0.773) = 39.4^\circ$. In the factor loading plot, you can see what that angle of rotation looks like, starting from $0^\circ$ and rotating up in a counterclockwise direction by $39.4^\circ$. Notice here that the newly rotated x- and y-axes are still at $90^\circ$ angles from one another, hence the name orthogonal (a non-orthogonal or oblique rotation means that the new axes are no longer $90^\circ$ apart). The points do not move in relation to the axis but rotate with it.
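The two dot products and the inverse cosine described above can be reproduced in a few lines. This is a small sketch using only the numbers quoted in the text (Item 1's unrotated pair and the Factor Transformation Matrix); the helper name `rotate` is an assumption made for illustration.

```python
import math

# Item 1's unrotated loadings and the Factor Transformation Matrix,
# as reported in the SPSS output.
item1_unrotated = (0.588, -0.303)
transformation = [
    [0.773, 0.635],
    [-0.635, 0.773],
]

def rotate(pair, matrix):
    """Multiply a 1x2 row of loadings by a 2x2 transformation matrix."""
    return tuple(
        pair[0] * matrix[0][j] + pair[1] * matrix[1][j] for j in range(2)
    )

item1_rotated = rotate(item1_unrotated, transformation)
print([round(x, 3) for x in item1_rotated])  # [0.647, 0.139]

# Angle of rotation from the diagonal element of the transformation matrix.
angle = math.degrees(math.acos(transformation[0][0]))
print(round(angle, 1))  # 39.4

# Going the other way: a rotation by this angle has cos on the diagonal
# and +/- sin off the diagonal, which should match the reported matrix.
rebuilt = [
    [math.cos(math.radians(angle)), math.sin(math.radians(angle))],
    [-math.sin(math.radians(angle)), math.cos(math.radians(angle))],
]
```

The first rotated value comes out as 0.647 rather than the 0.646 shown in the Rotated Factor Matrix because the inputs are already rounded to three decimals.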

Total Variance Explained (2-factor PAF Varimax)

The Total Variance Explained table contains the same columns as the PAF solution with no rotation, but adds another set of columns called “Rotation Sums of Squared Loadings”. This makes sense because if our rotated Factor Matrix is different, the square of the loadings should be different, and hence the Sum of Squared Loadings will be different for each factor. However, if you sum the Sums of Squared Loadings across all factors for the Rotation solution,

$$1.701 + 1.309 = 3.01$$

and for the unrotated solution,

$$2.511 + 0.499 = 3.01,$$

you will see that the two sums are the same. This is because rotation does not change the total common variance. Looking at the Rotation Sums of Squared Loadings, Factor 1 still has the largest total variance, but now the shared variance is split more evenly between the two factors.

Total Variance Explained: Rotation Sums of Squared Loadings

| Factor | Total | % of Variance | Cumulative % |
|--------|-------|---------------|--------------|
| 1 | 1.701 | 21.258 | 21.258 |
| 2 | 1.309 | 16.363 | 37.621 |

Extraction Method: Principal Axis Factoring.
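The Rotation Sums of Squared Loadings can be recomputed directly from the Varimax-rotated loadings reported earlier: squaring and summing down a column gives the variance explained by that factor, and squaring and summing across a row gives the item's communality. The sketch below uses the three-decimal loadings from the SPSS output, so results agree with the table only up to rounding; the variable names are illustrative.

```python
# Varimax-rotated loadings (with Kaiser normalization) for Items 1-8.
rotated = [
    (0.646, 0.139), (-0.188, -0.129), (-0.490, -0.281), (0.624, 0.268),
    (0.544, 0.221), (0.229, 0.507), (0.275, 0.881), (0.442, 0.202),
]
n_items = len(rotated)

# Column-wise: sum of squared loadings per factor.
ssl = [sum(row[j] ** 2 for row in rotated) for j in range(2)]
# Each standardized item has variance 1, so total variance is n_items.
pct_of_variance = [100 * s / n_items for s in ssl]
# Row-wise: communality per item.
communalities = [sum(x ** 2 for x in row) for row in rotated]

print([round(s, 2) for s in ssl])
print(round(sum(ssl), 2))                # total common variance
print(round(sum(communalities), 2))      # same total, summed by item
```

Summing by column or by row gives the same total, which is why the total common variance of about 3.01 is unchanged by an orthogonal rotation.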

Other Orthogonal Rotations

Here is the output of the Total Variance Explained table juxtaposed side-by-side for Varimax versus Quartimax rotation.

Total Variance Explained

| Factor | Quartimax Total | Varimax Total |
|--------|-----------------|---------------|
| 1 | 2.381 | 1.701 |
| 2 | 0.629 | 1.309 |

Extraction Method: Principal Axis Factoring.

You will see that whereas Varimax distributes the variances evenly across both factors, Quartimax tries to consolidate more variance into the first factor.

Equamax is a hybrid of Varimax and Quartimax, but because of this it may behave erratically and, according to Pett et al. (2003), is not generally recommended.

Oblique Rotation

In oblique rotation, the factors are no longer orthogonal to each other (the x and y axes are not at $90^\circ$ angles to each other). Like orthogonal rotation, the goal is rotation of the reference axes about the origin to achieve a simpler and more meaningful factor solution compared to the unrotated solution. In oblique rotation, you will see three unique tables in the SPSS output:

1. factor pattern matrix contains partial standardized regression coefficients of each item with a particular factor
2. factor structure matrix contains simple zero order correlations of each item with a particular factor
3. factor correlation matrix is a matrix of intercorrelations among factors

Suppose the Principal Investigator hypothesizes that the two factors are correlated, and wishes to test this assumption. Let’s proceed with one of the most common types of oblique rotations in SPSS, Direct Oblimin.

Running a two-factor solution (PAF) with Direct Quartimin rotation in SPSS

The steps to running a Direct Oblimin are the same as before (Analyze – Dimension Reduction – Factor – Extraction), except that under Rotation – Method we check Direct Oblimin. The other parameter we have to put in is delta, which defaults to zero. Technically, when delta = 0, this is known as Direct Quartimin. Larger positive values for delta increase the correlation among factors. However, in general you don’t want the correlations to be too high or else there is no reason to split your factors up. In fact, SPSS caps the delta value at 0.8 (the cap for negative values is -9999). Negative delta values may lead to orthogonal factor solutions. For the purposes of this analysis, we will leave delta = 0 and do a Direct Quartimin analysis.

Pasting the syntax into the SPSS editor you obtain:

    FACTOR
      /VARIABLES q01 q02 q03 q04 q05 q06 q07 q08
      /MISSING LISTWISE
      /ANALYSIS q01 q02 q03 q04 q05 q06 q07 q08
      /PRINT INITIAL EXTRACTION ROTATION
      /PLOT ROTATION
      /CRITERIA FACTORS(2) ITERATE(100)
      /EXTRACTION PAF
      /CRITERIA ITERATE(100) DELTA(0)
      /ROTATION OBLIMIN
      /METHOD=CORRELATION.

Quiz

True or False

All the questions below pertain to Direct Oblimin in SPSS.

1. When selecting Direct Oblimin, delta = 0 is actually Direct Quartimin.
2. Smaller delta values will increase the correlations among factors.
3. You typically want your delta values to be as high as possible.

Answers: 1. T, 2. F, larger delta values, 3. F, larger delta leads to higher factor correlations; in general you don’t want factors to be too highly correlated.

Factor Pattern Matrix (2-factor PAF Direct Quartimin)

The factor pattern matrix represents partial standardized regression coefficients of each item with a particular factor. For example, $0.740$ is the effect of Factor 1 on Item 1 controlling for Factor 2 and $-0.137$ is the effect of Factor 2 on Item 1 controlling for Factor 1. Just as in orthogonal rotation, the square of the loadings represents the contribution of the factor to the variance of the item, but excluding the overlap between correlated factors. Factor 1 uniquely contributes $(0.740)^2 = 0.548 = 54.8\%$ of the variance in Item 1 (controlling for Factor 2), and Factor 2 uniquely contributes $(-0.137)^2 = 0.019 = 1.9\%$ of the variance in Item 1 (controlling for Factor 1).

Pattern Matrix (a)

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.740 | -0.137 |
| 2 | -0.180 | -0.067 |
| 3 | -0.490 | -0.108 |
| 4 | 0.660 | 0.029 |
| 5 | 0.580 | 0.011 |
| 6 | 0.077 | 0.504 |
| 7 | -0.017 | 0.933 |
| 8 | 0.462 | 0.036 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 5 iterations.

Factor Structure Matrix (2-factor PAF Direct Quartimin)

The factor structure matrix represents the simple zero-order correlations of the items with each factor (it’s as if you ran a simple regression with a single factor as the only predictor of the item). For example, $0.653$ is the simple correlation between Factor 1 and Item 1 and $0.333$ is the simple correlation between Factor 2 and Item 1. The more correlated the factors, the greater the difference between the pattern and structure matrices and the more difficult it is to interpret the factor loadings. Judging by each item’s larger loading, Items 1, 3, 4, 5, and 8 load most highly onto Factor 1 and Items 6 and 7 load most highly onto Factor 2. Item 2 doesn’t seem to load well on either factor.

Additionally, we can look at the variance explained by each factor not controlling for the other factors. For example, Factor 1 contributes $(0.653)^2 = 0.426 = 42.6\%$ of the variance in Item 1, and Factor 2 contributes $(0.333)^2 = 0.111 = 11.1\%$ of the variance in Item 1. Notice that the contribution in variance of Factor 2 is higher ($11.1\%$ vs. $1.9\%$) because in the Pattern Matrix we controlled for the effect of Factor 1, whereas in the Structure Matrix we did not.

Structure Matrix

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.653 | 0.333 |
| 2 | -0.222 | -0.181 |
| 3 | -0.559 | -0.420 |
| 4 | 0.678 | 0.449 |
| 5 | 0.587 | 0.380 |
| 6 | 0.398 | 0.553 |
| 7 | 0.577 | 0.923 |
| 8 | 0.485 | 0.330 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization.
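The squared-loading arithmetic for Item 1 is easy to check side by side. The short sketch below squares Item 1's row from the Pattern Matrix and from the Structure Matrix; the variable names are illustrative only.

```python
# Item 1's loadings on Factors 1 and 2 from the two oblique-rotation tables.
pattern_item1 = (0.740, -0.137)    # partial standardized regression coefficients
structure_item1 = (0.653, 0.333)   # zero-order correlations

# Squared pattern loadings: contribution controlling for the other factor.
unique = [round(p ** 2, 3) for p in pattern_item1]
# Squared structure loadings: contribution not controlling for the other factor.
non_unique = [round(s ** 2, 3) for s in structure_item1]

print(unique)      # [0.548, 0.019]
print(non_unique)  # [0.426, 0.111]
```

For Factor 2 the non-unique contribution (about 11%) is much larger than the unique one (about 2%), reflecting the variance Factor 2 shares with Factor 1.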

Factor Correlation Matrix (2-factor PAF Direct Quartimin)

Recall that the more correlated the factors, the more difference between pattern and structure matrix and the more difficult to interpret the factor loadings. In our case, Factor 1 and Factor 2 are pretty highly correlated, which is why there is such a big difference between the factor pattern and factor structure matrices.

Factor Correlation Matrix

| Factor | 1 | 2 |
|--------|---|---|
| 1 | 1.000 | 0.636 |
| 2 | 0.636 | 1.000 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization.

Factor plot

The difference between an orthogonal versus oblique rotation is that the factors in an oblique rotation are correlated. This means not only must we account for the angle of axis rotation $\theta$, we have to account for the angle of correlation $\phi$. The angle of axis rotation is defined as the angle between the rotated and unrotated axes (blue and black axes). From the Factor Correlation Matrix, we know that the correlation is $0.636$, so the angle of correlation is $\cos^{-1}(0.636) = 50.5^\circ$, which is the angle between the two rotated axes (blue x and blue y-axis). The sum of rotations $\theta$ and $\phi$ is the total angle rotation. We are not given the angle of axis rotation, so we only know that the total angle rotation is $\theta + \phi = \theta + 50.5^\circ$.
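The angle of correlation is simply the inverse cosine of the factor correlation, which a couple of lines can confirm. The helper name `angle_between_factors` is an assumption made for this sketch.

```python
import math

def angle_between_factors(correlation):
    """Angle in degrees between two factor axes with the given correlation."""
    return math.degrees(math.acos(correlation))

# Correlation between Factor 1 and Factor 2 from the Factor Correlation Matrix.
phi = angle_between_factors(0.636)
print(round(phi, 1))  # 50.5

# For comparison, uncorrelated (orthogonal) factors sit at a right angle.
right_angle = angle_between_factors(0.0)
```

A correlation of zero gives 90 degrees, so the orthogonal case is the special case in which the axes stay perpendicular.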

Compare the plot above with the Factor Plot in Rotated Factor Space from SPSS. You can see that if we “fan out” the blue rotated axes in the previous figure so that they appear to be $90^\circ$ from each other, we will get the (black) x and y-axes for the Factor Plot in Rotated Factor Space. The difference between the figure below and the figure above is that the angle of rotation $\theta$ is assumed and we are given the angle of correlation $\phi$ that’s “fanned out” to look like it’s $90^\circ$ when it’s actually

Relationship between the Pattern and Structure Matrix

The structure matrix can in fact be derived from the pattern matrix. If you multiply the pattern matrix by the factor correlation matrix, you will get back the factor structure matrix. Let’s take the example of the ordered pair $(0.740, -0.137)$ from the Pattern Matrix, which represents the partial standardized regression coefficients of Item 1 on Factors 1 and 2 respectively. Performing matrix multiplication with the first column of the Factor Correlation Matrix we get

$$(0.740)(1) + (-0.137)(0.636) = 0.740 - 0.087 = 0.653.$$

Similarly, we multiply the ordered pair with the second column of the Factor Correlation Matrix to get:

$$(0.740)(0.636) + (-0.137)(1) = 0.471 - 0.137 = 0.334.$$

Looking at the first row of the Structure Matrix we get $(0.653, 0.333)$, which matches our calculation up to rounding error! This neat fact can be depicted with the following figure:

As a quick aside, suppose that the factors are orthogonal, which means that the factor correlation matrix has 1’s on the diagonal and zeros on the off-diagonal. A quick calculation with the ordered pair $(0.740, -0.137)$ gives

$$(0.740)(1) + (-0.137)(0) = 0.740$$

and similarly,

$$(0.740)(0) + (-0.137)(1) = -0.137$$

and you get back the same ordered pair. This is called multiplying by the identity matrix (think of it as multiplying $2 \times 1 = 2$).
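The relationship between the pattern and structure matrices can be checked for all eight items at once rather than just Item 1. The following is a minimal sketch with a hand-written `matmul` helper (a hypothetical name, standard library only) and the three-decimal values from the SPSS tables, so agreement is expected only up to rounding.

```python
# Pattern Matrix, reported Structure Matrix, and Factor Correlation Matrix.
pattern = [
    [0.740, -0.137], [-0.180, -0.067], [-0.490, -0.108], [0.660, 0.029],
    [0.580, 0.011], [0.077, 0.504], [-0.017, 0.933], [0.462, 0.036],
]
structure_reported = [
    [0.653, 0.333], [-0.222, -0.181], [-0.559, -0.420], [0.678, 0.449],
    [0.587, 0.380], [0.398, 0.553], [0.577, 0.923], [0.485, 0.330],
]
factor_corr = [[1.0, 0.636], [0.636, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Structure = Pattern x Factor Correlation.
structure = matmul(pattern, factor_corr)
max_gap = max(
    abs(s - r)
    for row_s, row_r in zip(structure, structure_reported)
    for s, r in zip(row_s, row_r)
)
print(max_gap < 0.002)  # True: matches the SPSS table up to rounding

# With orthogonal factors the correlation matrix is the identity,
# so the pattern matrix is returned unchanged.
unchanged = matmul(pattern, identity)
```

The largest discrepancy is well under 0.002, which is consistent with the inputs having been rounded to three decimals.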

Questions

1. Without changing your data or model, how would you make the factor pattern matrices and factor structure matrices more aligned with each other?
2. True or False, When you decrease delta, the pattern and structure matrix will become closer to each other.

Answers: 1. Decrease the delta values so that the correlation between factors approaches zero. 2. T, the correlations will become more orthogonal and hence the pattern and structure matrix will be closer.

Total Variance Explained (2-factor PAF Direct Quartimin)

Total Variance Explained

| Factor | Extraction Sums of Squared Loadings: Total | % of Variance | Cumulative % | Rotation Sums of Squared Loadings (a): Total |
|--------|-------|--------|--------|-------|
| 1 | 2.511 | 31.382 | 31.382 | 2.318 |
| 2 | 0.499 | 6.238 | 37.621 | 1.931 |

Extraction Method: Principal Axis Factoring.
a. When factors are correlated, sums of squared loadings cannot be added to obtain a total variance.

As a demonstration, let’s square and sum the loadings from the Structure Matrix for Factor 1:

$$(0.653)^2 + (-0.222)^2 + (-0.559)^2 + (0.678)^2 + (0.587)^2 + (0.398)^2 + (0.577)^2 + (0.485)^2 \approx 2.319.$$

Up to rounding of the displayed loadings, this matches the Rotation Sums of Squared Loadings for the first factor ($2.318$). This means that the Rotation Sums of Squared Loadings represent the non-unique contribution of each factor to total common variance, and summing these squared loadings for all factors can lead to estimates that are greater than the total common variance (here $2.318 + 1.931 = 4.249$, versus a total common variance of $3.01$).

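| Item | Pattern Matrix: Factor 1 | Pattern Matrix: Factor 2 | Structure Matrix: Factor 1 | Structure Matrix: Factor 2 |
|------|--------|--------|--------|--------|
| 1 | 0.740 | -0.137 | 0.653 | 0.333 |
| 2 | -0.180 | -0.067 | -0.222 | -0.181 |
| 3 | -0.490 | -0.108 | -0.559 | -0.420 |
| 4 | 0.660 | 0.029 | 0.678 | 0.449 |
| 5 | 0.580 | 0.011 | 0.587 | 0.380 |
| 6 | 0.077 | 0.504 | 0.398 | 0.553 |
| 7 | -0.017 | 0.933 | 0.577 | 0.923 |
| 8 | 0.462 | 0.036 | 0.485 | 0.330 |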
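Both Rotation Sums of Squared Loadings can be reproduced from the Structure Matrix columns in the table above. This sketch uses the three-decimal loadings, so it agrees with the SPSS values only up to rounding; the variable names are illustrative.

```python
# Structure Matrix loadings for Items 1-8 on Factors 1 and 2.
structure = [
    (0.653, 0.333), (-0.222, -0.181), (-0.559, -0.420), (0.678, 0.449),
    (0.587, 0.380), (0.398, 0.553), (0.577, 0.923), (0.485, 0.330),
]

# Sum of squared structure loadings per factor.
rotation_ssl = [sum(row[j] ** 2 for row in structure) for j in range(2)]

# Total common variance from the Extraction Sums of Squared Loadings.
total_common = 2.511 + 0.499

print([round(s, 2) for s in rotation_ssl])
print(sum(rotation_ssl) > total_common)  # True: the factors overlap
```

Because the two factors are correlated, their shared variance is counted in both columns, which is why the two sums together exceed the total common variance.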

Quiz

True or False

1. In oblique rotation, an element of a factor pattern matrix is the unique contribution of the factor to the item whereas an element in the factor structure matrix is the non-unique contribution of the factor to an item.
2. In the Total Variance Explained table, the Rotation Sums of Squared Loadings represent the unique contribution of each factor to total common variance.
3. The Pattern Matrix can be obtained by multiplying the Structure Matrix with the Factor Correlation Matrix.
4. If the factors are orthogonal, then the Pattern Matrix equals the Structure Matrix.
5. In oblique rotations, the sum of squared loadings for each item across all factors is equal to the communality (in the SPSS Communalities table) for that item.

Answers: 1. T, 2. F, represent the non-unique contribution (which means the total sum of squares can be greater than the total communality), 3. F, the Structure Matrix is obtained by multiplying the Pattern Matrix with the Factor Correlation Matrix, 4. T, it’s like multiplying a number by 1, you get the same number back, 5. F, this is true only for orthogonal rotations, the SPSS Communalities table in rotated factor solutions is based off of the unrotated solution, not the rotated solution.
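Answer 5 can be illustrated numerically. With two correlated factors, an item's communality is not just the sum of its squared pattern loadings: a cross term involving the factor correlation is needed as well. The identity used below, $h^2 = p_1^2 + p_2^2 + 2 p_1 p_2 r$, is a standard result that is not stated in the text above, and the function names are hypothetical. The reference communalities are computed from the Varimax loadings, which is valid because an orthogonal rotation leaves communalities unchanged.

```python
# Pattern loadings (oblique) and Varimax loadings (orthogonal) for three items.
pattern = {1: (0.740, -0.137), 6: (0.077, 0.504), 7: (-0.017, 0.933)}
varimax = {1: (0.646, 0.139), 6: (0.229, 0.507), 7: (0.275, 0.881)}
factor_correlation = 0.636

def naive_sum_of_squares(p):
    """Sum of squared pattern loadings, ignoring the factor correlation."""
    return p[0] ** 2 + p[1] ** 2

def communality_oblique(p, r):
    """Communality for two correlated factors (assumed standard identity)."""
    return p[0] ** 2 + p[1] ** 2 + 2 * p[0] * p[1] * r

for item in pattern:
    reference = sum(x ** 2 for x in varimax[item])
    naive = naive_sum_of_squares(pattern[item])
    adjusted = communality_oblique(pattern[item], factor_correlation)
    print(item, round(reference, 3), round(naive, 3), round(adjusted, 3))
```

For each item the adjusted value agrees with the reference communality up to rounding, while the naive sum of squares does not.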