Research Design and Statistical Consulting

George M. Diekhoff, Ph.D.

Although the output from many SPSS statistical procedures is rich (sometimes too rich), that output is not always clearly labeled. This is particularly true when it comes to information about correlations, partial correlations, and semi-partial correlations (called “part” correlations by SPSS) that is included in the output from the Regression procedure (at least it’s included if you’ve selected “part and partial correlations” in the Statistics dialog box).

Here’s a quick guide to reading Regression output pertaining to correlations, partial correlations, and semi-partial correlations.

Let’s start with a simple example of partial and semi-partial correlations. Suppose that you want to study the relationship between the number of cigarettes people smoke per year (your X variable) and the number of colds they catch per year (your Y variable). However, you think that stress may cloud this relationship because stress could affect both X (amount of smoking) and Y (how many colds people get). In a partial correlation analysis, you would treat stress as your nuisance or mediating variable (we’ll call it A) and you’d control statistically for the influence of stress on both cigarette smoking and colds because stress can reasonably be expected to impact both your X and Y variables.

For an example of semi-partial correlation, suppose you are looking at the correlation between employee age (X) and job satisfaction (Y), and you see that the correlation is positive: older workers are more satisfied than younger workers. You think, however, that this relationship might be mediated by income (A): older employees might be more satisfied with their jobs than younger employees, not because of age per se, but because older employees make more money. Here, you would need to statistically control for the effects of income on just job satisfaction, but not age, because income could affect satisfaction but income could not affect age.
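Both examples come down to the same mechanical step: fit a straight line predicting a variable from A, keep what is left over (the residuals), and correlate those leftovers. Here is a minimal sketch of that idea, assuming NumPy is available. The data are simulated purely for illustration, and the helper names `residuals` and `pearson` are our own, not anything from SPSS.

```python
import numpy as np

def residuals(target, control):
    """Part of `target` left over after a straight-line fit on `control`."""
    slope, intercept = np.polyfit(control, target, 1)
    return target - (slope * control + intercept)

def pearson(u, v):
    """Ordinary Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(u, v)[0, 1])

# Made-up data for illustration only: A influences both X and Y.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
x = 0.6 * a + rng.normal(size=500)
y = 0.5 * a + 0.3 * x + rng.normal(size=500)

r_xy = pearson(x, y)                                   # nothing controlled
r_partial = pearson(residuals(x, a), residuals(y, a))  # A removed from X and Y
r_semipartial = pearson(x, residuals(y, a))            # A removed from Y only

print(f"zero-order:   {r_xy:.3f}")
print(f"partial:      {r_partial:.3f}")
print(f"semi-partial: {r_semipartial:.3f}")
```

The only difference between the last two correlations is whether X is also replaced by its residuals, which is exactly the difference between the smoking example and the age example.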

Now that we’ve seen a couple of examples of when partial and semi-partial correlation might be used, let’s get some symbols out of the way.

X, Y, and A are the names of our variables as follows:

X and Y = the primary variables. It is the relationship between these two variables that you’re mostly interested in, like the relationship between X = temperature and Y = burglaries.

A = the nuisance or mediating variable that you want to control statistically in order to more clearly study the relationship between X and Y. For instance, A = the number of homeowners who are away on vacation.

Now here are symbols for the various types of correlations:

Rxy = the Pearson correlation between X and Y. SPSS calls this the “zero-order” correlation because in this correlation there are “zero” nuisance or mediating variables being controlled.

Rxy.a = the partial correlation between X and Y, controlling both statistically for the influence of A

Ry(x.a) = the semi-partial (“part”) correlation between X and Y, statistically controlling only X for the influence of A. Inside the parentheses, the variable written before the dot is the one being adjusted; this is the version SPSS reports when Y is the dependent variable.

Ray = the Pearson correlation between A and Y

Ray.x = the partial correlation between A and Y, controlling both statistically for the influence of X

Ry(a.x) = the semi-partial (“part”) correlation between A and Y, statistically controlling only A for the influence of X
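If you already have the three zero-order correlations, the partial and semi-partial values can be worked out by hand from the standard textbook formulas. The sketch below is one way to write them down. The function names and the `remove_from` argument are our own labels, and the three input correlations are hypothetical numbers chosen only to show the arithmetic.

```python
from math import sqrt

def partial_r(r_xy, r_xa, r_ya):
    """Correlation of X and Y with A removed from both."""
    return (r_xy - r_xa * r_ya) / sqrt((1 - r_xa**2) * (1 - r_ya**2))

def semipartial_r(r_xy, r_xa, r_ya, remove_from="x"):
    """Correlation of X and Y with A removed from only one of them.

    remove_from="x" adjusts X only; remove_from="y" adjusts Y only.
    """
    r_adjusted = r_xa if remove_from == "x" else r_ya
    return (r_xy - r_xa * r_ya) / sqrt(1 - r_adjusted**2)

# Hypothetical zero-order correlations, for illustration only.
r_xy, r_xa, r_ya = 0.50, 0.40, 0.30

print("partial:               ", round(partial_r(r_xy, r_xa, r_ya), 3))
print("semi-partial (X only): ", round(semipartial_r(r_xy, r_xa, r_ya, "x"), 3))
print("semi-partial (Y only): ", round(semipartial_r(r_xy, r_xa, r_ya, "y"), 3))
```

All three share the same numerator. The partial correlation divides by a term for each adjusted variable, while a semi-partial divides by the term for the one variable that was adjusted, so a semi-partial is never larger in absolute value than the matching partial.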

All six of these correlations are listed in the SPSS output box labeled “Coefficients.” But they are not clearly labeled, so it’s hard to know which numbers mean which things. Here is how the correlations are organized in the SPSS Output:

                     Correlations
Predictor | Zero-Order | Partial | Part
X         | Rxy        | Rxy.a   | Ry(x.a)
A         | Ray        | Ray.x   | Ry(a.x)
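To check your reading of this box, you can rebuild the same three columns outside SPSS. The sketch below assumes Y was entered as the dependent variable with X and A as the two predictors, and it follows SPSS's documented definition of the part correlation, in which the other predictor is removed from the row's predictor only. The data are simulated and the helper names are our own.

```python
import numpy as np

def residuals(target, control):
    """Part of `target` left over after a straight-line fit on `control`."""
    slope, intercept = np.polyfit(control, target, 1)
    return target - (slope * control + intercept)

def pearson(u, v):
    """Ordinary Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(u, v)[0, 1])

def correlation_row(predictor, other, y):
    """Zero-order, partial, and part correlations for one predictor of y."""
    adjusted = residuals(predictor, other)
    return (
        pearson(predictor, y),                  # zero-order
        pearson(adjusted, residuals(y, other)), # partial: both adjusted
        pearson(adjusted, y),                   # part: predictor adjusted only
    )

# Made-up data for illustration only.
rng = np.random.default_rng(1)
a = rng.normal(size=300)
x = 0.5 * a + rng.normal(size=300)
y = 0.4 * x + 0.3 * a + rng.normal(size=300)

rows = {"X": correlation_row(x, a, y), "A": correlation_row(a, x, y)}

print("Predictor  Zero-Order  Partial   Part")
for name, (zero, partial, part) in rows.items():
    print(f"{name:<9}  {zero:>10.3f}  {partial:>7.3f}  {part:>5.3f}")
```

Each printed row lines up with one row of the Correlations columns, so running this on your own variables should make it clear which number is which.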

As is typical, SPSS has given us far more information than we needed (or wanted) in order to answer our research questions. For the example of the partial correlation given earlier, we’d want to know the correlation between cigarette smoking and colds, Rxy, and the partial correlation between cigarette smoking and colds, controlling both statistically for the influence of stress, Rxy.a. None of the other values provided by SPSS are relevant.

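For the example of the semi-partial correlation, we’d want to know the correlation between age and job satisfaction, Rxy, and the semi-partial correlation between age and satisfaction, controlling satisfaction statistically for the influence of income, Rx(y.a). Keep in mind that the Part column adjusts the predictor in each row, not the dependent variable, so to get this value you would enter age as the dependent variable and satisfaction and income as the predictors, then read the Part entry in the satisfaction row. None of the rest of the output is relevant.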