Research Design and Statistical Consulting
George M. Diekhoff, Ph.D.

Your Case ID Numbers and SPSS’s Case ID Numbers Are Not the Same

It is helpful to number each case (line) in your SPSS data file so that you can refer to cases by their numbers. A convenient way to generate a variable, “CaseID,” that holds consecutive case numbers for a data file is to create and run a syntax file as follows:


COMPUTE CaseID = $CASENUM.
FORMATS CaseID (F8.0).
EXECUTE.


That will create a variable named CaseID that numbers each line of your data file from 1 through N.

Suppose now, though, that you delete case #3 for some reason. Perhaps the case was identified as a “flatliner” (no variability from one rating scale to the next) or a “speeder” (completed your survey in an unreasonably short period of time), or something like that. With that case removed, your CaseID numbers will now run 1, 2, 4, 5, …, N.
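
The deletion described above can also be done in syntax rather than by hand in the Data Editor. The following is a minimal sketch, assuming that CaseID already exists in the active data file and that case #3 is the one to drop. The LIST command at the end is only there to print the remaining CaseID values so that the gap is visible.

```spss
* Assumes CaseID was created earlier and that 3 is the case to drop.
SELECT IF (CaseID NE 3).
EXECUTE.
* Print the CaseID values for the first five remaining lines.
LIST VARIABLES=CaseID /CASES=FROM 1 TO 5.
```

Note that SELECT IF without a preceding TEMPORARY command removes the case permanently from the active data file, so it is wise to keep a saved copy of the original data before running it.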

Now suppose you run an analysis in SPSS whose output identifies cases, such as the Explore procedure, which can be used to identify univariate extreme scores and outliers. Further suppose that the case you’ve designated as #4, which now occupies the third line of the data file, is such an outlier. By default, SPSS won’t identify that case by the CaseID number that you’ve assigned. Rather, since the case occupies the third line in the data file, the outlier will be identified by its line number (3), not its CaseID number (4).
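
One way to see the mismatch, and one possible way to work around it, is sketched below. The variable names LineNum and score are hypothetical and used only for illustration. LineNum stores each case’s current line position so that it can be listed next to CaseID. The /ID subcommand of EXAMINE (the syntax behind the Explore procedure) asks SPSS to label cases with the values of the variable you name.

```spss
* LineNum and score are hypothetical names used for illustration.
* Store the current line position of each case.
COMPUTE LineNum = $CASENUM.
FORMATS LineNum (F8.0).
EXECUTE.
* Compare your own ID numbers with the current line numbers.
LIST VARIABLES=CaseID LineNum /CASES=FROM 1 TO 5.
* Ask Explore to label extreme values and outliers using CaseID.
EXAMINE VARIABLES=score
  /ID=CaseID
  /PLOT=BOXPLOT
  /STATISTICS=EXTREME.
```

If case #3 has been deleted, the listing should show CaseID 4 paired with LineNum 3, which is the discrepancy described above.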