Research Design and Statistical Consulting
George M. Diekhoff, Ph.D.

Recoding String Variables (words) as Numeric Variables (numbers)

I often get data files from clients who have not yet discovered just how much SPSS hates words! As a consequence, a variable like Gender is stored as a string variable coded “male” or “female” instead of as a numeric variable with numerical codes representing the categories, like 1 = “male” and 0 = “female.” In order to do any serious statistical work with these string variables, they must be recoded as numeric variables. Writing a short SPSS syntax file is the simplest way of getting this done.

Using the example above where men are coded as “male” and women are coded as “female”:
File > New > Syntax will open a blank syntax file.
Type the following commands:
recode Gender ('male' = 1) ('female' = 0) into nGender.
execute.
Then choose Run > All from the syntax window's menu. A new variable called nGender will be created in your data file; cases that are described as “male” will be coded 1 in nGender, and cases that are described as “female” will be coded 0.
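
If you want to double-check the result, a couple of extra lines of syntax along the following lines should do it. This is only a sketch: it assumes the variables are named Gender and nGender as in the example, and the label text is my own choice. The counts for nGender should match the counts for Gender, and any case whose Gender value was not spelled exactly 'male' or 'female' will show up as missing in nGender.

```spss
* Label the numeric codes so output stays readable.
value labels nGender 1 'male' 0 'female'.

* Compare the old and new variables side by side.
frequencies variables = Gender nGender.
```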

There’s just one caveat, and I was frustrated beyond belief until I finally figured this out (and of course nobody thinks to mention it!!). Although SPSS is generally not case-sensitive software, it is when it comes to dealing with string variables! So guess what would happen if you wrote and ran the following syntax: 

recode Gender ('Male' = 1) ('Female' = 0) into nGender.

NOTHING! Nothing gets recoded at all (nGender is created, but every case is left system-missing), because SPSS is looking for instances of “Male” to recode as 1’s and it’s only finding “male.” And it’s looking for instances of “Female” to recode as 0’s and it’s only finding “female.”

The same problem with case sensitivity pops up when using the SPSS “Select Cases” command to select a subset of cases for an analysis. Returning to the example above, if you ran a Select Cases If: Gender = 'Male' you wouldn’t select anybody, because in the data file the men are identified as “male,” not “Male.”
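
One way around the case problem, when a data file mixes “Male,” “male,” and “MALE,” is to convert the strings to lowercase before matching. The sketch below is one possible approach, not the only one. The names lGender and maleOnly and the width A10 are hypothetical; use a width at least as large as the width of your own Gender variable. It uses a filter rather than deleting cases, so nothing is lost from the data file.

```spss
* Make a lowercase working copy of Gender, then recode the copy.
string lGender (A10).
compute lGender = lower(ltrim(Gender)).
recode lGender ('male' = 1) ('female' = 0) into nGender.
execute.

* Select the men regardless of how the word was capitalized.
compute maleOnly = (lower(ltrim(Gender)) = 'male').
filter by maleOnly.
execute.
```

Running filter off. afterward returns all cases to the analysis.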