Wednesday, 25 July 2012

To ADD (or SUM) in SPSS

In SPSS you can add a series of variables in two different ways. The first is to add the variables directly with the + sign, e.g., sons and daughters to get the total number of children. The second is the SUM function, which is handy when you want to create an index based on a series of scores without losing the respondents who missed one or more of the variables in the series (i.e., there is a MISSING value in 1 or more variables for them).
 
compute t_child=sons+daughters.
OR
compute t_child=sum (sons, daughters).

"The difference between the two procedures above is that in the first procedure, the case on total would be missing if any one of the four variables had missing values on a case; in the second procedure, the total would be computed while ignoring missing values on the four variables." (The source's example used four variables; the example here has two.) With SUM, no cases are dropped because of a missing value in any of the variables. "Essentially SPSS treats the missing value as ZERO." The one exception is a case that is missing on every variable in the list: its total is missing as well.

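To make the difference concrete, here is a minimal Python sketch (not SPSS syntax) that mimics the two behaviours described above. It assumes None stands in for a missing value; the helper names plus and spss_sum and the three respondents are made up for illustration.

```python
from typing import Optional

Number = Optional[float]  # None stands in for an SPSS missing value (assumption)


def plus(*values: Number) -> Number:
    """Mimic `a + b`: the result is missing if any operand is missing."""
    if any(v is None for v in values):
        return None
    return sum(values)


def spss_sum(*values: Number) -> Number:
    """Mimic SUM(a, b): add the valid values and skip the missing ones."""
    valid = [v for v in values if v is not None]
    if not valid:  # every argument is missing, so the total is missing too
        return None
    return sum(valid)


# Hypothetical respondents: (sons, daughters)
respondents = [(2, 1), (3, None), (None, None)]

for sons, daughters in respondents:
    print(sons, daughters, plus(sons, daughters), spss_sum(sons, daughters))
# 2 1 3 3
# 3 None None 3
# None None None None
```

The second respondent shows the point of SUM: the + version loses the case, while the SUM version still reports 3 children.
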
In the SUM argument list the variables must be separated by commas, but if there are many variables you can use the TO keyword to give a range of variables that sit next to each other in the data file. For example, if you want to construct a happiness index based on 12 indicators/variables, hap1 through hap12, you can use the following syntax:

compute happiness=sum (hap1 to hap12).

Source: Indiana University IT Services and others.

 
Another point to note is that "the SUM() function is evidently flexible enough to respect more complex statements like SUM(Var1+Var2, Var3-Var4, Var5*Var6)." Hence, do not use the addition symbol inside SUM unless it is part of one of the arguments in the list. Source: SPSSX Discussion group
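
Why the addition symbol matters inside SUM can be seen in a small Python sketch (again, not SPSS syntax). It assumes the two rules quoted earlier: + gives a missing result when either operand is missing, while SUM skips missing arguments. None stands in for a missing value, and plus, spss_sum and the three values are hypothetical.

```python
def plus(a, b):
    """Mimic `a + b`: missing (None) if either operand is missing."""
    return None if a is None or b is None else a + b


def spss_sum(*values):
    """Mimic SUM(...): add the valid arguments, skip the missing ones."""
    valid = [v for v in values if v is not None]
    return sum(valid) if valid else None


var1, var2, var3 = 4, None, 5  # var2 is missing for this respondent

# Like SUM(Var1, Var2, Var3): only the missing Var2 is skipped.
print(spss_sum(var1, var2, var3))        # 9

# Like SUM(Var1+Var2, Var3): Var1+Var2 is missing as a whole,
# so the valid Var1 is lost along with it.
print(spss_sum(plus(var1, var2), var3))  # 5
```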

While on the subject of the flexibility of SUM, there is another neat feature worth noting. If you want to control how many MISSING values a case may have before it is dropped, you can append a number that tells SPSS to compute the total for a CASE/RESPONDENT only if at least that many of the variables are answered. For example:

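COMPUTE V3 = SUM.2(V1, V2).
EXECUTE.

"The .2 appended to the end of the SUM function in the above example can be any integer. Use it to indicate the minimum number of valid cases necessary to perform a given calculation." Source: Indiana University IT Services. Here "valid cases" means valid (non-missing) values among the variables in the list, so in this example V3 is computed only when both V1 and V2 are answered.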

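The minimum-valid rule can be mimicked in a few lines of Python (a sketch, not SPSS syntax). None stands in for a missing value, and the function name spss_sum_min and its min_valid parameter are hypothetical stand-ins for the number after the dot.

```python
from typing import Optional


def spss_sum_min(min_valid: int, *values: Optional[float]) -> Optional[float]:
    """Mimic SUM.n(...): total the valid values, but only if at least
    `min_valid` of the arguments are non-missing; otherwise return None."""
    if min_valid < 1:
        raise ValueError("min_valid must be a positive integer")
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None  # too few answers: the total is missing for this case
    return sum(valid)


print(spss_sum_min(2, 4, 5))     # 9    (like SUM.2 with both values valid)
print(spss_sum_min(2, 4, None))  # None (like SUM.2 with only one valid value)
print(spss_sum_min(1, 4, None))  # 4    (one valid value is enough here)
```
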
Also keep in mind listwise and pairwise deletion, two concepts that come up in SPSS whenever variables with missing values are involved. According to a discussion group, they are defined as:

Listwise - if the respondent has a missing value on any variable, the respondent is omitted from all of your data analysis.

Pairwise - not as harsh as listwise, in that the respondent is dropped only from the analyses involving the variables on which they have missing values.
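
A small Python sketch (not SPSS syntax) shows how the two definitions lead to different numbers of usable respondents. The four respondents and the variable names are invented for illustration, and None stands in for a missing value.

```python
from itertools import combinations

# Hypothetical data: four respondents, three happiness items
rows = [
    {"hap1": 4, "hap2": 5, "hap3": 3},
    {"hap1": 2, "hap2": None, "hap3": 4},
    {"hap1": None, "hap2": 1, "hap3": 5},
    {"hap1": 3, "hap2": 2, "hap3": 1},
]
variables = ["hap1", "hap2", "hap3"]


def is_complete(row, names):
    """True if the respondent has a valid value on every named variable."""
    return all(row[name] is not None for name in names)


# Listwise: keep only respondents who are complete on ALL variables.
listwise_n = sum(is_complete(row, variables) for row in rows)
print("listwise:", listwise_n)  # listwise: 2

# Pairwise: for each pair of variables, keep respondents complete on that pair.
pairwise_n = {
    pair: sum(is_complete(row, pair) for row in rows)
    for pair in combinations(variables, 2)
}
for pair, n in pairwise_n.items():
    print(pair, n)
# ('hap1', 'hap2') 2
# ('hap1', 'hap3') 3
# ('hap2', 'hap3') 3
```

Listwise deletion leaves two respondents for every analysis, while pairwise deletion keeps three of them for two of the three pairs.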


Also check the IBM site and Psychwiki for more on listwise and pairwise deletion.