Thursday 5 January 2012

Working with a sub-group of data

Sometimes you want to include only a subset or a sample of the data in your analysis. SPSS offers several ways of doing that.

1. Using a sample of the data
If you just want to work on a randomly chosen percentage of the data, this command selects approximately 30% of the responses:
TEMPORARY.
SAMPLE .30.
You can also select an exact number of cases, e.g.:
SAMPLE 50 FROM 100.
will choose 50 cases randomly from the first 100 cases.

Note: Putting "TEMPORARY" before the command makes sure that the sampling applies only to the next procedure you run, so your original data does not change.
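
As a rough sketch of how this looks in a full run, the syntax below samples 30% of the cases for one frequency table only. The variable name sleeptime is made up for illustration, and the SET SEED line is an optional extra (not part of the original recipe) so that the sample can be reproduced.

```spss
* Hypothetical variable name: sleeptime.
* Optional: fix the random seed so the sample can be reproduced.
SET SEED=12345.
* TEMPORARY limits the sampling to the next procedure only.
TEMPORARY.
SAMPLE .30.
FREQUENCIES VARIABLES=sleeptime.
* This second table is based on all of the cases again.
FREQUENCIES VARIABLES=sleeptime.
```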

2. Selected group of cases

You can use SELECT IF (by itself or with TEMPORARY) to run your statistics only on the cases that meet a criterion. For example, if you want to see the sleep time for mothers with children under 2 years old, you may have to select on both the sex field (female, if there is one) and the age of the child. Without TEMPORARY, the cases that are not selected are dropped from the working data. SELECT IF can also be combined with SYSMIS (system-missing values), for example to keep only the cases where a variable is not missing.
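
A minimal sketch of the mothers example, assuming hypothetical variables sex (coded 2 for female), childage (in years) and sleeptime; your own names and codes will differ. The second half shows one way to use SYSMIS to keep only the cases where sleeptime has a value.

```spss
* Hypothetical variables and coding: sex (2 = female), childage in years, sleeptime.
TEMPORARY.
SELECT IF (sex = 2 AND childage < 2).
DESCRIPTIVES VARIABLES=sleeptime.

* Keep only the cases where sleeptime is not system-missing.
TEMPORARY.
SELECT IF (NOT SYSMIS(sleeptime)).
DESCRIPTIVES VARIABLES=sleeptime.
```

Because each SELECT IF follows a TEMPORARY, the selection lasts for one DESCRIPTIVES run and all cases are available again afterwards.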

3. Filter

You can filter the data according to a criterion stored in a variable, e.g. work only with the boys or the girls in the sample, or with children who are in school or not in school. The only thing to remember is that the filter variable should be a dummy variable (with 0 and 1 values). The filter "turns off" the zeros (and cases with missing values) without deleting them, e.g. if the variable gender assigns 0 to boys and 1 to girls, then after filtering on it the statistics are generated only for girls. To do so:

FILTER BY variable name.


When you are done, don't forget to use:
FILTER OFF.
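
If the variable you want to filter on is not already a 0/1 dummy, one possible approach is to compute one first. The sketch below assumes a hypothetical variable school coded 1 for "in school" and 2 for "not in school", plus a hypothetical sleeptime variable.

```spss
* Hypothetical variables: school (1 = in school, 2 = not in school), sleeptime.
* The logical expression gives 1 when true and 0 when false.
COMPUTE inschool = (school = 1).
FILTER BY inschool.
DESCRIPTIVES VARIABLES=sleeptime.
* Switch the filter off so later procedures use all of the cases.
FILTER OFF.
```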

4. Split file

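Split file runs each following analysis separately for every group of a variable, e.g. once for boys and once for girls. First, sort the cases by the variable you want to split the file by.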

SORT CASES BY variable name.
Now use the split command.

SPLIT FILE BY variable name.

Don't forget to turn the split function off when you are done.

SPLIT FILE OFF.
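
Put together, the whole sequence might look like the sketch below, again with the hypothetical variables gender and sleeptime. The LAYERED keyword is written out here only to make the choice visible; it puts the groups in one table, while SEPARATE would give each group its own table.

```spss
* Hypothetical variables: gender, sleeptime.
SORT CASES BY gender.
SPLIT FILE LAYERED BY gender.
* The statistics are reported separately for each value of gender.
DESCRIPTIVES VARIABLES=sleeptime.
SPLIT FILE OFF.
```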


(thanks to UCLA SPSS Learning Modules for their online guide)
