I will try to collect all my SPSS/STATA and other stats notes here!!

## Thursday, 3 November 2011

### Uses of (System) missing

COMPUTE temp = $sysmis.

(This syntax creates a variable called temp whose values are all initially set to system-missing.)

For a conditional function:

IF sysmis(v1) v2=$sysmis.

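You can also use missing values in RECODE, either as the value being recoded (v1 is a placeholder variable name):

RECODE v1 (SYSMIS=99).

or as the value being assigned:

RECODE v1 (99=SYSMIS).

RECODE [your command] (ELSE=SYSMIS).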

Also read the UCLA SPSS page and CDC page on handling of missing data.
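For readers who think more easily in a general-purpose language, here is a minimal Python sketch of the two RECODE directions above. It is only an illustration, not SPSS itself: `None` stands in for system-missing, and the function names `recode_to_code` and `recode_to_missing` are made up for this example.

```python
from typing import Optional

# Assumption of this sketch: None plays the role of SPSS system-missing.
SYSMIS = None


def recode_to_code(value: Optional[float], code: float = 99) -> float:
    """Roughly RECODE v1 (SYSMIS=99): replace a missing value with a numeric code."""
    return code if value is SYSMIS else value


def recode_to_missing(value: Optional[float], code: float = 99) -> Optional[float]:
    """Roughly RECODE v1 (99=SYSMIS): turn a numeric code back into missing."""
    return SYSMIS if value == code else value


if __name__ == "__main__":
    print([recode_to_code(v) for v in [1, None, 3]])
    print([recode_to_missing(v) for v in [1, 99, 3]])
```

The point of the two functions is that the recode is symmetric: a placeholder code such as 99 can be swapped for missing and back again.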

## Tuesday, 1 November 2011

### PURPOSIVE SAMPLING

- **Extreme or Deviant Case** - Learning from highly unusual manifestations of the phenomenon of interest, such as outstanding success/notable failures, top of the class/dropouts, exotic events, crises.
- **Intensity** - Information-rich cases that manifest the phenomenon intensely, but not extremely, such as good students/poor students, above average/below average.
- **Maximum Variation** - Purposefully picking a wide range of variation on dimensions of interest...documents unique or diverse variations that have emerged in adapting to different conditions. Identifies important common patterns that cut across variations.
- **Homogeneous** - Focuses, reduces variation, simplifies analysis, facilitates group interviewing.
- **Typical Case**
- **Stratified Purposeful** - Illustrates characteristics of particular subgroups of interest; facilitates comparisons.
- **Critical Case** - Permits logical generalization and maximum application of information to other cases because if it's true of this one case it's likely to be true of all other cases.
- **Snowball or Chain** - Identifies cases of interest from people who know people who know people who know what cases are information-rich, that is, good examples for study, good interview subjects.
- **Criterion**
- **Theory-Based or Operational Construct** - Finding manifestations of a theoretical construct of interest so as to elaborate and examine the construct.
- **Confirming or Disconfirming**
- **Opportunistic** - Following new leads during fieldwork, taking advantage of the unexpected, flexibility.
- **Random Purposeful**
- **Politically Important Cases**
- **Convenience**
- **Combination or Mixed Purposeful**

Source: Patton, M. Q. (1990). *Qualitative evaluation and research methods* (2nd ed.). Newbury Park, CA: Sage Publications.

## Thursday, 27 October 2011

### What can you do with COMPUTE in SPSS

COMPUTE y=ABS(x). absolute value of x. ABS(-7)=7.

COMPUTE y=SQRT(x). square root

COMPUTE y=LN(x). natural logarithm

COMPUTE y=LG10(x). base 10 logarithm

COMPUTE y=EXP(x). exponential: e^x

COMPUTE y=TRUNC(x). integer part. TRUNC(5.7)=5.

COMPUTE y=RND(x). round to nearest integer. RND(5.7)=6

COMPUTE y=MOD(x,11). remainder after division by 11

COMPUTE y=SUM(x1,x2,x3). sum of 3 variables if at least one is non-missing

COMPUTE y=SUM.5(x1 TO x10). sum of 10 variables if at least 5 are non-missing.

COMPUTE y=MEAN.2(x1,x2,x3). mean of 3 variables if at least 2 are non-missing

COMPUTE y=LAG(x). x from previous case

COMPUTE y=$SYSMIS. sets Y to sysmis.

Source: SPSS for Windows 8, 9 and 10 by Svend Juul
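The "at least n non-missing" rule behind SUM.5 and MEAN.2 is the least obvious item in the list above, so here is a small Python sketch of that logic. It is an illustration under assumptions, not SPSS code: `None` stands in for a missing value, and `sum_n` and `mean_n` are hypothetical names chosen for this example.

```python
from typing import Optional, Sequence


def sum_n(values: Sequence[Optional[float]], min_valid: int = 1) -> Optional[float]:
    """Sum the non-missing values, or return None if fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < max(min_valid, 1):
        return None
    return sum(valid)


def mean_n(values: Sequence[Optional[float]], min_valid: int = 1) -> Optional[float]:
    """Average the non-missing values, or return None if fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < max(min_valid, 1):
        return None
    return sum(valid) / len(valid)


if __name__ == "__main__":
    print(sum_n([1, None, 3]))         # like SUM(x1,x2,x3)
    print(mean_n([2, None, 4], 2))     # like MEAN.2(x1,x2,x3)
    print(mean_n([2, None, None], 2))  # too few valid values, so missing
```

Note that the mean is taken over the valid values only, so a case with two of three answers gets the mean of those two, not a sum divided by three.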

### Dropping missing values in SPSS

The SELECT IF command with the SYSMIS() function can drop all cases with a system-missing value from the current SPSS data set. Consider the following:

SELECT IF NOT (SYSMIS (amount)). SAVE OUTFILE= 'newfile.sav'.

This example drops all cases whose value of the variable amount is missing, and then saves this data to an SPSS system file called newfile.sav.

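To drop cases that are missing on any of several variables, as is often needed for survey data, combine the tests with the AND operator. Note that SYSMIS() detects only system-missing values; if the data also use user-defined missing codes, the MISSING() function catches both kinds: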

SELECT IF NOT (SYSMIS(amount1)) AND NOT (SYSMIS(amount2)). SAVE OUTFILE= 'newfile.sav'.

http://kb.iu.edu/data/afay.html
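The same listwise deletion can be sketched in Python for comparison. This is a rough analogue only: each case is a plain dictionary, `None` stands in for a missing value, and `drop_missing` is a hypothetical helper name, not anything from SPSS.

```python
from typing import Dict, List, Optional, Sequence

Case = Dict[str, Optional[float]]


def drop_missing(cases: List[Case], variables: Sequence[str]) -> List[Case]:
    """Keep only the cases with a non-missing value on every listed variable."""
    return [
        case
        for case in cases
        if all(case.get(var) is not None for var in variables)
    ]


if __name__ == "__main__":
    data = [
        {"amount1": 10, "amount2": 5},
        {"amount1": None, "amount2": 7},
        {"amount1": 3, "amount2": None},
    ]
    # Analogue of SELECT IF NOT (SYSMIS(amount1)) AND NOT (SYSMIS(amount2)).
    print(drop_missing(data, ["amount1", "amount2"]))
```

As with SELECT IF, the filter is destructive for the working copy, so it is worth keeping the original data until the result has been checked.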

## Thursday, 7 July 2011

### little issue with Excel sorting

Found the answer on an Excel discussion board.

## Tuesday, 14 June 2011

### Remove ALL spaces from cells in MS Excel

One more thing you MAY have to do when you download your data from surveyshare.com is to remove all unnecessary spaces from certain fields, especially if they happen to be your index variable. In my case I had to do it for the email field, which was used to match the responses to other surveys!!! Here is the macro I found:

Sub TrimEText()

' Removes ALL spaces (leading, trailing and embedded) from every cell in the current selection.

Dim MyCell As Range

On Error Resume Next

For Each MyCell In Selection.Cells

' Substitute replaces every space character with an empty string.

MyCell.Value = Application.WorksheetFunction.Substitute(MyCell.Value, " ", "")

Next

On Error GoTo 0

End Sub

Really grateful to the author of the macro!!
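If the cleaning is done outside Excel, the same idea is a one-liner. The Python sketch below shows two variants under assumed names (`remove_all_spaces` and `collapse_spaces` are made up for this example): one strips every space, which suits a key field such as an email address, and one only tidies runs of whitespace, which suits free text.

```python
def remove_all_spaces(text: str) -> str:
    """Delete every space character, including those inside the text."""
    return text.replace(" ", "")


def collapse_spaces(text: str) -> str:
    """Trim both ends and reduce each internal run of whitespace to one space."""
    return " ".join(text.split())


if __name__ == "__main__":
    print(remove_all_spaces("  jane.doe @example.com "))
    print(collapse_spaces("  Jane    Doe  "))
```

For a matching key, removing all spaces is usually the safer choice, since a stray space anywhere in the value would otherwise break the match.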

## Thursday, 9 June 2011

### Excel Macro to convert the CASE of a range of TEXT

Before using:

"Uncomment" (remove the apostrophe from) the line of code that changes the text to the case you want. For example, I needed everything to be converted to lower case and hence I removed the apostrophe from "' Rng.Value = StrConv(Rng.Text, vbLowerCase)"

Sub ChangeCase()

Dim Rng As Range

On Error Resume Next

Err.Clear

Application.EnableEvents = False

For Each Rng In Selection.SpecialCells(xlCellTypeConstants, xlTextValues).Cells

If Err.Number = 0 Then

' Rng.Value = StrConv(Rng.Text, vbUpperCase)

' Rng.Value = StrConv(Rng.Text, vbLowerCase)

' Rng.Value = StrConv(Rng.Text, vbProperCase)

End If

Next Rng

Application.EnableEvents = True

End Sub

Source: http://www.cpearson.com/excel/ChangingCase.aspx

## Wednesday, 27 April 2011

### What is the difference between causation and correlation?


One of the most common errors we find in the press is the confusion between *correlation* and *causation* in scientific and health-related studies. In theory, these are easy to distinguish: an action or occurrence can *cause* another (such as smoking causes lung cancer), or it can *correlate* with another (such as smoking is correlated with alcoholism). If one action causes another, then they are most certainly correlated. But just because two things occur together does not mean that one caused the other, even if it seems to make sense.

## Monday, 7 March 2011

## Thursday, 24 February 2011

## Tuesday, 15 February 2011

### Truncating a string variable & other things

#### This text has been copied from UCLA website!!!

Create a string variable **up** that will be the name converted to upper case, **lo** that will be the name converted to lower case, and **sub** that will be the third through eighth characters in the person's name. Note that we first had to use the **string** command to tell SPSS that **up**, **lo** and **sub** are string variables (**up** and **lo** with a length of up to 14 characters, **sub** with a length of 6). Had we omitted the **string** command, these would have been treated as numeric variables, and when SPSS tried to assign a character value to the numeric variables, it would have generated an error. We also create **len**, the length of the name variable, and **len2**, the length of the person's name with trailing blanks removed.

STRING up lo (A14)

/sub (A6).

COMPUTE up = UPCASE(name).

COMPUTE lo = LOWER(name).

COMPUTE sub = SUBSTR(name,3,6).

COMPUTE len = LENGTH(name).

COMPUTE len2 = LENGTH(RTRIM(name)).

For more info visit: http://www.ats.ucla.edu/stat/spss/modules/functions.htm
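For comparison, here is how the same five values could be derived in Python. This is a sketch under assumptions: SPSS pads string variables with trailing blanks to their declared width, so the example pads the name to 14 characters first, and the function name `string_features` is invented for this illustration.

```python
from typing import Dict, Union


def string_features(name: str, width: int = 14) -> Dict[str, Union[str, int]]:
    """Mimic the UPCASE / LOWER / SUBSTR / LENGTH steps on a blank-padded string."""
    # Pad (or cut) to the declared width, as a fixed-width string variable would be.
    padded = name.ljust(width)[:width]
    return {
        "up": padded.upper(),
        "lo": padded.lower(),
        # Characters 3 through 8 (1-based) are the slice [2:8] in 0-based Python.
        "sub": padded[2:8],
        "len": len(padded),
        "len2": len(padded.rstrip()),
    }


if __name__ == "__main__":
    print(string_features("Johnathan"))
```

The difference between `len` and `len2` is the reason for the RTRIM call: without trimming, the length reported is the padded width rather than the length of the name itself.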

## Tuesday, 8 February 2011

### Assigning Student Grades Using Excel

=IF(A2>89,"A",IF(A2>79,"B", IF(A2>69,"C",IF(A2>59,"D","F"))))

If there are more than 6 conditions to check, it is better to use LOOKUP than nested IF formulas:

=LOOKUP(A2,{0,60,63,67,70,73,77,80,83,87,90,93,97},{"F","D-","D","D+","C-","C","C+","B-","B","B+","A-","A","A+"})

source: http://office.microsoft.com/en-us/excel-help/if-HP005209118.aspx
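The LOOKUP formula works by finding the largest cutoff that is less than or equal to the score. A minimal Python sketch of the same idea, using the standard `bisect` module, is shown below. The cutoffs and grades are copied from the formula above; the function name `letter_grade` and the choice to raise an error for scores below the first cutoff are assumptions of this example.

```python
from bisect import bisect_right

# Same cutoffs and letter grades as in the LOOKUP formula.
CUTOFFS = [0, 60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93, 97]
GRADES = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]


def letter_grade(score: float) -> str:
    """Return the grade whose cutoff is the largest one not exceeding score."""
    if score < CUTOFFS[0]:
        raise ValueError("score is below the lowest cutoff")
    # bisect_right counts how many cutoffs are <= score; subtract 1 for the index.
    return GRADES[bisect_right(CUTOFFS, score) - 1]


if __name__ == "__main__":
    for s in (59, 60, 89.5, 97):
        print(s, letter_grade(s))
```

Keeping the cutoffs in a sorted list makes it easy to change the grading scale without rewriting a chain of nested conditions.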

## Thursday, 13 January 2011

### a very simple table using CTables

**group Universe vs sample**

| | 1 Universe: Column N % | 1 Universe: Count | 2 Sample: Column N % | 2 Sample: Count | Total Respondents: Column N % | Total Respondents: Count |
| --- | --- | --- | --- | --- | --- | --- |
| 1 Female | 53.0% | 904 | 61.0% | 153 | 54.0% | 1057 |
| 2 Male | 46.3% | 789 | 39.0% | 98 | 45.3% | 887 |
| Not specified | .7% | 12 | .0% | 0 | .6% | 12 |

To get the above table use the following syntax:

CTABLES /TABLE gender2 by group [colpct count]

/CATEGORIES VARIABLES=group TOTAL=YES LABEL='Total Respondents'.

&

**Main groups**

| | Our big univ: Column N % | Our big univ: Count | our sample: Column N % | our sample: Count | Total: Column N % | Total: Count |
| --- | --- | --- | --- | --- | --- | --- |
| 1 Female | 53.0% | 904 | 61.0% | 153 | 54.0% | 1057 |
| 2 Male | 46.3% | 789 | 39.0% | 98 | 45.3% | 887 |
| 3 Not specified | .7% | 12 | .0% | 0 | .6% | 12 |
| Total | 100.0% | 1705 | 100.0% | 251 | 100.0% | 1956 |

For the above, here is the syntax (notice the table now also has a Total row giving the total for each column):

CTABLES /TABLE gender2 by group [colpct count]

/CATEGORIES VARIABLES=group TOTAL=YES LABEL='Total Respondents'

/CATEGORIES VARIABLES= gender2 TOTAL=YES POSITION=AFTER.
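To see where the Column N % figures come from, the Python sketch below recomputes them from the counts in the tables above. It is a plain-arithmetic illustration, not a replacement for CTABLES; the dictionary layout and the function names `column_percentages` and `total_column` are assumptions of this example.

```python
from typing import Dict

# Counts taken from the tables above.
COUNTS: Dict[str, Dict[str, int]] = {
    "Universe": {"Female": 904, "Male": 789, "Not specified": 12},
    "Sample": {"Female": 153, "Male": 98, "Not specified": 0},
}


def column_percentages(column: Dict[str, int]) -> Dict[str, float]:
    """Express each count as a percentage of its column total, to one decimal."""
    total = sum(column.values())
    return {category: round(100 * n / total, 1) for category, n in column.items()}


def total_column(counts: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Add the group columns together, category by category."""
    total: Dict[str, int] = {}
    for column in counts.values():
        for category, n in column.items():
            total[category] = total.get(category, 0) + n
    return total


if __name__ == "__main__":
    for group, column in COUNTS.items():
        print(group, sum(column.values()), column_percentages(column))
    combined = total_column(COUNTS)
    print("Total", sum(combined.values()), column_percentages(combined))
```

Each percentage is taken within its own column, which is why the Universe, Sample and Total columns each add up to 100% independently.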