I will try to collect all my SPSS/STATA and other stats notes here!!
Thursday, 3 November 2011
Uses of (System) missing
COMPUTE temp = $sysmis.
(this syntax will create a variable called temp which will initially have all values set as missing)
For a conditional function:
IF sysmis(v1) v2=$sysmis.
You can also use missing values in RECODE (v1 stands for your own variable):
RECODE v1 (SYSMIS=99).
or
RECODE v1 (99=SYSMIS).
RECODE v1 [your value specifications] (ELSE=SYSMIS).
Also read the UCLA SPSS page and CDC page on handling of missing data.
Tuesday, 1 November 2011
PURPOSIVE SAMPLING
- Extreme or Deviant Case - Learning from highly unusual manifestations of the phenomenon of interest, such as outstanding success/notable failures, top of the class/dropouts, exotic events, crises.
- Intensity - Information-rich cases that manifest the phenomenon intensely, but not extremely, such as good students/poor students, above average/below average.
- Maximum Variation - Purposefully picking a wide range of variation on dimensions of interest...documents unique or diverse variations that have emerged in adapting to different conditions. Identifies important common patterns that cut across variations.
- Homogeneous - Focuses, reduces variation, simplifies analysis, facilitates group interviewing.
- Typical Case - Illustrates or highlights what is typical, normal, average.
- Stratified Purposeful - Illustrates characteristics of particular subgroups of interest; facilitates comparisons.
- Critical Case - Permits logical generalization and maximum application of information to other cases because if it's true of this one case it's likely to be true of all other cases.
- Snowball or Chain - Identifies cases of interest from people who know people who know people who know what cases are information-rich, that is, good examples for study, good interview subjects.
- Criterion - Picking all cases that meet some criterion, such as all children abused in a treatment facility. Quality assurance.
- Theory-Based or Operational Construct - Finding manifestations of a theoretical construct of interest so as to elaborate and examine the construct.
- Confirming or Disconfirming - Elaborating and deepening initial analysis, seeking exceptions, testing variation.
- Opportunistic - Following new leads during fieldwork, taking advantage of the unexpected, flexibility.
- Random Purposeful - (still small sample size) Adds credibility to sample when potential purposeful sample is larger than one can handle. Reduces judgment within a purposeful category. (Not for generalizations or representativeness.)
- Politically Important Cases - Attracts attention to the study (or avoids attracting undesired attention by purposefully eliminating from the sample politically sensitive cases).
- Convenience - Saves time, money, and effort. Poorest rationale; lowest credibility. Yields information-poor cases.
- Combination or Mixed Purposeful - Triangulation, flexibility, meets multiple interests and needs. (Patton, 1990)
Thursday, 27 October 2011
What can you do with COMPUTE in SPSS
COMPUTE y=ABS(x). absolute value of x. ABS(-7)=7.
COMPUTE y=SQRT(x). square root
COMPUTE y=LN(x). natural logarithm
COMPUTE y=LG10(x). base 10 logarithm
COMPUTE y=EXP(x). exponential: e^x
COMPUTE y=TRUNC(x). integer part. TRUNC(5.7)=5.
COMPUTE y=RND(x). round to nearest integer. RND(5.7)=6.
COMPUTE y=MOD(x,11). remainder after division by 11
COMPUTE y=SUM(x1,x2,x3). sum of 3 variables if at least one is non-missing
COMPUTE y=SUM.5(x1 TO x10). sum of 10 variables if at least 5 are non-missing.
COMPUTE y=MEAN.2(x1,x2,x3). mean of 3 variables if at least 2 are non-missing
COMPUTE y=LAG(x). x from previous case
COMPUTE y=$SYSMIS. sets Y to sysmis.
Source: SPSS for Windows 8, 9 and 10 by Svend Juul
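The SUM.n and MEAN.n forms above are the least obvious entries in the list, so here is a rough Python sketch of the rule they describe. The function names spss_sum and spss_mean are made up for this illustration, and None stands in for a system-missing value; this is not SPSS's own implementation.

```python
from typing import Optional, Sequence


def spss_sum(values: Sequence[Optional[float]], min_valid: int = 1) -> Optional[float]:
    """Sketch of SUM.n: add the non-missing values, or return None
    (standing in for system-missing) if fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < max(min_valid, 1):
        return None
    return sum(valid)


def spss_mean(values: Sequence[Optional[float]], min_valid: int = 1) -> Optional[float]:
    """Sketch of MEAN.n: average the non-missing values, or return None
    if fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < max(min_valid, 1):
        return None
    return sum(valid) / len(valid)


if __name__ == "__main__":
    print(spss_sum([1, None, 3]))         # one missing value is ignored
    print(spss_mean([2, 4, None], 2))     # two valid values, so a mean is returned
    print(spss_mean([1, None, None], 2))  # only one valid value, so the result is missing
```

The point of the .n suffix is that the result is only trusted when enough of the inputs are actually present.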
Dropping missing values in SPSS
The SELECT IF command with the SYSMIS() function can drop all cases with missing values from the current SPSS data set. Consider the following:
SELECT IF NOT (SYSMIS(amount)).
SAVE OUTFILE='newfile.sav'.
This example drops all cases whose value of the variable amount is system-missing, and then saves the data to an SPSS system file called newfile.sav.
If several variables can have missing values, as is often the case for survey data, combine the conditions with the AND operator. Note that SYSMIS() detects only system-missing values; the MISSING() function is also true for user-defined missing codes.
SELECT IF NOT (SYSMIS(amount1)) AND NOT (SYSMIS(amount2)).
SAVE OUTFILE='newfile.sav'.
http://kb.iu.edu/data/afay.html
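For comparison, the same listwise deletion can be sketched in Python. Everything here is an assumption made for illustration: the function name drop_missing, the use of None for system-missing, and the user_missing argument for codes such as 99.

```python
from typing import Any, Dict, Iterable, List


def drop_missing(
    rows: List[Dict[str, Any]],
    variables: Iterable[str],
    user_missing: Iterable[Any] = (),
) -> List[Dict[str, Any]]:
    """Keep only rows where every listed variable is present.

    None plays the role of system-missing; values listed in user_missing
    (for example 99) are treated as user-defined missing codes.
    """
    variables = list(variables)
    codes = set(user_missing)
    kept = []
    for row in rows:
        values = [row.get(v) for v in variables]
        if all(v is not None and v not in codes for v in values):
            kept.append(row)
    return kept


if __name__ == "__main__":
    data = [
        {"amount1": 10, "amount2": 5},
        {"amount1": None, "amount2": 7},
        {"amount1": 3, "amount2": 99},
    ]
    print(drop_missing(data, ["amount1", "amount2"]))
    print(drop_missing(data, ["amount1", "amount2"], user_missing=[99]))
```

The second call shows why the distinction matters: a code like 99 survives a system-missing check unless it is named explicitly.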
Thursday, 7 July 2011
little issue with Excel sorting
Found the answer on an Excel discussion board.
Tuesday, 14 June 2011
Remove ALL spaces from cells in MS Excel
One more thing you MAY have to do when you download your data from surveyshare.com is to remove all unnecessary spaces from certain fields, especially if they happen to be your index variable. In my case I had to do it for the email field, which was used to match the responses to other surveys!!! Here is the macro I have found:
Sub TrimEText()
' This macro trims spaces from BOTH SIDES, collapses repeated spaces inside the text,
' and finally removes every remaining space from each selected cell.
Dim MyCell As Range
On Error Resume Next
For Each MyCell In Selection.Cells
' Each pass replaces two consecutive spaces with one.
MyCell.Value = Application.WorksheetFunction.Substitute(Trim(MyCell.Value), "  ", " ")
MyCell.Value = Application.WorksheetFunction.Substitute(Trim(MyCell.Value), "  ", " ")
MyCell.Value = Application.WorksheetFunction.Substitute(Trim(MyCell.Value), "  ", " ")
' Final pass: delete all spaces that are left.
MyCell.Value = Application.WorksheetFunction.Substitute(Trim(MyCell.Value), " ", "")
Next
On Error GoTo 0
End Sub
Really grateful to the author of the macro!!
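If the data ends up in Python rather than Excel, the same clean-up is a one-liner per step. This is only a sketch of the idea; the function names are made up here, and it handles ordinary space characters only (not tabs or non-breaking spaces).

```python
import re


def collapse_spaces(text: str) -> str:
    """Trim spaces from both ends and squeeze inner runs of spaces to one."""
    return re.sub(r" {2,}", " ", text.strip(" "))


def strip_all_spaces(text: str) -> str:
    """Remove every space, e.g. to clean an email field used as a match key."""
    return text.replace(" ", "")


if __name__ == "__main__":
    print(repr(collapse_spaces("  first   last  ")))
    print(repr(strip_all_spaces(" jane.doe @ example.com ")))
```

Use collapse_spaces for free text such as names, and strip_all_spaces only for fields that should never contain a space.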
Thursday, 9 June 2011
Excel Macro to convert the CASE of a range of TEXT
Before using:
"Uncomment" (remove the apostrophe from) the line of code that changes the text to the case you want. For example, I needed everything to be converted to lower case, and hence I removed the apostrophe from "' Rng.Value = StrConv(Rng.Text, vbLowerCase)"
Sub ChangeCase()
Dim Rng As Range
On Error Resume Next
Err.Clear
Application.EnableEvents = False
For Each Rng In Selection.SpecialCells(xlCellTypeConstants, _
xlTextValues).Cells
If Err.Number = 0 Then
' Rng.Value = StrConv(Rng.Text, vbUpperCase)
' Rng.Value = StrConv(Rng.Text, vbLowerCase)
' Rng.Value = StrConv(Rng.Text, vbProperCase)
End If
Next Rng
Application.EnableEvents = True
End Sub
Source: http://www.cpearson.com/excel/ChangingCase.aspx
Wednesday, 27 April 2011
What is the difference between causation and correlation?
One of the most common errors we find in the press is the confusion between correlation and causation in scientific and health-related studies. In theory, these are easy to distinguish — an action or occurrence can cause another (such as smoking causes lung cancer), or it can correlate with another (such as smoking is correlated with alcoholism). If one action causes another, then they are most certainly correlated. But just because two things occur together does not mean that one caused the other, even if it seems to make sense.
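A small simulation can make the last point concrete. In the sketch below (all data is simulated and the names are made up for illustration), a hidden third variable z drives both x and y. Neither x nor y causes the other, yet they come out strongly correlated, and the correlation largely disappears once z is subtracted out.

```python
import random
from typing import List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n: int = 1000, seed: int = 0) -> Tuple[List[float], List[float], List[float]]:
    """Generate a confounder z and two variables that each depend only on z."""
    rng = random.Random(seed)
    z = [rng.random() for _ in range(n)]
    x = [zi + rng.gauss(0, 0.1) for zi in z]  # x depends on z, not on y
    y = [zi + rng.gauss(0, 0.1) for zi in z]  # y depends on z, not on x
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate()
    print("corr(x, y):", pearson(x, y))
    # Remove the shared cause and correlate what is left over.
    x_resid = [xi - zi for xi, zi in zip(x, z)]
    y_resid = [yi - zi for yi, zi in zip(y, z)]
    print("corr after removing z:", pearson(x_resid, y_resid))
```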
Monday, 7 March 2011
Thursday, 24 February 2011
Tuesday, 15 February 2011
Truncating a string variable & other things
This text has been copied from UCLA website!!!
Create string variables: up, which will be the name converted to upper case; lo, which will be the name converted to lower case; and sub, which will be the third through eighth characters of the person's name. Note that we first had to use the STRING command to tell SPSS that up and lo are string variables with a length of up to 14 characters, and that sub is a string variable with a length of up to 6 characters. Had we omitted the STRING command, these would have been treated as numeric variables, and when SPSS tried to assign a character value to the numeric variables, it would have generated an error. We also create len, which is the length of the name variable (including trailing blanks), and len2, which is the length of the person's name once RTRIM has removed the trailing blanks.
STRING up lo (A14)
/sub (A6).
COMPUTE up = UPCASE(name).
COMPUTE lo = LOWER(name).
COMPUTE sub = SUBSTR(name,3,6).
COMPUTE len = LENGTH(name).
COMPUTE len2 = LENGTH(RTRIM(name)).
For more info visit: http://www.ats.ucla.edu/stat/spss/modules/functions.htm
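As a cross-check, here is a rough Python sketch of what those five COMPUTE statements produce. It assumes name is a fixed-width field of 14 characters padded with trailing blanks, which is what the description of len and len2 implies; the function name string_summary is made up for this illustration.

```python
from typing import Dict, Union


def string_summary(name: str, width: int = 14) -> Dict[str, Union[str, int]]:
    """Mimic the SPSS example on a fixed-width, blank-padded string."""
    padded = name.ljust(width)[:width]  # pad or cut to the declared width
    return {
        "up": padded.upper(),
        "lo": padded.lower(),
        # Characters 3 through 8 (1-based), i.e. start at position 3, length 6.
        "sub": padded[2:8],
        "len": len(padded),            # declared width, trailing blanks included
        "len2": len(padded.rstrip()),  # length after trimming trailing blanks
    }


if __name__ == "__main__":
    for key, value in string_summary("John Smith").items():
        print(key, repr(value))
```

Python slices are 0-based and end-exclusive, so padded[2:8] covers the same six characters as a start position of 3 with a length of 6.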
Tuesday, 8 February 2011
Assigning Student Grades Using Excel
=IF(A2>89,"A",IF(A2>79,"B", IF(A2>69,"C",IF(A2>59,"D","F"))))
If there are more than 6 conditions to check, it is better to use LOOKUP than nested IFs:
=LOOKUP(A2,{0,60,63,67,70,73,77,80,83,87,90,93,97},{"F","D-","D","D+","C-","C","C+","B-","B","B+","A-","A","A+"})
source: http://office.microsoft.com/en-us/excel-help/if-HP005209118.aspx
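The LOOKUP formula finds the largest cutoff that is less than or equal to the score and returns the grade in the same position. A sketch of the same idea in Python, using the cutoffs and grades from the formula above (the function name letter_grade is made up here):

```python
from bisect import bisect_right

# Same cutoffs and grades as the LOOKUP formula; both lists must stay aligned.
CUTOFFS = [0, 60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93, 97]
GRADES = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]


def letter_grade(score: float) -> str:
    """Return the grade for the largest cutoff that is <= score."""
    if score < CUTOFFS[0]:
        # LOOKUP returns #N/A below the first cutoff; raise instead.
        raise ValueError("score is below the lowest cutoff")
    position = bisect_right(CUTOFFS, score) - 1
    return GRADES[position]


if __name__ == "__main__":
    for score in (59, 60, 89.5, 100):
        print(score, letter_grade(score))
```

As with LOOKUP, the cutoffs have to be in ascending order for the search to work.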
Thursday, 13 January 2011
a very simple table using CTables
group Universe vs sample
| | 1 Universe | | 2 Sample | | Total Respondents | |
| | Column N % | Count | Column N % | Count | Column N % | Count |
| 1 Female | 53.0% | 904 | 61.0% | 153 | 54.0% | 1057 |
| 2 Male | 46.3% | 789 | 39.0% | 98 | 45.3% | 887 |
| Not specified | .7% | 12 | .0% | 0 | .6% | 12 |
To get the above table use the following syntax:
CTABLES /TABLE gender2 by group [colpct count]
/CATEGORIES VARIABLES=group TOTAL=YES LABEL='Total Respondents'.
&
Main groups
| | Our big univ | | our sample | | Total | |
| | Column N % | Count | Column N % | Count | Column N % | Count |
| 1 Female | 53.0% | 904 | 61.0% | 153 | 54.0% | 1057 |
| 2 Male | 46.3% | 789 | 39.0% | 98 | 45.3% | 887 |
| 3 Not specified | .7% | 12 | .0% | 0 | .6% | 12 |
| Total | 100.0% | 1705 | 100.0% | 251 | 100.0% | 1956 |
For the above, here is the syntax (notice that a Total row has been added, so each column now has its own total):
CTABLES /TABLE gender2 by group [colpct count]
/CATEGORIES VARIABLES=group TOTAL=YES LABEL='Total Respondents'
/CATEGORIES VARIABLES= gender2 TOTAL=YES POSITION=AFTER.
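To see where the Column N % figures come from, here is a rough Python sketch that rebuilds them from the counts in the tables above. The dictionary layout and the function name column_percentages are assumptions made for this illustration; CTABLES does all of this internally.

```python
from typing import Dict, Tuple

# Counts copied from the tables above: group -> gender category -> count.
COUNTS: Dict[str, Dict[str, int]] = {
    "Universe": {"Female": 904, "Male": 789, "Not specified": 12},
    "Sample": {"Female": 153, "Male": 98, "Not specified": 0},
}


def column_percentages(
    counts: Dict[str, Dict[str, int]],
) -> Dict[str, Dict[str, Tuple[int, float]]]:
    """Return (count, column %) per cell, plus a Total column and a Total row."""
    categories = list(next(iter(counts.values())))
    table = dict(counts)
    # Total column: add the counts across groups for each category.
    table["Total"] = {c: sum(col[c] for col in counts.values()) for c in categories}
    result = {}
    for group, col in table.items():
        n = sum(col.values())  # column base for the percentages
        cells = {c: (col[c], round(100 * col[c] / n, 1) if n else 0.0) for c in categories}
        cells["Total"] = (n, 100.0 if n else 0.0)  # Total row
        result[group] = cells
    return result


if __name__ == "__main__":
    for group, cells in column_percentages(COUNTS).items():
        print(group, cells)
```

Each percentage is the cell count divided by its own column total, which is why every column adds up to 100%.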