Homework #11

SPSS Statistical commands:  CROSSTABS

(15 pts)


For this assignment, you will perform what most statistics texts call a Chi-square analysis. You may also hear it called a "crosstabulation" or "crosstabs."  This analysis is simple and requires only "qualitative" variables.  If both variables are qualitative, or one is qualitative and the other is quantitative with only a few (e.g., 3 or fewer) response alternatives, CROSSTABS is appropriate. CROSSTABS produces a two-way frequency distribution. You will be doing several analyses.  For each, you will describe:


a.         Your research question.  This includes the variables you are studying and how you think they may be related.

b.         The obtained result of the analysis:  What is the Chi-square value and the significance level?  Is it statistically significant?  What is the phi coefficient value?  Are there any significant standardized residuals?

c.         A verbal description of the results.  In “English,” describe what you found.


Questions to ask yourself when doing a Chi-square analysis (especially useful for part b):

1.      Is the relationship statistically significant?  Look at the Chi-square statistic and its corresponding p-value.

2.      How strong is the relationship?  Look at the phi coefficient.

3.      Where is the relationship?  Look at the standardized residuals.


In the real world, the answers to questions 2 and 3 are important ONLY if you answer Yes to question 1.  However, for homework, you must report the answers to all three questions.  It’s good for you!!



a.         Have SPSS open your survey.sav file.

b.         Choose: Analyze, Descriptive Statistics, Crosstabs

c.         Specify a variable to be the row and then specify a variable to be the column. SPSS will do many crosstabs if you specify many variables. For our purposes, just do two at a time. The statistical procedure doesn't care which variable you specify for the column versus row. Rules of thumb:  the variable with fewer categories should be the column; the variable for which you want to make comparisons (e.g., male vs. female) should be the column. Remember, it doesn't matter for statistical purposes.

d.         Choose the Statistics that will tell you whether the variables are significantly related. Choose Chi-square for the chi-square statistic and probability value. Choose Phi to tell you how strongly the variables are related (this is a type of correlation coefficient that ranges from 0.00 to 1.00).

e.         Choose Cells to specify the output in the cells. For Count, by default you get the observed frequencies. You should also specify expected frequencies and standardized residuals.  If you like, you can have SPSS print out percentages:  specify Row, Column and/or Total. The least useful is Total. Take good notes in class, play with the numbers, and figure out how to interpret the percentages. This is very important.  Be sure to check the Standardized Residual box!!

f.          OK will execute the Crosstabs.
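
To see where the numbers in the Crosstabs output come from, here is a minimal Python sketch (not SPSS, and not part of the assignment) that computes expected frequencies, standardized residuals, the Pearson Chi-square and phi by hand. The function name crosstab_stats and the counts in the table are made up for illustration; they are not from survey.sav. The sketch reports only the size of phi, and it does not compute the probability value.

```python
import math

def crosstab_stats(observed):
    """Return expected counts, standardized residuals, chi-square and phi
    for a table of observed counts given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count if there is no relationship: row total * column total / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Standardized residual: (observed - expected) / sqrt(expected)
    residuals = [[(o - e) / math.sqrt(e) for o, e in zip(orow, erow)]
                 for orow, erow in zip(observed, expected)]

    # Pearson chi-square is the sum of the squared standardized residuals
    chi_square = sum(z * z for row in residuals for z in row)

    # Size of phi: sqrt(chi-square / N)
    phi = math.sqrt(chi_square / n)
    return expected, residuals, chi_square, phi

# Hypothetical counts: rows = privately emails (yes, no), columns = (lurker, poster)
table = [[30, 10],
         [10, 30]]
expected, residuals, chi_square, phi = crosstab_stats(table)
print(f"chi-square = {chi_square:.2f}, phi = {phi:.2f}")        # chi-square = 20.00, phi = 0.50
print(f"residual for lurkers who email = {residuals[0][0]:.2f}")  # 2.24
```

With these made-up counts every expected frequency is 20, so each cell is 10 away from what "no relationship" would predict.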


Example Analysis:

a.         Research Question

            I want to examine whether lurkers are more likely to use private email to communicate with others.  Do lurkers privately email more than posters?


b.         Analysis

            The Chi-square analysis shows that there is a significant relationship between lurking and privately emailing others in the group, χ²(1) = 21.27, p < .05. The phi correlation coefficient was .71. The strength of this relationship is strong. The standardized residual for the lurkers who privately email is 2.5; therefore, it is significant.


OR  (say this if there is NOT a significant relationship.)

            The relationship is not significant, χ²(1) = 1.47, p = .23.  The phi correlation coefficient was .19.  The strength of this relationship is weak. No standardized residuals were greater than 2.0.

Note that I’ve used p < .05 and p = .23.  You can use either system.  You can give the exact probability to no more than three decimals (p = .023), but two decimal places (p = .02) is also acceptable.

NOTE:  The (1) is the degrees of freedom for the test, (Rows - 1) * (Columns - 1).  For a table with 2 rows and 2 columns, that is (2 - 1) * (2 - 1) = 1.
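
As a check on the two example results above, here is a short Python sketch (again, not SPSS) that computes the degrees of freedom for a table and turns a Chi-square value into a probability value. The function names are made up for illustration, and the probability shortcut used here is valid only when there is 1 degree of freedom, as in a table with 2 rows and 2 columns.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    # (Rows - 1) * (Columns - 1)
    return (n_rows - 1) * (n_cols - 1)

def p_value_df1(chi_square):
    """Probability of a chi-square value this large or larger.
    Valid ONLY for 1 degree of freedom: p = erfc(sqrt(chi_square / 2))."""
    return math.erfc(math.sqrt(chi_square / 2))

df = degrees_of_freedom(2, 2)
p_not_significant = p_value_df1(1.47)   # the non-significant example
p_significant = p_value_df1(21.27)      # the significant example

print(df, round(p_not_significant, 2), p_significant < 0.05)  # 1 0.23 True
```

The first probability rounds to .23, matching the non-significant example, and the second is far below .05, matching the significant one.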


c.         Interpretation

            More lurkers used private email than would be expected if there were no relationship between the variables.  Therefore, lurkers are more likely to use private email to communicate with others in their virtual community than non-lurkers (i.e., posters).


OR  (say this if there is NOT a significant relationship.) 

            Lurkers and posters used private email to communicate with others in their group at about the same rate.  Therefore, there was not a difference between lurkers and non-lurkers (i.e., posters) in using private email. 



Use the CROSSTABS command to analyze at least three research questions concerning the relationship among qualitative or limited range "quantitative" variables.  THIS IS VERY IMPORTANT!  IF YOU DO A CROSSTAB ON A CONTINUOUS OR LARGE RANGE QUANTITATIVE VARIABLE, THAT IS WRONG!!!  Describe (using a word processor) each analysis using the format of the example analysis described above (i.e., a research question, analysis and interpretation).  Attach the appropriate printout to the back of the questions.  Feel free to do more analyses, try out the way that CROSSTABS works, spend some time figuring out the difference between row, column, and total percentages.  Before you start, make sure you understand (really “get”) what variables we have in our survey and which ones you can use.


Note:  Figuring out the source of the percentages in the tables is very important.  A standardized residual greater than 2.00 in absolute value (i.e., above 2.00 or below -2.00) generally indicates a significant difference in that particular cell.
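
To make the source of the percentages concrete, here is a small Python sketch (not SPSS) showing what each percentage divides by. The function name percentages and the counts are made up for illustration; they are not from our survey.

```python
def percentages(observed):
    """Return row, column and total percentages for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Row %: each count divided by the total of its row
    row_pct = [[100 * o / rt for o in row]
               for row, rt in zip(observed, row_totals)]
    # Column %: each count divided by the total of its column
    col_pct = [[100 * o / ct for o, ct in zip(row, col_totals)]
               for row in observed]
    # Total %: each count divided by the grand total
    total_pct = [[100 * o / n for o in row] for row in observed]
    return row_pct, col_pct, total_pct

# Hypothetical counts: rows = privately emails (yes, no), columns = (lurker, poster)
table = [[30, 20],
         [10, 40]]
row_pct, col_pct, total_pct = percentages(table)
print(row_pct[0][0], col_pct[0][0], total_pct[0][0])  # 60.0 75.0 30.0
```

For the same cell (lurkers who privately email), the row percentage says 60% of the people who email are lurkers, the column percentage says 75% of lurkers email, and the total percentage says 30% of everyone is a lurker who emails. Comparing column percentages across columns (75% of lurkers vs. about 33% of posters) is what lets you compare the groups.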


Print your output. You can print directly from SPSS by choosing File, then Print. However, you can also save your output as a .lst file. When SPSS asks if you want to save your output, indicate yes and then provide a file name on your diskette. You can then open that file in a word processing program, edit as much as you like, and print from there.


Exit from SPSS:  Unless you have changed your .sav file with additional recodes, computes, etc., you do not need to save the data window. Saving the output window is covered above.


Separate all printouts and attach to your analyses.


If you have problems interpreting crosstab output, send an e-mail question.


Note: You should get into the practice of using superscript to type χ².  You can also have your word processing program insert a special graphic symbol for the Greek letter chi (χ).