Helpdesk IBM SPSS Statistics 20

For students from Arnhem Business School

Chi-square Test for a Crosstabulation

With a chi-square test for a crosstabulation you test whether there is a relationship between two variables or, put differently, whether groups differ with respect to a measured variable. On this page we show you how to perform the chi-square test and how to visualize the result.

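The presentation takes the following steps: formulating the hypotheses, asking for the test using SPSS, reading the SPSS output, checking the chi-square conditions, interpreting the output, and a post-hoc analysis.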

Note: In the section on testing you will find some explanation of the method and the way of thinking behind the testing of hypotheses.


In a number of districts in both Arnhem and Nijmegen, research was carried out on the internet use of the residents (data from Spring 2004).
One of the research subquestions was whether the districts differ in this internet use.

H0: There are no differences between the districts regarding the internet use.
HA: There are differences between the districts regarding the internet use.


Asking for the test using SPSS

We make a crosstabulation of the two variables involved. Next we click the "Statistics..." button and tick the box for Chi-square.
See the screenshot below.


[Screenshot: Crosstabs window plus the Chi-square option]


The SPSS output

[Screenshot: output of the chi-square test]
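
To make clear what the "Pearson Chi-Square" value in such output measures, here is a minimal Python sketch of the textbook computation. It is an illustration, not a description of what SPSS does internally; the function names and the counts in `observed` are made up for this sketch and are not the Arnhem/Nijmegen data.

```python
def expected_counts(observed):
    """Expected count per cell under H0: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a crosstabulation."""
    expected = expected_counts(observed)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical counts: 4 districts (rows) x 2 answers (columns:
# uses the internet, does not use the internet).
observed = [
    [60, 15],
    [45, 30],
    [50, 25],
    [40, 35],
]
statistic, df = pearson_chi_square(observed)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

The larger the gap between observed and expected counts, the larger the statistic, and the stronger the evidence against H0.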


Checking the chi-square conditions

In order for a chi-square test to give reliable results the crosstabulation has to satisfy certain conditions. They are:

  1. At most 20% of the expected counts are allowed to be less than 5.
  2. The minimum expected count is at least equal to 1.

These conditions are reported by SPSS as a footnote in the chi-square output. In our example we find:

  1. 0% of the cells have expected counts less than 5; 0% is clearly less than 20%.
  2. The minimum expected count is 16.64. This is clearly more than 1.

Hence both conditions are satisfied and a chi-square test is valid for our crosstabulation.
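
The same check can be written out in a few lines of Python. This is only a sketch of the two rules stated above; the function name and the expected counts in the example are hypothetical and do not come from the SPSS output on this page.

```python
def check_chi_square_conditions(expected):
    """Check the two conditions on the expected counts of a crosstabulation."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(1 for e in cells if e < 5) / len(cells)
    minimum = min(cells)
    return {
        "percent_below_5": 100 * share_below_5,
        "minimum_expected": minimum,
        # Condition 1: at most 20% of the cells below 5.
        # Condition 2: minimum expected count at least 1.
        "conditions_met": share_below_5 <= 0.20 and minimum >= 1,
    }


# Hypothetical expected counts for a 3 x 2 crosstabulation.
expected = [[4.2, 10.8], [12.6, 32.4], [11.2, 28.8]]
result = check_chi_square_conditions(expected)
print(result)
```

In this made-up table one of the six cells is below 5, which is still within the 20% limit.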

Note: If the conditions are not satisfied one might adjust the crosstabulation by either combining some of the rows or columns in a logical way or by deleting some rows or columns that cause this problem (as long as this can be justified).
If the conditions are met after the changes we can redo the chi-square test. If not, we have to collect more data or abandon this test.


Interpreting the output

The "Pearson Chi-Square" row has a significance (p-value) of 0.021.

This is below the conventional significance level of 0.05, hence we can reject the null hypothesis.
In plain English: There is clear evidence that there are differences between the four districts regarding the use of internet.
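
For readers who want to see where such a significance comes from, the sketch below computes the upper-tail probability of the chi-square distribution from a statistic and its degrees of freedom, and then applies the decision rule. The function names are made up for this sketch, the closed-form series used is a standard textbook result for integer degrees of freedom (not necessarily how SPSS computes it), and the significance level of 0.05 is an assumed convention.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if df < 1 or statistic < 0:
        raise ValueError("need df >= 1 and statistic >= 0")
    half = statistic / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    chi = math.sqrt(statistic)
    tail = math.erfc(chi / math.sqrt(2))
    density = math.exp(-half) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= statistic / (2 * r - 1)
        total += term
    return tail + 2 * density * total


def reject_null(p_value, alpha=0.05):
    """Reject H0 when the p-value is below the chosen significance level."""
    return p_value < alpha


reported_p = 0.021  # significance reported in the SPSS output on this page
alpha = 0.05        # assumed conventional significance level
print(reject_null(reported_p, alpha))
```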


Post-hoc analysis

We have established that there are differences between the districts. The logical next question is: What exactly are these differences?

Answer 1: Look at the standardized residuals.

You can ask for them in the Crosstabs window by clicking on the "Cells" button.

Standardized residuals have approximately a standard normal distribution. So any value above +2 or below -2 is substantial, while values above +3 or below -3 indicate a really big difference between what is observed and what would be expected if there were no differences between the districts.
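
As an illustration of this rule of thumb, the following Python sketch computes standardized residuals as (observed - expected) / sqrt(expected) and lists the cells beyond the threshold. The function names and the counts are hypothetical; they are not the data of our example.

```python
import math


def standardized_residuals(observed):
    """(observed - expected) / sqrt(expected) for every cell of a crosstabulation."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for row, r_total in zip(observed, row_totals):
        res_row = []
        for o, c_total in zip(row, col_totals):
            e = r_total * c_total / n  # expected count under H0
            res_row.append((o - e) / math.sqrt(e))
        residuals.append(res_row)
    return residuals


def flag_large_residuals(residuals, threshold=2.0):
    """Return (row index, column index, residual) for cells beyond +/- threshold."""
    return [
        (i, j, value)
        for i, row in enumerate(residuals)
        for j, value in enumerate(row)
        if abs(value) > threshold
    ]


# Hypothetical counts: 4 districts (rows) x 2 answers (columns).
observed = [
    [70, 5],
    [45, 30],
    [50, 25],
    [40, 35],
]
for i, j, value in flag_large_residuals(standardized_residuals(observed)):
    print(f"row {i}, column {j}: standardized residual {value:.2f}")
```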

In our example we find the table shown below. It clearly shows that the district Altrade is different from the others in that there are far fewer people there not using the internet than in the other districts.

[Screenshot: post-hoc table with standardized residuals]

Answer 2: Use a stacked bar chart to visualize the differences.

We obtain a stacked bar chart through the Chart Builder. We put the districts on the X-axis and use V01 to determine the stacks.
Instead of counts we ask for percentages, where every bar adds up to 100%.
After we have created the chart we do some further editing.
You can see our final result below. It clearly shows that Altrade differs from the other three districts.

[Figure: stacked bar chart for the post-hoc analysis]
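
The percentages behind such a chart, where every bar adds up to 100%, can be sketched in Python as follows. The function name and the counts are made up for illustration; in SPSS the Chart Builder does this calculation for you.

```python
def row_percentages(observed):
    """Convert each row of counts into percentages that add up to 100."""
    percentages = []
    for row in observed:
        total = sum(row)
        if total == 0:
            raise ValueError("a row without observations has no percentages")
        percentages.append([100 * count / total for count in row])
    return percentages


# Hypothetical counts: 4 districts (rows) x 2 answers (columns).
observed = [
    [70, 5],
    [45, 30],
    [50, 25],
    [40, 35],
]
for district, shares in enumerate(row_percentages(observed), start=1):
    print(f"district {district}: " + ", ".join(f"{s:.1f}%" for s in shares))
```

Because each bar is scaled to 100%, districts with different numbers of respondents can be compared directly.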


Last modified 30-10-2012

Jos Seegers, 2009; English version by Gé Groenewegen.