Helpdesk IBM SPSS Statistics 20

For students from Arnhem Business School

Analysis: A Scale Variable

On this page we deal with the Explore command of SPSS. We use it to describe quantitative data by means of numerical measures (statistics). Graphical displays for such a variable are discussed elsewhere on this site; see the pages on boxplots and on histograms.

The theory

When studying a quantitative variable (SPSS calls it a scale variable), a frequency table is less useful. This is especially so for a continuous variable or for a quantitative discrete variable with many different values.
To describe such a variable we use statistics instead. The four main issues to consider are:

  • location
  • variability around the center
  • symmetry or skewness
  • outliers

We will use Analyze > Descriptive Statistics > Explore to find the relevant statistics.

Coding in SPSS

In this example we use data about the tallest buildings in the world. The data comes from Wikipedia and was downloaded and edited in September 2011. A more recent version of this data is probably available by now.
The following variables are in our data file. Height is recorded in meters. Year is the year the building was completed.

[Screenshot: the variables in the data file]

The options of Explore

Choose from the menu: Analyze > Descriptive Statistics > Explore

First, fill in the dialog box as shown. We choose "Height" as the dependent variable.

We leave the field for "Factor List" empty. If it is appropriate you can put a qualitative variable here to split the data into groups. When you do so the output statistics will be calculated for each subgroup separately.
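To make the effect of a factor concrete, here is a minimal Python sketch of what "statistics for each subgroup separately" amounts to. It is not SPSS output: the records and the group labels "A" and "B" are invented for illustration, and the helper name describe_by_group is our own.

```python
import statistics
from collections import defaultdict

# Hypothetical (factor value, height in meters) records, invented for illustration.
records = [("A", 240), ("A", 260), ("A", 300),
           ("B", 250), ("B", 420), ("B", 830)]

def describe_by_group(rows):
    """Split the heights by factor value and summarize each subgroup."""
    groups = defaultdict(list)
    for factor, height in rows:
        groups[factor].append(height)
    return {
        factor: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for factor, values in groups.items()
    }

summary = describe_by_group(records)
for factor, stats in summary.items():
    print(factor, stats)
```

With an empty Factor List, as in our example, there is just one group: the whole data file.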

We have chosen to label outlier cases by their names. For this we use the variable "Building".

As you can see you can choose between the display of only statistics, only plots or both.

Regarding plots: A boxplot will always be provided. You can ask in addition for a stem-and-leaf display and/or a histogram.

[Screenshot: Explore statistics dialog box]

The first result: Statistics

  • location: We use the mean and the median. Outliers can have a substantial impact on the value of the mean. In that case the 5% trimmed mean comes in handy: it leaves out the highest 5% and the lowest 5% of the scores and calculates the mean of the remaining data.
      
  • variability around the center: We use the standard deviation and the interquartile range. The range (max - min) is less useful, since outliers have a huge impact on it.
      
  • symmetry or skewness: A graphical display will show this best. But a clear difference between the mean and the median is an indication of skewness.
    See also the percentiles below.
    If you are familiar with them, you can use the skewness and kurtosis statistics here as well.
       
  • outliers: In the output below you see a list of the five tallest buildings and of five of the shortest buildings. According to the footnote, more buildings have a height of 240 meters than the list shows.
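The measures in the list above can be checked outside SPSS with a few lines of Python. This is a minimal sketch under stated assumptions: the heights are made-up values, not the Wikipedia data, and the helper trimmed_mean simply drops whole cases at each end, which may differ from how SPSS treats a fractional number of cases.

```python
import statistics

# Hypothetical heights in meters, made up for illustration (not the Wikipedia data).
heights = [240, 240, 245, 250, 252, 256, 260, 265, 270, 275,
           280, 290, 300, 310, 320, 350, 380, 420, 510, 830]

def trimmed_mean(values, percent=5):
    """Mean after dropping the lowest and highest `percent` % of the cases."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100      # whole cases dropped at each end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)

# Location
mean = statistics.mean(heights)
median = statistics.median(heights)
trimmed = trimmed_mean(heights)

# Variability around the center
sd = statistics.stdev(heights)             # sample standard deviation (n - 1)
q1, _, q3 = statistics.quantiles(heights, n=4)
iqr = q3 - q1
value_range = max(heights) - min(heights)

# Extreme cases: the five lowest and the five highest values
lowest = sorted(heights)[:5]
highest = sorted(heights)[-5:]

print(f"mean={mean:.2f} median={median} trimmed={trimmed:.2f}")
print(f"sd={sd:.2f} iqr={iqr} range={value_range}")
print("lowest:", lowest, "highest:", highest)
```

In this made-up sample the single very tall building pulls the mean well above the median, while the trimmed mean lies in between.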

[Output table: percentiles, tall buildings]

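To assess the symmetry or skewness of the distribution you can compare the value of median-Q1 to that of Q3-median. That is, you compare the sizes of the left and right halves of the box in the boxplot. A clearly longer right half points to a right-skewed distribution, a clearly longer left half to a left-skewed one.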
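The comparison of the two box halves can be sketched as follows. The heights are again hypothetical values, and box_halves is a helper name of our own; the quartiles come from Python's statistics module, so they need not match the SPSS output exactly.

```python
import statistics

# Hypothetical heights in meters, made up for illustration.
heights = [240, 240, 245, 250, 252, 256, 260, 265, 270, 275,
           280, 290, 300, 310, 320, 350, 380, 420, 510, 830]

def box_halves(values):
    """Return (median - Q1, Q3 - median), the two halves of the box."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return median - q1, q3 - median

left, right = box_halves(heights)
if right > left:
    shape = "right skewed"
elif left > right:
    shape = "left skewed"
else:
    shape = "roughly symmetric"

print(f"median-Q1 = {left}, Q3-median = {right}: {shape}")
```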

Note: You can see that there are several ways to calculate percentiles and quartiles and also that their outcomes may differ. Using the syntax of the SPSS Explore command you can choose six different ways to calculate percentiles and quartiles. See the page on syntax on this site if you are interested.
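That different calculation methods give different quartiles is easy to reproduce. The sketch below uses the two methods offered by Python's statistics.quantiles on hypothetical heights; we do not claim that either one corresponds to a particular SPSS option.

```python
import statistics

# Hypothetical heights in meters, made up for illustration.
heights = [240, 240, 245, 250, 252, 256, 260, 265, 270, 275,
           280, 290, 300, 310, 320, 350, 380, 420, 510, 830]

# Two interpolation rules for the same data.
exclusive = statistics.quantiles(heights, n=4, method="exclusive")
inclusive = statistics.quantiles(heights, n=4, method="inclusive")

for name, (q1, q2, q3) in (("exclusive", exclusive), ("inclusive", inclusive)):
    print(f"{name:9s} Q1={q1} median={q2} Q3={q3}")
```

Here the median is the same under both rules, but Q1 and Q3 are not. Differences of this kind are small in large data files and more noticeable in small ones.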

[Output table: extreme values, tall buildings]

The second result: Stem-and-leaf display

An annotated stem-and-leaf display

We assume that you are familiar with this display. But we have added a few notes to show you what kind of information you find in it.
You can also see from it that there are a total of ten buildings with a height of 240 meters. Hence the extreme cases in the lower tail that were given above could be extended with five extra names.

Furthermore, the display clearly shows that the distribution is right skewed.
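If you want to see how such a display is built up, here is a minimal Python sketch. It uses hypothetical heights, takes the hundreds as stem and the tens digit as leaf, and ignores the units. Unlike SPSS, it does not put extreme values on a separate line; stem_and_leaf is a helper name of our own.

```python
from collections import defaultdict

# Hypothetical heights in meters, made up for illustration.
heights = [240, 240, 245, 250, 252, 256, 260, 265, 270, 275,
           280, 290, 300, 310, 320, 350, 380, 420, 510, 830]

def stem_and_leaf(values, stem_unit=100, leaf_unit=10):
    """Return one text line per stem: frequency, stem, leaves."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem = value // stem_unit
        leaf = (value % stem_unit) // leaf_unit   # units are dropped
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = rows.get(stem, [])
        text = "".join(str(leaf) for leaf in leaves)
        lines.append(f"{len(leaves):3d}  {stem:2d} . {text}")
    return lines

lines = stem_and_leaf(heights)
print("\n".join(lines))
```

The long first row and the thin, stretched-out rows below it show the same right skew that the text describes for the real data.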

 

Last modified 30-10-2012

© Jos Seegers, 2009; English version by Gé Groenewegen.