Helpdesk IBM SPSS Statistics 20 METHODS
Testing Philosophy

In every hypothesis test we have to choose between the null hypothesis and the alternative hypothesis. But there is no level playing field between the two. One of them (H0) is assumed true until proven otherwise; the other (HA) is an interesting claim that we hope to prove.
The alternative hypothesis

The alternative is the interesting part of the story behind a test. It arises in three typical situations:
- Situation 1: testing a research hypothesis.
- Situation 2: testing the validity of a claim.
- Situation 3: testing in decision-making situations.

The null hypothesis

In statistics, the null hypothesis proposes an established model for the world. Then we look at the data. If the data is consistent with that model, we have no reason to disbelieve H0. Moreover, we know what to expect according to H0, and this is input into the calculations that are needed to find a significance for the sample data we have collected.

Test design and test statistic

Now we have to set up our research. We specify what and how we will measure.
These measurements will be summed up in a single number, the test
statistic T. The actual formula that describes T differs from test to
test. (The SPSS documentation avoids detailed discussion of the inner workings of procedures in order to promote readability; its Algorithms documents are designed as a resource for those interested in the specific calculations performed by the procedures.)
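As an illustration, here is a minimal sketch in SPSS syntax of one such test: a one-sample t-test of H0: mu = 100 against HA: mu is not 100. The variable name iq, the test value 100 and the data values are all hypothetical.

* Hypothetical data: eight made-up IQ scores.
DATA LIST FREE / iq.
BEGIN DATA
96 104 110 89 102 115 98 107
END DATA.

* One-sample t-test of H0 (mu = 100); SPSS reports the test statistic t,
* its degrees of freedom and the two-sided significance.
T-TEST
  /TESTVAL=100
  /VARIABLES=iq.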
Distribution as predicted by the null hypothesis

Given the setup and the specifications of H0, statistical theory tells us the exact distribution for T, or a useful approximation (for example, T approximately follows a normal distribution). We have one sample, which results in a single value for the test statistic T. We would like to know whether this result is "something" or "nothing at all": does it fit in nicely with what H0 predicts? If yes, then H0 stands; if no, then we have proof for HA. "Fitting in nicely" is translated into a probability about the
likelihood of occurrence given the null hypothesis. We call
this the significance of the sample data.
Using this probability we conclude whether we have found
something of interest (proof for HA) or we have to concede that there is
nothing at all going on (we stick to H0).

    sample significance = P( T = sample result or more extreme | the distribution for T as predicted by H0 )

A large significance is consistent with the null hypothesis, while a small significance is (very) unlikely given H0 and hence casts serious doubt on the correctness of H0.

In many cases the alternative hypothesis states that there is some difference between groups, or between a population (parameter) and a given number or distribution. If we stick to the null hypothesis, we have merely found insufficient evidence against it. "No evidence of a difference" is definitely not the same as "evidence of no difference". If an accused person is acquitted in a court case because there are doubts about the evidence, that by no means implies that this person has been proven innocent. He or she might be, but that was not the issue. Not enough evidence against H0 is something quite different from proving the truth of H0. In a hypothesis test you can never hope to prove H0; you can only look for proof of HA and against H0.
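To make this definition concrete, here is a small sketch in SPSS syntax that computes a two-sided significance for an observed z statistic, assuming that under H0 the test statistic T follows a standard normal distribution (the observed value 1.96 is hypothetical):

* Hypothetical observed value of a z test statistic.
DATA LIST FREE / z.
BEGIN DATA
1.96
END DATA.

* Two-sided significance: the probability, under H0, of a result
* at least as extreme as the one observed.
COMPUTE sig = 2 * (1 - CDF.NORMAL(ABS(z), 0, 1)).
EXECUTE.
LIST.

For z = 1.96 this yields sig = 0.05, exactly the conventional boundary discussed below.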
Significance levels

The explanation above tells us that the choice between "stick to H0" and "reject H0, choose HA" is based on a probability: the significance of the test result. The smaller it gets, the more convincing our evidence. But when is it small enough? Here are some guidelines:

In marketing research the default convention is to reject the null hypothesis when the significance drops below 0.05. Note, however, that significances vary on a continuous scale and that the value 0.05 is not written in stone. It was settled on when significances were hard to compute, so a few specific values had to be provided in tables. Nowadays calculating exact significances is easy (thank you, SPSS), so an investigator can report "sig. = 0.06" and leave it to the reader to decide how significant that is. Referring to outcomes with sig. < 0.05 as significant and outcomes with sig. > 0.05 as nonsignificant is problematic when the significance is close to 0.05. We are not dealing with an all-or-nothing situation, where 0.049 means everything and 0.051 means nothing. Ask yourself whether the effect is interesting enough for further research.

Conclusion

How do we proceed when H0 is rejected? A statistically significant result is not automatically a scientifically significant result.
Errors"Statistics is the only profession that demands the right to make mistakes five percent of the time." We base our conclusions on sample data. No matter how good the test design was and how well it was executed, a sample can never give you absolute centainty about the properties of the underlying population. The possibility of errors is inherent to any test. There are two types of errors we can make:
Note that we know the probability of a Type I error. It is equal to the significance. We can control this and impose a maximum (like 0.05) before we start the research. The probability whether a test will correctly decide that the null hypothesis is false is far harder to handle. It is called the power of the test. This topic is beyond this website. If you are interested start with searching on "power of a test". It gave over 75,000 hits on Google when I wrote this page. |
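For simple cases, though, a taste of a power calculation fits in a few lines of syntax. As a hypothetical sketch, the SPSS syntax below computes the power of a one-sided one-sample z-test; all numbers (H0 mean 100, true mean 105, sigma 15, n 25, alpha 0.05) are made-up assumptions.

* Made-up scenario for a one-sided one-sample z-test.
DATA LIST FREE / mu0 mu1 sigma n alpha.
BEGIN DATA
100 105 15 25 0.05
END DATA.

* Critical value under H0, and the probability of exceeding it
* when the true mean is mu1 (the power of the test).
COMPUTE zcrit = IDF.NORMAL(1 - alpha, 0, 1).
COMPUTE power = 1 - CDF.NORMAL(zcrit - (mu1 - mu0) / (sigma / SQRT(n)), 0, 1).
EXECUTE.
LIST.

With these numbers the power is about 0.51: even a 5-point difference is missed in roughly half of all samples of this size.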
Last modified: 30-10-2012
© Jos Seegers, 2009; English version by Gé Groenewegen.