Section 7.2: When things go wrong
From Research Methods in Psychology
What are you studying?
In this section we will look at the errors that are made in significance testing. First we will explain the different types of error that can be made - to make life simpler, we will use an analogy, of going to the shop to fetch milk. We will then look at how likely you are to make each kind of error, and what affects the probability of making each kind of error.
Statistical Significance and Errors
Crucial Concept: A Type I error is a false rejection of the null hypothesis - a false positive.
Crucial Concept: A Type II error is a failure to reject a null hypothesis that is false - a false negative.
Just because a result is statistically significant does not mean that it is true. In addition, just because a result is not statistically significant does not mean that it is false. It is possible that an error was made. There are costs associated with each of those types of error. Let’s use an example to try to make it clearer. You want some milk, but only one shop is open, and it may have run out of milk. You are trying to decide whether you should bother going to the shop to get some milk, but you don’t know whether the shop will have any.
You have a theory that there will be milk in the shop. If you go to the shop and find that there is no milk there, then your theory was wrong. This is a little like rejecting your null hypothesis when the null hypothesis was actually correct.
If you decide that your theory is wrong, and so you don’t go to the shop, but there was milk there, then your theory was correct and you wrongly thought it was incorrect. This is a little like failing to reject your null hypothesis when the null hypothesis was false and should have been rejected.
How sure do you have to be that there will be milk in the shop before you go? The value that determines how sure you have to be is alpha. In psychology, we conventionally use 0.05 as the value for alpha, the cut-off below which we say that the null hypothesis is rejected. This means that we will not say that we have a result unless the probability of getting a result at least as extreme as ours, if the null hypothesis were true, is less than 0.05 (i.e. 5% or 1 in 20). In the analogy, you will not go to the shop unless you are fairly sure that there is milk there - you will not go unless you are 95% sure that you are going to find milk when you get there.
Although this seems like quite a stringent level of alpha - we have to be very sure before we go to the shop - it makes sense in psychology (and science in general). If you only go to the shop when there is a 0.05 (or less) probability that there is not going to be milk there, you will get milk at least 95% of the time.
If psychologists stood up and claimed to have made discoveries but these so-called discoveries were not soundly based, psychology textbooks would be full of meaningless results. In the same way as you could go to the shop and regularly return with no milk, psychologists would look for some phenomenon and regularly find it when there was no such thing occurring. The outcome of a significance test, like the outcome of your decision to go to the shop or not, is a binary decision. (That is a clever way of saying that you can say yes, I believe that there is milk, or no, I believe that there is no milk.) The state of the shop is also one of two things: there is milk, or there is no milk. Thus there are four possible combinations of events, as shown in Table 1.
| Decision Made | State of the Shop | Result |
| Went to shop | Milk | Correct Acceptance |
| Stayed home | No milk | Correct Rejection |
| Went to shop | No milk | Type I error |
| Stayed home | Milk | Type II error |
In the first scenario, we have gone to the shop, and there was milk there. This was OK.
In the second scenario, we stayed at home, and there was no milk in the shop. This was OK too.
In the third scenario, we went to the shop, but there was no milk when we got there. This was a Type I error.
In the final scenario, we stayed at home, thinking there was no milk in the shop, when there was milk. This was also an error. We could have gone to the shop and got milk, but instead we have had to drink our coffee black, go without cornflakes and generally have a bad time. This is called a Type II error.
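The four scenarios can also be written down as a small lookup. The following is only an illustrative sketch in Python; the function name `classify_outcome` and its two arguments are made up for this example and do not come from any statistics package.

```python
def classify_outcome(went_to_shop: bool, milk_in_shop: bool) -> str:
    """Map a decision and the true state of the shop to one of four outcomes.

    Hypothetical helper for the milk analogy: going to the shop stands for
    rejecting the null hypothesis, staying home for not rejecting it.
    """
    if went_to_shop and milk_in_shop:
        return "Correct acceptance"
    if not went_to_shop and not milk_in_shop:
        return "Correct rejection"
    if went_to_shop and not milk_in_shop:
        return "Type I error"   # optimistic: a false positive
    return "Type II error"      # pessimistic: a false negative


# Print all four combinations of decision and state of the shop
for went in (True, False):
    for milk in (True, False):
        decision = "Went to shop" if went else "Stayed home"
        state = "Milk" if milk else "No milk"
        print(f"{decision:12} | {state:7} | {classify_outcome(went, milk)}")
```

Each decision can be right or wrong in exactly one way, which is why there are two kinds of correct outcome and two kinds of error.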
Crucial Tip: To remember which is a Type I error and which is a Type II error, use the following memory aid: If you made a Type I error (thinking there was milk, but there wasn’t), you were optimistic. If you made a Type II error (thinking there was no milk, when there was), you were pessimistic. Optimistic: Type I error. Pessimistic: Type II error. The order of errors is the same order as O and P in the alphabet.
Error Rates
We can find out how likely it is that we will make a Type I error or a Type II error. It is very easy to determine the probability of making a Type I error, but much more difficult to determine the probability of making a Type II error.
(Remember that a Type I error is an optimistic error, or a false positive, and a type II error is a pessimistic error, or a false negative.)
Type I Errors
The probability of making a Type I error (that is, rejecting your null hypothesis when the correct result would have been to accept it - the equivalent of going to the shop and finding no milk) is equal to the value for alpha. If alpha is 0.05 and the null hypothesis is true, then there is a probability of 0.05 (or 5%, or 1 in 20) that you will make a Type I error, and falsely accept the experimental hypothesis. (This assumes that the assumptions of the test were not violated.)
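One way to see that the Type I error rate matches alpha is to simulate many studies in which the null hypothesis really is true and count how often it is rejected. The sketch below is an illustration under stated assumptions, not a recommended analysis: it assumes a two-sided one-sample z-test with a known standard deviation of 1, and the function names, sample size, number of trials and seed are arbitrary choices made for this example.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, null_mean=0.0, sigma=1.0):
    """Two-sided p-value for a one-sample z-test with known sigma (assumed)."""
    n = len(sample)
    z = (mean(sample) - null_mean) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.05, n=30, trials=20000, seed=1):
    """Fraction of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # The population mean really is 0, so every rejection is a Type I error
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            false_positives += 1
    return false_positives / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With alpha set to 0.05, the estimated rate should come out close to 0.05; changing `alpha` changes the rate accordingly.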
Type II Errors
If your null hypothesis is false, you can do one of two things: you can reject it, and be correct, or you can fail to reject it, and make a Type II error. The probability of correctly rejecting your null hypothesis is called power. The probability of falsely failing to reject your null hypothesis, i.e. making a Type II error, is called beta (β). Because one of those two things must happen, we know that:
power + beta = 1
and therefore it follows that
beta = 1 - power
So to calculate the Type II error rate, we need to first calculate the power and then subtract the value we get for the power from 1.
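The relationship between power and beta can be illustrated by simulating studies in which there really is an effect. This is a minimal sketch under assumptions chosen for the example: a two-sided one-sample z-test with known standard deviation 1, a true mean of 0.5, and 30 participants per study. The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean


def rejects_null(sample, alpha=0.05, null_mean=0.0, sigma=1.0):
    """Two-sided one-sample z-test: True if the null hypothesis is rejected."""
    z = (mean(sample) - null_mean) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def estimate_power(true_mean=0.5, n=30, trials=10000, seed=2):
    """Fraction of simulated studies that detect an effect that really exists."""
    rng = random.Random(seed)
    hits = sum(
        rejects_null([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    return hits / trials


power = estimate_power()
beta = 1 - power  # Type II error rate: the studies that missed the effect
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Every simulated study either rejects the false null hypothesis or fails to, so the two proportions always add up to 1.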
Testing a hypothesis can be thought of as being like looking for something - your probability of finding what you are looking for depends upon how hard you look, and what you are looking for - the bigger the object, the easier it will be to find.
So to increase power, you need to look harder for evidence to reject the null hypothesis. There are two ways of looking harder for this evidence:
The first way is to use a higher value for alpha. You pay the price of an increase in the type I error rate, and get the reward of more power, and therefore a lower type II error rate.
The second way is to use a larger sample. By decreasing the uncertainty about what we have, we increase our ability to say we have found something. If we look through more things, we have a greater chance of finding something.
Your probability of finding something also depends upon how large it is. In psychological research, this means the size of the effect we are looking for – how much effect does our independent variable have? For example, the number of hours you spend studying for an exam has a large effect on how well you do on the exam, and therefore this effect should be easy to find. If the colour of the walls in the exam room has any effect on how well you do in the exam, this effect will be very small, and therefore very difficult to find. The problem is that, in psychology, we don’t know what we are looking for before we have found it, and so we don’t know how big it is going to be.
If we know the size of the thing we are looking for, the size of the sample we will use and the value for alpha, the power can be calculated. Power is calculated by looking in a table in a book, for example Cohen (1988) or Kraemer and Thiemann (1987) or by using a specialised computer program.
| Factor | What happens to power as it increases? | What happens to type II error rate as it increases? |
| Sample size | Increases | Decreases |
| Effect size | Increases | Decreases |
| Alpha | Increases | Decreases |
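The pattern in the table can be checked numerically. The sketch below is a simplified illustration, not the method used in the tables of Cohen (1988) or Kraemer and Thiemann (1987): it assumes a two-sided one-sample z-test with a known standard deviation, with the effect size expressed in standard deviation units. The function name `z_test_power` and the example values are assumptions made for this illustration.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test with known standard deviation.

    effect_size is the true mean difference in standard deviation units
    (an assumption of this sketch), n is the sample size.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5
    # Probability that the test statistic falls in either rejection region
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Change one factor at a time and compare against the same baseline
baseline = z_test_power(effect_size=0.3, n=30, alpha=0.05)
scenarios = {
    "baseline": baseline,
    "larger sample (n=60)": z_test_power(effect_size=0.3, n=60, alpha=0.05),
    "larger effect (0.6)": z_test_power(effect_size=0.6, n=30, alpha=0.05),
    "higher alpha (0.10)": z_test_power(effect_size=0.3, n=30, alpha=0.10),
}
for label, power in scenarios.items():
    print(f"{label:22} power = {power:.3f}, beta = {1 - power:.3f}")
```

In each of the three altered scenarios the power is higher than the baseline and beta is correspondingly lower, which is the pattern the table describes.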
