Section 8.5: Correlation
From Research Methods in Psychology
Introduction
Concept: A correlation is a measure of the linear relationship (agreement) between two variables.
So far in this chapter, we have been looking at how data from experiments are analysed. We saw in Chapter 3 that experimental studies are not always appropriate; sometimes we are interested in the relationship between two variables as they occur naturally. In that case we can look for natural variation in two or more variables and examine the relationship between them. On some occasions we might consider one of these variables to be the independent variable and the other the dependent variable: if we were looking at the relationship between age and beliefs about pre-marital sex, we would consider age to be the independent variable. On other occasions we do not think of the relationship in these terms. There is no implication of any causal relationship between the variables; we are simply interested in how they are related. For example, we might examine the relationship between two different measures of intelligence to see whether they are measuring the same, or a similar, thing.
To examine the relationship between two variables visually, we can draw a scatterplot. The figure below is a scatterplot showing how the relationship between two variables can be plotted, with each point representing one individual who was measured.
A correlation tells us two things about the relationship between two variables: its degree and its direction.
Degree of the Relationship
A correlation is represented by the correlation coefficient statistic ‘r’, which can vary from +1.00 to -1.00. A correlation coefficient of +1.00 or -1.00 indicates a perfect linear relationship. A coefficient near +1.00 indicates that as A, the first variable, goes up, so does B, the second variable. A coefficient near -1.00 indicates that as A goes up, B goes down.
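To make the range of r concrete, here is a minimal sketch of how the coefficient can be computed from raw scores. The function name `pearson_r` and the small data sets are made up for illustration; in practice a statistics package would normally do this for you.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (invented) scores on variable A and two versions of variable B.
a = [1, 2, 3, 4, 5]
b_up = [2 * x + 1 for x in a]     # B rises in a straight line as A rises
b_down = [10 - 3 * x for x in a]  # B falls in a straight line as A rises

print(round(pearson_r(a, b_up), 2))    # perfect positive relationship
print(round(pearson_r(a, b_down), 2))  # perfect negative relationship
```

With these data the first coefficient is +1.00 and the second is -1.00, because in both cases every point lies exactly on a straight line.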
The graph below shows a scatterplot that represents a perfect linear relationship between the variables. The correlation in this case is r = 1.00, and all of the points lie on a perfectly straight line, from the bottom left-hand corner to the top right-hand corner.
r = 0.90. A strong, positive relationship. The points all lie close to the diagonal line of a perfect relationship.
A moderate relationship. The points lie near the diagonal, but are more spread out.
r = 0.00. No linear relationship. The points are scattered randomly.
Direction of the Relationship
The direction of a correlation can be either positive or negative.
- Positive correlation occurs when higher scores on one variable are accompanied by higher scores on the other.
- Negative correlation occurs when an increase in one variable is accompanied by a decrease in the other.
A + (positive) or - (negative) sign is used to indicate the direction of the relationship.
It is also worth noting that a correlation of zero, or close to zero, need not imply that there is no relationship between the two variables under examination. It is still possible that there is some other, non-linear, relationship. For example, an inverted-U shape might exist: if you were to investigate the relationship between adrenaline and athletic performance, then low levels of adrenaline would likely be associated with poor performance, moderate levels with better, near-optimal performance, and higher levels still with a return to poor performance. This is one reason always to create scatterplots like those shown above: they can reveal non-linear relationships that the correlation coefficient misses.
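The inverted-U case can be illustrated with a small sketch. The adrenaline and performance scores below are invented purely for the demonstration (a symmetric curve peaking in the middle), and `pearson_r` is a hypothetical helper, not a function from any particular package.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented data: performance rises with adrenaline up to level 5, then falls.
adrenaline = list(range(11))                            # levels 0 to 10
performance = [25 - (x - 5) ** 2 for x in adrenaline]   # inverted U, peak at 5

r_all = pearson_r(adrenaline, performance)
# Looking only at the rising half of the curve (levels 0 to 5).
r_rising = pearson_r(adrenaline[:6], performance[:6])

print(round(abs(r_all), 2))   # no linear relationship across the full range
print(round(r_rising, 2))     # strong positive relationship on the rising half
```

Across the full range the rising and falling halves cancel out and r comes out at zero, even though performance is completely determined by adrenaline in these made-up data. On the rising half alone the correlation is strongly positive, which is exactly the kind of pattern a scatterplot would reveal.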
Measures of Correlation
We will look at the two most common calculations used for correlations: Pearson’s product moment correlation and Spearman’s Rho, which are often shortened to Pearson and Spearman correlations.
Pearson’s Correlation
Concept: A Pearson Product Moment Correlation (usually shortened to Pearson correlation) is a parametric measure of agreement between two continuous measures.
The Pearson correlation is the most common form of correlation. This is a parametric test, and so makes the standard parametric assumptions:
- Normal distribution
- Interval data
When you report a Pearson correlation you should report the value of r, either df or N, and the probability value.
Spearman’s Correlation
Concept: A Spearman’s Rho Correlation (usually shortened to Spearman correlation) is a non-parametric measure of agreement between two ordinal measures.
This is a non-parametric equivalent to the Pearson correlation, and is suitable for use with ordinal data or data which are not normally distributed. When reporting a Spearman correlation, you should report the value of the coefficient (rho, often written r with a subscript s), the N and the associated probability value.
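One common way to obtain Spearman’s Rho is to convert each variable to ranks and then calculate a Pearson correlation on the ranks. The sketch below follows that approach; the helper names (`average_ranks`, `spearman_rho`, `pearson_r`) and the data are assumptions made for illustration, and tied scores are given the mean of the ranks they share.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's Rho: a Pearson correlation calculated on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Invented data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(average_ranks([10, 20, 20, 30]))  # the two tied scores share rank 2.5
print(round(spearman_rho(x, y), 2))     # ranks agree perfectly
print(round(pearson_r(x, y), 2))        # raw scores are not perfectly linear
```

Because the ranks of x and y agree exactly, Spearman’s Rho is 1.00 here, while the Pearson correlation on the raw scores is somewhat lower because the relationship is curved.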
Description and Correlations
A correlation is a little curious because it functions in two different ways: it is both a descriptive statistic and an inferential statistic. It is a descriptive statistic because it tells us something about the relationship between two variables; it is a way of summarising a scatterplot. It is an inferential statistic because you can test the significance of a particular value. That is, you can ask how likely a correlation of that magnitude would be to occur in your sample if the null hypothesis (that there is no correlation in the population) were true.
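For a Pearson correlation, the significance test is usually carried out by converting r to a t statistic with N - 2 degrees of freedom. The sketch below shows that conversion only; `correlation_t` is a hypothetical helper name, the values of r and N are invented, and the probability value itself would still need to be looked up in t tables or obtained from a statistics package.

```python
import math

def correlation_t(r, n):
    """Return (t, df) for testing the null hypothesis of zero correlation.

    Uses t = r * sqrt(df / (1 - r^2)) with df = n - 2.
    """
    if n < 3:
        raise ValueError("need at least 3 pairs of scores")
    if abs(r) >= 1:
        raise ValueError("t is undefined when r is exactly +1 or -1")
    df = n - 2
    t = r * math.sqrt(df / (1 - r * r))
    return t, df

# The same (invented) correlation of 0.5 observed in two sample sizes.
t_large, df_large = correlation_t(0.5, 27)
t_small, df_small = correlation_t(0.5, 10)

print(df_large, round(t_large, 2))
print(df_small, round(t_small, 2))
```

With N = 27 the statistic is about 2.89 on 25 degrees of freedom, whereas with N = 10 it is about 1.63 on 8 degrees of freedom. The same descriptive value of r therefore provides stronger evidence against the null hypothesis in the larger sample.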
Cohen (1988) proposed widely used guidelines for assessing the size of correlations. He talked about the size of the effect of one variable on another variable, calling this the effect size. He suggested that a correlation of 0.5 is a large effect size, 0.3 is a medium effect size, and 0.1 is a small effect size.
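These guidelines can be written as a simple lookup. In the sketch below the three values are treated as lower bounds for each band and applied to the absolute value of r, which is one common reading rather than something stated above; the function name `cohen_label` and the label "below small" for correlations under 0.1 are assumptions added for illustration.

```python
def cohen_label(r):
    """Label a correlation using the 0.1 / 0.3 / 0.5 guideline values."""
    size = abs(r)  # the sign gives direction, not the size of the effect
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below small"  # assumed label; the guidelines do not name this band

for r in (0.62, -0.35, 0.12, 0.04):
    print(r, cohen_label(r))
```

Note that a correlation of -0.35 is labelled a medium effect: the direction of the relationship does not affect its size.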
Further Reading
What you should read to learn more depends upon whether the approach taken in your university is to do statistical analysis ‘by hand’ or to use a computer package. If the approach is to carry out statistics ‘by hand’, a good book to start off with is ‘Learning to use statistical tests in psychology’ by Greene and D’Oliveira (Open University Press). A more advanced text, which you should consider looking at whether you use a computer or not, is ‘Statistical Methods for Psychology’ by Howell (published by Wadsworth).
If you are using a computer package to carry out your statistical analysis, you need to consult a book that covers the package that you use. The most common program used by universities is SPSS, and so I will mention books that use this; apologies if you are using a different program. Two books which focus mainly on the program and less on the statistics are ‘SPSS for Psychologists’ by Brace, Kemp and Snelgar (published by Sage) and ‘SPSS for Windows/Macintosh Made Simple’ by Kinnear and Gray (published by Erlbaum). If you want a book that focuses on both, then see ‘Discovering statistics using SPSS’ by Andy Field (published by Sage), ‘Understanding and Using Statistics in Psychology’ by Jeremy Miles and Phil Banyard, or ‘Biostatistics: The Bare Essentials’ by Norman and Streiner (published by Decker). These books take a slightly more lighthearted approach to statistical analysis.

