Geography 300


	Under Construction
		09/28/2006
Geography 300: *The Geographer's Craft*	*Skill Module: Correlation Coefficients*

Background: All geographers should have a working knowledge of some basic statistical measures, and some geographers should be fluent in statistics. Correlation is one of the basic tools in the geographer's tool kit. Below is a brief introduction to, or refresher on, correlation. A number of good statistics texts are available for purchase and make excellent additions to a geographer's reference library.

Student Learning Objectives:

The student will describe, explain, and correctly calculate one or more correlation coefficients.
The student will explain the difference between correlation and causality.

Correlation is a statistical property that exists between two sets of related data. Several correlation coefficients can be calculated to measure the strength and nature (direction) of the relationship.

If you used a correlation coefficient to measure the relationship between ice cream cone sales and temperature over the course of a year, you would likely find a strong positive relationship between the two variables. This would make sense, and you could make a reasonable argument that rising temperatures are a causal factor in the sale of ice cream.

But be careful, because "correlation does not prove causation". The coefficient says nothing about the direction of cause, so you could just as easily, and erroneously, conclude that ice cream cones cause the temperature to rise. You might also take a second measurement and find a strong positive relationship between the sale of ice cream and the sale of bathing suits. This last instance is a good example of a spurious relationship, one in which there is a correlation but no causal relationship. Frequently the causal variable is hidden. In this example, rising temperature is very likely the causal factor behind the rise in both bathing suit sales and ice cream sales.

There are many correlation coefficients, but geographers tend to rely most heavily on two: Pearson's Product Moment (Pearson's r) and Spearman's Rank (Spearman's r, also written as rho). Pearson's is used when data is normally distributed (like a bell curve), and Spearman's, which works on the ranks of the values rather than the values themselves, is used when the data is non-normal, or skewed. Many of the data sets geographers work with are skewed; many others are normal. You'll have to either assume your data is normal or non-normal, or run several tests on it to be sure.
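To make the difference between the two coefficients concrete, here is a minimal Python sketch that calculates both from scratch. The function names (`pearson_r`, `ranks`, `spearman_r`) and the sample numbers are illustrative assumptions, not part of any particular statistics package. Spearman's coefficient is computed here as Pearson's r applied to the ranks of the data, with tied values sharing their average rank.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r: summed cross-products of deviations from the means,
    divided by the square root of the product of the summed squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def ranks(values):
    """Rank values from 1 (smallest) upward; tied values get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_r(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical skewed data: y doubles each time x goes up by one.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 8, 16, 32, 64]

print("Pearson's r: ", round(pearson_r(x, y), 3))
print("Spearman's r:", round(spearman_r(x, y), 3))
```

With these made-up numbers, Y always rises when X rises, so the ranks agree perfectly and Spearman's coefficient is 1.0. Pearson's r comes out lower (near 0.91) because the relationship is not a straight line.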

How to read a correlation coefficient:

The number derived from a correlation statistic ranges from -1.0 to 1.0. A value of -1.0 indicates a perfectly inverse, or negative, relationship. A negative or inverse relationship occurs when one variable goes down as the other goes up.

A perfectly positive correlation coefficient equals 1.0. This happens when, in every instance, a change in one variable is matched by a proportional change in the same direction in the second variable. In other words, as X goes up, Y goes up in step with it every time.

When there is no relationship between variables X and Y, you get a correlation coefficient close to 0.0. That means that as one variable goes up, the other may go up some of the time, go down some of the time, or not change at all.

Deciding whether a correlation is 'strong' or 'weak' is a matter of judgment and depends on the field. In many social science research projects, a correlation of .90 would be considered extremely strong. In some of the physical sciences, that might be considered weak, or unacceptable.

One of the best ways to understand correlation is to plot the data points on a scatterplot diagram.
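In practice a spreadsheet or plotting package would normally draw the scatterplot. As a rough sketch that needs nothing beyond plain Python, the function below draws one out of text characters. The function name `text_scatter`, the grid size, and the temperature and sales figures are all hypothetical choices made for illustration.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a character-grid scatterplot of the (x, y) pairs as one string.

    Assumes xs and ys each contain at least two different values,
    otherwise the scaling below would divide by zero.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column or row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width

# Hypothetical data: temperature (deg C) and ice cream cones sold.
temperature = [8, 13, 20, 25, 29]
cones_sold = [110, 190, 300, 330, 400]

plot = text_scatter(temperature, cones_sold)
print(plot)
```

A positive relationship shows up as points climbing from the lower left to the upper right. A negative relationship would run from the upper left to the lower right, and no relationship would look like a shapeless cloud.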

If you have questions or comments, please contact me at steve.graves@csun.edu