What you need to know about Causation and Correlation


In life, everything has a cause and an effect, although neither is always easy to identify. Physical pain is sometimes preceded by observable injury, in which case it is clear what caused the pain. Sometimes there is pain without any observable reason. Research scientists attempt to answer questions by determining how one variable leads to or affects the outcome of another. By isolating and controlling variables, it is possible to define how they relate. It is important to keep in mind that a study does not always account for every variable, and the ones left out may influence the outcome.


In everyday conversation it is common to say that one variable causes another, but such claims are rare inside a laboratory. Determining causation requires a high level of control. An unhealthy diet may contribute to heart disease, but so may a lack of exercise, poor sleeping habits, and genetic predisposition. To say an unhealthy diet causes heart disease is shortsighted. Making a determination in this case would require controlling each of these other variables, and all others that might have an influence on the outcome, which isn’t very likely.


When causation can’t be determined, it may still be possible to determine correlation. Heart disease may correlate with diet: as the diet becomes less healthy, the risk of heart disease increases. This means that the two variables move together, not that one has been shown to produce the other. Positive correlations occur when two variables increase or decrease together. Negative correlations occur when one variable increases as the other decreases. A correlation does not indicate that one variable causes the other, only that the two are associated, and even then it is sometimes difficult to determine which variable is affecting the other, or whether a third, unseen variable is at work between the two.
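To make positive and negative correlation concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The numbers are made up purely for illustration, and the names `unhealthy_diet`, `heart_risk`, and `exercise_hours` are hypothetical, not data from any study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical, made-up scores for five people (illustration only).
unhealthy_diet = [1, 2, 3, 4, 5]
heart_risk = [2, 4, 5, 4, 5]       # tends to rise with the diet score
exercise_hours = [5, 4, 3, 2, 1]   # falls as the diet score rises

r_pos = pearson_r(unhealthy_diet, heart_risk)      # positive correlation
r_neg = pearson_r(unhealthy_diet, exercise_hours)  # negative correlation
print(round(r_pos, 3), round(r_neg, 3))
```

A coefficient near +1 or -1 says only that the two lists move together or in opposite directions. Nothing in the calculation says which variable, if either, is driving the other.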

The third variable

Causation and correlation are difficult to determine because it is difficult to identify every variable at work. Statistics might show that heart disease occurs at higher rates among people from a low socioeconomic background. This doesn’t mean that having less money in itself leads to heart disease, although on the surface it may appear so. Low socioeconomic status may decrease access to healthy foods, resulting in a poor diet, which then contributes to heart disease. In this case, diet is the lurking variable.
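The socioeconomic example can be sketched as a small simulation. This is a toy model under stated assumptions, not real data: `status`, `diet`, and `risk` are hypothetical variables, and risk is generated from diet alone, so any link between status and risk has to pass through diet. Removing the part of each variable that diet explains (a simple partial correlation) is one way to check for such a lurking variable.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) *
                      sum((y - my) ** 2 for y in ys))

def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
             sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed toy model: status shapes diet, and diet alone shapes risk.
status = [rng.gauss(0, 1) for _ in range(n)]   # socioeconomic status
diet = [s + rng.gauss(0, 1) for s in status]   # diet quality
risk = [-d + rng.gauss(0, 1) for d in diet]    # heart disease risk

# Raw correlation: lower status goes with higher risk.
raw_r = pearson_r(status, risk)

# Correlation after removing what diet explains from both variables.
partial_r = pearson_r(residuals(status, diet), residuals(risk, diet))

print(round(raw_r, 2), round(partial_r, 2))
```

In this model the raw correlation between status and risk should come out clearly negative, while the correlation that remains once diet is accounted for should be close to zero. Real data are rarely this clean, since a lurking variable has to be identified and measured before it can be accounted for.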


Scientists and laypeople alike benefit from knowing the difference between causation and correlation and understanding how each is established. Researchers who fail to identify influential variables run the risk of making false determinations. People who do not understand how research works might be misled by erroneous claims that a product is proven to bring about a certain result. It is difficult, even in a laboratory, to establish true causation. A correlation does not mean that one variable is proven to cause another, only that the two are associated. There are often other variables at work, and knowing to look for them helps protect everyone from error and, perhaps, wasted resources.

More about this author: Paul Brodie
