
“Straight-Line-Graph-Through-The-Origin”

I can still hear the words of Mr Michael Twomey, physics teacher in Coláiste an Spioraid Naoimh.

There were two main reasons to produce this *straight-line-graph-through-the-origin:*

- to measure some quantity (e.g. acceleration due to gravity, speed of sound, etc.)
- to demonstrate some law of nature (e.g. Newton’s Second Law, Ohm’s Law, etc.)

We were correct to draw this *straight-line-graph-through-the-origin* for measurement but, in my opinion, perhaps not always for the demonstration of laws of nature.

The purpose of this piece is to explore this in detail.

## Direct Proportion

Two variables $x$ and $y$ are in direct proportion when there is some (real number) constant $k$ such that $y = kx$.
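The definition of direct proportion can be checked numerically. What follows is only a minimal sketch: the function name `proportionality_constant`, the tolerance values, and the readings are all made up for illustration. It looks for a single constant such that every second value equals that constant times the first, and reports the constant if one exists.

```python
import math

def proportionality_constant(xs, ys, rel_tol=1e-9):
    """Return k if y = k*x holds for every pair (within tolerance), else None.

    Hypothetical helper written for this illustration only.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    pairs = list(zip(xs, ys))
    # Candidate constants come from the pairs where x is non-zero.
    ratios = [y / x for x, y in pairs if x != 0]
    if not ratios:
        return None  # no pair determines a constant
    k = ratios[0]
    # Every pair, including any with x == 0, must satisfy y = k*x.
    for x, y in pairs:
        if not math.isclose(y, k * x, rel_tol=rel_tol, abs_tol=1e-12):
            return None
    return k

# Made-up readings, not data from any real experiment.
print(proportionality_constant([1.0, 2.0, 3.0], [2.5, 5.0, 7.5]))  # proportional
print(proportionality_constant([1.0, 2.0, 3.0], [3.0, 5.0, 7.0]))  # a line, but not through the origin
```

The second call uses points that lie on a straight line with a non-zero intercept, so no single constant works and the function returns `None`.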

*Correlation does not imply causation* is a mantra of modern data science. It is probably worthwhile at this point to define the terms correlation, imply, and (harder) causation.

### Correlation

For the purposes of this piece, it is sufficient to say that if we measure and record values of variables $x$ and $y$, and they appear to have a straight-line relationship, then the correlation is a measure of how close the data is to lying on a straight line. For example, consider the following data:

*The variables $x$ and $y$ have a strong correlation.*
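One common way to put a number on "how close to a straight line" is the Pearson correlation coefficient. The piece does not say which measure is intended, so treat this as an assumption. The sketch below computes it from scratch; the name `pearson_r` and the readings are invented for illustration and are not the data referred to above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired readings (hypothetical helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up, slightly noisy readings that lie near a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pearson_r(xs, ys))  # close to 1 for data near a rising line
```

A value near 1 or -1 says the points lie close to some straight line. It does not say that the line passes through the origin.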

### Causation

Causality is a deep philosophical notion, but, for the purposes of this piece, if there is a relationship between variables $x$ and $y$ such that for each value of $x$ there is a single value of $y$, then we say that *$y$ is a function of $x$*: $x$ is the cause and $y$ is the effect.

In this case, we write $y = f(x)$, said *$y$ is a function of $x$*. This is a causal relationship between $x$ and $y$. (An example which shows why this definition is only useful for the purposes of this piece is the relationship between $x$, the number of days after January 1, and the sales, $y$, on that day: for each value of $x$ there is a single value of $y$. Indeed $y$ is a *function* of $x$, but $x$ does not *cause* $y$.)
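The "single value" condition is easy to test on recorded data. This is a minimal sketch: the name `is_function` and the figures are hypothetical. It checks that no value of the first variable is paired with two different values of the second.

```python
def is_function(pairs):
    """Return True if each first value is paired with exactly one second value."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # the same input appears with two different outputs
        seen[x] = y
    return True

# Made-up records of (days after January 1, sales on that day).
daily_sales = [(1, 120), (2, 95), (3, 120)]
print(is_function(daily_sales))           # one sales figure per day
print(is_function([(1, 120), (1, 130)]))  # two figures for the same day
```

The daily sales records pass the check, which is exactly the caveat in the parenthetical above: the check detects a functional relationship and says nothing about whether the day causes the sales.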
