“Straight-Line-Graph-Through-The-Origin”

The words of Mr Michael Twomey, physics teacher in Coláiste an Spioraid Naoimh: I can still hear them.

There were two main reasons to produce this straight-line-graph-through-the-origin:

  • to measure some quantity (e.g. acceleration due to gravity, speed of sound, etc.)
  • to demonstrate some law of nature (e.g. Newton’s Second Law, Ohm’s Law, etc.)

We were correct to draw this straight-line-graph-through-the-origin for measurement, but not always, perhaps, in my opinion, for the demonstration of laws of nature.

The purpose of this piece is to explore this in detail.

Direct Proportion

Two variables P and Q are in direct proportion when there is some (real number) constant k such that P=k\cdot Q.

In this case we say P is directly proportional to Q, written P\propto Q, and we call k the constant of proportionality. If we assume for the moment that both P and Q can be equal to zero, then when Q=0, P is certainly also equal to zero:

P=k\cdot Q\Rightarrow \underbrace{P}_{\text{when }Q=0}=k(0)=0.

Now imagine a plane, with P values on the y-axis, and Q-values on the x-axis, and the values of P plotted against those of Q. We certainly have that (Q,P)=(0,0) (aka the origin) is on this graph. Now suppose that (Q_0,P_0) is another such point.

[diagram1: the origin and the point (Q_0,P_0), forming a right-angled triangle with angle \theta at the origin]

The tangent of this angle \theta, opposite divided by adjacent, is given by:

\displaystyle \tan\theta=\frac{P_0}{Q_0},

however by assumption, P_0=k\cdot Q_0 so that:

\displaystyle\tan\theta=\frac{P_0}{Q_0}=\frac{k\cdot Q_0}{Q_0}=k.

Similarly, any other point (Q_1,P_1) has \tan\theta=k. This means the line segment connecting (0,0) to any point (Q_1,P_1) on the graph makes an angle \theta=\tan^{-1}(k) with the x-axis: they all lie on a straight line and, as we noted above, it passes through-the-origin:

[diagram2: all such points lying on a straight line through the origin]

Consider again the graph with the right-angled triangle. The slope of a line is defined as the rise-over-the-run: aka the tangent of the angle formed with the x-axis. Therefore, we have that the slope of this line is given by k, the constant of proportionality.

Measurement using the Straight-Line-Graph-Through-The-Origin

If we have two quantities that are in direct proportion… we need an example:

Example: Refractive Index

When light travels from a vacuum into a medium, it bends:

[diagram3: a ray of light bending as it passes from a vacuum into a medium, with angle of incidence i and angle of refraction r]

[image robbed from Geometrics.com and edited]

It turns out, using Maxwell’s Equations of Electromagnetism, that \sin i and \sin r are directly proportional:

\sin i=k\cdot \sin r.

This constant of proportionality, k, more usually denoted by n, is called the Refractive Index of the material.

For the moment, make the totally unrealistic assumption that we can measure i and r, and calculate \sin i and \sin r, without error. Then we can hit the medium with light, calculate the specific \sin i_0 and \sin r_0, and calculate the refractive index:

\displaystyle \sin i_0=n\cdot \sin r_0\Rightarrow n=\frac{\sin i_0}{\sin r_0}.
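To make this concrete, here is a toy version of that one-shot calculation in Python; the angles i_0 and r_0 below are made-up values for illustration, not real measurements:

```python
import math

# A toy version of the error-free calculation above: a single
# hypothetical measurement of i and r gives n = sin(i)/sin(r).
i0 = math.radians(45.0)  # angle of incidence (made-up value)
r0 = math.radians(28.0)  # angle of refraction (made-up value)

n = math.sin(i0) / math.sin(r0)
print(f"n = {n:.3f}")  # refractive index from this one data pair
```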

This, however, is completely unrealistic. We cannot measure without error (nor calculate without error).

Recall again that \sin i\propto \sin r, so a graph of \sin i vs \sin r is a straight-line-graph-through-the-origin:

[graph6: the line of true behaviour, \sin i plotted against \sin r]

This line represents the true, exact relationship between \sin i and \sin r – and if we had its equation, \sin i=n\cdot \sin r, we would know the refractive index. Let us call this the line of true behaviour.

In the real world, any measurement of i and r, and subsequent calculation of \sin i and \sin r, comes with an error.

For example, suppose, for a number of values of i, say i_1,i_2,\dots,i_N, we measure and record the corresponding r, say r_1,r_2,\dots,r_N (we write N here rather than n, as n is reserved for the refractive index). Suppose we calculate the corresponding sines and then plot the coordinates:

(\sin r_1,\sin i_1), (\sin r_2,\sin i_2),\dots, (\sin r_N,\sin i_N):

[graph22: the seven plotted data points]
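For the curious, here is a rough sketch in Python of how such noisy data might be simulated; the true refractive index n = 1.2, the chosen angles, and the size of the errors are all assumptions made for illustration:

```python
import numpy as np

# Simulate seven measurements of i and r for a medium whose true
# refractive index is n = 1.2 (assumed), with small angle errors.
rng = np.random.default_rng(seed=1)
n_true = 1.2

i_deg = np.array([10, 20, 30, 40, 50, 60, 70], dtype=float)
r_deg = np.degrees(np.arcsin(np.sin(np.radians(i_deg)) / n_true))

# Each measured angle is off by roughly half a degree.
i_meas = i_deg + rng.normal(0.0, 0.5, size=i_deg.shape)
r_meas = r_deg + rng.normal(0.0, 0.5, size=r_deg.shape)

sin_i = np.sin(np.radians(i_meas))
sin_r = np.sin(np.radians(r_meas))
print(np.column_stack([sin_r, sin_i]))  # points that almost line up
```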

Now these errors in measurement and calculation mean that these points do not lie on a line. Teaching MATH6000 here in CIT, I would see some students wanting to draw a line between the first and last data points:

[graph23: a line drawn through the first and last data points]

We can get the equation of this line (it's not too difficult), and the slope of this line approximates the refractive index (we get n\approx 1.198)… but why did we bother making seven measurements when we only used two to draw the line?

When we do this we are completely neglecting the following principle:

Data is only correct on average. Individual data points are always wrong.

The second statement should be self-evident: individual data points contain errors and are certainly wrong. When we are dealing not with discrete data but with continuous data, this "always wrong" can be quantified (in the sense of almost always).

The first statement is a little more difficult; it relies on the Law of Large Numbers. For the purposes of this piece, just suppose that the errors are just as likely to be positive (under-estimate) as negative (over-estimate). Then we might hope that the errors cancel each other out — and we might expect, furthermore, that the more data we have, the more likely it is that these errors do indeed cancel each other out, leaving the data correct on average.
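A small simulation can illustrate this hope; note that the zero-mean, normally distributed errors here are an assumption of this sketch, not something established above:

```python
import numpy as np

# "Correct on average": zero-mean errors, just as likely positive
# as negative, average out as the number of data points grows.
rng = np.random.default_rng(seed=42)
for N in (10, 100, 10_000, 1_000_000):
    errors = rng.normal(loc=0.0, scale=1.0, size=N)
    print(f"N = {N:>9,}: mean error = {errors.mean():+.5f}")
```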

So using just two data points is not a good idea: we should try and use all seven. You can, at this point, come up with various strategies for how to draw this line, this line of best fit. Maybe you want just as many points above the line as below, for example.

Recall that individual data points always contain errors. So, for example, we try to measure i, and we try to measure the corresponding r: these have measurement errors. Then we calculate \sin i and \sin r, and these measurement errors are propagated into errors in the sines.

It turns out that if we make assumptions about these errors (for the purposes of this piece, that errors are just as likely to be positive as negative), and if we have many, many data points, then they scatter around the true relationship between \sin i and \sin r, characterised by the line below, in a very particular way:

[graph1: many data points scattered around the line of true behaviour]

The errors have the property that the sum of the vertical deviations – squared – is as small as possible. We explain with a picture what vertical deviation means:

[graph2: the vertical deviations between the data points and the line]

The vertical deviations, briefly just the deviations, are as shown.

Now we flip this on its head. If we have some data, we know that the line of true behaviour is such that, if we had lots of data, the sum of the squared deviations would be as small as possible.

Therefore, if we find the line with the property that the sum of squared deviations for our smaller sample of data is a minimum, then this line, the line of best fit, approximates the line of true behaviour:

line of best fit \approx line of true behaviour

The more data we have (subject to various assumptions), the better this approximation.

Now how do we find this line of best fit? An example shows the way:

[graph3: the example data set of ten points (x_i,y_i)]

The quest is to find the line such that the sum of the squared deviations is as small as possible. As we noted above, if y\propto x then the line of best fit is of the form y=kx for some constant k, and different values of k give different errors. For example, below we see graphs with k=10,11,12 together with the data:

[graph4: the lines y=10x, y=11x and y=12x plotted with the data]

How we find this line of best fit is to consider k as a variable, find a formula for

S(k) = sum of squared deviations = \displaystyle \sum_i \delta_i^2,

and then minimise the value of S(k) — in other words, find the line that minimises \sum \delta_i^2. Here we calculate, for an arbitrary line — that is, an arbitrary value of k — the ten deviations. The deviation between the data point y_i and the corresponding point on the line of best fit, y(x_i)=kx_i, is given by the absolute difference of the two:

\delta_i=|y_i-kx_i|.

[table1: the ten data points (x_i,y_i) and their deviations \delta_i=|y_i-kx_i|]

Now note that |x|^2=x^2, and so

S(k)=(14-k)^2+(19-2k)^2+\cdots +(108-10k)^2.

A little time later we have, in this case,

S(k)=385k^2-8440k+46457.

How to minimise this? If you know any calculus this is an easy problem; however, a rudimentary knowledge of quadratics is enough to find the minimum. All quadratics q(x)=ax^2+bx+c can be written in the form:

\displaystyle q(x)=a\cdot \left(x+\frac{b}{2a}\right)^2+c-\frac{b^2}{4a}.

This is minimised (when a>0) at x+\frac{b}{2a}=0\Rightarrow x=-\frac{b}{2a}. Therefore, S(k)=385k^2-8440k+46457, which has a=385>0, is minimised at

\displaystyle k=-\frac{b}{2a}=-\frac{-8440}{2(385)}\approx 10.96.
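As a quick sanity check of this arithmetic, in Python, using only the coefficients derived above:

```python
# The vertex formula k = -b/(2a) applied to
# S(k) = 385k^2 - 8440k + 46457, as derived above.
a, b, c = 385, -8440, 46457

k_min = -b / (2 * a)          # where the parabola bottoms out
S_min = c - b**2 / (4 * a)    # the minimised sum of squared deviations

print(f"k    = {k_min:.4f}")  # 10.9610..., rounded to 10.96 above
print(f"S(k) = {S_min:.2f}")
```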

I wouldn’t ordinarily be an advocate for rounding fractions like this, however the important approximation

line of best fit \approx line of true behaviour

means that we cannot be so prescriptive. We finish here with

y\propto x\Rightarrow y=k\cdot x\approx 10.96\cdot x.

More generally, where we have (x_i,y_i) data and we want to find the line (through the origin) of best fit, we have (via |a|^2=a^2):

\displaystyle S(k)=\sum_i \delta_i^2=\sum_i |y_i-k\cdot x_i|^2

\displaystyle =\sum_i\left(y_i^2-2x_iy_i\cdot k+x_i^2\cdot k^2\right)

\displaystyle =\left(\sum_i x_i^2\right)\cdot k^2-\left(2\sum_i x_iy_i\right)k+\sum_i y_i^2\sim a\cdot k^2+b\cdot k+c

so that, as before, the quadratic S(k) is at a minimum at k=-\frac{b}{2a}:

\displaystyle k=\frac{\sum_i x_iy_i}{\sum_i x_i^2}.
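Here is a minimal sketch of this formula in Python; the data below is made up, roughly proportional with some noise:

```python
import numpy as np

# Slope of the least-squares line through the origin:
# k = (sum of x_i * y_i) / (sum of x_i^2).
def fit_through_origin(x: np.ndarray, y: np.ndarray) -> float:
    """Slope k of the least-squares line y = k*x through the origin."""
    return float(np.sum(x * y) / np.sum(x * x))

# Made-up data, roughly proportional with constant 2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

print(f"k = {fit_through_origin(x, y):.4f}")  # close to the underlying 2
```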

Verifying Laws

Maxwell’s Equations imply Snell’s Law but Snell’s Law was proposed almost 1000 years before Maxwell!

Suppose you hear about Snell's Law for the first time. How could someone convince you of it (without running the gamut of Maxwell's Equations)?

Well, they'd have to do an experiment of course! They would have to allow you to measure different values of i and r, calculate the corresponding values of \sin i and \sin r, and plot them. For the graph to be through-the-origin, they would have to convince you that i=0\Rightarrow r=0.

Then you would find the line of best fit: straight-line-graph-through-the-origin.

You would see the data lining up well, providing qualitative evidence for \sin i=k\cdot \sin r, and hence for Snell's Law.

Then they might show you how to calculate how well the data fits the line of best fit, perhaps by calculating the correlation coefficient.
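For illustration, here is how the (Pearson) correlation coefficient might be computed in Python; the (\sin r,\sin i) values below are made up:

```python
import numpy as np

# The Pearson correlation coefficient of made-up (sin r, sin i)
# data; values near 1 indicate the points lie close to a line.
sin_r = np.array([0.14, 0.28, 0.42, 0.54, 0.64, 0.72, 0.78])
sin_i = np.array([0.18, 0.35, 0.49, 0.65, 0.76, 0.87, 0.94])

corr = np.corrcoef(sin_r, sin_i)[0, 1]
print(f"correlation coefficient = {corr:.4f}")
```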

Beyond

Things get slightly more complicated if we have a straight-line graph that doesn't go through the origin, or a curve other than a line. We are, however, able to fit data to a family of curves: curves that minimise the sum of the squared deviations. This is Linear Least Squares, where partial differentiation can help find the curve-of-best-fit that approximates the curve-of-true-behaviour.
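For a flavour of that more general case, here is a minimal sketch in Python of fitting a line y=mx+c, not forced through the origin, to made-up data:

```python
import numpy as np

# A least-squares line y = m*x + c, fitted via numpy.polyfit.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 3.0, 4.8, 7.2, 8.9])

m, c = np.polyfit(x, y, deg=1)  # degree-1 least-squares fit
print(f"y ≈ {m:.3f}x + {c:.3f}")
```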

Things get more and more complicated after this, but we can leave it there for now.

 
