Why Do We Multiply Matrices Like We Do?

In this short note we will explain why we multiply matrices in this “rows-by-columns” fashion. This note will only look at $2\times 2$ matrices but it should be clear, particularly by looking at this note, how this generalises to matrices of arbitrary size.

First of all we need some objects. Consider the plane $\Pi$ . By fixing an origin, orientation ( $x$ – and $y$ -directions), and scale, each point $P\in\Pi$ can be associated with an ordered pair $(a,b)$ , where $a$ is the distance along the $x$ axis and $b$ is the distance along the $y$ axis. For the purposes of linear algebra we denote this point $P=(a,b)$ by

$\displaystyle P=\left(\begin{array}{c}a\\ b\end{array}\right)$ .


We have two basic operations on points in the plane: we can add them together and we can multiply them by scalars. If $Q=(c,d)$ and $\lambda\in\mathbb{R}$ , then:

$P+Q=\left(\begin{array}{c}a\\ b\end{array}\right)+\left(\begin{array}{c}c\\ d\end{array}\right)$

$\displaystyle=\left(\begin{array}{c}a+c\\ b+d\end{array}\right)$ , and

$\lambda\cdot P=\lambda\cdot \left(\begin{array}{c}a\\ b\end{array}\right)=\left(\begin{array}{c}\lambda\cdot a\\ \lambda\cdot b\end{array}\right)$ .
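As a small illustration, the two operations can be sketched in Python. Representing a point as a pair of numbers, and the function names `add` and `scale`, are choices made here for the sketch rather than anything fixed by the note.

```python
# A point of the plane is modelled as a tuple (a, b).
# This representation is an assumption made for the sketch.

def add(p, q):
    """Add two points coordinate by coordinate."""
    return (p[0] + q[0], p[1] + q[1])

def scale(lam, p):
    """Multiply both coordinates of p by the scalar lam."""
    return (lam * p[0], lam * p[1])

P = (1, 2)
Q = (3, -1)
print(add(P, Q))    # (4, 1)
print(scale(2, P))  # (2, 4)
```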

Objects in mathematics that can be added together and scalar-multiplied are said to be vectors. Sets of vectors are known as vector spaces, and a feature of vector spaces is that every vector can be written in a unique way as a sum of scalar multiples of basic vectors.

In the case of the plane $\Pi$ , the vectors $e_1=(1,0)$ (one along the $x$ ) and $e_2=(0,1)$ (one along the $y$ ) are basic vectors and the set $\mathcal{B}:=\{e_1,e_2\}$ is said to be a basis for $\Pi$ . The dimension of a vector space is the size of the basis (bases are not unique but their size is). Every vector $P\in\Pi$ may be written, in a unique way, as a sum of scalar multiples of elements of $\mathcal{B}$ :

$\displaystyle P=\left(\begin{array}{c}a\\ b\end{array}\right)=\left(\begin{array}{c}a\\ 0\end{array}\right)+\left(\begin{array}{c}0\\ b\end{array}\right)=ae_1+be_2$ .

One of the first things to do when an algebraic structure is defined, in this case the plane, is to consider functions on it. A function $f:\Pi\rightarrow \Pi$ is a map that sends each vector $P\in \Pi$ to another $f(P)\in \Pi$ . For example, the function $R_{\pi/2}$ that rotates a point $\pi/2$ radians around the origin, in the anti-clockwise direction, is a function.


Of particular interest are linear maps. A linear map is a function between two vector spaces that preserves the operations of vector addition and scalar multiplication. In the case of functions $\Pi\rightarrow \Pi$ , a linear map is any function $T:\Pi\rightarrow \Pi$ where $T(u+\lambda \cdot v)=T(u)+\lambda\cdot T(v)$ for any vectors $u,v\in \Pi$ and scalar $\lambda\in \mathbb{R}$ . The quick calculation:

$T\left(a\cdot e_1+b\cdot e_2\right)=a\cdot T(e_1)+b\cdot T(e_2)$ ,

shows that a linear map is determined by what it does to the basis vectors. Suppose that a linear map is defined, for scalars $x_{ij}\in\mathbb{R}$ , by:

$T(e_1)=x_{11}\cdot e_1+x_{21}\cdot e_2$ , and

$T(e_2)=x_{12}\cdot e_1+x_{22}\cdot e_2$ ,

then we see that

$T(a,b)=a\cdot T(e_1)+b\cdot T(e_2)=a\cdot (x_{11}\cdot e_1+x_{21}\cdot e_2)+b\cdot (x_{12}\cdot e_1+x_{22}\cdot e_2)$

$=(x_{11}a+x_{12}b)\cdot e_1+(x_{21}a+x_{22}b)\cdot e_2$ .
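The calculation above can be sketched in Python: given only the images of $e_1$ and $e_2$ , linearity forces the value of $T$ everywhere. The helper name `linear_map` and the numerical values of the $x_{ij}$ are hypothetical, chosen only for illustration.

```python
def linear_map(Te1, Te2):
    """Return the map T with T(e1) = Te1 and T(e2) = Te2, extended by linearity.

    Te1 = (x11, x21) and Te2 = (x12, x22) in the notation of the note.
    """
    def T(v):
        a, b = v
        # T(a*e1 + b*e2) = a*T(e1) + b*T(e2)
        return (a * Te1[0] + b * Te2[0], a * Te1[1] + b * Te2[1])
    return T

# Hypothetical values: x11 = 1, x21 = 3, x12 = 2, x22 = 4.
T = linear_map((1, 3), (2, 4))
print(T((1, 0)))  # (1, 3), which is T(e1)
print(T((0, 1)))  # (2, 4), which is T(e2)
print(T((5, 6)))  # (17, 39) = (1*5 + 2*6, 3*5 + 4*6)
```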

Now it turns out that all this information can be encoded by a matrix $A$ as follows. Let $v=(a,b)\in \Pi$ . Then $T(v)=Av$ where $A$ is a matrix given as follows:

$\displaystyle T(v)=T\left(\begin{array}{c}a \\ b\end{array}\right)=\underbrace{\left(\begin{array}{cc}x_{11} & x_{12} \\ x_{21} & x_{22}\end{array}\right)}_{:=A}\left(\begin{array}{c}a \\ b\end{array}\right).$

If we take matrix multiplication to be defined in the usual rows-by-columns way then, multiplying this out, we see that the two are the same thing:

$T(v)=(x_{11}a+x_{12}b)\cdot e_1+(x_{21}a+x_{22}b)\cdot e_2$

$\left(\begin{array}{cc}x_{11} & x_{12} \\ x_{21} & x_{22}\end{array}\right)\left(\begin{array}{c}a \\ b\end{array}\right)=\left(\begin{array}{c}x_{11}a +x_{12}b \\ x_{21}a+x_{22}b\end{array}\right)$ .

Therefore two-by-two matrices are actually functions in the sense that every linear map $T:\Pi\rightarrow \Pi$ is of the form:

$T(v)=Av$ ,

for some $2\times2$ matrix $A$ .
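For a concrete case, take the rotation $R_{\pi/2}$ from earlier. It sends $e_1$ to $e_2=(0,1)$ and $e_2$ to $-e_1=(-1,0)$ , so these images form the columns of its matrix. The following Python sketch checks this; the function name `apply` and the nested-list layout of the matrix are assumptions made for the example.

```python
def apply(A, v):
    """Multiply the 2x2 matrix A (a list of two rows) by the vector v = (a, b)."""
    a, b = v
    return (A[0][0] * a + A[0][1] * b,
            A[1][0] * a + A[1][1] * b)

# Columns are the images of the basis vectors under the rotation:
# first column (0, 1) = R(e1), second column (-1, 0) = R(e2).
R = [[0, -1],
     [1,  0]]

print(apply(R, (1, 0)))  # (0, 1)
print(apply(R, (0, 1)))  # (-1, 0)
print(apply(R, (2, 1)))  # (-1, 2)
```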

Another notation for $\Pi$ is $\mathbb{R}^2$ , basically two copies of the real numbers. Every finite-dimensional vector space of dimension $n$ , where the scalars are real numbers, can be identified with $\mathbb{R}^n$ , basically a list of $n$ numbers. It turns out that a matrix of size $M\times N$ ( $M$ rows, $N$ columns) encodes a linear map $\mathbb{R}^N\rightarrow \mathbb{R}^M$ (note the switch from $M\text{-}N$ to $N\text{-}M$ ).
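To see the switch in sizes, here is a minimal Python sketch in which a $2\times 3$ matrix takes a vector with three entries to a vector with two entries. The function name `apply` and the particular numbers are hypothetical.

```python
def apply(A, v):
    """Multiply an M x N matrix A (a list of M rows) by a vector v of length N."""
    if any(len(row) != len(v) for row in A):
        raise ValueError("each row of A must have as many entries as v")
    # One output entry per row of A, so the result has length M.
    return [sum(x * c for x, c in zip(row, v)) for row in A]

A = [[1, 0, 2],
     [0, 1, -1]]   # 2 rows, 3 columns
v = [3, 4, 5]      # a vector in R^3
w = apply(A, v)    # a vector in R^2
print(w)           # [13, -1]
```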

We can compose two functions to produce another. For example, consider two linear maps $T_A,T_B:\Pi\rightarrow \Pi$ encoded by two $2\times2$ matrices $A$ and $B$ . Suppose we act on a point $P\in\Pi$ first by $T_B$ and then by $T_A$ :


Now this composition is a function in itself, sending $P$ to

$(T_A\circ T_B)P=T_A(T_B(P))=T_A(BP)=ABP$ .

Now there are two questions. First, is the map $T_{AB}$ sending $P$ to $ABP$ linear? Yes, and this is a straightforward exercise. Second, can we associate to $AB$ a single matrix, say $C$ , such that $AB=C$ and $T_{AB}=T_C$ ? The answer is also yes.

Let us write $P=(x,y)$ and define $T_A$ and $T_B$ by matrices $[a_{ij}]$ and $[b_{ij}]$ . Then

$T_B(P)=BP=(b_{11}x+b_{12}y,b_{21}x+b_{22}y)$ ,

and so

$T_A(BP)=T_A((b_{11}x+b_{12}y,b_{21}x+b_{22}y))$

$=\left(a_{11}(b_{11}x+b_{12}y)+a_{12}(b_{21}x+b_{22}y),\ a_{21}(b_{11}x+b_{12}y)+a_{22}(b_{21}x+b_{22}y)\right)$ .

Collecting the coefficients of $x$ and $y$ shows that this is nothing but the following, where $r_i^A$ is the $i$ -th row of $A$ and $c_j^B$ is the $j$ -th column of $B$ :

$\displaystyle \left(\begin{array}{cc} r_1^A\bullet c_1^B & r_1^A\bullet c_2^B \\ r_{2}^A\bullet c_1^B & r_2^A\bullet c_2^B \end{array}\right)\left(\begin{array}{c}x\\ y\end{array}\right)$ ,

where this $\bullet$ , called the dot product, takes a pair of vectors and sends them to a scalar. In the case of vectors in the plane:

$(a_1,b_1)\bullet (a_2,b_2)=a_1a_2+b_1b_2$ .

So the reason that we multiply matrices the way we do is that the matrix product $AB$ represents the function composition $(T_A\circ T_B)$ .
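This can be checked numerically. The Python sketch below builds the product row by column using the dot product, then compares applying $T_B$ followed by $T_A$ with applying the single matrix $AB$ . The function names `dot`, `matvec` and `matmul`, and the entries of $A$ , $B$ and $P$ , are hypothetical choices for the illustration.

```python
def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(u, v))

def matvec(A, v):
    """Apply the matrix A to the vector v: one dot product per row of A."""
    return tuple(dot(row, v) for row in A)

def matmul(A, B):
    """Rows-by-columns product: entry (i, j) is (row i of A) . (column j of B)."""
    columns_of_B = list(zip(*B))
    return [[dot(row, col) for col in columns_of_B] for row in A]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
P = (5, 7)

step_by_step = matvec(A, matvec(B, P))  # first T_B, then T_A
all_at_once = matvec(matmul(A, B), P)   # the single matrix AB

print(matmul(A, B))   # [[2, 1], [4, 3]]
print(step_by_step)   # (17, 41)
print(all_at_once)    # (17, 41)
```

Swapping the order gives `matmul(B, A)`, which is `[[3, 4], [1, 2]]` here: composing in the other order is in general a different map.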
