The Basics of Linear Algebra

An essence of mathematics is to cut through the gloss of each topic and exhibit the core features of the construct.

We all encountered vectors as teenagers. We may have been told that a vector is a physical quantity with magnitude and direction. This is to distinguish it from a scalar, which has magnitude alone. It begs the question: what has direction without magnitude? Anyway, we can think of those familiar arrow vectors in the plane, think of all possible magnitudes and directions, and think of them as some big, infinite set $V$ . One basic thing about these vectors is that we can add them together by the parallelogram law, and more than that, when we add them together we get another vector. So there is a way to add vectors $u$ , $v$ together, and also $u+v\in V$ . This means $+$ is a closed binary operation. Furthermore, it doesn’t matter in what order we add the vectors: $u+v=v+u$ , the addition is commutative. Also we can take a vector $v$ and double it, or treble it, or indeed multiply it by any real number $k\in \mathbb{R}$ , and again we’ll get another vector $kv\in V$ (with $1v=v$ ). We don’t care about multiplying two vectors, at the moment. There is a special zero vector, $\mathbf{0}$ , such that for any vector $v$ , $v+\mathbf{0}=v$ . Every vector $v$ has a negative, namely $-v$ , such that $v+(-v)=\mathbf{0}$ . Also the addition and scalar multiplication satisfy some nice algebraic laws; those of associativity and distributivity.

Vector Spaces
So we note a set $V$ of vectors, a field $\mathbb{R}$ of scalars, a vector addition $+$ and a scalar multiplication $\cdot$ . The following axioms are satisfied $\forall u,v,w \in V$ , $\lambda,\mu\in\mathbb{R}$ .

1. $u+v\in V$ , $\lambda v\in V$ (closure)
2. $\exists\mathbf{0}\in V$ such that $\mathbf{0}+v=v$ (zero vector)
3. $\forall v\in V,\,\exists -v\in V$ such that $v+(-v)=\mathbf{0}$ (negative vectors)
4. $u+v=v+u$ (commutativity)
5. $1v=v$ (multiplicative identity)
6. $u+(v+w)=(u+v)+w$ (associativity)
7. $\lambda(u+v)=\lambda u+\lambda v$ , $(\lambda+\mu)v=\lambda v+\mu v$ , and $(\lambda \mu)v=\lambda(\mu v)$ (distributivity)

Now the structure can be abstracted and we say that any (set,field,addition,scalar multiplication) tuple $\{V,\mathbb{F},+,\cdot\}$ which satisfies these axioms is called a vector space.
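As a concrete check, the arrow vectors in the plane can be modelled as pairs of numbers. The following is a minimal sketch, not a proof: it spot-checks the axioms on a few sample vectors. The helper names `add` and `scale` are illustrative assumptions, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction as F

def add(u, v):
    """Componentwise vector addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def scale(k, v):
    """Scalar multiplication of the vector v by the scalar k."""
    return (k * v[0], k * v[1])

zero = (F(0), F(0))
u, v, w = (F(1), F(2)), (F(-3), F(1, 2)), (F(5), F(-7))
lam, mu = F(2, 3), F(-4)

assert add(zero, v) == v                                   # zero vector
assert add(v, scale(F(-1), v)) == zero                     # negative vectors
assert add(u, v) == add(v, u)                              # commutativity
assert scale(F(1), v) == v                                 # multiplicative identity
assert add(u, add(v, w)) == add(add(u, v), w)              # associativity
assert scale(lam, add(u, v)) == add(scale(lam, u), scale(lam, v))
assert scale(lam + mu, v) == add(scale(lam, v), scale(mu, v))
assert scale(lam * mu, v) == scale(lam, scale(mu, v))
```

Passing these checks on sample vectors only illustrates the axioms; the general statements follow from the corresponding laws of arithmetic in each component.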

Bases

One of the features of the geometric vectors seen in second level education is that they can be written in the $\hat{i}$ – $\hat{j}$ basis. In the context of how MA 2055 is presently taught (2009/10), $\hat{i}$ and $\hat{j}$ are said to be a basis because every vector has a unique representation as a linear combination of $\hat{i}$ and $\hat{j}$ : $v=a\hat{i}+b\hat{j}$ uniquely. Again this notion of a basis may be abstracted. A set $\{e_1,e_2,\dots,e_n\}\subset V$ is said to be a basis of $V$ if every vector $v\in V$ has a unique representation as

$v=\sum_{i=1}^na_ie_i$

It can be shown that this condition is equivalent to the basis being a linearly independent, spanning set. In MA 2055, a set $\{v_1,v_2,\dots,v_n\}$ is said to be linearly independent if the only linear combination of the $v_i$ equal to $\mathbf{0}$ is the trivial one; i.e. linear independence is

$\sum_{i=1}^na_iv_i=\mathbf{0}\,\Rightarrow\,a_i=0\,,\forall\,i$

In turn this is equivalent to the condition that none of the vectors are a linear combination of the others – in some sense the ‘directions’ of the vectors are all different. A spanning set is one in which all possible linear combinations exhaust the space:

$\text{span}\{v_1,\dots,v_n\}:=\{\sum_{i=1}^n a_iv_i :a_i\in\mathbb{R}\}$

i.e. a set $A$ is a spanning set for $V$ if span $A=V$ . The dimension of a vector space is given by the number of vectors in a basis.
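For vectors in $\mathbb{R}^n$ these conditions can be tested mechanically. The sketch below is one possible approach, assuming vectors are given as tuples of numbers: it computes the rank of a list of vectors by Gaussian elimination over the rationals. The function names `rank`, `is_linearly_independent` and `is_basis` are made up for this illustration.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors (treated as rows), by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # look for a row at or below position r with a non-zero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # clear this column in every other row
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    """Independent exactly when no vector is lost to elimination."""
    return rank(vectors) == len(vectors)

def is_basis(vectors, dim):
    """A basis of R^dim is a linearly independent set that spans R^dim."""
    return is_linearly_independent(vectors) and rank(vectors) == dim

print(is_basis([(1, 0), (0, 1)], 2))                  # True
print(is_linearly_independent([(1, 2), (2, 4)]))      # False
print(is_basis([(1, 0, 0), (0, 1, 0)], 3))            # False
```

The last call shows a set that is linearly independent but does not span: two vectors cannot exhaust a three-dimensional space.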

Linear Maps

One of the first things to do when an abstract structure is defined is to consider functions between them. A linear map is a function between two vector spaces that preserves the operations of vector addition and scalar multiplication. In other words a linear map is any function $T:V\rightarrow U$ where $T(u+_V\lambda v)=T(u)+_U\lambda T(v)$ for any vectors $u,v\in V$ and scalar $\lambda\in \mathbb{F}_V$ . The quick calculation:

$T\left(\sum_{i=1}^na_ie_i\right)=\sum_{i=1}^na_iT(e_i)$

shows that a linear map is determined by what it does to the basis vectors. Let $T:U\rightarrow V$ be a linear map and let $\{e_1,\dots,e_n\}$ and $\{f_1,\dots,f_m\}$ be bases for $U$ and $V$ respectively. If the linear map is defined by the equations:

$T(e_i)=\sum_{j=1}^mb^i_jf_j\,,\,\,\,i=1,\dots,n$

then the $m\times n$ matrix $A$ with columns $c_i=(b^i_1\,b^i_2\,\cdots\,b^i_m)^T$ acts on the coordinate vectors $e_1=(1\,0\,\cdots\,0)^T,\,e_2=(0\,1\,0\,\cdots\,0)^T,\dots, e_n=(0\,\cdots\,0\,1)^T$ according to $T$ : $Ae_i=c_i$ . Hence, once bases are fixed, a linear map is nothing but a matrix (and indeed multiplication by a matrix is suitably linear).
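The passage from a linear map to its matrix can be sketched in a few lines. This is an illustration only: the map in the example, and the helper names `matrix_from_images` and `apply`, are assumptions chosen for this sketch.

```python
def matrix_from_images(images):
    """Build the matrix of a linear map from the images of the basis vectors.

    images[i] holds the coordinates of T(e_i) in the basis f_1, ..., f_m.
    The result is the m x n matrix (a list of rows) whose i-th column is images[i].
    """
    n, m = len(images), len(images[0])
    return [[images[i][j] for i in range(n)] for j in range(m)]

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical example: T(e_1) = f_1 + 2 f_3 and T(e_2) = 3 f_2 - f_3.
A = matrix_from_images([(1, 0, 2), (0, 3, -1)])

print(apply(A, [1, 0]))   # [1, 0, 2], the coordinates of T(e_1)
print(apply(A, [2, 5]))   # [2, 15, -1], which is 2 T(e_1) + 5 T(e_2)
```

The second call is the quick calculation above in action: the image of $2e_1+5e_2$ is fixed once the images of $e_1$ and $e_2$ are known.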

Occasionally a linear map will be invertible. There are many equivalent definitions of invertible, the most intuitive being that $T$ is bijective. A function $f:A\rightarrow B$ is bijective if:

If $f(x)=f(y)$ , then $x=y$
For all $z\in B$ , there exists $w\in A$ such that $f(w)=z$ .

Now suppose a linear map sends two distinct vectors $u_1,u_2$ to a vector $v$ ; i.e. $Tu_1=v$ and $Tu_2=v$ . Thence

$Tu_1-Tu_2=v-v$

$\Rightarrow T(u_1-u_2)=\mathbf{0}$

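That is, a non-zero vector, $u_1-u_2$ , is sent to $\mathbf{0}$ . By linearity, any linear map will send $\mathbf{0}$ to $\mathbf{0}$ . Hence a linear map cannot be invertible if $\{u\in U:Tu=\mathbf{0}\}\neq\{\mathbf{0}\}$ . Conversely, if $\{u\in U:Tu=\mathbf{0}\}=\{\mathbf{0}\}$ then the same calculation shows that $T$ is injective, and between spaces of the same finite dimension an injective linear map is automatically surjective. So, for maps between spaces of equal finite dimension, this is an equivalent condition to invertibility: a linear map $T$ is invertible if and only if $\{u\in U:Tu=\mathbf{0}\}=\{\mathbf{0}\}$ .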

A few dimensional considerations show that to be invertible a linear map must map between vector spaces of equal dimension. In particular, when bases are set, the matrix of the linear map will be a square matrix. In the following assume the dimension of the spaces is $n$ .

Given a matrix $A$ , to calculate the inverse of $A$ , $A^{-1}$ , a quantity known as the determinant of $A$ , $\det A$ is calculated. The subsequent analysis yields another equivalent condition for invertibility: a linear map $T$ is invertible if and only if $\det T\neq 0$ .
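The two invertibility conditions can be seen side by side on small matrices. The sketch below assumes matrices are given as lists of rows and computes the determinant by cofactor expansion, which is only sensible for small sizes; the names `det` and `apply` are illustrative.

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # delete the first row and the j-th column to form the minor
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2, 1], [1, 1]]   # non-zero determinant, so invertible
B = [[1, 2], [2, 4]]   # zero determinant, so not invertible

print(det(A))               # 1
print(det(B))               # 0
print(apply(B, [2, -1]))    # [0, 0]
```

The matrix `B` has determinant zero and, as expected, sends the non-zero vector $(2\,\,{-1})^T$ to $\mathbf{0}$ .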

The Eigenvalue Problem

For some vectors $v$ the action of a linear map $T$ is by scalar multiplication:

$Tv=\lambda v$

In this case $v$ is called an eigenvector of the linear map $T$ with eigenvalue $\lambda$ . In the first instance we restrict to non-zero vectors. Otherwise every number is an eigenvalue of every linear map because $T\mathbf{0}=\lambda \mathbf{0}$ for all $\lambda\in\mathbb{C}$ . Hence restrict to non-zero vectors and hopefully it will make sense to consider the eigenvalues of a linear map $T$ . Consider

$Tv-\lambda I v=\mathbf{0}$

$\Rightarrow (T-\lambda I)v=\mathbf{0}$

Now if $T-\lambda I$ is invertible then both sides may be hit by $(T-\lambda I)^{-1}$ to yield $v=\mathbf{0}$ , which is not allowed. Hence $T-\lambda I$ must not be invertible and thus the eigenvalues of $T$ are given by the solutions of the polynomial equation (of degree $n$ in $\lambda$ , so with $n$ complex solutions when counted with multiplicity):

$\det(T-\lambda I)=0$

Indeed this gives yet another equivalent condition to $T$ being invertible: a linear map $T$ is invertible if and only if $0$ is not an eigenvalue.

A particularly nice class of linear maps are the diagonalisable linear maps. Suppose a set of eigenvectors $\{v_1,\dots,v_n\}$ with eigenvalues $\{\lambda_1,\dots,\lambda_n\}$ of a linear map $T$ form a basis. Thence in this basis, the matrix of $T$ is diagonal with the $\lambda_i$ along the diagonal.
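For a $2\times 2$ matrix the equation $\det(T-\lambda I)=0$ is the quadratic $\lambda^2-(\text{tr}\,A)\lambda+\det A=0$ , so the eigenvalues can be found with the quadratic formula. The following is a sketch under that assumption (it does not generalise beyond $2\times 2$ ); `eigenvalues_2x2` and `apply` are illustrative names.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    trace, determinant = a + d, a * d - b * c
    # cmath.sqrt copes with a negative discriminant (complex eigenvalues)
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    return ((trace + disc) / 2, (trace - disc) / 2)

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2, 1], [1, 2]]
lam1, lam2 = eigenvalues_2x2(A)   # 3 and 1, returned as complex numbers

# (1, 1) and (1, -1) are eigenvectors for the eigenvalues 3 and 1 respectively
assert apply(A, [1, 1]) == [3, 3]
assert apply(A, [1, -1]) == [1, -1]

# A rotation by a right angle has no real eigenvalues
R = [[0, -1], [1, 0]]
mu1, mu2 = eigenvalues_2x2(R)     # i and -i
```

Since $(1\,\,1)^T$ and $(1\,\,{-1})^T$ form a basis of the plane, the matrix `A` is diagonalisable, with $3$ and $1$ along the diagonal in that basis.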

Inner Product Spaces

Earlier on we mentioned that we don’t care about multiplying vectors; however, there is the notion of a dot product of two vectors. For every pair of vectors, $\mathbf{a}=a_x\hat{i}+a_y\hat{j},\,\mathbf{b}=b_x\hat{i}+b_y\hat{j}$ on the $\hat{i}$ – $\hat{j}$ plane, there is a scalar, $\mathbf{a}\cdot\mathbf{b}$ , called the dot product of $\mathbf{a}$ and $\mathbf{b}$ :

$\mathbf{a}\cdot\mathbf{b}=a_xb_x+a_yb_y$

The dot product satisfies the following two properties for all vectors $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ and scalars $\lambda\in\mathbb{R}$ :

1. The dot product is linear on the left: $(\mathbf{a}+\lambda \mathbf{b})\cdot \mathbf{c}=\mathbf{a}\cdot \mathbf{c}+\lambda\mathbf{b}\cdot\mathbf{c}$ .
2. The dot product is positive definite: $\mathbf{a}\cdot\mathbf{a}\geq 0$ , with equality if and only if $\mathbf{a}=\mathbf{0}$ .

In a similar modus operandi to before, this structure can be abstracted to a more general map from pairs of vectors to $\mathbb{F}$ . Any map $\langle\,,\,\rangle:V\times V\rightarrow \mathbb{F}$ that satisfies 1. and 2. for all vectors $u,\,v,\,w\in V$ and scalars $\lambda\in\mathbb{F}$ , with the additional property that $\langle\,,\,\rangle$ is conjugate linear on the right ( $\langle u,v+\lambda w\rangle=\langle u,v\rangle+\bar{\lambda}\langle u,w\rangle$ ), is an inner product on $V$ and $\{V,\mathbb{F},+,\cdot,\langle\,,\,\rangle\}$ is an inner product space. (When $\mathbb{F}=\mathbb{R}$ the conjugate does nothing, and one also asks for symmetry, $\langle u,v\rangle=\langle v,u\rangle$ , which the dot product has.)
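The standard example over $\mathbb{C}$ is $\langle u,v\rangle=\sum_i u_i\bar{v_i}$ on $\mathbb{C}^n$ . The sketch below assumes this particular inner product and spot-checks the three properties on sample vectors; the names `inner`, `add` and `scale` are illustrative, and the entries are small integers so the floating point arithmetic is exact.

```python
def inner(u, v):
    """Standard inner product on C^n: conjugate taken in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, v):
    """Scalar multiplication of v by k."""
    return tuple(k * a for a in v)

u = (1 + 2j, 3j)
v = (2 - 1j, 1 + 1j)
w = (1j, 4)
lam = 2 - 3j

# linear on the left
assert inner(add(u, scale(lam, v)), w) == inner(u, w) + lam * inner(v, w)
# conjugate linear on the right
assert inner(u, add(v, scale(lam, w))) == inner(u, v) + lam.conjugate() * inner(u, w)
# positive definite: <u, u> is real and strictly positive for non-zero u
assert inner(u, u).imag == 0 and inner(u, u).real > 0
```

Putting the conjugate on the second slot is what makes $\langle u,u\rangle$ a sum of squared moduli, here $|1+2i|^2+|3i|^2=14$ .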

With respect to a given inner product, each linear map $T:V\rightarrow V$ has an adjoint, $T^\star$ , defined by requiring, for all $u,\,v\in V$ :

$\langle Tu,v\rangle=\langle u,T^\star v\rangle$

A linear map is called self-adjoint if $T=T^\star$ .

Proposition 1: The eigenvalues of a self-adjoint linear map are real.

Proof: Let $\{v_1,\dots,v_n\}$ be eigenvectors, with eigenvalues $\{\lambda_1,\dots,\lambda_n\}$ , of the self-adjoint linear map $T$ . Now

$\langle v_i,Tv_i\rangle=\langle v_i,\lambda_iv_i\rangle=\bar{\lambda_i}\langle v_i,v_i\rangle$

Similarly,

$\langle Tv_i,v_i\rangle=\langle \lambda_iv_i,v_i\rangle=\lambda_i\langle v_i,v_i\rangle$

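But $T$ is self-adjoint, so $\langle Tv_i,v_i\rangle=\langle v_i,T^\star v_i\rangle=\langle v_i,Tv_i\rangle$ , thence $\bar{\lambda_i}\langle v_i,v_i\rangle=\lambda_i\langle v_i,v_i\rangle$ . Because eigenvectors are non-zero, positive definiteness gives $\langle v_i,v_i\rangle>0$ , and so $\bar{\lambda_i}=\lambda_i$ $\bullet$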

The spectral theorem for self-adjoint maps states that a self-adjoint map has an orthonormal basis of eigenvectors. When an orthonormal basis is fixed, it may be seen that the matrix of a self-adjoint linear map is one that is equal to its conjugate-transpose: $A=\overline{A^T}=A^\star$ . These two facts combine to give a basis-dependent proof of Proposition 1.

Proof (basis-dependent): By the spectral theorem, $T$ has an orthonormal eigenbasis, say $\{v_1,\dots,v_n\}$ . Let $A$ be the matrix representation of $T$ in this basis. $A$ is thus the diagonal matrix with $\lambda_1,\dots,\lambda_n$ along the diagonal. Now $A=\overline{A^T}$ . As $A$ is diagonal it is equal to its transpose, so $A=\bar{A}$ . Comparing diagonal entries shows $\bar{\lambda_i}=\lambda_i$ $\bullet$
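Both the adjoint relation and Proposition 1 can be spot-checked numerically. The sketch below assumes the standard inner product on $\mathbb{C}^2$ , for which the adjoint of a matrix is its conjugate-transpose; the matrices `B` and `H` and all function names are chosen for illustration only.

```python
import cmath

def inner(u, v):
    """Standard inner product on C^n: conjugate taken in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def conjugate_transpose(A):
    """Entry (j, i) of the result is the conjugate of entry (i, j) of A."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    trace, determinant = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    return ((trace + disc) / 2, (trace - disc) / 2)

# The adjoint relation <Bu, v> = <u, B* v> for a matrix that is not self-adjoint
B = [[1j, 2], [0, 1 - 1j]]
B_star = conjugate_transpose(B)
u, v = [1, 1j], [2 - 1j, 3]
assert inner(apply(B, u), v) == inner(u, apply(B_star, v))

# A self-adjoint (Hermitian) matrix: equal to its conjugate-transpose
H = [[2, 1 - 1j], [1 + 1j, 3]]
assert conjugate_transpose(H) == H
lam1, lam2 = eigenvalues_2x2(H)
assert lam1.imag == 0 and lam2.imag == 0   # the eigenvalues are real
```

Here `H` has complex entries off the diagonal, yet its characteristic polynomial $\lambda^2-5\lambda+4$ has the real roots $4$ and $1$ .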

Vector Space Isomorphism

Sometimes two different vector spaces appear to behave very similarly even if outwardly they look very different. For example, the set of polynomials of degree at most two with real coefficients behaves exactly like $\mathbb{R}^3$ when we add vectors together, etc. Indeed as vector spaces they have identical structure. To show they are isomorphic we must construct a bijective homomorphism (isomorphism) between them. For vector spaces, a homomorphism is a map that respects addition and scalar multiplication, which is exactly a linear map. So now we can say that two vector spaces $U$ and $V$ are isomorphic as vector spaces if there exists a bijective linear map $U\rightarrow V$ . For example, $\phi$ defined by $\phi(ax^2+bx+c)=(a\,b\,c)^T\in\mathbb{R}^3$ is a bijective linear map.
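The map $\phi$ can be played with directly. In the sketch below a polynomial of degree at most two is stored as a dictionary from powers to coefficients; this representation, and the helper names, are assumptions made for the illustration.

```python
def poly_add(p, q):
    """Add two polynomials of degree at most two, coefficient by coefficient."""
    return {k: p.get(k, 0) + q.get(k, 0) for k in (0, 1, 2)}

def poly_scale(lam, p):
    """Multiply a polynomial by the scalar lam."""
    return {k: lam * p.get(k, 0) for k in (0, 1, 2)}

def phi(p):
    """phi(a x^2 + b x + c) = (a, b, c)."""
    return (p.get(2, 0), p.get(1, 0), p.get(0, 0))

def phi_inverse(vec):
    """Recover the polynomial from its coefficient vector."""
    a, b, c = vec
    return {2: a, 1: b, 0: c}

def vec_add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def vec_scale(lam, v):
    return tuple(lam * x for x in v)

p = {2: 1, 1: -2, 0: 3}   # x^2 - 2x + 3
q = {1: 4, 0: -1}         # 4x - 1
lam = 5

# linearity: phi(p + lam q) = phi(p) + lam phi(q)
assert phi(poly_add(p, poly_scale(lam, q))) == vec_add(phi(p), vec_scale(lam, phi(q)))
# bijectivity: phi_inverse undoes phi
assert phi_inverse(phi(p)) == p
```

Adding polynomials and then applying $\phi$ gives the same answer as applying $\phi$ and then adding in $\mathbb{R}^3$ , which is the sense in which the two spaces have identical structure.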

Subspaces

A vector subspace is a subset $U$ of a vector space $V$ that is a vector space in its own right. That is, the axioms of a vector space are satisfied by $U$ . Now axioms 4., 5., 6. and 7. hold automatically as $U$ sits inside the vector space $V$ . Suppose

1. $\mathbf{0}\in U$
2. For all $u,\,v\in U,\,\lambda\in\mathbb{F},\, u+\lambda v\in U$

Then axiom 1. holds by letting $\lambda=1$ (closure under addition) and $u=\mathbf{0}$ (closure under scalar multiplication). Axiom 2. is just the first condition. Axiom 3. holds by letting $u=\mathbf{0}$ and $\lambda=-1$ . Conditions 1. & 2. here comprise the Subspace Test.
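As a worked illustration of the Subspace Test (the sets $U$ and $W$ here are examples chosen for this purpose), take $V=\mathbb{R}^3$ and $U=\{(x,y,0):x,y\in\mathbb{R}\}$ . Then $\mathbf{0}=(0,0,0)\in U$ , and for $u=(x_1,y_1,0)$ , $v=(x_2,y_2,0)$ and $\lambda\in\mathbb{R}$ ,

$u+\lambda v=(x_1+\lambda x_2,\,y_1+\lambda y_2,\,0)\in U$

so both conditions hold and $U$ is a subspace of $\mathbb{R}^3$ . By contrast, $W=\{(x,y,1):x,y\in\mathbb{R}\}$ fails the first condition, as $\mathbf{0}\notin W$ , so $W$ is not a subspace.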
