An essence of mathematics is to cut through the gloss of each topic and exhibit the core features of the construct.
We all encountered vectors as teenagers. We may have been told that a vector is a physical quantity with magnitude and direction. This is to distinguish it from a scalar – which has magnitude alone. It begs the question – what has direction without magnitude? Anyway we can think of those familiar arrow vectors in the plane, think of all possible magnitudes and directions, and think of them as some big, infinite set $V$. One basic thing about these vectors is that we can add them together – by the parallelogram law – and more than that, when we add them together we get another vector. So there is a way to add vectors $u$, $v$ together, and also $u + v \in V$. This means $+$ is a closed binary operation. Furthermore it doesn't matter in what order we add the vectors: $u + v = v + u$; the addition is commutative. Also we can take a vector $v$ and double it, or treble it, or indeed multiply it by any real number $\lambda$ – and again we'll get another vector (with $\lambda v \in V$). We don't care about multiplying two vectors – at the moment. There is a special zero vector, $\mathbf{0}$, such that for any vector $v$, $v + \mathbf{0} = v$. Every vector $v$ has a negative, namely $-v$, such that $v + (-v) = \mathbf{0}$. Also the addition and scalar multiplication satisfy some nice algebraic laws; those of associativity and distributivity.
Vector Spaces
So we note a set of vectors $V$, a field of scalars $\mathbb{F}$, a vector addition $+$ and a scalar multiplication $\cdot$. The following axioms are satisfied for all $u, v, w \in V$, $\lambda, \mu \in \mathbb{F}$.
1. $u + v \in V$ and $\lambda v \in V$ (closure)
2. $\exists\, \mathbf{0} \in V$ such that $v + \mathbf{0} = v$ (zero vector)
3. $\forall\, v \in V$ $\exists\, {-v} \in V$ such that $v + (-v) = \mathbf{0}$ (negative vectors)
4. $u + v = v + u$ (commutativity)
5. $1 \cdot v = v$ (multiplicative identity)
6. $u + (v + w) = (u + v) + w$ and $\lambda(\mu v) = (\lambda\mu)v$ (associativity)
7. $\lambda(u + v) = \lambda u + \lambda v$ and $(\lambda + \mu)v = \lambda v + \mu v$ (distributivity)
Now the structure can be abstracted and we say that any tuple $(V, \mathbb{F}, +, \cdot)$ which satisfies these axioms is called a vector space.
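As a quick aside, the axioms above can be spot-checked numerically. The following is a small sanity check (not a proof!) that $\mathbb{R}^3$ with the usual operations satisfies them, for randomly chosen vectors and scalars; it assumes NumPy is available.

```python
import numpy as np

# Spot-check the vector space axioms for R^3 with random vectors and scalars.
rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 3))
lam, mu = rng.standard_normal(2)

assert np.allclose(u + v, v + u)                      # commutativity
assert np.allclose(u + (v + w), (u + v) + w)          # associativity
assert np.allclose(lam * (u + v), lam * u + lam * v)  # distributivity
assert np.allclose((lam + mu) * v, lam * v + mu * v)  # distributivity
assert np.allclose(1 * v, v)                          # multiplicative identity
assert np.allclose(v + np.zeros(3), v)                # zero vector
assert np.allclose(v + (-v), np.zeros(3))             # negative vectors
```

Of course, checking finitely many random instances proves nothing; the axioms hold for $\mathbb{R}^3$ by elementary algebra.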
Bases
One of the features of the geometric vectors seen in second level education is that they can be written in the $\hat{i}$–$\hat{j}$ basis. In the context of how MA 2055 is presently taught (2009/10), $\hat{i}$ and $\hat{j}$ are said to be a basis because every vector $v$ has a unique representation as a linear combination of $\hat{i}$ and $\hat{j}$: $v = a\hat{i} + b\hat{j}$ uniquely. Again this notion of a basis may be abstracted. A set $\{e_1, e_2, \dots, e_n\}$ is said to be a basis of $V$ if every vector $v \in V$ has a unique representation as

$$v = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n.$$
It can be shown that this condition is equivalent to the basis being a linearly independent, spanning set. In MA 2055, a set $\{e_1, e_2, \dots, e_n\}$ is said to be linearly independent if the only linear combination of the $e_i$ equal to the zero vector is the trivial one; i.e. linear independence is

$$a_1 e_1 + a_2 e_2 + \cdots + a_n e_n = \mathbf{0} \Rightarrow a_1 = a_2 = \cdots = a_n = 0.$$
In turn this is equivalent to the condition that none of the vectors is a linear combination of the others – in some sense the 'directions' of the vectors are all different. A spanning set is one in which all possible linear combinations exhaust the space:

$$\operatorname{span}\{e_1, e_2, \dots, e_n\} = \{a_1 e_1 + a_2 e_2 + \cdots + a_n e_n : a_i \in \mathbb{F}\};$$

i.e. a set $\{e_1, e_2, \dots, e_n\}$ is a spanning set for $V$ if $\operatorname{span}\{e_1, e_2, \dots, e_n\} = V$. The dimension of a vector space is given by the number of vectors in a basis.
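These conditions are easy to test in coordinates. As a sketch (the three vectors below are an invented example, assuming NumPy): a set of $n$ vectors in $\mathbb{R}^n$ is linearly independent exactly when the matrix with those vectors as columns has full rank, and the unique coordinates of a vector in that basis are found by solving a linear system.

```python
import numpy as np

# Three example vectors in R^3; full rank of the column matrix
# means they are linearly independent, hence a basis.
e1 = np.array([1., 0., 0.])
e2 = np.array([1., 1., 0.])
e3 = np.array([1., 1., 1.])
A = np.column_stack([e1, e2, e3])
independent = np.linalg.matrix_rank(A) == 3

# The unique coordinates a of v in this basis solve A a = v.
v = np.array([2., 3., 4.])
coords = np.linalg.solve(A, v)
assert np.allclose(coords[0] * e1 + coords[1] * e2 + coords[2] * e3, v)
```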
Linear Maps
One of the first things to do when an abstract structure is defined is to consider functions between them. A linear map is a function between two vector spaces that preserves the operations of vector addition and scalar multiplication. In other words a linear map is any function $T: U \rightarrow V$ where $T(u + \lambda v) = T(u) + \lambda T(v)$ for any vectors $u, v$ and scalar $\lambda$. The quick calculation:

$$T(v) = T(a_1 e_1 + a_2 e_2 + \cdots + a_n e_n) = a_1 T(e_1) + a_2 T(e_2) + \cdots + a_n T(e_n)$$
shows that a linear map is defined by what it does to the basis vectors. Let $\{e_1, \dots, e_n\}$ and $\{f_1, \dots, f_m\}$ be bases for $U$ and $V$. If the linear map is defined by the equations:

$$T(e_j) = a_{1j} f_1 + a_{2j} f_2 + \cdots + a_{mj} f_m, \qquad j = 1, \dots, n,$$
then the matrix $A = (a_{ij})$, whose columns are the coordinate vectors of $T(e_1), \dots, T(e_n)$, acts on the vectors according to $T(v) = Av$. Hence, once bases are fixed, a linear map is nothing but a matrix (and indeed any matrix defines a linear map).
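The column-by-column construction can be sketched concretely. The map $T(x, y) = (x + y, x - y, 2y)$ below is an invented example; its matrix is built from the images of the standard basis vectors of $\mathbb{R}^2$, and the matrix action then agrees with the map.

```python
import numpy as np

# An example linear map T: R^2 -> R^3.
def T(v):
    x, y = v
    return np.array([x + y, x - y, 2 * y])

# Build the matrix of T: its columns are T(e_1), T(e_2).
e1, e2 = np.array([1., 0.]), np.array([0., 1.])
A = np.column_stack([T(e1), T(e2)])

# The matrix acts exactly as the map does.
v = np.array([3., 5.])
assert np.allclose(A @ v, T(v))
```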
Occasionally a linear map will be invertible. There are many equivalent definitions of invertible, the most intuitive being that $T$ is bijective. A function $T: U \rightarrow V$ is bijective if:

1. If $T(u_1) = T(u_2)$, then $u_1 = u_2$ (injective).
2. For all $v \in V$, there exists $u \in U$ such that $T(u) = v$ (surjective).
Now suppose a linear map $T$ sends two distinct vectors $u_1 \neq u_2$ to the same vector $v$; i.e. $T(u_1) = v$ and $T(u_2) = v$. Thence

$$T(u_1 - u_2) = T(u_1) - T(u_2) = v - v = \mathbf{0}.$$
That is, a nonzero vector, $u_1 - u_2$, is sent to $\mathbf{0}$. By linearity, any linear map will send $\mathbf{0}$ to $\mathbf{0}$. Hence a linear map cannot be injective – and so cannot be invertible – if some nonzero vector $u$ has $T(u) = \mathbf{0}$. Between spaces of the same finite dimension this is in fact an equivalent condition to invertibility: such a linear map is invertible if and only if $T(u) = \mathbf{0}$ implies $u = \mathbf{0}$.
A few dimensional considerations show that to be invertible a linear map must map between vector spaces of equal dimension. In particular, when bases are set, the matrix of the linear map will be a square matrix. In the following assume the dimension of the spaces is $n$.
Given an $n \times n$ matrix $A$, to calculate the inverse of $A$, $A^{-1}$, a quantity known as the determinant of $A$, $\det A$, is calculated. The subsequent analysis yields another equivalent condition for invertibility: a linear map is invertible if and only if $\det A \neq 0$.
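The determinant test is easy to see in action. A minimal sketch, with invented example matrices and assuming NumPy: a matrix with nonzero determinant has an inverse, while a matrix with a repeated 'direction' in its rows has determinant zero.

```python
import numpy as np

# A has det = 2*1 - 1*1 = 1, nonzero, so it is invertible.
A = np.array([[2., 1.],
              [1., 1.]])
det = np.linalg.det(A)
A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))

# B's second row is twice its first: det B = 0, so B is not invertible.
B = np.array([[1., 2.],
              [2., 4.]])
assert abs(np.linalg.det(B)) < 1e-12
```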
The Eigenvalue Problem
For some vectors the action of a linear map is by scalar multiplication:

$$T(v) = \lambda v.$$
In this case $v$ is called an eigenvector of the linear map with eigenvalue $\lambda$. In the first instance we restrict to nonzero vectors. Otherwise every number is an eigenvalue of every linear map because $T(\mathbf{0}) = \mathbf{0} = \lambda \mathbf{0}$ for all $\lambda$. Hence restrict to nonzero vectors and hopefully it will make sense to consider the eigenvalues of a linear map $T$. Consider

$$T(v) = \lambda v \Leftrightarrow (T - \lambda I)v = \mathbf{0}.$$
Now if $T - \lambda I$ is invertible then both sides may be hit by $(T - \lambda I)^{-1}$ to yield $v = \mathbf{0}$, which is not allowed. Hence $T - \lambda I$ must not be invertible and thus the eigenvalues of $T$ are given by the (up to $n$) solutions of the (polynomial) equation:

$$\det(A - \lambda I) = 0.$$
Indeed this gives yet another equivalent condition to being invertible: a linear map is invertible if and only if $0$ is not an eigenvalue.
A particularly nice class of linear maps are the diagonalisable linear maps. Suppose a set of eigenvectors $\{v_1, \dots, v_n\}$ with eigenvalues $\lambda_1, \dots, \lambda_n$ of a linear map $T$ forms a basis. Thence in this basis, $T$ is a diagonal matrix with the $\lambda_i$ along the diagonal.
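The eigenvalue problem and diagonalisation can be sketched numerically. The matrix below is an invented example, assuming NumPy; `np.linalg.eig` returns the roots of $\det(A - \lambda I) = 0$ together with eigenvectors, and the eigenbasis diagonalises the map.

```python
import numpy as np

# Example matrix with eigenvalues 1 and 3.
A = np.array([[2., 1.],
              [1., 2.]])
eigvals, eigvecs = np.linalg.eig(A)

# Each column of eigvecs is an eigenvector: A v = lambda v.
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)

# Diagonalisation: in the eigenbasis the map is the diagonal matrix D,
# and A = P D P^{-1} where P has the eigenvectors as columns.
P, D = eigvecs, np.diag(eigvals)
assert np.allclose(P @ D @ np.linalg.inv(P), A)
```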
Inner Product Spaces
Earlier on we mentioned that we don't care about multiplying vectors; however, there is the notion of a dot product of two vectors. For every pair of vectors $u$, $v$ on the $x$–$y$ plane, there is a scalar, $u \cdot v$, called the dot product of $u$ and $v$:

$$u \cdot v = u_1 v_1 + u_2 v_2.$$
The dot product satisfies the following two properties for all vectors $u, v, w$ and scalars $\lambda$:

1. The dot product is linear on the left: $(u + \lambda v) \cdot w = u \cdot w + \lambda (v \cdot w)$.
2. The dot product is positive definite: $v \cdot v \geq 0$, with equality if and only if $v = \mathbf{0}$.
In a similar modus operandi to before, this structure can be abstracted to a more general map $\langle \cdot, \cdot \rangle$ from pairs of vectors to $\mathbb{C}$. Any map that satisfies 1. and 2. for all vectors $u, v, w$ and scalar $\lambda$, with the additional property that it is conjugate linear on the right ($\langle u, \lambda v \rangle = \bar{\lambda} \langle u, v \rangle$), is an inner product on $V$, and $V$ is an inner product space.
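A short numerical illustration of these axioms, assuming NumPy: the standard inner product on $\mathbb{C}^2$, $\langle u, v \rangle = \sum_i u_i \bar{v}_i$, is linear on the left, conjugate linear on the right, and positive definite.

```python
import numpy as np

# The standard inner product on C^2.
def ip(u, v):
    return np.sum(u * np.conj(v))

rng = np.random.default_rng(1)
u, v, w = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
lam = 2.0 + 1.0j

assert np.allclose(ip(u + lam * v, w), ip(u, w) + lam * ip(v, w))  # linear on the left
assert np.allclose(ip(u, lam * v), np.conj(lam) * ip(u, v))        # conjugate linear on the right
assert ip(v, v).real > 0 and abs(ip(v, v).imag) < 1e-12            # positive definite
```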
With respect to a given inner product, to each linear map $T$ there exists its adjoint, $T^*$, given by:

$$\langle T(u), v \rangle = \langle u, T^*(v) \rangle \quad \text{for all } u, v.$$
A linear map is called self-adjoint if $T = T^*$.
Proposition 1: The eigenvalues of a self-adjoint linear map are real.
Proof: Let $v$ be an eigenvector with eigenvalue $\lambda$ of the linear map $T$. Now

$$\langle T(v), v \rangle = \langle \lambda v, v \rangle = \lambda \langle v, v \rangle.$$

Similarly,

$$\langle v, T(v) \rangle = \langle v, \lambda v \rangle = \bar{\lambda} \langle v, v \rangle.$$

But $T$ is self-adjoint, thence $\langle T(v), v \rangle = \langle v, T(v) \rangle$, and because eigenvectors are nonzero and by positive definiteness $\langle v, v \rangle > 0$, it follows that $\lambda = \bar{\lambda}$; that is, $\lambda$ is real $\bullet$
The spectral theorem for self-adjoint maps states that a self-adjoint map has an orthonormal basis of eigenvectors. When an orthonormal basis is fixed, it may be seen that the matrix of a self-adjoint linear map is one that is equal to its conjugate-transpose: $A = \bar{A}^T =: A^*$. These two facts combine to give a basis-dependent proof of Proposition 1.
Proof (basis-dependent): By the spectral theorem, $T$ has an orthonormal eigenbasis, say $\{v_1, \dots, v_n\}$. Let $A$ be the matrix representation of $T$ in this basis. $A$ is thus the diagonal matrix with $\lambda_1, \dots, \lambda_n$ along the diagonal. Now $A = A^* = \bar{A}^T$. As $A$ is diagonal it is equal to its transpose, so $A = \bar{A}$. Comparing diagonal entries shows $\lambda_i = \bar{\lambda}_i$; that is, the $\lambda_i$ are real $\bullet$
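Proposition 1 can be spot-checked numerically. The matrix below is an invented example, assuming NumPy: it equals its conjugate-transpose, and its eigenvalues come out real (up to floating point noise).

```python
import numpy as np

# An example self-adjoint (Hermitian) matrix: equal to its conjugate-transpose.
A = np.array([[2., 1. - 1.j],
              [1. + 1.j, 3.]])
assert np.allclose(A, A.conj().T)

# Its eigenvalues are real, as Proposition 1 predicts.
eigvals = np.linalg.eigvals(A)
assert np.allclose(eigvals.imag, 0.0)
```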
Vector Space Isomorphism
Sometimes two different vector spaces appear to behave very similarly even if outwardly they look very different. For example, the set $P_2$ of polynomials of degree at most two with real coefficients behaves exactly like $\mathbb{R}^3$ when we add vectors together, multiply by scalars, etc. Indeed as vector spaces they have identical structure. To show they are isomorphic we must construct a bijective map between them that respects the vector space operations; a bijective linear map does exactly this. So now we can say that two vector spaces $U$ and $V$ are isomorphic as vector spaces if there exists a bijective linear map from $U$ to $V$. For example, $T: P_2 \rightarrow \mathbb{R}^3$ defined by $T(a + bx + cx^2) = (a, b, c)$ is a bijective linear map.
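The isomorphism above can be sketched in code, assuming NumPy: represent $a + bx + cx^2$ by its coefficient vector $(a, b, c)$, and observe that adding and scaling polynomials matches adding and scaling the vectors.

```python
import numpy as np

# T(a + b x + c x^2) = (a, b, c): polynomials as coefficient vectors.
def to_vec(coeffs):
    return np.array(coeffs, dtype=float)

p = to_vec((1, 2, 3))    # 1 + 2x + 3x^2
q = to_vec((0, -1, 4))   # -x + 4x^2

# (1 + 2x + 3x^2) + (-x + 4x^2) = 1 + x + 7x^2, and T respects this sum.
assert np.allclose(p + q, to_vec((1, 1, 7)))
# 2 * (1 + 2x + 3x^2) = 2 + 4x + 6x^2, and T respects scalar multiples.
assert np.allclose(2 * p, to_vec((2, 4, 6)))
```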
Subspaces
A vector subspace is a subset $S$ of a vector space $V$ that is a vector space in its own right. That is, the axioms of a vector space are satisfied by $S$. Now axioms 4., 5., 6. and 7. hold automatically as $S$ sits inside the vector space $V$. Suppose

1. $\mathbf{0} \in S$.
2. For all $u, v \in S$ and $\lambda \in \mathbb{F}$, $u + \lambda v \in S$.

Then axiom 1. holds by letting $u = \mathbf{0}$ (closure under scalar multiplication) and $\lambda = 1$ (closure under addition). Axiom 2. holds directly by condition 1. Axiom 3. holds by letting $\lambda = -1$ and $u = \mathbf{0}$. Conditions 1. & 2. here comprise the Subspace Test.
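As a hypothetical illustration of the Subspace Test, assuming NumPy: the plane $S = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 0\}$ contains the zero vector and is closed under $u + \lambda v$, so it is a subspace, whereas a vector off the plane fails the defining condition.

```python
import numpy as np

# Membership in S = {v in R^3 : x + y + z = 0}, up to floating point tolerance.
def in_S(v):
    return abs(v.sum()) < 1e-12

# Condition 1: the zero vector lies in S.
assert in_S(np.zeros(3))

# Condition 2 (spot check): u + lam * v stays in S for u, v in S.
u, v = np.array([1., -1., 0.]), np.array([2., 0., -2.])
lam = -3.5
assert in_S(u) and in_S(v) and in_S(u + lam * v)

# A vector not on the plane is, of course, not in S.
assert not in_S(np.array([1., 0., 0.]))
```

Again this is a spot check of condition 2, not a proof; the full proof is one line of algebra on the coordinates.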