Birmingham B4 7ET
www.ncrg.aston.ac.uk/~vanmourj/Welcome.html
Contacts:
Office: MB317C Main Building
Hours: On appointment (by email)
Telephone: (0121) 204 3657 (internal 3657)
e-mail: j.p.neirotti@aston.ac.uk
AM10VA: Vector Algebra and Geometry

Contents
1.1 Scalars ............................................... 6
1.2 Vectors ............................................... 6
3.3 Basis ................................................. 10
5 Linear Objects ......................................... 16
5.1 Intersections ........................................ 18
5.1.1 2 lines in 3D ..................................... 18
5.1.3 2 planes in 4D .................................... 20
6.1 Using the vector product for the normal equation of a plane ... 25
7 Geometry ............................................... 30
8.3.1 Scaling ........................................... 47
8.3.2 Reflection ........................................ 47
8.3.3 Projection ........................................ 47
8.3.4 Rotation .......................................... 48
8.4 Determinants ......................................... 49
8.7.2 Eigenspaces ....................................... 59
8.8 Diagonalization ...................................... 61
9 Vector analysis ........................................ 64
9.3 Surfaces ............................................. 65
1.1 Scalars
A scalar is a quantity that has only a magnitude and is not related to any definite direction in space. A scalar is completely specified by a single number, e.g. 1, π, −7.243623 are scalars.
Notation: In this course, we will always denote a scalar by a lower case symbol, e.g. a, b, x, r, λ, ...
1.2 Vectors
A vector is an object that has both magnitude (length) and a definite direction in space.
It can most easily be pictured as an arrow.
Notation: In this course, we will always denote a vector with an arrow on top, e.g. ~a, ~b, ~r.
Note that some textbooks use other notations, such as boldface r or underscore r.
We denote the length of a vector ~a by |~a|, which is always non-negative, i.e. |~a| ≥ 0.
Note that the length of a vector is sometimes also called the norm of that vector.
• A vector of length zero is called the zero vector and is denoted by ~0.
This is the only vector that does not have a definite direction in space.
We note that two vectors ~a and ~b are equal if and only if they have the same direction
and the same length.
2. Then we choose a special point in that space from which we start measuring, which we call
the origin, and which we denote by O.
[Figure: the origin O and two points Q and R, with position vectors OQ = ~q and OR = ~r; the vector from R to Q is RQ = ~q − ~r.]
We already know several operations on scalars, such as addition (a + b), multiplication (ab),
division (a/b), absolute value (|a|) etc. Furthermore, we know several properties of these
operations, e.g.:
• a + b = b + a, a + (b + c) = (a + b) + c
• a(b + c) = ab + ac
• −a = (−1)a
We note that the result of these operations between scalars (if defined) is always again a scalar.
Now we are going to introduce operations between scalars and vectors. As we will see, the result of these operations can be either a scalar or a vector, so it is very important that you know what to expect.
1. We have already encountered one operation: taking the length of a vector. The length of
a vector ~a is denoted by |~a|, and the result is a scalar.
Properties:
(a) |~a| ≥ 0
2. We can add vectors together, and the result is a vector, e.g. ~r = ~a + ~b. Graphically
we do this by putting the starting point of ~b in the endpoint of ~a. The vector ~r is then the
vector that connects the starting point of ~a with the endpoint of ~b.
Obviously, we can put an arbitrary number of vectors one behind the other and connect
the starting point of the first with the endpoint of the last, and the order doesn’t matter.
Hence we have the following properties:
3. We can multiply a scalar λ with a vector ~a, giving the result ~r = λ~a, which is a vector.
~r has the same direction as ~a but a different length. If |λ| > 1 then ~r is longer than ~a, if
0 < |λ| < 1 then ~r is shorter than ~a, and if λ < 0, ~r points in the opposite direction from ~a.
We have the following properties:
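As a quick numerical illustration (a sketch that is not part of the notes, assuming the NumPy library is available), the properties of vector addition and scalar multiplication can be checked by representing vectors as arrays:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 1.0])
lam = -2.0

# Addition is commutative: a + b = b + a.
assert np.allclose(a + b, b + a)

# Scaling by lambda changes the length by a factor |lambda| ...
r = lam * a
assert np.isclose(np.linalg.norm(r), abs(lam) * np.linalg.norm(a))

# ... and a negative lambda reverses the direction (cosine of the angle is -1).
cos_theta = (r @ a) / (np.linalg.norm(r) * np.linalg.norm(a))
assert np.isclose(cos_theta, -1.0)
```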
Given a set of N vectors {~a1, ~a2, .., ~aN} and N scalars {c1, .., cN}, the vector ~r = c1~a1 + c2~a2 + .. + cN~aN is called a linear combination of those vectors.
The set of N vectors {~a1, ~a2, .., ~aN} is called linearly independent if the equations for {c1, .., cN}: c1~a1 + c2~a2 + .. + cN~aN = ~0 only have the trivial solution c1 = c2 = .. = cN = 0.
The set of N vectors {~a1, ~a2, .., ~aN} is said to span a certain space if any vector ~r in that space can be written as a linear combination of those vectors: ~r = c1~a1 + .. + cN~aN.
It could be that there are many possible values for the {c1, .., cN}.
If we deal with a space of dimension N, we need at least N vectors to span the space.
3.3 Basis
1. If we have a set of exactly N linearly independent vectors, this set automatically spans the
space.
2. If we have a set of exactly N vectors that spans the space, then this set is automatically
linearly independent.
3. In that case the set of N vectors (which is linearly independent and spans the space) is called
a basis for this space S.
4. Any vector in the space S can be written as a unique linear combination of the basis
vectors.
6. For an orthonormal basis all the basis vectors have unit length, and each basis vector is
orthogonal (⊥) to all the others.
Suppose that we have a space S of dimension N, and that we have decided on:
1. an origin O (with position vector ~0),
2. a basis {~e1, .., ~eN},
then any point R in the space can uniquely be represented by the vector:
~r = OR = ~0 + r1~e1 + r2~e2 + .. + rN~eN = r1~e1 + r2~e2 + .. + rN~eN.
The set of scalars {r1 , r2 , .., rN } are called the coordinates of ~r in that basis.
We now use the following notation for the vector ~r:
~r = (r1, · · · , rN)T
which is called the coordinate notation of ~r, and which only has meaning if the basis is defined!
Since the coordinates are unique for any given vector, this means that two vectors ~a and ~b are
equal if and only if each coordinate of ~a is equal to the corresponding coordinate of ~b. Hence in
N dimensions a vector equation corresponds to N simultaneous equations for the coordinates:
~a = ~b ⇔ (a1, .., aN)T = (b1, .., bN)T ⇔ a1 = b1, .., aN = bN
In principle any set of vectors that is spanning and linearly independent could be used as a basis. However, if everybody used a different basis, it would be very hard for people to exchange (position) vectors by their coordinates. Therefore, there are standard bases that everybody uses (unless explicitly stated otherwise). These standard bases are orthonormal and so-called right handed.
In 3D, the three (unit) basis vectors are {î, ĵ, k̂}, and the components of a vector ~r = (r1, r2, r3)T = r1î + r2ĵ + r3k̂ are often called the x-, y- and z-components of ~r.
We have that:
3. if you put the basis vectors in order î, ĵ, k̂, î, ĵ, k̂, .., and if your right index finger points in the direction of any basis vector and your second finger points in the direction of the next basis vector, then your extended thumb points in the direction of the one after that. This is known as the right hand rule.
The standard basis vectors are given by î = (1, 0, 0)T , ĵ = (0, 1, 0)T and k̂ = (0, 0, 1)T in
coordinate notation.
If you want to plot the 3D basis in two dimensions, you use the following convention:
- k̂ points in the positive vertical direction (z-axis),
- ĵ points towards the right (y-axis)
- î points towards you (out of the paper), but you draw this in perspective by pointing it at 45
degrees to the left bottom (x-axis).
[Figure: the 3D basis drawn in the plane (k̂ up, ĵ to the right, î towards the viewer at 45 degrees to the lower left), and the 2D basis (ĵ up, î to the right).]
In 2D, the standard basis vectors are denoted by {î, ĵ}, and we use the following convention:
- î points towards the right (x-axis)
- ĵ points up (y-axis).
The standard basis vectors are given by î = (1, 0)T and ĵ = (0, 1)T in coordinate notation.
Linear independence:
The set {(1, 2, 3)T, (1, 0, 0)T, (0, 1, 0)T} is linearly independent:
r (1, 2, 3)T + s (1, 0, 0)T + t (0, 1, 0)T = (0, 0, 0)T → r + s = 0, 2r + t = 0, 3r = 0 → r = s = t = 0
The set {(1, 2, 3)T, (1, 0, 0)T, (−1, 4, 6)T} is linearly dependent:
r (1, 2, 3)T + s (1, 0, 0)T + t (−1, 4, 6)T = (0, 0, 0)T → r + s − t = 0, 2r + 4t = 0, 3r + 6t = 0 → r = −2t, s = 3t
so there are non-trivial solutions, e.g. t = 1, r = −2, s = 3.
Spanning vectors:
The set {(1, 2, 3)T, (1, 0, 0)T, (0, 1, 0)T} spans the space:
r (1, 2, 3)T + s (1, 0, 0)T + t (0, 1, 0)T = (x, y, z)T → r + s = x, 2r + t = y, 3r = z → r = z/3, s = x − z/3, t = y − 2z/3
Whatever the values of x, y, z, we find a solution.
The set {(1, 2, 3)T, (1, 0, 0)T, (0, 4, 6)T} does not span the space:
r (1, 2, 3)T + s (1, 0, 0)T + t (0, 4, 6)T = (x, y, z)T → r + s = x, 2r + 4t = y, 3r + 6t = z
The last two equations force 3y = 2z (since 3(2r + 4t) = 2(3r + 6t)), so a vector ~r = (x, y, z)T for which 3y ≠ 2z cannot be written as a linear combination of the set.
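The worked examples above can be reproduced with a rank computation: a set of vectors is linearly independent exactly when the matrix with those vectors as columns has full column rank. A sketch (not part of the notes, assuming NumPy):

```python
import numpy as np

independent = np.column_stack([(1, 2, 3), (1, 0, 0), (0, 1, 0)])
dependent = np.column_stack([(1, 2, 3), (1, 0, 0), (-1, 4, 6)])
no_span = np.column_stack([(1, 2, 3), (1, 0, 0), (0, 4, 6)])

assert np.linalg.matrix_rank(independent) == 3  # independent, so it spans 3D
assert np.linalg.matrix_rank(dependent) == 2    # (-1,4,6) = 2(1,2,3) - 3(1,0,0)
assert np.linalg.matrix_rank(no_span) == 2      # only spans a plane

# A vector with 3y != 2z, e.g. (0, 1, 0), lies outside that plane:
extended = np.column_stack([(1, 2, 3), (1, 0, 0), (0, 4, 6), (0, 1, 0)])
assert np.linalg.matrix_rank(extended) == 3
```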
Show that any vector has unique coordinates in a basis {~e1 , .., ~eN }.
Since the basis vectors span the space, any vector ~r can be written as ~r = r1~e1 +r2~e2 +..+rN ~eN .
Suppose now that ~r = r1~e1 + r2~e2 + .. + rN ~eN and that also ~r = x1~e1 + x2~e2 + .. + xN ~eN .
Then ~r − ~r = (r1 − x1 )~e1 + (r2 − x2 )~e2 + .. + (rN − xN )~eN = ~0.
Since the basis vectors are linearly independent this means that the only solution is that all the
coordinates are 0, hence r1 = x1 , .., rN = xN , in other words, the coordinates of ~r are unique.
Consider two vectors ~a and ~b. The scalar or dot product of these two vectors is defined as ~a · ~b = |~a||~b| cos(θ), where θ is the angle between them.
3. ~a · ~b = ~b · ~a (commutative law)
5. If ~a ⊥ ~b (orthogonal), then ~a · ~b = 0.
We can see how the dot product works in an orthonormal basis on the coordinates of the
vectors by applying the definition and properties to the basis vectors.
Given an orthonormal basis {ê1, .., êN}, we can write ~a = (a1, a2, .., aN)T, ~b = (b1, b2, .., bN)T and the dot product is then given by
~a · ~b = a1b1 |ê1|² + a2b2 |ê2|² + .. + aNbN |êN|² + a1b2 ê1 · ê2 + .. = a1b1 + .. + aNbN = Σ_{i=1}^{N} ai bi
We can now use this to calculate (the cosine of) the angle between the vectors
cos(θ) = (~a · ~b) / (|~a| |~b|) = (Σ_{i=1}^{N} ai bi) / (|~a| |~b|)
[Figure: decomposition of ~a into a component ~a∥ along ~b and a component ~a⊥ orthogonal to ~b, with θ the angle between ~a and ~b.]
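The cosine formula above translates directly into code; this small helper (the function name is ours, not from the notes; assumes NumPy) returns the angle between two vectors:

```python
import numpy as np

def angle(a, b):
    """Angle between vectors a and b, via cos(theta) = a.b / (|a||b|)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    c = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # clip guards against tiny rounding errors pushing c outside [-1, 1]
    return np.arccos(np.clip(c, -1.0, 1.0))

assert np.isclose(angle((1, 0, 0), (0, 1, 0)), np.pi / 2)  # orthogonal vectors
assert np.isclose(angle((1, 1, 0), (1, 0, 0)), np.pi / 4)
```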
Let î, ĵ, k̂ be the standard orthonormal basis vectors. If ~r = (x, y, z)T is a non-zero vector,
then the angles α, β, γ between ~r and the vectors î, ĵ, k̂ respectively are called the direction
angles of ~r, and the numbers cos α, cos β, cos γ are called the direction cosines of ~r.
Using the dot product it is easy to see that the direction cosines are nothing else than the coordinates of the unit vector r̂ in the direction of ~r, e.g.
cos(α) = (~r · î) / (|~r| |î|) = (~r / |~r|) · î = r̂ · î
[Figure: the direction angles α, β, γ between ~r and the basis vectors î, ĵ, k̂.]
5 Linear Objects
First, we must decide on the dimension of the space we work in, and we’ll call the dimension
N. This determines the number of basis vectors, and hence the number of components of any
vector in it.
A linear object S in that space is an object that is straight or flat. It is called a linear object
because any point in it can be obtained by a linear combination of vectors. In general a linear
object is fully determined by a point ~a in it, and a set of M ≤ N linearly independent vectors
{~u1 , .., ~uM } parallel to it.
A linear object also has a well defined dimension. This dimension is equal to the maximum
number of linearly independent vectors parallel to it, i.e. M.
Examples are:
2. A line has dimension M = 1, and ~r = ~a + t~u, where ~a is any point on the line and ~u is a
vector parallel to the line.
3. A plane has dimension M = 2, and ~r = ~a + s~u + t~v , where ~a is any point in the plane and
~u, ~v are two linearly independent vectors parallel to the plane.
• Note that S forms an M-dimensional space itself. ~a can be considered as its origin and
{~u1, .., ~uM} as its basis vectors.
1. (~a, {~u1 , .., ~uM }), where ~a is the position vector of a point in the object, and {~u1 , .., ~uM } is a
set of M linearly independent vectors parallel to the object.
The position vector ~r = (x1 , .., xN )T of any point in the object can then be written as:
~r = ~a + t1~u1 + .. + tM ~uM
2. (~a, {~n1, .., ~nN−M}), where ~a is the position vector of a point in the object, and {~n1, .., ~nN−M}
is a set of N − M linearly independent vectors orthogonal to the object.
The position vector ~r = (x1 , .., xN )T of any point in the object then satisfies the following
(set of) equation(s):
~ni · ~r = ~ni · ~a ∀i = 1, .., N − M
These are known as the normal ( or scalar) equation(s) of the object.
Since ~r and ~a both lie in the object, this means that the vector going from ~a to ~r, i.e.
~r − ~a is parallel to the object. So it must be orthogonal to all of the normal vectors. Hence
(~r − ~a) · ~ni = 0 ∀i = 1, .., N − M.
• We can go from the vector equation of an object to the normal equations of the object by
elimination of the parameters {t1 , .., tM }.
Example: Consider the plane (2-D) in 4-D space given by ~a = (1, 1, 0, 0)T, {~u1 = (1, 0, 1, 0)T, ~u2 = (0, 1, 1, 1)T}. Then, writing ~r = (w, x, y, z)T, we have
~r = ~a + t1~u1 + t2~u2 → w = 1 + t1, x = 1 + t2, y = t1 + t2, z = t2
→ t1 = w − 1, t2 = x − 1 → y = w + x − 2, z = x − 1 → w + x − y = 2, x − z = 1
indeed 2 equations. We can then also immediately read off two orthogonal vectors to the plane
from the equations, namely ~n1 = (1, 1, −1, 0)T, ~n2 = (0, 1, 0, −1)T, and we can check that
~n1 · ~a = 2 and ~n2 · ~a = 1.
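This example can be verified numerically: each normal vector must be orthogonal to both direction vectors, and every point of the plane must satisfy the two normal equations. A sketch (assuming NumPy):

```python
import numpy as np

a = np.array([1, 1, 0, 0])
u1 = np.array([1, 0, 1, 0])
u2 = np.array([0, 1, 1, 1])
n1 = np.array([1, 1, -1, 0])
n2 = np.array([0, 1, 0, -1])

# The normals are orthogonal to both direction vectors ...
for n in (n1, n2):
    assert n @ u1 == 0 and n @ u2 == 0

# ... and every point a + t1*u1 + t2*u2 satisfies the normal equations:
for t1, t2 in [(0, 0), (2, -1), (-3, 5)]:
    r = a + t1 * u1 + t2 * u2
    assert n1 @ r == n1 @ a == 2   # w + x - y = 2
    assert n2 @ r == n2 @ a == 1   # x - z = 1
```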
5.1 Intersections
• The intersection of two objects S1 and S2 is defined as all the points that lie both in S1
and S2 . This means that the points must satisfy all the equations of both objects.
• If S1 and S2 are both linear objects then the intersection, if it exists, is also a linear object.
Its dimension is less than or equal to the smaller of the dimensions of S1 and S2, and the
intersection can even be empty if S1 and S2 have no points in common.
• The only sure way of determining the dimension of the intersection is by working out how
many linearly independent equations the intersection has, especially in higher dimensional
spaces where your intuition may easily lead you astray!
• Since each hyperplane corresponds to one normal equation and vice versa, any M-D linear
object S in an N-D space (with M < N) may be considered as the intersection of N − M
linearly independent hyperplanes, each contributing one of the N − M linearly independent
normal equations of S.
Examples:
5.1.1 2 lines in 3D
~r = ~a1 + t1~u1 .
~r = ~a2 + t2~u2 .
Setting ~a1 + t1~u1 = ~a2 + t2~u2 gives, componentwise:
1 = 1 + t2 → t2 = 0
0 = 2 + t2 → 0 = 2
1 + t1 = 3 → t1 = 2
The second equation is a contradiction (0 = 2), so the system has no solution: the lines do not intersect.
1 + t1 = t2 → t2 = 1 + t1
1 + 2t1 = −1 + 2t2 → 1 + 2t1 = 1 + 2t1
1 + 3t1 = −2 + 3t2 → 1 + 3t1 = 1 + 3t1
The last two equations hold for every t1, so there are infinitely many solutions: the two lines coincide.
~r = ~a1 + t1~u1.
1.
~r = ~a1 + t1~u1 = ~a2 + s2~u2 + t2~v2
We have to solve this for t1 , s2 and t2 .
2.
(~a1 + t1~u1 ) · ~n2 = ~a2 · ~n2
We have to solve this for t1 .
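Method 2 is easy to implement: substituting the line into the normal equation of the plane gives one linear equation for t1. A sketch (the function name is ours, not from the notes; assumes NumPy):

```python
import numpy as np

def line_plane_intersection(a1, u1, a2, n2):
    """Intersect the line r = a1 + t*u1 with the plane n2.r = n2.a2.
    Returns the intersection point, or None if the line is parallel."""
    a1, u1, a2, n2 = (np.asarray(v, float) for v in (a1, u1, a2, n2))
    denom = n2 @ u1
    if np.isclose(denom, 0.0):
        return None                       # line parallel to the plane
    t = (n2 @ (a2 - a1)) / denom          # solves (a1 + t*u1).n2 = a2.n2
    return a1 + t * u1

# A vertical line through (0, 0, 5) meets the plane z = 0 at the origin:
p = line_plane_intersection((0, 0, 5), (0, 0, 1), (0, 0, 0), (0, 0, 1))
assert np.allclose(p, (0, 0, 0))
```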
5.1.3 2 planes in 4D
Plane S1 is given by a point ~a and two normal vectors ~n1, ~n2 (its normal equations); plane S2 is given by a point ~b and two direction vectors ~u1, ~u2 (its vector equation). Hence we have the (vector) equations:
~r · ~n1 = ~a · ~n1
~r · ~n2 = ~a · ~n2
~r = ~b + t1~u1 + t2~u2
Substituting the vector equation of S2 into the normal equations of S1,
(~b + t1~u1 + t2~u2) · ~n1 = ~a · ~n1
(~b + t1~u1 + t2~u2) · ~n2 = ~a · ~n2
gives two linear equations for t1 and t2. In this example the solution is t1 = −1, t2 = 0, so the two planes intersect in the single point ~r = ~b − ~u1.
Definitions:
1. The (orthogonal) projection of a point ~r on an object S is the point ~p~r in the object that
is closest to ~r.
2. The reflection of a point ~r with respect to an object S is the point ~R~r exactly opposite ~r
with respect to ~p~r. This means that it lies on the line through ~r and ~p~r, but on the other
side of ~p~r, therefore ~R~r = 2~p~r − ~r. The easiest way to obtain it is first to determine ~p~r, and
then to use the vector equation above.
First we note the property that the vector ~r − ~p~r is always orthogonal to S at the point ~p~r.
This means that ~p~r lies in S⊥: the linear object ⊥ S containing ~r.
[Figure: the projection ~p~r of a point ~r onto a linear object S, shown both for a plane and for a line; the vector ~r − ~p~r is orthogonal to S at ~p~r.]
Examples:
Projection onto a hyperplane through ~a with normal ~n: write ~p~r = ~r + t~n; demanding ~p~r · ~n = ~a · ~n gives
t = (~a − ~r) · ~n / |~n|²
Projection onto a line through ~a with direction ~u: write ~p~r = ~a + t~u. It must also lie in the hyperplane through ~r and orthogonal to ~u: ~r · ~u = ~p~r · ~u = (~a + t~u) · ~u = ~a · ~u + t ~u · ~u. Hence
t = (~r − ~a) · ~u / |~u|²
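The two formulas for t translate into the following sketch (the function names are ours, not from the notes; assumes NumPy):

```python
import numpy as np

def project_onto_hyperplane(r, a, n):
    """p_r = r + t*n with t = (a - r).n / |n|^2."""
    r, a, n = (np.asarray(v, float) for v in (r, a, n))
    t = ((a - r) @ n) / (n @ n)
    return r + t * n

def project_onto_line(r, a, u):
    """p_r = a + t*u with t = (r - a).u / |u|^2."""
    r, a, u = (np.asarray(v, float) for v in (r, a, u))
    t = ((r - a) @ u) / (u @ u)
    return a + t * u

p = project_onto_hyperplane((1, 2, 3), (0, 0, 0), (0, 0, 1))  # plane z = 0
assert np.allclose(p, (1, 2, 0))
q = project_onto_line((1, 2, 3), (0, 0, 0), (1, 0, 0))        # the x-axis
assert np.allclose(q, (1, 0, 0))
```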
Consider the plane S through ~a = (0, 0, 0, 0)T and parallel to ~u1 = (1, 1, 1, 1)T, ~u2 = (1, 0, 1, 0)T, and the point ~r = (1, 2, 3, 4)T.
Definition: The projection of a line onto an object is the collection of points formed by the
projection of all the individual points of the line onto the object.
The result of the projection of a line onto a linear object will be a line lying in that ob-
ject (except when the line and object are orthogonal, then the projection will just be a point).
[Figure: a line L and its projection L′ onto a plane.]
Example:
Consider the line through ~a = (1, 1, 1, 1)T and parallel to ~u = (1, 0, 1, 0)T in 4D, and the hyperplane through ~b = (1, 2, 3, 4)T and ⊥ ~n = (1, 1, 1, 0)T.
The projection of any point ~r = (w, x, y, z)T onto the object is obtained by (see 5.2)
~p~r = ~r + s~n with s = (~b − ~r) · ~n / |~n|²
so s = (6 − w − x − y)/3, and
~p~r = (2 + (2w − x − y)/3, 2 + (−w + 2x − y)/3, 2 + (−w − x + 2y)/3, z)T
We only want the projection of points that lie on the line, so
~r = ~a + t~u = (1 + t, 1, 1 + t, 1)T → w = 1 + t, x = 1, y = 1 + t, z = 1
Substituting this gives s = (3 − 2t)/3 and
~p~r(t) = (2 + t/3, 2 − 2t/3, 2 + t/3, 1)T
which is indeed a line in the hyperplane (you can check that ~p~r(t) · ~n = ~b · ~n = 6, ∀t, i.e. ~p~r(t)
satisfies the equation of the hyperplane).
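A numerical check of this example (a sketch, not from the notes; assumes NumPy): every projected point satisfies the hyperplane equation, and equal parameter steps give equal steps in the image, i.e. the image is a line:

```python
import numpy as np

a = np.array([1, 1, 1, 1], float)   # point on the line
u = np.array([1, 0, 1, 0], float)   # direction of the line
b = np.array([1, 2, 3, 4], float)   # point in the hyperplane
n = np.array([1, 1, 1, 0], float)   # normal of the hyperplane

def proj(r):
    s = ((b - r) @ n) / (n @ n)
    return r + s * n

for t in (-2.0, 0.0, 1.0, 5.0):
    p = proj(a + t * u)
    assert np.isclose(p @ n, b @ n)  # p satisfies n.r = 6, so it lies in the hyperplane

# Equal parameter steps give equal steps in the image: the image is a line.
p0, p1, p2 = proj(a), proj(a + u), proj(a + 2 * u)
assert np.allclose(p2 - p1, p1 - p0)
```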
8. ~v × ~v = ~0
We first see how the cross product works on the standard orthonormal basis vectors:
1. î × î = ĵ × ĵ = k̂ × k̂ = ~0
Now we can see how the vector product works on the coordinates of vectors by applying the
definition and properties:
In this basis, we write ~a = (a1 , a2 , a3 )T , ~b = (b1 , b2 , b3 )T and the cross product is then given by
~a × ~b = a1b1 (î × î) + a1b2 (î × ĵ) + .. + a3b3 (k̂ × k̂) = (a2b3 − a3b2) î + (a3b1 − a1b3) ĵ + (a1b2 − a2b1) k̂
We can also use this to calculate (the sine of) the angle between the vectors
|~a × ~b|
sin(θ) =
|~a| |~b|
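Both identities, orthogonality of ~a × ~b to its factors and |~a × ~b| = |~a||~b| sin θ, can be checked numerically (a sketch, not from the notes; assumes NumPy):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.0, 2.0])
c = np.cross(a, b)

# The cross product is orthogonal to both factors ...
assert np.isclose(c @ a, 0) and np.isclose(c @ b, 0)

# ... and its length is |a||b| sin(theta):
cos_t = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
sin_t = np.sqrt(1 - cos_t**2)
assert np.isclose(np.linalg.norm(c), np.linalg.norm(a) * np.linalg.norm(b) * sin_t)
```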
6.1 Using the vector product for the normal equation of a plane
In the figure below, the vector product of ~v and ~u is the vector ~c (pointing out of the paper
towards you) of magnitude |~v||~u| sin θ, i.e. the area of the parallelogram formed by the two
vectors. This is because the “height” of the parallelogram (indicated by the dashed line) is
|~v| sin θ.
[Figure: the parallelogram formed by ~v and ~u, with the angle θ between them and the dashed height |~v| sin θ.]
The symbol ⊙ represents the tip of an arrow that is coming towards you, while the flight (rear)
of the arrow, the symbol ⊗, represents a vector directed into the page, away from you.
If ~a, ~b, ~c are three arbitrary vectors, then the scalar product of ~a × ~b with ~c is called the scalar
triple product of ~a, ~b, ~c, and is given by (~a × ~b) · ~c.
We have that ~a × ~b = (|~a||~b| sin θ)~p, where p~ is a unit vector perpendicular to both ~a and ~b. So,
(~a × ~b) · ~c = (|~a||~b| sin θ) ~p · ~c = (|~a||~b| sin θ) |~c| cos φ = area of base × orthogonal height
[Figure: the parallelepiped spanned by ~a, ~b, ~c, with θ the angle between ~a and ~b, and φ the angle between ~c and ~a × ~b.]
The absolute value (because cos(φ) could be negative) of the scalar triple product is hence the
volume of the parallelepiped described by the three vectors, or
Volume = |(~a × ~b) · ~c|
Note that if (~a × ~b) · ~c = 0, the volume of the parallelepiped would be zero. This would mean
that the three vectors must lie in the same plane, thus the scalar triple product also gives us a
test to check whether three 3D vectors are co-planar.
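The volume interpretation and the co-planarity test can be sketched as follows (not from the notes; assumes NumPy):

```python
import numpy as np

def triple(a, b, c):
    """Scalar triple product (a x b) . c."""
    return np.cross(a, b) @ np.asarray(c, float)

# Volume of the unit cube spanned by i, j, k:
assert np.isclose(triple((1, 0, 0), (0, 1, 0), (0, 0, 1)), 1.0)

# Three co-planar vectors (the third is the sum of the first two) give 0:
assert np.isclose(triple((1, 2, 0), (3, 1, 0), (4, 3, 0)), 0.0)
```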
In order to calculate the vector product it is not very practical to memorise the formula by
heart, or to deduce it every time from the basis vectors. Therefore we now introduce the notion
of a determinant, which is very important and will come back when we deal in more detail with
matrices.
• The diagonal elements of A are all elements for which i = j, i.e. a11, a22, .., aii, .., ann.
All the other elements, for which i ≠ j, are called off-diagonal elements.
• Each position is either even (+) or odd (-), depending whether the sum i + j of the row
index i and the column index j is even or odd. This only depends on the position of an
element and not on its sign or value!
+ − + · · ·
− + − · · ·
+ − + · · ·
· · · · · · · · ·
For all diagonal elements i + j = i + i = 2i, and hence they are in even positions.
• We can also consider the reduced matrix A′ij of (n − 1) × (n − 1) numbers that we obtain
by taking away the ith row and the jth column of A. E.g.
       | a21 a23 . . . a2n |
A′12 = | a31 a33 . . . a3n |
       |       · · ·       |
       | an1 an3 . . . ann |
A determinant is a number that is defined for any square matrix of (n × n) numbers, and to
calculate it we use a recursive procedure.
So:
Examples:
det(−2) = −2

| 1 −2 |
| 3 −3 | = +1 · det(−3) − (−2) · det(3) = −3 + 6 = 3,

| a11 a12 |
| a21 a22 | = a11a22 − a12a21

where we have expanded along the first row.
It is easy to remember that the determinant of a 2 × 2 matrix is the product of the diagonal
elements minus the product of the off-diagonal elements.
| 1  2  3 |
| 0 −1 −3 | = −(0) | 2 3 ; 3 4 | + (−1) | 1 3 ; 2 4 | − (−3) | 1 2 ; 2 3 | = 0 − (4 − 6) + 3(3 − 4) = −1
| 2  3  4 |
where we have expanded along the second row.
Make sure that you’re confident in calculating determinants, as we will use this a lot!
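The recursive procedure (expansion along the first row, with alternating signs and reduced matrices) can be written out directly; this plain-Python sketch is ours, not from the notes:

```python
def det(m):
    """Determinant by recursive expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # reduced matrix: remove row 1 and column j+1
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        sign = 1 if j % 2 == 0 else -1   # position (1, j+1) is even/odd
        total += sign * m[0][j] * det(minor)
    return total

assert det([[-2]]) == -2
assert det([[1, -2], [3, -3]]) == 3
assert det([[1, 2, 3], [0, -1, -3], [2, 3, 4]]) == -1
```

For large matrices this recursion is far too slow (n! terms); in practice one uses row reduction, but the recursive form matches the definition in the notes.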
Now that we know that a determinant is in principle only defined for square sets of numbers, we
will immediately abuse this definition to use it as an easy way to calculate the vector product
(which isn’t a number at all but a vector!):
If you want to calculate the vector product ~a ×~b you can do this by calculating the determinant
of the matrix with,
          | a1 a2 a3 |
~a × ~b = | b1 b2 b3 |
          | î  ĵ  k̂  |
Exercise: check this!
We can also calculate the scalar triple product easily, using determinants:
                | a1 a2 a3 |
(~a × ~b) · ~c = | b1 b2 b3 |
                | c1 c2 c3 |
Exercise: check this!
7 Geometry
So far, we have encountered only linear geometric objects such as points, lines, planes and
hyperplanes. They are called linear because:
1. they can all be expressed as linear combinations of vectors in the vector or parametric
notation, e.g.
~r = ~a + s~u + t~v
2. the (normal) equation(s), for the coordinates of a point in the object, that describe them
are all linear equations, e.g.
x + 2y − 3z = 1
3y + z = 3
The advantage of linear objects and linear equations is that any linear problem can be solved,
and that very good algorithms exist to deal with them, e.g. on a computer.
However, linear (i.e. straight and flat) objects are just a very small subset of all the geometric
objects that exist. If you just look around you, you will notice that most objects are actually
curved.
In general, objects are classified by the (highest) degree of the equations that describe them.
Linear objects have equations of degree 1, quadratic objects have equations of degree 2, cubic
objects have equations of degree 3, etc. Note that the dimension in which an object lives has
nothing to do with its degree. E.g. the curve x = y⁴ is a 2D object of degree 4, and a plane is an
object in 3D space of degree 1!
You can easily imagine that the higher the degree of an object, the harder it will be to solve
problems involving it. Therefore in this course we will restrict ourselves to linear and quadratic
objects only, but you have to keep in mind that there is a whole zoo of complicated geometric
objects out there.
As mentioned above, quadratic objects are described by quadratic equations. Maybe without
realising it, you are already familiar with a concept that involves quadratic equations of coordinates:
length or distance.
Indeed, most if not all quadratic objects can be expressed in terms of (squares) of distances.
The Ellipse
Let A and B be two distinct points, named the focal points of the ellipse. A point P on
the ellipse is such that the sum of the lengths AP and BP is constant, and greater than the
distance from A to B.
[Figure 1: An ellipse in standard form, that is, rotated so that its major axis lies along the X-axis; it shows the focal points A and B, the centre O, a point P on the ellipse, and the minor axis length b.]
The major axis of the ellipse is defined as the distance AO, as shown in fig. 1. The minor axis
has length b, as labelled in fig. 1.
In vector notation, a point ~r lies on the ellipse if
|~r − f~1| + |~r − f~2| = 2a
where a is the length of the major axis of the ellipse, and f~1, f~2 are the focal points.
example:
If the position vectors of the focal points of an ellipse are ~c and −~c and the length of the major
axis is 2a, show that the equation of the ellipse may be written as follows.
solution:
The equation for the ellipse is
|~r − ~c| + |~r + ~c| = 2a (1)
or, moving |~r + ~c| to the right-hand side and squaring,
~r² − 2~r · ~c + ~c² = 4a² − 4a|~r + ~c| + ~r² + 2~r · ~c + ~c²
From the previous example, with ~r = (r1, r2)T, a = |~a| and c = |~c|, and using the definition of
the minor axis length b, hence show that
r1²/a² + r2²/b² = 1 (8)
This is the standard form for the equation of an ellipse.
The Circle:
The circle is a special case of the ellipse, for which the 2 focal points coincide: f~1 = f~2 = ~c, i.e.
the centre point of the circle.
The defining property of a circle is that the distance from its centre ~c to a point ~r on the circle
is constant and equal to the radius r. (This is true also for the sphere and hypersphere in
higher dimensions.) Mathematically, this is written as
|~r − ~c|² = r²
where we have squared the distance to get rid of the cumbersome square root.
[Figure: the circle.]
The Hyperbola
A hyperbola is the locus of points such that the difference between the distances from the
point to the two given points (focal points) is constant.
[Figure: the hyperbola in standard form. A point P is such that the difference in the distances to the two focal points at c and −c is twice the distance a.]
This is the same definition as the ellipse, except that for the ellipse, it is the sum of the dis-
tances which is kept constant. To find an alternative expression for the hyperbola, we carry
out a similar derivation as that which we did for the ellipse.
or
~r² − 2~r · ~c + ~c² = 4a² + 4a|~r + ~c| + ~r² + 2~r · ~c + ~c² (10)
which is of the same form as for an ellipse.
Taking ~c = (c, 0)T, this gives
Or, letting b² = c² − a²:
r1²/a² − r2²/b² = 1
The Parabola
The parabola is obtained when the second of the 2 focal points (f~2) of the hyperbola is moved
off to infinity. The distance to f~2 then becomes equal to a constant minus the distance to
a line ⊥ to the focal axis, see figure.
The standard parabola, with focal axis parallel to the y-axis, is defined by the equation
y = ax² + bx + c
for some constants a, b, c. This is also known as a quadratic function, and is one of the most
common non-linear functions.
[Figure: the parabola y = x² + 1.]
From the equation you can read off the important features of the parabola:
1. min/max
2. intercept
3. roots,
4. etc.
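For y = x² + 1, these features can be computed from the coefficients (a sketch, not from the notes):

```python
# Features of the quadratic y = a*x^2 + b*x + c, here y = x^2 + 1.
a, b, c = 1.0, 0.0, 1.0

vertex_x = -b / (2 * a)                       # min (a > 0) or max (a < 0)
vertex_y = a * vertex_x**2 + b * vertex_x + c
intercept = c                                 # value at x = 0
disc = b**2 - 4 * a * c                       # discriminant: sign gives number of real roots

assert (vertex_x, vertex_y) == (0.0, 1.0)     # minimum at (0, 1)
assert intercept == 1.0
assert disc < 0                               # y = x^2 + 1 has no real roots
```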
The quadratic objects in 2D are called conic sections, because they can all be obtained from
the intersection of a cone with a plane!
Ellipsoid: x²/l² + y²/m² + z²/n² = 1            Elliptic Cone: z² = x²/l² + y²/m²
One sheet Hyperboloid: x²/l² + y²/m² − z²/n² = 1    Elliptic Paraboloid: z = x²/l² + y²/m²
Two sheet Hyperboloid: x²/l² + y²/m² − z²/n² = −1   Hyperbolic Paraboloid: z = y²/m² − x²/l²
It turns out that in N dimensions there is exactly one hyper-sphere that goes through N + 1
points in general position. We will now see the procedure of how to determine the hyper-sphere
S, i.e. its centre point ~c = (x1 , .., xN )T (N unknown coordinates) and radius r (another un-
known).
Consider the set of N + 1 points {~r1, .., ~rN+1} in general position, that must lie on S, thus giving
N + 1 quadratic equations:
|~r1 − ~c|² = r²
· · ·
|~rN+1 − ~c|² = r²
We can reduce this to a set of N linear equations for the coordinates by taking differences of
the quadratic equations, and solve them for the coordinates of ~c.
Once we have done this, we can use any of the quadratic equations to finally determine the
radius r.
so x = y = 7/6 and e.g. r² = |~r1 − ~c|² = 25/18 (check that also |~r2 − ~c|² = |~r3 − ~c|² = 25/18!)
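The whole procedure, differencing the quadratic equations to get N linear equations for ~c and then recovering r, fits in a few lines (the function name is ours, not from the notes; assumes NumPy):

```python
import numpy as np

def hypersphere_through(points):
    """Centre and radius of the hypersphere through N+1 points in
    general position in N dimensions."""
    p = np.asarray(points, float)
    # |p_i - c|^2 = |p_0 - c|^2  =>  2(p_i - p_0).c = p_i^2 - p_0^2
    A = 2 * (p[1:] - p[0])
    rhs = (p[1:] ** 2).sum(axis=1) - (p[0] ** 2).sum()
    c = np.linalg.solve(A, rhs)
    return c, np.linalg.norm(p[0] - c)

# Three points on the unit circle determine centre (0, 0) and radius 1:
centre, radius = hypersphere_through([(1, 0), (0, 1), (-1, 0)])
assert np.allclose(centre, (0, 0)) and np.isclose(radius, 1.0)
```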
The intersection of a line L : ~r = ~a + t~u and a (hyper)-sphere S (centre ~c, radius r), can either
be 0, 1, or 2 points.
The line has only 1 free parameter t, which can be determined by demanding that the intersection
also obeys the equation of the sphere. This will give us a quadratic equation for t that
has either no, one, or two real solutions, corresponding to 0, 1 or 2 intersection points.
The intersection of a plane P (point in plane ~a, normal vector ~n) and a sphere S (centre ~C,
radius R) can either be a circle (plane cuts through the sphere), 1 point (plane touches the
sphere = circle of radius 0), or empty (plane passes the sphere by).
In the procedure we will assume that the intersection is a circle (a contradiction is found if it
is empty).
To uniquely determine a circle C in N-D, we need three things: its centre ~c, its radius r, but
also the normal vector ~nc to the (hyper-)plane in which the circle lies.
1. We get ~nc = ~n for free as we know that the circle must lie in the plane P.
2. The centre ~c must lie on the line through ~C with direction ~n: ~c = ~C + t~n, where t follows from demanding that ~c lies in the plane P: (~a − ~c) · ~n = 0.
3. Pythagoras' rule then gives the radius: r² = R² − |~c − ~C|².
[Figure: the plane P cutting the sphere S: sphere centre ~C and radius R, circle centre ~c and radius r, plane normal ~n.]
example:
P: ~a = (1, 0, 1)T, ~n = (2, 2, 2)T
S: ~C = (1, 1, −1)T, R = 2
Then:
• ~nc = ~n = (2, 2, 2)T
• ~c = (1 + 2t, 1 + 2t, −1 + 2t)T
• (~a − ~c) · ~n = 0 → 12t − 2 = 0 → t = 1/6, and ~c = (4/3, 4/3, −2/3)T
• r² = R² − |~c − ~C|² = 4 − 1/3 = 11/3, so r = √(11/3)
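The same example, checked numerically (a sketch, not from the notes; assumes NumPy):

```python
import numpy as np

# Plane P (point a, normal n) cutting the sphere S (centre C, radius R):
a = np.array([1.0, 0.0, 1.0]); n = np.array([2.0, 2.0, 2.0])
C = np.array([1.0, 1.0, -1.0]); R = 2.0

# Circle centre: c = C + t*n with (a - c).n = 0, so t = (a - C).n / n.n.
t = ((a - C) @ n) / (n @ n)
c = C + t * n
r = np.sqrt(R**2 - (c - C) @ (c - C))

assert np.isclose(t, 1 / 6)
assert np.allclose(c, (4 / 3, 4 / 3, -2 / 3))
assert np.isclose(r**2, 11 / 3)
```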
1. We get ~nc = ~C2 − ~C1 for free as we know that the circle must lie in the plane ⊥ (~C2 − ~C1).
2. We know that ~c must lie on the line through ~C1 and ~C2 (see figure), so ~c = ~C1 + t(~C2 − ~C1).
3. To determine r and t, we use Pythagoras' rule twice for the two right-angled triangles (see figure):
(a) R1² = |~c − ~C1|² + r² → r² = R1² − |~c − ~C1|²
(b) R2² = |~c − ~C2|² + r² → r² = R2² − |~c − ~C2|²
The difference of these equations (R1² − |~c − ~C1|² = R2² − |~c − ~C2|²) gives a linear equation
for t, which can be solved.
(figure: intersection of 2 spheres, centres ~C1 and ~C2, radii R1 and R2, with the intersection circle of centre ~c and radius r)
example:
S1: ~C1 = (1, 1, 1)T, R1 = 1,
S2: ~C2 = (1, 2, 3)T, R2 = 3
Then:
• ~nc = ~C2 − ~C1 = (0, 1, 2)T
• ~c = (1, 1 + t, 1 + 2t)T
• t^2 + 4t^2 − 1 = (−1 + t)^2 + (−2 + 2t)^2 − 9  →  t = −3/10, and ~c = (1, 7/10, 4/10)T
• r^2 = R1^2 − |~c − ~C1|^2 = 1 − 45/100 = 55/100, so r = √(55/100)
• Check: R2^2 − |~c − ~C2|^2 = 9 − 845/100 = 55/100 = r^2, ok!
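The same worked example can be checked numerically. The sketch below (not part of the original notes; the function name is our own) follows steps 1-3 above, with the linear equation for t solved in closed form:

```python
import numpy as np

def sphere_sphere_circle(C1, R1, C2, R2):
    """Intersection circle of two spheres: returns (centre c, radius r,
    unit normal n), or None if the spheres do not intersect."""
    C1, C2 = np.asarray(C1, float), np.asarray(C2, float)
    d = C2 - C1
    # c = C1 + t*d; equating R1^2 - |c-C1|^2 = R2^2 - |c-C2|^2 gives a
    # linear equation for t with solution:
    t = (R1**2 - R2**2 + d @ d) / (2 * d @ d)
    c = C1 + t * d
    r2 = R1**2 - (c - C1) @ (c - C1)   # Pythagoras in the first triangle
    if r2 < 0:
        return None                    # empty intersection
    return c, np.sqrt(r2), d / np.linalg.norm(d)

# The worked example: C1 = (1,1,1), R1 = 1 and C2 = (1,2,3), R2 = 3
c, r, n = sphere_sphere_circle([1, 1, 1], 1, [1, 2, 3], 3)
```

This reproduces ~c = (1, 7/10, 4/10) and r^2 = 55/100 from the example.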
Although formulae for projections onto any quadratic object can be derived, in this course we
will restrict ourselves to projections onto (hyper)-spheres, as it is relatively simple.
Consider a point ~r and a (hyper)-sphere S with centre point ~c and radius r. The projection p~~r
of ~r onto S is the point on S that is closest to ~r.
You can easily convince yourself that this point must lie on the line through ~r and ~c, hence ~p~r = ~c + t(~r − ~c) with:

|~p~r − ~c|^2 = r^2  →  t^2 |~c − ~r|^2 = r^2  →  t = ±r / |~c − ~r|
There are two solutions, one is the desired projection, the other is the point on the sphere
furthest away from ~r.
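A minimal Python sketch of this projection (not part of the original notes; the function name is our own), taking the positive root so that the closest of the two candidate points is returned:

```python
import numpy as np

def project_onto_sphere(p, c, r):
    """Closest point on the sphere |x - c| = r to the point p.

    Of the two solutions t = +-r/|c - p| of t^2 |c - p|^2 = r^2, the
    projection is c + t*(p - c) with the positive sign.
    """
    p, c = np.asarray(p, float), np.asarray(c, float)
    t = r / np.linalg.norm(p - c)
    return c + t * (p - c)

# Sphere of radius 1 centred at (1, 0, 0); project the point (3, 0, 0):
q = project_onto_sphere([3, 0, 0], [1, 0, 0], 1)
```

The projection of (3, 0, 0) is (2, 0, 0), the near side of the sphere; the negative root would give the far point (0, 0, 0).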
AM10VA: Vector Algebra and Geometry 41
A matrix is a rectangular array of numbers (scalars). In these lecture notes we will use (bold)
upper case letters to indicate matrices. The numbers in the array are called the elements of
the matrix.
The size (or order) of the matrix is specified as m × n, where m is the number of rows, and
n is the number of columns. This means that n-dimensional (column) vectors are also n × 1
matrices. Similarly, n-dimensional row vectors are also 1 × n matrices.
    [  1   2 ]         [  1  2  3 ]
A = [  3   2 ],   B =  [ −1  3  5 ],   C = [ 5 ]
    [ −10  3 ]
To indicate a specific element of a matrix A, we have to indicate the row-index i, and the
column-index j of the element, and we denote it by Aij . In the examples above we have e.g.
A31 = −10 and B12 = 2.
Only matrices of the same size can be added. The summation is carried out as follows: suppose that we want to add the m × n matrices A and B, then the result A + B is again an m × n matrix and each element is the sum of the corresponding elements in A and B. Mathematically:

(A + B)ij = Aij + Bij

E.g. if
A = [ 1  2  3 ]    B = [ 4   2  1 ]
    [ 6  5  2 ]        [ 4  −1  7 ]
then
A + B = [  5  4  4 ]
        [ 10  4  9 ]
Two matrices A (m × n) and B (k × ℓ) can only be multiplied with each other if the number
of columns of the first matrix is equal to the number of rows of the second matrix.
So AB exists only if n = k, and BA exists only if ℓ = m.
If the elements of the m × r matrix A are Aij, and the elements of the r × n matrix B are Bij, then the result of the product AB is an m × n matrix. The ij-th element of the result is the sum of the products of all the elements in the i-th row of A with the corresponding elements in the j-th column of B. Mathematically:

(AB)ij = Σ_{k=1}^{r} Aik Bkj.
Then, for 2 × 2 matrices:

AB = [ A11 B11 + A12 B21   A11 B12 + A12 B22 ]
     [ A21 B11 + A22 B21   A21 B12 + A22 B22 ]
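The summation formula can be checked directly. The sketch below (not part of the original notes) implements (AB)ij = Σk Aik Bkj with explicit loops and compares it against NumPy's built-in product; the matrices reused are the 2 × 3 examples A and B from the addition section:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [6, 5, 2]])
B = np.array([[4, 2, 1],
              [4, -1, 7]])

S = A + B                      # element-wise sum of two 2x3 matrices

# Matrix product via the summation formula (AB)_ij = sum_k A_ik B_kj;
# here we multiply the 2x3 matrix A with the 3x2 matrix B^T.
C = np.zeros((2, 2), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(3):
            C[i, j] += A[i, k] * B.T[k, j]
```

The loop result C agrees with `A @ B.T`, NumPy's matrix multiplication.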
Because of the conditions on the sizes of the matrices, it should be clear that matrix multiplication does not possess the commutative property. In fact, even if AB exists it is not guaranteed that BA also exists!
The combination of matrix addition and matrix multiplication does have the distributive property:
A(B + C) = AB + AC
when it exists.
An m×n matrix A multiplied by a scalar λ is again an m×n matrix λA, where all the elements
are multiplied by λ:
(λA)ij = λAij
Combining the multiplication with a scalar and matrix addition, we can consider linear combinations of matrices (just like we did for vectors). For a set of k matrices {Ai, i = 1..k} and a set of k numbers {ci, i = 1..k}, we can consider the linear combination:

Σ_{i=1}^{k} ci Ai = c1 A1 + c2 A2 + .. + ck Ak
Transposing an m × n matrix A (notation AT) means that we swap the rows and columns of the matrix, i.e. the i-th row becomes the i-th column and vice versa. The result AT is then an n × m matrix. Mathematically:

(AT)ij = Aji
We can identify a scalar with a 1 × 1 matrix (therefore we can drop both the row and column indices without ambiguity). The notation will depend on whether we look upon the number as a scalar or a matrix, e.g.:

A = A11 ∼ a

and is important because the operations on them do not necessarily mean the same thing.
Vectors are matrices with only one column, i.e. n × 1 matrices (therefore we do not tend to give the column index, as it is always 1 anyway).
As such we can do all matrix operations on them. Again it is a good idea to distinguish in your
notation how you consider them because of different notation and meaning of the operations
on them, e.g.:
    [ A11 ]         [ a1 ]        [ B11 ]         [ b1 ]
A = [ A21 ]  ∼ ~a = [ a2 ],   B = [ B21 ]  ∼ ~b = [ b2 ],
    [ A31 ]         [ a3 ]        [ B31 ]         [ b3 ]
Since vectors (even in vector notation) can still be regarded as matrices consisting of a single
column, we can still multiply a matrix (of the right dimensions) with a vector:
      [ A11 A12 A13 ] [ b1 ]   [ A11 b1 + A12 b2 + A13 b3 ]
A~b = [ A21 A22 A23 ] [ b2 ] = [ A21 b1 + A22 b2 + A23 b3 ]
      [ A31 A32 A33 ] [ b3 ]   [ A31 b1 + A32 b2 + A33 b3 ]
An n × n matrix which has the same number of rows as columns is called a square matrix. Square matrices are important for many reasons, which will become clear in the rest of this course:
• two square matrices of the same size can be multiplied in either order (both AB and BA exist);
• multiplying a square matrix with a vector yields a vector of the same dimension (in the same space).
• The elements of a square matrix that have the same row and column index, of the form Aii, are called the diagonal elements, and the set of all of them simply the diagonal of the matrix.
A symmetric matrix is a matrix that remains unchanged when you transpose it:
AT = A  →  Aji = (AT)ij = Aij
This means that a symmetric matrix is necessarily square and the corresponding elements above
and below the diagonal (when you mirror around the diagonal) are the same. E.g.
    [ 1  0  3 ]
A = [ 0 −2  π ]
    [ 3  π  5 ]
is symmetric.
Note that there is no condition on the diagonal elements.
A zero matrix 0 is a matrix in which each element is 0. There are zero matrices of any size,
hence it is good practice to indicate the m × n zero matrix as 0mn . Mathematically:
0ij = 0 ∀i, j
A diagonal matrix D is a square matrix for which all off-diagonal elements are 0. Note that
there is no condition on the values of the diagonal elements. Mathematically:
Dij = 0, if i ≠ j
E.g.
    [ 1  0  0 ]
A = [ 0 −2  0 ]
    [ 0  0  π ]
is diagonal.
A triangular matrix T is a square matrix for which either all the elements above the diagonal or all the elements below the diagonal are 0; it is called lower triangular or upper triangular, respectively.
E.g.
    [ 1  1  3 ]        [ 1  0  0 ]
A = [ 0 −2  7 ],   B = [ 0 −2  0 ],
    [ 0  0  π ]        [ 1 −3  π ]
A is an upper triangular matrix and B is lower triangular.
An identity matrix is defined as a matrix I which, under matrix multiplication with any matrix
A, leaves A unchanged (i.e. IA = A = AI).
If A is an m × n matrix and we multiply it on the left with an I, the result should again be A (m × n), so I must be a square m × m matrix. Similarly, if we multiply A with an I on the right, it follows that I should be a square n × n matrix.
We conclude that any identity matrix must be square, but it can be of different dimensions (n × n), so it is good practice to indicate the dimension of the square identity matrix as In, such that it is clear which one we are talking about.
It turns out that there is exactly 1 identity matrix In for each dimension n, and it is a diagonal matrix with all diagonal elements equal to 1. Mathematically:
Iij = 1 if i = j, Iij = 0 if i ≠ j, i.e.

     [ 1 0 .. 0 ]
In = [ 0 1 .. 0 ]
     [ : :  .  : ]
     [ 0 0 .. 1 ]
You can check for yourself that if you multiply this matrix on the left or right with a matrix of
compatible dimensions you get that matrix back.
You can see that the In are all diagonal matrices.
The matrix multiplication ideas that we developed above are very useful in describing linear transformations of vectors. A transformation T maps every vector onto another vector; it is called linear if it satisfies the following properties:
• T(~x + ~y) = T(~x) + T(~y)
• T(λ~x) = λT(~x)
We know that matrix multiplication satisfies these properties, so multiplication by a fixed matrix is a linear transformation. In fact it turns out (without proof) that there is a one-to-one relationship between linear transformations and matrices:
With every linear transformation T there corresponds exactly one matrix A such
that
T (~x) = A~x,
and vice-versa.
Since we have the linearity properties, it is enough to know how the transformation works on the basis vectors {~e1, .., ~en} → {~e′1 = T(~e1), .., ~e′n = T(~en)} to know how it works on any vector, because:

T(~a) = T(a1~e1 + .. + an~en) = a1 T(~e1) + .. + an T(~en) = a1~e′1 + .. + an~e′n
and the matrix A corresponding to the transformation is given by putting the coordinates of
the vectors {~e′1 , .., ~e′n } (in the basis {~e1 , .., ~en }) as the columns of A.
Practically, it is a good idea to draw the basis vectors ~ei and their images ~e′i, to understand how the transformation works, and to see what the coordinates of the images are.
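The columns-are-images-of-basis-vectors rule is easy to verify numerically. The following sketch is not part of the original notes; the map T used here (swap x and y, double z) is a made-up example:

```python
import numpy as np

# Images of the standard basis vectors under a hypothetical linear map T
# that swaps x and y and doubles z:
e1_img = np.array([0, 1, 0])   # T(e1)
e2_img = np.array([1, 0, 0])   # T(e2)
e3_img = np.array([0, 0, 2])   # T(e3)

# The matrix of T has the images of the basis vectors as its columns:
A = np.column_stack([e1_img, e2_img, e3_img])

a = np.array([3, 4, 5])
# T(a) = a1*T(e1) + a2*T(e2) + a3*T(e3) equals the matrix-vector product A a:
assert np.allclose(A @ a, 3 * e1_img + 4 * e2_img + 5 * e3_img)
```

For ~a = (3, 4, 5)T this gives T(~a) = (4, 3, 10)T, exactly the linear combination of the image vectors.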
Examples:
8.3.1 Scaling
Scaling a vector means changing the length of the vector, but not its direction. Since the direction in which a vector points is associated with the ratio of its components, scaling corresponds to simply multiplying all the components of the vector by an equal amount λ:

~x′ = λ~x  →  S = λ In

Indeed, we have that ~e′i = T(~ei) = λ~ei for all basis vectors, so putting these as the columns of the matrix gives S.
8.3.2 Reflection
Reflection operators flip vectors about a certain line, plane or hyper-plane. For example, if we
flip about the y-axis, we have that
î′ = −î = (−1, 0)T
                        →   R = [ −1  0 ]
ĵ′ = ĵ = (0, 1)T                [  0  1 ]
8.3.3 Projection
Linear projection operators project all vectors orthogonally onto a line, plane or hyperplane.
Consider the projection in 3-D onto the xy-plane.
Since î and ĵ lie in the xy-plane, and k̂ is orthogonal to it, we have that:
î′ = î = (1, 0, 0)T
                                 [ 1 0 0 ]
ĵ′ = ĵ = (0, 1, 0)T     →   P = [ 0 1 0 ]
                                 [ 0 0 0 ]
k̂′ = ~0 = (0, 0, 0)T
8.3.4 Rotation
Linear rotation operators rotate all vectors over some angle around a line through the origin.
Consider the case that we rotate the 3-D vectors over an angle θ clockwise around the z-axis. We have that (see figure below):
î′ = cos(θ)î − sin(θ)ĵ = (cos(θ), −sin(θ), 0)T
                                                        [  cos(θ)  sin(θ)  0 ]
ĵ′ = sin(θ)î + cos(θ)ĵ = (sin(θ), cos(θ), 0)T   →  R = [ −sin(θ)  cos(θ)  0 ]
                                                        [    0        0    1 ]
k̂′ = k̂ = (0, 0, 1)T
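Two defining properties of a rotation matrix, R^T R = I (lengths are preserved) and det(R) = 1, can be checked numerically. This sketch is not part of the original notes; the function name is our own:

```python
import numpy as np

def rotation_z(theta):
    """Matrix of the clockwise rotation about the z-axis derived above."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  s, 0],
                     [-s, c, 0],
                     [0,  0, 1]])

R = rotation_z(np.pi / 3)
# A rotation preserves lengths: R^T R = I, and det(R) = 1.
assert np.allclose(R.T @ R, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)
```

The first assertion also shows that the inverse of a rotation is simply its transpose.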
All the above transformations can be represented by matrix multiplications on vectors. We can
apply a sequence of such transformations, e.g. a rotation R, followed by a rescaling, S. This
would have the effect of transforming a vector ~x into ~x′ by rotation, and then transforming ~x′ into ~x′′ by rescaling. Hence,

~x′ = R~x,  ~x′′ = S~x′   →   ~x′′ = S(R~x) = (SR)~x
so the matrix corresponding to the combined transformation from ~x to ~x′′ is given by the matrix
product of the matrices in the opposite order of the matrices of the individual transformations
(i.e. the first transformation on the right and the last one on the left)!
Obviously we can generalise this to an arbitrary number of linear transformations. If the transformations A1, A2, .., An are applied consecutively in that order, then the matrix C corresponding to the overall transformation is given by:

C = An An−1 .. A2 A1
8.4 Determinants
The determinant of a matrix is an operation that is only defined for square matrices and maps every square matrix A onto a number det(A).
We have already introduced determinants and explained how to calculate them when we dis-
cussed the vector product, so we are not going to repeat that here.
However, there are some extra important properties concerning determinants worth mentioning:
• det AT = det A
• Multiplying a single row or column of A by a scalar k changes the determinant by a factor k. That is, det B = k det A if B is A but with one of its columns or rows multiplied by k.
• From the above property we see that multiplying an n × n matrix with a scalar has the
following effect on the determinant: det(λA) = λn det(A).
• Interchanging two rows or two columns changes the sign of the determinant.
• If B is the matrix that results when a multiple of one row of A is added to another row, or when a multiple of one column is added to another column, then det B = det A.
• If A is a square matrix and the set of rows or columns (looked upon as vectors) is linearly
dependent then det A = 0.
• The determinant of a triangular matrix (either all the elements above the diagonal or all
the elements below the diagonal are zero) is equal to the product of its diagonal elements.
• Since a diagonal matrix is also triangular, the determinant of a diagonal matrix is also just
the product of its diagonal elements.
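These properties can all be checked numerically on a random matrix. The sketch below is not part of the original notes; each assertion corresponds to one bullet point above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)
n = 3

# det(A^T) = det(A)
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))

# det(lambda * A) = lambda^n det(A)
lam = 2.0
assert np.isclose(np.linalg.det(lam * A), lam**n * np.linalg.det(A))

# interchanging two rows changes the sign of the determinant
B = A[[1, 0, 2], :]
assert np.isclose(np.linalg.det(B), -np.linalg.det(A))

# adding a multiple of one row to another leaves the determinant unchanged
C = A.copy()
C[2] += 5 * C[0]
assert np.isclose(np.linalg.det(C), np.linalg.det(A))
```

All four identities hold for any square matrix, including singular ones.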
Another operation that is only defined on (some) square matrices is matrix inversion. The inverse A−1 of an n × n matrix A is the matrix that undoes (reverses) the effect of multiplication by A on either side:
A−1 AB = B, and CAA−1 = C
for any compatible matrices B and C. Therefore we have that
A−1 A = In = AA−1
E.g. if we applied the transformation A to a vector ~x, what transformation do we need to apply to the vector ~x′ = A~x to recover the original vector ~x? The answer is A−1, since A−1~x′ = A−1A~x = ~x.
• Singular matrices
Note that it is not always possible to find A−1 , and it turns out that A−1 does not exist
when det(A) = 0. In that case, we call the matrix A singular.
With any linear transformation of vectors corresponds a matrix A. Now you should try
and think for what kind of transformations you could work your way back to the original
vector. In order to do so each different vector ~x should be mapped onto a different unique
vector ~x′ . If more than one vector gets mapped onto the same ~x′ it is no longer possible to
undo the operation, because you don’t know where it came from.
It turns out that any scaling, reflection or rotation is invertible, while any projection is not.
You can check as an exercise for yourself that the determinant of any scaling, reflection or rotation that we have dealt with is non-zero, while the determinant of any projection is 0.
The elementary row operations are:
• scaling a row by a non-zero factor,
• interchanging two rows,
• adding a multiple of one row to another row.
We can use elementary row operations to find the inverse in the following way:
1. Write down the matrix A on the left, and the identity matrix I of the same dimension on
the right.
2. Apply elementary row operations to both the left and the right, working your way towards getting the identity matrix on the left.
3. If you can't get any closer with elementary operations on the left, then the inverse doesn't exist.
4. If you manage to reduce the matrix on the left all the way to the identity matrix, then the
matrix on the right is A−1
• It is good practice to write down the operations that you do in each step.
• Also work systematically: starting from bringing the first column into the same form as for the identity matrix, work your way to the last column. Thus you avoid undoing work that you did before.
Example: Find the inverse of the matrix
[ 1 2 3 ]
[ 2 5 3 ]
[ 1 0 8 ]
Start:

[ 1 2 3 | 1 0 0 ]   r2 → r2 − 2r1    [ 1  2  3 |  1 0 0 ]   r1 → r1 − 2r2
[ 2 5 3 | 0 1 0 ]   → r3 → r3 − r1   [ 0  1 −3 | −2 1 0 ]   → r3 → r3 + 2r2
[ 1 0 8 | 0 0 1 ]                    [ 0 −2  5 | −1 0 1 ]

[ 1 0  9 |  5 −2 0 ]                 [ 1 0  9 |  5 −2  0 ]   r1 → r1 − 9r3
[ 0 1 −3 | −2  1 0 ]   → r3 → −r3    [ 0 1 −3 | −2  1  0 ]   → r2 → r2 + 3r3
[ 0 0 −1 | −5  2 1 ]                 [ 0 0  1 |  5 −2 −1 ]

[ 1 0 0 | −40 16  9 ]
[ 0 1 0 |  13 −5 −3 ]
[ 0 0 1 |   5 −2 −1 ]
So

[ 1 2 3 ]−1   [ −40 16  9 ]
[ 2 5 3 ]   = [  13 −5 −3 ]
[ 1 0 8 ]     [   5 −2 −1 ]
Check that indeed AA−1 = I3 .
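The suggested check can be done by hand or, as in this sketch (not part of the original notes), numerically: both AA−1 and A−1A must give the 3 × 3 identity matrix.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 5, 3],
              [1, 0, 8]], dtype=float)

A_inv = np.array([[-40, 16, 9],       # the inverse found above
                  [13, -5, -3],
                  [5, -2, -1]], dtype=float)

# Both products must give the 3x3 identity matrix:
assert np.allclose(A @ A_inv, np.eye(3))
assert np.allclose(A_inv @ A, np.eye(3))
```

NumPy's own `np.linalg.inv(A)` returns the same matrix, up to floating-point rounding.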
Example: Find the inverse of
[ −1 −1 2 ]
[  1 −2 1 ]
[  0 −1 1 ]
Start:

[ −1 −1 2 | 1 0 0 ]                 [ 1  1 −2 | −1 0 0 ]
[  1 −2 1 | 0 1 0 ]   → r1 → −r1    [ 1 −2  1 |  0 1 0 ]   → r2 → r2 − r1
[  0 −1 1 | 0 0 1 ]                 [ 0 −1  1 |  0 0 1 ]

[ 1  1 −2 | −1 0 0 ]                    [ 1  1 −2 |  −1    0   0 ]   r1 → r1 − r2
[ 0 −3  3 |  1 1 0 ]   → r2 → −r2/3     [ 0  1 −1 | −1/3 −1/3  0 ]   → r3 → r3 + r2
[ 0 −1  1 |  0 0 1 ]                    [ 0 −1  1 |   0    0   1 ]

[ 1 0 −1 | −2/3  1/3  0 ]
[ 0 1 −1 | −1/3 −1/3  0 ]
[ 0 0  0 | −1/3 −1/3  1 ]
At this point we see that row 3 of the left-hand matrix is all zeros, so we can never get column 3 into the form of the identity matrix, and hence the inverse does not exist. Check that the determinant of this matrix is indeed 0.
Extra example: Find the inverse of
[ 1 2 3 4 ]            [ 1 −2  1  0 ]
[ 0 1 2 3 ]            [ 0  1 −2  1 ]
[ 0 0 1 2 ]   (answer  [ 0  0  1 −2 ] ).
[ 0 0 0 1 ]            [ 0  0  0  1 ]
An alternative to elementary row operations for the calculation of the inverse of a square matrix
A, makes use of the calculation of determinants.
First we define the cofactor Cij of an element Aij of the matrix A as the determinant of
the reduced matrix A′ij (i.e. the matrix A where we have taken away the ith row and the jth
column, also known as the ij minor of A).
Cij = det(A′ij )
Then we have the following expression for the element ij of the inverse matrix A−1:

(A−1)ij = (−1)^(i+j) Cji / det(A)

• Note that the formula above has Cji, not Cij! Hence we need to transpose the matrix of the cofactors.
This is particularly simple for 2 × 2 matrices:
[ a b ]−1      1    [  d −c ]T      1    [  d −b ]
[ c d ]    = ————— [ −b  a ]   = ————— [ −c  a ]
             ad − bc               ad − bc
Example:
    [ 1 3 0 ]
A = [ 1 2 0 ]
    [ 0 0 3 ]
First we calculate all the cofactors:

C11 = det[ 2 0 ] = 6,   C12 = det[ 1 0 ] = 3,   C13 = det[ 1 2 ] = 0
         [ 0 3 ]                 [ 0 3 ]                 [ 0 0 ]

C21 = det[ 3 0 ] = 9,   C22 = det[ 1 0 ] = 3,   C23 = det[ 1 3 ] = 0
         [ 0 3 ]                 [ 0 3 ]                 [ 0 0 ]

C31 = det[ 3 0 ] = 0,   C32 = det[ 1 0 ] = 0,   C33 = det[ 1 3 ] = −1
         [ 2 0 ]                 [ 1 0 ]                 [ 1 2 ]
Then we can calculate the determinant by developing along the 3rd row:
det(A) = 0 − 0 + 3 C33 = −3
Extra exercises, apply this method to the examples of the previous sections, and check that
you get the same answers.
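The cofactor formula can also be implemented directly. The sketch below is not part of the original notes; the function name is our own, and the cofactors are the minors, as defined above, with the sign factor applied inside the inverse formula:

```python
import numpy as np

def cofactor_inverse(A):
    """Inverse via cofactors: (A^-1)_ij = (-1)^(i+j) C_ji / det(A),
    where C_ij is the ij minor of A (as in the notes)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    det = np.linalg.det(A)
    inv = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # C_ji: delete row j and column i (note the transposition!)
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            inv[i, j] = (-1) ** (i + j) * np.linalg.det(minor) / det
    return inv

A = [[1, 3, 0], [1, 2, 0], [0, 0, 3]]
assert np.allclose(cofactor_inverse(A), np.linalg.inv(A))
```

For the example above this gives A−1 = [−2 3 0; 1 −1 0; 0 0 1/3], the matrix used in the next section.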
A set of n linear equations in n unknowns can be written compactly in matrix form as

A~x = ~b

e.g.

2x + y + z = 3              [  2  1  1 ] [ x ]   [ 3 ]
x − y = 5              →    [  1 −1  0 ] [ y ] = [ 5 ]
−x − 6y + 5z = 2            [ −1 −6  5 ] [ z ]   [ 2 ]
1. One method to solve such equations uses the inverse of the matrix (provided that it exists); the solution is then given by multiplying both sides of the equation by A−1:

~x = A−1~b
Although this is a correct approach to solving this linear system of equations, in practice,
we normally don’t solve this problem this way. Calculating the inverse is a lot of work if
you have to solve this set of equations only once, and much more efficient methods exist.
Nevertheless, it is worthwhile knowing this method when we have to solve the same set of
equations many times for different right hand sides (i.e. for different ~b’s).
Then we only have to calculate the inverse of A once, and can use it again and again.
Example: Solve

[ 1 3 0 ] [ x ]   [ 1 ]
[ 1 2 0 ] [ y ] = [ 2 ]
[ 0 0 3 ] [ z ]   [ 3 ]

using the inverse. We already have the inverse of this matrix (see previous section), so the solution is given by:

[ x ]   [ −2  3   0  ] [ 1 ]   [  4 ]
[ y ] = [  1 −1   0  ] [ 2 ] = [ −1 ]
[ z ]   [  0  0  1/3 ] [ 3 ]   [  1 ]
If the determinant of the matrix is non-zero there exists exactly 1 solution, while if the
determinant of the matrix is 0, there could be either no, or many solutions.
2. If you need to solve the set of equations only once, then it is better to use elementary row operations and reduction to triangular form. This is numerically more efficient, and can also deal with non-invertible matrices and even non-square matrices (i.e. sets of equations with a different number of equations and unknowns).
The steps for solving it this way are:
(a) write the matrix on the left and the right hand sides on the right
(b) try to reduce the matrix to an upper triangular form using elementary row operations,
and do the same operations to the right hand side.
(c) if you manage to get to that form, then you can get the solution by back substitution.
You solve the last row first, fill that into the second last and so on.
(d) if you get a contradiction before you arrive at that form, this means that the set of
equations has no solutions
Looking at the equation in the last row, we see that xn = cn / unn. Given this solved value for xn, it can be used in the (n − 1)-th equation, which is solved by

xn−1 = (cn−1 − un−1,n xn) / un−1,n−1

and similarly the (n − 2)-th equation is solved by

xn−2 = (cn−2 − (un−2,n−1 xn−1 + un−2,n xn)) / un−2,n−2
We can continue in this manner, solving each time for a new variable, which can then be fed into the equation on the row above, along with the other solved values.
The formal name for this reduction to an upper triangular form is Gaussian Elimination, and the subsequent method for solving these simplified equations is known as back substitution.
Although you may have been solving sets of linear equations this way without realising,
doing it in the matrix notation, makes everything more systematic, and more similar to
how a computer would solve it.
Example:

[ 1  2 −2 |  1 ]   r2 → r2 − 2r1     [ 1   2 −2 |  1 ]
[ 2  1 −4 | −1 ]   → r3 → r3 − 4r1   [ 0  −3  0 | −3 ]   → r3 → r3 − (11/3) r2
[ 4 −3  1 | 11 ]                     [ 0 −11  9 |  7 ]

[ 1  2 −2 |  1 ]
[ 0 −3  0 | −3 ]                                              (14)
[ 0  0  9 | 18 ]
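The back-substitution recipe above translates directly into code. This sketch is not part of the original notes; the function name is our own, and it is applied to the triangular system just obtained in (14):

```python
import numpy as np

def back_substitution(U, c):
    """Solve U x = c for upper triangular U, as described above:
    x_i = (c_i - sum_{j>i} U_ij x_j) / U_ii, starting from the last row."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The triangular system obtained above:
U = np.array([[1, 2, -2],
              [0, -3, 0],
              [0, 0, 9]], dtype=float)
c = np.array([1, -3, 18], dtype=float)
x = back_substitution(U, c)
```

This yields z = 2, then y = 1, then x = 3; you can verify the solution (3, 1, 2) against the original equations.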
x + 5y + 3z = 1
5x + y − z = 2
x + 2y + z = 3
[ 1 5  3 | 1 ]   r2 → r2 − 5r1     [ 1   5   3 |  1 ]
[ 5 1 −1 | 2 ]   → r3 → r3 − r1    [ 0 −24 −16 | −3 ]   → r3 → r3 − r2/8
[ 1 2  1 | 3 ]                     [ 0  −3  −2 |  2 ]

[ 1   5   3 |    1 ]
[ 0 −24 −16 |   −3 ]                                          (15)
[ 0   0   0 | 19/8 ]

The last row now reads 0x + 0y + 0z = 19/8 ≠ 0, so this is a contradiction and no solution exists.
x + 5y + 3z = 1
5x + y − z = 2
x + 2y + z = 5/8

[ 1 5  3 |  1  ]   r2 → r2 − 5r1     [ 1   5   3 |    1 ]
[ 5 1 −1 |  2  ]   → r3 → r3 − r1    [ 0 −24 −16 |   −3 ]   → r3 → r3 − r2/8
[ 1 2  1 | 5/8 ]                     [ 0  −3  −2 | −3/8 ]

[ 1   5   3 |  1 ]
[ 0 −24 −16 | −3 ]                                            (16)
[ 0   0   0 |  0 ]

The last row now reads 0x + 0y + 0z = 0, so there is no constraint on z, and there are many solutions.
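The three possible outcomes (unique solution, no solution, many solutions) can be detected by comparing matrix ranks, as in this sketch (not part of the original notes; the function name is our own): a contradiction means rank(A) < rank([A|b]), while equal ranks smaller than the number of unknowns leave free variables.

```python
import numpy as np

def classify(A, b):
    """Classify A x = b as 'unique', 'none' or 'many' using ranks."""
    A = np.asarray(A, float)
    b = np.asarray(b, float).reshape(-1, 1)
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.hstack([A, b]))
    if rA < rAb:
        return "none"        # contradictory last row, as in (15)
    return "unique" if rA == A.shape[1] else "many"   # as in (16)

A = [[1, 5, 3], [5, 1, -1], [1, 2, 1]]
case_none = classify(A, [1, 2, 3])        # the contradictory system (15)
case_many = classify(A, [1, 2, 5 / 8])    # the consistent system (16)
```

This reproduces the conclusions of the two examples above.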
One of the most important concepts in linear algebra is that of eigenvalues and eigenvectors. In a sense, the eigenvectors of a matrix correspond to the natural coordinate system, in which the transformation can be most easily understood. An eigenvector ~e with corresponding eigenvalue λ of a square matrix A satisfies

A~e = λ~e

Eigenvectors are special directions in which vectors only get rescaled under the operation of the matrix A, and the eigenvalues are the rescaling factors. An n × n square matrix has n eigenvalues, although they do not all have to be different.
In general, for an n × n matrix there are (possibly including repetitions) n eigenvalues, each with a corresponding eigenvector. We can rearrange the equation above to

(A − λI)~e = ~0

which has a non-zero solution ~e only if det(A − λI) = 0. This is called the characteristic equation; the left-hand side is a polynomial of degree n in λ (the characteristic polynomial), and the eigenvalues are its roots.
It can happen that the same value λi appears more than once! The number of times that the same eigenvalue appears as a root is called its multiplicity mi.
So, supposing that there are k different eigenvalues, the characteristic polynomial can be rewritten as

Π_{j=1}^{k} (λj − λ)^{mj},   where   Σ_{j=1}^{k} mj = n
Once we have found an eigenvalue, then the corresponding eigenvector can be found by substi-
tuting this value for λ in the eigenvector equation and solving for ~e.
1. You should realise that eigenvalues can in general be complex numbers (because they are
the roots of a polynomial), although in this course the examples will be chosen such that
they are always real.
4. If A and B are of order n and A is a non-singular matrix then A−1 BA and B have the
same eigenvalues.
6. The product of all the eigenvalues of a matrix is equal to the determinant of that matrix.
7. The constant term in the characteristic polynomial is also equal to the determinant, and by the previous property equal to the product of all the eigenvalues.
Proof: det(A − λI) = Π_{i=1}^{n} (λi − λ)  →(λ→0)  det(A) = Π_{i=1}^{n} λi
8. To solve the characteristic equation which is often more complicated than a quadratic equa-
tion, the eigenvalues (which for this course are often small integers), can be guessed from
the constant term. e.g. if the constant term is 6, then try if ±1, ± 2, ± 3, ± 6 are roots.
9. If you have found an eigenvalue λi , then you can divide the characteristic polynomial by
(λi − λ), to obtain a simpler problem for the remaining eigenvalues. You do this by long
division (see example below).
10. If a matrix has a row or a column that is all 0, except for a number on the diagonal, then
that number is an eigenvalue.
11. If a matrix is a triangular matrix, then the numbers on the diagonal are the eigenvalues.
Example: Find the eigenvalues and eigenvectors of
[ 1 3 ]
[ 4 2 ]
The characteristic equation is

det[ 1−λ    3  ] = 0
   [  4   2−λ  ]

Evaluating the determinant, this is

(1 − λ)(2 − λ) − 3 · 4 = 0  →  λ² − 3λ − 10 = 0  →  (λ − 5)(λ + 2) = 0
so that the solutions are λ = −2, 5. To find the eigenvectors, we consider each eigenvalue
separately:
First, for λ = 5 we try eigenvector ~e5 = (x, y)T:

[ −4  3 ] [ x ]   [ 0 ]
[  4 −3 ] [ y ] = [ 0 ]   →  4x = 3y  →  ~e5 = (3, 4)T  →  ê5 = (3/5, 4/5)T

Note that the second equation was simply the negative of the first. Such redundancies will always occur in the solution of these eigenvector equations.
Similarly, for λ = −2 we try eigenvector ~e−2 = (x, y)T and find ê−2 = (−1/√2, 1/√2)T for the normalised eigenvector.
Example: Find the eigenvalues of the matrix

    [ 2 3 1 ]
A = [ 3 1 2 ]
    [ 1 2 3 ]

The characteristic equation works out to λ³ − 6λ² − 3λ + 18 = 0. Trying the divisors of the constant term, we test λ = 6 by long division.
So we see that 6 is indeed a root (the rest is 0), and that λ³ − 6λ² − 3λ + 18 divided by (λ − 6) is λ² − 3 (which is a quadratic equation for which we can easily find the roots), so

λ³ − 6λ² − 3λ + 18 = (λ − 6)(λ² − 3) = (λ − 6)(λ − √3)(λ + √3)

and the eigenvalues are ±√3 and 6.
8.7.2 Eigenspaces
Note that sometimes, it may be that there is not just a single vector which is an eigenvector
for a matrix. It may be that there is a whole space of such vectors (this typically only happens
for eigenvalues with a multiplicity greater than 1).
Example:
Find the eigenvalues and eigenvectors of the matrix
[ 0 0 −2 ]
[ 1 2  1 ]
[ 1 0  3 ]
The characteristic equation is
λ3 − 5λ2 + 8λ − 4 = 0
We need to solve this equation. Since the constant in the polynomial is 4, the integer roots could be ±1, ±2, ±4. Substituting λ = 1 in the polynomial, we see that this is a solution.
Division by (λ − 1) gives: λ² − 4λ + 4 = (λ − 2)², so λ = 2 has multiplicity 2.
Let's try to find the eigenvectors:
         [ −1 0 −2 ] [ x ]   [ 0 ]
λ = 1:   [  1 1  1 ] [ y ] = [ 0 ]   →  ~e1 = s (2, −1, −1)T
         [  1 0  2 ] [ z ]   [ 0 ]
For λ = 2:

[ −2 0 −2 ] [ x ]   [ 0 ]
[  1 0  1 ] [ y ] = [ 0 ]
[  1 0  1 ] [ z ]   [ 0 ]

As we can see, if we set x = s, then we have the condition z = −s. However, y is left free, so we can pick y = t. This means that any vector ~e2 in the space

s (1, 0, −1)T + t (0, 1, 0)T

is an eigenvector for λ = 2. Such a whole space of eigenvectors is called the eigenspace of the eigenvalue.
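The eigenvalues and eigenvectors of this example can be confirmed numerically. This sketch is not part of the original notes; NumPy returns normalised eigenvectors as the columns of the second output of `np.linalg.eig`:

```python
import numpy as np

A = np.array([[0, 0, -2],
              [1, 2, 1],
              [1, 0, 3]], dtype=float)

lams, V = np.linalg.eig(A)   # columns of V are (normalised) eigenvectors

# Each column satisfies the eigenvector equation A e = lambda e:
for lam, e in zip(lams, V.T):
    assert np.allclose(A @ e, lam * e)

# The eigenvalues found above: 1, and 2 with multiplicity 2
assert np.allclose(np.sort(lams.real), [1, 2, 2])
```

Note that the eigenvectors NumPy returns for the doubly degenerate eigenvalue 2 span the same two-dimensional eigenspace as s(1, 0, −1)T + t(0, 1, 0)T, even if the individual columns differ from our choice of s and t.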
8.8 Diagonalization
To diagonalise an n × n matrix A:
1. Find n linearly independent eigenvectors ~p1, .., ~pn of A.
2. Form the matrix P that has ~p1, .., ~pn as its columns.
3. The matrix D = P−1AP will then be diagonal with λ1, .., λn as its successive diagonal entries, where λi is the eigenvalue corresponding to ~pi.
The reason that this construction diagonalises A is as follows. Consider that the vectors ~p1, .., ~pn are eigenvectors of A. That is

A~p1 = λ1~p1
  :
A~pn = λn~pn

If we define the matrix P = (~p1 .. ~pn), then the above set of equations can be succinctly written as

AP = PD

where D is the diagonal matrix

    [ λ1 ..  0 ]
D = [  :  .  : ]
    [  0 .. λn ]
Provided that P is invertible, we can therefore express A as
A = PDP−1
Example: Find a matrix P that diagonalises
    [ 0 0 −2 ]
A = [ 1 2  1 ]
    [ 1 0  3 ]
This is the same matrix that we have previously found the eigenvalues and eigenvectors for,
namely
λ = 2:   ~p1 = (−1, 0, 1)T,   ~p2 = (0, 1, 0)T                 (17)
λ = 1:   ~p3 = (−2, 1, 1)T                                     (18)

so we take

    [ −1 0 −2 ]
P = [  0 1  1 ]
    [  1 0  1 ]
Note that if the dimension of the eigenspace of any of the eigenvalues is smaller than the
multiplicity of that eigenvalue, then we cannot find enough linearly independent eigenvectors,
so P is not invertible and A is not diagonalisable.
Example: Diagonalise
A = [ 0 1 ]
    [ 0 0 ]
We find that the only eigenvalue of this matrix is λ = 0. It has multiplicity 2, but the corresponding eigenspace

[ 0 1 ] [ x ]   [ 0 ]
[ 0 0 ] [ y ] = [ 0 ]   →  y = 0  →  ~e0 = s (1, 0)T

is only one-dimensional, so this matrix cannot be diagonalised.
Very often, we need to compute the power of a matrix, Ap = AA..A (A multiplied by itself p
times).
If the matrix is diagonalisable, then the powers of the matrix are readily related to powers of
the eigenvalues, since
Ap = (PDP−1)^p = PDP−1 PDP−1 · · · PDP−1   (p times)  = P Dp P−1

since all the interior P−1P factors cancel. The p-th power of a diagonal matrix is simply

     [ λ1^p ..   0  ]
Dp = [  :    .   :  ]
     [  0   .. λn^p ]
Although we will not prove it here, it turns out that we can also use this to calculate any function of a diagonalisable matrix (provided that we can calculate this function on all the eigenvalues):

f(A) = P f(D) P−1 = P [ f(λ1) ..    0   ] P−1
                      [   :    .    :   ]
                      [   0   .. f(λn) ]

Note that this is not the same as applying f to each element of A: for almost all functions (except multiplication with a scalar, or adding a scalar) and all matrices (except diagonal ones), applying f element-wise is just wrong!
Now,

      [  1 0  2 ]
P−1 = [  1 1  1 ]
      [ −1 0 −1 ]
so that, using A² = PD²P−1,

     [ −1 0 −2 ] [ 2²  0  0  ] [  1 0  2 ]
A² = [  0 1  1 ] [ 0  2²  0  ] [  1 1  1 ]                    (19)
     [  1 0  1 ] [ 0   0  1² ] [ −1 0 −1 ]

     [ −1 0 −2 ] [  4 0  8 ]   [ −2 0 −6 ]
   = [  0 1  1 ] [  4 4  4 ] = [  3 4  3 ]                    (20)
     [  1 0  1 ] [ −1 0 −1 ]   [  3 0  7 ]
One can check by explicit calculation of AA that this is the correct result. For such small
powers of A, the usefulness of this method may not be apparent. However, computing A100 is
just as easy, since it involves computing only the diagonal matrix to this power.
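The whole computation above can be replayed numerically. This sketch is not part of the original notes; it uses the P and D of the diagonalisation example, and only the diagonal matrix is ever raised to a power:

```python
import numpy as np

A = np.array([[0, 0, -2],
              [1, 2, 1],
              [1, 0, 3]], dtype=float)

P = np.array([[-1, 0, -2],      # eigenvectors from (17) and (18) as columns
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
D = np.diag([2.0, 2.0, 1.0])

# A^2 = P D^2 P^-1; the same trick gives A^100 just as cheaply, since only
# the diagonal entries of D need to be raised to the power.
A2 = P @ D**2 @ np.linalg.inv(P)
assert np.allclose(A2, A @ A)
assert np.allclose(A2, [[-2, 0, -6], [3, 4, 3], [3, 0, 7]])
```

Note that `D**2` squares the diagonal matrix element-wise, which for a diagonal matrix coincides with the matrix power.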
9 Vector analysis
If we combine vectors with analysis (differentiation, integration etc.) we have the field of
mathematics known as vector analysis. In the second year you will do a lot more on this
subject, but in this course you just get a little taste...
The simplest example is the straight line:

~x = ~a + t~u
where ~a is some point on the line, and ~u points along the direction of the line. This is an
example of a parameterised curve. Other simple examples include the circle, which has the
following parametric representation
[ x1 ]   [ c1 ]       [ cos θ ]
[ x2 ] = [ c2 ] + r [ sin θ ]
where ~c = (c1, c2)T is the centre of the circle, and r is its radius. The parameter that traces out the curve is here θ. To see that this corresponds to our previous definition of a circle, let's calculate the squared distance to the centre:
|~x − ~c|2 = r 2 (cos2 θ + sin2 θ) = r 2
The helix The helix is defined parametrically in the form (see figure)

~x(t) = (cos t, sin t, t)T
(figure: the helix (cos t, sin t, t)T plotted in 3-D, with x and y running between −1 and 1 and z increasing along the curve)
Find the unit tangent vector to the helix (cos t, sin t, t)T .
It is often necessary to think of quantities that can be defined at each point in space. For
example, if ~x is the position vector of a point in the atmosphere, then the temperature at that
point, T (~x) would be an example of a scalar field. That is, for each point in the space, we can
associate a scalar value, namely here the temperature.
A vector field is defined in a similar manner. For example, in the atmosphere, we may be interested to know the local wind direction ~w at any point ~x. This would be an example of a vector field, ~w(~x). An example of a 2-dimensional vector field is given in the following figure.
9.3 Surfaces
A surface can be defined implicitly through a scalar field. Consider, e.g.,

φ(~x) = z + x² + y²

The constraint

φ(~x) = c

then defines a surface: the set of all points ~x for which φ(~x) equals the constant c. Here it can be solved for z explicitly:
Figure 3: An example of a two dimensional vector field. At each point in the space, a
vector is defined, here a unit vector representing the wind direction. Note that
in general, the vectors need not have the same length.
z = c − x2 − y 2.
This surface is plotted in the figure below for a value of c = 10. Note that the effect of choosing
a different value of c would, in this example, simply shift the surface up or down the Z axis.
Figure 4: A two dimensional surface in a three dimensional space. The surface is the
quadratic function z = c − x2 − y 2 .
Given a scalar field we are interested in finding the direction in which the function increases
the fastest. It turns out that this direction can easily be calculated using derivatives, and at
any point ~x in space the vector pointing in the direction of fastest increase in φ(~x) is known as
the gradient of φ in ~x. The formula of the gradient (without derivation) is given by:

          [ ∂φ(~x)/∂x1 ]
∇φ(~x) = [     :      ]
          [ ∂φ(~x)/∂xn ]

where ∂φ(~x)/∂xi means the derivative of φ(~x) with respect to xi, where we treat all the other components of ~x as constants.
Examples:
• Calculate the gradient of φ(~x) = x² + xy² − z at the point ~x1 = (−1, 1, 50)T.

          [ 2x + y² ]                  [ −1 ]
∇φ(~x) = [   2xy   ]   →  ∇φ(~x1) = [ −2 ]
          [   −1    ]                  [ −1 ]

The gradient vector thus points in the direction (−1, −2, −1)T.
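The partial derivatives in this example can be checked against a finite-difference approximation. This sketch is not part of the original notes; the function names are our own:

```python
import numpy as np

def phi(x):
    # the scalar field from the example above
    return x[0]**2 + x[0] * x[1]**2 - x[2]

def grad_phi(x):
    # analytic gradient: (2x + y^2, 2xy, -1)
    return np.array([2 * x[0] + x[1]**2, 2 * x[0] * x[1], -1.0])

x1 = np.array([-1.0, 1.0, 50.0])

# central finite differences: (phi(x + h e_i) - phi(x - h e_i)) / (2h)
h = 1e-6
num = np.array([(phi(x1 + h * e) - phi(x1 - h * e)) / (2 * h)
                for e in np.eye(3)])
assert np.allclose(num, grad_phi(x1), atol=1e-5)
```

At ~x1 = (−1, 1, 50)T both the analytic and the numerical gradient give (−1, −2, −1)T, the direction of fastest increase of φ.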