Birmingham B4 7ET
www.ncrg.aston.ac.uk/~vanmourj/Welcome.html
Contacts:
Office: MB317C Main Building
Hours: On appointment (by email)
Telephone: (0121) 204 3657 (internal 3657)
e-mail: j.p.neirotti@aston.ac.uk
AM10VA: Vector Algebra and Geometry

Contents
1.1 Scalars ............................................... 6
1.2 Vectors ............................................... 6
3.3 Basis ................................................. 10
5 Linear Objects ......................................... 16
5.1 Intersections ........................................ 18
5.1.1 2 lines in 3D ..................................... 18
5.1.3 2 planes in 4D .................................... 20
6.1 Using the vector product for the normal equation of a plane ... 25
7 Geometry ............................................... 30
8.3.1 Scaling ........................................... 47
8.3.2 Reflection ........................................ 47
8.3.3 Projection ........................................ 47
8.3.4 Rotation .......................................... 48
8.4 Determinants ......................................... 49
8.7.2 Eigenspaces ....................................... 59
8.8 Diagonalization ...................................... 61
9 Vector analysis ........................................ 64
9.3 Surfaces ............................................. 65
1.1 Scalars
A scalar is a quantity that has only a magnitude and is not related to any definite direction in space. A scalar is completely specified by a single number, e.g. 1, π, −7.243623 are scalars.
Notation: In this course, we will always denote a scalar by a lower case symbol, e.g. a, b, x, r, λ, ...
1.2 Vectors
A vector is an object that has both magnitude (length) and a definite direction in space.
It can most easily be pictured as an arrow.
Notation: In this course, we will always denote a vector with an arrow on top, e.g. ~a, ~b, ~r.
Note that some textbooks use other notations, such as boldface r or underscore r.
We denote the length of a vector ~a by |~a|, which is always non-negative, i.e. |~a| ≥ 0.
Note that the length of a vector is sometimes also called the norm of that vector.
• A vector of length zero is called the zero vector and is denoted by ~0.
This is the only vector that does not have a definite direction in space.
We note that two vectors ~a and ~b are equal if and only if they have the same direction
and the same length.
2. Then we choose a special point in that space from which we start measuring, which we call
the origin, and which we denote by O.
[Figure: the origin O and two points Q and R, with position vectors OQ = ~q and OR = ~r; the vector from R to Q is RQ = ~q − ~r.]
We already know several operations on scalars, such as addition (a + b), multiplication (ab),
division (a/b), absolute value (|a|) etc. Furthermore, we know several properties of these
operations, e.g.:
• a + b = b + a, a + (b + c) = (a + b) + c
• a(b + c) = ab + ac
• −a = (−1)a
We note that the result of these operations between scalars (if defined) is always again a scalar.
Now we are going to introduce operations between scalars and vectors. As we will see, the result of these operations can be either a scalar or a vector, so it is very important that you know what to expect.
1. We have already encountered one operation: taking the length of a vector. The length of
a vector ~a is denoted by |~a|, and the result is a scalar.
Properties:
(a) |~a| ≥ 0
2. We can add vectors together, and the result is a vector, e.g. ~r = ~a + ~b. Graphically
we do this by putting the starting point of ~b in the endpoint of ~a. The vector ~r is then the
vector that connects the starting point of ~a with the endpoint of ~b.
Obviously, we can put an arbitrary number of vectors one behind the other and connect
the starting point of the first with the endpoint of the last, and the order doesn’t matter.
Hence we have the following properties:
3. We can multiply a scalar λ with a vector ~a, giving the result ~r = λ~a, which is a vector.
~r has the same direction as ~a but a different length. If |λ| > 1 then ~r is longer than ~a, if
0 < |λ| < 1 then ~r is shorter than ~a, and if λ < 0, ~r points in the opposite direction from ~a.
We have the following properties:
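As a quick numerical illustration (a sketch that is not part of the notes, assuming the NumPy library is available), the properties of vector addition and scalar multiplication can be checked by representing vectors as arrays:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 1.0])
lam = -2.0

# Addition is commutative: a + b = b + a.
assert np.allclose(a + b, b + a)

# Scaling by lambda changes the length by a factor |lambda| ...
r = lam * a
assert np.isclose(np.linalg.norm(r), abs(lam) * np.linalg.norm(a))

# ... and a negative lambda reverses the direction (cosine of the angle is -1).
cos_theta = (r @ a) / (np.linalg.norm(r) * np.linalg.norm(a))
assert np.isclose(cos_theta, -1.0)
```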
Given a set of N vectors {~a1, ~a2, .., ~aN} and N scalars {c1, .., cN}, the vector ~r = c1~a1 + c2~a2 + .. + cN~aN is called a linear combination of those vectors.
The set of N vectors {~a1, ~a2, .., ~aN} is called linearly independent if the equations for {c1, .., cN}: c1~a1 + c2~a2 + .. + cN~aN = ~0 only have the trivial solution c1 = c2 = .. = cN = 0.
The set of N vectors {~a1, ~a2, .., ~aN} is said to span a certain space if any vector ~r in that space can be written as a linear combination of those vectors: ~r = c1~a1 + .. + cN~aN.
It could be that there are many possible values for the {c1, .., cN}.
If we deal with a space of dimension N, we need at least N vectors to span the space.
3.3 Basis
1. If we have a set of exactly N linearly independent vectors, this set automatically spans the
space.
2. If we have a set of exactly N vectors that spans the space, then this set is automatically
linearly independent.
3. In that case the set of N vectors (which is linearly independent and spans the space) is called
a basis for this space S.
4. Any vector in the space S can be written as a unique linear combination of the basis
vectors.
6. For an orthonormal basis all the basis vectors have unit length, and each basis vector is
orthogonal (⊥) to all the others.
Suppose that we have a space S of dimension N, and that we have decided on:
1. an origin O (with position vector ~0),
2. a basis {~e1, .., ~eN},
then any point R in the space can uniquely be represented by the vector:
~r = OR = ~0 + r1~e1 + r2~e2 + .. + rN~eN = r1~e1 + r2~e2 + .. + rN~eN.
The set of scalars {r1 , r2 , .., rN } are called the coordinates of ~r in that basis.
We now use the following notation for the vector ~r:
~r = (r1, · · · , rN)T
which is called the coordinate notation of ~r, and which only has meaning if the basis is defined!
Since the coordinates are unique for any given vector, this means that two vectors ~a and ~b are
equal if and only if each coordinate of ~a is equal to the corresponding coordinate of ~b. Hence in
N dimensions a vector equation corresponds to N simultaneous equations for the coordinates:
~a = ~b ⇔ (a1, .., aN)T = (b1, .., bN)T ⇔ a1 = b1, .., aN = bN
In principle any set of vectors that is spanning and linearly independent could be used as a basis. However, if everybody used a different basis, it would be very hard for people to exchange (position) vectors by their coordinates. Therefore, there are standard bases that everybody uses (unless explicitly stated otherwise). These standard bases are orthonormal and so-called right handed.
In 3D, the three (unit) basis vectors are {î, ĵ, k̂}, and the components of a vector ~r = (r1, r2, r3)T = r1î + r2ĵ + r3k̂ are often called the x-, y- and z-components of ~r.
We have that:
3. if you put the basis vectors in order î, ĵ, k̂, î, ĵ, k̂, .., and if your right index finger points in the direction of any basis vector and your second finger points in the direction of the next basis vector, then your extended thumb points in the direction of the one after that. This is known as the right hand rule.
The standard basis vectors are given by î = (1, 0, 0)T , ĵ = (0, 1, 0)T and k̂ = (0, 0, 1)T in
coordinate notation.
If you want to plot the 3D basis in two dimensions, you use the following convention:
- k̂ points in the positive vertical direction (z-axis),
- ĵ points towards the right (y-axis)
- î points towards you (out of the paper), but you draw this in perspective by pointing it at 45
degrees to the left bottom (x-axis).
[Figure: the 3D basis drawn in the plane (k̂ up, ĵ to the right, î towards the viewer at 45 degrees to the lower left), and the 2D basis (ĵ up, î to the right).]
In 2D, the standard basis vectors are denoted by {î, ĵ}, and we use the following convention:
- î points towards the right (x-axis)
- ĵ points up (y-axis).
The standard basis vectors are given by î = (1, 0)T and ĵ = (0, 1)T in coordinate notation.
Linear independence:
The set {(1, 2, 3)T, (1, 0, 0)T, (0, 1, 0)T} is linearly independent:
r (1, 2, 3)T + s (1, 0, 0)T + t (0, 1, 0)T = (0, 0, 0)T → r + s = 0, 2r + t = 0, 3r = 0 → r = s = t = 0
The set {(1, 2, 3)T, (1, 0, 0)T, (−1, 4, 6)T} is linearly dependent:
r (1, 2, 3)T + s (1, 0, 0)T + t (−1, 4, 6)T = (0, 0, 0)T → r + s − t = 0, 2r + 4t = 0, 3r + 6t = 0 → r = −2t, s = 3t
so there are non-trivial solutions, e.g. t = 1, r = −2, s = 3.
Spanning vectors:
The set {(1, 2, 3)T, (1, 0, 0)T, (0, 1, 0)T} spans the space:
r (1, 2, 3)T + s (1, 0, 0)T + t (0, 1, 0)T = (x, y, z)T → r + s = x, 2r + t = y, 3r = z → r = z/3, s = x − z/3, t = y − 2z/3
Whatever the values of x, y, z, we find a solution.
The set {(1, 2, 3)T, (1, 0, 0)T, (0, 4, 6)T} does not span the space:
r (1, 2, 3)T + s (1, 0, 0)T + t (0, 4, 6)T = (x, y, z)T → r + s = x, 2r + 4t = y, 3r + 6t = z
The last two equations force 3y = 2z (since 3(2r + 4t) = 2(3r + 6t)), so a vector ~r = (x, y, z)T for which 3y ≠ 2z cannot be written as a linear combination of the set.
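The worked examples above can be reproduced with a rank computation: a set of vectors is linearly independent exactly when the matrix with those vectors as columns has full column rank. A sketch (not part of the notes, assuming NumPy):

```python
import numpy as np

independent = np.column_stack([(1, 2, 3), (1, 0, 0), (0, 1, 0)])
dependent = np.column_stack([(1, 2, 3), (1, 0, 0), (-1, 4, 6)])
no_span = np.column_stack([(1, 2, 3), (1, 0, 0), (0, 4, 6)])

assert np.linalg.matrix_rank(independent) == 3  # independent, so it spans 3D
assert np.linalg.matrix_rank(dependent) == 2    # (-1,4,6) = 2(1,2,3) - 3(1,0,0)
assert np.linalg.matrix_rank(no_span) == 2      # only spans a plane

# A vector with 3y != 2z, e.g. (0, 1, 0), lies outside that plane:
extended = np.column_stack([(1, 2, 3), (1, 0, 0), (0, 4, 6), (0, 1, 0)])
assert np.linalg.matrix_rank(extended) == 3
```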
Show that any vector has unique coordinates in a basis {~e1 , .., ~eN }.
Since the basis vectors span the space, any vector ~r can be written as ~r = r1~e1 +r2~e2 +..+rN ~eN .
Suppose now that ~r = r1~e1 + r2~e2 + .. + rN ~eN and that also ~r = x1~e1 + x2~e2 + .. + xN ~eN .
Then ~r − ~r = (r1 − x1 )~e1 + (r2 − x2 )~e2 + .. + (rN − xN )~eN = ~0.
Since the basis vectors are linearly independent this means that the only solution is that all the
coordinates are 0, hence r1 = x1 , .., rN = xN , in other words, the coordinates of ~r are unique.
Consider two vectors ~a and ~b. The scalar or dot product of these two vectors is defined as ~a · ~b = |~a||~b| cos(θ), where θ is the angle between them.
3. ~a · ~b = ~b · ~a (commutative law)
5. If ~a ⊥ ~b (orthogonal), then ~a · ~b = 0.
We can see how the dot product works in an orthonormal basis on the coordinates of the
vectors by applying the definition and properties to the basis vectors.
Given an orthonormal basis {ê1, .., êN}, we can write ~a = (a1, a2, .., aN)T, ~b = (b1, b2, .., bN)T and the dot product is then given by
~a · ~b = a1b1 |ê1|² + a2b2 |ê2|² + .. + aNbN |êN|² + a1b2 ê1 · ê2 + .. = a1b1 + .. + aNbN = Σ_{i=1}^{N} ai bi
We can now use this to calculate (the cosine of) the angle between the vectors
cos(θ) = (~a · ~b) / (|~a| |~b|) = (Σ_{i=1}^{N} ai bi) / (|~a| |~b|)
[Figure: decomposition of ~a into a component ~a∥ along ~b and a component ~a⊥ orthogonal to ~b, with θ the angle between ~a and ~b.]
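The cosine formula above translates directly into code; this small helper (the function name is ours, not from the notes; assumes NumPy) returns the angle between two vectors:

```python
import numpy as np

def angle(a, b):
    """Angle between vectors a and b, via cos(theta) = a.b / (|a||b|)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    c = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # clip guards against tiny rounding errors pushing c outside [-1, 1]
    return np.arccos(np.clip(c, -1.0, 1.0))

assert np.isclose(angle((1, 0, 0), (0, 1, 0)), np.pi / 2)  # orthogonal vectors
assert np.isclose(angle((1, 1, 0), (1, 0, 0)), np.pi / 4)
```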
Let î, ĵ, k̂ be the standard orthonormal basis vectors. If ~r = (x, y, z)T is a non-zero vector,
then the angles α, β, γ between ~r and the vectors î, ĵ, k̂ respectively are called the direction
angles of ~r, and the numbers cos α, cos β, cos γ are called the direction cosines of ~r.
Using the dot product it is easy to see that the direction cosines are nothing else than the coordinates of the unit vector r̂ in the direction of ~r, e.g.
cos(α) = (~r · î) / (|~r| |î|) = (~r / |~r|) · î = r̂ · î
[Figure: the direction angles α, β, γ between ~r and the basis vectors î, ĵ, k̂.]
5 Linear Objects
First, we must decide on the dimension of the space we work in, and we’ll call the dimension
N. This determines the number of basis vectors, and hence the number of components of any
vector in it.
A linear object S in that space is an object that is straight or flat. It is called a linear object
because any point in it can be obtained by a linear combination of vectors. In general a linear
object is fully determined by a point ~a in it, and a set of M ≤ N linearly independent vectors
{~u1 , .., ~uM } parallel to it.
A linear object also has a well defined dimension. This dimension is equal to the maximum
number of linearly independent vectors parallel to it, i.e. M.
Examples are:
2. A line has dimension M = 1, and ~r = ~a + t~u, where ~a is any point on the line and ~u is a
vector parallel to the line.
3. A plane has dimension M = 2, and ~r = ~a + s~u + t~v , where ~a is any point in the plane and
~u, ~v are two linearly independent vectors parallel to the plane.
• Note that S forms an M-dimensional space itself. ~a can be considered as its origin and
{~u1, .., ~uM} as its basis vectors.
1. (~a, {~u1 , .., ~uM }), where ~a is the position vector of a point in the object, and {~u1 , .., ~uM } is a
set of M linearly independent vectors parallel to the object.
The position vector ~r = (x1 , .., xN )T of any point in the object can then be written as:
~r = ~a + t1~u1 + .. + tM ~uM
2. (~a, {~n1, .., ~nN−M}), where ~a is the position vector of a point in the object, and {~n1, .., ~nN−M}
is a set of N − M linearly independent vectors orthogonal to the object.
The position vector ~r = (x1 , .., xN )T of any point in the object then satisfies the following
(set of) equation(s):
~ni · ~r = ~ni · ~a ∀i = 1, .., N − M
These are known as the normal ( or scalar) equation(s) of the object.
Since ~r and ~a both lie in the object, this means that the vector going from ~a to ~r, i.e.
~r − ~a is parallel to the object. So it must be orthogonal to all of the normal vectors. Hence
(~r − ~a) · ~ni = 0 ∀i = 1, .., N − M.
• We can go from the vector equation of an object to the normal equations of the object by
elimination of the parameters {t1 , .., tM }.
Example: Consider the plane (2-D) in 4-D space given by ~a = (1, 1, 0, 0)T, {~u1 = (1, 0, 1, 0)T, ~u2 = (0, 1, 1, 1)T}. Then, writing ~r = (w, x, y, z)T, we have
~r = ~a + t1~u1 + t2~u2 → w = 1 + t1, x = 1 + t2, y = t1 + t2, z = t2
→ t1 = w − 1, t2 = x − 1 → y = w + x − 2, z = x − 1 → w + x − y = 2, x − z = 1
indeed 2 equations. We can then also immediately read off two orthogonal vectors to the plane
from the equations, namely ~n1 = (1, 1, −1, 0)T, ~n2 = (0, 1, 0, −1)T, and we can check that
~n1 · ~a = 2 and ~n2 · ~a = 1.
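This example can be verified numerically: each normal vector must be orthogonal to both direction vectors, and every point of the plane must satisfy the two normal equations. A sketch (assuming NumPy):

```python
import numpy as np

a = np.array([1, 1, 0, 0])
u1 = np.array([1, 0, 1, 0])
u2 = np.array([0, 1, 1, 1])
n1 = np.array([1, 1, -1, 0])
n2 = np.array([0, 1, 0, -1])

# The normals are orthogonal to both direction vectors ...
for n in (n1, n2):
    assert n @ u1 == 0 and n @ u2 == 0

# ... and every point a + t1*u1 + t2*u2 satisfies the normal equations:
for t1, t2 in [(0, 0), (2, -1), (-3, 5)]:
    r = a + t1 * u1 + t2 * u2
    assert n1 @ r == n1 @ a == 2   # w + x - y = 2
    assert n2 @ r == n2 @ a == 1   # x - z = 1
```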
5.1 Intersections
• The intersection of two objects S1 and S2 is defined as all the points that lie both in S1
and S2 . This means that the points must satisfy all the equations of both objects.
• If S1 and S2 are both linear objects then the intersection, if it exists, is also a linear object.
Its dimension is less than or equal to the smaller of the dimensions of S1 and S2, and the
intersection can even be empty if S1 and S2 have no points in common.
• The only sure way of determining the dimension of the intersection is by working out how
many linearly independent equations the intersection has, especially in higher dimensional
spaces where your intuition may easily lead you astray!
• Since each hyperplane corresponds to one normal equation and vice versa, any M-D linear
object S in an N-D space (with M < N) may be considered as the intersection of N − M
linearly independent hyperplanes, each contributing one of the N − M linearly independent
normal equations of S.
Examples:
5.1.1 2 lines in 3D
~r = ~a1 + t1~u1 .
~r = ~a2 + t2~u2 .
Setting ~a1 + t1~u1 = ~a2 + t2~u2 gives, componentwise:
1 = 1 + t2 → t2 = 0
0 = 2 + t2 → 0 = 2
1 + t1 = 3 → t1 = 2
The second equation is a contradiction (0 = 2), so the system has no solution: the lines do not intersect.
1 + t1 = t2 → t2 = 1 + t1
1 + 2t1 = −1 + 2t2 → 1 + 2t1 = 1 + 2t1
1 + 3t1 = −2 + 3t2 → 1 + 3t1 = 1 + 3t1
The last two equations hold for every t1, so there are infinitely many solutions: the two lines coincide.
~r = ~a1 + t1~u1.
1.
~r = ~a1 + t1~u1 = ~a2 + s2~u2 + t2~v2
We have to solve this for t1 , s2 and t2 .
2.
(~a1 + t1~u1 ) · ~n2 = ~a2 · ~n2
We have to solve this for t1 .
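Method 2 is easy to implement: substituting the line into the normal equation of the plane gives one linear equation for t1. A sketch (the function name is ours, not from the notes; assumes NumPy):

```python
import numpy as np

def line_plane_intersection(a1, u1, a2, n2):
    """Intersect the line r = a1 + t*u1 with the plane n2.r = n2.a2.
    Returns the intersection point, or None if the line is parallel."""
    a1, u1, a2, n2 = (np.asarray(v, float) for v in (a1, u1, a2, n2))
    denom = n2 @ u1
    if np.isclose(denom, 0.0):
        return None                       # line parallel to the plane
    t = (n2 @ (a2 - a1)) / denom          # solves (a1 + t*u1).n2 = a2.n2
    return a1 + t * u1

# A vertical line through (0, 0, 5) meets the plane z = 0 at the origin:
p = line_plane_intersection((0, 0, 5), (0, 0, 1), (0, 0, 0), (0, 0, 1))
assert np.allclose(p, (0, 0, 0))
```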
5.1.3 2 planes in 4D
Plane S1 is given by a point ~a and two normal vectors ~n1, ~n2 (its normal equations); plane S2 is given by a point ~b and two direction vectors ~u1, ~u2 (its vector equation). Hence we have the (vector) equations:
~r · ~n1 = ~a · ~n1
~r · ~n2 = ~a · ~n2
~r = ~b + t1~u1 + t2~u2
Substituting the vector equation of S2 into the normal equations of S1,
(~b + t1~u1 + t2~u2) · ~n1 = ~a · ~n1
(~b + t1~u1 + t2~u2) · ~n2 = ~a · ~n2
gives two linear equations for t1 and t2. In this example the solution is t1 = −1, t2 = 0, so the two planes intersect in the single point ~r = ~b − ~u1.
Definitions:
1. The (orthogonal) projection of a point ~r on an object S is the point ~p~r in the object that
is closest to ~r.
2. The reflection of a point ~r with respect to an object S is the point ~R~r exactly opposite ~r
with respect to ~p~r. This means that it lies on the line through ~r and ~p~r, but on the other
side of ~p~r, therefore ~R~r = 2~p~r − ~r. The easiest way to obtain it is first to determine ~p~r, and
then to use the vector equation above.
First we note the property that the vector ~r − ~p~r is always orthogonal to S at the point ~p~r.
This means that ~p~r lies in S⊥: the linear object ⊥ S containing ~r.
[Figure: the projection ~p~r of a point ~r onto a linear object S, shown both for a plane and for a line; the vector ~r − ~p~r is orthogonal to S at ~p~r.]
Examples:
Projection onto a hyperplane through ~a with normal ~n: write ~p~r = ~r + t~n; demanding ~p~r · ~n = ~a · ~n gives
t = (~a − ~r) · ~n / |~n|²
Projection onto a line through ~a with direction ~u: write ~p~r = ~a + t~u. It must also lie in the hyperplane through ~r and orthogonal to ~u: ~r · ~u = ~p~r · ~u = (~a + t~u) · ~u = ~a · ~u + t ~u · ~u. Hence
t = (~r − ~a) · ~u / |~u|²
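The two formulas for t translate into the following sketch (the function names are ours, not from the notes; assumes NumPy):

```python
import numpy as np

def project_onto_hyperplane(r, a, n):
    """p_r = r + t*n with t = (a - r).n / |n|^2."""
    r, a, n = (np.asarray(v, float) for v in (r, a, n))
    t = ((a - r) @ n) / (n @ n)
    return r + t * n

def project_onto_line(r, a, u):
    """p_r = a + t*u with t = (r - a).u / |u|^2."""
    r, a, u = (np.asarray(v, float) for v in (r, a, u))
    t = ((r - a) @ u) / (u @ u)
    return a + t * u

p = project_onto_hyperplane((1, 2, 3), (0, 0, 0), (0, 0, 1))  # plane z = 0
assert np.allclose(p, (1, 2, 0))
q = project_onto_line((1, 2, 3), (0, 0, 0), (1, 0, 0))        # the x-axis
assert np.allclose(q, (1, 0, 0))
```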
Consider the plane S through ~a = (0, 0, 0, 0)T and parallel to ~u1 = (1, 1, 1, 1)T, ~u2 = (1, 0, 1, 0)T, and the point ~r = (1, 2, 3, 4)T.
Definition: The projection of a line onto an object is the collection of points formed by the
projection of all the individual points of the line onto the object.
The result of the projection of a line onto a linear object will be a line lying in that ob-
ject (except when the line and object are orthogonal, then the projection will just be a point).
[Figure: a line L and its projection L′ onto a plane.]
Example:
Consider the line through ~a = (1, 1, 1, 1)T and parallel to ~u = (1, 0, 1, 0)T in 4D, and the hyperplane through ~b = (1, 2, 3, 4)T and ⊥ ~n = (1, 1, 1, 0)T.
The projection of any point ~r = (w, x, y, z)T onto the object is obtained by (see 5.2)
~p~r = ~r + s~n with s = (~b − ~r) · ~n / |~n|²
so s = (6 − w − x − y)/3, and
~p~r = (2 + (2w − x − y)/3, 2 + (−w + 2x − y)/3, 2 + (−w − x + 2y)/3, z)T
We only want the projection of points that lie on the line, so
~r = ~a + t~u = (1 + t, 1, 1 + t, 1)T → w = 1 + t, x = 1, y = 1 + t, z = 1
Substituting this gives s = (3 − 2t)/3 and
~p~r(t) = (2 + t/3, 2 − 2t/3, 2 + t/3, 1)T
which is indeed a line in the hyperplane (you can check that ~p~r(t) · ~n = ~b · ~n = 6, ∀t, i.e. ~p~r(t)
satisfies the equation of the hyperplane).
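A numerical check of this example (a sketch, not from the notes; assumes NumPy): every projected point satisfies the hyperplane equation, and equal parameter steps give equal steps in the image, i.e. the image is a line:

```python
import numpy as np

a = np.array([1, 1, 1, 1], float)   # point on the line
u = np.array([1, 0, 1, 0], float)   # direction of the line
b = np.array([1, 2, 3, 4], float)   # point in the hyperplane
n = np.array([1, 1, 1, 0], float)   # normal of the hyperplane

def proj(r):
    s = ((b - r) @ n) / (n @ n)
    return r + s * n

for t in (-2.0, 0.0, 1.0, 5.0):
    p = proj(a + t * u)
    assert np.isclose(p @ n, b @ n)  # p satisfies n.r = 6, so it lies in the hyperplane

# Equal parameter steps give equal steps in the image: the image is a line.
p0, p1, p2 = proj(a), proj(a + u), proj(a + 2 * u)
assert np.allclose(p2 - p1, p1 - p0)
```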
8. ~v × ~v = ~0
We first see how the cross product works on the standard orthonormal basis vectors:
1. î × î = ĵ × ĵ = k̂ × k̂ = ~0
Now we can see how the vector product works on the coordinates of vectors by applying the
definition and properties:
In this basis, we write ~a = (a1 , a2 , a3 )T , ~b = (b1 , b2 , b3 )T and the cross product is then given by
~a × ~b = a1b1 (î × î) + a1b2 (î × ĵ) + .. + a3b3 (k̂ × k̂) = (a2b3 − a3b2) î + (a3b1 − a1b3) ĵ + (a1b2 − a2b1) k̂
We can also use this to calculate (the sine of) the angle between the vectors
|~a × ~b|
sin(θ) =
|~a| |~b|
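Both identities, orthogonality of ~a × ~b to its factors and |~a × ~b| = |~a||~b| sin θ, can be checked numerically (a sketch, not from the notes; assumes NumPy):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.0, 2.0])
c = np.cross(a, b)

# The cross product is orthogonal to both factors ...
assert np.isclose(c @ a, 0) and np.isclose(c @ b, 0)

# ... and its length is |a||b| sin(theta):
cos_t = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
sin_t = np.sqrt(1 - cos_t**2)
assert np.isclose(np.linalg.norm(c), np.linalg.norm(a) * np.linalg.norm(b) * sin_t)
```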
6.1 Using the vector product for the normal equation of a plane
In the figure below, the vector product of ~v and ~u is the vector ~c (pointing out of the paper
towards you) of magnitude |~v||~u| sin θ, i.e. the area of the parallelogram formed by the two
vectors. This is because the “height” of the parallelogram (indicated by the dashed line) is
|~v| sin θ.
[Figure: the parallelogram formed by ~v and ~u, with the angle θ between them and the dashed height |~v| sin θ.]
The symbol ⊙ represents the tip of an arrow that is coming towards you, while the flight (rear)
of the arrow, the symbol ⊗, represents a vector directed into the page, away from you.
If ~a, ~b, ~c are three arbitrary vectors, then the scalar product of ~a × ~b with ~c is called the scalar
triple product of ~a, ~b, ~c, and is given by (~a × ~b) · ~c.
We have that ~a × ~b = (|~a||~b| sin θ)~p, where p~ is a unit vector perpendicular to both ~a and ~b. So,
(~a × ~b) · ~c = (|~a||~b| sin θ) ~p · ~c = (|~a||~b| sin θ) |~c| cos φ = area of base × orthogonal height
[Figure: the parallelepiped spanned by ~a, ~b, ~c, with θ the angle between ~a and ~b, and φ the angle between ~c and ~a × ~b.]
The absolute value (because cos(φ) could be negative) of the scalar triple product is hence the
volume of the parallelepiped described by the three vectors, or
Volume = |(~a × ~b) · ~c|
Note that if (~a × ~b) · ~c = 0, the volume of the parallelepiped would be zero. This would mean
that the three vectors must lie in the same plane, thus the scalar triple product also gives us a
test to check whether three 3D vectors are co-planar.
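The volume interpretation and the co-planarity test can be sketched as follows (not from the notes; assumes NumPy):

```python
import numpy as np

def triple(a, b, c):
    """Scalar triple product (a x b) . c."""
    return np.cross(a, b) @ np.asarray(c, float)

# Volume of the unit cube spanned by i, j, k:
assert np.isclose(triple((1, 0, 0), (0, 1, 0), (0, 0, 1)), 1.0)

# Three co-planar vectors (the third is the sum of the first two) give 0:
assert np.isclose(triple((1, 2, 0), (3, 1, 0), (4, 3, 0)), 0.0)
```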
In order to calculate the vector product it is not very practical to memorise the formula by
heart, or to deduce it every time from the basis vectors. Therefore we now introduce the notion
of a determinant, which is very important and will come back when we deal in more detail with
matrices.
• The diagonal elements of A are all elements for which i = j, i.e. a11, a22, .., aii, .., ann.
All the other elements, for which i ≠ j, are called off-diagonal elements.
• Each position is either even (+) or odd (-), depending whether the sum i + j of the row
index i and the column index j is even or odd. This only depends on the position of an
element and not on its sign or value!
+ − + · · ·
− + − · · ·
+ − + · · ·
· · · · · · · · ·
For all diagonal elements i + j = i + i = 2i, and hence they are in even positions.
• We can also consider the reduced matrix A′ij of (n − 1) × (n − 1) numbers that we obtain
by taking away the ith row and the jth column of A. E.g.
       | a21 a23 . . . a2n |
A′12 = | a31 a33 . . . a3n |
       |       · · ·       |
       | an1 an3 . . . ann |
A determinant is a number that is defined for any square matrix of (n × n) numbers, and to
calculate it we use a recursive procedure.
So:
Examples:
det(−2) = −2

| 1 −2 |
| 3 −3 | = +1 · det(−3) − (−2) · det(3) = −3 + 6 = 3,

| a11 a12 |
| a21 a22 | = a11a22 − a12a21

where we have expanded along the first row.
It is easy to remember that the determinant of a 2 × 2 matrix is the product of the diagonal
elements minus the product of the off-diagonal elements.
| 1  2  3 |
| 0 −1 −3 | = −(0) | 2 3 ; 3 4 | + (−1) | 1 3 ; 2 4 | − (−3) | 1 2 ; 2 3 | = 0 − (4 − 6) + 3(3 − 4) = −1
| 2  3  4 |
where we have expanded along the second row.
Make sure that you’re confident in calculating determinants, as we will use this a lot!
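The recursive procedure (expansion along the first row, with alternating signs and reduced matrices) can be written out directly; this plain-Python sketch is ours, not from the notes:

```python
def det(m):
    """Determinant by recursive expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # reduced matrix: remove row 1 and column j+1
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        sign = 1 if j % 2 == 0 else -1   # position (1, j+1) is even/odd
        total += sign * m[0][j] * det(minor)
    return total

assert det([[-2]]) == -2
assert det([[1, -2], [3, -3]]) == 3
assert det([[1, 2, 3], [0, -1, -3], [2, 3, 4]]) == -1
```

For large matrices this recursion is far too slow (n! terms); in practice one uses row reduction, but the recursive form matches the definition in the notes.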
Now that we know that a determinant is in principle only defined for square sets of numbers, we
will immediately abuse this definition to use it as an easy way to calculate the vector product
(which isn’t a number at all but a vector!):
If you want to calculate the vector product ~a ×~b you can do this by calculating the determinant
of the matrix with,
          | a1 a2 a3 |
~a × ~b = | b1 b2 b3 |
          | î  ĵ  k̂  |
Exercise: check this!
We can also calculate the scalar triple product easily, using determinants:
                | a1 a2 a3 |
(~a × ~b) · ~c = | b1 b2 b3 |
                | c1 c2 c3 |
Exercise: check this!
7 Geometry
So far, we have encountered only linear geometric objects such as points, lines, planes and
hyperplanes. They are called linear because:
1. they can all be expressed as linear combinations of vectors in the vector or parametric
notation, e.g.
~r = ~a + s~u + t~v
2. the (normal) equation(s), for the coordinates of a point in the object, that describe them
are all linear equations, e.g.
x + 2y − 3z = 1
3y + z = 3
The advantage of linear objects and linear equations is that any linear problem can be solved,
and that very good algorithms exist to deal with them, e.g. on a computer.
However, linear (i.e. straight and flat) objects are just a very small subset of all the geometric
objects that exist. If you just look around you, you will notice that most objects are actually
curved.
In general, objects are classified by the (highest) degree of the equations that describe them.
Linear objects have equations of degree 1, quadratic objects have equations of degree 2, cubic
objects have equations of degree 3, etc. Note that the dimension in which an object lives has
nothing to do with its degree. E.g. the curve x = y⁴ is a 2D object of degree 4, and a plane is an
object in 3D space of degree 1!
You can easily imagine that the higher the degree of an object, the harder it will be to solve
problems involving it. Therefore in this course we will restrict ourselves to linear and quadratic
objects only, but you have to keep in mind that there is a whole zoo of complicated geometric
objects out there.
As mentioned above, quadratic objects are described by quadratic equations. Maybe without
realising it, you are already familiar with a concept that involves quadratic equations of coordinates:
length or distance.
Indeed, most if not all quadratic objects can be expressed in terms of (squares) of distances.
The Ellipse
Let A and B be two distinct points, named the focal points of the ellipse. A point P on
the ellipse is such that the sum of the lengths AP and BP is constant, and greater than the
distance from A to B.
[Figure 1: An ellipse in standard form, that is, rotated so that its major axis lies along the X-axis; it shows the focal points A and B, the centre O, a point P on the ellipse, and the minor axis length b.]
The major axis of the ellipse is defined as the distance AO, as shown in fig. 1. The minor axis
has length b, as labelled in fig. 1.
In vector notation, a point ~r lies on the ellipse if
|~r − f~1| + |~r − f~2| = 2a
where a is the length of the major axis of the ellipse, and f~1, f~2 are the focal points.
example:
If the position vectors of the focal points of an ellipse are ~c and −~c and the length of the major
axis is 2a, show that the equation of the ellipse may be written as follows.
solution:
The equation for the ellipse is
|~r − ~c| + |~r + ~c| = 2a (1)
or, moving |~r + ~c| to the right-hand side and squaring,
~r² − 2~r · ~c + ~c² = 4a² − 4a|~r + ~c| + ~r² + 2~r · ~c + ~c²
From the previous example, with ~r = (r1, r2)T, a = |~a| and c = |~c|, and using the definition of
the minor axis length b, hence show that
r1²/a² + r2²/b² = 1 (8)
This is the standard form for the equation of an ellipse.
The Circle:
The circle is a special case of the ellipse, for which the 2 focal points coincide: f~1 = f~2 = ~c, i.e.
the centre point of the circle.
The defining property of a circle is that the distance from its centre ~c to a point ~r on the circle
is constant and equal to the radius r. (This is true also for the sphere and hypersphere in
higher dimensions.) Mathematically, this is written as
|~r − ~c|² = r²
where we have squared the distance to get rid of the cumbersome square root.
[Figure: the circle.]
The Hyperbola
A hyperbola is the locus of points such that the difference between the distances from the
point to the two given points (focal points) is constant.
[Figure: the hyperbola in standard form. A point P is such that the difference in the distances to the two focal points at c and −c is twice the distance a.]
This is the same definition as the ellipse, except that for the ellipse, it is the sum of the dis-
tances which is kept constant. To find an alternative expression for the hyperbola, we carry
out a similar derivation as that which we did for the ellipse.
or
~r² − 2~r · ~c + ~c² = 4a² + 4a|~r + ~c| + ~r² + 2~r · ~c + ~c² (10)
which is of the same form as for an ellipse.
Taking ~c = (c, 0)T, this gives
Or, letting b² = c² − a²:
r1²/a² − r2²/b² = 1
The Parabola
The parabola is obtained when the second of the 2 focal points (f~2) of the hyperbola is moved
off to infinity. The distance to f~2 then becomes equal to a constant minus the distance to
a line ⊥ to the focal axis, see figure.
The standard parabola, with focal axis parallel to the y-axis, is defined by the equation
y = ax² + bx + c
for some constants a, b, c. This is also known as a quadratic function, and is one of the most
common non-linear functions.
[Figure: the parabola y = x² + 1.]
From the equation you can read off the important features of the parabola:
1. min/max
2. intercept
3. roots,
4. etc.
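For y = x² + 1, these features can be computed from the coefficients (a sketch, not from the notes):

```python
# Features of the quadratic y = a*x^2 + b*x + c, here y = x^2 + 1.
a, b, c = 1.0, 0.0, 1.0

vertex_x = -b / (2 * a)                       # min (a > 0) or max (a < 0)
vertex_y = a * vertex_x**2 + b * vertex_x + c
intercept = c                                 # value at x = 0
disc = b**2 - 4 * a * c                       # discriminant: sign gives number of real roots

assert (vertex_x, vertex_y) == (0.0, 1.0)     # minimum at (0, 1)
assert intercept == 1.0
assert disc < 0                               # y = x^2 + 1 has no real roots
```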
The quadratic objects in 2D are called conic sections, because they can all be obtained from
the intersection of a cone with a plane!
Ellipsoid: x²/l² + y²/m² + z²/n² = 1            Elliptic Cone: z² = x²/l² + y²/m²
One sheet Hyperboloid: x²/l² + y²/m² − z²/n² = 1    Elliptic Paraboloid: z = x²/l² + y²/m²
Two sheet Hyperboloid: x²/l² + y²/m² − z²/n² = −1   Hyperbolic Paraboloid: z = y²/m² − x²/l²
It turns out that in N dimensions there is exactly one hyper-sphere that goes through N + 1
points in general position. We will now see the procedure of how to determine the hyper-sphere
S, i.e. its centre point ~c = (x1 , .., xN )T (N unknown coordinates) and radius r (another un-
known).
Consider the set of N + 1 points {~r1, .., ~rN+1} in general position, that must lie on S, thus giving
N + 1 quadratic equations:
|~r1 − ~c|² = r²
· · ·
|~rN+1 − ~c|² = r²
We can reduce this to a set of N linear equations for the coordinates by taking differences of
the quadratic equations, and solve them for the coordinates of ~c.
Once we have done this, we can use any of the quadratic equations to finally determine the
radius r.
so x = y = 7/6 and e.g. r² = |~r1 − ~c|² = 25/18 (check that also |~r2 − ~c|² = |~r3 − ~c|² = 25/18!)
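The whole procedure, differencing the quadratic equations to get N linear equations for ~c and then recovering r, fits in a few lines (the function name is ours, not from the notes; assumes NumPy):

```python
import numpy as np

def hypersphere_through(points):
    """Centre and radius of the hypersphere through N+1 points in
    general position in N dimensions."""
    p = np.asarray(points, float)
    # |p_i - c|^2 = |p_0 - c|^2  =>  2(p_i - p_0).c = p_i^2 - p_0^2
    A = 2 * (p[1:] - p[0])
    rhs = (p[1:] ** 2).sum(axis=1) - (p[0] ** 2).sum()
    c = np.linalg.solve(A, rhs)
    return c, np.linalg.norm(p[0] - c)

# Three points on the unit circle determine centre (0, 0) and radius 1:
centre, radius = hypersphere_through([(1, 0), (0, 1), (-1, 0)])
assert np.allclose(centre, (0, 0)) and np.isclose(radius, 1.0)
```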
The intersection of a line L : ~r = ~a + t~u and a (hyper)-sphere S (centre ~c, radius r), can either
be 0, 1, or 2 points.
The line has only 1 free parameter t, which can be determined by demanding that the intersection
also obeys the equation of the sphere. This will give us a quadratic equation for t that
has either no, one, or two real solutions, corresponding to 0, 1 or 2 intersection points.
The intersection of a plane P (point in plane ~a, normal vector ~n) and a sphere S (centre ~C,
radius R) can either be a circle (plane cuts through the sphere), 1 point (plane touches the
sphere = circle of radius 0), or empty (plane passes the sphere by).
In the procedure we will assume that the intersection is a circle (a contradiction is found if it
is empty).
To uniquely determine a circle C in N-D, we need three things: its centre ~c, its radius r, but
also the normal vector ~nc to the (hyper-)plane in which the circle lies.
1. We get ~nc = ~n for free as we know that the circle must lie in the plane P.
2. The centre ~c must lie on the line through ~C with direction ~n: ~c = ~C + t~n, where t follows from demanding that ~c lies in the plane P: (~a − ~c) · ~n = 0.
3. Pythagoras' rule then gives the radius: r² = R² − |~c − ~C|².
[Figure: the plane P cutting the sphere S: sphere centre ~C and radius R, circle centre ~c and radius r, plane normal ~n.]
example:
P: ~a = (1, 0, 1)T, ~n = (2, 2, 2)T
S: ~C = (1, 1, −1)T, R = 2
Then:
• ~nc = ~n = (2, 2, 2)T
• ~c = (1 + 2t, 1 + 2t, −1 + 2t)T
• (~a − ~c) · ~n = 0 → 12t − 2 = 0 → t = 1/6, and ~c = (4/3, 4/3, −2/3)T
• r² = R² − |~c − ~C|² = 4 − 1/3 = 11/3, so r = √(11/3)
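The same example, checked numerically (a sketch, not from the notes; assumes NumPy):

```python
import numpy as np

# Plane P (point a, normal n) cutting the sphere S (centre C, radius R):
a = np.array([1.0, 0.0, 1.0]); n = np.array([2.0, 2.0, 2.0])
C = np.array([1.0, 1.0, -1.0]); R = 2.0

# Circle centre: c = C + t*n with (a - c).n = 0, so t = (a - C).n / n.n.
t = ((a - C) @ n) / (n @ n)
c = C + t * n
r = np.sqrt(R**2 - (c - C) @ (c - C))

assert np.isclose(t, 1 / 6)
assert np.allclose(c, (4 / 3, 4 / 3, -2 / 3))
assert np.isclose(r**2, 11 / 3)
```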
1. We get ~nc = ~C2 − ~C1 for free as we know that the circle must lie in the plane ⊥ (~C2 − ~C1).
2. We know that ~c must lie on the line through ~C1 and ~C2 (see figure), so ~c = ~C1 + t(~C2 − ~C1).
3. To determine r and t, we use Pythagoras' rule twice for the two right-angled triangles (see figure):
(a) R1² = |~c − ~C1|² + r² → r² = R1² − |~c − ~C1|²
(b) R2² = |~c − ~C2|² + r² → r² = R2² − |~c − ~C2|²
The difference of these equations (R1² − |~c − ~C1|² = R2² − |~c − ~C2|²) gives a linear equation
for t, which can be solved.
(figure: intersection of 2 spheres, centres ~C1 and ~C2, radii R1 and R2, with the intersection circle of centre ~c and radius r)
example:
S1: ~C1 = (1, 1, 1)T, R1 = 1,
S2: ~C2 = (1, 2, 3)T, R2 = 3
Then:
• ~nc = ~C2 − ~C1 = (0, 1, 2)T
• ~c = (1, 1 + t, 1 + 2t)T
• t^2 + 4t^2 − 1 = (−1 + t)^2 + (−2 + 2t)^2 − 9  →  t = −3/10, and ~c = (1, 7/10, 4/10)T
• r^2 = R1^2 − |~c − ~C1|^2 = 1 − 45/100 = 55/100, so r = √(55/100)
• Check: R2^2 − |~c − ~C2|^2 = 9 − 845/100 = 55/100 = r^2, ok!
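The same worked example can be checked numerically. The sketch below (not part of the original notes; the function name is our own) follows steps 1-3 above, with the linear equation for t solved in closed form:

```python
import numpy as np

def sphere_sphere_circle(C1, R1, C2, R2):
    """Intersection circle of two spheres: returns (centre c, radius r,
    unit normal n), or None if the spheres do not intersect."""
    C1, C2 = np.asarray(C1, float), np.asarray(C2, float)
    d = C2 - C1
    # c = C1 + t*d; equating R1^2 - |c-C1|^2 = R2^2 - |c-C2|^2 gives a
    # linear equation for t with solution:
    t = (R1**2 - R2**2 + d @ d) / (2 * d @ d)
    c = C1 + t * d
    r2 = R1**2 - (c - C1) @ (c - C1)   # Pythagoras in the first triangle
    if r2 < 0:
        return None                    # empty intersection
    return c, np.sqrt(r2), d / np.linalg.norm(d)

# The worked example: C1 = (1,1,1), R1 = 1 and C2 = (1,2,3), R2 = 3
c, r, n = sphere_sphere_circle([1, 1, 1], 1, [1, 2, 3], 3)
```

This reproduces ~c = (1, 7/10, 4/10) and r^2 = 55/100 from the example.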
Although formulae for projections onto any quadratic object can be derived, in this course we
will restrict ourselves to projections onto (hyper)-spheres, as it is relatively simple.
Consider a point ~r and a (hyper)-sphere S with centre point ~c and radius r. The projection p~~r
of ~r onto S is the point on S that is closest to ~r.
You can easily convince yourself that this point must lie on the line through ~r and ~c, hence ~p~r = ~c + t(~r − ~c) with:

|~p~r − ~c|^2 = r^2  →  t^2 |~c − ~r|^2 = r^2  →  t = ±r / |~c − ~r|
There are two solutions, one is the desired projection, the other is the point on the sphere
furthest away from ~r.
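A minimal Python sketch of this projection (not part of the original notes; the function name is our own), taking the positive root so that the closest of the two candidate points is returned:

```python
import numpy as np

def project_onto_sphere(p, c, r):
    """Closest point on the sphere |x - c| = r to the point p.

    Of the two solutions t = +-r/|c - p| of t^2 |c - p|^2 = r^2, the
    projection is c + t*(p - c) with the positive sign.
    """
    p, c = np.asarray(p, float), np.asarray(c, float)
    t = r / np.linalg.norm(p - c)
    return c + t * (p - c)

# Sphere of radius 1 centred at (1, 0, 0); project the point (3, 0, 0):
q = project_onto_sphere([3, 0, 0], [1, 0, 0], 1)
```

The projection of (3, 0, 0) is (2, 0, 0), the near side of the sphere; the negative root would give the far point (0, 0, 0).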
AM10VA: Vector Algebra and Geometry 41
A matrix is a rectangular array of numbers (scalars). In these lecture notes we will use (bold)
upper case letters to indicate matrices. The numbers in the array are called the elements of
the matrix.
The size (or order) of the matrix is specified as m × n, where m is the number of rows, and
n is the number of columns. This means that n-dimensional (column) vectors are also n × 1
matrices. Similarly, n-dimensional row vectors are also 1 × n matrices.
    [  1   2 ]         [  1  2  3 ]
A = [  3   2 ],   B =  [ −1  3  5 ],   C = [ 5 ]
    [ −10  3 ]
To indicate a specific element of a matrix A, we have to indicate the row-index i, and the
column-index j of the element, and we denote it by Aij . In the examples above we have e.g.
A31 = −10 and B12 = 2.
Only matrices of the same size can be added. The summation is carried out as follows: suppose that we want to add the m × n matrices A and B, then the result A + B is again an m × n matrix and each element is the sum of the corresponding elements in A and B. Mathematically:

(A + B)ij = Aij + Bij

E.g. if
A = [ 1  2  3 ]    B = [ 4   2  1 ]
    [ 6  5  2 ]        [ 4  −1  7 ]
then
A + B = [  5  4  4 ]
        [ 10  4  9 ]
Two matrices A (m × n) and B (k × ℓ) can only be multiplied with each other if the number
of columns of the first matrix is equal to the number of rows of the second matrix.
So AB exists only if n = k, and BA exists only if ℓ = m.
If the elements of the m × r matrix A are Aij, and the elements of the r × n matrix B are Bij, then the result of the product AB is an m × n matrix. The ij-th element of the result is the sum of the products of all the elements in the i-th row of A with the corresponding elements in the j-th column of B. Mathematically:

(AB)ij = Σ_{k=1}^{r} Aik Bkj.
Then, for 2 × 2 matrices:

AB = [ A11 B11 + A12 B21   A11 B12 + A12 B22 ]
     [ A21 B11 + A22 B21   A21 B12 + A22 B22 ]
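The summation formula can be checked directly. The sketch below (not part of the original notes) implements (AB)ij = Σk Aik Bkj with explicit loops and compares it against NumPy's built-in product; the matrices reused are the 2 × 3 examples A and B from the addition section:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [6, 5, 2]])
B = np.array([[4, 2, 1],
              [4, -1, 7]])

S = A + B                      # element-wise sum of two 2x3 matrices

# Matrix product via the summation formula (AB)_ij = sum_k A_ik B_kj;
# here we multiply the 2x3 matrix A with the 3x2 matrix B^T.
C = np.zeros((2, 2), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(3):
            C[i, j] += A[i, k] * B.T[k, j]
```

The loop result C agrees with `A @ B.T`, NumPy's matrix multiplication.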
Because of the conditions on the sizes of the matrices, it should be clear that matrix multiplication does not possess the commutative property. In fact, even if AB exists it is not guaranteed that BA also exists!
The combination of matrix addition and matrix multiplication does have the distributive property:
A(B + C) = AB + AC
when it exists.
An m×n matrix A multiplied by a scalar λ is again an m×n matrix λA, where all the elements
are multiplied by λ:
(λA)ij = λAij
Combining the multiplication with a scalar and matrix addition, we can consider linear combinations of matrices (just like we did for vectors). For a set of k matrices {Ai, i = 1..k} and a set of k numbers {ci, i = 1..k}, we can consider the linear combination:

Σ_{i=1}^{k} ci Ai = c1 A1 + c2 A2 + .. + ck Ak
Transposing an m × n matrix A (notation AT) means that we swap the rows and columns of the matrix, i.e. the i-th row becomes the i-th column and vice versa. The result AT is then an n × m matrix. Mathematically:

(AT)ij = Aji
We can identify a scalar with a 1 × 1 matrix (therefore we can drop both the row and column indices without ambiguity). The notation will depend on whether we look upon the number as a scalar or a matrix, e.g.:

A = A11 ∼ a

and is important because the operations on them do not necessarily mean the same thing.
Vectors are matrices with only one column, i.e. n × 1 matrices (therefore we do not tend to give the column index, as it is always 1 anyway).
As such we can do all matrix operations on them. Again it is a good idea to distinguish in your
notation how you consider them because of different notation and meaning of the operations
on them, e.g.:
    [ A11 ]         [ a1 ]        [ B11 ]         [ b1 ]
A = [ A21 ]  ∼ ~a = [ a2 ],   B = [ B21 ]  ∼ ~b = [ b2 ],
    [ A31 ]         [ a3 ]        [ B31 ]         [ b3 ]
Since vectors (even in vector notation) can still be regarded as matrices consisting of a single
column, we can still multiply a matrix (of the right dimensions) with a vector:
      [ A11 A12 A13 ] [ b1 ]   [ A11 b1 + A12 b2 + A13 b3 ]
A~b = [ A21 A22 A23 ] [ b2 ] = [ A21 b1 + A22 b2 + A23 b3 ]
      [ A31 A32 A33 ] [ b3 ]   [ A31 b1 + A32 b2 + A33 b3 ]
An n × n matrix which has the same number of rows as columns is called a square matrix. Square matrices are important for many reasons, which will become clear in the rest of this course:
• two square matrices of the same size can be multiplied in either order (both AB and BA exist);
• multiplying a square matrix with a vector yields a vector of the same dimension (in the same space).
• The elements of a square matrix that have the same row and column index, of the form Aii, are called the diagonal elements, and the set of all of them simply the diagonal of the matrix.
A symmetric matrix is a matrix that remains unchanged when you transpose it:
AT = A  →  Aji = (AT)ij = Aij
This means that a symmetric matrix is necessarily square and the corresponding elements above
and below the diagonal (when you mirror around the diagonal) are the same. E.g.
    [ 1  0  3 ]
A = [ 0 −2  π ]
    [ 3  π  5 ]
is symmetric.
Note that there is no condition on the diagonal elements.
A zero matrix 0 is a matrix in which each element is 0. There are zero matrices of any size,
hence it is good practice to indicate the m × n zero matrix as 0mn . Mathematically:
0ij = 0 ∀i, j
A diagonal matrix D is a square matrix for which all off-diagonal elements are 0. Note that
there is no condition on the values of the diagonal elements. Mathematically:
Dij = 0, if i ≠ j
E.g.
    [ 1  0  0 ]
A = [ 0 −2  0 ]
    [ 0  0  π ]
is diagonal.
A triangular matrix T is a square matrix for which either all the elements above the diagonal or all the elements below the diagonal are 0; it is called lower triangular or upper triangular, respectively.
E.g.
    [ 1  1  3 ]        [ 1  0  0 ]
A = [ 0 −2  7 ],   B = [ 0 −2  0 ],
    [ 0  0  π ]        [ 1 −3  π ]
A is an upper triangular matrix and B is lower triangular.
An identity matrix is defined as a matrix I which, under matrix multiplication with any matrix
A, leaves A unchanged (i.e. IA = A = AI).
If A is an m × n matrix and we multiply it on the left with an I, the result should again be A (m × n), so I must be a square m × m matrix. Similarly, if we multiply A with an I on the right, it follows that I should be a square n × n matrix.
We conclude that any identity matrix must be square, but it can be of different dimensions (n × n), so it is good practice to indicate the dimension of the square identity matrix as In, such that it is clear which one we are talking about.
It turns out that there is exactly 1 identity matrix In for each dimension n, and it is a diagonal matrix with all diagonal elements equal to 1. Mathematically:
Iij = 1 if i = j, Iij = 0 if i ≠ j, i.e.

     [ 1 0 .. 0 ]
In = [ 0 1 .. 0 ]
     [ : :  .  : ]
     [ 0 0 .. 1 ]
You can check for yourself that if you multiply this matrix on the left or right with a matrix of
compatible dimensions you get that matrix back.
You can see that the In are all diagonal matrices.
The matrix multiplication ideas that we developed above are very useful in describing linear transformations of vectors. A transformation T maps every vector onto another vector; it is called linear if it satisfies the following properties:
• T(~x + ~y) = T(~x) + T(~y)
• T(λ~x) = λT(~x)
We know that matrix multiplication satisfies these properties, so multiplication by a fixed matrix is a linear transformation. In fact it turns out (without proof) that there is a one-to-one relationship between linear transformations and matrices:
With every linear transformation T there corresponds exactly one matrix A such
that
T (~x) = A~x,
and vice-versa.
Since we have the linearity properties, it is enough to know how the transformation works on the basis vectors {~e1, .., ~en} → {~e′1 = T(~e1), .., ~e′n = T(~en)} to know how it works on any vector, because:

T(~a) = T(a1~e1 + .. + an~en) = a1 T(~e1) + .. + an T(~en) = a1~e′1 + .. + an~e′n
and the matrix A corresponding to the transformation is given by putting the coordinates of
the vectors {~e′1 , .., ~e′n } (in the basis {~e1 , .., ~en }) as the columns of A.
Practically, it is a good idea to draw the basis vectors ~ei and their images ~e′i, to understand how the transformation works, and to see what the coordinates of the images are.
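The columns-are-images-of-basis-vectors rule is easy to verify numerically. The following sketch is not part of the original notes; the map T used here (swap x and y, double z) is a made-up example:

```python
import numpy as np

# Images of the standard basis vectors under a hypothetical linear map T
# that swaps x and y and doubles z:
e1_img = np.array([0, 1, 0])   # T(e1)
e2_img = np.array([1, 0, 0])   # T(e2)
e3_img = np.array([0, 0, 2])   # T(e3)

# The matrix of T has the images of the basis vectors as its columns:
A = np.column_stack([e1_img, e2_img, e3_img])

a = np.array([3, 4, 5])
# T(a) = a1*T(e1) + a2*T(e2) + a3*T(e3) equals the matrix-vector product A a:
assert np.allclose(A @ a, 3 * e1_img + 4 * e2_img + 5 * e3_img)
```

For ~a = (3, 4, 5)T this gives T(~a) = (4, 3, 10)T, exactly the linear combination of the image vectors.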
Examples:
8.3.1 Scaling
Scaling a vector means changing the length of the vector, but not its direction. Since the direction in which a vector points is associated with the ratio of its components, scaling corresponds to simply multiplying all the components of the vector by an equal amount λ:

~x′ = λ~x  →  S = λ In

Indeed, we have that ~e′i = T(~ei) = λ~ei for all basis vectors, so putting these as the columns of the matrix gives S.
8.3.2 Reflection
Reflection operators flip vectors about a certain line, plane or hyper-plane. For example, if we
flip about the y-axis, we have that
î′ = −î = (−1, 0)T
                        →   R = [ −1  0 ]
ĵ′ = ĵ = (0, 1)T                [  0  1 ]
8.3.3 Projection
Linear projection operators project all vectors orthogonally onto a line, plane or hyperplane.
Consider the projection in 3-D onto the xy-plane.
Since î and ĵ lie in the xy-plane, and k̂ is orthogonal to it, we have that:
î′ = î = (1, 0, 0)T
                                 [ 1 0 0 ]
ĵ′ = ĵ = (0, 1, 0)T     →   P = [ 0 1 0 ]
                                 [ 0 0 0 ]
k̂′ = ~0 = (0, 0, 0)T
8.3.4 Rotation
Linear rotation operators rotate all vectors over some angle around a line through the origin.
Consider the case that we rotate the 3-D vectors over an angle θ clockwise around the z-axis. We have that (see figure below):
î′ = cos(θ)î − sin(θ)ĵ = (cos(θ), −sin(θ), 0)T
                                                        [  cos(θ)  sin(θ)  0 ]
ĵ′ = sin(θ)î + cos(θ)ĵ = (sin(θ), cos(θ), 0)T   →  R = [ −sin(θ)  cos(θ)  0 ]
                                                        [    0        0    1 ]
k̂′ = k̂ = (0, 0, 1)T
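Two defining properties of a rotation matrix, R^T R = I (lengths are preserved) and det(R) = 1, can be checked numerically. This sketch is not part of the original notes; the function name is our own:

```python
import numpy as np

def rotation_z(theta):
    """Matrix of the clockwise rotation about the z-axis derived above."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  s, 0],
                     [-s, c, 0],
                     [0,  0, 1]])

R = rotation_z(np.pi / 3)
# A rotation preserves lengths: R^T R = I, and det(R) = 1.
assert np.allclose(R.T @ R, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)
```

The first assertion also shows that the inverse of a rotation is simply its transpose.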
All the above transformations can be represented by matrix multiplications on vectors. We can
apply a sequence of such transformations, e.g. a rotation R, followed by a rescaling, S. This
would have the effect of transforming a vector ~x into ~x′ by rotation, and then transforming ~x′ into ~x′′ by rescaling. Hence,

~x′ = R~x,  ~x′′ = S~x′   →   ~x′′ = S(R~x) = (SR)~x
so the matrix corresponding to the combined transformation from ~x to ~x′′ is given by the matrix
product of the matrices in the opposite order of the matrices of the individual transformations
(i.e. the first transformation on the right and the last one on the left)!
Obviously we can generalise this to an arbitrary number of linear transformations. If the transformations A1, A2, .., An are applied consecutively in that order, then the matrix C corresponding to the overall transformation is given by:

C = An An−1 .. A2 A1
8.4 Determinants
The determinant of a matrix is an operation that is only defined for square matrices and maps every square matrix A onto a number det(A).
We have already introduced determinants and explained how to calculate them when we dis-
cussed the vector product, so we are not going to repeat that here.
However, there are some extra important properties concerning determinants worth mentioning:
• det AT = det A
• Multiplying a single row or column of A by a scalar k changes the determinant by a factor k. That is, det B = k det A if B is A but with one of its columns or rows multiplied by k.
• From the above property we see that multiplying an n × n matrix with a scalar has the
following effect on the determinant: det(λA) = λn det(A).
• Interchanging two rows or two columns changes the sign of the determinant.
• If B is the matrix that results when a multiple of one row of A is added to another row, or when a multiple of one column is added to another column, then det B = det A.
• If A is a square matrix and the set of rows or columns (looked upon as vectors) is linearly
dependent then det A = 0.
• The determinant of a triangular matrix (either all the elements above the diagonal or all
the elements below the diagonal are zero) is equal to the product of its diagonal elements.
• Since a diagonal matrix is also triangular, the determinant of a diagonal matrix is also just
the product of its diagonal elements.
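These properties can all be checked numerically on a random matrix. The sketch below is not part of the original notes; each assertion corresponds to one bullet point above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)
n = 3

# det(A^T) = det(A)
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))

# det(lambda * A) = lambda^n det(A)
lam = 2.0
assert np.isclose(np.linalg.det(lam * A), lam**n * np.linalg.det(A))

# interchanging two rows changes the sign of the determinant
B = A[[1, 0, 2], :]
assert np.isclose(np.linalg.det(B), -np.linalg.det(A))

# adding a multiple of one row to another leaves the determinant unchanged
C = A.copy()
C[2] += 5 * C[0]
assert np.isclose(np.linalg.det(C), np.linalg.det(A))
```

All four identities hold for any square matrix, including singular ones.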
Another operation that is only defined on (some) square matrices is matrix inversion. The inverse A−1 of an n × n matrix A is the matrix that undoes (reverses) the effect of multiplication by A on either side:
A−1 AB = B, and CAA−1 = C
for any compatible matrices B and C. Therefore we have that
A−1 A = In = AA−1
E.g. if we applied the transformation A to a vector ~x, what transformation do we need to apply to the vector ~x′ = A~x to recover the original vector ~x? The answer is A−1, since A−1~x′ = A−1A~x = ~x.
• Singular matrices
Note that it is not always possible to find A−1 , and it turns out that A−1 does not exist
when det(A) = 0. In that case, we call the matrix A singular.
With any linear transformation of vectors corresponds a matrix A. Now you should try
and think for what kind of transformations you could work your way back to the original
vector. In order to do so each different vector ~x should be mapped onto a different unique
vector ~x′ . If more than one vector gets mapped onto the same ~x′ it is no longer possible to
undo the operation, because you don’t know where it came from.
It turns out that any scaling, reflection or rotation is invertible, while any projection is not.
You can check as an exercise for yourself that the determinant of any scaling, reflection or rotation that we have dealt with is non-zero, while the determinant of any projection is 0.
The elementary row operations are:
• scaling a row by a non-zero factor,
• interchanging two rows,
• adding a multiple of one row to another row.
We can use elementary row operations to find the inverse in the following way:
1. Write down the matrix A on the left, and the identity matrix I of the same dimension on
the right.
2. Apply elementary row operations to both the left and the right, working your way towards getting the identity matrix on the left.
3. If you can't get any closer with elementary operations on the left, then the inverse doesn't exist.
4. If you manage to reduce the matrix on the left all the way to the identity matrix, then the
matrix on the right is A−1
• It is good practice to write down the operations that you do in each step.
• Also work systematically: starting from bringing the first column into the same form as for the identity matrix, work your way to the last column. Thus you avoid undoing work that you did before.
Example: Find the inverse of the matrix
[ 1 2 3 ]
[ 2 5 3 ]
[ 1 0 8 ]
Start:

[ 1 2 3 | 1 0 0 ]   r2 → r2 − 2r1    [ 1  2  3 |  1 0 0 ]   r1 → r1 − 2r2
[ 2 5 3 | 0 1 0 ]   → r3 → r3 − r1   [ 0  1 −3 | −2 1 0 ]   → r3 → r3 + 2r2
[ 1 0 8 | 0 0 1 ]                    [ 0 −2  5 | −1 0 1 ]

[ 1 0  9 |  5 −2 0 ]                 [ 1 0  9 |  5 −2  0 ]   r1 → r1 − 9r3
[ 0 1 −3 | −2  1 0 ]   → r3 → −r3    [ 0 1 −3 | −2  1  0 ]   → r2 → r2 + 3r3
[ 0 0 −1 | −5  2 1 ]                 [ 0 0  1 |  5 −2 −1 ]

[ 1 0 0 | −40 16  9 ]
[ 0 1 0 |  13 −5 −3 ]
[ 0 0 1 |   5 −2 −1 ]
So

[ 1 2 3 ]−1   [ −40 16  9 ]
[ 2 5 3 ]   = [  13 −5 −3 ]
[ 1 0 8 ]     [   5 −2 −1 ]
Check that indeed AA−1 = I3 .
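The suggested check can be done by hand or, as in this sketch (not part of the original notes), numerically: both AA−1 and A−1A must give the 3 × 3 identity matrix.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 5, 3],
              [1, 0, 8]], dtype=float)

A_inv = np.array([[-40, 16, 9],       # the inverse found above
                  [13, -5, -3],
                  [5, -2, -1]], dtype=float)

# Both products must give the 3x3 identity matrix:
assert np.allclose(A @ A_inv, np.eye(3))
assert np.allclose(A_inv @ A, np.eye(3))
```

NumPy's own `np.linalg.inv(A)` returns the same matrix, up to floating-point rounding.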
Example: Find the inverse of
[ −1 −1 2 ]
[  1 −2 1 ]
[  0 −1 1 ]
Start:

[ −1 −1 2 | 1 0 0 ]                 [ 1  1 −2 | −1 0 0 ]
[  1 −2 1 | 0 1 0 ]   → r1 → −r1    [ 1 −2  1 |  0 1 0 ]   → r2 → r2 − r1
[  0 −1 1 | 0 0 1 ]                 [ 0 −1  1 |  0 0 1 ]

[ 1  1 −2 | −1 0 0 ]                    [ 1  1 −2 |  −1    0   0 ]   r1 → r1 − r2
[ 0 −3  3 |  1 1 0 ]   → r2 → −r2/3     [ 0  1 −1 | −1/3 −1/3  0 ]   → r3 → r3 + r2
[ 0 −1  1 |  0 0 1 ]                    [ 0 −1  1 |   0    0   1 ]

[ 1 0 −1 | −2/3  1/3  0 ]
[ 0 1 −1 | −1/3 −1/3  0 ]
[ 0 0  0 | −1/3 −1/3  1 ]
At this point we see that row 3 of the left-hand matrix is all zeros, so we can never get column 3 into the form of the identity matrix, and hence the inverse does not exist. Check that the determinant of this matrix is indeed 0.
Extra example: Find the inverse of
[ 1 2 3 4 ]            [ 1 −2  1  0 ]
[ 0 1 2 3 ]            [ 0  1 −2  1 ]
[ 0 0 1 2 ]   (answer  [ 0  0  1 −2 ] ).
[ 0 0 0 1 ]            [ 0  0  0  1 ]
An alternative to elementary row operations for the calculation of the inverse of a square matrix
A, makes use of the calculation of determinants.
First we define the cofactor Cij of an element Aij of the matrix A as the determinant of
the reduced matrix A′ij (i.e. the matrix A where we have taken away the ith row and the jth
column, also known as the ij minor of A).
Cij = det(A′ij )
Then we have the following expression for the element ij of the inverse matrix A−1:

(A−1)ij = (−1)^(i+j) Cji / det(A)

• Note that the formula above has Cji, not Cij! Hence we need to transpose the matrix of the cofactors.
This is particularly simple for 2 × 2 matrices:
[ a b ]−1      1    [  d −c ]T      1    [  d −b ]
[ c d ]    = ————— [ −b  a ]   = ————— [ −c  a ]
             ad − bc               ad − bc
Example:
    [ 1 3 0 ]
A = [ 1 2 0 ]
    [ 0 0 3 ]
First we calculate all the cofactors:

C11 = det[ 2 0 ] = 6,   C12 = det[ 1 0 ] = 3,   C13 = det[ 1 2 ] = 0
         [ 0 3 ]                 [ 0 3 ]                 [ 0 0 ]

C21 = det[ 3 0 ] = 9,   C22 = det[ 1 0 ] = 3,   C23 = det[ 1 3 ] = 0
         [ 0 3 ]                 [ 0 3 ]                 [ 0 0 ]

C31 = det[ 3 0 ] = 0,   C32 = det[ 1 0 ] = 0,   C33 = det[ 1 3 ] = −1
         [ 2 0 ]                 [ 1 0 ]                 [ 1 2 ]
Then we can calculate the determinant by developing along the 3rd row:
det(A) = 0 − 0 + 3 C33 = −3
Extra exercises, apply this method to the examples of the previous sections, and check that
you get the same answers.
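The cofactor formula can also be implemented directly. The sketch below is not part of the original notes; the function name is our own, and the cofactors are the minors, as defined above, with the sign factor applied inside the inverse formula:

```python
import numpy as np

def cofactor_inverse(A):
    """Inverse via cofactors: (A^-1)_ij = (-1)^(i+j) C_ji / det(A),
    where C_ij is the ij minor of A (as in the notes)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    det = np.linalg.det(A)
    inv = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # C_ji: delete row j and column i (note the transposition!)
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            inv[i, j] = (-1) ** (i + j) * np.linalg.det(minor) / det
    return inv

A = [[1, 3, 0], [1, 2, 0], [0, 0, 3]]
assert np.allclose(cofactor_inverse(A), np.linalg.inv(A))
```

For the example above this gives A−1 = [−2 3 0; 1 −1 0; 0 0 1/3], the matrix used in the next section.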
A set of n linear equations in n unknowns can be written compactly in matrix form as

A~x = ~b

e.g.

2x + y + z = 3              [  2  1  1 ] [ x ]   [ 3 ]
x − y = 5              →    [  1 −1  0 ] [ y ] = [ 5 ]
−x − 6y + 5z = 2            [ −1 −6  5 ] [ z ]   [ 2 ]
1. One method to solve such equations uses the inverse of the matrix (provided that it exists); the solution is then given by multiplying both sides of the equation by A−1:

~x = A−1~b
Although this is a correct approach to solving this linear system of equations, in practice,
we normally don’t solve this problem this way. Calculating the inverse is a lot of work if
you have to solve this set of equations only once, and much more efficient methods exist.
Nevertheless, it is worthwhile knowing this method when we have to solve the same set of
equations many times for different right hand sides (i.e. for different ~b’s).
Then we only have to calculate the inverse of A once, and can use it again and again.
Example: Solve

[ 1 3 0 ] [ x ]   [ 1 ]
[ 1 2 0 ] [ y ] = [ 2 ]
[ 0 0 3 ] [ z ]   [ 3 ]

using the inverse. We already have the inverse of this matrix (see previous section), so the solution is given by:

[ x ]   [ −2  3   0  ] [ 1 ]   [  4 ]
[ y ] = [  1 −1   0  ] [ 2 ] = [ −1 ]
[ z ]   [  0  0  1/3 ] [ 3 ]   [  1 ]
If the determinant of the matrix is non-zero there exists exactly 1 solution, while if the
determinant of the matrix is 0, there could be either no, or many solutions.
2. If you need to solve the set of equations only once, then it is better to use elementary row operations and reduction to triangular form. This is numerically more efficient, and can also deal with non-invertible matrices and even non-square matrices (i.e. sets of equations with a different number of equations and unknowns).
The steps for solving it this way are:
(a) write the matrix on the left and the right hand sides on the right
(b) try to reduce the matrix to an upper triangular form using elementary row operations,
and do the same operations to the right hand side.
(c) if you manage to get to that form, then you can get the solution by back substitution.
You solve the last row first, fill that into the second last and so on.
(d) if you get a contradiction before you arrive at that form, this means that the set of
equations has no solutions
Looking at the equation in the last row, we see that xn = cn / unn. Given this solved value for xn, it can be used in the (n − 1)-th equation, which is solved by

xn−1 = (cn−1 − un−1,n xn) / un−1,n−1

and similarly the (n − 2)-th equation is solved by

xn−2 = (cn−2 − (un−2,n−1 xn−1 + un−2,n xn)) / un−2,n−2
We can continue in this manner, solving each time for a new variable, which can then be fed into the equation on the row above, along with the other solved values.
The formal name for this reduction to an upper triangular form is Gaussian Elimination, and the subsequent method for solving these simplified equations is known as back substitution.
Although you may have been solving sets of linear equations this way without realising,
doing it in the matrix notation, makes everything more systematic, and more similar to
how a computer would solve it.
Example:

[ 1  2 −2 |  1 ]   r2 → r2 − 2r1     [ 1   2 −2 |  1 ]
[ 2  1 −4 | −1 ]   → r3 → r3 − 4r1   [ 0  −3  0 | −3 ]   → r3 → r3 − (11/3) r2
[ 4 −3  1 | 11 ]                     [ 0 −11  9 |  7 ]

[ 1  2 −2 |  1 ]
[ 0 −3  0 | −3 ]                                              (14)
[ 0  0  9 | 18 ]
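The back-substitution recipe above translates directly into code. This sketch is not part of the original notes; the function name is our own, and it is applied to the triangular system just obtained in (14):

```python
import numpy as np

def back_substitution(U, c):
    """Solve U x = c for upper triangular U, as described above:
    x_i = (c_i - sum_{j>i} U_ij x_j) / U_ii, starting from the last row."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# The triangular system obtained above:
U = np.array([[1, 2, -2],
              [0, -3, 0],
              [0, 0, 9]], dtype=float)
c = np.array([1, -3, 18], dtype=float)
x = back_substitution(U, c)
```

This yields z = 2, then y = 1, then x = 3; you can verify the solution (3, 1, 2) against the original equations.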
x + 5y + 3z = 1
5x + y − z = 2
x + 2y + z = 3
[ 1 5  3 | 1 ]   r2 → r2 − 5r1     [ 1   5   3 |  1 ]
[ 5 1 −1 | 2 ]   → r3 → r3 − r1    [ 0 −24 −16 | −3 ]   → r3 → r3 − r2/8
[ 1 2  1 | 3 ]                     [ 0  −3  −2 |  2 ]

[ 1   5   3 |    1 ]
[ 0 −24 −16 |   −3 ]                                          (15)
[ 0   0   0 | 19/8 ]

The last row now reads 0x + 0y + 0z = 19/8 ≠ 0, so this is a contradiction and no solution exists.
x + 5y + 3z = 1
5x + y − z = 2
x + 2y + z = 5/8

[ 1 5  3 |  1  ]   r2 → r2 − 5r1     [ 1   5   3 |    1 ]
[ 5 1 −1 |  2  ]   → r3 → r3 − r1    [ 0 −24 −16 |   −3 ]   → r3 → r3 − r2/8
[ 1 2  1 | 5/8 ]                     [ 0  −3  −2 | −3/8 ]

[ 1   5   3 |  1 ]
[ 0 −24 −16 | −3 ]                                            (16)
[ 0   0   0 |  0 ]

The last row now reads 0x + 0y + 0z = 0, so there is no constraint on z, and there are many solutions.
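The three possible outcomes (unique solution, no solution, many solutions) can be detected by comparing matrix ranks, as in this sketch (not part of the original notes; the function name is our own): a contradiction means rank(A) < rank([A|b]), while equal ranks smaller than the number of unknowns leave free variables.

```python
import numpy as np

def classify(A, b):
    """Classify A x = b as 'unique', 'none' or 'many' using ranks."""
    A = np.asarray(A, float)
    b = np.asarray(b, float).reshape(-1, 1)
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.hstack([A, b]))
    if rA < rAb:
        return "none"        # contradictory last row, as in (15)
    return "unique" if rA == A.shape[1] else "many"   # as in (16)

A = [[1, 5, 3], [5, 1, -1], [1, 2, 1]]
case_none = classify(A, [1, 2, 3])        # the contradictory system (15)
case_many = classify(A, [1, 2, 5 / 8])    # the consistent system (16)
```

This reproduces the conclusions of the two examples above.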
One of the most important concepts in linear algebra is that of eigenvalues and eigenvectors. In a sense, the eigenvectors of a matrix correspond to the natural coordinate system, in which the transformation can be most easily understood. An eigenvector ~e with corresponding eigenvalue λ of a square matrix A satisfies

A~e = λ~e

Eigenvectors are special directions in which vectors only get rescaled under the operation of the matrix A, and the eigenvalues are the rescaling factors. An n × n square matrix has n eigenvalues, although they do not all have to be different.
In general, for an n × n matrix there are (possibly including repetitions) n eigenvalues, each with a corresponding eigenvector. We can rearrange the equation above to

(A − λI)~e = ~0

which has a non-zero solution ~e only if det(A − λI) = 0. This is called the characteristic equation; the left-hand side is a polynomial of degree n in λ (the characteristic polynomial), and the eigenvalues are its roots.
It can happen that the same value λi appears more than once! The number of times that the same eigenvalue appears as a root is called its multiplicity mi.
So, supposing that there are k different eigenvalues, the characteristic polynomial can be rewritten as

Π_{j=1}^{k} (λj − λ)^{mj},   where   Σ_{j=1}^{k} mj = n
Once we have found an eigenvalue, then the corresponding eigenvector can be found by substi-
tuting this value for λ in the eigenvector equation and solving for ~e.
1. You should realise that eigenvalues can in general be complex numbers (because they are
the roots of a polynomial), although in this course the examples will be chosen such that
they are always real.
4. If A and B are of order n and A is a non-singular matrix then A−1 BA and B have the
same eigenvalues.
6. The product of all the eigenvalues of a matrix is equal to the determinant of that matrix.
7. The constant term in the characteristic polynomial is also equal to the determinant, and by the previous property equal to the product of all the eigenvalues.
Proof: det(A − λI) = Π_{i=1}^{n} (λi − λ)  →(λ→0)  det(A) = Π_{i=1}^{n} λi
8. To solve the characteristic equation which is often more complicated than a quadratic equa-
tion, the eigenvalues (which for this course are often small integers), can be guessed from
the constant term. e.g. if the constant term is 6, then try if ±1, ± 2, ± 3, ± 6 are roots.
9. If you have found an eigenvalue λi , then you can divide the characteristic polynomial by
(λi − λ), to obtain a simpler problem for the remaining eigenvalues. You do this by long
division (see example below).
10. If a matrix has a row or a column that is all 0, except for a number on the diagonal, then
that number is an eigenvalue.
11. If a matrix is a triangular matrix, then the numbers on the diagonal are the eigenvalues.
Example: Find the eigenvalues and eigenvectors of
[ 1 3 ]
[ 4 2 ]
The characteristic equation is

det[ 1−λ    3  ] = 0
   [  4   2−λ  ]

Evaluating the determinant, this is

(1 − λ)(2 − λ) − 3 · 4 = 0  →  λ² − 3λ − 10 = 0  →  (λ − 5)(λ + 2) = 0
so that the solutions are λ = −2, 5. To find the eigenvectors, we consider each eigenvalue
separately:
First, for λ = 5 we try eigenvector ~e5 = (x, y)T:

[ −4  3 ] [ x ]   [ 0 ]
[  4 −3 ] [ y ] = [ 0 ]   →  4x = 3y  →  ~e5 = (3, 4)T  →  ê5 = (3/5, 4/5)T

Note that the second equation was simply the negative of the first. Such redundancies will always occur in the solution of these eigenvector equations.
Similarly, for λ = −2 we try eigenvector ~e−2 = (x, y)T and find ê−2 = (−1/√2, 1/√2)T for the normalised eigenvector.
Example: Find the eigenvalues of the matrix

    [ 2 3 1 ]
A = [ 3 1 2 ]
    [ 1 2 3 ]

The characteristic equation works out to λ³ − 6λ² − 3λ + 18 = 0. Trying the divisors of the constant term, we test λ = 6 by long division.
So we see that 6 is indeed a root (the rest is 0), and that λ³ − 6λ² − 3λ + 18 divided by (λ − 6) is λ² − 3 (which is a quadratic equation for which we can easily find the roots), so

λ³ − 6λ² − 3λ + 18 = (λ − 6)(λ² − 3) = (λ − 6)(λ − √3)(λ + √3)

and the eigenvalues are ±√3 and 6.
8.7.2 Eigenspaces
Note that sometimes, it may be that there is not just a single vector which is an eigenvector
for a matrix. It may be that there is a whole space of such vectors (this typically only happens
for eigenvalues with a multiplicity greater than 1).
Example:
Find the eigenvalues and eigenvectors of the matrix
[ 0 0 −2 ]
[ 1 2  1 ]
[ 1 0  3 ]
The characteristic equation is
λ3 − 5λ2 + 8λ − 4 = 0
We need to solve this equation. Since the constant in the polynomial is 4, the integer roots could be ±1, ±2, ±4. Substituting λ = 1 in the polynomial, we see that this is a solution.
Division by (λ − 1) gives: λ² − 4λ + 4 = (λ − 2)², so λ = 2 has multiplicity 2.
Let's try to find the eigenvectors:
         [ −1 0 −2 ] [ x ]   [ 0 ]
λ = 1:   [  1 1  1 ] [ y ] = [ 0 ]   →  ~e1 = s (2, −1, −1)T
         [  1 0  2 ] [ z ]   [ 0 ]
For λ = 2:

[ −2 0 −2 ] [ x ]   [ 0 ]
[  1 0  1 ] [ y ] = [ 0 ]
[  1 0  1 ] [ z ]   [ 0 ]

As we can see, if we set x = s, then we have the condition z = −s. However, y is left free, so we can pick y = t. This means that any vector ~e2 in the space

s (1, 0, −1)T + t (0, 1, 0)T

is an eigenvector for λ = 2. Such a whole space of eigenvectors is called the eigenspace of the eigenvalue.
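The eigenvalues and eigenvectors of this example can be confirmed numerically. This sketch is not part of the original notes; NumPy returns normalised eigenvectors as the columns of the second output of `np.linalg.eig`:

```python
import numpy as np

A = np.array([[0, 0, -2],
              [1, 2, 1],
              [1, 0, 3]], dtype=float)

lams, V = np.linalg.eig(A)   # columns of V are (normalised) eigenvectors

# Each column satisfies the eigenvector equation A e = lambda e:
for lam, e in zip(lams, V.T):
    assert np.allclose(A @ e, lam * e)

# The eigenvalues found above: 1, and 2 with multiplicity 2
assert np.allclose(np.sort(lams.real), [1, 2, 2])
```

Note that the eigenvectors NumPy returns for the doubly degenerate eigenvalue 2 span the same two-dimensional eigenspace as s(1, 0, −1)T + t(0, 1, 0)T, even if the individual columns differ from our choice of s and t.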
8.8 Diagonalization
To diagonalise an n × n matrix A:
1. Find n linearly independent eigenvectors ~p1, .., ~pn of A.
2. Form the matrix P that has ~p1, .., ~pn as its columns.
3. The matrix D = P−1AP will then be diagonal with λ1, .., λn as its successive diagonal entries, where λi is the eigenvalue corresponding to ~pi.
The reason that this construction diagonalises A is as follows. Consider that the vectors ~p1, .., ~pn are eigenvectors of A. That is

A~p1 = λ1~p1
  :
A~pn = λn~pn

If we define the matrix P = (~p1 .. ~pn), then the above set of equations can be succinctly written as

AP = PD

where D is the diagonal matrix

    [ λ1 ..  0 ]
D = [  :  .  : ]
    [  0 .. λn ]
Provided that P is invertible, we can therefore express A as
A = PDP−1
Example: Find a matrix P that diagonalises
    [ 0 0 −2 ]
A = [ 1 2  1 ]
    [ 1 0  3 ]
This is the same matrix that we have previously found the eigenvalues and eigenvectors for,
namely
λ = 2:   ~p1 = (−1, 0, 1)T,   ~p2 = (0, 1, 0)T                 (17)
λ = 1:   ~p3 = (−2, 1, 1)T                                     (18)

so we take

    [ −1 0 −2 ]
P = [  0 1  1 ]
    [  1 0  1 ]
Note that if the dimension of the eigenspace of any of the eigenvalues is smaller than the
multiplicity of that eigenvalue, then we cannot find enough linearly independent eigenvectors,
so P is not invertible and A is not diagonalisable.
Example: Diagonalise
A = [ 0 1 ]
    [ 0 0 ]
We find that the only eigenvalue of this matrix is λ = 0. It has multiplicity 2, but the corresponding eigenspace

[ 0 1 ] [ x ]   [ 0 ]
[ 0 0 ] [ y ] = [ 0 ]   →  y = 0  →  ~e0 = s (1, 0)T

is only one-dimensional, so this matrix cannot be diagonalised.
Very often, we need to compute the power of a matrix, Ap = AA..A (A multiplied by itself p
times).
If the matrix is diagonalisable, then the powers of the matrix are readily related to powers of
the eigenvalues, since
Ap = (PDP−1)^p = PDP−1 PDP−1 · · · PDP−1   (p times)  = P Dp P−1

since all the interior P−1P factors cancel. The p-th power of a diagonal matrix is simply

     [ λ1^p ..   0  ]
Dp = [  :    .   :  ]
     [  0   .. λn^p ]
Although we will not prove it here, it turns out that we can also use this to calculate any function of a diagonalisable matrix (provided that we can calculate this function on all the eigenvalues):

f(A) = P f(D) P−1 = P [ f(λ1) ..    0   ] P−1
                      [   :    .    :   ]
                      [   0   .. f(λn) ]

Note that this is not the same as applying f to each element of A: for almost all functions (except multiplication with a scalar, or adding a scalar) and all matrices (except diagonal ones), applying f element-wise is just wrong!
Now,

      [  1 0  2 ]
P−1 = [  1 1  1 ]
      [ −1 0 −1 ]
so that, using A² = PD²P−1,

     [ −1 0 −2 ] [ 2²  0  0  ] [  1 0  2 ]
A² = [  0 1  1 ] [ 0  2²  0  ] [  1 1  1 ]                    (19)
     [  1 0  1 ] [ 0   0  1² ] [ −1 0 −1 ]

     [ −1 0 −2 ] [  4 0  8 ]   [ −2 0 −6 ]
   = [  0 1  1 ] [  4 4  4 ] = [  3 4  3 ]                    (20)
     [  1 0  1 ] [ −1 0 −1 ]   [  3 0  7 ]
One can check by explicit calculation of AA that this is the correct result. For such small
powers of A, the usefulness of this method may not be apparent. However, computing A100 is
just as easy, since it involves computing only the diagonal matrix to this power.
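The whole computation above can be replayed numerically. This sketch is not part of the original notes; it uses the P and D of the diagonalisation example, and only the diagonal matrix is ever raised to a power:

```python
import numpy as np

A = np.array([[0, 0, -2],
              [1, 2, 1],
              [1, 0, 3]], dtype=float)

P = np.array([[-1, 0, -2],      # eigenvectors from (17) and (18) as columns
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
D = np.diag([2.0, 2.0, 1.0])

# A^2 = P D^2 P^-1; the same trick gives A^100 just as cheaply, since only
# the diagonal entries of D need to be raised to the power.
A2 = P @ D**2 @ np.linalg.inv(P)
assert np.allclose(A2, A @ A)
assert np.allclose(A2, [[-2, 0, -6], [3, 4, 3], [3, 0, 7]])
```

Note that `D**2` squares the diagonal matrix element-wise, which for a diagonal matrix coincides with the matrix power.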
9 Vector analysis
If we combine vectors with analysis (differentiation, integration etc.) we have the field of
mathematics known as vector analysis. In the second year you will do a lot more on this
subject, but in this course you just get a little taste...
The simplest example is the straight line:

~x = ~a + t~u
where ~a is some point on the line, and ~u points along the direction of the line. This is an
example of a parameterised curve. Other simple examples include the circle, which has the
following parametric representation
[ x1 ]   [ c1 ]       [ cos θ ]
[ x2 ] = [ c2 ] + r [ sin θ ]
where ~c = (c1, c2)T is the centre of the circle, and r is its radius. The parameter that traces out the curve is here θ. To see that this corresponds to our previous definition of a circle, let's calculate the squared distance to the centre:
|~x − ~c|2 = r 2 (cos2 θ + sin2 θ) = r 2
The helix The helix is defined parametrically in the form (see figure)

~x(t) = (cos t, sin t, t)T
(figure: the helix (cos t, sin t, t)T plotted in 3-D, with x and y running between −1 and 1 and z increasing along the curve)
Find the unit tangent vector to the helix (cos t, sin t, t)T .
It is often necessary to think of quantities that can be defined at each point in space. For
example, if ~x is the position vector of a point in the atmosphere, then the temperature at that
point, T (~x) would be an example of a scalar field. That is, for each point in the space, we can
associate a scalar value, namely here the temperature.
A vector field is defined in a similar manner. For example, in the atmosphere, we may be interested to know the local wind direction ~w at any point ~x. This would be an example of a vector field, ~w(~x). An example of a 2-dimensional vector field is given in the following figure.
9.3 Surfaces
A surface can be defined implicitly through a scalar field. Consider, e.g.,

φ(~x) = z + x² + y²

The constraint

φ(~x) = c

then defines a surface: the set of all points ~x for which φ(~x) equals the constant c. Here it can be solved for z explicitly:
Figure 3: An example of a two dimensional vector field. At each point in the space, a
vector is defined, here a unit vector representing the wind direction. Note that
in general, the vectors need not have the same length.
z = c − x2 − y 2.
This surface is plotted in the figure below for a value of c = 10. Note that the effect of choosing
a different value of c would, in this example, simply shift the surface up or down the Z axis.
Figure 4: A two dimensional surface in a three dimensional space. The surface is the
quadratic function z = c − x2 − y 2 .
Given a scalar field we are interested in finding the direction in which the function increases
the fastest. It turns out that this direction can easily be calculated using derivatives, and at
any point ~x in space the vector pointing in the direction of fastest increase in φ(~x) is known as
the gradient of φ in ~x. The formula of the gradient (without derivation) is given by:

          [ ∂φ(~x)/∂x1 ]
∇φ(~x) = [     :      ]
          [ ∂φ(~x)/∂xn ]

where ∂φ(~x)/∂xi means the derivative of φ(~x) with respect to xi, where we treat all the other components of ~x as constants.
Examples:
• Calculate the gradient of φ(~x) = x² + xy² − z at the point ~x1 = (−1, 1, 50)T.

          [ 2x + y² ]                  [ −1 ]
∇φ(~x) = [   2xy   ]   →  ∇φ(~x1) = [ −2 ]
          [   −1    ]                  [ −1 ]

The gradient vector thus points in the direction (−1, −2, −1)T.
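The partial derivatives in this example can be checked against a finite-difference approximation. This sketch is not part of the original notes; the function names are our own:

```python
import numpy as np

def phi(x):
    # the scalar field from the example above
    return x[0]**2 + x[0] * x[1]**2 - x[2]

def grad_phi(x):
    # analytic gradient: (2x + y^2, 2xy, -1)
    return np.array([2 * x[0] + x[1]**2, 2 * x[0] * x[1], -1.0])

x1 = np.array([-1.0, 1.0, 50.0])

# central finite differences: (phi(x + h e_i) - phi(x - h e_i)) / (2h)
h = 1e-6
num = np.array([(phi(x1 + h * e) - phi(x1 - h * e)) / (2 * h)
                for e in np.eye(3)])
assert np.allclose(num, grad_phi(x1), atol=1e-5)
```

At ~x1 = (−1, 1, 50)T both the analytic and the numerical gradient give (−1, −2, −1)T, the direction of fastest increase of φ.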