6301 Discrete Mathematics for

Computer Scientists, Spring 2012


Alexander V. Sobolev

Department of Mathematics, University College London, Gower Street, London, WC1E 6BT
E-mail address: asobolev@maths.ucl.ac.uk

http://www.homepages.ucl.ac.uk/ucahaso

Abstract. This is a draft of a one-term course for first-year Computer Science students.

Contents

Chapter 1. Set Theory, Permutations, Groups
1. Basic definitions and concepts
2. Advanced operations on sets
3. Functions
4. Permutations and Groups
5. Binary operations and Groups
6. Equivalence relations and Lagrange's Theorem

Chapter 2. Elementary Number Theory
1. Basic definitions and concepts
2. Congruences

Chapter 3. Linear Algebra
1. Basic definitions and concepts
2. Matrices and linear maps
3. Linear systems
4. Inverting square matrices
5. Determinants
6. Eigenvalues and eigenvectors

CHAPTER 1

Set Theory, Permutations, Groups


1. Basic definitions and concepts
1.1. Basic definitions. A set is a collection of elements. Every
element is either in the set or not in the set. Sets are written with the
elements separated by commas and enclosed in curly brackets: e.g.,
{3, 10, 5} is a set with three elements, which are 3, 10 and 5. We use
letters to denote sets: A, S etc. If x is an element of the set S, we write
$x \in S$. If x is not an element of the set we write $x \notin S$. For example,
denoting the above set by B, we can claim that $3 \in B$, but $4 \notin B$.
If A, B are two sets and every element of A is also an element of
B, we say that A is a subset of B, and write $A \subset B$. This sometimes
is expressed as follows: A is a subset of B if for every $x \in A$ one also
has $x \in B$.
In the beginning we shall be concerned with the properties of sets
irrespective of their nature, but we shall use a very simple tool to
picture them. Very often it is convenient to consider sets as parts of a
bigger universal set, which we denote E, so that $A \subset E$, $B \subset E$ etc.
Then it is convenient to use the so-called Venn diagrams to visualise
the sets.
Two sets are equal, i.e. A = B, if every element of A is also an
element of B, and every element of B is an element of A. In other
words A = B if $A \subset B$ and $B \subset A$. Note that when we want to prove
that A = B we always check these two inclusions. For example, the
sets {3, 10, 5} and {3, 5, 10} are equal, i.e. the same.
If $A \subset B$ and $A \ne B$, then A is said to be a proper subset of B. In
this case we write $A \subsetneq B$.
Example.
(1) $\emptyset$ is the empty set, i.e. the set with no elements.
For any set A we have $\emptyset \subset A$.
(2) The set of natural numbers: $\mathbb{N} = \{1, 2, 3, \dots\}$.
(3) The set of integer numbers: $\mathbb{Z} = \{\dots, -1, 0, 1, 2, \dots\}$.
(4) The set of integers between 5 and 7:
$A = \{m \in \mathbb{Z} : 5 \le m \le 7\}$.
Note the notation, describing the property of the elements!
(5) The set of rational numbers: $\mathbb{Q} = \{ m/n : m \in \mathbb{Z}, n \in \mathbb{N} \}$.
(6) Real numbers: $\mathbb{R}$. This set includes such numbers as $\sqrt{2}$, $\sqrt{3}$
etc., which cannot be described as rational numbers.
Real numbers are depicted as points on the straight line.
(7) Set of coins in my pocket. Empty set?
1.2. Elementary operations on sets.
Union: for two sets A and B the union $A \cup B$ is a new set whose
elements are either in A or in B. In other words, $x \in A \cup B$ if either
$x \in A$ or $x \in B$. Note $A \cup B = B \cup A$. Venn diagram, truth table.
Intersection: given sets A and B the intersection $A \cap B$ is the set
of elements which are both in A and B. In other words, $x \in A \cap B$ if
$x \in A$ and $x \in B$. Note $A \cap B = B \cap A$. Venn diagram, truth table.
Difference: $A \setminus B$ is the set which consists of elements which are
in A, but not in B, i.e. $x \in A \setminus B$ if $x \in A$ and $x \notin B$. Note that
in general $A \setminus B \ne B \setminus A$! Venn diagram, truth table.
Symmetric difference: $A \triangle B = (A \setminus B) \cup (B \setminus A)$, i.e. it is the
set of elements which are either in A or B, but not in both. Venn
diagram, truth table. Note $A \triangle B = B \triangle A$.
The complement: the set $A^c$, whose elements are in the universal
set, but not in A, i.e. $A^c = E \setminus A$. Venn diagram, truth table.
Theorem 1.1 (De Morgan's Laws). Let A, B be two subsets of a
universal set E. Then
(1.1)    $A^c \cup B^c = (A \cap B)^c$,
(1.2)    $A^c \cap B^c = (A \cup B)^c$.

Proof. Let us prove (1.1). Let us use the definitions. To prove
that two sets coincide we need to check that $A^c \cup B^c \subset (A \cap B)^c$ and
$(A \cap B)^c \subset A^c \cup B^c$.
To prove the first inclusion assume that $x \in A^c \cup B^c$, i.e. $x \notin A$ or
$x \notin B$. This means that $x \notin A \cap B$, i.e. $x \in (A \cap B)^c$.
Assume now that $x \in (A \cap B)^c$, that is $x \notin A$ or $x \notin B$, which
means that $x \in A^c$ or $x \in B^c$, and hence $x \in A^c \cup B^c$.
Similarly one proves (1.2).

Alternative proof 1: Venn diagrams.

Alternative proof 2: Truth tables.

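Theorem 1.2. For any three sets A, B, C we have
$(A \triangle B) \triangle C = A \triangle (B \triangle C)$,
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.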
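The identities of Theorems 1.1 and 1.2 can also be checked mechanically on a small universal set. The following is only a sketch: the universal set E = {0, 1, 2, 3} and the helper names are chosen for illustration, and a finite check of this kind illustrates the statements but does not replace the proofs.

```python
from itertools import combinations

E = frozenset(range(4))  # small universal set, chosen only for illustration


def subsets(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def complement(a):
    """Complement of a inside the universal set E."""
    return E - a


ALL = subsets(E)

# De Morgan's laws (1.1) and (1.2) for every pair of subsets of E.
de_morgan_holds = all(
    complement(a) | complement(b) == complement(a & b)
    and complement(a) & complement(b) == complement(a | b)
    for a in ALL for b in ALL
)

# Theorem 1.2: associativity of the symmetric difference (^ in Python)
# and the two distributive laws, for every triple of subsets of E.
theorem_1_2_holds = all(
    (a ^ b) ^ c == a ^ (b ^ c)
    and a & (b | c) == (a & b) | (a & c)
    and a | (b & c) == (a | b) & (a | c)
    for a in ALL for b in ALL for c in ALL
)

print(de_morgan_holds, theorem_1_2_holds)
```

Python's set operators `|`, `&`, `-` and `^` correspond to union, intersection, difference and symmetric difference respectively.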

2. Advanced operations on sets


Infinite unions and intersections. If we have many (possibly
infinitely many) sets it is convenient to label them using an index
set I. For example, the set of natural numbers can be an index set.
Then we write $A_1, A_2, \dots$ for the infinite collection of sets. In general,
$A_i$, $i \in I$, denotes a collection of sets labelled by i from the chosen index
set I.
Example. $A_n = [0, 1 - 1/n]$, $n = 1, 2, \dots$, so that $I = \mathbb{N}$.
$B_\alpha = [0, \alpha]$, $\alpha \in [0, 1]$, so that $I = [0, 1]$.
Then we define:
$$\bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \in I\} = \{x : \exists i \in I : x \in A_i\};$$
$$\bigcap_{i \in I} A_i = \{x : \forall i \in I : x \in A_i\}.$$

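Theorem 1.3 (Generalised De Morgan's Laws).
$$\Big(\bigcup_{i \in I} A_i\Big)^c = \bigcap_{i \in I} A_i^c, \qquad \Big(\bigcap_{i \in I} A_i\Big)^c = \bigcup_{i \in I} A_i^c.$$

Proof. Venn diagrams or truth tables wouldn't help! Suppose
that $x \in \big(\bigcup_{i \in I} A_i\big)^c$, that is $x \notin \bigcup_{i \in I} A_i$. This means that for all
$i \in I$ we have $x \notin A_i$, i.e. $x \in A_i^c$. Therefore $x \in \bigcap_{i \in I} A_i^c$.
Conversely, assume that $x \in \bigcap_{i \in I} A_i^c$. This means that for every
$i \in I$ we have $x \in A_i^c$, i.e. for no $i \in I$ we have $x \in A_i$, and hence x is
not in $\bigcup_{i \in I} A_i$, as required.
Similarly one proves the second statement.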

Ordered pairs and products. We have already seen that the
order of elements does not matter for a set. However, it is useful to
introduce the so-called ordered pairs, i.e. objects of the form (x, y),
where x, y are elements of some sets X and Y. In this object the order
of elements is important.
Example. Points on the plane are represented by their coordinates:
(x, y). The points (1, 2) and (2, 1) are different!
Given two sets X and Y we define the product set or simply
product $X \times Y$ as the set of ordered pairs (x, y), where $x \in X$ and
$y \in Y$.
Example. The plane is the product $\mathbb{R} \times \mathbb{R}$.
The product $[0, 1] \times [0, 1]$ is the unit square.
$\{0, 1\} \times \{a, b, c\} = \{(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)\}$.
Proposition 1.4. $(A \cup B) \times C = (A \times C) \cup (B \times C)$.
Proof. Let $(x, y) \in (A \cup B) \times C$, that is $x \in A \cup B$ and $y \in C$.
Then either $x \in A$ or $x \in B$. Thus $(x, y) \in A \times C$ or $(x, y) \in B \times C$
respectively. In both cases $(x, y) \in (A \times C) \cup (B \times C)$.
Conversely, if $(x, y) \in (A \times C) \cup (B \times C)$, then either $(x, y) \in A \times C$
or $(x, y) \in B \times C$, so that either $x \in A$ or $x \in B$ with $y \in C$, which
means that $x \in A \cup B$ and hence $(x, y) \in (A \cup B) \times C$, as required.

Power sets. For any set X the power set P(X) is the set of all
subsets of X.
If X has n elements, then P(X) will have $2^n$ elements.
Example. If X = {0, 1}, then
$P(X) = \{\emptyset, \{0\}, \{1\}, \{0, 1\}\}$.
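As an illustration, the power set of a small finite set can be listed directly. This is a minimal sketch; the function name `power_set` is chosen here and is not part of the notes.

```python
from itertools import combinations


def power_set(xs):
    """Return the list of all subsets of xs, each as a frozenset."""
    items = list(xs)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# The example from the text: X = {0, 1}.
for subset in power_set({0, 1}):
    print(set(subset))

# A set with n elements has 2**n subsets; here n = 5.
print(len(power_set(range(5))) == 2 ** 5)
```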
3. Functions
3.1. Preliminaries. Let X and Y be sets.
Definition 1.5. A function (or mapping) from X to Y is a rule
which assigns to each element $x \in X$ a unique element $y \in Y$. Notation: $y = f(x)$ and $f : X \to Y$.
The element y is called the image of x, and x is called a pre-image of y. The set $R(f) = \{y \in Y : y = f(x), x \in X\}$ is called the
image (or range) of the function.
For any set X the notation $\mathrm{id}_X$ stands for the function which does
nothing to $x \in X$; it is called the identity on X.
We have the following operations on functions:
Definition 1.6. Let $f : X \to Y$ and $g : Y \to Z$. The function
$(g \circ f)(x) = g(f(x))$
is called the composition of f and g.
If $f : X \to Y$ and $g : Y \to X$ are functions such that $g \circ f = \mathrm{id}_X$,
then g is called a left inverse of the function f. If $f \circ g = \mathrm{id}_Y$, then g
is a right inverse of f. A function g which is both a left and a right inverse
is said to be a two-sided inverse, or simply inverse.
Note the obvious identities:
$\mathrm{id}_Y \circ f = f$, $f \circ \mathrm{id}_X = f$,
and
$f \circ (g \circ h) = (f \circ g) \circ h$.
Example. Let X = {1, 2, 3}, Y = {a, b}. Define f by f(1) =
a, f(2) = f(3) = b. Define g by g(a) = 1, g(b) = 3. Then $(f \circ g)(a) = a$
and $(f \circ g)(b) = b$, so that g is a right inverse.
On the other hand $(g \circ f)(2) = g(b) = 3$, so g is not a left inverse!
Note that the right inverse is not unique. Indeed, let $h : Y \to X$ be
defined by h(a) = 1, h(b) = 2. Then $(f \circ h)(a) = a$ and $(f \circ h)(b) = b$,
so that h is another right inverse.
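The finite example above can be replayed by representing each function as a Python dictionary. This is only a sketch; the helper names `compose` and `identity` are chosen here for illustration.

```python
def compose(g, f):
    """Return the composition g∘f as a dict: x -> g[f[x]]."""
    return {x: g[f[x]] for x in f}


def identity(s):
    """Return the identity function on the set s as a dict."""
    return {x: x for x in s}


X, Y = {1, 2, 3}, {"a", "b"}
f = {1: "a", 2: "b", 3: "b"}
g = {"a": 1, "b": 3}
h = {"a": 1, "b": 2}

print(compose(f, g) == identity(Y))  # is g a right inverse of f?
print(compose(f, h) == identity(Y))  # is h a right inverse of f?
print(compose(g, f) == identity(X))  # is g a left inverse of f?
```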
Example. Let $f(x) = e^x$, $f : \mathbb{R} \to \mathbb{R}$. To construct an inverse define
on $\mathbb{R}_+ = \{t \in \mathbb{R} : t > 0\}$ the function $h(t) = \log t$ as the unique number
$x \in \mathbb{R}$ such that $e^x = t$. By this definition $(h \circ f)(x) = \log(e^x) = x$
for all $x \in \mathbb{R}$, so that h is a left inverse of $e^x$. If we want to make this
function the right inverse, we need to re-define f as a function from $\mathbb{R}$
into $\mathbb{R}_+$. Then $(f \circ h)(t) = e^{\log t} = t$ for all t > 0.
3.2. Types of functions. For applications it is crucial to be able
to find inverses. This leads us to the following definition:
Definition 1.7. A function $f : X \to Y$ is said to be injective
(or an injection) if for all $a, b \in X$, $a \ne b$ implies $f(a) \ne f(b)$. In
other words, if f(a) = f(b), then a = b, i.e. each $y \in Y$ has at most one
pre-image.
It is said to be surjective (or a surjection) if for every $y \in Y$ there
is at least one pre-image, i.e. an element $x \in X$ such that f(x) = y.
The function is said to be bijective (or a bijection) if it is both
injective and surjective.
Theorem 1.8 (Inverses Theorem). Let $f : X \to Y$ be a function
between non-empty sets X, Y. Then
(1) f has a left inverse iff f is injective;
(2) f has a right inverse iff f is surjective;
(3) f has a two-sided inverse iff f is bijective.
Proof. (1) Let f have a left inverse, that is $g \circ f = \mathrm{id}_X$. Then
from f(a) = f(b) we infer that a = g(f(a)) = g(f(b)) = b, i.e. f is
injective.
Let f be injective. For each $y \in R(f)$ define g(y) = x, where x is
the uniquely defined element such that f(x) = y; for $y \notin R(f)$ let g(y)
be any fixed element of X. Then $g \circ f = \mathrm{id}_X$.
(2) Let f have a right inverse, i.e. $f \circ h = \mathrm{id}_Y$. Thus for each $y \in Y$
we have f(x) = y with x = h(y), so that y has a pre-image, as required.
Suppose that f is surjective, i.e. each y has at least one pre-image.
Choose one of them and denote it by h(y). Then f(h(y)) = y by construction.
(3) If f has a two-sided inverse, then it is an injection and a surjection by
(1), (2), i.e. a bijection. Conversely, if f is bijective, then by (1) and (2)
we know that f has left and right inverses, g and h. Thus it remains to show
that g = h. Write:
$g = g \circ \mathrm{id}_Y = g \circ (f \circ h) = (g \circ f) \circ h = \mathrm{id}_X \circ h = h$.

3.3. Calculating inverses. Let $f(x) = x^2$, $f : \mathbb{R} \to [0, \infty)$. This function is surjective.
Define $g(t) = \sqrt{t}$ for all $t \ge 0$. Then
$(f \circ g)(t) = (\sqrt{t})^2 = t$,
so that we have a right inverse. Left inverse? It does not exist, since
the function is not injective! The rule: reflect the graph of the function
in the line y = x and throw away the part which prevents it from being a
function!
For the injective function we do what we did for the exponential.
3.4. Countability.
Definition 1.9. A set X is countable if there exists a bijection
between X and N.
Sometimes finite sets are also called countable.
Theorem 1.10. Z is countable.
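Proof. For each $n \in \mathbb{Z}$ define
$$F(n) = \begin{cases} 2n, & n > 0; \\ 2(-n) + 1, & n \le 0. \end{cases}$$
It is bijective, since it has a two-sided inverse:
$$g(m) = \begin{cases} m/2, & m \text{ even}; \\ (1 - m)/2, & m \text{ odd}. \end{cases}$$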
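The two formulas can be checked against each other on a finite range. This is a sketch only: it tests finitely many values and so illustrates, rather than proves, that F and g are mutually inverse.

```python
def F(n):
    """Map an integer to a natural number: positives to evens, the rest to odds."""
    return 2 * n if n > 0 else 2 * (-n) + 1


def g(m):
    """Candidate inverse of F, defined on the natural numbers 1, 2, 3, ..."""
    return m // 2 if m % 2 == 0 else (1 - m) // 2


# g undoes F on a range of integers, and F undoes g on a range of naturals.
print(all(g(F(n)) == n for n in range(-50, 51)))
print(all(F(g(m)) == m for m in range(1, 101)))

# The first few values of F, listed for n = 0, 1, -1, 2, -2.
print([F(n) for n in (0, 1, -1, 2, -2)])
```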

Theorem 1.11. The set $\mathbb{N} \times \mathbb{N}$ is countable.
Proof. Arrange the pairs (m, n), $m, n \in \mathbb{N}$, in a table, and count
them.
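One possible way of "counting the table" is to walk through it along the anti-diagonals m + n = 2, 3, 4, and so on. The notes do not specify the order of counting, so the diagonal order used in this sketch is an assumption made for illustration.

```python
from itertools import islice


def pairs():
    """Yield every pair (m, n) of natural numbers, one anti-diagonal at a time."""
    s = 2  # current value of m + n
    while True:
        for m in range(1, s):
            yield (m, s - m)
        s += 1


first = list(islice(pairs(), 10))
print(first)
```

Every pair (m, n) appears exactly once, on the diagonal m + n, after finitely many steps, which is what a bijection with the natural numbers requires.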

As a corollary, the set Q is countable as well.
However, the set of real numbers R is not countable!

4. Permutations and Groups


4.1. Permutations, cycles.
Definition 1.12. Let $n \in \mathbb{N}$. A permutation of degree n is a
bijection $\sigma : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\}$. Notation: $S_n$ is the set of
all permutations of degree n.
The set $S_n$ is called the symmetric group of degree n.
Example.
(1) The only permutation of degree 1 is the function that takes 1 into 1.
(2) If n = 2, there are two permutations.
(3) If n = 3, there are 6 permutations. In general, there are n!
permutations of degree n.
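We often use the notation
$$\sigma = \begin{pmatrix} 1 & 2 & \dots & n \\ \sigma(1) & \sigma(2) & \dots & \sigma(n) \end{pmatrix}.$$
For example, $S_3$ contains the element
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}.$$
If $\sigma, \tau \in S_n$ are two permutations of the same degree n, we can construct
their composition $\sigma \circ \tau$. Since both $\sigma$ and $\tau$ are bijections, the result
is again a permutation. For example:
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 2 & 5 & 4 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 5 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 1 & 5 & 4 & 2 \end{pmatrix}.$$
Note that in general $\sigma \circ \tau \ne \tau \circ \sigma$.
Since permutations are bijections, they have inverses. The inversion is done by swapping the rows and re-arranging the columns so that
the top row is back to 1, 2, ..., n. For example,
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 5 & 3 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix}.$$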
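The composition and inversion rules can be sketched in a few lines, storing a permutation as the tuple of its bottom row. The representation and the function names are choices made for this illustration. Note that in the composition of two permutations the right-hand factor is applied first, as in the worked example above.

```python
def compose(sigma, tau):
    """Return sigma∘tau: apply tau first, then sigma (bottom rows, 1-based)."""
    return tuple(sigma[tau[i] - 1] for i in range(len(tau)))


def inverse(sigma):
    """Swap the two rows and re-sort the columns by the new top row."""
    inv = [0] * len(sigma)
    for i, image in enumerate(sigma, start=1):
        inv[image - 1] = i
    return tuple(inv)


sigma = (1, 3, 2, 5, 4)
tau = (2, 1, 4, 5, 3)

print(compose(sigma, tau))   # composition from the worked example
print(compose(tau, sigma))   # the opposite order, for comparison
print(inverse(tau))          # inverse from the worked example
```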
Every permutation can be represented as a composition of cycles:
Definition 1.13. Let $a_1, a_2, \dots, a_r \in \{1, 2, \dots, n\}$, $r \le n$, be distinct numbers. The permutation $\sigma$ defined by
$\sigma(a_1) = a_2$, $\sigma(a_2) = a_3$, ..., $\sigma(a_r) = a_1$,
and $\sigma(m) = m$ if $m \notin \{a_1, a_2, \dots, a_r\}$, is called a cycle of length r and
it is denoted $(a_1 a_2 \dots a_r)$.

Example.
(1) In $S_6$:
$$(1435) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 4 & 2 & 5 & 3 & 1 & 6 \end{pmatrix}.$$
(2) Not all permutations are cycles, but all of them can be represented as a composition of cycles:
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix} = (12)(354).$$
(3) Another observation: cycles can be written in more than one
way: (123) = (231).
(4) A cycle is a convenient object to take a power of. Indeed $(354)^3 = \mathrm{id}$,
so that $(354)^{24} = \mathrm{id}$ and $(354)^{26} = (354)^2 = (345)$. Similarly, $(1534)^{246} =
(1534)^2 = (13)(45)$.
Definition 1.14. Two cycles $(a_1 a_2 \dots a_r)$ and $(b_1 b_2 \dots b_s)$ are called
disjoint if they have no common elements.
For example, (12) and (56) are disjoint, but (12) and (158) are not.
Proposition 1.15. If $\tau = (b_1 b_2 \dots b_s)$ and $\sigma = (a_1 a_2 \dots a_r)$ are
disjoint cycles, then $\sigma \circ \tau = \tau \circ \sigma$.
Proof. We want to show that $(\sigma \circ \tau)(m) = (\tau \circ \sigma)(m)$ for each $m \in
\{1, 2, \dots, n\}$.
Suppose that $m \in \{a_1, a_2, \dots, a_r\}$, so that $m \notin \{b_1, b_2, \dots, b_s\}$.
Thus $\tau(m) = m$ and $\sigma(m) \notin \{b_1, b_2, \dots, b_s\}$, so that $(\sigma \circ \tau)(m) = \sigma(m)$
and $(\tau \circ \sigma)(m) = \sigma(m)$.
The same is true if $m \in \{b_1, b_2, \dots, b_s\}$.
Now assume that m is in neither of the two sets, so that $\sigma(m) =
\tau(m) = m$, so that the claimed identity is trivial.

As a consequence, $(\sigma \circ \tau)^k = \sigma^k \circ \tau^k$ for any two disjoint cycles.
Theorem 1.16. Any permutation can be represented as a composition of disjoint cycles.
Proof. Start with 1 and write $(1, \sigma(1), \sigma^2(1), \dots, \sigma^k(1))$ where k is
the minimal number such that $\sigma^{k+1}(1) = 1$. Clearly, $k \le n$. Now take
the minimal m which is not in the above set and repeat the procedure:
$(m, \sigma(m), \sigma^2(m), \dots, \sigma^l(m))$, where l is the smallest number such that
$\sigma^{l+1}(m) = m$. This cycle is disjoint from the previous one. Continue
until all elements are exhausted.


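Example.
(1) Let
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 7 & 6 & 5 & 4 & 3 & 2 & 1 \end{pmatrix}.$$
Then
$\sigma(1) = 7$, $\sigma(7) = 1$, so we have a cycle (17).
$\sigma(2) = 6$, $\sigma(6) = 2$, so (26).
$\sigma(3) = 5$, $\sigma(5) = 3$, so (35).
$\sigma(4) = 4$, and hence $\sigma = (17)(26)(35)$.
(2)
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 2 & 5 & 8 & 1 & 7 & 6 & 9 & 4 \end{pmatrix} = (135)(489)(67).$$
(3)
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 5 & 1 & 3 & 2 & 4 & 6 \end{pmatrix} = (1542).$$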
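The procedure from the proof of Theorem 1.16 translates directly into a short routine. This is a sketch under the assumption that a permutation is stored as the tuple of its bottom row; the function name `cycles` is chosen here, and cycles of length 1 (fixed points) are omitted, as in the examples above.

```python
def cycles(sigma):
    """Disjoint cycles of sigma (bottom row, 1-based), fixed points omitted."""
    seen, result = set(), []
    for start in range(1, len(sigma) + 1):
        if start in seen:
            continue
        # Follow start -> sigma(start) -> sigma(sigma(start)) -> ... until it closes.
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = sigma[x - 1]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result


print(cycles((7, 6, 5, 4, 3, 2, 1)))
print(cycles((3, 2, 5, 8, 1, 7, 6, 9, 4)))
print(cycles((5, 1, 3, 2, 4, 6)))
```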
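Now to raise a permutation to power k represent it as a product (composition) of disjoint cycles $\sigma = \sigma_1 \sigma_2 \dots \sigma_p$, and then write $\sigma^k =
\sigma_1^k \sigma_2^k \dots \sigma_p^k$.
Example.
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 4 & 5 & 9 & 3 & 8 & 6 & 7 & 1 \end{pmatrix}^{1234} = (1249)^{1234} (35)^{1234} (687)^{1234} = (1249)^2 \, \mathrm{id} \, (687) = (14)(29)(687).$$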
Definition 1.17. For a permutation $\sigma$ the order of $\sigma$ is the smallest
natural number k such that $\sigma^k = \mathrm{id}$.
For a cycle the order is its length. Clearly, such a number exists,
since every permutation is a composition of cycles.
Theorem 1.18. If $\sigma = \sigma_1 \dots \sigma_p$ with disjoint cycles of lengths $l_1, \dots, l_p$, then
the order of the permutation equals the least common multiple of the
lengths.
Proof. Let $l = \mathrm{lcm}(l_1, l_2, \dots, l_p)$. Then $\sigma^l = \sigma_1^l \dots \sigma_p^l = \mathrm{id}$, since $\sigma_j^l = \mathrm{id}$ for
all j = 1, 2, ..., p.
Let us prove the converse, i.e. that $\sigma^k \ne \mathrm{id}$ for all $0 < k < l$. Suppose
that $\sigma^k = \mathrm{id}$ for some such k. Since k is not a common multiple of the lengths,
one of the $\sigma_j^k$ is not id. Assume that $\sigma_1^k \ne \mathrm{id}$. Then
there is an $x \in \{1, 2, \dots, n\}$ such that $\sigma_1^k(x) \ne x$ and, since the cycles are
disjoint, $\sigma_2^k \dots \sigma_p^k(x) = x$,
and hence $\sigma^k(x) \ne x$, which gives a contradiction.
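Theorem 1.18 and the power example with exponent 1234 can be cross-checked numerically. The following is a sketch only: the helper names are chosen for illustration, a permutation is stored as its bottom row, and the power is computed by repeated squaring rather than by the cycle method of the text.

```python
from math import gcd


def cycle_lengths(sigma):
    """Lengths of the disjoint cycles of sigma, fixed points counted as length 1."""
    seen, lengths = set(), []
    for start in range(1, len(sigma) + 1):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = sigma[x - 1]
            length += 1
        if length:
            lengths.append(length)
    return lengths


def order_by_lcm(sigma):
    """Order of sigma as the least common multiple of its cycle lengths."""
    result = 1
    for l in cycle_lengths(sigma):
        result = result * l // gcd(result, l)
    return result


def power(sigma, k):
    """sigma composed with itself k times (k >= 0), by repeated squaring."""
    n = len(sigma)
    result = tuple(range(1, n + 1))
    base = sigma
    while k:
        if k & 1:
            result = tuple(base[result[i] - 1] for i in range(n))
        base = tuple(base[base[i] - 1] for i in range(n))
        k >>= 1
    return result


sigma = (2, 4, 5, 9, 3, 8, 6, 7, 1)  # (1249)(35)(687) from the example
identity = tuple(range(1, 10))

print(order_by_lcm(sigma))
print(power(sigma, order_by_lcm(sigma)) == identity)
print(power(sigma, 1234))
```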


4.2. Transpositions.
Definition 1.19. A transposition is a cycle of length 2.
Example: (12). Every permutation can be written as a product of
transpositions. To see this, decompose $\sigma$ as a product of disjoint cycles:
$\sigma = \sigma_1 \sigma_2 \dots \sigma_r$. So, it suffices to represent each cycle as a product of
transpositions:
$(a_1 a_2 \dots a_r) = (a_1 a_2)(a_2 a_3) \dots (a_{r-1} a_r)$.
The number of transpositions in this representation is r - 1.
Example.
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 2 & 5 & 8 & 1 & 7 & 6 & 9 & 4 \end{pmatrix} = (135)(489)(67) = (13)(35)(48)(89)(67).$$
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 4 & 5 & 9 & 3 & 8 & 6 & 7 & 1 \end{pmatrix} = (1249)(35)(687) = (12)(24)(49)(35)(68)(87).$$
Definition 1.20. Let $\sigma$ be a cycle of length l. Then the signature
of $\sigma$ is defined to be $\varepsilon(\sigma) = (-1)^{l-1}$.
For a permutation written as $\sigma = \sigma_1 \sigma_2 \dots \sigma_r$ with disjoint cycles
$\sigma_j$ of lengths $l_j$, j = 1, 2, ..., r, the signature is
$$\varepsilon(\sigma) = \prod_{j=1}^{r} \varepsilon(\sigma_j) = (-1)^{\sum_{j=1}^{r} (l_j - 1)}.$$
A permutation is called even if $\varepsilon(\sigma) = 1$, and odd if $\varepsilon(\sigma) = -1$.
For any two permutations $\sigma$, $\tau$ one has
$\varepsilon(\sigma \tau) = \varepsilon(\sigma)\varepsilon(\tau)$.
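The signature formula and its multiplicativity can be checked by brute force on a small symmetric group. This is only a sketch: the choice of $S_4$, the bottom-row representation and the function names are assumptions made for illustration, and a finite check does not prove the general statement.

```python
from itertools import permutations


def signature(sigma):
    """(-1) raised to the sum of (l_j - 1) over the disjoint cycles of sigma."""
    seen, exponent = set(), 0
    for start in range(1, len(sigma) + 1):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = sigma[x - 1]
            length += 1
        if length:
            exponent += length - 1
    return (-1) ** exponent


def compose(sigma, tau):
    """Return sigma∘tau: apply tau first, then sigma."""
    return tuple(sigma[tau[i] - 1] for i in range(len(tau)))


S4 = list(permutations(range(1, 5)))

# Multiplicativity of the signature, checked for every pair in S4.
multiplicative = all(
    signature(compose(s, t)) == signature(s) * signature(t)
    for s in S4 for t in S4
)
print(multiplicative)

# The two permutations from the transposition example above.
print(signature((3, 2, 5, 8, 1, 7, 6, 9, 4)))
print(signature((2, 4, 5, 9, 3, 8, 6, 7, 1)))
```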
5. Binary operations and Groups
5.1. Definitions. Let X be a set. A binary operation on X is a
function $* : X \times X \to X$. Instead of $*(x, y)$ we write $x * y$.
Example.
(1) $X = \mathbb{Z}$, $x * y = x + y$, the usual addition of integers.
(2) $X = P(\mathbb{R})$, the power set of $\mathbb{R}$: $A * B = A \cup B$.
(3) $X = \mathbb{R}$: $x * y = xy$, the usual product.
Alternatively, we can take $X = \mathbb{R} \setminus \{0\}$.
(4) $X = \mathbb{R}$: $x * y = y$.
(5) $X = S_n$: $\sigma * \tau = \sigma \circ \tau$.
A binary operation is said to be associative if it satisfies the condition $\forall x, y, z$: $(x * y) * z = x * (y * z)$.

Example.
(1) $X = \mathbb{Z}$: (x + y) + z = x + (y + z) for all integer x, y, z.
(2) Same for examples 2, 3, 4, 5 above.
For example 4:
$(x * y) * z = z$, $x * (y * z) = x * z = z$,
so it is also associative.
(3) $X = \mathbb{R}$, $x * y = (x + y)/2$. Then
$$(x * y) * z = \frac{\frac{x + y}{2} + z}{2} = \frac{x + y + 2z}{4},$$
and on the other hand,
$$x * (y * z) = \frac{x + \frac{y + z}{2}}{2} = \frac{2x + y + z}{4},$$
so this operation is not associative.
An element $x \in X$ is called an identity element for the operation $*$
if it satisfies $x * y = y * x = y$ for all $y \in X$.
Example.
(1) $X = \mathbb{Z}$: for the operation $x * y = x + y$ the number 0 is the identity element.
(2) $X = \mathbb{R}$ with $x * y = xy$: the identity is 1.
(3) $X = P(\mathbb{R})$, $A * B = A \cup B$. The identity is $\emptyset$.
(4) $X = S_n$, $\sigma * \tau = \sigma \circ \tau$. The identity is id.
There is at most one identity element. Indeed, suppose there are
two: $e_1, e_2$. Then $e_1 = e_1 * e_2 = e_2$. There may be no identity element
at all. For example, let $X = \mathbb{R}$ and $x * y = y$. Suppose e is the identity,
so $x = x * e = e$ and $x = e * x = x$. If $x \ne e$, we have a contradiction.
From now on we denote the identity by $1_X$.
Suppose that $*$ is associative, and has an identity. An element
$x \in X$ is called an inverse of an element $y \in X$ if $x * y = y * x = 1_X$.
An element may have at most one inverse. Indeed, assume that y has
two inverses: x, z, so that
$x = x * 1_X = x * (y * z) = (x * y) * z = 1_X * z = z$.
It might happen that an element has no inverse at all. For example,
for $X = P(\mathbb{R})$ and $A * B = A \cup B$ the identity is $\emptyset$, but unless $A = \emptyset$,
one can't find a B such that $A \cup B = \emptyset$.
Definition 1.21. A group is a pair $(G, *)$ where G is a non-empty
set and $*$ is an operation satisfying:
(1) $*$ is associative;
(2) There is an identity for $*$;
(3) Every element $g \in G$ has an inverse.
If the group has finitely many elements, it is called finite, and the
number of elements is called the order of the group, denoted #G.
Example.
(1) $(\mathbb{Z}, +)$ is a group.
(2) $(\mathbb{R} \setminus \{0\}, \cdot)$ is a group.
(3) $(S_n, \circ)$ is a group.
(4) $(\mathbb{R}, *)$, $x * y = y$, is not a group.
(5) $(P(\mathbb{R}), \cup)$ is not a group.
(6) $(P(X), \triangle)$ is a group for any set X.
The last example is formulated as a Theorem:
Theorem 1.22. The pair $(P(X), \triangle)$ is a group.
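Proof. Note first that $\triangle$ is a binary operation. To prove that it
is associative, we can use the truth tables.
$$\begin{array}{ccc|cc|cc}
A & B & C & A \triangle B & B \triangle C & (A \triangle B) \triangle C & A \triangle (B \triangle C) \\ \hline
1 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}$$
It shows that $\triangle$ is indeed associative, i.e. $(A \triangle B) \triangle C = A \triangle (B \triangle C)$.
Clearly, $\emptyset$ is the identity: $A \triangle \emptyset = \emptyset \triangle A = A$.
Each element is its own inverse: $A \triangle A = \emptyset$.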

5.2. Dihedral group $D_8$. This is the group of symmetries of a
square. It consists of eight operations which preserve the square: four
rotations and four reflections: $\{\mathrm{id}, R_1, R_2, R_3, F_1, F_2, F_3, F_4\}$. Any two
symmetries can be composed to give another symmetry, e.g. $R_1 F_2 =
F_3$. The inverses of the elements are
$\mathrm{id}^{-1} = \mathrm{id}$, $R_1^{-1} = R_3$, $R_2^{-1} = R_2$, $R_3^{-1} = R_1$,
$F_1^{-1} = F_1$, $F_2^{-1} = F_2$, $F_3^{-1} = F_3$, $F_4^{-1} = F_4$.
If one attaches labels to the vertices of the square in the counterclockwise direction, one can calculate compositions of symmetries by representing them as permutations:
$R_1 = (1234)$, $R_2 = (13)(24)$, $R_3 = (1432)$,
$F_1 = (12)(43)$, $F_2 = (13)$, $F_3 = (14)(23)$, $F_4 = (24)$.
Now it is easy to find the compositions, e.g.
$$F_1 R_2 F_3 R_3 F_4 = (12)(43)(13)(24)(14)(23)(1432)(24) = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix} = (14)(23) = F_3.$$
Now one can draw the multiplication table.
More generally, a regular n-gon has 2n symmetries, which form a
group called the dihedral group $D_{2n}$ of order 2n.
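The compositions in $D_8$ can be verified with a few lines of code. This is a sketch: the helper `from_cycles` and the bottom-row representation are choices made here for illustration. As in the text, a product is read from right to left, so the rightmost symmetry is applied first.

```python
from functools import reduce


def from_cycles(cycles, n=4):
    """Bottom row of the permutation of {1..n} given by disjoint cycles."""
    row = list(range(1, n + 1))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            row[a - 1] = b
    return tuple(row)


def compose(sigma, tau):
    """Return sigma∘tau: apply tau first, then sigma."""
    return tuple(sigma[tau[i] - 1] for i in range(len(tau)))


ID = from_cycles([])
R1 = from_cycles([(1, 2, 3, 4)])
R2 = from_cycles([(1, 3), (2, 4)])
R3 = from_cycles([(1, 4, 3, 2)])
F1 = from_cycles([(1, 2), (4, 3)])
F2 = from_cycles([(1, 3)])
F3 = from_cycles([(1, 4), (2, 3)])
F4 = from_cycles([(2, 4)])
D8 = [ID, R1, R2, R3, F1, F2, F3, F4]

# The long product from the text and the shorter example R1 F2.
product = reduce(compose, [F1, R2, F3, R3, F4])
print(product == F3)
print(compose(R1, F2) == F3)

# The eight symmetries are closed under composition.
closed = all(compose(a, b) in D8 for a in D8 for b in D8)
print(closed)
```

Looping over all pairs in `D8` with `compose` is one way to produce the multiplication table mentioned above.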
5.3. Subgroups.
Definition 1.23. Let $(G, *)$ be a group. A non-empty subset $H \subset G$ is said
to be a subgroup if $(H, *)$ is a group. In other words, H is a group
with respect to the operation on G. This is equivalent to saying that
(1) For any $g, h \in H$ we have $g * h \in H$,
(2) For every $g \in H$ also $g^{-1} \in H$.
Example.
(1) $\{1_G\}$ is a subgroup of any group G.
(2) G is its own subgroup.
(3) Let $g \in G$. Denote by $\langle g \rangle$ the subgroup $\{g^k, k \in \mathbb{Z}\}$, where
$g^0 = 1_G$. If the group G is finite, then
$\langle g \rangle = \{1_G, g, g^2, \dots, g^{n-1}\}$,
where n is the smallest natural number such that $g^n = 1_G$. This subgroup is called cyclic. The element g is called the generating
element. The number n is called the order of the subgroup
and it is written as o(g).
(4) The dihedral group $D_{2n}$ is a subgroup of $S_n$.
6. Equivalence relations and Lagrange's Theorem
6.1. Definitions.
Definition 1.24. Let X be a set. A relation on X is a subset
$R \subset X \times X$. If $(x, y) \in R$, then we write xRy and say that x is related
to y.
Example.
(1) X any set and $R = \emptyset$. This relation does not relate any pairs.
(2) X any set and R = {(x, x)}. Then each $x \in X$ is related to
itself.
(3) $X = P(\mathbb{N})$ and $R = \{(A, B) : A \subset B \subset \mathbb{N}\}$. Then $\emptyset$ is related
to any element, and any element is related to $\mathbb{N}$. {1}R{1, 2},
but {1} is not related to {3}.
(4) $X = \mathbb{R}$, $R = \{(x, y) : x \le y\}$.
(5) $X = \mathbb{R}$, $R = \{(x, y) : x \ne y\}$.


We focus on relations satisfying some extra conditions.
Definition 1.25. A relation R is called reflexive if xRx for any
$x \in X$.
A relation R is called symmetric if xRy implies yRx.
A relation R is called transitive if xRy, yRz imply xRz.
A relation R is said to be an equivalence relation if it is reflexive,
symmetric and transitive.
Example. Example 2 above defines an equivalence relation.
Example (3) is reflexive, transitive, but not symmetric.
Example (4) is transitive, reflexive, but not symmetric.
Example (5) is not even reflexive! It is symmetric, and not transitive.
Definition 1.26. Let $X = \mathbb{Z}$, and let $n \in \mathbb{N}$ be a number. Then
we say that two integer numbers m, p are congruent modulo n iff
m - p is divisible by n. Notation: $m \equiv p \,(n)$ or $m \equiv p \bmod n$.
Proposition 1.27. Congruence modulo n is an equivalence relation.
Proof. Since m - m = 0 is divisible by n, we have $m \equiv m \,(n)$.
This proves reflexivity.
Suppose $m \equiv p \,(n)$, i.e. m = p + nk with some $k \in \mathbb{Z}$, so p - m =
-nk, and hence $p \equiv m \,(n)$. This means symmetry.
Suppose $m \equiv p \,(n)$ and $p \equiv q \,(n)$, that is m = p + nk and p = q + nl
with some integer k, l. Therefore m = q + n(l + k), i.e. $m \equiv q \,(n)$.
This proves transitivity.

Examples: $8 \equiv 5 \,(3)$, $31 \equiv 1 \,(10)$ etc.
6.2. Partitions and equivalence relations.
Definition 1.28. Let X be a set. A partition of X is a set of
subsets $X_j \subset X$ such that every element belongs to exactly one of the
$X_j$'s. In other words, $X_j \cap X_k = \emptyset$ if $j \ne k$ and $\bigcup_j X_j = X$.
Example.
(1) If X = {1, 4, 5, 8, 6}, then the sets $X_1 = \{1, 4, 5\}$
and $X_2 = \{8, 6\}$ form a partition.
(2) If $X = \mathbb{R}$, then the sets
$X_1 = \{x \in \mathbb{R} : x < 0\}$, $X_2 = \{x \in \mathbb{R} : 0 \le x \le 1\}$, $X_3 = \{x \in \mathbb{R} : x > 1\}$
form a partition.

A remarkable fact is that any equivalence relation induces a partition!


Definition 1.29. Let R be an equivalence relation on the set X.
For any $x \in X$ the equivalence class [x] is defined to be the set
$\{y \in X : xRy\}$.
Note that if xRy, then [x] = [y].
Theorem 1.30. The equivalence classes form a partition of X.
Proof. We need to show that (a) every element $x \in X$ belongs to
an equivalence class and (b) no element is in two distinct classes.
(a) is clear: $x \in [x]$, since R is reflexive.
(b) Suppose that $x \in [w]$ and $x \in [z]$. Let us show that [w] = [z].
We have wRx. On the other hand zRx, and by symmetry xRz, so
that by transitivity wRz, and hence [w] = [z].
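As an illustration of Theorem 1.30, the classes of congruence modulo n can be computed on a finite range of integers. This is a sketch: the function name, the modulus n = 3 and the range from -6 to 6 are chosen only for illustration, and the grouping step relies on the relation really being an equivalence relation.

```python
def equivalence_classes(elements, related):
    """Group elements into classes; assumes `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            # By transitivity it is enough to compare with one representative.
            if related(cls[0], x):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


n = 3
X = range(-6, 7)
classes = equivalence_classes(X, lambda m, p: (m - p) % n == 0)

for cls in classes:
    print(cls)
```

Each integer of the range lands in exactly one class, which is the partition property stated in the theorem.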

6.3. Lagrange's Theorem.
Theorem 1.31. Let H be a subgroup of a finite group G. Then
#H divides #G.
Proof. Let us define an equivalence relation on the group and
prove that each equivalence class contains #H elements. This will
mean that $n \cdot \#H = \#G$, where n is the number of equivalence classes.
For any $g, g' \in G$ we say that $g \sim g'$ iff $g (g')^{-1} \in H$. It is reflexive,
since for any g we have $g g^{-1} = I \in H$. It is symmetric since under
the condition $g (g')^{-1} = h \in H$ we also have $g' g^{-1} = h^{-1} \in H$. For
transitivity assume that $g_1 \sim g_2$, $g_2 \sim g_3$, so that $g_1 = h_1 g_2$ and
$g_2 = h_2 g_3$ with $h_1, h_2 \in H$, so that $g_1 = h_1 h_2 g_3$, whence $g_1 g_3^{-1} = h_1 h_2 \in H$. Thus $\sim$ is an
equivalence relation.
Let us show that each equivalence class has #H elements. For
$g \in G$ define $f : H \to [g]$ as follows: f(h) = hg. For each $g' \sim g$ we
have $g' = h' g$ with some $h' \in H$, so this map is a surjection. Assume that $f(h) = f(h')$,
that is $hg = h'g$. Multiplying by $g^{-1}$ on the right we get $h = h'$.
Therefore this map is an injection. Thus we have a bijection, and
hence the number of elements in [g] is exactly #H.

Corollary 1.32. For any $g \in G$ the order o(g) divides #G. If
#G = n, then $g^n = I$.
Proof. The set $H = \langle g \rangle$ is a cyclic subgroup with o(g) = #H =: m.
By Lagrange's Theorem o(g) divides #G, i.e. n = mk with
some natural k. Furthermore, $g^n = (g^m)^k = I^k = I$.

Corollary 1.33. If #G is a prime number, then G is a cyclic
group.

6.4. Applications of Lagrange's Theorem.
(1) Subgroups of $S_3$. The group has 6 elements. The orders of
possible subgroups are 1, 2, 3, 6.
(a) The subgroup of order 1 is {id}.
(b) Order 2. The generating element has order 2, and hence it
is a transposition. Thus possible subgroups are {id, (12)},
{id, (23)}, {id, (13)}.
(c) Order 6: the group itself: $S_3$.
(d) Order 3. Subgroups are cyclic, since 3 is a prime. Thus
the generating element has order 3. The only elements
of order 3 are cycles of length 3, and there are only two of them in $S_3$:
(123) and (132). Moreover, one of them is the square
of the other: $(123)^2 = (132)$. Therefore the subgroup is
{id, (123), (132)}.
(2) Subgroups of $D_{10}$ (group of symmetries of a regular pentagon).
Since $10 = 2 \cdot 5$, apart from {id} and $D_{10}$ itself we have possible subgroups of orders 2 or 5.
Both types are cyclic, since 2 and 5 are both prime.
(3) Subgroups of $D_8$. Divisors are 1, 2, 4, 8. Recall that the group
$D_8$ consists of 8 elements:
id, $R_1 = (1234)$, $R_2 = (13)(24)$, $R_3 = (1432)$,
$F_1 = (12)(43)$, $F_2 = (13)$, $F_3 = (14)(23)$, $F_4 = (24)$.
(a) Order 1: {id}.
(b) Order 8: $D_8$.
(c) Order 2: this is a cyclic subgroup. There are five elements
of order 2: the rotation $R_2 = (13)(24)$ and the reflections $F_1 = (12)(34)$, $F_2 = (13)$, $F_3 =
(14)(23)$, $F_4 = (24)$. Thus we have subgroups $\{\mathrm{id}, R_2\}$, $\{\mathrm{id}, F_1\}$,
$\{\mathrm{id}, F_2\}$, $\{\mathrm{id}, F_3\}$, $\{\mathrm{id}, F_4\}$.
(d) Order 4: this is not a prime, so the subgroup is not necessarily cyclic. We need to consider two cases:
(i) Suppose that H is cyclic. Thus it is generated by an
element of order 4. The group $D_8$ contains only two
elements of order 4: $R_1$ and $R_3 = R_1^3$. Both these elements generate the same subgroup $\{\mathrm{id}, R_1, R_1^2, R_1^3\}$
of order 4.
(ii) Suppose that H is not cyclic, so that no element of
H has order 4, and hence they may have only orders
1 or 2. The elements of order 2 are
$R_2, F_1, F_2, F_3, F_4$.
However, we cannot put them arbitrarily in groups
of four. The result may not be a group. For instance, if we take $\{\mathrm{id}, F_2, F_3, F_4\}$, then $F_2 F_4 = R_2$,
and hence the group operation does not preserve
this set! We have to be more careful and need to find
some extra features of the subgroup which would allow us to pick the right elements. Write the sought
subgroup in the form {id, A, B, C}, where A, B, C
are of order 2, so that $A^2 = B^2 = C^2 = \mathrm{id}$. The product AB
cannot be A or B (otherwise B = id or A = id), and it cannot be id
(otherwise $A = B^{-1} = B$), so AB = C. Similarly BA = C, so that AB = BA,
i.e. the elements commute with each other. Thus the subgroup
is commutative. Hence we need to look for pairs
of commuting elements. Among the reflections, the only commuting pairs are $F_2, F_4$ and $F_1, F_3$, since
$F_1 F_3 = F_3 F_1 = F_2 F_4 = F_4 F_2 = R_2$. Also, $R_2$ commutes with every reflection. Thus the possible subgroups have the form $\{\mathrm{id}, R_2, F_1, F_3\}$, $\{\mathrm{id}, R_2, F_2, F_4\}$.
6.5. More examples of subgroups.
(1) Let $G = \mathbb{Z}$ with usual addition. Then the set $\mathbb{N}$ is not a
subgroup. Indeed, $0 \notin \mathbb{N}$.
(2) $G = \mathbb{R}$ with addition. The set $\mathbb{Z}$ is a subgroup.
(3) $G = \mathbb{R} \setminus \{0\}$ with multiplication. Let $H = \{x \in G : x^2 \in \mathbb{Q}\}$.
Let us check that the group operation preserves the set: for
any $x, y \in H$ we have $x^2 y^2 \in \mathbb{Q}$, since $x^2, y^2 \in \mathbb{Q}$. Now, $1 \in H$,
since $1^2 \in \mathbb{Q}$. For any $x \in H$ we also have $x^{-2} \in \mathbb{Q}$, so that
the inverse $x^{-1}$ is in H. This proves that H is a subgroup.
(4) Again $G = \mathbb{Z}$ with addition. This is a group generated by one
element: $n_0 = 1$. Indeed: $n_0 * n_0 = 2$, $n_0 * n_0 * n_0 = 3$ and
$n_0^l = l$. Similarly, $n_0^{-1} = -1$, $n_0^{-1} * n_0^{-1} = -2$ and $n_0^{-k} = -k$.
(5) $G = \mathbb{R} \setminus \{0\}$ with multiplication. The set H = {1, -1} is a
cyclic subgroup generated by one element: -1.
(6) Let $A_n = \{\sigma \in S_n : \varepsilon(\sigma) = 1\}$, i.e. the set of all even permutations of degree n. This is called the alternating group.
Check that it is a group.

CHAPTER 2

Elementary Number Theory


1. Basic definitions and concepts
We are working with the set of integers now: Z.
1.1. Divisibility and primes.
Definition 2.1. If m, n are integers and there is an integer k such
that m = kn, then we say that n divides m (or more precisely, n is a
factor, or divisor, of m) and write $n \mid m$. If n is not a factor of m,
then we say that n does not divide m and write $n \nmid m$.
An integer m > 1 whose only positive divisors are 1 and m is called a
prime number, or simply a prime.
An integer n > 1 which is not a prime number is said to be composite.
Example: $2 \mid 6$, $13 \mid (-39)$, but $3 \nmid 10$. Note: $0 \mid 0$!
Note some immediate consequences of the above definition:
If $b \mid a$ and $c \mid b$, then $c \mid a$.
If $b \mid a$, then $bc \mid ac$ for any c.
If $c \mid a$ and $c \mid b$, then $c \mid ma + nb$ for any m, n.
A basic fact is the following division algorithm: for any pair of
numbers, $a \in \mathbb{Z}$, $b \ge 1$, there exists a uniquely defined pair of integers q, r such
that
(2.1)    $a = bq + r$, $0 \le r < b$.

The number r is called the remainder of division of a by b.
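In Python the pair (q, r) of (2.1) is returned by the built-in `divmod`, which for a positive divisor b already yields a remainder in the range 0 to b - 1, also when a is negative. The wrapper below is a small sketch; the name `divide` is chosen here for illustration.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, for integer a and b >= 1."""
    if b < 1:
        raise ValueError("b must be at least 1")
    q, r = divmod(a, b)  # for b > 0 the remainder satisfies 0 <= r < b
    return q, r


print(divide(17, 5))
print(divide(-17, 5))
```

For negative a the quotient is rounded down, not towards zero, which is exactly what keeps the remainder non-negative.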


Theorem 2.2. Every positive integer, except 1, either is prime, or
it can be represented as a product of primes. Moreover, this representation is unique in the following sense: if
$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}, \quad a_j > 0, \ j = 1, 2, \dots, k, \quad p_1 < p_2 < \dots < p_k,$$
with prime factors $p_1, p_2, \dots, p_k$, and at the same time,
$$n = q_1^{b_1} q_2^{b_2} \cdots q_s^{b_s}, \quad b_j > 0, \ j = 1, 2, \dots, s, \quad q_1 < q_2 < \dots < q_s,$$
with prime factors $q_1, q_2, \dots, q_s$, then
s = k;
$q_j = p_j$, j = 1, 2, ..., k,
$b_j = a_j$, j = 1, 2, ..., k.
Proof. Let $n > 1$. If $n$ is prime, then there is nothing to prove.
Suppose that $n$ has divisors strictly between 1 and $n$. Let $m$ be the least of these divisors. We claim that $m$ is prime. Indeed, if this were not the case, then there would be a number $l$ with $1 < l < m$ such that $l \mid m$, which would imply that $l \mid n$. This contradicts the definition of $m$.
Hence $n$ is divisible by a prime number, say $p_1$:
\[ n = p_1 n_1, \quad 1 < n_1 < n. \]
Now, either $n_1$ is prime, in which case the proof is complete, or it is divisible by a prime $< n_1$, which we denote $p_2$:
\[ n = p_1 p_2 n_2, \quad 1 < n_2 < n_1 < n. \]
Repeating this procedure, we obtain a decreasing sequence of numbers $n_1 > n_2 > \dots$, all greater than 1. At each step the current number $n_j$ is either prime or not. A decreasing sequence of positive integers is finite, and hence at some stage $n_k$ will be prime, so that
\[ n = p_1 p_2 \cdots p_k n_k \]
is a product of primes, as required.
The proof of uniqueness is omitted.
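The existence part of the proof is constructive: one repeatedly splits off the least divisor greater than 1, which is automatically prime. A minimal Python sketch of this idea (trial division; the name `factorize` and the dictionary output format are illustrative assumptions):

```python
def factorize(n):
    """Return {prime: exponent} for an integer n > 1, found by trial division."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    factors = {}
    p = 2
    while p * p <= n:
        # If p divides the current n, then p is its least divisor > 1,
        # because all smaller primes have already been divided out; so p is prime.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # What is left has no divisor p with p*p <= n, hence it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorize(200))  # {2: 3, 5: 2}
print(factorize(765))  # {3: 2, 5: 1, 17: 1}
```

Trial division is only practical for small $n$; it is shown here because it mirrors the argument of the proof.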

Theorem 2.3. There are infinitely many primes.


Proof. Suppose that there are only $k$ primes $p_1, p_2, \dots, p_k$, i.e. all numbers starting with $p_k + 1$ are composite. Let
\[ n = p_1 p_2 \cdots p_k + 1, \]
and let $p$ be a prime divisor of $n$. Then $p$ is one of the numbers $p_1, p_2, \dots, p_k$, so that $p \mid p_1 p_2 \cdots p_k$. Since $p \mid n$, we have
\[ p \mid (n - p_1 p_2 \cdots p_k). \]
In other words, $p \mid 1$, which is absurd, as $p > 1$. This proves the claim.

1.2. The greatest common divisor.
Definition 2.4. If $a, b$ are integers, not both zero, then the largest number $m$ such that $m \mid a$ and $m \mid b$ is called the greatest common divisor of $a, b$.
Notation: $d = (a, b)$.
If $(a, b) = 1$, then the numbers $a, b$ are called relatively prime, or coprime.

In order to write out some useful formulae relating $a$, $b$ and $(a, b)$ we introduce the following set:
\[ S = S(a, b) = \{xa + yb : x \in \mathbb{Z},\ y \in \mathbb{Z}\}. \]
Note that $(S, +)$ is a group. In particular, the sum of any two numbers from $S$ is again in $S$. Note that $a, b \in S$.
Lemma 2.5. Let $c$ be the smallest positive number in $S$. Then
\[ S = \{nc : n \in \mathbb{Z}\}, \]
or, in other words, any number $m \in S$ has the form $nc$ with some $n \in \mathbb{Z}$.
Proof. It is clear that all numbers $nc$ are in the set $S$. Now we need to show that $S$ contains no other numbers. Suppose that this is not the case, that is, there exists a number $m \in S$ which is not of the form $nc$. Then using (2.1) we can write
\[ m = nc + r \]
with some $r$ such that $0 < r < c$. Since $nc \in S$ and $S$ is a group, the number $r = m - nc$ is also in $S$. Since $0 < r < c$, this contradicts the definition of $c$. Thus no such $m$ exists, as claimed.

Theorem 2.6. Let $S = S(a, b)$ and $c \in S$ be as defined above. Then $c = d := (a, b)$.
Proof. By Lemma 2.5 there are numbers $n, m$ such that
\[ a = nc, \quad b = mc, \]
so that $c \mid a$ and $c \mid b$. Thus $c \le d$. On the other hand, $d \mid a$ and $d \mid b$, and hence $d \mid xa + yb$ for any $x, y$, so $d$ divides any element of $S$, and in particular $c$, which means that $d \le c$. Therefore $d = c$.

Corollary 2.7. For any two integers $a, b$ with $d = (a, b)$ the equation
\[ ax + by = n \]
is solvable in integers $x, y$ iff $d \mid n$.
In particular,
\[ ax + by = d \]
is solvable.
Using the above facts we prove the following theorem:
Theorem 2.8 (Euclid's first theorem). If $p$ is a prime, and $p \mid ab$, then either $p \mid a$ or $p \mid b$.
Proof. Suppose that $p \nmid a$, so that $(a, p) = 1$. By Theorem 2.6, one can find numbers $x, y$ such that
\[ xa + yp = 1. \]
Multiply by $b$:
\[ xab + ypb = b. \]
The left-hand side is divisible by $p$, and hence $p \mid b$, as claimed.

1.3. The Euclidean algorithm. Now we need to develop an algorithm which would allow us to find for any integers $a, b$
• their greatest common divisor $d$,
• numbers $x, y$ such that $xa + yb = d$.
Examples: $(6, 8) = 2$, $(8, 25) = 1$, $(7, 63) = 7$.
We use the division algorithm (2.1). Note that $m \mid a$ and $m \mid b$ iff $m \mid b$ and $m \mid r$. Therefore $(a, b) = (b, r)$. Thus we can apply the division algorithm repeatedly to find $(a, b)$.
Example (The Euclidean Algorithm). Find $(24, 356)$. Write:
\[ 356 = 14 \cdot 24 + 20, \]
so that $(24, 356) = (24, 20)$. Continue:
\[ 24 = 1 \cdot 20 + 4, \]
so that $(24, 356) = (20, 4) = 4$. Working backwards we see that
\[ 4 = 24 - 1 \cdot 20 = 24 - 1 \cdot (356 - 14 \cdot 24) = 15 \cdot 24 - 1 \cdot 356. \]
Find $(53, 77)$. Write:
\[ 77 = 1 \cdot 53 + 24, \quad 53 = 2 \cdot 24 + 5, \quad 24 = 4 \cdot 5 + 4, \quad 5 = 1 \cdot 4 + 1, \]
and hence $(77, 53) = (4, 1) = 1$. Therefore the numbers 53 and 77 are relatively prime. Again, working backwards, we find that
\[ 1 = 5 - 1 \cdot 4 = 5 - 1 \cdot (24 - 4 \cdot 5) = 5 \cdot 5 - 1 \cdot 24 = 5 \cdot (53 - 2 \cdot 24) - 1 \cdot 24 \]
\[ = 5 \cdot 53 - 11 \cdot 24 = 5 \cdot 53 - 11 \cdot (77 - 1 \cdot 53) = 16 \cdot 53 - 11 \cdot 77. \]
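Both steps of the procedure (repeated division, then working backwards) can be combined in a short recursive routine. The following Python sketch assumes non-negative inputs that are not both zero; the name `extended_gcd` is illustrative.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = (a, b) and x*a + y*b == d.

    Assumes a, b >= 0 and not both zero.
    """
    if b == 0:
        return a, 1, 0
    q, r = divmod(a, b)              # a = q*b + r, so (a, b) = (b, r)
    d, x, y = extended_gcd(b, r)     # d = x*b + y*r
    # Substitute r = a - q*b to express d through a and b.
    return d, y, x - q * y


print(extended_gcd(24, 356))  # (4, 15, -1)
print(extended_gcd(53, 77))   # (1, 16, -11)
```

The printed coefficients agree with the identities $4 = 15 \cdot 24 - 1 \cdot 356$ and $1 = 16 \cdot 53 - 11 \cdot 77$ obtained by hand.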

2. Congruences
2.1. Congruence classes. Recall Definition 1.26. We have established earlier that the congruence mod $n$ is an equivalence relation.
Let us establish some elementary properties of congruent numbers.
Lemma 2.9. If $a \equiv a' \pmod{n}$, $b \equiv b' \pmod{n}$, then $a + b \equiv a' + b' \pmod{n}$ and $ab \equiv a'b' \pmod{n}$.
Proof. By definition $a = a' + kn$ and $b = b' + mn$ with some integers $k, m$. Thus
\[ (a + b) - (a' + b') = n(k + m), \]
and
\[ ab - a'b' = a(b - b') + b'(a - a') = amn + b'kn = n(am + b'k). \]
The required relations follow.

To understand these matters better, we need a bit of terminology.


Definition 2.10. If $a \equiv b \pmod{n}$, then we say that $b$ is a residue of $a$ modulo $n$. If $0 \le b < n$ we say that $b$ is the least residue (or the least non-negative residue) of $a$ modulo $n$.
For any $a \in \mathbb{Z}$ we call $[a]$ the class of residues congruent to $a$ modulo $n$.
For example, 21 is a residue of 5 modulo 8, and at the same time 5 is the least (non-negative) residue of 21 modulo 8.
As a rule, when talking about congruences, we use the least residues.
So now we call the congruence classes the classes of residues. Let us show that every number $a \in \mathbb{Z}$ is congruent modulo $n$ to one of the numbers
\[ 0, 1, 2, \dots, n - 1. \tag{2.2} \]
To see this we use the division algorithm, i.e. the fact that for any $a \in \mathbb{Z}$ there exists a unique pair $q, r$ such that
\[ a = qn + r, \quad 0 \le r < n. \]
This identity shows that $r \equiv a \pmod{n}$ and that $r$ is defined uniquely. Now, since the list (2.2) does not contain congruent pairs, we conclude that there are exactly $n$ distinct classes of residues modulo $n$.
Definition 2.11. The set of $n$ numbers $a_1, a_2, \dots, a_n$ is said to be a complete system of residues modulo $n$, if every number from (2.2) is congruent to one of the numbers $a_l$, $l = 1, 2, \dots, n$.

In particular, (2.2) is a complete system of residues modulo $n$. Another example:
\[ 1, 3, 4, \dots, n - 1, n, n + 2. \]
In general, any collection of $n$ pairwise incongruent numbers is a complete system of residues.
Let us introduce a notation for the set of all classes of residues modulo $n$:
\[ \mathbb{Z}_n = \mathbb{Z}/n = \{[0], [1], \dots, [n - 1]\}. \]
There are two operations on $\mathbb{Z}_n$: addition and multiplication, defined as follows:
\[ [k] + [m] = [k + m], \quad [k][m] = [km], \]
for any two numbers $k, m$. In words, in order to find the sum of two classes, you should take one representative from each class, add them up and take the equivalence class of the obtained number. By Lemma 2.9 the result does not depend on the choice of representatives.
For example, let $[k]$ be the class of residues of the number $k$ modulo 5. If we want to find $[3] + [4]$ we add 3 and 4, which gives 7. Thus the answer is $[7]$. If we want to use the least non-negative residues, we can write $[2]$ instead of $[7]$. Thus $[3] + [4] = [2]$. In other words, in the language of residues,
\[ 3 + 4 \equiv 2 \pmod{5}. \]
Similarly, $[3] \cdot [4] = [12] = [2]$, i.e. $3 \cdot 4 \equiv 2 \pmod{5}$.
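The arithmetic of residue classes is easy to experiment with. In the Python sketch below each class is represented by its least non-negative residue; the helper names are illustrative assumptions.

```python
def residue(a, n):
    """Least non-negative residue of a modulo n (n >= 1)."""
    return a % n


def add_classes(k, m, n):
    """Least residue of the class [k] + [m] modulo n."""
    return residue(k + m, n)


def mul_classes(k, m, n):
    """Least residue of the class [k][m] modulo n."""
    return residue(k * m, n)


print(add_classes(3, 4, 5))   # 2
print(mul_classes(3, 4, 5))   # 2
# Other representatives of the same classes (8 for [3], -1 for [4]) give the same answer:
print(add_classes(8, -1, 5))  # 2
print(mul_classes(8, -1, 5))  # 2
```

The last two lines illustrate that the result depends only on the classes and not on the chosen representatives.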
2.2. Group properties. The pair $(\mathbb{Z}_n, +)$ is a group: the identity element is $[0]$, the addition is associative, and every element $[a]$ has an inverse: $-[a] = [n - a]$. On the contrary, the pair $(\mathbb{Z}_n, \cdot)$ is not a group, since at least one element does not have a multiplicative inverse: $[0]$. However, there may be other elements without inverses, e.g. in $\mathbb{Z}_4$ the element $[2]$ does not have an inverse:
\[ [2] \cdot [0] = [0], \quad [2] \cdot [1] = [2], \quad [2] \cdot [2] = [0], \quad [2] \cdot [3] = [2]. \]
So define
\[ \mathbb{Z}_n^* = (\mathbb{Z}_n^*, \cdot) = (\mathbb{Z}/n)^*, \]
as the set of residue classes modulo $n$ which have multiplicative inverses. For instance,
\[ \mathbb{Z}_4^* = \{[1], [3]\}, \quad [1]^{-1} = [1], \quad [3]^{-1} = [3]. \]
Theorem 2.12. The set $(\mathbb{Z}_n^*, \cdot)$ is a group.
Proof. Let us show that multiplication is a binary operation. Let $[a], [b] \in G = (\mathbb{Z}_n^*, \cdot)$, that is $[a]$ and $[b]$ have multiplicative inverses, which we call $[a_1]$ and $[b_1]$, so that $a a_1 \equiv 1 \pmod{n}$ and $b b_1 \equiv 1 \pmod{n}$. Thus $(ab)(a_1 b_1) \equiv 1 \pmod{n}$, so $[ab] \in G$ as well.
Clearly, the multiplication is associative, and $[1]$ is the identity element. By definition of $G$ every element has an inverse. Thus $G$ is a group.

Let us find out when a number is invertible modulo $n$.
Proposition 2.13. A number $a$ is invertible modulo $n$ iff $a$ and $n$ are relatively prime.
Proof. Let $d = (a, n)$. The number $a$ is invertible iff $ak \equiv 1 \pmod{n}$ with some $k$, that is, iff $ak + nb = 1$ with some $k$ and $b$. By Corollary 2.7 this equation is solvable in $k$ and $b$ iff $d \mid 1$, that is $d = 1$.

Definition 2.14. The Euler totient function $\varphi : \mathbb{N} \to \mathbb{N}$ is defined by $\varphi(n) = \#\mathbb{Z}_n^*$, i.e. for $n > 1$, $\varphi(n)$ is the number of positive integers less than $n$ which are coprime to $n$.
Example. $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(4) = 2$, $\varphi(5) = 4$, $\varphi(6) = 2$.
The easiest case is when $p$ is a prime, so that every positive number less than $p$ is relatively prime to $p$, and
\[ \mathbb{Z}_p^* = \{[1], [2], \dots, [p - 1]\}, \quad \varphi(p) = p - 1. \]
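For small $n$ the invertible classes can simply be listed using Proposition 2.13. A minimal Python sketch (the names `units` and `phi_by_counting` are illustrative, and counting is only sensible for small $n$):

```python
from math import gcd


def units(n):
    """Least residues a in 0..n-1 whose class [a] is invertible modulo n."""
    return [a for a in range(n) if gcd(a, n) == 1]


def phi_by_counting(n):
    """Euler totient computed directly as the number of invertible classes."""
    return len(units(n))


print(units(4))                                   # [1, 3]
print([phi_by_counting(n) for n in range(2, 7)])  # [1, 2, 2, 4, 2]
print(phi_by_counting(7))                         # 6
```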
Theorem 2.15 (Euler's generalisation of Fermat's Little Theorem). Let $a$ be coprime to $n$. Then $a^{\varphi(n)} \equiv 1 \pmod{n}$.
In particular, if $p$ is a prime number and $p \nmid a$, then $a^{p-1} \equiv 1 \pmod{p}$ (Fermat's Little Theorem).
Proof. Since $\mathbb{Z}_n^*$ is a group, and $\varphi(n)$ is its order, by Corollary 1.32 we have $[a]^{\varphi(n)} = [1]$, i.e. $a^{\varphi(n)} \equiv 1 \pmod{n}$.

2.3. Linear congruences. Using the above Proposition we can solve linear congruences, i.e. equations of the type
\[ mx \equiv k \pmod{n}. \]
To this end we need to find the inverse of $m$ modulo $n$. This is only possible if $(n, m) = 1$. The inverse is found from the formula $1 = mp - nq$ with some $p, q$, so we need to express $(n, m)$ as a linear combination of $n$ and $m$.
Example 2.16. (1) The congruence $5x \equiv 17 \pmod{87}$ is resolved as follows:
\[ 87 = 5 \cdot 17 + 2, \quad 5 = 2 \cdot 2 + 1, \]
so
\[ 1 = 5 - 2 \cdot 2 = 5 - 2(87 - 5 \cdot 17) = 35 \cdot 5 - 2 \cdot 87 \equiv 35 \cdot 5 \pmod{87}. \]
This shows that 35 is the inverse of 5 mod 87, and hence
\[ x \equiv 35 \cdot 17 \equiv 595 \equiv 73 \pmod{87}. \]
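The same computation can be sketched in Python. The built-in `pow(m, -1, n)` returns the inverse of `m` modulo `n` (available from Python 3.8); the function name `solve_linear_congruence` is an illustrative assumption.

```python
from math import gcd


def solve_linear_congruence(m, k, n):
    """Least residue x with m*x congruent to k modulo n, assuming (m, n) = 1."""
    if gcd(m, n) != 1:
        raise ValueError("m must be coprime to n")
    m_inv = pow(m, -1, n)      # inverse of m modulo n (Python 3.8+)
    return (m_inv * k) % n


print(pow(5, -1, 87))                      # 35
print(solve_linear_congruence(5, 17, 87))  # 73
```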

(2)
2.4. Calculating the Euler function. If $p$ is prime, then we know that $\varphi(p) = p - 1$. More generally, if $n = p^a$ with some prime $p$, then a natural number $r$ is not coprime with $n$ iff $p \mid r$. Therefore
\[ \mathbb{Z}_{p^a}^* = \mathbb{Z}_{p^a} \setminus \{[0], [p], [2p], \dots, [p^a - p]\}. \]
This means that
\[ \varphi(p^a) = p^a - p^{a-1} = (p - 1)p^{a-1}. \]
To calculate $\varphi(n)$ if $n$ is not a prime power, we need the following notion:
Definition 2.17. A function $f : \mathbb{N} \to \mathbb{N}$ is called multiplicative if for any two coprime numbers $m, n$ we have $f(mn) = f(m)f(n)$.
We intend to show that the Euler function is multiplicative. For this we need the following intermediate result.
Lemma 2.18. Suppose that $(m, n) = 1$. Let $a_1, a_2, \dots, a_n$ be a complete system of residues modulo $n$, and let $b_1, b_2, \dots, b_m$ be a complete system of residues modulo $m$. Then the set
\[ a_k m + b_r n, \quad k = 1, 2, \dots, n; \quad r = 1, 2, \dots, m, \tag{2.3} \]
forms a complete system of residues modulo $nm$.
Proof. There are exactly $mn$ numbers of the form (2.3). It remains to check that the above set of numbers does not contain pairs congruent modulo $mn$. Assume that
\[ am + bn \equiv a'm + b'n \pmod{nm}. \]
This means that
\[ am \equiv a'm \pmod{n}, \quad bn \equiv b'n \pmod{m}. \]
Since $(m, n) = 1$, the number $m$ is invertible modulo $n$ and $n$ is invertible modulo $m$ (Proposition 2.13), so that
\[ a \equiv a' \pmod{n}, \quad b \equiv b' \pmod{m}. \]
Therefore the numbers (2.3) are all incongruent, as required.

Theorem 2.19. The Euler function is multiplicative.
Proof. We have already proved that the numbers (2.3) form a complete system of residues modulo $mn$. Let us count among them the numbers coprime to $nm$. This means that we count numbers such that
\[ (am + bn, nm) = 1, \]
i.e., by Corollary 2.7,
\[ (am + bn)p + nmq = 1, \]
with some $p, q$. Rewriting it as
\[ a(mp) + n(bp + mq) = 1, \]
we conclude that $(a, n) = 1$. Similarly, $(b, m) = 1$. There are exactly $\varphi(n)\varphi(m)$ pairs $a$ and $b$ satisfying the above relations. This leads to the required result.

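Now we can calculate $\varphi(n)$ by factorising $n$ as a product of primes:
\[ n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}. \]
The factors are pairwise coprime, so
\[ \varphi(n) = \varphi(p_1^{a_1}) \varphi(p_2^{a_2}) \cdots \varphi(p_s^{a_s}) = (p_1 - 1)p_1^{a_1 - 1} (p_2 - 1)p_2^{a_2 - 1} \cdots (p_s - 1)p_s^{a_s - 1}. \]
Example.
(1) $\varphi(15) = \varphi(3 \cdot 5) = 2 \cdot 4 = 8$.
(2) $\varphi(16) = \varphi(2^4) = 1 \cdot 2^3 = 8$.
(3) $\varphi(17) = 16$.
(4) $\varphi(18) = \varphi(3^2 \cdot 2) = 2 \cdot 3 \cdot 1 = 6$.
(5) $\varphi(20) = \varphi(2^2 \cdot 5) = 1 \cdot 2 \cdot 4 = 8$.
(6) $\varphi(200) = \varphi(2^3 \cdot 5^2) = 1 \cdot 2^2 \cdot 4 \cdot 5 = 80$.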
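The product formula translates into a short routine. The following Python sketch finds the prime powers by trial division and multiplies the factors $p^a - p^{a-1}$; the name `phi` is an illustrative choice.

```python
def phi(n):
    """Euler totient of n >= 1, computed from the prime factorisation of n."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            power = 1                       # will become p**a, the full power of p dividing n
            while n % p == 0:
                n //= p
                power *= p
            result *= power - power // p    # phi(p**a) = p**a - p**(a-1)
        p += 1
    if n > 1:
        result *= n - 1                     # a single remaining prime factor
    return result


print([phi(n) for n in (15, 16, 17, 18, 20, 200)])  # [8, 8, 16, 6, 8, 80]
```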
This recipe allows one to calculate large powers modulo $n$ relying on Theorem 2.15. More precisely, we are able to find $a^k \bmod n$ assuming that $(a, n) = 1$. For example, to find $7^{1234} \bmod 15$, find $\varphi(15) = 8$. Now, $1234 \equiv 2 \pmod{8}$, and hence, by Theorem 2.15,
\[ 7^{1234} \equiv 7^2 \equiv 49 \equiv 4 \pmod{15}. \]
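Example.
(1) Find $4^{42} \bmod 25$. First note: $(4, 25) = 1$ and $\varphi(25) = 20$, so $4^{42} \equiv 4^2 \equiv 16 \pmod{25}$.
(2) Find $7^{64} \bmod 120$. Note: $(7, 120) = 1$ and $\varphi(120) = \varphi(2^3 \cdot 3 \cdot 5) = 1 \cdot 2^2 \cdot 2 \cdot 4 = 32$, so $7^{64} \equiv 7^0 \equiv 1 \pmod{120}$.
Find $7^{62} \bmod 120$. Write $7^{62} \equiv 7^{64} \cdot 7^{-2} \equiv 7^{-2} \equiv 49^{-1} \pmod{120}$. In order to find $49^{-1} \bmod 120$ we run the Euclidean algorithm:
\[ 120 = 2 \cdot 49 + 22, \quad 49 = 2 \cdot 22 + 5, \quad 22 = 4 \cdot 5 + 2, \quad 5 = 2 \cdot 2 + 1. \]
Thus
\[ 1 = 5 - 2 \cdot 2 = 5 - 2 \cdot (22 - 4 \cdot 5) = 9 \cdot 5 - 2 \cdot 22 = 9 \cdot (49 - 2 \cdot 22) - 2 \cdot 22 = 9 \cdot 49 - 20 \cdot 22 \]
\[ = 9 \cdot 49 - 20 \cdot (120 - 2 \cdot 49) = 49 \cdot 49 - 20 \cdot 120 \equiv 49 \cdot 49 \pmod{120}. \]
Thus $49^{-1} \equiv 49 \pmod{120}$, and hence $7^{62} \equiv 49 \pmod{120}$.
(3) Find $16^{287} \bmod 765$. Euler's function:
\[ \varphi(765) = \varphi(5 \cdot 3^2 \cdot 17) = 4 \cdot 2 \cdot 3 \cdot 16 = 384. \]
Note also that $(16, 765) = 1$ and
\[ 16^{287} = 2^{4 \cdot 287} = 2^{1148} = 2^{3 \cdot 384 - 4} \equiv 2^{-4} \pmod{765}. \]
Now we need to find the inverse of $2^4 = 16$ modulo 765. Write
\[ 765 = 47 \cdot 16 + 13, \quad 16 = 1 \cdot 13 + 3, \quad 13 = 4 \cdot 3 + 1. \]
Thus
\[ 1 = 13 - 4 \cdot 3 = 13 - 4 \cdot (16 - 13) = 5 \cdot 13 - 4 \cdot 16 = 5 \cdot (765 - 47 \cdot 16) - 4 \cdot 16 = 5 \cdot 765 - 239 \cdot 16 \equiv -239 \cdot 16 \pmod{765}, \]
so that $16^{-1} \equiv -239 \equiv 526 \pmod{765}$.
Hence $16^{287} \equiv 526 \pmod{765}$.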
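These hand computations can be checked with Python's built-in three-argument `pow`, which computes powers modulo $n$. The sketch below also reduces the exponent modulo $\varphi(n)$ first, as Theorem 2.15 permits; the function name and the choice to pass $\varphi(n)$ as a parameter are illustrative assumptions.

```python
from math import gcd


def power_mod_via_euler(a, k, n, phi_n):
    """a**k modulo n, with the exponent first reduced modulo phi_n = phi(n).

    Valid only when (a, n) = 1, as required by Theorem 2.15.
    """
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    return pow(a, k % phi_n, n)


print(power_mod_via_euler(7, 1234, 15, 8))     # 4
print(power_mod_via_euler(4, 42, 25, 20))      # 16
print(power_mod_via_euler(7, 62, 120, 32))     # 49
print(power_mod_via_euler(16, 287, 765, 384))  # 526
```

In practice `pow(a, k, n)` alone is already fast for large exponents; the reduction of the exponent is shown only because it mirrors the method of the notes.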
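2.5. Higher congruences. Let us find out how to find solutions to congruences of the type
\[ x^a \equiv b \pmod{n}. \]
Theorem 2.20. Suppose that $(n, b) = 1$ and $(a, \varphi(n)) = 1$. Then the above congruence has a unique solution $x \equiv b^{a^{-1}} \pmod{n}$, where $a^{-1}$ is the inverse of $a$ modulo $\varphi(n)$ (!).
Proof. Let us show first that $x \equiv b^{a^{-1}} \pmod{n}$ is a solution. Writing $a a^{-1} = 1 + k\varphi(n)$, we have
\[ (b^{a^{-1}})^a \equiv b^{a a^{-1}} \equiv b^{1 + k\varphi(n)} \equiv b \cdot b^{k\varphi(n)} \equiv b \pmod{n}. \]
Conversely, since $(b, n) = 1$, any solution $x$ will be coprime to $n$ as well. Thus, writing $a a^{-1} \equiv 1 \pmod{\varphi(n)}$, we can calculate
\[ x^{a a^{-1}} = x^{1 + k\varphi(n)} = x \cdot x^{k\varphi(n)} \equiv x \pmod{n}. \]
On the other hand,
\[ x^{a a^{-1}} \equiv (x^a)^{a^{-1}} \equiv b^{a^{-1}} \pmod{n}. \]
Thus for any solution $x \equiv b^{a^{-1}} \pmod{n}$.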


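Example.
(1) Solve $x^3 \equiv 3 \pmod{20}$. Write: $20 = 2^2 \cdot 5$, so $\varphi(20) = 2 \cdot 4 = 8$. The number 3 is coprime to 8 and to 20. Thus the above theorem is applicable. Let us find first $3^{-1} \bmod 8$:
\[ 8 = 2 \cdot 3 + 2, \quad 3 = 2 + 1, \]
so that
\[ 1 = 3 - 2 = 3 - (8 - 2 \cdot 3) = 3 \cdot 3 - 8 \equiv 3 \cdot 3 \pmod{8}, \]
and hence $3^{-1} \equiv 3 \pmod{8}$. Thus
\[ x \equiv 3^3 \equiv 27 \equiv 7 \pmod{20}. \]
(2) Solve $x^{53} \equiv 3 \pmod{200}$. To find $\varphi(200)$ write $200 = 5^2 \cdot 2^3$, so that
\[ \varphi(200) = 4 \cdot 5 \cdot 1 \cdot 2^2 = 80. \]
Clearly, $(53, 80) = 1$, so that 53 is invertible modulo 80. Let us find the inverse:
\[ 80 = 1 \cdot 53 + 27, \quad 53 = 1 \cdot 27 + 26, \quad 27 = 26 + 1. \]
From here:
\[ 1 = 27 - 26 = 27 - (53 - 27) = 2 \cdot 27 - 53 = 2 \cdot (80 - 53) - 53 = 2 \cdot 80 - 3 \cdot 53 \equiv -3 \cdot 53 \pmod{80}, \]
so $53^{-1} \equiv -3 \pmod{80}$. By the above theorem
\[ x \equiv 3^{-3} \equiv 27^{-1} \pmod{200}. \]
Again, we need to find the inverse:
\[ 200 = 7 \cdot 27 + 11, \quad 27 = 2 \cdot 11 + 5, \quad 11 = 2 \cdot 5 + 1. \]
Thus
\[ 1 = 11 - 2 \cdot 5 = 11 - 2 \cdot (27 - 2 \cdot 11) = 5 \cdot 11 - 2 \cdot 27 = 5 \cdot (200 - 7 \cdot 27) - 2 \cdot 27 = 5 \cdot 200 - 37 \cdot 27 \equiv -37 \cdot 27 \pmod{200}. \]
Consequently, $x \equiv -37 \equiv 163 \pmod{200}$.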
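The recipe of Theorem 2.20 can be sketched in a few lines of Python. The function name is illustrative, the value $\varphi(n)$ is passed in as a parameter rather than computed, and `pow(a, -1, phi_n)` (Python 3.8+) supplies the inverse of $a$ modulo $\varphi(n)$.

```python
from math import gcd


def solve_power_congruence(a, b, n, phi_n):
    """Least residue x with x**a congruent to b modulo n.

    Assumes (b, n) = 1 and (a, phi_n) = 1, where phi_n = phi(n).
    """
    if gcd(b, n) != 1 or gcd(a, phi_n) != 1:
        raise ValueError("the assumptions of Theorem 2.20 are not satisfied")
    a_inv = pow(a, -1, phi_n)    # inverse of a modulo phi(n), not modulo n
    return pow(b, a_inv, n)


print(solve_power_congruence(3, 3, 20, 8))     # 7
print(solve_power_congruence(53, 3, 200, 80))  # 163
```

Here the inverse of 53 modulo 80 is returned as the least residue 77, which represents the same class as $-3$.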
2.6. Public key cryptography. Here is a way of encoding and decoding messages. Choose two distinct primes $p, q$. Choose a number $a$ coprime to $\varphi(pq)$ and compute the inverse $b$ of $a$ modulo $\varphi(pq)$. Then the pair $\{pq, a\}$ is the public key and $\{pq, b\}$ is the private key. The messages are encoded using the public key in the following way. The message is a number $M$ modulo $pq$. The encoded message is $N \equiv M^a \pmod{pq}$, which is easily found using the algorithms developed in the lectures, using the public key. The message is recovered by calculating $M \equiv N^b \pmod{pq}$, if one knows $b$. However, finding $b$ is a serious problem if only the public key is given, since one needs to determine $\varphi(pq)$, which is done by finding $p$ and $q$ first. As there is no known efficient recipe for finding the factors $p, q$ of the product $pq$ when the primes are large, this procedure is virtually impossible to implement.
Example. Let $p = 3$, $q = 11$. Then $\varphi(pq) = (p - 1)(q - 1) = 20$. Take $\{33, 7\}$ as the public key. The inverse of 7 modulo 20 is 3, so the private key is $\{33, 3\}$. Suppose that we are sending the message which is number 2. Then the coded message is $2^7 = 128 \equiv -4 \pmod{33}$. To decode the message calculate $(-4)^3 = -64 \equiv 2 \pmod{33}$.
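The scheme can be tried on the toy numbers above. The following Python sketch is for illustration only: the function names are assumptions, and primes this small offer no security whatsoever.

```python
def make_keys(p, q, a):
    """Return (public_key, private_key) for distinct primes p, q and exponent a."""
    n = p * q
    phi_n = (p - 1) * (q - 1)    # phi(pq) for distinct primes p and q
    b = pow(a, -1, phi_n)        # inverse of a modulo phi(pq); fails if a is not coprime to it
    return (n, a), (n, b)


def encode(message, public_key):
    n, a = public_key
    return pow(message, a, n)


def decode(coded, private_key):
    n, b = private_key
    return pow(coded, b, n)


public, private = make_keys(3, 11, 7)
print(public, private)     # (33, 7) (33, 3)
coded = encode(2, public)
print(coded)               # 29
print(decode(coded, private))  # 2
```

The coded message is printed as the least residue 29, which is the same class as $-4$ modulo 33.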

CHAPTER 3

Linear Algebra
1. Basic definitions and concepts
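1.1. Addition and multiplication. A real matrix $A$ of size $n \times m$ is an $n \times m$ array of real numbers:
\[ A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1m} \\ a_{21} & a_{22} & \dots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nm} \end{pmatrix}. \]
Examples:
\[ A = \begin{pmatrix} 1 & 7 & 4 \\ 2 & 11 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 6 \\ 2 & 3 \end{pmatrix}. \]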
We denote the set of all matrices of size $n \times m$ by $M(n, m)$. Let us define addition and multiplication on the set of matrices. Addition is defined for matrices of the same size:
Definition 3.1 (Addition). Let $A, B$ be two $n \times m$ matrices with entries $a_{jk}$ and $b_{jk}$ respectively. Then the matrix $C = A + B$ is defined as having the entries $c_{jk} = a_{jk} + b_{jk}$:
\[ \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix} + \begin{pmatrix} b_{11} & \dots & b_{1m} \\ \vdots & \ddots & \vdots \\ b_{n1} & \dots & b_{nm} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & \dots & a_{1m} + b_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} + b_{n1} & \dots & a_{nm} + b_{nm} \end{pmatrix}. \]

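Example:
\[ \begin{pmatrix} 1 & 7 & 4 \\ 2 & 11 & 10 \end{pmatrix} + \begin{pmatrix} 0 & -2 & 5 \\ 9 & 3 & -3 \end{pmatrix} = \begin{pmatrix} 1 & 5 & 9 \\ 11 & 14 & 7 \end{pmatrix}. \]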

Definition 3.2 (Multiplication). Let $A$ be an $n \times m$ matrix and let $B$ be an $m \times k$ matrix. Then the product $C = AB$ is defined as the $n \times k$ matrix with entries
\[ c_{jl} = \sum_{s=1}^{m} a_{js} b_{sl}. \]
Note that the number of columns of matrix $A$ coincides with the number of rows in matrix $B$.
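Examples:
\[ \begin{pmatrix} 1 & 3 & 0 \\ 2 & 0 & 6 \end{pmatrix} \begin{pmatrix} 0 & 5 \\ 1 & 2 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 11 \\ 12 & 10 \end{pmatrix}, \]
\[ \begin{pmatrix} 1 & 3 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 5 & 6 \end{pmatrix} = \begin{pmatrix} 18 & 16 \\ 14 & 28 \end{pmatrix}, \]
\[ \begin{pmatrix} 2 & 0 & 3 \\ 4 & 1 & -5 \end{pmatrix} \begin{pmatrix} 7 & 1 & 4 & 7 \\ 2 & -5 & 0 & -4 \\ 3 & 1 & -2 & -3 \end{pmatrix} = \begin{pmatrix} 23 & 5 & 2 & 5 \\ 15 & -6 & 26 & 39 \end{pmatrix}. \]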
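The defining formula $c_{jl} = \sum_s a_{js} b_{sl}$ can be written out directly. A minimal Python sketch with matrices stored as lists of rows (the name `mat_mul` is an illustrative choice; indices in the code start from 0 rather than 1):

```python
def mat_mul(A, B):
    """Product of an n x m matrix A and an m x k matrix B, both given as lists of rows."""
    n, m, k = len(A), len(B), len(B[0])
    if any(len(row) != m for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    # Entry (j, l) of the product is the sum over s of A[j][s] * B[s][l].
    return [[sum(A[j][s] * B[s][l] for s in range(m)) for l in range(k)]
            for j in range(n)]


print(mat_mul([[1, 3, 0], [2, 0, 6]], [[0, 5], [1, 2], [2, 0]]))  # [[3, 11], [12, 10]]
# The order of the factors matters:
print(mat_mul([[1, 1], [0, 1]], [[1, 0], [1, 1]]))  # [[2, 1], [1, 1]]
print(mat_mul([[1, 0], [1, 1]], [[1, 1], [0, 1]]))  # [[1, 1], [1, 2]]
```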

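If we multiply two $n \times n$ matrices, we get another $n \times n$ matrix! If the matrices $A$ and $B$ have the right sizes to form the product $AB$, it is not always possible to define the product $BA$. Moreover, even if it is possible, it is not always true that $AB = BA$. Counterexample:
\[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}. \]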

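Define the $n \times n$ identity matrix:
\[ I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}, \]
or, in other words, $a_{jl} = \delta_{jl}$, where $\delta_{jl}$ is the so-called Kronecker symbol:
\[ \delta_{jl} = \begin{cases} 1, & j = l, \\ 0, & j \ne l. \end{cases} \]
Define also multiplication by a real number (scalar):
\[ \lambda A = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} & \dots & \lambda a_{1n} \\ \lambda a_{21} & \lambda a_{22} & \dots & \lambda a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda a_{n1} & \lambda a_{n2} & \dots & \lambda a_{nn} \end{pmatrix}, \quad \lambda \in \mathbb{R}. \]
In other words,
\[ \lambda A = \begin{pmatrix} \lambda & 0 & \dots & 0 \\ 0 & \lambda & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}, \quad \lambda \in \mathbb{R}. \]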

This follows from the following lemma:
Lemma 3.3. Let $A \in M(n, n)$. Then $A I_n = I_n A = A$.
Proof. Let $B = A I_n$. By definition of the identity matrix, we have
\[ b_{jl} = \sum_{s=1}^{n} a_{js} \delta_{sl}. \]
By definition of the Kronecker symbol, only one term is distinct from zero, namely the one with $s = l$, for which $\delta_{ll} = 1$, so that $b_{jl} = a_{jl}$. Similarly for $I_n A$.

Let us establish some useful properties of addition and multiplication:
Proposition 3.4. Let $A \in M(n, m)$, $B, C \in M(m, p)$, $D \in M(p, q)$. Then
(1) $(AB)D = A(BD)$, i.e. multiplication is associative,
(2) $A(B + C) = AB + AC$, $(B + C)D = BD + CD$, i.e. multiplication is distributive.
Proof.
(1) By definition, the entry $t_{jl}$ of the matrix $T = (AB)D$ is
\[ t_{jl} = \sum_{k=1}^{p} \sum_{s=1}^{m} a_{js} b_{sk} d_{kl} = \sum_{s=1}^{m} a_{js} \sum_{k=1}^{p} b_{sk} d_{kl}, \]
which coincides with the corresponding entry of the matrix $A(BD)$.
(2) Let $S = A(B + C)$ and $E = AB + AC$, so that
\[ s_{jl} = \sum_{s=1}^{m} a_{js}(b_{sl} + c_{sl}) = \sum_{s=1}^{m} a_{js} b_{sl} + \sum_{s=1}^{m} a_{js} c_{sl} = e_{jl}. \]
The second identity is proved in the same way.



1.2. Structure of the set of matrices. Now we can investigate the structure of the set $M(n, n)$. Is it a group for the addition (or multiplication) operation? The pair $(M(n, n), +)$ is certainly a group. Indeed, define the zero matrix $O$ by $o_{jl} = 0$. Then it will be the identity element for the group. Every element $A \in M(n, n)$ has an inverse: the matrix $-A$ with entries $-a_{jl}$. Notice that $OA = O$ for any $A$.
Consider now the set of matrices $M = M(n, n)$ with multiplication. For a matrix $A$ define the multiplicative inverse as a matrix $B \in M$ such that $AB = BA = I_n$. Notation: $B = A^{-1}$. This definition shows that $B$ is also invertible and $B^{-1} = A$.
A natural question is whether all matrices $A \in M$ have inverses (or are invertible). In other words, is the pair $(M, \cdot)$ a group? It is immediately clear that it is not, since some matrices do not have inverses, for instance, the zero matrix. Examples of invertible matrices: $I_n$, the diagonal matrices
\[ A = \begin{pmatrix} a_{11} & 0 & \dots & 0 \\ 0 & a_{22} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a_{nn} \end{pmatrix} \]
with non-zero diagonal entries. Indeed, the matrix
\[ B = \begin{pmatrix} a_{11}^{-1} & 0 & \dots & 0 \\ 0 & a_{22}^{-1} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a_{nn}^{-1} \end{pmatrix} \]
is the inverse to $A$.
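Example.
(1)
\[ A = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 7 \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} 1/5 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/7 \end{pmatrix}. \]
(2)
\[ A = \begin{pmatrix} 2 & -4 \\ 1 & 3 \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} 3/10 & 4/10 \\ -1/10 & 2/10 \end{pmatrix}. \]
(3) An arbitrary matrix $T \in M(2, 2)$ is invertible if $t_{11} t_{22} - t_{12} t_{21} \ne 0$ and
\[ T^{-1} = \frac{1}{t_{11} t_{22} - t_{12} t_{21}} \begin{pmatrix} t_{22} & -t_{12} \\ -t_{21} & t_{11} \end{pmatrix}. \]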

Let $G = GL(n, \mathbb{R}) \subset M(n, n)$ be the subset of all invertible matrices. Claim: $G$ is a group. Let us check first that multiplication defines a binary operation on $G$, that is for any two matrices $A, B \in G$ we also have $AB \in G$:
Proposition 3.5. If $A, B \in GL(n, \mathbb{R})$, then $AB \in GL(n, \mathbb{R})$ and $(AB)^{-1} = B^{-1} A^{-1}$.
Proof. Denote $T = AB$ and $S = B^{-1} A^{-1}$. We want to show that $S = T^{-1}$. According to the definition, for this we need to check that $ST = TS = I_n$. Indeed:
\[ ST = B^{-1} A^{-1} A B = B^{-1} I_n B = B^{-1} B = I_n, \]
\[ TS = A B B^{-1} A^{-1} = A I_n A^{-1} = A A^{-1} = I_n. \]

We also know the following:
(1) Multiplication is associative;
(2) The element In is the identity;
(3) Every element has an inverse.
Thus $GL(n, \mathbb{R})$ is a group. This group is called the general linear group of $n \times n$ matrices.

2. Matrices and linear maps


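2.1. Vectors, Linear Maps. Denote by $\mathbb{R}^n$ the product $\underbrace{\mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R}}_{n}$. Instead of writing the elements of $\mathbb{R}^n$ as $(x_1, x_2, \dots, x_n)$ we write them as columns
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \]
The elements of $\mathbb{R}^n$ are called vectors. One can view them as elements of $M(n, 1)$, that is matrices with one column and $n$ rows. Therefore they have natural operations defined: addition and multiplication by $m \times n$ matrices:
\[ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix}, \]
\[ \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n \end{pmatrix}. \]
In particular, we can multiply the vectors by scalars:
\[ \lambda \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \lambda x_1 \\ \lambda x_2 \\ \vdots \\ \lambda x_n \end{pmatrix}, \quad \lambda \in \mathbb{R}. \]
Define the vectors
\[ e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}. \]
Every vector $x$ can be written as a linear combination
\[ x = x_1 e_1 + x_2 e_2 + \dots + x_n e_n. \]
Among the functions (maps) defined on the set $\mathbb{R}^n$ we single out the set of linear maps:
Definition 3.6. A function $T : \mathbb{R}^n \to \mathbb{R}^m$ is called a linear map if it satisfies the properties
\[ T(x + y) = T(x) + T(y), \quad T(\lambda x) = \lambda T(x), \]
for all $x, y \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$.
A remarkable fact is that every linear map can be represented by a matrix:
Theorem 3.7. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map. Then $T(x) = Ax$ for all $x \in \mathbb{R}^n$, where $A$ is the matrix from $M(m, n)$ whose columns are the vectors $T(e_1), T(e_2), \dots, T(e_n)$:
\[ A = (T(e_1), T(e_2), \dots, T(e_n)). \]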
Proof. Write
\[ T(x) = T\Big(\sum_{j=1}^{n} x_j e_j\Big) = \sum_{j=1}^{n} x_j T(e_j) = (T(e_1), T(e_2), \dots, T(e_n)) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \]

3. Linear systems
The concept of matrices is useful when studying the following system of linear equations:
\[ \begin{cases} a_{11} x_1 + \dots + a_{1m} x_m = b_1, \\ a_{21} x_1 + \dots + a_{2m} x_m = b_2, \\ \dots \\ a_{n1} x_1 + \dots + a_{nm} x_m = b_n, \end{cases} \]
where $b_1, b_2, \dots, b_n$ are fixed real numbers, and $x_1, x_2, \dots, x_m$ are unknown. They are sometimes called simultaneous equations. It is convenient to rewrite this system using the matrix notation:
\[ A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1m} \\ a_{21} & a_{22} & \dots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nm} \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}, \]
so that
\[ Ax = b. \]
To solve the system define the augmented matrix:
\[ \tilde{A} = \left(\begin{array}{cccc|c} a_{11} & a_{12} & \dots & a_{1m} & b_1 \\ a_{21} & a_{22} & \dots & a_{2m} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nm} & b_n \end{array}\right). \]
Now we'll perform a number of transformations which preserve the solutions. We will
• Swap two equations,
• Multiply an equation by a non-zero number,
• Add a multiple of one equation to another.
These operations correspond to the following ones on the augmented matrix:
• Swap two rows,
• Multiply a row by a non-zero number,
• Add a multiple of one row to another.
These are called elementary row operations.
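Example.
(1) Consider the system
\[ \begin{cases} x + y = 3, \\ 2x + 3y = 7, \end{cases} \qquad \tilde{A} = \left(\begin{array}{rr|r} 1 & 1 & 3 \\ 2 & 3 & 7 \end{array}\right). \]
Thus
\[ \left(\begin{array}{rr|r} 1 & 1 & 3 \\ 2 & 3 & 7 \end{array}\right) \xrightarrow{R2 - 2R1} \left(\begin{array}{rr|r} 1 & 1 & 3 \\ 0 & 1 & 1 \end{array}\right) \xrightarrow{R1 - R2} \left(\begin{array}{rr|r} 1 & 0 & 2 \\ 0 & 1 & 1 \end{array}\right). \]
Consequently, $x = 2$, $y = 1$.
(2) Consider the system
\[ \begin{cases} 2x_1 + 4x_2 + 6x_3 = 18, \\ 4x_1 + 5x_2 + 6x_3 = 24, \\ 3x_1 + x_2 - 2x_3 = 4, \end{cases} \qquad \tilde{A} = \left(\begin{array}{rrr|r} 2 & 4 & 6 & 18 \\ 4 & 5 & 6 & 24 \\ 3 & 1 & -2 & 4 \end{array}\right). \]
Thus
\[ \left(\begin{array}{rrr|r} 2 & 4 & 6 & 18 \\ 4 & 5 & 6 & 24 \\ 3 & 1 & -2 & 4 \end{array}\right) \xrightarrow{R1/2} \left(\begin{array}{rrr|r} 1 & 2 & 3 & 9 \\ 4 & 5 & 6 & 24 \\ 3 & 1 & -2 & 4 \end{array}\right) \xrightarrow[R3 - 3R1]{R2 - 4R1} \left(\begin{array}{rrr|r} 1 & 2 & 3 & 9 \\ 0 & -3 & -6 & -12 \\ 0 & -5 & -11 & -23 \end{array}\right) \]
\[ \xrightarrow{-R2/3} \left(\begin{array}{rrr|r} 1 & 2 & 3 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & -5 & -11 & -23 \end{array}\right) \xrightarrow[R3 + 5R2]{R1 - 2R2} \left(\begin{array}{rrr|r} 1 & 0 & -1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & -1 & -3 \end{array}\right) \xrightarrow{-R3} \left(\begin{array}{rrr|r} 1 & 0 & -1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 1 & 3 \end{array}\right) \]
\[ \xrightarrow[R2 - 2R3]{R1 + R3} \left(\begin{array}{rrr|r} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 3 \end{array}\right). \]
Consequently, $x_1 = 4$, $x_2 = -2$, $x_3 = 3$.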
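(3) An example with infinitely many solutions. Consider the system
\[ \begin{cases} 2x_1 + 4x_2 + 6x_3 = 18, \\ 4x_1 + 5x_2 + 6x_3 = 24, \\ 2x_1 + 7x_2 + 12x_3 = 30, \end{cases} \qquad \tilde{A} = \left(\begin{array}{rrr|r} 2 & 4 & 6 & 18 \\ 4 & 5 & 6 & 24 \\ 2 & 7 & 12 & 30 \end{array}\right). \]
Thus, after dividing the first row by 2,
\[ \left(\begin{array}{rrr|r} 1 & 2 & 3 & 9 \\ 4 & 5 & 6 & 24 \\ 2 & 7 & 12 & 30 \end{array}\right) \xrightarrow[R3 - 2R1]{R2 - 4R1} \left(\begin{array}{rrr|r} 1 & 2 & 3 & 9 \\ 0 & -3 & -6 & -12 \\ 0 & 3 & 6 & 12 \end{array}\right) \xrightarrow{-R2/3,\ R3/3} \left(\begin{array}{rrr|r} 1 & 2 & 3 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & 2 & 4 \end{array}\right) \]
\[ \xrightarrow[R3 - R2]{R1 - 2R2} \left(\begin{array}{rrr|r} 1 & 0 & -1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 0 & 0 \end{array}\right), \]
so $x_1 - x_3 = 1$, $x_2 + 2x_3 = 4$, and hence the solution is $(1 + x_3, 4 - 2x_3, x_3)$. This vector is a solution for any $x_3 \in \mathbb{R}$. Thus it is called the general solution.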
To make it formal:
Definition 3.8. An entry $a_{kl}$ is said to be a leading entry if it is the first non-zero entry in the row $k$, i.e. $a_{kj} = 0$ for all $j < l$ and $a_{kl} \ne 0$.
We say that a matrix $A$ is in row echelon form if for every leading entry $a_{kl}$ we have:
(1) $a_{kl} = 1$,
(2) $a_{kj} = 0$, $j < l$,
(3) $a_{il} = 0$, $i > k$.
In other words, every leading entry equals one and has zeros on the left and underneath.
Theorem 3.9. Every matrix can be reduced to the row echelon form by elementary row operations.
Without proof.
The process of reduction to this form is called Gaussian elimination.
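The reduction can be automated. The following Python sketch uses exact fractions and, as in the worked examples, also clears the entries above each leading one; the name `row_reduce` and the choice of the first available non-zero entry as pivot are illustrative assumptions, not a prescription from the notes.

```python
from fractions import Fraction


def row_reduce(rows):
    """Reduce a matrix (list of rows) by elementary row operations.

    Every leading entry of the result equals 1 and is the only non-zero
    entry in its column.
    """
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]       # swap two rows
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]       # scale the row to get a leading 1
        for r in range(n_rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                # subtract a multiple of the pivot row from row r
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M


reduced = row_reduce([[2, 4, 6, 18], [4, 5, 6, 24], [3, 1, -2, 4]])
print([[int(x) for x in row] for row in reduced])  # [[1, 0, 0, 4], [0, 1, 0, -2], [0, 0, 1, 3]]
```

The last column of the reduced augmented matrix is the solution $x_1 = 4$, $x_2 = -2$, $x_3 = 3$ of the system $2x_1 + 4x_2 + 6x_3 = 18$, $4x_1 + 5x_2 + 6x_3 = 24$, $3x_1 + x_2 - 2x_3 = 4$.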
This method is also useful if one needs to know what kind of map a matrix defines. More precisely, how to determine if the linear map defined by a matrix $A$ is injective?
Lemma 3.10. The linear map $A : \mathbb{R}^m \to \mathbb{R}^n$ is injective if the equation $Ax = 0$ has only one solution: $x = 0$.
Proof. Suppose that $Ax_1 = b$, $Ax_2 = b$ with some $b, x_1, x_2$. Then due to the linearity
\[ A(x_1 - x_2) = 0. \]
If the equation $Ax = 0$ has a unique solution $x = 0$, then $x_1 = x_2$, so the map is injective.


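Example. Is the map defined by the matrix
\[ A = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \]
from the previous example injective? The answer is YES, as the equation $Ax = 0$ has only one solution $x = 0$.
Example. Find a connection between the parameters $a$, $b$ and $c$, which ensures that the system
\[ \begin{cases} x + 2y - 3z = a, \\ 3x - y + 2z = b, \\ x - 5y + 8z = c \end{cases} \]
has:
(1) a unique solution.
(2) infinitely many solutions.
(3) no solution.
Let us reduce the augmented matrix:
\[ \left(\begin{array}{rrr|c} 1 & 2 & -3 & a \\ 3 & -1 & 2 & b \\ 1 & -5 & 8 & c \end{array}\right) \xrightarrow[R3 := R1 - R3]{R2 := 3R1 - R2} \left(\begin{array}{rrr|c} 1 & 2 & -3 & a \\ 0 & 7 & -11 & 3a - b \\ 0 & 7 & -11 & a - c \end{array}\right) \xrightarrow{R3 := R3 - R2} \left(\begin{array}{rrr|c} 1 & 2 & -3 & a \\ 0 & 7 & -11 & 3a - b \\ 0 & 0 & 0 & -2a + b - c \end{array}\right). \]
If $-2a + b - c = 0$, then the system has infinitely many solutions. If $-2a + b - c \ne 0$, there are no solutions. The system never has a unique solution.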
4. Inverting square matrices
4.1. Inverting matrices. Analysing again what we did with the linear systems, we realise that all we did was to invert the matrix $A$. Indeed, to find $x$ from the equation $Ax = b$ we can simply write $x = A^{-1} b$. On the other hand, after reducing the augmented matrix to the row echelon form we quickly arrive at the augmented matrix of the form $(I_n \mid c)$, so that $c = A^{-1} b$. This suggests that we can find the matrix inverse by applying the elementary row operations to the augmented $n \times 2n$ matrix of the form $(A \mid I_n)$ until we get the matrix of the form $(I_n \mid B)$. Then $B = A^{-1}$.

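Example.
(1) Let
\[ A = \begin{pmatrix} 2 & -3 \\ -4 & 5 \end{pmatrix}. \]
Write
\[ \left(\begin{array}{rr|rr} 2 & -3 & 1 & 0 \\ -4 & 5 & 0 & 1 \end{array}\right) \xrightarrow{R2 + 2R1} \left(\begin{array}{rr|rr} 2 & -3 & 1 & 0 \\ 0 & -1 & 2 & 1 \end{array}\right) \xrightarrow{R1 - 3R2} \left(\begin{array}{rr|rr} 2 & 0 & -5 & -3 \\ 0 & -1 & 2 & 1 \end{array}\right) \xrightarrow{R1/2,\ -R2} \left(\begin{array}{rr|rr} 1 & 0 & -5/2 & -3/2 \\ 0 & 1 & -2 & -1 \end{array}\right), \]
so that
\[ A^{-1} = \begin{pmatrix} -5/2 & -3/2 \\ -2 & -1 \end{pmatrix}. \]
(2) Here is a matrix which is not invertible:
\[ A = \begin{pmatrix} 1 & -2 \\ -2 & 4 \end{pmatrix}. \]
Indeed,
\[ \left(\begin{array}{rr|rr} 1 & -2 & 1 & 0 \\ -2 & 4 & 0 & 1 \end{array}\right) \xrightarrow{R2 + 2R1} \left(\begin{array}{rr|rr} 1 & -2 & 1 & 0 \\ 0 & 0 & 2 & 1 \end{array}\right), \]
and the left block, having a row of zeros, cannot be reduced to $I_2$.
(3) Let
\[ A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{pmatrix}. \]
Write
\[ \left(\begin{array}{rrr|rrr} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 5 & 5 & 1 & 0 & 0 & 1 \end{array}\right) \xrightarrow{R3 - 5R1} \left(\begin{array}{rrr|rrr} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \xrightarrow[4R2 + 3R3]{4R1 + R3} \left(\begin{array}{rrr|rrr} 4 & 4 & 0 & -1 & 0 & 1 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \]
\[ \xrightarrow{R1 - R2/2} \left(\begin{array}{rrr|rrr} 4 & 0 & 0 & 13/2 & -2 & -1/2 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \xrightarrow{R1/4,\ R2/8,\ -R3/4} \left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 13/8 & -1/2 & -1/8 \\ 0 & 1 & 0 & -15/8 & 1/2 & 3/8 \\ 0 & 0 & 1 & 5/4 & 0 & -1/4 \end{array}\right). \]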
5. Determinants
5.1. Definition.
Definition 3.11. The determinant of an $n \times n$ matrix $A$ is defined to be
\[ \det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}, \]
where the summation is taken over all permutations of degree $n$, and $\operatorname{sgn}(\sigma)$ is the signature (sign) of the permutation.
Equivalent definition:
\[ \det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(n)n}. \]
Let us note a few simple properties of this number.
(1) If one swaps two rows (columns), $\det A$ gets multiplied by $-1$;
(2) If a row (column) is multiplied by $\lambda \in \mathbb{R}$, then $\det A$ is multiplied by $\lambda$;
(3) The determinant does not change if a multiple of one row (column) is added to another one.
Definition 3.12. The transpose of the matrix $A \in M(n, n)$ is another matrix $A^T$, with the entries $a^T_{jk} = a_{kj}$.
Another property of $\det$ is that
\[ \det A = \det A^T. \]

Computation of the determinant is not easy if the matrix is large. There are a few tricks which can simplify the task.
Observe first of all that for any $2 \times 2$ matrix we have
\[ \det A = a_{11} a_{22} - a_{12} a_{21}. \]
If the matrix has a special form, we sometimes can do it easily.
Definition 3.13. We call a matrix $T$ upper (lower) triangular if $t_{jk} = 0$ for all $j > k$ ($j < k$).
If $A$ is lower or upper triangular, then the determinant is easily found. For instance, assume that
\[ A = \begin{pmatrix} \lambda_1 & * & \dots & * \\ 0 & \lambda_2 & \dots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix} \]
with some $\lambda_1, \lambda_2, \dots, \lambda_n$ on the diagonal. Then
\[ \det A = \lambda_1 \lambda_2 \cdots \lambda_n. \]
In particular, this applies to diagonal matrices.
Using the above facts we conclude that one way of finding the determinant is to reduce the matrix to the triangular form using the elementary row (or column) operations.
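Example.
(1) Let
\[ A = \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 6 & 3 & -4 \end{pmatrix}. \]
Then
\[ \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 6 & 3 & -4 \end{pmatrix} \xrightarrow{R3 - 3R1} \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 0 & 6 & -16 \end{pmatrix} \xrightarrow{R3 - 6R2} \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 0 & 0 & -46 \end{pmatrix}. \]
According to our previous observations, the determinant does not change under these operations, so that $\det A = 2 \cdot 1 \cdot (-46) = -92$.
(2) Let
\[ A = \begin{pmatrix} 1 & 3 & 2 \\ 4 & 5 & 1 \\ 2 & 4 & 3 \end{pmatrix}. \]
Then
\[ \det \begin{pmatrix} 1 & 3 & 2 \\ 4 & 5 & 1 \\ 2 & 4 & 3 \end{pmatrix} = \det \begin{pmatrix} 1 & 3 & 2 \\ 0 & -7 & -7 \\ 0 & -2 & -1 \end{pmatrix} = -7 \det \begin{pmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & -2 & -1 \end{pmatrix} = -7 \det \begin{pmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = -7. \]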
Comparing this procedure with the reduction to the row echelon form, we deduce the following result:
Theorem 3.14. The matrix $A$ is invertible iff $\det A \ne 0$.
How to find the determinant of the inverse matrix? One uses the following important property:
Proposition 3.15. For any matrices $A, B \in M(n, n)$ we have
\[ \det(AB) = \det A \det B. \]
Without proof.
If $A$ is invertible, then
\[ \det A^{-1} = \frac{1}{\det A}. \]
This immediately follows from the above Proposition in view of the identities $A A^{-1} = I_n$ and $\det I_n = 1$.

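5.2. Minors and co-factors. Let $A \in M(n, n)$ and let $M_{jk} \in M(n - 1, n - 1)$ be the matrix obtained from $A$ by deleting the $j$th row and $k$th column.
Definition 3.16. The matrix $M_{jk}$ is called the $jk$th minor of the matrix $A$.
The number
\[ A_{jk} = (-1)^{j+k} \det M_{jk} \]
is called the $jk$th co-factor of $A$.
Example.
(1)
\[ A = \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 6 & 3 & -4 \end{pmatrix}. \]
Then
\[ M_{13} = \begin{pmatrix} 0 & 1 \\ 6 & 3 \end{pmatrix}, \quad M_{11} = \begin{pmatrix} 1 & 5 \\ 3 & -4 \end{pmatrix}, \quad M_{32} = \begin{pmatrix} 2 & 4 \\ 0 & 5 \end{pmatrix}. \]
The co-factors are
\[ A_{13} = -6, \quad A_{11} = -19, \quad A_{32} = -10. \]
(2) For a $2 \times 2$ matrix
\[ A_{11} = a_{22}, \quad A_{12} = -a_{21}, \quad A_{21} = -a_{12}, \quad A_{22} = a_{11}. \]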
These notions are very useful for calculating the determinant of $A$:
Proposition 3.17. Let $A \in M(n, n)$. Then for any $j, l = 1, 2, \dots, n$:
\[ \det A = \sum_{k=1}^{n} a_{jk} A_{jk} = \sum_{k=1}^{n} a_{kl} A_{kl}. \]
These formulae are called the expansion of the determinant in the $j$th row and the $l$th column respectively.
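Example. Let
\[ A = \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 6 & 3 & -4 \end{pmatrix}. \]
Let us expand the determinant in the first row:
\[ \det A = 2A_{11} - 1 \cdot A_{12} + 4A_{13}, \]
with
\[ A_{11} = \det \begin{pmatrix} 1 & 5 \\ 3 & -4 \end{pmatrix} = -19, \quad A_{12} = -\det \begin{pmatrix} 0 & 5 \\ 6 & -4 \end{pmatrix} = 30, \quad A_{13} = \det \begin{pmatrix} 0 & 1 \\ 6 & 3 \end{pmatrix} = -6, \]
so that
\[ \det A = 2 \cdot (-19) - 1 \cdot 30 + 4 \cdot (-6) = -92. \]
Alternatively, we may notice that the expansion in the second row will have only two co-factors:
\[ \det A = A_{22} + 5A_{23}, \]
with
\[ A_{22} = \det \begin{pmatrix} 2 & 4 \\ 6 & -4 \end{pmatrix} = -32, \quad A_{23} = -\det \begin{pmatrix} 2 & -1 \\ 6 & 3 \end{pmatrix} = -12, \]
so that
\[ \det A = -32 + 5 \cdot (-12) = -92. \]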
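The expansion in the first row gives a direct recursive way of computing determinants. The Python sketch below is meant only as an illustration of Proposition 3.17 (the function names are assumptions, and for large matrices the reduction to triangular form is far more efficient); the two test matrices are illustrative and not taken from the examples above.

```python
def minor(A, j, k):
    """Matrix obtained from A by deleting row j and column k (indices start from 0)."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(A) if i != j]


def det(A):
    """Determinant of a square matrix (list of rows) by expansion in the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # With 0-based column index k the sign (-1)**k equals (-1)**(1 + (k + 1)),
    # the sign of the co-factor in row 1 and column k + 1.
    return sum((-1) ** k * A[0][k] * det(minor(A, 0, k)) for k in range(n))


print(det([[2, 1], [1, 3]]))                   # 5
print(det([[2, 1, 4], [0, 3, 5], [0, 0, 7]]))  # 42
```

The second matrix is upper triangular, and the result agrees with the product of its diagonal entries.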

A general observation: if a matrix has a column or a row of zeros, then the determinant equals zero.
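5.3. The adjoint matrix and another way of finding the inverse.
Definition 3.18. The adjoint matrix for a given $A \in M(n, n)$ is the matrix $\operatorname{ad}(A) \in M(n, n)$ of the form
\[ \operatorname{ad}(A) = \begin{pmatrix} A_{11} & A_{21} & \dots & A_{n1} \\ A_{12} & A_{22} & \dots & A_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ A_{1n} & A_{2n} & \dots & A_{nn} \end{pmatrix}. \]
In other words, the $jk$th entry of $\operatorname{ad}(A)$ is $A_{kj}$, i.e. it is the transpose of the matrix built of the co-factors $A_{jk}$.
Example. $n = 2$:
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad \operatorname{ad}(A) = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}. \]
The inverse of a matrix $A$ with $\det A \ne 0$ is found as
\[ A^{-1} = \frac{1}{\det A} \operatorname{ad}(A). \]
In the two-dimensional case it gives
\[ A^{-1} = \frac{1}{a_{11} a_{22} - a_{12} a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}. \]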

6. Eigenvalues and eigenvectors


Our aim now is to introduce the method called diagonalisation. It
consists in finding for a matrix $A$ an invertible matrix $M$ such
that $A = M D M^{-1}$ with a diagonal matrix $D$, that is, $d_{jk} = 0$ if $j \neq k$.
Note that not all matrices can be represented in this form. This
representation is convenient, since it allows one to calculate easily the
powers of matrices. For instance, if
\[
A = \begin{pmatrix}
\lambda_1 & 0 & \dots & 0 \\
0 & \lambda_2 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \lambda_n
\end{pmatrix},
\]
then for any $l = 1, 2, \dots$, one has
\[
A^l = \begin{pmatrix}
\lambda_1^l & 0 & \dots & 0 \\
0 & \lambda_2^l & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \lambda_n^l
\end{pmatrix}.
\]

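For example,
\[
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -4 \end{pmatrix}^{l}
= \begin{pmatrix} 2^l & 0 & 0 \\ 0 & 3^l & 0 \\ 0 & 0 & (-4)^l \end{pmatrix}.
\]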

In general, for diagonal matrices the elementary matrix operations become simpler.
If $A = M D M^{-1}$, then
\[
A^l = M D M^{-1} \, M D M^{-1} \cdots M D M^{-1} = M D^l M^{-1},
\]
because every inner factor $M^{-1} M$ equals the identity matrix. Thus
taking the power of $A$ boils down to taking the power of a diagonal
matrix!
Definition 3.19. A real number $\lambda$ is called an eigenvalue of $A$ if
there exists a non-zero vector $v$ such that $Av = \lambda v$. Any such vector $v$
is called an eigenvector of $A$ associated with $\lambda$.
A matrix may have more than one eigenvalue.
Observe straight away that if $v$ is an eigenvector, then $tv$ is also an
eigenvector for any non-zero $t \in \mathbb{R}$.
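Example.
(1) $A = I_n$. For any vector $v \neq 0$ we have $Av = v$,
so $\lambda = 1$ is the only eigenvalue of $I_n$ and any non-zero vector
is an eigenvector.
(2) Let
\[
A = \begin{pmatrix} 10 & -18 \\ 6 & -11 \end{pmatrix}, \quad
v = \begin{pmatrix} 2 \\ 1 \end{pmatrix},
\]
so that
\[
Av = \begin{pmatrix} 10 & -18 \\ 6 & -11 \end{pmatrix}
\begin{pmatrix} 2 \\ 1 \end{pmatrix}
= \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
\]
Thus $\lambda = 1$ is an eigenvalue and $v$ is an eigenvector.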
The procedure for finding eigenvalues and eigenvectors is simple:
suppose that $Av = \lambda v$, so that $(A - \lambda I_n)v = 0$. Since we are looking for
a non-zero solution of this system, the matrix $A - \lambda I_n$ is not invertible,
i.e. $\det(A - \lambda I_n) = 0$. Conversely, if the determinant equals zero, this
means that the system has either no solutions or infinitely many solutions. Since
$v = 0$ is a solution, this means that one has infinitely many of them,
and hence there exists a vector $v \neq 0$ such that $Av = \lambda v$. This means
that $\lambda$ is an eigenvalue and we have proved the following result:
Proposition 3.20. Eigenvalues are precisely the real solutions $\lambda$ of the
equation $\det(A - \lambda I_n) = 0$.
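The above equation is an algebraic equation for the roots of a
polynomial of degree $n$ in $\lambda$. Thus it has exactly $n$ roots, counted with
multiplicity, but some of them may be complex. Suppose all of them are real:
$\lambda_1, \lambda_2, \dots, \lambda_n$. To find
the eigenvectors $v_j$ associated with them, we need to solve the systems
$Ax = \lambda_j x$, $j = 1, 2, \dots, n$. Then we form the matrix whose columns are these eigenvectors:
\[
M = \begin{pmatrix} v_1 & v_2 & \dots & v_n \end{pmatrix}.
\]
If $M is invertible$, then we have
\[
A = M D M^{-1}, \quad
D = \begin{pmatrix}
\lambda_1 & 0 & \dots & 0 \\
0 & \lambda_2 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \lambda_n
\end{pmatrix}.
\]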
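For a $2 \times 2$ matrix the equation $\det(A - \lambda I_2) = 0$ is the quadratic $\lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0$, so its real roots can be found with the usual formula. The following Python sketch illustrates this special case only; the function name `real_eigenvalues_2x2` is hypothetical, and floating-point square roots are used, so results for general input are approximate.

```python
import math


def real_eigenvalues_2x2(a):
    """Real roots of t^2 - (a11 + a22) t + (a11 a22 - a12 a21) = 0."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    determinant = a11 * a22 - a12 * a21
    discriminant = trace * trace - 4 * determinant
    if discriminant < 0:
        return []  # both roots are complex: no real eigenvalues
    root = math.sqrt(discriminant)
    return [(trace - root) / 2, (trace + root) / 2]


print(real_eigenvalues_2x2([[1, 2], [3, 2]]))    # two distinct real roots
print(real_eigenvalues_2x2([[2, 1], [0, 2]]))    # one repeated root
print(real_eigenvalues_2x2([[0, -1], [1, 0]]))   # no real roots
```

The three sample matrices cover the three possible cases: two distinct real eigenvalues, a repeated eigenvalue, and no real eigenvalues at all.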

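Example.
(1) Let
\[
A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}.
\]
Write the equation:
\[
\det(A - \lambda I_2) = \det\begin{pmatrix} 1 - \lambda & 2 \\ 3 & 2 - \lambda \end{pmatrix}
= (1 - \lambda)(2 - \lambda) - 6
= \lambda^2 - 3\lambda - 4 = (\lambda - 4)(\lambda + 1).
\]
Thus we have two eigenvalues: $\lambda_1 = -1$, $\lambda_2 = 4$.
Find the eigenvector for $\lambda_1 = -1$:
\[
(A + I_2)x = \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
Thus $x_1 + x_2 = 0$, whence
\[
v_1 = \begin{pmatrix} x_1 \\ -x_1 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
Find the eigenvector for $\lambda_2 = 4$:
\[
(A - 4I_2)x = \begin{pmatrix} -3 & 2 \\ 3 & -2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
Thus $3x_1 - 2x_2 = 0$, whence
\[
v_2 = \begin{pmatrix} x_1 \\ \tfrac{3}{2} x_1 \end{pmatrix} = \frac{x_1}{2} \begin{pmatrix} 2 \\ 3 \end{pmatrix}.
\]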
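Define the matrix $M$ as follows:
\[
M = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}.
\]
Note:
\[
M^{-1} = \frac{1}{5} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix}.
\]
Then the matrix $A$ can now be represented as $M D M^{-1}$. Let
us check:
\[
M D M^{-1}
= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} -1 & 0 \\ 0 & 4 \end{pmatrix}
\begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix}
= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} -3 & 2 \\ 4 & 4 \end{pmatrix}
= \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}.
\]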

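As we have observed previously,
\begin{align*}
A^l = M D^l M^{-1}
&= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} (-1)^l & 0 \\ 0 & 4^l \end{pmatrix}
\begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\
&= \frac{(-1)^l}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix}
+ \frac{4^l}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\
&= \frac{(-1)^l}{5} \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}
\begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix}
+ \frac{4^l}{5} \begin{pmatrix} 0 & 2 \\ 0 & 3 \end{pmatrix}
\begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\
&= \frac{(-1)^l}{5} \begin{pmatrix} 3 & -2 \\ -3 & 2 \end{pmatrix}
+ \frac{4^l}{5} \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix}.
\end{align*}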
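The identity $A^l = M D^l M^{-1}$ for this example can be compared with direct repeated multiplication. The Python sketch below is a minimal check under the assumption that $A$ has rows $(1, 2)$ and $(3, 2)$, that $M$ has columns $(1, -1)$ and $(2, 3)$, and that $D$ has diagonal entries $-1$ and $4$; the function names are introduced here for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def power_by_multiplication(a, l):
    """A^l for l >= 1, computed as a product of l copies of A."""
    result = a
    for _ in range(l - 1):
        result = matmul(result, a)
    return result


def power_by_diagonalisation(l):
    """A^l = M D^l M^{-1}, where only the diagonal entries are raised to the power l."""
    m = [[1, 2], [-1, 3]]
    d_power = [[(-1) ** l, 0], [0, 4 ** l]]
    m_inv = [[Fraction(3, 5), Fraction(-2, 5)],
             [Fraction(1, 5), Fraction(1, 5)]]
    return matmul(matmul(m, d_power), m_inv)


A = [[1, 2], [3, 2]]
agree = all(power_by_multiplication(A, l) == power_by_diagonalisation(l)
            for l in range(1, 10))
print(agree)
```

The diagonalised form needs two matrix products regardless of $l$, whereas repeated multiplication needs $l - 1$ of them.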


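(2) Here is a matrix which cannot be diagonalised:
\[
B = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.
\]
Indeed, the equation $\det(B - \lambda I_2) = 0$ looks as follows:
$(2 - \lambda)^2 = 0$, so that $\lambda = 2$ is the only eigenvalue. Does it have two
eigenvectors associated with it that are not multiples of each other? Let us find out:
\[
\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0,
\]
so $x_2 = 0$, and $x_1$ takes an arbitrary value. Therefore the
eigenvectors look like
\[
\begin{pmatrix} x_1 \\ 0 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]
Thus if we take the matrix
\[
M = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
\]
to attempt diagonalisation, it will not be invertible!
(3) Let us find eigenvalues of a $3 \times 3$ matrix:
\[
A = \begin{pmatrix} 1 & -1 & 4 \\ 3 & 2 & -1 \\ 2 & 1 & -1 \end{pmatrix}.
\]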

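Write the equation:
\begin{align*}
\det(A - \lambda I_3)
&= \det\begin{pmatrix} 1 - \lambda & -1 & 4 \\ 3 & 2 - \lambda & -1 \\ 2 & 1 & -1 - \lambda \end{pmatrix} \\
&= (1 - \lambda) \det\begin{pmatrix} 2 - \lambda & -1 \\ 1 & -1 - \lambda \end{pmatrix}
+ \det\begin{pmatrix} 3 & -1 \\ 2 & -1 - \lambda \end{pmatrix}
+ 4 \det\begin{pmatrix} 3 & 2 - \lambda \\ 2 & 1 \end{pmatrix}.
\end{align*}
Thus
\begin{align*}
\det(A - \lambda I_3)
&= (1 - \lambda)\bigl[-(2 - \lambda)(1 + \lambda) + 1\bigr] + \bigl[-3(1 + \lambda) + 2\bigr] + 4\bigl[3 - 2(2 - \lambda)\bigr] \\
&= (1 - \lambda)(\lambda^2 - \lambda - 1) - 3\lambda - 5 + 8\lambda \\
&= (1 - \lambda)(\lambda^2 - \lambda - 1) + 5(\lambda - 1) \\
&= -(\lambda - 1)(\lambda^2 - \lambda - 6) = -(\lambda - 1)(\lambda + 2)(\lambda - 3).
\end{align*}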


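Thus the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = -2$, $\lambda_3 = 3$.
Eigenvector for $\lambda_1 = 1$:
\[
\left(\begin{array}{ccc|c} 0 & -1 & 4 & 0 \\ 3 & 1 & -1 & 0 \\ 2 & 1 & -2 & 0 \end{array}\right)
\xrightarrow{\substack{R_2 + R_1 \\ R_3 + R_1}}
\left(\begin{array}{ccc|c} 0 & -1 & 4 & 0 \\ 3 & 0 & 3 & 0 \\ 2 & 0 & 2 & 0 \end{array}\right)
\xrightarrow{\substack{R_3/2 - R_2/3 \\ R_2/3}}
\left(\begin{array}{ccc|c} 0 & -1 & 4 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).
\]
Thus $x_1 = -x_3$ and $x_2 = 4x_3$, so
\[
v_1 = \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix}.
\]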
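Eigenvector for $\lambda_2 = -2$:
\[
\left(\begin{array}{ccc|c} 3 & -1 & 4 & 0 \\ 3 & 4 & -1 & 0 \\ 2 & 1 & 1 & 0 \end{array}\right)
\xrightarrow{\substack{R_2 - R_1 \\ 3R_3 - 2R_1}}
\left(\begin{array}{ccc|c} 3 & -1 & 4 & 0 \\ 0 & 5 & -5 & 0 \\ 0 & 5 & -5 & 0 \end{array}\right)
\xrightarrow{\substack{R_3 - R_2 \\ R_2/5}}
\left(\begin{array}{ccc|c} 3 & -1 & 4 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\xrightarrow{R_1 + R_2}
\left(\begin{array}{ccc|c} 3 & 0 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).
\]
Therefore $x_1 = -x_3$, $x_2 = x_3$, and hence
\[
v_2 = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}.
\]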
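Eigenvector for $\lambda_3 = 3$:
\[
\left(\begin{array}{ccc|c} -2 & -1 & 4 & 0 \\ 3 & -1 & -1 & 0 \\ 2 & 1 & -4 & 0 \end{array}\right)
\xrightarrow{\substack{2R_2 + 3R_1 \\ R_3 + R_1}}
\left(\begin{array}{ccc|c} -2 & -1 & 4 & 0 \\ 0 & -5 & 10 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\xrightarrow{\substack{-R_1 \\ -R_2/5}}
\left(\begin{array}{ccc|c} 2 & 1 & -4 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).
\]
Therefore $x_2 = 2x_3$, $2x_1 = 4x_3 - x_2 = 2x_3$, and hence
\[
v_3 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}.
\]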
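The three eigenpairs found by row reduction can be verified by substituting them back into $Av = \lambda v$. The Python sketch below does this with exact integer arithmetic. It assumes the matrix $A$ has rows $(1, -1, 4)$, $(3, 2, -1)$, $(2, 1, -1)$ and that the eigenvectors are $(-1, 4, 1)$, $(-1, 1, 1)$ and $(1, 2, 1)$; the helper names `apply` and `det3` are hypothetical. It also computes the determinant of the matrix $M$ whose columns are the eigenvectors, since a non-zero value means $M$ is invertible.

```python
def apply(a, v):
    """Matrix-vector product A v for a list of rows a and a list v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion in the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A = [[1, -1, 4],
     [3, 2, -1],
     [2, 1, -1]]

# (eigenvalue, eigenvector) pairs as assumed in the lead-in.
eigenpairs = [(1, [-1, 4, 1]), (-2, [-1, 1, 1]), (3, [1, 2, 1])]

# Each check compares A v with lambda v entry by entry.
checks = [apply(A, v) == [lam * x for x in v] for lam, v in eigenpairs]

# M has the eigenvectors as columns, so transpose the list of vectors.
M = [list(row) for row in zip(*(v for _, v in eigenpairs))]
print(checks, det3(M))
```

A non-zero `det3(M)` indicates that this matrix can be diagonalised with the $M$ built from these three eigenvectors.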
