
Applied Linear Algebra Refresher Course

Karianne Bergen
kbergen@stanford.edu
Institute for Computational and Mathematical Engineering,
Stanford University

September 16-19, 2013


Course Topics

Matrix and vector properties
Matrix algebra
Linear transformations
Solving linear systems
Orthogonality
Eigenvectors and eigenvalues
Matrix decompositions


Vectors and Matrices


Scalars, vectors and matrices

Scalar: a single quantity or measurement that is invariant under coordinate rotations (e.g. mass, volume, temperature). Scalars are denoted by Greek letters: $\alpha, \beta, \gamma \in \mathbb{R}$.

Vector: an ordered collection of scalars (e.g. displacement, acceleration, force):
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i \in \mathbb{R}, \quad i = 1, \dots, n$$
$x, y, z \in \mathbb{R}^n$ (lower case) denotes a vector with $n$ entries.

All vectors are column vectors unless otherwise noted. Row vectors are denoted by $x^T = [\, x_1 \ x_2 \ \cdots \ x_n \,]$.

For simplicity, we will restrict ourselves to real numbers. The generalization to the complex case can be found in any linear algebra text.

... Scalars, vectors and matrices

Matrix: a two-dimensional collection of scalars.
$$A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \qquad A^T = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}$$

$A \in \mathbb{R}^{m \times n}$ is a matrix with $m$ rows and $n$ columns. Matrices are denoted by upper-case letters: $A, B, \dots \in \mathbb{R}^{m \times n}$.

Operations on vectors
Scalar multiplication:
$$(\alpha x)_i = \alpha x_i, \qquad \alpha x = \alpha \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix}$$

Vector addition/subtraction:
$$(x + y)_i = x_i + y_i, \qquad x + y = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}$$

Inner product (dot product, scalar product):
$$x^T y = \sum_{i=1}^n x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$

Other common notation for the inner product: $\langle x, y \rangle$, $x \cdot y$.

Vector norms

How do we measure the length or size of a vector?

A vector norm is a function $f : \mathbb{R}^n \to \mathbb{R}_+$ that satisfies the following properties for all $\alpha \in \mathbb{R}$ and $u, v \in \mathbb{R}^n$:

Scale invariance: $f(\alpha v) = |\alpha|\, f(v)$

Triangle inequality: $f(u + v) \le f(u) + f(v)$

Positivity: $f(v) = 0 \iff v = 0$

Commonly used vector norms


1-norm (taxi-cab norm):
$$\|v\|_1 = \sum_{i=1}^n |v_i|$$

2-norm (Euclidean norm):
$$\|v\|_2 = \left( \sum_{i=1}^n v_i^2 \right)^{1/2} = \sqrt{v^T v}$$

infinity-norm (max norm):
$$\|v\|_\infty = \max_i |v_i|$$

These are examples of the $p$-norm, for $p = 1, 2$, and $\infty$:
$$\|v\|_p = \left( \sum_{i=1}^n |v_i|^p \right)^{1/p}, \qquad p \ge 1$$
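As a quick numerical check of the definitions above, here is a minimal NumPy sketch (the vector `v` is just an illustrative value); `numpy.linalg.norm` computes the $p$-norms directly.

```python
import numpy as np

v = np.array([3.0, 6.0, 2.0])  # example vector

# p-norms computed from the definitions above
norm1 = np.sum(np.abs(v))        # 1-norm: sum of absolute values
norm2 = np.sqrt(v @ v)           # 2-norm: sqrt(v^T v)
norm_inf = np.max(np.abs(v))     # infinity-norm: largest |v_i|

# the same norms via numpy.linalg.norm
assert norm1 == np.linalg.norm(v, 1)
assert np.isclose(norm2, np.linalg.norm(v, 2))
assert norm_inf == np.linalg.norm(v, np.inf)
print(norm1, norm2, norm_inf)    # 11.0 7.0 6.0
```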

Questions?


Mini-Quiz 1:

Compute the 1-norm, 2-norm, and infinity-norm of the following vector:
$$v = \begin{bmatrix} 3 \\ 6 \\ 2 \end{bmatrix}$$

Solution (Q1):
1-norm:
$$\|v\|_1 = \sum_{i=1}^3 |v_i| = 3 + 6 + 2 = 11$$
2-norm:
$$\|v\|_2 = \left( \sum_{i=1}^3 v_i^2 \right)^{1/2} = \sqrt{3^2 + 6^2 + 2^2} = \sqrt{9 + 36 + 4} = \sqrt{49} = 7$$
$\infty$-norm:
$$\|v\|_\infty = \max_i |v_i| = \max\{3, 6, 2\} = 6$$

Cauchy-Schwarz Inequality

The inner product satisfies the Cauchy-Schwarz inequality:
$$|x^T y| \le \|x\|_2 \, \|y\|_2 \qquad \text{for } x, y \in \mathbb{R}^n,$$
with equality when $y$ is a scalar multiple of $x$.

This inequality arises in many applications, including:
- Geometry: triangle inequality
- Physics: Heisenberg uncertainty principle
- Statistics: Cramér-Rao lower bound

Inner Product and Orthogonality

Orthogonal vectors: vectors $u$ and $v$ are orthogonal ($u \perp v$) if their inner product is zero:
$$u^T v = 0 \iff u \perp v$$

Orthonormal vectors: vectors $u$ and $v$ are orthonormal if they are orthogonal and both are normalized to length 1:
$$u \perp v \quad \text{and} \quad \|u\| = 1, \ \|v\| = 1$$

The orthogonal complement of a set of vectors $V$, denoted $V^\perp$, is the set of vectors $x$ such that $x^T v = 0$ for all $v \in V$.

Inner Product and Projections

Projection: the inner product gives us a way to compute the component of a vector $v$ along any direction $u$:
$$\mathrm{proj}_u v = \left( \frac{u^T v}{\|u\|} \right) \frac{u}{\|u\|}$$

When $u$ is normalized to unit length, $\|u\| = 1$, this becomes
$$\mathrm{proj}_u v = (u^T v)\, u$$
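The formula translates directly into code. A minimal NumPy sketch (the vectors `u` and `v` are illustrative values):

```python
import numpy as np

def project(v, u):
    """Orthogonal projection of v onto the line spanned by u."""
    return (u @ v) / (u @ u) * u   # (u^T v / ||u||^2) u

u = np.array([1.0, 1.0])
v = np.array([2.0, 0.0])
p = project(v, u)                  # component of v along u
r = v - p                          # remainder, orthogonal to u
print(p, r, u @ r)                 # [1. 1.] [ 1. -1.] 0.0
```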

Questions?


Mini-Quiz 2:

Give an algebraic proof of the triangle inequality
$$\|x + y\|_2 \le \|x\|_2 + \|y\|_2$$
using the Cauchy-Schwarz inequality. (Hint: expand $\|x + y\|^2$.)

Solution (Q2):
Give an algebraic proof of the triangle inequality $\|x + y\|_2 \le \|x\|_2 + \|y\|_2$ using the Cauchy-Schwarz inequality. (Hint: expand $\|x + y\|^2$.)

Solution:
$$\begin{aligned}
\|x + y\|^2 &= (x + y)^T (x + y) \\
&= x^T x + 2\,x^T y + y^T y \\
&= \|x\|^2 + \|y\|^2 + 2\,x^T y \\
&\le \|x\|^2 + \|y\|^2 + 2\,|x^T y| \\
&\le \|x\|^2 + \|y\|^2 + 2\,\|x\|\|y\| \qquad \text{(by Cauchy-Schwarz)} \\
&= (\|x\| + \|y\|)^2
\end{aligned}$$
Taking square roots of both sides gives the result.

Operations with Matrices


Scalar multiplication:
$$(\alpha A)_{ij} = \alpha A_{ij}, \qquad \alpha \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} \alpha a_{11} & \cdots & \alpha a_{1n} \\ \vdots & \ddots & \vdots \\ \alpha a_{m1} & \cdots & \alpha a_{mn} \end{bmatrix}$$

Matrix addition:
$$(A + B)_{ij} = A_{ij} + B_{ij}, \qquad A + B = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}, \qquad A, B \in \mathbb{R}^{m \times n}$$

Matrix-vector multiplication
Matrix-vector multiplication:
$$(Ax)_i = \sum_{j=1}^n A_{ij} x_j, \qquad A \in \mathbb{R}^{m \times n}, \ x \in \mathbb{R}^n$$

Matrix-vector product as a linear combination of matrix columns:
$$Ax = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$$

Matrix-vector product as a vector of scalar products with the rows $\tilde a_i^T$ of $A$:
$$Ax = \begin{bmatrix} \tilde a_1^T \\ \tilde a_2^T \\ \vdots \\ \tilde a_m^T \end{bmatrix} x = \begin{bmatrix} \tilde a_1^T x \\ \tilde a_2^T x \\ \vdots \\ \tilde a_m^T x \end{bmatrix}$$
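The two viewpoints give the same product, which is easy to confirm numerically; a minimal NumPy sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[4.0, 0.0, 2.0],
              [1.0, 5.0, 3.0]])
x = np.array([1.0, 5.0, 2.0])

# row view: entry i is the inner product of row i with x
row_view = np.array([A[i, :] @ x for i in range(A.shape[0])])

# column view: weighted sum of the columns of A
col_view = sum(x[j] * A[:, j] for j in range(A.shape[1]))

assert np.allclose(row_view, A @ x)
assert np.allclose(col_view, A @ x)
print(A @ x)  # [ 8. 32.]
```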

Matrix-matrix multiplication
Matrix product:
$$(AB)_{ij} = \sum_{k=1}^p A_{ik} B_{kj}, \qquad A \in \mathbb{R}^{m \times p}, \ B \in \mathbb{R}^{p \times n}$$

Matrix products in terms of inner products:
$$(AB)_{ij} = \tilde a_i^T b_j, \qquad A = \begin{bmatrix} \tilde a_1^T \\ \tilde a_2^T \\ \vdots \\ \tilde a_m^T \end{bmatrix}, \quad B = \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix}$$

Since inner products are only defined for vectors of the same length, the matrix product requires that $\tilde a_i, b_j \in \mathbb{R}^p$ for all $i, j$.

... Matrix-matrix multiplication


Matrix products in terms of matrix-vector products:
$$AB = A \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix} = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_n \end{bmatrix}$$
$$AB = \begin{bmatrix} \tilde a_1^T \\ \tilde a_2^T \\ \vdots \\ \tilde a_m^T \end{bmatrix} B = \begin{bmatrix} \tilde a_1^T B \\ \tilde a_2^T B \\ \vdots \\ \tilde a_m^T B \end{bmatrix}$$

Matrix-matrix multiplication can be expressed in terms of the concatenation of matrix-vector products.

For $A \in \mathbb{R}^{k \times l}$ and $B \in \mathbb{R}^{m \times n}$, the product $AB$ is only defined if $l = m$, and the product $BA$ is only defined if $k = n$.

Properties of Matrix-matrix multiplication


Matrix multiplication is associative:
$$A(BC) = (AB)C$$
Matrix multiplication is distributive:
$$A(B + C) = AB + AC$$
Matrix multiplication is, in general, not commutative:
$$AB \ne BA$$
For example, let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$; then
$$AB = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \ne \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} = BA.$$

Vector Outer Product


Using the definition of matrix multiplication, we can define the vector outer product:
$$xy^T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} = \begin{bmatrix} x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & \cdots & x_2 y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_m y_1 & x_m y_2 & \cdots & x_m y_n \end{bmatrix}$$
$$\left( xy^T \right)_{ij} = x_i y_j, \qquad x \in \mathbb{R}^m, \ y \in \mathbb{R}^n, \ xy^T \in \mathbb{R}^{m \times n}$$

The outer product $xy^T$ of vectors $x$ and $y$ is a matrix, and is defined even if the vectors are not the same length.

Questions?


Mini-Quiz 3:
Which of the following vector and matrix operations are well-defined?

1. $\alpha x$  2. $\alpha A$  3. $x + y$  4. $x^T y$  5. $x^T z$  6. $z x^T$  7. $Az$  8. $z^T A y$

$$\alpha = 5, \quad x = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix}, \quad y = \begin{bmatrix} 4 \\ 0 \\ 1 \end{bmatrix}, \quad z = \begin{bmatrix} 3 \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} 4 & 0 & 2 \\ 1 & 5 & 3 \end{bmatrix}$$

Solution (Q3):
Which of the following vector and matrix operations are well-defined?

1. $\alpha x$ — Yes
2. $\alpha A$ — Yes
3. $x + y$ — Yes (both in $\mathbb{R}^3$)
4. $x^T y$ — Yes
5. $x^T z$ — No ($x \in \mathbb{R}^3$, $z \in \mathbb{R}^2$ have different lengths)
6. $z x^T$ — Yes (the outer product is defined for vectors of any lengths)
7. $Az$ — No ($A$ has 3 columns but $z \in \mathbb{R}^2$)
8. $z^T A y$ — Yes ($1 \times 2$ times $2 \times 3$ times $3 \times 1$ gives a scalar)

Transpose of a Matrix
Transpose: the transpose of matrix $A \in \mathbb{R}^{m \times n}$ is the matrix $A^T \in \mathbb{R}^{n \times m}$ formed by exchanging the rows and columns of $A$:
$$(A^T)_{ij} = A_{ji}$$
Properties:
$$(A^T)^T = A, \qquad (A + B)^T = A^T + B^T, \qquad (AB)^T = B^T A^T, \qquad (\alpha B)^T = \alpha (B^T)$$

Trace of a Matrix
Trace: the trace of a square matrix $A \in \mathbb{R}^{n \times n}$ is the sum of its diagonal entries:
$$\mathrm{tr}(A) = \sum_{i=1}^n a_{ii}$$
The trace is a linear map and has the following properties:
- $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$
- $\mathrm{tr}(\alpha A) = \alpha\, \mathrm{tr}(A)$
- $\mathrm{tr}(A) = \mathrm{tr}(A^T)$
- $\mathrm{tr}(AB) = \mathrm{tr}(BA)$

The trace of a matrix product functions similarly to the inner product of vectors:
$$\mathrm{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}, \qquad A, B \in \mathbb{R}^{m \times n}$$

Matrix norms
Induced norm (operator norm) corresponding to vector norm $\|\cdot\|$:
$$\|A\| = \max_{\|x\| = 1} \|Ax\|$$
- $\|A\|_1$: maximum absolute column sum
- $\|A\|_\infty$: maximum absolute row sum
- $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} = \sigma_{\max}(A)$ (spectral norm)

Frobenius norm:
$$\|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 } = \sqrt{ \mathrm{trace}(A^T A) }$$
The Frobenius norm is equivalent to $\|\mathrm{vec}(A)\|_2$.
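A minimal NumPy sketch checking the column-sum and row-sum characterizations against `numpy.linalg.norm` (the matrix is illustrative):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

col_sum = np.abs(A).sum(axis=0).max()   # ||A||_1: max absolute column sum
row_sum = np.abs(A).sum(axis=1).max()   # ||A||_inf: max absolute row sum
spectral = np.sqrt(np.linalg.eigvalsh(A.T @ A).max())  # ||A||_2

assert col_sum == np.linalg.norm(A, 1)
assert row_sum == np.linalg.norm(A, np.inf)
assert np.isclose(spectral, np.linalg.norm(A, 2))
print(col_sum, row_sum, spectral, np.linalg.norm(A, 'fro'))
```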

Geometric Interpretation of Matrix Norms

diagram from Trefethen and Bau.



Properties of Matrix Norms

Matrix norms satisfy the same properties as vector norms:
- Scale invariance: $\|\alpha A\| = |\alpha| \|A\|$
- Triangle inequality: $\|A + B\| \le \|A\| + \|B\|$
- Positivity: $\|A\| \ge 0$, and $\|A\| = 0$ only if $A = 0$

Some matrix norms, including induced norms and the Frobenius norm, also satisfy the submultiplicative property:
$$\|Ax\| \le \|A\| \|x\|, \qquad \text{and} \qquad \|AB\| \le \|A\| \|B\|$$

A few words on: Vector and Matrix Norms


Vector norms give us a way to measure how big a vector is.

The 2-norm (Euclidean norm) is most common: it has a nice relationship with the inner product, $\|v\|_2^2 = v^T v$, and fits with our usual geometric notions of length and distance.

The choice of norm may depend on the application. Let $y \in \mathbb{R}^n$ be a set of $n$ measurements, and let $\hat y \in \mathbb{R}^n$ be the estimates for $y$ from a model. Let $r = y - \hat y$ be the difference between the measured values and the model estimates. We want to optimize our model to make $r$ as small as possible, but how should we measure "small"?
- $\|r\|_\infty$: use this if no error should exceed a prescribed value.
- $\|r\|_2$: use this if large errors are bad, but small errors are OK.
- $\|r\|_1$: use this norm if you want your measurement and estimate to match exactly for as many samples as possible.

Matrix norms are an extension of vector norms to matrices.

Questions?


Mini-Quiz 4:
Determine whether each statement given below is True or False:
1. There exist matrices $A, B \ne 0$ such that $AB = 0$.
2. The trace is invariant under cyclic permutations: $\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA)$.
3. The definition of the induced norm allows us to use different vector norms on the numerator and denominator: $\|A\|_{\alpha,\beta} = \max_{\|x\|_\beta = 1} \|Ax\|_\alpha$, and $\|A\|_{\infty,1}$ is equal to the largest element of $A$ in absolute value.

Solutions (Q4):
Determine whether each statement given below is True or False:

There exist matrices $A, B \ne 0$ such that $AB = 0$.
TRUE: For example, when $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} -2 & 4 \\ 1 & -2 \end{bmatrix}$.

The trace is invariant under cyclic permutations: $\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA)$.
TRUE. Use the property that $\mathrm{tr}(XY) = \mathrm{tr}(YX)$:
- To show that $\mathrm{tr}(ABC) = \mathrm{tr}(CAB)$, let $X = AB$ and $Y = C$.
- To show that $\mathrm{tr}(ABC) = \mathrm{tr}(BCA)$, let $X = A$ and $Y = BC$.

Solutions (Q4):
Determine whether each statement given below is True or False:

The definition of the induced norm allows us to use different vector norms on the numerator and denominator, $\|A\|_{\alpha,\beta} = \max_{\|x\|_\beta = 1} \|Ax\|_\alpha$, and $\|A\|_{\infty,1}$ is equal to the largest element of $A$ in absolute value.
TRUE:
$$\|A\|_{\infty,1} = \max_{\|x\|_1 = 1} \|Ax\|_\infty = \max_{\|x\|_1 = 1} \, \max_i |\tilde a_i^T x| = \max_{i,j} |a_{ij}|.$$

Special Matrices: Diagonal, Identity


Diagonal matrix: a matrix is diagonal if the only non-zero entries are on the matrix diagonal:
$$D_{ij} = \begin{cases} d_i & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}, \qquad D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix}$$

Identity matrix: the identity $I_n \in \mathbb{R}^{n \times n}$ is a diagonal matrix with 1s along the diagonal (denoted by $I$ when the size is unambiguous):
$$I_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}, \qquad I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$
$$A I_n = I_m A = A \ \text{ for rectangular } A \in \mathbb{R}^{m \times n}, \qquad AI = IA = A \ \text{ for square } A \in \mathbb{R}^{n \times n}$$

Special Matrices: Triangular


Triangular matrix: a matrix is lower/upper triangular if all of the entries above/below the diagonal are zero:
$$L = \begin{bmatrix} l_{11} & & & \\ l_{21} & l_{22} & & \\ \vdots & \vdots & \ddots & \\ l_{m1} & l_{m2} & \cdots & l_{mn} \end{bmatrix}, \qquad U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ & & & u_{mn} \end{bmatrix}$$
- The sum of two upper triangular matrices is upper triangular.
- The product of two upper triangular matrices is upper triangular.
- The inverse of an invertible upper triangular matrix is upper triangular.

Special Matrices: Symmetric, Orthogonal


Symmetric matrix: a square matrix is symmetric if it is equal to its own transpose:
$$A = A^T \in \mathbb{R}^{n \times n}, \qquad A_{ij} = A_{ji}$$

Orthogonal (unitary) matrix: a square matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal (unitary) if it satisfies
$$U^T U = U U^T = I.$$
The columns of an orthogonal matrix $U$ are orthonormal.

Questions?


Mini-Quiz 5:
Determine whether each statement given below is True or False:
1. For any $A \in \mathbb{R}^{n \times n}$, $A^T A$ is symmetric.
2. For any $A \in \mathbb{R}^{n \times n}$, $A - A^T$ is symmetric.
3. Invariance under unitary multiplication: for any vector $x \in \mathbb{R}^m$ and unitary $Q \in \mathbb{R}^{m \times m}$, we have $\|Qx\|_2 = \|x\|_2$.
4. Let $D = \mathrm{diag}(d_1, \dots, d_n) \in \mathbb{R}^{n \times n}$ be any diagonal matrix; then $DA = AD$ for any matrix $A \in \mathbb{R}^{n \times n}$.

Solution (Q5):

Determine whether each statement given below is True or False:

For any $A \in \mathbb{R}^{n \times n}$, $A^T A$ is symmetric.
TRUE: $(A^T A)^T = A^T (A^T)^T = A^T A$.

For any $A \in \mathbb{R}^{n \times n}$, $A - A^T$ is symmetric.
FALSE: For example, when $A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$, $\ A - A^T = \begin{bmatrix} 0 & 2 \\ -2 & 0 \end{bmatrix}$.

Solution (Q5):
Invariance under unitary multiplication: for any vector $x \in \mathbb{R}^m$ and unitary $Q \in \mathbb{R}^{m \times m}$, we have $\|Qx\|_2 = \|x\|_2$.
TRUE: $\|Qx\|_2^2 = (Qx)^T (Qx) = x^T Q^T Q x = x^T x = \|x\|_2^2$.

Similarly, for the matrix 2-norm and Frobenius norm, we have:
$$\|QA\|_2^2 = \lambda_{\max}\big((QA)^T (QA)\big) = \lambda_{\max}(A^T Q^T Q A) = \lambda_{\max}(A^T A) = \|A\|_2^2$$
$$\|QA\|_F^2 = \mathrm{tr}\big((QA)^T (QA)\big) = \mathrm{tr}(A^T A) = \|A\|_F^2$$

Solution (Q5):
Determine whether each statement given below is True or False:

Let $D = \mathrm{diag}(d_1, \dots, d_n) \in \mathbb{R}^{n \times n}$ be any diagonal matrix; then $DA = AD$ for any matrix $A \in \mathbb{R}^{n \times n}$.
FALSE. $DA$ rescales the rows of $A$ while $AD$ rescales the columns of $A$:
$$DA = \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{bmatrix} \begin{bmatrix} \tilde a_1^T \\ \vdots \\ \tilde a_n^T \end{bmatrix} = \begin{bmatrix} d_1 \tilde a_1^T \\ \vdots \\ d_n \tilde a_n^T \end{bmatrix} \ \ne \ \begin{bmatrix} d_1 a_1 & \cdots & d_n a_n \end{bmatrix} = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} \begin{bmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{bmatrix} = AD$$

Vector Spaces


Linear Transformations
Linear transformation: $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ the following properties hold:
- $T(x + y) = T(x) + T(y)$
- $T(\alpha x) = \alpha\, T(x)$

Theorem: Let $A \in \mathbb{R}^{m \times n}$ be a matrix and define the function $T_A(x) = Ax$. Then $T_A : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation.
- Matrix multiplication is a linear function.

Theorem: Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be linear; then there exists a matrix $A$ such that $T(x) = Ax$.
- There is a matrix for every linear function.

Linear Combinations
Let $v_1, v_2, \dots, v_r \in \mathbb{R}^m$ be a set of vectors. Then any vector $v$ which can be written in the form
$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r, \qquad \alpha_i \in \mathbb{R}, \ i = 1, \dots, r$$
is a linear combination of the vectors $v_1, v_2, \dots, v_r$.

Span: the set of all linear combinations of vectors $v_1, \dots, v_r \in \mathbb{R}^m$ is called the span of $\{v_1, \dots, v_r\}$.
- $\mathrm{span}\{v_1, \dots, v_r\}$ is always a subspace of $\mathbb{R}^m$.
- If $S = \mathrm{span}\{v_1, \dots, v_r\}$, then $S$ is spanned by $v_1, v_2, \dots, v_r$.

A set $S$ is called a subspace if it is closed under vector addition and scalar multiplication:
- if $x, y \in S$ and $\alpha \in \mathbb{R}$, then $x + y \in S$ and $\alpha x \in S$.

Linear Independence
Linear dependence: a set of vectors $v_1, v_2, \dots, v_r$ is linearly dependent if there exists a set of scalars $\alpha_1, \alpha_2, \dots, \alpha_r \in \mathbb{R}$ with at least one $\alpha_i \ne 0$ such that
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r = 0.$$
That is, a set of vectors is linearly dependent if one of the vectors in the set can be written as a linear combination of one or more other vectors in the set.

Linear independence: a set of vectors $v_1, v_2, \dots, v_r$ is linearly independent if it is not linearly dependent. That is,
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r = 0 \implies \alpha_1 = \cdots = \alpha_r = 0$$

Questions?


Mini-Quiz 6:
Are the following sets of vectors linearly independent or linearly dependent?
1. $\{a, b, c\}$  2. $\{b, c, d\}$  3. $\{a, b, d\}$  4. $\{a, b, c, d\}$
$$a = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad d = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$$

Solution (Q6):
Are the following sets of vectors linearly independent or linearly dependent?
1. $\{a, b, c\}$ — independent
2. $\{b, c, d\}$ — dependent: $b - 2c = d$
3. $\{a, b, d\}$ — independent
4. $\{a, b, c, d\}$ — dependent: there are at most 3 linearly independent vectors in $\mathbb{R}^3$
$$a = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad d = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$$

Basis and Dimension


A basis for a subspace $S$ is a linearly independent set of vectors that span $S$.
- The basis for a subspace $S$ is not unique, but all bases for the subspace contain the same number of vectors.
- For example, consider the case where $S = \mathbb{R}^2$:
$$\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \quad \text{and} \quad \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$$
are both bases that span $\mathbb{R}^2$.

The dimension of the subspace $S$, $\dim(S)$, is the number of linearly independent vectors in the basis for $S$ (in the example above, $\dim(S) = 2$).

Theorem (unique representation): if vectors $v_1, \dots, v_n$ are a basis for subspace $S$, then every vector in $S$ can be uniquely represented as a linear combination of these basis vectors.

Range and Nullspace


The range (column space, image) of a matrix $A \in \mathbb{R}^{m \times n}$, denoted by $R(A)$, is the set of all linear combinations of the columns of $A$:
$$R(A) := \{ Ax \mid x \in \mathbb{R}^n \}, \qquad R(A) \subseteq \mathbb{R}^m$$

The nullspace (kernel) of a matrix $A \in \mathbb{R}^{m \times n}$, denoted by $N(A)$, is the set of vectors $z$ such that $Az = 0$:
$$N(A) := \{ z \in \mathbb{R}^n \mid Az = 0 \}, \qquad N(A) \subseteq \mathbb{R}^n$$

The range and nullspace of $A^T$ are called the row space and left nullspace of $A$.
- These four subspaces of $A$ are intrinsic to $A$ and do not depend on the choice of basis.

Rank of a Matrix

The column rank of $A$, denoted $\mathrm{rank}(A)$, is the dimension of $R(A)$. The row rank of $A$ is the dimension of $R(A^T)$.

Rank of a matrix: the column rank of a matrix is always equal to the row rank, and therefore we refer to this value as the rank of $A$.

A matrix $A \in \mathbb{R}^{m \times n}$ is full rank if $\mathrm{rank}(A) = \min\{m, n\}$.

Fundamental Theorem of Linear Algebra


Fundamental theorem of linear algebra:
1. The nullspace of $A$ is the orthogonal complement of the row space: $N(A) = \left( R(A^T) \right)^\perp$
2. The left nullspace of $A$ is the orthogonal complement of the column space: $N(A^T) = \left( R(A) \right)^\perp$

Corollary (rank-nullity):
$$\dim\left( R(A) \right) + \dim\left( N(A) \right) = n, \qquad A \in \mathbb{R}^{m \times n}$$

Questions?


Mini-Quiz 7:

Give the range $R(\cdot)$ and nullspace $N(\cdot)$ for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 2}, \qquad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in \mathbb{R}^{2 \times 3}.$$
What are their dimensions, $\dim(R(\cdot))$ and $\dim(N(\cdot))$?

Solution (Q7):
Give the range $R(\cdot)$ and nullspace $N(\cdot)$ for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 2}, \qquad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in \mathbb{R}^{2 \times 3}.$$
What are their dimensions, $\dim(R(\cdot))$ and $\dim(N(\cdot))$?

Solution:
$$R(A) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\}, \qquad N(A) = \{0\}$$
$$R(B) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\} = \mathbb{R}^2, \qquad N(B) = \left\{ \gamma \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \,\middle|\, \gamma \in \mathbb{R} \right\}$$
$$\dim(R(A)) = 2, \quad \dim(N(A)) = 0, \quad \dim(R(B)) = 2, \quad \dim(N(B)) = 1$$

Solving Linear Systems


Nonsingular Matrix
Theorem: a matrix $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ has full rank if and only if it maps no two distinct vectors to the same vector:
$$Ax = Ay \implies x = y.$$

Singular matrix: a square matrix $A \in \mathbb{R}^{n \times n}$ that is not full rank.
Nonsingular matrix: a square matrix $A \in \mathbb{R}^{n \times n}$ of full rank.

If $A$ is nonsingular, we can uniquely express any vector $y \in \mathbb{R}^n$ as $y = Ax$ for some unique $x \in \mathbb{R}^n$. If we choose the vectors $x_i$ such that $Ax_i = e_i$, $i = 1, \dots, n$, we can write
$$AX = A \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} = \begin{bmatrix} Ax_1 & \cdots & Ax_n \end{bmatrix} = \begin{bmatrix} e_1 & \cdots & e_n \end{bmatrix} = I$$

Matrix Inverse
Matrix inverse: the inverse of a square matrix $A \in \mathbb{R}^{n \times n}$ is the matrix $B \in \mathbb{R}^{n \times n}$ such that
$$AB = BA = I, \qquad \text{where } I \text{ is the identity matrix.}$$
- The inverse of $A$ is denoted by $A^{-1}$.
- Any square, nonsingular matrix $A$ has a unique inverse $A^{-1}$ satisfying $AA^{-1} = A^{-1}A = I$.
- $(AB)^{-1} = B^{-1} A^{-1}$ (assuming both inverses exist).
- $(A^T)^{-1} = (A^{-1})^T$

If $Ax = b$, then $A^{-1} b$ gives the vector of coefficients in the linear combination of the columns of $A$ that yields $b$:
$$b = Ax = x_1 a_1 + \cdots + x_n a_n, \qquad x = A^{-1} b = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

Invertible Matrix Theorem

Let $A \in \mathbb{R}^{n \times n}$ be a square matrix; then the following statements are equivalent:
- $A$ is invertible; $A$ is nonsingular.
- There exists a matrix $A^{-1}$ such that $AA^{-1} = I_n = A^{-1}A$.
- $Ax = b$ has exactly one solution for each $b \in \mathbb{R}^n$.
- $Az = 0$ has only the trivial solution $z = 0$, i.e. $N(A) = \{0\}$.
- The columns of $A$ are linearly independent.
- $A$ is full rank, i.e. $\mathrm{rank}(A) = n$.
- $\det(A) \ne 0$.
- $0$ is not an eigenvalue of $A$.

Questions?


Mini-Quiz 8:

If $A, B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?
1. $A + B$ is invertible and $(A + B)^{-1} = A^{-1} + B^{-1}$.
2. $(A + B)^2 = A^2 + 2AB + B^2$.
3. $(ABA^{-1})^3 = AB^3A^{-1}$.
4. $A^{-1}B$ is invertible and $(A^{-1}B)^{-1} = B^{-1}A$.

Solutions (Q8):
If $A, B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

$A + B$ is invertible and $(A + B)^{-1} = A^{-1} + B^{-1}$.
FALSE: Let
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad A^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, \quad B^{-1} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \quad A + B = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$
Then
$$(A + B)^{-1} = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} \ \ne \ A^{-1} + B^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}.$$

$(A + B)^2 = A^2 + 2AB + B^2$.
FALSE: $(A + B)^2 = A^2 + AB + BA + B^2$, and in general $AB \ne BA$.

Solutions (Q8):
If $A, B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

$(ABA^{-1})^3 = AB^3A^{-1}$. TRUE:
$$(ABA^{-1})^3 = (ABA^{-1})(ABA^{-1})(ABA^{-1}) = AB(A^{-1}A)B(A^{-1}A)BA^{-1} = AB^3A^{-1}$$

$A^{-1}B$ is invertible and $(A^{-1}B)^{-1} = B^{-1}A$. TRUE:
$$(A^{-1}B)(B^{-1}A) = A^{-1}(BB^{-1})A = A^{-1}A = I_n$$

Systems of Linear Equations


Example: Find values $x_1, x_2 \in \mathbb{R}$ that satisfy
$$2x_1 + x_2 = 7 \qquad \text{and} \qquad x_1 - 3x_2 = -7$$
(for example, if we want to find the intersection point of two lines).

This can be written as a matrix equation:
$$\begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ -7 \end{bmatrix} \qquad \text{or} \qquad Ax = b,$$
where $A = \begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix}$, $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, and $b = \begin{bmatrix} 7 \\ -7 \end{bmatrix}$.

Linear Systems
One of the fundamental problems in linear algebra: given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, find $x \in \mathbb{R}^n$ such that
$$Ax = b.$$
The set of solutions of a linear system can behave in three ways:
1. The system has one unique solution.
2. The system has infinitely many solutions.
3. The system has no solution.

For a solution $x$ to exist, we must have $b \in R(A)$.
- If $b \in R(A)$ and $\mathrm{rank}(A) = n$, then there is a unique solution (usually square systems: $m = n$).
- If $b \in R(A)$ and $\mathrm{rank}(A) < n$ (i.e. $N(A) \ne \{0\}$), then there are infinitely many solutions (usually underdetermined systems: $m < n$).
- If $b \notin R(A)$, then there is no solution (usually overdetermined systems: $m > n$).

Solution Set of Linear Systems

diagram from http://oak.ucc.nau.edu/jws8/3equations3unknowns.html



Gaussian Elimination (row reduction)


How do we solve $Ax = b$, $A \in \mathbb{R}^{n \times n}$?

For some types of matrices, $Ax = b$ is easy to solve:
- Diagonal matrices: the equations are independent, so $x_i = b_i / a_{ii}$ for all $i$.
- Upper/lower triangular matrices: back/forward substitution.

Gaussian elimination: transform the system $Ax = b$ into an upper triangular system $Ux = y$ using elementary row operations.

The following are elementary row operations that can be applied to the augmented system $[\, A \mid b \,]$ to introduce zeros below the diagonal without changing the solution set $x$:
1. Multiply a row by a scalar.
2. Add scalar multiples of one row to another.
3. Permute rows.

Step 1: For $A \in \mathbb{R}^{3 \times 3}$, $b \in \mathbb{R}^3$, form the augmented system $[\, A \mid b \,]$.

Step 2: Compute the row echelon form using row operations $L_1, L_2$, introducing zeros below the diagonal one column at a time:
$$[\, A \mid b \,] \ \xrightarrow{\ L_1\ } \ \begin{bmatrix} \times & \times & \times & \mid & \times \\ 0 & + & + & \mid & + \\ 0 & + & + & \mid & + \end{bmatrix} \ \xrightarrow{\ L_2\ } \ \begin{bmatrix} \times & \times & \times & \mid & \times \\ 0 & + & + & \mid & + \\ 0 & 0 & + & \mid & + \end{bmatrix} = [\, U \mid y \,]$$

Step 3: Solve the triangular system $Ux = y$ by back substitution: $x = U^{-1} y$.
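A minimal NumPy sketch of the procedure above, without pivoting and for illustration only (practical codes add pivoting, discussed below):

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve Ax = b by row reduction to Ux = y, then back substitution.
    Assumes no zero pivots are encountered (no pivoting, for clarity)."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):                 # eliminate below pivot k
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]          # multiplier
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):         # back substitution
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[2.0, 1.0], [1.0, -3.0]])
b = np.array([7.0, -7.0])
print(gaussian_elimination(A, b))  # [2. 3.], matches np.linalg.solve(A, b)
```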

Questions?


Mini-Quiz 9:
Determine the lower triangular matrices $L_i$ that correspond to the following elementary row operations:

$L_1$ such that $L_1 A = A_1$:
$$L_1 \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix}$$
$L_2$ such that $L_2 A_1 = A_2$:
$$L_2 \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix}$$
$L_3$ such that $L_3 A_2 = A_3$:
$$L_3 \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$

Solution (Q9):
1. $L_1 A = A_1$:
$$\begin{bmatrix} \tfrac12 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix}$$
2. $L_2 A_1 = A_2$:
$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix}$$
3. $L_3 A_2 = A_3$:
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$
Combined, $L_3 L_2 L_1$ such that $L_3 L_2 L_1 A = A_3$:
$$\begin{bmatrix} \tfrac12 & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$

Gaussian Elimination and LU Decomposition


We can think of the elementary row operations applied to $A$ as linear transformations, which can be represented as a series of lower triangular matrices $L_1, \dots, L_{n-1}$:
$$L_{n-1} \cdots L_1 A = U \qquad \text{or} \qquad L^{-1} A = U, \quad L = L_1^{-1} \cdots L_{n-1}^{-1},$$
and rearranging yields $A = LU$.

If we can factor $A$ as $A = LU$, then we can solve the system $Ax = b$ by solving two triangular systems:
- Solve $Ly = b$ using forward substitution.
- Solve $Ux = y$ using backward substitution.

Gaussian elimination implicitly computes the LU factorization.
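A minimal SciPy sketch of this two-triangular-solve strategy (the matrix and right-hand side are illustrative; `scipy.linalg.lu_factor` returns a pivoted LU factorization, consistent with the partial pivoting discussed next):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0,  8.0,  4.0],
              [2.0,  5.0,  1.0],
              [4.0, 10.0, -1.0]])
b = np.array([1.0, 2.0, 3.0])

lu, piv = lu_factor(A)      # PA = LU, stored compactly
x = lu_solve((lu, piv), b)  # forward + backward substitution
assert np.allclose(A @ x, b)
print(x)
```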

Partial Pivoting

Pure Gaussian elimination cannot be used to solve general linear systems because we may encounter a zero on the diagonal during the process, e.g.
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}.$$
In practical implementations of the algorithm, the rows of the matrix must be permuted to avoid division by zero.

Partial pivoting swaps rows if an entry below the diagonal of the current column is larger in absolute value than the current diagonal entry (pivot element).

Permutations
The reordering, or permutation, of the rows in partial pivoting can be represented as a linear transformation.

Permutation matrix: a square, binary matrix $P \in \mathbb{R}^{n \times n}$ that has exactly one entry 1 in each row and each column.
- Left-multiplication of a matrix $A$ by a permutation matrix reorders the rows of $A$, while right-multiplication reorders the columns of $A$.

For example, let
$$P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \qquad A = \begin{bmatrix} 1 & 2 & 3 \\ 10 & 20 & 30 \\ 100 & 200 & 300 \end{bmatrix};$$
then the row and column permutations are, respectively:
$$PA = \begin{bmatrix} 10 & 20 & 30 \\ 100 & 200 & 300 \\ 1 & 2 & 3 \end{bmatrix}, \qquad AP = \begin{bmatrix} 3 & 1 & 2 \\ 30 & 10 & 20 \\ 300 & 100 & 200 \end{bmatrix}.$$

LU Decomposition
Let $A \in \mathbb{R}^{n \times n}$ be a square matrix; then the LU decomposition factors $A$ into the product of a lower triangular matrix $L \in \mathbb{R}^{n \times n}$ and an upper triangular matrix $U \in \mathbb{R}^{n \times n}$:
$$A = LU.$$
When partial pivoting is used to permute the rows, we obtain a decomposition of the form $L_{n-1} P_{n-1} \cdots L_1 P_1 A = U$, which can be written in the form $L^{-1} P A = U$ (see Trefethen and Bau, Chapter 21, for details).

In general, any square matrix $A \in \mathbb{R}^{n \times n}$ (singular or nonsingular) has a factorization
$$PA = LU,$$
where $P$ is a permutation matrix.

Cholesky and Positive Definiteness


Positive definite matrix: a matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if
$$z^T A z > 0 \quad \text{for every } z \ne 0, \qquad \text{often denoted by } A \succ 0.$$
Example:
$$A = \begin{bmatrix} 4 & -2 \\ -2 & 10 \end{bmatrix}, \qquad x^T A x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & -2 \\ -2 & 10 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 - 4 x_1 x_2 + 10 x_2^2 = (2x_1 - x_2)^2 + 9 x_2^2 > 0 \quad \forall x \ne 0$$

- Positive semi-definite matrix: $z^T A z \ge 0$ for every $z \in \mathbb{R}^n$.
- Negative definite matrix: $z^T A z < 0$ for every $z \ne 0$.

... Cholesky and Positive Definiteness


A symmetric matrix $A$ is positive definite ($A \succ 0$) if and only if there exists a unique lower triangular matrix $L$ such that
$$A = L L^T.$$
This factorization is called the Cholesky decomposition of $A$. The Cholesky algorithm, a modified version of Gaussian elimination, is used to compute the factor $L$.
- Basic idea: in Gaussian elimination we use row operations to introduce zeros below the diagonal in each column. To maintain symmetry, we can apply the same operations to the columns of the matrix to introduce zeros in the first row:
$$A \ \xrightarrow{\ L_1 A\ } \ \begin{bmatrix} \times & \times & \times \\ 0 & + & + \\ 0 & + & + \end{bmatrix} \ \xrightarrow{\ (L_1 A) L_1^T\ } \ \begin{bmatrix} \times & 0 & 0 \\ 0 & + & + \\ 0 & + & + \end{bmatrix} = A_2 = L_1 A L_1^T$$
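A minimal NumPy sketch using the example matrix above; `numpy.linalg.cholesky` returns the lower triangular factor:

```python
import numpy as np

A = np.array([[ 4.0, -2.0],
              [-2.0, 10.0]])   # symmetric positive definite

L = np.linalg.cholesky(A)      # lower triangular, A = L L^T
assert np.allclose(L @ L.T, A)
print(L)
# [[ 2.  0.]
#  [-1.  3.]]
```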

Questions?


Mini-Quiz 10:
Quadratic forms: a function $q(x) : \mathbb{R}^n \to \mathbb{R}$ is a quadratic form if it is a linear combination of functions of the form $x_i x_j$. A quadratic form can be written as
$$q(x) = x^T A x,$$
where $A \in \mathbb{R}^{n \times n}$ is symmetric.

1. Let $B \in \mathbb{R}^{m \times n}$. Show that $q(x) = \|Bx\|^2$ is a quadratic form, find $A$ such that $q(x) = x^T A x$, and determine the definiteness of $A$.
2. Given $A, B$, find the permutation matrix $P$ such that $PBP = A$:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 3 & 2 \\ 7 & 9 & 8 \\ 4 & 6 & 5 \end{bmatrix}$$

Solution (Q10):
Let $B \in \mathbb{R}^{m \times n}$. Show that $q(x) = \|Bx\|^2$ is a quadratic form, find $A$ such that $q(x) = x^T A x$, and determine the definiteness of $A$.
Solution:
$$q(x) = \|Bx\|^2 = (Bx)^T (Bx) = x^T B^T B x = x^T A x, \qquad \text{where } A = B^T B.$$
$A$ is positive semi-definite, since $q(x) = \|Bx\|^2 \ge 0$ for all $x \in \mathbb{R}^n$.

Given $A, B$, find the permutation matrix $P$ such that $PBP = A$:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 3 & 2 \\ 7 & 9 & 8 \\ 4 & 6 & 5 \end{bmatrix}, \quad \text{and} \quad P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
($PB$ swaps the last two rows of $B$, and $(PB)P$ swaps the last two columns, which recovers $A$.)

Orthogonalization


Orthogonal (Unitary) Matrices


Recall that two vectors $x, y \in \mathbb{R}^n$ are orthogonal if $x^T y = 0$, and a set of vectors $\{u_1, \dots, u_n\}$ is orthonormal if
$$u_i^T u_j = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \ne j \end{cases}, \qquad \|u_i\| = 1 \ \ \forall i.$$

Orthogonal (unitary) matrix: a matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal if its columns $q_1, \dots, q_n \in \mathbb{R}^n$ form an orthonormal set:
$$Q^T Q = \begin{bmatrix} q_1^T \\ \vdots \\ q_n^T \end{bmatrix} \begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix} = I_n$$
- If $Q$ is orthogonal, then $Q^{-1} = Q^T$ (the inverse is easy to compute!).
- Orthogonal transformations preserve lengths and angles, and represent rotations or reflections:
$$\|Qx\|_2 = \sqrt{x^T Q^T Q x} = \|x\|_2$$

Projectors

Recall our notation for the projection of vector $v$ onto $u$:
$$\mathrm{proj}_u v = \left( \frac{u^T v}{\|u\|} \right) \frac{u}{\|u\|}$$
We can generalize the idea to projections onto subspaces of $\mathbb{R}^n$.

Projector: a square matrix $P \in \mathbb{R}^{n \times n}$ is a projector if $P = P^2$.
- For $v \in \mathbb{R}^n$, $Pv$ represents the projection of $v$ into the range of $P$.
- If $v \in R(P)$ (i.e. $\exists x : v = Px$), then $Pv = P^2 x = Px = v$.

Orthogonal Projectors

If $x \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$ is a matrix whose columns span a subspace of $\mathbb{R}^m$, then $x$ can be decomposed as
$$x = x_\parallel + x_\perp, \qquad \text{where } x_\parallel \in R(A) \text{ and } x_\perp \in R(A)^\perp.$$
$x_\parallel$ is the orthogonal projection of $x$ onto $R(A)$.
- $x_\parallel$ is the vector in $R(A)$ closest to $x$, in the sense that
$$\|x - x_\parallel\| < \|x - v\| \qquad \forall v \in R(A), \ v \ne x_\parallel.$$

... Orthogonal Projectors


An orthogonal projector is a projector for which the range and nullspace are orthogonal complements.
- An orthogonal projector satisfies $P^T = P$.
- If $u \in \mathbb{R}^n$ is a unit vector, then $P_u = uu^T$ is the orthogonal projection onto the line containing $u$.

If $u_1, \dots, u_k$ are an orthonormal basis for subspace $V$ and $A = [\, u_1 \ \cdots \ u_k \,]$, then the projection onto $V$ is
$$P_A = A A^T,$$
or equivalently:
$$\mathrm{proj}_A x = P_A x = (u_1^T x) u_1 + \cdots + (u_k^T x) u_k = \mathrm{proj}_{u_1} x + \cdots + \mathrm{proj}_{u_k} x$$

Questions?


Mini-Quiz 11:
1. Show that if $P$ is an orthogonal projector, then $I - 2P$ is orthogonal.
Hint: $P^T = P$ for orthogonal projectors and $P^2 = P$ for all projectors. $Q$ is orthogonal if $Q^T Q = I$.

2. Find the matrix for the orthogonal projector $P_A$ onto $R(A)$:
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
Hint: The columns of $A$ are orthogonal, but not orthonormal, because the first column does not have length 1. The projection onto $R(Q)$ is $P_Q = QQ^T$ if the columns of $Q$ are orthonormal.

Solution (Q11):
1. Show that if $P$ is an orthogonal projector, then $I - 2P$ is orthogonal. Solution:
$$(I - 2P)^T (I - 2P) = I - 2P^T - 2P + 4P^T P = I - 4P + 4P^2 = I - 4P + 4P = I$$
using $P^T = P$ and then $P^2 = P$.

2. Find the orthogonal projector $P_A$ onto $R(A)$ for $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$. Solution: normalizing the first column gives
$$R(A) = R(Q), \quad Q = \begin{bmatrix} \tfrac{1}{\sqrt 2} & 0 \\ 0 & 1 \\ \tfrac{1}{\sqrt 2} & 0 \end{bmatrix}, \qquad P_A = P_Q = QQ^T = \begin{bmatrix} \tfrac12 & 0 & \tfrac12 \\ 0 & 1 & 0 \\ \tfrac12 & 0 & \tfrac12 \end{bmatrix}$$

Gram-Schmidt Orthogonalization
Gram-Schmidt algorithm: a method for computing an orthonormal basis $\{q_1, \dots, q_n\}$ for the columns of $A = [\, a_1, \dots, a_n \,] \in \mathbb{R}^{m \times n}$. Each vector is orthogonalized with respect to the previous vectors and then normalized:
$$v_1 = a_1, \qquad q_1 = \frac{v_1}{\|v_1\|}$$
$$v_2 = a_2 - \mathrm{proj}_{q_1} a_2, \qquad q_2 = \frac{v_2}{\|v_2\|}$$
$$\vdots$$
$$v_k = a_k - \sum_{i=1}^{k-1} \mathrm{proj}_{q_i} a_k, \qquad q_k = \frac{v_k}{\|v_k\|}$$

In matrix form: $A = QR$, where
$$Q = [\, q_1, \dots, q_n \,] \ \text{has orthonormal columns, and} \quad R_{ij} = \begin{cases} q_i^T a_j & \text{if } i \le j \\ 0 & \text{if } i > j \end{cases}$$
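A minimal NumPy sketch of classical Gram-Schmidt as written above (for illustration; in floating point, modified Gram-Schmidt or Householder QR, as used by `numpy.linalg.qr`, is preferred for stability):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt: returns Q with orthonormal columns and
    upper triangular R such that A = QR (A assumed full column rank)."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):                  # subtract projections onto q_1..q_{k-1}
            R[i, k] = Q[:, i] @ A[:, k]
            v -= R[i, k] * Q[:, i]
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]
    return Q, R

A = np.random.rand(5, 3)
Q, R = gram_schmidt(A)
assert np.allclose(Q.T @ Q, np.eye(3))      # orthonormal columns
assert np.allclose(Q @ R, A)                # A = QR
```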

QR Decomposition
In the process of orthogonalizing the columns of matrix $A$, the Gram-Schmidt algorithm computes the QR decomposition of $A$.

QR decomposition: any matrix $A \in \mathbb{R}^{m \times n}$ can be decomposed as
$$A = QR,$$
where $Q \in \mathbb{R}^{m \times m}$ is an orthogonal matrix, and $R \in \mathbb{R}^{m \times n}$ is an upper triangular matrix.

For rectangular $A \in \mathbb{R}^{m \times n}$, $m \ge n$:
$$A = QR = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1,$$
where $R_1 \in \mathbb{R}^{n \times n}$ is upper triangular, and $Q_1 \in \mathbb{R}^{m \times n}$ and $Q_2 \in \mathbb{R}^{m \times (m-n)}$ both have orthonormal columns.

QR Decomposition for Solving Linear Systems

Consider the linear system $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ is a square, nonsingular matrix and $b \in \mathbb{R}^n$.
- The QR decomposition expresses the matrix $A = QR$ as the product of an orthogonal matrix $Q$ ($QQ^T = I_n$) and upper triangular $R$:
$$Ax = b \implies QRx = b \implies Rx = Q^T b$$
- $y = Q^T b$ can be easily computed, and $Rx = y$ is a triangular system that can be solved by back substitution.
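A minimal NumPy/SciPy sketch of this solve strategy on an illustrative system:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[2.0, 1.0], [1.0, -3.0]])
b = np.array([7.0, -7.0])

Q, R = np.linalg.qr(A)            # A = QR
y = Q.T @ b                       # apply Q^T
x = solve_triangular(R, y)        # back substitution on Rx = y
assert np.allclose(A @ x, b)
print(x)  # [2. 3.]
```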

Questions?


Mini-Quiz 12:

Use Gram-Schmidt to compute an orthonormal basis for the columns of $A = [\, a_1 \ a_2 \ a_3 \,] \in \mathbb{R}^{3 \times 3}$.

Hint: All three columns are normalized to length 1. The first and second columns are orthogonal, and the second and third columns are orthogonal. The projection operator for normalized vectors is $\mathrm{proj}_u v = (u^T v)\, u$.

Solution (Q12):
Use Gram-Schmidt to compute an orthonormal basis for the columns of $A = [\, a_1 \ a_2 \ a_3 \,]$.

Since $a_1$ and $a_2$ are already orthonormal, Gram-Schmidt leaves them unchanged:
$$q_1 = a_1, \qquad q_2 = a_2.$$
Since $a_2^T a_3 = 0$, only the component of $a_3$ along $q_1$ must be removed:
$$v_3 = a_3 - \mathrm{proj}_{q_1} a_3 - \mathrm{proj}_{q_2} a_3 = a_3 - (a_1^T a_3)\, a_1, \qquad q_3 = \frac{v_3}{\|v_3\|}.$$

Least Squares Problems


Least Squares

We want to solve $Ax = b$, but what do we do if $b \notin R(A)$ (i.e. there does not exist a vector $x$ such that $Ax = b$)?
- For example, let $A \in \mathbb{R}^{m \times n}$, $m > n$, be a tall-and-skinny (overdetermined) matrix. Then for most $b \in \mathbb{R}^m$, there is no solution $x \in \mathbb{R}^n$ such that $Ax = b$.

Least squares problem: define the residual $r = b - Ax$ and find the vector $x$ that minimizes
$$\|r\|_2^2 = \|b - Ax\|_2^2.$$

... Least Squares

We can decompose any vector $b \in \mathbb{R}^m$ into components $b = b_1 + b_2$, with $b_1 \in R(A)$ and $b_2 \in N(A^T)$.

Since $b_2$ is in the orthogonal complement of $R(A)$, we obtain the following expression for the residual norm:
$$\|r\|_2^2 = \|b_1 - Ax + b_2\|_2^2 = \|b_1 - Ax\|_2^2 + \|b_2\|_2^2,$$
which is minimized when $Ax = b_1$ and $r = b_2 \in N(A^T)$.

Normal Equations

The least squares solution $x$ occurs when $r = b_2 \in N(A^T)$, or equivalently $A^T r = A^T (b - Ax) = 0$.

Rearranging this expression gives the normal equations:
$$A^T A x = A^T b$$
- If $A$ has full column rank, then $\tilde A = A^T A$ is invertible. Thus $\tilde A x = \tilde b$ (where $\tilde b = A^T b$) has a unique solution $x$.

Least Squares and Orthogonal Projections


If $x^\star \in \mathbb{R}^n$ is the least squares solution to $Ax = b$, then $Ax^\star$ is the orthogonal projection of $b$ onto $R(A)$:
$$x^\star = \arg\min_x \|b - Ax\| \quad \iff \quad \|b - Ax^\star\| \le \|b - Ax\| \ \ \forall x \in \mathbb{R}^n \quad \iff \quad Ax^\star = P_A b$$
($P_A$ is the projection onto $R(A)$.)
$$b - Ax^\star \in R(A)^\perp = N(A^T) \implies A^T (b - Ax^\star) = 0 \implies A^T A x^\star = A^T b$$
Combining the solution $x^\star = (A^T A)^{-1} A^T b$ and $Ax^\star = P_A b$ gives the matrix for the orthogonal projection onto $R(A)$:
$$P_A = A (A^T A)^{-1} A^T$$

Questions?


Mini-Quiz 13
Find the least squares solution of
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$
by solving $Ax = b_1$ or $A^T A x = A^T b$.

Hints:
$$R(A) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\}, \qquad N(A^T) = \left\{ \gamma \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \,\middle|\, \gamma \in \mathbb{R} \right\},$$
$$b = 1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 1.5 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + 0.5 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$$

Solution (Q13):
Find the least squares solution of
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}.$$
Solution:
$$b = b_1 + b_2, \quad b_1 = \begin{bmatrix} 1 \\ 1.5 \\ 1.5 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 0 \\ 0.5 \\ -0.5 \end{bmatrix}, \qquad A^T A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad A^T b = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$$
Either route gives the same answer:
$$Ax = b_1: \ \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \\ 1.5 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \end{bmatrix}$$
$$A^T A x = A^T b: \ \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \end{bmatrix}$$

Least Squares via QR


The normal equations $A^T A x = A^T b$ can be rewritten using the QR factorization $A = QR$:
$$A^T A x = A^T b \implies R^T Q^T Q R x = R^T Q^T b \implies R^T R x = R^T Q^T b \implies R x = Q^T b$$
Solution using the QR decomposition:
- compute the QR factorization of $A$: $A = QR$
- form $y = Q^T b$
- solve $Rx = y$ by back substitution

Least Squares via Cholesky

$M = A^T A$ is symmetric and positive definite, and can be rewritten in terms of the Cholesky factorization $M = LL^T$.
Solution using the Cholesky factorization:
- form $M = A^T A$
- compute the Cholesky factorization of $M$: $M = LL^T$
- form $y = A^T b$
- solve $Lz = y$ by forward substitution
- solve $L^T x = z$ by back substitution
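A minimal NumPy/SciPy sketch comparing the two routes on the Mini-Quiz 13 system (both should agree with `numpy.linalg.lstsq`):

```python
import numpy as np
from scipy.linalg import solve_triangular, cho_factor, cho_solve

A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
b = np.array([1.0, 2.0, 1.0])

# via QR: Rx = Q^T b
Q, R = np.linalg.qr(A)                    # thin QR, A = Q R
x_qr = solve_triangular(R, Q.T @ b)

# via Cholesky on the normal equations: (L L^T) x = A^T b
c, low = cho_factor(A.T @ A)
x_chol = cho_solve((c, low), A.T @ b)

x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(x_qr, x_ref) and np.allclose(x_chol, x_ref)
print(x_qr)  # [1.  1.5]
```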

Matrix Calculus
Vector-by-scalar and scalar-by-vector derivatives:
$$\alpha \in \mathbb{R}, \ x \in \mathbb{R}^n: \qquad \frac{\partial x}{\partial \alpha} = \begin{bmatrix} \frac{\partial x_1}{\partial \alpha} \\ \frac{\partial x_2}{\partial \alpha} \\ \vdots \\ \frac{\partial x_n}{\partial \alpha} \end{bmatrix}, \qquad \frac{\partial \alpha}{\partial x} = \begin{bmatrix} \frac{\partial \alpha}{\partial x_1} \\ \frac{\partial \alpha}{\partial x_2} \\ \vdots \\ \frac{\partial \alpha}{\partial x_n} \end{bmatrix}$$

Vector-by-vector derivatives:
$$y \in \mathbb{R}^m, \ x \in \mathbb{R}^n: \qquad \frac{\partial y}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}$$

... Matrix Calculus

Vector-by-vector identities (see http://en.wikipedia.org/wiki/Matrix_calculus for more properties):
$$\frac{\partial (x^T x)}{\partial x} = 2x, \qquad \frac{\partial (b^T x)}{\partial x} = b, \qquad \frac{\partial (Ax)}{\partial x} = A^T$$
$$\frac{\partial (x^T A x)}{\partial x} = \left( A + A^T \right) x = 2Ax \ \text{ if } A = A^T$$

Derivative of a quadratic:
$$\frac{\partial}{\partial x} \|b - Ax\|_2^2 = \frac{\partial}{\partial x} \left( x^T A^T A x - 2 b^T A x + b^T b \right) = 2 \left( A^T A x - A^T b \right)$$
- Setting this expression equal to zero gives the normal equations:
$$\frac{\partial}{\partial x} \|b - Ax\|_2^2 = 0 \implies A^T A x = A^T b$$

Eigenvalues and Eigenvectors


Eigenvalues and Eigenvectors

For any square matrix $A \in \mathbb{R}^{n \times n}$, there is at least one scalar $\lambda$ and a corresponding vector $v \ne 0$ such that
$$Av = \lambda v \qquad \text{or equivalently} \qquad (A - \lambda I) v = 0.$$
The scalar $\lambda$ is an eigenvalue of $A$ and $v$ is an eigenvector corresponding to the eigenvalue $\lambda$.

Geometric interpretation of Eigenvectors

Figure: Under the transformation matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, the directions of vectors parallel to $v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ (blue) and $v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ (purple) are preserved.

diagram from wikipedia.org/wiki/Eigenvalues_and_eigenvectors

... Eigenvalues and Eigenvectors


The set of distinct eigenvalues of $A$ is called the spectrum of $A$ and is denoted $\lambda(A)$:
$$\lambda(A) := \{ \lambda \in \mathbb{R} \mid A - \lambda I \text{ is singular} \}$$
The magnitude of the largest eigenvalue in absolute value is called the spectral radius of $A$:
$$\rho(A) = \max_{\lambda_i \in \lambda(A)} |\lambda_i|.$$
The set of all eigenvectors associated with an eigenvalue $\lambda$ forms the eigenspace, $E_\lambda = N(A - \lambda I)$, associated with $\lambda$.

Similar Matrices
Two matrices $A$ and $B$ are similar if they share the same eigenvalues.
- Similar matrices represent the same linear transformation under two different bases.
- If $P$ is a nonsingular matrix, then $B = P^{-1} A P$ is similar to $A$. $P$ is called a similarity transformation or change of basis matrix.
- $A$ and $B$ do not in general have the same eigenvectors: if $v$ is an eigenvector of $A$, then $P^{-1} v$ is an eigenvector of $B$.

Given a square matrix $A \in \mathbb{R}^{n \times n}$, we wish to reduce it to its simplest form by means of a similarity transformation.

Diagonalizable Matrices
Diagonalizable matrix: a matrix $A \in \mathbb{R}^{n \times n}$ that is similar to a diagonal matrix, i.e. there exists an invertible matrix $P$ such that
$$P^{-1} A P = D, \qquad \text{where } D \text{ is diagonal.}$$
- A matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors.
- Matrices with $n$ distinct eigenvalues are diagonalizable.

Real symmetric matrices are diagonalizable by unitary matrices.
- Matrices are diagonalizable by unitary matrices if and only if they are normal (a matrix is normal if it satisfies $A^T A = A A^T$).

Eigendecomposition

Let $A \in \mathbb{R}^{n \times n}$ be a square, diagonalizable matrix; then $A$ can be factorized as
$$A = V \Lambda V^{-1},$$
where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is a diagonal matrix whose elements are the eigenvalues of $A$, and $V$ is a square matrix whose $i$th column $v_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i = \Lambda_{ii}$.
- Note: this decomposition does not exist for every square matrix $A$; e.g. $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ cannot be diagonalized.
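A minimal NumPy sketch verifying $A = V \Lambda V^{-1}$ on the symmetric example from the figure above:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eig(A)                  # eigenvalues and eigenvectors
Lam = np.diag(lam)
assert np.allclose(V @ Lam @ np.linalg.inv(V), A)   # A = V Λ V^{-1}
print(lam)  # [3. 1.]
```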

Spectral Theorem

Any symmetric matrix $A = A^T \in \mathbb{R}^{n \times n}$ has $n$ (not necessarily distinct) eigenvalues, and $A$ can be decomposed with the symmetric eigenvalue decomposition:
$$A = \sum_{i=1}^n \lambda_i u_i u_i^T = U \Lambda U^T,$$
where $U$ is orthogonal and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is diagonal.

Properties of Eigenvalues
The trace of $A \in \mathbb{R}^{n \times n}$ is equal to the sum of its eigenvalues:
$$\mathrm{tr}(A) = \sum_{i=1}^n \lambda_i$$
The rank of a diagonalizable $A \in \mathbb{R}^{n \times n}$ is equal to the number of non-zero eigenvalues of $A$.

If $A \in \mathbb{R}^{n \times n}$ is non-singular with eigenvalue $\lambda_i \ne 0$, then $\tfrac{1}{\lambda_i}$ is an eigenvalue of $A^{-1}$.

If $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite ($A = A^T$ and $z^T A z > 0$ for all $z \ne 0$), then all of its eigenvalues are positive.

Questions?


Mini-Quiz 14

Show that if $\lambda$ is an eigenvalue of a nonsingular matrix $A$ with corresponding eigenvector $v$, then $v$ is also an eigenvector of $A^{-1}$ with eigenvalue $\lambda^{-1}$.
Hint: we know that $Av = \lambda v$ and we want to show that $A^{-1} v = \lambda^{-1} v$.

Solution (Q14)
Show that if $\lambda$ is an eigenvalue of a nonsingular matrix $A$ with corresponding eigenvector $v$, then $v$ is also an eigenvector of $A^{-1}$ with eigenvalue $\lambda^{-1}$.
$$Av = \lambda v \implies A^{-1}(Av) = A^{-1}(\lambda v) \implies v = \lambda \left( A^{-1} v \right) \implies \lambda^{-1} v = A^{-1} v$$
(Since $A$ is nonsingular, $\lambda \ne 0$, so we may divide by $\lambda$.)

Solving Eigenvalue Problems

Solving the eigenvalue problem $Av = \lambda v$ is a fundamentally more difficult problem than solving a linear system $Ax = b$.

Solving the eigenvalue problem for an $n \times n$ matrix can be reduced to solving for the roots of an $n$th degree polynomial. For $n \ge 5$, no closed-form expression exists for the roots of an arbitrary $n$th degree polynomial given its coefficients.

All algorithms that solve eigenvalue problems for matrices of arbitrary size are iterative. This contrasts with linear systems, which have algorithms that are guaranteed to produce the solution in a finite number of steps.

Solving Small Eigenvalue Problems


Recall that if $Av = \lambda v$, then $(A - \lambda I)v = 0$.
- Since $(A - \lambda I)$ has a nonzero vector, $v$, in its nullspace, it is singular (non-invertible).
- Therefore finding the eigenvalues of $A$ is equivalent to finding the values $\lambda$ for which $(A - \lambda I)$ is singular.

For a $2 \times 2$ matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad \text{the matrix inverse is} \qquad A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$
The quantity $ad - bc$ is called the determinant of $A$, $\det(A)$, and the inverse exists if and only if $\det(A) \ne 0$.

Therefore, for our $2 \times 2$ eigenvalue problem, we want to find the eigenvalues $\lambda$ such that $\det(A - \lambda I) = 0$:
$$(A - \lambda I) = \begin{bmatrix} (a - \lambda) & b \\ c & (d - \lambda) \end{bmatrix}, \qquad \det(A - \lambda I) = (a - \lambda)(d - \lambda) - bc = 0$$

... Solving Small Eigenvalue Problems


Example:
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad (A - \lambda I) = \begin{bmatrix} (2 - \lambda) & 1 \\ 1 & (2 - \lambda) \end{bmatrix}$$
$$\det(A - \lambda I) = (2 - \lambda)(2 - \lambda) - 1 \cdot 1 = \lambda^2 - 4\lambda + 3$$
This is a 2nd degree polynomial (i.e. a quadratic). We want to find the roots of this polynomial, $\lambda_1$ and $\lambda_2$, which will satisfy $\det(A - \lambda_i I) = 0$:
$$\lambda^2 - 4\lambda + 3 = 0 \implies (\lambda - 3)(\lambda - 1) = 0 \implies \lambda_1 = 3, \ \lambda_2 = 1$$

Power Method for Finding Eigenvectors


One method for finding the eigenvector, $v_1$, corresponding to the largest eigenvalue, $\lambda_1$, of $A \in \mathbb{R}^{n \times n}$ is called the power method: pick any vector $z \in \mathbb{R}^n$ and compute the sequence
$$\left\{ Az, A^2 z, A^3 z, A^4 z, \dots, A^k z \right\};$$
then $A^k z$ converges toward the direction of $v_1$ as $k \to \infty$.

$z$ can be written as a linear combination of the (linearly independent) eigenvectors of $A$:
$$z = \alpha_1 v_1 + \cdots + \alpha_n v_n.$$
Therefore
$$Az = A(\alpha_1 v_1 + \cdots + \alpha_n v_n) = \alpha_1 A v_1 + \cdots + \alpha_n A v_n = \alpha_1 \lambda_1 v_1 + \cdots + \alpha_n \lambda_n v_n,$$
and iterating gives
$$A^k z = \alpha_1 \lambda_1^k v_1 + \cdots + \alpha_n \lambda_n^k v_n.$$
Since $\lambda_1$ is the largest eigenvalue, $\lambda_1^k$ will dominate the sum, so $A^k z$ will converge toward the direction of $v_1$ (a sketch follows below).
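A minimal NumPy sketch of power iteration with normalization at each step (the iteration count and random starting vector are illustrative; convergence assumes $|\lambda_1| > |\lambda_2|$ and that $z$ has a component along $v_1$):

```python
import numpy as np

def power_method(A, num_iters=100):
    """Estimate the dominant eigenpair of A by power iteration."""
    z = np.random.rand(A.shape[0])
    for _ in range(num_iters):
        z = A @ z
        z /= np.linalg.norm(z)        # normalize to avoid overflow
    lam = z @ A @ z                   # Rayleigh quotient estimate of lambda_1
    return lam, z

A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, v = power_method(A)
print(lam, v)  # approx 3.0 and v parallel to [1, 1]/sqrt(2)
```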

The Singular Value Decomposition

Singular Value Decomposition


Theorem (Singular Value Decomposition): every matrix $A \in \mathbb{R}^{m \times n}$ has a decomposition, called the singular value decomposition (SVD), of the form
$$A = U \Sigma V^T = \sum_{i=1}^r \sigma_i u_i v_i^T,$$
where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are unitary matrices and $\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{m \times n}$ is a diagonal matrix with $\Sigma_r = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$ and $r \le \min(m, n)$, $r = \mathrm{rank}(A)$.

The positive numbers $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are called the singular values of $A$ and are uniquely determined.

Geometric Interpretation of the SVD

The image of the unit sphere under a matrix is a hyper-ellipse.

diagram: http://people.sc.fsu.edu/~jburkardt/latex/fsu_2006/svd.png

Singular Values
The number of non-zero singular values $r$ is the rank of the matrix.

The singular values are also related to a class of matrix norms.
Spectral norm (induced by the Euclidean vector norm):
$$\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} = \sigma_{\max}(A).$$
Frobenius norm:
$$\|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 } = \sqrt{ \mathrm{trace}(A^T A) } = \sqrt{ \sum_{i=1}^r \sigma_i^2 }$$

Singular Values and Eigenvalues


The singular value decomposition of $A = U \Sigma V^T \in \mathbb{R}^{m \times n}$ is related to the eigenvalue decomposition of the symmetric matrix $A^T A \in \mathbb{R}^{n \times n}$:
$$A^T A = (V \Sigma U^T)(U \Sigma V^T) = V \Sigma^2 V^T = V \Lambda V^T,$$
and similarly $AA^T = U \Sigma^2 U^T = U \tilde\Lambda U^T$.
$$\sigma_i(A) = \sqrt{\lambda_i(A^T A)} = \sqrt{\lambda_i(A A^T)}, \qquad i = 1, \dots, r$$
- The singular values of $A$ are the square roots of the eigenvalues of both $A^T A$ and $AA^T$.
- The right singular vectors, $V$, of $A$ are the eigenvectors of $A^T A$.
- The left singular vectors, $U$, of $A$ are the eigenvectors of $AA^T$.

Singular Vectors
The columns of $U = [u_1, \dots, u_m]$ and $V = [v_1, \dots, v_n]$ are called the left and right singular vectors of $A$, respectively.

The columns of $U$ and $V$ form orthonormal sets of vectors, which can be regarded as orthonormal bases. The singular vectors provide a convenient way to represent the bases of the fundamental subspaces of the matrix $A$. If $A \in \mathbb{R}^{m \times n}$ is a matrix of rank $r$, then
$$R(A) = \mathrm{span}\{u_1, \dots, u_r\}, \qquad N(A^T) = \mathrm{span}\{u_{r+1}, \dots, u_m\},$$
$$R(A^T) = \mathrm{span}\{v_1, \dots, v_r\}, \qquad N(A) = \mathrm{span}\{v_{r+1}, \dots, v_n\}.$$

Low-rank Approximation via the SVD


From the definition of the SVD, any matrix $A \in \mathbb{R}^{m \times n}$ of rank $r$ can be written as a sum of rank-one matrices:
$$A = U \Sigma V^T = \sum_{i=1}^r \sigma_i u_i v_i^T.$$
Theorem (low-rank approximation): Let $A \in \mathbb{R}^{m \times n}$ and $0 \le k \le r$; then
$$\min_{B : \, \mathrm{rank}(B) = k} \|A - B\|_2 = \sigma_{k+1},$$
where the minimum is attained by $B^\star = A_k = \sum_{i=1}^k \sigma_i u_i v_i^T$.

Application: image compression.
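A minimal NumPy sketch of the truncated SVD $A_k$ (a grayscale image would be passed in as a 2-D array; the random matrix here is just a stand-in):

```python
import numpy as np

def low_rank_approx(A, k):
    """Best rank-k approximation A_k = sum_{i<=k} sigma_i u_i v_i^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k] @ Vt[:k, :]

A = np.random.rand(50, 40)
s = np.linalg.svd(A, compute_uv=False)
for k in (1, 5, 10):
    Ak = low_rank_approx(A, k)
    # Eckart-Young: the 2-norm error equals the (k+1)st singular value
    assert np.isclose(np.linalg.norm(A - Ak, 2), s[k])
```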

Image Compression using SVD

(a) rank-1 approximation (18,456 bytes, ≈ 0.6% of original)
(b) rank-2 approximation (36,912 bytes, ≈ 1.1% of original)
(c) rank-5 approximation (92,280 bytes, ≈ 2.8% of original)

Image Compression using SVD

(a) rank-10 approximation (184,560 bytes, ≈ 5.6% of original)
(b) rank-20 approximation (369,120 bytes, ≈ 11.1% of original)
(c) rank-40 approximation (738,240 bytes, ≈ 22.3% of original)

Image Compression using SVD

(a) original image (3,317,760 bytes)

Questions?


Mini-Quiz 15
The SVD is useful in many linear algebra proofs, since it decomposes a matrix into orthogonal and diagonal factors, which have nice properties.

1. Let $A = U \Sigma V^T \in \mathbb{R}^{m \times n}$. Show that
$$\|A\|_2 = \sigma_{\max}(A) = \max_{\|x\| = 1, \ \|y\| = 1} y^T A x.$$
Hint: Use the SVD and the unitary invariance of the Euclidean vector norm ($\|Ux\| = \|x\|$).

2. Polar decomposition: show how any square matrix $A = U \Sigma V^T \in \mathbb{R}^{n \times n}$ can be written as
$$A = QS, \qquad Q \ \text{orthogonal}, \ S \ \text{symmetric positive semi-definite}.$$
Hint: $V \Sigma V^T$ is symmetric positive semi-definite.

Solution (Q15)
Let $A = U \Sigma V^T$. Show that $\|A\|_2 = \sigma_{\max}(A) = \max_{\|x\|=1, \|y\|=1} y^T A x$.
Solution:
$$\max_{\|x\|=1, \|y\|=1} y^T A x = \max_{\|x\|=1, \|y\|=1} y^T (U \Sigma V^T) x = \max_{\|V^T x\|=1, \|U^T y\|=1} (U^T y)^T \Sigma (V^T x) = \max_{\|\tilde x\|=1, \|\tilde y\|=1} \tilde y^T \Sigma \tilde x = \max_i \Sigma_{ii} = \sigma_{\max}(A)$$
(the substitution $\tilde x = V^T x$, $\tilde y = U^T y$ is valid because $\|V^T x\| = \|x\|$ and $\|U^T y\| = \|y\|$).

Show how any square matrix $A = U \Sigma V^T$ can be written as $A = QS$, with $Q$ orthogonal and $S$ symmetric positive semi-definite.
Solution:
$$A = U \Sigma V^T = U I_n \Sigma V^T = U (V^T V) \Sigma V^T = (U V^T)(V \Sigma V^T)$$
$$A = QS, \qquad \text{where } Q = U V^T \text{ and } S = V \Sigma V^T.$$

The End!


References and Acknowledgements


1. Numerical Linear Algebra by Lloyd N. Trefethen & David Bau III.
2. Matrix Computations by Gene H. Golub & Charles F. Van Loan.
3. Linear Algebra and its Applications by Gilbert Strang.
4. Linear Algebra with Applications by Otto Bretscher.
5. www.Wikipedia.org and Mathworld.Wolfram.com

Thanks to former ICME Refresher Course instructors Yuekai Sun, Nicole Taheri, and Milinda Lakkam, whose slides served as useful references in creating this presentation.
Thanks to Carlos Sing-Long and Anil Damle for their feedback on this presentation.
