LINEAR ALGEBRA REVIEW

SCOTT ROME

CONTENTS
1. Preliminaries
1.1. The Invertible Matrix Theorem
1.2. Computing the Rank
1.3. Other Definitions and Results
2. Determinants
2.1. Elementary Row Matrices (Row Operations) and Effects on Determinants
3. Eigenvalues
3.1. Basics
3.2. Eigenvalues after Addition
3.3. Relation to the Determinant and Trace
3.4. Tips on finding eigenvectors quickly
3.5. Questions to answer
4. Characteristic and Minimal Polynomial
4.1. Cayley-Hamilton Theorem
4.2. Using the Characteristic Polynomial to find Inverses and Powers
4.3. More On The Minimal Polynomial
4.4. Finding the Characteristic Polynomial using Principal Minors
4.5. Eigenvalues depend continuously on the entries of a matrix
4.6. AB and BA have the same characteristic polynomial
4.7. Lagrange Interpolating Polynomial
5. Similarity, Diagonalization and Commuting
5.1. Facts About Similarity
5.2. Diagonalizable
5.3. Simultaneously Diagonalizable and Commuting Families
6. Unitary Matrices
6.1. Definition and Main Theorem
6.2. Determining Unitary Equivalence
7. Schur's Unitary Triangularization Theorem
8. Hermitian and Normal Matrices
8.1. Normal Matrices
8.2. Definition and Characterizations of Hermitian matrices
8.3. Extra Spectral Theorem for Hermitian Matrices
8.4. Other Facts About Hermitian Matrices
9. The Jordan Canonical Form
9.1. Finding The Jordan Form
9.2. Remarks on Finding the Number of Jordan Blocks
9.3. Regarding the Characteristic and Minimal Polynomial of a Matrix A

9.4. Finding the S matrix


9.5. Solutions to Systems of Differential Equations
9.6. Companion Matrices and Finding Roots to Polynomials
9.7. The Real Jordan Canonical Form
10. QR And Applications
10.1. Gram-Schmidt
10.2. QR Factorization
10.3. QR Algorithm for Finding Eigenvalues
11. Rayleigh-Ritz Theorem
12. Courant-Fischer Theorem
13. Norms and Such
13.1. Vector Norms
13.2. Matrix Norms
13.3. Theorems, Definitions, and Sums
13.4. Gelfand's Formula for the Spectral Radius
14. Weyl's Inequalities and Corollaries
15. Interlacing Eigenvalues
16. Gersgorin Disc Theorem
16.1. Eigenvalue Trapping
16.2. Strict Diagonal Dominance
17. Positive (Semi) Definite Matrices
17.1. Characterizations
17.2. Finding the square roots of positive semidefinite matrices
17.3. Schur's Product Theorem and Similar Things
18. Singular Value Decomposition and Polar Form
18.1. Polar Form
18.2. Misc. Results
18.3. Singular Value Decomposition
18.4. Computing the Singular Value Decomposition
18.5. Notes on Nonsquare matrices
19. Positive Matrices and Perron-Frobenius
19.1. Nonnegative matrices
20. Loose Ends
20.1. Cauchy-Schwarz inequality
20.2. Random Facts
20.3. Groups of Triangular Matrices
20.4. Block Diagonal Matrices
20.5. Identities involving the Trace
21. Examples


The following material is based in part on Matrix Analysis by Horn and Johnson and the lectures of Dr. Hugo Woerdeman. No originality is claimed or implied. This document is made available as a reference for Drexel University's Matrix Analysis qualifier.

1. PRELIMINARIES
1.1. The Invertible Matrix Theorem. Let A ∈ M_n. The following are equivalent:
(a) A is nonsingular
(b) A^{-1} exists, i.e. A is invertible
(c) rank A = n
(d) The rows of A are linearly independent
(e) The columns of A are linearly independent
(f) det A ≠ 0
(g) The dimension of the range of A is n
(h) The dimension of the null space of A is 0
(i) Ax = b is consistent for each b
(j) If Ax = b is consistent, then the solution is unique
(k) Ax = b has a unique solution for each b
(l) The only solution to Ax = 0 is x = 0
(m) 0 is not an eigenvalue of A
1.2. Computing the Rank. When computing the rank of a matrix by row reducing, the rank is invariant under the following operations:
• permuting rows
• permuting columns
• adding a multiple of one row to another row
• multiplying a row by a nonzero scalar
I include this because I occasionally forget that both rows and columns may be permuted. This is helpful for determining the Jordan Form.
The rank of a matrix can be read off whenever the matrix is in any echelon (triangular-ish) form by counting the pivots.
Relation to Eigenvalues. For a diagonalizable matrix, the rank is also equal to the number of nonzero eigenvalues counting multiplicity. (This fails in general: a nonzero nilpotent matrix has positive rank but only the eigenvalue 0.)
1.3. Other Definitions and Results.
• A is invertible if there exists a matrix B such that AB = BA = I. This matrix is unique and we write B = A^{-1}.
• The standard scalar product for x, y ∈ C^n is ⟨x, y⟩ = y*x.
• A set of mutually orthogonal nonzero vectors {v_i}, i.e. ⟨v_k, v_j⟩ = 0 for j ≠ k, is linearly independent. (To prove it, dot c_1 v_1 + ... + c_n v_n = 0 with each vector.)
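The two conventions above can be checked numerically. This is a minimal sketch; the helper name `inner` and the example vectors are mine, not from the text:

```python
import numpy as np

# Standard scalar product on C^n: <x, y> = y* x (conjugate-linear in the
# second argument, following the convention in the text).
def inner(x, y):
    return np.vdot(y, x)  # np.vdot conjugates its first argument, so this is y* x

# Mutually orthogonal nonzero vectors are linearly independent:
# stacking them as columns gives a full-rank matrix.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v3 = np.array([0.0, 0.0, 2.0])
V = np.column_stack([v1, v2, v3])

pairwise_orthogonal = all(
    abs(inner(V[:, j], V[:, k])) < 1e-12
    for j in range(3) for k in range(3) if j != k
)
rank = np.linalg.matrix_rank(V)
```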
2. DETERMINANTS
2.1. Elementary Row Matrices (Row Operations) and Effects on Determinants. We will begin with a few examples to build intuition and give a method of remembering the rules using 2×2 matrices. To convince yourself of what each matrix E does, multiply it by the identity (on the right, so EI).

• The permutation matrix
    [0 1]
    [1 0]
corresponds to the elementary row operation of switching the first row with the second. Notice this can be deduced from the first row containing a 1 in the second column (i.e. this says the first row becomes the second row). The determinant of this matrix is -1.

• The matrix
    [c 0]
    [0 1]
multiplies the first row of a matrix by c. The determinant of this matrix is c.

• The matrix
    [1 0]
    [d 1]
adds d times the first row to the second row. The determinant of this matrix is 1.

Now let us explore their effects on the determinants by multiplying them by a matrix A. We have the following rules:

    det([0 1; 1 0] A) = -det(A)
    det([c 0; 0 1] A) = c det(A)
    det([1 0; d 1] A) = det(A)

From this we have the following rules; I state them for the 2×2 case for convenience.
Theorem (Row Operations and the Determinant)
(1) Interchange of two rows:
    |a b|     |c d|
    |c d| = - |a b|
(2) Multiplication of a row by a nonzero scalar k:
    |ka kb|     |a b|
    |c  d | = k |c d|
(3) Addition of a scalar multiple of one row to another row:
    |a b|   |a      b     |
    |c d| = |c + ka d + kb|
(4) Since det A = det A^T, you may also do column operations and switch columns (with a sign change).
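The three rules can be verified numerically by applying the elementary matrices to a random A. A minimal sketch; the matrix names P, D, E and the test values are mine:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
c, d = 5.0, 3.0

P = np.array([[0.0, 1.0], [1.0, 0.0]])   # swap rows: det = -1
D = np.array([[c, 0.0], [0.0, 1.0]])     # scale first row by c: det = c
E = np.array([[1.0, 0.0], [d, 1.0]])     # add d*(row 1) to row 2: det = 1

swap_ok  = np.isclose(np.linalg.det(P @ A), -np.linalg.det(A))
scale_ok = np.isclose(np.linalg.det(D @ A), c * np.linalg.det(A))
add_ok   = np.isclose(np.linalg.det(E @ A), np.linalg.det(A))
```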
3. EIGENVALUES
3.1. Basics. There will be some repeats here from later sections to drive the point home.
• The eigenvalues of a triangular or diagonal matrix are its diagonal entries.
• Every n-by-n matrix A has n eigenvalues counting multiplicity, and each eigenvalue has at least one nonzero eigenvector.
• The eigenvectors of an n-by-n matrix A with n distinct eigenvalues are linearly independent.
Proof. Proceed by induction on the size of the matrix A. The 1-by-1 case is trivial. Assume the result holds for n vectors. If c_1 v_1 + ... + c_{n+1} v_{n+1} = 0, apply (A - λ_{n+1} I) to both sides and use the induction hypothesis to find c_1 = ... = c_n = 0. Plug this back into the original equation, noting that v_{n+1} is nonzero, to conclude c_{n+1} = 0.
• The eigenvectors of a Hermitian matrix A corresponding to distinct eigenvalues are orthogonal (see the Hermitian section).
• AB and BA have the same characteristic polynomial and eigenvalues. (If A is nonsingular, they are similar using S = A. If not: for λ ∈ σ(AB) with ABx = λx, applying B gives BA(Bx) = λ(Bx), so λ is an eigenvalue of BA provided Bx ≠ 0; a symmetric argument shows eigenvalues of BA are eigenvalues of AB.)
• If A is invertible, then AB ∼ BA; if not, they need not be similar. (Example below.)
3.2. Eigenvalues after Addition. Let A and B be in M_n.
• If λ ∈ σ(A) with eigenvector x, then for a constant k, k + λ ∈ σ(kI + A) with the same eigenvector x.
• Lemma. For any matrix A and all sufficiently small ε > 0, the matrix A + εI is invertible. For the proof, use Schur to write A = U*TU, so A + εI = U*(T + εI)U, whose diagonal entries λ_i + ε are all nonzero once ε avoids the finitely many values -λ_i.
• If A and B commute, σ(A + B) ⊆ σ(A) + σ(B). (That is, if σ(A) = {λ_1, ..., λ_n} and σ(B) = {μ_1, ..., μ_n}, there exists a permutation i_1, ..., i_n of the indices 1, ..., n so that the eigenvalues of A + B are λ_1 + μ_{i_1}, ..., λ_n + μ_{i_n}.)
3.3. Relation to the Determinant and Trace. Assume A ∈ M_n and σ(A) = {λ_1, ..., λ_n}. We have the following identities:

    tr A = Σ_{i=1}^n λ_i        det A = Π_{i=1}^n λ_i
3.4. Tips on finding eigenvectors quickly.
• When solving for eigenvectors, you are solving a system of linear equations. Because interchanging equations makes no difference to the solution, neither does interchanging rows of the matrix when trying to reduce to echelon form and solve for eigenvectors!
• If an n-by-n matrix is diagonalizable and you have already found n - 1 eigenvectors, try finding the last one by choosing a vector orthogonal to the others (this is guaranteed to work when A is normal). Make sure to check Ax = λx to verify you chose it correctly.
• Complex eigenvalues of real matrices come in conjugate pairs, and so do their eigenvectors: Ax = λx implies A x̄ = λ̄ x̄.
3.5. Questions to answer.
• It is clear that if λ ∈ σ(A) then λ̄ ∈ σ(A*), since 0 = det(A - λI) implies 0 = conj(det(A - λI)) = det(A* - λ̄I). But do A and A* have the same eigenvectors? In general, no: if Ax = λx, nothing forces A*x = λ̄x unless A is normal (see the section on normal matrices).
4. CHARACTERISTIC AND MINIMAL POLYNOMIAL
Note: the characteristic polynomial always has the same degree as the dimension of a square matrix. A useful definition to use is
    p_A(t) = det(A - tI)
4.1. Cayley-Hamilton Theorem. Let p_A(t) be the characteristic polynomial of A ∈ M_n. Then p_A(A) = 0.

4.2. Using the Characteristic Polynomial to find Inverses and Powers.
• (Powers Example) Let
    A = [3 -1]
        [2  0]
Then p_A(t) = t² - 3t + 2. In particular p_A(A) = A² - 3A + 2I = 0, so A² = 3A - 2I. Using this we can find A³ = A·A² and so on and so forth.
Corollary to Cayley-Hamilton. If A ∈ M_n is nonsingular, then there is a polynomial q(t) (whose coefficients depend on A) of degree at most n - 1, such that A^{-1} = q(A).
Proof. If A is invertible, we can write p_A(t) = t^n + a_{n-1}t^{n-1} + ... + a_0 with a_0 ≠ 0. Then p_A(A) = A^n + a_{n-1}A^{n-1} + ... + a_1 A + a_0 I = 0 implies I = A·q(A), where q(t) = -(1/a_0)(t^{n-1} + a_{n-1}t^{n-2} + ... + a_1). Hence A^{-1} = q(A).
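The corollary gives an explicit (if numerically naive) recipe for inverting a matrix: evaluate the degree-(n-1) polynomial above at A. A minimal sketch; the function name is mine, and `np.poly` is used only to obtain the characteristic polynomial coefficients:

```python
import numpy as np

def inverse_via_cayley_hamilton(A):
    """Invert A using p_A(A) = 0, so A^{-1} is a polynomial in A of degree n-1."""
    n = A.shape[0]
    c = np.poly(A)            # coefficients of det(tI - A): [1, c_1, ..., c_n]
    # p_A(A) = 0  =>  A (A^{n-1} + c_1 A^{n-2} + ... + c_{n-1} I) = -c_n I
    B = np.zeros_like(A, dtype=float)
    for coeff in c[:-1]:      # Horner evaluation of the degree-(n-1) factor
        B = B @ A + coeff * np.eye(n)
    return -B / c[-1]

A = np.array([[3.0, -1.0], [2.0, 0.0]])   # det = 2, so A is invertible
A_inv = inverse_via_cayley_hamilton(A)
```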
4.3. More On The Minimal Polynomial. For this section, assume A ∈ M_n, and let q_A(t) and p_A(t) denote the minimal and characteristic polynomials of A respectively. For more specific information, view the segment in the Jordan Canonical Form section.
• q_A(t) is the unique monic polynomial of minimal degree such that q_A(A) = 0.
• The degree of q_A(t) is at most n.
• If p(t) is any polynomial such that p(A) = 0, then the minimal polynomial q_A(t) divides p(t).
• Similar matrices have the same minimal polynomial (this makes sense: they have the same Jordan form).
• For A ∈ M_n the minimal polynomial q_A(t) divides the characteristic polynomial p_A(t). Also, q_A(λ) = 0 if and only if λ is an eigenvalue of A. So every root of p_A(t) is a root of q_A(t). This was basically already here, but I added it for emphasis.
• Each of the following is a necessary and sufficient condition for A to be diagonalizable:
(a) q_A(t) factors into distinct linear factors.
(b) Every root of the minimal polynomial has multiplicity 1.
(c) For all t such that q_A(t) = 0, q_A'(t) ≠ 0.
4.4. Finding the Characteristic Polynomial using Principal Minors. Before proceeding we must have some definitions:
Definition. For a square matrix A ∈ M_n, let α ⊆ {1, 2, . . . , n} with |α| = k. The submatrix A(α) is a k-by-k principal submatrix of A, formed by taking the entries of A that lie in positions (α_i, α_j). That is, a k-by-k principal submatrix of A is one lying in the same set of k rows and k columns.
Example. Let
    A = [1 2 3]
        [4 5 6]
        [7 8 9]
and α = {1, 3}. Then a 2-by-2 principal submatrix of A is
    A(α) = [1 3]
           [7 9]
Definition. The determinant of a principal submatrix is called a principal minor. There are (n choose k) k-by-k principal minors of A = [a_ij]. We denote the sum of the k-by-k principal minors of A by E_k(A).
Example. E_1(A) = Σ_{i=1}^n a_ii = tr A, and E_n(A) = det A.

Now we are able to present the important formula:
Theorem. The characteristic polynomial of an n-by-n matrix A can be given as
    p_A(t) = t^n - E_1(A)t^{n-1} + E_2(A)t^{n-2} - ... + (-1)^n E_n(A)
Examples:
• Let A ∈ M_2. Then p_A(t) = t² - (tr A)t + det A.
• Consider
    A = [a b c]
        [d e f]
        [g h i]
Then
    p_A(t) = t³ - (a + e + i)t² + ( |a b; d e| + |a c; g i| + |e f; h i| )t - det A
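The principal-minor formula can be turned into a direct computation and checked against the characteristic polynomial numpy computes from the eigenvalues. A minimal sketch; the function name and the example matrix are mine:

```python
import numpy as np
from itertools import combinations

def char_poly_coeffs(A):
    """Coefficients of p_A(t) = t^n - E_1 t^{n-1} + E_2 t^{n-2} - ... + (-1)^n E_n,
    where E_k(A) is the sum of the k-by-k principal minors of A."""
    n = A.shape[0]
    coeffs = [1.0]
    for k in range(1, n + 1):
        Ek = sum(np.linalg.det(A[np.ix_(idx, idx)])   # minor on rows/cols idx
                 for idx in combinations(range(n), k))
        coeffs.append((-1) ** k * Ek)
    return np.array(coeffs)

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
coeffs = char_poly_coeffs(A)   # should match np.poly(A), the coefficients of det(tI - A)
```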
4.5. Eigenvalues depend continuously on the entries of a matrix. Facts. The following facts combine to yield the title of the section. Since this is mostly based on ideas involving the characteristic polynomial, I placed it in this section. The following only holds for square matrices.
• The zeros of a polynomial depend continuously on its coefficients.
• The coefficients of the characteristic polynomial are continuous functions of the entries of the matrix.
• The zeros of the characteristic polynomial are the eigenvalues.
This is discussed more thoroughly in Appendix D of Horn and Johnson. The moral is: sufficiently small changes in the entries of A cause small changes in the coefficients of p_A(t), which result in small changes in the eigenvalues.
4.6. AB and BA have the same characteristic polynomial. AB and BA are similar in the case that either A or B is invertible, and the result is then clear. Otherwise: consider A_ε = A + εI, which is invertible for all sufficiently small ε > 0. Then A_ε B ∼ BA_ε. Letting ε → 0, similarity may fail, but the characteristic polynomials will still be equal since they depend continuously on the parameter ε. This follows from the above subsection's discussion.
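This is easy to test numerically even when both products fail to be similar. A minimal sketch; the random matrices and the zeroed column (to force B singular) are my choices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
B[:, 0] = 0.0   # make B singular, so AB and BA need not be similar

# Characteristic polynomials of AB and BA still coincide.
same_char_poly = np.allclose(np.poly(A @ B), np.poly(B @ A))
```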
4.7. Lagrange Interpolating Polynomial. (0.9.11) add in later
5. SIMILARITY, DIAGONALIZATION AND COMMUTING
A nice way to think about similarity is the following: two matrices in M_n are similar if they represent the same linear transformation T : C^n → C^n in (possibly) two different bases. Therefore similarity may be thought of as studying properties which are intrinsic to a linear transformation itself, or properties that are common to all its various basis representations.
5.1. Facts About Similarity.
• Similarity is an equivalence relation.
• If two matrices are similar, they have the same characteristic polynomial, which implies they have the same eigenvalues counting multiplicity.
• The converse of the previous statement is not true unless the matrices are normal. (Consider the zero matrix and
    A = [0 1]
        [0 0] )
• The only matrix similar to the zero matrix is itself.
• Rank is similarity invariant.
• A necessary but not sufficient condition for two matrices to be similar is that their determinants are equal, because the determinant is the product of the eigenvalues. Another check is whether the traces are equal.
• If two matrices are similar, then they have the same Jordan Canonical Form by transitivity.
• From the above statement, it is clear two similar matrices have the same minimal polynomial.
• Two normal matrices are similar if and only if they are unitarily equivalent. (Problem 2.5.31)
Theorem. Similar matrices have the same characteristic polynomial.
Proof. If A and B are similar, then A = SBS^{-1} for some invertible S. Then
    p_A(t) = det(A - tI) = det(SBS^{-1} - tSS^{-1}) = det(S) det(B - tI) det(S^{-1}) = det(B - tI) = p_B(t)
5.2. Diagonalizable.
• To start easy: diagonal matrices always commute.
• A matrix is diagonalizable if and only if it is similar to a diagonal matrix.
• An n×n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors.
An aside: if A is 2×2 and dim Ker(A - λI) = 2, then necessarily that eigenvalue has 2 linearly independent eigenvectors, as can be seen from the Jordan form. The Jordan form is helpful for this. From this we see:
• If a matrix has n distinct eigenvalues, then it is diagonalizable.
• If C is block diagonal, then C is diagonalizable if and only if each block on the diagonal is diagonalizable.
For an additional characterization by minimal polynomials, check that section!
5.3. Simultaneously Diagonalizable and Commuting Families. The main takeaway of this section should be: if two diagonalizable matrices commute, they can be simultaneously diagonalized, and commuting matrices can always be simultaneously unitarily triangularized.
• For two diagonalizable matrices A and B: A and B commute if and only if they are simultaneously diagonalizable. Stronger Theorem. Let F ⊆ M_n be a commuting family. Then there exists a unitary U such that for all A ∈ F, U*AU is upper triangular. This says that matrices only need to commute to be simultaneously unitarily triangularized!!
• If M is a commuting family of normal matrices, then M is simultaneously unitarily diagonalizable.
• If F is a commuting family, then there is a vector x which is an eigenvector for every A ∈ F.
• Commuting families of diagonalizable matrices are also simultaneously diagonalizable families.
• If A and B commute, from Schur's triangularization theorem and a property above, we have that σ(A + B) ⊆ σ(A) + σ(B).
6. UNITARY MATRICES
6.1. Definition and Main Theorem.
Definition. A matrix U ∈ M_n is unitary provided U*U = I.
Theorem. The following are equivalent:
(a) U is unitary
(b) U is nonsingular and U^{-1} = U*
(c) UU* = I
(d) U* is unitary
(e) The columns of U form an orthonormal set
(f) The rows of U form an orthonormal set; and
(g) For all x ∈ C^n, the Euclidean length of y = Ux is the same as that of x, that is, y*y = x*x, or again ‖Ux‖ = ‖x‖.
Other facts:
• A real unitary matrix is called an orthogonal matrix.
• |det U| = 1
• If λ ∈ σ(U), then |λ| = 1.
• A sequence of unitary matrices that converges, converges to a unitary matrix.

6.2. Determining Unitary Equivalence.
• If A = [a_ij] and B = [b_ij] are unitarily equivalent, then Σ_{i,j=1}^n |b_ij|² = Σ_{i,j=1}^n |a_ij|².
• (Pearcy's Theorem) Two matrices A, B ∈ M_n are unitarily equivalent if and only if tr(w(A, A*)) = tr(w(B, B*)) for every word w(s, t) of degree at most 2n².
  Note: an example of a word w(s, t) in the letters s and t of degree 3 is sts.
  In general this is used to show two matrices are not unitarily equivalent: you find a word where the traces differ, for example tr(A*AA*) ≠ tr(B*BB*) for the given A and B.
7. SCHUR'S UNITARY TRIANGULARIZATION THEOREM
Theorem. Given A ∈ M_n with eigenvalues λ_1, . . . , λ_n in any prescribed order, there is a unitary matrix U such that
    U*AU = T = [t_ij]
where T is upper triangular, with diagonal entries t_ii = λ_i for i = 1, . . . , n. In words, every square matrix A is unitarily equivalent to an upper triangular matrix whose diagonal entries are the eigenvalues of A in any prescribed order. Furthermore, if A ∈ M_n(R) and all the eigenvalues of A are real, then U may be chosen to be real and orthogonal.
Note that neither T nor U is unique!
One way to interpret Schur's theorem is that it says every matrix is almost diagonalizable, which is made precise in the following way:
Theorem (2.4.6). Let A = [a_ij] ∈ M_n. For every ε > 0, there exists a matrix A(ε) = [a(ε)_ij] that has n distinct eigenvalues (and is therefore diagonalizable) such that
    Σ_{i,j=1}^n |a_ij - a(ε)_ij|² < ε
8. HERMITIAN AND NORMAL MATRICES
8.1. Normal Matrices.
• A matrix is normal if A*A = AA*.
• A ∈ M_n is normal if and only if every matrix unitarily equivalent to A is normal.
• A real, normal matrix A ∈ M_2(R) is either symmetric (A = A^T), or the sum of a scalar matrix and a skew-symmetric matrix (i.e. A = kI + B for k ∈ R and B^T = -B).
• Unitary, Hermitian, and skew-Hermitian matrices are normal.
• For a normal matrix A, Ax = 0 if and only if A*x = 0. So A and A* have the same kernel. (Use the fact below to prove this.)
• For a normal matrix A, ‖Ax‖² = x*A*Ax = x*AA*x = ‖A*x‖².
• For λ ∈ C, A + λI is normal.
• A normal triangular matrix T is diagonal. (Equate the diagonal entries of T*T and TT*.)
• If Ax = λx for nonzero x, then A*x = λ̄x. This follows from ‖(A - λI)x‖ = ‖(A - λI)*x‖, since A - λI is normal.
Spectral Theorem. If A = [a_ij] ∈ M_n has eigenvalues λ_1, ..., λ_n, the following are equivalent:
(a) A is normal
(b) A is unitarily diagonalizable (A is unitarily equivalent to a diagonal matrix)
(c) Σ_{i,j=1}^n |a_ij|² = Σ_{i=1}^n |λ_i|² (note the left-hand side always equals tr(A*A))
(d) There is an orthonormal set of n eigenvectors of A.
8.2. Definition and Characterizations of Hermitian matrices.
• A matrix A ∈ M_n is Hermitian if A = A*. It is skew-Hermitian if A* = -A.
• A + A*, AA*, and A*A are always Hermitian.
• If A is Hermitian, A^k is Hermitian for all k ∈ N, and if A is nonsingular, A^{-1} is Hermitian.
• If A, B are Hermitian, aA + bB is Hermitian for all real scalars a, b.
• A - A* is skew-Hermitian.
• If A, B are skew-Hermitian, so is aA + bB for real scalars a, b.
• If A is Hermitian, iA is skew-Hermitian. If A is skew-Hermitian, iA is Hermitian.
• Any matrix A may be written A = (1/2)(A + A*) + (1/2)(A - A*), splitting A into a Hermitian and a skew-Hermitian part.
• If A is Hermitian, the diagonal entries of A are real.
Theorem. A is Hermitian if and only if at least one of the following holds:
• x*Ax is real for all x ∈ C^n
• A is normal and all the eigenvalues of A are real
• S*AS is Hermitian for all S ∈ M_n.
If one of them holds, A is Hermitian and the others hold as well. The point of the theorem is that if you can verify any one of these, then you may conclude A is Hermitian.
For example, if you can show A is unitarily diagonalizable by using Schur's theorem and finding that the non-diagonal entries are 0, then A is normal; if you can then show each diagonal entry is real, then A is Hermitian.
Hermitian matrices are necessarily normal, and so all the theorems about normal matrices hold for them.
8.3. Extra Spectral Theorem for Hermitian Matrices. If A is Hermitian, then
(1) all eigenvalues of A are real, and
(2) A is unitarily diagonalizable.
(3) If A ∈ M_n(R) is symmetric, then A is real orthogonally diagonalizable.
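Both conclusions can be observed directly with numpy's Hermitian eigensolver. A minimal sketch; the random Hermitian matrix is constructed as (M + M*)/2, a choice of mine:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2          # Hermitian by construction

w, U = np.linalg.eigh(A)          # eigh returns real eigenvalues and unitary U

unitary_ok   = np.allclose(U.conj().T @ U, np.eye(4))
diagonalizes = np.allclose(U @ np.diag(w) @ U.conj().T, A)
```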
8.4. Other Facts About Hermitian Matrices.
• For all x ∈ C^n the n×n matrix xx* is Hermitian. The only eigenvalues of this matrix are 0 and ‖x‖_2² = x*x. Proof: choosing a vector p orthogonal to x, xx*p = x(x*p) = x·0 = 0·p. Also notice xx*x = x(x*x) = ‖x‖_2² x. If there were another eigenvalue λ ≠ 0, then for some y, xx*y = λy, i.e. λy = (x*y)x. So y is in the span of x and hence λ = ‖x‖_2².
• If A is Hermitian, the rank of A is equal to the number of nonzero eigenvalues of A.
• Every principal submatrix of a Hermitian matrix is Hermitian.
• If A is Hermitian and x*Ax ≥ 0 for all x ∈ C^n, then all the eigenvalues of A are nonnegative. If in addition tr A = 0, then A = 0.
• You may find the minimal polynomial by computing the Jordan form. A quicker way is to find, for each λ, the smallest k such that rank(A - λI)^k = rank(A - λI)^{k+1}, and then construct the polynomial from this.
• Every matrix A is uniquely determined by its Hermitian form x*Ax. That is, if for matrices A, B we have x*Ax = x*Bx for all x ∈ C^n, then A = B. (Exercise 4.1.6)
• Eigenvalues of skew-Hermitian matrices are purely imaginary. The square of a skew-Hermitian matrix has real nonpositive eigenvalues. (Exercise 4.1.10)
9. THE JORDAN CANONICAL FORM
9.1. Finding The Jordan Form.
Step 1) Find the eigenvalues with multiplicity. This is equivalent to factoring the characteristic polynomial.
Recall: given the characteristic polynomial, one may read off the eigenvalues and their multiplicities.
Example: p_A(λ) = (λ - 1)³(λ + 2)⁵. The eigenvalues of A are 1 and -2, with algebraic multiplicities 3 and 5 respectively.
Step 2) For each eigenvalue λ, determine the number of Jordan blocks using the relation:
    dim ker[(A - λI)^k] = Σ_{j=1}^k (# Jordan blocks of λ of order ≥ j)
    = (# of J blocks associated with λ) + (# of J blocks of size ≥ 2) + ... + (# of J blocks of size ≥ k)
Step 3) Mention that the forms you found are the only possible ones up to permutation of the blocks.
9.2. Remarks on Finding the Number of Jordan Blocks.
• The geometric multiplicity (the number of linearly independent eigenvectors for the eigenvalue, also characterized as the dimension of the eigenspace) gives the total number of Jordan blocks for λ:
    dim ker(A - λI) = # Jordan blocks of order ≥ 1 = total # of Jordan blocks for λ
• The orders of the Jordan blocks of an eigenvalue λ must sum to the algebraic multiplicity of λ.
• # of Jordan blocks of order ≥ m = dim ker[(A - λI)^m] - dim ker[(A - λI)^{m-1}]
• The order of the largest Jordan block of A corresponding to an eigenvalue λ (called the index of λ) is the smallest value of k ∈ N such that rank(A - λI)^k = rank(A - λI)^{k+1}.
Proof. If rank(A - λI)^k = rank(A - λI)^{k+1}, then
    0 = dim ker[(A - λI)^{k+1}] - dim ker[(A - λI)^k] = # of J blocks of order ≥ k + 1.
Since k is the smallest value making this difference 0, we also have
    0 ≠ dim ker[(A - λI)^k] - dim ker[(A - λI)^{k-1}] = # of J blocks of order ≥ k.
• For a J block of order k associated with λ (i.e. J_k(λ)), k is also the smallest number such that (J_k(λ) - λI)^k = 0.
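The counting rules above can be automated: compute dim ker(A - λI)^m for increasing m and take differences. A minimal sketch; the function names and the example (a direct sum of J_3(2), J_2(2), J_1(2)) are mine:

```python
import numpy as np

def jordan_block(lam, k):
    """J_k(lam): lam on the diagonal, 1s on the superdiagonal."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def jordan_block_counts(A, lam, tol=1e-9):
    """Map {order m: # of Jordan blocks of order m for eigenvalue lam}, using
    #(blocks of order >= m) = dim ker (A-lam I)^m - dim ker (A-lam I)^{m-1}."""
    n = A.shape[0]
    N = A - lam * np.eye(n)
    dim_ker = [0]                      # dim ker N^0 = dim ker I = 0
    P = np.eye(n)
    for _ in range(n):
        P = P @ N
        dim_ker.append(n - np.linalg.matrix_rank(P, tol=tol))
    at_least = [dim_ker[m] - dim_ker[m - 1] for m in range(1, n + 1)] + [0]
    # blocks of exact order m = (# order >= m) - (# order >= m+1)
    return {m: at_least[m - 1] - at_least[m]
            for m in range(1, n + 1) if at_least[m - 1] - at_least[m] > 0}

inner = np.block([[jordan_block(2.0, 2), np.zeros((2, 1))],
                  [np.zeros((1, 2)), jordan_block(2.0, 1)]])
A = np.block([[jordan_block(2.0, 3), np.zeros((3, 3))],
              [np.zeros((3, 3)), inner]])
counts = jordan_block_counts(A, 2.0)   # one block each of orders 3, 2, 1
```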
9.3. Regarding the Characteristic and Minimal Polynomial of a Matrix A. Let A ∈ M_n have distinct eigenvalues λ_1, ..., λ_m. Assume the characteristic polynomial is given by p_A(t) = Π_{i=1}^m (t - λ_i)^{s_i} and the minimal polynomial by q_A(t) = Π_{i=1}^m (t - λ_i)^{r_i}.
• The exponent s_i in the characteristic polynomial (i.e. the algebraic multiplicity) is the sum of the orders of all the J blocks of λ_i.
• The exponent r_i in the minimal polynomial is the order of the largest J block corresponding to λ_i.
9.4. Finding the S matrix. For the Jordan form we have A = SJS^{-1} for some S. If J is diagonal, then S is simply the eigenvectors of A arranged in a prescribed order; but if not: to find S, notice AS = SJ. Let's look at the 2-by-2 case for S = [s_1, s_2] and
    J = [λ 1]
        [0 λ]
Notice that AS = SJ can be written:
    [As_1  As_2] = [λs_1  s_1 + λs_2]
So you can solve for s_1 as an eigenvector for λ. To find s_2, solve (A - λI)s_2 = s_1.
9.5. Solutions to Systems of Differential Equations. As e^x := Σ_{k=0}^∞ x^k/k!, this gives us the definition e^A = I + A + (1/2!)A² + ..., which converges for all square matrices in C^{n×n}. If we have two matrices A, B such that AB = BA, then we have the identity e^{A+B} = e^A e^B, since e^A and e^B must commute here. Using the Jordan form A = SJS^{-1} with J = D + T, where D holds the diagonal entries and T the remaining (nilpotent) entries, and noting that D and T commute, we see that e^{tA} = Se^{t(D+T)}S^{-1} = Se^{tD}e^{tT}S^{-1}, which allows us to solve systems of ODEs:
    x'(t) = Ax(t),  A ∈ M_n, x ∈ C^n
    x(0) = C
Then our solution is x(t) = e^{tA}C. Another problem you may encounter is: solve for x(t) where
    x'''(t) - x''(t) - 4x'(t) + 4x(t) = 0
To do so, convert this to a system of ODEs:
    x_1(t) = x(t)    so  x_1'(t) = x_2(t)
    x_2(t) = x'(t)   so  x_2'(t) = x_3(t)
    x_3(t) = x''(t)  so  x_3'(t) = x'''(t) = x''(t) + 4x'(t) - 4x(t) = x_3(t) + 4x_2(t) - 4x_1(t)
If we define the vector x(t) = (x_1(t), x_2(t), x_3(t))^T, we immediately see we can rewrite this system as
    x'(t) = [x_1'(t)]   [ 0  1  0] [x_1(t)]
            [x_2'(t)] = [ 0  0  1] [x_2(t)] = Ax(t)
            [x_3'(t)]   [-4  4  1] [x_3(t)]
You then solve this ODE for x(t), and the first component x_1(t) is the solution to the original question. This construction also gives the general framework for companion matrices.
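For a diagonalizable A, the recipe x(t) = e^{tA}C can be computed directly from an eigendecomposition. A minimal numerical sketch; the helper name `expm_diagonalizable` and the example system are mine, and the assumption that A is diagonalizable is essential here:

```python
import numpy as np

def expm_diagonalizable(A, t):
    """e^{tA} via eigendecomposition; assumes A is diagonalizable."""
    w, S = np.linalg.eig(A)
    return (S @ np.diag(np.exp(t * w)) @ np.linalg.inv(S)).real

# x'(t) = A x(t), x(0) = C  has solution x(t) = e^{tA} C.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1 and -2: distinct
C = np.array([1.0, 0.0])

t = 0.5
x_t = expm_diagonalizable(A, t) @ C

# Verify the ODE numerically with a central finite difference: x'(t) ~ A x(t).
h = 1e-6
deriv = (expm_diagonalizable(A, t + h) @ C - expm_diagonalizable(A, t - h) @ C) / (2 * h)
ode_ok = np.allclose(deriv, A @ x_t, atol=1e-4)
```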

9.6. Companion Matrices and Finding Roots to Polynomials. Let p(x) = x^n + a_{n-1}x^{n-1} + ... + a_1x + a_0 be a polynomial. The companion matrix C of p(x) is defined to be

    C = [  0     1     0    ...    0        0       ]
        [  0     0     1    ...    0        0       ]
        [  :                                :       ]
        [  0     0     0    ...    0        1       ]
        [ -a_0  -a_1  -a_2  ...  -a_{n-2}  -a_{n-1} ]

Notice that this matrix has 1s along the superdiagonal, 0s elsewhere above the bottom row, and across the bottom are the negated coefficients. Now notice

    C [1, x, x², ..., x^{n-1}]^T = [x, x², ..., x^{n-1}, -a_0 - a_1x - ... - a_{n-1}x^{n-1}]^T = x [1, x, x², ..., x^{n-1}]^T

if and only if x is a root of p(x), since then x^n = -a_{n-1}x^{n-1} - ... - a_1x - a_0. But this implies x is an eigenvalue of C, and we also have an eigenvector. Therefore, the important thing about this matrix is that the eigenvalues of C are the roots of p(x). Performing the QR algorithm on a matrix like this is an efficient way to find roots of polynomials.
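The root-finding trick is a few lines in practice: build the companion matrix and take its eigenvalues. A minimal sketch; the function name and the example polynomial (chosen to match the cubic from the ODE subsection) are mine:

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of p(x) = x^n + a_{n-1} x^{n-1} + ... + a_0,
    given coeffs = [a_0, a_1, ..., a_{n-1}]: 1s on the superdiagonal,
    negated coefficients across the bottom row."""
    n = len(coeffs)
    C = np.zeros((n, n))
    C[:-1, 1:] = np.eye(n - 1)
    C[-1, :] = -np.asarray(coeffs)
    return C

# p(x) = x^3 - x^2 - 4x + 4 = (x - 1)(x - 2)(x + 2)
C = companion([4.0, -4.0, -1.0])
roots = np.sort(np.linalg.eigvals(C).real)
```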
9.7. The Real Jordan Canonical Form. For λ = a + bi ∈ C we define
    C(a, b) := [ a  b]
               [-b  a]
Theorem. If A ∈ M_n(R) then there exists an invertible real matrix S such that
    A = S · diag( J_{n_1}(λ_1), ..., J_{n_r}(λ_r), C_{n_{r+1}}(a_1, b_1), ..., C_{n_k}(a_k, b_k) ) · S^{-1}
where λ_1, ..., λ_r are the real eigenvalues of A and the a_j + b_j i are the non-real ones. Furthermore,

    C_k(a, b) := [ C(a,b)  I_2                  ]
                 [         C(a,b)  I_2          ]
                 [                 ...     I_2  ]
                 [                      C(a,b)  ]

is a 2k×2k block matrix with k copies of C(a, b) along the diagonal and I_2 along the superdiagonal. This matrix is similar to (and takes the place of, in the Jordan form) D(λ) = diag(J_k(λ), J_k(λ̄)), where λ = a + bi.
10. QR AND APPLICATIONS
10.1. Gram-Schmidt. Let x_1, ..., x_m ∈ C^n be linearly independent. To find an orthonormal set of vectors {y_1, ..., y_m} whose span equals that of x_1, ..., x_m, use Gram-Schmidt. Algorithm:
(1) Let v_1 = x_1.
(2) v_2 = x_2 - (⟨x_2, v_1⟩/⟨v_1, v_1⟩) v_1. Notice the second term is the projection of x_2 onto the subspace spanned by v_1.
(3) In general, v_j = x_j - Σ_{k=1}^{j-1} (⟨x_j, v_k⟩/⟨v_k, v_k⟩) v_k.
(4) Let y_i = v_i/‖v_i‖ = v_i/⟨v_i, v_i⟩^{1/2}.
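The four steps translate directly into code. A minimal sketch of classical Gram-Schmidt (numerically, the "modified" variant is preferred, but this follows the algorithm as stated); the function name and example columns are mine:

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the (assumed linearly independent) columns of X:
    subtract projections onto earlier v_k, then normalize."""
    n, m = X.shape
    V = np.zeros((n, m))
    for j in range(m):
        v = X[:, j].copy()
        for k in range(j):
            # coefficient <x_j, v_k>/<v_k, v_k> with <x, y> = y* x
            v -= (np.vdot(V[:, k], X[:, j]) / np.vdot(V[:, k], V[:, k])) * V[:, k]
        V[:, j] = v
    return V / np.linalg.norm(V, axis=0)   # y_i = v_i / ||v_i||

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Y = gram_schmidt(X)
orthonormal = np.allclose(Y.T @ Y, np.eye(3))
```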

10.2. QR Factorization. Theorem. If A ∈ M_{n,m} and n ≥ m, there is a matrix Q ∈ M_{n,m} with orthonormal columns and an upper triangular matrix R ∈ M_m such that A = QR. If m = n, then Q is unitary; if in addition A is nonsingular, R may be chosen so that all its diagonal entries are positive, and in this event the factors Q, R are both unique. If A is real, then both Q and R may be taken to be real.
Factorization. To find the QR factorization: first find Q by performing Gram-Schmidt on the columns of A. Place this new orthonormal set as the columns of Q without changing the order. Then to get R, simply compute R = Q*A.
NOTE: If the columns of A are linearly dependent, then one of the vectors from Gram-Schmidt is the zero vector. Obviously this is not correct (the matrix would then not have orthonormal columns), so you should replace that vector with one orthogonal to the rest (it should be simple to find). Ex:
    [1 0 1]
    [0 1 0]
    [1 0 1]
Tip. Say the k-th column of A has i in each entry. You can rewrite A as A = A_0 B where B is diagonal and multiplies the k-th column of A_0 by i. Then find A_0 = Q_0 R_0. Substituting in, we have A = Q_0 R_0 B. Letting R = R_0 B, which is still upper triangular, we have the desired factorization.
10.3. QR Algorithm for Finding Eigenvalues. Let A_0 be given. Write A_0 = Q_0R_0. Define A_1 = R_0Q_0. Then write A_1 = Q_1R_1 and repeat. In general, factor A_k = Q_kR_k and define A_{k+1} = R_kQ_k. A_k will converge (in most cases) to an upper triangular matrix which is unitarily equivalent to A_0, and so has the same eigenvalues. If there are complex eigenvalues, it converges to a block triangular matrix under certain conditions.
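The unshifted iteration described above is a short loop. A minimal sketch; the function name, iteration count, and symmetric test matrix (chosen so convergence is guaranteed and the eigenvalues are real) are mine:

```python
import numpy as np

def qr_algorithm(A, iters=200):
    """Unshifted QR iteration: A_{k+1} = R_k Q_k where A_k = Q_k R_k.
    Each A_k is unitarily equivalent to A; for many matrices A_k
    approaches (block) upper triangular form."""
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])     # symmetric with distinct eigenvalues
T = qr_algorithm(A)
approx_eigs = np.sort(np.diag(T))   # diagonal of the (near-)triangular limit
true_eigs = np.sort(np.linalg.eigvalsh(A))
```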
11. RAYLEIGH-RITZ THEOREM
Let A ∈ M_n be Hermitian and let the eigenvalues of A be ordered as λ_min = λ_1 ≤ λ_2 ≤ ... ≤ λ_n = λ_max. Then
    λ_1 x*x ≤ x*Ax ≤ λ_n x*x  for all x ∈ C^n
    λ_max = λ_n = max_{x*x = 1} x*Ax = max_{x ≠ 0} (x*Ax)/(x*x)
    λ_min = λ_1 = min_{x*x = 1} x*Ax = min_{x ≠ 0} (x*Ax)/(x*x)
12. COURANT-FISCHER THEOREM
Let A ∈ M_n be a Hermitian matrix with eigenvalues λ_1 ≤ λ_2 ≤ ... ≤ λ_n and let k be a given integer with 1 ≤ k ≤ n. Then
    min_{w_1, ..., w_{n-k} ∈ C^n}  max_{x ≠ 0, x ⊥ w_1, ..., w_{n-k}}  (x*Ax)/(x*x) = λ_k
and
    max_{w_1, ..., w_{k-1} ∈ C^n}  min_{x ≠ 0, x ⊥ w_1, ..., w_{k-1}}  (x*Ax)/(x*x) = λ_k
Remark: If k = 1 or k = n, the theorem reduces to Rayleigh-Ritz. I would like to also put a word on interpreting this result physically. The min/max statement was so intimidating it took me several months to actually think about it. In the first equality, we first take the max of x*Ax/x*x over all x ≠ 0 perpendicular to a particular set of n - k vectors. Then we take the minimum of the resulting values over all possible choices of those n - k vectors.
Corollary. The singular values σ_1 ≥ σ_2 ≥ . . . ≥ σ_q of A, where q = min{m, n}, may be given for 1 ≤ k ≤ q by
    σ_k = min_{w_1, ..., w_{k-1} ∈ C^n}  max_{x ≠ 0, x ⊥ w_1, ..., w_{k-1}}  ‖Ax‖_2/‖x‖_2
and
    σ_k = max_{w_1, ..., w_{n-k} ∈ C^n}  min_{x ≠ 0, x ⊥ w_1, ..., w_{n-k}}  ‖Ax‖_2/‖x‖_2

13. NORMS AND SUCH
A norm ‖·‖ : V → R satisfies:
(1) ‖x‖ ≥ 0 for all x ∈ V
(2) ‖x‖ = 0 if and only if x = 0
(3) ‖cx‖ = |c| ‖x‖
(4) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (the triangle inequality)
13.1. Vector Norms.
• ‖x‖_2 = (|x_1|² + . . . + |x_n|²)^{1/2}. Note for x ∈ C^n, ‖x‖_2² = x*x.


13.2. Matrix Norms. A matrix norm k| k| satisfies (1-4) of the properties above and in addition
submultiplicity: k|ABk| k|Ak|k|Bk| for any matrix A, B.
One may induce a norm from a vector norm k k by defining k|Ak| = maxkxk=1 kAxk.
The spectral norm is

k|Ak|2 = max{ : (A A)}

k|Ak|2 = max{ : (AA )}


k|Ak|2 = max{ : is a singular value of A}
and further
$$|||A|||_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 = \max_{\|x\|_2 \le 1} \|Ax\|_2 = \max_{x \ne 0} \frac{\|Ax\|_2}{\|x\|_2} = \max_{\|x\|_2 = \|y\|_2 = 1} |y^*Ax| = \max_{\|x\|_2 \le 1,\ \|y\|_2 \le 1} |y^*Ax|$$
The previous identities can be used to show $|||A|||_2 = |||A^*|||_2$ for all $A \in M_n$. Additionally, $|||A^*A|||_2 = |||AA^*|||_2 = |||A|||_2^2$, which follows from properties of matrix norms and the fact that $A^*A$ is Hermitian.
$|||A|||_\infty = \max_{1 \le i \le n} \sum_{j=1}^n |a_{ij}|$. This is the maximum absolute row sum.
$|||A|||_1 = \max_{1 \le j \le n} \sum_{i=1}^n |a_{ij}|$. This is the maximum absolute column sum.
A matrix norm that is not an induced norm (not determined by any vector norm) is the Frobenius norm $\|A\|_2 := \sqrt{\mathrm{tr}(A^*A)}$.
$\|A\|_\infty := \max_{i,j = 1, \dots, n} |a_{ij}|$ is a norm on the vector space of matrices but NOT a matrix norm (it fails submultiplicativity).
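A small numpy sketch of the norms just listed (the example matrix is my own), including a witness that the entrywise max norm fails submultiplicativity:

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 4.0]])

spectral = np.linalg.svd(A, compute_uv=False).max()  # |||A|||_2: largest singular value
row_sum = np.abs(A).sum(axis=1).max()                # |||A|||_inf: max absolute row sum
col_sum = np.abs(A).sum(axis=0).max()                # |||A|||_1: max absolute column sum
frob = np.sqrt(np.trace(A.T @ A))                    # Frobenius norm

assert np.isclose(row_sum, 7.0) and np.isclose(col_sum, 6.0)
assert np.isclose(frob, np.sqrt(30.0))

# The entrywise max norm is NOT submultiplicative: with J = all-ones,
# max|J| = 1 but max|J @ J| = 2 > 1 * 1.
J = np.ones((2, 2))
assert np.abs(J @ J).max() > np.abs(J).max() * np.abs(J).max()
```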
13.3. Theorems, Definitions, and Sums.
$\rho(A) := \max\{|\lambda| : \lambda \in \sigma(A)\}$ (the spectral radius).
The following lemmas build to the first theorem:
Property: $|||A^k||| \le |||A|||^k$.
If $|||A^k||| \to 0$ for some matrix norm, then $A^k \to 0$: since all norms on the $n^2$-dimensional space $M_n$ are equivalent, the limit also holds in the entrywise $\|\cdot\|_\infty$ norm.
Lemma: If there is a matrix norm $|||\cdot|||$ such that $|||A||| < 1$, then $\lim_{k \to \infty} A^k = 0$.
Theorem: If $A \in M_n$, then $\lim_{m \to \infty} A^m = 0$ if and only if all the eigenvalues of $A$ have modulus $< 1$. That is, $\rho(A) < 1$ if and only if $\lim_{k \to \infty} A^k = 0$.
This tells us that if $A^k \to 0$, then $\rho(A) < 1$, so there exists a matrix norm with $|||A||| < 1$, in which case $|||A|||^k \to 0$.
Theorem: If $|||\cdot|||$ is a matrix norm, then $\rho(A) \le |||A|||$.
Corollary: Since $|||A|||_\infty$ and $|||A|||_1$ are matrix norms,
$$\rho(A) \le \min\Big\{ \max_i \sum_{j=1}^n |a_{ij}|, \ \max_j \sum_{i=1}^n |a_{ij}| \Big\}$$
Lemma: Let $A \in M_n$ and $\epsilon > 0$. Then there exists a matrix norm $|||\cdot|||$ such that $|||A||| \le \rho(A) + \epsilon$.
Theorem: If $A \in M_n$, then the series $\sum_{k=0}^\infty a_k A^k$ converges if there exists a matrix norm $|||\cdot|||$ such that the numerical series $\sum_{k=0}^\infty |a_k| \, |||A|||^k$ converges, or even if the partial sums of this series are bounded.
Important Theorem: If there is a matrix norm such that $|||I - A||| < 1$, then $A$ is invertible and $A^{-1} = \sum_{k=0}^\infty (I - A)^k$.
Theorem: If $\rho(A) < 1$, then $I - A$ is invertible and $\sum_{k=0}^\infty A^k = (I - A)^{-1}$. This could also be stated with the condition that there exists a matrix norm such that $|||A||| < 1$.
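The last theorem can be exercised directly: when $\rho(A) < 1$, the partial sums of $\sum_k A^k$ converge to $(I - A)^{-1}$. A sketch (the example matrix and iteration count are my own choices):

```python
import numpy as np

# A matrix with spectral radius 0.5 (eigenvalues 0.5 and 0.2).
A = np.array([[0.4, 0.1], [0.2, 0.3]])
assert max(abs(np.linalg.eigvals(A))) < 1

# Partial sums of the Neumann series sum_k A^k.
S = np.zeros((2, 2))
P = np.eye(2)                 # current power A^k, starting at A^0 = I
for _ in range(200):
    S += P
    P = P @ A

assert np.allclose(S, np.linalg.inv(np.eye(2) - A))
```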
13.4. Gelfand's Formula for the Spectral Radius. If $A \in M_n$ and $|||\cdot|||$ is a matrix norm, then
$$\rho(A) = \lim_{k \to \infty} |||A^k|||^{1/k}.$$
For the proof, consider $\rho(A)^k = \rho(A^k) \le |||A^k|||$ for one inequality, and the matrix $\tilde{A} = (\rho(A) + \epsilon)^{-1} A$ for the other, using that $\lim_{k \to \infty} \tilde{A}^k = 0$.
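Gelfand's formula is easy to watch converge numerically; here with the Frobenius norm, which is a matrix norm (the example matrix is my own, chosen so $\rho(A) = 1$ while the norm of $A$ itself is larger):

```python
import numpy as np

A = np.array([[0.0, 2.0], [0.5, 0.0]])      # eigenvalues +1 and -1, so rho(A) = 1
rho = max(abs(np.linalg.eigvals(A)))

# |||A^k|||^(1/k) approaches rho(A); the Frobenius norm of A itself is ~2.06.
k = 200
Ak = np.linalg.matrix_power(A, k)
estimate = np.linalg.norm(Ak) ** (1.0 / k)  # Frobenius norm of A^k, k-th root

assert np.isclose(rho, 1.0)
assert abs(estimate - rho) < 0.01
```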
14. WEYL'S INEQUALITIES AND COROLLARIES
This is a consequence of the Courant-Fischer Theorem.
Theorem: Let $A, B \in M_n$ be Hermitian and let the eigenvalues $\lambda_i(A)$, $\lambda_i(B)$ and $\lambda_i(A+B)$ be arranged in increasing order ($\lambda_{\min} = \lambda_1 \le \cdots \le \lambda_n = \lambda_{\max}$). For each $k = 1, \dots, n$ we have
$$\lambda_k(A) + \lambda_1(B) \le \lambda_k(A+B) \le \lambda_k(A) + \lambda_n(B)$$
Corollaries: Many of these corollaries build intuition about the eigenvalues of Hermitian matrices. For example, the first corollary says that adding a positive semidefinite matrix cannot decrease any eigenvalue.
Assume $B$ is positive semidefinite; with the above assumptions, $\lambda_k(A) \le \lambda_k(A+B)$.
For a vector $z \in \mathbb{C}^n$, with the eigenvalues of $A$ and $A \pm zz^*$ arranged in increasing order,
(a) $\lambda_k(A \pm zz^*) \le \lambda_{k+1}(A) \le \lambda_{k+2}(A \pm zz^*)$ for $k = 1, 2, \dots, n-2$
(b) $\lambda_k(A) \le \lambda_{k+1}(A \pm zz^*) \le \lambda_{k+2}(A)$ for $k = 1, 2, \dots, n-2$.
Theorem: Let $A, B \in M_n$ be Hermitian and suppose that $B$ has rank at most $r$. Then
(a) $\lambda_k(A+B) \le \lambda_{k+r}(A) \le \lambda_{k+2r}(A+B)$, $k = 1, 2, \dots, n-2r$
(b) $\lambda_k(A) \le \lambda_{k+r}(A+B) \le \lambda_{k+2r}(A)$, $k = 1, 2, \dots, n-2r$
(c) If $A = U \Lambda U^*$ with $U = [u_1, \dots, u_n]$ unitary and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ arranged in increasing order, and if
$$B = \lambda_n u_n u_n^* + \lambda_{n-1} u_{n-1} u_{n-1}^* + \cdots + \lambda_{n-r+1} u_{n-r+1} u_{n-r+1}^*$$
then $\lambda_{\max}(A - B) = \lambda_{n-r}(A)$.
Corollary: By applying the above theorem repeatedly we get: if $B$ has rank at most $r$, then
$$\lambda_{k-r}(A) \le \lambda_k(A+B) \le \lambda_{k+r}(A).$$
This intrinsically requires $k - r \ge 1$ and $k + r \le n$.
Theorem: For $A, B$ Hermitian,
(a) if $1 \le j, k \le n$ and $j + k \ge n + 1$, then
$$\lambda_{j+k-n}(A+B) \le \lambda_j(A) + \lambda_k(B);$$
(b) if $j + k \le n + 1$, then
$$\lambda_j(A) + \lambda_k(B) \le \lambda_{j+k-1}(A+B).$$
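A randomized check of the first Weyl inequality (my own sketch; the seed and matrix size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
X, Y = rng.standard_normal((2, 5, 5))
A, B = (X + X.T) / 2, (Y + Y.T) / 2        # real symmetric, hence Hermitian

a = np.linalg.eigvalsh(A)                  # eigvalsh sorts increasingly
b = np.linalg.eigvalsh(B)
c = np.linalg.eigvalsh(A + B)

# lambda_k(A) + lambda_1(B) <= lambda_k(A+B) <= lambda_k(A) + lambda_n(B)
for k in range(5):
    assert a[k] + b[0] - 1e-8 <= c[k] <= a[k] + b[-1] + 1e-8
```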
15. INTERLACING EIGENVALUES
Let $\hat{A} \in M_{n+1}$ be Hermitian. Define $A \in M_n$, $y \in \mathbb{C}^n$ and $a \in \mathbb{R}$ so that
$$\hat{A} = \begin{pmatrix} A & y \\ y^* & a \end{pmatrix}$$
where the eigenvalues of $\hat{A}$ and $A$ are denoted $\{\lambda_k\}$ for $k = 1, \dots, n+1$ and $\{\mu_i\}$ for $i = 1, \dots, n$ (respectively), arranged in increasing order. Then
$$\lambda_1 \le \mu_1 \le \lambda_2 \le \mu_2 \le \cdots \le \mu_{n-1} \le \lambda_n \le \mu_n \le \lambda_{n+1}.$$
Note: This theorem sometimes specifies that $A$ is Hermitian, but that is automatic because every principal submatrix of a Hermitian matrix is Hermitian. Further, notice $a \in \mathbb{R}$, because the diagonal entries of a Hermitian matrix are real.
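A quick numerical check of the interlacing pattern (my own sketch; the bordered matrix is random):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 5))
A_hat = (X + X.T) / 2              # Hermitian (real symmetric), here n + 1 = 5
A = A_hat[:4, :4]                  # leading principal submatrix in M_n

lam = np.linalg.eigvalsh(A_hat)    # lambda_1 <= ... <= lambda_5
mu = np.linalg.eigvalsh(A)         # mu_1 <= ... <= mu_4

# lambda_i <= mu_i <= lambda_{i+1} for each i
for i in range(4):
    assert lam[i] - 1e-8 <= mu[i] <= lam[i + 1] + 1e-8
```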
16. GERSGORIN DISC THEOREM
This theorem gives us an idea of the location of the eigenvalues of a matrix. Furthermore, it allows us to say things about the eigenvalues of a matrix without computing them.
Theorem: Let $A = [a_{ij}] \in M_n$ and let
$$R_i' := \sum_{\substack{j=1 \\ j \ne i}}^n |a_{ij}|, \quad 1 \le i \le n$$
denote the deleted absolute row sums of $A$. Then all the eigenvalues of $A$ are located in the union of $n$ discs
$$\bigcup_{i=1}^n \{z \in \mathbb{C} : |z - a_{ii}| \le R_i'\} \equiv G(A),$$
in other words, $\sigma(A) \subseteq G(A)$. Furthermore, if a union of $k$ of these $n$ discs forms a connected region that is disjoint from all the remaining $n - k$ discs, then there are precisely $k$ eigenvalues of $A$ in this region.
Corollary: All the eigenvalues of $A$ are located in the union of $n$ discs
$$\bigcup_{j=1}^n \{z \in \mathbb{C} : |z - a_{jj}| \le C_j'\} = G(A^T)$$
where $C_j' := \sum_{i=1, i \ne j}^n |a_{ij}|$ is the deleted absolute column sum. Furthermore, if a union of $k$ of these $n$ discs forms a connected region that is disjoint from all the remaining $n - k$ discs, then there are precisely $k$ eigenvalues of $A$ in this region.
Since similar matrices have the same eigenvalues, you can sometimes find a better estimate on the location of a matrix's eigenvalues by looking at a similar matrix. A convenient choice is a diagonal matrix $D = \mathrm{diag}(p_1, \dots, p_n)$ with each $p_i > 0$. One can then easily calculate $D^{-1}AD = [p_j a_{ij} / p_i]$ and apply Gersgorin to $D^{-1}AD$ and its transpose to yield:
Corollary: All the eigenvalues of $A$ lie in the region
$$\bigcup_{i=1}^n \Big\{ z \in \mathbb{C} : |z - a_{ii}| \le \frac{1}{p_i} \sum_{\substack{j=1 \\ j \ne i}}^n p_j |a_{ij}| \Big\} = G(D^{-1}AD)$$
as well as in the region
$$\bigcup_{j=1}^n \Big\{ z \in \mathbb{C} : |z - a_{jj}| \le p_j \sum_{\substack{i=1 \\ i \ne j}}^n \frac{|a_{ij}|}{p_i} \Big\} = G[(D^{-1}AD)^T]$$
16.1. Eigenvalue Trapping.
$$\sigma(A) \subseteq \bigcap_D G(D^{-1}AD)$$
where $D$ ranges over positive diagonal matrices. Therefore you can use different $D$'s to attempt to trap eigenvalues by taking intersections.
16.2. Strict Diagonal Dominance. Definition: A matrix is strictly diagonally dominant if
$$|a_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^n |a_{ij}| = R_i' \quad \text{for all } i = 1, \dots, n.$$
Theorem: If a square matrix $A$ is strictly diagonally dominant, then
$A$ is invertible.
If all the main diagonal entries of $A$ are positive, then all the eigenvalues of $A$ have positive real part.
If $A$ is Hermitian and all main diagonal entries of $A$ are positive, then all the eigenvalues of $A$ are real and positive.
17. POSITIVE (SEMI)DEFINITE MATRICES
In the following section, statements in parentheses run in parallel; i.e., if I write "the grass (sky) is green (blue)," I'm saying the grass is green and the sky is blue.
17.1. Characterizations.
By definition, a positive definite (and semidefinite) matrix is Hermitian; this is proved below by a lemma.
The trace, determinant, and all principal minors (determinants of principal submatrices) of a positive definite (semidefinite) matrix are positive (nonnegative).
Lemma: If $x^*Ax \in \mathbb{R}$ for all $x \in \mathbb{C}^n$, then $A$ is Hermitian.
Proof. $e_i^*Ae_i = a_{ii} \in \mathbb{R}$. Next look at $(e_i + e_j)^*A(e_i + e_j) = a_{ii} + a_{jj} + a_{ij} + a_{ji} \in \mathbb{R}$, which gives $\mathrm{Im}(a_{ij}) = -\mathrm{Im}(a_{ji})$. Similarly $(ie_i + e_j)^*A(ie_i + e_j) = a_{ii} + a_{jj} - ia_{ij} + ia_{ji} \in \mathbb{R}$, which gives $\mathrm{Re}(a_{ij}) = \mathrm{Re}(a_{ji})$. Therefore $a_{ij} = \overline{a_{ji}}$, so $A$ is Hermitian.
This lemma in particular shows that both positive semidefinite and positive definite matrices are Hermitian.
A Hermitian matrix $A$ is positive (semi)definite if and only if all of its eigenvalues are positive (nonnegative).
The preceding statement proves that all powers $A^k$ of a positive semidefinite matrix are also positive semidefinite.
A Hermitian matrix is positive definite if and only if all leading principal minors of $A$ are positive: for every leading principal submatrix $A_i := A(\{1, 2, \dots, i\})$, $i = 1, \dots, n$, we have $\det A_i > 0$.
If $A$ is nonsingular, then $AA^*$ and $A^*A$ are both positive definite. If $A$ is singular, they are positive semidefinite.
17.2. Finding the square roots of positive semidefinite matrices. Theorem: Let $A \in M_n$ be positive semidefinite and let $k \ge 1$ be a given integer. Then there exists a unique positive semidefinite Hermitian matrix $B$ such that $B^k = A$. We also have
(a) $BA = AB$, and there is a polynomial $p(t)$ such that $B = p(A)$
(b) $\mathrm{rank}\, B = \mathrm{rank}\, A$, so $B$ is positive definite if $A$ is
(c) $B$ is real if $A$ is real
Uniqueness is a key property that is sometimes used in proofs of polar form theorems!
Algorithm to Find the $k$th Root: The idea is that $A$ can be unitarily diagonalized, so $A = U \Lambda U^*$. Define $\Lambda^{1/k} = \mathrm{diag}(\lambda_1^{1/k}, \dots, \lambda_n^{1/k})$ and then $B = U \Lambda^{1/k} U^*$. Therefore $B^k = A$.
Example: Consider the matrix $A = \begin{pmatrix} 5 & 3 \\ 3 & 5 \end{pmatrix}$. It may be computed that the eigenvalues are $8, 2$ and that $x_1 = (1, 1)^T \in N(A - 8I)$ and $x_2 = (1, -1)^T \in N(A - 2I)$. Therefore we notice $x_1 \perp x_2$, and so we simply normalize them and place them as the columns of $U$ in the order the eigenvalues show up in our desired $\Lambda$. Finally we compute $A^{1/2} = U \Lambda^{1/2} U^*$.
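The worked example can be reproduced in numpy; `eigh` returns the orthonormal eigenvectors directly, so it plays the role of $U$ (note it orders the eigenvalues increasingly, $2$ before $8$):

```python
import numpy as np

A = np.array([[5.0, 3.0], [3.0, 5.0]])

# Unitary (here orthogonal) diagonalization: eigh gives increasing eigenvalues
# with orthonormal eigenvector columns.
lam, U = np.linalg.eigh(A)
assert np.allclose(lam, [2.0, 8.0])

B = U @ np.diag(np.sqrt(lam)) @ U.T       # B = U Lambda^{1/2} U*

assert np.allclose(B @ B, A)              # B^2 = A
assert np.allclose(B, B.T)                # B is Hermitian
assert np.all(np.linalg.eigvalsh(B) > 0)  # and positive definite
```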
17.3. Schur's Product Theorem and Similar Things. Theorem: If $A = (a_{ij})$ and $B = (b_{ij})$ are positive (semi)definite, then the Hadamard product $A \circ B = (a_{ij} b_{ij})_{i,j=1}^n$ is positive (semi)definite.
Lemma: The sum of a positive definite matrix $A$ and a positive semidefinite matrix $B$ is positive definite.
Proof. For $x \ne 0$, $x^*(A + B)x = x^*Ax + x^*Bx > 0$ since $x^*Ax > 0$ and $x^*Bx \ge 0$.

18. SINGULAR VALUE DECOMPOSITION AND POLAR FORM
18.1. Polar Form. Let $A \in M_{m,n}$ with $m \le n$. Then $A$ may be written
$$A = PU$$
where $P \in M_m$ is positive semidefinite with $\mathrm{rank}\, P = \mathrm{rank}\, A$, and $U \in M_{m,n}$ has orthonormal rows (that is, $UU^* = I$). The matrix $P$ is always uniquely determined as $P = (AA^*)^{1/2}$, and $U$ is uniquely determined when $A$ has rank $m$. If $A$ is real, then $U, P$ may be taken to be real as well.
An important special case:
Corollary!! If $A \in M_n$ then
$$A = PU$$
where $P = (AA^*)^{1/2}$ is positive semidefinite and $U$ is unitary. $P$ is uniquely determined; if $A$ is nonsingular, then $U$ is uniquely determined as $U = P^{-1}A$. If $A$ is real, then $P, U$ may be taken to be real.
Necessarily $P$ is Hermitian in the square case!
Let $A = PU$. Then $A$ is normal if and only if $PU = UP$. The forward direction comes from uniqueness of square roots, and the converse from noticing $P$ is Hermitian and comparing $AA^*$ with $A^*A$.
18.2. Misc. Results.
For any square matrix $A$, $A^*A$ is unitarily similar to $AA^*$, and therefore they have the same eigenvalues. This can be seen using the trace criterion for unitary equivalence (traces of words in two letters).
18.3. Singular Value Decomposition. The singular value decomposition may be seen as a substitute eigenvalue decomposition for matrices which are not normal (or not square). For positive semidefinite (Hermitian) matrices, the SVD is the eigenvalue decomposition.
Theorem: If $A \in M_{m,n}$ has rank $k$, it may be written in the form $A = V \Sigma W^*$, where:
The matrices $V \in M_m$ and $W \in M_n$ are unitary.
$\Sigma = [\sigma_{ij}] \in M_{m,n}$ has $\sigma_{ij} = 0$ for $i \ne j$ and $\sigma_{11} \ge \sigma_{22} \ge \cdots \ge \sigma_{kk} > \sigma_{k+1,k+1} = \cdots = \sigma_{qq} = 0$, where $q = \min\{n, m\}$.
The numbers $\{\sigma_{ii}\} \equiv \{\sigma_i\}$ are the nonnegative square roots of the eigenvalues of $AA^*$, and hence are uniquely determined. (It is also true that they are the nonnegative square roots of the eigenvalues of $A^*A$.)
The columns of $V$ are eigenvectors of $AA^*$ and the columns of $W$ are eigenvectors of $A^*A$ (arranged in the same order as the corresponding eigenvalues $\sigma_i^2$). [This follows because each matrix is Hermitian, and so has an orthonormal set of eigenvectors, yielding unitary matrices.]
If $m \le n$ and $AA^*$ has distinct eigenvalues, then $V$ is determined up to a right diagonal factor $D = \mathrm{diag}(e^{i\theta_1}, \dots, e^{i\theta_m})$. That is, if $A = V_1 \Sigma W_1^* = V_2 \Sigma W_2^*$ then $V_2 = V_1 D$.
If $m < n$, then $W$ is never uniquely determined.
If $n = m = k$ and $V$ is given, then $W$ is uniquely determined.
If $n \le m$, the uniqueness of $V$ and $W$ is determined by considering $A^*$.
If $A$ is real, then $V$, $\Sigma$, and $W$ may be taken to be real.
$\mathrm{tr}(A^*A) = \sum_i \sigma_i^2$.
18.4. Computing the Singular Value Decomposition. For a nonsingular square matrix $A \in M_n$:
(a) Form the positive definite Hermitian matrix $AA^*$ and compute the unitary diagonalization $AA^* = U \Lambda U^*$ by finding the (positive) eigenvalues $\{\lambda_i\}$ of $AA^*$ and a corresponding set $\{u_i\}$ of normalized eigenvectors.
(b) Set $\Sigma = \Lambda^{1/2}$ and $V = [u_1, \dots, u_n]$.
(c) Set $W = A^* V \Sigma^{-1}$.
Notes:
If $A$ were singular, then $AA^*$ would be positive semidefinite, and so $\Sigma$ would not be invertible. Therefore it would be necessary to compute the eigenvectors of $A^*A$ to find $W$. The eigenvectors must be arranged in the same order as the singular values.
The singular values of a normal matrix are the absolute values of its eigenvalues. Besides the usual way, since $V, W$ are not unique, the columns of $V$ can also be the eigenvectors of $A$. In this case $V$ is determined by the eigenvectors of $A$, but $V \ne W$. To find $W$, notice each $\lambda_k = |\lambda_k| e^{i\theta_k}$. So let $D = \mathrm{diag}(e^{i\theta_1}, \dots, e^{i\theta_n})$. Then $A = U \Lambda U^* = U |\Lambda| D U^* = U |\Lambda| (U \bar{D})^*$, so $W = U \bar{D}$.
If $A \in M_n$ is singular, then it has at least one zero singular value. When this occurs, augment $U$ and $W$ as necessary by choosing orthogonal vectors to make them square as well.
In fact, whenever you come up short, always augment $U$ and $W$ with orthogonal vectors.
If the matrix is not square, or is singular, the process will not be fun.
18.5. Notes on Nonsquare Matrices. When computing the singular value decomposition for nonsquare matrices, there are a few things to keep in mind:
Although you only use the nonzero eigenvalues for the singular values, you still use the eigenvectors associated with any zero eigenvalues. $AA^*$ and $A^*A$ have the same nonzero eigenvalues.
The matrix $V$ should be the same size as $AA^*$, and similarly $W$ the same size as $A^*A$.
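Steps (a)-(c) for a nonsingular square matrix, carried out in numpy (the $3 \times 3$ example matrix is my own; `eigh` supplies the unitary diagonalization of $AA^*$):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])     # nonsingular, so AA* is positive definite

# (a) unitarily diagonalize AA*: AA* = V Lambda V*
lam, V = np.linalg.eigh(A @ A.T)
lam, V = lam[::-1], V[:, ::-1]      # reorder so the singular values decrease

# (b) Sigma = Lambda^{1/2}
Sigma = np.diag(np.sqrt(lam))

# (c) W = A* V Sigma^{-1}
W = A.T @ V @ np.linalg.inv(Sigma)

assert np.allclose(A, V @ Sigma @ W.T)   # A = V Sigma W*
assert np.allclose(W.T @ W, np.eye(3))   # W is unitary (orthogonal here)
```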
19. POSITIVE MATRICES AND PERRON-FROBENIUS
If $|A| \le |B|$ entrywise, then $\|A\|_2 \le \|B\|_2$ (the Frobenius norm, as defined above), and $\|A\|_2 = \| \, |A| \, \|_2$.
If $|A| \le B$, then $\rho(A) \le \rho(|A|) \le \rho(B)$.
For $A \ge 0$: if all the row sums of $A$ are equal, then $\rho(A) = |||A|||_\infty$. If all the column sums of $A$ are equal, then $\rho(A) = |||A|||_1$.
Theorem: Let $A \in M_n$ and suppose $A \ge 0$. Then
$$\min_{1 \le i \le n} \sum_{j=1}^n a_{ij} \le \rho(A) \le \max_{1 \le i \le n} \sum_{j=1}^n a_{ij}$$
$$\min_{1 \le j \le n} \sum_{i=1}^n a_{ij} \le \rho(A) \le \max_{1 \le j \le n} \sum_{i=1}^n a_{ij}$$
This says $\rho(A)$ lies between the biggest and smallest row sums of $A$, and between the biggest and smallest column sums of $A$.

Let $A \in M_n$ and $A \ge 0$. Then for any positive $x \in \mathbb{C}^n$ we have
$$\min_{1 \le i \le n} \frac{1}{x_i} \sum_{j=1}^n a_{ij} x_j \le \rho(A) \le \max_{1 \le i \le n} \frac{1}{x_i} \sum_{j=1}^n a_{ij} x_j$$
$$\min_{1 \le j \le n} x_j \sum_{i=1}^n \frac{a_{ij}}{x_i} \le \rho(A) \le \max_{1 \le j \le n} x_j \sum_{i=1}^n \frac{a_{ij}}{x_i}$$
Another way to view the first is:
$$\min_{1 \le i \le n} \frac{1}{x_i} (Ax)_i \le \rho(A) \le \max_{1 \le i \le n} \frac{1}{x_i} (Ax)_i$$
Theorem (Perron): If $A \in M_n$ and $A > 0$ (i.e. $a_{ij} > 0$ for all $i, j$) then
(a) $\rho(A) > 0$
(b) $\rho(A)$ is an eigenvalue of $A$
(c) There is an $x \in \mathbb{C}^n$ with $x > 0$ and $Ax = \rho(A)x$
(d) $\rho(A)$ is an algebraically (and hence geometrically) simple eigenvalue of $A$
(e) $|\lambda| < \rho(A)$ for every eigenvalue $\lambda \ne \rho(A)$; that is, $\rho(A)$ is the unique eigenvalue of maximum modulus
(f) $[\rho(A)^{-1}A]^m \to L$ as $m \to \infty$, where $L = xy^T$, $Ax = \rho(A)x$, $A^Ty = \rho(A)y$, $x > 0$, $y > 0$ and $x^Ty = 1$.
Lemma: If $A \ge 0$ (i.e. $a_{ij} \ge 0$ for all $i, j$) and all row sums are equal to $c$ ($\sum_{k=1}^n a_{ik} = c$ for all $i$), then $\rho(A) = c$.
19.1. Nonnegative matrices. If $A \in M_n$ and $A \ge 0$, then $\rho(A)$ is an eigenvalue of $A$ and there is a nonnegative eigenvector $x$ ($x \ge 0$, $x \ne 0$) associated with it.
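Perron's theorem, especially conclusion (f), is easy to watch numerically (my own sketch; the $2 \times 2$ positive matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])      # A > 0 entrywise
evals, evecs = np.linalg.eig(A)
rho = max(abs(evals))

# (b), (c): rho(A) is itself an eigenvalue, with a positive eigenvector.
i = int(np.argmax(abs(evals)))
assert np.isclose(evals[i].real, rho)
x = evecs[:, i].real
x = x / x.sum()                             # normalize; entries share one sign
assert np.all(x > 0)

# (f): [rho(A)^{-1} A]^m converges to a rank-one limit L = x y^T.
M = np.linalg.matrix_power(A / rho, 100)
assert np.linalg.matrix_rank(M, tol=1e-8) == 1
assert np.allclose(M, np.linalg.matrix_power(A / rho, 101))  # it has stabilized
```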
20. LOOSE ENDS
20.1. Cauchy-Schwarz Inequality. $|\langle x, y \rangle| \le \langle x, x \rangle^{1/2} \langle y, y \rangle^{1/2}$, which can also be written:
$$|\langle x, y \rangle| \le \|x\| \, \|y\|$$
20.2. Random Facts.
$e_i^* A e_j = a_{ij}$
If $T$ is triangular and the diagonal entries of $T^*T$ are the same as those of $TT^*$ (in order), then $T$ is diagonal. (This is used to prove the spectral theorem for normal matrices.)
Matrices of the form $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ commute! This is useful sometimes for solving ODEs.
20.3. Groups of Triangular Matrices. The set of invertible upper triangular matrices forms a group under matrix multiplication, and similarly for invertible lower triangular matrices. Therefore we have:
The inverse of an invertible upper triangular matrix is upper triangular.
The inverse of an invertible lower triangular matrix is lower triangular.

20.4. Block Diagonal Matrices.
The trace of a block diagonal matrix is the sum of the traces of the diagonal blocks.
The determinant of a block diagonal matrix is the product of the determinants of the diagonal blocks. (Easy proof: such a matrix is similar to a matrix with the blocks' Jordan forms along the diagonal.)
20.5. Identities Involving the Trace.
For any matrices $C, D$ of compatible sizes, $\mathrm{tr}(CD) = \mathrm{tr}(DC)$. This can be used to deduce that the trace is similarity invariant.
$\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$
$\sum_{i,j} |a_{ij}|^2 = \mathrm{tr}(A^*A)$
If $\sigma(A) = \{\lambda_1, \dots, \lambda_n\}$, then $\mathrm{tr}(A^k) = \sum_{i=1}^n \lambda_i^k$ for $k \in \mathbb{N}$.
$\mathrm{tr}(A) = \sum_{i=1}^n e_i^* A e_i$
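The identities above are cheap to confirm in numpy (my own sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(5)
C = rng.standard_normal((3, 4))
D = rng.standard_normal((4, 3))

# tr(CD) = tr(DC), even for non-square factors of compatible sizes.
assert np.isclose(np.trace(C @ D), np.trace(D @ C))

A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)

# sum_{i,j} |a_ij|^2 = tr(A* A)
assert np.isclose((np.abs(A) ** 2).sum(), np.trace(A.T @ A))

# tr(A^k) = sum_i lambda_i^k
for k in (1, 2, 3):
    assert np.isclose(np.trace(np.linalg.matrix_power(A, k)),
                      np.sum(lam ** k))
```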

21. EXAMPLES
Having the same eigenvalues is necessary but NOT sufficient for similarity. Consider $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$. These both have eigenvalue $0$ with multiplicity $2$, but they are not similar.


$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ is not diagonalizable. If it were, it would be similar to the $0$ matrix, which is untrue since the only matrix similar to the $0$ matrix is itself. Also, $\dim \mathrm{Ker}(A - 0I) = 1$, so there is only one eigenvector (up to scaling) associated with $0$; but to be diagonalizable there must be $2$ since $A$ is $2 \times 2$.
Consider $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and the $2 \times 2$ identity $I$. These two matrices commute but are not simultaneously diagonalizable. This is because the first matrix is not diagonalizable.


Consider two matrices where $AB$ is diagonalizable but $BA$ is not: $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$. Then $AB = 0$ is diagonalizable, while $BA = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ is not.




$\begin{pmatrix} 3 & 1 \\ -2 & 0 \end{pmatrix}$ and $\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$ are similar (both have eigenvalues $1$ and $2$) but not unitarily equivalent (their entries have different sums of squares).
The following two matrices satisfy $\sum_{i,j} |a_{ij}|^2 = \sum_{i,j} |b_{ij}|^2$ but are not unitarily equivalent; in fact they are not even similar:
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$
($A$ is nilpotent while $B$ has eigenvalue $1$.)
$A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ is a real 2-by-2 matrix that is not normal, since $A^*A \ne AA^*$.

$A = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$ is a real 2-by-2 matrix that is normal but not symmetric, skew-symmetric, or orthogonal (real unitary).




$A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ and $B = \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix}$. $A$ and $B$ commute, so $\sigma(A + B) \subseteq \sigma(A) + \sigma(B)$. But $1 + 4 = 5 \in \sigma(A) + \sigma(B)$ while $\sigma(A + B) = \{4, 6\}$. So $\sigma(A + B) \ne \sigma(A) + \sigma(B)$ in general.




$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$. $A$ and $B$ do not commute, and clearly $\sigma(A) + \sigma(B) = \{0\}$, but $\sigma(A + B) = \{1, -1\}$. So $\sigma(A + B) \not\subseteq \sigma(A) + \sigma(B)$.
For $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, the rank of $A$ (which is $1$) is bigger than the number of nonzero eigenvalues (which is $0$). (This can happen when $A$ is not Hermitian.) Furthermore, the geometric multiplicity ($1$) is strictly smaller than the number of eigenvalues counting multiplicity ($2$), and $A$ has only one eigenvector (up to scaling).



$A = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$ and $B = \begin{pmatrix} i & i \\ i & 1 \end{pmatrix}$ are both complex symmetric matrices, but one ($A$) is normal and one ($B$) is not.
$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $A^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ do not have the same null space. (For normal matrices they do.)

$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ is positive semidefinite but not positive definite.
$A = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}$ has all nonnegative leading principal minors but is not positive semidefinite. (So the corresponding theorem for positive definite matrices does not apply to psd matrices; for positive semidefiniteness one needs all principal minors, not just the leading ones.)


$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ is not normal, and indeed $\sum_{i,j} |a_{ij}|^2 = \mathrm{tr}(A^*A) = 1 > 0 = \sum_i |\lambda_i(A)|^2$; equality here (Schur's inequality) characterizes normal matrices.
$AB$ and $BA$ can have the same eigenvalues and characteristic polynomial, but not the same minimal polynomial, i.e. different Jordan forms, i.e. not similar to each other:
$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
For Hermitian matrices, the rank of $A$ is equal to the number of nonzero eigenvalues. This is not true for non-Hermitian matrices: consider $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.
Weyl's inequality (the first one) can fail if $A, B$ are not Hermitian. Consider $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$.
Matrices with the same minimal polynomial and characteristic polynomial NEED NOT be similar for matrices of order $\ge 4$. Example: two nilpotent $4 \times 4$ matrices with zeroes on the diagonal, where one has two Jordan blocks of order $2$, and the other has only one block of order $2$ (and two of order $1$); both have characteristic polynomial $t^4$ and minimal polynomial $t^2$.

Idempotent matrices can only have eigenvalues $0$ or $1$. A $2 \times 2$ idempotent matrix that is not the zero matrix or the identity is $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.
