6. EIGENVALUES AND EIGENVECTORS

The simplest of matrices are the diagonal ones. This naturally leads us
to the problem of investigating the existence and construction of a
suitable basis with respect to which the matrix associated to a given
linear transformation is diagonal.

Definition
An n × n matrix A is called diagonalizable if there exists an invertible
n × n matrix M such that M⁻¹AM is a diagonal matrix. A linear map
f : V → V is called diagonalizable if the matrix associated to f with
respect to some basis is diagonal.

Remark
(i) Clearly, f is diagonalizable iff the matrix associated to f with
respect to some (equivalently, any) basis is diagonalizable.

Remark
(ii) Let {v1 , . . . , vn } be a basis. The matrix Mf of a linear
transformation f w.r.t. this basis is diagonal iff f (vi ) = λi vi ,
1 ≤ i ≤ n for some scalars λi . Naturally a question here is: does
there exist such a basis for a given linear transformation?

Definition
Given a linear map f : V → V we say v ∈ V is an eigenvector for f if
v ≠ 0 and f (v) = λv for some λ ∈ K. In that case λ is called an
eigenvalue of f .

For a square matrix A we say λ is an eigenvalue if there exists a
nonzero column vector v such that Av = λv. Of course v is then called
an eigenvector of A corresponding to λ.

Proposition
Suppose A is an n × n matrix. Let A have n distinct eigenvalues
λ1 , λ2 , . . . , λn . Let C be the matrix whose column vectors are
respectively v1 , v2 , . . . , vn where vi is an eigenvector for λi for
i = 1, 2, . . . , n.
Then C⁻¹AC = D(λ1 , . . . , λn ), the diagonal matrix with diagonal entries λ1 , . . . , λn .

Proof: Let D := D(λ1 , . . . , λn ). It is enough to prove AC = CD. For
j = 1, 2, . . . , n, let C^j (= vj ) denote the j-th column of C, and similarly
for the other matrices. Then

(AC)^j = A C^j = Avj = λj vj .

Similarly,
(CD)^j = C D^j = λj vj .
Hence AC = CD as required. To complete the proof, we need to show
that C is invertible. This follows from a result proved later, which says
that eigenvectors corresponding to distinct eigenvalues are linearly
independent. In particular, C has n linearly independent columns, so
rank(C) = n and C is invertible. □
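A quick numerical illustration of this proposition (an addition to the notes, not part of the original slides): the Python/NumPy sketch below builds C from the eigenvectors returned by numpy.linalg.eig and checks that C⁻¹AC is diagonal. The 3 × 3 matrix A used here is an arbitrary example with distinct eigenvalues, chosen only for illustration.

import numpy as np

# A 3x3 matrix with three distinct eigenvalues (chosen for illustration).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

eigenvalues, C = np.linalg.eig(A)   # columns of C are eigenvectors v1, ..., vn
D = np.linalg.inv(C) @ A @ C        # should equal D(lambda_1, ..., lambda_n)

print(np.round(D, 10))                        # off-diagonal entries are (numerically) zero
print(np.allclose(D, np.diag(eigenvalues)))   # True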
Remark
(i) It is easy to see that the eigenvalues and eigenvectors of a linear
transformation are the same as those of the associated matrix.
(ii) Even if a linear map is not diagonalizable, the existence of
eigenvectors and eigenvalues itself throws some light on the
nature of the linear map.
They arise naturally in the study of differential equations. Here we
shall use them to address the problem of diagonalization and then
see some geometric applications of diagonalization itself.

Definition
For a square matrix A, the eigenspace of A w.r.t. an eigenvalue λ is
defined as
EA (λ) := {v : Av = λv}.

Characteristic Polynomial
Proposition
(1) Eigenvalues of a square matrix A are solutions of the equation
det(A − λI) = 0.
(2) The null space of A − λI is equal to the eigenspace.

Proof:
(1) If v is an eigenvector of A then v ≠ 0 and Av = λv for some scalar
λ. Hence (A − λI)v = 0.
Thus the nullity of A − λI is positive, so rank(A − λI) is less
than n, and hence det(A − λI) = 0.
(2)

EA (λ) = {v ∈ V : Av = λv}
= {v ∈ V : (A − λI)v = 0} = N (A − λI).   □
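As a small computational aside (added here for illustration, not from the original slides), the coefficients of the characteristic polynomial and a basis of the eigenspace N(A − λI) can be obtained numerically; the sketch below uses NumPy, with the eigenspace read off from the singular value decomposition of A − λI.

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# numpy.poly returns the coefficients of det(lambda*I - A),
# which is (-1)^n det(A - lambda*I); the roots are the same.
print(np.poly(A))             # [ 1. -4.  3.]  ->  lambda^2 - 4*lambda + 3

lam = 1.0                     # one of the eigenvalues
M = A - lam * np.eye(2)

# Null space of M = eigenspace E_A(lambda): right singular vectors
# belonging to (numerically) zero singular values.
_, s, Vt = np.linalg.svd(M)
print(Vt[s < 1e-10])          # spans E_A(1); here a multiple of (1, 0)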
Definition
For any square matrix A, the polynomial det(A − λI) in λ is called the
characteristic polynomial of A and is denoted by χA (λ).

Example
 
(1) A = [ 1 2 ; 0 3 ]. To find the eigenvalues of A, we solve the equation
det(A − λI) = det [ 1 − λ  2 ; 0  3 − λ ] = (1 − λ)(3 − λ) = 0.
Hence the eigenvalues of A are 1 and 3. Let us calculate the
eigenspaces E(1) and E(3).

By definition E(1) = {v : (A − I)v = 0} and E(3) = {v : (A − 3I)v = 0}.

Example
 
A − I = [ 0 2 ; 0 2 ]. Hence (x, y)ᵀ ∈ E(1) iff
[ 0 2 ; 0 2 ] [ x ; y ] = [ 2y ; 2y ] = [ 0 ; 0 ]. Hence E(1) = L({(1, 0)}).
A − 3I = [ 1 − 3  2 ; 0  3 − 3 ] = [ −2 2 ; 0 0 ].
Suppose [ −2 2 ; 0 0 ] [ x ; y ] = [ 0 ; 0 ].
Then [ −2x + 2y ; 0 ] = [ 0 ; 0 ].
This is possible iff x = y . Thus E(3) = L({(1, 1)}).
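For readers who want to check this example on a machine, here is a small NumPy sketch (added for illustration; it is not part of the original notes). It recovers the eigenvalues 1 and 3 and eigenvectors proportional to (1, 0) and (1, 1).

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

vals, vecs = np.linalg.eig(A)
print(vals)                    # [1. 3.]
# Each column of vecs is a unit eigenvector; rescale so the largest
# entry is 1 in order to compare with (1, 0) and (1, 1).
for k in range(2):
    v = vecs[:, k]
    print(vals[k], v / v[np.abs(v).argmax()])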

Example
 
(2) Let A = [ 3 0 0 ; −2 4 2 ; −2 1 5 ]. Then det(A − λI) = (3 − λ)²(6 − λ).
Hence eigenvalues of A are 3 and 6.
The eigenvalue λ = 3 is a double root of the characteristic polynomial
of A. We say that λ = 3 has algebraic multiplicity 2.
Let us find the eigenspaces E(3) and E(6).
 
λ = 3 : A − 3I = [ 0 0 0 ; −2 1 2 ; −2 1 2 ]. Hence rank(A − 3I) = 1.
Thus nullity (A − 3I) = 2. By solving the system (A − 3I)v = 0, we find
that
N (A − 3I) = EA (3) = L({(1, 0, 1), (1, 2, 0)}).
The dimension of EA (λ) is called the geometric multiplicity of λ. Hence
geometric multiplicity of λ = 3 is 2.

Example
 
λ = 6 : A − 6I = [ −3 0 0 ; −2 −2 2 ; −2 1 −1 ]. Hence rank(A − 6I) = 2. Thus
dim EA (6) = 1. (It can be shown that {(0, 1, 1)} is a basis of EA (6).)
Thus both the algebraic and geometric multiplicities of the eigenvalue
6 are equal to 1.
 
(3) A = [ 1 1 ; 0 1 ]. Then det(A − λI) = (1 − λ)². Thus λ = 1 has
algebraic multiplicity 2.
A − I = [ 0 1 ; 0 0 ]. Hence nullity(A − I) = 1 and EA (1) = L({e1 }). In
this case the geometric multiplicity is less than the algebraic multiplicity
of the eigenvalue 1.
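The multiplicities in examples (2) and (3) can be checked numerically. The following sketch (an illustrative addition, not from the slides) computes the algebraic multiplicity by counting repeated roots of the characteristic polynomial and the geometric multiplicity as n − rank(A − λI).

import numpy as np

def multiplicities(A, lam, tol=1e-8):
    """Return (algebraic, geometric) multiplicity of the eigenvalue lam of A."""
    n = A.shape[0]
    eigvals = np.linalg.eigvals(A)
    algebraic = int(np.sum(np.abs(eigvals - lam) < tol))
    geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
    return algebraic, geometric

A2 = np.array([[3.0, 0.0, 0.0],
               [-2.0, 4.0, 2.0],
               [-2.0, 1.0, 5.0]])
A3 = np.array([[1.0, 1.0],
               [0.0, 1.0]])

print(multiplicities(A2, 3.0))   # (2, 2)
print(multiplicities(A2, 6.0))   # (1, 1)
print(multiplicities(A3, 1.0))   # (2, 1): geometric < algebraic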

Remark
(i) Observe that χA (λ) = χ_{M⁻¹AM} (λ). Thus the characteristic
polynomial is an invariant of similarity.
Consequently, the characteristic polynomial of any linear map f : V → V is
also defined (where V is finite dimensional) by choosing some
basis for V , and then taking the characteristic polynomial of the
associated matrix M(f ) of f .
This definition does not depend upon the choice of the basis.

(ii) If we expand det(A − λI) we see that there is a term

(a11 − λ)(a22 − λ) . . . (ann − λ). This is the only term which
contributes to λ^n and λ^(n−1).
It follows that the degree of the characteristic polynomial is exactly
equal to n, the size of the matrix;
moreover, the coefficient of the top degree term is equal to (−1)^n.

Remark
(ii) (contd...) Thus in general, it has n complex roots, some of which
may be repeated, some of them real, and so on. All these patterns
are going to influence the geometry of the linear map.

(iii) If A is a real matrix then of course χA (λ) is a real polynomial. That,
however, does not allow us to conclude that it has real roots.
So while discussing eigenvalues we should consider even a real
matrix as a complex matrix and keep in mind the associated linear
map Cⁿ → Cⁿ. The problem of existence of real eigenvalues and
real eigenvectors will be discussed soon.

(iv) The coefficient of λ^(n−1) is equal to

(−1)^(n−1) (a11 + . . . + ann ) = (−1)^(n−1) tr A.

Lemma
Suppose A is a real matrix with a real eigenvalue λ. Then there exists
a real column vector v ≠ 0 such that Av = λv.

Proof: Start with Aw = λw where w is a nonzero column vector with
complex entries. Write w = v + ιv₀ where both v, v₀ are real vectors.

We then have Av + ιAv₀ = λ(v + ιv₀). Compare the real and imaginary
parts. Since w ≠ 0, at least one of the two vectors v, v₀ must be nonzero,
and that vector is a real eigenvector for λ, so we are done. □

Proposition
Let A be an n × n matrix with eigenvalues λ1 , λ2 , . . . , λn . Then
(i) tr (A) = λ1 + λ2 + . . . + λn .
(ii) det A = λ1 λ2 . . . λn .

Proof: The characteristic polynomial of A is

det(A − λI) = det [ a11 − λ  a12  . . .  a1n ;  a21  a22 − λ  . . .  a2n ;
                    . . . ;  an1  an2  . . .  ann − λ ]

= (−1)^n λ^n + (−1)^(n−1) λ^(n−1) (a11 + . . . + ann ) + . . .   (∗)

Put λ = 0 to get det A = constant term of det(A − λI).


Since λ1 , λ2 , . . . , λn are roots of det(A − λI) = 0 we have

det(A − λI) = (−1)^n (λ − λ1 )(λ − λ2 ) . . . (λ − λn )


det(A − λI) = (−1)^n (λ − λ1 )(λ − λ2 ) . . . (λ − λn )
= (−1)^n [λ^n − (λ1 + λ2 + . . . + λn )λ^(n−1) + . . . + (−1)^n λ1 λ2 . . . λn ].   (∗∗)
Comparing (∗) and (∗∗), we see that the constant term of det(A − λI) equals
λ1 λ2 . . . λn , so det A = λ1 λ2 . . . λn ; and comparing the coefficients of λ^(n−1) gives
tr (A) = a11 + a22 + . . . + ann = λ1 + λ2 + . . . + λn . □
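These two identities are easy to confirm numerically; the short check below (added here for illustration) compares the trace and determinant of a random matrix with the sum and product of its eigenvalues.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)        # complex in general, but sum and product are real

print(np.isclose(np.trace(A), lam.sum().real))           # True
print(np.isclose(np.linalg.det(A), np.prod(lam).real))   # True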

Proposition
Let v1 , v2 , . . . , vk be eigenvectors of a matrix A associated to distinct
eigenvalues λ1 , λ2 , . . . , λk . Then v1 , v2 , . . . , vk are linearly
independent.

Proof: Apply induction on k . It is clear for k = 1. Suppose k ≥ 2 and
c1 v1 + . . . + ck vk = 0 for some scalars c1 , c2 , . . . , ck . Applying A,
c1 Av1 + c2 Av2 + . . . + ck Avk = 0, that is,

c1 λ1 v1 + c2 λ2 v2 + · · · + ck λk vk = 0

Hence

λ1 (c1 v1 + c2 v2 + . . . + ck vk ) − (λ1 c1 v1 + λ2 c2 v2 + . . . + λk ck vk )

= (λ1 − λ2 )c2 v2 + (λ1 − λ3 )c3 v3 + . . . + (λ1 − λk )ck vk = 0.


By induction, v2 , v3 , . . . , vk are linearly independent. Hence
(λ1 − λj )cj = 0 for j = 2, 3, . . . , k .

Since λ1 ≠ λj for j = 2, 3, . . . , k , we get cj = 0 for j = 2, 3, . . . , k . Hence
c1 v1 = 0, and since v1 ≠ 0, c1 is also zero.
Thus v1 , v2 , . . . , vk are linearly independent. 2

Relation Between Algebraic and Geometric
Multiplicities

Definition
The algebraic multiplicity aA (µ) of an eigenvalue µ of a matrix A is
defined to be the multiplicity of µ as a root of the characteristic polynomial χA (λ).

Definition
The geometric multiplicity gA (µ) of an eigenvalue µ of A is defined to
be the dimension of the eigenspace EA (µ).

gA (µ) := dim EA (µ).

Proposition
Both the algebraic and the geometric multiplicities are
invariants of similarity.

Proof: We have already seen that for any invertible matrix C,

χA (λ) = χ_{C⁻¹AC} (λ). Thus the invariance of the algebraic multiplicity is
clear.

On the other hand, check that E_{C⁻¹AC} (λ) = C⁻¹ (EA (λ)). Therefore,

dim(E_{C⁻¹AC} (λ)) = dim C⁻¹ (EA (λ)) = dim(EA (λ)), the last equality
being a consequence of the invertibility of C. □

Proposition
Let A be an n × n matrix. Then the geometric multiplicity of an
eigenvalue µ of A is less than or equal to the algebraic multiplicity of µ.

Proof: Let g = gA (µ) be the geometric multiplicity of µ. Then EA (µ)
has a basis consisting of g eigenvectors v1 , v2 , . . . , vg .
We can extend this basis of EA (µ) to a basis of Cn , say
{v1 , v2 , . . . , vg , . . . , vn }. Let P be the matrix whose j-th column is P^j = vj .
Then P is an invertible matrix and

P⁻¹AP = [ µI_g  X ; 0  Y ]

where X is a g × (n − g) matrix and Y is an (n − g) × (n − g) matrix.

Therefore

det(A − λI) = det[P⁻¹ (A − λI)P]
= det(P⁻¹AP − λI)
= det((µ − λ)I_g ) det(Y − λI_{n−g} )
= (µ − λ)^g det(Y − λI_{n−g} ).

Let k = aA (µ). Then (λ − µ)^k divides det(A − λI), but (λ − µ)^(k+1) does
not. Thus g ≤ k . □
Remark
Observe that the geometric multiplicity and algebraic multiplicity of an
eigenvalue coincide for a diagonal matrix.

Since these concepts are similarity invariants, it is necessary that the


same is true for any matrix which is diagonalizable. This turns out to
be sufficient also.

Theorem
An n × n matrix A is diagonalizable if and only if for each eigenvalue µ ∈ C
of A, we have that the algebraic and geometric multiplicities are
equal:
aA (µ) = gA (µ).

Proof: We have already seen the necessity of the condition, i.e., if A is


diagonalizable, then we know that the algebraic and geometric
multiplicities are equal for each of the eigenvalues of A.

Proof Contd. : To prove the converse, suppose that the two
multiplicities coincide for each eigenvalue. Suppose that λ1 , λ2 , . . . , λk
are all the eigenvalues of A with algebraic multiplicities n1 , n2 , . . . , nk .
Then n1 + · · · + nk = n. Let

B1 = {v11 , v12 , . . . , v1n1 } be a basis of E(λ1 ),
. . .
Bk = {vk1 , vk2 , . . . , vknk } be a basis of E(λk ).

Observe that B = B1 ∪ B2 ∪ . . . ∪ Bk is a linearly independent set. [To
see this, note that if a linear combination Σᵢ Σⱼ cij vij = 0 (sum over
i = 1, . . . , k and j = 1, . . . , ni ), then v1 + · · · + vk = 0, where
vi = Σⱼ cij vij ∈ EA (λi ). Since λ1 , . . . , λk are distinct, we must have
vi = 0 for all i = 1, . . . , k (otherwise the nonzero vi would be eigenvectors
for distinct eigenvalues, hence linearly independent, contradicting
v1 + · · · + vk = 0). This implies cij = 0 for all i, j.] Since B has n elements,
it follows that B is a basis of Cⁿ.
Denote by P the matrix whose columns are the elements of B. Then P is
invertible and P⁻¹AP is the diagonal matrix with λ1 , . . . , λk on its
diagonal, with each λi repeated ni times. Hence A is diagonalizable. □
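The theorem suggests a direct computational test for diagonalizability: compare the two multiplicities at every eigenvalue. A sketch of such a test in NumPy is given below (an illustrative addition; the tolerance-based grouping of repeated eigenvalues is a simplification and can be fragile for badly conditioned matrices).

import numpy as np

def is_diagonalizable(A, tol=1e-8):
    """Check a_A(mu) == g_A(mu) for every eigenvalue mu of A (over C)."""
    n = A.shape[0]
    remaining = list(np.linalg.eigvals(A))
    while remaining:
        mu = remaining[0]
        # algebraic multiplicity: number of computed eigenvalues coinciding with mu
        algebraic = len([z for z in remaining if abs(z - mu) < tol])
        geometric = n - np.linalg.matrix_rank(A - mu * np.eye(n), tol=tol)
        if algebraic != geometric:
            return False
        remaining = [z for z in remaining if abs(z - mu) >= tol]
    return True

print(is_diagonalizable(np.array([[3.0, 0, 0], [-2, 4, 2], [-2, 1, 5]])))  # True
print(is_diagonalizable(np.array([[1.0, 1], [0, 1]])))                     # False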
7. INNER PRODUCT SPACES
The parallelogram law, the ability to measure the angle between two vectors
and, in particular, the concept of perpendicularity make the Euclidean
space quite a special type of vector space. Essentially all of these are
consequences of the dot product.

Definition
Let V be a vector space. By an inner product on V we mean a ‘binary
operation’ which associates a scalar, say ⟨u, v⟩, to each pair of vectors
u, v in V , satisfying the following properties for all u, v, w in V and all
scalars α, β.

(1) ⟨u, v⟩ = the complex conjugate of ⟨v, u⟩ (Hermitian property or conjugate symmetry);

(2) ⟨u, αv + βw⟩ = α⟨u, v⟩ + β⟨u, w⟩ (sesquilinearity);
(3) ⟨v, v⟩ > 0 if v ≠ 0 (positivity).

A vector space with an inner product is called an inner product space.

Remark
(i) The above definition includes both cases: V is a real vector
space or a complex vector space.
If K = R then the conjugation is just the identity.
Thus for real vector spaces, (1) becomes the ‘symmetric property’
(since for c ∈ R, we have c̄ = c).

(ii) Combining (1) and (2) we obtain

(2′) ⟨αu + βv, w⟩ = ᾱ⟨u, w⟩ + β̄⟨v, w⟩.
In the special case when K = R, (2) and (2′) together are called
‘bilinearity’.

Example
(i) The dot product in Rn is an inner product. This is often called the
standard inner product.

Example
(ii) Let x = (x1 , x2 , . . . , xn )ᵀ and y = (y1 , y2 , . . . , yn )ᵀ ∈ Cⁿ. Define
⟨x, y⟩ = Σᵢ x̄ᵢ yᵢ = x*y. This is an inner product on the complex
vector space Cⁿ.
(iii) Let V = C[0, 1]. For f , g ∈ V , put

⟨f , g⟩ = ∫₀¹ f (t)g(t) dt.

It is easy to see that ⟨f , g⟩ is an inner product.

Definition
Let V be an inner product space. For any v ∈ V , the norm of v,
denoted by ‖v‖, is the positive square root of ⟨v, v⟩:
‖v‖ = √⟨v, v⟩.

For the standard inner product in Rⁿ, ‖v‖ is the usual length of the vector v.
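To make these examples concrete, here is a small Python sketch (added for illustration, not part of the slides) computing the complex inner product of example (ii) and approximating the integral inner product of example (iii) numerically with a simple trapezoidal rule.

import numpy as np

# Example (ii): <x, y> = sum of conj(x_i) * y_i = x* y on C^n.
x = np.array([1 + 2j, 3j])
y = np.array([2 - 1j, 1 + 1j])
print(np.vdot(x, y))                  # vdot conjugates its first argument
print(np.sqrt(np.vdot(x, x).real))    # the norm ||x||

# Example (iii): <f, g> = integral over [0, 1] of f(t) g(t) dt on C[0, 1],
# approximated here on a fine grid.
t = np.linspace(0.0, 1.0, 10001)
f, g = t, t**2
y_vals = f * g
print(((y_vals[:-1] + y_vals[1:]).sum() * (t[1] - t[0]) / 2))  # approximately 1/4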


Norm Associated to an Inner Product
Proposition
Let V be an inner product space. Let u, v ∈ V and c be a scalar. Then
(i) ‖cu‖ = |c| ‖u‖;
(ii) ‖u‖ > 0 for u ≠ 0;
(iii) |⟨u, v⟩| ≤ ‖u‖ ‖v‖;
(iv) ‖u + v‖ ≤ ‖u‖ + ‖v‖; equality holds in this triangle inequality iff
u, v are linearly dependent over R.

Proof: (i) and (ii) follow from the definitions.

(iii) This is called the Cauchy-Schwarz inequality. If either u or v is
zero, it is clear. So let u be nonzero. Consider the vector

w = v − au,

where a is a scalar to be specified later.

Then

0 ≤ ⟨w, w⟩ = ⟨v − au, v − au⟩
= ⟨v, v⟩ − ⟨v, au⟩ − ⟨au, v⟩ + ⟨au, au⟩
= ⟨v, v⟩ − a⟨v, u⟩ − ā⟨u, v⟩ + āa⟨u, u⟩.

Substituting a = ⟨u, v⟩/⟨u, u⟩ and multiplying by ⟨u, u⟩, we get

|⟨u, v⟩|² ≤ ⟨u, u⟩⟨v, v⟩. This proves (iii).
(iv) Using (iii), we get Re⟨u, v⟩ ≤ ‖u‖ ‖v‖. Therefore,

‖u + v‖² = ‖u‖² + ⟨u, v⟩ + ⟨v, u⟩ + ‖v‖²
= ‖u‖² + 2 Re⟨u, v⟩ + ‖v‖²
≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²
= (‖u‖ + ‖v‖)².

Thus ‖u + v‖ ≤ ‖u‖ + ‖v‖.


Now equality holds in (iv) iff Re⟨u, v⟩ = ⟨u, v⟩ and equality holds in (iii).

⇔ ⟨u, v⟩ is real and w = 0.

⇔ v = au and ⟨v, u⟩ = ā⟨u, u⟩ is real.
⇔ v is a real multiple of u. □

Definition
In any inner product space V , the distance between u and v is defined
as
d(u, v) = ‖u − v‖.
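A quick numerical sanity check of (iii) and (iv) above (an illustrative addition, not part of the slides): for random complex vectors, |⟨u, v⟩| never exceeds ‖u‖‖v‖ and ‖u + v‖ never exceeds ‖u‖ + ‖v‖.

import numpy as np

rng = np.random.default_rng(1)
ok = True
for _ in range(1000):
    u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    v = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    ok &= abs(np.vdot(u, v)) <= nu * nv + 1e-12             # Cauchy-Schwarz
    ok &= np.linalg.norm(u + v) <= nu + nv + 1e-12          # triangle inequality
print(ok)   # True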

Distance and Angle
The distance function just defined is easily seen to satisfy the properties that
any reasonable notion of distance would be expected to satisfy:
Proposition
(i) d(u, v) ≥ 0 and d(u, v) = 0 iff u = v;
(ii) d(u, v) = d(v, u);
(iii) d(u, w) ≤ d(u, v) + d(v, w).

In an inner product space over R, we can also define the notion of an


angle.

Definition
Let V be a real inner product space. The angle between any nonzero
vectors u, v is defined to be the unique θ ∈ R with 0 ≤ θ ≤ π such that

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).

Gram-Schmidt Orthonormalization
The standard basis E = {e1 , e2 , . . . , en } of Rn has two properties :
(i) each ei has length one
(ii) any two elements of E are mutually perpendicular.
Now, let V be any inner product space of dimension d. We wish to
construct an analogue of E for V .

Definition
Two vectors u, v of an inner product space V are called orthogonal to
each other or perpendicular to each other if hu, vi = 0. We express this
symbolically by u ⊥ v.
A set S consisting of non zero elements of V is called orthogonal if
hu, vi = 0 for every pair of elements of S.
Further if every element of an orthogonal set S is of unit norm then S
is called an orthonormal set. If an orthonormal set S is a basis for V
then S is called an orthonormal basis.

Theorem
(Pythagoras Theorem) Let V be an inner product space. Given
w, u ∈ V such that w ⊥ u we have

‖w + u‖² = ‖w‖² + ‖u‖².

Proof: We have
‖w + u‖² = ⟨w + u, w + u⟩
= ⟨w, w⟩ + ⟨u, w⟩ + ⟨w, u⟩ + ⟨u, u⟩ = ‖w‖² + ‖u‖²
since ⟨u, w⟩ = ⟨w, u⟩ = 0. □

Proposition
Let S = {u1 , u2 , . . . , un } be an orthogonal set of nonzero vectors in an
inner product space V . Then S is linearly independent.

Proof: Suppose c1 , c2 , . . . , cn are scalars with

c1 u1 + c2 u2 + . . . + cn un = 0.

Take the inner product with ui on both sides to get

ci ⟨ui , ui ⟩ = 0.

Since ui ≠ 0, we get ci = 0. Thus S is linearly independent. □

Definition
If u and v are vectors in an inner product space V and v ≠ 0, then the
vector

(⟨v, u⟩ / ‖v‖²) v

is called the orthogonal projection of u along v.
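In coordinates this projection is one line of NumPy; the sketch below (illustrative, not from the slides) projects u along v using the standard inner product and checks that the remainder is orthogonal to v.

import numpy as np

def project_along(u, v):
    """Orthogonal projection of u along a nonzero vector v: (<v, u>/||v||^2) v."""
    return (np.vdot(v, u) / np.vdot(v, v)) * v

u = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, 1.0, 0.0])
p = project_along(u, v)
print(p)                       # [2. 2. 0.]
print(np.vdot(v, u - p))       # 0 (up to rounding): u - p is orthogonal to v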

Theorem
(Gram-Schmidt Orthonormalization Process): Let V be any inner
product space. Given a linearly independent subset {u1 , . . . , un } there
exists an orthonormal set of vectors {v1 , . . . , vn } such that

L({u1 , . . . , uk }) = L({v1 , . . . , vk })

for all 1 ≤ k ≤ n. In particular, if V is finite dimensional, then it has an


orthonormal basis.
Proof: The construction of the orthonormal set is algorithmic and
inductive. First we construct an intermediate orthogonal set

{w1 , . . . , wn }

with the same property for each k .

Then we ‘normalize’ each element, viz, take

vi = wi / ‖wi ‖.

That will give us an orthonormal set as required. The last part of the
theorem follows if we begin with any basis {u1 , . . . , un } of V .
Take w1 := u1 . So the construction is over for k = 1. To construct w2 ,
subtract from u2 its orthogonal projection along w1 .

Thus

w2 = u2 − ⟨w1 , u2 ⟩ w1 / ‖w1 ‖².

Check that ⟨w2 , w1 ⟩ = 0.
Thus {w1 , w2 } is an orthogonal set. Check also that
L({u1 , u2 }) = L({w1 , w2 }).
Now suppose we have constructed wi for i ≤ k < n as required. Put

wk+1 = uk+1 − Σⱼ ⟨wj , uk+1 ⟩ wj / ‖wj ‖²   (sum over j = 1, . . . , k).

Now check that wk+1 is orthogonal to w1 , . . . , wk and that
L({w1 , w2 , . . . , wk+1 }) = L({w1 , . . . , wk , uk+1 }) = L({u1 , . . . , uk+1 }).
By induction, this completes the proof. □
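The whole process translates almost verbatim into code. The sketch below (an illustrative addition, using the standard inner product on Rⁿ or Cⁿ) implements classical Gram-Schmidt exactly as in the proof: subtract the projections along the previously constructed w's, then normalize each column.

import numpy as np

def gram_schmidt(U):
    """Orthonormalize the linearly independent columns of U (classical Gram-Schmidt)."""
    n, m = U.shape
    W = np.zeros((n, m), dtype=complex)
    for k in range(m):
        w = U[:, k].astype(complex)
        for j in range(k):
            # subtract the orthogonal projection of u_{k+1} along w_j
            w = w - (np.vdot(W[:, j], U[:, k]) / np.vdot(W[:, j], W[:, j])) * W[:, j]
        W[:, k] = w
    return W / np.linalg.norm(W, axis=0)   # normalize each column

U = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]]).T          # columns (1,1,0), (1,0,1), (0,1,1)
V = gram_schmidt(U)
print(np.round(V.conj().T @ V, 10))        # identity matrix: columns are orthonormal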

Example
Let V = P3 [−1, 1] denote the real vector space of polynomials of
degree at most 3 defined on [−1, 1] together with the zero polynomial.

V is an inner product space under the inner product

⟨f , g⟩ = ∫₋₁¹ f (t)g(t) dt.

To find an orthonormal basis, we begin with the basis {1, x, x², x³}.

Set w1 = 1. Then

w2 = x − ⟨x, 1⟩ · 1/‖1‖²
   = x − (1/2) ∫₋₁¹ t dt = x,

34/51
    w_3 = x^2 - \frac{\langle x^2, 1 \rangle}{2}\, 1 - \frac{\langle x^2, x \rangle}{2/3}\, x
        = x^2 - \frac{1}{2}\int_{-1}^{1} t^2\, dt - \frac{3}{2}x\int_{-1}^{1} t^3\, dt
        = x^2 - \frac{1}{3},

    w_4 = x^3 - \frac{\langle x^3, 1 \rangle}{2}\, 1 - \frac{\langle x^3, x \rangle}{2/3}\, x - \frac{\langle x^3, x^2 - \frac{1}{3} \rangle}{\|w_3\|^2}\, w_3
        = x^3 - \frac{3}{5}x.

Thus {1, x, x^2 - 1/3, x^3 - (3/5)x} is an orthogonal basis. We divide these by
their respective norms to get an orthonormal basis:

    \left\{ \frac{1}{\sqrt{2}},\ \sqrt{\frac{3}{2}}\, x,\ \frac{3\sqrt{5}}{2\sqrt{2}}\left(x^2 - \frac{1}{3}\right),\ \frac{5\sqrt{7}}{2\sqrt{2}}\left(x^3 - \frac{3}{5}x\right) \right\}.
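As a sanity check on this computation, the example can be redone symbolically. The following SymPy sketch (an illustration, not part of the notes) applies the same recursion with the inner product ⟨f, g⟩ = ∫_{-1}^{1} f(t)g(t) dt and recovers the orthogonal polynomials above.

```python
import sympy as sp

x = sp.symbols('x')

def ip(f, g):
    # inner product <f, g> = integral of f*g over [-1, 1]
    return sp.integrate(f * g, (x, -1, 1))

basis = [sp.Integer(1), x, x**2, x**3]
ws = []
for u in basis:
    w = u
    for wj in ws:
        w -= ip(wj, u) / ip(wj, wj) * wj   # subtract projection along w_j
    ws.append(sp.expand(w))

print(ws)  # [1, x, x**2 - 1/3, x**3 - 3*x/5]
# normalized (orthonormal) versions, matching the displayed basis
print([sp.simplify(w / sp.sqrt(ip(w, w))) for w in ws])
```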
Orthogonal Complement

Definition
Let V be an inner product space. Given a subspace W of V, we define

    W^⊥ := {v ∈ V : v ⊥ w for all w ∈ W}

and call it the orthogonal complement of W.

Proposition
Let W be a subspace of an inner product space V. Then
(i) W^⊥ is a subspace of V;
(ii) W ∩ W^⊥ = {0};
(iii) W ⊂ (W^⊥)^⊥.

Proof: All the proofs are easy.
Proposition
Let W be a finite dimensional subspace of an inner product space V. Then
every element v ∈ V is expressible as a unique sum

    v = w + w^⊥    (*)

where w ∈ W and w^⊥ ∈ W^⊥. Further, w is characterised by the
property that it is the nearest point in W to v.

Proof: Fix an orthonormal basis {w1, . . . , wk} for W. Define

    w = \sum_{i=1}^{k} \langle w_i, v \rangle\, w_i.

Take w^⊥ = v - w and verify that w^⊥ ∈ W^⊥. This proves the existence
of such an expression (*).
Now suppose v = u + u^⊥ is another such expression. Then
it follows that w - u = u^⊥ - w^⊥ ∈ W ∩ W^⊥ = {0}.
Therefore w = u and w^⊥ = u^⊥. This proves the uniqueness.
Finally, let u0 ∈ W be any vector. Then
v - u0 = (v - w) + (w - u0).
Since w - u0 ∈ W and v - w ∈ W^⊥, we can apply the Pythagoras theorem
to get

    ‖v - u0‖² = ‖v - w‖² + ‖w - u0‖²

and conclude that ‖v - u0‖² ≥ ‖v - w‖², with equality iff u0 = w.
This proves that w is the (only) nearest point to v in W. □

Remark
(i) Observe that if V itself is finite dimensional, then the
above proposition is applicable to all subspaces of V. Also

    dim W + dim W^⊥ = dim V

(see the exercise at the end of the section on Linear Dependence).
(ii) The element w so obtained is called the orthogonal projection of v
on W. One can easily check that the assignment v ↦ w, where
w = Σ_i ⟨w_i, v⟩ w_i, defines a linear map P_W : V → V whose range is W.
Also observe that P_W is the identity on W and hence
P_W² := P_W ◦ P_W = P_W.
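The projection P_W is easy to realize numerically once an orthonormal basis of W is available. The sketch below (illustrative only; it uses a QR factorization as a convenient way to obtain an orthonormal basis, and random data) builds P_W for the column space of a matrix and checks P_W² = P_W together with the nearest-point property.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))      # columns span a 2-dimensional subspace W of R^5
Q, _ = np.linalg.qr(A)               # orthonormal basis {w_1, w_2} for W
P = Q @ Q.T                          # P_W v = sum_i <w_i, v> w_i

v = rng.standard_normal(5)
w = P @ v                            # orthogonal projection of v on W
print(np.allclose(P @ P, P))         # True: P_W composed with P_W equals P_W
print(np.allclose(A.T @ (v - w), 0)) # True: v - w is orthogonal to W

# nearest point: any other u0 in W is at least as far from v as w is
u0 = A @ rng.standard_normal(2)
print(np.linalg.norm(v - w) <= np.linalg.norm(v - u0))   # True
```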
Eigenvalues of Special Matrices

If u = (u1, u2, . . . , un)^t and v = (v1, v2, . . . , vn)^t ∈ Cn, we have defined
their inner product in Cn by

    \langle u, v \rangle = u^* v = \bar{u}_1 v_1 + \bar{u}_2 v_2 + \dots + \bar{u}_n v_n.

For any square matrix A,

    \langle Au, v \rangle = (Au)^* v = u^* A^* v = \langle u, A^* v \rangle.

Definition
Let A be a square matrix with complex entries. A is called
(i) Hermitian if A = A*;
(ii) skew Hermitian if A = -A*.
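A quick numerical illustration of the identity ⟨Au, v⟩ = ⟨u, A*v⟩, using random data (a throwaway NumPy check, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# <x, y> = x* y, matching the convention used in the slides
ip = lambda x, y: np.vdot(x, y)
print(np.isclose(ip(A @ u, v), ip(u, A.conj().T @ v)))   # True
```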
Lemma
A is Hermitian iff for all column vectors v, w we have

    (Av)^* w = v^* A w    (i.e., \langle Av, w \rangle = \langle v, Aw \rangle).

Proof: If A is Hermitian then (Av)*w = v*A*w = v*Aw.
To see the converse, take v, w to be the standard basis column vectors e_i, e_j.
Then e_i* A* e_j = e_i* A e_j, i.e., the (i, j)-th entries of A* and A agree;
hence A* = A. □

Remark
(i) If A is real then A = A* means A = A^t. Hence real symmetric
matrices are Hermitian.
(ii) A is Hermitian iff ιA is skew Hermitian.
Proposition
Let A be an n × n Hermitian matrix. Then:
1. For any u ∈ Cn, u*Au (i.e., ⟨u, Au⟩) is a real number.
2. All eigenvalues of A are real.
3. Eigenvectors of a Hermitian matrix corresponding to distinct
eigenvalues are mutually orthogonal.

Proof:
1. Since u*Au is a complex number, to prove it is real, we prove that
(u*Au)* = u*Au. But

    (u^*Au)^* = u^* A^* (u^*)^* = u^* A u.

Hence u*Au is real for all u ∈ Cn.
2. Suppose λ is an eigenvalue of A and u is an eigenvector for λ.
Then u*Au = u*(λu) = λ(u*u) = λ‖u‖².
Since u*Au is real and ‖u‖² is a nonzero real number, λ is real.
3. Let λ and µ be two distinct eigenvalues of A and u and v be
corresponding eigenvectors. Then Au = λu and Av = µv. Hence, using that λ is
real by part 2,

    \lambda u^* v = (\lambda u)^* v = (Au)^* v = u^*(Av) = u^* \mu v = \mu (u^* v).

Hence (λ - µ)u*v = 0. Since λ ≠ µ, u*v = 0. □
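All three statements are easy to observe numerically. Below is a small NumPy sketch (assuming a randomly generated Hermitian matrix; illustrative only).

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                          # A = A*, i.e. Hermitian

u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
print(np.isclose(np.vdot(u, A @ u).imag, 0))   # 1. u*Au is real

evals, evecs = np.linalg.eig(A)                # generic eigensolver (complex output)
print(np.allclose(evals.imag, 0))              # 2. the eigenvalues are (numerically) real

# 3. eigenvectors for the (generically distinct) eigenvalues are orthogonal:
#    off-diagonal entries of the Gram matrix are ~0
G = evecs.conj().T @ evecs
print(np.allclose(G - np.diag(np.diag(G)), 0))
```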
Corollary
Let A be an n × n skew Hermitian matrix. Then:
1. For any u ∈ Cn, u*Au is either zero or a purely imaginary number.
2. Each eigenvalue of A is either zero or a purely imaginary number.
3. Eigenvectors of A corresponding to distinct eigenvalues are
mutually orthogonal.

Proof: All of this follows straight away from the corresponding statements
about Hermitian matrices, once we note that
A is skew Hermitian implies ιA is Hermitian,
and that a complex number c is real iff ιc is either zero or purely
imaginary. □
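The same trick used in the proof, passing from A to ιA, is visible numerically; a short sketch with a random skew-Hermitian matrix (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B - B.conj().T                       # A = -A*, i.e. skew Hermitian

evals = np.linalg.eigvals(A)
print(np.allclose(evals.real, 0))        # eigenvalues are zero or purely imaginary
print(np.allclose((1j * A).conj().T, 1j * A))   # iA is Hermitian
```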
Definition
Let A be a square matrix over C. A is called
(i) unitary if A*A = I;
(ii) orthogonal if A is real and unitary.

Thus a real matrix A is orthogonal iff A^T = A^{-1}.
Also observe that A is unitary iff A^T is unitary iff Ā is unitary.

Example
The matrices

    U = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
    \quad\text{and}\quad
    V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}

are orthogonal and unitary respectively.
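Both claims can be confirmed directly; here is a small NumPy check for a sample value of θ (an illustration, with θ = 0.7 chosen arbitrarily):

```python
import numpy as np

theta = 0.7
U = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
V = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

print(np.allclose(U.T @ U, np.eye(2)))           # U is orthogonal: U^T U = I
print(np.allclose(V.conj().T @ V, np.eye(2)))    # V is unitary:    V* V = I
```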
Proposition
Let U be a square matrix. Then the following conditions are equivalent.
(i) U is unitary.
(ii) The columns of U form an orthonormal set of vectors.
(iii) The rows of U form an orthonormal set of vectors.
(iv) U preserves the inner product, i.e., for all vectors x, y ∈ Cn, we
have ⟨Ux, Uy⟩ = ⟨x, y⟩.

Proof: Write the matrix U column-wise:

    U = (u_1\ u_2\ \dots\ u_n) \quad\text{so that}\quad
    U^* = \begin{pmatrix} u_1^* \\ u_2^* \\ \vdots \\ u_n^* \end{pmatrix}.
Hence

    U^* U = \begin{pmatrix} u_1^* \\ u_2^* \\ \vdots \\ u_n^* \end{pmatrix} (u_1\ u_2\ \dots\ u_n)
          = \begin{pmatrix}
              u_1^* u_1 & u_1^* u_2 & \dots & u_1^* u_n \\
              u_2^* u_1 & u_2^* u_2 & \dots & u_2^* u_n \\
              \vdots    &           & \ddots & \vdots   \\
              u_n^* u_1 & u_n^* u_2 & \dots & u_n^* u_n
            \end{pmatrix}.

Thus U*U = I
iff u_i* u_j = 0 for i ≠ j and u_i* u_i = 1 for i = 1, 2, . . . , n
iff the column vectors of U form an orthonormal set.
This proves (i) ⇔ (ii).
Since U*U = I implies UU* = I, the proof of (i) ⇔ (iii) follows.
To prove (i) ⇔ (iv), let U be unitary. Then U*U = I and hence

    \langle Ux, Uy \rangle = \langle x, U^*Uy \rangle = \langle x, y \rangle.

Conversely, let U preserve the inner product. Take x = e_i and y = e_j to get

    e_i^* (U^*U) e_j = \langle Ue_i, Ue_j \rangle = \langle e_i, e_j \rangle = e_i^* e_j = \delta_{ij},

where δ_ij are the Kronecker symbols (δ_ij = 1 if i = j and 0 otherwise). This
means the (i, j)-th entry of U*U is δ_ij. Hence U*U = I. □

Remark
The above proposition also holds for an orthogonal matrix, by merely
applying it to a real matrix.
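The equivalences can be observed numerically. The sketch below builds a unitary matrix as the Q-factor of a QR factorization (a convenience of this illustration, not a construction from the slides) and checks (ii), (iii) and (iv).

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(B)                   # the Q-factor of an invertible matrix is unitary

print(np.allclose(U.conj().T @ U, np.eye(3)))   # (i)/(ii): U*U = I, columns orthonormal
print(np.allclose(U @ U.conj().T, np.eye(3)))   # (iii):    UU* = I, rows orthonormal

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
print(np.isclose(np.vdot(U @ x, U @ y), np.vdot(x, y)))   # (iv): <Ux, Uy> = <x, y>
```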
Corollary
Let U be a unitary matrix. Then:
(1) For all x, y ∈ Cn, ⟨Ux, Uy⟩ = ⟨x, y⟩. Hence ‖Ux‖ = ‖x‖.
(2) If λ is an eigenvalue of U then |λ| = 1.
(3) Eigenvectors corresponding to different eigenvalues are
orthogonal.

Proof:
(1) We have ⟨Ux, Uy⟩ = ⟨x, U*Uy⟩ = ⟨x, y⟩ and so

    \|Ux\|^2 = \langle Ux, Ux \rangle = \langle x, x \rangle = \|x\|^2.

(2) If λ is an eigenvalue of U with eigenvector x then Ux = λx. Hence

    \|x\| = \|Ux\| = |\lambda| \|x\|.

Since ‖x‖ ≠ 0, |λ| = 1.
(3) Let Ux = λx and Uy = µy, where x, y are eigenvectors with
distinct eigenvalues λ and µ respectively. Then

    \langle x, y \rangle = \langle Ux, Uy \rangle = \langle \lambda x, \mu y \rangle = \bar{\lambda}\mu \langle x, y \rangle.

Thus λ̄µ = 1 or ⟨x, y⟩ = 0.
Since λ̄λ = 1 and λ ≠ µ, we cannot have λ̄µ = 1 (it would force µ = λ).
Hence ⟨x, y⟩ = 0, i.e., x and y are orthogonal. □
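Again this is easy to observe numerically, using a random unitary matrix built as above (illustrative only; the orthogonality check relies on the eigenvalues being distinct, which holds generically for random data):

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(B)                   # a random unitary matrix

evals, evecs = np.linalg.eig(U)
print(np.allclose(np.abs(evals), 1))     # (2): every eigenvalue has modulus 1

# (3): eigenvectors for the (generically distinct) eigenvalues are orthogonal,
#      so the off-diagonal entries of the Gram matrix are ~0
G = evecs.conj().T @ evecs
print(np.allclose(G - np.diag(np.diag(G)), 0))
```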
Example

    U = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}

is an orthogonal matrix. The characteristic polynomial of U is:

    D(\lambda) = \det(U - \lambda I)
               = \det\begin{pmatrix} \cos\theta - \lambda & -\sin\theta \\ \sin\theta & \cos\theta - \lambda \end{pmatrix}
               = \lambda^2 - 2\lambda\cos\theta + 1.

Roots of D(λ) = 0 are:

    \lambda = \frac{2\cos\theta \pm \sqrt{4\cos^2\theta - 4}}{2} = \cos\theta \pm \iota\sin\theta = e^{\pm\iota\theta}.

Hence |λ| = 1. Check that the eigenvectors are:
for λ = e^{ιθ}: x = (1, -ι)^t and for λ = e^{-ιθ}: y = (1, ι)^t.
Thus x*y = (1 ι)(1, ι)^t = 1 + ι² = 0. Hence x ⊥ y.
Normalize the eigenvectors x and y. Therefore, if we take

    C = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -\iota & \iota \end{pmatrix},

then C^{-1}UC = diag(e^{ιθ}, e^{-ιθ}).
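A direct numerical confirmation of this diagonalization for a sample value of θ (an illustrative sketch with θ = 0.7):

```python
import numpy as np

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
C = np.array([[1, 1],
              [-1j, 1j]]) / np.sqrt(2)

D = np.linalg.inv(C) @ U @ C
print(np.round(D, 10))                   # diagonal matrix diag(e^{i theta}, e^{-i theta})
print(np.round(np.array([np.exp(1j * theta), np.exp(-1j * theta)]), 10))  # expected entries
```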
