6. EIGENVALUES AND EIGENVECTORS

The simplest of matrices are the diagonal ones. This naturally leads us
to the problem of investigating the existence and construction of a
suitable basis with respect to which the matrix associated to a given
linear transformation is diagonal.

Definition
An n × n matrix A is called diagonalizable if there exists an invertible
n × n matrix M such that M⁻¹AM is a diagonal matrix. A linear map
f : V → V is called diagonalizable if the matrix associated to f with
respect to some basis is diagonal.

Remark
(i) Clearly, f is diagonalizable iff the matrix associated to f with
respect to some (equivalently, any) basis is diagonalizable.

Remark
(ii) Let {v1 , . . . , vn } be a basis. The matrix Mf of a linear
transformation f w.r.t. this basis is diagonal iff f (vi ) = λi vi ,
1 ≤ i ≤ n for some scalars λi . Naturally a question here is: does
there exist such a basis for a given linear transformation?

Definition
Given a linear map f : V → V we say v ∈ V is an eigenvector for f if
v ≠ 0 and f (v) = λv for some λ ∈ K. In that case λ is called an
eigenvalue of f .

For a square matrix A we say λ is an eigenvalue if there exists a
nonzero column vector v such that Av = λv. Of course v is then called
an eigenvector of A corresponding to λ.

Proposition
Suppose A is an n × n matrix. Let A have n distinct eigenvalues
λ1 , λ2 , . . . , λn . Let C be the matrix whose column vectors are
respectively v1 , v2 , . . . , vn where vi is an eigenvector for λi for
i = 1, 2, . . . , n.
Then C⁻¹AC = D(λ1 , . . . , λn ), the diagonal matrix with diagonal entries λ1 , . . . , λn .

Proof: Let D := D(λ1 , . . . , λn ). It is enough to prove AC = CD. For
j = 1, 2, . . . , n, let C^j (= vj ) denote the j-th column of C, and similarly
for the other matrices. Then

(AC)^j = A C^j = Avj = λj vj .

Similarly,
(CD)^j = C D^j = λj vj .
Hence AC = CD as required. To complete the proof, we need to show
that C is invertible. This follows from a result proved later, which says
that eigenvectors corresponding to distinct eigenvalues are linearly
independent. In particular, C has n linearly independent columns, so
rank(C) = n and C is invertible. □
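A quick numerical illustration of this proposition (an addition to the notes, not part of the original slides): the Python/NumPy sketch below builds C from the eigenvectors returned by numpy.linalg.eig and checks that C⁻¹AC is diagonal. The 3 × 3 matrix A used here is an arbitrary example with distinct eigenvalues, chosen only for illustration.

import numpy as np

# A 3x3 matrix with three distinct eigenvalues (chosen for illustration).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

eigenvalues, C = np.linalg.eig(A)   # columns of C are eigenvectors v1, ..., vn
D = np.linalg.inv(C) @ A @ C        # should equal D(lambda_1, ..., lambda_n)

print(np.round(D, 10))                        # off-diagonal entries are (numerically) zero
print(np.allclose(D, np.diag(eigenvalues)))   # True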
Remark
(i) It is easy to see that the eigenvalues and eigenvectors of a linear
transformation are the same as those of the associated matrix.
(ii) Even if a linear map is not diagonalizable, the existence of
eigenvectors and eigenvalues itself throws some light on the
nature of the linear map.
They arise naturally in the study of differential equations. Here we
shall use them to address the problem of diagonalization and then
see some geometric applications of diagonalization itself.

Definition
For a square matrix A, the eigenspace of A w.r.t. an eigenvalue λ is
defined as
EA (λ) := {v : Av = λv}.

Characteristic Polynomial
Proposition
(1) Eigenvalues of a square matrix A are solutions of the equation
det(A − λI) = 0.
(2) The null space of A − λI is equal to the eigenspace.

Proof:
(1) If v is an eigenvector of A then v ≠ 0 and Av = λv for some scalar
λ. Hence (A − λI)v = 0.
Thus the nullity of A − λI is positive, so rank(A − λI) is less
than n, and hence det(A − λI) = 0.
(2)

EA (λ) = {v ∈ V : Av = λv}
= {v ∈ V : (A − λI)v = 0} = N (A − λI).   □
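As a small computational aside (added here for illustration, not from the original slides), the coefficients of the characteristic polynomial and a basis of the eigenspace N(A − λI) can be obtained numerically; the sketch below uses NumPy, with the eigenspace read off from the singular value decomposition of A − λI.

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# numpy.poly returns the coefficients of det(lambda*I - A),
# which is (-1)^n det(A - lambda*I); the roots are the same.
print(np.poly(A))             # [ 1. -4.  3.]  ->  lambda^2 - 4*lambda + 3

lam = 1.0                     # one of the eigenvalues
M = A - lam * np.eye(2)

# Null space of M = eigenspace E_A(lambda): right singular vectors
# belonging to (numerically) zero singular values.
_, s, Vt = np.linalg.svd(M)
print(Vt[s < 1e-10])          # spans E_A(1); here a multiple of (1, 0)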
Definition
For any square matrix A, the polynomial det(A − λI) in λ is called the
characteristic polynomial of A and is denoted by χA (λ).

Example
 
(1) A = [ 1 2 ; 0 3 ]. To find the eigenvalues of A, we solve the equation
det(A − λI) = det [ 1 − λ  2 ; 0  3 − λ ] = (1 − λ)(3 − λ) = 0.
Hence the eigenvalues of A are 1 and 3. Let us calculate the
eigenspaces E(1) and E(3).

By definition E(1) = {v : (A − I)v = 0} and E(3) = {v : (A − 3I)v = 0}.

Example
 
A − I = [ 0 2 ; 0 2 ]. Hence (x, y)ᵀ ∈ E(1) iff
[ 0 2 ; 0 2 ] [ x ; y ] = [ 2y ; 2y ] = [ 0 ; 0 ]. Hence E(1) = L({(1, 0)}).
A − 3I = [ 1 − 3  2 ; 0  3 − 3 ] = [ −2 2 ; 0 0 ].
Suppose [ −2 2 ; 0 0 ] [ x ; y ] = [ 0 ; 0 ].
Then [ −2x + 2y ; 0 ] = [ 0 ; 0 ].
This is possible iff x = y . Thus E(3) = L({(1, 1)}).
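For readers who want to check this example on a machine, here is a small NumPy sketch (added for illustration; it is not part of the original notes). It recovers the eigenvalues 1 and 3 and eigenvectors proportional to (1, 0) and (1, 1).

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

vals, vecs = np.linalg.eig(A)
print(vals)                    # [1. 3.]
# Each column of vecs is a unit eigenvector; rescale so the largest
# entry is 1 in order to compare with (1, 0) and (1, 1).
for k in range(2):
    v = vecs[:, k]
    print(vals[k], v / v[np.abs(v).argmax()])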

Example
 
(2) Let A = [ 3 0 0 ; −2 4 2 ; −2 1 5 ]. Then det(A − λI) = (3 − λ)²(6 − λ).
Hence eigenvalues of A are 3 and 6.
The eigenvalue λ = 3 is a double root of the characteristic polynomial
of A. We say that λ = 3 has algebraic multiplicity 2.
Let us find the eigenspaces E(3) and E(6).
 
λ = 3 : A − 3I = [ 0 0 0 ; −2 1 2 ; −2 1 2 ]. Hence rank(A − 3I) = 1.
Thus nullity (A − 3I) = 2. By solving the system (A − 3I)v = 0, we find
that
N (A − 3I) = EA (3) = L({(1, 0, 1), (1, 2, 0)}).
The dimension of EA (λ) is called the geometric multiplicity of λ. Hence
geometric multiplicity of λ = 3 is 2.

Example
 
λ = 6 : A − 6I = [ −3 0 0 ; −2 −2 2 ; −2 1 −1 ]. Hence rank(A − 6I) = 2. Thus
dim EA (6) = 1. (It can be shown that {(0, 1, 1)} is a basis of EA (6).)
Thus both the algebraic and geometric multiplicities of the eigenvalue
6 are equal to 1.
 
(3) A = [ 1 1 ; 0 1 ]. Then det(A − λI) = (1 − λ)². Thus λ = 1 has
algebraic multiplicity 2.
A − I = [ 0 1 ; 0 0 ]. Hence nullity(A − I) = 1 and EA (1) = L({e1 }). In
this case the geometric multiplicity is less than the algebraic multiplicity
of the eigenvalue 1.
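The multiplicities in examples (2) and (3) can be checked numerically. The following sketch (an illustrative addition, not from the slides) computes the algebraic multiplicity by counting repeated roots of the characteristic polynomial and the geometric multiplicity as n − rank(A − λI).

import numpy as np

def multiplicities(A, lam, tol=1e-8):
    """Return (algebraic, geometric) multiplicity of the eigenvalue lam of A."""
    n = A.shape[0]
    eigvals = np.linalg.eigvals(A)
    algebraic = int(np.sum(np.abs(eigvals - lam) < tol))
    geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
    return algebraic, geometric

A2 = np.array([[3.0, 0.0, 0.0],
               [-2.0, 4.0, 2.0],
               [-2.0, 1.0, 5.0]])
A3 = np.array([[1.0, 1.0],
               [0.0, 1.0]])

print(multiplicities(A2, 3.0))   # (2, 2)
print(multiplicities(A2, 6.0))   # (1, 1)
print(multiplicities(A3, 1.0))   # (2, 1): geometric < algebraic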

Remark
(i) Observe that χA (λ) = χ_{M⁻¹AM} (λ). Thus the characteristic
polynomial is an invariant of similarity.
Consequently, the characteristic polynomial of any linear map f : V → V is
also defined (where V is finite dimensional) by choosing some
basis for V , and then taking the characteristic polynomial of the
associated matrix M(f ) of f .
This definition does not depend upon the choice of the basis.

(ii) If we expand det(A − λI) we see that there is a term

(a11 − λ)(a22 − λ) . . . (ann − λ). This is the only term which
contributes to λ^n and λ^(n−1).
It follows that the degree of the characteristic polynomial is exactly
equal to n, the size of the matrix;
moreover, the coefficient of the top degree term is equal to (−1)^n.

Remark
(ii) (contd...) Thus in general, it has n complex roots, some of which
may be repeated, some of them real, and so on. All these patterns
are going to influence the geometry of the linear map.

(iii) If A is a real matrix then of course χA (λ) is a real polynomial. That,
however, does not allow us to conclude that it has real roots.
So while discussing eigenvalues we should consider even a real
matrix as a complex matrix and keep in mind the associated linear
map Cⁿ → Cⁿ. The problem of existence of real eigenvalues and
real eigenvectors will be discussed soon.

(iv) The coefficient of λ^(n−1) is equal to

(−1)^(n−1) (a11 + . . . + ann ) = (−1)^(n−1) tr A.

Lemma
Suppose A is a real matrix with a real eigenvalue λ. Then there exists
a real column vector v ≠ 0 such that Av = λv.

Proof: Start with Aw = λw where w is a nonzero column vector with
complex entries. Write w = v + ιv₀ where both v, v₀ are real vectors.

We then have Av + ιAv₀ = λ(v + ιv₀). Compare the real and imaginary
parts. Since w ≠ 0, at least one of the two vectors v, v₀ must be nonzero,
and that vector is a real eigenvector for λ, so we are done. □

Proposition
Let A be an n × n matrix with eigenvalues λ1 , λ2 , . . . , λn . Then
(i) tr (A) = λ1 + λ2 + . . . + λn .
(ii) det A = λ1 λ2 . . . λn .

Proof: The characteristic polynomial of A is

det(A − λI) = det [ a11 − λ  a12  . . .  a1n ;  a21  a22 − λ  . . .  a2n ;
                    . . . ;  an1  an2  . . .  ann − λ ]

= (−1)^n λ^n + (−1)^(n−1) λ^(n−1) (a11 + . . . + ann ) + . . .   (∗)

Put λ = 0 to get det A = constant term of det(A − λI).


Since λ1 , λ2 , . . . , λn are roots of det(A − λI) = 0 we have

det(A − λI) = (−1)^n (λ − λ1 )(λ − λ2 ) . . . (λ − λn )


det(A − λI) = (−1)^n (λ − λ1 )(λ − λ2 ) . . . (λ − λn )
= (−1)^n [λ^n − (λ1 + λ2 + . . . + λn )λ^(n−1) + . . . + (−1)^n λ1 λ2 . . . λn ].   (∗∗)
Comparing (∗) and (∗∗), we see that the constant term of det(A − λI) equals
λ1 λ2 . . . λn , so det A = λ1 λ2 . . . λn ; and comparing the coefficients of λ^(n−1) gives
tr (A) = a11 + a22 + . . . + ann = λ1 + λ2 + . . . + λn . □
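These two identities are easy to confirm numerically; the short check below (added here for illustration) compares the trace and determinant of a random matrix with the sum and product of its eigenvalues.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)        # complex in general, but sum and product are real

print(np.isclose(np.trace(A), lam.sum().real))           # True
print(np.isclose(np.linalg.det(A), np.prod(lam).real))   # True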

Proposition
Let v1 , v2 , . . . , vk be eigenvectors of a matrix A associated to distinct
eigenvalues λ1 , λ2 , . . . , λk . Then v1 , v2 , . . . , vk are linearly
independent.

Proof: Apply induction on k . It is clear for k = 1. Suppose k ≥ 2 and
c1 v1 + . . . + ck vk = 0 for some scalars c1 , c2 , . . . , ck . Applying A,
c1 Av1 + c2 Av2 + . . . + ck Avk = 0, that is,

c1 λ1 v1 + c2 λ2 v2 + · · · + ck λk vk = 0

Hence

λ1 (c1 v1 + c2 v2 + . . . + ck vk ) − (λ1 c1 v1 + λ2 c2 v2 + . . . + λk ck vk )

= (λ1 − λ2 )c2 v2 + (λ1 − λ3 )c3 v3 + . . . + (λ1 − λk )ck vk = 0.


By induction, v2 , v3 , . . . , vk are linearly independent. Hence
(λ1 − λj )cj = 0 for j = 2, 3, . . . , k .

Since λ1 ≠ λj for j = 2, 3, . . . , k , we get cj = 0 for j = 2, 3, . . . , k . Hence
c1 v1 = 0, and since v1 ≠ 0, c1 is also zero.
Thus v1 , v2 , . . . , vk are linearly independent. 2

Relation Between Algebraic and Geometric
Multiplicities

Definition
The algebraic multiplicity aA (µ) of an eigenvalue µ of a matrix A is
defined to be the multiplicity of µ as a root of the characteristic polynomial χA (λ).

Definition
The geometric multiplicity gA (µ) of an eigenvalue µ of A is defined to
be the dimension of the eigenspace EA (µ).

gA (µ) := dim EA (µ).

Proposition
Both the algebraic and the geometric multiplicities are
invariants of similarity.

Proof: We have already seen that for any invertible matrix C,

χA (λ) = χ_{C⁻¹AC} (λ). Thus the invariance of the algebraic multiplicity is
clear.

On the other hand, check that E_{C⁻¹AC} (λ) = C⁻¹ (EA (λ)). Therefore,

dim(E_{C⁻¹AC} (λ)) = dim C⁻¹ (EA (λ)) = dim(EA (λ)), the last equality
being a consequence of the invertibility of C. □

Proposition
Let A be an n × n matrix. Then the geometric multiplicity of an
eigenvalue µ of A is less than or equal to the algebraic multiplicity of µ.

Proof: Let g = gA (µ) be the geometric multiplicity of µ. Then EA (µ)
has a basis consisting of g eigenvectors v1 , v2 , . . . , vg .
We can extend this basis of EA (µ) to a basis of Cn , say
{v1 , v2 , . . . , vg , . . . , vn }. Let P be the matrix whose j-th column is P^j = vj .
Then P is an invertible matrix and

P⁻¹AP = [ µI_g  X ; 0  Y ]

where X is a g × (n − g) matrix and Y is an (n − g) × (n − g) matrix.

Therefore

det(A − λI) = det[P⁻¹ (A − λI)P]
= det(P⁻¹AP − λI)
= det((µ − λ)I_g ) det(Y − λI_{n−g} )
= (µ − λ)^g det(Y − λI_{n−g} ).

Let k = aA (µ). Then (λ − µ)^k divides det(A − λI), but (λ − µ)^(k+1) does
not. Thus g ≤ k . □
Remark
Observe that the geometric multiplicity and algebraic multiplicity of an
eigenvalue coincide for a diagonal matrix.

Since these concepts are similarity invariants, it is necessary that the


same is true for any matrix which is diagonalizable. This turns out to
be sufficient also.

Theorem
An n × n matrix A is diagonalizable if and only if for each eigenvalue µ ∈ C
of A, we have that the algebraic and geometric multiplicities are
equal:
aA (µ) = gA (µ).

Proof: We have already seen the necessity of the condition, i.e., if A is


diagonalizable, then we know that the algebraic and geometric
multiplicities are equal for each of the eigenvalues of A.

Proof Contd. : To prove the converse, suppose that the two
multiplicities coincide for each eigenvalue. Suppose that λ1 , λ2 , . . . , λk
are all the eigenvalues of A with algebraic multiplicities n1 , n2 , . . . , nk .
Then n1 + · · · + nk = n. Let

B1 = {v11 , v12 , . . . , v1n1 } be a basis of E(λ1 ),
. . .
Bk = {vk1 , vk2 , . . . , vknk } be a basis of E(λk ).

Observe that B = B1 ∪ B2 ∪ . . . ∪ Bk is a linearly independent set. [To
see this, note that if a linear combination Σᵢ Σⱼ cij vij = 0 (sum over
i = 1, . . . , k and j = 1, . . . , ni ), then v1 + · · · + vk = 0, where
vi = Σⱼ cij vij ∈ EA (λi ). Since λ1 , . . . , λk are distinct, we must have
vi = 0 for all i = 1, . . . , k (otherwise the nonzero vi would be eigenvectors
for distinct eigenvalues, hence linearly independent, contradicting
v1 + · · · + vk = 0). This implies cij = 0 for all i, j.] Since B has n elements,
it follows that B is a basis of Cⁿ.
Denote by P the matrix whose columns are the elements of B. Then P is
invertible and P⁻¹AP is the diagonal matrix with λ1 , . . . , λk on its
diagonal, with each λi repeated ni times. Hence A is diagonalizable. □
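The theorem suggests a direct computational test for diagonalizability: compare the two multiplicities at every eigenvalue. A sketch of such a test in NumPy is given below (an illustrative addition; the tolerance-based grouping of repeated eigenvalues is a simplification and can be fragile for badly conditioned matrices).

import numpy as np

def is_diagonalizable(A, tol=1e-8):
    """Check a_A(mu) == g_A(mu) for every eigenvalue mu of A (over C)."""
    n = A.shape[0]
    remaining = list(np.linalg.eigvals(A))
    while remaining:
        mu = remaining[0]
        # algebraic multiplicity: number of computed eigenvalues coinciding with mu
        algebraic = len([z for z in remaining if abs(z - mu) < tol])
        geometric = n - np.linalg.matrix_rank(A - mu * np.eye(n), tol=tol)
        if algebraic != geometric:
            return False
        remaining = [z for z in remaining if abs(z - mu) >= tol]
    return True

print(is_diagonalizable(np.array([[3.0, 0, 0], [-2, 4, 2], [-2, 1, 5]])))  # True
print(is_diagonalizable(np.array([[1.0, 1], [0, 1]])))                     # False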
7. INNER PRODUCT SPACES
The parallelogram law, the ability to measure the angle between two vectors
and, in particular, the concept of perpendicularity make the Euclidean
space quite a special type of vector space. Essentially all of these are
consequences of the dot product.

Definition
Let V be a vector space. By an inner product on V we mean a ‘binary
operation’ which associates a scalar, say ⟨u, v⟩, to each pair of vectors
u, v in V , satisfying the following properties for all u, v, w in V and all
scalars α, β.

(1) ⟨u, v⟩ = the complex conjugate of ⟨v, u⟩ (Hermitian property or conjugate symmetry);

(2) ⟨u, αv + βw⟩ = α⟨u, v⟩ + β⟨u, w⟩ (sesquilinearity);
(3) ⟨v, v⟩ > 0 if v ≠ 0 (positivity).

A vector space with an inner product is called an inner product space.

Remark
(i) The above definition includes both cases: V is a real vector
space or a complex vector space.
If K = R then the conjugation is just the identity.
Thus for real vector spaces, (1) becomes the ‘symmetric property’
(since for c ∈ R, we have c̄ = c).

(ii) Combining (1) and (2) we obtain

(2′) ⟨αu + βv, w⟩ = ᾱ⟨u, w⟩ + β̄⟨v, w⟩.
In the special case when K = R, (2) and (2′) together are called
‘bilinearity’.

Example
(i) The dot product in Rn is an inner product. This is often called the
standard inner product.

Example
(ii) Let x = (x1 , x2 , . . . , xn )ᵀ and y = (y1 , y2 , . . . , yn )ᵀ ∈ Cⁿ. Define
⟨x, y⟩ = Σᵢ x̄ᵢ yᵢ = x*y. This is an inner product on the complex
vector space Cⁿ.
(iii) Let V = C[0, 1]. For f , g ∈ V , put

⟨f , g⟩ = ∫₀¹ f (t)g(t) dt.

It is easy to see that ⟨f , g⟩ is an inner product.

Definition
Let V be an inner product space. For any v ∈ V , the norm of v,
denoted by ‖v‖, is the positive square root of ⟨v, v⟩:
‖v‖ = √⟨v, v⟩.

For the standard inner product in Rⁿ, ‖v‖ is the usual length of the vector v.
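To make these examples concrete, here is a small Python sketch (added for illustration, not part of the slides) computing the complex inner product of example (ii) and approximating the integral inner product of example (iii) numerically with a simple trapezoidal rule.

import numpy as np

# Example (ii): <x, y> = sum of conj(x_i) * y_i = x* y on C^n.
x = np.array([1 + 2j, 3j])
y = np.array([2 - 1j, 1 + 1j])
print(np.vdot(x, y))                  # vdot conjugates its first argument
print(np.sqrt(np.vdot(x, x).real))    # the norm ||x||

# Example (iii): <f, g> = integral over [0, 1] of f(t) g(t) dt on C[0, 1],
# approximated here on a fine grid.
t = np.linspace(0.0, 1.0, 10001)
f, g = t, t**2
y_vals = f * g
print(((y_vals[:-1] + y_vals[1:]).sum() * (t[1] - t[0]) / 2))  # approximately 1/4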


Norm Associated to an Inner Product
Proposition
Let V be an inner product space. Let u, v ∈ V and c be a scalar. Then
(i) ‖cu‖ = |c| ‖u‖;
(ii) ‖u‖ > 0 for u ≠ 0;
(iii) |⟨u, v⟩| ≤ ‖u‖ ‖v‖;
(iv) ‖u + v‖ ≤ ‖u‖ + ‖v‖; equality holds in this triangle inequality iff
u, v are linearly dependent over R.

Proof: (i) and (ii) follow from the definitions.

(iii) This is called the Cauchy-Schwarz inequality. If either u or v is
zero, it is clear. So let u be nonzero. Consider the vector

w = v − au,

where a is a scalar to be specified later.

Then

0 ≤ ⟨w, w⟩ = ⟨v − au, v − au⟩
= ⟨v, v⟩ − ⟨v, au⟩ − ⟨au, v⟩ + ⟨au, au⟩
= ⟨v, v⟩ − a⟨v, u⟩ − ā⟨u, v⟩ + āa⟨u, u⟩.

Substituting a = ⟨u, v⟩/⟨u, u⟩ and multiplying by ⟨u, u⟩, we get

|⟨u, v⟩|² ≤ ⟨u, u⟩⟨v, v⟩. This proves (iii).
(iv) Using (iii), we get Re⟨u, v⟩ ≤ ‖u‖ ‖v‖. Therefore,

‖u + v‖² = ‖u‖² + ⟨u, v⟩ + ⟨v, u⟩ + ‖v‖²
= ‖u‖² + 2 Re⟨u, v⟩ + ‖v‖²
≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²
= (‖u‖ + ‖v‖)².

Thus ‖u + v‖ ≤ ‖u‖ + ‖v‖.


Now equality holds in (iv) iff Re⟨u, v⟩ = ⟨u, v⟩ and equality holds in (iii).

⇔ ⟨u, v⟩ is real and w = 0.

⇔ v = au and ⟨v, u⟩ = ā⟨u, u⟩ is real.
⇔ v is a real multiple of u. □

Definition
In any inner product space V , the distance between u and v is defined
as
d(u, v) = ‖u − v‖.
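A quick numerical sanity check of (iii) and (iv) above (an illustrative addition, not part of the slides): for random complex vectors, |⟨u, v⟩| never exceeds ‖u‖‖v‖ and ‖u + v‖ never exceeds ‖u‖ + ‖v‖.

import numpy as np

rng = np.random.default_rng(1)
ok = True
for _ in range(1000):
    u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    v = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    ok &= abs(np.vdot(u, v)) <= nu * nv + 1e-12             # Cauchy-Schwarz
    ok &= np.linalg.norm(u + v) <= nu + nv + 1e-12          # triangle inequality
print(ok)   # True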

Distance and Angle
The distance function just defined is easily seen to satisfy the properties that
any reasonable notion of distance would be expected to satisfy:
Proposition
(i) d(u, v) ≥ 0 and d(u, v) = 0 iff u = v;
(ii) d(u, v) = d(v, u);
(iii) d(u, w) ≤ d(u, v) + d(v, w).

In an inner product space over R, we can also define the notion of an


angle.

Definition
Let V be a real inner product space. The angle between any nonzero
vectors u, v is defined to be the unique θ ∈ R with 0 ≤ θ ≤ π such that

cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖).

Gram-Schmidt Orthonormalization
The standard basis E = {e1 , e2 , . . . , en } of Rn has two properties :
(i) each ei has length one
(ii) any two elements of E are mutually perpendicular.
Now, let V be any inner product space of dimension d. We wish to
construct an analogue of E for V .

Definition
Two vectors u, v of an inner product space V are called orthogonal to
each other or perpendicular to each other if hu, vi = 0. We express this
symbolically by u ⊥ v.
A set S consisting of non zero elements of V is called orthogonal if
hu, vi = 0 for every pair of elements of S.
Further if every element of an orthogonal set S is of unit norm then S
is called an orthonormal set. If an orthonormal set S is a basis for V
then S is called an orthonormal basis.

Theorem
(Pythagoras Theorem) Let V be an inner product space. Given
w, u ∈ V such that w ⊥ u we have

‖w + u‖² = ‖w‖² + ‖u‖².

Proof: We have
‖w + u‖² = ⟨w + u, w + u⟩
= ⟨w, w⟩ + ⟨u, w⟩ + ⟨w, u⟩ + ⟨u, u⟩ = ‖w‖² + ‖u‖²
since ⟨u, w⟩ = ⟨w, u⟩ = 0. □

Proposition
Let S = {u1 , u2 , . . . , un } be an orthogonal set of nonzero vectors in an
inner product space V . Then S is linearly independent.

Proof: Suppose c1 , c2 , . . . , cn are scalars with

c1 u1 + c2 u2 + . . . + cn un = 0.

Take the inner product with ui on both sides to get

ci ⟨ui , ui ⟩ = 0.

Since ui ≠ 0, we get ci = 0. Thus S is linearly independent. □

Definition
If u and v are vectors in an inner product space V and v ≠ 0, then the
vector

(⟨v, u⟩ / ‖v‖²) v

is called the orthogonal projection of u along v.
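In coordinates this projection is one line of NumPy; the sketch below (illustrative, not from the slides) projects u along v using the standard inner product and checks that the remainder is orthogonal to v.

import numpy as np

def project_along(u, v):
    """Orthogonal projection of u along a nonzero vector v: (<v, u>/||v||^2) v."""
    return (np.vdot(v, u) / np.vdot(v, v)) * v

u = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, 1.0, 0.0])
p = project_along(u, v)
print(p)                       # [2. 2. 0.]
print(np.vdot(v, u - p))       # 0 (up to rounding): u - p is orthogonal to v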

Theorem
(Gram-Schmidt Orthonormalization Process): Let V be any inner
product space. Given a linearly independent subset {u1 , . . . , un } there
exists an orthonormal set of vectors {v1 , . . . , vn } such that

L({u1 , . . . , uk }) = L({v1 , . . . , vk })

for all 1 ≤ k ≤ n. In particular, if V is finite dimensional, then it has an


orthonormal basis.
Proof: The construction of the orthonormal set is algorithmic and
inductive. First we construct an intermediate orthogonal set

{w1 , . . . , wn }

with the same property for each k .

Then we ‘normalize’ each element, viz, take

vi = wi / ‖wi ‖.

That will give us an orthonormal set as required. The last part of the
theorem follows if we begin with any basis {u1 , . . . , un } of V .
Take w1 := u1 . So the construction is over for k = 1. To construct w2 ,
subtract from u2 its orthogonal projection along w1 .

Thus

w2 = u2 − ⟨w1 , u2 ⟩ w1 / ‖w1 ‖².

Check that ⟨w2 , w1 ⟩ = 0.
Thus {w1 , w2 } is an orthogonal set. Check also that
L({u1 , u2 }) = L({w1 , w2 }).
Now suppose we have constructed wi for i ≤ k < n as required. Put

wk+1 = uk+1 − Σⱼ ⟨wj , uk+1 ⟩ wj / ‖wj ‖²   (sum over j = 1, . . . , k).

Now check that wk+1 is orthogonal to w1 , . . . , wk and that
L({w1 , w2 , . . . , wk+1 }) = L({w1 , . . . , wk , uk+1 }) = L({u1 , . . . , uk+1 }).
By induction, this completes the proof. □
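The whole process translates almost verbatim into code. The sketch below (an illustrative addition, using the standard inner product on Rⁿ or Cⁿ) implements classical Gram-Schmidt exactly as in the proof: subtract the projections along the previously constructed w's, then normalize each column.

import numpy as np

def gram_schmidt(U):
    """Orthonormalize the linearly independent columns of U (classical Gram-Schmidt)."""
    n, m = U.shape
    W = np.zeros((n, m), dtype=complex)
    for k in range(m):
        w = U[:, k].astype(complex)
        for j in range(k):
            # subtract the orthogonal projection of u_{k+1} along w_j
            w = w - (np.vdot(W[:, j], U[:, k]) / np.vdot(W[:, j], W[:, j])) * W[:, j]
        W[:, k] = w
    return W / np.linalg.norm(W, axis=0)   # normalize each column

U = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]]).T          # columns (1,1,0), (1,0,1), (0,1,1)
V = gram_schmidt(U)
print(np.round(V.conj().T @ V, 10))        # identity matrix: columns are orthonormal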

Example
Let V = P3 [−1, 1] denote the real vector space of polynomials of
degree at most 3 defined on [−1, 1] together with the zero polynomial.

V is an inner product space under the inner product

⟨f , g⟩ = ∫₋₁¹ f (t)g(t) dt.

To find an orthonormal basis, we begin with the basis {1, x, x², x³}.

Set w1 = 1. Then

w2 = x − ⟨x, 1⟩ · 1/‖1‖²
   = x − (1/2) ∫₋₁¹ t dt = x,

34/51
    w_3 = x^2 - \frac{\langle x^2, 1 \rangle}{2}\, 1 - \frac{\langle x^2, x \rangle}{2/3}\, x
        = x^2 - \frac{1}{2}\int_{-1}^{1} t^2\, dt - \frac{3}{2}x\int_{-1}^{1} t^3\, dt
        = x^2 - \frac{1}{3},

    w_4 = x^3 - \frac{\langle x^3, 1 \rangle}{2}\, 1 - \frac{\langle x^3, x \rangle}{2/3}\, x - \frac{\langle x^3, x^2 - \frac{1}{3} \rangle}{\|w_3\|^2}\, w_3
        = x^3 - \frac{3}{5}x.

Thus {1, x, x^2 - 1/3, x^3 - (3/5)x} is an orthogonal basis. We divide these by
their respective norms to get an orthonormal basis:

    \left\{ \frac{1}{\sqrt{2}},\ \sqrt{\frac{3}{2}}\, x,\ \frac{3\sqrt{5}}{2\sqrt{2}}\left(x^2 - \frac{1}{3}\right),\ \frac{5\sqrt{7}}{2\sqrt{2}}\left(x^3 - \frac{3}{5}x\right) \right\}.
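As a sanity check on this computation, the example can be redone symbolically. The following SymPy sketch (an illustration, not part of the notes) applies the same recursion with the inner product ⟨f, g⟩ = ∫_{-1}^{1} f(t)g(t) dt and recovers the orthogonal polynomials above.

```python
import sympy as sp

x = sp.symbols('x')

def ip(f, g):
    # inner product <f, g> = integral of f*g over [-1, 1]
    return sp.integrate(f * g, (x, -1, 1))

basis = [sp.Integer(1), x, x**2, x**3]
ws = []
for u in basis:
    w = u
    for wj in ws:
        w -= ip(wj, u) / ip(wj, wj) * wj   # subtract projection along w_j
    ws.append(sp.expand(w))

print(ws)  # [1, x, x**2 - 1/3, x**3 - 3*x/5]
# normalized (orthonormal) versions, matching the displayed basis
print([sp.simplify(w / sp.sqrt(ip(w, w))) for w in ws])
```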
Orthogonal Complement

Definition
Let V be an inner product space. Given a subspace W of V, we define

    W^⊥ := {v ∈ V : v ⊥ w for all w ∈ W}

and call it the orthogonal complement of W.

Proposition
Let W be a subspace of an inner product space V. Then
(i) W^⊥ is a subspace of V;
(ii) W ∩ W^⊥ = {0};
(iii) W ⊂ (W^⊥)^⊥.

Proof: All the proofs are easy.
Proposition
Let W be a finite dimensional subspace of an inner product space V. Then
every element v ∈ V is expressible as a unique sum

    v = w + w^⊥    (*)

where w ∈ W and w^⊥ ∈ W^⊥. Further, w is characterised by the
property that it is the nearest point in W to v.

Proof: Fix an orthonormal basis {w1, . . . , wk} for W. Define

    w = \sum_{i=1}^{k} \langle w_i, v \rangle\, w_i.

Take w^⊥ = v - w and verify that w^⊥ ∈ W^⊥. This proves the existence
of such an expression (*).
Now suppose v = u + u^⊥ is another such expression. Then
it follows that w - u = u^⊥ - w^⊥ ∈ W ∩ W^⊥ = {0}.
Therefore w = u and w^⊥ = u^⊥. This proves the uniqueness.
Finally, let u0 ∈ W be any vector. Then
v - u0 = (v - w) + (w - u0).
Since w - u0 ∈ W and v - w ∈ W^⊥, we can apply the Pythagoras theorem
to get

    ‖v - u0‖² = ‖v - w‖² + ‖w - u0‖²

and conclude that ‖v - u0‖² ≥ ‖v - w‖², with equality iff u0 = w.
This proves that w is the (only) nearest point to v in W. □

Remark
(i) Observe that if V itself is finite dimensional, then the
above proposition is applicable to all subspaces of V. Also

    dim W + dim W^⊥ = dim V

(see the exercise at the end of the section on Linear Dependence).
(ii) The element w so obtained is called the orthogonal projection of v
on W. One can easily check that the assignment v ↦ w, where
w = Σ_i ⟨w_i, v⟩ w_i, defines a linear map P_W : V → V whose range is W.
Also observe that P_W is the identity on W and hence
P_W² := P_W ◦ P_W = P_W.
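The projection P_W is easy to realize numerically once an orthonormal basis of W is available. The sketch below (illustrative only; it uses a QR factorization as a convenient way to obtain an orthonormal basis, and random data) builds P_W for the column space of a matrix and checks P_W² = P_W together with the nearest-point property.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))      # columns span a 2-dimensional subspace W of R^5
Q, _ = np.linalg.qr(A)               # orthonormal basis {w_1, w_2} for W
P = Q @ Q.T                          # P_W v = sum_i <w_i, v> w_i

v = rng.standard_normal(5)
w = P @ v                            # orthogonal projection of v on W
print(np.allclose(P @ P, P))         # True: P_W composed with P_W equals P_W
print(np.allclose(A.T @ (v - w), 0)) # True: v - w is orthogonal to W

# nearest point: any other u0 in W is at least as far from v as w is
u0 = A @ rng.standard_normal(2)
print(np.linalg.norm(v - w) <= np.linalg.norm(v - u0))   # True
```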
Eigenvalues of Special Matrices

If u = (u1, u2, . . . , un)^t and v = (v1, v2, . . . , vn)^t ∈ Cn, we have defined
their inner product in Cn by

    \langle u, v \rangle = u^* v = \bar{u}_1 v_1 + \bar{u}_2 v_2 + \dots + \bar{u}_n v_n.

For any square matrix A,

    \langle Au, v \rangle = (Au)^* v = u^* A^* v = \langle u, A^* v \rangle.

Definition
Let A be a square matrix with complex entries. A is called
(i) Hermitian if A = A*;
(ii) skew Hermitian if A = -A*.
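A quick numerical illustration of the identity ⟨Au, v⟩ = ⟨u, A*v⟩, using random data (a throwaway NumPy check, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# <x, y> = x* y, matching the convention used in the slides
ip = lambda x, y: np.vdot(x, y)
print(np.isclose(ip(A @ u, v), ip(u, A.conj().T @ v)))   # True
```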
Lemma
A is Hermitian iff for all column vectors v, w we have

    (Av)^* w = v^* A w    (i.e., \langle Av, w \rangle = \langle v, Aw \rangle).

Proof: If A is Hermitian then (Av)*w = v*A*w = v*Aw.
To see the converse, take v, w to be the standard basis column vectors e_i, e_j.
Then e_i* A* e_j = e_i* A e_j, i.e., the (i, j)-th entries of A* and A agree;
hence A* = A. □

Remark
(i) If A is real then A = A* means A = A^t. Hence real symmetric
matrices are Hermitian.
(ii) A is Hermitian iff ιA is skew Hermitian.
Proposition
Let A be an n × n Hermitian matrix. Then:
1. For any u ∈ Cn, u*Au (i.e., ⟨u, Au⟩) is a real number.
2. All eigenvalues of A are real.
3. Eigenvectors of a Hermitian matrix corresponding to distinct
eigenvalues are mutually orthogonal.

Proof:
1. Since u*Au is a complex number, to prove it is real, we prove that
(u*Au)* = u*Au. But

    (u^*Au)^* = u^* A^* (u^*)^* = u^* A u.

Hence u*Au is real for all u ∈ Cn.
2. Suppose λ is an eigenvalue of A and u is an eigenvector for λ.
Then u*Au = u*(λu) = λ(u*u) = λ‖u‖².
Since u*Au is real and ‖u‖² is a nonzero real number, λ is real.
3. Let λ and µ be two distinct eigenvalues of A and u and v be
corresponding eigenvectors. Then Au = λu and Av = µv. Hence, using that λ is
real by part 2,

    \lambda u^* v = (\lambda u)^* v = (Au)^* v = u^*(Av) = u^* \mu v = \mu (u^* v).

Hence (λ - µ)u*v = 0. Since λ ≠ µ, u*v = 0. □
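All three statements are easy to observe numerically. Below is a small NumPy sketch (assuming a randomly generated Hermitian matrix; illustrative only).

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                          # A = A*, i.e. Hermitian

u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
print(np.isclose(np.vdot(u, A @ u).imag, 0))   # 1. u*Au is real

evals, evecs = np.linalg.eig(A)                # generic eigensolver (complex output)
print(np.allclose(evals.imag, 0))              # 2. the eigenvalues are (numerically) real

# 3. eigenvectors for the (generically distinct) eigenvalues are orthogonal:
#    off-diagonal entries of the Gram matrix are ~0
G = evecs.conj().T @ evecs
print(np.allclose(G - np.diag(np.diag(G)), 0))
```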
Corollary
Let A be an n × n skew Hermitian matrix. Then:
1. For any u ∈ Cn, u*Au is either zero or a purely imaginary number.
2. Each eigenvalue of A is either zero or a purely imaginary number.
3. Eigenvectors of A corresponding to distinct eigenvalues are
mutually orthogonal.

Proof: All of this follows straight away from the corresponding statements
about Hermitian matrices, once we note that
A is skew Hermitian implies ιA is Hermitian,
and that a complex number c is real iff ιc is either zero or purely
imaginary. □
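The same trick used in the proof, passing from A to ιA, is visible numerically; a short sketch with a random skew-Hermitian matrix (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B - B.conj().T                       # A = -A*, i.e. skew Hermitian

evals = np.linalg.eigvals(A)
print(np.allclose(evals.real, 0))        # eigenvalues are zero or purely imaginary
print(np.allclose((1j * A).conj().T, 1j * A))   # iA is Hermitian
```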
Definition
Let A be a square matrix over C. A is called
(i) unitary if A*A = I;
(ii) orthogonal if A is real and unitary.

Thus a real matrix A is orthogonal iff A^T = A^{-1}.
Also observe that A is unitary iff A^T is unitary iff Ā is unitary.

Example
The matrices

    U = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
    \quad\text{and}\quad
    V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}

are orthogonal and unitary respectively.
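Both claims can be confirmed directly; here is a small NumPy check for a sample value of θ (an illustration, with θ = 0.7 chosen arbitrarily):

```python
import numpy as np

theta = 0.7
U = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
V = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

print(np.allclose(U.T @ U, np.eye(2)))           # U is orthogonal: U^T U = I
print(np.allclose(V.conj().T @ V, np.eye(2)))    # V is unitary:    V* V = I
```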
Proposition
Let U be a square matrix. Then the following conditions are equivalent.
(i) U is unitary.
(ii) The columns of U form an orthonormal set of vectors.
(iii) The rows of U form an orthonormal set of vectors.
(iv) U preserves the inner product, i.e., for all vectors x, y ∈ Cn, we
have ⟨Ux, Uy⟩ = ⟨x, y⟩.

Proof: Write the matrix U column-wise:

    U = (u_1\ u_2\ \dots\ u_n) \quad\text{so that}\quad
    U^* = \begin{pmatrix} u_1^* \\ u_2^* \\ \vdots \\ u_n^* \end{pmatrix}.
Hence

    U^* U = \begin{pmatrix} u_1^* \\ u_2^* \\ \vdots \\ u_n^* \end{pmatrix} (u_1\ u_2\ \dots\ u_n)
          = \begin{pmatrix}
              u_1^* u_1 & u_1^* u_2 & \dots & u_1^* u_n \\
              u_2^* u_1 & u_2^* u_2 & \dots & u_2^* u_n \\
              \vdots    &           & \ddots & \vdots   \\
              u_n^* u_1 & u_n^* u_2 & \dots & u_n^* u_n
            \end{pmatrix}.

Thus U*U = I
iff u_i* u_j = 0 for i ≠ j and u_i* u_i = 1 for i = 1, 2, . . . , n
iff the column vectors of U form an orthonormal set.
This proves (i) ⇔ (ii).
Since U*U = I implies UU* = I, the proof of (i) ⇔ (iii) follows.
To prove (i) ⇔ (iv), let U be unitary. Then U*U = I and hence

    \langle Ux, Uy \rangle = \langle x, U^*Uy \rangle = \langle x, y \rangle.

Conversely, let U preserve the inner product. Take x = e_i and y = e_j to get

    e_i^* (U^*U) e_j = \langle Ue_i, Ue_j \rangle = \langle e_i, e_j \rangle = e_i^* e_j = \delta_{ij},

where δ_ij are the Kronecker symbols (δ_ij = 1 if i = j and 0 otherwise). This
means the (i, j)-th entry of U*U is δ_ij. Hence U*U = I. □

Remark
The above proposition also holds for an orthogonal matrix, by merely
applying it to a real matrix.
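The equivalences can be observed numerically. The sketch below builds a unitary matrix as the Q-factor of a QR factorization (a convenience of this illustration, not a construction from the slides) and checks (ii), (iii) and (iv).

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(B)                   # the Q-factor of an invertible matrix is unitary

print(np.allclose(U.conj().T @ U, np.eye(3)))   # (i)/(ii): U*U = I, columns orthonormal
print(np.allclose(U @ U.conj().T, np.eye(3)))   # (iii):    UU* = I, rows orthonormal

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
print(np.isclose(np.vdot(U @ x, U @ y), np.vdot(x, y)))   # (iv): <Ux, Uy> = <x, y>
```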
Corollary
Let U be a unitary matrix. Then:
(1) For all x, y ∈ Cn, ⟨Ux, Uy⟩ = ⟨x, y⟩. Hence ‖Ux‖ = ‖x‖.
(2) If λ is an eigenvalue of U then |λ| = 1.
(3) Eigenvectors corresponding to different eigenvalues are
orthogonal.

Proof:
(1) We have ⟨Ux, Uy⟩ = ⟨x, U*Uy⟩ = ⟨x, y⟩ and so

    \|Ux\|^2 = \langle Ux, Ux \rangle = \langle x, x \rangle = \|x\|^2.

(2) If λ is an eigenvalue of U with eigenvector x then Ux = λx. Hence

    \|x\| = \|Ux\| = |\lambda| \|x\|.

Since ‖x‖ ≠ 0, |λ| = 1.
(3) Let Ux = λx and Uy = µy, where x, y are eigenvectors with
distinct eigenvalues λ and µ respectively. Then

    \langle x, y \rangle = \langle Ux, Uy \rangle = \langle \lambda x, \mu y \rangle = \bar{\lambda}\mu \langle x, y \rangle.

Thus λ̄µ = 1 or ⟨x, y⟩ = 0.
Since λ̄λ = 1 and λ ≠ µ, we cannot have λ̄µ = 1 (it would force µ = λ).
Hence ⟨x, y⟩ = 0, i.e., x and y are orthogonal. □
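Again this is easy to observe numerically, using a random unitary matrix built as above (illustrative only; the orthogonality check relies on the eigenvalues being distinct, which holds generically for random data):

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(B)                   # a random unitary matrix

evals, evecs = np.linalg.eig(U)
print(np.allclose(np.abs(evals), 1))     # (2): every eigenvalue has modulus 1

# (3): eigenvectors for the (generically distinct) eigenvalues are orthogonal,
#      so the off-diagonal entries of the Gram matrix are ~0
G = evecs.conj().T @ evecs
print(np.allclose(G - np.diag(np.diag(G)), 0))
```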
Example

    U = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}

is an orthogonal matrix. The characteristic polynomial of U is:

    D(\lambda) = \det(U - \lambda I)
               = \det\begin{pmatrix} \cos\theta - \lambda & -\sin\theta \\ \sin\theta & \cos\theta - \lambda \end{pmatrix}
               = \lambda^2 - 2\lambda\cos\theta + 1.

Roots of D(λ) = 0 are:

    \lambda = \frac{2\cos\theta \pm \sqrt{4\cos^2\theta - 4}}{2} = \cos\theta \pm \iota\sin\theta = e^{\pm\iota\theta}.

Hence |λ| = 1. Check that the eigenvectors are:
for λ = e^{ιθ}: x = (1, -ι)^t and for λ = e^{-ιθ}: y = (1, ι)^t.
Thus x*y = (1 ι)(1, ι)^t = 1 + ι² = 0. Hence x ⊥ y.
Normalize the eigenvectors x and y. Therefore, if we take

    C = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -\iota & \iota \end{pmatrix},

then C^{-1}UC = diag(e^{ιθ}, e^{-ιθ}).
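A direct numerical confirmation of this diagonalization for a sample value of θ (an illustrative sketch with θ = 0.7):

```python
import numpy as np

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
C = np.array([[1, 1],
              [-1j, 1j]]) / np.sqrt(2)

D = np.linalg.inv(C) @ U @ C
print(np.round(D, 10))                   # diagonal matrix diag(e^{i theta}, e^{-i theta})
print(np.round(np.array([np.exp(1j * theta), np.exp(-1j * theta)]), 10))  # expected entries
```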
