6. EIGENVALUES AND EIGENVECTORS

The simplest matrices are the diagonal ones. This naturally leads us to the problem of investigating the existence and construction of a suitable basis with respect to which the matrix associated to a given linear transformation is diagonal.

Definition
An n × n matrix A is called diagonalizable if there exists an invertible n × n matrix M such that M⁻¹AM is a diagonal matrix. A linear map f : V → V is called diagonalizable if the matrix associated to f with respect to some basis is diagonal.

Remark
(i) Clearly, f is diagonalizable iff the matrix associated to f with respect to some basis (equivalently, any basis) is diagonalizable.
Remark
(ii) Let {v1, ..., vn} be a basis. The matrix Mf of a linear transformation f w.r.t. this basis is diagonal iff f(vi) = λi vi, 1 ≤ i ≤ n, for some scalars λi. A natural question here is: does such a basis exist for a given linear transformation?

Definition
Given a linear map f : V → V, we say v ∈ V is an eigenvector for f if v ≠ 0 and f(v) = λv for some λ ∈ K. In that case λ is called an eigenvalue of f.
Proposition
Suppose A is an n × n matrix with n distinct eigenvalues λ1, λ2, ..., λn. Let C be the matrix whose column vectors are respectively v1, v2, ..., vn, where vi is an eigenvector for λi for i = 1, 2, ..., n. Then C⁻¹AC = D(λ1, ..., λn), the diagonal matrix with diagonal entries λ1, ..., λn.

Proof: Writing Xʲ for the j-th column of a matrix X, we have
(AC)ʲ = ACʲ = Avj = λj vj.
Similarly, with D = D(λ1, ..., λn),
(CD)ʲ = CDʲ = λj vj.
Hence AC = CD, as required. To complete the proof, we need to show that C is invertible. This follows from a result proved later, which says that eigenvectors corresponding to distinct eigenvalues are linearly independent. In particular, C has n linearly independent columns, so rank(C) = n and C is invertible. □
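As a quick sanity check of this proposition, the sketch below (plain Python, no external libraries) verifies C⁻¹AC = D(1, 3) for the 2 × 2 matrix worked out in Example (1) later in these notes, whose distinct eigenvalues 1 and 3 have eigenvectors (1, 0) and (1, 1):

```python
# Verify C^{-1}AC = D(λ1, λ2) for A = [[1, 2], [0, 3]], whose distinct
# eigenvalues 1 and 3 have eigenvectors (1, 0) and (1, 1) respectively.

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [0, 3]]
C = [[1, 1], [0, 1]]      # columns are the eigenvectors (1, 0) and (1, 1)
Cinv = [[1, -1], [0, 1]]  # inverse of C, computed by hand

D = matmul(Cinv, matmul(A, C))
print(D)  # [[1, 0], [0, 3]], i.e. the diagonal matrix D(1, 3)
```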
Remark
(i) It is easy to see that the eigenvalues and eigenvectors of a linear transformation are the same as those of the associated matrix.
(ii) Even if a linear map is not diagonalizable, the existence of eigenvectors and eigenvalues itself throws some light on the nature of the linear map. They arise naturally in the study of differential equations. Here we shall use them to address the problem of diagonalization and then see some geometric applications of diagonalization itself.

Definition
For a square matrix A, the eigenspace of A w.r.t. an eigenvalue λ is defined as
EA(λ) := {v : Av = λv}.
Characteristic Polynomial

Proposition
(1) The eigenvalues of a square matrix A are the solutions of the equation det(A − λI) = 0.
(2) The null space of A − λI is equal to the eigenspace EA(λ).

Proof:
(1) If v is an eigenvector of A for the eigenvalue λ, then v ≠ 0 and Av = λv. Hence (A − λI)v = 0. Thus the nullity of A − λI is positive, so rank(A − λI) is less than n, and hence det(A − λI) = 0.
(2) EA(λ) = {v ∈ V : Av = λv} = {v ∈ V : (A − λI)v = 0} = N(A − λI). □
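For a 2 × 2 matrix this proposition gives a concrete recipe: det(A − λI) = λ² − tr(A)λ + det(A), so the eigenvalues are the roots of a quadratic. A minimal sketch, assuming the discriminant is nonnegative (real eigenvalues):

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Roots of det(A - λI) = λ² - tr(A)·λ + det(A) for A = [[a, b], [c, d]].

    Assumes the discriminant is nonnegative, i.e. real eigenvalues.
    """
    tr = a + d
    det = a * d - b * c
    r = math.sqrt(tr * tr - 4 * det)
    return sorted([(tr - r) / 2, (tr + r) / 2])

print(eigenvalues_2x2(1, 2, 0, 3))  # [1.0, 3.0]
```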
Definition
For any square matrix A, the polynomial det(A − λI) in λ is called the characteristic polynomial of A and is denoted by χA(λ).

Example
(1) A = [1 2; 0 3] (rows separated by semicolons). To find the eigenvalues of A, we solve the equation
det(A − λI) = det [1−λ 2; 0 3−λ] = (1 − λ)(3 − λ) = 0.
Hence the eigenvalues of A are 1 and 3. Let us calculate the eigenspaces E(1) and E(3).
Example
A − I = [0 2; 0 2]. Hence (x, y)ᵀ ∈ E(1) iff
[0 2; 0 2] (x, y)ᵀ = (2y, 2y)ᵀ = (0, 0)ᵀ.
Hence E(1) = L({(1, 0)}).
A − 3I = [1−3 2; 0 3−3] = [−2 2; 0 0].
Suppose [−2 2; 0 0] (x, y)ᵀ = (0, 0)ᵀ. Then (−2x + 2y, 0)ᵀ = (0, 0)ᵀ. This is possible iff x = y. Thus E(3) = L({(1, 1)}).
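These hand computations are easy to double-check mechanically: v lies in E(λ) exactly when Av = λv. A minimal sketch:

```python
# Check that (1, 0) is in E(1) and (1, 1) is in E(3) for A = [[1, 2], [0, 3]].

def matvec(M, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[1, 2], [0, 3]]
for lam, v in [(1, [1, 0]), (3, [1, 1])]:
    assert matvec(A, v) == [lam * x for x in v], (lam, v)
print("eigenvector checks pass")
```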
Example
(2) Let A = [3 0 0; −2 4 2; −2 1 5]. Then det(A − λI) = (3 − λ)²(6 − λ). Hence the eigenvalues of A are 3 and 6.
The eigenvalue λ = 3 is a double root of the characteristic polynomial of A. We say that λ = 3 has algebraic multiplicity 2.
Let us find the eigenspaces E(3) and E(6).
λ = 3: A − 3I = [0 0 0; −2 1 2; −2 1 2]. Hence rank(A − 3I) = 1, and thus nullity(A − 3I) = 2. By solving the system (A − 3I)v = 0, we find that
N(A − 3I) = EA(3) = L({(1, 0, 1), (1, 2, 0)}).
The dimension of EA(λ) is called the geometric multiplicity of λ. Hence the geometric multiplicity of λ = 3 is 2.
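The claimed spanning set of EA(3) can likewise be verified directly, since both vectors must satisfy Av = 3v:

```python
# Verify Av = 3v for the two spanning vectors of E_A(3) in Example (2).
A = [[3, 0, 0], [-2, 4, 2], [-2, 1, 5]]

def matvec(M, v):
    return [sum(row[j] * v[j] for j in range(3)) for row in M]

for v in ([1, 0, 1], [1, 2, 0]):
    assert matvec(A, v) == [3 * x for x in v]
print("both vectors lie in E_A(3)")
```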
Example
λ = 6: A − 6I = [−3 0 0; −2 −2 2; −2 1 −1]. Hence rank(A − 6I) = 2, and thus dim EA(6) = 1. (It can be shown that {(0, 1, 1)} is a basis of EA(6).) Thus both the algebraic and geometric multiplicities of the eigenvalue 6 are equal to 1.
(3) A = [1 1; 0 1]. Then det(A − λI) = (1 − λ)², so λ = 1 has algebraic multiplicity 2.
A − I = [0 1; 0 0]. Hence nullity(A − I) = 1 and EA(1) = L({e1}). In this case the geometric multiplicity of the eigenvalue 1 is less than its algebraic multiplicity.
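Both claims on this slide can be confirmed numerically: (0, 1, 1) satisfies Av = 6v for the matrix of Example (2), and for Example (3) the map v ↦ (A − I)v kills only the multiples of e1:

```python
# (a) (0, 1, 1) is an eigenvector of the 3x3 matrix of Example (2) for λ = 6.
A = [[3, 0, 0], [-2, 4, 2], [-2, 1, 5]]
v = [0, 1, 1]
assert [sum(row[j] * v[j] for j in range(3)) for row in A] == [6 * x for x in v]

# (b) For A = [[1, 1], [0, 1]], (A - I)(x, y) = (y, 0), which vanishes
# iff y = 0; so E_A(1) = L({e1}) and the geometric multiplicity is 1.
def a_minus_i(x, y):
    return (y, 0)

assert a_minus_i(1, 0) == (0, 0)   # e1 is an eigenvector for λ = 1
assert a_minus_i(0, 1) != (0, 0)   # e2 is not
print("multiplicity checks pass")
```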
Remark
(i) Observe that χA(λ) = χM⁻¹AM(λ). Thus the characteristic polynomial is an invariant of similarity. Consequently, the characteristic polynomial of any linear map f : V → V (with V finite dimensional) is also defined: choose some basis for V and take the characteristic polynomial of the associated matrix M(f) of f. This definition does not depend upon the choice of the basis.
Remark
(ii) (contd.) Thus in general, the characteristic polynomial has n complex roots, some of which may be repeated, some of them real, and so on. All these patterns are going to influence the geometry of the linear map.
Lemma
Suppose A is a real matrix with a real eigenvalue λ. Then there exists a real column vector v ≠ 0 such that Av = λv.

Proof: Pick a (possibly complex) eigenvector w = v + ιv′ for λ, where v and v′ are real column vectors. We then have Av + ιAv′ = λ(v + ιv′). Comparing the real and imaginary parts (λ being real) gives Av = λv and Av′ = λv′. Since w ≠ 0, at least one of the two v, v′ must be a nonzero vector, and we are done. □
Proposition
Let A be an n × n matrix with eigenvalues λ1, λ2, ..., λn. Then
(i) tr(A) = λ1 + λ2 + ... + λn.
(ii) det A = λ1 λ2 ... λn.
Proof: Expanding det(A − λI), the terms of degree n and n − 1 in λ come only from the product of the diagonal entries (a11 − λ)(a22 − λ) ... (ann − λ), so
det(A − λI) = (−1)ⁿ[λⁿ − (a11 + a22 + ... + ann)λⁿ⁻¹ + ...], (∗)
and the constant term is det(A − 0·I) = det A. On the other hand,
det(A − λI) = (−1)ⁿ(λ − λ1)(λ − λ2) ... (λ − λn)
= (−1)ⁿ[λⁿ − (λ1 + λ2 + ... + λn)λⁿ⁻¹ + ... + (−1)ⁿ λ1 λ2 ... λn]. (∗∗)
Comparing (∗) and (∗∗), we get that the constant term of det(A − λI) is equal to λ1 λ2 ... λn = det A, and
tr(A) = a11 + a22 + ... + ann = λ1 + λ2 + ... + λn. □
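The two identities are easy to spot-check; for the 2 × 2 matrix from Example (1), with eigenvalues 1 and 3:

```python
# tr(A) = λ1 + λ2 and det(A) = λ1·λ2 for A = [[1, 2], [0, 3]].
A = [[1, 2], [0, 3]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
assert trace == 1 + 3   # sum of the eigenvalues
assert det == 1 * 3     # product of the eigenvalues
print(trace, det)  # 4 3
```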
Proposition
Let v1, v2, ..., vk be eigenvectors of a matrix A associated to distinct eigenvalues λ1, λ2, ..., λk. Then v1, v2, ..., vk are linearly independent.
Proof: Apply induction on k. The case k = 1 is clear, since v1 ≠ 0. Suppose k ≥ 2 and
c1 v1 + c2 v2 + ... + ck vk = 0
for some scalars c1, c2, ..., ck. Applying A gives c1 Av1 + c2 Av2 + ... + ck Avk = 0, i.e.,
c1 λ1 v1 + c2 λ2 v2 + ... + ck λk vk = 0.
Hence
λ1(c1 v1 + c2 v2 + ... + ck vk) − (λ1 c1 v1 + λ2 c2 v2 + ... + λk ck vk) = (λ1 − λ2)c2 v2 + ... + (λ1 − λk)ck vk = 0.
By the induction hypothesis, v2, ..., vk are linearly independent, so (λ1 − λj)cj = 0 for j = 2, ..., k. Since the eigenvalues are distinct, c2 = ... = ck = 0, and then c1 v1 = 0 forces c1 = 0. □
Relation Between Algebraic and Geometric Multiplicities

Definition
The algebraic multiplicity aA(µ) of an eigenvalue µ of a matrix A is defined to be the multiplicity of µ as a root of the polynomial χA(λ).

Definition
The geometric multiplicity gA(µ) of an eigenvalue µ of A is defined to be the dimension of the eigenspace EA(µ).
Proposition
Both the algebraic and the geometric multiplicities are invariants of similarity.

Proposition
Let A be an n × n matrix. Then the geometric multiplicity of an eigenvalue µ of A is less than or equal to the algebraic multiplicity of µ.
Proof: Let g = gA(µ) be the geometric multiplicity of µ. Then EA(µ) has a basis consisting of g eigenvectors v1, v2, ..., vg. We can extend this basis of EA(µ) to a basis {v1, v2, ..., vg, ..., vn} of Cⁿ. Let P be the matrix whose j-th column is vj. Then P is an invertible matrix and, in block form,
P⁻¹AP = [µIg X; 0 Y].
Hence det(A − λI) = det(P⁻¹AP − λI) = (µ − λ)^g det(Y − λI), so (λ − µ)^g divides det(A − λI). Let k = aA(µ). Then (λ − µ)^k divides det(A − λI), but (λ − µ)^(k+1) does not. Thus g ≤ k. □
Remark
Observe that the geometric multiplicity and the algebraic multiplicity of an eigenvalue coincide for a diagonal matrix.

Theorem
An n × n matrix A is diagonalizable if and only if for each eigenvalue µ ∈ C of A, the algebraic and geometric multiplicities are equal:
aA(µ) = gA(µ).
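For the 3 × 3 matrix of Example (2) the multiplicities match (aA(3) = gA(3) = 2 and aA(6) = gA(6) = 1), so the theorem predicts diagonalizability. Since the eigenvector matrix C is invertible, checking C⁻¹AC = D reduces to checking AC = CD:

```python
# Columns of C are the eigenvectors (1,0,1), (1,2,0), (0,1,1); D = D(3, 3, 6).
A = [[3, 0, 0], [-2, 4, 2], [-2, 1, 5]]
C = [[1, 1, 0], [0, 2, 1], [1, 0, 1]]
D = [[3, 0, 0], [0, 3, 0], [0, 0, 6]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

assert matmul(A, C) == matmul(C, D)
print("AC = CD, hence C^{-1}AC = D(3, 3, 6)")
```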
Proof (contd.): To prove the converse, suppose that the two multiplicities coincide for each eigenvalue. Suppose that λ1, λ2, ..., λk are all the eigenvalues of A with algebraic multiplicities n1, n2, ..., nk. Then n1 + ... + nk = n.
7. INNER PRODUCT SPACES

The parallelogram law, the ability to measure the angle between two vectors and, in particular, the concept of perpendicularity make Euclidean space a very special kind of vector space. Essentially all of these are consequences of the dot product.

Definition
Let V be a vector space over K. By an inner product on V we mean a 'binary operation' which associates a scalar ⟨u, v⟩ to each pair of vectors u, v in V, satisfying the following properties for all u, v, w in V and all scalars α, β:
(1) ⟨u, v⟩ equals the complex conjugate of ⟨v, u⟩ (conjugate symmetry);
(2) ⟨u, αv + βw⟩ = α⟨u, v⟩ + β⟨u, w⟩ (linearity in the second argument);
(3) ⟨u, u⟩ ≥ 0, with ⟨u, u⟩ = 0 iff u = 0 (positive definiteness).
Remark
(i) The above definition includes both cases: V a real vector space or a complex vector space. If K = R, then the conjugation is just the identity. Thus for real vector spaces, (1) becomes the 'symmetry property' (since for c ∈ R, we have c̄ = c).

Example
(i) The dot product in Rⁿ is an inner product. This is often called the standard inner product.
Example
= (x1 , x2 , . . . , xn )T and y = (y1 , y2 , . . . , yn )T ∈ Cn . Define
(ii) Let x P
hx, yi = ni=1 xi yi = x∗ y. This is an inner product on the complex
vector space Cn .
23/51
Example
= (x1 , x2 , . . . , xn )T and y = (y1 , y2 , . . . , yn )T ∈ Cn . Define
(ii) Let x P
hx, yi = ni=1 xi yi = x∗ y. This is an inner product on the complex
vector space Cn .
(iii) Let V = C[0, 1]. For f , g ∈ V , put
Z 1
hf , gi = f (t)g(t) dt.
0
23/51
Example
(ii) Let x = (x1, x2, . . . , xn)ᵀ and y = (y1, y2, . . . , yn)ᵀ ∈ Cⁿ. Define
⟨x, y⟩ = ∑ⁿᵢ₌₁ x̄ᵢ yᵢ = x∗y.
This is an inner product on the complex vector space Cⁿ.
(iii) Let V = C[0, 1]. For f, g ∈ V, put
⟨f, g⟩ = ∫₀¹ f(t)g(t) dt.
Definition
Let V be an inner product space. For any v ∈ V, the norm of v, denoted
by ‖v‖, is the positive square root of ⟨v, v⟩:
‖v‖ = √⟨v, v⟩.
23/51
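Example (ii) can be tried out directly; the sketch below assumes NumPy, whose `np.vdot` conjugates its first argument and so computes exactly x∗y.

```python
import numpy as np

# The inner product <x, y> = sum conj(x_i) y_i = x* y on C^n.
x = np.array([1 + 1j, 2 - 1j])
y = np.array([3j, 1 + 2j])

ip = np.vdot(x, y)  # np.vdot conjugates its first argument
assert np.isclose(ip, np.sum(np.conj(x) * y))
# Conjugate symmetry: <y, x> = conj(<x, y>)
assert np.isclose(np.vdot(y, x), np.conj(ip))
# <x, x> is real and positive
assert np.isclose(np.vdot(x, x).imag, 0) and np.vdot(x, x).real > 0
```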
Norm Associated to an Inner Product
Proposition
Let V be an inner product space. Let u, v ∈ V and c be a scalar. Then
(i) ‖cu‖ = |c|‖u‖;
(ii) ‖u‖ > 0 for u ≠ 0;
(iii) |⟨u, v⟩| ≤ ‖u‖‖v‖ (the Cauchy–Schwarz inequality);
(iv) ‖u + v‖ ≤ ‖u‖ + ‖v‖; equality holds in this triangle inequality iff
one of u, v is a nonnegative real multiple of the other.
Proof sketch of (iii): we may assume u ≠ 0. Set a = ⟨u, v⟩/‖u‖² and
w = v − au.
Then ⟨u, w⟩ = 0, and
0 ≤ ‖w‖² = ‖v‖² − |⟨u, v⟩|²/‖u‖²,
which gives (iii).
Now equality holds in (iv) iff Re⟨u, v⟩ = ⟨u, v⟩ and equality holds in (iii).
26/51
Definition
In any inner product space V , the distance between u and v is defined
as
d(u, v) = ku − vk.
26/51
Distance and Angle
The distance function so defined is easily seen to satisfy the properties
that any reasonable notion of distance would be expected to satisfy:
Proposition
(i) d(u, v) ≥ 0 and d(u, v) = 0 iff u = v;
(ii) d(u, v) = d(v, u);
(iii) d(u, w) ≤ d(u, v) + d(v, w).
Definition
Let V be a real inner product space. The angle between any nonzero
vectors u, v is defined to be the unique θ ∈ R with 0 ≤ θ ≤ π such that
cos θ = ⟨u, v⟩ / (‖u‖‖v‖).
27/51
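The angle formula and the triangle inequality for d(u, v) = ‖u − v‖ can be sketched numerically; this assumes NumPy, and the `clip` call is only there to guard against rounding pushing the cosine slightly outside [−1, 1].

```python
import numpy as np

# Angle between nonzero vectors, via cos(theta) = <u, v> / (||u|| ||v||).
def angle(u, v):
    c = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))  # clip guards rounding error

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])
assert np.isclose(angle(u, v), np.pi / 4)

# Triangle inequality for the induced distance d(u, v) = ||u - v||
w = np.array([0.0, 2.0])
assert np.linalg.norm(u - w) <= np.linalg.norm(u - v) + np.linalg.norm(v - w)
```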
Gram-Schmidt Orthonormalization
The standard basis E = {e1 , e2 , . . . , en } of Rn has two properties :
(i) each ei has length one
(ii) any two elements of E are mutually perpendicular.
Now, let V be any inner product space of dimension d. We wish to
construct an analogue of E for V .
Definition
Two vectors u, v of an inner product space V are called orthogonal to
each other or perpendicular to each other if hu, vi = 0. We express this
symbolically by u ⊥ v.
A set S consisting of nonzero elements of V is called orthogonal if
⟨u, v⟩ = 0 for every pair of distinct elements u, v of S.
Further if every element of an orthogonal set S is of unit norm then S
is called an orthonormal set. If an orthonormal set S is a basis for V
then S is called an orthonormal basis.
28/51
Theorem
(Pythagoras Theorem) Let V be an inner product space. Given
w, u ∈ V such that w ⊥ u, we have
‖w + u‖² = ‖w‖² + ‖u‖².
Proof: We have
‖w + u‖² = ⟨w + u, w + u⟩ = ⟨w, w⟩ + ⟨u, w⟩ + ⟨w, u⟩ + ⟨u, u⟩ = ‖w‖² + ‖u‖²
since ⟨u, w⟩ = ⟨w, u⟩ = 0. □
Proposition
Let S = {u1 , u2 , . . . , un } be an orthogonal set of nonzero vectors in an
inner product space V . Then S is linearly independent.
29/51
Proof: Suppose c1, c2, . . . , cn are scalars with
c1u1 + c2u2 + . . . + cnun = 0.
Taking the inner product of both sides with ui and using orthogonality gives
ci⟨ui, ui⟩ = 0.
Since ui ≠ 0, we have ⟨ui, ui⟩ ≠ 0, and hence ci = 0 for each i. □
Definition
If u and v are vectors in an inner product space V and v ≠ 0, then the
vector
(⟨v, u⟩/‖v‖²) v
is called the orthogonal projection of u along v.
30/51
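The orthogonal projection along a single vector can be sketched as follows, assuming NumPy; the key property, used repeatedly below, is that the residual u − p is orthogonal to v.

```python
import numpy as np

# Orthogonal projection of u along v: (<v, u> / ||v||^2) v.
def proj(u, v):
    return (np.vdot(v, u) / np.vdot(v, v)) * v

u = np.array([2.0, 1.0])
v = np.array([1.0, 0.0])
p = proj(u, v)
assert np.allclose(p, [2.0, 0.0])
# The residual u - p is orthogonal to v
assert np.isclose(np.vdot(v, u - p), 0)
```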
Theorem
(Gram-Schmidt Orthonormalization Process): Let V be any inner
product space. Given a linearly independent subset {u1 , . . . , un } there
exists an orthonormal set of vectors {v1 , . . . , vn } such that
L({u1 , . . . , uk }) = L({v1 , . . . , vk })
for all 1 ≤ k ≤ n.
31/51
Proof: First we construct an orthogonal set
{w1, . . . , wn}
with L({u1, . . . , uk}) = L({w1, . . . , wk}) for all 1 ≤ k ≤ n.
Then we ‘normalize’ each element, viz., take
vi = wi/‖wi‖.
That will give us an orthonormal set as required. The last part of the
theorem follows if we begin with {u1, . . . , un} as any basis for V.
Take w1 := u1 . So, the construction is over for k = 1. To construct w2 ,
subtract from u2 , its orthogonal projection along w1 .
32/51
Thus
w2 = u2 − (⟨w1, u2⟩/‖w1‖²) w1.
Check that ⟨w2, w1⟩ = 0.
Thus {w1, w2} is an orthogonal set. Check also that
L({u1, u2}) = L({w1, w2}).
Now suppose we have constructed wi for i ≤ k < n as required. Put
wk+1 = uk+1 − ∑ᵏⱼ₌₁ (⟨wj, uk+1⟩/‖wj‖²) wj.
33/51
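The construction above translates almost line by line into code; here is a minimal sketch assuming NumPy, subtracting from each uₖ₊₁ its projections along w₁, . . . , wₖ and then normalizing.

```python
import numpy as np

# Gram-Schmidt: w_{k+1} = u_{k+1} - sum_j (<w_j, u_{k+1}> / ||w_j||^2) w_j,
# then normalize each w_i to get an orthonormal set.
def gram_schmidt(U):
    """Columns of U are linearly independent; returns orthonormal columns."""
    W = []
    for u in U.T:                      # iterate over columns of U
        w = u.astype(float)
        for wj in W:
            w = w - (np.vdot(wj, u) / np.vdot(wj, wj)) * wj
        W.append(w)
    return np.column_stack([w / np.linalg.norm(w) for w in W])

U = np.array([[1.0, 1.0],
              [0.0, 1.0]])
V = gram_schmidt(U)
assert np.allclose(V.T @ V, np.eye(2))   # columns are orthonormal
assert np.allclose(V[:, 0], [1.0, 0.0])  # spans preserved step by step
```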
Example
Let V = P3[−1, 1] denote the real vector space of polynomials of degree
at most 3 defined on [−1, 1], together with the zero polynomial, with the
inner product ⟨f, g⟩ = ∫₋₁¹ f(t)g(t) dt. Apply Gram–Schmidt to the basis
{1, x, x², x³}. Then w1 = 1, with ‖w1‖² = 2, and
w2 = x − (⟨w1, x⟩/‖w1‖²) w1 = x, with ‖w2‖² = 2/3.
34/51
Example
w3 = x² − ⟨x², 1⟩ (1/2) − ⟨x², x⟩ (x/(2/3))
   = x² − (1/2) ∫₋₁¹ t² dt − (3/2) x ∫₋₁¹ t³ dt
   = x² − 1/3,

w4 = x³ − ⟨x³, 1⟩ (1/2) − ⟨x³, x⟩ (x/(2/3)) − ⟨x³, x² − 1/3⟩ (w3/‖w3‖²)
   = x³ − (3/5) x.
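The computations above can be double-checked symbolically; this sketch assumes SymPy and simply verifies that w3 and w4 are orthogonal to all lower-degree members under ⟨f, g⟩ = ∫₋₁¹ f(t)g(t) dt.

```python
import sympy as sp

# Verify the Gram-Schmidt output on P3[-1, 1] symbolically.
x = sp.symbols('x')
ip = lambda f, g: sp.integrate(f * g, (x, -1, 1))  # <f, g> on [-1, 1]

w3 = x**2 - sp.Rational(1, 3)
w4 = x**3 - sp.Rational(3, 5) * x

# w3, w4 are orthogonal to 1, x and to each other
assert ip(w3, 1) == 0 and ip(w3, x) == 0
assert ip(w4, 1) == 0 and ip(w4, x) == 0 and ip(w4, w3) == 0
```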
Orthogonal Complement
Definition
Let V be an inner product space. Given a subspace W of V , we define
W ⊥ := {v ∈ V : v ⊥ w, ∀ w ∈ W }
Proposition
Let W be a subspace of an inner product space. Then
(i) W ⊥ is a subspace of V ;
(ii) W ∩ W ⊥ = {0};
(iii) W ⊂ (W ⊥ )⊥ ;
36/51
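Concretely, if W is the column span of a matrix A, then W⊥ is the null space of Aᵀ, since v ⊥ every column of A iff Aᵀv = 0. A sketch of computing it, assuming NumPy and using the SVD:

```python
import numpy as np

# W = column span of A in R^3 (here: the x-y plane); W-perp = null(A^T).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# Null space of A^T: right-singular vectors for (numerically) zero
# singular values.
_, s, Vt = np.linalg.svd(A.T)
rank = len(s[s > 1e-12])
null = Vt[rank:].T                     # basis of W-perp, as columns

assert np.allclose(A.T @ null, 0)      # every basis vector lies in W-perp
assert null.shape == (3, 1)            # dim W + dim W-perp = n
```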
Proposition
Let W be a finite dimensional subspace of an inner product space V.
Then every element v ∈ V is expressible as a unique sum
v = w + w⊥ (∗)
where w ∈ W and w⊥ ∈ W⊥.
37/51
Finally, let u0 ∈ W be any vector. Then
v − u0 = (v − w) + (w − u0).
Since w − u0 ∈ W and v − w ∈ W⊥, we can apply the Pythagoras theorem
to get
‖v − u0‖² = ‖v − w‖² + ‖w − u0‖²
and conclude that ‖v − u0‖² ≥ ‖v − w‖², with equality iff u0 = w.
This proves that w is the (only) nearest point to v in W. □
Remark
(i) Observe that if V itself is finite dimensional, then the above
proposition is applicable to all subspaces of V.
38/51
Remark
(ii) The element w so obtained is called the orthogonal projection of v
on W. One can easily check that the assignment v ↦ w, where
w = ∑ᵢ ⟨wᵢ, v⟩wᵢ
for an orthonormal basis {wᵢ} of W, defines a linear map PW : V → V
whose range is W.
Also observe that PW is the identity on W and hence
PW² := PW ◦ PW = PW.
39/51
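In matrix terms, if Q has an orthonormal basis of W as its columns, then P_W is the matrix QQᵀ; a sketch assuming NumPy, with W taken to be an illustrative plane in R³:

```python
import numpy as np

# Orthogonal projection onto W = span of the columns of Q, where the
# columns of Q are orthonormal: P = Q Q^T realizes v -> sum_i <w_i, v> w_i.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])             # orthonormal basis of the x-y plane
P = Q @ Q.T

assert np.allclose(P @ P, P)           # P_W^2 = P_W (idempotent)
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(P @ v, [1.0, 2.0, 0.0])  # range of P_W is W
assert np.allclose(P @ Q, Q)           # P_W is the identity on W
```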
Eigenvalues of Special Matrices
Definition
Let A be a square matrix with complex entries. A is called
(i) Hermitian if A = A∗ ;
(ii) Skew Hermitian if A = −A∗ .
40/51
Lemma
A is Hermitian iff for all column vectors v, w we have ⟨Av, w⟩ = ⟨v, Aw⟩.
Remark
(i) If A is real then A = A∗ means A = At . Hence real symmetric
matrices are Hermitian.
(ii) A is Hermitian iff ιA is skew Hermitian.
41/51
Proposition
Let A be an n × n Hermitian matrix. Then:
1. For any u ∈ Cn , u∗ Au (i.e., hu, Aui) is a real number.
2. All eigenvalues of A are real.
3. Eigenvectors of a Hermitian matrix corresponding to distinct
eigenvalues are mutually orthogonal.
Proof:
1. Since u∗ Au is a complex number, to prove it is real, we prove that
(u∗ Au)∗ = u∗ Au.
But
(u∗ Au)∗ = u∗ A∗ (u∗ )∗ = u∗ Au.
Hence u∗ Au is real for all u ∈ Cn .
2. Suppose λ is an eigenvalue of A and u is an eigenvector for λ.
Then u∗Au = u∗(λu) = λ(u∗u) = λ‖u‖².
Since u∗Au is real and ‖u‖² is a positive real number, λ is real.
42/51
3. Let λ and µ be two distinct eigenvalues of A and u and v be
corresponding eigenvectors. Then Au = λu and Av = µv. Hence
u∗Av = µ(u∗v), while also u∗Av = (Au)∗v = λ̄(u∗v) = λ(u∗v),
using A∗ = A and the fact that λ is real. Thus (λ − µ)(u∗v) = 0, and
since λ ≠ µ we conclude u∗v = 0, i.e., u ⊥ v. □
42/51
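The proposition can be illustrated numerically for a sample Hermitian matrix; this sketch assumes NumPy, whose `np.linalg.eigh` is the eigensolver intended for Hermitian matrices and returns real eigenvalues.

```python
import numpy as np

# Illustrate the proposition on a sample 2x2 Hermitian matrix.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)             # A is Hermitian

u = np.array([1.0 + 2j, -1j])
assert np.isclose(np.vdot(u, A @ u).imag, 0)  # u* A u is real

lam, V = np.linalg.eigh(A)                    # Hermitian eigensolver
assert np.all(np.isreal(lam))                 # all eigenvalues are real
# Eigenvectors for distinct eigenvalues are orthogonal
assert np.isclose(np.vdot(V[:, 0], V[:, 1]), 0)
```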
Corollary
Let A be an n × n skew Hermitian matrix. Then:
1. For any u ∈ Cn , u∗ Au is either zero or a purely imaginary number.
2. Each eigenvalue of A is either zero or a purely imaginary number.
3. Eigenvectors of A corresponding to distinct eigenvalues are
mutually orthogonal.
Proof: All of this follows straightaway from the corresponding
statements about Hermitian matrices, once we note that
A is skew Hermitian implies ιA is Hermitian,
and that a complex number c is real iff ιc is either zero or purely
imaginary. □
43/51
Definition
Let A be a square matrix over C. A is called
(i) unitary if A∗ A = I;
(ii) orthogonal if A is real and unitary.
44/51
Proposition
Let U be a square matrix. Then the following conditions are equivalent.
(i) U is unitary.
(ii) The columns of U form an orthonormal set of vectors.
(iii) The rows of U form an orthonormal set of vectors.
(iv) U preserves the inner product, i.e., for all vectors x, y ∈ Cⁿ, we
have ⟨Ux, Uy⟩ = ⟨x, y⟩.
45/51
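All four conditions can be checked at once on a sample matrix; the sketch below assumes NumPy, and the particular U = (1/√2)[[1, 1], [1j, −1j]] is just an illustrative choice of unitary matrix.

```python
import numpy as np

# A sample unitary matrix, used to illustrate conditions (i)-(iv).
U = np.array([[1, 1],
              [1j, -1j]]) / np.sqrt(2)

assert np.allclose(U.conj().T @ U, np.eye(2))     # (i)  U* U = I
assert np.isclose(np.vdot(U[:, 0], U[:, 1]), 0)   # (ii) columns orthonormal
assert np.isclose(np.vdot(U[0], U[1]), 0)         # (iii) rows orthonormal
# (iv) U preserves the inner product
x, y = np.array([1.0, 2j]), np.array([3.0, -1j])
assert np.isclose(np.vdot(U @ x, U @ y), np.vdot(x, y))
```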
Hence, writing U = (u1 u2 . . . un) in terms of its columns,

         u1∗u1  u1∗u2  . . .  u1∗un
U∗U  =   u2∗u1  u2∗u2  . . .  u2∗un
           ..                   ..
         un∗u1  un∗u2  . . .  un∗un

i.e., the (i, j)-th entry of U∗U is ui∗uj. Thus
U∗U = I
iff ui∗uj = 0 for i ≠ j and ui∗ui = 1 for i = 1, 2, . . . , n,
iff the column vectors of U form an orthonormal set.
This proves (i) ⇔ (ii).
46/51
Since U∗U = I implies UU∗ = I, the proof of (i) ⇔ (iii) follows.
To prove (i) ⇒ (iv), let U be unitary. Then U∗U = I and hence
⟨Ux, Uy⟩ = (Ux)∗(Uy) = x∗U∗Uy = x∗y = ⟨x, y⟩.
Conversely, taking x = ei and y = ej in (iv) shows that the columns of U
form an orthonormal set, which is (ii).
Remark
The above proposition is valid for orthogonal matrices as well, by
applying it to a real matrix.
47/51
Corollary
Let U be a unitary matrix. Then:
(1) For all x, y ∈ Cn , hUx, Uyi = hx, yi. Hence kUxk = kxk.
(2) If λ is an eigenvalue of U then |λ| = 1.
(3) Eigenvectors corresponding to different eigenvalues are
orthogonal.
Proof:
(1) We have ⟨Ux, Uy⟩ = ⟨x, U∗Uy⟩ = ⟨x, y⟩, and so
‖Ux‖² = ⟨Ux, Ux⟩ = ⟨x, x⟩ = ‖x‖².
(2) If Ux = λx with x ≠ 0, then ‖x‖ = ‖Ux‖ = |λ|‖x‖. Hence |λ| = 1.
48/51
(3) Let Ux = λx and Uy = µy where x, y are eigenvectors with
distinct eigenvalues λ and µ respectively. Then
⟨x, y⟩ = ⟨Ux, Uy⟩ = λ̄µ⟨x, y⟩,
and since |λ| = 1 we have λ̄µ = µ/λ ≠ 1. Hence ⟨x, y⟩ = 0. □
48/51
Example
U = [cos θ  −sin θ]
    [sin θ   cos θ]
is an orthogonal matrix. The characteristic polynomial of U is:
D(λ) = det(U − λI) = det [cos θ − λ   −sin θ ]
                         [sin θ    cos θ − λ]
     = λ² − 2λ cos θ + 1.
Roots of D(λ) = 0 are:
λ = (2 cos θ ± √(4 cos² θ − 4)) / 2 = cos θ ± ι sin θ = e^{±ιθ}.
Hence |λ| = 1. Check that the eigenvectors are:
for λ = e^{ιθ}: x = (1, −ι)ᵀ and for λ = e^{−ιθ}: y = (1, ι)ᵀ.
49/51
Example
Thus x∗y = (1  ι)(1, ι)ᵀ = 1 + ι² = 0. Hence x ⊥ y.
Normalize the eigenvectors x and y. Therefore if we take
C = (1/√2) [ 1   1]
           [−ι   ι]
50/51
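The whole rotation-matrix example can be verified numerically; this sketch assumes NumPy and an arbitrary illustrative angle θ = 0.7, checking that C is unitary and that C∗UC is the expected diagonal matrix diag(e^{ιθ}, e^{−ιθ}).

```python
import numpy as np

# Verify the rotation-matrix example: C, built from the normalized
# eigenvectors, is unitary and diagonalizes U.
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
C = np.array([[1, 1],
              [-1j, 1j]]) / np.sqrt(2)

assert np.allclose(C.conj().T @ C, np.eye(2))  # C is unitary
D = C.conj().T @ U @ C                         # C^{-1} U C since C unitary
assert np.allclose(D, np.diag([np.exp(1j * theta), np.exp(-1j * theta)]))
```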