

KARNATAKA STATE OPEN UNIVERSITY 
Manasagangotri, Mysore­570 006 

M. Sc.  Mathematics 
(Semester Scheme) 

Math 2.1: Linear Algebra 
(II Semester) 
 

Dr.  B.  CHALUVARAJU. M. Sc., Ph. D 
Assistant Professor (Senior Scale) 
Department of Mathematics, Central College Campus,  
Bangalore University,  Bangalore‐560 001, INDIA 
E‐Mail: bcrbub@gmail.com 


TABLE OF CONTENTS 
•  Preface 
•  Preliminaries 

BLOCK - I: Vector Spaces, Linear Transformations and Matrices 
• Objectives 
• UNIT-1: 
1. 0.  Introduction 
1. 1.  Vector Spaces 
1. 2.  Subspaces 
1. 3.  Linear Combinations and Systems of Linear Equations 
1. 4.  Linear Dependence and Linear Independence 
1. 5.  Bases and Dimension 
1. 6.  Maximal Linearly Independent Subsets 
1. 7.  Summary 
1. 8.  Keywords 

• UNIT-2: 
2. 0.  Introduction 
2. 1.  Linear Transformations, Null Spaces and Ranges 
2. 2.  The Matrix Representation of a Linear Transformation 
2. 3.  Composition of Linear Transformations and Matrix Multiplication 
2. 4.  Invertibility and Isomorphism 
2. 5.  The Change of Coordinate Matrix 
2. 6.  The Dual Space 
2. 7.  Summary 
2. 8.  Keywords 

• UNIT-3: 
3. 0.  Introduction 
3. 1.  Elementary Matrix Operations and Elementary Matrices 
3. 2.  The Rank of a Matrix and Matrix Inverse 
3. 3.  Systems of Linear Equations 
3. 4.  Summary 
3. 5.  Keywords 

• UNIT-4: 
4. 0.  Introduction 
4. 1.  Properties of Determinants 
4. 2.  Cofactor Expansions 
4. 3.  Elementary Operations and Cramer's Rule 
4. 4.  Summary 
4. 5.  Keywords 

• Exercises 
• References 

BLOCK - II: Diagonalization and Inner Product Spaces 
• Objectives 
• UNIT-1: 
1. 0.  Introduction 
1. 1.  Eigenvalues and Eigenvectors 
1. 2.  Diagonalizability 
1. 3.  Invariant Subspaces and the Cayley-Hamilton Theorem 
1. 4.  Summary 
1. 5.  Keywords 

• UNIT-2: 
2. 0.  Introduction 
2. 1.  Inner Products and Norms 
2. 2.  The Gram-Schmidt Orthogonalization Process 
2. 3.  Orthogonal Complements 
2. 4.  Summary 
2. 5.  Keywords 

• UNIT-3: 
3. 0.  Introduction 
3. 1.  The Adjoint of a Linear Operator 
3. 2.  Normal and Self-Adjoint Operators 
3. 3.  Unitary and Orthogonal Operators and Their Matrices 
3. 4.  Orthogonal Projection and Spectral Theorem 
3. 5.  Summary 
3. 6.  Keywords 

• UNIT-4: 
4. 0.  Introduction 
4. 1.  Bilinear and Quadratic Forms 
4. 2.  Summary 
4. 3.  Keywords 

• Exercises 
• References 

BLOCK - III: Canonical Forms 
• Objectives 
• UNIT-1: 
1. 0.  Introduction 
1. 1.  The Diagonal Form 
1. 2.  The Triangular Form 
1. 3.  Summary 
1. 4.  Keywords 

• UNIT-2: 
2. 0.  Introduction 
2. 1.  The Jordan Canonical Form 
2. 2.  Summary 
2. 3.  Keywords 

• UNIT-3: 
3. 0.  Introduction 
3. 1.  The Minimal Polynomial 
3. 2.  Summary 
3. 3.  Keywords 

• UNIT-4: 
4. 0.  Introduction 
4. 1.  The Rational Canonical Form 
4. 2.  Summary 
4. 3.  Keywords 

• Exercises 
• References 
• Glossary of Symbols 

Preface

Linear algebra is the mathematical discipline that deals with vectors and matrices and, more generally, with vector spaces and linear transformations. Unlike other parts of mathematics that are frequently invigorated by new ideas and unsolved problems, linear algebra is very well understood. It is a very useful subject, and its basic concepts arose and were used in different areas of mathematics and its applications. It is therefore not surprising that the subject had its roots in such diverse fields as number theory (both elementary and algebraic), geometry, abstract algebra (groups, rings, fields, Galois theory), analysis (differential equations, integral equations, and functional analysis), and physics. Among the elementary concepts of linear algebra are linear equations, matrices, determinants, linear transformations, linear independence, dimension, bilinear forms, quadratic forms, and vector spaces. Since these concepts are closely interconnected, several usually appear in a given context (e.g., linear equations and matrices), and it is often impossible to disentangle them. Linear algebra has extensive applications in engineering, the natural sciences, computer science, and the social sciences. Nonlinear mathematical models can often be approximated by linear ones. By 1880 many of the basic results of linear algebra had been established, but they were not yet part of a general theory. In particular, the fundamental notion of a vector space, within which such a theory would be framed, was absent. It was introduced only in 1888 by Giuseppe Peano, the Italian mathematician known as a founder of symbolic logic. This study material consists of three blocks: Block-I covers vector spaces, linear transformations and matrices; Block-II covers diagonalization and inner product spaces; and Block-III covers canonical forms. Each block comprises four units.

0. Preliminaries.

This section provides a brief review of the basic concepts, definitions, examples, etc., which will be used in the subsequent blocks. It is assumed that the reader is familiar with elementary algebra.

Definition 0. 1. A group is a nonempty set G with a function G × G → G, denoted (a, b) → ab, which satisfies the following axioms:
(i) (Associativity) For each a, b, c ∈ G, (ab)c = a(bc).
(ii) (Identity) There exists an element e ∈ G such that ea = a = ae for each a ∈ G.
(iii) (Inverses) For each a ∈ G, there exists an element a–1 ∈ G such that a–1a = e = aa–1.
A group is abelian if ab = ba for all a, b ∈ G.
Examples.
1. Any vector space is an abelian group under +.
2. The set of invertible n × n real matrices is a group under matrix multiplication.

Definition 0. 2. A ring is a set R together with a function R × R → R called addition, denoted (a, b) → a + b, and a function R × R → R called multiplication, denoted (a, b) → ab, which satisfy the following axioms:
(i) (Commutativity of +) For each a, b ∈ R, a + b = b + a.
(ii) (Associativity) For each a, b, c ∈ R, (a + b) + c = a + (b + c) and (ab)c = a(bc).
(iii) (+ Identity) There exists an element 0 in R such that 0 + a = a.
(iv) (+ Inverse) For each a ∈ R, there exists an element −a ∈ R such that (−a) + a = 0.
(v) (Distributivity) For each a, b, c ∈ R, a(b + c) = ab + ac and (a + b)c = ac + bc.
A zero divisor in a ring R is a nonzero element a ∈ R such that there exists a nonzero
b ∈ R with ab = 0 or ba = 0.

Example. The set of integers, Z, is a ring.

Definition 0. 3. A field is a set F with at least two elements together with a function F × F → F called addition, denoted (a, b) → a + b, and a function F × F → F called multiplication, denoted (a, b) → ab, which satisfy the following axioms:
(i) (Commutativity) For each a, b ∈ F, a + b = b + a and ab = ba.
(ii) (Associativity) For each a, b, c ∈ F, (a + b) + c = a + (b + c) and (ab)c = a(bc).
(iii) (Identities) There exist two elements 0 and 1 in F such that 0 + a = a and 1a = a for each a ∈ F.
(iv) (Inverses) For each a ∈ F, there exists an element −a ∈ F such that (−a) + a = 0. For each nonzero a ∈ F, there exists an element a–1 ∈ F such that a–1a = 1.
(v) (Distributivity) For each a, b, c ∈ F, a(b + c) = ab + ac.

Examples.
1. The real numbers R, the complex numbers C, and the rational numbers Q are all fields.
2. The set of integers Z, is not a field.

Definition 0. 4. An object of the form (a1, a2, . . . , an), where the entries a1, a2, . . . , an are elements of a field F, is called an n-tuple with entries from F.

Definition 0. 5. A polynomial with coefficients from a field F is an expression of the form p(x) = a0 + a1x + …… + anx^n, where n is a nonnegative integer and each ai, called the coefficient of x^i, is in F. The set of all polynomials with coefficients from F is a vector space, denoted by P(F) or F[x]. A polynomial p(x) in F[x] is said to be irreducible over F if whenever p(x) = g(x)h(x) with g(x), h(x) ∈ F[x], then one of g(x) or h(x) has degree 0 (that is, is a constant).

Definition 0. 6. A metric on a set X is a function d : X × X → R satisfying the following axioms:
(i) d(x, y) ≥ 0 for all x, y ∈ X.
(ii) d(x, y) = 0 exactly when x = y.
(iii) d(x, y) = d(y, x) for all x, y ∈ X.
(iv) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X.

In other words, a metric is simply a generalization of the notion of distance.

Definition 0. 7. An m × n matrix with entries from a field F is a rectangular array of the form
[ a11  a12  . . .  a1n ]
[ a21  a22  . . .  a2n ]
[  .     .            .  ]
[ am1  am2  . . .  amn ],
where each entry aij (1 ≤ i ≤ m, 1 ≤ j ≤ n) is an element of F. The set of all m × n matrices with entries from a field F is a vector space, denoted by Mm×n(F).

Definition 0. 8. An n × n matrix M is called a diagonal matrix if Mij = 0 whenever i ≠


j, that is if all its nondiagonal entries are zero.

Definition 0. 9. The transpose of an m × n matrix A = [aij] is the n × m matrix AT with


aji as the element in the ith row and jth column.

Definition 0. 10. The trace of an n × n matrix M, denoted tr(M), is the sum of the
diagonal entries of M. That is, tr(M) = M11+ M22+ . . . .+ Mnn.

Definition 0. 11. A symmetric matrix is a matrix A such that AT=A and a skew-
symmetric matrix is a matrix A such that AT= –A.

Definition 0. 12. For any pair of positive integers i and j, the symbol δij is defined by δij = 0 if i ≠ j and δij = 1 if i = j. This symbol is known as the Kronecker delta.
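As a quick computational illustration of Definitions 0. 9 - 0. 12 (an optional sketch in Python using NumPy; the small matrix chosen here is arbitrary), the transpose, trace and symmetry of a matrix can be checked as follows.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
print(A.T)                     # transpose: rows and columns interchanged
print(np.trace(A))             # trace: 1 + 4 = 5
S = A + A.T                    # A + A^T is symmetric, since (A + A^T)^T = A^T + A
print(np.array_equal(S, S.T))  # True
I2 = np.eye(2)                 # the entries of the identity matrix are the Kronecker delta values
print(I2)
```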


BLOCK – I
Vector Spaces, Linear Transformations and Matrices

Objectives
After studying this block you will be able to:
• Understand the basic concepts of each unit.
• Appreciate the importance of vector spaces, linear transformations and matrices.
• Explain the defining properties of the above.
• Give examples of each concept.
• Describe some important theorems along with their proofs.
• Work through some illustrative examples.

UNIT-1
1. 0. Introduction
A vector space is a mathematical structure formed by a collection of vectors:
objects that may be added together and multiplied ("scaled") by numbers, called scalars
in this unit. Scalars are often taken to be real numbers, but one may also consider vector
spaces with scalar multiplication by complex numbers, rational numbers, or even more
general fields instead. The operations of vector addition and scalar multiplication have to
satisfy certain requirements, called axioms.
This unit mainly deals with standard properties of vector spaces, subspaces, linear combinations and systems of linear equations, linear dependence and linear independence, bases and dimension, and maximal linearly independent subsets, along with some illustrative examples and theorems.

1. 1. Vector Spaces
Definition 1. 1. 1. A vector space V over a field F is an abelian group with a scalar
product a⋅v or av defined for all a∈F and all v∈V satisfying the following axioms.
(i) a(bv) = (ab)v
(ii) (a +b)v = av + bv


(iii) a(u+v)= au + av
(iv) 1v = v, where a, b∈F and u, v∈V.
Note.
1. The elements of V are called vectors, the elements of F are called scalars.  
2. To differentiate between the scalar zero and the zero (null) vector, the vector zero is written in boldface as 0 wherever confusion could arise.  

Let us examine some examples of vector spaces.


Example -1. The n-tuples of real numbers, denoted by (u1, . . . ,un), form a vector space
over R. Given vectors u = (u1, . . . ,un) and v = (v1, . . . ,vn) in Rn and a in R , we can
define vector addition by u + v = (u1, . . . ,un) + (v1, . . ,vn) = (u1 + v1, . . . , un + vn) and
scalar multiplication by au =a(u1, . . . ,un) = (au1, . . . ,aun).

Example - 2. If F is a field, then F[x] is a vector space over F.


The vectors in F[x] are simply polynomials. Vector addition is just polynomial addition.
If a∈F and p(x)∈F[x], then scalar multiplication is defined by ap(x).

Theorem 1. 1. 1. Let V be a vector space over a field F and 0 be the zero vector in V.
Then each of the following statements is true.

(i) 0v = 0 for all v∈V.


(ii) a 0 = 0 for all a∈F.
(iii) If av = 0, then either a = 0 or v = 0.
(iv) (−1)v = −v for all v∈V.
(v) −(av) = (−a)v = a(−v) for all a∈F and v∈V.

Proof. To prove (i), observe that 0v = (0 + 0)v = 0v + 0v; consequently, 0 + 0v = 0v + 0v. Since V is an abelian group, cancellation gives 0 = 0v. The proof of (ii) is almost identical to the proof of (i). For (iii), we are done if a = 0. Suppose that a ≠ 0. Multiplying both sides of av = 0 by 1/a, we have v = 0. To show (iv), observe that v + (−1)v = 1v + (−1)v = (1 − 1)v = 0v = 0, and so (−1)v = −v. The proof of (v) is similar and is left to the reader.


1. 2. Subspace

Definition 1. 2. 1. Let V be a vector space over a field F, and W a nonempty subset of V. Then W is a subspace of V if it is closed under vector addition and scalar multiplication; that is, if u, v ∈ W and a ∈ F, it will always be the case that u + v and av are also in W.

Illustrative Example - 1. Let W be the subset of R3 defined by
W = {(x1, 2x1 + x2, x1 − x2) : x1, x2 ∈ R}.
Then show that W is a subspace of R3.
Solution. Since a(x1, 2x1 + x2, x1 − x2) = (ax1, a(2x1 + x2), a(x1 − x2)) = (ax1, 2(ax1) + ax2, ax1 − ax2), which is again of the stated form (with parameters ax1 and ax2), W is closed under scalar multiplication.
To show that W is closed under vector addition, let u = (x1, 2x1 + x2, x1 − x2) and v = (y1, 2y1 + y2, y1 − y2) be vectors in W.
Then u + v = (x1 + y1, 2(x1 + y1) + (x2 + y2), (x1 + y1) − (x2 + y2)), which is again of the stated form (with parameters x1 + y1 and x2 + y2).
Hence W is a subspace of R3.

Illustrative Example - 2. Let R be the field of real numbers and S be the set of all
solutions of the equations x + y + 2z = 0. Show that S is a subspace of R3.
Solution. We have, S = {(x, y, z): x + y + 2z = 0, x, y, z ∈ R}.
Clearly, 1× 0 + 1× 0 + 2 × 0 = 0 . So, (0, 0, 0) satisfies the equation x + y + 2z = 0
Therefore, (0, 0, 0)∈S
⇒ S is non- empty subset of R3.
Let u = (x1, y1,z1) and v = (x2, y2, z2) be any two elements of S.
Then x1 + y1 + 2z1 = 0 and x2 + y2 + 2z2 = 0
Let a, b be any two elements of R. Then,
au + bv = a(x1, y1,z1) + b(x2, y2, z2)
⇒ au + bv = (ax1 +bx2, ay1 +by2, az1 + bz2)
Now (ax1 + bx2) + (ay1 + by2) + 2(az1 + bz2) = a(x1 + y1 + 2z1) + b(x2 + y2 + 2z2)
= a × 0 + b × 0 = 0.
Therefore, au + bv = (ax1 +bx2, ay1 +by2, az1 + bz2) ∈ S


Thus au + bv ∈ S for all u, v ∈ S and a, b ∈ R


Hence, S is a subspace of R3.

Illustrative Example - 3. Let V be the vector space of all real valued continuous
functions over the field R of all real numbers. Show that the set S of solutions of the
differential equation 2 d²y/dx² − 9 dy/dx + 2y = 0 is a subspace of V.
Solution. We have S = { y : 2 d²y/dx² − 9 dy/dx + 2y = 0 }, where y = f(x).
Clearly, y = 0 satisfies the given differential equation. Therefore, 0 ∈ S, and so S ≠ φ.
Let y1, y2 ∈ S. Then y1, y2 are solutions of 2 d²y/dx² − 9 dy/dx + 2y = 0, that is,
2 d²y1/dx² − 9 dy1/dx + 2y1 = 0 and 2 d²y2/dx² − 9 dy2/dx + 2y2 = 0.
Let a, b ∈ R and y = ay1 + by2. Then
2 d²y/dx² − 9 dy/dx + 2y = 2 d²(ay1 + by2)/dx² − 9 d(ay1 + by2)/dx + 2(ay1 + by2)
= a (2 d²y1/dx² − 9 dy1/dx + 2y1) + b (2 d²y2/dx² − 9 dy2/dx + 2y2)
= a × 0 + b × 0 = 0.
Therefore, y = ay1 + by2 ∈ S for all y1, y2 ∈ S and a, b ∈ R.
Hence, S is a subspace of V.

1. 3. Linear Combinations and Systems of Linear Equations

Definition 1. 3. 1. Let V be a vector space and S a nonempty subset of V. A vector


v ∈ V is called a linear combination of vectors of S if there exist a finite number
of vectors u1,u2,…..,un in S and scalars a1,a2,……,an in F such that

v = a1 u1 + a2 u2 + a3 u3 + ………… + an un.
In this case we also say that v is a linear combination of u1,u2, ….un and call a1,
a2,……,an the coefficients of the linear combination.

Note. In any vector space V, 0v = 0 for each v ∈ V. Thus the zero vector is a linear
combination of any nonempty subset of V.

Definition 1. 3. 2. Let S be a nonempty subset of a vector space V. The span of S,


denoted span(S), is the set consisting of all linear combinations of the vectors in S.

Note.
1. For convenience, we define span ( φ ) = {0}.
2. In R3, for instance, the span of the set {(1, 0, 0), (0, 1, 0)} consists of all vectors
in R3 that have the form a(1, 0, 0) + b(0, 1, 0) = (a, b, 0) for some scalars a and b.
Thus the span of {(1, 0, 0), (0, 1, 0)} contains all the points in the xy-plane. In
this case, the span of the set is a subspace of R3. This fact is true in general.

Theorem 1. 3. 1. The span of any subset S of a vector space V is a subspace of V.


Moreover, any subspace of V that contains S must also contain the span of S.
Proof. This result is immediate if S = φ, because span(φ) = {0}, which is a subspace contained in every subspace of V.
If S ≠ φ, then S contains a vector z, so 0z = 0 is in span(S). Let x, y ∈ span(S). Then there exist vectors u1, u2, …, um, v1, v2, …, vn in S and scalars a1, a2, …, am, b1, b2, …, bn such that x = a1u1 + a2u2 + …… + amum and y = b1v1 + b2v2 + …… + bnvn. Therefore x + y = a1u1 + …… + amum + b1v1 + …… + bnvn and, for any scalar c, cx = (ca1)u1 + (ca2)u2 + …… + (cam)um are clearly linear combinations of the vectors in S, so x + y and cx are in span(S).
Thus span(S) is a subspace of V.
Finally, if W is any subspace of V that contains S, then W is closed under addition and scalar multiplication, so W contains every linear combination of vectors of S; that is, span(S) ⊆ W.

Definition 1. 3. 3. A subset S of a vector space V generates (or spans) V if span(S) = V. In this case, we also say that the vectors of S generate (or span) V.


Example - 1. The vectors (1, 1, 0), (1, 0, 1) and (0, 1, 1) generate R3, since an arbitrary vector (a1, a2, a3) in R3 is a linear combination of the three given vectors; in fact, the scalars r, s and t for which r(1, 1, 0) + s(1, 0, 1) + t(0, 1, 1) = (a1, a2, a3) are r = ½ (a1 + a2 – a3), s = ½ (a1 – a2 + a3), and t = ½ (–a1 + a2 + a3).
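The formulas for r, s and t quoted above can be verified symbolically. The following optional sketch (using Python's SymPy, with symbol names chosen here only for illustration) solves the three coordinate equations.

```python
import sympy as sp

a1, a2, a3, r, s, t = sp.symbols('a1 a2 a3 r s t')
# coordinate equations of r(1,1,0) + s(1,0,1) + t(0,1,1) = (a1, a2, a3)
sol = sp.solve([r + s - a1, r + t - a2, s + t - a3], [r, s, t])
print(sol)
# {r: a1/2 + a2/2 - a3/2, s: a1/2 - a2/2 + a3/2, t: -a1/2 + a2/2 + a3/2}
```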

Example - 2. The polynomials x2 + 3x – 2, 2x2 + 5x – 3, and –x2 –4x + 4 generate


P2(R) since each of the three given polynomials belongs to P2(R) and each polynomial
ax2 + bx + c in P2(R) is a linear combination of these three, namely,
(– 8a + 5b + 3c) (x2 + 3x – 2)
+ (4a – 2b – c) (2x2 + 5x – 3)
+ (– a + b + c) (– x2 – 4x + 4) = (ax2 + bx + c).

Definition 1. 3. 4. A system of equations of the form
a11x1 + a12x2 + . . . . . . + a1nxn = b1
a21x1 + a22x2 + . . . . . . + a2nxn = b2
:
am1x1 + am2x2 + . . . . . . + amnxn = bm,
where the aij and bi (1 ≤ i ≤ m and 1 ≤ j ≤ n) are scalars in a field F and x1, x2, x3, . . . , xn are n variables taking values in F, is called a system of m linear equations in n unknowns over F.

Illustrative Example - 4. Express v = (1, –2, 5) in R3 as a linear combination of the following vectors: v1 = (1, 1, 1), v2 = (1, 2, 3), v3 = (2, –1, 1).
Solution. Let x, y, z be scalars in R such that v = xv1 + yv2 + zv3
⇒ (1, –2, 5) = x(1, 1, 1) + y(1, 2, 3) + z(2, –1, 1)
⇒ (1, –2, 5) = (x + y + 2z, x + 2y – z, x + 3y + z)
⇒ x + y + 2z = 1, x + 2y – z = –2, x + 3y + z = 5.
This system of equations can be written in matrix form as
[ 1  1   2 ] [ x ]   [  1 ]
[ 1  2  –1 ] [ y ] = [ –2 ]
[ 1  3   1 ] [ z ]   [  5 ].
Applying R2 → R2 – R1 and R3 → R3 – R1, this is equivalent to
[ 1  1   2 ] [ x ]   [  1 ]
[ 0  1  –3 ] [ y ] = [ –3 ]
[ 0  2  –1 ] [ z ]   [  4 ],
and applying R3 → R3 – 2R2 gives
[ 1  1   2 ] [ x ]   [  1 ]
[ 0  1  –3 ] [ y ] = [ –3 ]
[ 0  0   5 ] [ z ]   [ 10 ].
This is equivalent to x + y + 2z = 1, y – 3z = –3, 5z = 10
⇒ x = –6, y = 3, z = 2.
Hence, v = –6v1 + 3v2 + 2v3.
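The same answer can be obtained numerically. The sketch below (an optional check using NumPy) solves the system whose coefficient matrix has v1, v2, v3 as columns.

```python
import numpy as np

# columns of M are v1, v2, v3; solve M [x, y, z]^T = v
M = np.array([[1, 1, 2],
              [1, 2, -1],
              [1, 3, 1]], dtype=float)
v = np.array([1, -2, 5], dtype=float)
print(np.linalg.solve(M, v))   # [-6.  3.  2.], i.e. v = -6 v1 + 3 v2 + 2 v3
```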

1. 4. Linear Dependence and Linear Independence

Definition 1. 4. 1. A subset S of a vector space V is called linearly dependent if there


exists a finite number of distinct vectors u1, u2 , u3 ,……,un in S and scalars a1, a2, a3,
……,an , not all zero, such that a1 u1 + a2 u2 + a3 u3 + ……+ an un= 0. In this case we
also say that the vectors of S are linearly dependent.

Note. For any vectors u1, u2 , u3 , ……,un, we have a1 u1 + a2 u2 + a3 u3 + …… + an un =


0 if a1 = a2 = a3 = ……… = an = 0. We call this the trivial representation of 0 as a linear
combination of u1, u2 , u3 , ……, un. Thus, for a set to be linearly dependent, there must
exist a nontrivial representation of 0 as a linear combination of vectors in the set.

Illustrative Example - 1. Let S = {(1, 3, –4, 2), (2, 2, –4, 0), (1, –3, 2, –4), (–1, 0, 1, 0)} in R4. Show that S is linearly dependent and express one of the vectors in S as a linear combination of the other vectors in S.
Solution. To show S is linearly dependent, we must find scalars a1, a2, a3 and a4, not all zero, such that a1(1, 3, –4, 2) + a2(2, 2, –4, 0) + a3(1, –3, 2, –4) + a4(–1, 0, 1, 0) = 0.
Finding such scalars amounts to finding a nonzero solution of the system of linear equations
a1 + 2a2 + a3 – a4 = 0
3a1 + 2a2 – 3a3 = 0
–4a1 – 4a2 + 2a3 + a4 = 0
2a1 – 4a3 = 0.


One such solution is a1 = 4, a2 = –3, a3 = 2, and a4 = 0. Thus S is a linearly dependent


subset of R4, and 4 (1, 3, –4, 2) – 3( 2, 2, –4, 0) + 2(1, –3, 2, –4) + 0(–1, 0, 1, 0) = 0.
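A dependence relation such as the one above can also be found by computing the null space of the matrix whose columns are the vectors of S. The following optional SymPy sketch illustrates this.

```python
import sympy as sp

# columns of M are the four vectors of S; any nonzero a with M a = 0
# gives a dependence a1*s1 + a2*s2 + a3*s3 + a4*s4 = 0
M = sp.Matrix([[ 1,  2,  1, -1],
               [ 3,  2, -3,  0],
               [-4, -4,  2,  1],
               [ 2,  0, -4,  0]])
print(M.nullspace())   # one basis vector, a scalar multiple of (4, -3, 2, 0)
```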

Illustrative Example - 2. Let v1, v2, v3, be vectors in a vector space V(F), and
let λ1 , λ2 ∈ F. Show that the set { v1, v2, v3} is linearly dependent, if the set
{ v1 + λ1 v2 + λ2 v3, v2, v3} is linearly dependent.
Solution. If the set {v1 + λ1 v2 + λ2 v3, v2, v3} is linearly dependent, then there exist
scalars λ, µ, γ ∈F (not all zero) such that λ (v1 + λ1 v2 + λ2 v3) + µ v2 + γ v3 = 0V
⇒ λ v1 + (λ λ1 + µ) v2 + (λ λ2+ γ )v3 = 0V → (1)
The set {v1, v2, v3} will be linearly dependent if in (1) at least one of the scalar coefficients
is non-zero. If λ ≠ 0, then the set will be linearly dependent whatever may be the values
of µ and γ. But, if λ = 0, then at least one of µ and γ should not be equal to zero and
hence at least one of λ λ1 + µ and λ λ2+ γ will not be zero.
Since λ = 0 ⇒ λ λ1 + µ = µ and λ λ2+ γ = γ. Hence, from (1), we find that the scalars
λ, λ λ1 + µ, λ λ2+ γ are not all zero. Consequently, the set {v1, v2, v3} is linearly
dependent.

Definition 1. 4. 2. A subset S of a vector space that is not linearly dependent is called


linearly independent. As before, we also say that the vectors of S are linearly
independent.

Note. The following facts about linearly independent sets are true in any vector space.
1. The empty set is linearly independent, for linearly dependent sets must be
nonempty.
2. A set consisting of a single nonzero vector is linearly independent. For if {u} is
linearly dependent, then au = 0 for some nonzero scalar a; thus u = a–1(au) = a–1 0 = 0, a contradiction.
3. A set is linearly independent if and only if the only representations of 0 as linear
combinations of its vectors are trivial representations.

Condition 3 provides a useful method for determining whether a finite set is linearly independent. This technique is illustrated in the following examples.


Illustrative Example - 3. Prove that the set S = {(1, 0, 0, –1), (0, 1, 0, –1),
(0, 0, 1, –1), (0, 0, 0, 1)} is linearly independent.
Solution. We must show that the only linear combination of vectors in S that equals the
zero vector is the one in which all the coefficients are zero.
Suppose that a1, a2, a3 and a4 are scalars such that
a1(1, 0, 0, –1) + a2 (0, 1, 0, –1) + a3(0, 0, 1, –1) + a4(0, 0, 0, 1) = (0, 0, 0, 0).
Equating the corresponding coordinates of the vectors on the left and the right sides of
this equation, we obtain the following system of linear equations.
a1 =0
a2 =0
a3 =0
– a1 – a2 –a3 + a4 = 0
Clearly the only solution to this system is a1 = a2 = a3 = a4 = 0, and so S is linearly
independent.

Illustrative Example - 4. Suppose the vectors u, v, w are linearly independent vectors


in the vector space V(F). Show that the vectors u + v, u – v, u – 2v +w are also linearly
independent.
Solution. Let λ1, λ2, λ3 be scalars such that
λ1(u + v) + λ2 (u – v) + λ3 (u – 2v + w) = 0V
⇒ (λ1 + λ2 + λ3 ) u + (λ1 – λ2 – 2 λ3 )v + λ3w = 0V
⇒ λ1 + λ2 + λ3 = 0, λ1 – λ2 – 2 λ3 = 0, 0λ1 + 0λ2 + λ3 = 0 , since u, v, w are linearly
independent.
The coefficient matrix A of the above system of equations is given by
A = [ 1   1   1 ]
    [ 1  –1  –2 ]
    [ 0   0   1 ],  and |A| = –2 ≠ 0.
So the above system of equations has only the trivial solution λ1 = λ2 = λ3 = 0.
Thus, u + v, u – v, u – 2v + w are linearly independent vectors.

1. 5. Bases and Dimension


Definition 1. 5. 1. A basis A for a vector space V is a linearly independent subset

of V that generates V. If A is a basis for V, we also say that the vectors of A form a
basis for V.

Example - 1. Recalling that span ( φ ) = {0} and φ is linearly independent, we see that
φ is a basis for the zero vector space.

Example - 2. In Mm×n(F), let Eij denote the matrix whose only nonzero entry is a 1 in the ith row and jth column. Then {Eij : 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis for Mm×n(F).

Definition 1. 5. 2. In Fn, let e1 = (1, 0, 0, ……,0), e2 = (0, 1, 0……,0) , ……., en =


(0,0,……,0,1); { e1, e2, e3, ……,en } is readily seen to be a basis for Fn and is called the
standard basis for Fn and is denoted by εn.

Example - 3. In Pn(F), the set {1, x, x^2, ……, x^n} is a basis. This is the standard basis for Pn(F).

Theorem 1. 5. 1. Let V be a vector space and A = {u1, u2, ……, un} be a subset of V. Then A is a basis for V if and only if each v ∈ V can be uniquely expressed as a linear combination of vectors of A, that is, can be expressed in the form v = a1u1 + a2u2 + a3u3 + …… + anun for unique scalars a1, a2, a3, …, an.
Proof. Let A be a basis for V. If v ∈ V, then v ∈ span(A) because span(A) = V. Thus v is a linear combination of the vectors of A. Suppose that
v = a1u1 + a2u2 + a3u3 + ………… + anun and
v = b1u1 + b2u2 + b3u3 + ………… + bnun
are two such representations of v. Subtracting the second equation from the first gives
0 = (a1 – b1)u1 + (a2 – b2)u2 + (a3 – b3)u3 + …………… + (an – bn)un.
Since A is linearly independent, it follows that
(a1 – b1) = (a2 – b2) = (a3 – b3) = ………… = (an – bn) = 0.
Hence a1 = b1, a2 = b2, a3 = b3, ………, an = bn, so v is uniquely expressible as a linear combination of the vectors of the basis A.
Conversely, suppose each v ∈ V has a unique expression of the stated form. Then span(A) = V, and the only representation of 0 as a linear combination of u1, u2, ……, un is the trivial one, so A is linearly independent. Hence A is a basis for V.


Definition 1. 5. 3. A vector space is called finite-dimensional if it has a basis consisting of a finite number of vectors. The unique number of vectors in each basis for V is called the dimension of V and is denoted by dim(V). A vector space that is not finite-dimensional is called infinite-dimensional.

Examples - 4.
(i) The vector space {0} has dimension zero.
(ii) The vector space Fn has dimension n.
(iii) The vector space Mm × n(F) has dimension mn.
(iv) The vector space Pn(F) has dimension (n+1).

The following examples show that the dimension of a vector space depends on its field of
scalars.
Examples - 5.
(i) Over the field of complex numbers C, the vector space of complex numbers has dimension 1. (A basis is {1}.)
(ii) Over the field of real numbers R, the vector space of complex numbers has dimension 2. (A basis is {1, i}.)
Note.
1. Let V be finite dimensional vector space over a field F, and let S be a subspace of
V. Then dim (V/S) = dim (V) – dim (S).
2. If S and T are two subspaces of a finite dimensional vector space V over a field F,
then dim (S+T) = dim (S) + dim (T) – dim (S∩T).

1. 6. Maximal Linearly Independent subsets.

Definition 1. 6. 1. Let F be a family of sets. A member M of F is called maximal (with


respect to set inclusion) if M is contained in no member of F other than M itself.

Example- 1. Let F be the family of all subsets of a nonempty set S. (This family F is
called the power set of S.) The set S is easily seen to be a maximal element of F.


Example -2. Let S and T be disjoint nonempty sets, and let F be the union of their power
sets. Then S and T are both maximal elements of F.

Definition 1. 6. 2. Let S be a subset of a vector space V. A maximal linearly


independent subset of S is a subset H of S satisfying both of the following conditions.
(i) H is linearly independent.
(ii) The only linearly independent subset of S that contains H is H itself.

Note. A basis A for a vector space V is a maximal linearly independent subset of V, because
1. A is linearly independent by definition.
2. If v ∈ V and v ∉ A, then A ∪ {v} is linearly dependent. This follows from the fact that if S is a linearly independent subset of a vector space V and v is a vector in V that is not in S, then S ∪ {v} is linearly dependent if and only if v ∈ span(S); here v ∈ span(A) because span(A) = V.

Our next result shows that the converse of this statement is also true.
Theorem 1. 6. 1. Let V be a vector space and S a subset that generates V. If A is a

maximal linearly independent subset of S, then A is a basis for V.

Proof. Let A be a maximal linearly independent subset of S. Because A is linearly independent, it suffices to prove that A generates V. We claim that S ⊆ span(A); for otherwise there exists a v ∈ S such that v ∉ span(A). Since A is a linearly independent subset of V and v ∉ span(A), the fact quoted in the note above shows that A ∪ {v} is linearly independent, contradicting the maximality of A. Therefore, S ⊆ span(A).
Because span(S) = V, and because the span of any subset S of a vector space V is a subspace of V and any subspace of V that contains S must also contain the span of S (Theorem 1. 3. 1), it follows that span(A) = V.

Theorem 1. 6. 2. Let S be a linearly independent subset of a vector space V. There


exists a maximal linearly independent subset of V that contains S.

Proof. Let F denote the family of all linearly independent subsets of V that contain S.
In order to show that F contains a maximal element, we must show that if C is a chain in
F, then there exists a member U of F that contains each member of C. We claim that U,
the union of the members of C, is the desired set. Clearly U contains each member of C,
and so it suffices to prove that U ∈ F (That is U is a linearly independent subset of V that
contains S). Because each member of C is a subset of V containing S, we have S ⊆ U ⊆
V. Thus we need only prove that U is linearly independent. Let u1, u2 , u3 , . . . . ,un be in
U and a1, a2, a3, . . . .,an be scalars such that a1 u1 + a2 u2 + . . . . . . + an un. = 0. Because
ui ∈ U for i = 1, 2, 3……..n, there exists a set Ai in C such that ui ∈ Ai. But since C is a
chain, one of these sets, say Ak, contains all the others. Thus ui ∈ Ak for i = 1, 2, 3……n.
However, Ak is a linearly independent set; so a1u1 + a2u2 + a3u3 + …… + anun = 0 implies that a1 = a2 = …… = an = 0. Thus U is linearly independent, and so U ∈ F. Hence, by the maximal principle, F contains a maximal element, and this element is easily seen to be a maximal linearly independent subset of V that contains S.

1. 7. Summary
In this unit, we formulate a general definition of a vector space and establish its basic properties. A vector space is a set of multidimensional quantities, known as vectors, together with a set of one-dimensional quantities, known as scalars, such that vectors can be added together and vectors can be multiplied by scalars while preserving the ordinary arithmetic properties (associativity, commutativity, distributivity, and so forth). The concept of a linear subspace (or vector subspace) is important in linear algebra and related fields of mathematics; a linear subspace is usually called simply a subspace when the context serves to distinguish it from other kinds of subspaces. A linear combination is an expression constructed from a set of terms by multiplying each term by a constant and adding the results. The concept of linear combinations is central to linear algebra, and a system of linear equations (or linear system) is a collection of linear equations involving the same set of variables. A family of vectors is linearly independent if none of them can be written as a linear combination of finitely many other vectors in the collection; a family of vectors which is not linearly independent is called linearly dependent. A basis is a set of linearly independent vectors that, in linear combinations, can represent every vector in a given vector space, or, more simply put, a set that defines a "coordinate system". Every vector space has a basis, and all bases of a vector space have the same number of elements, called the dimension of the vector space. In this unit we also study the dimension theorem, which plays a very vital role in the subsequent units. Finally, the maximal principle guarantees the existence of maximal elements in a family of sets; with the help of this principle we study maximal linearly independent subsets.

1. 8. Keywords
Basis; Dimension; Dimension theorem; Finite-dimensional space; Linear combination; Linearly dependent; Linearly independent; Polynomial; Scalar; Scalar multiplication; Span of a subset; Spans; Standard basis; Subspace; System of linear equations; Vector; Vector addition; Vector space.

UNIT-2
2. 0. Introduction

The relation between two vector spaces can be expressed by a linear map or linear transformation. These are functions that reflect the vector space structure; that is, they preserve vector addition and scalar multiplication. A. Cayley formally introduced m × n matrices in two papers in 1850 and 1858 (the term "matrix" was coined by Sylvester in 1850). He noted that they "comport themselves as single entities" and recognized their usefulness in simplifying systems of linear equations and compositions of linear transformations. This unit deals with linear transformations, null spaces and ranges, the matrix representation of a linear transformation, composition of linear transformations and matrix multiplication, invertibility and isomorphism, the change of coordinate matrix, and the dual space.


2. 1. Linear Transformations, Null Spaces and Ranges

Definition 2. 1. 1. Let U and V be vector spaces over a field F. Then a map T: U → V is a linear transformation (or a linear map) if T(u + v) = T(u) + T(v) and T(av) = aT(v). The two conditions together can be written in the form T(au + bv) = aT(u) + bT(v) for all a, b ∈ F and u, v ∈ U.
Note.
1. If T is a linear map, then T(0) = 0.
2. T is a linear map if and only if T(au + v) = aT(u) + T(v) for all a ∈ F and u, v ∈ U.
3. If T is a linear map, then T(u – v) = T(u) – T(v) for all u, v ∈ U.
4. T is a linear map if and only if, for u1, u2, . . . , un ∈ U and a1, a2, . . . , an ∈ F, we have T(a1u1 + a2u2 + . . . + anun) = a1T(u1) + a2T(u2) + . . . + anT(un).

The following examples provide some more detailed illustrations concerning


linear transformations.

Illustrative Example - 1. Let the mapping T: R2 → R2 be defined by T(x, y) = (4x + 5y, 6x – y). Show that T is a linear transformation.
Solution. Let a, b ∈ R and u, v ∈ R2, where u = (u1, u2) and v = (v1, v2).
Then, T(au + bv) = T(au1 + bv1, au2 + bv2)
= (4(au1 + bv1) + 5(au2 + bv2), 6(au1 + bv1) – (au2 + bv2))
= (4au1 + 5au2, 6au1 – au2) + (4bv1 + 5bv2, 6bv1 – bv2)
= aT(u) + bT(v).
Thus, T is a linear transformation of R2 into R2.

Note. Let Pn(R) denote the vector space consisting of all polynomials in x of degree n or less with coefficients in R.
Illustrative Example - 2. Let the mapping T: P1(R) → P2(R) be defined by T(p(x)) = (1 + x)p(x) for all p(x) in P1(R). Show that T is a linear transformation.
Solution. Let a, b ∈ R and p(x), q(x) ∈ P1(R), so that T(p(x)), T(q(x)) ∈ P2(R).
Then, T(a p(x) + b q(x)) = (1 + x)(a p(x) + b q(x))
= a (1 + x) p(x) + b (1 + x) q(x)
= a T(p(x)) + b T(q(x)).
Thus, T is a linear transformation of P1(R) into P2(R).
As a check on a specific value of T: T(3 + 2x) = (1 + x)(3 + 2x) = 3 + 5x + 2x^2.

Note:
1. For a vector space V over a field F, we define the identity transformation I : V → V by I(v) = v for all v ∈ V, and for vector spaces V and W we define the zero transformation T0 : V → W by T0(v) = 0 for all v ∈ V. It is clear that both of these transformations are linear.
2. If S and T are linear transformations of a vector space U into V over a field F, then the sum S + T is defined by (S + T)(u) = S(u) + T(u) for all u in U. Also, for each linear transformation T of U into V and each scalar a in F, we define the product of a and T to be the mapping aT of U into V given by (aT)(u) = a(T(u)).

Theorem 2. 1. 1. Let U and V be vector spaces over the same field F. Then the set of all linear transformations of U into V is a vector space over F.
Proof. Let T1, T2 and T3 denote arbitrary linear transformation of U into V, let u and v
be arbitrary vectors in U, and let a, b and c be scalars in F.
Since, (T1+T2) (au+ bv) = T1 (au+ bv) + T2 (au+ bv)
= aT1(u)+ b T1(v) + aT2 (u)+ b T2 (v)
= a[T1(u)+ T2 (u)] + b[T1(v)+ T2(v)]
= a(T1+ T2 )(u) + b(T1+ T2 )(v),
Therefore, T1+ T2 is a linear transformation of U into V.
Addition is associative, since
(T1+ (T2 +T3))(u) = T1(u) + (T2 + T3)(u)
= T1(u) + [T2 (u) + T3(u)]
= [T1(u) +T2 (u)] + T3(u)
= (T1+ T2) (u) + T3(u)
= ((T1+ T2) +T3) (u).
The zero linear transformation Z is an additive identity, since

(T1+ Z)(u) = T1(u) + Z(u) = T1(u) + 0 = T1(u) for all u in U.


The additive inverse of T1 is the linear transformation (–T1) of U into V defined by
(–T1)(u)= –T1(u), since (T1+ (–T1))(u) = T1(u)+ (–T1(u)) = 0 for all u in U.
For any u in U, (T1+ T2) (u) = T1(u) + T2(u) = T2(u) + T1(u) = (T2+ T1) (u),
so T1+ T2= T2+ T1.
Since (cT1)(au + bv) = c(T1(au + bv)) = c(aT1(u) + bT1(v)) = a(cT1(u)) + b(cT1(v)) = a(cT1)(u) + b(cT1)(v),
cT1 is a linear transformation of U into V. The remaining vector space axioms are verified in the same way.

Theorem 2. 1. 2. Let T be a linear transformation of U into V. If W is any subspace of U,


then T(W) is a subspace of V.
Proof. Let W be a subspace of a vector space U. Then 0 ∈ W and T(0) = 0, so T(W) is nonempty. Let v1, v2 ∈ T(W) and a, b ∈ F. There exist vectors w1, w2 ∈ W such that T(w1) = v1 and T(w2) = v2. Since W is a subspace of U, aw1 + bw2 is in W, and T(aw1 + bw2) = aT(w1) + bT(w2) = av1 + bv2. Thus, av1 + bv2 is in T(W), and T(W) is a subspace of the vector space V.

Definition 2. 1. 2. The subspace T(U) of V is called range of T. The dimension of T(U)


is called the rank of T and is denoted by rank (T) or r(T).

Note. Let T be a linear transformation of U into V. If W is any subspace of a vector


space V, the inverse image T–1(W) is a subspace of a vector space U.

Definition 2. 1. 3. The subspace T–1(0) is called kernel of the linear transformation T.


The dimension of T–1(0) is the nullity of T, and is denoted by nullity (T) or n(T).

Theorem 2. 1. 3. If T is a linear transformation of U into V and A = (u1, u2, . . . ,un) is a

basis of U, then T(A) spans T(U).

Proof. Suppose that A = (u1, u2, . . . , un) is a basis of a vector space U, and consider the set T(A) = {T(u1), T(u2), . . . , T(un)}. For any vector v in T(U), there is a vector u in U such that T(u) = v. The vector u can be written as u = a1u1 + a2u2 + . . . + anun since A is a basis of U. This gives v = T(a1u1 + . . . + anun) = a1T(u1) + . . . + anT(un), and so T(A) spans T(U).

Theorem 2. 1. 4 [Dimension Theorem]. Let T be a linear transformation of U into V. If


U has finite dimension, then rank(T) + nullity(T) = dim(U).

Proof. Suppose dim(U) = n, and let nullity(T) = k. Choose (u1, u2, . . . , uk) to be a basis of the kernel T–1(0). This linearly independent set can be extended to a basis A = {u1, u2, . . . , uk, uk+1, . . . , un} of U. By Theorem 2. 1. 3, the set T(A) spans T(U). But T(u1) = T(u2) = . . . = T(uk) = 0, so this means that the set of (n – k) vectors {T(uk+1), T(uk+2), . . . , T(un)} spans T(U). To show that this set is linearly independent, suppose that ck+1T(uk+1) + ck+2T(uk+2) + . . . + cnT(un) = 0. Then T(ck+1uk+1 + ck+2uk+2 + . . . + cnun) = 0, and ck+1uk+1 + . . . + cnun is in T–1(0). Thus there are scalars d1, d2, . . . , dk such that
d1u1 + . . . + dkuk = ck+1uk+1 + . . . + cnun, that is, d1u1 + . . . + dkuk – ck+1uk+1 – . . . – cnun = 0.
Since A is linearly independent, each ci and di must be zero. Hence {T(uk+1), T(uk+2), . . . , T(un)} is linearly independent and therefore is a basis of T(U). Since rank(T) is the dimension of T(U), rank(T) = n – k, and rank(T) + nullity(T) = n = dim(U).
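The dimension theorem can be checked numerically for any matrix transformation T(x) = Ax. In the following optional sketch (the 3×4 matrix is chosen arbitrarily for illustration), the rank and nullity of A are computed with SymPy and their sum is compared with the dimension of the domain.

```python
import sympy as sp

# T(x) = A x maps R^4 into R^3 for this (arbitrarily chosen) matrix A
A = sp.Matrix([[1, 0, -1, 1],
               [2, 1,  0, 3],
               [1, 2,  3, 3]])
rank = A.rank()                  # dimension of the range of T
nullity = len(A.nullspace())     # dimension of the kernel of T
print(rank, nullity, rank + nullity)   # 2 2 4, and 4 = dim of the domain R^4
```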

2. 2. The Matrix Representation of a Linear Transformation

Let U and V be vector spaces of dimension n and m, respectively, over the same field F, and let T denote a linear transformation of U into V. Suppose that A = (u1, u2, . . . , un) is a basis of U. Any u in U can be written uniquely in the form u = x1u1 + x2u2 + . . . + xnun, and then T(u) = x1T(u1) + . . . + xnT(un). If B = (v1, v2, . . . , vm) is a basis of V, then each T(uj) can be written uniquely as T(uj) = a1jv1 + a2jv2 + . . . + amjvm. Thus, with each choice of bases A and B, a linear transformation T of U into V determines a unique indexed set {aij} of mn elements of F.


Definition 2. 2. 1. Suppose that A = (u1, u2, . . . , un) and B = (v1, v2, . . . , vm) are bases of U and V, respectively. Let T be a linear transformation of U into V. The matrix of T relative to the bases A and B is the matrix A = [aij]m×n = [T]B,A, where the {aij} are determined by the conditions T(uj) = a1jv1 + a2jv2 + . . . + amjvm for j = 1, 2, …, n.

Note. The two symbols in A = [aij]m×n = [T]B,A denote the same matrix; the first places notational emphasis on the elements of the matrix, while the second places emphasis on T and the bases A and B. This matrix A is also referred to as the matrix of T with respect to A and B, and we say that T is represented by the matrix A. The elements aij are uniquely determined by T for the given bases A and B. Another way to describe A is to observe that the jth column of A is the coordinate matrix [T(uj)]B of T(uj) with respect to B; that is, A = [T]B,A = ([T(u1)]B, [T(u2)]B, . . . , [T(un)]B).

Illustrative Example - 1. Let T: P2(R) → P3(R) be the linear transformation defined by
T(a0 + a1x + a2x^2) = (2a0 + 2a2) + (a0 + a1 + 3a2)x + (a1 + 2a2)x^2 + (a0 + a2)x^3.
Find the matrix A of T relative to the bases given below.
Solution. Let A = {1, 1 – x, x^2} be a basis of P2(R) and B = {1, x, 1 – x^2, 1 + x^3} be a basis of P3(R).
To find the first column of A, we compute T(1) and write it as a linear combination of the vectors in B:
T(1) = 2 + x + x^3 = (1)(1) + (1)(x) + (0)(1 – x^2) + (1)(1 + x^3),
so the first column of A is [T(1)]B = (1, 1, 0, 1)^T.
To find the second column of A, we compute T(1 – x) and write it as a linear combination of the vectors in B:
T(1 – x) = 2 – x^2 + x^3 = (0)(1) + (0)(x) + (1)(1 – x^2) + (1)(1 + x^3),
so the second column of A is [T(1 – x)]B = (0, 0, 1, 1)^T.
To find the third column of A, we compute T(x^2) and write it as a linear combination of the vectors in B:
T(x^2) = 2 + 3x + 2x^2 + x^3 = (3)(1) + (3)(x) + (–2)(1 – x^2) + (1)(1 + x^3),
so the third column of A is [T(x^2)]B = (3, 3, –2, 1)^T.
Thus the matrix of T with respect to A and B is
A = [T]B,A = ([T(1)]B, [T(1 – x)]B, [T(x^2)]B) = [ 1  0   3 ]
                                                 [ 1  0   3 ]
                                                 [ 0  1  –2 ]
                                                 [ 1  1   1 ].

Illustrative Example - 2. Let the linear transformation T: R4 → R3 be given by
T(x1, x2, x3, x4) = (x1 – x3 + x4, 2x1 + x2 + 3x4, x1 + 2x2 + 3x3 + 3x4). Find [T]ε3,ε4.
Solution. Using the standard bases ε4 of R4 and ε3 of R3, we compute
T(1, 0, 0, 0) = (1, 2, 1),
T(0, 1, 0, 0) = (0, 1, 2),
T(0, 0, 1, 0) = (–1, 0, 3),
T(0, 0, 0, 1) = (1, 3, 3).
Thus the matrix of T relative to ε4 and ε3 is
[T]ε3,ε4 = [ 1  0  –1  1 ]
           [ 2  1   0  3 ]
           [ 1  2   3  3 ].
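The matrix above can also be produced mechanically: its jth column is the image of the jth standard basis vector. The following optional Python sketch carries out exactly this computation.

```python
import numpy as np

def T(x):
    x1, x2, x3, x4 = x
    return np.array([x1 - x3 + x4, 2*x1 + x2 + 3*x4, x1 + 2*x2 + 3*x3 + 3*x4])

# the j-th column of the matrix of T is T(e_j), the image of the j-th standard basis vector
A = np.column_stack([T(e) for e in np.eye(4)])
print(A)
# [[ 1.  0. -1.  1.]
#  [ 2.  1.  0.  3.]
#  [ 1.  2.  3.  3.]]
```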

Note.
1. If A is any matrix that represents the linear transformation T of U into V, then
rank (A) = rank(T). That is rank ([T ]B, A ) = rank(T).

2. The matrix of T relative to the basis A and B is the matrix A = [aij]m×n= [T ]B, A .

If u is an arbitrary vector in U, then [T(u)]B= [T]B,A [u]A.

3. Conversely, if A is an m × n matrix such that the equation [T(u)]B = A[u]A is satisfied for all u in U, then A is the matrix of the linear transformation T relative to the bases A and B, that is, A = [T]B,A.

2. 3. Composition of Linear Transformation and Matrix Multiplication

Definition 2. 3. 1. Let U, V and Z be vector spaces over the same field F, and let S:
U→V and T : V → Z be linear transformations. Then the product TS is the mapping
of U into Z (that is TS : U → Z) defined by TS(u) = (T o S)(u) = T(S(u)) for each u in U.

Theorem 2. 3. 1. The product of two linear transformations is a linear transformation


Proof. Let U, V and Z be vector spaces over the same field F, and let S: U→V and T : V
→ Z be linear transformations. Then TS : U → Z defined by TS(u) = (T o S)(u) = T(S(u))
for each u in U. To show that TS is a linear transformation, let u1, u2∈U and a, b∈F.
TS (au1+ b u2) = T(S (au1 + b u2))
= T(aS(u1) + b S( u2))
= aT(S(u1)) + b T( S( u2))
= aTS(u1) + b T S( u2),
and TS is indeed a linear transformation of U into Z.

The following theorem gives the relation between multiplication of linear transformations and multiplication of matrices.

Theorem 2. 3. 2. Suppose that U, V and Z are finite dimensional vector spaces with
bases A, B and C, respectively. If S has matrix A relative to A and B and T has matrix

B relative to B and C, then TS has matrix BA relative to A and C.

Proof. Assume that the hypotheses are satisfied with A = (u1, u2, . . . , un), B = (v1, v2, . . . , vm) and C = (w1, w2, . . . , wp), A = [aij]m×n, and B = [bij]p×m. Let C = [cij]p×n be the matrix of TS relative to the bases A and C. For j = 1, 2, …, n, we have
c1jw1 + . . . + cpjwp = TS(uj) = T(a1jv1 + . . . + amjvm) = a1jT(v1) + . . . + amjT(vm)
= a1j(b11w1 + . . . + bp1wp) + . . . + amj(b1mw1 + . . . + bpmwp),
and consequently cij = bi1a1j + bi2a2j + . . . + bimamj for all values of i and j. Therefore C = BA and the theorem is proved.

Illustrative Example - 1. Let the linear transformations S : M2×2(R) → R3 and T : R3 → P1(R) be defined by
S([a b; c d]) = (a + 2b, b − 3c, c + d), writing [a b; c d] for the 2×2 matrix with rows (a, b) and (c, d), and
T(a1, a2, a3) = (a1 – 2a2 – 6a3) + (a2 + 3a3)x. Then
(i) Compute the product TS.
(ii) Find the matrices of S, T and TS relative to the bases
A = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }, B = ε3, and C = {1, x}.
(iii) Verify that the matrix of TS is the product of the matrix of T times the matrix of S. 

Solution. (i) We have
TS([a b; c d]) = T(S([a b; c d])) = T(a + 2b, b − 3c, c + d)
= (a + 2b – 2(b − 3c) – 6(c + d)) + (b − 3c + 3(c + d))x
= (a – 6d) + (b + 3d)x.
(ii) Since S([1 0; 0 0]) = (1, 0, 0), S([0 1; 0 0]) = (2, 1, 0), S([0 0; 1 0]) = (0, −3, 1), and S([0 0; 0 1]) = (0, 0, 1),
the matrix A of S relative to A and B is
A = [ 1  2   0  0 ]
    [ 0  1  −3  0 ]
    [ 0  0   1  1 ].
Since T(1, 0, 0) = (1)(1) + (0)(x), T(0, 1, 0) = (–2)(1) + (1)(x), and T(0, 0, 1) = (–6)(1) + (3)(x),
the matrix B of T relative to B and C is
B = [ 1  −2  −6 ]
    [ 0   1   3 ].
For the product TS, we have
TS([1 0; 0 0]) = (1)(1) + (0)(x),   TS([0 1; 0 0]) = (0)(1) + (1)(x),
TS([0 0; 1 0]) = (0)(1) + (0)(x),   TS([0 0; 0 1]) = (−6)(1) + (3)(x),
so the matrix of TS relative to A and C is
C = [ 1  0  0  −6 ]
    [ 0  1  0   3 ].
(iii) Using the usual arithmetic of matrix multiplication, we have
BA = [ 1  −2  −6 ] [ 1  2   0  0 ]   [ 1  0  0  −6 ]
     [ 0   1   3 ] [ 0  1  −3  0 ] = [ 0  1  0   3 ] = C.
                   [ 0  0   1  1 ]
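The product BA in part (iii) is easy to confirm numerically; the short optional sketch below multiplies the two matrices with NumPy.

```python
import numpy as np

A = np.array([[1, 2,  0, 0],
              [0, 1, -3, 0],
              [0, 0,  1, 1]])    # matrix of S relative to A and B
B = np.array([[1, -2, -6],
              [0,  1,  3]])      # matrix of T relative to B and C
print(B @ A)
# [[ 1  0  0 -6]
#  [ 0  1  0  3]]   -> the matrix of TS relative to A and C
```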

Note.
1. In the above example the product AB is not defined, and this is consistent with the fact that the composition ST is not defined.
2. The operations of addition and multiplication of matrices are connected by the distributive property, that is, A(B + C) = AB + AC, where A = [aij]m×n and B = [bij]n×p, C = [cij]n×p.


3. Let A be an n × n matrix with entries from a field F. The mapping TA : Fn → Fn


defined by TA(x) = Ax (the matrix product of A and x) for each column vector x∈
Fn . This is known as Left-Multiplication transformation.

Illustrative Example - 2. Let T : R2 → R2 be the linear transformation defined by T(a, b) = (2a – 3b, a + b) for all (a, b) ∈ R2. Find the matrix of T relative to the bases B = {v1 = (1, 0), v2 = (0, 1)} and B1 = {w1 = (2, 3), w2 = (1, 2)}.
Solution. In order to find the matrix of T relative to the bases B and B1, we have to express T(v1) and T(v2) as linear combinations of w1 and w2.
For this, we first find the coordinates of an arbitrary vector (a, b) ∈ R2 with respect to the basis B1:
(a, b) = x w1 + y w2
⇒ (a, b) = x(2, 3) + y(1, 2)
⇒ (a, b) = (2x + y, 3x + 2y)
⇒ 2x + y = a and 3x + 2y = b
⇒ x = 2a – b and y = –3a + 2b.
Therefore (a, b) = (2a – b) w1 + (–3a + 2b) w2. → (1)
Now, T(a, b) = (2a – 3b, a + b), so T(v1) = T(1, 0) = (2, 1) and T(v2) = T(0, 1) = (–3, 1).
Putting a = 2, b = 1 in (1): T(v1) = (2, 1) = 3 w1 – 4 w2.
Putting a = –3, b = 1 in (1): T(v2) = (–3, 1) = –7 w1 + 11 w2.
Hence
[T]B1,B = [  3  –7 ]
          [ –4  11 ].
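Such a computation can be double-checked by solving, for each image vector, the small linear system that gives its coordinates relative to B1. The following optional NumPy sketch reproduces the matrix just found (writing w1, w2 for the vectors of B1).

```python
import numpy as np

W = np.array([[2, 1],
              [3, 2]], dtype=float)   # columns are w1 = (2, 3) and w2 = (1, 2)

def T(v):
    a, b = v
    return np.array([2*a - 3*b, a + b], dtype=float)

# the coordinates of T(v1), T(v2) relative to {w1, w2} form the columns of the matrix
cols = [np.linalg.solve(W, T(e)) for e in np.eye(2)]
print(np.column_stack(cols))
# [[ 3. -7.]
#  [-4. 11.]]
```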

2. 4. Invertibility and Isomorphism

Definition 2. 4. 1. Let V be a vector space over a field F and let Hom(V, V) or A(V) be the set of all linear transformations of V into itself. A(V) is known as the algebra of linear transformations on V; that is, each element of A(V) is a linear transformation T : U → V with U = V.
For T1, T2 ∈ A(V),
(T1 + T2)(v) = T1(v) + T2(v) ⇒ T1 + T2 ∈ A(V), and
(aT)(v) = a(T(v)) ⇒ aT ∈ A(V).
Now, we show that (T1T2)(au + bv) = a(T1T2)(u) + b(T1T2)(v) for all a, b ∈ F, u, v ∈ V and T1, T2 ∈ A(V):
(T1T2)(au + bv) = T1(T2(au + bv))
= T1(aT2(u) + bT2(v))
= aT1(T2(u)) + bT1(T2(v))
= a(T1T2)(u) + b(T1T2)(v).

Definition 2. 4. 2. An element T ∈ A(V) is invertible (regular) if it is both right and left invertible; that is, ST = TS = I for some S ∈ A(V). We denote the inverse S of T by T–1.
Note.
1. An element in A(V) which is not invertible is called singular.
2. It is quite possible that an element in A(V) is right invertible but is not invertible. For example, take F to be the field of real numbers and let V be the vector space F[x] of polynomials over F. Define the linear maps S, T : V → V by S(q(x)) = dq(x)/dx and T(q(x)) = ∫_0^x q(t) dt.
First, TS(q(x)) = T(S(q(x))) = ∫_0^x q′(t) dt = q(x) − q(0), which differs from q(x) by a constant (the integrating constant) that is in general nonzero. Therefore TS ≠ I.
Next, ST(q(x)) = S(T(q(x))) = d/dx ( ∫_0^x q(t) dt ) = q(x). Therefore ST = I. Finally, ST ≠ TS.


3. Let A be an algebra with unit element e over a field F, and let p(x) = a0 + a1x + …… + anx^n be a polynomial in F[x]. For α ∈ A, by p(α) we shall mean the element a0e + a1α + …… + anα^n in A. If p(α) = 0 we shall say that α satisfies p(x).
4. A nonzero polynomial p(x) is said to be the minimal polynomial for T if
(i) p(T) = 0, and
(ii) q(T) ≠ 0 for every nonzero q(x) ∈ F[x] with degree of q(x) less than the degree of p(x).
5. If V is a finite dimensional vector space over a field F, then T∈A(V) is invertible
(regular) if and only if the constant term of the minimal polynomial for T is not
zero.


6. If V is finite dimensional over F and if Tϵ A(V) is non-invertible(singular), then


there exists an S ≠ 0 in A(V) such that ST = TS = 0.

Theorem 2. 4. 1. If V is finite dimensional vector space over a field F, then TϵA(V) is


non-invertible (singular) if and only if there exists a v ≠ 0 in V such that T(v) = 0.
Proof. Suppose T ∈ A(V) is singular. Then, by Note 6 above, there exists S ≠ 0 in A(V) such that ST = TS = 0. Since S ≠ 0, there is an element w ∈ V such that S(w) ≠ 0.
Let v = S(w) ≠ 0 in V. Then T(v) = T(S(w)) = (TS)(w) = 0(w) = 0. Thus T(v) = 0 for some v ≠ 0 in V.
Conversely, suppose T(v) = 0 for some v ≠ 0 in V. We claim that T is singular. Suppose T is regular; then there exists T–1 such that TT–1 = T–1T = I.
So T(v) = 0 ⇒ T–1(T(v)) = T–1(0) ⇒ (T–1T)v = 0 ⇒ v = 0, a contradiction.
Therefore T is singular.

Note.
1. Let A be an n×n matrix. Then A is invertible if there exists an n×n matrix B such that AB = BA = I; such a B is unique (if C is another such matrix, then C = CI = C(AB) = (CA)B = IB = B). The matrix B is called the inverse of A and is denoted by A–1.
2. If U and V are finite dimensional vector spaces over a field F with ordered bases A and B, respectively, then a linear transformation T : U → V is invertible if and only if [T]B,A is invertible. Furthermore, [T–1]A,B = ([T]B,A)–1.

Definition 2. 4. 3. Let U and V be vector spaces. We say that U is isomorphic to V if there exists a linear transformation T: U → V that is invertible. Such a linear transformation is called an isomorphism of U onto V.

Note.
1. Let U and V be finite dimensional vector spaces (over the same field F). Then V is isomorphic to U if and only if dim(U) = dim(V).
2. Let V be a finite dimensional vector space over a field F. Then V is isomorphic to Fn if and only if dim(V) = n.

Theorem 2. 4. 2. If A is an algebra, with unit element, over F, then A is isomorphic to a


sub algebra of A(V) for some vector space V over F.
Proof. Since A is an algebra over F, it is in particular a vector space over F. We shall use V = A to prove the result.
For x ∈ A, define Tx : V → V by Tx(v) = xv for all v ∈ V (= A).
We assert that Tx is a linear transformation on V (= A):
Tx(v1 + v2) = x(v1 + v2) = xv1 + xv2 = Tx(v1) + Tx(v2), and
Tx(av) = x(av) = a(xv) = a(Tx(v)).
Hence Tx is a linear transformation on V; therefore Tx ∈ A(V).
Consider the mapping ψ : A → A(V) defined by ψ(x) = Tx. We claim that ψ is an isomorphism of A into A(V).
For x, y ∈ A and a, b ∈ F, we have
Tax+by(v) = (ax + by)v = a(xv) + b(yv) = a(Tx(v)) + b(Ty(v)) = (aTx + bTy)(v), since Tx and Ty are linear transformations.
Thus Tax+by = aTx + bTy ⇒ ψ(ax + by) = aψ(x) + bψ(y).
Therefore ψ is a linear transformation of A into A(V).
Now for x, y ∈ A, we have
Txy(v) = (xy)v = x(yv) = Tx(Ty(v)) = (TxTy)(v).
Thus Txy = TxTy ⇒ ψ(xy) = ψ(x)ψ(y).
Therefore ψ is also a ring homomorphism of A. So far we have proved that ψ is a homomorphism of A, as an algebra, into A(V).
To prove ψ is one to one, we claim that Ker(ψ) = {0}.
Let x ∈ Ker(ψ). Then ψ(x) = 0 ⇒ Tx = 0
⇒ Tx(v) = 0 for every v ∈ V (= A)
⇒ Tx(e) = 0, where e is the unit element of A
⇒ xe = 0, that is, x = 0.
Therefore Ker(ψ) = {0}, and hence ψ is one to one.
So ψ is an isomorphism of A into A(V).

Theorem 2. 4. 3. Let A be an algebra, with unit element, over a field F, and suppose that
A is of dimension m over F. Then every element in A satisfies some nontrivial
polynomial in F[x] (or Pn(F)) of degree at most m.
Proof. Let e be the unit element of A. For y ∈ A, consider the elements e, y, y^2, . . . , y^m in A. Since A is m-dimensional over the field F, these (m + 1) elements (vectors) are linearly dependent over F. So there exist elements a0, a1, ……, am in F, not all zero, such that a0e + a1y + …… + amy^m = 0. Therefore y satisfies the nontrivial polynomial q(x) = a0 + a1x + …… + amx^m, of degree at most m, in F[x].

Note. Hom(U, V) = {T : U → V : T is a linear transformation}; in particular, A(V) = Hom(V, V).

Theorem 2. 4. 4. If V is an n-dimensional vector space over a field F, then, given any element T in A(V), there exists a nontrivial polynomial q(x) ∈ F[x] of degree at most n^2 such that q(T) = 0.
Proof. We know that dim(Hom(U, V)) = dim(U) dim(V), where U and V are finite dimensional vector spaces over a field F. Hence dim(A(V)) = dim(Hom(V, V)) = n^2.
Hence, by Theorem 2. 4. 3, any element T of A(V) satisfies some nontrivial polynomial q(x) in F[x] of degree at most n^2, that is, q(T) = 0.

Illustrative Example-1. Let T : R2 → R2 be a linear transformation defined by


T(x, y) = (2x + y, 3x + 2y). Then show that T is invertible and find T – 1.
Solution. To show T is invertible, it is sufficient to show that T is non-singular.
That is, T(x, y) = (0, 0) ⇔ (x, y) = (0,0).
Now, T (x, y) = (0, 0) ⇔ (2x+ y, 3x+2 y) = (0,0)
⇔ 2x+ y = 0, 3x+2y=0
⇔ x = 0, y = 0
⇔ (x, y) = (0, 0)
Thus, T(x, y) = (0,0) ⇔ (x, y) = (0, 0).
To find the formula for T– 1.
Let T(x, y) = (a, b). Then, T – 1(a, b) = (x, y)


Now, T(x, y) = (a, b) ⇒ (2x + y, 3x + 2y) = (a, b)


⇒ 2x + y =a and 3x + 2y=b
⇒ x = 2a – b, y = –3a + 2b
Therefore, T – 1(a, b) = (2a – b, –3a + 2b).
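Relative to the standard basis, T is represented by the matrix with rows (2, 1) and (3, 2), and T–1 is represented by its inverse; the optional sketch below confirms the formula just obtained.

```python
import numpy as np

A = np.array([[2, 1],
              [3, 2]], dtype=float)   # T(x, y) = (2x + y, 3x + 2y)
print(np.linalg.inv(A))
# [[ 2. -1.]
#  [-3.  2.]]   i.e.  T^{-1}(a, b) = (2a - b, -3a + 2b)
```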

2. 5. The Change of Coordinate Matrix.

We now describe the relation between matrices that represent the same linear transformation. This description is found by examining the effect that a change in the bases of U and V has on the matrix of T.
Theorem 2. 5. 1. Let C = (w1, w2, . . . , wk) and C1 = (w1′, w2′, . . . , wk′) be two bases of the vector space W over F. For an arbitrary vector w in W, let
[w]C = (c1, c2, . . . , ck)^T and [w]C1 = (c1′, c2′, . . . , ck′)^T
denote the coordinate matrices of w relative to C and C1, respectively. If P is the matrix of transition from C to C1, then [w]C = P[w]C1.
Proof. Let P = [pij]k×k, and assume that the hypotheses of the theorem are satisfied. Then
w = c1w1 + . . . + ckwk = c1′w1′ + . . . + ck′wk′  and  wj′ = p1jw1 + p2jw2 + . . . + pkjwk.
Combining these equalities, we have
w = c1′(p11w1 + . . . + pk1wk) + . . . + ck′(p1kw1 + . . . + pkkwk)
= (p11c1′ + p12c2′ + . . . + p1kck′)w1 + . . . + (pk1c1′ + pk2c2′ + . . . + pkkck′)wk.
Since the coordinates of w relative to C are unique, ci = pi1c1′ + pi2c2′ + . . . + pikck′ for each i; that is, [w]C = P[w]C1.


Example -1. Consider the bases C = {x, 2+x} and C¹ = {4+x, 4−x} of the vector space
W = P₁(R). Since 4+x = (−1)(x) + (2)(2+x) and 4−x = (−3)(x) + (2)(2+x), the matrix of
transition P from C to C¹ is given by
P = \begin{bmatrix} -1 & -3 \\ 2 & 2 \end{bmatrix}.
The vector w = 4+3x can be written as 4+3x = 2(4+x) + (−1)(4−x), so [w]_{C^1} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}.
By Theorem 2.5.1, the coordinate matrix [w]_C may be found from
[w]_C = P[w]_{C^1} = \begin{bmatrix} -1 & -3 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}.
This result can be checked by using the basis vectors in C:
(1)(x) + 2(2+x) = 4+3x = w.
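The computation in Example-1 can also be checked numerically. A minimal sketch, assuming NumPy, where the coordinates of a polynomial a·x + b·(2+x) relative to C are written as the column [a, b]:

```python
import numpy as np

P = np.array([[-1.0, -3.0],    # matrix of transition from C to C^1
              [ 2.0,  2.0]])
w_C1 = np.array([2.0, -1.0])   # [w]_{C^1} for w = 4 + 3x

print(P @ w_C1)                # [1., 2.]  =  [w]_C, i.e. w = (1)(x) + (2)(2 + x)
```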

Definition 2. 5. 1. If A = (u₁, u₂, . . ., u_r) is a set of vectors in Rⁿ and B = (v₁, v₂, . . ., v_s)
is a set of vectors in ⟨A⟩ (the span of A), a matrix of transition (or transition matrix) from A to B is
a matrix A = [a_{ij}]_{r×s} such that
v_j = \sum_{i=1}^{r} a_{ij} u_i \qquad \text{for } j = 1, 2, . . ., s.

Theorem 2. 5. 2. Suppose that T has matrix A = [a_{ij}]_{m×n} relative to the bases A of U and
B of V. If Q is the matrix of transition from A to the basis A¹ of U and P is the matrix
of transition from B to the basis B¹ of V, then the matrix of T relative to A¹ and B¹ is
P⁻¹AQ (see Figure-1).

Figure-1. (Diagram relating the bases A, A¹, B, B¹; not reproduced here.)
Proof. Assume the hypotheses of the theorem are satisfied. Let u be an arbitrary vector
in U, let X and X¹ denote the coordinate matrices of u relative to A and A¹,
respectively, and let Y and Y¹ denote the coordinate matrices of T(u) relative to B and
B¹, respectively. Since the matrix of T relative to the bases A and B is
A = [a_{ij}]_{m×n} = [T]_{B,A}, we have [T(u)]_B = [T]_{B,A}[u]_A, that is, Y = AX. By
Theorem 2.5.1, Y = PY¹ and X = QX¹. Substituting for Y and X,
we have PY¹ = AQX¹, and therefore Y¹ = (P⁻¹AQ)X¹.
Since the matrix of T relative to a pair of bases is the unique matrix M for which
[T(u)] = M[u] holds for all u in U, it follows that P⁻¹AQ is the
matrix of the linear transformation T relative to A¹ and B¹.

Theorem 2. 5. 3. Two m×n matrices A and B represent the same linear transformation T
of U into V if and only if A and B are equivalent.
Proof. If A and B represent T relative to the pairs of bases A, B and A¹, B¹, respectively,
then B = P⁻¹AQ, where Q is the matrix of transition from A to A¹ and P is the matrix of
transition from B to B¹. Hence A and B are equivalent. Conversely, if B is equivalent to A, then
B = P⁻¹AQ for some invertible P and Q. If A represents T relative to A and B, then B
represents T relative to A¹ and B¹, where Q is the matrix of transition from A to A¹ and
P is the matrix of transition from B to B¹.

Two similar results concerning row and column equivalence can now be obtained:
requiring B = B¹ is the same as requiring P = I_m, and requiring A = A¹ is the same as
requiring Q = I_n. Thus we have the following theorem.
Theorem 2. 5. 4. Two m×n matrices A and B represent the same linear transformation T
of U into V if and only if they are row (or column) equivalent.

Illustrative Example - 2. Relative to the basis B = {v1, v2} = {(1, 1), (2, 3)} of R2, find
the coordinate matrix of (i) v = (4, –3) and (ii) v = (a, b)
Solution. Let x, y ∈ R such that v = xv1 + yv2 = x(1, 1) + y(2, 3) = (x + 2y, x + 3y)
(i) If v = (4, −3), then v = xv₁ + yv₂ ⇒ (4, −3) = (x + 2y, x + 3y)
⇒ x + 2y = 4, x + 3y = −3
⇒ x = 18, y = −7.
Hence, the coordinate matrix [v]_B of v relative to the basis B is given by [v]_B = \begin{bmatrix} 18 \\ -7 \end{bmatrix}.
(ii) If v = (a, b), then v = xv₁ + yv₂ ⇒ (a, b) = (x + 2y, x + 3y)
⇒ x + 2y = a, x + 3y = b
⇒ x = 3a − 2b, y = −a + b.
Hence, the coordinate matrix [v]_B of v relative to the basis B is given by [v]_B = \begin{bmatrix} 3a - 2b \\ -a + b \end{bmatrix}.
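Finding a coordinate matrix amounts to solving a small linear system, which can also be done numerically. A minimal sketch assuming NumPy, with the basis vectors of B placed as the columns of M:

```python
import numpy as np

M = np.array([[1.0, 2.0],      # columns are v1 = (1, 1) and v2 = (2, 3)
              [1.0, 3.0]])

print(np.linalg.solve(M, [4.0, -3.0]))   # [18., -7.]
print(np.linalg.solve(M, [1.0,  1.0]))   # [ 1.,  0.]  =  (3a - 2b, -a + b) for (a, b) = (1, 1)
```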

2. 6. The dual space


Definition 2. 6. 1. Consider a linear transformation from a vector space V into its field of scalars F,
where F is itself regarded as a vector space of dimension 1 over F. Such a linear transformation
is called a linear functional on V. We generally use the letters f, g, h, . . . to denote linear
functionals.

Example -1. Let V be the vector space of continuous real valued functions on the
interval [0, 2π]. Fix a function g ∈ V. The function h : V → R defined by
h(x) = \frac{1}{2\pi} \int_0^{2\pi} x(t)\, g(t)\, dt
is a linear functional on V. In the case that g(t) equals sin(nt) or cos(nt), h(x) is often called the nth Fourier coefficient of x.

Example -2. Let V be a finite-dimensional vector space and let B = {x₁, x₂, . . ., x_n}
be an ordered basis for V. For each i = 1, 2, . . ., n, define f_i(x) = a_i, where
[x]_B = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
is the coordinate vector of x relative to B. Then f_i is a linear functional on the
vector space V, called the ith coordinate function with respect to the basis B.
Note.
1. f_i(x_j) = δ_{ij}, where δ_{ij} is the Kronecker delta.
2. Let V and W be finite-dimensional vector spaces (over the same field F). Then V is isomorphic to W if and only if
dim (V) = dim (W).


Definition 2. 6. 2. For a vector space V over F, we define the dual space of V to be the
vector space Hom (V, F), denoted by V*. Thus V* is the vector space consisting of all
linear functional on V with the operations of addition and scalar multiplication. If V is
finite - dimensional, then dim (V*) = dim (Hom (V, F)) = dim (V) ⋅ dim (F) = dim (V).
Hence, by the above note, V and V* are isomorphic. We also define the double dual
V** of V to be the dual of V*.

Theorem 2. 6. 1. Suppose that V is a finite dimensional vector space over a field F
with the ordered basis B = {x₁, x₂, . . ., x_n}. Let B* = {f₁, f₂, . . ., f_n}, where
f_i (1 ≤ i ≤ n) is the ith coordinate function with respect to B. Then B* is an ordered basis
of V*, and, for any f ∈ V*, we have
f = \sum_{i=1}^{n} f(x_i)\, f_i.
Proof. Let f ∈ V*. Since dim (V*) = n and B* contains exactly n vectors, we need only show that
f = \sum_{i=1}^{n} f(x_i) f_i: this proves that B* generates V*, and since any finite generating set
for V* contains at least n vectors, a generating set that contains exactly n vectors is a basis.
Let g = \sum_{i=1}^{n} f(x_i) f_i. For 1 ≤ j ≤ n, we have
g(x_j) = \left( \sum_{i=1}^{n} f(x_i) f_i \right)(x_j) = \sum_{i=1}^{n} f(x_i) f_i(x_j) = \sum_{i=1}^{n} f(x_i)\, δ_{ij} = f(x_j).
Therefore f = g, due to the fact that for two vector spaces V and W, with V having a finite basis
{v₁, v₂, . . ., v_n}, if U, T : V → W are linear and U(v_i) = T(v_i) for i = 1, 2, …, n, then U = T.

Using the notation of the above theorem, we define

Definition 2. 6. 3. The ordered basis B* = {f₁, f₂, . . ., f_n} of V* that satisfies
f_i(x_j) = δ_{ij} (1 ≤ i, j ≤ n) is known as the dual basis of B.

Illustrative Example-3. Let B = {(2, 1), (3, 1)} be an ordered basis for R². Suppose
that the dual basis of B is given by B* = {f₁, f₂}. Determine a formula for f₁.
Solution. We need to consider the equations
1= f1(2, 1) = f1(2e1 + e2) = 2 f1(e1) + f1(e2)
0= f1(3, 1) = f1(3e1 + e2) = 3 f1(e1) + f1(e2).
Solving these equations, we obtain f1(e1)= – 1 and f1(e2) = 3. That is , f1(x, y) = – x+ 3y.


Similarly, it can be shown that f₂(x, y) = x − 2y.
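The dual basis can also be computed numerically. In the sketch below (assuming NumPy), the rows of V are the basis vectors of B; writing the coefficients of f₁ and f₂ as the rows of a matrix F, the conditions f_i(v_j) = δ_ij read F Vᵀ = I, so F = (Vᵀ)⁻¹.

```python
import numpy as np

V = np.array([[2.0, 1.0],      # rows are the basis vectors (2, 1) and (3, 1)
              [3.0, 1.0]])
F = np.linalg.inv(V.T)

print(F)   # [[-1.,  3.], [ 1., -2.]]  ->  f1(x, y) = -x + 3y,  f2(x, y) = x - 2y
```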

2. 7. Summary
In this unit, the important concept of a linear transformation of a vector space is
introduced. A linear map, linear mapping, linear transformation, or linear operator (in
some contexts also called linear function) is a function between two vector spaces that
preserves the operations of vector addition and scalar multiplication. As a result, it
always maps straight lines to straight lines or 0. The expression "linear operator" is
commonly used for linear maps from a vector space to itself (That is, endomorphisms). If
one has a linear transformation T(x) in functional form, it is easy to determine the
transformation matrix A by simply transforming each of the vectors of the standard basis
by T and then inserting the results into the rows [columns] of a matrix. Also, matrix
multiplication is a binary operation that takes a pair of matrices, and produces another
matrix. The matrix of a composition of linear maps is given by the product of the
matrices. The change of basis refers to the conversion of vectors and linear
transformations between matrix representations which have different bases. A dual basis
is a set of vectors that forms a basis for the dual space of a vector space.

2. 8. Keywords
Change of coordinate matrix Identity matrix
Coordinate vector relative to a basis Identity transformation
Composition of linear transformation Invertible linear transformation
Dual basis Invertible matrix
Dual space Isomorphic vector spaces
Kernel of a linear transformation Isomorphism
Left - multiplication transformation Nullity of a linear transformation
Linear functional Null space
Linear transformation Product of Matrices
Matrix multiplication Range
Non-invertible linear transformation Rank of a linear transformation


UNIT -3
3. 0. Introduction
In the previous two units, we have learnt about vector spaces and linear
transformations between two vector spaces U(F) and V(F) defined over the same field F.
In this unit, we shall see that each linear transformation from an n-dimensional vector space
U(F) to an m-dimensional vector space V(F) corresponds to an m×n matrix over the field F.
We also see how the matrix representation of a linear transformation changes with a
change of basis of the given vector spaces. With the help of elementary matrix operations, we
solve many problems concerning the rank of a matrix, matrix inverses and systems of linear
equations.

3. 1. Elementary Matrix Operations and Elementary Matrices

Definition 3. 1. 1. Let A be an m × n matrix. Any one of the following three operations


on the rows [columns] of A is called an elementary row [column] operation:
1. Interchanging any two rows [columns] of A;
2. Multiplying any row [column] of A by a nonzero scalar;
3. Adding any scalar multiple of a row [column] of A to another row [column].
Any of these three operations is called an elementary operation. Elementary
operations are of Type -1, Type -2, or Type- 3 depending on whether they are obtained by
(1), (2) or (3).

Example - 1. Let A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 1 & 3 \\ 4 & 0 & 1 & 2 \end{bmatrix}.
Interchanging the second row of A with the first row is an example of an elementary row
operation of Type-1. The resulting matrix is
B = \begin{bmatrix} 2 & 1 & 1 & 3 \\ 1 & 2 & 3 & 4 \\ 4 & 0 & 1 & 2 \end{bmatrix}.
Multiplying the second column of A by 3 is an example of an elementary column
operation of Type-2. The resulting matrix is
C = \begin{bmatrix} 1 & 6 & 3 & 4 \\ 2 & 3 & 1 & 3 \\ 4 & 0 & 1 & 2 \end{bmatrix}.
Adding 4 times the third row of A to the first row is an example of an elementary row
operation of Type-3. In this case, the resulting matrix is
M = \begin{bmatrix} 17 & 2 & 7 & 12 \\ 2 & 1 & 1 & 3 \\ 4 & 0 & 1 & 2 \end{bmatrix}.
If a matrix Q can be obtained from a matrix P by means of an elementary row
operation, then P can be obtained from Q by an elementary row operation of the same
type. In the above example, A can be obtained from M by adding −4 times the third row
of M to the first row of M.

Definition 3. 1. 2. An n × n elementary matrix is a matrix obtained by performing


elementary operation on In, where In is identity matrix.
The elementary matrix is said to be of Type 1, 2, or 3 according to whether the
elementary operation performed on In is a Type 1, 2 or 3 operation, respectively.

Example - 2. Interchanging the first two rows of I₃ produces the elementary matrix
E = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
Note that E can also be obtained by interchanging the first two columns of I₃. In fact, any
elementary matrix can be obtained in at least two ways – either by performing an
elementary row operation on I_n or by performing an elementary column operation on I_n.
Similarly,
\begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
is an elementary matrix, since it can be obtained from I₃ by an elementary column
operation of Type-3 (adding −2 times the first column of I₃ to the third column) or by an
elementary row operation of Type-3 (adding −2 times the third row to the first row).

Our first theorem shows that performing an elementary row operation on a matrix
is equivalent to multiplying the matrix by an elementary matrix.


Theorem 3. 1. 1. Let A∈Mmxn(F), and suppose that B is obtained from A by performing


an elementary row [column] operation. Then there exists an m × m [n × n] elementary
matrix E such that B = EA [B = AE].

In fact, E is obtained from I_m [I_n] by performing the same elementary row
[column] operation as that which was performed on A to obtain B. Conversely,
if E is an elementary m × m [n × n] matrix, then EA [AE] is the matrix obtained from
A by performing the same elementary row [column] operation as that which produces E
from I_m [I_n].
The proof, which we omit, requires verifying the above theorem for each type of
elementary row operation. The proof for column operations can then be obtained by
using the matrix transpose to transform a column operation into a row operation. The
next example illustrates the use of the above theorem.

Example - 3. Consider the matrices A and B in Example-1. In this case, B is obtained


from A by interchanging the first two rows of A. Performing this same operation on I3,
we obtain the elementary matrix
E = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
Note that EA = B.
In the second part of Example-1, C is obtained from A by multiplying the second
column of A by 3. Performing this same operation on I₄, we obtain the elementary matrix
E = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
Observe that AE = C. It is a useful fact that the inverse of an elementary matrix is also an
elementary matrix.
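Theorem 3.1.1 is easy to check numerically on the matrices of Example-1. The sketch below assumes NumPy is available and is only an illustration of the statement EA = B and AE = C.

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [2, 1, 1, 3],
              [4, 0, 1, 2]])

E_row = np.eye(3)[[1, 0, 2]]         # I3 with rows 1 and 2 interchanged (Type-1)
print(np.array_equal(E_row @ A, A[[1, 0, 2]]))   # True: EA = B

E_col = np.eye(4)
E_col[1, 1] = 3                      # I4 with its second column multiplied by 3 (Type-2)
C = A.astype(float).copy()
C[:, 1] *= 3
print(np.array_equal(A @ E_col, C))  # True: AE = C
```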

Theorem 3. 1. 2. Elementary matrices are invertible, and the inverse of an elementary


matrix is an elementary matrix of the same type.
Proof. Let E be an elementary n × n matrix. Then E can be obtained by an elementary
row operation on I_n. By reversing the steps used to transform I_n into E, we can transform
E back into I_n. The result is that I_n can be obtained from E by an elementary row
operation of the same type. By Theorem 3.1.1, there is an elementary matrix Ē such
that ĒE = I_n. Therefore, E is invertible and E⁻¹ = Ē.

3. 2. The Rank of a Matrix and Matrix Inverses

Definition 3. 2. 1. If A ∈ Mmxn(F), we define the rank of A, denoted rank (A), to be the


rank of the linear transformation TA: Fn → Fm.
Note.
1. Many results about the rank of a matrix follow immediately from the
corresponding facts about a linear transformation. An important result of this
type is that an n × n matrix is invertible if and only if its rank is n.

2. Every matrix A is the matrix representation of the linear transformation T_A
(with respect to the appropriate standard ordered bases), so the rank of T_A is the
same as the rank of one of its matrix representations, namely A. Theorem 3.2.1
below extends this fact to any matrix representation of any linear
transformation defined on finite-dimensional vector spaces.

3. Let T : V → W be a linear transformation between finite-dimensional vector
spaces over a field F, and let A and B be ordered bases for V and W, respectively.
Then rank (T) = rank ([T]_{B,A}).

4. Let A be an m × n matrix. If P and Q are invertible m × m and n × n matrices,


respectively, then

(i) rank (AQ) = rank (A)


(ii) rank (PA ) = rank (A), and therefore
(iii) rank (PAQ) = rank(A)

5. Elementary row and column operations on a matrix are rank preserving.

Theorem 3. 2. 1. The rank of any matrix equals the maximum number of its linearly
independent columns; that is, the rank of a matrix is the dimension of the subspace
generated by its columns.


Proof. For any A ∈ M_{m×n}(F), rank (A) = rank (T_A) = dim (R(T_A)), where R(T_A) denotes the range of T_A.
Recall that if T : V → W is a linear transformation and A = (u₁, u₂, . . ., u_n) is a basis for V, then
R(T) = span (T(A)) = span ({T(u₁), T(u₂), . . ., T(u_n)}).
Now let A = (e₁, e₂, . . ., e_n) be the standard ordered basis for Fⁿ. Then A spans Fⁿ and
R(T_A) = span (T_A(A)) = span ({T_A(e₁), T_A(e₂), . . ., T_A(e_n)}). But,
for any j, we see that T_A(e_j) = Ae_j = a_j, where a_j is the jth column of A.
Hence R(T_A) = span ({a₁, a₂, . . ., a_n}).
Thus, rank (A) = dim (R(T_A)) = dim (span ({a₁, a₂, . . ., a_n})).

Example -1. Let A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}.
Observe that the first and second columns of A are linearly independent and that the third
column is a linear combination of the first two. Thus,
rank (A) = dim \left( span \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\} \right) = 2.
To compute the rank of a matrix A, it is frequently useful to postpone the use of
above theorem until A has been suitably modified by means of appropriate elementary
row and column operations so that the number of linearly independent columns is
obvious.
By Note 2, guarantees that the rank of the modified matrix is the same as the rank
of A. One such modification of A can be obtained by using elementary row and column
operations to introduce zero entries. The next example illustrates this procedure.

Example -2. Let A = \begin{bmatrix} 1 & 2 & 1 \\ 1 & 0 & 3 \\ 1 & 1 & 2 \end{bmatrix}.
If we subtract the first row of A from rows 2 and 3 (Type-3 elementary row operations),
the result is
\begin{bmatrix} 1 & 2 & 1 \\ 0 & -2 & 2 \\ 0 & -1 & 1 \end{bmatrix}.
If we now subtract twice the first column from the second and subtract the first column
from the third (Type-3 elementary column operations), we obtain
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 2 \\ 0 & -1 & 1 \end{bmatrix}.
It is now obvious that the maximum number of linearly independent columns of this
matrix is 2. Hence the rank of A is 2.

Example -3.
(i) Let A = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 2 & 1 & 1 & 1 \end{bmatrix}.
Note that the first and second rows of A are linearly independent since one is not a
multiple of the other. Thus rank (A) = 2.
(ii) Let A = \begin{bmatrix} 1 & 3 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 3 & 0 & 0 \end{bmatrix}.
In this case, there are several ways to proceed. Suppose that we begin with an
elementary row operation to obtain a zero in the 2,1 position. Subtracting the first
row from the second row, we obtain
\begin{bmatrix} 1 & 3 & 1 & 1 \\ 0 & -3 & 0 & 0 \\ 0 & 3 & 0 & 0 \end{bmatrix}.
Note that the third row is a multiple of the second row, and the first and second rows
are linearly independent. Thus rank (A) = 2.
(iii) Let A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 2 & 1 & 1 & 1 \\ 1 & -1 & 1 & 0 \end{bmatrix}.
Using elementary row operations, we can transform A as follows:
A → \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -5 & -1 \\ 0 & -3 & -2 & -1 \end{bmatrix} → \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -5 & -1 \\ 0 & 0 & 3 & 0 \end{bmatrix}.
It is clear that the last matrix has three linearly independent rows and hence has
rank 3.
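The ranks found in Examples 1–3 can be confirmed numerically. A small sketch, assuming NumPy is available:

```python
import numpy as np

A1 = np.array([[1, 0, 1], [0, 1, 1], [1, 0, 1]])                 # Example-1
A2 = np.array([[1, 2, 1], [1, 0, 3], [1, 1, 2]])                 # Example-2
A3 = np.array([[1, 2, 3, 1], [2, 1, 1, 1], [1, -1, 1, 0]])       # Example-3 (iii)

print(np.linalg.matrix_rank(A1),   # 2
      np.linalg.matrix_rank(A2),   # 2
      np.linalg.matrix_rank(A3))   # 3
```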

Definition 3. 2. 2. Let A and B be m × n and m × p matrices, respectively. By the


augmented matrix (A|B), we mean the m × (n+p) matrix (A|B), that is, the matrix whose
first n columns are the columns of A, and whose last p columns are the columns of B.


Illustrative Example - 4. Determine whether the matrix A = \begin{bmatrix} 0 & 2 & 4 \\ 2 & 4 & 2 \\ 3 & 3 & 1 \end{bmatrix} is invertible or not. Also, find A⁻¹.

Solution. First, we attempt to use elementary row operations to transform
(A | I) = \left[ \begin{array}{ccc|ccc} 0 & 2 & 4 & 1 & 0 & 0 \\ 2 & 4 & 2 & 0 & 1 & 0 \\ 3 & 3 & 1 & 0 & 0 & 1 \end{array} \right]
into a matrix of the form (I | B). One method for accomplishing this transformation is to
change each column of A successively, beginning with the first column, into the
corresponding column of I. Since we need a nonzero entry in the 1,1 position, we
begin by interchanging rows 1 and 2. The result is
\left[ \begin{array}{ccc|ccc} 2 & 4 & 2 & 0 & 1 & 0 \\ 0 & 2 & 4 & 1 & 0 & 0 \\ 3 & 3 & 1 & 0 & 0 & 1 \end{array} \right].
In order to place a 1 in the 1,1 position, we must multiply the first row by ½; this
operation yields
\left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 0 & \tfrac{1}{2} & 0 \\ 0 & 2 & 4 & 1 & 0 & 0 \\ 3 & 3 & 1 & 0 & 0 & 1 \end{array} \right].
We now complete work in the first column by adding −3 times row 1 to row 3 to obtain
\left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 0 & \tfrac{1}{2} & 0 \\ 0 & 2 & 4 & 1 & 0 & 0 \\ 0 & -3 & -2 & 0 & -\tfrac{3}{2} & 1 \end{array} \right].
In order to change the second column of the preceding matrix into the second column of
I, we multiply row 2 by ½ to obtain a 1 in the 2,2 position. This operation produces
\left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 0 & \tfrac{1}{2} & 0 \\ 0 & 1 & 2 & \tfrac{1}{2} & 0 & 0 \\ 0 & -3 & -2 & 0 & -\tfrac{3}{2} & 1 \end{array} \right].
We now complete our work on the second column by adding −2 times row 2 to row 1 and
3 times row 2 to row 3. The result is
\left[ \begin{array}{ccc|ccc} 1 & 0 & -3 & -1 & \tfrac{1}{2} & 0 \\ 0 & 1 & 2 & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 4 & \tfrac{3}{2} & -\tfrac{3}{2} & 1 \end{array} \right].
Only the third column remains to be changed. In order to place a 1 in the 3,3 position,
we multiply row 3 by ¼; this operation yields
\left[ \begin{array}{ccc|ccc} 1 & 0 & -3 & -1 & \tfrac{1}{2} & 0 \\ 0 & 1 & 2 & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & \tfrac{3}{8} & -\tfrac{3}{8} & \tfrac{1}{4} \end{array} \right].
Adding appropriate multiples of row 3 to rows 1 and 2 completes the process and gives
\left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & \tfrac{1}{8} & -\tfrac{5}{8} & \tfrac{3}{4} \\ 0 & 1 & 0 & -\tfrac{1}{4} & \tfrac{3}{4} & -\tfrac{1}{2} \\ 0 & 0 & 1 & \tfrac{3}{8} & -\tfrac{3}{8} & \tfrac{1}{4} \end{array} \right].
Thus A is invertible, and the inverse of the matrix A is
A^{-1} = \begin{bmatrix} \tfrac{1}{8} & -\tfrac{5}{8} & \tfrac{3}{4} \\ -\tfrac{1}{4} & \tfrac{3}{4} & -\tfrac{1}{2} \\ \tfrac{3}{8} & -\tfrac{3}{8} & \tfrac{1}{4} \end{bmatrix}.
Illustrative Example - 5. Determine whether the following matrix is invertible or not:
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & -1 \\ 1 & 5 & 4 \end{bmatrix}.
Solution. Using a strategy similar to the one used in the above example,
we attempt to use elementary row operations to transform
(A | I) = \left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 2 & 1 & -1 & 0 & 1 & 0 \\ 1 & 5 & 4 & 0 & 0 & 1 \end{array} \right]
into a matrix of the form (I | B). We first add −2 times row 1 to row 2 and −1 times row
1 to row 3. We then add row 2 to row 3. The result,
\left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 2 & 1 & -1 & 0 & 1 & 0 \\ 1 & 5 & 4 & 0 & 0 & 1 \end{array} \right]
⇒ \left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & -3 & -3 & -2 & 1 & 0 \\ 0 & 3 & 3 & -1 & 0 & 1 \end{array} \right]
⇒ \left[ \begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & -3 & -3 & -2 & 1 & 0 \\ 0 & 0 & 0 & -3 & 1 & 1 \end{array} \right],
is a matrix with a row whose first 3 entries are zeros. Therefore A is not invertible.
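The row reduction used in Illustrative Examples 4 and 5 can be automated. Below is a minimal Gauss–Jordan sketch, assuming NumPy is available; the function name invert_by_row_reduction is our own and is not taken from any library.

```python
import numpy as np

def invert_by_row_reduction(A):
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])   # the augmented matrix (A | I)
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))        # choose a pivot row (Type-1)
        if np.isclose(M[p, j], 0.0):
            return None                            # no nonzero pivot: A is not invertible
        M[[j, p]] = M[[p, j]]
        M[j] /= M[j, j]                            # Type-2: scale the pivot row to 1
        for i in range(n):                         # Type-3: clear the rest of column j
            if i != j:
                M[i] -= M[i, j] * M[j]
    return M[:, n:]

A4 = np.array([[0, 2, 4], [2, 4, 2], [3, 3, 1]])   # Example-4
A5 = np.array([[1, 2, 1], [2, 1, -1], [1, 5, 4]])  # Example-5
print(invert_by_row_reduction(A4))   # approx [[0.125, -0.625, 0.75], [-0.25, 0.75, -0.5], [0.375, -0.375, 0.25]]
print(invert_by_row_reduction(A5))   # None, since A5 is not invertible
```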

3. 3. Systems of Linear Equations.



Definition 3. 3. 1. The system of linear equations


a11x1 + a12x2 + . . . . . . . . + a1nxn = b1
a21x1 + a22x2 + . . . .. . . . + a2nxn = b2
:
am1x1 + am2x2 + . . . . . . . + amnxn = bm
where aij and bi (1 ≤ i ≤ m and 1 ≤ j ≤ n ) are scalars in a field F and x1 , x2 , x3, . .. . ,xn are
n variables taking values in F, is called a system of m linear equations in n unknowns
over F.

                   


    
Definition 3. 3. 2. The m × n matrix
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
is called the coefficient matrix of the system.

Note.
1. If we let x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} and b = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}, then the system may be rewritten as a
single matrix equation Ax = b.
2. Consider a system of linear equations as a single matrix equation Ax = b. A solution to the system is an n-tuple
s = \begin{bmatrix} s_1 \\ \vdots \\ s_n \end{bmatrix} ∈ F^n
such that As = b. The set of all solutions to the system of equations is called the
solution set of the system. A system of equations is called consistent if its solution
set is nonempty; otherwise it is called inconsistent.
Illustrative Example - 1. Solve the system of equations x₁ + x₂ = 3 and x₁ − x₂ = 1.
Solution. By use of familiar techniques, we can solve the preceding system and conclude
that there is only one solution: x₁ = 2, x₂ = 1; that is, s = \begin{bmatrix} 2 \\ 1 \end{bmatrix}. In matrix form, the
system can be written as
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \text{so} \quad A = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \text{ and } b = \begin{bmatrix} 3 \\ 1 \end{bmatrix}.

Illustrative Example -2. Solve the system of equations 2x₁ + 3x₂ + x₃ = 1 and
x₁ − x₂ + 2x₃ = 6.
Solution. In matrix form, the system can be written as
\begin{bmatrix} 2 & 3 & 1 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 6 \end{bmatrix}.
This system has many solutions, such as s = \begin{bmatrix} -6 \\ 2 \\ 7 \end{bmatrix} and s = \begin{bmatrix} 8 \\ -4 \\ -3 \end{bmatrix}.

Illustrative Example -3. Consider the system of equations x₁ + x₂ = 0 and x₁ + x₂ = 1.
Solution. In matrix form, the system can be written as
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix};
it is evident that this system has no solutions. Thus we see that a system of linear equations
can have one, many, or no solutions.

Definition 3. 3. 3. A system Ax = b of m linear equations in n unknowns is said to be
homogeneous if b = 0. Otherwise the system is said to be non-homogeneous.
Note.
1. Let Ax = 0 be a homogeneous system of m linear equations in n unknowns over a
field F. Let K denote the set of all solutions to Ax = 0. Then K = N(T_A), the null space of T_A;
hence K is a subspace of Fⁿ of dimension n − rank (T_A) = n − rank (A).
2. If m < n, the system Ax = 0 has a nonzero solution.
Based on above note we illustrate the following examples.
Example - 4.
(i) Consider the system of equations x₁ + 2x₂ + x₃ = 0 and x₁ − x₂ − x₃ = 0.
Let A = \begin{bmatrix} 1 & 2 & 1 \\ 1 & -1 & -1 \end{bmatrix} be the coefficient matrix of this system.
It is clear that rank (A) = 2. If K is the solution set of this system,
then dim (K) = 3 − 2 = 1.
Thus any nonzero solution constitutes a basis for K.
For example, since \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} is a solution to the given system,
\left\{ \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} \right\} is a basis for K.
Thus any vector in K is of the form t \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix}, where t ∈ R.
(ii) Consider the system x₁ − 2x₂ + x₃ = 0 of one equation in three unknowns.
If A = (1 −2 1) is the coefficient matrix, then rank (A) = 1.
Hence if K is the solution set, then dim (K) = 3 − 1 = 2.
Note that \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} and \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} are linearly independent vectors in K.
Thus they constitute a basis for K, so that
K = \left\{ t_1 \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} + t_2 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} : t_1, t_2 ∈ R \right\}.
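The solution spaces in Example-4 can be confirmed with a computer algebra routine. A small sketch for part (i), assuming SymPy is available:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 1],
               [1, -1, -1]])

print(A.rank())        # 2, so dim(K) = 3 - 2 = 1
print(A.nullspace())   # [Matrix([[1/3], [-2/3], [1]])], a scalar multiple of (1, -2, 3)
```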

Theorem 3. 3. 1. Let K be the solution set of a system of linear equations Ax = b, and let K_H
be the solution set of the corresponding homogeneous system of equations Ax = 0. Then
for any solution s to Ax = b, K = {s} + K_H = {s + k : k ∈ K_H}.
Proof. Let s be any solution to Ax = b. We must show that K = {s} + K_H. If w ∈ K,
then Aw = b. Hence A(w − s) = Aw − As = b − b = 0. So (w − s) ∈ K_H. Thus there
exists k ∈ K_H such that (w − s) = k. It follows that w = (s + k) ∈ {s} + K_H, and therefore
K is a subset of {s} + K_H.
Conversely, suppose that w ∈ {s} + K_H; then w = (s + k) for some k ∈ K_H.
But then, Aw = A(s + k) = As + Ak = b + 0 = b; so w ∈ K. Therefore {s} + K_H is a subset
of K. Thus, K = {s} + K_H.

Note. Let Ax = b be a system of n linear equations in n unknowns. If A is invertible,
then the system has exactly one solution, namely, A⁻¹b. Conversely, if the system has
exactly one solution, then A is invertible.
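Theorem 3.3.1 can be illustrated numerically on the system of Illustrative Example-2 (2x₁ + 3x₂ + x₃ = 1, x₁ − x₂ + 2x₃ = 6): the difference of any two solutions lies in K_H. A minimal sketch, assuming NumPy:

```python
import numpy as np

A = np.array([[2.0, 3.0, 1.0],
              [1.0, -1.0, 2.0]])
b = np.array([1.0, 6.0])

s = np.array([-6.0, 2.0, 7.0])            # a particular solution from Example-2
k = np.array([8.0, -4.0, -3.0]) - s       # difference of two solutions

print(np.allclose(A @ s, b))              # True
print(np.allclose(A @ k, np.zeros(2)))    # True: k lies in K_H
```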


3. 4. Summary
This unit is devoted to two related objectives:
1. The study of certain “rank-preserving” operations on matrices;
2. The application of these operations and the theory of linear transformations to the
solution of systems of linear equations.
As a consequence of Objective 1, we obtain a simple method for computing the rank
of a linear transformation between finite-dimensional vector spaces by applying these
rank-preserving matrix operations to a matrix that represents that transformation.
Solving a system of linear equations is probably the most important application of
linear algebra. The familiar method of elimination for solving systems of linear equation,
involves the elimination of variables so that a simpler system can be obtained. The
technique by which the variables are eliminated utilizes three types of operations:
(i) Interchanging any two equations in the system;
(ii) Multiplying any equation in the system by a nonzero constant;
(iii) Adding a multiple of one equation to another.
In System of Linear Equations, we express a single matrix equation. In this
representation of the system, the three operations above are the “elementary row
operations” for matrices. These operations provide a convenient computational method
for determining all solutions to a system of linear equations. There are two types of
elementary matrix operations - row operations and column operations. As we will see,
the row operations are more useful. They arise from the three operations that can be used
to eliminate variables in a system of linear equations. In the above unit, we define the
rank of a matrix. We then use elementary operations to compute the rank of a matrix and
a linear transformation. We have remarked that an n x n matrix is invertible if and only if
its rank is n. Since we know how to compute the rank of any matrix, we can always test a
matrix to determine whether it is invertible. The section concludes with a procedure for
computing the inverse of an invertible matrix.

3. 5. Keywords
Augmented matrix Consistent system of linear equations
Coefficient matrix of a system of linear equations Elementary column operation


Elementary matrix Matrix


Elementary operation Rank of a matrix
Elementary row operation System of linear equations
Inverse of matrix Type- 1, -2 and -3 operations

UNIT-4

4. 0. Introduction
An exposition of the theory of determinants independent of their relation to the
solvability of linear equations was first given by Vandermonde in his “Memoir on
elimination theory” of 1772. (The word “determinant” was used for the first time by
Gauss, in 1801, to stand for the discriminant of a quadratic form, where the discriminant
of the form ax2 + bxy + cy2 is b2 − 4ac.) Laplace extended some of Vandermonde’s work
in his Researches on the Integral Calculus and the System of the World (1772), showing
how to expand n × n determinants by cofactors.
The determinant of a square matrix is a value computed from the elements of the
matrix by certain, equivalent rules. The determinant provides important information when
the matrix consists of the coefficients of a system of linear equations, and when it
describes a linear transformation: in the first case the system has a unique solution if and
only if the determinant is nonzero, in the second case that same condition means that the
transformation has an inverse operation. So this unit deals with some important properties
of determinants, cofactor expansions and Cramer’s rule.

4. 1. Properties of Determinants.

Definition 4. 1. 1. A permutation of a set {x₁, x₂, …, x_n} is simply an arrangement of the
elements of the set into a particular sequence or order. Although the number of
interchanges used to carry a permutation j₁, j₂, …, j_n into the natural ordering 1, 2, …, n is
not always the same, its parity is: the number of interchanges used is either always odd or
always even. Hence the sign (−1)^t of each term in the definition below is well
defined.


Definition 4. 1. 2. The determinant of the square matrix A = [a_{ij}] over F is the scalar
det(A) defined by
det(A) = \sum_{(j)} (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{nj_n},
where \sum_{(j)} denotes the sum of all terms of the form (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{nj_n} as
j₁, j₂, …, j_n assumes all possible permutations of the numbers 1, 2, …, n, and the exponent t
is the number of interchanges used to carry j₁, j₂, …, j_n into the natural ordering 1, 2, …, n.
The notations det(A) and |A| are used interchangeably. When the
elements of A = [a_{ij}]_n are written as a rectangular array, we would have
det(A) = |A| = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix},
and det(A) is uniquely determined by A.

We observe that there are n! terms in the sum det(A) since there are n! possible
ordering of 1, 2, …, n. The determinant of n × n matrix is referred to as n × n determinant,
or a determinant of order n.

Note.
1. Determinants of order 1 and 2.
a11 a12
That is ⎜ a11 ⎟ = a11 and = a11 a22 – a12 a21 .
a21 a22

2. Determinants of order 3.
Consider a 3×3 matrix A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
By the definition,
det(A) = (−1)^{t_1} a_{11}a_{22}a_{33} + (−1)^{t_2} a_{11}a_{23}a_{32} + (−1)^{t_3} a_{12}a_{23}a_{31}
+ (−1)^{t_4} a_{12}a_{21}a_{33} + (−1)^{t_5} a_{13}a_{22}a_{31} + (−1)^{t_6} a_{13}a_{21}a_{32}.
Since 1, 2, 3 is the natural ordering, we may take t₁ = 0. Since 1, 3, 2 can be carried
into 1, 2, 3 by the single interchange of 2 and 3, we may take t₂ = 1. The ordering
2, 3, 1 can be carried into 1, 2, 3 by an interchange of 2 and 1, followed by an
interchange of 2 and 3. Thus we may take t₃ = 2.
By the same method, we find t₄ = 1, t₅ = 1 and t₆ = 2.
Hence, det(A) = a_{11}a_{22}a_{33} − a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} − a_{12}a_{21}a_{33} − a_{13}a_{22}a_{31} + a_{13}a_{21}a_{32}.

Examples - 1. Let
A = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{pmatrix} \quad \text{and} \quad C = \begin{pmatrix} 1 & 2 & -1 \\ -2 & 0 & 7 \\ 3 & 0 & 7 \end{pmatrix}.
Then, by Note-2, we have det (A) = 21, det (B) = 81 and det (C) = 70.
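These values can be verified numerically. A small check assuming NumPy is available:

```python
import numpy as np

A = np.array([[2, 1, 1], [0, 5, -2], [1, -3, 4]])
B = np.array([[3, 2, 1], [-4, 5, -1], [2, -3, 4]])
C = np.array([[1, 2, -1], [-2, 0, 7], [3, 0, 7]])

print(round(np.linalg.det(A)),   # 21
      round(np.linalg.det(B)),   # 81
      round(np.linalg.det(C)))   # 70
```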

Theorem 4. 1. 1. For any matrix A = [a_{ij}],
det(A) = \sum_{(i)} (−1)^{s} a_{i_1 1} a_{i_2 2} \cdots a_{i_n n},
where \sum_{(i)} denotes the sum over all possible permutations i₁, i₂, …, i_n of 1, 2, …, n, and s is the
number of interchanges used to carry i₁, i₂, …, i_n into the natural ordering.

Proof. Let S = \sum_{(i)} (−1)^{s} a_{i_1 1} a_{i_2 2} \cdots a_{i_n n}. Now both S and det(A) have n! terms. Except
possibly for sign, each term of S is a term of det(A), and each term of det(A) is a term of
S. Thus, S and det(A) consist of the same terms, with a possible difference in sign.
Consider a certain term (−1)^{s} a_{i_1 1} a_{i_2 2} \cdots a_{i_n n} of S, and let (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{nj_n} be the corresponding
term in det(A). Then (−1)^{s} a_{i_1 1} a_{i_2 2} \cdots a_{i_n n} can be carried into (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{nj_n} by s
interchanges of factors, since the permutation i₁, i₂, …, i_n can be changed into the natural
ordering 1, 2, …, n by s interchanges of elements. This means that the natural ordering
1, 2, …, n can be changed into the permutation j₁, j₂, …, j_n by s interchanges, since the
column subscripts have been interchanged each time the factors were interchanged. But
j₁, j₂, …, j_n can be carried into 1, 2, …, n by t interchanges, by the definition of det(A).
Thus 1, 2, …, n can be carried into j₁, j₂, …, j_n and then back into itself by (s+t)
interchanges. Since 1, 2, …, n can be carried into itself only by an even number (zero) of
interchanges, (s+t) is even, because the number of interchanges used to carry a
permutation j₁, j₂, …, j_n of {1, 2, …, n} into the natural ordering is either always odd or always
even. Therefore (−1)^{s+t} = 1 and (−1)^{s} = (−1)^{t}. Now the corresponding terms in
det (A) and S have the same sign, and therefore det(A) = S.


Theorem 4. 1. 2. If A = [a_{ij}]_n, then det (A^T) = det (A).
Proof. Let B = A^T, so that b_{ij} = a_{ji} for all pairs i, j. Thus
det(B) = \sum_{(j)} (−1)^{t} b_{1j_1} b_{2j_2} \cdots b_{nj_n}
by the definition of det(B), and hence
det(B) = \sum_{(j)} (−1)^{t} a_{j_1 1} a_{j_2 2} \cdots a_{j_n n}.
Therefore det (B) = det (A), by Theorem 4.1.1.

Example - 2. Let A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \\ 3 & 1 & 2 \end{pmatrix} and A^T = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 1 \\ 3 & 3 & 2 \end{pmatrix} be two matrices.
Then det (A^T) = det (A) = 6.

Theorem 4. 1. 3. If B results from matrix A by interchanging two rows (columns) of A,


then det (B) = –det (A).
Proof. Suppose that B arises from A by interchanging rows r and s of A, say r < s. Then
we have brj=asj, bsj=arj and bij=aij for i≠r, i≠s. Now

det(B) = \sum_{(j)} (−1)^{t} b_{1j_1} b_{2j_2} \cdots b_{rj_r} \cdots b_{sj_s} \cdots b_{nj_n}
= \sum_{(j)} (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{sj_r} \cdots a_{rj_s} \cdots a_{nj_n}
= \sum_{(j)} (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{rj_s} \cdots a_{sj_r} \cdots a_{nj_n}.
The permutation j1, j2 … js… jr… jn results from the permutation j1, j2 … jr… js… jn by an
interchange of two numbers; the number of inversions in the former differs by an odd
number from the number of inversions in the latter. This means that the sign of each term
in det (B) is the negative of the sign of the corresponding term in det (A).
Hence det (B) = –det (A).
Now suppose that B is obtained from A by interchanging two columns of A. Then BT is
obtained from AT by interchanging two rows of AT. So det (BT) = –det (AT), but
det (BT) = det (B) and det (AT) = det (A). Hence det (B) = – det (A).


⎛ 2 −1⎞ ⎛3 2⎞
Example -3. Let A = ⎜ ⎟ and B = ⎜ ⎟ . Then det (B) = – det (A) = –7.
⎝3 2⎠ ⎝ 2 −1⎠

Theorem 4. 1. 4. If a matrix A = [aij]n is upper (lower) triangular, then


det (A) = a11 a22 . . . .ann.
Proof. Let A=[aij] be upper triangular (that is aij=0 for i>j). Then a term
a1 j1 , a2 j2 ...., anjn in the expression for det (A) can be nonzero only for 1≤ j1, 2≤ j2, . . .,n ≤

jn. Now, j1, j2, …, jn must be a permutation, or rearrangement, of {1, 2, …,n}.


Hence, we must have jn=n, jn–1= n–1,…., j2=2, j1=1. Thus the only term of det (A) that
can be nonzero is the product of the elements on the main diagonal of A. Since the
permutation 1, 2,….,n has no inversions, the sign associated with it is +. Therefore, det
(A) =a11 a22 . . . .ann.
Similarly, we can prove the lower triangular case.

4. 2. Cofactor Expansions

Definition 4. 2. 1. Let A=[aij]n be an n × n matrix. Let Mij be the (n–1) × (n–1)


submatrix of A obtained by deleting the ith row and jth column of A. The det(Mij) is
called the Minor of aij. The cofactor Aij of aij is defined as Aij = (–1)i+j det(Mij).

Example - 1. Let A = \begin{pmatrix} 3 & -1 & 2 \\ 4 & 5 & 6 \\ 7 & 1 & 2 \end{pmatrix}. Then
det(M_{12}) = \begin{vmatrix} 4 & 6 \\ 7 & 2 \end{vmatrix} = 8 − 42 = −34,
det(M_{23}) = \begin{vmatrix} 3 & -1 \\ 7 & 1 \end{vmatrix} = 3 + 7 = 10, and
det(M_{31}) = \begin{vmatrix} -1 & 2 \\ 5 & 6 \end{vmatrix} = −6 − 10 = −16.
Also we have, A_{12} = (−1)^{1+2} det (M_{12}) = (−1)(−34) = 34,
A_{23} = (−1)^{2+3} det (M_{23}) = (−1)(10) = −10, and
A_{31} = (−1)^{3+1} det (M_{31}) = (1)(−16) = −16.


The expression given in the following theorem is referred to as an “expansion by cofactors,” or
more precisely as the “expansion about the ith row.” This expansion is the main
result of this section.

Theorem 4. 2. 2. If A=[aij]n, then det (A) = ai1 Ai1 + ai2 Ai2+ . . . . + ain Ain.
Proof. For a fixed integer i, we collect all of the terms in the sum
det(A) = \sum_{(j)} (−1)^{t} a_{1j_1} a_{2j_2} \cdots a_{nj_n}
that contain a_{i1} as a factor in one group, all of the terms that contain a_{i2} as a factor in
another group, and so on for each column number. This separates the terms in det (A)
into n groups with no overlapping, since each term contains exactly one factor from row i.
In each of the terms containing a_{i1}, we factor out a_{i1} and let F_{i1} denote the remaining
factor. Repeating this process for each of a_{i1}, a_{i2}, . . ., a_{in} in turn, we obtain
det(A) = a_{i1} F_{i1} + a_{i2} F_{i2} + . . . + a_{in} F_{in}. To finish the proof, we need only
show that F_{ij} = A_{ij} = (−1)^{i+j} det(M_{ij}), where det(M_{ij}) is the minor of a_{ij}.
Consider first the case where i = 1 and j = 1. We shall show that a_{11} F_{11} = a_{11} det(M_{11}).
Each term in F_{11} was obtained by factoring a_{11} from a term (−1)^{t_1} a_{11} a_{2j_2} \cdots a_{nj_n}
in the expansion of det (A). Thus each term of F_{11} has the form
(−1)^{t_2} a_{2j_2} a_{3j_3} \cdots a_{nj_n}, where t₂ is the number of interchanges used to carry j₂, …, j_n into
2, 3, …, n. Letting j₂, …, j_n range over all permutations of 2, 3, …, n, we see that each of
F_{11} and det(M_{11}) has (n−1)! terms. Now 1, j₂, …, j_n can be carried into the natural ordering by the
same interchanges used to carry j₂, …, j_n into 2, 3, …, n. That is, we may take t₁ = t₂. This
means that F_{11} and det(M_{11}) have exactly the same terms, yielding F_{11} = det(M_{11}) and
a_{11} F_{11} = a_{11} det(M_{11}).
Consider now an arbitrary a_{ij}. By (i−1) interchanges of the original
row i with the adjacent row above, and then (j−1) interchanges of column j with the
adjacent column on the left, we obtain a matrix B that has a_{ij} in the first row, first column
position. Since the order of the remaining rows and columns of A was not changed, the
minor of a_{ij} in B is the same det(M_{ij}) as it is in A. Since each interchange of two rows
(columns) changes the sign of the determinant,
det (B) = (−1)^{(i−1)+(j−1)} det(A) = (−1)^{i+j} det (A). This gives det (A) = (−1)^{i+j} det (B). The
sum of all the terms in det (B) that contain a_{ij} as a factor is a_{ij} det(M_{ij}), from our first case.
Since det(A) = (−1)^{i+j} det (B), the sum of all the terms in det (A) that contain a_{ij} as a
factor is (−1)^{i+j} a_{ij} det(M_{ij}). Thus a_{ij} F_{ij} = (−1)^{i+j} a_{ij} det(M_{ij}) = a_{ij} A_{ij}, and the theorem is proved.

The dual statement for above theorem as follows.


Theorem 4. 2. 3. If A=[aij]n, then det(A) = a1j A1j + a2j A2j+ . . . . + anj Anj.

Note. If A = [a_{ij}]₃, the expansion of det(A) about the 2nd row is given by
|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{21} A_{21} + a_{22} A_{22} + a_{23} A_{23}
= −a_{21}(a_{12}a_{33} − a_{13}a_{32}) + a_{22}(a_{11}a_{33} − a_{13}a_{31}) − a_{23}(a_{11}a_{32} − a_{12}a_{31}).
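A cofactor expansion can be translated directly into a recursive routine. Below is a deliberately naive O(n!) sketch that expands about the first row, assuming NumPy; it only mirrors Theorem 4.2.2 and is not meant for practical use.

```python
import numpy as np

def det_by_cofactors(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0
    for j in range(n):
        M_1j = np.delete(np.delete(A, 0, axis=0), j, axis=1)    # submatrix for the minor of a_{1j}
        total += (-1) ** j * A[0, j] * det_by_cofactors(M_1j)   # (-1)^(1+j) in 1-based indexing
    return total

A = np.array([[3, -1, 2], [4, 5, 6], [7, 1, 2]])
print(det_by_cofactors(A), round(np.linalg.det(A)))   # -84  -84
```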

4. 3. Elementary Operations and Cramer's Rule

For elementary operations, which are based on the properties of determinants
and cofactor expansion, we have the following results:
1. If two rows (columns) of A are equal, then det (A) = 0
2. If a row(column) of A consists entirely of zeros, then det (A)=0
3. If B is obtained from A by multiplying a row (column) of A by a real number k,
then det (B) = k det (A).
4. If B = [bij]n is obtained from A = [aij]n by adding to each element of the rth
row(column) of A a constant k times the corresponding element of the sth
row(column) r≠s of A, then det (B) = det (A).
5. The determinant of a product of two matrices is the product of their determinants.
That is, det (AB) = det (A) det (B).
6. If A is non-singular, then det(A) ≠ 0 and det(A^{-1}) = \dfrac{1}{\det(A)}.


Definition 4. 3. 1. Let A = [a_{ij}] be an n×n matrix. The n×n matrix adj(A), called the
adjoint of A, is the matrix whose i, jth element is the cofactor A_{ji} of a_{ji}. Thus
adj(A) = \begin{pmatrix} A_{11} & \cdots & A_{n1} \\ \vdots & & \vdots \\ A_{1n} & \cdots & A_{nn} \end{pmatrix}.
Note.
1. The adjoint of A is formed by taking the transpose of the matrix of cofactors of
the elements of A.
2. It should be noted that the term adjoint has other meaning in linear algebra in
addition to its use in the above definitions.
3. If A = [aij] be an n×n matrix. Then
(i) A adj(A) = (adj(A)) A = det(A) I_n, where I_n is the identity matrix.
(ii) A^{-1} = \dfrac{1}{\det(A)}\, adj(A), provided det(A) ≠ 0.
Our final results of this unit make an important connection between
determinants and the solutions of certain types of systems of linear equations. This
theorem presents a formula for the unknowns in terms of certain determinants. This
formula is commonly known as Cramer’s rule.

Theorem 4. 3. 1. Consider a system of linear equations AX = B, in which A = [a_{ij}]_{n×n},
X = [x₁, x₂, …, x_n]^T, and B = [b₁, b₂, …, b_n]^T. If det(A) ≠ 0, the unique solution of the system
is given by
x_j = \dfrac{\sum_{k=1}^{n} b_k A_{kj}}{\det(A)}, \qquad j = 1, 2, 3, …, n.
Proof. We first show that the given values are a solution of the system. Substitution of
these values for x_j into the left member of the ith equation of the system yields
a_{i1} x_1 + \cdots + a_{in} x_n = \dfrac{1}{\det(A)} \left( a_{i1} \left( \sum_{k=1}^{n} b_k A_{k1} \right) + \cdots + a_{in} \left( \sum_{k=1}^{n} b_k A_{kn} \right) \right)
= \dfrac{1}{\det(A)} \sum_{j=1}^{n} \left( \sum_{k=1}^{n} a_{ij} b_k A_{kj} \right)
= \dfrac{1}{\det(A)} \sum_{k=1}^{n} \left( b_k \sum_{j=1}^{n} a_{ij} A_{kj} \right)
= \dfrac{1}{\det(A)} \sum_{k=1}^{n} \left( b_k (δ_{ik} \det(A)) \right), \quad \text{where } δ_{ik} \text{ is the Kronecker delta,}
= \dfrac{1}{\det(A)}\, b_i \det(A) = b_i.
Thus, the values x_j = \dfrac{\sum_{k=1}^{n} b_k A_{kj}}{\det(A)} furnish a solution of the system.
To prove the uniqueness, suppose that x_j = y_j, j = 1, 2, 3, …, n, represents any
solution to the system. Then the ith equation \sum_{k=1}^{n} a_{ik} y_k = b_i is satisfied for i = 1, 2, …, n.
If we multiply both members of the ith equation by A_{ij} (j fixed) and form the sum of these
equations, we find that
\sum_{i=1}^{n} \left( \sum_{k=1}^{n} a_{ik} A_{ij} y_k \right) = \sum_{i=1}^{n} b_i A_{ij}, \quad \text{that is,} \quad \sum_{k=1}^{n} \left( \sum_{i=1}^{n} a_{ik} A_{ij} \right) y_k = \sum_{i=1}^{n} b_i A_{ij}.
But, for each k, \sum_{i=1}^{n} a_{ik} A_{ij} = δ_{kj} \det(A). Thus \sum_{k=1}^{n} δ_{kj} \det(A)\, y_k = \sum_{i=1}^{n} b_i A_{ij}, and
y_j = \dfrac{\sum_{i=1}^{n} b_i A_{ij}}{\det(A)}.
Hence these y_j's are the same as the solution given in the statement of the theorem.

Note. The sum \sum_{k=1}^{n} b_k A_{kj} is the determinant of the matrix obtained by replacing the jth
column of A by the column of constants B = [b₁, b₂, …, b_n]^T.

Example -1. For a system in three unknowns with det(A) ≠ 0,
a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = b_1
a_{21} x_1 + a_{22} x_2 + a_{23} x_3 = b_2
a_{31} x_1 + a_{32} x_2 + a_{33} x_3 = b_3,
the solution stated in the above theorem can be written as
x_1 = \dfrac{\begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix}}{\det(A)}, \quad x_2 = \dfrac{\begin{vmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{vmatrix}}{\det(A)}, \quad x_3 = \dfrac{\begin{vmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{vmatrix}}{\det(A)}.
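Cramer's rule is straightforward to implement. A minimal sketch, assuming NumPy, applied to the 3×3 system that reappears as Exercise 12 at the end of this block:

```python
import numpy as np

def cramer(A, b):
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b                 # replace the j-th column of A by the constants
        x[j] = np.linalg.det(Aj) / d
    return x

A = np.array([[-2.0,  3.0, -1.0],
              [ 1.0,  2.0, -1.0],
              [-2.0, -1.0,  1.0]])
b = np.array([1.0, 4.0, -3.0])
print(cramer(A, b))                  # [2., 3., 4.]
```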

4. 4. Summary
In this unit, the fundamentals of the theory of determinants and their properties are
explored. A knowledge of this unit is necessary in the study of eigenvalues and
eigenvectors of linear transformations. For this purpose we give some important theorems
along with their proofs. Cofactor expansion gives a method for evaluating the determinant of an n × n matrix
which reduces the problem to the evaluation of determinants of matrices of order (n−1);
we can then repeat the process for these (n−1) × (n−1) matrices until we get to 2 × 2
matrices. We also record results on elementary operations, which are based on the properties of determinants
and cofactor expansion, and the important connection between determinants and the
solutions of certain types of systems of linear equations: we present Cramer's rule for
finding the unknowns in terms of certain determinants.

4. 5. Keywords
Determinant of a matrix
Adjoint matrix
Elementary operation
Cofactor
Minor
Cramer’s rule
n – linear function
Determinant

Exercises

1. Let S be the set of all elements of the form (x + 2y, y, –x + 3y) in R3, where x, y ∈
R. Show that S is a subspace of R3.

(Hint: Let u, v ∈S. Then, u = (x1 + 2y1, y1, –x1 +3y1), v = (x2 + 2y2, y2, –x2 +3y2)
where x1, y1, x2, y2 ∈ R. Then show au + bv = (α + 2β , β , −α + 3β ) ∈S,

where α = ax1+bx2 and β = ay1+by2 ).

2. Express the polynomial f(x) = x2+ 4x –3 in the vector space V of all polynomials
over R as a linear combination of the polynomials g(x) = x2 – 2x + 5,
h(x) = 2x2 –3x and φ(x) = x +3.
Answer: f(x) = –3g(x) +2h(x) +4 φ(x).


3. Express v = (2, –5, 3) in R3 as a linear combination of the vector:


v1 = (1, –3, 2), v2 = (2, –4, –1), v3 = (1, –5, 7).
Answer.
\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 3/2 \end{bmatrix}

This is an inconsistent system of equations and so has no solution. Hence, v


cannot be written as a linear combination of v1, v2 and v3.

4.  Show that the mapping J: R2 →R3 given by J(a, b) = (a+b, a–b, b) for all (a,b) є
R2, is a linear transformation. Find the range, rank, kernel and nullity of J.
Answers. rank(J) = dim(Im(J)) = dim(range(J)) = 2, nullity(J) = dim R² − rank(J) =
2 − 2 = 0 and Kernel(J) = Null Space = {(0, 0)}, where Im(J) = span{J(1, 0), J(0, 1)} =
span{(1, 1, 0), (1, −1, 1)}.

5. Let V = P1(t) = { a + bt: a, b є R} be the vector space of real polynomials of


degree at most one. Find the basis {b1, b2} of V that is dual to the basis {x1, x2} of
V* defined by
x_1(f(t)) = \int_0^{1} f(t)\,dt \quad \text{and} \quad x_2(f(t)) = \int_0^{2} f(t)\,dt

Answer. b_1 = 2 − 2t, \quad b_2 = −\tfrac{1}{2} + t.

7. Write out the complete expression for det (A) if A = [aij]4.

8. Determine whether t is even or odd in the given term of det(A), where A = [aij]n.
(i) (−1)^t a_{13} a_{21} a_{34} a_{42}
(ii) (−1)^t a_{14} a_{21} a_{33} a_{42}
(iii) (−1)^t a_{14} a_{23} a_{32} a_{41}
(iv) (−1)^t a_{12} a_{24} a_{31} a_{43}

9. Prove or disprove each of the given statements with suitable examples


(i) ⎜A+ B⎟ = ⎜A⎟ + ⎜B⎟

(ii) ⎜A+ AT ⎟ = ⎜A⎟ + ⎜ AT ⎟.

10. Let A = \begin{pmatrix} 3 & -2 & 1 \\ 5 & 6 & 2 \\ 1 & 0 & -3 \end{pmatrix}. Then compute adj(A) and A⁻¹.

Answer. adj(A) = \begin{pmatrix} -18 & -6 & -10 \\ 17 & -10 & -1 \\ -6 & -2 & 28 \end{pmatrix} and A^{-1} = \begin{pmatrix} \tfrac{18}{94} & \tfrac{6}{94} & \tfrac{10}{94} \\ -\tfrac{17}{94} & \tfrac{10}{94} & \tfrac{1}{94} \\ \tfrac{6}{94} & \tfrac{2}{94} & -\tfrac{28}{94} \end{pmatrix}.

11. If A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, find the following minors and cofactors:

(i) M23 and A23
(ii) M13 and A13

Answer. (i) M23 = – 6 and A23= 6, (ii) M13 = – 3 and A13 = – 3

12. Solve the system of linear equation by using the Cramer rule:

–2 x1 + 3 x2 – x3 = 1
x 1 + 2 x2 – x 3 = 4
–2x1 – x2 + x3 = –3.
Answer. det (A)= – 2, and x1 =2, x2=3, x3= 4.

References
1. Jimmie Gilbert and Linda Gilbert – Linear Algebra and Matrix Theory, Academic
Press, An imprint of Elsevier 2010.
2. I. N. Herstein – Topics in Algebra, Vikas Publishing House, New Delhi, 2002.
3. S. Kumaresan – Linear Algebra: A Geometric Approach, Prentice Hall India,
New Delhi, 2000.
4. Michael Artin– Algebra, Prentice Hall India, New Delhi, 2007.


BLOCK –II
Diagonalization and Inner product space

Objectives
After studying this unit you will be able to:
1. Understand the basic concepts of each unit.
2. Study the importance of Diagonalization and Inner product space.
• Explain the defining properties of above.
• Give examples of each concept.
• Descriptions of some important theorems along with their proof.
• Some illustrative examples.

UNIT-1

1. 0. Introduction
Eigenvectors are a special set of vectors associated with a linear system of
equations ( that is a matrix equation) that are sometimes also known as characteristic
vectors, proper vectors, or latent vectors. The determination of the eigenvectors and
eigenvalues of a system is extremely important in physics and engineering, where it is
equivalent to matrix diagonalization and arises in such common applications as stability
analysis, the physics of rotating bodies, and small oscillations of vibrating systems, to
name only a few. In 1858, A. Cayley proved the important Cayley-Hamilton theorem
that a square matrix satisfies its characteristic polynomial.

1. 1. Eigen values and Eigen vectors


Definition 1. 1. 1. Let T be a linear operator on a vector space V. A nonzero vector v∈
V is called an eigenvector of T if there exists a scalar λ such that T(v) = λv . The scalar λ
is called the eigenvalue corresponding to the eigenvector v.

Note.


1. Let A be in Mnxn(F). A nonzero vector v∈ Fn is called an eigenvector of A if v is


an eigenvector of TA; that is, if Av = λv for some scalar λ. The scalar λ is called
the eigenvalue of A corresponding to the eigenvector v.
2. A vector is an eigenvector of a matrix A if and only if it is an eigenvector of TA.
Likewise, a scalar λ is an eigenvalue of A if and only if it is an eigenvalue of TA.

To obtain a basis of eigenvectors for a matrix (or a linear operator), we need to be


able to determine its eigenvalues and eigenvectors. The following theorem gives us a
method for finding eigenvalues.
Theorem 1. 1. 1. Let A ∈ M_{n×n}(F) be a matrix. Then a scalar λ is an eigenvalue of A if
and only if det (A − λI_n) = 0.
Proof. A scalar λ is an eigenvalue of A if and only if there exists a nonzero vector v ∈ Fⁿ
such that Av = λv, that is, (A − λI_n)(v) = 0. This holds if and only if the operator
T_{A−λI_n} is not one-to-one. Recall that for a linear transformation T : U → V between
finite-dimensional vector spaces of equal dimension, the following are equivalent:
(i) T is one-to-one;
(ii) T is onto;
(iii) rank (T) = dim (V).
Hence a nonzero v with (A − λI_n)v = 0 exists if and only if (A − λI_n) is not invertible. However, this result is equivalent to
the statement that det (A − λI_n) = 0.

Definition 1. 1. 2. Let A ∈ M_{n×n}(F) be a matrix. The polynomial f(t) = det (A − tI_n) is


called the characteristic polynomial of A.

Let T be a linear operator on a n-dimensional vector space V with ordered basis


B. We define the characteristic polynomial f(t) of T to be the characteristic polynomial

of A = [T]B, which is independent of the choice of ordered basis B. Thus if T is a linear

operator on a finite-dimensional vector space V and B is an ordered basis for V, then λ is

an eigenvalue of T if and only if λ is an eigenvalue of [T]B. The characteristic polynomial


f(t) is also denoted as ∆A(t).

Theorem 1. 1. 1, states that the eigenvalues of a matrix are the zeros of its
characteristic polynomial. When determining the eigenvalues of a matrix or a linear
operator, we normally compute its characteristic polynomial, as in the next example.

Illustrative Example -1. Find the eigenvalues of A = \begin{bmatrix} 1 & 1 \\ 4 & 1 \end{bmatrix} ∈ M_{2×2}(R).
Solution. The characteristic polynomial of A is
det(A − tI_2) = \det \begin{bmatrix} 1−t & 1 \\ 4 & 1−t \end{bmatrix} = (1−t)^2 − 4 = t^2 − 2t − 3 = (t − 3)(t + 1).
It follows from Theorem 1.1.1 that the only eigenvalues of A are 3 and −1.

Illustrative Example -2. Find the eigenvalues of the matrix A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix}.
Solution. The characteristic polynomial of A is
det (A − tI_3) = \det \begin{bmatrix} 1−t & 1 & 0 \\ 0 & 2−t & 2 \\ 0 & 0 & 3−t \end{bmatrix} = (1 − t)(2 − t)(3 − t) = −(t − 1)(t − 2)(t − 3).
Hence λ is an eigenvalue of A if and only if λ = 1, 2 or 3.
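Both examples can be checked numerically. A small sketch, assuming NumPy (the order in which the eigenvalues are returned may vary):

```python
import numpy as np

A1 = np.array([[1.0, 1.0],
               [4.0, 1.0]])
A2 = np.array([[1.0, 1.0, 0.0],
               [0.0, 2.0, 2.0],
               [0.0, 0.0, 3.0]])

print(np.linalg.eigvals(A1))   # eigenvalues 3 and -1
print(np.linalg.eigvals(A2))   # eigenvalues 1, 2 and 3
```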

Note. Let T be a linear operator on a vector space V. Then a vector v ∈ V is an
eigenvector of T corresponding to the eigenvalue λ if and only if v ≠ 0 and v ∈ N(T − λI),
the null space of T − λI.

Illustrative Example -3. Show that a square matrix A has 0 as eigenvalues if and only if
A is not invertible.
Solution. Let A be an n × n matrix over F. First, let 0 be an eigenvalue of matrix A and
let the non-zero vector X ∈ Fⁿ be a corresponding eigenvector. Then AX = 0X, that is, AX = 0.
If possible, let A be an invertible matrix. Then,
AX = 0 ⇒ A⁻¹(AX) = A⁻¹0 ⇒ (A⁻¹A)X = 0 ⇒ IX = 0 ⇒ X = 0.
But X ≠ 0. So, we arrive at a contradiction. Hence, A must be a non-invertible matrix.
Conversely, let A be a non-invertible matrix. Then the system of
equations AX = 0 has non-trivial solutions. So, there exists a non-zero vector X ∈ Fⁿ
such that AX = 0 ⇒ AX = 0X ⇒ 0 is an eigenvalue of A.

Illustrative Example -4. If λ is an eigenvalue of an invertible matrix A over a field F,


then show that λ – 1 is an eigenvalue of A– 1.
Solution. Since A is an invertible matrix with an eigenvalue λ, by the above example we
have λ ≠ 0, so λ⁻¹ ∈ F. Now, λ is an eigenvalue of the matrix A, so AX = λX for some
non-zero vector X. Since A is invertible, A⁻¹ exists, and A⁻¹(AX) = A⁻¹(λX)
⇒ (A⁻¹A)X = λ(A⁻¹X) ⇒ IX = λ(A⁻¹X) ⇒ X = λ(A⁻¹X) ⇒ λ⁻¹X = A⁻¹X.
Therefore, λ⁻¹ is an eigenvalue of A⁻¹.

1. 2. Diagonalizability

Definition 1. 2. 1. A linear operator T on a finite-dimensional vector space V is called


diagonalizable if there is an ordered basis B for V such that [T]B is a diagonal matrix. A
square matrix A is called diagonalizable if TA is diagonalizable.
In other words, A square matrix A over F is said to be diagonalizable if there
exists an invertible matrix C such that C–1AC is a diagonal matrix. Clearly, A is
diagonalizable if and only if A is similar to a diagonal matrix.

We want to determine when a linear operator T on a finite-dimensional vector
space V is diagonalizable and, if so, how to obtain an ordered basis B = {v₁, v₂, ……, v_n}
for V such that [T]_B is a diagonal matrix. Note that, if D = [T]_B is a diagonal matrix, then
for each vector v_j ∈ B we have
T(v_j) = \sum_{i=1}^{n} D_{ij} v_i = D_{jj} v_j = λ_j v_j, \quad \text{where } λ_j = D_{jj}.
Conversely, if B = {v₁, v₂, ……, v_n} is an ordered basis for V such that T(v_j) = λ_j v_j
for some scalars λ₁, λ₂, ………, λ_n, then clearly
[T]_B = \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix},
and each vector v in the basis B satisfies the condition that T(v) = λv for some scalar λ.

Theorem 1. 2. 1. Let T be a linear operator on a vector space V, and let λ1, λ2, . . . . , λk
be distinct eigenvalues of T. If v1, v2, . . . . , vk are eigenvectors of T such that λi
corresponds to vi (1 ≤ i ≤ k), then { v1, v2, . . . . , vk } is linearly independent.


Proof. The proof is by mathematical induction on k. Suppose that k = 1. Then v₁ ≠ 0
since v₁ is an eigenvector, and hence {v₁} is linearly independent.
Now assume that the theorem holds for (k – 1) - distinct eigenvalues, where (k – 1) ≥ 1,
and that we have k eigenvectors v1, v2, . . . . , vk corresponding to the distinct eigenvalues
λ1, λ2, . . . . , λk. We wish to show that {v1, v2, . . . , vk } is linearly independent.
Suppose that a1, a2, ……,ak are scalars such that
a1v1 + a2v2 + ……….. + akvk = 0 → (1)
Applying T – λkI to both sides of (1), we obtain
a1 (λ1 – λk)v1 + a2 (λ2 – λk)v2 +. . . . . . . . . . + ak – 1 (λk– 1 – λk) vk – 1 = 0.
By the induction hypothesis {v1, v2, . . . . , vk – 1} is linearly independent, and hence
a1 (λ1 – λk) = a2 (λ2 – λk) =. . . . . . . . . . = ak – 1 (λk– 1 – λk) = 0.
Since λ1, λ2, . . ,λk are distinct eigenvalues, it follows that (λi – λk) ≠ 0 for 1 ≤ i ≤ k – 1.
So, a1 = a2 = . . . . . . = ak – 1 = 0, and (1) therefore reduces to ak vk = 0. But vk ≠ 0 and
therefore ak = 0. Consequently a1 = a2 = . . . . = ak = 0, and it follows that {v1, v2, . . ., vk }
is linearly independent.

Theorem 1. 2. 2. Let T be a linear operator on an n-dimensional vector space V. If T


has n distinct eigenvalues, then T is diagonalizable.
Proof. Suppose that T has n distinct eigenvalues λ1, λ2, . . . . , λn. For each i choose an
eigenvector vi corresponding to λi. By theorem 1. 2. 1, { v1, v2, . . . . , vn } is linearly
independent, and since dim(V) = n, this set is a basis for V. Thus, T is diagonalizable.

Remark. A linear operator T on a finite-dimensional vector space V is diagonalizable if
and only if there exists an ordered basis B for V consisting of eigenvectors of T. So, to
diagonalize a matrix or a linear operator is to find a basis of eigenvectors and the
corresponding eigenvalues.

Illustrative Example -1. Let A = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}, v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, v_2 = \begin{bmatrix} 3 \\ 4 \end{bmatrix}, and B = {v₁, v₂}. Find [T_A]_B.
Solution. Since
T_A(v_1) = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -2 \\ 2 \end{bmatrix} = -2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} = -2 v_1,
v₁ is an eigenvector of T_A, and hence of A. Here λ₁ = −2 is the eigenvalue corresponding
to v₁. Further,
T_A(v_2) = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 15 \\ 20 \end{bmatrix} = 5 \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 5 v_2,
and so v₂ is an eigenvector of T_A, and hence of A, with the corresponding eigenvalue λ₂ = 5.
Note that B = {v₁, v₂} is an ordered basis for R² consisting of eigenvectors of both A and T_A,
and therefore A and T_A are diagonalizable.
Therefore, by the above remark, we have [T_A]_B = \begin{bmatrix} -2 & 0 \\ 0 & 5 \end{bmatrix}.
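The diagonalization in Illustrative Example-1 can be verified numerically: placing the eigenvectors as the columns of a matrix C, the product C⁻¹AC is the diagonal matrix of eigenvalues. A minimal sketch, assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [4.0, 2.0]])
C = np.array([[ 1.0, 3.0],
              [-1.0, 4.0]])          # columns are v1 = (1, -1) and v2 = (3, 4)

D = np.linalg.inv(C) @ A @ C
print(np.round(D, 10))               # [[-2., 0.], [0., 5.]]
```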
Example -2. Let A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} ∈ M_{2×2}(R).
The characteristic polynomial of A (and hence of T_A) is
det(A − tI_2) = \det \begin{bmatrix} 1−t & 1 \\ 1 & 1−t \end{bmatrix} = t(t − 2),
and thus the eigenvalues of T_A are 0 and 2. Since T_A is a linear operator on the two-
dimensional vector space R², we conclude from the preceding Theorem 1.2.2 that T_A
(and hence A) is diagonalizable.

Definition 1. 2. 2. A polynomial f(t) in P(F) splits over F if there are scalars c, a1, a2,
……,an (not necessarily distinct) in F such that f(t) = c(t – a1) (t – a2) . . . . . . . (t – an).

Example - 3. Consider the polynomial f(t) = t2 – 1 = (t + 1) (t – 1) splits over R, but


(t2 + 1) ( t – 2) does not split over R because (t2 + 1) cannot be factored into a product of
linear factors. However (t2 + 1)(t – 2) does split over C because it factors into the product
(t + i)(t – i)(t – 2). Note that the characteristic polynomial of a linear operator or a matrix
over a field F need not split over F; the next theorem shows that it does split whenever the
operator is diagonalizable.

Theorem 1. 2. 3. The characteristic polynomial of any diagonalizable linear operator


splits.
Proof. Let T be a diagonalizable linear operator on the n-dimensional vector space V,
and let B be an ordered basis for V such that [T]B = D is a diagonal matrix. Suppose that
D = diagonal(λ1, λ2, . . . . , λn),


and let f(t) be the characteristic polynomial of T. Then
f(t) = det(D – tI) = det( diagonal(λ1 – t, λ2 – t, . . . . , λn – t) )
= (λ1 – t) (λ2 – t) . . . . . . . . (λn – t)
= (–1)n (t – λ1)( t – λ2). . . . . . . . . (t – λn).

Note. Let λ be an eigenvalue of a linear operator or matrix with characteristic


polynomial f(t). The multiplicity of λ is the largest positive integer k for which (t–λ)k is a
factor of f(t).
Example - 4. Consider the matrix
      ⎡ 3  1  0 ⎤
A  =  ⎢ 0  3  4 ⎥ ,
      ⎣ 0  0  4 ⎦
which has characteristic polynomial
f(t)= – (t –3)2(t –4). Hence λ = 3 is an eigenvalue of A with multiplicity 2, and λ = 4 is
an eigenvalue of A with multiplicity 1.

Note. For diagonalizing matrices we have the following facts.
1. Let A be an n × n matrix over a field F. If A has n distinct eigenvalues, then the
corresponding eigenvectors of A are linearly independent.
2. Let A be an n × n matrix over F. If A has n distinct eigenvalues λ1, λ2, . . . . , λn, then
there exists an invertible matrix C such that C–1AC = diagonal(λ1, λ2, . . . , λn).
3. Let A be an n × n matrix over a field F. If the characteristic polynomial f(t) of the
matrix A is a product of n distinct linear factors, say (–1)n (t – λ1)(t – λ2) . . . . . . (t – λn),
then A is similar to the diagonal matrix D = diagonal(λ1, λ2, . . . . , λn).

1. 3. Invariant Subspaces and the Cayley – Hamilton Theorem

If v is an eigenvector of a linear operator T, then T maps the span of {v} into


itself. Subspaces that are mapped into themselves are of great importance in the study of
linear operators.

Definition 1. 3. 1. Let T be a linear operator on a vector space V. A subspace W of V is


called a T-invariant subspace of V if T(W) ⊆ W. That is, if T(v) ∈ W for all v ∈ W.


Examples - 1. Suppose that T is a linear operator on a vector space V. Then {0}, V ,


the range of T, the null space of T, and Eλ, for any eigenvalue λ of T, are T-invariant subspaces of V.

Example - 2. Let T be the linear operator on R3 defined by T(a, b, c) = (a + b, b + c, 0).


Then the xy – plane = {(x, y, 0): x, y ∈ R} and the x-axis = {(x, 0, 0): x ∈ R} are
T - invariant subspaces of R3.

Definition 1. 3. 2. Let T be a linear operator on a vector space V, and let x be a nonzero


vector in V, the subspace W = span ({x, T(x), T2(x), . . . . . . }) is called the T-cyclic
subspace of V generated by x.
It is a simple matter to show that W is T-invariant. In fact, W is the “smallest” T-
invariant subspace of V containing x. That is, any T - invariant subspace of V containing
x must also contain W. Cyclic subspaces have various uses. We apply them in this unit
to establish the Cayley - Hamilton theorem.

Example -3. Let T be the linear operator on R3 defined by T(a, b, c) = (–b + c, a + c, 3c)
We determine the T - cyclic subspace generated by e1 = (1, 0, 0).
Since T(e1) = T (1, 0, 0) = (0, 1, 0) = e2 and T2(e1) = T(T(e1)) = T(e2) = (–1, 0, 0) = – e1,
it follows that span ({e1, T(e1), T2(e1), . . . . }) = span ({e1, e2}) = {(s, t, 0): s, t ∈ R}.
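The T-cyclic subspace of Example 3 can also be checked with a small numerical sketch (Python with NumPy assumed; the matrix M below is simply the standard matrix of T and is not part of the text).

    import numpy as np

    M = np.array([[0.0, -1.0, 1.0],
                  [1.0,  0.0, 1.0],
                  [0.0,  0.0, 3.0]])        # standard matrix of T(a, b, c) = (-b + c, a + c, 3c)
    e1 = np.array([1.0, 0.0, 0.0])

    iterates = np.column_stack([e1, M @ e1, M @ M @ e1])   # e1, T(e1), T^2(e1) as columns
    print(iterates.T)                        # rows: e1, e2, -e1
    print(np.linalg.matrix_rank(iterates))   # 2, so the cyclic subspace generated by e1 is the xy-plane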

Theorem 1. 3. 1 [Cayley - Hamilton Theorem]. Let T be a linear operator on a finite -


dimensional vector space V, and let f(t) be the characteristic polynomial of T. Then
f(T) = T0, the zero transformation. That is, T “satisfies” its characteristic equation.
Proof. We show that f(T)(v) = 0 for all v ∈ V. This is obvious if v = 0 because f(T) is
linear; so suppose that v ≠ 0. Let W be the T - cyclic subspace generated by v, and
suppose that dim(W) = k. Let T be a linear operator on a finite - dimensional vector space
V, and let W denote the T - cyclic subspace of V generated by a nonzero vector v ∈ V.
Let k = dim (W). Then {v, T(v) , T2(v),…….Tk – 1(v)} is a basis for W. There exist scalars
a0, a1, ……, ak – 1 such that a0v + a1T(v) + ……….. + ak – 1 Tk – 1(v) + Tk (v) = 0. This
implies that g(t) = (–1)k (a0 + a1t + ……….. + ak – 1 tk – 1 + tk ) is the characteristic
polynomial of TW. Combining these two equations yields


g(T)(v) = (–1)k (a0I + a1T+ ……….. + ak – 1 Tk – 1 + Tk )(v) = 0.


Since W is T-invariant, the characteristic polynomial g(t) of TW divides the characteristic polynomial f(t) of T; hence there exists a polynomial q(t) such that f(t) = q(t)g(t).
So f(T)(v) = q(T)g(T)(v) = q(T)(g(T)(v)) = q(T)(0) = 0.

Example - 4. Let T be the linear operator on R2 defined by T(a, b) = (a + 2b, –2a + b),
and let B = {e1, e2}. Then
            ⎡  1   2 ⎤
A = [T]B =  ⎣ –2   1 ⎦ .
The characteristic polynomial of T is, therefore,
                        ⎡ 1 – t    2   ⎤
f(t) = det(A – tI) = det⎣  –2    1 – t ⎦  = t2 – 2t + 5.
It is easily verified that T0 = f(T) = T2 – 2T + 5I.
Similarly, f(A) = A2 – 2A + 5I
   =  ⎡ –3   4 ⎤  +  ⎡ –2  –4 ⎤  +  ⎡ 5  0 ⎤  =  ⎡ 0  0 ⎤
      ⎣ –4  –3 ⎦     ⎣  4  –2 ⎦     ⎣ 0  5 ⎦     ⎣ 0  0 ⎦ .

Corollary. Let A be an n × n matrix, and let f(t) be the characteristic polynomial of A.


Then f(A) = 0, the n × n zero matrix.
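As a quick numerical illustration of the corollary (a sketch only, assuming Python with NumPy), one can substitute the matrix A of Example 4 into its characteristic polynomial f(t) = t2 – 2t + 5 and observe that the result is the zero matrix.

    import numpy as np

    A = np.array([[ 1.0, 2.0],
                  [-2.0, 1.0]])
    I = np.eye(2)

    f_of_A = A @ A - 2 * A + 5 * I        # f(A) = A^2 - 2A + 5I
    print(f_of_A)                         # [[0. 0.] [0. 0.]], as the Cayley - Hamilton theorem predicts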

1. 4. Summary
In this unit we have discussed matrices and determinants with respect to the
vectors (called characteristic vectors or eigenvectors) and scalars λ (called
characteristic values or eigenvalues) such that Ax = λx. As a motivating example, a formula
for the reflection of R2 about the line y = 2x is obtained by finding a basis B1 for which
[T]B1 is a diagonal matrix. This unit is concerned with the so-called diagonalization problem. A
solution to the diagonalization problem leads naturally to the concepts of eigenvalue and
eigenvector. Aside from the important role that these concepts play in the
diagonalization problem, they also prove to be useful tools in the study of many non-
diagonalizable operators. An invariant subspace of a linear mapping T : V → V from
some vector space V to itself is a subspace W of V such that T(W) is contained in W. An
invariant subspace of T is also said to be T-invariant. With the help of invariant
subspaces we study the Cayley-Hamilton theorem, which states that every square matrix over a
commutative ring (including the real or complex field) satisfies its own characteristic
equation.

1. 5. Keywords

Cayley-Hamilton theorem, Characteristic polynomial of a linear operator, Characteristic
polynomial of a matrix, Diagonalizable linear operator, Diagonalizable matrix, Eigenspace
of a linear operator, Eigenspace of a matrix, Eigenvalue of a linear operator, Eigenvalue of
a matrix, Eigenvector of a matrix, Invariant subspace, Splits, Transition matrix.

UNIT-2

2. 0. Introduction
Inner product space is a vector space or function space in which an operation for
combining two vectors or functions (whose result is called an inner product) is defined
and has certain properties. In this unit we also study the Gram-Schmidt process,
which takes a finite, linearly independent set S = {v1, …, vk} for k ≤ n and generates
an orthogonal set S′ = {u1, …, uk} that spans the same k-dimensional subspace of
Rn as S. Finally, in this unit we study the orthogonal complement W⊥ of a subspace W
of an inner product space V, which is the set of all vectors in V that are orthogonal to every
vector in W.

2. 1. Inner product spaces and Norms


Definition 2. 1. 1. Let V be a vector space over F. An inner product on V is a function
that assigns, to every ordered pair of vectors x and y in V, a scalar in F, denoted 〈x, y〉,
such that for all x, y and z in V and all c in F, the following hold:
1. 〈x + z, y〉 = 〈x, y〉 + 〈z, y〉.

2. 〈cx, y〉 = c〈x, y〉.

3. 〈x, y〉 is the complex conjugate of 〈y, x〉 (the bar in what follows denotes complex conjugation).

4. 〈x, x〉 > 0 if x ≠ 0.

Note that (3) reduces to 〈x, y〉 = 〈y, x〉 if F = R. Condition (1) and (2) simply require that
the inner product be linear in the first component.
It is easily shown that if a1, a2, ……, an ∈ F and y, v1, v2, ……, vn ∈ V, then
〈 ∑ ai vi , y 〉 = ∑ ai 〈 vi , y 〉 (the sums running from i = 1 to n).
For example, for x = (a1, a2, ……, an) and y = (b1, b2, ……, bn) in Fn, define
〈x, y〉 = ∑ ai b̄i , where the bar denotes complex conjugation.
The verification that 〈 ⋅ , ⋅ 〉 satisfies conditions (1) through (4) is easy.
For example, if z = (c1, c2, ……, cn), we have for (1)
〈x + z, y〉 = ∑ (ai + ci) b̄i = ∑ ai b̄i + ∑ ci b̄i
= 〈x, y〉 + 〈z, y〉.
Thus, for x = (1 + i, 4) and y = (2 – 3i, 4 + 5i) in C2,
〈x, y〉 = ( 1 + i )(2+3i) + 4(4 – 5i) = 15 – 15i =15(1 – i).
The inner product in the above example is called the standard inner product on Fn.
Where F = R the conjugations are not needed, and in early courses this standard inner
product is usually called the dot product and is denoted by x . y instead of 〈x, y〉.
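The computation above can be reproduced with a short sketch (Python with NumPy assumed, not part of the text); note that the inner product used here is linear in the first slot, so the second argument is conjugated.

    import numpy as np

    x = np.array([1 + 1j, 4 + 0j])
    y = np.array([2 - 3j, 4 + 5j])

    inner = np.sum(x * np.conj(y))        # <x, y> = sum of x_i times the conjugate of y_i
    print(inner)                          # (15-15j), that is, 15(1 - i)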

Example - 1. If 〈x, y〉 is any inner product on a vector space V and r > 0, we may define
another inner product by the rule 〈x, y〉1 = r 〈x, y〉. If r ≤ 0, then (4) would not hold.

Example - 2. Let V = C([0, 1]), the vector space of real-valued continuous functions on
[0, 1]. For f, g ∈ V, define 〈f, g〉 = ∫ f(t)g(t) dt, the integral taken over [0, 1]. Since the
preceding integral is linear in f, (1) and (2) are immediate, and (3) is trivial. If f ≠ 0, then
f2 is bounded away from zero on some subinterval of [0, 1] and hence 〈f, f〉 = ∫ f(t)2 dt > 0.

Definition 2. 1. 2. Let A ∈ Mm × n(F). We define the conjugate transpose or adjoint of
A to be the n × m matrix A* such that (A*)ij is the complex conjugate of Aji, for all i, j.

Example - 3. Let
      ⎡ i    1 + 2i ⎤                  ⎡  –i        2   ⎤
A  =  ⎣ 2    3 + 4i ⎦ .   Then  A*  =  ⎣ 1 – 2i   3 – 4i ⎦ .

A vector space V over a field F endowed with a specific inner product is called an
inner product space. If F = C, we call V a complex inner product space, whereas if

F = R, we call V a real inner product space. It is clear that if V has an inner product
〈x, y〉 and W is a subspace of V, then W is also an inner product space when the same
function 〈x, y〉 is restricted to the vectors x, y ∈ W.

Note. Let V be an inner product space. Then for x, y, z ∈ V and c ∈ F, the following
statements are true.
(i) 〈x, y + z〉 = 〈x, y〉 + 〈x, z〉.
(ii) 〈x, cy〉 = c̄ 〈x, y〉, where c̄ denotes the complex conjugate of c.
(iii) 〈x, 0〉 = 〈0, x〉 = 0.
(iv) 〈x, x〉 = 0 if and only if x = 0.
(v) If 〈x, y〉 = 〈x, z〉 for all x ∈ V, then y = z.

Definition 2. 1. 3. Let V be an inner product space. For x ∈ V, we define the norm or
length of x by ‖x‖ = √〈x, x〉 .

Example - 4. Let V = Fn. If x = (a1, a2, ……, an), then
‖x‖ = ‖(a1, a2, ……, an)‖ = [ ∑ |ai|2 ]1/2 .

Theorem 2. 1. 1. Let V be an inner product space over F. Then for all x, y ∈ V and c ∈
F, the following statements are true.
(i) ‖cx‖ = |c| · ‖x‖.
(ii) ‖x‖ = 0 if and only if x = 0. In any case, ‖x‖ ≥ 0.
(iii) (Cauchy – Schwarz Inequality) |〈x, y〉| ≤ ‖x‖ · ‖y‖.
(iv) (Triangle Inequality) ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Proof. (i) and (ii) are obvious.
(iii) If y = 0, then the result is immediate. So assume that y ≠ 0. For any c ∈ F, we
have
0 ≤ ‖x – cy‖2 = 〈x – cy, x – cy〉 = 〈x, x – cy〉 – c〈y, x – cy〉
= 〈x, x〉 – c̄〈x, y〉 – c〈y, x〉 + c c̄〈y, y〉.
In particular, if we set c = 〈x, y〉 / 〈y, y〉 , the inequality becomes
0 ≤ 〈x, x〉 – |〈x, y〉|2 / 〈y, y〉 = ‖x‖2 – |〈x, y〉|2 / ‖y‖2 ,
from which the Cauchy – Schwarz Inequality follows.
(iv) We have
‖x + y‖2 = 〈x + y, x + y〉 = 〈x, x〉 + 〈y, x〉 + 〈x, y〉 + 〈y, y〉
= ‖x‖2 + 2R〈x, y〉 + ‖y‖2
≤ ‖x‖2 + 2|〈x, y〉| + ‖y‖2
≤ ‖x‖2 + 2 ‖x‖ · ‖y‖ + ‖y‖2
= ( ‖x‖ + ‖y‖ )2 ,
where R〈x, y〉 denotes the real part of the complex number 〈x, y〉.
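A small numerical check of parts (iii) and (iv) for the standard inner product on R3 (a sketch, assuming Python with NumPy; the particular vectors are chosen only for illustration):

    import numpy as np

    x = np.array([1.0, 2.0, -1.0])
    y = np.array([3.0, 0.0,  4.0])

    inner = np.dot(x, y)
    norm_x, norm_y = np.linalg.norm(x), np.linalg.norm(y)

    print(abs(inner) <= norm_x * norm_y)              # True: Cauchy - Schwarz Inequality
    print(np.linalg.norm(x + y) <= norm_x + norm_y)   # True: Triangle Inequality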

Definition 2. 1. 4. Let V be an inner product space. Vectors x and y in V are orthogonal


(perpendicular) if 〈x, y〉 = 0.
A subset S of V is orthogonal if any two distinct vectors in S are orthogonal. A
vector x in V is a unit vector if ‖x‖ = 1. Finally, a subset S of V is orthonormal if S is
orthogonal and consists entirely of unit vectors.

Note. If S = {v1, v2, …, vk }, then S is orthonormal if and only if 〈vi, vj〉 = δij, where δij
denotes the kronecker delta. Also, observe that multiplying vectors by nonzero scalars
does not affect their orthogonality and that if x is any nonzero vector, then (1/‖x‖) x is a
unit vector. The process of multiplying a nonzero vector by the reciprocal of its length is
called normalizing.

Example - 5. In F3, {(1, 1, 0), (1, –1, 1), (–1, 1, 2)} is an orthogonal set of nonzero
vectors, but it is not orthonormal; however, if we normalize the vectors in the set, we

obtain the orthonormal set { (1/√2)(1, 1, 0), (1/√3)(1, –1, 1), (1/√6)(–1, 1, 2) }.

Note. Let V be a vector space over F, where F is either R or C. Regardless of whether V


is or is not an inner product space, we may still define a norm ‖ ⋅ ‖ as a real - valued
function on V satisfying the following three conditions for all x, y ∈ V and a ∈ F:
1. ‖x‖ ≥ 0, and ‖x‖ = 0 if and only if x = 0.
2. ‖ax‖ = |a| · ‖x‖.


3. ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Illustrative Example - 6. Let u and v be two vectors in an inner product space V such that
‖u + v‖ = ‖u‖ + ‖v‖. Prove that u and v are linearly dependent vectors. Give an
example to show that the converse of this statement is not true.

Solution. We have ‖u + v‖ = ‖u‖ + ‖v‖
⇒ ‖u + v‖2 = ( ‖u‖ + ‖v‖ )2
⇒ 〈u + v, u + v〉 = ‖u‖2 + 2 ‖u‖ ‖v‖ + ‖v‖2
⇒ 〈u, u〉 + 2〈u, v〉 + 〈v, v〉 = ‖u‖2 + 2 ‖u‖ ‖v‖ + ‖v‖2
⇒ 〈u, v〉 = ‖u‖ ‖v‖
⇒ u and v are linearly dependent vectors (this is the case of equality in the Cauchy – Schwarz Inequality).
The converse is not true, because the vectors u = (–1, 0, 1) and v = (2, 0, –2) in R3 are
linearly dependent as v = –2u, but ‖u + v‖ ≠ ‖u‖ + ‖v‖.

Illustrative Example - 7. Let V be an inner product space and u, v ∈ V. Then prove that
(i) ‖u + v‖2 – ‖u – v‖2 = 4〈u, v〉 ,
(ii) ‖u + v‖2 + ‖u – v‖2 = 2‖u‖2 + 2‖v‖2 .

Solution. By using the definition of the norm of a vector, we have
‖u + v‖2 = 〈u + v, u + v〉
⇒ ‖u + v‖2 = 〈u, u + v〉 + 〈v, u + v〉
⇒ ‖u + v‖2 = 〈u, u〉 + 〈u, v〉 + 〈v, u〉 + 〈v, v〉
⇒ ‖u + v‖2 = ‖u‖2 + 2〈u, v〉 + ‖v‖2 , since 〈u, v〉 = 〈v, u〉
⇒ ‖u + v‖2 = ‖u‖2 + ‖v‖2 + 2〈u, v〉 → (1)
Similarly, ‖u – v‖2 = 〈u – v, u – v〉
⇒ ‖u – v‖2 = 〈u, u – v〉 – 〈v, u – v〉
⇒ ‖u – v‖2 = 〈u, u〉 – 〈u, v〉 – 〈v, u〉 + 〈v, v〉
⇒ ‖u – v‖2 = ‖u‖2 – 2〈u, v〉 + ‖v‖2
⇒ ‖u – v‖2 = ‖u‖2 + ‖v‖2 – 2〈u, v〉 → (2)
On subtracting (2) from (1), we get
‖u + v‖2 – ‖u – v‖2 = 4〈u, v〉 .
On adding (1) and (2), we get
‖u + v‖2 + ‖u – v‖2 = 2‖u‖2 + 2‖v‖2 .

2. 2. The Gram - Schmidt Orthogonalization Process

Let V be an inner product space. A subset of V is an orthonormal basis for V if it


is an ordered basis that is orthonormal.

Example -1. The standard ordered basis for Fn is an orthonormal basis for Fn.

Example - 2. The set { (1/√5)(1, 2), (1/√5)(2, –1) } is an orthonormal basis for R2.

Theorem 2. 2. 1. Let V be an inner product space and S = {v1, v2, ………, vk} be an
orthogonal subset of V consisting of nonzero vectors. If y ∈ span (S), then
y = ∑ ( 〈y, vi〉 / ‖vi‖2 ) vi , the sum running from i = 1 to k.

Proof. Write y = ∑ ai vi , where a1, a2, ……, ak ∈ F. Then, for 1 ≤ j ≤ k,
we have 〈y, vj〉 = 〈 ∑ ai vi , vj 〉 = ∑ ai 〈vi , vj〉 = aj 〈vj , vj〉 = aj ‖vj‖2 .
So aj = 〈y, vj〉 / ‖vj‖2 , and the result follows.

The next couple of results follow immediately from the above theorem.
Corollary - 1. If, in addition to the hypotheses of the above theorem, S is orthonormal and
y ∈ span (S), then y = ∑ 〈y, vi〉 vi .
If V possesses a finite orthonormal basis, then Corollary 1 allows us to compute
the coefficients in a linear combination very easily.

Corollary- 2. Let V be an inner product space, and let S be an orthogonal subset of V


consisting of nonzero vectors. Then S is linearly independent.
Proof. Suppose that v1, v2, ……, vk ∈ S and ∑ ai vi = 0. As in the proof of the above
theorem with y = 0, we have aj = 〈0, vj〉 / ‖vj‖2 = 0 for all j. So S is linearly
independent.


Example - 3. By Corollary 2, the orthonormal set
{ (1/√2)(1, 1, 0), (1/√3)(1, –1, 1), (1/√6)(–1, 1, 2) }
obtained earlier is an orthonormal basis for R3.
Let x = (2, 1, 3). The coefficients given by Corollary 1 to Theorem 2. 2. 1 that express x
as a linear combination of the basis vectors are
a1 = (1/√2)(2 + 1) = 3/√2 ,  a2 = (1/√3)(2 – 1 + 3) = 4/√3 ,  and  a3 = (1/√6)(–2 + 1 + 6) = 5/√6 .
As a check, we have
(2, 1, 3) = (3/2)(1, 1, 0) + (4/3)(1, –1, 1) + (5/6)(–1, 1, 2).

Theorem 2. 2. 2. Let V be an inner product space and S = {w1, w2, ……, wn} be a
linearly independent subset of V. Define S′ = {v1, v2, ……, vn}, where v1 = w1 and
vk = wk – ∑ ( 〈wk , vj〉 / ‖vj‖2 ) vj , the sum running over j = 1 to k – 1, for 2 ≤ k ≤ n. → (1)
Then S′ is an orthogonal set of nonzero vectors such that span (S′) = span (S).

Proof. The proof is by mathematical induction on n, the number of vectors in S. For
k = 1, 2, ….., n, let Sk = {w1, w2, ……, wk} and S′k = {v1, v2, ……, vk}. If n = 1, then the
theorem is proved by taking S′1 = S1 ; that is, v1 = w1 ≠ 0. Assume then that the set
S′k – 1 = {v1, v2, ……, vk – 1} with the desired properties has been constructed by the repeated
use of (1). We show that the set S′k = {v1, v2, ……, vk – 1, vk} also has the desired
properties, where vk is obtained from S′k – 1 by (1). If vk = 0, then (1) implies that
wk ∈ span (S′k – 1) = span (Sk – 1), which contradicts the assumption that Sk is linearly
independent. For 1 ≤ i ≤ k – 1, it follows from (1) that
〈vk , vi〉 = 〈wk , vi〉 – ∑ ( 〈wk , vj〉 / ‖vj‖2 ) 〈vj , vi〉 = 〈wk , vi〉 – ( 〈wk , vi〉 / ‖vi‖2 ) ‖vi‖2 = 0,
since 〈vj , vi〉 = 0 if i ≠ j by the induction assumption that S′k – 1 is orthogonal. Hence S′k
is an orthogonal set of nonzero vectors. Now, by (1), we have that span (S′k) ⊆ span (Sk).
But by Corollary 2 to Theorem 2. 2. 1, S′k is linearly independent; so
dim (span (S′k)) = dim (span (Sk)) = k. Therefore span (S′k) = span (Sk). The
construction of {v1, v2, ……, vn} by the use of Theorem 2. 2. 2 is called the Gram - Schmidt
Orthogonalization Process.

Illustrated Example - 4. Let w1 = (1, 0, 1, 0), w2 = (1, 1, 1, 1), and w3 = (0, 1, 2, 1) in
R4. Then {w1, w2, w3} is linearly independent. Use the Gram - Schmidt Orthogonalization
Process to compute the orthogonal vectors v1, v2 and v3, and then normalize these
vectors to obtain an orthonormal set.
Solution. Take v1 = w1 = (1, 0, 1, 0).
Then v2 = w2 – ( 〈w2 , v1〉 / ‖v1‖2 ) v1 = (1, 1, 1, 1) – (2/2)(1, 0, 1, 0) = (0, 1, 0, 1).
Finally, v3 = w3 – ( 〈w3 , v1〉 / ‖v1‖2 ) v1 – ( 〈w3 , v2〉 / ‖v2‖2 ) v2
= (0, 1, 2, 1) – (2/2)(1, 0, 1, 0) – (2/2)(0, 1, 0, 1) = (–1, 0, 1, 0).
These vectors can be normalized to obtain the orthonormal basis {u1, u2, u3},
where u1 = (1/√2)(1, 0, 1, 0), u2 = (1/√2)(0, 1, 0, 1) and u3 = (1/√2)(–1, 0, 1, 0).

2. 3. Orthogonal Complement

Definition 2. 3. 1. Let S be a non-empty subset of an inner product space V. We define
S⊥ (read “S perp”) to be the set of all vectors in V that are orthogonal to every vector in
S; that is, S⊥ = {x ∈ V: 〈x, y〉 = 0 for all y ∈ S}. The set S⊥ is called the orthogonal
complement of S. It is easily seen that S⊥ is a subspace of V for any subset S of V.

Theorem 2. 3. 1. Let W be a finite - dimensional subspace of an inner product space V,
and let y ∈ V. Then there exist unique vectors u ∈ W and z ∈ W⊥ such that y = u + z.
Furthermore, if {v1, v2, . . . , vk} is an orthonormal basis for W, then u = ∑ 〈y, vi〉 vi ,
the sum running from i = 1 to k.

Proof. Let {v1, v2, . . . , vk} be an orthonormal basis for W, let u be as defined in the
preceding equation, and let z = y – u. Clearly u ∈ W and y = u + z. To show that z ∈
W⊥, it suffices to show that z is orthogonal to each vj . For any j, we have
〈z, vj〉 = 〈 y – ∑ 〈y, vi〉 vi , vj 〉 = 〈y, vj〉 – ∑ 〈y, vi〉 〈vi , vj〉
= 〈y, vj〉 – 〈y, vj〉 = 0.


To show uniqueness of u and z, suppose that y = u + z = u1 + z1, where u1 ∈ W and


z1 ∈ W⊥. Then u – u1 = z1 – z ∈ W∩ W⊥ = {0}. Therefore u = u1 and z = z1.

In the notation of the above theorem, we have following corollary


Corollary. The vector u is the unique vector in W that is “closest” to y; that is, for any
x ∈ W, ‖y – x‖ ≥ ‖y – u‖, and this inequality is an equality if and only if x = u.
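The decomposition y = u + z and the “closest vector” property can be seen concretely in the following sketch (Python with NumPy assumed), using W = span({e1, e2}) in R3 with its standard orthonormal basis.

    import numpy as np

    v1 = np.array([1.0, 0.0, 0.0])               # orthonormal basis {v1, v2} of W = span({e1, e2})
    v2 = np.array([0.0, 1.0, 0.0])
    y  = np.array([2.0, -1.0, 5.0])

    u = np.dot(y, v1) * v1 + np.dot(y, v2) * v2  # u = sum <y, v_i> v_i, the vector in W closest to y
    z = y - u
    print(u, z)                                  # [ 2. -1.  0.]  [0. 0. 5.]
    print(np.dot(z, v1), np.dot(z, v2))          # 0.0 0.0, so z lies in the orthogonal complement of W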

Theorem 2. 3. 2. Suppose that S = {v1, v2, …, vk} is an orthonormal set in an n-


dimensional inner product space V. Then
(i) S can be extended to an orthonormal basis { v1, v2, ……,vk , vk + 1,……, vn } for V.
(ii) If W = span(S), then S1 = { vk + 1, vk + 2 , ……,vn } is an orthonormal basis for W⊥.
(iii) If W is any subspace of V, then dim (V) = dim (W) + dim (W⊥).

Proof. (i) Any finite generating set for V with dimension n contains at least n vectors,
and a generating set for vector space V that contains exactly n vectors is a basis for V, S
can be extended to an ordered basis S1 = {v1, v2, . . . . , vk , wk + 1 , . . . . . , wn} for V. Now
apply the Gram-Schmidt Orthogonalization Process to S1. The first k vectors resulting
from this process are the vectors in S, and this new set spans V. Normalizing the last
(n – k) vectors of this set produces an orthonormal set that spans V. The result follows.
(ii) Because S1 is a subset of a basis, it is linearly independent. Since S1 is clearly
a subset of W⊥, we need only show that it spans W⊥. Note that, for any x ∈ V, we have
x = ∑ 〈x, vi〉 vi , the sum running from i = 1 to n. If x ∈ W⊥, then 〈x, vi〉 = 0 for 1 ≤ i ≤ k.
Therefore, x = ∑ 〈x, vi〉 vi with the sum running from i = k + 1 to n, and hence x ∈ span (S1).
(iii) Let W be a subspace of a vector space V. It is a finite - dimensional inner
product space because V is, and so it has an orthonormal basis {v1, v2, ……,vk }. By (i)
and (ii), we have dim (V) = n = k + (n – k) = dim (W) + dim (W⊥).

Example -1. Let W = span ({e1 , e2}) in F3. Then x = (a, b, c) ∈ W⊥ if and only if
0 = 〈 x, e1 〉 = a and 0 = 〈 x, e2 〉 = b. So x = (0, 0, c), and therefore W⊥ = span ({e3}).
One can deduce the same result by noting that e3 ∈ W⊥ and from (iii), that dim (W⊥) =
3 – 2 = 1.


Illustrated Example - 2. Let C[–π, π] be the inner product space of all continuous
functions defined on [–π, π] with the inner product defined by
〈f, g〉 = ∫ f(t) g(t) dt, the integral taken over [–π, π].
Prove that sin t and cos t are orthogonal functions in C[–π, π].

Solution. We have 〈f, g〉 = ∫ f(t) g(t) dt (from –π to π), so
〈sin t, cos t〉 = ∫ sin t cos t dt
= (1/2) ∫ sin 2t dt
= (1/2) [ –(cos 2t)/2 ] evaluated from –π to π
= –(1/4) [ cos 2π – cos(–2π) ] = –(1/4)(1 – 1) = 0.
Thus, sin t and cos t are orthogonal functions in the inner product space C[–π, π].

Illustrated Example - 3. Let u = (−1, 4, −3) be a vector in the inner product space R3 with
the standard inner product. Find a basis of the subspace u⊥ of R3.
Solution. We have
u⊥ = { v ∈ R3 : 〈v, u〉 = 0 }, that is, u⊥ = { (x, y, z) ∈ R3 : –x + 4y – 3z = 0 }.
Thus, u⊥ consists of all vectors v = (x, y, z) such that –x + 4y − 3z = 0. In this equation
there are only two free variables. Taking y and z as free variable, we find that
y = 1, z = 1 ⇒ x = 1; y = 0, z = 1 ⇒ x = − 3
Thus, v1 = (1, 1, 1) and v2 = (−3, 0, 1) are two independent solutions of −x + 4y −3z = 0.
Hence, {v1 = (1, 1, 1), v2 = (−3, 0, 1)} form a basis for u⊥
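The same subspace can be obtained numerically as the null space of the 1 × 3 matrix whose row is u; the sketch below (Python with NumPy assumed) uses the singular value decomposition for this, and the basis it returns spans the same plane as {v1, v2} above, though the vectors themselves may differ.

    import numpy as np

    u = np.array([[-1.0, 4.0, -3.0]])
    _, _, Vt = np.linalg.svd(u)
    null_basis = Vt[1:]                 # the last two rows of Vt span the null space of u, i.e. u-perp

    print(null_basis @ u.T)             # both entries are (numerically) 0, so each basis vector is orthogonal to u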

Illustrated Example - 4. If {v1, v2, . . . . . . , vn} is an orthonormal set in an inner
product space V and v ∈ V, then prove that ∑ |〈v, vi〉|2 ≤ ‖v‖2 (Bessel’s
inequality). Further, show that equality holds if and only if v is in the subspace spanned
by {v1, v2, . . . . . . , vn}.
Solution. Consider the vector u given by u = v – ∑ 〈v, vi〉 vi . Then
‖u‖2 = 〈u, u〉 = 〈 v – ∑ 〈v, vi〉 vi , v – ∑ 〈v, vj〉 vj 〉
= 〈v, v〉 – ∑ |〈v, vj〉|2 – ∑ |〈v, vi〉|2 + ∑ ∑ 〈v, vi〉 〈vj , v〉 〈vi , vj〉
= ‖v‖2 – 2 ∑ |〈v, vi〉|2 + ∑ ∑ 〈v, vi〉 〈vj , v〉 δij
= ‖v‖2 – 2 ∑ |〈v, vi〉|2 + ∑ |〈v, vi〉|2
= ‖v‖2 – ∑ |〈v, vi〉|2 ≥ 0.
Hence ∑ |〈v, vi〉|2 ≤ ‖v‖2 . This proves the first part.
Now, ∑ |〈v, vi〉|2 = ‖v‖2
⇒ ‖u‖2 = 0
⇒ u = 0
⇒ v – ∑ 〈v, vi〉 vi = 0
⇒ v = ∑ 〈v, vi〉 vi
⇒ v is a linear combination of v1, v2, . . . . . . , vn
⇒ v ∈ the subspace spanned by the set {v1, v2, . . . . . . , vn}.
Conversely, let v be a linear combination of v1, v2, . . . . . . , vn .
Then v = ∑ 〈v, vi〉 vi (by Corollary 1 to Theorem 2. 2. 1)
⇒ u = 0 ⇒ ‖u‖ = 0 ⇒ ‖v‖2 = ∑ |〈v, vi〉|2 .
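Bessel’s inequality is easy to confirm numerically for a particular orthonormal set; the following sketch (Python with NumPy assumed) uses the orthonormal set {e1, e2} in R3, which is not a basis, so strict inequality is expected.

    import numpy as np

    v = np.array([2.0, 1.0, 3.0])
    orthonormal = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]

    lhs = sum(np.dot(v, e) ** 2 for e in orthonormal)   # sum of |<v, v_i>|^2
    print(lhs, "<=", np.dot(v, v))                      # 5.0 <= 14.0, and v is not in span({e1, e2})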

2. 4. Summary


Most applications of mathematics are involved with the concept of measurement


and hence of the magnitude or relative size of various quantities. So it is not surprising
that the fields of real and complex numbers, which have a built - in notion of distance,
should play a special role. We assume that all vector spaces are over the field F, where F
denotes either R or C. In this unit we shall study a special class of vector spaces which is
very rich in geometry. Consider VR = R3. For any a = (a1, a2, a3) ∈ V, the length is
|a| = √(a12 + a22 + a32). Further, given b = (b1, b2, b3) ∈ V, the dot product (or inner product)
a . b = a1b1 + a2b2 + a3b3 is well known. The angle θ between a and b is derived from the
equation cos θ = (a . b) / (|a| |b|). Here, we introduce the idea of distance or length into vector

spaces via a much richer structure, the so-called inner product space structure. This
added structure provides applications to geometry, physics, conditioning in systems of
linear equations, least squares and quadratic forms.
In mathematics, particularly linear algebra and numerical analysis, the Gram–
Schmidt process is a method for orthonormalising a set of vectors in an inner product
space, most commonly the Euclidean space Rn. The Gram–Schmidt process takes a
finite, linearly independent set S = {v1, …, vk} for k ≤ n and generates an orthogonal set S′
= {u1, …, uk} that spans the same k-dimensional subspace of Rn as S. Also, the
orthogonal complement W⊥ of a subspace W of an inner product space V is the set of all
vectors in V that are orthogonal to every vector in W.

2. 5. Keywords
Gram - Schmidt orthogonalization, Inner product, Inner product space, Norm of a matrix,
Norm of a vector, Normal operator, Normalizing a vector, Orthogonal complement,
Orthogonal matrix, Orthogonally equivalent matrices, Orthogonal operator, Orthogonal
vectors, Orthonormal.

UNIT-3
3. 0. Introduction

This unit investigates the space A(V) of linear operators T on an inner product
space V. Adjoints of operators generalize conjugate transposes of square matrices to
(possibly) infinite-dimensional situations. The adjoint of an operator A is also sometimes
called the Hermitian conjugate (after Charles Hermite) of A. So, most of the results on
unitary spaces are identical to the corresponding results on inner product space. In this
unit, we learn about the normal and self - adjoint operators, unitary and orthogonal
operators and their matrices, orthogonal projections and the spectral theorem, bilinear and
quadratic forms.

3. 1. The adjoint of a linear operator

Definition 3. 1. 1. Recall the conjugate transpose A* of a matrix A (Definition 2. 1. 2). For
a linear operator T on an inner product space V, we now define a related linear operator
on V called the adjoint of T, whose matrix representation with respect to any orthonormal
basis B for V is [T]*B .

Let V be an inner product space, and let y ∈ V. The function g: V → F defined by


g(x) = 〈 x, y 〉 is linear.

Theorem 3. 1. 1. Let V be a finite- dimensional inner product space over a field F, and
let g: V → F be a linear transformation. Then there exists a unique vector y ∈ V such
that g(x) = 〈 x, y 〉 for all x ∈ V.
Proof. Let B = {v1, v2, …… vn} be an orthonormal basis for V, and let
y = ∑ ci vi , where ci denotes the complex conjugate of g(vi). Define h: V → F by
h(x) = 〈 x, y 〉, which is linear.
Furthermore, for 1 ≤ j ≤ n, we have
h(vj) = 〈 vj , y 〉 = 〈 vj , ∑ ci vi 〉 = ∑ g(vi) 〈 vj , vi 〉 = g(vj), since the conjugate of ci is g(vi).
Since g and h agree on B, we have g = h, because two linear transformations
U, T: V → W that agree on a finite basis {v1, v2, . . . , vn} of V, that is, with
U(vi) = T(vi) for i = 1, 2, …, n, must be equal.
To show that y is unique, suppose that g(x) = 〈 x, y1 〉 for all x.


Then 〈 x, y 〉 = 〈 x, y1 〉 for all x, since if 〈 x, y 〉 = 〈 x, z 〉 for all x, y, z ∈ V , then y = z.


Hence, we have y = y1.

Example -1. Define g: R2 → R by g(a1 , a2) = 2a1 + a2; clearly g is a linear


transformation. Let B = {e1 , e2}, and let y = g(e1) e1 + g(e2) e2 = 2 e1 + e2 = (2, 1), as in
the proof of Theorem 3. 1. 1. Then g(a1, a2) = 〈 (a1, a2), (2, 1) 〉 = 2a1 + a2 .

Theorem 3. 1. 2. Let V be a finite - dimensional inner product space, and let T be a
linear operator on V. Then there exists a unique function T*: V → V such that
〈T(x), y〉 = 〈x, T*(y)〉 for all x, y ∈ V. Furthermore, T* is linear.
Proof. Let y ∈ V. Define g: V → F by g(x) = 〈T(x), y〉 for all x ∈ V.
First, we show that g is linear.
Let x1, x2 ∈ V and c ∈ F. Then g(cx1 + x2) = 〈T(cx1 + x2), y〉
= 〈cT(x1) + T(x2), y〉
= c〈T(x1), y〉 + 〈T(x2), y〉
= c g(x1) + g(x2).
Hence g is linear.
Now, we apply Theorem 3. 1. 1 to obtain a unique vector y1 ∈ V such that
g(x) = 〈x, y1〉; that is, 〈T(x), y〉 = 〈x, y1〉 for all x ∈ V.
Defining T*: V → V by T*(y) = y1, we have 〈T(x), y〉 = 〈x, T*(y)〉.
To show that T* is linear, we consider y1, y2 ∈ V and c ∈ F. Then for any x ∈ V, we have
〈x, T*(cy1 + y2)〉 = 〈T(x), cy1 + y2〉
= c̄ 〈T(x), y1〉 + 〈T(x), y2〉
= c̄ 〈x, T*(y1)〉 + 〈x, T*(y2)〉
= 〈x, cT*(y1) + T*(y2)〉.
Since x is arbitrary, we have T*(cy1 + y2) = c T*(y1) + T*(y2), because in an inner product
space, if 〈x, y〉 = 〈x, z〉 for all x ∈ V, then y = z.
Finally, we show that T* is unique.
Suppose that U: V → V is linear and satisfies 〈T(x), y〉 = 〈x, U(y)〉 for all
x, y ∈ V. Then 〈x, T*(y)〉 = 〈x, U(y)〉 for all x, y ∈ V, so T* = U. This completes the
proof.

Note. The linear operator T* described in the above theorem is called the adjoint of the
operator T. Thus T* is the unique operator on V satisfying 〈T(x), y〉 = 〈x, T*(y)〉 for
all x, y ∈ V.

Theorem 3. 1. 3. Let V be a finite - dimensional inner product space, and let B be an

orthonormal basis for V. If T is a linear operator on V, then [ T*]B = [T ]*B .

Proof. Let A = [T]B and B = [T*]B , where B = {v1, v2, . . . . , vn} is the given orthonormal basis.
Since V is an inner product space with orthonormal basis B and T is a linear operator on V
with A = [T]B , we have, for any i and j, Aij = 〈T(vj), vi〉. Hence
Bij = 〈T*(vj), vi〉 = 〈vi , T*(vj)〉‾ = 〈T(vi), vj〉‾ = Āji = (A*)ij ,
where the bar denotes complex conjugation.
Hence B = A*.

Note. Let A be an n × n matrix. Then TA* = (TA)*.


Example - 2. Let T be the linear transformation on C2 defined by
T(a1, a2) = (2ia1 + 3a2, a1 – a2). If B is the standard ordered basis for C2, then
          ⎡ 2i   3 ⎤                            ⎡ –2i   1 ⎤
[T]B  =   ⎣  1  –1 ⎦ .   So  [T*]B = [T]*B  =   ⎣   3  –1 ⎦ .
Hence T*(a1, a2) = (–2ia1 + a2, 3a1 – a2).
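Since the matrix of T* in an orthonormal basis is the conjugate transpose of the matrix of T, the defining property 〈T(x), y〉 = 〈x, T*(y)〉 can be checked numerically. The sketch below (Python with NumPy assumed; the test vectors are arbitrary) does this for the operator of Example 2.

    import numpy as np

    T = np.array([[2j,  3],
                  [ 1, -1]], dtype=complex)       # [T]_B for T(a1, a2) = (2i a1 + 3 a2, a1 - a2)
    T_star = T.conj().T                           # [T*]_B = ([T]_B)*

    def inner(x, y):                              # standard inner product <x, y> = sum x_i conj(y_i)
        return np.sum(x * np.conj(y))

    x = np.array([1 + 1j, 2 - 1j])
    y = np.array([-1 + 2j, 3 + 0j])
    print(np.isclose(inner(T @ x, y), inner(x, T_star @ y)))   # True: <T(x), y> = <x, T*(y)>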

Note.
1. Let V be an inner product space, and let T and U be linear operators on V. Then
(i) (T + U)* = T* + U*,
(ii) (cT)* = c̄ T* for any c ∈ F,
(iii) (TU)* = U* T*,
(iv) T** = T,
(v) I* = I.

2. Let A and B be n × n matrices. Then


(i) (A + B)* = A* + B*
(ii) (cA)* = c̄ A* for any c ∈ F;
(iii) (AB)* = B* A*;

(iv) A** = A;
(v) I* = I.

3. Let A ∈ Mm× n(F), x ∈ Fn, and y ∈ Fm. Then 〈Ax, y〉 = 〈x, A*y〉.


4. Let A ∈ M m× n(F). Then rank (A*A) = rank (A).
5. If A is an m× n matrix such that rank (A) = n, then A*A is invertible.

Illustrative Example-3. Find the adjoint of linear transformation T: R2 → R2 given by


T(x, y) = (x +2y, x−y) for all (x, y) ∈ R2 .
Solution. Clearly B = {e1, e2} is an orthonormal basis of R2 such that
T(e1) = T(1, 0) = (1, 1) = 1e1 + 1e2 ,
T(e2) = T(0, 1) = (2, –1) = 2e1 – 1e2 .
The matrix A that represents T relative to the standard basis B is
      ⎡ 1    2 ⎤
A  =  ⎣ 1   –1 ⎦ .
Since the entries of A are real, the adjoint T* is represented by the transpose of A:
            ⎡ 1    1 ⎤
A* = AT  =  ⎣ 2   –1 ⎦ .
Therefore, T* (x, y) = (x + y, 2x − y).

3 . 2. Normal and self - adjoint operators


Lemma 3. 2. 1. Let T be a linear operator on a finite - dimensional inner product space
V. If T has an eigen vector, then so does T*.
Proof. Suppose that v is an eigenvector of T with corresponding
eigenvalue λ. Then for any x ∈ V,
0 = 〈0, x〉 = 〈(T – λI)(v), x〉 = 〈v, (T – λI)*(x)〉 = 〈v, (T* – λ̄ I)(x)〉,
and hence v is orthogonal to the range of (T* – λ̄ I). So (T* – λ̄ I) is not onto and hence
is not one - to - one. Thus (T* – λ̄ I) has a nonzero null space, and any nonzero vector in
this null space is an eigenvector of T* with corresponding eigenvalue λ̄ .

Note.


1. A subspace W of a vector space V is said to be T-invariant if T(W) is contained in
W; in that case the restriction TW : W → W is defined by TW(x) = T(x).
2. A polynomial is said to split if it factors into linear polynomials.
3. Let T be a linear operator on a finite - dimensional inner product space V. Suppose
that the characteristic polynomial of T splits. Then there exists an orthonormal basis B for
V such that the matrix [T]B is upper triangular. This is known as Schur’s Theorem.

Definition 3. 2. 1. Let V be an inner product space, and let T be a linear operator on V.


We say that T is normal if TT* = T*T.
An n× n real or complex matrix A is normal if AA* = A*A.

Example - 3. Let T: R2 → R2 be rotation by θ, where 0 < θ < π. The matrix
representation of T in the standard ordered basis is given by
      ⎡ cos θ   –sin θ ⎤
A  =  ⎣ sin θ    cos θ ⎦ .
Note that AA* = A*A = I, so A, and hence T, is normal.

Example - 4. Suppose that A is a real skew - symmetric matrix, that is, AT= − A. Then A
is normal because both AAT and ATA are equal to −A2.

Theorem 3. 2. 2. Let V be an inner product space, and let T be a normal operator on V.


Then the following statements are true.
(i) ‖T(x)‖ = ‖T*(x)‖ for all x ∈ V.
(ii) (T – cI) is normal for every c ∈ F.
(iii) If x is an eigenvector of T, then x is also an eigenvector of T*. In fact, if
T(x) = λx, then T*(x) = λ̄ x.
(iv) If λ1 and λ2 are distinct eigen values of T with corresponding eigenvectors x1
and x2, then x1 and x2 are orthogonal.

Proof. (i) For any x ∈ V, we have
‖T(x)‖2 = 〈T(x), T(x)〉 = 〈T*T(x), x〉 = 〈TT*(x), x〉 = 〈T*(x), T*(x)〉 = ‖T*(x)‖2 .
(ii) The proof is obvious.
(iii) Suppose x is an eigenvector of T , that is T(x) = λx for some x ∈ V. Let
U = (T – λ I). Then U(x) = 0, and U is normal by (ii). Thus (i) implies that

0 = ‖U(x)‖ = ‖U*(x)‖ = ‖(T* – λ̄ I)(x)‖ = ‖T*(x) – λ̄ x‖.
Hence T*(x) = λ̄ x. So x is an eigenvector of T*.


(iv) Let λ1 and λ2 be distinct eigenvalues of T with corresponding eigenvectors x1
and x2. Then, using (iii), we have
λ1 〈x1 , x2〉 = 〈λ1 x1 , x2〉 = 〈T(x1), x2〉 = 〈x1 , T*(x2)〉 = 〈x1 , λ̄2 x2〉 = λ2 〈x1 , x2〉 .
Since λ1 ≠ λ2 , we conclude that 〈x1 , x2〉 = 0.

Note. Suppose that p(z) = anzn + an-1 zn – 1 + . . . . . . . .+ a1z + a0 is a polynomial in


Pn(C) of degree n ≥ 1. Then p(z) has a zero. This is known as Fundamental Theorem of
Algebra.

Theorem 3. 2. 3. Let T be a linear operator on a finite - dimensional complex inner


product space V. Then T is normal if and only if there exists an orthonormal basis for V
consisting of eigen vectors of T.
Proof. Suppose that T is normal. By the Fundamental Theorem of Algebra, the characteristic
polynomial of T splits. So we may apply Schur’s Theorem to obtain an orthonormal basis
B = {v1 , v2,. . . . , vn} for V such that [T]B = A is upper triangular. We know that v1 is an
eigenvector of T because A is upper triangular. Assume that v1 , v2,. . . . . , vk–1 are
eigenvectors of T. We claim that vk is also an eigenvector of T. It then follows by
mathematical induction on k that all of the vi’s are eigenvectors of T. Consider any j < k,
and let λj denote the eigenvalue of T corresponding to vj. By Theorem 3. 2. 2 (iii),
T*(vj) = λ̄j vj . Since A is upper triangular,
T(vk) = A1kv1 + A2kv2 + . . . . . . . . + Ajkvj + . . . . . . + Akkvk.
Recall that for a finite - dimensional inner product space V with an orthonormal basis
B = {v1 , v2, . . . . , vn}, if T is a linear operator on V and A = [T]B , then for any i and j,
Aij = 〈T(vj), vi〉. Hence
Ajk = 〈T(vk), vj〉 = 〈vk , T*(vj)〉 = 〈vk , λ̄j vj〉 = λj 〈vk , vj〉 = 0.
It follows that T(vk) = Akk vk, and hence vk is an eigenvector of T. So by induction, all the
vectors in B are eigenvectors of T.
Conversely, if there exists an orthonormal basis B for V consisting of eigenvectors of T,
then [T]B is a diagonal matrix, so [T*]B = [T]*B is also diagonal; since diagonal matrices
commute, TT* = T*T and T is normal.


Definition 3. 2. 2. Let T be a linear operator on an inner product space V. Then T is


known as self-adjoint (Hermitian) if T = T*.
An n × n real or complex matrix A is self-adjoint (Hermitian) if A = A*.

Lemma 3. 2. 4. Let T be a self-adjoint operator on a finite-dimensional inner product


space V. Then
(i) Every eigenvalue of T is real.
(ii) Suppose that V is a real inner product space. Then the characteristic polynomial of T
splits.

Proof. (i) Suppose that T(x) = λx for x ≠ 0. Because a self - adjoint operator is also
normal, we may apply Theorem 3. 2. 2 (iii) (if T is normal and T(x) = λx, then
T*(x) = λ̄ x) to obtain
λx = T(x) = T*(x) = λ̄ x. So λ = λ̄ ; that is, λ is real.
(ii) Let dim(V) = n, B be an orthonormal basis for V, and A = [T]B. Then A is self -
adjoint . Let TA be the liner operator on Cn defined by TA(x) = Ax for all x ∈ Cn.
Note that TA is self - adjoint because [TA]B1 = A, where B1 is the standard ordered
(orthonormal) basis for Cn. So, by (i), the eigenvalues of TA are real. By the
Fundamental theorem of algebra, the characteristic polynomial of TA splits into factors of
the form (t – λ). Since each λ is real, the characteristic polynomial splits over R. But TA
has the same characteristic polynomial as A, which has the same characteristic
polynomial as T. Therefore the characteristic polynomial of T splits.

Theorem 3. 2. 5. Let T be a linear operator on a finite - dimensional real inner product


space V. Then T is self - adjoint if and only if there exists an orthonormal basis B for V
consisting of eigenvectors of T.
Proof. Suppose that T is self-adjoint. By lemma 3. 2. 4, we may apply Schur’s theorem
to obtain an orthonormal basis B for V such that the matrix A = [T]B is upper triangular.

But, A* = [T]*B = [T*]B = [T]B = A. So A and A* are both upper triangular, and therefore

A is a diagonal matrix. Thus B must consist of eigenvectors of T. Converse is obvious.


3. 3. Unitary and Orthogonal Operator and Their Matrices

Definition 3. 3. 1. Let T be a linear operator on a finite - dimensional inner product
space V over a field F. If ‖T(x)‖ = ‖x‖ for all x ∈ V, then T is known as a unitary
operator if F = C and an orthogonal operator if F = R.

Note. In the infinite - dimensional case, an operator satisfying the preceding norm requirement
is generally called an isometry. If, in addition, the operator is onto (the norm condition
already guarantees that it is one - to - one), then the operator is called a unitary or orthogonal operator.

Example - 1. Let H be the space of continuous complex - valued functions on [0, 2π] with the
inner product 〈f, g〉 = (1/2π) ∫ f(t) g(t)‾ dt (the integral over [0, 2π], the bar denoting
conjugation), and let h ∈ H satisfy |h(t)| = 1 for all t. Define the linear operator T on H
by T(f) = hf. Then
‖T(f)‖2 = ‖hf‖2 = (1/2π) ∫ h(t) f(t) h(t) f(t)‾ dt = (1/2π) ∫ |f(t)|2 dt = ‖f‖2 ,
since |h(t)| = 1 for all t. So T is a unitary operator.

Note. Let T be a linear operator on a finite - dimensional inner product space V. Then
the following statements are equivalent.
(i) TT* = T*T = I.
(ii) 〈T(x), T(y)〉 = 〈x, y〉 for all x, y ∈ V.
(iii) If B is an orthonormal basis for V, then T(B) is an orthonormal basis for V.
(iv) ‖T(x)‖ = ‖x‖ for all x ∈ V.

Definition 3. 3. 2. A square matrix A is called an orthogonal matrix if ATA = AAT = I


and unitary if A*A = AA* = I.

Note.
1. Since for a real matrix A we have A* = AT, a real unitary matrix is also orthogonal.
In this case, we call A orthogonal rather than unitary.
2. The condition AA* = I is equivalent to the statement that the rows of A form an
orthonormal basis for Fn, because
δij = Iij = (AA*)ij = ∑ Aik (A*)kj = ∑ Aik Ājk (the sums running over k = 1 to n),


and the last term represents the inner product of the ith and jth rows of A.
Similarly, we have columns of A and the condition A*A = I .

Theorem 3. 3. 1. Let A be a complex n × n matrix. Then A is normal if and only if A is


unitarily equivalent to a diagonal matrix.
Proof. By the above note, we need only prove that if A is unitarily equivalent to a
diagonal matrix, then A is normal. Suppose that A = P*DP, where P is a unitary matrix
and D is a diagonal matrix. Then
AA* = (P* DP) (P*DP)* = (P*DP)(P*D*P) = P*D I D*P = P*D D*P.
Similarly, A*A = P* D* D P. Since D is a diagonal matrix, however,
we have DD* = D*D. Thus AA* = A*A.

Theorem 3. 3. 2. Let A be a real n × n matrix. Then A is symmetric if and only if A is


orthogonally equivalent to a real diagonal matrix.
Proof. Proof of this theorem is similar to theorem 3. 3.1.
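Theorem 3. 3. 2 can be illustrated numerically: for a real symmetric matrix, an orthogonal matrix of eigenvectors diagonalizes it. A minimal sketch (Python with NumPy assumed; the matrix is chosen only for illustration):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                 # real symmetric matrix

    eigvals, P = np.linalg.eigh(A)             # columns of P form an orthonormal basis of eigenvectors
    print(np.allclose(P.T @ P, np.eye(2)))     # True: P is orthogonal
    print(np.round(P.T @ A @ P, 6))            # diagonal matrix with the eigenvalues 1 and 3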

3. 4. Orthogonal Projections and the Spectral Theorem


If V = W1 ⊕ W2 , then a linear operator T on V is the projection on W1 along W2
if, whenever x = x1 + x2 , with x1∈W1 and x2∈W2, we have T(x) = x1. We know that
r(T) = W1 = { x ∈ V: T(x) = x} and n(T) = W2.
So V = r(T) ⊕ n(T). Thus there is no ambiguity if we refer to T as a “projection on W1”
or simply as a “projection”. In fact, it can be shown that T is a projection if and only if
T = T2. Because we may have V = W1 ⊕ W2 = W1 ⊕ W3 with W2 ≠ W3, the subspace W1
alone does not uniquely determine T. For an orthogonal projection T, however, T is
uniquely determined by its range.

Definition 3. 4. 1. Let V be an inner product space, and let T: V → V be a projection.


Then T is known as an orthogonal projection if r(T)⊥ = n(T) and n(T)⊥ = r(T).

Note. If r(T)⊥ = n(T), then r(T)⊥ ⊥ = n(T) ⊥, provided V is a finite dimensional inner
product space over a field F.


Theorem 3. 4. 1. Let V be an inner product space, and let T be a linear operator on V.


Then T is an orthogonal projection if and only if T has an adjoint T* and T2 = T = T*.
Proof. Suppose that T is an orthogonal projection.
Since T2 = T because T is a projection, we need only show that T* exists and T = T*.
Now V = r(T) ⊕ n(T) and r(T)⊥ = n(T), where r(T) and n(T) denote the range and null space of T.
Let x, y ∈ V. Then x = x1 + x2 and y = y1 + y2 , where x1, y1 ∈ r(T) and x2, y2 ∈
n(T). Hence
〈x, T(y)〉 = 〈x1 + x2 , y1〉 = 〈x1 , y1〉 and
〈T(x), y〉 = 〈x1 , y1 + y2〉 = 〈x1 , y1〉.
So 〈x, T(y)〉 = 〈T(x), y〉 for all x, y ∈ V; thus T* exists and T = T*.
Now suppose that T2 = T = T*.
Then T is a projection, and hence we must show that
r(T) = n(T)⊥ and r(T)⊥ = n(T).
Let x ∈ r(T) and y ∈ n(T).
Then x = T(x) = T*(x), and so 〈x, y〉 = 〈T*(x), y〉 = 〈x, T(y)〉 = 〈x, 0〉 = 0.
Therefore x ∈ n(T)⊥, from which it follows that r(T) ⊆ n(T)⊥.
Let y ∈ n(T)⊥.
We must show that y ∈ r(T), that is, T(y) = y.
Now ‖y – T(y)‖2 = 〈y – T(y), y – T(y)〉 = 〈y, y – T(y)〉 – 〈T(y), y – T(y)〉.
Since y – T(y) ∈ n(T) and y ∈ n(T)⊥, the first term must equal zero.
But also 〈T(y), y – T(y)〉 = 〈y, T*(y – T(y))〉 = 〈y, T(y – T(y))〉 = 〈y, 0〉 = 0.
Thus y – T(y) = 0; that is, y = T(y) ∈ r(T).
Hence r(T) = n(T)⊥.
Thus, we have r(T)⊥ = n(T)⊥⊥ ⊇ n(T).
Now suppose that x ∈ r(T)⊥.
For any y ∈ V, we have 〈T(x), y〉 = 〈x, T*(y)〉 = 〈x, T(y)〉 = 0.
So T(x) = 0, and thus x ∈ n(T). Hence r(T)⊥ = n(T).

Theorem 3. 4. 2 [Spectral Theorem]. Suppose that T is a linear operator on a finite -


dimensional inner product space V over F with the distinct eigenvalues λ1, λ2 , . . . . , λk .


Assume that T is normal if F = C and that T is self - adjoint if F = R. For each i (1 ≤ i ≤


k), let Wi be the eigenspace of T corresponding to the eigenvalue λi , and let Ti be the
orthogonal projection of V on Wi. Then the following statements are true.
(i) V = W1 ⊕ W 2 ⊕ . . . . . . . ⊕ W k

(ii) If W1i denotes the direct sum of the subspaces Wj for j ≠ i, then Wi⊥ = W1i .

(iii) Ti Tj = δij Ti for 1 ≤ i , j ≤ k.

(iv) I = T1 + T2 + . . . . . . . . + Tk .

(v) T = λ1T1 + λ2T2 + . . . . . . . + λkTk .

To prove the spectral Theorem, we need following Facts

Fact 1. Let T be a linear operator on a finite -dimensional complex inner product space
V. Then T is normal if and only if there exists an orthonormal basis for V consisting of
eigenvectors of T.

Fact 2. Let T be a linear operator on a finite- dimensional real inner product space V.
Then T is self - adjoint if and only if there exists an orthonormal basis B for V consisting
of eigenvectors of T.

Fact 3. A linear operator T on a finite - dimensional vector space V is diagonalizable if


and only if V is the direct sum of the eigenspaces of T.

Fact 4. Let V be an inner product space, and let T be a normal operator on V. If λ1
and λ2 are distinct eigenvalues of T with corresponding eigenvectors x1 and x2, then x1
and x2 are orthogonal.

Fact 5. Suppose that S = { v1, v2, . . . . . . ,vk }is an orthonormal set in an n - dimensional
inner product space V. Then
a) S can be extended to an orthonormal basis { v1, v2, . . .. .vk, vk + 1, . . . . .vn } for V.

b) If W = span (S), then S1 = { vk + 1 , vk + 2 , . . . . . , vn } is an orthonormal basis for W⊥


c) If W is any subspace of V, then dim (V) = dim (W) + dim (W⊥).

Proof of the Main Theorem. By Fact -1 and Fact - 2, T is diagonalizable;


V = W1 ⊕ W2 ⊕ . . . . . . . ⊕ Wk by Fact-3.
If x ∈ Wi and y ∈ Wj for some i ≠ j, then 〈x, y〉 = 0
by Fact 4. It follows easily from this result that W1i ⊆ Wi⊥ . From (i), we have
dim (W1i) = ∑ dim (Wj) (the sum over j ≠ i) = dim (V) – dim (Wi).
On the other hand, we have dim (Wi⊥) = dim (V) − dim (Wi) by Fact 5.
Hence W1i = Wi⊥ , proving (ii). The proof of (iii) is obvious.
(iv) Since Ti is the orthogonal projection of V on Wi, it follows from (ii) that
n(Ti) = r(Ti)⊥ = Wi⊥ = W1i .
Hence, for x ∈ V, we have x = x1 + x2 + . . . . . + xk , where Ti(x) = xi ∈ Wi, proving (iv).
(v) For x ∈ V, write x = x1 + x2 + . . . . . . . + xk , where xi ∈ Wi. Then
T(x) = T(x1) + T(x2) + . . . . . . . + T(xk)
= λ1x1 + λ2x2 + . . . . . . . + λkxk
= λ1T1(x) + λ2T2(x) + . . . . . . . + λkTk(x)
= ( λ1T1 + λ2T2 + . . . . . . . + λkTk ) (x).
This completes the proof.

Note. The set {λ1 , λ2 , . . . . ., λk } of eigenvalues of T is called the Spectrum of T, and


the condition (iv) is called a resolution of the identity operator induced by T and the
condition (v) is called a Spectral decomposition of T.
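For a real symmetric matrix the spectral decomposition can be computed explicitly; the sketch below (Python with NumPy assumed; the eigenvalues are distinct, so each eigenspace is one-dimensional) builds the orthogonal projections Ti and checks the resolution of the identity and the decomposition of T.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    eigvals, P = np.linalg.eigh(A)

    projections = []
    for k in range(len(eigvals)):
        v = P[:, [k]]                          # unit eigenvector as a column
        projections.append(v @ v.T)            # orthogonal projection onto the eigenspace span{v}

    resolution    = sum(projections)                                              # T1 + T2
    reconstructed = sum(lam * proj for lam, proj in zip(eigvals, projections))    # lambda1 T1 + lambda2 T2
    print(np.allclose(resolution, np.eye(2)), np.allclose(reconstructed, A))      # True True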

3. 5. Summary

The adjoint of an operator A is also sometimes called the Hermitian conjugate of


A and is denoted by A*. The self-adjoint operator is an operator that is its own adjoint,
or, equivalently, one whose matrix is Hermitian, where a Hermitian matrix is one which
is equal to its own conjugate transpose.


By the finite-dimensional spectral theorem such operators have an orthonormal


basis in which the operator can be represented as a diagonal matrix with entries in the real
numbers. The spectral theorem is any of a number of results about linear operators or
about matrices. In broad terms the spectral theorem provides conditions under which an
operator or a matrix can be diagonalized. This concept of diagonalization is relatively
straightforward for operators on finite-dimensional spaces, but requires some
modification for operators on infinite-dimensional spaces. In general, the spectral
theorem identifies a class of linear operators that can be modelled by multiplication
operators.

The spectral theorem also provides a canonical decomposition, called the spectral
decomposition, eigenvalue decomposition, or eigendecomposition, of the underlying
vector space on which the operator acts. The importance of diagonalizable operators is
seen in Block – II. For Normal and self - adjoint operators, it is necessary and sufficient
for the vector space V to possess a basis of eigen vectors. As V is an inner product space
in this block, it is reasonable to seek conditions that guarantee that V has an orthonormal
basis of eigen vectors.

3. 6. Keywords
Adjoint of a linear operator, Adjoint of a matrix, Hermitian, Normal matrix, Normal
operator, Normalizing a vector, Orthogonal operator, Orthogonal vectors, Orthonormal
basis, Self - adjoint matrix, Self - adjoint operator, Spectrum, Spectral decomposition,
Spectral Theorem, Unitarily equivalent matrices, Unitary matrix, Unitary operator.


UNIT - 4
4. 0. Introduction

In this unit, we will generalize the notion of linear forms. In fact, we will
introduce the notion of a bilinear form on a finite-dimensional vector space. We have
studied linear forms on V(F). Here, we will study bilinear forms as mapping from V×V to
F, which are linear forms in each variable. Bilinear forms also give rise to quadratic and
Hermitian forms.

4. 1. Bilinear and Quadratic Forms

Definition 4. 1. 1. Let V be a vector space over a field F. A transformation B: V × V→


F is said to be bilinear form on V if it satisfies following properties:
(i) B(ax1 + x2,y) = a B(x1 , y) + B (x2 , y) for all x1, x2, y ∈ V and a ∈ F
(ii) B (x, ay1 + y2) = a B (x, y1) + B (x , y2) for all x, y1, y2 ∈ V and a ∈ F.

We denote the set of all bilinear forms on V by B (V).


Note. An inner product on a vector space is a bilinear form if the underlying field is real,
but not if the underlying field is complex.

Example -1. Let V be a vector space over F = R. Then the mapping defined by
B(x, y) = x . y (which is the inner product of x and y) for x, y in V, is a bilinear form on V.

Example - 2. Define a function B: R2 × R2 → R by
B( (a1, a2), (b1, b2) ) = 2a1b1 + 3a1b2 + 4a2b1 – a2b2
for (a1, a2), (b1, b2) ∈ R2.
We could verify directly that B is a bilinear form on R2.
However, if
      ⎡ 2    3 ⎤
A  =  ⎣ 4   –1 ⎦
and x, y are written as column vectors, then B(x, y) = xT Ay.
The bilinearity of B now follows directly from the distributive property of matrix
multiplication over matrix addition.


Example - 3. Let V = Fn, where the vectors are considered as column vectors.
For any A ∈ Mn×n(F), define B : V × V → F by B(x, y) = xT Ay for x, y ∈ V.
Notice that since x and y are n × 1 matrices and A is an n × n matrix, B(x, y) is a 1 × 1
matrix. We identify this matrix with its single entry. The bilinearity of B follows as in
the example-2.
For example, for a ∈ F and x1, x2, y ∈ V, we have
B(ax1 + x2, y) = (ax1 + x2)T A y = (axT1 + xT2)Ay
= axT1 Ay + xT2 Ay
= a B(x1, y) + B(x2, y).
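Such bilinear forms are convenient to evaluate by matrix multiplication; the sketch below (Python with NumPy assumed, and using the matrix A = [[2, 3], [4, –1]] of Example 2) evaluates B(x, y) = xT Ay for particular vectors.

    import numpy as np

    A = np.array([[2.0,  3.0],
                  [4.0, -1.0]])                 # matrix of the bilinear form of Example 2

    def B(x, y):
        return x @ A @ y                        # B(x, y) = x^T A y

    x = np.array([1.0, 2.0])
    y = np.array([3.0, -1.0])
    print(B(x, y))                              # 2*3 + 3*(-1) + 4*2*3 - 2*(-1) = 29.0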
Note.
1. By xT Ay, we understand the product of three matrices.

That is, x = [x1, x2, . . . , xn]T and y = [y1, y2, . . . . . , yn]T are column vectors, and A = [aij]n×n .

2. For any bilinear form B on a vector space V over afield F, the following properties
hold:
(i) If, for any x∈V, the functions Tx, Rx: V →F are defined by Tx(y) = B(x, y)
and Rx(y)=B(y, x) for all y∈V, then Tx and Rx are linear.
(ii) B(0, x) = B(x, 0) =0 for all x∈V.
(iii) For all x, y, z, w∈V,
B(x+y, z+w) = B(x, z) + B(x, w)+ B(y, z) + B(y, w)
(iv) If S: V × V → F is defined by S(x, y) = B(y, x), then S is a bilinear form.

Definition 4. 1. 2. Let V be a vector space, Let B1 and B2 be bilinear forms on V, and let
a be a scalar. We define the sum B1 + B2 and the scalar product aB1 by the equations
(B1 + B2)(x, y) = B1(x, y) + B2(x, y) and
(aB1)(x, y) = a(B1(x, y)) for all x, y ∈ V.
Note.


1. For any vector space V, the sum of two bilinear forms and the product of a scalar
and a bilinear form on V are again bilinear forms on V. Furthermore, B(V) is a
vector space with respect to these operations.
2. Let B = {v1, v2, . . . . . , vn} be an ordered basis for an n - dimensional vector space
V, and let B ∈ B(V). We can associate with B an n × n matrix A whose entry in the ith
row and jth column is defined by Aij = B(vi, vj) for i, j = 1, 2, . . . . . . n.
3. The matrix A above is called the matrix representation of B with respect to
the ordered basis B and is denoted by ΨB(B). The map ΨB : B(V) → Mn×n(F) so defined
is an isomorphism.

Definition 4. 1. 3. A bilinear form B on a vector space V is symmetric
if B(x, y) = B(y, x) for all x, y ∈ V.

Theorem 4. 1. 1. Let B be a bilinear form on a finite - dimensional vector space V, and


let B be an ordered basis for V. Then B is symmetric if and only if ΨB(B) is symmetric.

Proof. Let B = {v1, v2, . . . . . , vn} and C = ΨB(B). First assume that B is symmetric.
Then for 1 ≤ i, j ≤ n, Cij = B(vi, vj) = B(vj, vi) = Cji , and it follows that C = ΨB(B) is symmetric.
Conversely, suppose that C = ΨB(B) is symmetric. Let S : V × V → F, where F is the field of
scalars for V, be the mapping defined by S(x, y) = B(y, x) for all x, y ∈ V. By Note (iv)
above, S is a bilinear form. Let D = ΨB(S).
Then for 1 ≤ i, j ≤ n, Dij = S(vi, vj) = B(vj, vi) = Cji = Cij . Thus ΨB(S) = D = C = ΨB(B);
since ΨB is one - to - one, we have S = B. Hence B(y, x) = S(x, y) = B(x, y) for all x, y ∈ V,
and therefore B is symmetric.

Definition 4. 1 .4. A bilinear form B on a finite - dimensional vector space V is called


diagonalizable if there is an ordered basis B for V such that ΨB(B) is a diagonal matrix.

Corollary. Let B be a diagonalizable bilinear form on a finite - dimensional vector


space V. Then B is symmetric.


Proof. Suppose that B is diagonalizable. Then there is an ordered basis B for V such that

ΨB(B) = D is a diagonal matrix. Trivially, D is a symmetric matrix, and hence, by


theorem 4. 1. 1, B is symmetric.

Definition 4. 1. 5. Let V be a vector space over a field F. A function K : V → F is


called a quadratic form if there exists a symmetric bilinear form B ∈ B(V) such that
K(x) = B(x, x) for all x ∈ V. → (1)
Note.
1. If the field F is not of characteristic two, there is a one-to-one correspondence
between symmetric bilinear forms and quadratic forms given by the equation (1).
In fact, if K is a quadratic form on a vector space V over a field F not of
characteristic two, and K(x) = B(x, x) for some symmetric bilinear form B on V,
then we can recover B from K because B(x, y) = ½ [K(x + y) – K(x) – K(y)].
2. Let V be a finite - dimensional real inner product space, and let B be a symmetric
bilinear form on V. Then there exists an orthonormal basis B for V such that

ΨB(B) is a diagonal matrix.

Corollary. Let K be a quadratic form on a finite - dimensional real inner product


space V. There exists an orthonormal basis B = {v1, v2, . . . . . , vn} for V and scalars
λ1, λ2, . . . . , λn (not necessarily distinct) such that if x ∈ V and x = ∑ si vi with si ∈ R,
then K(x) = ∑ λi si2 .
In fact, if B is the symmetric bilinear form determined by K, then B can be
chosen to be any orthonormal basis for V such that ΨB(B) is a diagonal matrix.
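The correspondence between a quadratic form and its symmetric bilinear form can be checked numerically via the polarization identity from the Note above. A minimal sketch (Python with NumPy assumed; the symmetric matrix A is chosen only for illustration):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])                           # symmetric, so B(x, y) = x^T A y is a symmetric bilinear form

    K = lambda x: x @ A @ x                              # quadratic form K(x) = B(x, x)
    B = lambda x, y: 0.5 * (K(x + y) - K(x) - K(y))      # B recovered from K by polarization

    x = np.array([1.0, -1.0])
    y = np.array([2.0,  0.5])
    print(np.isclose(B(x, y), x @ A @ y))                # True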

4. 2. Summary
There is a certain class of scalar - valued functions of two variables defined on a
vector space that arises in the study of such diverse subjects as geometry and
multivariable calculus. This is the class of bilinear forms. We study the basic properties
of this class with a special emphasis on symmetric bilinear forms, and we consider some
of its applications to quadratic surfaces. A quadratic form is a homogeneous polynomial
of degree two in a number of variables. Quadratic forms occupy a central place in

various branches of mathematics, including number theory, linear algebra, and group theory.
Quadratic forms are homogeneous quadratic polynomials in n variables.

4. 3. Keywords
Bilinear form, Diagonalizable bilinear form, Symmetric bilinear form, Quadratic form.
Exercises
1. If λ ∈ F is an eigenvalue of T ∈ A(V), then show that there is a vector v ≠ 0 in V
such that T(v) = λv.
2. If λ ∈ F is an eigenvalue of T ∈ A(V), then show that for any polynomial q(x) ∈ F[x],
q(λ) is an eigenvalue of q(T).
3. Find all eigenvalues of the following matrices:
         ⎡ 2  2 ⎤                  ⎡ 1   2  –1 ⎤
(i) A =  ⎣ 1  3 ⎦   and  (ii) B =  ⎢ 1   0   1 ⎥ .
                                   ⎣ 4  –4   5 ⎦
Answer. (i) λ = 1 and λ = 4; (ii) λ = 1, λ = 2 and λ = 3.


4. Show that similar matrices have the same characteristic polynomial.
Hint. If the matrix B is similar to the matrix A over F, then there is an invertible
matrix C∈F such that B=C–1AC. Show det(B–xI) = det(A–xI)
5. Test the matrix
        ⎡ 3  1  0 ⎤
   A =  ⎢ 0  3  0 ⎥  ∈ M3×3(R)
        ⎣ 0  0  4 ⎦
   for diagonalizability.
Hint. The characteristic polynomial of A is det(A – tI) = – (t – 4)(t – 3)2. Find the
eigenvalues and finally show that A is not diagonalizable.
6. Let T: R3 → R3 be defined by T(x, y, z) = ( 2x + y – 2z , 2x + 3y – 4z , x + y – z ).
Find all eigenvalues of T. Is T diagonalizable?
Hint. ∆(t) = t3 – 4t2 + 5t – 2 and λ = 1 and λ = 2 are eigenvalues.
7. Verify {0}⊥ = V and V ⊥ = {0} for any inner product space V.
8. Let V be an inner product space with 〈 , 〉 as an inner product and u, v in V.
Then show that u = v if and only if 〈u, w〉 = 〈v, w〉 for all w in V.


9. Apply the Gram-Schmidt orthogonalization process to the basis B={(1, 0,1), ( 1,


0, –1), (0, 3, 4)} of the inner product space R3 to find an orthogonal and an
orthonormal basis of R3.
  
Answer. u1 = (1/√2, 0, 1/√2), u2 = (1/√2, 0, –1/√2) and u3 = (0, 1, 0).

10. Find the adjoint of the linear operator T: R3→ R3 defined by T(x, y, z) = (x+2y,
3x–4z, y).
Answer. T*(x, y, z) = (x+3y, 2x+z, –4y).

References
1. S. Friedberg. A. Insel, and L. Spence – Linear Algebra, Fourth Edition, PHI,
2009.
2. Jimmie Gilbert and Linda Gilbert – Linear Algebra and Matrix Theory, Academic
Press, An imprint of Elsevier 2010.
3. Hoffman and Kunze – Linear Algebra, Prentice – Hall of India, 1978, 2nd Ed.,
4. P. R. Halmos – Finite Dimensional Vector Space, D. Van Nostrand, 1958.
5. S. Kumaresan – Linear Algebra, A Geometric Approach, Prentice Hall India,
2000.


BLOCK-III
Canonical Forms
Objectives
After studying this unit you will be able to:
1. Understand the basic concepts of each unit.
2. Study the importance of Canonical Forms.
• Explain the defining properties of above.
• Give examples of each concept.
• Descriptions of some important theorems along with their proof.
• Some illustrative examples.

UNIT-1
1. 0. Introduction
Not every linear operator is diagonalizable, even if its characteristic polynomial
splits. The purpose of this unit is to consider alternative matrix representations for
nondiagonalizable operators. Such representations are generally known as canonical forms.
Here, we study the diagonal and triangular canonical forms.

1. 1. The Diagonal form

Note.
1. Two square matrices A and B are said to be similar matrices if there exists a non
singular matrix C such that B = CAC–1 or A = C–1BC.
2. The linear transformations S, T∈A(V) are said to be similar linear transformation
if there exists an invertible element C∈A(V) such that T = C–1SC.
3. Similarity of linear transformations in A(V) is an equivalence relation, because
(i) T ∼ T, since T = ITI–1 ;
(ii) T ∼ S ⇒ T = CSC–1 ⇒ S = C–1T(C–1)–1 ⇒ S ∼ T ;
(iii) T ∼ S and S ∼ U
⇒ T = CSC–1 , S = DUD–1
⇒ T = C(DUD–1)C–1
⇒ T = (CD) U (CD)–1
⇒ T ∼ U.
The equivalence classes are called similarity classes.
Basic definitions and Facts.

1. The linear operator T is called diagonalizable if there exists a basis for V with
respect to which the matrix for T is a diagonal matrix.
2. Let T be a linear operator on a vector space V, and let   λ be an eigenvalue of T.
Define Eλ = {x ∈ V: T(x) = λx} = n(T  – λIv). The set E λ is called the eigenspace
of T corresponding to the eigenvalue λ. Analogously, we define the eigenspace of
a square matrix A to be the eigenspace of TA.

Example - 1. Let T be the linear operator on P2(R) defined by T(f(x)) = f′(x). The
matrix representation of T with respect to the standard ordered basis B for P2(R) is
         ⎡ 0  1  0 ⎤
[T]B  =  ⎢ 0  0  2 ⎥ .
         ⎣ 0  0  0 ⎦
Consequently, the characteristic polynomial of T is
                      ⎡ –t   1   0 ⎤
det ([T]B – tI) = det ⎢  0  –t   2 ⎥  = –t3 .
                      ⎣  0   0  –t ⎦
Thus T has only one eigenvalue (λ = 0), with multiplicity 3. Solving T(f(x)) = f′(x) = 0
shows that Eλ = n(T – λIV) = n(T) is the subspace of P2(R) consisting of the constant
polynomials. So {1} is a basis for Eλ, and therefore dim (Eλ) = 1.
Consequently, there is no basis for P2(R) consisting of eigenvectors of T, and
therefore T is not diagonalizable.

3. Let T be a linear operator on a finite dimensional vector space V, and let λ be an


eigenvalue of T having multiplicity m. Then 1≤ dim(Eλ) ≤ m, where Eλ is a
subspace of V consisting of the zero vector and the eigenvectors of T
corresponding to the eigenvalue λ.


4. Let T be a linear operator, and let λ₁, λ₂, . . . , λₖ be distinct eigenvalues of T.
For each i = 1, 2, . . . , k, let vᵢ∈Eλᵢ, the eigenspace corresponding to λᵢ.
If v₁ + v₂ + · · · + vₖ = 0, then vᵢ = 0 for all i.
5. Let T be a linear operator, and let λ₁, λ₂, . . . , λₖ be distinct eigenvalues of T. For
each i = 1, 2, . . . , k, let Sᵢ be a finite linearly independent subset of the eigenspace Eλᵢ.
Then S = S₁ ∪ S₂ ∪ · · · ∪ Sₖ is a linearly independent subset of V.
6. Let T be a linear operator on a finite dimensional vector space V. Then T is
diagonalizable if and only if following conditions holds:
(i) The characteristic polynomial of T splits.
(ii) For each eigenvalue λ of T, the multiplicity of λ equals n – rank(T – λI).

The above conditions can be used to test whether a square matrix A is diagonalizable,
because diagonalizability of A is equivalent to diagonalizability of the operator TA.

Illustrative Example-2. Test whether the matrix
A = ⎡3 1 0⎤
    ⎢0 3 0⎥
    ⎣0 0 4⎦ ∈ M₃ₓ₃(R)
is diagonalizable or not.
Solution. The characteristic polynomial of A is det(A – tI) = –(t – 4)(t – 3)², which splits,
and so condition 1 of the test (Fact 6(i)) for diagonalization is satisfied. Also A has
eigenvalues λ₁ = 4 and λ₂ = 3 with multiplicities 1 and 2, respectively. Since λ₁ has
multiplicity 1, condition 2 is satisfied for λ₁. Thus we need only test condition 2 for λ₂.
Because A – λ₂I = ⎡0 1 0⎤
                  ⎢0 0 0⎥
                  ⎣0 0 1⎦ has rank 2,
we see that n – rank(A – λ₂I) = 3 – 2 = 1, which is not the multiplicity of λ₂.
Thus condition 2 fails for λ₂. Therefore A is not diagonalizable.
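The rank test of Fact 6 can also be carried out mechanically. The following is a sketch only (Python with sympy is assumed to be available), applied to the matrix of Illustrative Example-2:

# Diagonalizability test of Fact 6 for the matrix of Illustrative Example-2 (sympy assumed).
import sympy as sp

A = sp.Matrix([[3, 1, 0],
               [0, 3, 0],
               [0, 0, 4]])
n = A.shape[0]

diagonalizable = True
for eigenvalue, multiplicity in A.eigenvals().items():
    geometric = n - (A - eigenvalue * sp.eye(n)).rank()   # n - rank(A - λI)
    if geometric != multiplicity:
        diagonalizable = False

print(diagonalizable)          # False: for λ = 3, 3 - rank(A - 3I) = 1 < 2
print(A.is_diagonalizable())   # sympy's built-in check agrees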

Remark. However, not every linear operator is diagonalizable, even if its characteristic
polynomial splits. This Block therefore considers alternative matrix representations for
nondiagonalizable operators (see Example-1 and Illustrative Example-2). These
representations are called canonical forms.


Definition 1. 1. 1. Two linear operators T and U on an n-dimensional vector space V are


called simultaneously diagonalizable if there exists some basis B of V such that [T]B and

[U]B are diagonal matrices.

Similarly, A, B∈Mₙₓₙ(F) are called simultaneously diagonalizable if there exists an
invertible matrix C∈Mₙₓₙ(F) such that both C⁻¹AC and C⁻¹BC are diagonal matrices.

Lemma 1. 1. 1. If D1, D2 ∈Mn×n(F) are two diagonal matrices, then D1D2 = D2 D1.
Proof. If D₁, D₂∈Mₙₓₙ(F) are two diagonal matrices, then (D₁)ᵢₖ = (D₁)ᵢᵢδᵢₖ and (D₂)ₖⱼ = (D₂)ⱼⱼδₖⱼ, so
(D₁D₂)ᵢⱼ = Σₖ (D₁)ᵢₖ(D₂)ₖⱼ
= Σₖ (D₁)ᵢᵢδᵢₖ (D₂)ⱼⱼδₖⱼ
= (D₁)ᵢᵢ(D₂)ⱼⱼδᵢⱼ
= (D₂)ᵢᵢ(D₁)ⱼⱼδᵢⱼ
= Σₖ (D₂)ᵢₖ(D₁)ₖⱼ = (D₂D₁)ᵢⱼ.
Theorem 1. 1. 2. If T and U are simultaneously diagonalizable operators, then T and U
commute.
Proof. Let B be a basis of the vector space V that diagonalizes both T and U. Since
diagonal matrices commute with each other (by the above lemma),
[TU]B = [T]B [U]B = [U]B [T]B = [UT]B.
Now we can relate the two operators in the same way:
TU = φB⁻¹ T[TU]B φB = φB⁻¹ T[UT]B φB = UT,
where φB is the standard representation with respect to the basis B and T[TU]B is the
left-multiplication transformation by the matrix [TU]B.
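Theorem 1.1.2 is easy to illustrate on concrete matrices. In the sketch below (sympy assumed; the matrices C, D1, D2 are arbitrary illustrative choices), A and B are simultaneously diagonalizable by construction, and they commute:

# Simultaneously diagonalizable matrices commute (illustration; sympy assumed).
import sympy as sp

C  = sp.Matrix([[1, 1], [1, 2]])   # an invertible change-of-basis matrix
D1 = sp.diag(2, 5)                 # two diagonal matrices
D2 = sp.diag(-1, 3)

A = C * D1 * C.inv()               # C**-1 * A * C and C**-1 * B * C are diagonal,
B = C * D2 * C.inv()               # so A and B are simultaneously diagonalizable

print(A * B == B * A)              # True, as Theorem 1.1.2 predicts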

Illustrative Example-3. Show that a linear map T satisfying T² = T is diagonalizable.

Solution. The minimal polynomial of T must divide t² – t = t(t – 1). Hence it has
distinct roots, and therefore T is diagonalizable: V can be decomposed into a direct sum of
subspaces annihilated by T and by T – I, and on each of these subspaces T acts as a scalar.

Illustrative Example-4. Let λᵢ ≠ 0 for all i and let T satisfy Tⁿ = 1. Show that if T has all
its eigenvalues in F, then T is diagonalizable.


Solution. Since T has all its eigenvalues in the field F, the minimal polynomial of T is
q(t) = ∏ᵢ (t − λᵢ)^{nᵢ}.
Now we claim that all these roots are simple.
Since Tⁿ = I, q(t) divides tⁿ – 1, and if q(t) had a multiple root, so would tⁿ – 1.
But tⁿ – 1 cannot have a multiple root in F: since λᵢ ≠ 0 for all i, there is no common root
between tⁿ – 1 and its derivative. Hence the minimal polynomial of T is a product of distinct
linear factors, and T is diagonalizable.
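Illustrative Examples 3 and 4 can be checked on concrete matrices; the sketch below (the matrices are our own illustrative choices, and sympy is assumed) uses sympy's built-in diagonalizability test:

# Checks for Illustrative Examples 3 and 4 (illustrative matrices; sympy assumed).
import sympy as sp

P = sp.Matrix([[1, 1], [0, 0]])          # P**2 == P: an idempotent matrix
print(P**2 == P, P.is_diagonalizable())  # True True  (minimal polynomial divides t(t - 1))

S = sp.Matrix([[0, 1], [1, 0]])          # S**2 == I, with real eigenvalues 1 and -1
print(S**2 == sp.eye(2), S.is_diagonalizable())   # True True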

1. 2. Triangular Canonical Form


Definition 1. 2. 1. The linear transformation T is called triangular if there exists a basis
for V such that the matrix of T relative to the basis is an upper or lower triangular matrix.
In other words, a matrix A = [aij]n×n is triangular if all the entries above or below the main
diagonal are zero.
For example, A = [aᵢⱼ]ₙₓₙ =
⎡a₁₁  0   ⋯  0 ⎤        ⎡a₁₁ a₁₂ ⋯ a₁ₙ⎤
⎢a₂₁ a₂₂  ⋯  0 ⎥   or   ⎢ 0  a₂₂ ⋯ a₂ₙ⎥
⎢ ⋮        ⋱    ⎥        ⎢ ⋮       ⋱    ⎥
⎣aₙ₁ aₙ₂  ⋯ aₙₙ⎦        ⎣ 0   0  ⋯ aₙₙ⎦.
Note.
1. det(A) = a₁₁a₂₂ · · · aₙₙ.
2. A is nonsingular if and only if aᵢᵢ ≠ 0 for all i = 1, 2, . . . , n, that is,
det(A) ≠ 0. In fact, the eigenvalues of A are the diagonal elements.
3. The subspace W of V is invariant under T∈A(V) if T(W) ⊂ W .

Lemma 1.2.1. If W⊂V is invariant under T, then T induces a linear transformation T̄ on
V/W, defined by T̄(v + W) = T(v) + W. If T satisfies the polynomial q(x)∈F[x], then so
does T̄. If p₁(x) is the minimal polynomial for T̄ over F and if p(x) is that for T, then
p₁(x) | p(x).
Proof. Given that T: V→V is linear and W is an invariant subspace of V, we have T(W) ⊂ W.
Let V̄ = V/W = {v + W : v∈V} = {v̄ : v∈V}.
For v̄ ∈ V̄, define T̄: V/W → V/W by T̄(v + W) = T(v) + W.
First, to prove T̄ is well defined, we show that v₁+W = v₂+W ⇒ T̄(v₁+W) = T̄(v₂+W).
Let v₁+W = v₂+W. Then
⇒ v₁ – v₂ ∈ W
⇒ T(v₁ – v₂) ∈ W, because T(W) ⊂ W, since W is invariant under T
⇒ T(v₁) – T(v₂) ∈ W, because T is linear
⇒ T(v₁) + W = T(v₂) + W
⇒ T̄(v₁+W) = T̄(v₂+W), or T̄(v̄₁) = T̄(v̄₂).
Thus T̄ is a well defined transformation of V/W.
Now, the linearity of T̄ follows from the linearity of T.
Let v₁+W, v₂+W ∈ V/W. Then
T̄((v₁+W) + (v₂+W)) = T̄((v₁+v₂)+W), by definition
= T(v₁+v₂) + W, by definition
= (T(v₁)+W) + (T(v₂)+W), because T is linear
= T̄(v₁+W) + T̄(v₂+W).
Therefore T̄ preserves addition.
Next, consider v + W ∈ V/W and a ∈ F:
T̄(a(v + W)) = T̄(av + W)
= T(av) + W
= a T(v) + W, because T is linear
= a (T(v) + W), by definition
= a T̄(v + W).
Therefore T̄ preserves scalar multiplication. Thus T̄ is a linear transformation.
Let q(x)∈F[x]; we show that if q(T) = 0, then q(T̄) = 0.
We claim that (T̄)ᵏ = (Tᵏ)‾, the transformation induced by Tᵏ on V/W.
For, (T̄)²(v + W) = T̄(T̄(v + W))
= T̄(T(v) + W)
= T(T(v)) + W
= T²(v) + W
= (T²)‾(v + W).
Similarly, (T̄)³ = (T³)‾, (T̄)⁴ = (T⁴)‾, . . . , (T̄)ᵏ = (Tᵏ)‾.
Now let q(x)∈F[x], say q(x) = a₀ + a₁x + a₂x² + · · · + aₖxᵏ. Then
q(T̄) = a₀ + a₁T̄ + a₂(T̄)² + · · · + aₖ(T̄)ᵏ,
and so for any v + W, q(T̄)(v + W) = q(T)(v) + W; that is, q(T̄) = (q(T))‾.
Therefore q(T) = 0 ⇒ (q(T))‾ = 0, that is, q(T̄) = 0.
Given that p(x) is the minimal polynomial of T, we have p(T) = 0, so p(T̄) = 0. Given that
p₁(x) is the minimal polynomial of T̄, by the definition of the minimal polynomial we
conclude that p₁(x) | p(x).

Theorem 1. 2. 2. If T∈A(V) has all its eigenvalues in F, then there is a basis of V in


which the matrix of T is triangular.
Proof. We prove the result by induction on dim(V) over F. Let dim(V(F)) = 1.
Then every element in A(V) is a scalar, and hence the theorem is true. Suppose that the
theorem is true for all vector spaces over F of dimension (n – 1), and let V be of dimension
n over F. Since all the eigenvalues of T lie in F, there exists a nonzero vector v₁∈V such
that T(v₁) = λ₁v₁. Let W be the subspace of V generated by v₁, that is, W = {av₁ : a∈F}.
Then W is a one-dimensional subspace of V which is invariant under T. Let V̄ = V/W.
Then dim(V̄) = dim(V) – dim(W) = (n – 1). Since T∈A(V), T induces T̄∈A(V̄), where
T̄(v + W) = T(v) + W. Then, we know that the minimal polynomial of T̄
divides the minimal polynomial of T; that is, p₁(x) | p(x) (by Lemma 1.2.1). Therefore
the eigenvalues of T̄ are in F, because the eigenvalues of T are in F. By the induction
hypothesis, there exists a basis {v̄₂, v̄₃, . . . , v̄ₙ} = {v₂ + W, v₃ + W, . . . , vₙ + W}
of V̄ = V/W over F such that
T̄(v̄₂) = a₂₂v̄₂
T̄(v̄₃) = a₃₂v̄₂ + a₃₃v̄₃
. . .
T̄(v̄ₙ) = aₙ₂v̄₂ + aₙ₃v̄₃ + · · · + aₙₙv̄ₙ.
We now verify that B = {v₁, v₂, . . . , vₙ}, where v₂, . . . , vₙ are elements of V mapping
onto v̄₂, . . . , v̄ₙ respectively, is a basis for V with respect to which T has a matrix
in triangular form.


Let a₁v₁ + a₂v₂ + · · · + aₙvₙ = 0 → (1)
⇒ (a₁v₁ + a₂v₂ + · · · + aₙvₙ) + W = W
⇒ a₁(v₁+W) + a₂(v₂+W) + · · · + aₙ(vₙ+W) = W
⇒ W + a₂v̄₂ + a₃v̄₃ + · · · + aₙv̄ₙ = W, since v₁∈W
⇒ a₂v̄₂ + a₃v̄₃ + · · · + aₙv̄ₙ = 0̄ in V/W
⇒ a₂ = 0, a₃ = 0, . . . , aₙ = 0, since v̄₂, v̄₃, . . . , v̄ₙ are linearly independent.
Now, from (1), we get a₁v₁ = 0; therefore a₁ = 0, as v₁ ≠ 0.
Therefore B = {v₁, v₂, . . . , vₙ} is a linearly independent set in V, and since dim(V) = n,
B forms a basis of V.

Further, T̄(v̄₂) = a₂₂v̄₂ ⇒ T̄(v̄₂) – a₂₂v̄₂ = 0̄ in V/W
⇒ (T(v₂) + W) – a₂₂(v₂ + W) = W
⇒ (T(v₂) – a₂₂v₂) + W = W
⇒ T(v₂) – a₂₂v₂ ∈ W
⇒ T(v₂) – a₂₂v₂ = a₂₁v₁ for some a₂₁∈F, since W = {av₁ : a∈F}
⇒ T(v₂) = a₂₁v₁ + a₂₂v₂.
Similarly, we can prove T(vᵢ) = aᵢ₁v₁ + aᵢ₂v₂ + · · · + aᵢᵢvᵢ for each i.
Thus, T(v₁) = λ₁v₁ = a₁₁v₁ (where λ₁ = a₁₁)
T(v₂) = a₂₁v₁ + a₂₂v₂
. . .
T(vᵢ) = aᵢ₁v₁ + aᵢ₂v₂ + · · · + aᵢᵢvᵢ
. . .
T(vₙ) = aₙ₁v₁ + aₙ₂v₂ + · · · + aₙₙvₙ.
Therefore the matrix of T with respect to the basis B = {v₁, v₂, . . . , vₙ} is
A = ⎡a₁₁  0   0  ⋯  0 ⎤
    ⎢a₂₁ a₂₂  0  ⋯  0 ⎥
    ⎢ ⋮            ⋱   ⎥
    ⎣aₙ₁ aₙ₂ aₙ₃ ⋯ aₙₙ⎦,
which is a triangular matrix.

Lemma 1.2.3. If V is n-dimensional over F and if T∈A(V) has the matrix A in the
basis A = {u₁, u₂, . . . , uₙ} and the matrix B in the basis B = {v₁, v₂, . . . , vₙ}, then
there is an invertible matrix C∈Fₙ such that B = CAC⁻¹.
In fact, if S is the linear transformation of V defined by S(uᵢ) = vᵢ for i = 1, 2, . . . , n, then
C can be chosen to be the matrix of S in the basis A = {u₁, u₂, . . . , uₙ}.
Proof. Let A = [aᵢⱼ] and B = [bᵢⱼ]. Then T(uᵢ) = Σⱼ aᵢⱼuⱼ and T(vᵢ) = Σⱼ bᵢⱼvⱼ,
respectively. Let S∈A(V) be defined by S(uᵢ) = vᵢ, so that S is invertible (since S takes a
basis to a basis, S is bijective, hence invertible).
Now from T(vᵢ) = Σⱼ bᵢⱼvⱼ
⇒ T(S(uᵢ)) = Σⱼ bᵢⱼS(uⱼ), since S(uⱼ) = vⱼ
⇒ (TS)(uᵢ) = S(Σⱼ bᵢⱼuⱼ)
⇒ S⁻¹(TS)(uᵢ) = S⁻¹S(Σⱼ bᵢⱼuⱼ)
⇒ (S⁻¹TS)(uᵢ) = Σⱼ bᵢⱼuⱼ, since S is invertible.
Thus the matrix of S⁻¹TS with respect to the basis A = {u₁, u₂, . . . , uₙ} is exactly B.
Therefore B = CAC⁻¹, where C is the matrix of S in the basis A, by virtue of the fact that
the mapping T → [T]A is an isomorphism of A(V) onto Fₙ.

Theorem 1.2.4. If the matrix A∈Fₙ has all its eigenvalues in F, then there is an invertible
matrix C∈Fₙ such that CAC⁻¹ is a triangular matrix.
Proof. Suppose that A = [aᵢⱼ]∈Fₙ has all its eigenvalues in F. Define a linear
transformation T: Fⁿ → Fⁿ on the standard basis v₁ = (1, 0, . . . , 0), v₂ = (0, 1, . . . , 0),
. . . , vₙ = (0, 0, . . . , 1) by
T(v₁) = (a₁₁, a₁₂, . . . , a₁ₙ) = a₁₁v₁ + a₁₂v₂ + · · · + a₁ₙvₙ
T(v₂) = (a₂₁, a₂₂, . . . , a₂ₙ) = a₂₁v₁ + a₂₂v₂ + · · · + a₂ₙvₙ
. . .
T(vₙ) = (aₙ₁, aₙ₂, . . . , aₙₙ) = aₙ₁v₁ + aₙ₂v₂ + · · · + aₙₙvₙ,
so that the matrix of T in this basis is precisely A. The eigenvalues of T, being equal to
those of A, are all in F. By Theorem 1.2.2, there is a basis of Fⁿ with respect to which the
matrix of T is triangular. This change of basis merely changes the matrix A of the linear
transformation T in the first basis into CAC⁻¹ for a suitable C∈Fₙ. By Lemma 1.2.3,
CAC⁻¹ is a triangular matrix for some invertible C∈Fₙ.
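Theorem 1.2.4 is an existence statement. Numerically, one standard construction that produces a triangular matrix similar to a given real matrix with real eigenvalues is the Schur decomposition (a different construction from the proof above, using an orthogonal change of basis). The sketch below assumes numpy and scipy are available and uses a matrix of our own choosing:

# Triangularization via the Schur decomposition (numpy/scipy assumed; example matrix ours).
import numpy as np
from scipy.linalg import schur

A = np.array([[4.0, -2.0],
              [1.0,  1.0]])          # eigenvalues 2 and 3, both real

T, Q = schur(A, output='real')       # A = Q T Q^T with T upper triangular here
print(np.round(T, 3))                # triangular; the eigenvalues appear on the diagonal
print(np.allclose(Q @ T @ Q.T, A))   # True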
Note.
1. Theorem 1.2.4 is also known as the alternate (matrix) form of Theorem 1.2.2.
2. In the next theorem, we write λᵢ = aᵢᵢ for i = 1, 2, . . . , n.

Theorem 1. 2. 5. If V is n-dimensional vector space over a field F and T∈A(V) has all
its eigen values in F, then T satisfies a polynomial of degree n over F.
Proof. Since T∈A(V) has all its eigen values in F, there is a basis B ={v1,v2,… …,vn} of
V in which satisfies
T(v1) = λ1v1 or a11v1 (where λ1 = a11)
T(v2) = a21v1 + λ2v2
T(v3) = a31v1 + a32v2 + λ3v3
……………………………..
T(vn) = an1 v1 + an2 v2+…….. + λnvn
Equivalently, (T – λ1) v1 = 0
(T – λ2) v2 = a21v1
(T – λ3) v3 = a31v1 + a32v2
………………………………
(T – λn) vn = an1 v1 + an2 v2+…….. + an, n–1 vn–1.
Note that (T – λ1) (T – λ2) v1 = (T – λ2) (T – λ1) v1=(T – λ2) .0 = 0, since (T – λ1) v1 = 0.
Also, (T – λ1) (T – λ2) v2 = (T – λ1) a21v1, since (T – λ2) v2 = a21v1
= a21 ((T – λ1) v1) = a21.0 = 0, since (T – λ1) v1 = 0.
Continuing this type of computation, we get
(T – λn) (T – λn–1) . . . . . (T – λ1) v1 = 0


(T – λn) (T – λn–1) . . . . . (T – λ1) v2 = 0


……………………………………………
(T – λn) (T – λn–1) . . . . . (T – λ1) vn = 0
Let S = (T – λn) (T – λn–1) . . . . . (T – λ1)∈ A(V). Then
S (v1) = (T – λn) (T – λn–1) . . . . . (T – λ1) v1 = 0
S (v2) = (T – λn) (T – λn–1) . . . . . (T – λ1) v2 = 0
……………………………………………..………
S (vn) = (T – λn) (T – λn–1) . . . . . (T – λ1) vn = 0
That is S annihilates all the vectors of basis of V. So, S annihilates all the vectors of V.
That is S (v) = 0 for all v∈V.
Therefore S = 0 ⇒ (T – λn) (T – λn–1) . . . . . (T – λ1) = 0.
Therefore T satisfies the polynomial (x – λn) (x – λn–1) . . . . . (x – λ1) in F[x] of
degree n. This completes the proof.
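Theorem 1.2.5 can be illustrated on a concrete triangular matrix (the matrix below is our own example; sympy is assumed): the product of the factors (A – λᵢI), taken over the diagonal entries λᵢ, is the zero matrix.

# Product of (A - λi I) over the diagonal entries of a triangular matrix (sympy assumed).
import sympy as sp

A = sp.Matrix([[2, 0, 0],
               [5, 3, 0],
               [1, 4, 3]])           # lower triangular; eigenvalues 2, 3, 3

S = sp.eye(3)
for lam in [A[i, i] for i in range(3)]:
    S = (A - lam * sp.eye(3)) * S    # accumulate the product of the factors

print(S == sp.zeros(3, 3))           # True: T satisfies a polynomial of degree n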

1. 3. Summary
Suppose we have some set S of objects, with an equivalence relation. A canonical
form is given by designating some objects of S to be "in canonical form", such that every
object under consideration is equivalent to exactly one object in canonical form. In other
words, the canonical forms in S represent the equivalence classes, once and only once. To
test whether two objects are equivalent, it then suffices to test their canonical forms for
equality. A canonical form thus provides a classification theorem and more, in that it not
only classifies every object, but also gives a distinguished (canonical) representative of each class.

Let's look at what it means for the matrix of T to be diagonal. Recall from Block
-II, we get the matrix by choosing a basis B= {v1, v2, . . . , vn}, and then entering the
coordinates of T(v1) as the first column, the coordinates of T(v2) as the second column,
etc. The matrix is diagonal, with entries λ1, λ2, λ3, . . . . , λn, if and only if the chosen
basis has the property that T(vi) = λivi, for 1 ≤ i ≤ n. This leads to the definition of an
eigenvalue and its corresponding eigenvectors. So, diagonalizing a matrix is equivalent to
finding a basis consisting of eigenvectors.


1. 4. Keywords

Canonical form Minimal polynomial


Diagonalizable Simultaneously diagonalizable
Generalized eigenspace Triangular canonical form
Generalized eigenvector

UNIT-2
2. 0. Introduction

The Jordan canonical form describes the structure of an arbitrary linear


transformation on a finite-dimensional vector space over an algebraically closed field.
Here we study only the most basic concepts of linear algebra, with no reference to
determinants or ideals of polynomials.

2. 1. Jordan Canonical form


Basic Facts of Nilpotent Transformations

Fact 1. If V = V1 ⊕ V2 ⊕ . . . . . . . ⊕ Vk, where each subspace Vi is of dimension ni and


is invariant under T, an element of A(V) , then a basis of V can be found so that the matrix
of T in this basis is of the form
⎡A₁  0  ⋯  0 ⎤
⎢0   A₂ ⋯  0 ⎥
⎢⋮        ⋱   ⎥
⎣0   0  ⋯  Aₖ⎦,
where each Aᵢ is an nᵢ × nᵢ matrix
and is the matrix of the linear transformation induced by T on Vi.
Fact 2. If T ∈ A(V) is nilpotent, then a0 + a1T + . . . . . . . + amTm , where the ai ∈ F, is
invertible if a0 ≠ 0.
Notation. Mₛ will denote the s × s matrix
Mₛ = ⎡0 1 0 ⋯ 0 0⎤
     ⎢0 0 1 ⋯ 0 0⎥
     ⎢⋮           ⋮⎥
     ⎢0 0 0 ⋯ 0 1⎥
     ⎣0 0 0 ⋯ 0 0⎦,
all of whose entries are 0 except on the superdiagonal, where they are all 1's.


Definition 2. 1. 1. If T ∈ A(V) is nilpotent, then k is called the index of nilpotence of T if


Tk = 0 but Tk – 1 ≠ 0.

Fact 3. If T∈A(V) is nilpotent, of index of nilpotence n₁, then a basis of V can be found
such that the matrix of T in this basis has the form
⎡M_{n1}   0     ⋯   0    ⎤
⎢0      M_{n2}  ⋯   0    ⎥
⎢⋮                   ⋮    ⎥
⎣0        0     ⋯  M_{nr}⎦,
where n₁ ≥ n₂ ≥ · · · ≥ nᵣ and where n₁ + n₂ + · · · + nᵣ = dim(V). Here, the
integers n₁, n₂, . . . , nᵣ are called the invariants of T.

Definition 2.1.2. If T∈A(V) is nilpotent, the subspace M of V, of dimension m, which
is invariant under T, is called cyclic with respect to T if
(i) Tᵐ(M) = (0) and Tᵐ⁻¹(M) ≠ (0);
(ii) there is an element z∈M such that z, T(z), . . . , Tᵐ⁻¹(z) form a basis of M.

Note.
1. Let V1 be a subspace of vector space V over F invariant under T∈A(V). Then T
induces a linear transformation T1∈A(V1) given by T1 (u)= T(u) for all u∈V1.
For any q(x) ∈ F[x] , q(T) ∈ A(V), the linear transformation induced by q(T) on V1
is q(T1), i.e., q(T)⏐V1 = q(T1), where T1 = T⏐V1.
Further, q(T) = 0 ⇒ q(T1) = 0. That is T1 satisfies any polynomial satisfied by T.

Lemma 2.1.1. Suppose that V = V₁ ⊕ V₂, where V₁ and V₂ are subspaces of V invariant
under T. Let T₁ and T₂ be the linear transformations induced by T on V₁ and V₂,
respectively. If the minimal polynomial of T₁ over F is p₁(x) and that of T₂ is p₂(x), then
the minimal polynomial for T over F is the least common multiple of p₁(x) and p₂(x).
Proof. Let p(x) be the minimal polynomial of T .
If q(x) = Least common multiple of { p1(x) , p2(x)}, then p1(x)⏐q(x) and p2(x)⏐q(x).
Since p(T) = 0, we get p(T1) = 0 and p(T2) = 0
But p1(T1) = 0 and p2(T2) =0 follows from the fact that p1(x) and p2(x) are minimal
polynomial of T1 and T2.
There fore, p1(x)⏐p(x) and p2(x)⏐p(x)


⇒ Least common multiple of { p1(x) , p2(x)}⏐p(x) ⇒ q(x)⏐p(x) → (1)


On the other hand, p1(T1) = 0 gives p1(T1) (v1) = 0 for any v1∈V1.
But then p1(x)⏐q(x) ⇒ q(T1) (v1) = 0
similarly p2(x)⏐q(x) ⇒ q(T2)(v2) = 0 for any v2∈V2.
For any v∈V, we have v = v1+v2, where v1∈V1, v2∈V2.
Therefore, q(T) v = q(T) v1 + q(T) v2
= q(T1) v1 + q(T2) v2
q(T) v = 0 for any v∈V.
Therefore, q(T) = 0, that is q(x) is satisfied by the linear operator T∈A(V), since p(x) is
the minimal polynomial of T , we get p(x)⏐q(x) → (2)
From (1) and (2) , we get q(x) = p(x).
Hence p(x) = Least common multiple of {p1(x) , p2(x)}.

Corollary. If V = V₁ ⊕ V₂ ⊕ · · · ⊕ Vₖ, where each Vᵢ is invariant under T, and if pᵢ(x) is
the minimal polynomial over a field F of Tᵢ, the linear transformation induced by T on Vᵢ,
then the minimal polynomial of T over the field F is the least common multiple of
p₁(x), p₂(x), . . . , pₖ(x).
Note. In what follows, we use
V₁ = {v∈V : q₁(T)^{l₁}(v) = 0},
V₂ = {v∈V : q₂(T)^{l₂}(v) = 0},
. . .
Vₖ = {v∈V : qₖ(T)^{lₖ}(v) = 0}
as subspaces of V, where q₁(x), q₂(x), . . . , qₖ(x) are distinct irreducible polynomials and
l₁, l₂, . . . , lₖ are positive integers. Then we can verify that each Vᵢ is invariant under T:
for u∈Vᵢ, we have qᵢ(T)^{lᵢ}(T(u)) = T(qᵢ(T)^{lᵢ}(u)) = T(0) = 0, since qᵢ(T)^{lᵢ}(u) = 0.
Therefore, Vᵢ is invariant under T for all i = 1, 2, . . . , k.

Theorem 2.1.2. Let Vᵢ = {v∈V : qᵢ(T)^{lᵢ}(v) = 0} for i = 1, 2, . . . , k be the
invariant subspaces of V under T described above. Then Vᵢ ≠ {0},
V = V₁ ⊕ V₂ ⊕ · · · ⊕ Vₖ, and the minimal polynomial of Tᵢ on Vᵢ is qᵢ(x)^{lᵢ}.


Proof. If k = 1, then V = V₁ and there is nothing to prove, so suppose that k > 1.
To prove Vᵢ ≠ {0} for all i = 1, 2, . . . , k, we define the polynomials
h₁(x) = q₂(x)^{l₂}q₃(x)^{l₃} · · · qₖ(x)^{lₖ},
h₂(x) = q₁(x)^{l₁}q₃(x)^{l₃} · · · qₖ(x)^{lₖ},
. . .
hᵢ(x) = ∏_{j≠i} qⱼ(x)^{lⱼ},
. . .
hₖ(x) = q₁(x)^{l₁}q₂(x)^{l₂} · · · q_{k–1}(x)^{l_{k–1}}.
If p(x) = q₁(x)^{l₁}q₂(x)^{l₂} · · · qₖ(x)^{lₖ} is the minimal polynomial of T∈A(V),
then hᵢ(x) ≠ p(x), and hence hᵢ(T) ≠ 0.
Therefore, hᵢ(T)(v) ≠ 0 for some v∈V. That is, w ≠ 0, where w = hᵢ(T)(v).
However, qᵢ(T)^{lᵢ}(w) = qᵢ(T)^{lᵢ}(hᵢ(T)(v)) = p(T)(v) = 0(v) = 0, since p(T) = qᵢ(T)^{lᵢ}hᵢ(T).
Thus w∈Vᵢ, where w ≠ 0.
Therefore, Vᵢ ≠ {0} for i = 1, 2, . . . , k.
We also note, from the above argument, that hᵢ(T)V ≠ {0} lies in Vᵢ. That is, hᵢ(T)V ⊂ Vᵢ.
Further, for j ≠ i we have qⱼ(x)^{lⱼ} ⏐ hᵢ(x).
Therefore, for vⱼ∈Vⱼ, we have hᵢ(T)(vⱼ) = 0.
We now show that V = V1+ V2+……+Vk
The k polynomials h1(x), h2(x) ,………, hk(x) are relatively prime.
Hence we can find the polynomials a1(x), a2(x), .….., ak(x) in F[x] such that
1 = a1(x) h1(x) + a2(x) h2(x) +……….. + ak(x) hk(x)
1 = a1(T) h1(T) + a2(T) h2(T) +……….. + ak(T) hk(T)
v = a1(T) h1(T) (v) + a2(T) h2(T) (v) +……….. + ak(T) hk(T) (v) for v∈V.
But ai(T) (v) ∈V for all i ⇒ ai(T) hi(T) (v)∈ hi(T)V ⊂Vi.
⇒ ai(T) hi(T) (v)∈Vi for all i = 1,2…. k.
Hence from the above expression , we get
v = v1 + v2 +………. + vk , where vi = ai(T) hi(T) (v) for all i = 1,2… k.
Thus V = V1 + V2 +……. + Vk.


We must now verify that this sum is a direct sum. To show this, it is enough to prove that
if u₁ + u₂ + · · · + uₖ = 0 with each uᵢ∈Vᵢ, then each uᵢ = 0. Suppose, on the contrary, that
u₁ + u₂ + · · · + uₖ = 0 where some uᵢ ≠ 0, say u₁ ≠ 0.
Then h₁(T)(u₁ + u₂ + · · · + uₖ) = 0 gives
h₁(T)(u₁) + · · · + h₁(T)(uₖ) = 0, where h₁(T)(uⱼ) = 0 for all j ≠ 1, so h₁(T)(u₁) = 0.
But u₁∈V₁ ⇒ q₁(T)^{l₁}(u₁) = 0.
Now q₁(x)^{l₁} is relatively prime to h₁(x), which implies
1 = b₁(x)h₁(x) + b₂(x)q₁(x)^{l₁} for some b₁(x), b₂(x)∈F[x]
⇒ 1 = b₁(T)h₁(T) + b₂(T)q₁(T)^{l₁}
⇒ u₁ = b₁(T)h₁(T)(u₁) + b₂(T)q₁(T)^{l₁}(u₁)
= b₁(T)(h₁(T)(u₁)) + b₂(T)(q₁(T)^{l₁}(u₁))
= b₁(T)(0) + b₂(T)(0) = 0.
This is a contradiction to the fact that u₁ ≠ 0.
Therefore, u₁ + u₂ + · · · + uₖ = 0 ⇒ uᵢ = 0 for all i.
Therefore, V = V₁ ⊕ V₂ ⊕ · · · ⊕ Vₖ.
Finally, we show that the minimal polynomial of Tᵢ on Vᵢ is qᵢ(x)^{lᵢ}.
By the definition of Vᵢ, we have qᵢ(T)^{lᵢ}(Vᵢ) = 0, so qᵢ(Tᵢ)^{lᵢ} = 0.
Therefore, the minimal polynomial of Tᵢ is a divisor of qᵢ(x)^{lᵢ};
that is, the minimal polynomial of Tᵢ is qᵢ(x)^{fᵢ}, where fᵢ ≤ lᵢ.
But, by the corollary on the minimal polynomial of a direct sum, the minimal polynomial of T
over F is the least common multiple of {q₁(x)^{f₁}, q₂(x)^{f₂}, . . . , qₖ(x)^{fₖ}},
namely q₁(x)^{f₁}q₂(x)^{f₂} · · · qₖ(x)^{fₖ}.
That is, q₁(x)^{f₁}q₂(x)^{f₂} · · · qₖ(x)^{fₖ} = q₁(x)^{l₁}q₂(x)^{l₂} · · · qₖ(x)^{lₖ}.
This implies that fᵢ = lᵢ for all i. Therefore, the minimal polynomial of Tᵢ is qᵢ(x)^{lᵢ}.
This completes the proof.
Definition 2.1.3. The matrix
⎡λ 1 0 ⋯ 0 0⎤
⎢0 λ 1 ⋯ 0 0⎥
⎢⋮           ⋮⎥
⎢0 0 0 ⋯ λ 1⎥
⎣0 0 0 ⋯ 0 λ⎦,
with λ's on the diagonal, 1's on the superdiagonal, and 0's elsewhere, is called a basic
Jordan block belonging to λ.


Note. The matrix A is said to be in Jordan form if it satisfies following conditions.


(i) It must be in block diagonal form, where each block has a fixed scalar on the
main diagonal and 1’s or 0’s on the super diagonal. These blocks are called
primary blocks of A.
(ii) The scalars for different primary blocks must be distinct.
(iii) Each primary block must be made up of secondary blocks with a scalar on the
diagonal and only 1’s on the superdiagonal. These blocks must be in decreasing
size (moving down the main diagonal).

Example-1. Here are two matrices in Jordan form; the first has two primary blocks,
while the second has three. Writing Jₖ(λ) for the k × k basic Jordan block belonging to λ,
the first matrix is the 9 × 9 block diagonal matrix

J₃(3) ⊕ J₂(3) ⊕ J₂(2) ⊕ J₂(2),   where, for instance, J₃(3) = ⎡3 1 0⎤
                                                              ⎢0 3 1⎥
                                                              ⎣0 0 3⎦,

and the second is the 11 × 11 block diagonal matrix

J₃(3) ⊕ J₁(3) ⊕ J₁(3) ⊕ J₄(2) ⊕ J₁(5) ⊕ J₁(5).

As a point of interest, the first matrix has characteristic polynomial (x–3)⁵(x–2)⁴
and minimal polynomial (x–3)³(x–2)². The second matrix has characteristic polynomial
(x–3)⁵(x–2)⁴(x–5)² and minimal polynomial (x–3)³(x–2)⁴(x–5).

Theorem 2.1.3. Let T∈A(V) have all its distinct characteristic roots, λ₁, λ₂, . . . , λₖ,
in F. Then a basis of V can be found in which the matrix of T is of the form
⎡J₁           ⎤
⎢   J₂        ⎥
⎢      ⋱      ⎥
⎣          Jₖ ⎦,
where each Jᵢ = ⎡Bᵢ₁             ⎤
                ⎢    Bᵢ₂         ⎥
                ⎢        ⋱       ⎥
                ⎣          B_{irᵢ}⎦
and where Bᵢ₁, Bᵢ₂, . . . , B_{irᵢ} are basic Jordan blocks belonging to λᵢ.


Proof. We note that an m × m basic Jordan block belonging to λ is merely λ + Mₘ. By
Fact-1 and Fact-3, we can reduce to the case in which T has only one characteristic root λ,
that is, T = λ + (T − λ); and since T − λ is nilpotent, by Fact-3 there is a basis in which its
matrix is of the form
⎡M_{n1}              ⎤
⎢      M_{n2}        ⎥
⎢             ⋱      ⎥
⎣               M_{nr}⎦.
But then the matrix of T is of the form
⎡λ + M_{n1}                      ⎤     ⎡B₁          ⎤
⎢          λ + M_{n2}            ⎥  =  ⎢   B₂       ⎥
⎢                     ⋱          ⎥     ⎢      ⋱     ⎥
⎣                      λ + M_{nr}⎦     ⎣         Bᵣ ⎦,
using the first remark made in this proof about the relation between a basic Jordan block and
the Mₘ's. This completes the theorem.

Note.
1. In each Jᵢ, the size of Bᵢ₁ ≥ size of Bᵢ₂ ≥ · · · ; when this has been done, the
resulting matrix is called the Jordan canonical form of T.

2. Two linear transformations in A(V) which have all their eigenvalues in F are
similar if and only if they can be brought to the same Jordan form.

Illustrative Example-2. Compute the Jordan canonical form for
A = ⎡1 0  0⎤
    ⎢0 0 –2⎥
    ⎣0 1  3⎦.
Solution. Write A for the given matrix. The characteristic polynomial of A is
(λ–1)²(λ–2). So the two possible minimal polynomials are (λ–1)(λ–2) or the
characteristic polynomial itself.
We find that (A – I) (A – 2I) = 0 so the minimal polynomial is (λ–1) (λ–2), and hence the
invariant factors are λ–1, (λ–1) (λ–2).


The prime power factors of the invariant factors are the elementary divisors: λ–1, λ–1,
λ–2. Finally the Jordan canonical form of A is diagonal with diagonal entries 1, 1, 2.

Note. After determining that the minimal polynomial has all roots in the ground field and
no repeated roots, we can immediately conclude that the matrix is diagonalizable and
therefore the Jordan canonical form is diagonal.
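Illustrative Example-2 can be confirmed with sympy (a sketch only; the entry –2 below is the sign consistent with the stated characteristic polynomial):

# Jordan form of the matrix of Illustrative Example-2 (sympy assumed).
import sympy as sp

A = sp.Matrix([[1, 0,  0],
               [0, 0, -2],
               [0, 1,  3]])

P, J = A.jordan_form()                 # A = P * J * P**-1
print(A.charpoly().as_expr())          # lambda**3 - 4*lambda**2 + 5*lambda - 2 = (λ-1)^2 (λ-2)
print(J)                               # diagonal with entries 1, 1, 2, as in the example
print(sp.simplify(P * J * P.inv() - A))   # the zero matrix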

Illustrative Example-3. Find all possible Jordan forms A for 6 × 6 matrices with
t²(1 – t)² as minimal polynomial.

Solution. The possible characteristic polynomials of A have the same irreducible factors
t and (1 – t), each appearing with exponent at least 2, and have degree 6. Hence we have the
following cases, where Jₖ(λ) denotes the k × k basic Jordan block belonging to λ:

Case (i). The characteristic polynomial is t⁴(t – 1)². The possible Jordan forms are
J₂(0) ⊕ J₂(0) ⊕ J₂(1)   and   J₂(0) ⊕ J₁(0) ⊕ J₁(0) ⊕ J₂(1).
Case (ii). The characteristic polynomial is t³(t – 1)³. The Jordan form is
J₂(0) ⊕ J₁(0) ⊕ J₂(1) ⊕ J₁(1).
Case (iii). The characteristic polynomial is t²(t – 1)⁴. The possible Jordan forms are
J₂(0) ⊕ J₂(1) ⊕ J₂(1)   and   J₂(0) ⊕ J₂(1) ⊕ J₁(1) ⊕ J₁(1).
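One of the candidates can be verified mechanically. The sketch below (our own construction; sympy assumed) confirms that the Case (ii) matrix J₂(0) ⊕ J₁(0) ⊕ J₂(1) ⊕ J₁(1) has minimal polynomial t²(t – 1)²:

# Minimal polynomial check for the Case (ii) Jordan form (sympy assumed).
import sympy as sp

def jordan_block(lam, k):
    return sp.Matrix(k, k, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

A = sp.diag(jordan_block(0, 2), jordan_block(0, 1),
            jordan_block(1, 2), jordan_block(1, 1))
I = sp.eye(6)
Z = sp.zeros(6, 6)

print(A**2 * (A - I)**2 == Z)   # True:  t^2 (t-1)^2 annihilates A
print(A    * (A - I)**2 == Z)   # False: t (t-1)^2 does not
print(A**2 * (A - I)    == Z)   # False: t^2 (t-1) does not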

Illustrative Example-4. Let J be a Jordan block with diagonal entries λ. Then show that
λ is the only eigenvalue, and the associated eigenspace is only 1-dimensional.


Solution. Since J is upper-triangular, it is clear that the only eigenvalue is λ.


Solving Jx = λx gives the equations λxᵢ + xᵢ₊₁ = λxᵢ for i < n (the last equation, λxₙ = λxₙ,
gives no information), from which we see that x₂ = · · · = xₙ = 0, giving the eigenvector (1, 0, . . . , 0).
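This can be confirmed numerically; in the sketch below (λ = 5 and n = 4 are arbitrary illustrative choices; sympy assumed) the Jordan block has a single independent eigenvector:

# Eigenspace of a single Jordan block is one-dimensional (sympy assumed).
import sympy as sp

lam, n = 5, 4
J = sp.Matrix(n, n, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

for value, multiplicity, vectors in J.eigenvects():
    print(value, multiplicity, len(vectors))   # 5 4 1: one eigenvector, namely (1, 0, 0, 0)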

2. 2. Summary
Jordan canonical form of a linear operator on a finite dimensional vector space
is an upper triangular matrix of a particular form called Jordan matrix, representing the
operator on some basis. The form is characterized by the condition that any non-diagonal
entries that are non-zero must be equal to 1, be immediately above the main diagonal (on
the superdiagonal), and have identical diagonal entries to the left and below them. If the
vector space is over a field K, then a basis on which the matrix has the required form
exists if and only if all eigenvalues of M lie in K, or equivalently if the characteristic
polynomial of the operator splits into linear factors over K. This condition is always
satisfied if K is the field of complex numbers. The diagonal entries of the normal form are
the eigenvalues of the operator, with the number of times each one occurs being given by
its algebraic multiplicity. If the operator is originally given by a square matrix M, then its
Jordan normal form is also called the Jordan normal form of M. Any square matrix has a
Jordan normal form if the field of coefficients is extended to one containing all the
eigenvalues of the matrix. In spite of its name, the normal form for a given M is not
entirely unique, as it is a block diagonal matrix formed of Jordan blocks, the order of
which is not fixed; it is conventional to group blocks for the same eigenvalue together,
but no ordering is imposed among the eigenvalues, nor among the blocks for a given
eigenvalue, although the latter could for instance be ordered by weakly decreasing size.
The diagonal form for diagonalizable matrices, for instance normal matrices, is a special
case of the Jordan normal form.

2. 3. Keywords
Generalized eigenspace Jordan form of a linear operator
Generalized eigenvector Jordan form of a matrix
Jordan block Minimal polynomial of a linear operator
Jordan canonical basis Minimal polynomial of a matrix


UNIT-3

3. 0. Introduction
In this unit, we study the minimal polynomial, which plays a vital role in the theory of
canonical forms, particularly in connection with generalized eigenvalues and eigenvectors.

3. 1. Minimal polynomial

Definition 3. 1. 1. The monic polynomial p(x) of minimum degree such that p(T) = 0 is
called the minimal polynomial of T

Note.
1. Let F be a field. Let p(x) and h(x) ∈ F[x] ( or simply, P(F)). Suppose p(x) ≠ 0.
Then we may find q(x) and r(x) ∈ F[x] such that h(x) = q(x)p(x) + r(x), where
either r(x) = 0 or deg (r(x)) < deg(p(x)). This is known as Division Algorithm.
2. If we consider the powers I, T, T², . . . , T^{n²} in the n²-dimensional vector space A(V), we have
(n² + 1) elements, so they cannot be linearly independent. Thus there is some
linear combination a₀I + a₁T + · · · + a_{n²}T^{n²}, not all coefficients zero, that equals the zero transformation.
Hence every T∈A(V) satisfies some polynomial of degree ≤ n². Knowing that there
is some polynomial that T satisfies, we can find a polynomial of minimal degree
that T satisfies, and then we can divide by its leading coefficient to obtain a monic
polynomial.

Theorem 3. 1. 1. Let p(x) be a minimal polynomial of a linear operator T on a finite


dimensional vector space V.
(i) The minimal polynomial of T exists and is unique.
(ii) If p(x) is the minimal polynomial of T and h(x) is any polynomial with h(T) =0,
then p(x) is a factor of h(x).
Proof. Let p(x) be a monic polynomial of minimal degree such that
p(T) = 0. Suppose that h(x) is any polynomial with h(T) = 0. The division algorithm
holds for polynomials with coefficients in the field F, so it is possible to write h(x) =
q(x)p(x) + r(x), where either r(x) = 0 or deg (r(x)) < deg(p(x)). If r(x) ≠ 0, then we have


r(T) = h(T) – q(T)p(T) = 0. We can divide r(x) by its leading coefficient to obtain a
monic polynomial satisfied by T, and then this contradicts the choice of p(x) as a monic
polynomial of minimal degree satisfied by T. We conclude that the remainder r(x) = 0,
so h(x) = q(x)p(x) and p(x) is thus a factor of h(x). If g(x) is another monic polynomial of
minimal degree with g(T) = 0, then by the preceding paragraph p(x) is a factor of g(x),
and g(x) is a factor of p(x). Since both are monic polynomials, this forces g(x) = p(x),
showing that the minimal polynomial is unique.

Example-1. Consider the matrices A₀, A₁ and A₂ above. Now nᵢ(x) = x^{i+1} is a
polynomial such that nᵢ(Aᵢ) = 0. So the minimal polynomial pᵢ(x) must divide x^{i+1} in each
case. From this it is easy to see that the minimal polynomials are in fact nᵢ(x), so that
p₀(x) = x, p₁(x) = x², and p₂(x) = x³.
For a more involved example consider the matrix B = Bₖ(λ)∈M_{k,k}(F), where λ is a scalar
and the entries immediately above the main diagonal (the superdiagonal) are all 1.
First consider T = Bₖ(0) = Bₖ(λ) – λIₖ = B – λIₖ. This matrix is nilpotent: in fact Tᵏ = 0,
but Tᵏ⁻¹ ≠ 0. So if we set g(x) = (x – λ)ᵏ then g(B) = 0. Once again, the minimal
polynomial p(x) of B must divide g(x). So p(x) = (x – λ)ⁱ for some i ≤ k. But since Tᵏ⁻¹ ≠ 0,
in fact i = k, and the minimal polynomial of B is precisely p(x) = (x – λ)ᵏ.

Note.
1. The characteristic and minimal polynomials of a linear transformation have the
same zeros (except for multiplicities).
2. If V is a finite- dimensional vector space over F, then T∈A(V) is invertible if and
only if the constant term of the minimal polynomial for T is not zero.
3. Suppose all the characteristic roots of T∈A(V) are in F. Then the minimal
polynomial of T is q(x) = (x – λ₁)^{l₁}(x – λ₂)^{l₂} · · · (x – λₖ)^{lₖ}, with λᵢ∈F.
Here qᵢ(x) = (x – λᵢ) and Vᵢ = {v∈V : (T – λᵢ)^{lᵢ}(v) = 0}.
So, if all the distinct characteristic roots λ₁, . . . , λₖ of T lie in F, then V can be
written as V = V₁ ⊕ V₂ ⊕ · · · ⊕ Vₖ, where Vᵢ = {v∈V : (T – λᵢ)^{lᵢ}(v) = 0} and
where Tᵢ has only one characteristic root, λᵢ, on Vᵢ.


Example-2. Let T be the linear operator on R² defined by T(a, b) = (2a + 5b, 6a + b) and
let B be the standard ordered basis for R². Then
[T]B = ⎡2 5⎤
       ⎣6 1⎦,
and hence the characteristic polynomial of T is
f(t) = det([T]B – tI) = det ⎡2 – t    5  ⎤
                            ⎣  6    1 – t⎦ = (t – 7)(t + 4).
Thus the minimal polynomial of T is also (t – 7)(t + 4).
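The minimal polynomial can also be computed mechanically, by locating the first linear dependence among the powers I, A, A², . . . (compare the Note above). The helper below is our own sketch (sympy assumed), applied to the matrix of Example-2:

# Minimal polynomial via the first linear dependence among powers of A (sympy assumed).
import sympy as sp

def minimal_polynomial(A, x=sp.Symbol('x')):
    n = A.shape[0]
    powers = [sp.eye(n)]                                  # A**0
    for k in range(1, n + 1):
        powers.append(powers[-1] * A)                     # A**k
        M = sp.Matrix.hstack(*[P.reshape(n * n, 1) for P in powers])
        null = M.nullspace()
        if null:                                          # first dependence found
            c = null[0] / null[0][k]                      # normalize so it is monic
            return sp.Poly([c[k - i] for i in range(k + 1)], x).as_expr()

A = sp.Matrix([[2, 5], [6, 1]])           # the matrix of Example-2
print(sp.factor(minimal_polynomial(A)))   # (x - 7)*(x + 4), as computed above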

Theorem 3. 1. 2. The linear transformation T is diagonalizable if and only if its minimal


polynomial is a product of distinct linear factors.
Proof. If T is diagonalizable, we can compute its minimal polynomial using a diagonal
matrix, and then it is clear that we just need one linear factor for each of the distinct
entries along the diagonal.
Conversely, suppose that the minimal polynomial of T is p(x) = (x – λ₁)(x – λ₂) · · · (x – λₖ)
with λ₁, . . . , λₖ distinct. Then V is the direct sum of the null spaces n(T – λᵢIV), which
shows that there exists a basis for V consisting of eigenvectors.

Illustrative Example-3. Let V = F₃[x] be the space of all polynomials of degree at most
3, and let T: V → V be the linear transformation given by T(f) = f′. Find the minimal
polynomial of T.
Solution. A basis for V is {1, x, x², x³}, and T(1) = 0, T(x) = 1, T(x²) = 2x and
T(x³) = 3x².
Hence the matrix of T is
A = ⎛0 0 0 0⎞
    ⎜1 0 0 0⎟
    ⎜0 2 0 0⎟
    ⎝0 0 3 0⎠.
By direct computation, A⁴ = 0 and A³ ≠ 0. Hence the minimal polynomial of T is x⁴.

Note.

1. Let µ be a nonempty set of elements in A(V). The subspace W⊂V is said to be
invariant under µ if for every M∈µ, M(W) ⊂ W.
2. The nonempty set µ of linear transformations in A(V) is called an irreducible set
if the only subspaces of V invariant under µ are {0} and V. If µ is an irreducible
set of linear transformations on V, then D = {T∈A(V): TM = MT for all
M∈µ} is a division ring.

Illustrative Example-4. Let F be the field of real numbers and let
M = ⎡ 0  1⎤
    ⎣–1  0⎦ ∈ F₂.
Prove that the set µ consisting only of M is irreducible. Further, find the
set D of all matrices commuting with M, where D = {T∈A(V): TM = MT for all
M∈µ}.

Solution. Write [a b; c d] for the 2 × 2 matrix with rows (a, b) and (c, d), so that
M = [0 1; –1 0]. For T = [a b; c d], the condition MT = TM reads
[c d; –a –b] = [–b a; –d c].
Here –b = c and a = d. Therefore T commutes with M if and only if T = [a b; –b a]
for some a, b∈R. Hence
D = { [a b; –b a] : a, b∈R }
is the set of all matrices which commute with M.
Define a map φ: D → C by φ([a b; –b a]) = a + ib. We claim that φ is a field isomorphism.
First,
φ([a b; –b a] + [c d; –d c]) = φ([a+c  b+d; –(b+d)  a+c])
= (a + c) + i(b + d)
= (a + ib) + (c + id)
= φ([a b; –b a]) + φ([c d; –d c]).
Therefore, φ preserves addition.
Now,
φ([a b; –b a][c d; –d c]) = φ([ac–bd  ad+bc; –(ad+bc)  ac–bd])
= (ac – bd) + i(ad + bc)
= (ac + iad) + (ibc – bd)
= a(c + id) + ib(c + id)
= (a + ib)(c + id)
= φ([a b; –b a]) φ([c d; –d c]).
Therefore, φ preserves multiplication. Since φ is clearly one-one and onto, φ is a field
isomorphism; hence D is a field isomorphic to the field C of complex numbers.
To see that µ is irreducible, first compute the characteristic roots of M:
det(M – λI) = det [–λ 1; –1 –λ] = λ² + 1,
so λ² + 1 = 0 and λ = ±√–1 = ±i.
Therefore, the minimal polynomial of M is (x + i)(x – i) = x² + 1, which has no real root.
Recall that µ ⊂ A(V) is an irreducible set if M(W) ⊂ W for every M∈µ implies W = {0} or W = V.
Regard M as the linear transformation T = [0 1; –1 0] : R² → R².
Suppose W is a proper nonzero subspace of V = R² which is invariant under T. Then W is
one-dimensional. Let {w₁} be a basis of W; then {w₁} can be extended to a basis {w₁, w₂} of V.
Since W is invariant under T,
T(w₁) = a₁w₁ + 0·w₂, because T(w₁)∈W = 〈w₁〉, and
T(w₂) = b₁w₁ + b₂w₂.
Therefore the matrix A = [a₁ 0; b₁ b₂] is the matrix of T with respect to {w₁, w₂}, and the
matrix B = [0 1; –1 0] is the matrix of T with respect to the standard basis. Then A and B
are similar, so they have the same characteristic roots.
But B has the complex characteristic roots ±i, while A has the real characteristic roots a₁ and b₂.
This is a contradiction. Hence there is no proper nonzero subspace of V invariant under T;
that is, the set µ is irreducible.

3. 2. Summary

The minimal polynomial records the distinct eigenvalues and the size of the largest
Jordan block corresponding to each eigenvalue. While the Jordan normal form determines the
minimal polynomial, the converse is not true. This leads to the notion of elementary
divisors. The elementary divisors of a square matrix A are the characteristic polynomials
of its Jordan blocks. The factors of the minimal polynomial m are the elementary divisors
of the largest degree corresponding to distinct eigenvalues. The degree of an elementary
divisor is the size of the corresponding Jordan block, therefore the dimension of the
corresponding invariant subspace. If all elementary divisors are linear, A is
diagonalizable.

3. 3. Keywords
Diagonalizable Division Algorithm
Minimal polynomial Monic polynomial

UNIT-4
4. 0. Introduction
Generalizing the notions of eigenvalue and eigenspace, this unit deals with a
canonical form of a linear operator suitable in this context. The one that we study is
called the rational canonical form.

4. 1. Rational canonical form


Lemma 4.1.1. Suppose that a linear transformation T∈A(V) has as minimal
polynomial over F the polynomial p(x) = γ₀ + γ₁x + · · · + γ_{r–1}x^{r–1} + x^r. Suppose,
further, that V, as a module (over F[x], relative to T), is a cyclic module. Then there is a
basis of V over F such that, in this basis, the matrix of T is
⎡ 0    1    0   ⋯   0      ⎤
⎢ 0    0    1   ⋯   0      ⎥
⎢ ⋮                  ⋮      ⎥
⎢ 0    0    0   ⋯   1      ⎥
⎣–γ₀  –γ₁  –γ₂  ⋯  –γ_{r–1}⎦.
Proof. Since V is cyclic relative to T, there exists a vector v∈V such that every element
w∈V is of the form w = f(T)(v) for some f(x)∈F[x]. Now if for some polynomial
h(x)∈F[x], h(T)(v) = 0, then for any w∈V, h(T)(w) = h(T)(f(T)(v)) = f(T)(h(T)(v)) = 0; thus h(T)
annihilates all of V and so h(T) = 0.
But then p(x) | h(x), since p(x) is the minimal polynomial of T. This remark implies that
v, T(v), T²(v), . . . , T^{r–1}(v) are linearly independent over F; for if not, then
a₀v + a₁T(v) + · · · + a_{r–1}T^{r–1}(v) = 0 with a₀, a₁, . . . , a_{r–1} in F.
But then (a₀ + a₁T + · · · + a_{r–1}T^{r–1})(v) = 0, hence by the above discussion
p(x) | (a₀ + a₁x + · · · + a_{r–1}x^{r–1}), which is impossible since p(x) is of degree r,
unless a₀ = a₁ = · · · = a_{r–1} = 0. Since T^r = –γ₀ – γ₁T – · · · – γ_{r–1}T^{r–1}, we
immediately have that T^{r+k}, for k ≥ 0, is a linear combination of 1, T, . . . , T^{r–1}, and
so f(T), for any f(x)∈F[x], is a linear combination of 1, T, . . . , T^{r–1} over F.
Since any w∈V is of the form w = f(T)(v), we get that w is a linear combination of
v, T(v), . . . , T^{r–1}(v). We have proved, in the above two paragraphs, that the elements
v, T(v), . . . , T^{r–1}(v) form a basis of V over F. In this basis, as is immediately
verified, the matrix of T is exactly as claimed.

Definition 4.1.1. If f(x) = γ₀ + γ₁x + · · · + γ_{r–1}x^{r–1} + x^r is in F[x], then the r × r
matrix
⎡ 0    1    0   ⋯   0      ⎤
⎢ 0    0    1   ⋯   0      ⎥
⎢ ⋮                  ⋮      ⎥
⎢ 0    0    0   ⋯   1      ⎥
⎣–γ₀  –γ₁  –γ₂  ⋯  –γ_{r–1}⎦
is called the companion matrix of f(x). We write it as C(f(x)).

Note.

1. If V is cyclic relative to T and if the minimal polynomial of T in F[x] is p(x) then


for some basis of V the matrix of T is C(p(x)).
2. The matrix C(f(x)), for any monic f(x) in F[x], satisfies f(x) and has f(x) as its
minimal polynomial.
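The two facts in this Note can be checked mechanically; the sketch below (the helper companion is our own name, sympy assumed) builds C(f(x)) for a sample monic f(x), and confirms that its characteristic polynomial is f(x) and that it satisfies f(x):

# Companion matrix C(f) of a monic polynomial f, and the checks of the Note (sympy assumed).
import sympy as sp

x = sp.Symbol('x')

def companion(f):
    coeffs = sp.Poly(f, x).all_coeffs()       # [1, γ_{r-1}, ..., γ_1, γ_0]
    gammas = list(reversed(coeffs[1:]))       # [γ_0, γ_1, ..., γ_{r-1}]
    r = len(gammas)
    C = sp.zeros(r, r)
    for i in range(r - 1):
        C[i, i + 1] = 1                       # 1's on the superdiagonal
    for j in range(r):
        C[r - 1, j] = -gammas[j]              # last row: -γ_0, -γ_1, ..., -γ_{r-1}
    return C

f = x**3 + 2*x**2 - 5*x + 7                   # a sample monic polynomial
C = companion(f)
print(sp.expand(C.charpoly(x).as_expr() - f))               # 0: characteristic polynomial is f(x)
print(C**3 + 2*C**2 - 5*C + 7*sp.eye(3) == sp.zeros(3, 3))  # True: C satisfies f(x)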

Theorem 4.1.2. If T∈A(V) has minimal polynomial p(x) = q(x)^e, where q(x) is a
monic, irreducible polynomial in F[x], then a basis of V over F can be found in which the
matrix of T is of the form
⎡C(q(x)^{e₁})                          ⎤
⎢            C(q(x)^{e₂})              ⎥
⎢                        ⋱             ⎥
⎣                          C(q(x)^{eᵣ})⎦,
where e = e₁ ≥ e₂ ≥ · · · ≥ eᵣ.
Proof. Since V, as a module over F[x], is finitely generated, and since F[x] is Euclidean,
we can decompose V as V = V1 ⊕ V2 ⊕ . . . . . . . ⊕ Vr where the Vi are cyclic modules.
The Vᵢ are thus invariant under T; if Tᵢ is the linear transformation induced by T on Vᵢ, its
minimal polynomial must be a divisor of p(x) = q(x)^e, so is of the form q(x)^{eᵢ}.
We can renumber the spaces so that e₁ ≥ e₂ ≥ · · · ≥ eᵣ.
Now q(T)^{e₁} annihilates each Vᵢ, hence annihilates V, whence q(T)^{e₁} = 0.
Thus e₁ ≥ e; since e₁ is clearly at most e, we get that e₁ = e.
By Lemma 4.1.1, since each Vᵢ is cyclic relative to T, we can find a basis of Vᵢ in which the
matrix of the linear transformation induced by T on Vᵢ is C(q(x)^{eᵢ}). Thus, by Fact-1, a basis
of V can be found so that the matrix of T in this basis is
⎡C(q(x)^{e₁})                          ⎤
⎢            C(q(x)^{e₂})              ⎥
⎢                        ⋱             ⎥
⎣                          C(q(x)^{eᵣ})⎦.

Corollary. If T∈A(V) has minimal polynomial p(x) = q₁(x)^{e_1}q₂(x)^{e_2} · · · qₖ(x)^{e_k}
over a field F, where q₁(x), q₂(x), . . . , qₖ(x) are distinct irreducible
polynomials in F[x], then a basis of V can be found in which the matrix of T is of the
form
⎡R₁           ⎤
⎢   R₂        ⎥
⎢      ⋱      ⎥
⎣          Rₖ ⎦,
where each Rᵢ is the block diagonal matrix with blocks C(qᵢ(x)^{e_{i1}}), C(qᵢ(x)^{e_{i2}}), . . . , C(qᵢ(x)^{e_{ir_i}}),
and where eᵢ = e_{i1} ≥ e_{i2} ≥ · · · ≥ e_{ir_i}.
Proof. By Theorem 2.1.2, V can be decomposed into the direct sum V = V₁ ⊕ V₂ ⊕ · · · ⊕
Vₖ, where each Vᵢ is invariant under T and where Tᵢ, the linear
transformation induced by T on Vᵢ, has minimal polynomial qᵢ(x)^{e_i}. Using Fact-1 and
Theorem 4.1.2 just proved, we obtain the above result. If the degree of qᵢ(x) is dᵢ, note
that the sum of all the dᵢe_{ij} is n, the dimension of V over F.

Definition 4. 1. 2. The matrix of T in the statement of the above corollary is called the
rational canonical form of T.

Definition 4.1.3. The polynomials
q₁(x)^{e_{11}}, q₁(x)^{e_{12}}, . . . , q₁(x)^{e_{1r_1}}, . . . , qₖ(x)^{e_{k1}}, . . . , qₖ(x)^{e_{kr_k}} in F[x]
are called the elementary divisors of T. So, if dim(V) = n, then the characteristic
polynomial of T, pT(x), is the product of its elementary divisors.

Illustrative Example-1. Show that the characteristic polynomial of the companion
matrix C(q(x)) is q(x).
Solution. Let q(x) = a₀ + a₁x + · · · + a_{n–1}x^{n–1} + xⁿ, so that
C(q(x)) = ⎡ 0    1    0   ⋯   0      ⎤
          ⎢ 0    0    1   ⋯   0      ⎥
          ⎢ ⋮                  ⋮      ⎥
          ⎢ 0    0    0   ⋯   1      ⎥
          ⎣–a₀  –a₁  –a₂  ⋯  –a_{n–1}⎦.
The characteristic polynomial is
det(λI – C(q(x))) = det ⎡ λ   –1    0   ⋯    0       ⎤
                        ⎢ 0    λ   –1   ⋯    0       ⎥
                        ⎢ ⋮                   ⋮       ⎥
                        ⎢ 0    0    0   ⋯   –1       ⎥
                        ⎣ a₀   a₁   a₂  ⋯  λ + a_{n–1}⎦.
Add to the first column λ times the second column, λ² times the third column, and so on,
up to λ^{n–1} times the last column. Every entry of the new first column is then 0 except the
last, which becomes a₀ + a₁λ + · · · + a_{n–1}λ^{n–1} + λⁿ = q(λ). Expanding the determinant
along this first column gives
det(λI – C(q(x))) = (–1)^{n+1} q(λ) · (–1)^{n–1} = q(λ),
since the remaining (n – 1) × (n – 1) minor is lower triangular with –1's on the diagonal.
Hence the characteristic polynomial of C(q(x)) is q(λ).

Illustrative Example-2. Deduce from the previous problem that the characteristic
polynomial of T is the product of all elementary divisors of T.
Solution. Under the rational canonical form, the matrix of T is
⎡R₁           ⎤
⎢   ⋱         ⎥
⎣          Rₖ ⎦,
where each Rᵢ is the block diagonal matrix with blocks C(qᵢ(x)^{e_{ij}}), j = 1, 2, . . . , rᵢ.
Then the characteristic polynomial of T is the product of the characteristic polynomials of
the C(qᵢ(x)^{e_{ij}}), which, by the previous problem, is nothing but the product of the
qᵢ(x)^{e_{ij}}. Hence the characteristic polynomial of T is the product of all the elementary divisors.

Illustrative Example-3. Find all possible rational canonical forms for 6 × 6 matrices
with (x – 2)(x + 2)³ as minimal polynomial.
Solution. Since the matrix is of size 6 × 6 and the minimal polynomial is (x – 2)(x + 2)³,
there are three cases for the characteristic polynomial.
Case 1. The characteristic polynomial is (x – 2)(x + 2)⁵. In this case, the rational canonical
forms are C(x – 2) ⊕ C((x + 2)³) ⊕ C((x + 2)²)
or C(x – 2) ⊕ C((x + 2)³) ⊕ C(x + 2) ⊕ C(x + 2).
Case 2. The characteristic polynomial is (x – 2)²(x + 2)⁴. In this case, the rational canonical
form is C(x – 2) ⊕ C(x – 2) ⊕ C((x + 2)³) ⊕ C(x + 2).
Case 3. The characteristic polynomial is (x – 2)³(x + 2)³. In this case, the rational canonical
form is C(x – 2) ⊕ C(x – 2) ⊕ C(x – 2) ⊕ C((x + 2)³).
All these can be written in block matrix form using the matrix form of C(q(x)), the
companion matrix of q(x).

4. 2. Summary

The Jordan canonical form is the one most generally used to prove theorems
about linear transformations and matrices. Unfortunately, it has one distinct, serious
drawback in that it puts requirements on the location of the characteristic roots. Thus we
need a canonical form for elements in A(V) (or in Fₙ) which presumes nothing about
the location of the characteristic roots of its elements, a canonical form and a set of
invariants created in A(V) itself using only its elements and operations. Such a canonical
form, obtained in the above unit, is the rational canonical form.

4. 3. Keywords
Companion matrix Multiplicity of an elementary divisor
End vector of a cycle Rational canonical basis
Generalized eigenspace Rational canonical form
Generalized eigenvector

Exercises

1. Let A and B be two diagonalizable n × n matrices. Prove that A and B are
simultaneously diagonalizable if and only if A and B commute (that is, AB = BA).
0 2 3 4
2. Let A =   , B =  . Find the basis which simultaneously
1   3    2    3
diagonalizes A and B.
Answer. {(1, 1), (1, 2)}.
3. Let T: V→V be a linear transformation with all its characteristic roots in F. Show
that T is diagonalizable if and only if, for every v∈V and λ∈F, (T – λI)ⁿ(v) = 0 implies
(T – λI)(v) = 0.
4. The linear transformation T is diagonalizable if and only if its minimal
polynomial is a product of distinct linear factors.
5. If T, S∈A(V) and if S is regular, then show that T and STS–1 have the same
minimal polynomial.
6. Let A be an n × n real matrix. Then show the minimal polynomial of A is unique.


7. Find the Jordan canonical form of
A = ⎛2 6 −15⎞
    ⎜1 1 −5 ⎟
    ⎝1 2 −6 ⎠.
Answer. The characteristic polynomial of A is –(λ + 1)³; since A + I has rank 1, the Jordan
canonical form consists of one 2 × 2 and one 1 × 1 Jordan block belonging to λ = –1.
⎛1 2 0 0⎞
⎜ ⎟
⎜0 1 2 0⎟
8. Determine the Jordan canonical form for the matrix A= ⎜ .
0 0 1 2⎟
⎜ ⎟
⎜0 0 0 1 ⎟⎠

Answer. The characteristic polynomial of A is (λ – 1)⁴ and A has a single Jordan
block of type (1, 4).
9. Show that the elements S and T in A(V) are similar in A(V) if and only if they have
the same elementary divisors.
⎛1 1 1 1⎞
⎜ ⎟
⎜0 0 0 0⎟
10. Find the rational canonical form of A= ⎜
0 0 −1 0⎟
⎜ ⎟
⎜0 −1 1 0 ⎟⎠

Hint. The characteristic polynomial of A is (x–1) x (x2+x+1), then use the
companion matrix of q(x).

References
1. S. Friedberg, A. Insel, and L. Spence – Linear Algebra, Fourth Edition, PHI, 2009.
2. Jimmie Gilbert and Linda Gilbert – Linear Algebra and Matrix Theory, Academic Press, an imprint of Elsevier, 2010.
3. I. N. Herstein – Topics in Algebra, Vikas Publishing House, New Delhi, 2002.
4. Hoffman and Kunze – Linear Algebra, 2nd Edition, Prentice-Hall of India, 1978.
5. P. R. Halmos – Finite Dimensional Vector Spaces, D. Van Nostrand, 1958.
6. S. Kumaresan – Linear Algebra: A Geometric Approach, Prentice-Hall of India, 2000.


GLOSSARY OF SYMBOLS

Symbols Meaning
Aij The ijth entry of the matrix A
A⁻¹ The inverse of the matrix A
A* The adjoint of the matrix A
AT The transpose of the matrix A
Tr(A) The trace of the matrix A
(A | B) The matrix A augmented by the matrix B
B(x, y) The bilinear form
B (V) The set of all bilinear forms on V

B and A The basis of B and A

B* and A* The dual basis of the basis B and A


C The field of complex numbers
C (R) The vector space of continuous functions on R
det (A) or |A| The determinant of the matrix A
δij The Kronecker delta
dim (V) The dimension of V
ei The ith standard vector of Fn
Eλ The eigenspace of T corresponding to λ
F A field
f (A) The polynomial f(x) evaluated at the matrix A
Fn The set of n - tuples with entries in a field F
f (T) The polynomial f(x) evaluated at the operator T
In or I The n x n identity matrix
Iv or I The identity operator on V
K(x) The quadratic form
Kλ The generalized eigenspace of T corresponding to λ
TA The left - multiplication transformation by matrix A
A (V) The linear transformations from vector space V to V

A (V, W) The space of linear transformations from V to W


Mm x n (F) The set of m x n matrices with entries in F
nullity(T) or n(T) The dimension of the null space of T
P (F) The space of polynomials with coefficients in F
Pn (F) The polynomials in P(F) of degree at most n
φB The standard representation with respect to basis B
R The field of real numbers
rank (A) The rank of the matrix A
rank (T) The rank of the linear transformation T
range (T) or r(T) The range of the linear transformation T
Span (S) The span of the set S
S⊥ The orthogonal complement of the set S
T The linear transformation
T–1 The inverse of the linear transformation T
T* The adjoint of the linear operator T
TT The transpose of linear transformation T
T0 The zero transformation
[T]B The matrix representation of T in basis B

[T ]B, A = [T ]BA The matrix representation of T in basis B and A

TW The restriction of T to a subspace W


V, U, Z The vector spaces
V* The dual space of the vector space V
[x]B The coordinate vector of x relative to B

W1 ⊕W2⊕ . . . . .⊕Wk The direct sum of sub spaces W1 ,W2, . . . . .,Wk


W1 +W2+ . . . . .+Wk The sum of sub spaces W1 ,W2, . . . . . ,Wk
 
 
 
 
 