KARNATAKA STATE OPEN UNIVERSITY
Manasagangotri, Mysore – 570 006
M. Sc. Mathematics
(Semester Scheme)
Math 2.1: Linear Algebra
(II Semester)
Dr. B. CHALUVARAJU, M.Sc., Ph.D.
Assistant Professor (Senior Scale)
Department of Mathematics, Central College Campus,
Bangalore University, Bangalore‐560 001, INDIA
E‐Mail: bcrbub@gmail.com
TABLE OF CONTENTS
• Preface
• Preliminaries
BLOCK I: Vector Spaces, Linear Transformations and Matrix
• Objectives
• UNIT 1:
1. 0. Introduction
1. 1. Vector Spaces
1. 2. Subspaces
1. 3. Linear Combinations and Systems of Linear Equations
1. 4. Linear Dependence and Linear Independence
1. 5. Bases and Dimension
1. 6. Maximal Linearly Independent Subsets
1. 7. Summary
1. 8. Keywords
• UNIT 2:
2. 0. Introduction
2. 1. Linear Transformations, Null Spaces and Ranges
2. 2. The Matrix Representation of a Linear Transformation
2. 3. Composition of Linear Transformations and Matrix Multiplication
2. 4. Invertibility and Isomorphism
2. 5. The Change of Coordinate Matrix
2. 6. The Dual Space
2. 7. Summary
2. 8. Keywords
• UNIT 3:
3. 0. Introduction
3. 1. Elementary Matrix Operations and Elementary Matrices
3. 2. The Rank of a Matrix and Matrix Inverses
3. 3. Systems of Linear Equations
3. 4. Summary
3. 5. Keywords
• UNIT 4:
4. 0. Introduction
4. 1. Properties of Determinants
4. 2. Cofactor Expansions
4. 3. Elementary Operations and Cramer's Rule
4. 4. Summary
4. 5. Keywords
• Exercises
• References
BLOCK II: Diagonalization and Inner Product Spaces
• Objectives
• UNIT 1:
1. 0. Introduction
1. 1. Eigenvalues and Eigenvectors
1. 2. Diagonalizability
1. 3. Invariant Subspaces and the Cayley-Hamilton Theorem
1. 4. Summary
1. 5. Keywords
• UNIT 2:
2. 0. Introduction
2. 1. Inner Products and Norms
2. 2. The Gram-Schmidt Orthogonalization Process
2. 3. Orthogonal Complements
2. 4. Summary
2. 5. Keywords
• UNIT 3:
3. 0. Introduction
3. 1. The Adjoint of a Linear Operator
3. 2. Normal and Self-Adjoint Operators
3. 3. Unitary and Orthogonal Operators and Their Matrices
3. 4. Orthogonal Projections and the Spectral Theorem
3. 5. Summary
3. 6. Keywords
• UNIT 4:
4. 0. Introduction
4. 1. Bilinear and Quadratic Forms
4. 2. Summary
4. 3. Keywords
• Exercises
• References
BLOCK III: Canonical Forms
• Objectives
• UNIT 1:
1. 0. Introduction
1. 1. The Diagonal Form
1. 2. The Triangular Form
1. 3. Summary
1. 4. Keywords
• UNIT 2:
2. 0. Introduction
2. 1. The Jordan Canonical Form
2. 2. Summary
2. 3. Keywords
• UNIT 3:
3. 0. Introduction
3. 1. The Minimal Polynomial
3. 2. Summary
3. 3. Keywords
• UNIT 4:
4. 0. Introduction
4. 1. The Rational Canonical Form
4. 2. Summary
4. 3. Keywords
• Exercises
• References
• Glossary of Symbols
Preface
Linear algebra is the mathematical discipline that deals with vectors and matrices and, more generally, with vector spaces and linear transformations. Unlike other parts of mathematics that are frequently invigorated by new ideas and unsolved problems, linear algebra is very well understood. Linear algebra is a very useful subject, and its basic concepts arose and were used in different areas of mathematics and its applications. It is therefore not surprising that the subject had its roots in such diverse fields as number theory (both elementary and algebraic), geometry, abstract algebra (groups, rings, fields, Galois theory), analysis (differential equations, integral equations, and functional analysis), and physics. Among the elementary concepts of linear algebra are linear equations, matrices, determinants, linear transformations, linear independence, dimension, bilinear forms, quadratic forms, and vector spaces. Since these concepts are closely interconnected, several usually appear in a given context (e.g., linear equations and matrices), and it is often impossible to disengage them. Linear algebra has extensive applications in engineering, the natural sciences, computer science, and the social sciences. Nonlinear
mathematical models can often be approximated by linear ones. By 1880, many of the basic results of linear algebra had been established, but they were not part of a general theory. In particular, the fundamental notion of a vector space, within which such a theory would be framed, was absent. It was introduced only in 1888 by Giuseppe Peano, the Italian mathematician known as a founder of symbolic logic. This study material is organized into three blocks: Block I contains vector spaces, linear transformations and matrices; Block II contains diagonalization and inner product spaces; and Block III contains canonical forms. Each block comprises four units.
0. Preliminaries.
This section provides a brief review of the basic concepts, definitions, examples, etc., which will be used in the subsequent blocks. It is assumed that the reader is familiar with elementary algebra.
Definition 0. 3. A field is a set F with at least two elements together with a function F × F → F called addition, denoted (a, b) → a + b, and a function F × F → F called multiplication, denoted (a, b) → ab, which satisfy the following axioms:
(i) (Commutativity) For each a, b ∈ F, a + b = b + a and ab = ba.
(ii) (Associativity) For each a, b, c ∈ F, (a + b) + c = a + (b + c) and (ab)c = a(bc).
(iii) (Identities) There exist two elements 0 and 1 in F such that 0 + a = a and 1a = a for each a ∈ F.
(iv) (Inverses) For each a ∈ F, there exists an element −a ∈ F such that (−a) + a = 0. For each nonzero a ∈ F, there exists an element a⁻¹ ∈ F such that a⁻¹a = 1.
(v) (Distributivity) For each a, b, c ∈ F, a(b + c) = ab + ac.
Examples.
1. The real numbers R, the complex numbers C, and the rational numbers Q are all fields.
2. The set of integers Z is not a field.
Definition 0. 4. An object of the form (a1, a2, . . . , an), where the entries a1, a2, . . . , an are elements of a field F, is called an n-tuple with entries from F.
The vector space of all polynomials with coefficients from F is denoted by P(F) or F[x]. A polynomial p(x) in F[x] is said to be irreducible over F if whenever p(x) = g(x) h(x) with g(x), h(x) ∈ F[x], then one of g(x) or h(x) has degree 0 (that is, is a constant).
Definition 0. 10. The trace of an n × n matrix M, denoted tr(M), is the sum of the diagonal entries of M. That is, tr(M) = M11 + M22 + . . . + Mnn.
Definition 0. 11. A symmetric matrix is a matrix A such that Aᵀ = A, and a skew-symmetric matrix is a matrix A such that Aᵀ = –A.
Definition 0. 12. For any pair of positive integers i and j, the symbol δij is defined by δij = 0 if i ≠ j and δij = 1 if i = j. This symbol is known as the Kronecker delta.
BLOCK – I
Vector Spaces, Linear Transformations and Matrix
Objectives
After studying this block you will be able to:
1. Understand the basic concepts of each unit.
2. Study the importance of vector spaces, linear transformations and matrices.
• Explain the defining properties of the above.
• Give examples of each concept.
• Describe some important theorems along with their proofs.
• Work through some illustrative examples.
UNIT-1
1. 0. Introduction
A vector space is a mathematical structure formed by a collection of vectors: objects that may be added together and multiplied ("scaled") by numbers, called scalars. Scalars are often taken to be real numbers, but one may also consider vector spaces with scalar multiplication by complex numbers, rational numbers, or even more general fields. The operations of vector addition and scalar multiplication have to satisfy certain requirements, called axioms.
This unit mainly deals with the standard properties of vector spaces, subspaces, linear combinations and systems of linear equations, linear dependence and linear independence, bases and dimension, and maximal linearly independent subsets, along with some illustrative examples and theorems.
1. 1. Vector Spaces
Definition 1. 1. 1. A vector space V over a field F is an abelian group with a scalar product a⋅v or av defined for all a ∈ F and all v ∈ V, satisfying the following axioms:
(i) a(bv) = (ab)v
(ii) (a + b)v = av + bv
(iii) a(u + v) = au + av
(iv) 1v = v,
where a, b ∈ F and u, v ∈ V.
Note.
1. The elements of V are called vectors, the elements of F are called scalars.
2. To differentiate between the scalar zero and the vector zero or null vector, we will
write them as 0 and 0, respectively.
Theorem 1. 1. 1. Let V be a vector space over a field F and 0 be the zero vector in V. Then each of the following statements is true:
(i) 0v = 0 for each v ∈ V.
(ii) a0 = 0 for each a ∈ F.
(iii) (−a)v = −(av) = a(−v) for each a ∈ F and each v ∈ V.
(iv) If av = 0, then a = 0 or v = 0.
1. 2. Subspaces
Illustrative Example - 2. Let R be the field of real numbers and S be the set of all solutions of the equation x + y + 2z = 0. Show that S is a subspace of R³.
Solution. We have S = {(x, y, z) : x + y + 2z = 0, x, y, z ∈ R}.
Clearly, 0 + 0 + 2 × 0 = 0, so (0, 0, 0) satisfies the equation x + y + 2z = 0.
Therefore, (0, 0, 0) ∈ S
⇒ S is a non-empty subset of R³.
Let u = (x1, y1, z1) and v = (x2, y2, z2) be any two elements of S.
Then x1 + y1 + 2z1 = 0 and x2 + y2 + 2z2 = 0.
Let a, b be any two elements of R. Then
au + bv = a(x1, y1, z1) + b(x2, y2, z2)
⇒ au + bv = (ax1 + bx2, ay1 + by2, az1 + bz2).
Now (ax1 + bx2) + (ay1 + by2) + 2(az1 + bz2) = a(x1 + y1 + 2z1) + b(x2 + y2 + 2z2)
= a × 0 + b × 0 = 0.
Therefore, au + bv = (ax1 + bx2, ay1 + by2, az1 + bz2) ∈ S.
Hence, S is a subspace of R³.
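As a supplementary check (not part of the original text), the closure argument above can be spot-checked numerically with a short NumPy sketch; the vectors u and v below are illustrative choices satisfying x + y + 2z = 0.

```python
import numpy as np

# Sketch: numerically spot-check that S = {(x, y, z) : x + y + 2z = 0}
# is closed under linear combinations, as proved above.
normal = np.array([1.0, 1.0, 2.0])   # coefficients of x + y + 2z

def in_S(v, tol=1e-12):
    """True if v satisfies x + y + 2z = 0 (up to rounding)."""
    return abs(normal @ v) < tol

u = np.array([1.0, 1.0, -1.0])       # 1 + 1 + 2(-1) = 0, so u is in S
v = np.array([2.0, -4.0, 1.0])       # 2 - 4 + 2(1) = 0, so v is in S
a, b = 3.0, -5.0                     # arbitrary scalars

assert in_S(u) and in_S(v)
assert in_S(a * u + b * v)           # closure: au + bv stays in S
print("au + bv =", a * u + b * v, "lies in S")
```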
Illustrative Example - 3. Let V be the vector space of all real valued continuous functions over the field R of all real numbers. Show that the set S of solutions of the differential equation 2 d²y/dx² − 9 dy/dx + 2y = 0 is a subspace of V.
Solution. We have S = { y : 2 d²y/dx² − 9 dy/dx + 2y = 0 }, where y = f(x).
Clearly, y = 0 satisfies the given differential equation.
Therefore, 0 ∈ S ⇒ S ≠ φ.
Let y1, y2 ∈ S. Then y1, y2 are solutions of 2 d²y/dx² − 9 dy/dx + 2y = 0
⇒ 2 d²y1/dx² − 9 dy1/dx + 2y1 = 0 and 2 d²y2/dx² − 9 dy2/dx + 2y2 = 0.
Let a, b ∈ R and y = ay1 + by2. Then
2 d²y/dx² − 9 dy/dx + 2y = 2 d²(ay1 + by2)/dx² − 9 d(ay1 + by2)/dx + 2(ay1 + by2)
⇒ 2 d²y/dx² − 9 dy/dx + 2y = 2(a d²y1/dx² + b d²y2/dx²) − 9(a dy1/dx + b dy2/dx) + 2(ay1 + by2)
⇒ 2 d²y/dx² − 9 dy/dx + 2y = a(2 d²y1/dx² − 9 dy1/dx + 2y1) + b(2 d²y2/dx² − 9 dy2/dx + 2y2)
⇒ 2 d²y/dx² − 9 dy/dx + 2y = a × 0 + b × 0 = 0.
Therefore, y = ay1 + by2 ∈ S for all y1, y2 ∈ S and a, b ∈ R.
Hence, S is a subspace of V.
1. 3. Linear Combinations and Systems of Linear Equations
Definition 1. 3. 1. Let V be a vector space over a field F. A vector v ∈ V is called a linear combination of vectors u1, u2, . . . , un in V if there exist scalars a1, a2, . . . , an in F such that
v = a1 u1 + a2 u2 + a3 u3 + . . . + an un.
In this case we also say that v is a linear combination of u1, u2, . . . , un and call a1, a2, . . . , an the coefficients of the linear combination.
Note. In any vector space V, 0v = 0 for each v ∈ V. Thus the zero vector is a linear
combination of any nonempty subset of V.
Note.
1. For convenience, we define span ( φ ) = {0}.
2. In R3, for instance, the span of the set {(1, 0, 0), (0, 1, 0)} consists of all vectors
in R3 that have the form a(1, 0, 0) + b(0, 1, 0) = (a, b, 0) for some scalars a and b.
Thus the span of {(1, 0, 0), (0, 1, 0)} contains all the points in the xy-plane. In
this case, the span of the set is a subspace of R3. This fact is true in general.
Example - 1. The vectors (1, 1, 0), (1, 0, 1) and (0, 1, 1) generate R³, since an arbitrary vector (a1, a2, a3) in R³ is a linear combination of the three given vectors; in fact, the scalars r, s and t for which r(1, 1, 0) + s(1, 0, 1) + t(0, 1, 1) = (a1, a2, a3) are r = ½ (a1 + a2 – a3), s = ½ (a1 – a2 + a3), and t = ½ (–a1 + a2 + a3).
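As a supplementary check of Example 1 (not part of the original text), the scalars r, s, t can be found numerically and compared with the closed-form coefficients above; the target vector is an illustrative choice.

```python
import numpy as np

# Sketch: solve M·(r, s, t) = a, where the columns of M are the given vectors.
M = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]], dtype=float).T   # columns: (1,1,0), (1,0,1), (0,1,1)

a = np.array([5.0, -1.0, 2.0])             # an arbitrary vector (a1, a2, a3)
print(np.linalg.solve(M, a))               # [ 1.  4. -2.]
# Closed form: r = (5 - 1 - 2)/2 = 1, s = (5 + 1 + 2)/2 = 4, t = (-5 - 1 + 2)/2 = -2.
```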
This is equivalent to
⎡1 1 2⎤ ⎡x⎤   ⎡ 1⎤
⎢0 1 −3⎥ ⎢y⎥ = ⎢−3⎥   (applying R2 → R2 – R1, R3 → R3 – R1)
⎣0 2 −1⎦ ⎣z⎦   ⎣ 4⎦
or
⎡1 1 2⎤ ⎡x⎤   ⎡ 1⎤
⎢0 1 −3⎥ ⎢y⎥ = ⎢−3⎥   (applying R3 → R3 – 2R2)
⎣0 0 5⎦ ⎣z⎦   ⎣10⎦
1. 4. Linear Dependence and Linear Independence
Illustrative Example - 1. Show that the set S = {(1, 3, –4, 2), (2, 2, –4, 0), (1, –3, 2, –4), (–1, 0, 1, 0)} in R⁴ is linearly dependent, and express one of the vectors in S as a linear combination of the other vectors in S.
Solution. To show S is linearly dependent, we must find scalars a1, a2, a3 and a4, not all zero, such that a1(1, 3, –4, 2) + a2(2, 2, –4, 0) + a3(1, –3, 2, –4) + a4(–1, 0, 1, 0) = 0.
Finding such scalars amounts to finding a nonzero solution to the system of linear equations
a1 + 2a2 + a3 – a4 = 0
3a1 + 2a2 – 3a3 = 0
–4a1 – 4a2 + 2a3 + a4 = 0
2a1 – 4a3 = 0.
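As a supplementary sketch (not part of the original text), a nonzero solution of this homogeneous system can be found numerically from the null space of the coefficient matrix.

```python
import numpy as np

# Sketch: find a nonzero solution of the homogeneous system above via the SVD.
A = np.array([[ 1,  2,  1, -1],   #  a1 + 2a2 +  a3 -  a4 = 0
              [ 3,  2, -3,  0],   # 3a1 + 2a2 - 3a3       = 0
              [-4, -4,  2,  1],   # -4a1 - 4a2 + 2a3 + a4 = 0
              [ 2,  0, -4,  0]],  # 2a1 - 4a3             = 0
             dtype=float)

# Right-singular vectors with (near-)zero singular values span the null space.
_, s, Vt = np.linalg.svd(A)
null_vec = Vt[-1]                                # smallest singular value
null_vec /= null_vec[np.abs(null_vec).argmax()]  # scale for readability
print("singular values:", np.round(s, 6))
print("null-space vector (a1, a2, a3, a4):", np.round(null_vec, 6))
# One exact solution is (a1, a2, a3, a4) = (4, -3, 2, 0), i.e.
# 4·v1 - 3·v2 + 2·v3 = 0, so v1 = (3/4)·v2 - (1/2)·v3.
```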
Illustrative Example - 2. Let v1, v2, v3 be vectors in a vector space V(F), and let λ1, λ2 ∈ F. Show that the set {v1, v2, v3} is linearly dependent if the set {v1 + λ1v2 + λ2v3, v2, v3} is linearly dependent.
Solution. If the set {v1 + λ1 v2 + λ2 v3, v2, v3} is linearly dependent, then there exist
scalars λ, µ, γ ∈F (not all zero) such that λ (v1 + λ1 v2 + λ2 v3) + µ v2 + γ v3 = 0V
⇒ λ v1 + (λ λ1 + µ) v2 + (λ λ2+ γ )v3 = 0V → (1)
The set {v1, v2, v3} will be linearly dependent if in (1) at least one of the scalar coefficients
is non-zero. If λ ≠ 0, then the set will be linearly dependent whatever may be the values
of µ and γ. But, if λ = 0, then at least one of µ and γ should not be equal to zero and
hence at least one of λ λ1 + µ and λ λ2+ γ will not be zero.
Since λ = 0 ⇒ λ λ1 + µ = µ and λ λ2+ γ = γ. Hence, from (1), we find that the scalars
λ, λ λ1 + µ, λ λ2+ γ are not all zero. Consequently, the set {v1, v2, v3} is linearly
dependent.
Note. The following facts about linearly independent sets are true in any vector space.
1. The empty set is linearly independent, for linearly dependent sets must be
nonempty.
2. A set consisting of a single nonzero vector is linearly independent. For if {u} is linearly dependent, then au = 0 for some nonzero scalar a.
Thus, u = a⁻¹(au) = a⁻¹0 = 0, a contradiction.
3. A set is linearly independent if and only if the only representations of 0 as linear combinations of its vectors are trivial representations.
Condition 3 provides a useful method for determining whether a finite set is linearly independent. This technique is illustrated in the following examples.
Illustrative Example - 3. Prove that the set S = {(1, 0, 0, –1), (0, 1, 0, –1),
(0, 0, 1, –1), (0, 0, 0, 1)} is linearly independent.
Solution. We must show that the only linear combination of vectors in S that equals the
zero vector is the one in which all the coefficients are zero.
Suppose that a1, a2, a3 and a4 are scalars such that
a1(1, 0, 0, –1) + a2 (0, 1, 0, –1) + a3(0, 0, 1, –1) + a4(0, 0, 0, 1) = (0, 0, 0, 0).
Equating the corresponding coordinates of the vectors on the left and the right sides of
this equation, we obtain the following system of linear equations.
a1 = 0
a2 = 0
a3 = 0
–a1 – a2 – a3 + a4 = 0
Clearly the only solution to this system is a1 = a2 = a3 = a4 = 0, and so S is linearly
independent.
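As a supplementary check (not part of the original text), the independence of S can be confirmed numerically: placing the vectors as the rows of a matrix, full rank is equivalent to linear independence.

```python
import numpy as np

# Sketch: the vectors of S as rows; rank 4 confirms linear independence.
S = np.array([[1, 0, 0, -1],
              [0, 1, 0, -1],
              [0, 0, 1, -1],
              [0, 0, 0,  1]])
print(np.linalg.matrix_rank(S))  # 4 -> the four vectors are independent
```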
1. 5. Bases and Dimension
Definition 1. 5. 1. A basis A for a vector space V is a linearly independent subset of V that generates V. If A is a basis for V, we also say that the vectors of A form a basis for V.
Example - 1. Recalling that span ( φ ) = {0} and φ is linearly independent, we see that
φ is a basis for the zero vector space.
Example - 2. In Mm×n(F), let Eij denote the matrix whose only nonzero entry is a 1 in the ith row and jth column. Then {Eij : 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis for Mm×n(F).
Example - 3. In Pn(F), the set {1, x, x², . . . , xⁿ} is a basis. This is the standard basis for Pn(F).
Theorem 1. 5. 1. Let V be a vector space and A = {u1, u2, . . . , un} be a subset of V. Then A is a basis for V if and only if each v ∈ V can be uniquely expressed as a linear combination of vectors of A, that is, can be expressed in the form v = a1 u1 + a2 u2 + a3 u3 + . . . + an un for unique scalars a1, a2, a3, . . . , an.
Proof. Let A be a basis for V. If v ∈ V, then v ∈ span (A) because span (A) = V. Thus
v is a linear combination of the vectors of A. Suppose that
v = a1 u1 + a2 u2 + a3 u3 + ………… + an un and
v = b1 u1 + b2 u2 + b3 u3 + ………… + bn un
are two such representations of v. Subtracting the second equation from the first gives
0 = (a1 – b1)u1 + (a2 – b2)u2 + (a3 – b3)u3 + . . . + (an – bn)un.
Since A is linearly independent, it follows that
(a1 – b1) = (a2 – b2) = (a3 – b3) = . . . = (an – bn) = 0.
Hence a1 = b1, a2 = b2, a3 = b3, . . . , an = bn, so v is uniquely expressible as a linear combination of the vectors of the basis A.
Examples - 4.
(i) The vector space {0} has dimension zero.
(ii) The vector space Fn has dimension n.
(iii) The vector space Mm × n(F) has dimension mn.
(iv) The vector space Pn(F) has dimension (n+1).
The following examples show that the dimension of a vector space depends on its field of
scalars.
Examples - 5.
(i) Over the field of complex numbers C, the vector space of complex numbers has dimension 1. (A basis is {1}.)
(ii) Over the field of real numbers R, the vector space of complex numbers has dimension 2. (A basis is {1, i}.)
Note.
1. Let V be finite dimensional vector space over a field F, and let S be a subspace of
V. Then dim (V/S) = dim (V) – dim (S).
2. If S and T are two subspaces of a finite dimensional vector space V over a field F,
then dim (S+T) = dim (S) + dim (T) – dim (S∩T).
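The second note above can be illustrated numerically. The following sketch (a supplement, not part of the original text) uses two concrete subspaces of R³ whose intersection is known.

```python
import numpy as np

# Sketch: check dim(S+T) = dim S + dim T - dim(S ∩ T) for
# S = span{e1, e2} (the xy-plane) and T = span{e2, e3} (the yz-plane).
S = np.array([[1, 0], [0, 1], [0, 0]], dtype=float)  # columns span S
T = np.array([[0, 0], [1, 0], [0, 1]], dtype=float)  # columns span T

dim_S = np.linalg.matrix_rank(S)                     # 2
dim_T = np.linalg.matrix_rank(T)                     # 2
dim_sum = np.linalg.matrix_rank(np.hstack([S, T]))   # dim(S+T) = 3
# Here S ∩ T = span{e2}, so dim(S ∩ T) = 1, and 3 = 2 + 2 - 1 as claimed.
print(dim_sum == dim_S + dim_T - 1)
```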
1. 6. Maximal Linearly Independent Subsets
Example - 1. Let F be the family of all subsets of a nonempty set S. (This family F is called the power set of S.) The set S is easily seen to be a maximal element of F.
Example -2. Let S and T be disjoint nonempty sets, and let F be the union of their power
sets. Then S and T are both maximal elements of F.
Our next result shows that the converse of this statement is also true.
Theorem 1. 6. 1. Let V be a vector space and S a subset that generates V. If A is a maximal linearly independent subset of S, then A is a basis for V.
Proof. Since A is linearly independent, it suffices to show that S ⊆ span(A), for otherwise there exists a v ∈ S such that v ∉ span(A). Recall that if S is a linearly independent subset of a vector space V and v ∈ V is not in S, then S ∪ {v} is linearly dependent if and only if v ∈ span(S). This implies that A ∪ {v} is linearly independent, contradicting the maximality of A. Hence S ⊆ span(A), so span(A) = V and A is a basis for V.
Theorem 1. 6. 2. Let S be a linearly independent subset of a vector space V. Then there exists a maximal linearly independent subset of V that contains S.
Proof. Let F denote the family of all linearly independent subsets of V that contain S. In order to show that F contains a maximal element, we must show that if C is a chain in F, then there exists a member U of F that contains each member of C. We claim that U, the union of the members of C, is the desired set. Clearly U contains each member of C, and so it suffices to prove that U ∈ F (that is, U is a linearly independent subset of V that contains S). Because each member of C is a subset of V containing S, we have S ⊆ U ⊆ V. Thus we need only prove that U is linearly independent. Let u1, u2, u3, . . . , un be in U and a1, a2, a3, . . . , an be scalars such that a1u1 + a2u2 + . . . + anun = 0. Because ui ∈ U for i = 1, 2, 3, . . . , n, there exists a set Ai in C such that ui ∈ Ai. But since C is a chain, one of these sets, say Ak, contains all the others. Thus ui ∈ Ak for i = 1, 2, 3, . . . , n. However, Ak is a linearly independent set; so a1u1 + a2u2 + a3u3 + . . . + anun = 0 implies that a1 = a2 = . . . = an = 0. Hence U is linearly independent, and by the maximal principle F has a maximal element. This element is easily seen to be a maximal linearly independent subset of V that contains S.
1. 7. Summary
In the above unit, we formulated a general definition of a vector space and established its basic properties. A vector space is a set of multidimensional quantities, known as vectors, together with a set of one-dimensional quantities, known as scalars, such that vectors can be added together and vectors can be multiplied by scalars while preserving the ordinary arithmetic properties (associativity, commutativity, distributivity, and so forth). The concept of a linear subspace (or vector subspace) is important in linear algebra and related fields of mathematics. A linear subspace is usually called simply a subspace when the context serves to distinguish it from other kinds of subspaces. A linear combination is an expression constructed from a set of terms by multiplying each term by a constant and adding the results. The concept of linear combinations is central to linear algebra, and a system of linear equations (or linear system) is a collection of linear equations involving the same set of variables. A family of vectors is linearly independent if none of them can be written as a linear combination of finitely many other vectors in the collection. A family of vectors which is not linearly independent is called linearly dependent. A basis is a set of linearly independent vectors that, in a linear combination, can represent every vector in a given vector space or free module, or, more simply put, which defines a "coordinate system". Every vector space has a basis, and all bases of a vector space have the same number of elements, called the dimension of the vector space. In this unit, we studied the dimension theorem, which plays a very vital role in subsequent units. Finally, the maximal principle guarantees the existence of maximal elements in a family of sets; with the help of this maximal property we studied maximal linearly independent subsets.
1. 8. Keywords
Basis, Dimension, Dimension theorem, Finite-dimensional space, Linear combination, Linearly dependent, Linearly independent, Polynomial, Scalar, Scalar multiplication, Span of a subset, Spans, Standard basis, Subspace, System of linear equations, Vector, Vector addition, Vector space.
UNIT-2
2. 0. Introduction
The relation between two vector spaces can be expressed by a linear map or linear transformation. These are functions that reflect the vector space structure; that is, they preserve vector addition and scalar multiplication. A. Cayley formally introduced m × n matrices in two papers in 1850 and 1858 (the term “matrix” was coined by Sylvester in 1850). He noted that they “comport themselves as single entities” and recognized their usefulness in simplifying systems of linear equations and composition of linear transformations. This unit deals with linear transformations, null spaces and ranges, the matrix representation of a linear transformation, composition of linear transformations and matrix multiplication, invertibility and isomorphism, the change of coordinate matrix, and the dual space.
2. 1. Linear Transformations, Null Spaces and Ranges
Definition 2. 1. 1. Let U and V be vector spaces over a field F. Then a map T : U → V is a linear transformation (or a linear map) if T(u + v) = T(u) + T(v) and T(av) = aT(v). The two conditions together can be written in the form T(au + bv) = aT(u) + bT(v) for all a, b ∈ F and u, v ∈ U.
Note.
1. If T is a linear map, then T(0) = 0.
2. T is a linear map if and only if T(au + v) = aT(u) + T(v) for all a ∈ F and u, v ∈ U.
3. If T is a linear map, then T(u – v) = T(u) – T(v) for all u, v ∈ U.
4. T is a linear map if and only if, for u1, u2, . . . , un ∈ U and a1, a2, . . . , an ∈ F, we have T(∑ᵢ₌₁ⁿ aᵢuᵢ) = ∑ᵢ₌₁ⁿ aᵢT(uᵢ).
Note. Let Pn(R) denote the vector space that consists of all polynomials in x with degree n or less and coefficients in R.
Illustrative Example - 2. The mapping T : P1(R) → P2(R) is defined by T(p(x)) = (1+x)p(x) for all p(x) in P1(R). Show that T is a linear transformation.
Solution. Let a, b ∈ R and p(x), q(x) ∈ P1(R), so that T(p(x)), T(q(x)) ∈ P2(R). Then
T(a p(x) + b q(x)) = (1+x)(a p(x) + b q(x)) = a(1+x)p(x) + b(1+x)q(x) = a T(p(x)) + b T(q(x)).
Hence T is a linear transformation.
Note.
1. For vector spaces V and W over a field F, we define the identity transformation I : V → V by I(v) = v for all v ∈ V and the zero transformation T0 : V → W by T0(v) = 0 for all v ∈ V. It is clear that both of these transformations are linear.
2. If S and T are linear transformations of a vector space U into V over a field F, then the sum S + T is defined by (S + T)(u) = S(u) + T(u) for all u in U. Also, for each linear transformation T of U into V and each scalar a in F, we define the product of a and T to be the mapping aT of U into V given by (aT)(u) = a(T(u)).
Theorem 2. 1. 1. Let U and V be vector spaces over the same field F. Then the set of all linear transformations of U into V is a vector space over F.
Proof. Let T1, T2 and T3 denote arbitrary linear transformations of U into V, let u and v be arbitrary vectors in U, and let a, b and c be scalars in F.
Since, (T1+T2) (au+ bv) = T1 (au+ bv) + T2 (au+ bv)
= aT1(u)+ b T1(v) + aT2 (u)+ b T2 (v)
= a[T1(u)+ T2 (u)] + b[T1(v)+ T2(v)]
= a(T1+ T2 )(u) + b(T1+ T2 )(v),
Therefore, T1+ T2 is a linear transformation of U into V.
Addition is associative, since
(T1+ (T2 +T3))(u) = T1(u) + (T2 + T3)(u)
= T1(u) + [T2 (u) + T3(u)]
= [T1(u) +T2 (u)] + T3(u)
= (T1+ T2) (u) + T3(u)
= ((T1+ T2) +T3) (u).
The zero linear transformation Z is an additive identity, since (Z + T1)(u) = Z(u) + T1(u) = 0 + T1(u) = T1(u), so that Z + T1 = T1. The remaining vector space axioms are verified in a similar manner.
Theorem 2. 1. 3. Let T be a linear transformation of a vector space U into a vector space V. If A is a basis of U, then the set T(A) spans T(U).
Proof. Suppose that A = (u1, u2, . . . , un) is a basis of a vector space U, and consider the set T(A) = {T(u1), T(u2), . . . , T(un)}. For any vector v in T(U), there is a vector u in U such that T(u) = v. The vector u can be written as u = ∑ᵢ₌₁ⁿ aᵢuᵢ since A is a basis of U. This gives v = T(∑ᵢ₌₁ⁿ aᵢuᵢ) = ∑ᵢ₌₁ⁿ aᵢT(uᵢ), and T(A) spans T(U).
Theorem 2. 1. 4 (Dimension theorem). Let T be a linear transformation of a finite dimensional vector space U into a vector space V. Then rank (T) + nullity (T) = dim (U).
Proof. Suppose dim(U) = n, and let nullity(T) = k. Choose (u1, u2, . . . , uk) to be a basis of the kernel T⁻¹(0). This linearly independent set can be extended to a basis A = {u1, u2, . . . , uk, uk+1, . . . , un} of U. By Theorem 2. 1. 3, the set T(A) spans T(U). But T(u1) = T(u2) = . . . = T(uk) = 0, so this means that the set of (n–k) vectors {T(uk+1), T(uk+2), . . . , T(un)} spans T(U). To show that this set is linearly independent, suppose that ck+1T(uk+1) + ck+2T(uk+2) + . . . + cnT(un) = 0. Then T(ck+1uk+1 + ck+2uk+2 + . . . + cnun) = 0, and ∑ᵢ₌ₖ₊₁ⁿ cᵢuᵢ is in T⁻¹(0). Thus there are scalars d1, d2, . . . , dk such that ∑ᵢ₌₁ᵏ dᵢuᵢ = ∑ᵢ₌ₖ₊₁ⁿ cᵢuᵢ ⇒ ∑ᵢ₌₁ᵏ dᵢuᵢ − ∑ᵢ₌ₖ₊₁ⁿ cᵢuᵢ = 0. Since A is a basis, each cᵢ and dᵢ must be zero. Hence {T(uk+1), T(uk+2), . . . , T(un)} is a basis of T(U). Since rank (T) is the dimension of T(U), n – k = rank (T) and rank (T) + nullity (T) = n = dim (U).
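The dimension theorem can be illustrated numerically. The following sketch (a supplement, not part of the original text) uses an arbitrary matrix chosen for illustration.

```python
import numpy as np

# Sketch: a numerical instance of rank(T) + nullity(T) = dim(U) for the
# map T(u) = Au from R^4 to R^3.
A = np.array([[1, 2, 0, 1],
              [0, 1, 1, 1],
              [1, 3, 1, 2]], dtype=float)   # third row = row1 + row2

rank = np.linalg.matrix_rank(A)              # 2
nullity = A.shape[1] - rank                  # 2
print(rank, nullity, rank + nullity)         # 2 2 4 = dim(R^4)
```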
2. 2. The Matrix Representation of a Linear Transformation
Let U and V be vector spaces of dimension n and m, respectively, over the same field F, and let T denote a linear transformation of U into V. Suppose that A = (u1, u2, . . . , un) is a basis of U. Any u in U can be written uniquely in the form u = ∑ⱼ₌₁ⁿ xⱼuⱼ, and T(u) = ∑ⱼ₌₁ⁿ xⱼT(uⱼ). If B = (v1, v2, . . . , vm) is a basis of V, then each T(uⱼ) can be written uniquely as T(uⱼ) = ∑ᵢ₌₁ᵐ aᵢⱼvᵢ. Thus, with each choice of bases A and B, a matrix A = [aᵢⱼ]m×n is determined.
Definition 2. 2. 1. Suppose that A = (u1, u2, . . . , un) and B = (v1, v2, . . . , vm) are bases of U and V, respectively. Let T be a linear transformation of U into V. The matrix of T relative to the bases A and B is the matrix A = [aᵢⱼ]m×n = [T]B,A, where the {aᵢⱼ} are determined by the conditions T(uⱼ) = ∑ᵢ₌₁ᵐ aᵢⱼvᵢ = a₁ⱼv1 + a₂ⱼv2 + . . . + aₘⱼvm for j = 1, 2, . . . , n.
Note. The symbols [T]B,A in the above definition denote the same matrix; the first one places notational emphasis on the elements of the matrix, while the second places emphasis on T and the bases A and B. This matrix A is also referred to as the matrix of T with respect to A and B, and we say that T is represented by the matrix A. Also, the elements aᵢⱼ are uniquely determined by T for given bases A and B. Another way to describe A is to observe that the jth column of A is the coordinate matrix of T(uⱼ) with respect to B. That is,
[T(uⱼ)]B = (a₁ⱼ, a₂ⱼ, . . . , aₘⱼ)ᵀ and A = [T]B,A = ( [T(u1)]B , [T(u2)]B , . . . , [T(un)]B ).
Example. Let T : P2(R) → P3(R) be a linear transformation for which T(1) = 2 + x + x³, T(1–x) = 2 – x² + x³ and T(x²) = 2 + 3x + 2x² + x³. Find the matrix of T relative to the bases A = {1, 1–x, x²} of P2(R) and B = {1, x, 1–x², 1+x³} of P3(R).
Solution. We write T(1) as a linear combination of the vectors in B:
T(1) = 2 + x + x³ = (1)(1) + (1)(x) + (0)(1–x²) + (1)(1+x³).
Then, the first column of A is [T(1)]B = (1, 1, 0, 1)ᵀ.
To find the second column of A, we compute T(1–x) and write it as a linear combination of the vectors in B:
T(1–x) = 2 – x² + x³ = (0)(1) + (0)(x) + (1)(1–x²) + (1)(1+x³).
Then, the second column of A is [T(1–x)]B = (0, 0, 1, 1)ᵀ.
To find the third column of A, we compute T(x²) and write it as a linear combination of the vectors in B:
T(x²) = 2 + 3x + 2x² + x³ = (3)(1) + (3)(x) + (–2)(1–x²) + (1)(1+x³).
Then, the third column of A is [T(x²)]B = (3, 3, –2, 1)ᵀ.
Thus the matrix of T with respect to A and B is given by
A = [T]B,A = ( [T(1)]B , [T(1–x)]B , [T(x²)]B ) =
⎡1 0 3⎤
⎢1 0 3⎥
⎢0 1 −2⎥
⎣1 1 1⎦.
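Finding a coordinate vector relative to a non-standard basis amounts to solving a linear system. The following sketch (a supplement, not part of the original text) recovers the first column of A this way.

```python
import numpy as np

# Sketch: the coordinates of T(1) = 2 + x + x^3 relative to
# B = {1, x, 1 - x^2, 1 + x^3} solve M c = p, where the columns of M
# hold the standard coefficients of the basis polynomials.
M = np.array([[1, 0,  1, 1],    # constant terms
              [0, 1,  0, 0],    # coefficients of x
              [0, 0, -1, 0],    # coefficients of x^2
              [0, 0,  0, 1]],   # coefficients of x^3
             dtype=float)

p = np.array([2, 1, 0, 1], dtype=float)   # 2 + x + 0x^2 + x^3
print(np.linalg.solve(M, p))              # [1. 1. 0. 1.] = first column of A
```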
[T]ε₃,ε₄ =
⎡1 0 −1 1⎤
⎢2 1 0 3⎥
⎣1 2 3 3⎦.
Note.
1. If A is any matrix that represents the linear transformation T of U into V, then rank (A) = rank (T). That is, rank ([T]B,A) = rank (T).
2. The matrix of T relative to the bases A and B is the matrix A = [aᵢⱼ]m×n = [T]B,A.
3. The matrix A = [T]B,A satisfies the equation [T(u)]B = A[u]A for all u in U; conversely, if a matrix A satisfies this equation for all u in U, then A is the matrix of T relative to A and B.
Definition 2. 3. 1. Let U, V and Z be vector spaces over the same field F, and let S:
U→V and T : V → Z be linear transformations. Then the product TS is the mapping
of U into Z (that is TS : U → Z) defined by TS(u) = (T o S)(u) = T(S(u)) for each u in U.
Theorem 2. 3. 2. Suppose that U, V and Z are finite dimensional vector spaces with bases A, B and C, respectively. If S has matrix A relative to A and B, and T has matrix B relative to B and C, then TS has matrix BA relative to A and C.
Proof. Assume that the hypotheses are satisfied with A = (u1, u2, . . . , un), B = (v1, v2, . . . , vm) and C = (w1, w2, . . . , wp), A = [aᵢⱼ]m×n, and B = [bᵢⱼ]p×m. Let C = [cᵢⱼ]p×n be the matrix of TS relative to the bases A and C. For j = 1, 2, . . . , n, we have
∑ᵢ₌₁ᵖ cᵢⱼwᵢ = TS(uⱼ) = T(∑ₖ₌₁ᵐ aₖⱼvₖ) = ∑ₖ₌₁ᵐ aₖⱼT(vₖ) = ∑ₖ₌₁ᵐ aₖⱼ(∑ᵢ₌₁ᵖ bᵢₖwᵢ) = ∑ᵢ₌₁ᵖ (∑ₖ₌₁ᵐ bᵢₖaₖⱼ)wᵢ,
and consequently cᵢⱼ = ∑ₖ₌₁ᵐ bᵢₖaₖⱼ for all values of i and j. Therefore C = BA and the theorem is proved.
TS ( ⎡a b⎤ ) = T ( S ( ⎡a b⎤ ) )
     ⎣c d⎦             ⎣c d⎦
= T(a + 2b, b − 3c, c + d)
= (a + 2b − 2(b − 3c) − 6(c + d)) + (b − 3c + 3(c + d))x
= (a − 6d) + (b + 3d)x.
(ii) Since
S ( ⎡1 0⎤ ) = (1, 0, 0),  S ( ⎡0 1⎤ ) = (2, 1, 0),
    ⎣0 0⎦                     ⎣0 0⎦
S ( ⎡0 0⎤ ) = (0, −3, 1),  S ( ⎡0 0⎤ ) = (0, 0, 1),
    ⎣1 0⎦                     ⎣0 1⎦
the matrix A of S relative to A and B is
A = ⎡1 2 0 0⎤
    ⎢0 1 −3 0⎥
    ⎣0 0 1 1⎦.
Note.
1. In the above example the product AB is not defined, and this is consistent with the fact that ST is not defined.
2. The operations of addition and multiplication of matrices are connected by the distributive property, that is, A(B + C) = AB + AC whenever the sums and products involved are defined.
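The following sketch (a supplement, not part of the original text) verifies [TS] = [T][S] for the example above; the matrix of T is read off from the computation T applied to (a+2b, b−3c, c+d) shown earlier, with the bases {E11, E12, E21, E22}, {e1, e2, e3} and {1, x}.

```python
import numpy as np

# Sketch: matrix of the composition equals the product of the matrices.
A = np.array([[1, 2,  0, 0],
              [0, 1, -3, 0],
              [0, 0,  1, 1]], dtype=float)   # matrix of S
B = np.array([[1, -2, -6],
              [0,  1,  3]], dtype=float)     # matrix of T (read off above)

print(B @ A)
# [[ 1.  0.  0. -6.]
#  [ 0.  1.  0.  3.]]  -> matches TS = (a - 6d) + (b + 3d)x from the text.
```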
Illustrative Example. Let T : R² → R² be the linear transformation defined by T(a, b) = (2a – 3b, a + b), and let B = {v1, v2} = {(1, 0), (0, 1)} and B¹ = {w1, w2} = {(2, 3), (1, 2)} be bases of R². Find the matrix of T relative to the bases B and B¹.
Solution. In order to find the matrix of T relative to the bases B and B¹, we have to express T(v1) and T(v2) as linear combinations of w1 and w2.
For this, we first find the coordinates of an arbitrary vector (a, b) ∈ R² with respect to the basis B¹:
(a, b) = x w1 + y w2
⇒ (a, b) = x(2, 3) + y(1, 2)
⇒ (a, b) = (2x + y, 3x + 2y)
⇒ 2x + y = a and 3x + 2y = b
⇒ x = 2a – b and y = –3a + 2b.
Therefore (a, b) = (2a – b) w1 + (–3a + 2b) w2. → (1)
Now, T(a, b) = (2a – 3b, a + b)
⇒ T(v1) = T(1, 0) = (2, 1) and T(v2) = T(0, 1) = (–3, 1)
⇒ T(v1) = 3w1 – 4w2 [putting a = 2, b = 1 in (1)] and T(v2) = –7w1 + 11w2 [putting a = –3, b = 1 in (1)].
Hence
[T]B,B¹ = ⎡ 3 −7⎤
          ⎣−4 11⎦.
2. 4. Invertibility and Isomorphism
Definition 2. 4. 1. Let V be a vector space over a field F and let Hom (V, V) or A(V) be the set of all linear transformations of V into itself. This is known as the algebra of linear transformations.
That is, T : U → V is a linear transformation on a vector space with U = V.
For T1, T2 ∈ A(V),
(T1 + T2)(v) = T1(v) + T2(v) ⇒ T1 + T2 ∈ A(V)
Note.
1. Let A be an n×n matrix. Then A is invertible if there exists an n×n matrix B such that AB = BA = I; such a B is unique (if C is another such matrix, then C = CI = C(AB) = (CA)B = IB = B). The matrix B is called the inverse of A and is denoted by A⁻¹.
2. If V is a finite dimensional vector space over a field F with ordered bases A and B, then a linear transformation T ∈ A(V) is invertible if and only if [T]B,A is invertible. Furthermore, [T⁻¹]A,B = ([T]B,A)⁻¹.
Note.
1. Let U and V be finite dimensional vector spaces (over the same field F). Then V is isomorphic to U if and only if dim (U) = dim (V).
2. Let V be a finite dimensional vector space over a field F. Then V is isomorphic to Fⁿ if and only if dim (V) = n.
So ψ is an isomorphism of A into A(V).
Theorem 2. 4. 3. Let A be an algebra, with unit element, over a field F, and suppose that
A is of dimension m over F. Then every element in A satisfies some nontrivial
polynomial in F[x] (or Pn(F)) of degree at most m.
Proof. Let e be the unit element of A. For y ∈ A, we have e, y, y², . . . , yᵐ in A. Since A is m-dimensional over the field F, it follows that the (m+1) elements (vectors) e, y, y², . . . , yᵐ are linearly dependent over F. So there exist elements a0, a1, . . . , am in F, not all zero, such that a0e + a1y + . . . + amyᵐ = 0. Therefore y satisfies the nontrivial polynomial q(x) = a0 + a1x + . . . + amxᵐ, of degree at most m, in F[x].
2. 5. The Change of Coordinate Matrix
We now describe the relation between those matrices that represent the same linear transformation. This description is found by examining the effect that a change in the bases of U and V has on the matrix of T.
Theorem 2. 5. 1. Let C = (w1, w2, . . . , wk) and C¹ = (w1¹, w2¹, . . . , wk¹) be two bases of the vector space W over F. For an arbitrary vector w in W, let
[w]C = C = (c1, c2, . . . , ck)ᵀ and [w]C¹ = C¹ = (c1¹, c2¹, . . . , ck¹)ᵀ
denote the coordinate matrices of w relative to C and C¹, respectively. If P = [pᵢⱼ] is the matrix of transition from C to C¹, whose jth column is the coordinate matrix of wⱼ¹ relative to C, then C = PC¹.
Proof. We have w = ∑ⱼ₌₁ᵏ cⱼ¹wⱼ¹ and wⱼ¹ = ∑ᵢ₌₁ᵏ pᵢⱼwᵢ. Combining these equalities, we have
w = ∑ᵢ₌₁ᵏ cᵢwᵢ = ∑ⱼ₌₁ᵏ cⱼ¹ (∑ᵢ₌₁ᵏ pᵢⱼwᵢ) = ∑ᵢ₌₁ᵏ (∑ⱼ₌₁ᵏ pᵢⱼcⱼ¹) wᵢ.
Therefore, cᵢ = ∑ⱼ₌₁ᵏ pᵢⱼcⱼ¹, that is, C = PC¹.
Example - 1. Consider the bases C = {x, 2+x} and C¹ = {4+x, 4–x} of the vector space W = P1(R). Since 4+x = (–1)(x) + (2)(2+x) and 4–x = (–3)(x) + (2)(2+x), the matrix of transition P from C to C¹ is given by
P = ⎡−1 −3⎤
    ⎣ 2  2⎦.
The vector w = 4+3x can be written as 4+3x = 2(4+x) + (–1)(4–x), so [w]C¹ = (2, −1)ᵀ.
By Theorem 2. 5. 1, the coordinate matrix [w]C may be found from
[w]C = P[w]C¹ = ⎡−1 −3⎤ ⎡ 2⎤ = ⎡1⎤
                ⎣ 2  2⎦ ⎣−1⎦   ⎣2⎦.
This result can be checked by using the basis vectors in C: (1)(x) + 2(2+x) = 4+3x = w.
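The following sketch (a supplement, not part of the original text) reproduces this change-of-coordinates computation, representing each polynomial a + bx by the pair (a, b).

```python
import numpy as np

# Sketch: check [w]_C = P [w]_C1 for C = {x, 2+x} and C1 = {4+x, 4-x}.
C  = np.array([[0, 2],     # columns: x -> (0,1), 2+x -> (2,1)
               [1, 1]], dtype=float)
C1 = np.array([[4, 4],     # columns: 4+x -> (4,1), 4-x -> (4,-1)
               [1, -1]], dtype=float)

P = np.linalg.solve(C, C1)       # transition matrix from C to C1
w_C1 = np.array([2.0, -1.0])     # w = 4 + 3x = 2(4+x) - 1(4-x)
print(P)                         # [[-1. -3.], [ 2.  2.]]
print(P @ w_C1)                  # [1. 2.] -> w = 1·x + 2·(2+x), as checked
```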
Definition 2. 5. 1. If A = (u1, u2, . . . , ur) is a set of vectors in Rⁿ and B = (v1, v2, . . . , vs)
Theorem 2. 5. 2. Suppose that T has matrix A = [aᵢⱼ]m×n relative to the bases A of U and B of V, and let A¹ and B¹ be new bases of U and V, respectively. If Q is the matrix of transition from A to A¹ and P is the matrix of transition from B to B¹, then the matrix of T relative to A¹ and B¹ is P⁻¹AQ.
[Figure-1]
Proof. Assume the hypotheses of the theorem are satisfied. Let u be an arbitrary vector in U, let X and X¹ denote the coordinate matrices of u relative to A and A¹, respectively, and let Y and Y¹ denote the coordinate matrices of T(u) relative to B and B¹, respectively. Since the matrix of T relative to the bases A and B is A, we have Y = AX. By Theorem 2. 5. 1, Y = PY¹ and X = QX¹. Substituting for Y and X, we have PY¹ = AQX¹, and therefore Y¹ = (P⁻¹AQ)X¹.
Since a matrix A satisfying [T(u)]B = A[u]A for all u in U is the matrix of the linear transformation T relative to A and B, it follows that P⁻¹AQ is the matrix of T relative to A¹ and B¹.
Theorem 2. 5. 3. Two m×n matrices A and B represent the same linear transformation T of U into V if and only if A and B are equivalent.
Proof. If A and B represent T relative to the sets of bases A, B and A¹, B¹, respectively, then B = P⁻¹AQ, where Q is the matrix of transition from A to A¹ and P is the matrix of transition from B to B¹; hence A and B are equivalent.
To obtain two similar results concerning row and column equivalence, note that requiring B = B¹ is the same as requiring P = Im, and requiring A = A¹ is the same as requiring Q = In. Thus we have the following theorem.
Theorem 2. 5. 4. Two m×n matrices A and B represent the same linear transformation T of U into V with respect to the same basis of U (respectively, of V) if and only if they are row (respectively, column) equivalent.
Illustrative Example - 2. Relative to the basis B = {v1, v2} = {(1, 1), (2, 3)} of R², find the coordinate matrix of (i) v = (4, –3) and (ii) v = (a, b).
Solution. Let x, y ∈ R be such that v = xv1 + yv2 = x(1, 1) + y(2, 3) = (x + 2y, x + 3y).
(i) If v = (4, –3), then v = xv1 + yv2 ⇒ (4, –3) = (x + 2y, x + 3y)
⇒ x + 2y = 4, x + 3y = –3
⇒ x = 18, y = –7.
Hence, the coordinate matrix [v]B of v relative to the basis B is given by [v]B = (18, −7)ᵀ.
(ii) If v = (a, b), then v = xv1 + yv2 ⇒ (a, b) = (x + 2y, x + 3y)
⇒ x + 2y = a, x + 3y = b
⇒ x = 3a – 2b, y = –a + b.
Hence, the coordinate matrix [v]B of v relative to the basis B is given by [v]B = (3a – 2b, –a + b)ᵀ.
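The same coordinates can be obtained numerically. This sketch (a supplement, not part of the original text) solves the small system with the basis vectors as columns.

```python
import numpy as np

# Sketch: coordinates of v relative to B = {(1,1), (2,3)} solve M c = v,
# where the columns of M are the basis vectors.
M = np.array([[1, 2],
              [1, 3]], dtype=float)
print(np.linalg.solve(M, np.array([4.0, -3.0])))  # [18. -7.]
print(np.linalg.inv(M))   # [[ 3. -2.], [-1.  1.]] -> x = 3a-2b, y = -a+b
```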
2. 6. The Dual Space
Example - 1. Let V be the vector space of continuous real valued functions on the interval [0, 2π]. Fix a function g ∈ V. The function h : V → R defined by h(x) = (1/2π) ∫₀^2π x(t)g(t) dt is a linear functional on V. When g(t) = sin(nt) or cos(nt), h(x) is often called the nth Fourier coefficient of x.
Example - 2. Let V be a finite-dimensional vector space and let B = {x1, x2, . . . , xn} be an ordered basis for V. For each i = 1, 2, . . . , n, define fᵢ(x) = aᵢ, where (a1, a2, . . . , an) is the coordinate vector of x relative to B. Then fᵢ is a linear functional on the vector space V, called the ith coordinate function with respect to the basis B.
Note.
1. fᵢ(xⱼ) = δᵢⱼ, where δᵢⱼ is the Kronecker delta.
2. Let V and W be finite-dimensional vector spaces (over the same field F). Then V is isomorphic to W if and only if dim (V) = dim (W).
Definition 2. 6. 2. For a vector space V over F, we define the dual space of V to be the vector space Hom (V, F), denoted by V*. Thus V* is the vector space consisting of all linear functionals on V with the operations of addition and scalar multiplication. If V is finite-dimensional, then dim (V*) = dim (Hom (V, F)) = dim (V) ⋅ dim (F) = dim (V). Hence, by the above note, V and V* are isomorphic. We also define the double dual V** of V to be the dual of V*.
Suppose that B = {x1, x2, . . . , xn} is an ordered basis for V and fᵢ (1 ≤ i ≤ n) is the ith coordinate function with respect to B. Then B* = {f1, f2, . . . , fn} is an ordered basis for V*, called the dual basis, and fᵢ(xⱼ) = δᵢⱼ.
For any f ∈ V*, let g = ∑ᵢ₌₁ⁿ f(xᵢ)fᵢ. Then f(xⱼ) = g(xⱼ) for each j, and therefore f = g, due to the fact that for two vector spaces V and W, where V has a finite basis {v1, v2, . . . , vn}, if U, T : V → W are linear and U(vᵢ) = T(vᵢ) for i = 1, 2, . . . , n, then U = T.
Illustrative Example - 3. Let B = {(2, 1), (3, 1)} be an ordered basis for R². Suppose that the dual basis of B is given by B* = {f1, f2}. Determine a formula for f1.
Solution. We need to consider the equations
1= f1(2, 1) = f1(2e1 + e2) = 2 f1(e1) + f1(e2)
0= f1(3, 1) = f1(3e1 + e2) = 3 f1(e1) + f1(e2).
Solving these equations, we obtain f1(e1) = –1 and f1(e2) = 3. That is, f1(x, y) = –x + 3y.
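The defining conditions f1(2, 1) = 1 and f1(3, 1) = 0 form a linear system for the coefficients of f1. The following sketch (a supplement, not part of the original text) solves it directly.

```python
import numpy as np

# Sketch: f1(x, y) = c1·x + c2·y is fixed by its values on the basis
# B = {(2,1), (3,1)}: f1(2,1) = 1 and f1(3,1) = 0.
M = np.array([[2, 1],
              [3, 1]], dtype=float)   # rows: the basis vectors
c = np.linalg.solve(M, np.array([1.0, 0.0]))
print(c)                              # [-1.  3.] -> f1(x, y) = -x + 3y
```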
2. 7. Summary
In this unit, the important concept of a linear transformation of a vector space is
introduced. A linear map, linear mapping, linear transformation, or linear operator (in
some contexts also called linear function) is a function between two vector spaces that
preserves the operations of vector addition and scalar multiplication. As a result, it
always maps straight lines to straight lines or to 0. The expression "linear operator" is commonly used for linear maps from a vector space to itself (that is, endomorphisms). If one has a linear transformation T(x) in functional form, it is easy to determine the transformation matrix A by simply transforming each of the vectors of the standard basis by T and then inserting the results into the columns of a matrix. Also, matrix
multiplication is a binary operation that takes a pair of matrices, and produces another
matrix. The matrix of a composition of linear maps is given by the product of the
matrices. The change of basis refers to the conversion of vectors and linear
transformations between matrix representations which have different bases. A dual basis
is a set of vectors that forms a basis for the dual space of a vector space.
2. 8. Keywords
Change of coordinate matrix, Composition of linear transformations, Coordinate vector relative to a basis, Dual basis, Dual space, Identity matrix, Identity transformation, Invertible linear transformation, Invertible matrix, Isomorphic vector spaces, Isomorphism, Kernel of a linear transformation, Left-multiplication transformation, Linear functional, Linear transformation, Matrix multiplication, Non-invertible linear transformation, Nullity of a linear transformation, Null space, Product of matrices, Range, Rank of a linear transformation.
UNIT-3
3. 0. Introduction
In the previous two units, we have learnt about vector spaces and linear transformations between two vector spaces U(F) and V(F) defined over the same field F. In this unit, we shall see that each linear transformation from an n-dimensional vector space U(F) to an m-dimensional vector space V(F) corresponds to an m×n matrix over the field F. Here we see how the matrix representation of a linear transformation changes with a change of basis, and with the help of elementary matrix operations we solve many problems concerning the rank of a matrix, matrix inverses and systems of linear equations.
3. 1. Elementary Matrix Operations and Elementary Matrices
Example - 1. Let A =
1 2 3 4
2 1 1 3
4 0 1 2
Interchanging the second row of A with the first row is an example of an elementary row operation of Type 1.
The resulting matrix is B =
2 1 1 3
1 2 3 4
4 0 1 2
Multiplying the second column of A by 3 is an example of an elementary column operation of Type 2.
The resulting matrix is C =
1 6 3 4
2 3 1 3
4 0 1 2
Adding 4 times the third row of A to the first row is an example of an elementary row
operation of Type-3.
In this case, the resulting matrix is M =
17 2 7 12
2 1 1 3
4 0 1 2
If a matrix Q can be obtained from a matrix P by means of an elementary row
operation, then P can be obtained from Q by an elementary row operation of the same
type. In the above example, A can be obtained from M by adding –4 times the third row
of M to the first row of M.
Example - 2. Interchanging the first two rows of I3 produces the elementary matrix
0 1 0
E = 1 0 0
0 0 1
Note that E can also be obtained by interchanging the first two columns of I3. In fact, any elementary matrix can be obtained in at least two ways – either by performing an elementary row operation on In or by performing an elementary column operation on In.
Similarly,
1 0 −2
0 1 0
0 0 1
is an elementary matrix since it can be obtained from I3 by an elementary column
operation of Type-3 (adding – 2 times the first column of I3 to the third column) or by an
elementary row operation of Type 3 (adding – 2 times the third row to the first row).
Our first theorem shows that performing an elementary row operation on a matrix
is equivalent to multiplying the matrix by an elementary matrix.
E back into In. The result is that In can be obtained from E by an elementary row operation of the same type. By Theorem 3. 1. 1, there is an elementary matrix E′ such that E′E = In. Therefore, E is invertible and E⁻¹ = E′.
3. 2. The Rank of a Matrix and Matrix Inverses
Theorem 3. 2. 1. The rank of any matrix equals the maximum number of its linearly independent columns; that is, the rank of a matrix is the dimension of the subspace generated by its columns.
Proof. For any A ∈ Mm×n(F), rank (A) = rank (TA) = dim (range (TA)). Let A be the standard ordered basis for Fⁿ. Then A spans Fⁿ, and since the image of a spanning set spans the range, range (TA) = span (TA(A)) = span ({TA(e1), TA(e2), TA(e3), . . . , TA(en)}). But, for any j, we see that TA(ej) = Aej = aj, where aj is the jth column of A.
Hence range (TA) = span ({a1, a2, a3, . . . , an}).
Thus, rank (A) = dim (range (TA)) = dim (span {a1, a2, a3, . . . , an}).
Example - 1. Let A =
1 0 1
0 1 1
1 0 1
Observe that the first and second columns of A are linearly independent and that the third
column is a linear combination of the first two.
Thus, rank (A) = dim ( span { (1, 0, 1)ᵀ, (0, 1, 0)ᵀ, (1, 1, 1)ᵀ } ) = 2.
To compute the rank of a matrix A, it is frequently useful to postpone the use of the above theorem until A has been suitably modified by means of appropriate elementary row and column operations, so that the number of linearly independent columns is obvious.
Note 2 guarantees that the rank of the modified matrix is the same as the rank of A. One such modification of A can be obtained by using elementary row and column operations to introduce zero entries. The next example illustrates this procedure.
Example - 2. Let A =
1 2 1
1 0 3
1 1 2
If we subtract the first row of A from rows 2 and 3 (Type-3 elementary row operations), the result is
1 2 1
0 −2 2
0 −1 1
If we now subtract twice the first column from the second and subtract the first column from the third (Type-3 elementary column operations),
we obtain
1 0 0
0 −2 2
0 −1 1
It is now obvious that the maximum number of linearly independent columns of this matrix is 2. Hence the rank of A is 2.
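As a supplementary check (not part of the original text), the rank found above by row and column operations agrees with a direct numerical computation.

```python
import numpy as np

# Sketch: the rank of A from Example 2, checked numerically.
A = np.array([[1, 2, 1],
              [1, 0, 3],
              [1, 1, 2]], dtype=float)
print(np.linalg.matrix_rank(A))   # 2
```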
Example -3.
(i) Let A =
1 2 1 1
2 1 1 1
Note that the first and second rows of A are linearly independent since one is not a
multiple of the other. Thus rank (A) = 2.
(ii) Let A =
1 3 1 1
1 0 1 1
0 3 0 0
In this case, there are several ways to proceed. Suppose that we begin with an elementary row operation to obtain a zero in the (2, 1) position. Subtracting the first row from the second row, we obtain
1 3 1 1
0 −3 0 0
0 3 0 0
Note that the third row is a multiple of the second row, and the first and second rows are linearly independent. Thus rank (A) = 2.
(iii) Let A =
1 2 3 1
2 1 1 1
1 −1 1 0
Using elementary row operations, we can transform A as follows:
A →
1 2 3 1
0 −3 −5 −1
0 −3 −2 −1
→
1 2 3 1
0 −3 −5 −1
0 0 3 0
It is clear that the last matrix has three linearly independent rows and hence has rank 3.
Illustrative Example - 4. Determine whether the matrix
A =
0 2 4
2 4 2
3 3 1
is invertible or not. Also, find A⁻¹.
Solution. We use elementary row operations to transform (A | I) into a matrix of the form (I | B); then B = A⁻¹. Interchanging rows 1 and 2, multiplying rows 1 and 2 by ½, and adding –3 times row 1 to row 3 gives
1 2 1 | 0 1/2 0
0 1 2 | 1/2 0 0
0 −3 −2 | 0 −3/2 1
We now complete our work on the second column by adding –2 times row 2 to row 1 and 3 times row 2 to row 3. The result is
1 0 −3 | −1 1/2 0
0 1 2 | 1/2 0 0
0 0 4 | 3/2 −3/2 1
Only the third column remains to be changed. In order to place a 1 in the (3, 3) position, we multiply row 3 by ¼; this operation yields
1 0 −3 | −1 1/2 0
0 1 2 | 1/2 0 0
0 0 1 | 3/8 −3/8 1/4
Adding appropriate multiples of row 3 to rows 1 and 2 completes the process and gives
1 0 0 | 1/8 −5/8 3/4
0 1 0 | −1/4 3/4 −1/2
0 0 1 | 3/8 −3/8 1/4
Thus A is invertible, and
A⁻¹ =
1/8 −5/8 3/4
−1/4 3/4 −1/2
3/8 −3/8 1/4
Illustrative Example - 5. Determine whether the matrix
A =
1 2 1
2 1 −1
1 5 4
is invertible or not, and find A⁻¹ if it exists.
Solution. Using a strategy similar to the one used in the above example, we attempt to use elementary row operations to transform
1 2 1 | 1 0 0
2 1 −1 | 0 1 0
1 5 4 | 0 0 1
into a matrix of the form (I | B). We first add –2 times row 1 to row 2 and –1 times row 1 to row 3:
1 2 1 | 1 0 0
0 −3 −3 | −2 1 0
0 3 3 | −1 0 1
We then add row 2 to row 3:
1 2 1 | 1 0 0
0 −3 −3 | −2 1 0
0 0 0 | −3 1 1
The result is a matrix with a row whose first 3 entries are zeros. Therefore A is not invertible.
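The two conclusions above can be checked numerically. The following sketch is supplementary (not part of the original text); note that the (2, 3) entry of the second matrix is taken as −1, as reconstructed from the row-reduction steps.

```python
import numpy as np

# Sketch: invertibility via the determinant, and the inverse itself.
A1 = np.array([[0, 2, 4], [2, 4, 2], [3, 3, 1]], dtype=float)
A2 = np.array([[1, 2, 1], [2, 1, -1], [1, 5, 4]], dtype=float)

print(np.linalg.det(A1))   # -16.0 -> invertible
print(np.linalg.inv(A1))   # matches the inverse found by row reduction
print(np.linalg.det(A2))   # 0.0 -> not invertible
```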
Note.
∈
such that As = b. The set of all solutions to the system of equation is called the
solution set of the system. System of equation is called consistent if its solution
set is nonempty; otherwise it is called inconsistent.
Illustrative Example - 1. Solve the system of equations x1 + x2 = 3 and x1 – x2 = 1.
Solution. By use of familiar techniques, we can solve the preceding system and conclude that there is only one solution: x1 = 2, x2 = 1; that is, s = (2, 1). In matrix form, the system can be written as
⎡1 1⎤ ⎡x1⎤ = ⎡3⎤
⎣1 −1⎦ ⎣x2⎦  ⎣1⎦,
so A = ⎡1 1⎤ and b = ⎡3⎤.
       ⎣1 −1⎦        ⎣1⎦
Illustrative Example - 2. Solve the system of equations 2x1 + 3x2 + x3 = 1 and x1 – x2 + 2x3 = 6.
Solution. In matrix form, the system can be written as
⎡2 3 1⎤ (x1, x2, x3)ᵀ = ⎡1⎤
⎣1 −1 2⎦               ⎣6⎦.
This system has many solutions, such as s = (−6, 2, 7) and s = (8, −4, −3).
Illustrative Example - 3. Consider the system of equations x1 + x2 = 0 and x1 + x2 = 1. In matrix form, the system can be written as
⎡1 1⎤ ⎡x1⎤ = ⎡0⎤
⎣1 1⎦ ⎣x2⎦  ⎣1⎦,
and it is evident that this system has no solutions. Thus we see that a system of linear equations can have one, many, or no solutions.
(i) Consider the homogeneous system x1 – 2x2 + x3 = 0 and x1 + x2 – x3 = 0, and let
A = ⎡1 −2 1⎤
    ⎣1 1 −1⎦
be the coefficient matrix of this system.
It is clear that rank (A) = 2. If K is the solution set of this system, then dim (K) = 3 – 2 = 1.
Thus any nonzero solution constitutes a basis for K.
For example, since (1, 2, 3) is a solution to the given system, {(1, 2, 3)} is a basis for K.
Thus any vector in K is of the form t(1, 2, 3), where t ∈ R.
(ii) Consider the system x1 – 2x2 + x3 = 0 of one equation in three unknowns.
If A = (1 –2 1) is the coefficient matrix, then rank (A) = 1.
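The dimension counts in (i) and (ii) can be confirmed numerically. This sketch is supplementary (not part of the original text) and uses the coefficient matrices as reconstructed above.

```python
import numpy as np

# Sketch: dim(K) = n - rank(A) for the homogeneous systems above.
A1 = np.array([[1, -2, 1], [1, 1, -1]], dtype=float)
A2 = np.array([[1, -2, 1]], dtype=float)

for A in (A1, A2):
    rank = np.linalg.matrix_rank(A)
    print(A.shape[1] - rank)                     # dim(K): 1, then 2
assert np.allclose(A1 @ np.array([1, 2, 3]), 0)  # (1, 2, 3) spans K for A1
```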
3. 4. Summary
This unit is devoted to two related objectives:
1. The study of certain “rank-preserving” operations on matrices;
2. The application of these operations and the theory of linear transformations to the
solution of systems of linear equations.
As a consequence of Objective 1, we obtain a simple method for computing the rank
of a linear transformation between finite-dimensional vector spaces by applying these
rank-preserving matrix operations to a matrix that represents that transformation.
Solving a system of linear equations is probably the most important application of linear algebra. The familiar method of elimination for solving systems of linear equations involves the elimination of variables so that a simpler system can be obtained. The technique by which the variables are eliminated utilizes three types of operations:
(i) Interchanging any two equations in the system;
(ii) Multiplying any equation in the system by a nonzero constant;
(iii) Adding a multiple of one equation to another.
A system of linear equations can be expressed as a single matrix equation. In this representation of the system, the three operations above are the “elementary row operations” for matrices. These operations provide a convenient computational method for determining all solutions to a system of linear equations. There are two types of elementary matrix operations - row operations and column operations. As we have seen, the row operations are more useful. They arise from the three operations that can be used to eliminate variables in a system of linear equations. In the above unit, we defined the rank of a matrix. We then used elementary operations to compute the rank of a matrix and of a linear transformation. We have remarked that an n × n matrix is invertible if and only if its rank is n. Since we know how to compute the rank of any matrix, we can always test a matrix to determine whether it is invertible. The unit concludes with a procedure for computing the inverse of an invertible matrix.
3. 5. Keywords
Augmented matrix, Coefficient matrix of a system of linear equations, Consistent system of linear equations, Elementary column operation
UNIT-4
4. 0. Introduction
An exposition of the theory of determinants independent of their relation to the
solvability of linear equations was first given by Vandermonde in his “Memoir on
elimination theory” of 1772. (The word “determinant” was used for the first time by
Gauss, in 1801, to stand for the discriminant of a quadratic form, where the discriminant
of the form ax2 + bxy + cy2 is b2 − 4ac.) Laplace extended some of Vandermonde’s work
in his Researches on the Integral Calculus and the System of the World (1772), showing
how to expand n × n determinants by cofactors.
The determinant of a square matrix is a value computed from the elements of the matrix by certain equivalent rules. The determinant provides important information when the matrix consists of the coefficients of a system of linear equations, and when it describes a linear transformation: in the first case the system has a unique solution if and only if the determinant is nonzero; in the second case that same condition means that the transformation has an inverse. So this unit deals with some important properties of determinants, cofactor expansions and Cramer's rule.
4. 1. Properties of Determinants.
Definition 4. 1. 2. The determinant of the square matrix A = [aij] over F is the scalar det(A) defined by
det(A) = ∑(j) (−1)ᵗ a1j1 a2j2 . . . anjn,
where ∑(j) denotes the sum of all terms of the form (−1)ᵗ a1j1 a2j2 . . . anjn as j1, j2, . . . , jn assumes all possible permutations of 1, 2, . . . , n, and t is the number of interchanges used to carry j1, j2, . . . , jn into the natural ordering.
We observe that there are n! terms in the sum det(A), since there are n! possible orderings of 1, 2, . . . , n. The determinant of an n × n matrix is referred to as an n × n determinant, or a determinant of order n.
Note.
1. Determinants of order 1 and 2. That is, | a11 | = a11 and
| a11 a12 |
| a21 a22 | = a11 a22 – a12 a21.
2. Determinants of order 3.
Examples - 1. Let
A = ⎛2 1 1⎞    B = ⎛3 2 1⎞    and C = ⎛1 −2 −1⎞
    ⎜0 5 −2⎟,      ⎜−4 5 −1⎟,          ⎜−2 0 7⎟
    ⎝1 −3 4⎠       ⎝2 −3 4⎠            ⎝3 0 7⎠.
Then, by Note 2, we have det (A) = 21, det (B) = 81 and det (C) = –70.
Theorem 4. 1. 1. If A = [aij] is an n × n matrix, then det(A) = ∑(i) (−1)ˢ ai1,1 ai2,2 . . . ain,n, where ∑(i) denotes the sum over all possible permutations i1, i2, . . . , in of 1, 2, . . . , n, and s is the number of interchanges used to carry i1, i2, . . . , in into the natural ordering.
Proof. Let S = ∑(i) (−1)ˢ ai1,1 ai2,2 . . . ain,n. Now both S and det(A) have n! terms. Except possibly for sign, each term of S is a term of det(A), and each term of det(A) is a term of S. Thus S and det(A) consist of the same terms, with a possible difference in sign. Consider a certain term (−1)ˢ ai1,1 ai2,2 . . . ain,n of S and let (−1)ᵗ a1j1 a2j2 . . . anjn be the corresponding term in det(A). Then (−1)ˢ ai1,1 ai2,2 . . . ain,n can be carried into (−1)ᵗ a1j1 a2j2 . . . anjn by s interchanges of factors, since the permutation i1, i2, . . . , in can be changed into the natural ordering 1, 2, . . . , n by s interchanges of elements. This means that the natural ordering 1, 2, . . . , n can be changed into the permutation j1, j2, . . . , jn by s interchanges, since the column subscripts have been interchanged each time the factors were interchanged. But j1, j2, . . . , jn can be carried into 1, 2, . . . , n by t interchanges, by the definition of det(A). Thus 1, 2, . . . , n can be carried into j1, j2, . . . , jn and then back into itself by (s+t) interchanges. Since 1, 2, . . . , n can be carried into itself by an even number (zero) of interchanges, (s+t) is even, because the number of interchanges used to carry a permutation j1, j2, . . . , jn of {1, 2, . . . , n} into the natural ordering is either always odd or always even. Therefore (–1)ˢ⁺ᵗ = 1 and (–1)ˢ = (–1)ᵗ. Now we have the corresponding terms in det (A) and S with the same sign, and therefore det(A) = S.
Example - 2. Let
A = ⎛1 2 3⎞    and    Aᵀ = ⎛1 2 3⎞
    ⎜2 1 3⎟                ⎜2 1 1⎟
    ⎝3 1 2⎠                ⎝3 3 2⎠
be two matrices. Then det (Aᵀ) = det (A) = 6.
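As a supplementary check (not part of the original text), the identity det(Aᵀ) = det(A) can be verified numerically for the matrix above.

```python
import numpy as np

# Sketch: det(A^T) = det(A), checked for the matrix of Example 2.
A = np.array([[1, 2, 3], [2, 1, 3], [3, 1, 2]], dtype=float)
print(np.linalg.det(A), np.linalg.det(A.T))   # both 6.0
```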
Theorem 4. 1. 2. If the matrix B results from the matrix A by interchanging two rows (or two columns) of A, then det (B) = –det (A).
Proof. The permutation j1, j2, . . . , js, . . . , jr, . . . , jn results from the permutation j1, j2, . . . , jr, . . . , js, . . . , jn by an interchange of two numbers; the number of inversions in the former differs by an odd number from the number of inversions in the latter. This means that the sign of each term in det (B) is the negative of the sign of the corresponding term in det (A).
Hence det (B) = –det (A).
Now suppose that B is obtained from A by interchanging two columns of A. Then BT is
obtained from AT by interchanging two rows of AT. So det (BT) = –det (AT), but
det (BT) = det (B) and det (AT) = det (A). Hence det (B) = – det (A).
Example - 3. Let
A = ⎛2 −1⎞    and    B = ⎛3 2⎞.
    ⎝3 2⎠                ⎝2 −1⎠
Then det (B) = –det (A) = –7.
4. 2. Cofactor Expansions
Example - 1. Let
A = ⎛3 −1 2⎞
    ⎜4 5 6⎟
    ⎝7 1 2⎠.
Then
det(M12) = (4)(2) – (6)(7) = 8 – 42 = –34,
det(M23) = (3)(1) – (–1)(7) = 3 + 7 = 10, and
det(M31) = (–1)(6) – (2)(5) = –6 – 10 = –16.
Also we have A12 = (–1)¹⁺² det (M12) = (–1)(–34) = 34,
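Minors and cofactors can also be computed mechanically. This sketch (a supplement, not part of the original text) deletes a row and a column and takes the determinant of what remains.

```python
import numpy as np

# Sketch: minors and cofactors of A by deleting row i and column j.
A = np.array([[3, -1, 2], [4, 5, 6], [7, 1, 2]], dtype=float)

def minor(A, i, j):
    """Determinant of A with row i and column j removed (0-indexed)."""
    M = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return np.linalg.det(M)

print(minor(A, 0, 1))                    # det(M12) = -34
print((-1) ** (0 + 1) * minor(A, 0, 1))  # cofactor A12 = 34
```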
Theorem 4. 2. 2. If A=[aij]n, then det (A) = ai1 Ai1 + ai2 Ai2+ . . . . + ain Ain.
Proof. For a fixed integer i, we collect all of the terms in the sum
det( A ) = ∑ ( −1 )t a1 j1 a2 j2 ...anjn that contains ai1 as a factor in one group, all of the terms
( j)
that contains ai2 as a factor in another group, and so on for each column number. This
separates the terms in det (A) into n group with no overlapping since each term contains
exactly one factor from row i. In each of the terms containing ai1, we factor our ai1 and
let Fi1 denote the remaining factor. Repeating this process for each ai1 ai2 . . . .ain in turn,
we obtain det(A) = ai1 Fi1 + ai2 Fi2+ . . . . + ain Fin. To finish the proof, we need only
show that Fij=Aij = (–)i+jMij, where Mij is the minor of aij. Consider first the case where
i=1 and j=1.
We shall show that a11 F11 =a11 M11. Each term in F11was obtained be factoring a11 from
−t
a term (−1) 1 a1 j1 a2 j2 ...anjn in the expansion of det (A). Thus each term F11 has the form
(−1)−t1 a2 j2 a3 j3 ...anjn , where t2 is the number of interchanges used to carry j2, …, jn into
2, 3,….,n. Letting j2, …, jn range over all permutations of 2,3,…,n, we see that each of
F11 and M11 has (n–1)! terms. Now1, j2, …, jn can be carried into natural ordering by the
same interchanges used to carry j2, …, jn into 2,3,…,n. That is, we may take t1=t2. This
means that F11 and M11 have exactly the same terms, yielding F11 = M11 and
a11 F11 =a11 M11. Consider now an arbitrary aij. By (i–1) interchanges of the original
row i with the adjacent row above and then ( j–1) interchanges of column j with the
adjacent column on the left, we obtain a matrix B that has aij in the first row, first column
position. Since the order of the remaining rows and columns of A was not changed, the
minor of aij in B is the same Mij as it is in A. If B results from matrix A by interchanging
two rows (columns) of A, then det(B) = – det(A).
So det (B) = (–1)i–1+ j–1 det(A) = (–1)i+j det (A). This gives det (A) = (–1)i+j det (B). The
sum of all the terms in det (B)that contains aij as a factor is aij Mij, from our first case.
Since det(A) = (−1)^{i+j} det(B), the sum of all the terms in det(A) that contain a_{ij} as a factor is (−1)^{i+j} a_{ij}M_{ij}. Thus a_{ij}F_{ij} = (−1)^{i+j} a_{ij}M_{ij} = a_{ij}A_{ij}, and the theorem is proved.
Note. If A = [a_{ij}]_3, the expansion of det(A) about the 2nd row is given by
det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{21}A_{21} + a_{22}A_{22} + a_{23}A_{23}
= −a_{21}(a_{12}a_{33} − a_{13}a_{32}) + a_{22}(a_{11}a_{33} − a_{13}a_{31}) − a_{23}(a_{11}a_{32} − a_{12}a_{31}).
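As an aside, the recursive cofactor expansion just described is easy to carry out by machine. The following is a minimal Python sketch of the technique (the function name det_by_cofactors is our own illustration, not part of the original text); it expands along a chosen row and compares the result with numpy's built-in determinant:

    import numpy as np

    def det_by_cofactors(A, row=0):
        """Determinant by cofactor expansion along the given row."""
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0
        for j in range(n):
            # Minor M_{row, j}: delete the row and the j-th column.
            M = np.delete(np.delete(A, row, axis=0), j, axis=1)
            total += A[row, j] * (-1) ** (row + j) * det_by_cofactors(M)
        return total

    A = np.array([[3, -1, 2], [4, 5, 6], [7, 1, 2]])
    print(det_by_cofactors(A))        # -84
    print(round(np.linalg.det(A)))    # -84, agreeing with the expansion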
Definition 4. 3. 1. Let A = [a_{ij}] be an n×n matrix. The n×n matrix adj(A), called the adjoint of A, is the matrix whose (i, j)th element is the cofactor A_{ji} of a_{ji}. Thus
adj(A) = \begin{pmatrix} A_{11} & \cdots & A_{n1} \\ \vdots & & \vdots \\ A_{1n} & \cdots & A_{nn} \end{pmatrix}.
Note.
1. The adjoint of A is formed by taking the transpose of the matrix of cofactors of the elements of A.
2. It should be noted that the term adjoint has other meanings in linear algebra in addition to its use in the above definition.
3. If A = [a_{ij}] is an n×n matrix, then
(i) A·adj(A) = (adj(A))·A = det(A)·I_n, where I_n is the identity matrix;
(ii) A^{−1} = \frac{1}{\det(A)}\,adj(A), provided det(A) ≠ 0.
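The identities in Note 3 can be checked directly in a computer algebra system. A minimal SymPy sketch (our own illustration; SymPy's adjugate method computes exactly the classical adjoint defined above):

    from sympy import Matrix

    A = Matrix([[3, -1, 2],
                [4,  5, 6],
                [7,  1, 2]])
    adjA = A.adjugate()                          # the classical adjoint defined above
    print(A * adjA == A.det() * Matrix.eye(3))   # True: A adj(A) = det(A) I
    print(A.inv() == adjA / A.det())             # True: A^{-1} = adj(A)/det(A)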
Our final results of this unit make an important connection between
determinants and the solutions of certain types of systems of linear equations. This
theorem presents a formula for the unknowns in terms of certain determinants. This
formula is commonly known as Cramer’s rule.
Substituting the proposed values x_j = \frac{1}{\det(A)}\sum_{k=1}^{n} b_k A_{kj} into the ith equation of the system, we obtain
\sum_{j=1}^{n} a_{ij}x_j = \frac{1}{\det(A)} \sum_{k=1}^{n} \left( b_k \sum_{j=1}^{n} a_{ij} A_{kj} \right)
= \frac{1}{\det(A)} \sum_{k=1}^{n} b_k (\delta_{ik}\det(A)), where δ_{ik} is the Kronecker delta,
= \frac{1}{\det(A)}\, b_i (\delta_{ii}\det(A)) = b_i.
Thus, the values x_j = \frac{1}{\det(A)}\sum_{k=1}^{n} b_k A_{kj} furnish a solution of the system.
To prove the uniqueness, suppose that x_j = y_j, j = 1, 2, …, n, represents any solution to the system. Then the ith equation \sum_{k=1}^{n} a_{ik} y_k = b_i is satisfied for i = 1, 2, …, n. If we multiply both members of the ith equation by A_{ij} (j fixed) and form the sum of these equations, we find that
\sum_{i=1}^{n}\left(\sum_{k=1}^{n} a_{ik} A_{ij}\, y_k\right) = \sum_{i=1}^{n} b_i A_{ij} and \sum_{k=1}^{n}\left(\sum_{i=1}^{n} a_{ik} A_{ij}\right) y_k = \sum_{i=1}^{n} b_i A_{ij}.
But, for each k, \sum_{i=1}^{n} a_{ik} A_{ij} = \delta_{kj}\det(A). Thus \sum_{k=1}^{n} \delta_{kj}\det(A)\, y_k = \sum_{i=1}^{n} b_i A_{ij}, and y_j = \frac{1}{\det(A)}\sum_{i=1}^{n} b_i A_{ij}. Hence these y_j's are the same as the solution given in the statement of the theorem.
Note. The sum \sum_{k=1}^{n} b_k A_{kj} is the determinant of the matrix obtained by replacing the jth column of A by the column of constants (b_1, b_2, …, b_n).
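Cramer's rule translates directly into a short program. The following Python sketch (the helper name cramer is our own, not from the text) solves the system of Exercise 12 below; like the theorem, it assumes det(A) ≠ 0:

    import numpy as np

    def cramer(A, b):
        """Solve Ax = b by Cramer's rule: x_j = det(A_j)/det(A), where A_j is
        A with its j-th column replaced by the column of constants b."""
        d = np.linalg.det(A)
        if np.isclose(d, 0):
            raise ValueError("Cramer's rule requires det(A) != 0")
        x = np.empty(len(b))
        for j in range(len(b)):
            Aj = A.copy()
            Aj[:, j] = b                  # replace the j-th column by b
            x[j] = np.linalg.det(Aj) / d
        return x

    A = np.array([[-2., 3., -1.], [1., 2., -1.], [-2., -1., 1.]])
    b = np.array([1., 4., -3.])
    print(cramer(A, b))                   # [2. 3. 4.], with det(A) = -2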
4. 4. Summary
In this unit, the fundamentals of the theory of determinants and their properties are explored. A knowledge of this unit is necessary in the study of eigenvalues and eigenvectors of linear transformations; for this purpose we give some important theorems along with their proofs. Cofactor expansion gives a method for evaluating the determinant of an n × n matrix which reduces the problem to the evaluation of determinants of matrices of order (n−1); we can then repeat the process for these (n−1) × (n−1) matrices until we get to 2 × 2 matrices. Elementary operations, which are based on the properties of determinants and cofactor expansion, simplify this evaluation further. Finally, we establish the important connection between determinants and the solutions of certain types of systems of linear equations, and present Cramer's rule for finding the unknowns in terms of certain determinants.
4. 5. Keywords
Determinant of a matrix
Adjoint matrix
Elementary operation
Cofactor
Minor
Cramer’s rule
n – linear function
Determinant
Exercises
1. Let S be the set of all elements of the form (x + 2y, y, –x + 3y) in R³, where x, y ∈ R. Show that S is a subspace of R³.
(Hint: Let u, v ∈ S. Then u = (x₁ + 2y₁, y₁, –x₁ + 3y₁) and v = (x₂ + 2y₂, y₂, –x₂ + 3y₂), where x₁, y₁, x₂, y₂ ∈ R. Then show that au + bv = (α + 2β, β, −α + 3β) ∈ S, where α = ax₁ + bx₂ and β = ay₁ + by₂.)
2. Express the polynomial f(x) = x2+ 4x –3 in the vector space V of all polynomials
over R as a linear combination of the polynomials g(x) = x2 – 2x + 5,
h(x) = 2x2 –3x and φ(x) = x +3.
Answer: f(x) = –3g(x) +2h(x) +4 φ(x).
4. Show that the mapping J: R² → R³ given by J(a, b) = (a + b, a − b, b) for all (a, b) ∈ R² is a linear transformation. Find the range, rank, kernel and nullity of J.
Answers. Im(J) = range(J) = span{J(1, 0), J(0, 1)} = span{(1, 1, 0), (1, −1, 1)}, so rank(J) = dim(Im(J)) = 2; nullity(J) = dim R² − rank(J) = 2 − 2 = 0 and Kernel(J) = Null Space = {(0, 0)}.
Answer. b₁ = 2 − 2t, b₂ = −1/2 + t.
8. Determine whether t is even or odd in the given term of det(A), where A = [a_{ij}]_n:
(i) (−1)^t a₁₃ a₂₁ a₃₄ a₄₂
(ii) (−1)^t a₁₄ a₂₁ a₃₃ a₄₂
(iii) (−1)^t a₁₄ a₂₃ a₃₂ a₄₁
(iv) (−1)^t a₁₂ a₂₄ a₃₁ a₄₃
10. Let A = \begin{pmatrix} 3 & −2 & 1 \\ 5 & 6 & 2 \\ 1 & 0 & −3 \end{pmatrix}. Then compute adj(A) and A^{−1}.
Answer. adj(A) = \begin{pmatrix} −18 & −6 & −10 \\ 17 & −10 & −1 \\ −6 & −2 & 28 \end{pmatrix} and, since det(A) = −94, A^{−1} = −\frac{1}{94}\,adj(A) = \frac{1}{94}\begin{pmatrix} 18 & 6 & 10 \\ −17 & 10 & 1 \\ 6 & 2 & −28 \end{pmatrix}.
11. If A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, find the following minors and cofactors:
(i) M23 and A23
(ii) M13 and A13
12. Solve the system of linear equations by using Cramer's rule:
–2 x1 + 3 x2 – x3 = 1
x 1 + 2 x2 – x 3 = 4
–2x1 – x2 + x3 = –3.
Answer. det (A)= – 2, and x1 =2, x2=3, x3= 4.
References
1. Jimmie Gilbert and Linda Gilbert – Linear Algebra and Matrix Theory, Academic Press (an imprint of Elsevier), 2010.
2. I. N. Herstein – Topics in Algebra, Vikas Publishing House, New Delhi, 2002.
3. S. Kumaresan – Linear Algebra: A Geometric Approach, Prentice Hall India, New Delhi, 2000.
4. Michael Artin – Algebra, Prentice Hall India, New Delhi, 2007.
BLOCK –II
Diagonalization and Inner product space
Objectives
After studying this block you will be able to:
1. Understand the basic concepts of each unit.
2. Study the importance of diagonalization and inner product spaces.
• Explain the defining properties of the above.
• Give examples of each concept.
• Describe some important theorems along with their proofs.
• Work some illustrative examples.
UNIT-1
1. 0. Introduction
Eigenvectors are a special set of vectors associated with a linear system of equations (that is, a matrix equation); they are sometimes also known as characteristic vectors, proper vectors, or latent vectors. The determination of the eigenvectors and eigenvalues of a system is extremely important in physics and engineering, where it is equivalent to matrix diagonalization and arises in such common applications as stability analysis, the physics of rotating bodies, and small oscillations of vibrating systems, to name only a few. In 1858, A. Cayley proved the important Cayley-Hamilton theorem that a square matrix satisfies its characteristic polynomial.
Note. A scalar λ is an eigenvalue of A if and only if the operator (A − λI_n) is not one-to-one and onto, that is, if and only if (A − λI_n) is not invertible. However, this result is equivalent to the statement that det(A − λI_n) = 0.
Theorem 1. 1. 1 states that the eigenvalues of a matrix are the zeros of its characteristic polynomial. When determining the eigenvalues of a matrix or a linear operator, we normally compute its characteristic polynomial, as in the next example.
Illustrative Example -1. Find the eigenvalues of A = \begin{pmatrix} 1 & 1 \\ 4 & 1 \end{pmatrix}.
Solution. The characteristic polynomial of A is
det(A − tI) = det\begin{pmatrix} 1−t & 1 \\ 4 & 1−t \end{pmatrix} = t² − 2t − 3 = (t − 3)(t + 1).
It follows from Theorem 1. 1. 1 that the only eigenvalues of A are 3 and −1.
Illustrative Example -2. Find the eigenvalues of the matrix A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{pmatrix}.
Solution. The characteristic polynomial of A is
det(A − tI₃) = det\begin{pmatrix} 1−t & 1 & 0 \\ 0 & 2−t & 2 \\ 0 & 0 & 3−t \end{pmatrix} = (1 − t)(2 − t)(3 − t) = −(t − 1)(t − 2)(t − 3).
Hence λ is an eigenvalue of A if and only if λ = 1, 2 or 3.
Illustrative Example -3. Show that a square matrix A has 0 as an eigenvalue if and only if A is not invertible.
Solution. Let A be an n × n matrix over F. First, let 0 be an eigenvalue of the matrix A and let the non-zero vector X ∈ Fⁿ be a corresponding eigenvector. Then AX = 0X ⇒ AX = 0. If possible, let A be an invertible matrix. Then
AX = 0 ⇒ A^{−1}(AX) = A^{−1}0 ⇒ (A^{−1}A)X = 0 ⇒ IX = 0 ⇒ X = 0.
But X ≠ 0, so we arrive at a contradiction. Hence A must be a non-invertible matrix.
Conversely, let A be a non-invertible matrix. Then the system of equations AX = 0 has non-trivial solutions, so there exists a non-zero vector X ∈ Fⁿ such that AX = 0 ⇒ AX = 0X ⇒ 0 is an eigenvalue of A.
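The eigenvalue computations of the preceding examples are easy to reproduce numerically. A minimal Python sketch using numpy (our own illustration, not part of the original text):

    import numpy as np

    A = np.array([[1., 1., 0.],
                  [0., 2., 2.],
                  [0., 0., 3.]])
    # The eigenvalues are the zeros of det(A - t I); numpy returns them directly.
    print(np.linalg.eigvals(A))             # 1., 2., 3. (up to ordering)
    # A has 0 as an eigenvalue exactly when det(A) = 0, i.e. A is not invertible:
    print(np.isclose(np.linalg.det(A), 0))  # False: 0 is not an eigenvalue here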
1. 2. Diagonalizability
A linear operator T on a finite-dimensional vector space V is called diagonalizable if there is an ordered basis B for V such that [T]_B is a diagonal matrix. Note that, if D = [T]_B is a diagonal matrix, then T(v_j) = D_{jj}v_j for each vector v_j in B. Conversely, if B = {v₁, v₂, ……, v_n} is an ordered basis for V such that T(v_j) = λ_j v_j for some scalars λ₁, λ₂, ………, λ_n, then clearly
[T]_B = \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix},
for each vector v in the basis B satisfies the condition that T(v) = λv for some scalar λ.
Theorem 1. 2. 1. Let T be a linear operator on a vector space V, and let λ1, λ2, . . . . , λk
be distinct eigenvalues of T. If v1, v2, . . . . , vk are eigenvectors of T such that λi
corresponds to vi (1 ≤ i ≤ k), then { v1, v2, . . . . , vk } is linearly independent.
Illustrative Example -1. Let A = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix} and B = {(1, −1), (3, 4)}. Find [T_A]_B. Since T_A(1, −1) = −2(1, −1) and T_A(3, 4) = 5(3, 4), the basis B consists of eigenvectors of A, and [T_A]_B = \begin{pmatrix} −2 & 0 \\ 0 & 5 \end{pmatrix} is a diagonal matrix.
Definition 1. 2. 2. A polynomial f(t) in P(F) splits over F if there are scalars c, a1, a2,
……,an (not necessarily distinct) in F such that f(t) = c(t – a1) (t – a2) . . . . . . . (t – an).
Example -3. Let T be the linear operator on R³ defined by T(a, b, c) = (−b + c, a + c, 3c). We determine the T-cyclic subspace generated by e₁ = (1, 0, 0). Since T(e₁) = T(1, 0, 0) = (0, 1, 0) = e₂ and T²(e₁) = T(T(e₁)) = T(e₂) = (−1, 0, 0) = −e₁, it follows that span({e₁, T(e₁), T²(e₁), . . . . .}) = span({e₁, e₂}) = {(s, t, 0): s, t ∈ R}.
Example - 4. Let T be the linear operator on R² defined by T(a, b) = (a + 2b, −2a + b), and let B = {e₁, e₂}. Then A = [T]_B = \begin{pmatrix} 1 & 2 \\ −2 & 1 \end{pmatrix}. The characteristic polynomial of T is, therefore,
f(t) = det\begin{pmatrix} 1−t & 2 \\ −2 & 1−t \end{pmatrix} = t² − 2t + 5.
It is easily verified that T₀ = f(T) = T² − 2T + 5I, where T₀ is the zero operator. Similarly,
f(A) = A² − 2A + 5I = \begin{pmatrix} −3 & 4 \\ −4 & −3 \end{pmatrix} + \begin{pmatrix} −2 & −4 \\ 4 & −2 \end{pmatrix} + \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
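This verification is mechanical. A short numpy check (our own sketch) that f(A) = A² − 2A + 5I is the zero matrix:

    import numpy as np

    A = np.array([[1., 2.],
                  [-2., 1.]])
    f_of_A = A @ A - 2 * A + 5 * np.eye(2)   # f(t) = t^2 - 2t + 5 evaluated at A
    print(np.allclose(f_of_A, 0))            # True: A satisfies its characteristic polynomial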
1. 4. Summary
In this unit we have discussed matrices and determinants with respect to the vectors (called characteristic vectors or eigenvectors) and scalars λ (called characteristic values or eigenvalues) such that Ax = λx. A motivating problem is to find a formula for the reflection of R² about the line y = 2x; the key to success is to find a basis B′ for which [T]_{B′} is a diagonal matrix. This unit is concerned with this so-called diagonalization problem. A solution to the diagonalization problem leads naturally to the concepts of eigenvalue and eigenvector. Aside from the important role that these concepts play in the diagonalization problem, they also prove to be useful tools in the study of many non-diagonalizable operators. An invariant subspace of a linear mapping T : V → V from some vector space V to itself is a subspace W of V such that T(W) is contained in W; an invariant subspace of T is also said to be T-invariant. With the help of invariant subspaces we study the Cayley-Hamilton theorem, which states that every square matrix over a commutative ring (including the real or complex field) satisfies its own characteristic equation.
1. 5. Keywords
UNIT-2
2. 0. Introduction
An inner product space is a vector space or function space in which an operation for combining two vectors or functions (whose result is called an inner product) is defined and has certain properties. In this unit we also study the Gram-Schmidt process, which takes a finite, linearly independent set S = {v₁, …, v_k} for k ≤ n and generates an orthogonal set S′ = {u₁, …, u_k} that spans the same k-dimensional subspace of Rⁿ as S. Finally, in this unit we study the orthogonal complement W⊥ of a subspace W of an inner product space V, which is the set of all vectors in V that are orthogonal to every vector in W.
Definition. Let V be a vector space over F. An inner product on V is a function that assigns, to every ordered pair of vectors x, y ∈ V, a scalar ⟨x, y⟩ in F such that, for all x, y, z ∈ V and all c ∈ F, the following hold:
1. ⟨x + z, y⟩ = ⟨x, y⟩ + ⟨z, y⟩.
2. ⟨cx, y⟩ = c⟨x, y⟩.
3. ⟨x, y⟩ = \overline{⟨y, x⟩}, where the bar denotes complex conjugation.
4. ⟨x, x⟩ > 0 if x ≠ 0.
Note that (3) reduces to ⟨x, y⟩ = ⟨y, x⟩ if F = R. Conditions (1) and (2) simply require that the inner product be linear in the first component. It is easily shown that if a₁, a₂, ……, a_n ∈ F and y, v₁, v₂, ……, v_n ∈ V, then
⟨\sum_{i=1}^{n} a_i v_i, y⟩ = \sum_{i=1}^{n} a_i ⟨v_i, y⟩.
For example, for x = (a₁, a₂, ……, a_n) and y = (b₁, b₂, ……, b_n) in Fⁿ, define
⟨x, y⟩ = \sum_{i=1}^{n} a_i \overline{b_i}.
The verification that ⟨·, ·⟩ satisfies conditions (1) through (4) is easy. For example, if z = (c₁, c₂, ……, c_n), we have for (1)
⟨x + z, y⟩ = \sum_{i=1}^{n} (a_i + c_i)\overline{b_i} = \sum_{i=1}^{n} a_i \overline{b_i} + \sum_{i=1}^{n} c_i \overline{b_i} = ⟨x, y⟩ + ⟨z, y⟩.
Thus, for x = (1 + i, 4) and y = (2 − 3i, 4 + 5i) in C²,
⟨x, y⟩ = (1 + i)(2 + 3i) + 4(4 − 5i) = 15 − 15i = 15(1 − i).
The inner product in the above example is called the standard inner product on Fⁿ. When F = R the conjugations are not needed, and in earlier courses this standard inner product is usually called the dot product and is denoted by x · y instead of ⟨x, y⟩.
Example - 1. If 〈x, y〉 is any inner product on a vector space V and r > 0, we may define
another inner product by the rule 〈x, y〉1 = r 〈x, y〉. If r ≤ 0, then (4) would not hold.
Example -2. Let V = C([0,1]), the vector space of real-valued continuous functions on [0,1]. For f, g ∈ V, define ⟨f, g⟩ = \int_0^1 f(t)g(t)\,dt. Since the preceding integral is linear in f, (1) and (2) are immediate, and (3) is trivial. If f ≠ 0, then f² is bounded away from zero on some subinterval of [0,1], and hence ⟨f, f⟩ = \int_0^1 f(t)²\,dt > 0.
Example - 3. Let A = \begin{pmatrix} i & 1+2i \\ 2 & 3+4i \end{pmatrix}. Then the conjugate transpose of A is A* = \begin{pmatrix} −i & 2 \\ 1−2i & 3−4i \end{pmatrix}.
A vector space V over a field F endowed with a specific inner product is called an
inner product space. If F = C, we call V a complex inner product space, whereas if
76
Linear Algebra
F = R, we call V a real inner product space. It is clear that if V has an inner product
〈x, y〉 and W is a subspace of V, then W is also an inner product space when the same
function 〈x, y〉 is restricted to the vectors x, y ∈ W.
Note. Let V be an inner product space. Then for x, y, z ∈ V and c ∈ F, the following statements are true.
(i) ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩.
(ii) ⟨x, cy⟩ = \overline{c}⟨x, y⟩.
(iii) ⟨x, 0⟩ = ⟨0, x⟩ = 0.
(iv) ⟨x, x⟩ = 0 if and only if x = 0.
(v) If ⟨x, y⟩ = ⟨x, z⟩ for all x ∈ V, then y = z.
For x ∈ V, the norm (or length) of x is defined by ‖x‖ = \sqrt{⟨x, x⟩}; in particular, for x = (a₁, a₂, ………, a_n) in Fⁿ with the standard inner product, ‖x‖² = \sum_{i=1}^{n} |a_i|².
Theorem 2. 1. 1. Let V be an inner product space over F. Then for all x, y ∈ V and c ∈ F, the following statements are true.
(i) ‖cx‖ = |c| · ‖x‖.
(ii) ‖x‖ = 0 if and only if x = 0. In any case, ‖x‖ ≥ 0.
(iii) (Cauchy–Schwarz Inequality) |⟨x, y⟩| ≤ ‖x‖ · ‖y‖.
(iv) (Triangle Inequality) ‖x + y‖ ≤ ‖x‖ + ‖y‖.
In particular, if we set c = \frac{⟨x, y⟩}{⟨y, y⟩} (for y ≠ 0), the inequality 0 ≤ ‖x − cy‖² becomes
0 ≤ ⟨x, x⟩ − \frac{|⟨x, y⟩|²}{⟨y, y⟩},
from which the Cauchy–Schwarz inequality follows.
Note. If S = {v₁, v₂, …, v_k}, then S is orthonormal if and only if ⟨v_i, v_j⟩ = δ_{ij}, where δ_{ij} denotes the Kronecker delta. Also, observe that multiplying vectors by nonzero scalars does not affect their orthogonality, and that if x is any nonzero vector, then (1/‖x‖)x is a unit vector. The process of multiplying a nonzero vector by the reciprocal of its length is called normalizing.
Example - 5. In F³, {(1, 1, 0), (1, –1, 1), (–1, 1, 2)} is an orthogonal set of nonzero vectors, but it is not orthonormal; however, if we normalize the vectors in the set, we obtain the orthonormal set {\frac{1}{\sqrt{2}}(1, 1, 0), \frac{1}{\sqrt{3}}(1, −1, 1), \frac{1}{\sqrt{6}}(−1, 1, 2)}.
Illustrative Example-6. Let u and v be two vectors in an inner product space V such that ‖u + v‖ = ‖u‖ + ‖v‖. Prove that u and v are linearly dependent vectors. Give an example to show that the converse of this statement is not true.
Solution. We have ‖u + v‖ = ‖u‖ + ‖v‖
⇒ ‖u + v‖² = (‖u‖ + ‖v‖)²
⇒ ⟨u + v, u + v⟩ = ‖u‖² + 2‖u‖‖v‖ + ‖v‖²
⇒ ‖u‖² + 2⟨u, v⟩ + ‖v‖² = ‖u‖² + 2‖u‖‖v‖ + ‖v‖²
⇒ ⟨u, v⟩ = ‖u‖‖v‖
⇒ u and v are linearly dependent vectors, since equality holds in the Cauchy–Schwarz inequality only for dependent vectors.
The converse is not true, because the vectors u = (−1, 0, 1) and v = (2, 0, −2) in R³ are linearly dependent, as v = −2u, but ‖u + v‖ = ‖(1, 0, −1)‖ = √2 ≠ 3√2 = ‖u‖ + ‖v‖.
Illustrative Example -7. Let V be a real inner product space and u, v ∈ V. Then prove that
(i) ‖u + v‖² − ‖u − v‖² = 4⟨u, v⟩;
(ii) ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
Solution. Expanding, ‖u + v‖² = ‖u‖² + 2⟨u, v⟩ + ‖v‖² …(1) and ‖u − v‖² = ‖u‖² − 2⟨u, v⟩ + ‖v‖² …(2). On subtracting (2) from (1), we get ‖u + v‖² − ‖u − v‖² = 4⟨u, v⟩; on adding (1) and (2), we get ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
Example -1. The standard ordered basis for Fn is an orthonormal basis for Fn.
The next couple of results follow immediately from the above theorem
Corollary-1. If, in addition to the hypotheses of the above theorem, S is orthonormal and y ∈ span(S), then y = \sum_{i=1}^{k} ⟨y, v_i⟩ v_i.
If V possesses a finite orthonormal basis, then Corollary 1 allows us to compute the coefficients in a linear combination very easily.
Thus the normalized vectors are u₁ = \frac{1}{\sqrt{2}}(1, 1, 0), u₂ = \frac{1}{\sqrt{3}}(1, −1, 1) and u₃ = \frac{1}{\sqrt{6}}(−1, 1, 2). As a check, we have (2, 1, 3) = \frac{3}{2}(1, 1, 0) + \frac{4}{3}(1, −1, 1) + \frac{5}{6}(−1, 1, 2).
Example. Let w₁ = (1, 0, 1, 0), w₂ = (1, 1, 1, 1) and w₃ = (0, 1, 2, 1) in R⁴. Use the Gram–Schmidt process to compute the orthogonal vectors v₁, v₂ and v₃, and then normalize these vectors to obtain an orthonormal set.
Solution. Take v₁ = w₁ = (1, 0, 1, 0).
Then v₂ = w₂ − \frac{⟨w_2, v_1⟩}{‖v_1‖²} v₁ = (1, 1, 1, 1) − \frac{2}{2}(1, 0, 1, 0) = (0, 1, 0, 1).
Finally, v₃ = w₃ − \frac{⟨w_3, v_1⟩}{‖v_1‖²} v₁ − \frac{⟨w_3, v_2⟩}{‖v_2‖²} v₂ = (0, 1, 2, 1) − \frac{2}{2}(1, 0, 1, 0) − \frac{2}{2}(0, 1, 0, 1) = (−1, 0, 1, 0).
These vectors can be normalized to obtain the orthonormal basis {u₁, u₂, u₃}, where u₁ = \frac{1}{\sqrt{2}}(1, 0, 1, 0), u₂ = \frac{1}{\sqrt{2}}(0, 1, 0, 1) and u₃ = \frac{1}{\sqrt{2}}(−1, 0, 1, 0).
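The Gram–Schmidt process of this example is easily programmed. A minimal Python sketch (the function name gram_schmidt is our own; it implements exactly the projection formula used above) reproduces v₁, v₂, v₃ in normalized form:

    import numpy as np

    def gram_schmidt(ws):
        """Classical Gram-Schmidt: orthogonalize, then normalize."""
        vs = []
        for w in ws:
            v = w.astype(float)
            for u in vs:                     # subtract the projection of w
                v = v - (np.dot(w, u) / np.dot(u, u)) * u  # onto each earlier v
            vs.append(v)
        return [v / np.linalg.norm(v) for v in vs]

    ws = [np.array([1, 0, 1, 0]), np.array([1, 1, 1, 1]), np.array([0, 1, 2, 1])]
    for u in gram_schmidt(ws):
        print(u)   # (1,0,1,0)/sqrt(2), (0,1,0,1)/sqrt(2), (-1,0,1,0)/sqrt(2)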
2. 3. Orthogonal Complement
Definition. Let S be a nonempty subset of an inner product space V. We define S⊥ to be the set of all vectors in V that are orthogonal to every vector in S; that is, S⊥ = {x ∈ V: ⟨x, y⟩ = 0 for all y ∈ S}. The set S⊥ is called the orthogonal complement of S. It is easily seen that S⊥ is a subspace of V for any subset S of V.
Theorem. Let W be a finite-dimensional subspace of an inner product space V, and let y ∈ V. Then there exist unique vectors u ∈ W and z ∈ W⊥ such that y = u + z; furthermore, if {v₁, v₂, . . . , v_k} is an orthonormal basis for W, then u = \sum_{i=1}^{k} ⟨y, v_i⟩ v_i.
Proof. Let {v₁, v₂, . . . , v_k} be an orthonormal basis for W, let u be as defined in the preceding equation, and let z = y − u. Clearly u ∈ W and y = u + z. To show that z ∈ W⊥, it suffices to show that z is orthogonal to each v_j. For any j, we have
⟨z, v_j⟩ = ⟨y − \sum_{i=1}^{k} ⟨y, v_i⟩v_i, v_j⟩ = ⟨y, v_j⟩ − \sum_{i=1}^{k} ⟨y, v_i⟩⟨v_i, v_j⟩ = ⟨y, v_j⟩ − ⟨y, v_j⟩ = 0.
Theorem. Suppose S = {v₁, v₂, . . . . , v_k} is an orthonormal set in an n-dimensional inner product space V. Then (i) S can be extended to an orthonormal basis for V; (ii) if W = span(S), the added vectors form an orthonormal basis S₁ for W⊥; (iii) for any subspace W of V, dim(V) = dim(W) + dim(W⊥).
Proof. (i) Any generating set for V contains at least n vectors, and a generating set for the vector space V that contains exactly n vectors is a basis for V, so S can be extended to an ordered basis S′ = {v₁, v₂, . . . . , v_k, w_{k+1}, . . . . . , w_n} for V. Now apply the Gram-Schmidt Orthogonalization Process to S′. The first k vectors resulting from this process are the vectors in S, and this new set spans V. Normalizing the last (n − k) vectors of this set produces an orthonormal set that spans V. The result follows.
(ii) Because S₁ is a subset of a basis, it is linearly independent. Since S₁ is clearly a subset of W⊥, we need only show that it spans W⊥. Writing v_{k+1}, …, v_n for the added orthonormal vectors, for any x ∈ V we have x = \sum_{i=1}^{n} ⟨x, v_i⟩v_i. If x ∈ W⊥, then ⟨x, v_i⟩ = 0 for 1 ≤ i ≤ k. Therefore, x = \sum_{i=k+1}^{n} ⟨x, v_i⟩v_i ∈ span(S₁).
(iii) Let W be a subspace of V. It is a finite-dimensional inner product space because V is, and so it has an orthonormal basis {v₁, v₂, ……, v_k}. By (i) and (ii), we have dim(V) = n = k + (n − k) = dim(W) + dim(W⊥).
Example -1. Let W = span ({e1 , e2}) in F3. Then x = (a, b, c) ∈ W⊥ if and only if
0 = 〈 x, e1 〉 = a and 0 = 〈 x, e2 〉 = b. So x = (0, 0, c), and therefore W⊥ = span ({e3}).
One can deduce the same result by noting that e3 ∈ W⊥ and from (iii), that dim (W⊥) =
3 – 2 = 1.
Illustrative Example -2. Let C[−π, π] be the inner product space of all continuous functions defined on [−π, π], with the inner product defined by ⟨f, g⟩ = \int_{−π}^{π} f(t)g(t)\,dt. Then
⟨sin t, cos t⟩ = \int_{−π}^{π} \sin t \cos t\,dt = \frac{1}{2}\int_{−π}^{π} \sin 2t\,dt = 0.
Thus, sin t and cos t are orthogonal functions in the inner product space C[−π, π].
Illustrative Example -3. Let u = (−1, 4, −3) be a vector in the inner product space R³ with the standard inner product. Find a basis of the subspace u⊥ of R³.
Solution. We have
u⊥ = {v ∈ R³ : ⟨v, u⟩ = 0} = {(x, y, z) ∈ R³ : −x + 4y − 3z = 0}.
Thus, u⊥ consists of all vectors v = (x, y, z) such that −x + 4y − 3z = 0. In this equation there are only two free variables. Taking y and z as free variables, we find that
y = 1, z = 1 ⇒ x = 1;  y = 0, z = 1 ⇒ x = −3.
Thus, v₁ = (1, 1, 1) and v₂ = (−3, 0, 1) are two independent solutions of −x + 4y − 3z = 0. Hence, {v₁ = (1, 1, 1), v₂ = (−3, 0, 1)} forms a basis for u⊥.
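Equivalently, u⊥ is the null space of the 1 × 3 matrix whose single row is u, so a basis can be computed directly. A short SymPy sketch (our own illustration; note that any two independent solutions form an equally valid basis):

    from sympy import Matrix

    u = Matrix([[-1, 4, -3]])    # u-perp is the null space of this 1 x 3 matrix
    print(u.nullspace())         # [(4, 1, 0)^T, (-3, 0, 1)^T]
    # These two independent solutions of -x + 4y - 3z = 0 span the same
    # plane as the basis {(1, 1, 1), (-3, 0, 1)} found above.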
Note (Bessel's Inequality). Let S = {v₁, v₂, …, v_k} be an orthonormal set in an inner product space V. For any x ∈ V we have, using ⟨v_i, v_j⟩ = δ_{ij},
0 ≤ ‖x − \sum_{i=1}^{k} ⟨x, v_i⟩v_i‖² = ‖x‖² − 2\sum_{i=1}^{k} |⟨x, v_i⟩|² + \sum_{i=1}^{k} |⟨x, v_i⟩|² = ‖x‖² − \sum_{i=1}^{k} |⟨x, v_i⟩|².
Hence \sum_{i=1}^{k} |⟨x, v_i⟩|² ≤ ‖x‖². Now, if \sum_{i=1}^{k} |⟨x, v_i⟩|² = ‖x‖², then
⇒ ‖x − \sum_{i=1}^{k} ⟨x, v_i⟩v_i‖² = 0
⇒ x − \sum_{i=1}^{k} ⟨x, v_i⟩v_i = 0
⇒ x = \sum_{i=1}^{k} ⟨x, v_i⟩v_i.
2. 4. Summary
In R³, given a = (a₁, a₂, a₃) and b = (b₁, b₂, b₃), the dot product (or inner product) a · b = a₁b₁ + a₂b₂ + a₃b₃ is well known, and the angle θ between a and b is derived from the equation cos θ = \frac{a · b}{‖a‖‖b‖}. Here, we carry the idea of distance or length into vector spaces via a much richer structure, the so-called inner product space structure. This added structure provides applications to geometry, physics, conditioning in systems of linear equations, least squares and quadratic forms.
In mathematics, particularly linear algebra and numerical analysis, the Gram–
Schmidt process is a method for orthonormalising a set of vectors in an inner product
space, most commonly the Euclidean space Rn. The Gram–Schmidt process takes a
finite, linearly independent set S = {v1, …, vk} for k ≤ n and generates an orthogonal set S′
= {u1, …, uk} that spans the same k-dimensional subspace of Rn as S. Also, the
orthogonal complement W of a subspace W of an inner product space V is the set of all
vectors in V that are orthogonal to every vector in W .
2. 5. Keywords
Gram – Schmidt orthogonalization
Inner product Orthogonal complement
Inner product space Orthogonal matrix
Norm of a matrix Orthogonally equivalent matrices
Norm of a vector Orthogonal operator
Normal operator Orthogonal vectors
Normalizing a vector Orthonormal
UNIT-3
3. 0. Introduction
This unit investigates the space A(V) of linear operators T on an inner product space V. Adjoints of operators generalize conjugate transposes of square matrices to (possibly) infinite-dimensional situations. The adjoint of an operator A is also sometimes called the Hermitian conjugate (after Charles Hermite) of A. Most of the results on unitary spaces are identical to the corresponding results on inner product spaces. In this unit, we learn about normal and self-adjoint operators, unitary and orthogonal operators and their matrices, orthogonal projections and the spectral theorem.
Theorem 3. 1. 1. Let V be a finite- dimensional inner product space over a field F, and
let g: V → F be a linear transformation. Then there exists a unique vector y ∈ V such
that g(x) = 〈 x, y 〉 for all x ∈ V.
Proof. Let B = {v₁, v₂, ……, v_n} be an orthonormal basis for V, and let y = \sum_{i=1}^{n} \overline{g(v_i)}\,v_i. Define h: V → F by h(x) = ⟨x, y⟩; h is clearly linear, and for 1 ≤ j ≤ n we have h(v_j) = ⟨v_j, y⟩ = \sum_{i=1}^{n} g(v_i)⟨v_j, v_i⟩ = g(v_j). Since two linear transformations U, T: V → W that agree on a basis {v₁, v₂, . . . , v_n} of V are equal (U(v_i) = T(v_i) for i = 1, 2, …, n implies U = T), we conclude that g = h, so g(x) = ⟨x, y⟩ for all x ∈ V.
To show that y is unique, suppose that g(x) = ⟨x, y₁⟩ for all x. Then ⟨x, y⟩ = ⟨x, y₁⟩ for all x ∈ V, and hence y = y₁.
Note. The linear operator T* described in the above theorem is called the adjoint of the operator T. Thus T* is the unique operator on V satisfying ⟨T(x), y⟩ = ⟨x, T*(y)⟩ for all x, y ∈ V.
Theorem. Let V be an inner product space with an orthonormal basis B = {v₁, v₂, . . . , v_n}, let T be a linear operator on V, and let A = [T]_B. Then [T*]_B = A*.
Proof. Let C = [T*]_B. For any i and j, A_{ij} = ⟨T(v_j), v_i⟩, and we have
C_{ij} = ⟨T*(v_j), v_i⟩ = \overline{⟨v_i, T*(v_j)⟩} = \overline{⟨T(v_i), v_j⟩} = \overline{A_{ji}} = (A*)_{ij}.
Hence C = A*.
Note.
1. Let V be an inner product space, and let T and U be linear operators on V. Then
(i) (T + U)* = T* + U*;
(ii) (cT)* = \overline{c}T* for any c ∈ F;
(iii) (TU)* = U*T*;
(iv) T** = T;
(v) I* = I.
2. Analogous results hold for matrices: (i) (A + B)* = A* + B*; (ii) (cA)* = \overline{c}A*; (iii) (AB)* = B*A*; (iv) A** = A; (v) I* = I.
Note. (Schur) Let T be a linear operator on a finite-dimensional inner product space V, and suppose that the characteristic polynomial of T splits. Then there exists an orthonormal basis B for V such that the matrix [T]_B is upper triangular. This is known as Schur's Theorem.
Example - 4. Suppose that A is a real skew - symmetric matrix, that is, AT= − A. Then A
is normal because both AAT and ATA are equal to −A2.
Note. If T is normal and T(x) = λx, let U = T − λI; then U is also normal, and
0 = ‖U(x)‖ = ‖U*(x)‖ = ‖(T* − \overline{λ}I)(x)‖ = ‖T*(x) − \overline{λ}x‖,
so that T*(x) = \overline{λ}x. Moreover, if λ₁ and λ₂ are distinct eigenvalues of T with corresponding eigenvectors x₁ and x₂, then
λ₁⟨x₁, x₂⟩ = ⟨λ₁x₁, x₂⟩ = ⟨T(x₁), x₂⟩ = ⟨x₁, T*(x₂)⟩ = ⟨x₁, \overline{λ_2}x₂⟩ = λ₂⟨x₁, x₂⟩.
Since λ₁ ≠ λ₂, we conclude that ⟨x₁, x₂⟩ = 0.
Theorem. Let T be a self-adjoint operator on a finite-dimensional inner product space V. Then (i) every eigenvalue of T is real, and (ii) the characteristic polynomial of T splits.
Proof. (i) Suppose that T(x) = λx for x ≠ 0. Because a self-adjoint operator is also normal, T is a normal operator on the inner product space V; hence, if x is an eigenvector of T, then x is also an eigenvector of T*, and in fact T(x) = λx implies T*(x) = \overline{λ}x. We thus obtain λx = T(x) = T*(x) = \overline{λ}x. So λ = \overline{λ}; that is, λ is real.
(ii) Let dim(V) = n, let B be an orthonormal basis for V, and let A = [T]_B. Then A is self-adjoint. Let T_A be the linear operator on Cⁿ defined by T_A(x) = Ax for all x ∈ Cⁿ. Note that T_A is self-adjoint because [T_A]_γ = A, where γ is the standard ordered (orthonormal) basis for Cⁿ. So, by (i), the eigenvalues of T_A are real. By the Fundamental Theorem of Algebra, the characteristic polynomial of T_A splits into factors of the form (t − λ). Since each λ is real, the characteristic polynomial splits over R. But T_A has the same characteristic polynomial as A, which has the same characteristic polynomial as T. Therefore the characteristic polynomial of T splits.
But A* = [T]*_B = [T*]_B = [T]_B = A. So A and A* are both upper triangular; since A* is the conjugate transpose of A, this forces A to be a diagonal matrix.
Note. In the infinite-dimensional case, an operator satisfying the preceding norm requirement is generally called an isometry. If, in addition, the operator is onto (the norm condition already guarantees one-to-one), then the operator is called a unitary or orthogonal operator.
Example -1. Let H be the inner product space of continuous complex-valued functions on [0, 2π] with ⟨f, g⟩ = \frac{1}{2π}\int_0^{2π} f(t)\overline{g(t)}\,dt, and let h ∈ H satisfy |h(t)| = 1 for all t. Define the linear operator T on H by T(f) = hf. Then
‖T(f)‖² = ‖hf‖² = \frac{1}{2π}\int_0^{2π} h(t)f(t)\overline{h(t)f(t)}\,dt = \frac{1}{2π}\int_0^{2π} |f(t)|²\,dt = ‖f‖²,
since |h(t)|² = 1. Thus T is an isometry.
Note. Let T be a linear operator on a finite-dimensional inner product space V. Then the following statements are equivalent.
(i) TT* = T*T = I.
(ii) ⟨T(x), T(y)⟩ = ⟨x, y⟩ for all x, y ∈ V.
Note.
1. Since for a real matrix A we have A* = A^T, a real unitary matrix is also orthogonal. In this case, we call A orthogonal rather than unitary.
2. The condition AA* = I is equivalent to the statement that the rows of A form an orthonormal basis for Fⁿ, because
δ_{ij} = I_{ij} = (AA*)_{ij} = \sum_{k=1}^{n} A_{ik}(A*)_{kj} = \sum_{k=1}^{n} A_{ik}\overline{A_{jk}},
and the last term represents the inner product of the ith and jth rows of A. Similarly, the condition A*A = I is equivalent to the statement that the columns of A form an orthonormal basis for Fⁿ.
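These row and column conditions are easy to test numerically. A minimal Python sketch using a rotation matrix of R² as a typical orthogonal matrix (the angle 0.7 is an arbitrary choice of ours):

    import numpy as np

    t = 0.7                                   # arbitrary angle
    Q = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])   # rotation of R^2: a real orthogonal matrix
    print(np.allclose(Q @ Q.T, np.eye(2)))    # True: rows are orthonormal  (AA* = I)
    print(np.allclose(Q.T @ Q, np.eye(2)))    # True: columns are orthonormal (A*A = I)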
Note. If r(T)⊥ = n(T), then r(T)⊥ ⊥ = n(T) ⊥, provided V is a finite dimensional inner
product space over a field F.
(ii) If W′_i denotes the direct sum of the subspaces W_j for j ≠ i, then W_i⊥ = W′_i.
(iv) I = T₁ + T₂ + . . . . . . . + T_k.
(v) T = λ₁T₁ + λ₂T₂ + . . . . . . . + λ_kT_k.
Fact 1. Let T be a linear operator on a finite -dimensional complex inner product space
V. Then T is normal if and only if there exists an orthonormal basis for V consisting of
eigenvectors of T.
Fact 2. Let T be a linear operator on a finite- dimensional real inner product space V.
Then T is self - adjoint if and only if there exists an orthonormal basis B for V consisting
of eigenvectors of T.
Fact 4. Let V be an inner product space, and let T be a normal operator on V. If λ₁ and λ₂ are distinct eigenvalues of T with corresponding eigenvectors x₁ and x₂, then x₁ and x₂ are orthogonal.
Fact 5. Suppose that S = { v1, v2, . . . . . . ,vk }is an orthonormal set in an n - dimensional
inner product space V. Then
a) S can be extended to an orthonormal basis { v1, v2, . . .. .vk, vk + 1, . . . . .vn } for V.
On the other hand, we have dim(W_i⊥) = dim(V) − dim(W_i) by Fact 5. Hence W′_i ⊆ W_i⊥ together with this dimension count gives W′_i = W_i⊥, proving (ii). The proof of (iii) is obvious.
(iv) Since T_i is the orthogonal projection of V on W_i, it follows from (ii) that the null space of T_i is W_i⊥ = W′_i. Hence, for x ∈ V, we have x = x₁ + x₂ + . . . . . + x_k, where T_i(x) = x_i ∈ W_i, proving (iv).
(v) For x ∈ V, write x = x₁ + x₂ + . . . . . . . + x_k, where x_i ∈ W_i. Then
T(x) = T(x₁) + T(x₂) + . . . . . . . + T(x_k)
= λ₁x₁ + λ₂x₂ + . . . . . . . + λ_kx_k
= λ₁T₁(x) + λ₂T₂(x) + . . . . . . . + λ_kT_k(x)
= (λ₁T₁ + λ₂T₂ + . . . . . . . + λ_kT_k)(x).
This completes the proof.
3. 5. Summary
The spectral theorem also provides a canonical decomposition, called the spectral decomposition, eigenvalue decomposition, or eigendecomposition, of the underlying vector space on which the operator acts. The importance of diagonalizable operators was seen earlier in this block. For normal and self-adjoint operators, it is necessary and sufficient that the vector space V possess a basis of eigenvectors. As V is an inner product space in this block, it is reasonable to seek conditions that guarantee that V has an orthonormal basis of eigenvectors.
3. 6. Keywords
Adjoint of a linear operator Self - adjoint matrix
Adjoint of a matrix Self - adjoint operator
Hermitian Spectrum
Normal matrix Spectral decomposition
Normal operator Spectral Theorem
Normalizing a vector Unitarily equivalent matrices
Orthogonal operator Unitary matrix
Orthogonal vectors Unitary operator
Orthonormal basis
UNIT - 4
4. 0. Introduction
In this unit, we will generalize the notion of linear forms. In fact, we will
introduce the notion of a bilinear form on a finite-dimensional vector space. We have
studied linear forms on V(F). Here, we will study bilinear forms as mapping from V×V to
F, which are linear forms in each variable. Bilinear forms also give rise to quadratic and
Hermitian forms.
Example -1. Let V be a vector space over F = R. Then the mapping defined by
B(x, y) = x . y (which is the inner product of x and y) for x, y in V, is a bilinear form on V.
Example - 2. For x = (x₁, x₂) and y = (y₁, y₂) in R², the mapping defined by B(x, y) = 2x₁y₁ + 3x₁y₂ + 4x₂y₂ is a bilinear form on R².
Example - 3. Let V = Fn, where the vectors are considered as column vectors.
For any A ∈ Mn×n(F), define B : V × V → F by B(x, y) = xT Ay for x, y ∈ V.
Notice that since x and y are n × 1 matrices and A is an n × n matrix, B(x, y) is a 1 × 1
matrix. We identify this matrix with its single entry. The bilinearity of B follows as in Example 2. For example, for a ∈ F and x₁, x₂, y ∈ V, we have
B(ax₁ + x₂, y) = (ax₁ + x₂)^T A y = (ax₁^T + x₂^T)Ay = ax₁^T Ay + x₂^T Ay = aB(x₁, y) + B(x₂, y).
Note.
1. By x^T A y, we understand the product of three matrices; that is,
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, A = [a_{ij}]_{n×n} and y^T = [y_1, y_2, . . . . ., y_n].
2. For any bilinear form B on a vector space V over a field F, the following properties hold:
(i) If, for any x ∈ V, the functions T_x, R_x: V → F are defined by T_x(y) = B(x, y) and R_x(y) = B(y, x) for all y ∈ V, then T_x and R_x are linear.
(ii) B(0, x) = B(x, 0) = 0 for all x ∈ V.
(iii) For all x, y, z, w ∈ V,
B(x + y, z + w) = B(x, z) + B(x, w) + B(y, z) + B(y, w).
(iv) If S: V × V → F is defined by S(x, y) = B(y, x), then S is a bilinear form.
Definition 4. 1. 2. Let V be a vector space, let B₁ and B₂ be bilinear forms on V, and let a be a scalar. We define the sum B₁ + B₂ and the scalar product aB₁ by the equations
(B₁ + B₂)(x, y) = B₁(x, y) + B₂(x, y) and
(aB₁)(x, y) = a(B₁(x, y)) for all x, y ∈ V.
Note.
1. For any vector space V, the sum of two bilinear forms and the product of a scalar and a bilinear form on V are again bilinear forms on V. Furthermore, B(V) is a vector space with respect to these operations.
2. Let B = {v₁, v₂, . . . . . , v_n} be an ordered basis for an n-dimensional vector space V, and let B ∈ B(V). We can associate with B an n × n matrix A whose entry in the ith row and jth column is defined by A_{ij} = B(v_i, v_j) for i, j = 1, 2, . . . . . , n.
3. The matrix A above is called the matrix representation of B with respect to the ordered basis B and is denoted by Ψ_B(B). The map Ψ_B: B(V) → M_{n×n}(F) is an isomorphism.
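The matrix representation Ψ_B(B) can be computed directly from its definition A_{ij} = B(v_i, v_j). A minimal Python sketch (the helper name bilinear_matrix is our own), applied to a form B(x, y) = x^T M y as in Example 3:

    import numpy as np

    def bilinear_matrix(B, basis):
        """Matrix representation of a bilinear form: A[i][j] = B(v_i, v_j)."""
        return np.array([[B(vi, vj) for vj in basis] for vi in basis])

    M = np.array([[1., 2.],
                  [0., 3.]])
    B = lambda x, y: x @ M @ y                  # B(x, y) = x^T M y
    e = [np.array([1., 0.]), np.array([0., 1.])]
    print(bilinear_matrix(B, e))                # recovers M for the standard basis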
Theorem. A bilinear form B ∈ B(V) is symmetric if and only if its matrix representation C = Ψ_B(B) with respect to an ordered basis B is a symmetric matrix.
Proof. Let B = {v₁, v₂, . . . . . , v_n} and C = Ψ_B(B). First assume that B is symmetric. Then for 1 ≤ i, j ≤ n, C_{ij} = B(v_i, v_j) = B(v_j, v_i) = C_{ji}, and it follows that C is symmetric.
Conversely, suppose that C is symmetric. Let S: V × V → F, where F is the field of scalars for V, be the mapping defined by S(x, y) = B(y, x) for all x, y ∈ V; then S is a bilinear form. Let D = Ψ_B(S). Then for 1 ≤ i, j ≤ n, D_{ij} = S(v_i, v_j) = B(v_j, v_i) = C_{ji} = C_{ij}. Thus Ψ_B(S) = D = C = Ψ_B(B), and since Ψ_B is one-to-one, S = B. Hence B(y, x) = B(x, y) for all x, y ∈ V; that is, B is symmetric.
Proof. Suppose that B is diagonalizable. Then there is an ordered basis B for V such that
chosen to be any orthonormal basis for V such that ΨB(B) is a diagonal matrix.
4. 2. Summary
There is a certain class of scalar - valued functions of two variables defined on a
vector space that arises in the study of such diverse subjects as geometry and
multivariable calculus. This is the class of bilinear forms. We study the basic properties
of this class with a special emphasis on symmetric bilinear forms, and we consider some
of its applications to quadratic surfaces. a quadratic form is a homogeneous polynomial
of degree two in a number of variables. Quadratic forms occupy a central place in
103
B. Chaluvaraju
various branches of mathematics, including number theory, linear algebra, group theory.
Quadratic forms are homogeneous quadratic polynomials in n variables.
4. 3. Keywords
Bilinear form Symmetric bilinear form
Diagonalizable bilinear form Quadratic form
Exercises
1. If λ ∈ F is an eigenvalue of T ∈ A(V), then show that there is a vector v ≠ 0 in V such that T(v) = λv.
2. If λ ∈ F is an eigenvalue of T ∈ A(V), then show that for any polynomial q(x) ∈ F[x], q(λ) is an eigenvalue of q(T).
3. Find all the eigenvalues of the following matrices:
(i) A = \begin{pmatrix} 2 & 2 \\ 1 & 3 \end{pmatrix} and (ii) B = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 4 & 4 & 5 \end{pmatrix}
10. Find the adjoint of the linear operator T: R3→ R3 defined by T(x, y, z) = (x+2y,
3x–4z, y).
Answer. T*(x, y, z) = (x+3y, 2x+z, –4y).
References
1. S. Friedberg, A. Insel and L. Spence – Linear Algebra, Fourth Edition, PHI, 2009.
2. Jimmie Gilbert and Linda Gilbert – Linear Algebra and Matrix Theory, Academic Press (an imprint of Elsevier), 2010.
3. Hoffman and Kunze – Linear Algebra, 2nd Edition, Prentice-Hall of India, 1978.
4. P. R. Halmos – Finite Dimensional Vector Spaces, D. Van Nostrand, 1958.
5. S. Kumaresan – Linear Algebra: A Geometric Approach, Prentice Hall India, 2000.
BLOCK-III
Canonical Forms
Objectives
After studying this block you will be able to:
1. Understand the basic concepts of each unit.
2. Study the importance of canonical forms.
• Explain the defining properties of the above.
• Give examples of each concept.
• Describe some important theorems along with their proofs.
• Work some illustrative examples.
UNIT-1
1. 0. Introduction
If every linear operator is not diagonalizable, even if its characteristic polynomial
splits. That purpose of the this unit to consider alternative matrix representations for
nondiagonalizable operators. Such representation is generally, known as canonical forms.
Here, we study the diagonal and triangular canonical form.
Note.
1. Two square matrices A and B are said to be similar if there exists a non-singular matrix C such that B = CAC⁻¹ or A = C⁻¹BC.
2. The linear transformations S, T ∈ A(V) are said to be similar if there exists an invertible element C ∈ A(V) such that T = C⁻¹SC.
3. Similarity of linear transformations in A(V) is an equivalence relation, because
(i) T ∼ T, as T = ITI⁻¹;
(ii) T ∼ S ⇒ T = CSC⁻¹ ⇒ S = C⁻¹T(C⁻¹)⁻¹ ⇒ S ∼ T;
(iii) T ∼ S, S ∼ U ⇒ T = CSC⁻¹, S = DUD⁻¹ ⇒ T = C(DUD⁻¹)C⁻¹ = (CD)U(CD)⁻¹ ⇒ T ∼ U.
The equivalence classes are called similarity classes.
Basic definitions and Facts.
1. The linear operator T is called diagonalizable if there exists a basis for V with
respect to which the matrix for T is a diagonal matrix.
2. Let T be a linear operator on a vector space V, and let λ be an eigenvalue of T.
Define Eλ = {x ∈ V: T(x) = λx} = n(T – λIv). The set E λ is called the eigenspace
of T corresponding to the eigenvalue λ. Analogously, we define the eigenspace of
a square matrix A to be the eigenspace of TA.
Example -1. Let T be the linear operator on P₂(R) defined by T(f(x)) = f′(x). The matrix representation of T with respect to the standard ordered basis B for P₂(R) is
[T]_B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}.
Consequently, the characteristic polynomial of T is
det([T]_B − tI) = \begin{vmatrix} −t & 1 & 0 \\ 0 & −t & 2 \\ 0 & 0 & −t \end{vmatrix} = −t³.
Thus T has only one eigenvalue (λ = 0), with multiplicity 3. Solving T(f(x)) = f′(x) = 0 shows that E_λ = n(T − λI_V) = n(T) is the subspace of P₂(R) consisting of the constant polynomials. So {1} is a basis for E_λ, and therefore dim(E_λ) = 1.
Consequently, there is no basis for P₂(R) consisting of eigenvectors of T, and therefore T is not diagonalizable.
Illustrative Example- 2. Test whether the matrix A = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix} is diagonalizable or not.
Solution. The characteristic polynomial of A is det(A − tI) = −(t − 4)(t − 3)², which splits, and so condition 1 of the test (Fact 6(i)) for diagonalization is satisfied. Also A has eigenvalues λ₁ = 4 and λ₂ = 3 with multiplicities 1 and 2, respectively. Since λ₁ has multiplicity 1, condition 2 is satisfied for λ₁. Thus we need only test condition 2 for λ₂. Because
A − λ₂I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} has rank 2,
we see that 3 − rank(A − λ₂I) = 1, which is not the multiplicity of λ₂. Thus condition 2 fails for λ₂. Therefore A is not diagonalizable.
Remark. However, not every linear operator is diagonalizable, even if its characteristic polynomial splits. This block considers alternative matrix representations for nondiagonalizable operators (see Example 1 and Illustrative Example 2). These representations are called canonical forms.
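Both conditions of the diagonalizability test can be checked by a computer algebra system. A short SymPy sketch for the matrix of Illustrative Example 2 (our own illustration):

    from sympy import Matrix

    A = Matrix([[3, 1, 0],
                [0, 3, 0],
                [0, 0, 4]])
    print(A.is_diagonalizable())                 # False
    # dim E_3 = 3 - rank(A - 3I) = 1, smaller than the multiplicity 2 of lambda = 3:
    print(3 - (A - 3 * Matrix.eye(3)).rank())    # 1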
Lemma 1. 1. 1. If D₁, D₂ ∈ M_{n×n}(F) are two diagonal matrices, then D₁D₂ = D₂D₁.
Proof. If D₁, D₂ ∈ M_{n×n}(F) are diagonal, then (D₁)_{ik} = (D₁)_{ii}δ_{ik} and (D₂)_{kj} = (D₂)_{jj}δ_{kj}, so
(D₁D₂)_{ij} = \sum_{k=1}^{n} (D₁)_{ik}(D₂)_{kj} = \sum_{k=1}^{n} (D₁)_{ii}δ_{ik}(D₂)_{jj}δ_{kj} = (D₁)_{ii}(D₂)_{jj}δ_{ij} = (D₂)_{jj}(D₁)_{ii}δ_{ij} = (D₂D₁)_{ij}.
Theorem 1. 1. 2. If T and U are simultaneously diagonalizable operators, then T and U commute.
Proof. Let B be a basis of a vector space V that diagonalizes both T and U. Since diagonal matrices commute with each other (by the above lemma),
[TU]_B = [T]_B[U]_B = [U]_B[T]_B = [UT]_B.
Now we can relate the two operators in the same way:
TU = φ_B^{−1} T_{[TU]_B} φ_B = φ_B^{−1} T_{[UT]_B} φ_B = UT,
where φ_B is the standard representation with respect to the basis B and T_{[TU]_B} is the left-multiplication transformation by the matrix [TU]_B.
Illustrative Example - 4. Let T satisfy Tⁿ = I, with all λ_i ≠ 0. Show that if T has all its eigenvalues in F, then T is diagonalizable.
Solution. Since T has all its eigenvalues in the field F, the minimal polynomial of T is q(t) = \prod_i (t − λ_i)^{n_i}. Now we claim that all these roots are simple.
For, (\overline{T})²(v + W) = \overline{T}(\overline{T}(v + W)) = \overline{T}(T(v) + W) = T(T(v)) + W = T²(v) + W = \overline{T²}(v + W).
\overline{T}(\overline{v}_3) = a_{32}\overline{v}_2 + a_{33}\overline{v}_3
… … … … … … …
\overline{T}(\overline{v}_n) = a_{n2}\overline{v}_2 + a_{n3}\overline{v}_3 + . . . . . + a_{nn}\overline{v}_n.
We now verify that B = {v₁, v₂, ……, v_n} is a basis for V with respect to which T has a matrix in triangular form.
In fact, if S is the linear transformation of V defined by S(u_i) = v_i for i = 1, 2, ……, n, then the matrix C can be chosen to be the matrix of S in the basis B = {v₁, v₂, ………, v_n}.
Proof. Let A = [a_{ij}] and B = [b_{ij}]. Then T(u_i) = \sum_j a_{ij}u_j and T(v_i) = \sum_j b_{ij}v_j, respectively. Let S ∈ A(V) be defined by S(u_i) = v_i, so that S is invertible (since S takes a basis to a basis if and only if S is bijective, if and only if S is invertible).
Now from T(v_i) = \sum_j b_{ij}v_j
⇒ (TS)(u_i) = \sum_j b_{ij}S(u_j).
Theorem 1. 2. 4. If the matrix A ∈ F_n has all its eigenvalues in F, then there is a matrix C ∈ F_n such that CAC⁻¹ is a triangular matrix.
Proof. Suppose that A = [a_{ij}] ∈ F_n has all its eigenvalues in F. Now, define a linear map T: Fⁿ → Fⁿ on the basis B = {v₁, v₂, … …, v_n}, where v₁ = (1, 0, . . . , 0), v₂ = (0, 1, . . . , 0), . . . , v_n = (0, 0, . . . , 1), as indicated below:
T(v₁) = (a₁₁, a₁₂, ………, a₁ₙ) = a₁₁v₁ + a₁₂v₂ + a₁₃v₃ + ………… + a₁ₙvₙ
T(v₂) = (a₂₁, a₂₂, ………, a₂ₙ) = a₂₁v₁ + a₂₂v₂ + a₂₃v₃ + ………… + a₂ₙvₙ
……………………………………………………………………………….
T(vₙ) = (aₙ₁, aₙ₂, ………, aₙₙ) = aₙ₁v₁ + aₙ₂v₂ + aₙ₃v₃ + ………… + aₙₙvₙ.
Therefore the matrix of T in this basis is precisely A = [a_{ij}].
Suppose that the matrix A ∈ F_n has all its eigenvalues in F. A defines a linear transformation T on Fⁿ whose matrix in the basis v₁ = (1, 0, . . . , 0), v₂ = (0, 1, . . . , 0), . . . , v_n = (0, 0, . . . , 1) is precisely A. The eigenvalues of T, being equal to those of A, are all in F. By Theorem 1. 2. 2, there is a basis of Fⁿ in which the matrix of T is triangular. This change of basis merely changes the matrix A of the linear transformation T in the first basis into CAC⁻¹ for a suitable C ∈ F_n. By Lemma 1. 2. 3, CAC⁻¹ is a triangular matrix for some C ∈ F_n.
Note.
1. Theorem 1. 2. 4, is also known as alternate form of Theorem 1. 2. 2.
2. Next theorem, we use λi = aii for i= 1, 2, …., n
Theorem 1. 2. 5. If V is n-dimensional vector space over a field F and T∈A(V) has all
its eigen values in F, then T satisfies a polynomial of degree n over F.
Proof. Since T ∈ A(V) has all its eigenvalues in F, there is a basis B = {v₁, v₂, … …, v_n} of V which satisfies
T(v₁) = λ₁v₁ = a₁₁v₁ (where λ₁ = a₁₁)
T(v₂) = a₂₁v₁ + λ₂v₂
T(v₃) = a₃₁v₁ + a₃₂v₂ + λ₃v₃
……………………………..
T(vₙ) = aₙ₁v₁ + aₙ₂v₂ + …….. + λₙvₙ
Equivalently, (T – λ1) v1 = 0
(T – λ2) v2 = a21v1
(T – λ3) v3 = a31v1 + a32v2
………………………………
(T – λn) vn = an1 v1 + an2 v2+…….. + an, n–1 vn–1.
Note that (T – λ1) (T – λ2) v1 = (T – λ2) (T – λ1) v1=(T – λ2) .0 = 0, since (T – λ1) v1 = 0.
Also, (T – λ1) (T – λ2) v2 = (T – λ1) a21v1, since (T – λ2) v2 = a21v1
= a21 ((T – λ1) v1) = a21.0 = 0, since (T – λ1) v1 = 0.
Continuing this type of computation, we get
(T − λₙ)(T − λₙ₋₁) . . . . . (T − λ₁)v_i = 0 for each i = 1, 2, …, n. Hence the polynomial p(t) = (t − λ₁)(t − λ₂) ⋯ (t − λₙ), which is of degree n, annihilates every basis vector, so p(T) = 0; that is, T satisfies a polynomial of degree n over F.
1. 3. Summary
Suppose we have some set S of objects, with an equivalence relation. A canonical
form is given by designating some objects of S to be "in canonical form", such that every
object under consideration is equivalent to exactly one object in canonical form. In other
words, the canonical forms in S represent the equivalence classes, once and only once. To
test whether two objects are equivalent, it then suffices to test their canonical forms for
equality. A canonical form thus provides a classification theorem and more, in that it not
just classifies every class, but gives a distinguished (canonical) representative.
Let's look at what it means for the matrix of T to be diagonal. Recall from Block
-II, we get the matrix by choosing a basis B = {v₁, v₂, . . . , v_n}, and then entering the coordinates of T(v₁) as the first column, the coordinates of T(v₂) as the second column, etc. The matrix is diagonal, with entries λ₁, λ₂, λ₃, . . . . , λₙ, if and only if the chosen
basis has the property that T(vi) = λivi, for 1 ≤ i ≤ n. This leads to the definition of an
eigenvalue and its corresponding eigenvectors. So, diagonalizing a matrix is equivalent to
finding a basis consisting of eigenvectors.
1. 4. Keywords
UNIT-2
2. 0. Introduction
Fact 3. If T ∈ A(V) is nilpotent, of index of nilpotence n₁, then a basis of V can be found such that the matrix of T in this basis has the form
\begin{pmatrix} M_{n_1} & 0 & \cdots & 0 \\ 0 & M_{n_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_{n_r} \end{pmatrix},
where M_t denotes the t × t matrix with 1's on the superdiagonal and 0's elsewhere, n₁ ≥ n₂ ≥ . . . . . . . ≥ n_r and n₁ + n₂ + . . . . . . . + n_r = dim(V). Here, the integers n₁, n₂, . . . . . . . , n_r are called the invariants of T.
(ii) There is an element z ∈ M such that z, T(z), . . . . . . . ,Tm – 1(z) form a basis of M.
Note.
1. Let V1 be a subspace of vector space V over F invariant under T∈A(V). Then T
induces a linear transformation T1∈A(V1) given by T1 (u)= T(u) for all u∈V1.
For any q(x) ∈ F[x], q(T) ∈ A(V), and the linear transformation induced by q(T) on V₁ is q(T₁); i.e., q(T)|V₁ = q(T₁), where T₁ = T|V₁.
Further, q(T) = 0 ⇒ q(T₁) = 0. That is, T₁ satisfies any polynomial satisfied by T.
Proof. If k = 1, then V = V₁ and there is nothing to prove, so suppose that k > 1. To prove V_i ≠ 0 for all i = 1, 2, . . . . , k, we define the polynomials
h₁(x) = q₂(x)^{l₂} q₃(x)^{l₃} ⋯ q_k(x)^{l_k},
h₂(x) = q₁(x)^{l₁} q₃(x)^{l₃} ⋯ q_k(x)^{l_k},
..................................
h_i(x) = \prod_{j≠i} q_j(x)^{l_j},
..................................
h_k(x) = q₁(x)^{l₁} q₂(x)^{l₂} ⋯ q_{k−1}(x)^{l_{k−1}}.
If p(x) = q₁(x)^{l₁} q₂(x)^{l₂} ⋯ q_k(x)^{l_k} is the minimal polynomial of T ∈ A_F(V), then h_i(x) ≠ p(x) and hence h_i(T) ≠ 0.
Therefore, h_i(T)(v) ≠ 0 for some v ∈ V. That is, w ≠ 0, where w = h_i(T)(v).
However, q_i(T)^{l_i}(w) = (q_i(T)^{l_i} h_i(T))(v) = p(T)(v) = 0(v) = 0, since p(T) = q_i(T)^{l_i} h_i(T).
Thus w ∈ V_i, where w ≠ 0.
Therefore, V_i ≠ {0} for i = 1, 2, …….. k.
We also note, from the above argument, that h_i(T)V ≠ {0} lies in V_i; that is, h_i(T)V ⊂ V_i. Further, for j ≠ i we have q_j(x)^{l_j} | h_i(x); therefore, for v_j ∈ V_j, we have h_i(T)(v_j) = 0.
We now show that V = V₁ + V₂ + …… + V_k.
The k polynomials h₁(x), h₂(x), ………, h_k(x) are relatively prime. Hence we can find polynomials a₁(x), a₂(x), .….., a_k(x) in F[x] such that
1 = a₁(x)h₁(x) + a₂(x)h₂(x) + ……….. + a_k(x)h_k(x), so
I = a₁(T)h₁(T) + a₂(T)h₂(T) + ……….. + a_k(T)h_k(T), and
v = a₁(T)h₁(T)(v) + a₂(T)h₂(T)(v) + ……….. + a_k(T)h_k(T)(v) for v ∈ V.
But a_i(T)(v) ∈ V for all i ⇒ a_i(T)h_i(T)(v) ∈ h_i(T)V ⊂ V_i
⇒ a_i(T)h_i(T)(v) ∈ V_i for all i = 1, 2, …. k.
Hence from the above expression, we get
v = v₁ + v₂ + ………. + v_k, where v_i = a_i(T)h_i(T)(v) for all i = 1, 2, … k.
Thus V = V₁ + V₂ + ……. + V_k.
We must now verify that this sum is a direct sum. To show this, it is enough to prove that if u₁ + u₂ + ……… + u_k = 0 with each u_i ∈ V_i, then each u_i = 0. Suppose that u₁ + u₂ + ……… + u_k = 0, where some u_i, say u₁, is nonzero.
Then h₁(T)(u₁ + u₂ + ………. + u_k) = 0 gives
h₁(T)(u₁) + ……… + h₁(T)(u_k) = 0, where h₁(T)(u_j) = 0 for all j ≠ 1, so h₁(T)(u₁) = 0.
But u₁ ∈ V₁ ⇒ q₁(T)^{l₁}(u₁) = 0.
Now q₁(x)^{l₁} is relatively prime with h₁(x), which implies
1 = b₁(x)h₁(x) + b₂(x)q₁(x)^{l₁} for some b₁(x), b₂(x) ∈ F[x]
⇒ I = b₁(T)h₁(T) + b₂(T)q₁(T)^{l₁}
⇒ u₁ = b₁(T)h₁(T)(u₁) + b₂(T)q₁(T)^{l₁}(u₁)
⇒ u₁ = b₁(T)(h₁(T)(u₁)) + b₂(T)(q₁(T)^{l₁}(u₁))
⇒ u₁ = b₁(T)(0) + b₂(T)(0) = 0.
This is a contradiction to the fact that u₁ ≠ 0.
Therefore, u₁ + u₂ + ………. + u_k = 0 ⇒ u_i = 0 for all i.
Therefore, V = V₁ ⊕ V₂ ⊕ ………. ⊕ V_k.
Finally, we show that the minimal polynomial of T_i on V_i is q_i(x)^{l_i}.
By the definition of V_i, we have q_i(T_i)^{l_i}V_i = 0.
Therefore, the minimal polynomial of T_i is a divisor of q_i(x)^{l_i}; that is, the minimal polynomial of T_i is q_i(x)^{f_i}, where f_i ≤ l_i.
But, by a corollary on the minimal polynomial, the minimal polynomial of T over F is the least common multiple of {q₁(x)^{f₁}, . . . . . . , q_k(x)^{f_k}}.
That is, q₁(x)^{l₁} ⋯ q_k(x)^{l_k} = q₁(x)^{f₁} ⋯ q_k(x)^{f_k}.
This implies that l₁ = f₁, ………, l_k = f_k.
Therefore, the minimal polynomial of T_i is q_i(x)^{l_i}. This completes the proof.
Definition 2. 1. 3. The matrix
\begin{pmatrix} λ & 1 & 0 & \cdots & 0 & 0 \\ 0 & λ & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & λ & 1 \\ 0 & 0 & 0 & \cdots & 0 & λ \end{pmatrix},
with λ's on the diagonal, 1's on the superdiagonal, and 0's elsewhere, is called a basic Jordan block belonging to λ.
Example - 1. Here are two matrices in Jordan form; the first has two primary blocks, while the second has three. Writing J_k(λ) for the k × k basic Jordan block belonging to λ, the first matrix is the direct sum
J₃(3) ⊕ J₂(3) ⊕ J₂(2) ⊕ J₂(2),
and the second is
J₃(3) ⊕ J₁(3) ⊕ J₁(3) ⊕ J₄(2) ⊕ J₁(5) ⊕ J₁(5).
As a point of interest, the first matrix has characteristic polynomial (x−3)⁵(x−2)⁴ and minimal polynomial (x−3)³(x−2)². The second matrix has characteristic polynomial (x−3)⁵(x−2)⁴(x−5)² and minimal polynomial (x−3)³(x−2)⁴(x−5).
Theorem. If T ∈ A(V) has all its distinct eigenvalues λ₁, λ₂, …, λ_k in F, then a basis of V can be found in which the matrix of T is the direct sum of blocks J₁, J₂, …, J_k, where each J_i is a direct sum of basic Jordan blocks belonging to λ_i.
so that each diagonal block is a basic Jordan block belonging to λ, using the first remark made in this proof about the relation of a basic Jordan block and the M_m's. This completes the proof of the theorem.
Note.
1. In each J_i the blocks may be arranged so that the size of the first basic Jordan block ≥ the size of the second ≥ . . . . . . ; when this has been done, the resulting Jordan form of T is unique.
2. Two linear transformations in A(V) which have all their eigenvalues in F are similar if and only if they can be brought to the same Jordan form.
Illustrative Example -2. Compute the Jordan canonical form for A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & −2 \\ 0 & 1 & 3 \end{pmatrix}.
Solution. The characteristic polynomial of A is (λ − 1)²(λ − 2). So the two possible minimal polynomials are (λ − 1)(λ − 2) or the characteristic polynomial itself.
We find that (A − I)(A − 2I) = 0, so the minimal polynomial is (λ − 1)(λ − 2), and hence the invariant factors are λ − 1 and (λ − 1)(λ − 2).
The prime power factors of the invariant factors are the elementary divisors: λ–1, λ–1,
λ–2. Finally the Jordan canonical form of A is diagonal with diagonal entries 1, 1, 2.
Note. After determining that the minimal polynomial has all roots in the ground field and
no repeated roots, we can immediately conclude that the matrix is diagonalizable and
therefore the Jordan canonical form is diagonal.
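Computer algebra systems can produce the Jordan canonical form directly. A short SymPy sketch for the matrix of Illustrative Example 2, with the sign convention adopted there (our own illustration):

    from sympy import Matrix

    A = Matrix([[1, 0,  0],
                [0, 0, -2],
                [0, 1,  3]])
    P, J = A.jordan_form()    # A = P * J * P**-1
    print(J)                  # diagonal with entries 1, 1, 2, as found above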
Illustrative Example -3. Find all possible Jordan forms A for 6 × 6 matrices with t²(1 − t)² as minimal polynomial.
Solution. The possible characteristic polynomials of A have the same irreducible factors t and (1 − t), each with exponent at least 2, and are of degree 6. Hence, again writing J_k(λ) for the k × k basic Jordan block belonging to λ, we have the following cases:
Case (i). Characteristic polynomial is t⁴(t − 1)². The Jordan forms in this case are
J₂(0) ⊕ J₂(0) ⊕ J₂(1) and J₂(0) ⊕ J₁(0) ⊕ J₁(0) ⊕ J₂(1).
Case (ii). Characteristic polynomial is t³(t − 1)³. The Jordan form is
J₂(0) ⊕ J₁(0) ⊕ J₂(1) ⊕ J₁(1).
Case (iii). Characteristic polynomial is t²(t − 1)⁴. The Jordan forms are
J₂(0) ⊕ J₂(1) ⊕ J₂(1) and J₂(0) ⊕ J₂(1) ⊕ J₁(1) ⊕ J₁(1).
Illustrative Example -4. Let J be a Jordan block with diagonal entries λ. Then show that λ is the only eigenvalue, and the associated eigenspace is only 1-dimensional.
2. 2. Summary
Jordan canonical form of a linear operator on a finite dimensional vector space
is an upper triangular matrix of a particular form called Jordan matrix, representing the
operator on some basis. The form is characterized by the condition that any non-diagonal
entries that are non-zero must be equal to 1, be immediately above the main diagonal (on
the superdiagonal), and have identical diagonal entries to the left and below them. If the
vector space is over a field K, then a basis on which the matrix has the required form
exists if and only if all eigenvalues of M lie in K, or equivalently if the characteristic
polynomial of the operator splits into linear factors over K. This condition is always
satisfied if K is the field of complex numbers. The diagonal entries of the normal form are
the eigenvalues of the operator, with the number of times each one occurs being given by
its algebraic multiplicity. If the operator is originally given by a square matrix M, then its
Jordan normal form is also called the Jordan normal form of M. Any square matrix has a
Jordan normal form if the field of coefficients is extended to one containing all the
eigenvalues of the matrix. In spite of its name, the normal form for a given M is not
entirely unique, as it is a block diagonal matrix formed of Jordan blocks, the order of
which is not fixed; it is conventional to group blocks for the same eigenvalue together,
but no ordering is imposed among the eigenvalues, nor among the blocks for a given
eigenvalue, although the latter could for instance be ordered by weakly decreasing size.
The diagonal form for diagonalizable matrices, for instance normal matrices, is a special case of the Jordan normal form.
2. 3. Keywords
Generalized eigenspace Jordan form of a linear operator
Generalized eigenvector Jordan form of a matrix
Jordan block Minimal polynomial of a linear operator
Jordan canonical basis Minimal polynomial of a matrix
UNIT-3
3. 0. Introduction
In this unit, we study the minimal polynomial, which plays a vital role in the theory of canonical forms, particularly for generalized eigenvalues and eigenvectors.
3. 1. Minimal polynomial
Definition 3. 1. 1. The monic polynomial p(x) of minimum degree such that p(T) = 0 is called the minimal polynomial of T.
Note.
1. Let F be a field. Let p(x) and h(x) ∈ F[x] ( or simply, P(F)). Suppose p(x) ≠ 0.
Then we may find q(x) and r(x) ∈ F[x] such that h(x) = q(x)p(x) + r(x), where
either r(x) = 0 or deg (r(x)) < deg(p(x)). This is known as Division Algorithm.
2. If we consider the set I, T, T², . . . , T^{n²} in the n²-dimensional vector space A(V), we have (n² + 1) elements, so they cannot be linearly independent. Thus there is some linear combination a₀I + a₁T + . . . + a_{n²}T^{n²} that equals the zero function. So every T ∈ A(V) satisfies some polynomial of degree ≤ n². Knowing that there is some polynomial that T satisfies, we can find a polynomial of minimal degree that T satisfies, and then we can divide by its leading coefficient to obtain a monic polynomial.
Suppose h(x) is any polynomial with h(T) = 0. By the Division Algorithm, h(x) = q(x)p(x) + r(x), where r(x) = 0 or deg(r(x)) < deg(p(x)); then r(T) = h(T) − q(T)p(T) = 0. If r(x) ≠ 0, we could divide r(x) by its leading coefficient to obtain a monic polynomial satisfied by T, and then this contradicts the choice of p(x) as a monic polynomial of minimal degree satisfied by T. We conclude that the remainder r(x) = 0,
so h(x) = q(x)p(x) and p(x) is thus a factor of h(x). If g(x) is another monic polynomial of
minimal degree with g(T) = 0, then by the preceding paragraph p(x) is a factor of g(x),
and g(x) is a factor of p(x). Since both are monic polynomials, this forces g(x) = p(x),
showing that the minimal polynomial is unique.
Example - 1. Consider the matrices A₀, A₁ and A₂ above. Now n_i(x) = x^{i+1} is a polynomial such that n_i(A_i) = 0. So the minimal polynomial p_i(x) must divide x^{i+1} in each case. From this it is easy to see that the minimal polynomials are in fact n_i(x), so that p₀(x) = x, p₁(x) = x² and p₂(x) = x³.
For a more involved example, consider the matrix B = B_k(λ) ∈ M_{k,k}(F), where λ is a scalar and we have a sequence of 1's on the superdiagonal. First consider T = B_k(0) = B_k(λ) − λI_k = B − λI_k. This matrix is nilpotent; in fact T^k = 0, but T^{k−1} ≠ 0. So if we set g(x) = (x − λ)^k, then g(B) = 0. Once again, the minimal polynomial p(x) of B must divide g(x). So p(x) = (x − λ)^i for some i ≤ k. But since T^{k−1} ≠ 0, in fact i = k, and the minimal polynomial of B is precisely p(x) = (x − λ)^k.
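The nilpotency computation behind this example is easy to verify numerically. A minimal Python sketch for a basic Jordan block B_k(λ), with k = 4 and λ = 5 chosen arbitrarily (our own illustration):

    import numpy as np

    k, lam = 4, 5.0
    B = lam * np.eye(k) + np.diag(np.ones(k - 1), 1)   # basic Jordan block B_k(lambda)
    T = B - lam * np.eye(k)                            # nilpotent part
    print(np.allclose(np.linalg.matrix_power(T, k), 0))      # True : T^k = 0
    print(np.allclose(np.linalg.matrix_power(T, k - 1), 0))  # False: T^(k-1) != 0
    # Hence the minimal polynomial of B is (x - lambda)^k.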
Note.
1. The characteristic and minimal polynomials of a linear transformation have the
same zeros (except for multiplicities).
2. If V is a finite- dimensional vector space over F, then T∈A(V) is invertible if and
only if the constant term of the minimal polynomial for T is not zero.
3. Suppose all the characteristic roots of T ∈ A(V) are in F. Then the minimal polynomial is q(x) = (x − λ₁)^{l₁}(x − λ₂)^{l₂} . . . . . (x − λ_k)^{l_k} for λ_i ∈ F. Here q_i(x) = (x − λ_i)^{l_i} and V_i = {v ∈ V : (T − λ_i)^{l_i}(v) = 0}. So, if all the distinct characteristic roots λ₁, ………, λ_k of T lie in F, then V can be written as the direct sum V = V₁ ⊕ V₂ ⊕ ⋯ ⊕ V_k.
Example - 2. Let T be the linear operator on R² defined by T(a, b) = (2a + 5b, 6a + b) and let B be the standard ordered basis for R². Then [T]_B = \begin{pmatrix} 2 & 5 \\ 6 & 1 \end{pmatrix}, and hence the characteristic polynomial of T is
f(t) = det\begin{pmatrix} 2−t & 5 \\ 6 & 1−t \end{pmatrix} = (t − 7)(t + 4).
Thus the minimal polynomial of T is also (t − 7)(t + 4).
Illustrative Example-3. Let V = F₃[x] be the space of all polynomials of degree at most 3, and let T: V → V be the linear transformation given by T(f) = f′. Find the minimal polynomial of T.
Solution. A basis for V is {1, x, x², x³}, and T(1) = 0, T(x) = 1, T(x²) = 2x and T(x³) = 3x². Hence the matrix of T is
A = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \end{pmatrix}.
By direct computation, A⁴ = 0 and A³ ≠ 0. Hence the minimal polynomial of T is x⁴.
Illustrative Example-4. Let F be the field of real numbers and let M = \begin{pmatrix} 0 & 1 \\ −1 & 0 \end{pmatrix} ∈ F₂. Prove that the set μ consisting only of M is irreducible. Further, find the set D of all matrices commuting with M, where D = {T ∈ A(V): TM = MT for all M ∈ μ}.
Solution. Let T = \begin{pmatrix} a & b \\ c & d \end{pmatrix} and consider the condition TM = MT. We have
\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 & 1 \\ −1 & 0 \end{pmatrix} = \begin{pmatrix} −b & a \\ −d & c \end{pmatrix} and \begin{pmatrix} 0 & 1 \\ −1 & 0 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} c & d \\ −a & −b \end{pmatrix}.
Equating the two products gives c = −b and d = a. Hence
D = \left\{ \begin{pmatrix} a & b \\ −b & a \end{pmatrix} : a, b ∈ R \right\};
that is, D is the set of all matrices which commute with M.
Define a map φ: D → C by φ\begin{pmatrix} a & b \\ −b & a \end{pmatrix} = a + ib; then φ is a field isomorphism. First,
φ\left(\begin{pmatrix} a & b \\ −b & a \end{pmatrix} + \begin{pmatrix} c & d \\ −d & c \end{pmatrix}\right) = (a + c) + i(b + d) = (a + ib) + (c + id) = φ\begin{pmatrix} a & b \\ −b & a \end{pmatrix} + φ\begin{pmatrix} c & d \\ −d & c \end{pmatrix},
and a similar computation shows that φ preserves products.
For the eigenvalues of M,
det(λI − M) = det\begin{pmatrix} λ & −1 \\ 1 & λ \end{pmatrix} = 0 ⇒ λ² + 1 = 0 ⇒ λ = ±√−1 = ±i.
Therefore, the minimal polynomial of M is (x + i)(x − i) = x² + 1.
Now, for μ ⊂ A(V), μ is an irreducible set if, for every subspace W, T(W) ⊂ W for all T ∈ μ ⇒ either W = {0} or W = V. Consider the linear transformation T = M: R² → R². Suppose, to the contrary, that W is a proper nonzero invariant subspace; then W is one-dimensional. Let {w₁} be a basis of W; then {w₁} can be extended to a basis {w₁, w₂} of V. Since W is invariant under T,
T(w₁) = a₁w₁ + 0·w₂, because T(w₁) ∈ W = ⟨w₁⟩, and
T(w₂) = b₁w₁ + b₂w₂.
Therefore the matrix A = \begin{pmatrix} a₁ & b₁ \\ 0 & b₂ \end{pmatrix} is a matrix of T with respect to {w₁, w₂}, and the matrix B = \begin{pmatrix} 0 & 1 \\ −1 & 0 \end{pmatrix} is also a matrix of T with respect to some other basis. Then, by the result on change of basis, A and B are similar. But B has the complex characteristic roots ±i, while A, being triangular, has the real characteristic roots a₁ and b₂. This is a contradiction to the fact that W is invariant under T. That is, there is no proper invariant subspace, and therefore the set μ is irreducible.
3. 2. Summary
The minimal polynomial records the distinct eigenvalues and the size of the largest Jordan block corresponding to each eigenvalue. While the Jordan normal form determines the
minimal polynomial, the converse is not true. This leads to the notion of elementary
divisors. The elementary divisors of a square matrix A are the characteristic polynomials
of its Jordan blocks. The factors of the minimal polynomial m are the elementary divisors
of the largest degree corresponding to distinct eigenvalues. The degree of an elementary
divisor is the size of the corresponding Jordan block, therefore the dimension of the
corresponding invariant subspace. If all elementary divisors are linear, A is
diagonalizable.
3. 3. Keywords
Diagonalizable Division Algorithm
Minimal polynomial Monic polynomial
UNIT-4
4. 0. Introduction
Generalizing the notions of eigenvalue and eigenspace, this unit deals with a canonical form of a linear operator suited to this context. The one that we study is called the rational canonical form.
Theorem 4. 1. Suppose that T ∈ A(V) has as minimal polynomial over F the polynomial p(x) = γ₀ + γ₁x + ⋯ + γ_{r−1}x^{r−1} + x^r. Suppose, further, that V, as a module, is a cyclic module. Then there is a basis of V over F such that, in this basis, the matrix of T is
\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ −γ_0 & −γ_1 & −γ_2 & \cdots & −γ_{r−1} \end{pmatrix}.
Proof. Since V is cyclic relative to T, there exists a vector v ∈ V such that every element w ∈ V is of the form w = f(T)(v) for some f(x) ∈ F[x]. Now if for some polynomial h(x) ∈ F[x], h(T)(v) = 0, then for any w ∈ V, h(T)(w) = h(T)(f(T)(v)) = f(T)(h(T)(v)) = 0; thus h(T) annihilates all of V, and so h(T) = 0. But then p(x) | h(x), since p(x) is the minimal polynomial of T. This remark implies that v, T(v), T²(v), . . . . . , T^{r−1}(v) are linearly independent over F, for if not, then a₀v + a₁T(v) + . . . . . + a_{r−1}T^{r−1}(v) = 0 with a₀, a₁, . . . . . . , a_{r−1} in F. But then (a₀ + a₁T + . . . . . . . + a_{r−1}T^{r−1})(v) = 0, hence by the above discussion p(x) | (a₀ + a₁x + . . . . . . . + a_{r−1}x^{r−1}), which is impossible, since p(x) is of degree r, unless a₀ = a₁ = . . . . . . . = a_{r−1} = 0.
Since T^r = −γ₀ − γ₁T − . . . . . . . . . − γ_{r−1}T^{r−1}, we immediately have that T^{r+k}, for k ≥ 0, is a linear combination of 1, T, . . . . . . , T^{r−1}, and so f(T), for any f(x) ∈ F[x], is a linear combination of 1, T, ….., T^{r−1} over F. Since any w ∈ V is of the form w = f(T)(v), we get that w is a linear combination of v, T(v), . . . .. , T^{r−1}(v).
We have proved, in the above two paragraphs, that the elements v, T(v), . . . . . . . . , T^{r−1}(v) form a basis of V over F. In this basis, as is immediately verified, the matrix of T is exactly as claimed.
Corollary 4. 1. 1. Suppose that T ∈ A(V) has as minimal polynomial over F the
polynomial p(x) = q1(x)^e1 q2(x)^e2 . . . qk(x)^ek, where q1(x), . . . , qk(x) are distinct
irreducible polynomials in F[x]. Then a basis of V can be found in which the matrix
of T is of the form

⎛R1          ⎞
⎜    R2      ⎟
⎜       .    ⎟
⎝          Rk⎠ ,

where each Ri = C(qi(x)^ei1) ⊕ C(qi(x)^ei2) ⊕ . . . ⊕ C(qi(x)^eiri),
with ei = ei1 ≥ ei2 ≥ . . . ≥ eiri.
Proof. By Fact-1, V can be decomposed into the direct sum V = V1 ⊕ V2 ⊕ . . . ⊕ Vk,
where each Vi is invariant under T and where Ti, the linear transformation induced
by T on Vi, has minimal polynomial qi(x)^ei. Using Fact-1 and Theorem 4. 1 just
proved, we obtain the above result. If the degree of qi(x) is di, note that the sum of
all the products di·eij is n, the dimension of V over F.
Definition 4. 1. 2. The matrix of T in the statement of the above corollary is called the
rational canonical form of T.
Illustrative Example-2. Deduce from the previous problem that the characteristic
polynomial of T is the product of all elementary divisors of T.
Solution. Under the rational canonical form, the matrix of T is the block diagonal
matrix diag(R1, R2, . . . , Rk), where each Ri is a direct sum of the companion
matrices C(qi(x)^eij). The characteristic polynomial of a block diagonal matrix is
the product of the characteristic polynomials of its blocks, and the characteristic
polynomial of a companion matrix C(f(x)) is f(x) itself. Hence the characteristic
polynomial of T is the product of all the elementary divisors qi(x)^eij of T.
Illustrative Example-3. Find all possible rational canonical forms for 6 × 6 matrices
with (x – 2)(x + 2)³ as minimal polynomial.
Solution. Since the matrix is of size 6 × 6 and the minimal polynomial is
(x – 2)(x + 2)³, there are three cases for the characteristic polynomial.
Case 1. Characteristic polynomial is (x – 2)(x + 2)⁵. In this case, the rational
canonical forms are C(x – 2) ⊕ C((x + 2)³) ⊕ C((x + 2)²)
or C(x – 2) ⊕ C((x + 2)³) ⊕ C(x + 2) ⊕ C(x + 2).
Case 2. Characteristic polynomial is (x – 2)²(x + 2)⁴. In this case, the rational
canonical form is C(x – 2) ⊕ C(x – 2) ⊕ C((x + 2)³) ⊕ C(x + 2).
Case 3. Characteristic polynomial is (x – 2)³(x + 2)³. In this case, the rational
canonical form is C(x – 2) ⊕ C(x – 2) ⊕ C(x – 2) ⊕ C((x + 2)³).
All these can be written in block matrix form using the matrix form of C(q(x)), the
companion matrix of q(x).
4. 2. Summary
The Jordan canonical form is the one most generally used to prove theorems about
linear transformations and matrices. Unfortunately, it has one distinct, serious
drawback: it puts requirements on the location of the characteristic roots. Thus we
need some canonical form for elements in A(V) (or in Fn) which presumes nothing
about the location of the characteristic roots of its elements, a canonical form and
a set of invariants created in A(V) itself, using only its elements and operations.
Such a canonical form, obtained in the above unit, is the rational canonical form.
4. 3. Keywords
Companion matrix Multiplicity of an elementary divisor
End vector of a cycle Rational canonical basis
Generalized eigenspace Rational canonical form
Generalized eigenvector
Exercises
7. Find the Jordan canonical form of

   A = ⎛2 6 −15⎞
       ⎜1 1  −5⎟
       ⎝1 2  −6⎠ .

   Answer. The characteristic polynomial of A is (λ + 1)³; since A + I has rank 1,
   the Jordan form consists of one block of size 2 and one block of size 1 for the
   eigenvalue −1 (this and the next answer are verified in the sketch following
   these exercises).
8. Determine the Jordan canonical form for the matrix

   A = ⎛1 2 0 0⎞
       ⎜0 1 2 0⎟
       ⎜0 0 1 2⎟
       ⎝0 0 0 1⎠ .

   Answer. The characteristic polynomial of A is (λ – 1)⁴ and A consists of a
   single Jordan block of size 4 belonging to the eigenvalue 1.
9. Show that the elements S and T in A(V) are similar in A(V) if and only if they have
the same elementary divisors.
10. Find the rational canonical form of

    A = ⎛1  1  1  1⎞
        ⎜0  0  0  0⎟
        ⎜0  0 −1  0⎟
        ⎝0 −1  1  0⎠ .
    Hint. Factor the characteristic polynomial of A, then use the companion
    matrix of each factor.
References
1. S. Friedberg, A. Insel and L. Spence – Linear Algebra, Fourth Edition, PHI,
   2009.
2. Jimmie Gilbert and Linda Gilbert – Linear Algebra and Matrix Theory,
   Academic Press, an imprint of Elsevier, 2010.
3. I. N. Herstein – Topics in Algebra, Vikas Publishing House, New Delhi, 2002.
4. K. Hoffman and R. Kunze – Linear Algebra, 2nd Edition, Prentice-Hall of
   India, 1978.
5. P. R. Halmos – Finite Dimensional Vector Spaces, D. Van Nostrand, 1958.
6. S. Kumaresan – Linear Algebra: A Geometric Approach, Prentice-Hall India,
   2000.
GLOSSARY OF SYMBOLS
Symbols Meaning
Aij       The (i, j)th entry of the matrix A
A⁻¹       The inverse of the matrix A
A*        The adjoint of the matrix A
Aᵀ        The transpose of the matrix A
Tr(A)     The trace of the matrix A
(A | B)   The matrix A augmented by the matrix B
B(x, y)   The bilinear form
B(V)      The set of all bilinear forms on V