1.1 System of Linear Equations
The basic form of a linear equation is:
a1 x1 + a2 x2 + ⋯ + an xn = b
where b is a real or complex constant and the coefficients a1, …, an are real or complex. The subscript n is any positive integer. Any equation that may be rearranged to the above form is a linear equation. A set of these equations involving the same variables is referred to as a system of linear equations, which may have a solution in the form of a list (s1, s2, …, sn) that can be substituted for x1, …, xn respectively. For example, the linear system:
2x1 − x2 + 1.5x3 = 8
x1 − 4x3 = −7
has the solution list (5, 6.5, 3), which when substituted back into the linear system for x1, x2, x3 reduces the two equations to 8 = 8 and −7 = −7. The solution set is the set of all possible solutions of a linear system, and two linear systems are equivalent if they have the same solution set. The solution set of a system of linear equations may have one of the three following properties:
1. There is no solution. -> There is a contradiction in the system.
2. There is exactly one solution. -> The equations have only one possible solution.
3. There are infinitely many solutions. -> The linear system is solved by defining one of the variables as a parameter and then going through the system solving the equations. The solution will contain that variable.

MATRIX NOTATION
A linear system may be written as a rectangular matrix where each column contains the coefficients of one variable (this would be the coefficient matrix). If another column is added at the right of the matrix containing the values on the right-hand sides of the equations, it is called the augmented matrix. For example, the system
x1 − 2x2 + x3 = 0
2x2 − 8x3 = 8
−4x1 + 5x2 + 9x3 = −9
can be written as the augmented matrix
[ 1 −2  1  0]
[ 0  2 −8  8]
[−4  5  9 −9]
The notation for describing this matrix is 3 x 4, where the first number indicates the number of rows and the second the number of columns.

SOLVING A LINEAR SYSTEM
The procedure for solving linear systems follows very straightforward steps. The ultimate goal is to replace the linear system with an equivalent system (same solution set) that is easier to solve, by using three basic operations: (1) replacing one equation of the system by the sum of itself and a multiple of another, (2) interchanging two equations, and (3) multiplying all terms in an equation by a nonzero constant. None of these operations changes the solution set of the system. These steps are applicable both to the system of linear equations and to the corresponding augmented matrix.

1.2 Row Reductions and Echelon Forms
A leading entry of a row: the leftmost nonzero entry (in a nonzero row).
A rectangular matrix is in echelon form if it satisfies these three properties: (1) All nonzero rows are above any rows of all zeros. (2) Each leading entry of a row is in a column to the right of the leading entry of the row above it. (3) All entries in a column below a leading entry are zeros.
For a reduced echelon form, the following properties are also satisfied: (4) The leading entry in each nonzero row is 1. (5) Each leading 1 is the only nonzero entry in its column.
These forms of a matrix are attained by using the methods of solving linear systems described above. Here is an example of a matrix in reduced echelon form:
[1 0 0 29]
[0 1 0 16]
[0 0 1  3]
THEOREM: Each matrix is row equivalent to one and only one reduced echelon matrix.
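As a quick numerical check of the example system above, it can also be solved directly; the short NumPy sketch below uses the coefficient matrix and right-hand side from this section (variable names are illustrative):

    import numpy as np

    # Coefficient matrix and right-hand side of the example system
    A = np.array([[ 1.0, -2.0,  1.0],
                  [ 0.0,  2.0, -8.0],
                  [-4.0,  5.0,  9.0]])
    b = np.array([0.0, 8.0, -9.0])

    x = np.linalg.solve(A, b)   # unique solution since A is invertible
    print(x)                    # [29. 16.  3.], matching the reduced echelon form above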
A pivot position in a matrix is a location in that matrix which corresponds to a leading 1 in the reduced echelon form of the matrix. A pivot column contains a pivot position.
The systematic procedure to get this matrix (using the above matrix as an example) is to first reduce the first entry of rows 2 and 3 to zero, then reduce the second entry of row 3 to zero (at this point we have the echelon form), then use the third row to get the third entry in row 2 to zero. After this is done, rows 2 and 3 can be used to get the second and third entries of row 1 to zero, which gives the reduced echelon form.

A vector is a matrix with only one column:
v = [v1]
    [ ⋮]
    [vn]
A zero vector is a vector with all its entries 0. Vectors can be used to describe a linear system as such:
x1 a1 + ⋯ + xn an = b
which corresponds to the matrix:
[a1 … an b]

SPAN OF VECTORS
The vector y is a linear combination of v1, …, vp with the scalar factors c1, …, cp when:
y = c1 v1 + ⋯ + cp vp
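The row-reduction procedure described above can also be written as a short program. Below is a minimal sketch of Gauss-Jordan elimination in NumPy (the function name rref, the tolerance, and the use of partial pivoting are my own choices, not from the text):

    import numpy as np

    def rref(M, tol=1e-12):
        """Return the reduced row echelon form of M using the three row operations."""
        A = M.astype(float).copy()
        rows, cols = A.shape
        pivot_row = 0
        for col in range(cols):
            if pivot_row >= rows:
                break
            # choose the largest entry in this column as the pivot (a row interchange)
            p = pivot_row + np.argmax(np.abs(A[pivot_row:, col]))
            if abs(A[p, col]) < tol:
                continue                       # no pivot in this column
            A[[pivot_row, p]] = A[[p, pivot_row]]
            A[pivot_row] /= A[pivot_row, col]  # scale the row so the leading entry is 1
            for r in range(rows):
                if r != pivot_row:             # replacement: clear the rest of the column
                    A[r] -= A[r, col] * A[pivot_row]
            pivot_row += 1
        return A

    aug = np.array([[1, -2, 1, 0], [0, 2, -8, 8], [-4, 5, 9, -9]])
    print(rref(aug))   # rows become [1 0 0 29], [0 1 0 16], [0 0 1 3]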
1. Find the reduced row echelon form of
Solution:
Solution:
3. Find the reduced row echelon form of
Solution:
Solution:
Solution:
6. Compute the rank of
Solution:
Solution:
9. Find a matrix with the following property, or say why you cannot have one.
Solution:
10. Find a matrix with the following property, or say why you cannot have one.
Solution:
Chapter 2 - Matrix Algebra
An m × n matrix A = [aij]:

                Column j
A = [ a11  …  a1j  …  a1n ]
    [  ⋮        ⋮        ⋮ ]
    [ ai1  …  aij  …  ain ]   Row i
    [  ⋮        ⋮        ⋮ ]
    [ am1  …  amj  …  amn ]
· If we name the columns of the matrix by a1, a2, …, an and write the matrix in column form, we obtain:
A = [a1 a2 … an]
· The diagonal entries of a matrix are a11, a22, a33, …. They form the main diagonal of the matrix.
· A diagonal matrix is a square matrix (of n × n dimension) whose nondiagonal entries are zero. An example is the identity matrix, whose main diagonal entries are all "1" and whose other entries are all "0".
· If all entries of a matrix are zero, the matrix is called a zero matrix.
· A + B = B + A
· (A + B) + C = A + (B + C)
· A + 0 = A
· r(A + B) = rA + rB
· (r + s)A = rA + sA
· r(sA) = (rs)A
o Two matrices are called equal when they have the same size and have the same entries in the same locations; they have the same columns and the same rows.
o We can sum two matrices only if they have the same size.
· A(BC) = (AB)C
Each column in AB is a linear combination of the columns of A using weights from the corresponding column of B.
o The (i, j)-entry of the product AB is defined as: (AB)ij = ai1 b1j + ai2 b2j + ⋯ + ain bnj
o rowi(AB) = rowi(A) · B
Notes:
· We cannot say that AB = BA. This may be correct in some special cases, but in general AB ≠ BA.
· If AB = AC, we cannot cancel A from both sides and say B = C.
· If AB = 0, we cannot say that either A = 0 or B = 0. This may be correct in some special cases, but in general we cannot draw this conclusion.
Power of a Matrix:
· A^k = A · A · … · A (k factors)
· If k = 0, A^0 = I
Elementary Matrices:
· An elementary matrix is a matrix that is obtained by performing a single row operation on an identity matrix.
· If we do an elementary row operation on an m × n matrix A, the new matrix can be written as EA. Here, the m × m matrix E is created by doing the same row operation on Im.
· Every elementary matrix E is invertible, and the inverse of E is the elementary matrix of the same type that transforms E back into I.
· Matrix A is invertible if and only if A is row equivalent to In. Also, the row operations that reduce A to In also transform In into A−1.
· This means that [A I] is row equivalent to [I A−1]. If not, we say that A does not have an inverse.
· We can check whether A A−1 = I to see if we have done the row operations correctly.
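A minimal sketch of this [A I] row-reduction procedure, using SymPy's exact row reduction (the 2 × 2 matrix below is an illustrative choice, not taken from the text):

    from sympy import Matrix, eye

    # Illustrative 2x2 matrix
    A = Matrix([[2, 3],
                [10, 16]])

    # Row reduce [A | I]; the right half becomes A^-1 when A is invertible
    aug = A.row_join(eye(2))
    R, pivots = aug.rref()
    A_inv = R[:, 2:]
    print(A_inv)         # Matrix([[8, -3/2], [-5, 1]])
    print(A * A_inv)     # identity, confirming the row operations were done correctly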
Another view of matrix inversion is to solve Ax = e1, …, Ax = en for columns x1, …, xn such that A[x1 x2 … xn] = I; then [x1 x2 … xn] = A−1.
The Invertible Matrix Theorem: for a square n × n matrix A, the following statements are equivalent:
· A is invertible.
o There is an n × n matrix C such that CA = I.
o There is an n × n matrix D such that AD = I.
· A is row equivalent to In.
· A has n pivot positions.
· Ax = 0 has only the trivial solution.
· The columns of A are linearly independent.
· The linear transformation x → Ax is one-to-one.
· Ax = b has at least one solution for each vector b in Rn.
· The columns of A span Rn.
· The linear transformation x → Ax maps Rn onto Rn.
· A^T is invertible.
If AB = I, then B = A−1 and A = B−1.
S(T(x)) = x
T(S(x)) = x
With T(x) = Ax an invertible linear transformation, S(x) = A−1x is the function which satisfies the above equations.
Partitioned Matrices:
We can divide a matrix into divisions and name each division (block / submatrix) as Aij.
If two matrices are the same size and partitioned in the same way, the addition is done within the same blocks (the blocks in the same location (i, j)). Also, scalar multiplication is done block by block.
· Partitioned matrices are multiplied in the same way the row-column rule applies. The partition of the matrices must be done consistently: if the 1st matrix is divided into a set of r and s columns, the 2nd matrix should be divided into a set of r and s rows.
· Column-row expansion of AB:
AB = [col1(A) col2(A) … coln(A)] [row1(B)]
                                 [row2(B)]
                                 [   ⋮   ]
                                 [rown(B)]
   = col1(A) row1(B) + ⋯ + coln(A) rown(B)
In order to find the inverse of a partitioned matrix, we assume that there is a matrix B which is the inverse of matrix A and is also partitioned with suitable block sizes. Then we write the block equations obtained from AB = I.
· A block diagonal matrix is a partitioned matrix with zero blocks off the main diagonal of blocks. This matrix is invertible provided that each block on the diagonal is invertible.
The LU Factorization:
A is an m × n matrix and can be row reduced to echelon form without row interchanges. We can write A = LU, where L is an m × m lower triangular matrix with 1's as the main diagonal entries and U is an m × n echelon form of A. An example of the shapes is shown below (■ marks a pivot, * may be any value):
A = [ 1  0  0  0 ] [ ■  *  *  *  * ]
    [ *  1  0  0 ] [ 0  ■  *  *  * ]
    [ *  *  1  0 ] [ 0  0  0  ■  * ]
    [ *  *  *  1 ] [ 0  0  0  0  0 ]
           L                U
A = LU, so Ax = b becomes L(Ux) = b.
Let y = Ux; first solve Ly = b, then solve Ux = y.
o Both steps are easy to solve because both E and F are triangular matrices.
An LU Factorization Algorithm:
· If possible, by doing row replacement operations, reduce A to an echelon form U.
· Place entries in L such that the same sequence of row operations reduces L to I.
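A minimal sketch of this algorithm in NumPy, assuming a square matrix and no row interchanges (the matrix, right-hand side, and function name are illustrative):

    import numpy as np

    def lu_no_pivot(A):
        """LU factorization by row replacement only (assumes no row interchanges needed)."""
        A = A.astype(float)
        n = A.shape[0]
        L = np.eye(n)
        U = A.copy()
        for j in range(n - 1):
            for i in range(j + 1, n):
                L[i, j] = U[i, j] / U[j, j]   # multiplier used to eliminate entry (i, j)
                U[i, :] -= L[i, j] * U[j, :]  # row replacement on U
        return L, U

    A = np.array([[2.0, 1.0, 1.0],
                  [4.0, 3.0, 3.0],
                  [8.0, 7.0, 9.0]])
    b = np.array([1.0, 2.0, 3.0])

    L, U = lu_no_pivot(A)
    y = np.linalg.solve(L, b)   # forward substitution: L y = b
    x = np.linalg.solve(U, y)   # back substitution:    U x = y
    print(np.allclose(L @ U, A), np.allclose(A @ x, b))   # True True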
D" = Q" @ R
(D S Q)" = R!
" = (D S Q) R (This can be written provided that Q and R have nonnegative entries and
$%
each column sum of Q is less than 1) " is the production vector here and has nonnegative
entries.
of Q can be less than 1. Then, Q Z N B. So, solving #" = M and finding #$% will be
easier.
Homogeneous Coordinates:
We can identify a point with coordinates (x, y) in R2 with the point (x, y, 1) in R3. The difference is that the second point is one unit above the xy-plane. So, it is said that (x, y) has homogeneous coordinates (x, y, 1).
Composite Transformations:
In order to move a figure on a computer screen, we need two or more basic transformations. The composition of these transformations corresponds to matrix multiplication when homogeneous coordinates are used.
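A minimal sketch of such a composite transformation in NumPy (the rotation angle, translation, and point are illustrative choices): a 2D point is rotated 90 degrees about the origin and then translated by (3, 1), with both steps expressed as 3 × 3 matrices in homogeneous coordinates.

    import numpy as np

    theta = np.pi / 2
    rotate = np.array([[np.cos(theta), -np.sin(theta), 0],
                       [np.sin(theta),  np.cos(theta), 0],
                       [0,              0,             1]])
    translate = np.array([[1, 0, 3],
                          [0, 1, 1],
                          [0, 0, 1]])

    p = np.array([2, 0, 1])            # the point (2, 0) with homogeneous coordinates (2, 0, 1)
    composite = translate @ rotate     # one matrix for the whole composite transformation
    print(composite @ p)               # approximately [3, 3, 1], i.e. the point (3, 3)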
Homogeneous 3D Coordinates:
(X, Y, Z, H) are homogeneous coordinates for (x, y, z) when H ≠ 0 and:
x = X/H,  y = Y/H,  z = Z/H
Perspective Projections:
A 3D object is seen as 2D on a computer screen. A person views the screen at a certain distance from the screen. A perspective projection maps each point (x, y, z) onto an image point (x′, y′, 0), and so the two points and the eye position (the center of projection) are on a line. The perspective projection can be represented by a matrix, and the data in this matrix are found by calculations with the given coordinates.
2.8 Subspaces of Rn:
A set H in Rn is a subspace of Rn if it contains the zero vector and is closed under vector addition and multiplication by scalars.
· The null space of matrix A is the set Nul A of all solutions of the equation Ax = 0. The null space of an m × n matrix is a subspace of Rn; equivalently, the set of all solutions of a system of m homogeneous linear equations in n unknowns is a subspace of Rn.
· The pivot columns of matrix A are a basis for the column space of A.
Coordinate Systems:
B = {b1, …, bp} is a basis for a subspace H. For every x in H:
[x]B = [c1]
       [ ⋮]
       [cp]
is the coordinate vector of x (relative to the basis B), where c1, …, cp are the weights such that x = c1 b1 + ⋯ + cp bp.
· The rank of a matrix A is written rank A; it is the dimension of the column space of A.
o The dimension of the zero subspace is zero.
· Any linearly independent set of exactly p elements of a p-dimensional subspace H is a basis for H. Any set of p elements of H that spans H is also a basis for H.
· For a square matrix A of size n × n, the statements below are equivalent to the statement that A is invertible:
o dim Col A = n
o rank A = n
o Nul A = {0}
o dim Nul A = 0
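A minimal NumPy sketch of these quantities (the 3 × 4 matrix is an illustrative choice; such a matrix cannot be invertible, and rank A + dim Nul A must equal the number of columns):

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0],
                  [1.0, 2.0, 1.0, 2.0]])

    rank = np.linalg.matrix_rank(A)
    print(rank)                      # 2 -> dim Col A = 2
    print(A.shape[1] - rank)         # 2 -> dim Nul A = 2

    # A basis for Nul A from the SVD: right singular vectors for zero singular values
    _, s, Vt = np.linalg.svd(A)
    null_basis = Vt[rank:]
    print(np.allclose(A @ null_basis.T, 0))   # True: each basis vector solves Ax = 0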
1. Find all solutions to
Solution:
We create the augmented matrix and row reduce:
Solution:
Solution:
4. Simplify, or write “undefined” if the multiplication is not valid.
Solution:
Solution:
Solution:
7. Simplify, or write “undefined” if the multiplication is not valid.
Solution:
Undefined
Solution:
Solution:
10. Simplify, or write “undefined” if the multiplication is not valid.
Solution:
Chapter 3 - Determinants
!"# = $%% !"#%% & $%' !"#%' + $%( !"#%( & ) + *&1,%-. $%. !"#%.
· The determinant of a × matrix is calculated by the formula below:
Column Operations:
· !"#X = !"#
Cramer’s Rule:
When we search for the solution of Ax = b, the entries of the solution vector x are found by:
xi = det Ai(b) / det A
Here, Ai(b) means that the i-th column of matrix A is replaced by the vector b. Notice that the fraction is undefined when det A is zero, which means that matrix A must be invertible to avoid undefined cases.
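A minimal sketch of Cramer's rule in NumPy (the 2 × 2 system is an illustrative choice; the rule is practical only for small invertible matrices):

    import numpy as np

    def cramer_solve(A, b):
        """Solve Ax = b by Cramer's rule."""
        det_A = np.linalg.det(A)
        n = A.shape[0]
        x = np.empty(n)
        for i in range(n):
            Ai = A.copy()
            Ai[:, i] = b              # replace the i-th column of A with b
            x[i] = np.linalg.det(Ai) / det_A
        return x

    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([9.0, 8.0])
    print(cramer_solve(A, b))         # [2. 3.]
    print(np.linalg.solve(A, b))      # same answer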
A Formula for A−1:
· The j-th column of A−1 is a vector x that satisfies Ax = ej, where ej is the j-th column of the identity matrix. By Cramer's rule,
{(i, j)-entry of A−1} = xi = det Ai(ej) / det A
det Ai(ej) = (−1)^(i+j) det Aji = Cji
· A−1 = (1 / det A) adj A
where adj A is the adjugate of A, the matrix containing the cofactors Cji (the transpose of the matrix of cofactors).
· For vectors v1 and v2 which are not multiples of each other, the area of the parallelogram determined by v1 and v2 equals |det [v1 v2]|.
Linear Transformations:
T is a linear transformation from R2 to R2 with standard matrix A, and S is a parallelogram in R2:
{area of T(S)} = |det A| · {area of S}
1. For what value of c is there a nonzero solution to the following equation? For
that value of c, find all solutions to the equation.
Solution:
To find the value of c, recall that if det(A)≠0 then Ax=0 has exactly one solution,
x=0. If det(A)=0 then Ax=0 has infinitely many solutions.
Proceed by computing the determinant:
If c = 2, the determinant is zero, and thus there are infinitely many solutions to
the equation. All but one of them are nonzero.
Finding all solutions for c=2 is the same as finding all solutions to the matrix
equation
The first row of this equation reads x+y=0. The second row reads 2x+2y=0.
These equations are redundant. Any (x,y) such that y=−x is a solution.
That is, any point of the form (x,−x) is a solution.
Solution:
To analyze a system of linear equations, it is convenient to put it into the
standard form Ax= b, where x is a vector of unknowns.
The unknowns in the linear equations are x, y, and z.
Because the right-hand side involves these unknowns, we subtract it from both
sides. The equation becomes
This matrix equation is now of the form Ax=b. Because the right-hand side, b, is
zero, the equation is homogeneous.
Recall that if det(A)≠0 then the homogeneous equation Ax=0 has exactly one
solution, x=0. If det(A)=0 then Ax=0 has infinitely many solutions.
In order for there to be a nonzero solution, det A must be zero. Hence, we seek
the values of λ for which
The second and third rows say that y=0, z=0. The first row provides no constraint
on x. Hence, any multiple of (1, 0, 0)T is a nontrivial solution for λ=1.
Similarly, a nontrivial solution to
Solution:
We observe that this matrix equation is homogeneous because it is of the form
Ax=0.
If det(A)≠0 then the homogeneous system Ax=0 has exactly one solution, x=0. If
det(A)=0 then Ax=0 has infinitely many solutions.
Because we are seeking values of c for which there are nontrivial solutions, we
require that det(A)=0.
We identify
Solution:
We observe that this equation is a linear system of the form Ax=b with nonzero b.
Hence, it is an inhomogeneous linear system.
Recall that if det(A)≠0 then the inhomogeneous linear system Ax=b has exactly
one solution, x=A−1b. If det(A)=0 then Ax=b has either no solutions or infinitely
many solutions.
Identifying A as the matrix in the problem and computing its determinant, we
have
Hence, there are either no solutions or there are infinitely many of them.
We try to find a single solution to
Solution:
Observing that the linear system is of the form Ax=b for a nonzero b, we note that
this equation is inhomogeneous. Recall that for an inhomogeneous linear system
If det(A)≠0 then Ax=b has exactly one solution, x=A−1b. If det(A)=0 then Ax=b has
either no solutions or infinitely many solutions.
We begin by computing the determinant
Solution:
The matrix product of an n×m matrix with an m×ℓ matrix is an n×ℓ matrix.
The (i,j) entry of the matrix product AB is the dot product of the ith row of A with
the jth column of B.
We identify
The left hand side of the matrix equation is the matrix product AB, which is a 3×1
matrix.
The first entry of AB is the product of the first row of A with the first (and only)
column of B:
(AB)1=1 x+1 y+1 z.
Similarly, the second entry of AB is the product of the second row of A with the
first column of B:
(AB)2=0 x+1 y+1 z.
Finally, the third entry of AB is the product of the third row of A with the first
column of B:
(AB)3=0 x+0 y+1 z.
Hence, we rewrite the matrix equation as the system
x+y+z=4
y+z=3
z=1
To solve for x,y,z we observe that the third equation gives us z explicitly. From
that, we could plug into the second equation to get y. Then we could plug into the
first equation to get x.
Using z=1, the second equation gives y=2.
Plugging into the first equation, we get x+2+1=4. Hence, x=1.
The solution to the matrix equation is thus
7. Find the second degree polynomial going through (−1,1),(1,3), and (2,2).
Solution:
We are trying to find the function y(x)=a+bx+cx2 such that y(−1)=1, y(1)=3, and
y(2)=2.
Let's plug the three values x into y and see what that says about a, b, and c.
Plugging the x values into the function we get
Simplifying, we get
Hence,
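As a quick computational check of this problem, the 3 × 3 system obtained by plugging the three points into y(x) = a + bx + cx² can be solved directly (a short NumPy sketch; variable names are illustrative):

    import numpy as np

    xs = np.array([-1.0, 1.0, 2.0])
    ys = np.array([1.0, 3.0, 2.0])

    V = np.column_stack([np.ones_like(xs), xs, xs ** 2])   # rows are [1, x, x^2]
    a, b, c = np.linalg.solve(V, ys)
    print(a, b, c)                    # 2.666..., 1.0, -0.666..., i.e. y = 8/3 + x - (2/3)x^2
    print(np.allclose(a + b * xs + c * xs ** 2, ys))        # True: the curve passes through all three points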
Solution:
Recall that a matrix equation is of the form Ax=b, where A is a matrix of
constants, x is a column vector of unknowns, and b is a column vector of
constants.
In this problem, our unknowns are x,y,z, which we put into the vector x:
In order to put the system of equations into the form of a matrix equation, we
move all unknowns to the left hand side:
y+z=4
2x−y−z=0
Recall that the ith entry of the matrix-vector product Ax is the dot product of the
ith row of A with the only column of x.
We identify that y+z is the dot product of (0,1,1) with (x,y,z).
Similarly, 2x−y−z is the dot product of (2,−1,−1) with (x,y,z).
Putting the two left hand sides into a column vector, we can write
Hence, the system of linear equations can be written as
Solution:
To find the inverse of a 3×3 matrix,
1. Compute the minors of each element
2. Negate every other element, according to a checkerboard pattern
3. Take the transpose
4. Divide by the determinant of the original matrix
The minor of the (i,j)th entry of a matrix A is the determinant of the submatrix
obtained by removing the ith row and the jth column of A.
For example, the minor of the 1,1 entry is .
Next we negate every other entry, according to the pattern
The matrix of minors becomes
Transposing, we get
In order to divide by the determinant of A, we must first compute it. We can
compute that det(A)=6.
Hence,
To solve the linear system, recall that if A is square and invertible, then the
solution to Ax=b is x=A−1b.
Computing, we have
By matrix multiplication we get
10. Solve by matrix inversion:
Solution:
We observe that this matrix equation is in the standard form Ax=b, where
Recall that if A is square and invertible, the solution to Ax=b is x=A−1b.
Hence, we need to find the inverse of A.
The inverse of a 2×2 matrix is given by swapping the diagonal entries, negating
the off-diagonal entries, and dividing by the determinant:
The determinant of A is 2·16 − 3·10 = 2.
Hence
Chapter 4 - Vector Spaces
Subspaces:
If H is a subset of a vector space V, then H is a subspace of V if it satisfies the properties: the zero vector of V is in H, H is closed under vector addition, and H is closed under multiplication by scalars.
· Nul A is the null space of the m × n matrix A; it is the set of all solutions of the equation Ax = 0.
· The set of all solutions (null space) of the system Ax = 0 (m linear equations in n unknowns) is a subspace of Rn.
Coordinates in Rn:
If a basis B = {b1, …, bn} for Rn is given, the B-coordinate vector of x is found as follows. Let
x = c1 b1 + c2 b2 + ⋯ + cn bn,   [x]B = [c1, …, cn]^T
where
PB−1 x = [x]B
Here PB = [b1 b2 … bn] is the change-of-coordinates matrix, so x = PB [x]B.
4.6 Rank:
This section gives the relationships between the rows and the columns of a matrix. The maximum number of linearly independent columns of an m × n matrix A and the maximum number of linearly independent columns of A^T (which are the rows of A) are equal.
· The row spaces of matrices A and B are the same provided that A and B are row equivalent. If B is in echelon form, the nonzero rows of B form a basis for the row space of A as well as for that of B.
· B = {b1, …, bn} and C = {c1, …, cn} are bases of a vector space V. There is a unique n × n matrix P(C←B) that provides:
[x]C = P(C←B) [x]B
The columns of P(C←B) are the C-coordinate vectors of the vectors in the basis B:
P(C←B) = [[b1]C [b2]C … [bn]C]
Change of Basis in Rn:
If B = {b1, …, bn} is a basis and E = {e1, …, en} is the standard basis in Rn, then [b1]E = b1, [b2]E = b2, …, [bn]E = bn. So,
P(E←B) = [[b1]E [b2]E … [bn]E]
and so
PB = [b1 b2 … bn]
Discrete-Time Signals:
A signal in the vector space S of discrete-time signals is a function defined only on the integers and written as a sequence of numbers {yk}. When a process begins at a specific time, we can write a signal as a sequence of the form (y0, y1, y2, …). The yk terms for k < 0 are omitted.
Let us consider a set of only three signals in S: {uk}, {vk} and {wk}. They are linearly independent provided that c1 uk + c2 vk + c3 wk = 0 (for all k) is satisfied only when c1 = c2 = c3 = 0. So:
[ uk     vk     wk   ] [c1]   [0]
[ uk+1   vk+1   wk+1 ] [c2] = [0]   for all k
[ uk+2   vk+2   wk+2 ] [c3]   [0]
The coefficient matrix above is called the Casorati matrix of the signals. The determinant of this matrix is called the Casoratian. If the Casorati matrix is invertible for at least one value of k, then c1 = c2 = c3 = 0, which means that the three signals are linearly independent.
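A minimal sketch of this check in NumPy (the three signals 1^k, 2^k and (−2)^k are illustrative choices, not from the text):

    import numpy as np

    def casorati(signals, k):
        """Casorati matrix of the given signals evaluated at index k."""
        return np.array([[s(k + i) for s in signals] for i in range(len(signals))])

    u = lambda k: 1.0
    v = lambda k: 2.0 ** k
    w = lambda k: (-2.0) ** k

    C = casorati([u, v, w], k=0)
    print(C)
    print(np.linalg.det(C))   # nonzero (here 12), so the three signals are linearly independent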
· If {zk} and an ≠ 0 are given, then yk+n + a1 yk+n−1 + ⋯ + an−1 yk+1 + an yk = zk has a unique solution as long as y0, y1, y2, …, yn−1 are specified.
· The set H of all solutions to the equation yk+n + a1 yk+n−1 + ⋯ + an−1 yk+1 + an yk = 0 is an n-dimensional vector space.
Nonhomogeneous Equations:
The general solution of the nonhomogeneous linear equation yk+n + a1 yk+n−1 + ⋯ + an−1 yk+1 + an yk = zk can be written as one particular solution of that equation plus an arbitrary linear combination of a fundamental set of solutions of the related homogeneous equation yk+n + a1 yk+n−1 + ⋯ + an−1 yk+1 + an yk = 0.
Steady-State Vectors:
P is a stochastic matrix and q is the steady-state (equilibrium) vector for P; then:
Pq = q
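A minimal sketch in NumPy (the 2 × 2 column-stochastic matrix is an illustrative choice): the steady-state vector is an eigenvector for the eigenvalue 1, scaled so its entries sum to 1.

    import numpy as np

    P = np.array([[0.7, 0.4],
                  [0.3, 0.6]])

    eigenvalues, eigenvectors = np.linalg.eig(P)
    i = np.argmin(np.abs(eigenvalues - 1.0))   # column for the eigenvalue 1
    q = np.real(eigenvectors[:, i])
    q = q / q.sum()                            # scale to a probability vector
    print(q)                                   # approximately [0.571, 0.429]
    print(np.allclose(P @ q, q))               # True: Pq = q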
Questions 1-8 are true or false questions. If the statement is true, prove it. If
false, give a counterexample.
1. If A is a 2 × 2 matrix such that A(Ax) = 0 for all x ∈ R2, then A is the zero
matrix.
Solution:
False. If A(Ax) = 0 for all x, then the column space of A and the nullspace of A
must be the same space. In particular, consider the matrix
Solution:
True. We can realize such a system of equations as a single matrix equation Ax
= b, where A is a 3 × 4 matrix. Hence, rank(A) ≤ 3, so the dimension of the
nullspace of A is at least 1: dim nul(A) = 4 − rank(A) ≥ 4 − 3 = 1. Hence, there
must be at least one free variable in the system, meaning that, if the system is
solvable at all, it must have an infinite number of solutions.
Solution:
False. Let V = R2, which is clearly a vector space, and let S be the singleton set {
[ 1 0 ] }. The single element of S does not span R2, so no subset of S can be a
basis for R2. Hence, this provides a counterexample to the statement
4. Suppose A is an m × n matrix such that Ax = b can be solved for any choice of
b ∈ Rm. Then the columns of A form a basis for Rm.
Solution:
False. Consider the matrix
Then A is already in reduced echelon form and clearly has 2 pivots, so rank(A) =
2. This implies that dim col(A) = 2, so the column space of A consists of all of R2.
Thus, the equation Ax = b can be solved for any b ∈ R2 (since any b is in col(A)).
However, the columns of A are clearly not linearly independent (no set containing
the zero vector can be linearly independent), so they cannot form a basis for R2.
A related but true statement would be the following: “Suppose A is an m × n
matrix such that Ax = b can be solved for any choice of b ∈ Rm. Then some
subset of the columns of A forms a basis for Rm.”
Solution:
True. Translating the system of equations into a matrix equation Ax = b, the
nullspace of A must be at least one-dimensional, so the solution-space must be
at least one-dimensional. Since the solution space of the matrix equation
corresponds to the intersection of the hyperplanes, that intersection must be at
least one-dimensional, meaning it must contain a line.
Solution:
False. Consider the symmetric matrix
Then A only has rank 1, meaning that A cannot be invertible, so this gives a
counterexample to the statement
Solution:
True. You should check that the set of polynomials of degree ≤ 5 satisfies all the
rules for being a vector space. The important facts are this space is closed under
addition and scalar multiplication.
For questions 9 and 10, determine whether the given subset is a subspace of the
given vector space. Explain your answer.
9. Vector space: R4
Subset: vectors of the form
Solution:
Yes, this is a subspace. If we take two vectors in the subset, say
is in the set, so this set is closed under scalar multiplication. Thus, the set is
closed under both addition and scalar multiplication, and so is a subspace.
Solution:
No, this is not a subspace. To see that it is not closed under addition, notice that
if f(t) = t2 and g(t) = −t2, then f and g are both in the set of quadratic polynomials,
but, since (f + g)(t) = f(t) + g(t) = t2 + (−t2) = 0, the sum f + g is not a quadratic
polynomial.
Chapter 5 - Eigenvalues and Eigenvectors
If Ax = λx is satisfied, here λ is a scalar value and is an eigenvalue if the equation has a nontrivial solution, and x is the eigenvector related to the eigenvalue λ.
Ax = λx can be rewritten with the steps below:
Ax − λx = 0
(A − λI)x = 0 ... Here, if the columns of the matrix A − λI are linearly dependent, the equation Ax = λx has a nontrivial solution. Then we say that λ is an eigenvalue of matrix A.
· If we have the eigenvalue(s) of a matrix, we find the corresponding eigenvectors by performing row operations after inserting the zero vector as the last column next to the matrix A − λI. Row reductions can be used to find the eigenvectors, but not the eigenvalues.
· All eigenvectors, together with the zero vector, as the solutions of (A − λI)x = 0 form the eigenspace of A corresponding to λ.
· An n × n square matrix has n eigenvalues, but they are not always distinct. When they are all distinct, there will be n corresponding eigenvectors. If the matrix has distinct eigenvalues λ1, λ2, …, λr, the set {v1, v2, …, vr} of corresponding eigenvectors is linearly independent.
Determinants:
· If the determinant of a matrix is not equal to zero and "zero" is not an eigenvalue of the matrix, then that matrix is invertible.
· If you perform row reduction operations on matrix A, the determinant of the resulting matrix is related to the determinant of A. Note: a row interchange changes the sign of the determinant, and a row scaling scales the determinant by the same factor. If A reduces to an echelon form U by row replacements and r row interchanges:
det A = (−1)^r · (product of the pivots of U), when A is invertible
det A = 0, when A is not invertible
· det A^T = det A
· The determinant of a triangular matrix is the product of the values on the main diagonal.
The Characteristic Equation:
Expanding det(A − λI) = 0 gives a polynomial in λ; the polynomial obtained by this operation is called the characteristic polynomial of matrix A. This polynomial is of nth degree and its roots are the eigenvalues of the matrix.
Similarity:
· If both matrices A and B are n × n and there is an invertible matrix P providing P−1AP = B, or A = PBP−1, or Q−1BQ = A, then we say that A and B are similar. Similar matrices have the same characteristic polynomial and therefore the same eigenvalues.
· First, find the eigenvalues (λ1, λ2, …, λn) by using the characteristic equation obtained from det(A − λI) = 0.
x0 = c1 v1 + c2 v2 + ⋯ + cn vn = [v1 v2 … vn] [c1]
                                              [c2]
                                              [ ⋮]
                                              [cn]
5.3 Diagonalization:
· If A = PDP−1 where P is an invertible matrix and D is a diagonal matrix, then A is said to be diagonalizable, because A^k = P D^k P−1 is then obtainable.
· The diagonal entries of matrix D should be the eigenvalues (λ1, λ2, …, λn), written in the same order as the corresponding eigenvectors are written as the columns of matrix P.
Diagonalizing Matrices:
· Find the eigenvalues and n linearly independent eigenvectors, and write the eigenvectors as the columns of matrix P.
· Write matrix D using the eigenvalues as the diagonal entries. Eigenvalues and eigenvectors must be listed in the same order in D and P.
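A minimal sketch of diagonalization in NumPy (the 2 × 2 matrix is an illustrative choice with distinct eigenvalues 2 and 3):

    import numpy as np

    A = np.array([[4.0, -2.0],
                  [1.0,  1.0]])

    eigenvalues, P = np.linalg.eig(A)    # columns of P are eigenvectors
    D = np.diag(eigenvalues)

    print(np.allclose(A, P @ D @ np.linalg.inv(P)))       # True: A = P D P^-1
    A5 = P @ np.diag(eigenvalues ** 5) @ np.linalg.inv(P)
    print(np.allclose(A5, np.linalg.matrix_power(A, 5)))  # True: A^5 = P D^5 P^-1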
The Matrix of a Linear Transformation:
· x = r1 b1 + r2 b2 + ⋯ + rn bn, with the coordinate vector [x]B = [r1, …, rn]^T
· T(x) = T(r1 b1 + r2 b2 + ⋯ + rn bn) = r1 T(b1) + r2 T(b2) + ⋯ + rn T(bn)
[T(x)]C = r1 [T(b1)]C + r2 [T(b2)]C + ⋯ + rn [T(bn)]C
· [T(x)]C = M [x]B, where M = [[T(b1)]C [T(b2)]C … [T(bn)]C] is the matrix for T (relative to the bases B and C).
Complex Eigenvalues:
· The complex eigenvalues of A are found in conjugate pairs.
· The eigenvalues of the matrix C = [a −b; b a] are λ = a ± bi, and r = |λ| = √(a² + b²). So
C = [r 0; 0 r] [cos φ −sin φ; sin φ cos φ]
which is a scaling by r followed by a rotation through the angle φ.
· For a complex eigenvalue λ = a − bi with eigenvector v, take P = [Re v  Im v] and C = [a −b; b a]; then A = PCP−1.
Discrete Dynamical Systems:
· Eigenvector decomposition: x0 = c1 v1 + c2 v2 + ⋯ + cn vn
x1 = Ax0 = A(c1 v1 + ⋯ + cn vn) = c1 λ1 v1 + c2 λ2 v2 + ⋯ + cn λn vn
· We can examine the behavior of the dynamic system for large values of k by looking at xk = c1 λ1^k v1 + ⋯ + cn λn^k vn.
Change of Variable:
· A new sequence: yk = P−1 xk, or xk = P yk. So, since xk+1 = A xk, we get P yk+1 = A P yk = (PDP−1) P yk = P D yk. Shortly, P yk+1 = P D yk. Multiplying both sides by P−1 on the left gives yk+1 = D yk.
Differential Equations:
· x′(t) = A x(t) is a linear equation where
x(t) = [x1(t), …, xn(t)]^T,  x′(t) = [x1′(t), …, xn′(t)]^T,  and  A = [a11 … a1n; ⋮ ⋱ ⋮; an1 … ann]
· If the derivative of each function only depends on the function itself, then the system is decoupled.
For a decoupled system, each equation involves only its own function and the system can be written with a diagonal matrix:
[ x1′(t) ]   [ λ1        0 ] [ x1(t) ]
[   ⋮    ] = [     ⋱       ] [   ⋮   ]
[ xn′(t) ]   [ 0        λn ] [ xn(t) ]
Complex Eigenvalues:
· The general solution of x′(t) = A x(t) is a combination of eigenfunctions:
x(t) = c1 v1 e^(λ1 t) + c2 v2 e^(λ2 t) + ⋯ + cn vn e^(λn t)
· When an eigenvalue λ is complex, the eigenvalues and eigenfunctions occur in conjugate pairs, and real solutions are obtained from the real and imaginary parts of v e^(λ t).
Calculate&?5PA = W?5 .
!
·
` a
·
· For all ?^ values chosen, _X approaches the dominant eigenvalue and ?5
approaches the corresponding eigenvector.
Solve ,W c bd)#5 = ?5 .
·
· Calculate μk, the entry of yk with the largest absolute value, and νk = α + 1/μk.
· Calculate xk+1 = (1/μk) yk.
· For all x0 values chosen, νk approaches the eigenvalue λ and xk
approaches the corresponding eigenvector.
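A minimal sketch of the power method in NumPy (the matrix, starting vector, and iteration count are illustrative choices, not from the text):

    import numpy as np

    def power_method(A, x0, steps=50):
        x = x0.astype(float)
        mu = 0.0
        for _ in range(steps):
            y = A @ x
            mu = y[np.argmax(np.abs(y))]   # entry of Ax with the largest absolute value
            x = y / mu                     # rescale so the largest entry is 1
        return mu, x

    A = np.array([[6.0, 5.0],
                  [1.0, 2.0]])
    mu, x = power_method(A, np.array([1.0, 0.0]))
    print(mu)                 # close to 7, the dominant eigenvalue of A
    print(A @ x - mu * x)     # close to the zero vector, so x is (nearly) an eigenvector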
1. Find the eigenvalues and eigenvectors of
Solution:
Solution:
Solution:
Solution:
All matrices have the characteristic equation
Solution:
We can diagonalise A by using the matrix of eigenvectors, L
D = LAL-1
If we rewrite this by multiplying it by L from the right, and L−1 from the left, then
A = L-1DL
Solution:
7. If
then SPST is a diagonal matrix. Without further calculation give the eigenvalues
of P, and its corresponding eigenvectors.
Solution:
If S is orthogonal then SST = I. The diagonal elements should be the eigenvalues
4, −2. The rows of S can be read off to give the corresponding eigenvectors
Solution:
9. Obtain eigenvalues and eigenvectors for
Solution:
Solution:
Chapter – 6: Orthogonality and Least Squares
Consider two vectors u and v in Rn; then u and v are regarded as n×1 matrices, the transpose uT is a 1×n matrix, and the matrix product uTv is a 1×1 matrix which is represented as a single real number (without brackets). This number obtained from uTv is known as the inner product of u and v, is often written as u·v, and is sometimes referred to as the dot product. If
u = [u1]        v = [v1]
    [u2]   and      [v2]
    [ ⋮]            [ ⋮]
    [un]            [vn]
then
uTv = [u1 u2 … un] [v1]  = u1 v1 + u2 v2 + ⋯ + un vn
                   [v2]
                   [ ⋮]
                   [vn]
Theorem – 1:
u·v = v·u
(u + v)·w = u·w + v·w
Length of a Vector:
For any v belonging to Rn with entries v1, v2, …, vn, the length (or norm) is the non-negative scalar ‖v‖ = √(v·v) = √(v1² + v2² + ⋯ + vn²).
Also, for any scalar quantity c, the length of cv is |c| times the length of v, i.e.
‖cv‖ = |c| ‖v‖
A vector having length 1 is termed a unit vector; by dividing a vector v by the scalar quantity ‖v‖, we get a unit vector u whose length is 1 and which is in the same direction as v.
Distance in Rn:
For any two vectors u and v belonging to Rn, the distance between the vectors is represented as
dist(u, v) = ‖u − v‖
Orthogonal Vectors:
Two vectors are termed orthogonal vectors if they are perpendicular to each other, i.e., if u·v = 0.
Theorem – 2:
Two vectors u and v are orthogonal if and only if
‖u + v‖² = ‖u‖² + ‖v‖²
orthogonal to W and the set of all vectors z which are orthogonal to W is known as the
orthogonal complement of W and is denoted by W┴. A vector x can only belong to W┴ if and
Theorem – 3:
For an m×n matrix A, the orthogonal complement of the row space of A is the null space of
A, and the orthogonal complement of the column space of A is the null space of AT.
If u and v are two non-zero vectors, then the angle Ø between their geometric representations satisfies
u·v = ‖u‖ ‖v‖ cos Ø
A set of vectors (u1, u2, … un) belonging to Rn is an orthogonal set if each pair of distinct
vectors in the set is orthogonal to each other, i.e., if ui·uj = 0 whenever i ≠ j.
Theorem – 4:
If S = (u1, u2, … un) is a set of orthogonal vectors which are non-zero and belong to Rn, then S is
linearly independent and hence is a basis for the subspace which is spanned by S.
Theorem – 5:
Let (u1, u2, …, un) be an orthogonal basis for a subspace W of Rn. For each y in W, the weights in the linear combination
y = c1u1 + … + cnun
are given by cj = (y·uj)/(uj·uj)   (j = 1, …, n)
"6 ."5
Orthogonal Projection:
Consider two vectors u and y belonging to Rn. Now, y can be divided into two parts such that
one part is a multiple of u and the other part is orthogonal to u. This can be written as,
y = ŷ + z
Where ŷ is termed as orthogonal projection of y onto u and the term z is termed as component
of y orthogonal to u.
!. "
= "
". "
Orthonormal Set:
A set (u1, u2, …, un) is said to be an orthonormal set if it is an orthogonal set consisting of
Theorem – 7:
For an m×n matrix say U having orthonormal columns, and u and v belong to Rn, then:
· ‖Ux‖ = ‖x‖
An orthogonal matrix is a square matrix which is invertible such that U-1 = UT.
The orthogonal projection of any point in R2 onto a line passing through the origin
has an important analogue in the Rn. Suppose there is a vector y along with a
subspace W in Rn then, there exists a vector ŷ in the subspace such that ŷ is the
unique vector in W closest to y. These two properties of the vector ŷ provide the
solution to finding the least squares solutions for linear systems. The terms in the
sum for y can be grouped into two parts such that y can be represented as:
Theorem – 8:
Orthogonal Decomposition Theorem:
Considering a subspace of Rn say W, each y in Rn can be written in the unique form y = ŷ + z.
Here, ŷ is in W and z is in W┴. If (u1, u2, …, up) is any orthogonal basis of W, then
ŷ = ((y·u1)/(u1·u1)) u1 + ⋯ + ((y·up)/(up·up)) up
And z = y − ŷ.
Considering (u1, u2, … , up) is an orthogonal basis for W and y as a vector belonging to the
subspace W. Then,
Theorem – 9:
Considering a subspace of Rn say W and let a vector y be present in Rn and let ŷ be the
orthogonal projection of y onto W. Then ŷ is the closest point in W to y, in the sense that ‖y − ŷ‖ < ‖y − v‖ for every v in W distinct from ŷ.
Theorem – 10:
Considering an orthonormal basis (u1, u2, …, up) for the subspace W of Rn, the projection is projW y = (y·u1)u1 + ⋯ + (y·up)up.
Theorem – 11:
For a given basis say {x1, …, xn} for a nonzero subspace W of Rn, define
Theorem – 12:
The QR Factorization:
Considering an m×n matrix say A which has linearly independent columns, then A can be factored in the form A = QR, where Q is an m×n matrix whose columns form an orthonormal basis for Col A and R is an n×n upper triangular invertible matrix which has positive entries on its diagonal.
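A minimal sketch using NumPy's built-in QR factorization (the 3 × 2 matrix is an illustrative choice):

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])

    Q, R = np.linalg.qr(A)        # Q: 3x2 with orthonormal columns, R: 2x2 upper triangular
    print(np.allclose(Q.T @ Q, np.eye(2)))   # True: columns of Q are orthonormal
    print(np.allclose(Q @ R, A))             # True: A = QR

Note that NumPy may choose signs so that some diagonal entries of R are negative; flipping the signs of the corresponding columns of Q and rows of R gives the form with positive diagonal entries described above.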
The equation ATAy = ATb is used to represent a system of equations known as the normal equations for Ay = b.
Theorem – 13:
The set of least-squares solutions of Ay = b coincides with the nonempty set of solutions of the normal equations ATAy = ATb.
Theorem – 14:
Considering A as a matrix of the order m×n, the following statements are logically equivalent:
· The equation given by Ay = b has a unique least squares solution for every b belonging
to Rm.
· The columns present in A are linearly independent.
· The matrix given by ATA is invertible.
When these statements hold, the least-squares solution ŷ is given by ŷ = (ATA)−1ATb.
Theorem – 15:
Take a matrix of the order m×n which has linearly independent columns say A and let A = QR
be a QR factorization of A. Then in that case, for each b in Rm, the equation given by Ax = b has
a unique least-squares solution which is given by:
ŷ = R−1QTb
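A minimal sketch comparing these least-squares formulas in NumPy (the overdetermined 4 × 2 system is an illustrative choice):

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])
    b = np.array([1.0, 2.0, 2.0, 4.0])

    y_normal = np.linalg.solve(A.T @ A, A.T @ b)        # normal equations (A^T A) y = A^T b
    Q, R = np.linalg.qr(A)
    y_qr = np.linalg.solve(R, Q.T @ b)                  # y = R^-1 Q^T b
    y_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)     # NumPy's built-in least squares
    print(y_normal, y_qr, y_lstsq)                      # all three agree: [0.9 0.9]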
Least-Squares Lines:
The simplest relation between any two given variables say x and y is a linear equation denoted
by y = ß0 + ß1x. Suppose the experimental data lie close to a line; then the value yj is the observed value for y and ß0 + ß1xj is the predicted value determined by the line, and the difference between an observed y-value and a predicted y-value is known as a residual.
The least-squares line is the line y = ß0 + ß1x which minimizes the sum of the squares of the
residuals. This line is also known as the line of regression of y on x, since any errors present in
the data are assumed to be just in the y-coordinates. The coefficients ß0, ß1 of the line are known
as regression coefficients.
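A minimal sketch of fitting such a least-squares line in NumPy (the data points are illustrative): the design matrix has rows [1, xj], and the least-squares solution gives the regression coefficients.

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.0, 3.0, 5.0, 6.0])

    X = np.column_stack([np.ones_like(x), x])    # each row is [1, xj]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    b0, b1 = beta
    print(b0, b1)        # 0.5, 1.4 -> the least-squares line is y = 0.5 + 1.4x
    residuals = y - (b0 + b1 * x)
    print(residuals)     # the residuals whose squared sum the line minimizes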
In case the data points were exactly on the line, the parameters ß0 and ß1 would satisfy the equation of the line at every data point, ß0 + ß1xj = yj, which can be represented as the matrix equation Xß = y with rows [1 xj] in X.
The General Linear Model:
In some applications, it is required to trace the data points graphically with something other than
a straight line. Usually, to deal with this situation, a residual vector is introduced and the
equation is written as:
An equation which is written in this form is referred to as a linear model. Once X and y are
determined, the objective becomes to minimize the length of E, which generates a least squares
solution of Xß = y.
Let V be an inner product space, with the inner product denoted by <u,v>. We can define the
length or norm of a vector v to be the scalar ‖v‖ = √⟨v, v⟩.
A unit vector is a vector which has the length 1. The distance between u and vector v is ǁu-vǁ
and the vectors u and v are orthogonal if <u,v> = 0.
Theorem – 16:
The Cauchy-Schwarz Inequality: |⟨u, v⟩| ≤ ‖u‖ ‖v‖
Theorem – 17:
ǁ u + v ǁ ≤ ǁuǁ + ǁvǁ
In case a function is approximated by a curve which can be given using the quadratic equation,
y = ß 0 + ß 1 t + ß2 t 2 ,
the coefficient ß2 may not provide us with the desired values which are on par with the quadratic
trend in the data, since it might not be “independent” in a statistical sense from the other ßi. In
order to generate what is known as trend analysis of the given data we can introduce an inner
product on the space Pn. For p, q in Pn, define
<p, q> = p(t0)q(t0) + … + p(tn)q(tn)
Let p0, p1, p2, p3 denote an orthogonal basis for the subspace P3 belonging to Pn, obtained by
applying the Gram-Schmidt process for the polynomials which are 1, t, t2, and t3. Let ŷ be the
orthogonal projection, as per the given inner product, of y onto P3, say ŷ = c0p0 + c1p1 + c2p2 + c3p3.
In such a case, ŷ is known as the cubic trend function, and c0, … , c3 are the trend coefficients
of the data. The coefficient c1 measures the linear trend, c2 the quadratic trend, and c3 the cubic
trend. It turns out that if the data have certain properties, these coefficients are statistically
independent.
Continuous functions can be approximated by using linear combinations of sine and cosine functions. Let us consider the functions on 0 ≤ t ≤ 2Π; any function in C[0, 2Π] can be approximated as closely as desired by a function of the form given below for a large value of n. Such a function is known as a trigonometric polynomial. If an and bn are not both zero, the polynomial is said to be of order n. The connection between trigonometric polynomials and other functions in C[0, 2Π] depends upon the fact that for any n ≥ 1 the set:
{1, cos t , cos 2t, … , cos nt, sin t, sin 2t, … , sin nt}
The norm of the difference between f and a Fourier approximation is known as the mean square error in the approximation. For f on [0, 2Π], the Fourier series is the expression for f(t) given below:
&"
#($) = % + % ' (&* cos ,$ + % -* sin ,$)
2
*/0
Problems 1 and 2 are related to the following problem statement. Let L be the
line thru the origin in R2 that is parallel to the vector
Solution:
We know that projL: R2 → R2 is a linear transformation, so we can find the
columns of the standard matrix by plugging in the standard basis vectors. We
have the formula
2. Find the point on L which is closest to the point (7, 1) and find the point on L
which is closest to the point (−3, 5).
Solution:
projL(v) is the closest point to v in L.
So, (3, 4) is closest point on the line to (7, 1) and (33/25, 44/25) is the closest
point on the line to (−3, 5).
Problems 3 and 4 are related to the following problem statement. Let L be the
line thru the origin in R3 that is parallel to the vector
Solution:
We can also use the formula which says that A = UUT where U is the matrix
whose columns form an orthonormal basis of L. In the case of a line, an
orthonormal basis is just a vector with length one in the direction of L. To find this
we just normalize u. We first need to normalize the vector. The length
Solution:
Problems 5-7 are related to the following problem statement. Let x1 = [1 2 1]T
and x2 = [3 0 3]T and let P be the plane thru the origin spanned by x1 and x2.
Solution:
6. Find the standard matrix of the orthogonal projection onto P.
Solution:
We let U be the matrix whose columns forms the orthonormal basis.
7. Find the point on P which is closest to the point (1, 0, 0).
Solution:
Problems 8-10 are related to the following problem statement. Let x1 = [1 0 0]T
and x2 = [1 1 1]T and let P be the plane thru the origin spanned by x1 and x2.
Solution:
We use the Gram-Schmidt process to find an orthogonal basis.
9. Find the standard matrix of the orthogonal projection onto P.
Solution:
We let U be the matrix whose columns forms the orthonormal basis
10. Find the point on P which is closest to the point (0, 0, 1).
Solution:
Chapter – 7: Symmetric Matrices and Quadratic Forms
Symmetric Matrix:
A matrix A is said to be a symmetric matrix if and only if A = AT; for this property to hold, A must be a square matrix. While the values on the diagonal can be arbitrary, the other values must be mirrored across the main diagonal.
Theorem – 1:
For a matrix A which is symmetric, any two eigenvectors from two different eigenspaces are orthogonal.
A matrix A is orthogonally diagonalizable if there are an orthogonal matrix P such that the condition P−1 = PT holds true, along with a diagonal matrix D, such that
A = PDPT = PDP−1
For the completion of such a diagonalization we must first calculate n eigenvectors which are orthonormal. Conversely, if A satisfies the above equation, then A is symmetric, since AT = (PDPT)T = PDPT = A.
Theorem – 2:
A square matrix let us suppose A of order n × n is orthogonally diagonalizable if and only if A is
a symmetric matrix.
Theorem – 3:
The set consisting of the eigenvalues of A is sometimes termed the spectrum of A, and the following properties make this theorem the spectral theorem. An n × n symmetric matrix A has the properties:
1. A has n real eigenvalues, counting multiplicities.
2. Each eigenvalue λ has an eigenspace whose dimension equals the multiplicity of λ as a root of the characteristic equation.
3. The eigenspaces are mutually orthogonal, since eigenvectors which correspond to different eigenvalues are orthogonal.
4. A is orthogonally diagonalizable.
Spectral Decomposition:
Let us suppose that A = PDP-1 where the columns of P are orthogonal eigenvectors u1, … ,un for
A, and the corresponding eigenvalues are inside the diagonal matrix D as λ1, …, λn. Now, since A = PDPT, A can be written as
A = λ1 u1u1T + λ2 u2u2T + ⋯ + λn ununT
The above expression for A is also known as the spectral decomposition of A, as it breaks up the matrix into pieces determined by the eigenvalues of A. Every matrix ujujT is a projection matrix, because for each x in Rn the vector (ujujT)x is the orthogonal projection of x onto the subspace spanned by uj.
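A minimal sketch of the spectral decomposition in NumPy (the 2 × 2 symmetric matrix is an illustrative choice):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    eigenvalues, U = np.linalg.eigh(A)      # eigh is for symmetric (Hermitian) matrices
    decomposition = sum(lam * np.outer(U[:, j], U[:, j])
                        for j, lam in enumerate(eigenvalues))
    print(np.allclose(decomposition, A))    # True: A = sum of lambda_j * u_j u_j^T
    print(np.allclose(U.T @ U, np.eye(2)))  # True: P = U is an orthogonal matrix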
A function Q defined on Rn whose value at a vector x in Rn is given by Q(x) = xTAx, where A is an n × n symmetric matrix, is termed a quadratic form on Rn. The matrix A is called the matrix of the quadratic form.
A change of variable replaces the variable x in an equation with another variable y; the change has the form x = Py. Here P is an invertible matrix and y belongs to Rn, so y is the coordinate vector of x relative to the basis of Rn determined by the columns of the matrix P.
If the change of variable is applied to a quadratic form xTAx, in that case
xTAx = (Py)TA(Py) = yT(PTAP)y
Theorem – 4:
For any symmetric matrix of the order n× n say A then there is an orthogonal change in variable,
x = Py such that the quadratic form xTAx is turned into a quadratic form yTDy without having
any cross-product term. The columns belonging to P are known as the principal axes of the
quadratic form xTAx. The vector y is the coordinate vector of x relative to the orthonormal basis of Rn given by the columns of P.
Suppose Q(x) = xTAx, where A is an invertible and symmetric matrix of the order 2 × 2. Taking
xT Ax = c
The curve so formed is either an ellipse or a hyperbola as per the equation, and if the equation is not in standardized form, then the ellipse or hyperbola is rotated until it is in standard position.
A quadratic form Q is:
1. Positive definite for cases where Q(x) > 0 for all x ≠ 0.
2. Negative definite for cases where Q(x) < 0 for all x ≠ 0.
3. Indefinite for cases where Q(x) has both positive as well as negative values.
4. Positive semidefinite for cases where Q(x) ≥ 0 for all values of x.
Theorem – 5:
· Positive definite if and only if the eigenvalues belonging to A are all positive,
· Negative definite if and only if the eigenvalues which belong to A are all negative, or
· They are called indefinite when A has both positive as well as negative eigenvalues.
A positive definite matrix A is a symmetric matrix according to which the quadratic form xTAx
is positive definite. A similar definition applies for negative definite matrices.
The requirement that x in Rn be a unit vector can be stated in several equivalent ways, as
given below:
ǁ x ǁ = 1, ǁ x ǁ2 = 1, xTx = 1
one of the commonly used versions (of xTx = 1) in applications is x12 + x22 + … + xn2 = 1
Theorem – 6:
Considering a symmetric matrix A, define
m = min{xTAx : ‖x‖ = 1},   M = max{xTAx : ‖x‖ = 1}
Then M is the greatest eigenvalue λ1 of A and m is the least eigenvalue of A. The value of xTAx equals M when x is a unit eigenvector u1 corresponding to M, and equals m when x is a unit eigenvector corresponding to m.
Theorem – 7:
Let A be a symmetric matrix, let λ1 be its greatest eigenvalue, and let u1 be a unit eigenvector corresponding to λ1. Then the maximum value of xTAx subject to the constraints
xTx = 1, xTu1 = 0
is the second greatest eigenvalue λ2, and this maximum is obtained when x is a unit eigenvector u2 corresponding to λ2.
Theorem – 8:
Let A be a symmetric matrix with an orthogonal diagonalization A = PDP−1, where the entries on the diagonal of D are arranged so that λ1 ≥ λ2 ≥ … ≥ λn and where the columns of P are corresponding unit eigenvectors u1, …, un. Then for k = 2, …, n, the maximum value of xTAx subject to the constraints
xTx = 1, xTu1 = 0, …, xTuk−1 = 0
is the eigenvalue λk, and this maximum is obtained at x = uk.
A matrix A of the order m × n can be factorized in the form A = QDP−1, and this type of factorization is always possible for any m × n matrix.
Let A be an m × n matrix. Then ATA is symmetric and also is orthogonally diagonalizable. Let {v1, …, vn} be an orthonormal basis for Rn consisting of eigenvectors of ATA, and let λ1, …, λn be the associated eigenvalues. Then, for each i,
‖Avi‖² = (Avi)T(Avi) = viT(ATAvi)
= viT(λivi)
= λi
The eigenvalues of ATA are all non-negative and can be easily arranged in the form:
λ1 ≥ λ2 ≥ ,…, ≥ λn ≥ 0
The singular values of A are the square roots of the eigenvalues of ATA.
Theorem – 9:
Assuming {v1, … , vn} is an orthonormal basis for Rn which has eigenvectors defined as ATA,
and these eigenvectors are arranged in such a manner that the corresponding eigenvalues given
by ATA adhere to the fact λ1 ≥ … ≥ λn, and assuming A to have r values which are non-zero
singular values then, {Av1, … , Avr} is an orthogonal basis for Col A as well as rank A = r.
Theorem – 10:
Supposing A is an m × n matrix which has rank r. Then there exists an m × n matrix Σ whose diagonal block D contains the first r singular values of A as its diagonal entries (and whose other entries are zero), and there exist an m × m orthogonal matrix U along with an n × n orthogonal matrix V such that
A = U Σ VT
The columns of U in this decomposition are termed the left singular vectors of A, and the columns of V are termed the right singular vectors of A.
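A minimal sketch of the SVD in NumPy (the 2 × 2 matrix is an illustrative choice), also checking that the singular values are the square roots of the eigenvalues of ATA:

    import numpy as np

    A = np.array([[3.0, 0.0],
                  [4.0, 5.0]])

    U, s, Vt = np.linalg.svd(A)            # s holds the singular values, largest first
    Sigma = np.diag(s)
    print(np.allclose(U @ Sigma @ Vt, A))  # True: A = U Sigma V^T

    eigenvalues = np.linalg.eigvalsh(A.T @ A)            # eigenvalues of A^T A (ascending)
    print(np.allclose(np.sqrt(eigenvalues[::-1]), s))    # True: singular values are their square roots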
Theorem – 11:
Supposing A is a matrix of the order n × n, then each of the following statements is equivalent to the statement that A is an
invertible matrix:
· (Nul A)┴ = Rn
· Row A = Rn
Since the diagonal values present in D are not equal to zero, the matrix D is invertible, and the following formula gives the pseudoinverse (using the reduced SVD A = Ur D VrT):
A+ = Vr D−1 UrT
In order to ready the principal component analysis, suppose [X1 … XN] is a matrix of
observations of the order p × N. Then, the sample mean, M, of the observation vectors X1, … ,
XN is given by
have a zero sample mean, and B is said to be in mean deviation form. The (sample) covariance matrix is the p × p matrix S defined by S = (1/(N − 1)) B BT; S is a positive semidefinite matrix. The total variance of the given data is the sum of the variances given on the diagonal of matrix S. Since, in general, the sum of the diagonal terms of a square matrix S is known as the trace of the matrix, written tr(S), the total variance equals tr(S).
The value sij of the matrix S with i ≠ j is known as the covariance of xi and xj.
Let us assume that the matrix [X1 … XN] is already in mean deviation form. The aim of principal component analysis is to find an orthogonal p × p matrix P = [u1 … up] that determines a change of variable X = PY in which the new variables are uncorrelated and are arranged in order of decreasing variance.
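A minimal sketch of this procedure in NumPy (the 2 × 6 data matrix is an illustrative choice): the covariance matrix of the mean-centered data is formed and its eigenvectors, ordered by decreasing eigenvalue, give the new uncorrelated variables.

    import numpy as np

    X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3],
                  [2.4, 0.7, 2.9, 2.2, 3.0, 2.7]])   # p = 2 variables, N = 6 observations

    M = X.mean(axis=1, keepdims=True)        # sample mean of the observation vectors
    B = X - M                                # mean deviation form
    S = (B @ B.T) / (X.shape[1] - 1)         # sample covariance matrix (p x p)

    variances, P = np.linalg.eigh(S)         # eigenvalues ascending; columns of P orthonormal
    order = np.argsort(variances)[::-1]      # reorder so the variance is decreasing
    variances, P = variances[order], P[:, order]

    print(variances)                         # variances of the new (uncorrelated) variables
    print(variances.sum(), np.trace(S))      # both equal the total variance tr(S)
    print(P[:, 0])                           # first principal component: direction of greatest variance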
Solution:
Since the rows of A are orthonormal AAT = I and hence ATAAT = AT. Since AT is
nonsingular it has an inverse (AT)−1 . Thus ATAAT(AT)−1 = AT(AT)−1 implying that
ATA = I, i.e., the columns of A are orthonormal.
Solution:
xTuuT(1−x) can be written as the product of two scalars (xTu)(uT(1 − x)) . The first
scalar is the sum of the coordinates of u corresponding to the subset S and the
second scalar is the sum of the complementary coordinates of u. To maximize
the product, one partitions the coordinates of u so that the two sums are as
equally as possible. Given the subset determined by the maximization, check if
xTu=uT(1 − x).
Questions 3 and 4 are related to the following problem statement. Let x1, x2, . . . ,
xn be n points in d-dimensional space and let X be the n×d matrix whose rows
are the n points. Suppose we know only the matrix D of pairwise distances
between points and not the coordinates of the points themselves. The xij are not
unique since any translation, rotation, or reflection of the coordinate system
leaves the distances invariant. Fix the origin of the coordinate system so that the
centroid of the set of points is at the origin.
Solution:
Since the centroid of the set of points is at the origin of the coordinate axes, Σi xij = 0 for each j.
Summing over j gives
and
are the averages of the square of the elements of the ith row, the square of the
elements of the jth column and all squared distances respectively.
4. Describe an algorithm for determining the matrix X whose rows are the xi.
Solution:
Having constructed XTX we can use an eigenvalue decomposition to determine
the coordinate matrix X. Clearly XTX is symmetric and if the distances come from
a set of n points in a d-dimensional space XTX will be positive definite and of rank
d. Thus we can decompose XTX as XTX = VTσV where the first d eigenvalues are
positive and the remainder are zero. Since the XTX = VTσ1/2σ1/2V and thus the
coordinates are given by X = VTσ1/2.
5. Let D = [5 0; 0 4]. Compute D2 and D3. In general, what is Dk, where k is a
positive integer?
Solution:
6. Let A = [6 −1; 2 3]. Find a formula for Ak given that A = PDP−1, where
P = [1 1; 1 2], D = [5 0; 0 4], and P−1 = [2 −1; −1 1].
Solution:
Solution:
8. Diagonalize the following matrix, if possible.
Solution:
Eigenvalues: −2 and 2 (each with multiplicity 2).
9. If possible, diagonalize the matrix and find an orthogonal basis in which it has
diagonal form:
Solution:
The characteristic polynomial is
(All roots are integers, so they can be found by trial and error, among the divisors
of 27.) The corresponding homogeneous systems for eigenvectors and their
solutions are:
Thus, the matrix can be diagonalized, and it has diagonal form in the basis {u1,
u2, u3}. Since A is symmetric, u1 is orthogonal to u2, u3, and we only need to
orthogonalize u2, u3. (By the way, the fact that A is symmetric also tells us that it
is diagonalizable, i.e., we must find three independent eigenvectors!) Applying
Gram-Schmidt to {u2, u3}, we replace u3 with [−1 2 1]T. Finally, the matrix has
diagonal form
10. Let A be a square matrix with integral entries. Prove that A−1 exists and has
integral entries if and only if det A = ±1.
Solution:
Clearly, the determinant of an integral matrix is an integer (as it is obtained from
the entries of the matrix by addition and multiplication only). Thus, for the only if
part it suffices to notice that det A det A−1 = det(AA−1) = det I = 1. Since the
product of two integers is 1, they must both be ±1. The if part follows from the
formula A−1 = (1/ det A) adj A and the fact that, if A is integral, so is adj A (as its
entries are determinants of integral matrices, see above).
Chapter – 8: The Geometry of Vector Spaces
Say there are given vectors v1, v2, …, vp which belong to Rn and scalars c1, c2, …, cp. An affine combination of v1, v2, …, vp is a linear combination c1v1 + … + cpvp in which c1 + … + cp = 1.
The set consisting of all affine combinations of points in a set S is known as the affine hull (or affine span) of S, denoted aff S.
Theorem – 1:
A set S is said to be affine in case p,q ϵ S implies that (1 – t)p + tq ϵ S for every real number t.
Theorem – 2:
A given set S is affine if and only if every affine combination of points of S belongs to S; that is, S is affine if and only if S = aff S.
Definition:
A translate of any given set S in Rn by a vector p is the set S + p = {s + p : s ϵ S}. A flat in Rn is a translate of a subspace of Rn. Two flats are said to be parallel in case one of the flats is a translate of the other. The dimension of a flat is the dimension of the corresponding parallel subspace, and the dimension of a set S, written dim S, is the dimension of the smallest flat containing S. A line in Rn is a flat of dimension 1, whereas a hyperplane in Rn is a flat of dimension n − 1.
Theorem – 3:
Definition:
For v belonging to Rn, the standard homogeneous form of v is denoted by ṽ = [v; 1], which belongs to Rn+1.
Theorem – 4:
Any given point y in Rn is an affine combination of v1, …, vp in Rn if and only if the homogeneous form of y is in Span {ṽ1, …, ṽp}. As a matter of fact, y = c1v1 + … + cpvp with c1 + … + cp = 1 if and only if ỹ = c1ṽ1 + … + cpṽp.
Definition:
An indexed set of points {v1, …, vp} in Rn is affinely dependent only in case there are real numbers c1, …, cp, not all zero, such that c1 + … + cp = 0 and also c1v1 + … + cpvp = 0.
Theorem – 5:
Let us say that S = {v1, …, vp} belongs to Rn, where p ≥ 2 and S is an indexed set. In such a case, the following statements are equivalent:
· The set S is affinely dependent.
· One of the given points belonging to S is an affine combination of the other given points belonging to S.
· The set {ṽ1, …, ṽp} of homogeneous forms in Rn+1 is linearly dependent.
Barycentric Coordinates:
Theorem – 6:
Let us suppose an affinely independent set given by S = {v1, …, vk} in Rn. After that each p in
aff S has a unique representation as an affine combination of v1, …, vk. This means that for each p there exists a unique set of scalars c1, …, ck such that c1 + … + ck = 1 and p = c1v1 + … + ckvk.
Definition:
Let us suppose S = {v1, …, vk} is an affinely independent set, then for each point p belonging to
aff S, the coefficients c1, …, ck of p as in p = c1v1 + … + ckvk are known as the barycentric
coordinates of p.
Definition:
A convex combination of points say v1, v2, …, vk belonging to Rn is a linear combination given
by the equation c1v1 + c2v2 + … + ckvk in such a way that c1 + c2 + … + ck = 1 and ci ≥ 0 for all i.
Also, the set of all convex combinations of points in a set S is called the convex hull of S
denoted by conv S.
Definition:
A set say S is said to be convex if for each p,q ϵ S, the line segment pq is contained in S.
Theorem – 7:
A set say S is said to be convex if and only if every convex combination of points of S lies in S.
Theorem – 8:
Let {Sα} be an arbitrary collection of convex sets. Then the intersection of the Sα is convex, and for any collection of affine sets the intersection is affine.
Theorem – 9:
For any set S, the convex hull of S is the intersection of all the convex sets that contain S.
Theorem – 10:
(Caratheodory) In case S is a nonempty subset which belongs to Rn, then every single point in
conv S can be shown in the form of a convex combination of n+1 or lesser points from S.
8.4 – Hyperplanes:
A linear functional on Rn is a linear transformation f from Rn into R. For each scalar d belonging to R, the symbol [f : d] denotes the set of all x in Rn at which f(x) = d.
Theorem – 11:
A subset H of Rn is a hyperplane if and only if H = [f : d] for some nonzero linear functional f and some scalar d belonging to R. Equivalently, H is a hyperplane if and only if there exist a nonzero vector n and a real number d such that H = {x ∈ Rn : n · x = d}.
The convex hull of an open set is open, and the convex hull of a compact set is compact.
Definition:
The hyperplane H = [f : d] separates two sets A and B if either f(A) ≤ d and f(B) ≥ d, or f(A) ≥ d and f(B) ≤ d. If all of these weak inequalities are replaced by strict inequalities, then H is said to strictly separate A and B.
Theorem – 12:
Let A and B be two non-empty convex sets such that A is compact, B is closed, and A and B are disjoint. Then there exists a hyperplane H that strictly separates A from B.
Theorem – 13:
Let A and B be two non-empty compact sets. Then there exists a hyperplane that strictly separates A and B if and only if their convex hulls conv A and conv B are disjoint.
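For finite point sets, the separation condition above is easy to check directly. The sketch below (assuming numpy; the sets A, B and the hyperplane are made-up illustration data) tests whether the hyperplane [f : d], with f(x) = n · x, strictly separates A and B:

import numpy as np

A = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]])
B = np.array([[3.0, 3.0], [4.0, 2.5], [3.5, 4.0]])
n, d = np.array([1.0, 1.0]), 4.0       # the hyperplane x1 + x2 = 4

fA, fB = A @ n, B @ n                  # values of the linear functional on A and B
strictly_separates = (fA.max() < d < fB.min()) or (fB.max() < d < fA.min())
print(strictly_separates)              # True for this choice of H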
8.5 – Polytopes:
A polytope in Rn is the convex hull of a finite set of points; in R2 a polytope is simply a polygon, and in R3 it is a polyhedron.
Definition:
Let S be a compact convex set. A nonempty subset F of S is called a (proper) face of S in case F is not equal to S and there is a hyperplane H = [f : d] such that F = S ∩ H and either f(S) ≤ d or else f(S) ≥ d. The hyperplane H is known as a supporting hyperplane to S, and if dim F = k, then F is called a k-face of S. For a polytope P, a 0-face of P is known as a vertex, a 1-face is termed an edge, while a (k – 1)-dimensional face of a k-dimensional polytope is a facet.
Definition:
Let S be a convex set. A point p in S is known as an extreme point of S in case p is not in the interior of any line segment that lies in S. More precisely, if x, y ∈ S and p is on the line segment xy, then p = x or p = y. The set consisting of all the extreme points of S is called the profile of S.
Definition:
The set {v1, …, vk} is a minimal representation of the polytope P in case P = conv {v1, …, vk} and, for each i = 1, …, k, vi does not belong to conv {vj : j ≠ i}.
Theorem – 14:
Suppose M = {v1, …, vk} is the minimal representation of a polytope P. Then, for each point p, the following three statements are equivalent:
· pϵM
· p is a vertex of P
· p is an extreme point of P
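As an illustrative sketch of this equivalence in the plane (not from the text), scipy.spatial.ConvexHull recovers the vertices of the convex hull of a finite point set, which by the theorem are its extreme points and its minimal representation; the points are made up, and the interior point (1, 1) is correctly excluded:

import numpy as np
from scipy.spatial import ConvexHull

pts = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 3.0], [0.0, 3.0], [1.0, 1.0]])
hull = ConvexHull(pts)

print(sorted(hull.vertices))   # [0, 1, 2, 3] -> the four corners; (1, 1) is not a vertex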
Theorem – 15:
Let S be a nonempty compact convex set. Then S is the convex hull of its profile, i.e., of the set of its extreme points.
Theorem – 16:
Let f be a linear functional defined on a nonempty compact convex set S. Then there exist extreme points v̂ and ŵ of S such that f(v̂) = max {f(v) : v ∈ S} and f(ŵ) = min {f(v) : v ∈ S}.
Simplex:
A convex hull of an affinely independent finite set of vectors is termed a simplex; the convex hull of k + 1 affinely independent points is a k-dimensional simplex, or k-simplex. Relatedly, let Ii be the line segment from the origin 0 to the standard basis vector ei in Rn. For k ≤ n, the vector sum
Ck = I1 + I2 + … + Ik
is a k-dimensional hypercube.
Bezier Curves:
The control points for a cubic Bezier curve are denoted p0, p1, p2 and p3; these points may lie in R2 or R3. Suppose two such curves are joined so that the ending point of the first curve, say x(t), is the starting point p2 of the second curve y(t). At p2 the combined curve has G0 (geometric) continuity, since the two curves simply join at that point; but if the tangent line to curve 1 at p2 has a different direction from the tangent line to the second curve, an abrupt change of direction, or corner, is seen. To avoid a sharp bend or corner, the curves are adjusted to have G1 geometric continuity, in which both tangent vectors at p2 point in the same direction; the derivatives x'(1) and y'(0) then point in the same direction, although their magnitudes may differ. When the tangent vectors are actually equal at p2, the tangent vector is said to be continuous at p2, and the combined curve has C1 (parametric) continuity.
The control points can be collected as the columns of a geometry matrix, G. The 4 × 4 matrix of polynomial coefficients is known as the Bezier basis matrix, MB. If u(t) is the column vector of powers of t, then the Bezier curve is defined by
x(t) = G MB u(t)
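A minimal numerical sketch of this formula (assuming numpy and the standard cubic Bernstein basis matrix for MB; the control points are made up for illustration):

import numpy as np

# Geometry matrix G: columns are the control points p0, p1, p2, p3 in R^2.
G = np.array([[0.0, 1.0, 3.0, 4.0],
              [0.0, 2.0, 2.0, 0.0]])

# Standard cubic Bezier basis matrix: row i holds the coefficients of the
# Bernstein polynomial attached to p_i, in powers 1, t, t^2, t^3.
M_B = np.array([[1, -3,  3, -1],
                [0,  3, -6,  3],
                [0,  0,  3, -3],
                [0,  0,  0,  1]], dtype=float)

def bezier(t):
    u = np.array([1.0, t, t**2, t**3])   # u(t), the column vector of powers of t
    return G @ M_B @ u

print(bezier(0.0))   # p0 = (0, 0)
print(bezier(1.0))   # p3 = (4, 0)
print(bezier(0.5))   # the midpoint of the curve, here (2.0, 1.5)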
When the entries of MB are changed appropriately, the resulting curves are known as B-splines; although they are smoother than Bezier curves, they do not pass through any of the control points. When the columns of the geometry matrix hold the starting and ending points of the curve together with the tangent vectors at those points, the resulting curve is a Hermite cubic curve, obtained by replacing the matrix MB with a Hermite basis matrix.
The parameter t can be replaced by a parameter s for convenience when dealing with surfaces; the matrix of control points in the resulting equation is known as the geometry vector.
Bezier Surfaces:
A three-dimensional surface patch can be constructed from the Bezier curve equations above. A Bezier curve is produced when any one of the geometry matrices is multiplied on the right by MB u(t), and each entry of the resulting 4 × 1 matrix is itself a Bezier curve. If t is held fixed, that column vector can in turn serve as a geometry vector for a Bezier curve in the other variable s; the result is the Bezier bicubic surface.
The midpoint of the original curve x(t) occurs at x(.5) when x(t) has the standard Bezier parametrization given above. In general, the direction of travel along x(t) is given by the derivative x'(t), since it is the tangent vector to the curve; from the equations above, the value of x'(.5) can also be computed.
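A short numerical check of these statements (assuming numpy; the derivative formula below is the standard cubic Bezier derivative, and the control points are the same made-up ones as in the earlier sketch):

import numpy as np

p0, p1, p2, p3 = map(np.array, ([0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [4.0, 0.0]))

def x(t):
    # Bernstein form of the cubic Bezier curve.
    return (1-t)**3 * p0 + 3*t*(1-t)**2 * p1 + 3*t**2*(1-t) * p2 + t**3 * p3

def dx(t):
    # Standard derivative of the cubic Bezier curve.
    return 3*((1-t)**2 * (p1-p0) + 2*t*(1-t) * (p2-p1) + t**2 * (p3-p2))

print(x(0.5))    # the midpoint of the curve, (2.0, 1.5)
print(dx(0.5))   # the tangent direction there, (4.5, 0.0)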
1. Consider the hyperplane 5x1 − 2x2 + x3 − 3x4 = 5 of R4. Clearly explain why 5x1
− 2x2 + x3 − 3x4 = 5 is not a subspace of R4.
Solution:
Every subspace of R4 must contain the origin. The origin of R4 is (0, 0, 0, 0).
Since 5(0) − 2(0) + (0) − 3(0) = 0 ≠ 5, this hyperplane does not contain the origin.
Hence, the equation does not define a subspace of R4.
Solution:
To find a basis, we need to convert this equation into a vector equation. Solve
the equation for x1. (In fact, you can solve for any variable.)
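Since the equation for this question is not shown, here is a hedged sketch of the technique on a hypothetical equation, the homogeneous counterpart of Question 1, namely 5x1 − 2x2 + x3 − 3x4 = 0 (numpy assumed):

import numpy as np

# Solving for x1 gives x1 = (2*x2 - x3 + 3*x4) / 5, with x2, x3, x4 free.
# Setting each free variable to 1 in turn (the others to 0) yields a basis.
basis = np.array([
    [ 2/5, 1, 0, 0],   # x2 = 1
    [-1/5, 0, 1, 0],   # x3 = 1
    [ 3/5, 0, 0, 1],   # x4 = 1
])

a = np.array([5.0, -2.0, 1.0, -3.0])
print(basis @ a)       # all (approximately) zero: each basis vector satisfies the equation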
Solution:
TRUE. A linear system is consistent if and only if the row rank of the
coefficient matrix equals the row rank of the augmented matrix.
Solution:
FALSE. There is no guarantee that the three vectors are independent. Three
vectors in a four-dimensional space can span a line, a plane, or a
three-dimensional subspace.
5. True or false, if a subspace W has two different bases B1 and B2, then B1 and
B2 can have a different number of vectors in them.
Solution:
FALSE. One of the properties of bases is that any two bases for a vector space
must have exactly the same number of vectors, and this number of vectors
determines the dimension of the vector space.
Questions 6-10 require you to find a basis for the following vector spaces.
Solution:
Such vectors are of the form (x, x, x). They form a one dimensional subspace of
R3. A basis is given by (1, 1, 1). (Any nonzero vector (a, a, a) will give a basis.)
7. All vectors in R4 whose components add to zero and whose first two
components add up to twice the fourth component.
Solution:
The subspace consists of the vectors (x1, x2, x3, x4) satisfying x1 + x2 + x3 + x4 = 0 and
x1 + x2 − 2x4 = 0. It is the nullspace of the matrix
1 1 1  1
1 1 0 −2,
a two-dimensional subspace of R4, so any two independent vectors in it give a basis.
For example, we can take as a basis v1 = (1, 1, −3, 1), v2 = (1, −1, 0, 0).
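This answer can be double-checked numerically (illustration only, assuming scipy): the nullspace of the matrix above is two-dimensional, and both proposed basis vectors lie in it.

import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0, -2.0]])

print(null_space(A).shape)            # (4, 2): a two-dimensional nullspace

v1 = np.array([1.0, 1.0, -3.0, 1.0])
v2 = np.array([1.0, -1.0, 0.0, 0.0])
print(A @ v1, A @ v2)                 # both are the zero vector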
Solution:
Any anti-symmetric 3 × 3 matrix is of the form
 0   a   b
−a   0   c
−b  −c   0,
so this space is three-dimensional; a basis is given by the three matrices obtained by
setting one of a, b, c equal to 1 and the other two equal to 0.
Solution:
This subspace is just the nullspace of the 1 × 4 matrix (1, 0, 1, 0), a three-dimensional
hyperplane (through the origin) in R4. A basis can be read off from the matrix (it is
already in reduced row echelon form):
v1 = (0, 1, 0, 0), v2 = (−1, 0, 1, 0), v3 = (0, 0, 0, 1).
10. All polynomials p(x) whose degree is no more than 3 and which satisfy p(0) = 0.
Solution:
We can write the polynomial p(x) as p(x) = ax3 + bx2 + cx + d. The condition p(0)
= 0 implies d = 0. Thus the space is {ax3 + bx2 + cx | a, b, c ∈ R}. It is a three
dimensional subspace in the vector space of polynomials, and a basis is given by
p1(x) = x3, p2(x) = x2, p3(x) = x.