Bai Huang
I.
Matrix Algebra
1. Vector Spaces
1.1 Real Vectors
R : the set of (finite) real numbers
R^m : the m-dimensional Euclidean space (the Cartesian product of m copies of R: R × R × ... × R)
A Cartesian product A_1 × A_2 × ... × A_m : all possible ordered m-tuples whose i-th element is from A_i, i = 1, ..., m.
Addition:
x + y := (x_1 + y_1, x_2 + y_2, ..., x_m + y_m)
a. x and y must be of the same order.
b. [commutativity] x + y = y + x
c. [associativity] (x + y) + z = x + (y + z)
Scalar multiplication: cx : = (cx1 , cx2 , , cxm ) , where c is a constant
Two vectors x and y are collinear if either x = 0 or y = 0 or y = cx.
Inner product:
<x, y> := Σ_{i=1}^m x_i y_i = x'y = y'x
a. Properties:
i) <x, y> = <y, x>
ii) <x, y + z> = <x, y> + <x, z>
iii) <cx, y> = c<x, y>
iv) <x, x> ≥ 0, with <x, x> = 0 iff x = 0
b. Norm: ||x|| := <x, x>^{1/2}
It is the geometric idea of length of the vector x.
c. A vector x is said to be normalized if ||x|| = 1.
Any nonzero vector x can be normalized to x̃ = x / ||x||, since ||x̃|| = ||x|| / ||x|| = 1.
d. Two vectors x and y are said to be orthogonal if <x, y> = 0.
e. If, in addition, ||x|| = ||y|| = 1, the two vectors are said to be orthonormal.
Example: In R^m, the unit vectors (or elementary vectors)
e_1 = (1, 0, 0, ..., 0)', e_2 = (0, 1, 0, ..., 0)', ..., e_m = (0, 0, ..., 0, 1)'
are orthonormal.
f. Cauchy-Schwarz inequality:
<x, y>² ≤ ||x||² ||y||², with equality iff x and y are collinear.
Exercise: Prove C-S inequality using properties of inner product.
g. Triangle inequality:
||x + y|| ≤ ||x|| + ||y||, with equality iff x and y are collinear with y = cx for some c ≥ 0 (or x = 0).
Exercise: Prove triangle inequality. [Hint: using C-S inequality]
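Both inequalities are easy to verify numerically. A minimal sketch, assuming NumPy is available (the vectors are arbitrary illustrative draws):

```python
import numpy as np

# Spot-check the Cauchy-Schwarz and triangle inequalities on random vectors in R^5.
rng = np.random.default_rng(0)
x = rng.normal(size=5)
y = rng.normal(size=5)

cs_lhs = np.dot(x, y) ** 2                      # <x, y>^2
cs_rhs = np.dot(x, x) * np.dot(y, y)            # ||x||^2 ||y||^2
tri_lhs = np.linalg.norm(x + y)                 # ||x + y||
tri_rhs = np.linalg.norm(x) + np.linalg.norm(y)

# Equality in Cauchy-Schwarz when the vectors are collinear:
z = 2.5 * x
cs_eq = np.isclose(np.dot(x, z) ** 2, np.dot(x, x) * np.dot(z, z))
```

A numerical check is of course no substitute for the proofs asked for in the exercises.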
[Figure: a triangle with sides x, y, and x − y, with angle θ between x and y.]
By the cosine rule, ||x − y||² = ||x||² + ||y||² − 2||x|| ||y|| cos θ.
After simplification, this becomes <x, y> = ||x|| ||y|| cos θ; thus the angle between x and y is determined by
cos θ = <x, y> / (||x|| ||y||), 0 ≤ θ ≤ π.
For complex scalars, division uses the conjugate: u/v = uv*/(vv*) = uv*/|v|².
For complex vectors, the inner product is <u, v> = Σ_{i=1}^m u_i v_i*, with norm ||u|| = <u, u>^{1/2}.
A Hilbert space is an inner product space that is, in addition, complete (every Cauchy sequence in the space converges to a limit in the space).
A nonempty subset S of a vector space V is called a subspace of V if, for all x, y ∈ S, we have x + y ∈ S and αx ∈ S for any scalar α.
The intersection of two subspaces S and T in a vector space V, denoted by S ∩ T, consists of all vectors that belong to both S and T.
The union of two subspaces S and T in a vector space V, denoted by S ∪ T, consists of all vectors that belong to at least one of S and T.
The sum of two subspaces S and T, denoted by S + T, consists of all vectors x + y with x ∈ S and y ∈ T.
The concept of an inner product not only induces the idea of length (the norm) ||x|| := <x, x>^{1/2}, but also of the distance between two vectors x and y, d(x, y) := ||x − y||, which satisfies
i) d(x, x) = 0
ii) d(x, y) > 0 if x ≠ y
iii) d(x, y) = d(y, x)
iv) d(x, y) ≤ d(x, z) + d(z, y)
In any inner product space, the Cauchy-Schwarz inequality and the triangle
inequality hold. Besides, an identity, called the parallelogram law, is stated as
||x + y||² + ||x − y||² = 2||x||² + 2||y||²
Example: Prove the parallelogram law both algebraically and geometrically.
For two vectors x and y in an inner product space, we say that x and y are
orthogonal if <x, y> = 0; we write x ⊥ y.
[Pythagorean Theorem]
If xy in an inner product space, ||x + y||2 = ||x||2 + ||y||2.
Exercise: Does the converse hold?
A is called an orthogonal set if each pair of vectors in A is orthogonal. If, in
addition, each vector in A has unit length, then A is called an orthonormal set.
Example: Prove that any orthonormal set is linearly independent.
Two subspaces S and T of an inner product space V are said to be orthogonal if every vector in S is orthogonal to every vector in T.
If S is a subspace of an inner product space V, then the space of all vectors orthogonal to S is called the orthogonal complement of S, denoted by S⊥.
Example: Prove that S⊥ is a subspace of V.
A sequence {x_n}_{n≥1} converges to x if ||x_n − x|| → 0 as n → ∞.
[Continuity of inner product]
If {x_n}, {y_n} are two sequences in an inner product space such that ||x_n − x|| → 0 and ||y_n − y|| → 0, then
i) ||x_n|| → ||x||, ||y_n|| → ||y||
ii) <x_n, y_n> → <x, y>
A sequence {x_n}_{n≥1} is called a Cauchy sequence if ||x_n − x_m|| → 0 as n, m → ∞ (i.e., ∀ε > 0, ∃N(ε) > 0 s.t. ||x_n − x_m|| < ε whenever n, m > N(ε)).
2. Matrices
2.1 Matrix Terminology
A matrix is a rectangular array of numbers, denoted
A = [a_ij] =
[ a_11  a_12  ...  a_1n
  a_21  a_22  ...  a_2n
  ...
  a_m1  a_m2  ...  a_mn ]
The transpose A' (or A^T) is the n × m matrix whose (i, j) element is a_ji.
a. A square matrix A_{n×n} is invertible if there exists B s.t. A_{n×n} B_{n×n} = B_{n×n} A_{n×n} = I_n; such a B is called the inverse of A, denoted A^{-1}.
b. A is invertible ⟺ the column vectors of A are linearly independent.
c. Example: i) A zero matrix is non-invertible.
ii) The inverse of an identity matrix is itself.
iii) The inverse of a diagonal matrix:
[ a_1 ... 0; ...; 0 ... a_n ]^{-1} = [ a_1^{-1} ... 0; ...; 0 ... a_n^{-1} ], if all a_i ≠ 0.
iv) [ a b; c d ]^{-1} = 1/(ad − bc) · [ d −b; −c a ], if ad − bc ≠ 0.
d. Properties:
i) (A^{-1})^{-1} = A
ii) (A^{-1})' = (A')^{-1}
iii) (AB)^{-1} = B^{-1} A^{-1}
iv) (A − BD^{-1}C)^{-1} = A^{-1} + A^{-1} B (D − C A^{-1} B)^{-1} C A^{-1}
Exercise: Try to figure out the special cases for iv) when
(1) B = C' ?
(2) D = I, B = b, C = c' ?
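Property iv) (a Woodbury-type identity) can be spot-checked numerically; a sketch assuming NumPy, with made-up well-conditioned matrices:

```python
import numpy as np

# Check (A - B D^{-1} C)^{-1} = A^{-1} + A^{-1} B (D - C A^{-1} B)^{-1} C A^{-1}
# on random matrices; the identity shifts keep A and D safely invertible.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 5 * np.eye(4)
D = rng.normal(size=(2, 2)) + 5 * np.eye(2)
B = rng.normal(size=(4, 2))
C = rng.normal(size=(2, 4))

inv = np.linalg.inv
lhs = inv(A - B @ inv(D) @ C)
rhs = inv(A) + inv(A) @ B @ inv(D - C @ inv(A) @ B) @ C @ inv(A)
max_err = np.max(np.abs(lhs - rhs))
```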
a. [Cofactor expansion] |A| = Σ_{j=1}^n (−1)^{i+j} a_ij |A_ij| for any i = 1, ..., n, where A_ij is the matrix obtained from A by deleting row i and column j. The scalar (−1)^{i+j} |A_ij| is called the cofactor of a_ij.
Obviously, it is easy to choose the row or column that has the most zeros to make the expansion. It is unlikely, though, that you will ever calculate any determinant over 3 × 3 without a computer.
b. The determinant provides important information when the matrix is that
of the coefficients of a system of linear equations. The system has a
unique solution if and only if the determinant is nonzero.
c. When the determinant corresponds to a linear transformation of a vector
space, the transformation has an inverse operation if and only if the
determinant is nonzero.
d. A is invertible ⟺ |A| ≠ 0.
e. Explicit formulas:
i) | a b; c d | = ad − bc
ii) | a b c; d e f; g h i | = a · | e f; h i | − b · | d f; g i | + c · | d e; g h |
f. Properties:
i) Switching two rows or columns changes the determinant sign.
ii) Any determinant with two identical rows or columns has value 0.
iii) A determinant with a row or column of zeros has value 0.
iv) Adding a scalar multiple of one row (or column) to another does not
change the determinant.
v) |cA_{n×n}| = c^n |A|
vi) |A'| = |A|
vii) |A^{-1}| = |A|^{-1}
viii) If A_1, A_2, ..., A_K are square matrices of the same order, then |A_1 A_2 ⋯ A_K| = |A_1| |A_2| ⋯ |A_K|
ix) For a triangular matrix, the determinant is the product of the diagonal entries:
| a_11 a_12 ... a_1n; 0 a_22 ... a_2n; ...; 0 0 ... a_nn | = Π_{i=1}^n a_ii
x) For a block-triangular matrix,
| A_11 A_12; O A_22 | = | A_11 O; A_21 A_22 | = |A_11| |A_22|
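The determinant properties above lend themselves to quick numerical spot-checks; a sketch assuming NumPy:

```python
import numpy as np

# Illustrate properties v)-viii) on random 4x4 matrices.
rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))
c = 2.5
det = np.linalg.det

ok_v = np.isclose(det(c * A), c**n * det(A))            # |cA| = c^n |A|
ok_vi = np.isclose(det(A.T), det(A))                    # |A'| = |A|
ok_vii = np.isclose(det(np.linalg.inv(A)), 1 / det(A))  # |A^{-1}| = |A|^{-1}
ok_viii = np.isclose(det(A @ B), det(A) * det(B))       # |AB| = |A||B|
```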
Exercise: Using the concepts of inverse and determinant, prove the following
properties for orthogonal matrix P :
i) P' = P^{-1}
ii) |P| = ±1
The trace of a square matrix is the sum of the elements on its main diagonal: tr(A_{n×n}) := Σ_{i=1}^n a_ii.
Properties:
i) The trace is invariant under cyclic permutations, for example
tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)
ii) tr(A') = tr(A)
iii) For two matrices of the same dimensions,
tr(A'B) = tr(AB') = tr(BA') = tr(B'A) = Σ_{i,j} a_ij b_ij
iv) tr(A + B) = tr(A) + tr(B)
v) tr(cA) = c · tr(A)
vi)* E(tr(X)) = tr(E(X))
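The cyclic invariance in i) deserves a quick check, since arbitrary reorderings such as tr(ACBD) do not work; a sketch assuming NumPy:

```python
import numpy as np

# Cyclic permutations leave the trace unchanged even for non-square factors,
# as long as the product is conformable and square.
rng = np.random.default_rng(3)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 5))
D = rng.normal(size=(5, 2))

t1 = np.trace(A @ B @ C @ D)
t2 = np.trace(B @ C @ D @ A)
t3 = np.trace(C @ D @ A @ B)
```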
An m × n matrix can be viewed as a set of n column vectors in R^m, or as a set of m rows in R^n. Thus, associated with a matrix A are two vector spaces: the column space and the row space.
The column space of A, denoted by col A, consists of all linear combinations of the columns of A:
col A = {x ∈ R^m : x = Ay for some y ∈ R^n}.
The row space of A, denoted by col A', consists of all linear combinations of the rows of A, or the columns of A':
col A' = {y ∈ R^n : y = A'x for some x ∈ R^m}.
The column rank of A is the maximum number of linearly independent columns it contains, namely, the dimension of the vector space spanned by its columns, dim(col A).
The row rank of A is the maximum number of linearly independent rows it contains, namely, the dimension of the vector space spanned by its rows, dim(col A').
[The Rank Theorem] The nontrivial fact that dim(col A) = dim(col A') implies that the column rank of A is equal to its row rank. It follows that rk(A) = dim(col A).
A square n × n matrix A is said to be nonsingular if rk(A) = n; otherwise, the matrix is singular. In fact,
A is invertible ⟺ A is nonsingular ⟺ rk(A) = n ⟺ |A| ≠ 0
For a non-square A_{m×n}, rk(A) ≤ min(m, n).
Let B_{m×m} and C_{n×n} be nonsingular. Then rk(BA) = rk(A) and rk(AC) = rk(A): the rank is unchanged by multiplication by a nonsingular matrix.
The transpose of A is denoted A' (or A^T); it satisfies (A')' = A.
A is a symmetric matrix if A' = A.
A is normal if AA' = A'A.
Equality: A = B ⟺ a_ij = b_ij, ∀ i, j.
Transpose: B = A' ⟺ b_ij = a_ji, ∀ i, j.
Addition: A + B := [a_ij + b_ij]; A and B must be of the same order.
a. A + 0 = A
b. A + B = B + A
c. (A + B) + C = A + (B + C)
d. (A + B)' = A' + B'
Scalar multiplication: cA := [c · a_ij]
Matrix product:
For an m × r matrix A = [a_ik] and an r × n matrix B = [b_kj], the product matrix C = AB = [c_ij] is the m × n matrix with c_ij = Σ_{k=1}^r a_ik b_kj, i.e., c_ij is the inner product of the i-th row of A and the j-th column of B.
In general, AB ≠ BA. A and B commute if AB = BA.
Properties:
i) I_m A = A I_n = A
ii) (AB)C = A(BC)
iii) (A + B)C = AC + BC
iv) (AB)' = B'A'
Examples: For x, y ∈ R^n and ι = (1, 1, ..., 1)',
Σ_i x_i = ι'x = (1, 1, ..., 1)(x_1, x_2, ..., x_n)'
Σ_i x_i² = x'x
Σ_i x_i y_i = x'y = y'x
For two real matrices A and B of the same dimension we define the inner
product as <A, B> := Σ_{i,j} a_ij b_ij = tr(A'B), which induces the norm
||A|| := <A, A>^{1/2} = (Σ_{i,j} a_ij²)^{1/2} = (tr(AA'))^{1/2}
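The equivalence of the element-wise sum and the trace form can be checked directly; a sketch assuming NumPy:

```python
import numpy as np

# <A, B> = sum_ij a_ij b_ij = tr(A'B); the induced norm is the Frobenius norm.
rng = np.random.default_rng(4)
A = rng.normal(size=(3, 5))
B = rng.normal(size=(3, 5))

inner_elem = np.sum(A * B)        # element-wise sum form
inner_trace = np.trace(A.T @ B)   # trace form
norm_trace = np.sqrt(np.trace(A @ A.T))
norm_builtin = np.linalg.norm(A)  # Frobenius norm by default for matrices
```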
A calculation that may help to condense the notation or simplify the coding is the Kronecker product, denoted by ⊗. For general matrices A of dimension m × n and B of dimension p × q,
A ⊗ B :=
[ a_11 B  a_12 B  ...  a_1n B
  a_21 B  a_22 B  ...  a_2n B
  ...
  a_m1 B  a_m2 B  ...  a_mn B ]
which is of dimension mp × nq.
Example: (A ⊗ B)' = A' ⊗ B', and (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1} when A and B are invertible.
a. For A_{m×m} and B_{n×n}, |A ⊗ B| = |A|^n |B|^m.
b. tr(A ⊗ B) = tr(A) tr(B).
c. (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).
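NumPy implements the Kronecker product as np.kron, so the properties above can be spot-checked; a sketch with arbitrary small matrices:

```python
import numpy as np

# Check |A (x) B| = |A|^n |B|^m, tr(A (x) B) = tr(A) tr(B),
# and the mixed-product rule (A (x) B)(C (x) D) = (AC) (x) (BD).
rng = np.random.default_rng(5)
A = rng.normal(size=(3, 3))   # m = 3
B = rng.normal(size=(2, 2))   # n = 2
C = rng.normal(size=(3, 3))
D = rng.normal(size=(2, 2))

det_ok = np.isclose(np.linalg.det(np.kron(A, B)),
                    np.linalg.det(A) ** 2 * np.linalg.det(B) ** 3)
trace_ok = np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
mixed_ok = np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
```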
A linear system Ax = b can be solved by Gauss-Jordan elimination: reduce the augmented matrix [A : b] to [I : x], so that the last column contains the solution.
Example: Solve a 3 × 3 system Ax = b.
SOLN: Row reduction of [A : b] yields
[ 1 0 0 : 3
  0 1 0 : 2
  0 0 1 : 1 ]
i.e., (x_1, x_2, x_3) = (3, 2, 1).
[Cramer's Rule]:
Cramer's Rule is an explicit formula for the solution of a system of linear
equations, with each variable given by a quotient of two determinants.
Example: Solve the 3 × 3 system Ax = b by Cramer's rule: x_i = |A_i| / |A|, where A_i denotes A with its i-th column replaced by b. Here |A| = 7, and
x_1 = 21/7 = 3, x_2 = 14/7 = 2, x_3 = 7/7 = 1.
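Cramer's rule is straightforward to implement; a sketch assuming NumPy, using a hypothetical system (not the one from the notes) and checking against np.linalg.solve:

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b by Cramer's rule: x_i = |A_i| / |A|, where A_i is A
    with its i-th column replaced by b. Requires |A| != 0."""
    A = np.asarray(A, dtype=float)
    d = np.linalg.det(A)
    x = np.empty(A.shape[0])
    for i in range(A.shape[0]):
        Ai = A.copy()
        Ai[:, i] = b          # replace column i by the right-hand side
        x[i] = np.linalg.det(Ai) / d
    return x

# Illustrative system (made up for this sketch):
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])
x = cramer_solve(A, b)
```

For anything beyond small systems, elimination (np.linalg.solve) is far cheaper than computing n + 1 determinants.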
A partitioned (block) matrix takes the form
Z = [ A B; C D ]
None of the matrices needs to be square, but A and B must have the same number of rows, A and C must have the same number of columns, and so on.
We will mainly focus on partitioned matrices with two row blocks and two column blocks. This can be extended to the case of m row blocks and n column blocks, such as
Z =
[ Z_11  Z_12  ...  Z_1n
  Z_21  Z_22  ...  Z_2n
  ...
  Z_m1  Z_m2  ...  Z_mn ]
As a special case, we say that a square matrix is block-diagonal if it takes the form
Z = diag(Z_11, Z_22, ..., Z_rr)
where all diagonal blocks are square, not necessarily of the same order.
A General Principle
The main tool in obtaining the inverse, determinant, and rank of a partitioned matrix
is to write the matrix as a product of simpler matrices, that is, matrices of which one
(or two) of the four blocks is the null matrix.
Some Basic Results
Partitioned sum: Let
Z_1 = [ A_1 B_1; C_1 D_1 ] and Z_2 = [ A_2 B_2; C_2 D_2 ]
with conformable blocks. Then
Z_1 + Z_2 = [ A_1+A_2  B_1+B_2; C_1+C_2  D_1+D_2 ]
Partitioned product:
Z_1 Z_2 = [ A_1A_2 + B_1C_2   A_1B_2 + B_1D_2; C_1A_2 + D_1C_2   C_1B_2 + D_1D_2 ]
Partitioned transpose:
[ A B; C D ]' = [ A' C'; B' D' ]
Partitioned trace (for square diagonal blocks A and D):
tr [ A B; C D ] = tr(A) + tr(D)
Elementary operations on a partitioned matrix (E, F nonsingular of suitable order):
Row operations:
i) [ O I; I O ][ A B; C D ] = [ C D; A B ]
ii) [ E O; O I_n ][ A B; C D ] = [ EA EB; C D ]
iii) [ I_m E; O I_n ][ A B; C D ] = [ A+EC  B+ED; C D ]
Column operations:
i) [ A B; C D ][ O I; I O ] = [ B A; D C ]
ii) [ A B; C D ][ F O; O I_q ] = [ AF B; CF D ]
iii) [ A B; C D ][ I_p O; F I_q ] = [ A+BF  B; C+DF  D ]
If A is nonsingular, we have the factorization
[ A B; C D ] = [ I_m O; CA^{-1} I_n ][ A O; O E ][ I_m A^{-1}B; O I_n ], where E = D − CA^{-1}B.
Hence, if A and E = D − CA^{-1}B are nonsingular,
[ A B; C D ]^{-1} = [ A^{-1} + A^{-1}BE^{-1}CA^{-1}   −A^{-1}BE^{-1}; −E^{-1}CA^{-1}   E^{-1} ]
Similarly, if D and F = A − BD^{-1}C are nonsingular,
[ A B; C D ]^{-1} = [ F^{-1}   −F^{-1}BD^{-1}; −D^{-1}CF^{-1}   D^{-1} + D^{-1}CF^{-1}BD^{-1} ]
Special cases:
[ A O; O D ]^{-1} = [ A^{-1} O; O D^{-1} ],  [ O B; C O ]^{-1} = [ O C^{-1}; B^{-1} O ]
Example: When can a partitioned matrix of the form [ O B; C O ] be orthogonal?
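The partitioned inverse built from the Schur complement E = D − CA^{-1}B can be verified against a direct inverse; a sketch assuming NumPy, with made-up blocks:

```python
import numpy as np

# Assemble the inverse of Z = [A B; C D] from E = D - C A^{-1} B and
# compare with a direct inversion of the full matrix.
rng = np.random.default_rng(6)
A = rng.normal(size=(3, 3)) + 5 * np.eye(3)
D = rng.normal(size=(2, 2)) + 5 * np.eye(2)
B = rng.normal(size=(3, 2))
C = rng.normal(size=(2, 3))

inv = np.linalg.inv
Ai = inv(A)
Ei = inv(D - C @ Ai @ B)                  # inverse of the Schur complement
Z = np.block([[A, B], [C, D]])
Zi = np.block([[Ai + Ai @ B @ Ei @ C @ Ai, -Ai @ B @ Ei],
               [-Ei @ C @ Ai,              Ei]])
max_err = np.max(np.abs(Zi - inv(Z)))
```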
Determinants
Let Z = [ A B; C D ], where A and D are square.
If A is nonsingular, |Z| = |A| |D − CA^{-1}B|.
If D is nonsingular, |Z| = |D| |A − BD^{-1}C|.
Let A and D be square matrices, of order m and n, respectively.
a. For any m × m matrix E,
| EA EB; C D | = |E| · | A B; C D |
b. For any n × m matrix E,
| A B; C+EA D+EB | = | A B; C D |
Example: Will | C D; A B | = | A B; C D | always hold?
Example: Consider two numerical 3 × 3 matrices, a. and b.
i) Find the eigenvalues and eigenvectors of each matrix.
ii) For the matrix in a., what are the (algebraic) multiplicity and the
geometric multiplicity of each eigenvalue? What about the matrix in b.?
iii) Are the eigenvectors for each matrix linearly independent? Are they
orthogonal?
Eigenvectors associated with distinct eigenvalues are linearly independent, but
not necessarily orthogonal.
If λ is an eigenvalue of A, there may be more than one associated eigenvector.
If x is an eigenvector of A, then tx (t ≠ 0) is also an eigenvector associated with the same eigenvalue.
Do A and A' have the same eigenvalues? Do they have the same eigenvectors?
Either provide a proof or a counterexample.
A is nonsingular if and only if all its eigenvalues are nonzero.
Example: Prove this theorem.
|A| = Π_{i=1}^n λ_i
tr(A) = Σ_{i=1}^n λ_i
A^k has eigenvalues λ_1^k, λ_2^k, ..., λ_n^k
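These three facts are easy to confirm numerically; a sketch assuming NumPy, on an arbitrary random matrix:

```python
import numpy as np

# |A| = prod(lambda_i), tr(A) = sum(lambda_i), and A^k has eigenvalues lambda_i^k
# (checked here via tr(A^3) = sum(lambda_i^3)).
rng = np.random.default_rng(7)
A = rng.normal(size=(4, 4))
lam = np.linalg.eigvals(A)   # possibly complex, in conjugate pairs

det_ok = np.isclose(np.linalg.det(A), np.prod(lam).real)
trace_ok = np.isclose(np.trace(A), np.sum(lam).real)
power_ok = np.isclose(np.trace(A @ A @ A), np.sum(lam ** 3).real)
```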
[Rank Factorization] If rk(A) = r, then A = BC with B_{m×r} and C_{r×n} both of rank r.
[Equivalence] There exist nonsingular E and F such that EAF = [ I_r O; O O ].
[QR Decomposition] A = QR (when rk(A) = n) with Q_{m×n} satisfying Q*Q = I_n and R upper triangular.
[Diagonalization] If A_{n×n} has n linearly independent eigenvectors, collected as the columns of T, then T^{-1}AT = Λ, the diagonal matrix of eigenvalues.
Example:
a. [Singular Value Decomposition]
Consider the 4 × 5 matrix
A =
[ 1 0 0 0 2
  0 0 3 0 0
  0 0 0 0 0
  0 4 0 0 0 ]
Then A = UΣV*, where U*U = I_4, VV* = I_5, and Σ is 4 × 5 with the singular values on its diagonal.
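This example can be checked directly in NumPy; the singular values of this matrix work out to 4, 3, √5, and 0:

```python
import numpy as np

# SVD of the 4x5 example matrix: A = U Sigma V*, with U'U = I_4, V V* = I_5.
A = np.array([[1.0, 0.0, 0.0, 0.0, 2.0],
              [0.0, 0.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 4.0, 0.0, 0.0, 0.0]])
U, s, Vh = np.linalg.svd(A)              # s holds the singular values
A_rebuilt = U @ np.diag(s) @ Vh[:4, :]   # Sigma is 4x5; only 4 rows of Vh enter
```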
b. [Diagonalization of a symmetric matrix]
Consider A = [ 1 2√2; 2√2 3 ]. Solving |A − λI| = (1 − λ)(3 − λ) − 8 = (λ + 1)(λ − 5) = 0 gives the eigenvalues λ = −1 and λ = 5.
For λ = −1, solve Ax = −x; for λ = 5, solve Ax = 5x. Normalizing the two eigenvectors and collecting them as the columns of an orthogonal matrix P yields
P'AP = [ −1 0; 0 5 ]
A quadratic form in x is q(x) = Σ_{i=1}^n Σ_{j=1}^n x_i x_j a_ij = x'Ax, where A is a symmetric matrix.
If x'Ax > 0 (≥ 0) for all x ≠ 0, then A is said to be positive (nonnegative) definite.
Properties:
a. The identity matrix is positive definite.
b. If A is positive definite, so is A^{-1}.
c. If A_{n×K} has full column rank K, then A'A is positive definite; otherwise it is nonnegative definite.
d. If A is positive definite and B is a nonsingular matrix, then B'AB is positive definite.
Comparison of matrices: Define d = x'Ax − x'Bx = x'(A − B)x.
a. If d > 0 for all nonzero x, then A − B is positive definite and we write A > B.
b. If A > B for positive definite A and B, then B^{-1} > A^{-1}.
Cholesky Decomposition
Let A be a Hermitian, positive-definite matrix, then A can be decomposed as
A LL* , where L is a lower triangular matrix with strictly positive diagonal
entries, and L* denotes the conjugate transpose of L .
The Cholesky decomposition is of great importance in numerical computation, especially for solving systems of linear equations.
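A typical numerical use, sketched with NumPy for a small real positive-definite matrix (the matrix and right-hand side are made up): factor A = LL', then solve Ax = b by a forward and a backward substitution.

```python
import numpy as np

# Solve Ax = b via the Cholesky factor: L y = b, then L' x = y.
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])

L = np.linalg.cholesky(A)      # lower triangular, A = L @ L.T
y = np.linalg.solve(L, b)      # forward step
x = np.linalg.solve(L.T, y)    # backward step
```

np.linalg.solve does not exploit triangularity here; a dedicated triangular solver (e.g. scipy.linalg.solve_triangular) would, but the two-step structure is the point.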
An idempotent matrix, A , is one that is equal to its square, that is, A2 A .
All of the idempotent matrices we shall encounter are symmetric, though in general an idempotent matrix need not be symmetric.
Properties: Let A be a symmetric idempotent matrix,
a. I A is also a symmetric idempotent matrix.
b. All the eigenvalues of A are 0 or 1.
c. rk ( A) tr ( A) .
d. A is nonsingular ⟺ A = I.
Example: Prove a. through d.
A useful idempotent matrix is the projection (hat) matrix H = X(X'X)^{-1}X', together with M = I_n − H, for an n × K matrix X of full column rank.
Exercise: What is tr(I_n − H)?
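A small numerical illustration, assuming NumPy (X is an arbitrary full-column-rank draw):

```python
import numpy as np

# H = X (X'X)^{-1} X' and M = I - H are symmetric idempotent;
# tr(H) = K and tr(M) = n - K for an n x K full-column-rank X.
rng = np.random.default_rng(8)
n, K = 10, 3
X = rng.normal(size=(n, K))

H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - H

idem_H = np.allclose(H @ H, H)
idem_M = np.allclose(M @ M, M)
sym_H = np.allclose(H, H.T)
tr_H = np.trace(H)   # = K
tr_M = np.trace(M)   # = n - K
```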
For a scalar function y = f(x) of a scalar argument x, we write the following derivatives: f'(x) = dy/dx, f''(x) = d²y/dx², and so on.
[Chain rule] If h(x) = (g ∘ f)(x) = g(f(x)), then h'(x) = g'(f(x)) f'(x).
Regard y = f(x) = f(x_1, x_2, ..., x_n) as a scalar-valued function of a vector x. The gradient is the column vector
∂f(x)/∂x := ( ∂f(x)/∂x_1, ∂f(x)/∂x_2, ..., ∂f(x)/∂x_n )'
Also we have the row vector
∂f(x)/∂x' := ( ∂f(x)/∂x_1, ∂f(x)/∂x_2, ..., ∂f(x)/∂x_n )
For a matrix A_{m×n}, define
∂f(A)/∂A :=
[ ∂f(A)/∂a_11  ∂f(A)/∂a_12  ...  ∂f(A)/∂a_1n
  ∂f(A)/∂a_21  ∂f(A)/∂a_22  ...  ∂f(A)/∂a_2n
  ...
  ∂f(A)/∂a_m1  ∂f(A)/∂a_m2  ...  ∂f(A)/∂a_mn ]
Examples:
i) f(x) = a'x; then ∂f(x)/∂x = a.
ii) f(B) = x'By, where B is an m × n matrix; then ∂f(B)/∂B = xy'.
iii) f(x) = x'By; then ∂f(x)/∂x = By.
iv) f(y) = x'By; then ∂f(y)/∂y = B'x.
v) f(x) = x'Ax, where A is a square matrix; then ∂f(x)/∂x = (A + A')x.
vi) f(A) = log|A|; then ∂f(A)/∂A = [ (1/|A|) ∂|A|/∂a_ij ] = (A^{-1})'.
vii) If A is symmetric, v) reduces to ∂(x'Ax)/∂x = 2Ax.
Example: For A = [ 1 3; 3 4 ], show that ∂(x'Ax)/∂x = 2Ax.
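The rule ∂(x'Ax)/∂x = 2Ax for symmetric A can be checked against central finite differences; a sketch assuming NumPy, using the matrix from the example (the evaluation point is made up):

```python
import numpy as np

# Compare the analytic gradient 2Ax with a finite-difference approximation.
A = np.array([[1.0, 3.0],
              [3.0, 4.0]])
x = np.array([0.7, -1.2])

def f(v):
    return v @ A @ v      # the quadratic form x'Ax

eps = 1e-6
grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(2)])
grad_an = 2 * A @ x
```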
Let f(x) = ( f_1(x), ..., f_m(x) )', where x is a vector of order n; then
∂f(x)/∂x' :=
[ ∂f_1(x)/∂x_1  ...  ∂f_1(x)/∂x_n
  ...
  ∂f_m(x)/∂x_1  ...  ∂f_m(x)/∂x_n ]
∂f(x)/∂x' is called the Jacobian matrix.
The Jacobian determinant is the determinant of the Jacobian matrix if m = n.
Hessian matrix
If we take the derivative of the gradient ∂f(x)/∂x, we obtain the second-derivative (Hessian) matrix
∂²f(x)/∂x∂x' :=
[ ∂²f(x)/∂x_1²      ∂²f(x)/∂x_1∂x_2   ...  ∂²f(x)/∂x_1∂x_n
  ∂²f(x)/∂x_2∂x_1   ∂²f(x)/∂x_2²      ...  ∂²f(x)/∂x_2∂x_n
  ...
  ∂²f(x)/∂x_n∂x_1   ∂²f(x)/∂x_n∂x_2   ...  ∂²f(x)/∂x_n² ]
[Chain rule] Let h(x) = g(f(x)) with f(x) = ( f_1(x), ..., f_m(x) )'; then
∂h(x)/∂x' = ∂g(f)/∂f' · ∂f(x)/∂x'
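The chain rule for Jacobians can be illustrated numerically; a sketch assuming NumPy, with two made-up maps f, g from R² to R²:

```python
import numpy as np

def f(x):
    return np.array([x[0] * x[1], x[0] + x[1] ** 2])

def g(u):
    return np.array([np.sin(u[0]), u[0] * u[1]])

def jac_fd(func, x, eps=1e-6):
    """Central-difference Jacobian: column j holds d(func)/d(x_j)."""
    return np.column_stack([(func(x + eps * e) - func(x - eps * e)) / (2 * eps)
                            for e in np.eye(len(x))])

x0 = np.array([0.5, -0.3])
J_comp = jac_fd(lambda v: g(f(v)), x0)       # Jacobian of h = g o f
J_chain = jac_fd(g, f(x0)) @ jac_fd(f, x0)   # (dg/df')(df/dx')
```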