Inner product of two vectors x, y ∈ R^n: x^T y = ∑_{i=1}^n x_i y_i
A norm ||·|| on R^n is a mapping that assigns a scalar ||x||
to every x ∈ R^n and that has the following properties:
(a) ||x|| ≥ 0 ∀ x ∈ R^n
(b) ||cx|| = |c| ||x|| for every c ∈ R and every x ∈ R^n
(c) ||x|| = 0 if and only if x = 0
(d) ||x+y|| ≤ ||x|| + ||y|| ∀ x, y ∈ R^n (triangle inequality)
Euclidean norm of x: ||x|| = (x^T x)^{1/2} = ( ∑_{i=1}^n x_i^2 )^{1/2}
Schwarz inequality (applies to the Euclidean norm):
|x^T y| ≤ ||x|| ||y||,
with equality holding if and only if x = αy for some scalar α.
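These vector definitions are easy to check numerically. A minimal Python sketch (the vector values and the helper names `inner`/`norm` are my own choices for illustration):

```python
import math

def inner(x, y):
    # x^T y = sum_i x_i * y_i
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # Euclidean norm: ||x|| = (x^T x)^(1/2)
    return math.sqrt(inner(x, x))

x = [1.0, 2.0, 2.0]
y = [3.0, 0.0, 4.0]

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
s = [xi + yi for xi, yi in zip(x, y)]
assert norm(s) <= norm(x) + norm(y)

# Schwarz inequality: |x^T y| <= ||x|| ||y||
assert abs(inner(x, y)) <= norm(x) * norm(y)

# Equality holds when x = a*y for a scalar a
a = -2.5
xa = [a * yi for yi in y]
assert math.isclose(abs(inner(xa, y)), norm(xa) * norm(y))
```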
3) Matrices
m x n matrix =>

A = [a_ij] =
  [ a_11  a_12  ...  a_1n ]
  [ a_21  a_22  ...  a_2n ]
  [  ...   ...  ...   ... ]
  [ a_m1  a_m2  ...  a_mn ]
Product of a matrix A and a scalar α: αA or Aα = [αa_ij]
Product of two matrices A and B:
Let A, B, C be m x n, n x p, and m x p matrices,
respectively.
AB = C = [c_ij], where c_ij = ∑_{k=1}^n a_ik b_kj
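The index formula c_ij = ∑_k a_ik b_kj translates directly into code. A sketch using plain lists of lists (the helper name `matmul` is hypothetical):

```python
def matmul(A, B):
    # A: m x n, B: n x p  ->  C: m x p with c_ij = sum_k a_ik * b_kj
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A)   # inner dimensions must match
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]          # 2 x 2
I = [[1, 0], [0, 1]]          # 2 x 2 identity
assert matmul(A, I) == A      # AI = A
```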
Transpose of an m x n matrix A is the n x m matrix A^T = [a'_ij], where a'_ij = a_ji.
A square matrix A is symmetric if A^T = A.
Non-singular matrix
A square matrix (n x n) A is called non-singular or
invertible if there is an n x n matrix, called the inverse of A
(denoted by A^{-1}), such that A^{-1}A = I = AA^{-1}, where I is
the n x n identity matrix.
Positive Definite and Semidefinite Matrices
A square n x n matrix A is said to be positive semidefinite
if x^T A x ≥ 0 ∀ x ∈ R^n.
A square n x n matrix A is said to be positive definite if
x^T A x > 0 ∀ x ∈ R^n, x ≠ 0.
The matrix A is said to be negative semidefinite
(definite) if −A is positive semidefinite (definite).
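One way to build intuition for these definitions is to evaluate the quadratic form x^T A x at sample vectors. A sketch with a 2 x 2 example matrix of my own choosing (note that sampling can only suggest definiteness, not prove it):

```python
def quad_form(A, x):
    # x^T A x = sum_i sum_j x_i * a_ij * x_j
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

A = [[2, -1], [-1, 2]]   # symmetric; positive definite (eigenvalues 1 and 3)
samples = [[1, 0], [0, 1], [1, 1], [1, -1], [-3, 2]]
assert all(quad_form(A, x) > 0 for x in samples)

# -A is then negative definite, so the form is negative on the same samples
negA = [[-a for a in row] for row in A]
assert all(quad_form(negA, x) < 0 for x in samples)
```

A definitive check would use, e.g., the eigenvalues or the leading principal minors of A rather than sample vectors.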
4) Derivatives
Let f : R^n → R be some function.
Partial Derivative:
For a fixed x ∈ R^n, the first partial derivative of f at the
point x in the i-th coordinate is defined by

∂f(x)/∂x_i = lim_{α→0} [f(x + α e_i) − f(x)] / α

where e_i is the i-th unit vector.
Gradient:
If the partial derivatives with respect to all coordinates
exist, f is called differentiable at x and its gradient at x is
defined as:

∇f(x) = [ ∂f(x)/∂x_1, ..., ∂f(x)/∂x_n ]^T

Note: ∇f(x) is orthogonal to the corresponding level
surface f(x_1, ..., x_n) = C
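The limit definition of the partial derivative suggests a finite-difference approximation of the gradient. A sketch (the helper name `grad_fd`, the step size h, and the test function are illustrative choices):

```python
def grad_fd(f, x, h=1e-6):
    # i-th component approximated by (f(x + h*e_i) - f(x)) / h
    fx = f(x)
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        g.append((f(xp) - fx) / h)
    return g

# f(x) = x1^2 + 3*x1*x2 has gradient (2*x1 + 3*x2, 3*x1)
f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]
g = grad_fd(f, [1.0, 2.0])
assert abs(g[0] - 8.0) < 1e-3 and abs(g[1] - 3.0) < 1e-3
```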
Various Definitions
The function f is called differentiable if it is
differentiable at every x ∈ R^n.
The function f is said to be continuously differentiable if
∇f(x) exists for every x and is a continuous function of x.
The function f is said to be twice differentiable if the
gradient ∇f(x) is itself a differentiable function.
Hessian Matrix
For this case, we can define the Hessian matrix of f at x:

∇²f(x) = [ ∂²f(x) / ∂x_i ∂x_j ]

i.e. the elements are the second partial derivatives of f at x.
The function f is twice continuously differentiable if
∇²f(x) exists and is continuous.
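The same differencing idea extends to the Hessian: each entry ∂²f(x)/∂x_i∂x_j can be approximated with central differences. A sketch (the step size and the test function are my own choices):

```python
def hessian_fd(f, x, h=1e-4):
    # (i, j) entry approximates d^2 f / dx_i dx_j via central differences
    n = len(x)
    def at(dx):
        return f([xi + di for xi, di in zip(x, dx)])
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = [0.0] * n; d[i] += h; d[j] += h; fpp = at(d)
            d = [0.0] * n; d[i] += h; d[j] -= h; fpm = at(d)
            d = [0.0] * n; d[i] -= h; d[j] += h; fmp = at(d)
            d = [0.0] * n; d[i] -= h; d[j] -= h; fmm = at(d)
            H[i][j] = (fpp - fpm - fmp + fmm) / (4 * h * h)
    return H

f = lambda x: x[0] ** 2 * x[1] + x[1] ** 3
H = hessian_fd(f, [1.0, 2.0])
# analytic Hessian at (1, 2): [[2*x2, 2*x1], [2*x1, 6*x2]] = [[4, 2], [2, 12]]
expected = [[4, 2], [2, 12]]
assert all(abs(H[i][j] - expected[i][j]) < 1e-2
           for i in range(2) for j in range(2))
```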
Vector-valued function
A vector-valued function f : R^n → R^m is differentiable
(continuously differentiable) if each component f_i of f is
differentiable (continuously differentiable).
Gradient matrix of the vector-valued function
The gradient matrix of f, ∇f(x), is the n x m matrix whose
i-th column is ∇f_i(x):

∇f(x) = [ ∇f_1(x) ... ∇f_m(x) ]
Jacobian of the vector-valued function
Jacobian of f = (∇f(x))^T
Chain rule
Let f : R^k → R^m and g : R^m → R^n be continuously
differentiable functions, and let h(x) = g(f(x)).
The chain rule for differentiation states that

∇h(x) = ∇f(x) ∇g(f(x)), ∀ x ∈ R^k
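The chain rule can be spot-checked on a small composite. A sketch with k = 1, m = 2, n = 1 and functions chosen purely for illustration:

```python
def h_prime_chain(x):
    # f: R -> R^2 with f(x) = (x^2, x^3); g: R^2 -> R with g(y) = y1 * y2
    # gradient matrix of f at x: [2x, 3x^2]; gradient of g at y: (y2, y1)
    y1, y2 = x ** 2, x ** 3
    gradient_f = [2 * x, 3 * x ** 2]
    gradient_g = [y2, y1]
    # chain rule: grad h(x) = grad f(x) * grad g(f(x))
    return sum(a * b for a, b in zip(gradient_f, gradient_g))

# h(x) = g(f(x)) = x^5, so h'(x) = 5 x^4
x = 1.3
assert abs(h_prime_chain(x) - 5 * x ** 4) < 1e-9
```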
Special examples
If f : R^n → R^m is of the form f(x) = Ax, where A is an
m x n matrix,
∇f(x) = A^T
If f : R^n → R is of the form f(x) = x^T A x / 2, where A is a
symmetric n x n matrix,
∇f(x) = Ax
∇²f(x) = A
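Both identities can be spot-checked with forward differences. A self-contained sketch (the matrix A and the point x are illustrative choices):

```python
def grad_fd(f, x, h=1e-6):
    # forward-difference gradient: (f(x + h*e_i) - f(x)) / h
    fx = f(x)
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        g.append((f(xp) - fx) / h)
    return g

A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric 2 x 2

def f(x):
    # f(x) = x^T A x / 2
    return 0.5 * sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

x = [1.0, -1.0]
Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]  # expected grad
g = grad_fd(f, x)
assert all(abs(gi - ei) < 1e-4 for gi, ei in zip(g, Ax))
```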
5) Unconstrained Optimization Techniques
Local and global minima
x*: local min if ∃ ε > 0 s.t. f(x*) ≤ f(x) ∀ x with ||x − x*|| ≤ ε
x*: strict local min if ∃ ε > 0 s.t. f(x*) < f(x) ∀ x ≠ x* with ||x − x*|| ≤ ε
x*: global min if f(x*) ≤ f(x) ∀ x ∈ R^n
[Figure: plot of f(x) versus x, marking a strict local minimum, local minima, and the strict global minimum.]