
OPTIMISATION AND OPTIMAL CONTROL

AIM
To provide an understanding of
the principles of optimisation
techniques in the static and
dynamic contexts.
LEARNING OBJECTIVES
On completion of the module the student
should be able to demonstrate:
- an understanding of the basic
principles of optimisation and the ability to
apply them to linear and non-linear
unconstrained and constrained static
problems,
- an understanding of the fundamentals
of optimal control, system identification
and parameter estimation.
CONTENT
(20 lectures common to BEng and MSc)
Principles of optimisation, essential features and
examples of application.
(1 lecture)
Basic principles of static optimisation theory,
unconstrained and constrained optimisation, Lagrange
multipliers, necessary and sufficient conditions for
optimality, limitations of analytical methods.
(3 lectures)
Numerical solution of static optimisation problems:
unidimensional search methods; unconstrained
multivariable optimisation, direct and indirect methods,
gradient and Newton type approaches; non-linear
programming with constraints.
(4 lectures)
Introduction to genetic optimisation
algorithms.
(1 lecture)
Basic concepts of linear programming.
(1 lecture)
On-line optimisation, integrated system
optimisation and parameter estimation.
(1 lecture)

The optimal control problem for continuous
dynamic systems: calculus of variations, the
maximum principle, and two-point boundary
value problems. The linear quadratic regulator
problem and the matrix Riccati equation.
(4 lectures)
Introduction to system identification, parameter
estimation and self-adaptive control
(3 lectures)
Introduction to the Kalman filter.
(2 lectures)
(10 Lectures MSc Students only)

LABORATORY WORK
The module will be illustrated by laboratory
exercises and demonstrations on the use of
MATLAB and the associated Optimization and
Control Tool Boxes for solving unconstrained and
constrained static optimisation problems and for
solving linear quadratic regulator problems.

ASSESSMENT
Via written examination.
MSc only - also laboratory session and report
READING LIST
P.E. Gill, W. Murray and M.H. Wright: "Practical Optimization" (Academic Press, 1981)
T.F. Edgar, D.M. Himmelblau and L.S. Lasdon: "Optimization of Chemical Processes", 2nd Edition (McGraw-Hill, 2001)
M.S. Bazaraa, H.D. Sherali and C.M. Shetty: "Nonlinear Programming - Theory and Algorithms", 2nd Edition (Wiley Interscience, 1993)
J. Nocedal and S.J. Wright: "Numerical Optimization" (Springer, 1999)
J.E. Dennis, Jr. and R.B. Schnabel: "Numerical Methods for Unconstrained Optimization and Nonlinear Equations" (Prentice Hall, 1983; reprinted in SIAM Classics in Applied Mathematics, 1996)
K.J. Astrom and B. Wittenmark: "Adaptive Control", 2nd Edition (Prentice Hall, 1993)
F.L. Lewis and V.S. Syrmos: "Optimal Control", 2nd Edition (Wiley, 1995)
PRINCIPLES OF OPTIMISATION
Typical engineering problem: You have a
process that can be represented by a
mathematical model. You also have a
performance criterion such as minimum cost.
The goal of optimisation is to find the values of
the variables in the process that yield the best
value of the performance criterion.
Two ingredients of an optimisation problem:
(i) process or model
(ii) performance criterion
Some typical performance
criteria:
maximum profit
minimum cost
minimum effort
minimum error
minimum waste
maximum throughput
best product quality
Note the need to express the performance
criterion in mathematical form.
Static optimisation: variables
have numerical values, fixed
with respect to time.
Dynamic optimisation:
variables are functions of
time.

Essential Features
Every optimisation problem contains
three essential categories:
1. At least one objective function
to be optimised
2. Equality constraints
3. Inequality constraints

By a feasible solution we mean a set of variables which
satisfy categories 2 and 3. The region of feasible solutions
is called the feasible region.

[Figure: a feasible region in the (x1, x2) plane, bounded by linear and nonlinear inequality constraints and a nonlinear equality constraint]
An optimal solution is a set of values
of the variables that are contained in
the feasible region and also provide
the best value of the objective
function in category 1.

For a meaningful optimisation problem the model needs to be underdetermined.
Mathematical Description

Minimize:    f(x)        objective function
Subject to:  h(x) = 0    equality constraints
             g(x) ≥ 0    inequality constraints

where x ∈ Rⁿ is a vector of n variables (x1, x2, ..., xn),
h(x) is a vector of equalities of dimension m1, and
g(x) is a vector of inequalities of dimension m2.
Steps Used To Solve Optimisation Problems
1. Analyse the process in order to make a list of all the
variables.
2. Determine the optimisation criterion and specify
the objective function.
3. Develop the mathematical model of the process to
define the equality and inequality constraints.
Identify the independent and dependent variables
to obtain the number of degrees of freedom.
4. If the problem formulation is too large or complex
simplify it if possible.
5. Apply a suitable optimisation technique.
6. Check the result and examine its sensitivity to
changes in model parameters and assumptions.
Classification of
Optimisation Problems
Properties of f(x)
single variable or multivariable
linear or nonlinear
sum of squares
quadratic
smooth or non-smooth
sparsity
Properties of h(x) and g(x)
simple bounds
smooth or non-smooth
sparsity
linear or nonlinear
no constraints
Properties of variables x
time variant or invariant
continuous or discrete
take only integer values
mixed

Obstacles and Difficulties
Objective function and/or the constraint
functions may have finite discontinuities in
the continuous parameter values.
Objective function and/or the constraint
functions may be non-linear functions of the
variables.
Objective function and/or the constraint
functions may be defined in terms of
complicated interactions of the variables. This
may prevent calculation of unique values of
the variables at the optimum.
Objective function and/or the constraint
functions may exhibit nearly flat behaviour
for some ranges of variables or exponential
behaviour for other ranges. This causes the
problem to be insensitive, or too sensitive.
The problem may exhibit many local optima
whereas the global optimum is sought. A
solution may be obtained that is less
satisfactory than another solution elsewhere.
Absence of a feasible region.
Model-reality differences.
Typical Examples of Application
static optimisation
Plant design (sizing and layout).
Operation (best steady-state operating
condition).
Parameter estimation (model fitting).
Allocation of resources.
Choice of controller parameters (e.g.
gains, time constants) to minimise a given
performance index (e.g. overshoot, settling
time, integral of error squared).
dynamic optimisation

Determination of a control signal u(t) to transfer a dynamic system from an initial state to a desired final state to satisfy a given performance index.

Optimal plant start-up and/or shut down.

Minimum time problems.
BASIC PRINCIPLES OF STATIC
OPTIMISATION THEORY
Continuity of Functions
Functions containing discontinuities can
cause difficulty in solving optimisation
problems.
Definition: A function of a single variable x is continuous at a point x0 if:
(a) f(x0) exists
(b) lim{x→x0} f(x) exists
(c) lim{x→x0} f(x) = f(x0)
If f(x) is continuous at every point in a
region R, then f(x) is said to be continuous
throughout R.

[Figure: two sketches - a function f(x) with a jump, which is discontinuous, and a function f(x) with a kink, which is continuous although its derivative f'(x) = df/dx is not]
Unimodal and Multimodal Functions
A unimodal function f(x) (in the range
specified for x) has a single extremum
(minimum or maximum).
A multimodal function f(x) has two or more
extrema.
If f'(x) = 0 at the extremum, the point is
called a stationary point.
There is a distinction between the global
extremum (the biggest or smallest among
a set of extrema) and local extrema (any
extremum). Note: many numerical
procedures terminate at a local extremum.
[Figure: a multimodal function f(x), showing a global maximum (not stationary, at the boundary), a local maximum (stationary), a saddle point (stationary), a local minimum (stationary) and a global minimum (stationary)]
Multivariate Functions -
Surface and Contour Plots
We shall be concerned with basic properties of
a scalar function f(x) of n variables (x1,...,xn).

If n = 1, f(x) is a univariate function.
If n > 1, f(x) is a multivariate function.

For any multivariate function, the equation z = f(x) defines a surface in (n+1)-dimensional space R^(n+1).
In the case n = 2, the points z = f(x1,x2)
represent a three dimensional surface.

Let c be a particular value of f(x1,x2). Then f(x1,x2) = c defines a curve in x1 and x2 on the plane z = c.

If we consider a selection of different values of c, we obtain a family of curves which provide a contour map of the function z = f(x1,x2).
[Figure: contour map of z = e^(x1) (4x1² + 2x2² + 4x1x2 + 2x2 + 1), showing a saddle point and a local minimum]
Example: Surface and Contour Plots of Peaks Function

z = 3(1 - x1)² exp(-x1² - (x2 + 1)²)
    - 10(x1/5 - x1³ - x2⁵) exp(-x1² - x2²)
    - (1/3) exp(-(x1 + 1)² - x2²)
[Figure: surface plot of the peaks function - clearly multimodal - and its contour map, marking a global maximum, a global minimum, two local maxima, a local minimum and a saddle point]
Gradient Vector

The slope of f(x) at a point x = x̄ in the direction of the ith co-ordinate axis is:

∂f(x)/∂xi |x=x̄

The n-vector of these partial derivatives is termed the gradient vector of f, denoted by:

∇f(x̄) = [ ∂f(x̄)/∂x1 ]
         [     ⋮      ]   (a column vector)
         [ ∂f(x̄)/∂xn ]
The gradient vector at a point x = x̄ is normal to the contour through that point, in the direction of increasing f.

[Figure: a contour of f with the gradient vector ∇f(x̄) drawn normal to it, pointing towards increasing f]

At a stationary point:

∇f(x̄) = 0   (a null vector)
Example

f(x) = x1 x2² + x2 cos x1

∇f(x) = [ ∂f(x)/∂x1 ]   [ x2² - x2 sin x1  ]
        [ ∂f(x)/∂x2 ] = [ 2x1x2 + cos x1   ]

and the stationary point (points) are given by the simultaneous solution(s) of:

x2² - x2 sin x1 = 0
2x1x2 + cos x1 = 0
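One solution can be found by hand: the first equation factors as x2(x2 - sin x1) = 0, so taking x2 = 0 forces cos x1 = 0 in the second equation, giving a stationary point at (π/2, 0). A quick numerical check of the gradient there (a sketch using numpy, not part of the original notes):

```python
import numpy as np

# Gradient of f(x) = x1*x2**2 + x2*cos(x1), as derived above
def grad(x1, x2):
    return np.array([x2**2 - x2 * np.sin(x1),
                     2 * x1 * x2 + np.cos(x1)])

# Hand-derived stationary point: x2 = 0 forces cos(x1) = 0, so x1 = pi/2
g = grad(np.pi / 2, 0.0)
print(g)  # both components vanish (to machine precision)
```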
Note: If ∇f(x) is a constant vector, f(x) is then linear.

e.g.  f(x) = cᵀx  ⟹  ∇f(x) = c
Hessian Matrix (Curvature Matrix)

The second derivative of an n-variable function is defined by the n² partial derivatives:

∂/∂xi (∂f(x)/∂xj),  i = 1, ..., n;  j = 1, ..., n

written as:

∂²f(x)/∂xi∂xj,  i ≠ j;    ∂²f(x)/∂xi²,  i = j.
These n² second partial derivatives are usually represented by a square, symmetric matrix, termed the Hessian matrix, denoted by:

H(x) = ∇²f(x) = [ ∂²f/∂x1²     ...  ∂²f/∂x1∂xn ]
                [     ⋮                 ⋮       ]
                [ ∂²f/∂xn∂x1   ...  ∂²f/∂xn²   ]
Example: For the previous example:

∇²f(x) = [ ∂²f/∂x1²     ∂²f/∂x1∂x2 ]   [ -x2 cos x1     2x2 - sin x1 ]
         [ ∂²f/∂x1∂x2   ∂²f/∂x2²   ] = [ 2x2 - sin x1   2x1          ]

Note: If the Hessian matrix of f(x) is a constant matrix, f(x) is then quadratic, expressed as:

f(x) = ½ xᵀHx + cᵀx

∇f(x) = Hx + c,    ∇²f(x) = H
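At the stationary point (π/2, 0) of this example (from x2 = 0 and cos x1 = 0 in the gradient equations), the Hessian has eigenvalues of opposite sign, so that point is a saddle. A short numerical check (a numpy sketch, not from the notes):

```python
import numpy as np

# Hessian of f(x) = x1*x2**2 + x2*cos(x1), as derived above
def hess(x1, x2):
    return np.array([[-x2 * np.cos(x1), 2 * x2 - np.sin(x1)],
                     [2 * x2 - np.sin(x1), 2 * x1]])

# At the stationary point (pi/2, 0): H = [[0, -1], [-1, pi]]
H = hess(np.pi / 2, 0.0)
eigs = np.linalg.eigvalsh(H)   # eigenvalues in ascending order
print(eigs)  # one negative, one positive => indefinite => saddle point
```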
Convex and Concave Functions

A function is called concave over a given region R if:

f(θxa + (1-θ)xb) ≥ θ f(xa) + (1-θ) f(xb)

where xa, xb ∈ R and 0 ≤ θ ≤ 1.

The function is strictly concave if ≥ is replaced by >.

A function is called convex (strictly convex) if ≥ is replaced by ≤ (<).
[Figure: a concave function f(x), with f''(x) ≤ 0, lying above the chord joining xa and xb; and a convex function f(x), with f''(x) ≥ 0, lying below the chord joining xa and xb]
If d²f/dx² = f''(x) ≤ 0, then f(x) is concave.

If d²f/dx² = f''(x) ≥ 0, then f(x) is convex.

For a multivariate function f(x) the conditions are:

f(x)                H(x) Hessian matrix
strictly convex     +ve def
convex              +ve semi def
concave             -ve semi def
strictly concave    -ve def
Tests for Convexity and Concavity

H is +ve def (+ve semi def) iff xᵀHx > 0 (≥ 0) for all x ≠ 0.

H is -ve def (-ve semi def) iff xᵀHx < 0 (≤ 0) for all x ≠ 0.

Convenient tests: f(x) is strictly convex (H +ve def) (convex (H +ve semi def)) if:
1. all eigenvalues of H(x) are > 0 (≥ 0), or
2. all principal determinants of H(x) are > 0 (≥ 0).
f(x) is strictly concave (H -ve def) (concave (H -ve semi def)) if:

1. all eigenvalues of H(x) are < 0 (≤ 0), or
2. the principal determinants of H(x) alternate in sign:
Δ1 < 0, Δ2 > 0, Δ3 < 0, ...   (Δ1 ≤ 0, Δ2 ≥ 0, Δ3 ≤ 0, ...)
Example:  f(x) = 2x1² + 3x1x2 + 2x2²

∂f/∂x1 = 4x1 + 3x2    ∂²f/∂x1² = 4    ∂²f/∂x1∂x2 = 3
∂f/∂x2 = 3x1 + 4x2    ∂²f/∂x2² = 4

H(x) = [ 4  3 ],   Δ1 = 4,   Δ2 = det H = 7
       [ 3  4 ]

eigenvalues:  |λI - H| = λ² - 8λ + 7 = 0

λ1 = 1, λ2 = 7.  Hence, f(x) is strictly convex.
Convex Region

[Figure: a convex region, in which the straight line joining any two points xa and xb lies entirely inside the region, and a non-convex region, in which it does not]

A convex set of points exists if for any two points, xa and xb, in a region, all points:

x = θxa + (1-θ)xb,  0 ≤ θ ≤ 1

on the straight line joining xa and xb are in the set.

If a region is completely bounded by concave functions then the functions form a convex region.
Necessary and Sufficient Conditions
for an Extremum of an
Unconstrained Function

A condition N is necessary for a result R if R can be true only if N is true:  R ⟹ N

A condition S is sufficient for a result R if R is true if S is true:  S ⟹ R

A condition T is necessary and sufficient for a result R if R is true iff T is true:  T ⟺ R
There are two necessary conditions and a single sufficient condition to guarantee that x* is an extremum of a function f(x) at x = x*:

1. f(x) is twice continuously differentiable at x*.

2. ∇f(x*) = 0, i.e. a stationary point exists at x*.

3. ∇²f(x*) = H(x*) is +ve def for a minimum to exist at x*, or -ve def for a maximum to exist at x*.

1 and 2 are necessary conditions; 3 is a sufficient condition.

Note: an extremum may exist at x* even though it is not possible to demonstrate the fact using the three conditions.
Example: Consider:

f(x) = 4 + 4.5x1 - 4x2 + x1² + 2x2² - 2x1x2 + x1⁴ - 2x1²x2

The gradient vector is:

∇f(x) = [ 4.5 + 2x1 - 2x2 + 4x1³ - 4x1x2 ]
        [ -4 + 4x2 - 2x1 - 2x1²          ]

yielding three stationary points located by setting ∇f(x) = 0 and solving numerically:

x* = (x1, x2)      f(x*)   eigenvalues of ∇²f(x*)   classification
A. (-1.05, 1.03)   -0.51   10.5, 3.5                global min
B. (1.94, 3.85)     0.98   37.0, 0.97               local min
C. (0.61, 1.49)     2.83   7.0, -2.56               saddle

where:

∇²f(x) = [ 2 + 12x1² - 4x2    -2 - 4x1 ]
         [ -2 - 4x1            4       ]
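The numerical solution step can be sketched with a plain Newton iteration on ∇f(x) = 0, classifying each root by the eigenvalues of its Hessian (numpy assumed; the starting guesses are chosen near the three roots):

```python
import numpy as np

# Gradient and Hessian of
# f(x) = 4 + 4.5*x1 - 4*x2 + x1**2 + 2*x2**2 - 2*x1*x2 + x1**4 - 2*x1**2*x2
def grad(x):
    x1, x2 = x
    return np.array([4.5 + 2*x1 - 2*x2 + 4*x1**3 - 4*x1*x2,
                     -4 + 4*x2 - 2*x1 - 2*x1**2])

def hess(x):
    x1, x2 = x
    return np.array([[2 + 12*x1**2 - 4*x2, -2 - 4*x1],
                     [-2 - 4*x1, 4.0]])

kinds = []
for guess in ([-1.0, 1.0], [2.0, 4.0], [0.6, 1.5]):
    x = np.array(guess)
    for _ in range(50):                      # Newton iteration on grad f = 0
        x = x - np.linalg.solve(hess(x), grad(x))
    eigs = np.linalg.eigvalsh(hess(x))
    kinds.append("min" if eigs[0] > 0 else "max" if eigs[1] < 0 else "saddle")
    print(np.round(x, 2), np.round(eigs, 2), kinds[-1])
```

The three roots classify as min, min and saddle, matching the table above.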
[Figure: contour map of f(x), showing the two minima A and B and the saddle point C]
Interpretation of the Objective Function
in Terms of its Quadratic Approximation

If a function of two variables can be approximated within a region of a stationary point by a quadratic function:

f(x1,x2) = ½ [x1 x2] [ h11  h12 ] [ x1 ]  + [c1 c2] [ x1 ]
                     [ h12  h22 ] [ x2 ]            [ x2 ]

         = ½h11x1² + ½h22x2² + h12x1x2 + c1x1 + c2x2

then the eigenvalues and eigenvectors of:

H(x1*, x2*) = ∇²f(x1*, x2*) = [ h11  h12 ]
                              [ h12  h22 ]

can be used to interpret the nature of f(x1,x2) at x1 = x1*, x2 = x2*.
They provide information on the shape of f(x1,x2) at x1 = x1*, x2 = x2*. If H(x1*, x2*) is +ve def, the eigenvectors are at right angles (orthogonal) and correspond to the principal axes of the elliptical contours of f(x1,x2).

A valley or ridge lies in the direction of the eigenvector associated with a relatively small eigenvalue.

These interpretations can be generalized to the multivariate quadratic approximation:

f(x) = ½ xᵀHx + cᵀx
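This interpretation can be sketched numerically (numpy assumed; the matrix H below is an illustrative positive definite choice, not one from the notes). The eigenvectors of a symmetric H are orthonormal and give the principal axes of the elliptical contours; the valley runs along the eigenvector of the smaller eigenvalue:

```python
import numpy as np

# Illustrative +ve def Hessian (not from the notes): an elliptical valley
H = np.array([[3.0, 1.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eigh(H)   # ascending eigenvalues; columns = eigenvectors
print(eigvals)                         # [2. 4.]
print(eigvecs.T @ eigvecs)             # identity: eigenvectors are orthonormal

# The valley of f(x) = 0.5 * x^T H x lies along the eigenvector of the
# smaller eigenvalue (first column), here the direction of [1, -1] (up to sign)
print(eigvecs[:, 0])
```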
Case 1: Equal eigenvalues - circular contours, interpreted as a circular hill (max, -ve eigenvalues) or circular valley (min, +ve eigenvalues).

[Figure: circular contours in the (x1, x2) plane]
Case 2: Unequal eigenvalues of the same sign - elliptical contours, interpreted as an elliptical hill (max, -ve eigenvalues) or elliptical valley (min, +ve eigenvalues).

[Figure: elliptical contours in the (x1, x2) plane]
Case 3: Eigenvalues of opposite sign but equal in magnitude - symmetrical saddle.

[Figure: symmetrical saddle contours in the (x1, x2) plane]
Case 4: Eigenvalues of opposite sign and unequal in magnitude - asymmetrical saddle.

[Figure: asymmetrical saddle contours in the (x1, x2) plane]
Optimisation with Equality
Constraints

min f(x);  x ∈ Rⁿ
 x
subject to:  h(x) = 0;  m constraints (m < n)

Elimination of variables:

example:   min  f(x) = 4x1² + 5x2²   (a)
          x1,x2

s.t.  2x1 + 3x2 = 6   (b)

Using (b) to eliminate x1 gives:  x1 = (6 - 3x2)/2   (c)

and substituting into (a):  f(x2) = (6 - 3x2)² + 5x2²

At a stationary point:

df(x2)/dx2 = 0  ⟹  -6(6 - 3x2) + 10x2 = 0  ⟹  28x2 = 36  ⟹  x2* = 1.286

Then using (c):  x1* = (6 - 3x2*)/2 = 1.071

Hence, the stationary point (min) is: (1.071, 1.286)
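The elimination above reduces to one line of arithmetic, which the following sketch (plain Python, no solver) reproduces:

```python
# min 4*x1**2 + 5*x2**2  s.t.  2*x1 + 3*x2 = 6
# Eliminate x1 = (6 - 3*x2)/2, so f(x2) = (6 - 3*x2)**2 + 5*x2**2,
# and df/dx2 = -6*(6 - 3*x2) + 10*x2 = 28*x2 - 36 = 0
x2 = 36 / 28
x1 = (6 - 3 * x2) / 2
f = 4 * x1**2 + 5 * x2**2
print(round(x1, 3), round(x2, 3), round(f, 3))  # 1.071 1.286 12.857
```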
The Lagrange Multiplier Method

Consider a two variable problem with a single equality constraint:

min  f(x1, x2)
x1,x2

s.t.  h(x1, x2) = 0

At a stationary point we may write:

(a)  df = (∂f/∂x1) dx1 + (∂f/∂x2) dx2 = 0

(b)  dh = (∂h/∂x1) dx1 + (∂h/∂x2) dx2 = 0

i.e.  [ ∂f/∂x1  ∂f/∂x2 ] [ dx1 ]   [ 0 ]
      [ ∂h/∂x1  ∂h/∂x2 ] [ dx2 ] = [ 0 ]
If:

| ∂f/∂x1  ∂f/∂x2 |
| ∂h/∂x1  ∂h/∂x2 | = 0

nontrivial, nonunique solutions for dx1 and dx2 will exist. This is achieved by setting:

∂f/∂x1 = -λ (∂h/∂x1)   and   ∂f/∂x2 = -λ (∂h/∂x2)

where λ is known as a Lagrange multiplier.
If an augmented objective function, called the Lagrangian, is defined as:

L(x1, x2, λ) = f(x1, x2) + λ h(x1, x2)

we can solve the constrained optimisation problem by solving:

∂L/∂x1 = ∂f/∂x1 + λ (∂h/∂x1) = 0 }
∂L/∂x2 = ∂f/∂x2 + λ (∂h/∂x2) = 0 }  provides equations (a) and (b)

∂L/∂λ = h(x1, x2) = 0    re-statement of equality constraint
Generalizing: To solve the problem:

min f(x);  x ∈ Rⁿ
 x
subject to:  h(x) = 0;  m constraints (m < n)

define the Lagrangian:

L(x, λ) = f(x) + λᵀh(x),   λ ∈ Rᵐ

and the stationary point (points) is obtained from:

∇x L(x, λ) = ∇x f(x) + (∂h(x)/∂x)ᵀ λ = 0

∇λ L(x, λ) = h(x) = 0
Example: Consider the previous example again. The Lagrangian is:

L = 4x1² + 5x2² + λ(2x1 + 3x2 - 6)

∂L/∂x1 = 8x1 + 2λ = 0    (a)
∂L/∂x2 = 10x2 + 3λ = 0   (b)
∂L/∂λ = 2x1 + 3x2 - 6 = 0   (c)

Substituting (a) and (b) into (c) gives:

x1 = -λ/4,  x2 = -3λ/10  ⟹  -λ/2 - 9λ/10 - 6 = 0  ⟹  λ* = -30/7 = -4.286

Hence,  x1* = 15/14 = 1.071,   x2* = 9/7 = 1.286

which agrees with the previous result.
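Because the objective is quadratic and the constraint linear, conditions (a)-(c) form a linear system in (x1, x2, λ) that can be solved directly (a sketch with numpy):

```python
import numpy as np

# Stationarity of L = 4*x1**2 + 5*x2**2 + lam*(2*x1 + 3*x2 - 6):
#   dL/dx1  = 8*x1         + 2*lam = 0
#   dL/dx2  =       10*x2  + 3*lam = 0
#   dL/dlam = 2*x1 + 3*x2          = 6
A = np.array([[8.0,  0.0, 2.0],
              [0.0, 10.0, 3.0],
              [2.0,  3.0, 0.0]])
b = np.array([0.0, 0.0, 6.0])
x1, x2, lam = np.linalg.solve(A, b)
print(round(x1, 3), round(x2, 3), round(lam, 3))  # 1.071 1.286 -4.286
```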
[Figure: contours of f(x) with the constraint line 2x1 + 3x2 = 6; the constrained minimum lies where the line touches the lowest contour, at (1.071, 1.286)]
Necessary Conditions for a Local Extremum of
an Optimisation Problem Subject to Equality
and Inequality Constraints
(Kuhn-Tucker Conditions)

Consider the problem:

min f(x);  x ∈ Rⁿ
 x
s.t.:  h(x) = 0;  m equalities (m < n)
       g(x) ≥ 0;  p inequalities

Here, we define the Lagrangian:

L(x, λ, μ) = f(x) - λᵀh(x) - μᵀg(x);   λ ∈ Rᵐ,  μ ∈ Rᵖ

The necessary conditions for x* to be a local extremum of f(x) are:
(a) f(x), hj(x), gj(x) are all twice differentiable at x*.

(b) The Lagrange multipliers exist.

(c) All constraints are satisfied at x*:
    h(x*) = 0, i.e. hj(x*) = 0;   g(x*) ≥ 0, i.e. gj(x*) ≥ 0

(d) The Lagrange multipliers μj* (at x*) for the inequality constraints are not negative, i.e. μj* ≥ 0.

(e) The binding (active) inequality constraints are zero, the inactive inequality constraints are > 0, and the associated μj*s are 0 at x*, i.e.
    μj* gj(x*) = 0,  j = 1, ..., p

(f) The Lagrangian function is at a stationary point:
    ∇x L(x*, λ*, μ*) = 0
Notes:
1. Further analysis or investigation is required to determine whether the extremum is a minimum (or maximum).

2. If f(x) is convex, h(x) are linear and g(x) are concave, then x* will be a global extremum.
Limitations of Analytical Methods
The computations needed to evaluate the above
conditions can be extensive and intractable.
Furthermore, the resulting simultaneous
equations required for solving x*, * and * are
often nonlinear and cannot be solved without
resorting to numerical methods.

The results may be inconclusive.

For these reasons, we often have to resort to numerical methods for solving optimisation problems, using computer codes (e.g. MATLAB).
Example

Determine if the potential minimum x* = (1.00, 4.90) satisfies the Kuhn-Tucker conditions for the problem:

min f(x) = 4x1 - x2² - 12
 x

s.t.  h1(x) = 25 - x1² - x2² = 0

g1(x) = 10x1 - x1² + 10x2 - x2² - 34 ≥ 0
g2(x) = (x1 - 3)² + (x2 - 1)² ≥ 0
g3(x) = x1 + 2 ≥ 0
g4(x) = x2 ≥ 0
[Figure: contours of f(x) with the constraints: the circle h1(x) = 0, the curve g1(x) = 0 and the bound x2 = 0; the candidate point (1.00, 4.90) lies at the intersection of h1(x) = 0 and g1(x) = 0]
We test each Kuhn-Tucker condition in turn:
(a) All functions are seen by inspection to be twice differentiable.
(b) We assume the Lagrange multipliers exist.
(c) Are the constraints satisfied?

h1: 25 - (1.00)² - (4.90)² = -0.01 ≈ 0   yes (to the accuracy of x*)
g1: 10(1.00) - (1.00)² + 10(4.90) - (4.90)² - 34 = -0.01 ≈ 0   yes - binding
g2: (1.00 - 3)² + (4.90 - 1)² = 19.21 > 0   yes - not active
g3: 1.00 + 2 = 3 > 0   yes - not active
g4: 4.90 > 0   yes - not active
To test the rest of the conditions we need to determine the Lagrange multipliers using the stationarity conditions. First we note that from condition (e) we require:

μj* gj(x*) = 0,  j = 1, 2, 3, 4

μ1* can have any value because g1(x*) = 0
μ2* must be zero because g2(x*) > 0
μ3* must be zero because g3(x*) > 0
μ4* must be zero because g4(x*) > 0
Now consider the stationarity condition ∇x L(x, λ, μ) = 0 where, since μ2* = μ3* = μ4* = 0:

L = 4x1 - x2² - 12 - λ1(25 - x1² - x2²) - μ1(10x1 - x1² + 10x2 - x2² - 34)

Hence:

∂L/∂x1 = 4 + 2λ1* x1* - μ1*(10 - 2x1*) = 4 + 2λ1* - 8μ1* = 0
∂L/∂x2 = -2x2* + 2λ1* x2* - μ1*(10 - 2x2*) = -9.8 + 9.8λ1* - 0.2μ1* = 0

[ 2    -8   ] [ λ1* ]   [ -4  ]
[ 9.8  -0.2 ] [ μ1* ] = [ 9.8 ]

λ1* = 1.015,   μ1* = 0.754
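The 2x2 system above can be checked numerically (a sketch with numpy):

```python
import numpy as np

# Stationarity at x* = (1.00, 4.90):
#    4   + 2*lam1   - 8*mu1   = 0
#   -9.8 + 9.8*lam1 - 0.2*mu1 = 0
A = np.array([[2.0, -8.0],
              [9.8, -0.2]])
b = np.array([-4.0, 9.8])
lam1, mu1 = np.linalg.solve(A, b)
print(round(lam1, 3), round(mu1, 3))  # 1.015 0.754
```

Note that mu1 comes out positive, as condition (d) requires.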
Now we can check the remaining conditions:

(d) Are μj* ≥ 0, j = 1, 2, 3, 4?
μ1* = 0.754,  μ2* = μ3* = μ4* = 0
Hence, the answer is yes.

(e) Are μj* gj(x*) = 0, j = 1, 2, 3, 4?
Yes, because we have already used this above.

(f) Is the Lagrangian function at a stationary point?
Yes, because we have already used this above.
Hence, all the Kuhn-Tucker conditions are satisfied and we can have confidence in the solution:

x* = (1.00, 4.90)
