Optimization methods
A.BENHARI

Contents
Introduction and Basic Concepts
Transportation Problem
References
Historical Development
The existence of optimization methods can be traced to the days of Newton, Lagrange, and
Cauchy. The development of differential calculus methods for optimization was possible
because of the contributions of Newton and Leibniz to calculus. The foundations of the calculus
of variations, which deals with the minimization of functionals, were laid by Bernoulli, Euler,
Lagrange, and Weierstrass. The method of optimization for constrained problems, which
involves the addition of unknown multipliers, became known by the name of its inventor,
Lagrange. Cauchy made the first application of the steepest descent method to solve
unconstrained optimization problems. By the middle of the twentieth century, high-speed
digital computers made implementation of complex optimization procedures possible and
stimulated further research on newer methods. Spectacular advances followed, producing a
massive literature on optimization techniques. This advancement also resulted in the
emergence of several well-defined new areas in optimization theory.
Work by Kuhn and Tucker in 1951 on the necessary and sufficient conditions for the
optimal solution of programming problems laid the foundation for later research in
non-linear programming.
The contributions of Zoutendijk and Rosen to nonlinear programming during the early
1960s have been very significant.
The work of Carroll, and of Fiacco and McCormick, allowed many difficult problems to be
solved using the well-known techniques of unconstrained optimization.
Geometric programming was developed in the 1960s by Duffin, Zener, and Peterson.
Gomory did pioneering work in integer programming, one of the most exciting and
rapidly developing areas of optimization; the reason is that many real-world
applications fall under this category of problems.
Dantzig and Charnes and Cooper developed stochastic programming techniques and
solved problems by assuming design parameters to be independent and normally
distributed.
The necessity to optimize more than one objective or goal while satisfying the physical
limitations led to the development of multi-objective programming methods. Goal
programming is a well-known technique for solving specific types of multi-objective
optimization problems. Goal programming was originally proposed for linear problems
by Charnes and Cooper in 1961. The foundation of game theory was laid by von Neumann in
1928 and since then the technique has been applied to solve several mathematical, economic
and military problems. Only during the last few years has game theory been applied to solve
engineering problems.
Simulated annealing, genetic algorithms, and neural network methods represent a new class
of mathematical programming techniques that have come into prominence during the last
decade. Simulated annealing is analogous to the physical process of annealing of metals and
glass. Genetic algorithms are search techniques based on the mechanics of natural
selection and natural genetics. Neural network methods are based on solving the problem
using the computing power of a network of interconnected neuron processors.
Design of minimum weight structures for earthquake, wind, and other types of
random loading.
Optimal plastic design of frame structures (e.g., to determine the ultimate moment
capacity for minimum weight of the frame).
Optimum design of linkages, cams, gears, machine tools, and other mechanical
components.
Design of material handling equipment such as conveyors, trucks and cranes for
minimizing cost.
Design of pumps, turbines and heat transfer equipment for maximum efficiency.
Inventory control.
Controlling the waiting and idle times in production lines to reduce the cost of
production.
Planning the best strategy to obtain maximum profit in the presence of a competitor.
Analysis of statistical data and building empirical models to obtain the most accurate
representation of the statistical phenomenon.
Data collection
Model development
Data collection may be time consuming but is the fundamental basis of the model-building
process. The availability and accuracy of data can have considerable effect on the accuracy of
the model and on the ability to evaluate the model.
The problem definition and formulation includes the steps: identification of the decision
variables; formulation of the model objective(s) and the formulation of the model constraints.
In performing these steps the following are to be considered.
Objective Function
As already stated, the objective function is the mathematical function one wants to maximize
or minimize, subject to certain constraints. Many optimization problems have a single
objective function, though some have several.
Multiple objective functions. In some cases, the user may like to optimize a number of
different objectives concurrently. For instance, in the optimal design of a panel of a
door or window, it would be good to minimize weight and maximize strength
simultaneously. Usually, the different objectives are not compatible; the variables that
optimize one objective may be far from optimal for the others. In practice, problems
with multiple objectives are reformulated as single-objective problems by either
forming a weighted combination of the different objectives or by treating some of the
objectives as constraints.
Find X = (x1, x2, . . . , xn)^T which minimizes f(X)    (1.1)
subject to
gi(X) ≤ 0 ,  i = 1, 2, . . . , m
lj(X) = 0 ,  j = 1, 2, . . . , p
where X is an n-dimensional vector called the design vector, f(X) is called the objective
function, and gi(X) and lj(X) are known as inequality and equality constraints, respectively.
The number of variables n and the number of constraints m and/or p need not be related in
any way. This type of problem is called a constrained optimization problem.
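In code, the standard form (1.1) amounts to an objective plus two lists of constraint functions. The sketch below is a minimal illustration with a made-up objective and constraints (none of them come from the text); it checks whether a design vector is feasible.

```python
# A minimal sketch of the standard form (1.1). The objective and constraints
# below are made-up illustrations, not a problem from the text:
# minimize x1^2 + x2^2 subject to x1 + x2 - 2 <= 0 and x1 - x2 = 0.

def f(x):
    return x[0] ** 2 + x[1] ** 2                 # objective function f(X)

g = [lambda x: x[0] + x[1] - 2.0]                # inequality constraints g_i(X) <= 0
l = [lambda x: x[0] - x[1]]                      # equality constraints   l_j(X) = 0

def is_feasible(x, tol=1e-9):
    """X is feasible when every g_i(X) <= 0 and every l_j(X) = 0."""
    return all(gi(x) <= tol for gi in g) and all(abs(lj(x)) <= tol for lj in l)

print(is_feasible([1.0, 1.0]))    # True: on the boundary of g_1, satisfies l_1
print(is_feasible([2.0, 1.0]))    # False: violates both constraints
```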
Fig. 1 Contours of the objective function (f = C1, C2, C3, C4, C5) and the optimum point
If there are no constraints, the problem is stated simply as:
Find X which minimizes f(X)    (1.2)
Such problems are called unconstrained optimization problems. The field of unconstrained
optimization is quite a large and prominent one, for which a lot of algorithms and software
are available.
Design variables
These are essential: if there are no variables, we cannot define the objective function and the
problem constraints. In many practical problems, the design variables cannot be chosen
arbitrarily; they have to satisfy certain specified functional and other requirements.
Constraints
Constraints are not essential. It has been argued that almost all problems really do have
constraints. For example, any variable denoting the "number of objects" in a system can only
be useful if it is less than the number of elementary particles in the known universe! In
practice though, answers that make good sense in terms of the underlying physical or
economic criteria can often be obtained without putting constraints on the variables.
Design constraints are restrictions that must be satisfied to produce an acceptable design.
Constraints can be broadly classified as:
1) Behavioral or Functional constraints: these represent limitations on the behavior or
performance of the system.
2) Geometric or Side constraints: These represent physical limitations on design
variables such as availability, fabricability, and transportability.
For example, for the retaining wall design shown in the Fig 2, the base width W cannot be
taken smaller than a certain value due to stability requirements. The depth D below the
ground level depends on the soil pressure coefficients Ka and Kp. Since these constraints
depend on the performance of the retaining wall they are called behavioral constraints. The
number of anchors provided along a cross section Ni cannot be any real number but has to be
a whole number. Similarly, the thickness of reinforcement used is controlled by supplies from
the manufacturer. Hence these are side constraints.
Fig. 2 Retaining wall with base width W, depth D below ground level, and Ni anchors along a cross section
Constraint Surfaces
Consider the optimization problem presented in eq. 1.1 with only the inequality constraints
gi(X) ≤ 0. The set of values of X that satisfy the equation gi(X) = 0 forms a boundary surface
in the design space called a constraint surface. This is an (n-1)-dimensional subspace,
where n is the number of design variables. The constraint surface divides the design space
into two regions: one with gi(X) < 0 (feasible region) and the other in which gi(X) > 0
(infeasible region). The points lying on the hypersurface satisfy gi(X) = 0. The collection
of all the constraint surfaces gi(X) = 0, i = 1, 2, . . . , m, which separates the acceptable and
unacceptable regions, is called the composite constraint surface.
Fig 3 shows a hypothetical two-dimensional design space where the feasible region is
denoted by hatched lines. The two-dimensional design space is bounded by straight lines as
shown in the figure. This is the case when the constraints are linear. However, constraints
may be nonlinear as well and the design space will be bounded by curves in that case. A
design point that lies on more than one constraint surface is called a bound point, and the
associated constraint is called an active constraint. Free points are those that do not lie on any
constraint surface. The design points that lie in the acceptable or unacceptable regions can be
classified as follows:
1. Free and acceptable point
2. Free and unacceptable point
3. Bound and acceptable point
4. Bound and unacceptable point
Fig. 3 Two-dimensional design space showing the feasible and infeasible regions separated by the behavior constraints g1 ≤ 0 and g2 ≤ 0 and a side constraint g3 ≤ 0, with examples of free acceptable, free unacceptable, bound acceptable, and bound unacceptable points
Given: a function f from some set A to the real numbers.
Sought: an element x0 in A such that f(x0) ≤ f(x) for all x in A ("minimization") or such that
f(x0) ≥ f(x) for all x in A ("maximization").
Such a formulation is called an optimization problem or a mathematical programming
problem (a term not directly related to computer programming, but still in use, for example,
in linear programming). A local minimum x* is a point for which f(x) ≥ f(x*) for all x
sufficiently close to x*; that is to say, on some region around x* all the function values are
greater than or equal to the value at that point. Local maxima are defined similarly.
A large number of the algorithms proposed for solving non-convex problems, including the
majority of commercially available solvers, are not capable of making a distinction between
locally optimal solutions and rigorously optimal solutions, and will treat the former as actual
solutions to the original problem. The branch of applied mathematics and numerical analysis
that is concerned with the development of deterministic algorithms that are capable of
guaranteeing convergence in finite time to the actual optimal solution of a non-convex
problem is called global optimization.
Problem formulation
Problem formulation is normally the most difficult part of the process. It is the selection of
design variables, constraints, objective function(s), and models of the discipline/design.
Selection of design variables
A design variable, which takes a numeric or binary value, is controllable from the point of view
of the designer. For instance, the thickness of a structural member can be considered a design
variable. Design variables can be continuous (such as the length of a cantilever beam) or
discrete (such as the number of anchors along a cross section).
The designer has to also choose models to relate the constraints and the objectives to the
design variables. These models are dependent on the discipline involved. They may be
empirical models, such as a regression analysis of aircraft prices, theoretical models, such as
from computational fluid dynamics, or reduced-order models of either of these. In choosing
the models the designer must trade off fidelity against the time required for analysis.
The multidisciplinary nature of most design problems complicates model choice and
implementation. Often several iterations are necessary between the disciplines' analyses in
order to find the values of the objectives and constraints. As an example, the aerodynamic
loads on a bridge affect the structural deformation of the supporting structure. The structural
deformation in turn changes the shape of the bridge and hence the aerodynamic loads. Thus,
it can be considered as a cyclic mechanism. Therefore, in analyzing a bridge, the aerodynamic
and structural analyses must be repeated in turn until the loads and the deformation converge.
Once the design variables, constraints, objectives, and the relationships between them have
been chosen, the problem can be expressed as shown in equation 1.1
Maximization problems can be converted to minimization problems by multiplying the
objective by -1. Constraints can be reversed in a similar manner. Equality constraints can be
replaced by two inequality constraints.
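The two conversions just described are mechanical enough to express directly. The helpers below are illustrative sketches, not a standard API: one negates an objective to turn maximization into minimization, the other splits an equality constraint into the equivalent pair of inequalities.

```python
# Illustrative helpers for the conversions described above (assumed names,
# not from the text): maximization -> minimization by negating the objective,
# and an equality h(X) = 0 -> the pair h(X) <= 0 and -h(X) <= 0.

def as_minimization(objective):
    """Wrap a function to be maximized so that a minimizer can handle it."""
    return lambda x: -objective(x)

def equality_as_inequalities(h):
    """Replace h(X) = 0 by two inequality constraints in g(X) <= 0 form."""
    return [lambda x: h(x), lambda x: -h(x)]

profit = lambda x: 10.0 * x - x ** 2       # maximize profit ...
cost = as_minimization(profit)             # ... by minimizing its negative
print(cost(5.0))                           # -25.0

g1, g2 = equality_as_inequalities(lambda x: x - 3.0)
print(g1(3.0) <= 0 and g2(3.0) <= 0)       # True: both hold exactly at x = 3
```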
Problem solution
The problem is normally solved choosing the appropriate techniques from those available in
the field of optimization. These include gradient-based algorithms, population-based
algorithms, or others. Very simple problems can sometimes be expressed linearly; in that case
the techniques of linear programming are applicable.
Gradient-based methods
Newton's method
Steepest descent
Conjugate gradient
Population-based methods
Genetic algorithms
Other methods
Random search
Grid search
Simulated annealing
Most of these techniques require a large number of evaluations of the objectives and the
constraints. The disciplinary models are often very complex and can take a significant amount
of time for a single evaluation. The solution can therefore be extremely time-consuming.
s(X) ≤ smax ;  b ≥ 0 ;  d ≥ 0
where s is the settlement of the footing and smax is its permissible value. Such problems are
called parameter or static optimization problems.
s(X(t)) ≤ smax ,  0 ≤ t ≤ l
b(t) ≥ 0 ,  0 ≤ t ≤ l
d(t) ≥ 0 ,  0 ≤ t ≤ l
The length of the footing (l), the loads P1 and P2, and the distance between the loads are
assumed to be constant, and the required optimization is achieved by varying b and d along
the length l. Here the design variables are functions of the length parameter t. This type of
problem, where each design variable is a function of one or more parameters, is known as a
trajectory or dynamic optimization problem.
Fig. 1 Footing subjected to loads P1 and P2: (a) uniform width b and depth d; (b) varying width b(t) and depth d(t)
Find X which minimizes f(X) = Σ_{i=1}^{l} fi(xi, yi)
subject to
qi(xi, yi) = 0 ,  i = 1, 2, . . . , l
gj(xj) ≤ 0 ,  j = 1, 2, . . . , l
hk(yk) ≥ 0 ,  k = 1, 2, . . . , l
where xi is the ith control variable, yi is the ith state variable, and fi is the contribution of the
ith stage to the total objective function; gj, hk, and qi are functions of xj and yj, xk and yk, and
xi and yi, respectively, and l is the total number of stages. The control and state variables xi
and yi can be vectors in some cases.
(ii) Problems which are not optimal control problems are called non-optimal control
problems.
Classification based on the nature of the equations involved
Based on the nature of equations for the objective function and the constraints, optimization
problems can be classified as linear, nonlinear, geometric and quadratic programming
problems. The classification is very useful from a computational point of view, since many
special methods are available for the efficient solution of particular classes of problems.
(i) Linear programming problem
If the objective function and all the constraints are linear functions of the design variables,
the problem is called a linear programming (LP) problem, usually stated in the standard form:
Find X = (x1, x2, . . . , xn)^T which minimizes f(X) = Σ_{i=1}^{n} ci xi
subject to
Σ_{i=1}^{n} aij xi = bj ,  j = 1, 2, . . . , m
xi ≥ 0 ,  i = 1, 2, . . . , n
(ii) Nonlinear programming problem
If any of the functions among the objectives and constraint functions is nonlinear, the
problem is called a nonlinear programming (NLP) problem. This is the most general form of
a programming problem and all other problems can be considered as special cases of the NLP
problem.
(iii) Geometric programming problem
A geometric programming (GMP) problem is one in which the objective function and
constraints are expressed as posynomials in X. A function h(X) is called a posynomial (with
positive coefficients) if it can be expressed as a sum of power terms of the form
cj x1^{a1j} x2^{a2j} . . . xn^{anj}
with cj > 0 and xi > 0. A GMP problem is then stated as:
Find X which minimizes f(X) = Σ_{j=1}^{N0} cj Π_{i=1}^{n} xi^{qij} ,  cj > 0 , xi > 0
subject to
gk(X) = Σ_{j=1}^{Nk} ajk Π_{i=1}^{n} xi^{qijk} > 0 ,  ajk > 0 ,  k = 1, 2, . . . , m
where N0 and Nk denote the number of terms in the objective function and in the kth constraint
function, respectively.
(iv) Quadratic programming problem
A quadratic programming problem is the best behaved nonlinear programming problem with
a quadratic objective function and linear constraints and is concave (for maximization
problems). It can be solved by suitably modifying the linear programming techniques. It is
usually formulated as follows:
F(X) = c + Σ_{i=1}^{n} qi xi + Σ_{i=1}^{n} Σ_{j=1}^{n} Qij xi xj
subject to
Σ_{i=1}^{n} aij xi = bj ,  j = 1, 2, . . . , m
xi ≥ 0 ,  i = 1, 2, . . . , n
Classification based on the permissible values of the design variables
Under this classification, optimization problems can be classified as integer and real-valued
programming problems.
(i) Integer programming problem
If some or all of the design variables of an optimization problem are restricted to take only
integer (or discrete) values, the problem is called an integer programming problem. For
example, suppose the objective is to find the number of articles needed for an operation with
the least effort. With minimization of the effort required for the operation as the objective,
the decision variable, i.e. the number of articles used, can take only integer values. Other
restrictions on the minimum and maximum number of usable resources may be imposed.
(ii) Real-valued programming problem
If the design variables are allowed to take any real value, the problem is called a real-valued
programming problem.
Classification based on the deterministic nature of the variables
Under this classification, optimization problems can be classified as deterministic and
stochastic programming problems.
(i) Deterministic programming problem
In a deterministic system, the same input always produces the same output. In this type of
problem all the design variables are deterministic.
(ii) Stochastic programming problem
In this type of optimization problem, some or all of the design variables are expressed
probabilistically (non-deterministic or stochastic). For example, estimating the life span of a
structure from probabilistic inputs of concrete strength and load capacity is a stochastic
programming problem, since one can only estimate the life span stochastically.
Classification based on separability of the functions
Based on the separability of the objective and constraint functions, optimization problems can
be classified as separable and non-separable programming problems.
(i) Separable programming problems
A function f(X) is said to be separable if it can be expressed as the sum of n single-variable
functions, and a separable programming problem is one in which the objective function and
the constraints are separable:
f(X) = Σ_{i=1}^{n} fi(xi)
subject to
gj(X) = Σ_{i=1}^{n} gij(xi) ≤ bj ,  j = 1, 2, . . . , m
where bj is a constant.
Classification based on the number of objective functions
Under this classification, optimization problems can be classified as single-objective and
multi-objective programming problems.
(i) Single-objective programming problem: there is only a single objective function.
(ii) Multi-objective programming problem
A multi-objective programming problem is stated as:
Find X which minimizes f1(X), f2(X), . . . , fk(X)
subject to
gj(X) ≤ 0 ,  j = 1, 2, . . . , m
where f1, f2, . . . , fk denote the objective functions to be minimized simultaneously.
The classical optimization techniques are useful in finding the optimum solution, i.e. the
unconstrained maxima or minima, of continuous and differentiable functions. These are
analytical methods that make use of differential calculus to locate the optimum solution.
The classical methods have limited scope in practical applications, since practical problems
often involve objective functions that are not continuous and/or differentiable. Yet the study
of these classical techniques forms a basis for developing most of the numerical techniques
that have evolved into advanced methods more suitable to today's practical problems. These
methods assume that the function is twice differentiable with respect to the design variables
and that the derivatives are continuous.
design variables and that the derivatives are continuous. Three main types of problems can be
handled by the classical optimization techniques, viz., single variable functions, multivariable
functions with no constraints and multivariable functions with both equality and inequality
constraints. For problems with equality constraints the Lagrange multiplier method can be
used. If the problem has inequality constraints, the Kuhn-Tucker conditions can be used to
identify the optimum solution. These methods lead to a set of nonlinear simultaneous
equations that may be difficult to solve. These classical methods of optimization are further
discussed in Module 2.
The other methods of optimization include
Linear programming: studies the case in which the objective function f is linear and
the set A is specified using only linear equalities and inequalities. (A is the design
variable space)
Integer programming: studies linear programs in which some or all variables are
constrained to take on integer values.
Nonlinear programming: studies the general case in which the objective function or
the constraints or both contain nonlinear parts.
Stochastic programming: studies the case in which some of the constraints depend
on random variables.
Dynamic programming: studies the case in which the optimization strategy is based
on splitting the problem into smaller sub-problems.
Infinite-dimensional optimization: studies the case when the set of feasible solutions
is a subset of an infinite-dimensional space, such as a space of functions.
Constraint satisfaction: studies the case in which the objective function f is constant
(this is used in artificial intelligence, particularly in automated reasoning).
Hill climbing
Hill climbing is a graph search algorithm where the current path is extended with a
successor node which is closer to the solution than the end of the current path.
In simple hill climbing, the first closer node is chosen whereas in steepest ascent hill
climbing all successors are compared and the closest to the solution is chosen. Both
forms fail if there is no closer node. This may happen if there are local maxima in the
search space which are not solutions. Steepest ascent hill climbing is similar to best
first search but the latter tries all possible extensions of the current path in order,
whereas steepest ascent only tries one.
Hill climbing is used widely in artificial intelligence for reaching a goal state from a
starting node. The choice of the next node and of the starting node can be varied to give a
number of related algorithms.
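The steps above can be sketched in a few lines of code. This is a minimal illustration of steepest-ascent hill climbing in one dimension; the objective, step size, and start point are illustrative choices, not from the text.

```python
# A sketch of steepest-ascent hill climbing: compare all successors of the
# current node, move to the best one, and stop when no successor improves
# (which may be only a local maximum, not the solution).

def hill_climb(f, x, step=0.1, iters=1000):
    for _ in range(iters):
        neighbours = [x - step, x + step]     # all successors of the current node
        best = max(neighbours, key=f)         # steepest ascent picks the best
        if f(best) <= f(x):                   # no closer node: stop
            return x
        x = best
    return x

f = lambda x: -(x - 2.0) ** 2                 # single peak at x = 2
print(round(hill_climb(f, 0.0), 6))           # 2.0
```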
Simulated annealing
The name and inspiration come from the annealing process in metallurgy, a technique
involving heating and controlled cooling of a material to increase the size of its
crystals and reduce their defects. The heat causes the atoms to become unstuck from
their initial positions (a local minimum of the internal energy) and wander randomly
through states of higher energy; the slow cooling gives them more chances of finding
configurations with lower internal energy than the initial one.
In the simulated annealing method, each point of the search space is compared to a
state of some physical system, and the function to be minimized is interpreted as the
internal energy of the system in that state. Therefore the goal is to bring the system,
from an arbitrary initial state, to a state with the minimum possible energy.
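The acceptance rule described above can be sketched as follows; the cooling schedule, neighbourhood move, and objective are toy choices, not from the text. Downhill moves are always accepted, and uphill moves are accepted with probability exp(-Δ/T), which shrinks as the system cools.

```python
import math
import random

# A minimal simulated-annealing sketch: each point of the search space is a
# state and f is its "internal energy"; the parameters are illustrative.

def anneal(f, x, temp=10.0, cooling=0.99, iters=5000, seed=0):
    rng = random.Random(seed)
    best = x
    for _ in range(iters):
        candidate = x + rng.uniform(-1.0, 1.0)       # random neighbouring state
        delta = f(candidate) - f(x)
        # accept all downhill moves; accept uphill moves with probability
        # exp(-delta/temp), which shrinks as the system cools
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            x = candidate
        if f(x) < f(best):
            best = x
        temp *= cooling                              # slow, controlled cooling
    return best

f = lambda x: (x - 3.0) ** 2 + 2.0                   # minimum energy 2 at x = 3
x_best = anneal(f, -5.0)
print(round(x_best, 2))                              # close to 3.0
```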
Genetic algorithms
A genetic algorithm (GA) is a search technique used in computer science to find
approximate solutions to optimization and search problems. Specifically it falls into
the category of local search techniques and is therefore generally an incomplete
search. Genetic algorithms are a particular class of evolutionary algorithms that use
techniques inspired by evolutionary biology such as inheritance, mutation, selection,
and crossover (also called recombination).
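A toy genetic algorithm over 5-bit chromosomes illustrates the operators named above: selection (inheritance), crossover (recombination), and mutation. Population size, rates, and the fitness function are illustrative choices, not from the text.

```python
import random

# Toy GA maximizing a fitness function over 5-bit integer chromosomes.

def ga(f, bits=5, pop_size=20, generations=40, seed=1):
    rng = random.Random(seed)
    pop = [rng.randrange(2 ** bits) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f, reverse=True)
        parents = pop[: pop_size // 2]               # selection: keep fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, bits)           # one-point crossover
            mask = (1 << point) - 1
            child = (a & mask) | (b & ~mask & (2 ** bits - 1))
            if rng.random() < 0.1:                   # mutation: flip one bit
                child ^= 1 << rng.randrange(bits)
            children.append(child)
        pop = parents + children
    return max(pop, key=f)

f = lambda x: x * (31 - x)      # maximised at x = 15 or 16 (f = 240)
print(f(ga(f)))
```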
Ant colony optimization
In the natural world, ants wander randomly and, upon finding food, return to their colony
while laying down pheromone trails; other ants that find such a trail are likely to follow it,
reinforcing it with their own pheromone. Over time, however, the pheromone trail starts to evaporate, thus reducing its
attractive strength. The more time it takes for an ant to travel down the path and back
again, the more time the pheromones have to evaporate. A short path, by comparison,
gets marched over faster, and thus the pheromone density remains high as it is laid on
the path as fast as it can evaporate. Pheromone evaporation has also the advantage of
avoiding the convergence to a local optimal solution. If there was no evaporation at
all, the paths chosen by the first ants would tend to be excessively attractive to the
following ones. In that case, the exploration of the solution space would be
constrained.
Thus, when one ant finds a good (short) path from the colony to a food source, other
ants are more likely to follow that path, and such positive feedback eventually leaves
all the ants following a single path. The idea of the ant colony algorithm is to mimic
this behavior with "simulated ants" walking around the search space representing the
problem to be solved.
Ant colony optimization algorithms have been used to produce near-optimal solutions
to the traveling salesman problem. They have an advantage over simulated annealing
and genetic algorithm approaches when the graph may change dynamically. The ant
colony algorithm can be run continuously and can adapt to changes in real time. This
is of interest in network routing and urban transportation systems.
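The pheromone dynamics described above can be sketched with two alternative paths of different lengths; the deposit rule, evaporation rate, and the other parameters below are toy choices, not from the text.

```python
import random

# Toy sketch of pheromone dynamics: two paths compete; the shorter one is
# marched over faster, so it receives pheromone faster than it evaporates.

def simulate(lengths=(1.0, 2.0), ants=50, rounds=200, rho=0.1, seed=0):
    rng = random.Random(seed)
    pher = [1.0, 1.0]                            # initial pheromone on each path
    for _ in range(rounds):
        for _ in range(ants):
            # each ant picks a path with probability proportional to pheromone
            p = pher[0] / (pher[0] + pher[1])
            path = 0 if rng.random() < p else 1
            pher[path] += 1.0 / lengths[path]    # shorter path: larger deposit
        pher = [(1 - rho) * t for t in pher]     # evaporation
    return pher

pher = simulate()
print(pher[0] > pher[1])    # True: the colony converges on the shorter path
```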
Fig. 1 Minimum, inflection point, and maximum of a single-variable function
Fig. 2 Relative maxima (A1, A2, A3) and relative minima (B1, B2) of a function f(x)
The derivative of f(x) at x* is defined by the limit of
[f(x* + h) - f(x*)] / h    (1)
as h → 0. From our earlier discussion on relative maxima we have f(x*) ≥ f(x* + h) for all
sufficiently small |h|. Hence
[f(x* + h) - f(x*)] / h ≥ 0 for h < 0    (2)
[f(x* + h) - f(x*)] / h ≤ 0 for h > 0    (3)
which implies that for sufficiently small negative values of h we have f'(x*) ≥ 0, and for
sufficiently small positive values of h we have f'(x*) ≤ 0. In order to satisfy both (2) and
(3), f'(x*) = 0. This gives the necessary condition for a relative maximum at x = x* of
f(x).
It has to be kept in mind that the above theorem holds for a relative minimum as well.
The theorem considers only a domain where the function is continuous and differentiable; it
cannot indicate whether a maximum or a minimum exists at a point where the derivative fails
to exist. This scenario is shown in Fig. 3, where the slopes m1 and m2 at the point of maximum
are unequal, so the derivative is undefined there and the theorem does not apply. The
theorem also does not cover the case where the maximum or minimum occurs at an end point
of the interval of definition, where only a one-sided derivative exists. Nor does it say whether
the function has a maximum or a minimum at every point where f'(x) = 0, since the condition
f'(x) = 0 identifies stationary points, which include points of inflection that are neither maxima
nor minima. A point of inflection is shown in Fig. 1.
Fig. 3 A maximum at x* at which the derivative does not exist: the one-sided slopes m1 and m2 are unequal
Sufficient condition: For the same function stated above, let f'(x*) = f''(x*) = . . . = f^(n-1)(x*)
= 0, but f^(n)(x*) ≠ 0. Then f(x*) is (a) a minimum value of f(x) if f^(n)(x*) > 0
and n is even; (b) a maximum value of f(x) if f^(n)(x*) < 0 and n is even; (c) neither a
maximum nor a minimum if n is odd.
Proof: applying Taylor's theorem with remainder after n terms,
f(x* + h) = f(x*) + h f'(x*) + (h^2/2!) f''(x*) + . . . + (h^(n-1)/(n-1)!) f^(n-1)(x*)
+ (h^n/n!) f^(n)(x* + θh) ,  0 < θ < 1    (4)
Since f'(x*) = f''(x*) = . . . = f^(n-1)(x*) = 0, this reduces to
f(x* + h) - f(x*) = (h^n/n!) f^(n)(x* + θh)    (5)
As f^(n)(x*) ≠ 0, there exists an interval around x* for every point x of which the nth derivative
f^(n)(x) has the same sign as f^(n)(x*); thus for every point x* + h of this interval,
f^(n)(x* + θh) has the sign of f^(n)(x*). When n is even, h^n/n! is positive irrespective of the
sign of h, so f(x* + h) - f(x*) has the sign of f^(n)(x*): x* is a
relative minimum if f^(n)(x*) is positive, with f(x) convex around x*, and a relative maximum if
f^(n)(x*) is negative, with f(x) concave around x*. When n is odd, h^n/n! changes sign with the
change in the sign of h and hence the point x* is neither a maximum nor a minimum. In this
case the point x* is called a point of inflection.
Example 1.
Find the optimum value of the function f(x) = x^2 + 3x - 5 and state whether it is a maximum
or a minimum.
Solution
f'(x) = 2x + 3 = 0 for a maximum or a minimum,
or x* = -3/2
f''(x*) = 2, which is positive; hence the point x* = -3/2 is a point of minimum, and the function
attains its minimum value f(x*) = -29/4 there.
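Example 1 can also be cross-checked numerically with central differences; the sketch below is an added illustration, not part of the original solution.

```python
# A numerical cross-check of Example 1 using central differences:
# f(x) = x^2 + 3x - 5 is stationary at x* = -3/2 with f''(x*) = 2 > 0.

def d1(f, x, h=1e-3):
    return (f(x + h) - f(x - h)) / (2 * h)               # first derivative

def d2(f, x, h=1e-3):
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2     # second derivative

f = lambda x: x ** 2 + 3 * x - 5
x_star = -1.5
print(abs(d1(f, x_star)) < 1e-6)    # True: x* is a stationary point
print(round(d2(f, x_star), 6))      # 2.0: positive, hence a minimum
print(f(x_star))                    # -7.25, the minimum value of f
```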
31
( )
f x * = 24 at x* = 2
Hence fn(x) is positive and n is even hence the point x = x* = 2 is a point of minimum and the
function attains a minimum value of 0 at this point.
Example 3.
Analyze the function f ( x) = 12 x 5 45 x 4 + 40 x3 + 5 and classify the stationary points as
maxima, minima and points of inflection.
Solution
f'(x) = 60x^4 - 180x^3 + 120x^2 = 0
=> x^4 - 3x^3 + 2x^2 = 0, i.e. x^2(x - 1)(x - 2) = 0
=> x = 0, 1, 2
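The stationary points found in Example 3 can be classified with the derivative tests from this lecture. The sketch below hand-codes f''(x) and f'''(x): where f''(x) ≠ 0 its sign decides, and at x = 0, where f''(0) = 0 but f'''(0) ≠ 0, the odd-order test gives a point of inflection.

```python
# Classify the stationary points x = 0, 1, 2 of
# f(x) = 12x^5 - 45x^4 + 40x^3 + 5 using its exact derivatives.

d2 = lambda x: 240 * x ** 3 - 540 * x ** 2 + 240 * x    # f''(x)
d3 = lambda x: 720 * x ** 2 - 1080 * x + 240            # f'''(x)

def classify(x):
    if d2(x) > 0:
        return "minimum"
    if d2(x) < 0:
        return "maximum"
    # f''(x) = 0: the next non-zero derivative is of odd order -> inflection
    return "inflection" if d3(x) != 0 else "higher-order test needed"

for x in (0, 1, 2):
    print(x, classify(x))    # 0 inflection, 1 maximum, 2 minimum
```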
The horsepower generated by a Pelton wheel is proportional to u(v - u), where u is the
velocity of the wheel, which is variable, and v is the velocity of the jet, which is fixed. Show
that the efficiency of the Pelton wheel is maximum at u = v/2.
Solution
f = K u(v - u), where K is a proportionality constant (assumed positive). For a maximum,
∂f/∂u = 0  =>  Kv - 2Ku = 0  =>  u = v/2
∂²f/∂u² = -2K, which is negative.
Hence, f is maximum at u = v/2.
This concept may be easily extended to functions of multiple variables. Functions of two
variables are best illustrated by contour maps, analogous to geographical maps. A contour is a
line representing a constant value of f(x) as shown in Fig.4. From this we can identify
maxima, minima and points of inflection.
As can be seen in Fig. 4 and 5, perturbations from points of local minima in any direction
result in an increase in the response function f(x), i.e. the slope of the function is zero at this
point of local minima. Similarly, at maxima and points of inflection as the slope is zero, the
first derivatives of the function with respect to the variables are zero.
This gives
∂f/∂x1 = 0 ;  ∂f/∂x2 = 0
at the stationary points; i.e., the gradient vector of f(X),
∇x f = [∂f/∂x1(X*), ∂f/∂x2(X*)]^T = 0
Fig. 4 Contour map of a function of two variables f(x1, x2)
Fig. 5 Function surface showing relative minima and the global minimum
Sufficient conditions
The sufficient conditions involve the Hessian matrix of second partial derivatives, evaluated
at the stationary point [x1, x2]:
H = [ ∂²f/∂x1²     ∂²f/∂x1∂x2
      ∂²f/∂x1∂x2   ∂²f/∂x2²  ]
a) If H is positive definite, then the point X = [x1, x2] is a point of local minimum.
b) If H is negative definite, then the point X = [x1, x2] is a point of local maximum.
c) If H is neither, then the point X = [x1, x2] is neither a point of maximum nor of minimum.
A square matrix is positive definite if all its eigenvalues are positive, and negative definite if
all its eigenvalues are negative. If some of the eigenvalues are positive and some negative, the
matrix is neither positive definite nor negative definite.
To calculate the eigenvalues λ of a square matrix A, the following equation is solved:
|A - λI| = 0
The above rules give the sufficient conditions for the optimization problem of two variables.
Optimization of multiple variable problems will be discussed in detail in lecture notes 3
(Module 2).
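For two variables, the eigenvalue test reduces to solving the quadratic characteristic equation |λI - H| = 0 by hand. The sketch below does this for a symmetric 2x2 Hessian and applies rules (a)-(c); the sample matrix is the Hessian H = [[-2, 0], [0, -3]] that appears in Example 6 below.

```python
import math

# Classify a stationary point from the eigenvalues of a symmetric 2x2 Hessian.

def eigenvalues_2x2(h11, h12, h22):
    """Roots of λ^2 - (h11 + h22)λ + (h11*h22 - h12^2) = 0."""
    trace = h11 + h22
    det = h11 * h22 - h12 ** 2
    disc = math.sqrt(trace ** 2 - 4 * det)       # real for symmetric matrices
    return (trace + disc) / 2, (trace - disc) / 2

def classify(h11, h12, h22):
    l1, l2 = eigenvalues_2x2(h11, h12, h22)
    if l1 > 0 and l2 > 0:
        return "local minimum"                   # H positive definite
    if l1 < 0 and l2 < 0:
        return "local maximum"                   # H negative definite
    return "neither"                             # mixed or zero eigenvalues

print(classify(-2, 0, -3))    # local maximum: eigenvalues are -2 and -3
```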
Example 5.
Locate the stationary points of f(X) and classify them as relative maxima, relative minima or
neither based on the rules discussed in the lecture.
f(X) = (2/3)x1^3 - 2x1x2 - 5x1 + 2x2^2 + 4x2 + 5
Solution
From ∂f/∂x2(X) = 0:  -2x1 + 4x2 + 4 = 0, i.e. x1 = 2x2 + 2
From ∂f/∂x1(X) = 0:  2x1^2 - 2x2 - 5 = 0; substituting x1 = 2x2 + 2 gives
8x2^2 + 14x2 + 3 = 0
(2x2 + 3)(4x2 + 1) = 0
x2 = -3/2 or x2 = -1/4
so the stationary points are X1 = [-1, -3/2] and X2 = [3/2, -1/4]. The Hessian is
H = [ 4x1  -2
      -2    4 ]
and the characteristic equation is
|λI - H| = (λ - 4x1)(λ - 4) - 4 = 0
At X1 = [-1, -3/2]:
(λ + 4)(λ - 4) - 4 = 0
λ^2 - 20 = 0
λ1 = +√20 , λ2 = -√20
Since one eigenvalue is positive and one negative, X1 is neither a relative maximum nor a
relative minimum.
At X2 = [3/2, -1/4]:
(λ - 6)(λ - 4) - 4 = 0
λ^2 - 10λ + 20 = 0
λ1 = 5 + √5 , λ2 = 5 - √5
Since both eigenvalues are positive, X2 is a point of relative minimum.
Example 6
The stationary point X* is obtained from
∇x f = [∂f/∂x1(X*), ∂f/∂x2(X*)]^T = [2 - 2x1, 6 - 3x2]^T = [0, 0]^T
which gives x1* = 1 and x2* = 2. The second partial derivatives are
∂²f/∂x1² = -2 ;  ∂²f/∂x2² = -3 ;  ∂²f/∂x1∂x2 = 0
so that
H = [ -2   0
       0  -3 ]
and
|λI - H| = (λ + 2)(λ + 3) = 0
Here the values of λ do not depend on X, and λ1 = -2, λ2 = -3. Since both eigenvalues
are negative, f(X) is concave, and the required ratio x1:x2 = 1:2 gives a global maximum
strength of f(X) = 27 units.
Fig. 1
In other words, a function is convex if and only if its epigraph (the set of points lying on or
above the graph) is a convex set. A function is also said to be strictly convex if
f(t x1 + (1 - t) x2) < t f(x1) + (1 - t) f(x2)
for any t in (0, 1), so that a line connecting any two points on the function lies completely
above the function. These relationships are illustrated in Fig. 1.
Testing for convexity of a single variable function
A function is convex if its slope is non-decreasing, i.e. ∂²f/∂x² ≥ 0. It is strictly convex if its
slope is continually increasing, i.e. ∂²f/∂x² > 0 throughout the function.
The absolute value function |x| is convex, even though it does not have a derivative at
x = 0.
The function f with domain [0,1] defined by f(0)=f(1)=1, f(x)=0 for 0<x<1 is convex;
it is continuous on the open interval (0,1), but not continuous at 0 and 1.
Every linear transformation is convex but not strictly convex, since if f is linear, then
f(a + b) = f(a) + f(b). This implies that the identity map (i.e., f(x) = x) is convex but
not strictly convex. The same holds if "convex" is replaced by "concave".
Concave function
A differentiable function f is concave on an interval if its derivative function f′ is decreasing on that interval: a concave function has a decreasing slope.
A function that is convex is often synonymously called concave upwards, and a function
that is concave is often synonymously called concave downward.
For a twice-differentiable function f, if the second derivative, f ''(x), is positive (or, if the
acceleration is positive), then the graph is convex (or concave upward); if the second
derivative is negative, then the graph is concave (or concave downward). Points at which the concavity changes are called inflection points.
If a convex (i.e., concave upward) function has a "bottom", any point at the bottom is a
minimal extremum. If a concave (i.e., concave downward) function has an "apex", any point
at the apex is a maximal extremum.
A function f(x) is said to be concave on an interval if, for all a and b in that interval and any t in [0,1], f(t a + (1 − t) b) ≥ t f(a) + (1 − t) f(b).
Fig. 2
Testing for concavity of a single variable function
A function is concave if its slope is non-increasing, i.e., ∂²f/∂x² ≤ 0. It is strictly concave if its slope is continually decreasing, i.e., ∂²f/∂x² < 0 throughout the function.
Properties of concave functions
A continuous function f on C is concave if and only if, for any a and b in C,

f( (a + b)/2 ) ≥ ( f(a) + f(b) ) / 2
⇒ x⁴ − 3x³ + 2x² = 0
⇒ x = 0, 1, 2
Consider x = x* = 2:
f″(x*) = 240(x*)³ − 540(x*)² + 240x* = 240 at x* = 2
Since the second derivative is positive, the point x = x* = 2 is a point of local minimum with a minimum value of f(x) = −11. At this point the function is convex since ∂²f/∂x² > 0.
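The second-derivative test above is easy to reproduce numerically. The sketch below assumes the objective was f(x) = 12x⁵ − 45x⁴ + 40x³ + 5 — a reconstruction consistent with the derivative expressions shown, since the function statement itself does not appear on this page:

```python
# Second-derivative test for an assumed f(x) = 12x^5 - 45x^4 + 40x^3 + 5,
# whose derivative 60x^2(x - 1)(x - 2) matches the roots x = 0, 1, 2 above.
import numpy as np

f = np.polynomial.Polynomial([5, 0, 0, 40, -45, 12])  # coefficients, low degree first
df, d2f = f.deriv(), f.deriv(2)

for x in np.unique(np.round(df.roots().real, 8)):
    curv = d2f(x)
    kind = "local minimum" if curv > 0 else "local maximum" if curv < 0 else "inconclusive"
    print(f"x = {x:g}: f'' = {curv:g} -> {kind}, f = {f(x):g}")
```

At x* = 2 this reports f″ = 240 and f = −11, matching the hand calculation; at x = 0 the test is inconclusive because f″(0) = 0.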
f(t X1 + (1 − t) X2) < t f(X1) + (1 − t) f(X2)

where X1 and X2 are points located by the coordinates given in their respective vectors. Similarly, a two variable function is strictly concave if

f(t X1 + (1 − t) X2) > t f(X1) + (1 − t) f(X2)
The contour plot of a convex function is illustrated in Fig. 3.

Fig. 3

The contour plot of a concave function is shown in Fig. 4.

Fig. 4
Example 2
Consider the example in lecture notes 1 for a function of two variables. Locate the stationary
points of f(X) and find out if the function is convex, concave or neither at the points of
optima based on the rules discussed in this lecture.
f(X) = 2x1³/3 − 2x1x2 − 5x1 + 2x2² + 4x2 + 5
Solution
∇f(X*) = [ ∂f/∂x1 (X*) ]   [ 2x1² − 2x2 − 5  ]   [ 0 ]
         [ ∂f/∂x2 (X*) ] = [ −2x1 + 4x2 + 4  ] = [ 0 ]

Solving as before gives the stationary points X1 = [−1, −3/2] and X2 = [3/2, −1/4]. The Hessian matrix is

H = [ 4x1  −2 ]
    [ −2    4 ]

λI − H = [ λ − 4x1      2   ]
         [    2      λ − 4  ]

At X1 = [−1, −3/2]:

|λI − H| = (λ + 4)(λ − 4) − 4 = 0
λ² − 16 − 4 = 0
λ² = 20
λ1 = +√20,  λ2 = −√20
Since one eigenvalue is positive and one is negative, X1 is neither a relative maximum nor a relative minimum. Hence at X1 the function is neither convex nor concave.
At X2 = [3/2, −1/4]:

|λI − H| = (λ − 6)(λ − 4) − 4 = 0
λ² − 10λ + 20 = 0
λ1 = 5 + √5,  λ2 = 5 − √5
Since both eigenvalues are positive, X2 is a local minimum, and the function is convex at this point.
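The eigenvalue test used in this example can be reproduced numerically; a minimal NumPy sketch, with the stationary points taken from the solution above:

```python
# Classify the stationary points of f(X) = (2/3)x1^3 - 2x1x2 - 5x1 + 2x2^2 + 4x2 + 5
# by the signs of the Hessian eigenvalues.
import numpy as np

def hessian(x1, x2):
    # f_x1x1 = 4x1, f_x1x2 = f_x2x1 = -2, f_x2x2 = 4
    return np.array([[4.0 * x1, -2.0], [-2.0, 4.0]])

for name, (x1, x2) in {"X1": (-1.0, -1.5), "X2": (1.5, -0.25)}.items():
    eig = np.linalg.eigvalsh(hessian(x1, x2))
    if np.all(eig > 0):
        verdict = "positive definite -> relative minimum (convex here)"
    elif np.all(eig < 0):
        verdict = "negative definite -> relative maximum (concave here)"
    else:
        verdict = "indefinite -> neither maximum nor minimum"
    print(name, np.round(eig, 3), verdict)
```

X1 yields eigenvalues ±√20 (indefinite) and X2 yields 5 ± √5 (both positive), in agreement with the hand calculation.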
Unconstrained optimization
If a convex function is to be minimized, the stationary point is the global minimum and
analysis is relatively straightforward, as discussed earlier. A similar situation exists for maximizing a concave function. The necessary and sufficient conditions for the optimization of an unconstrained function of several variables are given below.
Necessary condition
In case of multivariable functions a necessary condition for a stationary point of the function
f(X) is that each partial derivative is equal to zero. In other words, each element of the
gradient vector defined below must be equal to zero.
i.e. the gradient vector of f(X), ∇x f at X = X*, defined as follows, must be equal to zero:

∇x f = [ ∂f/∂x1 (X*) ]
       [ ∂f/∂x2 (X*) ]  =  0
       [      ⋮      ]
       [ ∂f/∂xn (X*) ]
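The condition ∇f(X*) = 0 is easy to check numerically with central finite differences; a small sketch, in which the quadratic test function and the step size h are illustrative assumptions:

```python
# Central-difference approximation of the gradient, used to verify the
# first-order necessary condition grad f(X*) = 0 at a candidate point.
import numpy as np

def grad(f, x, h=1e-6):
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(g)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2  # illustrative function
print(grad(f, np.array([1.0, -3.0])))  # ~ [0, 0]: the point is stationary
```

At any non-stationary point the same check returns a visibly nonzero gradient vector.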
Sufficient condition
For a stationary point X* to be an extreme point, the matrix of second partial derivatives (Hessian matrix) of f(X) evaluated at X* must be:
(i) positive definite when X* is a relative minimum point, and
(ii) negative definite when X* is a relative maximum point.
This can be seen from the Taylor series expansion of f about X*:

f(X* + h) = f(X*) + Σᵢ hᵢ ∂f/∂xᵢ (X*) + (1/2!) Σᵢ Σⱼ hᵢ hⱼ ∂²f/∂xᵢ∂xⱼ |at X = X* + θh,   0 < θ < 1

Since X* is a stationary point, the necessary condition gives

∂f/∂xᵢ = 0,  i = 1, 2, …, n

Thus

f(X* + h) − f(X*) = (1/2!) Σᵢ Σⱼ hᵢ hⱼ ∂²f/∂xᵢ∂xⱼ |at X = X* + θh

For a minimization problem the left hand side of the above expression must be positive. Since the second partial derivatives are continuous in the neighborhood of X*, the sign of ∂²f/∂xᵢ∂xⱼ at X = X* + θh is the same as its sign at X = X*. Hence X* will be a relative minimum if the quadratic form

Q = Σᵢ Σⱼ hᵢ hⱼ ∂²f/∂xᵢ∂xⱼ |at X = X*  =  hᵀ H h |at X = X*

is positive, where

H |X = X*  =  [ ∂²f/∂xᵢ∂xⱼ ] |X = X*

is the matrix of second partial derivatives and is called the Hessian matrix of f(X).
Q will be positive if the matrix H is positive definite, i.e., if all of its eigenvalues — all values of λ satisfying the determinant equation |A − λI| = 0 — are positive. Similarly, the matrix A will be negative definite if its eigenvalues are negative. When some eigenvalues are positive and some are negative, the matrix A is neither positive definite nor negative definite.
When all eigenvalues are negative for all possible values of X, then X* is a global maximum,
and when all eigenvalues are positive for all possible values of X, then X* is a global
minimum.
If some of the eigenvalues of the Hessian at X* are positive and some negative, the stationary point X* is a saddle point, neither a local maximum nor a local minimum; if some eigenvalues are zero, the second-order test is inconclusive.
Example
Analyze the function f(X) = −x1² − x2² − x3² + 2x1x2 + 2x1x3 + 4x1 − 5x3 + 2 and classify the stationary points as maxima, minima and points of inflection.
Solution
∇f(X*) = [ ∂f/∂x1 (X*) ]   [ −2x1 + 2x2 + 2x3 + 4 ]   [ 0 ]
         [ ∂f/∂x2 (X*) ] = [ 2x1 − 2x2            ] = [ 0 ]
         [ ∂f/∂x3 (X*) ]   [ 2x1 − 2x3 − 5        ]   [ 0 ]

Solving this linear system gives X* = [1/2, 1/2, −2]. The second partial derivatives are

∂²f/∂x1² = ∂²f/∂x2² = ∂²f/∂x3² = −2
∂²f/∂x1∂x2 = ∂²f/∂x2∂x1 = ∂²f/∂x1∂x3 = ∂²f/∂x3∂x1 = 2,  ∂²f/∂x2∂x3 = 0

so the Hessian matrix [∂²f/∂xᵢ∂xⱼ] is

H = [ −2   2   2 ]
    [  2  −2   0 ]
    [  2   0  −2 ]

λI − H = [ λ+2   −2    −2  ]
         [ −2   λ+2     0  ]   and   |λI − H| = 0
         [ −2    0    λ+2  ]

(λ + 2)[(λ + 2)² − 8] = 0

or λ1 = −2, λ2 = −2 + 2√2, λ3 = −2 − 2√2. Since one eigenvalue (λ2 = −2 + 2√2 ≈ 0.83) is positive while the other two are negative, the Hessian is indefinite, and the stationary point X* = [1/2, 1/2, −2] is neither a maximum nor a minimum but a saddle point.
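Because the gradient of this function is linear in X, ∇f = 0 can be solved directly and the Hessian eigenvalues inspected with NumPy — a cross-check sketch of the example above (the reconstructed signs of f are as stated in the problem):

```python
# Cross-check: solve grad f = 0 for
# f = -x1^2 - x2^2 - x3^2 + 2x1x2 + 2x1x3 + 4x1 - 5x3 + 2.
import numpy as np

H = np.array([[-2.0, 2.0, 2.0],    # constant Hessian of f
              [ 2.0, -2.0, 0.0],
              [ 2.0, 0.0, -2.0]])
b = np.array([-4.0, 0.0, 5.0])     # grad f = H x + (4, 0, -5) = 0  ->  H x = b
x_star = np.linalg.solve(H, b)
print("X* =", x_star)                               # [0.5, 0.5, -2.0]
print("eigenvalues of H:", np.linalg.eigvalsh(H))   # -2 - 2*sqrt(2), -2, -2 + 2*sqrt(2)
```

The positive eigenvalue −2 + 2√2 ≈ 0.83 confirms numerically that H is indefinite at X*.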
(2)
assuming dx1 and dx2 are small, the Taylor series expansion of this gives us

g(x1* + dx1, x2* + dx2) = g(x1*, x2*) + (∂g/∂x1)(x1*, x2*) dx1 + (∂g/∂x2)(x1*, x2*) dx2 = 0
(3)

or

dg = (∂g/∂x1) dx1 + (∂g/∂x2) dx2 = 0 at [x1*, x2*]
(4)

which is the condition that must be satisfied for all admissible variations.
Assuming ∂g/∂x2 ≠ 0, (4) can be rewritten as

dx2 = − [ (∂g/∂x1) / (∂g/∂x2) ] (x1*, x2*) dx1
(5)

which indicates that once the variation along x1 (dx1) is chosen arbitrarily, the variation along x2 (dx2) is decided automatically to satisfy the condition for an admissible variation. Substituting equation (5) in (1) we have:

df = [ ∂f/∂x1 − (∂g/∂x1)/(∂g/∂x2) · ∂f/∂x2 ] dx1 = 0 at (x1*, x2*)
(6)

The expression on the left hand side is called the constrained variation of f. Equation (6) has to be satisfied for all dx1; hence we have

(∂f/∂x1)(∂g/∂x2) − (∂f/∂x2)(∂g/∂x1) = 0 at (x1*, x2*)
(7)

This gives us the necessary condition to have [x1*, x2*] as an extreme point (maximum or minimum).
Solution by method of Lagrange multipliers
Continuing with the same specific case of the optimization problem with n = 2 and m = 1, we define a quantity λ, called the Lagrange multiplier, as

λ = − (∂f/∂x2) / (∂g/∂x2) at (x1*, x2*)
(8)

In terms of λ, equation (6) can be expressed as

∂f/∂x1 + λ ∂g/∂x1 = 0 at (x1*, x2*)
(9)

∂f/∂x2 + λ ∂g/∂x2 = 0 at (x1*, x2*)
(10)

and, in addition, the constraint itself must be satisfied:

g(x1*, x2*) = 0
(11)
Hence equations (9) to (11) represent the necessary conditions for the point [x1*, x2*] to be
an extreme point.
Note that λ could be expressed in terms of ∂g/∂x1 as well, in which case ∂g/∂x1 has to be non-zero.
Thus, these necessary conditions require that at least one of the partial derivatives of g(x1, x2)
be non-zero at an extreme point.
The conditions given by equations (9) to (11) can also be generated by constructing a
function L, known as the Lagrangian function, as
L(x1, x2, λ) = f(x1, x2) + λ g(x1, x2)
(12)
Alternatively, treating L as a function of x1, x2 and λ, the necessary conditions for its extremum are given by

∂L/∂x1 (x1, x2, λ) = ∂f/∂x1 (x1, x2) + λ ∂g/∂x1 (x1, x2) = 0
∂L/∂x2 (x1, x2, λ) = ∂f/∂x2 (x1, x2) + λ ∂g/∂x2 (x1, x2) = 0
∂L/∂λ (x1, x2, λ) = g(x1, x2) = 0
(13)
The necessary and sufficient conditions for a general problem are discussed next.
Necessary conditions for a general problem
For a general problem with n variables and m equality constraints the problem is defined as
shown earlier
Maximize (or minimize) f(X), subject to gj(X) = 0, j = 1, 2, …, m
where
X = [x1, x2, …, xn]ᵀ
In this case the Lagrange function, L, will have one Lagrange multiplier λj for each constraint gj(X):

L(x1, x2, …, xn, λ1, λ2, …, λm) = f(X) + λ1 g1(X) + λ2 g2(X) + … + λm gm(X)
(14)

L is now a function of the n + m unknowns x1, …, xn, λ1, …, λm, and the necessary conditions for its extremum are

∂L/∂xi = ∂f/∂xi + Σⱼ λj ∂gj/∂xi = 0,  i = 1, 2, …, n
∂L/∂λj = gj(X) = 0,  j = 1, 2, …, m
(15)

Solving these n + m equations gives

X* = [x1*, x2*, …, xn*]ᵀ  and  λ* = [λ1*, λ2*, …, λm*]ᵀ
(16)

The vector X* corresponds to the relative constrained minimum of f(X) (subject to the verification of sufficient conditions).
Sufficient conditions for a general problem
A sufficient condition for f(X) to have a relative minimum at X* is that each root of the polynomial in z, defined by the following determinant equation, be positive:

| L11 − z    L12     …    L1n    g11   g21   …   gm1 |
| L21      L22 − z   …    L2n    g12   g22   …   gm2 |
|  ⋮          ⋮      ⋱     ⋮      ⋮     ⋮         ⋮   |
| Ln1       Ln2      …  Lnn − z  g1n   g2n   …   gmn |  =  0
| g11       g12      …    g1n     0     0    …    0  |
| g21       g22      …    g2n     0     0    …    0  |
|  ⋮          ⋮            ⋮      ⋮     ⋮         ⋮   |
| gm1       gm2      …    gmn     0     0    …    0  |
(17)

where

Lij = ∂²L/∂xi∂xj (X*, λ*),  for i, j = 1, 2, …, n
gpq = ∂gp/∂xq (X*),  for p = 1, 2, …, m and q = 1, 2, …, n
(18)
Similarly, a sufficient condition for f(X) to have a relative maximum at X* is that each root of the polynomial in z, defined by equation (17), be negative. If equation (17), on solving, yields roots some of which are positive and others negative, then the point X* is neither a maximum nor a minimum.
Example
Minimize
f(X) = −3x1² − 6x1x2 − 5x2² + 7x1 + 5x2
subject to x1 + x2 = 5

Solution
g1(X) = x1 + x2 − 5 = 0

With n = 2 and m = 1, the Lagrangian L = f(X) + λ1 g1(X) is

L = −3x1² − 6x1x2 − 5x2² + 7x1 + 5x2 + λ1 (x1 + x2 − 5)
∂L/∂x1 = −6x1 − 6x2 + 7 + λ1 = 0

⇒ x1 + x2 = (7 + λ1)/6
⇒ 5 = (7 + λ1)/6
or λ1 = 23

∂L/∂x2 = −6x1 − 10x2 + 5 + λ1 = 0

⇒ 3x1 + 5x2 = (5 + λ1)/2
⇒ 3(x1 + x2) + 2x2 = (5 + λ1)/2
and, with x1 + x2 = 5 and λ1 = 23:

x2 = −1/2,  x1 = 11/2

Hence X* = [11/2, −1/2] and λ* = [23].
Applying the sufficiency condition (17) with n = 2 and m = 1:

| L11 − z    L12    g11 |
| L21      L22 − z  g21 |  =  0
| g11       g12      0  |

where

L11 = ∂²L/∂x1² (X*, λ*) = −6
L12 = L21 = ∂²L/∂x1∂x2 (X*, λ*) = −6
L22 = ∂²L/∂x2² (X*, λ*) = −10
g11 = ∂g1/∂x1 (X*, λ*) = 1
g12 = g21 = ∂g1/∂x2 (X*, λ*) = 1

i.e.

| −6 − z    −6     1 |
| −6     −10 − z   1 |  =  0
|  1        1      0 |

Expanding the determinant gives 2z + 4 = 0, or z = −2. Since the root is negative, X* = [11/2, −1/2] corresponds to a relative maximum of f(X).
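Since the necessary conditions ∂L/∂x1 = ∂L/∂x2 = ∂L/∂λ = 0 are linear here, they can be solved as a 3 × 3 system; a sketch using the coefficients derived above:

```python
# Solve the Lagrange necessary conditions of the example as a linear system:
#   -6x1 -  6x2 + lam = -7    (dL/dx1 = 0)
#   -6x1 - 10x2 + lam = -5    (dL/dx2 = 0)
#     x1 +   x2       =  5    (constraint g1 = 0)
import numpy as np

A = np.array([[-6.0, -6.0, 1.0],
              [-6.0, -10.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([-7.0, -5.0, 5.0])
x1, x2, lam = np.linalg.solve(A, rhs)
print(x1, x2, lam)  # ~ (5.5, -0.5, 23.0)
```

This reproduces X* = [11/2, −1/2] and λ* = 23.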
Kuhn-Tucker Conditions
Introduction
In the previous lecture, the optimization of functions of multiple variables subject to equality constraints was dealt with, using the method of constrained variation and the method of Lagrange multipliers. In this lecture the Kuhn-Tucker conditions will be discussed, with examples, for a point to be a local optimum of a function subject to inequality constraints.
Kuhn-Tucker Conditions
It was previously established that for both an unconstrained optimization problem and an
optimization problem with an equality constraint the first-order conditions are sufficient for a
global optimum when the objective and constraint functions satisfy appropriate
concavity/convexity conditions. The same is true for an optimization problem with inequality
constraints.
The Kuhn-Tucker conditions are both necessary and sufficient if the objective function is
concave and each constraint is linear or each constraint function is concave, i.e. the problems
belong to a class called the convex programming problems.
Consider the following optimization problem:
Minimize f(X) subject to gj(X) ≤ 0 for j = 1, 2, …, m; where X = [x1, x2, …, xn].
Then the Kuhn-Tucker conditions for X* = [x1*, x2*, …, xn*] to be a local minimum are

∂f/∂xi + Σⱼ λj ∂gj/∂xi = 0,  i = 1, 2, …, n
λj gj = 0,  j = 1, 2, …, m
gj ≤ 0,  j = 1, 2, …, m
λj ≥ 0,  j = 1, 2, …, m
(1)
Example 1
Minimize f = x1² + 2x2² + 3x3² subject to the constraints
g1 = x1 − x2 − 2x3 − 12 ≤ 0
g2 = x1 + 2x2 − 3x3 − 8 ≤ 0
using the Kuhn-Tucker conditions.

Solution
The Kuhn-Tucker conditions are given by

a) ∂f/∂xi + λ1 ∂g1/∂xi + λ2 ∂g2/∂xi = 0

i.e.
2x1 + λ1 + λ2 = 0
(2)
4x2 − λ1 + 2λ2 = 0
(3)
6x3 − 2λ1 − 3λ2 = 0
(4)

b) λj gj = 0

i.e.
λ1 (x1 − x2 − 2x3 − 12) = 0
(5)
λ2 (x1 + 2x2 − 3x3 − 8) = 0
(6)

c) gj ≤ 0

i.e.
x1 − x2 − 2x3 − 12 ≤ 0
(7)
x1 + 2x2 − 3x3 − 8 ≤ 0
(8)

d) λj ≥ 0

i.e.
λ1 ≥ 0
(9)
λ2 ≥ 0
(10)

If both constraints are taken as active, equations (2) to (4) give x1 = −(λ1 + λ2)/2, x2 = (λ1 − 2λ2)/4 and x3 = (2λ1 + 3λ2)/6; substituting these into g1 = 0 gives 17λ1 + 12λ2 = −144. But conditions (9) and (10) give us λ1 ≥ 0 and λ2 ≥ 0 simultaneously, which cannot be possible with 17λ1 + 12λ2 = −144. Hence λ1 = λ2 = 0, and the solution set for this optimization problem is X* = [0 0 0].
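The boxed solution can be verified directly against conditions (1); a small NumPy check of stationarity, feasibility and complementary slackness at X* = [0, 0, 0]:

```python
# Verify X* = (0,0,0), lam = (0,0) against the Kuhn-Tucker conditions for
# min x1^2 + 2x2^2 + 3x3^2 s.t. g1 = x1 - x2 - 2x3 - 12 <= 0,
#                               g2 = x1 + 2x2 - 3x3 - 8  <= 0.
import numpy as np

x = np.zeros(3)
lam = np.zeros(2)
grad_f = np.array([2.0 * x[0], 4.0 * x[1], 6.0 * x[2]])
grad_g = np.array([[1.0, -1.0, -2.0],   # grad g1
                   [1.0, 2.0, -3.0]])   # grad g2
g = np.array([x[0] - x[1] - 2 * x[2] - 12.0, x[0] + 2 * x[1] - 3 * x[2] - 8.0])

print(grad_f + lam @ grad_g)  # stationarity residual: [0, 0, 0]
print(g <= 0, lam * g)        # feasibility and complementary slackness
```

All four condition groups hold, so X* = [0, 0, 0] satisfies the Kuhn-Tucker conditions.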
Example 2
Minimize f = x1² + x2² + 60x1 subject to the constraints
g1 = x1 − 80 ≥ 0
g2 = x1 + x2 − 120 ≥ 0
using the Kuhn-Tucker conditions.
Solution
The Kuhn-Tucker conditions are given by

a) ∂f/∂xi − λ1 ∂g1/∂xi − λ2 ∂g2/∂xi = 0

i.e.
2x1 + 60 − λ1 − λ2 = 0
(11)
2x2 − λ2 = 0
(12)

b) λj gj = 0

i.e.
λ1 (x1 − 80) = 0
(13)
λ2 (x1 + x2 − 120) = 0
(14)

c) gj ≥ 0

i.e.
x1 − 80 ≥ 0
(15)
x1 + x2 − 120 ≥ 0
(16)

d) λj ≥ 0

i.e.
λ1 ≥ 0
(17)
λ2 ≥ 0
(18)

From (13), either λ1 = 0 or x1 = 80. Considering the case x1 = 80, equations (11) and (12) give

λ2 = 2x2 and λ1 = 220 − 2x2
(19)

Substituting these in (14): 2x2 (x2 − 40) = 0.
For this to be true, either x2 = 0 or x2 − 40 = 0.
For x2 = 0: λ2 = 0 and λ1 = 220, but the point X = [80, 0] violates (16).
For x2 = 40: λ2 = 80 and λ1 = 140, and all the conditions (11) to (18) are satisfied. Hence the optimum solution is X* = [80, 40], with f* = 80² + 40² + 60(80) = 12800.
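The optimum above can be checked by substituting X* = [80, 40] with λ1 = 140, λ2 = 80 back into the conditions; a minimal NumPy sketch:

```python
# Check X* = (80, 40), lam = (140, 80) against the Kuhn-Tucker conditions of
# min x1^2 + x2^2 + 60x1 s.t. g1 = x1 - 80 >= 0, g2 = x1 + x2 - 120 >= 0.
import numpy as np

x = np.array([80.0, 40.0])
lam = np.array([140.0, 80.0])
grad_f = np.array([2 * x[0] + 60.0, 2 * x[1]])
grad_g = np.array([[1.0, 0.0],   # grad g1
                   [1.0, 1.0]])  # grad g2
g = np.array([x[0] - 80.0, x[0] + x[1] - 120.0])

assert np.allclose(grad_f, lam @ grad_g)    # grad f = sum_j lam_j grad g_j
assert np.all(g >= 0) and np.all(lam >= 0)  # feasibility and sign conditions
assert np.allclose(lam * g, 0.0)            # complementary slackness
print("Kuhn-Tucker conditions hold; f(X*) =", x[0] ** 2 + x[1] ** 2 + 60 * x[0])
```

The printed objective value is 12800, matching the hand calculation.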
Preliminaries
Introduction
Linear Programming (LP) is the most useful optimization technique used for the solution of
engineering problems. The term linear implies that the objective function and constraints
are linear functions of nonnegative decision variables. Thus, the conditions of LP
problems (LPP) are
1. Objective function must be a linear function of decision variables
2. Constraints should be linear function of decision variables
3. All the decision variables must be nonnegative
For example,

Maximize Z = 6x + 5y    (Objective Function)
subject to
2x − 3y ≤ 5    (1st Constraint)
x + 3y ≤ 11    (2nd Constraint)
4x + y ≤ 15    (3rd Constraint)
x, y ≥ 0    (Nonnegativity Condition)
As another example, consider the following problem:

Minimize Z = 3x1 − 5x2
subject to
2x1 − 3x2 ≤ 15
x1 + x2 ≤ 3
4x1 + x2 ≥ 2
x1 ≥ 0, x2 unrestricted
To express the problem in standard form, the objective function is converted to a maximization: Z′ = −Z = −3x1 + 5x2.
The first constraint can be rewritten as 2x1 − 3x2 + x3 = 15. Note that a new nonnegative variable x3 is added to the left-hand side (LHS) to make both sides equal. Similarly, the second constraint can be rewritten as x1 + x2 + x4 = 3. The variables x3 and x4 are known as slack variables. The third constraint can be rewritten as 4x1 + x2 − x5 = 2. Again, note that a new nonnegative variable x5 is subtracted from the LHS to make both sides equal. The variable x5 is known as a surplus variable.
The unrestricted decision variable x2 can be expressed by introducing two extra nonnegative variables as x2 = x2′ − x2″. Thus x2 can be negative if x2′ < x2″, positive if x2′ > x2″, and zero if x2′ = x2″. The standard form of the problem is then

Maximize Z′ = −Z = −3x1 + 5(x2′ − x2″)
subject to
2x1 − 3(x2′ − x2″) + x3 = 15
x1 + (x2′ − x2″) + x4 = 3
4x1 + (x2′ − x2″) − x5 = 2
x1, x2′, x2″, x3, x4, x5 ≥ 0

After obtaining the solution for x2′ and x2″, the solution for x2 can be obtained as x2 = x2′ − x2″.
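The handling of the unrestricted variable can be sanity-checked with an off-the-shelf LP solver, which accepts a free variable directly through its bounds (a sketch assuming SciPy is available):

```python
# Solve min Z = 3x1 - 5x2 with x1 >= 0 and x2 unrestricted, writing every
# constraint in the <= form expected by scipy.optimize.linprog.
from scipy.optimize import linprog

c = [3.0, -5.0]
A_ub = [[2.0, -3.0],    # 2x1 - 3x2 <= 15
        [1.0, 1.0],     #  x1 +  x2 <= 3
        [-4.0, -1.0]]   # 4x1 +  x2 >= 2  ->  -4x1 - x2 <= -2
b_ub = [15.0, 3.0, -2.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (None, None)])
print(res.x, res.fun)  # ~ [0, 3], -15
```

The same optimum is obtained whether x2 is split into x2′ − x2″ by hand or declared free via `bounds`.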
Canonical form of standard LPP is a set of equations consisting of the objective function
and all the equality constraints (standard form of LPP) expressed in canonical form.
Understanding the canonical form of LPP is necessary for studying simplex method, the most
popular method of solving LPP. Simplex method will be discussed in some other class. In this
class, canonical form of a set of linear equations will be discussed first. Canonical form of
LPP will be discussed next.
Canonical form of a set of linear equations
Let us consider a set of three equations with three variables for ease of discussion. Later, the
method will be generalized.
Let us consider the following set of equations,
3x + 2y + z = 10    (A0)
x − 2y + 3z = 6    (B0)
2x + y − z = 1    (C0)
The system of equations can be transformed in such a way that a new set of three different
equations are obtained, each having only one variable with nonzero coefficient. This can be
achieved by some elementary operations.
The following operations are known as elementary operations.
1. Any equation Er can be replaced by kEr, where k is a nonzero constant.
2. Any equation Er can be replaced by Er + kEs, where Es is another equation of the
system and k is as defined above.
Note that the transformed set of equations through elementary operations is equivalent to the
original set of equations. Thus, solution of the transformed set of equations will be the
solution of the original set of equations too.
Now, let us transform the above set of equations (A0, B0 and C0) through elementary operations (shown inside brackets on the right).

x + (2/3)y + (1/3)z = 10/3    (A1 = (1/3) A0)
0 − (8/3)y + (8/3)z = 8/3    (B1 = B0 − A1)
0 − (1/3)y − (5/3)z = −17/3    (C1 = C0 − 2 A1)

Note that variable x is eliminated from equations B0 and C0 to obtain B1 and C1 respectively. Equation A0 in the previous set is known as the pivotal equation.
Following a similar procedure, y is eliminated from A1 and C1 as follows, considering B1 as the pivotal equation:

x + 0 + z = 4    (A2 = A1 − (2/3) B2)
0 + y − z = −1    (B2 = −(3/8) B1)
0 + 0 − 2z = −6    (C2 = C1 + (1/3) B2)

Finally, z is eliminated from A2 and B2, considering C2 as the pivotal equation:

x + 0 + 0 = 1    (A3 = A2 − C3)
0 + y + 0 = 2    (B3 = B2 + C3)
0 + 0 + z = 3    (C3 = −(1/2) C2)

Thus we end up with another set of equations, equivalent to the original set, having only one variable with a nonzero coefficient in each equation. The transformed set of equations (A3, B3 and C3) is said to be in canonical form. The operation at each step, which eliminates one variable at a time from all equations except one, is known as a pivotal operation. It is obvious that the number of pivotal operations is the same as the number of variables in the set of equations; thus three pivotal operations were needed to obtain the canonical form of this set of equations having three variables.
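The three pivotal operations above amount to Gauss-Jordan elimination, which can be sketched in a few lines (assuming nonzero pivots, as in this example):

```python
# Gauss-Jordan reduction of 3x+2y+z=10, x-2y+3z=6, 2x+y-z=1 to canonical form.
import numpy as np

M = np.array([[3.0, 2.0, 1.0, 10.0],
              [1.0, -2.0, 3.0, 6.0],
              [2.0, 1.0, -1.0, 1.0]])
for j in range(3):                       # one pivotal operation per variable
    M[j] /= M[j, j]                      # transform the pivotal row first
    for k in range(3):
        if k != j:
            M[k] -= M[k, j] * M[j]       # eliminate variable j from the other rows
print(M[:, 3])  # canonical right-hand side, i.e. the solution x=1, y=2, z=3
```

After the loop, the coefficient part of M is the identity matrix and the last column reads off the solution, exactly as in equations (A3, B3, C3).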
It may be noted that, at each pivotal operation, the pivotal equation is transformed first, and then the other equations in the system are transformed using the transformed pivotal equation. The procedure can be generalized to a system of n equations in n variables:

a11 x1 + a12 x2 + … + a1n xn = b1    (E1)
a21 x1 + a22 x2 + … + a2n xn = b2    (E2)
⋮
an1 x1 + an2 x2 + … + ann xn = bn    (En)
The general procedure for one pivotal operation, with xi as the pivotal variable and Ej as the pivotal equation, consists of the following two steps:
1. Divide the pivotal equation Ej by the coefficient aji of the pivotal variable, i.e., replace Ej by Ej / aji.
2. Eliminate xi from every other equation Ek (k = 1, 2, …, j − 1, j + 1, …, n) by replacing Ek with Ek − aki (Ej / aji).
Above steps are repeated for all the variables in the system of equations to obtain the
canonical form. Finally the canonical form will be as follows:
1·x1 + 0·x2 + … + 0·xn = b1″    (E1c)
0·x1 + 1·x2 + … + 0·xn = b2″    (E2c)
⋮
0·x1 + 0·x2 + … + 1·xn = bn″    (Enc)

It is obvious that the solution of the system of equations can be easily obtained from the canonical form as

xi = bi″

which is the solution of the original set of equations too, as the canonical form is obtained through elementary operations.
Now let us consider the more general case, in which the system has m equations with n variables (n ≥ m). It is possible to transform the set of equations to an equivalent canonical form from which at least one solution can be easily deduced.

a11 x1 + a12 x2 + … + a1n xn = b1    (E1)
a21 x1 + a22 x2 + … + a2n xn = b2    (E2)
⋮
am1 x1 + am2 x2 + … + amn xn = bm    (Em)

Performing pivotal operations with respect to any set of m variables, say x1, x2, …, xm, gives the canonical form

1·x1 + 0·x2 + … + 0·xm + a″1,m+1 xm+1 + … + a″1n xn = b″1    (E1c)
0·x1 + 1·x2 + … + 0·xm + a″2,m+1 xm+1 + … + a″2n xn = b″2    (E2c)
⋮
0·x1 + 0·x2 + … + 1·xm + a″m,m+1 xm+1 + … + a″mn xn = b″m    (Emc)

The variables x1, …, xm are known as basic variables and the remaining variables xm+1, …, xn as nonbasic variables; setting all the nonbasic variables to zero gives xi = b″i (i = 1, …, m), which is known as a basic solution.
A similar procedure can be followed in the case of the standard form of an LPP. The objective function and all the constraints of the standard form together constitute a linear set of equations, in general m equations with n variables (n ≥ m). The canonical form obtained from this set of equations is known as the canonical form of the LPP.
If a basic solution satisfies all the constraints as well as the non-negativity criterion for all the variables, it is known as a basic feasible solution. It is obvious that there can be up to nCm different canonical forms and corresponding basic solutions. Thus, if there are 10 equations with 15 variables, there exist as many as 15C10 = 3003 basic solutions to be inspected one by one to find the optimal solution. This is the reason which motivates an efficient algorithm for the solution of the LPP. The simplex method is one such popular method, which will be discussed after the graphical method.
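The count of candidate basic solutions quoted above is just the binomial coefficient:

```python
# Number of ways to choose 10 basic variables out of 15, i.e. 15C10.
from math import comb

print(comb(15, 10))  # 3003
```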
Graphical Method
Graphical method to solve Linear Programming problem (LPP) helps to visualize the
procedure explicitly. It also helps to understand the different terminologies associated with
the solution of LPP. In this class, these aspects will be discussed with the help of an example.
However, this visualization is possible for a maximum of two decision variables; thus an LPP with two decision variables is opted for discussion. The basic principle remains the same for more than two decision variables as well, even though visualization beyond the two-dimensional case is not easy.
Let us consider the same LPP (general form) discussed in previous class, stated here once
again for convenience.
Maximize Z = 6x + 5y
subject to
2x − 3y ≤ 5    (C-1)
x + 3y ≤ 11    (C-2)
4x + y ≤ 15    (C-3)
x, y ≥ 0    (C-4) & (C-5)

The first step in solving the above LPP by the graphical method is to plot the inequality constraints one by one on a graph paper. Fig. 1a shows one such plotted constraint.
Fig. 1a: the half-plane satisfying the constraint 2x − 3y ≤ 5.

Plotting the remaining constraints, x + 3y ≤ 11, 4x + y ≤ 15, x ≥ 0 and y ≥ 0, in the same way, the region satisfying all of them simultaneously is the feasible region (Fig. 1b).
The objective function is represented by a family of parallel straight lines, 6x + 5y = k, i.e., y = −(6/5) x + k/5, with slope m = −6/5 and y-intercept c = k/5. For example, if the intercept on the y axis is 3, then k/5 = 3, i.e., k = 15. As k increases, this Z line moves farther from the origin; the last point of the feasible region that it touches is the optimal point (Fig. 1c).
A linear programming problem may have (i) a unique, finite solution, (ii) an unbounded solution, (iii) multiple (or an infinite number of) optimal solutions, (iv) an infeasible solution, or (v) a unique feasible point. In the context of the graphical method it is easy to visually demonstrate the different situations which may result in these different types of solutions.
Unique, finite solution
The example demonstrated above is an example of LPP having a unique, finite solution. In
such cases, optimum value occurs at an extreme point or vertex of the feasible region.
Unbounded solution
If the feasible region is not bounded, it is possible that the value of the objective function
goes on increasing without leaving the feasible region. This is known as unbounded solution
(Fig 2).
Fig. 2
Multiple (or infinite) optimal solutions
If the Z line is parallel to any side of the feasible region, all the points lying on that side constitute optimal solutions, as shown in Fig. 3.

Fig. 3
Infeasible solution
Sometimes the set of constraints does not form a feasible region at all, due to inconsistency among the constraints. In such a situation the LPP is said to have an infeasible solution. Fig. 4 illustrates such a situation.

Fig. 4
Unique feasible point
This situation arises when the feasible region consists of a single point, which may occur only when the number of constraints is at least equal to the number of decision variables. An example is shown in Fig. 5. In this case there is no need for optimization, as there is only one feasible solution.

Fig. 5
Simplex Method - I
Introduction
It is already stated in a previous lecture that the most popular method used for the solution of
Linear Programming Problems (LPP) is the simplex method. In this lecture, motivation for
simplex method will be discussed first. Simplex algorithm and construction of simplex
tableau will be discussed later with an example problem.
Motivation for Simplex method
Recall from the second class that the optimal solution of an LPP, if it exists, lies at one of the vertices of the feasible region. Thus one way to find the optimal solution is to find all the basic feasible solutions of the canonical form and investigate them one by one to get at the optimal. However, recall from the example at the end of the first class that, for 10 equations with 15 variables, there exists a huge number (15C10 = 3003) of basic feasible solutions. In such a case, inspecting all the solutions one by one is not practically feasible. This can be overcome by the simplex method. The conceptual principle of this method can be easily understood for a three-dimensional case (the simplex method is applicable to any higher dimensional case as well).
Imagine a feasible region (i.e., volume) bounded by several surfaces. Each vertex of this
volume, which is a basic feasible solution, is connected to three other adjacent vertices by a
straight line to each being the intersection of two surfaces. Being at any one vertex (one of
the basic feasible solutions), simplex algorithm helps to move to another adjacent vertex
which is closest to the optimal solution among all the adjacent vertices. Thus, it follows the
shortest route to reach the optimal solution from the starting point. It can be noted that the
shortest route consists of a sequence of basic feasible solutions which is generated by simplex
algorithm. The basic concept of simplex algorithm for a 3-D case is shown in Fig 1.
Fig 1.
The general procedure of simplex method is as follows:
1. General form of given LPP is transformed to its canonical form (refer Lecture note 1).
2. A basic feasible solution of the LPP is found from the canonical form (there should
exist at least one).
3. This initial solution is moved to an adjacent basic feasible solution which is closest to
the optimal solution among all other adjacent basic feasible solutions.
4. The procedure is repeated until the optimum solution is achieved.
Step three involves simplex algorithm which is discussed in the next section.
Simplex algorithm
Simplex algorithm is discussed using an example of LPP. Let us consider the following
problem.
Maximize
Z = 4x1 − x2 + 2x3
subject to
2x1 + x2 + 2x3 ≤ 6
x1 − 4x2 + 2x3 ≤ 0
5x1 − 2x2 − 2x3 ≤ 4
x1, x2, x3 ≥ 0
In standard form, with slack variables added, the problem becomes

Maximize Z = 4x1 − x2 + 2x3
subject to
2x1 + x2 + 2x3 + x4 = 6
x1 − 4x2 + 2x3 + x5 = 0
5x1 − 2x2 − 2x3 + x6 = 4
x1, x2, x3, x4, x5, x6 ≥ 0

It can be recalled that x4, x5 and x6 are slack variables. The above set of equations, including the objective function, can be transformed to canonical form as follows:

(Z)   −4x1 + x2 − 2x3 + 0x4 + 0x5 + 0x6 + Z = 0
(x4)   2x1 + x2 + 2x3 + 1x4 + 0x5 + 0x6 = 6
(x5)   x1 − 4x2 + 2x3 + 0x4 + 1x5 + 0x6 = 0
(x6)   5x1 − 2x2 − 2x3 + 0x4 + 0x5 + 1x6 = 4

Here x4, x5 and x6 are the basic variables, while x1, x2 and x3 are known as the nonbasic variables of the canonical form shown above. For ease of discussion, each equation is labeled on the left with its basic variable (or with Z).
The right-hand-side constants and the coefficients of the variables are symbolized as follows:

(Z)   c1 x1 + c2 x2 + c3 x3 + c4 x4 + c5 x5 + c6 x6 + Z = b
(x4)  c41 x1 + c42 x2 + c43 x3 + c44 x4 + c45 x5 + c46 x6 = b4
(x5)  c51 x1 + c52 x2 + c53 x3 + c54 x4 + c55 x5 + c56 x6 = b5
(x6)  c61 x1 + c62 x2 + c63 x3 + c64 x4 + c65 x5 + c66 x6 = b6

A nonbasic variable xs with a negative cost coefficient cs in the Z row is selected to enter the basis. The variable xr to be exited from the basis is chosen such that the ratio br / crs is minimum over all possible r, provided crs is positive.
Here c1 = −4 is negative, so x1 enters the basis. The ratios are

b4/c41 = 6/2 = 3,  b5/c51 = 0/1 = 0,  b6/c61 = 4/5 = 0.8

As b5/c51 is minimum, r is 5. Thus x5 is to be exited, c51 is the pivotal element, and x5 is replaced by x1 in the basis. The set of equations is transformed through a pivotal operation to another canonical form, considering c51 as the pivotal element. The procedure of the pivotal operation was explained in the first class; as a refresher, it is explained here once again.

1. The pivotal row is transformed by dividing it by the pivotal element. In this case, the pivotal element is 1.
2. For the other rows: let the coefficient of the element in the pivotal column of a particular row be l, and let the pivotal element be m. Then the pivotal row is multiplied by l/m and subtracted from the row to be transformed. This operation ensures that the coefficient of the element in the pivotal column of that row becomes zero. E.g., for the Z row: l = −4, m = 1, so the pivotal row is multiplied by l/m = −4/1 = −4, obtaining

−4x1 + 16x2 − 8x3 + 0x4 − 4x5 + 0x6 = 0

This is subtracted from the Z row, obtaining

0x1 − 15x2 + 6x3 + 0x4 + 4x5 + 0x6 + Z = 0

The other rows are transformed similarly, giving the next canonical form:

(Z)   0x1 − 15x2 + 6x3 + 0x4 + 4x5 + 0x6 + Z = 0
(x4)  0x1 + 9x2 − 2x3 + 1x4 − 2x5 + 0x6 = 6
(x1)  1x1 − 4x2 + 2x3 + 0x4 + 1x5 + 0x6 = 0
(x6)  0x1 + 18x2 − 12x3 + 0x4 − 5x5 + 1x6 = 4
Now c2 = −15 is negative, so x2 enters the basis; r may take any value from 4, 1 and 6. However, c12 (= −4) is negative, so r may be either 4 or 6. It is found that b4/c42 = 6/9 = 0.667 and b6/c62 = 4/18 = 0.222. As b6/c62 is minimum, r is 6 and x6 is to be exited from the basis. c62 (= 18) is to be treated as the pivotal element. The canonical form for the next iteration is as follows:
next iteration is as follows:
(Z )
0 x1 + 0 x 2 4 x3 + 0 x 4
(x4 )
0 x1 + 0 x 2 + 4 x3 + 1x 4 +
(x1 )
1x1 + 0 x 2
(x2 )
0 x1 + 1x 2
1
1
x5
x6
2
2
10
3
=4
2
1
2
x3 + 0 x 4
x5 +
x6
3
9
9
8
9
2
5
1
x3 + 0 x 4
x5 +
x6
3
18
18
2
9
1
5
x5 +
x6 + Z
6
6
8
2
, x 2 = , x 4 = 4 , x 2 = x3 = x5 = 0 and
9
9
10
.
3
It is observed that c3 (= −4) is negative; thus the optimum is not yet achieved. Following a similar procedure as above, it is decided that x3 should enter the basis and x4 should exit the basis. Thus x4 is replaced by x3 in the basis, and the set of equations is transformed to another canonical form considering c43 (= 4) as the pivotal element. By doing so, the canonical form becomes:
(Z)   0x1 + 0x2 + 0x3 + 1x4 + (1/3)x5 + (1/3)x6 + Z = 22/3
(x3)  0x1 + 0x2 + 1x3 + (1/4)x4 + (1/8)x5 − (1/8)x6 = 1
(x1)  1x1 + 0x2 + 0x3 + (1/6)x4 − (1/36)x5 + (5/36)x6 = 14/9
(x2)  0x1 + 1x2 + 0x3 + (1/6)x4 − (7/36)x5 − (1/36)x6 = 8/9
The corresponding basic solution is x1 = 14/9, x2 = 8/9, x3 = 1, x4 = x5 = x6 = 0, and Z = 22/3.
It is observed that all the cost coefficients are now nonnegative; thus the optimum is achieved. Hence the optimum solution is

Z = 22/3 = 7.333
x1 = 14/9 = 1.556
x2 = 8/9 = 0.889
x3 = 1
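The simplex result can be cross-checked with SciPy's LP solver (a sketch; linprog minimizes, so the objective is negated):

```python
# Cross-check: maximize Z = 4x1 - x2 + 2x3 subject to the three <= constraints.
from scipy.optimize import linprog

res = linprog(c=[-4.0, 1.0, -2.0],                # negated for maximization
              A_ub=[[2.0, 1.0, 2.0],
                    [1.0, -4.0, 2.0],
                    [5.0, -2.0, -2.0]],
              b_ub=[6.0, 0.0, 4.0])
print(res.x, -res.fun)  # ~ [1.556, 0.889, 1.0], 7.333
```

The solver reproduces Z = 22/3 at X = (14/9, 8/9, 1), the same vertex the tableau iterations reached.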
The calculation shown above can be presented in a tabular form, which is known as Simplex
Tableau. Construction of Simplex Tableau will be discussed next.
Construction of Simplex Tableau
Same LPP is considered for the construction of simplex tableau. This helps to compare the
calculation shown above and the construction of simplex tableau for it.
After preparing the canonical form of the given LPP, simplex tableau is constructed as
follows.
Iteration 1:

Basis | Z | x1   x2   x3 | x4  x5  x6 | br | br/crs
  Z   | 1 | −4    1   −2 |  0   0   0 |  0 | --
  x4  | 0 |  2    1    2 |  1   0   0 |  6 | 6/2 = 3
  x5  | 0 |  1   −4    2 |  0   1   0 |  0 | 0/1 = 0   ← pivotal row
  x6  | 0 |  5   −2   −2 |  0   0   1 |  4 | 4/5 = 0.8

The x1 column is the pivotal column and c51 = 1 is the pivotal element.
After completing each iteration, the steps given below are to be followed. Logically, these steps are exactly similar to the procedure described earlier; however, the steps described here are somewhat mechanical and easy to remember!
Check for optimum solution:
1. Investigate whether all the elements in the first row (i.e., Z row) are nonnegative or
not. Basically these elements are the coefficients of the variables headed by that
column. If all such coefficients are nonnegative, optimum solution is obtained and no
need of further iterations. If any element in this row is negative, the operation to
obtain simplex tableau for the next iteration is as follows:
Iteration 2:

Basis | Z | x1    x2    x3 | x4  x5  x6 | br | br/crs
  Z   | 1 |  0   −15     6 |  0   4   0 |  0 | --
  x4  | 0 |  0     9    −2 |  1  −2   0 |  6 | 6/9 = 0.667
  x1  | 0 |  1    −4     2 |  0   1   0 |  0 | --
  x6  | 0 |  0    18   −12 |  0  −5   1 |  4 | 4/18 = 0.222   ← pivotal row

The x2 column is the pivotal column and c62 = 18 is the pivotal element.
Iteration 3:

Basis | Z | x1  x2    x3 | x4    x5     x6   |  br  | br/crs
  Z   | 1 |  0   0    −4 |  0   −1/6   5/6  | 10/3 | --
  x4  | 0 |  0   0     4 |  1    1/2  −1/2  |  4   | 4/4 = 1   ← pivotal row
  x1  | 0 |  1   0  −2/3 |  0   −1/9   2/9  | 8/9  | --
  x2  | 0 |  0   1  −2/3 |  0  −5/18  1/18  | 2/9  | --

The x3 column is the pivotal column and c43 = 4 is the pivotal element.

Iteration 4:

Basis | Z | x1  x2  x3 |  x4     x5     x6   |  br
  Z   | 1 |  0   0   0 |  1     1/3    1/3  | 22/3
  x3  | 0 |  0   0   1 | 1/4    1/8   −1/8  |  1
  x1  | 0 |  1   0   0 | 1/6  −1/36   5/36  | 14/9
  x2  | 0 |  0   1   0 | 1/6  −7/36  −1/36  | 8/9

All the coefficients in the Z row are now nonnegative, so the optimum solution is achieved. The br column gives the optimum value of Z = 22/3 = 7.333 together with the values of the basic variables, x3 = 1, x1 = 14/9 = 1.556 and x2 = 8/9 = 0.889; the nonbasic variables x4, x5 and x6 are zero.
It can be noted that at any iteration the following two points must be satisfied:
1. All the basic variables (other than Z) have a coefficient of zero in the Z row.
2. The coefficients of the basic variables in the other rows constitute a unit matrix.
If either of these points is violated at any iteration, it indicates a wrong calculation. However, the reverse is not true.
Simplex Method II
Introduction
In the previous lecture the simplex method was discussed, with the required transformation of the objective function and constraints. However, all the constraints there were inequalities of the less-than-or-equal-to (≤) type. Greater-than-or-equal-to (≥) and equality (=) constraints are also possible. In such cases a modified approach is followed, which will be discussed in this lecture. Different types of LPP solutions in the context of the simplex method will also be discussed. Finally, a discussion on minimization versus maximization will be presented.

Simplex method with greater-than-or-equal-to (≥) and equality (=) constraints
The LP problem with greater-than-or-equal-to (≥) and equality (=) constraints is transformed to its standard form in the following way.
1. One artificial variable is added to each of the greater-than-equal-to ( ) and equality
( = ) constraints to ensure an initial basic feasible solution.
2. Artificial variables are penalized in the objective function by introducing a large
negative (positive) coefficient M for maximization (minimization) problem.
3. Cost coefficients, which are supposed to be placed in the Z-row in the initial simplex
tableau, are transformed by pivotal operation considering the column of artificial
variable as pivotal column and the row of the artificial variable as pivotal row.
4. If there are more than one artificial variable, step 3 is repeated for all the artificial
variables one by one.
Let us consider the following LP problem:
Maximize Z = 3x1 + 5x2
subject to
x1 + x2 ≥ 2
x2 ≤ 6
3x1 + 2x2 = 18
x1, x2 ≥ 0
After incorporating the artificial variables, the above LP problem becomes as follows:
Maximize Z = 3x1 + 5x2 - Ma1 - Ma2
subject to
x1 + x2 - x3 + a1 = 2
x2 + x4 = 6
3x1 + 2x2 + a2 = 18
x1, x2 ≥ 0
where x3 is a surplus variable, x4 is a slack variable, and a1 and a2 are the artificial variables.
The cost coefficients in the objective function are modified considering the first constraint as follows:
Z - 3x1 - 5x2 + Ma1 + Ma2 = 0      (E1)
x1 + x2 - x3 + a1 = 2      (E2)  (pivotal row; the a1 column is the pivotal column)
The pivotal operation E1 - M × E2 eliminates a1 from the Z-row:
Z - (3 + M)x1 - (5 + M)x2 + Mx3 + 0a1 + Ma2 = -2M      (E3)
3x1 + 2x2 + a2 = 18      (E4)  (pivotal row; the a2 column is the pivotal column)
Obviously the pivotal operation is E3 - M × E4, which further modifies the cost coefficients as follows:
Z - (3 + 4M)x1 - (5 + 3M)x2 + Mx3 + 0a1 + 0a2 = -20M
The modified cost coefficients are to be used in the Z-row of the first simplex tableau.
Next, let us move to the construction of the simplex tableau. The pivotal column, pivotal row and pivotal element are marked (as in the last class) for ease of understanding.
[First simplex tableau, garbled in extraction: the initial basis is (a1, x4, a2), with Z-row coefficients -(3 + 4M) and -(5 + 3M) and right-hand side -20M.]
Note that, comparing -(3 + 4M) and -(5 + 3M), it is decided that -(3 + 4M) < -(5 + 3M), since M is a sufficiently large number; hence x1 is selected as the entering variable.
[Second simplex tableau, garbled in extraction: x1 replaces a1 in the basis.]
[Subsequent simplex tableaus, garbled in extraction: a2 leaves the basis next, and the iterations terminate with all Z-row coefficients nonnegative. The optimum solution is Z = 36 with x1 = 2 and x2 = 6.]
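The optimum of this example can be cross-checked with an off-the-shelf LP solver. The sketch below (not part of the original notes, which use hand tableaus) uses SciPy's `linprog`; since `linprog` minimizes, the maximization objective is negated.

```python
from scipy.optimize import linprog

# Maximize Z = 3*x1 + 5*x2  ->  minimize -Z
c = [-3.0, -5.0]
# x1 + x2 >= 2  is rewritten as  -x1 - x2 <= -2 ;  x2 <= 6 stays as-is
A_ub = [[-1.0, -1.0], [0.0, 1.0]]
b_ub = [-2.0, 6.0]
# 3*x1 + 2*x2 = 18
A_eq = [[3.0, 2.0]]
b_eq = [18.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 2)
print(res.x, -res.fun)  # x1 = 2, x2 = 6, Z = 36
```

The solver reaches the same optimum as the Big-M tableau iterations, without the artificial variables ever being visible to the user.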
Unbounded solution
If at any iteration no departing variable can be found corresponding to the entering variable, the value of the objective function can be increased indefinitely, i.e., the solution is unbounded.

Multiple (infinite) solutions
Consider the following LP problem:
Maximize Z = 3x1 + 2x2
subject to
x1 + x2 ≥ 2
x2 ≤ 6
3x1 + 2x2 = 18
x1, x2 ≥ 0
Curious readers may find that the only modification is that the coefficient of x2 in the objective function is changed from 5 to 2. Thus the slope of the objective function and that of the third constraint are now the same. It may be recalled from lecture notes 2 that if the Z line is parallel to any side of the feasible region (i.e., to one of the constraints), all the points lying on that side constitute optimal solutions (refer to fig. 3 in lecture notes 2). So the reader should be able to imagine graphically that the LPP has infinitely many solutions. However, for this particular set of constraints, making the objective function parallel (with equal slope) to either the first or the second constraint will not lead to multiple solutions. The reason is very simple and is left for the reader to find out; as a hint, plot all the constraints and the objective function on graph paper.
Now, let us see how this can be detected in the simplex tableau. Coming back to our problem, the final tableau is shown as follows; the full problem is left to the reader as practice.
[Final simplex tableau, garbled in extraction: the basis is (x1, x4, x3) with Z = 18.]
As there is no negative coefficient in the Z-row, the optimum is reached. The solution is Z = 18 with x1 = 6 and x2 = 0. However, the coefficient of the nonbasic variable x2 is zero in the final simplex tableau. So another solution is possible by bringing x2 into the basis. Based on the minimum ratio br/crs, x4 will be the exiting variable. The next tableau will be as follows:
[Next tableau, garbled in extraction: with x2 in the basis the solution is x1 = 2, x2 = 6, again with Z = 18.]
Thus we have two sets of solutions, (6, 0)^T and (2, 6)^T. Other optimal solutions are obtained as λ(6, 0)^T + (1 - λ)(2, 6)^T, where λ ∈ [0, 1]. For example, for λ = 0.4 the corresponding solution is (3.6, 3.6)^T, i.e., x1 = 3.6 and x2 = 3.6. Note that the value of the objective function is the same for all these solutions: Z = 18 in every case.
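The constancy of Z over the convex combinations of the two optimal vertices can be checked numerically; the short sketch below (an illustration added here, not from the notes) evaluates a few values of λ.

```python
import numpy as np

# Two optimal vertices of the multiple-optima example above
v1 = np.array([6.0, 0.0])
v2 = np.array([2.0, 6.0])
c = np.array([3.0, 2.0])  # objective Z = 3*x1 + 2*x2

for lam in (0.0, 0.4, 1.0):
    x = lam * v1 + (1 - lam) * v2
    print(lam, x, c @ x)  # Z stays 18 for every lambda in [0, 1]
```

For λ = 0.4 this reproduces the point (3.6, 3.6) with Z = 18, as stated above.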
Infeasible solution
If in the final tableau at least one of the artificial variables still remains in the basis, the problem has no feasible solution.
The reader may check this situation both graphically and in the context of the simplex method by considering the following problem:
Maximize Z = 3x1 + 2x2
subject to
x1 + x2 ≤ 2
3x1 + 2x2 ≥ 18
x1, x2 ≥ 0
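A solver reports the same conclusion directly. The sketch below (added here for illustration) feeds the infeasible problem to SciPy's `linprog`, whose status code 2 means "problem appears to be infeasible".

```python
from scipy.optimize import linprog

# Maximize 3*x1 + 2*x2 subject to x1 + x2 <= 2 and 3*x1 + 2*x2 >= 18
res = linprog(c=[-3.0, -2.0],
              A_ub=[[1.0, 1.0], [-3.0, -2.0]],  # >= row negated
              b_ub=[2.0, -18.0],
              bounds=[(0, None)] * 2)
print(res.status, res.message)  # status 2: infeasible
```

Graphically, x1 + x2 ≤ 2 forces 3x1 + 2x2 ≤ 6, which can never reach 18.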
Minimization versus maximization problems
As discussed earlier, the standard form of an LP problem consists of a maximizing objective function. The simplex method was described based on this standard form, i.e., with an objective function of maximization type. However, if the objective function is of minimization type, the simplex method may still be applied with a small modification, which can be done in either of the following two ways.
1. The objective function is multiplied by -1, which keeps the problem identical while turning the minimization problem into a maximization one. This is because minimizing a function is equivalent to maximizing its negative.
2. While selecting the entering nonbasic variable, the variable having the maximum (most positive) cost coefficient is entered. In such cases the optimum is reached when all the cost coefficients in the Z-row become nonpositive.
Revised Simplex Method
The LP problem in standard form, with objective function Z and decision variables x1, x2, …, xn, can be written as:
c1x1 + c2x2 + c3x3 + ⋯ + cnxn - Z = 0
c11x1 + c12x2 + c13x3 + ⋯ + c1nxn = b1
c21x1 + c22x2 + c23x3 + ⋯ + c2nxn = b2
⋮
cm1x1 + cm2x2 + cm3x3 + ⋯ + cmnxn = bm
As the revised simplex method is mostly beneficial for large LP problems, it will be discussed in the context of matrix notation. The matrix notation of the above LP problem can be expressed as follows:
Minimize z = C^T X
subject to: AX = B
with: X ≥ 0
where
X = [x1, x2, …, xn]^T, C = [c1, c2, …, cn]^T, B = [b1, b2, …, bm]^T, 0 = [0, 0, …, 0]^T,
and A is the m × n matrix of constraint coefficients,
A = [ c11 c12 … c1n ; c21 c22 … c2n ; … ; cm1 cm2 … cmn ].
It can be noted, for the subsequent discussion, that the column vector corresponding to a decision variable xk is [c1k, c2k, …, cmk]^T.
Let X_S be the column vector of basic variables, C_S the row vector of cost coefficients corresponding to X_S, and S the basis matrix corresponding to X_S.
1. Selection of entering variable
For each of the nonbasic variables, calculate the coefficient (WP - c), where P is the column vector associated with the nonbasic variable at hand, c is the cost coefficient associated with that nonbasic variable, and W = C_S S^-1.
For a maximization (minimization) problem, the nonbasic variable having the lowest negative (highest positive) coefficient, as calculated above, is the entering variable.
2. Selection of departing variable
The ratios U(i)/V(i), where U = S^-1 B and V = S^-1 P (P being the column vector of the entering variable), are calculated for all i with V(i) > 0. The index i = r for which the ratio is least is noted; the r-th basic variable of the current basis is the departing variable.
If it is found that V(i) ≤ 0 for all i, then further calculation is stopped, concluding that a bounded solution does not exist for the LP problem at hand.
3. Update of the basis
The new basis inverse is obtained as S_new^-1 = E S^-1, where E is an m × m identity matrix whose r-th column is replaced by the vector η defined by
η(i) = -V(i)/V(r) for i ≠ r, and η(r) = 1/V(r).
S^-1 is replaced by S_new^-1 and steps 1 through 3 are repeated. If all the coefficients calculated in step 1 satisfy the optimality condition, the current basis is optimal, and the basic solution and the objective value are obtained as X_S = S^-1 B and z = C_S X_S.
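The three steps can be sketched compactly in code. The function below (an illustration added here, not from the notes; it inverts the basis explicitly for clarity rather than using the eta update, and assumes a maximization problem in equality form with a known starting basis) follows the entering/departing rules stated above.

```python
import numpy as np

def revised_simplex_max(A, b, c, basis):
    """Sketch of the revised simplex method for: maximize c@x, A@x = b, x >= 0.
    `basis` holds the column indices of an initial feasible basis."""
    m, n = A.shape
    basis = list(basis)
    while True:
        S_inv = np.linalg.inv(A[:, basis])      # basis inverse (explicit, for clarity)
        W = c[basis] @ S_inv                    # multipliers W = C_S S^-1
        # Step 1: entering variable = lowest negative coefficient (W P - c)
        nonbasic = [j for j in range(n) if j not in basis]
        coeffs = {j: W @ A[:, j] - c[j] for j in nonbasic}
        k = min(coeffs, key=coeffs.get)
        if coeffs[k] >= 0:                      # optimality reached
            x = np.zeros(n)
            x[basis] = S_inv @ b
            return x, c @ x
        # Step 2: departing variable via the minimum ratio U(i)/V(i), V(i) > 0
        U = S_inv @ b
        V = S_inv @ A[:, k]
        if np.all(V <= 0):
            raise ValueError("unbounded problem")
        ratios = [U[i] / V[i] if V[i] > 0 else np.inf for i in range(m)]
        r = int(np.argmin(ratios))
        basis[r] = k                            # Step 3: update the basis

# Hypothetical example: maximize 3*x1 + 5*x2 with slacks x3, x4, x5 as initial basis
A = np.array([[1.0, 0, 1, 0, 0], [0, 2, 0, 1, 0], [3, 2, 0, 0, 1]])
b = np.array([4.0, 12, 18])
c = np.array([3.0, 5, 0, 0, 0])
x, z = revised_simplex_max(A, b, c, basis=[2, 3, 4])
print(x[:2], z)  # x1 = 2, x2 = 6, Z = 36
```

A production implementation would maintain S^-1 through the eta-matrix product E S^-1 instead of re-inverting, which is the whole point of the revised method on large problems.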
Duality of LP problems
Each LP problem (called the primal in this context) is associated with a counterpart known as the dual LP problem. Instead of the primal, solving the dual LP problem is sometimes easier when (a) the dual has fewer constraints than the primal (the time required for solving LP problems is directly affected by the number of constraints, i.e., the number of iterations necessary to converge to an optimum solution, which in the simplex method usually ranges from 1.5 to 3 times the number of structural constraints in the problem), and (b) the dual involves maximization of an objective function (it may then be possible to avoid the artificial variables that otherwise would be used in a primal minimization problem).
The dual LP problem can be constructed by defining a new decision variable for each constraint in the primal problem and a new constraint for each variable in the primal. The coefficient of the i-th variable in the dual's objective function is the i-th component of the primal's requirements vector (the right-hand-side values of the constraints in the primal). The dual's requirements vector consists of the coefficients of the decision variables in the primal objective function. The coefficients of each constraint in the dual (i.e., its row vectors) are the column vectors associated with the corresponding decision variables in the primal.
Primal | Dual
Maximization problem | Minimization problem
Minimization problem | Maximization problem
i-th variable | i-th constraint
j-th constraint | j-th variable
i-th variable ≥ 0 | i-th constraint of ≥ type (if dual is minimization) or ≤ type (if dual is maximization)
i-th variable unrestricted | i-th constraint of = type
j-th constraint of = type | j-th variable unrestricted
RHS of the j-th constraint | Cost coefficient of the j-th variable in the objective function
Cost coefficient of the i-th variable in the objective function | RHS of the i-th constraint
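For the most common case — a maximization primal with all constraints of ≤ type and nonnegative variables — the construction rules amount to transposing the coefficient matrix and swapping the cost and requirements vectors. The helper below is a sketch of that special case (a hypothetical function added here, not from the notes; other primal forms need the sign rules from the table above).

```python
import numpy as np

def build_dual(c, A, b):
    """Dual of: maximize c@x s.t. A@x <= b, x >= 0.
    Returned triple reads as: minimize b@y s.t. A.T@y >= c, y >= 0."""
    A = np.asarray(A, dtype=float)
    return np.asarray(b, dtype=float), A.T, np.asarray(c, dtype=float)

# One dual variable per primal constraint, one dual constraint per primal variable
obj, G, h = build_dual(c=[3, 5], A=[[1, 0], [0, 2], [3, 2]], b=[4, 12, 18])
print(obj)  # dual objective coefficients = primal right-hand sides
print(G)    # dual constraint rows = primal columns
```

Note how each row of `G` is a column of the primal `A`, exactly as stated in the construction rules.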
A pictorial representation is given below for better understanding and quick reference:
[Pictorial representation, garbled in extraction: the primal Maximize Z = c1x1 + c2x2 + ⋯ + cnxn, with its constraint coefficients, right-hand sides and constraint signs, maps row by row to the dual Minimize Z' = b1y1 + b2y2 + ⋯ + bmym. The coefficients and sign of the i-th primal constraint determine the dual variable yi, and the sign restrictions x1 ≥ 0, x2 unrestricted, …, xn ≥ 0 determine the types of the dual constraints.]
Example
Primal:
Maximize Z = 4x1 + 3x2
subject to
x1 + (2/3)x2 ≤ 6000
-x1 + x2 ≤ 2000
x1 ≤ 4000
x1 unrestricted, x2 ≥ 0
Dual:
Minimize Z' = 6000y1 + 2000y2 + 4000y3
subject to
y1 - y2 + y3 = 4
(2/3)y1 + y2 ≥ 3
y1 ≥ 0, y2 ≥ 0, y3 ≥ 0
Primal–Dual relationships
The following points are important to note regarding the primal–dual relationship:
1. If one problem (either primal or dual) has an optimal feasible solution, the other problem also has an optimal feasible solution, and the optimal objective function values of the primal and the dual are equal.
2. If one problem has no solution (is infeasible), the other problem is either infeasible or unbounded.
3. If one problem is unbounded, the other problem is infeasible.
Dual Simplex Method
4. Identification of the entering variable: For the columns having negative elements in the pivotal row, the ratios of the Z-row cost coefficients to the corresponding elements of the pivotal row are calculated, ignoring the sign of the elements of the pivotal row, i.e.,
ratio = | Cost coefficient / Element of pivotal row |.
The column corresponding to the minimum ratio is identified as the pivotal column, and the associated decision variable is the entering variable.
5. Pivotal operation: The pivotal operation is exactly the same as in the case of the simplex method, considering the pivotal element as the element at the intersection of the pivotal row and the pivotal column.
6. Check for optimality: If all the basic variables have nonnegative values, the optimum solution is reached. Otherwise, steps 3 to 5 are repeated until the optimum is reached.
Consider the following problem:
Minimize Z = 2x1 + x2
subject to
x1 ≥ 2
3x1 + 4x2 ≤ 24
4x1 + 3x2 ≥ 12
x1 + 2x2 ≥ 1
By introducing slack and surplus variables and multiplying the ≥ rows through by -1, the problem is reformulated with equality constraints in canonical form as
Minimize Z = 2x1 + x2
subject to
-x1 + x3 = -2
3x1 + 4x2 + x4 = 24
-4x1 - 3x2 + x5 = -12
-x1 - 2x2 + x6 = -1
[Dual simplex tableaus for the successive iterations, garbled in extraction. Starting from the all-slack basis (x3, x4, x5, x6), which is infeasible because of the negative right-hand sides, feasibility is restored step by step; the iterations reach the optimum Z = 16/3 with x1 = 2 and x2 = 4/3.]
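The optimum of the dual simplex example can be cross-checked with SciPy's `linprog` (a check added here, not part of the notes); the ≥ rows are negated so that everything is of ≤ type.

```python
from scipy.optimize import linprog

# Minimize 2*x1 + x2 subject to x1 >= 2, 3*x1 + 4*x2 <= 24,
# 4*x1 + 3*x2 >= 12, x1 + 2*x2 >= 1   (>= rows negated for linprog)
res = linprog(c=[2.0, 1.0],
              A_ub=[[-1, 0], [3, 4], [-4, -3], [-1, -2]],
              b_ub=[-2.0, 24.0, -12.0, -1.0],
              bounds=[(0, None)] * 2)
print(res.x, res.fun)  # x1 = 2, x2 = 4/3, Z = 16/3
```

At the optimum the binding constraints are x1 ≥ 2 and 4x1 + 3x2 ≥ 12.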
Example: reading the dual solution from the primal tableau
Primal:
Maximize Z = 4x1 - x2 + 2x3
subject to
2x1 + x2 + 2x3 ≤ 6
x1 - 4x2 + 2x3 ≤ 0
5x1 - 2x2 - 2x3 ≤ 4
x1, x2, x3 ≥ 0
Dual:
Minimize Z' = 6y1 + 0y2 + 4y3
subject to
2y1 + y2 + 5y3 ≥ 4
y1 - 4y2 - 2y3 ≥ -1
2y1 + 2y2 - 2y3 ≥ 2
y1, y2, y3 ≥ 0
[Final simplex tableau of the primal, garbled in extraction; the Z-row coefficients of the slack variables are the optimal dual values.]
As illustrated above, the solution of the dual can be read from the coefficients of the slack variables of the respective constraints in the Z-row of the primal's final tableau: y1 = 1, y2 = 1/3, y3 = 1/3, and Z = Z' = 22/3.
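Strong duality for this example can be verified by solving both problems independently (a check added here with SciPy, not from the notes); the ≥ rows of the dual are negated for `linprog`.

```python
from scipy.optimize import linprog

# Primal: maximize 4*x1 - x2 + 2*x3 (negated, since linprog minimizes)
primal = linprog(c=[-4.0, 1.0, -2.0],
                 A_ub=[[2, 1, 2], [1, -4, 2], [5, -2, -2]],
                 b_ub=[6.0, 0.0, 4.0],
                 bounds=[(0, None)] * 3)
# Dual: minimize 6*y1 + 0*y2 + 4*y3, >= constraints rewritten as <=
dual = linprog(c=[6.0, 0.0, 4.0],
               A_ub=[[-2, -1, -5], [-1, 4, 2], [-2, -2, 2]],
               b_ub=[-4.0, 1.0, -2.0],
               bounds=[(0, None)] * 3)
print(-primal.fun, dual.fun)  # both equal 22/3 = 7.333...
print(dual.x)                 # y = (1, 1/3, 1/3)
```

Both objective values coincide at 22/3, and the dual solution matches the slack-variable coefficients of the primal's final tableau.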
Sensitivity: the change in the optimal objective value due to a small change in the RHS of a constraint is ΔZ = yi Δbi, where yi is the dual variable associated with the i-th constraint, Δbi is the small change in the RHS of the i-th constraint, and ΔZ is the resulting change in the objective function.
Let, for an LP problem, the i-th constraint be 2x1 + x2 ≤ 50 and the optimum value of the objective function be 250. What if the RHS of the i-th constraint changes to 55, i.e., the constraint becomes 2x1 + x2 ≤ 55? To answer this question, let the dual variable associated with the i-th constraint be yi, whose optimum value is, say, 2.5. Thus Δbi = 55 - 50 = 5 and yi = 2.5, so ΔZ = yi Δbi = 2.5 × 5 = 12.5, and the revised optimum value of the objective function is 250 + 12.5 = 262.5.
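The shadow-price interpretation can be demonstrated numerically by re-solving an LP with a perturbed right-hand side and comparing ΔZ/Δb. The instance below is hypothetical (chosen for illustration; it is not the LP from the paragraph above).

```python
from scipy.optimize import linprog

def optimum(b3):
    # Hypothetical LP: maximize 3x + 5y s.t. x <= 4, 2y <= 12, 3x + 2y <= b3
    res = linprog(c=[-3.0, -5.0],
                  A_ub=[[1, 0], [0, 2], [3, 2]],
                  b_ub=[4.0, 12.0, b3],
                  bounds=[(0, None)] * 2)
    return -res.fun

z0, z1 = optimum(18.0), optimum(19.0)
shadow_price = z1 - z0          # deltaZ / deltab for deltab = 1
print(z0, z1, shadow_price)     # 36.0, 37.0, 1.0
```

The computed ratio equals the dual variable of the perturbed constraint, exactly as ΔZ = yi Δbi predicts (valid as long as the perturbation does not change the optimal basis).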
Karmarkar's Algorithm
Unlike the simplex algorithm, which moves from vertex to vertex along the boundary of the feasible region, Karmarkar's algorithm approaches the optimal solution point through the interior of the feasible region.
[Figure comparing the search paths of the simplex algorithm and Karmarkar's algorithm over the feasible region, garbled in extraction.]
Karmarkar's algorithm addresses LP problems in the following standard form:
Minimize Z = C^T X
subject to: AX = 0
1X = 1
with: X ≥ 0
where X = [x1, x2, …, xn]^T, C = [c1, c2, …, cn]^T, 1 = [1 1 ⋯ 1] (a 1 × n row vector), A is the m × n constraint matrix, and n ≥ 2. It is also assumed that X0 = [1/n, 1/n, …, 1/n]^T is a feasible solution and that Zmin = 0. Two other quantities are defined as
r = 1/√(n(n - 1))  and  α = (n - 1)/(3n).
Each iteration k (starting from k = 0) proceeds as follows.
a) Compute Cp = [I - P^T (P P^T)^-1 P] C̃^T, where
P = [A Dk ; 1], C̃ = C^T Dk, and Dk = diag(Xk(1), Xk(2), …, Xk(n)).
If Cp = 0, any feasible solution becomes an optimal solution and further iteration is not required. Otherwise, compute the following.
b) Ynew = X0 - α r Cp/‖Cp‖
c) Xk+1 = Dk Ynew / (1 Dk Ynew). It can be shown that for k = 0, Dk Ynew/(1 Dk Ynew) = Ynew; thus X1 = Ynew.
d) Z = C^T Xk+1
e) Repeat steps (a) through (d), changing k to k + 1.
Example:
Minimize Z = 2x2 - x3
subject to:
x1 - 2x2 + x3 = 0
x1 + x2 + x3 = 1
x1, x2, x3 ≥ 0
Thus n = 3, C = [0, 2, -1]^T, A = [1 -2 1], X0 = [1/3, 1/3, 1/3]^T,
r = 1/√(n(n - 1)) = 1/√6, and α = (n - 1)/(3n) = (3 - 1)/(3 × 3) = 2/9.
Iteration 0 (k = 0):
D0 = diag(1/3, 1/3, 1/3)
C̃ = C^T D0 = [0, 2/3, -1/3]
A D0 = [1/3, -2/3, 1/3], so P = [1/3 -2/3 1/3 ; 1 1 1]
P P^T = [2/3 0 ; 0 3], (P P^T)^-1 = [1.5 0 ; 0 1/3]
P^T (P P^T)^-1 P = [0.5 0 0.5 ; 0 1 0 ; 0.5 0 0.5]
Cp = [I - P^T (P P^T)^-1 P] C̃^T = [1/6, 0, -1/6]^T, ‖Cp‖ = √((1/6)² + 0 + (1/6)²) = √2/6
Ynew = X0 - α r Cp/‖Cp‖ = [0.2692, 0.3333, 0.3974]^T
X1 = Ynew = [0.2692, 0.3333, 0.3974]^T
Z = C^T X1 = 2(0.3333) - 0.3974 = 0.2692
Iteration 1 (k = 1):
D1 = diag(0.2692, 0.3333, 0.3974)
C̃ = C^T D1 = [0, 0.6667, -0.3974]
A D1 = [0.2692, -0.6666, 0.3974]
P P^T = [0.675 0 ; 0 3], (P P^T)^-1 = [1.482 0 ; 0 0.333]
Cp = [I - P^T (P P^T)^-1 P] C̃^T = [0.151, -0.018, -0.132]^T, ‖Cp‖ = 0.2014
Ynew = X0 - α r Cp/‖Cp‖ = [0.2653, 0.3414, 0.3928]^T
D1 Ynew = [0.0714, 0.1138, 0.1561]^T, 1 D1 Ynew = 0.3413
X2 = D1 Ynew / (1 D1 Ynew) = [0.2092, 0.3334, 0.4574]^T
Z = C^T X2 = 2(0.3334) - 0.4574 = 0.2094
So far, two successive iterations have been shown for the above problem. Similar iterations can be followed to get the final solution up to some predefined tolerance level.
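The two iterations above can be reproduced in a few lines of NumPy (an illustration added here, not from the notes; the step function follows formulas (a)–(c) directly).

```python
import numpy as np

def karmarkar_step(C, A, X, X0, alpha, r):
    """One iteration of Karmarkar's algorithm as described above."""
    n = len(X)
    D = np.diag(X)                              # D_k = diag(X_k)
    P = np.vstack([A @ D, np.ones(n)])          # P = [A D_k ; 1]
    Ct = C @ D                                  # C~ = C^T D_k
    proj = np.eye(n) - P.T @ np.linalg.inv(P @ P.T) @ P
    Cp = proj @ Ct                              # projected cost vector
    Y = X0 - alpha * r * Cp / np.linalg.norm(Cp)
    return D @ Y / (np.ones(n) @ D @ Y)

n = 3
C = np.array([0.0, 2.0, -1.0])                  # minimize Z = 2*x2 - x3
A = np.array([[1.0, -2.0, 1.0]])
X0 = np.full(n, 1.0 / n)
r = 1.0 / np.sqrt(n * (n - 1))
alpha = (n - 1) / (3.0 * n)

X1 = karmarkar_step(C, A, X0, X0, alpha, r)
X2 = karmarkar_step(C, A, X1, X0, alpha, r)
print(X1, C @ X1)  # approx [0.2692 0.3333 0.3974], Z = 0.2692
print(X2, C @ X2)  # approx [0.2092 0.3334 0.4574], Z = 0.2093
```

Iterating further drives Z toward the assumed minimum Zmin = 0.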
Use of the MATLAB Optimization Toolbox
The Optimization Toolbox of MATLAB is very popular and efficient, and includes many different optimization techniques. In these lecture notes we briefly introduce the use of the MATLAB toolbox for the simplex algorithm; it is assumed that the user is aware of the basics of MATLAB.
To use the simplex method, the option 'LargeScale' has to be set to 'off' and 'Simplex' to 'on', e.g. options = optimset('LargeScale', 'off', 'Simplex', 'on').
Further details may be found in the toolbox documentation. However, with this basic knowledge, simple LP problems can be solved. Let us consider the same problem as considered earlier:
Maximize Z = 2x1 + 3x2
subject to
x1 ≤ 5,
-x1 + 2x2 ≤ 5,
x1 + x2 ≤ 6,
x1, x2 ≥ 0
Note that the objective function should be converted to a minimization problem before being entered, as done in line 2 of the code. Finally, the optimized objective value should be multiplied by -1 to recover the maximum, as done in the second-to-last line. The solution is obtained as Z = 15.667 with x1 = 2.333 and x2 = 3.667, as in the earlier case.
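The same problem can be solved outside MATLAB; the SciPy equivalent below (added here for comparison, not part of the notes) uses the identical negate-then-minimize trick.

```python
from scipy.optimize import linprog

# Maximize Z = 2*x1 + 3*x2 (negated, since linprog minimizes)
res = linprog(c=[-2.0, -3.0],
              A_ub=[[1, 0], [-1, 2], [1, 1]],
              b_ub=[5.0, 5.0, 6.0],
              bounds=[(0, None)] * 2)
print(res.x, -res.fun)  # x1 = 2.333, x2 = 3.667, Z = 15.667
```

As in MATLAB, the reported objective is negated at the end to recover the maximum.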
Transportation Problem
Introduction
In the previous lectures, we discussed about the standard form of a LP and the commonly
used methods of solving LPP. A key problem in many projects is the allocation of scarce
resources among various activities. Transportation problem refers to a planning model that
allocates resources, machines, materials, capital etc. in a best possible way so that the costs
are minimized or profits are maximized. In this lecture, the common structure of a
transportation problem (TP) and its solution using LP are discussed followed by a numerical
example.
Structure of the Problem
The classic transportation problem is concerned with the distribution of any commodity
(resource) from any group of 'sources' to any group of destinations or 'sinks'. While solving
this problem using LP, the amount of resources from source to sink will be the decision
variables. The criterion for selecting the optimal values of the decision variables (like
minimization of costs or maximization of profits) will be the objective function. And the
limitation of resource availability from sources will constitute the constraint set.
Consider a general transportation problem consisting of m origins (sources) O1, O2, …, Om and n destinations (sinks) D1, D2, …, Dn. Let the amount of commodity available at the i-th source be ai (i = 1, 2, …, m) and the demand at the j-th sink be bj (j = 1, 2, …, n). Let the cost of transporting a unit amount of material from i to j be cij, and let the amount of commodity supplied from i to j be denoted xij. Thus the cost of transporting xij units of commodity from i to j is cij xij.
Now the objective of minimizing the total cost of transportation can be given as
Minimize f = Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij      (1)
The total amount supplied from a particular source should be equal to the amount available at that source. Hence,
Σ_{j=1}^{n} xij = ai,   i = 1, 2, …, m      (2)
Also, the total amount supplied to a particular sink should be equal to the corresponding
demand. Hence,
Σ_{i=1}^{m} xij = bj,   j = 1, 2, …, n      (3)
The set of constraints given by eqns (2) and (3) is consistent only if total supply and total demand are equal:
Σ_{i=1}^{m} ai = Σ_{j=1}^{n} bj      (4)
But in real problems this condition may not be satisfied; the problem is then said to be unbalanced. However, the problem can be modified by adding a fictitious (dummy) source or destination, which provides the surplus supply or demand respectively. The transportation costs from this dummy source to all destinations are zero; likewise, the transportation costs from all sources to a dummy destination are zero.
Since total supply equals total demand in the balanced problem, one of the constraints is redundant. Thus the above problem has m × n decision variables and (m + n - 1) independent equality constraints.
The non-negativity constraints can be expressed as
xij ≥ 0,   i = 1, 2, …, m;  j = 1, 2, …, n      (5)
Problem (1)
Consider a transport company which has to supply four units of paper material from each of the cities Faizabad and Lucknow. The material is to be supplied to Delhi, Ghaziabad and Bhopal, with demands of four, one and three units respectively. The cost of transportation per unit of supply (cij), as used in the formulation below, is:

            Delhi   Ghaziabad   Bhopal
Faizabad      5         3          8
Lucknow       4         1          7

Decide the pattern of transportation that minimizes the cost.
Solution:
Let the amount of material supplied from source i to sink j be xij. Here m = 2, n = 3.
Total supply = 8 units and total demand = 4 + 1 + 3 = 8 units. Since both are equal, the problem is balanced. The objective function is to minimize the total cost of transportation over all combinations, i.e.,
Minimize f = 5x11 + 3x12 + 8x13 + 4x21 + x22 + 7x23      (6)
The constraints are:
(1) The total amount of material supplied from each source city must equal its availability:
Σ_{j=1}^{3} xij = 4,   i = 1, 2
i.e.
x11 + x12 + x13 = 4   for i = 1      (7)
x21 + x22 + x23 = 4   for i = 2      (8)
(2) The total amount of material received by each destination city should be equal to the corresponding demand:
Σ_{i=1}^{2} xij = bj,   j = 1, 2, 3
i.e.
x11 + x21 = 4   for j = 1      (9)
x12 + x22 = 1   for j = 2      (10)
x13 + x23 = 3   for j = 3      (11)
xij ≥ 0 for all i, j      (12)
Adding one artificial variable R1, …, R5 to each equality constraint and penalizing them with a large coefficient M, the constraints become
x11 + x12 + x13 + R1 = 4
x21 + x22 + x23 + R2 = 4
x11 + x21 + R3 = 4
x12 + x22 + R4 = 1
x13 + x23 + R5 = 3
Modifying the objective function to make the coefficients of the artificial variables equal to zero, the final form of the objective function is
f + (2M - 5)x11 + (2M - 3)x12 + (2M - 8)x13 + (2M - 4)x21 + (2M - 1)x22 + (2M - 7)x23 + 0R1 + 0R2 + 0R3 + 0R4 + 0R5 = 16M
[Simplex tableaus for the first four Big-M iterations (Tables 1–4), garbled in extraction; the artificial variables R1, …, R5 are driven out of the basis one by one.]
Repeating the same procedure, we get the final optimal cost f = 42, with the optimum decision variable values x11 = 2.243, x12 = 0, x13 = 1.757, x21 = 1.757, x22 = 1, x23 = 1.243. (This problem has alternative optima; any solution achieving the same cost of 42 is equally good.)
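The transportation LP above can be assembled and solved mechanically; the sketch below (added here, not part of the notes) builds the supply and demand equality constraints for the flattened variable vector x = (x11, x12, x13, x21, x22, x23) and hands them to SciPy.

```python
import numpy as np
from scipy.optimize import linprog

cost = np.array([[5, 3, 8], [4, 1, 7]], dtype=float)  # c_ij from the example
supply = [4, 4]
demand = [4, 1, 3]
m, n = cost.shape

A_eq, b_eq = [], []
for i in range(m):            # supply rows: sum_j x_ij = a_i
    row = np.zeros(m * n); row[i * n:(i + 1) * n] = 1
    A_eq.append(row); b_eq.append(supply[i])
for j in range(n):            # demand columns: sum_i x_ij = b_j
    col = np.zeros(m * n); col[j::n] = 1
    A_eq.append(col); b_eq.append(demand[j])

res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (m * n))
print(res.fun)  # minimum transportation cost f = 42
```

The solver confirms the optimal cost of 42; the particular optimal x it returns may differ from the tableau solution because of the alternative optima.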
Problem (2)
Consider three factories F1, F2 and F3 producing a chemical that is to be shipped to four warehouses Wh1–Wh4. The unit transportation costs, production capacities and demands are:

         Wh1    Wh2    Wh3    Wh4    Production
F1       523    682    458    850        60
F2       420    412    362    729       110
F3       670    558    895    695       150
Demand    65     85     80     70
Solution:
Let the amount of chemical to be transported from factory i to warehouse j be xij.
Total supply = 60 + 110 + 150 = 320 and total demand = 65 + 85 + 80 + 70 = 300. Since the total demand is less than the total supply, add one fictitious warehouse Wh5 with a demand of 20. Thus here m = 3, n = 5.
         Wh1    Wh2    Wh3    Wh4    Wh5    Production
F1       523    682    458    850      0        60
F2       420    412    362    729      0       110
F3       670    558    895    695      0       150
Demand    65     85     80     70     20
The objective function is to minimize the total cost of transportation over all combinations:
Minimize f = 523x11 + 682x12 + 458x13 + 850x14 + 0x15
           + 420x21 + 412x22 + 362x23 + 729x24 + 0x25
           + 670x31 + 558x32 + 895x33 + 695x34 + 0x35
subject to the supply and demand constraints
x11 + x12 + x13 + x14 + x15 = 60    for i = 1
x21 + x22 + x23 + x24 + x25 = 110   for i = 2
x31 + x32 + x33 + x34 + x35 = 150   for i = 3
x11 + x21 + x31 = 65   for j = 1
x12 + x22 + x32 = 85   for j = 2
x13 + x23 + x33 = 80   for j = 3
x14 + x24 + x34 = 70   for j = 4
x15 + x25 + x35 = 20   for j = 5
with xij ≥ 0. This optimization problem can be solved using the same procedure used for the previous problem.
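The balanced problem, dummy warehouse included, can be solved the same way as Problem (1); the check below (added here, not part of the notes) reuses the constraint-building pattern.

```python
import numpy as np
from scipy.optimize import linprog

# Costs including the dummy warehouse Wh5 (zero-cost column)
cost = np.array([[523, 682, 458, 850, 0],
                 [420, 412, 362, 729, 0],
                 [670, 558, 895, 695, 0]], dtype=float)
supply = [60, 110, 150]
demand = [65, 85, 80, 70, 20]
m, n = cost.shape

A_eq, b_eq = [], []
for i in range(m):            # supply rows
    row = np.zeros(m * n); row[i * n:(i + 1) * n] = 1
    A_eq.append(row); b_eq.append(supply[i])
for j in range(n):            # demand columns
    col = np.zeros(m * n); col[j::n] = 1
    A_eq.append(col); b_eq.append(demand[j])

res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (m * n))
print(res.fun)  # minimum total cost 154450
```

The 20 units shipped to Wh5 represent unused capacity and incur no cost.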
Assignment Problem
Introduction
In the previous lecture, we discussed one of the benchmark problems, the transportation problem, and its formulation. The assignment problem is a particular class of transportation linear programming problems in which the supplies and demands are equal to integers (often 1). Since all supplies, demands, and bounds on variables are integers, the assignment problem relies on an interesting property of transportation problems: an optimal solution exists that is entirely integer. In this lecture, the structure and formulation of the assignment problem are discussed. Also, the traveling salesman problem, which is a special type of assignment problem, is described.
Structure of assignment problem
As mentioned earlier, assignment problem is a special type of transportation problem in
which
1. Number of supply and demand nodes are equal.
2. Supply from every supply node is one.
3. Every demand node has a demand of one.
4. Solution is required to be all integers.
The goal of a general assignment problem is to find an optimal assignment of machines
(laborers) to jobs without assigning an agent more than once and ensuring that all jobs are
completed. The objective might be to minimize the total time to complete a set of jobs, or
to maximize skill ratings, maximize the total satisfaction of the group or to minimize the
cost of the assignments. This is subjected to the following requirements:
A.BENHARI
125
The objective is
Minimize Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij
Since each task is assigned to exactly one laborer and each laborer is assigned only one job, the constraints are
Σ_{i=1}^{m} xij = 1   for j = 1, 2, …, n
Σ_{j=1}^{n} xij = 1   for i = 1, 2, …, m
xij = 0 or 1
Due to the special structure of the assignment problem, the solution can be found using a more convenient method, called the Hungarian method, which is illustrated through the examples below.
Example 1: (Taha, 1982)
Consider three jobs to be assigned to three machines. The cost for each combination is shown in the table below. Determine the minimal job–machine combinations.
Table 1
Job \ Machine     1     2     3    ai
1                 5     7     9     1
2                14    10    12     1
3                15    13    16     1
bj                1     1     1
Solution:
Step 1:
Create zero elements in the cost matrix (zero assignment) by subtracting the smallest
element in each row (column) from the corresponding row (column). After this exercise,
the resulting cost matrix is obtained by subtracting 5 from row 1, 10 from row 2 and 13
from row 3.
Table 2
0    2    4
4    0    2
2    0    3
Step 2:
Repeating the same with the columns, the final cost matrix is
Table 3
0    2    2
4    0    0
2    0    1
A feasible assignment can be made on the zero elements: the optimal assignment is (1,1), (2,3) and (3,2), and the total cost is equal to 30 (5 + 12 + 13).
In the above example, it was possible to obtain the feasible assignment. But in more
complicated problems, additional rules are required which are explained in the next
example.
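Before moving on, the result of Example 1 can be cross-checked with SciPy's `linear_sum_assignment`, which solves the assignment problem with a Hungarian-type algorithm (a check added here, assuming the cost matrix reconstructed in Table 1).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[5, 7, 9],
                 [14, 10, 12],
                 [15, 13, 16]])
rows, cols = linear_sum_assignment(cost)
print(list(zip(rows + 1, cols + 1)))   # assignment (1,1), (2,3), (3,2)
print(cost[rows, cols].sum())          # total cost 30
```

The solver reproduces the assignment (1,1), (2,3), (3,2) with total cost 30.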
Example 2 (Taha, 1982)
Consider four jobs to be assigned to four machines. The cost for each combination is shown in the table below. Determine the minimal job–machine combinations.
Table 4
Job \ Machine     1     2     3     4
1                 1     4     6     3
2                 9     7    10     9
3                 4     5    11     7
4                 8     7     8     5
Solution:
Step 1: Create zero elements in the cost matrix by subtracting the smallest element in each row from the corresponding row.
Table 5
0    3    5    2
2    0    3    2
0    1    7    3
3    2    3    0
Step 2: Repeating the same with the columns, the final cost matrix is
Table 6
0    3    2    2
2    0    0    2
0    1    4    3
3    2    0    0
Rows 1 and 3 have only one zero element. Both of these are in column 1, which means
that both jobs 1 and 3 should be assigned to machine 1. As one machine can be assigned
with only one job, a feasible assignment to the zero elements is not possible as in the
previous example.
Step 3: Draw a minimum number of lines through some of the rows and columns so that all the zeros are crossed out.
Table 7 (lines through column 1, row 2 and row 4 cross out all the zeros)
Step 4: Select the smallest uncrossed element (which is 1 here). Subtract it from every uncrossed element and also add it to every element at the intersection of two lines. This gives the following table.
Table 8
0    2    1    1
3    0    0    2
0    0    3    2
4    2    0    0
This gives a feasible assignment (1,1), (2,3), (3,2) and (4,4) with a total cost of 1+10+5+5
= 21.
If the optimal solution had not been obtained in the last step, then the procedure of
drawing lines has to be repeated until a feasible solution is achieved.
Traveling Salesman Problem
A traveling salesman has to visit n cities and return to the starting point. He has to start from any one city and visit each city only once. Suppose he starts from the k-th city and the last city he visits is m. Let the cost of travel from the i-th city to the j-th city be cij. Then the objective function is
Minimize Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij
subject to
Σ_{i=1}^{n} xij = 1   for j = 1, 2, …, n,  i ≠ j,  i ≠ m
Σ_{j=1}^{n} xij = 1   for i = 1, 2, …, n,  i ≠ j,  i ≠ m
xmk = 1
xij = 0 or 1
Solution Procedure:
Solve the problem as an assignment problem using the method used in the above examples. If the solution thus found is cyclic in nature, then that is the final solution. If it is not cyclic, then select the lowest entry in the table (other than zero), delete the row and column of this lowest entry, and again do the zero assignment in the remaining matrix. Check whether a cyclic assignment is available; if not, include the next higher entry in the table and repeat the procedure until a cyclic assignment is obtained.
The procedure is explained through an example below.
Example 3:
Consider a four-city TSP for which the costs between the city pairs are as shown in the figure below. Find the tour of the salesman so that the cost of travel is minimal.
[Figure with the pairwise travel costs (values 2, 3, 4, 5, 6, 9), garbled in extraction.]
Solution:
Step 1: The optimal assignment obtained with the Hungarian method (Tables 9 and 10, garbled in extraction) is not cyclic, so the lowest non-zero entry is brought in as described in the solution procedure.
Thus the next optimal assignment is 1→4, 2→1, 3→2, 4→3, which is cyclic. Thus the required tour is 1→4→3→2→1 and the total travel cost is 5 + 9 + 4 + 6 = 24.
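For very small instances like this, a tour can also be found by exhaustive search over permutations, which makes a useful sanity check on the assignment-based procedure. The sketch below uses a hypothetical symmetric 4-city cost matrix (not the one from the garbled figure above).

```python
from itertools import permutations

def tsp_bruteforce(d):
    """Exhaustive search over all tours starting and ending at city 0.
    Fine for the tiny instances used in these examples."""
    n = len(d)
    best_tour, best_cost = None, float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(d[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost

# Hypothetical symmetric cost matrix for four cities
d = [[0, 10, 15, 20],
     [10, 0, 35, 25],
     [15, 35, 0, 30],
     [20, 25, 30, 0]]
tour, cost = tsp_bruteforce(d)
print(tour, cost)  # e.g. (0, 1, 3, 2, 0) with cost 80
```

Brute force is O((n-1)!) and only practical for a handful of cities, which is why the assignment relaxation above is attractive.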
Example: plastic design of a frame
Consider the frame shown in Fig. 1, in which the plastic moment capacity of the beam is represented by Mb and the moment capacity of the columns is denoted by Mc. Here l = 8 units, h = 6 units, and the applied forces are F1 = 2 units and F2 = 1 unit. Assuming that the plastic moment capacities of the beam and columns are linear functions of their weights, the objective function is to minimize the sum of the weights of the beam and column materials.
[Fig. 1: frame geometry with loads F1 = 2 and F2 = 1, bay width l = 8 and column height h = 6; critical sections numbered 1–7 (figure garbled in extraction).]
f = w(2l Mb + 2h Mc)      (1)
where w is the weight per unit length per unit moment capacity of the material. Since w is constant, optimizing (1) is the same as optimizing
f = 2l Mb + 2h Mc = 16Mb + 12Mc      (2)
The four possible collapse mechanisms are shown in Fig. 2 with the corresponding U and E values. The constraints are formulated from the design restriction U ≥ E for all the mechanisms.
[Fig. 2: collapse mechanisms (a)–(d), garbled in extraction.]
Thus the problem is
Minimize f = 16Mb + 12Mc      (2)
subject to
Mc ≥ 3      (3)
Mb ≥ 2      (4)
2Mb + Mc ≥ 10      (5)
Mb + Mc ≥ 6      (6)
Mb ≥ 0, Mc ≥ 0      (7)
Multiplying the ≥ constraints through by -1, they become
-Mc ≤ -3
-Mb ≤ -2
-2Mb - Mc ≤ -10
-Mb - Mc ≤ -6
Introducing slack variables X1, X2, X3, X4, all ≥ 0, the system of equations can be written in canonical form as
-Mc + X1 = -3
-Mb + X2 = -2
-2Mb - Mc + X3 = -10
-Mb - Mc + X4 = -6
f - 16Mb - 12Mc = 0
This model can be solved using the dual simplex algorithm, as explained below.
[Dual simplex tableaus for the starting solution and the subsequent iterations, garbled in extraction. The iterations terminate with Mb = 7/2 and Mc = 3 in the basis, giving the optimal value f = 92.]
Example: crop planning
Consider two crops, 1 and 2. One unit of crop 1 produces four units of profit and one unit of crop 2 brings five units of profit. The demanded production of crop 1 is A units and that of crop 2 is B units. Let x be the amount of water required for A units of crop 1 and y the amount required for B units of crop 2. The amounts of production and of water required can be expressed as linear relations as shown below:
A = 0.5(x - 2) + 2
B = 0.6(y - 3) + 3
The minimum amounts of water that must be provided to crops 1 and 2 to meet their demands are two and three units respectively, and the maximum availability of water is ten units. Find the optimum pattern of irrigation.
Solution:
The objective is to maximize the profit from crops 1 and 2, which can be represented as
Maximize f = 4A + 5B
Expressing this as a function of the amount of water,
Maximize f = 4[0.5(x - 2) + 2] + 5[0.6(y - 3) + 3] = 2x + 3y + 10
subject to
x + y ≤ 10
x ≥ 2
y ≥ 3
[Dual simplex tableaus for the starting solution and three iterations, garbled in extraction. The optimum irrigation pattern is x = 2 and y = 8, giving the maximum profit f = 2(2) + 3(8) + 10 = 38.]
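The irrigation optimum can be verified with SciPy (a check added here, not part of the notes); the lower bounds x ≥ 2 and y ≥ 3 are passed as variable bounds and the constant +10 is added back at the end.

```python
from scipy.optimize import linprog

# Maximize f = 2x + 3y + 10 subject to x + y <= 10, x >= 2, y >= 3
res = linprog(c=[-2.0, -3.0],
              A_ub=[[1.0, 1.0]],
              b_ub=[10.0],
              bounds=[(2, None), (3, None)])
f = -res.fun + 10
print(res.x, f)  # x = 2, y = 8, f = 38
```

Since crop 2's profit per unit of water (3) exceeds crop 1's (2), water beyond the minimum requirements goes to crop 2, as the solution shows.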
Waste load allocation for water quality management in a river system can be defined as the determination of the optimal treatment level of the waste discharged to a river, such that the water quality standards set by the Pollution Control Agency (PCA) are maintained throughout the river. Conventional waste load allocation involves minimization of the treatment cost subject to the constraint that the water quality standards are not violated.
Consider a simple problem where there are M dischargers releasing waste into the river and I checkpoints at which the water quality is measured by the PCA. Let xj be the treatment level and aj the unit treatment cost for the jth discharger (j = 1, 2, ..., M), and let ci be the dissolved oxygen (DO) concentration at checkpoint i (i = 1, 2, ..., I), which is to be controlled. The decision variables for the waste load allocation model are therefore xj (j = 1, 2, ..., M).
Thus, the objective function can be expressed as

Minimize  f = Σj aj xj,  j = 1, 2, ..., M
The relationship between the water quality indicator ci (DO) at a checkpoint and the treatment levels upstream of that checkpoint is linear (based on the Streeter-Phelps equation) when all other parameters involved in the water quality simulations are constant. Let g denote this linear relationship between ci and xj. Then,
ci = g(xj),  for all i, j
Solution of the optimization model using simplex algorithm gives the optimal fractional
removal levels required to maintain the water quality of the river.
Introduction
In some complex problems, it is advisable to approach the problem in a sequential manner in order to find the solution quickly. The solution is found in multiple stages; this is the basic approach behind dynamic programming, which works in a divide-and-conquer manner. The word "programming" in "dynamic programming" has no particular connection to computer programming: a "program" is, instead, the plan of action that is produced. In this lecture, the multistage decision process, its representation, its various types, and the concepts of suboptimization and the principle of optimality are discussed.
Sequential optimization
In sequential optimization, a problem is approached by dividing it into smaller subproblems
and optimization is done for these subproblems without losing the integrity of the original
problem. Sequential decision problems are those in which decisions are made in multiple
stages. These are also called multistage decision problems since decisions are made at a
number of stages.
In multistage decision problems, an N-variable problem is represented by N single-variable problems. These problems are solved successively so that the optimal value of the original problem is obtained from the optimal solutions of the N single-variable problems. The N single-variable problems are connected in series, the output of one stage being the input to the succeeding stage; this type of problem is called a serial multistage decision process.
For example, consider a water allocation problem to N users. The objective function is to
maximize the total net benefit from all users. This problem can be solved by considering each
user separately and optimizing the individual net benefits, subject to constraints and then
adding up the benefits from all users to get the total optimal benefit.
Fig. 1. A single-stage decision process: input state S1, decision variable X1, output state S2.
Let S1 be the input state variable, S2 be the output state variable, X1 be the decision variable
and NB1 be the net benefits. The input and output are related by a transformation function
expressed as,
S2 = g(X1, S1)
Since the net benefits are influenced by both the decision variable and the input variable, the benefit function can be expressed as
NB1 = h(X1, S1)
Now, consider a serial multistage decision process consisting of T stages as shown in the
figure below.
Fig. 2. A serial multistage decision process: stage t (t = 1, ..., T) receives input state St, applies decision Xt, yields net benefit NBt and passes output state St+1 to the next stage.
The objective function of the serial multistage problem is a function of all the stage returns and may be additive or multiplicative:

either   f = Σt NBt = Σt h(Xt, St),  t = 1, ..., T
or       f = Πt NBt = Πt h(Xt, St),  t = 1, ..., T
An objective function is monotonic if, for all values of a and b for which h(Xt = a, St) ≥ h(Xt = b, St), the inequality

f(X1, X2, ..., Xt = a, ..., XT, ST+1) ≥ f(X1, X2, ..., Xt = b, ..., XT, ST+1)

is satisfied.
The objective function considered is the additive form

f = Σt NBt = Σt h(Xt, St),  for t = 1, 2, ..., T
The concepts of suboptimization and principle of optimality are used to solve this problem
through dynamic programming. To explain these concepts, consider the design of a water
tank in which the cost of construction is to be minimized. The capacity of the tank to be
designed is given as K.
Fig. 3. Suboptimization of the water tank system: the original system (tank, columns, foundation) is suboptimized first for the foundation component alone, then for the foundation and columns together, and finally for the complete system.
Recursive Equations
Introduction
In the previous lecture, we saw how to represent a multistage decision process and the concept of suboptimization. To solve such a problem in sequence, we make use of recursive equations, which are fundamental to dynamic programming. In this lecture, we will learn how to formulate the recursive equations for a multistage decision process in both a backward and a forward manner.
Recursive equations
Recursive equations are used to structure a multistage decision problem as a sequential
process. Each recursive equation represents a stage at which a decision is required. In this, a
series of equations are successively solved, each equation depending on the output values of
the previous equations. Thus, through recursion, a multistage problem is solved by breaking it
into a number of single stage problems. A multistage problem can be approached in a
backward manner or in a forward manner.
Backward recursion
In this, the problem is solved by writing equations first for the final stage and then proceeding
backwards to the first stages. Consider the serial multistage problem discussed in the previous
lecture.
Recall the serial multistage process of Fig. 2: stage t receives input state St, applies decision Xt, and yields net benefit NBt and output state St+1, for t = 1, ..., T.
The objective function is

f = Σt NBt = Σt ht(Xt, St)
  = h1(X1, S1) + h2(X2, S2) + ... + ht(Xt, St) + ... + hT-1(XT-1, ST-1) + hT(XT, ST)    ...(1)

and the relation between the state variables and the decision variables is given by

St+1 = g(Xt, St),  t = 1, 2, ..., T    ...(2)
Consider the final stage as the first subproblem. The input variable to this stage is ST. According to the principle of optimality, no matter what happens in the other stages, the decision variable XT should be selected such that hT(XT, ST) is optimum for the input ST. Let the optimum value be denoted as fT*(ST). Then,

fT*(ST) = opt over XT of [hT(XT, ST)]    ...(3)
Next, group the last two stages together as the second subproblem. Let fT-1*(ST-1) be the optimum objective value of this subproblem. Then,

fT-1*(ST-1) = opt over XT-1, XT of [hT-1(XT-1, ST-1) + hT(XT, ST)]    ...(4)

From the principle of optimality, XT should optimize hT for the given ST, and ST is obtained from ST-1 and XT-1 through the transformation ST = gT-1(XT-1, ST-1). Thus fT-1*(ST-1) can be written as

fT-1*(ST-1) = opt over XT-1 of [hT-1(XT-1, ST-1) + fT*(ST)]    ...(5)

fT-1*(ST-1) = opt over XT-1 of [hT-1(XT-1, ST-1) + fT*(gT-1(XT-1, ST-1))]    ...(6)
In general, for the subproblem comprising the last i + 1 stages,

fT-i*(ST-i) = opt over XT-i, ..., XT-1, XT of [hT-i(XT-i, ST-i) + ... + hT-1(XT-1, ST-1) + hT(XT, ST)]    ...(7)

fT-i*(ST-i) = opt over XT-i of [hT-i(XT-i, ST-i) + fT-(i-1)*(gT-i(XT-i, ST-i))]    ...(8)

where fT-(i-1)* denotes the optimal value of the objective function for the last i stages. Thus, for backward recursion, the principle of optimality can be stated as: no matter in what state of what stage one may be, in order for a policy to be optimal, one must proceed from that state and stage in an optimal manner.
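The backward recursion (3)-(8) can be sketched in code. The three stage-benefit functions and the transformation g(X, S) = S - X below are hypothetical stand-ins, chosen only to make the recursion concrete; a brute-force enumeration is used to confirm the result.

```python
from itertools import product

# Hypothetical 3-stage allocation: state S_t is the resource remaining,
# decision X_t is the amount used at stage t, transformation g(X, S) = S - X.
h = [lambda x: 6*x - x*x,      # stage 1 benefit (assumed)
     lambda x: 4*x,            # stage 2 benefit (assumed)
     lambda x: 10*x - 2*x*x]   # stage 3 benefit (assumed)
S1, T = 5, 3                   # initial state and number of stages (assumed)

# Backward recursion: f*_t(S) = max over X of [h_t(X) + f*_{t+1}(S - X)],
# starting from f*_{T+1} = 0 for every state.
f = {s: 0 for s in range(S1 + 1)}
for t in reversed(range(T)):
    f = {s: max(h[t](x) + f[s - x] for x in range(s + 1))
         for s in range(S1 + 1)}
dp_best = f[S1]

# Brute-force check over all feasible allocations.
bf_best = max(sum(h[t](x[t]) for t in range(T))
              for x in product(range(S1 + 1), repeat=T) if sum(x) <= S1)

print(dp_best, bf_best)   # both give 25 for the assumed data
```

The recursion evaluates each state once per stage, instead of enumerating all decision combinations as the brute force does.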
Forward recursion
In this approach, the problem is solved by starting from stage 1 and proceeding towards the last stage. Consider the serial multistage problem with the objective function given below:

f = Σt NBt = Σt ht(Xt, St)
  = h1(X1, S1) + h2(X2, S2) + ... + ht(Xt, St) + ... + hT-1(XT-1, ST-1) + hT(XT, ST)    ...(9)

and the relation between the state variables and the decision variables given by

St = g(Xt+1, St+1),  t = 1, 2, ..., T    ...(10)
Considering stage 1 as the first subproblem,

f1*(S1) = opt over X1 of [h1(X1, S1)]    ...(11)
Now, group the first and second stages together as the second subproblem. The objective function f2*(S2) for this subproblem can be expressed as

f2*(S2) = opt over X2, X1 of [h2(X2, S2) + h1(X1, S1)]    ...(12)

f2*(S2) = opt over X2 of [h2(X2, S2) + f1*(S1)]    ...(13)

f2*(S2) = opt over X2 of [h2(X2, S2) + f1*(g2(X2, S2))]    ...(14)
Thus, through the principle of optimality, the dimensionality of the problem is reduced from two to one. In general, the ith subproblem can be expressed as

fi*(Si) = opt over X1, X2, ..., Xi of [hi(Xi, Si) + ... + h2(X2, S2) + h1(X1, S1)]    ...(15)

fi*(Si) = opt over Xi of [hi(Xi, Si) + fi-1*(gi(Xi, Si))]    ...(16)
where fi* denotes the optimal value of the objective function for the first i stages. The principle of optimality for forward recursion is that, no matter in what state of what stage one may be, in order for a policy to be optimal, one must get to that state and stage in an optimal manner.
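The forward recursion (11)-(16) can be sketched in the same way. The stage benefit functions and the resource limit below are hypothetical (not from the lecture), chosen only to exercise the recursion:

```python
# Forward recursion: f*_i(S) = max over X of [h_i(X) + f*_{i-1}(S - X)],
# starting from f*_0(S) = 0 for every state.
h = [lambda x: 6*x - x*x,      # stage 1 benefit (assumed)
     lambda x: 4*x,            # stage 2 benefit (assumed)
     lambda x: 10*x - 2*x*x]   # stage 3 benefit (assumed)
Q = 5                          # total resource available (assumed)

f = {s: 0 for s in range(Q + 1)}   # f*_0
for hi in h:                       # stages 1, 2, 3 in order
    f = {s: max(hi(x) + f[s - x] for x in range(s + 1))
         for s in range(Q + 1)}

print(f[Q])   # the optimal total benefit
```

For an additive, separable objective such as this, the forward recursion returns the same optimum as the backward recursion would.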
Consider again the serial multistage problem with the objective function

f = Σt NBt = Σt ht(Xt, St)
  = h1(X1, S1) + h2(X2, S2) + ... + ht(Xt, St) + ... + hT-1(XT-1, ST-1) + hT(XT, ST)    ...(1)

Considering the first subproblem, i.e. the last stage, the objective function is

fT*(ST) = opt over XT of [hT(XT, ST)]    ...(2)
(Stage T: input ST, decision XT, net benefit NBT, output ST+1.)
The input variable to this stage is ST. The decision variable XT and the optimal value of the objective function fT* depend on the input ST. At this stage the value of ST is not known; ST can take a range of values depending on the values taken by the upstream components. To cover all possibilities, the subproblem is therefore solved for a range of ST values, and the results are recorded in Table 1, with columns ST, XT*, fT*(ST) and ST+1.
Now, consider the second subproblem by grouping the last two stages (T-1 and T). The objective function can be written as

fT-1*(ST-1) = opt over XT-1, XT of [hT-1(XT-1, ST-1) + hT(XT, ST)]    ...(4)

fT-1*(ST-1) = opt over XT-1 of [hT-1(XT-1, ST-1) + fT*(ST)]    ...(5)
Here also, a range of values is considered for ST-1. All the information about the first subproblem is available from Table 1. Thus, the optimal values XT-1* and fT-1* are found for this range of values, and the results are recorded in Table 2, with columns ST-1, XT-1*, ST, fT*(ST) and fT-1*.
fT-i*(ST-i) = opt over XT-i, ..., XT-1, XT of [hT-i(XT-i, ST-i) + ... + hT-1(XT-1, ST-1) + hT(XT, ST)]
            = opt over XT-i of [hT-i(XT-i, ST-i) + fT-(i-1)*]    ...(7)

At this stage, suboptimization has already been carried out for the last i stages, and the optimal values of the ith subproblem are available in tabular form. Substituting this information into the objective function and considering a range of ST-i values, the optimal values of fT-i* and XT-i* can be calculated. The suboptimization of the (i+1)th subproblem is shown in Table 3.
Table 3 has columns Sl. no., ST-i, XT-i*, ST-(i-1), fT-(i-1)*(ST-(i-1)) and fT-i*.
Here, for initial value problems, only one value of S1 needs to be analyzed.
After completing the suboptimization of all the stages, we retrace the steps through the generated tables to find the optimal values of X. The Tth subproblem gives the values of X1* and f1* for the given value of S1 (since S1 is known for an initial value problem). The transformation equation S2 = g(X1, S1) is used to calculate S2*, which is the input to the 2nd stage (the (T-1)th subproblem). Then, from the tabulated results for the 2nd stage, the values of X2* and f2* are found for the calculated value of S2*. The transformation equation is applied again to find S3*, and the process is repeated until the 1st subproblem, i.e. the Tth stage, is reached. Finally, the optimum solution vector is given by X1*, X2*, ..., XT*.
Other Topics
Introduction
In the previous lectures, we discussed problems with a single state (input) variable St taking values over some discrete range. In this lecture, we discuss problems whose state variable takes continuous values, and problems with multiple state variables.
Discrete versus Continuous Dynamic Programming
In a dynamic programming problem, when the number of stages tends to infinity, the problem is called a continuous dynamic programming problem, or an infinite-stage problem. Continuous dynamic programming models are used to solve continuous decision problems. The classical method of solving such problems is the calculus of variations; however, analytical solutions by this method generally cannot be obtained except for very simple problems. The infinite-stage dynamic programming approach, on the other hand, provides an efficient numerical approximation procedure for solving continuous decision problems.
The objective function of a conventional discrete dynamic programming model is the sum of the individual stage outputs. If the number of stages tends to infinity, the summation of the outputs from the individual stages is replaced by an integral. Such models are useful when an infinite number of decisions have to be made within a finite time interval.
Multiple State Variables
In the problems previously discussed, there was only one state variable St. However, there are problems in which more than one state variable must be handled. For example, consider a water allocation problem for n irrigated crops. Let Si be the units of water available to the remaining n - i crops. If we are concerned only with the allocation of water, this can be solved as a single-state problem with Si as the state variable. Now assume that L units of land are also available for these n crops and must likewise be allocated to each crop; the land remaining then becomes a second state variable.
(Stage 1 with two state variables: input states S1 and R1, decision variable X1, output states S2 and R2.)
In general, for a multistage decision problem of T stages containing two state variables St and Rt, the objective function can be written as

f = Σt NBt = Σt h(Xt, St, Rt),  t = 1, 2, ..., T

with the two state transformation equations

St+1 = g(Xt, St, Rt)  and  Rt+1 = g'(Xt, St, Rt),  for t = 1, 2, ..., T
Curse of Dimensionality
Dynamic programming has a serious limitation due to this dimensionality restriction. As the number of variables and stages increases, the number of calculations needed increases rapidly, thereby increasing the computational effort. If the number of state variables is increased, more combinations of discrete states must be examined at each stage, and the computational burden grows exponentially with the number of state variables.
(Figure: a continuous beam on n + 1 rigid supports numbered 0, 1, ..., n, with spans of lengths L1, L2, ..., Ln carrying central concentrated loads W1, W2, ..., Wn.)
The beam rests on n+1 rigid supports. The locations of the supports are assumed to be
known. The objective function of the problem is to minimize the sum of the cost of
construction of all spans.
It is assumed that simple plastic theory of beams is applicable. Let the reactant support moments be represented as m1, m2, ..., mn. Once these support moments are known, the complete bending moment distribution can be determined, and the plastic limit moment and the cross section of each span can be designed from these support moments.
The bending moment at the center of the ith span is -WiLi/4. Therefore, the largest bending moment magnitude in the ith span can be computed as

Mi = max{ mi-1,  mi,  | (mi-1 + mi)/2 - WiLi/4 | },  for i = 1, 2, ..., n
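A small numerical instance of this expression (the support moments, load and span length below are made-up values, used only for illustration):

```python
def largest_span_moment(m_prev, m_i, W, L):
    """Largest bending-moment magnitude in a span with end (support)
    moments m_prev and m_i, a central concentrated load W, and length L."""
    midspan = abs((m_prev + m_i) / 2 - W * L / 4)   # net moment under the load
    return max(m_prev, m_i, midspan)

# Hypothetical span: support moments 2 and 3, load W = 8, length L = 4.
print(largest_span_moment(2, 3, 8, 4))   # midspan term governs: |2.5 - 8| = 5.5
```

Here the midspan term exceeds both support moments, so it sets the required limit moment for the span.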
For a beam of uniform cross section in each span, the limit moment m_lim,i for the ith span should be greater than or equal to Mi. The cross section of the beam should be selected such that it has the required limit moment. Since the cost of the beam depends on the cross section, which in turn depends on the limit moment, the cost of the beam can be expressed as a function of the limit moments.
If C(X) = Σi Ci(m_lim,i), i = 1, 2, ..., n, represents the sum of the costs of construction of all spans of the beam, where

X = (m_lim,1, m_lim,2, ..., m_lim,n)

then the problem is to minimize C(X) subject to m_lim,i ≥ Mi for i = 1, 2, ..., n.
This problem has a serial structure and can be solved using dynamic programming.
(Figure: a multi-bay truss with node y-coordinates y1, y2, y3, y4, bay heights h1, h2, h3, and an applied load W1 = 1.)
Consider a particular bay i and assume the truss is statically determinate, so that the forces in the bars of bay i depend only on the coordinates yi-1 and yi and not on any other coordinates. The cost of the bars in bay i can therefore be expressed as a function of yi-1 and yi, for i = 1, 2, 3.
This is an initial value problem since the value of y1 is already known. Let the y coordinate of each node be limited to a finite number of values, say 0.25, 0.5, 0.75 and 1. Then, as shown in the figure below, there are 4 × 4 × 4 = 64 different possible ways to reach y4 from y1.
(Figure: the grid of candidate coordinates 0.25, 0.50, 0.75, 1.00 at each of the positions y1, y2, y3, y4.)
This can be represented as a serial multistage initial value decision problem and can be
solved using dynamic programming.
Consider the allocation of water to three crop fields, with x1, x2 and x3 units of water allocated to fields 1, 2 and 3 respectively.
The net benefits from producing the crops in each field are given by the functions below.
NB1(x1) = 5x1 - 0.5x1²
NB2(x2) = 8x2 - 1.5x2²
NB3(x3) = 7x3 - x3²
(Figure: the available quantity S1 = Q is allocated serially. Crop 1 receives x1 with net benefit NB1(x1), leaving S2 = S1 - x1; crop 2 receives x2 with net benefit NB2(x2), leaving S3 = S2 - x2; crop 3 receives x3 with net benefit NB3(x3), leaving the remaining quantity S4 = S3 - x3.)
The objective function for this allocation problem is defined so as to maximize the total net benefits,

Maximize  f = Σi NBi(xi),  i = 1, 2, 3

subject to

x1 + x2 + x3 ≤ Q
0 ≤ xi ≤ Q,  for i = 1, 2, 3
f1*(Q) = max Σi NBi(xi),  i = 1, 2, 3,  over x1 + x2 + x3 ≤ Q with x1, x2, x3 ≥ 0
Transforming this into three problems, each having only one decision variable, the constraints become
0 ≤ x1 ≤ Q
0 ≤ x2 ≤ Q - x1 = S2
0 ≤ x3 ≤ S2 - x2 = S3
Considering the last term of this equation, let f 3 ( S3 ) be the maximum net benefits from crop
3. The state variable for this stage is S 3 which can vary from 0 to Q. Therefore,
f3*(S3) = max NB3(x3),  over 0 ≤ x3 ≤ S3
Now, let f 2 ( S 2 ) be the maximum benefits derived from crops 2 and 3 for a given quantity S 2
which can vary between 0 and Q. Therefore f 2 ( S 2 ) can be written as,
f2*(S2) = max {NB2(x2) + f3*(S2 - x2)},  over 0 ≤ x2 ≤ S2
Again, since S2 = Q - x1, f1*(Q), which is the maximum total net benefit from the allocation to crops 1, 2 and 3, can be rewritten as

f1*(Q) = max {NB1(x1) + f2*(Q - x1)},  over 0 ≤ x1 ≤ Q
Now, once the value of f 3 ( S3 ) is calculated, the value of f 2 ( S 2 ) can be determined, from
which f1 (Q) can be determined.
Forward recursive equations
The problem described above can also be solved in a forward-proceeding manner. Let the function fi*(Si) be the total net benefit from crops 1 to i for a given input Si which is allocated to those crops. Considering the first stage alone,
f1*(S1) = max NB1(x1),  over x1 ≤ S1
Since the value of S1 is not known (except that S1 should not exceed Q), the equation above has to be solved for a range of values from 0 to Q. Now, considering the first two crops together, with S2 units of water available to them, f2*(S2) can be written as,
f2*(S2) = max [NB2(x2) + f1*(S2 - x2)],  over x2 ≤ S2
This equation should also be solved for a range of values of S2 from 0 to Q. Finally, considering the whole system, i.e. crops 1, 2 and 3, f3*(S3) can be expressed as,
f3*(S3) = max [NB3(x3) + f2*(S3 - x3)],  over x3 ≤ S3 = Q
Here, if it is given that all Q units of water must be allocated, the value of S3 can be taken as equal to Q. Otherwise, f3*(S3) should be solved for a range of values from 0 to Q.
Numerical example

Consider the three-crop allocation problem above with Q = 4 units of water and net benefit functions

NB1(x1) = 5x1 - 0.5x1²
NB2(x2) = 8x2 - 1.5x2²
NB3(x3) = 7x3 - x3²
The possible net benefits from each crop, calculated from the functions given, are tabulated below.

Table 1
xi    NB1(x1)   NB2(x2)   NB3(x3)
0     0.0       0.0       0.0
1     4.5       6.5       6.0
2     8.0       10.0      10.0
3     10.5      10.5      12.0
4     12.0      8.0       12.0
The problem can be represented as a set of nodes and links as shown in the figure below. The
nodes represent the state variables and the links represent the decision variables.
(Figure: a network of nodes and links; the nodes at each stage are the possible discrete state values 0 to 4, and the links are the allocation decisions x1, x2, x3 for crops 1, 2 and 3.)
The values inside the nodes show the value of possible state variables at each stage. Number
of nodes for any stage corresponds to the number of discrete states possible for each stage.
The values over the links show the different values taken by decision variables corresponding
to the value taken by state variables. It may be noted that link values for all links are not
shown in the above figure.
Solution using Backward Recursion:
Starting from the last stage, the suboptimization function for the 3rd crop is given as,
f3*(S3) = max NB3(x3),  over 0 ≤ x3 ≤ S3,  with S3 ranging from 0 to 4.
The calculations for this stage are shown in the table below.

Table 2
State S3   NB3(x3) for x3 = 0, 1, 2, 3, 4   f3*(S3)   x3*
0          0,  -,  -,  -,  -                0         0
1          0,  6,  -,  -,  -                6         1
2          0,  6, 10,  -,  -                10        2
3          0,  6, 10, 12,  -                12        3
4          0,  6, 10, 12, 12                12        3, 4
Next, crops 2 and 3 are considered together, with f2*(S2) = max {NB2(x2) + f3*(S2 - x2)} over 0 ≤ x2 ≤ S2. The value of f3*(S2 - x2) is read from the previous table. The calculations are shown below.
Table 3
State S2   NB2(x2) + f3*(S2 - x2) for x2 = 0, 1, 2, 3, 4   f2*(S2)   x2*
0          0,     -,     -,     -,     -                   0         0
1          6,     6.5,   -,     -,     -                   6.5       1
2          10,    12.5,  10,    -,     -                   12.5      1
3          12,    16.5,  16,    10.5,  -                   16.5      1
4          12,    18.5,  20,    16.5,  8                   20        2
Finally, considering all three stages together, the suboptimization function is f1*(Q) = max [NB1(x1) + f2*(Q - x1)] over 0 ≤ x1 ≤ Q, with S1 = Q = 4. The calculations are shown in the table below.

Table 4
State S1 = Q = 4:  NB1(x1) + f2*(Q - x1) for x1 = 0, 1, 2, 3, 4:
0 + 20 = 20,   4.5 + 16.5 = 21,   8 + 12.5 = 20.5,   10.5 + 6.5 = 17,   12 + 0 = 12
f1*(S1) = 21,  x1* = 1
Now, backtracking through each table to find the optimal values of the decision variables: the optimal allocation for crop 1 is x1* = 1 for S1 = 4, which gives S2 = S1 - x1* = 3. From Table 3, x2* = 1 for S2 = 3, so S3 = S2 - x2* = 2, and from Table 2, x3* = 2 for S3 = 2. Thus the optimal solution is

f* = 21,  x1* = 1,  x2* = 1,  x3* = 2
Solution using Forward Recursion:
Starting from the first stage and proceeding towards the final stage, the suboptimization function for the first stage is given as

f1*(S1) = max NB1(x1),  over x1 ≤ S1,  with S1 ranging from 0 to 4.
Table 5
State S1   NB1(x1) for x1 = 0, 1, 2, 3, 4   f1*(S1)   x1*
0          0,   -,    -,     -,     -       0         0
1          0,  4.5,   -,     -,     -       4.5       1
2          0,  4.5,   8,     -,     -       8         2
3          0,  4.5,   8,   10.5,    -       10.5      3
4          0,  4.5,   8,   10.5,   12       12        4
Considering the first two crops together, f2*(S2) = max [NB2(x2) + f1*(S2 - x2)] over x2 ≤ S2:

Table 6
State S2   NB2(x2) + f1*(S2 - x2) for x2 = 0, 1, 2, 3, 4   f2*(S2)   x2*
0          0,     -,     -,     -,     -                   0         0
1          4.5,   6.5,   -,     -,     -                   6.5       1
2          8,     11,    10,    -,     -                   11        1
3          10.5,  14.5,  14.5,  10.5,  -                   14.5      1, 2
4          12,    17,    18,    15,    8                   18        2
Now, considering the whole system, i.e. crops 1, 2 and 3, f3*(S3) can be expressed as

f3*(S3) = max [NB3(x3) + f2*(S3 - x3)],  over x3 ≤ S3,  with S3 = 4.
Table 7
State S3 = 4:  NB3(x3) + f2*(S3 - x3) for x3 = 0, 1, 2, 3, 4:
0 + 18 = 18,   6 + 14.5 = 20.5,   10 + 11 = 21,   12 + 6.5 = 18.5,   12 + 0 = 12
f3*(S3) = 21,  x3* = 2
In order to find the optimal solution, backtracking is done. From Table 7, the optimal value x3* = 2 for S3 = 4, so S2 = S3 - x3* = 2. From Table 6, x2* = 1 for S2 = 2, so S1 = S2 - x2* = 1, and from Table 5, x1* = 1 for S1 = 1. Thus,

f* = 21,  x1* = 1,  x2* = 1,  x3* = 2

These optimal values are the same as those obtained by the backward recursion method.
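The tabular computations above can be reproduced in a few lines of code. This sketch uses whole-unit allocations, as in the tables, and recovers the allocation by storing the best decision at every state:

```python
# Crop allocation DP: maximize NB1(x1) + NB2(x2) + NB3(x3), x1 + x2 + x3 <= Q.
NB = [lambda x: 5*x - 0.5*x*x,     # crop 1
      lambda x: 8*x - 1.5*x*x,     # crop 2
      lambda x: 7*x - x*x]         # crop 3
Q = 4

# Backward recursion with policy recovery: f[s] = best value with s units left.
f = [0.0] * (Q + 1)
policy = []                        # policy[t][s] = best allocation to crop t
for nb in reversed(NB):
    best = [max(range(s + 1), key=lambda x: nb(x) + f[s - x])
            for s in range(Q + 1)]
    f = [nb(best[s]) + f[s - best[s]] for s in range(Q + 1)]
    policy.append(best)
policy.reverse()

# Trace forward through the stored policies to recover the allocation.
alloc, s = [], Q
for best in policy:
    alloc.append(best[s])
    s -= alloc[-1]

print(f[Q], alloc)    # 21.0 and allocation [1, 1, 2]
```

The printed result matches the tables: a maximum total net benefit of 21 with one unit each to crops 1 and 2 and two units to crop 3.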
Capacity Expansion
Introduction
The most common applications of dynamic programming in water resources include water
allocation, capacity expansion of infrastructure and reservoir operation. In this lecture,
dynamic programming formulation for capacity expansion and a numerical example are
discussed.
Capacity expansion
Consider a municipality planning to increase the capacity of its infrastructure (e.g. a water treatment plant or water supply system) in the future. The increments are to be made sequentially at specified time intervals. Let the capacity at the beginning of time period t be St (the existing capacity) and the required capacity at the end of that period be Kt. Let xt be the capacity added in period t. The cost of expansion in each period can be expressed as a function of St and xt, i.e. Ct(St, xt). The problem is to plan the time sequence of capacity expansions that minimizes the present value of the total future costs while meeting the capacity requirement of each time period. Hence, the objective function of the optimization model can be written as,
Minimize  Σt Ct(St, xt),  t = 1, 2, ..., T
where Ct(St, xt) is the present value of the cost of adding capacity xt in time period t with initial capacity St. Each period's final capacity, i.e. the next period's initial capacity, equals the sum of the initial capacity and the added capacity, and at the end of each period the required capacity must be met. Thus, for a time period t, the constraints can be expressed as

St+1 = St + xt,  for t = 1, 2, ..., T
St+1 ≥ Kt,  for t = 1, 2, ..., T
(Figure: the capacity expansion problem as a serial multistage process; stage t has input capacity St, added capacity xt, cost Ct, and output capacity St+1, for t = 1, ..., T.)
Considering the first stage, the objective function can be written as,

f1*(S2) = min C1(S1, x1) = min C1(S1, S2 - S1)
The value of S2 can lie between K1 and KT, where K1 is the required capacity at the end of time period 1 and KT is the final required capacity. In other words, f1*(S2) should be solved for a range of S2 values between K1 and KT. Then, considering the first two stages, the suboptimization function is
f2*(S3) = min [C2(S2, x2) + f1*(S2)] = min [C2(S3 - x2, x2) + f1*(S3 - x2)],  over x2 ≥ 0

which should be solved for all values of S3 ranging from K2 to KT. Hence, in general, for a time period t the suboptimization function can be represented as

ft*(St+1) = min [Ct(St+1 - xt, xt) + ft-1*(St+1 - xt)],  over xt ≥ 0

with the constraint Kt ≤ St+1 ≤ KT. For the last stage, i.e. t = T, the function fT*(ST+1) needs to be solved only for ST+1 = KT.
Backward Recursion
The expansion problem can also be solved using a backward recursion approach with some modifications. Let the state St be the capacity at the beginning of time period t, and let ft*(St) be the minimum present value of the total cost of capacity expansion in periods t through T. For the last period T, the final capacity should reach KT after the expansion, so the objective function can be written as

fT*(ST) = min [CT(ST, xT)],  over xT ≥ KT - ST

which is solved for ST values ranging from KT-1 to KT.
Consider a five-stage capacity expansion problem. The minimum capacities to be achieved at the end of the time periods are given below.

Table 1
Kt: 10, 20, 20, 25
The expansion costs for each possible expansion at each stage are shown on the corresponding links of the figure below.
Stage 1
State S2   Added capacity x1 = S2 - S1   C1(S2)   f1*(S2)
10         10                            11       11
15         15                            15       15
20         20                            21       21
25         25                            27       27
Considering the 1st and 2nd stages together, the state variable S3 can take values from K2 to K5. Thus, the objective function for the 2nd subproblem is

f2*(S3) = min [C2(S2, x2) + f1*(S2)] = min [C2(S3 - x2, x2) + f1*(S3 - x2)]

The value of x2 should be chosen such that the minimum capacity at the end of stage 2 is 10, i.e. S3 ≥ 10.
Stage 2
[Tableau omitted: for each S3 in {10, 15, 20, 25} and each feasible x2, the quantity C2(S3 - x2, x2) + f1*(S3 - x2) is evaluated.] The resulting optima are

f2*(S3) = 11, 15, 21, 24  for S3 = 10, 15, 20, 25.

The same steps are repeated up to t = 5; for the 5th subproblem, the state variable is fixed at S6 = K5 = 25.
[The tableaus for stages 3, 4 and 5 follow the same pattern and are omitted. The computation terminates with f5*(S6 = 25) = 23.]
The figure below shows the solution, with the cost of each addition along the links and the minimum total cost at each node. The optimal cost of expansion is 23 units. Backtracking from the last stage (the farthest-right node) to the initial stage, the optimal expansions are 10 units at the 1st stage, 15 units at the 3rd stage, and 0 units at all other stages.
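The forward-moving procedure can be mechanized as below. The cost figures of the original example were attached to a figure that is not reproduced here, so this sketch uses a hypothetical cost function (a fixed charge plus a unit cost) and an assumed demand pattern, and checks the recursion against brute-force enumeration:

```python
from itertools import product

K = [10, 10, 20, 20, 25]          # required end-of-period capacities (assumed pattern)
CAPS = [0, 5, 10, 15, 20, 25]     # discrete capacity levels (assumed)

def cost(t, s, x):                # hypothetical cost: fixed charge + unit cost
    return 0 if x == 0 else 8 + 0.5 * x

# Forward DP over end-of-period capacities: f[s2] = min cost to reach s2.
f = {0: 0.0}                                   # initial capacity S1 = 0
for t, k in enumerate(K):
    nxt = {}
    for s, c in f.items():
        for s2 in CAPS:
            if s2 >= max(s, k):                # capacity can only grow, must meet K_t
                v = c + cost(t, s, s2 - s)
                if v < nxt.get(s2, float("inf")):
                    nxt[s2] = v
    f = nxt
dp_cost = min(f.values())

# Brute force over all capacity trajectories, as a check.
bf_cost = float("inf")
for traj in product(CAPS, repeat=len(K)):
    s, c, ok = 0, 0.0, True
    for t, s2 in enumerate(traj):
        if s2 < max(s, K[t]):
            ok = False
            break
        c += cost(t, s, s2 - s)
        s = s2
    if ok:
        bf_cost = min(bf_cost, c)

print(dp_cost, bf_cost)
```

With a fixed charge on every expansion, the DP correctly prefers a single large expansion over several small ones, and the DP and brute-force costs agree.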
For the backward recursion, the last-period subproblem

f5*(S5) = min [C5(S5, x5)],  over x5 ≥ K5 - S5

is solved first for S5 = 20 and 25. [Stage 5 tableau omitted.]
Following the same procedure for all the remaining stages, the optimal cost of expansion is
achieved. The computations for all stages 4 to 1 are given below.
[The stage 4, stage 3 and stage 2 tableaus are omitted. The resulting optima are f4*(S4) = 4, 0 for S4 = 20, 25; f3*(S3) = 12, 10, 4, 0 and f2*(S2) = 17, 12, 4, 0 for S3, S2 = 10, 15, 20, 25.]
Finally, for stage 1 with S1 = 0, f1*(S1) = min [C1(S1, x1) + f2*(S1 + x1)] = 23, obtained with x1 = 10 (C1 = 11 plus f2*(10) = 12). The solution is given by the figure below, with the minimum total cost of expansion at the nodes; the optimal cost of 23 units agrees with the forward recursion.
Reservoir Operation
Introduction
In the previous lectures, we discussed the application of dynamic programming to water allocation and to capacity expansion of infrastructure. Another major application is reservoir operation, which is discussed in this lecture.
Reservoir operation: steady-state optimal policy
Consider a single reservoir receiving inflow it and making releases rt for each time period t.
The maximum capacity of the reservoir is K. The optimization problem is to find the
sequence of releases to be made from the reservoir that maximizes the total net benefits.
These benefits may be from hydropower generation, irrigation, recreation etc. Let St and S t +1
be the initial and final storages for time period t. Expressing net benefits as a function of St ,
S t +1 and rt , the net benefit for period t is NBt (S t , S t +1 , rt ) .
If there are T periods in a year, then the objective function is to maximize the total net
benefits from all periods.
Maximize  Σt NBt(St, St+1, rt),  t = 1, 2, ..., T
This is subject to continuity and capacity constraints. Neglecting all minor losses such as evaporation and seepage, and assuming that there is no overflow, the continuity relation can be written as

St+1 = St + it - rt,  for t = 1, 2, ..., T

and the capacity constraint as

St ≤ K,  for t = 1, 2, ..., T
Starting from period T of the last year, at the far right, there is only one period remaining; in this case t = T and n = 1, where n is the number of remaining periods. Let fT^1(ST) be the maximum net benefit in the last period of the year considered. It can be expressed as

fT^1(ST) = max [NBT{ST, (ST + iT - rT), rT}]
  over rT ≥ 0,  rT ≤ ST + iT,  rT ≥ ST + iT - K

Moving back one period (t = T - 1, n = 2),

fT-1^2(ST-1) = max [NBT-1{ST-1, (ST-1 + iT-1 - rT-1), rT-1} + fT^1(ST-1 + iT-1 - rT-1)]
  over rT-1 ≥ 0,  rT-1 ≤ ST-1 + iT-1,  rT-1 ≥ ST-1 + iT-1 - K
The index t decreases from T to 1, then takes the value T again for the previous year, and the cycle repeats, while the index n starts from 1 and increases by one at each successive stage. The cycle is repeated until the optimal rt for each initial storage St is the same as the corresponding rt of the previous year. Such a solution is called a stationary solution. The maximum annual net benefit can then be obtained as the difference of ft^(n+T)(St) and ft^n(St) for any St and t.
Numerical example (Loucks et al., 1981)
Consider a reservoir for which the desirable constant storage is 20 units and the desirable constant release is 25 units. The capacity of the reservoir is 30 units and the inflows for the three seasons are 10, 50 and 20 units. The problem is to find the optimal St and rt that minimize the total squared deviation from the release and storage targets,

Minimize  Σt [(20 - St)² + (25 - rt)²]

Let St take the discrete values 0, 10, 20 and 30, and rt the discrete values 10, 20, 30 and 40.
Solution:
Here the number of seasons (periods) is T = 3. Considering the last period, for which t = 3 and n = 1, the optimization function is

f3^1(S3) = min [(20 - S3)² + (25 - r3)²]

The inflow for the 3rd season is I3 = 20 units and the capacity of the reservoir is K = 30 units. The release constraints can be expressed as

r3 ≤ S3 + I3 = S3 + 20
and
r3 ≥ S3 + I3 - K = S3 + 20 - 30

The computation for the first subproblem (n = 1) is shown in the table below.
Table 1
State S3   (20 - S3)² + (25 - r3)² for r3 = 10, 20, 30, 40   f3^1(S3)   r3*
0          625, 425,   -,   -                                425        20
10         325, 125, 125,   -                                125        20, 30
20         225,  25,  25, 225                                25         20, 30
30           -, 125, 125, 325                                125        20, 30
Now, considering the last two periods (n = 2), the optimization function is

f2^2(S2) = min [(20 - S2)² + (25 - r2)² + f3^1(S2 + I2 - r2)]

The inflow for the 2nd season is I2 = 50 units. The release constraints can be expressed as

r2 ≤ S2 + 50   and   r2 ≥ S2 + 50 - 30

The computation for the second subproblem (n = 2) is shown in the table below. For S2 = 30, the constraint requires r2 ≥ 50; since r2 can take only the values 10, 20, 30 and 40, no feasible release exists for S2 = 30.
Table 2

Storage S2 | Release r2 | (20 - S2)^2 + (25 - r2)^2 | S2 + I2 - r2 | f_3^1(S2 + I2 - r2) | Total | f_2^2(S2) | Optimal release r2*
0          | 20         | 425                       | 30           | 125                 | 550   | 450       | 30
           | 30         | 425                       | 20           | 25                  | 450   |           |
           | 40         | 625                       | 10           | 125                 | 750   |           |
10         | 30         | 125                       | 30           | 125                 | 250   | 250       | 30
           | 40         | 325                       | 20           | 25                  | 350   |           |
20         | 40         | 225                       | 30           | 125                 | 350   | 350       | 40
30         | na         | na                        | na           | na                  | na    | na        | na
The same procedure is repeated for all stages till n = 7. The summarized solution for this problem is given in the tables below.

Table 3

                     n = 1                n = 2            n = 3
Initial Storage St | f_3^1(S3) | r3     | f_2^2(S2) | r2 | f_1^3(S1) | r1
0                  | 425       | 20     | 450       | 30 | 1075      | 10
10                 | 125       | 20, 30 | 250       | 30 | 575       | 10, 20
20                 | 25        | 20, 30 | 350       | 40 | 275       | 20
30                 | 125       | 20, 30 | --        | na | 375       | 30
Table 4

                     n = 4            n = 5            n = 6
Initial Storage St | f_3^4(S3) | r3 | f_2^5(S2) | r2 | f_1^6(S1) | r1
0                  | 1200      | 10 | 725       | 30 | 1350      | 10
10                 | 600       | 10 | 525       | 30 | 850       | 10, 20
20                 | 300       | 20 | 625       | 40 | 550       | 20
30                 | 400       | 30 | --        | na | 650       | 30
Table 5

                     n = 7
Initial Storage St | f_3^7(S3) | r3
0                  | 1475      | 10
10                 | 875       | 10
20                 | 575       | 20
30                 | 675       | 30

At this stage, the values of r3 at n = 7 and n = 4 are exactly the same. Also, the difference f_3^7(S3) - f_3^4(S3) = 275 is the same for all St. This value is the minimum total squared deviation from the target release and storage over one year. Thus, the stationary policy obtained is given below.
Table 6

     Optimal Releases
St | r1     | r2 | r3
0  | 10     | 30 | 10
10 | 10, 20 | 30 | 10
20 | 20     | 40 | 20
30 | 30     | -- | 30
A main assumption made in dynamic programming is that the decision made at one stage depends only on the state variable and is independent of the decisions taken at other stages. In cases where the decisions made at one stage depend on the earlier decisions, dynamic programming is not an appropriate optimization technique.
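The backward recursion for the last stage of the numerical example above can be sketched in a few lines of Python. The grid values, constraint handling and function names are illustrative assumptions, not part of the original formulation:

```python
# Backward recursion for the last stage (t = 3, n = 1) of the reservoir
# example above: minimize the squared deviation from the storage target
# (20 units) and the release target (25 units).
K, I3 = 30, 20                 # capacity and third-season inflow
STORAGES = [0, 10, 20, 30]     # discrete initial storages S3
RELEASES = [10, 20, 30, 40]    # discrete releases r3

def stage_cost(s, r):
    """Squared deviation from the storage and release targets."""
    return (20 - s) ** 2 + (25 - r) ** 2

def f31():
    """Minimum cost f_3^1(S3) and the optimal releases for each storage."""
    table = {}
    for s in STORAGES:
        # a release can neither exceed the available water nor force a spill
        feasible = [r for r in RELEASES if s + I3 - K <= r <= s + I3]
        costs = {r: stage_cost(s, r) for r in feasible}
        best = min(costs.values())
        table[s] = (best, [r for r in RELEASES if costs.get(r) == best])
    return table
```

Running `f31()` reproduces the f_3^1 column of Table 1: 425, 125, 25 and 125 for S3 = 0, 10, 20 and 30.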
Rounding off the continuous (LP) solution to the nearest integer values has three drawbacks:
1. The rounded-off solutions may not be feasible.
2. The objective function value given by the rounded-off solutions (even if some are feasible) may not be the optimal one.
3. Even if some of the rounded-off solutions are optimal, checking all the rounded-off solutions is computationally expensive (2^n possible round-off values to be considered for an n-variable problem).
Maximize Z = c'X
subject to
AX <= b
X >= 0
X must be integer valued

The associated linear program obtained by dropping the integer restrictions is called the linear relaxation (LR). Thus, LR is less constrained than ILP. If the objective function coefficients are integer, then for minimization the optimal objective for ILP is greater than or equal to the rounded-up value of the optimal objective for LR, while for maximization the optimal objective for ILP is less than or equal to the rounded-down value of the optimal objective for LR.

For a minimization ILP, the optimal objective value for LR is less than or equal to the optimal objective for ILP, and for a maximization ILP, the optimal objective value for LR is greater than or equal to that of ILP. If LR is infeasible, then ILP is also infeasible. Also, if the LR optimum is integer valued, then that solution is feasible and optimal for ILP.
The most popular method for solving all-integer and mixed-integer linear programming problems is the cutting plane method of Gomory (Gomory, 1957).

Gomory's Cutting Plane Method for All-Integer Programming

Consider the following optimization problem.

Maximize Z = 3x1 + x2
subject to
2x1 - x2 <= 6
3x1 + 9x2 <= 45
x1, x2 >= 0
x1 and x2 are integers
The graphical solution for the linear relaxation of this problem is shown below.

[Figure: graphical solution of the linear relaxation. The constraints 2x1 - x2 <= 6 and 3x1 + 9x2 <= 45 bound the feasible region; the optimum of the relaxation is at (4 5/7, 3 3/7) with Z = 17 4/7.]

[Figure: the same feasible region with the integer lattice points marked; the best integer point is (4, 3), where Z = 15.]
Table 1: General form of the final simplex tableau, with basic variables xi, non-basic variables yj, coefficients cij and right-hand sides bi

Basis | y1  | y2  | ... | yj  | ... | ym  | br
x1    | c11 | c12 | ... | c1j | ... | c1m | b1
x2    | c21 | c22 | ... | c2j | ... | c2m | b2
...   |     |     |     |     |     |     |
xi    | ci1 | ci2 | ... | cij | ... | cim | bi
...   |     |     |     |     |     |     |
xn    | cn1 | cn2 | ... | cnj | ... | cnm | bn
Choose any basic variable xi with the highest fractional value. If there is a tie between two basic variables, arbitrarily choose any of them as xi. Then, from the ith equation of the table,

xi = bi - sum_{j=1..m} cij yj        ...(1)

Expressing the constants as the sum of an integer part and a fractional part,

bi = b^i + ai        ...(2)
cij = c^ij + aij        ...(3)

where b^i, c^ij denote the integer parts and ai, aij denote the fractional parts, so that 0 < ai < 1 and 0 <= aij < 1. Substituting (2) and (3) into (1) and rearranging,

ai - sum_{j=1..m} aij yj = xi - b^i + sum_{j=1..m} c^ij yj        ...(4)

For all the variables xi and yj to be integers, the right-hand side of equation (4) should be an integer:

ai - sum_{j=1..m} aij yj = integer        ...(5)
Since aij and yj are non-negative, the term sum_{j=1..m} aij yj is non-negative, and since ai < 1,

ai - sum_{j=1..m} aij yj < 1        ...(6)

Hence, for the left-hand side of (5) to be an integer, it must be zero or negative:

ai - sum_{j=1..m} aij yj <= 0        ...(7)

By introducing a slack variable si (which should also be an integer), the Gomory constraint can be written as

si - sum_{j=1..m} aij yj = -ai        ...(8)
2. If any of the basic variables has a fractional value, introduce the Gomory constraint as discussed in the previous section. Insert a new row with the coefficients of this constraint into the final tableau of the ordinary LP problem (Table 1).
3. Solve the resulting problem using the dual simplex method.
4. Check whether the new solution is all-integer or not. If all values are not integers, then a new Gomory constraint is developed from the new simplex tableau and the dual simplex method is applied again.
5. This process is continued until an optimal integer solution is obtained or it is shown that the problem has no feasible integer solution.
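The construction of the Gomory constraint from a tableau row can be sketched as a small helper. The function name and the use of exact fractions are assumptions for illustration:

```python
from fractions import Fraction

def gomory_cut(row_coeffs, rhs):
    """All-integer Gomory cut from one tableau row xi = rhs - sum(cij * yj):
    returns the fractional parts (aij, ai) of the coefficients and the
    right-hand side, giving the cut sum(aij * yj) >= ai."""
    frac = lambda v: v - (v // 1)   # non-negative fractional part
    return [frac(c) for c in row_coeffs], frac(rhs)
```

For the x1 row of the example solved below (coefficients 6/14 and 1/21, right-hand side 33/7), the resulting cut is (3/7) y1 + (1/21) y2 >= 5/7.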
xi = bi - sum_{j=1..m} cij yj
where

cij+ = cij if cij >= 0, and cij+ = 0 if cij < 0
cij- = 0 if cij >= 0, and cij- = cij if cij < 0

so that cij = cij+ + cij-. Writing bi = b^i + Bi, with b^i the integer part and Bi the fractional part (0 < Bi < 1),

sum_{j=1..m} (cij+ + cij-) yj = bi - xi = Bi + (b^i - xi)

Since xi is restricted to be an integer, b^i is also an integer and 0 < Bi < 1, the value of sum_{j=1..m} (cij+ + cij-) yj must equal Bi plus an integer, and two cases arise.

Case I: the integer is non-negative, so that

sum_{j=1..m} (cij+ + cij-) yj >= Bi, and hence sum_{j=1..m} cij+ yj >= Bi

Case II: the integer is negative, i.e.,

Bi + (b^i - xi) = -1 + Bi or -2 + Bi or -3 + Bi, ...

Therefore,

sum_{j=1..m} (cij+ + cij-) yj <= Bi - 1
Since sum_{j=1..m} cij+ yj >= 0, it follows that

sum_{j=1..m} cij- yj <= Bi - 1

Dividing both sides by (Bi - 1), which is negative, reverses the inequality:

(1/(Bi - 1)) sum_{j=1..m} cij- yj >= 1, i.e., (Bi/(Bi - 1)) sum_{j=1..m} cij- yj >= Bi

Considering both cases I and II, the final form of the constraint becomes (since one of the two inequalities must be satisfied)

sum_{j=1..m} cij+ yj + (Bi/(Bi - 1)) sum_{j=1..m} cij- yj >= Bi

By introducing a non-negative slack variable si, the Gomory constraint for the mixed-integer problem can be written as

si - sum_{j=1..m} cij+ yj - (Bi/(Bi - 1)) sum_{j=1..m} cij- yj = -Bi
Generate the Gomory constraint only for the variables having integer restrictions. Insert this constraint as the last row of the final tableau of the LP problem and solve it using the dual simplex method. MIP techniques are useful for solving pure-binary problems and any combination of real, integer and binary variables.
Maximize Z = 3x1 + x2
subject to
2x1 - x2 <= 6
3x1 + 9x2 <= 45
x1, x2 >= 0

Introducing slack variables y1 and y2, the constraints become

2x1 - x2 + y1 = 6
3x1 + 9x2 + y2 = 45
x1, x2, y1, y2 >= 0
Table 1

Basis | x1 | x2 | y1 | y2 | br | br/crs
Z     | -3 | -1 | 0  | 0  | 0  | --
y1    | 2  | -1 | 1  | 0  | 6  | 3
y2    | 3  | 9  | 0  | 1  | 45 | 15

Here x1 enters the basis (most negative Z-row entry) and y1 leaves (minimum ratio 3).
Table 2

Basis | x1 | x2   | y1   | y2 | br | br/crs
Z     | 0  | -5/2 | 3/2  | 0  | 9  | --
x1    | 1  | -1/2 | 1/2  | 0  | 3  | --
y2    | 0  | 21/2 | -3/2 | 1  | 36 | 24/7

Next, x2 enters the basis and y2 leaves (minimum ratio 24/7).
Table 3

Basis | x1 | x2 | y1   | y2   | br
Z     | 0  | 0  | 8/7  | 5/21 | 123/7
x1    | 1  | 0  | 6/14 | 1/21 | 33/7
x2    | 0  | 1  | -1/7 | 2/21 | 24/7
The optimum value of Z is 123/7, as shown above. The corresponding values of the basic variables are x1 = 33/7 = 4 5/7 and x2 = 24/7 = 3 3/7, and those of the non-basic variables are all zero. Since this solution is not integer valued, a Gomory constraint is generated from the x1 row, which has the largest fractional part (5/7).
The Gomory constraint from the x1 row is

s1 - (6/14) y1 - (1/21) y2 = -5/7
By inserting a new row with the coefficients of this constraint into the last tableau, we get

Table 4

Basis | x1 | x2 | y1    | y2    | s1 | br
Z     | 0  | 0  | 8/7   | 5/21  | 0  | 123/7
x1    | 1  | 0  | 6/14  | 1/21  | 0  | 33/7
x2    | 0  | 1  | -1/7  | 2/21  | 0  | 24/7
s1    | 0  | 0  | -6/14 | -1/21 | 1  | -5/7
Table 5

Basis | x1 | x2 | y1    | y2    | s1 | br
Z     | 0  | 0  | 8/7   | 5/21  | 0  | 123/7
x1    | 1  | 0  | 6/14  | 1/21  | 0  | 33/7
x2    | 0  | 1  | -1/7  | 2/21  | 0  | 24/7
s1    | 0  | 0  | -6/14 | -1/21 | 1  | -5/7
Ratio | -- | -- | 8/3   | 5     | -- |

In the dual simplex method, the pivot row is the s1 row (the only one with a negative br) and the pivot column is y1 (minimum ratio 8/3).
Table 6

Basis | x1 | x2 | y1 | y2  | s1   | br
Z     | 0  | 0  | 0  | 1/9 | 8/3  | 47/3
x1    | 1  | 0  | 0  | 0   | 1    | 4
x2    | 0  | 1  | 0  | 1/9 | -1/3 | 11/3
y1    | 0  | 0  | 1  | 1/9 | -7/3 | 5/3

The optimum value of Z is 47/3, as shown above. The corresponding values of the basic variables are x1 = 4, x2 = 11/3 and y1 = 5/3, and those of the non-basic variables are all zero.
Step 5: Since x2 = 11/3 is still not an integer, a second Gomory constraint is generated from the x2 row (fractional parts 1/9 for y2, 2/3 for s1 and 2/3 for br), giving s2 - (1/9) y2 - (2/3) s1 = -2/3. Adding this constraint to the previous tableau and solving using the dual simplex method:

Table 7

Basis | x1 | x2 | y1 | y2   | s1   | s2 | br
Z     | 0  | 0  | 0  | 1/9  | 8/3  | 0  | 47/3
x1    | 1  | 0  | 0  | 0    | 1    | 0  | 4
x2    | 0  | 1  | 0  | 1/9  | -1/3 | 0  | 11/3
y1    | 0  | 0  | 1  | 1/9  | -7/3 | 0  | 5/3
s2    | 0  | 0  | 0  | -1/9 | -2/3 | 1  | -2/3
Ratio | -- | -- | -- | 1    | 4    | -- |

The pivot row is the s2 row and the pivot column is y2 (minimum ratio 1).
Table 8

Basis | x1 | x2 | y1 | y2 | s1 | s2 | br
Z     | 0  | 0  | 0  | 0  | 2  | 1  | 15
x1    | 1  | 0  | 0  | 0  | 1  | 0  | 4
x2    | 0  | 1  | 0  | 0  | -1 | 1  | 3
y1    | 0  | 0  | 1  | 0  | -3 | 1  | 1
y2    | 0  | 0  | 0  | 1  | 6  | -9 | 6

All the variables now have integer values. The optimal all-integer solution is therefore x1 = 4, x2 = 3, with Z = 15.
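Because the feasible region of this small example is bounded, the all-integer optimum can be cross-checked by exhaustive enumeration. The grid bound of 15 used below is an assumption that safely covers the feasible region:

```python
# Exhaustive check of the all-integer example: maximize Z = 3*x1 + x2
# subject to 2*x1 - x2 <= 6, 3*x1 + 9*x2 <= 45, with x1, x2 >= 0 integer.
best = max(
    (3 * x1 + x2, x1, x2)
    for x1 in range(16)
    for x2 in range(16)
    if 2 * x1 - x2 <= 6 and 3 * x1 + 9 * x2 <= 45
)
```

This confirms the cutting-plane result x1 = 4, x2 = 3 with Z = 15.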
The mixed-integer version of the same problem requires only x2 to be an integer:

Maximize Z = 3x1 + x2
subject to
2x1 - x2 + y1 = 6
3x1 + 9x2 + y2 = 45
x1, x2, y1, y2 >= 0
x2 should be an integer

Step 1: Solve the problem as an ordinary LP problem. The final tableau showing the optimal solution is given below.
Table 9

Basis | x1 | x2 | y1   | y2   | br
Z     | 0  | 0  | 8/7  | 5/21 | 123/7
x1    | 1  | 0  | 6/14 | 1/21 | 33/7
x2    | 0  | 1  | -1/7 | 2/21 | 24/7
The optimum value of Z is 123/7, and the corresponding values of the basic variables are x1 = 33/7 = 4 5/7 and x2 = 24/7 = 3 3/7, with the non-basic variables all zero (i.e., y1 = y2 = 0). This is not acceptable, since x2 is required to be an integer, so a Gomory constraint is constructed from the x2 row.
For the x2 row, b2 = 24/7, so B2 = 3/7, and the coefficients are c21 = -1/7 (for y1) and c22 = 2/21 (for y2). Splitting these into positive and negative parts, c21 = c21+ + c21- and c22 = c22+ + c22-. Therefore,

c21+ = 0, c21- = -1/7, since c21 is negative
c22+ = 2/21, c22- = 0, since c22 is positive

With B2/(B2 - 1) = (3/7)/(3/7 - 1) = -3/4, the mixed-integer Gomory constraint

si - sum_j cij+ yj - (Bi/(Bi - 1)) sum_j cij- yj = -Bi

becomes

s2 - (2/21) y2 - (3/28) y1 = -3/7
Inserting this row into the final tableau:

Table 10

Basis | x1 | x2 | y1    | y2    | s2 | br
Z     | 0  | 0  | 8/7   | 5/21  | 0  | 123/7
x1    | 1  | 0  | 6/14  | 1/21  | 0  | 33/7
x2    | 0  | 1  | -1/7  | 2/21  | 0  | 24/7
s2    | 0  | 0  | -3/28 | -2/21 | 1  | -3/7
Table 11

Basis | x1 | x2 | y1    | y2    | s2 | br
Z     | 0  | 0  | 8/7   | 5/21  | 0  | 123/7
x1    | 1  | 0  | 6/14  | 1/21  | 0  | 33/7
x2    | 0  | 1  | -1/7  | 2/21  | 0  | 24/7
s2    | 0  | 0  | -3/28 | -2/21 | 1  | -3/7
Ratio | -- | -- | 32/3  | 2.5   | -- |

The pivot row is the s2 row and the pivot column is y2 (minimum ratio 2.5).
Table 12

Basis | x1 | x2 | y1   | y2 | s2    | br
Z     | 0  | 0  | 7/8  | 0  | 5/2   | 33/2
x1    | 1  | 0  | 3/8  | 0  | 1/2   | 9/2
x2    | 0  | 1  | -1/4 | 0  | 1     | 3
y2    | 0  | 0  | 9/8  | 1  | -21/2 | 9/2

The optimum value of Z is 33/2, as shown above. The corresponding values of the basic variables are x1 = 4.5, x2 = 3 and y2 = 4.5, and those of the non-basic variables are all zero (i.e., y1 = s2 = 0). This solution satisfies all the constraints, with x2 an integer as required.
[Figure: piecewise-linear approximation of a function f(x) over the breaking points t1, t2, t3, t4, with the corresponding values f(t1), f(t2), f(t3), f(t4).]
Any x can be expressed as a weighted sum of the breaking points,

x = sum_{i=1..p} wi ti

and the function approximated as

f(x) = sum_{i=1..p} wi f(ti)

where sum_{i=1..p} wi = 1 and wi >= 0. The linearly approximated model is then

Max or Min sum_{i=1..p} wi f(ti)
subject to sum_{i=1..p} wi ti = x, sum_{i=1..p} wi = 1, wi >= 0

The linearly approximated model stated above can be solved using the simplex method with a restriction: there should not be more than two wi in the basis, and two wi can take positive values only if they are adjacent. This is because, if the actual value of x is between ti and ti+1, then x can be represented as a weighted average of ti and ti+1 only, using the adjacent weights wi and wi+1.
In general, for n variables and m constraints, the approximated problem is

Max (or Min) sum_{k=1..n} sum_{i=1..p} wki fk(tki)
subject to sum_{k=1..n} sum_{i=1..p} wki gkj(tki) <= bj    for j = 1, 2, ..., m
sum_{i=1..p} wki = 1    for k = 1, 2, ..., n
wki >= 0
Method 2:
Let x be expressed as a sum of increments, instead of as the weighted sum of the breaking points as in the previous method. Then,

x = t1 + u1 + u2 + ... + u_(p-1) = t1 + sum_{i=1..p-1} ui

where ui is the increment of the variable x in the interval (ti, ti+1), i.e., the bound on ui is 0 <= ui <= ti+1 - ti.

The function f(x) can be expressed as

f(x) = f(t1) + sum_{i=1..p-1} Li ui

where Li represents the slope of the linear approximation between the points ti+1 and ti, given by

Li = ( f(ti+1) - f(ti) ) / ( ti+1 - ti )

subject to

t1 + sum_{i=1..p-1} ui = x
0 <= ui <= ti+1 - ti,   i = 1, 2, ..., p-1
Example
The example below illustrates the application of Method 1.
Consider the objective function

Maximize f = x1^3 + x2
subject to
2x1^2 + 2x2 <= 15
0 <= x1 <= 4
x2 >= 0

The problem is already in separable form (i.e., each term involves only one variable), so the objective function and constraint can be split into two parts:

f = f1(x1) + f2(x2)
g1 = g11(x1) + g12(x2)

where f1(x1) = x1^3, f2(x2) = x2, g11(x1) = 2x1^2 and g12(x2) = 2x2.
Taking five breaking points t1i = 0, 1, 2, 3, 4 for x1, the function values are tabulated below.

i        | 1 | 2 | 3 | 4  | 5
t1i      | 0 | 1 | 2 | 3  | 4
f1(t1i)  | 0 | 1 | 8 | 27 | 64
g11(t1i) | 0 | 2 | 8 | 18 | 32

With f1(x1) = sum_{i=1..5} w1i f1(t1i), the approximated problem becomes

Maximize w12 + 8 w13 + 27 w14 + 64 w15 + x2
subject to
2 w12 + 8 w13 + 18 w14 + 32 w15 + 2 x2 + s1 = 15
w11 + w12 + w13 + w14 + w15 = 1
w1i, x2, s1 >= 0
Table 2

Basis | w11 | w12 | w13 | w14 | w15 | x2 | s1 | br | br/crs
Z     | 0   | -1  | -8  | -27 | -64 | -1 | 0  | 0  | --
s1    | 0   | 2   | 8   | 18  | 32  | 2  | 1  | 15 | 1.87
w11   | 1   | 1   | 1   | 1   | 1   | 0  | 0  | 1  | 1

From the table above, it is clear that w15 should be the entering variable, for which s1 would be the exiting variable. But according to the restricted basis condition, w15 and w11 cannot occur together in the basis as they are not adjacent. Therefore, consider the next best entering variable, w14. This also is not possible, since s1 would have to exit and w14 and w11 cannot occur together. Again, considering the next best variable, w13 (ratios 1.87 and 1 shown above), it is clear that w11 should be the exiting variable.
Table 3

Basis | w11 | w12 | w13 | w14 | w15 | x2 | s1 | br
Z     | 8   | 7   | 0   | -19 | -56 | -1 | 0  | 8
s1    | -8  | -6  | 0   | 10  | 24  | 2  | 1  | 7
w13   | 1   | 1   | 1   | 1   | 1   | 0  | 0  | 1

For Table 3 above, the entering variable would be w15; the variable to exit would then be s1, which is not acceptable since w15 is not adjacent to w13. The next variable, w14, can be admitted by dropping s1.
Table 4

Basis | w11  | w12  | w13 | w14 | w15   | x2   | s1   | br
Z     | -7.2 | -4.4 | 0   | 0   | -10.4 | 2.8  | 1.9  | 21.3
w14   | -0.8 | -0.6 | 0   | 1   | 2.4   | 0.2  | 0.1  | 0.7
w13   | 1.8  | 1.6  | 1   | 0   | -1.4  | -0.2 | -0.1 | 0.3

Although the Z row still has negative entries, each corresponding entering variable (w11, w12 or w15) is blocked by the restricted basis condition, so the procedure terminates here.
The optimum solution is

x1 = sum_i w1i t1i = w13 x 2 + w14 x 3 = 0.3 x 2 + 0.7 x 3 = 2.7
x2 = 0

and the optimum value of f is 21.3. This is an approximate solution to the original nonlinear problem; the solution can be improved by taking finer breaking points.
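The Method-1 approximation used above can be sketched directly: x is written as a weighted average of the two adjacent breakpoints, and the same weights are applied to the tabulated function values. The function name is an assumption:

```python
def piecewise(x, t, ft):
    """Piecewise-linear approximation of f at x from breakpoints t and
    tabulated values ft (the adjacent-weights rule of Method 1)."""
    for i in range(len(t) - 1):
        if t[i] <= x <= t[i + 1]:
            w = (t[i + 1] - x) / (t[i + 1] - t[i])   # weight on t[i]
            return w * ft[i] + (1.0 - w) * ft[i + 1]
    raise ValueError("x lies outside the breakpoint range")
```

For x1 = 2.7 on the grid above, `piecewise(2.7, [0, 1, 2, 3, 4], [0, 1, 8, 27, 64])` returns 21.3, matching the tableau optimum; the exact value 2.7^3 = 19.683 illustrates the approximation error.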
Multi-objective Optimization
Introduction
In a real world problem it is very unlikely that we will meet the situation of a single objective with multiple constraints. The rigidity of the General Problem (GP) is thus, many a time, far removed from practical design problems. In many water resources optimization problems, maximizing aggregated net benefits is a common objective. At the same time, maximizing water quality, regional development, resource utilization and various social benefits are other objectives to be pursued. There may be conflicting objectives along with the main objective, such as irrigation, hydropower and recreation. Generally, multiple objectives or parameters have to be met before any acceptable solution can be obtained. Here it should be noticed that the word acceptable implies that there is normally no single solution to problems of the above type. Indeed, methods of multi-criteria or multi-objective analysis are not designed to identify the best solution, but only to provide information on the tradeoffs between given sets of quantitative performance criteria.
In the present discussion on multi-objective optimization, we will first introduce the
mathematical definition and then talk about two broad classes of solution methods typically
known as (i) Utility Function Method (Weighting function method) (ii) Bounded Objective
Function Method (Reduced Feasible Region Method or Constraint Method ).
Multi-objective Problem
A multi-objective optimization problem with inequality (or equality) constraints may be
formulated as
Find X = [x1, x2, ..., xn]'        (1)

which minimizes f1(X), f2(X), ..., fk(X)        (2)

subject to

gj(X) <= 0,   j = 1, 2, ..., m        (3)

Here k denotes the number of objective functions to be minimized and m is the number of constraints. It is worthwhile to mention that the objective functions and constraints need not be linear, but when they are, the problem is called Multi-objective Linear Programming (MOLP).
For the problems of the type mentioned above the very notion of optimization changes and
we try to find good trade-offs rather than a single solution as in GP. The most commonly
adopted notion of the optimum proposed by Pareto is depicted below.
A vector of the decision variables X is called Pareto optimal (efficient) if there does not exist another Y such that fi(Y) <= fi(X) for i = 1, 2, ..., k with fj(Y) < fj(X) for at least one j. In other words, a solution vector X is called optimal if there is no other vector Y that reduces some objective function without causing a simultaneous increase in at least one other objective function.
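The Pareto-optimality test above reduces to a simple dominance check on objective vectors (minimization of all objectives is assumed here, and the function name is illustrative):

```python
def dominates(fY, fX):
    """True if objective vector fY Pareto-dominates fX (all objectives
    minimized): fY is no worse in every component and strictly better
    in at least one."""
    return all(y <= x for y, x in zip(fY, fX)) and \
           any(y < x for y, x in zip(fY, fX))
```

A point X is then Pareto optimal within a set of candidates if no other candidate's objective vector dominates f(X).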
[Fig. 1: an efficient surface in objective space with three objectives i, j and k, with their directions of increase indicated.]

As shown in the above figure, there are three objectives i, j and k, whose directions of increment are indicated. The surface (which is formed based on the constraints) is efficient because no objective can be reduced without a simultaneous increase in at least one of the other objectives.
Utility Function Method (Weighting function method)
In this method a utility function is defined for each of the objectives according to the relative importance of fi. A simple utility function may be defined as ai fi(X) for the ith objective, where ai is a scalar representing the weight assigned to the corresponding objective. The total utility U may then be defined as the weighted sum of the objective functions:

U = sum_{i=1..k} ai fi(X),   ai > 0,   i = 1, 2, ..., k        (4)

The solution vector X may be found by maximizing the total utility as defined above, subject to the constraints (3). Generally the weights are normalized so that sum_{i=1..k} ai = 1, although this is not essential. The ai values indicate the relative utility of each of the objectives.
The method can be illustrated with two decision variables and two objective functions. Here X = [x1, x2]' and the two objectives are f1(X) and f2(X), with upper bound constraints* of type (3) as in Fig. 2.

[Fig. 2: the decision space in the (x1, x2) plane, bounded by the constraints g1(X) to g6(X), with corner points O, A, C, D and E.]

*constraints g1(X) to g6(X) include the bounds on x1, x2
For Linear Programming (LP), the Pareto front is obtained by plotting the values of the objective functions at the common points (points of intersection) of the constraints and joining them by straight lines in the objective space. It should be noted that not all points on the constraint surface need be efficient in the Pareto sense, such as point A in the following figure.

[Fig. 3: the objective space in the (f1, f2) plane, with the efficient front through points B, C and D.]
By looking at Figure 3, one may qualitatively verify that it follows the Pareto optimal definition. Optimizing the utility function then means moving along the efficient front and looking for the maximum value of the utility function U defined by equation (4). One major limitation is that this method cannot generate the complete set of efficient solutions unless the efficiency frontier is strictly convex. If a part of it is concave, only the end points of that part can be obtained by the weighting technique.
Bounded objective function method
In this method we try to trap the optimal solution of the objective functions in a bounded or reduced feasible region. In formulating the problem, one objective function is maximized while all other objectives are converted into constraints with lower bounds, along with the other constraints of the problem. Mathematically the problem may be formulated as

Maximize fi(X)
subject to gj(X) <= 0,   j = 1, 2, ..., m        (5)
fk(X) >= ek,   k != i

where ek represents the lower bound of the kth objective. In this approach the feasible region S represented by gj(X) <= 0, j = 1, 2, ..., m is further reduced to S' by the (k - 1) constraints fk(X) >= ek, k != i.
For example, let there be three objectives to be maximized in the region of constraints S. The problem may be formulated as:

maximize {objective-1}
maximize {objective-2}
maximize {objective-3}
subject to X = [x1, x2]' in S

where S identifies the region given by gj(X) <= 0, j = 1, 2, ..., m.

In the bounded objective function method, the same problem may be formulated as

maximize {objective-1}
subject to
{objective-2} >= e1
{objective-3} >= e2
X in S

As may be seen, {objective-1} is now the only objective and all other objectives are included as constraints, with specified lower bounds giving the minimum values that must at least be attained. Subject to these additional constraints, the objective is maximized. Figure 4 illustrates the scheme.
[Fig. 4: the reduced feasible region S' within S in the (x1, x2) plane; w1, w2 and w3 are the gradients of the three objectives, and e1, e2 are the lower bounds on objectives 2 and 3.]
In the above figure, w1, w2 and w3 are the gradients of the three objectives. If {objective-1} were to be maximized in the region S without taking the other objectives into consideration, the solution point would be E. But due to the lower bounds on the other objectives, the feasible region reduces to S' and the solution point is now P. It may be seen that changing e1 does not affect {objective-1} as much as changing e2 does. This observation gives rise to sensitivity analysis.
Exercise Problem
A reservoir is planned both for gravity and lift irrigation through withdrawals from its
storage. The total storage available for both the uses is limited to 5 units each year. It is
decided to limit the gravity irrigation withdrawals in a year to 4 units. If X1 is the allocation
of water for gravity irrigation and X2 is the allocation for lift irrigation, the two objectives
planned to be maximized are expressed as
Maximize
Z2(X) = - x1 + 4x2
Multilevel Optimization
Introduction
The example problems discussed in the previous modules consist of very few decision
variables and constraints. However in practical situations, one has to handle an optimization
problem involving a large number of variables and constraints. Solving such a problem will
be quite cumbersome. In multilevel optimization, such large sized problems are decomposed
into smaller independent problems and the overall optimum solution can be obtained by
solving each sub-problem independently. In this lecture a decomposition method for
nonlinear optimization problems, known as model-coordination method will be discussed.
Model Coordination Method
Consider the minimization of a function F(x) of n variables x1, x2, ..., xn:

Min F(x1, x2, ..., xn)

subject to the constraints
gj(x1, x2, ..., xn) <= 0,   j = 1, 2, ..., m
lxi <= xi <= uxi,   i = 1, 2, ..., n

where lxi and uxi represent the lower and upper bounds of the decision variable xi.
Let X = {x1 , x2 ,..., xn } be the decision variable vector. For applying the model coordination
method, the vector X should be divided into two subvectors, Y and Z such that Y contains the
coordination variables between the subsystems i.e., variables that are common to the
subproblems and Z vector contains the free or confined variables of subproblems. If the
problem is partitioned into P subproblems, then vector Z can also be partitioned into P
variable sets, each set corresponding to each subproblem.
Z = [Z1, Z2, ..., ZP]'
Thus the objective function F (x ) can be partitioned into P parts as shown,
F(x) = sum_{k=1..P} fk(Y, Zk)

where fk(Y, Zk) denotes the objective function of the kth subproblem. The coordination variable Y appears in all sub-objective functions, while Zk appears only in the kth sub-objective function.
Similarly the constraints can be decomposed as
g k (Y , Z k ) 0
for k = 1,2,..., P
lY Y uY
lZ k Z k uZ k
for k = 1,2,..., P
The problem thus decomposed is solved using a two level approach which is described
below.
Procedure:
First level:
Fix the value of the coordination variable Y at some trial value, say Y*. Then solve each independent subproblem and find the corresponding Zk:

Min fk(Y*, Zk)
subject to
gk(Y*, Zk) <= 0
lZk <= Zk <= uZk
for k = 1, 2, ..., P

Second level:
Using the subproblem solutions Zk*, adjust the coordination variable by solving

Min sum_{k=1..P} fk(Y, Zk*)
subject to
lY <= Y <= uY

The two levels are repeated until convergence.
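The two-level procedure can be illustrated on a small contrived problem with one coordination variable y and two subproblem variables z1, z2. The objective F = (y - 1)^2 + (z1 - y)^2 + (z2 + y)^2, the closed-form subproblem solutions and the grid search at the second level are all assumptions chosen for illustration:

```python
# Model-coordination sketch: minimize (y-1)**2 + (z1-y)**2 + (z2+y)**2.
# y is the coordination variable shared by the subproblems; z1 and z2
# are the free variables of two independent subproblems.

def first_level(y):
    """Solve each subproblem independently for fixed y."""
    z1 = y    # argmin of (z1 - y)**2
    z2 = -y   # argmin of (z2 + y)**2
    return z1, z2

def second_level(y_grid):
    """Adjust y by searching a grid, re-solving the subproblems each time."""
    def total(y):
        z1, z2 = first_level(y)
        return (y - 1) ** 2 + (z1 - y) ** 2 + (z2 + y) ** 2
    return min(y_grid, key=total)
```

With the subproblems solved exactly, the coordinated optimum is y = 1, z1 = 1, z2 = -1.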
[Flowchart: a generic iterative search — compute the objective function f(Xi); generate a new solution Xi+1; compute f(Xi+1); set i = i + 1; check convergence; if not converged, repeat; otherwise the optimal solution is Xopt = Xi.]
grad f = [df/dx1, df/dx2, ..., df/dxN]'        (1)

The function f can be approximated about the point Xi by the second-order expansion

f(X) = f(Xi) + grad fi' (X - Xi) + (1/2) (X - Xi)' [Ji] (X - Xi)        (2)

where [Ji] = [J]|Xi is the Hessian matrix of f evaluated at the point Xi. Setting the partial derivatives of Eq. (2) to zero, the minimum value of f(X) can be obtained:

df(X)/dxj = 0,   j = 1, 2, ..., N        (3)

grad f = grad fi + [Ji](X - Xi) = 0        (4)

Xi+1 = Xi - [Ji]^(-1) grad fi        (5)

The procedure is repeated till convergence for finding the optimal solution.
D) Marquardt Method: The Marquardt method combines the steepest descent algorithm and Newton's method, retaining the advantages of both: steady movement of the function value towards the optimum point and a fast convergence rate. The optimum solution is obtained by iteratively modifying the diagonal elements of the Hessian matrix.

E) Quasi-Newton Method: Quasi-Newton methods are well-known algorithms for finding maxima and minima of nonlinear functions. They are based on Newton's method, but approximate the Hessian matrix, or its inverse, in order to reduce the amount of computation per iteration. The Hessian matrix is updated using the secant equation, a generalization of the secant method to multidimensional problems.
It should be noted that the above-mentioned algorithms can be used only for unconstrained optimization. For constrained optimization, a common procedure is the use of a penalty function to convert the constrained problem into an unconstrained one. Let us assume that, at a point Xi, the amount of violation of a constraint is d. The penalized objective function is then given by

f'(Xi) = f(Xi) + n M d^2        (6)

where n = 1 for a minimization problem and n = -1 for a maximization problem, and M is a dummy variable with a very high value. The penalty function automatically makes the solution inferior wherever there is a violation of a constraint.
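Newton's iteration of Eq. (5) can be sketched for a simple two-variable quadratic. The test function and the hand-coded gradient and inverse Hessian are assumptions for illustration:

```python
# Newton step X_{i+1} = X_i - [J_i]^{-1} * grad(f_i) applied to
# f(x1, x2) = (x1 - 3)**2 + 2*(x2 + 1)**2. The Hessian is the constant
# diagonal matrix diag(2, 4), so its inverse is diag(1/2, 1/4).
def newton(x, iters=5):
    for _ in range(iters):
        g = (2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0))   # gradient at x
        x = (x[0] - 0.5 * g[0], x[1] - 0.25 * g[1])     # Newton update
    return x
```

For a quadratic function, the method converges in a single step, here to the minimizer (3, -1).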
Introduction
Most real world optimization problems involve complexities like discrete, continuous or
mixed variables, multiple conflicting objectives, non-linearity, discontinuity and non-convex
region. The search space (design space) may be so large that the global optimum cannot be found in a reasonable time. The existing linear or nonlinear methods may not be efficient, or may be computationally expensive, for solving such problems. Various stochastic search methods
like simulated annealing, evolutionary algorithms (EA) or hill climbing can be used in such
situations. EAs have the advantage of being applicable to any combination of complexities
(multi-objective, non-linearity etc) and also can be combined with any existing local search
or other methods. Various techniques which make use of EA approach are Genetic
Algorithms (GA), evolutionary programming, evolution strategy, learning classifier system
etc. All these EA techniques operate mainly on a population search basis. In this lecture
Genetic Algorithms, the most popular EA technique, is explained.
Concept
EAs start from a population of possible solutions (called individuals) and move towards the
optimal one by applying the principle of Darwinian evolution theory i.e., survival of the
fittest. Objects forming possible solution sets to the original problem are called the phenotype, and the encoding (representation) of the individuals in the EA is called the genotype. The mapping between phenotype and genotype differs in each EA technique. In GA, which is the most popular EA,
the variables are represented as strings of numbers (normally binary). If each design variable is given a string of length l, and there are n such variables, then the design vector will have a total string length of nl. For example, let there be 3 design variables, each with a string length of 4. If the variables are x1 = 4, x2 = 7 and x3 = 1, the chromosome length is 12, as shown below.
0100 | 0111 | 0001     (x1 = 4, x2 = 7, x3 = 1)
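Decoding the 12-bit chromosome shown above back into its three 4-bit variables is straightforward:

```python
# Decode the 12-bit chromosome above into three 4-bit design variables.
chromosome = "010001110001"
x1, x2, x3 = (int(chromosome[i:i + 4], 2) for i in range(0, 12, 4))
```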
An individual consists of a genotype and a fitness value. Fitness represents the quality of the solution (it is computed by a fitness function) and forms the basis for selecting individuals, thereby facilitating improvement.
The pseudo code for a simple EA is given below
i=0
Initialize population P0
Evaluate initial population
while ( ! termination condition)
{
i = i+1
Perform competitive selection
Create population Pi from Pi-1 by recombination and mutation
Evaluate population Pi
}
[Flowchart: the regeneration cycle of an EA — evaluate the population; if the best individuals meet the optimization criteria, stop; otherwise select parents, recombine and mutate to form the next generation, and repeat.]
The initial population is usually generated randomly in all EAs. The termination condition may be a desired fitness value, a maximum number of generations, etc. In selection, individuals with better fitness values from generation i are taken to generate the individuals of generation i+1. The new population (offspring) is created by applying recombination and mutation to the selected individuals (parents). Recombination creates one or two new individuals by swapping (crossing over) part of the genome of one parent with another; a recombined individual is then mutated by changing a single element (gene) to create a new individual.
As an illustration, consider four candidate strings encoding the variables a, b, c and d, with the length of each string taken as four bits. The first column of the table below represents the candidates, the second column gives the fitness values of the decoded strings, and the third column gives the percentage contribution of each string to the total fitness of the population. By the "Roulette Wheel" method, the probability of candidate 1 being selected as a parent of the next generation is 28.09%. Similarly, the probabilities that candidates 2, 3 and 4 will be chosen for the next generation are 19.59%, 12.89% and 39.43% respectively. These probabilities are represented on a pie chart, and then four numbers are randomly generated between 1 and 100. The numbers generated might fall in the region of candidate 2 once, in that of candidate 4 twice, in that of candidate 1 more than once, and in that of candidate 3 not at all. Thus the strings are chosen to form the parents of the next generation.
Candidate | Fitness value | % of total
1         | 109           | 28.09
2         | 76            | 19.59
3         | 50            | 12.89
4         | 153           | 39.43
Total     | 388           | 100
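Roulette-wheel selection over the fitness column above can be sketched as follows (the function name and the seeded generator are assumptions):

```python
import random

def roulette(fitnesses, rng):
    """Select an index with probability proportional to its fitness."""
    shot = rng.uniform(0, sum(fitnesses))
    acc = 0.0
    for i, f in enumerate(fitnesses):
        acc += f
        if shot <= acc:
            return i
    return len(fitnesses) - 1   # guard against floating-point round-off
```

Drawing many times with the table's fitnesses (109, 76, 50, 153) picks candidate 4 most often, close to its 39.43% share of the wheel.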
Rank Selection:
The previous type of selection may have problems when the fitnesses differ very much. For
example, if the best chromosome fitness is 90% of the entire roulette wheel then the other
chromosomes will have very few chances to be selected. Rank selection first ranks the
population and then every chromosome receives fitness from this ranking. The worst will
have fitness 1, second worst 2 etc. and the best will have fitness N (number of chromosomes
in population). By this, all the chromosomes will have a chance to be selected. But this
method can lead to slower convergence, because the best chromosomes may not differ much
from the others.
Crossover
Selection alone cannot introduce any new individuals into the population, i.e., it cannot find new points in the search space. These are generated by genetically-inspired operators, of which the best known are crossover and mutation.

Crossover can follow either a one-point or a two-point scheme. In one-point crossover, a selected pair of strings is cut at some random position and their segments are swapped to form a new pair of strings. In the two-point scheme, two break points are randomly chosen, and the segments of the two strings between the break points are swapped so that two new strings are formed.
10111011 -> 10111111

It is seen in the above example that the sixth bit '0' is changed to '1'. Thus, in the mutation process, bits are changed from '1' to '0' or '0' to '1' at a randomly chosen position of a randomly selected string.
Real-coded GAs
As explained earlier, GAs work with a coding of variables i.e., with a discrete search space.
GAs have also been developed to work directly with continuous variables. In these cases,
binary strings are not used. Instead, the variables are directly used. After the creation of
population of random variables, a reproduction operator can be used to select good strings in
the population.
Advantages and Disadvantages of EA:
EAs can be efficiently used for highly complex problems with multi-objectivity, non-linearity, etc. They provide not only the single best solution but also the 2nd best, 3rd best and so on, as required. They give quick approximate solutions and can be combined well with other local search algorithms.

There are also some drawbacks to using EA techniques. An optimal solution cannot be guaranteed by EA methods, which are usually known as heuristic search methods. Convergence of EA techniques is problem dependent. Sensitivity analysis should be carried out to find the range in which the model is efficient. Also, the implementation of these techniques requires good programming skill.
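A minimal GA in the spirit described above can be sketched on the toy "onemax" problem (maximize the number of 1-bits in a chromosome). Tournament selection replaces the roulette wheel to keep the sketch short, and all parameter values are illustrative assumptions:

```python
import random

def ga_onemax(bits=12, pop_size=20, generations=40, seed=1):
    """Tiny GA: binary tournament selection, one-point crossover,
    single-bit mutation, applied to the onemax problem."""
    rng = random.Random(seed)
    fitness = sum  # number of 1-bits in a chromosome
    pop = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def pick():
            a, b = rng.sample(pop, 2)           # binary tournament
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, bits)        # one-point crossover
            child = p1[:cut] + p2[cut:]
            child[rng.randrange(bits)] ^= 1     # single-bit mutation
            nxt.append(child)
        pop = nxt
    return max(fitness(ind) for ind in pop)
```

With these settings the best individual quickly approaches the all-ones chromosome, though, as noted above, optimality is not guaranteed.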
(1)
Minimize
(2)
c = f (x )
(3)
Eq. (3) represents the relationship between the water quality indicator and the fractional removal levels of the pollutants. It should be noted that the relationship between c and x may be nonlinear, and therefore linear programming may not be applicable. In such cases, the application of evolutionary algorithms is a possible alternative. Interested readers may refer to Tung and Hathhorn (1989), Sasikumar and Mujumdar (1998), and Mujumdar and Subbarao (2004).
to its internal fleet of vehicles or to outsource the jobs to other companies. The solution to the TTVRP consists of finding a complete routing schedule for serving the jobs with (1) minimum routing distance and (2) minimum number of trucks, subject to a number of constraints such as time windows and availability of trailers.
Multiobjective evolutionary algorithm can be used to solve such models. Applications of
evolutionary algorithm in solving transportation problems can be found in Lee et al. (2003).
References:
1. Hillier, F.S. and G.J. Lieberman, Operations Research, CBS Publishers & Distributors, New Delhi, 1987.
2. Luenberger, D.G., Linear and Non-linear Programming, Addison-Wesley, New York, 1990.
3. Goldberg, D.E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison Wesley Longman, New York, 1989.
4. Rao, S.S., Engineering Optimization: Theory and Practice, Third Edition, New Age International Limited, New Delhi, 2000.
5. Ravindran, A., D.T. Phillips and J.J. Solberg, Operations Research: Principles and Practice, John Wiley & Sons, New York, 2001.
6. Taha, H.A., Operations Research: An Introduction, Prentice-Hall of India Pvt. Ltd., New Delhi, 2005.
7. Deb, K., Multi-Objective Optimization using Evolutionary Algorithms, First Edition, John Wiley & Sons Pte Ltd, 2002.
8. Deb, K., Optimization for Engineering Design: Algorithms and Examples, Prentice Hall of India Pvt. Ltd., New Delhi, 1995.
9. Dorigo, M. and T. Stutzle, Ant Colony Optimization, Prentice Hall of India Pvt. Ltd., New Delhi, 2005.
10. Jain, S.K. and V.P. Singh, Water Resources Systems Planning and Management, Elsevier B.V., The Netherlands, 2003.
11. Loucks, D.P., J.R. Stedinger and D.A. Haith, Water Resources Systems Planning and Analysis, Prentice Hall, N.J., 1981.
12. Mays, L.W. and K. Tung, Hydrosystems Engineering and Management, McGraw-Hill Inc., New York, 1992.
13. Vedula, S. and P.P. Mujumdar, Water Resources Systems: Modelling Techniques and Analysis, Tata McGraw Hill, New Delhi, 2005.
14. Bazaraa, M.S., J.J. Jarvis and H.D. Sherali, Linear Programming and Network Flows, 2nd ed., John Wiley & Sons, New York, 1990.
15. Bazaraa, M.S., H.D. Sherali and C.M. Shetty, Nonlinear Programming: Theory and Algorithms, John Wiley & Sons, New York, 1993.
16. Michalewicz, Z. and M. Michalewicz, "Evolutionary computation techniques and their applications," in IEEE International Conference on Intelligent Processing Systems, Beijing, 1997, pp. 14-25.
17. Alotto, P.G., C. Eranda and B. Brandstaetter, "Stochastic algorithms in electromagnetic optimization," IEEE Transactions on Magnetics, vol. 34, no. 5, pp. 3674-3684, Sept. 1998.
18. Coello, C.A.C., "Handling preferences in evolutionary multiobjective optimization: A survey," in IEEE Congress on Evolutionary Computation, vol. 1, New Jersey, 2000, pp. 30-37.
19. Vasconcelos, J.A. and A.H.F. Dias, "Multiobjective genetic algorithms applied to solve optimization problems," IEEE Transactions on Magnetics, vol. 38, no. 2, pp. 1133-1136, Mar. 2002.
20. Krishnakumar, K. and D.E. Goldberg, "Control system optimization," Journal of Guidance, Control, and Dynamics, vol. 15, no. 3, pp. 735-740, 1992.