
SIMPLEX LIKE (aka REDUCED GRADIENT) METHODS


The REDUCED GRADIENT algorithm and its variants such as the CONVEX
SIMPLEX METHOD (CSM) and the GENERALIZED REDUCED GRADIENT
(GRG) algorithm are approximation methods that are similar in spirit to the Simplex
method for LP.
The reduced gradient method and the CSM operate on a linear constraint set
with nonnegative decision variables, while GRG generalizes the procedure to
nonlinear constraints, possibly with lower and upper bounds on decision variables.

REDUCED GRADIENT METHOD (Wolfe)


This is a generalization of the simplex method for LP to solve problems with nonlinear objectives:

Minimize  f(x)
st        Ax = b,
          x ≥ 0.

Here we assume that A is an (m×n) matrix of rank m, with b ∈ Rm and x ∈ Rn.


At iteration i we have the n decision variables partitioned as BASIC and NON-BASIC, i.e.,

x = [xB; xN],  A = [B | N],  Rank(B) = m, so B⁻¹ exists.

Thus Ax = [B | N][xB; xN] = BxB + NxN = b.

(After possible reordering), let xi = [xBi; xNi] be the current design.


A requirement for guaranteeing convergence is that no degeneracy is permitted at any iteration, i.e., xB > 0. If more than m variables are positive at an iteration, then the m largest (positive) variables are selected as the basic variables (assuming their columns are linearly independent).
We wish to decrease f(x) while maintaining feasibility. Suppose we change the nonbasic variables by an amount ΔxN from xNi, and the basic variables by an amount ΔxB from xBi. Then Δ = [ΔxB; ΔxN].

As xN changes from xNi by ΔxN, xB should be changed by ΔxB in such a way that the new point xi+1 = xi + Δ is still feasible, i.e.,

Axi+1 = [B | N][xBi + ΔxB; xNi + ΔxN] = b
BxBi + BΔxB + NxNi + NΔxN = b
(BxBi + NxNi) + BΔxB + NΔxN = b
b + BΔxB + NΔxN = b
ΔxB = -B⁻¹NΔxN

That is, if we change xB by ΔxB = -B⁻¹NΔxN then our new design will satisfy Ax = b.

Now consider Δf (the change in f) for small Δ: Δf ≈ ΔT∇f(xi), i.e.,

Δf = ∇BfT(xi)ΔxB + ∇NfT(xi)ΔxN     (*)

Substituting for ΔxB in (*) above we have:

Δf = ∇BfT(xi)(-B⁻¹NΔxN) + ∇NfT(xi)ΔxN = {∇NfT(xi) - ∇BfT(xi)B⁻¹N}ΔxN = rNTΔxN

(say), where rNT = ∇NfT(xi) - ∇BfT(xi)B⁻¹N is the REDUCED GRADIENT of f with respect to the nonbasic variables.

In summary:
IF xN changes from xNi by ΔxN,
AND xB is correspondingly changed from xBi by ΔxB to maintain feasibility,
THEN Δf = rNTΔxN is the change in f.

Now, since we are minimizing, we would like to move in the direction -rN to decrease f. Let dN be the search direction for xN. Then for i ∈ N,
a) If ri ≤ 0, di = -ri (≥ 0): we 'increase' xi in the direction -ri;
b) If ri > 0, di = -ri (< 0): we 'decrease' xi in the direction -ri as long as xi > 0, and set di = 0 if xi = 0 (since xi cannot decrease below zero).

As proposed initially by Wolfe, the Reduced Gradient method uses, for i ∈ N,

di = 0    if ri > 0 and xi = 0,
di = -ri  otherwise.
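The two computations above (the reduced gradient and Wolfe's direction rule) can be sketched in a few lines of Python. The 2×2 data at the bottom are a made-up illustration (an identity basis with an arbitrary N and gradient partition), not taken from any particular problem:

```python
def reduced_gradient(grad_B, grad_N, B_inv, N):
    """r_N^T = grad_N^T - grad_B^T B^{-1} N, with matrices as nested lists."""
    # w^T = grad_B^T B^{-1}
    w = [sum(grad_B[k] * B_inv[k][j] for k in range(len(grad_B)))
         for j in range(len(B_inv[0]))]
    # r_j = (grad_N)_j - w^T (column j of N)
    return [grad_N[j] - sum(w[k] * N[k][j] for k in range(len(w)))
            for j in range(len(grad_N))]

def wolfe_direction(r, x_N):
    """Wolfe's rule: d_i = 0 if r_i > 0 and x_i = 0, else d_i = -r_i."""
    return [0.0 if (ri > 0 and xi == 0) else -ri for ri, xi in zip(r, x_N)]

# Illustrative data: B = I (so B_inv = I); N and the gradients are arbitrary.
B_inv = [[1.0, 0.0], [0.0, 1.0]]
N = [[2.0, 1.0], [1.0, 2.0]]
r = reduced_gradient([0.0, 0.0], [-1.0, -1.0], B_inv, N)
d = wolfe_direction(r, [0.0, 0.0])
```

With a zero basic gradient the reduced gradient equals the nonbasic gradient, and both nonbasic variables are increased.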

The Convex Simplex method (Zangwill) chooses only one axis to move along. Let

Δa = Max {-ri | ri ≤ 0}     (case (a): the best coordinate to increase),
Δb = Max {xiri | ri ≥ 0}    (case (b): the best coordinate to decrease),

and say Δa = -rs (≥ 0), Δb = xtrt (≥ 0).


If
(1) Δa ≥ Δb > 0, OR Δa > 0 and Δb = 0, then
    dN = [0 ... 0 1 0 ... 0]T   (1 in the s-th position)
(2) Δb > Δa > 0, OR Δb > 0 and Δa = 0, then
    dN = [0 ... 0 -1 0 ... 0]T  (-1 in the t-th position)
(3) Δa = Δb = 0 (including the cases where either maximum is taken over an empty set), then
    we have a K-K-T point: STOP!
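The three-case rule can be sketched as a small Python function. This is a minimal illustration of the selection logic only (ties and empty candidate sets are treated as Δ = 0), not a full CSM implementation:

```python
def csm_direction(r, x_N):
    """Convex Simplex direction: move along a single nonbasic axis.

    Returns d_N, or None at a K-K-T point.  Delta_a ranks coordinates to
    increase; Delta_b ranks coordinates to decrease, weighted by x_i so a
    variable already at zero is never decreased."""
    cand_a = [(-r[i], i) for i in range(len(r)) if r[i] <= 0]
    cand_b = [(x_N[i] * r[i], i) for i in range(len(r)) if r[i] >= 0]
    delta_a, s = max(cand_a) if cand_a else (0.0, None)
    delta_b, t = max(cand_b) if cand_b else (0.0, None)
    if delta_a == 0.0 and delta_b == 0.0:
        return None                      # case (3): K-K-T point, stop
    d = [0.0] * len(r)
    if delta_a >= delta_b:
        d[s] = 1.0                       # case (1): increase x_s
    else:
        d[t] = -1.0                      # case (2): decrease x_t
    return d

# With r_N = (-1, -1) at x_N = (0, 0) both coordinates tie; one gets +1.
d1 = csm_direction([-1.0, -1.0], [0.0, 0.0])
d2 = csm_direction([0.0, 0.0], [0.0, 0.0])   # K-K-T point
```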


Once we have dN, then dB = -B⁻¹NdN and d = [dB; dN].

Once d is determined we do a line search as usual in the direction d (by using a one-dimensional search procedure like Golden Section, Fibonacci, etc.), i.e.,

Min f(xi + αd), st 0 ≤ α ≤ αmax, where αmax = sup {α | xi + αd ≥ 0}.

If this yields α*, then xi+1 = xi + α*d, and so on...
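A minimal Python sketch of this step follows: the bound αmax comes from a ratio test over the components that decrease, and the one-dimensional minimization uses golden-section search. The test data (the quadratic α² - α on [0, 0.5]) are an assumed illustration:

```python
def alpha_max(x, d):
    """Largest step a keeping x + a*d >= 0 componentwise."""
    limits = [-xi / di for xi, di in zip(x, d) if di < 0]
    return min(limits) if limits else float('inf')

def golden_section(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal phi on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2                  # ~0.618
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if phi(c) < phi(d):
            b, d = d, c                           # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                           # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return (a + b) / 2

# Illustration: from x = (0,0,1,1) along d = (1,0,-2,-1), phi(a) = a^2 - a.
amax = alpha_max([0.0, 0.0, 1.0, 1.0], [1.0, 0.0, -2.0, -1.0])
astar = golden_section(lambda a: a * a - a, 0.0, amax)
```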

EXAMPLE (CSM):
Min  x1² - x1 - x2
st   2x1 + x2 ≤ 1,
     x1 + 2x2 ≤ 1,  x1, x2 ≥ 0

Adding slack variables we rewrite the constraints as

2x1 + x2 + x3      = 1
x1 + 2x2      + x4 = 1;   x1, x2, x3, x4 ≥ 0


For this problem, ∇fT(x) = [2x1-1  -1  0  0].

Start with x1 = x2 = 0, x3 = x4 = 1, f(x0) = 0, and with x3 and x4 in the basis.


ITERATION 1: xB = [x3; x4], xN = [x1; x2],

A = [B | N] = [1 0 2 1; 0 1 1 2], i.e., B = [1 0; 0 1] and N = [2 1; 1 2].

xB = [x3; x4] = [1; 1], xN = [x1; x2] = [0; 0].  (Note that Rank(B) = 2.)

∇Bf(x0) = [0; 0], ∇Nf(x0) = [2(0)-1; -1] = [-1; -1], and B⁻¹ = [1 0; 0 1].

Search Direction: rNT = ∇NfT(x0) - ∇BfT(x0)B⁻¹N = [-1 -1] - [0 0]B⁻¹N = [-1 -1]

Δa = max{+1, +1} = 1;  Δb = 0 (the set {i | ri ≥ 0} is empty).

We could choose dN = [1; 0] or [0; 1]. Let us choose dN = [1; 0].

Then dB = -B⁻¹NdN = -[2 1; 1 2][1; 0] = [-2; -1],

so in the original ordering (x1, x2, x3, x4), d = [1; 0; -2; -1].

Line Search: x0 + αd = [α; 0; 1-2α; 1-α], so

αmax = sup{ α | α ≥ 0, (1-2α) ≥ 0, (1-α) ≥ 0 } = 0.5


Minimize over 0 ≤ α ≤ 0.5: f(x0 + αd) = α² - α.

A line search yields α* = 0.5. Therefore

x1 = x0 + α*d = [1/2; 0; 0; 1/2],  f(x1) = -0.25

In summary, x1 has now entered the basis and replaced x3 (which cannot be in the
basis since it has reached its lower bound of zero).
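Iteration 1 can be reproduced end to end in a short Python sketch; since B = I here, B⁻¹ = I and the linear algebra reduces to small sums. Everything below comes directly from the example's data:

```python
# Iteration 1: x0 = (0,0,1,1), basis {x3, x4}, B = I, N from columns of A.
B_inv = [[1.0, 0.0], [0.0, 1.0]]
N = [[2.0, 1.0], [1.0, 2.0]]            # columns of A for x1, x2
grad = lambda x: [2 * x[0] - 1, -1, 0, 0]
x0 = [0.0, 0.0, 1.0, 1.0]

g = grad(x0)
grad_B, grad_N = [g[2], g[3]], [g[0], g[1]]
# Reduced gradient: r_N^T = grad_N^T - grad_B^T B^{-1} N
w = [sum(grad_B[k] * B_inv[k][j] for k in range(2)) for j in range(2)]
r = [grad_N[j] - sum(w[k] * N[k][j] for k in range(2)) for j in range(2)]
# Choose d_N = (1, 0), so d_B = -B^{-1} N d_N.
d_N = [1.0, 0.0]
d_B = [-sum(B_inv[i][k] * N[k][j] * d_N[j]
            for k in range(2) for j in range(2)) for i in range(2)]
d = [d_N[0], d_N[1], d_B[0], d_B[1]]    # ordered (x1, x2, x3, x4)
amax = min(-x0[i] / d[i] for i in range(4) if d[i] < 0)
# f(x0 + a d) = a^2 - a decreases throughout [0, 0.5], so a* = 0.5.
astar = 0.5
x1 = [x0[i] + astar * d[i] for i in range(4)]
f1 = x1[0] ** 2 - x1[0] - x1[1]
```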

ITERATION 2: xB = [x1; x4], xN = [x3; x2],

B = [2 0; 1 1], N = [1 1; 0 2], xB = [1/2; 1/2], xN = [0; 0], and B⁻¹ = [1/2 0; -1/2 1].

(In the original ordering, x1 = [1/2; 0; 0; 1/2].)

∇Bf(x1) = [2(1/2)-1; 0] = [0; 0], ∇Nf(x1) = [0; -1].

Search Direction: rNT = ∇NfT(x1) - ∇BfT(x1)B⁻¹N = [0 -1] - [0 0]B⁻¹N = [0 -1]

Δa = max{+1} = 1;  Δb = max{x3·0} = 0.

Since Δa > Δb we have dN = [0; 1] (increase x2).

dB = -B⁻¹NdN = -[1/2 0; -1/2 1][1 1; 0 2][0; 1] = -[1/2; 3/2] = [-1/2; -3/2],

so in the original ordering d = [-1/2; 1; 0; -3/2].

Line Search: x1 + αd = [(1-α)/2; α; 0; (1-3α)/2], so

αmax = sup{ α | (1-α)/2 ≥ 0, α ≥ 0, (1-3α)/2 ≥ 0 } = 1/3

Minimize over 0 ≤ α ≤ 1/3: f(x1 + αd) = [(1-α)/2]² - (1-α)/2 - α.

A line search yields α* = 1/3. Therefore

x2 = x1 + α*d = [1/3; 1/3; 0; 0],  f(x2) = -5/9 ≈ -0.555

Thus x2 has now entered the basis and replaced x4


ITERATION 3: xB = [x1; x2], xN = [x3; x4],

B = [2 1; 1 2], N = [1 0; 0 1], xB = [1/3; 1/3], xN = [0; 0],

x2 = [1/3; 1/3; 0; 0], and B⁻¹ = [2/3 -1/3; -1/3 2/3].

∇Bf(x2) = [2/3-1; -1] = [-1/3; -1], ∇Nf(x2) = [0; 0].

Search Direction:

rNT = ∇NfT(x2) - ∇BfT(x2)B⁻¹N = [0 0] - [-1/3 -1][2/3 -1/3; -1/3 2/3] = [0 0] - [1/9 -5/9] = [-1/9 5/9]

Δa = max{+1/9} = 1/9;  Δb = max{x4·(5/9)} = 0.

Since Δa > Δb we have dN = [1; 0] (increase x3).

dB = -B⁻¹NdN = -[2/3 -1/3; -1/3 2/3][1; 0] = [-2/3; 1/3],

so in the original ordering d = [-2/3; 1/3; 1; 0].

Line Search: x2 + αd = [(1-2α)/3; (1+α)/3; α; 0], so

αmax = sup{ α | (1-2α)/3 ≥ 0, (1+α)/3 ≥ 0, α ≥ 0 } = 0.5

Minimize over 0 ≤ α ≤ 0.5: f(x2 + αd) = [(1-2α)/3]² - (1-2α)/3 - (1+α)/3.

A line search yields α* = 1/8. Therefore

x3 = x2 + α*d = [1/4; 3/8; 1/8; 0],  f(x3) = -0.5625

At this point x4 is definitely nonbasic; we could select any two of the other three variables to be basic as long as they have linearly independent columns in A; let us (arbitrarily) choose x1 and x2 (same as before).

ITERATION 4: Continue and verify that x3 is the optimum solution...


The algorithm's progress is as shown below:

[Figure: Path followed by the algorithm in the (x1, x2) plane through the feasible region bounded by 2x1 + x2 = 1, x1 + 2x2 = 1, and the axes, from x0 (f = 0) through the subsequent iterates to x3 = x* (f = -0.5625).]


GENERALIZED REDUCED GRADIENT (GRG) ALGORITHM


GRG (Abadie and Carpentier) is essentially a generalization of the convex simplex method to solve

Min f(x)
st  gj(x) = 0,      j = 1, 2, ..., m,
    ai ≤ xi ≤ bi,   i = 1, 2, ..., n;   x ∈ Rn.

GRG is considered to be one of the better algorithms for solving problems of the above form.

SIMILAR IN CONCEPT TO SIMPLEX METHOD FOR LP....

SIMPLEX               GRG
Reduced Costs         Reduced Gradients
Basic Variables       Dependent Variables
Nonbasic Variables    Independent Variables

DIFFERS IN THAT
Nonbasic (independent) variables need not be at their bounds,
At each iteration, several nonbasic (independent) variables could change values,
The basis need not change from one iteration to the next.


At the beginning of each iteration we partition the n decision variables into 2 sets:
1) Dependent variables (one per equation),
2) Independent variables.

Let x = [xD; xI] (after reordering the variables).

Here, xD is a vector of (m) Dependent variables and xI is a vector of (n-m) Independent variables.

Similarly, we partition the bounds and the gradient into the first m rows and the remaining (n-m) rows:

a = [aD; aI],  b = [bD; bI],  ∇f(x) = [∇Df(x); ∇If(x)].

The Jacobian matrix (J) of the constraints is

J(x) = [∇g1T(x); ∇g2T(x); ... ; ∇gmT(x)] = [JD(x) | JI(x)],

where JD is m×m (the m columns for the dependent variables) and JI is m×(n-m) (the (n-m) columns for the independent variables).

Note that J has m rows; row j is ∇gjT(x).


Let xi = [xDi; xIi] be our current design, where

gj(xi) = 0 for all j   (xi is feasible),
aD < xDi < bD          (dependent variables are not at a bound),
JD is nonsingular      (JD⁻¹ exists),
aI ≤ xIi ≤ bI          (independent variables are within their bounds).


Let Δ = [ΔxD; ΔxI] be a change vector in the value of xi.

For small values of Δ, Δgj = change in gj ≈ ΔT∇gj(xi).

Choose Δ so that xi+1 is still feasible (or, to be exact, near-feasible), i.e., choose

ΔT∇gj(xi) = 0,  j = 1, 2, ..., m,

which in matrix form is

J(xi)Δ = JD(xi)ΔxD + JI(xi)ΔxI = 0.

Since JD(xi) is nonsingular, [JD(xi)]⁻¹ exists and

ΔxD = -[JD(xi)]⁻¹[JI(xi)]ΔxI

Similarly, for small values of Δ, Δf = change in f ≈ ΔT∇f(xi), i.e.,

Δf = ∇DfT(xi)ΔxD + ∇IfT(xi)ΔxI     (*)

Substituting for ΔxD in (*) we have

Δf = ∇DfT(xi){-[JD(xi)]⁻¹[JI(xi)]ΔxI} + ∇IfT(xi)ΔxI
   = {∇IfT(xi) - ∇DfT(xi)[JD(xi)]⁻¹[JI(xi)]}ΔxI,

i.e., Δf = rTΔxI, where

r = ∇If(xi) - [∇DfT(xi)[JD(xi)]⁻¹[JI(xi)]]T     (**)

is called the REDUCED GRADIENT VECTOR.


In other words,
IF xI is changed by an amount ΔxI, AND xD is changed by an amount ΔxD in order to maintain gj(x) = 0 for all j, THEN Δf = rTΔxI is the corresponding change in the value of f.

Since we are minimizing we want Δf to be negative. If rj > 0, then a decrease in (xI)j decreases f; if rj < 0, then an increase in (xI)j decreases f. We would therefore move in the direction -r while ensuring that the bounds on xj (namely aj and bj) are not violated. Thus the STEP DIRECTION dI for the independent variables has elements given by

(dI)j = -rj  if aj < xj < bj,
(dI)j = -rj  if rj > 0 and xj = bj,
(dI)j = -rj  if rj < 0 and xj = aj,
(dI)j = 0    if rj > 0 and xj = aj,
(dI)j = 0    if rj < 0 and xj = bj.

For the dependent variables,

dD = -[JD(xi)]⁻¹[JI(xi)]dI;

these can go either way irrespective of the direction of dI, since aD < xD < bD.

Contrast this with the Simplex method for LP, where only one independent (nonbasic) variable is changed at each iteration, by selecting Max |rj| among the moves that don't violate a constraint (the ratio test for the minimum value).


Having found the search direction d = [dD; dI] we do the usual line search (using one of Golden Section, Fibonacci, etc.) to find the step size, i.e.,

Min f(xi + αd), st 0 ≤ α ≤ αmax, where αmax = sup {α | a ≤ xi + αd ≤ b}.

Suppose the optimum step size is α*.
Correction Step
In general, we might find that gj(xi + α*d) ≠ 0 for some j. Therefore, we need a correction step (as in the gradient projection method) to reattain feasibility:

[Figure: the step from xi to xi + α*d leaves the surface gj(x) = 0; a correction step brings the point back to a feasible xi+1.]

This is done, for example, by fixing the independent variables at the values found and then solving the system g(x) = 0 using xi + α*d as the initial guess for a Newton-Raphson search.

Suppose we write xi + α*d = [xI; xD]. Then our problem at this stage is to solve g(xI, y) = 0 by adjusting only y (a system of m equations in m unknowns).

We start with y0 = xD, and Newton's iterations are applied for r = 1, 2, ... via

yr+1 = yr - [JD(xI, yr)]⁻¹g(xI, yr)

until g(xI, yk) = 0 for some k. We then set xD = yk.

Here, JD(xI, yr) is the Jacobian matrix of the constraint functions with respect to the dependent variables, evaluated at y = yr with the independent variables fixed at xI.
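The Newton-Raphson correction can be sketched in Python for the m = 2 case, where the Jacobian inverse has a closed form. The system g1, g2 below is a made-up toy with root y = (1, 1), standing in for the dependent-variable equations (it is not the example from this text):

```python
def newton_correction(y, g, J, tol=1e-12, max_iter=50):
    """Solve a 2x2 system g(y) = 0 by Newton-Raphson, y <- y - J(y)^{-1} g(y),
    as in the GRG correction step (independent variables held fixed in g)."""
    for _ in range(max_iter):
        g1, g2 = g(y)
        if abs(g1) < tol and abs(g2) < tol:
            break
        (a, b), (c, d) = J(y)
        det = a * d - b * c              # assumes J nonsingular near the root
        y = [y[0] - ( d * g1 - b * g2) / det,
             y[1] - (-c * g1 + a * g2) / det]
    return y

# Toy dependent system with root (1, 1):
#   g1 = y1^2 + y2 - 2,   g2 = y1 + y2^2 - 2
g = lambda y: (y[0] ** 2 + y[1] - 2, y[0] + y[1] ** 2 - 2)
J = lambda y: ((2 * y[0], 1.0), (1.0, 2 * y[1]))
y = newton_correction([1.3, 0.8], g, J)
```

Starting from the "infeasible" guess (1.3, 0.8), a handful of iterations recovers the feasible point (1, 1).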


There are three possible pitfalls during the correction step that we must safeguard against:
1. The final feasible solution xi+1 = (xI, xD) might yield a value for f that is worse than what it was at xi.
2. One or more bounds on x are violated during the Newton iterations.
3. The constraint violation actually starts to increase.

In all three cases, the typical remedy is to reduce the step size along the direction d from α* to (say) α*/2. Now the trial point is closer to xi and the Newton iterations are re-executed to obtain a new feasible point xi+1 with f(xi+1) < f(xi). This process might need to be repeated more than once.

[Figures: two sketches of the correction step near the surface gj(x) = 0, showing the step being halved and the Newton iterations re-run until a feasible xi+1 with a lower objective value is obtained.]


Finding Search Directions from the Reduced Gradient vector

There are actually several versions for picking dI. Let us start by defining

S = { j | rj > 0 & (xI)j = (aI)j, OR rj < 0 & (xI)j = (bI)j }

Note that S is an index subset of I (the index set of independent variables), denoting variables that (1) we would like to decrease but are already at their lower bounds, or (2) we would like to increase but are already at their upper bounds.

Version 1 (the classical version):
(dI)j = 0 if j ∈ S;  (dI)j = -rj otherwise.

Version 2 (GRGS): Identify s ∈ argmax { |rj| : j ∉ S } and set
(dI)j = 0 if j ≠ s;  (dI)s = -rs.
This version coincides with the greedy version of the simplex method for LP if the optimization problem is a linear program.

Version 3 (GRGC): Cyclically set (dI)j = -rj for j ∉ S and all other (dI)j = 0, where a cycle consists of n iterations. In other words, for k = 1, 2, ..., n set
(dI)j = 0 if j ≠ k;  (dI)k = -rk,
as long as k ∉ S and xk is not a dependent (basic) variable. If xk is dependent or k ∈ S we simply move on to k+1 instead. After k = n we return to k = 1 and continue the process.
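Versions 1 and 2 can be sketched directly in Python (the cyclic version 3 only adds bookkeeping for the cycle counter). The data at the bottom are an assumed illustration in which the second independent variable wants to decrease but sits at its lower bound:

```python
def blocked_set(r, x, a, b):
    """S: indices we'd like to move but that sit at the blocking bound."""
    return {j for j in range(len(r))
            if (r[j] > 0 and x[j] == a[j]) or (r[j] < 0 and x[j] == b[j])}

def direction_v1(r, S):
    """Classical rule: zero out blocked components, else d_j = -r_j."""
    return [0.0 if j in S else -r[j] for j in range(len(r))]

def direction_v2(r, S):
    """GRGS: move only along the largest unblocked |r_j|."""
    free = [j for j in range(len(r)) if j not in S]
    s = max(free, key=lambda j: abs(r[j]))
    return [-r[j] if j == s else 0.0 for j in range(len(r))]

# Illustrative data: x2 wants to decrease (r > 0) but is at its lower bound.
r = [-0.5, 0.25, -2.0]
x, a, b = [0.3, 0.0, 0.7], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
S = blocked_set(r, x, a, b)
```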


GRG and the K-K-T Conditions

Recall that with GRG we stop when (i) ri ≤ 0 for all {i | xi = bi}, (ii) ri ≥ 0 for all {i | xi = ai}, and (iii) ri = 0 for all {i | ai < xi < bi}.

This stopping condition is related to the K-K-T conditions for the original problem:

Min f(x)
st  gj(x) = 0 for j = 1, 2, ..., m
    ai ≤ xi ≤ bi for i = 1, 2, ..., n

Consider the Lagrangian for the above problem:

L(x, λ, μ) = f(x) + Σj λjgj(x) - Σi μai(xi - ai) + Σi μbi(xi - bi)

At the optimum the K-K-T conditions yield

∇xL = ∇f(x) + Σj λj∇gj(x) - (μa - μb) = 0,

along with the original constraints; μa, μb ≥ 0, and complementary slackness for the bounding constraints. Thus

∇f(x) + Σj λj∇gj(x) - π = 0,  where  πi = μai if xi = ai;  πi = -μbi if xi = bi;  πi = 0 if ai < xi < bi.

Since we disallow degeneracy and the variables indexed in D are all strictly between their respective bounds, this implies that

∇If(x) + JIT(x)λ - πI = 0, and
∇Df(x) + JDT(x)λ = 0.


The second equation yields λ = -[JDT(x)]⁻¹∇Df(x).

Substituting this value for λ into the first equation yields

πI = ∇If(x) - JIT(x)[JDT(x)]⁻¹∇Df(x)
   = ∇If(x) - JIT(x)[JD⁻¹(x)]T∇Df(x)        (since (AT)⁻¹ = (A⁻¹)T)
   = ∇If(x) - [[JD(x)]⁻¹JI(x)]T∇Df(x)       (since ATBT = (BA)T)
   = ∇If(x) - [∇DfT(x)[JD(x)]⁻¹JI(x)]T      (since ATb = (bTA)T)

Note that this is the same formula we obtained for the reduced gradient vector! Thus the method may be viewed as moving through a sequence of feasible points xk in search of a point that yields a multiplier vector π satisfying the K-K-T conditions for optimality. This search is done using the above relationship.
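This identity is easy to check numerically. The sketch below uses the data that arise at the point x = (1/3, 1/3, 0, 0) with D = {1, 2} in the worked example that follows (there JD happens to be symmetric, so JDT = JD, and JI = I):

```python
# Data: grad_D f = (-1/3, -1), grad_I f = (0, 0), J_D = [[2,1],[1,2]], J_I = I.
gD = [-1.0 / 3.0, -1.0]
JD = [[2.0, 1.0], [1.0, 2.0]]
det = JD[0][0] * JD[1][1] - JD[0][1] * JD[1][0]          # = 3
JD_inv = [[ JD[1][1] / det, -JD[0][1] / det],
          [-JD[1][0] / det,  JD[0][0] / det]]
# lambda = -(J_D^T)^{-1} grad_D f   (J_D symmetric here, so J_D^T = J_D)
lam = [-(JD_inv[0][0] * gD[0] + JD_inv[0][1] * gD[1]),
       -(JD_inv[1][0] * gD[0] + JD_inv[1][1] * gD[1])]
# pi_I = grad_I f + J_I^T lambda = lambda, since J_I = I and grad_I f = 0
pi = lam
# Reduced gradient: r = grad_I f - [grad_D f^T J_D^{-1} J_I]^T
w = [gD[0] * JD_inv[0][0] + gD[1] * JD_inv[1][0],
     gD[0] * JD_inv[0][1] + gD[1] * JD_inv[1][1]]
r = [0.0 - w[0], 0.0 - w[1]]
```

Both routes give the same vector (-1/9, 5/9), i.e. the multiplier πI recovered from the K-K-T system is exactly the reduced gradient.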


EXAMPLE:
Min  x1² - x1 - x2
st   2x1 + x2 ≤ 1,
     x1 + 2x2 ≤ 1,  x1, x2 ≥ 0

Rewriting this in the standard form required for GRG:

Min f(x) = x1² - x1 - x2
st  g1(x) = 2x1 + x2 + x3 - 1 = 0
    g2(x) = x1 + 2x2 + x4 - 1 = 0
    (ai =) 0 ≤ xi ≤ 1 (= bi)

Note that we could have actually used the tighter value 1/2 for b1 and b2 in this problem (both constraints force x1, x2 ≤ 1/2). In cases where the bound is not obvious from the constraints we could use some large value.

Let us start with

x0 = [x10  x20  x30  x40]T = [1/4  0  1/2  3/4]T

with f(x0) = -3/16.


Let us (arbitrarily) choose x1 and x2 as the independent variables. The requirements are that the point should be feasible with respect to both the constraints and the bounds (which is obviously met), the dependent variables cannot be at either bound (also met), and finally that JD should be nonsingular (also met, since JD = [1 0; 0 1], which is obviously nonsingular).


ITERATION 1: xI = [x1; x2] = [1/4; 0], xD = [x3; x4] = [1/2; 3/4].

∇fT(x) = [2x1-1  -1  0  0];

∇Df(x0) = [0; 0], ∇If(x0) = [2(1/4)-1; -1] = [-1/2; -1].

J(x) = [2 1 1 0; 1 2 0 1], so JD(x0) = [1 0; 0 1], JI(x0) = [2 1; 1 2], and [JD(x0)]⁻¹ = [1 0; 0 1].

Compute the reduced gradient:

r = ∇If(x0) - [∇DfT(x0)[JD(x0)]⁻¹[JI(x0)]]T

(NOTE: A corresponding step in the Simplex method is the computation of the "cj - zj" values; the reduced costs.)

r = [-1/2; -1] - [[0 0][1 0; 0 1][2 1; 1 2]]T = [-1/2; -1] - [0; 0] = [-1/2; -1]

This says that the objective function f increases in the direction [-1/2  -1]T. Therefore our search should proceed in the opposite direction, i.e.,

dI = -r = [1/2; 1]

Note that d2 = 1 is possible only because x2 is currently at its lower bound of a2 = 0. If it had been at its upper bound b2 we would have chosen d2 = 0.

dD = -[JD(x0)]⁻¹[JI(x0)]dI = -[1 0; 0 1][2 1; 1 2][1/2; 1] = -[2; 5/2] = [-2; -5/2],

so in the original ordering (x1, x2, x3, x4), d = [1/2; 1; -2; -5/2].


[Figure: the feasible region in the (x1, x2) plane, bounded by 2x1 + x2 = 1 and x1 + 2x2 = 1, with x0 = (1/4, 0) on the contour f = -3/16; the reduced gradient r = (-1/2, -1) points away from descent, and the search moves in the opposite direction d = (1/2, 1).]

LINE SEARCH: We now minimize along the direction of search:

Minimize f(x0 + αd), st a ≤ x0 + αd ≤ b

i.e., Minimize f(x0 + αd), st 0 ≤ α ≤ αmax,

where αmax = supremum{α | a ≤ x0 + αd ≤ b}.

x0 + αd = [1/4 + α/2;  α;  1/2 - 2α;  3/4 - 5α/2]

The bounds 0 ≤ xi ≤ 1 yield

0 ≤ 1/4 + α/2 ≤ 1    =>  -0.5 ≤ α ≤ 1.5;
0 ≤ α ≤ 1;
0 ≤ 1/2 - 2α ≤ 1     =>  -0.25 ≤ α ≤ 0.25;
0 ≤ 3/4 - 5α/2 ≤ 1   =>  -0.1 ≤ α ≤ 0.3.

αmax = Min (1.5, 1, 0.25, 0.3) = 0.25

Hence the line search is: Minimize f(x0 + αd) over 0 ≤ α ≤ 0.25.


f(x0 + αd) = [(1/4) + (α/2)]² - [(1/4) + (α/2)] - α = (1/4)α² - (5/4)α - 3/16,  0 ≤ α ≤ 0.25.

This could be done by a line search procedure such as Fibonacci, GSS, quadratic interpolation, etc. It is easily shown that the optimum is at α* = 0.25. Thus we have:

x1 = x0 + α*d = [1/4; 0; 1/2; 3/4] + (1/4)[1/2; 1; -2; -5/2] = [3/8; 1/4; 0; 1/8];

f(x1) = -31/64

Checking for feasibility, we see that x1 is feasible, so we need not make any correction step (in general, this will be true here because we have all linear constraints...).
The current point is shown below:

[Figure: the feasible region with x0 = (1/4, 0) on the contour f = -3/16 and the new point x1 = (3/8, 1/4) on the contour f = -31/64.]
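This GRG iteration can also be replayed numerically. The sketch below uses only the example's own data; since JD = I at x0, the dependent-variable step reduces to -JI dI:

```python
# GRG iteration 1: x0 = (1/4, 0, 1/2, 3/4), I = {1, 2}, D = {3, 4},
# J_D = I, J_I = [[2, 1], [1, 2]], bounds 0 <= x_i <= 1.
x0 = [0.25, 0.0, 0.5, 0.75]
JI = [[2.0, 1.0], [1.0, 2.0]]
gI = [2 * x0[0] - 1, -1.0]                     # grad w.r.t. x1, x2
# grad_D f = 0 and J_D = I, so the reduced gradient is just grad_I f.
r = gI[:]                                      # (-1/2, -1)
dI = [-r[0], -r[1]]                            # both moves are allowed here
dD = [-(JI[0][0] * dI[0] + JI[0][1] * dI[1]),
      -(JI[1][0] * dI[0] + JI[1][1] * dI[1])]  # (-2, -5/2)
d = dI + dD                                    # ordered (x1, x2, x3, x4)
# Ratio test against the box 0 <= x_i <= 1:
amax = min((1.0 - x0[i]) / d[i] if d[i] > 0 else -x0[i] / d[i]
           for i in range(4) if d[i] != 0)
# f(x0 + a d) = a^2/4 - 5a/4 - 3/16 decreases on [0, 0.25], so a* = amax.
astar = amax
x1 = [x0[i] + astar * d[i] for i in range(4)]
f1 = x1[0] ** 2 - x1[0] - x1[1]
```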


At the point x1, x3 has reached its lower bound. Recall that a dependent variable (such as x3) cannot be at a bound. We must therefore make x3 independent and remove it from xD and place it in xI.

The old partition was I = {1, 2} and D = {3, 4}. Since neither independent variable is currently at a bound, we could choose either x1 or x2 to take the place of x3, as long as the resulting JD is nonsingular. Let us (arbitrarily) choose x1. Thus D = {1, 4} and I = {2, 3}.
ITERATION 2:

x1 = [3/8  1/4  0  1/8]T;  xI = [x2; x3] = [1/4; 0], xD = [x1; x4] = [3/8; 1/8].

∇fT(x) = [2x1-1  -1  0  0];  J(x) = [2 1 1 0; 1 2 0 1].

∇Df(x1) = [2(3/8)-1; 0] = [-1/4; 0];  ∇If(x1) = [-1; 0].

JD(x1) = [2 0; 1 1] (the columns for x1 and x4);  JI(x1) = [1 1; 2 0] (the columns for x2 and x3);

[JD(x1)]⁻¹ = [1/2 0; -1/2 1].

Recompute the reduced gradient:

r = ∇If(x1) - [∇DfT(x1)[JD(x1)]⁻¹[JI(x1)]]T
  = [-1; 0] - [[-1/4  0][1/2 0; -1/2 1][1 1; 2 0]]T
  = [-1; 0] - [-1/8; -1/8] = [-7/8; 1/8]

This says that in order to decrease f, our search direction should be

dI = -r = [7/8; -1/8].

BUT
while x2 = 1/4 implies that we can increase it, the fact that x3 = 0 implies that we CANNOT decrease it; it is already at its lower bound of a3 = 0!

THEREFORE

dI = [7/8; 0], and since we are only looking for a direction, we could simplify matters by choosing dI = [1; 0].

Then

dD = -[JD(x1)]⁻¹[JI(x1)]dI = -[1/2 0; -1/2 1][1 1; 2 0][1; 0] = -[1/2; 3/2] = [-1/2; -3/2]

LINE SEARCH: We now minimize along the direction of search. In the original ordering (x1, x2, x3, x4), d = [-1/2; 1; 0; -3/2].

[Figure: the feasible region with x1 = (3/8, 1/4) on the contour f = -31/64 and the (steepest descent) direction d drawn at x1.]


Minimize f(x1 + αd), st a ≤ x1 + αd ≤ b

i.e., Minimize f(x1 + αd), st 0 ≤ α ≤ αmax, where αmax = supremum{α | a ≤ x1 + αd ≤ b}.

x1 + αd = [3/8 - α/2;  1/4 + α;  0;  1/8 - 3α/2]

0 ≤ 3/8 - α/2 ≤ 1    =>  α ≤ 3/4;
0 ≤ 1/4 + α ≤ 1      =>  α ≤ 3/4;
0 ≤ 0 ≤ 1            (no restriction);
0 ≤ 1/8 - 3α/2 ≤ 1   =>  α ≤ 1/12.

αmax = Min (3/4, 3/4, -, 1/12) = 1/12

Hence the line search is

Minimize over 0 ≤ α ≤ 1/12:  (3/8 - α/2)² - (3/8 - α/2) - (1/4 + α) = (1/4)α² - (7/8)α - 31/64

α* = 1/12. (VERIFY.)

Thus

x2 = x1 + α*d = [3/8; 1/4; 0; 1/8] + (1/12)[-1/2; 1; 0; -3/2] = [1/3; 1/3; 0; 0];

f(x2) = -5/9

Checking for feasibility, we see that x2 is feasible; so we need not make any correction step...

Again, note that a dependent variable (x4) has reached its (lower) bound of a4 = 0. Therefore we must once again repartition...


[Figure: the feasible region showing the progress so far: x0 = (1/4, 0) with f = -3/16, x1 = (3/8, 1/4) with f = -31/64, and x2 = (1/3, 1/3) with f = -5/9.]

The old partition was D = {1, 4} and I = {2, 3}. We must choose either x2 or x3 to take the place of x4. However x3 is already at its (lower) bound, so it cannot be made dependent. Therefore, we replace x4 by x2. Our new partition is D = {1, 2} and I = {3, 4}.
ITERATION 3:

x2 = [1/3  1/3  0  0]T;  xI = [x3; x4] = [0; 0], xD = [x1; x2] = [1/3; 1/3].

∇fT(x) = [2x1-1  -1  0  0];  J(x) = [2 1 1 0; 1 2 0 1].

∇Df(x2) = [2/3-1; -1] = [-1/3; -1];  ∇If(x2) = [0; 0].

JD(x2) = [2 1; 1 2];  JI(x2) = [1 0; 0 1];  [JD(x2)]⁻¹ = [2/3 -1/3; -1/3 2/3].

Recompute the reduced gradient:

r = ∇If(x2) - [∇DfT(x2)[JD(x2)]⁻¹[JI(x2)]]T
  = [0; 0] - [[-1/3  -1][2/3 -1/3; -1/3 2/3][1 0; 0 1]]T
  = [0; 0] - [1/9; -5/9] = [-1/9; 5/9]

Thus, to decrease f, we must move in the direction -r and

(a) Increase x3: this is OK since x3 = 0 currently;
(b) Decrease x4: this is not allowed because x4 = a4 = 0.

Hence dI = [1/9; 0], and since we are only looking for a direction, we may choose dI = [1; 0].

Then

dD = -[JD(x2)]⁻¹[JI(x2)]dI = -[2/3 -1/3; -1/3 2/3][1 0; 0 1][1; 0] = [-2/3; 1/3]
CONTINUE AND VERIFY THAT YOU GET THE SAME OPTIMAL SOLUTION
OBTAINED BY THE CONVEX SIMPLEX METHOD...


SEPARABLE PROGRAMMING
DEFINITION: A function f(x1,x2,...,xn) is SEPARABLE if it can be written as a sum
of terms, with each term being a function of a SINGLE variable:

f(x1, x2, ..., xn) = f1(x1) + f2(x2) + ... + fn(xn)

EXAMPLES

SEPARABLE                        NOT SEPARABLE
x1 + x2 log x2                   x1x2 + x3
x1² + 3x1 + 6x2 - 2x2²           5x1/x2 - 1
                                 log(xy²), although this could be separated as log x + 2(log y)

PIECE-WISE LINEAR (SEPARABLE) PROGRAMMING

Let λ1, λ2, ... be "weights" such that Σj λj = 1, λj ≥ 0 for all j. If we have a convex function f(z), we have for any z given by z = λ1z1 + λ2z2 + ... + λkzk,

f(z) ≤ λ1f(z1) + λ2f(z2) + ... + λkf(zk)

[Figure: a convex f(z) with GRID POINTS z1, z2, z3, z4; the chords joining the f(zi) values lie above the curve.]

In general, any z has many representations in terms of the zk. Consider z = 1.75 = 7/4, with f(z) = 1.5, and four grid points z0 = 0, f(z0) = 2; z1 = 1, f(z1) = 1; z2 = 2, f(z2) = 2; and z3 = 3, f(z3) = 5:

λ0 = 5/12, λ3 = 7/12, λ1 = λ2 = 0:       approximation (5/12)(2) + (7/12)(5) = 3.75
λ1 = 5/8, λ3 = 3/8, λ0 = λ2 = 0:         approximation (5/8)(1) + (3/8)(5) = 2.5
λ1 = 1/2, λ2 = 1/4, λ3 = 1/4, λ0 = 0:    approximation (1/2)(1) + (1/4)(2) + (1/4)(5) = 2.25
λ1 = 1/4, λ2 = 3/4, λ0 = λ3 = 0:         approximation (1/4)(1) + (3/4)(2) = 1.75
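These four representations of z = 1.75 can be checked with a few lines of Python (each weight vector sums to one, reproduces z, and yields the stated approximation; the second combination uses weights 5/8 and 3/8 on the adjacent-to-outer pair):

```python
def approx(weights, z_pts, f_pts):
    """Given weights on grid points, return (z, approximate f(z))."""
    z = sum(w * zk for w, zk in zip(weights, z_pts))
    fa = sum(w * fk for w, fk in zip(weights, f_pts))
    return z, fa

z_pts = [0.0, 1.0, 2.0, 3.0]
f_pts = [2.0, 1.0, 2.0, 5.0]
combos = [
    [5 / 12, 0.0, 0.0, 7 / 12],   # outermost pair: worst bound
    [0.0, 5 / 8, 0.0, 3 / 8],
    [0.0, 1 / 2, 1 / 4, 1 / 4],
    [0.0, 1 / 4, 3 / 4, 0.0],     # the two ADJACENT grid points: best bound
]
results = [approx(w, z_pts, f_pts) for w in combos]
```

All four represent the same z, and the adjacent pair gives the smallest (best) upper bound on f(1.75).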


Notice that in all cases the approximation to f(z) is greater than the true value of f(z) (= 1.5). However, the specific combination of grid points that yields the lowest value (and hence the best approximation) is the one which uses only the two ADJACENT grid points.

The problem Min f(z), st g(z) ≤ 0, could now be approximated as

Min Σk λkf(zk)
st  Σk λkg(zk) ≤ 0;  Σk λk = 1,  λk ≥ 0 for all k.

This is an LP problem in the λ variables! Furthermore, in the optimal set of λ values, AT MOST two λj values will be positive, and these will be ADJACENT.

Question: What if f(z) is NOT convex?

[Figure: a nonconvex f(z) with grid points z1, ..., z6.]

In that case, the property above will not necessarily hold; a "RESTRICTED BASIS ENTRY" rule must be enforced to "force" adjacent grid points in order to get a good approximation.


EXAMPLE: Maximize 2x1 - x1² + x2
st  2x1² + 3x2² ≤ 6;  x1, x2 ≥ 0.

[Figure: the elliptical feasible region 2x1² + 3x2² ≤ 6 in the first quadrant, with objective contours f = 1.9, f = 2.213, and f = 2.5; the optimum lies at x1* = 0.7906, x2* = 1.258.]

Thus the problem is

Max f1(x1) + f2(x2) = {2x1 - x1²} + {x2}
st  g1(x1) + g2(x2) = {2x1²} + {3x2²} ≤ 6

Suppose we use the 9 grid points (0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0) for approximating f1 and g1, and the same 9 grid points for approximating f2 and g2.

k    x1     f1       g1      x2     f2     g2
0    0.00   0.00     0.00    0.00   0.00   0.00
1    0.25   0.4375   0.125   0.25   0.25   0.1875
2    0.50   0.75     0.50    0.50   0.50   0.75
3    0.75   0.9375   1.125   0.75   0.75   1.6875
4    1.00   1.00     2.00    1.00   1.00   3.00
5    1.25   0.9375   3.125   1.25   1.25   4.6875
6    1.50   0.75     4.50    1.50   1.50   6.75
7    1.75   0.4375   6.125   1.75   1.75   9.1875
8    2.00   0.00     8.00    2.00   2.00   12.00

[Figures: plots of f1(x1) = 2x1 - x1² and g1(x1) = 2x1² on [0, 2], and of f2(x2) = x2 and g2(x2) = 3x2² on [0, 2].]

Assuming weights ak and bk respectively for the grid points corresponding to x1 and x2, the PIECE-WISE LINEAR approximation yields the following LP:

Maximize Σk=0..8 f1(x1k)ak + Σk=0..8 f2(x2k)bk
st  Σk=0..8 g1(x1k)ak + Σk=0..8 g2(x2k)bk ≤ 6
    Σk=0..8 ak = 1,  Σk=0..8 bk = 1;  ak, bk ≥ 0 for all k.

i.e.,

Max 0a0 + 0.4375a1 + 0.75a2 + ... + 0.4375a7 + 0a8
    + 0b0 + 0.25b1 + 0.5b2 + ... + 1.75b7 + 2b8
st  (0a0 + 0.125a1 + ... + 8a8) + (0b0 + 0.1875b1 + ... + 12b8) ≤ 6
    a0 + a1 + ... + a8 = 1,
    b0 + b1 + ... + b8 = 1;  ak, bk ≥ 0 for all k.


Thus we have a linear program in 18 variables and 3 constraints.

The OPTIMAL SOLUTION to this LP is

a3* = 1, ak* = 0 for k ≠ 3; and
b5* = 0.90909..., b6* = 0.09090..., bk* = 0 for k ≠ 5, 6,

with an objective function value of 2.21023.

Reconstructing the solution to the original problem, we have

x1* = 1(0.75) = 0.75,
x2* = 0.90909(1.25) + 0.09090(1.5) = 1.2727,

with f(x*) = 2.21023, which compares quite favorably with the true optimum,

x1* = 0.7906,  x2* = 1.258,  with f(x*) = 2.213
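The reported LP solution can be verified directly from the grid table: the sketch below rebuilds the grid data, plugs in the claimed weights, and checks the objective value, the (binding) constraint, and the reconstructed x values:

```python
# Grid of 9 points with step 0.25 on [0, 2], and the four function columns.
grid = [0.25 * k for k in range(9)]
f1 = [2 * x - x ** 2 for x in grid]
g1 = [2 * x ** 2 for x in grid]
f2 = grid[:]                          # f2(x2) = x2
g2 = [3 * x ** 2 for x in grid]

# Claimed optimal weights: a3 = 1; b5 = 10/11 = 0.90909..., b6 = 1/11.
a = [0.0] * 9; a[3] = 1.0
b = [0.0] * 9; b[5] = 10 / 11; b[6] = 1 / 11

obj = (sum(ak * fk for ak, fk in zip(a, f1))
       + sum(bk * fk for bk, fk in zip(b, f2)))
lhs = (sum(ak * gk for ak, gk in zip(a, g1))
       + sum(bk * gk for bk, gk in zip(b, g2)))
x1 = sum(ak * xk for ak, xk in zip(a, grid))   # reconstructed x1*
x2 = sum(bk * xk for bk, xk in zip(b, grid))   # reconstructed x2*
```

The constraint holds with equality (lhs = 6), which is why the two positive b weights are adjacent: the optimum sits on the segment between grid points 5 and 6.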

The piece-wise LP approach and separable programming are somewhat "specialized" in nature, with limited general applicability. Often, one could try and INDUCE separability. For example:

Term            Substitution           Extra Constraints                   Restrictions
x1x2            x1x2 = y1² - y2²       y1 = (x1+x2)/2, y2 = (x1-x2)/2      --
x1x2            x1x2 = y               log y = log x1 + log x2             x1, x2 > 0
e^(x1 + x2²)    e^(x1 + x2²) = y       log y = x1 + x2²                    --


FRACTIONAL PROGRAMMING
The linear fractional programming problem is:

Minimize  (cTx + α)/(dTx + β),
st        Ax = b,  x ≥ 0.

This has some interesting properties:


1. If an optimum solution exists, then an extreme point is an optimum (cf LP)
2. Every local minimum is also a GLOBAL minimum

Thus we could use a simplex-like approach of moving from one extreme point to an adjacent one. The most obvious approach is Zangwill's convex simplex algorithm that we saw earlier, which simplifies here into a minor modification of the Simplex method for LP; the direction-finding and step-size subproblems are considerably simplified.
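Both properties can be seen on a tiny made-up instance (this problem and its data are an assumed illustration, not from the text): minimize (2x1 + x2 + 1)/(x1 + x2 + 2) over x1 + x2 ≤ 1, x ≥ 0. A brute-force scan of the feasible region never beats the best extreme point:

```python
# Tiny linear fractional program (illustrative):
#   minimize (2x1 + x2 + 1) / (x1 + x2 + 2)
#   st x1 + x2 <= 1, x1, x2 >= 0.
phi = lambda x1, x2: (2 * x1 + x2 + 1) / (x1 + x2 + 2)

extreme_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
best_extreme = min(phi(*p) for p in extreme_points)   # attained at (0, 0)

# Dense feasible grid: no interior or edge point does better.
n = 100
grid_best = min(phi(i / n, j / n)
                for i in range(n + 1) for j in range(n + 1)
                if i / n + j / n <= 1.0 + 1e-12)
```

Here the minimum over the whole (sampled) feasible region coincides with the minimum over the extreme points, consistent with property 1.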

Other fractional programming algorithms include the ones developed by Gilmore and
Gomory, and Charnes and Cooper.

Refer to Bazaraa, Sherali and Shetty for example, for more details...
