Clase Proximal 2017

Metodo del Punto Proximal
Erik Alex Papa Quiroz

Universidad Nacional Mayor de San Marcos, UNMSM-PERU
Optimizacion
Facultad de Ciencias Matematicas
Universidad Nacional Mayor de San Marcos
29/09/2017
Resumen
1 Introduccion
2 Extension para el caso irrestricto
3 Extension para el ortante no negativo
4 Referencias
1. Problema Irrestricto
Considere el problema:

min f (x)
s.a :
x R.

El Metodo de Punto Proximal (MPP) genera una sucesion {xk }

definido por:
x 0 Rn ,
xk arg min{f (x) + (k /2)||x xk1 ||2 : x Rn },

donde k > 0.
Datos historicos
I El MPP fue introducido por Martinet (1970).
I Rockafellar (1976) extendio el MPP para resolver ceros de
operadores monotonos maximales.
I En los ultimos anos se estudia la extension del metodo para
problemas no convexos y problemas variacionales donde el
operador no es monotono.
Importancia
I Se ha convertido en la base teorica para justificar la
convergencia de los metodos de multiplicadores para
resolver problemas de optimizacion. En particular, se ha
demostrado que detras de cada metodo multiplicador existe
un metodo proximal que lo genera (Rockafellar, 1976).
I Se ha probado tambien que una gran clase de metodos de
descomposicion de problemas convexos son casos
particulares del metodo del punto proximal para encontrar un
cero de un operador monotono maximal, especficamente el
metodo de Douglas-Rachford estudiado por Lions y Mercier
(que abarca una gran clase de metodos de optimizacion) es
una version particular del metodo proximal, (Eckstein y
Bertsekas, 1992).
Algoritmo
Dado x0 y {k } sucesion de parametros positivos.
Para k = 0, 1, . . .
Si: 0 f (xk ), parar !!
Caso contrario:
Resolver el problema:

k k k1 2 n
min f (x ) + ( )||x x || : x R ,
2
para obtener xk y hacer k = k + 1 y volver a la condicion Si.
Resultados de Convergencia
I Si f es convexa y {k } satisface
+
X
(1/k ) = +.
k=1
entonces,
lim f (xk ) = inf{f (x) : x Rn }.
k
I Ademas, si el conjunto optimo X es no vacio, entonces {xk }

converge a un punto optimo del problema.
El caso cuadratico convexo
Si f (x) = 12 xT Ax bT x + c, A = AT y A 0, entonces,
1. Las iteraciones son explcitas:
xk = (A + k I)1 b + k xk1 .

2. En general si f es difernciable se tiene que
1
xk = xk1 f (xk ).
k
El metodo, salvo el caso cuadratico, es implcito en el sentido

tradicional de los metodos iterativos.
Ejercicio: Implementar

1 T 2 0 x1
min f (x) = 2 x x (2 1) +c

0 1 x2

s.a :
x Rn

Usaremos el punto inicial x0 = (0, 0) y realizaremos pasos del

metodo proximal para encontrar la solucion optima x = (1, 1).
MPP para problemas Restrictos
Considere el problema:

min f (x)
s.a :
x C,

donde f es una funcion convexa y C es un conjunto convexo

cerrado con interior no vacio.
I El MPP para resolver problemas restrictos introduce una
funcion llamada distancia que sustituye a ||.||2 por d(, ) en la
regularizacion proximal de tal manera que los puntos
obtenidos sean viables y en la mayoria de los casos
interiores.
MPP con distancias de Bregman
Dados x0 y {k }.
Para k = 0, 1, . . ..
Si 0 f (xk ) finalizar !!!!!
Caso contrario, resolver el problema:
min f (x) + k Dh (x, xk1 )

s.a :
xC

Obteniendo xk C, donde,
Dh (x, y) = h(x) h(y) hh(y), x yi y h es una
funcion de Bregman.
Hacer k = k + 1 y volver al test de parada Si.
Fin Para
Funcion de Bregman
La funcion h : S R es llamada funcion de Bregman con zona S,
si se verifican las siguientes condiciones:
B1). h es estrictamente convexa y continua en S.
B2). h es continuamente diferenciable en S.
B3). x S, R, el conjunto de nivel respecto a la segunda
variable
LDh (x, ) = {y S / Dh (x, y) } es acotada.
B4). Si y S converge a y, entonces Dh (y, yk ) converge a
k
cero.
B5). Si xk S y {yk } S son sucesiones tal que {yk } es

acotado, lim xk = x y lim Dh (yk , xk ) = 0, entonces lim yk = x .

Ejemplos de Funcion de Bregman
n
1. Si S = Rn++ , h(x) =
P
xi ln xj , con la convencion 0 ln 0 = 0, es
i=1
una funcion de Bregman y
n
X
Dh (x, y) = (xi ln(xi /yi ) + yi xi )
i=1
llamada divergencia de Kullback-Leibler, que es usada en

estadstica.
n
2. Si S = Rn++ , h(x) = (xi xi ) con 1, 0 < < 1.
P
i=1
Para = 2, = 1/2 obtenemos:
n
2
X 1
Dh (x, y) = ||x y|| + ( xi yi )2 .
2 yi
i=1
Para = 1, = 1/2 tenemos:
n
X 1
Dh (x, y) = ( xi yi )2 .
2 yi
i=1
Proposicion:
Si h es una funcion de Bregman con zona S entonces:
i) x S, z, y S,
Dh (x, y) Dh (x, z) Dh (z, y) = hh(y) h(z), z xi .
ii) x, y S,
x Dh (x, y) = h(x) h(y).
iii) Dh (., y) es estrictamente convexa y S.
Resultados de convergencia
+
X
(1/k ) = +.
k=1
entonces,
lim f (xk ) = inf{f (x) : x Rn+ }.
kri

Ejemplo PMKullbackLeibler
Considere el problema
min{f (x) = x10.5 x20.5 : 2x1 + x2 6, x1 0, x2 0}.
La solucion optima es: x1 = 3/2, x2 = 3.

Tomamos la distancia:
n
X
d(x, y) = (xi ln(xi /yi ) + yi xi )
i=1
y definir
Dh (x, z) = d(y(x), y(z)), con y(x) = Ax b

Obteniendo:
6 2x1 x2
Dh (x, z) = (6 2x1 x2 ) ln( ) + (2z1 z2 + 2x1 + x2 )+
6 2z1 z2
+(x1 ln(x1 /z1 ) + z1 x1 ) + (x2 ln(x2 /z2 ) + z2 x2 ).

Problemas en el ortante no negativo
min{f (x) : x Rn+ }

donde f es una funcion convexa no necesariamente diferenciable.
MPP con distancias -divergencia
Dado x0 y {k }.
Para k = 0, 1, . . . , n.
Si 0 f (xk ) parar !!!!!
min f (x) + k d (x, xk1 )

s.a
x Rn+

obteniendo xk C, donde,
p
X xj
d (x, y) = yj ( ),
yj
j=1
Hacer k = k + 1 y volver al test Si.

: R (, +] es una funcion convexa propia y cerrada, tal
que, dom [0, +), y que cumple las siguientes propiedades:
(i) es dos veces continuamente diferenciable en
int(dom) = (0, +).
(ii) es estrictamente convexa en su dominio.
(iii) lim 0 (t) = .
t0
(iv) (1) = 0 (1) = 0 y 00 (1) > 0.
Denotamos por las clases de funciones que satisfacen (i)-(iv).
Introduciremos dos sub clases de nucleo :
1 := , existe M > 0 tal que 00 (t) M, t 1 .

(1)

1
2 := ; 00 (1)(1 ) 0 (t) 00 (1)(t 1), t > 0 . (2)
t
Existen funciones 1 2 por ejemplo:
1. 1 (t) = t ln t + 1, dom = [0, +) , obteniendo
p
X xi
d1 (t) (x, y) = (xi ln + yi xi ).
yi
i=1
2. 2 (t) = ln t + t 1, dom = (0, +) , obteniendo

n
X
d2 (t) (x, y) = yi ln(yi /xi ) + xi yi
j=1
2
3. 3 (t) = 2 t1 , dom = [0, +) , obteniendo
n
X
d3 (t) (x, y) = 2 ( xj yj )2
j=1
Resultados de convergencia
+
X
(1/k ) = +.
k=1
entonces,
lim f (xk ) = inf{f (x) : x Rn+ }.
kri

Ejemplo PMPhi
min{f (x) = x10.5 x20.5 : 2x1 + x2 6, x1 0, x2 0}.

n
X
d(x, y) = (yi ln(yi /xi ) + xi yi )
i=1
y definir
D (x, z) = d(y(x), y(z)), con y(x) = Ax b
obteniendo:
6 2z1 z2
D (x, z) = (6 2z1 z2 ) ln( ) + (2x1 x2 + 2z1 + z2 )+
6 2x1 x2
+(z1 ln(z1 /x1 ) + x1 z1 ) + (z2 ln(z2 /x2 ) + x2 z2 ).

MPP con distancias Log-cuadraticas
Dado x0 y {k }.
Para k = 0, 1, . . . , n.
Si 0 f (xk ) parar !!!!!
min f (x) + k d (x, xk1 )

s.a
x Rn+

obteniendo xk C, donde,
p
X xj
d (x, y) = y2j ( ),
yj
j=1
Hacer k = k + 1 y volver al test Si.

2
X ui
Usando la distancia: d0 (u, v) := v2i ( ), con
vi
i=1
(t) := ( log t + t 1) + 2 (t 1)2 , > > 0, obtenemos:
2
X vi
d0 (u, v) := (ui vi )2 + (v2i log + ui vi v2i ).
2 ui
i=1
Ejemplo PMLog
min{f (x) = x10.5 x20.5 : 2x1 + x2 6, x1 0, x2 0}.

2
X vi
d0 (u, v) := (ui vi )2 + (v2i log + ui vi v2i ).
2 ui
i=1
4. Referencias
ABB04 Alvarez, F., Bolte, J., and Brahic, O., Hessian Riemannian Gradient Flows in
Convex Programming, SIAM J. Optim., 43, 2, 477-501, 2004.
Attouch, H., and Bolte, J. On the Convergence of the Proximal Algorithm for
Nonsmooth Functions Involving Analytic Features, Mathematical Programming, 116,
1-2, B, 5-16, 2009.
Attouch, H., and Teboulle, M. Regularized Lotka-Volterra Dynamical System as
Continuous Proximal-Like Method in Optimization, Journal Optimization Theory and
Applications, 121, 541-580, 2004.
Chen, J. S., and S. Pan, S. H. Proximal-Like Algorithm for a Class of Nonconvex
Programming, Pacific Journal of Optimization, 4, 319-333, 2008.
Cunha, G. F. M., da Cruz Neto, J. X., and Oliveira, P. R. A Proximal Point Algorithm
with -Divergence to Quasiconvex Programming, Optimization, 59, 777-792, 2010.
Eckstein J. y Bertsekas D.P. On the Douglas-Rachford splitting method and the
proximal algorithm for maximal monotone operators. Mathematical Programming Vol
55, N 3: 293-318, 1992.
Gromicho J. Quasiconvex optimization and location theory. Kluwer Academic
Publishers, Dordrecht, the Netherlands, (1998).
Guler O. On the Convergence of the Proximal Point Algorithm for Convex
Minimization. SIAM J. Control and Optimization, 29 (2), 403-4019, 1991.
Kaplan, A., and Tichatschke, R. Proximal Point Methods and Nonconvex
Optimization, Journal of global Optimization, 13, 389-406, 1998.
Martinet B. Regularisaton, Dinequations Variationelles par Approximations
Successives, Revue Francaise DInformatique et de Recherche Operationelle,
154-159, 1970.
Mas-Colell A, Whinston MD, Green JR. Microeconomic theory, Oxford University
Press: New York, NY, USA; 1995.
Pan, S. H., and Chen, J. S. Entropy-Like Proximal Algorithms Based on a
Second-Order Homogeneus Distances Function for Quasiconvex Programming,
Journal of Global Optimization, 39, 555-575, 2007.
Papa Quiroz, E.A., Oliveira, P.R. Proximal Point Methods for Quasiconvex and
Convex Functions with Bregman Distances on Hadamard Manifolds. Journal of
Convex Analysis, 16,1, pp 49-69, 2009.
Papa Quiroz, E.A., Oliveira, P.R. Full Convergence of the Proximal Point Method for
Quasiconvex Functions on Hadamard Manifolds. ESAIM: Control, Optimization and
Calculus of Variations, v.18(2), pp. 483-500, 2012.
Papa Quiroz, E.A. An Extension of the Proximal Point Algorithm with Bregman
Distances on Hadamard Manifolds, Sometido a JOGO, 2011.
Papa Quiroz, E.A., Mallma Ramirez, L. Proximal Point Algorithm with Generalized
Distances for Quasiconvex functions with linear inequalities, Sometido a EJOR,
2012.
Papa Quiroz, E.A., Oliveira, P.R. Proximal Point Method for Minimizing Quasiconvex
Locally Lipschitz Functions on Hadamard Manifolds, Nonlinear Analysis, 75, pp.
5924-5932, 2012.
Papa Quiroz, E.A., Oliveira, P.R. An extension of proximal methods for quasiconvex
minimization on the nonnegative orthant, European Journal of Operational
Research, 2016, pp. 26-32, 2012.
Rockafellar, R. T. Augmented Lagrangians and Application of Proximal Point
Algorithm in Convex Programming, Mathematics of Operation Research, Vol.1, pp.
97-116, 1976.
Takayama A. Mathematical economics, 2nd Edition, Cambridge Univ. Press, 1995.
Tseng P. Convergence of a Block Coordinate Descent Method for Nondifferentiable
Minimization, Journal of Optimization Theory and Applications, 109, 3, 475-494,
2001.
Souza, S. S., Oliveira, P. R., da Cruz Neto, J. X., and Soubeyran, A. A Proximal
Method with Separable Bregman Distance for Quasiconvex Minimization on the
Nonnegative Orthant, European Journal of Operational Research, 201, 365-376,
2010.

Clase Proximal 2017

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Clase Proximal 2017

Uploaded by

Copyright:

Available Formats

Metodo del Punto Proximal

Erik Alex Papa Quiroz

El Metodo de Punto Proximal (MPP) genera una sucesion {xk }

xk arg min{f (x) + (k /2)||x xk1 ||2 : x Rn },

I Ademas, si el conjunto optimo X es no vacio, entonces {xk }

2. En general si f es difernciable se tiene que

El metodo, salvo el caso cuadratico, es implcito en el sentido

Usaremos el punto inicial x0 = (0, 0) y realizaremos pasos del

donde f es una funcion convexa y C es un conjunto convexo

min f (x) + k Dh (x, xk1 )

acotado, lim xk = x y lim Dh (yk , xk ) = 0, entonces lim yk = x .

llamada divergencia de Kullback-Leibler, que es usada en

I Ademas, si el conjunto optimo X es no vacio, entonces {xk }

min{f (x) = x10.5 x20.5 : 2x1 + x2 6, x1 0, x2 0}.

La solucion optima es: x1 = 3/2, x2 = 3.

Dh (x, z) = d(y(x), y(z)), con y(x) = Ax b

+(x1 ln(x1 /z1 ) + z1 x1 ) + (x2 ln(x2 /z2 ) + z2 x2 ).

min{f (x) : x Rn+ }

min f (x) + k d (x, xk1 )

Hacer k = k + 1 y volver al test Si.

1 := , existe M > 0 tal que 00 (t) M, t 1 .

2. 2 (t) = ln t + t 1, dom = (0, +) , obteniendo

I Ademas, si el conjunto optimo X es no vacio, entonces {xk }

min{f (x) = x10.5 x20.5 : 2x1 + x2 6, x1 0, x2 0}.

La solucion optima es: x1 = 3/2, x2 = 3.

D (x, z) = d(y(x), y(z)), con y(x) = Ax b

+(z1 ln(z1 /x1 ) + x1 z1 ) + (z2 ln(z2 /x2 ) + x2 z2 ).

min f (x) + k d (x, xk1 )

Hacer k = k + 1 y volver al test Si.

min{f (x) = x10.5 x20.5 : 2x1 + x2 6, x1 0, x2 0}.

La solucion optima es: x1 = 3/2, x2 = 3.

You might also like