Constrained Optimization and Lagrange Multiplier Methods

Dimitri P. Bertsekas
Massachusetts Institute of Technology

WWW site for book information and orders: http://world.std.com/~athenasc/

Athena Scientific, Belmont, Massachusetts

Athena Scientific
Post Office Box 391
Belmont, Mass. 02178-9998
U.S.A.

Email: athenasc@world.std.com
WWW information and orders: http://world.std.com/~athenasc/

Cover Design: Ann Gallager

© 1996 Dimitri P. Bertsekas

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

Originally published by Academic Press, Inc., in 1982.

Publisher's Cataloging-in-Publication Data
Bertsekas, Dimitri P.
Constrained Optimization and Lagrange Multiplier Methods
Includes bibliographical references and index
1. Mathematical Optimization. 2. Multipliers (Mathematical Analysis). I. Title.
QA402.5.B46 1996  519.4  96-79307

ISBN 1-886529-04-3

ABOUT THE AUTHOR

Dimitri Bertsekas studied Mechanical and Electrical Engineering at the National Technical University of Athens, Greece, and obtained his Ph.D. in system science from the Massachusetts Institute of Technology. He has held faculty positions with the Engineering-Economic Systems Dept. of Stanford University and the Electrical Engineering Dept. of the University of Illinois, Urbana. He is currently Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. He consults regularly with private industry and has held editorial positions in several journals. He has been elected Fellow of the IEEE.

Professor Bertsekas has done research in a broad variety of subjects from optimization theory, control theory, parallel and distributed computation, data communication networks, and systems analysis. He has written numerous papers in each of these areas.

Other books by the author:

1) Dynamic Programming and Stochastic Control, Academic Press, 1976.
2) Stochastic Optimal Control: The Discrete-Time Case, Academic Press, 1978; republished by Athena Scientific, 1997 (with S. E. Shreve; translated in Russian).
3) Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, 1987.
4) Data Networks, Prentice-Hall, 1987; 2nd Edition 1992 (with R. G. Gallager; translated in Russian and Japanese).
5) Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, 1989; republished by Athena Scientific, 1997 (with J. N. Tsitsiklis).
6) Linear Network Optimization: Algorithms and Codes, M.I.T. Press, 1991.
7) Dynamic Programming and Optimal Control (2 Vols.), Athena Scientific, 1995.
8) Nonlinear Programming, Athena Scientific, 1995.
9) Neuro-Dynamic Programming, Athena Scientific, 1996 (with J. N. Tsitsiklis).
10) Network Optimization: Continuous and Discrete Models, Athena Scientific, 1998.

ATHENA SCIENTIFIC
OPTIMIZATION AND COMPUTATION SERIES

Dynamic Programming and Optimal Control, Vols. I and II, by Dimitri P. Bertsekas, 1995, ISBN 1-886529-11-6, 704 pages

Nonlinear Programming, by Dimitri P. Bertsekas, 1995, ISBN 1-886529-14-0, 656 pages

Neuro-Dynamic Programming, by Dimitri P. Bertsekas and John N. Tsitsiklis, 1996, ISBN 1-886529-10-8, 512 pages

Constrained Optimization and Lagrange Multiplier Methods, by Dimitri P. Bertsekas, 1996, ISBN 1-886529-04-3, 410 pages

Stochastic Optimal Control: The Discrete-Time Case, by Dimitri P. Bertsekas and Steven E.
Shreve, 1996, ISBN 1-886529-03-5, 330 pages

Introduction to Linear Optimization, by Dimitris Bertsimas and John N. Tsitsiklis, 1997, ISBN 1-886529-19-1, 608 pages

Parallel and Distributed Computation: Numerical Methods, by Dimitri P. Bertsekas and John N. Tsitsiklis, 1997, ISBN 1-886529-01-9, 718 pages

Network Flows and Monotropic Optimization, by R. Tyrrell Rockafellar, 1998, ISBN 1-886529-06-X, 634 pages

Network Optimization: Continuous and Discrete Models, by Dimitri P. Bertsekas, 1998, ISBN 1-886529-02-7, 608 pages

To Teli and Taki

Contents

Preface

Chapter 1  Introduction
1.1 General Remarks
1.2 Notation and Mathematical Background
1.3 Unconstrained Minimization
    1.3.1 Convergence Analysis of Gradient Methods
    1.3.2 Steepest Descent and Scaling
    1.3.3 Newton's Method and Its Modifications
    1.3.4 Conjugate Direction and Conjugate Gradient Methods
    1.3.5 Quasi-Newton Methods
    1.3.6 Methods Not Requiring Evaluation of Derivatives
1.4 Constrained Minimization
1.5 Algorithms for Minimization Subject to Simple Constraints
1.6 Notes and Sources

Chapter 2  The Method of Multipliers for Equality Constrained Problems
2.1 The Quadratic Penalty Function Method
2.2 The Original Method of Multipliers
    2.2.1 Geometric Interpretation
    2.2.2 Existence of Local Minima of the Augmented Lagrangian
    2.2.3 The Primal Functional
    2.2.4 Convergence Analysis
    2.2.5 Comparison with the Penalty Method—Computational Aspects
2.3 Duality Framework for the Method of Multipliers
    2.3.1 Stepsize Analysis for the Method of Multipliers
    2.3.2 The Second-Order Multiplier Iteration
    2.3.3 Quasi-Newton Versions of the Second-Order Iteration
    2.3.4 Geometric Interpretation of the Second-Order Multiplier Iteration
2.4 Multiplier Methods with Partial Elimination of Constraints
2.5 Asymptotically Exact Minimization in Methods of Multipliers
2.6 Dual Methods Not Using a Penalty Function
2.7 Notes and Sources

Chapter 3  The Method of Multipliers for Inequality Constrained and Nondifferentiable Optimization Problems
3.1 One-Sided Inequality Constraints
3.2 Two-Sided Inequality Constraints
3.3 Approximation Procedures for Nondifferentiable and Ill-Conditioned Optimization Problems
3.4 Notes and Sources

Chapter 4  Exact Penalty Methods and Lagrangian Methods
4.1 Nondifferentiable Exact Penalty Functions
4.2 Linearization Algorithms Based on Nondifferentiable Exact Penalty Functions
    4.2.1 Algorithms for Minimax Problems
    4.2.2 Algorithms for Constrained Optimization Problems
4.3 Differentiable Exact Penalty Functions
    4.3.1 Exact Penalty Functions Depending on x and λ
    4.3.2 Exact Penalty Functions Depending Only on x
    4.3.3 Algorithms Based on Differentiable Exact Penalty Functions
4.4 Lagrangian Methods—Local Convergence
    4.4.1 First-Order Methods
    4.4.2 Newton-like Methods for Equality Constraints
    4.4.3 Newton-like Methods for Inequality Constraints
    4.4.4 Quasi-Newton Versions
4.5 Lagrangian Methods—Global Convergence
    4.5.1 Combinations with Penalty and Multiplier Methods
    4.5.2 Combinations with Differentiable Exact Penalty Methods—Newton and Quasi-Newton Versions
    4.5.3 Combinations with Nondifferentiable Exact Penalty Methods—Powell's Variable Metric Approach
4.6 Notes and Sources

Chapter 5  Nonquadratic Penalty Functions—Convex Programming
5.1 Classes of Penalty Functions and Corresponding Methods of Multipliers
    5.1.1 Penalty Functions for Equality Constraints
    5.1.2 Penalty Functions for Inequality Constraints
    5.1.3 Approximation Procedures Based on Nonquadratic Penalty Functions
5.2 Convex Programming and Duality
5.3 Convergence Analysis of Multiplier Methods
5.4 Rate of Convergence Analysis
5.5 Conditions for Penalty Methods to Be Exact
5.6 Large Scale Separable Integer Programming Problems and the Exponential Method of Multipliers
    5.6.1 An Estimate of the Duality Gap
    5.6.2 Solution of the Dual and Relaxed Problems
5.7 Notes and Sources

References

Index

Preface

The area of Lagrange multiplier methods for constrained minimization has undergone a radical transformation starting with the introduction of augmented Lagrangian functions and methods of multipliers in 1968 by Hestenes and Powell. The initial success of these methods in computational practice motivated further efforts aimed at understanding and improving their properties. At the same time their discovery provided impetus and a new perspective for reexamination of Lagrange multiplier methods proposed and nearly abandoned several years earlier. These efforts, aided by fresh ideas based on exact penalty functions, have resulted in a variety of interesting methods utilizing Lagrange multiplier iterations and competing with each other for solution of different classes of problems.

This monograph is the outgrowth of the author's research involvement in the area of Lagrange multiplier methods over a nine-year period beginning in early 1972. It is aimed primarily toward researchers and practitioners of mathematical programming algorithms, with a solid background in introductory linear algebra and real analysis.

Considerable emphasis is placed on the method of multipliers which, together with its many variations, may be viewed as a primary subject of the monograph. Chapters 2, 3, and 5 are devoted to this method. A large portion of Chapter 1 is devoted to unconstrained minimization algorithms on which the method relies. The developments on methods of multipliers serve as a good introduction to other Lagrange multiplier methods examined in Chapter 4.

Several results and algorithms were developed as the monograph was being written and have not as yet been published in journals. These include the algorithm for minimization subject to simple constraints (Section 1.5), the improved convergence and rate-of-convergence results of Chapter 2, the first stepsize rule of Section 2.3.1, the unification of the exact penalty methods of DiPillo and Grippo, and Fletcher, and their relationship with Newton's method (Section 4.3), the globally convergent Newton and quasi-Newton methods based on differentiable exact penalty functions (Section 4.5.2), and the methodology for solving large-scale separable integer programming problems of Section 5.6.

The line of development of the monograph is based on the author's conviction that solving practical nonlinear optimization problems efficiently (or at all) is typically a challenging undertaking and can be accomplished only through a thorough understanding of the underlying theory. This is true even if a polished packaged optimization program is used, but more so when the problem is large enough or important enough to warrant the development of a specialized algorithm. Furthermore, it is quite common in practice that methods are modified, combined, and extended in order to construct an algorithm that matches best the features of the particular problem at hand, and such modifications require a full understanding of the theoretical foundations of the method utilized.
For these reasons, we place primary emphasis on the principles underlying various methods and the analysis of their convergence and rate-of-convergence properties. We also provide extensive guidance on the merits of various types of methods but, with a few exceptions, do not provide any algorithms that are specified to the last level of detail.

The monograph is based on the collective works of many researchers as well as my own. Of those people whose work had a substantial influence on my thinking and contributed in an important way to the monograph I would like to mention J. D. Buys, G. DiPillo, L. Dixon, R. Fletcher, T. Glad, L. Grippo, M. Hestenes, D. Luenberger, O. Mangasarian, D. Q. Mayne, E. Polak, B. T. Poljak, M. J. D. Powell, B. Pschenichny, R. T. Rockafellar, and R. Tapia.

My research on methods of multipliers began at Stanford University. My interaction there with Daniel Gabay, Barry Kort, and David Luenberger had a lasting influence on my subsequent work on the subject. The material of Chapter 5 in particular is largely based on the results of my direct collaboration with Barry Kort. The material of Section 5.6 is based on work on electric power system scheduling at Alphatech, Inc., where I collaborated with Greg Lauer, Tom Posbergh, and Nils R. Sandell, Jr.

Finally, I wish to acknowledge gratefully the research support of the National Science Foundation, and the expert typing of Margaret Flaherty, Leni Gross, and Rosalie J. Bialy.

Chapter 1

Introduction

1.1 General Remarks

Two classical nonlinear programming problems are the equality constrained problem

(ECP)   minimize $f(x)$ subject to $h(x) = 0$

and its inequality constrained version

(ICP)   minimize $f(x)$ subject to $g(x) \le 0$,

where $f: \mathbb{R}^n \to \mathbb{R}$, $h: \mathbb{R}^n \to \mathbb{R}^m$, and $g: \mathbb{R}^n \to \mathbb{R}^r$ are given functions. Computational methods for solving these problems became the subject of intensive investigation during the late fifties and early sixties. We discuss three of the approaches that were pursued.

The first approach was based on the idea of iterative descent within the confines of the constraint set. Given a feasible point $x_k$, a direction $d_k$ was chosen satisfying the descent condition $\nabla f(x_k)' d_k < 0$ and the condition that $x_k + \alpha d_k$ be feasible for all $\alpha$ positive and sufficiently small. A search along the line $\{x_k + \alpha d_k \mid \alpha > 0\}$ produced a new feasible point $x_{k+1} = x_k + \alpha_k d_k$ satisfying $f(x_{k+1}) < f(x_k)$. This led to various classes of feasible direction methods with which the names of Frank-Wolfe, Zoutendijk, Rosen, Goldstein, and Levitin-Poljak are commonly associated. These methods, together with their more sophisticated versions, enjoyed considerable success and still continue to be very popular for problems with linear constraints. On the other hand, feasible direction methods by their very nature were unable to handle problems with nonlinear equality constraints, and some of them were inapplicable or otherwise not well suited for handling nonlinear inequality constraints as well. A number of modifications were proposed for treating nonlinear equality constraints, but these involved considerable complexity and detracted substantially from the appeal of the descent idea.
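As an illustration of the feasible direction idea, the following is a minimal Python sketch of the Frank-Wolfe (conditional gradient) method for a linearly constrained problem: the objective is linearized at the current feasible point, a linear program over the constraint set yields a feasible descent direction, and a line search is performed along the feasible segment. This is a sketch under stated assumptions, not an algorithm from the text; the quadratic objective, box constraints, starting point, and tolerances are all illustrative choices.

```python
# Minimal sketch of the Frank-Wolfe (conditional gradient) method for
# minimizing a smooth f over the polyhedron {x : A x <= b}.
import numpy as np
from scipy.optimize import linprog

def frank_wolfe(f, grad_f, A, b, x, iters=50):
    for _ in range(iters):
        g = grad_f(x)
        # Direction-finding step: minimize the linearization g'y over the
        # feasible set; for linear constraints this is a linear program.
        lp = linprog(c=g, A_ub=A, b_ub=b, bounds=[(None, None)] * len(x))
        d = lp.x - x            # x + a*d remains feasible for every a in [0, 1]
        if g @ d >= -1e-10:     # no feasible descent direction: stop
            break
        a = 1.0                 # backtracking search along the feasible segment
        while f(x + a * d) >= f(x) and a > 1e-12:
            a *= 0.5
        x = x + a * d
    return x

# Example (illustrative): minimize (1/2)|x - z|^2 over the unit box 0 <= x_i <= 1.
z = np.array([0.3, 0.9])
f = lambda x: 0.5 * (x - z) @ (x - z)
grad_f = lambda x: x - z
A = np.vstack([np.eye(2), -np.eye(2)])   # rows encode x_i <= 1 and -x_i <= 0
b = np.array([1.0, 1.0, 0.0, 0.0])
print(frank_wolfe(f, grad_f, A, b, x=np.array([1.0, 0.0])))  # tends toward z
```

The structure mirrors the description above: the descent condition $\nabla f(x_k)'d_k < 0$ is tested explicitly, and feasibility is preserved because the direction points from one feasible point toward another. It also shows why such methods favor linear constraints: the direction-finding subproblem is then simply a linear program.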
A second approach was based on the possibility of solving the system of equations and (possibly) inequalities which constitute necessary conditions for optimality for the optimization problem. For (ECP), these conditions are

(1a)   $\nabla_x L(x, \lambda) = \nabla f(x) + \nabla h(x)\lambda = 0$,

(1b)   $\nabla_\lambda L(x, \lambda) = h(x) = 0$,

where $L(x, \lambda) = f(x) + \lambda' h(x)$ is the standard Lagrangian function. Some of the methods proposed for solving this system, such as Newton's method, require that the local convexity assumption $\nabla_{xx}^2 L(x^*, \lambda^*) > 0$ holds at a solution $(x^*, \lambda^*)$. It was noted, however, by Arrow and Solow (1958) that if the local convexity assumption did not hold, then (ECP) could be replaced by the equivalent problem

(3)   minimize $f(x) + \tfrac{1}{2}c\,|h(x)|^2$ subject to $h(x) = 0$,

where $c$ is a scalar and $|\cdot|$ denotes the Euclidean norm. If $c$ is taken sufficiently large, then the local convexity condition can be shown to hold for problem (3) under fairly mild conditions. The idea of focusing attention on the necessary conditions rather than the original problem also attracted considerable attention in optimal control, where the necessary conditions can often be formulated as a two-point boundary value problem. However, it quickly became evident that the approach had some fundamental limitations, mainly the lack of a good mechanism to enforce convergence when far from a solution, and the difficulty some of the methods had in distinguishing between local minima and local maxima.

A third approach was based on elimination of constraints through the use of penalty functions. For example, the quadratic penalty function method (Fiacco and McCormick, 1968) for (ECP) consists of sequential unconstrained minimizations of the form

(4)   minimize $f(x) + \tfrac{1}{2}c_k|h(x)|^2$ subject to $x \in \mathbb{R}^n$,

where $\{c_k\}$ is a positive scalar sequence with $c_k < c_{k+1}$ for all $k$ and $c_k \to \infty$.
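To make the penalty approach concrete, here is a minimal sketch of the sequential unconstrained minimization (4). The specific instance of (ECP), the starting point, and the tenfold growth rule for $c_k$ are illustrative assumptions, not examples from the text.

```python
# Minimal sketch of the quadratic penalty method (4): repeatedly minimize
# f(x) + (c_k/2)|h(x)|^2 without constraints while c_k increases to infinity.
import numpy as np
from scipy.optimize import minimize

# Illustrative (ECP) instance (an assumption for this demo):
# minimize x1 + x2 subject to x1^2 + x2^2 - 2 = 0; the solution is (-1, -1).
f = lambda x: x[0] + x[1]
h = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 2.0])

def quadratic_penalty(x0, c0=1.0, growth=10.0, outer_iters=6):
    x, c = np.asarray(x0, dtype=float), c0
    for _ in range(outer_iters):
        # Unconstrained minimization for the current penalty parameter c_k,
        # warm-started at the previous minimizer.
        res = minimize(lambda y: f(y) + 0.5 * c * (h(y) @ h(y)), x, method="BFGS")
        x = res.x
        c *= growth          # enforce c_{k+1} > c_k and c_k -> infinity
    return x

print(quadratic_penalty([0.0, 0.0]))   # approaches the solution (-1, -1)
```

A well-known tradeoff is visible here: larger $c_k$ forces $h(x)$ closer to zero but makes the unconstrained subproblems increasingly ill-conditioned, a tension that motivates the methods of multipliers developed in Chapter 2.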
