Multiobjective Optimization - Interactive and Evolutionary Approaches

Lecture Notes in Computer Science 5252
Commenced Publication in 1973

Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Jürgen Branke, Kalyanmoy Deb,
Kaisa Miettinen, Roman Słowiński (Eds.)
Multiobjective
Optimization
Interactive and Evolutionary Approaches
Volume Editors
Jürgen Branke
University of Karlsruhe, Institute AIFB
76128 Karlsruhe, Germany
E-mail: branke@aifb.uni-karlsruhe.de
Kalyanmoy Deb
Indian Institute of Technology Kanpur, Department of Mechanical Engineering
Kanpur 208016, India
E-mail: deb@iitk.ac.in
and
Helsinki School of Economics, Department of Business Technology
P.O. Box 1210, 00101 Helsinki, Finland
E-mail: kalyanmoy.deb@hse.fi
Kaisa Miettinen
University of Jyväskylä, Department of Mathematical Information Technology
P.O. Box 35 (Agora), 40014 University of Jyväskylä, Finland
E-mail: kaisa.miettinen@jyu.fi
Roman Słowiński
Poznan University of Technology, Institute of Computing Science
60-965 Poznan, Poland
E-mail: roman.slowinski@cs.put.poznan.pl
and
Systems Research Institute, Polish Academy of Sciences
00-441 Warsaw, Poland
Library of Congress Control Number: 2008937576

CR Subject Classification (1998): F.1, F.2, G.1.6, G.2.1, G.1
LNCS Sublibrary: SL 1 Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-540-88907-8 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-88907-6 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
Springer-Verlag Berlin Heidelberg 2008
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Markus Richter, Heidelberg
Printed on acid-free paper SPIN: 12542253 06/3180 543210
Preface
Optimization is the task of finding one or more solutions which correspond to
minimizing (or maximizing) one or more specified objectives and which satisfy
all constraints (if any). A single-objective optimization problem involves a
single objective function and usually results in a single solution, called an
optimal solution. On the other hand, a multiobjective optimization task
considers several conflicting objectives simultaneously. In such a case, there
is usually no single optimal solution, but a set of alternatives with different
trade-offs, called Pareto optimal solutions, or non-dominated solutions. Despite
the existence of multiple Pareto optimal solutions, in practice, usually only
one of these solutions is to be chosen. Thus, compared to single-objective
optimization problems, in multiobjective optimization, there are at least two
equally important tasks: an optimization task for finding Pareto optimal
solutions (involving a computer-based procedure) and a decision-making task for
choosing a single most preferred solution. The latter typically necessitates
preference information from a decision maker (DM).
1 Modelling an Optimization Problem

Before any optimization can be done, the problem must first be modelled. As a
matter of fact, building an appropriate mathematical or computational model
for an optimization problem is as important and as critical as the optimization
task itself. Typically, most books devoted to optimization methods tacitly
assume that the problem has been correctly specified. However, in practice,
this is not always the case. Quantifying and discussing the modelling aspects
largely depend on the actual context of the underlying problem and, thus, we do
not consider modelling aspects in this book. However, we wish to highlight the
following points.
First, building a suitable model (that is, formulating the optimization
problem by specifying decision variables, objectives, constraints, and variable
bounds) is an important task. Second, an optimization algorithm (single or
multiobjective, alike) finds the optima of the model of the optimization
problem specified and not of the true optimization problem. For these reasons,
the optimal solutions found by an optimization algorithm must always be
analyzed (through a post-optimality analysis) for their appropriateness in the
context of the problem. This aspect makes the optimization task iterative in
the sense that if some discrepancies in the optimal solutions obtained are
found in the post-optimality analysis, the optimization model may have to be
modified and the optimization task must be performed again. For example, if
the DM during the solution process of a multiobjective optimization problem
learns that the interdependencies between the objectives do not correspond to
his/her experience and understanding, one must go back to the modelling phase.
2 Why Use Multiple Objectives?
It is a common misconception in practice that most design or problem solving
activities must be geared toward optimizing a single objective, for example,
bringing maximum profit or causing the smallest cost, even though there may
exist different conflicting goals for the optimization task. As a result, the
different goals are often redefined to provide an equivalent cost or a profit
value, thereby artificially reducing the number of apparently conflicting goals
into a single objective. However, the correlation between objectives is usually
rather complex and dependent on the alternatives available. Moreover, the
different objectives are typically non-commensurable, so it is difficult to
aggregate them into one synthetic objective. Let us consider the simple example
of choosing a hotel for a night. If the alternatives are a one-star hotel for
70 euros, or a zero-star hotel for 20 euros, the user might prefer the one-star
hotel. On the other hand, if the choice is between a five-star hotel for 300
euros, and a four-star hotel for 250 euros, the four-star hotel may be
sufficient. That is, stars cannot be simply weighted with money. How much an
extra star is valued depends on the alternatives. As a consequence, it may be
very difficult to combine different objectives into a single goal function a
priori, that is, before alternatives are known. It may be comparatively easier
to choose among a given set of alternatives if appropriate decision support is
available for the DM. Similarly, one cannot simply specify constraints on the
objectives before alternatives are known, as the resulting feasible region may
become empty, making the optimization problem impossible to solve.
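To make the hotel example concrete, the following short derivation is an
illustration added under an explicit assumption that is not made in the text:
the user is taken to score a hotel with $s$ stars and price $c$ euros by a
linear value function $v(s, c) = w s - c$, where the weight $w$ (euros per
star) is hypothetical. If the user's choices are as described, no single $w$
reproduces both of them:

```latex
\begin{align*}
\text{one-star (70) over zero-star (20):} \quad
  & w \cdot 1 - 70 > w \cdot 0 - 20 \;\Longrightarrow\; w > 50, \\
\text{four-star (250) over five-star (300):} \quad
  & w \cdot 4 - 250 > w \cdot 5 - 300 \;\Longrightarrow\; w < 50.
\end{align*}
```

The two conditions are incompatible, which is one way to read the statement
that the value of an extra star depends on the alternatives at hand.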
It should be clear that multiobjective optimization consists of three phases:
model building, optimization, and decision making (preference articulation).
Converting a multiobjective optimization problem into a simplistic
single-objective problem puts decision making before optimization, that is,
before alternatives are known. As explained above, articulating preferences
without a good knowledge of alternatives is difficult, and thus the resulting
optimum may not correspond to the solution the user would have selected from
the set of Pareto optimal solutions. Treating the problem as a true
multiobjective problem means putting the preference articulation stage after
optimization, or interlacing optimization and preference articulation. This
will help the user gain a much better understanding of the problem and the
available alternatives, thus leading to a more conscious and better choice.
Furthermore, the resulting multiple Pareto optimal solutions can be analyzed to
learn about interdependencies among decision variables, objectives, and
constraints. Such knowledge about the interactions can be used to redefine the
model of the optimization problem to get solutions that, on the one hand,
correspond better to reality, and, on the other hand, better satisfy the DM's
preferences.
3 Multiple Criteria Decision Making

The research field of considering decision problems with multiple conflicting
objectives (or goals or criteria) is known as multiple criteria decision making
(MCDM) or multiple criteria decision aiding (MCDA). It covers both discrete
problems (with a finite set of alternatives, also called actions or solutions)
and continuous problems (multiobjective optimization). Traditionally, in
multiobjective optimization (also known as multicriteria optimization),
mathematical programming techniques and decision making have been used in an
intertwined manner, and the ultimate aim of solving a multiobjective
optimization problem has been characterized as supporting the DM in finding the
solution that best fits the DM's preferences. The alternating stages of
decision making and optimization typically create an interactive procedure for
finding the most preferred solution. The DM participates actively in this
procedure, particularly in the decision-making stage. Decision making on
alternatives discovered by optimization requires a more or less explicit model
of the DM's preferences, so as to find the most preferred solution among the
alternatives currently considered, or to give indications for finding better
solutions in the next optimization stage. Many interactive methods have been
proposed to date, differing mainly in the way the DM is involved in the
process, and in the type of preference model built on preference information
elicited from the DM.
The origin of nonlinear multiobjective optimization goes back almost 60
years, when Kuhn and Tucker formulated optimality conditions. However, the
concept of Pareto optimality, for example, has a much earlier origin. More
information about the history of the field can be found in Chap. 1 of this
book. It is worth mentioning that biennial conferences on MCDM have been
regularly organized since 1975 (first by active researchers in the field, then
by a Special Interest Group formed by them and later by the International
Society on Multiple Criteria Decision Making). In addition, in Europe a Working
Group on Multiple Criteria Decision Aiding was established in 1975 within EURO
(European Association of Operational Research Societies) and holds two meetings
per year (it is presently at its 67th meeting). Furthermore, International
Summer Schools on Multicriteria Decision Aid have been arranged since 1983. A
significant number of monographs, journal articles, conference proceedings, and
collections have been published over the years and the field is still active.
4 Evolutionary Multiobjective Optimization

In the 1960s, several researchers independently suggested adopting the
principles of natural evolution, in particular Darwin's theory of the survival
of the fittest, for optimization. These pioneers were Lawrence Fogel, John H.
Holland, Ingo Rechenberg, and Hans-Paul Schwefel. One distinguishing feature of
these so-called evolutionary algorithms (EAs) is that they work with a
population of solutions. This is of particular advantage in the case of
multiobjective optimization, as they can search for several Pareto optimal
solutions simultaneously in one run, providing the DM with a set of
alternatives to choose from.
Despite some early suggestions and studies, major research and application
activities of EAs in multiobjective optimization, spurred by a unique
suggestion by David E. Goldberg of a combined EA involving domination and
niching, started only at the beginning of the 1990s. But in the last 15 years,
the field of evolutionary multiobjective optimization (EMO) has developed
rapidly, with a regular, dedicated, biennial conference, commercial software,
and more than 10 books on the topic. Although earlier studies focused on
finding a representative set of solutions on the entire Pareto optimal set, EMO
methodologies are also good candidates for finding only a part of the Pareto
optimal set.
5 Genesis of This Book

Soon after initiating EMO activities, the leading researchers recognized the
existence of the MCDM field and commonality in interests between the two
fields. They realized the importance of exchanging ideas and engaging in
collaborative studies. Since their first international conference in 2001 in
Zurich, EMO conference organizers have always invited leading MCDM researchers
to deliver keynote and invited lectures. The need for cross-fertilization was
also realized by the MCDM community and they reciprocated. However, as each
field tried to understand the other, the need for real collaborations became
clear.
During the 2003 visit of Kalyanmoy Deb to the University of Karlsruhe to work
on EMO topics with Jürgen Branke and Hartmut Schmeck, they came up with the
idea of arranging a Dagstuhl seminar on multiobjective optimization along with
two leading MCDM researchers, Kaisa Miettinen and Ralph E. Steuer.
The Dagstuhl seminar organized in November 2004 provided an ideal platform
for bringing in the best minds from the two fields and exchanging the
philosophies of each other's methodologies in solving multiobjective
optimization problems. It became obvious that the fields did not yet know each
other's approaches well enough. For example, some EMO researchers had developed
ideas that had long existed in the MCDM field and, on the other hand, the MCDM
field welcomed the applicability of EMO approaches to problems where
mathematical programming has difficulties.
The success of a multiobjective optimization application relies on the way
the DM is allowed to interact with the optimization procedure. At the end of
the 2004 Dagstuhl seminar, a general consensus clearly emerged that there is
plenty of potential in combining ideas and approaches of the MCDM and EMO
fields and preparing hybrids of them. Examples of ideas that emerged were that
more attention in the EMO field should be devoted to incorporating preference
information into the methods and that EMO procedures can be used to parallelize
the repetitive tasks often performed in an MCDM task. Sensing the opportunity
for a collaborative effort, a second Dagstuhl seminar was organized in December
2006 and Roman Słowiński, who strongly advocated for inclusion of preference
modelling into EMO procedures, was invited to the organizing team. The seminar
brought together about 50 researchers from the EMO and MCDM fields interested
in bringing EMO and MCDM approaches closer to each other. We, the organizers,
had a clear idea in mind. The presence of experts from both fields should be
exploited so that the outcome could be written up in a single book for the
benefit of both novices and experts from both fields.
6 Topics Covered
Before we discuss the topics covered in this book, we mention a few aspects of
the MCDM field which we do not discuss here. Because of the large amount of
research and publications produced in the MCDM field over the years, we have
limited our review. We have mostly restricted our discussion to continuous
problems, although some chapters include some extensions to discrete problems,
as well. However, one has to mention that because the multiattribute or
multiple criteria decision analysis methods have been developed for problems
involving a discrete set of solution alternatives, they can directly be used
for analyzing the final population of an EMO algorithm. In this way, there is a
clear link between the two fields. Another topic not covered here is group
decision making. This refers to situations where we have several DMs with
different preferences. Instead, we assume that we have a single DM or a
unanimous group of DMs involved.
We have divided the contents of this book into five parts. The first part is
devoted to the basics of multiobjective optimization and introduces in three
chapters the main methods and ideas developed in the field of nonlinear
multiobjective optimization on the MCDM side (including both noninteractive and
interactive approaches) and on the EMO side. This part lays a foundation for
the rest of the book and should also allow newcomers to the field to get
familiar with the topic. The second part introduces in four chapters recent
developments in considering preference information or creating interactive
methods. Approaches with both MCDM and EMO origin as well as their hybrids are
included. The third part concentrates, in Chaps. 8 and 9, on visualization,
both for individual solution candidates and for whole sets of Pareto optimal
solutions. In Chaps. 10-13 (Part Four), implementation issues including
meta-modelling, parallel approaches, and software are of interest. In addition,
various real-world applications are described in order to give some idea of the
wide spectrum of disciplines and problems that can benefit from multiobjective
optimization. Finally, in the last three chapters forming Part Five, some
relevant topics including approximation quality in the EMO approaches and
learning perspectives in decision making are studied. The last chapter points
to some future challenges and encourages further research in the field. All 16
chapters matured during the 2006 Dagstuhl seminar. In particular, the last six
chapters are outcomes of active working groups formed during the seminar.
7 Main Terminology and Notations Used

In order to avoid repeating basic concepts and problem formulations in each
chapter, we present them here. We handle multiobjective optimization problems
of the form

\[
\begin{array}{ll}
\text{minimize}   & \{f_1(x), f_2(x), \ldots, f_k(x)\} \\
\text{subject to} & x \in S
\end{array}
\tag{1}
\]

involving $k$ ($\geq 2$) conflicting objective functions
$f_i : \mathbb{R}^n \to \mathbb{R}$ that we want to minimize simultaneously.
The decision (variable) vectors $x = (x_1, x_2, \ldots, x_n)^T$ belong to the
nonempty feasible region $S \subset \mathbb{R}^n$. In this general problem
formulation we do not fix the types of constraints forming the feasible region.
Objective vectors are images of decision vectors and consist of objective
(function) values $z = f(x) = (f_1(x), f_2(x), \ldots, f_k(x))^T$. Furthermore,
the image of the feasible region in the objective space is called a feasible
objective region $Z = f(S)$.
In multiobjective optimization, objective vectors are regarded as optimal
if none of their components can be improved without deterioration to at least
one of the other components. More precisely, a decision vector $x' \in S$ is
called Pareto optimal if there does not exist another $x \in S$ such that
$f_i(x) \leq f_i(x')$ for all $i = 1, \ldots, k$ and $f_j(x) < f_j(x')$ for at
least one index $j$. The set of Pareto optimal decision vectors can be denoted
by $P(S)$. Correspondingly, an objective vector is Pareto optimal if the
corresponding decision vector is Pareto optimal and the set of Pareto optimal
objective vectors can be denoted by $P(Z)$. The set of Pareto optimal solutions
is a subset of the set of weakly Pareto optimal solutions. A decision vector
$x' \in S$ is weakly Pareto optimal if there does not exist another $x \in S$
such that $f_i(x) < f_i(x')$ for all $i = 1, \ldots, k$. As above, here we can
also denote two sets corresponding to decision and objective spaces by $WP(S)$
and $WP(Z)$, respectively.
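For a finite set of objective vectors, the two definitions above can be checked
by direct pairwise comparison. The following Python sketch is an illustration
only and is not taken from the book: the function names (`dominates`,
`strictly_dominates`, `pareto_optimal`, `weakly_pareto_optimal`) and the sample
data are assumptions made for the example, and all objectives are minimized as
in problem (1).

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def strictly_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a is strictly better than b in every objective."""
    return all(x < y for x, y in zip(a, b))


def pareto_optimal(points: List[Vector]) -> List[Vector]:
    """Vectors that no other vector in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weakly_pareto_optimal(points: List[Vector]) -> List[Vector]:
    """Vectors that no other vector in the list strictly dominates."""
    return [p for p in points if not any(strictly_dominates(q, p) for q in points)]


# Hypothetical feasible objective region with two objectives (illustrative data).
Z = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (1.0, 5.0)]

P_Z = pareto_optimal(Z)
WP_Z = weakly_pareto_optimal(Z)
print(P_Z)
print(WP_Z)
```

In this sample, (3, 3) is strictly worse than (2, 2) in both objectives, so it
is in neither set. The vector (1, 5) is dominated by (1, 4) but not strictly
dominated by any vector, so it is weakly Pareto optimal without being Pareto
optimal. The pairwise check costs quadratic time in the number of vectors,
which is acceptable for a sketch but not tuned for large sets.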
The ranges of the Pareto optimal solutions in the feasible objective region
provide valuable information about the problem considered if the objective
functions are bounded over the feasible region. Lower bounds of the Pareto
optimal set are available in the ideal objective vector
$z^{*} \in \mathbb{R}^k$. Its components $z_i^{*}$ are obtained by minimizing
each of the objective functions individually subject to the feasible region. A
vector strictly better than $z^{*}$ can be called a utopian objective vector
$z^{**}$. In practice, we set $z_i^{**} = z_i^{*} - \varepsilon$ for
$i = 1, \ldots, k$, where $\varepsilon$ is some small positive scalar.
The upper bounds of the Pareto optimal set, that is, the components of
a nadir objective vector $z^{\mathrm{nad}}$, are usually difficult to obtain.
Unfortunately, there exists no constructive way to obtain the exact nadir
objective vector for nonlinear problems. It can be estimated using a payoff
table but the estimate may be unreliable.
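The following Python sketch illustrates these bounds on a small finite set of
objective vectors, where everything can be computed by enumeration. It is an
assumption-laden illustration rather than a method from the book: the function
names, the sample data, and the default value of `eps` are hypothetical, and
the payoff table is taken here to be the list whose i-th row is the objective
vector of a minimizer of the i-th objective. For a finite set the exact nadir
vector can be found by enumerating the Pareto optimal vectors, which is not
possible in this way for general nonlinear problems.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a is no worse than b everywhere and strictly better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def ideal_vector(points: List[Vector]) -> Vector:
    """Componentwise minimum: each objective minimized individually."""
    return tuple(min(column) for column in zip(*points))


def utopian_vector(points: List[Vector], eps: float = 1e-6) -> Vector:
    """Ideal vector shifted by a small positive scalar eps in every component."""
    return tuple(z - eps for z in ideal_vector(points))


def payoff_table(points: List[Vector]) -> List[Vector]:
    """Row i is the objective vector of one minimizer of objective i.

    Ties are broken by list order, which is one reason the resulting
    nadir estimate can depend on arbitrary choices.
    """
    k = len(points[0])
    return [min(points, key=lambda p, i=i: p[i]) for i in range(k)]


def nadir_estimate(points: List[Vector]) -> Vector:
    """Column maxima of the payoff table."""
    return tuple(max(column) for column in zip(*payoff_table(points)))


def nadir_by_enumeration(points: List[Vector]) -> Vector:
    """Componentwise maximum over the Pareto optimal vectors (finite sets only)."""
    pareto = [p for p in points if not any(dominates(q, p) for q in points)]
    return tuple(max(column) for column in zip(*pareto))


# Hypothetical three-objective example in which all four vectors are Pareto optimal.
Z = [(0.0, 5.0, 5.0), (5.0, 0.0, 5.0), (5.0, 5.0, 0.0), (1.0, 1.0, 9.0)]

z_ideal = ideal_vector(Z)
z_utopian = utopian_vector(Z)
z_nad_estimate = nadir_estimate(Z)
z_nad_exact = nadir_by_enumeration(Z)
print(z_ideal, z_nad_estimate, z_nad_exact)
```

In this sample the payoff table contains only the first three vectors, so the
estimate of the third nadir component is 5, whereas the Pareto optimal vector
(1, 1, 9) shows that the true value is 9. This is a small instance of the
unreliability mentioned above.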
Because vectors cannot be ordered completely, all the Pareto optimal
solutions can be regarded as equally desirable in the mathematical sense and
we need a decision maker (DM) to identify the most preferred one among
them. The DM is a person who can express preference information related to
the conflicting objectives and we assume that less is preferred to more in each
objective for her/him.
Besides a DM, we usually also need a so-called analyst to take part in the
solution process. By an analyst we mean a person or a computer program
responsible for the mathematical side of the solution process. The analyst
may be, for example, responsible for selecting the appropriate method for
optimization.
Acknowledgments
We would like to take this opportunity to thank all the participants of the 2004
and 2006 Dagstuhl seminars for their dedication and effort, without which
this book would not have been possible. Andrzej Wierzbicki's suggestion to
include a discussion on the importance of modelling issues in optimization in
the preface is appreciated. We thank Springer for supporting our idea of this
book. K. Deb and K. Miettinen acknowledge the support from the Academy
of Finland (grant no. 118319) and the Foundation of the Helsinki School of
Economics for completing this task.
The topics covered in this book are wide ranging: from presenting the
basics of multiobjective optimization to advanced topics of incorporating
diverse interactive features in multiobjective optimization, and from practical
real-world applications to software and visualization issues as well as various
perspectives highlighting relevant research issues. With these contents, we
hope the book will remain useful to both beginners and current researchers,
including experts. Besides the coverage of the topics, this book will also
remain a milestone achievement in the field of multiobjective optimization for
another reason. This book is the first concrete approach in bringing two
parallel fields of multiobjective optimization together. The 16 chapters of
this book are contributed by 19 EMO and 22 MCDM researchers. Of the 16
chapters, six are written by a mix of EMO and MCDM researchers and all 16
chapters have been reviewed by at least one EMO and one MCDM researcher. We
shall consider our efforts worthwhile if more such collaborative tasks are
pursued in the coming years to develop hybrid ideas by sharing the strengths of
different approaches.
June 2008
Jürgen Branke,
Kalyanmoy Deb,
Kaisa Miettinen,
Roman Słowiński
Most participants of the 2006 Dagstuhl seminar on Practical Approaches to
Multiobjective Optimization: 1 Eckart Zitzler, 2 Kalyanmoy Deb, 3 Kaisa
Miettinen, 4 Joshua Knowles, 5 Carlos Fonseca, 6 Salvatore Greco, 7 Oliver
Bandte, 8 Christian Igel, 9 Nirupam Chakraborti, 10 Silvia Poles, 11 Valerie
Belton, 12 Jyrki Wallenius, 13 Roman Słowiński, 14 Serpil Sayin, 15 Pekka
Korhonen, 16 Lothar Thiele, 17 Włodzimierz Ogryczak, 18 Andrzej Osyczka, 19
Koji Shimoyama, 20 Daisuke Sasaki, 21 Johannes Jahn, 22 Günter Rudolph, 23 Jörg
Fliege, 24 Matthias Ehrgott, 25 Petri Eskelinen, 26 Jerzy Błaszczyński, 27
Sanaz Mostaghim, 28 Pablo Funes, 29 Carlos Coello Coello, 30 Theodor Stewart,
31 José Figueira, 32 El-Ghazali Talbi, 33 Julian Molina, 34 Andrzej Wierzbicki,
35 Yaochu Jin, 36 Andrzej Jaszkiewicz, 37 Jürgen Branke, 38 Francisco Ruiz, 39
Hirotaka Nakayama, 40 Tatsuya Okabe, 41 Alexander Lotov, 42 Hisao Ishibuchi
List of Contributors
Oliver Bandte
Icosystem Corporation, Cambridge, MA 02138
oliver@icosystem.com

Valerie Belton
Department of Management Science, University of Strathclyde,
40 George Street, Glasgow, UK, G1 1QE
val.belton@strath.ac.uk

Jürgen Branke
Institute AIFB, University of Karlsruhe
76128 Karlsruhe, Germany
branke@aifb.uni-karlsruhe.de

Heinrich Braun
SAP AG, Walldorf, Germany
heinrich.braun@sap.com

Nirupam Chakraborti
Indian Institute of Technology, Kharagpur 721 302, India
nchakrab@iitkgp.ac.in

Carlos A. Coello Coello
CINVESTAV-IPN (Evolutionary Computation Group), Depto. de Computación
Av. IPN No 2508, Col. San Pedro Zacatenco, D.F., 07360, Mexico
ccoello@cs.cinvestav.mx

Kalyanmoy Deb
Department of Mechanical Engineering, Indian Institute of Technology Kanpur
Kanpur, PIN 208016, India
deb@iitk.ac.in
Department of Business Technology, Helsinki School of Economics
PO Box 1210, 00101 Helsinki, Finland
Kalyanmoy.Deb@hse.fi

Matthias Ehrgott
Department of Engineering Science, The University of Auckland,
Private Bag 92019, Auckland 1142, New Zealand
m.ehrgott@auckland.ac.nz

Petri Eskelinen
Helsinki School of Economics
P.O. Box 1210, FI-00101 Helsinki, Finland
Petri.Eskelinen@hse.fi

José Rui Figueira
CEG-IST, Center for Management Studies, Instituto Superior Técnico,
Technical University of Lisbon, Portugal
figueira@ist.utl.pt

Mathias Göbelt
SAP AG, Walldorf, Germany
mathias.goebelt@sap.com

Salvatore Greco
Faculty of Economics, University of Catania, Corso Italia, 55,
95129 Catania, Italy
salgreco@unict.it

Hisao Ishibuchi
Department of Computer Science and Intelligent Systems,
Osaka Prefecture University
Osaka 599-8531, Japan
hisaoi@cs.osakafu-u.ac.jp

Johannes Jahn
Department of Mathematics, University of Erlangen-Nürnberg
Martensstrasse 3, 91058 Erlangen, Germany
jahn@am.uni-erlangen.de

Andrzej Jaszkiewicz
Poznan University of Technology, Institute of Computing Science
jaszkiewicz@cs.put.poznan.pl

Yaochu Jin
Honda Research Institute Europe, 63073 Offenbach, Germany
Yaochu.Jin@honda-ri.de

Joshua Knowles
School of Computer Science, University of Manchester, Oxford Road,
Manchester M13 9PL, UK
j.knowles@manchester.ac.uk

Pekka Korhonen
Helsinki School of Economics, Department of Business Technology,
P.O. Box 1210, FI-00101 Helsinki, Finland

Alexander V. Lotov
Dorodnicyn Computing Centre of Russian Academy of Sciences
Vavilova str. 40, Moscow 119333, Russia
lotov08@ccas.ru

Benedetto Matarazzo
Faculty of Economics, University of Catania, Corso Italia, 55,
95129 Catania, Italy
matarazz@unict.it

Kaisa Miettinen
Department of Mathematical Information Technology,
P.O. Box 35 (Agora), FI-40014 University of Jyväskylä, Finland
(in 2007 also Helsinki School of Economics, Helsinki, Finland)
kaisa.miettinen@jyu.fi

Julián Molina
Department of Applied Economics (Mathematics), University of Málaga,
Calle Ejido 6, E-29071 Málaga, Spain
julian.molina@uma.es

Sanaz Mostaghim
Institute AIFB, University of Karlsruhe
76128 Karlsruhe, Germany
mostaghim@aifb.uni-karlsruhe.de

Vincent Mousseau
LAMSADE, Université Paris-Dauphine, Paris, France
mousseau@lamsade.dauphine.fr

Hirotaka Nakayama
Konan University, Dept. of Information Science and Systems Engineering,
8-9-1 Okamoto, Higashinada, Kobe 658-8501, Japan
nakayama@konan-u.ac.jp

Wlodzimierz Ogryczak
Institute of Control & Computation Engineering, Faculty of Electronics
& Information Technology, Warsaw University of Technology
ul. Nowowiejska 15/19, 00-665 Warsaw, Poland
w.ogryczak@ia.pw.edu.pl

Tatsuya Okabe
Honda Research Institute Japan Co., Ltd.
8-1 Honcho, Wako-City, Saitama, 351-0188, Japan
okabe@jp.honda-ri.com

Silvia Poles
ESTECO - Research Labs
Via Giambellino, 7 35129 Padova, Italy
silvia.poles@esteco.com

Günter Rudolph
Computational Intelligence Research Group, Chair of Algorithm Engineering
(LS XI), Department of Computer Science, University of Dortmund
44227 Dortmund, Germany
Guenter.Rudolph@uni-dortmund.de

Francisco Ruiz
Department of Applied Economics (Mathematics), University of Málaga
Calle Ejido 6, E-29071 Málaga, Spain
rua@uma.es

Daisuke Sasaki
CFD Laboratory, Department of Engineering, University of Cambridge
Trumpington Street, Cambridge CB2 1PZ, UK
ds432@eng.cam.ac.uk

Koji Shimoyama
Institute of Fluid Science, Tohoku University
2-1-1 Katahira, Aoba-ku, Sendai, 980-8577, Japan
shimoyama@edge.ifs.tohoku.ac.jp

Roman Słowiński
Institute of Computing Science, Poznań University of Technology,
60-965 Poznań, and Systems Research Institute, Polish Academy of Sciences
00-441 Warsaw, Poland
roman.slowinski@cs.put.poznan.pl

Danilo Di Stefano
Esteco Research Labs, 35129 Padova, Italy
danilo.distefano@esteco.com

Theodor Stewart
University of Cape Town, Rondebosch 7701, South Africa
theodor.stewart@uct.ac.za

El-Ghazali Talbi
Laboratoire d'Informatique Fondamentale de Lille
Université des Sciences et Technologies de Lille
59655 - Villeneuve d'Ascq cedex, France
talbi@lifl.fr

Lothar Thiele
Computer Engineering and Networks Laboratory (TIK)
Department of Electrical Engineering and Information Technology
ETH Zurich, Switzerland
thiele@tik.ee.ethz.ch

Mariana Vassileva
Institute of Information Technologies, Bulgarian Academy of Sciences,
Bulgaria
mvassileva@iinf.bas.bg

Rudolf Vetschera
Department of Business Administration, University of Vienna
Brünnerstrasse 72, 1210 Wien, Austria
rudolf.vetschera@univie.ac.at

Jyrki Wallenius
Helsinki School of Economics, Department of Business Technology,
P.O. Box 1210, FI-00101 Helsinki, Finland
jyrki.wallenius@hse.fi

Andrzej P. Wierzbicki
21st Century COE Program: Technology Creation Based on Knowledge Science,
JAIST (Japan Advanced Institute of Science and Technology), Asahidai 1-1,
Nomi, Ishikawa 923-1292, Japan and National Institute of
Telecommunications, Szachowa Str. 1, 04-894 Warsaw, Poland
andrzej@jaist.ac.jp

Eckart Zitzler
Computer Engineering and Networks Laboratory (TIK)
Department of Electrical Engineering and Information Technology
ETH Zurich, Switzerland
eckart.zitzler@tik.ee.ethz.ch

Table of Contents
Basics on Multiobjective Optimization

1 Introduction to Multiobjective Optimization: Noninteractive Approaches
Kaisa Miettinen
2 Introduction to Multiobjective Optimization: Interactive Approaches
Kaisa Miettinen, Francisco Ruiz, and Andrzej P. Wierzbicki
3 Introduction to Evolutionary Multiobjective Optimization
Kalyanmoy Deb

Recent Interactive and Preference-Based Approaches

4 Interactive Multiobjective Optimization Using a Set of Additive Value Functions
José Rui Figueira, Salvatore Greco, Vincent Mousseau, and Roman Słowiński
5 Dominance-Based Rough Set Approach to Interactive Multiobjective Optimization
Salvatore Greco, Benedetto Matarazzo, and Roman Słowiński
6 Consideration of Partial User Preferences in Evolutionary Multiobjective Optimization
Jürgen Branke
7 Interactive Multiobjective Evolutionary Algorithms
Andrzej Jaszkiewicz and Jürgen Branke

Visualization of Solutions

8 Visualization in the Multiple Objective Decision-Making Framework
Pekka Korhonen and Jyrki Wallenius
9 Visualizing the Pareto Frontier
Alexander V. Lotov and Kaisa Miettinen

Modelling, Implementation and Applications

10 Meta-Modeling in Multiobjective Optimization
Joshua Knowles and Hirotaka Nakayama
11 Real-World Applications of Multiobjective Optimization
Theodor Stewart, Oliver Bandte, Heinrich Braun, Nirupam Chakraborti,
Matthias Ehrgott, Mathias Göbelt, Yaochu Jin, Hirotaka Nakayama,
Silvia Poles, and Danilo Di Stefano
12 Multiobjective Optimization Software
Silvia Poles, Mariana Vassileva, and Daisuke Sasaki
13 Parallel Approaches for Multiobjective Optimization
El-Ghazali Talbi, Sanaz Mostaghim, Tatsuya Okabe, Hisao Ishibuchi,
Günter Rudolph, and Carlos A. Coello Coello

Quality Assessment, Learning, and Future Challenges

14 Quality Assessment of Pareto Set Approximations
Eckart Zitzler, Joshua Knowles, and Lothar Thiele
15 Interactive Multiobjective Optimization from a Learning Perspective
Valerie Belton, Jürgen Branke, Petri Eskelinen, Salvatore Greco,
Julián Molina, Francisco Ruiz, and Roman Słowiński
16 Future Challenges
Kaisa Miettinen, Kalyanmoy Deb, Johannes Jahn, Wlodzimierz Ogryczak,
Koji Shimoyama, and Rudolf Vetschera

Index

1
Introduction to Multiobjective Optimization:
Noninteractive Approaches
Kaisa Miettinen
Department of Mathematical Information Technology
P.O. Box 35 (Agora), FI-40014 University of Jyväskylä, Finland
kaisa.miettinen@jyu.fi
Abstract. We give an introduction to nonlinear multiobjective optimization by
covering some basic concepts as well as outlines of some methods. Because Pareto
optimal solutions cannot be ordered completely, we need extra preference
information coming from a decision maker to be able to select the most preferred
solution for a problem involving multiple conflicting objectives. Multiobjective
optimization methods are often classified according to the role of a decision
maker in the solution process. In this chapter, we concentrate on noninteractive
methods where the decision maker either is not involved or specifies preference
information before or after the actual solution process. In other words, the
decision maker is not assumed to devote too much time to the solution process.
1.1 Introduction
Many decision and planning problems involve multiple conflicting objectives
that should be considered simultaneously (alternatively, we can talk about
multiple conflicting criteria). Such problems are generally known as multiple
criteria decision making (MCDM) problems. We can classify MCDM problems
in many ways depending on the characteristics of the problem in question. For
example, we talk about multiattribute decision analysis if we have a discrete,
predefined set of alternatives to be considered. Here we study multiobjective
optimization (also known as multiobjective mathematical programming)
where the set of feasible solutions is not explicitly known in advance but it is
restricted by constraint functions. Because of the aims and scope of this book,
we concentrate on nonlinear multiobjective optimization (where at least one
function in the problem formulation is nonlinear) and ignore approaches
designed only for multiobjective linear programming (MOLP) problems (where
all the functions are linear).
In 2007 also Helsinki School of Economics, Helsinki, Finland

Reviewed by: Nirupam Chakraborti, Indian Institute of Technology, India
Hirotaka Nakayama, Konan University, Japan
Roman Słowiński, Poznan University of Technology, Poland
J. Branke et al. (Eds.): Multiobjective Optimization, LNCS 5252, pp. 1–26, 2008.
© Springer-Verlag Berlin Heidelberg 2008
2 K. Miettinen
In multiobjective optimization problems, it is characteristic that no unique
solution exists but a set of mathematically equally good solutions can be
identified. These solutions are known as nondominated, efficient, noninferior or
Pareto optimal solutions (defined in Preface). In the MCDM literature, these
terms are usually seen as synonyms. Multiobjective optimization problems
have been intensively studied for several decades and the research is based
on the theoretical background laid, for example, in (Edgeworth, 1881;
Koopmans, 1951; Kuhn and Tucker, 1951; Pareto, 1896, 1906). As a matter of fact,
many ideas and approaches have their foundation in the theory of mathematical
programming. For example, while formulating optimality conditions of
nonlinear programming, Kuhn and Tucker (1951) did also formulate them for
multiobjective optimization problems.
Typically, in the MCDM literature, the idea of solving a multiobjective
optimization problem is understood as helping a human decision maker (DM)
in considering the multiple objectives simultaneously and in finding a Pareto
optimal solution that pleases him/her the most. Thus, the solution process
needs some involvement of the DM in the form of specifying preference
information and the final solution is determined by his/her preferences in one
way or the other. In other words, a more or less explicit preference model is
built from preference information and this model is exploited in order to find
solutions that better fit the DM's preferences. Here we assume that a single
DM is involved. Group decision making with several DMs is discussed, e.g.,
in (Hwang and Lin, 1987; Fandel, 1990).
In general, the DM is a person who is assumed to know the problem
considered and be able to provide preference information related to the objectives
and/or different solutions in some form. Besides a DM, we usually also need
an analyst when solving a multiobjective optimization problem. An analyst
is a person or a computer program responsible for the mathematical modelling
and computing sides of the solution process. The analyst is supposed to help
the DM at various stages of the solution process, in particular, in eliciting
preference information and in interpreting the information coming from the
computations (see also Chapter 15).
We can list several desirable properties of multiobjective optimization
methods. Among them are, for example, that the method should generate
Pareto optimal solutions reliably, it should help the DM to get an overview of
the set of Pareto optimal solutions, it should not require too much time from
the DM, the information exchanged (given by the method and asked from the
DM) should be understandable and not too demanding or complicated
(cognitively or otherwise) and the method should support the DM in finding the
most preferred solution as the final one so that the DM could be convinced
of its relative goodness. The last-mentioned aim could be characterized as
psychological convergence (differing from mathematical convergence which is
emphasized in mathematical programming).
Surveys of methods developed for multiobjective optimization problems
include (Chankong and Haimes, 1983; Hwang and Masud, 1979; Marler and
Arora, 2004; Miettinen, 1999; Sawaragi et al., 1985; Steuer, 1986; Vincke,
1992). For example, in (Hwang and Masud, 1979; Miettinen, 1999), the
methods are classified into the four following classes according to the role of the
DM in the solution process. Sometimes, there is no DM and her/his preference
information available and in those cases we must use so-called no-preference
methods. Then, the task is to find some neutral compromise solution
without any additional preference information. This means that instead of asking
the DM for preference information, some assumptions are made about what
a reasonable compromise could be like. In all the other classes, the DM is
assumed to take part in the solution process.
In a priori methods, the DM first articulates preference information and
one's aspirations and then the solution process tries to find a Pareto optimal
solution satisfying them as well as possible. This is a straightforward approach
but the difficulty is that the DM does not necessarily know the possibilities
and limitations of the problem beforehand and may have too optimistic or
pessimistic expectations. Alternatively, it is possible to use a posteriori methods,
where a representation of the set of Pareto optimal solutions is first
generated and then the DM is supposed to select the most preferred one among
them. This approach gives the DM an overview of different solutions available
but if there are more than two objectives in the problem, it may be difficult
for the DM to analyze the large amount of information (because visualizing
the solutions is no longer as straightforward as in a biobjective case) and,
on the other hand, generating the set of Pareto optimal solutions may be
computationally expensive. Typically, evolutionary multiobjective
optimization algorithms (see Chapter 3) belong to this class but, when using them, it
may happen that the real Pareto optimal set is not reached. This means that
the solutions produced are nondominated in the current population but not
necessarily actually Pareto optimal (if, e.g., the search is stopped too early).
In this chapter, we concentrate on the three classes of noninteractive meth-
ods where either no DM takes part in the solution process or (s)he expresses
preference relations before or after the process. The fourth class devoted to
interactive methods is the most extensive class of methods and it will be cov-
ered in Chapter 2. In interactive approaches, an iterative solution algorithm
(which can be called a solution pattern) is formed and repeated (typically
several times). After each iteration, some information is given to the DM and
(s)he is asked to specify preference information (in the form that the method
in question can utilize, e.g., by answering some questions). One can say that
the analyst aims at determining the preference structure of the DM in an
interactive way. What is noteworthy is that the DM can specify and adjust
one's preferences between each iteration and at the same time learn about the
interdependencies in the problem as well as about one's own preferences.
Methods in different classes have their strengths and weaknesses and for
that reason different approaches are needed. Let us point out that the
classification we use here is not complete or absolute. Overlapping and combinations
of classes are possible and some methods can belong to more than one class
depending on different interpretations. Other classifications are given, for
example, by Cohon (1985); Rosenthal (1985).
The rest of this chapter is organized as follows. In Section 1.2, we augment
the basic terminology and notation introduced in Preface. In other words, we
discuss some more concepts of multiobjective optimization including
optimality and elements of a solution process. After that we introduce two widely used
basic methods, the weighting method and the ε-constraint method in Section
1.3. Sections 1.4–1.6 are devoted to some methods belonging to the three
above-described classes, that is, no-preference methods, a posteriori methods
and a priori methods, respectively. We also give references to further details.
In Section 1.7, we summarize some properties of the methods described and,
finally, we conclude with Section 1.8.
1.2 Some Concepts

1.2.1 Optimality
Continuous multiobjective optimization problems typically have an infinite
number of Pareto optimal solutions (whereas combinatorial multiobjective
optimization problems have a finite but possibly very large number of Pareto
optimal solutions) and the Pareto optimal set (consisting of the Pareto optimal
solutions) can be nonconvex and disconnected. Because the basic terminology
and concepts of multiobjective optimization were defined in Preface, we do
not repeat them here. However, it is important to note that the definitions
of Pareto optimality and weak Pareto optimality (given in Preface) introduce
global Pareto optimality and global weak Pareto optimality. Corresponding to
nonlinear programming, we can also define local (weak) Pareto optimality in
a small environment of the point considered. Let us emphasize that a locally
Pareto optimal objective vector has no practical relevance (if it is not global)
because it may be located in the interior of the feasible objective region (i.e.,
it is possible to improve all objective function values) whereas globally Pareto
optimal solutions are always located on its boundary. Thus, it is important to
use appropriate tools to get globally Pareto optimal solutions. We shall get
back to this when we discuss scalarizing functions.
Naturally, any globally Pareto optimal solution is locally Pareto optimal.
The converse is valid for convex problems, see, for example, (Miettinen, 1999).
A multiobjective optimization problem can be defined to be convex if the
feasible objective region is convex or if the feasible region is convex and the
objective functions are quasiconvex with at least one strictly quasiconvex function.
Before we continue, it is important to briefly touch on the existence of Pareto
optimal solutions. It is shown in (Sawaragi et al., 1985) that Pareto optimal
solutions exist if we assume that the (nonempty) feasible region is compact
and all the objective functions are lower semicontinuous. Alternatively, we
can formulate the assumption in the form that the feasible objective region is
nonempty and compact. We do not go into details of theoretical foundations
here but assume in what follows that Pareto optimal solutions exist. Another
important question besides the existence of Pareto optimal solutions is the
stability of the Pareto optimal set with respect to perturbations of the feasible
region, objective functions or domination structures of the DM. This topic
is extensively discussed in (Sawaragi et al., 1985) and it is also touched on in
Chapter 9. Let us mention that sometimes, as by Steuer (1986), Pareto
optimal decision vectors are referred to as efficient solutions and the term
nondominated solution is used for Pareto optimal objective vectors.
If the problem is correctly specified, the final solution of a rational DM
is always Pareto optimal. Thus, we can restrict our consideration to Pareto
optimal solutions. For that reason, it is important that the multiobjective
optimization method used can meet the following two needs: firstly, it must be
able to cover, that is, find any Pareto optimal solution and, secondly, generate
only Pareto optimal solutions (Sawaragi et al., 1985). However, weakly Pareto
optimal solutions are often relevant from a technical point of view because they
are sometimes easier to generate than Pareto optimal ones.
One more widely used optimality concept is proper Pareto optimality.
The properly Pareto optimal set is a subset of the Pareto optimal set which
is a subset of the weakly Pareto optimal set. For an example of these three
concepts of optimality and their relationships, see Figure 1.1. In the figure,
the set of weakly Pareto optimal solutions is denoted by a bold line. The
endpoints of the Pareto optimal set are denoted by circles and the endpoints
of the properly Pareto optimal set by short lines (note that the sets can also
be disconnected).
[Figure 1.1 not reproduced; axes z1 and z2, with labels z* and z* - R^2_+.]
Fig. 1.1. Sets of properly, weakly and Pareto optimal solutions.

As a matter of fact, Pareto optimal solutions can be divided into
improperly and properly Pareto optimal ones depending on whether unbounded
trade-offs between objectives are allowed or not. Practically, a properly Pareto
optimal solution with a very high trade-off does not essentially differ from a
weakly Pareto optimal solution for a human DM. There are several definitions
for proper Pareto optimality and they are not equivalent. The first definition
was given by Kuhn and Tucker (1951) while they formulated optimality
conditions for multiobjective optimization. Some of the definitions are collected, for
example, in (Miettinen, 1999) and relationships between different definitions
are analyzed in (Sawaragi et al., 1985; Makarov and Rachkovski, 1999).
The idea of proper Pareto optimality is easily understandable in the
definition of Geoffrion (1968): A decision vector $x^* \in S$ is properly Pareto optimal
(in the sense of Geoffrion) if it is Pareto optimal and if there is some real
number $M$ such that for each $f_i$ and each $x \in S$ satisfying $f_i(x) < f_i(x^*)$
there exists at least one $f_j$ such that $f_j(x^*) < f_j(x)$ and
$$\frac{f_i(x^*) - f_i(x)}{f_j(x) - f_j(x^*)} \le M.$$
An objective vector is properly Pareto optimal if the corresponding decision
vector is properly Pareto optimal. We can see from the definition that a
solution is properly Pareto optimal if there is at least one pair of objectives for
which a finite decrement in one objective is possible only at the expense of
some reasonable increment in the other objective.
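To make the role of the bound M concrete, the following minimal Python sketch evaluates Geoffrion's condition over a finite set of objective vectors. The function name `geoffrion_bound` and the sample vectors are illustrative assumptions, not part of the chapter. Note that on a finite set every Pareto optimal point is also properly Pareto optimal, so the sketch only shows how the trade-off ratios are formed.

```python
import math
from typing import List, Sequence


def geoffrion_bound(z_star: Sequence[float], others: List[Sequence[float]]) -> float:
    """Smallest M satisfying Geoffrion's condition for z_star against a finite set.

    Returns math.inf if some vector improves an objective without worsening
    any other, i.e. if z_star is not Pareto optimal in the set.
    All objectives are assumed to be minimized.
    """
    worst = 0.0
    k = len(z_star)
    for z in others:
        for i in range(k):
            gain = z_star[i] - z[i]  # improvement in objective i when moving to z
            if gain <= 0:
                continue
            # Trade-off ratios against every objective j that gets worse.
            ratios = [gain / (z[j] - z_star[j]) for j in range(k) if z[j] > z_star[j]]
            if not ratios:
                return math.inf
            worst = max(worst, min(ratios))
    return worst


# Hypothetical objective vectors (both objectives minimized).
bound = geoffrion_bound((2.0, 3.0), [(1.0, 5.0), (4.0, 1.0)])
print(bound)
```

In this toy case the largest required trade-off is 1.0: moving to (4.0, 1.0) gains two units in the second objective at the cost of two units in the first.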
Let us point out that optimality can be defined in more general ways (than
above) with the help of ordering cones (pointed convex cones) $D$ defined in
$\mathbb{R}^k$. The cone $D$ can be used to induce a partial ordering in $Z$. In other words,
for two objective vectors $z$ and $z'$ we can say that $z'$ dominates $z$ if
$$z \in z' + D \setminus \{0\}.$$
Now we can say that a feasible decision vector is efficient and the
corresponding objective vector is nondominated with respect to $D$ if there exists no
other feasible objective vector that dominates it. This definition is equivalent
to Pareto optimality if we set
$$D = \mathbb{R}^k_+ = \{ z \in \mathbb{R}^k \mid z_i \ge 0 \text{ for } i = 1, \dots, k \},$$
that is, $D$ is the nonnegative orthant of $\mathbb{R}^k$. For further details of ordering
cones and different spaces we refer, for example, to (Jahn, 2004; Luc, 1989)
and references therein.
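For the special case where the ordering cone is the nonnegative orthant, the dominance relation can be checked directly on a finite set of objective vectors. The sketch below is a minimal illustration under that assumption; the helper names `dominates` and `nondominated` and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(z1: Sequence[float], z2: Sequence[float]) -> bool:
    """True if z1 dominates z2 (minimization): z2 - z1 is in the nonnegative
    orthant and is not the zero vector."""
    diff = [b - a for a, b in zip(z1, z2)]
    return all(d >= 0 for d in diff) and any(d > 0 for d in diff)


def nondominated(points: List[Vector]) -> List[Vector]:
    """Keep the vectors that no other vector in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (both objectives minimized).
Z = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.5)]
front = nondominated(Z)
print(front)
```

Here (3.0, 4.0) and (2.0, 3.5) are removed because (2.0, 3.0) dominates both. The pairwise filter needs a quadratic number of comparisons, which is acceptable for small illustrative sets only.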
As said, we can give an equivalent formulation to the definition of Pareto
optimality (given in Preface) as follows: A feasible decision vector $x^* \in S$ and
the corresponding objective vector $z^* = f(x^*) \in Z$ are Pareto optimal if
$$(z^* - \mathbb{R}^k_+ \setminus \{0\}) \cap Z = \emptyset.$$
For a visualization of this, see Figure 1.1, where a shifted cone at $z^*$ is
illustrated. This definition clearly shows why Pareto optimal objective vectors
must be located on the boundary of the feasible objective region $Z$. After
having introduced the definition of Pareto optimality in this form, we can give
another definition for proper Pareto optimality. This definition (introduced
by Wierzbicki (1986)) is both computationally usable and intuitive.
The above-defined vectors $x^* \in S$ and $z^* \in Z$ are $\varepsilon$-properly Pareto
optimal if
$$(z^* - \mathbb{R}^k_\varepsilon \setminus \{0\}) \cap Z = \emptyset,$$
where $\mathbb{R}^k_\varepsilon$ is a slightly broader cone than $\mathbb{R}^k_+$. Now, trade-offs are bounded
by $\varepsilon$ and $1/\varepsilon$ and we have a relationship to $M$ used in Geoffrion's definition
as $M = 1 + 1/\varepsilon$. For details, see, for example (Miettinen, 1999; Wierzbicki,
1986).
1.2.2 Solution Process and Some Elements in It
Mathematically, we cannot order Pareto optimal objective vectors because the
objective space is only partially ordered. However, it is generally desirable to
obtain one point as a final solution to be implemented and this solution should
satisfy the preferences of the particular DM. Finding a solution to problem (1)
defined in Preface is called a solution process. As mentioned earlier, it usually
involves co-operation of the DM and an analyst. The analyst is supposed to
know the specifics of the methods used and help the DM at various stages
of the solution process. It is important to emphasize that the DM is not
assumed to know MCDM or methods available but (s)he is supposed to be an
expert in the problem domain, that is, understand the application considered.
Sometimes, finding the set of Pareto optimal solutions is referred to as vector
optimization. However, here by solving a multiobjective optimization problem
we mean finding a feasible and Pareto optimal decision vector that satisfies
the DM. Assuming such a solution exists, it is called a final solution.
The concepts of ideal and nadir objective vectors were defined in Preface
for getting information about the ranges of the objective function values in
the Pareto optimal set, provided the objective functions are bounded over
the feasible region. As mentioned then, there is no constructive method for
calculating the nadir objective vector for nonlinear problems. A payoff table
(suggested by Benayoun et al. (1971)) is often used but it is not a reliable way
as demonstrated, for example, by Korhonen et al. (1997); Weistroffer (1985).
The payoff table has k objective vectors as its rows where objective function
values are calculated at points optimizing each objective function individually.
In other words, components of the ideal objective vector are located on the
diagonal of the payoff table. An estimate of the nadir objective vector is
obtained by finding the worst objective values in each column. This method
gives accurate information only in the case of two objectives. Otherwise, it
may be an over- or an underestimation (because of alternative optima, see,
e.g., (Miettinen, 1999) for details). Let us mention that the nadir objective
vector can also be estimated using evolutionary algorithms (Deb et al., 2006).
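The payoff table construction can be sketched on a small, hypothetical example with three objectives. The four objective vectors below are invented for illustration and are mutually nondominated, so they are treated as the whole Pareto optimal set; the function name `payoff_table` is likewise an assumption. The sketch shows a case where the column-wise worst values underestimate the true nadir value of the first objective.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def payoff_table(points: List[Vector]) -> List[Vector]:
    """Row i is the objective vector of a point minimizing objective i.

    With alternative optima, min() simply returns the first minimizer found,
    which is one source of unreliability of the resulting nadir estimate.
    """
    k = len(points[0])
    return [min(points, key=lambda z, i=i: z[i]) for i in range(k)]


# Hypothetical Pareto optimal objective vectors (all objectives minimized).
pareto_set = [(0.0, 4.0, 4.0), (4.0, 0.0, 4.0), (4.0, 4.0, 0.0), (5.0, 1.0, 1.0)]

table = payoff_table(pareto_set)
k = len(table)
ideal = tuple(table[i][i] for i in range(k))                            # diagonal
nadir_estimate = tuple(max(row[j] for row in table) for j in range(k))  # column maxima
true_nadir = tuple(max(z[j] for z in pareto_set) for j in range(k))

print(ideal, nadir_estimate, true_nadir)
```

The estimate for the first objective is 4.0 whereas the worst value over the Pareto optimal set is 5.0, because the vector (5.0, 1.0, 1.0) does not minimize any single objective and therefore never enters the table.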
Multiobjective optimization problems are usually solved by scalarization.
Scalarization means that the problem involving multiple objectives is con-
verted into an optimization problem with a single objective function or a
family of such problems. Because this new problem has a real-valued objec-
tive function (that possibly depends on some parameters coming, e.g., from
preference information), it can be solved using appropriate single objective
optimizers. The real-valued objective function is often referred to as a
scalarizing function and, as discussed earlier, it is justified to use such scalarizing
functions that can be proven to generate Pareto optimal solutions. (However,
sometimes it may be computationally easier to generate weakly Pareto
optimal solutions.) Depending on whether a local or a global solver is used,
we get either locally or globally Pareto optimal solutions (if the problem is
not convex). As discussed earlier, locally Pareto optimal objective vectors are
not of interest and, thus, we must pay attention that an appropriate solver
is used. We must also keep in mind that when using numerical optimization
methods, the solutions obtained are not necessarily optimal in practice (e.g.,
if the method used does not converge properly or if the global solver fails in
finding the global optimum).
It is sometimes assumed that the DM makes decisions on the basis of an
underlying function. This function representing the preferences of the DM
is called a value function $v : \mathbb{R}^k \to \mathbb{R}$ (Keeney and Raiffa, 1976). In some
methods, the value function is assumed to be known implicitly and it has
been important in the development of solution methods and as a theoretical
background. A utility function is often used as a synonym for a value function
but we reserve that concept for stochastic problems which are not treated
here. The value function is assumed to be non-increasing with the increase
of objective values because we here assume that all objective functions are
to be minimized, while the value function is to be maximized. This means
that the preference of the DM will not decrease but will rather increase if the
value of an objective function decreases, while all the other objective values
remain unchanged (i.e., less is preferred to more). In this case, the solution
maximizing v can be proven to be Pareto optimal. Regardless of the existence
of a value function, it is usually assumed that less is preferred to more by
the DM.
Instead of as a maximum of a value function, a final solution can be
understood as a satisficing one. Satisficing decision making means that the DM
does not intend to maximize any value function but tries to achieve certain
aspirations (Sawaragi et al., 1985). A Pareto optimal solution which satisfies all
the aspirations of the DM is called a satisficing solution. In some rare cases,
DMs may regard solutions satisficing even if they are not Pareto optimal.
This may, for example, mean that not all relevant objectives are explicitly
expressed. However, here we assume DMs to be rational and concentrate on
Pareto optimal solutions.
Not only value functions but, in general, any preference model of a DM
may be explicit or implicit in multiobjective optimization methods.
Examples of local preference models include aspiration levels and different distance
measures. During solution processes, various kinds of information can be
solicited from the DM. Aspiration levels $\bar{z}_i$ ($i = 1, \dots, k$) are such desirable or
acceptable levels in the objective function values that are of special interest
and importance to the DM. The vector $\bar{z} \in \mathbb{R}^k$ consisting of aspiration levels
is called a reference point.
According to the definition of Pareto optimality, moving from one Pareto
optimal solution to another necessitates trading off. This is one of the
basic concepts in multiobjective optimization. A trade-off reflects the ratio of
change in the values of the objective functions concerning the increment of one
objective function that occurs when the value of some other objective
function decreases. For details, see, e.g., (Chankong and Haimes, 1983; Miettinen,
1999) and Chapters 2 and 9.
As mentioned earlier, it is sometimes easier to generate weakly Pareto
optimal solutions than Pareto optimal ones (because some scalarizing functions
produce weakly Pareto optimal solutions). There are different ways to get
solutions that can be proven to be Pareto optimal. Benson (1978) has suggested
checking the Pareto optimality of the decision vector $x^* \in S$ by solving the
problem
$$
\begin{array}{ll}
\text{maximize} & \sum_{i=1}^{k} \varepsilon_i \\
\text{subject to} & f_i(x) + \varepsilon_i = f_i(x^*) \text{ for all } i = 1, \dots, k, \\
& \varepsilon_i \ge 0 \text{ for all } i = 1, \dots, k, \\
& x \in S,
\end{array}
\qquad (1.1)
$$
where both $x \in \mathbb{R}^n$ and $\varepsilon \in \mathbb{R}^k_+$ are variables. If the optimal objective
function value of (1.1) is zero, then $x^*$ can be proven to be Pareto optimal and
if the optimal objective function value is finite and nonzero corresponding to
a decision vector $x'$, then $x'$ is Pareto optimal. Note that the equality
constraints in (1.1) can be replaced by inequalities $f_i(x) + \varepsilon_i \le f_i(x^*)$. However,
we must point out that problem (1.1) is computationally badly conditioned
because it has only one feasible solution ($\varepsilon_i = 0$ for each $i$) if $x^*$ is Pareto
optimal and computational difficulties must be handled in practice, for example,
using penalty functions. We shall introduce other ways to guarantee Pareto
optimality in what follows in connection with some scalarizing functions.
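When the feasible set is finite, problem (1.1) can be solved by plain enumeration, which gives a simple way to see how the test behaves. The following minimal sketch makes that simplifying assumption and works directly with objective vectors instead of decision vectors; the function name `benson_test` and the data are hypothetical, and the tested vector is assumed to belong to the feasible set.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def benson_test(z_star: Sequence[float], feasible: List[Vector]) -> Tuple[float, Vector]:
    """Solve problem (1.1) by enumeration over a finite set of objective vectors.

    Returns the optimal value of the sum of the slack variables and a vector
    attaining it. A value of zero means z_star is Pareto optimal in the set;
    otherwise the returned vector is Pareto optimal and dominates z_star.
    """
    best_value, best_point = 0.0, tuple(z_star)
    for z in feasible:
        slack = [s - v for s, v in zip(z_star, z)]  # slack_i = f_i(x*) - f_i(x)
        if all(e >= 0 for e in slack) and sum(slack) > best_value:
            best_value, best_point = sum(slack), tuple(z)
    return best_value, best_point


# Hypothetical objective vectors (both objectives minimized).
Z = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.5)]

print(benson_test((3.0, 4.0), Z))  # a dominated vector
print(benson_test((2.0, 3.0), Z))  # a Pareto optimal vector
```

For the dominated vector (3.0, 4.0) the optimal value is 2.0 and is attained at (2.0, 3.0); for (2.0, 3.0) itself the only feasible slack is zero, which mirrors the remark above that the problem has a single feasible solution at a Pareto optimal point.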
Let us point out that in this chapter we do not concentrate on the theory
behind multiobjective optimization, necessary and sufficient optimality
conditions, duality results, etc. Instead, we refer, for example, to (Jahn, 2004; Luc,
1989; Miettinen, 1999; Sawaragi et al., 1985) and references therein.
In the following sections, we briefly describe some methods for solving
multiobjective optimization problems. We introduce several philosophies and
ways of approaching the problem. As mentioned in the introduction, we
concentrate on the classes devoted to no-preference methods, a posteriori methods
and a priori methods and recall that overlapping and combinations of classes
are possible because no classification can fully cover the plethora of existing
methods.
Methods in each class have their strengths and weaknesses and selecting
a method to be used should be based on the desires and abilities of the DM
as well as properties of the problem in question. Naturally, an analyst plays
a crucial role when selecting a method because (s)he is supposed to know
the properties of different methods available. Her/his recommendation should
fit the needs and the psychological profile of the DM in question. In
different methods, different types of information are given to the DM, the DM
is assumed to specify preference information in different ways and different
scalarizing functions are used. Besides the references given in each section,
further details about the methods to be described, including proofs of theorems
related to optimality, can be found in (Miettinen, 1999).
1.3 Basic Methods

Before we concentrate on the three classes of methods described in the
introduction, we first discuss two well-known methods that can be called basic
methods because they are so widely used. Actually, in many applications one
can see them being used without necessarily recognizing them as
multiobjective optimization methods. In other words, the difference between a modelling
and an optimization phase is often blurred and these methods are used in
order to convert the problem into a form where one objective function can be
optimized with single objective solvers available. The reason for this may be
that methods of single objective optimization are more widely known than those
of multiobjective optimization. One can say that these two basic methods are
the ones that first come to one's mind if there is a need to optimize multiple
objectives simultaneously. Here we consider their strengths and weaknesses
(which the users of these methods are not necessarily aware of) as well as
show that many other (more advanced) approaches exist.
1.3.1 Weighting Method
In the weighting method (see, e.g., (Gass and Saaty, 1955; Zadeh, 1963)), we
solve the problem
$$
\begin{array}{ll}
\text{minimize} & \sum_{i=1}^{k} w_i f_i(x) \\
\text{subject to} & x \in S,
\end{array}
\qquad (1.2)
$$
where $w_i \ge 0$ for all $i = 1, \dots, k$ and, typically, $\sum_{i=1}^{k} w_i = 1$. The solution
of (1.2) can be proven to be weakly Pareto optimal and, furthermore, Pareto
optimal if we have $w_i > 0$ for all $i = 1, \dots, k$ or if the solution is unique (see,
e.g., (Miettinen, 1999)).
The weighting method can be used as an a posteriori method so that
different weights are used to generate different Pareto optimal solutions and
then the DM is asked to select the most satisfactory one. Alternatively, the
DM can be asked to specify the weights in which case the method is used as
an a priori method.
As mentioned earlier, it is important in multiobjective optimization that
Pareto optimal solutions are generated and that any Pareto optimal solution
can be found. In this respect, the weighting method has a serious shortcoming.
It can be proven that any Pareto optimal solution can be found by altering the
weights only if the problem is convex. Thus, it may happen that some Pareto
optimal solutions of nonconvex problems cannot be found no matter how the
weights are selected. (Conditions under which the whole Pareto optimal set
can be generated by the weighting method with positive weights are presented
in (Censor, 1977).) Even though linear problems are not considered here, we
should point out that despite MOLP problems being convex, the weighting
method may not behave as expected even when solving them. This is because,
when altering the weights, the method may jump from one vertex to another
leaving intermediate solutions undetected. This is explained by the fact that
linear solvers typically produce vertex solutions.
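The shortcoming in nonconvex problems can be reproduced with a tiny, hypothetical example. In the sketch below, three objective vectors are mutually nondominated, but the middle one lies above the line joining the other two, so no weight vector selects it. The data and the helper name `weighted_sum_argmin` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def weighted_sum_argmin(points: List[Vector], weights: Sequence[float]) -> Vector:
    """Return the objective vector minimizing the weighted sum of objectives."""
    return min(points, key=lambda z: sum(w * v for w, v in zip(weights, z)))


# Hypothetical Pareto optimal objective vectors of a nonconvex problem.
points = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]

found = set()
for step in range(101):  # sweep w1 over 0.00, 0.01, ..., 1.00
    w1 = step / 100
    found.add(weighted_sum_argmin(points, (w1, 1.0 - w1)))

print(sorted(found))
```

The weighted sum of the middle vector is always 0.6, while the smaller of the two other weighted sums never exceeds 0.5, so the sweep returns only the two extreme vectors even though (0.6, 0.6) is Pareto optimal.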
Unfortunately, people who use the weighting method do not necessarily
know that it does not work correctly for nonconvex problems. This is
a serious and important aspect because it is not always easy to check the
convexity in real applications if the problem is based, for example, on some
simulation model or solving some systems like systems of partial differential
equations. If the method is used in nonconvex problems for generating a
representation of the Pareto optimal set, the DM gets a completely
misleading impression about the feasible solutions available when some parts of the
Pareto optimal set remain uncovered.
It is advisable to normalize the objectives with some scaling so that dif-
ferent magnitudes do not confuse the method. Systematic ways of perturbing
the weights to obtain dierent Pareto optimal solutions are suggested, e.g.,
in (Chankong and Haimes, 1983). However, as illustrated by Das and Den-
nis (1997), an evenly distributed set of weights does not necessarily produce
an evenly distributed representation of the Pareto optimal set, even if the
problem is convex.
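The effect noted by Das and Dennis can be illustrated with a small sketch. The convex biobjective problem below (f1(x) = x, f2(x) = 1 - sqrt(x) on [0, 1]) is a hypothetical example chosen for this illustration, and the weighted problem is solved by plain enumeration over a grid rather than by a real solver.

```python
import math

def front_point(x):
    """Objective vector (f1, f2) of a hypothetical convex problem on [0, 1]."""
    return (x, 1.0 - math.sqrt(x))

# Candidate decision values; enumeration stands in for an optimizer.
grid = [i / 1000 for i in range(1001)]

def solve_weighted(w1):
    """Minimize w1*f1 + (1 - w1)*f2 over the grid."""
    return min(grid, key=lambda x: w1 * front_point(x)[0] + (1.0 - w1) * front_point(x)[1])

weights = [0.1, 0.3, 0.5, 0.7, 0.9]  # evenly spaced weights
solutions = [front_point(solve_weighted(w)) for w in weights]
for w, (f1, f2) in zip(weights, solutions):
    print(f"w1={w:.1f}  f1={f1:.3f}  f2={f2:.3f}")
```

Under these assumptions the first two weights return the same end point of the front and the last two land close to the opposite end, so the evenly spaced weights give a clustered representation.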
On the other hand, if the method is used as an a priori method, the
DM is expected to be able to represent her/his preferences in the form of
weights. This may be possible if we assume that the DM has a linear value
function (which then corresponds to the objective function in problem (1.2)).
However, in general, the role of the weights may be greatly misleading. They
are often said to reflect the relative importance of the objective functions
but, for example, Roy and Mousseau (1996) show that it is not at all clear
what underlies this notion. Moreover, the relative importance of objective
functions is usually understood globally, for the entire decision problem, while
many practical applications show that the importance typically varies for
different objective function values, that is, the concept is meaningful only
locally. (For more discussion on ordering objective functions by importance,
see, e.g., (Podinovski, 1994).)
One more reason why the DM may not get satisfactory solutions with the
weighting method is that if some of the objective functions correlate with
each other, then changing the weights may not produce expected solutions at
all but, instead, seemingly bad weights may result in satisfactory solutions
and vice versa (see, e.g., (Steuer, 1986)). This is also shown in (Tanner, 1991)
with an example originally formulated by P. Korhonen. With this example of
choosing a spouse (where three candidates are evaluated with five criteria) it
is clearly demonstrated how weights representing the preferences of the DM
(i.e., giving the clearly biggest weight to the most important criterion) result
in a spouse who is the worst in the criterion that the DM regarded as the
most important one. (In this case, the undesired outcome may be explained
by the compensatory character of the weighting method.)
In particular for MOLP problems, weights that produce a certain Pareto
optimal solution are not necessarily unique and, thus, dramatically different
weights may produce similar solutions. On the other hand, it is also possible
that a small change in the weights may cause big differences in objective
values. In all, we can say that it is not necessarily easy for the DM (or the
analyst) to control the solution process with weights because weights behave
in an indirect way. Then, the solution process may become an interactive one
where the DM tries to guess such weights that would produce a satisfactory
solution and this is not at all desirable because the DM cannot be properly
supported and (s)he is likely to get frustrated. Instead, in such cases it is
advisable to use real interactive methods where the DM can better control
the solution process with more intuitive preference information. For further
details, see Chapter 2.
1.3.2 ε-Constraint Method
In the ε-constraint method, one of the objective functions is selected to be
optimized, the others are converted into constraints and the problem gets the
form
\[
\begin{array}{ll}
\text{minimize} & f_\ell(x) \\
\text{subject to} & f_j(x) \le \varepsilon_j \ \text{for all } j = 1, \ldots, k,\ j \ne \ell, \\
 & x \in S,
\end{array}
\tag{1.3}
\]
where $\ell \in \{1, \ldots, k\}$ and $\varepsilon_j$ are upper bounds for the objectives ($j \ne \ell$). The
method has been introduced in (Haimes et al., 1971) and widely discussed in
(Chankong and Haimes, 1983).
As far as optimality is concerned, the solution of problem (1.3) can be
proven to always be weakly Pareto optimal. On the other hand, $x^* \in S$ can be
proven to be Pareto optimal if and only if it solves (1.3) for every $\ell = 1, \ldots, k$,
where $\varepsilon_j = f_j(x^*)$ for $j = 1, \ldots, k$, $j \ne \ell$. In addition, a unique solution of
(1.3) can be proven to be Pareto optimal for any upper bounds. In other
words, to ensure Pareto optimality we must either solve k different problems
(and solving many problems for each Pareto optimal solution increases computational
cost) or obtain a unique solution (which is not necessarily easy to
verify). However, a positive fact is that finding any Pareto optimal solution
does not necessitate convexity (as was the case with the weighting method).
In other words, this method works for both convex and nonconvex problems.
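As a minimal sketch of problem (1.3), the function below applies the ε-constraint idea to a finite set of hypothetical objective vectors. The point set, the function name `eps_constraint` and the bound values are assumptions made for illustration only. Point C lies in a nonconvex part of the front.

```python
# Hypothetical objective vectors (f1, f2), both minimized.
points = {"A": (0.0, 1.0), "B": (1.0, 0.0), "C": (0.6, 0.6)}

def eps_constraint(points, ell, eps):
    """Minimize objective `ell` subject to f_j <= eps[j] for all j != ell.

    `eps` maps an objective index j to its upper bound. Returns the name of
    the best feasible point, or None if the bounds leave nothing feasible.
    """
    feasible = [n for n, f in points.items()
                if all(f[j] <= bound for j, bound in eps.items() if j != ell)]
    if not feasible:
        return None
    return min(feasible, key=lambda n: points[n][ell])

print(eps_constraint(points, ell=0, eps={1: 0.7}))   # bound f2 <= 0.7
print(eps_constraint(points, ell=0, eps={1: -0.1}))  # bound too tight
```

With the bound f2 <= 0.7 the sketch returns C, the point that no weighted sum can reach, while an overly tight bound leaves the feasible region empty.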
In practice, it may be difficult to specify the upper bounds so that the resulting
problem (1.3) has solutions, that is, the feasible region will not become
empty. This difficulty is emphasized when the number of objective functions
increases. Systematic ways of perturbing the upper bounds to obtain different
Pareto optimal solutions are suggested in (Chankong and Haimes, 1983).
In this way, the method can be used as an a posteriori method. Information
about the ranges of objective functions in the Pareto optimal set is useful in
perturbing the upper bounds. On the other hand, it is possible to use the
method in an a priori way and ask the DM to specify the function to be op-
timized and the upper bounds. Specifying upper bounds can be expected to
be easier for the DM than, for example, weights because objective function
values are understandable as such for the DM. However, the drawback here
is that if there is a promising solution really close to the bound but on the
infeasible side, it will never be found. In other words, the bounds are a very
stiff way of specifying preference information.
In what follows, we discuss three method classes described in the intro-
duction and outline some methods belonging to each of them. Again, proofs
of theorems related to optimality as well as further details about the methods
can be found in (Miettinen, 1999).
1.4 No-Preference Methods
In no-preference methods, the opinions of the DM are not taken into con-
sideration in the solution process. Thus, the problem is solved using some
relatively simple method and the idea is to find some compromise solution
typically in the middle of the Pareto optimal set because there is no pref-
erence information available to direct the solution process otherwise. These
methods are suitable for situations where there is no DM available or (s)he
has no special expectations of the solution. They can also be used to produce
a starting point for interactive methods.
One can question the name of no-preference methods because there may
still exist an underlying preference model (e.g., the acceptance of a global
criterion by a DM, like the one in the method to be described in the next
subsection, can be seen as a preference model). However, we use the term
no-preference method in order to emphasize the fact that no explicit preferences
from the DM are available and, thus, they cannot be used. These methods
can also be referred to as methods of neutral preferences.
1.4.1 Method of Global Criterion
In the method of global criterion or compromise programming (Yu, 1973;
Zeleny, 1973), the distance between some desirable reference point in the
objective space and the feasible objective region is minimized. The analyst
selects the reference point used and a natural choice is to set it as the ideal
objective vector. We can use, for example, the $L_p$-metric or the Chebyshev
metric (also known as the $L_\infty$-metric) to measure the distance to the ideal
objective vector $z^*$ or the utopian objective vector $z^{**}$ (see definitions in
Preface) and then we need to solve the problem
\[
\begin{array}{ll}
\text{minimize} & \left( \sum_{i=1}^{k} \left| f_i(x) - z_i^* \right|^p \right)^{1/p} \\
\text{subject to} & x \in S,
\end{array}
\tag{1.4}
\]
(where the exponent $1/p$ can be dropped) or
\[
\begin{array}{ll}
\text{minimize} & \max_{i=1,\ldots,k} \left| f_i(x) - z_i^* \right| \\
\text{subject to} & x \in S,
\end{array}
\tag{1.5}
\]
respectively. Note that if we here know the real ideal objective vector, we
can ignore the absolute value signs because the difference is always nonnegative
(according to the definition of the ideal objective vector).
It is demonstrated, for example, in (Miettinen, 1999) that the choice of the
distance metric affects the solution obtained. We can prove that the solution
of (1.4) is Pareto optimal and the solution of (1.5) is weakly Pareto optimal.
Furthermore, the latter can be proven to be Pareto optimal if it is unique.
Let us point out that if the objective functions have different magnitudes,
the method works properly only if we scale the objective functions to a uniform,
dimensionless scale. This means, for example, that we divide each absolute
value term involving $f_i$ by the corresponding range of $f_i$ in the Pareto
optimal set characterized by nadir and utopian objective vectors (defined in
Preface), that is, by $z_i^{\mathrm{nad}} - z_i^{**}$ (for each $i$). As the utopian objective vector
dominates all Pareto optimal solutions, we use the utopian and not the ideal
objective values in order to avoid dividing by zero in all cases. (Connections
of this method to utility or value functions are discussed in (Ballestero
and Romero, 1991).)
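The dependence of the solution on the metric can be sketched on a finite set of hypothetical objective vectors. The point set and the function names are assumptions for illustration, and the ideal vector is taken as the componentwise minimum over the candidates. Both objectives have the same range here, so no scaling is applied.

```python
def ideal(points):
    """Componentwise minimum over the candidate objective vectors."""
    return tuple(min(f[i] for f in points) for i in range(len(points[0])))

def global_criterion(points, p=None):
    """Return the candidate closest to the ideal objective vector.

    p is the exponent of the L_p metric; p=None selects the Chebyshev metric.
    """
    z = ideal(points)

    def dist(f):
        diffs = [abs(fi - zi) for fi, zi in zip(f, z)]
        if p is None:
            return max(diffs)
        return sum(d ** p for d in diffs) ** (1.0 / p)

    return min(points, key=dist)

# Hypothetical mutually nondominated objective vectors (minimization).
points = [(0.0, 6.0), (1.0, 4.0), (3.0, 3.0), (6.0, 0.0)]
print(global_criterion(points, p=1))
print(global_criterion(points, p=2))
print(global_criterion(points))  # Chebyshev metric
```

In this sketch the $L_1$ and $L_2$ metrics select (1, 4) whereas the Chebyshev metric selects (3, 3).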
1.4.2 Neutral Compromise Solution
Another simple way of generating a solution without the involvement of the
DM is suggested in (Wierzbicki, 1999) and referred to as a neutral compromise
solution. The idea is to project a point located somewhere in the middle of
the ranges of objective values in the Pareto optimal set to become feasible.
Components of such a point can be obtained as the average of the ideal (or
utopian) and nadir values of each objective function. We can get a neutral
compromise solution by solving the problem
\[
\begin{array}{ll}
\text{minimize} & \displaystyle \max_{i=1,\ldots,k} \left[ \frac{f_i(x) - \left( (z_i^* + z_i^{\mathrm{nad}})/2 \right)}{z_i^{\mathrm{nad}} - z_i^{**}} \right] \\
\text{subject to} & x \in S.
\end{array}
\tag{1.6}
\]
As can be seen, this problem uses the utopian and the nadir objective vectors
or other reliable approximations about the ranges of the objective functions
in the Pareto optimal set for scaling purposes (in the denominator), as mentioned
above. The solution is weakly Pareto optimal. We shall later return to
scalarizing functions of this type and discuss how Pareto optimality can
be guaranteed. Naturally, the average in the numerator can be taken between
components of utopian and nadir objective vectors, instead of the ideal and
nadir ones.
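A minimal sketch of problem (1.6) on a finite candidate set follows. The objective vectors and the ideal, nadir and utopian vectors are hypothetical values supplied by hand; in practice they would have to be computed or estimated. The second objective is deliberately given a much larger magnitude to show the role of the scaling in the denominator.

```python
def neutral_compromise(points, ideal, nadir, utopian):
    """Return the candidate minimizing the scaled maximum deviation from the
    midpoint of the ideal and nadir objective vectors, as in problem (1.6)."""
    def value(f):
        return max((fi - (zi + ni) / 2.0) / (ni - ui)
                   for fi, zi, ni, ui in zip(f, ideal, nadir, utopian))
    return min(points, key=value)

# Hypothetical objective vectors with very different magnitudes.
points = [(0.0, 6000.0), (1.0, 4000.0), (3.0, 3000.0), (6.0, 0.0)]
ideal = (0.0, 0.0)
nadir = (6.0, 6000.0)
utopian = (-0.01, -10.0)  # assumed: strictly better than the ideal vector

print(neutral_compromise(points, ideal, nadir, utopian))
```

Under these assumptions the sketch returns (3, 3000), the candidate in the middle of both objective ranges.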
1.5 A Posteriori Methods
In what follows, we assume that we have a DM available to take part in the
solution process. A posteriori methods can be called methods for generating
Pareto optimal solutions. Because there usually are infinitely many Pareto
optimal solutions, the idea is to generate a representation of the Pareto optimal
set and present it to the DM who selects the most satisfactory solution
of them as the final one. The idea is that once the DM has seen an overview
of different Pareto optimal solutions, it is easier to select the most preferred
one. The inconveniences here are that the generation process is usually computationally
expensive and sometimes in part, at least, difficult. On the other
hand, it may be hard for the DM to make a choice from a large set of alterna-
tives. An important question related to this is how to represent and display
the alternatives to the DM in an illustrative way (Miettinen, 2003, 1999).
Plotting the objective vectors on a plane is a natural way of displaying them
only in the case of two objectives. In that case, the Pareto optimal set can be
generated parametrically (see, e.g., (Benson, 1979; Gass and Saaty, 1955)).
The problem becomes more complicated with more objectives. For visualiz-
ing sets of Pareto optimal solutions, see Chapter 8. Furthermore, visualization
and approximation of Pareto optimal sets are discussed in Chapter 9. It is also
possible to use so-called box-indices to represent Pareto optimal solutions to
be compared by using a rough enough scale in order to let the DM easily rec-
ognize the main characteristics of the solutions at a glance (Miettinen et al.,
2008).
Remember that the weighting method and the ε-constraint method can
be used as a posteriori methods. Next we outline some other methods in this
class.
1.5.1 Method of Weighted Metrics
In the method of weighted metrics, we generalize the idea of the method
of global criterion where the distance between some reference point and the
feasible objective region is minimized. The difference is that we can produce
different solutions by weighting the metrics. The weighted approach is also
sometimes called compromise programming (Zeleny, 1973).
Again, the solution obtained depends greatly on the distance measure used.
For $1 \le p < \infty$, we have a problem
\[
\begin{array}{ll}
\text{minimize} & \left( \sum_{i=1}^{k} w_i \left( f_i(x) - z_i^* \right)^p \right)^{1/p} \\
\text{subject to} & x \in S.
\end{array}
\tag{1.7}
\]
The exponent $1/p$ can be dropped. Alternatively, we can use a weighted Chebyshev
problem
\[
\begin{array}{ll}
\text{minimize} & \max_{i=1,\ldots,k} \left[ w_i \left( f_i(x) - z_i^* \right) \right] \\
\text{subject to} & x \in S.
\end{array}
\tag{1.8}
\]
Note that we have here ignored the absolute values assuming we know the
global ideal (or utopian) objective vector. As far as optimality is concerned,
we can prove that the solution of (1.7) is Pareto optimal if either the solution
is unique or all the weights are positive. Furthermore, the solution of (1.8)
is weakly Pareto optimal for positive weights. Finally, (1.8) has at least one
Pareto optimal solution. On the other hand, convexity of the problem is needed
in order to be able to prove that every Pareto optimal solution can be found
by (1.7) by altering the weights. However, any Pareto optimal solution can
be found by (1.8) assuming that the utopian objective vector $z^{**}$ is used as a
reference point.
The objective function in (1.8) is nondifferentiable and, thus, single objective
optimizers using gradient information cannot be used to solve it. But if
all the functions in the problem considered are differentiable, we can use an
equivalent differentiable variant of (1.8) by introducing one more variable and
new constraints of the form
\[
\begin{array}{ll}
\text{minimize} & \alpha \\
\text{subject to} & \alpha \ge w_i \left( f_i(x) - z_i^* \right) \ \text{for all } i = 1, \ldots, k, \\
 & x \in S,
\end{array}
\tag{1.9}
\]
where both $x \in \mathbf{R}^n$ and $\alpha \in \mathbf{R}$ are variables. With this formulation, single
objective solvers assuming differentiability can be used.
Because problem (1.8) with $z^{**}$ seems a promising approach (as it can
find any Pareto optimal solution), it would be nice to be able to avoid weakly
Pareto optimal solutions. This can be done by giving a slight slope to the
contours of the scalarizing function used (see, e.g., (Steuer, 1986)). In other
words, we can formulate a so-called augmented Chebyshev problem in the form
\[
\begin{array}{ll}
\text{minimize} & \max_{i=1,\ldots,k} \left[ w_i \left( f_i(x) - z_i^{**} \right) \right] + \rho \sum_{i=1}^{k} \left( f_i(x) - z_i^{**} \right) \\
\text{subject to} & x \in S,
\end{array}
\tag{1.10}
\]
where $\rho$ is a sufficiently small positive scalar. Strictly speaking, (1.10) generates
properly Pareto optimal solutions and any properly Pareto optimal
solution can be found (Kaliszewski, 1994). In other words, we are not actually
able to find any Pareto optimal solution but only such solutions having a finite
trade-off. However, when solving real-life problems, it is very likely that the
DM is not interested in improperly Pareto optimal solutions after all. Here $\rho$
corresponds to the bound for desirable or acceptable trade-offs (see definition
of ε-proper Pareto optimality in Section 1.2.1). Let us mention that an augmented
version of the differentiable problem formulation (1.9) is obtained by
adding the augmentation term (i.e., the term multiplied by $\rho$) to the objective
function $\alpha$.
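The effect of the augmentation term can be sketched on three hypothetical objective vectors. The points, weights, reference point and the value of rho are illustrative assumptions; the reference point (0, 0) is simply assumed to play the role of the utopian objective vector. The point (2, 2) is only weakly Pareto optimal because (2, 1) is at least as good in both objectives.

```python
def chebyshev(f, w, z, rho=0.0):
    """Weighted Chebyshev value of f with respect to reference point z,
    plus rho times the sum of deviations (the augmentation term)."""
    diffs = [fi - zi for fi, zi in zip(f, z)]
    return max(wi * d for wi, d in zip(w, diffs)) + rho * sum(diffs)

# Hypothetical objective vectors (minimization).
points = [(2.0, 2.0), (2.0, 1.0), (3.0, 0.5)]
w = (1.0, 0.5)
z = (0.0, 0.0)  # assumed utopian reference point

# Without augmentation, the weakly Pareto optimal point ties with (2, 1).
plain_values = [chebyshev(f, w, z) for f in points]
print(plain_values)

# With a small rho the tie is broken in favour of the Pareto optimal point.
augmented_best = min(points, key=lambda f: chebyshev(f, w, z, rho=1e-3))
print(augmented_best)
```

In this sketch the plain Chebyshev values of (2, 2) and (2, 1) coincide, so a solver could return either, while the augmented problem selects (2, 1).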
Alternatively, it is possible to generate provably Pareto optimal solutions
by solving two problems in a row. In other words, problem (1.8) is first solved
and then another optimization problem is solved in the set of optimal solutions
to (1.8). To be more specific, let $x'$ be the solution of the first problem (1.8).
Then the second problem is the following
\[
\begin{array}{ll}
\text{minimize} & \sum_{i=1}^{k} \left( f_i(x) - z_i^{**} \right) \\
\text{subject to} & \max_{i=1,\ldots,k} \left[ w_i \left( f_i(x) - z_i^{**} \right) \right] \le \max_{i=1,\ldots,k} \left[ w_i \left( f_i(x') - z_i^{**} \right) \right], \\
 & x \in S.
\end{array}
\]
One should mention that the resulting problem may be computationally badly
conditioned if the problem has only one feasible solution. With this so-called
lexicographic approach it is possible to reach any Pareto optimal solution. Un-
fortunately, the computational cost increases because two optimization prob-
lems must be solved for each Pareto optimal solution (Miettinen et al., 2006).
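The two-stage idea can be sketched on a finite candidate set as follows. The function name `two_stage`, the tolerance and the data are hypothetical, and enumeration stands in for the two optimization runs.

```python
def two_stage(points, w, z, tol=1e-12):
    """Stage 1: minimize the weighted Chebyshev value over the candidates.
    Stage 2: among the stage-1 optima, minimize the sum of deviations from z."""
    def cheb(f):
        return max(wi * (fi - zi) for wi, fi, zi in zip(w, f, z))

    best = min(cheb(f) for f in points)
    stage1 = [f for f in points if cheb(f) <= best + tol]
    return min(stage1, key=lambda f: sum(fi - zi for fi, zi in zip(f, z)))

# Hypothetical objective vectors; (2, 2) is only weakly Pareto optimal.
points = [(2.0, 2.0), (2.0, 1.0), (3.0, 0.5)]
print(two_stage(points, w=(1.0, 0.5), z=(0.0, 0.0)))
```

Here the first stage leaves both (2, 2) and (2, 1), and the second stage keeps only the Pareto optimal one, at the cost of a second pass over the problem.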
1.5.2 Achievement Scalarizing Function Approach
Scalarizing functions of a special type are called achievement (scalarizing)
functions. They have been introduced, for example, in (Wierzbicki, 1982,
1986). These functions are based on an arbitrary reference point $\bar{z} \in \mathbf{R}^k$ and
the idea is to project the reference point consisting of desirable aspiration levels
onto the set of Pareto optimal solutions. Different Pareto optimal solutions
can be produced with different reference points. The difference to the previous
method (i.e., method of weighted metrics) is that no distance metric is used
and the reference point does not have to be fixed as the ideal or utopian objective
vector. Because of these characteristics, Pareto optimal solutions are
obtained no matter how the reference point is selected in the objective space.
Achievement functions can be formulated in different ways. As an example
we can mention the problem
\[
\begin{array}{ll}
\text{minimize} & \max_{i=1,\ldots,k} \left[ w_i \left( f_i(x) - \bar{z}_i \right) \right] + \rho \sum_{i=1}^{k} \left( f_i(x) - \bar{z}_i \right) \\
\text{subject to} & x \in S,
\end{array}
\tag{1.11}
\]
where $w$ is a fixed normalizing factor, for example, $w_i = 1/(z_i^{\mathrm{nad}} - z_i^{**})$ for
all $i$ and $\rho > 0$ is an augmentation multiplier as in (1.10). And corresponding
to (1.10), we can prove that solutions of this problem are properly Pareto
optimal and any properly Pareto optimal solution can be found. To be more
specific, the solutions obtained are ε-properly Pareto optimal (as defined in
Section 1.2). If the augmentation term is dropped, the solutions can be proven
to be weakly Pareto optimal. Pareto optimality can also be guaranteed and
proven if the lexicographic approach described above is used. Let us point
out that problem (1.6) uses an achievement scalarizing function where the
reference point is fixed. The problem could be augmented as in (1.11).
Note that when compared to the method of weighted metrics, we do not use
absolute value signs here in any case. No matter which achievement function
formulation is used, the idea is the same: if the reference point is feasible,
or actually to be more exact, $\bar{z} \in Z + \mathbf{R}^k_+$, then the minimization of the
achievement function subject to the feasible region allocates slack between
the reference point and Pareto optimal solutions producing a Pareto optimal
solution. In other words, in this case the reference point is a Pareto optimal
solution for the problem in question or it is dominated by some Pareto optimal
solution. On the other hand, if the reference point is infeasible, that is,
$\bar{z} \notin Z + \mathbf{R}^k_+$, then the minimization produces a solution that minimizes the distance
between $\bar{z}$ and $Z$. In both cases, we can say that we project the reference point
on the Pareto optimal set. Discussion on how the projection direction can be
varied in the achievement function can be found in (Luque et al., 2009).
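Both cases can be sketched with the achievement function of (1.11) on a finite set of feasible objective vectors. The set Z, the nadir and utopian vectors, rho and the two reference points are hypothetical values chosen for illustration.

```python
def achievement(f, ref, w, rho=1e-3):
    """Augmented achievement scalarizing function of the form used in (1.11)."""
    diffs = [fi - ri for fi, ri in zip(f, ref)]
    return max(wi * d for wi, d in zip(w, diffs)) + rho * sum(diffs)

# Hypothetical feasible objective vectors; (4, 4) is dominated by (3, 3).
Z = [(0.0, 6.0), (1.0, 4.0), (3.0, 3.0), (6.0, 0.0), (4.0, 4.0)]
nadir, utopian = (6.0, 6.0), (-0.01, -0.01)  # assumed to be known
w = tuple(1.0 / (n - u) for n, u in zip(nadir, utopian))

def project(ref):
    """Return the feasible objective vector minimizing the achievement function."""
    return min(Z, key=lambda f: achievement(f, ref, w))

print(project((4.0, 4.0)))  # attainable reference point
print(project((0.0, 3.0)))  # unattainable reference point
```

In this sketch the attainable reference point (4, 4) is improved to (3, 3), and the unattainable reference point (0, 3) is projected to (1, 4); both results are nondominated within Z.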
As mentioned before, achievement functions can be formulated in many
ways and they can be based on so-called reservation levels, besides aspiration
levels. For more details about them, we refer, for example, to (Wierzbicki,
1982, 1986, 1999, 2000) and Chapter 2.
1.5.3 Approximation Methods
Over the years, many methods have been developed for approximating the
set of Pareto optimal solutions in the MCDM literature. Here we do not go
into their details. A survey of such methods is given in (Ruzika and Wiecek,
2005). Other approximation algorithms (not included there) are introduced
in (Lotov et al., 2004). For more information about approximation methods
we also refer to Chapter 9.
1.6 A Priori Methods
In a priori methods, the DM must specify her/his preference information (for
example, in the form of aspirations or opinions) before the solution process. If
the solution obtained is satisfactory, the DM does not have to invest too much
time in the solution process. However, unfortunately, the DM does not necessarily
know beforehand what it is possible to attain in the problem and how
realistic her/his expectations are. In this case, the DM may be disappointed
with the solution obtained and may be willing to change her/his preference
information. This easily leads to a desire to use an interactive approach (see
Chapter 2). As already mentioned, the basic methods introduced earlier can
be used as a priori methods. It is also possible to use the achievement scalarizing
function approach as an a priori method where the DM specifies the
reference point and the Pareto optimal solution closest to it is generated.
Here we briefly describe three other methods.
1.6.1 Value Function Method
The value function method (Keeney and Raiffa, 1976) was already mentioned
in Section 1.2.2. It is an excellent method if the DM happens to know an
explicit mathematical formulation for the value function and if that function
can capture and represent all her/his preferences. Then the problem to be
solved is
\[
\begin{array}{ll}
\text{maximize} & v(f(x)) \\
\text{subject to} & x \in S.
\end{array}
\]
Because the value function provides a complete ordering in the objective space,
the best Pareto optimal solution is found in this way. Unfortunately, it may be
difficult, if not impossible, to get that mathematical expression of $v$. For example,
in (deNeufville and McCord, 1984), the inability to encode the DM's
underlying value function reliably is demonstrated by experiments. On the
other hand, the value function can be difficult to optimize because of its possible
complicated nature. Finally, even if it were possible for the DM to express
her/his preferences globally as a value function, the resulting preference structure
may be too simple since value functions cannot represent intransitivity
or incomparability. In other words, the DM's preferences must satisfy certain
conditions (like consistent preferences) so that a value function can be defined
on them. For more discussion see, for example, (Miettinen, 1999).
1.6.2 Lexicographic Ordering
In lexicographic ordering (Fishburn, 1974), the DM must arrange the objective
functions according to their absolute importance. This means that a more
important objective is infinitely more important than a less important objective.
After the ordering, the most important objective function is minimized
subject to the original constraints. If this problem has a unique solution, it is
the final one and the solution process stops. Otherwise, the second most important
objective function is minimized. Now, a new constraint is introduced
to guarantee that the most important objective function preserves its optimal
value. If this problem has a unique solution, the solution process stops. Otherwise,
the process goes on as above. (Let us add that computationally it is
not trivial to check the uniqueness of solutions. Then the next problem must
be solved just to be sure. However, if the next problem has a unique solution,
the problem is computationally badly conditioned, as discussed earlier.)
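The successive procedure can be sketched on a finite candidate set as follows. The function name and the three-objective data are hypothetical; integer objective values are used so that the exact equality test for "preserves its optimal value" is safe, whereas a real implementation would need a tolerance.

```python
def lexicographic(points, order):
    """Minimize the objectives one at a time in the given order of importance,
    keeping only the optimal candidates of each stage for the next one."""
    candidates = list(points)
    for i in order:
        best = min(f[i] for f in candidates)
        # New constraint: objective i must keep its optimal value.
        candidates = [f for f in candidates if f[i] == best]
        if len(candidates) == 1:  # unique solution: the process stops
            break
    return candidates[0]

# Hypothetical objective vectors (f1, f2, f3), all minimized.
points = [(1, 5, 2), (1, 3, 9), (2, 0, 0)]
print(lexicographic(points, order=[0, 1, 2]))
print(lexicographic(points, order=[2, 1, 0]))
```

With the order (f1, f2, f3) the process stops after the second objective and returns (1, 3, 9) without ever looking at the third objective, even though (2, 0, 0) is only slightly worse in f1 and much better elsewhere.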
The solution of lexicographic ordering can be proven to be Pareto optimal.
The method is quite simple and one can claim that people often make decisions
successively. However, the DM may have difficulties in specifying an absolute
order of importance. Besides, the method is very rough and it is very likely
that the process stops before less important objective functions are taken
into consideration. This means that some of the objectives that were regarded as
relevant while formulating the problem may not be taken into account at all, which
is questionable.
The notion of absolute importance is discussed in (Roy and Mousseau,
1996). Note that lexicographic ordering does not allow a small increment of
an important objective function to be traded off with a great decrement of
a less important objective. Yet, the DM might find this kind of trading off
appealing. If this is the case, lexicographic ordering is not likely to produce a
satisficing solution.
1.6.3 Goal Programming
Goal programming is one of the first methods expressly created for multiobjective
optimization (Charnes et al., 1955; Charnes and Cooper, 1961). It has
been originally developed for MOLP problems (Ignizio, 1985).
In goal programming, the DM is asked to specify aspiration levels $\bar{z}_i$ ($i =
1, \ldots, k$) for the objective functions. Then, deviations from these aspiration
levels are minimized. An objective function jointly with an aspiration level is
referred to as a goal. For minimization problems, goals are of the form $f_i(x) \le \bar{z}_i$
and the aspiration levels are assumed to be selected so that they are not
achievable simultaneously. After the goals have been formed, the deviations
$\delta_i = \max[0, f_i(x) - \bar{z}_i]$ of the objective function values are minimized.
The method has several variants. In the weighted goal programming approach
(Charnes and Cooper, 1977), the weighted sum of the deviations is
minimized. This means that in addition to the aspiration levels, the DM must
specify positive weights. Then we solve a problem
\[
\begin{array}{ll}
\text{minimize} & \sum_{i=1}^{k} w_i \delta_i \\
\text{subject to} & f_i(x) - \delta_i \le \bar{z}_i \ \text{for all } i = 1, \ldots, k, \\
 & \delta_i \ge 0 \ \text{for all } i = 1, \ldots, k, \\
 & x \in S,
\end{array}
\tag{1.12}
\]
where $x \in \mathbf{R}^n$ and $\delta_i$ ($i = 1, \ldots, k$) are the variables.
On the other hand, in the lexicographic goal programming approach, the
DM must specify a lexicographic order for the goals in addition to the aspira-
tion levels. After the lexicographic ordering, the problem with the deviations
as objective functions is solved lexicographically subject to the constraints
of (1.12) as explained in Section 1.6.2. It is also possible to use a combina-
tion of the weighted and the lexicographic approaches. In this case, several
objective functions may belong to the same class of importance in the lexico-
graphic order. In each priority class, a weighted sum of the deviations is min-
imized. Let us also mention a so-called min-max goal programming approach
(Flavell, 1976) where the maximum of deviations is minimized and meta-goal
programming (Rodríguez Uría et al., 2002), where different variants of goal
programming are incorporated.
Let us next discuss optimality. The solution of a goal programming prob-
lem can be proven to be Pareto optimal if either the aspiration levels form a
Pareto optimal reference point or all the variables $\delta_i$ have positive values at
the optimum. In other words, if the aspiration levels form a feasible point, the
solution is equal to that reference point which is not necessarily Pareto op-
timal. We can say that the basic formulation of goal programming presented
here works only if the aspiration levels are overoptimistic enough. Pareto op-
timality of the solutions obtained is discussed, for example, in (Jones et al.,
1998).
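The weighted variant and the optimality caveat can be sketched on a finite candidate set. The function name, the candidates, the weights and the aspiration levels are hypothetical; the deviations are computed directly as max(0, f_i - aspiration_i), which is the value the variables of (1.12) take at the optimum for a fixed candidate. Python's `min` returns the first minimal item, which is what makes the second call return a dominated point.

```python
def weighted_goal_programming(points, aspiration, weights):
    """Return the candidate with the smallest weighted sum of deviations
    above the aspiration levels (minimization goals f_i <= aspiration_i)."""
    def total_deviation(f):
        return sum(w * max(0.0, fi - zi)
                   for w, fi, zi in zip(weights, f, aspiration))
    return min(points, key=total_deviation)

# Hypothetical feasible objective vectors; (4, 4) is dominated by (3, 3).
points = [(4.0, 4.0), (3.0, 3.0), (1.0, 4.0), (6.0, 0.0)]
weights = (1.0, 1.0)

# Overoptimistic aspiration levels: every candidate has a positive deviation.
print(weighted_goal_programming(points, (0.0, 3.0), weights))

# Attainable aspiration levels: several candidates have zero deviation and
# a dominated one may be returned.
print(weighted_goal_programming(points, (5.0, 5.0), weights))
```

In this sketch the overoptimistic aspiration levels lead to the Pareto optimal point (1, 4), while the attainable levels (5, 5) are met by the dominated point (4, 4), which is then accepted.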
Goal programming is a very widely used and popular solution method.
Goal-setting is an understandable and easy way of making decisions. The
specification of the weights or the lexicographic ordering may be more difficult
(the weights have no direct physical meaning). For further details, see
(Romero, 1991). Let us point out that goal programming is related to the
achievement scalarizing function approach (see Section 1.5.2) because they
both are based on reference points. The advantage of the latter is that it is
able to produce Pareto optimal solutions independently of how the reference
point is selected.
Let us finally add that goal programming has been used in a variety of
further developments and modications. Among others, goal programming
is related to some fuzzy multiobjective optimization methods where fuzzy
sets are used to express degrees of satisfaction from the attainment of goals
and from satisfaction of soft constraints (Rommelfanger and Slowinski, 1998).
Some more applications of goal programming will be discussed in further
chapters of this book.
1.7 Summary
In this section we summarize some of the properties of the nine methods
discussed so far. We provide a collection of different properties in Figure 1.2.
We pay attention to the class the method can be regarded to belong to as well
as properties of solutions obtained. We also briefly comment on the format of
preference information used. In some connections, we use the notation (X) to
indicate that the statement or property is true under assumptions mentioned
when describing the method.
[Figure 1.2: table with one column per method (weighting method, ε-constraint method, method of global criterion, neutral compromise solution, method of weighted metrics, achievement scalarizing function, value function method, lexicographic ordering, goal programming) and one row per property (no-preference method; a priori method; a posteriori method; can find any Pareto optimal solution; solution always Pareto optimal; type of preference information: weights, bounds, reference point, value function, lexicographic order).]
Fig. 1.2. Summary of some properties of the methods described.
1.8 Conclusions
The aim of this chapter has been to briefly describe some basics of MCDM
methods. For this, we have concentrated on some noninteractive methods developed
for multiobjective optimization. A large variety of methods exists
and it is impossible to cover all of them. In this chapter, we have concentrated
on methods where the DM either specifies no preferences or specifies
them after or before the solution process. The methods can be combined,
hybridized and further developed in many ways, for example, with evolutionary
algorithms. Other chapters of this book will discuss possibilities of such
developments more.
None of the methods can be claimed to be superior to the others in every
aspect. When selecting a solution method, the specific features of the problem
to be solved must be taken into consideration. In addition, the opinions and
abilities of the DM are important. The theoretical properties of the methods
can rather easily be compared but, in addition, practical applicability also
plays an important role in the selection of an appropriate method. One can
say that selecting a multiobjective optimization method is a problem with
multiple objectives itself! Some methods may suit some problems and some
DMs better than others. A decision tree is provided in (Miettinen, 1999) for
easing the selection. Specific methods for different areas of application that
take into account the characteristics of the problems may also be useful.
Acknowledgements
I would like to give my most sincere thanks to Professors Alexander Lotov,
Francisco Ruiz and Andrzej Wierzbicki for their valuable comments that im-
proved this chapter. This work was partly supported by the Foundation of the
Helsinki School of Economics.
References
Ballestero, E., Romero, C.: A theorem connecting utility function optimization and
compromise programming. Operations Research Letters 10(7), 421–427 (1991)
Benayoun, R., de Montgolfier, J., Tergny, J., Laritchev, O.: Programming with multiple
objective functions: Step method (STEM). Mathematical Programming 1(3),
366–375 (1971)
Benson, H.P.: Existence of efficient solutions for vector maximization problems. Journal
of Optimization Theory and Applications 26(4), 569–580 (1978)
Benson, H.P.: Vector maximization with two objective functions. Journal of Optimization
Theory and Applications 28(3), 253–257 (1979)
Censor, Y.: Pareto optimality in multiobjective problems. Applied Mathematics and
Optimization 4(1), 41–59 (1977)
Chankong, V., Haimes, Y.Y.: Multiobjective Decision Making: Theory and Methodology.
Elsevier Science Publishing, New York (1983)
Charnes, A., Cooper, W.W.: Management Models and Industrial Applications of
Linear Programming, vol. 1. Wiley, New York (1961)
Charnes, A., Cooper, W.W.: Goal programming and multiple objective optimization;
part 1. European Journal of Operational Research 1(1), 39–54 (1977)
Charnes, A., Cooper, W.W., Ferguson, R.O.: Optimal estimation of executive compensation
by linear programming. Management Science 1(2), 138–151 (1955)
Cohon, J.L.: Multicriteria programming: Brief review and application. In: Gero, J.S.
(ed.) Design Optimization, pp. 163–191. Academic Press, London (1985)
Das, I., Dennis, J.E.: A closer look at drawbacks of minimizing weighted sums of
objectives for Pareto set generation in multicriteria optimization problems. Structural
Optimization 14(1), 63–69 (1997)
Deb, K., Chaudhuri, S., Miettinen, K.: Towards estimating nadir objective vector
using evolutionary approaches. In: Keijzer, M., et al. (eds.) Proceedings of the
8th Annual Genetic and Evolutionary Computation Conference (GECCO-2006),
Seattle, vol. 1, pp. 643–650. ACM Press, New York (2006)
deNeufville, R., McCord, M.: Unreliable measurement of utility: Significant problems
for decision analysis. In: Brans, J.P. (ed.) Operational Research '84, pp. 464–476.
Elsevier, Amsterdam (1984)
Edgeworth, F.Y.: Mathematical Psychics: An Essay on the Application of
Mathematics to the Moral Sciences. C. Kegan Paul & Co., London (1881), University
Microfilms International (Out-of-Print Books on Demand) (1987)
Fandel, G.: Group decision making: Methodology and applications. In: Bana e Costa,
C. (ed.) Readings in Multiple Criteria Decision Aid, pp. 569–605. Berlin (1990)
Fishburn, P.C.: Lexicographic orders, utilities and decision rules: A survey.
Management Science 20(11), 1442–1471 (1974)
Flavell, R.B.: A new goal programming formulation. Omega 4(6), 731–732 (1976)
Gass, S., Saaty, T.: The computational algorithm for the parametric objective
function. Naval Research Logistics Quarterly 2, 39–45 (1955)
Geoffrion, A.M.: Proper efficiency and the theory of vector maximization. Journal
of Mathematical Analysis and Applications 22(3), 618–630 (1968)
Haimes, Y.Y., Lasdon, L.S., Wismer, D.A.: On a bicriterion formulation of the
problems of integrated system identification and system optimization. IEEE
Transactions on Systems, Man, and Cybernetics 1, 296–297 (1971)
Hwang, C.-L., Lin, M.-J.: Group Decision Making under Multiple Criteria: Methods
and Applications. Springer, New York (1987)
Hwang, C.L., Masud, A.S.M.: Multiple Objective Decision Making Methods and
Applications: A State-of-the-Art Survey. Springer, Berlin (1979)
Ignizio, J.P.: Introduction to Linear Goal Programming. Sage Publications, Beverly
Hills (1985)
Jahn, J.: Vector Optimization. Springer, Berlin (2004)
Jones, D.F., Tamiz, M., Mirrazavi, S.K.: Intelligent solution and analysis of goal
programmes: the GPSYS system. Decision Support Systems 23(4), 329–332 (1998)
Kaliszewski, I.: Quantitative Pareto Analysis by Cone Separation Technique.
Kluwer, Dordrecht (1994)
Keeney, R.L., Raiffa, H.: Decisions with Multiple Objectives: Preferences and Value
Tradeoffs. Wiley, Chichester (1976)
Koopmans, T.: Analysis of production as an efficient combination of activities.
In: Koopmans, T. (ed.) Activity Analysis of Production and Allocation:
Proceedings of a Conference, pp. 33–97. Wiley, New York (1951), Yale University
Press, London (1971)
Korhonen, P., Salo, S., Steuer, R.E.: A heuristic for estimating nadir criterion values
in multiple objective linear programming. Operations Research 45(5), 751–757
(1997)
Kuhn, H., Tucker, A.: Nonlinear programming. In: Neyman, J. (ed.) Proceedings of
the Second Berkeley Symposium on Mathematical Statistics and Probability, pp.
481–492. University of California Press, Berkeley (1951)
Lotov, A.V., Bushenkov, V.A., Kamenev, G.K.: Interactive Decision Maps. Approxi-
mation and Visualization of Pareto Frontier. Kluwer Academic Publishers, Boston
(2004)
Luc, D.T.: Theory of Vector Optimization. Springer, Berlin (1989)
Luque, M., Miettinen, K., Eskelinen, P., Ruiz, F.: Incorporating preference
information in interactive reference point methods for multiobjective optimization.
Omega 37(2), 450–462 (2009)
Makarov, E.K., Rachkovski, N.N.: Unified representation of proper efficiency by
means of dilating cones. Journal of Optimization Theory and Applications 101(1),
141–165 (1999)
Marler, R., Arora, J.: Survey of multi-objective optimization methods for
engineering. Structural and Multidisciplinary Optimization 26(6), 369–395 (2004)
Miettinen, K.: Nonlinear Multiobjective Optimization. Kluwer Academic Publishers,
Boston (1999)
Miettinen, K.: Graphical illustration of Pareto optimal solutions. In: Tanino, T.,
Tanaka, T., Inuiguchi, M. (eds.) Multi-Objective Programming and Goal
Programming: Theory and Applications, pp. 197–202. Springer, Berlin (2003)
Miettinen, K., Mäkelä, M.M., Kaario, K.: Experiments with classification-based
scalarizing functions in interactive multiobjective optimization. European Journal
of Operational Research 175(2), 931–947 (2006)
Miettinen, K., Molina, J., González, M., Hernández-Díaz, A., Caballero, R.: Using
box indices in supporting comparison in multiobjective optimization. European
Journal of Operational Research, to appear (2008), doi:10.1016/j.ejor.2008.05.103
Pareto, V.: Cours d'Économie Politique. Rouge, Lausanne (1896)
Pareto, V.: Manuale di Economia Politica. Piccola Biblioteca Scientifica, Milan
(1906), Translated into English by Schwier, A.S., Manual of Political Economy,
MacMillan, London (1971)
Podinovski, V.V.: Criteria importance theory. Mathematical Social Sciences 27(3),
237–252 (1994)
Rodríguez Uría, M., Caballero, R., Ruiz, F., Romero, C.: Meta-goal programming.
European Journal of Operational Research 136(2), 422–429 (2002)
Romero, C.: Handbook of Critical Issues in Goal Programming. Pergamon Press,
Oxford (1991)
Rommelfanger, H., Slowinski, R.: Fuzzy linear programming with single or multiple
objective functions. In: Slowinski, R. (ed.) Fuzzy Sets in Decision Analysis,
Operations Research and Statistics, pp. 179–213. Kluwer Academic Publishers, Boston
(1998)
Rosenthal, R.E.: Principles of Multiobjective Optimization. Decision Sciences 16(2),
133–152 (1985)
Roy, B., Mousseau, V.: A theoretical framework for analysing the notion of relative
importance of criteria. Journal of Multi-Criteria Decision Analysis 5(2), 145–159
(1996)
Ruzika, S., Wiecek, M.M.: Approximation methods in multiobjective programming.
Journal of Optimization Theory and Applications 126(3), 473–501 (2005)
Sawaragi, Y., Nakayama, H., Tanino, T.: Theory of Multiobjective Optimization.
Academic Press, Orlando (1985)
Steuer, R.E.: Multiple Criteria Optimization: Theory, Computation, and Applica-
tion. Wiley, New York (1986)
Tanner, L.: Selecting a text-processing system as a qualitative multiple criteria
problem. European Journal of Operational Research 50(2), 179–187 (1991)
Vincke, P.: Multicriteria Decision-Aid. Wiley, Chichester (1992)
Weistroffer, H.R.: Careful usage of pessimistic values is needed in multiple objectives
optimization. Operations Research Letters 4(1), 23–25 (1985)
Wierzbicki, A.P.: A mathematical basis for satisficing decision making.
Mathematical Modelling 3, 391–405 (1982)
Wierzbicki, A.P.: On the completeness and constructiveness of parametric
characterizations to vector optimization problems. OR Spectrum 8(2), 73–87 (1986)
Wierzbicki, A.P.: Reference point approaches. In: Gal, T., Stewart, T.J., Hanne, T.
(eds.) Multicriteria Decision Making: Advances in MCDM Models, Algorithms,
Theory, and Applications, pp. 9-1–9-39. Kluwer, Boston (1999)
Wierzbicki, A.P.: Reference point methodology. In: Wierzbicki, A.P., Makowski, M.,
Wessels, J. (eds.) Model-Based Decision Support Methodology with
Environmental Applications, pp. 71–89. Kluwer Academic Publishers, Dordrecht (2000)
Yu, P.L.: A class of solutions for group decision problems. Management
Science 19(8), 936–946 (1973)
Zadeh, L.: Optimality and non-scalar-valued performance criteria. IEEE
Transactions on Automatic Control 8, 59–60 (1963)
Zeleny, M.: Compromise programming. In: Cochrane, J.L., Zeleny, M. (eds.) Multiple
Criteria Decision Making, pp. 262–301. University of South Carolina, Columbia,
SC (1973)
2
Introduction to Multiobjective Optimization:
Interactive Approaches
Kaisa Miettinen¹, Francisco Ruiz², and Andrzej P. Wierzbicki³

¹ Department of Mathematical Information Technology, P.O. Box 35 (Agora),
FI-40014 University of Jyväskylä, Finland, kaisa.miettinen@jyu.fi
² Department of Applied Economics (Mathematics), University of Málaga, Calle
Ejido 6, E-29071 Málaga, Spain, rua@uma.es
³ 21st Century COE Program: Technology Creation Based on Knowledge Science,
JAIST (Japan Advanced Institute of Science and Technology), Asahidai 1-1,
Nomi, Ishikawa 923-1292, Japan and National Institute of Telecommunications,
Szachowa Str. 1, 04-894 Warsaw, Poland, andrzej@jaist.ac.jp
Abstract. We give an overview of interactive methods developed for solving
nonlinear multiobjective optimization problems. In interactive methods, a decision
maker plays an important part and the idea is to support her/him in the search for
the most preferred solution. In interactive methods, steps of an iterative solution
algorithm are repeated and the decision maker progressively provides preference
information so that the most preferred solution can be found. We identify three
types of specifying preference information in interactive methods and give some
examples of methods representing each type. The types are methods based on
trade-off information, reference points and classification of objective functions.
2.1 Introduction
Solving multiobjective optimization problems typically means helping a human
decision maker (DM) in finding the most preferred solution as the final
one. By the most preferred solution we refer to a Pareto optimal solution which
the DM is convinced to be her/his best option. Naturally, finding the most
preferred solution necessitates the participation of the DM who is supposed
to have insight into the problem and be able to specify preference information
related to the objectives considered and different solution alternatives,
as discussed in Chapter 1. There we presented four classes for multiobjective
optimization methods according to the role of the DM in the solution process.

Reviewed by: Andrzej Jaszkiewicz, Poznan University of Technology, Poland
Wlodzimierz Ogryczak, Warsaw University of Technology, Poland
This chapter is a direct continuation of Chapter 1 and here we concentrate
on the fourth class, that is, interactive methods.
As introduced in Chapter 1, in interactive methods, an iterative solution
algorithm (which can be called a solution pattern) is formed, its steps are
repeated and the DM specifies preference information progressively during the
solution process. In other words, the phases of preference elicitation (decision
phase) and solution generation (optimization stage) alternate until the DM
has found the most preferred solution (or some stopping criterion is satisfied,
or there is no satisfactory solution for the current problem setting). After every
iteration, some information is given to the DM and (s)he is asked to answer
some questions concerning a critical evaluation of the proposed solutions or to
provide some other type of information to express her/his preferences. This
information is used to construct a more or less explicit model of the DM's
local preferences and new solutions (which are supposed to better fit the DM's
preferences) are generated based on this model. In this way, the DM directs
the solution process and only a part of the Pareto optimal solutions has to
be generated and evaluated. Furthermore, the DM can specify and correct
her/his preferences and selections during the solution process.
In brief, the main steps of a general interactive method are the following:
(1) initialize (e.g., calculate ideal and nadir values and show them to the
DM), (2) generate a Pareto optimal starting point (some neutral compromise
solution or solution given by the DM), (3) ask for preference information from
the DM (e.g., aspiration levels or number of new solutions to be generated),
(4) generate new Pareto optimal solution(s) according to the preferences and
show it/them and possibly some other information about the problem to the
DM, (5) if several solutions were generated, ask the DM to select the best
solution so far, and (6) stop, if the DM wants to. Otherwise, go to step (3).
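The steps above can be sketched in a few lines of Python. This is only a minimal illustration under strong simplifying assumptions that are not part of the chapter: the feasible objective vectors form a small finite list (all objectives minimized), the preference information of step (3) is a reference point of aspiration levels, and only one new solution is generated per iteration, so step (5) is not needed. The names `interactive_loop`, `achievement` and `ask_dm` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Vector, b: Vector) -> bool:
    """a dominates b (minimization): no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_filter(points: Sequence[Vector]) -> List[Vector]:
    """Keep only the nondominated objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def achievement(z: Vector, ref: Vector, ideal: Vector, nadir: Vector) -> float:
    """Scaled largest deviation of z from the reference point (smaller is better)."""
    return max(
        (zi - ri) / max(ni - ii, 1e-12)
        for zi, ri, ii, ni in zip(z, ref, ideal, nadir)
    )


def interactive_loop(
    points: Sequence[Vector],
    ask_dm: Callable[[Vector, Vector, Vector], Optional[Vector]],
    max_iter: int = 20,
) -> Vector:
    front = pareto_filter(points)
    k = len(front[0])
    # Step (1): ideal and nadir values (exact here because the set is finite).
    ideal = tuple(min(p[i] for p in front) for i in range(k))
    nadir = tuple(max(p[i] for p in front) for i in range(k))
    # Step (2): a neutral starting point, closest to the middle of the ranges.
    middle = tuple((a + b) / 2 for a, b in zip(ideal, nadir))
    current = min(front, key=lambda z: achievement(z, middle, ideal, nadir))
    for _ in range(max_iter):
        # Step (3): the DM sees the current solution and answers with
        # aspiration levels, or with None to stop (step (6)).
        ref = ask_dm(current, ideal, nadir)
        if ref is None:
            break
        # Step (4): a new Pareto optimal solution for these preferences.
        current = min(front, key=lambda z: achievement(z, ref, ideal, nadir))
    return current
```

In a real application, `ask_dm` would be a dialogue with the DM and the two `min` calls would be replaced by solving a scalarized optimization problem over the feasible region.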
Because of the structure of an iterative approach, the DM does not need
to have any global preference structure and (s)he can learn (see Chapter 15)
during the solution process. This is a very important benefit of interactive
methods because getting to know the problem, its possibilities and limitations
is often very valuable for the DM. To summarize, we can say that interactive
methods overcome weaknesses of a priori and a posteriori methods because
the DM does not need a global preference structure and only such Pareto
optimal solutions are generated that are interesting to the DM. The latter
means savings in computational cost and, in addition, avoids the need to
compare many Pareto optimal solutions simultaneously.
If the final aim is to choose and implement a solution, then the goal of
applying a multiobjective optimization method is to find a single, most
preferred, final solution. However, on some occasions it may be preferable that
instead of one, we find several solutions. This may be particularly true in case
of robustness considerations when some aspects of uncertainty, imprecision or
inconsistency in the data or in the model are to be taken into account (but
typically, eventually, one of them will still have to be chosen). In what follows,
as the goal of using an interactive solution process we consider finding a single
most preferred solution.
A large variety of interactive methods has been developed over the years.
We can say that none of them is generally superior to all the others and some
methods may suit different DMs and problems better than the others. The
most important assumption underlying the successful application of interactive
methods is that the DM must be available and willing to actively participate
in the solution process and direct it according to her/his preferences.
Interactive methods differ from each other by both the style of interaction
and technical elements. The former includes the form in which information
is given to the DM and the form and type of preference information the DM
specifies. On the other hand, the latter includes the type of final solution
obtained (i.e., whether it is weakly, properly or Pareto optimal or none of
these), the kind of problems handled (i.e., mathematical assumptions set on
the problem), the mathematical convergence of the method (if any) and what
kind of a scalarizing function is used. It is always important that the DM
finds the method worthwhile and acceptable and is able to use it properly, in
other words, the DM must find the style of specifying preference information
understandable and preferences easy and intuitive to provide in the style
selected. We can often identify two phases in the solution process: a learning
phase when the DM learns about the problem and feasible solutions in it and
a decision phase when the most preferred solution is found in the region
identified in the first phase. Naturally, the two phases can also be used iteratively
if so desired.
In fact, solving a multiobjective optimization problem interactively is a
constructive process where, while learning, the DM is building a conviction of
what is possible (i.e., what kind of solutions are available) and confronting this
knowledge with her/his preferences that also evolve. In this sense, one should
generally speak about a psychological convergence in interactive methods,
rather than about a mathematical one.
Here we identify three types of specifying preference information in
interactive methods and discuss the main characteristics of each type as well
as give some examples of methods. The types are methods based on trade-off
information, reference point approaches and classification-based methods.
However, it is important to point out that other interactive methods do also
exist. For example, it is possible to generate a small sample of Pareto optimal
solutions using different weights in the weighted Chebyshev problem (1.8) or
(1.10) introduced in Chapter 1 when minimizing the distance to the utopian
objective vector. Then we can ask the DM to select the most preferred one
of them and the next sample of Pareto optimal solutions is generated so that
it concentrates on the neighbourhood of the selected one. See (Steuer, 1986,
1989) for details of this so-called Tchebycheff method.
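The sampling idea can be illustrated with a small Python sketch. It is a simplified stand-in, not the procedure of Steuer: it assumes two objectives to be minimized, a finite list of Pareto optimal objective vectors in place of a continuous problem, evenly spaced weights, and a fixed contraction factor. The function names and the `factor` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def chebyshev_best(points: Sequence[Vector], weights: Vector, utopian: Vector) -> Vector:
    """Minimizer of max_i w_i * (z_i - utopian_i) over a finite set of vectors."""
    return min(
        points,
        key=lambda z: max(w * (zi - ui) for w, zi, ui in zip(weights, z, utopian)),
    )


def sample_round(
    points: Sequence[Vector], utopian: Vector, lo: float, hi: float, n: int
) -> List[Tuple[float, Vector]]:
    """Solve n (n >= 2) weighted problems with w_1 spread evenly over [lo, hi]."""
    results = []
    for j in range(n):
        w1 = lo + (hi - lo) * j / (n - 1)
        results.append((w1, chebyshev_best(points, (w1, 1.0 - w1), utopian)))
    return results


def shrink(lo: float, hi: float, chosen_w1: float, factor: float = 0.5) -> Tuple[float, float]:
    """Contract the weight interval around the weight behind the DM's choice."""
    half = factor * (hi - lo) / 2.0
    return max(0.0, chosen_w1 - half), min(1.0, chosen_w1 + half)
```

One iteration shows the DM the vectors returned by `sample_round`; the weight attached to the selected vector is passed to `shrink`, and the next sample is drawn from the narrower interval, which concentrates it near the selected solution.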
Different interactive methods are described, for example, in the monographs
(Chankong and Haimes, 1983; Hwang and Masud, 1979; Miettinen,
1999; Sawaragi et al., 1985; Steuer, 1986; Vincke, 1992). Furthermore, methods
with applications to large-scale systems and industry are presented in
(Haimes et al., 1990; Statnikov, 1999; Tabucanon, 1988). Let us also mention
examples of reviews of methods including (Buchanan, 1986; Stewart, 1992)
and collections of interactive methods like (Korhonen, 2005; Shin and
Ravindran, 1991; Vanderpooten and Vincke, 1989).
2.2 Trade-off Based Methods

2.2.1 Different Trade-off Concepts
Several definitions of trade-offs are available in the MCDM literature.
Intuitively speaking, a trade-off is an exchange, that is, a loss in one aspect of
the problem, in order to gain additional benefit in another aspect. In our
multiobjective optimization language, a trade-off represents giving up in one
of the objectives, which allows the improvement of another objective. More
precisely, it tells how much we must give up of a certain objective in order to
improve another one by a certain quantity. Here, one important distinction must
be made. A trade-off can measure, attending just to the structure of the problem,
the change in one objective in relation to the change in another one, when
moving from a feasible solution to another one. This is what we call an
objective trade-off. On the other hand, a trade-off can also measure how much the
DM considers desirable to sacrifice in the value of some objective function in
order to improve another objective by a certain quantity. Then we talk about
a subjective trade-off. As will be seen, both concepts may be used within
an interactive scheme in order to move from a Pareto optimal solution to
another. For objective trade-offs, let us define the following concepts:
Definition 1. Let us consider two feasible solutions $x^1$ and $x^2$, and the
corresponding objective vectors $f(x^1)$ and $f(x^2)$. Then, the ratio of change
between $f_i$ and $f_j$ is denoted by $T_{ij}(x^1, x^2)$, where
$$T_{ij}(x^1, x^2) = \frac{f_i(x^1) - f_i(x^2)}{f_j(x^1) - f_j(x^2)}.$$
$T_{ij}(x^1, x^2)$ is said to be a partial trade-off involving $f_i$ and $f_j$
between $x^1$ and $x^2$ if $f_l(x^1) = f_l(x^2)$ for all $l = 1, \dots, k$,
$l \neq i, j$. If, on the other hand, there exists an index
$l \in \{1, \dots, k\} \setminus \{i, j\}$ such that $f_l(x^1) \neq f_l(x^2)$, then
$T_{ij}(x^1, x^2)$ is called the total trade-off involving $f_i$ and $f_j$ between
$x^1$ and $x^2$.
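As a small numerical illustration of Definition 1, the following Python sketch computes the ratio of change from two objective vectors and reports whether it is a partial or a total trade-off. The function names and the sample vectors are made up for this example; indices are zero-based.

```python
from typing import Sequence


def trade_off(f_x1: Sequence[float], f_x2: Sequence[float], i: int, j: int) -> float:
    """Ratio of change T_ij between the objective vectors f(x1) and f(x2)."""
    denominator = f_x1[j] - f_x2[j]
    if denominator == 0:
        raise ValueError("f_j does not change, so T_ij is undefined")
    return (f_x1[i] - f_x2[i]) / denominator


def kind(f_x1: Sequence[float], f_x2: Sequence[float], i: int, j: int) -> str:
    """'partial' if every objective other than f_i and f_j is unchanged, else 'total'."""
    others_equal = all(
        a == b for l, (a, b) in enumerate(zip(f_x1, f_x2)) if l not in (i, j)
    )
    return "partial" if others_equal else "total"
```

For instance, with f(x1) = (1, 4, 2) and f(x2) = (3, 3, 2), the first objective changes by -2 per unit change of the second while the third stays fixed, so the trade-off is partial.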
When moving from one Pareto optimal solution to another, there is at least
a pair of objective functions such that one of them is improved and the other
one gets worse. These trade-off concepts help the DM to study the effect
of changing the current solution. For continuously differentiable problems,
the finite increments quotient represented by $T_{ij}(x^1, x^2)$ can be replaced by
an infinitesimal change trend when moving from a certain Pareto optimal
solution $x^0$ along a feasible direction $d$. This yields the following concept.
Definition 2. Given a feasible solution $x^0$ and a feasible direction $d$
emanating from $x^0$ (i.e., there exists $\alpha_0 > 0$ so that
$x^0 + \alpha d \in S$ for $0 \le \alpha \le \alpha_0$), we define the total
trade-off rate at $x^0$, involving $f_i$ and $f_j$, along the direction $d$ as
$$t_{ij}(x^0, d) = \lim_{\alpha \to 0^+} T_{ij}(x^0 + \alpha d, x^0).$$
If $d$ is a feasible direction with the property that there exists
$\bar{\alpha} > 0$ such that $f_l(x^0 + \alpha d) = f_l(x^0)$ for all
$l \notin \{i, j\}$ and for all $0 \le \alpha \le \bar{\alpha}$, then we shall
call the corresponding $t_{ij}(x^0, d)$ the partial trade-off rate.
The following result is straightforward:
Proposition 1. Let us assume that all the objective functions $f_i$ are
continuously differentiable. Then,
$$t_{ij}(x^0, d) = \frac{\nabla f_i(x^0)^T d}{\nabla f_j(x^0)^T d}.$$
It must be pointed out that the expression given in Definition 2 for the
trade-off rate makes it necessary for direction $d$ to be feasible. Nevertheless,
the characterization given in Proposition 1 for the continuously differentiable
case can be extended to (non-feasible) tangent directions.
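The relation between the limit of quotients and the gradient formula can be checked numerically. The sketch below uses two made-up quadratic objectives on an unconstrained region (so every direction is feasible); the functions, the point and the direction are assumptions chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def f1(x: Point) -> float:
    return x[0] ** 2 + x[1] ** 2


def f2(x: Point) -> float:
    return (x[0] - 2.0) ** 2 + x[1] ** 2


def grad_f1(x: Point) -> Point:
    return (2.0 * x[0], 2.0 * x[1])


def grad_f2(x: Point) -> Point:
    return (2.0 * (x[0] - 2.0), 2.0 * x[1])


def rate_from_gradients(
    grad_i: Callable[[Point], Point],
    grad_j: Callable[[Point], Point],
    x0: Point,
    d: Sequence[float],
) -> float:
    """Trade-off rate as the ratio of directional derivatives along d."""
    numerator = sum(g * dk for g, dk in zip(grad_i(x0), d))
    denominator = sum(g * dk for g, dk in zip(grad_j(x0), d))
    return numerator / denominator


def rate_from_quotient(
    fi: Callable[[Point], float],
    fj: Callable[[Point], float],
    x0: Point,
    d: Sequence[float],
    alpha: float,
) -> float:
    """Finite quotient T_ij(x0 + alpha d, x0) for a small step alpha > 0."""
    x = tuple(xk + alpha * dk for xk, dk in zip(x0, d))
    return (fi(x) - fi(x0)) / (fj(x) - fj(x0))
```

At x0 = (1, 1) with d = (1, 0) the gradient formula gives 2 / (-2) = -1, and the finite quotient equals (2 + alpha) / (alpha - 2), which approaches the same value as alpha tends to zero.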
Now, let us proceed to the definition of subjective trade-off concepts. The
term subjective means that the DM's preferences are somehow taken into
account. That is, subjective trade-offs are desirable trade-offs for the DM.
This idea often implies the existence of an underlying (implicit) value function
$v(z_1, \dots, z_k)$ which defines the DM's subjective preferences among the
feasible solutions of the problem. If the objective functions are to be minimized,
then $v$ is assumed to be strictly decreasing with respect to each one of its
variables. Very frequently, the concavity of $v$ is also assumed. If two
alternatives are equally desired for the DM, this means that they lie on the same
indifference curve (i.e., an isoquant of the value function),
$v(z_1, \dots, z_k) = v^0$. This yields the following definition:
Definition 3. Given two solutions $x^1$ and $x^2$, if $f(x^1)$ and $f(x^2)$ lie on
the same indifference curve, the corresponding trade-off $T_{ij}(x^1, x^2)$,
whether total or partial, is usually known as the indifference trade-off involving
$f_i$ and $f_j$ between $x^1$ and $x^2$.
Let us assume that all functions $f_i$ are continuously differentiable, and suppose
we are studying the indifference trade-offs between $f_i$ and $f_j$ at a fixed point
$x^0$, with objective vector $z^0 = f(x^0)$, which lies on the indifference curve
$v(z_1, \dots, z_k) = v^0$. If $\partial v(z^0)/\partial z_i \neq 0$, we can express
$z_i$ as an implicit function of the remaining objectives (including $z_j$):
$$z_i = z_i(z_1, \dots, z_{i-1}, z_{i+1}, \dots, z_k). \tag{2.1}$$
This expression allows us to obtain the trade-off rate between two functions
when moving along an indifference curve:
Definition 4. Given a solution $x^0$ and the corresponding $z^0$, the indifference
trade-off rate or marginal rate of substitution (MRS) between $f_i$ and $f_j$ at
$x^0$ is defined as follows:
$$m_{ij}(x^0) = \frac{\partial v(z^0)}{\partial z_j} \Big/ \frac{\partial v(z^0)}{\partial z_i}.$$
Let us observe that, if the chain rule is applied to expression (2.1), then
$$m_{ij}(x^0) = -\left.\frac{\partial z_i(z_1, \dots, z_{i-1}, z_{i+1}, \dots, z_k)}{\partial z_j}\right|_{z = z^0}. \tag{2.2}$$
Therefore, the marginal rate of substitution between $f_i$ and $f_j$ at $x^0$
represents the amount of decrement in the value of the objective function $f_i$ that
compensates an infinitesimal increment in the value of the objective $f_j$, while
the values of all the other objectives remain unaltered. Alternatively, it can
be viewed as the (absolute value of the) slope of the indifference curve
$v = v^0$ at $f(x^0)$, if $z_i$ and $z_j$ are represented on the axes. This, given
that the value function $v$ is strictly decreasing, implies that
$$N(x^0) = (m_{i1}(x^0), \dots, m_{i,i-1}(x^0), 1, m_{i,i+1}(x^0), \dots, m_{ik}(x^0)) \tag{2.3}$$
is a normal vector to the indifference curve at $z^0$ (see Figure 2.1).
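To make the marginal rate of substitution concrete, the following Python sketch estimates it by central finite differences for an explicitly given value function. This is an assumption made purely for illustration: in an interactive method the value function is implicit and the rates are asked from the DM. The function names and the sample value function are hypothetical; indices are zero-based.

```python
from typing import Callable, Sequence, Tuple


def partial(v: Callable[[Sequence[float]], float], z: Sequence[float],
            idx: int, h: float = 1e-6) -> float:
    """Central-difference estimate of the partial derivative of v in z[idx]."""
    up = list(z)
    down = list(z)
    up[idx] += h
    down[idx] -= h
    return (v(up) - v(down)) / (2.0 * h)


def mrs(v: Callable[[Sequence[float]], float], z: Sequence[float],
        i: int, j: int) -> float:
    """Ratio of the partial derivative in z_j to the partial derivative in z_i."""
    return partial(v, z, j) / partial(v, z, i)


def normal_vector(v: Callable[[Sequence[float]], float], z: Sequence[float],
                  i: int) -> Tuple[float, ...]:
    """Vector of rates with a 1 in position i, normal to the indifference curve."""
    return tuple(1.0 if j == i else mrs(v, z, i, j) for j in range(len(z)))


def sample_value(z: Sequence[float]) -> float:
    """A made-up concave value function, strictly decreasing for positive z."""
    return -(z[0] ** 2 + 2.0 * z[1] ** 2)
```

At z = (1, 1) the partial derivatives of `sample_value` are -2 and -4, so the rate between the first and second objective is 2: a decrement of two units in the first objective compensates a unit increment in the second, to first order.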
2.2.2 Obtaining Objective Trade-offs over the Pareto Optimal Set
When solving a multiobjective optimization problem using an interactive
method, it can be important and useful to know the objective trade-offs when
moving from a Pareto optimal solution to another one. This knowledge can
allow the DM to decide whether to search for more preferred Pareto optimal
solutions in certain directions. A key issue for many trade-off based interactive
methods is to obtain partial trade-off rates for a Pareto optimal solution. The
ε-constraint problem plays a key role in this task.
Given the multiobjective problem, a vector $\varepsilon \in \mathbb{R}^{k-1}$, and
an objective function $f_i$ to be optimized, let us consider problem (1.3) defined
in Chapter 1. We can denote this problem by $P_i(\varepsilon)$. If the feasible set
of $P_i(\varepsilon)$ is nonempty and $x^0$ is an optimal solution, then let us
denote by $\lambda^*_{ij}$ the optimal Karush-Kuhn-Tucker (KKT) multipliers
associated with the $f_j$ constraints. Chankong and Haimes (1983) prove (under
certain regularity and second order conditions) that if all the optimal KKT
multipliers are strictly positive, then $\lambda^*_{ij}$ is a partial trade-off rate
between objectives $f_i$ and $f_j$, along a direction $d^*_j$:
$$\lambda^*_{ij} = -\frac{\partial f_i(x^0)}{\partial z_j} = -t_{ij}(x^0, d^*_j), \quad j \neq i, \tag{2.4}$$
where
$$d^*_j = \frac{\partial x(\varepsilon^0)}{\partial \varepsilon_j} \tag{2.5}$$
is an efficient direction, that is, a direction that is tangent to the Pareto
optimal set (i.e., frontier) at $x^0$. Therefore, graphically, $\lambda^*_{ij}$ can
be viewed as the (absolute value of the) slope of the Pareto optimal frontier at
$z^0$, if functions $f_i$ and $f_j$ are represented on the axes. For objective
functions to be minimized, a vector
$$N^*(x^0) = (\lambda^*_{i1}, \dots, \lambda^*_{i,i-1}, 1, \lambda^*_{i,i+1}, \dots, \lambda^*_{ik}) \tag{2.6}$$
can be interpreted as a normal vector to the Pareto optimal frontier at $z^0$
(see Figure 2.1). In fact, this expression matches the traditional definition of
a normal vector when the efficient frontier is differentiable.
Fig. 2.1. On the left, $N^*$, given by the optimal KKT multipliers, is the normal
vector to the Pareto optimal frontier at $z^0$, and $N$, given by the MRS, is the
normal vector to the indifference curve at $z^0$. On the right, the convergence
condition given in (2.7) holds.
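The role of the KKT multiplier of the ε-constraint problem as a trade-off rate can be checked on a toy problem. The following Python sketch is an illustration under assumptions that do not come from the references: a single variable x in [0, 2], the objectives f1(x) = x^2 and f2(x) = (x - 2)^2, and a bound 0 < eps <= 4 on f2. The problem is simple enough to be solved in closed form, so no optimization library is needed.

```python
import math
from typing import Tuple


def solve_eps_constraint(eps: float) -> Tuple[float, float]:
    """Minimize f1(x) = x**2 subject to f2(x) = (x - 2)**2 <= eps and 0 <= x <= 2.

    The constraint means x >= 2 - sqrt(eps), and f1 grows with x on [0, 2],
    so the smallest admissible x is optimal. Returns (x, f1(x)).
    """
    x = max(0.0, 2.0 - math.sqrt(eps))
    return x, x * x


def kkt_multiplier(x: float) -> float:
    """Multiplier from stationarity f1'(x) + lam * f2'(x) = 0 (active constraint)."""
    return -(2.0 * x) / (2.0 * (x - 2.0))


def sensitivity(eps: float, h: float = 1e-6) -> float:
    """Central-difference estimate of minus the derivative of optimal f1 in eps."""
    f_up = solve_eps_constraint(eps + h)[1]
    f_down = solve_eps_constraint(eps - h)[1]
    return -(f_up - f_down) / (2.0 * h)
```

For eps = 1 the optimum is x = 1 with f1 = 1; the multiplier obtained from the stationarity condition and the numerically estimated sensitivity both equal 1, which is the absolute value of the slope of the Pareto optimal frontier at z = (1, 1).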
If the strict positivity condition on the multipliers is removed, a more general
result is also proved in (Chankong and Haimes, 1983). Namely, if all the
optimal multipliers are strictly positive, then $\lambda^*_{ij}$ is a partial
trade-off rate between objectives $f_i$ and $f_j$, along direction $d^*_j$, as
defined in (2.5). If, on the other hand, there are optimal multipliers equal to
zero, then $\lambda^*_{ij}$ is a total trade-off rate between $f_i$ and $f_j$,
along direction $d^*_j$, as defined in (2.5).
Let us consider a Pareto optimal solution $x^0$ and the objective vector
$z^0 = f(x^0)$. If the Pareto optimal set is connected as well as of full
dimensionality ($k - 1$) and continuously differentiable, there is an alternative
way of obtaining a normal vector to the Pareto optimal frontier at $z^0$. It does
not require the second order sufficiency conditions to be satisfied and works by
means of solving problem (1.9) formulated in Chapter 1 with
$$w_i = \frac{1}{z_i^0 - z_i^{**}}, \quad i = 1, \dots, k,$$
where $z^{**}$ is the utopian objective vector. If $\lambda^*_{ij}$ are the optimal
KKT multipliers of this problem associated with the function constraints, then the
vector
$$N^* = (w_1 \lambda^*_{i1}, \dots, w_k \lambda^*_{ik})$$
is a normal vector to the Pareto optimal frontier at $z^0$ (Yang and Li, 2002).
2.2.3 The Use of Trade-offs within Interactive Methods
Among the different trade-off based interactive methods reported in the
literature there are two most commonly used schemes:
• to determine at each iteration objective trade-offs, which are shown to the
  DM, who must give some kind of answer about the desirability of such
  trade-offs, or
• to ask the DM to provide subjective trade-offs, which are used to find a
  Pareto optimal solution with a better value of the DM's (implicit) value
  function.
The Zionts-Wallenius (Z-W) method (Zionts and Wallenius, 1976) belongs to
the first group. In this method, the DM is shown several objective trade-offs
at each iteration, and (s)he is expected to say whether (s)he likes, dislikes or
is indifferent with respect to each trade-off. More elaborate information is
required from the DM in the ISWT method (Chankong and Haimes, 1983),
where several objective trade-offs are shown at each iteration to the DM who
must rank each one of them on a scale from -10 to 10, depending on its
desirability (or from -2 to 2, as suggested by Tarvainen (1984)).
In the second group, there are three important methods. The
Geoffrion-Dyer-Feinberg (GDF) method (Geoffrion et al., 1972) uses a Frank-Wolfe
algorithm in order to perform a line search using the subjective trade-off
information given by the DM to determine the search direction. In the SPOT
method (Sakawa, 1982), the subjective trade-offs given by the DM are also
used to determine a search direction, but a proxy function is used to calculate
the optimal step length. Finally, the GRIST method (Yang, 1999) uses the
normal vector in order to project the direction given by the subjective
trade-offs onto the tangent plane to the Pareto optimal frontier.
All these five methods will be briefly described in the following section. In
most of them, the relation between the objective and the subjective trade-offs
is very important in order to determine the final solution. Namely, at a final
solution, that is, a Pareto optimal solution that maximizes the DM's value
function, the indifference curve of the value function must be tangent to the
Pareto optimal frontier. This implies that if the indifference trade-off rate is
defined as in (2.2) and the objective partial trade-off rate is defined as in
(2.4), then at a final solution the relations
$$m_{ij}(x^0) = \lambda^*_{ij}, \quad j = 1, \dots, k, \; j \neq i \tag{2.7}$$
must hold, which in turn implies that the normal vectors (2.3) and (2.6) must
coincide (see Figure 2.1). Note that, in this case, the i-th component of both
the vectors is equal to 1, and that is why they must coincide exactly. If the
normal vector to the Pareto optimal frontier is instead obtained from the
weighted problem (1.9), as described at the end of Section 2.2.2, then the
equality given in expression (2.7) or the equality between the normal vectors
should be replaced by a proportionality condition.
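As a small worked illustration of the tangency condition (2.7), consider a hypothetical one-variable problem that is not taken from the references: minimize $f_1(x) = x^2$ and $f_2(x) = (x - 2)^2$ over $x \in [0, 2]$, and assume, purely for illustration, that the DM's value function is $v(z_1, z_2) = -(z_1 + z_2)$. Every $x \in [0, 2]$ is Pareto optimal, and eliminating $x$ gives the frontier explicitly:

```latex
\begin{align*}
\text{Pareto optimal frontier:}\quad
  & z_1 = \bigl(2 - \sqrt{z_2}\bigr)^2, \qquad 0 < z_2 \le 4, \\
\text{objective trade-off rate:}\quad
  & \lambda^*_{12} = -\frac{dz_1}{dz_2} = \frac{2 - \sqrt{z_2}}{\sqrt{z_2}}, \\
\text{marginal rate of substitution:}\quad
  & m_{12} = \frac{\partial v / \partial z_2}{\partial v / \partial z_1}
           = \frac{-1}{-1} = 1, \\
\text{condition (2.7):}\quad
  & \frac{2 - \sqrt{z_2}}{\sqrt{z_2}} = 1
    \;\Longrightarrow\; z_2 = 1, \quad z^0 = (1, 1), \quad x^0 = 1.
\end{align*}
```

The result agrees with a direct computation: along the frontier $v = -(x^2 + (x - 2)^2)$, whose derivative $-(4x - 4)$ vanishes at $x = 1$.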
2.2.4 Some Trade-off Based Methods
In this section, we will briefly give the basic ideas of the previously mentioned
methods. For further details, the reader may follow the references given. See
also (Miettinen, 1999) for other trade-off based interactive techniques not
mentioned here.
The Z-W method was originally proposed in (Zionts and Wallenius, 1976),
and it is based on piecewise linearizations of the problem and the use of the
properties of Pareto optimal solutions of linear problems. The assumptions of
this method are the following:
• An implicit value function exists, and it is assumed to be concave.
• The objective functions and the feasible region are convex.
Although not strictly necessary from the theoretical point of view, the authors
mention that the additive separability of the objective functions is convenient
for practical reasons. Another version of the algorithm exists for a class of
pseudoconcave value functions (Zionts and Wallenius, 1983). As the method
is based on piecewise linear approximations of the functions, we will briefly
describe it for MOLP problems representing one of these piecewise
approximations. The idea of the method is the following: A Pareto optimal solution
is found using the weighting method (see Section 1.3.1 in Chapter 1). Then
adjacent Pareto optimal vertices to the current solution are identified and the
corresponding trade-offs are shown to the DM, who is asked to say whether
(s)he prefers each of them to the current solution or not. Making use of this
information, the weights are updated, and a new solution is found.
The interactive surrogate worth trade-off method (ISWT) is an interactive
version of the original surrogate worth trade-off method (Haimes and
Hall, 1974). The basic idea relies on the concept of surrogate worth, which is a
valuation by the DM of the desirability of the trade-offs obtained at a Pareto
optimal solution. The interactive method was first reported in (Chankong
and Haimes, 1978), and both versions are also described in (Chankong and
Haimes, 1983). The basic assumptions of this method are the following:
• An implicit value function exists, and it is assumed to be continuously
  differentiable and monotonically decreasing.
• All the functions are twice continuously differentiable.
• The feasible region S is compact.
• Optimal KKT multipliers provide partial trade-offs.
This method proceeds as follows. A Pareto optimal solution is determined
using the ε-constraint problem (see Section 1.3.2 in Chapter 1). The objective
trade-offs at the current solution are obtained and shown to the DM, who is
asked to assess their desirability using a scale from -10 to 10. This information
is used to update the vector of upper bounds, and to produce a new solution.
The basic idea of the interactive Geoffrion, Dyer and Feinberg (GDF)
algorithm (Geoffrion et al., 1972) is the following. The existence of an implicit
value function is assumed, which the DM wishes to maximize over the
feasible region. The Frank-Wolfe algorithm is applied to solve the intermediate
problems formed. The assumptions of the GDF method are:
• The feasible region S is compact and convex.
• Objective functions are continuously differentiable and convex.
• An implicit value function exists, and is assumed to be continuously
  differentiable, monotonically decreasing and concave.
In the GDF method, given the current solution, the DM is asked to provide
marginal rates of substitution, which are used to determine an ascent direction
for the value function. Then, the optimal step-length is approximated using
an evaluation scheme, and the next iteration is generated.
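One iteration of this scheme can be sketched in Python as follows. The sketch rests on assumptions that are not in the chapter: the feasible region is a box, so the linear direction-finding subproblem of the Frank-Wolfe algorithm is solved by inspecting signs; the marginal rates of substitution are passed in as numbers (with the rate of the reference objective equal to 1); and the DM's choice of step length is simulated by a hidden value function. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def gdf_direction(
    x: Point,
    grads: Sequence[Callable[[Point], Point]],
    mrs: Sequence[float],
    lower: Point,
    upper: Point,
) -> Point:
    """Search direction y - x, where y solves the linearized problem on a box."""
    n = len(x)
    # Gradient of the MRS-weighted sum of objectives; the value function is
    # assumed to increase in the opposite direction.
    g = [sum(m * grad(x)[k] for m, grad in zip(mrs, grads)) for k in range(n)]
    # Minimize g . y over the box: take the lower bound where g > 0, else the upper.
    y = [lower[k] if g[k] > 0 else upper[k] for k in range(n)]
    return tuple(yk - xk for yk, xk in zip(y, x))


def gdf_step(
    x: Point,
    d: Point,
    steps: Sequence[float],
    choose: Callable[[List[Point]], Point],
) -> Point:
    """Show the DM the points x + t d for each step t and return the chosen one."""
    candidates = [tuple(xk + t * dk for xk, dk in zip(x, d)) for t in steps]
    return choose(candidates)


# Illustrative problem data (assumptions): two convex quadratic objectives.
def f1(x: Point) -> float:
    return x[0] ** 2 + x[1] ** 2


def f2(x: Point) -> float:
    return (x[0] - 2.0) ** 2 + x[1] ** 2


def grad_f1(x: Point) -> Point:
    return (2.0 * x[0], 2.0 * x[1])


def grad_f2(x: Point) -> Point:
    return (2.0 * (x[0] - 2.0), 2.0 * x[1])


def simulated_dm(candidates: List[Point]) -> Point:
    """Stand-in for the DM: prefers the smallest f1 + f2 (value function -(z1 + z2))."""
    return min(candidates, key=lambda p: f1(p) + f2(p))
```

Starting from x = (0.5, 1) in the box [0, 2] x [0, 2] with both rates equal to 1, the direction is (1.5, -1), and among the step lengths 0, 0.25, 0.5, 0.75 and 1 the simulated DM selects 0.5, so the next iterate is (1.25, 0.5).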
The sequential proxy optimization technique (SPOT) is an interactive algorithm
developed by Sakawa (1982). The basic idea of this method is to
assume the existence of an implicit value function of the DM, that has to
be maximized over the feasible region. This maximization is done using a feasible
direction scheme. In order to determine the optimal step-length, a proxy
function is used to simulate locally the behavior of the (unknown) value function
(the author proposes several options for this function). The $\varepsilon$-constraint
problem is used to determine Pareto optimal solutions and to obtain trade-off
information at the current solution, and the DM is asked to provide marginal
rates of substitution. The assumptions of this method are the following:
• The implicit value function exists, and is continuously differentiable,
  strictly decreasing and concave.
• All objective functions $f_i$ are convex and twice continuously differentiable.
• The feasible region S is compact and convex.
• Optimal KKT multipliers provide partial trade-offs.
The idea of the iterations is the following. The $\varepsilon$-constraint problem is used to
determine a Pareto optimal solution. The DM is asked to give the MRSs, which
are used to find a search direction. The optimal step-length is approximated
using the proxy function, and the vector of bounds is updated so as to find
the next iteration.
Finally, we introduce the gradient based interactive step trade-off method
(GRIST) proposed by Yang (1999). This interactive technique has been designed
to deal with general (not necessarily convex) differentiable problems
with a differentiable, connected and full dimensional ($k - 1$) Pareto optimal
set. The main idea consists of a projection of the vector determined by the
marginal rates of substitution given by the DM onto a tangent plane to the
Pareto optimal frontier at the current iteration. This projection is proved to
be an increasing direction of the DM's underlying value function. Then a reference
point is obtained following this direction, which is in turn projected onto
the Pareto optimal frontier to generate the next iteration. The assumptions
of this method are the following:
• The implicit value function exists, and it is continuously differentiable
  and strictly decreasing.
• All objective functions are continuously differentiable.
• The feasible region S is compact.
• The Pareto optimal set is differentiable, connected and of full dimensionality
  ($k - 1$).
• All the solutions generated are regular.
In GRIST, given the current solution, the DM is asked to give the MRSs.
The corresponding vector is projected onto the tangent plane to the Pareto
optimal frontier, and a step-length is approximated by an evaluation scheme.
The point obtained is projected onto the Pareto optimal set, and this yields
a new solution.
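The projection step can be pictured with a generic orthogonal projection onto a plane. This sketch only shows the linear algebra; how the normal vector of the Pareto optimal frontier is obtained, and the sign convention of the MRS vector, are not specified here and are treated as given inputs. The function name is hypothetical.

```python
def project_onto_tangent_plane(vector, normal):
    """Orthogonal projection of `vector` onto the plane through the origin
    with normal `normal`, that is, v - ((v . n) / (n . n)) n."""
    nn = sum(n * n for n in normal)
    if nn == 0:
        raise ValueError("normal vector must be nonzero")
    scale = sum(v * n for v, n in zip(vector, normal)) / nn
    return [v - scale * n for v, n in zip(vector, normal)]


# Hypothetical three-objective example: the result has no component
# along the normal direction.
projected = project_onto_tangent_plane([1.0, 2.0, 3.0], [0.0, 0.0, 2.0])
```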
2.2.5 Summary
Finally, let us point out some of the most outstanding features of the methods
described.
Convergence. The mathematical convergence of all the methods to the optimum
of the implicit value function can be proved, given that the value function
satisfies for each method the assumptions mentioned in Section 2.2.4. (Moreover,
for MOLP problems, the Z-W method can be proved to converge in a
finite number of iterations.)
Information. The preference information required from the DM can be regarded
as not very hard for the Z-W method (the DM has to say whether
the trade-offs proposed are desirable or not), hard for the SPOT, GDF and
GRIST methods (the DM has to give MRSs at the current iteration) and very
hard for the ISWT method (where a surrogate worth for each trade-off has
to be given). Besides, in some methods the step-length has to be estimated
by evaluating different solutions.
Consistency. In all the methods the consistency of the responses of the DM
is vital for the real convergence of the method. In the special case of the SPOT
method, there are hard consistency tests for choosing the proxy functions and
their parameters. Although all the methods (except Z-W) allow the DM to revisit
solutions and to go back in the process, they usually do not perform well with
inconsistent answers.
Type of problem. The Z-W method handles convex problems for which a
piecewise linearization can be carried out, although it is mainly used in practice
for linear problems. The GDF method is designed for convex problems.
The SPOT and ISWT methods do not assume convexity, but second order
sufficient conditions must be satisfied at the iterations. Finally, the GRIST
method assumes that the Pareto optimal set is differentiable, connected and
full dimensional. These conditions may be very hard to verify a priori.
Pareto optimality. The Pareto optimality of the final solution is not guaranteed
in the GDF method. All the other methods assure that the final solution
is Pareto optimal (although only extreme Pareto optimal solutions are obtained
with the Z-W method).
2.3 Reference Point Approaches

2.3.1 Fundamental Assumptions of Reference Point Approaches
Reference point approaches have a long history and multiple practical applications
(Wierzbicki, 1977, 1980, 1999; Wierzbicki et al., 2000). However, we shall
limit the description here to their fundamental philosophy, a short indication
of their basic features and of some contemporary, new developments related
to this class of approaches. Over more than 30 years of development of reference
point approaches, including their diverse applications, several methodological
postulates describing desirable features of the decision process supported by
these approaches have been clarified, most of them expressing lessons learned
from the practice of decision making. These postulates are:
1) Separation of preferential and substantive models. This indicates
the conviction that in a good decision support system, we should carefully
distinguish between the subjective part of knowledge represented in this system,
concerning the preferences of the user, thus called a preferential model
(including, but understood more broadly than a preference model of the DM)
of the decision situation, and the objective part, representing in this system
some selected knowledge about pertinent aspects of the decision situation
(obviously never selected fully objectively, but formulated with objectivity as
a goal), called a substantive model (sometimes core model) of the decision
situation. For example, objective trade-offs, defined in Section 2.2, are part of
the substantive model, while subjective trade-offs belong to the preferential
model. Typically, a substantive model has the following general form:

\[ y = F(x, v, a), \quad x \in S, \tag{2.8} \]

where
• y is a vector of outcomes (outputs) $y_j$, used for measuring the consequences
  of implementation of decisions;
• x is a vector of decisions (controls, inputs to the decision making process),
  which are controlled by the user;
• v is a vector of external impacts (external inputs, perturbations), which
  are not controlled by the user;
• a is a vector of model parameters;
• F is a vector of functions (including such that are conventionally called
  objectives and constraints), describing the relations between decisions x,
  impacts v, parameters a, and outcomes y;
• S is the set of feasible decisions.
The form of (2.8) is only slightly more complex but essentially equivalent to
what in the other parts of this book is written simply as z = f(x), where
z denotes objectives selected among outcomes y. This short notation is a
great oversimplification of the actual complexity of models involving multiple
objectives or criteria. However, even the form of (2.8) is misleading by its
compactness, since it hides the actual complexity of the underlying knowledge
representation: a large model today may have several millions of variables and
constraints, even when the number of decision and outcome variables is usually
much smaller (Makowski, 2005).
Additionally, the substantive model includes constraint specification (symbolically
denoted by $x \in S$) that might have the form of feasible bounds on
selected model outcomes (or be just a list of considered decision options in a
discrete case). While the reference point approach is typically described for
the continuous case (with a nonempty interior of S, thus an infinite number
of solutions in this set), it is as well (or even better) applicable to the discrete
case, with a finite number of decision options. The reason for this is that the
reference point approach is specifically designed to be effective for nonconvex
problems (which is typical for the discrete case).
The actual issue of the separation of preferential and substantive models
is that the substantive model should not represent the preferences of the DM,
except in one aspect: the number of decision outcomes in this model should be
large enough for using them in a separate representation of a preferential structure
P(x, y) of the user, needed for selecting a manageable subset of solutions
(decisions) that correspond best to the user's preferences. The separate representation
of preferential structure (the structure of preferential information) can
have several degrees of specificity, while the reference point approaches assume
that this specification should be as general as possible, since a more detailed
specification violates the sovereign right of a DM to change her/his mind.
The most general specification contains a selection of outcomes $y_j$ that are
chosen by the DM to measure the quality of decisions (or solutions), called
objectives (values of objective functions) or criteria (quality measures, quality
indicators) and denoted by $z_j$, $j = 1, \ldots, k$. This specification is accompanied
by defining a partial order in the space of objectives, simply by asking the DM
which objectives should be maximized and which minimized (while another
option, stabilizing some objectives around given reference levels, is also possible
in reference point approaches (Wierzbicki et al., 2000)). Here we consider
the simplest case when all the objectives are to be minimized.
The second level of specificity in reference point approaches is assumed to
consist of the specification of reference points, that is, generally, desired objective
function values. These reference points might be interval-type, double, including
aspiration levels, denoted here by $z_j^a$ (objective function values that the DM
would like to achieve) and reservation levels $z_j^r$ (objective values that should
be achieved according to the DM). Specification of reference levels is treated
as an alternative to trade-off or weighting coefficient information that leads
usually to linear representation of preferences and unbalanced decisions as
discussed below, although some reference point approaches (Nakayama, 1995)
combine reference levels with trade-off information.
The detailed specification of preferences might include full or gradual identification
of value functions, see Section 2.2 on trade-off methods or (Keeney
and Raiffa, 1976; Keeney, 1992). This is avoided in reference point approaches
that stress learning instead of value identification. According to the reference
point philosophy, the DM should learn during the interaction with a decision
support system (DSS), hence her/his preferences might change in the decision
making process and (s)he has full, sovereign right or even necessity to be
inconsistent.
2) Nonlinearity of preferences. This postulate rests on a general conviction that
human preferences have an essentially nonlinear character, including a preference
for balanced solutions. Any linear approximation of preferences (e.g., by
a weighted sum) distorts them, favoring unbalanced solutions. This is in opposition
to the methods usually taught as the basic approaches to MCDM.
These methods consist of determining weighting coefficients (by diverse
approaches, among which the AHP (Saaty, 1982) is one of the most often used)
and solving the weighted problem (1.2) discussed in Chapter 1. Such a linear
aggregation might be sometimes necessary, but it has several limitations as
discussed in Chapter 1. The most serious ones are the following:
• The weighted sum tends to promote decisions with unbalanced objectives,
  as illustrated by the Korhonen paradox mentioned in Chapter 1. In order
  to accommodate the natural human preference for balanced solutions, a
  nonlinear aggregation is necessary.
• The weighted sum is based on a tacit (unstated) assumption that a trade-off
  analysis is applicable to all objective functions: a worsening of the value
  of one objective function might be compensated by the improvement of the
  value of another one. While often encountered in economic applications,
  this compensatory character of objectives is usually not encountered in
  interdisciplinary applications.
Having been taught that weighting methods are basic, the legislators in Poland
introduced a public tender law. This law requires that any institution preparing a
tender using public money should publish beforehand all objectives used for ranking
the offers and all weighting coefficients used to aggregate the objectives.
This legal innovation backfired: while the law was intended to make public
tenders more transparent and accountable, the practical outcome was the opposite
because of effects similar to the Korhonen paradox. Organizers of the
tenders soon discovered that they were forced to select either the offer that is
the cheapest and worst in quality or the one that is best in quality but most
expensive. In order to counteract this, they either limited the solution space
drastically by diverse side constraints (which is difficult but consistent with the
spirit of the law) or added poorly defined objectives such as the degree of
satisfaction (which is simple and legal but fully inconsistent with the spirit of
the law, since it makes the tender less transparent and opens a hidden door
for graft).
To summarize, a linear weighted sum aggregation is simple but too simplistic
in representing typical human preferences that are often nonlinear. Using
this simplistic approach may result in adverse and unforeseen side-effects.
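The effect described above can be reproduced with a small numerical sketch. The three offers and their values are hypothetical, and both objectives (price and a quality deficit) are minimized. The balanced offer is Pareto optimal, yet no choice of weights makes it the winner of a weighted sum, whereas a nonlinear max-term aggregation selects it.

```python
# Hypothetical offers as (price, quality deficit); both are to be minimized.
offers = {
    "cheapest": (1.0, 10.0),
    "balanced": (6.0, 6.0),
    "best_quality": (10.0, 1.0),
}


def weighted_sum_winner(w_price):
    """Offer minimizing w_price * price + (1 - w_price) * quality deficit."""
    return min(offers, key=lambda name: w_price * offers[name][0]
               + (1.0 - w_price) * offers[name][1])


def max_term_winner():
    """Offer minimizing the worst (largest) of its two objective values."""
    return min(offers, key=lambda name: max(offers[name]))


# Sweep the weight on price from 0 to 1 in steps of 0.01.
weighted_winners = {weighted_sum_winner(i / 100) for i in range(101)}
```

For every weight, the balanced offer scores 6, while the better of the two extreme offers scores at most 5.5, so `weighted_winners` contains only the two extremes.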
3) Holistic perception of objectives. The third basic assumption of reference
point approaches is that the DM selects her/his decision using a holistic
assessment of the decision situation. In order to help her/him in such a holistic
evaluation, a DSS should compute and inform the DM about relevant ranges
of objective function values. Such ranges can be defined in diverse ways, while
the two basic ones are the following:
• Total ranges of objective functions involve the definition of the lower bound
  $z_j^{lo}$ and the upper bound $z_j^{up}$, over all feasible decisions $x \in S$
  ($j = 1, \ldots, k$).
• Pareto optimal ranges of objectives are counted only over Pareto optimal
  solutions. The lower bound is the utopian or ideal objective vector $z_j^{\star}$ and
  is typically equal to $z_j^{lo}$. The upper bound is the nadir objective vector
  $z_j^{nad}$ (as discussed in Preface and Chapter 1).
Generally, $z_j^{nad} \le z_j^{up}$, and the nadir objective vector is easy to determine only
in the case of biobjective problems (Ehrgott and Tenfelde-Podehl, 2000) (for
continuous models; for discrete models the determination of a nadir point is
somewhat simpler). No matter which ranges of objectives we use, it is often
useful to assume that all objective functions or quality indicators and their
values $f_j(x)$ for decision vectors $x \in S$ are scaled down to a relative scale by
the transformation:

\[ z_j^{rel} = f_j^{rel}(x) = \frac{f_j(x) - z_j^{lo}}{z_j^{up} - z_j^{lo}} \times 100\%. \]
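For a discrete model, both kinds of ranges and the relative scaling can be computed directly. The following is a minimal sketch with hypothetical data and helper names; objective vectors are listed explicitly and all objectives are minimized.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Objective vectors that no other listed vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def ranges(points):
    """Componentwise minimum and maximum over the given objective vectors."""
    k = len(points[0])
    lo = [min(p[j] for p in points) for j in range(k)]
    up = [max(p[j] for p in points) for j in range(k)]
    return lo, up


def relative_scale(z, lo, up):
    """Scale each objective value to a percentage of its range."""
    return [100.0 * (zj - l) / (u - l) for zj, l, u in zip(z, lo, up)]


# Hypothetical objective vectors of five decision options.
points = [(1, 9), (3, 4), (6, 2), (8, 8), (9, 10)]
z_lo, z_up = ranges(points)                  # total ranges
z_ideal, z_nad = ranges(pareto_front(points))  # Pareto optimal ranges
scaled = relative_scale((3, 4), z_lo, z_up)
```

Here the nadir values (6, 9) lie below the total upper bounds (9, 10), as expected, because the dominated options are excluded from the Pareto optimal ranges.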
4) Reference points as tools of holistic learning. Another basic assumption
of reference point approaches is that reference (aspiration, reservation)
levels and points are treated not as a fixed expression of preferences but as a
tool of adaptive, holistic learning about the decision situation as described by
the substantive model. Thus, even if the convergence of reference point approaches
to a solution most preferred by the DM can be proved (Wierzbicki,
1999), this aspect is never stressed. More important aspects relate to other
properties of these approaches. Even if the reference points might be determined
in some objective fashion, independently of the preferences of the DM,
we stress again a diversity of such objective determinations, thus making possible
comparisons of resulting optimal solutions.
5) Achievement functions as ad hoc approximations of value. Given
the partial information about preferences (the partial order in the objective
space) and their assumed nonlinearity, and the information about the positioning
of reference points inside known objective function ranges, the simplest
ad hoc approximation of a nonlinear value function consistent with this information
and promoting balanced solutions can be proposed. Such an ad hoc
approximation takes the form of achievement functions discussed later; see
(2.9)–(2.10). (A simple example of them was also introduced in Chapter 1
as problem (1.11). Note that that problem was formulated so that it was to
be minimized but here a different variant is described where the achievement
function is maximized.)
Achievement functions are determined essentially by max-min terms that
favour solutions with balanced deviations from reference points and express
the Rawlsian principle of justice (concentrating the attention on worst off
members of society or on issues worst provided for (Rawls, 1971)). These terms
are slightly corrected by regularizing terms, resulting in the Pareto optimality
of the solutions that maximize achievement functions. See also (Lewandowski
and Wierzbicki, 1989) for diverse applications where the partial order in the
objective space (called also the dominance relation) is not assumed a priori
but defined interactively with the DM.
6) Sovereignty of the DM. It can be shown (Wierzbicki, 1986) that achievement
functions have the property of full controllability. This means that any
Pareto optimal solution can be selected by the DM by modifying reference
points and maximizing the achievement function. This provides for the full
sovereignty of the DM. Thus, a DSS based on a reference point approach
behaves analogously to a perfect analytic section staff in a business organization
(Wierzbicki, 1983). The CEO (boss) can outline her/his preferences to
the staff and specify the reference points. The perfect staff will tell the boss
that her/his aspirations or even reservations are not attainable, if this is the
case; but the staff computes in this case also the Pareto optimal solution that
comes closest to the aspirations or reservations. If, however, the aspirations
are attainable and not Pareto optimal (a better decision might be found), the
perfect staff will present to the boss the decision that results in the aspirations
and also a Pareto optimal solution corresponding to a uniform improvement
of all objectives over the aspirations. In a special case when the aspirations or
reservations are Pareto optimal, the perfect staff responds with the decision
that results precisely in attaining these aspirations (reservations) and does
not argue that another decision is better, even if such a decision might result
from a trade-off analysis performed by the staff. (Only a computerized DSS,
not a human staff, can behave in such a perfect fashion.)
7) Final aims: intuition support versus rational objectivity. To summarize
the fundamental assumptions and philosophy of reference point approaches,
the basic aim when supporting an individual, subjective DM, is
to enhance her/his power of intuition (Wierzbicki, 1997) by enabling holistic
learning about the decision situation as modelled by the substantive model.
The same applies, actually, when using reference point approaches for supporting
negotiations and group decision making (Makowski, 2005).
2.3.2 Basic Features of Reference Point Approaches
The large disparity between the opposite ends of the spectrum of preference
elicitation (full value function identification versus a weighted sum approach)
indicates the need for a middle-ground approach, simple enough and easily
adaptable but not too simplistic. An interactive decision making process in a
DSS using a reference point approach consists typically of the following steps:
• The decision maker (DM) specifies reference points (e.g., aspiration and
  reservation levels for all objective functions). To help her/him in starting
  the process, the DSS can compute a neutral solution, a response to reference
  levels situated in the middle of objective function ranges (see problem
  (1.6) in Chapter 1);
• The DSS responds by maximizing an achievement function, a relatively
  simple but nonlinear aggregation of objective functions interpreted as an
  ad hoc and adaptable approximation of the value function of the DM based
  on the information contained in the estimates of the ranges of objective
  functions and in the positioning of aspiration and reservation levels inside
  these ranges;
• The DM is free to modify the reference points as (s)he will. (S)he is supposed
  to use this freedom to learn about the decision situation and to
  explore interesting parts of the Pareto optimal set;
• Diverse methods can be used to help the DM in this exploration (we comment
  on them later), but the essential point is that they should not limit
  the freedom of the DM.
In order to formulate an achievement function, we first count achievements
for each individual objective function by transforming it (piece-wise linearly)
(for objective functions to be minimized) as

\[
\sigma_j(z_j, z_j^a, z_j^r) =
\begin{cases}
1 + \alpha (z_j^a - z_j)/(z_j^a - z_j^{lo}), & \text{if } z_j^{lo} \le z_j \le z_j^a, \\
(z_j^r - z_j)/(z_j^r - z_j^a), & \text{if } z_j^a < z_j \le z_j^r, \\
\beta (z_j^r - z_j)/(z_j^{up} - z_j^r), & \text{if } z_j^r < z_j \le z_j^{up}.
\end{cases}
\tag{2.9}
\]
The coefficients $\alpha$ and $\beta$ are typically selected to assure the concavity of
this function (Wierzbicki et al., 2000); but the concavity is needed only for
problems with a continuous (nonempty interior) set of solutions, for an easy
transformation to a linear programming problem. The value $\sigma_j = \sigma_j(z_j, z_j^a, z_j^r)$
of this achievement function (where $z_j = f_j(x)$ for a given decision vector
$x \in S$) signifies the satisfaction level with the quality indicator or objective
$j$ for this decision vector. If we assign the values of satisfaction from -1 to 0
for $z_j^r < z_j \le z_j^{up}$, values from 0 to 1 for $z_j^a < z_j \le z_j^r$, and values from 1 to 2
for $z_j^{lo} \le z_j \le z_j^a$, then we can just set $\alpha = \beta = 1$. After this transformation
of all objective function values, we might then use the following form of the
overall achievement function to be maximized¹:

\[
\sigma(z, z^a, z^r) = \min_{j=1,\ldots,k} \sigma_j(z_j, z_j^a, z_j^r)
+ \rho \sum_{j=1}^{k} \sigma_j(z_j, z_j^a, z_j^r), \tag{2.10}
\]
where $z = f(x)$ is the objective vector and $z^a = (z_1^a, \ldots, z_k^a)$ and
$z^r = (z_1^r, \ldots, z_k^r)$ are the vectors of aspiration and reservation levels,
respectively. Furthermore, $\rho > 0$ is a small regularizing coefficient (as
discussed in Chapter 1).
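A minimal sketch of (2.9)–(2.10) may make the construction concrete. It assumes that the two coefficients in (2.9) are both set to 1 by default (named `alpha` and `beta` here), that the regularizing coefficient is called `rho`, and that the levels satisfy lo < aspiration < reservation < up for every objective. The function names and the three objective vectors are hypothetical.

```python
def partial_achievement(z, lo, asp, res, up, alpha=1.0, beta=1.0):
    """Piecewise-linear achievement (2.9) for one minimized objective.
    Assumes lo < asp < res < up."""
    if z <= asp:
        return 1.0 + alpha * (asp - z) / (asp - lo)
    if z <= res:
        return (res - z) / (res - asp)
    return beta * (res - z) / (up - res)


def achievement(z, lo, asp, res, up, rho=1e-3):
    """Overall achievement (2.10): the worst partial achievement plus
    rho times the sum of all partial achievements."""
    sig = [partial_achievement(*args) for args in zip(z, lo, asp, res, up)]
    return min(sig) + rho * sum(sig)


def best_response(points, lo, asp, res, up, rho=1e-3):
    """Objective vector with the largest overall achievement."""
    return max(points, key=lambda z: achievement(z, lo, asp, res, up, rho))


# Hypothetical Pareto optimal objective vectors (two minimized objectives).
points = [(1.0, 9.0), (3.0, 4.0), (6.0, 2.0)]
lo, up = (0.0, 0.0), (10.0, 10.0)

first = best_response(points, lo, asp=(3.0, 4.0), res=(7.0, 8.0), up=up)
second = best_response(points, lo, asp=(6.0, 2.0), res=(8.0, 6.0), up=up)
```

Moving the aspiration levels from (3, 4) to (6, 2) moves the response from the first of these points to the second, which is how the DM steers the search without stating weights.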
There are many possible forms of achievement functions besides (2.9)–(2.10),
as shown in (Wierzbicki et al., 2000). All of them, however, have an
important property of partial order approximation: their level sets approximate
closely the positive cone defining the partial order (Wierzbicki, 1986).
As indicated above, the achievement function has also a very important theoretical
property of controllability, not possessed by value functions nor by
weighted sums: for sufficiently small values of $\rho$, given any point $z^{*}$ in the
set of (properly) Pareto optimal objective vectors, we can always choose such
reference levels that the maximum of the achievement function (2.10) is attained
precisely at this point. In fact, it suffices to set aspiration levels equal
to the components of $z^{*}$. Conversely, if $\rho > 0$, all maxima of the achievement
function (2.10) correspond to Pareto optimal solutions (because of the
monotonicity of this function with respect to the partial order in the objective
space). Thus, the behaviour of achievement functions corresponds in this
respect to value functions and weighted sums. However, let us emphasize that
this is not the case for the distance norm used in goal programming (see Section
1.6.3 in Chapter 1), since the norm is not monotone when passing zero. As
noted above, it is precisely the controllability property that results in a fully
sovereign control of the DSS by the user.
Alternatively, as shown in (Ogryczak, 2006), we can assume $\rho = 0$ and
use the nucleolar minimax approach. In this approach, we consider first the
minimal, worst individual objective-wise achievement computed as in (2.9)–(2.10)
with $\rho = 0$. If two (or more) solutions have the same achievement
value, we order them according to the second worst individual objective-wise
achievement and so on.
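This ordering amounts to a lexicographic comparison of the partial achievements sorted from worst to best, which can be sketched as follows. The alternatives and their partial achievement values are hypothetical, and the function names are illustrative only.

```python
def nucleolar_key(partials):
    """Partial achievements sorted from worst to best; Python compares
    such lists lexicographically."""
    return sorted(partials)


def nucleolar_best(alternatives):
    """Name of the alternative whose sorted partial achievements are
    lexicographically largest."""
    return max(alternatives, key=lambda name: nucleolar_key(alternatives[name]))


# Hypothetical partial achievements of three alternatives.
alternatives = {
    "A": [0.5, 0.9, 0.5],
    "B": [0.5, 0.6, 0.8],
    "C": [0.4, 2.0, 2.0],
}
winner = nucleolar_best(alternatives)
```

A and B tie on the worst achievement (0.5), so the second worst decides: 0.6 for B against 0.5 for A. C is ruled out first because its worst achievement is only 0.4, despite its high values elsewhere.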
There are many modifications, variants and extensions (Wierzbicki et al.,
2000) or approaches related to the basic reference point approach, mostly
designed for helping the search phase in the Pareto optimal set. For example,
• the Tchebycheff method (Steuer, 1986) was developed independently but
  actually is equivalent to using weighting coefficients implied by reference
  levels;
• Pareto Race (Korhonen and Laakso, 1986) is a visual method based on
  reference points distributed along a direction in the objective space, or the
  REF-LEX method for nonlinear problems (Miettinen and Kirilov, 2005);
• the satisficing trade-off method, or the NIMBUS method, both described
  in a later section, or the light beam search method (Jaszkiewicz and
  Słowiński, 1999), or several other approaches were motivated by the reference
  point approach.

¹ Even if in this book objectives are typically supposed to be minimized,
achievements are maximized here.
In this section, we have presented some of the basic assumptions and philosophy
of reference point approaches, stressing their unique concentration on the
sovereignty of the subjective DM. Next we concentrate on classification-based
methods.
2.4 Classification-Based Methods

2.4.1 Introduction to Classification of Objective Functions
According to the definition of Pareto optimality, moving from one Pareto optimal
solution to another implies trading off. In other words, it is possible to
move to another Pareto optimal solution and improve some objective function
value(s) only by allowing some other objective function value(s) to get worse.
This idea is used as such in classification-based methods. By classification-based
methods we mean methods where the DM indicates her/his preferences
by classifying objective functions. The idea is to tell which objective functions
should improve and which ones could impair from their current values.
In other words, the DM is shown the current Pareto optimal solution and
asked what kind of changes in the objective function values would lead to
a more preferred solution. It has been shown by Larichev (1992) that the
classification of objective functions is a cognitively valid way of expressing
preference information for a DM.
We can say that classification is a very intuitive way for the DM to direct
the solution process in order to find the most preferred solution because no
artificial concepts are used. Instead, the DM deals with objective function
values that are as such meaningful and understandable for her/him. The DM
can express hopes about improved solutions and directly see and compare how
well the hopes could be attained when the next solution is generated.
To be more specific, when classifying objective functions (at the current
Pareto optimal solution) the DM indicates which function values should improve,
which ones are acceptable as such and which ones are allowed to impair.
In addition, desirable amounts of improvement or allowed amounts of impairment
may be asked from the DM. There exist several classification-based
interactive multiobjective optimization methods. They differ from each other,
for example, in the number of classes available, the preference information
asked from the DM and how this information is used to generate new Pareto
optimal solutions.
Let us point out that closely related to classification is the idea of expressing
preference information as a reference point (Miettinen and Mäkelä, 2002;
Miettinen et al., 2006). The difference is that while classification assumes that
some objective function must be allowed to get worse, a reference point can
be selected more freely. Naturally, it is not possible to improve all objective
function values of a Pareto optimal solution, but the DM can express preferences
without paying attention to this and then see what kind of solutions are
feasible. However, when using classification, the DM can be more in control
and select functions to be improved and specify amounts of relaxation for the
others.
As far as stopping criteria are concerned, classification-based methods
share the philosophy of reference point based methods (discussed in the previous
section) so that the DM's satisfaction is the most important stopping
criterion. This means that the search process continues as long as the DM
wants to and the mathematical convergence is not essential (as in trade-off
based methods) but rather the psychological convergence is emphasized (discussed
in the introduction). This is justified by the fact that DMs typically
want to feel in control and do not necessarily want the method to tell
them when they have found their most preferred solutions. After all, the most
important task of interactive methods is to support the DM in decision making.
In what follows, we briefly describe the step method, the satisficing trade-off
method and the NIMBUS method. Before that, we introduce some common
notation.
Throughout this section, we denote the current Pareto optimal solution
by $z^h = f(x^h)$. When the DM classifies the objective functions at the current
solution, we can say that (s)he assigns each of them into some class and the
number of classes available varies in different methods. In general, we have
the following classes for functions $f_i$ ($i = 1, \ldots, k$):
• $I^{<}$ whose values should be improved (i.e., decrease) from the current level,
• $I^{\le}$ whose values should improve till some desired aspiration level $\hat{z}_i < z_i^h$,
• $I^{=}$ whose values are acceptable in the current solution,
• $I^{\ge}$ whose values can be impaired (i.e., increase) till some upper bound
  $\varepsilon_i > z_i^h$, and
• $I^{\diamond}$ whose values are temporarily allowed to change freely.
The aspiration levels and the upper bounds corresponding to the classification
are elicited from the DM, if they are needed. According to the definition of
Pareto optimality, a classification is feasible only if $I^{<} \cup I^{\le} \ne \emptyset$ and
$I^{\ge} \cup I^{\diamond} \ne \emptyset$, and the DM has to classify all the objective functions,
that is, $I^{<} \cup I^{\le} \cup I^{=} \cup I^{\ge} \cup I^{\diamond} = \{1, \ldots, k\}$.
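These feasibility conditions are easy to check mechanically before a new solution is computed. The sketch below is illustrative only: objectives are indexed from 0, the five classes are passed as sets or dictionaries with hypothetical parameter names, and the requirement that the classes do not overlap is an added assumption (the text above only says that each function is assigned to some class).

```python
def classification_errors(z_h, improve, improve_to, keep, relax_to, free):
    """Return a list of problems with a classification at the current
    objective vector z_h (an empty list means it is acceptable).

    improve    -- indices to improve without a stated level
    improve_to -- dict: index -> aspiration level (must be below z_h[i])
    keep       -- indices acceptable as they are
    relax_to   -- dict: index -> upper bound (must be above z_h[i])
    free       -- indices allowed to change freely
    """
    k = len(z_h)
    errors = []
    assigned = sorted([*improve, *improve_to, *keep, *relax_to, *free])
    if assigned != list(range(k)):
        errors.append("every objective must be assigned to exactly one class")
    if not (set(improve) | set(improve_to)):
        errors.append("at least one objective must be improved")
    if not (set(relax_to) | set(free)):
        errors.append("at least one objective must be allowed to get worse")
    for i, level in improve_to.items():
        if 0 <= i < k and not level < z_h[i]:
            errors.append(f"aspiration level for objective {i} must be below its current value")
    for i, bound in relax_to.items():
        if 0 <= i < k and not bound > z_h[i]:
            errors.append(f"upper bound for objective {i} must exceed its current value")
    return errors


# Hypothetical current solution with three objectives.
z_h = [3.0, 5.0, 2.0]
ok = classification_errors(z_h, improve={0}, improve_to={}, keep=set(),
                           relax_to={1: 6.0}, free={2})
greedy = classification_errors(z_h, improve={0, 1, 2}, improve_to={},
                               keep=set(), relax_to={}, free=set())
```

The second call asks for an improvement in every objective, which is impossible at a Pareto optimal solution, so it is reported as infeasible.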
2.4.2 Step Method
The step method (STEM) (Benayoun et al., 1971) uses only two classes. STEM
is one of the first interactive methods introduced for multiobjective optimization
and it was originally developed for MOLP problems. However, here we
describe variants for nonlinear problems according to Eschenauer et al. (1990);
Sawaragi et al. (1985); Vanderpooten and Vincke (1989).
In STEM, the DM is assumed to classify the objective functions at the
current solution $z^h$ into those that have acceptable values $I^{\ge}$ and those whose
values are too high, that is, functions that have unacceptable values $I^{<}$. Then
the DM is supposed to give up a little in the value(s) of some acceptable
objective function(s) in order to improve the values of some unacceptable
objective functions. In other words, the DM is asked to specify upper bounds
$\varepsilon_i^h > z_i^h$ for the functions in $I^{\ge}$. All the objective functions must be classified
and, thus, $I^{<} \cup I^{\ge} = \{1, \ldots, k\}$.
It is assumed that the objective functions are bounded over S because
distances are measured to the (global) ideal objective vector. STEM uses the
weighted Chebyshev problem (1.8) introduced in Chapter 1 to generate new
solutions. The weights are used to make the scales of the objective functions
similar. The first problem to be solved is

\[
\begin{array}{ll}
\text{minimize} & \displaystyle \max_{i=1,\ldots,k} \left[ \frac{e_i}{\sum_{j=1}^{k} e_j} \, \bigl( f_i(x) - z_i^{\star} \bigr) \right] \\
\text{subject to} & x \in S,
\end{array}
\tag{2.11}
\]

where $e_i = \frac{1}{z_i^{\star}} \, \frac{z_i^{nad} - z_i^{\star}}{z_i^{nad}}$ as suggested by Eschenauer et al. (1990).
Alternatively, we can set $e_i = \frac{z_i^{nad} - z_i^{\star}}{\max\{|z_i^{nad}|, |z_i^{\star}|\}}$ as suggested
by Vanderpooten and Vincke (1989). (Naturally we assume that the denominators
are not equal to zero.)
It can be proved that the solution of (2.11) is weakly Pareto optimal. The
solution obtained is the starting point for the method and the DM is asked
to classify the objective functions at this point.
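Then the feasible region is restricted according to the information given
by the DM. The weights of the relaxed objective functions are set equal to
zero, that is, $e_i = 0$ for $i \in I^{\geq}$. Then a new distance minimization problem

$$
\begin{array}{ll}
\text{minimize} & \displaystyle \max_{i=1,\ldots,k} \left[ \frac{e_i}{\sum_{j=1}^{k} e_j} \left( f_i(x) - z_i^{\star} \right) \right] \\[2ex]
\text{subject to} & f_i(x) \leq \varepsilon_i^h \ \text{ for all } i \in I^{\geq}, \\
 & f_i(x) \leq f_i(x^h) \ \text{ for all } i \in I^{<}, \\
 & x \in S
\end{array}
$$

is solved.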
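The restricted step can be sketched in the same spirit. The Python fragment below is only a sketch under stated assumptions: the feasible set is replaced by a hypothetical finite list of objective vectors, the names `stem_step`, `relaxed` and `upper_bounds` are invented for illustration, and the weights are renormalised after the relaxed ones have been set to zero.

```python
def stem_step(candidates, ideal, weights, current, relaxed, upper_bounds):
    """One restricted STEM step on a finite candidate list.

    relaxed      -- indices the DM agrees to relax (the acceptable class)
    upper_bounds -- dict: relaxed index -> upper bound chosen by the DM
    """
    k = len(ideal)
    # Relaxed objectives get zero weight; the rest are renormalised.
    w = [0.0 if i in relaxed else weights[i] for i in range(k)]
    total = sum(w)
    if total == 0.0:
        raise ValueError("at least one objective must remain to be improved")
    w = [wi / total for wi in w]

    def feasible(z):
        # Relaxed objectives may rise to their bound; the others may not get worse.
        return all(
            z[i] <= (upper_bounds[i] if i in relaxed else current[i])
            for i in range(k)
        )

    allowed = [z for z in candidates if feasible(z)]
    return min(
        allowed,
        key=lambda z: max(wi * (zi - si) for wi, zi, si in zip(w, z, ideal)),
    )


# Hypothetical data: the DM relaxes objective 0 up to 4.5 to improve objective 1.
ideal = [1.0, 2.0]
weights = [0.5, 0.5]
candidates = [(1.0, 10.0), (2.0, 6.0), (3.0, 4.0), (4.0, 3.0), (4.5, 2.5), (5.0, 2.0)]
current = (3.0, 4.0)

new_solution = stem_step(candidates, ideal, weights, current,
                         relaxed={0}, upper_bounds={0: 4.5})
print(new_solution)
```

Here the step uses the whole allowance on the relaxed objective, which illustrates why choosing the upper bounds well matters to the DM.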
The DM can classify the objective functions at the solution obtained and
the procedure continues until the DM does not want to change the current
solution. In STEM, the idea is to move from one weakly Pareto optimal solution
to another. Pareto optimality of the solutions could be guaranteed, for
example, by using augmentation terms as discussed in Chapter 1 (see also
(Miettinen and Mäkelä, 2006)). The idea of classification is quite simple for
the DM. However, it may be difficult to estimate how much the other functions
should be relaxed in order to enable the desired amounts of improvement
in the others. The next method aims at resolving this kind of difficulty.
2.4.3 Satisficing Trade-Off Method
The satisficing trade-off method (STOM) (Nakayama, 1995; Nakayama and
Sawaragi, 1984) is based on ideas very similar to those in reference point approaches.
As its name suggests, it concentrates on finding a satisficing solution
(see Chapter 1).
The DM is asked to classify the objective functions at $z^h$ into three classes.
The classes are the objective functions whose values should be improved $I^{\leq}$,
the functions whose values can be relaxed $I^{\geq}$ and the functions whose values
are acceptable as such $I^{=}$. The DM is supposed to specify desirable aspiration
levels for functions in $I^{\leq}$. Here, $I^{\leq} \cup I^{\geq} \cup I^{=} = \{1, \ldots, k\}$.
Because of so-called automatic trade-off, the DM only has to specify desirable
levels for functions in $I^{\leq}$ and the upper bounds for functions in $I^{\geq}$ are
derived from trade-off rate information. The idea is to decrease the burden set
on the DM so that the amount of information to be specified is reduced. Functions
are assumed to be twice continuously differentiable. Under some special
assumptions, trade-off information can be obtained from the KKT multipliers
related to the scalarizing function used (corresponding to Section 2.2).
By putting together information about desirable function values, upper
bounds deduced using automatic trade-off and current acceptable function
values, we get a reference point $\bar{z}^h$. Different scalarizing functions can be used
in STOM but in general, the idea is to minimize the distance to the utopian
objective vector $z^{\star\star}$. We can, for example, solve the problem

$$
\begin{array}{ll}
\text{minimize} & \displaystyle \max_{i=1,\ldots,k} \left[ \frac{f_i(x) - z_i^{\star\star}}{\bar{z}_i^h - z_i^{\star\star}} \right] + \rho \sum_{i=1}^{k} \frac{f_i(x)}{\bar{z}_i^h - z_i^{\star\star}} \\[2ex]
\text{subject to} & x \in S,
\end{array}
\tag{2.12}
$$

where $\rho > 0$ is a relatively small scalar and we must have $\bar{z}_i^h > z_i^{\star\star}$ for all $i = 1, \ldots, k$. It can be proved that
the solutions obtained are properly Pareto optimal. Furthermore, it can be
proved that the solution $x^*$ is satisficing (see Chapter 1) if the reference point
is feasible (Sawaragi et al., 1985). This means that $f_i(x^*) \leq \bar{z}_i^h$ for all $i$. Let
us point out that if some objective function $f_i$ is not bounded from below
in S, then some small scalar value can be selected as $z_i^{\star\star}$. Other weighting
coefficients can also be used instead of $1/(\bar{z}_i^h - z_i^{\star\star})$ (Nakayama, 1995).
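To make the structure of problem (2.12) concrete, the sketch below evaluates its objective (a scaled max term plus a small augmentation sum) on a few candidate objective vectors and checks the satisficing condition. The candidate list, the value of `rho`, the function names and all numbers are hypothetical choices for illustration, not part of the method's specification.

```python
def stom_objective(f, reference, utopian, rho=1e-6):
    """Objective of problem (2.12) evaluated at the objective vector f."""
    spans = [r - u for r, u in zip(reference, utopian)]
    if any(s <= 0 for s in spans):
        raise ValueError("each reference level must exceed the utopian value")
    main = max((fi - u) / s for fi, u, s in zip(f, utopian, spans))
    augmentation = sum(fi / s for fi, s in zip(f, spans))
    return main + rho * augmentation


def is_satisficing(f, reference):
    """True if every objective value is at least as good as its reference level."""
    return all(fi <= ri for fi, ri in zip(f, reference))


# Hypothetical data (both objectives are minimised).
utopian = [0.9, 1.9]
reference = [3.0, 5.0]
candidates = [(1.0, 10.0), (2.0, 6.0), (3.0, 4.0), (5.0, 2.0)]

best = min(candidates, key=lambda f: stom_objective(f, reference, utopian))
print(best, is_satisficing(best, reference))
```

In this toy case the reference point is attainable, and the selected vector meets every reference level, in line with the satisficing property stated above.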
The solution process can be started, for example, by asking the DM to
specify a reference point and solving problem (2.12). Then, at this point, the
DM is asked to classify the objective functions and specify desirable aspiration
levels for the functions to be improved. The solution process continues until
the DM does not want to improve or relax any objective function value.
In particular, if the problem has many objective functions, the DM may
appreciate the fact that (s)he does not have to specify upper bound values.
Naturally, the DM may modify the calculated values if they are not agreeable.
It is important to note that STOM can be used even if the assumptions
enabling automatic trade-off are not valid. In this case, the DM has to specify
both aspiration levels and upper bounds.
2.4.4 NIMBUS Method
The NIMBUS method is described in (Miettinen, 1999; Miettinen and Mäkelä,
1995, 1999, 2000; Miettinen et al., 1998) and it is based on the classification
of the objective functions into up to five classes. It has been developed for
demanding nonlinear multiobjective optimization. In NIMBUS, the DM can
classify objective functions at $z^h$ into any of the five classes introduced at
the beginning of this section, that is, functions to be decreased $I^{<}$ and to
be decreased till an aspiration level $I^{\leq}$, functions that are satisfactory at the
moment $I^{=}$, functions that can increase till an upper bound $I^{\geq}$ and functions
that can change freely $I^{\diamond}$, and $I^{<} \cup I^{\leq} \cup I^{=} \cup I^{\geq} \cup I^{\diamond} = \{1, \ldots, k\}$. We assume
that we have the ideal objective vector and a good approximation of the nadir
objective vector available.
The difference between the classes $I^{<}$ and $I^{\leq}$ is that the functions in $I^{<}$
are to be minimized as far as possible but the functions in $I^{\leq}$ only till the
aspiration level (specified by the DM). There are several different variants
of NIMBUS but here we concentrate on the so-called synchronous method
(Miettinen and Mäkelä, 2006). After the DM has made the classification, we
form a scalarizing function and solve the problem

$$
\begin{array}{ll}
\text{minimize} & \displaystyle \max_{i \in I^{<},\; j \in I^{\leq}} \left[ \frac{f_i(x) - z_i^{\star}}{z_i^{\mathrm{nad}} - z_i^{\star\star}},\; \frac{f_j(x) - \hat{z}_j}{z_j^{\mathrm{nad}} - z_j^{\star\star}} \right] + \rho \sum_{i=1}^{k} \frac{f_i(x)}{z_i^{\mathrm{nad}} - z_i^{\star\star}} \\[2ex]
\text{subject to} & f_i(x) \leq f_i(x^h) \ \text{ for all } i \in I^{<} \cup I^{\leq} \cup I^{=}, \\
 & f_i(x) \leq \varepsilon_i \ \text{ for all } i \in I^{\geq}, \\
 & x \in S,
\end{array}
\tag{2.13}
$$

where $\rho > 0$ is a relatively small scalar. The weighting coefficients $1/(z_j^{\mathrm{nad}} - z_j^{\star\star})$
(scaling the objective function values) have proven to facilitate capturing
the preferences of the DM well. They also increase computational efficiency
(Miettinen et al., 2006). By solving problem (2.13) we get a provably (properly)
Pareto optimal solution that satisfies the classification as well as possible.
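The following Python sketch mirrors the structure of problem (2.13): a max term over the objectives to be improved, an augmentation sum scaled by the nadir and utopian values, and the constraints implied by the classification. It is a simplified illustration only; the finite candidate list, the parameter names, the value of `rho` and the sample data are assumptions, and a real implementation would solve the problem over S with a suitable nonlinear (possibly nondifferentiable) solver.

```python
def nimbus_objective(f, ideal, utopian, nadir, to_improve, aspiration, rho=1e-6):
    """Scalarizing function of problem (2.13) at the objective vector f.

    to_improve -- indices to be improved as far as possible
    aspiration -- dict: index -> aspiration level (improve until this level)
    """
    scale = [n - u for n, u in zip(nadir, utopian)]
    terms = [(f[i] - ideal[i]) / scale[i] for i in to_improve]
    terms += [(f[j] - level) / scale[j] for j, level in aspiration.items()]
    if not terms:
        raise ValueError("at least one objective must be classified for improvement")
    return max(terms) + rho * sum(fi / s for fi, s in zip(f, scale))


def nimbus_feasible(f, current, to_improve, aspiration, keep, upper_bounds):
    """Constraints of (2.13): no deterioration in improved or kept objectives,
    and relaxed objectives stay below their upper bounds."""
    no_worse = set(to_improve) | set(aspiration) | set(keep)
    return (all(f[i] <= current[i] for i in no_worse)
            and all(f[i] <= bound for i, bound in upper_bounds.items()))


# Hypothetical three-objective data.
ideal = [0.0, 0.0, 0.0]
utopian = [-0.1, -0.1, -0.1]
nadir = [10.0, 10.0, 10.0]
current = (5.0, 5.0, 5.0)
candidates = [(5.0, 5.0, 5.0), (3.0, 5.0, 7.0), (2.0, 5.0, 9.0),
              (4.0, 4.0, 6.0), (3.0, 6.0, 6.0)]

# Classification: improve objective 0, keep objective 1, let objective 2 rise to 8.
allowed = [f for f in candidates
           if nimbus_feasible(f, current, [0], {}, [1], {2: 8.0})]
best = min(allowed, key=lambda f: nimbus_objective(f, ideal, utopian, nadir, [0], {}))
print(best)
```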
In the synchronous NIMBUS method, the DM can ask for up to four different
Pareto optimal solutions to be generated based on the classification once
expressed. This means that solutions are produced that take the classification
information into account in slightly different ways. In practice, we form
a reference point $\bar{z}$ based on the classification information specified as follows:
$\bar{z}_i = z_i^{\star}$ for $i \in I^{<}$, $\bar{z}_i = \hat{z}_i$ for $i \in I^{\leq}$, $\bar{z}_i = z_i^h$ for $i \in I^{=}$, $\bar{z}_i = \varepsilon_i$
for $i \in I^{\geq}$ and $\bar{z}_i = z_i^{\mathrm{nad}}$ for $i \in I^{\diamond}$, where $\hat{z}_i$ is the aspiration level and
$\varepsilon_i$ the upper bound specified by the DM. (This, once again, demonstrates the
close relationship between classification and reference points.) Then we can
use reference point based scalarizing functions to generate new solutions. In
the synchronous NIMBUS method, the scalarizing functions used are those
coming from STOM (problem (2.12) in Section 2.4.3), reference point method
(problem (1.11) defined in Chapter 1) and GUESS (Buchanan, 1997). See
(Miettinen and Mäkelä, 2002) for details on how they were selected. Let us
point out that all the solutions produced are guaranteed to be properly Pareto
optimal.
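The mapping from a classification to a reference point described above is mechanical, and a minimal Python sketch may help. The string labels used for the five classes and the function and parameter names are hypothetical conventions chosen for this illustration.

```python
def classification_to_reference_point(current, ideal, nadir, classes,
                                      aspiration=None, upper=None):
    """Build a reference point from a five-class classification.

    classes[i] is one of "<", "<=", "=", ">=", "free" (illustrative labels).
    aspiration and upper map objective indices to the levels given by the DM.
    """
    aspiration = aspiration or {}
    upper = upper or {}
    point = []
    for i, cls in enumerate(classes):
        if cls == "<":          # improve as far as possible: ideal value
            point.append(ideal[i])
        elif cls == "<=":       # improve until the aspiration level
            point.append(aspiration[i])
        elif cls == "=":        # keep the current value
            point.append(current[i])
        elif cls == ">=":       # may increase until the upper bound
            point.append(upper[i])
        elif cls == "free":     # may change freely: nadir value
            point.append(nadir[i])
        else:
            raise ValueError(f"unknown class label: {cls!r}")
    return point


# Hypothetical five-objective example, one objective per class.
current = [5.0, 5.0, 5.0, 5.0, 5.0]
ideal = [0.0] * 5
nadir = [10.0] * 5
ref = classification_to_reference_point(
    current, ideal, nadir, ["<", "<=", "=", ">=", "free"],
    aspiration={1: 3.0}, upper={3: 8.0})
print(ref)
```

The resulting vector can then be passed to any reference point based scalarizing function.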
Further details and the synchronous NIMBUS algorithm are described in
(Miettinen and Mäkelä, 2006). The main steps are the following: Once the DM
has classified the objective functions at the current solution $z^h$ and specified
aspiration levels and upper bounds, if needed, (s)he is asked how many new
solutions (s)he wants to see and compare. That many solutions are generated
(as described above) and the DM can select any of them as the final solution
or as the starting point of a new classification. It is also possible to select any
of the solutions generated so far as a starting point for a new classification.
The DM can also control the search process by asking for a desired number of
intermediate solutions to be generated between any two interesting solutions
found so far. In this case, steps of equal length are taken in the decision space
and corresponding objective vectors are used as reference points to get Pareto
optimal solutions that the DM can compare.
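The generation of intermediate reference points can be sketched as follows. This is an illustrative fragment under assumptions: the two test objective functions are hypothetical, the end points themselves are excluded from the steps, and the final projection of each reference point onto the Pareto optimal set (by solving a scalarized problem) is not shown.

```python
def intermediate_reference_points(x_a, x_b, objectives, count):
    """Take `count` equally spaced steps strictly between x_a and x_b in the
    decision space and return the corresponding objective vectors."""
    points = []
    for step in range(1, count + 1):
        t = step / (count + 1)
        x = [a + t * (b - a) for a, b in zip(x_a, x_b)]
        points.append([f(x) for f in objectives])
    return points


# Hypothetical objectives: squared distances to the points (0, 0) and (2, 0).
objectives = [
    lambda x: x[0] ** 2 + x[1] ** 2,
    lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2,
]

reference_points = intermediate_reference_points([0.0, 0.0], [2.0, 0.0], objectives, 3)
print(reference_points)
```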
The starting point can be, for example, a neutral compromise solution
(see problem (1.6) in Chapter 1) or any point specified by the DM (which
has been projected to the Pareto optimal set). In NIMBUS, the DM expresses
her/his desires iteratively. Unlike some other methods based on classification,
the success of the solution process does not depend entirely on how well the
DM manages to specify the classification and the appropriate parameter
values. It is important to note that the classification is not irreversible. Thus,
no irrevocable damage is caused in NIMBUS if the solution obtained is not
what was expected. The DM is free to go back or explore intermediate points.
(S)he can easily get to know the problem and its possibilities by specifying,
for example, loose upper bounds and examining intermediate solutions.
The method has been implemented as a WWW-NIMBUS® system operating
on the Internet (Miettinen and Mäkelä, 2000, 2006). Via the Internet,
the computing can be centralized to one server computer and the WWW is a
way of distributing the graphical user interface to the computers of each individual
user, so that the user always has the latest version of the method available.
The most important aspect of WWW-NIMBUS® is that it is easily accessible
and available to any academic Internet user at http://nimbus.it.jyu.fi/.
For a discussion on how to design user interfaces for software implementing
a classification-based interactive method, see (Miettinen and Kaario, 2003).
(When the first version of WWW-NIMBUS® was implemented in 1995 it was
a pioneering interactive optimization system on the Internet.) Another implementation,
IND-NIMBUS® for MS-Windows and Linux operating systems,
also exists (Miettinen, 2006) (see Chapter 12). Many successful applications,
for example, in the fields of optimal control, optimal shape design and process
design have shown the usefulness of the method (Hakanen et al., 2005, 2007;
Hämäläinen et al., 2003; Heikkola et al., 2006; Miettinen et al., 1998).
2.4.5 Other Classification-Based Methods
Let us briefly mention some more classification-based methods. Among them
are the interactive reference direction algorithm for convex nonlinear integer
problems (Vassilev et al., 2001) which uses three classes $I^{\leq}$, $I^{\geq}$ and $I^{=}$ and
the reference direction approach for nonlinear problems (Narula et al., 1994)
using the same three classes and generating several solutions in the reference
direction (pointing from the current solution towards the reference point).
Furthermore, the interactive decision making approach NIDMA (Kaliszewski
and Michalowski, 1999) asks for both a classification and maximal acceptable
global trade-offs from the DM.
A method where NIMBUS (see Subsection 2.4.4) is hybridized with the
feasible goals method (Lotov et al., 2004) is described in (Miettinen et al.,
2003). Because the feasible goals method produces visual interactive displays
of the variety of feasible objective vectors, the hybrids introduced help the DM
to understand what kinds of solutions are feasible, which helps
when specifying classifications for NIMBUS. (There are also classification-based
methods developed for MOLP problems which we do not touch here.)
2.5 Discussion
Due to the large variety of interactive methods available in the literature,
it is a hard task to choose the most appropriate method for each decision
situation. Here, a decision situation must be understood in a wide sense:
a DM, with a given attitude (due to many possible facts) facing (a part of)
a decision problem. This issue will be discussed in detail in Chapter 15. In
order to accommodate the variety of methods in a single decision system, some
authors have already proposed the creation of open architectures or combined
systems (Gardiner and Steuer, 1994; Kaliszewski, 2004; Luque et al., 2007b).
Some such integrated systems have already been implemented. For example,
MKO and PROMOIN are described in Chapter 12. It is also worth pointing
out that some relations among the different types of information (like weights,
trade-offs, reference points etc.) that the interactive methods may require from
the DM are investigated in (Luque et al., 2007a).
One direction for developing new, improved methods is hybridizing advantages
of different methods in order to overcome their weaknesses. For example,
hybridizing a posteriori and interactive methods has a lot of potential. The
DM can, for example, first get a rough impression of the possibilities of the
problem and then interactively locate the most preferred solution. One approach
in this direction was already mentioned with NIMBUS (Lotov et al.,
2004). Another idea is to combine reference points with an algorithm that
generates an approximation of the Pareto optimal set (Klamroth and Miettinen,
2008). This means that only those parts of the Pareto optimal set that
the DM is interested in are approximated more accurately, and (s)he can conveniently
control which parts to study by using a reference point. Steuer et al.
(1993) provide an example of hybridizing ideas of two interactive methods by
combining ideas of the Tchebycheff method and reference point approaches.
When dealing with human DMs, behavioural issues cannot be ignored. For
example, some points of view in this respect are collected in (Korhonen and
Wallenius, 1996). Let us also mention an interesting practical observation
made by Buchanan (1997). Namely, DMs seem to be easily satisfied if
there is a small difference between their hopes and the solution obtained.
Somehow they feel a need to be satisfied when they have almost achieved
what they wanted, even if they are still in the early steps of the learning
process. In this case they may stop iterating too early. Naturally, the DM
is allowed to stop the solution process if the solution really is satisfactory
but the coincidence of setting the desires near an attainable solution may
unnecessarily increase the DM's satisfaction (see also Chapter 15).
2.6 Conclusions
We have characterized some basic properties of interactive methods developed
for multiobjective optimization and considered three types of methods based
on trade-offs, reference points and classification. An important requirement
for using interactive methods is that the DM must have time and interest in
taking part in the iterative solution process. On the other hand, the major
advantage of these methods is that they give the DM a unique possibility to
learn about the problem considered. In this way, the DM is much better able
to justify why the final solution is the most preferred one.
As has been stressed, a large variety of methods exists and none of them
can be claimed to be superior to the others in every aspect. When selecting a
solution method, the opinions of the DM are important because (s)he must feel
comfortable with the way (s)he is expected to provide preference information.
In addition, the specific features of the problem to be solved must be taken
into consideration. One can say that selecting a multiobjective optimization
method is a problem with multiple objectives itself.
When dealing with interactive methods, the importance of user-friendliness
is emphasized. This is also a topic for future research. Methods must be even
better able to correspond to the characteristics of the DM. If the aspirations
of the DM change during the solution process, the algorithm must be able to
cope with this situation.
DMs want to feel in control of the solution process and, consequently,
they must understand what is happening. Thus, the preferences expressed
must be reflected in the Pareto optimal solutions generated. But if the DM
needs support in identifying the most preferred region of the Pareto optimal
set, this should be available, as well. Thus, the aim is to have methods that
support learning so that guidance is given whenever necessary. The DM can
be supported by using visual illustrations (see, e.g. Chapters 8 and 9) and
further development of such tools is essential. In particular, when designing
DSSs for DMs, user interfaces play a central role. Special-purpose methods
for different areas of application that take into account the characteristics of
the problems are also important.
Acknowledgements
The work of K. Miettinen was partly supported by the Foundation of the
Helsinki School of Economics. The work of F. Ruiz was partly supported by
the Spanish Ministry of Education and Science.
References
Benayoun, R., de Montgolfier, J., Tergny, J., Laritchev, O.: Programming with multiple
objective functions: Step method (STEM). Mathematical Programming 1,
366–375 (1971)
Buchanan, J.T.: Multiple objective mathematical programming: A review. New
Zealand Operational Research 14, 1–27 (1986)
Buchanan, J.T.: A naïve approach for solving MCDM problems: The GUESS
method. Journal of the Operational Research Society 48, 202–206 (1997)
Chankong, V., Haimes, Y.Y.: The interactive surrogate worth trade-off (ISWT)
method for multiobjective decision making. In: Zionts, S. (ed.) Multiple Criteria
Problem Solving, pp. 42–67. Springer, Berlin (1978)
Chankong, V., Haimes, Y.Y.: Multiobjective Decision Making. Theory and Methodology.
North-Holland, New York (1983)
Ehrgott, M., Tenfelde-Podehl, D.: Nadir values: Computation and use in compromise
programming. Technical report, Universität Kaiserslautern Fachbereich Mathematik
(2000)
Eschenauer, H.A., Osyczka, A., Schäfer, E.: Interactive multicriteria optimization
in design process. In: Eschenauer, H., Koski, J., Osyczka, A. (eds.) Multicriteria
Design Optimization Procedures and Applications, pp. 71–114. Springer, Berlin
(1990)
Gardiner, L., Steuer, R.E.: Unified interactive multiobjective programming. European
Journal of Operational Research 74, 391–406 (1994)
Geoffrion, A.M., Dyer, J.S., Feinberg, A.: An interactive approach for multi-criterion
optimization, with an application to the operation of an academic department.
Management Science 19, 357–368 (1972)
Haimes, Y.Y., Hall, W.A.: Multiobjectives in water resources systems analysis: the
surrogate worth trade off method. Water Resources Research 10, 615–624 (1974)
Haimes, Y.Y., Tarvainen, K., Shima, T., Thadathil, J.: Hierarchical Multiobjective
Analysis of Large-Scale Systems. Hemisphere Publishing Corporation, New York
(1990)
Hakanen, J., Miettinen, K., Mäkelä, M., Manninen, J.: On interactive multiobjective
optimization with NIMBUS in chemical process design. Journal of Multi-Criteria
Decision Analysis 13, 125–134 (2005)
Hakanen, J., Kawajiri, Y., Miettinen, K., Biegler, L.: Interactive multi-objective
optimization for simulated moving bed processes. Control and Cybernetics 36,
283–302 (2007)
Hämäläinen, J., Miettinen, K., Tarvainen, P., Toivanen, J.: Interactive solution approach
to a multiobjective optimization problem in paper machine headbox design.
Journal of Optimization Theory and Applications 116, 265–281 (2003)
Heikkola, E., Miettinen, K., Nieminen, P.: Multiobjective optimization of an ultrasonic
transducer using NIMBUS. Ultrasonics 44, 368–380 (2006)
Hwang, C.L., Masud, A.S.M.: Multiple Objective Decision Making Methods and
Applications: A State-of-the-Art Survey. Springer, Berlin (1979)
Jaszkiewicz, A., Słowiński, R.: The light beam search approach - an overview of
methodology and applications. European Journal of Operational Research 113,
300–314 (1999)
Kaliszewski, I.: Out of the mist - towards decision-maker-friendly multiple criteria
decision support. European Journal of Operational Research 158, 293–307 (2004)
Kaliszewski, I., Michalowski, W.: Searching for psychologically stable solutions
of multiple criteria decision problems. European Journal of Operational Research
118, 549–562 (1999)
Keeney, R.: Value Focused Thinking, a Path to Creative Decision Making. Harvard
University Press, Harvard (1992)
Keeney, R., Raiffa, H.: Decisions with Multiple Objectives: Preferences and Value
Tradeoffs. Wiley, New York (1976)
Klamroth, K., Miettinen, K.: Integrating approximation and interactive decision
making in multicriteria optimization. Operations Research 56, 222–234 (2008)
Korhonen, P.: Interactive methods. In: Figueira, J., Greco, S., Ehrgott, M. (eds.)
Multiple Criteria Decision Analysis. State of the Art Surveys, pp. 641–665.
Springer, New York (2005)
Korhonen, P., Laakso, J.: A visual interactive method for solving the multiple criteria
problem. European Journal of Operational Research 24, 277–287 (1986)
Korhonen, P., Wallenius, J.: Behavioural issues in MCDM: Neglected research questions.
Journal of Multi-Criteria Decision Analysis 5, 178–182 (1996)
Larichev, O.: Cognitive validity in design of decision aiding techniques. Journal of
Multi-Criteria Decision Analysis 1, 127–138 (1992)
Lewandowski, A., Wierzbicki, A.P.: Aspiration Based Decision Support Systems.
Theory, Software and Applications. Springer, Berlin (1989)
(2004)
Luque, M., Caballero, R., Molina, J., Ruiz, F.: Equivalent information for multiobjective
interactive procedures. Management Science 53, 125–134 (2007a)
Luque, M., Ruiz, F., Miettinen, K.: GLIDE - general formulation for interactive multiobjective
optimization. Technical Report W-432, Helsinki School of Economics,
Helsinki (2007b)
Makowski, M.: Model-based decision making support for problems with conflicting
goals. In: Proceedings of the 2nd International Symposium on System and Human
Science, Lawrence Livermore National Laboratory, Livermore (2005)
Boston (1999)
Miettinen, K.: IND-NIMBUS for demanding interactive multiobjective optimization.
In: Trzaskalik, T. (ed.) Multiple Criteria Decision Making '05, pp. 137–150. The
Karol Adamiecki University of Economics, Katowice (2006)
Miettinen, K., Kaario, K.: Comparing graphic and symbolic classification in interactive
multiobjective optimization. Journal of Multi-Criteria Decision Analysis 12,
321–335 (2003)
Miettinen, K., Kirilov, L.: Interactive reference direction approach using implicit
parametrization for nonlinear multiobjective optimization. Journal of Multi-Criteria
Decision Analysis 13, 115–123 (2005)
Miettinen, K., Mäkelä, M.M.: Interactive bundle-based method for nondifferentiable
multiobjective optimization: NIMBUS. Optimization 34, 231–246 (1995)
Miettinen, K., Mäkelä, M.M.: Comparative evaluation of some interactive reference
point-based methods for multi-objective optimisation. Journal of the Operational
Research Society 50, 949–959 (1999)
Miettinen, K., Mäkelä, M.M.: Interactive multiobjective optimization system
WWW-NIMBUS on the Internet. Computers & Operations Research 27,
709–723 (2000)
Miettinen, K., Mäkelä, M.M.: On scalarizing functions in multiobjective optimization.
OR Spectrum 24, 193–213 (2002)
Miettinen, K., Mäkelä, M.M.: Synchronous approach in interactive multiobjective
optimization. European Journal of Operational Research 170, 909–922 (2006)
Miettinen, K., Mäkelä, M.M., Männikkö, T.: Optimal control of continuous casting
by nondifferentiable multiobjective optimization. Computational Optimization
and Applications 11, 177–194 (1998)
Miettinen, K., Lotov, A.V., Kamenev, G.K., Berezkin, V.E.: Integration of two multiobjective
optimization methods for nonlinear problems. Optimization Methods
and Software 18, 63–80 (2003)
of Operational Research 175, 931–947 (2006)
Nakayama, H.: Aspiration level approach to interactive multi-objective programming
and its applications. In: Pardalos, P.M., Siskos, Y., Zopounidis, C. (eds.) Advances
in Multicriteria Analysis, pp. 147–174. Kluwer Academic Publishers, Dordrecht
(1995)
Nakayama, H., Sawaragi, Y.: Satisficing trade-off method for multiobjective programming.
In: Grauer, M., Wierzbicki, A.P. (eds.) Interactive Decision Analysis,
pp. 113–122. Springer, Heidelberg (1984)
Narula, S.C., Kirilov, L., Vassilev, V.: Reference direction approach for solving multiple
objective nonlinear programming problems. IEEE Transactions on Systems,
Man, and Cybernetics 24, 804–806 (1994)
Ogryczak, W.: On multicriteria optimization with fair aggregation of individual
achievements. In: CSM06: 20th Workshop on Methodologies and Tools
for Complex System Modeling and Integrated Policy Assessment, IIASA, Laxenburg,
Austria (2006),
http://www.iiasa.ac.at/marek/ftppub/Pubs/csm06/ogryczak_pap.pdf
Rawls, J.: A Theory of Justice. Belknap Press, Cambridge (1971)
Saaty, T.: Decision Making for Leaders: the Analytical Hierarchy Process for Decisions
in a Complex World. Lifetime Learning Publications, Belmont (1982)
Sakawa, M.: Interactive multiobjective decision making by the sequential proxy optimization
technique. European Journal of Operational Research 9, 386–396 (1982)
Shin, W.S., Ravindran, A.: Interactive multiple objective optimization: Survey I -
continuous case. Computers & Operations Research 18, 97–114 (1991)
Statnikov, R.B.: Multicriteria Design: Optimization and Identification. Kluwer Academic
Publishers, Dordrecht (1999)
tions. Wiley, Chichester (1986)
Steuer, R.E.: The Tchebycheff procedure of interactive multiple objective programming.
In: Karpak, B., Zionts, S. (eds.) Multiple Criteria Decision Making and
Risk Analysis Using Microcomputers, pp. 235–249. Springer, Berlin (1989)
Steuer, R.E., Silverman, J., Whisman, A.W.: A combined Tchebycheff/aspiration
criterion vector interactive multiobjective programming procedure. Management
Science 39, 1255–1260 (1993)
Stewart, T.J.: A critical survey on the status of multiple criteria decision making
theory and practice. Omega 20, 569–586 (1992)
Tabucanon, M.T.: Multiple Criteria Decision Making in Industry. Elsevier Science
Publishers, Amsterdam (1988)
Tarvainen, K.: On the implementation of the interactive surrogate worth trade-off
(ISWT) method. In: Grauer, M., Wierzbicki, A.P. (eds.) Interactive Decision
Analysis, pp. 154–161. Springer, Berlin (1984)
Vanderpooten, D., Vincke, P.: Description and analysis of some representative interactive
multicriteria procedures. Mathematical and Computer Modelling 12,
1221–1238 (1989)
Vassilev, V.S., Narula, S.C., Gouljashki, V.G.: An interactive reference direction algorithm
for solving multi-objective convex nonlinear integer programming problems.
International Transactions in Operational Research 8, 367–380 (2001)
Vincke, P.: Multicriteria Decision-Aid. Wiley, Chichester (1992)
Wierzbicki, A.P.: Basic properties of scalarizing functionals for multiobjective optimization.
Mathematische Operationsforschung und Statistik Optimization 8,
55–60 (1977)
Wierzbicki, A.P.: The use of reference objectives in multiobjective optimization.
In: Fandel, G., Gal, T. (eds.) Multiple Criteria Decision Making, Theory and
Applications, pp. 468–486. Springer, Berlin (1980)
cal Modeling 3, 391–405 (1983)
terizations to vector optimization problems. OR Spectrum 8, 73–87 (1986)
Wierzbicki, A.P.: On the role of intuition in decision making and some ways of
multicriteria aid of intuition. Journal of Multi-Criteria Decision Analysis 6, 65–78
(1997)
Theory, and Applications, pp. 9-1–9-39, Kluwer, Dordrecht (1999)
Wierzbicki, A.P., Makowski, M., Wessels, J. (eds.): Decision Support Methodology
with Environmental Applications. Kluwer Academic Publishers, Dordrecht (2000)
Yang, J.B.: Gradient projection and local region search for multiobjective optimisation.
European Journal of Operational Research 112, 432–459 (1999)
Yang, J.B., Li, D.: Normal vector identification and interactive tradeoff analysis
using minimax formulation in multiobjective optimisation. IEEE Transactions on
Systems, Man and Cybernetics 32, 305–319 (2002)
Zionts, S., Wallenius, J.: An interactive programming method for solving the multiple
criteria problem. Management Science 22, 652–663 (1976)
Zionts, S., Wallenius, J.: An interactive multiple objective linear programming
method for a class of underlying utility functions. Management Science 29,
519–529 (1983)
3 Introduction to Evolutionary Multiobjective Optimization
Kalyanmoy Deb
1 Department of Mechanical Engineering, Indian Institute of Technology Kanpur,
Kanpur, PIN 208016, India
deb@iitk.ac.in
http://www.iitk.ac.in/kangal/deb.htm
2 Department of Business Technology, Helsinki School of Economics,
PO Box 1210, 00101 Helsinki, Finland
Kalyanmoy.Deb@hse.fi
Abstract. In its current state, evolutionary multiobjective optimization (EMO)
is an established field of research and application with more than 150 PhD theses,
more than ten dedicated texts and edited books, commercial software and numerous
freely downloadable codes, a biannual conference series running successfully since
2001, special sessions and workshops held at all major evolutionary computing conferences,
and full-time researchers from universities and industries from all around
the globe. In this chapter, we provide a brief introduction to EMO principles, illustrate
some EMO algorithms with simulated results, and outline the current research
and application potential of EMO. For solving multiobjective optimization problems,
EMO procedures attempt to find a set of well-distributed Pareto-optimal points, so
that an idea of the extent and shape of the Pareto-optimal front can be obtained.
Although this task was the early motivation of EMO research, EMO principles are
now being found to be useful in various other problem solving tasks, enabling one to
treat problems naturally as they are. One of the major current research thrusts is to
combine EMO procedures with other multiple criterion decision making (MCDM)
tools so as to develop hybrid and interactive multiobjective optimization algorithms
for finding a set of trade-off optimal solutions and then choosing a preferred solution
for implementation. This chapter provides the background of EMO principles and
their potential to launch such collaborative studies with MCDM researchers in the
coming years.
Reviewed by: Matthias Ehrgott, The University of Auckland, New Zealand
Christian Igel, Ruhr-Universität Bochum, Germany
3.1 Introduction
In a short span of about fourteen years since the suggestion of the first set
of successful algorithms, evolutionary multiobjective optimization (EMO) has
now become a popular and useful field of research and application. In a recent
survey announced during the World Congress on Computational Intelligence
(WCCI) in Vancouver 2006, EMO has been judged as one of the three fastest
growing fields of research and application among all computational intelligence
topics. Evolutionary optimization (EO) algorithms use a population-based
approach in which more than one solution participates in an iteration and
evolves a new population of solutions in each iteration. The reasons for their
popularity are many. Some of them are: (i) EOs do not require any derivative
information, (ii) EOs are relatively simple to implement and (iii) EOs
are flexible and have a wide-spread applicability. For solving single-objective
optimization problems or in other tasks focusing on finding a single optimal
solution, the use of a population of solutions in each iteration may at first
seem like overkill (although it helps provide an implicit parallel search ability,
thereby making EOs computationally efficient (Holland, 1975; Goldberg,
1989)). In solving multiobjective optimization problems, however, an EO procedure is
a perfect match (Deb, 2001). Multiobjective optimization problems, by
nature, give rise to a set of Pareto-optimal solutions which need further
processing to arrive at a single preferred solution. To achieve the first task,
finding the set of Pareto-optimal solutions, it becomes quite a natural proposition
to use an EO, because the use of a population in an iteration helps an EO to
simultaneously find multiple non-dominated solutions, which portray a trade-off
among objectives, in a single simulation run.
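The notion of non-dominated solutions within a population can be illustrated with a short Python sketch. It assumes all objectives are minimized and uses a hypothetical five-member population; the quadratic-time filter shown here is a plain illustration of the dominance test, not any particular EMO algorithm.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly better
    in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def nondominated(population):
    """Members of the population that no other member dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]


# Hypothetical objective vectors of five population members.
population = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = nondominated(population)
print(front)
```

The three surviving vectors trade one objective against the other, which is the kind of trade-off set an EMO procedure aims to find in a single run.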
In this chapter, we begin with a brief description of an evolutionary optimization
procedure for single-objective optimization. Thereafter, we describe
the principles of evolutionary multiobjective optimization and sketch a brief
history of how the field has evolved over the past one-and-a-half decades. To
generate interest in the minds of the readers, we also provide a description
of a few well-known EMO procedures with some representative simulation
studies. Finally, we discuss the achievements of EMO research and its current
focus. It is clear from these discussions that EMO is not only being found to
be useful in solving multiobjective optimization problems, it is also helping
to solve other kinds of optimization problems in a better manner than they
are traditionally solved. As a by-product, EMO-based solutions are helping
to reveal important hidden knowledge about a problem, a matter which is
difficult to achieve otherwise.
However, much of the research focus in EMO has been on finding a set of near
Pareto-optimal solutions, and not many studies have yet been made to execute
the remaining half of a multiobjective optimization task: selection of a single
preferred Pareto-optimal solution for implementation. Although a few studies
have just scratched the surface in this direction, this book remains a major
step towards achieving possible EMO and multi-criterion decision-making
hybrids. This chapter provides a sketch of the EMO background, some
representative EMO procedures, and EMO's achievements and potential, so that
readers can get attracted to the EMO literature and engage in collaborative
research and application using EMO and MCDM.
3.2 Evolutionary Optimization (EO): A Brief Introduction
Evolutionary optimization principles are different from classical optimization
methodologies in the following main ways (Goldberg, 1989):
• An EO procedure does not usually use gradient information in its search
process. Thus, EO methodologies are direct search procedures, allowing
them to be applied to a wide variety of optimization problems. For
this reason, EO methodologies may not be competitive with specific
gradient-based optimization procedures in solving more structured and
well-behaved optimization problems such as convex programming, linear
or quadratic programming.
• An EO procedure uses more than one solution (a population approach)
in an iteration, unlike most classical optimization algorithms, which
update one solution in each iteration (a point approach). The use of a
population has a number of advantages: (i) it provides an EO with a
parallel processing power achieving a computationally quick overall search,
(ii) it allows an EO to find multiple optimal solutions, thereby facilitating
the solution of multi-modal and multiobjective optimization problems, and
(iii) it provides an EO with the ability to normalize decision variables (as
well as objective and constraint functions) within an evolving population
using the population-best minimum and maximum values. However, the
flip side of working with a population of solutions is the computational
cost and memory associated with executing each iteration.
• An EO procedure uses stochastic operators, unlike the deterministic
operators used in most classical optimization methods. The operators tend to
achieve a desired effect by using probability distributions biased towards
desirable outcomes, as opposed to using predetermined and fixed transition
rules. This allows an EO algorithm to negotiate multiple optima and
other complexities better and provides it with a global perspective in
its search.
An EO begins its search with a population of solutions usually created at
random within specified lower and upper bounds on each variable. If bounds
are not supplied in an optimization problem, suitable values can be assumed
for the initialization purpose only. Thereafter, the EO procedure enters into
an iterative operation of updating the current population to create a new
population by the use of four main operators: selection, crossover, mutation
and elite-preservation. The operation stops when one or more pre-specified
termination criteria are met. Thus, an EO takes the following simple structure
in which the terms shown in bold (but not underlined) are changeable by
the user.
Algorithm 1 An Evolutionary Optimization Procedure:

t = 0;
Initialization(P_t);
do
    Evaluation(P_t);
    P'_t = Selection(P_t);
    P''_t = Variation(P'_t);
    P_{t+1} = Elitism(P_t, P''_t);
    t = t + 1;
while (not Termination(P_t, P_{t+1}));

The initialization procedure is already described above. If the knowledge
of some good solutions is available in a problem, it is wise to use such
information in creating the initial population $P_0$. Elsewhere (Deb et al., 2003a),
it is highlighted that for solving complex real-world optimization problems,
such a customized initialization is useful and also helpful in achieving a faster
search.
The evaluation of a population means computing, for each population
member, its objective function value and constraint values, and determining
if the solution is feasible. Since this requires evaluation of multiple functions,
this procedure also requires a relative preference order (or sorting) of solutions
(say from best to worst) in the population. Often, such an ordering can be
established by creating a real-valued fitness function derived from objective
and constraint values. For multiobjective optimization problems, one of the
ways to achieve the ordering is to sort the population based on a domination
principle (Goldberg, 1989). It is interesting to note that since an ordering
from best to worst is enough for the evaluation purpose, EO procedures allow
handling of different problem types: dynamically changing problems, problems
which are not mathematically expressed, problems which are procedure-based,
and others.
After the population members are evaluated, the selection operator chooses
above-average (in other words, better) solutions with a larger probability to
fill an intermediate mating pool. For this purpose, several stochastic selection
operators exist in the EO literature. In its simplest form (called
tournament selection (Deb, 1999a)), two solutions are picked at random from
the evaluated population and the better of the two (in terms of its evaluated
order) is chosen.
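The binary tournament described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the `rank` list (position of each member in the best-to-worst ordering, smaller is better) and the choice to fill a pool of the same size as the population are not taken from the text.

```python
import random

def tournament_selection(population, rank, rng=random):
    """Fill a mating pool by repeated binary tournaments.

    `rank[i]` is the (assumed) position of member i in the best-to-worst
    ordering produced by the evaluation step; smaller means better.
    """
    pool = []
    for _ in range(len(population)):
        # Pick two distinct members at random.
        i, j = rng.sample(range(len(population)), 2)
        # The better-ranked member of the pair enters the mating pool.
        winner = i if rank[i] <= rank[j] else j
        pool.append(population[winner])
    return pool

pool = tournament_selection(["a", "b", "c", "d"], [0, 1, 2, 3],
                            rng=random.Random(0))
```

Because the two competitors are distinct, the worst member of the population can never win a tournament, whereas the best member wins every tournament it takes part in.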
The variation operator is a collection of a number of operators (such as
crossover, mutation etc.) which are used to generate a modified population.
The purpose of the crossover operator is to pick two or more solutions
(parents) randomly from the mating pool and create one or more solutions by
exchanging information among the parent solutions. The crossover operator is
applied with a crossover probability ($p_c \in [0, 1]$), indicating the
proportion of population members participating in the crossover operation. The
remaining $(1 - p_c)$ proportion of the population is simply copied to the
modified (child) population. In the context of real-parameter optimization
having $n$ real-valued variables and involving a crossover with two parent
solutions, each variable may be crossed one at a time.
A probability distribution which depends on the difference between the two
parent variable values is often used to create two new numbers as child values
around the two parent values. One way to set the probability distribution is
such that if the two parent values are quite different, the created child
values are also quite different from their parent values, thereby allowing a
broader search to take place. This process is usually helpful during the early
generations of an EO, when population members are quite different from each
other. However, after some generations, when population members tend to get
close to each other and converge to an interesting region in the search space
due to the interactions of EO operators, the same crossover operator must then
focus the search by creating child values closer to parent values. This varying
action of the EO crossover operator, without any intervention from the user,
provides a self-adaptive feature to a real-parameter EO algorithm. There exist
a number of probability distributions for achieving such a self-adaptive
crossover operation (Deb, 2001; Herrera et al., 1998). A unimodal probability
distribution with its mode at the parent variable value is used to design the
simulated binary crossover (SBX), which is popularly used in the EO literature.
Besides the variable-wise recombination operators, vector-wise recombination
operators have also been suggested to propagate the correlation among variables
of parent solutions to the created child solutions (Deb et al., 2002a; Storn
and Price, 1997).
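As an illustration of a self-adaptive, variable-wise crossover, the following minimal sketch applies the commonly published form of SBX to a single variable. It assumes no variable bounds; the function name and the parameter name `eta_c` (the crossover distribution index) are illustrative choices, not part of the text above.

```python
import random

def sbx_pair(p1, p2, eta_c=2.0, rng=random):
    """Create two child values from two parent values of one variable."""
    u = rng.random()  # uniform random number in [0, 1)
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta_c + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0))
    # Children are placed symmetrically around the parents' mean; their
    # separation is beta times the separation of the parents.
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2

c1, c2 = sbx_pair(1.0, 3.0, eta_c=2.0, rng=random.Random(1))
```

Since the spread of the children scales with the difference between the parents, distant parents yield a broad search and nearly identical parents yield children close to them, which is the self-adaptive behaviour described above. The children's mean always equals the parents' mean.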
Each child solution, created by the crossover operator, is then perturbed
in its vicinity by a mutation operator (Goldberg, 1989). Every variable is
mutated with a mutation probability $p_m$, usually set as $1/n$ ($n$ is the number
of variables), so that on average one variable gets mutated per solution.
In the context of real-parameter optimization, a simple Gaussian probability
distribution with a predefined variance can be used with its mean at the
child variable value (Deb, 2001). This operator allows an EO to search locally
around a solution and is independent of the location of other solutions in the
population.
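A minimal sketch of the Gaussian mutation just described is given below. The function name, the default standard deviation `sigma` and the absence of bound handling are assumptions made for illustration only.

```python
import random

def gaussian_mutation(child, sigma=0.1, p_m=None, rng=random):
    """Perturb each variable with probability p_m (default 1/n)."""
    n = len(child)
    if p_m is None:
        p_m = 1.0 / n  # on average one variable is mutated per solution
    mutated = list(child)
    for i in range(n):
        if rng.random() < p_m:
            # Gaussian perturbation centred at the child variable value.
            mutated[i] = rng.gauss(child[i], sigma)
    return mutated

mutant = gaussian_mutation([1.0, 2.0, 3.0], sigma=0.1, rng=random.Random(0))
```

The perturbation depends only on the solution being mutated, not on the rest of the population, which is what makes the operator a purely local search step.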
The elitism operator combines the old population with the newly created
population and chooses to keep better solutions from the combined population.
Such an operation makes sure that an algorithm has a monotonically
non-degrading performance. Rudolph (1994) proved the asymptotic convergence of
a specific EO having elitism and mutation as two essential operators.
Finally, the user of an EO needs to choose termination criteria. Often, a
predetermined number of generations is used as a termination criterion. For
goal attainment problems, an EO can be terminated as soon as a solution with a
predefined goal or a target solution is found. In many studies (Goldberg, 1989;
Michalewicz, 1992; Gen and Cheng, 1997; Bäck et al., 1997), a termination
criterion based on the statistics of the current population vis-a-vis that of
the previous population is used to determine the rate of convergence. In other
more recent studies, theoretical optimality conditions (such as the extent of
satisfaction of Karush-Kuhn-Tucker (KKT) conditions) are used to determine the
termination of a real-parameter EO algorithm (Deb et al., 2007). Although EOs
are heuristic based, such theoretical optimality concepts can also be
used in an EO to test its converging abilities towards local optimal solutions.
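The operators described above can be assembled into one loop. The following sketch is a simple instance under stated assumptions rather than a definitive implementation: it minimizes a single objective, uses binary tournament selection, a plain blend crossover as a stand-in for SBX, Gaussian mutation, elitism over the combined parent and child populations, and a fixed generation count as the termination criterion. All names and default values are illustrative.

```python
import random

def evolve(objective, bounds, pop_size=20, generations=50, p_c=0.9,
           sigma=0.1, seed=0):
    """Minimize `objective` over the box `bounds` with a generational EO."""
    rng = random.Random(seed)
    p_m = 1.0 / len(bounds)  # on average one mutated variable per solution

    # Initialization: random solutions within the variable bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best_per_generation = []

    for generation in range(generations):  # termination: generation count
        # Evaluation and selection: binary tournaments fill the mating pool.
        pool = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            pool.append(a if objective(a) <= objective(b) else b)

        # Variation: blend crossover (with probability p_c), then mutation.
        children = []
        for i in range(pop_size):
            p1, p2 = pool[i], pool[(i + 1) % pop_size]
            if rng.random() < p_c:
                w = rng.random()
                child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            else:
                child = list(p1)
            for k, (lo, hi) in enumerate(bounds):
                if rng.random() < p_m:
                    child[k] = min(max(rng.gauss(child[k], sigma), lo), hi)
            children.append(child)

        # Elitism: keep the best pop_size members of parents plus children.
        pop = sorted(pop + children, key=objective)[:pop_size]
        best_per_generation.append(objective(pop[0]))

    return pop[0], best_per_generation


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)

best, history = evolve(sphere, [(-5.0, 5.0)] * 3)
```

Because the elitism step always keeps the best member of the combined population, the recorded best objective value in `history` never gets worse from one generation to the next.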
Thus, overall an EO procedure is a population-based stochastic search
procedure which iteratively emphasizes its better population members and uses
them to recombine and perturb locally in the hope of creating new and better
populations until a predefined termination criterion is met. The use of a
population helps to achieve an implicit parallelism (Goldberg, 1989; Holland,
1975; Vose et al., 2003) in an EO's search mechanism (an inherent parallel
search in different regions of the search space), a matter which makes an EO
computationally attractive for solving difficult problems. In the context of
certain Boolean functions, a computational time saving in finding the optimum
that varies polynomially with the population size has been proven (Jansen and
Wegener, 2001). On one hand, the EO procedure is flexible, thereby allowing a
user to choose suitable operators and problem-specific information to suit a
specific problem. On the other hand, the flexibility comes with an onus on the
part of a user to choose appropriate and tangible operators so as to create an
efficient and consistent search (Radcliffe, 1991). However, the benefits of
having a flexible optimization procedure, over more rigid and specific
optimization algorithms, provide hope in solving difficult real-world
optimization problems involving non-differentiable objectives and constraints,
non-linearities, discreteness, multiple optima, large problem sizes,
uncertainties in computation of objectives and constraints, uncertainties in
decision variables, mixed types of variables, and others. A wiser approach to
solving optimization problems of the real world would be to first understand
the niche of both EO and classical methodologies and then adopt hybrid
procedures employing the better of the two as the search progresses over
varying degrees of search-space complexity from start to finish.
3.2.1 EO Terminologies
Evolutionary optimization literature uses somewhat different terminologies
than those used in classical optimization. The following brief descriptions may
be beneficial for the readers from other optimization fields:
Children: New solutions (or decision variable vectors) created by a combined
effect of crossover and mutation operators.
Crossover: An operator in which two or more parent solutions are used to
create (through recombination) one or more child solutions.
Crossover probability: The probability of performing a crossover operation.
This means, on average, the proportion of population members partici-
pating in crossover operation in a generation.
Differential evolution (DE): A particular evolutionary optimization procedure
which is usually applied to solve real-parameter optimization problems.
A good reference on DE is available from Price et al. (2005).
Distribution index: A non-negative parameter ($\eta_c$) used for implementing the
real-parameter simulated binary crossover (SBX). For achieving a mutation
operation in the real-valued search space, a polynomial probability
distribution with its mode at the parent variable value and with monotonically
reducing probability for creating child values away from the parent was
suggested (Deb, 2001). This operator involves a non-negative parameter
($\eta_m$), which must also be set by the user. In both the above recombination
and mutation operations, a large distribution index value means a probability
distribution with a small variance. For single-objective EO, $\eta_c = 2$
is commonly used and for multiobjective EAs, $\eta_c \in [5, 10]$ is used, whereas
$\eta_m \approx 20$ is usually used in both cases. A study (Deb and Agrawal, 1999)
has shown that with $\eta_m$, a perturbation of the order $O(1/\eta_m)$ takes place
to the mutated variable from its parent solution to the mutated solution.
Elitism: An operator which preserves the better of parent and child solutions
(or populations) so that a previously found better solution is never deleted.
Evolutionary algorithm: A generic name given to an algorithm which applies
Darwinian survival-of-the-fittest evolutionary principles along with genetically
motivated recombination and mutation principles in a stochastic
manner usually to a population of solutions to iteratively create a new
and hopefully better population of solutions in the context of a stationary
or a dynamic fitness landscape.
Evolutionary optimization (EO): An EA which is designed to solve an opti-
mization problem.
Evolutionary programming (EP): An EA originally applied to a set of
finite-state machines to evolve better learning machines (Fogel et al., 1966).
Now, EPs are used for various optimization tasks including real-parameter
optimization and are somewhat similar to evolution strategies. EP mainly
depends on its selection and mutation operators.
Evolution strategy (ES): An EA which is mostly applied to real-valued deci-
sion variables and is mainly driven by a selection and a mutation operator,
originally suggested by P. Bienert, I. Rechenberg and H.-P. Schwefel of
the Technical University of Berlin. Early applications were experimental
based (Rechenberg, 1965; Schwefel, 1968) and some texts (Rechenberg,
1973; Schwefel, 1995) provide details about ES procedures.
Fitness: A fitness or a fitness landscape is a function derived from objective
function(s), constraint(s) and other problem descriptions which is used in
the selection (or reproduction) operator of an EA. A solution is usually
called better than another if its fitness function value is better.
Generation: An iteration of an EA.
Generational EA: An EA in which a complete set of child solutions (equal
to the population size $N$) is first created by EA operators (usually, selection,
crossover and mutation) before comparing with parent solutions to
decide which solutions qualify as members of the new population. Thus,
in one iteration, a complete population of $N$ members replaces the old
population.
Genetic algorithm (GA): An early version of an EA, originally proposed
by Holland (1962, 1975), which uses three main operators (selection,
crossover and mutation) on a population of solutions at every generation.
In binary-coded GAs, solutions are represented in a string of binary
digits (bits). In real-parameter GAs, solutions are represented as a vec-
tor of real-parameter decision variables. Other representations can also be
used to suit the handling of a problem.
Genetic programming (GP): An EA which works on computer programs (usu-
ally C or Lisp codes) or on graphs, trees, etc., representing a solution
methodology to mainly evolve optimal learning strategies (Koza, 1992).
Individual: An EA population member representing a solution to the problem
at hand.
Mating pool: An intermediate population (usually created by the selection
operator) used for creating new solutions by crossover and mutation op-
erators.
Mutation: An EA operator which is applied to a single solution to create a new
perturbed solution. A fundamental difference from a crossover operator is
that mutation is applied to a single solution, whereas crossover is applied
to more than one solution.
Mutation probability: The probability of performing a mutation operation.
This refers to, on average, the proportion of decision variables participat-
ing in a mutation operation to a solution.
Niching: An operator by which the selection pressure of population
members is controlled so as not to allow a single solution to take over
the population. Thus, niching helps to maintain a diverse population. A
number of niching techniques exist, but the sharing approach (Goldberg
and Richardson, 1987) is popularly used to find a set of multiple optimal
solutions in a multi-modal optimization problem.
Offspring: Same as Children, defined above.
Parent: A solution used during crossover operation to create a child solution.
Particle swarm optimization (PSO): An EA which updates each population
member by using a weighted sum of two concepts borrowed from the
movement of a natural swarm of birds (or fishes etc.): the inclination to move
towards its own individual best position and towards the swarm's best
position in the decision variable space. A good source for more information
is Kennedy et al. (2001).
Population: A set of solutions used in one generation of an EA. The number
of solutions in a population is called population size.
Recombination: Same as Crossover, defined above.
Reproduction: An EA operator which mimics Darwin's survival of the fittest
principle by making duplicate copies of above-average solutions in the
population at the expense of deleting below-average solutions. Initial EA
studies used a proportionate reproduction procedure in which multiple copies
of a population member are assigned to the mating pool in proportion
to the individual's fitness. Thus, this operator is used for maximization
problems and for fitness values which are non-negative. Current studies
use tournament selection, which compares two population members based
on their fitness values and sends the better solution to the mating pool.
This operator does not have any limitation on the fitness function.
Selection: Same as Reproduction, defined above.
Selection pressure: The extent of emphasis given to above-average solutions
in a selection operator. No real quantifiable definition exists, but loosely
it is considered as the number of copies allocated to the population-best
solution by the selection operator (Goldberg et al., 1993).
Sharing strategy: A niching operation in which each population member's
fitness is divided by a niche-count (in some sense, a niche-count is an
estimate of the number of other population members around an individual)
and a shared fitness is computed. A proportionate selection procedure is
then used with the shared fitness values to create the mating pool.
Solution: An EA population member, same as an individual.
Steady-state EA: An EA in which only one new population member is added
to the population in one generation.
String: In a binary-coded GA, a population member, made of a collection of
bits, is called a string.
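The polynomial mutation mentioned under "Distribution index" above can be sketched as follows. This is the basic, commonly published form of the operator applied to one variable; the function name, the clipping of the result to the variable bounds and the argument names are assumptions made for illustration.

```python
import random

def polynomial_mutation(x, lower, upper, eta_m=20.0, rng=random):
    """Perturb one variable with a polynomial distribution peaked at x."""
    u = rng.random()  # uniform random number in [0, 1)
    if u < 0.5:
        delta = (2.0 * u) ** (1.0 / (eta_m + 1.0)) - 1.0
    else:
        delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta_m + 1.0))
    # delta lies in [-1, 1) and is scaled by the variable range.
    y = x + delta * (upper - lower)
    # Assumed bound handling: clip the mutated value to [lower, upper].
    return min(max(y, lower), upper)

y = polynomial_mutation(1.0, 0.0, 2.0, eta_m=20.0, rng=random.Random(0))
```

A larger `eta_m` concentrates `delta` more tightly around zero, in line with the statement that a large distribution index corresponds to a small variance.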
3.3 Evolutionary Multiobjective Optimization (EMO)

A multiobjective optimization problem involves a number of objective func-
tions which are to be either minimized or maximized. As in a single-objective
optimization problem, the multiobjective optimization problem may contain
a number of constraints which any feasible solution (including all optimal so-
lutions) must satisfy. Since objectives can be either minimized or maximized,
we state the multiobjective optimization problem in its general form:

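\begin{equation}
\begin{array}{lll}
\text{Minimize/Maximize} & f_m(\mathbf{x}), & m = 1, 2, \ldots, M; \\
\text{subject to} & g_j(\mathbf{x}) \geq 0, & j = 1, 2, \ldots, J; \\
 & h_k(\mathbf{x}) = 0, & k = 1, 2, \ldots, K; \\
 & x_i^{(L)} \leq x_i \leq x_i^{(U)}, & i = 1, 2, \ldots, n.
\end{array}
\tag{3.1}
\end{equation}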
A solution $\mathbf{x} \in \mathbb{R}^n$ is a vector of $n$ decision variables:
$\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$.
The solutions satisfying the constraints and variable bounds constitute a
feasible decision variable space $S \subset \mathbb{R}^n$. One of the striking
differences between single-objective and multiobjective optimization is that in
multiobjective optimization the objective functions constitute a
multi-dimensional space, in addition to the usual decision variable space. This
additional $M$-dimensional space is called the objective space,
$Z \subset \mathbb{R}^M$. For each solution $\mathbf{x}$ in the decision
variable space, there exists a point $\mathbf{z} \in \mathbb{R}^M$ in the
objective space, denoted by
$\mathbf{f}(\mathbf{x}) = \mathbf{z} = (z_1, z_2, \ldots, z_M)^T$. To make the
descriptions clear, we refer to a solution as a variable vector and to a point
as the corresponding objective vector.
The optimal solutions in multiobjective optimization can be defined from
a mathematical concept of partial ordering. In the parlance of multiobjective
optimization, the term domination is used for this purpose. In this section, we
restrict ourselves to discussing unconstrained (without any equality, inequality
or bound constraints) optimization problems. The domination between two
solutions is defined as follows (Deb, 2001; Miettinen, 1999):
Definition 1. A solution $\mathbf{x}^{(1)}$ is said to dominate the other solution $\mathbf{x}^{(2)}$, if
both the following conditions are true:
1. The solution $\mathbf{x}^{(1)}$ is no worse than $\mathbf{x}^{(2)}$ in all objectives. Thus, the
solutions are compared based on their objective function values (or location of
the corresponding points ($\mathbf{z}^{(1)}$ and $\mathbf{z}^{(2)}$) in the objective space).
2. The solution $\mathbf{x}^{(1)}$ is strictly better than $\mathbf{x}^{(2)}$ in at least one objective.
Although this definition is stated for two solution vectors here, the
domination is determined with their objective vectors and the above definition
is identical to the one outlined in Section 7 of the preface. For a given set of
solutions (or corresponding points in the objective space, for example, those
shown in Figure 3.1(a)), a pair-wise comparison can be made using the above
definition and whether one point dominates the other can be established. All
points which are not dominated by any other member of the set are called the
non-dominated points of class one, or simply the non-dominated points. For
the set of six solutions shown in the figure, they are points 3, 5, and 6. One
property of any two such points is that a gain in an objective from one point to
the other happens only due to a sacrifice in at least one other objective. This
trade-off property between the non-dominated points makes the practitioners
interested in finding a wide variety of them before making a final choice.
These points make up a front when viewed together in the objective
space; hence the non-dominated points are often visualized to represent a
non-domination front. The computational effort needed to select the points
of the non-domination front from a set of $N$ points is $O(N \log N)$ for 2 and 3
objectives, and $O(N \log^{M-2} N)$ for $M > 3$ objectives (Kung et al., 1975).
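The domination check of Definition 1 and the selection of the non-dominated points can be sketched as below. The sketch assumes that every objective is to be minimized (a maximized objective can be negated first) and uses a naive pair-wise comparison costing $O(MN^2)$ rather than the faster procedure of Kung et al. (1975). The sample points are hypothetical and are not those of Figure 3.1.

```python
def dominates(z1, z2):
    """True if objective vector z1 dominates z2 (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(z1, z2))
    strictly_better = any(a < b for a, b in zip(z1, z2))
    return no_worse and strictly_better

def non_dominated(points):
    """Return the members of `points` not dominated by any other member."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors (f1, f2), both minimized.
points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
front = non_dominated(points)
```

In this example (3.0, 4.0) is dominated by (2.0, 3.0) and (5.0, 5.0) is dominated by (1.0, 5.0), so the remaining three points form the non-domination front; moving between any two of them improves one objective only at the expense of the other.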
[Figure 3.1: two panels, (a) and (b), each plotting f2 (minimize) against
f1 (maximize) for six numbered points; panel (b) marks the non-dominated front.]
Fig. 3.1. A set of points and the first non-domination front are shown.
With the above concept, it is now easier to define the Pareto-optimal
solutions in a multiobjective optimization problem. If the given set of points
for the above task contains all points in the search space (assuming a countable
number), the points lying on the non-domination front, by definition,
do not get dominated by any other point in the objective space,
hence are Pareto-optimal points (together they constitute the Pareto-optimal
front) and the corresponding pre-images (decision variable vectors) are called
Pareto-optimal solutions. However, more mathematically elegant definitions
of Pareto-optimality (including the ones for continuous search space problems)
exist in the multiobjective literature and are discussed in Chapters 1 and 2.
Similar to local optimal solutions in single objective optimization, local
Pareto-optimal solutions are also dened in multiobjective optimization (Deb,
2001; Miettinen, 1999):
Definition 2. If for every member $\mathbf{x}$ in a set $P$ there exists no solution $\mathbf{y}$ (in
the neighborhood of $\mathbf{x}$ such that $\|\mathbf{y} - \mathbf{x}\|_\infty \leq \epsilon$, where $\epsilon$ is a small positive
scalar) dominating any member of the set $P$, then solutions belonging to the
set $P$ constitute a local Pareto-optimal set.
3.3.1 EMO Principles
In the context of multiobjective optimization, the extremist principle of
finding the optimum solution cannot be applied to one objective alone, when
the rest of the objectives are also important. Different solutions may produce
trade-offs (conflicting outcomes among objectives) among different objectives.
A solution that is extreme (in a better sense) with respect to one objective
requires a compromise in other objectives. This prohibits one from choosing a
solution which is optimal with respect to only one objective. This clearly
suggests two ideal goals of multiobjective optimization:
1. Find a set of solutions which lie on the Pareto-optimal front, and
2. Find a set of solutions which are diverse enough to represent the entire
range of the Pareto-optimal front.
Evolutionary multiobjective optimization (EMO) algorithms attempt to fol-
low both the above principles similar to the other a posteriori MCDM methods
(refer to Chapter 1).
Although one fundamental difference between single and multiple objective
optimization lies in the cardinality of the optimal set, from a practical
standpoint a user needs only one solution, no matter whether the associated
optimization problem is single or multiobjective. The user is now in a dilemma.
Since a number of solutions are optimal, the obvious question arises: Which
of these optimal solutions must one choose? This is not an easy question to
answer. It involves higher-level information which is often non-technical,
qualitative and experience-driven. However, if a set of many trade-off
solutions is already worked out or available, one can evaluate the pros and
cons of each of these solutions based on all such non-technical and
qualitative, yet still important, considerations and compare them to make a
choice. Thus, in multiobjective optimization, ideally the effort must be made
in finding the set of trade-off optimal solutions by considering all objectives
to be important. After a set of such trade-off solutions is found, a user can
then use higher-level qualitative considerations to make a choice. Since an EMO
procedure deals with a population of solutions in every iteration, it is
intuitive to apply it in multiobjective optimization to find a set of
non-dominated solutions. Like other a posteriori MCDM methodologies, an EMO
based procedure works with the following principle in handling multiobjective
optimization problems:
Step 1 Find multiple non-dominated points as close to the Pareto-optimal
front as possible, with a wide trade-off among objectives.
Step 2 Choose one of the obtained points using higher-level information.
Figure 3.2 shows schematically the principles followed in an EMO procedure.
Since EMO procedures are heuristic based, they may not guarantee finding
Pareto-optimal points, as a theoretically provable optimization method
would do for tractable (for example, linear or convex) problems. But EMO
procedures have essential operators to constantly improve the evolving
non-dominated points (from the point of view of convergence and diversity
discussed above) similar to the way most natural and artificial evolving
systems continuously improve their solutions. To this effect, a recent
simulation study (Deb et al., 2007) has demonstrated that a particular EMO
procedure, starting from random non-optimal solutions, can progress towards
theoretical Karush-Kuhn-Tucker (KKT) points with iterations in real-valued
multiobjective optimization problems. The main difference and advantage of
using an EMO compared to a posteriori MCDM procedures is that multiple
trade-off solutions can be found in a single simulation run, whereas most a
posteriori MCDM methodologies would require multiple applications.
[Figure 3.2: flowchart. In Step 1, a multiobjective optimization problem
(minimize f1, f2, ..., fM subject to constraints) is passed to an ideal
multiobjective optimizer, which finds multiple trade-off solutions. In Step 2,
higher-level information is used to choose one solution.]
Fig. 3.2. Schematic of a two-step multiobjective optimization procedure.
In Step 1 of the EMO-based multiobjective optimization (the task shown
vertically downwards in Figure 3.2), multiple trade-off, non-dominated points
are found. Thereafter, in Step 2 (the task shown horizontally, towards the
right), higher-level information is used to choose one of the obtained
trade-off points. This dual task allows an interesting feature, if applied for
solving single-objective optimization problems. It is easy to realize that
single-objective optimization is a degenerate case of multiobjective
optimization, as shown in detail in another study (Deb and Tiwari, 2008). In
the case of single-objective optimization having only one globally optimal
solution, Step 1 will ideally find only one solution, thereby not requiring us
to proceed to Step 2. However, in the case of single-objective optimization
having multiple global optima, both steps are necessary to first find all or
multiple global optima, and then to choose one solution from them by using
higher-level information about the problem. Thus, although it seems ideal for
multiobjective optimization, the framework suggested in Figure 3.2 can ideally
be thought of as a generic principle for both single and multiple objective
optimization.
3.3.2 A Posteriori MCDM Methods and EMO
In the a posteriori MCDM approaches (also known as generating MCDM
methods), the task of finding multiple Pareto-optimal solutions is achieved
by executing many independent single-objective optimizations, each time
finding a single Pareto-optimal solution (see Miettinen (1999) and Chapter 1 of
this book). A parametric scalarizing approach (such as the weighted-sum
approach, the $\epsilon$-constraint approach, and others) can be used to convert
multiple objectives into a parametric single-objective function. By simply
varying the parameters (weight vector or $\epsilon$-vector) and optimizing the
scalarized function, different Pareto-optimal solutions can be found. In
contrast, in an EMO, multiple Pareto-optimal solutions are attempted to be
found in a single simulation by emphasizing multiple non-dominated and isolated
solutions. We discuss a little later some EMO algorithms describing how such
dual emphasis is provided, but now discuss qualitatively the difference between
a posteriori MCDM and EMO approaches.
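The generating approach described above can be illustrated with the weighted-sum scalarization. In the sketch below, each "single-objective optimization" is simplified to a search over a small, hypothetical list of candidate objective vectors (both objectives minimized); the function name, the candidates and the weight vectors are all assumptions made for illustration.

```python
def weighted_sum_front(candidates, weight_vectors):
    """Run one scalarized minimization per weight vector."""
    chosen = []
    for w in weight_vectors:
        # Scalarize: weighted sum of the objectives, then minimize it.
        best = min(candidates,
                   key=lambda z: sum(wi * zi for wi, zi in zip(w, z)))
        if best not in chosen:
            chosen.append(best)
    return chosen

# Hypothetical objective vectors (f1, f2), both minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
weights = [(0.9, 0.1), (0.6, 0.4), (0.1, 0.9)]
front = weighted_sum_front(candidates, weights)
```

Each weight vector requires its own independent optimization and yields at most one trade-off point, so the number of runs grows with the number of points wanted. With strictly positive weights every point found is non-dominated, although a weighted sum cannot reach points lying on non-convex parts of a front.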
Consider Figure 3.3, in which we sketch how multiple independent parametric
single-objective optimizations may find different Pareto-optimal solutions.
The Pareto-optimal front corresponds to global optimal solutions of
several scalarized objectives. However, during the course of an optimization
task, algorithms must overcome a number of difficulties, such as infeasible
regions, local optimal solutions, flat regions of objective functions, isolation
of optimum, etc., to converge to the global optimal solution. Moreover, due
to practical limitations, an optimization task must also be completed in a
reasonable computational time. This requires an algorithm to strike a good
balance between the extent of these tasks its search operators must do to
overcome the above-mentioned difficulties reliably and quickly. When multiple
simulations are to be performed to find a set of Pareto-optimal solutions, the
above balancing act must be performed in every single simulation. Since
simulations are performed independently, no information about the success
or failure of previous simulations is used to speed up the process. In difficult
multiobjective optimization problems, such memory-less a posteriori methods
may demand a large overall computational overhead to get a set of
Pareto-optimal solutions. Moreover, even though convergence can be achieved in
some problems, independent simulations can never guarantee finding a good
distribution among obtained points.
[Figure 3.3: sketch in the f1-f2 objective space showing initial points,
infeasible regions, local fronts and the Pareto-optimal front.]
Fig. 3.3. A posteriori MCDM methodology employs independent single-objective
optimizations.
EMO, as mentioned earlier, constitutes an inherently parallel search. When a
population member overcomes certain difficulties and makes progress towards
the Pareto-optimal front, its variable values and their combination reflect
this fact. When a recombination takes place between this solution and other
population members, such valuable information about variable value
combinations gets shared through variable exchanges and blending, thereby
making the overall task of finding multiple trade-off solutions a parallelly
processed task.
3.4 History of EMO and Non-elitist Methodologies

During the early years, EA researchers realized the need to solve
multiobjective optimization problems in practice and mainly resorted to
weighted-sum approaches to convert multiple objectives into a single goal
(Rosenberg, 1967; Fogel et al., 1966).
However, the first implementation of a real multiobjective evolutionary
algorithm (vector-evaluated GA or VEGA) was suggested by David Schaffer in
1984 (Schaffer, 1984). Schaffer modified the simple three-operator genetic
algorithm (with selection, crossover, and mutation) by performing independent
selection cycles according to each objective. The selection method is repeated
for each individual objective to fill up a portion of the mating pool. Then the
entire population is thoroughly shuffled before the crossover and mutation
operators are applied, so that individuals of different subpopulation groups
mate with each other. The algorithm worked efficiently for some generations
but in some cases suffered from its bias towards some individuals or regions
(mostly individual objective champions). This does not fulfill the second goal
of EMO, discussed earlier.
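The per-objective selection and shuffling described above can be sketched as
follows. This is only an illustration: the binary tournament is a stand-in
assumption for whatever selection operator the underlying GA uses, both
objectives are assumed to be minimized, and the function name is hypothetical.

```python
import random


def vega_mating_pool(population, objectives, rng):
    """Fill the mating pool in portions, one portion per objective."""
    n = len(population)
    m = len(objectives)
    pool = []
    for k, objective in enumerate(objectives):
        # Split the n slots as evenly as possible among the m objectives.
        slots = n // m + (1 if k < n % m else 0)
        for _ in range(slots):
            a, b = rng.sample(population, 2)
            # Selection looks at objective k only (minimization assumed).
            pool.append(a if objective(a) <= objective(b) else b)
    # Shuffle so that crossover mixes members of different subpopulations.
    rng.shuffle(pool)
    return pool


rng = random.Random(0)
population = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
objectives = [lambda x: x * x, lambda x: (x - 2.0) ** 2]
print(vega_mating_pool(population, objectives, rng))
```

Because each portion is selected on one objective alone, solutions that are
good compromises but champions of no single objective tend to lose their
tournaments, which is the bias noted in the text.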
Ironically, no significant study was performed for almost a decade after the
pioneering work of Schaffer, until a revolutionary 10-line sketch of a new
non-dominated sorting procedure was suggested by David E. Goldberg in his
seminal book on GAs (Goldberg, 1989). Since an EA needs a fitness function for
reproduction, the trick was to find a single metric from a number of objective
functions. Goldberg's suggestion was to use the concept of domination to
assign more copies to non-dominated individuals in a population. Since
diversity is the other concern, he also suggested the use of a niching
strategy (Goldberg and Richardson, 1987) among solutions of a non-dominated
class. Following this clue, at least three independent groups of researchers
developed different versions of multiobjective evolutionary algorithms during
1993-94. Basically, these algorithms differ in the way a fitness assignment
scheme is introduced to each individual. We discuss them briefly in the
following paragraphs.
Fonseca and Fleming (1993) suggested a multiobjective GA (MOGA), in which
all non-dominated population members are assigned rank one. Other individuals
are ranked by calculating how many solutions (say k) dominate a particular
solution. That solution is then assigned a rank of (k+1). The selection
procedure then chooses lower-rank solutions to form the mating pool. Since the
fitness of a population member is the same as its rank, many population
members will have an identical fitness. MOGA applies a niching technique on
solutions having identical fitness to maintain a diverse population. But
instead of performing niching on the parameter values, they suggested niching
on objective function values. The ranking of individuals according to their
non-dominance in the population is an important aspect of the work.
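The rank assignment just described (rank = 1 + number of dominating
solutions) is short enough to write out. The sketch below assumes that all
objectives are minimized; the helper names are illustrative.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def moga_ranks(objs):
    """Rank of each solution: one plus the number of solutions dominating it."""
    return [1 + sum(dominates(q, p) for q in objs) for p in objs]


objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(moga_ranks(objs))  # [1, 1, 1, 2, 5]
```

In this example the three non-dominated points share rank one, (3, 4) is
dominated only by (2, 3), and (5, 5) is dominated by all four other points.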
Horn et al. (1994) used Pareto domination tournaments in their niched-Pareto
GA (NPGA). In this method, a comparison set comprising a specific number
(t_dom) of individuals is picked at random from the population at the
beginning of each selection process. Two random individuals are picked from
the population for selecting a winner according to the following procedure.
Both individuals are compared with the members of the comparison set for
domination. If one of them is non-dominated and the other is dominated, then
the non-dominated point is selected. On the other hand, if both are either
non-dominated or dominated, a niche-count is calculated by simply counting the
number of points in the entire population within a certain distance (σ_share)
in the variable space from an individual. The individual with the least
niche-count is selected. The proposers of this algorithm have reported that
the outcome of the algorithm depends on the chosen value of t_dom.
Nevertheless, the concept of niche formation among the non-dominated points
using the tournament selection is an important aspect of the work.
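A single Pareto domination tournament can be sketched as below. The sketch
follows the description above under a few stated assumptions: objectives are
minimized, the comparison set is passed in as a list of indices rather than
drawn at random, each individual counts itself in its niche-count, and ties in
niche-count go to the first candidate. All names are hypothetical.

```python
import math


def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def niche_count(x, variables, sigma_share):
    """Number of population members within sigma_share of x (variable space)."""
    return sum(1 for y in variables if math.dist(x, y) <= sigma_share)


def npga_tournament(i, j, comparison_set, objs, variables, sigma_share):
    """Return the index (i or j) of the tournament winner."""
    dom_i = any(dominates(objs[k], objs[i]) for k in comparison_set)
    dom_j = any(dominates(objs[k], objs[j]) for k in comparison_set)
    if dom_i != dom_j:
        return j if dom_i else i  # exactly one candidate is dominated
    # Both dominated or both non-dominated: prefer the less crowded one.
    nc_i = niche_count(variables[i], variables, sigma_share)
    nc_j = niche_count(variables[j], variables, sigma_share)
    return i if nc_i <= nc_j else j


objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
variables = [(0.0,), (0.1,), (0.2,), (3.0,), (5.0,)]
print(npga_tournament(3, 2, [1], objs, variables, 0.25))  # 2
print(npga_tournament(0, 3, [2], objs, variables, 0.25))  # 3
```

In the first call solution 3 is dominated by comparison member 1 while
solution 2 is not. In the second call neither candidate is dominated, and
solution 3 wins because it sits alone in its niche.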
Srinivas and Deb (1994) developed a non-dominated sorting GA (NSGA) which
differs from MOGA in two ways: fitness assignment and the way niching is
performed. After the population members belonging to the first non-domination
class are identified, they are assigned a dummy fitness value equal to N (the
population size). A sharing strategy is then used on parameter values (instead
of objective function values) to find the niche-count for each individual of
the best class. Here, the parameter niche-count of an individual is estimated
by calculating sharing function values (Goldberg and Richardson, 1987).
Interested readers may refer to the original study or (Deb, 2001). For each
individual, a shared fitness is then found by dividing the assigned fitness N
by the niche-count. Thereafter, the second class of non-dominated solutions
(obtained by temporarily discounting solutions of the first non-domination
class and then finding new non-dominated points) is assigned a dummy fitness
value smaller than the least shared fitness of solutions of the previous
non-domination class. This process is continued till all solutions are
assigned a fitness value. This fitness assignment procedure ensures two
things: (i) a dominated solution is assigned a smaller shared fitness value
than any solution which dominates it and (ii) in each non-domination class an
adequate diversity is ensured. On a number of test problems and real-world
optimization problems, NSGA has been found to provide a widespread set of
Pareto-optimal or near-Pareto-optimal solutions. However, one difficulty of
NSGA is to choose an appropriate niching parameter, which directly affects the
maximum distance between two neighboring solutions obtained by NSGA. Although
most studies used a fixed value of the niching parameter, an adaptive
niche-sizing strategy has been suggested elsewhere (Fonseca and Fleming,
1993).
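The class-by-class fitness assignment can be sketched as follows. Several
details are assumptions made only for illustration: objectives are minimized,
the sharing function is the commonly used form Sh(d) = 1 - (d/sigma_share)^alpha
for d < sigma_share and 0 otherwise, niche-counts are taken within a class, and
each new dummy fitness is a fixed fraction (`shrink`) of the least shared
fitness of the previous class.

```python
import math


def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def sharing(d, sigma_share, alpha=1.0):
    """Sharing function value for two solutions at distance d."""
    return 1.0 - (d / sigma_share) ** alpha if d < sigma_share else 0.0


def nsga_shared_fitness(variables, objs, sigma_share, shrink=0.95):
    """Shared fitness of every solution, assigned one class at a time."""
    n = len(objs)
    fitness = [0.0] * n
    remaining = list(range(n))
    dummy = float(n)  # dummy fitness N for the first class
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining)]
        for i in front:
            # Niche-count in the variable space; it is >= 1 because Sh(0) = 1.
            niche = sum(sharing(math.dist(variables[i], variables[j]),
                                sigma_share) for j in front)
            fitness[i] = dummy / niche
        # Next class starts below the worst shared fitness of this class.
        dummy = shrink * min(fitness[i] for i in front)
        remaining = [i for i in remaining if i not in front]
    return fitness


objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
variables = [(0.0,), (0.1,), (0.2,), (3.0,), (5.0,)]
print(nsga_shared_fitness(variables, objs, sigma_share=0.5))
```

Within the first class the middle solution (index 1) has the most neighbors
and therefore the smallest shared fitness, while every later class receives
values below all earlier classes, which is property (i) stated above.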
3.5 Elitist Methodologies

The above-mentioned non-elitist EMO methodologies gave a good head-start to
the research and application of EMO, but suffered from the fact that they did
not use an elite-preservation mechanism in their procedures. It was mentioned
earlier that the addition of elitism in an EO provides a monotonically
non-degrading performance. The second-generation EMO algorithms implemented an
elite-preserving operator in different ways and gave birth to elitist EMO
procedures, such as NSGA-II (Deb et al., 2002b), Strength Pareto EA (SPEA)
(Zitzler and Thiele, 1999), Pareto-archived ES (PAES) (Knowles and Corne,
2000), and others. Since these EMO algorithms are state-of-the-art procedures,
we describe one of these algorithms in detail and briefly discuss two other
procedures commonly used in EMO studies.
3.5.1 Elitist Non-dominated Sorting GA or NSGA-II
The NSGA-II procedure (Deb et al., 2002b) is one of the most popularly used
EMO procedures which attempt to find multiple Pareto-optimal solutions in a
multiobjective optimization problem. It has the following three features:
1. It uses an elitist principle,
2. it uses an explicit diversity preserving mechanism, and
3. it emphasizes non-dominated solutions.
At any generation t, the offspring population (say, Q_t) is first created by
using the parent population (say, P_t) and the usual genetic operators.
Thereafter, the two populations are combined together to form a new population
(say, R_t) of size 2N. Then, the population R_t is classified into different
non-domination classes. Thereafter, the new population is filled by points of
different non-domination fronts, one at a time. The filling starts with the
first non-domination front (of class one) and continues with points of the
second non-domination front, and so on. Since the overall population size of
R_t is 2N, not all fronts can be accommodated in the N slots available for the
new population. All fronts which cannot be accommodated are deleted. When the
last allowed front is being considered, there may exist more points in the
front than the remaining slots in the new population. This scenario is
illustrated in Figure 3.4. Instead of arbitrarily discarding some members from
the last front, the points which will make the diversity of the selected
points the highest are chosen.
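The front-by-front filling described above can be sketched as follows. The
sorting routine is a simple repeated-scan version chosen for clarity rather
than speed, objectives are assumed to be minimized, and the function names are
illustrative. The sketch stops at the point where the last front overflows;
how that front is truncated is a separate step.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(objs):
    """Split indices 0..len(objs)-1 into non-domination fronts, best first."""
    remaining = list(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


def fill_by_fronts(fronts, n_slots):
    """Accept whole fronts while they fit; return (accepted, overflow front)."""
    accepted = []
    for front in fronts:
        if len(accepted) + len(front) <= n_slots:
            accepted.extend(front)
        else:
            return accepted, front  # this front must be truncated
    return accepted, []


# Combined population R_t of size 2N = 6, so N = 3 parents plus 3 offspring.
objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5), (2, 6)]
fronts = non_dominated_sort(objs)
print(fronts)                     # [[0, 1, 2], [3, 5], [4]]
print(fill_by_fronts(fronts, 4))  # ([0, 1, 2], [3, 5])
```

With four slots (used here only to force an overflow), the first front is
accepted whole and one member of the second front still has to be picked;
fronts after the overflowing one are discarded.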
The crowded-sorting of the points of the last front which could not be
accommodated fully is achieved in the descending order of their crowding
distance values, and points from the top of the ordered list are chosen. The
crowding distance d_i of point i is a measure of the objective space around i
which is not occupied by any other solution in the population. Here, we simply
calculate this quantity d_i by estimating the perimeter of the cuboid (Figure
3.5) formed by using the nearest neighbors in the objective space as the
vertices (we call this the crowding distance).

Fig. 3.4. Schematic of the NSGA-II procedure (P_t and Q_t are combined into
R_t, which is sorted into non-dominated fronts F1, F2, F3, ...; the last
accepted front is truncated by crowding distance sorting to give P_t+1, and
the remaining solutions are rejected).
Fig. 3.5. The crowding distance calculation (cuboid around solution i formed
by its neighbors i-1 and i+1 in the f1-f2 space).
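A minimal sketch of the crowding distance and of the truncation of the last
front is given below. Normalizing each objective by its range within the front
and assigning an infinite distance to the boundary points follow common
NSGA-II implementations; they are assumptions relative to the brief
description above, as are the function names.

```python
def crowding_distances(front_objs):
    """Crowding distance of each point in one front (list of objective tuples)."""
    n = len(front_objs)
    dist = [0.0] * n
    if n == 0:
        return dist
    for k in range(len(front_objs[0])):
        order = sorted(range(n), key=lambda i: front_objs[i][k])
        f_min = front_objs[order[0]][k]
        f_max = front_objs[order[-1]][k]
        # Boundary points are always kept.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if f_max == f_min:
            continue
        for pos in range(1, n - 1):
            left = front_objs[order[pos - 1]][k]
            right = front_objs[order[pos + 1]][k]
            # Side length of the cuboid along objective k, normalized.
            dist[order[pos]] += (right - left) / (f_max - f_min)
    return dist


def truncate_by_crowding(front, objs, n_keep):
    """Keep the n_keep least crowded members of a front (list of indices)."""
    d = crowding_distances([objs[i] for i in front])
    ranked = sorted(range(len(front)), key=lambda p: d[p], reverse=True)
    return [front[p] for p in ranked[:n_keep]]


objs = [(0.0, 4.0), (1.0, 2.0), (1.5, 1.5), (4.0, 0.0)]
print(crowding_distances(objs))                      # [inf, 1.0, 1.25, inf]
print(truncate_by_crowding([0, 1, 2, 3], objs, 3))   # [0, 3, 2]
```

Point 1 has the smallest crowding distance because its neighbors are closest,
so it is the one dropped when only three slots remain.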
Sample Simulation Results
Here, we show simulation results of NSGA-II on two test problems. The first
problem (ZDT2) is a two-objective, 30-variable problem with a concave
Pareto-optimal front:

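\[
\text{ZDT2}:\quad
\begin{cases}
\text{Minimize } f_1(\mathbf{x}) = x_1,\\
\text{Minimize } f_2(\mathbf{x}) = g(\mathbf{x})\left[1 - \left(f_1(\mathbf{x})/g(\mathbf{x})\right)^2\right],\\
\text{where } g(\mathbf{x}) = 1 + \frac{9}{29}\sum_{i=2}^{30} x_i,\\
0 \le x_1 \le 1,\\
-1 \le x_i \le 1,\quad i = 2, 3, \ldots, 30.
\end{cases}
\tag{3.2}
\]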
The second problem (KUR), with three variables, has a disconnected Pareto-
optimal front:
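\[
\text{KUR}:\quad
\begin{cases}
\text{Minimize } f_1(\mathbf{x}) = \sum_{i=1}^{2}\left[-10\exp\left(-0.2\sqrt{x_i^2 + x_{i+1}^2}\right)\right],\\
\text{Minimize } f_2(\mathbf{x}) = \sum_{i=1}^{3}\left[|x_i|^{0.8} + 5\sin\left(x_i^3\right)\right],\\
-5 \le x_i \le 5,\quad i = 1, 2, 3.
\end{cases}
\tag{3.3}
\]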
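For readers who want to evaluate the two test problems, the following sketch
codes the objective functions of Eqs. (3.2) and (3.3) as read here: g(x) = 1 +
(9/29) times the sum of x_2 to x_30 for ZDT2, and the usual square-root and
sine terms for KUR. The function names are illustrative, and variable bounds
are left to the caller.

```python
import math


def zdt2(x):
    """Objectives (f1, f2) of ZDT2 for a vector x with len(x) >= 2."""
    n = len(x)
    g = 1.0 + 9.0 / (n - 1) * sum(x[1:])  # 9/29 when n = 30
    f1 = x[0]
    f2 = g * (1.0 - (f1 / g) ** 2)
    return f1, f2


def kur(x):
    """Objectives (f1, f2) of KUR for a three-variable vector x."""
    f1 = sum(-10.0 * math.exp(-0.2 * math.sqrt(x[i] ** 2 + x[i + 1] ** 2))
             for i in range(len(x) - 1))
    f2 = sum(abs(xi) ** 0.8 + 5.0 * math.sin(xi ** 3) for xi in x)
    return f1, f2


print(zdt2([0.5] + [0.0] * 29))  # (0.5, 0.75)
print(kur([0.0, 0.0, 0.0]))      # (-20.0, 0.0)
```

When x_2 to x_30 are all zero, g = 1 and ZDT2 reduces to f2 = 1 - f1^2, a
concave curve in the f1-f2 plane.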
NSGA-II is run with a population size of 100 for 250 generations. The
variables are treated as real numbers, and an SBX recombination operator with
p_c = 0.9 and distribution index η_c = 10 and a polynomial mutation operator
(Deb, 2001) with p_m = 1/n (n is the number of variables) and distribution
index η_m = 20 are used. Figures 3.6 and 3.7 show that NSGA-II converges to
the Pareto-optimal front and maintains a good spread of solutions on both test
problems.
Fig. 3.6. NSGA-II on ZDT2 (f2 versus f1).
Fig. 3.7. NSGA-II on KUR (f2 versus f1).

3.5.2 Strength Pareto EA (SPEA) and SPEA2
Zitzler and Thiele (1998) suggested an elitist multi-criterion EA with the
concept of non-domination in their strength Pareto EA (SPEA). They suggested
maintaining an external population in every generation storing all
non-dominated solutions discovered so far, beginning from the initial
population. This external population participates in genetic operations. In
each generation, a combined population with the external and the current
population is first constructed. All non-dominated solutions in the combined
population are assigned a fitness based on the number of solutions they
dominate, and each dominated solution is assigned a fitness equal to one more
than the sum of the fitness values of the solutions which dominate it. This
assignment of fitness makes sure that the search is directed towards the
non-dominated solutions and that diversity among dominated and non-dominated
solutions is maintained at the same time. Diversity is maintained by
performing a clustering procedure to maintain a fixed-size archive. On
knapsack problems, they have reported better results than other methods used
in that study.
In their subsequent improved version (SPEA2) (Zitzler et al., 2001b), three
changes have been made. First, the archive size is always kept fixed by adding
dominated solutions from the EA population, if needed. Second, the fitness
assignment procedure for the dominated solutions is slightly different, and
density information is used to resolve ties between solutions having identical
fitness values. Third, a modified clustering algorithm is used, based on the
k-th nearest neighbor distance estimates for each cluster, and special
attention is paid to preserving the boundary elements.
3.5.3 Pareto Archived ES (PAES) and Pareto Envelope based Selection Algorithms (PESA and PESA2)
Knowles and Corne (2000) suggested Pareto-archived ES (PAES) with one parent
and one child. The child is compared with the parent. If the child dominates
the parent, the child is accepted as the next parent and the iteration
continues. On the other hand, if the parent dominates the child, the child is
discarded and a new mutated solution (a new child) is found. However, if the
child and the parent do not dominate each other, the choice between the child
and the parent is resolved by using a crowding procedure. To maintain
diversity, an archive of non-dominated solutions found so far is maintained.
The child is compared with the archive to check if it dominates any member of
the archive. If yes, the child is accepted as the new parent and the dominated
solution is eliminated from the archive. If the child does not dominate any
member of the archive, both parent and child are checked for their proximity
(in terms of Euclidean distance in the objective space) to the archive
members. If the child resides in the least crowded area in the objective space
compared to other archive members, it is accepted as a parent and a copy is
added to the archive. Later, they suggested a multi-parent PAES with similar
principles as above, but applied with a multi-parent evolution strategy
framework (Schwefel, 1995). In their subsequent version, called the Pareto
Envelope based Selection Algorithm (PESA) (Corne et al., 2000), they combined
good aspects of SPEA and PAES. Like SPEA, PESA carries two populations (a
smaller EA population and a larger archive population). Non-domination and the
PAES crowding concept are used to update the archive with the newly created
child solutions.
In an extended version of PESA (Corne et al., 2001), instead of applying the
selection procedure on population members, hyperboxes in the objective space
are selected based on the number of solutions residing in the hyperboxes.
After hyperboxes are selected, a random solution from the chosen hyperboxes is
kept. This region-based selection procedure has been shown to perform better
than the individual-based selection procedure of PESA. In some sense, the
PESA2 selection scheme is similar in concept to ε-dominance (Laumanns et al.,
2002), in which predefined ε values determine the hyperbox dimensions. Other
ε-dominance based EMO procedures (Deb et al., 2003b) have been shown to be
computationally faster and to find better distributed solutions than NSGA-II
or SPEA2.
There also exist other competent EMOs, such as the multiobjective messy GA
(MOMGA) (Veldhuizen and Lamont, 2000), multiobjective micro-GA (Coello and
Toscano, 2000), neighborhood constraint GA (Loughlin and Ranjithan, 1997),
ARMOGA (Sasaki et al., 2001), and others. Besides, there exist other EA-based
methodologies, such as particle swarm EMO (Coello and Lechuga, 2002; Mostaghim
and Teich, 2003), ant-based EMO (McMullen, 2001; Gravel et al., 2002), and
differential evolution based EMO (Babu and Jehan, 2003).
3.6 Applications of EMO

Since the early development of EMO algorithms in 1993, they have been applied
to many real-world and interesting optimization problems. Descriptions of some
of these studies can be found in books (Deb, 2001; Coello et al., 2002;
Osyczka, 2002), dedicated conference proceedings (Zitzler et al., 2001a;
Fonseca et al., 2003; Coello et al., 2005; Obayashi et al., 2007), and
domain-specific books, journals and proceedings. In this section, we describe
one case study which clearly demonstrates the EMO philosophy which we
described in Section 3.3.1.
3.6.1 Spacecraft Trajectory Design
Coverstone-Carroll et al. (2000) proposed a multiobjective optimization
technique using the original non-dominated sorting algorithm (NSGA) (Srinivas
and Deb, 1994) to find multiple trade-off solutions in a spacecraft trajectory
optimization problem. To evaluate a solution (trajectory), the SEPTOP (Solar
Electric Propulsion Trajectory Optimization) software (Sauer, 1973) is called,
and the delivered payload mass and the total time of flight are calculated.
The multiobjective optimization problem has eight decision variables
controlling the trajectory, three objective functions: (i) maximize the
delivered payload at destination, (ii) maximize the negative of the time of
flight, and (iii) maximize the total number of heliocentric revolutions in the
trajectory, and three constraints limiting the SEPTOP convergence error and
the minimum and maximum bounds on heliocentric revolutions.
On the Earth-Mars rendezvous mission, the study found interesting trade-off
solutions (Coverstone-Carroll et al., 2000). Using a population of size 150,
the NSGA was run for 30 generations. The obtained non-dominated solutions are
shown in Figure 3.8 for two of the three objectives and some selected
solutions are shown in Figure 3.9. It is clear that there exist short-time
flights with smaller delivered payloads (solution marked 44) and long-time
flights with larger delivered payloads (solution marked 36). Solution 44 can
deliver a mass of 685.28 kg and requires about 1.12 years. On the other hand,
an intermediate solution 72 can deliver almost 862 kg with a travel time of
about 3 years. In these figures, each continuous part of a trajectory
represents a thrusting arc and each dashed part of a trajectory represents a
coasting arc. It is interesting to note that only a small improvement in
delivered mass occurs in the solutions between 73 and 72 with a sacrifice in
flight time of about a year.
The multiplicity of trade-off solutions, as depicted in Figure 3.9, is what we
envisaged discovering in a multiobjective optimization problem by using an a
posteriori procedure, such as an EMO algorithm. This aspect was also discussed
in Figure 3.2. Once such a set of solutions with a good trade-off among
objectives is obtained, one can analyze them for choosing a particular
solution. For example, in this problem context, it makes sense not to choose a
solution between points 73 and 72 due to the poor trade-off between the
objectives in this range. On the other hand, choosing a solution between
points 44 and 73 is worthwhile, but which particular solution to choose
depends on other mission-related issues. But by first finding a wide range of
possible solutions and revealing the shape of the front, EMO can help narrow
down the choices and allow a decision maker to make a better decision. Without
the knowledge of such a wide variety of trade-off solutions, proper
decision-making may be a difficult task. Although one can choose a scalarized
objective (such as the ε-constraint method with a particular ε vector) and
find the resulting optimal solution, the decision maker will always wonder
what solution would have been derived if a different ε vector had been chosen.
For example, if ε_1 = 2.5 years is chosen and the mass delivered to the target
is maximized, a solution in between points 73 and 72 will be found. As
discussed earlier, this part of the Pareto-optimal front does not provide the
best trade-offs between objectives that this problem can offer. A lack of
knowledge of good trade-off regions before a decision is made may lead the
decision maker to settle for a solution which, although optimal, may not be a
good compromise solution. The EMO procedure offers a flexible and pragmatic
way of finding a well-diversified set of solutions simultaneously so as to
enable picking a particular region for further analysis or a particular
solution for implementation.

Fig. 3.8. Obtained non-dominated solutions using NSGA (mass delivered to
target in kg versus transfer time in years; marked solutions: 22, 36, 44, 72,
73, 132).
Fig. 3.9. Four trade-off trajectories.
3.7 Constraint Handling in EMO

The constraint handling method modifies the binary tournament selection,
where two solutions are picked from the population and the better solution is
chosen. In the presence of constraints, each solution can be either feasible
or infeasible. Thus, there may be at most three situations: (i) both solutions
are feasible, (ii) one is feasible and the other is not, and (iii) both are
infeasible. We consider each case by simply redefining the domination
principle as follows (we call it the constrained-domination condition for any
two solutions $\mathbf{x}^{(i)}$ and $\mathbf{x}^{(j)}$):
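Definition 3. A solution $\mathbf{x}^{(i)}$ is said to constrained-dominate a
solution $\mathbf{x}^{(j)}$ (or $\mathbf{x}^{(i)} \preceq_c \mathbf{x}^{(j)}$),
if any of the following conditions are true:
1. Solution $\mathbf{x}^{(i)}$ is feasible and solution $\mathbf{x}^{(j)}$ is
not.
2. Solutions $\mathbf{x}^{(i)}$ and $\mathbf{x}^{(j)}$ are both infeasible, but
solution $\mathbf{x}^{(i)}$ has a smaller constraint violation, which can be
computed by adding the normalized violation of all constraints:
\[
\mathrm{CV}(\mathbf{x}) = \sum_{j=1}^{J} \langle \bar{g}_j(\mathbf{x}) \rangle + \sum_{k=1}^{K} \mathrm{abs}\left(\bar{h}_k(\mathbf{x})\right),
\]
where $\langle \alpha \rangle$ is $-\alpha$ if $\alpha < 0$ and is zero
otherwise. The normalization is achieved with the population minimum
($g_j^{\min}$) and maximum ($g_j^{\max}$) constraint violations:
$\bar{g}_j(\mathbf{x}) = (g_j(\mathbf{x}) - g_j^{\min})/(g_j^{\max} - g_j^{\min})$.
3. Solutions $\mathbf{x}^{(i)}$ and $\mathbf{x}^{(j)}$ are feasible and
solution $\mathbf{x}^{(i)}$ dominates solution $\mathbf{x}^{(j)}$ in the usual
sense (Definition 1).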
The above change in the definition requires a minimal change in the NSGA-II
procedure described earlier. Figure 3.10 shows the non-domination fronts on a
six-membered population due to the introduction of two constraints (the
minimization problem is described as CONSTR elsewhere (Deb, 2001)). In the
absence of the constraints, the non-domination fronts (shown by dashed lines)
would have been ((1,3,5), (2,6), (4)), but in their presence, the new fronts
are ((4,5), (6), (2), (1), (3)). The first non-domination front consists of
the best (that is, non-dominated and feasible) points from the population and
any feasible point lies on a better non-domination front than an infeasible
point.
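The constrained-domination check of Definition 3 can be sketched as a small
comparison function. The sketch assumes minimized objectives, inequality
constraints written as g_j(x) >= 0, and constraint values that have already
been normalized; the function names are illustrative.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def constraint_violation(g_values, h_values=()):
    """Sum of violations for g_j(x) >= 0 and h_k(x) = 0 constraints."""
    # The bracket operator keeps only the negative part of each g_j.
    return (sum(-g for g in g_values if g < 0)
            + sum(abs(h) for h in h_values))


def constrained_dominates(objs_i, cv_i, objs_j, cv_j):
    """True if solution i constrained-dominates solution j."""
    if cv_i == 0 and cv_j > 0:
        return True                       # condition 1: feasible vs. infeasible
    if cv_i > 0 and cv_j > 0:
        return cv_i < cv_j                # condition 2: smaller violation wins
    if cv_i == 0 and cv_j == 0:
        return dominates(objs_i, objs_j)  # condition 3: usual domination
    return False                          # i infeasible, j feasible


print(constrained_dominates((5, 5), 0.0, (1, 1), 0.3))  # True
print(constrained_dominates((1, 1), 0.2, (0, 0), 0.5))  # True
print(constrained_dominates((1, 2), 0.0, (2, 3), 0.0))  # True
```

Passing this function wherever the usual domination check is used is one way
to obtain the ordering of fronts described above, in which every feasible
point lies on a better front than any infeasible point.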
3.8 Test Problems with Known Optima

After the initial three non-elitist EMO procedures came out around 1993-94, a
plethora of EMO algorithms was suggested starting around 2000. Since EMO
algorithms are heuristic based, it is essential to compare them against each
other and against true optimal solutions, so as to investigate whether they
are adequate to handle the various difficulties that real-world problems would
offer. Since the purpose of EMO algorithms is to find a set of well-converged
and well-spread solutions on or as close to the Pareto-optimal front as
possible, it was necessary to develop numerical test problems for which the
location and extent of the Pareto-optimal front are exactly known and to test
the performance of EMO algorithms on such test problems. The availability of
test problems has not only boosted the research in EMO but has also brought
researchers closer together in understanding how a population-based EMO solves
multiobjective optimization problems.

Fig. 3.10. Non-constrained-domination fronts (f2 versus f1; solutions are
numbered 1 to 6).
Veldhuizen and Lamont (1998) collected a set of test problems from various
sources in the literature, including classical multiobjective optimization
books. In 1999, Deb (1999b) devised a simple procedure for designing
two-objective optimization problems in which different kinds of difficulties
commonly found and studied in single-objective EO problems were introduced in
a multiobjective optimization problem. A later study (Zitzler et al., 2000)
used Deb's original idea and suggested six test problems (now popularly known
as Zitzler-Deb-Thiele or ZDT) which tested a multiobjective optimization
algorithm for the following complexities:
1. Effect of increasing the number of decision variables (all six ZDT
problems),
2. effect of convexity and non-convexity of the Pareto-optimal front (ZDT1
versus ZDT2),
3. effect of discontinuities and disconnectedness in the Pareto-optimal front
(ZDT1 or ZDT2 versus ZDT3),
4. effect of multiple local Pareto-optimal fronts in arriving at the global
Pareto-optimal front (ZDT1 versus ZDT4),
5. effect of isolation (presence of potentially bad solutions in the
neighborhood of Pareto-optimal solutions) and deception (presence of a wider
and a specific type of basin of attraction for the local Pareto-optimal front)
(ZDT1 versus ZDT5), and
6. effect of non-uniform density of points (on the objective space) across the
Pareto-optimal front (ZDT2 versus ZDT6).
ZDT problems have served EMO research for many years since their suggestion
in 2000. To introduce further complexities, Okabe et al. (2004) and Huband et
al. (2005) have suggested test problems with strong correlations among
variables. An extension of ZDT test problems to include correlations among
variables was suggested in the original study (Deb, 1999b) and specific test
problems have been suggested recently (Deb et al., 2006a; Igel et al., 2007).
A set of nine test problems scalable in the number of objectives and variables
was suggested in 2002 (Deb et al., 2002c, 2005). Some of these problems (the
DTLZ1 to DTLZ7 minimization problems) use a bottom-up strategy in which the
functional shape of the Pareto-optimal front is first assumed and then an
objective space is constructed with the desired Pareto-optimal front as the
lower boundary. Thereafter, a mapping between the decision space and the
objective space is constructed to introduce different kinds of complexities
which can cause an EMO procedure sustained difficulties in converging and
finding a widely distributed set of solutions. The remaining two test problems
(DTLZ8 and DTLZ9) start with a hyperbox as the objective space and then
introduce a number of constraints which eliminate portions of the hyperbox to
help generate a Pareto-optimal front. These test problems have served EMO
researchers in various ways, particularly in developing better algorithms and
in testing the performance of their algorithms in solving various
multiobjective optimization problems (Deb et al., 2006a; Igel et al., 2007).
In addition, there also exist a number of constrained test problems with
non-linear objective functions and constraints. These so-called CTP problems
(Deb, 2001; Deb et al., 2001) introduce different complexities: a disconnected
feasible objective space, a long narrow feasible objective space leading to
Pareto-optimal points, and others. The purpose of these test problems was to
simulate different difficulties which real-world problems may have. The
problems provide a tunable degree of such difficulties and this way, if an
algorithm is capable of negotiating them well in test problems, the algorithm
is also expected to perform well in a real-world scenario having similar
difficulties.
3.9 Performance Measures and Evaluation of EMO Methodologies
When algorithms are developed and test problems with known Pareto-optimal
fronts are available, an important task is to have one or more performance
measures with which the EMO algorithms can be evaluated. Thus, much EMO
research has been devoted to developing different performance measures, and
this remains a major area of EMO research. Chapter 14 presents a detailed
account of this topic. Since the focus in an EMO task is multi-faceted
(convergence to the Pareto-optimal front and diversity of solutions along the
entire front), it is also expected that a single performance measure for
evaluating EMO algorithms will be unsatisfactory. In the early years of EMO
research, three different sets of performance measures were used:
1. Metrics evaluating convergence to the known Pareto-optimal front (such
as error ratio, distance from reference set, etc.),
2. Metrics evaluating spread of solutions on the known Pareto-optimal front
(such as spread, spacing, etc.), and
3. Metrics evaluating certain combinations of convergence and spread of so-
lutions (such as hypervolume, coverage, R-metrics, etc.).
Some of these metrics are described in texts (Coello et al., 2002; Deb, 2001).
A detailed study (Knowles and Corne, 2002) comparing most existing performance
metrics based on out-performance relations has recommended the use of the
S-metric (or the hypervolume metric) and the R-metrics suggested by Hansen and
Jaskiewicz (1998). A recent study has argued that a single unary performance
measure or any finite combination of them (for example, any of the first two
metrics described above in the enumerated list or both together) cannot
adequately determine whether one set is better than another (Zitzler et al.,
2003). That study also concluded that binary performance metrics (which
usually yield two different values when a set of solutions A is compared
against B and B is compared against A), such as the epsilon-indicator, binary
hypervolume indicator, utility indicators R1 to R3, etc., are better measures
for multiobjective optimization. The flip side is that the chosen binary
metric must be computed K(K−1) times when comparing K different sets to make a
fair comparison, thereby making the use of binary metrics computationally
expensive in practice. Importantly, these performance measures have allowed
researchers to use them directly as fitness measures within indicator based
EAs (IBEAs) (Zitzler and Künzli, 2004). In addition, the attainment indicators
of Fonseca and Fleming (1996) and Fonseca et al. (2005) provide further
information about location and inter-dependencies among obtained solutions.
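To make the difference between a unary and a binary measure concrete, the
sketch below computes the hypervolume of a two-objective set and the additive
epsilon indicator between two sets. It assumes minimization of both
objectives; the reference point, the restriction of the hypervolume routine to
two objectives, and the function names are illustrative choices.

```python
def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded by the reference point."""
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:  # the point adds a new horizontal strip
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


def additive_epsilon(a, b):
    """Smallest shift eps such that a, moved by eps, weakly dominates all of b."""
    return max(
        min(max(ai - bi for ai, bi in zip(pa, pb)) for pa in a)
        for pb in b
    )


A = [(1, 3), (2, 2), (3, 1)]
B = [(2, 4), (3, 3), (4, 2)]
print(hypervolume_2d(A, ref=(4, 4)))  # 6.0
print(additive_epsilon(A, B))         # -1
print(additive_epsilon(B, A))         # 1
```

The hypervolume assigns one number to each set, whereas the epsilon indicator
returns two different values for the ordered pairs (A, B) and (B, A). Comparing
K sets in this way therefore needs K(K−1) indicator evaluations.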
3.10 Other Current EMO Research and Practices
With the initial fast development of efficient EMO procedures, the
availability of free and commercial software (some of which is described in
the Appendix), and applications to a wide variety of problems, EMO research
and application is at its peak at the current time. It is difficult to cover
every aspect of current research in a single paper. Here, we outline four main
broad areas of research and application.
3.10.1 Hybrid EMO Procedures
The search operators used in EMO are heuristic based. Thus, these methodologies
are not guaranteed to find Pareto-optimal solutions with a finite number
of solution evaluations in an arbitrary problem. However, as discussed
in this chapter, EMO methodologies provide adequate emphasis to currently
non-dominated and isolated solutions so that population members progress
towards the Pareto-optimal front iteratively. To make the overall procedure
faster and to give it a stronger theoretical basis, EMO methodologies
can be combined with mathematical optimization techniques having local
convergence properties. A simple approach is to start the process with an
EMO and then improve the solutions it returns with a local search technique
that optimizes a composite objective derived from the multiple objectives,
chosen so that a good spread is maintained. Another approach is to
use a local search technique as a mutation-like operator in an EMO so that
all population members are guaranteed to be at least locally optimal. To
save computational time, the local search based mutation can be performed
after a few generations. In single-objective EA research, hybridization of EAs
is common for ensuring convergence to an optimum; it is time that more studies
on developing hybrid EMO are pursued to ensure finding true Pareto-optimal solutions.
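The first hybrid approach described above (EMO followed by a local search on a composite objective) can be sketched as follows. This is only a minimal illustration under stated assumptions: the toy bi-objective problem, the weighted-sum composite objective, the way weights are supplied, and the coordinate pattern search are all hypothetical choices made for the example and are not taken from any specific hybrid EMO study.

```python
from typing import List, Sequence, Tuple


def objectives(x: Sequence[float]) -> Tuple[float, float]:
    """Toy bi-objective problem (hypothetical); both objectives are minimized."""
    f1 = sum(xi ** 2 for xi in x)
    f2 = sum((xi - 2.0) ** 2 for xi in x)
    return f1, f2


def composite(x: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted-sum composite objective (one possible choice, assumed here)."""
    return sum(w * fi for w, fi in zip(weights, objectives(x)))


def local_search(x: Sequence[float], weights: Sequence[float],
                 step: float = 0.5, tol: float = 1e-6,
                 max_iter: int = 10_000) -> List[float]:
    """Coordinate pattern search: try +/- step along each variable and
    halve the step whenever no trial point improves the composite value."""
    best = list(x)
    best_val = composite(best, weights)
    iterations = 0
    while step > tol and iterations < max_iter:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                val = composite(trial, weights)
                if val < best_val:
                    best, best_val = trial, val
                    improved = True
        if not improved:
            step *= 0.5
        iterations += 1
    return best


def refine_population(population: Sequence[Sequence[float]],
                      weight_vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Improve each EMO solution with its own weight vector, so that different
    solutions are pushed towards different parts of the front."""
    return [local_search(x, w) for x, w in zip(population, weight_vectors)]
```

Here `population` stands for the solutions returned by an EMO run, and each solution is paired with its own weight vector so that the refined solutions remain spread out. A weighted sum can only reach solutions on convex parts of a Pareto-optimal front, so another composite objective may be preferable in practice.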
3.10.2 EMO and Decision-Making
This book is designed to cover this topic in detail. It will suffice to point out in
this chapter that finding a set of Pareto-optimal solutions by using an EMO
fulfills only one aspect of multiobjective optimization; choosing a particular
solution for implementation is the remaining decision-making task, which
is equally important. In the view of the author, the decision-making task
can be considered from two main aspects:
1. Generic consideration: There are some aspects which most practical
users would like to use in narrowing down their choice. For example, in the
presence of uncertainties in decision variables and/or problem parameters,
users are usually interested in finding robust solutions, whose objective
values are insensitive to perturbations in decision variable values or
parameters. In the presence of such variations,
no one is interested in Pareto-optimal but sensitive solutions. Practitioners
are willing to sacrifice a global optimal solution to achieve a robust
solution which, when mapped to the objective space, lies on a relatively flat
part or lies away from the constraint boundaries. In such scenarios
and in the presence of multiple objectives, instead of finding the global
Pareto-optimal front, the user may be interested in finding the robust front,
which may be partially or totally different from the Pareto-optimal front.
A couple of definitions for a robust front are discussed in Chapter 16.
A recent study has developed a modified EMO procedure for finding the
robust front in a problem (Deb and Gupta, 2005).
In addition, instead of finding the entire Pareto-optimal front, the users
may be interested in finding some specific solutions on the Pareto-optimal
front, such as knee points (requiring a large sacrifice in at least one objective
to achieve a small gain in another, thereby making it discouraging to
move out from a knee point (Branke et al., 2004)), Pareto-optimal points
depicting a certain pre-specified relationship between objectives, Pareto-optimal
points having multiplicity (say, two or more solutions in
the decision variable space mapping to identical objective values), Pareto-optimal
solutions which do not lie close to variable boundaries, Pareto-optimal
points having certain mathematical properties, such as all Lagrange
multipliers having more or less identical magnitude (a condition
often desired in order to give equal importance to all constraints), and others.
These considerations are motivated by the fundamental and practical
aspects of optimization and may be applied to most multiobjective problem
solving tasks, without the involvement of a decision-maker.
2. Subjective consideration: In this category, any problem-specific information
can be used to narrow down the choices, and the process may
even lead to a single preferred solution at the end. Most decision-making
procedures use some preference information (utility functions, reference
points (Wierzbicki, 1980), reference directions (Korhonen and Laakso,
1986), marginal rate of return and a host of other considerations (Miettinen,
1999)) to select a subset of Pareto-optimal solutions. This book
is dedicated to the discussion of many such multicriteria decision analysis
(MCDA) tools and collaborative suggestions of using EMO with such
MCDA tools. Some hybrid EMO and MCDA algorithms have also been
suggested in the recent past (Deb et al., 2006b; Deb and Kumar, 2007b,a;
Thiele et al., 2007; Luque et al., 2009).
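The robustness idea in the generic consideration above can be made concrete with a small sketch. One simple way to penalize sensitive solutions is to replace an objective by its average over a neighborhood of the solution; the neighborhood size `delta`, the uniform sampling, and the one-variable toy objective below are assumptions made for illustration only and are not necessarily the definitions used in Chapter 16 or in Deb and Gupta (2005).

```python
import random
from typing import Callable, List, Sequence


def mean_effective(f: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   delta: float,
                   samples: int = 200,
                   seed: int = 0) -> float:
    """Average of f over points drawn uniformly from the box
    [x_i - delta, x_i + delta]; a sensitive solution gets a poor average."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    total = 0.0
    for _ in range(samples):
        y: List[float] = [xi + rng.uniform(-delta, delta) for xi in x]
        total += f(y)
    return total / samples


def toy_objective(x: Sequence[float]) -> float:
    """Hypothetical objective with a sharp global minimum at x = 1
    and a flat, slightly worse local minimum at x = 3."""
    sharp = 50.0 * (x[0] - 1.0) ** 2       # narrow valley, value 0 at x = 1
    flat = 0.2 + 0.5 * (x[0] - 3.0) ** 2   # wide valley, value 0.2 at x = 3
    return min(sharp, flat)
```

With `delta = 0.5`, the global minimum at x = 1 has the better original objective value, but the flat minimum at x = 3 has the better averaged value, which is the sense in which a practitioner may prefer it.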
3.10.3 Multi-objectivization
Interestingly, the act of finding multiple trade-off solutions using an EMO procedure
has found its application outside the realm of solving multiobjective
optimization problems per se. The concept of finding near-optimal trade-off
solutions is applied to solve other kinds of optimization problems as well.
For example, the EMO concept is used to solve constrained single-objective
optimization problems by converting the task into a two-objective optimization
task of additionally minimizing an aggregate constraint violation (Coello,
2000). This eliminates the need to specify a penalty parameter while using
a penalty based constraint handling procedure. If viewed this way, the usual
penalty function approach used in classical optimization studies is a special
weighted-sum approach to the bi-objective optimization problem of minimizing
the objective function and minimizing the constraint violation, for which
the weight vector is a function of the penalty parameter. A well-known difficulty
in genetic programming studies, called bloating, arises due to the continual
increase in the size of genetic programs with iteration. Reducing
bloating by minimizing the size of programs as an additional objective helped
find high-performing solutions with a smaller size of the code (Bleuler et al.,
2001). Minimizing the intra-cluster distance and maximizing the inter-cluster
distance simultaneously in a bi-objective formulation of a clustering problem is
found to yield better solutions than the usual single-objective minimization of
the ratio of the intra-cluster distance to the inter-cluster distance (Handl and
Knowles, 2007). An EMO is used to solve the minimum spanning tree problem better
than a single-objective EA (Neumann and Wegener, 2005). A recent edited
book (Knowles et al., 2008) describes many such interesting applications in
which EMO methodologies have helped solve problems which are otherwise
(or traditionally) not treated as multiobjective optimization problems.
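The constrained-optimization example above can be illustrated with a short sketch. The objective, the single constraint, and the candidate points are hypothetical; the code only shows how a constrained problem is recast as the bi-objective problem (objective, aggregate constraint violation), and how a penalty function with parameter R ranks points exactly as a weighted sum with weights 1/(1+R) and R/(1+R) does.

```python
from typing import List, Sequence, Tuple


def objective(x: float) -> float:
    """Hypothetical single objective to be minimized."""
    return x * x


def constraints(x: float) -> List[float]:
    """Hypothetical constraints written as g_j(x) >= 0 (here: x >= 1)."""
    return [x - 1.0]


def violation(x: float) -> float:
    """Aggregate constraint violation: total amount by which g_j(x) < 0."""
    return sum(max(0.0, -g) for g in constraints(x))


def bi_objective(x: float) -> Tuple[float, float]:
    """Two objectives to minimize: the original objective and the violation."""
    return objective(x), violation(x)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse in all objectives and better in one."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))


def nondominated(candidates: Sequence[float]) -> List[float]:
    """Candidates whose (objective, violation) pair is not dominated."""
    evaluated = [(x, bi_objective(x)) for x in candidates]
    return [x for x, fx in evaluated
            if not any(dominates(fy, fx) for _, fy in evaluated)]


def penalized(x: float, penalty: float) -> float:
    """Classical penalty function: f(x) + R * violation(x)."""
    return objective(x) + penalty * violation(x)


def weighted_sum(x: float, penalty: float) -> float:
    """The same ranking expressed as a weighted sum of the two objectives;
    it equals penalized(x, R) divided by (1 + R)."""
    w_f, w_cv = 1.0 / (1.0 + penalty), penalty / (1.0 + penalty)
    f, cv = bi_objective(x)
    return w_f * f + w_cv * cv
```

For the candidates 0, 0.5, 1, 1.5 and 2, the non-dominated set contains 0, 0.5 and 1: the constrained optimum x = 1 is the zero-violation end of the trade-off, and it is found without choosing a penalty parameter.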
3.10.4 Applications
EMO methodologies, including other a posteriori multiobjective optimization
methods, must be applied to more interesting real-world problems to demonstrate
the utility of finding multiple trade-off solutions. Although some recent
studies are finding that EMO procedures are not computationally efficient in
finding multiple and widely distributed sets of solutions on problems having a
large number of objectives (say, more than five objectives) (Deb and Saxena,
2006; Corne and Knowles, 2007), EMO procedures are still applicable to very
large problems if the attention is changed to finding only a preferred region on
the Pareto-optimal front, instead of the complete front. Some such preference
based EMO studies (Deb et al., 2006b; Deb and Kumar, 2007a; Branke and
Deb, 2004) are applied to 10 or more objectives. In certain many-objective
problems, the Pareto-optimal front can be low-dimensional, mainly due to the
presence of redundant objectives, and EMO procedures can again be effective
in solving such problems (Deb and Saxena, 2006; Brockhoff and Zitzler, 2007).
In addition, reliability based EMO (Deb et al., 2007) and robust
EMO (Deb and Gupta, 2005) procedures are ready to be applied to real-world
multiobjective design optimization problems. Application studies are also of
interest from the point of view of demonstrating how an EMO procedure and a
subsequent MCDA approach can be combined in an iterative manner
to solve a multicriteria decision making problem. Such efforts may lead to the
development of GUI-based software and approaches for solving the task and
will demand addressing other important issues such as visualization of
multi-dimensional data, parallel implementation of EMO and MCDA procedures,
meta-modeling approaches, and others.
Besides solving real-world multiobjective optimization problems, EMO
procedures are also found to be useful for a knowledge discovery task related
to a better understanding of a problem. After a set of trade-off solutions
is found by an EMO, these solutions can be compared against each other to
unveil interesting principles which are common to all these solutions. These
common properties among high-performing solutions will provide useful insights
about what makes a solution optimal in a particular problem. Such
useful information mined from the obtained EMO trade-off solutions has
been discovered in many real-world engineering design problems in the recent
past, and the task is termed innovization (Deb and Srinivasan, 2006).
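A very simple form of this comparison among trade-off solutions can be sketched as follows. The sketch looks for decision variables whose values are almost the same across all obtained solutions; the relative tolerance `rel_tol` and the use of the standard deviation as a measure of spread are assumptions made for illustration, and the innovization studies cited above involve considerably richer analyses than this.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def common_variables(solutions: Sequence[Sequence[float]],
                     rel_tol: float = 0.05) -> Dict[int, float]:
    """Return {variable index: mean value} for every decision variable whose
    spread across the trade-off solutions is small relative to its mean."""
    found: Dict[int, float] = {}
    n_vars = len(solutions[0])
    for j in range(n_vars):
        column = [s[j] for s in solutions]
        centre = mean(column)
        spread = pstdev(column)              # population standard deviation
        scale = max(abs(centre), 1e-12)      # avoid division by zero
        if spread / scale <= rel_tol:
            found[j] = centre
    return found
```

For example, with the hypothetical solutions `[[1.0, 0.50, 3.0], [2.0, 0.51, 3.0], [3.0, 0.49, 3.0]]`, the first variable changes along the front while the second and third stay essentially fixed, which suggests that their common values are a property of all high-performing solutions.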
3.11 Conclusions
This chapter has provided a brief introduction to the fast-growing field of
multiobjective optimization based on evolutionary algorithms. First, the principles
of single-objective evolutionary optimization (EO) techniques have been
discussed so that readers can visualize the differences between evolutionary
optimization and classical optimization methods. Although the main difference
seems to be in the population approach, EO methodologies also do not use
any derivative information, and they possess a parallel search ability through
their operators which makes them computationally efficient procedures.
The EMO principle of handling multiobjective optimization problems is to
first attempt to find a set of Pareto-optimal solutions and then choose a preferred
solution. Since an EO uses a population of solutions in each iteration,
EO procedures are potentially viable techniques to capture a number of trade-off
near-optimal solutions in a single simulation run. Thus, EMO procedures
work towards achieving two goals: (i) convergence to as close to the Pareto-optimal
front as possible and (ii) maintenance of a well-distributed set of trade-off
solutions. This chapter has described a number of popular EMO methodologies,
presented some simulation studies on test problems, and discussed how
EMO principles can be useful in solving real-world multiobjective optimization
problems through a case study of spacecraft trajectory optimization.
Since early EMO research concentrated on finding a well-converged
and well-distributed set of near-optimal trade-off solutions, EMO researchers
concentrated on developing better and computationally faster algorithms by
developing scalable test problems and adequate performance metrics to evaluate
EMO algorithms.
Finally, this chapter has discussed the potential of EMO and its current
research activities. Interestingly, in addition to EMO, these applications can
also be achieved with a posteriori MCDM techniques. Besides their routine
applications in solving multiobjective optimization problems, EMO and a posteriori
MCDM methodologies are capable of solving other types of optimization
problems, such as single-objective constrained optimization and clustering
problems, in a better manner than they are usually solved. EMO and a
posteriori MCDM methodologies are also capable of unveiling important hidden
knowledge about what makes a solution optimal in a problem. EMO techniques
are increasingly being found to have tremendous potential to be used
in conjunction with interactive multiple criterion decision making tasks, not
only in finding a set of optimal solutions but also in aiding the selection of a
preferred solution at the end.
Before closing the chapter, we provide some useful information about the
EMO research and literature in the Appendix.
Acknowledgement
The author acknowledges the support of the Academy of Finland and Foun-
dation of Helsinki School of Economics (Grant # 118319).
Appendix: EMO Repository
Here, we outline some dedicated literature in the area of multiobjective optimization.
Further references can be found from http://www.lania.mx/~ccoello/EMOO/.
Some Relevant Books in Print

A. Abraham, L. C. Jain and R. Goldberg. Evolutionary Multiobjective
Optimization: Theoretical Advances and Applications, London: Springer-
Verlag, 2005.
This is a collection of the latest state-of-the-art theoretical research, design
challenges and applications in the field of EMO.
C. A. C. Coello, D. A. VanVeldhuizen, and G. Lamont. Evolutionary
Algorithms for Solving Multi-Objective Problems. Boston, MA: Kluwer
Academic Publishers, 2002.
A good reference book with a good citation of most EMO studies. A revised
version is in print.
Y. Collette and P. Siarry. Multiobjective Optimization: Principles and Case
Studies, Berlin: Springer, 2004.
This book describes multiobjective optimization methods including EMO
and decision-making, and a number of engineering case studies.
K. Deb. Multi-objective optimization using evolutionary algorithms. Chich-
ester, UK: Wiley, 2001. (Third edition, with exercise problems)
A comprehensive text-book introducing the EMO field and describing major
EMO methodologies and salient research directions.
N. Nedjah and L. de Macedo Mourelle (Eds.). Real-World Multi-Objective
System Engineering, New York: Nova Science Publishers, 2005.
This edited book discusses recent developments and application of multi-
objective optimization including EMO.
A. Osyczka. Evolutionary algorithms for single and multicriteria design
optimisation, Heidelberg: Physica-Verlag, 2002.
A book describing single and multiobjective EAs with many engineering
applications.
M. Sakawa. Genetic Algorithms and Fuzzy Multiobjective Optimization,
Norwell, MA: Kluwer, 2002.
This book discusses EMO for 0-1 programming, integer programming, non-
convex programming, and job-shop scheduling problems under multiobjec-
tiveness and fuzziness.
K. C. Tan, E. F. Khor, and T. H. Lee. Multiobjective Evolutionary
Algorithms and Applications, London, UK: Springer-Verlag, 2005.
A book on various methods of preference-based EMO and application case
studies covering areas such as control and scheduling.
Some Review Papers
C. A. C. Coello. Evolutionary Multi-Objective Optimization: A Critical
Review. In R. Sarker, M. Mohammadian and X. Yao (Eds.), Evolutionary
Optimization, pp. 117–146, Kluwer Academic Publishers, New York, 2002.
C. Dimopoulos. A Review of Evolutionary Multiobjective Optimization
Applications in the Area of Production Research. 2004 Congress on Evolutionary
Computation (CEC2004), IEEE Service Center, Vol. 2, pp. 1487–1494, 2004.
S. Huband, P. Hingston, L. Barone and L. While. A Review of Multiobjective
Test Problems and a Scalable Test Problem Toolkit, IEEE Transactions
on Evolutionary Computation, Vol. 10, No. 5, pp. 477–506, 2006.
Dedicated Conference Proceedings
S. Obayashi, K. Deb, C. Poloni, and T. Hiroyasu (Eds.), Evolutionary
Multi-Criterion Optimization (EMO-07) Conference Proceedings. Also
available as LNCS 4403. Berlin, Germany: Springer, 2007.
C. A. Coello Coello, A. H. Aguirre, and E. Zitzler (Eds.), Evolutionary
Multi-Criterion Optimization (EMO-05) Conference Proceedings. Also
available as LNCS 3410. Berlin, Germany: Springer, 2005.
C. Fonseca, P. Fleming, E. Zitzler, K. Deb, and L. Thiele (Eds.), Evolutionary
Multi-Criterion Optimization (EMO-03) Conference Proceedings.
Also available as LNCS 2632. Heidelberg: Springer, 2003.
E. Zitzler, K. Deb, L. Thiele, C. A. C. Coello, and D. Corne (Eds.),
Evolutionary Multi-Criterion Optimization (EMO-01) Conference Proceedings.
Also available as LNCS 1993. Heidelberg: Springer, 2001.
Some Public-Domain Source Codes
NIMBUS: http://www.mit.jyu.fi/MCDM/soft.html
NSGA-II in C: http://www.iitk.ac.in/kangal/soft.htm
PISA: http://www.tik.ee.ethz.ch/sop/pisa/
Shark in C++: http://shark-project.sourceforge.net
SPEA2 in C++: http://www.tik.ee.ethz.ch/~zitzler
Some Commercial Codes Implementing EMO
GEATbx in Matlab (http://www.geatbx.com/)
iSIGHT and FIPER from Engineous (http://www.engineous.com/)
MAX from CENAERO (http://www.cenaero.be/)
modeFRONTIER from Esteco (http://www.esteco.com/)
References
Babu, B.V., Jehan, M.M.L.: Differential Evolution for Multi-Objective Opti-
mization. In: Proceedings of the 2003 Congress on Evolutionary Computation
(CEC2003), Canberra, Australia, December 2003, vol. 4, pp. 2696–2703. IEEE
Computer Society Press, Los Alamitos (2003)
Bäck, T., Fogel, D., Michalewicz, Z.: Handbook of Evolutionary Computation. Ox-
ford University Press, Oxford (1997)
Bleuler, S., Brack, M., Zitzler, E.: Multiobjective genetic programming: Reducing
bloat using SPEA2. In: Proceedings of the 2001 Congress on Evolutionary Compu-
tation, pp. 536–543 (2001)
Branke, J., Deb, K.: Integrating user preferences into evolutionary multi-objective
optimization. In: Jin, Y. (ed.) Knowledge Incorporation in Evolutionary Compu-
tation, pp. 461–477. Springer, Heidelberg (2004)
Branke, J., Deb, K., Dierolf, H., Osswald, M.: Finding knees in multi-objective
optimization. In: Yao, X., Burke, E.K., Lozano, J.A., Smith, J., Merelo-Guervós,
J.J., Bullinaria, J.A., Rowe, J.E., Tiňo, P., Kabán, A., Schwefel, H.-P. (eds.) PPSN
2004. LNCS, vol. 3242, pp. 722–731. Springer, Heidelberg (2004)
Brockhoff, D., Zitzler, E.: Dimensionality Reduction in Multiobjective Optimization:
The Minimum Objective Subset Problem. In: Waldmann, K.H., Stocker, U.M.
(eds.) Operations Research Proceedings 2006, Saarbrücken, Germany, pp. 423–429.
Springer, Heidelberg (2007)
Coello Coello, C.A.: Treating objectives as constraints for single objective optimiza-
tion. Engineering Optimization 32(3), 275–308 (2000)
Coello Coello, C.A., Lechuga, M.S.: MOPSO: A Proposal for Multiple Objec-
tive Particle Swarm Optimization. In: Congress on Evolutionary Computation
(CEC2002), May 2002, vol. 2, pp. 1051–1056. IEEE Service Center, Piscataway
(2002)
Coello Coello, C.A., Toscano, G.: A micro-genetic algorithm for multi-objective op-
timization. Technical Report Lania-RI-2000-06, Laboratorio Nacional de Infor-
mática Avanzada, Xalapa, Veracruz, Mexico (2000)
Coello, C.A.C., VanVeldhuizen, D.A., Lamont, G.: Evolutionary Algorithms for Solv-
ing Multi-Objective Problems. Kluwer Academic Publishers, Boston (2002)
Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.): EMO 2005. LNCS,
vol. 3410. Springer, Heidelberg (2005)
Corne, D.W., Knowles, J.D.: Techniques for highly multiobjective optimisation:
some nondominated points are better than others. In: GECCO '07: Proceedings of
the 9th annual conference on Genetic and evolutionary computation, pp. 773–780.
ACM Press, New York (2007)
Corne, D.W., Knowles, J.D., Oates, M.: The Pareto envelope-based selection al-
gorithm for multiobjective optimization. In: Deb, K., Rudolph, G., Lutton, E.,
Merelo, J.J., Schoenauer, M., Schwefel, H.-P., Yao, X. (eds.) PPSN 2000. LNCS,
vol. 1917, pp. 839–848. Springer, Heidelberg (2000)
Corne, D.W., Jerram, N.R., Knowles, J.D., Oates, M.J.: PESA-II: Region-based
selection in evolutionary multiobjective optimization. In: Proceedings of the Ge-
netic and Evolutionary Computation Conference (GECCO-2001), pp. 283–290.
Morgan Kaufmann, San Francisco (2001)
Coverstone-Carroll, V., Hartmann, J.W., Mason, W.J.: Optimal multi-objective low-
thrust spacecraft trajectories. Computer Methods in Applied Mechanics and En-
gineering 186(2–4), 387–402 (2000)
Deb, K.: An introduction to genetic algorithms. Sādhanā 24(4), 293–315 (1999a)
Deb, K.: Multi-objective genetic algorithms: Problem difficulties and construction
of test problems. Evolutionary Computation Journal 7(3), 205–230 (1999b)
Deb, K.: Multi-objective optimization using evolutionary algorithms. Wiley, Chich-
ester (2001)
Deb, K., Agrawal, S.: A niched-penalty approach for constraint handling in genetic
algorithms. In: Proceedings of the International Conference on Artificial Neural
Networks and Genetic Algorithms (ICANNGA-99), pp. 235–243. Springer, Hei-
delberg (1999)
Deb, K., Gupta, H.: Searching for robust Pareto-optimal solutions in multi-objective
optimization. In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.)
EMO 2005. LNCS, vol. 3410, pp. 150–164. Springer, Heidelberg (2005)
Deb, K., Kumar, A.: Interactive evolutionary multi-objective optimization and
decision-making using reference direction method. In: Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO-2007), pp. 781–788. ACM,
New York (2007a)
Deb, K., Kumar, A.: Light beam search based multi-objective optimization using
evolutionary algorithms. In: Proceedings of the Congress on Evolutionary Com-
putation (CEC-07), pp. 2125–2132 (2007b)
Deb, K., Saxena, D.: Searching for Pareto-optimal solutions through dimensionality
reduction for certain large-dimensional multi-objective optimization problems. In:
Proceedings of the World Congress on Computational Intelligence (WCCI-2006),
pp. 3352–3360 (2006)
Deb, K., Srinivasan, A.: Innovization: Innovating design principles through optimiza-
tion. In: Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO-2006), pp. 1629–1636. ACM, New York (2006)
Deb, K., Tiwari, S.: Omni-optimizer: A generic evolutionary algorithm for global
optimization. European Journal of Operational Research (EJOR) 185(3),
1062–1087 (2008)
Deb, K., Pratap, A., Meyarivan, T.: Constrained test problems for multi-objective
evolutionary optimization. In: Zitzler, E., Deb, K., Thiele, L., Coello Coello, C.A.,
Corne, D.W. (eds.) EMO 2001. LNCS, vol. 1993, pp. 284–298. Springer, Heidel-
berg (2001)
Deb, K., Anand, A., Joshi, D.: A computationally efficient evolutionary algorithm for
real-parameter optimization. Evolutionary Computation Journal 10(4), 371–395
(2002a)
Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multi-objective ge-
netic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2),
182–197 (2002b)
Deb, K., Thiele, L., Laumanns, M., Zitzler, E.: Scalable multi-objective optimization
test problems. In: Proceedings of the Congress on Evolutionary Computation
(CEC-2002), pp. 825–830 (2002c)
Deb, K., Reddy, A.R., Singh, G.: Optimal scheduling of casting sequence using
genetic algorithms. Journal of Materials and Manufacturing Processes 18(3),
409–432 (2003a)
Deb, K., Mohan, R.S., Mishra, S.K.: Towards a quick computation of well-spread
Pareto-optimal solutions. In: Fonseca, C.M., Fleming, P.J., Zitzler, E., Deb, K.,
Thiele, L. (eds.) EMO 2003. LNCS, vol. 2632, pp. 222–236. Springer, Heidelberg
(2003b)
Deb, K., Thiele, L., Laumanns, M., Zitzler, E.: Scalable test problems for evolution-
ary multi-objective optimization. In: Abraham, A., Jain, L., Goldberg, R. (eds.)
Evolutionary Multiobjective Optimization, pp. 105–145. Springer, London (2005)
Deb, K., Sinha, A., Kukkonen, S.: Multi-objective test problems, linkages and evo-
lutionary methodologies. In: Proceedings of the Genetic and Evolutionary Com-
putation Conference (GECCO-2006), pp. 1141–1148. ACM, New York (2006a)
Deb, K., Sundar, J., Uday, N., Chaudhuri, S.: Reference point based multi-objective
optimization using evolutionary algorithms. International Journal of Computa-
tional Intelligence Research (IJCIR) 2(6), 273–286 (2006b)
Deb, K., Padmanabhan, D., Gupta, S., Mall, A.K.: Reliability-based multi-objective
optimization using evolutionary algorithms. In: Obayashi, S., Deb, K., Poloni, C.,
Hiroyasu, T., Murata, T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 66–80. Springer,
Heidelberg (2007)
Deb, K., Tiwari, R., Dixit, M., Dutta, J.: Finding trade-off solutions close to KKT
points using evolutionary multi-objective optimisation. In: Proceedings of the
Congress on Evolutionary Computation (CEC-2007), Singapore, pp. 2109–2116
(2007)
Fogel, L.J., Owens, A.J., Walsh, M.J.: Artificial Intelligence Through Simulated
Evolution. Wiley, New York (1966)
Fonseca, C.M., Fleming, P.J., Zitzler, E., Deb, K., Thiele, L. (eds.): EMO 2003.
LNCS, vol. 2632. Springer, Heidelberg (2003)
Fonseca, C.M., Fleming, P.J.: Genetic algorithms for multiobjective optimization:
Formulation, discussion, and generalization. In: Proceedings of the Fifth Interna-
tional Conference on Genetic Algorithms, pp. 416–423 (1993)
Fonseca, C.M., Fleming, P.J.: On the performance assessment and comparison of
stochastic multiobjective optimizers. In: Ebeling, W., Rechenberg, I., Voigt, H.-
M., Schwefel, H.-P. (eds.) PPSN 1996. LNCS, vol. 1141, pp. 584–593. Springer,
Heidelberg (1996)
Fonseca, C.M., da Fonseca, V.G., Paquete, L.: Exploring the performance of stochas-
tic multiobjective optimisers with the second-order attainment function. In:
Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.) EMO 2005. LNCS,
Gen, M., Cheng, R.: Genetic Algorithms and Engineering Design. Wiley, Chichester
(1997)
Goldberg, D.E.: Genetic Algorithms for Search, Optimization, and Machine Learn-
ing. Addison-Wesley, Reading (1989)
Goldberg, D.E., Richardson, J.: Genetic algorithms with sharing for multimodal
function optimization. In: Proceedings of the First International Conference on
Genetic Algorithms and Their Applications, pp. 41–49 (1987)
Goldberg, D.E., Deb, K., Thierens, D.: Toward a better understanding of mixing in
genetic algorithms. Journal of the Society of Instruments and Control Engineers
(SICE) 32(1), 10–16 (1993)
Gravel, M., Price, W.L., Gagné, C.: Scheduling continuous casting of aluminum using
a multiple objective ant colony optimization metaheuristic. European Journal of
Operational Research 143(1), 218–229 (2002)
Handl, J., Knowles, J.D.: An evolutionary approach to multiobjective clustering.
IEEE Transactions on Evolutionary Computation 11(1), 56–76 (2007)
Hansen, M.P., Jaszkiewicz, A.: Evaluating the quality of approximations to the non-
dominated set. Technical Report IMM-REP-1998-7, Institute of Mathematical
Modelling, Technical University of Denmark, Lyngby (1998)
Herrera, F., Lozano, M., Verdegay, J.L.: Tackling real-coded genetic algorithms:
Operators and tools for behavioural analysis. Artificial Intelligence Review 12(4),
265–319 (1998)
Holland, J.H.: Concerning efficient adaptive systems. In: Yovits, M.C., Jacobi, G.T.,
Goldstein, G.B. (eds.) Self-Organizing Systems, pp. 215–230. Spartan Press, New
York (1962)
Holland, J.H.: Adaptation in Natural and Artificial Systems. MIT Press, Ann Arbor
(1975)
Horn, J., Nafpliotis, N., Goldberg, D.E.: A niched Pareto genetic algorithm for
multi-objective optimization. In: Proceedings of the First IEEE Conference on
Evolutionary Computation, pp. 82–87 (1994)
Huband, S., Barone, L., While, L., Hingston, P.: A scalable multi-objective test
problem toolkit. In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.)
Igel, C., Hansen, N., Roth, S.: Covariance matrix adaptation for multi-objective op-
timization evolutionary computation. Evolutionary Computation Journal 15(1),
1–28 (2007)
Jansen, T., Wegener, I.: On the utility of populations. In: Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO 2001), pp. 375–382. Morgan
Kaufmann, San Francisco (2001)
Kennedy, J., Eberhart, R.C., Shi, Y.: Swarm intelligence. Morgan Kaufmann, San
Francisco (2001)
Knowles, J., Corne, D., Deb, K.: Multiobjective Problem Solving from Nature.
Springer, Heidelberg (2008)
Knowles, J.D., Corne, D.W.: Approximating the non-dominated front using the
Pareto archived evolution strategy. Evolutionary Computation Journal 8(2),
149–172 (2000)
Knowles, J.D., Corne, D.W.: On metrics for comparing nondominated sets. In:
Congress on Evolutionary Computation (CEC-2002), pp. 711–716. IEEE Press,
Piscataway (2002)
problem. European Journal of Operational Research 24, 277–287 (1986)
Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of
Natural Selection. MIT Press, Cambridge (1992)
Kung, H.T., Luccio, F., Preparata, F.P.: On finding the maxima of a set of vectors.
Journal of the Association for Computing Machinery 22(4), 469–476 (1975)
Laumanns, M., Thiele, L., Deb, K., Zitzler, E.: Combining convergence and diversity
in evolutionary multi-objective optimization. Evolutionary Computation 10(3),
263–282 (2002)
Loughlin, D.H., Ranjithan, S.: The neighborhood constraint method: A multiobjec-
tive optimization technique. In: Proceedings of the Seventh International Confer-
ence on Genetic Algorithms, pp. 666–673 (1997)
Luque, M., Miettinen, K., Eskelinen, P., Ruiz, F.: Incorporating preference infor-
mation in interactive reference point methods for multiobjective optimization.
Omega 37(2), 450–462 (2009)
McMullen, P.R.: An ant colony optimization approach to addressing a JIT sequencing
problem with multiple objectives. Artificial Intelligence in Engineering 15,
309–317 (2001)
Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs.
Springer, Berlin (1992)
Miettinen, K.: Nonlinear Multiobjective Optimization. Kluwer, Boston (1999)
Mostaghim, S., Teich, J.: Strategies for Finding Good Local Guides in Multi-
objective Particle Swarm Optimization (MOPSO). In: 2003 IEEE Swarm Intelli-
gence Symposium Proceedings, Indianapolis, Indiana, USA, April 2003, pp. 26–33.
IEEE Computer Society Press, Los Alamitos (2003)
Neumann, F., Wegener, I.: Minimum spanning trees made easier via multi-objective
optimization. In: GECCO '05: Proceedings of the 2005 conference on Genetic and
evolutionary computation, pp. 763–769. ACM Press, New York (2005)
Obayashi, S., Deb, K., Poloni, C., Hiroyasu, T., Murata, T. (eds.): EMO 2007. LNCS,
vol. 4403. Springer, Heidelberg (2007)
Okabe, T., Jin, Y., Olhofer, M., Sendhoff, B.: On test functions for evolutionary
multi-objective optimization. In: Yao, X., Burke, E.K., Lozano, J.A., Smith, J.,
Merelo-Guervós, J.J., Bullinaria, J.A., Rowe, J.E., Tiňo, P., Kabán, A., Schwe-
fel, H.-P. (eds.) PPSN 2004. LNCS, vol. 3242, pp. 792–802. Springer, Heidelberg
(2004)
Osyczka, A.: Evolutionary algorithms for single and multicriteria design optimiza-
tion. Physica-Verlag, Heidelberg (2002)
Price, K.V., Storn, R., Lampinen, J.: Differential Evolution: A Practical Approach
to Global Optimization. Springer-Verlag, Berlin (2005)
Radcliffe, N.J.: Forma analysis and random respectful recombination. In: Proceed-
ings of the Fourth International Conference on Genetic Algorithms, pp. 222–229
(1991)
Rechenberg, I.: Cybernetic solution path of an experimental problem. Royal Aircraft
Establishment, Library Translation Number 1122, Farnborough, UK (1965)
Rechenberg, I.: Evolutionsstrategie: Optimierung Technischer Systeme nach Prinzip-
ien der Biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart (1973)
Rosenberg, R.S.: Simulation of Genetic Populations with Biochemical Properties.
Ph.D. thesis, Ann Arbor, MI, University of Michigan (1967)
Rudolph, G.: Convergence analysis of canonical genetic algorithms. IEEE Transac-
tions on Neural Networks 5(1), 96–101 (1994)
Sasaki, D., Morikawa, M., Obayashi, S., Nakahashi, K.: Aerodynamic shape opti-
mization of supersonic wings by adaptive range multiobjective genetic algorithms.
In: Zitzler, E., Deb, K., Thiele, L., Coello Coello, C.A., Corne, D.W. (eds.) EMO
Sauer, C.G.: Optimization of multiple target electric propulsion trajectories. In:
AIAA 11th Aerospace Science Meeting, Paper Number 73-205 (1973)
Schaffer, J.D.: Some Experiments in Machine Learning Using Vector Evaluated Ge-
netic Algorithms. Ph.D. thesis, Vanderbilt University, Nashville, TN (1984)
Schwefel, H.-P.: Projekt MHD-Staustrahlrohr: Experimentelle Optimierung einer
Zweiphasendüse, Teil I. Technical Report 11.034/68, 35, AEG Forschungsinstitut,
Berlin (1968)
Schwefel, H.-P.: Evolution and Optimum Seeking. Wiley, New York (1995)
Srinivas, N., Deb, K.: Multi-objective function optimization using non-dominated
sorting genetic algorithms. Evolutionary Computation Journal 2(3), 221–248
(1994)
Storn, R., Price, K.: Differential evolution - A fast and efficient heuristic for global
optimization over continuous spaces. Journal of Global Optimization 11, 341–359
(1997)
Thiele, L., Miettinen, K., Korhonen, P., Molina, J.: A preference-based interactive
evolutionary algorithm for multiobjective optimization. Technical Report W-412,
Helsinki School of Economics, Helsingin Kauppakorkeakoulu, Finland (2007)
Veldhuizen, D.V., Lamont, G.B.: Multiobjective evolutionary algorithm research: A
history and analysis. Technical Report TR-98-03, Department of Electrical and
Computer Engineering, Air Force Institute of Technology, Dayton, OH (1998)
Veldhuizen, D.V., Lamont, G.B.: Multiobjective evolutionary algorithms: Analyzing
the state-of-the-art. Evolutionary Computation Journal 8(2), 125–148 (2000)
Vose, M.D., Wright, A.H., Rowe, J.E.: Implicit parallelism. In: Cantú-Paz, E., Foster,
J.A., Deb, K., Davis, L., Roy, R., O'Reilly, U.-M., Beyer, H.-G., Kendall, G.,
Wilson, S.W., Harman, M., Wegener, J., Dasgupta, D., Potter, M.A., Schultz,
A., Dowsland, K.A., Jonoska, N., Miller, J., Standish, R.K. (eds.) GECCO 2003.
LNCS, vol. 2723, Springer, Heidelberg (2003)
In: Fandel, G., Gal, T. (eds.) Multiple Criteria Decision Making Theory and
Zitzler, E., Künzli, S.: Indicator-Based Selection in Multiobjective Search. In: Yao,
X., Burke, E.K., Lozano, J.A., Smith, J., Merelo-Guervós, J.J., Bullinaria, J.A.,
Rowe, J.E., Tiňo, P., Kabán, A., Schwefel, H.-P. (eds.) PPSN 2004. LNCS,
Zitzler, E., Thiele, L.: Multiobjective optimization using evolutionary algorithms - A
comparative case study. In: Eiben, A.E., Bäck, T., Schoenauer, M., Schwefel, H.-P.
(eds.) PPSN 1998. LNCS, vol. 1498, pp. 292–301. Springer, Heidelberg (1998)
Zitzler, E., Thiele, L.: Multiobjective evolutionary algorithms: A comparative case
study and the strength Pareto approach. IEEE Transactions on Evolutionary
Computation 3(4), 257–271 (1999)
Zitzler, E., Deb, K., Thiele, L.: Comparison of multiobjective evolutionary al-
gorithms: Empirical results. Evolutionary Computation Journal 8(2), 125148
(2000)
Zitzler, E., Deb, K., Thiele, L., Coello Coello, C.A., Corne, D.W. (eds.): EMO 2001.
LNCS, vol. 1993. Springer, Heidelberg (2001a)
Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength pareto evolu-
tionary algorithm for multiobjective optimization. In: Giannakoglou, K.C., Tsa-
halis, D.T., Priaux, J., Papailiou, K.D., Fogarty, T. (eds.) Evolutionary Methods
for Design Optimization and Control with Applications to Industrial Problems,
pp. 95100. International Center for Numerical Methods in Engineering (Cmine),
Athens, Greece (2001b)
Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C.M., Fonseca, V.G.: Performance
assessment of multiobjective optimizers: An analysis and review. IEEE Transac-
tions on Evolutionary Computation 7(2), 117132 (2003)
4
Interactive Multiobjective Optimization
Using a Set of Additive Value Functions
José Rui Figueira^1, Salvatore Greco^2, Vincent Mousseau^3,
and Roman Słowiński^{4,5}
1 CEG-IST, Center for Management Studies, Instituto Superior Técnico,
  Technical University of Lisbon, Portugal, figueira@ist.utl.pt
2 Faculty of Economics, University of Catania, Corso Italia, 55,
  95129 Catania, Italy, salgreco@unict.it
3 LAMSADE, Université Paris-Dauphine, 75775 Paris, France,
  mousseau@lamsade.dauphine.fr
4 Institute of Computing Science, Poznań University of Technology,
  60-965 Poznań, Poland, roman.slowinski@cs.put.poznan.pl
5 Systems Research Institute, Polish Academy of Sciences, 01-447 Warsaw, Poland
Abstract. In this chapter, we present a new interactive procedure for multiobjective
optimization, which is based on the use of a set of value functions as a preference
model built by an ordinal regression method. The procedure is composed of two
alternating stages. In the first stage, a representative sample of solutions from the
Pareto optimal set (or from its approximation) is generated. In the second stage,
the Decision Maker (DM) is asked to make pairwise comparisons of some solutions
from the generated sample. Besides pairwise comparisons, the DM may compare
selected pairs from the viewpoint of the intensity of preference, both comprehensively
and with respect to a single criterion. This preference information is used to build a
preference model composed of all general additive value functions compatible with
the obtained information. The set of compatible value functions is then applied to
the whole Pareto optimal set, which results in possible and necessary rankings of
Pareto optimal solutions. These rankings are used to select a new sample of solutions,
which is presented to the DM, and the procedure cycles until a satisfactory
solution is selected from the sample or the DM comes to the conclusion that there is
no satisfactory solution for the current problem setting. Construction of the set of
compatible value functions is done using ordinal regression methods called UTA^GMS
and GRIP. These two methods generalize UTA-like methods and they are competitive
with the AHP and MACBETH methods. The interactive procedure will be illustrated
through an example.
Reviewed by: Jerzy Błaszczyński, Poznań University of Technology, Poland
Daisuke Sasaki, University of Cambridge, UK
Kalyanmoy Deb, Indian Institute of Technology Kanpur, India
98 J.R. Figueira et al.
4.1 Introduction
Over the last decade, research on MultiObjective Optimization (MOO) has
been mainly devoted to generation of exact Pareto optimal solutions, or of an
approximation of the Pareto optimal set (also called the Pareto Frontier, PF),
for problems with both combinatorial and multiple criteria structure. Only
little attention has been paid to the inclusion of the Decision Maker's (DM's)
preferences in the generation process. MOO has thus been considered merely
from the point of view of mathematical programming, while limited work has been
devoted to the point of view of decision aiding (see, however, Chapter 6 and
Chapter 7, where preference information is used in evolutionary multiobjective
optimization). There is no doubt that research on the inclusion of
preferences within MOO is not sufficient, and thus the link between MOO
and decision aiding should be strengthened. With this aim, in this chapter
we propose to use the ordinal regression paradigm as a theoretically sound
foundation for handling preference information in an interactive process of
solving MOO problems.
In the following, we assume that the interactive procedure explores the PF
of an MOO problem; however, it could equally be an approximation of this set.
The ordinal regression paradigm was originally applied to multiple criteria
decision aiding in the UTA method (Jacquet-Lagrèze and Siskos, 1982).
This paradigm assumes construction of a criteria aggregation model compatible
with preference information elicited from the DM. In the context of MOO,
this information has the form of holistic judgments on a reference subset of
the PF. The criteria aggregation model built in this way is the DM's preference
model. It is applied to the whole PF to show how the PF solutions compare
with one another under this model. The ordinal regression paradigm gives a new
sense to the interaction with the DM. The preference information is collected
in a very easy way and concerns a small subset of PF solutions playing the
role of a training sample. Elicitation of holistic pairwise comparisons of some
solutions from the training sample, as well as comparisons of the intensity of
preference between some selected pairs of solutions, requires a relatively
small cognitive effort from the DM. The ordinal regression paradigm is also
appropriate for designing an interactive process of solving a MOO problem as a
constructive learning process. This allows a DM to learn progressively about
his/her preferences and to revise his/her judgments in successive
iterations.
Designing an interactive process in a constructive learning perspective is
based on the hypothesis that, beyond the model definition, one of the prominent
roles of the interactive process is to build a conviction in the mind of the
DM on how solutions compare with one another. Elaborating such a conviction
is grounded on two aspects: (1) the preexisting elements, such as the DM's
value system and past experience related to the decision problem; and (2) the
elements of information presented to the DM in the dialogue stage, showing
how the preference information from the previous iterations induces comparisons
of PF solutions. In order to be more specific about the nature of the
constructive learning of preferences, it is important to say that there is a clear
feedback in the process. On the one hand, the preference information provided
by the DM contributes to the construction of a preference model and, on the
other hand, the use of the preference model shapes the DM's preferences or,
at least, makes the DM's conviction evolve.
An interactive MOO procedure using ordinal regression was proposed
in (Jacquet-Lagrèze et al., 1987). The ordinal regression implemented in
this procedure is the same as in the UTA method; thus the preference model
being used is a single additive value function with piecewise-linear
components.
The interactive procedure proposed in this chapter is also based on ordinal
regression; however, it is quite different from the previous proposal because
it uses a preference model that is a set of value functions, as considered in
the UTA^GMS and GRIP methods (Greco et al., 2003, 2008; Figueira et al., 2008).
The value functions have a general additive form and they are compatible
with preference information composed of pairwise comparisons of some solutions,
and comparisons of the intensities of preference between pairs of selected
solutions. The UTA^GMS and GRIP methods extend the original UTA method in
several ways: (1) all additive value functions compatible with the preference
information are taken into account, while UTA uses only one such function;
(2) the marginal value functions are general monotone non-decreasing
functions, and not only piecewise-linear ones, as in UTA. Moreover, the DM
can provide preference information which is, on the one hand, less demanding
than in UTA (a partial preorder of some solutions instead of a complete preorder)
and, on the other hand, richer than in UTA (comparison of intensities of preference
between selected pairs of solutions, e.g., the preference of a over b is stronger
than the one of c over b). Lastly, these methods provide as results
necessary rankings, which express statements that hold for all compatible
value functions, and possible rankings, which express statements that hold
for at least one compatible value function. The two extensions
of the UTA method appear to be very useful for organizing an interactive
search for the most satisfactory solution of a MOO problem: the interaction
with the DM is organized such that the preference information is provided
incrementally, with the possibility of checking the impact of particular pieces
of information on the preference structure of the PF.
The chapter is organized as follows. Section 4.2 is devoted to the presentation
of the general scheme of the constructive learning interactive procedure.
Section 4.3 provides a brief reminder on learning one compatible additive
piecewise-linear value function for multiple criteria ranking problems using
the UTA method. In Section 4.4, the GRIP method is presented, which is
currently the most general of all UTA-like methods. GRIP is also competitive
with the current main methods in the field of multiple criteria decision aiding. In
particular, it is competitive with the AHP method (Saaty, 1980), which requires
pairwise comparisons of solutions and criteria, and yields a priority ranking
of solutions. GRIP is also competitive with the MACBETH method (Bana e
Costa and Vansnick, 1994), which also takes into account a preference order
of solutions and intensity of preference for pairs of solutions. The preference
information used in GRIP does not need, however, to be complete: the DM
is asked to provide comparisons of only those pairs of selected solutions on
particular criteria for which his/her judgment is sufficiently certain. This is
an important advantage when comparing GRIP to methods which, instead,
require comparison of all possible pairs of solutions on all the considered criteria.
Section 4.5 presents an application of the proposed interactive procedure
for MOO; the possible pieces of preference information that can be considered
in an interactive protocol are the following: ordinal pairwise comparisons
of selected PF solutions, and ordinal comparisons of intensities of preference
between selected pairs of PF solutions. In the last section, some conclusions
and further research directions are provided.
4.2 Application of an Ordinal Regression Method within
a Multiobjective Interactive Procedure
In the following, we assume that the Pareto optimal set of a MOO problem
is generated prior to an interactive exploration of this set. Instead of the
whole and exact Pareto optimal set of a MOO problem, one can also consider
a proper representation of this set, or its approximation. In any case, an
interactive exploration of this set should lead the DM to a conviction that
either there is no satisfactory solution to the considered problem, or there
is at least one such solution. We will focus our attention on the interactive
exploration, and the proposed interactive procedure will be valid for any finite
set of solutions to be explored. Let us denote this set by $A$. Note that such a set
$A$ can be computed using a MOO or EMO algorithm (see Chapters 1 and 3).
In the course of the interactive procedure, the preference information provided
by the DM concerns a small subset of $A$, called the reference or training
sample, and denoted by $A^R$. The preference information is transformed by an
ordinal regression method into the DM's preference model. We propose to use
the GRIP method at this stage; thus the preference model is a set of general
additive value functions compatible with the preference information. A compatible
value function compares the solutions from the reference sample in
the same way as the DM. The obtained preference model is then applied to
the whole set $A$, which results in necessary and possible rankings of solutions.
These rankings are used to select a new sample of reference solutions, which
is presented to the DM, and the procedure cycles until a satisfactory solution
is selected from the sample or the DM comes to the conclusion that there is no
satisfactory solution for the current problem setting.
The proposed interactive procedure is composed of the following steps:
Step 1. Select a representative reference sample $A^R$ of solutions from
set $A$.
Step 2. Present the sample $A^R$ to the DM.
Step 3. If the DM is satisfied with at least one solution from the sample,
then this is the most preferred solution and the procedure stops. The
procedure also stops in this step if, after several iterations, the DM concludes
that there is no satisfactory solution for the current problem setting.
Otherwise continue.
Step 4. Ask the DM to provide information about his/her preferences on
set $A^R$ in the following terms:
- pairwise comparison of some solutions from $A^R$,
- comparison of intensities of comprehensive preferences between some
  pairs of solutions from $A^R$,
- comparison of intensities of preferences on single criteria between some
  pairs of solutions from $A^R$.
Step 5. Use the GRIP method to build a set of general additive value
functions compatible with the preference information obtained from the
DM in Step 4.
Step 6. Apply the set of compatible value functions built in Step 5 to
the whole set $A$, and present the necessary and possible rankings (see
sub-section 4.4.2) resulting from this application to the DM.
Step 7. Taking into account the necessary and possible rankings on set
$A$, let the DM select a new reference sample of solutions $A^R \subseteq A$, and go
to Step 2.
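The control flow of Steps 1 to 7 can be sketched as a simple loop. The sketch below is only an illustration under stated assumptions: every callback name (`select_initial_sample`, `ask_dm_satisfied`, `build_compatible_set`, and so on) is a hypothetical placeholder for the dialogue with the DM or for the GRIP computations, and the `max_iterations` guard is an addition that is not part of the procedure described above.

```python
from typing import Callable, Hashable, Optional, Sequence

Solution = Hashable


def interactive_procedure(
    solutions: Sequence[Solution],
    select_initial_sample: Callable[[Sequence[Solution]], list],
    ask_dm_satisfied: Callable[[list], Optional[Solution]],
    ask_dm_give_up: Callable[[], bool],
    ask_dm_preferences: Callable[[list], list],
    build_compatible_set: Callable[[list], object],
    compute_rankings: Callable[[object, Sequence[Solution]], object],
    ask_dm_new_sample: Callable[[object, Sequence[Solution]], list],
    max_iterations: int = 100,
) -> Optional[Solution]:
    """Control flow of Steps 1-7; every callback is a hypothetical placeholder."""
    sample = select_initial_sample(solutions)                 # Step 1
    preference_info: list = []                                # accumulated over iterations
    for _ in range(max_iterations):                           # guard added for the sketch
        chosen = ask_dm_satisfied(sample)                     # Steps 2 and 3
        if chosen is not None:
            return chosen                                     # most preferred solution found
        if ask_dm_give_up():
            return None                                       # no satisfactory solution
        preference_info.extend(ask_dm_preferences(sample))    # Step 4
        model = build_compatible_set(preference_info)         # Step 5 (GRIP in the text)
        rankings = compute_rankings(model, solutions)         # Step 6
        sample = ask_dm_new_sample(rankings, solutions)       # Step 7, then back to Step 2
    return None
```

Keeping `preference_info` as a growing list mirrors the incremental elicitation described in the text; a fuller version would also let the DM withdraw earlier statements.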
In Step 4, the information provided by the DM may lead to a set of constraints
which define an empty polyhedron of the compatible value functions.
In this case, the DM is informed which items of his/her preference information
make the polyhedron empty, so as to enable revision in the next round.
This point is explained in detail in (Greco et al., 2008; Figueira et al., 2008).
Moreover, information provided by the DM in Step 4 cannot be considered
irreversible. Indeed, the DM can come back to one of the previous iterations
and continue from this point. This feature is concordant with the spirit of
a learning-oriented conception of multiobjective interactive optimization, i.e.
it confirms the idea that the interactive procedure permits the DM to learn
about his/her preferences and about the shape of the Pareto optimal set
(see Chapter 15).
Notice that the proposed approach allows preference information to be
elicited from the DM incrementally. Note also that in Step 7, the new reference
sample $A^R$ is not necessarily different from the previously considered one;
however, the preference information elicited from the DM in the next iteration
is richer than previously, due to the learning effect. This permits the
preference model to be built and refined progressively: in fact, each new item
of information provided in Step 4 reduces the set of compatible value functions
and defines the DM's preferences more and more precisely.
Let us also observe that the information obtained from the DM in Step 4 and
the information given to the DM in Step 6 are composed of very simple and easy
to understand statements: preference comparisons in Step 4, and necessary
and possible rankings in Step 6 (i.e., a necessary ranking that holds for all
compatible value functions, and a possible ranking that holds for at least one
compatible value function, see sub-section 4.4.2). Thus, the nature of the
information exchanged with the DM during the interaction is purely ordinal. Indeed,
monotonically increasing transformations of the evaluation scales of the considered
criteria have no influence on the final result.
Finally, observe that a very important characteristic of our method from
the point of view of learning is that the DM can observe the impact of the
information provided in Step 4 in terms of necessary and possible rankings of
solutions from set $A$.
4.3 The Ordinal Regression Method for Learning
One Compatible Additive Piecewise-Linear
Value Function
The preference information may be either direct or indirect, depending on
whether it specifies directly the values of some parameters used in the preference
model (e.g. trade-off weights, aspiration levels, discrimination thresholds, etc.)
or whether it specifies some examples of holistic judgments from which compatible
values of the preference model parameters are induced. Eliciting direct
preference information from the DM can be counterproductive in real-world
decision making situations because of the high cognitive effort required.
Consequently, asking the DM directly to provide values for the parameters tends to
make the DM uncomfortable. Eliciting indirect preference information demands less
cognitive effort. Indirect preference information is mainly used in the
ordinal regression paradigm. According to this paradigm, holistic preference
information on a subset of some reference or training solutions is known first,
and then a preference model compatible with the information is built and
applied to the whole set of solutions in order to rank them.
The ordinal regression paradigm emphasizes the discovery of intentions as
an interpretation of actions rather than as an a priori position, which
March called posterior rationality (March, 1978). It has been known for at
least fifty years in the field of multidimensional analysis. It is also concordant
with the induction principle used in machine learning. This paradigm has
been applied within the two main Multiple Criteria Decision Aiding (MCDA)
approaches mentioned above: those using a value function as preference model
(Srinivasan and Shocker, 1973; Pekelman and Sen, 1974; Jacquet-Lagrèze and
Siskos, 1982; Siskos et al., 2005), and those using an outranking relation as
preference model (Kiss et al., 1994; Mousseau and Slowinski, 1998). This
paradigm has also been used since the mid-nineties in MCDA methods involving
a new, third family of preference models: a set of dominance decision rules
induced from rough approximations of holistic preference relations (Greco et al.,
1999, 2001, 2005; Słowiński et al., 2005).
Recently, the ordinal regression paradigm has been revisited with the aim
of considering the whole set of value functions compatible with the preference
information provided by the DM, instead of a single compatible value function
used in UTA-like methods (Jacquet-Lagrèze and Siskos, 1982; Siskos et al.,
2005). This extension has been implemented in a method called UTA^GMS
(Greco et al., 2003, 2008), further generalized in another method called GRIP
(Figueira et al., 2008). UTA^GMS and GRIP do not reveal one
compatible value function to the DM; instead, they use the whole set of compatible
(general, not only piecewise-linear) additive value functions to set up a
necessary weak preference relation and a possible weak preference relation on the
whole set of considered solutions.
4.3.1 Concepts: Definitions and Notation
We are considering a multiple criteria decision problem where a finite set of
solutions $A = \{x, \ldots, y, \ldots, w, \ldots\}$ is evaluated on a family
$F = \{g_1, g_2, \ldots, g_n\}$ of $n$ criteria. Let $I = \{1, 2, \ldots, n\}$
denote the set of criteria indices. We assume, without loss of generality, that
the greater $g_i(x)$, the better solution $x$ on criterion $g_i$, for all
$i \in I$, $x \in A$. A DM is willing to rank the solutions of $A$
from the best to the worst, according to his/her preferences. The ranking can
be complete or partial, depending on the preference information provided by
the DM and on the way of exploiting this information. The family of criteria
$F$ is supposed to satisfy consistency conditions, i.e. completeness (all relevant
criteria are considered), monotonicity (the better the evaluation of a solution
on the considered criteria, the more it is preferable to another), and
non-redundancy (no superfluous criteria are considered), see (Roy and Bouyssou,
1993).
Such a decision-making problem statement is called a multiple criteria ranking
problem. It is known that the only information coming out from the
formulation of this problem is the dominance ranking. Let us recall that in the
dominance ranking, solution $x \in A$ is preferred to solution $y \in A$ if and only
if $g_i(x) \ge g_i(y)$ for all $i \in I$, with at least one strict inequality. Moreover, $x$ is
indifferent to $y$ if and only if $g_i(x) = g_i(y)$ for all $i \in I$. Hence, for any pair
of solutions $x, y \in A$, one of four situations may arise in the dominance
ranking: $x$ is preferred to $y$, $y$ is preferred to $x$, $x$ is indifferent to $y$, or $x$ is
incomparable to $y$. Usually, the dominance ranking is very poor, i.e. the most
frequent situation is: $x$ incomparable to $y$.
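The four situations of the dominance ranking can be checked directly from the evaluation vectors. The following is a minimal sketch, assuming that all criteria are to be maximized (as stated above) and that each solution is given as a sequence of its $n$ evaluations; the function name `dominance_situation` and the returned labels are illustrative choices, not notation from the chapter.

```python
from typing import Sequence


def dominance_situation(gx: Sequence[float], gy: Sequence[float]) -> str:
    """Classify a pair of evaluation vectors (larger is better on every criterion)."""
    if len(gx) != len(gy):
        raise ValueError("both solutions must be evaluated on the same criteria")
    x_ge_y = all(a >= b for a, b in zip(gx, gy))  # g_i(x) >= g_i(y) for all i
    y_ge_x = all(b >= a for a, b in zip(gx, gy))  # g_i(y) >= g_i(x) for all i
    if x_ge_y and y_ge_x:
        return "indifferent"      # equal on every criterion
    if x_ge_y:
        return "x preferred"      # at least one strict inequality in favour of x
    if y_ge_x:
        return "y preferred"
    return "incomparable"         # each solution is strictly better somewhere
```

For instance, the vectors (1, 5) and (2, 4) are incomparable, which is the typical case that additional preference information is meant to resolve.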
In order to enrich the dominance ranking, the DM has to provide preference
information which is used to construct an aggregation model making the
solutions more comparable. Such an aggregation model is called a preference
model. It induces a preference structure on set $A$, whose proper exploitation
makes it possible to work out a ranking proposed to the DM.
In what follows, the evaluation of each solution $x \in A$ on each criterion
$g_i \in F$ will be denoted either by $g_i(x)$ or $x_i$.
Let $G_i$ denote the value set (scale) of criterion $g_i$, $i \in I$. Consequently,
\[ G = \prod_{i=1}^{n} G_i \]
represents the evaluation space, and $x \in G$ denotes a profile of a solution in
such a space. We consider a weak preference relation $\succsim$ on $A$ which means,
for each pair of solutions $x, y \in A$,
$x \succsim y \Leftrightarrow$ "$x$ is at least as good as $y$".
This weak preference relation can be decomposed into its asymmetric and
symmetric parts, as follows,
1) $x \succ y \equiv [x \succsim y \text{ and not}(y \succsim x)] \Leftrightarrow$ "$x$ is preferred to $y$", and
2) $x \sim y \equiv [x \succsim y \text{ and } y \succsim x] \Leftrightarrow$ "$x$ is indifferent to $y$".
From a pragmatic point of view, it is reasonable to assume that $G_i \subseteq \mathbb{R}$, for
$i = 1, \ldots, n$. More specifically, we will assume that the evaluation scale on
each criterion $g_i$ is bounded, such that $G_i = [\alpha_i, \beta_i]$, where $\alpha_i, \beta_i$, $\alpha_i < \beta_i$, are
the worst and the best (finite) evaluations, respectively. Thus, $g_i : A \to G_i$,
$i \in I$. Therefore, each solution $x \in A$ is associated with an evaluation vector
denoted by $g(x) = (x_1, x_2, \ldots, x_n) \in G$.
4.3.2 The UTA Method for a Multiple Criteria Ranking Problem
In this sub-section, we recall the principle of ordinal regression via linear
programming, as proposed in the original UTA method; see (Jacquet-Lagrèze
and Siskos, 1982).
Preference Information
The preference information is given in the form of a complete preorder on
a subset of reference solutions $A^R \subseteq A$ (where $|A^R| = p$), called the reference
preorder. The reference solutions are usually those contained in set $A$ for
which the DM is able to express holistic preferences. Let $A^R = \{a, b, c, \ldots\}$ be
the set of reference solutions.
An Additive Model
The additive value function is defined on $A$ such that for each $g(x) \in G$,
\[ U(g(x)) = \sum_{i=1}^{n} u_i(g_i(x)), \tag{4.1} \]
where $u_i$ are non-decreasing marginal value functions, $u_i : G_i \to \mathbb{R}$, $i \in I$.
For the sake of simplicity, we shall write (4.1) as follows,
\[ U(x) = \sum_{i=1}^{n} u_i(x_i). \tag{4.2} \]
In the UTA method, the marginal value functions $u_i$ are assumed to be
piecewise-linear functions. The ranges $[\alpha_i, \beta_i]$ are divided into $\gamma_i \ge 1$ equal
sub-intervals,
\[ [x_i^0, x_i^1],\ [x_i^1, x_i^2],\ \ldots,\ [x_i^{\gamma_i - 1}, x_i^{\gamma_i}], \]
where
\[ x_i^j = \alpha_i + \frac{j}{\gamma_i}(\beta_i - \alpha_i), \quad j = 0, \ldots, \gamma_i, \text{ and } i \in I. \]
The marginal value of a solution $x \in A$ is obtained by linear interpolation,
\[ u_i(x) = u_i(x_i^j) + \frac{x_i - x_i^j}{x_i^{j+1} - x_i^j}\,\bigl(u_i(x_i^{j+1}) - u_i(x_i^j)\bigr), \quad \text{for } x_i \in [x_i^j, x_i^{j+1}]. \tag{4.3} \]
The piecewise-linear additive model is completely defined by the marginal
values at the breakpoints, i.e. $u_i(x_i^0) = u_i(\alpha_i)$, $u_i(x_i^1)$, $u_i(x_i^2)$, ...,
$u_i(x_i^{\gamma_i}) = u_i(\beta_i)$.
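The interpolation rule (4.3) can be illustrated with a short sketch. It assumes that the marginal values at the $\gamma_i + 1$ equally spaced breakpoints are supplied as a list `values`; the function names `breakpoints` and `marginal_value` are hypothetical and serve only this illustration.

```python
from typing import List, Sequence


def breakpoints(alpha: float, beta: float, gamma: int) -> List[float]:
    """Equally spaced breakpoints x^j = alpha + (j / gamma) * (beta - alpha), j = 0..gamma."""
    if gamma < 1 or not alpha < beta:
        raise ValueError("need gamma >= 1 and alpha < beta")
    return [alpha + j / gamma * (beta - alpha) for j in range(gamma + 1)]


def marginal_value(x_i: float, alpha: float, beta: float, values: Sequence[float]) -> float:
    """Linear interpolation (4.3); values[j] is the marginal value at the j-th breakpoint."""
    gamma = len(values) - 1
    pts = breakpoints(alpha, beta, gamma)
    if not alpha <= x_i <= beta:
        raise ValueError("evaluation lies outside the scale [alpha, beta]")
    # Index of the sub-interval containing x_i (the last one when x_i == beta).
    j = min(int((x_i - alpha) / (beta - alpha) * gamma), gamma - 1)
    t = (x_i - pts[j]) / (pts[j + 1] - pts[j])
    return values[j] + t * (values[j + 1] - values[j])
```

With the scale [0, 10], two linear pieces, and breakpoint values 0, 0.1 and 0.4, the evaluation 7.5 lies halfway through the second piece and receives the marginal value 0.1 + 0.5 (0.4 - 0.1) = 0.25.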
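In what follows, the principle of the UTA method is described as it was
recently presented by Siskos et al. (2005).
Therefore, a value function $U(x) = \sum_{i=1}^{n} u_i(x_i)$ is compatible if it satisfies
the following set of constraints:
\[
\begin{array}{l}
\left.
\begin{array}{l}
U(a) > U(b) \Leftrightarrow a \succ b \\
U(a) = U(b) \Leftrightarrow a \sim b
\end{array}
\right\} \ \forall a, b \in A^R \\
u_i(x_i^{j+1}) - u_i(x_i^j) \ge 0, \quad i = 1, \ldots, n, \ j = 1, \ldots, \gamma_i - 1 \\
u_i(\alpha_i) = 0, \quad i = 1, \ldots, n \\
\sum_{i=1}^{n} u_i(\beta_i) = 1
\end{array}
\tag{4.4}
\]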
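For a single candidate value function, the constraints (4.4) can be verified directly. The sketch below assumes that the candidate is given both as a callable `U` on reference solutions and as the lists of its marginal values at the breakpoints, and that the two descriptions agree; `is_compatible` and the numerical tolerance `tol` are assumptions of this illustration, not part of the UTA method. Monotonicity is checked over all consecutive breakpoints.

```python
from typing import Callable, Hashable, Iterable, Sequence, Tuple

Pair = Tuple[Hashable, Hashable]


def is_compatible(
    U: Callable[[Hashable], float],
    strict_pairs: Iterable[Pair],
    indifferent_pairs: Iterable[Pair],
    breakpoint_values: Sequence[Sequence[float]],
    tol: float = 1e-9,
) -> bool:
    """Check the constraints (4.4) for one candidate additive value function."""
    # Reference preorder: a preferred to b requires U(a) > U(b).
    for a, b in strict_pairs:
        if not U(a) > U(b) + tol:
            return False
    # Reference preorder: a indifferent to b requires U(a) = U(b).
    for a, b in indifferent_pairs:
        if abs(U(a) - U(b)) > tol:
            return False
    # Monotonicity of every marginal value function over its breakpoints.
    for values in breakpoint_values:
        if any(later < earlier - tol for earlier, later in zip(values, values[1:])):
            return False
    # Normalization: u_i(alpha_i) = 0 for all i and sum_i u_i(beta_i) = 1.
    if any(abs(values[0]) > tol for values in breakpoint_values):
        return False
    return abs(sum(values[-1] for values in breakpoint_values) - 1.0) <= tol
```

Such a check only confirms or rejects one given function; deciding whether any compatible function exists requires solving an optimization problem over all breakpoint values.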
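Checking for Compatible Value Functions through Linear
Programming
To verify if a compatible value function $U(x) = \sum_{i=1}^{n} u_i(x_i)$ restoring the
reference preorder on $A^R$ exists, one can solve the following linear
programming problem, where $u_i(x_i^j)$, $i = 1, \ldots, n$, $j = 1, \ldots, \gamma_i$, are unknown, and
$\sigma^+(a), \sigma^-(a)$ ($a \in A^R$) are auxiliary variables:
\[
\begin{array}{l}
\text{Min } F = \sum_{a \in A^R} \bigl(\sigma^+(a) + \sigma^-(a)\bigr) \\
\text{s.t.} \\
\left.
\begin{array}{l}
U(a) + \sigma^+(a) - \sigma^-(a) \ge U(b) + \sigma^+(b) - \sigma^-(b) + \varepsilon \Leftrightarrow a \succ b \\
U(a) + \sigma^+(a) - \sigma^-(a) = U(b) + \sigma^+(b) - \sigma^-(b) \Leftrightarrow a \sim b
\end{array}
\right\} \ \forall a, b \in A^R \\
u_i(x_i^{j+1}) - u_i(x_i^j) \ge 0, \quad i = 1, \ldots, n, \ j = 1, \ldots, \gamma_i - 1 \\
u_i(\alpha_i) = 0, \quad i = 1, \ldots, n \\
\sum_{i=1}^{n} u_i(\beta_i) = 1 \\
\sigma^+(a), \sigma^-(a) \ge 0, \quad \forall a \in A^R
\end{array}
\tag{4.5}
\]
where $\varepsilon$ is an arbitrarily small positive value so that
$U(a) + \sigma^+(a) - \sigma^-(a) > U(b) + \sigma^+(b) - \sigma^-(b)$ in case of $a \succ b$.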
If the optimal value of the objective function of program (4.5) is equal to
zero ($F^* = 0$), then there exists at least one value function $U(x) = \sum_{i=1}^{n} u_i(x_i)$
satisfying (4.4), i.e. compatible with the reference preorder on $A^R$. In other
words, this means that the corresponding polyhedron (4.4) of feasible solutions
for $u_i(x_i^j)$, $i = 1, \ldots, n$, $j = 1, \ldots, \gamma_i$, is not empty.
Let us remark that the transition from the preorder to the marginal value
function exploits the ordinal character of the criterion scale $G_i$. Note, however,
that the scale of the marginal value function is a conjoint interval scale. More
precisely, for the considered additive value function $U(x) = \sum_{i=1}^{n} u_i(x_i)$, the
admissible transformations on the marginal value functions $u_i(x_i)$ have the
form $u_i^*(x_i) = k \times u_i(x_i) + h_i$, $h_i \in \mathbb{R}$, $i = 1, \ldots, n$, $k > 0$, such that for all
$[x_1, \ldots, x_n], [y_1, \ldots, y_n] \in G$
\[ \sum_{i=1}^{n} u_i(x_i) \ge \sum_{i=1}^{n} u_i(y_i) \ \Leftrightarrow\ \sum_{i=1}^{n} u_i^*(x_i) \ge \sum_{i=1}^{n} u_i^*(y_i). \]
An alternative way of representing the same preference model is:
\[ U(x) = \sum_{i=1}^{n} w_i \hat{u}_i(x_i), \tag{4.6} \]
where $\hat{u}_i(\alpha_i) = 0$, $\hat{u}_i(\beta_i) = 1$, $w_i \ge 0$, $i = 1, 2, \ldots, n$, and
$\sum_{i=1}^{n} w_i = 1$. Note that the correspondence between (4.6) and (4.2) is such that
$w_i = u_i(\beta_i)$, $\forall i \in I$. Due to the cardinal character of the marginal value
function scale, the parameters $w_i$ can be interpreted as tradeoff weights among
marginal value functions $\hat{u}_i(x_i)$. We will use, however, the preference model (4.2)
with normalization constraints bounding $U(x)$ to the interval $[0, 1]$.
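The correspondence between the two representations can be made concrete with a small sketch. It assumes marginal value functions given by their breakpoint values and already normalized as in (4.4), so that each list starts at 0 and the last entries sum to 1; the function name `to_weighted_form` is hypothetical. When a weight equals zero the normalized marginal function is not determined by (4.2), so the sketch returns `None` for it.

```python
from typing import List, Optional, Sequence, Tuple


def to_weighted_form(
    breakpoint_values: Sequence[Sequence[float]],
) -> Tuple[List[float], List[Optional[List[float]]]]:
    """Split marginals of model (4.2) into weights and normalized marginals of model (4.6)."""
    # w_i = u_i(beta_i): the marginal value at the best evaluation on criterion i.
    weights = [values[-1] for values in breakpoint_values]
    normalized: List[Optional[List[float]]] = []
    for values, w in zip(breakpoint_values, weights):
        if w > 0:
            # Rescale so that the normalized marginal runs from 0 to 1.
            normalized.append([v / w for v in values])
        else:
            # Zero weight: the criterion does not contribute to U.
            normalized.append(None)
    return weights, normalized
```

Multiplying each normalized marginal value by its weight gives back the original marginal value, so both forms rank the solutions identically.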
When the optimal value of the objective function of program (4.5) is
greater than zero ($F^* > 0$), then there is no value function $U(x) = \sum_{i=1}^{n} u_i(x_i)$
compatible with the reference preorder on $A^R$. In such a case, three possible
moves can be considered:
- increasing the number of linear pieces $\gamma_i$ for one or several marginal value
  functions $u_i$ could make it possible to find an additive value function
  compatible with the reference preorder on $A^R$;
- revising the reference preorder on $A^R$ could lead to finding an additive value
  function compatible with the new preorder;
- searching over the relaxed domain $F \le F^* + \eta$ could lead to an additive
  value function giving a preorder on $A^R$ sufficiently close to the reference
  preorder (in the sense of Kendall's $\tau$).
4.4 The Ordinal Regression Method for Learning the
Whole Set of Compatible Value Functions
Recently, two new methods, UTA^GMS (Greco et al., 2008) and GRIP (Figueira
et al., 2008), have generalized the ordinal regression approach of the UTA
method in several aspects:
- taking into account all additive value functions (4.1) compatible with the
  preference information, while UTA uses only one such function,
- considering marginal value functions of (4.1) as general non-decreasing
  functions, and not piecewise-linear, as in UTA,
- asking the DM for a ranking of reference solutions which is not necessarily
  complete (just pairwise comparisons),
- taking into account additional preference information about intensity of
  preference, expressed both comprehensively and with respect to a single
  criterion,
- avoiding the use of the exogenous, and not neutral for the result, parameter
  $\varepsilon$ in the modeling of strict preference between solutions.
UTA^GMS produces two rankings on the set of solutions $A$, such that for any
pair of solutions $a, b \in A$:
- in the necessary ranking, $a$ is ranked at least as good as $b$ if and only
  if $U(a) \ge U(b)$ for all value functions compatible with the preference
  information,
- in the possible ranking, $a$ is ranked at least as good as $b$ if and only if
  $U(a) \ge U(b)$ for at least one value function compatible with the preference
  information.
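The two definitions can be illustrated on a finite list of compatible value functions. This is only a sketch under a strong assumption: the true set of compatible functions is a whole polyhedron, so a finite sample can overstate the necessary relation and understate the possible one. The function name `necessary_and_possible` is hypothetical.

```python
from itertools import product
from typing import Callable, Hashable, Sequence, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def necessary_and_possible(
    solutions: Sequence[Hashable],
    value_functions: Sequence[Callable[[Hashable], float]],
) -> Tuple[Set[Pair], Set[Pair]]:
    """Return the pairs (a, b) that hold for all, resp. at least one, sampled function."""
    if not value_functions:
        raise ValueError("at least one compatible value function is required")
    necessary: Set[Pair] = set()
    possible: Set[Pair] = set()
    for a, b in product(solutions, repeat=2):
        comparisons = [U(a) >= U(b) for U in value_functions]
        if all(comparisons):
            necessary.add((a, b))   # U(a) >= U(b) for every sampled function
        if any(comparisons):
            possible.add((a, b))    # U(a) >= U(b) for at least one sampled function
    return necessary, possible
```

By construction the necessary relation is contained in the possible one, and every pair of solutions is comparable in at least one direction in the possible relation.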
GRIP produces four more necessary and possible rankings on the set of
solutions $A \times A$, as can be seen in sub-section 4.4.2.
The necessary ranking can be considered as robust with respect to the
preference information. Such robustness of the necessary ranking refers to the
fact that any pair of solutions compares in the same way whatever the additive
value function compatible with the preference information. Indeed, when no
preference information is given, the necessary ranking boils down to the weak
dominance relation (i.e., $a$ is necessarily at least as good as $b$ if $g_i(a) \ge g_i(b)$
for all $g_i \in F$), and the possible ranking is a complete relation. Every new
pairwise comparison of reference solutions for which the dominance relation
does not hold enriches the necessary ranking and impoverishes the
possible ranking, so that they converge as the preference
information grows.
Moreover, such an approach has another feature which is very appealing
in the context of MOO. It stems from the fact that it gives space for
interactivity with the DM. Presentation of the necessary ranking, resulting from
the preference information provided by the DM, is a good support for eliciting
reactions from the DM. Namely, (s)he could wish to enrich the
ranking or to contradict a part of it. Such a reaction can be integrated in the
preference information considered in the next calculation stage.
The idea of considering the whole set of compatible value functions was
originally introduced in UTA^GMS. GRIP (Generalized Regression with Intensities
of Preference) can be seen as an extension of UTA^GMS permitting one to
take into account additional preference information in the form of comparisons
of intensities of preference between some pairs of reference solutions. For
solutions $x, y, w, z \in A$, these comparisons are expressed in two possible ways
(not exclusive): (i) comprehensively, on all criteria, like "$x$ is preferred to $y$
at least as much as $w$ is preferred to $z$"; and (ii) partially, on each criterion,
like "$x$ is preferred to $y$ at least as much as $w$ is preferred to $z$, on criterion
$g_i \in F$". Although UTA^GMS was historically the first of the two methods,
GRIP incorporates and extends UTA^GMS, so in the following we shall present
only GRIP.
4.4.1 The Preference Information Provided by the Decision Maker

The DM is expected to provide the following preference information in the
dialogue stage of the procedure:
- A partial preorder $\succsim$ on $A^R$ whose meaning is: for some $x, y \in A^R$,
  $x \succsim y \Leftrightarrow$ "x is at least as good as y".
  Moreover, $\succ$ (preference) is the asymmetric part of $\succsim$, and $\sim$ (indifference)
  is its symmetric part.
- A partial preorder $\succsim^*$ on $A^R \times A^R$, whose meaning is: for some $x, y, w, z \in A^R$,
  $(x, y) \succsim^* (w, z) \Leftrightarrow$ "x is preferred to y at least as much as w is preferred
  to z".
  Also in this case, $\succ^*$ is the asymmetric part of $\succsim^*$, and $\sim^*$ is its symmetric
  part.
- A partial preorder $\succsim_i^*$ on $A^R \times A^R$, whose meaning is: for some $x, y, w, z \in A^R$,
  $(x, y) \succsim_i^* (w, z) \Leftrightarrow$ "x is preferred to y at least as much as w is preferred
  to z on criterion $g_i$", $i \in I$.
In the following, we also consider the weak preference relation $\succsim_i$, being a
complete preorder whose meaning is: for all $x, y \in A$,
$x \succsim_i y \Leftrightarrow$ "x is at least as good as y on criterion $g_i$", $i \in I$.
Weak preference relations $\succsim_i$, $i \in I$, are not provided by the DM, but they
are obtained directly from the evaluation of solutions x and y on criteria $g_i$,
i.e., $x \succsim_i y \Leftrightarrow g_i(x) \ge g_i(y)$, $i \in I$.
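The split of a weak relation into its asymmetric part (preference) and symmetric part (indifference) can be illustrated as follows. This is a minimal sketch: the relation is stored as a set of ordered pairs, the solution names are hypothetical, and reflexive pairs are omitted for brevity.

```python
def asymmetric_part(relation):
    """Strict preference: (x, y) holds but (y, x) does not."""
    return {(x, y) for (x, y) in relation if (y, x) not in relation}


def symmetric_part(relation):
    """Indifference: both (x, y) and (y, x) hold."""
    return {(x, y) for (x, y) in relation if (y, x) in relation}


# Hypothetical weak preference statements "x is at least as good as y".
weak = {("s1", "s2"), ("s1", "s3"), ("s2", "s3"), ("s3", "s2")}

preference = asymmetric_part(weak)
indifference = symmetric_part(weak)
print(preference, indifference)
```

The same two helper functions apply unchanged to the intensity relations, whose elements are pairs of pairs such as `(("x", "y"), ("w", "z"))`.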
4.4.2 Necessary and Possible Binary Relations in Set A and in Set A × A

When there exists at least one value function compatible with the preference
information provided by the DM, the method produces the following rankings:
- a necessary ranking $\succsim^N$, for all pairs of solutions $(x, y) \in A \times A$;
- a possible ranking $\succsim^P$, for all pairs of solutions $(x, y) \in A \times A$;
- a necessary ranking $\succsim^{*N}$, with respect to the comprehensive intensities of
  preferences for all $((x, y), (w, z)) \in A \times A \times A \times A$;
- a possible ranking $\succsim^{*P}$, with respect to the comprehensive intensities of
  preferences for all $((x, y), (w, z)) \in A \times A \times A \times A$;
- a necessary ranking $\succsim_i^{*N}$, with respect to the partial intensities of
  preferences for all $((x, y), (w, z)) \in A \times A \times A \times A$ and for all criteria $g_i$,
  $i \in I$;
- a possible ranking $\succsim_i^{*P}$, with respect to the partial intensities of
  preferences for all $((x, y), (w, z)) \in A \times A \times A \times A$ and for all criteria $g_i$, $i \in I$.
4.4.3 Linear Programming Constraints
In this sub-section, we present a set of constraints that interprets the
preference information in terms of conditions on the compatible value functions.
To be compatible with the provided preference information, the value
function $U : A \to [0, 1]$ should satisfy the following constraints corresponding to
the DM's preference information:
a) $U(w) > U(z)$ if $w \succ z$
b) $U(w) = U(z)$ if $w \sim z$
c) $U(w) - U(z) > U(x) - U(y)$ if $(w, z) \succ^* (x, y)$
d) $U(w) - U(z) = U(x) - U(y)$ if $(w, z) \sim^* (x, y)$
e) $u_i(w) \ge u_i(z)$ if $w \succsim_i z$, $i \in I$
f) $u_i(w) - u_i(z) > u_i(x) - u_i(y)$ if $(w, z) \succ_i^* (x, y)$, $i \in I$
g) $u_i(w) - u_i(z) = u_i(x) - u_i(y)$ if $(w, z) \sim_i^* (x, y)$, $i \in I$
Let us remark that within UTA-like methods, constraint a) is written as
$U(w) \ge U(z) + \varepsilon$, where $\varepsilon > 0$ is a threshold exogenously introduced.
Analogously, constraints c) and f) should be written as
$U(w) - U(z) \ge U(x) - U(y) + \varepsilon$
and
$u_i(w) - u_i(z) \ge u_i(x) - u_i(y) + \varepsilon$.
However, we would like to avoid the use of any exogenous parameter and,
therefore, instead of setting an arbitrary value of $\varepsilon$, we consider it as an
auxiliary variable, and we test the feasibility of constraints a), c), and f)
(see sub-section 4.4.4). This makes it possible to take into account all possible value
functions, even those having a very small preference threshold $\varepsilon$. This is also
safer from the viewpoint of objectivity of the methodology used. In fact, the
value of $\varepsilon$ is not meaningful in itself; it is useful only because it makes it
possible to discriminate preference from indifference.
Moreover, the following normalization constraints should also be taken
into account:
h) $u_i(x_i^*) = 0$, where $x_i^*$ is such that $x_i^* = \min\{g_i(x) : x \in A\}$;
i) $\sum_{i \in I} u_i(y_i^*) = 1$, where $y_i^*$ is such that $y_i^* = \max\{g_i(x) : x \in A\}$.
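To make constraints a), b), h) and i) concrete, the following sketch checks whether one given additive value function is compatible with a few pairwise statements. It is an illustration under stated assumptions, not part of GRIP itself: the linear marginal value functions, the solutions, and the function names are all hypothetical, and a small tolerance stands in for exact arithmetic.

```python
def additive_value(marginals, evaluations):
    """U(x) = sum_i u_i(g_i(x)) for an additive value function."""
    return sum(u(g) for u, g in zip(marginals, evaluations))


def is_normalized(marginals, worst, best, tol=1e-9):
    """Constraints h) and i): u_i(worst_i) = 0 and sum_i u_i(best_i) = 1."""
    zero_at_worst = all(abs(u(w)) <= tol for u, w in zip(marginals, worst))
    one_at_best = abs(sum(u(b) for u, b in zip(marginals, best)) - 1.0) <= tol
    return zero_at_worst and one_at_best


def satisfies_a_and_b(marginals, solutions, strict_prefs, indifferences, tol=1e-9):
    """Constraints a) and b): U(w) > U(z) for w preferred to z, U(w) = U(z) for w indifferent to z."""
    U = {name: additive_value(marginals, g) for name, g in solutions.items()}
    strict_ok = all(U[w] - U[z] > tol for w, z in strict_prefs)
    indiff_ok = all(abs(U[w] - U[z]) <= tol for w, z in indifferences)
    return strict_ok and indiff_ok


# Hypothetical example: two criteria on the range [0, 10], linear marginals.
marginals = [lambda g: 0.6 * g / 10, lambda g: 0.4 * g / 10]
solutions = {"x": (10, 0), "y": (0, 10), "z": (5, 7.5)}

print(is_normalized(marginals, worst=(0, 0), best=(10, 10)))
print(satisfies_a_and_b(marginals, solutions, [("x", "y")], [("x", "z")]))
```

GRIP does not fix a single value function in this way; it reasons about the whole set of functions that pass such a check, which is why the linear programs of sub-section 4.4.4 are needed.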
4.4.4 Computational Issues

In order to conclude the truth or falsity of binary relations $\succsim^N$, $\succsim^P$, $\succsim^{*N}$,
$\succsim^{*P}$, $\succsim_i^{*N}$ and $\succsim_i^{*P}$, we have to take into account that, for all $x, y, w, z \in A$
and $i \in I$:
1) $x \succsim^N y \Leftrightarrow \inf\{U(x) - U(y)\} \ge 0$,
2) $x \succsim^P y \Leftrightarrow \inf\{U(y) - U(x)\} \le 0$,
3) $(x, y) \succsim^{*N} (w, z) \Leftrightarrow \inf\{(U(x) - U(y)) - (U(w) - U(z))\} \ge 0$,
4) $(x, y) \succsim^{*P} (w, z) \Leftrightarrow \inf\{(U(w) - U(z)) - (U(x) - U(y))\} \le 0$,
5) $(x, y) \succsim_i^{*N} (w, z) \Leftrightarrow \inf\{(u_i(x_i) - u_i(y_i)) - (u_i(w_i) - u_i(z_i))\} \ge 0$,
6) $(x, y) \succsim_i^{*P} (w, z) \Leftrightarrow \inf\{(u_i(w_i) - u_i(z_i)) - (u_i(x_i) - u_i(y_i))\} \le 0$,
with the infimum calculated on the set of value functions satisfying constraints
from a) to i). Let us remark, however, that linear programming is not able
to handle strict inequalities such as the above a), c), and f). Moreover, linear
programming permits calculating the minimum or the maximum of an objective
function, not an infimum. Nevertheless, after properly reformulating the
above properties 1) to 6), a result presented in (Marichal and Roubens, 2000)
permits using linear programming for testing the truth of binary relations
$\succsim^N$, $\succsim^P$, $\succsim^{*N}$, $\succsim^{*P}$, $\succsim_i^{*N}$ and $\succsim_i^{*P}$.
In order to use such a result, constraints a), c) and f) have to be reformulated
as follows:
a') $U(x) \ge U(y) + \varepsilon$ if $x \succ y$;
c') $U(x) - U(y) \ge U(w) - U(z) + \varepsilon$ if $(x, y) \succ^* (w, z)$;
f') $u_i(x) - u_i(y) \ge u_i(w) - u_i(z) + \varepsilon$ if $(x, y) \succ_i^* (w, z)$.
Notice that constraints a), c) and f) are equivalent to a'), c'), and f') whenever
$\varepsilon > 0$.
Then, properties 1) to 6) have to be reformulated such that the search for
the infimum is replaced by the calculation of the maximum value of $\varepsilon$ on the
set of value functions satisfying constraints from a) to i), with constraints a),
c), and f) transformed to a'), c'), and f'), plus constraints specific to each
point:
1') $x \succsim^P y \Leftrightarrow \varepsilon^* > 0$,
where $\varepsilon^* = \max \varepsilon$, subject to the constraints a'), b), c'), d), e), f'), plus
the constraint $U(x) \ge U(y)$;
2') $x \succsim^N y \Leftrightarrow \varepsilon^* \le 0$,
where $\varepsilon^* = \max \varepsilon$, subject to the constraints a'), b), c'), d), e), f'), plus
the constraint $U(y) \ge U(x) + \varepsilon$;
3') $(x, y) \succsim^{*P} (w, z) \Leftrightarrow \varepsilon^* > 0$,
where $\varepsilon^* = \max \varepsilon$, subject to the constraints a'), b), c'), d), e), f'), plus
the constraint $(U(x) - U(y)) - (U(w) - U(z)) \ge 0$;
4') $(x, y) \succsim^{*N} (w, z) \Leftrightarrow \varepsilon^* \le 0$,
where $\varepsilon^* = \max \varepsilon$, subject to the constraints a'), b), c'), d), e), f'), plus
the constraint $(U(w) - U(z)) - (U(x) - U(y)) \ge \varepsilon$;
5') $(x, y) \succsim_i^{*P} (w, z) \Leftrightarrow \varepsilon^* > 0$,
where $\varepsilon^* = \max \varepsilon$, subject to the constraints a'), b), c'), d), e), f'), plus
the constraint $(u_i(x_i) - u_i(y_i)) - (u_i(w_i) - u_i(z_i)) \ge 0$;
6') $(x, y) \succsim_i^{*N} (w, z) \Leftrightarrow \varepsilon^* \le 0$,
where $\varepsilon^* = \max \varepsilon$, subject to the constraints a'), b), c'), d), e), f'), plus
the constraint $(u_i(w_i) - u_i(z_i)) - (u_i(x_i) - u_i(y_i)) \ge \varepsilon$.
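The logic behind test 2') can be spelled out as a short chain of equivalences. This is a reading of the reformulation above in the section's own notation, not an additional result; here $\varepsilon^*$ denotes the optimal value of the linear program stated in 2').

```latex
\begin{align*}
x \succsim^N y
  &\iff U(x) \ge U(y) \text{ for every compatible } U \\
  &\iff \text{no compatible } U \text{ satisfies } U(y) > U(x) \\
  &\iff \text{no pair } (U, \varepsilon) \text{ with } \varepsilon > 0
        \text{ satisfies a'), b), c'), d), e), f') and } U(y) \ge U(x) + \varepsilon \\
  &\iff \varepsilon^* \le 0 .
\end{align*}
```

Test 1') works the other way round: a strictly positive $\varepsilon^*$ certifies that at least one compatible value function satisfies $U(x) \ge U(y)$, which is exactly what the possible relation requires.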
4.4.5 Comparison of GRIP with the Analytic Hierarchy Process
In AHP (Saaty, 1980, 2005), criteria should be compared pairwise with respect
to their importance. Actions (solutions) are also compared pairwise on
particular criteria with respect to intensity of preference. The following
nine-point scale of preference is used: 1 - equal importance, 3 - moderate
importance, 5 - strong importance, 7 - very strong or demonstrated importance, and
9 - extreme importance; 2, 4, 6 and 8 are intermediate values between the two
adjacent judgements. The intensity of importance of criterion $g_i$ over criterion
$g_j$ is the inverse of the intensity of importance of $g_j$ over $g_i$. Analogously, the
intensity of preference of action x over action y is the inverse of the intensity
of preference of y over x. The above scale is a ratio scale. Therefore, the
intensity of importance is read as the ratio of weights $w_i$ and $w_j$ corresponding
to criteria $g_i$ and $g_j$, and the intensity of preference is read as the ratio of the
attractiveness of x and the attractiveness of y, with respect to the considered
criterion $g_i$. In terms of value functions, the intensity of preference can be
interpreted as the ratio $u_i(g_i(x)) / u_i(g_i(y))$. Thus, the problem is how to obtain values
of $w_i$ and $w_j$ from ratio $w_i / w_j$, and values of $u_i(g_i(x))$ and $u_i(g_i(y))$ from ratio
$u_i(g_i(x)) / u_i(g_i(y))$.
In AHP, it is proposed that these values are supplied by principal eigenvectors
of matrices composed of the ratios $w_i / w_j$ and $u_i(g_i(x)) / u_i(g_i(y))$. The marginal value
functions $u_i(g_i(x))$ are then aggregated by means of a weighted sum using the
weights $w_i$.
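The eigenvector step can be illustrated with a plain power iteration. This is a minimal sketch rather than Saaty's reference procedure: the comparison matrix is hypothetical and perfectly consistent (its entries are the ratios of the weights 0.6, 0.3 and 0.1), and the result is normalized so that the weights sum to one.

```python
def principal_eigenvector(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration, normalized to sum to 1."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [wi / total for wi in w]
    return v


# Hypothetical pairwise comparison matrix: entry (i, j) is the judged ratio w_i / w_j.
ratios = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]

weights = principal_eigenvector(ratios)
print([round(w, 3) for w in weights])
```

With inconsistent judgements the matrix is no longer of rank one, and the eigenvector then acts as a compromise between conflicting ratios.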
Comparing AHP with GRIP, we can say that with respect to a single criterion,
the type of questions addressed to the DM is the same: express the
intensity of preference in qualitative-ordinal terms (equal, moderate, strong,
very strong, extreme). However, differently from GRIP, in AHP this intensity of
preference is translated into quantitative terms (the scale from 1 to 9) in a quite
arbitrary way. In GRIP, instead, the marginal value functions are just a
numerical representation of the original qualitative-ordinal information, and no
intermediate transformation into quantitative terms is exogenously imposed.
Other differences between AHP and GRIP are related to the following
aspects.
1) In GRIP, the value functions $u_i(g_i(x))$ depend mainly on holistic judgements,
i.e., comprehensive preferences involving jointly all the criteria,
while this is not the case in AHP.
2) In AHP, the weights $w_i$ of criteria $g_i$ are calculated on the basis of pairwise
comparisons of criteria with respect to their importance; in GRIP, this is
not the case, because the value functions $u_i(g_i(x))$ are expressed on the
same scale and thus they can be summed up without any further weighting.
3) In AHP, all non-ordered pairs of actions must be compared from the viewpoint
of the intensity of preference with respect to each particular criterion.
Therefore, if m is the number of actions and n the number of criteria, then
the DM has to answer $n \cdot m(m-1)/2$ questions. Moreover, the DM has to
answer questions relative to $n(n-1)/2$ pairwise comparisons of the considered
criteria with respect to their importance. This is not the case in GRIP,
which accepts partial information about preferences in terms of pairwise
comparisons of some reference actions. Finally, in GRIP there is no question
about comparison of relative importance of criteria.
As far as point 2) is concerned, observe that the weights $w_i$ used in AHP
represent tradeoffs between evaluations on different criteria. For this reason it is
doubtful whether they could be inferred from answers to questions concerning
comparison of importance. Therefore, AHP has a problem with meaningfulness of
its output with respect to its input, which is not the case for GRIP.
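The number of questions mentioned in point 3) grows quickly with the size of the problem. The short sketch below evaluates the two counts; the function name is hypothetical, and the sizes used (20 actions, 5 criteria) are those of the illustrative example in Section 4.5.

```python
def ahp_question_count(m, n):
    """Number of pairwise judgements AHP requires for m actions and n criteria."""
    action_comparisons = n * m * (m - 1) // 2   # all pairs of actions, on each criterion
    criteria_comparisons = n * (n - 1) // 2     # all pairs of criteria
    return action_comparisons + criteria_comparisons


print(ahp_question_count(m=20, n=5))  # 950 action comparisons + 10 criteria comparisons
```

By contrast, the interaction in Section 4.5 proceeds with one additional pairwise comparison per iteration.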
4.4.6 Comparison of GRIP with MACBETH
MACBETH (Measuring Attractiveness by a Categorical Based Evaluation
TecHnique) (Bana e Costa and Vansnick, 1994; Bana e Costa et al., 2005)
is a method for multiple criteria decision analysis that appeared in the early
nineties. This approach requires from the DM qualitative judgements about
differences of value to quantify the relative attractiveness of actions (solutions)
or criteria.
When using MACBETH, the DM is asked to provide preference information
composed of a strict order of all actions from A, and a qualitative
judgement of the difference of attractiveness between every two non-indifferent
actions. Seven semantic categories of the difference of attractiveness are
considered: null, very weak, weak, moderate, strong, very strong, and extreme.
The difference of attractiveness reflects the intensity of preferences.
The main idea of MACBETH is to build an interval scale from the
preference information provided by the DM. It is, however, necessary that the
above categories correspond to disjoint intervals (represented in terms of
real numbers). The bounds for such intervals are not arbitrarily fixed a priori,
but they are calculated so as to be compatible with the numerical values of all
particular actions from A, and to ensure compatibility between these values
(see Bana e Costa et al., 2005). Linear programming models are used for these
calculations. In case of inconsistent judgments, MACBETH provides the DM
with information in order to eliminate such inconsistency.
When comparing MACBETH with GRIP, the following aspects should be
considered:
- both deal with qualitative judgements;
- both need a set of comparisons of actions or pairs of actions to work out a
  numerical representation of preferences; however, MACBETH depends on
  the specification of two characteristic levels on the original scale, "neutral"
  and "good", to obtain the numerical representation of preferences, while
  GRIP does not need this information;
- GRIP adopts the "disaggregation-aggregation" approach and, therefore,
  it considers mainly holistic judgements relative to comparisons involving
  jointly all the criteria, which is not the case for MACBETH;
- GRIP is more general than MACBETH since it can take into account
  the same kind of qualitative judgments as MACBETH (the difference of
  attractiveness between pairs of actions) and, moreover, the intensity of
  preferences of the type "x is preferred to y at least as much as z is preferred
  to w".
As for the last item, it should be noticed that the intensity of preference
considered in MACBETH and the intensity coming from comparisons of the
type "x is preferred to y at least as strongly as w is preferred to z" (i.e., the
quaternary relation $\succsim^*$) are substantially the same. In fact, the intensities of
preference are equivalence classes of the preorder generated by $\succsim^*$. This means
that all the pairs (x, y) and (w, z), such that x is preferred to y with the same
intensity as w is preferred to z, belong to the same semantic category of
difference of attractiveness considered in MACBETH. To be more precise, the
structure of intensity of preference considered in MACBETH is a particular
case of the structure of intensity of preference represented by $\succsim^*$ in GRIP. Still
more precisely, GRIP has the same structure of intensity as MACBETH when
$\succsim^*$ is a complete preorder. When this does not occur, MACBETH cannot be
used, while GRIP can naturally deal with this situation.
The comparison of GRIP and MACBETH can be summarized in the following
points:
1. GRIP uses preference information relative to: 1) comprehensive preference
on a subset of reference actions with respect to all criteria, 2) marginal
intensity of preference on some single criteria, and 3) comprehensive
intensity of preference with respect to all criteria, while MACBETH requires
preference information on all pairs of actions with respect to each one of
the considered criteria.
2. Information about marginal intensity of preference is of the same nature
in GRIP and MACBETH (equivalence classes of relation $\sim_i^*$ correspond
to qualitative judgements of MACBETH), but in GRIP it may not be
complete.
3. GRIP is a "disaggregation-aggregation" approach while MACBETH makes
use of the "aggregation" approach and, therefore, it needs weights to
aggregate evaluations on the criteria.
4. GRIP works with all compatible value functions, while MACBETH builds
a single interval scale for each criterion, even if many such scales would
be compatible with the preference information.
5. By distinguishing necessary and possible consequences of using all value
functions compatible with the preference information, GRIP includes a kind of
robustness analysis instead of using a single "best-fit" value function.
6. The necessary and possible preference relations considered in GRIP have
several properties of general interest for MCDA.
4.5 An Illustrative Example
In this section, we illustrate how our approach can support the DM in specifying
his/her preferences on a set of Pareto optimal solutions. In this didactic
example, we shall imagine an interaction with a fictitious DM so as to exemplify
and illustrate the type of interaction proposed in our method.
We consider a MOO problem that involves five objectives that are to be
maximized. Let us consider a subset A of the Pareto frontier of this MOO
problem consisting of 20 solutions (see Table 4.1). Note that this set A can
be computed using a MOO or EMO algorithm (see Chapters 2 and 3). Let us
suppose that the reference sample $A^R$ of solutions from set A is the following:
$A^R = \{s_1, s_2, s_4, s_5, s_8, s_{10}\}$. For the sake of simplicity, we shall consider the
set $A^R$ constant across iterations (although the interaction scheme permits $A^R$
to evolve during the process). For the same reason, we will suppose that the
DM expresses preference information only in terms of pairwise comparisons
of solutions from $A^R$ (intensity of preference will not be expressed in the
preference information).
Table 4.1. The whole set of Pareto optimal solutions for the example MOO problem
s1 = (14.5, 147, 4, 1014, 5.25)           s11 = (15.75, 164.375, 41.5, 311, 6.5)
s2 = (13.25, 199.125, 4, 1014, 4)         s12 = (13.25, 181.75, 41.5, 311, 4)
s3 = (15.75, 164.375, 16.5, 838.25, 5.25) s13 = (12, 199.125, 41.5, 311, 2.75)
s4 = (12, 181.75, 16.5, 838.25, 4)        s14 = (17, 147, 16.5, 662.5, 5.25)
s5 = (12, 164.375, 54, 838.25, 4)         s15 = (15.75, 199.125, 16.5, 311, 6.5)
s6 = (13.25, 199.125, 29, 662.5, 5.25)    s16 = (13.25, 164.375, 54, 311, 4)
s7 = (13.25, 147, 41.5, 662.5, 5.25)      s17 = (17, 181.75, 16.5, 486.75, 5.25)
s8 = (17, 216.5, 16.5, 486.75, 1.5)       s18 = (14.5, 164.375, 41.5, 838.25, 4)
s9 = (17, 147, 41.5, 486.75, 5.25)        s19 = (15.75, 181.75, 41.5, 135.25, 5.25)
s10 = (15.75, 216.5, 41.5, 662.5, 1.5)    s20 = (15.75, 181.75, 41.5, 311, 2.75)
The DM does not see any satisfactory solution in the reference sample $A^R$
($s_1$, $s_2$, $s_4$ and $s_5$ have too weak evaluations on the first criterion, while $s_8$
and $s_{10}$ have the worst evaluation in A on the last criterion), and wishes to
find a satisfactory solution in A. Obviously, solutions in A are not comparable
unless preference information is expressed by the DM. In this perspective,
he/she provides a first comparison: $s_1 \succ s_2$.
Considering the provided preference information, we can compute the
necessary and possible rankings on set A (computations for this example were
performed using the GNU-UTA software package (Chakhar and Mousseau,
2007); note that the UTA$^{GMS}$ and GRIP methods are also implemented in
the Decision Deck software platform (Consortium, 2008)). The DM decided to
consider the necessary ranking only, as it has a more readable graphical
representation than the possible ranking at the stage of relatively poor preference
information. The partial preorder of the necessary ranking is depicted in Fig.
4.1 and shows the comparisons that hold for all additive value functions
compatible with the information provided by the DM (i.e., $s_1 \succ s_2$). It should be
observed that the computed partial preorder contains the preference
information provided by the DM (dashed arrow), but also additional comparisons that
result from the initial information (continuous arrows); for instance, $s_3 \succ^N s_4$
holds because $U(s_3) \ge U(s_4)$ for each compatible value function (this gives
$s_3 \succsim^N s_4$) and $U(s_3) > U(s_4)$ for at least one value function (this gives
not($s_4 \succsim^N s_3$)).
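One way to see why the single statement that $s_1$ is preferred to $s_2$ already forces $s_3$ to be necessarily at least as good as $s_4$ is the following short derivation from the data in Table 4.1. It is added here for illustration and assumes, as in the constraints of Section 4.4.3, that $U$ is additive with non-decreasing marginal value functions $u_i$. Within each pair the evaluations on the third and fourth criteria coincide, so those terms cancel.

```latex
\begin{align*}
U(s_3) - U(s_4) &= [u_1(15.75) - u_1(12)] + [u_2(164.375) - u_2(181.75)] + [u_5(5.25) - u_5(4)], \\
U(s_1) - U(s_2) &= [u_1(14.5) - u_1(13.25)] + [u_2(147) - u_2(199.125)] + [u_5(5.25) - u_5(4)], \\[4pt]
u_1(15.75) - u_1(12) &\ge u_1(14.5) - u_1(13.25)
  \quad \text{since } 15.75 \ge 14.5 \text{ and } 12 \le 13.25, \\
u_2(164.375) - u_2(181.75) &\ge u_2(147) - u_2(199.125)
  \quad \text{since } 164.375 \ge 147 \text{ and } 181.75 \le 199.125, \\[4pt]
\Rightarrow \quad U(s_3) - U(s_4) &\ge U(s_1) - U(s_2) > 0 .
\end{align*}
```

The last inequality uses only the DM's statement, which requires $U(s_1) > U(s_2)$ for every compatible value function.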
Fig. 4.1. Necessary partial ranking at the first iteration
Analyzing this first result, the DM observes that the necessary ranking is
still very poor, which makes it difficult to discriminate among the solutions
in A. He/she reacts by stating that $s_4$ is preferred to $s_5$. Considering this
new piece of preference information, the necessary ranking is computed again
and shown in Fig. 4.2. At this second iteration, it should be observed that
the resulting necessary ranking has been enriched as compared to the first
iteration (bold arrows), narrowing the set of best choices, i.e., solutions to which
no other solution is preferred in the necessary ranking: $\{s_1, s_3, s_6,
s_8, s_{10}, s_{14}, s_{15}, s_{17}, s_{18}, s_{19}, s_{20}\}$.
Fig. 4.2. Necessary partial ranking at the second iteration
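The notion of "best choices" used above can be sketched as a simple filter over a strict necessary relation. The relation below is only a small fragment chosen for illustration (the full content of the figures is not reproduced here), and the function name is hypothetical.

```python
def best_choices(solutions, strictly_preferred):
    """Solutions to which no other solution is strictly preferred."""
    return [
        s for s in solutions
        if not any((t, s) in strictly_preferred for t in solutions if t != s)
    ]


# Illustrative fragment of a strict necessary relation: (t, s) means t is preferred to s.
solutions = ["s1", "s2", "s3", "s4", "s5"]
strictly_preferred = {("s1", "s2"), ("s3", "s4"), ("s4", "s5")}

print(best_choices(solutions, strictly_preferred))
```

Each new comparison supplied by the DM can only add pairs to the necessary relation, so this set can only shrink or stay the same from one iteration to the next.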
The DM believes that this necessary ranking is still insufficiently decisive
and adds a new comparison: $s_8$ is preferred to $s_{10}$. Once again, the necessary
ranking is computed and shown in Fig. 4.3.
At this stage, the set of possible best choices has been narrowed down to
a limited number of solutions, among which $s_{14}$ and $s_{17}$ are judged satisfactory
by the DM. In fact, these two solutions have a very good performance on the
first criterion without any dramatically poor evaluation on the other criteria.
Fig. 4.3. Necessary partial ranking at the third iteration
The current example stops at this step, but the DM could then decide to
provide further preference information to enrich the necessary ranking. He/she
could also compute new Pareto optimal solutions "close" to $s_{14}$ and $s_{17}$ to focus
investigations on this area. In this example we have shown that the proposed
interactive process supports the DM in choosing the most satisfactory solutions
without imposing any strong cognitive effort, as the only information required
is holistic preference information.
4.6 Conclusions and Further Research Directions

In this chapter, we introduced a new interactive procedure for multiobjective
optimization. It consists in an interactive exploration of a Pareto optimal set,
or its approximation, generated prior to the exploration using a MOO or EMO
algorithm. The procedure represents a constructive learning approach, because
on the one hand, the preference information provided by the DM contributes to
the construction of a preference model and, on the other hand, the use of
the preference model shapes the DM's preferences or, at least, makes the DM's
convictions evolve.
Contrary to many existing MCDA methods, the proposed procedure does
not require any excessive cognitive effort from the DM because the preference
information is of a holistic nature and, moreover, it can be partial. By
distinguishing necessary and possible consequences of using all value functions
compatible with the preference information, the procedure is also robust
compared to methods using a single "best-fit" value function. This is a feature of
utmost importance in MOO.
An almost immediate extension of the procedure could consist in admitting
preference information in the form of a sorting of selected Pareto optimal
solutions into some pre-defined and preference ordered classes. Providing such
information could be easier for some DMs than making pairwise comparisons.
Acknowledgements
The first and the third authors acknowledge the support from the Luso-French
PESSOA bilateral cooperation. The fourth author wishes to acknowledge
financial support from the Polish Ministry of Science and Higher Education.
All authors acknowledge, moreover, the support of the COST Action IC0602
"Algorithmic Decision Theory".
References
Bana e Costa, C.A., Vansnick, J.C.: MACBETH: An interactive path towards the
construction of cardinal value functions. International Transactions in Operational
Research 1(4), 387–500 (1994)
Bana e Costa, C.A., De Corte, J.M., Vansnick, J.C.: On the mathematical foundation
of MACBETH. In: Figueira, J., Greco, S., Ehrgott, M. (eds.) Multiple Criteria
Decision Analysis: State of the Art Surveys, pp. 409–443. Springer Science +
Business Media Inc., New York (2005)
Chakhar, S., Mousseau, V.: GNU-UTA: a GNU implementation of UTA methods
(2007), http://www.lamsade.dauphine.fr/~mousseau/GNU-UTA
Consortium, D.D.: Decision Deck: an open-source software platform for MCDA
methods (2006-2008), www.decision-deck.org
Figueira, J., Greco, S., Słowiński, R.: Building a set of additive value functions
representing a reference preorder and intensities of preference: GRIP method.
European Journal of Operational Research, to appear
Greco, S., Matarazzo, B., Słowiński, R.: The use of rough sets and fuzzy sets in
MCDM. In: Gal, T., Hanne, T., Stewart, T. (eds.) Multicriteria Decision Making:
Advances in MCDM Models, Algorithms, Theory and Applications, pp. 114.
Kluwer Academic Publishers, Dordrecht (1999)
Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multicriteria decision
analysis. European Journal of Operational Research 129, 1–47 (2001)
Greco, S., Mousseau, V., Słowiński, R.: Assessing a partial preorder of alternatives
using ordinal regression and additive utility functions: A new UTA method. In:
58th Meeting of the EURO Working Group on MCDA, Moscow (2003)
Greco, S., Matarazzo, B., Słowiński, R.: Decision rule approach. In: Figueira, J.,
Greco, S., Ehrgott, M. (eds.) Multiple Criteria Decision Analysis: State of the
Art Surveys, pp. 507–562. Springer Science + Business Media Inc., New York
(2005)
Greco, S., Mousseau, V., Słowiński, R.: Ordinal regression revisited: Multiple criteria
ranking with a set of additive value functions. European Journal of Operational
Research 191(2), 416–436 (2008)
Jacquet-Lagrèze, E., Siskos, Y.: Assessing a set of additive utility functions for
multicriteria decision making: The UTA method. European Journal of Operational
Research 10(2), 151–164 (1982)
Jacquet-Lagrèze, E., Meziani, R., Słowiński, R.: MOLP with an interactive
assessment of a piecewise linear utility function. European Journal of Operational
Research 31, 350–357 (1987)
Kiss, L., Martel, J., Nadeau, R.: ELECCALC - an interactive software for
modelling the decision maker's preferences. Decision Support Systems 12(4-5),
757–777 (1994)
March, J.: Bounded rationality, ambiguity and the engineering of choice. Bell Journal
of Economics 9, 587–608 (1978)
Marichal, J., Roubens, M.: Determination of weights of interacting criteria from a
reference set. European Journal of Operational Research 124(3), 641–650 (2000)
Mousseau, V., Słowiński, R.: Inferring an ELECTRE TRI model from assignment
examples. Journal of Global Optimization 12(2), 157–174 (1998)
Pekelman, D., Sen, S.: Mathematical programming models for the determination of
attribute weights. Management Science 20(8), 1217–1229 (1974)
Roy, B., Bouyssou, D.: Aide Multicritère à la Décision: Méthodes et Cas. Economica,
Paris (1993)
Saaty, T.: The Analytic Hierarchy Process. McGraw Hill, New York (1980)
Saaty, T.: The analytic hierarchy and analytic network processes for the
measurement of intangible criteria and for decision-making. In: Figueira, J., Greco, S.,
Ehrgott, M. (eds.) Multiple Criteria Decision Analysis: The State of the Art
Surveys, pp. 345–407. Springer Science+Business Media, Inc., New York (2005)
Siskos, Y., Grigoroudis, V., Matsatsinis, N.: UTA methods. In: Figueira, F., Greco,
S., Ehrgott, M. (eds.) Multiple Criteria Decision Analysis: State of the Art
Surveys, pp. 297–343. Springer Science + Business Media Inc., New York (2005)
Słowiński, R., Greco, S., Matarazzo, B.: Rough set based decision support. In: Burke,
E., Kendall, G. (eds.) Introductory Tutorials on Optimization, Search and
Decision Support Methodologies, pp. 475–527. Springer Science + Business Media
Inc., New York (2005)
Srinivasan, V., Shocker, A.: Estimating the weights for multiple attributes in a
composite criterion using pairwise judgments. Psychometrika 38(4), 473–493 (1973)
5
Dominance-Based Rough Set Approach to
Interactive Multiobjective Optimization
Salvatore Greco (1), Benedetto Matarazzo (1), and Roman Słowiński (2,3)

(1) Faculty of Economics, University of Catania, Corso Italia, 55,
95129 Catania, Italy, salgreco@unict.it, matarazz@unict.it
(2) Institute of Computing Science, Poznań University of Technology,
60-965 Poznań, Poland, roman.slowinski@cs.put.poznan.pl
(3) Systems Research Institute, Polish Academy of Sciences,
01-447 Warsaw, Poland
Abstract. In this chapter, we present a new method for interactive multiobjective
optimization, which is based on application of a logical preference model built using
the Dominance-based Rough Set Approach (DRSA). The method is composed of two
main stages that alternate in an interactive procedure. In the first stage, a sample
of solutions from the Pareto optimal set (or from its approximation) is generated.
In the second stage, the Decision Maker (DM) indicates relatively good solutions in
the generated sample. From this information, a preference model expressed in terms
of "if ..., then ..." decision rules is induced using DRSA. These rules define some
new constraints which can be added to the original constraints of the problem, cutting
off non-interesting solutions from the currently considered Pareto optimal set. A
new sample of solutions is generated in the next iteration from the reduced Pareto
optimal set. The interaction continues until the DM finds a satisfactory solution in
the generated sample. This procedure permits a progressive exploration of the Pareto
optimal set in zones which are interesting from the point of view of the DM's preferences.
The "driving model" of this exploration is a set of user-friendly decision rules, such
as "if the value of objective $i_1$ is not smaller than $\alpha_{i_1}$ and the value of objective $i_2$ is
not smaller than $\alpha_{i_2}$, then the solution is good". The sampling of the reduced Pareto
optimal set becomes finer with the advancement of the procedure and, moreover, a
return to previously abandoned zones is possible. Another feature of the method is
the possibility of learning about relationships between values of objective functions
in the currently considered zone of the Pareto optimal set. These relationships are
expressed by DRSA association rules, such as "if objective $j_1$ is not greater than $\alpha_{j_1}$
and objective $j_2$ is not greater than $\alpha_{j_2}$, then objective $j_3$ is not smaller than $\beta_{j_3}$
and objective $j_4$ is not smaller than $\beta_{j_4}$".
Reviewed by: José Rui Figueira, Technical University of Lisbon, Portugal
Hisao Ishibuchi, Osaka Prefecture University, Japan
Kaisa Miettinen, University of Jyväskylä, Finland
5.1 Introduction
We propose a new method for interactive multiobjective optimization that
permits using a preference model expressed in terms of easily understandable
"if ..., then ..." decision rules, induced from information about preferences of
the Decision Maker (DM) expressed in terms of a simple indication of relatively
good solutions in a given sample. The method we propose complements
well any multiobjective optimization method (see Chapter 1) which finds the
Pareto optimal set or its approximation, such as Evolutionary Multiobjective
Optimization methods (see Chapter 3).
Interactive multiobjective optimization (for a systematic introduction see
Chapter 2) consists of organizing the search for the most preferred solution by
alternating stages of calculation and dialogue with the DM. In many interactive
methods, the first stage of calculation provides a first sample of candidate
solutions from the Pareto optimal set or from its approximation. This sample
is presented to the DM. In the dialogue stage, the DM reacts to this proposal
by supplying additional information revealing his/her preferences. This
information is taken into account in the search for a new sample of candidate
solutions in the next calculation stage, so as to provide solutions which better
fit the DM's preferences. The search stops when, among the candidate solutions,
the DM finds one which yields a satisfactory compromise between objective
function values, or when the DM comes to the conclusion that there is no
such solution in the current problem setting. The convergence of interactive
procedures is of a psychological rather than mathematical nature. Information
supplied by the DM in the dialogue stage is critical information about the
presented candidate solutions; as such, it is preference information, which is an
indispensable component of each method supporting Multiple Criteria Decision
Making (MCDM) (for an updated collection of state of the art surveys
see (Figueira et al., 2005b)). Let us remark that, from a semantic point of
view, criterion and objective function mean the same, thus we will use them
interchangeably. The concept of criterion is more handy in the context of
evaluation of a finite set of solutions, and the concept of objective function better
fits the context of optimization.
Preference information permits building a preference model of the DM.
The preference model induces a preference structure in the set of candidate
solutions (objects, alternatives, actions); a proper exploitation of this structure
leads to a recommendation consistent with the preferences of the DM.
The recommendation may concern one of the following three main problems
of multiple criteria decision:
- sorting of candidate solutions into pre-defined and preference ordered
  decision classes (also called ordinal classification),
- choice of the most preferred solution(s),
- ranking of the solutions from the most to the least preferred.
The interactive method proposed hereafter combines two multiple criteria
decision problem settings. In the dialogue stage, it requires the DM to sort a
sample of solutions into two classes: "good" and "others". Finally, it gives a
recommendation for the choice. A similar combination of sorting and choice
has been used by Jaszkiewicz and Ferhat (1999). Note that most interactive
optimization methods require the DM to select one (feasible or infeasible)
solution as a reference point (see Chapter 2). Moreover, there exist interactive
multiobjective optimization methods requiring preference information in
terms of ranking of a set of reference solutions (see, e.g., (Jacquet-Lagrèze
et al., 1987) and Chapter 4).
Experience indicates that decision support methods requiring a lot of
cognitive effort from the DM in the dialogue stage fail to be accepted.
Preference information may be either direct or indirect, depending on whether
it specifies directly values of some parameters used in the preference model
(e.g. trade-off weights, aspiration levels, discrimination thresholds, etc.),
or some examples of holistic judgments, called exemplary decisions, from which
compatible values of the preference model parameters are induced.
Direct preference information is used in the traditional paradigm, according
to which the preference model is first constructed and then applied on the
set of candidate solutions.
Indirect preference information is used in the regression paradigm, according
to which the holistic preferences on a subset of candidate solutions are
known first, and then a consistent preference model is inferred from this
information to be applied on the whole set of candidate solutions.
Presently, MCDM methods based on indirect preference information and
the regression paradigm are of increasing interest because they require
relatively less cognitive effort from the DM. Preference information given in
terms of exemplary decisions is very natural and, for this reason, reliable.
Indeed, the regression paradigm emphasizes the discovery of intentions as an
interpretation of actions rather than as a prior position, which was called by
March the posterior rationality (March, 1978). It is also consistent with the
inductive learning used in artificial intelligence approaches (Michalski et
al., 1998). Typical applications of this paradigm in MCDM are presented in
(Greco et al., 1999b, 2008; Jacquet-Lagrèze et al., 1987; Jacquet-Lagrèze and
Siskos, 1982; Mousseau and Słowiński, 1998).
The form of exemplary decisions which constitute preference information
depends on the multiple criteria problem setting. In multiple criteria sorting,
an exemplary decision is an assignment of a selected solution to one of the
decision classes, because sorting is based on absolute evaluation of solutions.
In multiple criteria choice and ranking, however, an exemplary decision is
a pairwise comparison of solutions, because choice and ranking are based on
relative evaluations of solutions. While it is relatively easy to acquire a set
of exemplary decisions, they are rarely logically consistent. By inconsistent
exemplary decisions we mean decisions which do not respect the dominance
principle (also called Pareto principle). In multiple criteria sorting, decision
examples concerning solutions x and y are inconsistent if one of the following
two situations occurs:
α) x and y have the same evaluations on all criteria (x and y are indiscernible),
   however, they have been assigned to different decision classes,
β) x has not worse evaluations on all criteria than y (x dominates y), however,
   x has been assigned to a worse decision class than y.
In multiple criteria choice and ranking, decision examples concerning pairs of
solutions (x, y) and (w, z) are inconsistent if one of the following two
situations occurs:
α′) differences of evaluations of x and y are the same as differences of
   evaluations of w and z on all criteria ((x, y) and (w, z) are indiscernible),
   however, x has been compared to y differently than w to z,
β′) differences of evaluations of x and y are not smaller than differences of
   evaluations of w and z on all criteria ((x, y) dominates (w, z)), however, x
   has been compared to y as being less preferred than w is preferred to z,
   i.e. even if with respect to all considered criteria the strength of
   preference of x over y is not smaller than the strength of preference of w
   over z, the overall strength of preference of x over y is smaller than the
   overall strength of preference of w over z.
The dominance principle is the only objective principle that is widely agreed
upon in multiple criteria decision analysis. Inconsistency of exemplary
decisions may come from many sources. Examples include:
• incomplete set of criteria,
• limited clear discrimination between criteria,
• unstable preferences of decision makers.
Inconsistencies cannot be considered as error or noise to be simply eliminated
from the preference information or amalgamated with the consistent part
of this information by some averaging operators. Indeed, they can convey
important information that should be taken into account in the construction
of the DM's preference model.
The rough set concept proposed by Pawlak (1982, 1991) is intended to deal
with inconsistency in information and this is a major argument to support its
application to multiple criteria decision analysis.
Since its conception, rough set theory has proved to be an excellent math-
ematical tool for the analysis of inconsistent information. The rough set phi-
losophy is based on the assumption that with every object (e.g. a solution
of multiobjective optimization problem) of a universe U there is associated a
certain amount of information (data, knowledge). This information can be ex-
pressed by means of a number of attributes. The attribute values describe the
objects. Objects which have the same description are said to be indiscernible
(similar) with respect to the available information. The indiscernibility
relation thus generated constitutes the mathematical basis of rough set theory.
It induces a partition of the universe into blocks of indiscernible objects,
called elementary sets or granules, which can be used to build concepts (e.g.
classes of acceptable, doubtful or non-acceptable solutions).
Any subset X of the universe may be expressed in terms of these granules
either precisely (as a union of granules) or approximately. In the latter
case, the subset X may be characterized by two ordinary sets, called the
lower and upper approximations. A rough set is defined by means of these
two approximations. The lower approximation of X is composed of all the
granules included in X (whose elements, therefore, certainly belong to X),
while the upper approximation of X consists of all the granules which have
a non-empty intersection with X (whose elements, therefore, may belong to
X). The difference between the upper and lower approximation constitutes
the boundary region of the rough set, whose elements cannot be characterized
with certainty as belonging or not to X (by using the available information).
The information about objects from the boundary region is, therefore,
inconsistent. The cardinality of the boundary region states, moreover, the
extent to which it is possible to express X in terms of certainty, on the basis
of the available information. In fact, these objects have the same description,
but are assigned to different classes, such as patients having the same
symptoms (the same description), but different pathologies (different classes).
For this reason, this cardinality may be used as a measure of inconsistency of
the information about X.
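The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `approximations`, the four patients and their symptom values are hypothetical and are not taken from the chapter. Objects with identical descriptions form a granule; granules contained in X make up the lower approximation, and granules meeting X make up the upper approximation.

```python
from collections import defaultdict

def approximations(descriptions, X):
    """Return (lower, upper, boundary) of the set X.

    descriptions maps each object to a hashable tuple of attribute values;
    objects with equal tuples are indiscernible and form one granule.
    """
    granules = defaultdict(set)
    for obj, desc in descriptions.items():
        granules[desc].add(obj)
    lower, upper = set(), set()
    for granule in granules.values():
        if granule <= X:   # granule entirely inside X: certain members
            lower |= granule
        if granule & X:    # granule overlapping X: possible members
            upper |= granule
    return lower, upper, upper - lower

# Hypothetical patients described by (fever, cough); X = patients with flu.
patients = {
    "p1": ("high", "yes"),
    "p2": ("high", "yes"),
    "p3": ("none", "no"),
    "p4": ("high", "no"),
}
flu = {"p1", "p4"}
lower, upper, boundary = approximations(patients, flu)
print(sorted(lower), sorted(upper), sorted(boundary))
```

Here p1 and p2 share the same symptoms but only p1 has flu, so both end up in the boundary region, whose size (2) measures the inconsistency of the information about X.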
Some important characteristics of the rough set approach make it a
particularly interesting tool in a variety of problems and concrete
applications. For example, it is possible to deal with both quantitative and
qualitative attributes (e.g., in case of diagnostics, the blood pressure and
the temperature are quantitative attributes, while the color of eyes or state
of consciousness are qualitative attributes) and inconsistencies need not be
removed before the analysis. As a result of the rough set approach, it is
possible to acquire a posteriori information regarding the relevance of
particular attributes and their subsets (Greco et al., 1999b, 2001b). Moreover,
given a partition of U into disjoint decision classes, the lower and upper
approximations of this partition give a structure to the available information
such that it is possible to induce from this information certain and possible
decision rules, which are logical "if..., then..." statements.
Several attempts have been made to employ rough set theory for decision
support (Pawlak and Słowiński, 1994; Słowiński, 1993). The classical rough
set approach is not able, however, to deal with preference ordered value sets
of attributes, as well as with the preference ordered decision classes. In
decision analysis, an attribute with a preference ordered value set (scale) is
called a criterion, and the multiattribute classification problem corresponds
to a multiple criteria sorting problem (also called ordinal classification).
At this point, let us return to the issue of inconsistent preference
information provided by the DM in the process of solving a multiple criteria
sorting problem. As can be seen from the above brief description of the
classical rough set approach, it is able to deal with inconsistency of type α)
and α′) (indiscernible solutions, or pairs of solutions, judged differently),
however, it is not able to recognize inconsistency of type β) and β′)
(judgments violating dominance).
In the late 1990s, adapting the classical rough set approach to analysis
of preference ordered information became a particularly challenging problem
within the field of multiple criteria decision analysis. Why might it be so
important? The answer is that the result of the rough set analysis in form of
certain decision rules, having a syntax "if for x some conditions hold, then x
certainly belongs to class Cl", and being induced from the lower approximation,
as well as possible decision rules, having a syntax "if for x some conditions
hold, then x possibly belongs to class Cl", and being induced from the
upper approximation, is very attractive for its ability to represent the DM's
preferences hidden in exemplary decisions. Such a preference model is very
convenient for decision support, because it is intelligible and it speaks the
same language as the DM.
An extension of the classical rough set approach, which enables the analysis
of preference ordered information, was proposed by Greco, Matarazzo
and Słowiński (Greco et al., 1999b,a; Słowiński et al., 2002b). This extension,
called the Dominance-based Rough Set Approach (DRSA), is mainly based
on the replacement of the indiscernibility relation by a dominance relation in
the rough approximation. This change makes it possible to recognize
inconsistency of both types, α) and α′) (based on indiscernibility) on one
hand, and β) and β′) (based on dominance) on the other hand, and
to build lower and upper approximations of decision classes, corresponding to
consistent only or all available information, respectively. An important
consequence of this fact is the possibility of inferring from these
approximations the preference model in terms of decision rules. Depending upon
whether they are induced from lower approximations or from the upper
approximations of decision classes, one gets certain or possible decision
rules, which represent certain or possible knowledge about the DM's
preferences. Such a preference model is more general than the classical
functional models considered within multiattribute utility theory (see Keeney
and Raiffa, 1976; Dyer, 2005) or the relational models considered, for example,
in outranking methods (for a general review of outranking methods see (Roy and
Bouyssou, 1993; Figueira et al., 2005a; Brans and Mareschal, 2005; Martel and
Matarazzo, 2005), while for their comparison with the rough set approach see
(Greco et al., 2002b, 2004a; Słowiński et al., 2002a)).
DRSA has been applied to all three types of multiple criteria decision prob-
lems, to decision with multiple decision makers, to decision under uncertainty,
and to hierarchical decision making. For comprehensive surveys of DRSA, see
(Greco et al., 1999b, 2001b, 2004c, 2005a,b; Słowiński et al., 2005).
The aim of this chapter is to extend further the range of applications of
DRSA to interactive multiobjective optimization. It is organized as follows.
In the next Section, we recall the main concepts of DRSA, which are also
illustrated with an example in Section 5.3. In Section 5.4, we introduce asso-
ciation rules expressing relationships between objective function values in the
currently considered zone of the Pareto optimal set. In Section 5.5, we present
the new interactive multiobjective optimization method based on DRSA. In
Section 5.6, we illustrate this method with an example, and, in Section 5.7,
we discuss its characteristic features. The final section contains conclusions.
5.2 Dominance-Based Rough Set Approach (DRSA)

DRSA is a methodology of multiple criteria decision analysis aiming at
obtaining a representation of the DM's preferences in terms of easily
understandable "if ..., then ..." decision rules, on the basis of some
exemplary decisions (past decisions or simulated decisions) given by the DM. In
this Section, we present DRSA as applied to sorting problems, because in the
dialogue stage of our interactive method this multiple criteria decision
problem is considered. In this case, exemplary decisions are sorting examples,
i.e. objects (solutions, alternatives, actions) described by a set of criteria
and assigned to preference ordered classes. The criteria and the class
assignment considered within DRSA correspond to the condition attributes and
the decision attribute, respectively, in the classical Rough Set Approach
(Pawlak, 1991). For example, in multiple criteria sorting of cars, an example
of decision is an assignment of a particular car evaluated on such criteria as
maximum speed, acceleration, price and fuel consumption to one of three classes
of overall quality: "bad", "medium", "good".
Let us consider a set of criteria $F = \{f_1, \ldots, f_n\}$, the set of their
indices $I = \{1, \ldots, n\}$, and a finite universe of objects (solutions,
alternatives, actions) $U$, such that $f_i : U \to \mathbb{R}$ for each
$i = 1, \ldots, n$. To explain DRSA, it will be convenient to assume that
$f_i(\cdot)$, $i = 1, \ldots, n$, are gain-type functions. Thus, without loss
of generality, for all objects $x, y \in U$, $f_i(x) \ge f_i(y)$ means that
"x is at least as good as y with respect to criterion i", which is denoted as
$x \succeq_i y$. We suppose that $\succeq_i$ is a complete preorder, i.e. a
strongly complete and transitive binary relation, defined on $U$ on the basis
of evaluations $f_i(\cdot)$. Note that in the context of multiobjective
optimization, $f_i(\cdot)$ corresponds to objective functions. Furthermore, we
assume that there is a decision attribute $d$ which makes a partition of $U$
into a finite number of decision classes called sorting,
$\mathbf{Cl} = \{Cl_1, \ldots, Cl_m\}$, such that each $x \in U$ belongs to one
and only one class $Cl_t$, $t = 1, \ldots, m$. We suppose that the classes are
preference ordered, i.e. for all $r, s = 1, \ldots, m$, such that $r > s$, the
objects from $Cl_r$ are preferred to the objects from $Cl_s$. More formally, if
$\succeq$ is a comprehensive weak preference relation on $U$, i.e. if for all
$x, y \in U$, $x \succeq y$ reads "x is at least as good as y", then we suppose
$$[x \in Cl_r,\ y \in Cl_s,\ r > s] \Rightarrow x \succ y,$$
where $x \succ y$ means $x \succeq y$ and not $y \succeq x$. The above
assumptions are typical for consideration of a multicriteria sorting problem.
In DRSA, the explanation of the assignment of objects to preference ordered
decision classes is made on the basis of their evaluation with respect to
a subset of criteria $P \subseteq I$. This explanation is called approximation
of decision classes with respect to $P$. Indeed, in order to take into account
the order of decision classes, in DRSA the classes are not considered one by
one but, instead, unions of classes are approximated: upward union from class
$Cl_t$ to class $Cl_m$, denoted by $Cl_t^{\ge}$, and downward union from class
$Cl_t$ to class $Cl_1$, denoted by $Cl_t^{\le}$, i.e.:
$$Cl_t^{\ge} = \bigcup_{s \ge t} Cl_s, \qquad Cl_t^{\le} = \bigcup_{s \le t} Cl_s, \qquad t = 1, \ldots, m.$$
The statement $x \in Cl_t^{\ge}$ reads "x belongs to at least class $Cl_t$",
while $x \in Cl_t^{\le}$ reads "x belongs to at most class $Cl_t$". Let us
remark that $Cl_1^{\ge} = Cl_m^{\le} = U$, $Cl_m^{\ge} = Cl_m$ and
$Cl_1^{\le} = Cl_1$. Furthermore, for $t = 2, \ldots, m$, we have:
$$Cl_{t-1}^{\le} = U - Cl_t^{\ge} \quad \text{and} \quad Cl_t^{\ge} = U - Cl_{t-1}^{\le}.$$
In the above example concerning multiple criteria sorting of cars, the upward
unions are: $Cl_{medium}^{\ge}$, that is the set of all the cars classified at
least "medium" (i.e. the set of cars classified "medium" or "good"), and
$Cl_{good}^{\ge}$, that is the set of all the cars classified at least "good"
(i.e. the set of cars classified "good"), while the downward unions are:
$Cl_{medium}^{\le}$, that is the set of all the cars classified at most
"medium" (i.e. the set of cars classified "medium" or "bad"), and
$Cl_{bad}^{\le}$, that is the set of all the cars classified at most "bad"
(i.e. the set of cars classified "bad"). Notice that, formally, also
$Cl_{bad}^{\ge}$ is an upward union as well as $Cl_{good}^{\le}$ is a downward
union, however, as "bad" and "good" are extreme classes, the two unions boil
down to the whole universe $U$.
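As a minimal sketch of how the upward and downward unions can be computed, the following Python fragment builds them for four hypothetical cars. The car names, their class assignments and the helper `unions` are assumptions made only for this illustration; the three class labels are those of the cars example.

```python
def unions(assignment, classes):
    """Build upward and downward unions of preference ordered classes.

    assignment maps each object to its class; classes lists the class
    labels from the worst to the best.
    """
    rank = {c: k for k, c in enumerate(classes)}
    up = {c: {x for x, cx in assignment.items() if rank[cx] >= rank[c]}
          for c in classes}
    down = {c: {x for x, cx in assignment.items() if rank[cx] <= rank[c]}
            for c in classes}
    return up, down

# Hypothetical sorting of four cars into the three classes of the example.
cars = {"car1": "bad", "car2": "medium", "car3": "good", "car4": "medium"}
up, down = unions(cars, ["bad", "medium", "good"])

print(sorted(up["medium"]))    # cars classified at least "medium"
print(sorted(down["medium"]))  # cars classified at most "medium"
```

The extreme unions `up["bad"]` and `down["good"]` both coincide with the whole set of cars, and `down["bad"]` is the complement of `up["medium"]`, as stated by the relations above.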
The key idea of the rough set approach is explanation (approximation)
of knowledge generated by the decision attributes, by granules of knowledge
generated by condition attributes.
In DRSA, where condition attributes are criteria and decision classes are
preference ordered, the knowledge to be explained is the assignments of objects
to upward and downward unions of classes and the granules of knowledge are
sets of objects in dominance cones in the criteria values space.
We say that x dominates y with respect to $P \subseteq I$ (shortly, x
P-dominates y), denoted by $x D_P y$, if for every criterion $i \in P$,
$f_i(x) \ge f_i(y)$. The relation of P-dominance is reflexive and transitive,
that is, it is a partial preorder.
Given a set of criteria $P \subseteq I$ and $x \in U$, the granules of
knowledge used for approximation in DRSA are:
• a set of objects dominating x, called P-dominating set,
  $D_P^+(x) = \{y \in U : y D_P x\}$,
• a set of objects dominated by x, called P-dominated set,
  $D_P^-(x) = \{y \in U : x D_P y\}$.
Let us observe that we can write the P-dominating sets and the P-dominated
sets as follows. For $x \in U$ and $P \subseteq I$ we define the P-positive
dominance cone of x, $P^{\ge}(x)$, and the P-negative dominance cone of x,
$P^{\le}(x)$, as follows:
$$P^{\ge}(x) = \{z = (z_1, \ldots, z_{|P|}) \in \mathbb{R}^{|P|} : z_j \ge f_{i_j}(x), \text{ for all } i_j \in P\},$$
$$P^{\le}(x) = \{z = (z_1, \ldots, z_{|P|}) \in \mathbb{R}^{|P|} : z_j \le f_{i_j}(x), \text{ for all } i_j \in P\}.$$
Thus, the P-dominating set and the P-dominated set of x can be formulated
as
$$D_P^+(x) = \{y \in U : (f_i(y), i \in P) \in P^{\ge}(x)\},$$
$$D_P^-(x) = \{y \in U : (f_i(y), i \in P) \in P^{\le}(x)\}.$$
Let us remark that we can also write:
$$D_P^+(x) = \{y \in U : P^{\ge}(y) \subseteq P^{\ge}(x)\},$$
$$D_P^-(x) = \{y \in U : P^{\le}(y) \subseteq P^{\le}(x)\}.$$

For the sake of simplicity, with an abuse of definition, in the following we
shall speak of dominance cones also when we refer to the dominating and
dominated sets.
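The P-dominating and P-dominated sets can be sketched in Python as follows. The evaluations are those of the eight students of Table 5.1 (Section 5.3); the numeric coding bad = 0, medium = 1, good = 2 and the function names are assumptions introduced here for illustration.

```python
SCALE = {"bad": 0, "medium": 1, "good": 2}  # assumed numeric coding

# Evaluations on (Mathematics, Physics, Literature), as listed in Table 5.1.
students = {
    "S1": ("good", "medium", "bad"),
    "S2": ("medium", "medium", "bad"),
    "S3": ("medium", "medium", "medium"),
    "S4": ("good", "good", "medium"),
    "S5": ("good", "medium", "good"),
    "S6": ("good", "good", "good"),
    "S7": ("bad", "bad", "bad"),
    "S8": ("bad", "bad", "medium"),
}

def dominates(x, y, P):
    """True if x is at least as good as y on every criterion index in P."""
    return all(SCALE[students[x][i]] >= SCALE[students[y][i]] for i in P)

def dominating_set(x, P):
    """D_P^+(x): the objects that P-dominate x."""
    return {y for y in students if dominates(y, x, P)}

def dominated_set(x, P):
    """D_P^-(x): the objects that are P-dominated by x."""
    return {y for y in students if dominates(x, y, P)}

all_criteria = (0, 1, 2)
print(sorted(dominating_set("S1", all_criteria)))  # ['S1', 'S4', 'S5', 'S6']
print(sorted(dominated_set("S1", all_criteria)))   # ['S1', 'S2', 'S7']
```

Since P-dominance is reflexive, every object belongs to both of its own sets.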
Let us recall that the dominance principle requires that an object x dom-
inating object y with respect to considered criteria (i.e. x having evaluations
at least as good as y on all considered criteria) should also dominate y on the
decision (i.e. x should be assigned to at least as good decision class as y).
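The P-lower approximation of $Cl_t^{\ge}$, denoted by
$\underline{P}(Cl_t^{\ge})$, and the P-upper approximation of $Cl_t^{\ge}$,
denoted by $\overline{P}(Cl_t^{\ge})$, are defined as follows
($t = 1, \ldots, m$):
$$\underline{P}(Cl_t^{\ge}) = \{x \in U : D_P^+(x) \subseteq Cl_t^{\ge}\},$$
$$\overline{P}(Cl_t^{\ge}) = \{x \in U : D_P^-(x) \cap Cl_t^{\ge} \neq \emptyset\}.$$
Analogously, one can define the P-lower approximation and the P-upper
approximation of $Cl_t^{\le}$ as follows ($t = 1, \ldots, m$):
$$\underline{P}(Cl_t^{\le}) = \{x \in U : D_P^-(x) \subseteq Cl_t^{\le}\},$$
$$\overline{P}(Cl_t^{\le}) = \{x \in U : D_P^+(x) \cap Cl_t^{\le} \neq \emptyset\}.$$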
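The four definitions above translate almost literally into set comprehensions. The sketch below applies them to the sorting examples of Table 5.1 (Section 5.3); the numeric coding of the scale and all function names are assumptions made for this illustration, and the resulting sets were computed here from the definitions rather than quoted from the text.

```python
SCALE = {"bad": 0, "medium": 1, "good": 2}  # assumed numeric coding

# (Mathematics, Physics, Literature, Overall Evaluation) from Table 5.1.
students = {
    "S1": ("good", "medium", "bad", "bad"),
    "S2": ("medium", "medium", "bad", "medium"),
    "S3": ("medium", "medium", "medium", "medium"),
    "S4": ("good", "good", "medium", "good"),
    "S5": ("good", "medium", "good", "good"),
    "S6": ("good", "good", "good", "good"),
    "S7": ("bad", "bad", "bad", "bad"),
    "S8": ("bad", "bad", "medium", "bad"),
}
U = set(students)

def value(x, i):
    return SCALE[students[x][i]]

def d_plus(x, P):
    """P-dominating set of x."""
    return {y for y in U if all(value(y, i) >= value(x, i) for i in P)}

def d_minus(x, P):
    """P-dominated set of x."""
    return {y for y in U if all(value(x, i) >= value(y, i) for i in P)}

def upward_approximations(t, P):
    """(lower, upper) approximation of the upward union of class t."""
    union = {x for x in U if value(x, 3) >= SCALE[t]}
    lower = {x for x in U if d_plus(x, P) <= union}
    upper = {x for x in U if d_minus(x, P) & union}
    return lower, upper

def downward_approximations(t, P):
    """(lower, upper) approximation of the downward union of class t."""
    union = {x for x in U if value(x, 3) <= SCALE[t]}
    lower = {x for x in U if d_minus(x, P) <= union}
    upper = {x for x in U if d_plus(x, P) & union}
    return lower, upper

P = (0, 1, 2)  # all three criteria
lower_up, upper_up = upward_approximations("medium", P)
lower_down, upper_down = downward_approximations("bad", P)
print(sorted(lower_up), sorted(upper_up))
print(sorted(lower_down), sorted(upper_down))
```

In both cases the upper approximation exceeds the lower one by exactly S1 and S2, the two students whose evaluations contradict the dominance principle.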
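The P-lower and P-upper approximations so defined satisfy the following
inclusion properties, for each $t \in \{1, \ldots, m\}$ and for all
$P \subseteq I$:
$$\underline{P}(Cl_t^{\ge}) \subseteq Cl_t^{\ge} \subseteq \overline{P}(Cl_t^{\ge}), \qquad \underline{P}(Cl_t^{\le}) \subseteq Cl_t^{\le} \subseteq \overline{P}(Cl_t^{\le}).$$
The P-lower and P-upper approximations of $Cl_t^{\ge}$ and $Cl_t^{\le}$ have
an important complementarity property, according to which,
$$\underline{P}(Cl_t^{\ge}) = U - \overline{P}(Cl_{t-1}^{\le}) \ \text{ and } \ \overline{P}(Cl_t^{\ge}) = U - \underline{P}(Cl_{t-1}^{\le}), \quad t = 2, \ldots, m,$$
$$\underline{P}(Cl_t^{\le}) = U - \overline{P}(Cl_{t+1}^{\ge}) \ \text{ and } \ \overline{P}(Cl_t^{\le}) = U - \underline{P}(Cl_{t+1}^{\ge}), \quad t = 1, \ldots, m-1.$$
The P-boundaries of $Cl_t^{\ge}$ and $Cl_t^{\le}$, denoted by
$Bn_P(Cl_t^{\ge})$ and $Bn_P(Cl_t^{\le})$, respectively, are defined as follows
($t = 1, \ldots, m$):
$$Bn_P(Cl_t^{\ge}) = \overline{P}(Cl_t^{\ge}) - \underline{P}(Cl_t^{\ge}), \qquad Bn_P(Cl_t^{\le}) = \overline{P}(Cl_t^{\le}) - \underline{P}(Cl_t^{\le}).$$
Due to the above complementarity property,
$Bn_P(Cl_t^{\ge}) = Bn_P(Cl_{t-1}^{\le})$, for $t = 2, \ldots, m$.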
For every $P \subseteq I$, the quality of approximation of sorting
$\mathbf{Cl}$ by a set of criteria $P$ is defined as the ratio of the number of
objects P-consistent with the dominance principle and the number of all the
objects in $U$. Since the P-consistent objects are those which do not belong to
any P-boundary $Bn_P(Cl_t^{\ge})$ or $Bn_P(Cl_t^{\le})$, $t = 1, \ldots, m$,
the quality of approximation of sorting $\mathbf{Cl}$ by a set of criteria $P$
can be written as
$$\gamma_P(\mathbf{Cl}) = \frac{\left| U - \left( \bigcup_{t \in \{1,\ldots,m\}} Bn_P(Cl_t^{\ge}) \cup \bigcup_{t \in \{1,\ldots,m\}} Bn_P(Cl_t^{\le}) \right) \right|}{|U|} = \frac{\left| U - \bigcup_{t \in \{1,\ldots,m\}} Bn_P(Cl_t^{\ge}) \right|}{|U|} = \frac{\left| U - \bigcup_{t \in \{1,\ldots,m\}} Bn_P(Cl_t^{\le}) \right|}{|U|}.$$
$\gamma_P(\mathbf{Cl})$ can be seen as a degree of consistency of the sorting
examples, where $P$ is the set of criteria and $\mathbf{Cl}$ is the considered
sorting.
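As a small worked instance of this formula, consider the eight sorting examples of Table 5.1 (Section 5.3) with $P = I$, i.e. all three criteria. The approximations and boundaries below were derived here directly from the definitions, so they should be read as an illustrative calculation rather than as figures quoted from the text.

```latex
\begin{align*}
Bn_I(Cl_{medium}^{\ge}) &= \overline{I}(Cl_{medium}^{\ge}) - \underline{I}(Cl_{medium}^{\ge})
  = \{S1, \ldots, S6\} - \{S3, S4, S5, S6\} = \{S1, S2\}, \\
Bn_I(Cl_{good}^{\ge}) &= \{S4, S5, S6\} - \{S4, S5, S6\} = \emptyset, \\
\gamma_I(\mathbf{Cl}) &= \frac{|U - \{S1, S2\}|}{|U|} = \frac{6}{8} = 0.75 .
\end{align*}
```

Only S1 and S2 violate the dominance principle, so six of the eight sorting examples are consistent.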
Each minimal (in the sense of inclusion) subset $P \subseteq I$ such that
$\gamma_P(\mathbf{Cl}) = \gamma_I(\mathbf{Cl})$ is called a reduct of sorting
$\mathbf{Cl}$, and is denoted by $RED_{\mathbf{Cl}}$. Let us remark that for a
given set of sorting examples one can have more than one reduct. The
intersection of all reducts is called the core, and is denoted by
$CORE_{\mathbf{Cl}}$. Criteria in $CORE_{\mathbf{Cl}}$ cannot be removed from
consideration without deteriorating the quality of approximation of sorting
$\mathbf{Cl}$. This means that, in set $I$, there are three categories of
criteria:
• indispensable criteria included in the core,
• exchangeable criteria included in some reducts, but not in the core,
• redundant criteria, neither indispensable nor exchangeable, and thus not
  included in any reduct.
The dominance-based rough approximations of upward and downward unions
of decision classes can serve to induce a generalized description of sorting
decisions in terms of "if . . . , then . . . " decision rules. For a given
upward or downward union of classes, $Cl_t^{\ge}$ or $Cl_s^{\le}$, the decision
rules induced under a hypothesis that objects belonging to
$\underline{P}(Cl_t^{\ge})$ or $\underline{P}(Cl_s^{\le})$ are positive
examples (that is, objects that have to be matched by the induced decision
rules), and all the others are negative (that is, objects that must not be
matched by the induced decision rules), suggest a certain assignment to "class
$Cl_t$ or better", or to "class $Cl_s$ or worse", respectively. On the other
hand, the decision rules induced under a hypothesis that objects belonging to
$\overline{P}(Cl_t^{\ge})$ or $\overline{P}(Cl_s^{\le})$ are positive examples,
and all the others are negative, suggest a possible assignment to "class $Cl_t$
or better", or to "class $Cl_s$ or worse", respectively. Finally, the decision
rules induced under a hypothesis that objects belonging to the intersection
$\overline{P}(Cl_s^{\le}) \cap \overline{P}(Cl_t^{\ge})$ are positive examples,
and all the others are negative, suggest an assignment to some classes between
$Cl_s$ and $Cl_t$ ($s < t$). These rules match inconsistent objects
$x \in U$, which cannot be assigned without doubts to classes $Cl_r$,
$s \le r \le t$, with $s < t$, because $x \notin \underline{P}(Cl_r^{\ge})$ and
$x \notin \underline{P}(Cl_r^{\le})$ for all $r$ such that $s \le r \le t$,
with $s < t$.
Given the preference information in terms of sorting examples, it is
meaningful to consider the following five types of decision rules:
1) certain $D_{\ge}$-decision rules, providing lower profiles (i.e. sets of
   minimal values for considered criteria) of objects belonging to
   $\underline{P}(Cl_t^{\ge})$, $P = \{i_1, \ldots, i_p\} \subseteq I$:
   if $f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_p}(x) \ge r_{i_p}$, then $x \in Cl_t^{\ge}$,
   $t = 2, \ldots, m$, $r_{i_1}, \ldots, r_{i_p} \in \mathbb{R}$;
2) possible $D_{\ge}$-decision rules, providing lower profiles of objects
   belonging to $\overline{P}(Cl_t^{\ge})$, $P = \{i_1, \ldots, i_p\} \subseteq I$:
   if $f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_p}(x) \ge r_{i_p}$, then x possibly belongs to $Cl_t^{\ge}$,
   $t = 2, \ldots, m$, $r_{i_1}, \ldots, r_{i_p} \in \mathbb{R}$;
3) certain $D_{\le}$-decision rules, providing upper profiles (i.e. sets of
   maximal values for considered criteria) of objects belonging to
   $\underline{P}(Cl_t^{\le})$, $P = \{i_1, \ldots, i_p\} \subseteq I$:
   if $f_{i_1}(x) \le r_{i_1}$ and . . . and $f_{i_p}(x) \le r_{i_p}$, then $x \in Cl_t^{\le}$,
   $t = 1, \ldots, m-1$, $r_{i_1}, \ldots, r_{i_p} \in \mathbb{R}$;
4) possible $D_{\le}$-decision rules, providing upper profiles of objects
   belonging to $\overline{P}(Cl_t^{\le})$, $P = \{i_1, \ldots, i_p\} \subseteq I$:
   if $f_{i_1}(x) \le r_{i_1}$ and . . . and $f_{i_p}(x) \le r_{i_p}$, then x possibly belongs to $Cl_t^{\le}$,
   $t = 1, \ldots, m-1$, $r_{i_1}, \ldots, r_{i_p} \in \mathbb{R}$;
5) approximate $D_{\ge\le}$-decision rules, providing simultaneously lower and
   upper profiles of objects belonging to
   $Cl_s \cup Cl_{s+1} \cup \ldots \cup Cl_t$, without possibility of
   discerning to which class:
   if $f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_k}(x) \ge r_{i_k}$ and $f_{i_{k+1}}(x) \le r_{i_{k+1}}$ and . . . and
   $f_{i_p}(x) \le r_{i_p}$, then $x \in Cl_s \cup Cl_{s+1} \cup \ldots \cup Cl_t$,
   $\{i_1, \ldots, i_p\} \subseteq I$, $s, t \in \{1, \ldots, m\}$, $s < t$, $r_{i_1}, \ldots, r_{i_p} \in \mathbb{R}$.
In the premise of a $D_{\ge\le}$-decision rule, we can have
"$f_i(x) \ge r_i$" and "$f_i(x) \le r'_i$", where $r_i \le r'_i$, for the same
$i \in I$. Moreover, if $r_i = r'_i$, the two conditions boil down to
"$f_i(x) = r_i$".
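Let us remark that the values $r_{i_1}, \ldots, r_{i_p}$ in the premise of each
decision rule are evaluation profiles with respect to
$\{i_1, \ldots, i_p\} \subseteq I$ of some objects in the corresponding rough
approximations. More precisely,
1) in case
   if $f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_p}(x) \ge r_{i_p}$, then $x \in Cl_t^{\ge}$,
   is a certain $D_{\ge}$-decision rule, there exists some
   $y \in \underline{P}(Cl_t^{\ge})$, $P = \{i_1, \ldots, i_p\}$, such that
   $f_{i_1}(y) = r_{i_1}$ and . . . and $f_{i_p}(y) = r_{i_p}$;
2) in case
   if $f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_p}(x) \ge r_{i_p}$, then x possibly belongs to $Cl_t^{\ge}$,
   is a possible $D_{\ge}$-decision rule, there exists some
   $y \in \overline{P}(Cl_t^{\ge})$, $P = \{i_1, \ldots, i_p\}$, such that
   $f_{i_1}(y) = r_{i_1}$ and . . . and $f_{i_p}(y) = r_{i_p}$;
3) in case
   if $f_{i_1}(x) \le r_{i_1}$ and . . . and $f_{i_p}(x) \le r_{i_p}$, then $x \in Cl_t^{\le}$,
   is a certain $D_{\le}$-decision rule, there exists some
   $y \in \underline{P}(Cl_t^{\le})$, $P = \{i_1, \ldots, i_p\}$, such that
   $f_{i_1}(y) = r_{i_1}$ and . . . and $f_{i_p}(y) = r_{i_p}$;
4) in case
   if $f_{i_1}(x) \le r_{i_1}$ and . . . and $f_{i_p}(x) \le r_{i_p}$, then x possibly belongs to $Cl_t^{\le}$,
   is a possible $D_{\le}$-decision rule, there exists some
   $y \in \overline{P}(Cl_t^{\le})$, $P = \{i_1, \ldots, i_p\}$, such that
   $f_{i_1}(y) = r_{i_1}$ and . . . and $f_{i_p}(y) = r_{i_p}$;
5) in case
   if $f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_k}(x) \ge r_{i_k}$ and $f_{i_{k+1}}(x) \le r_{i_{k+1}}$ and . . . and
   $f_{i_p}(x) \le r_{i_p}$, then $x \in Cl_s \cup Cl_{s+1} \cup \ldots \cup Cl_t$,
   is an approximate $D_{\ge\le}$-decision rule, there exists some
   $y \in \overline{P}(Cl_t^{\ge})$, and some $z \in \overline{P}(Cl_s^{\le})$,
   $\{i_1, \ldots, i_k\} \cup \{i_{k+1}, \ldots, i_p\} = P$, such that
   $f_{i_1}(y) = r_{i_1}$ and . . . and $f_{i_k}(y) = r_{i_k}$, and $f_{i_{k+1}}(z) = r_{i_{k+1}}$ and . . . and
   $f_{i_p}(z) = r_{i_p}$.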
Note that in the above rules, each condition profile defines a dominance cone
in the $|P|$-dimensional condition space $\mathbb{R}^{|P|}$, where
$P = \{i_1, \ldots, i_p\}$ is the set of criteria considered in the rule, and
each decision defines a dominance cone in the one-dimensional decision space
$\{1, \ldots, m\}$. Both dominance cones are positive for $D_{\ge}$-rules,
negative for $D_{\le}$-rules and partially positive and partially negative for
$D_{\ge\le}$-rules.
Let us also point out that dominance cones corresponding to condition
profiles can originate from any point of $\mathbb{R}^{|P|}$,
$P = \{i_1, \ldots, i_p\}$, without the risk of being too specific. Thus,
contrary to traditional granular computing, the condition space $\mathbb{R}^n$
(i.e. the set of all possible vectors of evaluations of objects with respect to
considered criteria) does not need to be discretized. This implies that the
rules obtained from DRSA are also meaningful for analysis of vectors coming
from a continuous multiobjective optimization problem where the concept of
dominance cone is particularly useful (see e.g. Chapter 1 where the related
concept of ordering cones is discussed).
Since a decision rule is a kind of implication, by a minimal rule we mean
an implication such that there is no other implication with the premise of at
least the same weakness (in other words, a rule using a subset of conditions
and/or weaker conditions) and the conclusion of at least the same strength
(in other words, a $D_{\ge}$- or a $D_{\le}$-decision rule assigning objects to
the same union or sub-union of classes, or a $D_{\ge\le}$-decision rule
assigning objects to the same or smaller set of classes).
The rules of type 1) and 3) represent certain knowledge extracted from
data (sorting examples), while the rules of type 2) and 4) represent possible
knowledge; the rules of type 5) represent doubtful knowledge, because they
are supported by inconsistent objects only.
Moreover, the rules of type 1) and 3) are exact or deterministic if they
do not cover negative examples, and they are probabilistic otherwise. In the
latter case, each rule is characterized by a confidence ratio, representing the
probability that an object matching the premise of the rule also matches its
conclusion.
Given a certain or possible $D_{\ge}$-decision rule $r \equiv$ "if
$f_{i_1}(x) \ge r_{i_1}$ and . . . and $f_{i_p}(x) \ge r_{i_p}$, then
$x \in Cl_t^{\ge}$", an object $y \in U$ supports $r$ if
$f_{i_1}(y) \ge r_{i_1}$ and . . . and $f_{i_p}(y) \ge r_{i_p}$ and
$y \in Cl_t^{\ge}$. Moreover, object $y \in U$ supporting decision rule $r$ is
a base of $r$ if $f_{i_1}(y) = r_{i_1}$ and . . . and $f_{i_p}(y) = r_{i_p}$.
Similar definitions hold for certain or possible $D_{\le}$-decision rules and
approximate $D_{\ge\le}$-decision rules. A decision rule having at least one
base is called robust. Identification of supporting objects and bases of robust
rules is important for interpretation of the rules in multiple criteria
decision analysis perspective. The ratio of the number of objects supporting
the premise of a rule and the number of all considered objects is called
relative support of a rule. The relative support and the confidence ratio are
basic characteristics of a rule, however, some Bayesian confirmation measures
reflect much better the attractiveness of a rule (Greco et al., 2004b).
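The notions of supporting object, base, confidence ratio and relative support can be illustrated on a single rule. The sketch below uses the sorting examples of Table 5.1 (Section 5.3); the rule itself ("if Mathematics ≥ medium and Literature ≥ medium, then the student is at least medium") is a hypothetical example chosen here and is not claimed to be a rule reported in the text. The numeric coding and variable names are likewise assumptions, and the relative support is counted over objects matching the premise, following the wording above.

```python
SCALE = {"bad": 0, "medium": 1, "good": 2}  # assumed numeric coding
CRIT = {"Mathematics": 0, "Physics": 1, "Literature": 2}

# (Mathematics, Physics, Literature, Overall Evaluation) from Table 5.1.
students = {
    "S1": ("good", "medium", "bad", "bad"),
    "S2": ("medium", "medium", "bad", "medium"),
    "S3": ("medium", "medium", "medium", "medium"),
    "S4": ("good", "good", "medium", "good"),
    "S5": ("good", "medium", "good", "good"),
    "S6": ("good", "good", "good", "good"),
    "S7": ("bad", "bad", "bad", "bad"),
    "S8": ("bad", "bad", "medium", "bad"),
}

# Hypothetical rule, used only for illustration:
# if Mathematics >= medium and Literature >= medium,
# then the student belongs to the upward union of class "medium".
premise = {"Mathematics": "medium", "Literature": "medium"}
conclusion = "medium"

def matches_premise(x):
    return all(SCALE[students[x][CRIT[c]]] >= SCALE[r]
               for c, r in premise.items())

def matches_conclusion(x):
    return SCALE[students[x][3]] >= SCALE[conclusion]

covered = {x for x in students if matches_premise(x)}
supporting = {x for x in covered if matches_conclusion(x)}
# A base supports the rule and sits exactly on the profile of the premise.
bases = {x for x in supporting
         if all(students[x][CRIT[c]] == r for c, r in premise.items())}

confidence = len(supporting) / len(covered)
relative_support = len(covered) / len(students)
print(sorted(supporting), sorted(bases), confidence, relative_support)
```

The rule covers no negative example, so under the terminology above it is exact, and since S3 is a base it is also robust.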
A set of decision rules is complete if it covers all considered objects (sorting
examples) in such a way that consistent objects are re-assigned to their original
classes, and inconsistent objects are assigned to clusters of classes referring to
this inconsistency. We call each set of decision rules that is complete and non-
redundant minimal, i.e. exclusion of any rule from this set makes it incomplete.
One of three induction strategies can be adopted to obtain a set of decision
rules (Stefanowski, 1998):
• generation of a minimal representation, i.e. a minimal set of rules,
• generation of an exhaustive representation, i.e. all rules for a given data
  table,
• generation of a characteristic representation, i.e. a set of rules covering
  relatively many objects, however, not necessarily all objects, from U.
Procedures for induction of decision rules from dominance-based rough
approximations have been proposed by Greco et al. (2001a).
In (Giove et al., 2002), a new methodology for the induction of monotonic
decision trees from dominance-based rough approximations of preference
ordered decision classes has been proposed.
5.3 Example Illustrating DRSA

In this section we present a didactic example which illustrates the main con-
cepts of DRSA. Let us consider the following multiple criteria sorting problem.
Students of a college must obtain an overall evaluation on the basis of their
achievements in Mathematics, Physics and Literature. The three subjects are
clearly criteria (condition attributes) and the comprehensive evaluation is a
decision attribute. For simplicity, the value sets of the criteria and of the
decision attribute are the same, and they are composed of three values: bad,
medium and good. The preference order of these values is obvious. Thus, there
are three preference ordered decision classes, so the problem belongs to the
category of multiple criteria sorting. In order to build a preference model of
the jury, we will analyze a set of exemplary evaluations of students (sorting
examples) provided by the jury. They are presented in Table 5.1.
Note that the dominance principle obviously applies to the sorting examples,
since an improvement of a student's score on one of three criteria, with
other scores unchanged, should not worsen the student's overall evaluation,
but rather improve it.
Table 5.1. Exemplary evaluations of students (sorting examples)

Student  Mathematics  Physics  Literature  Overall Evaluation
S1       good         medium   bad         bad
S2       medium       medium   bad         medium
S3       medium       medium   medium      medium
S4       good         good     medium      good
S5       good         medium   good        good
S6       good         good     good        good
S7       bad          bad      bad         bad
S8       bad          bad      medium      bad
Observe that student S1 has evaluations no worse than those of student S2 on
all the considered criteria, however, the overall evaluation of S1 is worse
than the overall evaluation of S2. This contradicts the dominance principle, so
the two sorting examples are inconsistent. Let us observe that if we reduced
the set of considered criteria, i.e. the set of considered subjects, then some
more inconsistencies could occur. For example, let us remove from Table 5.1 the
evaluation on Literature. In this way we get Table 5.2, where S1 is
inconsistent not only with S2, but also with S3 and S5. In fact, student S1 has
evaluations no worse than those of students S2, S3 and S5 on all the considered
criteria (Mathematics and Physics), however, the overall evaluation of S1 is
worse than the overall evaluation of S2, S3 and S5.
Table 5.2. Exemplary evaluations of students excluding Literature

Student  Mathematics  Physics  Overall Evaluation
S1       good         medium   bad
S2       medium       medium   medium
S3       medium       medium   medium
S4       good         good     good
S5       good         medium   good
S6       good         good     good
S7       bad          bad      bad
S8       bad          bad      bad
Observe, moreover, that if we remove from Table 5.1 the evaluations on
Mathematics, we obtain Table 5.3, where no new inconsistencies occur compared
with Table 5.1.
Table 5.3. Exemplary evaluations of students excluding Mathematics

Student  Physics  Literature  Overall Evaluation
S1       medium   bad         bad
S2       medium   bad         medium
S3       medium   medium      medium
S4       good     medium      good
S5       medium   good        good
S6       good     good        good
S7       bad      bad         bad
S8       bad      medium      bad

Similarly, if we remove from Table 5.1 the evaluations on Physics, we obtain
Table 5.4, where no new inconsistencies occur compared with Table 5.1.

Table 5.4. Exemplary evaluations of students excluding Physics

Student  Mathematics  Literature  Overall Evaluation
S1       good         bad         bad
S2       medium       bad         medium
S3       medium       medium      medium
S4       good         medium      good
S5       good         good        good
S6       good         good        good
S7       bad          bad         bad
S8       bad          medium      bad

The fact that no new inconsistency occurs when Mathematics or Physics is
removed means that each of the subsets of criteria {Physics, Literature} and
{Mathematics, Literature} contains sufficient information to represent the
overall evaluation of students with the same quality of approximation as the
complete set of three criteria. This is not the case, however, for the subset
{Mathematics, Physics}. Observe, moreover, that subsets {Physics, Literature}
and {Mathematics, Literature} are minimal, because no other criterion can be
removed without new inconsistencies occurring. Thus, {Physics, Literature}
and {Mathematics, Literature} are the reducts of the complete set of criteria
{Mathematics, Physics, Literature}. Since Literature is the only criterion
which cannot be removed from any reduct without introducing new
inconsistencies, it constitutes the core, i.e. the set of indispensable
criteria. The core is, of course, the intersection of all reducts, i.e. in our
example:
$\{\text{Literature}\} = \{\text{Physics}, \text{Literature}\} \cap \{\text{Mathematics}, \text{Literature}\}$.
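The search for reducts and the core can be reproduced mechanically. The
following is a minimal sketch, not the authors' algorithm: it assumes that a
subset of criteria is acceptable when it yields exactly the same inconsistent
pairs as the full set (the informal test used in the text, standing in for
the quality of approximation), and the names RANK, TABLE, inconsistencies and
reducts are introduced here for illustration only.

```python
from itertools import combinations

RANK = {"bad": 0, "medium": 1, "good": 2}
CRITERIA = ("Mathematics", "Physics", "Literature")

# Table 5.1: (Mathematics, Physics, Literature, overall evaluation)
TABLE = {
    "S1": ("good", "medium", "bad", "bad"),
    "S2": ("medium", "medium", "bad", "medium"),
    "S3": ("medium", "medium", "medium", "medium"),
    "S4": ("good", "good", "medium", "good"),
    "S5": ("good", "medium", "good", "good"),
    "S6": ("good", "good", "good", "good"),
    "S7": ("bad", "bad", "bad", "bad"),
    "S8": ("bad", "bad", "medium", "bad"),
}


def inconsistencies(criteria):
    """Pairs (a, b) where a is not worse than b on every criterion in
    `criteria`, yet a has a strictly worse overall evaluation than b."""
    idx = [CRITERIA.index(c) for c in criteria]
    pairs = set()
    for a, row_a in TABLE.items():
        for b, row_b in TABLE.items():
            dominates = all(RANK[row_a[i]] >= RANK[row_b[i]] for i in idx)
            if a != b and dominates and RANK[row_a[3]] < RANK[row_b[3]]:
                pairs.add((a, b))
    return pairs


def reducts():
    """Minimal subsets of criteria that add no inconsistency (assumed test)."""
    full = inconsistencies(CRITERIA)
    keeps = [set(s)
             for k in range(1, len(CRITERIA) + 1)
             for s in combinations(CRITERIA, k)
             if inconsistencies(s) == full]
    return [s for s in keeps if not any(t < s for t in keeps)]


RED = reducts()
CORE = set.intersection(*RED)
print(RED, CORE)
```

On the data of Table 5.1 the sketch returns the two reducts named in the text
and {Literature} as their intersection.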
In order to illustrate in a simple way the concept of rough approximation, let
us confine our analysis to the reduct {Mathematics, Literature}. Let us
consider student S4. His positive dominance cone
$D^{+}_{\{\text{Mathematics},\text{Literature}\}}(S4)$ is composed of all the
students having evaluations not worse than him on Mathematics and Literature,
i.e. of all the students dominating him with respect to Mathematics and
Literature. Thus, we have
$D^{+}_{\{\text{Mathematics},\text{Literature}\}}(S4) = \{S4, S5, S6\}$.
On the other hand, the negative dominance cone of student S4,
$D^{-}_{\{\text{Mathematics},\text{Literature}\}}(S4)$, is composed of all the
students having evaluations not better than him on Mathematics and Literature,
i.e. of all the students dominated by him with respect to Mathematics and
Literature. Thus, we have
$D^{-}_{\{\text{Mathematics},\text{Literature}\}}(S4) = \{S1, S2, S3, S4, S7, S8\}$.
Similar dominance cones can be obtained for all the students from Table 5.4.
For example, for S2 we have
$D^{+}_{\{\text{Mathematics},\text{Literature}\}}(S2) = \{S1, S2, S3, S4, S5, S6\}$
and
$D^{-}_{\{\text{Mathematics},\text{Literature}\}}(S2) = \{S2, S7\}$.
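The dominance cones above can be checked with a few lines of code. This is a
minimal sketch under the assumption that the ordinal scale is encoded as
integers; the profiles are those of Table 5.1 restricted to Mathematics and
Literature, and the function names are chosen here for illustration.

```python
RANK = {"bad": 0, "medium": 1, "good": 2}

# (Mathematics, Literature) profiles taken from Table 5.1
PROFILE = {
    "S1": ("good", "bad"), "S2": ("medium", "bad"),
    "S3": ("medium", "medium"), "S4": ("good", "medium"),
    "S5": ("good", "good"), "S6": ("good", "good"),
    "S7": ("bad", "bad"), "S8": ("bad", "medium"),
}


def dominates(a, b):
    """True if a is not worse than b on both criteria."""
    return all(RANK[u] >= RANK[v] for u, v in zip(PROFILE[a], PROFILE[b]))


def positive_cone(x):
    """Students dominating x (x included)."""
    return {y for y in PROFILE if dominates(y, x)}


def negative_cone(x):
    """Students dominated by x (x included)."""
    return {y for y in PROFILE if dominates(x, y)}


print(sorted(positive_cone("S4")), sorted(negative_cone("S4")))
print(sorted(positive_cone("S2")), sorted(negative_cone("S2")))
```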
Using dominance cones, we can calculate the rough approximations. Let us
consider, for example, the lower approximation of the set of students having
a "good" overall evaluation, $\underline{P}(Cl^{\geq}_{\text{good}})$, with
P = {Mathematics, Literature}. We have
$\underline{P}(Cl^{\geq}_{\text{good}}) = \{S4, S5, S6\}$, because the positive
dominance cones of students S4, S5 and S6 are all included in the set of
students with an overall evaluation "good". In other words, this means that
there is no student dominating S4 or S5 or S6 while having an overall
evaluation worse than "good". From the viewpoint of decision making, this
means that, taking into account the available information about evaluation of
students on Mathematics and Literature, the fact that student y dominates S4
or S5 or S6 is a sufficient condition to conclude that y is a "good" student.
As to the upper approximation of the set of students with a "good" overall
evaluation, we have $\overline{P}(Cl^{\geq}_{\text{good}}) = \{S4, S5, S6\}$,
because the negative dominance cones of students S4, S5 and S6 have a nonempty
intersection with the set of students having a "good" overall evaluation. In
other words, this means that for each one of the students S4, S5 and S6, there
is at least one student dominated by him with an overall evaluation "good".
From the point of view of decision making, this means that, taking into
account the available information about evaluation of students on Mathematics
and Literature, the fact that student y dominates S4 or S5 or S6 is a possible
condition to conclude that y is a "good" student.
Let us observe that for the set of criteria P = {Mathematics, Literature},
the lower and upper approximations of the set of "good" students are the
same. This means that sorting examples concerning this class are all
consistent. This is not the case, however, for the sorting examples concerning
the union of decision classes "at least medium". For this upward union we have
$\underline{P}(Cl^{\geq}_{\text{medium}}) = \{S3, S4, S5, S6\}$ and
$\overline{P}(Cl^{\geq}_{\text{medium}}) = \{S1, S2, S3, S4, S5, S6\}$.
The difference between $\underline{P}(Cl^{\geq}_{\text{medium}})$ and
$\overline{P}(Cl^{\geq}_{\text{medium}})$, i.e. the boundary
$Bn_P(Cl^{\geq}_{\text{medium}}) = \{S1, S2\}$, is composed of students with
inconsistent overall evaluations, which has already been noticed above. From
the viewpoint of decision making, this means that, taking into account the
available information about evaluation of students on Mathematics and
Literature, the fact that student y is dominated by S1 and dominates S2 is a
condition to conclude that y can obtain an overall evaluation "at least
medium" with some doubts.
Until now we have considered rough approximations of only upward unions of
decision classes. It is interesting, however, to calculate also rough
approximations of downward unions of decision classes. Let us consider first
the lower approximation of the set of students having an "at most medium"
overall evaluation, $\underline{P}(Cl^{\leq}_{\text{medium}})$. We have
$\underline{P}(Cl^{\leq}_{\text{medium}}) = \{S1, S2, S3, S7, S8\}$, because
the negative dominance cones of students S1, S2, S3, S7 and S8 are all
included in the set of students with overall evaluation "at most medium". In
other words, this means that there is no student dominated by S1 or S2 or S3
or S7 or S8 while having an overall evaluation better than "medium". From
the viewpoint of decision making, this means that, taking into account the
available information about evaluation of students on Mathematics and
Literature, the fact that student y is dominated by S1 or S2 or S3 or S7 or S8
is a sufficient condition to conclude that y is an "at most medium" student.
As to the upper approximation of the set of students with an "at most medium"
overall evaluation, we have
$\overline{P}(Cl^{\leq}_{\text{medium}}) = \{S1, S2, S3, S7, S8\}$, because
the positive dominance cones of students S1, S2, S3, S7 and S8 have a
nonempty intersection with the set of students having an "at most medium"
overall evaluation. In other words, this means that for each one of the
students S1, S2, S3, S7 and S8, there is at least one student dominating him
with an overall evaluation "at most medium". From the viewpoint of decision
making, this means that, taking into account the available information about
evaluation of students on Mathematics and Literature, the fact that student y
is dominated by S1 or S2 or S3 or S7 or S8 is a possible condition to conclude
that y is an "at most medium" student.
Finally, for the set of students having a "bad" overall evaluation, we have
$\underline{P}(Cl^{\leq}_{\text{bad}}) = \{S7, S8\}$ and
$\overline{P}(Cl^{\leq}_{\text{bad}}) = \{S1, S2, S7, S8\}$. The difference
between $\underline{P}(Cl^{\leq}_{\text{bad}})$ and
$\overline{P}(Cl^{\leq}_{\text{bad}})$, i.e. the boundary
$Bn_P(Cl^{\leq}_{\text{bad}}) = \{S1, S2\}$, is composed of students with
inconsistent overall evaluations, which has already been noticed above. From
the viewpoint of decision making, this means that, taking into account the
available information about evaluation of students on Mathematics and
Literature, the fact that student y is dominated by S1 and dominates S2 is a
condition to conclude that y can obtain an overall evaluation "bad" with some
doubts. Observe, moreover, that
$Bn_P(Cl^{\geq}_{\text{medium}}) = Bn_P(Cl^{\leq}_{\text{bad}})$.
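All the approximations computed above follow from two set operations on the
dominance cones. The sketch below is an illustration under the same integer
encoding assumption as before; the function names (lower_up, upper_up,
lower_down, upper_down) are hypothetical labels, not notation from the
chapter.

```python
RANK = {"bad": 0, "medium": 1, "good": 2}

# (Mathematics, Literature, overall evaluation) taken from Table 5.1
DATA = {
    "S1": ("good", "bad", "bad"), "S2": ("medium", "bad", "medium"),
    "S3": ("medium", "medium", "medium"), "S4": ("good", "medium", "good"),
    "S5": ("good", "good", "good"), "S6": ("good", "good", "good"),
    "S7": ("bad", "bad", "bad"), "S8": ("bad", "medium", "bad"),
}


def dominates(a, b):
    return all(RANK[DATA[a][i]] >= RANK[DATA[b][i]] for i in (0, 1))


def pos_cone(x):
    return {y for y in DATA if dominates(y, x)}


def neg_cone(x):
    return {y for y in DATA if dominates(x, y)}


def upward(t):    # students evaluated "at least t"
    return {x for x in DATA if RANK[DATA[x][2]] >= RANK[t]}


def downward(t):  # students evaluated "at most t"
    return {x for x in DATA if RANK[DATA[x][2]] <= RANK[t]}


def lower_up(t):
    return {x for x in DATA if pos_cone(x) <= upward(t)}


def upper_up(t):
    return {x for x in DATA if neg_cone(x) & upward(t)}


def lower_down(t):
    return {x for x in DATA if neg_cone(x) <= downward(t)}


def upper_down(t):
    return {x for x in DATA if pos_cone(x) & downward(t)}


boundary_medium = upper_up("medium") - lower_up("medium")
boundary_bad = upper_down("bad") - lower_down("bad")
print(sorted(boundary_medium), sorted(boundary_bad))
```

Both boundaries come out as {S1, S2}, in agreement with the equality stated
above.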
Given the above rough approximations with respect to the set of criteria
P = {Mathematics, Literature}, one can induce a set of decision rules
representing the preferences of the jury. The idea is that evaluation profiles
of students belonging to the lower approximations can serve as a base for some
certain rules, while evaluation profiles of students belonging to the
boundaries can serve as a base for some approximate rules. The following
decision rules have been induced (the ids of the students supporting each rule
are given in braces):
rule 1) if the evaluation on Mathematics is (at least) good, and the evaluation
  on Literature is at least medium, then the overall evaluation is (at least)
  good, {S4, S5, S6},
rule 2) if the evaluation on Mathematics is at least medium, and the
  evaluation on Literature is at least medium, then the overall evaluation is
  at least medium, {S3, S4, S5, S6},
rule 3) if the evaluation on Mathematics is at least medium, and the
  evaluation on Literature is (at most) bad, then the overall evaluation is
  bad or medium, {S1, S2},
rule 4) if the evaluation on Mathematics is at most medium, then the overall
  evaluation is at most medium, {S2, S3, S7, S8},
rule 5) if the evaluation on Literature is (at most) bad, then the overall
  evaluation is at most medium, {S1, S2, S7},
rule 6) if the evaluation on Mathematics is (at most) bad, then the overall
  evaluation is (at most) bad, {S7, S8}.
Note that in rule 3) we wrote "if ... the evaluation on Literature is (at
most) bad" and not simply "if ... the evaluation on Literature is bad" (as
would have been possible, since in the considered example "bad" is the worst
possible evaluation). The same remark holds for rule 5) and rule 6). Notice
that rules 1)-2) and 4)-7) are certain (rule 7 is introduced below), while
rule 3) is an approximate one. These rules represent knowledge discovered from
the available information. In the current context, the knowledge is
interpreted as a preference model of the jury. A characteristic feature of the
syntax of decision rules representing preferences is the use of expressions
"at least" or "at most" a value; in case of extreme values ("good" and "bad"),
these expressions are put in parentheses because there is no value above
"good" and below "bad".
Even if one can represent all the knowledge using only one reduct of the set
of criteria (as we have done using P = {Mathematics, Literature}), when
considering a larger set of criteria than a reduct, one can obtain a more
synthetic representation of knowledge, i.e. the number of decision rules or
the number of elementary conditions, or both of them, can get smaller. For
example, considering the set of all three criteria, {Mathematics, Physics,
Literature}, we can induce a set of decision rules composed of the above rules
1), 2), 3) and 6), plus the following:
rule 7) if the evaluation on Physics is at most medium, and the evaluation
  on Literature is at most medium, then the overall evaluation is at most
  medium, {S1, S2, S3, S7, S8}.
Thus, the complete set of decision rules induced from Table 5.1 is composed
of 5 instead of 6 rules.
Once accepted by the DM, these rules represent his/her preference model.
Assuming that rules 1)-7) in our example represent the preference model of
the jury, it can be used to evaluate new students. For example, student S9,
who is "medium" in Mathematics and Physics and "good" in Literature, would
be evaluated as "medium" because his profile matches the premise of rule 2),
having as consequence an overall evaluation at least "medium". The overall
evaluation of S9 cannot be "good", because his profile does not match any
rule having as consequence an overall evaluation "good" (in the considered
example, the only rule of this type is rule 1), whose premise is not matched
by the profile of S9).
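The evaluation of S9 can be traced with a small rule matcher. The sketch
below is illustrative only: it encodes the five-rule set induced from all
three criteria (rules 1, 2, 3, 6 and 7), and the data layout (a premise as a
dictionary of (operator, value) pairs) is an assumption made for this
example.

```python
RANK = {"bad": 0, "medium": 1, "good": 2}

# (name, premise, conclusion); ">=" reads "at least", "<=" reads "at most"
RULES = [
    ("rule 1", {"Mathematics": (">=", "good"),
                "Literature": (">=", "medium")}, (">=", "good")),
    ("rule 2", {"Mathematics": (">=", "medium"),
                "Literature": (">=", "medium")}, (">=", "medium")),
    ("rule 3", {"Mathematics": (">=", "medium"),
                "Literature": ("<=", "bad")}, ("in", ("bad", "medium"))),
    ("rule 6", {"Mathematics": ("<=", "bad")}, ("<=", "bad")),
    ("rule 7", {"Physics": ("<=", "medium"),
                "Literature": ("<=", "medium")}, ("<=", "medium")),
]


def matches(premise, profile):
    """True if the profile satisfies every elementary condition."""
    for criterion, (op, value) in premise.items():
        got, ref = RANK[profile[criterion]], RANK[value]
        if (op == ">=" and got < ref) or (op == "<=" and got > ref):
            return False
    return True


def matched_rules(profile):
    return [(name, conclusion)
            for name, premise, conclusion in RULES
            if matches(premise, profile)]


S9 = {"Mathematics": "medium", "Physics": "medium", "Literature": "good"}
print(matched_rules(S9))
```

Only rule 2) fires for S9, so the strongest supported conclusion is "at least
medium"; since no rule concluding "good" is matched, the text assigns
"medium".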
5.4 Multiple Criteria Decision Analysis Using Association Rules
In interactive multiobjective optimization, and, more generally, in multiple
criteria decision analysis, the DM is interested in relationships between
attainable values of criteria. For instance, in a car selection problem, one
can observe that in the set of considered cars, if the maximum speed is at
least 200 km/h and the time to reach 100 km/h is at most 7 seconds, then the
price is not less than 40,000$ and the fuel consumption is not less than 9
liters per 100 km. These relationships are association rules whose general
syntax, in case of minimization of criteria $f_i$, $i \in I$, is:
"if $f_{i_1}(x) \leq r_{i_1}$ and ... and $f_{i_p}(x) \leq r_{i_p}$, then
$f_{i_{p+1}}(x) \geq r_{i_{p+1}}$ and ... and $f_{i_q}(x) \geq r_{i_q}$",
where $\{i_1, \ldots, i_q\} \subseteq I$ and $r_{i_1}, \ldots, r_{i_q} \in \mathbb{R}$.
If criterion $f_i$, $i \in I$, should be maximized, the corresponding
condition in the association rule should be reversed, i.e. in the premise the
condition becomes $f_i(x) \geq r_i$, and in the conclusion it becomes
$f_i(x) \leq r_i$.
Given an association rule $r \equiv$ "if $f_{i_1}(x) \leq r_{i_1}$ and ... and
$f_{i_p}(x) \leq r_{i_p}$, then $f_{i_{p+1}}(x) \geq r_{i_{p+1}}$ and ... and
$f_{i_q}(x) \geq r_{i_q}$", an object $y \in U$ supports r if
$f_{i_1}(y) \leq r_{i_1}$ and ... and $f_{i_p}(y) \leq r_{i_p}$ and
$f_{i_{p+1}}(y) \geq r_{i_{p+1}}$ and ... and $f_{i_q}(y) \geq r_{i_q}$.
Moreover, an object $y \in U$ supporting association rule r is a base of r if
$f_{i_1}(y) = r_{i_1}$ and ... and $f_{i_p}(y) = r_{i_p}$ and
$f_{i_{p+1}}(y) = r_{i_{p+1}}$ and ... and $f_{i_q}(y) = r_{i_q}$. An
association rule having at least one base is called robust.
We say that an association rule $r \equiv$ "if $f_{i_1}(x) \leq r_{i_1}$ and
... and $f_{i_p}(x) \leq r_{i_p}$, then $f_{i_{p+1}}(x) \geq r_{i_{p+1}}$ and
... and $f_{i_q}(x) \geq r_{i_q}$" holds in universe U if:
1) there is at least one $y \in U$ supporting r,
2) r is not contradicted in U, i.e. there is no $z \in U$ such that
   $f_{i_1}(z) \leq r_{i_1}$ and ... and $f_{i_p}(z) \leq r_{i_p}$, while
   $f_{i_{p+1}}(z) < r_{i_{p+1}}$ or ... or $f_{i_q}(z) < r_{i_q}$.
Given two association rules
$r_1 \equiv$ "if $f_{i_1}(x) \leq r^1_{i_1}$ and ... and
  $f_{i_p}(x) \leq r^1_{i_p}$, then $f_{i_{p+1}}(x) \geq r^1_{i_{p+1}}$ and
  ... and $f_{i_q}(x) \geq r^1_{i_q}$",
$r_2 \equiv$ "if $f_{j_1}(x) \leq r^2_{j_1}$ and ... and
  $f_{j_s}(x) \leq r^2_{j_s}$, then $f_{j_{s+1}}(x) \geq r^2_{j_{s+1}}$ and
  ... and $f_{j_t}(x) \geq r^2_{j_t}$",
we say that rule $r_1$ is not weaker than rule $r_2$, denoted by
$r_1 \succeq r_2$, if:
$\alpha$) $\{i_1, \ldots, i_p\} \subseteq \{j_1, \ldots, j_s\}$,
$\beta$) $r^1_{i_1} \geq r^2_{i_1}, \ldots, r^1_{i_p} \geq r^2_{i_p}$,
$\gamma$) $\{i_{p+1}, \ldots, i_q\} \supseteq \{j_{s+1}, \ldots, j_t\}$,
$\delta$) $r^1_{j_{s+1}} \geq r^2_{j_{s+1}}, \ldots, r^1_{j_t} \geq r^2_{j_t}$.
Conditions $\beta$) and $\delta$) are formulated for criteria $f_i$ to be
minimized. If criterion $f_i$ should be maximized, the corresponding
inequalities should be reversed, i.e. $r^1_i \leq r^2_i$ in condition
$\beta$) as well as in condition $\delta$). Notice that $\succeq$ is a binary
relation on the set of association rules, which is a partial preorder, i.e. it
is reflexive (each rule is not weaker than itself) and transitive. The
asymmetric part of the relation $\succeq$ is denoted by $\succ$, and
$r_1 \succ r_2$ reads "$r_1$ is stronger than $r_2$".
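For example, consider the following association rules:
$r_1 \equiv$ "if the maximum speed is at least 200 km/h and the time to reach
  100 km/h is at most 7 seconds, then the price is not less than 40,000$ and
  the fuel consumption is not less than 9 liters per 100 km",
$r_2 \equiv$ "if the maximum speed is at least 200 km/h and the time to reach
  100 km/h is at most 7 seconds and the horse power is at least 175 kW,
  then the price is not less than 40,000$ and the fuel consumption is not less
  than 9 liters per 100 km",
$r_3 \equiv$ "if the maximum speed is at least 220 km/h and the time to reach
  100 km/h is at most 7 seconds, then the price is not less than 40,000$ and
  the fuel consumption is not less than 9 liters per 100 km",
$r_4 \equiv$ "if the maximum speed is at least 200 km/h and the time to reach
  100 km/h is at most 7 seconds, then the price is not less than 40,000$",
$r_5 \equiv$ "if the maximum speed is at least 200 km/h and the time to reach
  100 km/h is at most 7 seconds, then the price is not less than 35,000$ and
  the fuel consumption is not less than 9 liters per 100 km",
$r_6 \equiv$ "if the maximum speed is at least 220 km/h and the time to reach
  100 km/h is at most 7 seconds and the horse power is at least 175 kW,
  then the price is not less than 35,000$".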
Let us observe that rule $r_1$ is stronger than each of the other five rules
for the following reasons:
- $r_1 \succ r_2$ for condition $\alpha$) because, all things being equal
  elsewhere, in the premise of $r_2$ there is an additional condition: "the
  horse power is at least 175 kW",
- $r_1 \succ r_3$ for condition $\beta$) because, all things being equal
  elsewhere, in the premise of $r_3$ there is a condition with a worse
  threshold value: "the maximum speed is at least 220 km/h" instead of "the
  maximum speed is at least 200 km/h",
- $r_1 \succ r_4$ for condition $\gamma$) because, all things being equal
  elsewhere, in the conclusion of $r_4$ one condition is missing: "the fuel
  consumption is not less than 9 liters per 100 km",
- $r_1 \succ r_5$ for condition $\delta$) because, all things being equal
  elsewhere, in the conclusion of $r_5$ there is a condition with a worse
  threshold value: "the price is not less than 35,000$" instead of "the price
  is not less than 40,000$",
- $r_1 \succ r_6$ for conditions $\alpha$), $\beta$), $\gamma$) and $\delta$)
  because all weak points for which rules $r_2$, $r_3$, $r_4$ and $r_5$ are
  weaker than rule $r_1$ are present in $r_6$.
An association rule r is minimal if there is no other rule stronger than r with
respect to $\succeq$. An algorithm for induction of association rules from
preference ordered data has been presented in (Greco et al., 2002a).
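The "not weaker than" relation can be tested on the six car rules. The sketch
below is an illustration, not the induction algorithm of (Greco et al.,
2002a): rules are encoded as (premise, conclusion) pairs of threshold
dictionaries, the criterion names are shorthand chosen here, and it is
assumed that speed and horse power are maximized while time, price and fuel
consumption are minimized.

```python
MAXIMIZED = {"speed", "power"}  # assumed; time, price and fuel are minimized


def _threshold_ok(criterion, t1, t2):
    """Threshold test of the two threshold conditions, reversed for maximized criteria."""
    return t1 <= t2 if criterion in MAXIMIZED else t1 >= t2


def not_weaker(r1, r2):
    """True if r1 is not weaker than r2 (all four conditions hold)."""
    (p1, c1), (p2, c2) = r1, r2
    return (set(p1) <= set(p2)                                     # premise criteria
            and all(_threshold_ok(i, p1[i], p2[i]) for i in p1)    # premise thresholds
            and set(c1) >= set(c2)                                 # conclusion criteria
            and all(_threshold_ok(j, c1[j], c2[j]) for j in c2))   # conclusion thresholds


def stronger(r1, r2):
    """Asymmetric part of the relation."""
    return not_weaker(r1, r2) and not not_weaker(r2, r1)


R1 = ({"speed": 200, "time": 7}, {"price": 40000, "fuel": 9})
R2 = ({"speed": 200, "time": 7, "power": 175}, {"price": 40000, "fuel": 9})
R3 = ({"speed": 220, "time": 7}, {"price": 40000, "fuel": 9})
R4 = ({"speed": 200, "time": 7}, {"price": 40000})
R5 = ({"speed": 200, "time": 7}, {"price": 35000, "fuel": 9})
R6 = ({"speed": 220, "time": 7, "power": 175}, {"price": 35000})

print([stronger(R1, r) for r in (R2, R3, R4, R5, R6)])
```

With this encoding R1 is stronger than each of the other five rules, and R1 is
the only minimal rule among the six.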
5.5 Interactive Multiobjective Optimization Using Dominance-Based Rough Set
    Approach (IMO-DRSA)
In this section, we present a new method for Interactive Multiobjective
Optimization using Dominance-based Rough Set Approach (IMO-DRSA). The
method is composed of the following steps.
Step 1. Generate a representative sample of solutions from the currently
  considered part of the Pareto optimal set.
Step 2. Present the sample to the DM, possibly together with association rules
  showing relationships between attainable values of objective functions in
  the Pareto optimal set.
Step 3. If the DM is satisfied with one solution from the sample, then this is
  the most preferred solution and the procedure stops. Otherwise continue.
Step 4. Ask the DM to indicate a subset of "good" solutions in the sample.
Step 5. Apply DRSA to the current sample of solutions sorted into "good"
  and "others", in order to induce a set of decision rules with the following
  syntax: "if $f_{j_1}(x) \leq \alpha_{j_1}$ and ... and
  $f_{j_p}(x) \leq \alpha_{j_p}$, then solution x is good",
  $\{j_1, \ldots, j_p\} \subseteq \{1, \ldots, n\}$.
Step 6. Present the obtained set of rules to the DM.
Step 7. Ask the DM to select the decision rules most adequate to his/her
  preferences.
Step 8. Adjoin the constraints $f_{j_1}(x) \leq \alpha_{j_1}$, ...,
  $f_{j_p}(x) \leq \alpha_{j_p}$ coming from the rules selected in Step 7 to
  the set of constraints imposed on the Pareto optimal set, in order to focus
  on a part interesting from the point of view of the DM's preferences.
Step 9. Go back to Step 1.
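The control flow of Steps 1 to 9 can be sketched as a loop with pluggable
components. Everything below is hypothetical scaffolding written for this
illustration: the five callables stand for the sampling method, the DM's
answers and the DRSA rule induction, and the toy stand-ins at the bottom
(including the placeholder induce_rules, which is not DRSA) exist only to
make the loop executable.

```python
def imo_drsa(sample_pareto, ask_satisfied, ask_good, induce_rules,
             ask_rules, max_iter=20):
    """Skeleton of the interactive loop; constraints are (name, bound) pairs
    meaning f_name(x) <= bound for minimized objectives."""
    constraints = []                               # accumulated in Step 8
    for _ in range(max_iter):
        sample = sample_pareto(constraints)        # Step 1
        chosen = ask_satisfied(sample)             # Steps 2-3
        if chosen is not None:
            return chosen, constraints
        good = ask_good(sample)                    # Step 4
        rules = induce_rules(sample, good)         # Step 5
        selected = ask_rules(rules)                # Steps 6-7
        for rule in selected:                      # Step 8
            constraints.extend(rule)
    return None, constraints                       # Step 9 is the loop itself


# Toy stand-ins (all hypothetical): three points, two minimized objectives.
POINTS = [{"f1": 1, "f2": 9}, {"f1": 3, "f2": 5}, {"f1": 6, "f2": 2}]


def sample_pareto(constraints):
    return [p for p in POINTS if all(p[k] <= b for k, b in constraints)]


def ask_satisfied(sample):
    return sample[0] if len(sample) == 1 else None


def ask_good(sample):
    return [p for p in sample if p["f2"] <= 5]


def induce_rules(sample, good):
    # Placeholder, NOT DRSA: one single-condition rule per "good" solution.
    return [[("f2", p["f2"])] for p in good]


def ask_rules(rules):
    return [min(rules, key=lambda rule: rule[0][1])]


solution, constraints = imo_drsa(sample_pareto, ask_satisfied, ask_good,
                                 induce_rules, ask_rules)
print(solution, constraints)
```

Keeping the DM interaction and the rule induction behind callables mirrors
the remark in the text that any multiobjective optimization method can be
used in the calculation stage.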
In a sequence of iterations the method explores the Pareto optimal set of a
multiobjective optimization problem or an approximation of this set. In the
calculation stage (Step 1), any multiobjective optimization method (see
Chapter 1) which finds the Pareto optimal set or its approximation, such as
Evolutionary Multiobjective Optimization methods (see Chapter 3), can be
used. In the dialogue stage of the method (Steps 2 to 7), the DM is asked to
select a decision rule induced from his/her preference information, which is
equivalent to fixing some upper bounds for the minimized objective functions
$f_j$.
In Step 1, the representative sample of solutions from the currently
considered part of the Pareto optimal set can be generated using one of the
existing procedures, such as (Steuer and Choo, 1983; Wierzbicki, 1980;
Jaszkiewicz and Słowiński, 1999). It is recommended to use a fine grained
sample of representative solutions to induce association rules; however, the
sample of solutions presented to the DM in Step 2 should be much smaller
(about a dozen) in order to avoid an excessive cognitive effort of the DM.
Otherwise, the DM would risk giving unreliable information (for a discussion
about cognitive aspects of interactive multiobjective optimization methods
see Chapter 2 and Chapter 15).
The association rules presented in Step 2 help the DM in understanding
what (s)he can expect from the optimization problem. More precisely, any
association rule
"if $f_{i_1}(x) \leq r_{i_1}$ and ... and $f_{i_p}(x) \leq r_{i_p}$, then
$f_{i_{p+1}}(x) \geq r_{i_{p+1}}$ and ... and $f_{i_q}(x) \geq r_{i_q}$",
where $\{i_1, \ldots, i_q\} \subseteq I$ and $r_{i_1}, \ldots, r_{i_q} \in \mathbb{R}$,
says to the DM that, if (s)he wants to attain the values of objective
functions $f_{i_1}(x) \leq r_{i_1}$ and ... and $f_{i_p}(x) \leq r_{i_p}$,
then (s)he cannot reasonably expect to obtain values of objective functions
$f_{i_{p+1}}(x) < r_{i_{p+1}}$ and ... and $f_{i_q}(x) < r_{i_q}$.
In Step 2, it could be useful to visualize the currently considered part of
the Pareto optimal set using some techniques discussed in Chapter 8 and
Chapter 9.
With respect to the sorting of solutions into the two classes of "good"
and "others", observe that "good" means in fact "relatively good", i.e. better
than the rest in the current sample. In case the DM refuses to classify any
solution as "good", one can ask the DM to specify some minimal requirements
of the type $f_{j_1}(x) \leq \alpha_{j_1}$ and ... and
$f_{j_p}(x) \leq \alpha_{j_p}$ for "good" solutions. These minimal
requirements give some constraints that can be used in Step 8, in the same way
as the analogous constraints coming from selected decision rules.
The rules considered in Step 5 have a syntax corresponding to minimization
of objective functions. In case of maximization of an objective function
$f_j$, the condition concerning this objective in the decision rule should
have the form $f_j(x) \geq \alpha_j$.
Remark, moreover, that the Pareto optimal set reduced in Step 8 by
constraints $f_{j_1}(x) \leq \alpha_{j_1}$, ..., $f_{j_p}(x) \leq \alpha_{j_p}$
is certainly not empty if these constraints come from one decision rule only.
Let us remember that we consider robust rules (see Section 2) and, therefore,
the threshold values $\alpha_{j_1}, \ldots, \alpha_{j_p}$ are values of
objective functions of some solution from the Pareto optimal set. If
$\{j_1, \ldots, j_p\} = \{1, \ldots, n\}$, i.e. $\{j_1, \ldots, j_p\}$ is the
set of all objective functions, then the new reduced part of the Pareto
optimal set contains only one solution x such that $f_1(x) = \alpha_1$, ...,
$f_n(x) = \alpha_n$. If $\{j_1, \ldots, j_p\} \subset \{1, \ldots, n\}$, i.e.
$\{j_1, \ldots, j_p\}$ is a proper subset of the set of all objective
functions, then the new reduced part of the Pareto optimal set contains
solutions satisfying conditions $f_{j_1}(x) \leq \alpha_{j_1}$ and ... and
$f_{j_p}(x) \leq \alpha_{j_p}$. Since the considered rules are robust, there
is at least one solution x satisfying these constraints. When the Pareto
optimal set is reduced in Step 8 by constraints $f_{j_1}(x) \leq \alpha_{j_1}$,
..., $f_{j_p}(x) \leq \alpha_{j_p}$ coming from more than one rule, it is
possible that the resulting reduced part of the Pareto optimal set is empty.
Thus, before passing to Step 9, it is necessary to verify that the reduced
Pareto optimal set is not empty. If the reduced Pareto optimal set is empty,
then the DM is required to revise his/her selection of rules. The DM can be
supported in this task by information about minimal sets of constraints
$f_j(x) \leq \alpha_j$ coming from the considered decision rules to be removed
in order to get a non-empty part of the Pareto optimal set.
The constraints introduced in Step 8 are maintained in the following
iterations of the procedure; however, they cannot be considered as
irreversible. Indeed, the DM can always come back to the Pareto optimal set
considered in one of the previous iterations and continue from this point.
This is in the spirit of a learning-oriented conception of interactive
multiobjective optimization, i.e. it agrees with the idea that the interactive
procedure permits the DM to learn about his/her preferences and about the
"shape" of the Pareto optimal set (see e.g. Chapter 2 and Chapter 15).
5.6 Illustrative Example
To illustrate the interactive multiobjective optimization procedure based on
DRSA, we consider a product mix problem. There are three products: A, B, C
which are produced in quantities denoted by $x_A$, $x_B$ and $x_C$,
respectively. The unit prices of the three products are $p_A = 20$,
$p_B = 30$, $p_C = 25$. The production process involves two machines. The
production times of A, B, C on the first machine are equal to $t_{1A} = 5$,
$t_{1B} = 8$, $t_{1C} = 10$, and on the second machine they are equal to
$t_{2A} = 8$, $t_{2B} = 6$, $t_{2C} = 2$. Two raw materials are used in the
production process. The first raw material has a unit cost of 6 and the
quantity required for production of one unit of A, B and C is $r_{1A} = 1$,
$r_{1B} = 2$ and $r_{1C} = 0.75$, respectively. The second raw material has a
unit cost of 8 and the quantity required for production of one unit of A, B
and C is $r_{2A} = 0.5$, $r_{2B} = 1$ and $r_{2C} = 0.5$, respectively. We
know, moreover, that the market cannot absorb a production greater than 10, 20
and 10 units for A, B and C, respectively. To decide how much of A, B and C
should be produced, the following objectives have to be taken into account:
- Profit (to be maximized),
- Time (total production time on two machines, to be minimized),
- Production of A (to be maximized),
- Production of B (to be maximized),
- Production of C (to be maximized),
- Sales (to be maximized).
The above product mix problem can be formulated as the following
multiobjective optimization problem:
Maximize $20x_A + 30x_B + 25x_C - (1x_A + 2x_B + 0.75x_C) \cdot 6 - (0.5x_A + 1x_B + 0.5x_C) \cdot 8$ [Profit],
Minimize $5x_A + 8x_B + 10x_C + 8x_A + 6x_B + 2x_C$ [Time],
Maximize $x_A$ [Production of A],
Maximize $x_B$ [Production of B],
Maximize $x_C$ [Production of C],
Maximize $20x_A + 30x_B + 25x_C$ [Sales],
subject to:
$x_A \leq 10$, $x_B \leq 20$, $x_C \leq 10$ [Market absorption limits],
$x_A \geq 0$, $x_B \geq 0$, $x_C \geq 0$ [Non-negativity constraints].
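The objective functions of this formulation are easy to evaluate for any
product mix. The following sketch simply transcribes the formulas above; the
function names objectives and feasible are chosen here for illustration. Note
that the Profit expression simplifies to $10x_A + 10x_B + 16.5x_C$ and the
Time expression to $13x_A + 14x_B + 12x_C$.

```python
def objectives(xA, xB, xC):
    """Profit, Time and Sales of a product mix, as formulated in the text."""
    sales = 20 * xA + 30 * xB + 25 * xC
    raw_material_1 = (1 * xA + 2 * xB + 0.75 * xC) * 6
    raw_material_2 = (0.5 * xA + 1 * xB + 0.5 * xC) * 8
    time = (5 * xA + 8 * xB + 10 * xC) + (8 * xA + 6 * xB + 2 * xC)
    return {"Profit": sales - raw_material_1 - raw_material_2,
            "Time": time,
            "Sales": sales}


def feasible(xA, xB, xC):
    """Market absorption limits and non-negativity constraints."""
    return 0 <= xA <= 10 and 0 <= xB <= 20 and 0 <= xC <= 10


print(objectives(0, 0, 10))   # product mix with only C at its market limit
print(objectives(6, 0, 6))
```

The two mixes evaluated here are those listed as s1 and s8 in Table 5.5.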
A sample of representative Pareto optimal solutions has been calculated and
proposed to the DM. Let us observe that the problem we are considering is
a Multiple Objective Linear Programming (MOLP) problem and thus
representative Pareto optimal solutions can be calculated using classical
linear programming, looking for the solutions optimizing each one of the
considered objectives, or fixing all the considered objective functions but
one at a satisfying value and looking for the solution optimizing the
remaining objective function. The set of representative Pareto optimal
solutions is shown in Table 5.5. Moreover, a set of potentially interesting
association rules has been induced from the sample and presented to the DM.
These rules represent strongly supported relationships between attainable
values of objective functions. The association rules are the following (the
ids of the solutions supporting each rule are given in parentheses):
rule 1) if Time $\leq$ 140, then Profit $\leq$ 180.38 and Sales $\leq$ 280.77
  (s1, s2, s3, s4, s12, s13),
rule 2) if Time $\leq$ 150, then Profit $\leq$ 188.08 and Sales $\leq$ 296.15
  (s1, s2, s3, s4, s5, s6, s7, s8, s9, s12, s13),
rule 3) if $x_B \geq 2$, then Profit $\leq$ 209.25 and $x_A \leq 6$ and
  $x_C \leq 7.83$
  (s4, s5, s6, s9, s10, s11, s12, s13),
rule 4) if Time $\leq$ 150, then $x_B \leq 3$
  (s1, s2, s3, s4, s5, s6, s7, s8, s9, s12, s13),
rule 5) if Profit $\geq$ 148.38 and Time $\leq$ 150, then $x_B \leq 2$
  (s1, s2, s3, s5, s7, s8),
rule 6) if $x_A \geq 5$, then Time $\geq$ 150
  (s5, s6, s8, s9, s10, s11),
rule 7) if Profit $\geq$ 127.38 and $x_A \geq 3$, then Time $\geq$ 130
  (s4, s5, s6, s8, s9, s10, s11, s12),
rule 8) if Time $\leq$ 150 and $x_B \geq 2$, then Profit $\leq$ 148.38
  (s4, s5, s6, s9, s12, s13),
rule 9) if $x_A \geq 3$ and $x_C \geq 4.08$, then Time $\geq$ 130
  (s4, s5, s8, s10, s11, s12),
rule 10) if Sales $\geq$ 265.38, then Time $\geq$ 130
  (s2, s3, s4, s5, s6, s7, s8, s9, s10, s11).
Then, the DM has been asked if (s)he was satisfied with one of the proposed
Pareto optimal solutions. Since his/her answer was negative, (s)he was
requested to indicate a subset of "good" solutions, which are indicated in the
"Evaluation" column of Table 5.5.
Table 5.5. A sample of Pareto optimal solutions proposed in the first iteration

Solution  Profit  Time  Prod. A  Prod. B  Prod. C  Sales   Evaluation
s1        165     120   0        0        10       250
s2        172.69  130   0.769    0        10       265.38
s3        180.38  140   1.538    0        10       280.77  good
s4        141.13  140   3        3        4.92     272.92  good
s5        148.38  150   5        2        4.75     278.75  good
s6        139.13  150   5        3        3.58     279.58
s7        188.08  150   2.308    0        10       296.15
s8        159     150   6        0        6        270
s9        140.5   150   6        2        3.67     271.67  good
s10       209.25  200   6        2        7.83     375.83
s11       189.38  200   5        5        5.42     385.42
s12       127.38  130   3        3        4.08     252.08
s13       113.63  120   3        3        3.25     231.25
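The notions of support, base and "holds" from Section 5.4 can be checked
against the sample of Table 5.5. The sketch below is an illustration with an
assumed encoding: a rule is a pair (premise, conclusion) of condition lists,
each condition being (objective, operator, threshold), with the inequality
directions that follow from the optimization direction of each objective. It
is applied to association rules 1) and 6) of the first iteration.

```python
COLS = ("Profit", "Time", "xA", "xB", "xC", "Sales")
ROWS = {  # Table 5.5
    "s1": (165, 120, 0, 0, 10, 250),
    "s2": (172.69, 130, 0.769, 0, 10, 265.38),
    "s3": (180.38, 140, 1.538, 0, 10, 280.77),
    "s4": (141.13, 140, 3, 3, 4.92, 272.92),
    "s5": (148.38, 150, 5, 2, 4.75, 278.75),
    "s6": (139.13, 150, 5, 3, 3.58, 279.58),
    "s7": (188.08, 150, 2.308, 0, 10, 296.15),
    "s8": (159, 150, 6, 0, 6, 270),
    "s9": (140.5, 150, 6, 2, 3.67, 271.67),
    "s10": (209.25, 200, 6, 2, 7.83, 375.83),
    "s11": (189.38, 200, 5, 5, 5.42, 385.42),
    "s12": (127.38, 130, 3, 3, 4.08, 252.08),
    "s13": (113.63, 120, 3, 3, 3.25, 231.25),
}
SAMPLE = {name: dict(zip(COLS, row)) for name, row in ROWS.items()}


def satisfied(conditions, solution):
    return all(solution[k] <= t if op == "<=" else solution[k] >= t
               for k, op, t in conditions)


def supporters(rule):
    premise, conclusion = rule
    return [s for s, v in SAMPLE.items()
            if satisfied(premise + conclusion, v)]


def bases(rule):
    premise, conclusion = rule
    return [s for s, v in SAMPLE.items()
            if all(v[k] == t for k, _, t in premise + conclusion)]


def holds(rule):
    premise, conclusion = rule
    contradicted = any(satisfied(premise, v) and not satisfied(conclusion, v)
                       for v in SAMPLE.values())
    return bool(supporters(rule)) and not contradicted


RULE_1 = ([("Time", "<=", 140)],
          [("Profit", "<=", 180.38), ("Sales", "<=", 280.77)])
RULE_6 = ([("xA", ">=", 5)], [("Time", ">=", 150)])

for rule in (RULE_1, RULE_6):
    print(supporters(rule), bases(rule), holds(rule))
```

Both rules hold in the sample and each has at least one base, so both are
robust in the sense of Section 5.4.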
Taking into account the sorting of Pareto optimal solutions into "good" and
"others", made by the DM, twelve decision rules have been induced from the
lower approximation of "good" solutions. The frequency of the presence of
objectives in the premises of the rules gives a first idea of the importance
of the considered objectives. These frequencies are the following:
- Profit: 4/12,
- Time: 12/12,
- Production of A: 7/12,
- Production of B: 4/12,
- Production of C: 5/12,
- Sales: 5/12.
The following potentially interesting decision rules were presented to the DM:
rule 1) if Profit $\geq$ 140.5 and Time $\leq$ 150 and $x_B \geq 2$,
  then product mix is good (s4, s5, s9),
rule 2) if Time $\leq$ 140 and $x_A \geq 1.538$ and $x_C \geq 10$,
  then product mix is good (s3),
rule 3) if Time $\leq$ 150 and $x_B \geq 2$ and $x_C \geq 4.75$,
  then product mix is good (s4, s5),
rule 4) if Time $\leq$ 140 and Sales $\geq$ 272.9167,
  then product mix is good (s3, s4),
rule 5) if Time $\leq$ 150 and $x_B \geq 2$ and $x_C \geq 3.67$ and
  Sales $\geq$ 271.67, then product mix is good (s4, s5, s9).
Among these decision rules, the DM has selected rule 1) as the most adequate
to his/her preferences. This rule makes it possible to define the following
constraints reducing the feasible region of the production mix problem:
$20x_A + 30x_B + 25x_C - (x_A + 2x_B + 0.75x_C) \cdot 6 - (0.5x_A + x_B + 0.5x_C) \cdot 8 \geq 140.5$
  [Profit $\geq$ 140.5],
$5x_A + 8x_B + 10x_C + 8x_A + 6x_B + 2x_C \leq 150$ [Time $\leq$ 150],
$x_B \geq 2$ [Production of B $\geq$ 2].
These constraints have been considered together with the original constraints
for the production mix problem, and a new sample of representative Pareto
optimal solutions, shown in Table 5.6, has been calculated and presented to
the DM, together with the following potentially interesting association rules:
rule 1) if Time $\leq$ 140, then Profit $\leq$ 174 and $x_C \leq 9.33$ and
  Sales $\leq$ 293.33
  (s5, s6, s7, s8, s9, s10, s11, s12),
rule 2) if $x_A \geq 2$, then $x_B \leq 3$ and Sales $\leq$ 300.83
  (s2, s3, s4, s6, s7, s9),
rule 3) if $x_A \geq 2$, then Profit $\leq$ 172 and $x_C \leq 8$
  (s2, s3, s4, s6, s7, s9),
rule 4) if Time $\leq$ 140, then $x_A \leq 2$ and $x_B \leq 3$
  (s5, s6, s7, s8, s9, s10, s11, s12),
rule 5) if Profit $\geq$ 158.25, then $x_A \leq 2$
  (s1, s3, s4, s5, s6, s8),
rule 6) if $x_A \geq 2$, then Time $\geq$ 130
  (s2, s3, s4, s6, s7, s9),
rule 7) if $x_C \geq 7.17$, then $x_A \leq 2$ and $x_B \leq 2$
  (s1, s3, s5, s6, s8, s10),
rule 8) if $x_C \geq 6$, then $x_A \leq 2$ and $x_B \leq 3$
  (s1, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12),
rule 9) if $x_C \geq 7$, then Time $\geq$ 125 and $x_B \leq 2$
  (s1, s3, s5, s6, s8, s10, s11),
rule 10) if Sales $\geq$ 280, then Time $\geq$ 140 and $x_B \leq 3$
  (s1, s2, s3, s4, s5, s7),
rule 11) if Sales $\geq$ 279.17, then Time $\geq$ 140
  (s1, s2, s3, s4, s5, s6, s7),
rule 12) if Sales $\geq$ 272, then Time $\geq$ 130
  (s1, s2, s3, s4, s5, s6, s7, s8).
The DM has been asked again if (s)he was satisfied with one of the proposed
Pareto optimal solutions. Since his/her answer was negative, (s)he was
requested again to indicate a subset of "good" solutions, which are indicated
in the "Evaluation" column of Table 5.6.
Table 5.6. A sample of Pareto optimal solutions proposed in the second iteration

Solution  Profit  Time  Prod. A  Prod. B  Prod. C  Sales   Evaluation
s1        186.53  150   0.154    2        10       313.08
s2        154.88  150   3        3        5.75     293.75
s3        172     150   2        2        8        300     good
s4        162.75  150   2        3        6.83     300.83  good
s5        174     140   0        2        9.33     293.33
s6        158.25  140   2        2        7.17     279.17  good
s7        149     140   2        3        6        280
s8        160.25  130   0        2        8.5      272     good
s9        144.5   130   2        2        6.33     258.33
s10       153.38  125   0        2        8.08     262.08
s11       145.5   125   1        2        7        255     good
s12       141.56  125   1.5      2        6.46     251.46  good

Taking into account the sorting of Pareto optimal solutions into "good"
and "others", made by the DM, eight decision rules have been induced from
the lower approximation of "good" solutions. The frequencies of the presence
of objectives in the premises of the rules are the following:
- Profit: 2/8,
- Time: 1/8,
- Production of A: 5/8,
- Production of B: 3/8,
- Production of C: 3/8,
- Sales: 2/8.
The following potentially interesting decision rules were presented to the DM:
rule 1) if Time $\leq$ 125 and $x_A \geq 1$, then product mix is good
  (s11, s12),
rule 2) if $x_A \geq 1$ and $x_C \geq 7$, then product mix is good
  (s3, s6, s11),
rule 3) if $x_A \geq 1.5$ and $x_C \geq 6.46$, then product mix is good
  (s3, s4, s6, s12),
rule 4) if Profit $\geq$ 158.25 and $x_A \geq 2$, then product mix is good
  (s3, s4, s6),
rule 5) if $x_A \geq 2$ and Sales $\geq$ 300, then product mix is good
  (s3, s4).
Among these decision rules, the DM has selected rule 4) as the most adequate
to his/her preferences. This rule makes it possible to define the following
constraints reducing the Pareto optimal set of the production mix problem:
$20x_A + 30x_B + 25x_C - (x_A + 2x_B + 0.75x_C) \cdot 6 - (0.5x_A + x_B + 0.5x_C) \cdot 8 \geq 158.25$
  [Profit $\geq$ 158.25],
$x_A \geq 2$ [Production of A $\geq$ 2].
Let us observe that the first constraint is just strengthening an analogous
constraint introduced in the first iteration (Profit $\geq$ 140.5).
Considering the new set of constraints, a new sample of representative
Pareto optimal solutions shown in Table 5.7 has been calculated and presented
to the DM, together with the following potentially interesting association
rules:
rule 1') if Time $\leq 145$, then $x_A \leq 2$ and $x_B \leq 2.74$ and Sales $\leq 290.2$
(s2, s3, s4),
rule 2') if $x_C \geq 6.92$, then $x_A \leq 3$ and $x_B \leq 2$ and Sales $\leq 292.92$
(s3, s4, s5),
rule 3') if Time $\leq 145$, then Profit $\leq 165.13$ and $x_A \leq 2$ and $x_C \leq 7.58$
(s2, s3, s4),
rule 4') if $x_C \geq 6.72$, then $x_B \leq 2.74$
(s2, s3, s4, s5),
rule 5') if Sales $\geq 289.58$, then Profit $\leq 165.13$ and Time $\geq 145$ and
$x_C \leq 7.58$ (s1, s2, s3, s5).
The DM has been asked again if (s)he was satisfied with one of the presented
Pareto optimal solutions shown in Table 5.7 and this time (s)he declared that
solution s3 is satisfactory for him/her. This ends the interactive procedure.
Table 5.7. A sample of Pareto optimal solutions proposed in the third iteration
Solution  Profit  Time   xA  xB    xC    Sales   Evaluation
s1        158.25  150    2   3.49  6.27  301.24
s2        158.25  145    2   2.74  6.72  290.20
s3        165.13  145    2   2     7.58  289.58  selected
s4        158.25  140    2   2     7.17  279.17
s5        164.13  150    3   2     6.92  292.93
s6        158.25  145.3  3   2     6.56  284.02
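An association rule of the kind presented to the DM can be checked mechanically against Table 5.7. The following sketch is only illustrative: it assumes the column order Profit, Time, xA, xB, xC, Sales, and it reads rule 1') as "if Time at most 145, then xA at most 2, xB at most 2.74 and Sales at most 290.2" and rule 4') as "if xC at least 6.72, then xB at most 2.74". The helper `check_rule` is a hypothetical name, not part of the original method.

```python
# Rows of Table 5.7; the column order is an assumption recovered from the text.
FIELDS = ("name", "profit", "time", "x_a", "x_b", "x_c", "sales")
TABLE_5_7 = [
    ("s1", 158.25, 150, 2, 3.49, 6.27, 301.24),
    ("s2", 158.25, 145, 2, 2.74, 6.72, 290.20),
    ("s3", 165.13, 145, 2, 2, 7.58, 289.58),
    ("s4", 158.25, 140, 2, 2, 7.17, 279.17),
    ("s5", 164.13, 150, 3, 2, 6.92, 292.93),
    ("s6", 158.25, 145.3, 3, 2, 6.56, 284.02),
]
rows = [dict(zip(FIELDS, values)) for values in TABLE_5_7]


def check_rule(premise, conclusion):
    """Return (solutions matching the premise, whether the conclusion holds for all of them)."""
    matching = [r for r in rows if premise(r)]
    support = [r["name"] for r in matching]
    return support, all(conclusion(r) for r in matching)


rule_1 = check_rule(
    lambda r: r["time"] <= 145,
    lambda r: r["x_a"] <= 2 and r["x_b"] <= 2.74 and r["sales"] <= 290.2,
)
rule_4 = check_rule(
    lambda r: r["x_c"] >= 6.72,
    lambda r: r["x_b"] <= 2.74,
)
print(rule_1)  # (['s2', 's3', 's4'], True)
print(rule_4)  # (['s2', 's3', 's4', 's5'], True)
```

Both rules hold on every solution matching their premise, and the matching solutions coincide with the supporting solutions listed next to the rules.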
5.7 Characteristics of the IMO-DRSA
The interactive procedure presented in Section 5.5 can be analyzed from the
point of view of input and output information. As to the input, the DM gives
preference information by answering easy questions related to sorting of some
representative solutions into two classes ("good" and "others"). Very often, in
multiple criteria decision analysis in general, and in interactive multiobjective
optimization in particular, the preference information has to be given in terms
of preference model parameters, such as importance weights, substitution rates
and various thresholds (see (Fishburn, 1967) for the Multiple Attribute Utility
Theory and (Roy and Bouyssou, 1993; Figueira et al., 2005a; Brans and
Mareschal, 2005; Martel and Matarazzo, 2005) for outranking methods; for
some well known interactive multiobjective optimization methods requiring
preference model parameters, see the Geoffrion-Dyer-Feinberg method
(Geoffrion et al., 1972), the method of Zionts and Wallenius (1976, 1983) and
the Interactive Surrogate Worth Tradeoff method (Chankong and Haimes,
1978, 1983) requiring information in terms of marginal rates of substitution,
the reference point method (Wierzbicki, 1980, 1982, 1986) requiring a reference
point and weights to formulate an achievement scalarizing function, the
Light Beam Search method (Jaszkiewicz and Słowiński, 1999) requiring
information in terms of weights and indifference, preference and veto thresholds,
being typical parameters of ELECTRE methods). Eliciting such information
requires a significant cognitive effort on the part of the DM. It is generally
acknowledged that people often prefer to make exemplary decisions and cannot
always explain them in terms of specific parameters. For this reason, the idea
of inferring preference models from exemplary decisions provided by the DM
is very attractive. The output result of the analysis is the model of preferences
in terms of "if..., then..." decision rules which is used to reduce the Pareto
optimal set iteratively, until the DM selects a satisfactory solution. The decision
rule preference model is very convenient for decision support, because it gives
argumentation for preferences in a logical form, which is intelligible for the
DM, and identifies the Pareto optimal solutions supporting each particular
decision rule. This is very useful for a critical revision of the original sorting
of representative solutions into the two classes of "good" and "others". Indeed,
the decision rule preference model speaks the same language as the DM without
any recourse to technical terms, like utility, tradeoffs, scalarizing functions
and so on.
All this implies that IMO-DRSA has a transparent feedback organized in
a learning oriented perspective, which permits considering this procedure as
a "glass box", contrary to the "black box" characteristic of many procedures
giving a final result without any clear explanation. Note that with the proposed
procedure, the DM learns about the shape of the Pareto optimal set using
the association rules. They represent relationships between attainable values
of objective functions on the Pareto optimal set in logical and very natural
statements. The information given by association rules is as intelligible as
the decision rule preference model, since they speak the language of the DM
and permit him/her to identify the Pareto optimal solutions supporting each
particular association rule.
Thus, decision rules and association rules give an explanation and a
justification of the final decision, which does not result from a mechanical
application of a certain technical method, but rather from a mature conclusion
of a decision process based on active intervention of the DM.
Observe, finally, that the decision rules representing preferences and the
association rules describing the Pareto optimal set are based on ordinal
properties of objective functions only. Differently from methods involving some
scalarization (almost all existing interactive methods), the proposed procedure
does not, in any step, aggregate the objectives into a single value, thus
avoiding operations (such as averaging, weighted sum, different types of distance,
achievement scalarization) which are always arbitrary to some extent. Remark
that one could use a method based on a scalarization to generate the
representative set of Pareto optimal solutions; nevertheless, the decision rule
approach would continue to be based on ordinal properties of objective functions
only, because the dialogue stage of the method operates on ordinal comparisons
only. In the proposed method, the DM gets clear arguments for his/her decision
in terms of "if ..., then..." decision rules, and verifying whether a proposed
solution satisfies these decision rules is particularly easy. This is not the case
for interactive multiobjective optimization methods based on scalarization. For
example, in the methods using an achievement scalarization function, it is not
evident what it means for a solution to be close to the reference point.
How to justify the choice of the weights used in the achievement function?
What is their interpretation? Observe, instead, that the method proposed in
this chapter operates on data using ordinal comparisons which would not be
affected by any increasing monotonic transformation of scales, and this
ensures the meaningfulness of results from the point of view of measurement
theory (see, e.g., Roberts, 1979).
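The invariance argument can be illustrated with a few lines of code. The sketch below is not taken from the chapter: it uses four made-up objective vectors (minimization of both objectives is assumed here), applies an increasing transformation to each scale, and compares the effect on Pareto dominance (an ordinal comparison) with the effect on an equally weighted sum (a scalarization). All names and data are hypothetical.

```python
import math


def dominates(a, b):
    """True if a Pareto-dominates b when all objectives are minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def dominance_pairs(points):
    """Set of index pairs (i, j) such that points[i] dominates points[j]."""
    return {(i, j)
            for i, a in enumerate(points)
            for j, b in enumerate(points)
            if i != j and dominates(a, b)}


def best_by_weighted_sum(points, weights=(0.5, 0.5)):
    """Index of the point minimizing a weighted sum of the two objectives."""
    return min(range(len(points)),
               key=lambda i: sum(w * f for w, f in zip(weights, points[i])))


# Made-up objective vectors (f1, f2), both positive.
points = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (6.0, 1.0)]
# Increasing transformations of each scale: log for f1, 100 * f2**3 for f2.
rescaled = [(math.log(f1), 100.0 * f2 ** 3) for f1, f2 in points]

print(dominance_pairs(points) == dominance_pairs(rescaled))  # True
print(best_by_weighted_sum(points), best_by_weighted_sum(rescaled))  # 1 3
```

The dominance relation is unchanged by the rescaling, whereas the solution preferred by the weighted sum changes, which is the kind of arbitrariness the ordinal approach avoids.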
With respect to computational aspects of the method, notice that the decision
rules can be calculated efficiently, in a few seconds only, using the algorithms
presented in (Greco et al., 2001a, 2002a). When the number of objective
functions is not too large to be effectively controlled by the DM (let us say seven
plus or minus two, as suggested by Miller (1956)), then the decision rules
can be calculated in a fraction of one second. In any case, the computational
effort grows exponentially with the number of objective functions, but not
with respect to the number of considered Pareto optimal solutions, which can
increase with no particularly negative consequence on calculation time.
5.8 Conclusions
We presented a new interactive multiobjective optimization method using a
decision rule preference model to support interaction with the DM. It uses
the Dominance-based Rough Set Approach, a well grounded methodology
of reasoning about preference ordered data. Due to the transparency and
intelligibility of the transformation of the input information into the preference
model, the proposed method can be qualified as a "glass box", contrary to the
"black box" characteristic of many methods giving a final result without any
clear argumentation. Moreover, the method is purely ordinal, because it does
not make any invasive operations on data, such as averaging or calculation of
a weighted sum, of a distance or of an achievement scalarization.
We believe that the good properties of the proposed method will encourage
further developments and specializations to different decision problems
involving multiple objectives, such as portfolio selection, inventory
management, scheduling and so on.
Acknowledgements
The third author wishes to acknowledge financial support from the Polish
Ministry of Science and Higher Education.
References
Brans, J.P., Mareschal, B.: PROMETHEE methods. In: Figueira, J., Greco, S.,
Ehrgott, M. (eds.) Multiple Criteria Decision Analysis: State of the Art Surveys,
pp. 163–195. Springer, Berlin (2005)
Chankong, V., Haimes, Y.Y.: The interactive surrogate worth trade-off (ISWT)
method for multiobjective decision-making. In: Zionts, S. (ed.) Multiple
Criteria Problem Solving, pp. 42–67. Springer, Berlin (1978)
Chankong, V., Haimes, Y.Y.: Multiobjective Decision Making: Theory and
Methodology. Elsevier Science Publishing Co., New York (1983)
Dyer, J.S.: MAUT-multiattribute utility theory. In: Figueira, J., Greco, S., Ehrgott,
M. (eds.) Multiple Criteria Decision Analysis: State of the Art Surveys,
pp. 265–295. Springer, Berlin (2005)
Figueira, J., Mousseau, V., Roy, B.: ELECTRE methods. In: Figueira, J., Greco, S.,
Ehrgott, M. (eds.) Multiple Criteria Decision Analysis: State of the Art Surveys,
pp. 265–295. Springer, Berlin (2005a)
Figueira, J., Greco, S., Ehrgott, M. (eds.): Multiple Criteria Decision Analysis: State
of the Art Surveys. Springer, Berlin (2005b)
Fishburn, P.C.: Methods of estimating additive utilities. Management Science 13(7),
435–453 (1967)
Geoffrion, A., Dyer, J., Feinberg, A.: An interactive approach for multi-criterion
Management Science 19(4), 357–368 (1972)
Giove, S., Greco, S., Matarazzo, B., Słowiński, R.: Variable consistency monotonic
decision trees. In: Alpigini, J.J., Peters, J.F., Skowron, A., Zhong, N. (eds.)
RSCTC 2002. LNCS (LNAI), vol. 2475, pp. 247–254. Springer, Heidelberg (2002)
Greco, S., Matarazzo, B., Słowiński, R.: Rough approximation of a preference
relation by dominance relations. European J. Operational Research 117, 63–83
(1999a)
Greco, S., Matarazzo, B., Słowiński, R.: The use of rough sets and fuzzy sets in
MCDM. In: Gal, T., Stewart, T., Hanne, T. (eds.) Advances in Multiple Criteria
Decision Making, pp. 14.1–14.59. Kluwer, Boston (1999b)
Greco, S., Matarazzo, B., Słowiński, R., Stefanowski, J.: An algorithm for induction
of decision rules consistent with the dominance principle. In: Ziarko, W., Yao, Y.
(eds.) RSCTC 2000. LNCS (LNAI), vol. 2005, pp. 304–313. Springer, Heidelberg
(2001a)
Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multicriteria decision
analysis. European J. of Operational Research 129, 1–47 (2001b)
Greco, S., Słowiński, R., Stefanowski, J.: Mining association rules in
preference-ordered data. In: Hacid, M.-S., Raś, Z.W., Zighed, A.D.A., Kodratoff, Y. (eds.)
ISMIS 2002. LNCS (LNAI), vol. 2366, pp. 442–450. Springer, Heidelberg (2002a)
Greco, S., Matarazzo, B., Słowiński, R.: Preference representation by means of
conjoint measurement & decision rule model. In: Bouyssou, D., et al. (eds.) Aiding
Decisions with Multiple Criteria: Essays in Honor of Bernard Roy, pp. 263–313.
Kluwer Academic Publishers, Dordrecht (2002b)
Greco, S., Matarazzo, B., Słowiński, R.: Axiomatic characterization of a general
utility function and its particular cases in terms of conjoint measurement and
rough-set decision rules. European J. of Operational Research 158, 271–292 (2004a)
Greco, S., Pawlak, Z., Słowiński, R.: Can Bayesian confirmation measures be useful
for rough set decision rules? Engineering Applications of Artificial Intelligence 17,
345–361 (2004b)
Greco, S., Matarazzo, B., Słowiński, R.: Dominance-based rough set approach to
knowledge discovery (I) general perspective, (II) extensions and applications.
In: Zhong, N., Liu, J. (eds.) Intelligent Technologies for Information Analysis, pp.
513–612. Springer, Berlin (2004c)
Art Surveys, pp. 507–563. Springer, Berlin (2005a)
Greco, S., Matarazzo, B., Słowiński, R.: Generalizing rough set theory through
dominance-based rough set approach. In: Ślęzak, D., Yao, J., Peters, J.F., Ziarko,
W., Hu, X. (eds.) RSFDGrC 2005. LNCS (LNAI), vol. 3642, pp. 1–11. Springer,
Heidelberg (2005b)
Greco, S., Mousseau, V., Słowiński, R.: Ordinal regression revisited: Multiple criteria
ranking with a set of additive value functions. European Journal of Operational
Research 191, 415–435 (2008)
ticriteria decision making: the UTA method. European J. of Operational
Research 10, 151–164 (1982)
Jacquet-Lagrèze, E., Meziani, R., Słowiński, R.: MOLP with an interactive
assessment of a piecewise linear utility function. European J. of Operational
Research 31, 350–357 (1987)
Jaszkiewicz, A., Ferhat, A.B.: Solving multiple criteria choice problems by
interactive trichotomy segmentation. European Journal of Operational Research 113,
271–280 (1999)
Jaszkiewicz, A., Słowiński, R.: The "Light Beam Search" approach - an overview of
300–314 (1999)
Keeney, R.L., Raiffa, H.: Decisions with Multiple Objectives: Preferences and Value
Tradeoffs. John Wiley and Sons, New York (1976)
March, J.G.: Bounded rationality, ambiguity and the engineering of choice. Bell
Journal of Economics 9, 587–608 (1978)
Martel, J.M., Matarazzo, B.: Other outranking approaches. In: Figueira, J., Greco,
S., Ehrgott, M. (eds.) Multiple Criteria Decision Analysis: State of the Art
Surveys, pp. 197–262. Springer, Berlin (2005)
Michalski, R.S., Bratko, I., Kubat, M. (eds.): Machine Learning and Data Mining:
Methods and Applications. Wiley, New York (1998)
Miller, G.A.: The magical number seven, plus or minus two: some limits in our
capacity for processing information. The Psychological Review 63, 81–97 (1956)
Mousseau, V., Słowiński, R.: Inferring an ELECTRE TRI model from assignment
examples. Journal of Global Optimization 12, 157–174 (1998)
Pawlak, Z.: Rough sets. International Journal of Computer and Information
Sciences 11, 341–356 (1982)
Pawlak, Z.: Rough Sets. Kluwer, Dordrecht (1991)
Pawlak, Z., Słowiński, R.: Rough set approach to multi-attribute decision analysis.
European J. of Operational Research 72, 443–459 (1994)
Roberts, F.: Measurement Theory, with Applications to Decision Making, Utility
and the Social Sciences. Addison-Wesley, Boston (1979)
Roy, B., Bouyssou, D.: Aide Multicritère à la Décision: Méthodes et Cas. Economica,
Paris (1993)
Słowiński, R.: Rough set learning of preferential attitude in multi-criteria decision
making. In: Komorowski, J., Raś, Z.W. (eds.) ISMIS 1993. LNCS, vol. 689, pp.
Słowiński, R., Greco, S., Matarazzo, B.: Axiomatization of utility, outranking and
decision-rule preference models for multiple-criteria classification problems under
partial inconsistency with the dominance principle. Control and Cybernetics 31,
1005–1035 (2002a)
Słowiński, R., Greco, S., Matarazzo, B.: Rough set analysis of preference-ordered
data. In: Alpigini, J.J., Peters, J.F., Skowron, A., Zhong, N. (eds.) RSCTC 2002.
LNCS (LNAI), vol. 2475, pp. 44–59. Springer, Heidelberg (2002b)
Słowiński, R., Greco, S., Matarazzo, B.: Rough set based decision support. In: Burke,
E.K., Kendall, G. (eds.) Search Methodologies: Introductory Tutorials in
Optimization and Decision Support Techniques, pp. 475–527. Springer, New York
(2005)
Stefanowski, J.: On rough set based approaches to induction of decision rules. In:
Polkowski, L., Skowron, A. (eds.) Rough Sets in Data Mining and Knowledge
Discovery, vol. 1, pp. 500–529. Physica, Heidelberg (1998)
Steuer, R.E., Choo, E.-U.: An interactive weighted Tchebycheff procedure for
multiple objective programming. Mathematical Programming 26, 326–344 (1983)
Applications. LNEMS, vol. 177, pp. 468–486. Springer, Berlin (1980)
cal Modelling 3, 391–405 (1982)
terizations to vector optimization problems. OR Spektrum 8, 73–87 (1986)
Zionts, S., Wallenius, J.: An interactive multiple objective linear programming
method for a class of underlying nonlinear utility functions. Management
Science 29, 519–523 (1983)
6
Consideration of Partial User Preferences in
Evolutionary Multiobjective Optimization
Jürgen Branke
Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany

branke@aifb.uni-karlsruhe.de
Abstract. Evolutionary multiobjective optimization usually attempts to find a
good approximation to the complete Pareto optimal front. However, often the user
has at least a vague idea about what kind of solutions might be preferred. If such
information is available, it can be used to focus the search, yielding a more
fine-grained approximation of the most relevant (from a user's perspective) areas
of the Pareto optimal front and/or reducing computation time. This chapter surveys
the literature on incorporating partial user preference information in evolutionary
multiobjective optimization.
6.1 Introduction
Most research in evolutionary multiobjective optimization (EMO) attempts
to approximate the complete Pareto optimal front by a set of well-distributed
representatives of Pareto optimal solutions. The underlying reasoning is that
in the absence of any preference information, all Pareto optimal solutions have
to be considered equivalent.
On the other hand, in most practical applications, the decision maker (DM)
is eventually interested in only a single solution. In order to come up with a
single solution, at some point during the optimization process, the DM has to
reveal his/her preferences to choose between mutually non-dominating
solutions. Following a classification by Horn (1997) and Veldhuizen and Lamont
(2000), the articulation of preferences may be done either before (a priori),
during (progressive), or after (a posteriori) the optimization process, see also
Figure 6.1.
Reviewed by: Carlos Coello Coello, CINVESTAV-IPN, Mexico
Salvatore Greco, University of Catania, Italy
Fig. 6.1. Different ways to solve multiobjective problems. [Diagram: in the a priori
approach, full preferences turn the multiobjective problem (MOP) into a
single-objective problem, which a single-objective optimizer solves to give a
solution; with partial preferences, an MOEA yields a biased Pareto front
approximation from which the user selects a solution; in the a posteriori
approach, an optimizer (e.g., an MOEA) yields a Pareto front approximation from
which the user selects a solution.]
A priori approaches aggregate different objectives into a single auxiliary
objective in one way or another, which allows the use of standard optimization
techniques (including single-objective evolutionary algorithms) and usually
results in a single solution. Many classical MCDM methodologies fall into this
category. The most often used aggregation method is probably just a linear
combination of the different objectives. Alternatives would be a lexicographic
ordering of the objectives, or using the distance from a specified target as
the objective. For an example of an approach based on fuzzy rules see Sait et al.
(1999) or Sakawa and Yauchi (1999). As aggregation of objectives turns the
multiobjective problem into a single-objective problem, such evolutionary
algorithms are actually out of the scope of this chapter. A discussion of advantages
and disadvantages of such aggregations can be found in Coello et al. (2002),
Chapter 2.2. In any case, the aggregation of objectives into a single objective
is usually not practical, because it basically requires specifying a ranking of
alternatives before these alternatives are known. Classical MCDM techniques
usually solve this predicament by repeatedly adjusting the auxiliary objective
and re-solving the single-objective problem until the DM is satisfied with the
solution.
Most multiobjective evolutionary algorithms (MOEAs) can be classified as
a posteriori. First, the EA generates a (potentially large) set of non-dominated
solutions, then the DM can examine the possible trade-offs and choose
according to his/her preferences. For an introduction to MOEAs, see Chapter 3. The
most prominent MOEAs are the Non-Dominated Sorting Genetic Algorithm
(NSGA-II, Deb et al., 2002a) and the Strength-Pareto Evolutionary Algorithm
(SPEA-II, Zitzler et al., 2002).
Interactive approaches interleave the optimization with a progressive elic-
itation of user preferences. These approaches are discussed in detail in Chap-
ter 7.
In the following, we consider an intermediate approach (middle path in
Figure 6.1). Although we agree that it may be impractical for a DM to
completely specify his or her preferences before any alternatives are known, we
assume that the DM has at least a rough idea about what solutions might
be preferred, and can specify partial preferences. The methods discussed here
aim at integrating such imprecise knowledge into the EMO approach, biasing
the search towards solutions that are considered relevant by the DM. The
goal is no longer to generate a good approximation to all Pareto optimal
solutions, but a small set of solutions that contains the DM's preferred solution
with the highest probability. This may yield three important advantages:
1. Focus: Partial user preferences may be used to focus the search and
generate a subset of all Pareto optimal alternatives that is particularly
interesting to the DM. This avoids overwhelming the DM with a huge set of
(mostly irrelevant) alternatives.
2. Speed: By focusing the search onto the relevant part of the search space,
one may expect the optimization algorithm to find these solutions more
quickly, not wasting computational effort to identify Pareto optimal but
irrelevant solutions.
3. Gradient: MOEAs require some quality measure for solutions in order
to identify the most promising search direction (gradient). The most
important quality measure used in MOEAs is Pareto dominance. However,
with an increasing number of objectives, more and more solutions
become incomparable, rendering Pareto dominance less useful as a fitness
criterion and resulting in a severe performance loss of MOEAs (e.g., Deb
et al., 2002b). Incorporating (partial) user preferences introduces
additional preference relations, restoring the necessary fitness gradient
information to some extent and ensuring the MOEA's progress.
To reach these goals, the MOEA community can accommodate or be inspired by
many of the classical MCDM methodologies covered in Chapters 1 and 2, as
those generally integrate preference information into the optimization process.
Thus, combining MOEAs, with their ability to generate multiple alternatives
simultaneously in one run, and classical MCDM methodologies, with their ways
to incorporate user preferences, holds great promise.
The literature contains quite a few techniques to incorporate full or
partial preference information into MOEAs, and previous surveys on this topic
include Coello (2000); Rachmawati and Srinivasan (2006), and Coello et al.
(2002). In the following, we classify the different approaches based on the
type of partial preference information they ask from the DM, namely
objective scaling (Section 6.2), constraints (Section 6.3), a goal or reference point
(Section 6.4), trade-off information (Section 6.5), or weighted performance
measures (Section 6.6 on approaches based on marginal contribution). Some
additional approaches are summarized in Section 6.7. The chapter concludes
with a summary in Section 6.8.
6.2 Scaling
One of the often claimed advantages of MOEAs is that they do not require
an a priori specification of user preferences because they generate a good
approximation of the whole Pareto front, allowing the DM to pick his/her
preferred solution afterwards. However, the whole Pareto optimal front may
contain very many alternatives, in which case MOEAs can only hope to find
a representative subset of all Pareto optimal solutions. Therefore, all basic
EMO approaches attempt to generate a uniform distribution of representatives
along the Pareto front. For this goal, they rely on distance information in the
objective space, be it in the crowding distance of NSGA-II or in the clustering
of SPEA-II¹. Thus, what is considered uniform depends on the scaling of
the objectives. This is illustrated in Figure 6.2. The left panel (a) shows an
evenly distributed set of solutions along the Pareto front. Scaling the second
objective by a factor of 100 (e.g., using centimeters instead of meters as unit)
leads to a bias of the distribution, with more solutions along the part of the
front parallel to the axis of the second objective (right panel (b)). Note that
depending on the shape of the front, this means that there is a bias towards
objective 1 (as in the convex front in Figure 6.2), or objective 2 (if the front
is concave). So, the user-defined scaling is actually a usually ignored form of
user preference specification necessary also for MOEAs.
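The dependence on scaling can be seen directly in the crowding distance. The sketch below is a simplified, unnormalized variant (many implementations additionally divide by the range of each objective, which is precisely what removes this dependence); the front and the function name are made up for illustration.

```python
def crowding_distance(front):
    """Unnormalized crowding distance: per objective, the gap between the two
    neighbours of each solution, summed over objectives. Boundary solutions
    receive an infinite distance."""
    n = len(front)
    dist = [0.0] * n
    for m in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][m])
        dist[order[0]] = dist[order[-1]] = float("inf")
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][m] - front[order[pos - 1]][m]
            dist[order[pos]] += gap
    return dist


# A made-up convex front (f1, f2) for a minimization problem.
front = [(0.0, 10.0), (1.0, 4.0), (3.0, 2.0), (6.0, 1.0), (10.0, 0.0)]
# The same front with the second objective expressed in a 100 times finer unit.
scaled = [(f1, 100.0 * f2) for f1, f2 in front]

plain = crowding_distance(front)
rescaled = crowding_distance(scaled)
print(plain[1:4])     # [11.0, 8.0, 9.0]
print(rescaled[1:4])  # [803.0, 305.0, 207.0]
```

Before scaling, the fourth solution is considered less crowded than the third; after scaling, the order is reversed, so a selection scheme that prefers less crowded solutions would favour a different part of the front.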
Many current implementations of MOEAs (e.g., NSGA-II and SPEA) scale
objectives based on the solutions currently in the population (see, e.g., Deb
(2001), p. 248). While this results in nice visualizations if the front is plotted
with a 1:1 ratio, and relieves the DM from specifying a scaling, it assumes
that the ranges of values covered by the Pareto front in each objective are equally
important. Whether this assumption is justified certainly depends strongly on
the application and the DM's preferences.
Fig. 6.2. Influence of scaling on the distribution of solutions along the Pareto front.
[Two panels: in (a) both axes range from 0.4 to 2; in (b) the vertical axis ranges
from 40 to 200.]
¹ The rest of this section assumes familiarity with the crowding distance concept.
Readers unfamiliar with this concept are referred to, e.g., Deb (2001) or Chapter 3.
In order to find a biased distribution anywhere on the Pareto optimal
front, a previous study by Deb (2003) used a biased sharing² mechanism
implemented on NSGA. In brief, the objectives are scaled according to
preferences when calculating the distances. This makes distances in one
objective appear larger than they are, with a corresponding change in the
resulting distribution of individuals. Although this allows focusing on one
objective or another, the approach does not allow focusing on a compromise
region (for equal weighting of the objectives, the algorithm would produce no
bias at all).
In Branke and Deb (2005), the biased sharing mechanism has been
extended with a better control of the region of interest and a separate
parameter controlling the strength of the bias. For a solution i on a particular front,
the biased crowding distance measure $D_i$ is re-defined as follows. Let $\eta$ be
a user-specified direction vector indicating the most probable, or central,
linearly weighted utility function, and let $\alpha$ be a parameter controlling the bias
intensity. Then,
\[ D_i = d_i \left( \frac{d'_i}{d_i} \right)^{\alpha}, \tag{6.1} \]
where $d_i$ and $d'_i$ are the original crowding distance and the crowding distance
calculated based on the locations of the individuals projected onto the
(hyper)plane with direction vector $\eta$, respectively. Figure 6.3 illustrates the concept.
As a result, for a solution in a region of the Pareto optimal front more or
less parallel to the projected plane (such as solution a), the original crowding
distance $d_a$ and projected crowding distance $d'_a$ are more or less the same,
thereby making the ratio $d'_a/d_a$ close to one. On the other hand, for a
solution in an area of the Pareto optimal front where the tangent has an
orientation significantly different from the chosen plane (such as solution b), the
projected crowding distance $d'_b$ is much smaller than the original crowding
distance $d_b$. For such a solution, the biased crowding distance value $D_i$ will
be a small quantity, meaning that such a solution is assumed to be artificially
crowded by neighboring solutions. A preference of solutions having a larger
biased crowding distance $D_i$ will then enable solutions closer to the tangent
point to be found. The exponent $\alpha$ controls the extent of the bias, with larger
$\alpha$ resulting in a stronger bias.
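A minimal sketch of the biased crowding distance for two objectives is given below. It rests on several assumptions that are not stated in the text: the direction vector is taken as the normal of the projection line, the crowding distance of a solution is simplified to the Euclidean distance between its two neighbours along the front, and boundary solutions keep an infinite distance. The names `biased_crowding`, `eta` and `alpha` are chosen for illustration only.

```python
import math


def biased_crowding(front, eta, alpha):
    """Biased crowding distance D_i = d_i * (d'_i / d_i) ** alpha.

    Simplifying assumptions: two objectives, distinct solutions, d_i is the
    Euclidean distance between the two neighbours of i along the front, and
    d'_i is the distance between the projections of these neighbours onto the
    line through the origin with normal vector eta."""
    norm = math.hypot(eta[0], eta[1])
    unit = (eta[0] / norm, eta[1] / norm)

    def project(p):
        # Remove the component of p along the unit normal.
        s = p[0] * unit[0] + p[1] * unit[1]
        return (p[0] - s * unit[0], p[1] - s * unit[1])

    order = sorted(range(len(front)), key=lambda i: front[i][0])
    biased = [math.inf] * len(front)
    for pos in range(1, len(order) - 1):
        lower, upper = front[order[pos - 1]], front[order[pos + 1]]
        d = math.dist(lower, upper)
        d_projected = math.dist(project(lower), project(upper))
        biased[order[pos]] = d * (d_projected / d) ** alpha
    return biased


# Made-up convex front; both interior solutions have the same unbiased distance.
front = [(0.0, 4.0), (1.0, 2.0), (2.0, 1.0), (4.0, 0.0)]
D = biased_crowding(front, eta=(2.0, 3.0), alpha=2.0)
print(D[1] < D[2])  # True: the second interior solution is favoured
```

In this example the neighbours of the third solution lie parallel to the projection line, so its distance is unchanged, while the second solution is treated as artificially crowded. With `alpha = 0` the bias disappears and both interior solutions get the same value.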
Note that biased crowding will focus on the area of the Pareto optimal front
which is parallel to the iso-utility function defined by the provided direction
vector $\eta$. For a convex Pareto optimal front, that is just the area around
the optimal solution regarding a corresponding aggregate cost function. For
a concave region, such an aggregate cost function would always prefer one of
the edge points, while biased crowding may focus on the area in between.
² The sharing function in NSGA fulfills the same functionality as the crowding
distance in NSGA-II, namely to ensure a diversity of solutions.
Fig. 6.3. The biased crowding approach is illustrated on a two-objective
minimization problem (Branke and Deb, 2005). [Plot with axes f1 and f2 showing
points a and b on the Pareto-optimal front, their crowding distances d and d',
and the projected plane.]
Trautmann and Mehnen (2005) suggest an explicit incorporation of
preferences into the scaling. They propose to map the objectives into the range [0, 1]
according to desirability functions. With one-sided sigmoid (monotone)
desirability functions, the non-dominance relations are not changed. Therefore,
the solutions found are always also non-dominated in the original objective
space. What changes is the distribution along the front. Solutions that are in
flat parts of the desirability function receive very similar desirability values
and, as MOEAs then attempt to spread solutions evenly in the desirability
space, this will result in a more spread out distribution in the original
objective space. However, in order to specify the desirability functions in a sensible
manner, it is necessary to at least know the ranges of the Pareto front.
6.3 Constraints
Often, the DM can formulate preferences in the form of constraints, for
example "Criterion 1 should be less than a given threshold". Handling constraints
is a well-researched topic in evolutionary algorithms in general, and most of the
techniques carry over to EMO in a straightforward manner. One of the simplest
and most common techniques is probably to rank infeasible solutions
according to their degree of infeasibility, and inferior to all feasible solutions (Deb,
2000; Jiménez and Verdegay, 1999). A detailed discussion of constraint
handling techniques is out of the scope of this chapter. Instead, the interested
reader is referred to Coello (2002) for a general survey on constraint
handling techniques, and Deb (2001), Chapter 7, for a survey with focus on EMO
techniques.
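The simple ranking idea mentioned above can be sketched as a pairwise comparison. The code below is an illustrative assumption rather than the exact procedure of any of the cited papers: preferences are given as optional upper limits on the objectives, the degree of infeasibility is the summed excess over these limits, and feasible solutions are compared by ordinary Pareto dominance (minimization). All names are hypothetical.

```python
def dominates(a, b):
    """True if a Pareto-dominates b when all objectives are minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def violation(f, limits):
    """Summed excess of objective values over their upper limits (None = no limit)."""
    return sum(max(0.0, fi - li) for fi, li in zip(f, limits) if li is not None)


def constrained_better(a, b, limits):
    """a is better than b if both are feasible and a dominates b, or if a
    violates the limits less than b (a feasible solution has violation 0)."""
    va, vb = violation(a, limits), violation(b, limits)
    if va == 0 and vb == 0:
        return dominates(a, b)
    return va < vb


# "Criterion 1 should be at most 5"; no limit on criterion 2.
limits = (5.0, None)
print(constrained_better((4.0, 9.0), (6.0, 1.0), limits))  # True: feasible beats infeasible
print(constrained_better((6.0, 1.0), (7.0, 0.0), limits))  # True: smaller violation
```

Used inside an MOEA, such a comparison replaces plain Pareto dominance when ranking the population, so the search is drawn into the region the DM declared acceptable.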
6.4 Providing a Reference Point
Perhaps the most important way to provide preference information is a
reference point, a technique that has a long tradition in multicriteria decision
making, see, e.g., Wierzbicki (1977, 1986) and also Chapter 2. A reference
point consists of aspiration levels reflecting desirable values for the objective
functions, i.e., a target the user is hoping for. Such information can then
be used in different ways to focus the search. However, it should not lead to
a dominated solution being preferred over the dominating solution.
The use of a reference point to guide the EMO algorithm was first
proposed in Fonseca and Fleming (1993). The basic idea there is to give a
higher priority to objectives in which the goal is not fulfilled. Thus, when
deciding whether a solution x is preferable to a solution y or not, first, only
the objectives in which solution x does not satisfy the goal are considered,
and x is preferred to y if it dominates y on these objectives. If x is equal to y
in all these objectives, or if x satisfies the goal in all objectives, x is preferred
over y either if y does not fulfill some of the objectives fulfilled by x, or if x
dominates y on the objectives fulfilled by x. More formally, this can be stated
as follows. Let r denote the reference point, and let there be m objectives,
without loss of generality sorted such that x fulfills objectives k + 1 . . . m but
not objectives 1 . . . k, i.e.,
\[ f_i(x) > r_i \quad \forall i = 1, \dots, k \tag{6.2} \]
\[ f_i(x) \leq r_i \quad \forall i = k+1, \dots, m. \tag{6.3} \]
Then, x is preferred to y if and only if
\[ x \prec_{1 \dots k} y \;\vee\; \Big( x =_{1 \dots k} y \;\wedge\; \big[ (\exists\, l \in [k+1 \dots m] : f_l(y) > r_l) \vee (x \prec_{k+1 \dots m} y) \big] \Big) \tag{6.4} \]
with x i...j y meaning that solution x dominates solution y on objectives i to

j (i.e., for minimization problems as considered here, fk (x fk (yk = i . . . j
with at least one strict inequality). A slightly extended version that allows
the decision maker to additionally assign priorities to objectives has been
published in Fonseca and Fleming (1998). This publication also contains the
proof that the proposed preference relation is transitive. Figure 6.4 visualizes
what part of the Pareto front remains preferred depending on whether the
reference point is reachable (a) or not (b). If the goal has been set so ambitious
that there is no solution which can reach the goal in even a single objective,
the goal has no eect on search, and simply the whole Pareto front is returned.
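The preference relation described above can be sketched in a few lines of
Python. This is only an illustrative reading of the verbal definition, assuming
minimization and objective vectors given as sequences; the function names
`dominates` and `ff_preferred` are hypothetical and do not come from Fonseca and
Fleming (1993).

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimization; False when there are no objectives."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def ff_preferred(fx: Sequence[float], fy: Sequence[float], r: Sequence[float]) -> bool:
    """Return True if objective vector fx is preferred to fy for goal vector r."""
    unmet = [i for i in range(len(r)) if fx[i] > r[i]]   # goals x does not fulfill
    met = [i for i in range(len(r)) if fx[i] <= r[i]]    # goals x fulfills
    x_u, y_u = [fx[i] for i in unmet], [fy[i] for i in unmet]
    if dominates(x_u, y_u):
        return True
    if x_u != y_u:
        return False
    # x and y are equal on the unmet goals (or x fulfills every goal).
    y_violates = any(fy[i] > r[i] for i in met)
    x_m, y_m = [fx[i] for i in met], [fy[i] for i in met]
    return y_violates or dominates(x_m, y_m)


goal = (5, 5)
print(ff_preferred((3, 4), (6, 1), goal))  # x meets all goals, y misses goal 1
print(ff_preferred((2, 4), (4, 2), goal))  # both meet the goals, neither dominates
```

Because the comparison first looks only at the unmet goals, a solution inside
the goal region is never beaten by one outside it, while ordinary dominance
still applies among solutions that all fulfill the goals.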
Fig. 6.4. Part of the Pareto optimal front that remains optimal with a given
reference point r and the preference relation from Fonseca and Fleming (1993). The
left panel (a) shows a reachable reference point, while the right panel (b) shows an
unreachable one. Both panels have axes f1 and f2. Minimization of objectives is
assumed.

In Deb (1999), a simpler variant has been proposed which simply ignores
improvements over a goal value by replacing a solution's objective value $f_i(x)$
by $\max\{f_i(x), r_i\}$. If the goal vector r is outside the feasible range, the method
is almost identical to the definition in Fonseca and Fleming (1993). However, if
the goal can be reached, the approach from Deb (1999) will lose its selection
pressure and basically stop the search as soon as the reference point has been
found, i.e., return a solution which is not Pareto optimal. On the other hand,
the approach from Fonseca and Fleming (1993) keeps improving beyond the
reference point. The goal-programming idea has been extended in Deb (2001)
to allow for reference regions in addition to reference points.
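The loss of selection pressure in the variant of Deb (1999) is easy to see in a
small sketch. The helper names `clip_to_goal` and `dominates` are hypothetical,
and the numbers are made up for illustration only.

```python
def clip_to_goal(f, r):
    """Replace each objective value f_i by max(f_i, r_i); gains beyond r are ignored."""
    return tuple(max(fi, ri) for fi, ri in zip(f, r))


def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


r = (5.0, 5.0)                    # a reachable goal vector
a, b = (2.0, 3.0), (4.0, 4.0)     # a dominates b, and both are better than r

print(dominates(a, b))                                    # original objectives
print(clip_to_goal(a, r), clip_to_goal(b, r))             # both collapse onto r
print(dominates(clip_to_goal(a, r), clip_to_goal(b, r)))  # no pressure left
```

Once every objective of two solutions lies at or below the goal, their clipped
vectors coincide with r, so the better solution can no longer be distinguished.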
Tan et al. (1999) proposed another ranking scheme which in a first stage
prefers individuals fulfilling all criteria, and ranks those individuals according
to standard non-dominance sorting. Among the remaining solutions, solution
x dominates solution y if and only if x dominates y with respect to the
objectives in which x does not fulfill the goal (as in Fonseca and Fleming
(1993)), or if $|x - r| \prec |y - r|$. The latter corresponds to a mirroring of
the objective vector along the axis of the fulfilled criteria. This may lead to
some strange effects, such as non-transitivity of the preference relation (x is
preferred to y, and y to z, but x and z are considered equal). Also, it seems odd
to penalize solutions for largely exceeding a goal. What is more interesting
in Tan et al. (1999) is the suggestion on how to account for multiple reference
points, connected with AND and OR operations. The idea here is to rank the
solutions independently with respect to all reference points. Then, rankings
are combined as follows. If two reference points are connected by an AND
operator, the rank of the solution is the maximum of the ranks according
to the individual reference points. If the operator is an OR, the rank of the
solution is the minimum of the ranks according to the individual reference
points. This idea of combining the information of several reference points
can naturally be combined with other preference relations using a reference
point. The paper also presents a way to prioritize objectives by introducing
additional goals. In effect, however, the prioritization is equivalent to the one
proposed in Fonseca and Fleming (1998).
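The AND/OR combination of rankings can be written down directly. The sketch
below assumes that the ranks with respect to each reference point have already
been computed (lower rank is better); `combine_ranks` is a hypothetical name
used only for this illustration.

```python
def combine_ranks(rank_lists, op):
    """Combine per-reference-point ranks of the same solutions.

    rank_lists[p][s] is the rank of solution s with respect to reference point p.
    AND takes the maximum (worst) rank, OR takes the minimum (best) rank.
    """
    pick = {"AND": max, "OR": min}[op]
    return [pick(ranks) for ranks in zip(*rank_lists)]


ranks_r1 = [1, 2, 3]   # ranks of three solutions for reference point 1
ranks_r2 = [3, 1, 2]   # ranks of the same solutions for reference point 2
print(combine_ranks([ranks_r1, ranks_r2], "AND"))
print(combine_ranks([ranks_r1, ranks_r2], "OR"))
```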
In Deb and Sundar (2006) and Deb et al. (2006), the crowding distance
calculation in NSGA-II is replaced by the distance to the reference point, where
solutions with a smaller distance are preferred. More specifically, solutions
with the same non-dominated rank are sorted with respect to their distance
to the reference point. Furthermore, to control the extent of obtained
solutions, all solutions having a distance of ε or less between them are grouped.
Only one randomly picked solution from each group is retained, while all other
group members are assigned a large rank to discourage their use. Like the
approaches of Fonseca and Fleming (1998) and Tan et al. (1999), this approach is
able to improve beyond a reference point within the feasible region, because the
non-dominated sorting keeps driving the population to the Pareto optimal front.
Also, like Tan et al. (1999), it can handle multiple reference points
simultaneously. With the parameter ε, it is possible to explicitly influence the
diversity of solutions returned. Whether this extra parameter is an advantage or
a burden may depend on the application.
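A rough sketch of the grouping step within one non-dominated rank is given
below. It assumes plain Euclidean distance in objective space and a greedy
grouping around randomly picked representatives; the function name
`rp_niching`, the greedy grouping order, and the fixed random seed are
assumptions of this illustration rather than details taken from Deb and Sundar
(2006).

```python
import math
import random


def rp_niching(front, ref, eps, rng=None):
    """Return (retained, demoted) index lists for solutions of one Pareto rank.

    Solutions closer than eps to a randomly picked representative are demoted;
    the retained representatives are sorted by their distance to ref.
    """
    rng = rng or random.Random(0)            # fixed seed only for reproducibility
    remaining = list(range(len(front)))
    retained, demoted = [], []
    while remaining:
        pick = rng.choice(remaining)         # random representative of a new group
        group = [j for j in remaining
                 if j != pick and math.dist(front[j], front[pick]) <= eps]
        retained.append(pick)
        demoted.extend(group)                # would receive a large rank
        remaining = [j for j in remaining if j != pick and j not in group]
    retained.sort(key=lambda j: math.dist(front[j], ref))
    return retained, demoted


front = [(0.0, 1.0), (0.05, 0.95), (1.0, 0.0)]
retained, demoted = rp_niching(front, ref=(0.0, 0.9), eps=0.2)
print(retained, demoted)
```

A larger eps merges more neighbours into one group and therefore spreads the
retained solutions more widely around the reference point.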
Yet another dominance scheme was recently proposed in Molina et al.
(2009), where solutions fulfilling all goals and solutions fulfilling none of the
goals are preferred over solutions fulfilling only some of the goals. This, again,
drives the search beyond the reference point if it is feasible, but it can obviously
lead to situations where a solution which is dominated (fulfilling none of the
goals) is actually preferred over the solution that dominates it (fulfilling some
of the goals).
Thiele et al. (2007) integrate reference point information into the Indicator-
Based Evolutionary Algorithm, see Section 6.6 for details.
The classical MCDM literature also includes some approaches where, in
addition to a reference point, some further indicators are used to generate a
set of alternative solutions. These include the reference direction method
(Korhonen and Laakso, 1986) and light beam search (Jaszkiewicz and Slowinski,
1999). Recently, these methods have also been adopted into MOEAs.
In brief, the reference direction method allows the user to specify a
starting point and a reference point, with the difference of the two defining the
reference direction. Then, several points on this vector are used to define a
set of achievement scalarizing functions, and each of these is used to search
for a point on the Pareto optimal frontier. In Deb and Kumar (2007a), an
MOEA is used to search for all these points simultaneously. For this
purpose, the NSGA-II ranking mechanism has been modified to focus the search
accordingly.
The light beam search also uses a reference direction, and additionally
asks the user for some thresholds which are then used to find some possibly
interesting neighboring solutions around the (according to the reference
direction) most preferred solution. Deb and Kumar (2007b) use an MOEA to
simultaneously search for a number of solutions in the neighborhood of the
solution defined by the reference direction. This is achieved by first
identifying the "most preferred" or "middle" solution using an achievement
scalarizing function based on the reference point. Then, a modified crowding
distance calculation is used to focus the search on those solutions which are not
worse by more than the allowed threshold in all the objectives.
Summarizing, the first approach proposed in Fonseca and Fleming (1993)
still seems to be a good way to include reference point information. While in
most approaches the part of the Pareto optimal front considered as relevant
depends on the reference point and the shape and location of the Pareto
optimal front, in Deb and Sundar (2006) the desired spread of solutions in the
vicinity of the Pareto optimal solution closest to the reference point is specified
explicitly. The schemes proposed by Tan et al. (1999) and Deb and Sundar
(2006) allow several reference points to be considered simultaneously. The MOEAs
based on the reference direction and light beam search (Deb and Kumar,
2007a,b) allow the user to specify additional information that influences the
focus of the search and the set of solutions returned.
6.5 Limit Possible Trade-offs

If the user has no idea of what kind of solutions may be reachable, it may
be easier to specify suitable trade-offs, i.e., how much gain in one objective is
necessary to balance the loss in the other.
Greenwood et al. (1997) suggested a procedure which asks the user to rank
a few alternatives, and from this derives constraints for linear weighting of the
objectives consistent with the given ordering. Then, these are used to check
whether there is a feasible linear weighting such that solution x is preferable
to solution y. More specifically, if the DM prefers a solution with objective
values f(x) to a solution with objective values f(y), then, assuming linearly
weighted additive utility functions and minimization of objectives, we know
that

$$\sum_{k=1}^{n} w_k \left( f_k(x) - f_k(y) \right) < 0 \tag{6.5}$$
$$\sum_{k=1}^{n} w_k = 1, \quad w_k \ge 0.$$
Let A denote the set of all pairs of solutions (x, y) ranked by the DM, with
x preferred to y. Then, to compare any two solutions u and v, all linearly
weighted additive utility functions are considered which are consistent with
the ordering on the initially ranked solutions, i.e., consistent with
Inequality 6.5 for all pairs of solutions (x, y) ∈ A. A preference of u over v is
inferred if u is preferred to v for all such utility functions. A linear program
(LP) is used to search for a utility function where u is not preferred to v.

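$$\min Z = \sum_{k=1}^{n} w_k \left( f_k(u) - f_k(v) \right) \tag{6.6}$$
subject to
$$\sum_{k=1}^{n} w_k \left( f_k(x) - f_k(y) \right) < 0 \quad \forall (x, y) \in A \tag{6.7}$$
$$\sum_{k=1}^{n} w_k = 1, \quad w_k \ge 0.$$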
If the LP returns a solution value Z > 0, we know there is no linear
combination of objectives consistent with Inequality 6.7 such that u would be
preferable, and we can conclude that v is preferred over u. If the LP can find
a linear combination with Z < 0, it only means that v is not preferred to
u. To test whether u is preferred to v, one has to solve another LP and fail
to find a linear combination of objectives such that v would be preferable.
Overall, the method requires solving 1 or 2 LPs for each pair of solutions
in the population. Also, it needs special mechanisms to make sure that the
allowed weight space does not become empty, i.e., that the user ranking is
consistent with at least one possible linear weight assignment. The authors
suggest using a mechanism from White et al. (1984) which removes a minimal
set of the DM's preference statements to make the weight space non-empty.
Note that although linear combinations of objectives are assumed, it is
possible to identify a concave part of the Pareto front, because the comparisons
are only pair-wise. A more general framework for inferring preferences from
examples (allowing for piecewise linear additive utility functions rather than
linear additive utility functions) is discussed in Chapter 4.
In the guided MOEA proposed in Branke et al. (2001), the user is allowed
to specify preferences in the form of maximally acceptable trade-offs like "one
unit improvement in objective i is worth at most $a_{ji}$ units in objective j". The
basic idea is to modify the dominance criterion accordingly, so that it reflects
the specified maximally acceptable trade-offs. A solution x is now preferred to
a non-dominated solution y if the gain in the objective where y is better does
not outweigh the loss in the other objective, see Figure 6.5 for an example.
The region dominated by a solution is adjusted by changing the slope of the
boundaries according to the specified maximal and minimal trade-offs. In this
example, Solution A is now dominated by Solution B, because the loss in
Objective 2 is too big to justify the improvement in Objective 1. On the other
hand, Solutions D and C are still mutually non-dominated.
This idea can be implemented by a simple transformation of the objectives:
It is sufficient to replace the original objectives with two auxiliary objectives
$\Omega_1$ and $\Omega_2$ and use these together with the standard dominance principle,
where
$$\Omega_1(x) = f_1(x) + \frac{1}{a_{21}} f_2(x)$$
$$\Omega_2(x) = \frac{1}{a_{12}} f_1(x) + f_2(x)$$
See Figures 6.6 and 6.7 for a visualization.
Because the transformation is so simple, the guided dominance scheme can
be easily incorporated into standard MOEAs based on dominance, and it
changes neither the complexity nor the inner workings of the algorithm. However,
an extension of this simple idea to more than two dimensions seems difficult.
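The transformation can be illustrated with a short sketch. It assumes the
auxiliary objectives have the form Ω1 = f1 + f2/a21 and Ω2 = f1/a12 + f2 and
that both objectives are minimized; the function names `guided_objectives` and
`guided_dominates` are hypothetical.

```python
def dominates(a, b):
    """Standard Pareto dominance for minimization."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def guided_objectives(f, a12, a21):
    """Map (f1, f2) to the auxiliary objectives (Omega1, Omega2)."""
    f1, f2 = f
    return (f1 + f2 / a21, f1 / a12 + f2)


def guided_dominates(fx, fy, a12, a21):
    """Standard dominance applied in the transformed objective space."""
    return dominates(guided_objectives(fx, a12, a21),
                     guided_objectives(fy, a12, a21))


x, y = (1.0, 4.0), (4.0, 3.5)          # mutually non-dominated in (f1, f2)
print(dominates(x, y), dominates(y, x))
print(guided_dominates(x, y, a12=2.0, a21=2.0))   # y gains too little in f2
```

Very large values of a12 and a21 make the auxiliary objectives approach the
original ones, so the guided relation then reduces to ordinary dominance.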
Although developed independently and with a different motivation, the
guided MOEA can lead to the same preference relation as the imprecise value
function approach in Greenwood et al. (1997) discussed above. A maximally
acceptable trade-off of the form "one unit improvement in objective i is worth
at most $a_{ji}$ units in objective j" could easily be transformed into the constraint
$$-w_i + a_{ji} w_j < 0 \tag{6.8}$$
or
$$\frac{w_i}{w_j} > a_{ji}. \tag{6.9}$$
The differences are in the way the maximally acceptable trade-offs are
derived (specified directly by the DM in the guided MOEA, and inferred from
a ranking of solutions in Greenwood et al. (1997)), and in the different
implementation (a simple transformation of objectives in the guided MOEA, and
the solving of many LPs in the imprecise value function approach). While the
guided MOEA is more elegant and computationally efficient for two
objectives, the imprecise value function approach works independently of the number
of objectives.
Fig. 6.5. Effect of the modified dominance scheme used by G-MOEA. Axes: f1 =
criterion 1 and f2 = criterion 2; the boundaries of the dominated region are
labelled with slopes 1/a12 and a21.

Fig. 6.6. When the guided dominance principle is used, the non-dominated region
of the Pareto optimal front is bounded by the two solutions p and q where the
trade-off functions are tangent (Branke and Deb, 2005).

Fig. 6.7. The guided dominance principle is equivalent to the original
dominance principle and an appropriately transformed objective space (Branke
and Deb, 2005).

The idea proposed in Jin and Sendhoff (2002) is to aggregate the different
objectives into one objective via weighted summation, but to vary the
weights gradually over time during the optimization. For two objectives, it is
suggested to set $w_1(t) = |\sin(2\pi t/F)|$ and $w_2(t) = 1 - w_1(t)$, where t is the
generation counter and F is a parameter to influence the oscillation period.
The range of weights used in this process can be easily restricted to reflect
the preferences of the DM by specifying a maximal and minimal weight $w_1^{max}$
and $w_1^{min}$, setting $w_1(t) = w_1^{min} + (w_1^{max} - w_1^{min}) \cdot (\sin(2\pi t/F) + 1)/2$ and
adjusting $w_2$ accordingly. The effect is a population moving along the Pareto
front, covering the part of the front which is optimal with respect to the range
of possible weight values. Because the population will not converge but keep
oscillating along the front, it is necessary to collect all non-dominated
solutions found in an external archive. Note also that the effect differs slightly
from restricting the maximal and minimal trade-off, as the other approaches in
this section do. While the other approaches enforce these trade-offs locally, on
a one-to-one comparison, the dynamic weighting modifies the global fitness
function. Therefore, the approach runs into problems if the Pareto front is
concave, because a small weight change would require the population to make
a big jump.
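The two weight schedules can be sketched as follows, assuming the sine argument
is 2πt/F as in Jin and Sendhoff's dynamic weighting; the function names
`w1_dynamic` and `w1_restricted` are hypothetical.

```python
import math


def w1_dynamic(t, F):
    """Unrestricted schedule: w1 oscillates over the whole range [0, 1]."""
    return abs(math.sin(2 * math.pi * t / F))


def w1_restricted(t, F, w1_min, w1_max):
    """Restricted schedule: w1 oscillates between w1_min and w1_max."""
    return w1_min + (w1_max - w1_min) * (math.sin(2 * math.pi * t / F) + 1) / 2


F = 100
for t in (0, 25, 50, 75):
    w1 = w1_restricted(t, F, w1_min=0.2, w1_max=0.6)
    print(t, round(w1, 3), round(1 - w1, 3))   # generation, w1, w2
```

The aggregated fitness in generation t is then w1(t) f1 + w2(t) f2, so the
search target slides back and forth along the part of the front that is
optimal for weights in the permitted range.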
6.6 Approaches Based on Marginal Contribution

Several authors have recently proposed to replace the crowding distance as
used in NSGA-II by a solution's contribution to a given performance measure,
i.e., the loss in performance if that particular solution were absent from
the population (Branke et al., 2004; Emmerich et al., 2005; Zitzler and Künzli,
2004). In the following, we call this a solution's marginal contribution. The
algorithm then looks similar to Algorithm 2.
Algorithm 2 Marginal contribution MOEA

  Initialize population of size μ
  Determine Pareto-ranking
  Compute marginal contributions
  repeat
    Select parents
    Generate offspring by crossover and mutation and add them to the
      population
    Determine Pareto-ranking
    Compute marginal contributions
    while (population size > μ) do {Environmental selection}
      From worst Pareto rank, remove individual with least marginal
        contribution
      Recompute marginal contributions
    end while
  until termination condition

In Zitzler and Künzli (2004) and Emmerich et al. (2005), the performance
measure used is the hypervolume. The hypervolume is the area (in 2D), or more
generally the part of the objective space, that is dominated by the solution set
and bounded by a reference point p, see Chapter 14. Figure 6.8 gives an example
for the hypervolume, and the parts used to rank the different solutions. The
marginal contribution is then calculated only based on the individuals with the
same Pareto rank. In the given example, Solution B has the largest marginal
contribution. An obvious difficulty with hypervolume calculations is the
determination of a proper reference point p, as this strongly influences the
marginal contribution of the extreme solutions.
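For two objectives, the hypervolume contribution of each solution is simply
the rectangle that only this solution dominates. The following sketch assumes
minimization, a mutually non-dominated front, and a reference point p that is
worse than every solution in both objectives; the function name
`hv_contributions_2d` is hypothetical.

```python
def hv_contributions_2d(front, ref):
    """Exclusive hypervolume of each point of a 2D non-dominated front.

    front: list of (f1, f2) tuples, mutually non-dominated, minimization.
    ref:   reference point (p1, p2), worse than all points in both objectives.
    """
    # Ascending f1 implies descending f2 on a non-dominated front.
    order = sorted(range(len(front)), key=lambda i: front[i][0])
    contrib = [0.0] * len(front)
    for pos, i in enumerate(order):
        f1, f2 = front[i]
        # Right neighbour bounds the rectangle in f1, left neighbour in f2.
        right_f1 = front[order[pos + 1]][0] if pos + 1 < len(order) else ref[0]
        upper_f2 = front[order[pos - 1]][1] if pos > 0 else ref[1]
        contrib[i] = (right_f1 - f1) * (upper_f2 - f2)
    return contrib


front = [(1, 5), (2, 3), (4, 1)]
print(hv_contributions_2d(front, ref=(6, 6)))
print(hv_contributions_2d(front, ref=(10, 10)))   # only the extremes change
```

Comparing the two calls shows the effect mentioned above: moving the reference
point changes the contributions of the two extreme solutions but leaves the
interior solution untouched.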
Zitzler et al. (2007) extend this idea by defining a weighting function over
the objective space, and use the weighted hypervolume as indicator. This
allows preferences to be incorporated into the MOEA by giving preferred regions
of the objective space a higher weight. In Zitzler et al. (2007), three different
weighting schemes are proposed: a weight distribution which favors extremal
solutions, a weight distribution which favors one objective over the other (but
still keeping the best solution with respect to the less important objective),
and a weight distribution based on a reference point, which generates a
ridge-like function through the reference point (a, b) parallel to the diagonal. To
calculate the weighted hypervolume marginal contributions, numerical
integration is used.
Fig. 6.8. Marginal contributions as calculated according to the hypervolume
performance measure. The marginal contributions correspond to the respective shaded
areas.
Another measure discussed in Zitzler and Künzli (2004) is the ε-Indicator.
Basically, it measures the minimal distance by which an individual needs to
be improved in each objective to become non-dominated (or can be worsened
before it becomes dominated). Recently, Thiele et al. (2007) suggested
weighting the ε-Indicator by an achievement scalarizing function based on a
user-specified reference point. The paper demonstrates that this makes it possible
to focus the search on the area around the specified reference point, and to find
interesting solutions faster.
Branke et al. (2004) proposed to use the expected utility as performance
measure, i.e., a solution is evaluated by the expected loss in utility if this
solution were absent from the population. To calculate the expected
utility, Branke et al. (2004) assumed that the DM has a linear utility function
of the form $u(x) = \lambda f_1(x) + (1 - \lambda) f_2(x)$, and that λ is unknown but follows a
uniform distribution over [0, 1]. The expected marginal utility (emu) of a
solution x is then the utility difference between the best and second best solution,
integrated over all utility functions where solution x is best:
$$emu(x) = \int_{\lambda=0}^{1} \max\{0, \min_{y} \{u(y) - u(x)\}\} \, d\lambda \tag{6.10}$$
While the expected marginal utility can be calculated exactly in the case
of two objectives, numerical integration is required for more objectives. The
result of using this performance measure is a natural focus of the search on
so-called "knees", i.e., convex regions with strong curvature. In these regions,
an improvement in either objective requires a significant worsening of the
other objective, and such solutions are often preferred by DMs (Das, 1999).
An example of the resulting distribution of individuals along a Pareto front
with a single knee is shown in Figure 6.9. Although this approach does not
take into account individual user preferences explicitly, it favors the often
preferred knee regions of the Pareto front. Additional explicit user preferences
can be taken into account by allowing the user to specify the probability
distribution for λ. For example, a probable preference for objective f2 could
be expressed by a linearly decreasing probability density of λ in the interval
[0, 1], $p_\lambda(\lambda) = 2 - 2\lambda$. The effect of integrating such preference information
can be seen in Figure 6.10.
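The expected marginal utility can be approximated with a simple midpoint rule
over λ. The sketch below assumes two minimized objectives, takes the minimum
in (6.10) over all other solutions y, and needs at least two solutions; the
function name `emu`, the `density` argument, and the step count are choices
made for this illustration only.

```python
def emu(front, density=lambda lam: 1.0, steps=2000):
    """Midpoint-rule estimate of the expected marginal utility of each solution.

    front:   list of (f1, f2) tuples (at least two), both objectives minimized.
    density: probability density of lambda on [0, 1] (uniform by default).
    """
    n = len(front)
    result = [0.0] * n
    for s in range(steps):
        lam = (s + 0.5) / steps
        u = [lam * f1 + (1 - lam) * f2 for f1, f2 in front]
        best = min(range(n), key=u.__getitem__)
        second = min(u[j] for j in range(n) if j != best)
        # Only the best solution has a positive marginal utility for this lambda.
        result[best] += (second - u[best]) * density(lam) / steps
    return result


front = [(0, 4), (1, 1), (4, 0)]                  # (1, 1) is a knee
print([round(v, 3) for v in emu(front)])
print([round(v, 3) for v in emu(front, density=lambda lam: 2 - 2 * lam)])
```

With the uniform density the knee solution receives the largest value, while
the linearly decreasing density shifts weight towards solutions with low f2.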
Fig. 6.9. Marginal contributions calculated according to expected utility result in a
concentration of the individuals in knee areas.

Fig. 6.10. Resulting distribution of individuals with the marginal expected utility
approach and a linearly decreasing probability distribution for λ. Axes: Objective 1
and Objective 2.

6.7 Other Approaches

The method by Cvetkovic and Parmee (2002) assigns each criterion a weight
$w_i$, and additionally requires a minimum level for dominance τ, which
corresponds to the concordance criterion of the ELECTRE method (Figueira et al.,
2005). Accordingly, the following weighted dominance criterion is used as
dominance relation in the MOEA:
$$x \succ_w y \;\Leftrightarrow\; \sum_{i:\, f_i(x) \le f_i(y)} w_i \ge \tau.$$
To facilitate specification of the required weights, they suggest a method to
turn fuzzy preferences into specific quantitative weights. However, since for
every criterion the dominance scheme only considers whether one solution is
better than another solution, and not by how much it is better, this approach
allows only a very coarse guidance and is difficult to control. A somewhat
similar dominance criterion has been proposed in Schmiedle et al. (2002). As
an additional feature, cycles in the preference relation graph are treated by
considering all alternatives in a cycle as equivalent, and merging them into a
single meta-node in the preference relation graph.
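The weighted dominance test amounts to summing the weights of the criteria in
which x is at least as good as y. The sketch below assumes minimization and
weights that sum to one; the function name `weighted_dominates` and the
symbol name `tau` for the minimum dominance level are assumptions of this
illustration.

```python
def weighted_dominates(fx, fy, w, tau):
    """True if the weights of all criteria with fx_i <= fy_i add up to at least tau."""
    score = sum(wi for fxi, fyi, wi in zip(fx, fy, w) if fxi <= fyi)
    return score >= tau


w = (0.5, 0.3, 0.2)
x, y = (1.0, 2.0, 9.0), (2.0, 3.0, 1.0)
print(weighted_dominates(x, y, w, tau=0.7))    # x wins criteria 1 and 2
print(weighted_dominates(y, x, w, tau=0.7))    # y wins only criterion 3
# The size of the differences is ignored: a tiny lead counts like a large one.
print(weighted_dominates((1.99, 2.99, 900.0), y, w, tau=0.7))
```

The last call illustrates why the guidance is coarse: a marginal advantage in
two criteria outweighs an arbitrarily large loss in the third.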
Hughes (2001) is concerned with MOEAs for noisy objective functions.
The main idea to cope with the noise is to rank individuals by the sum of
probabilities of being dominated by any other individual. To take preferences
into account, the paper proposes a kind of weighting of the domination
probabilities.
Some papers (Rekiek et al., 2000; Coelho et al., 2003; Parreiras and
Vasconcelos, 2005) use preference flow according to Promethee II (Brans and
Mareschal, 2005). Although this generates a preference order of the
individuals, it does so depending on the different alternatives present in the
population, not in absolute terms as, e.g., a weighted aggregation would do.
6.8 Discussions and Conclusions

If a single solution is to be selected in a multiobjective optimization problem,
at some point during the process, the DM has to reveal his/her preferences.
Specifying these preferences a priori, i.e., before alternatives are known, often
means asking too much of the DM. On the other hand, searching for all
non-dominated solutions as most MOEAs do may result in a waste of optimization
effort to find solutions that are clearly unacceptable to the DM.
This chapter overviewed intermediate approaches that ask for partial
preference information from the DM a priori, and then focus the search on those
regions of the Pareto optimal front that seem most interesting to the DM.
That way, it is possible to provide a larger number of relevant solutions. It
seems intuitive that this should also reduce the computation time,
although this aspect has explicitly been shown only in Branke and Deb (2005)
and Thiele et al. (2007).
Table 6.1 summarizes some aspects of some of the most prominent
approaches. It lists the information required from the DM ("Information"), the
part of the MOEA modified ("Modification"), and whether the result is a
bounded region of the Pareto optimal front or a biased distribution
("Influence"). What method is most appropriate certainly depends on the application
(e.g., whether the Pareto front is convex or concave, or whether the DM has
a good conception of what is reachable) and on the kind of information the
DM feels comfortable providing. Many of the ideas can be combined, allowing
the DM to provide preference information in different ways. For example, it
would be straightforward to combine a reference point based approach which
leads to sharp boundaries of the area in objective space considered as
interesting with a marginal contribution approach which alters the distribution
within this area. Furthermore, many of the ideas can be used in an interactive
manner, which will be the focus of the following chapter (Chapter 7).

Table 6.1. Comparison of some selected approaches to incorporate partial user
preferences.

Name                                                Information                         Modification    Influence
Constraints, Coello (2002)                          constraint                          miscellaneous   region
Preference relation, Fonseca and Fleming (1993)     reference point                     dominance       region
Reference point based EMO, Deb et al. (2006)        reference point                     crowding dist.  region
Light beam search based EMO, Deb and Kumar (2007b)  reference direction, thresholds     crowding dist.  region
Imprecise value function, Greenwood et al. (1997)   solution ranking                    dominance       region
Guided MOEA, Branke et al. (2001)                   maximal/minimal trade-off           objectives      region
Weighted integration, Zitzler et al. (2007)         weighting of objective space        crowding dist.  distribution
Marginal expected utility, Branke et al. (2004)     trade-off probability distribution  crowding dist.  distribution
Biased crowding, Branke and Deb (2005)              desired trade-off                   crowding dist.  distribution

References
Branke, J., Deb, K.: Integrating user preference into evolutionary multi-objective
optimization. In: Jin, Y. (ed.) Knowledge Incorporation in Evolutionary
Computation, pp. 461-478. Springer, Heidelberg (2005)
Branke, J., Kaußler, T., Schmeck, H.: Guidance in evolutionary multi-objective
optimization. Advances in Engineering Software 32, 499-507 (2001)
optimization. In: Yao, X., et al. (eds.) PPSN 2004. LNCS, vol. 3242, pp. 722-731.
Brans, J.-P., Mareschal, B.: PROMETHEE methods. In: Figueira, J., et al. (eds.)
Multiple criteria decision analysis, pp. 163-196. Springer, Heidelberg (2005)
Coelho, R.F., Bersini, H., Bouillard, P.: Parametrical mechanical design with
constraints and preferences: Application to a purge valve. Computer Methods in
Applied Mechanics and Engineering 192, 4355-4378 (2003)
Coello Coello, C.A.: Handling preferences in evolutionary multiobjective
optimization: A survey. In: Congress on Evolutionary Computation, vol. 1, pp. 30-37.
Coello Coello, C.A.: Theoretical and numerical constraint-handling techniques used
with evolutionary algorithms: A survey of the state of the art. Computer Methods
in Applied Mechanics and Engineering 191(11-12), 1245-1287 (2002)
Coello Coello, C.A., van Veldhuizen, D.A., Lamont, G.B.: Evolutionary Algorithms
for Solving Multi-Objective Problems. Kluwer Academic Publishers, Dordrecht
(2002)
Cvetkovic, D., Parmee, I.C.: Preferences and their application in evolutionary
multiobjective optimisation. IEEE Transactions on Evolutionary Computation 6(1),
42-57 (2002)
Das, I.: On characterizing the knee of the pareto curve based on normal-boundary
intersection. Structural Optimization 18(2/3), 107-115 (1999)
Deb, K.: Solving goal programming problems using multi-objective genetic
algorithms. In: Proceedings of Congress on Evolutionary Computation, pp. 77-84
(1999)
Deb, K.: An efficient constraint handling method for genetic algorithms. Computer
Methods in Applied Mechanics and Engineering 186(2-4), 311-338 (2000)
Deb, K.: Multi-Objective Optimization using Evolutionary Algorithms. Wiley,
Chichester (2001)
Deb, K.: Multi-objective evolutionary algorithms: Introducing bias among
Pareto-optimal solutions. In: Ghosh, A., Tsutsui, S. (eds.) Advances in Evolutionary
Computing: Theory and Applications, pp. 263-292. Springer, Heidelberg (2003)
decision-making using reference direction method. In: Genetic and Evolutionary
Computation Conference, pp. 781-788. ACM Press, New York (2007a)
evolutionary algorithms. In: Congress on Evolutionary Computation, pp. 2125-2132.
IEEE Computer Society Press, Los Alamitos (2007b)
Deb, K., Sundar, J.: Reference point based multi-objective optimization using
evolutionary algorithms. In: Genetic and Evolutionary Computation Conference, pp.
635-642. ACM Press, New York (2006)
Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multi-objective ge-
182-197 (2002a)
Deb, K., Thiele, L., Laumanns, M., Zitzler, E.: Scalable multi-objective optimization
test problems. In: Congress on Evolutionary Computation (CEC), pp. 825-830
(2002b)
Deb, K., Sundar, J., Reddy, U.B., Chaudhuri, S.: Reference point based
multi-objective optimization using evolutionary algorithms. International Journal of
Computational Intelligence Research 2(3), 273-286 (2006)
Emmerich, M.T.M., Beume, N., Naujoks, B.: An EMO algorithm using the
hypervolume measure as selection criterion. In: Coello Coello, C.A., Hernández Aguirre,
A., Zitzler, E. (eds.) EMO 2005. LNCS, vol. 3410, pp. 62-76. Springer, Heidelberg
(2005)
Figueira, J., Mousseau, V., Roy, B.: ELECTRE methods. In: Figueira, J., Greco, S.,
Fonseca, C.M., Fleming, P.J.: Genetic algorithms for multiobjective optimization:
Formulation, discussion, and generalization. In: International Conference on
Genetic Algorithms, pp. 416-423 (1993)
Fonseca, C.M., Fleming, P.J.: Multiobjective optimization and multiple constraint
handling with evolutionary algorithms - part I: A unified formulation. IEEE
Transactions on Systems, Man, and Cybernetics - Part A 28(1), 26-37 (1998)
Greenwood, G.W., Hu, X.S., D'Ambrosio, J.G.: Fitness functions for multiple
objective optimization problems: combining preferences with Pareto rankings. In:
Belew, R.K., Vose, M.D. (eds.) Foundations of Genetic Algorithms, pp. 437-455.
Morgan Kaufmann, San Francisco (1997)
Horn, J.: Multicriterion Decision making. In: Bäck, T., Fogel, D., Michalewicz, Z.
(eds.) Handbook of Evolutionary Computation, vol. 1, pp. F1.9:1-F1.9:15. Oxford
University Press, Oxford (1997)
Hughes, E.J.: Constraint handling with uncertain and noisy multi-objective
evolution. In: Congress on Evolutionary Computation, pp. 963-970. IEEE Computer
Society Press, Los Alamitos (2001)
Jaszkiewicz, A., Slowinski, R.: The light beam search over a non-dominated surface
of a multiple-objective programming problem. European Journal of Operational
Research 113(2), 300-314 (1999)
Jiménez, F., Verdegay, J.L.: Evolutionary techniques for constrained optimization
problems. In: Zimmermann, H.-J. (ed.) European Congress on Intelligent
Techniques and Soft Computing, Verlag Mainz, Aachen (1999)
Jin, Y., Sendhoff, B.: Incorporation of fuzzy preferences into evolutionary
multi-objective optimization. In: Asia-Pacific Conference on Simulated Evolution and
Learning, Nanyang Technical University, Singapore, pp. 26-30 (2002)
Molina, J., Santana, L.V., Hernandez-Diaz, A.G., Coello Coello, C.A., Caballero, R.:
g-dominance: Reference point based dominance. European Journal of Operational
Research (2009)
Parreiras, R.O., Vasconcelos, J.A.: Decision making in multiobjective
optimization problems. In: Nedjah, N., de Macedo Mourelle, L. (eds.) Real-World
Multi-Objective System Engineering, pp. 29-52. Nova Science Publishers, New York
(2005)
Rachmawati, L., Srinivasan, D.: Preference incorporation in multi-objective
evolutionary algorithms: A survey. In: Congress on Evolutionary Computation, pp.
3385-3391. IEEE Computer Society Press, Los Alamitos (2006)
Rekiek, B., Lit, P.D., Fabrice, P., L'Eglise, T., Emanuel, F., Delchambre, A.: Dealing
with user's preferences in hybrid assembly lines design. In: Binder, Z., et al. (eds.)
Management and Control of Production and Logistics Conference, pp. 989-994.
Pergamon Press, Oxford (2000)
Sait, S.M., Youssef, H., Ali, H.: Fuzzy simulated evolution algorithm for
multi-objective optimization of VLSI placement. In: Congress on Evolutionary
Computation, pp. 91-97. IEEE Computer Society Press, Los Alamitos (1999)
Sakawa, M., Yauchi, K.: An interactive fuzzy satisficing method for multiobjective
nonconvex programming problems through floating point genetic algorithms.
European Journal of Operational Research 117, 113-124 (1999)
Schmiedle, F., Drechsler, N., Große, D., Drechsler, R.: Priorities in multi-objective
optimization for genetic programming. In: Spector, L., et al. (eds.) Genetic and
Evolutionary Computation Conference, pp. 129-136. Morgan Kaufmann, San
Francisco (2002)
Tan, K.C., Lee, T.H., Khor, E.F.: Evolutionary algorithms with goal and priority
information for multi-objective optimization. In: Congress on Evolutionary
Computation, pp. 106-113. IEEE Computer Society Press, Los Alamitos (1999)
Thiele, L., Miettinen, K., Korhonen, P.J., Molina, J.: A preference-based interactive
evolutionary algorithm for multiobjective optimization. Technical Report W-412,
Helsinki School of Economics, Helsinki, Finland (2007)
Trautmann, H., Mehnen, J.: A method for including a-priori-preference in
multicriteria optimization. Technical Report 49/2005, SFG 475, University of Dortmund,
Germany (2005)
Van Veldhuizen, D., Lamont, G.B.: Multiobjective evolutionary algorithms:
Analyzing the state-of-the-art. Evolutionary Computation Journal 8(2), 125-148 (2000)
White, C., Sage, A., Dozono, S.: A model of multiattribute decision-making and
tradeoff weight determination under uncertainty. IEEE Transactions on Systems,
Wierzbicki, A.P.: Basic properties of scalarizing functions for multiobjective
optimization. Optimization 8(1), 55-60 (1977)
terizations to vector optimization problems. OR Spektrum 8(2), 73-87 (1986)
Zitzler, E., Künzli, S.: Indicator-based selection in multiobjective search. In: Yao,
X., et al. (eds.) PPSN 2004. LNCS, vol. 3242, pp. 832-842. Springer, Heidelberg
(2004)
Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the Strength Pareto
Evolutionary Algorithm for multiobjective optimization. In: Giannakoglou, K.C., et al.
(eds.) Evolutionary Methods for Design, Optimisation and Control with
Application to Industrial Problems (EUROGEN 2001), pp. 95-100. International Center
for Numerical Methods in Engineering, CIMNE (2002)
Zitzler, E., Brockhoff, D., Thiele, L.: The hypervolume indicator revisited: On the
design of Pareto-compliant indicators via weighted integration. In: Obayashi, S.,
Deb, K., Poloni, C., Hiroyasu, T., Murata, T. (eds.) EMO 2007. LNCS, vol. 4403,
7 Interactive Multiobjective Evolutionary Algorithms

Andrzej Jaszkiewicz¹ and Jürgen Branke²

¹ Poznan University of Technology, Institute of Computing Science
  jaszkiewicz@cs.put.poznan.pl
² Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany

Abstract. This chapter describes various approaches to the use of evolutionary
algorithms and other metaheuristics in interactive multiobjective optimization. We
distinguish the traditional approach to interactive analysis with the use of single
objective metaheuristics, the semi-a posteriori approach with interactive selection
from a set of solutions generated by a multiobjective metaheuristic, and specialized
interactive multiobjective metaheuristics in which the DM's preferences are
interactively expressed during the run of the method. We analyze properties of each
of the approaches and give examples from the literature.
7.1 Introduction
As already discussed in Chapters 1 and 6, in order to find the best compromise
solution of a multiobjective optimization (MOO) problem, or a good
approximation of it, MOO methods need to elicit some information about the
DM's preferences. Thus, MOO methods may be classified with respect to the
time of collecting the preference information as methods with either a priori,
a posteriori, or progressive (interactive) articulation of preferences (Hwang
et al., 1980; Słowiński, 1984). While the previous Chapter 6 discussed the use
of (partial) a priori preference information in evolutionary MOO, here we will
focus on interactive approaches. Note, however, that the two issues are closely
related, as methods working with (partial) a priori information can be turned
into interactive methods simply by allowing the DM to adjust preferences and
re-start or continue the optimization interactively.
In general, interactive methods have the following main advantages:
• The preference information requested from the DM is usually much simpler
  than the preference information required by a priori methods.
• They have moderate computational requirements in comparison to a
  posteriori methods.
• As the DM controls the search process, he/she gets more involved in the
  process, learns about potential alternatives, and is more confident about
  the final choice.
Reviewed by: Francisco Ruiz, University of Málaga, Spain;
Eckart Zitzler, ETH Zürich, Switzerland
Evolutionary algorithms and other metaheuristics may be used for interactive
MOO in several ways. Traditional interactive methods usually assume that
an underlying single objective, exact solver is available. This solver is used
to solve a series of substitute single objective problems, whose solutions are
guaranteed to be (weakly) Pareto optimal solutions. However, for many hard
MOO problems, e.g., nonlinear or NP-hard combinatorial problems, no such
efficient solvers are available. For such problems, one may use a single objective
metaheuristic in place of an exact solver. This straightforward adaptation
of the classical approach we will call the traditional approach to interactive
analysis with the use of single objective metaheuristics.
In recent years, many multiobjective metaheuristics have been developed
(Coello et al., 2002; Deb, 2001). Most multiobjective metaheuristics aim at
generating simultaneously, in a single run, a set of solutions being a good
approximation of the whole Pareto optimal set. This set of solutions may then
be presented to the DM to allow him/her to choose the best compromise solution
a posteriori. However, if the set of generated solutions and/or the number of
objectives is large, the DM may be unable to perceive the whole set and to
select the best solution without further support characteristic of interactive
methods. Thus, the approach in which the DM interactively analyzes a large
set of solutions generated by a multiobjective metaheuristic will be called the
semi-a posteriori approach.
Furthermore, the DM may also interact with a metaheuristic during its
run. Such interaction with evolutionary algorithms has been proposed for
problems with (partially) subjective functions, and is known under the name
of interactive evolution (Takagi, 2001). Finally, if at least the underlying ob-
jectives are known, we have interactive multiobjective metaheuristics, where
the DM interacts with the method during its run, allowing it to focus on the
desired region of the Pareto optimal set.
This chapter is organized as follows. In the next section, the traditional
approach to interactive analysis with the use of single objective metaheuristics
is analyzed. The semi-a posteriori approach is discussed in Section 7.3.
Section 7.4 makes a short excursion to efficiency measures. Then, Section 7.5
contains the discussion of the interactive multiobjective metaheuristics.
Future trends and research directions are discussed in the final section.
7.2 Traditional Approach to Interactive Analysis with the Use of Single Objective Metaheuristics
As was mentioned above, the traditional interactive methods rely on the use of
an exact single objective solver (e.g., Miettinen, 1999, and also Chapter 2). In
general, the single objective solver is used to solve a substitute single objective
problem whose optimal solution is guaranteed to be (weakly) Pareto optimal.
For example, in the simplest version of the reference point method (cf.
Chapter 1), in each iteration, the DM specifies a reference point in the objective
space. The reference point defines an achievement scalarizing function that
is optimized as a substitute single objective problem with the use of an exact
single objective solver. The single solution generated in this way is presented
to the DM, and he/she decides whether this solution is satisfactory. If the DM
is not satisfied, he/she adjusts the reference point and the process is repeated.
Typical examples of achievement scalarizing functions are the weighted linear
scalarizing function defined as:

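$$s_1(x, \Lambda) = \sum_{j=1}^{J} \lambda_j f_j(x),$$

and the weighted Tchebycheff scalarizing function defined as:

$$s_\infty(x, z^0, \Lambda) = \max_j \{ \lambda_j (f_j(x) - z^0_j) \},$$

where $z^0$ is a reference point, and $\Lambda = [\lambda_1, \ldots, \lambda_J]$ is a weight vector such that
$\lambda_j \ge 0 \ \forall j$.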
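The two scalarizing functions can be sketched in a few lines of Python. This is only an illustration under the assumption that all objectives are minimized; the function names and the numerical values of the objective vector, reference point, and weights are hypothetical and chosen for the example.

```python
from typing import Sequence


def weighted_linear(f: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted linear scalarizing function: sum_j lambda_j * f_j."""
    return sum(w * fj for w, fj in zip(weights, f))


def weighted_tchebycheff(f: Sequence[float], z0: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Weighted Tchebycheff function: max_j lambda_j * (f_j - z0_j)."""
    return max(w * (fj - zj) for w, fj, zj in zip(weights, f, z0))


# Hypothetical objective vector f(x) of one solution (two minimized objectives).
f_x = [4.0, 2.0]
z0 = [1.0, 1.0]   # reference point chosen by the DM (assumed values)
lam = [0.5, 0.5]  # weight vector with non-negative components

linear_value = weighted_linear(f_x, lam)           # 0.5*4 + 0.5*2
tcheby_value = weighted_tchebycheff(f_x, z0, lam)  # max(0.5*3, 0.5*1)
print(linear_value, tcheby_value)
```

A single objective solver, exact or heuristic, would then minimize one of these values over the feasible solutions x.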
A well established theory is associated with the generation of Pareto-optima
through solving substitute single objective problems. For example,
it may be proved that each weighted linear scalarizing function has at least
one global optimum (minimum) belonging to the set of Pareto-optimal
solutions (Steuer, 1986). A Pareto-optimal solution that is a global optimum of
a weighted linear scalarizing function is called a supported Pareto-optimal
solution (Ulungu and Teghem, 1994). Furthermore, each weighted Tchebycheff
scalarizing function has at least one global optimum (minimum) belonging to
the set of Pareto-optimal solutions. For each Pareto-optimal solution x there
exists a weighted Tchebycheff scalarizing function s such that x is a global
optimum (minimum) of s (Steuer, 1986, ch. 14.8). If, for a given problem,
no efficient exact solver exists, a straightforward approach could be to use a
single objective metaheuristic instead. One should be aware, however, that
the above-mentioned theoretical results are valid in the case of exact solvers
only. If a metaheuristic solver is used, several potential problems appear:
• Metaheuristics do not guarantee finding a global optimum. Thus, when
  applied to solve a substitute single objective problem, they may produce
  a dominated solution.
• Solutions generated in different iterations of the interactive method may
  dominate each other, which may be very misleading to the DM.
• No control mechanism allowing the DM to guide the search for the best
  compromise may be guaranteed. For example, the DM improving the value
  of one of the objectives in the reference point expects that this objective
  will also improve in the newly generated Pareto optimal solution. Since
  metaheuristics do not give any guarantee on the quality of generated
  solutions, the change on this objective may even be opposite.
• Computational efficiency may be insufficient. In interactive methods,
  generation of new solutions needs to be done on-line. Thus, while a running
  time of several minutes may be fully acceptable for solving most single
  objective problems, it may be unacceptable when used within an interactive
  method.
Note that the potential non-optimality of metaheuristics may sometimes also
be beneficial. In particular, a metaheuristic applied to the optimization of
a linear scalarizing function may produce a non-supported Pareto-optimal
solution as an approximate solution of the substitute problem. However, such
results are due to randomness and cannot be relied upon.
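The distinction between supported and non-supported solutions can be illustrated with a small sketch. The three objective vectors below are hypothetical, both objectives are assumed to be minimized, and the weight sweep is only a finite sample, so the example demonstrates the effect rather than proving it.

```python
def weighted_linear(f, weights):
    """Weighted linear scalarizing function."""
    return sum(w * fj for w, fj in zip(weights, f))


def weighted_tchebycheff(f, z0, weights):
    """Weighted Tchebycheff scalarizing function."""
    return max(w * (fj - zj) for w, fj, zj in zip(weights, f, z0))


# Three mutually non-dominated objective vectors (hypothetical data).
points = [(0.0, 1.0), (1.0, 0.0), (0.6, 0.6)]
non_supported = (0.6, 0.6)  # lies above the line joining the other two points

# Sweep weight vectors (l, 1 - l) and collect every exact linear minimizer.
linear_winners = set()
for k in range(101):
    lam = (k / 100, 1 - k / 100)
    best = min(weighted_linear(p, lam) for p in points)
    linear_winners.update(p for p in points if weighted_linear(p, lam) == best)

# An exact minimizer of the Tchebycheff function with equal weights.
tcheby_best = min(points,
                  key=lambda p: weighted_tchebycheff(p, (0.0, 0.0), (0.5, 0.5)))
print(sorted(linear_winners), tcheby_best)
```

In this sample, an exact optimizer of the linear function never returns the point (0.6, 0.6), while the Tchebycheff function with equal weights does. A metaheuristic optimizing the linear function could return it only as a non-optimal result.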
Several examples of this traditional approach may be found in the
literature. Alves and Clímaco (2000) use either simulated annealing or tabu
search in an interactive method for 0-1 programming. Kato and Sakawa (1998);
Sakawa and Shibano (1998); Sakawa and Yauchi (1999) have proposed a
series of interactive fuzzy satisficing methods using various kinds of genetic
algorithms adapted to various kinds of problems. Miettinen and Mäkelä use
two different versions of genetic algorithms with constraint handling (among
other solvers) in NIMBUS (Miettinen, 1999; Miettinen and Mäkelä, 2006;
Ojalehto et al., 2007). Gabrys et al. (2006) use a single objective EA within
several classical interactive procedures, i.e., STEP, reference point approach,
Tchebycheff method, and GDF.
7.3 Semi-a-Posteriori Approach - Interactive Selection from a Set of Solutions Generated by a Multiobjective Metaheuristic
Classical a posteriori approaches to MOO assume that the set of (approxi-
mately) Pareto optimal solutions is presented to the DM, who is then able
to select the best compromise solution. In recent years, many multiobjective
metaheuristics have been developed, see, e.g., Coello et al. (2002); Deb (2001)
for overviews. The typical goal of these methods is to generate in a single
run a set of solutions being a good approximation of the whole Pareto set.
Because metaheuristics such as evolutionary algorithms or particle swarm
optimization concurrently work with sets (populations) of solutions, they can
simultaneously generate in one run a set of solutions which approximate the
Pareto front. Thus, they are naturally suited as a posteriori MOO methods.
The set of potentially Pareto optimal solutions generated by a multiobjec-
tive metaheuristic may, however, be very large (Jaszkiewicz, 2002b), and even
for relatively small problems may contain thousands of solutions. Obviously,
most DMs will not be able to analyze such sets of solutions, in particular in
the case of many objectives. Thus, they may require some further support
which is typical for interactive methods.
In fact, numerous methods for interactive analysis of large finite sets of
alternatives have been already proposed. This class of methods includes:
Zionts method (Zionts, 1981), Korhonen, Wallenius and Zionts method
(Korhonen et al., 1984), Köksalan, Karwan and Zionts method (Köksalan et al.,
1988), Korhonen method (Korhonen, 1988), Malakooti method (Malakooti,
1989), Taner and Köksalan method (Taner and Köksalan, 1991), AIM (Lotfi
et al., 1992), Light-Beam-Search-Discrete (Jaszkiewicz and Słowiński, 1997),
Interactive Trichotomy Segmentation (Jaszkiewicz and Ferhat, 1999), and
Interquad (Sun and Steuer, 1996). These methods are usually based on some
traditional interactive methods (with on-line generation of solutions through
optimization of the substitute problems). Thus, from the point of view of the
DM the interaction may look almost the same, and he/she may be unaware
whether the traditional or semi-a posteriori interaction is used. For example,
using again the simplest version of the reference point method, in each
iteration, the DM specifies a reference point in the objective space. The reference
point defines an achievement scalarizing function. Then, the set of potentially
Pareto optimal solutions is searched for the best solution with respect to this
scalarizing function, which is presented to the DM. If the DM is not satisfied,
he/she adjusts the reference point and the process is repeated.
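One iteration of such a reference point dialogue over a pre-generated set can be sketched as follows. The archive, the weights, and the two reference points are hypothetical, the objectives are assumed to be minimized, and the weighted Tchebycheff function stands in for whichever achievement scalarizing function a concrete method would use.

```python
def achievement(z, z_ref, weights):
    """Weighted Tchebycheff achievement function (objectives minimized)."""
    return max(w * (zj - rj) for w, zj, rj in zip(weights, z, z_ref))


def best_for_reference_point(solutions, z_ref, weights):
    """Return the stored objective vector minimizing the achievement function."""
    return min(solutions, key=lambda z: achievement(z, z_ref, weights))


# Hypothetical set of potentially Pareto optimal objective vectors,
# generated off-line by a multiobjective metaheuristic.
archive = [(1.0, 9.0), (2.0, 6.0), (4.0, 4.0), (6.0, 2.0), (9.0, 1.0)]
weights = (0.5, 0.5)

# First iteration: the DM asks for balanced values.
first = best_for_reference_point(archive, (2.0, 2.0), weights)
# Second iteration: the DM relaxes objective 1 and tightens objective 2.
second = best_for_reference_point(archive, (5.0, 1.0), weights)
print(first, second)
```

Each query is a linear scan of the stored set, which is why the interaction itself is fast once the set has been generated.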
The advantages of this semi-a posteriori approach are:
• A large number of methods for both computational and interactive phases
  exist.
• All heavy calculations are done prior to interaction, thus interaction is
  very fast and saves the DM time.
• All solutions presented to the DM are mutually non-dominated. Note,
  however, that it does not necessarily mean that they are Pareto-optimal,
  since Pareto-optimality cannot be guaranteed by metaheuristics.
• Some control mechanisms allowing the DM to guide the search for the best
  compromise may be guaranteed.
• Additional various forms of statistical analysis of the pre-generated set of
  solutions (e.g. calculation of correlation coefficients between objectives)
  and graphical visualization may be used.
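Two of these points, mutual non-dominance and statistical analysis of the pre-generated set, can be illustrated with a short sketch. The data are hypothetical, both objectives are assumed to be minimized, and the quadratic-time filter is chosen for brevity rather than efficiency.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical objective vectors returned by a multiobjective metaheuristic.
raw = [(1.0, 9.0), (2.0, 6.0), (3.0, 7.0), (4.0, 4.0), (6.0, 2.0), (7.0, 3.0)]
front = non_dominated(raw)
r = pearson([p[0] for p in front], [p[1] for p in front])
print(front, round(r, 3))
```

A strongly negative coefficient, as obtained here, tells the DM that the two objectives are in conflict over the whole generated set.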
Generation of a high quality approximation of the whole Pareto set may,
however, become computationally too demanding, especially in the case of realistic
size combinatorial problems and a high number of objectives. Several
studies (Jaszkiewicz, 2002b, 2003, 2004b; Purshouse, 2003) addressed the issue of
computational efficiency of multiobjective metaheuristics compared to
iteratively applied single objective metaheuristics to generate an approximation to
the Pareto front. The idea of the efficiency index of a multiobjective
metaheuristic with respect to a corresponding single objective metaheuristic,
introduced in Jaszkiewicz (2002b, 2003, 2004b), is presented below.
Despite a large number of both multiobjective metaheuristics and
interactive methods for analysis of large finite sets of alternatives, few examples of
the semi-a posteriori approach may be found in the literature. Hapke et al.
(1998) use an interactive Light Beam Search-discrete method over a set of
solutions of project scheduling problems generated by Pareto simulated annealing.
Jaszkiewicz and Ferhat (1999) use the interactive trichotomy segmentation
method for analysis of a set of solutions of a personnel scheduling problem
generated by the same metaheuristic. Tanaka et al. (1995) use an interactive
procedure based on iterative evaluation of samples of points selected from a
larger set generated by an MOEA. They use a radial basis function network
to model the DM's utility function and suggest new solutions for evaluation.
7.4 Excursion: Efficiency Index

The general idea of the efficiency index presented here is to compare running
times of single and multiobjective metaheuristics needed to generate solution
sets of comparable quality. Many techniques have been proposed for evaluation
of sets of solutions in the objective space, see Chapter 14. Obviously, no single
measure can cover all aspects of the quality. In this section, we will focus on the
measures based on scalarizing functions, mainly because they can be directly
applied to evaluate results of both single and multiobjective metaheuristics.
Assume that a single objective metaheuristic is applied to the optimization
of an achievement scalarizing function. The quality of the approximate
solution generated by the single objective metaheuristic is naturally evaluated
with the achievement scalarizing function. Furthermore, one may run a
single objective metaheuristic a number of times using a representative sample
of achievement scalarizing functions defined, e.g., by a representative sample
of weight vectors. Then, the average value of the scalarizing functions over
the respective generated solutions measures the average quality of solutions
generated.
Scalarizing functions are also often used for evaluation of the results of
multiobjective metaheuristics (e.g. Czyzak and Jaszkiewicz, 1996; Viana and
de Sousa, 2000). In Hansen and Jaszkiewicz (1998) and Jaszkiewicz (2002a)
we have proposed a scalarizing functions-based quality measure fully
compatible with the above outlined approach for evaluation of single objective
metaheuristics. The quality measure evaluates sets of approximately Pareto
optimal solutions with the average value of scalarizing functions defined by a
representative sample of weight vectors.
In order to compare the quality and the computational efficiency of a
multiobjective metaheuristic and a single objective metaheuristic, a representative
sample $\Psi_R$ of normalized weight vectors is used. Each weight vector $\Lambda \in \Psi_R$
defines a scalarizing function, e.g., a weighted Tchebycheff scalarizing function
$s_\infty(z, z^0, \Lambda)$. As the reference point $z^0$ one may use an approximation of the
ideal point. All scalarizing functions defined by vectors from set $\Psi_R$ constitute
the set $S_a$ of functions. $|\Psi_R|$ is a sampling parameter; too low values of this
parameter may increase the statistical error, but, in general, the results will
not depend on it.
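The text does not prescribe how the sample of normalized weight vectors is constructed. One simple possibility, shown here purely as an assumption, is to take all vectors whose components are multiples of 1/k and sum to one; the function name and the parameter k are hypothetical.

```python
from itertools import product


def weight_sample(num_objectives, k):
    """All weight vectors with components in {0, 1/k, ..., 1} that sum to 1."""
    vectors = []
    for combo in product(range(k + 1), repeat=num_objectives):
        if sum(combo) == k:  # integer test avoids floating point round-off
            vectors.append(tuple(c / k for c in combo))
    return vectors


sample = weight_sample(3, 4)  # three objectives, step 1/4
print(len(sample), sample[:3])
```

The enumeration grows quickly with the number of objectives, so for many objectives a random sample of normalized vectors may be a more practical choice.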
In order to evaluate the quality of solutions generated by a single objective
metaheuristic the method solves a series of optimization problems for each
achievement scalarizing function from set $S_a$. For each function
$s_\infty(z, z^0, \Lambda) \in S_a$, the single objective metaheuristic returns a solution
corresponding to point $z_\Lambda$ in objective space. Thus, the average quality of
solutions generated by the single objective metaheuristic is:

$$Q_s = \frac{\sum_{\Lambda \in \Psi_R} s_\infty(z_\Lambda, z^0, \Lambda)}{|\Psi_R|}$$

Note that due to the nature of metaheuristics, the best solution with respect
to a particular achievement scalarizing function $s_\infty(z, z^0, \Lambda) \in S_a$ may
actually be found when optimizing another function $s_\infty(z, z^0, \Lambda')$ from set $S_a$.
If desired, this can be taken into account by storing all solutions generated
so far in all single objective runs, and using the respective best of the stored
solutions instead of $z_\Lambda$ for each $\Lambda \in \Psi_R$.
In order to evaluate the quality of solutions generated by a multiobjective
metaheuristic, a set $A$ of solutions generated by the method in a single run is
evaluated. For each function $s_\infty(z, z^0, \Lambda) \in S_a$ the best point $z_\Lambda$ on this
function is selected from set $A$, i.e., $s_\infty(z_\Lambda, z^0, \Lambda) \le s_\infty(z, z^0, \Lambda) \ \forall z \in A$.
Thus, the average quality of solutions generated by the multiobjective
metaheuristic is:

$$Q_m = \frac{\sum_{\Lambda \in \Psi_R} s_\infty(z_\Lambda, z^0, \Lambda)}{|\Psi_R|}$$

The quality of solutions generated by both the single objective metaheuristic
and the multiobjective metaheuristic may be assumed approximately the same
if $Q_s = Q_m$.
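The two quality measures can be computed as in the following sketch. All data are hypothetical: the weight sample, the points returned by the single objective runs, and the set A are invented for the example, the objectives are assumed to be minimized, and the origin is used as the reference point.

```python
def tchebycheff(z, z0, lam):
    """Weighted Tchebycheff value of objective vector z (objectives minimized)."""
    return max(l * (zj - z0j) for l, zj, z0j in zip(lam, z, z0))


def q_single(point_for_weight, z0):
    """Q_s: average value over the points returned by the single objective runs."""
    values = [tchebycheff(z, z0, lam) for lam, z in point_for_weight.items()]
    return sum(values) / len(values)


def q_multi(archive, weights, z0):
    """Q_m: for each weight vector take the best point of set A, then average."""
    values = [min(tchebycheff(z, z0, lam) for z in archive) for lam in weights]
    return sum(values) / len(values)


z0 = (0.0, 0.0)
weights = [(0.25, 0.75), (0.5, 0.5), (0.75, 0.25)]
# One (hypothetical) single objective run per weight vector.
single_runs = {
    (0.25, 0.75): (4.0, 1.0),
    (0.5, 0.5): (2.4, 2.0),
    (0.75, 0.25): (1.0, 4.0),
}
# Set A produced by one (hypothetical) multiobjective run.
archive = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]

qs = q_single(single_runs, z0)
qm = q_multi(archive, weights, z0)
print(qs, qm)
```

Because the scalarizing functions are minimized, lower values of both measures indicate better quality; in this invented example the multiobjective set is slightly better than the collection of single objective results.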
The two quality measures $Q_s$ and $Q_m$ are used in order to compare
computational efforts (running times) needed by a single objective metaheuristic and
a multiobjective metaheuristic to generate solutions of the same quality. First,
the single objective metaheuristic is run for each function $s_\infty(z, z^0, \Lambda) \in S_a$.
Let the average time of a single run of the single objective metaheuristic be
denoted by $T_s$ ($T_s$ is rather independent of $|\Psi_R|$ since it is an average time of
a single objective run). Then, the multiobjective metaheuristic is run. During
the run of the multiobjective metaheuristic the quality $Q_m$ is observed using
the same set $S_a$ of functions. $Q_m$ is applied to a set $A$ of potentially Pareto
optimal solutions generated up to a given time by the multiobjective
metaheuristic. This evaluation is repeated whenever set $A$ changes. Note that the
set $S_a$ is used only to calculate $Q_m$; it has no influence on the work of the
multiobjective metaheuristic, which is guided by its own mechanisms (e.g., Pareto
ranking). The multiobjective metaheuristic is stopped as soon as $Q_m \le Q_s$
(or if some maximum allowed running time is reached). Let the running time
of the multiobjective metaheuristic be denoted by $T_m$. If condition $Q_m \le Q_s$
was fulfilled, one may calculate the efficiency index of the multiobjective
metaheuristic with respect to the single objective metaheuristic:

$$EI = \frac{T_m}{T_s}$$

In general, one would expect $EI > 1$, and the lower $EI$, the more efficient the
multiple objective metaheuristic with respect to the single objective method.
The efficiency index may be used to decide what MOO method might be
most efficient. Assume that a DM using an iterative single-objective approach
has to do $L$ iterations before choosing the best compromise solution¹. If
$EI < L$, then the off-line generation of approximately Pareto-optimal solutions with
the use of a multiobjective metaheuristic would require less computational
effort than on-line generation with the use of a single objective metaheuristic.
In other words, EI compares efficiency of the traditional interactive approach
and the semi-a posteriori approach.
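As a worked illustration with invented numbers, suppose a single objective run takes 30 seconds on average and the multiobjective run needs 240 seconds to reach the same quality. The running times and iteration counts below are assumptions made only for this example.

```python
def efficiency_index(t_multi, t_single):
    """EI = T_m / T_s."""
    return t_multi / t_single


def offline_generation_is_cheaper(ei, iterations):
    """True if one multiobjective run costs less than L single objective runs."""
    return ei < iterations


t_s = 30.0   # assumed average time of one single objective run (seconds)
t_m = 240.0  # assumed time of the multiobjective run (seconds)
ei = efficiency_index(t_m, t_s)

short_dialogue = offline_generation_is_cheaper(ei, 5)   # DM needs L = 5
long_dialogue = offline_generation_is_cheaper(ei, 12)   # DM needs L = 12
print(ei, short_dialogue, long_dialogue)
```

With EI = 8, the semi-a posteriori approach pays off only if the DM is expected to need more than eight iterations of the traditional interactive approach.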
The efficiency index has been used in several studies (Jaszkiewicz, 2002b,
2003, 2004b); similar techniques were also used by Purshouse (2003). Although
the quantitative results depend on the test problem (e.g., TSP, knapsack, set
covering) and method, the general observation is that the relative
computational efficiency of multiobjective metaheuristics reduces below reasonable
levels with increasing number of objectives and instance size. For example, the
computational experiment on multiobjective knapsack problems (Jaszkiewicz,
2004b) indicated that "the computational efficiency of the multiobjective
metaheuristics cannot be justified in the case of 5-objective instances and
can hardly be justified in the case of 4-objective instances". Thus, the semi-a
posteriori approach would become very inefficient compared to an interactive
approach in such cases.
¹ Of course, in practice, this number is not known in advance and depends on the
problem, the method and the DM, but often may be reasonably estimated on the
basis of past experience.
7.5 Interactive Multiobjective Metaheuristics

Evolutionary algorithms and other metaheuristics are black box algorithms,
i.e., it suffices to provide them with a quality measure for the solutions
generated. There are almost no restrictions regarding this quality measure; a
closed-loop description of the solution quality, or additional information such
as gradients, is not necessary. This makes metaheuristics applicable to an
enormous range of applications. In most applications, a solution's quality can
be evaluated automatically. However, there are also applications where a
solution's quality cannot be computed automatically, but depends on user
preferences. In this case, interactive evolutionary computation (Takagi, 2001) may
be used. An interactive evolutionary algorithm relies on the user to evaluate
and rank solutions during the run. These evaluations are then used to guide
the further search towards the most promising regions of the solution space.
One popular example is the generation of a picture of a criminal suspect,
with the only source of information being someone's memory (Caldwell and
Johnston, 1991). The problem may be modeled as an optimization task where
solutions are faces (built of some building blocks) and the objective function
is a similarity to the suspect's face. Of course, the similarity may be evaluated
only subjectively by the human witness. Another example may be
optimization of aesthetics of a design (Kamalian et al., 2004, 2006). An example in the
realm of MCDM can be found in Hsu and Chen (1999).
Human fatigue is a crucial factor in such algorithms, as the number of
solutions usually looked at by metaheuristics may become very large. Thus,
various approaches based on approximate modeling (e.g., with a function learned
from evaluation examples) of the DM's preferences have been proposed in
this field (Takagi, 2001). The evolutionary algorithm tries to predict a DM's
answers using this model, and asks the DM to evaluate only some of the new
solutions.
There are apparent similarities of this field to interactive MOO. In both
cases we are looking for solutions being the best from the point of view of
subjective preferences. Thus, a very straightforward approach could be to
apply an interactive evolutionary algorithm to a MOO problem asking the DM
to evaluate presented solutions. However, in MOO, we assume to at least know
the criteria that form the basis for the evaluation of a solution, and that these
can be computed. Only how these objectives are combined to the overall utility
of a solution is subjective. A direct application of interactive evolutionary
algorithms to MOO would not take into account the fact that many solutions
could be compared with the use of the dominance relations without consulting
the DM. In other words, in an MOO problem, user evaluation is only necessary
to compare mutually non-dominated solutions.
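The idea of consulting the DM only for mutually non-dominated pairs can be sketched as follows. The simulated DM with a hidden weighted sum is a stand-in invented for this example, as are all names and numbers; the objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def compare(a, b, ask_dm):
    """Return the preferred vector, asking the DM only when dominance is silent."""
    if dominates(a, b):
        return a
    if dominates(b, a):
        return b
    return ask_dm(a, b)


questions = []  # log of the pairs actually shown to the DM


def simulated_dm(a, b):
    """Stand-in for a human DM: answers according to a hidden weighted sum."""
    questions.append((a, b))
    score = lambda z: 0.7 * z[0] + 0.3 * z[1]
    return a if score(a) <= score(b) else b


winner_1 = compare((1.0, 2.0), (2.0, 3.0), simulated_dm)  # settled by dominance
winner_2 = compare((1.0, 5.0), (3.0, 2.0), simulated_dm)  # requires the DM
print(winner_1, winner_2, len(questions))
```

Only the second comparison reaches the DM, which is the kind of saving in questions that the dominance relation makes possible.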
Interactive multiobjective metaheuristics are methods that are specifically
adapted to interactive MOO, and use the dominance relation and the
knowledge about the objectives to reduce the number of questions asked to the DM.
Note that they may present to the DM intermediate solutions during the run,
while in the case of the traditional or the semi-a posteriori approach only final
solution(s) are presented to the DM.
As opposed to the traditional approach to interactive analysis, the
interactive multiobjective metaheuristics do not make a complete run of a single
objective method in each iteration. They rather modify the internal workings
of a single or multiobjective metaheuristic, allowing interaction with the DM
during the run.
Several methods belonging to this class may be found in the literature.
Tanino et al. (1993) proposed probably the first method of this kind. The
method is based on a relatively simple version of a Pareto-ranking based
multiobjective evolutionary algorithm. The evaluation of solutions from the current
population is based on both dominance relation and on preferences expressed
iteratively by the DM. The DM has several options: He/she may directly point
out satisfactory/unsatisfactory solutions, or specify aspiration/reservation
levels that are used to identify satisfactory/unsatisfactory solutions.
Kamalian et al. (2004) suggest using an a posteriori evolutionary MOO
followed by an interactive evolutionary algorithm. First, a Pareto ranking-based
evolutionary algorithm is run to generate a rough approximation of the
Pareto front. Then, the DM selects a sample of the most promising solutions
that are subsequently used as the starting population of a standard interactive
evolutionary algorithm.
Kita et al. (1999) interleave generations of a Pareto ranking-based evolu-
tionary algorithm with ranking of the solutions by a DM, while Kamalian et al.
(2006) allow the user to modify the Pareto ranking computed automatically
by changing the rank of some of the solutions.
Several authors allow the DM to set and adjust aspiration levels or
reference points during the run, and thereby guide the MOEA towards the (from
the DM's perspective) most promising solutions. For example, Fonseca and
Fleming (1998) allow the user to specify aspiration levels in form of a
reference point, and use this to modify the MOEA's ranking scheme in order to
focus the search. This approach is discussed in more detail also in Chapter 6.
The approach proposed by Geiger (2007) is based on Pareto Iterated Local
Search. It first approximates the Pareto front by calculating some upper and
lower bounds, to give the DM a rough idea of what can be expected. Based
on this information, the DM may restrict the search to the most interesting
parts of the objective space. Ulungu et al. (1998) proposed an interactive
version of multiple objective simulated annealing. In addition to allowing to set
aspiration levels, solutions may be explicitly removed from the archive, and
weights may be specified to further focus the search. Thiele et al. (2007) also
use the DM's preferences interactively expressed in the form of reference points.
They use an indicator-based evolutionary algorithm, and use the achievement
scalarizing function to modify the indicator and force the algorithm to focus
on the more interesting part of the Pareto front.
Deb and Chaudhuri (2007) proposed an interactive decision support
system called I-MODE that implements an interactive procedure built over a
number of existing EMO and classical decision making methods. The main
idea of the interactive procedure is to allow the DM to interactively focus on
interesting region(s) of the Pareto front. The DM has options to use several
tools for generation of potentially Pareto optimal solutions concentrated in
the desired regions. For example, he/she may use weighted sum approach,
utility function based approach, Tchebycheff function approach or trade-off
information. Note that the preference information may be used to define a
number of interesting regions. For example, the DM may define a number
of reference (aspiration) points defining different regions. The preference
information is then used by an EMO to generate new solutions in (hopefully)
interesting regions.
The interactive evolutionary algorithm proposed by Phelps and Köksalan
(2003) allows the user to provide preference information about pairs of
solutions during the run. Based on this information, the authors compute a most
compatible weighted sum of objectives (i.e., a linear achievement scalarizing
function) by means of linear programming, and use this as single substitute
objective for some generations of the evolutionary algorithm. Note that the
weight vector defines a single search direction and may change only when the
user provides new comparisons of solutions. However, since only partial
preference information is available, there is no guarantee that the weight vector
obtained by solving the linear programming model defines the DM's utility
function, even if the utility function has the form of a weighted sum. Thus,
the use of a single weight vector may bias the algorithm towards some
solutions not necessarily being the best for the DM. This bias may become even
more significant when the DM's preferences cannot be modeled with a linear
function.
Instead of using linear programming to derive a weighting of the objectives
most compatible with the pairwise comparisons as in Phelps and Köksalan
(2003), Barbosa and Barreto (2001) use two evolutionary algorithms, one
to find the solutions, and one to determine the most compatible ranking.
These EAs are run in turn: first, both populations (solutions and weights)
are initialized, then the DM is asked to rank the solutions. After that, the
population of weights is run for some generations to produce a weighting
which is most compatible with the user ranking. Then, this weighting is used
to evolve the solutions for some generations, and the process repeats. Todd
and Sen (1999) also try to learn the user's utility function, but instead of only
considering linear weightings of objectives, they use the preference information
provided by the DM to train an artificial neural network, which is then used
to evaluate solutions in the evolutionary algorithm.
The method of Jaszkiewicz (2007) is based on the Pareto memetic algorithm
(PMA) (Jaszkiewicz, 2004a). The original PMA samples the set of
scalarizing functions by drawing a random weight vector for each single
iteration and uses this during crossover and local search. In the proposed
interactive version, preference information from pairwise comparisons of
solutions is used to reduce the set of possible weight vectors. Note that,
unlike the approaches above, this method does not attempt to identify one
most likely utility function, but simultaneously allows for a range of
utility functions compatible with the preference information specified by
the user.
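The idea of restricting the set of weight vectors can be illustrated with a
minimal sketch. It is not the implementation of Jaszkiewicz (2007): we assume
weighted-sum scalarizing functions, minimization of all objectives, and simple
rejection sampling, and all function names are hypothetical.

```python
import random

def random_weights(k, rng):
    """Draw a weight vector uniformly from the unit simplex (k objectives)."""
    cuts = sorted(rng.random() for _ in range(k - 1))
    bounds = [0.0] + cuts + [1.0]
    return [bounds[i + 1] - bounds[i] for i in range(k)]

def weighted_sum(weights, z):
    """Assumed scalarizing function: a plain weighted sum of objective values."""
    return sum(w * v for w, v in zip(weights, z))

def is_compatible(weights, comparisons):
    """True if every preferred vector scores lower (minimization) than the rejected one."""
    return all(weighted_sum(weights, better) < weighted_sum(weights, worse)
               for better, worse in comparisons)

def sample_compatible_weights(k, comparisons, n_samples=1000, seed=0):
    """Keep only the random weight vectors that agree with all pairwise comparisons."""
    rng = random.Random(seed)
    candidates = (random_weights(k, rng) for _ in range(n_samples))
    return [w for w in candidates if is_compatible(w, comparisons)]

# The DM prefers objective vector (1, 3) to (3, 1): only weight vectors that put
# more weight on the first objective survive.
comparisons = [((1.0, 3.0), (3.0, 1.0))]
kept = sample_compatible_weights(2, comparisons, n_samples=200, seed=1)
print(len(kept), "of 200 sampled weight vectors remain compatible")
```

Each new comparison supplied by the DM removes further weight vectors, so the
search is gradually focused without committing to a single utility function.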
7.6 Summary
In this chapter, we have described three principal ways for interactively using
evolutionary algorithms or similar metaheuristics in MOO.
The traditional approach to interactive MOO with the use of single objective
metaheuristics is a straightforward adaptation of the classical interactive
methods. It suffers, however, from a number of weaknesses when metaheuristics
are used in place of exact solvers, since many important theoretical
properties are not valid in the case of heuristic solvers.
The semi-a posteriori approach allows combining multiobjective metaheuristics
with methods for interactive analysis of large finite sets of alternatives.
An important advantage of this approach is that various methods
from both classes are available. The semi-a posteriori approach allows
overcoming a number of weaknesses of the traditional approach. It may, however,
become computationally inefficient for large problems with a larger number of
objectives.
Interactive multiobjective metaheuristics are a very promising class of
methods specifically adapted to interactive solving of hard MOO problems.
According to some studies (Phelps and Köksalan, 2003; Jaszkiewicz, 2007), they
may be computationally efficient even for a large number of objectives and
require relatively low effort from the DM. Such specifically designed methods
may combine the main advantages of metaheuristics and interactive MOO while
avoiding the weaknesses of the other approaches. Note that the way the DM
interacts with such methods may be significantly different from the traditional
approaches. For example, the DM may be asked to compare solutions that are
known to be located far from the Pareto optimal set.
Both interactive methods and the use of metaheuristics are among the
most active research areas within MOO. A combination of these approaches may
result in very effective methods for hard, multidimensional MOO problems.
Despite a number of proposals known from the literature, this field has not
yet received appropriate attention from the MOO research community.
References
Alves, M., Clímaco, J.: An interactive method for 0-1 multiobjective problems using
simulated annealing and tabu search. Journal of Heuristics 6, 385–403 (2000)
Barbosa, H.J.C., Barreto, A.M.S.: An interactive genetic algorithm with co-evolution
of weights for multiobjective problems. In: Spector, L., et al. (eds.) Genetic and
Francisco (2001)
Caldwell, C., Johnston, V.S.: Tracking a criminal suspect through "face-space" with
a genetic algorithm. In: International Conference on Genetic Algorithms, pp.
416–421. Morgan Kaufmann, San Francisco (1991)
Coello Coello, C.A., Van Veldhuizen, D.A., Lamont, G.B.: Evolutionary Algorithms
for Solving Multi-Objective Problems. Kluwer Academic Publishers, Dordrecht
(2002)
Czyzak, P., Jaszkiewicz, A.: A multiobjective metaheuristic approach to the
localization of a chain of petrol stations by the capital budgeting model. Control and
Cybernetics 25(1), 177–187 (1996)
Chichester (2001)
Deb, K., Chaudhuri, S.: I-MODE: An interactive multi-objective optimization and
decision-making using evolutionary methods. Technical Report KanGAL Report
No. 2007003, Indian Institute of Technology Kanpur (2007)
handling with evolutionary algorithms - part I: A unified formulation. IEEE Trans-
actions on Systems, Man, and Cybernetics - Part A 28(1), 26–37 (1998)
Streichert, F., Tanaka-Yamawaki, M.: A new scheme for interactive multi-criteria
decision making. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds.) KES 2006. LNCS
(LNAI), vol. 4253, pp. 655–662. Springer, Heidelberg (2006)
Geiger, M.J.: The interactive Pareto iterated local search (iPILS) metaheuristic
and its application to the biobjective portfolio optimization problem. In: IEEE
Symposium on Computational Intelligence in Multicriteria Decision Making, pp.
Hansen, M.P., Jaszkiewicz, A.: Evaluating the quality of approximations to the
nondominated set. Working paper, Institute of Mathematical Modelling Technical
University of Denmark (1998)
Hapke, M., Jaszkiewicz, A., Słowiński, R.: Interactive analysis of multiple-criteria
project scheduling problems. European Journal of Operational Research 107,
315–324 (1998)
Hsu, F.C., Chen, J.-S.: A study on multicriteria decision making model: Interactive
genetic algorithms approach. In: Congress on Evolutionary Computation, vol. 3,
pp. 634–639. IEEE Computer Society Press, Los Alamitos (1999)
Hwang, C.-L., Paidy, S.R., Yoon, K., Masud, A.S.M.: Mathematical programming
with multiple objectives: A tutorial. Computers and Operations Research 7,
5–31 (1980)
Jaszkiewicz, A.: Genetic local search for multiple objective combinatorial optimiza-
tion. European Journal of Operational Research 137(1), 50–71 (2002a)
Jaszkiewicz, A.: On the computational effectiveness of multiple objective metaheuris-
tics. In: Trzaskalik, T., Michnik, J. (eds.) Multiple Objective and Goal Program-
ming. Recent Developments, pp. 86–100. Physica-Verlag, Heidelberg (2002b)
Jaszkiewicz, A.: Do multiple-objective metaheuristics deliver on their promises? A
computational experiment on the set-covering problem. IEEE Transactions on
Evolutionary Computation 7(2), 133–143 (2003)
Jaszkiewicz, A.: A comparative study of multiple-objective metaheuristics on the
bi-objective set covering problem and the Pareto memetic algorithm. Annals of
Operations Research 131(1-4), 135–158 (2004a)
Jaszkiewicz, A.: On the computational efficiency of multiobjective metaheuris-
tics. The knapsack problem case study. European Journal of Operational Re-
search 158(2), 418–433 (2004b)
Jaszkiewicz, A.: Interactive multiobjective optimization with the Pareto memetic
algorithm. Foundations of Computing and Decision Sciences 32(1), 15–32 (2007)
Jaszkiewicz, A., Ferhat, A.B.: Solving multiple criteria choice problems by interac-
tive trichotomy segmentation. European Journal of Operational Research 113(2),
271–280 (1999)
Jaszkiewicz, A., Słowiński, R.: The LBS-discrete interactive procedure for multiple-
criteria analysis of decision problems. In: Climaco, J. (ed.) Multicriteria Analysis,
International Conference on MCDM, Coimbra, Portugal, 1-6 August 1994, pp.
Kamalian, R., Takagi, H., Agogino, A.M.: Optimized design of MEMS by evolution-
ary multi-objective optimization with interactive evolutionary computation. In:
Deb, K., et al. (eds.) GECCO 2004. LNCS, vol. 3103, pp. 1030–1041. Springer,
Heidelberg (2004)
Kamalian, R., Zhang, Y., Takagi, H., Agogino, A.M.: Evolutionary synthesis of mi-
cromachines using supervisory multiobjective interactive evolutionary computa-
tion. In: Yeung, D.S., Liu, Z.-Q., Wang, X.-Z., Yan, H. (eds.) ICMLC 2005. LNCS
(LNAI), vol. 3930, pp. 428–437. Springer, Heidelberg (2006)
Kato, K., Sakawa, M.: An interactive fuzzy satisficing method for large scale mul-
tiobjective 0-1 programming problems with fuzzy numbers through genetic algo-
rithms. European Journal of Operational Research 107(3), 590–598 (1998)
Kita, H., Shibuya, M., Kobayashi, S.: Integration of multi-objective and interactive
genetic algorithms and its application to animation design. In: IEEE Systems,
Man, and Cybernetics, pp. 646–651 (1999)
Köksalan, M., Karwan, M.H., Zionts, S.: An approach for solving discrete alterna-
tive multiple criteria problems involving ordinal criteria. Naval Research Logis-
tics 35(6), 625–642 (1988)
Korhonen, P.: A visual reference direction approach to solving discrete multiple cri-
teria problems. European Journal of Operational Research 34(2), 152–159 (1988)
Korhonen, P., Wallenius, J., Zionts, S.: Solving the discrete multiple criteria problem
using convex cones. Management Science 30(11), 1336–1345 (1984)
Lotfi, V., Stewart, T.J., Zionts, S.: An aspiration-level interactive model for multiple
criteria decision making. Computers and Operations Research 19, 677–681 (1992)
Malakooti, B.: Theories and an exact interactive paired-comparison approach for
discrete multiple criteria problems. IEEE Transactions on Systems, Man, and
Cybernetics 19(2), 365–378 (1989)
Dordrecht (1999)
optimization. European Journal of Operational Research 170(3), 909–922 (2006)
Ojalehto, V., Miettinen, K., Mäkelä, M.M.: Interactive software for multiobjective
optimization: IND-NIMBUS. WSEAS Transactions on Computers 6(1), 87–94
(2007)
Phelps, S., Köksalan, M.: An interactive evolutionary metaheuristic for multiobjec-
tive combinatorial optimization. Management Science 49(12), 1726–1738 (2003)
Purshouse, R.C.: On the evolutionary optimisation of many objectives. Ph.D. thesis,
Department of Automatic Control and Systems Engineering, The University of
Sheffield (2003)
Sakawa, M., Shibano, T.: An interactive fuzzy satisficing method for multiobjective
0-1 programming problems with fuzzy numbers through genetic algorithms with
double strings. European Journal of Operational Research 107(3), 564–574 (1998)
Sakawa, M., Yauchi, K.: An interactive fuzzy satisficing method for multiobjective
nonconvex programming problems through floating point genetic algorithms. Eu-
ropean Journal of Operational Research 117(1), 113–124 (1999)
Słowiński, R.: Review of multiple objective programming methods (Part I in Polish).
Przegląd Statystyczny 31, 47–63 (1984)
Steuer, R.E.: Multiple Criteria Optimization - Theory, Computation and Applica-
tion. Wiley, Chichester (1986)
Sun, M., Steuer, R.E.: Interquad: An interactive quad tree based procedure for
solving the discrete alternative multiple criteria problem. European Journal of
Operational Research 89(3), 462–472 (1996)
Takagi, H.: Interactive evolutionary computation: fusion of the capabilities of EC op-
timization and human evaluation. Proceedings of the IEEE 89, 1275–1296 (2001)
Tanaka, M., Watanabe, H., Furukawa, Y., Tanino, T.: GA-based decision support
system for multicriteria optimization. In: International Conference on Systems,
Man and Cybernetics, vol. 2, pp. 1556–1561. IEEE Computer Society Press, Los
Alamitos (1995)
Taner, O.V., Köksalan, M.: Experiments and an improved method for solving the
discrete alternative multiple criteria problem. Journal of the Operational Research
Society 42(5), 383–392 (1991)
Tanino, T., Tanaka, M., Hojo, C.: An interactive multicriteria decision making
method by using a genetic algorithm. In: 2nd International Conference on Systems
Science and Systems Engineering, pp. 381–386 (1993)
evolutionary algorithm for multiobjective optimization. Working papers w-412,
Helsinki School of Economics, Helsinki (2007)
Todd, D.S., Sen, P.: Directed multiple objective search of design spaces using ge-
netic algorithms and neural networks. In: Banzhaf, W., et al. (eds.) Genetic and
Francisco (1999)
Ulungu, B., Teghem, J., Ost, C.: Efficiency of interactive multi-objective simulated
annealing through a case study. Journal of the Operational Research Society 49,
1044–1050 (1998)
Ulungu, E.L., Teghem, J.: Multi-objective combinatorial optimization problems: A
survey. Journal of Multi-Criteria Decision Analysis 3, 83–101 (1994)
Viana, A., de Sousa, J.P.: Using metaheuristics in multiobjective resource con-
strained project scheduling. European Journal of Operational Research 120,
359–374 (2000)
Zionts, S.: Criteria method for choosing among discrete alternatives. European Jour-
nal of Operational Research 7(1), 143–147 (1981)
8
Visualization in the Multiple Objective
Decision-Making Framework
Pekka Korhonen and Jyrki Wallenius
Helsinki School of Economics, Department of Business Technology, P.O. Box 1210,
FI-00101 Helsinki, Finland, pekka.korhonen@hse.fi, jyrki.wallenius@hse.fi
Abstract. In this paper we describe various visualization techniques which have
been used or which might be useful in the multiple objective decision making
framework. Several of the ideas originate from statistics, especially multivariate
statistics. Some techniques are simply for illustrating snapshots of a single
solution or a set of solutions. Others are used as an essential part of the
human-computer interface.
8.1 Introduction
We describe various visualization techniques which have been proven useful
or which we feel might prove useful in the multiple objective decision making
framework. We focus on fundamental visualization techniques (see Chapter
9 for more specific techniques). Several of our ideas originate from statistics,
especially multivariate statistics. Typically, in the multiple objective
framework, the decision maker (DM) is asked to evaluate a number of alternatives.
Each alternative is characterized using an objective vector. From the
perspective of visualization, the complexity of the decision problem depends on two
dimensions: the number of objectives and the number of alternatives. A problem
may be complex due to a large number of alternatives and a small number
of objectives, or the other way round, although the nature of the complexity
is different. Different visualization techniques are required for each case. The
number of alternatives may also be uncountable, such as a subset of a feasible
region in an objective space in multiobjective optimization.
In descriptive statistics, computer graphics is widely used to illustrate
numerical information by producing standard visual representations (bar charts,
line graphs, pie charts, etc.). More advanced visualization techniques, for
example, Andrews (1972) curves and Chernoff (1973) faces, have also been
proposed. Andrews curves and Chernoff faces in particular were developed to
illustrate multivariate data, a problem closely related to ours. These techniques
have been developed for problems in which the main purpose is to obtain a
holistic view of the data and/or to identify clusters, outliers, etc. In the
multiple objective framework, an additional requirement is to provide the DM with
information (value information) for articulating preferences.
Reviewed by: Julian Molina, University of Malaga, Spain
Mariana Vassileva, Bulgarian Academy of Sciences, Bulgaria
In this chapter we review visualization techniques which are useful for
illustrating snapshots of a single solution or a set of solutions in discrete
and continuous situations. The evaluation of alternatives is a key issue in
multiple objective approaches. Although this book is devoted to continuous
MCDM/EMO-problems, the set of alternatives which the DM is asked to evaluate
is generally finite and its cardinality is small. An example of an exception
is Lotov's Generalized Reachable Sets method (Lotov et al., 2004). Therefore,
graphical methods developed in statistics to illustrate discrete alternatives
are essential, irrespective of whether the MCDM/EMO-problem is continuous
or discrete. In addition, many continuous problems are approximated with
discrete sets.
To facilitate the DM's evaluations, graphical representation is an essential part
of the human-computer interface. For more information, see Chapter 9.
Interactive methods are described in Chapter 2. The organization of this chapter
is as follows. In Section 8.2 we consider the use and misuse of standard
statistical techniques for visual representation of numerical data. In Section 8.3
we describe visualization in the context of a multiple objective framework.
Section 8.4 provides a discussion and conclusion.
8.2 Visual Representation of Numerical Data

Graphical techniques have been considered extremely useful by statisticians in
analyzing data. However, they have not been utilized by the MCDM- or EMO-
community to their full potential in, for example, interactive approaches.
Statisticians have developed a number of graphical methods for data analysis.
Standard graphical techniques, such as bar charts, value paths, line graphs,
etc., have a common feature: they provide an alternative representation for
numerical data and there is a one-to-one correspondence between a graphical
representation and numerical data (see any basic textbook in statistics, for ex-
ample, Levine et al. (2006).) The graphical representation can be transformed
back into numerical form (with a certain accuracy) and conversely.
8.2.1 Standard Statistical Techniques
In this subsection we review some standard graphical techniques, such as bar
charts, line graphs, and scatter plots, which are widely used to summarize
information in statistical data sets. As an example, we use a small data set
consisting of the unemployment rate (%) and the inflation rate (%) in nine
countries.
Bar charts are a standard technique to summarize frequency data
(see, e.g., Figure 8.10), and they are called histograms in this context. In
Figure 8.1, we use a bar chart to represent the (values of) unemployment
rate (%) and the inflation rate (%) for each of the nine countries. This is
customary in a multiobjective framework, where instead of summary data the
values of variables (objectives) of various alternatives are more interesting. Let
us emphasize that in this subsection we follow the terminology of statistics
and talk about variables, but in connection to multiobjective optimization
variables correspond to objectives. In other words, we refer to the objective
space and not to the variable space of multiobjective optimization.
[Figure: bar chart of Unemployment % and Inflation % for the nine countries; vertical axis from 0.0 to 14.0]
Fig. 8.1. Illustrating unemployment and inflation rates with a bar chart
Line graphs are another technique, used in much the same way as bar charts.
(Line graphs (line charts) are called value paths in the multiple objective
framework.) They are particularly appropriate when the order of alternatives
has a special meaning, as in time series. In Figure 8.2, we ordered the
alternatives in terms of decreasing inflation rates. This should make it
easier to see how the unemployment rate and the inflation rate are related in
different countries. The dependence between the unemployment rate and the
inflation rate can alternatively be observed from the scatter diagram in
Figure 8.3. Figure 8.3 is very useful when we are interested in recognizing
the best (Pareto optimal) countries (Iceland, Norway, and Sweden).
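What the eye does in the scatter diagram can also be done numerically. The
following sketch picks out the Pareto optimal alternatives when both objectives
are minimized. The alternative names and values are hypothetical and are not
the data behind Figure 8.3; the function names are ours.

```python
def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_optimal(alternatives):
    """Return the names of alternatives that no other alternative dominates."""
    return [name for name, z in alternatives.items()
            if not any(dominates(other, z)
                       for other_name, other in alternatives.items()
                       if other_name != name)]

# Hypothetical (unemployment %, inflation %) pairs, both to be minimized.
rates = {
    "A": (2.0, 4.0),
    "B": (3.0, 2.0),
    "C": (5.0, 1.5),
    "D": (6.0, 5.0),
    "E": (4.0, 4.5),
}
print(pareto_optimal(rates))  # ['A', 'B', 'C']
```

In the plot, these are the points with no other point both to their left and
below them.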
There are also many other visualization techniques used in statistics, such
as pie charts and boxplots, which may be useful in a specific multiple
objective context. Pie charts are, for example, useful for visualizing probabilities
and percentages. For further information, please consult any basic statistics
textbook such as (Bowerman et al., 2004).
[Figure: line graph of the unemployment and inflation rates for the nine countries; vertical axis from 0.0 to 14.0]
Fig. 8.2. Line graph
[Figure: scatter diagram "Unemployment % vs. Inflation %"; horizontal axis Unemployment % (0.0 to 10.0), vertical axis Inflation % (0.0 to 14.0); labeled points: Iceland, Norway, Sweden]
Fig. 8.3. Relationship between unemployment and inflation using a scatter diagram
8.2.2 Visualization of Multivariate Data
Visual representation is limited to two dimensions. Therefore, the main
problem in visualizing multivariate data is to construct a two-dimensional
representation when the number of variables (i.e., objectives in the MCDM/EMO
context) exceeds two. In statistics, two general principles have been applied
to this problem:
1. reduce the dimensionality of a problem or
2. plot a multivariate observation as an object (an icon).
Principal component analysis and multidimensional scaling (MDS) (Mardia
et al., 1979) are two well-known techniques for obtaining a low-dimensional
(specifically two-dimensional) representation of multivariate data, so that the
data may be examined visually (see, for example, Everitt (1978)).
In principle, some standard graphical techniques, such as bar charts and
line graphs, can be used to illustrate data sets with more than two variables, but
when the number of variables and/or alternatives increases the graphs quickly
become unreadable. However, some standard techniques such as radar charts
are more appropriate to illustrate multivariate data, provided the number of
variables is reasonably small. As you can see from Figure 8.4, the radar chart
is not very clear, even if we have only nine alternatives and two variables.
However, the chart is readable if the number of variables is not large. A
remedy to problems with a large number of variables is to represent them in
different pictures.
[Figure: radar chart with one axis per country (Sweden, Russia, Norway, Lithuania, Latvia, Iceland, Finland, Estonia, Denmark); radial scale from 0.0 to 15.0]
Fig. 8.4. Radar chart
In the early 1970s two promising techniques, Andrews (1972) curves and
Chernoff (1973) faces, were developed for visualizing multivariate data.
Andrews plotted the following curve
$$g_i(t) = \frac{z_{i1}}{\sqrt{2}} + z_{i2} \sin t + z_{i3} \cos t + z_{i4} \sin 2t + \dots$$
for each data point $z_i = (z_{i1}, z_{i2}, \dots, z_{ip})$ over the interval $-\pi \le t \le \pi$,
$i = 1, 2, \dots, n$, where $p$ refers to the number of variables (i.e., objectives in
multiobjective optimization, denoted by k in this book). Thus, each observation
is a harmonic curve in two dimensions. In this technique the number of
variables is unlimited. The harmonic curves depend on the order in which the
variables have been presented. Figure 8.5 reproduces the famous iris flower
data (Fisher, 1936). The data set consists of three different species of iris
flowers.
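The construction of an Andrews curve can be sketched in a few lines of code.
The sketch below is ours and only evaluates the curve; the function names are
hypothetical, and the terms beyond those written out in the formula are assumed
to continue the same pattern (cos 2t, sin 3t, cos 3t, and so on).

```python
import math

def andrews_curve(z, t):
    """Evaluate g(t) = z1/sqrt(2) + z2 sin t + z3 cos t + z4 sin 2t + z5 cos 2t + ..."""
    value = z[0] / math.sqrt(2)
    for j, coeff in enumerate(z[1:], start=1):
        harmonic = (j + 1) // 2                    # 1, 1, 2, 2, 3, 3, ...
        trig = math.sin if j % 2 == 1 else math.cos
        value += coeff * trig(harmonic * t)
    return value

def sample_curve(z, n_points=101):
    """Sample the curve on an even grid over the interval [-pi, pi]."""
    ts = [-math.pi + 2 * math.pi * i / (n_points - 1) for i in range(n_points)]
    return [(t, andrews_curve(z, t)) for t in ts]

# One observation with four variables yields one curve; plotting the sampled
# points for every observation in the same picture gives a display like Figure 8.5.
points = sample_curve([5.1, 3.5, 1.4, 0.2])
print(len(points), points[0], points[-1])
```

Permuting the entries of an observation changes its curve, which is the order
dependence mentioned above.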
Fig. 8.5. Andrews curves (R Graph Gallery, 2007)
Chernoff used a human face to represent each observation graphically. The
construction of Chernoff faces consists of geometrically well-defined elements,
such as arcs of circles, arcs of ellipses, and straight lines. The values of variables
are used as the parameters of these elements. Chernoff's original proposal
consisted of 18 face parameters (Figure 8.6).
Fig. 8.6. Chernoff face
Andrews harmonic curves and Chernoff faces help us view similarities and
dissimilarities between observations, identify clusters, outliers, etc., but they are
not very suitable for describing preference information. In Andrews curves,
each curve stands for one observation, and curves which do not deviate
much from each other represent similar observations. However, Andrews
curves are not good at illustrating the magnitude of variable values, and are
thus not very practical for describing preference information. In a Chernoff
face, it is easy to understand that a "smile" means something positive, but
the length of the nose does not convey similar information. In addition, we
have no knowledge about the joint effects of the face parameters. Big eyes and
a long nose may make the face look silly in some user's mind, although big
eyes are usually a positive feature.
In spite of the preceding disadvantages, the techniques provide us with
new directions for developing visual techniques. In particular, there is a need for
methods which can also convey preference information.
8.2.3 A Careful Look at Using Graphical Illustrations
We should always present as truthful a representation of the data set as
possible. However, it is possible to construct graphs that are misleading, and
this is to be avoided. One should always be aware of the ways statistical graphs and
charts can be manipulated purposefully to distort the truth. In the multiple
objective framework, where the DM is an essential part of the solution process,
this may be even more important. For example, in interactive approaches, the
DM is asked to react to graphical representations. A false impression given
by a graph may lead to an undesirable final solution.
In the following, we present some common examples of (purposefully) misleading
graphs. In Figure 8.7, we have described the development of traffic
fatalities in Finland during selected years between 1980 and 2000 using histograms.
The left-hand figure stands for the standard case, where the vertical axis starts
from zero. In the right-hand figure, we present the same data, but start the
vertical scale from 350 instead of 0. This makes the decrease in traffic fatalities
appear more dramatic.
[Figure: two bar charts of traffic fatalities in Finland for the years 1980, 1990, and 1995 to 2000; left panel with the vertical axis from 0 to 700, right panel with the vertical axis from 350 to 700]
Fig. 8.7. Traffic fatalities in Finland: two representations
In Figure 8.8, we illustrate the per capita electricity consumption in
Denmark and Finland. The consumption of electricity in Finland is about twice
that of Denmark. It sounds appealing to illustrate the consumption by using
an object somehow related to electricity such as light bulbs. Maintaining the
shape of the light bulb in the left-hand figure makes the electricity
consumption (height of the bulb) in Finland appear much larger than it
actually is, because the width of the bulb grows together with its height. In
the right-hand figure, we have only stretched the height of Denmark's light
bulb for Finland. This kind of illustration provides a correct impression.
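The size of the distortion can be estimated with a small calculation. The
sketch below is ours; it uses the two consumption values printed in Figure 8.8
and assumes that the left-hand bulb is scaled proportionally in both
directions, so that its area grows with the square of its height.

```python
# Per capita consumption values as printed in Figure 8.8.
denmark, finland = 6113, 12979

height_ratio = finland / denmark   # what the data say
area_ratio = height_ratio ** 2     # what a proportionally scaled picture shows

print(f"height ratio: {height_ratio:.2f}")  # about 2.12
print(f"area ratio:   {area_ratio:.2f}")    # about 4.51
```

A reader who judges the bulbs by their area therefore sees a difference more
than twice as large as the one in the data.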
[Figure: two light bulb pictograms of per capita electricity consumption, Denmark 6 113 and Finland 12 979; vertical axis from 0 to 14 000 in both panels]
Fig. 8.8. Electricity consumption/capita in Denmark and Finland: two representations
In Figure 8.9, we illustrate the meaning of a stretched axis with a line graph.
A neutral approach to describing, for example, the development of the
unemployment rate by using a line graph is to start the percentage scale from
zero and end it above the maximum, as is done in the left-hand figure. If we
stretch the vertical axis and take the range roughly from the minimum to the
maximum, as in the right-hand figure, it makes the downward trend look steeper,
demonstrating a dramatic improvement in the unemployment rate. Concerning
the use of Figures 8.7–8.9 in an MCDM/EMO-context, it is not obvious
which figure is better. For instance, sometimes it is useful to zoom in on some
value range of objectives, especially when we like to compare small differences
in objective values. In other cases, a holistic figure is more desirable.
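How much steeper a trend looks after the axis has been stretched follows
directly from the two axis ranges. The sketch below is ours and assumes that
both panels are drawn with the same height; the function name is hypothetical,
and the ranges are the axis limits of Figures 8.7 and 8.9.

```python
def visual_exaggeration(full_range, shown_range):
    """Factor by which the same change looks larger on the shorter axis range,
    assuming both plots have the same height."""
    full_lo, full_hi = full_range
    shown_lo, shown_hi = shown_range
    return (full_hi - full_lo) / (shown_hi - shown_lo)

print(visual_exaggeration((0, 700), (350, 700)))       # Figure 8.7 axes: 2.0
print(visual_exaggeration((0.0, 12.0), (8.0, 11.0)))   # Figure 8.9 axes: 4.0
```

The same factor describes the benefit of zooming in: small differences in
objective values become four times easier to see in the right-hand panel of
Figure 8.9.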
[Figure: two line graphs of the unemployment rate for the years 2000 to 2005; left panel with the vertical axis from 0.0 to 12.0, right panel with the vertical axis from 8.0 to 11.0]
Fig. 8.9. Unemployment (%) in Finland during 2000-2005: two representations
Above, we have presented only some representative examples of misleading
graphs. More information can be found in (Huff, 1954). In this classical book,
the author shows how to take a graph and make it say anything you want.
See also Wainer (1984).
8.3 Visualization in Multiple Objective Decision Making Approaches
Graphical techniques used with multivariate data have been of special interest
for researchers working on MCDM problems, because of many similarities
between these two problems. These techniques may also be used in MCDM/EMO
problems to provide the DM with holistic information and to obtain a quick
overall view of the relevant information, as well as detailed information for
evaluation and comparison purposes.
Many authors have proposed the use of graphical techniques (bar charts,
value paths, line graphs, etc.) to help evaluate alternatives (see, for example,
Cohon (1978); Geoffrion et al. (1972); Grauer (1983); Grauer et al. (1984);
Kok and Lootsma (1985); Korhonen and Laakso (1986); Korhonen and
Wallenius (1988); Schilling et al. (1983); Silverman et al. (1985); Steuer (1986)).
In Miettinen (1999), there is one chapter devoted to different visualization
techniques. In most situations, the amount of information presented to the
DM for evaluation may be considerable. A visual representation improves the
readability of such information.
In statistics, the reduction of the dimensionality of a problem is a widely
used technique to compress the information in multivariate data. If the number
of objectives in the multiobjective context can be reduced to two (or three)
without losing essential information, then the data may be examined visually.
Principal component analysis and multidimensional scaling (MDS) are also
interesting from the point of view of MCDM. However, to our knowledge,
principal component analysis has not been used for graphical purposes,
with the exception of the paper by Mareschal and Brans (1988), in which
they showed how to describe objectives and alternatives in the same picture.
Korhonen et al. (1980) used MDS to reduce a four-objective problem into two
dimensions and then described their search procedure in terms of a planar
graph.
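To make the idea of dimensionality reduction concrete, the following sketch
projects objective vectors onto their first two principal components. It is
ours and is not taken from any of the cited papers; it uses plain power
iteration with deflation instead of a numerical library, the function names
are hypothetical, and the data are invented for illustration.

```python
import math

def center(data):
    """Subtract the mean of every objective from each objective vector."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    return [[row[j] - means[j] for j in range(k)] for row in data]

def covariance(centered):
    """Sample covariance matrix of centered data."""
    n, k = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(k)]
            for a in range(k)]

def leading_eigenvector(matrix, iterations=500):
    """Power iteration; assumes the start vector is not orthogonal to the answer."""
    k = len(matrix)
    start = [1.0 / (j + 1) for j in range(k)]
    length = math.sqrt(sum(x * x for x in start))
    v = [x / length for x in start]
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project_to_plane(data):
    """Coordinates of each objective vector on the first two principal components."""
    centered = center(data)
    cov = covariance(centered)
    k = len(cov)
    axes = []
    for _ in range(2):
        v = leading_eigenvector(cov)
        eigenvalue = sum(v[a] * cov[a][b] * v[b] for a in range(k) for b in range(k))
        axes.append(v)
        # Deflation: remove the component just found before looking for the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(k)]
               for a in range(k)]
    return [[sum(r[j] * axis[j] for j in range(k)) for axis in axes] for r in centered]

# Invented four-objective data that in fact vary in two directions only.
data = [[1, 2, 0, 0], [3, 1, 0, 0], [0, 5, 0, 0], [4, 4, 0, 0], [2, 0, 0, 0]]
for point in project_to_plane(data):
    print([round(c, 3) for c in point])
```

In this example nothing is lost, because the data lie in a plane. With real
data the two components only approximate the distances between alternatives,
which is the loss of information referred to above.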
Graphical techniques have also been implemented as part of several computer
systems developed for solving MCDM problems. The well-known systems
DIDASS (Dynamic Interactive Decision Analysis and Support Systems), Expert
Choice, PREFCALC, NIMBUS, and the Reachable Goals method are good
examples. DIDASS has been developed by the System and Decision Sciences
(SDS) research group at IIASA (Grauer et al., 1984). Expert Choice
has been developed to implement the AHP (the Analytic Hierarchy Process)
(Saaty, 1980). PREFCALC has been proposed by Jacquet-Lagreze and Siskos
(1982) for assessing a set of additive utility functions. Miettinen and Mäkelä
(1995, 2000, 2006) developed the WWW-NIMBUS system, the first web-based
MCDM decision support system, for nonlinear multiobjective optimization
(http://nimbus.it.jyu.fi/). Lotov and his colleagues proposed the Reachable
Goals method, in which the authors slice and visualize the Pareto optimal set
(Lotov et al., 2004). In addition, we would like to mention our systems VIG
(Korhonen, 1987) and VIMDA (Korhonen, 1988; Korhonen and Karaivanova,
1999). For linear multiobjective optimization problems, VIG implements a free
search in the Pareto optimal set by using a dynamic graphical interface called
Pareto Race (Korhonen and Wallenius, 1988). VIMDA is a discrete version of
the original Korhonen and Laakso (1986) method. The current version is able
to deal with many millions of nondominated or Pareto optimal alternatives.
In Figure 8.13 we display how a reference direction is projected into a set of
500,000 randomly generated alternatives.
8.3.1 Snapshots of a Single Solution

We can use graphical techniques to illustrate a single solution (objective
vector, alternative). Classical techniques, such as bar charts, are mostly suitable
for this purpose. The objectives are described on the x-axis and their values
on the y-axis. Figure 8.10 illustrates unemployment rates in different
population groups (total, male, female, youth, and long term) in Finland. This
corresponds to a situation of one alternative and five objectives. When the
number of objectives increases, the bar chart representation loses its ability
to convey holistic information.
Chernoff faces (Figure 8.6) are one of the techniques which make it possible
to provide information on an alternative with one icon. In the MCDM/EMO
context, one has multiple alternatives to choose from. Sometimes, the choice
is between two (or a few) alternatives (in what follows, we refer to such cases
as discrete sets), sometimes the most preferred solution has to be chosen from
an actually infinite number of alternatives (in what follows, we refer to such
cases as infinite sets). In the following subsection we consider these situations.
[Figure: bar chart "Composition of Unemployment Rates in Finland"; categories Total, Male, Female, Youth, Long-Term; vertical axis from 0.0 to 30.0]
Fig. 8.10. Illustrating unemployment rates in different groups in Finland
8.3.2 Illustrating a Set of Solutions/Alternatives

We first consider finite sets and then extend the discussion to infinite sets (the
continuous case).
Finite Sets
Bar charts, again, can be used to compare alternatives described with
multiple objectives, when their number is small. In Figure 8.11 we compare the
unemployment rates in five different population groups in Finland, Norway
and Denmark. As we can see, the bar chart representation suits
objective-wise comparisons well. In this bar chart the country information is provided
objective-wise. In fact one can think of Figure 8.11 as consisting of three
separate bar charts presented in one graph. However, if one wants to compare
across alternatives and choose the best one from the three countries, it is
difficult to conclude whether the unemployment situation in Norway is better
than in Denmark. (Finland is clearly the worst!)
Fig. 8.11. Comparing unemployment rates in Finland, Norway, and Denmark (grouped bar chart; x-axis: Total, Male, Female, Youth, Long-Term; series: Finland, Norway, Denmark; y-axis: rate, 0.0 to 30.0)
When the number of objectives is small, and we wish to compare alternatives,
the bar chart representation is appropriate, but it would make sense to present
the objectives pertaining to each alternative (country) together, as is done in
Figure 8.1. A widely used alternative in MCDM is to use line graphs. Each line
commonly stands for one objective. See Figure 8.2 for a typical two-objective
example. The visual effect may be enhanced by ranking alternatives according
to one objective, as is done in Figure 8.2.
A line graph may also be drawn in such a way that one line stands for one
alternative, with the objectives on the x-axis. In the MCDM framework, this
is often called a score profile (Belton and Stewart, 2001).
When standard graphical techniques (e.g., bar charts, value graphs, line
graphs) are used to visualize alternative solutions, whether a solution is more
or less preferred to another one in terms of some objective can be seen in
the details. Short bars, lines with a negative slope, etc. stand for good values
or improvements (in the minimization case). However, obtaining a holistic
perception of these pieces of preference information, when there are a lot of
details, is often impossible. On the other hand, advanced techniques, such as
Chernoff faces and Andrews harmonic curves, help the DM obtain a holistic
perception of the alternatives, but do not provide a good basis for the
purpose of evaluation. Therefore, Korhonen (1991) developed an approach that
transforms a vector into a picture in the spirit of Chernoff faces and Andrews
curves, but which also enables a DM to see information that influences her/his
preference on the basis of a visual representation. The underlying ideas are
based on the use of two concepts: harmony and symmetry. For the icon, Korho-
nen chose a simple "harmonious house". The ideal (standard/normal) house
is described in a harmonious (and symmetric) form. Deviations from this ideal
are perceived as abnormal. The "degree" to which a house resembles the ideal
serves as the basis for evaluation.
The first requirement for the icon is that it can be parametrized in such a
way that by improving the value of an objective the icon becomes "better" or
more "positive" in some sense. To convey this positive-negative information
we can apply the concepts of harmony and symmetry.
The structure of the house is controlled by varying the positions of the
corner points. Each corner point is in its default position and allowable moves
of corner points are shown as squares. An objective can now be associated with
the x- or y-coordinate of any corner point in such a way that the ideal value
of the objective corresponds to the default value of the coordinate and the
deviation from the ideal value is shown as a move in an x- or y-direction. The x-
and y-coordinates of corner points are called house parameters. Two objectives
can be associated with each corner point, one affecting the horizontal position
and the other the vertical position of the corner point. When the value of
the objective deviates much from the ideal, it has a dramatic effect on the
position of a corner point.
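The parametrization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VICO implementation: the function name `house_corners`, the corner names, the linear scaling between an assumed ideal and worst value, and the `max_shift` bound are all hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def house_corners(
    defaults: Dict[str, Point],
    assignment: Dict[str, Tuple[str, int]],
    values: Dict[str, float],
    ideals: Dict[str, float],
    worsts: Dict[str, float],
    max_shift: float = 1.0,
) -> Dict[str, Point]:
    """Move corner points away from their defaults as objectives deviate.

    `assignment` maps an objective name to (corner name, axis), where axis
    0 is the x-coordinate and axis 1 is the y-coordinate (assumed encoding).
    """
    corners = {name: list(point) for name, point in defaults.items()}
    for objective, (corner, axis) in assignment.items():
        span = worsts[objective] - ideals[objective]
        # Normalized deviation: 0 at the ideal value, 1 at the worst value.
        deviation = 0.0 if span == 0 else (values[objective] - ideals[objective]) / span
        deviation = min(max(deviation, 0.0), 1.0)
        corners[corner][axis] += max_shift * deviation
    return {name: (p[0], p[1]) for name, p in corners.items()}


# Hypothetical data: two objectives attached to two corner points.
defaults = {"roof_top": (0.0, 2.0), "left_wall": (-1.0, 0.0)}
assignment = {"debt_ratio": ("roof_top", 0), "profit": ("left_wall", 1)}
ideals = {"debt_ratio": 0.2, "profit": 10.0}
worsts = {"debt_ratio": 1.0, "profit": 0.0}

ideal_house = house_corners(defaults, assignment, ideals, ideals, worsts)
poor_house = house_corners(
    defaults, assignment, {"debt_ratio": 1.0, "profit": 5.0}, ideals, worsts
)
```

With ideal objective values the house keeps its default (harmonious) shape, whereas poor values displace the corner points and make the icon look distorted.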
A preliminary version of an interactive micro-computer decision support
system known as VICO (A VIsual multiple criteria COmparison) has been
developed to implement the above idea. It has been used for experimental
purposes with student subjects, and the results were encouraging.
Figure 8.12 refers to a test situation in which the subjects compared
20 companies using 11 objectives, consisting of detailed financial information.
The purpose of the test was to identify the three companies that had filed
for bankruptcy. One of the companies in Figure 8.12 (Yritys 19) was one of
the bankrupt companies. Note how disharmonious it is compared to a
well-performing company (Yritys 1). In the system, the houses are compared in a
pairwise fashion.
The method is dependent on how the house parameters are specified. It is
important to associate key objectives with the house parameters that
determine the structure of the house, thus influencing the degree of harmony. Of
course, the directions of the changes also play an essential role. The method
is subjective, and therefore it is important that a DM takes full advantage of
this subjectivity and uses his/her knowledge of the problem and its relation-
ships in the best possible way. VICO is quite suitable for MCDM problems,
where the number of objectives is large and one uses pairwise comparisons to
collect preference information.
An even more popular approach to visualizing a set of solutions is to use
line graphs. This approach has been used, for example, in VIMDA (Korhonen,
1988). VIMDA is a "free search" type of approach that makes no assumptions,
except monotonicity, about the properties of the DM's value function. It is
a visual, interactive procedure for solving discrete multiple criteria decision
problems. It is also well suited to continuous problems when the number
of objective vectors to be simultaneously evaluated by the DM is large. The
search is controlled by varying the aspiration levels. The information is used
to generate a set of discrete alternatives, which are provided for evaluation in
a graphical form described in Figure 8.13.
The objective values in Figure 8.13 are shown on the ordinate. The current
alternative is shown in the left-hand margin. The objective values of
consecutive alternatives have been connected with lines using different colors and
patterns. The cursor characterizes the alternative whose objective values are
printed numerically at the top of the screen. The cursor moves to the right
and to the left, and each time the objective values are updated. The DM is
asked to choose his/her most preferred alternative from the screen by pointing
the cursor.
Using this procedure, the DM is free to examine any Pareto optimal so-
lution. Furthermore, this freedom is not restricted by previous choices. The
currently implemented version of VIMDA does not include a stopping crite-
rion based on a mathematical optimality test. The process is terminated when
the DM is satisfied with the currently best solution.
Infinite Sets
The Geoffrion et al. (1972) interactive procedure was the first to present the
idea of a (one-dimensional) search in a projected direction in the context of
continuous multiobjective optimization. They implemented the classic
Frank-Wolfe (single objective) nonlinear programming algorithm for solving
multiobjective problems. Korhonen and Laakso (1986), in their reference direction
approach for solving continuous multiobjective optimization problems, adopted
the idea of a visual line search from Geoffrion et al. (1972). In their approach,
the DM provides his/her aspiration levels for the objectives, thereby defining
a reference direction. This reference direction is projected onto the Pareto
optimal set. Solutions along the projection are presented to the DM, who is
assumed to choose his/her most preferred solution along the projection. The
algorithm continues by updating the aspiration levels, forming a new reference
direction, etc.
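The projection step can be illustrated for a finite set of objective vectors. The sketch below is an assumption-laden simplification and not the algorithm of Korhonen and Laakso (1986): it moves a reference point from the current solution toward the aspiration levels and, for each position, picks the point minimizing an augmented Chebyshev-type achievement function. The function names, the weights, the augmentation coefficient `rho`, and the sample data are hypothetical; all objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def achievement(z: Sequence[float], ref: Sequence[float],
                weights: Sequence[float], rho: float = 1e-6) -> float:
    """Augmented Chebyshev-type achievement value of z for reference point ref."""
    terms = [(zi - ri) / wi for zi, ri, wi in zip(z, ref, weights)]
    return max(terms) + rho * sum(terms)


def project_direction(points: List[Vector], current: Vector, aspiration: Vector,
                      weights: Sequence[float], steps: int = 5,
                      t_max: float = 1.0) -> List[Vector]:
    """Project the direction (aspiration - current) onto a finite point set.

    Requires steps >= 2. Returns the distinct points met along the projection.
    """
    direction = [a - c for a, c in zip(aspiration, current)]
    path: List[Vector] = []
    for s in range(steps):
        t = t_max * s / (steps - 1)
        ref = [c + t * d for c, d in zip(current, direction)]
        best = min(points, key=lambda z: achievement(z, ref, weights))
        if not path or path[-1] != best:
            path.append(best)
    return path


# Hypothetical nondominated points of a two-objective minimization problem.
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 2.0), (5.0, 1.0)]
path = project_direction(points, current=(1.0, 5.0), aspiration=(5.0, 1.0),
                         weights=(1.0, 1.0))
```

The resulting `path` is the sequence of alternatives that would be shown to the DM, who then picks the most preferred one and states new aspiration levels.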
Lotov's slices represent an interesting visualization technique for
continuous multiobjective problems (see, for example, Lotov et al. (1997, 2004)).
The approach is based on a visualization of the feasible set in the objective
space. For details, see Chapter 9.
Fig. 8.12. Two harmonious houses
For other techniques providing a graphical illustration of Pareto optimal
solutions, see, for example, Miettinen (2003).
8.3.3 Dynamic Representation of a Set of Solutions
Pareto Race (Korhonen and Wallenius, 1988) is a dynamic version of the Ko-
rhonen and Laakso (1986) reference direction approach. It enables a DM to
move freely in the Pareto optimal set and thus to work with the computer
to find the most preferred values for the objectives (output variables). Figure
8.14 shows an example of the Pareto Race screen. In Pareto Race the DM
sees the objective values on a display in numeric form and as bar graphs, as
(s)he travels along the Pareto optimal set. The keyboard controls include an
accelerator, gears and brakes. The search in the Pareto optimal set is analo-
gous to driving an automobile. The DM can, for example, increase/decrease
speed and brake at any time. It is also possible to change direction.
8.4 Discussion and Conclusion

In this chapter we have considered the use of graphics in the multiobjective
decision making framework. The main issue is to enable the DM to view
objective vectors (multidimensional alternatives) and to facilitate preference
comparisons. The alternatives may be presented one at a time, two at a time,
or many at a time for the DM's consideration. Based on the available
information, the MCDM procedures ask the DM to choose the best (or the worst)
from the set of displayed alternatives, as well as to rank or cluster the
alternatives. In many cases one can use standard statistical techniques, or their
variations, developed for visualizing numerical data. In statistics the main
purpose is to provide a tool for classifying, clustering or identifying outliers,
in short, to obtain a holistic view of the data. In MCDM/EMO, the DM
needs support in expressing preference information, based on displayed
numerical data. Hence, in some cases more advanced techniques are needed for
visualizing alternatives.
Fig. 8.13. VIMDA's visual interface
Fig. 8.14. Pareto Race interface (screen titled "A Pareto Race" showing five maximized goals with their current values: Credit Units 30.45, Free Time 5.01, Excellent Grades 11.39, Professional Work 5.34, Income 5.21; key legend: Bar: Gas Pedal, F1: Gears (B), F2: Gears (F), F3: Fix, F4: Relax, F5: Brake, F9: RefDi, F10: Exit)
We have explored to what extent standard graphical techniques can be
used in the MCDM/EMO framework, and reviewed some more advanced
techniques specifically developed for the MCDM problem. Examples include the
use of dynamic bar charts, such as the Pareto Race interface (Korhonen and
Wallenius, 1988), and Korhonen's harmonious houses (Korhonen, 1991).
Typically, the graphical representation is an essential part of the user interface in
interactive MCDM procedures.
There exist several other ways to utilize graphics in MCDM procedures.
For example, Salo and Hämäläinen (1992) developed a method allowing the
DM to express approximate preference statements as interval judgments,
which indicate a range for the relative importance of the objectives. The
ranges are given as bar charts in an interactive manner. Moreover,
Hämäläinen and his colleagues have implemented a system called HIPRE 3+
(http://www.hipre.hut.fi/), which includes the above idea and other
graphical interfaces. A typical example is a graphical assessment of a value function
(Hämäläinen, 2004).
An important problem in MCDM/EMO is to provide a tool, with which
one could obtain a holistic view of the Pareto optimal set. This is difficult
in more than two or three dimensions in the objective space. Interestingly, in
EMO the nondominated set evolves from one generation to the next; hence it
would be important to visualize this process. The two-dimensional case is easy,
but to obtain a good visualization in three dimensions is already challenging
(for additional discussions, see Chapter 3).
There is clearly a need to develop advanced techniques, which help DMs
evaluate and compare alternatives characterized by multiple objectives. Prob-
lems where the number of objectives is large are especially challenging. We
need techniques which help DMs make preference comparisons between alter-
natives, which are characterized with tens of objectives (criteria, attributes).
It is quite plausible that such techniques will be developed in the near fu-
ture. To visualize search in the multiple objective framework is not easy, but
even the current techniques provide useful tools for this case. In fact, when
we developed Pareto Race (Korhonen and Wallenius, 1988), we noticed that
it was very demanding to drive a car in ten dimensions, but using moving
bar charts we were able to implement the idea. The visualization of the whole
Pareto optimal set in more than three dimensions is a problem which may be
too complicated to solve. However, we may surely invent partial solutions!
Acknowledgements
The authors wish to thank Kaisa Miettinen, Julian Molina, Alexander Lo-
tov, and Mariana Vasileva for useful comments. Moreover, we would like to
acknowledge the financial support of the Academy of Finland (grant 121980).
References
Andrews, D.: Plots of high dimensional data. Biometrics 28, 125–136 (1972)
Belton, V., Stewart, T.J.: Multiple Criteria Decision Analysis: An Integrated
Approach. Kluwer Academic Publishers, Dordrecht (2001)
Bowerman, B.L., O'Connell, R.T., Orris, J.B.: Essentials in Business Statistics.
McGraw-Hill, New York (2004)
Chernoff, H.: Using faces to represent points in k-dimensional space graphically.
Journal of American Statistical Association 68, 361–368 (1973)
Cohon, J.L.: Multiobjective Programming and Planning. Academic Press, New York
(1978)
Everitt, B.: Graphical Techniques for Multivariate Data. Heinemann Educational
Books, London (1978)
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of
Eugenics 7, 179–188 (1936)
Management Science 19, 357–368 (1972)
Grauer, M.: Reference point optimization - the nonlinear case. In: Hansen, P. (ed.)
Essays and Surveys on Multiple Criteria Decision Making, vol. 209, pp. 126–135.
Springer, Berlin (1983)
Grauer, M., Lewandowski, A., Wierzbicki, A.: DIDASS - theory, implementation and
experiences. In: Grauer, M., Wierzbicki, A. (eds.) Interactive Decision Analysis,
vol. 229, pp. 22–30. Springer, Berlin (1984)
Hämäläinen, R.P.: Reversing the perspective on the applications of decision analysis.
Huff, D.: How to Lie with Statistics. Norton, New York (1954)
Jacquet-Lagreze, E., Siskos, J.: Assessing a set of additive utility functions for
multicriteria decision making, the UTA method. European Journal of Operational
Research 10, 151–164 (1982)
Kok, M., Lootsma, F.: Pairwise-comparison methods in multiple objective program-
ming, with applications in a long-term energy-planning model. European Journal
Korhonen, P.: VIG: A Visual Interactive Support System for Multiple Criteria
Decision Making. Belgian Journal of Operations Research, Statistics and Computer
Science 27, 3–15 (1987)
Korhonen, P.: A visual reference direction approach to solving discrete multiple
criteria problems. European Journal of Operational Research 34, 152–159 (1988)
Korhonen, P.: Using harmonious houses for visual pairwise comparison of multiple
criteria alternatives. Decision Support Systems 7, 47–54 (1991)
Korhonen, P., Karaivanova, J.: An algorithm for projecting a reference direction
onto the nondominated set of given points. IEEE Transactions on Systems, Man,
and Cybernetics 29, 429–435 (1999)
Korhonen, P., Wallenius, J.: A Pareto race. Naval Research Logistics 35, 615–623
(1988)
Korhonen, P., Wallenius, J., Zionts, S.: A bargaining model for solving the multiple
criteria problem. In: Fandel, G., Gal, T. (eds.) Multiple Criteria Decision Making
Theory and Application, pp. 178–188. Springer, Berlin (1980)
Levine, D.M., Krehbiel, T.C., Berenson, M.L.: Business Statistics: A First Course.
Prentice-Hall, Englewood Cliffs (2006)
Lotov, A.V., Bushenkov, V.A., Chernov, A.V., Gusev, D.V., Kamenev, G.K.: Inter-
net, GIS and interactive decision maps. Journal of Geographic Information and
(2004)
Mardia, K.V., Kent, J.T., Bibby, J.M.: Multivariate Analysis. Academic Press, San
Diego (1979)
Mareschal, B., Brans, J.: Geometrical representations for mcda. European Journal
Boston (1999)
Miettinen, K., Mäkelä, M.M.: Interactive bundle-based method for nondifferentiable
multiobjective optimization: NIMBUS. Optimization 34, 231–246 (1995)
WWW-NIMBUS on the Internet. Computers & Operations Research 27, 709–723 (2000)
R Graph Gallery (2007),
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=47
Saaty, T.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980)
Salo, A., Hämäläinen, R.P.: Preference assessment by imprecise ratio statements.
Operations Research 40, 1053–1061 (1992)
Schilling, D., Revelle, C., Cohon, J.: An approach to the display and analysis of
multiobjective problems. Socio-Economic Planning Sciences 17, 57–63 (1983)
Silverman, J., Steuer, R.E., Whisman, A.: Computer graphics at the multicriterion
computer/user interface. In: Haimes, Y., Chankong, V. (eds.) Decision Making
with Multiple Objectives, vol. 242, pp. 201–213. Springer, Berlin (1985)
tion. John Wiley and Sons, New York (1986)
Wainer, H.: How to display data badly. The American Statistician 38, 137–147 (1984)
9
Visualizing the Pareto Frontier
Alexander V. Lotov1 and Kaisa Miettinen2

1
Dorodnicyn Computing Centre of Russian Academy of Sciences, Vavilova str.,
40, Moscow 119333 Russia, lotov08@ccas.ru
2
Abstract. We describe techniques for visualizing the Pareto optimal set that can
be used if the multiobjective optimization problem considered has more than two
objective functions. The techniques discussed can be applied in the framework of
both MCDM and EMO approaches. First, lessons learned from methods developed
for biobjective problems are considered. Then, visualization techniques for convex
multiobjective optimization problems based on a polyhedral approximation of the
Pareto optimal set are discussed. Finally, some visualization techniques are consid-
ered that use a pointwise approximation of the Pareto optimal set.
9.1 Introduction
Visualization of the Pareto optimal set in the objective space, called
here the Pareto frontier, is an important tool for informing the decision maker
(DM) about the feasible Pareto optimal solutions in biobjective optimization
problems. This chapter is devoted to the visualization techniques that can
be used in the case of more than two objective functions with both MCDM
and EMO approaches for multiobjective optimization. In practice, we discuss
methods for visualizing a large (or infinite) number of Pareto optimal solutions
in the case of three or more objectives.
When discussing the visualization of the Pareto frontier, one has to note
the difference between the decision space and the objective space (see the
definitions given in the Preface). The techniques covered here are mainly aimed at the
visualization of objective vectors (points) which form a part of the feasible
objective region. Such an interest in visualizing the Pareto frontier is related
mostly to the fact that in multiobjective optimization problems, the
preferences of the DM are related to objective values, but not to decision variables
directly. Moreover, the possibly high dimension of the decision space,
which can reach many thousands or more, prevents such a visualization. In
contrast, the number of objectives in well-posed multiobjective optimization
problems is usually not too high. Thus, one can hope that the visualization
of the Pareto frontier would be possible.
Reviewed by: Pekka Korhonen, Helsinki School of Economics, Finland
Sanaz Mostaghim, University of Karlsruhe, Germany
Visualization techniques to be discussed in this chapter can be used in
decision support in the framework of a posteriori methods (see, e.g., Chapter 1
and (Miettinen, 1999)). However, the techniques may also be used by analysts
for evaluating Pareto frontiers or for comparing Pareto frontiers constructed
using different methods. Usually, a posteriori methods include four steps: (1)
approximation of the Pareto optimal set; (2) presentation of a representation
of the whole Pareto optimal set; (3) specification of a preferred Pareto optimal
objective vector by the DM; (4) selection of a (Pareto optimal) decision vector
corresponding to the objective vector identified.
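The four steps can be sketched for a small finite problem. The following Python fragment is a minimal illustration under stated assumptions, not a method from this chapter: the approximation is a brute-force nondominated filter over a given list of candidate decisions, the "presentation" is simply the list of objective vectors, and the DM is simulated by a hypothetical `choose` callback. All objectives are assumed to be minimized.

```python
from typing import Callable, List, Sequence, Tuple

Objective = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a dominates b (minimization): no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def a_posteriori(decisions: List[float],
                 objectives: Callable[[float], Objective],
                 choose: Callable[[List[Objective]], Objective]):
    # Step 1: approximate the Pareto optimal set by filtering evaluated candidates.
    pairs = [(x, objectives(x)) for x in decisions]
    front = [(x, z) for x, z in pairs
             if not any(dominates(z2, z) for _, z2 in pairs)]
    # Step 2: present a representation of the whole set (here: a plain list).
    frontier = [z for _, z in front]
    # Step 3: the DM specifies a preferred objective vector.
    preferred = choose(frontier)
    # Step 4: select a decision vector corresponding to that objective vector.
    for x, z in front:
        if z == preferred:
            return x, z, frontier
    raise ValueError("the chosen objective vector is not on the approximated frontier")


# Hypothetical biobjective problem and a simulated DM minimizing the sum.
x_star, z_star, frontier = a_posteriori(
    decisions=[0.0, 0.5, 1.0, 1.5, 2.0, 3.0],
    objectives=lambda x: (x ** 2, (x - 2.0) ** 2),
    choose=lambda zs: min(zs, key=sum),
)
```

In practice Step 2 would be a visualization rather than a list, which is the topic of the remainder of the chapter.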
Approximation of the Pareto optimal set is computationally the most com-
plicated step of a posteriori methods. Though modern computational methods
often can approximate this set without human involvement, in complicated
cases it must be performed by the analyst. After several decades of moderate
development, a posteriori methods are now being actively developed. This can
be attributed to at least two reasons: a better understanding of the complications
that arise in the process of constructing the value (utility) function, and the use
of multiobjective optimization concepts by analysts or researchers deeply
involved in real-life applications. These people are usually more interested in
practical advantages of the methods than in their theoretical properties.
The analyst supporting the application of a posteriori methods has to an-
swer two related questions: what kind of a technique to use to approximate the
Pareto frontier and how to inform the DM concerning the resulting
approximation. After the first Pareto frontier approximation method was proposed
for biobjective linear problems more than 50 years ago by Gass and Saaty
(1955), many different approximation techniques have been developed.
Sometimes, they are called Pareto frontier generation techniques (Cohon, 1978) or
vector optimization techniques (Jahn, 2004). Methods have been developed
for problems involving nonlinear models and more than two objectives.
In the case of more than two objectives, approximation methods are typically
based on either a) approximation by a large but finite number of objective
vectors or b) approximation by a convex polyhedral set. The first form can be
used in general nonlinear problems and the second only in convex problems.
In what follows, we refer to the approximation by a large number of objective
vectors as a pointwise approximation of the Pareto optimal set.
As far as informing the DM about the Pareto optimal solutions is
concerned, there are two possibilities: providing a list of the solutions to her/him
or visualizing the approximation. Visualization of the Pareto frontier was
introduced by Gass and Saaty (1955). They noted that in the case of two
objectives, the Pareto frontier can easily be depicted because the objective space is
a plane. Due to this, information concerning feasible objective values and
objective tradeoffs can be provided to the DM in a graphic form. However, this
convenient and straightforward form of displaying objective vectors forming
the Pareto frontier on a plane is restricted to biobjective problems only.
In the case of more objectives, the standard approach of a posteriori meth-
ods has been based on approximating the Pareto frontier by a large number
of objective vectors and informing the DM about the Pareto frontier by pro-
viding a list of such points to her/him. However, selecting from a long list
of objective vectors has been recognized to be complicated for human beings
(Larichev, 1992). Yet, this approach is still applied (see interesting discussion
as well as references in (Benson and Sayin, 1997)). We can attribute this to
the fact that convenient human/computer interfaces and computer graphics
that are needed in the case of more than two objectives were absent (at
least until the middle of the 1980s). In this respect, it is important to mention
that the need for visualizing the Pareto frontier (for more than two objectives)
in a form that gives the DM explicit information on objective tradeoffs had
already been stated by Meisel (1973). Nowadays, this problem has been solved
to a large extent: modern computers provide an opportunity to visualize the
Pareto frontier for linear and nonlinear decision problems for three to about
eight objectives, and computer networks are able to bring, for example, Java
applets that display graphs of Pareto frontiers for different decision problems.
In what follows, we use the notation introduced in the Preface. Remember that
$P(Z)$ stands for the set of Pareto optimal objective vectors (Pareto frontier),
which is a part of the feasible objective region $Z$. Another set that will often
be used in this chapter is defined for multiobjective minimization problems as
$$Z_p = \{ z \in \mathbb{R}^k : z_i \ge z'_i \text{ for all } i = 1, \dots, k, \text{ for some } z' \in Z \}.$$
Though the set $Z_p$ has been considered in various theoretical studies of
multiobjective optimization for many years, it was named the Edgeworth-Pareto
hull (EPH) of the set $Z$ in 1995, at the 12th International Conference on
Multiple Criteria Decision Making in Hagen, Germany. EPH includes, along with
the points of the set $Z$, all objective vectors dominated by points of the set
$Z$. Importantly, EPH is the maximal set for which $P(Z_p) = P(Z)$ holds.
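For a finite set of objective vectors, the definition of the EPH and the property that it has the same Pareto frontier as the original set can be checked directly. The Python sketch below is only an illustration for the minimization case; the function names and the sample points are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple

Vector = Tuple[float, ...]


def in_eph(z: Sequence[float], Z: Iterable[Sequence[float]]) -> bool:
    """True if z lies in the Edgeworth-Pareto hull of the finite set Z,
    i.e. some z' in Z satisfies z_i >= z'_i for every i (minimization)."""
    return any(all(zi >= wi for zi, wi in zip(z, w)) for w in Z)


def pareto(points: Iterable[Vector]) -> Set[Vector]:
    """Nondominated subset of a finite set of objective vectors (minimization)."""
    pts = list(points)

    def dominates(a: Vector, b: Vector) -> bool:
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return {p for p in pts if not any(dominates(q, p) for q in pts)}


# Hypothetical feasible objective vectors and two extra points taken from the EPH.
Z = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
extra = [(2.5, 2.5), (5.0, 5.0)]

same_frontier = pareto(Z + extra) == pareto(Z)
```

Adding points of the EPH that are not in the original set never changes the nondominated subset, which is why the EPH can be visualized in place of the feasible objective region without losing the frontier.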
In this chapter, we concentrate on visualizing the Pareto frontier or an
approximation of it as a whole whereas Chapter 8 was devoted mainly to
visualizing individual alternatives or a small number of them. Let us here also
mention the so-called box indices suggested by Miettinen et al. (2008), which support
the DM in comparing different Pareto optimal solutions by representing them
on a rough enough scale to let her/him easily recognize the main
characteristics of the solutions at a glance.
The rest of this chapter is organized as follows. General problems of
visualizing the Pareto frontier are considered in Section 9.2. We discuss the
application of visualization for informing DMs, consider lessons learned from
methods developed for biobjective problems and briefly touch on the possible
instability of the Pareto frontier. Section 9.3 is devoted to visualization
techniques for convex multiobjective optimization problems. These techniques are
based on the polyhedral approximation of the Pareto frontier. We augment
recent surveys of approximation methods (Ruzika and Wiecek, 2005) by
discussing polyhedral approximation and theory of approximation, and then turn
to visualization of the Pareto frontier in the convex case. In Section 9.4,
visualization techniques are described that are based on the pointwise
approximation of the Pareto frontier (including techniques based on enveloping a
large number of approximating points). Such visualization techniques can be
applied in both MCDM and EMO based approaches. Finally, we conclude in
Section 9.5.
9.2 General Problems of Visualizing the Pareto Frontier

In this section, we discuss general problems of visualizing the Pareto frontier.
Subsection 9.2.1 is devoted to the advantages of visualization. Then in Sub-
section 9.2.2 we discuss the lessons that can be learned from experiences with
biobjective problems. Finally, issues related to the instability of the Pareto
frontier with respect to disturbances of the parameters of the problem (that
are often met in real-life problems, but usually not considered in multiobjec-
tive optimization textbooks) are briey outlined in Subsection 9.2.3.
9.2.1 Why Is Visualization of the Pareto Frontier as a Whole Useful?
Visualization, that is, transformation of symbolic data into geometric
information, can support human beings in forming a mental picture of the symbolic
data. About one half of the neurons in the human brain are associated with vision in
some way, and this evidence provides a solid basis for successful application
of visualization for transforming data into knowledge. As a proverb says: "A
picture is worth a thousand words." Another estimate of the role of
visualization is given by Wierzbicki and Nakamori (2005). In their opinion, a picture
is worth ten thousand words. In any case, visualization is an extremely
effective tool for providing information to human beings. Visualization on the
basis of computer graphics has proved to be a convenient technique that can
help people to assess information. The question that we consider is how it can
be effectively used in the field of multiobjective optimization, namely, with a
posteriori methods.
As discussed earlier, a posteriori methods are based on informing the DM
about the Pareto optimal set without asking for his/her preferences. The DM
does not need to express his/her preferences immediately since expressing
preferences in the form of the single-shot specification of the preferred Pareto
optimal objective vector may be separated in time from the approximation
phase. Thus, once the Pareto frontier has been approximated, the visualization
of the Pareto frontier can be repeated as many times as the DM wants to and
can last as long as needed.
The absence of time pressure is an important advantage of a posteriori
methods. To support this claim, let us consider psychological aspects of
thinking. Studies in the field of psychology have resulted in a fairly complicated
picture of a human decision making process. In particular, the concept of a
mental model of reality that provides the basis of decision making has been
proposed and experimentally proven (Lomov, 1984). The mental models have
at least three levels that describe the reality in different ways: as logical
thinking, as interaction of images and as sub-conscious processes. Preferences are
connected to all three levels. A conflict between the mental levels may be one
of the reasons for the well-known non-transitive behavior of people (both in
experiments studying their preferences as well as in real-life situations).
A large part of human mental activities is related to the coordination of the
levels. To settle the conflict between the levels, time is required. Psychologists
maintain that sleeping is used by the brain to coordinate the mental levels.
(Compare with the proverb: "The morning is wiser than the evening".) In
his famous letter on making a tough decision, Benjamin Franklin advised
spending several days on making a choice. It is known that group decision and
brainstorming sessions are more effective if they last at least two days. A detailed
discussion of this topic is given in (Wierzbicki, 1997).
Thus, to settle the conflict between the levels of one's mental models in
finding a balance between different objectives in a multiobjective
optimization problem, the DM needs to keep the Pareto optimal solutions in his/her
mind for a sufficiently long time. It is well known that a human being cannot
simultaneously handle very many objects (see (Miller, 1956) for the "magical
number seven plus or minus two"). This statement is true in the case of letters,
words, sentences and even paragraphs. Thus, a human being cannot think
about hundreds or thousands of objective vectors of the Pareto frontier
approximation simultaneously. Even having such a long list in front of his/her eyes,
the DM may be unable to find the best one (Larichev, 1992). Instead, the
DM often somehow selects a small number of objective vectors from the list
and compares them (for details see (Simon, 1983)). Though after some time
the most preferred one of these solutions will be selected, such an approach
results in missing most of the Pareto optimal solutions, and one of them may
be better than the selected one. Visualization can help to avoid this problem.
However, to be effective, a visualization technique must satisfy some
requirements formulated, for example, by McQuaid et al. (1999). These
requirements include
1. simplicity, that is, visualization must be immediately understandable,
2. persistence, that is, the graphs must linger in the mind of the beholder,
and
3. completeness, that is, all relevant information must be depicted by the
graphs.
If a visualization technique satisfies these requirements, the DM can consider
the Pareto frontier mentally as long as needed and select the most preferred
objective vector from the whole set of Pareto optimal solutions. If some
features of the graphs happen to be forgotten, (s)he can look at the frontier
again and again.
9.2.2 Lessons Learned from Biobjective Problems
According to (Roy, 1972), "In a general bi-criterion case, it has a sense to
display all efficient decisions by computing and depicting the associated criterion
points; then, DM can be invited to specify the best point at the compromise
curve." Indeed, the Pareto frontier, which is often called a compromise curve
in the biobjective case, can be depicted easily once its approximation has been
constructed. For complications that may arise in the process of approximating
biobjective Pareto frontiers, we refer to (Ruzika and Wiecek, 2005).
As has already been said, the first method for approximating the Pareto
frontier by Gass and Saaty (1955) used visualization of the Pareto frontier as
the final output of the process of approximation. Since then, various studies of
biobjective optimization problems have used visualization of the Pareto
frontier. It is interesting to note that visualization of the feasible objective region
or the EPH is sometimes used, as well. In this case, the set of Pareto optimal
objective vectors can be recognized easily. For minimization problems, it is the
lower left frontier of the feasible objective region. It can be recognized even in
nonconvex problems. For further details, see Chapter 1. For an example, see
Figure 9.1, where Pareto optimal solutions are depicted by bold lines.
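The "lower left frontier" rule for minimization problems can be illustrated on a finite sample of objective vectors. The following is only a minimal sketch: the sample points and the function names `dominates` and `pareto_filter` are hypothetical and do not come from the chapter.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_filter(points: List[Point]) -> List[Point]:
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical sample of a feasible objective region (z1, z2).
sample = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (4.0, 4.0)]
frontier = pareto_filter(sample)
```

For this sample, the points (3, 4) and (4, 4) lie above and to the right of (2, 3) and are therefore discarded; the remaining three points form the lower left frontier.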
It is extremely important that the graphs used provide, along with Pareto
optimal objective vectors, information about objective tradeoffs. There exist
different formulations of the intuitively clear concept of the objective tradeoff
as a value that provides a comparison of two objective vectors (see, e.g.,
Chapter 2) or characterizes the properties of a movement along a curve in the
objective space. Let us consider two of them for the biobjective case.
Fig. 9.1. Example of a feasible objective region and its Pareto frontier.
As an objective tradeoff (Chankong and Haimes, 1983; Miettinen, 1999)
between any two objective functions $f_1$ and $f_2$ and decision vectors $x^1$ and
$x^2$ (assuming $f_2(x^2) - f_2(x^1) \neq 0$) one can understand the value
$T_{12}(x^1, x^2)$ defined in Chapter 2 as Definition 1. For any Pareto optimal
decision vectors $x^1$ and $x^2$, the value of the objective tradeoff is negative
because it describes the relation between the improvement of one objective and
the worsening of another. To estimate the objective tradeoff between vectors
$x^1$ and $x^2$ visually, the DM simply has to compare the related objective
values $z^1 = f(x^1)$ and $z^2 = f(x^2)$ (see Figure 9.1). Such information is
very important for the DM, who can use it to decide which of these two objective
vectors is preferable for her/him.
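A small numerical illustration may help here. The sketch below assumes that the tradeoff of Definition 1 in Chapter 2 is the ratio of the change in one objective to the change in the other; that formula is not restated in this section, so it should be read as an assumption. The function name and the sample vectors are hypothetical.

```python
def objective_tradeoff(z_a, z_b, i=0, j=1):
    """Ratio of the change in objective i to the change in objective j
    when moving from objective vector z_a to objective vector z_b.
    Assumed form of the tradeoff T_ij; requires z_b[j] != z_a[j]."""
    denominator = z_b[j] - z_a[j]
    if denominator == 0:
        raise ValueError("objective j must differ between the two vectors")
    return (z_b[i] - z_a[i]) / denominator


# Two hypothetical Pareto optimal objective vectors (both objectives minimized).
t12 = objective_tradeoff((2.0, 3.0), (4.0, 1.0))
```

Here the first objective worsens by 2 while the second improves by 2, so the tradeoff is negative, as the text states for any pair of Pareto optimal vectors.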
For a Pareto optimal objective vector $z^*$, in which the Pareto frontier is
smooth, the objective tradeoff can be expressed by the tradeoff rate
$\frac{dz_1}{dz_2}(z^*)$,
where the derivative is taken along the curve which is the Pareto frontier in
the biobjective case (see Figure 9.1). The value of the tradeoff rate informs
the DM concerning the exchange between the objective values if one moves
along the Pareto frontier. The tradeoff rate, which is given graphically by the
tangent line to the Pareto frontier at $z^*$, can easily be imagined at any point
of the Pareto frontier. In case of a kink, tradeoff rates can be given by a cone
of tangent lines (Henig and Buchanan, 1997; Miettinen and Mäkelä, 2002,
2003). Such cones can be estimated mentally by using the graph of the Pareto
frontier, too. The importance of the role of the tradeoff rate has resulted in
an alternative name for the Pareto optimal frontier: the tradeoff curve.
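When the frontier is only known as a list of points, the tradeoff rate can be estimated by finite differences. The sketch below is an illustration under assumptions: the frontier points are hypothetical, the rate is taken as the change in the first objective per unit change in the second (the same orientation as the tradeoff $T_{12}$), and the function name is invented. At a kink, the left and right estimates differ, and the two slopes bound the cone of tangent lines.

```python
def one_sided_rates(frontier, idx):
    """Left and right finite-difference estimates of the tradeoff rate
    (change in z1 per unit change in z2) at frontier[idx].
    The frontier must be sorted along the curve; None marks a missing side."""

    def slope(a, b):
        return (b[0] - a[0]) / (b[1] - a[1])

    left = slope(frontier[idx - 1], frontier[idx]) if idx > 0 else None
    right = slope(frontier[idx], frontier[idx + 1]) if idx < len(frontier) - 1 else None
    return left, right


# Hypothetical piecewise linear tradeoff curve with a kink at (1, 2).
curve = [(0.0, 4.0), (1.0, 2.0), (3.0, 1.0), (6.0, 0.0)]
rates_at_kink = one_sided_rates(curve, 1)
```

The two values returned for the kink are different, which is exactly the situation in which a single tangent line no longer describes the tradeoff.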
Tradeoff information is extremely important for the DM since it helps to
identify the most preferred point along the tradeoff curve. This information
is given by the graph of the Pareto frontier in a clear form, which can be
accessed immediately. The graph of the Pareto frontier, if the frontier is not
too complicated, can linger in the mind of the DM for a relatively long time.
In any case, the DM can explore the curve as long as needed. Finally, the
graph provides full information on the objective values and their mutual
dependence along the tradeoff curve. Thus, this kind of visualization satisfies
the requirements formulated above.
Visualization of the Pareto frontier helps to transform data on Pareto
optimal objective values into knowledge of them. In other words, it helps in the
formation of a mental picture of a multiobjective optimization problem
(compare with learning processes discussed in Chapter 15). Since visualization can
influence all levels of thinking, it can support the mental search for the most
preferred Pareto optimal solution. Such a search may be logically imperfect,
but acceptable for all levels of human mentality. Visualization of the Pareto
frontier in biobjective problems helps to specify a preferable feasible objective
vector as a feasible goal directly on the graph. This information is sufficient
for finding the associated decision vector. In addition, the feasible goal (or
its neighborhood) can be used as a starting point for various procedures, like
identification of decision rules or selection of a part of the Pareto frontier for
a subsequent detailed study.
Thus, in biobjective cases, visualization of the Pareto frontier helps the
DM
1. to estimate the objective tradeoff between any two feasible points and the
tradeoff rate at any point of the Pareto frontier;
2. to specify the most preferred solution directly on the Pareto frontier.
The question can arise whether it is possible and profitable to visualize the
Pareto frontier in the case of more than two objectives. To make visualization
as effective as it is in the biobjective case, one needs to satisfy the general
requirements of visualization. In addition, visualization must provide
information on objective tradeoffs. Finally, the techniques must support the DM
in identifying the most preferred solution.
Let us next consider the objective tradeoffs in the case of more than two
objectives. The concept of an objective tradeoff introduced earlier can give rise
to two different concepts in the multiobjective case: partial objective tradeoff
and total objective tradeoff. Both these concepts are defined for objectives $f_i$
and $f_j$ and decision vectors $x^1$ and $x^2$, for which
$f_j(x^2) - f_j(x^1) \neq 0$, by the same formula as earlier, that is,
$T_{ij}(x^1, x^2)$ defined in Chapter 2 as Definition 1 (Miettinen, 1999). As
discussed in Chapter 2, the value $T_{ij}$ is said to be a
partial objective tradeoff if other objective values are not taken into account.
On the other hand, it is a total objective tradeoff if decision vectors $x^1$ and
$x^2$ satisfy $f_l(x^1) = f_l(x^2)$ for all $l \neq i, j$. Thus, the total tradeoff
can only be used for a small part of pairs of decisions. At first sight, it may
seem that a total tradeoff cannot play an important role. However, this is not so.
To give a geometric interpretation of the total tradeoff, it is convenient
to consider biobjective slices (cross-sections) of the set $Z$ (or the set $Z_p$). A
biobjective slice (cross-section) of the set $Z$ is defined as a set of such points
in $Z$ for which all objective values except two ($i$ and $j$, in our case) are fixed.
Then, the slice is a two-dimensional set containing only those objective vectors
$z^1 = f(x^1)$ and $z^2 = f(x^2)$, for which it holds $z_l^1 = z_l^2$ for all
$l \neq i, j$. Thus, since only the values of $z_i$ and $z_j$ change in the slice,
the tradeoff can be estimated visually between any pair of points of the slice.
In turn, it means that the total tradeoff can be estimated between objectives
$f_i$ and $f_j$ for any pair of decision vectors $x^1$ and $x^2$ that may be
unknown, but they certainly exist since they result in the objective vectors of
the slice. Such a comparison
is especially informative if both objective vectors belong to the Pareto frontier.
Application of biobjective slices is even more important while studying
tradeoff rates between objective values. If the Pareto frontier is smooth in
its point $z^* = f(x^*)$, a tradeoff rate becomes a partial tradeoff rate defined
as $\frac{\partial z_i}{\partial z_j}(z^*)$, where the partial derivative is taken
along the Pareto frontier.
Graphically, it is given by the tangent line to the frontier of the slice. The value
of the partial tradeoff rate informs the DM about the tradeoff rate between
values of two objectives under study at the point $z^*$, while other objectives are
fixed at some values. Once again, the case of nonsmooth frontiers is discussed
in (Henig and Buchanan, 1997; Miettinen and Mäkelä, 2002, 2003).
To our knowledge, the idea of visualizing the biobjective slices
of the Pareto frontier related to multiple objectives was introduced by Meisel
(1973). He developed a method for visualizing biobjective projections of Pareto
optimal solutions (compare the scatterplot techniques to be described in
Section 9.4). At the same time, he argued that it would be much more useful
to visualize the biobjective slices of the Pareto frontier instead of biobjective
projections of particular objective vectors. The reason, according to Meisel,
is that graphs of biobjective slices (in contrast to projections)
can inform the DM on total objective tradeoffs and partial tradeoff
rates. Moreover, they can support identification of the most preferred
objective vector directly at the Pareto frontier.
The simplest approach to constructing and displaying the biobjective slices
of the Pareto frontier can be based on a direct conversion of the multiobjective
problem with $k > 2$ to a series of biobjective problems. Such a concept is
close to the $\varepsilon$-constraint method (Chankong and Haimes, 1983) described in
Chapter 1. The only difference is the following: One has to select any two
objectives $f_i$ and $f_j$ (of $k$ functions) to be minimized instead of only one
(as in the $\varepsilon$-constraint method). Then, the following biobjective problem is
considered
$$
\begin{aligned}
\text{minimize} \quad & \{f_i(x),\, f_j(x)\} \\
\text{subject to} \quad & f_l(x) \le \varepsilon_l \ \text{ for all } l = 1, \dots, k,\ l \neq i, j, \\
& x \in S.
\end{aligned}
\tag{9.1}
$$
Here the values of $\varepsilon_l$, $l = 1, \dots, k$, $l \neq i, j$, must be given in
advance. As said, various methods have been proposed for constructing biobjective
Pareto frontiers. These methods can be used for solving problem (9.1). As a result,
one obtains the Pareto frontier for two selected objectives $f_i$ and $f_j$ for given
values of $\varepsilon_l$ of the $k - 2$ other objectives. Note, however, that this is
true only if the plane does indeed cut the Pareto frontier.
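The structure of problem (9.1) can be mimicked on a finite set of precomputed objective vectors: first discard the vectors that violate the bounds on the other objectives, then keep the nondominated pairs of the two selected objectives. This is only a sketch of the idea on sample data, not a method for solving (9.1) over the feasible set; the function name, the zero-based objective indices and the sample points are hypothetical.

```python
def biobjective_slice(points, i, j, eps):
    """Pointwise analogue of problem (9.1), all objectives minimized.

    points: list of objective vectors (tuples);
    i, j:   zero-based indices of the two objectives kept on the axes;
    eps:    dict mapping every other index l to its upper bound eps_l.
    Returns the nondominated (z_i, z_j) pairs among the vectors
    that satisfy z_l <= eps_l for all bounded objectives."""
    feasible = [p for p in points if all(p[l] <= bound for l, bound in eps.items())]
    pairs = {(p[i], p[j]) for p in feasible}
    return sorted(
        q for q in pairs
        if not any(r != q and r[0] <= q[0] and r[1] <= q[1] for r in pairs)
    )


# Hypothetical objective vectors (z1, z2, z3).
vectors = [(1, 4, 2), (2, 2, 3), (3, 1, 5), (2, 3, 1), (1, 1, 9)]
tight = biobjective_slice(vectors, 0, 1, {2: 3})    # bound z3 <= 3
loose = biobjective_slice(vectors, 0, 1, {2: 10})   # bound z3 <= 10
```

Relaxing the bound on the third objective admits the vector (1, 1, 9), which then dominates every other pair in the plane of the first two objectives; this is how different values of $\varepsilon_l$ produce different tradeoff curves.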
To get a full picture of the Pareto frontier in this way, one needs to specify
a grid in the space of $k - 2$ objectives and solve a biobjective problem for
any point of the grid. For example, in case of five objective functions with 10
possible values of $\varepsilon_l$ for any of ($k - 2 =$) 3 objectives, we have to
construct the Pareto frontier for 1000 biobjective problems, which naturally is a
tremendous task. In addition, one has to somehow visualize these 1000 biobjective
frontiers. For this reason, researchers usually apply this approach only in the case
of $k = 3$ or, sometimes, $k = 4$ and restrict to a dozen biobjective problems. As
to visualization, one usually can find a convenient way for displaying about
a dozen tradeoff curves. Examples are given in (Mattson and Messac, 2005;
Schuetze et al., 2007).
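The count of 1000 problems follows directly from enumerating the grid. In the sketch below the ten bound values per objective are hypothetical placeholders; only the numbers of objectives and of values per objective are taken from the text.

```python
from itertools import product

k = 5                                        # number of objectives
levels = [0.1 * n for n in range(1, 11)]     # 10 hypothetical bound values
grid = list(product(levels, repeat=k - 2))   # one tuple of bounds per biobjective problem
number_of_problems = len(grid)               # 10 ** 3
```

With $k = 3$ the same enumeration gives only ten biobjective problems, which explains why the approach is usually restricted to three or four objectives.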
One can prove that tradeoff curves obtained in this way do not intersect
for $k = 3$ (though they can touch each other). Thus, in this case a system
of Pareto frontiers displayed in the same graph is relatively simple (see, e.g.,
Figure 9.2 adopted from (Haimes et al., 1990), p. 71, Fig. 4.2). Such graphs
are known as decision maps.
If needed, a decision map can be considered as a collection of projections
of biobjective slices on the objective plane of $f_i$ and $f_j$, but this interpretation
is not obligatory and does not bring additional information. Early examples of
applying decision maps in multiobjective water management have been given
by Louie et al. (1984) (p. 53, Fig. 7) and Jewel (1990) (p. 464, Fig. 13.10).
In addition to tradeoff rates for any of several tradeoff curves, decision maps
provide graphic information concerning objective tradeoffs between any pair of
objective vectors that belong to different tradeoff curves. Thus, decision maps
provide tradeoff information on the relations between Pareto optimal points
for three objectives. In the next sections we describe methods for constructing
decision maps that are more effective than using problem (9.1) with a large
number of values for $\varepsilon_3$.
9.2.3 Comment on the Stability of the Pareto Frontier
Let us here pay attention to the issue of the stability of the Pareto frontier,
that is, to the question whether the Pareto frontier depends continuously on
the parameters of the problem. This issue is important for computer
approximation and visualization. The problem is that a Pareto
frontier is often unstable with respect to disturbances of the parameters of the
model. To apply Pareto frontier approximation techniques correctly, one has to
first prove that the Pareto frontier exists and is unique for some set of
parameter values (which usually is true) and that the Pareto frontier is stable
with respect to their disturbances (which usually is unknown). A sufficient
condition for the stability of
the Pareto frontier follows from the theorems of Chapter 4 in (Sawaragi et al.,
1985): If some simple technical conditions are satisfied, the stability of the
Pareto frontier is provided by the coincidence of an unbiased Pareto frontier
and a weak Pareto frontier. If the class of disturbances is broad enough, this
condition is also necessary.
Fig. 9.2. A decision map.

Usually it is impossible to check whether Pareto and weak Pareto frontiers
coincide before constructing the Pareto frontier. In the process of
approximation, large parts of the Pareto frontier (and, hence, related decisions) can be
lost. Or, vice versa, some parts of the frontier of the set $Z$, which do not
belong to the Pareto frontier, can be considered to be Pareto optimal. The
result may depend on the computer and even on its initial state. In this case,
different users will get different approximations.
To solve this problem, robust algorithms for approximating the Pareto
frontier have been proposed (Krasnoshchekov et al., 1979). A more detailed
list of references is given in Subsection 9.4.4. To avoid using these complicated
algorithms, many researchers have tried to revise the problem of approximating
the Pareto frontier to make it well posed. One of the approaches is based
on squeezing the domination cone. As a result, a set is constructed which
contains the Pareto frontier as its subset. However, such an approach may result
in complications related to the instability of the broadened Pareto frontier.
Another form of problem revision can be based on the approximation of the
set $Z$ or $Z_p$, thus avoiding the problem of the stability of the Pareto frontier.
The idea of approximating sets $Z$ or $Z_p$ instead of the Pareto frontier itself
has been known for years (Lotov, 1975, 1983; Yu, 1985; Kaliszewski, 1994;
Benson, 1998). In the case of pointwise approximations, this idea has been
used in (Evtushenko and Potapov, 1987; Reuter, 1990; Benson and Sayin,
1997). In Section 9.3 we show that visualizing the frontier of the EPH may
help to solve the problem of an unstable Pareto frontier.
9.3 Visualization of the Pareto Frontier Based on Approximation by Convex Polyhedra
An important class of nonlinear multiobjective optimization problems is the
set of convex problems. Several definitions of convex multiobjective optimization
problems can be given. We use the most general one: a multiobjective
optimization problem is convex if the set $Z_p$ is convex (Henig and Buchanan,
1994). Note that the feasible objective region may be nonconvex in this case.
A sufficient condition for the convexity of an EPH is given, for example, as:
the set $Z_p$ is convex if the set $S$ is convex and all objective functions are
convex (Yu, 1974). One can see that this sufficient condition is not a necessary
one because problems with nonconvex feasible regions do not satisfy the
condition. At the same time, there exist important real-life problems which have
nonconvex feasible regions, but a convex EPH. Note that linear multiobjective
optimization (MOLP) problems are always convex.
In a convex case, in addition to the universal pointwise approximation
of the Pareto frontier, its polyhedral approximation is possible. This can be
given by hyperplanes or hyperfaces (i.e., faces of dimension $k - 1$). A polyhedral
approximation of the Pareto frontier provides a fast visualization of the Pareto
frontier.
There exist two main polyhedral approximation approaches aimed at
visualizing Pareto frontiers in convex problems. The first approach is based on
the fact that the Pareto frontier is a part of the frontier of the convex feasible
objective region (or its EPH). Thus, one can construct faces of the feasible
objective region that approximate the Pareto frontier without approximating
the feasible objective region itself. In the second approach, one constructs
a polyhedral set that approximates the feasible objective region or any
convex set that has the same Pareto frontier as the feasible objective region (like
the EPH).
The first approach results in complications with more than two objectives.
Note that the Pareto frontier of the convex feasible objective region is usually
a nonconvex set itself. Its approximation can include a large number of
hyperfaces, which must be somehow connected to each other. Taking into
account that, in addition, the Pareto frontier is often unstable, it is extremely
hard to develop a method and software that are able to approximate and
visualize the Pareto frontier in this form if we have more than three objectives.
Such complications are discussed in (Solanki et al., 1993; Das and Dennis,
1998).
The second approach is based on approximating the convex feasible
objective region (or another set in the objective space that has the same Pareto
frontier), for which a polyhedral approximation can be given by a system of
a finite number of linear inequalities
$$
Hz \le h, \tag{9.2}
$$
where $H$ is a matrix and $h$ is a vector, which are to be constructed by an
approximation technique. A biobjective slice for the objective values $z_i$ and
$z_j$ can be constructed very fast: it is sufficient to fix the values of other
objectives in (9.2). Another advantage of the form (9.2) is the stability of compact
polyhedral sets to errors in data and rounding errors (Lotov et al., 2004).
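Fixing the other objectives in (9.2) simply moves their terms to the right-hand side, leaving a system of linear inequalities in the two remaining variables. The following sketch shows this step in plain Python; the function names, the zero-based indices and the small example polyhedron are hypothetical and serve only as an illustration.

```python
def slice_inequalities(H, h, i, j, fixed):
    """Restrict the system H z <= h to the plane where every objective
    other than i and j is fixed.

    fixed: dict mapping each other index l to its fixed value z_l.
    Returns (A, b) describing the slice as A y <= b with y = (z_i, z_j)."""
    A, b = [], []
    for row, rhs in zip(H, h):
        A.append((row[i], row[j]))
        b.append(rhs - sum(row[l] * value for l, value in fixed.items()))
    return A, b


def in_slice(A, b, y, tol=1e-9):
    """Check whether the point y = (z_i, z_j) satisfies A y <= b."""
    return all(a0 * y[0] + a1 * y[1] <= rhs + tol for (a0, a1), rhs in zip(A, b))


# Hypothetical polyhedron: z1 + z2 + z3 >= 3 and z1, z2, z3 >= 0,
# written in the form H z <= h.
H = [[-1, -1, -1], [-1, 0, 0], [0, -1, 0], [0, 0, -1]]
h = [-3, 0, 0, 0]
A, b = slice_inequalities(H, h, 0, 1, {2: 1})   # fix z3 = 1
```

In this example the slice at $z_3 = 1$ is the set $z_1 + z_2 \ge 2$, $z_1 \ge 0$, $z_2 \ge 0$. The cost of the step is one pass over the rows of $H$, which is consistent with the remark that slices can be constructed very fast.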
As far as the system of inequalities (9.2) is concerned, it is usually
constructed when approximating any of the following convex sets:
1. set $Z$ (i.e., the feasible objective region) (Lotov, 1975),
2. set $Z_p$ (i.e., the EPH), which is the largest set that has the same Pareto
frontier as $Z$ (Lotov, 1983), or
3. a confined set that is the intersection of $Z_p$ with such a box that the
frontier of the intersection contains the Pareto frontier. Formally, this was
proposed by Benson (1998) who called it an efficiency-equivalent
polyhedron. However, such a set was used already in the 1980s in software for
approximating and visualizing Pareto frontiers, see, e.g., Figure 9.3.
From the point of view of visualization, approximating the EPH or an
efficiency-equivalent polyhedron is preferable since its biobjective slices
provide decision maps, in which the Pareto optimal frontier can be recognized by
the DM immediately.
The structure of the rest of this section is as follows. Subsections 9.3.1
and 9.3.2 are devoted to visualization in the case of three and more than
three objectives, respectively. In Subsection 9.3.3 we briefly outline the main
methods for constructing a polyhedral approximation in the form (9.2) when
$k > 2$. Various methods for approximating Pareto frontiers in convex problems
with two objectives are described in (Ruzika and Wiecek, 2005).
9.3.1 Visualization in the Case of Three Objectives
If we have three objectives, two main approaches to visualizing the Pareto
frontier can be applied: displaying the approximation of the Pareto frontier
(or EPH) (i) as a three-dimensional graph; or (ii) using decision maps. Let us
compare these two approaches with an example of a long-term dynamic
economic model that takes environmental indicators into account (Lotov et al.,
2004). The objectives are a consumption indicator C (deviation from
balanced growth, to be maximized), a pollution indicator Z (maximal
pollution during a time period, to be minimized) and an unemployment indicator
U (maximal unemployment during a time period, to be minimized). The
three-dimensional graph of the EPH and the decision map for this problem
are given in Figures 9.3 and 9.4, respectively. They were prepared manually
to demonstrate differences between the two formats of visualization.
The Pareto frontier is hatched in the three-dimensional graph in Figure
9.3. This picture shows the general structure of the problem. Ten interesting
objective vectors are specified in the graph. Some of them (vectors 1-6) belong
to the plane U = 0. Due to this, it is possible to estimate the objective
values in these vectors. At the same time, it is fairly complicated to check
the objective values in the other objective vectors and to evaluate tradeoffs
between them or tradeoff rates using such a graph (except the tradeoff curve
for U = 0 that is, actually, a biobjective slice of the Pareto frontier).
Let us next compare the three-dimensional graph with the decision map
in Figure 9.4. The decision map is of a style given in Figure 9.2. Tradeoff
curves (biobjective slices of the EPH) were calculated for several values of
U, which are given in the graph near the associated tradeoff curves. The
objective vectors 1-10 are also depicted in the figure.
Fig. 9.3. A three-dimensional graph of the Pareto frontier.
Fig. 9.4. A decision map.
One can see that the tradeoff curve corresponding to U = 0 has a kinked
form: the tradeoff rate for Z and C changes substantially when -0.4 <
C < -0.2. If C < -0.40, one can increase consumption without increasing the
pollution indicator Z (or with a minimal increase). In contrast, while C > 0,
a small additional growth of C results in a drastic growth of the pollution
indicator Z. One can easily compare other points along this tradeoff curve,
as well.
Four other tradeoff curves in Figure 9.4 have a similar form, but the kink
is smoother than for U = 0. These tradeoff curves show in a clear way how
it is possible to decrease pollution while maintaining the same value of C
by switching from one tradeoff curve to another. In this way, a decision map
provides more information on objective values and, especially, on tradeoffs,
than a three-dimensional graph. One can see the total tradeoff between any
two points belonging to the same tradeoff curve as well as the total tradeoff
between any two points that have the same value of C or Z. The tradeoff
rates are visible at any point of the tradeoff curves.
Let us next discuss how well decision maps meet the requirements on
visualization techniques formulated in Subsection 9.2.1. First of all, let us note
that tradeoff curves do not intersect in a decision map (though they may
sometimes coincide). Due to this, they look like contour lines of topographic maps.
Indeed, a value of a third objective (related to a particular tradeoff curve)
plays the role of the height level related to a contour line of a topographic
map. For example, a tradeoff curve describes such combinations of values of
the first and the second objectives that are feasible for a given constraint
imposed on the value of the third objective (like "places lower than..." or "places
higher than..."). Moreover, one can easily estimate which values of the third
objective are feasible for a given combination of the first and the second
objectives (like "the height of this particular place is between..."). If the distance
between tradeoff curves is small, this could mean that there is a steep ascent
or descent in values, that is, a small move in the plane of two objectives is
related to a substantial change in the value of the third objective.
Thus, decision maps are fairly similar to topographic maps. For this reason,
one can use topographic maps for the evaluation of the effectiveness of the
visualization given by decision maps. Topographic maps have been used for
a long time and people usually understand the information displayed without
difficulties. Experience with topographic maps shows that they
are simple enough to be immediately understood, persistent enough not to be
forgotten by people after their exploration is over, and complete enough to
provide information on the levels of particular points in the map.
The analogy between decision maps and topographic maps suggests that
decision maps satisfy the requirements specified in Subsection 9.2.1. Note that
the decision maps can be constructed in advance before they are studied, or
on-line, after obtaining a request from a user. On-line calculation of decision
maps provides additional options to change objectives located on axes, change
the number of tradeoff curves on decision maps, zoom the picture and change
graphic features of the display such as the color of the background, colors of
the slices, etc. For example, it would be good to get an additional tradeoff
curve for the values of U between zero and 30% for the example considered.
9.3.2 Visualization in the Case of More than Three Objectives
In the case of more than three objectives, the possibility of an interactive
display of decision maps (if it exists) is an important advantage. Visualization of
the Pareto frontier with more than three objectives can be based on the
exploration of a large number of decision maps, each of which describes objective
tradeoffs between three objectives. Let us assume that an approximation of
a convex feasible objective region or of another convex set in the objective
space has already been constructed in the form (9.2). Then, a large number
of decision maps can be constructed in advance and displayed on a computer
screen, whenever asked for. However, the number of decision maps to be
prepared increases drastically with the growth of the number of objectives. For
example, in the case of four objectives, by using the modification of the
$\varepsilon$-constraint method described in Subsection 9.2.2, one could prepare several
dozens of decision maps for different values of the fourth objective. However,
it is not clear in advance whether these decision maps are informative enough
for the DM. It may happen that different three-objective decision maps are
preferable for her/him (e.g., $f_1$ and $f_4$ on the axes, $f_2$ as level of height
and $f_3$ to be used as a constraint). The situation is much worse in the case
of five objectives. Thus, it seems to be reasonable to enable calculating and
displaying the decision maps on-line.
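How quickly the choice of maps grows can be seen by counting the possible assignments of objectives to the roles mentioned above: two objectives on the axes, one as the level of height, and the rest as constraints. The sketch below enumerates these role assignments only; it does not count the different constraint values, which multiply the totals further. The function name is hypothetical.

```python
from itertools import combinations


def decision_map_layouts(k):
    """Enumerate role assignments for k objectives numbered 1..k:
    an unordered pair on the axes, one objective as the level of height,
    and all remaining objectives as constraints."""
    layouts = []
    objectives = range(1, k + 1)
    for axes in combinations(objectives, 2):
        for level in objectives:
            if level in axes:
                continue
            constraints = tuple(l for l in objectives if l not in axes and l != level)
            layouts.append((axes, level, constraints))
    return layouts


layouts_four = decision_map_layouts(4)
layouts_five = decision_map_layouts(5)
```

The layout from the text, with $f_1$ and $f_4$ on the axes, $f_2$ as the level and $f_3$ as a constraint, appears in this enumeration as `((1, 4), 2, (3,))`.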
To provide on-line displays of decision maps fast, one can approximate
the Pareto frontier beforehand and then compute biobjective slices fast. In
convex cases, one can use polyhedral approximations of the EPH for which
biobjective slices can be computed and superimposed very fast. Decision maps
corresponding to constraints imposed on the remaining $k - 3$ objectives can be
provided to the DM on request in different forms. Here we describe two of
them: animation and matrices. These two forms compete in the case of four or
five objectives. In the case of more than five objectives, these forms supplement
each other.
Animation of decision maps requires generating and displaying a large
number of biobjective slices fast (several hundred slices per second for four
objectives). It is important that in the process of animation, the slices are
constructed for the same two objectives whose values are located on the axes.
Chernykh and Kamenev (1993) have developed a special fast algorithm for
computing biobjective slices based on preprocessing the linear inequalities of
the polyhedral approximation of the EPH. Due to this algorithm, thousands
of decision maps can be computed and depicted in seconds. The control of
animation is based on such a traditional computer visualization technique
as scroll-bars. Figure 9.5 represents a gray scale copy of a color computer
display for a real-life water quality problem (Lotov et al., 2004) involving
five objectives. The decision map consists of four superimposed, differently
colored biobjective slices. There is only one difference between decision maps
given in Figures 9.5 and 9.4: the Pareto frontiers in Figure 9.5 are given as
the frontiers of colored areas and not as curves. This form has proven to be
more convenient for decision makers. A palette shows the relation between
the values of the third objective and colors. Two scroll-bars are related to the
values of the fourth and the fifth objectives.
Fig. 9.5. Gray scale copy of a decision map and two scroll-bars.
A movement of a scroll-bar results in a change of the decision map. The DM
can move the slider manually. However, the most effective form of displaying
information to the DM is based on an automatic movement of the slider, that
is, on a gradual increment (or decrement) in the constraint imposed on the
value of an objective. A fast replacement of the decision maps offers the effect
of animation. Because any reasonable number of scroll-bars can be located
on the display, one can explore the influence of the fourth, the fifth (and
maybe even the sixth and the seventh etc.) objectives on the decision map.
Since the EPH has been approximated for the whole problem in advance,
different forms of animation can be used, including a simultaneous movement
of several sliders. However, such effects are not recommended since they are
too complicated for DMs. Animating only one slider at a time, while keeping
the other sliders fixed, has turned out to be advisable.
Another form of displaying several decision maps is a matrix of decision
maps, which can be interpreted as a collection of animation snap-shots. In
Figure 9.6, one can see a matrix of decision maps for the same problem.
Four values of an objective (associated with a scroll-bar in Figure 9.5) form
four columns, and four values of another objective (associated with the other
scroll-bar) form four rows. Note that such a matrix can be displayed without
animation at all. The constraints on the objectives that define the decision
maps used can be specified manually by the DM or automatically. The
maximal number of rows and columns in such a matrix depends exclusively on the
desires of the DM and on the quality of the computer display.
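The layout of such a matrix can be sketched as a grid of constraint pairs, one pair per decision map. The example below assumes evenly spaced constraint values, which is only one of the automatic choices one could make; the objective ranges and function names are hypothetical.

```python
def evenly_spaced(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive (n >= 2)."""
    return [lo + (hi - lo) * t / (n - 1) for t in range(n)]


def decision_map_matrix(range_fourth, range_fifth, n_cols=4, n_rows=4):
    """Matrix of (fourth objective bound, fifth objective bound) pairs.
    Columns vary the bound on the fourth objective, rows the bound on
    the fifth; each cell identifies one decision map to be drawn."""
    cols = evenly_spaced(range_fourth[0], range_fourth[1], n_cols)
    rows = evenly_spaced(range_fifth[0], range_fifth[1], n_rows)
    return [[(c, r) for c in cols] for r in rows]


# Hypothetical ranges of the two objectives tied to the scroll-bars.
matrix = decision_map_matrix((0, 3), (10, 40))
```

Each cell of `matrix` would then be passed to whatever routine computes the superimposed biobjective slices for the given pair of bounds.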
In the case of six or more objectives, animation of the entire matrix of decision
maps is possible but may be cognitively very demanding. In this case, the
values of the sixth, the seventh and the eighth objectives can be related to
scroll-bars, the sliders of which can be moved manually or automatically. It is
important that the objectives in the matrix of decision maps can be arranged
in the order the DM wishes, that is, any objective can be associated with an
axis, the color palette, a scroll-bar, or the columns or rows of the matrix.
By shortening the ranges of the objectives the DM can zoom the decision
maps. It is recommended to restrict the number of objective functions
considered to five or six, because otherwise the amount of information becomes too
ramified for a human being. However, in some real-life applications,
environmental engineers have managed to apply matrices of decision maps even for
nine objective functions.
The visualization technique for the Pareto frontier described in this
subsection is called the interactive decision maps (IDM) technique. For
further details, see (Lotov et al., 2004). After exploring the Pareto frontier, the
DM can specify a preferred combination of feasible objective values (feasible
goal) directly on one of the decision maps. Since the goal identified is close
to the precise Pareto frontier, the related decision can be found by solving an
optimization problem using any distance function. For the same reason, the
objective vector corresponding to the Pareto optimal decision vector found
will be close to the goal identified and, thus, this approach can be called a
feasible goals method (Lotov et al., 2004).
9.3.3 Comment on Polyhedral Approximation
Note that the problem of the polyhedral approximation of the Pareto frontier
is close to the classical problem of applied mathematics: polyhedral approxi-
mation of convex sets. Methods for polyhedral approximation of convex sets
are based on iterative algorithms for constructing converging sequences of
approximating polyhedra.
The main concepts of iterative methods for polyhedral approximation of
convex sets were introduced already in the 1970s by McClure and Vitale
(1975), who proposed the general idea of simultaneously constructing two
polyhedral approximations, an internal one and an external one, where ver-
tices of the internal approximation coincide with the points of contact of the
external approximation. Note that this construction also provides an assessment of the quality of the
approximation of the Pareto frontier.
Then, both the approximations are used for iteratively decreasing the local
discrepancy between them in a balanced adaptive way. It should result in a
fast total convergence of the internal and external polyhedra, thereby providing
convergence to the approximated convex set.
Fig. 9.6. Gray scale copy of a matrix of decision maps.
This idea was transformed by Cohon (1978); Cohon et al. (1979) into the
first adaptive approximation method, the non-inferior set estimation (NISE)
method. The NISE method was implemented as software for approximating
Pareto frontiers in linear biobjective optimization problems. Then, the
estimation refinement (ER) method was proposed (Bushenkov and Lotov,
1982; Lotov, 1989) that integrated the concepts of the NISE method with
methods of linear inequality theory. The ER method effectively constructs
internal and external approximations of multi-dimensional (3 < k < 8) convex
bodies. Its convergence can be proved to be asymptotically optimal.
The ER method has been implemented as software (downloadable from
http://www.ccas.ru/mmes/mmeda/soft/) that has been applied to several
real-life problems (Lotov et al., 2004).
To illustrate approximation methods, we provide a simplified iteration of
the ER method applied for approximating a convex and compact feasible
objective region Z. Prior to the (l + 1)-th iteration, a polyhedron $P^l$, whose
vertices belong to the frontier of Z, has to be constructed in two forms:
in the form of a list of its vertices as well as in the form of a system of linear
inequalities
$(u^k, z) \le u^k_0, \quad k = 1, 2, \dots, K(l)$ (9.3)
where K(l) is the number of inequalities in the system (9.3), $u^k$ are vectors
with unit norm and $u^k_0$ are right sides of the inequalities.
Step 1. Solve the optimization problems $(u^k, z) \to \max$ over the set Z for
all k = 1, . . . , K(l) except those solved at the previous iterations. Denote
the maximal values by $g_Z(u^k)$. Among $u^k$ find a vector $u^*$ that maximizes
the increment, i.e., the value $g_Z(u^k) - u^k_0$. If all increments are sufficiently
small, then stop.
Step 2. The new approximating polyhedron $P^{l+1}$ is given by the convex
hull of the polyhedron $P^l$ and the point $z^*$ of the boundary of Z that has
$(u^*, z^*) = g_Z(u^*)$. Construct a new linear inequality system of type (9.3)
which describes $P^{l+1}$ by using the stable algorithm of the beneath-beyond
method described in (Lotov et al., 2004). Start the next iteration.
Note that the ER method provides assessment of the quality of the approximation
of the Pareto frontier: the external polyhedron $\hat{P}^{l+1}$ that is given by
the linear inequality system
$(u^k, z) \le g_Z(u^k), \quad k = 1, 2, \dots, K(l + 1)$
contains Z. Due to this, at any iteration, we have internal and external estimates
for Z. Therefore, it is possible to evaluate the accuracy visually or use
the maximal value of $g_Z(u^k) - u^k_0$ as the accuracy measure.
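The two steps can be tried out on a small planar example. The following Python sketch is only an illustration under strong assumptions and is not the implementation of Lotov et al. (2004): Z is taken to be the unit disc, so the support oracle is available in closed form, and in two dimensions the convex hull update of Step 2 reduces to inserting the new point between the endpoints of the selected edge, which replaces the beneath-beyond method. All function names are hypothetical, and for simplicity every support problem is re-solved at each iteration.

```python
import math

def support_unit_disc(u):
    """Support oracle for the unit disc Z: returns (g_Z(u), maximizer z*)."""
    norm = math.hypot(u[0], u[1])
    return norm, (u[0] / norm, u[1] / norm)

def edge_inequalities(vertices):
    """Yield (u, u0, i) for each edge i of a convex polygon whose vertices are
    listed counter-clockwise: u is the unit outward normal and u0 the right
    side of the inequality (u, z) <= u0."""
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        nx, ny = y2 - y1, x1 - x2
        length = math.hypot(nx, ny)
        u = (nx / length, ny / length)
        yield u, u[0] * x1 + u[1] * y1, i

def er_iteration(vertices, support, tol):
    """One simplified ER step. Returns (vertices, max_increment, stop)."""
    best_inc, best_i, best_z = -math.inf, None, None
    for u, u0, i in edge_inequalities(vertices):
        g, z = support(u)                  # Step 1: maximize (u, z) over Z
        if g - u0 > best_inc:
            best_inc, best_i, best_z = g - u0, i, z
    if best_inc <= tol:
        return vertices, best_inc, True
    # Step 2 (planar shortcut): insert z* between the endpoints of the edge
    # whose normal is u*; this equals the convex hull update for this example.
    new_vertices = vertices[:best_i + 1] + [best_z] + vertices[best_i + 1:]
    return new_vertices, best_inc, False

def approximate(support, vertices, tol=1e-2, max_iter=1000):
    """Repeat ER steps until the maximal increment does not exceed tol."""
    inc = math.inf
    for _ in range(max_iter):
        vertices, inc, stop = er_iteration(vertices, support, tol)
        if stop:
            break
    return vertices, inc

# Initial internal polygon: a triangle with vertices on the boundary of Z.
start = [(math.cos(a), math.sin(a))
         for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
polygon, accuracy = approximate(support_unit_disc, start)
```

The returned value `accuracy` is the maximal increment, that is, the largest gap between the internal polygon and the external one in the directions of the edge normals, which is the accuracy measure mentioned above.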
Later, various related methods have been developed, including methods
which are dual to the ER method (Kamenev, 2002; Lotov et al., 2004). An
interesting method has been proposed by Schandl et al. (2002a,b); Klamroth
et al. (2002) for the case of more than two objectives. The method is aimed
at constructing inner polyhedral approximations for convex, nonconvex, and
discrete multiobjective optimization problems. In (Klamroth et al., 2002) the
method for convex problems is described separately, and one can see that
the algorithm for constructing the internal approximation is close to the ER
method, while the algorithm for constructing an external approximation is
close to the dual ER method. Other interesting ideas have been proposed,
including (Voinalovich, 1984; Benson, 1998).
9.4 Visualization of Pointwise Approximations of the Pareto Frontier
In this section, we consider nonconvex multiobjective optimization problems.
To be more specific, we discuss methods for visualizing approximations of the
Pareto frontier given in the form of a list of objective vectors. We assume
that this list has already been constructed. It is clear that a good approxi-
mation representing the Pareto optimal set should typically consist of a large
number of solutions. This number may be hundreds, thousands or even much
more. For this reason, methods developed for visualizing a small number of
Pareto optimal points, as a rule, cannot be used for visualizing the Pareto
frontier as a whole. Thus, we do not consider here such methods as bar and
pie charts, value paths, star and web (radar) diagrams or harmonious houses,
which are described in Chapter 8 and (Miettinen, 1999, 2003) and used, for
example, in (Miettinen and Mäkelä, 2006). Let us point out that here we do
not consider visualization techniques that are based on various transforma-
tions of the Pareto frontier, like GAIA, which is a part of the PROMETHEE
method (Mareschal and Brans, 1988), BIPLOT (Lewandowski and Granat,
1991) or GRADS (Klimberg, 1992), or methods that utilize preference infor-
mation (Vetschera, 1992).
There exists a very close, but different problem: visualization of objective
vectors in finite selection problems involving multiple objectives (often named
attributes in such problems) (Olson, 1996). If the number of alternatives is
large, the problem is close to the problem of visualizing an approximation of
the Pareto frontier.
Note that any point of a pointwise approximation of the Pareto optimal
set usually corresponds to a feasible decision vector. Thus, along with a pointwise
approximation of the Pareto frontier, one usually obtains a pointwise
approximation of the set of Pareto optimal decision vectors. However, we do
not consider the set of Pareto optimal decision vectors since it is usually not
visualized for the reasons which have been discussed in the introduction.
This section consists of four parts. Subsection 9.4.1 surveys visualization
methods based on concepts of heatmap graphs and scatterplot matrices for
the case of a large number of objective vectors. In Subsections 9.4.2 and 9.4.3,
we describe visualization of pointwise Pareto frontier approximations by us-
ing biobjective slices of the EPH and application of enveloping, respectively.
Finally, in Subsection 9.4.4, we discuss some topics related to pointwise ap-
proximations of Pareto frontiers (that were not touched in Chapter 1).
9.4.1 Heatmap Graphs and Scatterplots
A well-known technique for visualizing a small number of decision alternatives
is the value path method proposed, for example, in (Geoffrion et al., 1972). A
value path figure consists of k parallel lines related to attributes (objectives)
with broken lines on them associated with the decision alternatives (Miettinen,
1999). See also Chapter 8. Such figures are not too helpful if we are interested
in selecting the best alternative from a large number of them, but they can
help to cluster such sets of alternatives (Cooke and van Noortwijk, 1999).
Interesting developments of value paths are heatmap graphs (Pryke et al.,
2007) that were adopted from visualization methods developed by specialists
in data mining. In Figure 9.7, we provide a gray scale copy of a heatmap graph
(which, however, gives only a limited impression of the colorful graph). In the
figure, there are 50 alternatives described by 11 decision variables (p1–p11)
and two objective functions (o1–o2). An alternative is given by a straight
line, while the depth of gray provides information about the normalized
values (from 0 to 1) of decision variables and objectives (black corresponds to
zero, white corresponds to 1). Heatmap graphs are convenient for selecting
clusters of alternatives, but can be used for selecting one alternative, as well.
Fig. 9.7. Gray scale copy of a heatmap graph
A scatterplot matrix is a technique for visualizing a finite number of objective
vectors with biobjective projections. For clarity, we introduce it here for a
small number of objective vectors even though it could be used for bigger
sets of solutions. Hereby we augment the treatment given in Chapter 8 by
showing that a visualization, like the scatterplot, can result in misinforming
the DM and skipping the best alternatives even in the case of a small number
of solutions.
A scatterplot matrix was introduced in (Meisel, 1973), but it
became widely recognized only after Cleveland (1985) discussed it. It was directly
inspired by the biobjective case, but is used in the case of three or more
objectives. The technique displays objective vectors as their projections onto all
possible biobjective planes. These projections are displayed as a scatterplot
matrix, which is a square matrix of panels each representing one pair of objective
functions. Thus, each panel shows partial objective tradeoffs for a different
pair of objectives. The dimension of the matrix coincides with the number
of objective functions. Thus, any pair of objective functions is displayed twice
with the scales interchanged (with the diagonal panels being empty).
Fig. 9.8. Scatterplot matrix for three objectives.
A simple example of a scatterplot matrix is given in Figure 9.8. It is related
to a simple decision problem involving seven Pareto optimal alternatives and
three objectives to be minimized. Thus we have seven objective vectors:
1. Alternative #1: (0, 1, 1)
2. Alternative #2: (1, 0, 1)
3. Alternative #3: (1, 1, 0)
4. Alternative #4: (0.2, 0.2, 0.8)
5. Alternative #5: (0.2, 0.8, 0.2)
6. Alternative #6: (0.8, 0.2, 0.2)
7. Alternative #7: (0.4, 0.4, 0.4)
The projections of the vectors are numbered according to their number in the
list. One can see that in this example the vectors were specified in such a way that
the variety of their projections is the same for all biobjective planes (only the
number of the original objective vector changes). This was done to illustrate
that, despite its simplicity, this technique can be misleading since only
partial tradeoffs are displayed in the biobjective projections. In the list one
can see that vector #7 is balanced (which means that it may be a preferred
one). However, it is deep inside in all the projections in Figure 9.8. Thus,
the user may not recognize its merits and, instead, select solutions for which
projections look great, but the projected vectors have unacceptable values of
some objectives. This example shows that simple projections on biobjective
planes may result in misunderstanding the position of a solution in the Pareto
optimal set. The situation is especially complicated in the case of a large number of
alternatives.
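The effect can be checked numerically for the seven vectors listed above. The short Python sketch below (function and variable names are our own) computes the nondominated alternatives in the full three-objective space and in each biobjective projection, and also the worst objective value of each alternative as a simple indicator of balance.

```python
from itertools import combinations

# The seven objective vectors of the example (all objectives minimized).
alternatives = {
    1: (0.0, 1.0, 1.0),
    2: (1.0, 0.0, 1.0),
    3: (1.0, 1.0, 0.0),
    4: (0.2, 0.2, 0.8),
    5: (0.2, 0.8, 0.2),
    6: (0.8, 0.2, 0.2),
    7: (0.4, 0.4, 0.4),
}

def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def nondominated(vectors):
    """Ids of the vectors that no other vector dominates."""
    return {i for i, v in vectors.items()
            if not any(dominates(w, v) for j, w in vectors.items() if j != i)}

# Nondominated set in the full objective space.
full = nondominated(alternatives)

# Nondominated set in each panel of the scatterplot matrix.
by_panel = {}
for axes in combinations(range(3), 2):
    projected = {i: tuple(v[a] for a in axes) for i, v in alternatives.items()}
    by_panel[axes] = nondominated(projected)

# Worst objective value of each alternative (smaller means more balanced).
worst = {i: max(v) for i, v in alternatives.items()}
most_balanced = min(worst, key=worst.get)
```

All seven alternatives are nondominated in the three-objective space, but alternative #7 is dominated in every panel, although it has the smallest worst-case objective value. This is the numerical counterpart of its position deep inside every projection.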
For this reason, researchers try to improve the scatter plot technique by
providing values of objectives, which are not given on the axes, in some form,
like colors. However, such an approach may be effective only in the case of a
small number of points and objectives. Furthermore, in some software imple-
mentations of the scatterplot technique, one can specify a region in one of the
panels, and the software will color points belonging to that region in all other
panels. Though such a tool may be convenient, balanced points may still be
lost as in Figure 9.8.
Let us nally mention one more tool, so-called knowCube, by Trinkaus and
Hanne (2005) that can be used when a large set of Pareto optimal solutions
should be visualized. It is based on a radar chart (or spider-web chart) de-
scribed in Chapter 8. However, in knowCube there can be so many solutions
(in the beginning) that it is impossible to identify individual solutions. Then,
the DM can iteratively study the solutions, for example, by fixing desirable
values for some objective functions or filtering away undesirable solutions by
specifying upper and/or lower bounds for objective function values.
9.4.2 Decision Maps in Visualizing Pointwise Approximations of the Pareto Frontier
Let us next consider visualization of an approximation of the Pareto frontier
given by a finite number of objective vectors (from several hundreds to several
thousands) using decision maps of biobjective slices. As described in Section
9.3, in convex cases decision maps visualize tradeoff curves and support a
direct identification of the most preferred solution (goal) (for three to eight
objectives). The same can be done if we provide a set of Pareto optimal points
by visualizing biobjective slices of the EPH of these points. If we have a finite
number of objective vectors, the description of the EPH is very simple: it is
a union of domination cones $\mathbb{R}^k_+$ with vertices located in these points. Due
to this simple explicit form, the biobjective slices of the EPH can be rapidly
computed and displayed by computer graphics. Since the EPH of a finite set
of points, in contrast to the set of points itself, is a bodily set, the frontier
of its slice is given by a continuous line (frontier of the figure provided by
superimposed cones). Collections of biobjective slices can be given in the form
of decision maps, where the value of only one objective is changing from slice
to slice. Visualization of finite sets of objective vectors by slices of their EPHs
was introduced in (Bushenkov et al., 1993) and the current state of the art is
given in (Lotov et al., 2004).
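Because the EPH is a union of cones, a biobjective slice can be computed directly from the list of points. The following Python sketch is a minimal illustration under our own assumptions (minimization of all objectives; the function name and the small three-objective data set are hypothetical and not taken from the IDM software). A cone with vertex $z^p$ meets the slice only if $z^p$ does not exceed the fixed values of the remaining objectives, and the frontier of the slice is the staircase through the nondominated projections of such vertices.

```python
def slice_frontier(points, axes, fixed):
    """Corner points of the frontier of a biobjective slice of the EPH.

    points: list of objective vectors (all objectives minimized)
    axes:   (i, j), indices of the two objectives shown on the axes
    fixed:  {index: value}, fixed values of the remaining objectives
    """
    i, j = axes
    # Keep the vertices of the cones that intersect the slice.
    visible = [(p[i], p[j]) for p in points
               if all(p[m] <= c for m, c in fixed.items())]
    # Sweep in increasing order of the first objective and keep a point only
    # if it improves the best value of the second objective seen so far.
    frontier, best = [], float("inf")
    for a, b in sorted(set(visible)):
        if b < best:
            frontier.append((a, b))
            best = b
    return frontier

points = [
    (0.0, 1.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0),
    (0.2, 0.2, 0.8), (0.2, 0.8, 0.2), (0.8, 0.2, 0.2),
    (0.4, 0.4, 0.4),
]

# A small decision map: slices for several values of the third objective.
decision_map = {c: slice_frontier(points, (0, 1), {2: c})
                for c in (0.2, 0.4, 0.8)}
```

Each entry of `decision_map` lists the corner points of one staircase frontier. Drawing the staircases for all values of the third objective in one picture gives a decision map in the sense used above.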
In Figure 9.9 (adapted from (Lotov et al., 2005)), one can see a decision
map involving 2879 objective vectors for a steel casting process described by
a complicated nonlinear model with 325 decision variables. Note that in the
case of a large number of Pareto optimal objective vectors, tradeoff curves are
displayed.
Fig. 9.9. Decision map and a scroll-bar for 2879 Pareto optimal objective vectors
and four objectives.
Let us next pay some attention to the case of more than three objectives.
As in the previous section, the ideas used in the IDM technique can be ap-
plied. Since biobjective slices of the EPH given by a system of cones can be
constructed fairly fast, an interactive study of decision maps can be used. It
can be implemented using the same tools as in the convex case (scroll-bars
and matrices of the decision maps). For example, in Figure 9.9 one can see
a scroll-bar that helps to animate decision maps. In this way, the animation
helps to study a problem with four objectives.
It is important to stress that one can study the associated EPH in general
and in detail from various points of view by providing various decision maps
on-line. One can see the values and tradeoffs for three objectives, while the influence
of the fourth objective (or fifth, etc.) can be studied by moving sliders
of scroll-bars. The DM can select different allocations of objectives among
scroll-bars, colors and axes. However, understanding interdependencies between
many objectives may involve a lot of cognitive effort. As discussed, after
studying the decision maps, the DM can identify the most preferred solution
and obtain the closest Pareto optimal objective vector from the approximation
as well as the associated decision. Examples of hybridizing this idea with the
interactive NIMBUS method (Miettinen, 1999; Miettinen and Mäkelä, 2006)
(for fine-tuning the solution) are given by Miettinen et al. (2003).
9.4.3 Application of Enveloping for Visualizing Alternatives

Finally, let us describe an alternative approach to visualizing objective vectors
(where k > 2) based on approximating and visualizing the EPH of their
envelope (convex hull). This has been proposed in (Bushenkov et al., 1993),
and can be used in the case of a large number (from several hundreds to
several millions) of objective vectors. Due to enveloping, the IDM technique
can be applied for visualizing the Pareto frontier of a convex hull.
We assume that we have objective vectors (k < 7) either as a list of
objective vectors or defined implicitly as objective vectors corresponding to
integer-valued decision vectors satisfying some constraints. First, the EPH of
the convex hull of these points must be approximated. One can use methods
for the polyhedral approximation of convex sets (see Section 9.3) or
methods specially developed for constructing a convex hull of a system of
multi-dimensional points (Barber et al., 1996). Then the IDM technique can
be applied for visualization. An example of a decision map describing about
400 thousand objective vectors for an environmental problem is given in (Lotov
et al., 2004). It is important that tradeoff rates can be seen more easily
than in the case of visualizing the EPH for objective vectors without convex
enveloping.
According to experiments, even non-professional users have been able to
understand tradeoff rates and identify the most preferred solution. However,
one must remember that it is the tradeoff rate of the envelope that is shown:
the Pareto frontier of the convex hull is displayed, which includes additional
infeasible points of the objective space that simplify the graph. Thus, the
solution identified by the DM on the Pareto frontier of the envelope is only
reasonable (the concept of a reasonable goal as the goal that is close to a
feasible objective vector was introduced by Lotfi et al. (1992)). Eventually,
several alternatives can be selected which are close to the goal in some sense.
For further details, see (Gusev and Lotov, 1994; Lotov et al., 2004).
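The difference between a goal on the envelope and a feasible alternative can be seen already in a toy biobjective case. The following Python sketch is only an illustration under our own assumptions: it works in two dimensions (whereas the approach above targets k > 2), it uses a standard monotone-chain lower hull instead of the methods cited above, and the Chebyshev distance is just one possible way to make "close to the goal" precise. All names and data are hypothetical.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def envelope_frontier(points):
    """Vertices of the Pareto frontier (minimization) of the convex hull of
    planar points, ordered by the first objective."""
    hull = []
    for p in sorted(set(points)):          # lower hull, monotone chain
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    # Keep the part of the lower hull where the second objective decreases.
    lowest = min(range(len(hull)), key=lambda i: hull[i][1])
    return hull[:lowest + 1]

def closest_alternatives(points, goal, n=1):
    """The n alternatives closest to the goal in the Chebyshev distance."""
    def distance(p):
        return max(abs(a - g) for a, g in zip(p, goal))
    return sorted(points, key=distance)[:n]

# Three mutually nondominated alternatives; the third lies above the envelope.
points = [(0.0, 1.0), (1.0, 0.0), (0.6, 0.6)]
frontier = envelope_frontier(points)

# A goal chosen on the envelope (midpoint of its only edge); it is not feasible.
goal = (0.5, 0.5)
selected = closest_alternatives(points, goal)
```

Here the envelope frontier is the segment between (0, 1) and (1, 0), so the balanced alternative (0.6, 0.6) is not a vertex of the envelope. A goal picked in the middle of that segment is not feasible, but it is reasonable in the sense used above, and the closest feasible alternative is (0.6, 0.6).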
9.4.4 Comment on Methods for Pointwise Approximations of the Pareto Frontier
Several basic methods have been proposed for generating a representation of
the Pareto optimal set, that is, Pareto frontier approximation. Some scalar-
ization based approaches were described in Chapter 1. That is why we do
not consider them here. A more detailed overview of methods for pointwise
approximation is provided in (Ruzika and Wiecek, 2005). Here we augment
this overview.
There exist scalarization methods that take the possible instability of the
Pareto frontier into account (Krasnoshchekov et al., 1979; Popov, 1982; Nefe-
dov, 1984, 1986; Abramova, 1986; Smirnov, 1996) and, by solving a large
number of scalarized parametric optimization problems they generate point-
wise approximations. It is important that the methods are stable or insensi-
tive to disturbances. If Lipschitz constants exist, a reasonable approximation
accuracy can be achieved by solving a huge number of global optimization
problems. Note that it is possible to avoid a huge number of solutions in
the approximation by filtering (Reuter, 1990; Sayin, 2003; Steuer and Harris,
1980).
Methods based on covering the feasible region (in the decision space) by
balls whose radius depends on the Lipschitz constants of objective functions
(Evtushenko and Potapov, 1987) provide a theoretically justified approach to
approximating the Pareto frontier. They use the EPH of a finite number of
objective vectors as an external approximation of the precise EPH. The idea
to use an approximation of the EPH for approximating the Pareto frontier in
the nonconvex case has also been used in (Reuter, 1990; Kaliszewski, 1994;
Benson and Sayin, 1997; Lotov et al., 2002).
As far as random search methods are concerned, they compute objective
vectors at random decision vectors and choose the nondominated ones (Sobol
and Statnikov, 1981). Though such methods are not theoretically justified (no
estimate is given for the approximation quality in a general case), the convergence
of the process is guaranteed when the number of random points tends
to infinity. These methods are easy to implement (Statnikov and Matusov,
1995).
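To indicate how little is needed for such a method, the following Python sketch implements a plain random search with nondominated filtering. It is a generic illustration and not the procedure of Sobol and Statnikov (1981): the biobjective test problem, the uniform sampling box and all names are our own assumptions.

```python
import random

def objectives(x):
    """Hypothetical biobjective test problem (both objectives minimized)."""
    return (x[0] ** 2 + x[1] ** 2, (x[0] - 1.0) ** 2 + x[1] ** 2)

def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(p <= q for p, q in zip(a, b))
            and any(p < q for p, q in zip(a, b)))

def random_search(n_samples, seed=0):
    """Sample random decision vectors and keep the nondominated ones.

    Returns a list of (decision vector, objective vector) pairs."""
    rng = random.Random(seed)
    archive = []
    for _ in range(n_samples):
        x = (rng.uniform(-1.0, 2.0), rng.uniform(-1.0, 1.0))
        z = objectives(x)
        # Skip the new point if an archived point dominates or equals it.
        if any(dominates(w, z) or w == z for _, w in archive):
            continue
        # Otherwise remove the archived points that the new point dominates.
        archive = [(y, w) for y, w in archive if not dominates(z, w)]
        archive.append((x, z))
    return archive

approximation = random_search(2000)
```

The archive is a pointwise approximation of the Pareto frontier together with the corresponding decision vectors. As stated above, nothing can be said about its quality for a finite sample in the general case.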
In simulated annealing (Chipperfield et al., 1999; Suppaptnarm et al.,
2000; Kubotani and Yoshimura, 2003), physical processes are imitated. This
approach is methodologically close to evolutionary (see Chapter 3) and other
metaheuristic methods. Furthermore, various combinations of approximation
methods considered above as well as EMO methods can be used in hybrid
methods. One effective hybrid method, which combines random search, optimization
and a genetic algorithm (Berezkin et al., 2006) was applied in a
real-life problem with 325 decision variables and four objective functions (see
Figure 9.9). Some other examples of hybrid methods are discussed in Chap-
ter 16.
Methods for the assessment of the quality of the approximation of the
Pareto frontier for nonlinear models are intensively developed by specialists in
evolutionary methods (see Chapter 14). In the framework of MCDM methods,
in addition to the convex case (see the previous section), quality assessment
is provided in methods described in (Evtushenko and Potapov, 1987; Schandl
et al., 2002a,b; Klamroth et al., 2002). The quality assessment techniques play
a major role in the hybrid methods by Lotov et al. (2002, 2004); Berezkin et al.
(2006).
9.5 Conclusions
We have discussed visualization of the Pareto optimal set from different perspectives.
Overall, the aim has been to help users of a posteriori methods
to find the most preferred solutions for multiobjective optimization problems
involving more than two objectives. Both cases of using polyhedral approximations
of the Pareto optimal set as well as sets of Pareto optimal solutions as
a starting point have been considered. It has been shown that visualization of
the Pareto frontier can be carried out in nonlinear multiobjective optimization
problems with up to four or five objectives, but this requires more cognitive
effort if the number of objectives increases (e.g., up to eight).
Acknowledgements
The work of A.V. Lotov was supported by the Russian Foundation for Ba-
sic Research (project # 07-01-00472). The work of K. Miettinen was partly
supported by the Foundation of the Helsinki School of Economics.
References
Abramova, M.: Approximation of a Pareto set on the basis of inexact information.
Moscow University Computational Mathematics and Cybernetics 2, 62–69 (1986)
Barber, C.B., Dobkin, D.P., Huhdanpaa, H.T.: The quickhull algorithm for convex
hulls. ACM Transactions on Mathematical Software 22(4), 469–483 (1996)
Benson, H.P.: An outer approximation algorithm for generating all efficient extreme
points in the outcome set of a multiple-objective linear programming problem.
Journal of Global Optimization 13, 1–24 (1998)
Benson, H.P., Sayin, S.: Toward finding global representations of the efficient set
in multiple-objective mathematical programming. Naval Research Logistics 44,
47–67 (1997)
Berezkin, V.E., Kamenev, G.K., Lotov, A.V.: Hybrid adaptive methods for approximating
a nonconvex multidimensional Pareto frontier. Computational Mathematics
and Mathematical Physics 46(11), 1918–1931 (2006)
Bushenkov, V.A., Lotov, A.V.: Methods for the Construction and Application
of Generalized Reachable Sets (In Russian). Computing Centre of the USSR
Academy of Sciences, Moscow (1982)
Bushenkov, V.A., Gusev, D.I., Kamenev, G.K., Lotov, A.V., Chernykh, O.L.: Visualization
of the Pareto set in multi-attribute choice problem (In Russian). Doklady
of Russian Academy of Sciences 335(5), 567–569 (1993)
ology. Elsevier, New York (1983)
Chernykh, O.L., Kamenev, G.K.: Linear algorithm for a series of parallel two-dimensional
slices of multidimensional convex polytope. Pattern Recognition and
Image Analysis 3(2), 77–83 (1993)
Chipperfield, A.J., Whideborn, J.F., Fleming, P.J.: Evolutionary algorithms and
simulated annealing for MCDM. In: Gal, T., Stewart, T.J., Hanne, T. (eds.) Multicriteria
Decision Making: Advances in MCDM Models, Algorithms, Theory and
Applications, pp. 16-1–16-32, Kluwer Academic Publishers, Boston (1999)
Cleveland, W.S.: The Elements of Graphing Data. Wadsworth, Belmont (1985)
Cohon, J., Church, R., Sheer, D.: Generating multiobjective tradeoffs: An algorithm
for bicriterion problems. Water Resources Research 15, 1001–1010 (1979)
Cohon, J.L.: Multiobjective Programming and Planning. Academic Press, New York
(1978)
Cooke, R.M., van Noortwijk, J.M.: Generic graphics for uncertainty and sensitivity
analysis. In: Schueller, G., Kafka, P. (eds.) Safety and Reliability, Proceedings of
ESREL '99, pp. 1187–1192. Balkema, Rotterdam (1999)
Das, I., Dennis, J.: Normal boundary intersection: A new method for generating the
Pareto surface in nonlinear multicriteria optimization problems. SIAM Journal
on Optimization 8, 631–657 (1998)
Evtushenko, Y., Potapov, M.: Methods of Numerical Solutions of Multicriteria Problems.
Soviet Mathematics Doklady 34, 420–423 (1987)
Gass, S., Saaty, T.: The Computational Algorithm for the Parametric Objective
Function. Naval Research Logistics Quarterly 2, 39 (1955)
Gusev, D.V., Lotov, A.V.: Methods for decision support in finite choice problems (In
Russian). In: Ivanilov, J. (ed.) Operations Research. Models, Systems, Decisions,
pp. 15–43. Computing Center of Russian Academy of Sciences, Moscow (1994)
Haimes, Y.Y., Tarvainen, K., Shima, T., Thadathill, J.: Hierarchical Multiobjective
Analysis of Large-Scale Systems. Hemisphere Publishing Corporation, New York
(1990)
Henig, M., Buchanan, J.T.: Generalized tradeoff directions in multiobjective optimization
problems. In: Tzeng, G., Wang, H., Wen, U., Yu, P. (eds.) Multiple
Criteria Decision Making Proceedings of the Tenth International Conference,
pp. 47–56. Springer, New York (1994)
Henig, M., Buchanan, J.T.: Tradeoff directions in multiobjective optimization problems.
Mathematical Programming 78(3), 357–374 (1997)
Jahn, J.: Vector Optimization. Springer, Berlin (2004)
Jewel, T.K.: A Systems Approach to Civil Engineering, Planning, Design. Harper
& Row, New York (1990)
Kaliszewski, I.: Quantitative Pareto Analysis by Cone Separation Technique.
Kluwer, Dordrecht (1994)
Kamenev, G.K.: Dual adaptive algorithms for polyhedral approximation of convex
bodies. Computational Mathematics and Mathematical Physics 42(8) (2002)
Klamroth, K., Tind, J., Wiecek, M.M.: Unbiased approximation in multicriteria
optimization. Mathematical Methods of Operations Research 56, 413–437 (2002)
Klimberg, R.: GRADS: A new graphical display system for visualizing multiple
criteria solutions. Computers & Operations Research 19(7), 707–711 (1992)
Krasnoshchekov, P.S., Morozov, V.V., Fedorov, V.V.: Decomposition in design problems
(In Russian). Izvestiya Akademii Nauk SSSR, Series Technical Cybernetics
(2), 7–17 (1979)
Kubotani, H., Yoshimura, K.: Performance evaluation of acceptance probability
functions for multiobjective simulated annealing. Computers & Operations Research
30, 427–442 (2003)
Larichev, O.: Cognitive validity in design of decision-aiding techniques. Journal of
Multi-Criteria Decision Analysis 1(3), 127–138 (1992)
Lewandowski, A., Granat, J.: Dynamic BIPLOT as the interaction interface for
aspiration based decision support systems. In: Korhonen, P., Lewandowski, A.,
Wallenius, J. (eds.) Multiple Criteria Decision Support, pp. 229–241. Springer,
Berlin (1991)
Lomov, B.F.: Philosophical and Theoretical Problems of Psychology. Science Publisher,
Moscow (1984)
Lotfi, V., Stewart, T.J., Zionts, S.: An aspiration level interactive model for multiple
criteria decision making. Computers & Operations Research 19(7), 671–681 (1992)
Lotov, A., Berezkin, V., Kamenev, G., Miettinen, K.: Optimal control of cooling
process in continuous casting of steel using a visualization-based multi-criteria
approach. Applied Mathematical Modelling 29(7), 653–672 (2005)
Lotov, A.V.: Exploration of economic systems with the help of reachable sets (In
Russian). In: Proceedings of the International Conference on the Modeling of Economic
Processes, Erevan, 1974, pp. 132–137. Computing Centre of USSR Academy
of Sciences, Moscow (1975)
Lotov, A.V.: Coordination of Economic Models by the Attainable Sets Method (In
Russian). In: Berlyand, E.L., Barabash, S.B. (eds.) Mathematical Methods for
an Analysis of Interaction between Industrial and Regional Systems, pp. 36–44.
Nauka, Novosibirsk (1983)
Lotov, A.V.: Generalized reachable sets method in multiple criteria problems. In:
Lewandowski, A., Stanchev, I. (eds.) Methodology and Software for Interactive
Decision Support, pp. 65–73. Springer, Berlin (1989)
Lotov, A.V., Kamenev, G.K., Berezkin, V.E.: Approximation and Visualization of
Pareto-Efficient Frontier for Nonconvex Multiobjective Problems. Doklady Mathematics
66(2), 260–262 (2002)
(2004)
Louie, P., Yeh, W., Hsu, N.: Multiobjective Water Resources Management Planning.
Journal of Water Resources Planning and Management 110(1), 39–56 (1984)
Mareschal, B., Brans, J.P.: Geometrical representation for MCDA. European Journal
of Operational Research 34(1), 69–77 (1988)
Mattson, C.A., Messac, A.: Pareto frontier based concept selection under uncertainty,
with visualization. Optimization and Engineering 6, 85–115 (2005)
McClure, D.E., Vitale, R.A.: Polygonal approximation of plane convex bodies. Journal
of Mathematical Analysis and Applications 51(2), 326–358 (1975)
McQuaid, M.J., Ong, T.H., Chen, H., Nunamaker, J.F.: Multidimensional scaling
for group memory visualization. Decision Support Systems 27, 163–176 (1999)
Meisel, W.S.: Tradeoff decision in multiple criteria decision making. In: Cochrane,
J., Zeleny, M. (eds.) Multiple Criteria Decision Making, pp. 461–476. University
of South Carolina Press, Columbia (1973)
Boston (1999)
Miettinen, K., Mäkelä, M.M.: On generalized trade-off directions in nonconvex multiobjective
optimization. Mathematical Programming 92, 141–151 (2002)
Miettinen, K., Mäkelä, M.M.: Characterizing general tradeoff directions. Mathematical
Methods of Operations Research 57, 89–100 (2003)
and Software 18, 6380 (2003)
Miettinen, K., Molina, J., González, M., Hernández-Díaz, A., Caballero, R.: Using
box indices in supporting comparison in multiobjective optimization. European
Journal of Operational Research. To appear (2008), doi:10.1016/j.ejor.2008.05.103
Miller, G.A.: The magical number seven, plus or minus two: Some limits of our
capacity for processing information. Psychological Review 63, 81–97 (1956)
Nefedov, V.N.: On the approximation of Pareto set. USSR Computational Mathematics
and Mathematical Physics 24, 19–28 (1984)
Nefedov, V.N.: Approximation of a set of Pareto-optimal solutions. USSR Computational
Mathematics and Mathematical Physics 26, 99–107 (1986)
Olson, D.: Decision Aids for Selection Problems. Springer, New York (1996)
Popov, N.: Approximation of a Pareto set by the convolution method. Moscow
University Computational Mathematics and Cybernetics (2), 41–48 (1982)
Pryke, A., Mostaghim, S., Nazemi, A.: Heatmap Visualization of Population Based
Multi Objective Algorithms. In: Obayashi, S., Deb, K., Poloni, C., Hiroyasu, T.,
Murata, T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 361–375. Springer, Heidelberg
(2007)
Reuter, H.: An approximation method for the efficiency set of multiobjective programming
problems. Optimization 21, 905–911 (1990)
Roy, B.: Decisions avec criteres multiples. Metra International 11(1), 121–151 (1972)
Ruzika, S., Wiecek, M.M.: Survey paper: Approximation methods in multiobjective
programming. Journal of Optimization Theory and Applications 126(3), 473–501
(2005)
Sayin, S.: A Procedure to Find Discrete Representations of the Efficient Set with
Specified Coverage Errors. Operations Research 51, 427–436 (2003)
Schandl, B., Klamroth, K., Wiecek, M.M.: Introducing oblique norms into multiple
criteria programming. Journal of Global Optimization 23, 81–97 (2002a)
Schandl, B., Klamroth, K., Wiecek, M.M.: Norm-based approximation in multicriteria
programming. Computers and Mathematics with Applications 44, 925–942
(2002b)
Schütze, O., Jourdan, L., Legrand, T., Talbi, E.-G., Wojkiewicz, J.L.: A multi-objective
approach to the design of conducting polymer composites for electromagnetic
shielding. In: Obayashi, S., Deb, K., Poloni, C., Hiroyasu, T., Murata,
T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 590–603. Springer, Heidelberg (2007)
Simon, H.: Reason in Human Affairs. Stanford University Press, Stanford (1983)
Smirnov, M.: The logical convolution of the criterion vector in the problem of approximating
Pareto set. Computational Mathematics and Mathematical Physics 36,
605–614 (1996)
Sobol, I.M., Statnikov, R.B.: Choice of Optimal Parameters in Multiobjective Problems
(In Russian). Nauka Publishing House, Moscow (1981)
Solanki, R.S., Appino, P.A., Cohon, J.L.: Approximating the noninferior set in multiobjective
linear programming problems. European Journal of Operational Research
68, 356–373 (1993)
Statnikov, R.B., Matusov, J.: Multicriteria Optimization and Engineering. Chapman
and Hall, Boca Raton (1995)
Steuer, R.E., Harris, F.W.: Intra-set point generation and filtering in decision and
criterion space. Computers & Operations Research 7, 41–58 (1980)
Suppaptnarm, A., Steffen, K.A., Parks, G.T., Clarkson, P.J.: Simulated annealing:
An alternative approach to true multiobjective optimization. Engineering Optimization
33(1), 59–85 (2000)
Trinkaus, H.L., Hanne, T.: knowCube: a visual and interactive support for multicriteria
decision making. Computers & Operations Research 32, 1289–1309 (2005)
Vetschera, R.: A preference preserving projection technique for MCDM. European
Journal of Operational Research 61(1–2), 195–203 (1992)
Voinalovich, V.: External Approximation to the Pareto Set in Criterion Space for
Multicriterion Linear Programming Tasks. Cybernetics and Computing Technol-
ogy 1, 135142 (1984)
Wierzbicki, A.P.: On the role of intuition in decision making and some ways of
multicriteria aid of intuition. Journal of Multi-Criteria Decision Analysis 6(2),
6576 (1997)
Wierzbicki, A.P., Nakamori, Y.: Creative Space. Springer, Berlin (2005)
Yu, P.L.: Cone convexity, cone extreme points and nondominated solutions in de-
cision problems with multiple objectives. Journal of Optimization Theory and
Applications 14(3), 319377 (1974)
Yu, P.L.: Multiple-Criteria Decision Making Concepts, Techniques and Extensions.
Plenum Press, New York (1985)
10 Meta-Modeling in Multiobjective Optimization

Joshua Knowles¹ and Hirotaka Nakayama²

¹ School of Computer Science, University of Manchester,
Oxford Road, Manchester M13 9PL, UK
j.knowles@manchester.ac.uk
² Konan University, Dept. of Information Science and Systems Engineering,
8-9-1 Okamoto, Higashinada, Kobe 658-8501, Japan
nakayama@konan-u.ac.jp
Abstract. In many practical engineering design and other scientific optimization
problems, the objective function is not given in closed form in terms of the design
variables. Given the value of the design variables, the value of the objective function
is obtained by some numerical analysis, such as structural analysis, fluid-mechanic
analysis, thermodynamic analysis, and so on. It may even be obtained by conducting
a real (physical) experiment and taking direct measurements. Usually, these
evaluations are considerably more time-consuming than evaluations of closed-form
functions. In order to keep the number of evaluations as small as possible, we may
combine iterative search with meta-modeling. The objective function is modeled
during optimization by fitting a function through the evaluated points. This model is
then used to help predict the value of future search points, so that high-performance
regions of design space can be identified more rapidly. In this chapter, a survey of
meta-modeling approaches and their suitability to specific problem contexts is given.
The aspects of dimensionality, noise, expensiveness of evaluations and others are
related to the choice of methods. For the multiobjective version of the meta-modeling
problem, further aspects must be considered, such as how to define improvement in
a Pareto approximation set, and how to model each objective function. The
possibility of interactive methods combining meta-modeling with decision-making is
also covered. Two example applications are included. One is a multiobjective
biochemistry problem, involving instrument optimization; the other relates to seismic
design in the reinforcement of cable-stayed bridges.
10.1 An Introduction to Meta-modeling

In all areas of science and engineering, models of one type or another are used
in order to help understand, simulate and predict. Today, numerical methods
make it possible to obtain models or simulations of quite complex and large-scale
systems, even when closed-form equations cannot be derived or solved.
Thus, it is now commonplace to model, usually on computer, everything
from aeroplane wings to continental weather systems to the activity of novel
drugs.

Reviewed by: Jerzy Błaszczyński, Poznan University, Poland
Yaochu Jin, Honda Research Institute Europe, Germany
Koji Shimoyama, Tohoku University, Japan

An expanding use of models is to optimize some aspect of the modeled
system or process. This is done to find the best wing profile, the best method
of reducing the effects of climate change, or the best drug intervention, for
example. But there are difficulties with such a pursuit when the system is
being modeled numerically. It is usually impossible to find an optimum of the
system directly and, furthermore, iterative optimization by trial and error can
be very expensive in terms of computation time.
What is required, to reduce the burden on the computer, is a method of
further modeling the model, that is, generating a simple model that captures
only the relationships between the relevant input and output variables, without
modeling any underlying process. Meta-modeling, as the name suggests, is
such a technique: it is used to build rather simple and computationally
inexpensive models, which hopefully replicate the relationships that are observed
when samples of a more complicated, high-fidelity model or simulation are
drawn.¹ Meta-modeling has a relatively long history in statistics, where it
is called the response surface method, and is also related to the Design of
Experiments (DoE) (Anderson and McLean, 1974; Myers and Montgomery,
1995).
Meta-modeling in Optimization
Iterative optimization procedures employing meta-models (also called surrogate
models in this context) alternate between making evaluations on the given
high-fidelity model, and on the meta-model. The full-cost evaluations are used
to train the initial meta-model, and to update or re-train it, periodically. In
this way, the number of full-cost evaluations can often be reduced substantially,
whilst a high accuracy is still achieved. This is the main advantage of
using a meta-model. A secondary advantage is that the trained meta-model
may represent important information about the cost surface and the variables
in a relatively simple, and easily interpretable manner. A schematic of the
meta-modeling approach is shown in Figure 10.1.
[Figure 10.1: diagram whose elements are labeled "parameters", "objectives",
"iterative search algorithm" and "metamodel".]
Fig. 10.1. A schematic diagram showing how meta-modeling is used for
optimization. The high-fidelity model or function is represented as a black box, which
is expensive to use. The iterative searcher makes use of evaluations on both the
meta-model and the black box function.

¹ Meta-modeling need not refer to modeling of a computational model; real
processes can also be meta-modeled.

The following pseudocode makes more explicit this process of optimization
using models and meta-models:
1. Take an initial sample I of (x, y) pairs from the high-fidelity model.
2. From some or all the samples collected, build/update a model M of
   $p(y \in Y \mid x \in X)$ or $f : X \to Y$.
3. Using M, choose a new sample P of points and evaluate them on the
   high-fidelity model.
4. Until stopping criteria are satisfied, return to 2.
The pseudocode is intentionally very general, and covers many different specific
strategies. For instance, in some methods the choice of new sample(s) P
is made solely based on the current model M, e.g. by finding the optimum on
the approximate model (cf. EGO (Jones et al., 1998), described in Section 10.3).
In other methods, P may be updated based on a memory of previously
searched or considered points: e.g. an evolutionary algorithm (EA) using
a meta-model may construct the new sample from its current population via
the usual application of variation operators, but M is then used to screen out
points that it predicts will not have a good evaluation (see Jin (2005) and
Section 10.4).
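
As an illustration only, the four steps of the pseudocode can be sketched in a few lines of Python. Everything specific here is an assumption made for brevity and is not taken from the chapter: the one-dimensional test function `expensive_f`, the inverse-distance-weighted surrogate in `build_model`, and the purely model-based choice of one new point per iteration.

```python
import random

def expensive_f(x):
    """Stand-in for the high-fidelity model (hypothetical test function)."""
    return (x - 0.3) ** 2

def build_model(samples):
    """Step 2: build a cheap predictor M from all (x, y) pairs collected so far.
    M is an inverse-distance-weighted average, chosen only for brevity."""
    def predict(x):
        num = den = 0.0
        for xi, yi in samples:
            d = abs(x - xi)
            if d == 0.0:
                return yi              # reproduce evaluated points exactly
            w = 1.0 / d ** 2
            num += w * yi
            den += w
        return num / den
    return predict

def optimize(budget=20, n_init=5, n_candidates=200, seed=0):
    rng = random.Random(seed)
    # Step 1: initial sample I from the high-fidelity model.
    samples = [(x, expensive_f(x)) for x in (rng.random() for _ in range(n_init))]
    while len(samples) < budget:                      # Step 4: stopping criterion
        model = build_model(samples)                  # Step 2: build/update M
        candidates = [rng.random() for _ in range(n_candidates)]
        x_new = min(candidates, key=model)            # Step 3: choose P using M only
        samples.append((x_new, expensive_f(x_new)))   # full-cost evaluation
    return min(samples, key=lambda s: s[1])

best_x, best_y = optimize()
```

Because the new point is chosen on predicted value alone, this sketch is purely exploitative; a screening EA or a criterion that also rewards informativeness would change only the line implementing step 3.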
The criterion for selecting the next point(s) to evaluate, or for screening out
points, is not always based exclusively on their predicted value. Rather, the
estimated informativeness of a point may also be accounted for. This can be
estimated in different ways, depending on the form of the meta-model. There
is a natural tension between choosing samples because they are predicted to
be high-performance points and because they would yield much information,
and this tension can be resolved in different ways.
The process of actually constructing a meta-model from data is related to
classical regression methods and also to machine learning. Where a model is
built up from an initial sample of solutions only, the importance of placing
those points in the design space in a theoretically well-founded manner is
emphasized, a subject dealt with in the classical design of experiments (DoE)
literature. When the model is updated using new samples, classical DoE
principles do not usually apply, and one needs to look to machine learning theory
to understand what strategies might lead to optimal performance. Here care
must be taken, however. Although the supervised learning paradigm is usually
taken as the default method used for training meta-models, it is worth
noting that in supervised learning, it is usually a base assumption that the
available data for training a regression model are independent and identically
distributed samples drawn from some underlying distribution. But in
meta-modeling, the samples are not drawn randomly in this way: they are chosen,
and this means that training sets will often contain highly correlated data,
which can affect the estimation of goodness of fit and/or generalization
performance. Also, meta-modeling in optimization can be related to active
learning (Cohn et al., 1996), since the latter is concerned with the iterative choice of
training samples; however, active learning is concerned only with maximising
what is learned whereas meta-modeling in optimization is concerned mainly or
partially with seeking optima, so neither supervised learning nor active
learning is identical with meta-modeling. Finally, in meta-modeling, samples are
often added to a training set incrementally (see, e.g., Cauwenberghs and
Poggio (2001)), and the model is re-trained periodically; how this re-training
is achieved also leads to a variety of methods.
Interactive and Evolutionary Meta-modeling
Meta-modeling brings together a number of different fields to tackle the
problem of how to optimize expensive functions on a limited budget. Its basis in
the DoE literature gives the subject a classical feel, but evolutionary
algorithms employing meta-models have been emerging for some years now, too
(for a comprehensive survey, see Jin (2005)).
In the case of multiobjective problems, it does not seem possible or
desirable to make a clear distinction between interactive and evolutionary
approaches to meta-modeling. Some methods of managing and using
meta-models seek to add one new search point, derived from the model, at every
iteration; others are closer to a standard evolutionary algorithm, with a
population of solutions used to generate a set of new candidate points, which are
then filtered using the model, so that only a few are really evaluated. We deal
with both types of methods here (and those that lie in between), and also
consider how interaction with a decision maker can be used.
Organization of the Chapter
We begin by considering the different settings in which (single-objective)
meta-modeling may be used, and the consequences for algorithm design. In
the succeeding section, we survey, in more detail, methods for constructing
meta-models, i.e., different regression techniques and how models can be
updated iteratively when new samples are collected. Section 10.4 explicitly
considers how to handle meta-models in the case of multiobjective optimization.
This section elaborates on the way in which a Pareto front approximation is
gradually built up in different methods, and also considers how interaction
with a DM may be incorporated. Practical advice for evaluating meta-models
in multiobjective optimization is given in Section 10.5, including ideas for
performance measures as well as baseline methods to compare against. Two
application sections follow, one from analytical chemistry and one from civil
engineering.
10.2 Aspects of Managing Meta-models in Optimization
The combination of meta-modeling and optimization can be implemented in
many different ways. In this section, we relate some of the ways of using
meta-models to properties of the particular optimization scenario encountered: such
properties as the cost of an evaluation, the number that can be performed
concurrently, and the features of the cost landscape.²
10.2.1 Choice of Model Type
The whole concept of using a model of the cost landscape to improve the
performance of an optimizer rests on the assumption that the model will aid
in choosing worthy new points to sample, i.e., it will be predictive of the real
evaluations of those points, to some degree. This assumption will hold only
if the function being optimized is amenable to approximation by the selected
type of meta-model. It is not realistic to imagine that functions of arbitrary
form and complexity can be optimized more efficiently using meta-modeling.
This important consideration is related to the "no free lunch" theorems for
search (Wolpert and Macready, 1997) and especially to the pithy observation
of Thomas English that "learning is hard and optimization is easy" in the
typical function (English, 2000).

² The concept of a cost landscape relies on the fact that proximity of points in design
space is defined, which, in turn, assumes that a choice of problem representation
has already been made. We shall not venture any further into the subject of
choices of representation here.

In theory, given a cost landscape, there should exist a type of model that
approximates it fastest as data is collected and fitting progresses. However,
we think it is fair to say that little is yet known about which types of model
accord best with particular features of a landscape and, in any case, very
little may be known to guide this choice. Nonetheless, some basic
considerations of the problem do help guide the choice of suitable models and
learning algorithms. In particular, the dimension of the design space is
important, as certain models cope better than others with higher dimensions.
For example, naive Bayes regression (Eyheramendy et al., 2003) is used
routinely in very high-dimensional feature spaces (e.g. in spam recognition where
individual word frequencies form the input space). On the other hand, for
low-dimensional spaces, where local correlations are important, naive Bayes
might be very poor, whereas a Gaussian process model (Schwaighofer and
Tresp, 2003) might be expected to perform more effectively.
Much of the meta-modeling literature considers only problems over
continuous design variables, as these are common in certain engineering domains.
However, there is no reason why this restriction need prevail. Meta-modeling
will surely become more commonly used for problems featuring discrete design
spaces or mixed discrete and continuous variables. Machine learning methods
such as classification and regression trees (C&RT) (Breiman, 1984), genetic
programming (Langdon and Poli, 2001), and Bayes regression may be more
appropriate for modeling these high-dimensional landscapes than the splines
and polynomials that are used commonly in continuous design spaces.
Meta-modeling of cost functions in discrete and/or high-dimensional spaces is by
no means the only approach to combining machine learning and optimization,
however. An alternative is to model the distribution over the variables
that leads to high-quality points in the objective space, an approach known
as model-based search or estimation of distribution algorithms (Larrañaga
and Lozano, 2001; Laumanns and Ocenasek, 2002). The learnable evolution
model is a related approach (Michalski, 2000; Jourdan et al., 2005). These
approaches, though interesting rivals to meta-models, do not predict costs or
model the cost function, and are thus beyond the scope of this chapter.
When choosing the type of model to use, other factors that are less linked
to properties of the cost landscape/design space should also be considered.
Models differ in how they scale in terms of accuracy and speed of training
as the number of training samples varies. Some models can be trained
incrementally, using only the latest samples (some particular SVMs), whereas
others (most multi-layer perceptrons) can suffer from "catastrophic forgetting"
if trained on new samples only, and need to use complete re-training over all
samples when some new ones become available, or some other strategy of
rehearsal (Robins, 1997). Some types of model need cross-validation to control
overfitting, whereas others use regularization. Finally, some types of model,
such as Gaussian random fields, model their own error, which can be a distinct
advantage when deciding where to sample next (Jones et al., 1998; Emmerich
et al., 2006). A more detailed survey of types of model, and methods for
training them, is given in Section 10.3.
10.2.2 The Cost of Evaluations
One of the most important aspects affecting meta-modeling for optimization
is the actual cost of an evaluation. At one extreme, a cost function may only
take the order of 1 s to evaluate, but is still considered expensive in the context
of evolutionary algorithm searches where tens of thousands of evaluations are
typical. At the other extreme, when the design points are very expensive,
such as vehicle crash tests (Hamza and Saitou, 2005), then each evaluation
may be associated with financial costs and/or may take days to organize and
carry out. In the former case, there is so little time between evaluations that
model-fitting is best carried out only periodically (every few generations of
the evolutionary algorithm) and new sample points are generated mainly as
a result of the normal EA mechanisms, with the meta-model playing only a
subsidiary role of filtering out estimated poor points. In the latter case, the
overheads of fitting and cross-validating models are small compared with the
time between evaluations, so a number of alternative models can be fitted and
validated after every evaluation, and might be used very carefully to decide
on the best succeeding design point, e.g. by searching over the whole design
space using the meta-model(s) to evaluate points.
The cost of evaluations might affect the meta-modeling strategy in more
complicated and interesting ways than the simple examples above, too. In
some applications, the cost of evaluating a point is not uniform over the design
space, so points may be chosen based partly on their expected cost to evaluate
(though this has not been considered in the literature, to our knowledge). In
other applications, determining whether a solution is feasible or
not is expensive, while evaluating it on the objective function is cheap. This
was the case in (Joslin et al., 2006), and led to a particular strategy of only
checking the constraints for solutions passing a threshold on the objective
function.
In some applications, the cost of an evaluation can be high in time, yet
many can be performed in parallel. This situation occurs, for example, in using
optimization to design new drugs via combinatorial chemistry methods. Here,
the time to prepare a drug sample and to test it can be of the order of 24
hours, but using high-throughput equipment, several hundred or thousands
of different drug compounds can be made and tested in each batch (Corne
et al., 2002). Clearly, this places very particular constraints on the
meta-modeling/optimization process: there is not always freedom to choose how
frequently updates of the model are done, or how many new design points
should be evaluated in each generation. Only future studies will show how to
best deal with these scenarios.
10.2.3 Advanced Topics
Progress in meta-modeling seems to be heading in several exciting new
directions, worth mentioning here.
Typically, when fitting a meta-model to the data, a single global model is
learned or updated, using all available design points. However, some research
is departing from this by using local models (Atkeson et al., 1997), which are
trained only on local subsets of the data (Emmerich et al., 2006). Another
departure from the single global model is the possibility of using an ensemble
of meta-models (Hamza and Saitou, 2005). Ensemble learning has general
advantages in supervised learning scenarios (Brown et al., 2005) and may
increase the accuracy of meta-models too. Moreover, Jin and Sendhoff (2004)
showed that ensemble methods can be used to predict the quality of the
estimation, which can be very useful in the meta-modeling approach.
Noise or stochasticity is an element in many systems that require optimiza-
tion. Many current meta-modeling methods, especially those based on radial
basis functions or Kriging (see next section) assume noiseless evaluation, so
that the uncertainty of the meta-model at the evaluated points is assumed
to be zero. However, with noisy functions, to obtain more accurate estimates
of the expected quality of a design point, several evaluations may be needed.
Huang et al. (2006) consider how to extend the well-known EGO algorithm
(see next section) to account for the case of noise or stochasticity on the ob-
jective function. Other noisy optimization methods, such as those based on
EAs (Fieldsend and Everson, 2005), could be combined with meta-models in
future work.
A further exciting avenue of research is the use of transductive learning.
It has been shown by Chapelle et al. (1999) that in supervised learning of a
regression model, knowledge of the future test points (just in design space),
at the time of training, can be used to improve the training and lead to better
ultimate prediction performance on those points. This result has been
imported into meta-modeling by Schwaighofer and Tresp (2003), who compare
transductive Gaussian regression methods with standard, inductive ones, and
find them much more accurate.
10.3 Brief Survey of Methods for Meta-modeling

The Response Surface Method (RSM) is probably the most widely applied approach to
meta-modeling (Myers and Montgomery, 1995). The role of RSM is to predict
the response $y$ for the vector of design variables $x \in \mathbb{R}^n$ on the basis of the
given sampled observations $(x_i, y_i)$ $(i = 1, \ldots, p)$.
The Response Surface Method is a generic name that covers
a wide range of methods. Above all, methods using experimental design are
well known. However, many of them select sample points only on the basis of
statistical analysis of the design variable space. They may provide a good
approximation of black-box functions with mild nonlinearity. It is clear,
however, that in cases in which the black-box function is highly nonlinear, we
can obtain better performance by methods taking into account not only the
statistical property of the design variable space but also that of the range space of
the black-box function (in other words, the shape of the function).
Moreover, machine learning techniques such as RBF (Radial Basis
Function) networks and Support Vector Machines (SVM) have been recently
applied for approximating the black-box function (Nakayama et al., 2002, 2003).
10.3.1 Using Design of Experiments
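Suppose, for simplicity, that we consider a response function
given by a quadratic polynomial:
$$y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \sum_{i=1}^{n} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j \tag{10.1}$$
Since the above equation is linear with respect to the $\beta_i$, we can rewrite
equation (10.1) as follows:
$$y = X\beta + \varepsilon, \tag{10.2}$$
where $E(\varepsilon) = 0$, $V(\varepsilon) = \sigma^2 I$.
The above (10.2) is well known as linear regression, and the solution
$\hat{\beta}$ minimizing the squared error is given by
$$\hat{\beta} = (X^T X)^{-1} X^T y \tag{10.3}$$
The variance-covariance matrix $V(\hat{\beta}) = \operatorname{cov}(\hat{\beta}_i, \hat{\beta}_j)$ of the least squared error
prediction $\hat{\beta}$ given by (10.3) becomes
$$V(\hat{\beta}) = \operatorname{cov}(\hat{\beta}_i, \hat{\beta}_j) = E\big((\hat{\beta} - E(\hat{\beta}))(\hat{\beta} - E(\hat{\beta}))^T\big) \tag{10.4}$$
$$= (X^T X)^{-1} \sigma^2, \tag{10.5}$$
where $\sigma^2$ is the variance of error in the response $y$ such that $E(\varepsilon \varepsilon^T) = \sigma^2 I$.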
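
To make (10.1) to (10.5) concrete, the following is a minimal NumPy sketch that fits a quadratic response surface in two variables. The data, the "true" coefficients and the noise level are made up for illustration, and estimating $\sigma^2$ from the residuals with $p - k$ degrees of freedom (where $k$ is the number of coefficients) is a standard choice that is assumed here rather than stated in the text.

```python
import numpy as np

def quadratic_design_matrix(x):
    """Columns 1, x_i, x_i^2, x_i*x_j (i<j), matching the terms of (10.1)."""
    x = np.atleast_2d(x)
    p, n = x.shape
    cols = [np.ones(p)]
    cols += [x[:, i] for i in range(n)]
    cols += [x[:, i] ** 2 for i in range(n)]
    cols += [x[:, i] * x[:, j] for i in range(n) for j in range(i + 1, n)]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(30, 2))                  # p = 30 samples, n = 2
true_beta = np.array([1.0, 2.0, -1.0, 0.5, 0.3, -0.7])    # made-up coefficients
X = quadratic_design_matrix(x)
y = X @ true_beta + rng.normal(0.0, 0.05, size=30)        # noisy responses, as in (10.2)

# (10.3): least-squares estimate; lstsq avoids forming the inverse explicitly
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# (10.5): covariance of beta_hat, with sigma^2 estimated from the residuals
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (X.shape[0] - X.shape[1])
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
```

The diagonal of `cov_beta` gives the estimated variance of each coefficient, which is the quantity that the design criteria discussed next try to keep small through the choice of sample points.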
i) Orthogonal Design
Orthogonal design is usually applied for experimental design with linear
polynomials. Selecting sample points in such a way that the set $X$ is orthogonal,
the matrix $X^T X$ becomes diagonal. It is well known that the orthogonal
design with the first order model is effective for cases with only main effects
or first order interaction effects. For polynomial regression with higher
order ($\geq 2$), orthogonal polynomials are usually used in order to make the
design orthogonal (namely, $X^T X$ is diagonal). Then the coefficients of
the polynomials are easily evaluated by using orthogonal arrays.
Another kind of experimental design, e.g., CCD (Central Composite
Design), is applied mostly for experiments with quadratic polynomials.
ii) D-optimality
Considering equation (10.5), the matrix $(X^T X)^{-1}$ should be minimized
so that the variance of the predicted $\hat{\beta}$ may decrease. Since each element of
$(X^T X)^{-1}$ has $\det(X^T X)$ in the denominator, we can expect to decrease not
only the variance but also the covariance of the $\hat{\beta}_i$ by maximizing $\det(X^T X)$. This is the
idea of D-optimality in design of experiments. In fact, it is usual to use the
moment matrix
$$M = \frac{X^T X}{p}, \tag{10.6}$$
where $p$ is the number of sample points.
Other criteria are possible: to minimize the trace of $(X^T X)^{-1}$
(A-optimality), to minimize the maximal value of the diagonal components of
$(X^T X)^{-1}$ (minimax criterion), or to maximize the minimal eigenvalue of $X^T X$
(E-optimality). In general, however, the D-optimality criterion is widely used
for many practical problems.
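
As a small illustration of the D-optimality idea, the sketch below searches exhaustively for the four-point design that maximizes $\det(M)$ for a first-order model with interaction in two variables. The candidate grid, the model terms and the brute-force search are assumptions chosen to keep the example tiny; practical D-optimal design uses exchange-type algorithms instead.

```python
import numpy as np
from itertools import combinations

def model_matrix(points):
    """First-order model with interaction in two variables: 1, x1, x2, x1*x2."""
    pts = np.asarray(points, dtype=float)
    return np.column_stack(
        [np.ones(len(pts)), pts[:, 0], pts[:, 1], pts[:, 0] * pts[:, 1]]
    )

def d_criterion(points):
    """Determinant of the moment matrix M = X^T X / p from (10.6)."""
    X = model_matrix(points)
    return np.linalg.det(X.T @ X / len(points))

# Candidate points: a 3 x 3 grid on [-1, 1]^2 (illustrative choice)
grid = [(a, b) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]

# Exhaustive search over all 4-point designs; feasible only for tiny problems
best_design = max(combinations(grid, 4), key=d_criterion)
```

For this model the search returns the four corner points, i.e. the two-level factorial design, for which $X^T X$ is diagonal. In this small case the D-optimal design is therefore also an orthogonal design.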
10.3.2 Kriging Method
Consider the response $y(x)$ as a realization of a random function $Y(x)$ such
that
$$Y(x) = \mu(x) + Z(x). \tag{10.7}$$
Here, $\mu(x)$ is a global model, and $Z(x)$, reflecting a deviation from the global
model, is a random function with zero mean and nonzero covariance given by
$$\operatorname{cov}[Z(x), Z(x')] = \sigma^2 R(x, x'), \tag{10.8}$$
where $R$ is the correlation between $Z(x)$ and $Z(x')$. Usually, the stochastic
process is supposed to be stationary, which implies that the correlation
$R(x, x')$ depends only on $x - x'$, namely
$$R(x, x') = R(x - x'). \tag{10.9}$$
A commonly used example of such correlation functions is
$$R(x, x') = \exp\Big[-\sum_{i=1}^{n} \theta_i |x_i - x'_i|^2\Big], \tag{10.10}$$
where $x_i$ and $x'_i$ are the $i$-th components of $x$ and $x'$, respectively.
Although a linear regression model $\sum_{j=1}^{k} \mu_j f_j(x)$ can be applied as a
global model in (10.7) (universal Kriging), $\mu(x) = \mu$, in which $\mu$ is unknown
but constant, is commonly used in many cases (ordinary Kriging). In
ordinary Kriging, the best linear unbiased predictor of $y$ at an untried $x$ can be
given by
$$\hat{y}(x) = \hat{\mu} + r^T(x) R^{-1} (y - \mathbf{1}\hat{\mu}), \tag{10.11}$$
where $\hat{\mu} = (\mathbf{1}^T R^{-1} \mathbf{1})^{-1} \mathbf{1}^T R^{-1} y$ is the generalized least squares estimator of
$\mu$, $r(x)$ is the $p \times 1$ vector of correlations $R(x, x_i)$ between $Z$ at $x$ and the sampled
points $x_i$ $(i = 1, \ldots, p)$, $R$ is a $p \times p$ correlation matrix with $(i, j)$-element
defined by $R(x_i, x_j)$, and $\mathbf{1}$ is a vector whose components are all 1.
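
The ordinary Kriging predictor (10.11) with the correlation function (10.10) can be sketched in NumPy as follows. This is only a sketch under stated assumptions: the parameters $\theta_i$ are fixed by hand rather than estimated (for example by maximum likelihood), a tiny diagonal "jitter" is added to $R$ purely for numerical conditioning, and the function names and one-dimensional test data are hypothetical.

```python
import numpy as np

def corr(a, b, theta):
    """Gaussian correlation (10.10) between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.exp(-np.sum(theta * diff ** 2, axis=2))

def fit_ordinary_kriging(X, y, theta):
    # Jitter on the diagonal is an assumption added for numerical conditioning.
    R = corr(X, X, theta) + 1e-10 * np.eye(len(X))
    ones = np.ones(len(X))
    R_inv_y = np.linalg.solve(R, y)
    R_inv_1 = np.linalg.solve(R, ones)
    mu_hat = ones @ R_inv_y / (ones @ R_inv_1)        # generalized least squares estimate
    weights = np.linalg.solve(R, y - mu_hat * ones)   # R^{-1} (y - 1 mu_hat)
    return mu_hat, weights

def predict(x_new, X, theta, mu_hat, weights):
    r = corr(np.atleast_2d(x_new), X, theta)          # correlations with sample points
    return mu_hat + r @ weights                       # predictor (10.11)

X = np.array([[0.0], [0.25], [0.5], [0.75], [1.0]])   # p = 5 samples, n = 1
y = np.sin(2 * np.pi * X[:, 0])
theta = np.array([10.0])                              # fixed by hand, not estimated
mu_hat, weights = fit_ordinary_kriging(X, y, theta)
y_hat = predict(X, X, theta, mu_hat, weights)
```

Evaluating the predictor at the sample points themselves reproduces the observed responses up to the jitter, which reflects the interpolating (noise-free) character of this model.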
Using Expected Improvement
Jones et al. (1998) suggested a method called EGO (Efficient Global
Optimization) for black-box objective functions. They applied a stochastic process
model (10.7) as the predictor and the expected improvement as a figure of merit
for additional sample points.
The estimated value of the mean of the stochastic process, $\hat{\mu}$, is given by
$$\hat{\mu} = \frac{\mathbf{1}^T R^{-1} y}{\mathbf{1}^T R^{-1} \mathbf{1}}. \tag{10.12}$$
In this event, the variance $\sigma^2$ is estimated by
$$\hat{\sigma}^2 = \frac{(y - \mathbf{1}\hat{\mu})^T R^{-1} (y - \mathbf{1}\hat{\mu})}{p}, \tag{10.13}$$
where $p$ is the number of sample points.
The mean squared error of the predictor is estimated by
$$s^2(x) = \hat{\sigma}^2 \Big[1 - r^T R^{-1} r + \frac{(1 - \mathbf{1}^T R^{-1} r)^2}{\mathbf{1}^T R^{-1} \mathbf{1}}\Big]. \tag{10.14}$$
In the following, $s = \sqrt{s^2(x)}$ is called the standard error.
Using the above predictor on the basis of the stochastic process model, Jones
et al. applied the expected improvement for adding a new sample point. Let
$f_{\min}^p = \min\{y_1, \ldots, y_p\}$ be the current best function value. They model the
uncertainty at $y(x)$ by treating it as the realization of a normally distributed
random variable $Y$ with mean and standard deviation given by the above
predictor and its standard error.
For minimization cases, the improvement at $x$ is $I = \max(f_{\min}^p - Y, 0)$.
Therefore, the expected improvement is given by
$$E[I(x)] = E[\max(f_{\min}^p - Y, 0)].$$
It has been shown that the above formula can be expanded as follows:
$$E(I) = \begin{cases} (f_{\min}^p - \hat{y})\,\Phi\Big(\dfrac{f_{\min}^p - \hat{y}}{s}\Big) + s\,\phi\Big(\dfrac{f_{\min}^p - \hat{y}}{s}\Big) & \text{if } s > 0 \\ 0 & \text{if } s = 0, \end{cases} \tag{10.15}$$
where $\phi$ is the standard normal density and $\Phi$ is the distribution function.

We can add a new sample point which maximizes the expected improve-
ment. Although Jones et al. proposed a method for maximizing the expected
improvement by using the branch and bound method, it is possible to select
the best one among several candidates which are generated randomly in the
design variable space.
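
The selection of the best point among several candidates can be sketched with the closed form (10.15), using only the Python standard library. The candidate predictions and standard errors below are made-up numbers; in practice they would come from the Kriging predictor (10.11) and its standard error (10.14).

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(f_min, y_hat, s):
    """E(I) from (10.15) for minimization; zero where the model has no uncertainty."""
    if s <= 0.0:
        return 0.0
    z = (f_min - y_hat) / s
    return (f_min - y_hat) * normal_cdf(z) + s * normal_pdf(z)

# Hypothetical candidates as (predicted value, standard error) pairs
f_min = 1.0
candidates = {"a": (0.9, 0.05), "b": (1.2, 0.60), "c": (1.0, 0.0)}
ei = {k: expected_improvement(f_min, y_hat, s) for k, (y_hat, s) in candidates.items()}
best = max(ei, key=ei.get)
```

With these numbers, candidate "b" has the larger expected improvement even though its predicted value is worse than the current best, because its standard error is large. This is how the criterion trades predicted quality against informativeness.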
Furthermore, Schonlau (1997) extended the expected improvement as
follows: Letting $I^g = \max((f_{\min}^p - Y)^g, 0)$, then
$$E(I^g) = s^g \sum_{i=0}^{g} (-1)^i \frac{g!}{i!\,(g-i)!} (f'^{\,p}_{\min})^{g-i}\, T_i \tag{10.16}$$
where
$$f'^{\,p}_{\min} = \frac{f_{\min}^p - \hat{y}}{s}$$
and
$$T_k = -\phi(f'^{\,p}_{\min})\,(f'^{\,p}_{\min})^{k-1} + (k-1)\,T_{k-2}.$$
Here
$$T_0 = \Phi(f'^{\,p}_{\min}), \qquad T_1 = -\phi(f'^{\,p}_{\min}).$$
It has been observed that larger values of $g$ make the search more global, while
smaller values of $g$ make it more local. Therefore, we can control the value of $g$
depending upon the situation.
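
A short sketch of (10.16), again using only the standard library, is given below. Returning zero when $s = 0$ is an assumption carried over from (10.15), and the numerical inputs are made up. For $g = 1$ the sum reduces to $s\,(f'^{\,p}_{\min}\Phi + \phi)$, which is exactly (10.15), and this gives a convenient check on the recursion.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def generalized_ei(f_min, y_hat, s, g):
    """E(I^g) from (10.16) for an integer g >= 1, with T_k built by the recursion."""
    if s <= 0.0:
        return 0.0                                  # assumption, mirroring (10.15)
    u = (f_min - y_hat) / s                         # standardized current best value
    T = [normal_cdf(u), -normal_pdf(u)]             # T_0 and T_1
    for k in range(2, g + 1):
        T.append(-normal_pdf(u) * u ** (k - 1) + (k - 1) * T[k - 2])
    return s ** g * sum(
        (-1) ** i * math.comb(g, i) * u ** (g - i) * T[i] for i in range(g + 1)
    )

# Made-up prediction and standard error, evaluated for several values of g
values = {g: generalized_ei(1.0, 1.2, 0.6, g) for g in (1, 2, 4)}
```

Note that $E(I^g)$ is measured in units of $y^g$, so values for different $g$ are not directly comparable; what changes with $g$ is the ranking of candidate points for a fixed $g$.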
10.3.3 Computational Intelligence

Multi-layer Perceptron Neural Networks
The multi-layer perceptron (MLP) is used in several meta-modeling
applications in the literature (Jin et al., 2001; Gaspar-Cunha and Vieira, 2004).
It is well-known that MLPs are universal approximators, which makes them
attractive for modeling black box functions for which little information about
their form is known. But, in practice, it can be difficult and time-consuming
to train MLPs effectively as they still have biases and it is easy to get caught
in local minima which give far from desirable performance. A large MLP with
many weights has a large capacity, i.e. it can model complex functions, but
it is also easy to over-fit it, so that generalization performance may be poor.
The use of a regularization term to help control the complexity is necessary to
ensure better generalization performance. Cross-validation can also be used
during the training to mitigate overfitting.
Disadvantages of using MLPs may include the difficulty of training them quickly,
especially if cross-validation with several folds is used (a problem in some
applications). An MLP is not easy to train incrementally (compare with RBFs).
Moreover, an MLP does not estimate its own error (compare with Kriging), which
means that it can be difficult to estimate the best points to sample next.
Radial Basis Function Networks
Since the number of sample points for predicting objective functions should
be as few as possible, incremental learning techniques which predict black-box
functions by adding learning samples step by step, are attractive. RBF
Networks (RBFN) and Support Vector Machines (SVM) are effective to this
end. For RBFN, the necessary information for incremental learning can be
easily updated, while the information of support vector can be utilized in
selecting additional samples as the sensitivity in SVM. The details of these
approaches can be seen in (Nakayama et al., 2002) and (Nakayama et al.,
2003). Here, we introduce the incremental learning by RBFN briefly in the
following.
The output of an RBFN is given by
\[ f(x) = \sum_{j=1}^{m} w_j h_j(x), \]
where $h_j$, $j = 1, \ldots, m$, are radial basis functions, e.g.,
\[ h_j(x) = e^{-\|x - c_j\|^2 / r_j}. \]
Given the training data $(x_i, y_i)$, $i = 1, \ldots, p$, the learning of RBFN
is usually made by solving
\[ \min \; E = \sum_{i=1}^{p} (y_i - f(x_i))^2 + \sum_{j=1}^{m} \lambda_j w_j^2, \]
where the second term is introduced for the purpose of regularization.

In general cases with a large number of training data p, the number of basis
functions m is set to be less than p in order to avoid overlearning. However,
the number of training data is not so large in this paper, because it is desired
to be as small as possible in applications under consideration. The value m is
set, therefore, to be equal to p in later sections in this paper. Also, the center
of radial basis function $c_i$ is set to be $x_i$. The values of $\lambda_j$ and
$r_j$ are usually determined by cross-validation test. It is observed through
our experience that in many problems we have a good performance with
$\lambda_j = 0.01$ and a simple estimate for $r_j$ given by
\[ r = \frac{d_{\max}}{\sqrt[n]{np}}, \tag{10.17} \]
where $d_{\max}$ is the maximal distance among the data; n is the dimension of
data; p is the number of data.
Letting $A = (H_p^T H_p + \Lambda)$, we have
\[ A w = H_p^T y \]
as a necessary condition for the above minimization. Here
\[ H_p^T = [h_1 \; \cdots \; h_p], \]
where $h_j^T = [h_1(x_j), \ldots, h_m(x_j)]$, and $\Lambda$ is a diagonal matrix
whose diagonal components are $\lambda_1, \ldots, \lambda_m$.
Therefore, the learning in RBFN is reduced to finding
\[ A^{-1} = (H_p^T H_p + \Lambda)^{-1}. \]
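The learning step above can be illustrated with a minimal sketch. It assumes, as in the text, m = p, centers placed at the samples, a common regularization value of 0.01, and the width estimate (10.17). The function names (`rbf_design_matrix`, `width_estimate`, `fit_rbfn`, `predict_rbfn`) are hypothetical, and a linear solve is used in place of an explicit inverse purely for convenience.

```python
import numpy as np

def rbf_design_matrix(X, centers, r):
    """H[i, j] = exp(-||x_i - c_j||^2 / r), the Gaussian basis from the text."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / r)

def width_estimate(X):
    """Simple width estimate (10.17): r = d_max / (n * p)^(1/n)."""
    p, n = X.shape
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    d_max = np.sqrt(d2.max())
    return d_max / (n * p) ** (1.0 / n)

def fit_rbfn(X, y, lam=0.01):
    """Solve (H^T H + Lambda) w = H^T y with m = p and c_i = x_i."""
    r = width_estimate(X)
    H = rbf_design_matrix(X, X, r)
    A = H.T @ H + lam * np.eye(H.shape[1])   # Lambda = lam * I (assumed common value)
    w = np.linalg.solve(A, H.T @ y)
    return w, r

def predict_rbfn(X_new, X_train, w, r):
    """Evaluate f(x) = sum_j w_j h_j(x) at new points."""
    return rbf_design_matrix(X_new, X_train, r) @ w
```

A call such as `w, r = fit_rbfn(X, y)` followed by `predict_rbfn(X_new, X, w, r)` gives the meta-model prediction at new design points.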
The incremental learning in RBFN can be made by adding new samples
and/or a basis function, if necessary. Since the learning in RBFN is equivalent
to the matrix inversion $A^{-1}$, the additional learning here is reduced to
the incremental calculation of the matrix inversion. The following algorithm
can be seen in (Orr, 1996):
(i) Adding a New Training Sample
Adding a new sample $x_{p+1}$, the incremental learning in RBFN can be made
by the following simple update formula: Let
\[ H_{p+1} = \begin{bmatrix} H_p \\ h_{p+1}^T \end{bmatrix}, \]
where $h_{p+1}^T = [h_1(x_{p+1}), \ldots, h_m(x_{p+1})]$.
Then
\[ A_{p+1}^{-1} = A_p^{-1} - \frac{A_p^{-1} h_{p+1} h_{p+1}^T A_p^{-1}}{1 + h_{p+1}^T A_p^{-1} h_{p+1}}. \]
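The update for a new training sample is a rank-one (Sherman-Morrison) correction of the inverse. The following is a minimal sketch of it; the function name `add_sample` is hypothetical, and the code relies on $A_p^{-1}$ being symmetric, which holds for $A = H^T H + \Lambda$.

```python
import numpy as np

def add_sample(A_inv, h_new):
    """Update A^{-1} after appending the row h_new^T to H (basis set unchanged).

    A_inv : (m, m) current inverse of H^T H + Lambda (symmetric)
    h_new : (m,) basis function values at the new sample x_{p+1}
    """
    Ah = A_inv @ h_new                      # A_p^{-1} h_{p+1}
    # For symmetric A_inv, A^{-1} h h^T A^{-1} equals outer(Ah, Ah).
    return A_inv - np.outer(Ah, Ah) / (1.0 + h_new @ Ah)
```

The cost is quadratic in m, compared with the cubic cost of inverting the enlarged matrix from scratch.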
(ii) Adding a New Basis Function

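In those cases where a new basis function is needed to improve the learning
for a new data, we have the following update formula for the matrix inversion:
Let
\[ H_{m+1} = \begin{bmatrix} H_m & h_{m+1} \end{bmatrix}, \]
where $h_{m+1}^T = [h_{m+1}(x_1), \ldots, h_{m+1}(x_p)]$.
Then
\[
A_{m+1}^{-1} =
\begin{bmatrix} A_m^{-1} & 0 \\ 0^T & 0 \end{bmatrix}
+ \frac{1}{\lambda_{m+1} + h_{m+1}^T (I_p - H_m A_m^{-1} H_m^T) h_{m+1}}
\begin{bmatrix} A_m^{-1} H_m^T h_{m+1} \\ -1 \end{bmatrix}
\begin{bmatrix} A_m^{-1} H_m^T h_{m+1} \\ -1 \end{bmatrix}^T .
\]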
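The update for a new basis function is a block-matrix (Schur complement) inverse. A minimal sketch follows, with the hypothetical function name `add_basis`; `lam_new` stands for the regularization weight $\lambda_{m+1}$ of the new basis function.

```python
import numpy as np

def add_basis(A_inv, H, h_new, lam_new):
    """Grow A^{-1} from (m, m) to (m+1, m+1) when column h_new is appended to H.

    A_inv   : (m, m) current inverse of H^T H + Lambda
    H       : (p, m) current design matrix
    h_new   : (p,) values of the new basis function at the p samples
    lam_new : regularization weight of the new basis function
    """
    m = A_inv.shape[0]
    b = H.T @ h_new                     # H_m^T h_{m+1}
    u = A_inv @ b                       # A_m^{-1} H_m^T h_{m+1}
    # Denominator: lam + h^T (I - H A^{-1} H^T) h
    s = lam_new + h_new @ h_new - b @ u
    v = np.append(u, -1.0)
    out = np.zeros((m + 1, m + 1))
    out[:m, :m] = A_inv
    return out + np.outer(v, v) / s
```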
10.3.4 Support Vector Machines
Support vector machines (SVMs) were originally developed for pattern
classification and later extended to regression (Cortes and Vapnik, 1995; Vapnik,
1998; Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002).
Regression using SVMs, often called support vector regression, plays an
important role in meta-modeling. However, the essential idea of support vector
regression lies in SVMs for classification. Therefore, we start with a brief
review of SVM for classification problems.
Let X be a space of conditional attributes. For binary classification problems,
the value of $+1$ or $-1$ is assigned to each pattern $x_i \in X$ according
to its class A or B. The aim of machine learning is to predict which class
newly observed patterns belong to on the basis of the given training data set
$(x_i, y_i)$ $(i = 1, \ldots, p)$, where $y_i = +1$ or $-1$. This is performed by
finding a discriminant function $f(x)$ such that $f(x) \geq 0$ for $x \in A$ and
$f(x) < 0$ for $x \in B$. Linear discriminant functions, in particular, can be
expressed by the following linear form
\[ f(x) = w^T x + b \]
with the property
\[
\begin{aligned}
w^T x + b &\geq 0 \quad \text{for } x \in A, \\
w^T x + b &< 0 \quad \text{for } x \in B.
\end{aligned}
\]
In cases where the training data set X is not linearly separable, we map the
original data set X to a feature space Z by some nonlinear map $\phi$. Increasing
the dimension of the feature space, it is expected that the mapped data set
becomes linearly separable. We try to find linear classifiers with maximal
margin in the feature space. Letting $z_i = \phi(x_i)$, the separating hyperplane
with maximal margin can be given by solving the following problem with the
normalization $w^T z + b = \pm 1$ at points with the minimum interior deviation:
\[
\begin{aligned}
(\mathrm{SVM}_{hard})_P \qquad \min_{w, b} \quad & \|w\| \\
\text{s.t.} \quad & y_i (w^T z_i + b) \geq 1, \quad i = 1, \ldots, p.
\end{aligned}
\]
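The dual problem of $(\mathrm{SVM}_{hard})_P$ with $\frac{1}{2}\|w\|_2^2$ is
\[
\begin{aligned}
(\mathrm{SVM}_{hard})_D \qquad \max_{\alpha_i} \quad & \sum_{i=1}^{p} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{p} \alpha_i \alpha_j y_i y_j \, \phi(x_i)^T \phi(x_j) \\
\text{s.t.} \quad & \sum_{i=1}^{p} \alpha_i y_i = 0, \\
& \alpha_i \geq 0, \quad i = 1, \ldots, p.
\end{aligned}
\]
Using the kernel function $K(x, x') = \phi(x)^T \phi(x')$, the problem
$(\mathrm{SVM}_{hard})_D$ can be reformulated as follows:
\[
\begin{aligned}
(\mathrm{SVM}_{hard}) \qquad \max_{\alpha_i} \quad & \sum_{i=1}^{p} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{p} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \\
\text{s.t.} \quad & \sum_{i=1}^{p} \alpha_i y_i = 0, \\
& \alpha_i \geq 0, \quad i = 1, \ldots, p.
\end{aligned}
\]
Although several kinds of kernel functions have been suggested, the Gaussian
kernel
\[ K(x, x') = \exp\left( -\frac{\|x - x'\|^2}{2 r^2} \right) \]
is popularly used in many cases.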
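As an illustration of how the kernel enters the dual problem, the following minimal sketch builds the Gaussian kernel matrix and evaluates the dual objective for a given multiplier vector. The function names are hypothetical, and no optimizer is included; maximizing the objective under the constraints would be left to a quadratic programming solver.

```python
import numpy as np

def gaussian_kernel(X1, X2, r):
    """K[i, j] = exp(-||x_i - x'_j||^2 / (2 r^2))."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * r ** 2))

def hard_margin_dual_objective(alpha, y, K):
    """sum_i alpha_i - 1/2 sum_{i,j} alpha_i alpha_j y_i y_j K(x_i, x_j)."""
    ay = alpha * y
    return alpha.sum() - 0.5 * ay @ K @ ay
```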
MOP/GP Approaches to Support Vector Classification
In 1981, Freed and Glover suggested finding a hyperplane separating two
classes with as few misclassified data as possible by using goal programming
(Freed and Glover, 1981) (see also (Erenguc and Koehler, 1990)). Let $\xi_i$ denote
the exterior deviation, which is the deviation from the hyperplane of a point $x_i$
improperly classified. Similarly, let $\eta_i$ denote the interior deviation, which is
the deviation from the hyperplane of a point $x_i$ properly classified. Some of the
main objectives in this approach are as follows:
i) Minimize the maximum exterior deviation (decrease errors as
much as possible)
ii) Maximize the minimum interior deviation (i.e., maximize the margin)
iii) Maximize the weighted sum of interior deviation
iv) Minimize the weighted sum of exterior deviation.
Introducing the idea iv) above, the well known soft margin SVM with slack
variables (or, exterior deviations) $\xi_i$ $(i = 1, \ldots, p)$, which allow
classification errors to some extent, can be formulated as follows:
\[
\begin{aligned}
(\mathrm{SVM}_{soft})_P \qquad \min_{w, b, \xi_i} \quad & \frac{1}{2} \|w\|_2^2 + C \sum_{i=1}^{p} \xi_i \\
\text{s.t.} \quad & y_i (w^T z_i + b) \geq 1 - \xi_i, \\
& \xi_i \geq 0, \quad i = 1, \ldots, p,
\end{aligned}
\]
where C is a trade-off parameter between minimizing $\|w\|_2^2$ and minimizing
$\sum_{i=1}^{p} \xi_i$.
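Using a kernel function in the dual problem yields
\[
\begin{aligned}
(\mathrm{SVM}_{soft}) \qquad \max_{\alpha_i} \quad & \sum_{i=1}^{p} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{p} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \\
\text{s.t.} \quad & \sum_{i=1}^{p} \alpha_i y_i = 0, \\
& 0 \leq \alpha_i \leq C, \quad i = 1, \ldots, p.
\end{aligned}
\]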
More recently, taking into account the objectives (ii) and (iv) of goal programming,
we obtain the same formulation as the $\nu$-support vector algorithm developed by
Schölkopf and Smola (1998):
\[
\begin{aligned}
(\nu\text{-SVM})_P \qquad \min_{w, b, \xi_i, \rho} \quad & \frac{1}{2} \|w\|_2^2 - \nu \rho + \frac{1}{p} \sum_{i=1}^{p} \xi_i \\
\text{s.t.} \quad & y_i (w^T z_i + b) \geq \rho - \xi_i, \\
& \rho \geq 0, \quad \xi_i \geq 0, \quad i = 1, \ldots, p,
\end{aligned}
\]
where $0 \leq \nu \leq 1$ is a parameter.
Compared with the existing soft margin algorithm, one of the differences is
that the parameter C for slack variables does not appear, and another difference
is that the new variable $\rho$ appears in the above formulation. The problem
$(\nu\text{-SVM})_P$ maximizes the variable $\rho$, which corresponds to the minimum
interior deviation (i.e., the minimum distance between the separating hyperplane
and correctly classified points).
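The Lagrangian dual problem to the problem $(\nu\text{-SVM})_P$ is as follows:
\[
\begin{aligned}
(\nu\text{-SVM}) \qquad \max_{\alpha_i} \quad & -\frac{1}{2} \sum_{i,j=1}^{p} y_i y_j \alpha_i \alpha_j K(x_i, x_j) \\
\text{s.t.} \quad & \sum_{i=1}^{p} y_i \alpha_i = 0, \\
& \sum_{i=1}^{p} \alpha_i \geq \nu, \\
& 0 \leq \alpha_i \leq \frac{1}{p}, \quad i = 1, \ldots, p.
\end{aligned}
\]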
Other variants of SVM considering both slack variables for misclassified data
points (i.e., exterior deviations) and surplus variables for correctly classified
data points (i.e., interior deviations) are possible (Nakayama and Yun, 2006a):
Considering iii) and iv) above, we have the formula of total margin SVM, while
$\mu$-SVM can be derived from i) and iii).
Finally, $\mu$-$\nu$-SVM is derived by considering the objectives i) and ii) in
MOP/GP:
\[
\begin{aligned}
(\mu\text{-}\nu\text{-SVM})_P \qquad \min_{w, b, \rho, \sigma} \quad & \frac{1}{2} \|w\|_2^2 - \nu \rho + \mu \sigma \\
\text{s.t.} \quad & y_i (w^T z_i + b) \geq \rho - \sigma, \quad i = 1, \ldots, p, \\
& \rho \geq 0, \quad \sigma \geq 0,
\end{aligned}
\]
where $\nu$ and $\mu$ are parameters.

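The dual formulation is given by
\[
\begin{aligned}
(\mu\text{-}\nu\text{-SVM}) \qquad \max_{\alpha_i} \quad & -\frac{1}{2} \sum_{i,j=1}^{p} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \\
\text{s.t.} \quad & \sum_{i=1}^{p} \alpha_i y_i = 0, \\
& \nu \leq \sum_{i=1}^{p} \alpha_i \leq \mu, \\
& \alpha_i \geq 0, \quad i = 1, \ldots, p.
\end{aligned}
\]
Letting $\alpha^*$ be the optimal solution to the problem $(\mu\text{-}\nu\text{-SVM})$,
the offset $b^*$ can be chosen easily for any i satisfying $\alpha_i^* > 0$.
Otherwise, $b^*$ can be obtained in a similar way to the decision of $b^*$ in the
other algorithms.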
Support Vector Regression
Support Vector Machines were extended to regression by introducing the
$\varepsilon$-insensitive loss function by Vapnik (1998). Denote the given sample
data by $(x_i, y_i)$ for $i = 1, \ldots, p$. Suppose that the regression function
on the Z space is expressed by $f(z) = \sum_{i=1}^{p} w_i z_i + b$. The linear
$\varepsilon$-insensitive loss function is defined by
\[ L_\varepsilon(z, y, f) = |y - f(z)|_\varepsilon = \max(0, |y - f(z)| - \varepsilon). \]
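The effect of the insensitive zone can be seen in a short sketch: residuals smaller than the insensitivity parameter cost nothing, and larger residuals are charged only for the excess. The function name `eps_insensitive_loss` is hypothetical.

```python
import numpy as np

def eps_insensitive_loss(y, f, eps):
    """Linear epsilon-insensitive loss max(0, |y - f| - eps), elementwise."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    return np.maximum(0.0, np.abs(y - f) - eps)

# Residuals of 0.05, 0.5 and 2.0 with eps = 0.1 give losses 0, 0.4 and 1.9.
losses = eps_insensitive_loss([1.0, 2.0, 3.0], [1.05, 2.5, 1.0], eps=0.1)
```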
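For a given insensitivity parameter $\varepsilon$,
\[
\begin{aligned}
(\text{soft-SVR})_P \qquad \min_{w, b, \xi_i, \xi_i'} \quad & \frac{1}{2} \|w\|_2^2 + C \, \frac{1}{p} \sum_{i=1}^{p} (\xi_i + \xi_i') \\
\text{s.t.} \quad & (w^T z_i + b) - y_i \leq \varepsilon + \xi_i, \quad i = 1, \ldots, p, \\
& y_i - (w^T z_i + b) \leq \varepsilon + \xi_i', \quad i = 1, \ldots, p, \\
& \varepsilon, \; \xi_i, \; \xi_i' \geq 0,
\end{aligned}
\]
where C is a trade-off parameter between the norm of w and $\xi_i$ ($\xi_i'$).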
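The dual formulation to $(\text{soft-SVR})_P$ is given by
\[
\begin{aligned}
(\text{soft-SVR}) \qquad \max_{\alpha_i, \alpha_i'} \quad & -\frac{1}{2} \sum_{i,j=1}^{p} (\alpha_i' - \alpha_i)(\alpha_j' - \alpha_j) K(x_i, x_j) \\
& + \sum_{i=1}^{p} (\alpha_i' - \alpha_i) y_i - \varepsilon \sum_{i=1}^{p} (\alpha_i' + \alpha_i) \\
\text{s.t.} \quad & \sum_{i=1}^{p} (\alpha_i' - \alpha_i) = 0, \\
& 0 \leq \alpha_i \leq \frac{C}{p}, \quad 0 \leq \alpha_i' \leq \frac{C}{p}, \quad i = 1, \ldots, p.
\end{aligned}
\]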
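In order to decide $\varepsilon$ automatically, Schölkopf and Smola proposed
$\nu$-SVR as follows (Schölkopf and Smola, 1998):
\[
\begin{aligned}
(\nu\text{-SVR})_P \qquad \min_{w, b, \varepsilon, \xi_i, \xi_i'} \quad & \frac{1}{2} \|w\|_2^2 + C \left( \nu \varepsilon + \frac{1}{p} \sum_{i=1}^{p} (\xi_i + \xi_i') \right) \\
\text{s.t.} \quad & (w^T z_i + b) - y_i \leq \varepsilon + \xi_i, \quad i = 1, \ldots, p, \\
& y_i - (w^T z_i + b) \leq \varepsilon + \xi_i', \quad i = 1, \ldots, p, \\
& \varepsilon, \; \xi_i, \; \xi_i' \geq 0,
\end{aligned}
\]
where C and $\nu$ are trade-off parameters between the norm of w and $\varepsilon$ and
$\xi_i$ ($\xi_i'$).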
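The dual formulation to $(\nu\text{-SVR})_P$ is given by
\[
\begin{aligned}
(\nu\text{-SVR}) \qquad \max_{\alpha_i, \alpha_i'} \quad & -\frac{1}{2} \sum_{i,j=1}^{p} (\alpha_i' - \alpha_i)(\alpha_j' - \alpha_j) K(x_i, x_j) + \sum_{i=1}^{p} (\alpha_i' - \alpha_i) y_i \\
\text{s.t.} \quad & \sum_{i=1}^{p} (\alpha_i' - \alpha_i) = 0, \\
& \sum_{i=1}^{p} (\alpha_i' + \alpha_i) \leq C \cdot \nu, \\
& 0 \leq \alpha_i \leq \frac{C}{p}, \quad 0 \leq \alpha_i' \leq \frac{C}{p}, \quad i = 1, \ldots, p.
\end{aligned}
\]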
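In a similar fashion to classification, we can obtain ($\mu$-$\nu$-SVR) as follows:
\[
\begin{aligned}
(\mu\text{-}\nu\text{-SVR})_P \qquad \min_{w, b, \varepsilon, \xi, \xi'} \quad & \frac{1}{2} \|w\|_2^2 + \nu \varepsilon + \mu (\xi + \xi') \\
\text{s.t.} \quad & (w^T z_i + b) - y_i \leq \varepsilon + \xi, \quad i = 1, \ldots, p, \\
& y_i - (w^T z_i + b) \leq \varepsilon + \xi', \quad i = 1, \ldots, p, \\
& \varepsilon, \; \xi, \; \xi' \geq 0,
\end{aligned}
\]
where $\nu$ and $\mu$ are trade-off parameters between the norm of w and $\varepsilon$
and $\xi$ ($\xi'$).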
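The dual formulation of $\mu$-$\nu$-SVR is as follows:
\[
\begin{aligned}
(\mu\text{-}\nu\text{-SVR}) \qquad \max_{\alpha_i, \alpha_i'} \quad & -\frac{1}{2} \sum_{i,j=1}^{p} (\alpha_i' - \alpha_i)(\alpha_j' - \alpha_j) K(x_i, x_j) + \sum_{i=1}^{p} (\alpha_i' - \alpha_i) y_i \\
\text{s.t.} \quad & \sum_{i=1}^{p} (\alpha_i' - \alpha_i) = 0, \\
& \sum_{i=1}^{p} \alpha_i' \leq \mu, \quad \sum_{i=1}^{p} \alpha_i \leq \mu, \\
& \sum_{i=1}^{p} (\alpha_i' + \alpha_i) \leq \nu, \\
& \alpha_i \geq 0, \quad \alpha_i' \geq 0, \quad i = 1, \ldots, p.
\end{aligned}
\]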
10.4 Managing Meta-models of Multiple Objectives

Meta-modeling in the context of multiobjective optimization has been consid-
ered in several works in recent years (Chafekar et al., 2005; Emmerich et al.,
2006; Gaspar-Cunha and Vieira, 2004; Keane, 2006; Knowles, 2006; Nain and
Deb, 2002; Ray and Smith, 2006; Voutchkov and Keane, 2006). The gener-
alization to multiple objective functions has led to a variety of approaches,
with differences in what is modeled, and also how models are updated. These
differences follow partly from the different possible methods that there are of
building up a Pareto front approximation (see Figure 10.2).
In a modern multiobjective evolutionary algorithm approach like NSGA-
II, selection favours solutions of low dominance rank and uncrowded solu-
tions, which helps build up a diverse and converged Pareto set approximation.
A straightforward way to obtain a meta-modeling-based multiobjective opti-
mization algorithm is thus to take NSGA-II and simply plug in meta-models of
each independent objective function. This can be achieved by running NSGA-
II for several generations on the meta-model (initially constructed from a
DoE sample), and then cycling through phases of selection, evaluation on the
real model, and update of the meta-model. This approach is the one taken
by Voutchkov and Keane (2006) wherein a variety of response surface methods
are compared, including splines, radial basis functions and polynomial
regression. As the authors explain, an advantage of this approach is that each
objective could, in theory, be modeled by a different type of response surface
method (appropriate to it), or some objectives may be cheap to compute and
may not need modeling at all.
[Figure 10.2: four panels, each with axes "Minimize z1" and "Minimize z2":
(a) Nondominated sorting; (b) Scalarizing functions, with contours
max_i [w_i z_i] = m, m = 1, 2, 3, 4; (c) NSGA-II crowding; (d) Hypervolume.]
Fig. 10.2. Building up a Pareto front approximation can be achieved by different
routes. Four are shown here: nondominated sorting, scalarizing, crowding, and
maximising hypervolume.
An alternative to simply plugging meta-models into the existing selection
step of an MOEA, is to use the meta-models, instead, to pre-screen points.
This approach, much like EGO, may base the screening on both the predicted
value of points and the confidence in these predictions. In a method still
based on NSGA-II, Emmerich et al. (2006) proposed a number of different
pre-screening criteria for multiobjective optimization, including the expected
improvement and the probability of improvement. Note that in the case of
multiobjective optimization, improvement is relative to the whole Pareto set
approximation achieved so far, not a single value. Thus, to measure improvement,
the estimated increase in hypervolume (Zitzler et al., 2003) of the current
approximation set (were a candidate point added to it) is used, based on a
meta-model for each objective function. In experiments, Emmerich et al.
compared four different screening criteria on two and three-objective problems,
and found improvement over the standard NSGA-II in all cases.
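To make the notion of hypervolume increase concrete, the following minimal sketch computes it for two minimized objectives. It is a simplification under stated assumptions: the candidate is represented only by its predicted objective vector (the meta-model mean), whereas the criteria of Emmerich et al. also use the prediction uncertainty. The function names and the reference point `ref` are hypothetical.

```python
def nondominated(points):
    """Return the nondominated subset of 2-D points (minimization), sorted by z1."""
    front, best_z2 = [], float("inf")
    for z1, z2 in sorted(set(points)):
        if z2 < best_z2:            # strictly better in z2 than all points with smaller z1
            front.append((z1, z2))
            best_z2 = z2
    return front

def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded above by the reference point."""
    hv, prev_z2 = 0.0, ref[1]
    for z1, z2 in nondominated(points):
        if z1 >= ref[0] or z2 >= ref[1]:
            continue                # point does not dominate any part of the reference box
        hv += (ref[0] - z1) * (prev_z2 - z2)
        prev_z2 = z2
    return hv

def hypervolume_improvement(front, candidate, ref):
    """Increase in hypervolume if the candidate were added to the current set."""
    return hypervolume_2d(list(front) + [candidate], ref) - hypervolume_2d(front, ref)
```

A dominated candidate yields zero improvement, so this quantity can be used directly to rank pre-screened points.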
The approach of Emmerich et al. is a sophisticated method of generalizing
the use of meta-models to multiobjective optimization, via MOEAs, though
it is as yet open whether this sophistication leads to better performance than
the simpler method of Voutchkov and Keane (2006). Moreover, it does seem
slightly unnatural to marry NSGA-II, which uses dominance rank and crowd-
edness to select its parents, with a meta-modeling approach that uses the
hypervolume to estimate probable improvement. It would seem more logical
to use evolutionary algorithms that themselves maximize hypervolume as the
fitness assignment method, such as (Emmerich et al., 2005). It remains to be
seen whether such approaches would perform even better.
One worry with the methods described so far is that fitness assignments
based on dominance rank (like NSGA-II) can perform poorly when the number
of objectives is greater than three or four (Hughes, 2005). Hypervolume may
be a better measure but it is very expensive to compute for large dimension,
as the complexity of known methods for computing it is polynomial in the set
size but exponential in the number of objectives d. Thus, scaling up objective
dimension in methods based on either of the approaches described above might
prove difficult.
A method that does not use either hypervolume or dominance rank is the
ParEGO approach proposed by Knowles (2006). This method is a generaliza-
tion to multiobjective optimization of the well-founded EGO algorithm (Jones
et al., 1998). To build up a Pareto front, ParEGO uses a series of weighting
vectors to scalarize the objective functions. At each iteration of the algorithm,
a new candidate point is determined by (i) computing the expected improvement
(Jones et al., 1998) in the direction specified by the weighting vector
drawn for that iteration, and (ii) searching for a point that maximizes this
expected improvement (a single-objective evolutionary algorithm is used for
this search). The use of such scalarizing weight vectors has been shown to
scale well to many objectives, compared with Pareto ranking (Hughes, 2005).
The ParEGO method has the additional advantage that it would be relatively
straightforward to make it interactive, allowing the user to narrow down the
set of scalarizing weight vectors to allow focus on a particular region of the
Pareto front. This can further reduce the number of function evaluations it is
necessary to perform.
Yet a further way of building up a Pareto front is exemplified in the final
method we review here. Chafekar et al. (2005) propose a genetic algorithm
with meta-models, OEGADO, based closely on their own method for single
objective optimization. To make it work for the multiobjective case, a distinct
genetic algorithm is run for each objective, with information exchange
occurring between the algorithms at intervals, which helps the GAs to find the
compromise solutions. The fact that each objective is optimized by its own
genetic algorithm means that objective functions with different computational
overhead can be appropriately handled: slow objectives do not slow down
the evaluation of faster ones. The code may also be trivially implemented on
parallel architectures.
10.4.1 Combining Interactive Methods and EMO for Generating a Pareto Frontier
Aspiration Level Methods for Interactive Multiobjective Programming
Since there may be many Pareto solutions in practice, the final decision should
be made among them taking the total balance over all criteria into account.
This is a problem of value judgment of the DM. This total balancing over criteria
is usually called trade-off. Interactive multiobjective programming searches for a
solution in an interactive way with the DM while making trade-off analysis on the
basis of the DM's value judgment. Among such methods, the aspiration level
approach is now recognized to be effective in practice, because
(i) it does not require any consistency of the DM's judgment,
(ii) aspiration levels reflect the wish of the DM very well,
(iii) aspiration levels play the role of probe better than the weight for objective
functions.
As one of the aspiration level approaches, one of the authors proposed the
satisficing trade-off method (Nakayama and Sawaragi, 1984). Suppose that we
have objective functions $f(x) := (f_1(x), \ldots, f_r(x))$ to be minimized over
$x \in X \subset R^n$. In the satisficing trade-off method, the aspiration level
at the k-th iteration $\bar{f}^k$ is modified as follows:
\[ \bar{f}^{k+1} = T \circ P(\bar{f}^k). \]
Here, the operator P selects the Pareto solution nearest in some sense to
the given aspiration level $\bar{f}^k$. The operator T is the trade-off operator
which changes the k-th aspiration level $\bar{f}^k$ if the DM does not compromise
with the shown solution $P(\bar{f}^k)$. Of course, since $P(\bar{f}^k)$ is a Pareto
solution, there exists no feasible solution which makes all criteria better than
$P(\bar{f}^k)$, and thus the DM has to trade-off among criteria if he/she wants to
improve some of the criteria. Based on this trade-off, a new aspiration level is
decided as $T \circ P(\bar{f}^k)$. A similar process is continued until the DM
obtains an agreeable solution.
On the Operation P
The operation which gives a Pareto solution $P(\bar{f}^k)$ nearest to $\bar{f}^k$
is performed by some auxiliary scalar optimization. It has been shown in
Sawaragi-Nakayama-Tanino (1985) that the only scalarization technique which
provides any Pareto solution regardless of the structure of the problem is of the
Tchebyshev norm type. However, the scalarization function of Tchebyshev
norm type yields not only a Pareto solution but also a weak Pareto solution.
Since for a weak Pareto solution there may be another solution which improves
one criterion while the others stay fixed, weak Pareto solutions are not
necessarily "efficient" as a solution in decision making. In order to exclude weak
Pareto solutions, the following scalarization function of the augmented Tchebyshev
type can be used:
\[ \max_{1 \leq i \leq r} \omega_i \left( f_i(x) - \bar{f}_i \right) + \alpha \sum_{i=1}^{r} \omega_i f_i(x), \tag{10.18} \]
where $\alpha$ is usually set to a sufficiently small positive number, say $10^{-6}$.

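The weight $\omega_i$ is usually given as follows: Let $f_i^*$ be an ideal value
which is usually given in such a way that $f_i^* < \min \{ f_i(x) \mid x \in X \}$.
For this circumstance, we set
\[ \omega_i^k = \frac{1}{\bar{f}_i^k - f_i^*}. \tag{10.19} \]
The minimization of (10.18) with (10.19) is usually performed by solving the
following equivalent optimization problem, because the original one is not
smooth:
\[
\begin{aligned}
(\mathrm{AP}) \qquad \underset{z, \, x}{\text{minimize}} \quad & z + \alpha \sum_{i=1}^{r} \omega_i f_i(x) \\
\text{subject to} \quad & \omega_i^k \left( f_i(x) - \bar{f}_i^k \right) \leq z, \quad i = 1, \ldots, r, \qquad (10.20) \\
& x \in X.
\end{aligned}
\]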
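The scalarization (10.18) with the weights (10.19) is straightforward to evaluate for a given objective vector. The following minimal sketch does so; the function names are hypothetical, and the default value of `alpha` follows the value suggested in the text.

```python
import numpy as np

def tchebyshev_weights(aspiration, ideal):
    """Weights (10.19): omega_i = 1 / (aspiration_i - ideal_i)."""
    aspiration = np.asarray(aspiration, dtype=float)
    ideal = np.asarray(ideal, dtype=float)
    return 1.0 / (aspiration - ideal)

def augmented_tchebyshev(f, aspiration, ideal, alpha=1e-6):
    """Augmented Tchebyshev scalarization (10.18) of an objective vector f."""
    f = np.asarray(f, dtype=float)
    aspiration = np.asarray(aspiration, dtype=float)
    w = tchebyshev_weights(aspiration, ideal)
    return np.max(w * (f - aspiration)) + alpha * np.sum(w * f)

# Ideal point (0, 0), aspiration level (2, 4), objective vector (1, 3):
# weights are (0.5, 0.25), the max term is -0.25, the augmentation adds 1.25e-6.
value = augmented_tchebyshev([1.0, 3.0], aspiration=[2.0, 4.0], ideal=[0.0, 0.0])
```

A negative value indicates that every objective is better than its aspiration level, so the aspiration level is attainable.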
On the Operation T
In cases where the DM is not satisfied with the solution for $P(\bar{f}^k)$, he/she
is requested to answer his/her new aspiration level $\bar{f}^{k+1}$. Let $x^k$
denote the Pareto solution obtained by projection $P(\bar{f}^k)$, and classify the
objective functions into the following three groups:
(i) the class of criteria which are to be improved more,
(ii) the class of criteria which may be relaxed,
(iii) the class of criteria which are acceptable as they are.
Let the index set of each class be denoted by $I_I^k$, $I_R^k$, $I_A^k$,
respectively. Clearly, $\bar{f}_i^{k+1} < f_i(x^k)$ for all $i \in I_I^k$. Usually,
for $i \in I_A^k$, we set $\bar{f}_i^{k+1} = f_i(x^k)$. For $i \in I_R^k$, the DM has
to agree to increase the value of $\bar{f}_i^{k+1}$. It should be noted that an
appropriate sacrifice of $f_j$ for $j \in I_R^k$ is needed for attaining the
improvement of $f_i$ for $i \in I_I^k$.
[Figure 10.3: objective space with axes f1 and f2; labelled points include f*,
f^k, f^{k+1} and the aspiration levels \bar{f}^k and \bar{f}^{k+1}.]
Fig. 10.3. Satisficing Trade-off Method
Combining Satisficing Trade-off Method and Sequential Approximate Optimization
Nakayama and Yun proposed a method combining the satisficing trade-off
method for interactive multiobjective programming and the sequential
approximate optimization using SVR (Nakayama and Yun, 2006b). The
procedure is summarized as follows:
Step 1. (Real Evaluation)
Evaluate actually the values of objective functions $f(x_1), f(x_2), \ldots, f(x_\ell)$
for sampled data $x_1, \ldots, x_\ell$ through computational simulation analysis or
experiments.
Step 2. (Approximation)
Approximate each objective function $f_1(x), \ldots, f_m(x)$ by the learning of
SVR on the basis of real sample data set.
Step 3. (Find a Pareto Solution Nearest to the Aspiration Level
and Generate Pareto Frontier)
Find a Pareto optimal solution nearest to the given aspiration level for
the approximated objective functions $\hat{f}(x) := (\hat{f}_1(x), \ldots, \hat{f}_m(x))$.
This is performed by using GA for minimizing the augmented Tchebyshev
scalarization function (10.18). In addition, generate Pareto frontier by MOGA
for accumulated individuals during the procedure for optimizing the
augmented Tchebyshev scalarization function.
Step 4. (Choice of Additional Learning Data)
Choose the additional $\ell_0$-data from the set of obtained Pareto optimal
solutions. Go to Step 1. (Set $\ell \leftarrow \ell + \ell_0$.)
How to choose the additional data:
Stage 0. First, add the point with highest achievement degree
among Pareto optimal solutions obtained in Step 3. (local
information)
Stage 1. Evaluate the ranks for the real sampled data of Step 1
by the ranking method (Fonseca and Fleming, 1993).
Stage 2. Approximate the rank function associated with the
ranks calculated in the Stage 1 by SVR.
Stage 3. Calculate the expected fitness for Pareto optimal
solutions obtained in Step 3.
Stage 4. Among them, add the point with highest rank. (global
information)
Next, we consider the following problem (Ex-1):
\[
\begin{aligned}
\text{minimize} \quad & f_1 := x_1 + x_2 \\
& f_2 := 20 \cos(15 x_1) + (x_1 - 4)^4 + 100 \sin(x_1 x_2) \\
\text{subject to} \quad & 0 \leq x_1, x_2 \leq 3.
\end{aligned}
\]
The true contours of the objective functions $f_1$ and $f_2$ in the problem (Ex-1)
are shown in Fig. 10.4.
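For readers who wish to reproduce the example, a minimal sketch of the two objective functions of (Ex-1) follows, reading $f_2$ as $20\cos(15x_1) + (x_1 - 4)^4 + 100\sin(x_1 x_2)$. The function names and the helper `sample_initial_points` are hypothetical; the helper simply draws uniform random points in the box $0 \leq x_1, x_2 \leq 3$.

```python
import math
import random

def f1(x1, x2):
    """First objective of (Ex-1)."""
    return x1 + x2

def f2(x1, x2):
    """Second objective of (Ex-1)."""
    return 20.0 * math.cos(15.0 * x1) + (x1 - 4.0) ** 4 + 100.0 * math.sin(x1 * x2)

def sample_initial_points(n, seed=0):
    """Draw n uniform random points in the feasible box [0, 3] x [0, 3]."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, 3.0), rng.uniform(0.0, 3.0)) for _ in range(n)]

# Evaluate both objectives on ten random initial points.
initial = [(x, (f1(*x), f2(*x))) for x in sample_initial_points(10)]
```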
[Figure 10.4: contour plots over 0 <= x1 <= 3 (horizontal axis) and
0 <= x2 <= 3 (vertical axis): (a) f1, (b) f2.]
Fig. 10.4. The true contours to the problem
In our simulation, the ideal point and the aspiration level are respectively given
by
\[ (f_1^*, f_2^*) = (0, -120), \qquad (\bar{f}_1, \bar{f}_2) = (3, 200), \]
and the closest Pareto solution to the above aspiration level is as follows:
exact optimal solution $(x_1, x_2) = (1.41321, 0)$
exact optimal value $(f_1, f_2) = (1.41321, 30.74221)$
Starting with 10 randomly chosen initial data points, we obtained the following
approximate solution by the proposed method after 50 real evaluations:
approximate solution $(x_1, x_2) = (1.45748, 0)$
approximate value $(f_1, f_2) = (1.45748, 35.34059)$
The final result is shown in Fig. 10.5.
[Figure 10.5: (a) contour of f1 and (b) contour of f2 over 0 <= x1, x2 <= 3;
(c) population at the final generation in the (f1, f2) plane, with the
aspiration level and the ideal point marked.]
Fig. 10.5. # sample data: 50 points (Ex-1)
Additionally, a rough configuration of the Pareto frontier is also obtained in
Fig. 10.6 through Step 3. Although this figure shows only a rough approximation
of the whole Pareto frontier, it may be expected to provide a reasonable
approximation to it in a neighborhood of the Pareto optimal point nearest to
the aspiration level. This information of the approximation of Pareto frontier
in a neighborhood of the obtained Pareto solution helps the decision maker
to make trade-off analysis.
It has been observed that a method combining the satisficing trade-off
method and meta-modeling is effective for supporting the DM to get a final
solution in a reasonable number of computational experiments. It is promising
in practical problems since it has been observed through several numerical
experiments that the method reduces the number of function evaluations to
roughly 1/100 to 1/10 of that needed by usual methods such as MOGAs and
usual aspiration level methods.
10.5 Evaluation of Meta-modeling Methods

Improvement in the design of heuristic search algorithms may sometimes be
based on good theoretical ideas, but it is generally driven by the empirical
testing and comparison of methods. Meta-model-based multiobjective opti-
mization algorithms are relatively new, so, as yet, there has been little serious
direct comparison of methods. In the coming years, this will become more of
a focus.
10.5.1 What to Evaluate
In meta-modeling scenarios, full-cost evaluations are expensive and should be
reduced. Therefore, it follows that whatever assessments of Pareto set
approximation are used, these should generally be plotted against the number
of full-cost function evaluations, so that it is possible to see how performance
evolves over time. In the case of multiobjective optimization, this is often
overlooked, because of the desire to show Pareto front approximations. For
two-objective problems, the use of snapshots of Pareto fronts can be informa-
tive, but for general dimension, plotting the value of an indicator, such as the
hypervolume, against full-cost evaluation number (averaged over a number of
runs) is a more advisable approach.
An alternative statistical method that has been used little to date, is the
attainment function method of comparing two optimizers (da Fonseca et al.,
2001). With this, the number of full cost evaluations can be considered an
additional objective. For the first-order attainment function, the method
returns an overall significance result indicating whether the attainment function
(which describes the probability of attaining a certain point by a certain time,
for all points and times) differs significantly between the two algorithms.
[Figure 10.6: Pareto frontier approximation in the (f1, f2) plane, with the
aspiration level and the ideal point marked.]
Fig. 10.6. The whole Pareto frontier with 50 sample points
In addition to the performance at optimizing a function, meta-modeling also
involves its approximation. Therefore, it is sometimes desirable to show the
time evolution of the accuracy of the model over all or some region of the
design space.
In (Emmerich et al., 2006), the accuracy of the model, as seen by the
evolutionary algorithm optimizer was measured. This was done by computing
the precision and recall of the pre-screening methods used in terms of being
able to correctly rank solutions (rather than get their absolute evaluation
correct). In a similar vein, a number of measures of model quality that are
based on evaluating the model's utility at making the correct selection of
individuals within an EA, were proposed in (Hüsken et al., 2005).
Evaluation of interactive optimization and decision-making methods is
even more of a challenge, and is dealt with in Chapter 7 of this book.
10.5.2 Adversaries
Assessment of progress in meta-modeling would be facilitated by using
common, simple adversary algorithms or methods to compare against. Perhaps
the simplest adversary is the random search. It is very interesting to com-
pare with random search (as done in (Knowles, 2006) and see also (Hughes,
2006)) because it is not necessarily trivial to outperform the algorithm when
the number of evaluations is small, and depending on the function. More-
over, when approximating a higher dimensional Pareto front, random search
may be better than some multiobjective EAs, such as NSGA-II. The obvious
additional adversary is the algorithm being proposed, with the meta-model
removed (if this is possible).
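A random search adversary is simple to set up, which is part of its appeal as a baseline. The following is a minimal sketch under stated assumptions: solutions are sampled uniformly within box bounds, all objectives are minimized, and a nondominated archive is kept. The function names are hypothetical.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(ai <= bi for ai, bi in zip(a, b)) and any(ai < bi for ai, bi in zip(a, b))

def random_search(objectives, bounds, n_evals, seed=0):
    """Uniform random search that returns the nondominated (x, z) pairs found.

    objectives : list of callables, each mapping a point x (a list) to a float
    bounds     : list of (low, high) pairs, one per decision variable
    n_evals    : number of full-cost evaluations allowed
    """
    rng = random.Random(seed)
    archive = []
    for _ in range(n_evals):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        z = tuple(f(x) for f in objectives)
        if any(dominates(az, z) for _, az in archive):
            continue                                  # new point is dominated
        archive = [(ax, az) for ax, az in archive if not dominates(z, az)]
        archive.append((x, z))
    return archive
```

Running it with the same evaluation budget as the meta-model-based method gives a reference curve against which any claimed saving in full-cost evaluations can be judged.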
10.6 Real Applications
10.6.1 Closed-Loop Mass-Spectrometer Optimization
Mass spectrometers are analytical instruments for determining the chemical
compounds present in a sample. Typically, they are used for testing a
hypothesis as to whether a particular compound is present or not. When used
in this way, the instrument can be configured according to standard principles
and settings provided by the instrument manufacturer. However, modern
biological applications aim at using mass-spectrometry to mine data without
a hypothesis, i.e. to measure/detect simultaneously the hundreds of compounds
contained in complex biological samples. For such applications, the
mass spectrometer will not perform well in a standard configuration, so it
must be optimized.
(O'Hagan et al., 2006) describes experiments to optimize the configuration
of a GCxGC-TOF mass spectrometer with the aim of improving its effectiveness
at detecting all the metabolites (products of the metabolic system) in
a human serum (blood) sample. To undertake this optimization, real experiments
were conducted with the instrument in different configuration set-ups,
as dictated by the optimization algorithm. Instrument configurations are
specified by 15 continuous parameters. Typically, a single evaluation of an
instrument configuration lasts of the order of one hour, although the throughput of
the instrument (i.e. how long it takes to process the serum sample) was also
one of the objectives and thus subject to some variation. The overall arrangement
showing the optimization algorithm and how it is connected up to the
instrument in a closed-loop is shown in Figure 10.7.
The experiments reported in (O'Hagan et al., 2006) used just 320 experiments
(evaluations) to increase the number of metabolites observed (primary
objective) substantially, as measured by the number of peaks in the
mass-spectrometer's output. Simultaneously, the signal to noise ratio and the
throughput of the instrument were optimized. A plot of the evolution of the
three objectives is given in Figure 10.8.
The meta-modeling algorithm used for the optimization was ParEGO
(Knowles, 2006), a multiobjective version of the efficient global optimization
(EGO) algorithm (Jones et al., 1998). ParEGO uses a design and analysis of
computer experiments (DACE) approach to model the objective function(s)
(in this case, the instrument's behaviour under different operating
configurations), based on an initial Latin hypercube sampling of the parameter space.
Subsequently, the model is used to suggest the next experiment (set of
instrumentation parameter values), such that the expected improvement in
the objective function(s) is maximized. The notion of expected improvement
implicitly ensures that ParEGO balances exploration of new parameter
combinations with exploitation and fine-tuning of parameter values that have led
to good design points in previous experiments. The DACE model is updated
after each fitness evaluation.
[Figure 10.7: flow chart with three blocks.
Initialization: define k initial sample configurations x based on a Latin
hypercube design.
Expensive evaluation step: do an experiment with the GCxGC mass spectrometer
configured according to x; measure the three objectives y and store x, y in the
matrix M.
Main loop: choose a scalarizing weight vector; generate a Gaussian process
model of the scalarized fitness based on matrix M; search for a configuration x
that maximizes the expected improvement in the scalarized fitness.]
Fig. 10.7. Closed-loop optimization of mass spectrometer parameters using the
ParEGO algorithm
10.6.2 Application to Reinforcement of Cable-Stayed Bridges
After the big earthquake in Kobe in 1995, many in-service structures were
required by law in Japan to improve their anti-seismic properties.
However, large and/or complicated bridges, such as suspension bridges,
cable-stayed bridges and arch bridges, are very difficult to reinforce
because of impractical construction methods and complicated dynamic
responses. Recently, many kinds of anti-seismic device have been developed
(Honda et al., 2004). For such bridges it is practical to install a number
of small devices, taking strength and/or space into account, and to obtain
the most reasonable arrangement and capacity of the devices by using an
optimization technique. In this problem, the form of the objective function is not
given explicitly in terms of the design variables, but the value of the function is
obtained by seismic response analysis. Since this analysis is costly and
time-consuming, it is strongly desirable to keep the number of analyses as small as
possible. To this end, radial basis function networks (RBFN) are employed
to predict the form of the objective function, and genetic algorithms (GA) to
search for the optimal value of the predicted objective function (Nakayama
et al., 2006).
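To make the role of the RBF network concrete, the following is a minimal sketch of fitting such a surrogate to already-analysed designs. It assumes one Gaussian basis function per sample, of the form exp(-||x - c||^2 / r^2) with a common width r, and weights obtained by regularized least squares with a penalty weight `lam`; these choices and all names are illustrative assumptions rather than the exact formulation of the proposed method. NumPy is assumed to be available.

```python
import numpy as np


def fit_rbf_network(X, y, width, lam=0.01):
    """Fit a Gaussian RBF network with one centre per sample.

    Weights minimise sum (y_i - f(x_i))^2 + lam * sum w_j^2, which gives
    w = (H^T H + lam I)^{-1} H^T y. Returns a prediction function.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared distances between all pairs of samples -> design matrix H.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    H = np.exp(-d2 / width ** 2)
    w = np.linalg.solve(H.T @ H + lam * np.eye(len(X)), H.T @ y)

    def predict(Xq):
        Xq = np.atleast_2d(np.asarray(Xq, dtype=float))
        d2q = ((Xq[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2q / width ** 2) @ w

    return predict


if __name__ == "__main__":
    # Toy stand-in for an expensive analysis: three analysed designs in 1-D.
    predict = fit_rbf_network([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0], width=0.5)
    print(predict([[0.0], [1.0], [2.0]]))
```

A search method such as a GA would then call `predict` in place of the expensive analysis, and only the most promising candidates would be passed to the real analysis and added to the sample set.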
The proposed method was applied to a problem of anti-seismic improvement
of a cable-stayed bridge, which typifies the difficulty of reinforcing
in-service structures. In this investigation, we determine an efficient
arrangement and amount of additional mass for cables to reduce the seismic response
of the tower of a cable-stayed bridge (Fig. 10.9).
[Figure 10.8: three-dimensional scatter plot; axes: #Peaks, Runtime /mins, Signal:Noise Ratio]
Fig. 10.8. Three-objective closed-loop optimization of GCxGC-TOF-mass
spectrometry configurations for the analysis of human serum. The number of peaks and
signal:noise ratio are maximized; the runtime of the mass spectrometer is minimized.
The shading of the points represents the experiment number. Darker circles represent
later experiments, and the six black dots represent replications of the same chosen
final configuration. The number of peaks has risen to over 3000, whilst sufficient
signal:noise has been maintained and runtime kept down to around 60 minutes.
Fig. 10.9. Cable-stayed bridge
The influence of additional mass on cables was investigated by numerical
sensitivity analysis. The analytical model shown in Fig. 10.9 is a 3-span continuous
and symmetrical cable-stayed bridge whose 2 × 20 cables are in one plane and whose
towers stand freely in their transverse direction. The mass must be distributed
over the cables uniformly to prevent deformation from concentrating in them.
The seismic response of interest was the stress at the fixed end of the
tower when an earthquake occurs in the transverse direction. Seismic response
analysis was carried out by a spectrum method. As there are many modes
whose natural frequencies are close to each other, the response was evaluated
by the complete quadratic combination method. The input spectrum is given
in the new Specifications for Highway Bridges in Japan.
The natural frequencies of the modes accompanied by bending of the
tower (the natural frequency of the tower alone is 1.4 Hz) range from 0.79 Hz to
2.31 Hz, due to coupling with the cables.
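Why closely spaced modes call for the complete quadratic combination (CQC) rule can be seen from a small numerical sketch. It uses the commonly quoted CQC correlation coefficient for modes with equal damping; the damping ratio of 0.05, the peak modal responses and the function names are assumptions made only for illustration and are not taken from the bridge study.

```python
import math


def cqc_correlation(freq_i, freq_j, damping):
    """CQC cross-modal correlation coefficient for two modes with equal damping."""
    r = freq_j / freq_i
    num = 8.0 * damping ** 2 * (1.0 + r) * r ** 1.5
    den = (1.0 - r ** 2) ** 2 + 4.0 * damping ** 2 * r * (1.0 + r) ** 2
    return num / den


def cqc_combine(peaks, freqs, damping=0.05):
    """Combine peak modal responses: sqrt(sum_i sum_j rho_ij * R_i * R_j)."""
    total = 0.0
    for r_i, f_i in zip(peaks, freqs):
        for r_j, f_j in zip(peaks, freqs):
            total += cqc_correlation(f_i, f_j, damping) * r_i * r_j
    return math.sqrt(total)


if __name__ == "__main__":
    peaks = [3.0, 4.0]  # hypothetical peak modal responses
    # Well-separated modes: result is close to the square root of the sum of squares.
    print(cqc_combine(peaks, [0.79, 7.9]))
    # Closely spaced modes: the cross terms are no longer negligible.
    print(cqc_combine(peaks, [1.40, 1.45]))
```

For well-separated frequencies the cross terms vanish and the rule reduces to the square root of the sum of squares, whereas for nearly equal frequencies the correlation coefficient approaches one and the modal peaks add almost directly.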
As mentioned above, the seismic response of the tower can be controlled
by adding mass to the cables, but each cable influences the others in a complex
way. Thus, the most effective distribution of additional mass must be decided
by optimization.
10.6.3 Case 1
The objective is to minimize the bending moment M at the base of the tower.
The variables are the ratios of additional mass to the mass of the cables. The number of
variables is 20. The lower and upper bounds of each variable are 0.0 and
1.0, respectively. For comparison, we applied a quasi-Newton method based
on approximated differentials as an existing method. We made five trials with
different initial points in order to obtain a global optimum.
In applying our proposed method, we used BLX-α as a genetic algorithm,
which is observed to be effective for continuous variables. The population size is
10, and the number of generations is 200. We set λ = 0.01, and decided the
value of the width r of the Gaussian by the simple estimate given by (10.17).
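As a reminder of how blend crossover works on continuous variables, here is a minimal sketch of the BLX-α operator. Each child gene is drawn uniformly from the parents' interval extended on both sides by a fraction `alpha` of its length. The value alpha = 0.5 and the clipping to the variable bounds [0, 1] are assumptions for illustration; the text does not state the settings actually used.

```python
import random


def blx_alpha(parent_a, parent_b, alpha=0.5, lower=0.0, upper=1.0, rng=random):
    """Blend crossover: sample each gene from the widened parent interval."""
    child = []
    for a, b in zip(parent_a, parent_b):
        lo, hi = min(a, b), max(a, b)
        span = hi - lo
        gene = rng.uniform(lo - alpha * span, hi + alpha * span)
        # Keep the gene inside the feasible range of the design variable.
        child.append(min(max(gene, lower), upper))
    return child


if __name__ == "__main__":
    rng = random.Random(1)
    mother = [rng.random() for _ in range(20)]  # 20 mass ratios, as in Case 1
    father = [rng.random() for _ in range(20)]
    print(blx_alpha(mother, father, rng=rng))
```

With alpha = 0 the child always lies between its parents; larger values let the search step outside that interval, which counteracts premature contraction of the population.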
We started the iteration with 60 sample points. The first 20 sample points
are generated randomly with one of the variables fixed at the upper bound 1 in
turn; the next 20 are generated similarly with one of the variables fixed at the
lower bound 0 in turn; the last 20 similarly with one of the variables fixed at the
mid-value 0.5 in turn. The parameters for convergence are Cx0 = 20, Cf0 = 20
and l0 = 0.1.
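The construction of the 60 starting samples can be written down directly. The sketch below follows the description above; drawing the free variables uniformly from [0, 1] is an assumption, since the text says only that they are generated randomly, and the function name is hypothetical.

```python
import random


def initial_samples(n_vars=20, levels=(1.0, 0.0, 0.5), rng=random):
    """Build n_vars samples per level; in each, one variable is pinned to the level."""
    samples = []
    for level in levels:
        for fixed in range(n_vars):
            x = [rng.random() for _ in range(n_vars)]  # free variables in [0, 1)
            x[fixed] = level                           # variable fixed "in turn"
            samples.append(x)
    return samples


if __name__ == "__main__":
    design = initial_samples(rng=random.Random(0))
    print(len(design), "samples of dimension", len(design[0]))
```

Pinning each variable in turn to its upper bound, lower bound and mid-value guarantees that every coordinate direction is represented at the edges and centre of its range before the surrogate is first fitted.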
The result is shown in Table 10.1. It is seen that the proposed method can
find fairly good solutions with about 1/10 or fewer of the analyses required by
the conventional optimization.
Table 10.1. Result for Case 1

                          existing     RBF network
                          method       best       average
cable No.   1             0.32         0.04       0.40
            2             1.00         0.69       0.84
            3             0.49         0.18       0.51
            4             0.62         0.82       0.80
            5             0.81         0.57       0.64
            6             0.52         0.43       0.56
            7             0.49         1.00       0.39
            8             0.52         0.44       0.66
            9             0.48         0.94       0.50
            10            0.48         0.50       0.56
            11            0.50         0.45       0.47
            12            0.55         1.00       0.74
            13            0.70         0.85       0.71
            14            0.61         0.50       0.30
            15            0.61         1.00       0.58
            16            0.46         0.24       0.37
            17            0.22         0.10       0.13
            18            1.00         0.95       0.91
            19            0.98         1.00       0.94
            20            1.00         1.00       0.91
bending moment (MNm)      50.3         54.90      63.70
#analysis                 1365         150.00     124.80

10.6.4 Case 2
Now, we take the number of cables to which masses are added, $N$, as another
objective function in addition to the bending moment $M$. Namely, our objective
function is
$$F = \frac{M}{M_0} + \alpha\,\frac{N}{N_0} \tag{10.21}$$
where $\alpha$ is a parameter for the trade-off between the first term and the second
one. $M_0$ and $N_0$ are used for normalization of the bending moment and the
number of cables, respectively. In this experiment, we set $M_0 = 147.0$ MNm
and $N_0 = 20$.
In this experiment, we used a simple GA, because some of the variables are
discrete. The parameters for calculation are the same as in Case 1. The result
is given in Table 10.2. It should be noted that the number of analyses in
our proposed method is reduced to about 1/20 of that of the conventional method.
Although the solutions found by our method are less precise than those of the
conventional method, they are sufficiently acceptable in practice.
Table 10.2. Result for Case 2

                               existing method       RBF network
                               best      average     best      average
cable No.   1                  0.00      0.00        0.00      0.83
            2                  0.00      0.00        0.00      0.09
            3                  0.00      0.00        0.00      0.00
            4                  0.00      0.00        0.00      0.04
            5                  0.00      0.00        0.00      0.00
            6                  0.00      0.00        0.00      0.00
            7                  0.00      0.00        0.00      0.00
            8                  0.00      0.00        1.00      0.99
            9                  0.00      0.00        0.00      0.10
            10                 0.00      0.00        0.86      0.53
            11                 0.00      0.00        0.00      0.00
            12                 0.00      0.00        1.00      0.63
            13                 0.00      0.00        0.00      0.13
            14                 0.00      0.00        0.86      0.53
            15                 0.00      0.00        0.00      0.00
            16                 0.00      0.00        0.00      0.00
            17                 0.00      0.00        0.00      0.00
            18                 0.00      0.00        0.00      0.00
            19                 0.71      0.74        1.00      1.00
            20                 0.86      0.83        0.86      0.79
bending moment (MNm)           62.6      62.8        67.1      69.7
#cable with additional mass    2         2           6         6.2
objective fn.                  0.526     0.527       0.756     0.784
#analysis                      4100      3780        199       193.3
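The objective values in Table 10.2 can be checked against Eq. (10.21) with a few lines of arithmetic. The value of the trade-off parameter is not stated in this section; the sketch below assumes it equals 1, an inference made here only because that value reproduces all four reported objective values from the reported bending moments and cable counts. The function and variable names are hypothetical.

```python
def combined_objective(moment, n_cables, tradeoff=1.0, m0=147.0, n0=20.0):
    """Eq. (10.21): normalized bending moment plus weighted normalized cable count."""
    return moment / m0 + tradeoff * n_cables / n0


# (bending moment in MNm, number of cables with added mass, reported objective)
TABLE_10_2 = {
    "existing method, best": (62.6, 2, 0.526),
    "existing method, average": (62.8, 2, 0.527),
    "RBF network, best": (67.1, 6, 0.756),
    "RBF network, average": (69.7, 6.2, 0.784),
}

if __name__ == "__main__":
    for label, (moment, n_cables, reported) in TABLE_10_2.items():
        value = combined_objective(moment, n_cables)
        print(f"{label}: computed {value:.3f}, reported {reported:.3f}")
```

For example, the best solution of the existing method gives 62.6/147.0 + 2/20, which rounds to the reported 0.526; the RBF network's higher objective value is mostly due to its larger number of loaded cables rather than to the bending moment.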
The well-known Response Surface Method (RSM, in short) is competitive with
the proposed method. However, the proposed method has been observed to
have an advantage over RSM, especially for highly nonlinear cases. On the other
hand, EGO (Efficient Global Optimization) for black-box objective functions
(Jones et al., 1998; Sasena et al., 2000; Schonlau, 1997) takes time to calculate
the expected improvement, while in the proposed method it is rather simple and
easy to add two kinds of additional samples, one for global information and one
for local information, for the approximation (Nakayama et al., 2002, 2003).
10.7 Concluding Remarks
The increasing desire to apply optimization methods in expensive domains
is driving forward research in meta-modeling. Up to now, meta-modeling has
been applied mainly in continuous, low-dimensional design variable spaces,
and methods from design of experiments and response surfaces have been
used. High-dimensional discrete spaces may also arise in applications involving
expensive evaluations and this will motivate research into meta-modeling of
these domains too. Research in meta-modeling for multiobjective optimization
is relatively young and there is still much to do. So far, there are few standards
for comparisons of methods, and little is yet known about the relative
performance of different approaches. The state-of-the-art research surveyed in this
chapter is beginning to grapple with the issues of incremental learning and
the trade-off between exploitation and exploration within meta-modeling. In
the future, scalability of methods in variable dimension and objective space
dimension will become important, as will methods capable of dealing with
noise or uncertainty. Interactive meta-modeling is also likely to be
investigated more thoroughly, as the number of evaluations can be further reduced
by these approaches.
References
Anderson, V.L., McLean, R.A.: Design of Experiments: A Realistic Approach.
Marcel Dekker, New York (1974)
Atkeson, C.G., Moore, A.W., Schaal, S.: Locally weighted learning for control.
Artificial Intelligence Review 11(1), 75–113 (1997)
Breiman, L.: Classification and Regression Trees. Chapman and Hall, Boca Raton
(1984)
Brown, G., Wyatt, J.L., Tiňo, P.: Managing diversity in regression ensembles. The
Journal of Machine Learning Research 6, 1621–1650 (2005)
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines,
Regularization, Optimization, and Beyond. MIT Press, Cambridge (2002)
Cauwenberghs, G., Poggio, T.: Incremental and decremental support vector machine
learning. Advances in Neural Information Processing Systems 13, 409–415 (2001)
Chafekar, D., Shi, L., Rasheed, K., Xuan, J.: Multiobjective GA optimization using
reduced models. IEEE Transactions on Systems, Man and Cybernetics, Part C:
Applications and Reviews 35(2), 261–265 (2005)
Chapelle, O., Vapnik, V., Weston, J.: Transductive inference for estimating values of
functions. Advances in Neural Information Processing Systems 12, 421–427 (1999)
Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models.
Journal of Artificial Intelligence Research 4, 129–145 (1996)
Corne, D.W., Oates, M.J., Kell, D.B.: On fitness distributions and expected fitness
gain of mutation rates in parallel evolutionary algorithms. In: Guervós, J.J.M.,
Adamidis, P.A., Beyer, H.-G., Fernández-Villacañas, J.-L., Schwefel, H.-P. (eds.)
PPSN 2002. LNCS, vol. 2439, pp. 132–141. Springer, Heidelberg (2002)
Cortes, C., Vapnik, V.: Support vector networks. Machine Learning 20, 273–297
(1995)
Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and
Other Kernel-based Learning Methods. Cambridge University Press, Cambridge
(2000)
Grunert da Fonseca, V., Fonseca, C.M., Hall, A.O.: Inferential Performance
Assessment of Stochastic Optimisers and the Attainment Function. In: Zitzler, E.,
Deb, K., Thiele, L., Coello Coello, C.A., Corne, D.W. (eds.) EMO 2001. LNCS,
Emmerich, M.T.M., Beume, N., Naujoks, B.: An EMO algorithm using the
hypervolume measure as selection criterion. In: Coello Coello, C.A., Hernández Aguirre,
A., Zitzler, E. (eds.) EMO 2005. LNCS, vol. 3410, pp. 62–76. Springer, Heidelberg
(2005)
Emmerich, M., Giannakoglou, K., Naujoks, B.: Single- and Multi-objective
Evolutionary Optimization Assisted by Gaussian Random Field Metamodels. IEEE
Transactions on Evolutionary Computation 10(4), 421–439 (2006)
English, T.M.: Optimization is easy and learning is hard in the typical function. In:
Proceedings of the 2000 Congress on Evolutionary Computation (CEC'00), pp.
924–931. IEEE Computer Society Press, Piscataway (2000)
Erenguc, S.S., Koehler, G.J.: Survey of mathematical programming models and
experimental results for linear discriminant analysis. Managerial and Decision
Economics 11, 215–225 (1990)
Eyheramendy, S., Lewis, D., Madigan, D.: On the naive Bayes model for text
categorization. In: Proceedings Artificial Intelligence & Statistics 2003 (2003)
Fieldsend, J.E., Everson, R.M.: Multi-objective Optimisation in the Presence of
Uncertainty. In: 2005 IEEE Congress on Evolutionary Computation (CEC2005),
Edinburgh, Scotland, September 2005, vol. 1, pp. 243–250. IEEE Computer
Society Press, Los Alamitos (2005)
Fonseca, C.M., Fleming, P.J.: Genetic algorithms for multi-objective optimization:
Formulation, discussion and generalization. In: Proceedings of the Fifth
International Conference on Genetic Algorithms, pp. 416–426 (1993)
Freed, N., Glover, F.: Simple but powerful goal programming models for discriminant
problems. European J. of Operational Research 7, 44–60 (1981)
Gaspar-Cunha, A., Vieira, A.: A multi-objective evolutionary algorithm using neural
networks to approximate fitness evaluations. International Journal of Computers,
Systems, and Signals (2004)
Hamza, K., Saitou, K.: Vehicle crashworthiness design via a surrogate model
ensemble and a co-evolutionary genetic algorithm. In: Proc. of IDETC/CIE 2005
ASME 2005 International Design Engineering Technical Conference, California,
USA (2005)
Honda, M., Morishita, K., Inoue, K., Hirai, J.: Improvement of anti-seismic capacity
with damper braces for bridges. In: Proceedings of the Seventh International
Conference on Motion and Vibration Control (2004)
Huang, D., Allen, T.T., Notz, W.I., Zeng, N.: Global Optimization of Stochastic
Black-Box Systems via Sequential Kriging Meta-Models. Journal of Global
Optimization 34(3), 441–466 (2006)
Hughes, E.J.: Evolutionary Many-Objective Optimisation: Many Once or One
Many? In: 2005 IEEE Congress on Evolutionary Computation (CEC2005), vol. 1,
pp. 222–227. IEEE Computer Society Press, Los Alamitos (2005)
Hughes, E.J.: Multi-Objective Equivalent Random Search. In: Runarsson, T.P.,
Beyer, H.-G., Burke, E.K., Merelo-Guervós, J.J., Whitley, L.D., Yao, X. (eds.)
PPSN 2006. LNCS, vol. 4193, pp. 463–472. Springer, Heidelberg (2006)
Hüsken, M., Jin, Y., Sendhoff, B.: Structure optimization of neural networks for
evolutionary design optimization. Soft Computing 9(1), 21–28 (2005)
Jin, Y.: A comprehensive survey of fitness approximation in evolutionary
computation. Soft Computing - A Fusion of Foundations, Methodologies and
Applications 9(1), 3–12 (2005)
Jin, Y., Sendhoff, B.: Reducing fitness evaluations using clustering techniques
and neural network ensembles. In: Deb, K., et al. (eds.) GECCO 2004. LNCS,
Jin, Y., Olhofer, M., Sendhoff, B.: Managing approximate models in evolutionary
algorithms design optimization. In: Proceedings of the 2001 Congress on
Evolutionary Computation, CEC2001, pp. 592–599 (2001)
Jones, D., Schonlau, M., Welch, W.: Efficient global optimization of expensive
black-box functions. Journal of Global Optimization 13, 455–492 (1998)
Joslin, D., Dragovich, J., Vo, H., Terada, J.: Opportunistic fitness evaluation in
a genetic algorithm for civil engineering design optimization. In: Proceedings of
the Congress on Evolutionary Computation (CEC 2006), pp. 2904–2911. IEEE
Computer Society Press, Los Alamitos (2006)
Jourdan, L., Corne, D.W., Savic, D.A., Walters, G.A.: Preliminary Investigation of
the Learnable Evolution Model for Faster/Better Multiobjective Water Systems
Design. In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.) EMO
Keane, A.J.: Statistical improvement criteria for use in multiobjective design
optimization. AIAA Journal 44(4), 879–891 (2006)
Knowles, J.: ParEGO: A hybrid algorithm with on-line landscape approximation
for expensive multiobjective optimization problems. IEEE Transactions on
Evolutionary Computation 10(1), 50–66 (2006)
Langdon, W.B., Poli, R.: Foundations of Genetic Programming. Springer, Heidelberg
(2001)
Larranaga, P., Lozano, J.A.: Estimation of Distribution Algorithms: A New Tool for
Evolutionary Computation. Kluwer Academic Publishers, Dordrecht (2001)
Laumanns, M., Očenášek, J.: Bayesian optimization algorithms for multi-objective
optimization. In: Guervós, J.J.M., Adamidis, P.A., Beyer, H.-G.,
Fernández-Villacañas, J.-L., Schwefel, H.-P. (eds.) PPSN 2002. LNCS, vol. 2439, pp. 298–307.
Michalski, R.: Learnable Evolution Model: Evolutionary Processes Guided by
Machine Learning. Machine Learning 38(1), 9–40 (2000)
Sasena, M.J., Papalambros, P.Y., Goovaerts, P.: Metamodeling sample criteria in a
global optimization framework. In: 8th AIAA/NASA/USAF/ISSMO Symposium
on Multidisciplinary Analysis and Optimization, Long Beach, AIAA-2000-4921
(2000)
Myers, R.H., Montgomery, D.C.: Response Surface Methodology: Process and
Product Optimization using Designed Experiments. Wiley, Chichester (1995)
Nain, P., Deb, K.: A computationally effective multi-objective search and
optimization technique using coarse-to-fine grain modeling. Kangal Report 2002005 (2002)
Nakayama, H., Sawaragi, Y.: Satisficing trade-off method for multi-objective
programming. In: Grauer, M., Wierzbicki, A. (eds.) Interactive Decision Analysis,
Nakayama, H., Yun, Y.: Generating support vector machines using multiobjective
optimization and goal programming. In: Jin, Y. (ed.) Multi-objective Machine
Learning, pp. 173–198. Springer, Heidelberg (2006a)
Nakayama, H., Yun, Y.: Support vector regression based on goal programming and
multi-objective programming. IEEE World Congress on Computational
Intelligence, CD-ROM, Paper ID: 1536 (2006b)
Nakayama, H., Arakawa, M., Sasaki, R.: Simulation based optimization for unknown
objective functions. Optimization and Engineering 3, 201–214 (2002)
Nakayama, H., Arakawa, M., Washino, K.: Optimization for black-box objective
functions. In: Tseveendorj, I., Pardalos, P.M., Enkhbat, R. (eds.) Optimization
and Optimal Control, pp. 185–210. World Scientific, Singapore (2003)
Nakayama, H., Inoue, K., Yoshimori, Y.: Approximate optimization using
computational intelligence and its application to reinforcement of cable-stayed bridges.
In: Zha, X.F., Howlett, R.J. (eds.) Integrated Intelligent Systems for Engineering
Design, pp. 289–304. IOS Press, Amsterdam (2006)
O'Hagan, S., Dunn, W.B., Knowles, J.D., Broadhurst, D., Williams, R., Ashworth,
J.J., Cameron, M., Kell, D.B.: Closed-Loop, Multiobjective Optimization of
Two-Dimensional Gas Chromatography/Mass Spectrometry for Serum Metabolomics.
Analytical Chemistry 79(2), 464–476 (2006)
Orr, M.: Introduction to radial basis function networks. Centre for Cognitive Science,
University of Edinburgh (1996), http://www.cns.ed.ac.uk/people/mark.html
Ray, T., Smith, W.: Surrogate assisted evolutionary algorithm for multiobjective
optimization. In: 47th AIAA/ASME/ASCE/AHS/ASC Structures, Structural
Dynamics, and Materials Conference, pp. 1–8 (2006)
Robins, A.: Maintaining stability during new learning in neural networks. In: IEEE
International Conference on Systems, Man, and Cybernetics, 1997,
Computational Cybernetics and Simulation, vol. 4, pp. 3013–3018 (1997)
Schölkopf, B., Smola, A.: New support vector algorithms. Technical Report
NC2-TR-1998-031, NeuroCOLT2 Technical report Series (1998)
Schonlau, M.: Computer Experiments and Global Optimization. Ph.D. thesis,
Univ. of Waterloo, Ontario, Canada (1997)
Schwaighofer, A., Tresp, V.: Transductive and Inductive Methods for Approximate
Gaussian Process Regression. Advances in Neural Information Processing
Systems 15, 953–960 (2003)
Vapnik, V.N.: Statistical Learning Theory. John Wiley & Sons, Chichester (1998)
Voutchkov, I., Keane, A.J.: Multiobjective optimization using surrogates. Presented
at Adaptive Computing in Design and Manufacture (ACDM 06), Bristol, UK
(2006)
Wolpert, D.H., Macready, W.G.: No free lunch theorems for optimization. IEEE
Transactions on Evolutionary Computation 1, 67–82 (1997)
Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C.M., Grunert da Fonseca, V.:
Performance Assessment of Multiobjective Optimizers: An Analysis and Review. IEEE
11
Real-World Applications of Multiobjective Optimization

Theodor Stewart¹, Oliver Bandte², Heinrich Braun³, Nirupam Chakraborti⁴,
Matthias Ehrgott⁵, Mathias Göbelt³, Yaochu Jin⁶, Hirotaka Nakayama⁷,
Silvia Poles⁸, and Danilo Di Stefano⁸

¹ University of Cape Town, Rondebosch 7701, South Africa
  theodor.stewart@uct.ac.za
² Icosystem Corporation, Cambridge, MA 02138
  oliver@icosystem.com
³ SAP AG, Walldorf, Germany
  heinrich.braun@sap.com
⁴ Indian Institute of Technology, Kharagpur 721 302, India
  nchakrab@iitkgp.ac.in
⁵ The University of Auckland, Auckland 1142, New Zealand
  m.ehrgott@auckland.ac.nz
⁶ Honda Research Institute Europe, 63073 Offenbach, Germany
  Yaochu.Jin@honda-ri.de
⁷ Konan University, Higashinada, Kobe 658-8501, Japan
  nakayama@konan-u.ac.jp
⁸ Esteco Research Labs, 35129 Padova, Italy
  silvia.poles@esteco.com
Abstract. This chapter presents a number of illustrative case studies of a wide
range of applications of multiobjective optimization methods, in areas ranging from
engineering design to medical treatments. The methods used include both
conventional mathematical programming and evolutionary optimization, and in one case
an integration of the two approaches. Although not a comprehensive review, the
case studies provide evidence of the extent of the potential for using classical and
modern multiobjective optimization in practice, and open many opportunities for
further research.
11.1 Introduction
The intention of this chapter is to provide illustrations of real applications
of multiobjective optimization, covering both conventional mathematical
programming approaches and evolutionary multiobjective optimization. These
illustrations cover a broad range of applications, but do not attempt to
provide a comprehensive review of applications.
Reviewed by: Alexander Lotov, Russian Academy of Sciences, Russia

Tatsuya Okabe, Honda Research and Development Inc., Japan
In examining the case studies presented here, it may be seen that the
applications may be distinguished along two primary dimensions, namely:
• The number of objectives, which may be:
  - Few, i.e. 2 or 3 (impacts of which can be visualized graphically);
  - Moderate, perhaps ranging from 4 to around 20;
  - Large, up to hundreds of objectives.
• The level of interaction with decision makers, i.e. the involvement of policy
  makers, stakeholders or advisers outside of the technical team. The level
  of such interaction may be:
  - Low, such as in many engineering design problems (but for an
    exception, see Section 11.7) where the analyst is part of the engineering
    team concerned with identifying a few potentially good designs;
  - Moderate, such as in operational management or interactive design
    problems where solutions may need to be modified in the light of
    professional or technical experience from other areas of expertise;
  - Intensive, such as in public sector planning or strategic management
    problems, where acceptable alternatives are constructed by interaction
    between decision makers and other stakeholders, facilitated by the
    analyst.
Not all combinations of number of objectives and level of interaction
necessarily occur. For example, the public sector planning or strategic
management problems which require intensive interactions also tend to be
associated with larger numbers of objectives. In this chapter, we have
attempted to provide summaries of a number of real
case studies in which the authors have been involved, and which between
them illustrate all three levels for each dimension identified above. Table 11.1
summarizes the placement of each of the cases along the above two dimensions.
Table 11.1. Classification of case studies

Number of    Level of      Case study
Objectives   Interaction
Few          Low           Aerodynamic design (Section 11.2)
                           Industrial neural network design (Section 11.3)
                           Molecular structures for drugs (Section 11.4)
Few          Moderate      Medical decision making (Section 11.5)
                           Supply chain management (Section 11.6)
Moderate     Moderate      Interactive aircraft design (Section 11.7)
Moderate     Intensive     Land use planning (Section 11.8)
Large        Low           Lens and bridge designs (Section 11.9)

These studies exemplify the wide range of problems to which multiobjective
optimization methods can be and have been applied. Half of the case studies
deal with engineering design problems, which is clearly an important area
of application, but even within this category there is a wide diversity. For
example, we have two examples from aircraft design, but one (Section 11.2)
focuses on the trade-off between robustness and cost in aircraft design, while
the other (Section 11.7) deals with the need to provide broad, holistic,
interactive decision support to aircraft designers. Although both applications
relate to aircraft design, the issues raised are substantially different, so that
different sections in this chapter are devoted to each of them.
Other applications range over operational management of supply chains,
effective treatment of cancers and conflicts between environmental, social and
economic factors in regional planning.
A perhaps less usual application is that described in Section 11.4. Here
the multiobjective optimization methods are applied not directly to design,
operational or strategic decisions, but to the development of understanding of
molecular processes in synthesizing drugs.
11.2 Aerodynamic Design Optimization
11.2.1 Problem Description
Although the number of objectives is typically low (two or three) if the
geometrical constraints are not counted, aerodynamic design optimization is a
challenging engineering task for a number of reasons. Firstly, aerodynamic
optimization often needs to deal with a large number of design parameters.
Secondly, no analytical function is available for evaluating the performance of
a given design, and as a result many gradient-based optimization techniques
are inapplicable. Thirdly, to evaluate the quality of designs, either
computationally expensive computational fluid dynamics (CFD) simulations have to
be performed or costly experiments have to be conducted. Finally,
aerodynamic optimization involves multiple disciplines and more than one objective
must be considered.
In recent years, evolutionary algorithms have successfully been applied to
single and multiobjective aerodynamic optimization (Obayashi et al., 2000;
Olhofer et al., 2000; Hasenjäger et al., 2005). Despite the success that has
been achieved in evolutionary aerodynamic optimization, several issues must
be carefully addressed.
11.2.2 Methodology
Geometric Representation
Finding a proper representation scheme is the first and most important step
toward successful optimization of aerodynamic structures. A few general
criteria can be mentioned for choosing an appropriate geometric representation.
Firstly, the representation should be sufficiently flexible to describe highly
complex structures. An overly constrained representation will produce only
suboptimal results. Secondly, the representation should be efficient, which
means that the flexibility of representation can be achieved with a minimum
number of free parameters. Inefficient representations may result in an
unnecessarily large search space, which reduces the search efficiency of evolutionary
algorithms. Finally, the representation should make it possible to perform a
local search. This requirement is important for refining the performance and
for reducing the search space.
Several methods are available for the geometric representation, such as
B-Splines, Bézier curves, and T-Splines. Furthermore, constrained
deformation instead of global deformation techniques can be used. An example of
constrained deformation is free form deformation or simplified constrained
deformation.
Nevertheless, it can happen that no single representation is able to satisfy
all the above-mentioned properties. To solve this problem, adaptive represen-
tation techniques can be used. The basic idea of adaptive representation is to
start the optimization with a relatively compact representation, so that the
global search can be conducted rst. After that, the number of search pa-
rameters can be increased, e.g., by inserting new control points in a B-Spline
based representation. Encouraging results have been reported where an adap-
tive representation for evolutionary optimization of turbine blades has been
adopted (Olhofer et al., 2001).
However, adaptive representation is not as straightforward as it appears.
On the one hand, it is not trivial to establish when a new point should be
inserted, or removed. On the other hand, the insertion or removal of a search
point should be neutral to the fitness value. Moreover, an adaptation in
representation may degrade the search performance of evolutionary algorithms, for
example, for evolutionary strategies with a small population size (Jin et al.,
2005).
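What it means for an added control point to be neutral to the fitness can be illustrated on a simpler curve type than the B-Splines discussed above. The sketch below uses degree elevation of a Bézier curve as a stand-in for B-Spline knot insertion: it adds one control point while leaving the curve, and hence any fitness computed from the shape, unchanged. The function names and the example control polygon are hypothetical.

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve with the given control points at parameter t."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        pts = [tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]


def elevate_degree(points):
    """Return n + 2 control points describing the same curve as the n + 1 given."""
    n = len(points) - 1
    raised = [tuple(points[0])]
    for i in range(1, n + 1):
        f = i / (n + 1)
        # New point is a convex combination of two neighbouring old points.
        raised.append(tuple(f * a + (1.0 - f) * b
                            for a, b in zip(points[i - 1], points[i])))
    raised.append(tuple(points[n]))
    return raised


if __name__ == "__main__":
    control = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0), (4.0, 0.0)]
    richer = elevate_degree(control)
    for t in (0.0, 0.3, 0.7, 1.0):
        print(de_casteljau(control, t), de_casteljau(richer, t))
```

After such a step the search continues in a space with one more free parameter, but it starts from exactly the shape, and therefore the fitness, it had before.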
In multiobjective optimization, even more complex situations can occur.
For example, it has been found that in micro heat-exchanger optimization
more than one representation is needed to achieve the whole Pareto front (Ok-
abe et al., 2003).
Reduction of Computational Cost
Evolutionary algorithms (EAs) acquire strong search power at the cost of
search efficiency. In contrast to gradient-based search methods, EAs often
need a large number of quality evaluations to achieve acceptable solutions.
This poses serious problems in aerodynamic optimization where each
quality evaluation is costly. For example, a full three-dimensional CFD simulation
may take several hours on a high-end computer. To reduce the computational
time for evolutionary optimization of aerodynamic structures, the following
approaches are adopted. Firstly, efficient and scalable evolutionary algorithms
need to be developed. An efficient and scalable EA should be able to converge
to an acceptable non-dominated front within a small number of fitness
evaluations, and should be insensitive to the increase in search dimension. Secondly,
both the CFD simulations and the fitness evaluations should be parallelized.
A further step is to take advantage of grid computing techniques that enable
us to use available computational resources as efficiently as possible. In
addition, computational efficiency can further be improved if EAs are adapted to
the parallel or grid-based computing hardware architecture. Finally,
computationally expensive full simulations can be replaced partly by computationally
more efficient reduced simulations, or surrogates. In recent years,
metamodeling techniques have been extensively investigated to reduce the computational
cost in evolutionary optimization of expensive problems, both for single
objective (Jin et al., 2002) and multiobjective optimization (Emmerich et al.,
2006). Refer to Chapter 10 on metamodeling for a more detailed discussion
of their use in multiobjective optimization.
Metamodels can introduce errors in quality evaluation, which may lead
to convergence to a false minimum (Jin et al., 2000). Therefore, a major
concern in using surrogates in evolutionary optimization is to reduce
computational cost as much as possible without misleading the evolutionary
algorithm. To this end, the meta-model should be properly interleaved with the
original fitness function, which is known as evolution control or model
management (Jin et al., 2002). A meta-model can be employed in different operations
of EAs, such as population initialization, crossover or mutation, evolutionary
fitness evaluation, or local search combined with evolutionary search. Different
model management techniques have been suggested. In the individual-based
techniques, all individuals in the current generation are evaluated with the
metamodel. Then, the most promising solutions according to the metamodel
are re-evaluated using the original fitness function. In a generation-based
evolution control framework, the meta-model is used for fitness evaluation in
some of the generations, and the frequency at which the meta-model is used
can be adjusted according to the fidelity of the model (Jin et al., 2002). If a
metamodel is employed in local search, the trust-region framework
(Alexandrov et al., 1998) can be adopted. A comprehensive survey of techniques for
using meta-models (surrogates) in evolutionary optimization can be found
in Jin (2005).
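The individual-based scheme described above amounts to a two-stage evaluation of each generation, which can be sketched as follows. The function name, the minimization convention and the fixed number `n_exact` of exact re-evaluations per generation are assumptions made for illustration; the cited works differ in how the promising individuals are selected.

```python
def evaluate_generation(population, surrogate, true_fitness, n_exact):
    """Individual-based evolution control for a minimization problem.

    Every individual is first scored with the cheap surrogate; the n_exact
    best-ranked ones are then re-evaluated with the expensive function.
    Returns the fitness list and the indices that were evaluated exactly.
    """
    fitness = [surrogate(x) for x in population]
    ranking = sorted(range(len(population)), key=lambda i: fitness[i])
    exact = ranking[:n_exact]
    for i in exact:
        fitness[i] = true_fitness(population[i])
    return fitness, exact


if __name__ == "__main__":
    population = [0.0, 1.0, 2.0, 3.0]

    def expensive(x):          # stand-in for a CFD run
        return (x - 1.5) ** 2

    def cheap(x):              # deliberately imperfect surrogate
        return (x - 1.0) ** 2 + 0.5

    values, exact = evaluate_generation(population, cheap, expensive, n_exact=2)
    print(values, exact)
```

The exactly evaluated pairs would normally also be added to the training set of the metamodel, so that its fidelity improves in the region the search is converging to.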
Robustness Considerations
In aerodynamic optimization, uncertainties in the environment must be taken
into account. For example, the Mach number may deviate from the normal
condition during the flight. In this case, a robust optimal solution is very
much desired. By robustness, it is meant in general that the performance of
an optimal solution should be insensitive to small perturbations of the design
variables or environmental parameters. In multiobjective optimization, the
robustness of a solution can be an important factor for a user in choosing the
final solution. Robust solutions can be achieved in evolutionary optimization
by a number of means. One simple approach is to add perturbations to the
design variables or environmental parameters before the fitness is evaluated,
which is known as implicit averaging, see e.g., Tsutsui and Ghosh (1997). An
alternative to implicit averaging is explicit averaging, which means that the
fitness value of a given design is averaged over a number of designs generated
by adding random perturbations to the original design. One drawback of
the explicit averaging method is the number of additional quality evaluations
needed, which may make the approach impractical. Partly to solve this
problem, metamodeling techniques have been considered (Ong et al., 2006;
Paenke et al., 2006). A slightly different approach is to find the solution with
the maximal allowed deviation given the allowed performance deterioration
(Lim et al., 2007). One potential advantage of this method is that no
assumptions need to be made concerning the noise distribution (as needed in
the averaging-based approaches).
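A minimal sketch of explicit averaging is given below. The test function `f` (a narrow, deep minimum near 0.2 and a broad, shallower one near 0.8), the Gaussian perturbation, and the sample size are all assumptions made for illustration; they are not taken from the cited studies.

```python
import math
import random

def f(x):
    """Hypothetical quality function to be minimised: a narrow deep well
    at x = 0.2 and a broad shallower well at x = 0.8."""
    narrow = 1.0 * math.exp(-((x - 0.2) / 0.01) ** 2)
    broad = 0.8 * math.exp(-((x - 0.8) / 0.2) ** 2)
    return -(narrow + broad)

def explicit_average(func, x, n_samples, sigma, rng):
    """Average func over n_samples designs obtained by adding Gaussian
    perturbations (standard deviation sigma) to the design x."""
    total = 0.0
    for _ in range(n_samples):
        total += func(x + rng.gauss(0.0, sigma))
    return total / n_samples

rng = random.Random(1)
for x in (0.2, 0.8):
    averaged = explicit_average(f, x, n_samples=200, sigma=0.05, rng=rng)
    print(f"x = {x}: nominal {f(x):.3f}, averaged {averaged:.3f}")
```

The nominal value favours the narrow optimum, whereas the averaged value favours the broad one. Each averaged value costs 200 evaluations here, which illustrates why the approach may become impractical when a single evaluation is expensive.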
Search for robust solutions can also be treated as a multiobjective task, i.e.,
to maximize the performance and the robustness simultaneously. These two
tasks are very likely conflicting, and therefore, Pareto-based multiobjective
methods can be employed to find a number of trade-off solutions. Refer to Jin
and Branke (2005) for a more detailed discussion on evolutionary search for
robust solutions.
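To illustrate the multiobjective view, the following sketch filters the trade-off solutions from a set of candidate designs. Each design is described by two hypothetical numbers to be minimised, a performance loss and a sensitivity measure (low sensitivity meaning high robustness); the values are made up for the example.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (performance loss, sensitivity) for five hypothetical designs
designs = [(1.0, 0.9), (1.2, 0.5), (1.5, 0.2), (1.4, 0.6), (1.1, 0.95)]
front = non_dominated(designs)
print(front)  # [(1.0, 0.9), (1.2, 0.5), (1.5, 0.2)]
```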
11.2.3 An Example
We present here an example of evolutionary multiobjective optimization of
a three-dimensional (3D) turbine stator blade used in gas turbines. Two
objectives are taken into account in the optimization. The first objective is the
average pressure loss, which indicates the energy efficiency of the blade. The
second objective, as suggested in Hasenjäger et al. (2005), is the variation of
the pitch-wise static outlet pressure.
The 3D geometry of the blade is represented by two sections of closed cubic
B-splines, namely, a hub section and a tip section, each consisting of 25 control
points, as illustrated in Fig. 11.1. In the representation, the first three and
the last three control points of the closed B-splines are overlapping, resulting
in 22 control points per section. In addition, since the hub and tip sections are
supposed to lie on a cylindrical surface, the z-coordinate of the control points
is fixed and not optimized during the evolution. As a result, 88 design
parameters in total (x and y coordinates of 22 control points in each of the
two sections, i.e., 2 x 22 x 2) are to be optimized by the evolutionary algorithm.
To evaluate the performance of a given design, 3D CFD simulations have to
be performed. In our work, a 3D Navier-Stokes flow solver, HSTAR3D (Arima
et al., 1999), is employed, which usually takes from two to four hours on
an AMD Opteron 2 GHz double processor depending on the convergence
speed of the fluid dynamics. To reduce computation time, a two-level parallel
computing architecture has been adopted. At the first level, fitness evaluations
for each individual in the population are parallelized using the master-slave
model. At the second level, each CFD simulation is again parallelized on four
computers using the node-only model. Consequently, if the population size is
P, the needed number of computing nodes will be 4P + 1.
Fig. 11.1. Three-dimensional representation of a blade using B-splines.
An efficient model-based evolutionary multiobjective optimization algorithm,
the regularity modeling multiobjective estimation of distribution algorithm
(RM-MEDA) (Zhang et al., 2008), has been employed for the optimization
of the 3D turbine blade. RM-MEDA is in principle a variant of estimation
of distribution algorithms (EDA) (Larrañaga and Lozano, 2001). Instead of
using Gaussian models only, a first-order principal curve has been used to
model the regularly distributed Pareto-optimal solutions, complemented by a
Gaussian model. As demonstrated in Jin et al. (2008), by modeling the regularity
in the distribution of Pareto-optimal solutions, the scalability of the EDA can
be greatly improved. Furthermore, unlike most EDAs, which require a large
population size, RM-MEDA performs well even with a small population size.
In this example, a population size of 20 has been used.
The optimization results from two independent runs are plotted in Fig.
11.2, in each of which the population has been evolved for 100 generations.
Note, however, that the population was initialized not randomly, but with
solutions from previous optimization runs using weighted aggregation
approaches (Hasenjäger et al., 2005). Compared to the results reported in
Hasenjäger et al. (2005), we see that the non-dominated solutions obtained by
the RM-MEDA are better in terms of both coverage and accuracy.
Fig. 11.2. Solutions obtained from two independent evolutionary runs, plotted
as PS variation (vertical axis) against pressure loss (horizontal axis). The
diamonds denote the solutions from the first run, and the squares from the
second run.
11.3 Design of a Neural Network for an Industrial Blast Furnace
Intuitively one can conceive of a neural network that is simultaneously
attempting to satisfy two different requirements: it should be able to reproduce
the data in an accurate manner and, at the same time, it should engage a small
number of neural connections, primarily to avoid over-training on the data set.
These two objectives in many cases could be conflicting with each other. As
expected, as the number of nodes becomes smaller, the training error tends to
shoot up, and the converse usually remains true as well. In terms of these two
objectives, one can thus think of working out a Pareto trade-off, where each
solution in the Pareto frontier denotes a neural network of a unique
architecture with a unique set of weights. A procedure for evolving such
frontiers through a Predator-Prey type multi-objective Genetic Algorithm
(Li, 2003) has been demonstrated in a recent article (Pettersson et al., 2007a)
and was elaborately tested on the highly nonlinear data from an industrial
iron making blast furnace shown schematically in Figure 11.3.
The aim of the study reported in Pettersson et al. (2007a) was to evolve
a neural network that would optimally predict the carbon, sulfur and silicon
contents of the hot metal produced in the blast furnace as a function of a
number of process parameters. What was attempted there was simultaneously
to minimize (i) the training error of the network (E) and (ii) the required
number of active connections in the lower part of it (N). The idea was to
tinker with the architecture of the lower part of the network, and to treat
their corresponding weights as variables influencing the objective functions,
as further elaborated in Figure 11.4. The trade-off situation between E and
N is expected to be represented as a Pareto frontier.
11.3.2 Methodology
The evolutionary process: Here the crossing over was done between two
entities both of which are essentially self-sustaining neural networks. This
process is further elaborated in Figure 11.5. A self-adaptive real-coded mutation
was performed on the weights, which draws its inspiration from Differential
Evolution (Price et al., 2005).
The multi-objective algorithm used in this study utilized a Moore
neighbourhood inhabited by two distinct species: the predators and the prey.
The prey are a family of sparse neural networks, initiated randomly as a
population, and they evolved in the usual genetic way. The members of the prey
population differed from each other both by the topology of the lower part
connections and the corresponding weight values. The predators in this
algorithm are a family of externally induced entities, which do not evolve, and
the major purpose of their presence is to prune the prey population based upon
the fitness values. A two-dimensional lattice was constructed as a computational
space and both the predators and the prey were randomly introduced
there, where each of them would have its own neighbourhood. The basic idea
propagated in this algorithm inherits some of the concepts of cellular
automata in Moore's neighbourhood. However, unlike cellular automata, the
lattice here does not denote the discretized physical space; it is just a
mathematical construction that facilitates a smooth implementation of this
algorithm. Further details are available in the original work (Pettersson et al.,
2007a).
The method seems to have worked better when the initial population is
deliberately generated in the vicinity of the estimated nadir region. The progress
of the rank-one members is captured in Figure 11.6 and a computed Pareto
frontier is shown in Figure 11.7. Each discrete point in the frontier denotes a
neural net whose predictive ability differs from that of the others. Some typical
examples are shown in Figure 11.8. Although the ultimate choice between them
remains the task of the decision maker, the conservative middle ground B
shown in Figure 11.7 should be adequate for most applications.
This novel method of multi-objective analysis does not benefit only the
steel industries: basically it is robust enough to handle noisy data irrespective
of their sources. Very recently this methodology has been augmented further
through the use of Kalman filters (Saxén et al., 2007), and it has also been
effectively utilized for identifying the most important in-signal in a very large
network (Pettersson et al., 2007b), rendering it of further interest to the soft
computing researchers at large.
Fig. 11.3. Schematic diagram of an iron making blast furnace.
Fig. 11.4. Multi-objective formulation of the neural network.
Fig. 11.5. The crossover scheme. The shaded regions are participating in the
crossover process.
Fig. 11.6. Movement of rank 1 population in different generations.
Fig. 11.7. A typical Pareto frontier presented in (Pettersson et al., 2007a).

Fig. 11.8. Data prediction through three networks A (top), B (middle) and C
(bottom). The lighter lines denote actual observations for a period of 200 days and
the darker lines are the predicted values provided in (Pettersson et al., 2007a).
11.4 Molecular Docking
The docking of a highly flexible small molecule (the ligand) to the active site of
a highly flexible macromolecule (the receptor) is described in this section. See
Morris et al. (1998); MacKerell Jr. (2004) for a more detailed discussion of the
problem. The ability to predict the final docked structure of the intermolecular
complex has a great importance for the development of new drugs as docking
modifies the biological and chemical behavior of the receptor. Most of the
current docking methods account only for the ligand flexibility and consider
the receptor as a rigid body because the inclusion of receptor flexibility could
involve thousands of degrees of freedom. Current research in this field is faced
with this problem. The application described here focuses on a different aspect
of the docking procedure: the optimization methodology applied to find the
best docked structure. The application of a multi-objective approach to the
docking problem based on the Pareto optimization of different features of a
docked structure is proposed. It is shown that this approach allows for the
identification of the dominating interactions that drive the global process.
A drug performs its activity by binding itself to the receptor molecule,
usually a protein. In their bound structure, the molecules gain complementary
chemical and geometrical properties that are essential for the therapeutic
function of the drug. The computational process that searches for a ligand that
best fits the binding site of the receptor from a geometrical and chemical point
of view is known as molecular docking.
A molecule is represented by its atoms and the bonds connecting them.
Atoms are described mainly by their Van der Waals radius that roughly defines
their volume; bonds are described by their lengths (the distance between
atoms), by the angle between two consecutive bonds and by their conforma-
tional state (the dihedral angle between three consecutive bonds). Molecules
are not static systems. At room temperature they perform a variety of motions
each one having a characteristic time scale. Since stretching (changes in bond
lengths) and bending (changes in bond angles) take place on much shorter
time scales than conformational motions (changes in dihedral angles), bond
lengths and bond angles can be considered fixed. Thus, from the docking point
of view, only conformational degrees of freedom are important.
Typically, ligands have from 3 to 15 conformational degrees of freedom;
their values define the conformational state of the ligand. Receptors typically
have from 1000 to 3000 conformational degrees of freedom, so the dimension
of the complete search space for the best docked conformation becomes
computationally unaffordable even for routine cases. The most widely used
simplification is to consider only the ligand flexibility, thus reducing the
complexity of the search space.
The different possible ligand conformations are ranked according to their
fitness with the receptor. What this fitness stands for is one of the key aspects
of molecular docking and differentiates various docking methodologies. Most
of the docking fitness functions are based on the calculation of the total
energy of the docked structure. Energy-based fitness functions are built starting
from force fields, which represent a functional form of the potential energy of a
molecule. They are composed of a combination of different terms that can be
classified into bonded terms (regarding bond energies, bond angles, bond
conformations) and non-bonded terms (Van der Waals and electrostatic). This
energy can be calculated in various ways, ranging from quantum mechanics
to empirical methods. Obviously, a more exact fitness function as derived
from quantum mechanical simulations strongly increases the computational
complexity and is applicable only to small systems on massively parallel
computers; the opportunity to use rough empirical models creates the possibility
of treating more realistic cases.
In summary, a docking procedure is composed of two main elements: a
fitness function to score different conformations for the molecular complex and
a search procedure to explore the space of possible conformations. In current
docking approaches, the bonded and non-bonded terms both contribute to
the fitness function and the optimization has a single objective equal to their
weighted sum. The weights are determined by statistical analysis of
experimental data. The proposed multi-objective optimization approach
incorporates two conflicting objectives, i.e. the concurrent minimization of
internal and intermolecular energy terms, derived from a suitable scoring
function, each one corresponding to an objective for the optimization algorithm.
11.4.2 Methodology
MOGA-II
MOGA-II is an improved version of the MOGA (Multi-Objective Genetic
Algorithm) of Poloni (Poloni and Pediroda, 1997). It uses smart multi-search
elitism for robustness and directional crossover for fast convergence. The
efficiency of MOGA-II is controlled by its operators (classical crossover,
directional crossover, mutation and selection) and by the use of elitism. The
internal encoding of MOGA-II is implemented as in classical genetic
algorithms. Elitism plays a crucial role in multi-objective optimization because it
helps preserve the individuals that are closest to the Pareto front and the
ones that have the best dispersion. MOGA-II uses four different operators for
reproduction: one-point crossover, directional crossover, mutation and
selection. At each step of the reproduction process, one of the four operators is
chosen with regard to the predefined operator probabilities.
A strong characteristic of this algorithm is the directional crossover, which
is slightly different from other crossover operators and assumes that a
direction of improvement can be detected by comparing the fitness of individuals.
A novel operator called evolutionary direction crossover is introduced and it is
shown that even in the case of a complex multi-modal function this operator
outperforms classical crossover. The direction of improvement is evaluated by
comparing the fitness of the individual Ind_i from generation t with the
fitness of its parents belonging to generation t - 1. The new individual is then
created by moving in a randomly weighted direction that lies within the ones
identified by the given individual and its parents.
Multi-objective Ligand-Receptor Docking

In this example, the MOGA-II implementation in modeFRONTIER® is used
to optimize the docking towards each of the different contributions of the
docking program Autodock v. 3.05 (http://autodock.scripps.edu) scoring
function:

$$
\Delta G = C_{VDW}\,\Delta G_{VDW} + C_{hbond}\,\Delta G_{hbond} + C_{elec}\,\Delta G_{elec} + C_{tor}\,N_{tor} + C_{sol}\,\Delta G_{sol} \qquad (11.1)
$$

that tries to estimate the change in Gibbs free energy ΔG involved in passing
from the system (receptor + ligand) to the docked system (receptor-ligand).
The coefficients C are parametrized from experimental data and set to proper
values. VDW stands for Van der Waals contribution, hbond for hydrogen
bonds contribution, elec for electrostatic contributions, tor for the entropy
change if N_tor rotatable bonds are connected with heavy atoms, and sol for
the change in solvation energy.
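The weighted-sum structure of the scoring function (11.1) can be sketched as below. The coefficient and energy values are purely illustrative placeholders chosen for easy arithmetic; they are not the parametrization used by Autodock, and the function `docking_score` is a hypothetical helper, not part of any docking program.

```python
def docking_score(coeffs, terms, n_tor):
    """Weighted sum of the energy contributions in the form of (11.1).
    coeffs and terms are dictionaries keyed by contribution name."""
    return (coeffs["vdw"] * terms["vdw"]
            + coeffs["hbond"] * terms["hbond"]
            + coeffs["elec"] * terms["elec"]
            + coeffs["tor"] * n_tor
            + coeffs["sol"] * terms["sol"])

# Illustrative numbers only (not Autodock's calibrated coefficients)
coeffs = {"vdw": 0.5, "hbond": 0.25, "elec": 0.5, "tor": 0.25, "sol": 0.5}
terms = {"vdw": -8.0, "hbond": -4.0, "elec": -2.0, "sol": 2.0}

score = docking_score(coeffs, terms, n_tor=4)
print(score)  # -4.0
```

A single-objective docking run minimises one such sum, whereas the multi-objective approach keeps groups of contributions as separate objectives.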
11.4.3 Results
A bound docking experiment was performed: on the basis of the x-ray struc-
ture of the complex, the receptor coordinates were separated from those of the
ligand, and an attempt was then made to reconstruct the original x-ray
structure by docking the ligand to the receptor. Starting from the scoring
function of equation (11.1), Autodock gives the values for the internal energy of
the ligand and for the intermolecular ligand-receptor interaction energy. These
two outputs were assigned as the objectives of the optimization.
The tests were conducted on PDB code 1KV3 chain A co-crystallized with
GDP (http://www.rcsb.org/pdb). The resulting Pareto front is reported in
Figure 11.9 (in which the units for the axes are Kcal/mol).
Fig. 11.9. Pareto frontier for the molecular docking problem. Energies are in
Kcal/mol. Boxed values represent RMSD in Angstrom between the candidate solu-
tion and the original x-ray structure.
The boxed values represent the root mean squared deviation (RMSD) in Å
(angstrom) between the candidate solution and the original x-ray structure.
Typically, solutions with RMSD values below 1.5 Å are considered good.
It is possible to note that in this case the docking process is mainly driven
by the intermolecular energy. This information could be useful for a deeper
understanding of the effective relative influence of the contributions of the
scoring function to this particular docking process. From a practical point of
view, it could also be useful for the design of a tailored scoring function for the
docking of similar drug candidates. Also note the presence of a knee point
(RMSD = 1.27 Å). This is a particularly interesting solution of the docking
problem in which a small improvement in the minimization of the ligand
energy leads to a large deterioration of the intermolecular energy.
11.5 Radiotherapy Treatment Planning

Cancer is one of the most significant health problems worldwide. In
industrialised countries it is the second most common cause of death and more
than 5 in every 1,000 people are diagnosed with some type of cancer every year.
The main treatment form besides surgery and chemotherapy is radiation therapy.
It is estimated that 50% of all patients diagnosed with cancer would currently
benefit from radiotherapy.
Ionising radiation is used to damage the DNA and interfere with division
and growth of cancer cells. Radiation therapy exploits the fact that cancerous
cells are more susceptible to radiation than healthy cells. The goal of radio-
therapy treatment planning is therefore to ensure that enough radiation is
delivered to the targeted region to kill the cancerous cells while surrounding
anatomical structures are spared.
In the past it was possible for a physician manually to design a treatment
that took full advantage of the available technology. Modern procedures use a
technique called Intensity Modulated Radiotherapy (IMRT). This technique
uses a multileaf collimator to shape the beam and control, or modulate, the
intensity that is delivered along a fixed beam direction. IMRT allows patients
to receive complicated treatments and the number of options that are
available in IMRT places the optimal planning of a treatment outside the realm of
human awareness. Because of this complexity of the planning process,
treatment planning is segmented into a three-phase process that first selects beam
directions, then decides an intensity map (exposure times, fluence) for the
directions selected in phase one, and finally chooses a delivery sequence that
efficiently administers the treatment. Computer-assisted optimisation methods
are needed in each phase and since the end of the 20th century these
problems have attracted the interest of the Operations Research community.
Surveys on optimisation methods for the three phases can be found in Ehrgott
et al. (2008a); Shao (2005); Ehrgott et al. (2008b), respectively. In the follow-
ing we will concentrate on the intensity optimisation problem and assume that
beam directions are given.
In 2000, we started a collaboration with the Physics Section of the Oncol-
ogy Department at Auckland City Hospital to work on treatment planning
problems. Treatment planners spend between 30 minutes and several hours on
one single case. This is because the available planning system (like almost all
commercially available ones worldwide) requires a trial and error approach.
Apart from desired dose levels in the tumour and surrounding critical
structures, so-called importance factors for these entities need to be specified as
input. The software then employs heuristics to find a good treatment plan,
which is presented to the planner. If it is unsatisfactory the importance factors
have to be changed and the process is repeated. Treatment planners are
aware of the inefficiency of this approach. So the goal was to investigate the
possibility of a planning system that would calculate several plans right away
and provide decision support for choosing an appropriate one.
11.5.2 Methodology
Mathematical models for the intensity optimisation problem are based on the
discretisation of the body and the beams. The body is divided into volume
elements (voxels) represented by dose points. Voxels are cubic and their edge
length is defined by the slice thickness and resolution of the patient's CAT
images and is in the range of a few mm at most. Deposited dose is calculated
for one dose point in every voxel and assumed to be the same throughout
the voxel. A beam is discretised into beam elements (bixels). Their size is
defined by the number of leaves of the collimator and the number of stops for
each leaf. The number of voxels may be tens or hundreds of thousands and
the number of bixels can be up to 1,000 per beam. The relationship between
intensity and dose is linear, i.e., d = Ax where x is a vector of bixel intensities.
The entries a_ij of A represent the rate at which dose is deposited in voxel i
by bixel j. Finally, d is a dose vector that represents the discretised dose
distribution in the patient. The computation of the values a_ij is referred to
as dose calculation.
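The linear relationship d = Ax can be illustrated with a toy instance of three voxels and two bixels. The numbers are invented for the example and have no clinical meaning; a real dose deposition matrix has tens of thousands of rows.

```python
def dose(A, x):
    """Dose vector d = A x: entry i sums the contribution a_ij * x_j
    of every bixel j to voxel i."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Rows are voxels, columns are bixels (illustrative values only)
A = [[0.5, 0.25],
     [0.25, 0.5],
     [0.125, 0.125]]
x = [40.0, 80.0]   # bixel intensities

d = dose(A, x)
print(d)  # [40.0, 50.0, 15.0]
```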
While most optimisation models in the medical physics literature have a
single objective, they do try to accommodate the conflicting goals of destroying
tumour cells and sparing healthy tissue. Almost all can be interpreted as
weighted sum scalarisations of multi-objective programming models, where
the weights are the importance factors mentioned above. Almost all of these
multi-objective models are convex problems, so that their efficient sets can be
mapped to one another. We decided to use a multi-objective version of the
model of Holder (2003), which has some nice mathematical properties. Here
A is decomposed by rows into A_T, A_C, and A_N depending on whether a voxel
belongs to the tumour, critical structures, or normal tissue. Accordingly, TUB
and TLB are vectors of upper and lower bounds on the dose delivered to the
tumour voxels; CUB is a vector of upper bounds for the critical structure
voxels; and NUB a vector of upper bounds for the remaining normal tissue
voxels. The objectives of the model are to minimise the violation of any of the
lower and upper bounds and can be stated as shown in (11.2). α_UB, β_UB,
and γ_UB are parameters to restrict the deviations to clinically relevant values.

$$
\min\{(\alpha, \beta, \gamma) : \mathit{TLB} - \alpha e \leq A_T x \leq \mathit{TUB},\ A_C x \leq \mathit{CUB} + \beta e,\ A_N x \leq \mathit{NUB} + \gamma e,\ 0 \leq \alpha \leq \alpha_{UB},\ -\min_i \mathit{CUB}_i \leq \beta \leq \beta_{UB},\ 0 \leq \gamma \leq \gamma_{UB},\ 0 \leq x\}, \qquad (11.2)
$$

where e denotes a vector of ones of appropriate dimension.
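To make the three objectives concrete, the sketch below computes, for a given intensity vector x, the smallest deviations that satisfy the dose constraints. It assumes the following reading of the model: `alpha` is the largest shortfall below the tumour lower bound, `beta` the largest excess over the critical-structure upper bound (negative when all critical voxels stay below their bound), and `gamma` the largest excess over the normal-tissue upper bound. The tumour upper bound and the parameter limits on the deviations are not checked, and all data are invented.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def deviations(A_T, A_C, A_N, x, TLB, CUB, NUB):
    """Smallest (alpha, beta, gamma) compatible with x under the assumed
    reading of the model; alpha and gamma are kept non-negative."""
    d_T, d_C, d_N = matvec(A_T, x), matvec(A_C, x), matvec(A_N, x)
    alpha = max(0.0, max(lb - d for lb, d in zip(TLB, d_T)))
    beta = max(d - ub for d, ub in zip(d_C, CUB))
    gamma = max(0.0, max(d - ub for d, ub in zip(d_N, NUB)))
    return alpha, beta, gamma

# Toy data: two tumour voxels, one critical voxel, two normal voxels
A_T = [[1.0, 0.5], [0.5, 1.0]]
A_C = [[0.25, 0.25]]
A_N = [[0.5, 0.25], [0.125, 0.125]]
x = [40.0, 40.0]

result = deviations(A_T, A_C, A_N, x,
                    TLB=[62.0, 58.0], CUB=[25.0], NUB=[28.0, 12.0])
print(result)  # (2.0, -5.0, 2.0)
```

In the actual multi-objective linear programme x is a decision variable, so the three deviations are traded off against each other rather than computed for a fixed x.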

We will denote the feasible set of (11.2) by X and its image in the objective
space by Y. In (11.2) we have a multi-objective linear programme with
three objectives, a large number of variables (order of thousands), and a very
large number of constraints (order of hundreds of thousands). Under these
circumstances we never tried to solve the problem with simplex methods, as
it is known that the number of efficient basic feasible solutions can be very
large, even for moderately sized problems. Moreover, treatment planners will
never use the intensity maps to decide on a treatment, but always look at the
dose distribution. The obvious choice was to try and solve the problem in the
three-dimensional objective space.
To that end Benson's outer approximation algorithm (Benson, 1998) was
implemented. With this method 2D planning problems (i.e. on a single CAT
slice) could be solved, but the experiments indicated that 3D problems would
require prohibitive computation times. It was therefore necessary to consider
approximate solution of the problem. Discussions with physicists on whether
that would be acceptable from a radiotherapy point of view provided
valuable insights. We discovered that dose calculation is imprecise because the
mathematical models to compute the entries of A are inexact since they
cannot exactly capture the specific tissue composition in individual patients. The
medical physicists assured us that it is acceptable to work with a precision of
about 0.1 Gy (Gy, for Gray, is the physical unit for radiation dose).
This allowed us to consider ε-efficient solutions of (11.2). It was possible to
adapt Benson's algorithm in such a way that it does guarantee the
construction of an ε-nondominated set; the modified algorithm is described in Shao
and Ehrgott (2008). Solving the problems approximately reduced the
computation times dramatically. Figure 11.10 (a) and (b) show the ε-nondominated
set of a 2D problem with 986 voxels and 1140 bixels. For ε = 0.1 the problem
had 152 nondominated extreme points and was solved in 20 minutes. For
ε = 0.005 it took 9 hours to compute 1,989 nondominated extreme points.
11.5.3 Interactive Scheme
From the planner's point of view the whole set of nondominated points is not
very useful, since it is infinite. Also, for the same reason of imprecision in dose
calculation mentioned above, planners would not distinguish between plans if
they differ only by very small amounts. It is necessary to select a finite set of
nondominated points (efficient solutions). The nondominated extreme points
and associated basic solutions have only mathematical relevance, but no
clinical meaning. The selection of plans should represent the whole nondominated
set, but guarantee a certain minimal difference between the points. We
developed a method to determine a representative subset of nondominated points
in Shao and Ehrgott (2007). The method first constructs an equidistant
lattice of points placed with distance d on a simplex S (the reference plane) that
supports Y at the minimiser of e^T y over Y and such that Y ⊂ S + R^3_≥. For
each lattice point q an LP

$$
\min\{t : q + te \in Y,\ t \geq 0\}
$$

is solved. If the optimal value is t, the point q + te is tested for nondominance.
It can be shown that the distance between remaining nondominated points is
between d and 3d. Figure 11.10 (c) shows a representative set for the same
example shown in Figure 11.10 (a) and (b).
Fig. 11.10. ε-nondominated set with ε = 0.1 (a) and ε = 0.005 (b) and set of
representative nondominated points (c).
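The first step of the method, the equidistant lattice on a reference simplex, can be sketched as follows. The vertices of the simplex and the number of subdivisions `n` are assumptions chosen for illustration; in the method described above the simplex is derived from Y, and each lattice point q is then passed to the LP, which requires an LP solver and is omitted here.

```python
def simplex_lattice(vertices, n):
    """Equidistant lattice on the triangle spanned by three vertices in
    three-dimensional space, using barycentric weights (i/n, j/n, k/n)
    with i + j + k = n."""
    points = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            k = n - i - j
            weights = (i / n, j / n, k / n)
            point = tuple(
                sum(weights[m] * vertices[m][c] for m in range(3))
                for c in range(3)
            )
            points.append(point)
    return points

# Hypothetical reference simplex with vertices on the coordinate axes
vertices = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
lattice = simplex_lattice(vertices, 4)
print(len(lattice))  # 15
```

With n subdivisions the lattice has (n + 1)(n + 2)/2 points, so the number of LPs to be solved grows quadratically as the spacing is refined.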
Since the representative points are all nondominated the planners now
have a choice between several plans. By the theory of linear programming,
we know that they are all optimal solutions of some weighted sum problem
using importance factors as used in current practice. Moreover, the whole
range of such solutions is represented. To support planners in the choice of
a plan, visual aids are necessary within a decision support system. Planners
are used to judging the quality of a plan by looking at isodose curves and
dose volume histograms (DVH). The former are colour-wash pictures showing
curves of equal dose superimposed on CAT pictures. The latter are plots
of the percentage of tumour and critical structures against dose levels, see
Figure 11.11.
The representative set of solutions (treatment plans) is stored in a database
and input to the software Carina (Ehrgott and Winz, 2008), which first
proposes a balanced solution of (11.2) (with as equal as possible values of α, β, γ),
displaying the corresponding DVH and isodose plots as well as some
information on available trade-offs. The planner can then specify changes (going to a
neighbouring solution, searching for solutions with specific values, or for
solutions satisfying some thresholds). This process is continued until the planner
accepts a treatment plan.
Fig. 11.11. Isodose curves and dose-volume histogram for a brain tumour treatment
plan.
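The kind of retrieval described above can be sketched with a small in-memory list of precomputed plans. The plan records, the field names `alpha`, `beta`, `gamma` for the three deviation values of (11.2), and the interpretation of "as equal as possible" as the smallest spread between the three values are all assumptions for illustration; this is not how Carina is implemented.

```python
plans = [
    {"id": 1, "alpha": 0.0, "beta": 6.0, "gamma": 9.0},
    {"id": 2, "alpha": 3.0, "beta": 4.0, "gamma": 5.0},
    {"id": 3, "alpha": 8.0, "beta": 1.0, "gamma": 2.0},
]

KEYS = ("alpha", "beta", "gamma")

def balanced(plans):
    """Plan whose three deviation values are closest to each other
    (smallest difference between the largest and the smallest)."""
    def spread(plan):
        values = [plan[k] for k in KEYS]
        return max(values) - min(values)
    return min(plans, key=spread)

def satisfying(plans, **thresholds):
    """Plans whose deviation values do not exceed the given thresholds."""
    return [p for p in plans
            if all(p[k] <= limit for k, limit in thresholds.items())]

start = balanced(plans)
candidates = satisfying(plans, beta=4.0)
print(start["id"], [p["id"] for p in candidates])  # 2 [2, 3]
```

Because the plans are computed beforehand, both operations are simple look-ups, which is what makes the interaction fast.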
The interaction with the treatment planner is therefore ex-post, allowing
the likely time consuming plan calculation to be decoupled from plan selection.
As a consequence, plan selection becomes faster as it is based on information
retrieval from a database, a real-time operation. Moreover, the specification of
dose levels is more natural than the guessing of importance factors.
11.5.4 Remarks
It is interesting to note that the optimisation model (11.2) tries to characterise
dose distribution by a few numbers, whereas the quality is actually judged by
the whole DVH. This, of course, is not part of the model. Attempts to specify
some points on the DVH curves as constraints in optimisation models exist.
But they lead to mixed integer programming models that at this time cannot
be solved as multi-objective models.
Radiotherapists have been involved throughout the project.
This had several advantages. We obtained valuable information on
radiotherapy practice and could ensure that we developed useful tools that would
be accepted by the actual users. Fears that we intended to replace people by
software could easily be laid to rest once the radiotherapists understood that we
never thought it possible to replace their role in the treatment design, but that
we could improve the planning process. Finally, we made sure to use the tools
they are accustomed to working with every day.
The work on this project has thus far resulted in an academic software
that allows solution of the multi-objective linear programme (11.2) for 2D
and smaller sized 3D problems. The software has been developed in close con-
sultation with treatment planners at Auckland City Hospital. Further work
needs to address numerical issues arising in large 3D problems. Before actual
use in clinical practice a lengthy and costly approval process needs to be
completed, for which the support of a medical software company will be required.
11.6 Supply Chain Management

In supply chains, often thousands of individual decisions need to be made
and coordinated. Due to the high degree of complexity, successive planning
approaches are therefore often chosen in practice.
Figure 11.12 shows typical planning tasks that arise in supply chains. These
planning tasks are arranged in two dimensions. The first dimension is the supply
chain process. In this dimension, planning tasks are arranged focusing on
the most important processes following the flow of goods in supply chains.
These are procurement, production, distribution and sales. The second
dimension is the planning horizon. In this dimension the planning tasks are
distinguished by their temporal impact on the supply chain. These may be
strategic decisions with a long-term impact or operational decisions, which
have only an immediate impact in the near future (short-term).
The strategic network planning module covers the long-term decisions
across all supply chain processes. It supports the user in determining the
structure of the supply network (plant location, distribution system) as well as
the product program. Although its results are important for the long-term
profitability of supply chains, it is often not a core functionality of Advanced
Planning Systems (APS). This is because APS are primarily built to support
daily business, whereas strategic decisions are only reviewed periodically and
most often not within the regular organization, but rather on a project basis.
Fig. 11.12. Supply Chain Planning Matrix (Fleischmann et al. (2005), p. 87).
The master planning module coordinates procurement, production and
distribution on a mid-term level. Its major decision support is about sourcing:
Which product is produced at which location and when? Thus, in this module
the master production schedule is fixed. However, it is important to anticipate
the key characteristics of the lower (short-term) planning levels within this
module, because otherwise inconsistent plans (for procurement, production
and distribution) will result on the lower planning level.
In the area of distribution and transport planning, distribution related
planning tasks are addressed, the latter on a more detailed level (e.g., schedul-
ing of transports, vehicle loading and routing). Production planning and
scheduling on the other hand are the two modules that support production re-
lated issues in the short-term planning horizon. Finally, purchasing and mate-
rial requirements planning support the (short-term) procurement of materials
and components.
The capabilities offered within mySAP Supply Chain Management (SCM)
extend far beyond the scope of this article. The key functionalities we will
describe in the following are highlighted in Figure 11.13, which is based on
the generic supply chain planning matrix (Figure 11.12). They are part of the
SAP Advanced Planner and Optimizer (SAP APO), which is the advanced
planning component within the mySAP SCM solution. For more information
on SAP APO refer to Bartsch and Bickenbach (2001); Dickersbach (2005).
Fig. 11.13. Supply chain planning matrix using mySAP SCM terminology.
Solving a planning problem of this complexity in its entirety within one
planning step is neither feasible from an algorithmic perspective nor sensible
from a planning process point of view. A hierarchical decomposition of the
complete planning problem into a master planning and a production planning
and scheduling part addresses planning complexity as well as planning pro-
cess design issues. In the following, we describe how the planning problem is
partitioned into a master planning and a production planning part and which
business requirements are addressed on which planning level.
11.6.2 Methodology
Multi Location Optimization in Supply Network Planning (SNP)
In SAP APO, the master planning process is implemented in the Supply
Network Planning (SNP) module. SNP offers a multitude of functionalities,
not all of which can be described in the limited scope of this article. More
details on the SNP module can be found, among others, in Dickersbach (2005).
The SNP model contains all relevant locations, i.e. production plants and
distribution centres, in the supply network. SNP determines which of the
plants produces which quantities of which products in which time periods.
On a rough level, SNP also determines which production alternative is used
at a specific plant, for instance with regard to ingredients and general process
characteristics.
To reduce the complexity of the master planning model, not all products
are considered in the SNP optimization run. The selection is made by flagging
specific products as not relevant for SNP planning. SNP planning takes into
account all products produced in a location, all products for which a stock
transfer between locations is possible, externally procured active ingredients,
goods for resale and selected forming auxiliaries. Not relevant to SNP are
most raw materials, most forming auxiliaries as well as packaging materials.
A similar logic is used for resources. Only bottleneck resources are selected
for SNP planning.
The concentration on key products and bottleneck resources also results
in a significant simplification of the recipes¹ used in SNP, which are derived
from the more detailed recipes used in Production Planning and Detailed
Scheduling (PP/DS) and the attached enterprise resource planning (ERP)
system. Furthermore, compared to the recipes used on the PP/DS level, not
all setup and cleaning operations are considered in SNP recipes. Small setup
operations are normally neglected, while key setup activities which are relevant
for campaign planning on bottleneck resources due to their long duration or
high costs are considered. To account for the resource capacity consumed by
small setup and cleaning operations, a loss factor is applied to calculate the
resource capacity for SNP planning.
¹ In APO, a recipe is commonly referred to as a PPM (production process model)
or PDS (production data structure).
One of the main aspects of the SNP planning process is the cost-based
plan determination. The following cost types are used to build a cost model
which represents the business scenario of value based planning:
• Penalties for not meeting customer demand / forecast
• Penalties for late satisfaction of customer demand / forecast (location
  product specific)
• Penalties for not meeting safety stock / safety days supply requirements
  (location product specific, linear or piecewise linear)
• Storage cost (location product specific)
• Penalty for exceeding maximum stock level / maximum coverage (location
  product specific, linear or piecewise linear)
• External procurement cost (linear or piecewise linear, location product
  specific)
• Handling in / out cost (location product specific)
• Transportation cost (transportation lane, product and means of transport
  specific, linear or piecewise linear)
• Variable production cost (production process specific, linear or piecewise
  linear)
• Fixed production cost / setup cost (production process specific)
• Resource utilization cost (resource specific)
• Costs for additional resource utilization (e.g. use of additional shifts,
  resource specific)
• Cost for falling below minimum resource utilization.
The definition of the cost model is of crucial importance for controlling the
behaviour of the SNP optimizer. One of the central questions is whether to
maximize service level, which usually means using high penalties for non-delivery
and late delivery, or to maximize profits, which requires use of realistic sale
prices. In the case study scenario, the non-delivery cost levels reflect real sale
prices sufficiently closely to enable a profit maximization logic.
Another important feature of the case study scenario and the resulting
cost model is inventory control. High seasonality effects and long campaign
durations necessitate considerable build-up of stocks. To avoid an unbalanced
build-up of stock, soft constraints for safety stock and maximum stock levels
are used. To achieve an even better inventory levelling across products and
locations, piecewise linear cost functions for falling below safety stock as well
as for exceeding maximum stock levels are employed. In SNP optimization all
relevant constraints can be considered, including
• capacities for production, transportation, handling and storage resources,
• maximum location product specific storage quantities,
• minimum, maximum and fixed production lot sizes,
• minimum, maximum and fixed transportation lot sizes,
• minimum production campaign lot sizes.
An optimization model which considers all these constraints, especially those
which can only be modelled using binary or general integer variables, can be
highly complex.
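The levelling effect of piecewise linear stock penalties can be illustrated with
a minimal sketch. It is not the cost model of the SNP optimizer; the function
name, the segment widths and the cost rates are hypothetical and chosen only to
show that a convex piecewise linear penalty charges a concentrated shortfall
more than the same shortfall spread across two location products.

```python
def piecewise_linear_penalty(shortfall, segments):
    """Convex piecewise linear cost of a stock shortfall.

    segments: list of (width, rate) pairs, hypothetical values. The first
    `width` units of shortfall cost `rate` each, the next units fall into
    the following segment, and so on.
    """
    cost = 0.0
    remaining = max(shortfall, 0.0)  # no penalty at or above the target
    for width, rate in segments:
        used = min(remaining, width)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost


# Assumed tariff: the first 50 units below safety stock cost 1 per unit,
# any further shortfall costs 4 per unit.
SEGMENTS = [(50.0, 1.0), (float("inf"), 4.0)]

# The same total shortfall of 80 units, spread evenly or concentrated.
even = sum(piecewise_linear_penalty(s, SEGMENTS) for s in (40, 40))
concentrated = sum(piecewise_linear_penalty(s, SEGMENTS) for s in (80, 0))
print(even, concentrated)  # 80.0 170.0
```

With a purely linear penalty both allocations would cost the same, so the
optimizer would have no incentive to balance stocks.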
Production Planning and Detailed Scheduling (PP/DS)
The short term planning process is dealt with in the Production Planning and
Detailed Scheduling module within SAP APO.
PP/DS focuses on determining an optimal production sequence on key
resources. In PP/DS, a more detailed modelling than on the SNP planning
level is chosen. On the basis of the results determined in SNP optimization,
a detailed schedule which considers additional resources and products is cre-
ated. This schedule is fully executable and there is no need for manual planner
intervention, even though manual re-planning and adjustments are fully sup-
ported within the PP/DS module. An executable plan can only be ensured
by considering additional complex constraints in PP/DS optimization. These
additional constraints include:
• Time-continuous planning
• Sequence-dependent setup and cleaning operations.
As the value based planning part is handled within SNP, the PP/DS optimizer
uses a different objective function than the SNP optimizer. The following goals
can be weighted in the objective function, which is subject to minimization:
• Sum of delays and maximum delay against given due dates
• Setup time and setup cost
• Makespan (i.e. time interval between first and last activity for optimizing
  the compactness of the plan)
• Resource cost (i.e. costs associated with the selection of alternative
  resources)
The main objective of the PP/DS optimizer run in the scenario at hand is to
minimize setup times and costs on resources without incurring too much delay
against the order due dates. For some resource groups, resource costs are also
used to ensure that priority is given to the best (i.e. fastest, cheapest, etc.)
resources.
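A weighted objective of this kind can be sketched as follows for a given
schedule. This is an illustration only, not the PP/DS implementation: the
`Activity` record, the term names and the weights are hypothetical, and the
makespan is assumed to be the latest end time minus the earliest start time.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    start: float
    end: float
    due: float
    setup_time: float = 0.0
    setup_cost: float = 0.0
    resource_cost: float = 0.0


def ppds_objective(activities, weights):
    """Weighted sum of the scheduling goals (to be minimized)."""
    delays = [max(0.0, a.end - a.due) for a in activities]
    terms = {
        "total_delay": sum(delays),
        "max_delay": max(delays),
        "setup_time": sum(a.setup_time for a in activities),
        "setup_cost": sum(a.setup_cost for a in activities),
        # Assumption: makespan = latest end minus earliest start.
        "makespan": max(a.end for a in activities)
        - min(a.start for a in activities),
        "resource_cost": sum(a.resource_cost for a in activities),
    }
    # Goals without an entry in `weights` are ignored (weight 0).
    value = sum(weights.get(name, 0.0) * amount for name, amount in terms.items())
    return value, terms


schedule = [
    Activity(start=0.0, end=4.0, due=5.0, setup_time=1.0, setup_cost=10.0),
    Activity(start=4.0, end=9.0, due=8.0, setup_time=2.0, setup_cost=20.0,
             resource_cost=5.0),
]
# Hypothetical weights emphasising setups while still charging delays.
weights = {"total_delay": 1.0, "max_delay": 2.0, "setup_time": 0.5,
           "setup_cost": 0.25, "resource_cost": 1.0}
value, terms = ppds_objective(schedule, weights)
print(value)  # 17.0
```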
11.6.3 Remarks
We have seen that both in Supply Network Planning and in Detailed Scheduling
there is a huge number of objectives to be minimized. However, these
objectives can be mastered by forming a 4-level hierarchy.
On the root or top level, two dimensions of the second level can be
differentiated: service degree and real costs. The objective of real costs
differentiates at the third level between, for example:
• storage costs: the minimization of inventory by weighting the inventory of
  each storage location by an estimated cost factor
• safety costs: the minimization of the risk of running out of stock by
  weighting the risk of each storage location by a cost factor
• setup costs: the minimization of the changeover overhead for each resource
  by weighting each changeover by a cost factor.
The objective of service degree differentiates at the third level between, for
example:
• delay penalties: the minimization of delay for each demand or customer
  order by weighting the priority of the customer
• non-delivery penalties: the minimization of non-delivery for each demand
  or customer order by weighting the priority of the customer.
Only for the top level are weighting factors not appropriate. The planner wants
to see several alternative solutions of the Pareto front of these two objectives:
By how much would costs increase if we wish to achieve a better service
level? A high service level is clearly an important objective, but there is no
direct cost measure for a delay. Summarizing, in Advanced Planning for Supply
Chain Management we can focus on an optimization problem with just two
objectives: maximizing service degree while minimizing the costs. In particular,
a two-dimensional visualization of the Pareto front and representative solutions
of this front are needed and sufficient.
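The two-objective view can be made concrete with a small sketch that filters a
set of candidate plans down to its Pareto front and reports the cost of each
step up in service level. The plan data are invented for illustration, and the
function names are hypothetical.

```python
def pareto_front(plans):
    """Non-dominated plans; each plan is (cost, service_percent).

    Cost is minimized, service level is maximized.
    """
    front = []
    for i, (cost_i, serv_i) in enumerate(plans):
        dominated = any(
            cost_j <= cost_i and serv_j >= serv_i
            and (cost_j < cost_i or serv_j > serv_i)
            for j, (cost_j, serv_j) in enumerate(plans) if j != i
        )
        if not dominated:
            front.append((cost_i, serv_i))
    return sorted(front)


# Invented candidate plans: (total real cost, service level in percent).
plans = [(100, 90), (120, 95), (130, 94), (150, 99), (110, 90)]
front = pareto_front(plans)
print(front)  # [(100, 90), (120, 95), (150, 99)]

# Extra cost and extra service points between neighbouring front plans.
steps = [(c2 - c1, s2 - s1) for (c1, s1), (c2, s2) in zip(front, front[1:])]
print(steps)  # [(20, 5), (30, 4)]
```

The `steps` list answers the planner's question directly: here the first five
service points cost 20 units, the next four cost 30.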
11.7 Interactive Processes for Aircraft Design

Interactive Evolutionary Computation (IEC) has started to capture the
fascination of researchers from fields as diverse as art, architecture, data mining,
geophysics, medicine, psychology, robotics, and sociology. Takagi (2001)
outlined many of these applications in his overview paper. However, to this day
only very few researchers have applied IEC to the problem of engineering and
design of complicated artifacts. While the main reason for the slow pace of
adoption in engineering is mostly open to speculation, it is partially a result
of the field's reluctance to accept new methods, like Genetic Algorithms, as
well as the field's already heavy reliance on automated optimization processes
that leave decidedly little room for subjectivity. While the reliance of
engineers on analysis tools requires interactive evolutionary techniques to utilize
them in the fitness generation, it is also true that many design decisions in
practice are made through gut feel and intuition rather than analysis.
Recognizing that fact, this example identifies an IEC approach to design that
allows for automatic fitness calculation through analysis as well as selection
and fitness assignment by the human designers directly.
There are few things humans build that are more complicated than aircraft.
Not only are the reliability requirements enormous, given the fatal
consequences of failure, but the system itself straddles a multitude of areas in
physics, such as aerodynamics, thermodynamics, mechanics, and materials. This
convolution of disciplines has led historically to a very sequential design process,
tackling the various disciplinary issues separately: aerodynamicists only tried
to maximize the performance of the wing (or even just an airfoil), propulsion
engineers tried to build the largest engines, structural engineers tried to build
the sturdiest airframe, while material scientists attempted to utilize only the
lightest and sturdiest materials. As a consequence, the design process itself was
a highly inefficient iterative process of ever changing airplane configurations,
only reconciled by rare, experienced individuals who were proficient in all (or
at least many) disciplines. As these people retired, and significant
computational power became available, a new design process emerged, attempting to
satisfy the concerns of all disciplines concurrently: Multidisciplinary (Design)
Optimization. MDO is inherently a multicriteria optimization problem, since
each discipline contributes at least one objective function that potentially
conflicts with the objective(s) of the other disciplines. The following example
demonstrates the ability of one MCDM technique, Interactive Evolutionary
Design, to address the difficult task of balancing the different disciplinary
objectives when determining the preliminary design configuration of a
Supersonic Business Jet.
Figure 11.14 outlines the interactive evolutionary design process employed
for this application example (see Bandte and Malinchik, 2004, for background
discussion). After the problem is set up by defining design variables,
objectives and constraints, and sufficient feasibility has been established, a GA is
interrupted after several generations to display the current population
via spider graphs and Pareto frontier displays for objective values as well as
visualizations of the aircraft configurations. Based on this information the
designer can make some choices regarding objective preferences and features
of interest, redirecting search and influencing selection respectively. To limit
the scope of this example, only the redirection of the search through objective
preferences is implemented here. However, as exemplified later, designer
selection of features of interest is an important part of interactive evolutionary
design and should not be neglected in general. The following sections lay out
in detail all tasks performed over several iterations for this example.
Fig. 11.14. Integrated interactive evolutionary design process.
As in any design problem, the first step is to define the independent
parameters, objectives and constraints, as well as evaluation functions that describe
the objectives' dependencies on the independent variables. For this interactive
evolutionary design environment, this step also identifies the genotype
representation of a design alternative, the fitness evaluation function, influenced by
the objectives, and how to handle design alternatives that violate constraints.
The supersonic business jet is described by five groups of design variables,
displayed in a screen shot presented in Figure 11.15. The first group, general,
consists of variables for the vehicle, some of which could be designated design
requirements. The other four groups contain geometric parameters for the
wing, fuselage, empennage, and engine. The engine group also entails propul-
sion performance parameters relevant to the design. All in all, the chromo-
some for this supersonic business jet contains 35 variables that can be varied
to identify the best solution.
A mix of economic, size, and vehicle performance parameters were chosen
as objectives in this example, with a special emphasis on noise generation,
since it is anticipated to be a primary concern for a supersonic aircraft. Hence,
for the initial loop the boom loudness and, as a counter weight, the acquisition
cost are given slightly higher importance of 20%, while all other objectives are
set at 10%. Furthermore, certain noise levels could be prohibitively large and
prevent the design from getting regulatory approval. Hence, some of the noise
objectives have to have constraint values imposed on them. In addition to
these constraints, the design has to fulfill certain FAA requirements regarding
take-off and landing distances as well as approach speed. Furthermore, the
amount of available fuel has to be more than what is needed for the design
mission. Finally, fitness is calculated via a weighted sum of normalized
objective values, penalized by a 20% increase in value whenever at least one
constraint is violated. Note that the best solution is identified as the one
with the lowest fitness, i.e. the lowest weighted objective function values. All
constraints, objectives, normalization values and preferences are also displayed
in Figure 11.15.
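The fitness rule just described can be sketched in a few lines. This is our
reading of the text, not the code used in the study: normalization is assumed
to be a division by a reference value per objective, all objectives are assumed
to be minimized, and the numbers are invented.

```python
def fitness(objectives, norms, weights, violations):
    """Weighted sum of normalized objectives; lower is better.

    violations: one boolean per constraint, True if that constraint is violated.
    """
    value = sum(w * (f / n) for f, n, w in zip(objectives, norms, weights))
    if any(violations):
        value *= 1.20  # 20% penalty if at least one constraint is violated
    return value


# Two invented objectives with equal preference weights.
objectives = [2.0, 30.0]
norms = [4.0, 60.0]
weights = [0.5, 0.5]
feasible = fitness(objectives, norms, weights, [False, False])
infeasible = fitness(objectives, norms, weights, [False, True])
print(feasible, infeasible)
```

Because the penalty is multiplicative and independent of the size of the
violation, a slightly infeasible design with very good objective values can
still outrank a feasible but mediocre one.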
11.7.2 Methodology and Interactive Scheme
Run GA
Since the initial objective preferences were already specified at problem
definition, the GA can be executed next without requiring further input from the
designer. The GA chosen for this example is one of the most general found
in the literature (Holland, 1975; Mitchell, 1996; Haupt and Haupt, 1998). It
has a population size of 20 and makes use of a real valued 35-gene
representation, limited to the ranges selected at problem set-up. New generations are
created from a population with a two-individual elite pool and proportionate
probabilistic selection for crossover. The crossover algorithm utilizes a
strategy with one splice point that is selected at random within the chromosome.
Since the design variables are grouped in logical categories, this crossover
algorithm enables, for example, a complete swap of the engine or fuselage-engine
assembly between parents. Parent solutions are replaced with offspring.
Each member of the new population has a 15% probability for mutation at
ten random genes, sampling a new value from a uniform distribution over the
entire range of design variable values. The GA in this example is used for
demonstration purposes only and therefore employs just a small population.
A population size of 50 to 100 seems more appropriate for a more elaborate
version of the presented interactive evolutionary design approach.
Fig. 11.15. Screen shot: Display of information after 80 generations.
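A minimal sketch of a GA with these settings is given below. It is not the code
used for the study. Three points are assumptions of ours: selection weights are
taken as the gap to the worst fitness (since fitness is minimized), the two
elites are copied unchanged, and a toy fitness function stands in for the
aircraft analysis.

```python
import random

POP_SIZE, N_GENES, N_ELITE = 20, 35, 2
P_MUTATE, N_MUTATED_GENES = 0.15, 10


def one_point_crossover(a, b, rng):
    """Swap the tails of two parents at one random splice point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(ind, bounds, rng):
    """Resample ten random genes uniformly over their full ranges."""
    child = list(ind)
    for g in rng.sample(range(len(child)), N_MUTATED_GENES):
        child[g] = rng.uniform(*bounds[g])
    return child


def next_generation(pop, fitness_fn, bounds, rng):
    scored = sorted(((fitness_fn(ind), ind) for ind in pop), key=lambda t: t[0])
    fits = [f for f, _ in scored]
    inds = [ind for _, ind in scored]
    # Assumption: fitness is minimized, so the weight grows as fitness falls.
    sel_weights = [fits[-1] - f + 1e-9 for f in fits]
    new_pop = [list(ind) for ind in inds[:N_ELITE]]  # elites survive unchanged
    while len(new_pop) < len(pop):
        p1, p2 = rng.choices(inds, weights=sel_weights, k=2)
        for child in one_point_crossover(p1, p2, rng):
            if rng.random() < P_MUTATE:
                child = mutate(child, bounds, rng)
            if len(new_pop) < len(pop):
                new_pop.append(child)
    return new_pop


def toy_fitness(ind):
    """Stand-in for the aircraft analysis: distance from the range midpoints."""
    return sum((x - 0.5) ** 2 for x in ind)


rng = random.Random(0)
bounds = [(0.0, 1.0)] * N_GENES
population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(POP_SIZE)]
best_start = min(map(toy_fitness, population))
for _ in range(80):  # one interactive iteration of 80 generations
    population = next_generation(population, toy_fitness, bounds, rng)
best_end = min(map(toy_fitness, population))
print(best_start, best_end)
```

Because the genes are stored group by group, a splice point that falls on a
group boundary exchanges a whole subsystem (for example the engine) between the
two parents.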
Display Information
Once the GA has executed 80 generations, it is interrupted for the first time to
display the population of design alternatives found to this point. The designers
are presented with information that is intended to provide maximum insight
into the search process and the solutions it is yielding. In order to allow for a
reasonable display size of the aircraft configuration, only the four best design
alternatives, based on fitness and highest diversity in geometrical features, are
presented in detail on the top of the left hand side of the display. A screen
shot of the displayed information is presented in Figure 11.15, highlighting the
individual with the best/lowest fitness, which is also enlarged to provide the
designers with a more detailed view of the selected configuration. The design
variable values for the highlighted alternative and their respective ranges are
presented below this larger image, completing the chromosome information
on the left hand side of the pane.
On the right hand side of the pane, the designers can find the objective
and constraint information pertaining to the population and the highlighted
individual. On the top, a simple table outlines the specific objective values for
the highlighted alternative, as well as the objective preferences and
normalization factors used to generate the fitness values for the current population.
Below the table, a spider graph compares the four presented alternatives on
the basis of their normalized objective values, while to the right four graphs
display the objective values for the entire population, including its Pareto
frontier (highlighted individual in black). Below the spider chart, a table lists
the constraint parameter values for the highlighted alternative as well as the
respective constraint values. A green font represents constraint parameter
values near, orange font right around, and red font way beyond the constraint
value. Finally, at the bottom, three graphs display the population with respect
to its members' constraint parameter values as well as the infeasible region,
superimposed. These graphs in particular indicate the level of feasibility in
the current population.
Provide Input
This step represents the central interaction point of the human with the IEC
environment. Here the designers process the information displayed and communicate
preferences for objectives, features of interest in particular designs, whether
specific design variable values should be held constant in future iterations,
what parameter setting the GA should run with in the next iteration (e.g.
a condition that identifies the end of the GA iteration), or whether specific
design alternatives should serve as parents for the next generation.
Analyzing the data provided, it is noticeable that all objectives except
for the boom loudness are being satised well. Consequently, in an attempt
to achieve satisfactory levels for the boom loudness in the next iteration, its
preference is increased to 30%, reducing the acquisition cost's importance to
10%. This feedback is provided to the GA via a pop-up screen (not displayed
here) that allows the designer to enter the new preference values for the next
iteration. With this new preference information the GA is executed for another
80 generations.
Second Iteration, 160 Generations
After the GA has executed an additional 80 generations, it is apparent from
the objective values that the last set of preferences did not emphasize the boom
loudness and sideline noise enough, since boom loudness did not improve (from
87.11 to 87.12 dB) and the sideline noise got worse (from 90.89 to 93.65 dB).
Consequently, for the next iteration the importance of both is increased to
35% while all other objectives are reduced to 5%.
Third Iteration, 240 Generations
Examining the population after another 80 generations yields that the last
set of preferences still did not emphasize the boom loudness enough, since
boom loudness improved only marginally (from 87.12 to 86.96 dB). On the
other hand, sideline noise did improve significantly (from 93.65 to 90.94 dB),
so that for the next iteration all emphasis can be given to boom loudness. To
keep the score even for all other objectives, they are kept at 5% each, with
boom loudness at 65%.
Fourth Iteration, 400 Generations
For this iteration 160 more generations were executed to produce the popula-
tion displayed in the screen shot presented in Figure 11.16. In part due to the
longer GA run, a very good solution, #7172, is found after 400 generations
with largely improved values for almost all the objectives. This result is some-
what surprising, considering most objectives had only a 5% level of preference
and the one objective with 65%, boom loudness, improved only marginally.
This result can be attributed to the effect of summing correlated objective
function values: similar design alternatives (with similar design variable
settings) exhibit similarly good (or bad) values for all objectives except boom
loudness. The fact that boom loudness is a conflicting objective, specifically
with sideline noise, can also be observed from the pronounced Pareto frontier in
the second objective chart from the top.
However, the presented solution after 400 generations seems to satisfy
the objectives better than the published solution in Buonanno et al. (2002),
generated by MATLAB's fmincon function (The MathWorks Inc. (2008)).
So it could be concluded that none of these objective values are dramatically
out of sync or range and the presented individual is the final solution.
11.8 Land Use Planning

The work described here was motivated by problems of land use allocation or
re-allocation in the Netherlands. Land which is already intensely developed
often has to be redeveloped to meet current needs for agriculture, residence,
recreation and industry, while at the same time recognizing conservation goals
(including possible restoration of some land to approximately pristine
conditions). The initial model development was based on a specific region near
Amsterdam (the Jisperveld), as briefly described in Section 5 of Stewart et al.
(2004). However, the longer term intention is to build the model into a general
land use planning decision support system (LUDSS). The function of such an
LUDSS would be to generate a small number of plans which can be evaluated
holistically by decision makers or planners. They would then indicate which
features of the plan they like or dislike, which would lead to a readjustment of
goals in the LUDSS and the generation of a new solution. This process may be
repeated until planners are satisfied that no substantial further improvements
are likely.
Fig. 11.16. Screen shot: Fourth iteration after 400 generations.
The model represents the region under consideration by a rectangular grid
of (say) $R \times C$ equal-sized cells. It is then assumed that one and only one
land use (from a set of $\Lambda$ possible uses) is allocated to each cell. Formally,
we define binary variables $x_{rc\ell}$, such that $x_{rc\ell} = 1$ if land use $\ell$ is allocated
to cell $(r, c)$ and $x_{rc\ell} = 0$ otherwise. For ease of notation we shall denote
the three dimensional array of all $x_{rc\ell}$ values by $x$. Typical constraints on $x$
would include exclusions of certain land uses from certain cells (corresponding
$x_{rc\ell}$ set to zero) and upper and lower bounds on the total area allocated to a
particular land use (i.e. on $\sum_{r=1}^{R} \sum_{c=1}^{C} x_{rc\ell}$).
Some objectives relate to directly quantifiable costs and benefits, and tend
to be additive in nature. For example, if all such objectives (without loss of
generality) are expressed as costs then:
\[ f_i(x) = \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{\ell=1}^{\Lambda} \gamma^{i}_{rc\ell} \, x_{rc\ell} \]
where $\gamma^{i}_{rc\ell}$ is the cost in terms of objective $i$ associated with allocating land
use $\ell$ to cell $(r, c)$.
As initially described by Aerts et al. (2005), however, a critical
management objective is to ensure that land uses are sufficiently compact to allow
integrated planning and management. Aerts et al. (2005) introduce essentially
one measure of compactness, related to the numbers of cells adjacent to cells
of the same land use. This concept was extended in our work by means of a
more detailed evaluation of the fundamental underlying management
objectives. Defining a cluster of cells as a connected set of cells allocated to a single
land use, three measures of performance for each land use type $\ell$ were identified
as follows:
• Number of clusters for each land use, $C_\ell$: These measure the degree of
  fragmentation of land uses, and minimization of the number of clusters
  would seek to ensure that areas of the same land use are connected as far
  as possible.
• Relative magnitude of the largest cluster for each land use: Maximization
  of the ratio $L_\ell = n^{L}_{\ell} / N_\ell$ is sought, where $n^{L}_{\ell}$ and $N_\ell$ are respectively
  the number of cells in the largest cluster and the total number of cells
  allocated to land use $\ell$. If multiple clusters are formed, then it would
  often be better to have at least one large consolidated cluster, than for all
  clusters to be relatively small.
• Compactness of land uses, denoted by $R_\ell$, defined by a weighted average
  across all clusters for land use $\ell$ of the ratio of the perimeter to the square
  root of the area of the cluster. This measure should be minimized, as a
  compact area for one land use (e.g. a square or circular region) may be
  easier to manage than a long thread-like cluster.
The above measures define an additional $3\Lambda$ objectives, as the compactness
goals need to be achieved for each land use individually. Furthermore, the
calculation of $C_\ell$, $L_\ell$ and $R_\ell$ requires the execution of a clustering algorithm, so
that these additional objectives are non-linear and computationally expensive.
The total number of objectives is thus $k = k_0 + 3\Lambda$, where $k_0$ is the number
of additive objectives.
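To make the three cluster measures concrete, the following sketch computes the
number of clusters, the relative size of the largest cluster and the
compactness ratio for every land use on a small grid. Several details are
assumptions on our part, since the text does not fix them: cells are connected
through shared edges (4-neighbourhood), the perimeter is counted in cell edges
(including edges on the grid boundary), and the weighted average uses cluster
areas as weights.

```python
from collections import deque
from math import sqrt


def clusters(grid):
    """Return (land_use, area, perimeter) for each 4-connected cluster."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            use = grid[r0][c0]
            area = perimeter = 0
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first flood fill of one cluster
                r, c = queue.popleft()
                area += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and grid[rr][cc] == use:
                        if not seen[rr][cc]:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
                    else:
                        perimeter += 1  # edge to another use or the boundary
            found.append((use, area, perimeter))
    return found


def compactness_measures(grid):
    """Per land use: C (clusters), L (largest share), R (perimeter ratio)."""
    by_use = {}
    for use, area, perim in clusters(grid):
        by_use.setdefault(use, []).append((area, perim))
    result = {}
    for use, cl in by_use.items():
        total = sum(a for a, _ in cl)
        result[use] = {
            "C": len(cl),
            "L": max(a for a, _ in cl) / total,
            # Assumption: area-weighted average of perimeter / sqrt(area).
            "R": sum(a * p / sqrt(a) for a, p in cl) / total,
        }
    return result


grid = [
    [1, 1, 2, 1],
    [1, 1, 2, 2],
]
measures = compactness_measures(grid)
print(measures[1])  # {'C': 2, 'L': 0.8, 'R': 4.0}
print(measures[2]["C"], measures[2]["L"])  # 1 1.0
```

Land use 1 is split into a 2 x 2 block and an isolated cell, so it has two
clusters and a largest-cluster share of 0.8. Both clusters are squares, for
which the perimeter to root-area ratio is 4, the smallest value attainable on a
square grid under this edge-counting convention.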
11.8.2 Methodology
In view of the large number of objectives, it is not practical to seek to represent
the efficient frontier in full. The approach adopted was thus based on sampling
the efficient frontier by optimizing an aggregate measure of performance
subject to the constraints on $x$, for each of a number of different aggregations.
The aggregation chosen was that of the scalarizing function introduced by
Wierzbicki (1999) in the context of his reference point methodology, except
that we chose a smoother function than that based on the maximum operator.
Thus for any given reference point (which can be viewed as a set of goals or
aspiration levels for each objective), say $g_1, g_2, \ldots, g_k$, the scalarizing
function, which is to be minimized, is defined by:
\[ \sum_{i=1}^{k} \left[ \frac{f_i(x) - I_i}{g_i - I_i} \right]^{4} \qquad (11.3) \]
where $I_i$ is the ideal (best achievable measure of performance) for objective $i$.
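The behaviour of (11.3) is easy to check numerically. The sketch below
evaluates the function for invented objective values, assuming that all
objectives are minimized and that every goal differs from its ideal.

```python
def scalarize(f, goals, ideals, power=4):
    """Value of (11.3): sum over i of ((f_i - I_i) / (g_i - I_i)) ** power."""
    return sum(
        ((fi - Ii) / (gi - Ii)) ** power
        for fi, gi, Ii in zip(f, goals, ideals)
    )


ideals = [0.0, 0.0]
goals = [2.0, 4.0]
at_goals = scalarize([2.0, 4.0], goals, ideals)    # both goals met exactly
one_better = scalarize([1.0, 4.0], goals, ideals)  # first objective at half its goal
one_missed = scalarize([3.0, 2.0], goals, ideals)  # first goal missed by 50%
print(at_goals, one_better, one_missed)  # 2.0 1.0625 5.125
```

The fourth power makes the sum behave like a smooth stand-in for the maximum
operator: missing one goal by 50% costs far more than is gained by beating the
other goal by 50%.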

Constrained minimization of (11.3) with respect to x is a non-linear com-
binatorial optimization problem, with the added complexity that most of the
functions cannot be evaluated explicitly in closed form (but are derived as
outputs from a clustering algorithm). Stewart et al. (2004) describe a special
purpose genetic algorithm (GA), designed to exploit a number of special char-
acteristics of the land use planning problem models, both in the generation
of population elements and in the execution of cross-overs (see cited reference
for details).
It is interesting to emphasize at this point that the chosen methodology in-
cludes elements from conventional multiobjective optimization and from evo-
lutionary optimization, thus representing an integration of the two themes of
the present volume.
11.8.3 Interactive Scheme and Results
In implementing the algorithm within an LUDSS, the process starts by selecting
one or more tentative reference points, perhaps a central reference point
(all goals positioned midway between ideals and worst performance levels)
and a selection of reference points which favour each individual goal in turn.
Each individual solution generated will be efficient (to the level of optimization
accuracy achieved by the GA), and so represents a point on the efficient
frontier. In response to assessments by the user as to the direction in which
it is desired to move the solution, the reference points are adjusted and the
optimizations repeated. In this way, the user is able incrementally to explore
the efficient frontier until such time as a satisfactory solution, or short list of
possible solutions, is found.
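The choice of tentative reference points described above can be sketched as follows. This is only an illustration: the function name and the `favour` fraction (how far a favoured goal is pulled from the worst level towards the ideal) are hypothetical and are not specified in the text.

```python
def tentative_reference_points(ideals, worsts, favour=0.25):
    """Return a central reference point plus one point favouring each goal.

    The central point places every goal midway between ideal and worst.
    Each further point moves one goal to ideal + favour * (worst - ideal),
    leaving the other goals at their central values (an assumed rule).
    """
    central = [(i + w) / 2.0 for i, w in zip(ideals, worsts)]
    points = [central]
    for j in range(len(ideals)):
        point = list(central)
        point[j] = ideals[j] + favour * (worsts[j] - ideals[j])
        points.append(point)
    return points


# Hypothetical ideals and worst levels for two objectives (minimization).
points = tentative_reference_points([0.0, 10.0], [4.0, 20.0])
```

Each of these points would then be passed to the scalarized optimization, and the user would adjust them in later iterations.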
A detailed case study in the use of this system is given in Janssen et al.
(2007). An illustration of the manner in which the interactions may progress
is given in Figure 11.17, which presents three land use maps. The numbers
in the maps indicate nine potential land use types, namely: 1. Intensive agri-
culture; 2. Extensive agriculture; 3. Residence; 4. Industry; 5. Day recreation;
6. Overnight recreation; 7. Wet natural area; 8. Water (recreational); and 9.
Water (limited access).
Fig. 11.17. Three land use maps generated from the LUDSS (panels: Initial Map,
First Solution, Final Solution).
The upper left hand map displays the original land use pattern. The upper
right hand map was generated in the first optimization step, and provides a
very compact allocation of land uses. However, the costs were deemed to be
too high, largely because of the extent of agricultural land reclaimed from wet
areas. For this reason, the priority on the cost attribute was increased. Some
more fragmentation of the land uses was then re-introduced, much of which
was acceptable except for that of the agricultural land. Also, the values
associated with conservation goals were found to be unsatisfactory. Adjustment
of these priorities led after the 8th iteration to the lower map in Figure 11.17,
and this was found to represent a satisfactory compromise.
11.9 Engineering Design Problems with Large Numbers of Objectives
11.9.1 Problem Description: Cable-Stayed Bridges
Cable-stayed bridges are gaining much popularity due to their beautiful shape.
During and after construction, this kind of bridge needs to have the cable
length adjusted in order to attain errors of cable tension and camber (the
configuration of the girders of the bridge) within some allowable range.
To this end, the following criteria are considered (Nakayama et al. (1995)):
• residual error in each cable tension,
• residual error in camber at each node,
• amount of shim adjustment for each cable,
• number of cables to be adjusted.
Since the change of cable rigidity is small enough to be neglected with respect
to cable length adjustment, both the residual error in each cable tension and
that in each camber are assumed to be linear functions of the amount of shim
adjustment.
Let us define $n$ as the number of cables in use, $T_i$ ($i = 1, \ldots, n$) as the
difference between the designed tension values and the measured ones, and
$x_{ik}$ as the tension change of the $i$-th cable caused by a unit change in the
length of the $k$-th cable. The residual error in cable tension caused by the shim
adjustment is given by

\[
p_i = \left| T_i - \sum_{k=1}^{n} x_{ik} \, \Delta l_k \right| \quad (i = 1, \ldots, n).
\]

Let $m$ be the number of nodes, $z_j$ ($j = 1, \ldots, m$) the difference between the
designed camber values and the measured ones, and $y_{jk}$ the camber change at
the $j$-th node caused by a unit change in the length of the $k$-th cable. Then the
residual error in the camber caused by the shim adjustments
$\Delta l_1, \ldots, \Delta l_n$ is given by

\[
q_j = \left| z_j - \sum_{k=1}^{n} y_{jk} \, \Delta l_k \right| \quad (j = 1, \ldots, m).
\]

In addition, the amounts of shim adjustment can be treated as the objective
functions

\[
r_i = \left| \Delta l_i \right| \quad (i = 1, \ldots, n).
\]

The upper and lower bounds of shim adjustment inherent in the structure of
the cable anchorage are as follows:

\[
\Delta l_{L_i} \le \Delta l_i \le \Delta l_{U_i} \quad (i = 1, \ldots, n). \tag{11.4}
\]
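The three families of objectives defined above are simple to evaluate once the influence coefficients are known. The sketch below is illustrative only: the function names, the list-of-lists layout of the coefficient matrices and all numbers are assumptions, not data from Nakayama et al. (1995).

```python
def bridge_objectives(T, Z, X, Y, dl):
    """Return (p, q, r) for a vector dl of shim adjustments.

    T[i]    : tension difference (designed minus measured) of cable i
    Z[j]    : camber difference (designed minus measured) at node j
    X[i][k] : tension change of cable i per unit length change of cable k
    Y[j][k] : camber change at node j per unit length change of cable k
    """
    n = len(dl)
    p = [abs(T[i] - sum(X[i][k] * dl[k] for k in range(n))) for i in range(len(T))]
    q = [abs(Z[j] - sum(Y[j][k] * dl[k] for k in range(n))) for j in range(len(Z))]
    r = [abs(v) for v in dl]
    return p, q, r


def within_bounds(dl, lower, upper):
    """Check the bound constraint (11.4) for every cable."""
    return all(lo <= v <= hi for v, lo, hi in zip(dl, lower, upper))


# Hypothetical bridge with two cables and one camber node.
T = [3.0, -1.0]
Z = [0.5]
X = [[1.0, 0.5], [0.0, 2.0]]
Y = [[0.25, 0.25]]
dl = [2.0, -1.0]
p, q, r = bridge_objectives(T, Z, X, Y, dl)
feasible = within_bounds(dl, [-3.0, -3.0], [3.0, 3.0])
```

With n cables and m nodes this yields 2n + m objective values per candidate adjustment, which is why the number of objectives grows so quickly for large bridges.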
11.9.2 Methodology
Now we have a multi-objective optimization problem in which $p_1, \ldots, p_n$,
$q_1, \ldots, q_m$ and $r_1, \ldots, r_n$ are to be minimized under the constraint (11.4).
Some large scale bridges have around 100 cables at each side, so that the
problem easily involves a very large number of objective functions. For this
multi-objective optimization problem, engineers in bridge construction have
tried to apply goal programming (Charnes and Cooper, 1961), in which they
seek a desirable solution by adjusting weights imposed on criteria.
However, it has been pointed out in the literature (e.g. see Nakayama, 1995)
that this task is very difficult even for simple problems. In addition, the shim
adjustment must usually be done during a relatively short period (say, 2:00
am to 8:00 am) with a stable temperature, because the cable length is greatly
affected by changes of temperature. Therefore, the decision on cable length
adjustment must be made very quickly. For this reason too, the traditional
goal programming approach is not satisfactory for practical use in this
problem.
On the other hand, an interactive multi-objective programming technique
has been developed, called the satisficing trade-off method (Nakayama and
Sawaragi, 1984). The method is one of the aspiration level approaches to
multi-objective optimization, which are observed to be effective in many practical
problems because they are very simple and easy to implement, do not require
any mathematical consistency of the decision maker's judgment, and in addition
take aspiration levels of decision makers as a probe rather than weights
imposed on criteria.
Figure 11.18 shows the graphical user interface (GUI) for the erection
management system of a cable-stayed bridge using the satisficing trade-off method.
The residual error of each criterion and the amount of shim adjustment are
represented by bar graphs. The aspiration level is entered with the mouse on
the graph. After solving the auxiliary min-max problem, the Pareto solution
according to the aspiration level is represented by another bar graph in a
similar fashion. If the designer is not satisfied with the Pareto solution displayed,
he/she can revise the aspiration level by means of mouse operations, and the
process is repeated.
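A minimal sketch of the auxiliary min-max step is given below, assuming the usual form in which each objective's deviation from its ideal value is weighted by the reciprocal of the distance between aspiration level and ideal. For illustration the problem is solved by enumeration over a small, hypothetical set of candidate objective vectors, whereas the real system optimizes over the continuous shim adjustments.

```python
def minmax_value(f, aspiration, ideal):
    """Weighted Tchebyshev-type value max_i (f_i - ideal_i) / (aspiration_i - ideal_i)."""
    return max((f_i - i_i) / (a_i - i_i)
               for f_i, a_i, i_i in zip(f, aspiration, ideal))


def best_candidate(candidates, aspiration, ideal):
    """Pick the candidate objective vector minimizing the min-max value."""
    return min(candidates, key=lambda f: minmax_value(f, aspiration, ideal))


# Hypothetical objective vectors (both objectives minimized).
candidates = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
ideal = (0.0, 0.0)

first = best_candidate(candidates, aspiration=(3.0, 3.0), ideal=ideal)
# The designer relaxes the second aspiration level and tightens the first.
revised = best_candidate(candidates, aspiration=(1.0, 4.0), ideal=ideal)
```

Revising the aspiration levels changes the weights, so the selected solution moves along the Pareto set in the direction the designer indicated, without the designer ever specifying the weights directly.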
This procedure is continued until the designer obtains a desirable shim
adjustment. The interactive operation using the GUI is very easy for the
designer, and the visual information on trade-offs among criteria is user-friendly.
The software has been used for real bridge construction, for example the
Tokiwa Swan Bridge (Ube City) and the Karasuo Harp Bridge (Kita-Kyusyu
City) in 1992.
One of the important aspects of such a problem with a large number of
objective functions is the graphical user interface. As can be easily seen in
Fig. 11.18, it is not too difficult for designers to make a trade-off analysis on
the basis of the displayed visual information, even in cases with hundreds
of objective functions. However, it might be difficult or even impossible to
grasp the total trade-off context on the basis of numerical information only,
for problems with such a large number of objective functions.
Fig. 11.18. Erection Management System of Cable-stayed Bridge.
11.9.3 A Further Application: Lens Design
Another good example with a large number of objective functions is lens
design. There are many kinds of lenses, for example for copiers, cameras,
medical instruments and so on. Above all, lenses in semiconductor chip
production are very expensive (of the order of a million dollars), and hence
have to be designed very carefully.
In lens design, there are around 200 design variables such as
• kinds of glass,
• number of lenses,
• diameter,
• curvature,
• distance between lenses,
and around 400 criteria such as
• cost
• weight
• criteria for images:
  – aberration (chromatic, spherical, astigmatism, coma, distortion,
    curvature of field)
  – color balance
  – resolution
  – MTF
  – CCI
In lens design, there is the further difficulty of nonlinear optimization in
addition to the large number of objective functions: scalarized optimization
problems are usually highly nonlinear and highly multi-modal. Moreover, the
functional forms are not given explicitly in terms of design variables; the
function values are evaluated on the basis of some kind of simulation (ray
tracing). Therefore, it is difficult to obtain a global minimum for the objective
function.
So far, engineers have used specific software in lens design. Their main attention
has been directed to how to obtain a global optimum for the scalarized
objective function, with the linearly weighted sum applied as the scalarization
function. How interactive multi-objective optimization techniques can work in
lens design would surely be a good subject for investigation.
11.10 Concluding Comments

In what sense are the above applications different from other optimization
studies? Clearly, the distinction lies in the multiplicity of objectives which are
central to the applications discussed here. In common with more general
approaches to multiple criteria decision making (MCDM), those applying the
multiobjective methods start by careful problem structuring to identify the
underlying objectives, and to represent these in a meaningful manner, much as
has been described in Chapter 3 of Belton and Stewart (2002).
Some of the approaches reported in the case studies do ultimately seek to
identify an overall mathematical objective function as a surrogate measure of
performance to be maximized or minimized, but:
• This is done only after careful attention to tradeoffs between objectives
  and clear recognition that these tradeoffs may change as one explores the
  decision space;
• The methods are applied interactively, with systematic changes in formulation
  (revising goals or value tradeoffs) in the light of preference information
  (as described in other Chapters of this book concerning interactive methods),
  and thus providing a means of implicit exploration of the Pareto
  frontier.
Other reported approaches avoid the use of surrogate objective functions, by
seeking to identify the Pareto frontier explicitly, leaving the user or ultimate
client to explore the options visually before making the final selection.
Unfortunately, it is difficult to provide an unambiguous visualization of the frontier
for more than two or three objectives, although Chapter 9 in this
book seeks to extend such opportunities.
The clear challenge to future research lies precisely in the interface
between these implicit and explicit methods of searching the Pareto frontier.
An opportunity may lie in using the interactive methods with surrogate
measures of performance for an initial exploration, but using the explicit search
methods (linked to appropriate visualization) to refine the exploration across
those objectives which are found most critical to the final decisions in the
most promising regions of the decision space.
References
Aerts, J.C.J.H., van Herwijnen, M., Janssen, R., Stewart, T.J.: Evaluating spatial
design techniques for solving land-use allocation problems. Journal of
Environmental Planning and Management 48(1), 121–142 (2005)
Alexandrov, N.M., Dennis, J.E., Lewis, R.M., Torczon, V.: A trust region framework
for managing use of approximation models in optimization. Journal on Structural
Optimization 15(1), 16–23 (1998)
Arima, T., Sonoda, T., Shirotori, M., Tamura, A., Kikuchi, K.: A numerical
investigation of transonic axial compressor rotor flow using a low-Reynolds-number
k-ε turbulence model. ASME Journal of Turbomachinery 121(1), 44–58 (1999)
Bandte, O., Malinchik, S.: A broad and narrow approach to interactive evolutionary
design – an aircraft design example. In: Deb, K., et al. (eds.) GECCO 2004. LNCS,
Bartsch, H., Bickenbach, P.: Supply Chain Management mit SAP APO. Galileo
Press, Bonn (2001)
Belton, V., Stewart, T.J.: Multiple Criteria Decision Analysis: An Integrated Ap-
proach. Kluwer Academic Publishers, Boston (2002)
Benson, H.: An outer approximation algorithm for generating all efficient extreme
points in the outcome set of a multiple objective linear programming problem.
Journal of Global Optimization 13(1), 1–24 (1998)
Buonanno, M., Lim, C., Mavris, D.N.: Impact of configuration and requirements on
the sonic boom of a quiet supersonic jet. Presented at World Aviation Congress,
Phoenix, AZ (2002)
Charnes, A., Cooper, W.: Management Models and Industrial Applications of Linear
Programming, vol. 1. John Wiley, New York (1961)
Dickersbach, J.T.: Supply Chain Management with APO, 2nd edn. Springer, Berlin
(2005)
Ehrgott, M., Winz, I.: Interactive decision support in radiation therapy treatment
planning. OR Spectrum 30, 311–329 (2008)
Ehrgott, M., Holder, A., Reese, J.: Beam selection in radiotherapy design. In: Linear
Algebra and Its Applications, vol. 428, pp. 1272–1312 (2008a)
Ehrgott, M., Hamacher, H.W., Nußbaum, M.: Decomposition of matrices and static
multileaf collimators: A survey. In: Alves, C.J.S., Pardalos, P.M., Vicente, L.N.
(eds.) Optimization in Medicine. Springer Series in Optimization and Its
Applications, vol. 12, pp. 25–46. Springer Science & Business Media, New York (2008b)
Emmerich, M., Giannakoglou, K., Naujoks, B.: Single and multi-objective
evolutionary optimization assisted by Gaussian random field meta-models. IEEE
Transactions on Evolutionary Computation 10(4), 421–439 (2006)
Fleischmann, B., Meyr, H., Wagner, M.: Advanced planning. In: Stadtler, H., Kilger,
C. (eds.) Supply Chain Management and Advanced Planning. Concepts, Models,
Software and Case Studies, 3rd edn., pp. 81–106. Springer, Berlin (2005)
Hasenjäger, M., Sendhoff, B., Sonoda, T., Arima, T.: Three dimensional
evolutionary aerodynamic design optimization using single and multi-objective approaches.
In: Schilling, R., Haase, W., Periaux, J., Baier, H., Bugeda, G. (eds.) Evolutionary
and Deterministic Methods for Design, Optimization and Control with
Applications to Industrial and Societal Problems EUROGEN 2005, Munich, FLM (2005)
Haupt, R.L., Haupt, S.E.: Practical Genetic Algorithms. John Wiley & Sons, New
York (1998)
Holder, A.: Designing radiotherapy plans with elastic constraints and interior point
methods. Health Care Management Science 6, 5–16 (2003)
Holland, J.: Adaptation in Natural and Artificial Systems. The University of
Michigan Press, Ann Arbor (1975)
Janssen, R., van Herwijnen, M., Stewart, T.J., Aerts, J.C.J.H.: Multiobjective deci-
sion support for land use planning. Environment and Planning B, Planning and
Design. To appear (2007)
Jin, Y.: A comprehensive survey of fitness approximation in evolutionary
computation. Soft Computing 9(1), 3–12 (2005)
Jin, Y., Branke, J.: Evolutionary optimization in uncertain environments – A survey.
Jin, Y., Olhofer, M., Sendhoff, B.: On evolutionary optimization with approximate
fitness functions. In: Genetic and Evolutionary Computation Conference,
pp. 786–792. Morgan Kaufmann, San Francisco (2000)
Jin, Y., Olhofer, M., Sendhoff, B.: A framework for evolutionary optimization with
approximate fitness functions. IEEE Transactions on Evolutionary
Computation 6(5), 481–494 (2002)
Jin, Y., Olhofer, M., Sendhoff, B.: On evolutionary optimization of large problems
using small populations. In: Wang, L., Chen, K., Ong, Y.S. (eds.) ICNC 2005.
LNCS, vol. 3611, pp. 1145–1154. Springer, Heidelberg (2005)
Jin, Y., Zhou, A., Zhang, Q., Tsang, E.: Modeling regularity to improve scalability
of model-based multi-objective optimization algorithms. In: Multiobjective
Problem Solving from Nature. Natural Computing Series, pp. 331–356. Springer,
Heidelberg (2008)
Larranaga, P., Lozano, J.A. (eds.): Estimation of Distribution Algorithms: A New
Tool for Evolutionary Computation. Kluwer Academic Publishers, Dordrecht
(2001)
Li, X.-D.: A real-coded predator-prey genetic algorithm for multiobjective optimiza-
tion. In: Fonseca, C.M., Fleming, P.J., Zitzler, E., Deb, K., Thiele, L. (eds.) EMO
Lim, D., Ong, Y.-S., Jin, Y., Sendhoff, B., Lee, B.S.: Inverse multi-objective robust
evolutionary optimization. Genetic Programming and Evolvable Machines 7(4),
383–404 (2007)
MacKerell Jr., A.D.: Empirical force fields for biological macromolecules: Overview
and issues. Journal of Computational Chemistry 25(13), 1584–1604 (2004)
Mitchell, M.: An Introduction to Genetic Algorithms. The MIT Press, Cambridge
(1996)
Morris, G.M., Goodsell, D.S., Halliday, R.S., Huey, R., Hart, W.E., Belew, R.K.,
Olson, A.J.: Automated docking using a Lamarckian genetic algorithm and an
empirical binding free energy function. Journal of Computational Chemistry 19(14),
1639–1662 (1998)
Nakayama, H.: Aspiration level approach to interactive multi-objective programming
and its applications. In: Pardalos, P.M., Siskos, Y., Zopounidis, C. (eds.) Advances
in Multicriteria Analysis, pp. 147–174. Kluwer Academic Publishers, Dordrecht
(1995)
Nakayama, H., Sawaragi, Y.: Satisficing trade-off method for interactive
multiobjective programming methods. In: Grauer, M., Wierzbicki, A.P. (eds.) Interactive
Decision Analysis – Proceedings of an International Workshop on Interactive
Decision Analysis and Interpretative Computer Intelligence, pp. 113–122. Springer,
Heidelberg (1984)
Nakayama, H., Kaneshige, S., Takemoto, S., Watada, Y.: An application of a
multi-objective programming technique to construction accuracy control of cable-stayed
bridges. European Journal of Operational Research 87, 731–738 (1995)
Obayashi, S., Sasaki, D., Takaguchi, Y., Hirose, N.: Multi-objective evolutionary
computation for supersonic wing-shape optimization. IEEE Transactions on
Evolutionary Computation 4(2), 182–187 (2000)
Okabe, T., Foli, K., Olhofer, M., Jin, Y., Sendhoff, B.: Comparative Studies on Micro
Heat Exchanger Optimisation. In: Proceedings of IEEE Congress on Evolutionary
Computation (CEC-2003), pp. 647–654. IEEE Computer Society Press, Los
Alamitos (2003)
Olhofer, M., Arima, T., Sonoda, T., Sendhoff, B.: Optimization of a stator blade
used in a transonic compressor cascade with evolution strategies. In: Parmee,
I. (ed.) Adaptive Computing in Design and Manufacture, pp. 45–54. Springer,
Heidelberg (2000)
Olhofer, M., Jin, Y., Sendhoff, B.: Adaptive encoding for aerodynamic shape
optimization using evolution strategies. In: Congress on Evolutionary Computation
(CEC), Seoul, Korea, May 2001, vol. 2, pp. 576–583. IEEE Computer Society
Press, Los Alamitos (2001)
Ong, Y.-S., Nair, P.B., Lim, K.Y.: Max-min surrogate-assisted evolutionary
algorithms for robust design. IEEE Transactions on Evolutionary Computation 10(4),
392–404 (2006)
Paenke, I., Branke, J., Jin, Y.: Efficient search for robust solutions by means of
evolutionary algorithms and fitness approximation. IEEE Transactions on
Evolutionary Computation 10, 405–420 (2006)
Pettersson, F., Chakraborti, N., Saxén, H.: A genetic algorithms based
multiobjective neural net applied to noisy blast furnace data. Applied Soft Computing 7,
387–397 (2007a)
Pettersson, F., Chakraborti, N., Singh, S.B.: Neural networks analysis of steel plate
processing augmented by multi-objective genetic algorithms. Steel Research
International 78, 890–898 (2007b)
Poloni, C., Pediroda, V.: GA coupled with computationally expensive simulations:
tools to improve efficiency. In: Genetic Algorithms and Evolution Strategies in
Engineering and Computer Science, pp. 267–288. John Wiley and Sons, Chichester
(1997)
Price, K., Storn, R.N., Lampinen, J.A. (eds.): Differential Evolution: A Practical
Approach to Global Optimizations. Springer, Berlin (2005)
Saxén, H., Pettersson, F., Gunturu, K.: Evolving nonlinear time-series models of
the hot metal silicon content in the blast furnace. Materials and Manufacturing
Processes 22, 577–584 (2007)
Shao, L.: A survey of beam intensity optimization in IMRT. In: Halliburton, T. (ed.)
Proceedings of the 40th Annual Conference of the Operational Research Society
of New Zealand, Wellington, 2-3 December 2005, pp. 255–264 (2005), Available
online at http://secure.orsnz.org.nz/conf40/content/paper/Shao.pdf
Shao, L., Ehrgott, M.: Finding representative nondominated points in
multiobjective linear programming. In: IEEE Symposium on Computational Intelligence in
Multi-Criteria Decision Making, pp. 245–252. IEEE Computer Society Press, Los
Alamitos (2007)
Shao, L., Ehrgott, M.: Approximately solving multiobjective linear programmes in
objective space and an application in radiotherapy treatment planning.
Mathematical Methods of Operations Research (2008)
Stewart, T.J., Janssen, R., van Herwijnen, M.: A genetic algorithm approach to
multiobjective land use planning. Computers and Operations Research 32,
2293–2313 (2004)
Takagi, H.: Interactive evolutionary computation: Fusion of the capacities of EC
optimization and human evaluation. Proceedings of the IEEE 89, 1275–1296 (2001)
The MathWorks Inc. (2008)
Tsutsui, S., Ghosh, A.: Genetic algorithms with a robust solution searching scheme.
Theory, and Applications, Kluwer Academic Publishers, Boston (1999)
Zhang, Q., Zhou, A., Jin, Y.: RM-MEDA: A regularity model-based multi-objective
estimation of distribution algorithm. IEEE Transactions on Evolutionary
Computation 12(1), 41–63 (2008)
12 Multiobjective Optimization Software

Silvia Poles (1), Mariana Vassileva (2), and Daisuke Sasaki (3)

(1) ESTECO - Research Labs, Via Giambellino, 7 35129 Padova, ITALY
    silvia.poles@esteco.com
(2) Institute of Information Technologies, Bulgarian Academy of Sciences,
    BULGARIA mvassileva@iinf.bas.bg
(3) CFD Laboratory, Department of Engineering, University of Cambridge,
    Trumpington Street, Cambridge CB2 1PZ, UK ds432@eng.cam.ac.uk
Abstract. This chapter describes multiobjective optimization software and gives
a general overview of a selected few of the available tools developed in the last
decade. It can be considered an update of earlier papers and chapters on
nonlinear multiobjective optimization software, such as those written by
Weistroffer et al. (2005) and Miettinen (1999), the latter listing the software
packages existing up to the year 1999. More precisely, this chapter focuses on the
tools and features that multiobjective optimization software should ideally contain.
12.1 Introduction
The main topic of this chapter is the available multiobjective optimization
software, with particular attention to software developed for
nonlinear problems. Several questions may be raised when discussing
multiobjective optimization software, but among the most recurring questions we
may list the following:
• What do experts think about multiobjective optimization tools and what
  are the most important features good software should always possess?
• What is the current state of the art of multiobjective optimization software?
• What are the advantages and gaps of all these optimization tools?
The description of an ideal software is very close to that of a complex integrated
environment such as a Process Integration and Design Optimization (PIDO)
environment or a Problem Solving Environment (PSE) (Gallopoulos et al., 1991;
Houstis et al., 1997). PIDO and PSE are integrated computing environments which
provide users with all the necessary tools for solving multiobjective optimization
problems and for supporting decision making.

Reviewed by: Oliver Bandte, Icosystem Cooperation, USA
Jyrki Wallenius, Helsinki School of Economics, Finland
An ideal tool should have an easy-to-use graphical user interface, a good
set of optimization methods, and a good tool for visualizing the results and
choosing the final solutions. Moreover, meta-modeling and validation of models are
fundamental when dealing with time-consuming function evaluations. Last but
not least, robustness and reliability of solutions are of primary importance for
selecting the best design.
There are many attributes and characteristics that can be used to measure
software quality as seen by end-users. Leaving out all the problems related to
reliability, absence of bugs, extensibility and maintainability of each tool, we
here refer to requirements that a decision maker may have for multiobjective
optimization software.
In the following sections a list of desirable program specifications is
explained. Next, a list of software is described and their conformance to
requirements and specifications is analyzed.
12.2 Software Features and Quality

12.2.1 Graphical User Interface
One of the most visible characteristics of any software is a flexible,
complete and easy-to-use graphical user interface (GUI). Even with
multiobjective optimization tools, the GUI plays an important role. In this case,
the GUI should give the users of the software, whether analysts or decision
makers (e.g. engineers and managers), the ability to define and modify
a problem, and to define inputs, outputs, objectives and constraints. Moreover, the
GUI should give the decision makers the ability to choose optimization
strategies, manage software and hardware resources, describe how the
processes are synchronized, and visualize and analyze results. In addition, the GUI
should be suitable for introducing the decision makers' preferences in order to
solve multiobjective decision making problems with intelligent guidance.
For example, a multiobjective optimization problem can be described using
graph-based formalisms as shown in Fig. 12.1. The figure describes a standard
mechanical design problem, the design of a welded beam structure with the
aim to minimize cost and displacement subject to constraints on shear.
12.2.2 Optimization Methods
Problems involving one or more conflicting objective functions
originate in several disciplines; their solution has been a challenge for a long
time. Typically, using a single optimization technology is not sufficient to deal
with real-life problems.
Fig. 12.1. An example of how a workflow can describe input, output, constraints
and objectives of a multiobjective optimization problem.
In order to help engineers and decision makers, old and new multiobjective
optimization techniques are studied in industry, project and portfolio
management, and the military and governmental fields. The importance of managing more
than one objective at once, as opposed to just optimizing one outcome, is well
recognized, for example, in portfolio management. In fact, constructing a
balanced bond portfolio must deal with uncertainty in the future price of bonds
and several other aspects. Despite what is reported in (Kaliszewski, 2004),
multiobjective optimization has recently started to gain attention within the
engineering and scientific communities, since many real world optimization
problems in numerous disciplines and application areas contain more than
one outcome that should be optimized.
Each optimization technique is characterized by its search strategy, which
determines the robustness and/or the accuracy of the method. An indication of the
robustness of an optimization method is the ability to reach the absolute
extremes of the functions even when starting far away from the final solutions.
In contrast, accuracy measures the capability of the optimization
algorithm to get as close as possible to function extremes. There are hundreds or
thousands of optimization methods in the literature: each numerical method
can solve a specific or more generic problem. Different algorithms are intended
to solve different types of multiobjective optimization problems such as linear,
nonlinear, continuous, discrete, mixed, and so on. Different strategies can be
selected for different problems. Unfortunately, real world applications often
include one or more difficulties which make many of these methods inapplicable.
Many engineering problems involve highly non-linear objective functions,
which may not even have an analytic expression in terms of the variables. A
general overview of basic and recent approaches to multiobjective optimization
has been given in Chapters 1–7.
Therefore, a multipurpose software that can be used in several fields and
contexts should include the most widely used and state-of-the-art methods,
using both MCDM based and metaheuristic approaches to multiobjective
optimization. Obviously, some specific problems can be solved with software
that contains only a few mathematical programming based methods.
Unfortunately, decision makers or analysts do not necessarily know the mathematical
formulation of the problem at hand, and the problem can change from time to
time. These are the main reasons why a truly multipurpose software
represents a viable solution.
12.2.3 Visualization, Post-processing and Statistical Charts
Visualization is the key in understanding the results coming out from large
simulations in computational science and engineering. After a multiobjective
optimization, we typically wish to visualize the entire set of results, rather than
simply analyzing each single result. Understanding the results of a multiobjec-
tive process can be quite hard, particularly in higher dimensional spaces. Even
though there are plenty of generic visualization tools (such as 2D and 3D scat-
ter plots as explained in Chapter 8), an ad hoc visualization tool for Pareto
optimal solutions is needed. Visualizing the objective space and the Pareto
points is quite easy with 2 or 3 objectives. For a higher number of objectives,
some more complex techniques should be implemented. For example, a com-
mon way of visualizing multivariate problems is using a parallel coordinates
chart (Inselberg and Dimsdale, 1990). Some more complex techniques can be
really useful with high dimensional spaces. Two important multi-dimensional
visualizing tools are self organizing maps (SOMs) (Kohonen, 1982) that can
really speed up the optimization phase as reported in (Obayashi and Sasaki,
2004) and heatmaps as described by Pryke et al. (2007).
These visualizing tools should also be considered as tools for data
management and preliminary exploration. In multiobjective optimization, an initial
explorative phase, called a learning phase in Chapter 2, is important in
order to determine the behavior and the main characteristics of the problem
at hand. The principal aim of a preliminary exploration is to get the most
relevant qualitative information from a problem while making the smallest
possible number of evaluations. This can be done by using a smart positioning of
points in the space. This methodology provides a strong tool to design and
analyze functions; it eliminates redundant observations and reduces the time
and resources needed to make evaluations and experiments (Fig. 12.2).
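One common way of positioning a small number of points so that they cover the design space evenly is Latin hypercube sampling. The text does not say which sampling scheme is meant, so the following is a sketch under that assumption; the function name, the seed and the bounds are illustrative.

```python
import random


def latin_hypercube(n_points, bounds, seed=0):
    """Draw n_points samples so that each variable's range is split into
    n_points equal strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_points))
        rng.shuffle(strata)  # independent random pairing per variable
        width = (high - low) / n_points
        columns.append([low + (s + rng.random()) * width for s in strata])
    # Transpose the per-variable columns into a list of points.
    return [tuple(col[i] for col in columns) for i in range(n_points)]


# Four exploratory designs for two hypothetical design variables.
bounds = [(0.0, 1.0), (0.0, 8.0)]
designs = latin_hypercube(4, bounds)
```

Compared with purely random sampling, every one-dimensional projection of the sample is evenly spread, which helps when each evaluation is expensive and only a few can be afforded.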
Moreover, visualization and statistical charts have traditionally been used
as post-processing operations to visualize results. However, visualization can
also be used to show the quality of the solutions. This kind of visualization can
give important feedback during runtime, and a good chart can help in
deciding whether the optimization is going in the right direction or not. Based
on visual feedback, the decision maker can stop and re-run the optimization
using different parameters.
Fig. 12.2. Data management and preliminary exploration methods. A smart
positioning of points in a 3-dimensional space (left) and a reliable meta-model (right)
More detailed discussion of visualizing multiobjective optimization results
is given in Chapters 8 and 9.
12.2.4 Decision Support Tool
In the absence of preference information, all Pareto optimal solutions can be
regarded as equally desirable in the mathematical sense. Ranking a long list of
Pareto optimal or nondominated alternatives is a difficult task, especially when
several solutions are available or when several conflicting goals are involved.
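Before any ranking takes place, a tool typically has to extract the nondominated alternatives from the evaluated designs. A minimal sketch of such a filter follows, assuming all objectives are to be minimized; the function names and the sample data are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def nondominated(points):
    """Keep the points that no other point dominates (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors for five designs.
evaluated = [(1, 5), (2, 2), (3, 3), (5, 1), (2, 6)]
front = nondominated(evaluated)
```

None of the remaining points can be preferred to another on mathematical grounds alone, which is exactly why preference information from the decision maker is needed at this stage.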
In several cases, more than one decision maker can be involved in selecting
the best solution. In these cases, each person may even reflect different
competencies and roles. Therefore, making coherent choices, with rational and
transitive preferences, can be a very difficult task.
A decision support tool can assist the decision maker(s) in finding the best
solution from among a set of reasonable alternatives. Moreover, a decision
support tool can even allow the correct grouping of objectives into a single
utility function by identifying possible relations between the objectives. A
decision support tool can even guide the DM(s) in specifying preferences, which
leads to constructing a scalarized function that is coherent with the
given preferences (see Chapters 1 and 2).
12.2.5 Meta-modeling and Validation of Models
In real-life applications, it is not always possible to reduce the complexity of
the problem and obtain a function that can be evaluated quickly. As reported
in Chapter 11, in many practical engineering design and other scientific
optimization problems, every single function evaluation can take hours or even
days. In these cases, the time to run a single step of an algorithm makes run-
ning more than a few evaluations prohibitive and some other smart approaches
are needed. In these situations, decision makers can turn to a preliminary ex-
ploration technique to perform a reduced number of calculations. After that,
it is possible to use these well-distributed results to create a surface which
interpolates these points. This surface represents a meta-model of the original
problem and can be used to perform the optimization without costly compu-
tations. The use of mathematical and statistical tools to approximate, analyze
and simulate complex real-world systems is widely applied in many scientific
domains. These kinds of interpolation and regression methodologies are now
becoming common even for solving complex optimization problems, where they
are also known as response surface methods (RSMs). For example, RSMs are
becoming very popular in computer aided engineering, where they offer a
surrogate model with a second generation of improvements in speed and accuracy.
This approach makes possible a direct optimization that would otherwise be infeasible.
Constructing a useful meta-model starting from a reduced number of real
evaluations is not a trivial task. Mathematical and physical soundness,
computational costs and prediction errors are not the only points to take into
account when developing meta-models. Ergonomics of the software has to be
considered in a wide sense. The users would like to grasp the general trends
in the phenomena, especially when the behavior is nonlinear. Moreover,
decision makers and engineers would like to re-use the experience accumulated,
in order to spread the possible advantages to different projects. When using
meta-models, the users should always keep in mind that this instrument allows
a faster analysis than the complex models, but interpolation and extrapolation
introduce a new element of error that must be managed carefully.
Fig. 12.3. A tool for meta-models: 3D-exploration
For these reasons, in recent years, different approximation strategies have
been developed to provide inexpensive meta-models of the simulation models
to substitute computationally expensive modules. As reported in Chapter 11,
there is no single meta-model that is valid for every kind of situation. For
this reason, a good multiobjective optimization software should contain several
different interpolation techniques such as, for example, neural networks, radial
basis functions, kriging and Gaussian processes (see Chapter 11).
Once the meta-model has been constructed, it is really important to certify
its fidelity. This is the reason why a tool for exploring (Fig. 12.3) and
measuring the quality (Fig. 12.4) of meta-models in terms of statistical reliability
would be appreciated together with all approximation strategies.
Fig. 12.4. Tools for measuring the quality of meta-models. Distance chart that
points out the differences between real values and values calculated using the
meta-model (left) and residual chart (right). The residuals are the amounts which the
meta-model has not been able to explain (approximation errors). These charts help
to determine whether a meta-model is an acceptable representation of the original
problem.
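The workflow described in this section (a few expensive evaluations, an interpolating surface, and a residual check) can be sketched as follows. This is only an illustration under stated assumptions: the test function stands in for a costly simulation, and a simple inverse-distance-weighted interpolant is used in place of the neural network, radial basis function or kriging models mentioned above. All names are hypothetical.

```python
import math


def expensive_model(x):
    """Stand-in for a costly simulation (assumed test function)."""
    return math.sin(3.0 * x) + 0.5 * x


def idw_metamodel(samples, power=2.0):
    """Build an inverse-distance-weighted interpolant from (x, y) samples."""
    def predict(x):
        num = 0.0
        den = 0.0
        for xs, ys in samples:
            d = abs(x - xs)
            if d == 0.0:
                return ys  # reproduce the sampled value exactly
            w = 1.0 / d ** power
            num += w * ys
            den += w
        return num / den
    return predict


# A reduced number of "real" evaluations on a well-distributed grid.
train_x = [i / 10.0 for i in range(11)]
samples = [(x, expensive_model(x)) for x in train_x]
metamodel = idw_metamodel(samples)

# Validation points not used for training: residual = real - predicted.
valid_x = [0.05 + i / 10.0 for i in range(10)]
residuals = [expensive_model(x) - metamodel(x) for x in valid_x]
max_abs_residual = max(abs(r) for r in residuals)
rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

Plotting `residuals` against `valid_x` would give a residual chart in the spirit of Fig. 12.4; large or systematically signed residuals suggest that the meta-model is not yet an acceptable representation of the original problem.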
12.2.6 Robustness and Reliability
When dealing with uncertainty, conventional optimization techniques tend to
"over-optimize", producing solutions that may perform well at the optimal
point but have poor characteristics against the dispersion of design variables
or environmental variables. As reported in Chapter 16, it is quite possible that
the optimal solution will not be the most stable solution. For example, the
function in Fig. 12.5 has a global optimum at point A and a local optimum at
point B. However, any small variation in the input parameters will cause the
performance to drop off markedly around A. The performance of B may not
be as good in absolute terms, but it is much more robust, since small changes
in the input values do not cause drastic performance degradation.
For this reason, a tool that allows the user to perform a robust design
analysis to check the system's sensitivity to manufacturing tolerances or
small changes in operating conditions can be really useful.
Fig. 12.5. Robustness of solutions
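The situation of Fig. 12.5 can be reproduced with a small numerical sketch. The performance function below is an assumption chosen to have a narrow global peak A and a broad local peak B (maximization is assumed), and robustness is measured, again as an assumption, by the mean performance over evenly spaced perturbations within a tolerance.

```python
import math


def performance(x):
    """Assumed test function (maximized): narrow peak A at x = 1, broad peak B at x = 3."""
    peak_a = 1.0 * math.exp(-((x - 1.0) / 0.05) ** 2)
    peak_b = 0.8 * math.exp(-((x - 3.0) / 0.5) ** 2)
    return peak_a + peak_b


def robust_performance(x, tolerance=0.1, steps=21):
    """Mean performance over evenly spaced perturbations in [x - tolerance, x + tolerance]."""
    offsets = [-tolerance + 2.0 * tolerance * i / (steps - 1) for i in range(steps)]
    return sum(performance(x + d) for d in offsets) / steps


candidates = {"A": 1.0, "B": 3.0}
nominal = {name: performance(x) for name, x in candidates.items()}
robust = {name: robust_performance(x) for name, x in candidates.items()}

best_nominal = max(nominal, key=nominal.get)  # best at the nominal point
best_robust = max(robust, key=robust.get)     # best under input dispersion
```

With these assumed values, A wins on nominal performance while B wins once the dispersion of the input is taken into account.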
12.2.7 Parallelization
In those cases where each single function evaluation is very time-consuming,
parallel computing can be an important resource. In short,
parallel computing here refers to evaluating the same function simultaneously on
several processors in order to obtain results faster. The idea of
parallelization is based on the fact that the optimization process can usually be
divided into smaller steps. These smaller steps can be carried out
simultaneously on multiple computers with some special coordination. The coordination
can be done by a central manager that manages all the computers of the pool,
collecting the requests and distributing the computations according to the
current load of each computer. In this way, the whole optimization, or a part of it,
can even be submitted to a queuing system and executed taking advantage
of several different remote computers. This concept is described
in Chapter 13.
This approach can considerably speed up the optimization because a parallel
optimization algorithm can be much faster than the corresponding sequential
algorithm. Parallel optimization methods can be developed by redesigning
serial algorithms to make effective use of parallel hardware. Unfortunately,
not all algorithms can be parallelized equally well: for example, evolutionary
algorithms can be parallelized more easily than many MCDM approaches.
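Population-based methods parallelize easily because the designs of one generation can be evaluated independently. A minimal sketch of this idea, using Python's standard `concurrent.futures` module, is given below; the two-objective `evaluate` function and the population are placeholders assumed for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(design):
    """Stand-in for an expensive simulation returning two objectives (assumed)."""
    x, y = design
    return (x * x + y * y, (x - 1.0) ** 2 + (y - 1.0) ** 2)


def evaluate_population(population, max_workers=4):
    """Evaluate all designs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate, population))


population = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (0.25, 0.75)]
parallel_results = evaluate_population(population)
sequential_results = [evaluate(d) for d in population]
```

Threads are adequate when each evaluation waits on an external solver or a remote job; for CPU-bound evaluations written in pure Python, `ProcessPoolExecutor` from the same module would be the more natural choice.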
12.2.8 Plug-in
A very desirable quality for a software package is to be a completely open
platform to which anyone can contribute. Multiobjective optimization is
becoming too complex for monolithic platforms. That is why an
open platform where scientists and software engineers can introduce their
own methodologies and algorithms may represent a good solution.
Open platforms usually provide application programming interfaces (APIs)
allowing third parties to create plug-ins that interact with the main applica-
tion. In this kind of an open platform, the users can contribute with their
own optimization techniques without any changes to the main platform. Us-
ing the APIs the users can introduce the optimization technique that is most
appropriate for solving the problem at hand. An example of an open framework
dedicated to the design of metaheuristics is ParadisEO-MOEO (Liefooghe
et al., 2008).
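To make the plug-in idea concrete, the following sketch shows one possible shape of such an API: the host keeps a registry of optimization methods, and a third party adds a method without changing the host code. The registry, the decorator and the `run` entry point are hypothetical names invented for this illustration; they do not reproduce the interface of ParadisEO-MOEO or of any other platform.

```python
import random

OPTIMIZERS = {}  # hypothetical registry exposed by the host platform


def register_optimizer(name):
    """Decorator that a plug-in uses to add its method to the registry."""
    def decorator(func):
        OPTIMIZERS[name] = func
        return func
    return decorator


def run(name, objective, bounds, **options):
    """Host-side entry point: look up a registered method and run it."""
    return OPTIMIZERS[name](objective, bounds, **options)


# --- third-party plug-in code, written without touching the host ---
@register_optimizer("random_search")
def random_search(objective, bounds, evaluations=200, seed=0):
    """Very simple plug-in method: keep the best of a fixed number of samples."""
    rng = random.Random(seed)
    low, high = bounds
    best_x = min((rng.uniform(low, high) for _ in range(evaluations)),
                 key=objective)
    return best_x, objective(best_x)


best_x, best_value = run("random_search", lambda x: (x - 2.0) ** 2, (0.0, 5.0))
```

The host only depends on the agreed call signature, so new methods can be selected by name for the problem at hand.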
12.3 List and Description of Software

Many software packages cover one or at most two of the previously discussed
properties. There are several multiobjective optimization tools available, and each
tool can solve a specific or a more generic problem. Some tools are more
appropriate for constrained optimization, others may be suitable for unconstrained
continuous problems or tailored for solving some specific problems.
Unfortunately, real-world applications often include one or more difficulties which
make these tools inapplicable. Most of the time, objective functions are highly
nonlinear or may not even be given in a closed form in terms of the design
variables.
In this chapter, we describe only general purpose software tools that have
been built from the ground up to solve multiobjective optimization problems.
Therefore, we describe software that can efficiently handle several goals and
constraints at the same time, allowing the user to choose the best solution from a set
of solutions that represent the best trade-offs.
In this section we identify and select a collection of free and commercial
general purpose software. An expanded list of other software and interesting
libraries is also provided in Section 12.5.
12.3.1 modeFRONTIER
modeFRONTIER is a multiobjective optimization and design environment,
written to allow easy coupling to almost any computer aided engineering
(CAE) tool. modeFRONTIER provides an environment which allows product
engineers and designers to integrate their various CAE tools, such as CAD,
finite element structural analysis and computational fluid dynamics (CFD)
software. There are also direct interfaces for Excel, Matlab and Simulink.
Using a variety of state-of-the-art multiobjective optimization techniques,
ranging from gradient-based methods to genetic algorithms, the process or design
of interest can be optimized by specifying objectives and defining variables
which affect factors such as geometric shape and operating conditions.
modeFRONTIER (Fig. 12.6) in effect becomes a wrapper around the CAE tool,
performing the optimization.
Fig. 12.6. A snapshot of the modeFRONTIER graphical user interface. In this panel
the user can define the optimization problem.
modeFRONTIER includes a wide range of algorithms that can
be selected for solving different problems. At present, the multiobjective
methods available in modeFRONTIER are: Multiobjective Genetic Algorithm
(MOGA), Adaptive Range MOGA, Multiobjective Simulated Annealing
(MOSA), Non-dominated Sorting Genetic Algorithm (NSGA-II) (Deb et al.,
2002), Multiobjective Game Theory, Evolutionary Strategies Methodologies
and Normal Boundary Intersection (NBI) (Das and Dennis, 1998). Moreover,
different algorithms can even be combined by the decision makers in order
to obtain hybrid approaches according to their preferences. A hybrid
method tries to exploit the specific advantages of different approaches by
combining more than one of them. For example, it is possible to combine the
robustness of a genetic algorithm with the accuracy of a gradient-based
method, using the former for initial screening and the latter for
refinements. Whenever possible, modeFRONTIER's algorithms can be used in
parallel, to run more than one evaluation at once and to take advantage of
available queuing systems. modeFRONTIER is a commercial software developed
by ESTECO; its website contains several examples of how to use the
software for solving multiobjective optimization problems and decision making
processes in engineering.
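The hybrid idea of a global screening stage followed by a gradient-based refinement can be illustrated with a small single-objective sketch. It is not modeFRONTIER code: the test function is assumed, seeded random sampling stands in for the genetic algorithm stage, and a plain gradient descent with a finite-difference derivative stands in for the gradient-based method.

```python
import random


def objective(x):
    """Assumed multimodal test function: global minimum near x = -1.30, local one near x = 1.13."""
    return x ** 4 - 3.0 * x ** 2 + x


def global_screening(func, low, high, samples=50, seed=1):
    """Coarse random sampling, standing in for the genetic algorithm stage."""
    rng = random.Random(seed)
    return min((rng.uniform(low, high) for _ in range(samples)), key=func)


def gradient_refinement(func, x, step=0.01, iterations=500, h=1e-6):
    """Plain gradient descent using a central-difference derivative."""
    for _ in range(iterations):
        slope = (func(x + h) - func(x - h)) / (2.0 * h)
        x -= step * slope
    return x


start = global_screening(objective, -2.5, 2.5)   # robust but inaccurate
refined = gradient_refinement(objective, start)  # accurate but only local
```

The screening stage is what keeps the refinement from settling in the local minimum, while the refinement supplies the accuracy that the coarse sampling lacks.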
12.3.2 OPTIMUS
OPTIMUS is a world-leading process integration software that bundles a
collection of design exploration and numerical optimization methods. It allows
users to build simulation workflows to automate their numerical simulation
processes. These simulation workflows integrate one or more simulation codes
and are executed by OPTIMUS, if possible, without user intervention. Once
the simulation process is captured in an OPTIMUS workflow, users are able
to explore their design space by modifying selected input parameters and hunt
for new designs that are more reliable with better functional performance. All
calculations are based on the integrated simulation tools that are part of the
OPTIMUS workflows.
Methods available in OPTIMUS include:
• design space exploration, such as Design of Experiments (DOE) and
  Response Surface Modeling (RSM),
• numerical optimization, based on gradient-based local algorithms or
  genetic global algorithms, both for single or multiple objectives with
  continuous and/or discrete design variables and
• robustness and reliability engineering, including methods to assess and
  optimize the variability of design outputs based on variable design inputs.
Fig. 12.7. Optimization Post-Processing with OPTIMUS. A scatter plot showing
points on the Pareto optimal set for two objectives. When clicking on a point, the
variable values are shown (left). A 3-dimensional plot of the Pareto optimal set
(right).
The multiobjective optimization methods include: non-dominated sorting
evolutionary algorithm (NSEA and NSEA+, based on NSGA-II), the normal-boundary
intersection method as well as the weighting method, the weighted
Chebyshev problem, the trade-off method, lexicographic ordering and the
method of global criterion (see, for example, Chapters 1 and 2).
OPTIMUS is developed by Noesis Solutions, a subsidiary of LMS Inter-
national, headquartered in Leuven, Belgium.
12.3.3 iSIGHT
Engineous iSIGHT software integrates and manages the computer software
required to execute simulation-based design processes, including commercial
CAD/CAE software, internally developed programs, and Excel spreadsheets.
iSIGHT drives toward optimal and reliable product designs using its library
of advanced engineering tools. iSIGHT components include optimization,
design of experiments, Monte Carlo analysis and approximations. iSIGHT is
continuously updated and contains state-of-the-art multiobjective genetic algorithm
routines such as, for example, MOGA-NCGA and NSGA-II.
12.3.4 NIMBUS
NIMBUS (Miettinen and Mäkelä, 2006) is an interactive classification-based
method for multiobjective optimization, see Chapter 2 (Miettinen, 1999). It
is suitable for both differentiable and nondifferentiable multiobjective and
single objective optimization problems subject to nonlinear and linear
constraints with bounds for the variables. The classification information obtained
from a decision maker is used to generate one to four Pareto optimal
solutions that best reflect the preferences expressed in the classification. In
practice, this means that one to four subproblems are created and solved
with a solver appropriate for the characteristics of the problem in question.
The subproblems also include reference point based subproblems from the
reference point method, the STOM and the GUESS methods (see Chapter 2).
WWW-NIMBUS is the implementation of the NIMBUS method operating
via the Internet at http://nimbus.it.jyu.fi/. WWW-NIMBUS can
be used free of charge for academic purposes. The implementation operating
under Linux and MS-Windows operating systems is IND-NIMBUS
(Miettinen, 2006). It is for sale and some information about it is available at
http://ind-nimbus.it.jyu.fi/. An example of the user interface of
IND-NIMBUS (classification window) is given in Figure 12.8.
In both implementations, there are several underlying solvers available,
including a local proximal bundle method and a global genetic algorithm
with different constraint-handling techniques. It is also possible to use a hybrid
solver where the local solver is used after the global one. Both implementations
support the DM in comparing, with graphical visualizations, the Pareto
optimal solutions (s)he likes. Besides classification, the DM can also direct
the search by asking for intermediate solutions between any two generated
solutions.
12.3.5 PROMOIN
The interactive system PROMOIN (Caballero et al., 2002) has been designed
as a decision aid tool for multiobjective problems, based on the use of
interactive techniques. The current version of the software deals with linear
problems, although a nonlinear version is currently under construction. The
main idea underlying PROMOIN is the following. There are plenty of interactive
techniques available in the literature. They differ in several aspects: the
kind of problem handled, the type of final solution, the inner solution process
and the information asked from the decision maker. The last issue is a key
factor for the success of an interactive method. If the decision maker does not
feel comfortable with the information (s)he has to provide, the method will
hardly succeed in finding her/his most preferred solution. Therefore, the
interactive technique should be chosen according to the decision maker's wishes
regarding how (s)he wants to give the information. Furthermore, the kind
of information that the decision maker wishes to give may vary during the
solution process, because (s)he progressively learns about the
problem and gets a more accurate picture of it. The main interactive
procedures have been incorporated into the system, including tradeoff
based methods such as the Geoffrion-Dyer-Feinberg (GDF) method, the
sequential proxy optimization (SPOT) method, the interactive surrogate worth
trade-off method (ISWT) and other methods such as the Tchebycheff method
and the Zionts-Wallenius method. Moreover, it contains reference point based
methods such as STEM, STOM and the reference point method. (For further
details of the methods, see, e.g., Chapter 2.) The method can be chosen
according to the decision maker's wishes. In addition, the system offers
the possibility to change between methods at any time during the solution
process. The program has been implemented under the Windows environment, with
the aim of providing the user with a friendly interface.
Fig. 12.8. IND-NIMBUS: graphical user interface
12.3.6 MKO-2
The interactive software system MKO-2 is a generalized multiobjective
decision support system (Staykov, 2006; Vassilev et al., 2006). It has been designed
to support solving linear and linear integer multiobjective optimization
problems. The system implements an innovative generalized classification-based
interactive algorithm for multiobjective optimization with variable
scalarizations and parameterizations, applicable for different types of problems (i.e.,
linear, nonlinear and mixed variables). The MKO-2 system will be extended
to handle nonlinear multiobjective optimization problems as well. The
incorporated generalized interactive algorithm is applicable for different ways of
defining the preferences by the decision maker, such as weighting factors
(priorities), aspiration levels, aspiration intervals, aspiration directions of change
in the values of some or of all the objective functions, etc. Using the MKO-2
system, the decision makers can apply twelve interactive MCDM methods
existing in the literature (the Chebyshev method, the STEM method, the
STOM method, the reference point method, the GUESS method (Buchanan,
1997), the modified reference point method, the visual interactive method, the
reference direction method, the NIMBUS method, the DALDI method, the
weighting method, and the ε-constraint method; see Chapters 1 and 2) and
different strategies in the search for new Pareto optimal solutions, not only
with the help of one particular method, but combining different interactive
MCDM methods. In this way, the MKO-2 software system can be used not
only for solving multiobjective optimization problems, but also for comparing
and analyzing different solutions of a given problem, using different types of
preference information set by the decision maker and different interactive
methods. The MKO-2 software system operates under the MS Windows operating
system. The graphical user interface of the system enables decision makers
with different levels of expertise in the methods and software
tools to operate the system easily. The MKO-2 decision support system can
be used both for education and for solving real-life problems. It can be used
free for academic purposes under a certain bilateral agreement.
12.3.7 Pareto Front Viewer

The software Pareto Front Viewer (PFV) provides interactive visualization
of the Pareto frontier for multiobjective problems in the case of two to eight
objective functions. It is assumed that an approximation of the Pareto
frontier has already been constructed in the form of a finite list of objective
vectors. The method used for constructing the objective vectors plays no role. Thus,
the PFV software can be combined with any Pareto frontier approximation
technique. The method is based on the visualization of bi-objective slices
of the Edgeworth-Pareto Hull (EPH) of the objective vectors, that is, the
union of the domination cones with vertices located at the objective
vectors of the approximation. The objective tradeoffs for any three objectives
that are specified by a decision maker are visualized in the form of
decision maps, which are collections of overlaid bi-objective slices of the
EPH. The influence of the other objectives can be studied by moving the
sliders of the related scroll-bars. In this way, the user is informed about the
objective tradeoffs and is supported in the process of selecting the preferred
Pareto optimal decision vector, which is based on the direct identification of
the feasible goal on the computer screen. The software PFV was coded for
the platforms MS Windows 98/NT/2000/XP. A demonstration version for
up to five criteria and 500 criterion points can be downloaded for free from
http://www.ccas.ru/mmes/mmeda/soft/third.htm.
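The definition of the EPH as a union of domination cones translates directly into a membership test, and a bi-objective slice can be computed from the list of objective vectors. The sketch below is our own illustration and not PFV code; it assumes minimization, treats the slider values as upper bounds on the remaining objectives, and uses a made-up list of objective vectors.

```python
def in_eph(point, approximation):
    """True if `point` lies in the Edgeworth-Pareto Hull (minimization assumed)."""
    return any(all(f <= p for f, p in zip(vec, point)) for vec in approximation)


def slice_frontier(approximation, i, j, fixed):
    """Frontier of the bi-objective slice for objectives i and j.

    `fixed` maps every other objective index to the bound set by its slider.
    Only vectors that respect these bounds contribute to the slice.
    """
    feasible = [v for v in approximation
                if all(v[k] <= bound for k, bound in fixed.items())]
    pairs = sorted({(v[i], v[j]) for v in feasible})
    frontier = []
    for a, b in pairs:
        # Pairs are visited by increasing first objective, so a pair is
        # nondominated only if it improves the second objective.
        if not frontier or b < frontier[-1][1]:
            frontier.append((a, b))
    return frontier


# Hypothetical approximation of a Pareto frontier with three objectives.
approximation = [(1, 5, 2), (2, 3, 4), (3, 4, 1), (4, 1, 3), (2, 6, 1)]
tight = slice_frontier(approximation, 0, 1, {2: 2})
loose = slice_frontier(approximation, 0, 1, {2: 4})
```

Overlaying `tight` and `loose` in one plot gives a small decision map: relaxing the bound on the third objective moves the frontier of the first two objectives toward better values.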
12.3.8 Reasonable Goals Method for Databases

Reasonable Goals Method for Databases (RGDB) is a Web application
server that supports selecting a small number of alternatives from large
tables through the Internet. The application server applies the Reasonable Goals
Method, that is, visualization of the Pareto frontier of the envelope (convex
hull) of the objective vectors related to the alternatives in the case of two
to eight objectives. First, the user provides the table with alternatives to the
server. Then, the server approximates the Edgeworth-Pareto Hull (EPH) of
the convex hull of the objective vectors. Then, the user's computer receives an
applet that supports interactive visualization of the Pareto frontier based on
the visualization of bi-objective slices of the EPH. The user explores objective
tradeoffs for any three selected objectives. The influence of the other
objectives can be studied by moving the sliders of the related scroll-bars. Then,
the reasonable goal is transmitted to the server, which selects a small number
of the Pareto optimal decisions and transmits them back to the user. The
Web application server, coded in C++, can be used with the help of standard
browsers. A demonstration version (5 objectives and up to 500 alternatives)
can be found at http://www.ccas.ru/mmes/mmeda/rgdb/index.htm.
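The last step, selecting a few alternatives once a goal has been identified, can be sketched as follows. The selection rule used here (nondominated rows ranked by Chebyshev distance to the goal) is an assumption made for illustration and may differ from the rule implemented in RGDB; the table of alternatives is made up and minimization is assumed.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def select_near_goal(table, goal, count=2):
    """Return up to `count` nondominated rows closest to the goal (Chebyshev distance)."""
    nondominated = [r for r in table if not any(dominates(o, r) for o in table)]
    return sorted(nondominated,
                  key=lambda r: max(abs(x - g) for x, g in zip(r, goal)))[:count]


# Hypothetical table of alternatives with two objectives.
table = [(10, 7), (8, 9), (12, 4), (11, 8), (15, 3), (9, 12)]
selected = select_near_goal(table, goal=(10, 5))
```

Here the rows (11, 8) and (9, 12) are dominated and never returned, and the two rows closest to the goal are passed back to the user.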
12.3.9 ParadisEO and GUIMOO

ParadisEO is a white-box object-oriented generic framework dedicated to the
flexible design of evolutionary multiobjective algorithms. This paradigm-free
software aims to provide a set of classes that ease and speed up the
development of computationally efficient programs. It is based on a clear
conceptual distinction between the solution methods and the multiobjective
problems they are intended to solve. This separation allows maximum design
and code reuse. ParadisEO provides a broad range of archive-related features
(such as elitism or performance metrics) and the most common Pareto-based
fitness assignment strategies such as MOGA, NSGA, SPEA and the
Indicator-Based Evolutionary Algorithm (IBEA) (Zitzler and Künzli, 2004).
Furthermore, parallel and distributed models as well as hybridization mechanisms can
be applied to an algorithm designed within ParadisEO. This tool is developed
by INRIA, which also provides GUIMOO, a platform-independent free
software dedicated to the analysis of results of multiobjective problems. GUIMOO
allows visualization of approximate Pareto frontiers and contains metrics for
quantitative and qualitative performance evaluations.
12.4 Summary Table on Optimization Software

Table 12.1 summarizes the main characteristics of all the multiobjective opti-
mization software tools described in the previous section. This table lists only
tools that have been developed exclusively for multiobjective optimization.
Many other software systems can be used for optimization or visualization of
data but we limit our study to the tools that look for the Pareto frontier and
have visualization capabilities dedicated to this type of results.
The table has 11 columns that can be read as follows:
1. Software and Developers: name of the software and information about the
companies or institutions taking care of the development. Whenever
possible, web pages or email contacts are reported.
2. Platforms: platforms on which the software can run.
3. GUI: an easy-to-use GUI.
4. EMO: this column contains a brief description of the evolutionary
multiobjective methods available in the software.
5. MCDM: this column contains a brief description of the MCDM methods
available in the software.
6. Rob.: this column contains the symbol x if and only if the software
contains at least one method to establish robustness of solutions.
7. Meta: this column reports whether or not the software contains one or
more methods for meta-modeling.
8. Vis.: visualization tool and statistical analysis. This column indicates
whether or not the software contains one or more methods for visualizing
the Pareto frontier and/or other results coming out of the optimization
phase.
9. Plug: this column contains the symbol x if the software can be considered
as an open platform where the users can add their own optimization
methods as external plug-ins.
10. //: this symbol stands for Parallelization. Hence this column contains
the symbol x if the software supports parallel computation and if the
optimization algorithms can deal with queuing systems.
11. License: license type: commercial, free or academic.

Table 12.1. Summary of the main characteristics of the multiobjective optimization
software described

Software and Developers | Platforms | GUI | EMO | MCDM | Rob. / Meta / Vis. / Plug | License | //
modeFRONTIER, ESTECO, www.esteco.com | All | x | MOGA-II, NSGA-II, ARMOGA, MOSA, Game Theory, MO evolutionary strategies | NBI, weighting method, reference point method | x x x x | Commercial | x
OPTIMUS, NOESIS Solutions, www.noesissolutions.com | Windows and Linux | x | Non-Dominated Sorting Evolutionary Algorithms (NSEA and NSEA+) | NBI and 7 other methods | x x x x | Commercial | x
iSIGHT, Engineous Software, www.engineous.com | All | x | NCGA, NSGA-II | LSGRG, MMFD, MOST, Stress Ratio, MIGA, Pointer | x x x x | Commercial | x
WWW-NIMBUS, University of Jyväskylä, nimbus.it.jyu.fi | Web | x | | NIMBUS, reference point method, GUESS, STOM | x | Free for Academic |
IND-NIMBUS, University of Jyväskylä, ind-nimbus.it.jyu.fi | Windows and Linux | x | | NIMBUS, reference point method, GUESS, STOM | x | Commercial |
PROMOIN, University of Malaga | Windows | x | | SPOT, ISWT, STEM, STOM | | Free |
MKO-2, Institute of Information Technologies - Bulgarian Academy of Sciences, Department of Decision Support Systems, www.iit.bas.bg | Windows | x | | Chebyshev, STEM and STOM methods, GUESS, reference point method, NIMBUS, DALDI, weighting method | x | Free for Academic |
Pareto Front Viewer, www.ccas.ru/mmes/mmeda/soft | Windows | x | | | x x | Commercial, Free up to 500 points |
Reasonable Goals Methods for Databases, www.ccas.ru/mmes/mmeda/rgdb/ | Web | x | | | x x | Free |
ParadisEO, Inria, paradiseo.gforge.inria.fr | All | x | NSGA-II, IBEA | | x | Commercial | x
12.5 List of Available Libraries

Many evolutionary multiobjective optimization studies use computer
codes which are freely downloadable. Some of them are the NSGA-II code in
C (http://www.iitk.ac.in/kangal/soft.htm), SPEA2 and other EMO codes in
C++. An important platform containing a set of ready-to-go multiobjective
optimization methods is PISA (Bleuler et al., 2003). PISA consists of two
parts: a set of optimization problems (variators) and a set of optimization
algorithms (selectors). The selectors are state-of-the-art evolutionary
multiobjective optimization methods (see Chapter 3). The user can write and submit
a new module to the platform. All modules available can be used for academic
purposes without a fee. Each module specifies its own licensing policy. PISA
itself is copyrighted by the Swiss Federal Institute of Technology, Computer
Engineering and Networks Laboratory.
There are several other platforms that help the development process of
evolutionary multiobjective optimizers, for example, Open BEAGLE, which
is an object-oriented software environment enabling the implementation of
almost any kind of evolutionary algorithm, such as genetic algorithms and
genetic programming, as well as MOMHLib++ and MOEA (Tan et al., 2000).
Another important package is DAKOTA, a multilevel parallel object-oriented
framework for design optimization, parameter estimation,
uncertainty quantification, and sensitivity analysis developed by the Sandia
National Laboratories.
For MCDM based multiobjective methods, there are several items available
on the Internet such as, for example, PROTASS, developed by Rafal
Cytrycki for linear multiobjective problems and available for download at
http://www.ekspert.szczecin.pl/protass/en.
Even though it is designed for discrete problems, let us still mention one of the
most famous decision support systems, the Analytic Hierarchy Process (AHP).
Designed to reflect the way people actually think, AHP was developed in the
1970s (Saaty, 1996). AHP is now included in a commercial software called
Expert Choice. This software is intuitive, graphically based and structured in
a user-friendly fashion. With this tool, decision makers are able to drill down
to their level of expertise and apply judgments to the objectives in order to
achieve their goals.
Finally, we can mention the Decisionarium project
(http://www.decisionarium.net/), which focuses on the development of web based tools for
interactive multicriteria decision support for individual decision making, for group
collaboration and negotiation as well as for interaction and surveys over the
web (mostly for discrete problems).
There are probably several other packages that should be listed in this
section. Unfortunately, most of the time, software products are
implemented for academic testing purposes and are usually neither maintained nor
advertised. Moreover, there exist several tools for solving single objective
optimization problems that may offer some possibilities for multiobjective
optimization. Those are deliberately excluded from this final list because this
chapter concentrates only on nonlinear multiobjective optimization
software developed in the last decade.
12.6 Conclusions
The information collected and presented in this chapter is just a snapshot
of the multiobjective optimization tools available. Setting up, installing and
testing all these software packages on a number of different platforms has
been quite a demanding job. It is obviously impossible to say which one is
the best among all the listed software. A large number of issues should be
taken into account when evaluating a software, such as ease of use, completeness,
configurability, robustness, efficiency, user support and so on.
Software for multiobjective optimization and more complex integrated en-
vironments such as Process Integration and Design Optimization (PIDO) or
Problem Solving Environment (PSE) have become popular in the last years
and several new packages are probably coming on the market. The impor-
tance of multiobjective optimization for the commercial world can be readily
seen by the fact that most of the industrial companies now support one or
more of the available packages. There is clear evidence that both commercial
and research/academic communities are becoming increasingly interested in
multiobjective optimization software.
The data summarized in Table 12.1 lets us conclude that commercial software
is usually more complete and closer to a desirable multiobjective
optimization software than free or open source tools. Nevertheless, there are
some good libraries that are well-qualified starting points for people approaching
multiobjective optimization.
References
Bleuler, S., Laumanns, M., Thiele, L., Zitzler, E.: PISA – A Platform and Programming
Language Independent Interface for Search Algorithms. In: Fonseca, C.M.,
Fleming, P.J., Zitzler, E., Deb, K., Thiele, L. (eds.) EMO 2003. LNCS, vol. 2632.
Springer, Heidelberg (2003)
Buchanan, J.T.: A Naïve Approach for Solving MCDM Problems: The GUESS
Method. The Journal of the Operational Research Society 48(2), 202–206 (1997)
Caballero, R., Luque, M., Molina, J., Ruiz, F.: Promoin: An interactive system for
multiobjective programming. Information Technologies and Decision Making 1,
635–656 (2002)
Das, I., Dennis, J.E.: Normal-boundary intersection: a new method for generating
Pareto optimal points in multicriteria optimization problems. SIAM Journal on
Optimization 8(3), 631–657 (1998)
Deb, K., Pratap, A., Agrawal, S., Meyarivan, T.: A fast and elitist multi-objective
genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2),
181–197 (2002)
Gallopoulos, E., Houstis, E., Rice, J.R.: Future Research Directions in Problem
Solving Environments for Computational Science (1991)
Houstis, E., Gallopoulos, E., Bramley, R., Rice, J.: Problem-Solving Environments
for Computational Science. IEEE Computational Science and Engineering 4(3),
1821 (1997)
Inselberg, A., Dimsdale, B.: Parallel Coordinates: a Tool for Visualizing Multi-
Dimensional Geometry. In: VIS 90: Proceedings of the 1st conference on Vi-
sualization 90, San Francisco, California, pp. 361378. IEEE Computer Society
Press, Los Alamitos (1990)
Kaliszewski, I.: Out of the mist Towards decision-maker-friendly multiple criteria
decision making support. European Journal of Operational Research 158, 293307
(2004)
Kohonen, T.: Self-organized formation of topologically correct feature maps. Biolog-
ical Cybernetics 43, 5969 (1982)
Liefooghe, A., Basseur, M., Jourdan, L., Talbi, E.-G.: ParadisEO-MOEO: A Frame-
work for Evolutionary Multi-objective Optimization (2008)
Boston (1999)
Miettinen, K.: IND-NIMBUS for demanding interactive multiobjective optimization.
In: Trzaskalik, T. (ed.) Multiple Criteria Decision Making 05, pp. 137150. Karol
Adamiecki University of Economics, Katowice (2006)
Obayashi, S., Sasaki, D.: Multi-objective optimization for aerodynamic designs by
using armogas. In: HPCASIA 04: Proceedings of the High Performance Comput-
ing and Grid in Asia Pacic Region, Seventh International Conference on (HP-
CAsia04), Washington, DC, USA, pp. 396403. IEEE Computer Society Press,
Los Alamitos (2004)
Pryke, A., Mostaghim, S., Nazemi, A.: Heatmap Visualization of Population Based
Multi Objective Algorithms. In: Obayashi, S., Deb, K., Poloni, C., Hiroyasu, T.,
Murata, T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 361375. Springer, Heidelberg
(2007)
Saaty, T.L.: Multicriteria Decision Making: The Analytic Hierarchy Process; Plan-
ning, Priority Setting, Resource Allocation, 2nd edn. Analytic Hierarchy Process
Series. RWS Publications, Pittsburgh (1996)
Staykov, B.: Multiobjective optimization software system. In: Problems of Engineer-
ing Cybernetics and Robotics, vol. 57, pp. 2130 (2006)
Tan, K.C., Lee, T.H., Khoo, D., Khor, E.F.: MOEA Toolbox for Computer-Aided
Multi-Objective Optimization. In: 2000 Congress on Evolutionary Computation,
July 2000, vol. 1, pp. 3845. IEEE Computer Society Press, Piscataway (2000)
Vassilev, V., Vassileva, M., Staykov, B., Miettinen, K.: Generalized multicriteria de-
cision support systems. In: Proceedings of the International Workshop on Seman-
tic Web and Knowledge Technologies Applications, 12th International Conference
AIMSA, pp. 1630 (2006)
Weistroer, H.R., Smith, C.H., Narula, S.C.: Multiple criteria decision support soft-
ware. In: Figueira, J., Greco, S., Ehrgott, M. (eds.) Multiple Criteria Decision
Analysis: State of the Art Surveys, pp. 9891018. Springer, Heidelberg (2005)
Zitzler, E., Knzli, S.: Indicator-based selection in multiobjective search. In: Yao, X.,
Burke, E.K., Lozano, J.A., Smith, J., Merelo-Guervs, J.J., Bullinaria, J.A., Rowe,
J.E., Tio, P., Kabn, A., Schwefel, H.-P. (eds.) PPSN 2004. LNCS, vol. 3242,
13
Parallel Approaches for Multiobjective
Optimization
El-Ghazali Talbi¹, Sanaz Mostaghim², Tatsuya Okabe³, Hisao Ishibuchi⁴,
Günter Rudolph⁵, and Carlos A. Coello Coello⁶

¹ Laboratoire d'Informatique Fondamentale de Lille
  Université des Sciences et Technologies de Lille
  59655 - Villeneuve d'Ascq cedex, France
  talbi@lifl.fr
² Institute AIFB
  University of Karlsruhe
  76128 Karlsruhe, Germany
  mostaghim@aifb.uni-karlsruhe.de
³ Honda Research Institute Japan Co., Ltd.
  8-1 Honcho, Wako-City, Saitama, 351-0188, Japan
  okabe@jp.honda-ri.com
⁴ Department of Computer Science and Intelligent Systems
  Osaka Prefecture University
  Osaka 599-8531, Japan
  hisaoi@cs.osakafu-u.ac.jp
⁵ Computational Intelligence Research Group
  Chair of Algorithm Engineering (LS XI)
  Department of Computer Science, University of Dortmund
  44227 Dortmund, Germany
  Guenter.Rudolph@uni-dortmund.de
⁶ CINVESTAV-IPN (Evolutionary Computation Group)
  Depto. de Computación, Av. IPN No 2508
  Col. San Pedro Zacatenco, México, D.F., 07360 MEXICO
  ccoello@cs.cinvestav.mx
Abstract. This chapter presents a general overview of parallel approaches for
multiobjective optimization. For this purpose, we propose a taxonomy for parallel
metaheuristics and exact methods. This chapter covers the design aspects of the
algorithms as well as the implementation aspects on different parallel and
distributed architectures.
Key words: Parallel algorithms, Parallel metaheuristics, Parallel multiobjective
optimization, Parallel exact optimization
Reviewed by: Heinrich Braun, SAP AG, Walldorf, Germany
Jürgen Branke, University of Karlsruhe, Germany
13.1 Introduction
Multiobjective optimization problems are often NP-hard, complex and CPU
time consuming. Exact methods can be used to find the exact Pareto front
(or a subset of the front), but they are impractical for large problems
because of their time and memory requirements. Metaheuristics, on the other
hand, provide approximations of the Pareto front in a reasonable time.
However, they too remain time-consuming for large problems.
Parallel and distributed computing are used in the design and
implementation of multiobjective optimization algorithms to speed up the
search. They are also used to improve the precision of the mathematical
models, the quality of the obtained Pareto fronts, and the robustness of the
obtained solutions, and to solve large-scale problems.
In this chapter, we present the main parallel models for metaheuristics
and exact methods from the algorithmic design point of view. We consider
both continuous and combinatorial optimization problems, as the parallel
models are suited to either class. From the implementation point of view, we
concentrate on the parallelization of multiobjective optimization algorithms
on general-purpose parallel and distributed architectures, as these
architectures are the most widespread computation platforms. The rapid
evolution of technology in terms of processors (multi-core), networks
(InfiniBand), and architectures (GRIDs, clusters) makes parallelization very
popular.
The architectural criteria that affect the efficiency of the implementation
are shared memory / distributed memory, homogeneous / heterogeneous,
dedicated / non-dedicated, and local network / large network. Indeed,
these criteria have a strong impact on deployment techniques such as
load balancing and fault tolerance. Depending on the type of architecture
used, different parallel and distributed programming environments can be
used, such as message passing (PVM, MPI), shared memory (multi-threading,
OpenMP), high-throughput computing (Condor), and Grid computing (Globus).
This chapter is organized as follows. In the next section, we present the
parallel models for designing metaheuristics for MOPs. In Section 13.3, we
review the parallel models for exact algorithms. Section 13.4 deals with the
implementation issues for metaheuristics and exact algorithms. Finally, we
conclude the chapter and discuss several lines for future research in
Section 13.5.
13.2 Parallel Models for Metaheuristics
Different parallel models for metaheuristics have been proposed in the
literature. They follow three major hierarchical models:
• Self-contained parallel cooperation (between different algorithms)
• Problem independent intra-algorithm parallelization
• Problem dependent intra-algorithm parallelization
where the last two models do not alter the behavior of the algorithms and
are therefore generally used to speed up the search.
13.2.1 Level 1: Self-Contained Parallel Cooperation
Basic Concept
This group of parallel algorithms, which contains the island model, is used
for parallel systems with very limited communication. In the island model,
every processor runs an independent MOEA using a separate (sub)population.
The processors may cooperate by regularly exchanging migrants, which are
good individuals from their subpopulations. These algorithms are also
suitable for problems with large search spaces where a large population is
required. The large population is then divided into several subpopulations.
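The migration step of the island model can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the text: the islands are arranged in a ring, migrants are drawn at random from each island's non-dominated individuals, they replace random residents of the receiving island, and all objectives are minimized. The names `dominates`, `nondominated` and `migrate_ring` are hypothetical.

```python
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(pop, evaluate):
    """Individuals of pop whose objective vectors are not dominated within pop."""
    scored = [(ind, evaluate(ind)) for ind in pop]
    return [ind for ind, f in scored
            if not any(dominates(g, f) for _, g in scored)]

def migrate_ring(islands, evaluate, n_migrants=1, rng=random):
    """One migration step on a ring: island i sends migrants to island i+1.

    Assumption: migrants are picked from the non-dominated individuals of the
    sending island and overwrite random residents of the receiving island.
    """
    # Choose all migrants first, so that arrivals are not forwarded immediately.
    outgoing = []
    for pop in islands:
        front = nondominated(pop, evaluate)
        outgoing.append(rng.sample(front, min(n_migrants, len(front))))
    for i, migrants in enumerate(outgoing):
        target = islands[(i + 1) % len(islands)]
        for m in migrants:
            target[rng.randrange(len(target))] = m
    return islands
```

In a full island model, each island would run its own MOEA for some generations between two calls of `migrate_ring`; the subpopulation sizes stay constant because migrants replace residents.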
On every processor, an optimization algorithm with selection and
recombination operators is carried out on a subpopulation. As described by
Coello et al. (2002), there are several methods (also based on the island
model) in the literature, which we can categorize into two main groups.
(1) Cooperating Subpopulations: These methods are based on partitioning the
objective/search space. In this group, the population is divided into
subpopulations. The number of subpopulations and the way the population is
divided are the two key issues. (2) Multi-start Approach: Here, each
processor independently runs an optimization algorithm.
Group 1: Cooperating Subpopulations
These algorithms attempt to distribute the task of finding the entire
Pareto-optimal front among the participating processors. In this way, each
processor is destined to find a particular portion of the Pareto-optimal
front. In fact, the population of a MOEA is divided into a number of
independent and separate subpopulations, resulting in several small separate
MOEAs executing simultaneously, each of which is responsible for finding the
(Pareto-)optimal solutions in its own search region. Each MOEA could have
different operators, parameter values, as well as a different structure. In
this model, some individuals within particular subpopulations occasionally
migrate to another one.
Generally, when distributing the task among the processors, the overlap
between the solutions of two processors should be as small as possible. Also,
the distribution algorithm must be scalable. Usually, the designer or a
computational resource (master node) is responsible for distributing and
dividing the population or the objective/search space.
In the literature, the very first approaches based on the island model do not
directly divide the objective/search space into different regions, but
implicitly result in such a division, as studied by Baita et al. (1995);
Poloni (1995); Hiroyasu et al. (2000); Jozefowiez et al. (2002); Deb et al.
(2003); Xiao and Armstrong (2003); de Toro Negro et al. (2004).
Baita et al. (1995) and Poloni (1995) use a local geographic selection
scheme in which individuals are placed on a toroidal grid with one
individual per grid intersection point. Hiroyasu et al. (2000) proposed the
Divided Range Multi-Objective Genetic Algorithm (DRMOGA) in which the global
population is sorted according to one of the objective functions (which is
changed after a number of generations). Then, the population is divided into
equally-sized sub-populations. Each of these sub-populations is allocated to
a different processor on which a serial MOEA is applied. After a certain
number of generations, the sub-populations are gathered and the process is
repeated, but this time using some other objective function as the sorting
criterion. The main goal of this approach is to focus the search effort of
the population on different regions of the objective space. However, this
approach cannot guarantee that the sub-populations remain in their assigned
region. A similar approach is followed by de Toro Negro et al. (2004). Deb
et al. (2003) use a modified domination criterion for assigning a specific
region of the objective space to a processor.
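The sort-and-divide step described for DRMOGA can be sketched as follows. This is only an illustration of the idea as stated in the paragraph above, not the implementation of Hiroyasu et al. (2000); the function name `divide_by_objective` and the requirement that the population size be a multiple of the number of sub-populations are assumptions made here for simplicity.

```python
def divide_by_objective(population, objectives, sort_index, n_subpops):
    """Sort a population by one objective and cut it into equal contiguous parts.

    population : list of individuals
    objectives : list of objective vectors, objectives[i] belongs to population[i]
    sort_index : index of the objective used as the sorting criterion
    n_subpops  : number of sub-populations (one per processor)
    """
    if len(population) % n_subpops != 0:
        raise ValueError("population size must be a multiple of n_subpops")
    # Indices of the individuals in ascending order of the chosen objective.
    order = sorted(range(len(population)), key=lambda i: objectives[i][sort_index])
    size = len(population) // n_subpops
    return [[population[i] for i in order[start:start + size]]
            for start in range(0, len(population), size)]
```

After the serial MOEAs have run for some generations, the sub-populations would be gathered again and the function called with a different `sort_index`, which corresponds to changing the sorting criterion.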
Zhu and Leung (2002); Zhu (2002) proposed the Asynchronous Self-Adjustable
Island Genetic Algorithm (aSAIGA) in which, rather than migrating a set of
individuals, the islands exchange information related to the region they
are currently exploring. Based on the information coming from other islands,
a self-adjusting operation modifies the fitness of the individuals in the
island to prevent two islands from exploring the same region. As with
DRMOGA, this approach cannot guarantee that the sub-populations move
tightly together throughout the search space; hence the information about
the explored region may be meaningless.
Xiao and Armstrong (2003) use a generalized version of VEGA (Vector
Evaluated Genetic Algorithm, Schaffer (1985)) to divide the population into
subpopulations.
López-Jaimes and Coello (2005) proposed an approach called Multiple
Resolution Multi-Objective Genetic Algorithm (MRMOGA), whose main idea is
to encode the solutions using a different resolution in each island
(heterogeneous nodes are assumed). The decision variable space is thus
divided into hierarchical levels with well-defined overlaps. Evidently,
migration is only allowed in one direction (from low resolution to high
resolution islands). A conversion scheme is required when migrating
individuals, so that the resolution is properly adjusted. This approach uses
an external population (or elitist archive) and the migration strategy
considers such a population as well. The approach also uses a strategy to
detect nominal convergence of the islands in order to increase their initial
resolution. The rationale behind this approach is that the true Pareto front
can be reached faster using this change of resolution in the islands,
because the search space of the low resolution islands is proportionally
smaller and, therefore, convergence is faster. This issue was originally
identified by Parmee and Vekeria (1997) when they used an injection island
strategy to solve a single-objective engineering optimization problem.
In the method proposed by Jozefowiez et al. (2005), each processor has
its own population which is defined in the entire search space. The defined
communication network between the processors is a ring where the processors
send half of their populations to their two neighbors. The computations of a
given processor do not begin until it has received the information from its
two neighbors.
The first approach to dividing the objective space into several regions was
introduced by Branke et al. (2004). This technique, called Cone Separation,
divides the objective space into subspaces and assigns each subspace to one
processor. It does not, however, divide the search space, and therefore each
processor explores the entire search space. The solutions outside the
defined region in the objective space of each processor are considered
infeasible (although in reality they are feasible). Those infeasible
solutions are migrated to other processors. This algorithm is scalable and
there is no overlap between the solutions obtained by each processor. The
so-called hypergraph has been used by Mehnen et al. (2004) to structure the
populations in MOEAs and has then been applied to parallel MOEAs. Streichert
et al. (2005) refined the idea of the Cone Separation technique by using a
clustering method for finding the right partitions in the objective space.
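The idea of assigning regions of the objective space to processors can be illustrated for two objectives. The following sketch is a simplification made for illustration and does not reproduce the exact construction of Branke et al. (2004): it assumes that both objectives are minimized and normalized to [0, 1], that the cones have equal opening angles, and that they share the apex `ref = (1, 1)`. The names `cone_index` and `is_feasible_for` are hypothetical.

```python
import math

def cone_index(f, n_cones, ref=(1.0, 1.0)):
    """Index of the cone that contains the normalized objective vector f.

    The quarter plane seen from the apex `ref` is split into n_cones cones of
    equal angle (assumption); cone 0 lies next to the f1 axis direction.
    """
    dx = ref[0] - f[0]
    dy = ref[1] - f[1]
    angle = math.atan2(dy, dx)                      # 0 .. pi/2 inside the unit box
    angle = min(max(angle, 0.0), math.pi / 2.0)     # clamp vectors outside the box
    k = int(angle / (math.pi / 2.0) * n_cones)
    return min(k, n_cones - 1)                      # angle == pi/2 belongs to last cone

def is_feasible_for(f, processor, n_cones):
    """A processor treats a solution as feasible only if it lies in its own cone."""
    return cone_index(f, n_cones) == processor
```

A solution for which `is_feasible_for` returns False on the processor that generated it would be a candidate for migration to the processor whose index `cone_index` returns.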
More recently, Bui et al. (2006) studied an approach for dividing the search
space. In their approach, a random (hyper-)sphere is selected as the search
space for every single processor. Every processor then runs a MOEA inside
its defined region. The spheres are evaluated in terms of their solutions,
and their positions in the search space are improved for the next
iteration(s). This has been done alongside other techniques such as a racing
model using Multi-Objective Particle Swarm Optimization.
All of these methods work on processors which have similar properties, in
other words, on homogeneous systems. Mostaghim et al. (2007) study an
approach which works asynchronously and is thus particularly suitable for
heterogeneous computer clusters as occurring, e.g., in modern grid computing
platforms.
Group 2: Multi-start Approach
This model, introduced by Mezmaz et al. (2006), consists of several parallel
local search algorithms which are run independently on several (possibly
heterogeneous) processors. The basic idea of using such a model is that
running several optimization algorithms with different initial seeds is more
valuable than executing only one single run for a very long time. This is of
particular importance for local search algorithms. Jozefowiez et al. (2007)
use a parallel hybrid approach combining the multi-start model and the
self-contained parallel cooperation model. The Pareto front found by a
parallel EA is partitioned and serves as a guide to multiple tabu search
tasks.
Synchronous versus Asynchronous
Usually in MOEAs a set of non-dominated solutions is found as the result of
the optimization. When the cooperating subpopulation model is used, every
single processor contributes one part of the non-dominated set. In elitist
MOEAs like SPEA2, the non-dominated solutions are usually stored in an
archive. In other algorithms like NSGA-II, there is no archive, as the main
population contains the non-dominated solutions. In any of these cases, the
set of non-dominated solutions must be updated as soon as a processor
finishes its optimization task.
Apart from the way the subpopulations are created, we must ensure that
the processors obtain good convergence and diversity of solutions. For this
purpose, in some cases each processor can run the optimization several
times, as shown in Algorithm 3. Algorithm 3 is basically used on a set of
homogeneous systems. The termination criterion could be a fixed number of
runs on each processor (in many cases one iteration has been selected). In
this algorithm, "Initiate subpopulations" deals with dividing the
objective/search space in order to build the subpopulations. "Migration of
individuals" refers to methods in which processors communicate with each
other and exchange some of their individuals as migrants.

Algorithm 3 Synchronous cooperating subpopulations
  Initiate subpopulations
  repeat
    Wait for results of all processors
    Migration of individuals if any
    Update archive if any
  until Termination condition met
  Return archive
In reality, we typically deal with heterogeneous systems, for which this
algorithm is not suitable. In heterogeneous systems, there are different
computing resources including very fast and very slow processors. According
to Algorithm 3, all of the processors have to wait for the slowest one. In
order to deal with these systems, Algorithm 4 is proposed. In this
algorithm, whenever a processor returns its results, they can be immediately
integrated into the archive. Based on the quality of the obtained archive, a
suitable new subpopulation can be selected for that processor. This makes
the approach particularly suitable for heterogeneous computer clusters such
as Grids, where very fast processors are used along with rather slow ones.
It is not necessary to wait for the slowest processor to return its results.
Here the processors can communicate indirectly through the archive. Note
that migration is not as straightforward as before.
Algorithm 4 Asynchronous cooperating subpopulations (Heterogeneous systems)
  Initiate an empty archive
  Initiate subpopulations
  repeat
    if A processor returns results then
      Update archive
      Determine its new subpopulation
    end if
  until Termination condition met
  Return archive
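The control flow of Algorithm 4 can be sketched in Python with the standard `concurrent.futures` module. The sketch makes several assumptions that go beyond the pseudocode: worker threads stand in for the processors, a "subpopulation" is abstracted into an arbitrary task object, `optimize(task)` is a hypothetical user-supplied function returning a list of objective vectors (minimization), `next_task(archive)` is a hypothetical rule for determining the new subpopulation, and termination is a fixed budget `max_tasks` of optimization runs.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive, candidates):
    """Insert candidates into the archive, keeping only non-dominated vectors."""
    for c in candidates:
        if any(dominates(a, c) or a == c for a in archive):
            continue                                  # c adds nothing new
        archive[:] = [a for a in archive if not dominates(c, a)]
        archive.append(c)

def run_async(optimize, initial_tasks, next_task, n_workers, max_tasks):
    """Asynchronous cooperation: integrate each result as soon as it arrives."""
    archive = []                                      # initiate an empty archive
    submitted = 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        pending = set()
        for task in initial_tasks:                    # initiate subpopulations
            pending.add(pool.submit(optimize, task))
            submitted += 1
        while pending:                                # until termination
            finished, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in finished:                   # a processor returned results
                update_archive(archive, future.result())
                if submitted < max_tasks:             # determine its new subpopulation
                    pending.add(pool.submit(optimize, next_task(archive)))
                    submitted += 1
    return archive
```

No worker ever waits for the slowest one: a new task is submitted as soon as any result has been merged, and the workers interact only indirectly through the archive that `next_task` inspects.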
Mostaghim et al. (2007) integrate a hypervolume-based method into the
optimization routine on every processor. For initializing a subpopulation, a
guide is selected according to its marginal hypervolume. The hypervolume is
the area dominated by all solutions stored in the archive (Chapter 14). The
marginal hypervolume of a solution is the area that is dominated by that
solution but not by any other solution. The guide is the solution from the
archive which has not been selected before and which has the largest
marginal hypervolume. After selecting the guide, a Multi-Objective Particle
Swarm Optimization method is used to move its subpopulation toward the
guide, hence searching the area around the guide.
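For two objectives, the marginal hypervolume and the guide selection can be sketched as follows. The sketch assumes minimization and a user-chosen reference point `ref` that bounds the dominated area; neither the reference point nor the function names are taken from Mostaghim et al. (2007), and the marginal hypervolume is computed naively as the difference of two hypervolumes.

```python
def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded by ref (two objectives, minimization)."""
    # Only points that are strictly better than ref in both objectives contribute.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:                       # ascending in the first objective
        if f2 < best_f2:                     # point is not dominated by earlier ones
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

def marginal_hypervolume(points, index, ref):
    """Area dominated by points[index] and by no other point."""
    rest = points[:index] + points[index + 1:]
    return hypervolume_2d(points, ref) - hypervolume_2d(rest, ref)

def select_guide(archive, ref, already_selected=()):
    """Index of the not yet selected archive member with the largest marginal hypervolume."""
    candidates = [i for i in range(len(archive)) if i not in already_selected]
    return max(candidates, key=lambda i: marginal_hypervolume(archive, i, ref))
```

For the archive `[(1, 5), (2, 2), (5, 1)]` with `ref = (6, 6)`, the solution `(2, 2)` covers the largest exclusive area and would be chosen as the first guide.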
13.2.2 Level 2: Problem Independent Parallel Intra-algorithm
Most metaheuristics are iterative methods. In this model, we parallelize a
single iteration of the algorithm. Our concern in this model is only with
search mechanisms that are problem independent, such as the evaluation of
the neighborhood in local search and the reproduction mechanism in
evolutionary algorithms.
Basic Concept
During an optimization, we have to evaluate the fitness values of candidate
solutions (individuals). If we use benchmark problems or simple applications
to evaluate fitness values, the calculation time is negligible. However, a
real application sometimes needs huge computational time, e.g., when using
computational fluid dynamics (CFD), electromagnetic field analysis, the
finite element method (FEM), etc.; see Okabe et al. (2003); Okabe (2004).
In this situation, the total calculation time becomes too large, and it is
generally impossible to obtain a result in a reasonable calculation time.
Let us assume that the number of individuals, the maximum number of
generations, the number of objectives and the calculation time of the $i$th
objective function are $n$, $g$, $k$ and $t_i$, respectively. The total
calculation time in evolutionary multiobjective optimization, denoted by
$T$, can be easily calculated as follows:

$$T = g\,n \sum_{i=1}^{k} t_i + g\,\alpha = g\,n\,t + g\,\alpha, \qquad (13.1)$$

where $\alpha$ is the time that the genetic operators need in one generation
and $t = \sum_{i=1}^{k} t_i$. If $n = 100$, $g = 500$, $\alpha \approx 0$ and
$t = 3$ (days), which is a real example using a CFD solver, the total
calculation time is about 411 years!
Nevertheless, the problem still has to be optimized.
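The figure of about 411 years can be checked with a few lines of Python. The function below simply evaluates Eq. (13.1); the name `total_time`, the argument `alpha` for the per-generation time of the genetic operators, and the conversion with 365 days per year are choices made for this illustration.

```python
def total_time(n, g, t, alpha=0.0):
    """Total sequential calculation time according to Eq. (13.1).

    n: individuals, g: generations, t: time of one full fitness evaluation,
    alpha: time of the genetic operators per generation (same unit as t).
    """
    return g * n * t + g * alpha

days = total_time(n=100, g=500, t=3.0)   # 500 * 100 * 3 = 150000 days
years = days / 365.0                      # approximately 411 years
```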
To tackle this problem, parallel calculation is often used. The basic idea
is shown in Fig. 13.1. This type of parallelization is called the
master-slave model or global parallelization, e.g., Branke et al. (2004);
Cantu-Paz (1997a); Veldhuizen et al. (2003). The optimizer running on a
master node carries out the overall calculation, including initialization,
crossover, mutation and selection, except for the evaluation of individuals.
In evolutionary computation, several individuals in a population have to be
evaluated, and the evaluation of each individual is completely independent
of the other evaluations. Therefore, in Fig. 13.1, each evaluation is done
on a different slave node. The master node generates a population, e.g. car
designs. Then, the master node distributes the individuals to several
independent slave nodes. On the slave nodes, the evaluations of the
individuals, e.g. car designs, are carried out simultaneously. Thereafter,
the fitness values are gathered by the master node. Based on the fitness
values, the master node selects promising individuals and generates new
individuals by genetic operators. This flow is repeated until a given
termination condition is met. Since several time-consuming evaluations are
carried out at the same time, the total calculation time is dramatically
reduced.
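The master-slave flow described above can be sketched as follows. In this illustration, worker threads from Python's `concurrent.futures` module stand in for the slave nodes; on a real cluster the same structure would typically be realized with message passing. The functions `evaluate` (one fitness evaluation) and `select_and_vary` (selection, crossover and mutation on the master) are hypothetical user-supplied callables.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_population(population, evaluate, n_slaves):
    """Distribute the independent evaluations and gather the fitness values."""
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        # map() returns the results in the order of the submitted individuals.
        return list(pool.map(evaluate, population))

def master_loop(init_population, evaluate, select_and_vary, n_generations, n_slaves):
    """Master node: everything except the evaluation is done sequentially here."""
    population = init_population
    fitness = evaluate_population(population, evaluate, n_slaves)
    for _ in range(n_generations):
        population = select_and_vary(population, fitness)              # master only
        fitness = evaluate_population(population, evaluate, n_slaves)  # slaves
    return population, fitness
```

Because the master waits for all evaluations of a generation before continuing, this sketch corresponds to the synchronous variant; the time per generation is governed by the slowest evaluation.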
Calculation Time
Now, we consider when a calculation should be parallelized using the
master-slave model. Let the total calculation time without and with
parallelization and the number of nodes be $T^{wo}$, $T^{w}$ and $N$,
respectively. As an example, we also assume $n = N$ (the number of available
nodes is the same as the number of individuals). Since the master node can
be used not only for managing the total calculation but also for fitness
evaluation, the master node also contributes to fitness evaluation. One can
easily obtain the following equations for $T^{wo}$ and $T^{w}$:

$$T^{wo} = g\,n\,t + g\,\alpha = g\,N\,t + g\,\alpha, \qquad (13.2)$$

$$T^{w} = g\,\alpha + g\,t + g\,(N-1)\,T_{DT}, \qquad (13.3)$$

where $T_{DT}$ is the time necessary for data transfer from the master node
to one slave node and from one slave node to the master node in one
generation.
Fig. 13.1. Master-slave model for parallelization.

Now, the efficiency of parallelization, denoted by $\eta$, is calculated as:

$$\eta = \frac{T^{wo}}{N\,T^{w}} \times 100\ (\%). \qquad (13.4)$$

The numerator is the total resources for calculation when parallelization
is not used. The denominator is the total resources for calculation when
parallelization is used. Since one master node and $(N-1)$ slave nodes are
occupied for the time $T^{w}$, the total resources are $N\,T^{w}$. If $\eta$
approaches 100%, the parallelization is very useful. Conversely, if $\eta$
approaches 0%, the parallelization should not be done.
Using Eq. (13.2) and Eq. (13.3), Eq. (13.4) can be written as follows:

$$\eta = \frac{N\,t + \alpha}{N\,\bigl(t + \alpha + (N-1)\,T_{DT}\bigr)} \times 100. \qquad (13.5)$$

Eq. (13.5) leads to the following results:
• If $\alpha \ll t$ and $T_{DT} \ll t$, $\eta$ is nearly 100%. This means
  that if the calculation time needed for one fitness evaluation is
  sufficiently larger than $\alpha$ (for genetic operators) and $T_{DT}$ (for
  data transfer), we should parallelize the calculation.
• If $\alpha \approx 0$, one can easily obtain the following relation:

$$\eta = \left(1 - \frac{(N-1)\,T_{DT}}{(N-1)\,T_{DT} + t}\right) \times 100. \qquad (13.6)$$

  This equation means that the smaller the value of $t$ is, the worse the
  efficiency $\eta$ is.
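The two observations drawn from Eq. (13.5) can be reproduced numerically. The function below evaluates Eq. (13.5) directly; the name `efficiency`, the argument `alpha` for the per-generation time of the genetic operators, and the sample values are illustrative assumptions, not measurements.

```python
def efficiency(N, t, alpha, T_DT):
    """Parallel efficiency in percent according to Eq. (13.5).

    N: number of nodes, t: time of one fitness evaluation,
    alpha: time of the genetic operators per generation,
    T_DT: data transfer time per slave node and generation (all in one unit).
    """
    return 100.0 * (N * t + alpha) / (N * (t + alpha + (N - 1) * T_DT))

# Expensive evaluation: transfer and operator times are negligible.
expensive = efficiency(N=10, t=3600.0, alpha=0.1, T_DT=0.1)   # close to 100 %
# Cheap evaluation: the data transfer dominates.
cheap = efficiency(N=10, t=1.0, alpha=0.0, T_DT=1.0)          # 10 %
```

With `alpha = 0`, the second call agrees with Eq. (13.6): 1 - 9/(9 + 1) = 0.1, i.e. 10%.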
Survey
From the beginning of research on evolutionary algorithms for
single-objective optimization, parallelization techniques have received
attention because of the population-based nature of evolutionary algorithms.
There are many surveys in the literature, e.g., Schmeck et al. (2001);
Bethke (1976); Adamidis (1994); Cantu-Paz (1997a,b). As a natural extension,
parallelization is also used for evolutionary multiobjective optimization.
Stanley and Mudge (1995) propose a framework for parallel genetic
algorithms called Genetic Algorithm running on the INternet (GAIN).
Different architectures are often used for parallel computation because
homogeneous computers are not readily available. This situation leads to
different computational times for the fitness evaluations on the slave
nodes. If the computational time on a certain node differs from the others,
the efficiency of parallel computation decreases dramatically because the
faster slave nodes are idle much of the time. To solve this problem, Stanley
and Mudge propose GAIN. The idle time is reduced based on a given parameter
that determines the maximum number of pending evaluations. If the number of
unevaluated individuals exceeds this number, the generation process sleeps.
Otherwise, the generation process is carried out even if unevaluated
individuals exist. The results of GAIN show a robust and good performance.
Watanabe et al. (2002) extend the original master-slave model to maintain
a higher diversity of the population. They call this extension the
master-slave model with local cultivation (MSLC). In this model, two
randomly selected individuals are sent to a slave node. Using these two
individuals, most genetic operators are carried out on the slave node.
However, in each generation, all individuals distributed to the slave nodes
are gathered and ranked again on the master node. Since most of the
calculation is done on the slave nodes, the problem of the master-slave
model, i.e. the higher computational cost on the master node, is solved.
de Toro Negro et al. (2002) propose a parallel multiobjective evolutionary
algorithm called Parallel Single Front Genetic Algorithm (PSFGA) as an
extension of the Single Front Genetic Algorithm (SFGA) based on the
master-slave model. The characteristics of SFGA are as follows: only the
non-dominated individuals can join the recombination process, all
non-dominated individuals are copied to the next population, and the
remaining individuals needed to complete the population are obtained by
recombination and mutation of the non-dominated individuals. In PSFGA, the
population is divided into several sub-populations based on fitness values.
In each sub-population, the original SFGA is carried out. After the
execution of SFGA, all individuals are gathered by a master node. They
conclude that parallelization is very helpful not only for the reduction
of computational cost but also for the preservation of diversity.
Coello and Sierra (2004) study the parallelization of a coevolutionary
multiobjective evolutionary algorithm, which they parallelize based on the
master-slave model. The population is divided into several sub-populations
according to the search region. In each generation, the sub-populations
cooperate or compete. In Coello and Sierra (2004), the parallel algorithm is
compared with the serial (original) algorithm and shows better results in
terms of both solution accuracy and computational cost.
Veldhuizen et al. (2003) discuss parallel evolutionary multiobjective
optimization. In Veldhuizen et al. (2003), the master-slave model, island
model, diffusion model and hybrid model are discussed, and their calculation
times are also compared.
Dubreuil et al. (2006) analyze the master-slave model for distributed evolu-
tionary computation theoretically. This paper builds a theoretical framework
for the master-slave model and validates the framework empirically based on
the Distributed BEAGLE C++ framework. They conclude that contrary to
popular belief, the master-slave model can scale well.
Recently, many applications which need time-consuming fitness evaluations
have been successfully optimized using the master-slave model. Due to the
page limitation, only a few of them are introduced here, e.g., Jones et al.
(1998); Sasaki et al. (2000); Okabe et al. (2003). Jones et al. (1998)
parallelize a genetic algorithm for the aerodynamic and aeroacoustic
optimization of airfoils. Despite the time-consuming multidisciplinary
fitness evaluations, they successfully show good results by using
master-slave parallelization. Since their fitness evaluations have a huge
computational cost, their efficiency of parallelization reaches nearly
100%. Sasaki et al. (2000) optimize the design of a wing for supersonic
transport using a multiobjective genetic algorithm. To cope with the huge
computational cost, a simple master-slave model is used. They obtain
successful results with better performance. Okabe et al. (2003) optimize the
shape of a micro heat exchanger using commercial computational fluid
dynamics software. To reduce the huge computational cost, the algorithm is
parallelized based on the master-slave model and successfully optimizes the
shape. In Okabe et al. (2003), the necessary conditions for parallel
optimization using a commercial solver are also discussed.
As introduced above, there are many papers proposing new efficient
methods for the master-slave model and showing successful optimization
results obtained with the master-slave model. Since real multiobjective
optimization problems are becoming more complicated, this type of
parallelization will attract much attention as a means of obtaining the
optimal design of applications in reasonable time.
13.2.3 Level 3: Problem Dependent Parallelization
In this model, problem-dependent operations are parallelized. In general, the
interest here is the parallelization of the evaluation of a single solution
(different objectives and/or constraints). The parallel models may be based
on data partitioning or task partitioning. This model is useful in MOPs with
time and/or memory intensive objectives and constraints. It may also be
useful in MOPs with uncertainty, which in general need repeated evaluations
of the objectives.
Basic Concept
In the last section, the evaluations in a generation were parallelized.
However, even if the evaluations are parallelized, one fitness evaluation is
sometimes still time-consuming. To solve this problem, we discuss the
parallelization of each evaluation in this section. Possible ways to
parallelize one fitness evaluation are listed as follows:
1. Several solvers:
   Consider a multiobjective optimization of a car design as an example. To
   design a car, several disciplines should be considered. Examples are to
   optimize the air flow around a car and the toughness of the materials of
   a car. To optimize this problem, two independent solvers are necessary,
   i.e. a CFD solver for the air flow and FEM for the toughness of the
   materials. If one computer is used to evaluate the two objectives, many
   users will first use the CFD solver to obtain the first objective
   function and then use the FEM solver to obtain the second objective
   function, or vice versa. Some users will execute the CFD and the FEM
   solvers at the same time. However, the total calculation time is then
   nearly the same as in the sequential case because the computational
   resources are shared by the two solvers. It is therefore reasonable to
   execute the CFD and the FEM solvers at the same time on different
   computers. Although the idle time caused by the different calculation
   times of the CFD and the FEM solvers cannot be avoided, the total
   calculation time becomes shorter.
2. Decomposition of one fitness evaluation:
Consider the evaluation of a big product which consists of several parts.
A simple idea to reduce the computational time is to evaluate each part
separately and merge the results. Generally, a CFD calculation for a big
product is extremely time-consuming and consumes a huge amount of memory. To
tackle these problems, the domain decomposition method (DDM) is often used in
the CFD research field, e.g., Elleighy and Tanaka (2001). The calculation
domain is divided into several parts, which are assigned to different
computers. Each computer calculates only its assigned part. To keep the
calculations consistent, the boundary conditions are exchanged regularly. This
division reduces the calculation time and memory consumption. However, since
the boundary conditions are exchanged regularly, fast connections among the
computers are necessary. Furthermore, the user should choose the division so
that the boundaries are small and the calculation cost is balanced across the
computers.
3. Multiple runs for one fitness evaluation:
A fitness evaluation sometimes needs several runs of a solver with different
calculation conditions. An example is optimization under uncertainty.
Recently, the robustness of the fitness value against variance in the design
parameters has attracted much attention, in particular from researchers and
practitioners working on real applications. In a real application, it is
impossible to manufacture a product exactly according to the optimal design
parameters because some variations are unavoidable. Therefore, it is very
important to obtain a robust and (nearly) optimal design. To find such a
design, multiple runs of a solver are sometimes necessary. Assume that an
optimizer obtains the design parameter x. To assess the robustness against
variance of the design parameter, the fitness values of x + dx and x − dx
should also be evaluated. In this situation, it is reasonable to execute
three solver runs with different design parameters on different computers
simultaneously. Simultaneous execution reduces the calculation time.
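The third case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `solver` is a hypothetical stand-in for one expensive solver run, and a thread pool stands in for the separate computers mentioned in the text (in practice each run would typically be an external process or a remote job).

```python
from concurrent.futures import ThreadPoolExecutor


def solver(x):
    """Hypothetical stand-in for one expensive solver run at design parameter x."""
    return (x - 1.0) ** 2


def robust_fitness(x, dx, evaluate=solver):
    """Evaluate x - dx, x, and x + dx concurrently and return the three values."""
    points = [x - dx, x, x + dx]
    # One worker per run, mirroring "three solvers on different computers".
    with ThreadPoolExecutor(max_workers=len(points)) as pool:
        values = list(pool.map(evaluate, points))
    return dict(zip(("minus", "nominal", "plus"), values))


result = robust_fitness(1.0, 0.1)
# A simple (assumed) robustness measure: the worst of the three fitness values.
worst_case = max(result.values())
```

How the three values are combined into one robust fitness value is a design choice; taking the worst case, as above, is just one option.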
Calculation Time
For the situations above, the total calculation time is considered here. Based
on the equations below, we discuss when a calculation should be parallelized
and when it should not.
1. Several solvers:
Assume a k-objective optimization problem in which time $t_i$ is needed to
evaluate the $i$th objective function and N nodes are available for
calculation. As an example, N = k is assumed. The total calculation times
without and with parallelization are:
$$t_{wo} = \sum_{i=1}^{k} t_i = \sum_{i=1}^{N} t_i, \tag{13.7}$$
$$t_w = \max(t_i) + T_{DT}, \tag{13.8}$$
where $t_{wo}$ and $t_w$ are the times needed for the k objective evaluations
without and with parallelization, and $T_{DT}$ is the time needed for data
transfer, see Fig. 13.2 (a). The maximum of all $t_i$ is denoted by
$\max(t_i)$. Using these equations, the efficiency of parallelization, denoted
$\eta$, is obtained as:
$$\eta = \frac{t_{wo}}{N t_w} \times 100 = \frac{\sum_{i=1}^{N} t_i}{N \max(t_i) + N T_{DT}} \times 100. \tag{13.9}$$
This equation leads to the following results when $T_{DT}$ is negligible:
• If all $t_i$ are the same, the efficiency of parallelization is 100%.
• Otherwise, the efficiency is reduced by the idle time of the computers
  with shorter calculations.
If $T_{DT}$ is not negligible, the efficiency is reduced. In the worst case,
the efficiency is nearly 0% when $T_{DT} \gg t_i$, which means that
parallelization should not be used.
2. Decomposition of one fitness evaluation:
Assume that N nodes are available for calculation and one problem is
decomposed into N sub-problems. The total calculation times of the whole
problem and of one sub-problem are assumed to be $t_{all}$ and $t_{sub}$,
respectively. Here, the decomposition is assumed to be ideal, i.e., the time
for all sub-problems is the same. The number of boundaries caused by the
decomposition and the time of internal data transfer per boundary are denoted
by $B$ and $T_{in}$. After decomposition, each domain is solved separately.
However, to account for the relations among neighboring domains, the boundary
information must be updated regularly. One obtains the following equations:
$$t_{wo} = t_{all}, \tag{13.10}$$
$$t_w = t_{sub} + (N-1) T_{DT} + B T_{in}. \tag{13.11}$$
Here, the number of boundaries of each decomposed domain is assumed to be the
same for all domains. The variable $T_{DT}$ is the time of data transfer for
the initial data. Since $t_{all}$ is approximately $N t_{sub}$, one obtains
the following efficiency $\eta$ (see Fig. 13.2 (b)):
$$\eta = \frac{t_{wo}}{N t_w} \times 100 = \left( 1 - \frac{(N-1) T_{DT} + B T_{in}}{(N-1) T_{DT} + B T_{in} + t_{sub}} \right) \times 100. \tag{13.12}$$
In the domain decomposition method, $T_{in}$ is generally very high.
Therefore, $T_{in}$ should be reduced by using fast connections among the
nodes. Furthermore, users should look for a way to reduce the number of
boundaries, $B$.
3. Multiple runs for one fitness evaluation:
Following the same reasoning as for the master-slave model, it is easy to
obtain the following equation for the efficiency $\eta$:
$$\eta = \left( 1 - \frac{(N-1) T_{DT}}{(N-1) T_{DT} + t} \right) \times 100, \tag{13.13}$$
where $N$, $T_{DT}$ and $t$ are the number of runs needed for one fitness
value, the time for data transfer, and the time for one fitness evaluation,
respectively. As discussed for the master-slave model, the calculation should
be parallelized when $t \gg T_{DT}$.
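As a quick numerical check of Eqs. (13.9), (13.12) and (13.13), the three efficiencies can be computed directly. The function and argument names below are illustrative choices, not notation from the text; all times are assumed to be in the same unit.

```python
def efficiency_several_solvers(t, t_dt):
    """Eq. (13.9): t is the list of per-objective times t_i, with N = len(t) nodes."""
    n = len(t)
    t_wo = sum(t)            # Eq. (13.7): sequential time
    t_w = max(t) + t_dt      # Eq. (13.8): parallel time
    return 100.0 * t_wo / (n * t_w)


def efficiency_decomposition(n, t_sub, t_dt, b, t_in):
    """Eq. (13.12): ideal decomposition into n sub-problems with b boundaries."""
    overhead = (n - 1) * t_dt + b * t_in
    return 100.0 * (1.0 - overhead / (overhead + t_sub))


def efficiency_multiple_runs(n, t, t_dt):
    """Eq. (13.13): n solver runs of time t each for one fitness value."""
    overhead = (n - 1) * t_dt
    return 100.0 * (1.0 - overhead / (overhead + t))


# Equal solver times and negligible transfer give 100%;
# unequal times leave the faster node idle.
balanced = efficiency_several_solvers([10.0, 10.0], 0.0)
unbalanced = efficiency_several_solvers([10.0, 5.0], 0.0)
```

With solver times of 10 and 5 and no transfer cost, the second node is idle half of the time, so the efficiency drops to 75%.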
The three models for parallel metaheuristics may be used in conjunction
within a hierarchical structure. In Meunier et al. (2000) and Talbi and
Meunier (2006), this hierarchical architecture has been adopted to solve a
complex multiobjective network design problem. At level 1, a parallel
self-contained cooperative model based on evolutionary algorithms (island
model) and local search has been used. At level 2, a parallel evaluation model
for a steady-state evolutionary algorithm has been used, in which the
evaluation phase of the algorithm is done in parallel and asynchronously.
These first two parallel models are independent of the target MOP. Finally,
at level 3, a parallel synchronous decomposition model has been used, in which
the evaluation of a single solution is carried out in parallel by partitioning
the geographical domain.
[Figure: timing diagrams for nodes 1, 2, ..., N. Legend: data transfer, working (fitness evaluation), idle; panel (b) also shows internal data transfer. Panels: (a) Several solvers, (b) Decomposition.]
Fig. 13.2. Necessary calculation time of several solvers and decomposition.
13.3 Parallel Models for Non-heuristic Methods
Parallelization of exact optimization methods, particularly branch and bound
methods, has been widely studied in the literature (refer to Talbi (2006)).
However, to the best of our knowledge it is rarely tackled in the
multiobjective context. For example, at the 11th MCDM conference (1994),
Antunes and Tsoukiàs (1997) survey new developments in computer science and
try to explore their specific relevance for the field of MCDA. The field of
distributed computing is mentioned (p. 382f.) with the potential to integrate
different MC models and methods in a single MCDA system. The benefit of
parallel computing is seen in decomposing the problems, but no reference to
any existing work is given.¹ Although there were presentations of parallel
approaches at MCDM conferences, these papers have rarely been included in
the official proceedings. For example, at the 15th MCDM conference (2000)
there were at least two papers on parallel MCDM methods² but none of them
appeared in the official proceedings prepared by Köksalan and Zionts (2001).
This might be the reason why only a short list of publications is presented
next.
¹ The potential benefits of fuzzy sets and neural networks have been discussed,
whereas evolutionary algorithms and related metaheuristics were not an issue at
that time.
13.3.1 High Level Parallel Models
High level parallel models embrace approaches in which sequential MCDM
methods are run independently in parallel, with no or only occasional
information exchange. The simplest example is probably the idea of running
several instances of the same algorithm with scalarized objective functions
but different weights. This approach yields an approximation of the Pareto
front and set. More flexibility is provided by the OpTiX-II software framework
described by Grauer and Boden (1997). Here, the user may specify which MCDM
methods are run in parallel on workstation clusters, and which information
they are going to exchange. Unfortunately, numerical results are given for
single-objective optimization only.
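The simplest high level model, several independent weighted-sum runs, might look as follows. This is a minimal sketch under assumptions: the biobjective test problem, the grid-search "solver" and the thread pool are all illustrative stand-ins, not part of any method cited above.

```python
from concurrent.futures import ThreadPoolExecutor


def f1(x):
    return x * x


def f2(x):
    return (x - 2.0) ** 2


def solve_weighted_sum(w, lo=0.0, hi=2.0, steps=2000):
    """Minimise w*f1 + (1-w)*f2 on [lo, hi] by grid search (stand-in for a real solver)."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    best_x = min(grid, key=lambda x: w * f1(x) + (1.0 - w) * f2(x))
    return best_x, (f1(best_x), f2(best_x))


# Each weight defines an independent run; no information is exchanged.
weights = [i / 10 for i in range(11)]
with ThreadPoolExecutor(max_workers=4) as pool:
    front = list(pool.map(solve_weighted_sum, weights))
```

Each entry of `front` pairs a solution with its objective vector. Note that weighted sums can only reach points on the convex part of the Pareto front, so for non-convex problems this approximation is incomplete.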
13.3.2 Low Level Parallel Models
Low level parallel models represent approaches in which parts of the
sequential MCDM method are parallelized. For example, Galperin (1992)
proposed a new unscalarized method for MCDM based on his concept of balance
numbers. In this procedure, m subproblems can be solved independently in
parallel (p. 81) in each iteration. Apparently, the parallelization was only
outlined but not realized.
In the case of interrelated multiobjective linear (MOLP) problems, Volkovich
(1997) parallelizes the search for local solutions. Again, there are no results
regarding speedup or efficiency. The situation changes for the parallel method
for MOLPs presented by Wiecek and Zhang (1997). They achieve a speedup of
about 27 when using 32 processors for large problems. They deploy the
technique of task partitioning by using an ADBASE solver on each processor.
Another reason for using parallel hardware arises with interactive MCDM
methods: to foster interactivity during robustness analysis and to keep the
decision maker from giving up too early (due to impatience and/or time
constraints), fast response times are essential. For this purpose, Costa and
Climaco (1994) calculated solutions in parallel when using multiple reference
points on a four processor system. They extended their work on parallelization
to other interactive MCDM methods: in the course of the ELECTRE III method,
the credibility indices that define a fuzzy outranking relation can be
calculated independently (and hence in parallel) for different pairs of
alternatives. Moreover, they also parallelized the four subproblems of the
distillation algorithm used in this method. A speedup of about 7 was achieved
for 16 processors by Dias et al. (1997). Similarly, the preference indices of
the interactive PROMETHEE method can be calculated independently and therefore
in parallel. Dias et al. (1998) achieve a speedup of about 15 for 16
processors in the best case.
² See the program at http://mcdm2000.ie.metu.edu.tr/tentprog.htm
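The independence of pairwise indices can be illustrated with a small sketch. It computes PROMETHEE-style aggregated preference indices, using the simple "usual" preference function (full preference as soon as one alternative is strictly better on a criterion). The alternatives, weights and thread pool are hypothetical, and this is not the implementation of Dias et al.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import permutations

# Hypothetical data: criterion values to be maximised, weights summing to 1.
alternatives = {"a": (8.0, 3.0), "b": (6.0, 5.0), "c": (7.0, 7.0)}
weights = (0.6, 0.4)


def preference_index(pair):
    """Weighted share of criteria on which the first alternative beats the second."""
    first, second = pair
    g_first, g_second = alternatives[first], alternatives[second]
    value = sum(w for w, x, y in zip(weights, g_first, g_second) if x > y)
    return pair, value


# Every ordered pair is an independent task, so the pairs can be
# distributed over the available processors without communication.
pairs = list(permutations(alternatives, 2))
with ThreadPoolExecutor(max_workers=4) as pool:
    pi = dict(pool.map(preference_index, pairs))
```

With n alternatives there are n(n − 1) such tasks of equal size, which is consistent with the good speedups reported above.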
Low level parallel models are also used for exact methods for combinatorial
multiobjective problems. In particular, for biobjective flowshop problems
Dhaenens et al. (2006) discuss parallelization of the weighted sum method
with dichotomic search: after a new solution is found, two new searches are
launched. Consequently, many processors are idle in the early phase of the
search, resulting in poor speedup/efficiency measures. A similar approach can
be deployed for the two-phase method proposed by Lemesre et al. (2007a), with
the same disadvantages as the previous method. Speedups do not exceed
1.7 for four processors. Lemesre et al. (2007b) achieved similar performance for
the partitioning parallel method. In the first phase they use task partitioning
and then space partitioning.
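Why the early phase leaves processors idle can be seen from a small sketch of dichotomic search over a finite set of objective vectors (both objectives minimised). The point set and function names are illustrative assumptions, and each "search" is a plain minimum over the list rather than an exact solver call.

```python
def weighted_min(points, l1, l2):
    """One scalarised search: the point minimising l1*f1 + l2*f2."""
    return min(points, key=lambda p: l1 * p[0] + l2 * p[1])


def dichotomic_search(points):
    """Return the supported points found and the number of searches per level."""
    a = min(points, key=lambda p: (p[0], p[1]))  # lexicographic minimum of f1
    b = min(points, key=lambda p: (p[1], p[0]))  # lexicographic minimum of f2
    found = {a, b}
    level = [(a, b)] if a != b else []
    widths = []
    while level:
        widths.append(len(level))  # searches that could run in parallel now
        next_level = []
        for left, right in level:
            # Weights orthogonal to the line through left and right.
            l1, l2 = left[1] - right[1], right[0] - left[0]
            c = weighted_min(points, l1, l2)
            if l1 * c[0] + l2 * c[1] < l1 * left[0] + l2 * left[1]:
                found.add(c)
                # A new solution launches two new searches.
                next_level += [(left, c), (c, right)]
        level = next_level
    return sorted(found), widths


points = [(1, 10), (2, 6), (4, 4), (7, 2), (10, 1), (5, 5), (8, 8)]
supported, widths = dichotomic_search(points)
```

The list `widths` starts at 1 and can at most double from one level to the next, so with many processors most of them have nothing to do at the beginning.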
Needless to say, hybridizations of metaheuristics and exact methods do
have some potential in case of parallelization. Basseur et al. (2004) propose a
parallel hybrid model combining an exact approach (branch and bound) and
a metaheuristic. The parallel metaheuristic is used to approximate the Pareto
front. The parallel branch and bound is used to solve sub-problems and to
improve the quality of the obtained Pareto front.
We are aware that this list of publications is not complete. But it reveals
that there is considerably less work on the parallelization of non-heuristic and
exact methods than on that of metaheuristics. Finally, we would like to
emphasize that the deployment of parallel hardware for interactive
environments might also be fruitful for multiobjective metaheuristics.
13.4 Parallel Implementation and Deployment
Parallel and distributed architectures can differ in their memories (shared/
distributed), computation resources (homogeneous/heterogeneous), and networks
(local/large). These properties have a strong impact on deployment techniques
such as load balancing and fault tolerance. Depending on the type of
architecture used, different parallel and distributed programming environments
such as message passing (PVM, MPI), shared memory (multi-threading, OpenMP),
high throughput computing (Condor), and Grid computing (Globus) can be used.
We briefly study some of these issues in the following.
13.4.1 Shared Memory versus Distributed Memory
The main advantage of parallel MOP algorithms implemented on shared memory
architectures such as SMPs and multi-core processors is their simplicity.
For example, it is easier to share data such as upper bounds in exact
algorithms and the best approximation of the non-dominated set found so far
in metaheuristics. However, parallel distributed architectures offer a more
flexible and fault-tolerant programming platform. Moreover, memory access
contention limits the number of processors in shared memory architectures.
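On a shared memory machine, sharing the best non-dominated set found so far can be as simple as a lock-protected archive. The sketch below assumes minimisation and uses Python threads; the class and method names are illustrative only.

```python
import threading


def dominates(p, q):
    """True if p is no worse than q in every objective and better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


class SharedArchive:
    """Non-dominated archive that several threads may update concurrently."""

    def __init__(self):
        self._lock = threading.Lock()
        self._points = []

    def offer(self, p):
        """Insert p unless it is dominated or already present; drop points p dominates."""
        with self._lock:
            if any(q == p or dominates(q, p) for q in self._points):
                return False
            self._points = [q for q in self._points if not dominates(p, q)]
            self._points.append(p)
            return True

    def snapshot(self):
        with self._lock:
            return sorted(self._points)


archive = SharedArchive()
candidates = [(1, 9), (5, 5), (3, 4), (9, 1), (4, 8), (3, 4)]
threads = [threading.Thread(target=archive.offer, args=(c,)) for c in candidates]
for th in threads:
    th.start()
for th in threads:
    th.join()
```

The single lock is also where the contention mentioned above shows up: every update serialises on it, which is one reason such designs stop scaling as the number of processors grows.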
13.4.2 Homogeneous and Dedicated versus Heterogeneous and Non-dedicated
Most massively parallel machines (MPP) and clusters of workstations (COW)
such as the IBM SP3 are composed of homogeneous processors and are generally
dedicated to the application. The proliferation of powerful workstations and
fast communication networks has led to the emergence of heterogeneous
networks of workstations (NOW) as platforms for high-performance computing.
COWs and NOWs constitute a low-cost hardware alternative for running parallel
algorithms. However, efficient task scheduling and fault tolerance mechanisms
for NOWs are more complex to design and analyze due to the heterogeneity of
those architectures (processors, networks, etc.) and a higher probability of
faults.
Melab et al. (2006a) focus on solving large problems using the Condor
environment. Their open source framework was originally intended for the
design and deployment of the three parallel models for meta-heuristics on
dedicated clusters and networks of workstations. Relying on the Condor
programming environment, it enables the execution of these applications on
volatile, non-dedicated, heterogeneous pools of computational resources.
Efficient load balancing and fault-tolerance mechanisms have been designed for
this purpose. Experiments have been carried out on more than 100 PCs
originally intended for education. The results obtained are convincing, both
in terms of flexibility and ease of implementation, and in terms of
efficiency, quality and robustness of the provided solutions at run time.
13.4.3 Tightly Coupled (Local Networks) versus Loosely Coupled (Large Networks)
Massively parallel machines, clusters and local networks of workstations may
be considered tightly coupled architectures. Large networks of workstations
and Grid computing platforms are loosely coupled and are affected by a higher
cost of communication. The larger the granularity of a model, the better suited
the model is for large networks.
Since the granularity of the self-contained parallel cooperation models
(level 1) is very high, they can be easily deployed on large scale architectures,
which are in general loosely coupled and have a high communication cost.
This model is also scalable in terms of the number of processors, which
is not the case for the other models (Problem Independent Parallel
Intra-Algorithm and Problem Dependent Parallelization). The latter models are
interesting when the evaluation of a single solution is CPU-time consuming
and/or Input/Output intensive.
Melab et al. (2006b) report some results on parallel cooperative
multiobjective meta-heuristics on computational grids. They particularly focus
on the island model and the multi-start model and their cooperation. They
propose a checkpointing-based approach to deal with the fault tolerance issue
of the island model. Existing Dispatcher-Worker grid middlewares are
inadequate for the deployment of parallel cooperative applications: they need
to be extended with a software layer to support the cooperation. Therefore,
the authors propose a Linda-like cooperation model and its implementation on
top of XtremWeb. This middleware is then used to develop a parallel
meta-heuristic applied to a bi-objective Flow-Shop problem using the two
models. The experiments were run on a multi-domain education network of 321
heterogeneous Linux PCs. The preliminary results, obtained after more than
10 days, demonstrate that grid computing makes it possible to exploit
different parallel models and their combination effectively for solving large
problem instances.
In terms of exact methods, the high level is more appropriate for large
networks. The most popular parallelization approach for the branch and bound
algorithm consists in building and exploring in parallel the search tree
representing the problem being tackled. The deployment of such a parallel
model on a grid raises the crucial issue of dynamic load balancing. The major
question is how to efficiently distribute the nodes of an irregular search
tree among a large set of heterogeneous and volatile processors. Mezmaz et al.
(2007) propose a new dynamic load balancing approach for the parallel branch
and bound algorithm on the computational grid. The approach is based on a
particular encoding of the tree nodes allowing a very simple description of
the work units distributed during the exploration. Such a description
optimizes the communications involved in the huge number of load balancing
operations. The approach has been applied to one instance of the bi-objective
flow-shop scheduling problem. The application has been run on a computational
pool of more than 1000 processors belonging to seven nation-wide clusters.
The optimal Pareto front has been generated within almost 6 days with a
parallel efficiency of 98%.
13.5 Conclusion and Future Trend
Parallel and distributed computing are powerful and necessary ways to reduce
the computation time of multiobjective optimization algorithms and/or
improve the quality of the obtained solutions. This chapter presents a general
overview of parallel approaches for multiobjective optimization. For this
purpose, we have proposed a taxonomy for parallel metaheuristics and exact
methods. We have covered both the design of algorithms and their
implementation on different parallel and distributed architectures. Different
parallel models have been proposed in the design of multiobjective
optimization algorithms. These models have been tested extensively on a wide
range of academic and real-life MOPs in different domains. The presented
models may be used in conjunction within a hierarchical structure.
Multiobjective optimization algorithms have been implemented and deployed
on different types of parallel and distributed architectures: clusters and
networks of workstations and shared memory parallel architectures. An
efficient implementation must consider the characteristics of the target
parallel model (granularity, synchronous, etc.) and architecture (homogeneity,
dedicated, etc.). For example, fine granularity models cannot easily be
deployed on large scale distributed systems.
In the last decade, Grid computing and Peer-to-Peer (P2P) computing
have become a real alternative to traditional high performance computing
architectures for the development of large-scale distributed applications. This
is a great challenge as Grid and P2P-enabled frameworks for multiobjective
optimization algorithms are emerging.
Designing generic software frameworks to deal with the design and efficient
transparent implementation of distributed multiobjective optimization
algorithms is another important aspect. Software frameworks such as
PARADISEO offer transparent implementation of different parallel models on
different architectures using suitable programming environments, as described
by Cahon et al. (2004) and Liefooghe et al. (2007).³
In the future, more and more applications in different domains such as MDO
(Multi-disciplinary Design Optimization), life sciences and industrial
applications will involve parallel multiobjective optimization. Designing
interactive multiobjective optimization approaches, which require real-time
parallel solving of MOPs, is another important challenge.
References
Adamidis, P.: Review of Parallel Genetic Algorithms Bibliography. Technical Report,
Aristotle University of Thessaloniki (1994)
Antunes, C., Tsoukiàs, A.: Against fashion: A travel survival kit in "modern" MCDA.
In: Multicriteria Analysis: International Conference on Multiple Criteria Decision
Making, pp. 378–389. Springer, Berlin (1997)
Baita, F., Mason, F., Poloni, C., Ukovich, W.: Genetic Algorithm with Redundancies
for the Vehicle Scheduling Problem. In: Biethahn, J., Nissen, V. (eds.) Evolutionary
Algorithms in Management Applications, pp. 341–353. Springer, Berlin (1995)
³ See the web site: http://paradiseo.gforge.fr for more details.
Basseur, M., Lemesre, J., Dhaenens, C., Talbi, E.-G.: Cooperation between branch
and bound and evolutionary approaches to solve a bi-objective flow shop problem.
In: Ribeiro, C.C., Martins, S.L. (eds.) WEA 2004. LNCS, vol. 3059, pp. 72–86.
Bethke, A.D.: Comparison of Genetic Algorithms and Gradient-based Optimizers on
Parallel Processors: Efficiency of Use of Processing Capacity. Logic of Computers
Group Technical Report 197, University of Michigan (1976)
Branke, J., Schmeck, H., Deb, K., Reddy, M.: Parallelizing Multi-Objective
Evolutionary Algorithms: Cone Separation. In: IEEE Congress on Evolutionary
Computation, pp. 1952–1957 (2004)
Bui, L.T., Abbass, H.A., Essam, D.: Local models - an approach to distributed
multiobjective optimization. Technical Report TR-ALAR-200601002, The Artificial
Life and Adaptive Robotics Laboratory, University of New South Wales, Australia
(2006)
Cahon, S., Melab, N., Talbi, E.-G.: ParadisEO: A framework for the reusable design
of parallel and distributed metaheuristics. Journal of Heuristics 10(3), 357–380
(2004)
Cantu-Paz, E.: A Survey of Parallel Genetic Algorithms. IlliGAL Report 97003,
University of Illinois (1997a)
Cantu-Paz, E.: Designing Efficient Master-slave Parallel Genetic Algorithms.
IlliGAL Report 97004, University of Illinois (1997b)
Coello Coello, C.A., Reyes Sierra, M.: A study of the parallelization of a
coevolutionary multi-objective evolutionary algorithm. In: Monroy, R., Arroyo-Figueroa, G.,
Sucar, L.E., Sossa, H. (eds.) MICAI 2004. LNCS (LNAI), vol. 2972, pp. 688–697.
for Solving Multi-Objective Problems. Kluwer Academic Publishers, New York
(2002)
Costa, J.P., Climaco, J.N.: A multiple reference point parallel approach in MCDM.
In: International Conference on Multiple Criteria Decision Making, pp. 255–263.
Springer, New York (1994)
de Toro Negro, F., Ortega, J., Fernandez, J., Diaz, A.: PSFGA: a parallel genetic
algorithm for multiobjective optimization. In: Euromicro Workshop on Parallel,
Distributed and Network-based Processing, pp. 384–391 (2002)
de Toro Negro, F., Ortega, J., Ros, E., Mota, S., Paechter, B., Martín, J.M.: PSFGA:
Parallel Processing and Evolutionary Computation for Multiobjective Optimisation.
Parallel Computing 30(5–6), 721–739 (2004)
Deb, K., Zope, P., Jain, S.: Distributed computing of Pareto-optimal solutions with
evolutionary algorithms. In: Fonseca, C.M., Fleming, P.J., Zitzler, E., Deb, K.,
(2003)
Dhaenens, C., Lemesre, J., Melab, N., Mezmaz, M., Talbi, E.-G.: Parallel exact
methods for multi-objective combinatorial optimization. In: Parallel Combinatorial
Optimization, John Wiley and Sons, Berlin (2006)
Dias, L.C., Costa, J.P., Climaco, J.N.: Conflicting criteria, cooperating processors -
some experiments on implementing a multicriteria support method on a parallel
computer. Computers and Operations Research 24(9), 805–817 (1997)
Dias, L.C., Costa, J.P., Climaco, J.N.: A parallel implementation of the
PROMETHEE method. European Journal of Operational Research 104(3),
521–531 (1998)
Dubreuil, M., Gagne, C., Parizeau, M.: Analysis of a Master-slave Architecture for
Distributed Evolutionary Computations. IEEE Transactions on Systems, Man,
and Cybernetics 36(1), 229–235 (2006)
Elleighy, W.M., Tanaka, M.: Domain Decomposition Coupling of FEM and BEM.
Transactions of the Japan Society for Computational Engineering and Science 4,
107–111 (2001)
Galperin, E.A.: Nonscalarized multiobjective global optimization. Journal of
Optimization Theory and Applications 75(1), 69–85 (1992)
Grauer, M., Boden, H.: OpTiX-II: A software environment for MCDM based on
distributed and parallel computing. In: Multicriteria Analysis: International
Conference on Multiple Criteria Decision Making, pp. 199–208. Springer, Berlin (1997)
Hiroyasu, T., Miki, M., Watanabe, S.: The New Model of Parallel Genetic Algorithm
in Multi-Objective Optimization Problems - Divided Range Multi-Objective
Genetic Algorithm. In: IEEE Congress on Evolutionary Computation, July 2000,
vol. 1, pp. 333–340. IEEE Computer Society Press, Piscataway (2000)
Jones, B.R., Crossley, W.A., Lyrintzis, A.S.: Aerodynamic and Aeroacoustic
Optimization of Airfoils via a Parallel Genetic Algorithm. In: AIAA 98-4811 (1998)
Jozefowiez, N., Semet, F., Talbi, E.-G.: Parallel and hybrid models for
multi-objective optimization: Application to the vehicle routing problem. In: Guervós,
J.J.M., Adamidis, P.A., Beyer, H.-G., Fernández-Villacañas, J.-L., Schwefel, H.-P.
(eds.) PPSN 2002. LNCS, vol. 2439, pp. 271–280. Springer, Heidelberg (2002)
Jozefowiez, N., Semet, F., Talbi, E.-G.: Enhancements of nsga ii and its application
to the vehicle routing problem with route balancing. In: Talbi, E.-G., Liardet,
P., Collet, P., Lutton, E., Schoenauer, M. (eds.) EA 2005. LNCS, vol. 3871, pp.
Jozefowiez, N., Semet, F., Talbi, E.-G.: Target aiming pareto search and its
application to the vehicle routing problem with route balancing. Journal of
Heuristics 13(5), 455–469 (2007)
Köksalan, M., Zionts, S.: International Conference on Multiple Criteria Decision
Making. Springer, Berlin (2001)
Lemesre, J., Dhaenens, C., Talbi, E.-G.: An exact parallel method for a bi-objective
permutation flowshop problem. European Journal of Operational Research 177(3),
1641–1655 (2007a)
Lemesre, J., Dhaenens, C., Talbi, E.-G.: Parallel partitioning method (PPM): A
new exact method to solve bi-objective problems. Computers and Operations
Research 34(8), 2450–2462 (2007b)
Liefooghe, A., Jourdan, L., Talbi, E.-G.: Paradiseo-MOEO: A framework for
evolutionary multi-objective optimization. In: Evolutionary Multi-objective
Optimization, Japan, pp. 457–471 (2007)
López-Jaimes, A., Coello Coello, C.A.: MRMOGA: Parallel Evolutionary
Multiobjective Optimization using Multiple Resolutions. In: IEEE Congress on
Evolutionary Computation, Edinburgh, Scotland, September 2005, vol. 3, pp. 2294–2301.
Mehnen, J., Michelitsch, T., Schmitt, K., Kohlen, T.: pMOHypEA: Parallel
evolutionary multiobjective optimization using hypergraphs. Technical Report of the
SFB Project 531 Computational Intelligence CI189/04, University of Dortmund
(2004)
Melab, N., Cahon, S., Talbi, E.-G.: Grid computing for parallel bioinspired
algorithms. Journal of Parallel and Distributed Computing (JPDC) 66(8), 1052–1061
(2006a)
Melab, N., Mezmaz, M., Talbi, E.-G.: Parallel cooperative metaheuristics on the
computational grid: A case study - the biobjective flow-shop problem. Parallel
Computing 32(9), 643–659 (2006b)
Meunier, H., Talbi, E.-G., Reininger, P.: A multiobjective genetic algorithm for radio
network design. In: IEEE Congress on Evolutionary Computation, Orlando, USA,
pp. 317–324 (2000)
Mezmaz, M., Melab, N., Talbi, E.-G.: Using the multi-start and island models for
parallel multi-objective optimization on the computational grid. In: IEEE
International Conference on e-Science and Grid Computing (e-Science'06), pp. 112–120
(2006)
Mezmaz, M., Melab, N., Talbi, E.-G.: An efficient load balancing strategy for
grid-based branch and bound. Parallel Computing 33(4–5), 302–313 (2007)
Mostaghim, S., Branke, J., Schmeck, H.: Multi-objective particle swarm optimization
on computer grids. In: The Genetic and Evolutionary Computation Conference,
vol. 1, pp. 869–875 (2007)
Okabe, T.: Evolutionary Multi-objective Optimization - On the Distribution of
Offspring in Parameter and Fitness Space. Shaker Verlag, Aachen (2004)
Okabe, T., Foli, K., Olhofer, M., Jin, Y., Sendhoff, B.: Comparative Studies on Micro
Heat Exchanger Optimization. In: IEEE Congress on Evolutionary Computation,
pp. 647–654 (2003)
Parmee, I.C., Vekeria, H.D.: Co-operative Evolutionary Strategies for Single
Component Design. In: Bäck, T. (ed.) International Conference on Genetic Algorithms,
pp. 529–536. Morgan Kaufmann, San Francisco (1997)
Poloni, C.: Hybrid GA for Multi-Objective Aerodynamic Shape Optimization. In:
Winter, G., Periaux, J., Galan, M., Cuesta, P. (eds.) Genetic Algorithms in
Engineering and Computer Science, pp. 397–416. Wiley & Sons, Chichester (1995)
Sasaki, D., Obayashi, S., Sawada, K., Himeno, R.: Multiobjective Aerodynamic
Optimization of Supersonic Wings Using Navier-Stokes Equations. In: European
Congress on Computational Methods in Applied Sciences and Engineering (2000)
Schaffer, D.J.: Multiple objective optimization with vector evaluated genetic
algorithms. In: International Conference on Genetic Algorithms and Their
Applications, pp. 93–100 (1985)
Schmeck, H., Kohlmorgen, U., Branke, J.: Parallel Implementations of Evolutionary
Algorithms. In: Solutions to Parallel and Distributed Computing Problems, pp.
47–68 (2001)
Stanley, T.J., Mudge, T.: A Parallel Genetic Algorithm for Multiobjective
Microprocessor Design. In: The Sixth International Conference on Genetic Algorithms,
pp. 597–604 (1995)
Streichert, F., Ulmer, H., Zell, A.: Parallelization of multi-objective evolutionary
algorithms using clustering algorithms. In: Coello Coello, C.A., Hernández Aguirre,
A., Zitzler, E. (eds.) EMO 2005. LNCS, vol. 3410, pp. 92–107. Springer,
Heidelberg (2005)
Talbi, E.-G.: Parallel combinatorial optimization. Wiley, Chichester (2006)
Talbi, E.-G., Meunier, H.: Hierarchical parallel approach for gsm mobile network
design. Journal of Parallel and Distributed Computing 66(2), 274–290 (2006)
Van Veldhuizen, D.A., Zydallis, J.B., Lamont, G.B.: Considerations in Engineering
Parallel Multiobjective Evolutionary Algorithms. IEEE Transactions on
Evolutionary Computation 7(2), 144–173 (2003)
Volkovich, V.L.: Distributed multiobjective optimization problems and methods for
their solution. In: International Conference on Multiple Criteria Decision Making,
pp. 222–232. Springer, Berlin (1997)
Watanabe, S., Hiroyasu, T., Miki, M.: Parallel Evolutionary Multi-Criterion
Optimization for Mobile Telecommunication Networks Optimization. In: Evolutionary
Methods for Design, Optimization and Control, pp. 162–172 (2002)
Wiecek, M.M., Zhang, H.: A parallel algorithm for multiple objective linear
programs. Computational Optimization and Applications 8(1), 41–56 (1997)
Xiao, N., Armstrong, M.P.: A specialized island model and its application in
multi-objective optimization. In: Cantú-Paz, E., Foster, J.A., Deb, K., Davis, L., Roy,
R., O'Reilly, U.-M., Beyer, H.-G., Kendall, G., Wilson, S.W., Harman, M.,
Wegener, J., Dasgupta, D., Potter, M.A., Schultz, A., Dowsland, K.A., Jonoska, N.,
Miller, J., Standish, R.K. (eds.) GECCO 2003. LNCS, vol. 2724, pp. 1530–1540.
Zhu, Z.-Y.: An Evolutionary Approach to Multi-Objective Optimization Problems.
Ph.D. thesis, The Chinese University of Hong Kong (2002)
Zhu, Z.-Y., Leung, K.-S.: Asynchronous Self-Adjustable Island Genetic Algorithm
for Multi-Objective Optimization Problems. In: IEEE Congress on Evolutionary
Computation, Piscataway, New Jersey, May 2002, vol. 1, pp. 837–842 (2002)
14 Quality Assessment of Pareto Set Approximations

Eckart Zitzler (1), Joshua Knowles (2), and Lothar Thiele (1)

(1) ETH Zurich, Switzerland
eckart.zitzler@tik.ee.ethz.ch, thiele@tik.ee.ethz.ch
(2) University of Manchester, UK
j.knowles@manchester.ac.uk
Abstract. This chapter reviews methods for the assessment and comparison of
Pareto set approximations. Existing set quality measures from the literature are
critically evaluated based on a number of orthogonal criteria, including invariance
to scaling, monotonicity and computational effort. Statistical aspects of quality
assessment are also considered in the chapter. Three main methods for the statistical
treatment of Pareto set approximations deriving from stochastic generating methods
are reviewed. The dominance ranking method is a generalization to partially-ordered
sets of a standard non-parametric statistical test, allowing collections of Pareto set
approximations from two or more stochastic optimizers to be directly compared
statistically. The quality indicator method (the dominant method in the literature)
maps each Pareto set approximation to a number, and performs statistics on the
resulting distribution(s) of numbers. The attainment function method estimates the
probability of attaining each goal in the objective space, and looks for significant
differences between these probability density functions for different optimizers. All
three methods are valid approaches to quality assessment, but give different
information. We explain the scope and drawbacks of each approach and also consider some
more advanced topics, including multiple testing issues, and using combinations of
indicators. The chapter should be of interest to anyone concerned with generating
and analysing Pareto set approximations.
14.1 Introduction
In many application domains, it is useful to approximate the set of Pareto-
optimal solutions, cf. (Ehrgott and Gandibleux, 2000; Deb, 2001; Coello Coello
et al., 2002). To this end, various approaches have been proposed ranging
from exact methods to randomized search algorithms such as evolutionary
algorithms, simulated annealing, and tabu search (see Chapters 2 and 3).
Reviewed by: Günter Rudolph, University of Dortmund, Germany
Serpil Sayin, Koç University, Turkey
With the rapid increase of the number of available techniques, the issue
of performance assessment has become more and more important and has
developed into an independent research topic. As with single objective opti-
mization, the notion of performance involves both the quality of the solution
found and the time to generate such a solution. The difficulty is that in the
case of stochastic optimizers the relationship between quality and time is not
fixed, but may be described by a corresponding probability density function.
Accordingly, every statement about the performance of a randomized search
algorithm is probabilistic in nature. Another difficulty is particular to
multiobjective optimizers that aim at approximating the set of Pareto-optimal
solutions in a scenario with multiple criteria: the outcome of the optimization
process is usually not a single solution but a set of trade-offs. This not
only raises the question of how to define quality in this context, but also how
to represent the outcomes of multiple runs in terms of a probability density
function.
This chapter addresses both quality and stochasticity. Sections 2–5 are
devoted to the issue of set quality measures; they define properties of such
measures and discuss selected measures in the light of these properties. The
question of how to statistically assess multiple sets generated by a stochastic
multiobjective optimizer is dealt with in Sections 6–8. Both aspects are
summarized in Section 9.
The chapter will be of interest to anyone concerned with generating methods
of any type. Those who are interested in a preference based set of solutions
should find this paper useful as well.
14.2 Quantifying Quality: General Considerations

14.2.1 Pareto Set Approximations
Assume a general optimization problem $(X, Z, f, \mathit{rel})$ where $X$ denotes the
decision space, $Z = \mathbb{R}^k$ is the objective space, $f = (f_1, f_2, \ldots, f_k)$ is the
vector of objective functions, and $\mathit{rel}$ represents a binary relation over $Z$ that
defines a partial order of the objective space, which in turn induces a preorder
of the decision space.¹ In the presence of a single objective function ($k = 1$),
the standard relation "less than or equal" ($\le$) is generally used to define the
corresponding minimization problem $(X, \mathbb{R}, (f_1), \le)$. In the case of multiple
objective functions, i.e., $k > 1$, usually the relation $\preceq$ with
$z^1 \preceq z^2 \Leftrightarrow \forall i \in \{1, \ldots, k\} : z^1_i \le z^2_i$
is taken; it represents a natural extension of $\le$ to $\mathbb{R}^k$ and
is also known as weak Pareto dominance. The associated strict order $\prec$ with
$z^1 \prec z^2 \Leftrightarrow z^1 \preceq z^2 \wedge z^2 \not\preceq z^1$ is often denoted as Pareto dominance, and
instead of $z^1 \prec z^2$ one also says $z^1$ dominates $z^2$. Using this terminology, the
Pareto-optimal set comprises the set of decision vectors not dominated by any
other element in the feasible set $S \subseteq X$.

¹ A binary relation is called a preorder iff it is reflexive and transitive. A preorder
which is antisymmetric is denoted as partial order.
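The two relations above can be sketched in a few lines of Python. This is a minimal illustration for minimization problems, not code from the chapter; the function names `weakly_dominates`, `dominates` and `nondominated` are chosen here for illustration, and objective vectors are assumed to be plain tuples of numbers.

```python
def weakly_dominates(z1, z2):
    """True iff z1 is component-wise less than or equal to z2 (minimization)."""
    return all(a <= b for a, b in zip(z1, z2))


def dominates(z1, z2):
    """Strict Pareto dominance: z1 weakly dominates z2, but not vice versa."""
    return weakly_dominates(z1, z2) and not weakly_dominates(z2, z1)


def nondominated(points):
    """Keep the objective vectors that no other vector in `points` dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # (3, 3) is dominated by (2, 2); (1, 3) and (2, 2) are incomparable.
    print(nondominated([(1, 3), (2, 2), (3, 3)]))
```

The filter is quadratic in the number of points, which is adequate for small illustrative sets.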
The formal definition of an optimization problem given above assumes
that only a single solution, any of those mapped to a minimal element, is
sought. However, in a multiobjective setting one is often interested in the
entire Pareto-optimal set rather than in a single, arbitrary Pareto-optimal
solution. With many applications, e.g., engineering design problems, knowledge
about the Pareto-optimal set is helpful and provides valuable information
about the underlying problem. This leads to a different optimization problem
where the goal is to find a set of mutually incomparable solutions (for any two
decision vectors $x^1, x^2$, neither weakly dominates the other one), which will
be here denoted as Pareto set approximations; the symbol $\Psi$ stands for the
set of all Pareto set approximations over $X$. Accordingly, sets of mutually
incomparable objective vectors are here called Pareto front approximations,
and the set of all Pareto front approximations over $Z$ is represented by $\Omega$.
Now, let $(X, Z, f, \mathit{rel})$ be the original optimization problem. It can be
canonically transformed into a corresponding set problem $(\Psi, \Omega, f^*, \mathit{rel}^*)$ by
extending $f$ and $\mathit{rel}$ in the following manner:
\[ f^*(E) = \{ z \in Z \mid \exists x \in E : z = f(x) \} \]
\[ A \ \mathit{rel}^* \ B \Leftrightarrow \forall z^2 \in B \ \exists z^1 \in A : z^1 \ \mathit{rel} \ z^2 \]
If $\mathit{rel}$ is $\preceq$, then $\mathit{rel}^*$ represents the natural extension of weak Pareto dominance
to Pareto front approximations. In the following, we will use the symbols $\preceq$
and $\prec$, as for dominance relations on objective vectors and decision vectors,
also for Pareto front approximations and Pareto set approximations, respectively;
it will become clear from the context which relation is referred to.
14.2.2 Outperformance and Quality Indicators
Suppose we would like to assess the performance of two multiobjective
optimizers. The question of whether either outperforms the other one involves
various aspects such as the quality of the outcome, the computation time
required, the parameter settings, etc. Sections 2–5 of this chapter focus on
the quality aspect and address the issue of how to compare two (or several)
Pareto set approximations. For the time being, assume that we consider one
optimization problem only and that the two algorithms to be compared are
deterministic, i.e., with each optimizer exactly one Pareto set approximation
is associated; the issue of stochasticity will be treated in later sections.
As discussed above, optimization is about searching in an ordered set. The
partial order $\mathit{rel}$ for an optimization problem $(X, Z, f, \mathit{rel})$ defines a preference
structure on the decision space: a solution $x^1$ is preferable to a solution $x^2$ iff
$f(x^1) \ \mathit{rel} \ f(x^2)$ and not $f(x^2) \ \mathit{rel} \ f(x^1)$. This preference structure is the basis
on which the optimization process is performed. For the corresponding set
problem $(\Psi, \Omega, f^*, \mathit{rel}^*)$, this means that the most natural way to compare two
Pareto set approximations $A$ and $B$ generated by two different multiobjective
optimizers is to use the underlying preference structure $\mathit{rel}^*$. In the context
of weak Pareto dominance, there can be four situations: (i) $A$ is better than
$B$ ($A \preceq B \wedge B \not\preceq A$), (ii) $B$ is better than $A$ ($A \not\preceq B \wedge B \preceq A$), (iii) $A$
and $B$ are incomparable ($A \not\preceq B \wedge B \not\preceq A$), or (iv) $A$ and $B$ are indifferent
($A \preceq B \wedge B \preceq A$), where "better" means the first set weakly dominates the
second, but the second does not weakly dominate the first. These are the types
of statements one can make without any additional preference information.
Often, though, we are interested in more precise statements that quantify the
difference in quality on a continuous scale. For instance, in cases (i) and (ii)
we may be interested in knowing how much better the preferable Pareto set
approximation is, and in case (iii) one may ask whether either set is better
than the other in certain aspects not captured by the preference structure;
this is illustrated in Fig. 14.1. This is crucial for the search process itself, and
almost all algorithms for approximating the Pareto set make use of additional
preference information, e.g., in terms of diversity measures.
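The four situations can be made concrete with a small Python sketch. It is an illustration under stated assumptions rather than the authors' code: sets are given directly as lists of objective vectors (Pareto front approximations), all objectives are minimized, and the names `set_weakly_dominates` and `compare` are hypothetical.

```python
def weakly_dominates(z1, z2):
    """Weak Pareto dominance on objective vectors (minimization)."""
    return all(a <= b for a, b in zip(z1, z2))


def set_weakly_dominates(A, B):
    """A weakly dominates B iff every vector in B is weakly dominated by some vector in A."""
    return all(any(weakly_dominates(a, b) for a in A) for b in B)


def compare(A, B):
    """Classify the pair (A, B) into one of the four situations (i) to (iv)."""
    a_over_b = set_weakly_dominates(A, B)
    b_over_a = set_weakly_dominates(B, A)
    if a_over_b and not b_over_a:
        return "A better"
    if b_over_a and not a_over_b:
        return "B better"
    if a_over_b and b_over_a:
        return "indifferent"
    return "incomparable"


if __name__ == "__main__":
    print(compare([(1, 3), (3, 1)], [(2, 4), (4, 2)]))  # A better
    print(compare([(1, 3)], [(3, 1)]))                  # incomparable
```

Note that the classification says nothing about how much better one set is; that is the gap quality indicators are meant to fill.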
For this purpose, quantitative set quality measures have been introduced.
We will use the term quality indicator in the following:
A (unary) quality indicator is a function $I : \Psi \to \mathbb{R}$ that assigns each
Pareto set approximation a real number.
In combination with the $\le$ or $\ge$ relation on $\mathbb{R}$, a quality indicator $I$ defines
a total order of $\Psi$ and thereby induces a corresponding preference structure:
$A$ is preferable to $B$ iff $I(A) > I(B)$, assuming that the indicator values are
to be maximized. That means we can compare the outcomes of two multiobjective
optimizers, i.e., two Pareto set approximations, by comparing the
corresponding indicator values.
Example 1. Let $A$ be an arbitrary Pareto set approximation and consider the
subspace $Z'$ of the objective space $Z = \mathbb{R}^k$ that is, roughly speaking, weakly
dominated by $A$. That means any objective vector in $Z'$ is weakly dominated
by at least one objective vector in $f(A)$.
The hypervolume indicator $I_H$ (Zitzler and Thiele, 1999) gives the hypervolume
of $Z'$ (see Fig. 14.2). The greater the indicator value, the better the
approximation set. Note that this indicator requires a reference point relative
to which the hypervolume is calculated.
Considering again Fig. 14.1, it can be seen that the hypervolume indicator
reveals differences in quality that cannot be detected by the dominance relation.
In the left scenario, $I_H(A) = 277$ and $I_H(B) = 231$, while for the scenario in
the middle, $I_H(A) = 277$ and $I_H(B) = 76$; in the right scenario, the indicator
values are $I_H(A) = 277$ and $I_H(B) = 174$.² This advantage, though, comes at
the expense of generality, since every quality indicator represents certain
assumptions about the decision maker's preferences. Whenever $I_H(A) > I_H(B)$,
² The objective vector (20, 20) is the reference point.

[Figure 14.1: three panels (left, middle, right), each plotting f2 against f1 on axes ranging from 5 to 20 and showing the Pareto set approximations A and B.]
Fig. 14.1. Three examples to illustrate the limitations of statements purely based on
weak Pareto dominance. In both the figures on the left, the Pareto set approximation
A dominates the Pareto set approximation B, but in one case the two sets are much
closer together than in the other case. On the right, A and B are incomparable,
but in most situations A will be more useful to the decision maker than B. The
background dots represent the image of the feasible set S in the objective space R^2
for a discrete problem.

[Figure 14.2: plot of f2 against f1 on axes ranging from 5 to 20, showing approximation set A, the reference point (20, 20), and the dominated subspace.]
Fig. 14.2. Illustration of the hypervolume indicator. In this example, approximation
set A is assigned the indicator value IH(A) = 277; the objective vector (20, 20) is
taken as the reference point.

we can state that $A$ is better than $B$ with respect to the hypervolume
indicator; however, the situation could be different for another quality indicator
$I'$ that assigns $B$ a better indicator value than $A$. As a consequence, every
comparison of multiobjective optimizers is not only restricted to the selected
benchmark problems and parameter settings, but also to the quality
indicator(s) under consideration. For instance, if we use the hypervolume indicator
in a comparative study, any statement like "optimizer 1 outperforms optimizer 2
in terms of quality of the generated Pareto set approximation" needs
to be qualified by adding "under the assumption that $I_H$ reflects the decision
maker's preferences".
Finally, note that the following discussion focuses on unary quality
indicators, although an indicator can in principle take an arbitrary number of Pareto
set approximations as arguments. Several quality indicators have been proposed
that assign real numbers to pairs of Pareto set approximations; these
are denoted as binary quality indicators (see Hansen and Jaszkiewicz, 1998;
Knowles and Corne, 2002; Zitzler et al., 2003, for an overview). For instance,
the unary hypervolume indicator can be extended to a binary quality indicator
by defining $I_H(A, B)$ as the hypervolume of the subspace of the objective
space that is dominated by $A$ but not by $B$.
14.3 Properties of Unary Quality Indicators

Quality indicators serve different goals: they may be used for comparing
algorithms, but also during the optimization process as guidance for the search
or as stopping criterion. In principle, one may consider any function from $\Psi$
to $\mathbb{R}$ as an indicator, but clearly there are certain properties that need to be
fulfilled in order to make the indicator useful. These properties may vary
depending on the purpose: for instance, when comparing several algorithms on
a benchmark problem one may assume that the Pareto-optimal set is known,
while such information is clearly not available in a real-world scenario. In the
following, we will consider four main criteria:
Monotonicity: An indicator $I$ is said to be monotonic iff, whenever a Pareto
set approximation is at least as good as another one in terms of the
dominance relation, it is also at least as good in terms of the indicator
values. Formally, this can be expressed as follows:
\[ \forall A, B \in \Psi : A \preceq B \Rightarrow I(A) \ge I(B) \]
where $\preceq$ stands for the underlying dominance relation, here weak Pareto
dominance.
Monotonicity guarantees that an indicator does not contradict the partial
order of $\Psi$ that is imposed by the weak Pareto dominance relation, i.e.,
consistency with the inherent preference structure of the optimization
problem under consideration is maintained. However, it does not guarantee
a unique optimum with respect to the indicator values; in other
words, a Pareto set approximation that has the same indicator value as
the Pareto-optimal set does not necessarily contain only Pareto-optimal
solutions. To this end, a stronger condition is needed, which leads to the
property of strict monotonicity:
\[ \forall A, B \in \Psi : A \prec B \Rightarrow I(A) > I(B) \]
Currently, the hypervolume indicator is the only strictly monotonic unary
indicator known (see Zitzler et al., 2007).
Scaling invariance: In practice, the objective functions are often subject to
scaling, i.e., the objective function values undergo a strictly monotonic
transformation. Most common are transformations of the form
$s(f(x)) = (f(x) - f_l)/(f_u - f_l)$ where $f_l$ and $f_u$ are lower and upper
bounds respectively for the objective function values such that each
objective vector lies in $[0, 1]^k$. In this context, it may be desirable that an
indicator is not affected by any type of scaling, which can be stated as
follows: an indicator is denoted as scaling invariant iff for any strictly
monotonic transformation $s : \mathbb{R}^k \to \mathbb{R}^k$ the indicator values remain
unaffected, i.e., for all $A \in \Psi$ the indicator value $I(A)$ is the same
independently of whether we consider the problem $(\Psi, \Omega, f^*, \mathit{rel}^*)$ or the scaled
problem $(\Psi, \Omega, s \circ f^*, \mathit{rel}^*)$.³ Scaling invariant indicators usually only exploit
the dominance relation among solutions, but not their absolute objective
function values.
Computational effort: A further property that is less easy to formalize addresses
the computational resources needed to compute the indicator value for a
given Pareto set approximation. We here consider the runtime complexity,
depending on the number of solutions in the Pareto set approximation as
well as the number of objectives, as a measure to compare indicators. This
aspect becomes critical if an indicator is to be used during the search
process; however, even for pure performance assessment there may be
limitations for certain indicators, e.g., if the running time is exponential
in the number of objectives as with the hypervolume indicator (While,
2005).
Additional problem knowledge: Many indicators are parameterized and re-
quire additional information in order to be applied. Some assume the
Pareto-optimal set to be known, while others rely on reference objective
vectors or reference sets. In most cases, the indicator parameters are both
user- and problem-dependent; therefore, it may be desirable to have as
few parameters as possible.
There are many properties one may consider, and the interested reader is
referred to (Knowles, 2002; Knowles and Corne, 2002) for a more detailed
discussion.
³ Alternatively, one may consider a weaker version of scaling invariance which is
based on the order of the indicator values rather than on the absolute values.
More precisely, the elements of $\Psi$ would be sorted according to their indicator
values; if the order remains the same for any type of scaling, then the indicator
under consideration would be called scaling independent.
14.4 Discussion of Selected Unary Quality Indicators

The unary quality indicators that will be discussed in the following repre-
sent a selection of popular measures; however, the list of indicators is by no
means complete. Furthermore, only deterministic indicators are considered. A
summary of the indicators and properties we consider is given in Table 14.1.
14.4.1 Outer Diameter
The outer diameter measures the distance between the ideal objective vector
and the nadir objective vector of a Pareto set approximation in terms of a
specific distance metric. We here define the corresponding indicator $I_{OD}$ as
\[ I_{OD}(A) = \max_{1 \le i \le n} w_i \left( \max_{x \in A} f_i(x) - \min_{x \in A} f_i(x) \right) \]
with weights $w_i \in \mathbb{R}^+$. If all weights are set to 1, then the outer diameter
simply provides the maximum extent over all dimensions of the objective
space.
The outer diameter is neither monotonic nor scaling invariant. However, it
is cheap to compute (the runtime is linear in the cardinality of the Pareto set
approximation $A$) and does not require any additional problem knowledge.
The parameters $w_i$ can be used to weight the different objectives, but they are
as such not problem-specific.
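The definition translates almost directly into code. The following Python sketch is only an illustration: it assumes the approximation is supplied as a list of objective vectors (i.e., the images $f(x)$ of the solutions), and the name `outer_diameter` and the default of unit weights are choices made here.

```python
def outer_diameter(front, weights=None):
    """Maximum weighted extent of `front` over all objectives."""
    n = len(front[0])
    if weights is None:
        weights = [1.0] * n  # unit weights: plain maximum extent
    return max(
        w * (max(z[i] for z in front) - min(z[i] for z in front))
        for i, w in enumerate(weights)
    )


if __name__ == "__main__":
    front = [(1, 9), (4, 4), (7, 2)]
    print(outer_diameter(front))          # extents are 6 and 7
    print(outer_diameter(front, (2, 1)))  # weighted extents are 12 and 7
```

A single pass per objective suffices, which matches the linear runtime stated above.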
14.4.2 Proportion of Pareto-Optimal Objective Vectors Found
Another measure to consider is the number of Pareto-optimal objective vectors
that are weakly dominated by the image of a Pareto set approximation
in objective space. The corresponding indicator $I_{PF}$ has been introduced by
Ulungu et al. (1999) as the fraction of the Pareto-optimal front $P$ weakly
dominated by a specific set $A \in \Psi$:
\[ I_{PF}(A) = \frac{|\{ z \in P \mid \exists x \in A : f(x) \preceq z \}|}{|P|} \]
This measure assumes that the Pareto-optimal set resp. the Pareto-optimal
front is known and that the number of optimal objective vectors is finite. The
indicator value can be computed in $O(|P| \cdot |A|)$ time, and it is invariant
to scaling. The indicator is monotonic, but not strictly monotonic.
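A minimal Python sketch of this indicator follows. It assumes minimization, a finite and known Pareto-optimal front, and that the approximation is given as a list of objective vectors; the name `proportion_found` is hypothetical.

```python
def weakly_dominates(z1, z2):
    """Weak Pareto dominance on objective vectors (minimization)."""
    return all(a <= b for a, b in zip(z1, z2))


def proportion_found(front, pareto_front):
    """Fraction of the known Pareto-optimal front weakly dominated by `front`."""
    covered = sum(
        1 for z in pareto_front if any(weakly_dominates(a, z) for a in front)
    )
    return covered / len(pareto_front)


if __name__ == "__main__":
    P = [(1, 4), (2, 3), (3, 2), (4, 1)]
    A = [(1, 4), (3, 2), (5, 5)]
    print(proportion_found(A, P))  # two of the four optimal vectors are covered
```

The nested loop over `pareto_front` and `front` reflects the $O(|P| \cdot |A|)$ bound.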
14.4.3 Cardinality
The cardinality $I_C(A)$ of a Pareto set approximation $A$ can be considered
both in decision space and objective space (see, e.g., Van Veldhuizen, 1999).
In either case, the indicator is not monotonic. However, it is cheap to compute,
scaling invariant, and does not require any additional information.
Table 14.1. Summary of selected indicators and some of their properties. See
accompanying text for full details.

| Indicator | Monotonicity | Scaling invariance | Computational effort | Additional problem knowledge needed |
|---|---|---|---|---|
| Outer Diameter | not monotonic | not invariant | linear time | none |
| Proportion of Pareto Optimal Vectors Found | not strictly | invariant | quadratic | all Pareto optima |
| Cardinality | not monotonic | invariant | linear time | none |
| Hypervolume | strictly monotonic | not invariant | exponential in k | needs upper bounding vector |
| Completeness | not strictly | invariant | anytime as it is based on sampling, but effort grows rapidly with decision space dimension | none |
| Epsilon Family | not strictly | not invariant | quadratic | reference set |
| D Family | not strictly | not invariant | quadratic | reference set |
| R Family | not strictly | not invariant | quadratic | reference set and point |
| Uniformity Measures | not monotonic | not invariant | quadratic | varies |
14.4.4 Hypervolume Indicator
The hypervolume indicator $I_H$, which was introduced in (Zitzler and Thiele,
1998), gives the volume of the portion of the objective space that is weakly
dominated by a specific Pareto set approximation. It can be formally defined
as
\[ I_H(A) := \int_{z_{lower}}^{z_{upper}} \alpha_A(z) \, dz \]
where $z_{lower}$ and $z_{upper}$ are objective vectors representing lower resp. upper
bounds for the portion of the objective space within which the hypervolume
is calculated, and where the function $\alpha_A$ is the attainment function (Grunert
da Fonseca et al., 2001) for $A$
\[ \alpha_A(z) := \begin{cases} 1 & \text{if } \exists x \in A : f(x) \preceq z \\ 0 & \text{else} \end{cases} \]
that returns for an objective vector a 1 if and only if it is weakly dominated
by $A$. In practice, the lower bound $z_{lower}$ is not required to calculate the
hypervolume for a set $A$. The hypervolume indicator is to be maximized.
The hypervolume indicator is currently the only unary indicator known to
be strictly monotonic. This comes at a high computational cost: the
best known algorithms for computing the hypervolume have running times
which are exponential in the number of objectives (see While, 2005; While
et al., 2005, 2006; Fonseca et al., 2006; Beume and Rudolph, 2006). Furthermore,
a reference point, an upper bound, needs to be specified; the indicator
is sensitive to the choice of this upper bound, i.e., the ordering of Pareto set
approximations induced by the indicator is affected by it, so the indicator is
not scaling invariant by the above definition. Note: preference information can
be incorporated into the hypervolume indicator, so that more emphasis can
be placed on certain parts of the Pareto front than others (e.g. the middle,
the extremes, etc.), whilst maintaining monotonicity (Zitzler et al., 2007).
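For two objectives the integral reduces to a sum of rectangle areas, which can be computed by a simple sweep. The Python sketch below covers only this bi-objective minimization case and is not one of the general algorithms cited above; the name `hypervolume_2d` and the convention of ignoring points that do not strictly improve on the reference point are assumptions made for the illustration.

```python
def hypervolume_2d(front, ref):
    """Area weakly dominated by `front` and bounded above by the reference point `ref`."""
    # Only points strictly better than the reference point contribute.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest f2 value seen so far in the sweep over f1
    for f1, f2 in pts:
        if f2 < best_f2:  # dominated points add no new area and are skipped
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


if __name__ == "__main__":
    ref = (20, 20)
    print(hypervolume_2d([(5, 15), (10, 10), (15, 5)], ref))  # 75 + 50 + 25
    print(hypervolume_2d([(10, 10)], ref))                    # one 10 x 10 square
```

In this toy case the three-point set weakly dominates the single point and receives the larger value, in line with the monotonicity property; moving `ref` changes the values and can change the induced ordering.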
14.4.5 Completeness Indicator

The completeness indicator $I_{CP}$ was introduced in (Lotov et al., 2002, 2004)
and goes back to the concept of completeness as defined by (Kamenev and
Kondrat'ev, 1992; Kamenev, 2001). The indicator gives the probability that
a randomly chosen solution from the feasible set $S$ is weakly dominated by a
given Pareto set approximation $A$, i.e.,
\[ I_{CP}(A) = \mathrm{Prob}[A \preceq \{U\}] \tag{14.1} \]
where $U$ is a random variable representing the random choice from $S$.
Provided that $U$ follows a uniform probability density function, the indicator
value $I_{CP}(A)$ can also be interpreted as the portion of the feasible set that
is dominated by $A$. As such, the completeness indicator is strongly related to
the hypervolume indicator; the difference is that the former takes the decision
space into account, while the latter considers the objective space only.
Normally, one cannot compute the completeness directly. For this reason,
the indicator values are estimated by drawing samples from the feasible
set and computing the completeness for these samples. As shown by Lotov et
al. (2004), the confidence interval for the true value can be evaluated
with any reliability, given sufficiently large samples. Furthermore, there
is an extension of this indicator, namely $I_{CP}^{\epsilon}(A)$, where another dominance
relation is considered, e.g., the $\epsilon$-dominance relation $\preceq_\epsilon$ defined in
Section 14.4.6, which reflects a specific $\epsilon$-neighborhood of a Pareto set
approximation in the objective space; see (Lotov et al., 2002, 2004) for details.
The completeness indicator is scaling-invariant as it does not rely on the
absolute objective function values. Furthermore, the exact completeness
indicator is, like the hypervolume indicator, strictly monotonic. However, as in
practice sampling is necessary to estimate the exact indicator values, the
indicator function based on estimates is monotonic (if the same sample
is always used to compare two Pareto set approximations), while strict monotonicity
cannot be ensured in general. Experimental studies have shown that the
indicator estimates are effective only for relatively low-dimensional decision spaces
(not more than a dozen decision variables) and for sufficiently slowly varying
objective functions (Lotov et al., 2002, 2004). For a high-dimensional decision
space, the Pareto-optimal set cannot be found via random point generation if
it has an extremely small volume. For this reason, a generalized completeness
estimate for the quality of approximation has been proposed for the case of a
large number of variables and rapidly varying functions, see (Berezkin et al.,
2006).
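The sampling-based estimate can be sketched as a plain Monte Carlo loop. The Python code below is an illustration under strong assumptions, not the procedure of Lotov et al.: the feasible set is the unit square, sampled uniformly, the objective function is the identity, and the approximation is given by its objective vectors. The names `estimate_completeness`, `sample_unit_square` and `identity_objective` are hypothetical.

```python
import random


def weakly_dominates(z1, z2):
    """Weak Pareto dominance on objective vectors (minimization)."""
    return all(a <= b for a, b in zip(z1, z2))


def estimate_completeness(front, objective, sample_feasible, n_samples, seed=0):
    """Fraction of sampled feasible solutions whose image is weakly dominated by `front`."""
    rng = random.Random(seed)  # fixed seed: the same sample is reused across comparisons
    hits = 0
    for _ in range(n_samples):
        z = objective(sample_feasible(rng))
        if any(weakly_dominates(a, z) for a in front):
            hits += 1
    return hits / n_samples


def sample_unit_square(rng):
    """Toy feasible set (assumption): uniform sampling of [0, 1) x [0, 1)."""
    return (rng.random(), rng.random())


def identity_objective(x):
    """Toy objective function (assumption): both decision variables are minimized."""
    return x


if __name__ == "__main__":
    est = estimate_completeness(
        [(0.5, 0.5)], identity_objective, sample_unit_square, 20000
    )
    print(est)  # should be close to the exact value 0.25 for this toy problem
```

Reusing the same seed for every set being compared corresponds to the condition, stated above, under which the estimated indicator remains monotonic.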
14.4.6 Epsilon Indicator Family
The epsilon indicator family has been introduced in (Zitzler et al., 2003)
and comprises a multiplicative and an additive version; both exist in unary
and binary form. The definition is closely related to the notion of epsilon
efficiency (Helbig and Pateva, 1994). The binary multiplicative epsilon indicator,
$I_\epsilon(A, B)$, gives the minimum factor $\epsilon$ by which each objective vector associated with
$B$ can be multiplied such that the resulting transformed Pareto front
approximation is weakly dominated by the Pareto front approximation represented
by $A$:
\[ I_\epsilon(A, B) = \inf_{\epsilon \in \mathbb{R}} \{ \forall x^2 \in B \ \exists x^1 \in A : x^1 \preceq_\epsilon x^2 \}. \tag{14.2} \]
This indicator relies on the $\epsilon$-dominance relation, $\preceq_\epsilon$, defined as:
\[ x^1 \preceq_\epsilon x^2 \Leftrightarrow \forall i \in 1..n : f_i(x^1) \le \epsilon \cdot f_i(x^2) \tag{14.3} \]
for a minimization problem, and assuming that all points are positive in all
objectives. On this basis, the unary multiplicative epsilon indicator, $I_\epsilon^1(A)$, can
then be defined as:
\[ I_\epsilon^1(A) = I_\epsilon(A, R), \tag{14.4} \]
where $R$ is any reference set of points. An equivalent unary additive epsilon
indicator $I_{\epsilon+}^1$ is defined analogously, but is based on additive $\epsilon$-dominance:
\[ x^1 \preceq_{\epsilon+} x^2 \Leftrightarrow \forall i \in 1..n : f_i(x^1) \le \epsilon + f_i(x^2). \tag{14.5} \]
Both unary indicators are to be minimized. An indicator value less than or
equal to 1 ($I_\epsilon^1$) respectively 0 ($I_{\epsilon+}^1$) implies that $A$ weakly dominates the
reference set $R$.
The unary epsilon indicators are monotonic, but not strictly monotonic.
They are sensitive to scaling and require a reference set relative to which the
epsilon value is calculated. For any finite Pareto set approximation $A$ and any
finite reference set $R$, the indicator values are cheap to compute; the runtime
complexity is of order $O(n \cdot |A| \cdot |R|)$.
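For finite sets the infimum in (14.2) can be evaluated in closed form as a max-min-max over objective ratios (or differences in the additive case). The Python sketch below assumes minimization, strictly positive objective values for the multiplicative version, and sets given as lists of objective vectors; the function names are chosen here for illustration.

```python
def eps_multiplicative(A, B):
    """Smallest factor eps such that every b in B is eps-dominated by some a in A."""
    return max(
        min(max(a_i / b_i for a_i, b_i in zip(a, b)) for a in A)
        for b in B
    )


def eps_additive(A, B):
    """Smallest shift eps such that every b in B is additively eps-dominated by some a in A."""
    return max(
        min(max(a_i - b_i for a_i, b_i in zip(a, b)) for a in A)
        for b in B
    )


if __name__ == "__main__":
    R = [(1, 2), (2, 1)]  # reference set (assumed given)
    A = [(2, 4), (4, 2)]
    print(eps_multiplicative(A, R))  # unary multiplicative value of A w.r.t. R
    print(eps_additive(A, R))        # unary additive value of A w.r.t. R
```

The three nested loops over `B`, `A` and the objectives give the $O(n \cdot |A| \cdot |R|)$ cost stated above.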
14.4.7 The D Indicator Family
The D indicators are similar to the additive epsilon indicator and measure
the average resp. worst case component-wise distance in objective space to
the closest solution in a reference set $R$, as suggested in (Czyzak and Jaszkiewicz,
1998). Czyzak and Jaszkiewicz (1998) introduced two versions, $I_{D1}$ and $I_{D2}$;
the first considers the average distance regarding the set $R$:
\[ I_{D1}(A) = \frac{1}{|R|} \sum_{x^2 \in R} \min_{x^1 \in A} \max_{1 \le i \le k} \left\{ 0, w_i \left( f_i(x^1) - f_i(x^2) \right) \right\} \]
where the $w_i$ are weights associated with the specific objective functions.
Alternatively, the worst case distance may be considered:
\[ I_{D2}(A) = \max_{x^2 \in R} \min_{x^1 \in A} \max_{1 \le i \le k} \left\{ 0, w_i \left( f_i(x^1) - f_i(x^2) \right) \right\} \]
As with the epsilon indicator family, the D indicators are monotonic, but
not strictly monotonic, scaling dependent, and require a reference set. The
running time complexity is of order $O(n \cdot |A| \cdot |R|)$.
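Both versions share the same inner distance and differ only in how the per-reference-point values are aggregated. The Python sketch below is an illustration for minimization with sets given as lists of objective vectors; the names `_distance`, `d1` and `d2` are hypothetical.

```python
def _distance(a, r, weights):
    """Weighted worst-case amount by which `a` is worse than reference vector `r`."""
    return max(0.0, max(w * (a_i - r_i) for w, a_i, r_i in zip(weights, a, r)))


def d1(A, R, weights):
    """Average, over the reference set, of the distance to the closest vector in A."""
    return sum(min(_distance(a, r, weights) for a in A) for r in R) / len(R)


def d2(A, R, weights):
    """Worst case, over the reference set, of the distance to the closest vector in A."""
    return max(min(_distance(a, r, weights) for a in A) for r in R)


if __name__ == "__main__":
    R = [(1, 3), (2, 2), (3, 1)]  # reference set (assumed given)
    A = [(2, 4), (4, 2)]
    w = (1.0, 1.0)
    print(d1(A, R, w))  # (1 + 2 + 1) / 3
    print(d2(A, R, w))  # 2.0
```

Clamping at zero means that an approximation which weakly dominates every reference vector receives the value 0 under both versions.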
14.4.8 The R Indicator Family
The R indicators proposed in (Hansen and Jaszkiewicz, 1998) can be used to
assess and compare Pareto set approximations on the basis of a set of utility
functions. Here, a utility function $u$ is defined as a mapping from the set $\mathbb{R}^k$
of $k$-dimensional objective vectors to the set of real numbers:
\[ u : \mathbb{R}^k \to \mathbb{R}. \]
Now, suppose that the decision maker's preferences are given in terms of a
parameterized utility function $u_\lambda$ and a corresponding set $\Lambda$ of parameters. For
instance, $u_\lambda$ could represent a weighted sum of the objective values, where
$\lambda = (\lambda_1, \ldots, \lambda_n) \in \Lambda$ stands for a particular weight vector. Hansen and Jaszkiewicz
(1998) propose several ways to transform such a family of utility functions
into a quality indicator; in particular, the binary $I_{R2}$ and $I_{R3}$ indicators are
defined as:⁴
\[ I_{R2}(A, B) = \frac{\sum_{\lambda \in \Lambda} \left[ u^*(\lambda, A) - u^*(\lambda, B) \right]}{|\Lambda|}, \]
\[ I_{R3}(A, B) = \frac{\sum_{\lambda \in \Lambda} \left[ u^*(\lambda, B) - u^*(\lambda, A) \right] / u^*(\lambda, B)}{|\Lambda|}, \]
where $u^*$ is the maximum value reached by the utility function $u_\lambda$ with weight
vector $\lambda$ on a Pareto set approximation $A$, i.e., $u^*(\lambda, A) = \max_{x \in A} u_\lambda(f(x))$.
Similarly to the epsilon indicators, the unary R indicators are defined on the
basis of the binary versions by replacing one set by an arbitrary, but fixed
reference set $R$: $I_{R2}^1(A) = I_{R2}(R, A)$ and $I_{R3}^1(A) = I_{R3}(A, R)$. The indicator
values are to be minimized.

⁴ The full formalism described in (Hansen and Jaszkiewicz, 1998) also considers
arbitrary sets of utility functions in combination with a corresponding probability
distribution over the utility functions. This is a way of enabling preference
information regarding different parts of the Pareto front to be accounted for, e.g.
more utility functions can be placed in the middle of the Pareto front in order to
emphasise that region. The interested reader is referred to the original paper for
further information.
With respect to the choice of the parameterized utility function $u_\lambda$, there
are various possibilities. A first utility function $u_\lambda$ that can be used in the above
is a weighted linear function
\[ u_\lambda(z) = -\sum_{j \in 1..n} \lambda_j |z^*_j - z_j|, \tag{14.6} \]
where $z^*$ is the ideal point, if known, or any point that weakly dominates all
points in the corresponding Pareto front approximation. (When comparing
approximation sets, the same $z^*$ must be used each time.)
A disadvantage of a weighted linear function is that points
not on the convex hull of the Pareto front approximation are not rewarded.
Therefore, it is often preferable to use a nonlinear function such as the
weighted Tchebycheff function,
\[ u_\lambda(z) = -\max_{j \in 1..n} \lambda_j |z^*_j - z_j|. \tag{14.7} \]
In this case, however, the utility of a point and one which weakly dominates
it might be the same. To avoid this, it is possible to use the combination of
linear and nonlinear functions: the augmented Tchebycheff function,
\[ u_\lambda(z) = -\left( \max_{j \in 1..n} \lambda_j |z^*_j - z_j| + \rho \sum_{j \in 1..n} |z^*_j - z_j| \right), \tag{14.8} \]
where $\rho$ is a sufficiently small positive real number. In all cases, the set $\Lambda$
of weight vectors should contain a sufficiently large number of uniformly
dispersed normalized weight combinations $\lambda$ with
$\forall i \in 1..n : \lambda_i \ge 0 \wedge \sum_{j=1..n} \lambda_j = 1$.
The R indicators are monotonic, but not strictly, scaling dependent and
require both a reference set as well as an ideal objective vector. The runtime
complexity for computing the indicator values is of order $O(n \cdot |\Lambda| \cdot |A| \cdot |R|)$.
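A small Python sketch of the unary $I_{R2}$ indicator with a weighted Tchebycheff utility is given below. It rests on several assumptions that are not taken from the chapter: the utility is the negated weighted Tchebycheff distance to the ideal point (so that larger values are better and $u^*$ is a maximum), there are two objectives, the weight vectors are evenly spaced on the simplex, and sets are given as lists of objective vectors. All function names are hypothetical.

```python
def tchebycheff_utility(z, weights, ideal):
    """Negated weighted Tchebycheff distance to the ideal point (larger is better)."""
    return -max(w * abs(zs - zj) for w, zs, zj in zip(weights, ideal, z))


def best_utility(front, weights, ideal):
    """u*(lambda, A): best utility reached on the front for one weight vector."""
    return max(tchebycheff_utility(z, weights, ideal) for z in front)


def r2_binary(A, B, weight_vectors, ideal):
    """Mean difference of best utilities between A and B over all weight vectors."""
    total = sum(
        best_utility(A, w, ideal) - best_utility(B, w, ideal)
        for w in weight_vectors
    )
    return total / len(weight_vectors)


def r2_unary(A, R, weight_vectors, ideal):
    """Unary version: the reference set R takes the first argument position."""
    return r2_binary(R, A, weight_vectors, ideal)


def uniform_weights_2d(m):
    """m evenly spaced, normalized weight vectors for two objectives (m >= 2)."""
    return [(i / (m - 1), 1 - i / (m - 1)) for i in range(m)]


if __name__ == "__main__":
    ideal = (0, 0)
    R = [(1, 3), (2, 2), (3, 1)]
    A = [(2, 4), (3, 3), (4, 2)]
    W = uniform_weights_2d(11)
    print(r2_unary(A, R, W, ideal))  # positive: A is worse than R for every weight
    print(r2_unary(R, R, W, ideal))  # 0.0
```

With this sign convention smaller unary values are better, and a value of zero is reached when the approximation matches the reference set in best utility for every weight vector.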
14.4.9 Uniformity Measures
Various indicators have been proposed that measure how well the solutions
of a Pareto set approximation are distributed in the objective space; often,
the main focus is on a uniform distribution. To this end, one can consider the
standard deviation of nearest neighbor distances (see, e.g., Schott, 1995;
Deb et al., 2002). Further examples can be found in (Knowles, 2002; Knowles
and Corne, 2002).

In general, uniformity measures are not monotonic and not scaling invariant.
The computation time required to compute the indicator values is usually
quadratic in the cardinality of the Pareto set approximation under
consideration, i.e., $O(n \cdot |A|^2)$. Most measures of this class do not
require additional information, but some involve certain problem-dependent
parameters.
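As a rough illustration of such a measure, the following sketch computes the standard deviation of nearest neighbor distances. It assumes Euclidean distances and the population standard deviation; the cited measures differ in such details, so this is not a reimplementation of any of them. The function names and the two sample fronts are hypothetical.

```python
import math
from statistics import pstdev


def nearest_neighbor_distances(front):
    """Euclidean distance from each objective vector to its closest other vector."""
    return [
        min(math.dist(p, q) for j, q in enumerate(front) if j != i)
        for i, p in enumerate(front)
    ]


def spacing(front):
    """Standard deviation of the nearest neighbor distances (0 means evenly spread)."""
    if len(front) < 2:
        raise ValueError("need at least two objective vectors")
    return pstdev(nearest_neighbor_distances(front))


# Hypothetical Pareto front approximations with five points each.
even = [(0.0, 4.0), (1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (4.0, 0.0)]
clustered = [(0.0, 4.0), (0.1, 3.9), (0.2, 3.8), (3.0, 1.0), (4.0, 0.0)]

print(spacing(even), spacing(clustered))
```

The double loop over the front mirrors the quadratic cost in $|A|$ mentioned above.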
14.5 Indicator Combinations and Binary Indicators

The ideal quality indicator is strictly monotonic, scaling invariant, cheap to
compute and does not require any additional information. However, it can be
seen from the discussion above that such an ideal indicator does not exist.
For instance, all monotonic unary quality indicators require a reference point
and/or a reference set. The only strictly monotonic indicator currently known,
the hypervolume indicator, is by far the most computationally expensive
indicator. An obvious way to circumvent some of these problems is to combine
multiple indicators. One has to define exactly how the resulting information
is combined; for instance, one may consider a sequence of indicators. Suppose
we would like to combine the epsilon indicator and the hypervolume indicator:
one may say A is preferable to B if either the epsilon value for A is better
or the epsilon values are identical and the hypervolume value for A is better.
The resulting indicator combination would be strictly monotonic, but on
average much less expensive than the hypervolume computation alone because
in many cases the decision can already be made on the basis of the epsilon
indicator. Another possibility is the use of binary quality indicators; see
(Zitzler et al., 2003) for a detailed discussion. Here, both scaling
invariance and strict monotonicity can be achieved at the same time, e.g.,
with the coverage indicator (Zitzler and Thiele, 1998).
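The sequential combination described above can be sketched as a lexicographic comparison. The sketch assumes that the unary epsilon indicator is to be minimized and the hypervolume indicator is to be maximized. The indicator functions are passed in as callables, and the lookup tables below are hypothetical stand-ins for real indicator computations.

```python
def is_preferable(a, b, epsilon, hypervolume):
    """True if a is preferable to b: epsilon first, hypervolume only on ties."""
    eps_a, eps_b = epsilon(a), epsilon(b)
    if eps_a != eps_b:
        return eps_a < eps_b                    # smaller epsilon value is better
    return hypervolume(a) > hypervolume(b)      # expensive step, reached only on ties


# Hypothetical, precomputed indicator values for three labelled sets.
EPSILON = {"A": 1.0, "B": 2.0, "C": 1.0}
HYPERVOLUME = {"A": 250.0, "B": 300.0, "C": 240.0}
hv_calls = []  # records how often the expensive indicator is evaluated


def epsilon(name):
    return EPSILON[name]


def hypervolume(name):
    hv_calls.append(name)
    return HYPERVOLUME[name]
```

Calling `is_preferable("A", "B", epsilon, hypervolume)` is decided by the epsilon values alone and leaves `hv_calls` empty, whereas comparing `"A"` with `"C"` falls through to the hypervolume step.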
14.6 Stochasticity: General Considerations

So far, we have assumed that each algorithm under consideration always
generates the same Pareto set approximation for a specific problem. However,
many multiobjective optimizers are variants of randomized search algorithms
and therefore stochastic in nature. If a stochastic multiobjective optimizer is
applied several times to the same problem, each time a different Pareto set
approximation may be returned. In this sense, with each randomized algorithm
a random variable is associated whose possible values are Pareto set
approximations, i.e., elements of the set of all Pareto set approximations;
the underlying probability density function is usually unknown.

One way to estimate this probability density function is by means of
theoretical analysis. Since this approach is infeasible for many problems and
algorithms used in practice, empirical studies are common in the context of the
performance assessment of multiobjective optimizers. By running a specific
algorithm several times on the same problem instance, one obtains a sample
of Pareto set approximations. Now, comparing two stochastic optimizers
basically means comparing the two corresponding samples of Pareto set
approximations. This leads to the issue of statistical hypothesis testing.
While in the deterministic case one can state, e.g., that optimizer 1 achieves
a higher hypervolume indicator value than optimizer 2, a corresponding
statement in the stochastic case could be that the expected hypervolume
indicator value for algorithm 1 is greater than the expected hypervolume
indicator value for algorithm 2 at a significance level of 5%.
In principle, there exist two basic approaches in the literature to analyze
two or several samples of Pareto set approximations statistically. The more
popular approach first transforms the samples of Pareto set approximations
into samples of real values using quality indicators; then, the resulting
samples of indicator values are compared based on standard statistical testing
procedures.

Example 2. Consider two hypothetical stochastic multiobjective optimizers
and assume that the outcomes of three independent optimization runs are
as depicted in Fig. 14.3. If we use the hypervolume indicator with the
reference point (20, 20), we obtain two samples of indicator values:
(277, 171, 135) and (277, 64, 25). These indicator value samples can then be
compared and differences can be subjected to statistical testing procedures.
The alternative approach, the attainment function method, summarizes a
sample of Pareto set approximations in terms of a so-called empirical
attainment function. To explain the underlying idea, suppose that a certain
stochastic multiobjective optimizer is run once on a specific problem. For
each objective vector $z$ in the objective space, there is a certain
probability $p$ that the resulting Pareto set approximation contains an
element $x$ such that $f(x) \preceq z$. We say $p$ is the probability that $z$
is attained by the optimizer. The attainment function gives for each objective
vector $z$ in the objective space the probability that $z$ is attained in one
optimization run of the considered algorithm. As before, the true attainment
function is usually unknown, but it can be estimated on the basis of the
approximation set samples: one simply counts the number of approximation sets
by which each objective vector is attained and normalizes the resulting number
with the overall sample size. The attainment function is a first order moment
measure, meaning that it estimates the probability that $z$ is attained in one
optimization run of the considered algorithm independently of attaining any
other $z$. For the consideration of higher order attainment functions,
Grunert da Fonseca et al. (2001) have developed corresponding statistical
testing procedures.
Example 3. Consider Fig. 14.3. For the scenario on the right, the three Pareto
front approximations cut the objective space into four regions: the upper right
region is attained in all of the runs and therefore is assigned a relative
frequency of 1, the lower left region is attained in none of the runs, and the
remaining two regions are assigned relative frequencies of 1/3 and 2/3 because
they are attained in one and two of the three runs, respectively. In the
scenario on the left, the objective space is partitioned into six regions; the
relative frequencies are determined analogously as shown in the figure.

[Figure not reproduced. Two panels (left and right) with axes f1 and f2, each
ranging from 5 to 20; the regions are labelled with the relative frequencies
0/3, 1/3, 2/3 and 3/3.]
Fig. 14.3. Hypothetical outcomes of three runs for two different stochastic
optimizers (left and right). The numbers in the figures give the relative
frequencies according to which the distinct regions in the objective space
were attained.
A third approach to statistical analysis of approximation sets consists in
ranking the obtained approximations by means of the dominance relation, in
analogous fashion to the way dominance-based fitness assignment ranks objective
vectors in evolutionary multiobjective optimization. Basically, for each Pareto
set approximation generated by one optimizer it is computed by how many
Pareto set approximations produced by another optimizer it is dominated. As
a result, one obtains, for each algorithm, a set of ranks and can statistically
verify whether the rank distributions for two algorithms differ significantly
or not. We call this method dominance ranking.
Example 4. To compare the outcomes of the two hypothetical optimizers
depicted in Fig. 14.3, we check for each pair consisting of one Pareto set
approximation of the first optimizer and one Pareto set approximation of the
second optimizer whether either is better or not. For the Pareto front
approximation represented by the diamond on the left hand side, none of the
three Pareto front approximations on the right is better and therefore it is
assigned the lowest rank 0. The Pareto front approximation associated with the
diamond on the right hand side is worse than all three Pareto front
approximations on the left and accordingly its rank is 3. Overall, the
resulting rank distributions are (0, 0, 1) for the algorithm on the left hand
side and (0, 2, 3) for the algorithm on the right hand side. A special
statistical test can be used to determine whether the two rank distributions
are significantly different.
14.7 Sample Transformations
The three comparison methodologies outlined in the previous section have in
common that the sample of approximation sets associated with an algorithm
is first transformed into another representation (specifically, a sample of
indicator values, an empirical attainment function, or a sample of ranks)
before the statistical testing methods are applied. In the following, we will
review each of the different types of sample transformations in greater detail
(but now considering the dominance ranking first); the issue of statistical
testing will be covered in Section 14.8.
14.7.1 Dominance Ranking
Principles and Procedure
Suppose that we wish to compare the quality of Pareto set approximations
generated by two stochastic multiobjective optimizers, where
$A_1, A_2, \ldots, A_r$ represent the approximations generated by the first
optimizer in $r$ runs, while $B_1, B_2, \ldots, B_s$ denote the approximations
generated by the second optimizer in $s$ runs. Using the preference structure
of the underlying set problem
(, , f , rel ), one can now compare all Ai with all Bj and thereby assign a
figure of merit or a rank to each Pareto set approximation, similarly to the
way that dominance-based fitness assignment works in multiobjective
evolutionary algorithms. In principle, there are several ways to assign each
Pareto set approximation a rank on the basis of a dominance relation, e.g., by
counting the number of sets by which a specific approximation is dominated
(Fonseca and Fleming, 1993) or by performing a nondominated sorting on the
Pareto set approximations under consideration. Here, the former approach in
combination with the extended weak Pareto dominance $\preceq$, cf. Section 14.2
on Page 374, is preferred as it produces a finer-grained ranking, with fewer
ties, than nondominated sorting:

$$\mathrm{rank}(A_i) = |\{ B \mid B \in \{B_1, \ldots, B_s\} \wedge B \preceq A_i \}|. \qquad (14.9)$$
The ranks for $B_1, \ldots, B_s$ are determined analogously. The lower the
rank, the better the corresponding Pareto set approximation with respect to
the entire collection of sets associated with the other optimizer.

The result of this procedure is that each $A_i$ and $B_j$ is associated with
a figure of merit. Accordingly, the samples of Pareto set approximations
associated with each algorithm have been transformed into samples of ranks:
$(\mathrm{rank}(A_1), \mathrm{rank}(A_2), \ldots, \mathrm{rank}(A_r))$ and
$(\mathrm{rank}(B_1), \mathrm{rank}(B_2), \ldots, \mathrm{rank}(B_s))$.
An example performance comparison study using the dominance ranking
procedure can be found in (Knowles et al., 2006).
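The rank assignment of (14.9) can be sketched as follows. The sketch assumes minimization and works directly on Pareto front approximations given as lists of objective vectors. It reads the set relation as "every vector of the second set is weakly dominated by some vector of the first", which is one common way of extending weak Pareto dominance to sets. The function names and sample data are hypothetical.

```python
def weakly_dominates(u, v):
    """u weakly dominates v if u is not worse in any (minimized) objective."""
    return all(a <= b for a, b in zip(u, v))


def set_weakly_dominates(first, second):
    """first weakly dominates second as sets of objective vectors."""
    return all(any(weakly_dominates(a, b) for a in first) for b in second)


def dominance_ranks(own, other):
    """rank(A_i): number of sets from the other optimizer that weakly dominate A_i."""
    return [sum(set_weakly_dominates(b, a) for b in other) for a in own]


# Hypothetical outcomes of two runs per optimizer (two objectives, minimized).
runs_a = [[(1.0, 4.0), (3.0, 2.0)], [(2.0, 5.0), (4.0, 3.0)]]
runs_b = [[(2.0, 5.0), (5.0, 4.0)], [(0.5, 1.0)]]

print(dominance_ranks(runs_a, runs_b))  # ranks of the first optimizer's runs
print(dominance_ranks(runs_b, runs_a))  # ranks of the second optimizer's runs
```

Each rank is an integer between 0 and the number of runs of the other optimizer, so the two resulting rank samples can be handed to the test described in Section 14.8.2.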
Discussion
The dominance ranking approach relies on the concept of Pareto dominance
and some ranking procedure only, and thus yields quite general statements
about the relative performance of the considered optimizers, fairly
independently of any preference information. Thus, we recommend this approach
to be the first step in any comparison: if one optimizer is found to be
significantly better than the other by this procedure, then it is better in a
sense consistent with the underlying preference structure. It may be
interesting and worthwhile to use either quality indicators or the attainment
function to characterize further the differences in the distributions of the
Pareto set approximations, but these methods are not needed to conclude which
of the stochastic optimizers generates the better sets, if a significant
difference can be demonstrated using the ranking of approximation sets alone.
14.7.2 Quality Indicators
Principles and Procedures
As stated earlier, a unary quality indicator $I$ is defined as a mapping from
the set of all Pareto set approximations to the set of real numbers. The order
that $I$ establishes on this set is supposed to represent the quality of the
Pareto set approximations. Thus, given a pair of approximations, $A$ and $B$,
the difference between their corresponding indicator values $I(A)$ and $I(B)$
should reveal a difference in the quality of the two sets. This not only holds
for the case that either set is better, but also when $A$ and $B$ are
incomparable. Note that this type of information goes beyond pure Pareto
dominance and represents additional knowledge; we denote this knowledge as
preference information.
Discussion
Using unary quality indicators in a comparative study is attractive as it
transforms a sample of approximation sets into a sample of reals for which
standard statistical testing procedures exist, cf. Section 14.8. In contrast
to the dominance ranking approach, it is also possible to make quantitative
statements about the differences in quality, even for incomparable Pareto set
approximations. However, this comes at the cost of generality: every unary
quality indicator represents specific preference information. Accordingly, any
statement of the type "algorithm A outperforms algorithm B" needs to be
qualified in the sense of "with respect to quality indicator I"; the situation
may be different for another indicator.
14.7.3 Empirical Attainment Function
Principles and Procedures
The central concept in this approach is the notion of an attainment function.
Since the multiobjective optimizers that we consider may be stochastic, the
result of running the optimizer can be described by a distribution. Because the
optimizer returns a Pareto set approximation in any given run, the distribution
can be described in the objective space by a random set $\mathcal{Z}$ of random
objective vectors $z_j$, with the cardinality of the set, $m$, also random, as
follows:

$$\mathcal{Z} = \{ z_j \in \mathbb{R}^k,\ j = 1, \ldots, m \}, \qquad (14.10)$$

where $k$ is the number of objectives of the problem. The attainment function
is a description of this distribution based on the notion of goal-attainment:
A goal, here meaning an objective vector, is attained whenever it is weakly
dominated by the Pareto front approximation returned by the optimizer. It is
defined by the function $\alpha_{\mathcal{Z}}(\cdot) : \mathbb{R}^k \to [0, 1]$
with

$$\alpha_{\mathcal{Z}}(z) = P(z_1 \preceq z \vee z_2 \preceq z \vee \ldots \vee z_m \preceq z) \qquad (14.11)$$
$$= P(\mathcal{Z} \preceq \{z\}) \qquad (14.12)$$
$$= P(\text{that the optimizer attains goal } z \text{ in a single run}), \qquad (14.13)$$

where $P(\cdot)$ denotes probability. The attainment function is a first order
moment measure, and can be seen as a mean-measure for the set $\mathcal{Z}$.
Thus, it describes the location of the Pareto set approximation distribution;
higher order moments are needed if the variability across runs is to be
assessed, and to assess dependencies between the probabilities of attaining two
or more goals in the same run (see Fonseca et al., 2005).
The attainment function can be estimated from a sample of $r$ independent
runs of an optimizer via the empirical attainment function (EAF) defined as

$$\alpha_r(z) = \frac{1}{r} \sum_{i=1}^{r} I(f(A_i) \preceq \{z\}), \qquad (14.14)$$

where $A_i$ is the $i$th Pareto set approximation (run) of the optimizer and
$I(\cdot)$ is the indicator function, which evaluates to one if its argument is
true and zero if its argument is false. In other words, the EAF gives for each
objective vector in the objective space the relative frequency that it was
attained, i.e., weakly dominated by a Pareto front approximation, with respect
to the $r$ runs.
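Evaluating the EAF at a single goal amounts to counting runs, as the following sketch shows. It assumes minimization and takes the Pareto front approximations $f(A_i)$ directly as lists of objective vectors; the function names and the three sample runs are hypothetical.

```python
def attains(front, goal):
    """True if some objective vector in the front weakly dominates the goal."""
    return any(all(p <= g for p, g in zip(point, goal)) for point in front)


def empirical_attainment(fronts, goal):
    """Fraction of the runs whose front attains the goal, cf. (14.14)."""
    return sum(attains(front, goal) for front in fronts) / len(fronts)


# Hypothetical Pareto front approximations from r = 3 runs.
runs = [
    [(1.0, 5.0), (3.0, 2.0)],
    [(2.0, 4.0), (4.0, 1.0)],
    [(3.0, 3.0)],
]

for goal in [(0.0, 0.0), (3.0, 3.0), (4.0, 5.0)]:
    print(goal, empirical_attainment(runs, goal))
```

The goal (3, 3) is attained by the first and third run but not by the second, so its empirical attainment value is 2/3.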
The outcomes of two optimizers can be compared by performing a corresponding
statistical test on the resulting two EAFs, as will be explained in
Section 14.8.4. In addition, EAFs can also be used for visualizing the
outcomes of multiple runs of an optimizer. For instance, one may be interested
in plotting all the goals that have been attained (independently) in 50% of the
runs. This is defined in terms of a k%-attainment set:

  A Pareto set approximation $A$ is called the k%-attainment set of an
  EAF $\alpha_r(z)$, iff the corresponding Pareto front approximation weakly
  dominates exactly those objective vectors that have been attained in
  at least k percent of the $r$ runs. Formally,

  $$\forall z \in Z : \alpha_r(z) \ge k/100 \Leftrightarrow f(A) \preceq \{z\} \qquad (14.15)$$

We can then plot the attainment surface of such an approximation set, defined
as:

  An attainment surface of a given Pareto set approximation $A$ is the
  union of all tightest goals that are known to be attainable as a result
  of $A$. Formally, this is the set
  $\{ z \in \mathbb{R}^k \mid f(A) \preceq z \wedge \neg\exists z_2 \in \mathbb{R}^k : f(A) \preceq z_2 \prec z \}$.

Roughly speaking, then, the k%-attainment surface divides the objective space
in two parts: the goals that have been attained and the goals that have not
been attained with a frequency of at least k percent.
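For two minimized objectives, the corner points of a k%-attainment surface can be found by a simple sweep over the first objective. The sketch below is an assumption-laden illustration rather than a published algorithm. For each candidate value of f1 it records the best f2 each run offers, and takes the smallest f2 level that at least the required number of runs can reach. The function name and the sample runs are hypothetical.

```python
import math


def attainment_surface_2d(fronts, k_percent):
    """Corner points (f1, f2) of the k%-attainment surface, two minimized objectives."""
    r = len(fronts)
    needed = max(1, math.ceil(k_percent * r / 100))  # runs that must attain a goal
    surface = []
    for x in sorted({p[0] for front in fronts for p in front}):
        # Best f2 value each run achieves when f1 may be at most x.
        best_f2 = sorted(
            min((p[1] for p in front if p[0] <= x), default=math.inf)
            for front in fronts
        )
        y = best_f2[needed - 1]  # lowest f2 level reached by at least `needed` runs
        if y < math.inf and (not surface or y < surface[-1][1]):
            surface.append((x, y))
    return surface


# Hypothetical Pareto front approximations from three runs.
runs = [
    [(1.0, 5.0), (3.0, 2.0)],
    [(2.0, 4.0), (4.0, 1.0)],
    [(3.0, 3.0)],
]

print(attainment_surface_2d(runs, 50))
print(attainment_surface_2d(runs, 100))
```

The returned points are mutually nondominated; drawing the staircase through them separates the goals attained in at least k percent of the runs from the rest.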
Example 5. Suppose a stochastic multiobjective optimizer returns the Pareto
front approximations depicted in Fig. 14.4 for five different runs on a
biobjective optimization problem. The corresponding attainment surfaces are
shown in Fig. 14.5; they summarize the underlying empirical attainment function.
Discussion
The attainment function approach distinguishes itself from the dominance
ranking and indicator approaches by the fact that the transformed samples are
multidimensional, i.e., defined on $Z$ and not on $\mathbb{R}$. Thereby, less
information is lost by the transformation, and in combination with a
corresponding statistical testing procedure detailed differences can be
revealed between the EAFs of two algorithms (see Section 14.8). However, the
approach is computationally expensive and therefore only applicable in the case
of a few objective functions. Concerning visualization of EAFs, recently, an
approximate algorithm has been presented by Knowles (2005) that computes a
given k%-attainment surface only at specified points on a grid and thereby
achieves considerable speedups in comparison with the exact calculation of the
attainment surface defined above.

Fig. 14.4. A plot showing five Pareto front approximations. The visual
evaluation is difficult, although there are only a few points per set, and few
sets.

[Figure not reproduced. Axes: minimize f1(x) (horizontal) and minimize f2(x)
(vertical), each ranging from 0 to 30; legend: run 1 to run 5.]

Fig. 14.5. Attainment surface plots for the Pareto front approximations in
Figure 14.4. The first (solid) line represents the 20%-attainment surface, the
second line the 40%-attainment surface, and so forth; the fifth line stands for
the 100%-attainment surface.
14.8 Statistical Testing
14.8.1 Fundamentals
The previous section has described three different transformations that can
be applied to a sample of Pareto set approximations generated from multiple
runs of an optimizer. The ultimate purpose of generating the samples and
applying the transformations is to allow us to (a) describe and (b) make
inferences about the underlying random approximation set distributions of
the (two or more) optimizers, thus enabling us to compare their performance.

It is often convenient to summarise a random sample from a distribution
using descriptive statistics such as the mean and variance. The mean, median
and mode are sometimes referred to as first order moments of a distribution,
and they describe or summarise the location of the distribution on the real
number line. The variance, standard deviation, and inter-quartile range are
known as second-order moments and they describe the spread of the data.
Using box-plots (Chambers et al., 1983) or tabulating mean and standard
deviation values are useful ways of presenting such data.
Statistical Inferences
Descriptive statistics are limited, however, and should usually be given only
to supplement any statistical inferences that can be made from the data. The
standard statistical inference we would like to make, if it is true, is that
one optimizer's underlying Pareto set approximation distribution is better than
another one's.⁵ However, we cannot determine this fact definitively because we
only have access to finite-sized samples of Pareto set approximations. Instead,
it is standard practice to assume that the data is consistent with a simpler
explanation known as the null hypothesis, $H_0$, and then to test how likely
the observed data would be if this were true. $H_0$ will often be of the form
"samples A and B are drawn from the same distribution" or "samples A and B are
drawn from distributions with the same mean value". The probability of
obtaining a finding at least as impressive as that obtained, assuming the null
hypothesis is true, is called the p-value and is computed using an inferential
statistical test. The significance level, often denoted as $\alpha$, defines
the largest acceptable p-value and represents a threshold that is user-defined.
A p-value lower than the chosen significance level then signifies that the null
hypothesis can be rejected in favour of an alternative hypothesis, $H_A$, at a
significance level of $\alpha$. The definition of the alternative hypothesis
usually takes one of two forms. If $H_A$ is of the form "sample A comes from a
better distribution than sample B" then the inferential test is a one-tailed
test. If $H_A$ does not specify a prediction about which distribution is
better, and is of the form "sample A and sample B are from different
distributions" then it is a two-tailed test. A one-tailed test is more powerful
than a two-tailed test, meaning that for a given alpha value, it rejects the
null hypothesis more readily in cases where it is actually false.
Non-parametric Statistical Inference: Rank and Permutation Tests
Some inferential statistical tests are based on assuming the data is drawn
from a distribution that closely approximates a known distribution, e.g. the
normal distribution or Student's t distribution. Such known distributions are
completely defined by their parameters (e.g. the mean and standard deviation),
and tests based on these known distributions are thus termed parametric
statistical tests. Parametric tests are powerful (that is, the null hypothesis
is rejected in most cases where it is indeed false) because even quite small
differences between the means of two normal distributions can be detected
accurately. However, unfortunately, the assumption of normality cannot be
theoretically justified for stochastic optimizer outputs, in general, and it is
difficult to empirically test for normality with relatively small samples
(fewer than 100 runs). Therefore, it is safer to rely on nonparametric tests
(Conover, 1999), which make no assumptions about the distributions of the
variables.
⁵ Most statistical inferences are formulated in terms of precisely two
samples, in this way.
Two main types of nonparametric tests exist: rank tests and permutation
tests. Rank tests pool the values from several samples and convert them into
ranks by sorting them, and then employ tables describing the limited number
of ways in which ranks can be distributed (between two or more algorithms)
to determine the probability that the samples come from the same source.
Permutation tests use the original values without converting them to ranks
but estimate the likelihood that samples come from the same source explicitly
by Monte Carlo simulation. Rank tests are less powerful but are also less
sensitive to outliers and computationally cheap. Permutation tests are more
powerful because information is not thrown away, and they are also better
when there are many tied values in the samples; however, they can be expensive
to compute for large samples.
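A Monte Carlo permutation test can be sketched in a few lines. The sketch assumes a one-tailed alternative ("sample A comes from a distribution with larger values") and uses the difference of sample means as test statistic; the function name, the number of resamples and the seed are hypothetical choices. The data are the two hypervolume samples quoted in Example 2.

```python
import random


def permutation_test(sample_a, sample_b, n_resamples=10_000, seed=0):
    """Estimated one-tailed p-value for H_A: sample_a tends to have larger values."""
    rng = random.Random(seed)
    n_a, n_b = len(sample_a), len(sample_b)
    observed = sum(sample_a) / n_a - sum(sample_b) / n_b
    pooled = list(sample_a) + list(sample_b)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reassignment of the values to the two samples
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed - 1e-12:  # tolerance guards against rounding noise
            at_least_as_extreme += 1
    # Counting the observed assignment itself keeps the estimate above zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


hv_optimizer_1 = (277, 171, 135)
hv_optimizer_2 = (277, 64, 25)
p_value = permutation_test(hv_optimizer_1, hv_optimizer_2)
print(p_value)
```

With only three runs per optimizer there are just 20 distinct ways to split the pooled values, five of which are at least as extreme as the observed split, so the estimate settles near 0.25 and no significant difference can be claimed at the 5% level.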
In the following, we describe selected methods for nonparametric inference
testing for each of the different transformations. We follow this with a
discussion of issues relating to matched samples, multiple inference testing,
and assessing worst- and best-case performance.
14.8.2 Comparing Samples of Dominance Ranks
Dominance ranking converts the samples of approximation sets from two or
more optimizers into a sample of dominance ranks. A test statistic is computed
from these ranks by summing over the ranks in each of the two samples and
taking the difference of these sums. In order to determine whether the value of
the test statistic is significant, a permutation test must be used. The
standard Mann-Whitney rank sum test and tables (Conover, 1999) cannot be used
here because the rank distributions are affected by the fact that the sets are
partially ordered (rather than totally ordered numbers). Thus, to compute the
null distribution, the assignment of the Pareto set approximations to the
optimizers must be permuted. Basically, the set
$\{A_1, A_2, \ldots, A_r, B_1, B_2, \ldots, B_s\}$ is partitioned into one set
of $r$ approximations and another set of $s$ approximations; for each
partitioning the difference between the rank sums can be determined, finally
yielding a distribution of rank sum differences. Details for this statistical
testing procedure are given in (Knowles et al., 2006).
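The partitioning scheme can be sketched for small samples by enumerating every split of the pooled sets. This sketch assumes minimization, a one-tailed alternative in favour of the first optimizer, and that ranks are recomputed against the other group for every partition. It is not the procedure of Knowles et al. (2006), and all names and data are hypothetical.

```python
from itertools import combinations


def weakly_dominates(u, v):
    return all(a <= b for a, b in zip(u, v))


def set_weakly_dominates(first, second):
    """Every vector of `second` is weakly dominated by some vector of `first`."""
    return all(any(weakly_dominates(a, b) for a in first) for b in second)


def rank_sum_difference(group_a, group_b):
    """Rank sum of group_a minus rank sum of group_b (lower ranks are better)."""
    rank_sum_a = sum(set_weakly_dominates(b, a) for a in group_a for b in group_b)
    rank_sum_b = sum(set_weakly_dominates(a, b) for b in group_b for a in group_a)
    return rank_sum_a - rank_sum_b


def dominance_rank_test(runs_a, runs_b):
    """Exact one-tailed p-value for H_A: the first optimizer obtains lower ranks."""
    pooled = list(runs_a) + list(runs_b)
    observed = rank_sum_difference(runs_a, runs_b)
    hits = total = 0
    for chosen in combinations(range(len(pooled)), len(runs_a)):
        chosen_set = set(chosen)
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        hits += rank_sum_difference(first, second) <= observed
    return hits / total


# Hypothetical fronts: every run of the first optimizer dominates every run of the second.
runs_a = [[(1.0, 1.0)], [(2.0, 2.0)], [(3.0, 3.0)]]
runs_b = [[(4.0, 4.0)], [(5.0, 5.0)], [(6.0, 6.0)]]
print(dominance_rank_test(runs_a, runs_b))
```

With three runs each there are only 20 partitions, so even this clear-cut example cannot yield a p-value below 1/20. For realistic sample sizes the enumeration would be replaced by random sampling of partitions.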
14.8.3 Comparing Sample Indicator Values
The use of a quality indicator reduces the dimension of a Pareto set
approximation to a single figure of merit. One of the main advantages, and
underlying motivations, for using indicators is that this reduction to one
dimension allows statistical testing to be carried out in a relatively
straightforward manner using standard univariate statistical tests, i.e. as is
done when comparing best-of-population fitness values (or equivalents) in
single-objective algorithm comparisons. Here, the Mann-Whitney rank sum test or
Fisher's permutation test can be used (Conover, 1999); the Kruskal-Wallis test
may be more appropriate if multiple (more than two) algorithms are to be
compared.
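To make the rank-based comparison concrete, the sketch below computes the Mann-Whitney U statistic for two samples of indicator values and derives an exact one-tailed p-value by enumerating all relabellings. This is only feasible for very small samples; in practice tables or a statistics library would be used. The function names are hypothetical, and the data are the hypervolume samples from Example 2, where larger values are better.

```python
from itertools import combinations


def u_statistic(sample_a, sample_b):
    """Number of pairs (a, b) with a > b, counting ties as one half."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample_a
        for b in sample_b
    )


def exact_rank_sum_p_value(sample_a, sample_b):
    """Exact one-tailed p-value for H_A: values in sample_a tend to be larger."""
    pooled = list(sample_a) + list(sample_b)
    observed = u_statistic(sample_a, sample_b)
    hits = total = 0
    for chosen in combinations(range(len(pooled)), len(sample_a)):
        chosen_set = set(chosen)
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        hits += u_statistic(first, second) >= observed
    return hits / total


hv_optimizer_1 = (277, 171, 135)
hv_optimizer_2 = (277, 64, 25)
print(u_statistic(hv_optimizer_1, hv_optimizer_2))
print(exact_rank_sum_p_value(hv_optimizer_1, hv_optimizer_2))
```

Because only the ordering of the values enters the statistic, replacing 25 by any other value below 64 would leave the result unchanged, which illustrates the robustness of rank tests to outliers.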
In the case that a combination of multiple quality indicators is considered
(see Page 386), slightly different preferences are assessed by each of the
indicators and this may help to build up a better picture of the overall
quality of the Pareto set approximations. On the other hand, using several
indicators does bring into play multiple testing issues if the distributions
from different indicators are being tested independently, cf. Section 14.8.5.
14.8.4 Comparing Empirical Attainment Functions
The EAF of an optimizer is a generalization of a univariate empirical
cumulative distribution function (ECDF) (Grunert da Fonseca et al., 2001). In
order to test if two ECDFs are different, the Kolmogorov-Smirnov (KS) test can
be applied. This test measures the maximum difference between the ECDFs
and assesses the statistical significance of this difference. An algorithm that
computes a KS-like test for two EAFs is described in (Shaw et al., 1999).
The test only determines if there is a significant difference between the two
EAFs, based on the maximum difference. It does not determine whether one
algorithm's entire EAF is above the other one:

$$\forall z \in Z,\ \alpha_r^A(z) \ge \alpha_r^B(z),$$

or not. In order to probe such specific differences, one must use methods for
visualizing the EAFs.
For two-objective problems, plotting significant differences in the empirical
attainment functions of two optimizers, using a pair of plots, can be done
by colour-coding either (i) levels of difference in the sample probability, or
(ii) levels of statistical significance of a difference in sample probability,
of attaining a goal, for all goals. Option (ii) is more informative and can be
computed from the fact that there is a correspondence between the statistical
significance level $\alpha$ of the KS-like test and the maximum distance
between the EAFs that needs to be exceeded. Thus the KS-like test can be run
for different selected $\alpha$ values to compute these different distances.
Then, the actual measured distances between the EAFs at every $z$ can be
converted to a significance level.

An example of such a pair of plots is shown in Figure 14.6. This kind
of plot has been used to good effect in (López-Ibáñez et al., 2006). Note
also that Fonseca et al. (2005) have devised plots that can indicate
second-order information, i.e. the probability of an optimizer attaining pairs
of goals simultaneously.
14.8.5 Advanced Topics
Matched Samples
When comparing a pair of stochastic optimizers, two slightly different
scenarios are possible. In one case, each run of each optimizer is a completely
independent random sample; that is, the initial population (if appropriate),
the random seed, and all other random variables are drawn independently
and at random on each run. In the other case, the influence of one or more
random variables is partially removed from consideration; e.g. the initial
population used by the two algorithms may be matched in corresponding runs, so
that the runs (and hence the final quality indicator values) should be taken as
pairs. In the former scenario, the statistical testing will reveal, in quite
general terms, whether there is a difference in the distributions of indicator
values resulting from the two stochastic optimizers, from which a general
performance difference can be inferred. In the latter scenario, taking the
particular case where initial populations are matched, the statistical testing
reveals whether there is a difference in the indicator value distributions
given the same initial population, and the inference in this case relates to
the optimizer's ability to improve the initial population. While the former
scenario is more general, the latter may give more statistically significant
results.

[Figure not reproduced. Two panels titled "A attains" and "B attains"; both
axes are labelled "minimize"; each panel marks the grand worst and grand best
attainment surfaces.]
Fig. 14.6. Individual differences between the probabilities of attaining
different goals on a two-objective minimization problem with optimizer O1 and
optimizer O2, shown using a greyscale plot. The grand best and worst attainment
surfaces (the same in both plots) indicate the borders beyond which the goals
are never attained or always attained, computed from the combined collection of
Pareto set approximations. Differences in the frequency with which certain
goals are met by the respective algorithms O1 and O2 are then represented in
the region between these two surfaces. In the left plot, darker regions
indicate goals that are attained more frequently by O1 than by O2. In the right
plot, the reverse is shown. The intensity of the shading can correspond to
either the magnitude of a difference in the sample probabilities, or to the
level of statistical significance of a difference in these probabilities.
If matched samples have been collected, then the Wilcoxon signed rank
test (Conover, 1999) or Fisher's matched samples test (Conover, 1999) can
be used instead of the Mann-Whitney rank sum test or Fisher's permutation
test, respectively.
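For matched samples, a permutation test works on the paired differences. The sketch below is a sign-flip randomization test in the spirit of a matched samples permutation test, not a transcription of the procedure in Conover (1999). Under the null hypothesis each paired difference is assumed equally likely to be positive or negative, so all sign assignments are enumerated. The function name and the paired indicator values are hypothetical.

```python
from itertools import product


def matched_pairs_p_value(values_a, values_b):
    """Exact one-tailed p-value for H_A: paired values of a tend to exceed those of b."""
    diffs = [a - b for a, b in zip(values_a, values_b)]
    observed = sum(diffs)
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped_sum = sum(s * d for s, d in zip(signs, diffs))
        hits += flipped_sum >= observed - 1e-12  # tolerance for rounding noise
    return hits / total


# Hypothetical indicator values from five runs with matched initial populations.
indicator_a = (10, 12, 9, 14, 11)
indicator_b = (8, 11, 10, 10, 9)
print(matched_pairs_p_value(indicator_a, indicator_b))
```

The enumeration grows as 2 to the power of the number of pairs, so for more than roughly twenty pairs the sign assignments would be sampled at random instead.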
Multiple Testing
Multiple testing (Benjamini and Hochberg, 1995; Bland and Altman, 1995;
Miller, 1981; Perneger, 1998; Westfall and Young, 1993) occurs when one
wishes to consider several statistical hypotheses (or comparisons)
simultaneously. When considering multiple tests, the significance of each
single result needs to be adjusted to account for the fact that, as more tests
are considered, it becomes more and more likely that some (unspecified) result
will give an extreme value, resulting in a rejection of the null hypothesis for
that test.

For example, imagine we carry out a study consisting of twenty different
hypothesis tests, and assume that we reject the null hypothesis of each test if
the p-value is 0.05 or less. Now, the chance that at least one of the inferences
will be a type-1 error (i.e. the null hypothesis is wrongly rejected) is
$1 - 0.95^{20} \approx 64\%$, when assuming that the null hypothesis was true
in every case. In other words, more often than not, we wrongly claim a
significant result (on at least one test). This situation is made even worse if
we only report the cases where the null hypothesis was rejected, and do not
report that the other tests were performed: in that case, results can be
utterly misleading to a reader.
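The figure of 64% can be reproduced directly. The sketch assumes, as the calculation in the text implicitly does, that the tests are independent and that every null hypothesis is true; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type-1 error among independent tests with true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 5, 20):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))
```

Even five tests at the 5% level already push the chance of at least one false rejection above 22%.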
Multiple testing issues in the case of assessing stochastic multiobjective
optimizers can arise for at least two dierent reasons:
There are more than two algorithms and we wish to make inferences about
performance dierences between all or a subset of them.
There are just two algorithms, but we wish to make multiple statistical
tests of their performance, e.g., considering more than one indicator.
Clearly, this is a complicated issue and we can only touch on the correct
procedures here. The important thing to know is that the issue exists, and
to do something to minimize the problem. We briefly consider five possible
approaches:
i). Do all tests as normal (with uncorrected p-values) but report all tests done
openly and notify the reader that the significance levels are not, therefore,
reliable.
ii). In the special case where we have multiple algorithms but just one statistic
(e.g. one indicator), use a statistical test that is designed explicitly for
assessing several independent samples. The Kruskal-Wallis test (Conover,
1999) is an extension of the two-sample Mann-Whitney test that works
for multiple samples. Similarly, the Friedman test (Conover, 1999) extends
the paired Wilcoxon signed rank test to any number of related samples.
iii). In the special case where we want to use multiple statistics (e.g. multiple
different indicators) for just two algorithms, and we are interested only
in an inference derived per-sample from all statistics (e.g. we want to
test the significance of a difference in hypervolume between those pairs $A_i$
and $B_i$ where the diversity difference between them is positive), then the
permutation test can be used to derive the null distribution, as usual.
iv). Minimize the number of different tests carried out on a pair of algorithms
by carefully choosing which tests to apply before collecting the data.
Collect independent data for each test to be carried out.
v). Apply the tests on the same data but use methods for correcting the
p-values for the reduction in confidence associated with data re-use.
Approach (i) does not allow powerful conclusions to be drawn, but it at least
avoids misrepresentation of results. The second approach is quite restrictive
as it only applies to a single test being applied to multiple algorithms, and it
uses rank tests, which might not be appropriate in all circumstances. Similarly,
(iii) only applies in the special case noted. A more general approach is (iv),
which is just the conservative option; the underlying strategy is to perform
a test only if there is some realistic chance that the null hypothesis can be
rejected (and the result would be interesting). This careful conservatism can
then be accommodated. However, while following (iv) might be possible much
of the time, sometimes it is essential to do several tests on limited data and to
be as confident as possible about any positive results. In this case, one should
then use approach (v).
The simplest and most conservative, i.e., weakest approach for correcting
the p-values is the Bonferroni correction (Bland and Altman, 1995). Suppose
we would like to consider an overall significance level of $\alpha$ and that altogether
$n$ comparisons, i.e., distinct statistical tests, are performed per sample. Then,
the significance level $\alpha_s$ for each distinct test is set to

    $\alpha_s = \frac{\alpha}{n}$    (14.16)

Explicitly, given $n$ tests $T_i$ for hypotheses $H_i$ ($1 \le i \le n$) under the assumption
$H_0$ that all hypotheses $H_i$ are false, and if the individual test critical values
are $\le \alpha/n$, then the experiment-wide critical value is $\le \alpha$. In equation form,
if

    $P(T_i \text{ passes} \mid H_0) \le \frac{\alpha}{n}$ for $1 \le i \le n$,    (14.17)

then

    $P(\text{some } T_i \text{ passes} \mid H_0) \le \alpha$.    (14.18)

In most cases, the Bonferroni approach is too weak to be useful and other
methods are preferable (Perneger, 1998), e.g., resampling based methods
(Westfall and Young, 1993).
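As a minimal sketch of how the Bonferroni correction is applied in practice, the following code divides the overall significance level by the number of tests and compares each p-value against the result. The function names and the example p-values are hypothetical and serve only as an illustration.

```python
def bonferroni_level(alpha: float, n_tests: int) -> float:
    """Per-test significance level: the overall level divided by the number of tests."""
    return alpha / n_tests


def bonferroni_reject(p_values, alpha=0.05):
    """For each p-value, report whether its null hypothesis is rejected
    at the Bonferroni-corrected level."""
    level = bonferroni_level(alpha, len(p_values))
    return [p <= level for p in p_values]


if __name__ == "__main__":
    # Three made-up p-values; the corrected per-test level is 0.05 / 3.
    print(bonferroni_reject([0.001, 0.02, 0.04], alpha=0.05))
```

In this made-up example only the first p-value stays significant after correction, although all three are below the uncorrected level of 0.05. This illustrates why the approach is described as conservative.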
Assessing Worst-Case or Best-Case Performance
In certain circumstances, it may be important to compare the worst-case or
best-case performance of two optimizers. Obtaining statistically significant
inferences for these is more computationally demanding than when assessing
differences in mean or typical performance; however, it can be done using
permutation methods, such as bootstrapping or variants of Fisher's permutation
test (Efron and Tibshirani, 1993, chap. 15).
For example, let us say that we wish to estimate whether there is a difference
in the expected worst indicator value of two algorithms, when each is run
ten times. To assess this, one can run each algorithm for 30 batches of 10 runs,
and find the mean of the worst-in-a-batch value, for each algorithm. Then, to
compute the null distribution, the labels of all 600 samples can be randomly
permuted, and the worst indicator value from those with a label in 1, ..., 10
is determined. By sampling this statistic many times, the desired p-value
for the difference between the mean worst-in-a-batch statistics can
be computed. Quite obviously, such a testing procedure is quite general and
it can be tailored to answer many questions related to worst-case or best-case
performance.
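The procedure just described can be sketched as follows. This is one possible reading of it, under stated assumptions: smaller indicator values are taken to be better (so the worst value in a batch is the maximum), the test statistic is the absolute difference between the two mean worst-in-a-batch values, and the pooled samples are re-split into batches after each random relabelling. All function and parameter names are ours.

```python
import random


def mean_worst_per_batch(values, batch_size):
    """Mean of the worst (here: largest) indicator value in each consecutive batch."""
    batches = [values[i:i + batch_size] for i in range(0, len(values), batch_size)]
    return sum(max(batch) for batch in batches) / len(batches)


def worst_case_permutation_test(a, b, batch_size=10, n_permutations=2000, seed=0):
    """Estimate a p-value for the difference in mean worst-in-a-batch value
    between two samples of indicator values by randomly permuting the labels."""
    rng = random.Random(seed)
    observed = abs(mean_worst_per_batch(a, batch_size)
                   - mean_worst_per_batch(b, batch_size))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of all samples
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        statistic = abs(mean_worst_per_batch(perm_a, batch_size)
                        - mean_worst_per_batch(perm_b, batch_size))
        if statistic >= observed:
            at_least_as_extreme += 1
    # Count the observed labelling itself so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic indicator values: 30 batches of 10 runs per algorithm.
    runs_a = [rng.gauss(0.0, 1.0) for _ in range(300)]
    runs_b = [rng.gauss(0.5, 1.0) for _ in range(300)]
    print(worst_case_permutation_test(runs_a, runs_b))
```

For best-case performance, `max` would be replaced by `min` (or the reverse if larger indicator values are better); the rest of the procedure stays the same.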
14.9 Summary
This chapter deals with the issue of assessing and comparing the quality of
Pareto set approximations. Two current principal approaches, the quality
indicator method and the attainment function method, are discussed, and, in
addition, a third approach, the dominance-ranking technique, is presented (see footnote 6).
As discussed, there is no "best" quality assessment technique with respect
to both quality measures and statistical analysis. Instead, it appears to be
reasonable to use the complementary strengths of the three general approaches.
As a first step in a comparison, it can be checked whether the considered
optimizers exhibit significant differences using the dominance-ranking approach,
because such an analysis allows the strongest type of statements. Quality
indicators can then be applied in order to quantify the potential differences
in quality and to detect differences that could not be revealed by dominance
ranking. The corresponding statements are always restricted as they only hold
for the preferences that are represented by the considered indicators. The
computation and the statistical comparison of the empirical attainment functions
are especially useful in terms of visualization and to add another level of
detail; for instance, plotting the regions of significant difference gives hints on
where the outcomes of two algorithms differ.
6 Implementations for selected quality indicators as well as statistical testing
  procedures can be downloaded at http://www.tik.ee.ethz.ch/sop/pisa/ under the
  heading "performance assessment".
We noted when discussing quality indicators that, as well as their traditional
use to assess optimization outcomes, they can also be used within optimizers,
to guide the generating process (Beume et al., 2007; Fleischer, 2003;
Smith et al., 2008; Wagner et al., 2007; Zitzler and Künzli, 2004). Optimizers
that seek to maximize a quality indicator directly are effectively conducting
the search in the space of approximation sets, rather than in the space of
solutions or points. This seems a logical and attractive approach when attempting
to generate a Pareto front approximation, because ultimately the outcome
will be assessed using a quality indicator (usually). However, although such
approaches are improving, some of them still rely on approximation of the
set-based indicator function, or they do not rely solely on the indicator, but
make use of heuristics concerning individuals (points/solutions) (e.g., an
individual's nondominated rank) as well. A recent study even compared set-based
selection with individual-based selection, and found the latter to be generally
preferable.
Quality indicators for assessing Pareto front approximations are sometimes
used without explicitly stating what the DM preferences are. Really,
the indicator(s) used should reflect any information one has about the DM
preferences, so that approximation sets are assessed appropriately. The work
of Hansen and Jaszkiewicz (1998) defined some quality indicators in terms of
sets of utility functions, a framework that easily allows for DM preferences to
be incorporated into assessment. A similar approach was recently proposed
by Zitzler et al. (2007) for the hypervolume. Both of these indicator families
can be used to incorporate preferences within generating methods (potentially
in an interactive fashion).
Finally, note that there are several further issues that have not been treated
in this chapter, e.g., binary quality indicators; indicators taking the decision
vectors into account; computation of indicators on parallel or distributed
architectures. Many of these issues represent current research directions which
will probably lead to modified or additional performance assessment methods
in the near future.
Acknowledgements
Sections 14.1 to 14.5 summarize the results of the discussion of the working
group on set quality measures during the Dagstuhl seminar on evolution-
ary multiobjective optimization 2006. The working group consisted of the
following persons: Jörg Fliege, Carlos M. Fonseca, Christian Igel, Andrzej
Jaszkiewicz, Joshua D. Knowles, Alexander Lotov, Serpil Sayin, Lothar Thiele,
Andrzej Wierzbicki, and Eckart Zitzler. The authors would also like to thank
Carlos M. Fonseca for valuable discussion and for providing the EAF tools.
References
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and
powerful approach to multiple testing. Journal of the Royal Statistical Society,
Series B (Methodological) 57, 125–133 (1995)
Berezkin, V.E., Kamenev, G.K., Lotov, A.V.: Hybrid adaptive methods for approximating
a nonconvex multidimensional Pareto frontier. Computational Mathematics
and Mathematical Physics 46(11), 1918–1931 (2006)
Beume, N., Rudolph, G.: Faster S-Metric Calculation by Considering Dominated
Hypervolume as Klee's Measure Problem. In: Proceedings of the Second IASTED
Conference on Computational Intelligence, pp. 231–236. ACTA Press, Anaheim
(2006)
Beume, N., Naujoks, B., Emmerich, M.: SMS-EMOA: Multiobjective selection based
on dominated hypervolume. European Journal of Operational Research 181,
1653–1669 (2007)
Bland, J.M., Altman, D.G.: Multiple significance tests: the Bonferroni method.
British Medical Journal 310, 170 (1995)
Chambers, J., Cleveland, W., Kleiner, B., Tukey, P.: Graphical Methods for Data
Analysis. Wadsworth, Belmont (1983)
for Solving Multi-Objective Problems. Kluwer Academic Publishers, New York
(2002)
Conover, W.J.: Practical Nonparametric Statistics, 3rd edn. John Wiley and Sons,
New York (1999)
Czyzak, P., Jaszkiewicz, A.: Pareto simulated annealing - a metaheuristic for
multi-objective combinatorial optimization. Multi-Criteria Decision Analysis 7, 34–47
(1998)
Deb, K.: Multi-objective optimization using evolutionary algorithms. Wiley, Chich-
ester (2001)
Deb, K., Pratap, A., Agrawal, S., Meyarivan, T.: A fast and elitist multi-objective ge-
181–197 (2002)
Efron, B., Tibshirani, R.: An introduction to the bootstrap. Chapman and Hall,
London (1993)
Ehrgott, M., Gandibleux, X.: A Survey and Annotated Bibliography of Multiobjective
Combinatorial Optimization. OR Spektrum 22, 425–460 (2000)
Fleischer, M.: The Measure of Pareto Optima. In: Fonseca, C.M., Fleming, P.J.,
Zitzler, E., Deb, K., Thiele, L. (eds.) EMO 2003. LNCS, vol. 2632, pp. 519–533.
Fonseca, C.M., Fleming, P.J.: Genetic Algorithms for Multiobjective Optimization:
Formulation, Discussion and Generalization. In: Forrest, S. (ed.) Proceedings of
the Fifth International Conference on Genetic Algorithms, pp. 416–423. Morgan
Kaufmann, San Mateo (1993)
Fonseca, C.M., Grunert da Fonseca, V., Paquete, L.: Exploring the performance of
stochastic multiobjective optimisers with the second-order attainment function.
In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.) EMO 2005.
LNCS, vol. 3410, pp. 250–264. Springer, Heidelberg (2005)
Fonseca, C.M., Paquete, L., López-Ibáñez, M.: An Improved Dimension-Sweep Algorithm
for the Hypervolume Indicator. In: Congress on Evolutionary Computation
(CEC 2006), Sheraton Vancouver Wall Centre Hotel, Vancouver, BC Canada, pp.
Grunert da Fonseca, V., Fonseca, C.M., Hall, A.O.: Inferential Performance As-
sessment of Stochastic Optimisers and the Attainment Function. In: Zitzler, E.,
Deb, K., Thiele, L., Coello Coello, C.A., Corne, D.W. (eds.) EMO 2001. LNCS,
Hansen, M.P., Jaszkiewicz, A.: Evaluating the quality of approximations of the non-
dominated set. Technical report, Institute of Mathematical Modeling, Technical
University of Denmark. IMM Technical Report IMM-REP-1998-7 (1998)
Helbig, S., Pateva, D.: On several concepts for ε-efficiency. OR Spektrum 16(3),
179–186 (1994)
Kamenev, G., Kondtratev, D.: Method for the exploration of non-closed nonlinear
models (in Russian). Matematicheskoe Modelirovanie 4(3), 105–118 (1992)
Kamenev, G.K.: Approximation of completely bounded sets by the deep holes method.
Computational Mathematics and Mathematical Physics 41, 1667–1676 (2001)
Knowles, J.: A summary-attainment-surface plotting method for visualizing the performance
of stochastic multiobjective optimizers. In: Computational Intelligence
and Applications, Proceedings of the Fifth International Workshop on Intelligent
Systems Design and Applications: ISDA'05 (2005)
Knowles, J., Corne, D.: On Metrics for Comparing Non-Dominated Sets. In:
Congress on Evolutionary Computation (CEC 2002), pp. 711–716. IEEE Press,
Piscataway (2002)
Knowles, J., Thiele, L., Zitzler, E.: A Tutorial on the Performance Assessment of
Stochastic Multiobjective Optimizers. TIK Report 214, Computer Engineering
and Networks Laboratory (TIK), ETH Zurich (2006)
Knowles, J.D.: Local-Search and Hybrid Evolutionary Algorithms for Pareto Opti-
mization. Ph.D. thesis, University of Reading (2002)
López-Ibáñez, M., Paquete, L., Stützle, T.: Hybrid population-based algorithms for
the bi-objective quadratic assignment problem. Journal of Mathematical Modelling
and Algorithms 5(1), 111–137 (2006)
Lotov, A.V., Kamenev, G.K., Berezkin, V.E.: Approximation and Visualization of
Pareto-Efficient Frontier for Nonconvex Multiobjective Problems. Doklady Mathematics
66(2), 260–262 (2002)
(2004)
Miller, R.G.: Simultaneous Statistical Inference, 2nd edn. Springer, New York (1981)
Perneger, T.V.: What's wrong with Bonferroni adjustments. British Medical Journal
316, 1236–1238 (1998)
Schott, J.: Fault Tolerant Design Using Single and Multicriteria Genetic Algorithm
Optimization. Master's thesis, Department of Aeronautics and Astronautics,
Massachusetts Institute of Technology (1995)
Shaw, K.J., Nortcliffe, A.L., Thompson, M., Love, J., Fonseca, C.M., Fleming, P.J.:
Assessing the Performance of Multiobjective Genetic Algorithms for Optimization
of a Batch Process Scheduling Problem. In: 1999 Congress on Evolutionary
Computation, Washington, D.C., pp. 37–45. IEEE Computer Society Press, Los
Alamitos (1999)
Smith, K.I., Everson, R.M., Fieldsend, J.E., Murphy, C., Misra, R.: Dominance-
based multiobjective simulated annealing. IEEE Transactions on Evolutionary
Computation. In press (2008)
Ulungu, E.L., Teghem, J., Fortemps, P.H., Tuyttens, D.: MOSA method: A tool for
solving multiobjective combinatorial optimization problems. Journal of
Multi-Criteria Decision Analysis 8(4), 221–236 (1999)
Van Veldhuizen, D.A.: Multiobjective Evolutionary Algorithms: Classifications,
Analyses, and New Innovations. Ph.D. thesis, Graduate School of Engineering,
Air Force Institute of Technology, Air University (1999)
Wagner, T., Beume, N., Naujoks, B.: Pareto-, Aggregation-, and Indicator-Based
Methods in Many-Objective Optimization. In: Obayashi, S., Deb, K., Poloni,
C., Hiroyasu, T., Murata, T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 742–756.
Springer, Heidelberg (2007), extended version published as internal report of
Sonderforschungsbereich 531 Computational Intelligence CI-217/06, Universität
Dortmund (September 2006).
Westfall, P.H., Young, S.S.: Resampling-based multiple testing. Wiley, New York
(1993)
While, L.: A New Analysis of the LebMeasure Algorithm for Calculating Hypervolume.
In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.) EMO
While, L., Bradstreet, L., Barone, L., Hingston, P.: Heuristics for Optimising the
Calculation of Hypervolume for Multi-objective Optimisation Problems. In: Congress
on Evolutionary Computation (CEC 2005), IEEE Service Center, Edinburgh,
Scotland, pp. 2225–2232. IEEE Computer Society Press, Los Alamitos (2005)
While, L., Hingston, P., Barone, L., Huband, S.: A Faster Algorithm for Calculating
Hypervolume. IEEE Transactions on Evolutionary Computation 10(1), 29–38
(2006)
Zitzler, E., Künzli, S.: Indicator-Based Selection in Multiobjective Search. In: Yao,
X., Burke, E.K., Lozano, J.A., Smith, J., Merelo-Guervós, J.J., Bullinaria, J.A.,
Rowe, J.E., Tiňo, P., Kabán, A., Schwefel, H.-P. (eds.) PPSN 2004. LNCS,
Zitzler, E., Thiele, L.: Multiobjective Optimization Using Evolutionary Algorithms
- A Comparative Case Study. In: Eiben, A.E., Bäck, T., Schoenauer, M., Schwefel,
H.-P. (eds.) PPSN 1998. LNCS, vol. 1498, pp. 292–301. Springer, Heidelberg
(1998)
Zitzler, E., Thiele, L.: Multiobjective Evolutionary Algorithms: A Comparative Case
Study and the Strength Pareto Approach. IEEE Transactions on Evolutionary
Computation 3(4), 257–271 (1999)
Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C.M., Grunert da Fonseca, V.:
Performance Assessment of Multiobjective Optimizers: An Analysis and Review.
Zitzler, E., Brockhoff, D., Thiele, L.: The Hypervolume Indicator Revisited: On the
Design of Pareto-compliant Indicators Via Weighted Integration. In: Obayashi, S.,
Deb, K., Poloni, C., Hiroyasu, T., Murata, T. (eds.) EMO 2007. LNCS, vol. 4403,
15
Interactive Multiobjective Optimization from a
Learning Perspective
Valerie Belton1, Jürgen Branke2, Petri Eskelinen3, Salvatore Greco4,
Julián Molina5, Francisco Ruiz5, and Roman Słowiński6
1 Department of Management Science, University of Strathclyde, 40 George
  Street, Glasgow, UK, G1 1QE, val.belton@strath.ac.uk
2 Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany
3 Helsinki School of Economics, P.O. Box 1210, FI-00101 Helsinki, Finland,
  Petri.Eskelinen@hse.fi
4 Faculty of Economics, University of Catania, Corso Italia 55, 95129 Catania,
  Italy, salgreco@unict.it
5 Department of Applied Economics (Mathematics), University of Málaga, Calle
  Ejido 6, E-29071 Málaga, Spain, julian.molina@uma.es, rua@uma.es
6 Institute of Computing Science, Poznań University of Technology, 60-965
  Poznań, Poland, and Systems Research Institute, Polish Academy of Sciences,
  01-447 Warsaw, Poland, roman.slowinski@cs.put.poznan.pl
Abstract. Learning is inherently connected with Interactive Multiobjective
Optimization (IMO); therefore, a systematic analysis of IMO from the learning
perspective is worthwhile. After an introduction to the nature and the interest of learning
within IMO, we consider two complementary aspects of learning: individual learning,
i.e., what the decision maker can learn, and model or machine learning, i.e., what
the formal model can learn in the course of an IMO procedure. Finally, we discuss
how one might investigate learning experimentally, in order to understand how to
better support decision makers. Experiments involving a human decision maker or
a virtual decision maker are considered.
Reviewed by: Kalyanmoy Deb, Indian Institute of Technology Kanpur, India
15.1 Introduction
The aim of this chapter is to explore the notion of learning in the context
of Interactive Multiobjective Optimization (IMO) where Classical Multiobjective
Optimization (CMO) (see Chapter 2) or Evolutionary Multiobjective
Optimization (EMO) (see Chapter 7) are used. This is an important subject
because, on one hand, IMO enables the Decision Maker (DM) to learn about
the optimization problem, and, on the other hand, it allows the formal model
to evolve in response to additional information about preferences of the DM,
which can be viewed as learning of the formal model. In consequence, the
quality of an IMO procedure is related to what and how the DM and the
model can learn in the course of the search for the most preferred solution. A
characterization of IMO procedures from this perspective requires answers to
several questions. What are characteristic features of individual learning? How
can individual learning be supported? How do different models learn about
preferences of the DM? What relations exist between individual learning and
model learning? We try to answer these questions in order to assess potential
benefits of incorporating interactive procedures within the evolutionary
process, taking into account both behavioural and technical aspects. In our
investigation, we have sought to take account of past and current research
in this area; however, we are aware that the analysis presented is not to the
depth that the importance of and potential interest in the topic would merit.
The subject is simply worth another book. Our modest aim in this chapter
is to point out interesting issues and to present some preliminary conclusions
supported by a selective review of literature. We would wish to encourage
future research in this area, and to recommend particular attention to learning
aspects when developing and implementing IMO procedures.
The chapter begins with a brief broad overview of learning and why it is
of interest to us. The following two sections go on to explore in greater detail
individual learning and model learning in the context of IMO. The fourth
section focuses on methodologies and procedures to investigate learning within
IMO, and how we can characterize IMO procedures from the learning point
of view, in order to transmit the best practice from CMO to EMO.
15.2 What Is Learning?

Literature on learning is extensive and cuts across a number of domains,
most traditionally Psychology and Education, which are both concerned with
individual learning (see (Merriam and Caffarella, 1991) for a well regarded
overview), but more recently (since the mid 20th century) including Artificial
Intelligence and Machine Learning (see (Russell and Norvig, 2003) for a
comprehensive review); Organisational Studies, with the influential work of
Argyris and Schön (1978) on single and double loop learning; and the work
of Lave and Wenger (1991) on social and situated learning in communities of
practice.
At the general level of seeking to support complex decision making in
organisations, all of these perspectives on learning are relevant. However, in
this chapter we focus on an individual participating in an interactive decision
making process, with the dual objectives of directly facilitating the DM's
learning (individual learning) and of progressive modelling of the DM's preferences
(model learning).
15.2.1 Individual Learning
Individual learning is a concept which is intuitively meaningful to us all, but
one which conjures up many different interpretations - for example, you might
learn that the world's population is currently about 6.5 billion (CIA, 2008),
how to ride a bike, that you prefer apples to oranges but your friend prefers
bananas to both, that drinking more than 2 cups of coffee a day leaves you
a nervous wreck, or that you now believe it is unethical to eat meat and
intend to become a vegetarian. These interpretations incorporate a range of
learning outcomes. Some of the outcomes are objective in the sense of being
subject to external validation, for example, facts and conceptual relationships,
demonstration of skills or changes in behaviour. Others, such as beliefs, values,
and self-awareness, are internal and subjective, in the sense of pertaining to
the individual; others relate to your understanding of others' values and are
inter-subjective. These outcomes are the consequence of learning processes
and associated stimuli. Possible learning processes range from the planned and
explicit, such as formal education or conscious reflection, to the unconscious
learning that leads to the building of tacit knowledge, or the learning may
be triggered by an unexpected event. Stimuli for learning may be: exposure
to new ideas or knowledge; a good or bad experience; the need to respond
to questions about one's own ideas; the motivation to challenge others' ideas;
the need to take action; or the process of conscious reflection. It is important
also to recognise different "levels" of learning; from the acquisition of new
knowledge or facts driven by external stimuli, through the internal processes
of assimilating and making sense of what we know or have experienced, to a
transformation of perspective as a consequence of critical reflection.
Thus, learning is both process and product, so we need to pay attention
to how individuals learn and what they learn in the course of an IMO procedure.
15.2.2 Model Learning
Model learning is a concept which underlines an evolution of the model in view
of facts observed through sensors in an external world. Instead of evolution,
one can also speak about adaptation to a new situation. A model is usually
implemented as a computer program on a machine, hence the term machine
learning is often used instead of model learning. Thus, machine, or model,
learning is broadly defined as the ability of a computer program to improve its
performance with regard to a defined task by learning from data or examples.
Machine, or model, learning is an important subfield of Artificial Intelligence
(Michalski et al., 1998). Of course, a model learns (evolves, adapts) in a way
and with regard to a task defined by a human. The model relates an output
(response or dependent variable, decision attribute, conclusion) with an input
(explanatory or independent variables, condition attributes, premise), either
analytically, using a function, or logically, using decision rules or trees.
Traditionally, the learning of functions falls into a broad term of regression (ordinal
regression, statistical regression, neural network training), while learning of
decision rules or trees is called induction (learning by example, data mining,
knowledge discovery). Several model learning techniques are surveyed in
Chapter 10.
In the case of IMO, the input information for model learning is a reflection
of subjective matters, such as an individual's preferences. It is exhibited by
pieces of information such as: classification, rating or pairwise comparisons
of selected solutions; specification of a reference or a reservation point; or
indication of acceptable trade-offs. Note that the information provided by
the DM may be holistic, when it concerns overall evaluation of particular
solutions expressed in terms of classification, rating or some pairwise
comparisons, and/or decomposed, when it concerns directly some parameters of
the decision model (weights, trade-offs, marginal value functions, preference
thresholds, etc.). The output of model learning is a more or less explicit
preference model of the DM who provided the input information. It may be used
to guide the search for a "preferred" solution, and evolve from iteration to
iteration with regard to new input information. In Chapters 4 and 5 of this book,
two methods in which the model learning component is of primary importance
are presented. In the GRIP method (Chapter 4) involving ordinal regression,
the model learns a set of additive value functions compatible with pairwise
comparisons of some reference solutions, including some indications of
intensity of preference. In the Dominance-based Rough Set Approach (Chapter
5), the model learns association rules characterizing the Pareto optimal set,
and decision rules characterizing "good" solutions in a given population. Model
learning also plays a role in interactive evolutionary multiobjective
optimization discussed in Chapters 6 and 7, at least for those approaches that attempt
to infer a higher-level description of the DM's preferences from the interaction.
Model learning and individual learning are strongly inter-related, since the
model learns from reactions of the DM and the DM learns from explanations
provided by the model, as discussed in the next section.
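To make the idea of a model learning from holistic preference information more concrete, the following sketch fits the weights of a linear value function to a few pairwise comparisons. It is a deliberately simple perceptron-style update and is not the GRIP method or any other procedure from this book; it assumes that all objectives are to be maximized, and the function name, step size and example comparisons are hypothetical.

```python
def learn_linear_value_function(comparisons, n_objectives, epochs=100, step=0.1):
    """Fit non-negative weights (summing to one) of a linear value function so
    that, for each pair (preferred, other), the preferred objective vector
    receives the higher value. Objectives are assumed to be maximized."""
    uniform = [1.0 / n_objectives] * n_objectives
    weights = list(uniform)
    for _ in range(epochs):
        violated = False
        for preferred, other in comparisons:
            diff = [p - o for p, o in zip(preferred, other)]
            if sum(w * d for w, d in zip(weights, diff)) <= 0:
                # The model disagrees with the DM: move the weights in the
                # direction that favours the preferred solution.
                weights = [max(0.0, w + step * d) for w, d in zip(weights, diff)]
                total = sum(weights)
                weights = [w / total for w in weights] if total > 0 else list(uniform)
                violated = True
        if not violated:
            break  # every stated comparison is reproduced by the model
    return weights


if __name__ == "__main__":
    # Hypothetical answers of a DM comparing two-objective solutions.
    answers = [((1.0, 0.0), (0.0, 1.0)), ((0.8, 0.5), (0.9, 0.1))]
    print(learn_linear_value_function(answers, n_objectives=2))
```

Each new answer from the DM can be appended to the list of comparisons and the weights refitted, which mirrors the way the preference model evolves from iteration to iteration. If the comparisons cannot be reproduced by any linear value function, the loop simply stops after the given number of epochs.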
15.2.3 The Learning Cycle

The overall cycle of IMO is depicted in Figure 15.1. Using preference
information articulated by the DM, the inference engine constructs or updates the
computer's model of the DM's preferences. The inference engine depends on
the one hand on the nature of the preference information, and on the other
hand on the nature of the considered model of the DM's preferences. Thus, for
example, the inference engine can determine:
• a preference model in terms of a linear value function built using preference
  information expressed in terms of subjective trade-offs (Geoffrion et al.,
  1972),
• a preference model in terms of an achievement scalarizing function built
  using preference information expressed in terms of a reference point
  consisting of aspiration levels assigned to considered criteria (Wierzbicki, 1980),
• a preference model in terms of one or a set of additive value functions
  built using preference information expressed in terms of a partial or
  complete preorder of some reference solutions (Jacquet-Lagrèze et al. (1987),
  Chapter 4),
• a preference model in terms of a set of "if..., then..." decision rules induced
  from rough approximation of an ordinal classification of some representative
  solutions in the Pareto optimal set (see Chapter 5).
Based on this model, the computer attempts to find better solutions using
an optimizer. These solutions are then presented to the DM, possibly with
some additional information and/or specific requests for a new input. The
outputs of the inference and optimization phase help the DM to learn about
the problem and may influence her preferences. This general outline shows
the complex interaction between individual learning (which is influenced by
the optimizer's output) and the model learning (which is influenced by the
preference information provided by the DM). Thus, in the context of IMO, one
cannot be considered independently of the other. The interaction with the DM
should be designed such that the DM's response allows the best update of the
preference model, but also so that it helps the DM learn as much as possible about
the problem. It is important to realize that these goals may be in conflict.
One implication of this is that it may be a good idea to give the DM more
information than the minimum necessary to provide the required preference
information. For example, instead of proposing only one solution to the DM as
the best in the current state of interaction (which is the minimum information
that one could expect from a calculation stage of an IMO procedure), the
DM can learn much more from a visualization of the Pareto optimal set (see
Chapter 9), or from its description in terms of "if..., then..." association rules
explaining that if certain levels are attained on some objectives, it is not
reasonable to expect more than some other levels on correlated objectives
(see Chapter 5).
As indicated above, this general framework can be instantiated in many
different ways according to:
• How the decision maker's preferences are articulated: for example, via
  pairwise comparisons or classification of a set of solutions, specification of
  a reference point, bounds or maximal/minimal trade-offs.
• The nature of the inference engine: for example, an ordinal regression,
  a rule induction mechanism, an artificial neural network or evolutionary
  algorithms.
• The adopted preference model: for example, a value function, achievement
  scalarizing function, outranking relation or set of decision rules.
• The output of the inference and optimization tool: for example, new
  solution candidates, ranking or classification of solutions, decision rules or
  association rules.
[Figure: diagram of two blocks, "Decision Maker" (user preferences, user knowledge) and "Optimizer/Computer" (inference engine, preference model, optimization), connected by the flows "Preference information" and "Information and requests for user input".]
Fig. 15.1. Learning cycle of IMO
15.2.4 Motivation for the Interest in Learning
Consideration of learning is important in the context of interactive decision
processes because it is one of a number of factors we might wish to consider
in evaluating the performance of different approaches (Vanderpooten, 1990;
Olson, 1992; Miettinen, 1999). The broad aim of both "classical" and
"evolutionary" approaches to multiobjective optimisation is to guide the DM, in an
efficient and effective manner, to a preferred solution that is Pareto optimal.
The DM does not know in advance what are the "good" solutions from the
technical, objective perspective (Pareto optimality) and thus she has to learn
what is possible. Analogously, the model does not know which of these
solutions the DM will prefer from the subjective perspective. The extent to which
the DM's preferences are pre-formed is an open question. The answer to this
depends in part on one's philosophical perspective, but there is much support
for the constructivist view of modelling for decision support (Morecroft and
Sterman, 1992; Roy, 1993; Belton and Elder, 1994) which argues that the
whole IMO process is instrumental in helping the DM learn about her
preferences. The expectation is that an effective learning process would lead to
increased satisfaction with and confidence in a decision, as well as a better
understanding of the underlying rationale. However, there are many unanswered
questions. Are these benefits achievable? Are some interactive processes more
effective than others? Does it depend on characteristics of the DM or the
problem? To what extent is the technical performance of an approach in
conflict with the learning effectiveness and if so, how can an appropriate balance
be achieved? An important meta-level question is "How can we research these
issues?" We return to this discussion in Section 15.5. In Sections 15.3 and 15.4
we expand on the discussions of individual learning and model, or machine,
learning.
15.3 Aspects of Individual Learning

15.3.1 What Does a Decision Maker Learn?
The aim of any decision aiding process is that the DM should discover a way
forward which is both feasible and desirable - we might say that she learns
her "preferred" solution. We position the word preferred in quotation marks,
because, as Korhonen and Wallenius (1996) highlight, the notion of
"preferred" (or "most preferred", or "satisfactory") is problematical; they
propose a definition based on a DM's conviction that, given a realistic
understanding of the problem, it is not possible to find a solution she likes
better. Thus there are two important, interdependent components to the
learning necessary to arrive at a "preferred" solution - an understanding of
the problem and an understanding of what the DM's values are.
Understanding the problem first calls for knowledge both of what is
technically relevant and possible and of what matters to the DM, i.e., what
her values are. It is important that appropriate attention is paid to the
process of problem structuring, as advocated by Keeney's (1992) value-focused
thinking and Belton and Stewart's (2002) integrated approach to Multiple
Criteria Decision Making (MCDM), in order to properly learn what are the
key objectives and constraints. This is particularly important if an analysis
incorporates only a few key objectives. The interactive process may lead to
the surfacing or emergence of new criteria, which may lead to the analyst and
DM deciding to re-analyse the problem.
Having specified what defines a potential solution, the foregoing description
of a preferred solution then requires the DM to have a realistic
understanding of what solutions are possible. This could be interpreted in
many ways - the set of all possible solutions, the set of Pareto optimal
solutions, an understanding of how objectives are causally related (i.e.,
what trade-offs are necessary), etc. The IMO process will enable the DM to
learn about these issues to a greater or lesser degree.
A good interactive process should enable the DM to learn about her
preferences and there is a strong belief on the part of analysts that learning
does take place (Miettinen, 1999; Vanderpooten, 1990; Olson, 1992), and in
our own experiences this is echoed by DMs asked to reflect on their
involvement in case studies. However, the notion is also problematic. The
field of MCDM has an extensive language, together with associated parameters
and structures, to reflect the notion of preference within its models, for
example: importance weights, global weights, swing weights, preference
thresholds, indifference thresholds, veto thresholds, aspiration levels,
goals, acceptance levels, value functions, preference functions, ... to
mention a few. The extent to which these parameters are psychologically
meaningful to DMs and the questions used to elicit them are behaviourally
realistic is often questioned and contributes to the cognitive complexity of
the task. Larichev's (Larichev, 1992) consideration of these issues is one of
the most significant contributions in the MCDM community to date, and there
have been several subsequent calls (for example, Dyer et al., 1992; Korhonen
and Wallenius, 1996) to pay attention to behavioural issues and to the work
of the Behavioural Decision Theory community.
The doubt about the psychological meaningfulness of specific preference model
parameters to DMs has also led several researchers to develop approaches
which eliminate these parameters from the dialogue between the DM and the
model. Instead, the DM is asked to provide holistic preference information,
e.g., in the form of pairwise comparisons of selected solutions, which serve
as examples for model learning. The preference model learned from such
information is used, in turn, on the complete set of solutions in order to
obtain a preference relation on this set. A proper exploitation of this
relation leads to "preferred" solutions. This approach emphasizes the
discovery of intentions as an interpretation of actions rather than as an
a priori position. It is thus concordant with the principle of posterior
rationality proposed by March (1978) and with the aggregation-disaggregation
logic of Jacquet-Lagrèze (1981). The best-known implementations in the field
of MCDM are: the UTA method (Jacquet-Lagrèze and Siskos, 1982), with its
extensions UTA^GMS (Greco et al., 2008) and the previously mentioned GRIP
(Figueira et al., 2008) - in which the model learns a set of compatible value
functions; and the Dominance-based Rough Set Approach (DRSA) (Greco et al.,
2001, 2005, 2008; Słowiński et al., 2005) in which the model learns a set of
"if..., then..." decision rules.
15.3.2 How Do We Know if a Decision Maker Has Learned?
A DM's confidence in the decision and understanding of the related issues
(such as the range of potential solutions, the necessary trade-offs, etc.) and
the ability to explain her choice would seem to be good indicators that she
has learned. However, we should be careful not to mistake the DM learning how
to speak the technical language of MCDM for truly learning about her
preferences. It may also be possible to detect learning as an interactive
process progresses. For example, if the DM provides preference information
inconsistent with previously provided information, this could be due to
learning. Learning about the process itself and about her preferences would
lead a DM to being able to anticipate the next step and to become more
comfortable with making judgements at a later stage in the process.
Furthermore, a willingness to use the process again, or to recommend it to
someone else, is an indicator of satisfaction which may be associated with
learning.
15.3.3 How Does a Decision Maker Learn?
The Chinese proverb "Tell me and I will forget, show me and I may remember,
involve me and I will understand", which is often quoted in an educational
context, is equally applicable here and highlights the potential power of
interactive methods to facilitate learning. However, the way in which someone
learns depends on many factors, including what is being learned, the
circumstances in which this happens, the characteristics of the individual
and her motivation to do so. The circumstances of the DM may also influence
her willingness to seriously engage in the interactive process, for example,
her degree of ownership of the problem, her motivation or the pressure to
find a good solution, the time available, her general mental state (whether
she is relaxed, stressed, alert, etc.), as well as the presence or absence of
a facilitator (analyst).
The characteristics of the approach, in conjunction with the DM
characteristics, will determine whether the overall chemistry is successful,
for example: How is information presented? What questions does the DM have to
respond to and how easy or difficult is it, firstly to understand the
question and, secondly, to answer? How flexible is the approach? How
transparent is the approach? What opportunities are there to explore and
experiment? How demanding is the approach, for example, how many interactions
are required? What level of time commitment is required? How long is the time
span between interactions? For example, Korhonen and Wallenius (1996)
indicate that DMs often appear to tire of interactive procedures, and may
conclude the process prematurely.
Much learning relies on feedback systems of the kind characterised as single
and double loop learning by Argyris and Schön (1978). Single loop learning,
which can be described as learning for improvement, achieves that
improvement without questioning underlying assumptions or values. Action is
initiated with a view to achieving a specified goal; the outcome is compared
with the objective and if this is not met corrective action is taken. A
thermostat is an oft-cited example of single loop learning. Double loop
learning, which may be described as learning for understanding or
transformation, results from a challenge to and rethinking of assumptions or
values. An interactive process such as IMO is a feedback system of exactly
this nature. However, it is important to consider when and how it is
appropriate to encourage single or double loop learning. In some
circumstances the DM may be strongly focused on quickly achieving specific
goals, a form of single loop learning. Another DM may be more interested in
exploration and experimentation. An environment which permits experimentation
or play (as advocated by March (1988) in "The technology of foolishness")
encourages trial and error and is more likely to lead to double loop
learning. Learning for understanding was an important objective of early
visual interactive systems to support MCDM such as VIG (Korhonen, 1987) and
V.I.S.A. (Belton and Vickers, 1989). The ability to see a situation through a
different lens, facilitated in one sense by powerful tools for visualisation
(see Chapters 8 and 9), can also be a catalyst for learning. The involvement
of other stakeholders in the decision can also highlight different
perspectives on an issue. If their view is significantly different, this is
one way of challenging the DM's thinking; another potential catalyst for
learning. Other challenges might arise if the DM finds it very difficult to
express a preference between options, if she does not see any progress in the
interactive search process, perceives inconsistencies in where the search
takes her, or is surprised or disappointed by results. In much of the
literature on learning a "disorienting dilemma" is cited as being necessary
for learning, which leads to a significant change in understanding the
problem permitting a more inclusive, discriminating and integrating
perspective (Mezirow, 1991). Perhaps it might be interesting to build such
challenges into a method if it is felt that the DM would benefit from the
challenge? It is also important to allow space for reflection, to give the DM
time to mull over her thought processes and to ensure that no new insights
arise, or if they do, to provide the opportunity to retrace one's steps.
Many models of individual "learning style" have been proposed, two of
the most common being the Learning Styles Inventory (LSI) (Kolb, 1984)
based on Kolb's learning cycle (Kolb and Fry, 1975; Kolb, 1984) and
Visual/Auditory/Kinaesthetic (VAK) models (Dunn et al., 1984) together with
extended versions of this associated with Gardner's (1993) theory of multiple
intelligences. Although the validity of these models is questioned by some
and they are primarily situated in the educational domain, they may offer
insight into factors which may influence how a DM responds to interactive
methods in general and to a particular approach. For example, the VAK
classification provides an indication of an individual's preferred mode(s) of
receiving information. We might expect auditory learners to respond poorly to
the more usual, visual (text or graphics) way of presenting information in
interactive methods. Kolb's LSI categorises learners on two dimensions which
define four learning styles. The dimensions are: the way in which they
approach a situation (through reflective observation - watching what happens
and reflecting on it, or active experimentation - just getting on and doing
it); and the way in which they make sense of the experience (through abstract
conceptualisation - i.e., thinking about it, or through concrete experience -
or "feeling" it). Active experimenters are perhaps more likely to respond
positively to interactive methods as usually implemented.
An important issue related to the question of how a DM learns is the
transparency and traceability of the decision support process. If information
provided by the DM is processed in a way that is not clear, the consequence
may be that she cannot see how a final recommendation is derived from her
inputs. Such a decision model is essentially a black box, the output of which
has to be accepted on the basis of trust in the analyst's expertise and
authority. The DM has not learned about the problem, about her preferences or
why she should decide in a particular way. Ideally, the decision model should
be a glass box which fulfils both representation and recommendation tasks
in a transparent way (see (Roy, 1993; Słowiński et al., 2005)). However, it
is likely that different representations are seen as more or less transparent
by different DMs. Some may be comfortable with a simple functional
representation augmented by a natural language explanation (for example, see
(Papamichail and French, 2003)). Others may prefer a rule-based approach
in which the model can be expressed in quasi-natural language, such as that
described by Greco et al. (2005, 2008) and by Słowiński et al. (2005). In this
approach each decision rule can be clearly identified with those parts of the
preference information (decision examples) which support the rule, and the
preference model is decomposed into simple scenarios which inform the DM
in a quasi-natural language about the local trade-offs. In this way, the rules
permit traceability of the decision support process and give understandable
justifications for the decision to be made.
It is also important to pay attention to the extensive work from the field
of Behavioural Decision Theory (BDT) on the exploration of factors which
may influence how a DM perceives a problem situation and how the way in
which she responds to questions designed to elicit preferences can be
influenced by the framing of the problem and the phrasing of the question.
There is a substantial literature on this topic (for example, see (Kahneman
et al., 1982; von Winterfeldt and Edwards, 1982)) and it is not possible to
review this in detail here. One phenomenon which is particularly relevant in
the context of IMO is the notion of anchoring, one form of which is the
status-quo bias; this demonstrates that DMs are reluctant to move away from
an initial position, or solution, even if that position is only recently or
even hypothetically assigned (see (Tversky and Kahneman, 1991) or (Keeney et
al., 2006) for examples). The effects of anchoring in IMO were investigated
by Buchanan and Corner (1997). Other potentially relevant phenomena are loss
aversion (decision makers are more significantly influenced by differences
framed as losses rather than gains - see Tversky and Kahneman 1991) and the
effect of decoy options (whereby the presence of an option C can influence a
DM's choice between A and B - see (Huber et al., 1982)). The reader is also
referred to Korhonen and Wallenius (1996) who review behavioural issues in
MCDM and a recent paper by Morton and Fasolo (2008), which reviews work in
BDT of particular relevance to the use of methods based on multiattribute
value theory.
Having considered what and how a DM might learn from engaging in an
interactive process to support her decision making, together with the factors
that might influence the extent to which the experience is a positive and
productive one, we have tried to summarise these in Figure 15.2. The key
drivers of learning are shown in the hexagonal boxes, with the solid arrows
indicating the learning cycle of experimentation and reflection; the
rectangular boxes depict "what" is learned, linked to "how" by the dashed
arrows; and influential factors are grouped and displayed in the ovals. In
the next section we go on to consider factors which define the ability of
different models to learn about the DM's preferences.
15.4 Aspects of Model (or Machine) Learning

In the decision making phase of an interactive procedure, the DM provides
some information about her preferences and this information is used to
construct a preference model, which is usually a value function, even if it is
interpreted as a distance (metric) between a reference point and an attainable
point or, more generally, as an achievement scalarizing function. This model is
used in the next calculation phase to find a compromise solution that fits the
presumed DM's preferences. In the following decision making phase, the DM
can accept this solution or give some new preference information that updates
the preference model. By adapting the model, the computer learns about the
DM's preferences. It is not our aim in this chapter to discuss technical details
of the processes by which this might happen, but to reflect on the generic
characteristics of these and to introduce a model of the quality of the learning
process.
A fundamental feature of this learning process is the flexibility of the
interaction procedure, meaning the capacity to incorporate any preference
information coming from the DM. The flexibility is related to the generality
of the preference model and to the reversibility of the procedure, understood
as a possibility for the DM to return to a previous iteration in order to change
the preference information provided at that stage.
From the point of view of generality of the preference model, an impor-
tant characteristic is consideration of a plurality of value functions which are
compatible with the preference information provided by the DM. This is an
important distinctive feature of some interactive methods, contrasting with
methods which consider only one instance of the value function within a given
class (for example the linear value functions).
[Figure 15.2: diagram. Key drivers of learning: Exploration &
Experimentation, Reflection, Challenge. What is learned: understanding
inter-related variables, understanding limits, understanding what is
possible, understanding what is preferred, understanding what matters.
Facilitated by: ease of interaction, option for trial & error, option to
backtrack / change one's mind, visualisation. Provoked by: surprising or
disappointing outcomes, difficulty of making judgements, contradictory
results, alternative perspectives. Situational influences: time available,
way of presenting information, nature and number of questions asked.
Behavioural influences: learning style, motivation to find a solution, mental
state.]
Fig. 15.2. What and how a DM learns and the factors that influence this.

To illustrate by an example: consider the case in which the decision maker
wants to minimize two objective functions $g_1$ and $g_2$, and the decision
maker says that she prefers solution $x$, for which $g_1(x) = 4$ and
$g_2(x) = 8$, to solution $y$, for which $g_1(y) = 6$ and $g_2(y) = 4$. If we
consider the class of linear value functions
$U(a) = \lambda_1 g_1(a) + \lambda_2 g_2(a)$, then two cases are possible:
- one can look for just one value function compatible with the preferences
  of the decision maker and, for example, consider the instance of the form
  $U(a) = 2g_1(a) + 3g_2(a)$, which verifies the representation condition:
  $$U(x) = 2g_1(x) + 3g_2(x) = 32 > U(y) = 2g_1(y) + 3g_2(y) = 24.$$
  The twenty-four methods of estimating additive utilities described by
  Fishburn (1967) are classical examples of such an approach.
- one can look for the whole set of value functions compatible with the
  preferences of the decision maker, i.e. all instances of value functions
  $U(a) = \lambda_1 g_1(a) + \lambda_2 g_2(a)$, such that
  $$\lambda_1 g_1(x) + \lambda_2 g_2(x) > \lambda_1 g_1(y) + \lambda_2 g_2(y).$$
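The contrast between the two cases can be checked numerically. The following
Python sketch uses the numbers of the example; the function names and the
small integer grid of coefficients are illustrative assumptions and are not
part of any of the methods discussed in this chapter.

```python
def linear_value(weights, g):
    """Linear value U(a) = w1*g1(a) + w2*g2(a) for objective values g."""
    return sum(w * gi for w, gi in zip(weights, g))


def is_compatible(weights, preferred, other):
    """True if this value function ranks `preferred` strictly above `other`."""
    return linear_value(weights, preferred) > linear_value(weights, other)


x = (4, 8)  # (g1(x), g2(x)) from the example
y = (6, 4)  # (g1(y), g2(y)) from the example

# Case 1: a single compatible instance, U(a) = 2*g1(a) + 3*g2(a).
single = (2, 3)
u_x = linear_value(single, x)
u_y = linear_value(single, y)

# Case 2: all compatible instances on a small, hypothetical coefficient grid.
grid = [(w1, w2) for w1 in range(6) for w2 in range(6)]
compatible = [w for w in grid if is_compatible(w, x, y)]

print(u_x, u_y)          # values of x and y under the single instance
print(len(compatible))   # number of compatible grid points
```

For these data the condition in the second case reduces to the first
coefficient being smaller than twice the second, so a whole region of
coefficient pairs, not only (2, 3), reproduces the stated preference.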

Another interesting feature related to the learning process is universality,
i.e. the non-specificity of the form of value functions: the less specific the
form, the greater the chance that the model learns in a sequence of
iterations. For example, the preference model admitting the form of a linear
value function is less universal than the preference model admitting the form
of an additive value function ($U(a) = U_1[g_1(a)] + U_2[g_2(a)]$, with $U_1$
and $U_2$ being non-decreasing functions, in the above example), which, in
turn, is less universal than the model admitting the form of an increasing
monotonic value function ($U(a) = F(g_1(a), g_2(a))$, with $F$ being
non-decreasing in both its arguments, in the above example).
However, we note that there are two sides to universality. On the one hand,
the more universal the preference model, the higher the chance of being able
to properly represent the DM's true preferences. On the other hand, a more
complex underlying model requires more information, and thus more tiresome
interaction, to properly tune the many parameters. In particular, if the DM
gives a noisy response, a complex underlying preference model may overfit,
while a simple function will be more robust to the noise. Finally, even a less
general model may be sufficient to appropriately reflect the DM's preferences
in the local vicinity of the current point of interest.
Another point to be considered is the zooming capacity of the preference
model. From this point of view, we have on one side methods using the
preference model only locally, i.e., representing the DM's preferences in a
limited region of the objective space, and on the other side methods using it
globally, i.e., representing the DM's preferences in the whole objective
space. For example, the GDF method is mainly local, and the reference point
method is mainly global. Between these two extremes, there are methods which
use a global model for the DM's preferences, but allow it to be refined
locally in the course of the interactive process. One example of a method
with such a "zooming" capacity would be UTA^GMS, where the addition of
preference information relative to solutions from a particular region results
in a more precise representation of preferences in this local region.
An overall representation of the features characterizing the learning ca-
pacity of the preference model in an interactive procedure is presented in a
tree form in Figure 15.3.
In the table displayed in Figure 15.4, we show how the above features are
present in a few representative interactive procedures, outlined briefly
below. We have selected a set of sufficiently diversified procedures, from the
viewpoint of the model learning - this diversification is connected with the
type of preference information provided by the DM, type of preference model
being learned, and type of information given by the method to the DM. For an
exhaustive overview of interactive methods see Chapter 2.

Learning capacity of the preference model
  Flexibility
    Generality
      Plurality of instances of value functions
      Universality of the form of value functions
    Reversibility
  Zooming capacity
Fig. 15.3. Tree of features characterizing the learning capacity of the
preference model in an interactive procedure.
STEM method (Benayoun et al., 1971). The procedure progressively
reduces the set of admissible Pareto optimal solutions by adding constraints
expressing the concessions that the DM is willing to make with respect
to the considered criteria. The compromise solutions proposed to the DM are
obtained by minimizing the distance (weighted Tchebychev metric) from the
utopia point to the set of admissible solutions. Let us remark that STEM
is devoted to linear problems, but nonlinear variants of STEM have been
proposed, for example, in (Vanderpooten and Vincke, 1989; Eschenauer et al.,
2000). Observe, moreover, that STEM asks the DM to classify the objective
functions into those whose values are satisfactory in the current solution and
those whose values are unsatisfactory. This means that it is a
classification-based method (see Chapter 2).
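The role of the weighted Tchebychev metric and of the concession constraints
can be illustrated on a finite sample of solutions. The Python sketch below is
a simplified illustration only, not the original STEM formulation: it assumes
minimized objectives, a hypothetical four-point Pareto sample, hand-picked
weights, and it sets the weight of the relaxed objective to zero in the second
iteration.

```python
def weighted_tchebychev(z, utopia, weights):
    """Weighted Tchebychev distance: max_i w_i * |z_i - utopia_i|."""
    return max(w * abs(zi - ui) for w, zi, ui in zip(weights, z, utopia))


def compromise(points, utopia, weights, upper_bounds=None):
    """Return the admissible point closest to the utopia point.

    `upper_bounds` models the DM's concessions as per-objective limits
    (objectives are minimized); None means that no limits apply yet.
    """
    def admissible(z):
        if upper_bounds is None:
            return True
        return all(zi <= b for zi, b in zip(z, upper_bounds))

    candidates = [z for z in points if admissible(z)]
    return min(candidates, key=lambda z: weighted_tchebychev(z, utopia, weights))


# Hypothetical sample of mutually non-dominated points (g1, g2).
pareto_sample = [(1.0, 9.0), (3.0, 6.0), (5.0, 4.0), (8.0, 1.0)]
utopia = (1.0, 1.0)

# Iteration 1: no concessions yet, equal weights (an assumption).
first = compromise(pareto_sample, utopia, (0.5, 0.5))

# Iteration 2: the DM judges g1 satisfactory and concedes it up to 8.0, while
# g2 must not get worse than its current value; the relaxed objective g1
# receives zero weight in this sketch.
second = compromise(pareto_sample, utopia, (0.0, 1.0),
                    upper_bounds=(8.0, first[1]))

print(first, second)
```

Each added bound shrinks the admissible set, which is why a concession, once
made, is not revisited in this scheme.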
Geoffrion, Dyer and Feinberg (GDF) method ((Geoffrion et al., 1972),
Chapter 2). The procedure maximizes a value function which is not known
explicitly, but is assumed to be differentiable and concave. In each iteration
the DM provides preference information specifying a subjective local trade-off
between different criteria. The subjective trade-off information defines a
search direction for the next calculation phase. A solution that maximizes the
unknown value function in the established search direction is proposed to the
DM, and the process is repeated until the DM is satisfied.
Reference Point method ((Wierzbicki, 1980), Chapter 2) finds in each
iteration a Pareto optimal point, called the current point, which optimizes
an achievement scalarizing function referring to a reference point (attainable
or not) specified by the DM (as well as several other solutions obtained by
shifting the reference point). The DM is free to modify the reference point,
and she is expected to use this freedom to learn about the shape of the Pareto
optimal set and to explore its interesting parts.
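How moving the reference point steers the search can be sketched on a finite
sample. The Python fragment below assumes minimized objectives and uses one
common augmented form of an achievement scalarizing function (a max term plus
a small additive term); the sample points, unit weights and the value of `rho`
are hypothetical choices made for illustration.

```python
def achievement(z, reference, weights, rho=1e-3):
    """Augmented achievement scalarizing function (smaller is better).

    Combines a max-type term with a small additive term, for minimized
    objectives z compared against the DM's reference point.
    """
    terms = [w * (zi - ri) for w, zi, ri in zip(weights, z, reference)]
    return max(terms) + rho * sum(terms)


def current_point(points, reference, weights):
    """Point of the sample that optimizes the achievement function."""
    return min(points, key=lambda z: achievement(z, reference, weights))


# Hypothetical sample of mutually non-dominated points (g1, g2).
sample = [(1.0, 9.0), (3.0, 6.0), (5.0, 4.0), (8.0, 1.0)]
weights = (1.0, 1.0)

# Three reference points the DM might try in successive iterations;
# (2, 2) is not attainable, the other two are.
references = [(2.0, 2.0), (9.0, 3.0), (2.0, 7.0)]
explored = [current_point(sample, r, weights) for r in references]

print(explored)
```

Each reference point yields a different current point, so in this sketch the
model learns one point of the Pareto front per iteration, while the DM
gradually learns about its shape.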
Light Beam Search (LBS) method (Jaszkiewicz and Słowiński, 1999)
organizes the search on the Pareto set through a sampling of a neighborhood of
the current point which moves as a result of either a change of the decision
maker's reference point, or a shift of the current point within its
neighborhood. An
outranking relation is used as a local preference model in the neighborhood
of the current point. The neighborhood is composed of Pareto optimal points
that outrank the current point, so the neighborhood includes points that are
comparable with the current point; the Pareto optimal points from outside the
neighborhood are either incomparable with or are outranked by the current
point. The procedure requires threshold information and is called LBS due
to analogy of the search with a projection of a focused beam of light from a
spotlight at the reference point onto the Pareto optimal set.
Multiobjective Optimization using UTAGMS or GRIP Methods
(Chapter 4). This method is based on the ordinal regression methodology for
elicitation of preference models (Greco et al., 2008; Figueira et al., 2008). In
each iteration, the DM gives preference information in terms of preference
STEM method GDF method Reference Point Light Beam Search UTAGMS or GRIP Dominance-based
420
method method methods Rough Set Approach

the value function is the the value function is an the value function is an the value function is an the method considers the implicitly, the method
Tchebychev metric, thus unknown value function achievement scalarizing achievement scalarizing whole set of additive value considers a whole set of
only one value function is representing a local trade- function for a particular function for a particular functions compatible with value functions making the
considered in each iteration off between different reference point, thus only reference point, however, the preference information same classification of
criteria specified by the one value function is each point in the given by the DM solutions from the Pareto
DM; only one value considered in each iteration neighbourhood of the optimal sample into good
function is considered in current point is attainable and others as the selected
Plurality
each iteration, even if many by an achievement decision rule; thus, the
value functions may scalarizing function with a whole set of value
V. Belton et al.
represent the same trade-off different direction, thus functions concordant with
at a certain point many value functions are the decision rules is
considered in each iteration considered in each iteration
the considered value the considered value the achievement scalarizing the achievement scalarizing this method considers it has been proved that a set
function has only one form function is very general: function is composed of an function is the same as in additive value functions of decision rules is
of a weighted Tchebychev it is supposed to be additive and a max-min the Reference Point with monotone marginal equivalent to the most
metric measuring the differentiable and concave component; it enjoys some method, however, it is value functions; such an general value function
distance form the utopia good theoretical properties, controlled not only by additive function is a very (monotonic with respect to
point to the set of like controllability and reference points but also by general value function; the value of the considered
admissible Pareto optimal acceptance of both the relative weights of consequently, it reaches a objectives); thus, this
solutions attainable and non- objectives within the very good level of method considers the most
Universality
attainable reference points neighbourhood of the universality general formulation of a
current point defined by the value function and,
outranking relation consequently, it reaches the
maximum of universality
the considered value for the features of plurality for the features of plurality for the features of plurality for the features of plurality for the features of plurality
function has only one form of instances and of instances and of instances and of instances and of instances and
of a weighted Tchebychev universality of value universality of value universality of value universality of value universality of value
Generality
STEM method: ... metric measuring the distance from the utopia point to the set of admissible solutions.
GDF method: ... functions, there is some generality.
Reference Point method: ... functions, there is a fair generality.
Light Beam Search method: ... functions, and for consideration of a neighbourhood of the current point, there is a good credit for generality.
UTAGMS or GRIP methods: ... functions, there is a large credit for generality.
Dominance-based Rough Set Approach: ... functions, there is a large credit for generality.

Reversibility
STEM method: each concession given on a criterion is definite and no more concessions can be considered on the same criterion, thus the procedure is not reversible.
GDF method: the method could be used as an explanatory search process and in this case it would be reversible.
Reference Point method: the method is fully reversible.
Light Beam Search method: the method is fully reversible.
UTAGMS or GRIP methods: the method could be used as an explanatory search process and in this case it would be reversible.
Dominance-based Rough Set Approach: the method could be used as an explanatory search process and in this case it would be reversible.

Flexibility
STEM method: for the lack of generality and reversibility, there is no space for flexibility of the preference model.
GDF method: for the moderate generality and reversibility, there is some credit for flexibility of the preference model.
Reference Point method: for the fair generality and the reversibility, there is a fair credit for flexibility of the preference model.
Light Beam Search method: for the good generality and the reversibility, there is a good credit for flexibility of the preference model.
UTAGMS or GRIP methods: for the good generality and the reversibility, there is enough space for flexibility of the preference model.
Dominance-based Rough Set Approach: for the good generality and the reversibility, there is enough space for flexibility of the preference model.

Learning capacity
STEM method: the model can learn very little from the interaction with the DM.
GDF method: the model can learn the direction of improvement in each iteration.
Reference Point method: the model can learn a single point on the Pareto front in each iteration.
Light Beam Search method: the model can learn the interesting area of comparable points on the Pareto front in each iteration.
UTAGMS or GRIP methods: due to the universality, the model can learn a lot from the interaction with the DM.
Dominance-based Rough Set Approach: due to the universality, the model can learn a lot from the interaction with the DM.

Zooming capability
STEM method: has zooming capacity.
GDF method: local use of the preference model.
Reference Point method: global use of the preference model.
Light Beam Search method: global with respect to the reference point, local with respect to the outranking relation on the Pareto set.
UTAGMS or GRIP methods: has zooming capacity.
Dominance-based Rough Set Approach: has zooming capacity.

Fig. 15.4. Features influencing the learning capacity of selected interactive methods.
15 Interactive Multiobjective Optimization from a Learning Perspective
comparisons of selected Pareto optimal solutions. Then, the whole set of ad-
ditive value functions compatible with the preference information given by
the DM is used to build a possible and a necessary order of all solutions con-
sidered in the Pareto optimal set. In the possible order, solution a is weakly
preferred to solution b if there is at least one value function giving to solution
a a value not smaller than to solution b. In the necessary order, solution a is
weakly preferred to solution b if all value functions give to solution a a value
not smaller than to solution b. The procedure ends when the possible and nec-
essary orders give to the DM enough information for choosing the satisfactory
solution.
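The two orders described above can be illustrated with a minimal sketch. It assumes a small, finite sample of compatible value functions, whereas the method itself considers the whole (generally infinite) set; the solutions, the weights and the name `possible_and_necessary` are hypothetical and serve only to show the "at least one" versus "all" logic.

```python
from typing import Callable, Dict, Sequence, Set, Tuple

Solution = Tuple[float, ...]
ValueFunction = Callable[[Solution], float]
Pair = Tuple[str, str]


def possible_and_necessary(
    solutions: Dict[str, Solution],
    value_functions: Sequence[ValueFunction],
) -> Tuple[Set[Pair], Set[Pair]]:
    """Return the possible and necessary weak preference relations.

    (a, b) is in the possible relation if at least one value function
    gives a a value not smaller than b, and in the necessary relation
    if all value functions do.
    """
    possible: Set[Pair] = set()
    necessary: Set[Pair] = set()
    for a, xa in solutions.items():
        for b, xb in solutions.items():
            if a == b:
                continue
            verdicts = [u(xa) >= u(xb) for u in value_functions]
            if any(verdicts):
                possible.add((a, b))
            if all(verdicts):
                necessary.add((a, b))
    return possible, necessary


# Hypothetical data: three solutions evaluated on two gain-type objectives.
solutions = {"a": (0.9, 0.2), "b": (0.5, 0.5), "c": (0.4, 0.4)}

# Hypothetical sample of compatible additive value functions (weighted sums).
value_functions = [
    lambda x: 0.8 * x[0] + 0.2 * x[1],
    lambda x: 0.2 * x[0] + 0.8 * x[1],
]

possible, necessary = possible_and_necessary(solutions, value_functions)
print("necessary:", sorted(necessary))
print("possible: ", sorted(possible))
```

In this toy sample only the pair (b, c) belongs to the necessary relation, because b dominates c, while a and b are each possibly preferred to the other.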
Dominance-based Rough Set Approach (DRSA) to Multiobjec-
tive Optimization (Chapter 5). This method is based on MCDA methodol-
ogy for data mining on ordinal data (Greco et al., 2001). In each iteration, the
procedure presents a set of representative Pareto optimal solutions to the DM
who is asked to indicate the solutions that are relatively good. A set of "if...,
then..." decision rules explaining this classification is induced using DRSA.
The DM chooses from this set one decision rule characterizing good solutions.
This decision rule imposes some additional constraints on the acceptable set of
solutions to the MOO problem. Then, a new Pareto optimal set is calculated
and a representative sample of this set is presented to the DM. The procedure
stops when the DM finds a satisfactory solution in the presented sample. In
each decision phase of this procedure, the DM is informed about the shape
of the current Pareto optimal set by association rules showing relationships
between values of particular objectives.
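As a rough illustration of how a chosen decision rule narrows the search, the sketch below treats a rule as a set of lower bounds on gain-type objectives and recomputes the non-dominated subset of a finite candidate list. The rule induction step of DRSA is not implemented; the rule, the candidate points and all function names are hypothetical, and a real application would add the constraints to the MOO problem and solve it again rather than filter a fixed list.

```python
from typing import Dict, List, Tuple

Solution = Tuple[float, ...]
# A rule maps an objective index to the minimum value required by its
# condition part, e.g. {0: 0.5} reads "if objective 0 >= 0.5 then good".
Rule = Dict[int, float]


def satisfies(rule: Rule, x: Solution) -> bool:
    """Check whether solution x meets every condition of the rule."""
    return all(x[i] >= threshold for i, threshold in rule.items())


def dominates(x: Solution, y: Solution) -> bool:
    """True if x is at least as good as y everywhere and better somewhere."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def pareto_front(points: List[Solution]) -> List[Solution]:
    """Keep the points that no other point dominates (maximisation)."""
    return [x for x in points if not any(dominates(y, x) for y in points)]


# Hypothetical candidate solutions on two gain-type objectives.
candidates = [(0.9, 0.1), (0.7, 0.5), (0.5, 0.7), (0.2, 0.9), (0.4, 0.4)]

# Hypothetical rule selected by the DM.
rule: Rule = {0: 0.5}

restricted = [x for x in candidates if satisfies(rule, x)]
new_front = pareto_front(restricted)
print(new_front)
```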
In conclusion, if we take into account the features presented in Figure 15.3,
we can see that the interactive procedures which consider the whole set of value
functions compatible with preference information have a greater capacity of
learning the preference of the DM than the procedures considering only one
such value function. The procedure based on the UTAGMS and GRIP methods
has all favourable features; in particular, the form of the considered value
functions is general, just additive. The interactive procedure based on DRSA
has a similar capacity; it takes into account the whole set of compatible value
functions, i.e., those which make the same classification of solutions from the
Pareto optimal sample (into "good" and "others") as the selected decision rule;
however, it is even more general because value functions corresponding to
rules used in DRSA are just non-decreasing for gain-type objectives.
15.5 How do We Investigate Learning?

Having defined what we mean by effective learning from the perspective of
both the decision maker and the model which defines the supporting interactive
process, in this final section of the chapter we go on to discuss how
we might investigate learning in order to understand how to better support
decision makers. Is it possible to determine, a priori or interactively, the most
appropriate way of helping a particular decision maker learn and to change
or adapt the model in use accordingly? This is a very challenging task. Not
only will different decision makers like or dislike different methods, but the
behaviour of an individual may differ from one situation to another, depending
on factors such as the importance and complexity of the decision, time available,
and knowledge of the problem. Furthermore, as Korhonen and Wallenius
(1996) state, learning one's preferences is a gradual process and may lead to
a change in the decision maker's needs and behaviour during the resolution
of a problem. Thus our aim would be to understand the characteristics and
needs of a decision maker and how these evolve during an interactive process,
in order to choose and, if possible, adapt the interactive method such that it
matches both the decision maker and the problem. This leads us to the question:
is it possible to determine and/or classify different "learning patterns"
for both parties? In this section we discuss open questions with the aim of
giving a series of guidelines for further research in this field, which should
combine behavioural and psychological aspects related to the decision maker
with technical features and performances of the interactive techniques.
Following on from the earlier discussion of how a decision maker might learn
from an interactive process and what could influence that learning, it would
seem reasonable to utilise these factors in defining decision maker profiles and,
potentially, a taxonomy of decision makers (or, more precisely, of decision
making behaviours). The next step would be to seek to classify interactive
techniques according to their performance with respect to the relevant aspects.
As a result of these two classifications, a correspondence between methods
and behavioural patterns could be established. Easier said than done! We
will try to suggest some ideas that could be useful in order to produce such
classifications.
Whilst a case study or action research approach (see (Easterby-Smith
et al., 2002) for an overview of qualitative research methods, (Reason and
Bradbury, 2001) for in-depth coverage of action research, and (Montibeller,
2007) for discussion of the use of action research in MCDM) might be used
to develop a taxonomy of DMs and understand how they learn in real situations,
we feel that more could be learned, and more quickly, in a realistic
experimental context. These experiments should confront real DMs with real
decision problems, and with the use of different interactive techniques. A number
of such experiments are described in the literature and, although none were
explicitly concerned with assessing learning, we can learn from them about
how DMs respond to different methods. Wallenius (1975) compares the performance
of the GDF method (Geoffrion et al., 1972), the STEP method
(Benayoun et al., 1971) and a simple trial and error procedure (somewhat
similar to an iterative goal programming scheme, see (Dyer, 1973)). A hypothetical
problem of planning production, inventory and labour was presented
to 36 participants (18 students and 18 managers), who solved the problem
using all three methods. The following criteria were used to evaluate the
performance of each method: the DM's confidence in the final solution; ease of use
of the method; ease of understanding the logic of the method; usefulness of
the information provided; rapidity of convergence (number of cycles and total
time); and CPU time. In addition, the DMs were asked to perform an overall
preference ranking of the methods.
Buchanan and Daellenbach (1987) carried out a similar experiment (using
the same hypothetical problem and the same evaluation criteria) to compare
the performance of the Zionts-Wallenius method (Zionts and Wallenius,
1976, 1983), the Surrogate Worth Tradeoff Method (Chankong and Haimes,
1987, 1983, pp. 371–379), Steuer's Chebyshev method (Steuer and Choo, 1983;
Steuer, 1985, pp. 419–450) and a naive approach. Miettinen and Mäkelä (1999)
and Miettinen et al. (2006) report two studies about the reference point (and
classification) based interactive procedure NIMBUS (Miettinen, 1999; Miettinen
and Mäkelä, 2006) and the reference direction method (Narula et al., 1994),
taking into account the computational efficiency (number of objective function
evaluations), duration of the procedure (overall time, number of iterations,
time per iteration) and the opinion of the DMs on the controllability of
the method and the final solution. Buchanan and Gardiner (2003) also compare
the performance of two reference point approaches, seeking the DMs'
evaluation of the quality of the final solution obtained.
In all the studies outlined above, two common issues emerge. First, although
the studies reveal that a majority of participants share opinions on
many aspects, there are still DMs who have a different view. This implies that
the performance of a given interactive method could perhaps be described in
general for a "typical" DM, but not for all individuals. Second, the criteria
used to evaluate the methods have in all cases been determined by the authors
of the studies, and no indication is given of how important or relevant
the DMs consider each criterion to be. This suggests an initial unstructured
approach which allows DMs to define the factors they consider relevant in evaluating
different interactive processes. Taken together with the factors
derived from the literature and already outlined in this paper, these will provide
a comprehensive basis for the characterisation of DM behaviours, to be
further validated through a series of realistic experiments.
The next step would be to categorize the different interactive procedures
with respect to their match with the DM characteristics. One way to do this
is to test them with human DMs. Such experiments can highlight desirable
properties of each procedure, but we must always be aware of the potential
for experimental bias. For example, if the same problem is solved several
times by a DM using different methods, it is impossible to make her forget
what she has learnt about the problem before starting each resolution. If, on
the other hand, the problem is changed, then it might be difficult to explain
whether a change in the DM's behaviour is due to the change of method or
to the change of problem. These potential drawbacks may be overcome if a
virtual decision maker is used instead. In many such experiments (see, e.g.,
(Miettinen, 1999)), this "virtual" decision maker has always taken the form
of a utility or value function. But in most cases a value function may
be too limited to model the real behaviour of a DM. So the most
challenging question of this section is: is it possible to model the behavioural
characteristics previously identified? Is it possible to substitute the classical
value functions by more sophisticated modelling tools?
15.5.1 Designing Experiments Involving a Human Decision Maker
As already indicated, a key aim of the proposed experiments is to better
understand how DMs learn from the use of interactive methods in order to
better match and adapt methods to a particular DM in a particular situation.
Previous experimental work has indicated that DMs often act in an apparently
"irrational" way, for example when the cognitive load imposed by a process
becomes heavy (Larichev et al., 1987). However, it is important to recognise
that although such behaviour might indicate limitations on the part of the
DM, it may equally well reveal serious gaps in current understanding and
theories related to learning and rational decision making (see (Olson, 1992),
and references therein). In particular, the lack of a theory to explain the
process of learning means that a DM changing her mind, because she has
better understood the problem or her preferences, may be misinterpreted as
irrationality. These factors highlight the importance of designing experiments
which engage real DMs with real problems, supported by Hobbs' assertion
that experiments rather than reasoning are necessary in assessing subjective
factors such as DM perceptions (Hobbs, 1986).
However, the nature of experiments is wide-ranging and it is important to
consider the range of possibilities. On the one hand, a so-called "true experiment",
which randomly allocates participants in the experiment to a control
or experimental group and seeks to control for extraneous factors, is consistent
with a positivist/deductive research methodology, focuses on establishing
causal relationships, has high specificity (concentrates on a few variables) and
high internal validity, but suffers from limited realism and limited generalisability.
On the other hand, a "quasi-experiment" (which lacks the random
allocation to controlled conditions) in a field setting (which gives less control
over extraneous factors) (Robson, 1993) may have greater realism but at the
expense of internal validity. A quasi-experiment comes closer to the case study
or naturalistic paradigm and there tends to be more emphasis on emergent
design and inductive reasoning as well as greater reliance on qualitative data.
Thus, the latter approach may be appropriate to help us better understand
and develop models of learning, whilst the former may be better suited to test
these.
Whilst there is a substantial general literature on the use of experiments
as a research method and the experimental method is widely used in the field
of behavioural decision making, there are relatively few published papers on
its use in MCDM in general and in evaluating interactive methods in particular.
Key overview papers are those by Hobbs (1986), Olson (1992) and Aksoy
et al. (1996). The paper by Olson (1992), which is strongly focused on interactive
multiobjective methods, summarises 11 studies conducted up to 1990,
and Aksoy et al. (1996) extend this to 1994, incorporating 6 studies involving
human DMs and 8 in which the DM's responses are simulated by a model
of some kind. These studies highlight a number of important considerations
in designing experiments to evaluate or compare MCDM methods, as well
as a range of practices. For Hobbs (1986) the ideal comparative experiment
should: be well controlled and use a sufficiently large sample of DMs to be
able to discern significant differences and generalize results; compare widely
used methods that represent divergent philosophies of decision making, or
claimed methodological improvements; compare methods across a variety of
problem types; and be a realistic representation of real world decision making. He
also points out the need to be aware of order bias and possible DM fatigue.
Olson's (1992) view of an ideal study is one which involves substantive problems
with real DMs, a situation which often limits the number of DMs that
can be involved (thus the small sample size makes it difficult to discern significant
differences or generalize results) but permits more in-depth enquiry;
these circumstances equate more to the quasi-experiment in a field setting
than to the true experiment outlined above. It is clear that in designing an experiment
it is necessary to balance multiple factors which may be in conflict,
a multicriteria problem in itself. We outline briefly some of the key issues:
i). Defining realistic problems which are meaningful to the DMs, in the sense
that they understand the issue and have responsibility for implementing a
solution. This is very difficult to achieve in practice and often constructed
problems are utilized with students as DMs. Although this latter point
may be seen as a limitation, it is reported that expert DMs may favour
a trial and error approach (which has performed surprisingly well in comparison
to more sophisticated methods in a couple of studies, especially
in the case of simple problems), whereas more inexperienced DMs tend
to welcome more guidance. Olson (1992) suggests that the use of business
school students, who are potential DMs but with little experience, can be
justified and may in fact represent an appropriate sample set in which to
discern learning effects.
ii). The level of randomness to incorporate in an experiment, ranging from a
tightly controlled approach which exposes all DMs to the same methods
and problems in the same order, etc., to one which seeks to vary all factors
in an attempt to eliminate potential effects. A particular consideration
might be the effect of the order in which a DM is exposed to methods;
if the DM feels that the first method gives a good understanding of the
problem and generates a solution in which she has confidence, then she
may simply try to find the same solution with subsequent methods.
iii). The issue of DM fatigue as a consequence of the cognitive burden of the
experimental process, due simply to the time elapsed, and also to subsequent
questioning.
iv). The need to consider not only the methods, but the way in which they
are implemented and the extent to which this is comparable across methods.
Miettinen (1999) points out that a poor user interface might spoil a
good method; similarly a poor method can be given credibility through
an effective interface.
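One common way to limit the order effect raised in item ii) is counterbalancing, a technique the chapter does not itself prescribe. The sketch below assumes a cyclic Latin square of method orders and a seeded random allocation of participants to those orders; the participant labels and the function names are hypothetical, and the three method names are simply those compared by Wallenius (1975).

```python
import random
from typing import Dict, List


def latin_square_orders(methods: List[str]) -> List[List[str]]:
    """Build cyclic orders so each method occupies each position once."""
    n = len(methods)
    return [[methods[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(
    participants: List[str], methods: List[str], seed: int = 0
) -> Dict[str, List[str]]:
    """Randomly allocate participants to the counterbalanced orders."""
    rng = random.Random(seed)  # seeded so the allocation is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    orders = latin_square_orders(methods)
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}


methods = ["GDF", "STEM", "trial and error"]
participants = [f"DM{i}" for i in range(1, 7)]  # hypothetical participants

schedule = assign_orders(participants, methods)
for participant in sorted(schedule):
    print(participant, schedule[participant])
```

With six participants and three methods, every method is used first, second and third by exactly two participants, so an order effect is spread evenly across methods rather than removed.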
To summarize some limitations and disadvantages of experiments involving
human DMs, we refer to the list of Zanakis et al. (1998), which draws mostly
on the same sources as Olson (1992) and Aksoy et al. (1996): the sample size
(number of DMs involved) and range of problems studied are very limited; DMs
are typically students rather than real DMs; the way information is elicited
may influence the results more than the model used; the learning effect biases
outcomes, especially when the DM employs various methods sequentially;
inherent human differences might affect decisions, and thus different performances
can be due to the methods used or to which DM applies them; and finally,
it is impossible or difficult to answer questions such as: Which method is more
appropriate for what type of problem? What are the advantages or disadvantages
of using one method over another? Does a decision change when using
different methods and if so, why and to what extent?
However, although we can learn from the experiments on which the above
discussions are based, the proposed research will go beyond comparative eval-
uation of methods in terms of DM satisfaction to try to understand the nature
of DM learning and the behavioural, situational and method factors or char-
acteristics which impact on this. Thus we may need to look more broadly at
ways of enhancing the experimental process in order to stimulate learning and
to capture subjective accounts of it.
15.5.2 Designing Experiments Involving a Virtual Decision Maker
Experimentation involving human DMs is essential to capture behavioural
and subjective issues such as how a DM is reacting to the method and her
confidence in the outcome. However, the use of a simulated DM has certain
advantages and may provide a convenient means of examining more theoretical
properties of methods. These advantages include, for example: the ability
to fully control the experimental situation; the possibility of repeating experiments;
the ability to expose the simulated DM to sequential trials without
the problems of order bias and learning transfer; and considerations such as
availability and lack of fatigue. Last but not least, experiments with a simulated
DM are much cheaper and less time consuming than experiments with a
human DM. Hence we return now to the question of whether it is possible to
model the behavioural patterns which we would hope to be revealed by experimentation
with human DMs. There have been some attempts to model DM
behaviour and use it to test software (for example, Gibson et al., 1987; Mote
et al., 1988; Reeves and Gonzalez, 1989; Aksoy et al., 1996; Shackelford and
Corne, 2004). However, these approaches are limited in the extent to which
they reflect actual DMs' behaviour and hence we suggest it is appropriate to
consider if we can develop more sophisticated modelling tools. As a first step
we need to consider what behavioural characteristics we would like to imitate.
On the basis of the foregoing discussions we suggest that in addition to
an underlying preference structure and stopping rule, the virtual DM model
should capture:
• Cognitive complexity of the task, which should reflect the type of judgements
required and the nature and amount of information a DM must process.
• DM fatigue, which might be a function of the number of iterations, the
number of required judgements and cognitive complexity. This might be
countered by level of motivation.
• DM learning, which might be reflected by a change in preference structure,
possibly dependent on the previous iterations.
• DM inconsistency, which may be a consequence of the tendency of a DM
to experiment, a reflection of the cognitive complexity of the required
judgements, or simple "noise" in preference judgements.
The second step is to consider how we might model these characteristics and
their consequences. Making the assumption of a base preference model (expressed
in a functional or logical form), some factors may lead to a gradual or
stepwise increase in the associated noise level (for example, the noise level might
increase as a function of the number of iterations beyond a specified threshold),
others to a change of form of the preference function.
A virtual DM could be exposed to different problems and interactive approaches,
allowing comparisons to be made without problems of bias or fatigue.
Performance may be measured in terms of time to reach a solution,
effort expended, and level of satisfaction achieved (which might be expressed as
the distance from the achieved solution to the theoretically preferred solution).
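A minimal sketch of such a virtual DM is given below, under several assumptions that are not taken from the chapter: the base preference model is a weighted sum, fatigue is modelled as Gaussian noise whose standard deviation grows linearly once the iteration count passes a threshold, and satisfaction is measured by the Euclidean distance in objective space. The class `VirtualDM`, its parameters and the sample options are all hypothetical.

```python
import math
import random
from typing import Sequence, Tuple

Solution = Tuple[float, ...]


class VirtualDM:
    """Simulated DM: base value function plus iteration-dependent noise."""

    def __init__(
        self,
        weights: Sequence[float],
        base_noise: float = 0.0,
        fatigue_threshold: int = 5,
        fatigue_step: float = 0.05,
        seed: int = 0,
    ) -> None:
        self.weights = tuple(weights)
        self.base_noise = base_noise
        self.fatigue_threshold = fatigue_threshold
        self.fatigue_step = fatigue_step
        self.iteration = 0
        self._rng = random.Random(seed)  # seeded for repeatable experiments

    def true_value(self, x: Solution) -> float:
        """Base preference model (assumed here to be a weighted sum)."""
        return sum(w * xi for w, xi in zip(self.weights, x))

    def noise_level(self) -> float:
        """Noise grows with each iteration beyond the fatigue threshold."""
        extra = max(0, self.iteration - self.fatigue_threshold)
        return self.base_noise + self.fatigue_step * extra

    def choose(self, options: Sequence[Solution]) -> Solution:
        """Pick the option with the highest noisy perceived value."""
        self.iteration += 1
        sigma = self.noise_level()
        return max(
            options,
            key=lambda x: self.true_value(x) + self._rng.gauss(0.0, sigma),
        )


def regret_distance(achieved: Solution, preferred: Solution) -> float:
    """Distance from the achieved to the theoretically preferred solution."""
    return math.dist(achieved, preferred)


dm = VirtualDM(weights=(0.7, 0.3))
options = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]
first_choice = dm.choose(options)
print(first_choice, regret_distance(first_choice, (0.9, 0.1)))
```

Learning, as opposed to fatigue, could be represented in the same structure by letting the weights change between iterations, which corresponds to the change of form of the preference function mentioned above.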
A long term aim of such experimentation could be to determine a match
between different identified behavioural patterns and the techniques which
are most suitable to support that particular way of working. However, we
would like to stress that the behaviours of the DM can always be
more complex and dynamic than those suggested by the previously mentioned
approaches. Thus, it would be an interesting challenge to seek to develop
flexible problem solving environments which can detect behavioural changes
and react by offering different methods during the resolution process. In this
line, Caballero et al. (2002) proposed an integrated interactive system in
which several interactive methods of different kinds have been implemented,
and where the DM can change between them at any time during the process.
In this system, the decision to change the style of interaction rests with the
analyst, but this idea could be complemented with those ideas explored above
in order to assist the analyst to better choose the most adequate technique in
each case.
15.6 Summary
The aim of this chapter was to explore the notion of learning in the context
of interactive multiobjective optimization. There are two fundamental and
interdependent sides to learning: the individual learning of the DM, and the
model learning of the inference and optimization engine. Our discussion has
been wide ranging, covering considerations of how individual learning can be
characterised and facilitated, and the ways in which different types of model
learn, through interaction with individuals, about preference structure. We
have also looked at the interdependence of the two sides of learning, and the
potential research challenges. It is clear that there is scope for both positive
and negative feedback between these two forms of learning in any interactive
decision process and it is essential that those who develop and implement
interactive approaches pay attention to these issues. So far, there
has been rather little focused research in this area and thus we hope that
this chapter will provide both a stimulus and a foundation for more in-depth
research which pays explicit attention to the issues discussed here.
Acknowledgements
This chapter is based on working group discussions during the Dagstuhl seminar
on "Practical Approaches to Multiobjective Optimization" in December
2006. Besides the authors, the following people participated in the working
group and contributed to the discussion: Jerzy Błaszczyński, José Figueira,
Pablo Funes, Vincent Mousseau. We gratefully acknowledge their input during
the seminar.
The second, the fourth and the seventh author wish to acknowledge the support
of COST Action IC0602 "Algorithmic Decision Theory".
References
Aksoy, Y., Butler, T.W., Minor, E.D.: Comparative studies in interactive multiple
objective mathematical programming. European Journal of Operational Research
89(2), 408–422 (1996)
Argyris, C., Schon, D.A.: Organisational Learning: A Theory of Action and Perspec-
tive. Addison-Wesley, Reading (1978)
Belton, V., Elder, M.D.: Decision support systems: Learning from visual interactive
modelling. Decision Support Systems 12, 355–364 (1994)
Belton, V., Stewart, T.J.: Multiple Criteria Decision Analysis: An Integrated Approach.
Kluwer, Boston (2002)
Belton, V., Vickers, S.P.: V.I.S.A - VIM for MCDA. In: Lockett, G., Islei, G. (eds.)
Improving Decision Making in Organizations, pp. 287–304. Springer, Berlin (1989)
Benayoun, R., de Montgolfier, J., Tergny, J., Larichev, O.: Linear programming
with multiple objective functions: STEP method (STEM). Mathematical Programming
1, 366–375 (1971)
Buchanan, J.T., Corner, J.L.: The effects of anchoring in interactive MCDM solution
methods. Computers and Operations Research 24(10), 907–918 (1997)
Buchanan, J.T., Daellenbach, H.G.: Comparative evaluation of interactive solution
methods for multiple objective decision models. European Journal of Operational
Research 29, 353–359 (1987)
Buchanan, J.T., Gardiner, L.: A comparison of two reference point methods in multiple
objective mathematical programming. European Journal of Operational Research
149, 17–34 (2003)
Caballero, R., Luque, M., Molina, J., Ruiz, F.: PROMOIN: an interactive system for
multiobjective programming. Information Technologies and Decision Making 1,
635–656 (2002)
ology. Elsevier Science Publishing Co., New York (1983)
Chankong, V., Haimes, Y.Y.: The interactive surrogate worth trade-off (ISWT)
method for multiobjective decision-making. In: Zionts, S. (ed.) Multiple Criteria
Problem Solving, pp. 42–67. Springer, Berlin (1987)
CIA: The 2008 World Factbook. Central Intelligence Agency (2008),
https://www.cia.gov/library/publications/the-world-factbook/index.html
Dunn, R., Dunn, K., Price, G.E.: Learning Style Inventory. Price Systems, Lawrence,
KS (1984)
Dyer, J.S.: An empirical investigation of a man-machine interactive approach to
the solution of the multiple criteria problem. In: Cochrane, J., Zeleny, M. (eds.)
Multiple Criteria Decision Making, pp. 202–216. University of South Carolina
Press, Columbia (1973)
Dyer, J.S., Fishburn, P.C., Steuer, R.E., Wallenius, J., Zionts, S.: Multiple criteria
decision making, multiattribute utility theory: The next 10 years. Management
Science 38, 645–654 (1992)
Easterby-Smith, M., Thorpe, R., Lowe, A.: Management Research: an Introduction.
Sage, London (2002)
Eschenauer, H.E., Osyczka, A., Schäfer, E.: Interactive multicriteria optimisation
in design process. In: Eschenauer, H., Koski, J., Osyczka, A. (eds.) Multicriteria
Design Optimization Procedures and Applications, pp. 71–114. Springer, Berlin
(2000)
Figueira, J., Greco, S., Słowiński, R.: Building a set of additive value functions
representing a reference preorder and intensities of preference: GRIP method.
European Journal of Operational Research. doi:10.1016/j.ejor.2008.02.006 (2008)
Fishburn, P.C.: Methods of estimating additive utilities. Management Science 13(7),
435–453 (1967)
Gardner, H.: Frames of Mind: The Theory of Multiple Intelligences. Basic Books,
New York (1993)
Geoffrion, A., Dyer, J., Feinberg, A.: An interactive approach for multi-criterion
Gibson, M., Bernardo, J.J., Cheng, C., Badinelli, R.: A comparison of interactive
multiple-objective decision making procedures. Computers and Operations Research
50(1), 97–105 (1987)
Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multicriteria decision
analysis. European Journal of Operational Research 129(1), 1–47 (2001)
Art Surveys, pp. 507–563. Springer, Berlin (2005)
Greco, S., Mousseau, V., Słowiński, R.: Ordinal regression revisited: Multiple criteria
ranking using a set of additive value functions. European Journal of Operational
Research 191(2), 415–435 (2008)
Hobbs, B.F.: What can we learn from experiments in multiobjective decision analysis?
IEEE Transactions on Systems, Man, and Cybernetics 3, 384–394 (1986)
Huber, J., Payne, J.W., Puto, C.: Adding asymmetrically dominated alternatives:
violations of regularity and the similarity hypothesis. Journal of Consumer
Research 9, 90–98 (1982)
Jacquet-Lagrèze, E.: Systèmes de décision et acteurs multiples: Contribution à une
théorie de l'action pour les sciences des organisations. Thèse d'Etat, Université
de Paris-Dauphine, Paris (1981)
ticriteria decision making: the UTA method. European Journal of Operational
Research 10(2), 151–164 (1982)
Jacquet-Lagrèze, E., Meziani, R., Słowiński, R.: MOLP with an interactive assessment
of a piecewise-linear utility function. European Journal of Operational Research
31(3), 350–357 (1987)
Jaszkiewicz, A., Słowiński, R.: The Light Beam Search approach - an overview of
methodology and applications. European Journal of Operational Research 113(2),
300–314 (1999)
Kahneman, D., Slovic, P., Tversky, A. (eds.): Judgment under Uncertainty. Cam-
bridge University Press, Cambridge (1982)
Keeney, R.: Value Focussed Thinking: A Path to Creative Decision Making. Harvard
University Press, Cambridge (1992)
Keeney, R.L., Raiffa, H., Hammond, J.S.: The hidden traps in decision making. In:
Harvard Business Review Online (2006)
Kolb, D.: Experiential Learning: Experience as the Source of Learning. Prentice-Hall,
Englewood Cliffs (1984)
Kolb, D., Fry, R.: Toward an applied theory of experiential learning. In: Cooper,
C.L. (ed.) Theories of Group Processes, pp. 27–56. John Wiley, London (1975)
Korhonen, P.: VIG - a visual interactive support system for multiple criteria decision
making. JORBEL 27(1), 4–15 (1987)
Korhonen, P., Wallenius, J.: Behavioural issues in MCDM: Neglected research questions.
Journal of Multi-Criteria Decision Analysis 5, 178–182 (1996)
Larichev, O.I.: Cognitive validity in design of decision-aiding techniques. Journal of
Multi-Criteria Decision Analysis 1, 127–138 (1992)
Larichev, O.I., Polyakov, O.A., Nikiforov, A.D.: Multicriterion linear programming
problems - analytical survey. Journal of Economic Psychology 8, 389–407 (1987)
Lave, J., Wenger, E.: Situated Learning. Legitimate peripheral participation. Uni-
versity of Cambridge Press, Cambridge (1991)
March, J.G.: Bounded rationality, ambiguity, and the engineering of choice. The Bell
Journal of Economics 9(2), 587–608 (1978)
March, J.G.: The technology of foolishness. In: March, J.G. (ed.) Decisions and
Organizations, pp. 253–265. Basil Blackwell, New York (1988)
Merriam, S.B., Caffarella, R.S.: Learning in Adulthood. A comprehensive guide.
Jossey-Bass, San Francisco (1991)
Mezirow, J.: Transformative Dimensions of Adult Learning. Jossey-Bass, San Francisco
(1991)
Michalski, R.S., Bratko, I., Kubat, M.: Machine Learning and Data Mining - Meth-
ods and Applications. Wiley, New York (1998)
Miettinen, K.: Nonlinear Multiobjective Optimization. Kluwer, Boston (1999)
Miettinen, K., Mäkelä, M.M.: Comparative evaluation of some interactive reference
point-based methods for multi-objective optimisation. Journal of the Operational
Research Society 50, 949–959 (1999)
Montibeller, G.: Action researching multiple criteria decision analysis interventions.
In: Shaw, D. (ed.) 49th Operational Research Society Conference
Keynote Papers (2007),
http://personal.lse.ac.uk/MONTIBEL/OR49_Action_researching_MCDA.pdf
Morecroft, J.D.W., Sterman, J.D.: Modelling for Learning. North Holland, Amster-
dam (1992)
Morton, A., Fasolo, B.: Behavioural decision theory for multi-criteria decision anal-
ysis: a guided tour. In: Journal of the Operational Research Society. To appear
(2008), doi:10.1057/palgrave.jors.2602550
Mote, J., Olson, D.L., Venkataramanan, M.A.: A comparative multiobjective programming
study. Mathematical and Computer Modelling 10(10), 719–729 (1988)
Narula, S.C., Kirilov, L., Vassilev, V.: Reference direction approach for solving mul-
tiple objective nonlinear programming problems. IEEE Transactions on Systems,
Olson, D.L.: Review of empirical studies in multiobjective mathematical
programming: Subject reflection of nonlinear utility and learning. Decision Sciences 23,
1-20 (1992)
Papamichail, K.N., French, S.: Explaining and justifying the advice of a decision
support system: a natural language generation approach. Expert Systems with
Applications 24(1), 35-48 (2003)
Reason, P., Bradbury, H.: Handbook of Action Research. Sage, London (2001)
Reeves, G.R., Gonzalez, J.J.: A comparison of two interactive MCDM procedures.
European Journal of Operational Research 41(2), 203-209 (1989)
Robson, C.: Real World Research: a Resource for Social Scientists and Practitioner
Researchers, 2nd edn. Blackwell, Oxford (1993)
Roy, B.: Decision science or decision-aid science. European Journal of Operational
Research 66(2), 184-203 (1993)
Russell, S.J., Norvig, P.: Artificial Intelligence: a Modern Approach. Prentice Hall,
Upper Saddle River (2003)
Shackelford, M.R.N., Corne, D.W.: A technique for evaluation of interactive evo-
lutionary systems. In: Parmee, I.C. (ed.) Proceedings of the 6th International
Conference on Adaptive Computing in Design and Manufacture (ACDM 2004),
Słowiński, R., Greco, S., Matarazzo, B.: Rough set based decision support. In:
Burke, E.K., Kendall, G. (eds.) Search Methodologies: Introductory Tutorials in
Optimization and Decision Support Techniques, pp. 475-527. Springer, New York
(2005)
tions. John Wiley & Sons, New York (1985)
Steuer, R.E., Choo, E.U.: An interactive weighted Tchebycheff procedure for multiple
objective programming. Mathematical Programming 26, 326-344 (1983)
Tversky, A., Kahneman, D.: Loss aversion in riskless choice: a reference-dependent
model. The Quarterly Journal of Economics 106, 1039-1061 (1991)
Vanderpooten, D.: The interactive approach in MCDA: a conceptual framework and
some basic conceptions. Mathematical and Computer Modelling 12, 1213-1220
(1990)
Vanderpooten, D., Vincke, P.: Description and analysis of some representative
interactive multicriteria procedures. Mathematical and Computer Modelling 12,
1221-1238 (1989)
von Winterfeldt, D., Edwards, W.: Decision Analysis and Behavioral Research. Cam-
Wallenius, J.: Comparative evaluation of some interactive approaches to
multicriterion optimization. Management Science 21(2), 1387-1396 (1975)
Zanakis, S.H., Solomon, A., Wishart, N., Dublish, S.: Multi-attribute decision
making: A simulation comparison of selected methods. European Journal of
Operational Research 107(3), 507-529 (1998)
Zionts, S., Wallenius, J.: An interactive multiobjective linear programming method
for a class of underlying nonlinear utility functions. Management Science 29,
519-529 (1983)
16
Future Challenges
Kaisa Miettinen1, Kalyanmoy Deb2, Johannes Jahn3,
Wlodzimierz Ogryczak4, Koji Shimoyama5, and Rudolf Vetschera6
1
2
Department of Mechanical Engineering, Indian Institute of Technology Kanpur,
PIN 208 016, India, deb@iitk.ac.in
3
Department of Mathematics, University of Erlangen-Nürnberg, Martensstrasse
3, 91058 Erlangen, Germany, jahn@am.uni-erlangen.de
4
Institute of Control & Computation Engineering, Faculty of Electronics &
Information Technology, Warsaw University of Technology, ul. Nowowiejska
15/19, 00-665 Warsaw, Poland, w.ogryczak@ia.pw.edu.pl
5
Institute of Fluid Science, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai,
980-8577, Japan, shimoyama@edge.ifs.tohoku.ac.jp
6
Department of Business Administration, University of Vienna, Brünnerstrasse
72, 1210 Wien, Austria, rudolf.vetschera@univie.ac.at
Abstract. Many important topics in multiobjective optimization and decision
making have been studied in this book so far. In this chapter, we wish to
discuss some new trends and challenges which the field is facing. For brevity, we here
concentrate on three main issues: new problem areas in which multiobjective
optimization can be of use, new procedures and algorithms to make efficient and useful
applications of multiobjective optimization tools and, finally, new interesting and
practically usable optimality concepts. Some research has already been started and
some such topics are also mentioned here to encourage further research. Some other
topics are just ideas and deserve further attention in the near future.
16.1 Introduction
Handling problems with multiple conflicting objectives has been studied for
decades (as discussed, e.g., in Chapters 1 to 3); yet there still exist many
interesting topics for future research. There are both theoretical questions
and challenges set by real applications to be tackled. Some of the questions
can be answered, for example, by hybridizing or integrating ideas from the
MCDM and EMO literature.
In 2007 also Helsinki School of Economics, P.O. Box 1210, FI-00101 Helsinki,
Finland
Reviewed by: Jörg Fliege, University of Southampton, UK
Joshua Knowles, University of Manchester, UK
Jürgen Branke, University of Karlsruhe, Germany
Here we do not attempt to cover all relevant future challenges but
concentrate on three major topics, mainly due to space limitations. First, we
discuss new and challenging problem domains in which multiobjective
optimization and decision making (or decision support) techniques can be applied.
Second, we discuss some new methodologies for multiobjective optimization
which allow a synergetic application of optimization and decision making or
provide a more global approach to optimization. Third, we describe new and
innovative definitions of optimality in multiobjective optimization, which
allow one to find a subjective or preferred set of optimal solutions.
Many topics discussed in this chapter are currently under study in various
research groups. We nevertheless discuss them here, mainly to bring these
ideas to a wider audience. We would like to encourage
readers to pursue research along these directions, and we will consider our
compilation successful if future researchers make due acknowledgment of the
cited references and this compilation. In our view, the ideas presented are important
and have long-term implications for the field of multiobjective optimization.
Collaborative and focused research efforts to implement some of these ideas
will be the next step towards making the field more applicable, sustainable
and enjoyable.
16.2 Challenging Multiobjective Optimization Problems
Besides solving typical optimization problems having multiple objectives,
multiobjective optimization methodologies can also be used in other kinds of
problem solving tasks. In this section, we briefly mention some such research
directions.
16.2.1 New Problem Domains
Multiobjective optimization problems arise in many applied fields of research.
Although many of these problem types have already been investigated, there are also
some important new problem classes which deserve to be examined in detail.
In the following, we discuss a few selected problems.
Multiobjective Bilevel Optimization
In multiobjective bilevel optimization (Dempe, 2002) one considers the
optimization problem
\[
\begin{array}{ll}
\text{minimize}   & f(x, y) \\
\text{subject to} & y \in Y \text{ and } x \text{ solves }
  \left\{
  \begin{array}{ll}
  \text{minimize}   & g(x, y) \\
  \text{subject to} & x \in X.
  \end{array}
  \right.
\end{array}
\]
Here $Y \subseteq \mathbb{R}^n$ and $X \subseteq \mathbb{R}^m$ are given feasible sets, possibly defined by
inequalities and equalities, and $f : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^k$ and
$g : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^l$ are
vector-valued functions. So, on the lower level one has to solve a
multiobjective optimization problem for an arbitrary parameter $y \in Y$. The problem on
the upper level is a multiobjective optimization problem where the feasible
set is defined by $Y$ and the whole Pareto optimal set of the lower problem.
Actually, we have two coupled multiobjective problems on two levels. This
so-called multiobjective bilevel problem is difficult to solve because we need
the complete Pareto optimal set of the problem on the lower level for every
parameter $y \in Y$. The use of interactive methods on the lower level is not
helpful in this case. An overview of these complicated problem types in the
single objective case can be found in (Dempe, 2002, 2003).
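The coupling of the two levels can be illustrated with a brute-force sketch on small grids. The objective functions f and g, the grids X and Y, and all helper names below are hypothetical and chosen only for illustration; the sketch simply enumerates the lower-level Pareto optimal set for every y and is not a practical solution method.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(u <= v for u, v in zip(a, b)) and any(u < v for u, v in zip(a, b))


def nondominated(points, key):
    """Return the points whose objective vectors key(p) no other point dominates."""
    return [p for p in points
            if not any(dominates(key(q), key(p)) for q in points)]


# Hypothetical toy objectives, chosen only for illustration.
def g(x, y):
    """Lower-level objectives: stay close to y versus stay close to 2."""
    return ((x - y) ** 2, (x - 2.0) ** 2)


def f(x, y):
    """Upper-level objectives."""
    return (x + y, (x - 1.0) ** 2 + (y - 1.0) ** 2)


X = [0.25 * i for i in range(9)]   # lower-level grid on [0, 2]
Y = [0.0, 0.5, 1.0]                # upper-level grid

# For every y the whole lower-level Pareto optimal set is needed ...
feasible = [(x, y) for y in Y for x in nondominated(X, key=lambda x: g(x, y))]
# ... and only these pairs are feasible on the upper level.
upper_front = nondominated(feasible, key=lambda p: f(*p))
```

Even in this tiny example the lower-level problem has to be solved completely once per value of y, which is what makes the problem class expensive.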
There are interesting applications for this problem class. The bilevel prob-
lem in its original form goes back to von Stackelberg (1934), who has intro-
duced a special case of these problems. The so-called Stackelberg games are
special bilevel problems. In our case the leader and the follower (in the context
of Stackelberg games) have multiple objectives.
In addition to games and economical applications, there are also various
applications in engineering (Bard, 1998; Dempe, 2002, 2003). For instance,
certain equilibrium problems in chemical engineering can be formulated as
bilevel problems (Dempe, 2002).
Semidefinite Optimization
Semidefinite optimization is a field in optimization which has grown rapidly
since the beginning of the 1990s. A multiobjective semidefinite optimization
problem can be formulated as
\[
\begin{array}{ll}
\text{minimize}   & f(x) \\
\text{subject to} & G(x) \text{ is positive semidefinite and } x \in \mathbb{R}^m.
\end{array}
\]
Here we assume that $f : \mathbb{R}^m \to \mathbb{R}^k$ is a vector-valued function and
$G : \mathbb{R}^m \to S^n$ is a matrix-valued function, where $S^n$ denotes the Hilbert space of
symmetric (n, n)-matrices with real coefficients. Although the case of a
real-valued objective function has been studied in detail and numerical methods are
available for linear and nonlinear semidefinite optimization problems,
investigations of the multiobjective case and the development of numerical methods
for it are still awaited.
Many applications lead to semidefinite optimization problems (Jahn,
2007). Among others, we only mention the design of a rib in the front of
the wing of the new Airbus A380. This complicated problem of material
optimization has been solved by semidefinite optimization where one minimizes
the weight of the structure and the compliance is treated as a constraint.
Thus, the design of one rib is a solution of an $\varepsilon$-constraint problem (see
Section 1.3.2 in Chapter 1), where $\varepsilon$ has not been varied. In this sense, this rib is
a result of a special scalarization technique known from multiobjective
optimization.
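Written out, the rib design described above could take the following constraint form. The symbols w (structural weight), c (compliance) and the fixed bound $\varepsilon$ are names assumed here for illustration; the actual model used in the application is not stated in this chapter.

```latex
\[
\begin{array}{ll}
\text{minimize}   & w(x) \\
\text{subject to} & c(x) \le \varepsilon, \\
                  & G(x) \text{ is positive semidefinite}, \\
                  & x \in \mathbb{R}^m.
\end{array}
\]
```

Varying $\varepsilon$ and re-solving would, under these assumptions, trace out different trade-offs between weight and compliance rather than the single design obtained for one fixed bound.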
Set Optimization
Since the end of the 1980s, multiobjective optimization has been extended to
set optimization. These are problems of the type
\[
\begin{array}{ll}
\text{minimize}   & F(x) \\
\text{subject to} & x \in S,
\end{array}
\]
where $S \subseteq \mathbb{R}^n$ is a feasible set defined by inequalities and equalities
and $F : \mathbb{R}^n \rightrightarrows \mathbb{R}^k$ is set-valued. So, for a given feasible point $x$ the image
$F(x)$ is a set of vectors in $\mathbb{R}^k$. Although there are investigations on these set
problems as an extension of multiobjective optimization, we need new
approaches taking into account that we have to work with partial orderings for
sets (and not only for points). First steps have been taken with the KNY
(Kuroiwa-Nishnianidze-Young) partial ordering (Jahn, 2004) but many
significant theoretical questions are still open.
Problems of this type may occur if the objective is not clearly defined
but only specified in a vague set-oriented way. If one cannot define a function
value of the objective but only the range for this value, one has to solve a set
problem.
An example of an industrial application is the navigation of autonomous
transportation robots. Here one uses ultrasonic sensors determining the
smallest distance to an obstacle in the emission cone. Since the direction of the
object cannot be identified in the cone, the location of the object is set-valued.
Therefore, questions of navigation may lead to problems of set optimization.
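To make the idea of an ordering for sets concrete, the following sketch implements one relation from the set optimization literature, often called the set less relation, for finite sets under the componentwise order. That this relation matches the KNY ordering referred to above in every detail is an assumption made here; the function names are illustrative.

```python
def leq(a, b):
    """Componentwise order on vectors: a <= b in every coordinate."""
    return all(u <= v for u, v in zip(a, b))


def set_less(A, B):
    """A precedes B if every b in B has some a in A below it,
    and every a in A has some b in B above it."""
    every_b_bounded_below = all(any(leq(a, b) for a in A) for b in B)
    every_a_bounded_above = all(any(leq(a, b) for b in B) for a in A)
    return every_b_bounded_below and every_a_bounded_above


A = [(0, 0), (1, 1)]
B = [(1, 2), (2, 1)]
C = [(0, 3)]
D = [(3, 0)]
```

Here `set_less(A, B)` holds while `set_less(B, A)` does not, and `C` and `D` are incomparable in both directions, which shows that such an ordering on sets is only partial.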
Further Problem Types
There are many other multiobjective problem types to be explored more
intensively. Among others, we need more investigations in multiobjective dynamic
optimization. Dynamic optimization is a significant field of optimization with
important applications and it has been used for decades. It is essential to
extend these studies to the multiobjective case and only a few studies exist to
date (Bingul, 2007; Deb et al., 2007a; Farina et al., 2000; Palaniappan et al.,
2001).
Another problem class can be called multiobjective clustering (Delattre
and Hansen, 1980). If one applies cluster analysis to a set of points in order to
find appropriate clusters, in some cases standard methods do not give the
desired result. Recent investigations on multiobjective clustering show that
the use of multiple objectives may result in better clusters (e.g., formulating a
biobjective optimization of minimizing intra-cluster distance and maximizing
inter-cluster distance and finding a set of trade-off solutions will provide
solutions not accessible by methods that optimize only one of the criteria) (Handl
and Knowles, 2007). This topic is certainly a future challenge as well.
16.2.2 Large-Scale Problems
By large-scale problems we understand problems with many variables,
constraints or objectives. In general, an exact lower bound for the number of
variables, constraints or objectives is not specified. These problems arise very
naturally in concrete applications. For instance, if one discretizes a system
of partial differential equations defining the constraints, one immediately gets
many constraints and many variables. If the considered variable of our
problem is a function, the discretization of this function leads to many variables.
In order to get a good approximation one has to work with many variables in
this case.
Problems with many objectives may occur, for example, in engineering.
For instance, the design of suspension bridges may lead to several hundred
objectives, which are difficult to handle. Other problems in material optimization
also belong to this class of large-scale problems. Standard methods of
multiobjective optimization cannot be applied to these large-scale problem types
without simplifying the original problem. Therefore, we need new methods
that are able to treat problems with many variables, constraints or objectives.
It seems to be difficult to design interactive methods for large-scale
problems, for various reasons. For the decision maker (DM) it is difficult to
handle a lot of objectives (or variables). The auxiliary problems which have to
be solved during every calculation phase may be so time-consuming that an
interaction with the DM does not make sense. Here we have to find new
concepts for interaction. Let us add that in some problems function evaluations
may be very costly even though the dimensions are small. From the
computational point of view, such problems can also be regarded as large-scale ones
and interaction may suffer as discussed above.
As in the case of single objective optimization, one important step for
the reduction of computation time in multiobjective optimization is the
parallelization of algorithms and their implementation on a distributed
computing system. Some earlier studies have demonstrated the use of a distributed
computing paradigm for the parallel computation of automatically allocated
non-overlapping regions of the Pareto optimal set (Deb et al., 2003; Branke
et al., 2004b). A successful treatment of large-scale problems can be reached
by using computers in parallel. New approaches such as grid computing allow
one to use entire networks of computers as one huge parallel computer. New
methods have to be designed for parallel architectures. The change from sequential
structures to parallel structures will accelerate in the future. More
discussion of the possibilities of parallel multiobjective optimization can be found in
Chapter 13.
16.2.3 Using Multiobjective Optimization to Aid in Other Problem Solving Tasks
Besides solving multiobjective optimization problems, multiobjective concepts
and approaches can also be exploited to solve other optimization problems:
• Constraint handling: In single objective optimization problems, an
  additional objective of minimizing the overall constraint violation can be
  employed. Furthermore, in problems where the constraints form an empty
  feasible region, each constraint can be converted into an objective
  function. This enables solving the problem by taking constraint violations as
  objectives to be minimized (Miettinen et al., 1998).
• Optimization with an additional requirement: In many problems, although
  the goal is to minimize or maximize a single or multiple objectives, the
  solution should also exhibit other desirable properties. For example, in the
  context of evolving computer programs for performing a task using the
  genetic programming approach, the goal is often to execute the task as
  accurately as possible but with a hidden agenda of developing a strategy
  (program) which is as simple as possible. Bleuler et al. (2001) minimized
  the size of a genetic program in addition to the supplied objective
  functions. Since the minimization of the program size is also an important
  objective, the genetic programming attempts to find the optimal program
  without making the program unnecessarily large.
• Improving the search landscape: Furthermore, some recent studies (Knowles
  et al., 2001; Neumann and Wegener, 2005) have shown that decomposing
  the original single objective function carefully into multiple functionally
  different objectives and treating the problem as a multiobjective
  optimization problem makes the problem easier to solve than the usual single
  objective optimization procedure.
• Revisit traditional problems with multiple objectives for a better and more
  informative solution strategy: Sometimes, adding extra objectives as
  so-called helper (or proxy) objectives allows a better handling of single
  objective optimization problems (Jensen, 2004). Certain problems are
  traditionally solved using a particular procedure. A reconsideration of such
  problems using a multiobjective optimization strategy can be useful in
  many problem solving tasks, like the multiobjective clustering problem
  discussed earlier.
• Knowledge discovery from multiobjective optimization results: A recent
  concept of innovization, innovation through optimization, makes a
  post-optimality analysis of obtained trade-off solutions for deciphering
  principles which commonly appear in most obtained trade-off solutions
  (Deb and Srinivasan, 2006). Since the solutions obtained by an EMO or
  an a posteriori MCDM method are close to being (or are) Pareto
  optimal, they are expected to have certain features which remain common to
  qualify these solutions to be close to the Pareto optimal set and certain
  features which allow them to have a trade-off among objectives. An effort
  to try to decipher such valuable information from a set of trade-off
  near-optimal solutions is a unique way of discovering salient information about
  how to solve a problem in a near-optimal manner. In many engineering
  design problems and game playing problems, interesting and new design
  principles and strategies can be unearthed by such a procedure.
The possibility of adding additional objectives to make the search more
flexible or even deleting one or more objectives to restrict the search in certain
directions provides flexible ways of performing various search tasks (Fliege,
2007). These possibilities certainly open up new avenues and new ways of
solving problems and should be exploited more in the near future. The above
and a number of other possibilities of aiding different problem solving tasks
through multiobjective optimization are discussed in (Knowles et al., 2008).
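The constraint handling idea from the list above can be sketched in a few lines. The toy problem (minimize x squared subject to x >= 1), the candidate grid and all names are hypothetical; the sketch only shows how a constraint violation can be treated as a second objective and is not the method of any cited study.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(u <= v for u, v in zip(a, b)) and any(u < v for u, v in zip(a, b))


def objective(x):
    """Original single objective of the hypothetical toy problem."""
    return x * x


def violation(x):
    """Amount by which the constraint x >= 1 is violated (0 if satisfied)."""
    return max(0.0, 1.0 - x)


def biobjective(x):
    """Treat the constraint violation as a second objective to minimize."""
    return (objective(x), violation(x))


candidates = [0.25 * i - 1.0 for i in range(13)]   # grid on [-1, 2]

# Nondominated candidates with respect to (objective, violation).
front = [x for x in candidates
         if not any(dominates(biobjective(q), biobjective(x)) for q in candidates)]

# The constrained optimum is the front member with zero violation.
best_feasible = min((x for x in front if violation(x) == 0.0), key=objective)
```

The front contains the constrained optimum together with slightly infeasible points of lower objective value, which shows how much could be gained by relaxing the constraint.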
16.3 Challenging Methods of Finding Optimal Solutions
Having discussed new problem domains for multiobjective optimization, we
now discuss new and challenging methodologies of arriving at optimal
solutions to multiobjective optimization problems.
16.3.1 Hybrid Methods
As mentioned earlier in this book, in the MCDM literature, solving
multiobjective optimization problems has typically been understood as a task of helping
a DM in finding the most preferred solution in the presence of conflicting
objectives. In this kind of a problem setting, the DM's preference information plays
an important role. However, until recently, EMO approaches have mostly
concentrated on approximating the whole set of Pareto optimal solutions. This
brings about a natural question of how MCDM and EMO approaches can
complement each other. For example, EMO methodologies can be used to include
preference information (Fonseca and Fleming, 1998; Parmee et al., 2000) (see
Chapter 6 for more studies). As an example of hybridizing ideas and
methods of the MCDM and EMO fields, we can mention that some reference point
(see Section 2.3 in Chapter 2) based EMO methods have already been
introduced (Deb et al., 2006; Thiele et al., 2007), but there is much more potential
in preparing new hybrid methods. Other examples of augmenting interactive
MCDM methods with EMO ideas include (Deb and Kumar, 2007a,b), where
the reference direction approach (Korhonen, 1988) and the light beam search
(Jaszkiewicz and Słowiński, 1999) are utilized, respectively. Overall, the goal is
to analyze the strengths of different approaches and utilize and combine them.
A very simple hybridizing idea is to use continuous local search
methods (with scalarizing functions used in MCDM, see Chapter 1) together with
EMO. This can be useful, for example, in order to improve (or even guarantee
Pareto optimality of) different solutions produced by an EMO algorithm.
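A minimal sketch of this local improvement step is given below. It assumes an augmented achievement scalarizing function with unit weights and the candidate's own objective vector as reference point, and it uses a simple compass (pattern) search as a stand-in for a continuous local solver. The two objective functions, the starting point and all names are hypothetical.

```python
def objectives(x):
    """Hypothetical biobjective test problem (both objectives minimized)."""
    x1, x2 = x
    return (x1 ** 2 + x2 ** 2, (x1 - 1.0) ** 2 + x2 ** 2)


def achievement(x, ref, rho=1e-3):
    """Augmented achievement scalarizing function with unit weights."""
    diffs = [fi - zi for fi, zi in zip(objectives(x), ref)]
    return max(diffs) + rho * sum(diffs)


def compass_search(func, start, step=0.5, tol=1e-6):
    """Derivative-free local search: try +/- step along each coordinate,
    accept any improvement, halve the step when none is found."""
    x = list(start)
    best = func(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                value = func(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return tuple(x)


# A candidate as an EMO run might return it: close to, but not on, the Pareto set.
start = (0.4, 0.3)
ref = objectives(start)   # use the candidate's own objective vector as reference
refined = compass_search(lambda x: achievement(x, ref), start)
```

Because the reference point is the objective vector of the candidate itself, any decrease of the scalarizing function below zero yields a point that is better in every objective, so the refined solution cannot be worse than the one it started from.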
Hybridizations of approximation algorithms (approximating the Pareto
optimal set) and interactive MCDM methods have, for example, been given
in (Klamroth and Miettinen, 2008; Miettinen et al., 2003). Similar ideas can
be applied with EMO and interactive MCDM methods. By first using an
approximation algorithm, the DM gets a general understanding of the
problem as a whole, its possibilities and limitations, and it is easier for him/her
to specify preference information for the interactive method used. It is, for
example, easier to specify the starting point for the interactive method or to
specify a reference point.
One possibility of creating hybrid methods is to apply MCDA methods
developed for dealing with a discrete set of solution alternatives (Olson, 1996)
to the set of solutions generated by an EMO algorithm or a subset thereof.
In this way, decision support tools of MCDA could help the DM in analyzing
multidimensional objective vectors and finding the most preferred solution.
For an example, see (Thiele et al., 2007), where using the reference direction
based VIMDA method (Korhonen, 1988) is discussed. A simple
implementation of an EMO-MCDA hybrid procedure is also suggested in (Deb and
Chaudhuri, 2007).
Sometimes, it is difficult for DMs to move from one Pareto optimal
solution to another because this necessitates giving up something in some objective
function values. If, for example, an EMO algorithm is used to generate an
approximation of the Pareto optimal set, the solutions in the populations produced are
not yet necessarily Pareto optimal. This leads to an idea of a method with
a natural win-win situation. Namely, if populations generated during the EMO
search are shown to the DM, (s)he can direct the search and get better and
better solutions.
16.3.2 Global Solvers
Many multiobjective optimization problems arising in engineering are
problems defined by nonconvex, nondifferentiable and multi-modal functions.
These functions are highly nonlinear. In this case, we must be able to
employ a global solver to find the globally optimal solutions. Often, the auxiliary
single objective problems which have to be solved as subproblems in
multiobjective methods do not have the necessary mathematical structure, like
generalized convexity, ensuring that computed points are global solutions. In
many algorithmic investigations the question of whether a global solution of the
auxiliary problems can be determined is not discussed. But in
practice this is a significant point. In single objective optimization, locally
optimal solutions can still be of some use, as economists or engineers already
accept a computed point if a drastic reduction of costs can be obtained by the
obtained (local or global) solution. But in multiobjective optimization finding
global solutions is crucial, as often the optimization task is followed by a
decision making task. If the solvers used do not compute global solutions, one
obtains an approximation of the set of Pareto optimal points which may be
completely unsuitable for making decisions (see also the discussion in Chapter
1). Such a failure occurs, for example, if the set of Pareto optimal
points is not connected, that is, it consists of several disconnected parts. Then
the gaps between these parts are difficult to identify from a numerical point
of view. Evolutionary or stochastic optimization methods are better equipped
to deal with such problems and must be investigated more rigorously.
Let us point out that instead of using solvers that can guarantee only
the local optimality of solutions generated, it is possible to use some global
single objective solver for solving the auxiliary problems produced by MCDM
methods, for example, evolutionary algorithms or a hybrid solver where a local
solver is used after an evolutionary algorithm, both suggested by Miettinen
and Mäkelä (2006).
In the context of single objective optimization, a recent study (Eremeev
and Reeves, 2003) has suggested that after a solution is found by using an
approximate solver (such as an evolutionary algorithm), a validation procedure
must be used to support the result. The study suggested a sampling
procedure to estimate the frequency of falling into local optima. Extensions of such
studies can be made in the context of multiobjective optimization. However,
there still is much to do in this field in order to find the most appropriate
solvers to be used in each problem considered.
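The general idea of estimating by sampling how often a local solver ends in each local optimum can be sketched as follows. This is not the procedure of Eremeev and Reeves (2003); the one-dimensional landscape, the hill climber and all names are hypothetical and serve only as a minimal illustration.

```python
import random
from collections import Counter

# Hypothetical objective values to be minimized, indexed by position.
landscape = [5, 3, 4, 2, 1, 2, 6, 4, 7, 0, 3]


def hill_climb(i):
    """Move to the better neighbouring position until no neighbour improves."""
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
        best = min(neighbours, key=lambda j: landscape[j])
        if landscape[best] >= landscape[i]:
            return i          # i is a local optimum
        i = best


def optimum_frequencies(n_samples, seed=0):
    """Estimate how often a random start ends in each local optimum."""
    rng = random.Random(seed)
    hits = Counter(hill_climb(rng.randrange(len(landscape)))
                   for _ in range(n_samples))
    return {opt: count / n_samples for opt, count in hits.items()}


frequencies = optimum_frequencies(1000)
global_optimum = min(range(len(landscape)), key=lambda j: landscape[j])
```

In this toy landscape only three of the eleven starting positions lead to the global optimum, so the estimated frequency for `global_optimum` indicates how much trust a single local run deserves.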
In general, because evolutionary algorithms or stochastic methods are
potential global solvers, the question of interest is how to improve their
algorithmic behavior with techniques using derivatives. The above-mentioned way
is the simplest possibility. For instance, if one applies an evolutionary
algorithm to a complicated problem with smooth functions, it certainly makes
sense to combine the evolutionary algorithm with a local solver which uses
information on derivatives. Such a combination may improve the evolutionary
algorithm. These memetic algorithms are difficult to design because one has to
determine when to switch from the evolutionary algorithm to the local solver
and back. For example, memetic methods combining an evolutionary
algorithm with the well-known sequential quadratic programming (SQP) method
produce promising results. Here we need comprehensive investigations on the
interface between these evolutionary methods and the derivative-based methods
suitable for such a combination. These investigations should lead to modern
metaheuristic approaches resulting in new global solvers. For instance, hybrid
solvers involving simulated annealing and the (local) proximal bundle method
are introduced in (Miettinen et al., 2006).
Based on the remarks listed, there is a need for efficient global solvers (as
also concluded by Aittokoski and Miettinen (2008)). We should develop hybrid
methods combining standard methods with global strategies. These global
hybrid solvers are very desirable and they would bring a breakthrough in finding
guaranteed global Pareto optimal solutions in multiobjective optimization.
16.4 New Trends in Optimality
Finally, let us devote some thoughts to a few new trends in defining optimality
in the context of multiobjective optimization: subjective preferences, different
optimality concepts, and robust solutions.
16.4.1 Subjective Preferences
The MCDM literature typically places the DM at the center of the solution
process of a multiobjective decision problem (see, e.g., Belton and Stewart
(2001)). The DM's preferences determine which objective functions are more
or less important, and how different objective values are to be rated.
Consequently, aggregation across objectives has to be performed in a way which is
consistent with the DM's preferences. This subjectivity is often seen as one
of the characteristic features of multiobjective optimization problems, which
distinguish this area from single objective optimization, where an objectively
optimal solution can be found.
This subjective view is not entirely shared in the EMO literature, or more
generally in multiobjective optimization. When solving multiobjective
optimization problems, the aggregation across the individual objectives is often
specified by model developers, with little involvement of the actual DMs. From
a subjective perspective, this might seem a grave omission: an analyst who
selects an aggregation mechanism across objective functions (like an additive
function), and specifies weights of individual objectives to be used in this
aggregation, takes away decision authority from the actual DM. Even seemingly
objective concepts like dominance or Pareto optimality contain subjective
elements because dominance requires at least information about the
direction of preference within each objective function. But not all multiobjective
optimization problems exhibit this level of subjectivity. Sometimes, even an
aggregation across objective functions can be performed quite objectively, and
a model developer might even be in a better position to perform such an
objective optimization than the actual DM. However, if subjective information
is not available, EMO can be used to get an idea of the Pareto optimal set,
at least in the case of optimization problems with two to four objectives.
Rather than establishing a strict dichotomy between objective single
objective optimization problems and subjective multiobjective problems, we
propose a taxonomy of different levels of subjectivity in multiobjective
optimization problems. In this taxonomy, the classical view of
multiobjective problems does not even form an endpoint, but an intermediate
stage.
The proposed taxonomy consists of four cases:
i) Multiobjective optimization problems as a technical solution device.
ii) Multiobjective optimization problems as an approximation of a higher
level objective.
iii) Subjective multiobjective optimization problems.
iv) Problems involving meta-criteria.
Multiobjective Optimization Problems as a Technical Solution Device
In some cases, heuristics work better on multiobjective optimization problems
than on problems with a single objective function. In these cases, it might
make sense to perform a multi-objectivization of the problem (Knowles et al.,
2001): to split an explicitly given criterion into several functions and solve the
resulting multiobjective optimization problem, as discussed in Section 16.2.3.
Of course, the aggregation procedure in this case is fully determined and has
to reconstruct the original objective function.
Multiobjective Optimization Problems as an Approximation of a

Higher Level Objective
In many applications of multiobjective optimization methods, the DM ac-

tually wants to maximize some higher level criterion, but this criterion can
either not be directly measured, or the relationship of the decision variables to
that higher level criterion is not clear. Therefore, one uses several substitute
criteria and solves a multiobjective optimization problem, instead. For more
details, refer to (Miettinen, 1999). A study on EMO (Handl et al., 2007) called
these substitute criteria proxy objectives. In many MCDM applications in
business, the long run prot of the rm is the ultimate goal. But the impact
of many decisions on a long run prot can hardly be quantied. For exam-
ple, when hiring a new executive, one cannot predict how much a particular
person will contribute to prot, so substitute criteria like education and expe-
rience are used to approximate that persons productivity. Another example
demonstrating the benets of using multiobjective optimization is discussed
in (Hakanen et al., 2005), where estimating amortization time and interest
rate for capital is avoided when balancing between investment and running
costs in the case of designing a heat recovery system of a paper mill.
In these cases, neither the choice of a preference model nor the selection
of parameters (e.g., weights) to be used in that model is purely subjective,
but both should approximate the likely relationship of substitute criteria to
the higher level criterion. A higher weight in this case does not indicate that
a substitute (lower level) criterion is considered more important, but that it
is considered to have a stronger influence on the higher level criterion.
Subjective Multiobjective Optimization Problems
Subjective problems are typically considered in MCDM, where the aggregation
of objective functions depends solely on the subjective preferences of the DM.
This type of decision problem is often illustrated by referring to personal
decisions like the purchase of a car, where attributes like comfort, speed or
costs need to be compared.
In this case, no objective aggregation model exists that would be valid
for all DMs. Of course, modeling can still be performed by an analyst, but
only in close contact with the actual DM who has to provide the relevant
preference information.
Problems Involving Meta-criteria
Many multiobjective optimization problems are related to decisions in which
the interests of multiple stakeholders have to be taken into account. Such
decisions occur, for example, in public policy. Even when the decision is ultimately
made by one individual, for example, a politician, that individual has
to consider the interests of different parties. While in the decision problems
discussed so far, an improvement in any objective function could be considered
to improve the overall evaluation of a decision alternative (this assumption
underlies the whole concept of Pareto optimality), this is no longer true when
aspects like fairness need to be taken into account. Here, further improvements
of the situation of stakeholders who are already better off than the
others might be considered as unfair and, thus, make a solution less preferable.
Such meta-criteria, which evaluate the distribution of results across sev-
eral criteria, occur not only in multi-person decisions. For example, when time
streams of income are evaluated, income in each period could be considered
as an objective. Apart from maximizing income in each period (which would
correspond to the standard multiobjective formulation), DMs might prefer a
constant income stream over a stream which exhibits large variations over
time. This preference for particular patterns should not be confused with risk
aversion; it can occur even if all payments are known in advance with cer-
tainty. To handle this type of problems, Kostreva et al. (2004) developed the
concept of equitable multiobjective decision making and showed how several
multiobjective optimization methods, in particular reference point methods,
can be extended to handle such problems.
Further Comments on the Taxonomy
Our taxonomy of multiobjective optimization problems has several consequences
for the way in which preferences are elicited, modelled and aggregated.
The first difference concerns the person, or group of persons, from
whom preference information can be obtained. In highly subjective problems,
only the DM him/herself can provide information about preferences. But in
problems where multiple criteria are used to approximate a higher level objective,
it might be reasonable to obtain input from several experts in order to get
a clearer picture of how substitute criteria will actually influence the higher
level objective. In the remaining two cases no real preference elicitation can
take place. When multiple criteria are introduced for technical reasons, their
aggregation is also a technical problem. In the case of meta-criteria, the way in
which individual criteria are aggregated is based on the meta-criteria involved,
which can be considered as axioms an aggregation method must fulfill.
This distinction also has consequences for the likely stability of preference
information. While there is some empirical evidence that individual preferences
towards multiple criteria remain stable over time (Blackmond and Fischer,
1987; San Miguel et al., 2002), they are still subject to more external
influences than causal relationships between substitute criteria and higher
level objective. Consequently, preference information obtained for the latter
type of problems, as well as for the other two classes, needs to be elicited less
often in repeated decisions than for subjective problems.
One might also view properties of solution concepts, like efficiency or
independence of irrelevant alternatives, differently in the four cases of our
taxonomy. In the first case, such axioms are more or less irrelevant. Aggregation
has to reconstruct the original objective, regardless of whether it fulfills
common axioms of decision analysis or not. In the second case, rationality (in
the form of axioms) becomes more important, since in most problems, it can
be expected that the true relationship between substitute criteria and the
actual higher level objective also follows these principles. In the third case,
acceptance of axioms is entirely up to the DM. Empirical research on bias
phenomena in decision making has provided considerable evidence that sub-
jects consciously choose to violate axioms of decision analysis, even when this
violation is pointed out to them (von Winterfeldt and Edwards, 1986). Finally,
in the last case, meta-criteria are themselves axioms, and their acceptance by
all stakeholders is a prerequisite for acceptable solutions.
By formulating the above taxonomy, we have just started to explore the
impact of different levels of subjectivity on the solution process, as well as
the underlying theory of multiobjective optimization, both with MCDM and
EMO methods. This could become an interesting area of future research.
16.4.2 Generalized Dominance and Redefining Optimality
Most multiobjective optimization studies use the concept of Pareto optimality
for driving their search. However, there exist a number of other trends of
redefining the usual Pareto optimality. Such considerations usually reduce the
size of the optimal set and on some occasions make it easier for the search
algorithms to handle the complexity associated with multiobjective optimization.
Here we discuss a number of such trends of redefining optimality in
multiobjective optimization.
In this book, the basic concept of optimality has been that of Pareto op-
timality, but a closely related, relaxed, concept of weak Pareto optimality
is sometimes used because the latter is computationally simpler and many
straightforward approaches to multiobjective optimization generate weakly
Pareto optimal solutions (see, e.g., Preface and Chapter 1). However, weak
Pareto optimality is not satisfactory for applications because it ignores clear
possibilities of solution improvement with respect to some objectives. Actually,
even the concept of Pareto optimality may be too weak for many applications.
As discussed in Chapter 1, the notion of proper Pareto optimality
(Geoffrion, 1968) assumes that all the trade-offs are bounded (see also Chapter 2).
Sometimes, more useful for applications are solutions that are properly
Pareto optimal with an a priori given bound on trade-offs.
Several dominance (and thereby efficiency or Pareto optimality) concepts
can be introduced through the so-called dominance cone (Yu, 1974), as also briefly
discussed in Chapter 1. The partial order of the dominance relation is implied
by a convex cone D in the sense that $y'$ dominates $y''$ if and only if
$y' - y'' \in D \setminus \{0\}$. The standard Pareto optimality or Pareto dominance is defined by
using an orthant cone (negative orthant for minimization). A narrower cone
restricts the dominance relation, thus expanding the corresponding efficient set.
On the other hand, a wider cone makes more outcome vectors dominated,
thus narrowing down the efficient set.
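The effect of widening the cone can be illustrated with a minimal sketch. It assumes a polyhedral cone written as {d : A d <= 0} and a minimization problem with two objectives; the matrix representation, the bound EPS and all function names are illustrative assumptions, not notation from the chapter.

```python
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def in_cone(d: Vector, A: Matrix, tol: float = 1e-12) -> bool:
    """True if d lies in the polyhedral cone {d : A d <= 0}."""
    return all(sum(a * x for a, x in zip(row, d)) <= tol for row in A)


def cone_dominates(y1: Vector, y2: Vector, A: Matrix, tol: float = 1e-12) -> bool:
    """y1 dominates y2 iff y1 - y2 is a nonzero element of the cone."""
    d = [a - b for a, b in zip(y1, y2)]
    nonzero = any(abs(x) > tol for x in d)
    return nonzero and in_cone(d, A, tol)


# Negative orthant (standard Pareto dominance for minimization).
PARETO = [[1.0, 0.0], [0.0, 1.0]]

# A wider cone that contains the negative orthant (hypothetical bound EPS).
EPS = 0.2
WIDER = [[1.0, EPS], [EPS, 1.0]]

ya, yb = (1.0, 2.1), (2.0, 2.0)
pareto_result = cone_dominates(ya, yb, PARETO)  # ya is slightly worse in objective 2
wider_result = cone_dominates(ya, yb, WIDER)    # the small loss is outweighed
```

Under the orthant cone the two vectors are incomparable, whereas under the wider cone ya dominates yb, so yb drops out of the efficient set.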
A corresponding dominance cone can be constructed by combining the
orthant with the half-space (Kaliszewski, 1994; Wierzbicki, 1986). Actually,
the reference point method and many other scalarizing functions model such
dominance by taking the sum of objective values (the half space) with a small
weight to regularize the basic term of the max-min aggregation (the orthant).
See also Chapters 1 and 2 as well as (Miettinen, 1999).
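As a hedged illustration of this regularization, the sketch below uses one common form of an augmented achievement function for minimization: a max term plus a small multiple rho of the sum. The exact form, the weights and the value of rho are assumptions made for the example and may differ from the formulations in the cited works.

```python
def achievement(y, ref, weights, rho=0.0):
    """Max term (the orthant) plus rho times the sum (the half-space)."""
    terms = [w * (yi - ri) for yi, ri, w in zip(y, ref, weights)]
    return max(terms) + rho * sum(terms)


ref = (0.0, 0.0)       # hypothetical reference point
weights = (1.0, 1.0)   # hypothetical weights
ya = (3.0, 1.0)        # dominates yb
yb = (3.0, 3.0)

plain_a = achievement(ya, ref, weights)            # max term only
plain_b = achievement(yb, ref, weights)
aug_a = achievement(ya, ref, weights, rho=0.01)    # with regularization
aug_b = achievement(yb, ref, weights, rho=0.01)
```

The max term alone assigns both vectors the same value, so it cannot separate ya from the dominated yb; with the small sum term added, ya is strictly preferred.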
Most traditional MCDM approaches to multiobjective optimization seek
the best solution according to the DM's preferences while treating the
dominance relation as a common principle of all rational preference models.
Thus the concept of Pareto optimality is rather used as a necessary condition
to establish the boundary of acceptable choices. Therefore, strengthening
the dominance concept is not so crucial for the implementation of interactive
MCDM procedures, although still important. On the other hand, EMO
procedures use three different features: emphasis on nondominated solutions
in the current population, emphasis on previously-found nondominated solutions,
and emphasis on less crowded solutions in the objective space (see
Chapter 3). Many studies related to different dominance relations and approaches
utilizing them have been published over the years in the MCDM
field. Lately, they have also attracted attention in the EMO field. For example,
wider dominance cones can be used to focus an EMO search on a part of the
Pareto optimal set (Branke et al., 2001; Laumanns et al., 2002), instead of on the
complete set. In particular, cone dominance makes it possible to formalize concepts
of narrowing the Pareto optimal set related to limitations on trade-offs.
Note that the dominance cone can be changed during the solution process.
Such a dominance structure appears, for instance, in the case of a given value
(or utility) function maximization. The dominance structure corresponding to
the comparison of the value function values is represented by the tangent cone
to the isoline contours of the value function at any objective vector. For poorly
characterized preferences in multiobjective problems, it is often desirable to
seek (approximate) optimal solutions for a large class of value functions.
Solutions corresponding to the optimal value of a large variety of linear value
functions can be emphasized within the EMO procedure, thereby helping to
find knee objective vectors in certain problems (Branke et al., 2004a). An approximate
majorization relation enables the search for solutions maximizing
all symmetric concave value functions (Goel and Meyerson, 2006).
There are many applications leading to problems with a large number of
uniform criteria considered impartially, which makes the distribution of outcomes
more important than the assignment of several outcomes to the specific
criteria. Such models are generally related to the evaluation and optimization
of various systems which serve many users where quality of service for every
individual user defines the criteria. This applies to various technical and social
systems. An example arises in locating public facilities where the decisions of-
ten concern the placement of a service center or another facility in a position
so that the users are allocated in an impartial way. Thus, we are interested
in comparing distributions of values within the objective vectors rather than
componentwise comparison of objective vectors (Ogryczak, 1999). Note that
having two possible location patterns generating objective vectors (5, 0, 5) and
(0, 1, 0), we would recognize both the location patterns as Pareto optimal in
terms of (distance) minimization. However, the first location pattern generates
two objectives (distances) equal to 5 and one objective equal to 0, whereas the
second pattern generates one objective equal to 1 and two objectives equal to
0. Thus, in terms of the distribution of objective values, the second location
pattern is clearly better.
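The location example can be checked with a short sketch. Impartiality is modeled here by sorting each objective vector before comparing, which is one simple way to compare distributions of values; the helper names are illustrative.

```python
def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def ordered(y):
    """Objective values sorted from worst (largest) to best (smallest)."""
    return tuple(sorted(y, reverse=True))


first, second = (5, 0, 5), (0, 1, 0)

# Componentwise comparison: neither pattern dominates the other.
componentwise = (dominates(first, second), dominates(second, first))

# Impartial comparison of the sorted vectors (5, 5, 0) and (1, 0, 0).
impartial = dominates(ordered(second), ordered(first))
```

Componentwise, both patterns are nondominated, but once the assignment of distances to particular users is ignored, the second pattern dominates the first.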
The need to search for some optimal distribution of objective values is
commonly recognized in problems which may be viewed as resource allocation
models. While allocating limited resources to maximize the system efficiency,
such models also attempt to provide a fair treatment of all the competing activities.
For instance, in networking, a central issue is how to allocate bandwidth
to flows efficiently and fairly (Denda et al., 2000; Pióro and Medhi, 2004).
Furthermore, uniform individual criteria may be associated with some events
rather than physical users, like in many dynamic optimization problems where
uniform individual criteria represent a similar event in various periods or in
decision problems under uncertainty where uniform individual criteria repre-
sent the outcome realizations under various scenarios. Another type of model
is that of approximation of discrete data by a functional form. The residuals
may be viewed as objectives to be minimized, and there is no reason to treat
them in any way but impartially.
In many models fair consideration of all criteria requires more than only
impartiality. In order to ensure fairness in a system, all system entities have
to be equally well provided with the system's services. This means that more
equal objective vectors are preferred to unequal ones or, more formally, a
transfer of any small amount from an objective function to any other relatively
worse-off objective results in a more preferred objective vector. For instance,
a solution generating all three objective values equal to 2 is considered better
than any solution generating individual values 4, 2 and 0. This leads to
concepts of fairness expressed by the equitable efficiency as a specific refinement
of Pareto optimality taking into account impartiality and inequality
minimization (Kostreva et al., 2004). Thus, seeking the optimal distribution
of objective values is actually a new multiobjective problem type. However,
the dominance structure for objective vectors does not represent any cone
(Kostreva and Ogryczak, 1999).
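A minimal sketch of an equitable comparison follows. It assumes minimization and uses cumulative sums of the values sorted from worst to best, a characterization commonly associated with equitable efficiency; the function names are illustrative and the sketch is not taken from the cited papers.

```python
from itertools import accumulate


def cumulative_ordered(y):
    """Sum of the worst value, of the two worst values, and so on."""
    return tuple(accumulate(sorted(y, reverse=True)))


def equitably_dominates(a, b):
    """Pareto dominance applied to the cumulative ordered vectors."""
    ca, cb = cumulative_ordered(a), cumulative_ordered(b)
    return all(x <= y for x, y in zip(ca, cb)) and any(x < y for x, y in zip(ca, cb))


equal, unequal = (2, 2, 2), (4, 2, 0)
equal_preferred = equitably_dominates(equal, unequal)
unequal_preferred = equitably_dominates(unequal, equal)
```

Both vectors have the same total, and neither Pareto dominates the other, yet the cumulative vectors (2, 4, 6) and (4, 6, 6) show that the equal distribution is preferred.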
Currently, some specific solution concepts are used for various application
areas. Biobjective aggregations to the mean and some dispersion measure are
used in the areas of decisions under risk and location analysis as well. The
max-min approach additionally regularized with the lexicographic order (the
so-called max-min fairness) is commonly used in resource allocation problems
(Luss, 1999). Approaches exploiting the multiobjective nature of distribution
optimization problems are rather rare (Ogryczak et al., 2008). Actually, such
problems are hard for preference modeling and identification within the interactive
MCDM methods as well as for the EMO approaches. Nevertheless,
they deserve to be investigated more intensively.
16.4.3 Robust Solutions
A conventional optimization approach that considers only the optimality of
a decision or a design, that is, performance at decision or design condition,
should work fine in a controlled environment. Real-world applications, on
the other hand, inevitably involve errors and uncertainties (be it, e.g., in
the design process, manufacturing process, and/or operating conditions), so
that the resulting performance may be lower than expected. For instance,
the aerodynamic performance of an airplane wing design is very sensitive to
the wing shape and flight conditions and, thus, it may deteriorate drastically
when subject to wing manufacturing errors and wind variations even if the
wing design is optimized.
Several approaches have been developed to deal with uncertain or impre-
cise data. The approaches focused on the quality or on the variation (stability)
of the solution for some data domains are considered robust. The notion of
robustness applied to decision problems was first introduced almost 50 years
ago by Gupta and Rosenhead (1968). Practical importance of the performance
sensitivity against data uncertainty and errors has later attracted consider-
able attention to the search for robust solutions. Actually, as suggested by Roy
(1998), the concept of robustness should be applied not only to solutions but,
more generally to various assertions and recommendations generated within
a decision support process. A brief comparison between conventional opti-
mization and robust optimization is illustrated in Fig. 16.1 a). Solution A
obtained by a conventional optimization is the best in terms of optimality,
but disperses widely in terms of the objective function against the dispersion
of design variable or environmental variable, and this dispersion may extend
to an infeasible range. On the other hand, solution B obtained by a robust
optimization is moderately good in terms of optimality and also good in terms
of robustness, that is, dispersion of objective function is narrow against
dispersion of design variable.
On the other hand, the optimal solution despite generating objective val-
ues dispersed quite widely may be clearly better than a solution not dispersed
at all. As depicted in Fig. 16.1 b), solution A though characterized by dis-
persed results remains under all conditions better than the stable solution B.
Hence, solution B is obviously dominated and it cannot be considered a robust
optimal solution.
Fig. 16.1. Comparison between conventional optimization and robust optimization
(for a minimization problem): a) conventional optimal solution A vs. robust optimal
solution B; b) stable but not robust optimal solution B. (Both panels plot the
objective function against a design or environmental variable, with feasible and
infeasible ranges marked.)
The precise concept of robustness depends on the way the uncertain data do-
mains and the quality or stability characteristics are introduced. Typically, in
robust analysis one does not attribute any probability distribution to repre-
sent uncertainties. Data uncertainty is rather represented by non-attributed
scenarios, which means there is no specific rule to determine the data uncertainty
characteristics. Since one wishes to optimize results under each scenario,
robust optimization might be in some sense viewed as a multiobjective optimization
problem where objectives correspond to the scenarios. However,
despite many similarities of such robust optimization concepts to multiobjective
models, there are also some significant differences (Hites et al., 2006).
Actually, robust optimization is a problem of optimal distribution of objective
values under several scenarios (cf. Section 16.4.2) rather than a standard
multiobjective optimization model.
A conservative notion of robustness focusing on worst case scenario re-
sults is widely accepted and the min-max optimization is commonly used to
seek robust solutions. The worst case scenario analysis can be applied ei-
ther to the absolute values of objectives (the absolute robustness) or to the
regret values (the deviational robustness) (Kouvelis and Yu, 1997). The latter,
when considered from the multiobjective perspective, represents a simplified
reference point approach with the utopian (ideal) objective values for all
the scenarios used as aspiration levels. Recently, a more advanced concept of
ordered weighted averaging was introduced into robust optimization (Perny
et al., 2006), thus making it possible to optimize combined performances under the
worst case scenario together with the performances under the second worst
scenario, the third worst and so on. Such an approach better exploits the entire
distribution of objective vectors in search for robust solutions and, more
importantly, it introduces some tools for modeling robust preferences. Actu-
ally, while more sophisticated concepts of robust optimization are considered
within the area of discrete programming models, only the absolute robustness
is usually applied to the majority of decision and design problems.
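The three notions can be compared on a small, made-up cost table (minimization, three candidate solutions, three scenarios). The data, the OWA weights and the helper names are assumptions chosen only for illustration.

```python
def worst_case(outcomes):
    """Worst (largest) outcome over the scenarios."""
    return max(outcomes)


def regrets(table):
    """Deviation of each outcome from the best value in its scenario."""
    n = len(next(iter(table.values())))
    best = [min(row[s] for row in table.values()) for s in range(n)]
    return {name: [v - b for v, b in zip(row, best)] for name, row in table.items()}


def owa(outcomes, weights):
    """Ordered weighted average: weights apply to outcomes sorted worst first."""
    return sum(w * v for w, v in zip(weights, sorted(outcomes, reverse=True)))


# Hypothetical costs of solutions A, B, C under three scenarios.
costs = {"A": [10, 2, 3], "B": [7, 7, 7], "C": [8, 6, 9]}

# Absolute robustness: minimize the worst-case cost.
absolute_choice = min(costs, key=lambda k: worst_case(costs[k]))

# Deviational robustness: minimize the worst-case regret.
regret_table = regrets(costs)
deviational_choice = min(regret_table, key=lambda k: worst_case(regret_table[k]))

# OWA: weight the worst, second worst and third worst outcomes.
owa_weights = [0.6, 0.3, 0.1]
owa_choice = min(costs, key=lambda k: owa(costs[k], owa_weights))
```

With these numbers the absolute criterion selects B, while the regret criterion selects A, which shows that the two conservative notions need not agree. Setting the OWA weights to (1, 0, 0) recovers the plain worst-case criterion.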
Taking into account the current computational capabilities of both EMO
and MCDM techniques, one may expect development of new robust optimiza-
tion approaches in many areas. Here, we do not make any attempt to discuss
all such existing implementations.
Dealing with Risk
When an (objective or subjective) probability distribution is specified to
characterize the data uncertainty, robust optimization becomes a problem of
decision under risk. In this context, robustness is represented by the notion
of risk aversion, and typically by a strong risk aversion. There exists a
well-developed methodology for decisions under risk and it can be directly
applied to robust optimization. In particular, the mean-risk (MR) approach
(Markowitz model) quantifies the problem in a lucid form of only two objectives:
the mean (expected) outcome $\mu$ and the risk $\varrho$, a scalar measure of the
variability (dispersion) of outcomes. The latter may be equally interpreted as
a robustness measure of solutions, thus allowing the MR model to be read as
mean-robustness in an appropriate setting.
The MR approach makes it possible to formalize robust optimization with two
separate criteria: optimality ($\mu$) and robustness ($\varrho$). Indeed, in many real-life
problems improvements in optimality and robustness compete, and the
MR model makes it possible to formalize and analyze the trade-off between these
two criteria. The classical Markowitz model uses the standard deviation $\sigma$ (or
variance $\sigma^2$) as the risk measure. Similarly, the biobjective model $\min\{\mu, \sigma\}$
is applied for robust optimization, although frequently in the scalarized form
$\min\{\mu + \lambda\sigma\}$ with the trade-off parameter $\lambda > 0$. Unfortunately, while the
mean-variance model is well suited for normal distributions, it may lead to
inferior conclusions in general. Referring to the case depicted in Fig. 16.1 b),
one may notice that the obviously worse solution B is characterized by $\sigma = 0$;
thus, in terms of the biobjective MR model it is not dominated by solution A
with a positive measure of dispersion, despite the fact that the latter is clearly
better under all scenarios. This flaw of MR models may be overcome by the
use of asymmetric dispersion measures focused only on disturbances negative
to the optimization and combining them with the mean values.
For instance, the biobjective model $\min\{\mu, \mu + \bar{\sigma}\}$ with $\bar{\sigma}$ representing the
upper side standard deviation will generate only solutions with nondominated
distributions of results (Ogryczak and Ruszczyński, 1999), namely, solutions
which cannot be improved under all scenarios simultaneously. One may notice
that, while considering the maximum upper deviation (from the mean) $\bar{\Delta}$
as a probability independent dispersion measure, one gets the criterion
$\mu + \bar{\Delta}$ expressing the worst case scenario result, that is, the classical
conservative notion of robustness. Multiobjective approaches to decisions under risk
(Ogryczak, 2002) make it possible to model various robust solution concepts.
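The flaw of the mean and standard deviation model, and its repair by an upper side deviation, can be reproduced numerically. The scenario outcomes and probabilities below are invented for the example (minimization), and the function names are illustrative.

```python
from math import sqrt


def mean(ys, ps):
    return sum(p * y for y, p in zip(ys, ps))


def std(ys, ps):
    m = mean(ys, ps)
    return sqrt(sum(p * (y - m) ** 2 for y, p in zip(ys, ps)))


def upper_semideviation(ys, ps):
    """Standard deviation counting only outcomes worse (larger) than the mean."""
    m = mean(ys, ps)
    return sqrt(sum(p * max(y - m, 0.0) ** 2 for y, p in zip(ys, ps)))


def dominates(a, b):
    """Pareto dominance for minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


probs = [0.25, 0.5, 0.25]
A = [1.0, 2.0, 3.0]   # dispersed, but better than B in every scenario
B = [4.0, 4.0, 4.0]   # perfectly stable, but worse everywhere

# Classical mean and standard deviation criteria.
mr_a = (mean(A, probs), std(A, probs))
mr_b = (mean(B, probs), std(B, probs))

# Mean and mean plus upper side deviation.
safe_a = (mean(A, probs), mean(A, probs) + upper_semideviation(A, probs))
safe_b = (mean(B, probs), mean(B, probs) + upper_semideviation(B, probs))
```

In the first pair of criteria B is not dominated because its deviation is zero. In the second pair A has the values (2.0, 2.5) against (4.0, 4.0) for B, so B is correctly recognized as dominated.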
Shimoyama et al. (2005) have proposed a multiobjective robust optimization
approach called design for multiobjective six sigma (DFMOSS). The DFMOSS
builds on the ideas of design for six sigma (DFSS) (Engineous Software,
Inc., 2002), coupled with an EMO algorithm (Deb, 2001), for an enhanced
capability to reveal trade-off information considering both optimality and
robustness of design. Jin and Sendhoff (2003) have also discussed the trade-off
between optimality and robustness in the context of multiobjective optimization.
The DFSS is based on the six sigma concept, which was originally
established as a measure of excellence for business processes. The aim is to
achieve a process with such a small dispersion that the range of $\pm 6\sigma$ (where
$\sigma$ is the standard deviation) around the mean value is included in an acceptable
range for the performance parameter. The level of dispersion can be defined
as the sigma level n satisfying the following constraints:
$\mu - n\sigma \geq \mathrm{LSL}$ and $\mu + n\sigma \leq \mathrm{USL}$, (16.1)
where LSL and USL are lower and upper specification limits, respectively. A
larger sigma level indicates smaller dispersion. In the context of robust design
optimization, smaller dispersion translates to a more robust characteristic.
For a general single objective optimization problem where an objective
function f(x) of design variable x must be minimized, the DFMOSS deals
with the biobjective optimization problem where the mean value ($\mu_f$) and the
standard deviation ($\sigma_f$) of f(x) must be minimized when x disperses around
the design condition due to errors and uncertainties. During the optimization
process itself, multiple solutions (individuals) are dealt with simultaneously
using EMO. For each individual, $\mu_f$ and $\sigma_f$ are evaluated as two separate
objective functions from f(x) at the sample points around x. From them, better
solutions are selected based on the Pareto optimality concept between $\mu_f$ and
$\sigma_f$. New solutions for the next step are reproduced by crossover and mutation
from the selected solutions. This optimization process is iterated until
the trade-off relation between $\mu_f$ and $\sigma_f$ has converged, and multiple robust
optimal solutions have been obtained. After the optimization, the sigma level
n satisfying (16.1) is post-evaluated for the obtained optimal solutions. This
allows one to select a robust solution with the highest sigma level (preferably
with the level 6).
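The evaluation of one individual and the post-evaluation of its sigma level might look as follows. This is only a sketch: the fixed offsets stand in for whatever sampling scheme is actually used, the test function and specification limits are invented, and the function names are not from the DFMOSS papers.

```python
from math import inf
from statistics import fmean, pstdev


def mean_and_std(f, x, offsets):
    """Estimate the mean and standard deviation of f at sample points around x."""
    values = [f(x + dx) for dx in offsets]
    return fmean(values), pstdev(values)


def sigma_level(mu, sigma, lsl, usl):
    """Largest n with mu - n*sigma >= LSL and mu + n*sigma <= USL, cf. (16.1)."""
    if sigma == 0.0:
        return inf if lsl <= mu <= usl else -inf
    return min((mu - lsl) / sigma, (usl - mu) / sigma)


def f(x):
    """Hypothetical performance function of one design variable."""
    return 2.0 * x + 1.0


offsets = [-0.1, 0.0, 0.1]              # assumed perturbations of x
mu_f, sigma_f = mean_and_std(f, 1.0, offsets)
n = sigma_level(mu_f, sigma_f, lsl=2.0, usl=4.0)
```

In an EMO run, mu_f and sigma_f would serve as the two objectives of each individual, and sigma_level would be applied only to the final nondominated solutions. A negative result indicates that the mean itself lies outside the specification limits.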
Note that some optimization problems do not have robust solutions that
satisfy six sigma. In such cases, it is preferable to find a solution with a sigma
level n as high as possible, even if it is less than six sigma. In addition, (16.1)
can still be considered during the $\mu_f$-$\sigma_f$ optimization; it is better to do this
when the n to be satisfied is strongly determined by a certain design requirement.
Let us also mention that Deb and Gupta (2005) have suggested two types
of robustness in the context of multiobjective optimization. Certainly, many
other variations are possible.
Uncertainty in Presence of Constraints
Uncertainty in problem parameters may affect not only the objective functions
but also the feasible set, thus threatening the feasibility of solutions. Solving
such problems is frequently referred to as reliability-based optimization, where
one seeks the best solution among those remaining feasible for various data
perturbations. Again, the precise concept of solution depends on the way the
uncertain data domains are introduced. When uncertainty is represented by
non-attributed scenarios, the worst case approach can be applied. When a
probability distribution is specified (either objective or subjective) to characterize
the data uncertainty, one gets a typical stochastic programming problem. Fig.
16.2 shows a hypothetical problem with two inequality constraints. Typically,
the optimal solution lies on a constraint boundary or at the intersection of
more than one constraint, as shown in the figure. In the event of uncertainties
in design variables (as shown in the figure with a probability distribution
around the optimal solution), in many instances such a solution will be infeasible.
In order to find a solution which is more reliable (meaning that there
is a very small probability of instances producing an infeasible solution), the
true optimal solution must be sacrificed and a solution interior to the feasible
region may be chosen.
Fig. 16.2. The concept of reliability-based optimization. (The figure shows the
feasible region in the (x1, x2) plane, the deterministic optimum, the reliable
solution, and the uncertainties in x1 and x2.)
For a desired reliability measure R, it is then desired to
find that feasible solution which will ensure that the probability of having an
infeasible solution instance created through uncertainties from this solution is
at most (1 − R). To arrive at such a solution, a stochastic optimization problem
can be converted to its deterministic equivalent (Birge and Louveaux,
1997; Romeijn et al., 2006).
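The reliability of a candidate solution can be estimated by plain Monte Carlo sampling, as in the sketch below. This is not one of the deterministic-equivalent or decoupled methods cited in the text; the two constraints, the Gaussian noise model and the sample size are assumptions chosen to mirror the situation of Fig. 16.2.

```python
import random


def reliability(x, constraints, sigma, n_samples=20000, seed=0):
    """Fraction of perturbed copies of x satisfying every g(x) <= 0."""
    rng = random.Random(seed)
    feasible = 0
    for _ in range(n_samples):
        sample = [xi + rng.gauss(0.0, sigma) for xi in x]
        if all(g(sample) <= 0.0 for g in constraints):
            feasible += 1
    return feasible / n_samples


# Hypothetical feasible region: x1 <= 1 and x2 <= 1.
constraints = [lambda x: x[0] - 1.0, lambda x: x[1] - 1.0]

# Deterministic optimum at the intersection of both constraints.
r_corner = reliability([1.0, 1.0], constraints, sigma=0.1)

# A solution moved into the interior of the feasible region.
r_interior = reliability([0.7, 0.7], constraints, sigma=0.1)
```

At the corner roughly one perturbed instance in four remains feasible, whereas the interior point is feasible in nearly all instances. A solution would be accepted for a required reliability R if its estimate is at least R.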
To handle such cases with a large reliability requirement, probabilistic
methodologies involving double-loop, single-loop and decoupled methods are
used. For example, one can incorporate a decoupled method with an EMO
procedure (Deb et al., 2007b) to find reliable sets, instead of a sensitive Pareto
optimal set, corresponding to a specified reliability value. More such studies
are needed to make the approach computationally viable and applicable to
practical multiobjective problem solving tasks.
Another issue involving uncertainty in solution evaluation comes from dealing
with noisy environments, in which objective and constraint function evaluations
introduce inherent noise. This issue has received a lot of
attention in the context of single objective evolutionary algorithms (see the
survey by Jin and Branke (2005)), and some attempts have recently been made in
EMO as well (Bui et al., 2005; Hughes, 2001; Teich, 2001). Clearly, more studies
are needed to fully understand the effect of noise in solution evaluation
procedures in multiobjective optimization.
16.5 Conclusions
In this chapter, we have discussed some ideas for future research in the con-
text of multiobjective optimization and decision making. However, plenty of
research is still needed in many aspects of decision making. In this respect,
some of the future directions mentioned by Miettinen (1999) still stand as
relevant challenges. Let us hope that the coming years will shed new light on
them and on many other fruitful and rewarding topics.
An important challenge is to increase awareness of the possibilities and
potential of multiobjective optimization because there still are many application
fields where multiobjective optimization is not used at all or is used in
a very simplistic way even though the problems solved clearly involve multiple
conflicting objectives. Often, the existence of decision support tools is
simply not known to many researchers. Here, the importance of strong and
encouraging case studies cannot be emphasized enough. For people dealing
with applications, case studies give a possibility to see the benefits obtainable
in a concrete and understandable way. A necessary and natural step in bringing
multiobjective optimization tools closer to real DMs is the important
challenge of designing user-friendly software for decision support. We need
software that is easily accessible (like the WWW-NIMBUS® system (Miettinen
and Mäkelä, 2000, 2006) operating via the Internet) and this certainly is
a field needing more attention.
In many applications, multiple objectives are hidden and simplified in the
modeling phase in order to produce a problem that seems to be solvable.
Increasing awareness of the existence of multiobjective optimization methods
and tools also encourages questioning the existing models in order to avoid
simplifications that blur the possibility of studying the interdependencies
between the conflicting objectives in real problems.
The possibilities and importance of interactive methods have been emphasized
a lot in this book because interactive methods give the DM a possibility
to learn about the problem considered. If the problem is complex and function
evaluations take a lot of time, the interactive nature of the solution process
may suffer because the DM has to wait for new and improved solutions. This
sets requirements and challenges on the computational efficiency of the
methods used. Besides using meta-modeling (see Chapter 10) and optimization
techniques with increased accuracy (as the solution process proceeds), new
approaches and ideas are needed. For example, ideas related to learning are
discussed in Chapter 15.
One aspect has clearly emerged from the chapters of this book: multiobjective
optimization (using evolutionary algorithms or otherwise) and decision
making aids must be put together synergistically, computationally efficiently,
and, above all, interactively for a DM to examine possible candidate solutions
and choose one particular preferred solution at the end. Such a task requires
one to first know both the optimization and the decision making literature
well. This book has shown a number of possibilities of such mergers from
various points of view. This chapter has also suggested a number of avenues
for moving forward in this direction. With collaborative efforts from various
research groups involving multiobjective optimization and decision making, we
should witness more holistic approaches, interactive algorithms and software
systems being developed for practical use in the coming years.
Acknowledgements
The work of K. Miettinen was partly supported by the Foundation of the
Helsinki School of Economics. The work of K. Deb was partly supported
by the Academy of Finland (grant #118319). The work of W. Ogryczak
was partially supported by the Ministry of Science and Information Society
Technologies under grant 3T11C 005 27.
References
Aittokoski, T., Miettinen, K.: Cost effective simulation-based multiobjective
optimization in performance of internal combustion engine. Engineering
Optimization 40(7), 593–612 (2008)
Bard, J.F.: Practical Bilevel Optimization: Algorithms and Applications. Kluwer
Academic Publishers, Dordrecht (1998)
Belton, V., Stewart, T.J.: Multiple Criteria Decision Analysis. Kluwer Academic
Publishers, Dordrecht (2001)
Bingul, Z.: Adaptive genetic algorithms applied to dynamic multiobjective problems.
Applied Soft Computing 7(3), 791–799 (2007), doi:10.1016/j.asoc.2006.03.001
Birge, J.R., Louveaux, F.: Introduction to Stochastic Programming. Springer, Hei-
delberg (1997)
Blackmond, L.K., Fischer, G.W.: Estimating utility functions in the presence of
response error. Management Science 33, 965–980 (1987)
Bleuler, S., Brack, M., Zitzler, E.: Multiobjective genetic programming: Reducing
bloat using SPEA2. In: Proceedings of the 2001 Congress on Evolutionary
Computation, pp. 536–543. IEEE Computer Society Press, Piscataway (2001)
Branke, J., Kaußler, T., Schmeck, H.: Guidance in evolutionary multi-objective
optimization. Advances in Engineering Software 32, 499–507 (2001)
optimization. In: Yao, X., Burke, E.K., Lozano, J.A., Smith, J., Merelo-Guervós,
J.J., Bullinaria, J.A., Rowe, J.E., Tiňo, P., Kabán, A., Schwefel, H.-P. (eds.) PPSN
2004. LNCS, vol. 3242, pp. 722–731. Springer, Heidelberg (2004a)
Branke, J., Schmeck, H., Deb, K., Reddy, M.: Parallelizing multi-objective
evolutionary algorithms: Cone separation. In: Proceedings of the Congress on
Evolutionary Computation (CEC-2004), pp. 1952–1957. IEEE Press, Piscataway (2004b)
Bui, L.T., Abbass, H.A., Essam, D.: Fitness inheritance for noisy evolutionary
multi-objective optimization. In: Proceedings of the International Conference on
Genetic and Evolutionary Computation (GECCO-2005), pp. 779–785. ACM Press, New
York (2005)
Chichester (2001)
Deb, K., Chaudhuri, S.: I-MODE: An interactive multi-objective optimization and
decision-making using evolutionary methods. In: Obayashi, S., Deb, K., Poloni,
C., Hiroyasu, T., Murata, T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 788–802.
Deb, K., Gupta, H.: Searching for robust Pareto-optimal solutions in multi-objective
optimization. In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds.)
decision-making using reference direction method. In: Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO-2007), pp. 781–788. ACM
Press, New York (2007a)
evolutionary algorithms. In: Proceedings of the Congress on Evolutionary
Computation (CEC-2007), pp. 2125–2132. IEEE Computer Society Press, Piscataway
(2007b)
Deb, K., Srinivasan, A.: Innovization: Innovating design principles through
optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO-2006), pp. 1629–1636. ACM Press, New York (2006)
Deb, K., Zope, P., Jain, S.: Distributed computing of Pareto-optimal solutions with
evolutionary algorithms. In: Fonseca, C.M., Fleming, P.J., Zitzler, E., Deb, K.,
(2003)
Deb, K., Sundar, J., Reddy, U., Chaudhuri, S.: Reference point based multi-objective
optimization using evolutionary algorithms. International Journal of
Computational Intelligence Research 2(6), 273–286 (2006)
Deb, K., Rao N., U.B., Karthik, S.: Dynamic multi-objective optimization and
decision-making using modified NSGA-II: A case study on hydro-thermal power
scheduling. In: Obayashi, S., Deb, K., Poloni, C., Hiroyasu, T., Murata, T. (eds.)
EMO 2007. LNCS, vol. 4403, pp. 803–817. Springer, Heidelberg (2007a)
Deb, K., Padmanabhan, D., Gupta, S., Mall, A.K.: Reliability-based multi-objective
optimization using evolutionary algorithms. In: Obayashi, S., Deb, K., Poloni, C.,
Hiroyasu, T., Murata, T. (eds.) EMO 2007. LNCS, vol. 4403, pp. 66–80. Springer,
Heidelberg (2007b)
Delattre, M., Hansen, P.: Bicriterion cluster analysis. IEEE Transactions on Pattern
Analysis and Machine Intelligence 2(4), 277–291 (1980)
Dempe, S.: Foundations of Bilevel Programming. Kluwer Academic Publishers, Dor-
drecht (2002)
Dempe, S.: Annotated bibliography on bilevel programming and mathematical
programs with equilibrium constraints. Optimization 52, 333–359 (2003)
Denda, R., Banchs, A., Effelsberg, W.: The fairness challenge in computer networks.
In: Crowcroft, J., Roberts, J., Smirnov, M.I. (eds.) Quality of Future Internet
Services, pp. 208–220. Springer, Heidelberg (2000)
Engineous Software, Inc.: iSIGHT Reference Guide Version 7.1, pp. 220–233.
Engineous Software, Inc. (2002)
Eremeev, A.V., Reeves, C.R.: On confidence intervals for the number of local optima.
In: Raidl, G.R., Cagnoni, S., Cardalda, J.J.R., Corne, D.W., Gottlieb, J., Guillot,
A., Hart, E., Johnson, C.G., Marchiori, E., Meyer, J.-A., Middendorf, M. (eds.)
EvoIASP 2003, EvoWorkshops 2003, EvoSTIM 2003, EvoROB/EvoRobot 2003,
EvoCOP 2003, EvoBIO 2003, and EvoMUSART 2003. LNCS, vol. 2611, pp. 224
Farina, M., Deb, K., Amato, P.: Dynamic multiobjective optimization problems:
Test cases, approximations, and applications. IEEE Transactions on Evolutionary
Computation 8(5), 425–442 (2000)
Fliege, J.: The effects of adding objectives to an optimisation problem on the solution
set. Operations Research Letters 35(6), 782–790 (2007)
handling with evolutionary algorithms – Part I: A unified formulation. IEEE
Transactions on Systems, Man and Cybernetics 28(1), 26–37 (1998)
Geoffrion, A.M.: Proper efficiency and the theory of vector maximization. Journal
of Mathematical Analysis and Applications 22(3), 618–630 (1968)
Goel, A., Meyerson, A.: Simultaneous optimization via approximate majorization
for concave profits or convex costs. Algorithmica 44, 301–323 (2006)
Gupta, S., Rosenhead, J.: Robustness in sequential investment decisions.
Management Science 15, 18–29 (1968)
Hakanen, J., Miettinen, K., Mäkelä, M., Manninen, J.: On interactive multiobjective
optimization with NIMBUS in chemical process design. Journal of Multi-Criteria
Decision Analysis 13(2–3), 125–134 (2005)
Handl, J., Knowles, J.: An evolutionary approach to multiobjective clustering. IEEE
Handl, J., Kell, D.B., Knowles, J.: Multiobjective optimization in bioinformatics and
computational biology. ACM/IEEE Transactions on Computational Biology and
Bioinformatics 4(2), 279–292 (2007)
Hites, R., De Smet, Y., Risse, N., Salazar-Neumann, M., Vincke, P.: About the
applicability of MCDA to some robustness problems. European Journal of Operational
Research 174, 322–332 (2006)
Hughes, E.J.: Evolutionary multi-objective ranking with uncertainty and noise. In:
Zitzler, E., Deb, K., Thiele, L., Coello Coello, C.A., Corne, D.W. (eds.) EMO
Jahn, J.: Vector Optimization: Theory, Applications, and Extensions. Springer,
Heidelberg (2004)
Jahn, J.: Introduction to the Theory on Nonlinear Optimization. Springer, Heidel-
berg (2007)
Jaszkiewicz, A., Słowiński, R.: The light beam search approach – an overview of
300–314 (1999)
Jensen, M.T.: Helper-objectives: Using multi-objective evolutionary algorithms for
single-objective optimisation. Journal of Mathematical Modelling and
Algorithms 3(4), 323–347 (2004)
Jin, Y., Branke, J.: Evolutionary optimization in uncertain environments. IEEE
Jin, Y., Sendhoff, B.: Trade-off between performance and robustness: An
evolutionary multiobjective approach. In: Fonseca, C.M., Fleming, P.J., Zitzler, E., Deb,
K., Thiele, L. (eds.) EMO 2003. LNCS, vol. 2632, pp. 237–251. Springer,
Heidelberg (2003)
Kaliszewski, I.: Quantitative Pareto Analysis by Cone Separation Technique. Kluwer
Academic Publishers, Dordrecht (1994)
Klamroth, K., Miettinen, K.: Integrating approximation and interactive decision
making in multicriteria optimization. Operations Research 56(1), 222–234 (2008)
Knowles, J., Corne, D., Deb, K. (eds.): Multiobjective Problem Solving from Nature.
Knowles, J.D., Watson, R.A., Corne, D.W.: Reducing local optima in
single-objective problems by multi-objectivization. In: Zitzler, E., Deb, K., Thiele, L.,
Coello Coello, C.A., Corne, D.W. (eds.) EMO 2001. LNCS, vol. 1993, pp. 269–283.
Korhonen, P.: A visual reference direction approach to solving discrete multiple
criteria problems. European Journal of Operational Research 34, 152–159 (1988)
Kostreva, M., Ogryczak, W., Wierzbicki, A.: Equitable aggregations and multiple
criteria analysis. European Journal of Operational Research 158(2), 362–377 (2004)
Kostreva, M.M., Ogryczak, W.: Linear optimization with multiple equitable criteria.
RAIRO Operations Research 33, 275–297 (1999)
Kouvelis, P., Yu, G.: Robust Discrete Optimization and Its Applications. Kluwer
Academic Publishers, Dordrecht (1997)
Laumanns, M., Thiele, L., Deb, K., Zitzler, E.: Combining convergence and diversity
in evolutionary multi-objective optimization. Evolutionary Computation 10(3),
263–282 (2002)
Luss, H.: On equitable resource allocation problems: A lexicographic minimax
approach. Operations Research 47, 361–378 (1999)

Boston (1999)
WWW-NIMBUS on the Internet. Computers & Operations Research 27(7–8),
709–723 (2000)
Miettinen, K., Mäkelä, M.M., Männikkö, T.: Optimal control of continuous
casting by nondifferentiable multiobjective optimization. Computational
Optimization and Applications 11(2), 177–194 (1998)
and Software 18, 63–80 (2003)
Miettinen, K., Mäkelä, M.M., Maaranen, H.: Efficient hybrid methods for global
continuous optimization based on simulated annealing. Computers & Operations
Research 33(4), 1102–1116 (2006)
Neumann, F., Wegener, I.: Minimum spanning trees made easier via multi-objective
optimization. In: Proceedings of the Genetic and Evolutionary Computation
Conference (GECCO-2005), pp. 763–769. ACM Press, New York (2005)
Ogryczak, W.: On the distribution approach to location problems. Computers &
Industrial Engineering 37, 595–612 (1999)
Ogryczak, W.: Multiple criteria optimization and decisions under risk. Control and
Cybernetics 31, 975–1003 (2002)
Ogryczak, W., Ruszczyński, A.: From stochastic dominance to mean-risk models:
Semideviations as risk measures. European Journal of Operational Research 116,
33–50 (1999)
Ogryczak, W., Wierzbicki, A., Milewski, M.: A multi-criteria approach to fair and
efficient bandwidth allocation. Omega 36, 451–463 (2008)
Olson, D.: Decision Aids for Selection Problems. Springer, New York (1996)
Palaniappan, S., Zein-Sabatto, S., Sekmen, A.: Dynamic multiobjective optimization
of war resource allocation using adaptive genetic algorithms. In: Proceedings of
the IEEE Southeast Conference, pp. 160–165. Clemson University, Clemson, SC
(2001)
Parmee, I.C., Cvetković, D., Watson, A.W., Bonham, C.R.: Multiobjective
satisfaction within an interactive evolutionary design environment. Evolutionary
Computation Journal 8(2), 197–222 (2000)
Perny, P., Spanjaard, O., Storme, L.-X.: A decision-theoretic approach to robust
optimization in multivalued graphs. Annals of Operations Research 147, 317–341
(2006)
Pióro, M., Medhi, D.: Routing, Flow and Capacity Design in Communication and
Computer Networks. Morgan Kaufmann, San Francisco (2004)
Romeijn, H.E., Ahuja, R.K., Dempsey, J.F., Kumar, A.: A new linear
programming approach to radiation therapy treatment planning problems. Operations
Research 54, 201–216 (2006)
Roy, B.: A missing link in OR-DA: Robustness analysis. Foundations of Computing
and Decision Sciences 23, 141–160 (1998)
San Miguel, F., Ryan, M., Scott, A.: Are preferences stable? The case of health care.
Journal of Economic Behavior and Organization 48, 1–14 (2002)
Shimoyama, K., Oyama, A., Fujii, K.: A new efficient and useful robust optimization
approach – design for multi-objective six sigma. In: Proceedings of the IEEE
Congress on Evolutionary Computation, vol. 1, pp. 950–957. IEEE Computer
Society Press, Piscataway (2005)
Teich, J.: Pareto-front exploration with uncertain objectives. In: Zitzler, E., Deb, K.,
Thiele, L., Coello Coello, C.A., Corne, D.W. (eds.) EMO 2001. LNCS, vol. 1993,
evolutionary algorithm for multiobjective optimization. Working Papers W-412,
Helsinki School of Economics, Helsinki (2007)
von Stackelberg, H.: Marktform und Gleichgewicht. Springer, Berlin (1934)
von Winterfeldt, D., Edwards, W.: Decision Analysis and Behavioral Research. Cam-
Wierzbicki, A.: On completeness and constructiveness of parametric
characterizations to vector optimization problems. OR Spectrum 8, 73–87 (1986)
Yu, P.: Cone convexity, cone extreme points, and nondominated solutions in
decision problems with multiple objectives. Journal of Optimization Theory and
Applications 14, 319–377 (1974)
Library of Congress Control Number: 2008937576

Optimization is the task of nding one or more solutions which correspond to

1 Modelling an Optimization Problem

variable bounds) is an important task. Second, an optimization algorithm (sin-

2 Why Use Multiple Objectives?

It is a common misconception in practice that most design or problem solving

3 Multiple Criteria Decision Making

national Summer Schools on Multicriteria Decision Aid have been arranged

4 Evolutionary Multiobjective Optimization

5 Genesis of This Book

The Dagstuhl seminar organized in November 2004 provided an ideal plat-

tiobjective optimization on the MCDM side (including both noninteractive

7 Main Terminology and Notations Used

involving k ( 2) conicting objective functions fi : Rn R that we want

real-world applications to software and visualization issues as well as vari-

June 2008 Jrgen Branke,

Most participants of the 2006 Dagstuhl seminar on Practical Approaches to Multi-

Oliver Bandte Kalyanmoy Deb

Mathias Gbelt Alexander V. Lotov

Hirotaka Nakayama Daisuke Sasaki

Mariana Vassileva Andrzej P. Wierzbicki
