
Evolutionary Computation and Convergence to a Pareto Front

David A. Van Veldhuizen


Department of Electrical and Computer Engineering, Graduate School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH 45433-7765. dvanveld@afit.af.mil, (937) 255-3636 ext. 4704

Gary B. Lamont

Department of Electrical and Computer Engineering, Graduate School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH 45433-7765. lamont@afit.af.mil, (937) 255-3636 ext. 4718

ABSTRACT

Research into solving multiobjective optimization problems (MOPs) has as one of its overall goals the development and definition of foundations for an Evolutionary Computation (EC)-based MOP theory. In this paper we introduce relevant MOP concepts, and the notion of Pareto optimality in particular. Specific notation is defined and theorems are presented to ensure Pareto-based Evolutionary Algorithm (EA) implementations are clearly understood. Then, a specific experiment investigating the convergence of an arbitrary EA to a Pareto front is presented. This experiment gives a basis for a theorem showing a specific multiobjective EA statistically converges to the Pareto front. We conclude by using this work to justify further exploration into the theoretical foundations of EC-based MOP solution methods.

1 Introduction

Our research focuses on solving scientific and engineering multiobjective optimization problems (MOPs), contributing to the overall goal of developing and defining foundations for an Evolutionary Computation (EC)-based MOP theory. This paper addresses shortfalls recently identified by Horn (Horn 1997), focusing on MOP theory development and experimentation. Solving MOPs with EC methods presents several unique challenges. For a good introduction to the relevant issues and past EC-based approaches, see the articles by Fonseca and Fleming (Fonseca and Fleming 1995) and by Horn (Horn 1997).

This paper focuses on EC's ability to converge to the p-dimensional hypersurface defined by simultaneously optimizing p functions. For example, by optimizing a bi-objective problem where both objectives are functions of one variable, we can determine the necessary analytical expressions for use in a specific EC convergence computation. We then show an arbitrary Evolutionary Algorithm (EA) statistically converges (given specific MOPs) and generalize our results. The remainder of this paper is organized as follows. Section 2 introduces relevant MOP concepts. Section 3 describes our experiments with an example MOP and the software environment used. Section 4 presents a theorem showing a multiobjective EA statistically converges to the Pareto front. Finally, Section 5 presents our conclusions and proposes promising directions for further research.

2 Key MOP Concepts

Although single-objective optimization problems may have a unique optimal solution, MOPs (as a rule) offer a possibly uncountable set of solutions, which when evaluated produce vectors whose components represent trade-offs in decision space. A decision maker then implicitly chooses an acceptable solution by selecting one of these vectors. Mathematically speaking, an MOP minimizes (or maximizes, since min{F(x)} = -max{-F(x)}) the components of a vector f(x), where x is an n-dimensional decision variable vector from some universe U. In general:

\[
\text{minimize } f(x) = (f_1(x), \ldots, f_p(x)) \quad \text{subject to } g_i(x) \le 0, \ i = 1, \ldots, m. \tag{1}
\]

An MOP thus consists of n decision variables, m constraints, and p objectives (p ≥ 2), any or all of which may be nonlinear (Hwang and Masud 1979). MOPs are often characterized by measures of performance (objectives) which may be (in)dependent and/or non-commensurable; the multiple objectives being optimized almost always conflict. These opposing objectives place a partial, rather than total, ordering on the search space. In order to deal successfully with these characteristics, several EC-based methods (Fonseca and Fleming 1995, Horn 1997) were developed to determine optimal solutions given an MOP's objectives and constraints. Key to many of these methods, however, is the use of Pareto optimality in determining a (set of) MOP solutions.
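To make the formulation in Equation 1 concrete, the following minimal Python sketch (our own illustration, not part of the software used later) encodes an MOP as a list of objective callables plus inequality constraints and evaluates a candidate decision vector; the bi-objective example problem used later in the paper is shown, with an empty constraint list because that problem is unconstrained.

from typing import Callable, List, Sequence

class MOP:
    """A multiobjective problem: p objectives, m inequality constraints g_i(x) <= 0."""
    def __init__(self, objectives: List[Callable[[Sequence[float]], float]],
                 constraints: List[Callable[[Sequence[float]], float]] = ()):
        self.objectives = list(objectives)
        self.constraints = list(constraints)

    def evaluate(self, x: Sequence[float]) -> List[float]:
        """Return the objective vector f(x) = (f_1(x), ..., f_p(x))."""
        return [f(x) for f in self.objectives]

    def feasible(self, x: Sequence[float]) -> bool:
        """Check all inequality constraints g_i(x) <= 0."""
        return all(g(x) <= 0.0 for g in self.constraints)

# The unconstrained bi-objective problem F1 used later (Equation 3):
f1_problem = MOP(objectives=[lambda x: x[0] ** 2, lambda x: (x[0] - 2.0) ** 2])
print(f1_problem.evaluate([1.0]))   # [1.0, 1.0]
print(f1_problem.feasible([1.0]))   # True (no constraints)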

2.1 Pareto Optimality

Although Pareto optimality and its related concepts and terminology are frequently invoked, they are sometimes used incorrectly in the literature. To ensure understanding and consistency, we introduce Pareto dominance and optimality in this section, and introduce a consistent notation in Section 2.2. Using the MOP presented in Equation 1, key Pareto concepts are mathematically defined as follows (Ben-Tal 1980).

Definition 1 (Pareto Dominance): A vector u = (u_1, ..., u_p) is said to dominate v = (v_1, ..., v_p) if and only if u is partially less than v, i.e., ∀i ∈ {1, ..., p}, u_i ≤ v_i ∧ ∃i ∈ {1, ..., p} : u_i < v_i. □

As an example of Pareto dominance we use the optimization problem:

\[
\begin{aligned}
\text{minimize } & f(x_u), \quad u \in \{a, b, c, d\}, \text{ where} \\
a &\triangleq f(x_a) = (3.25,\ 1.76,\ 4.67), \\
b &\triangleq f(x_b) = (3.25,\ 1.76,\ 4.67), \\
c &\triangleq f(x_c) = (3.15,\ 1.76,\ 4.67), \\
d &\triangleq f(x_d) = (3.15,\ 1.76,\ 4.22).
\end{aligned} \tag{2}
\]

Here, a and b are dominated by both c and d; c is dominated by d; and d dominates all other vectors.

Definition 2 (Pareto Optimality): A solution x_u ∈ U is said to be Pareto optimal if and only if there is no x_v ∈ U for which v = f(x_v) = (v_1, ..., v_p) dominates u = f(x_u) = (u_1, ..., u_p). □

The solution x_d in Equation 2's example is the only element of the associated Pareto optimal set; i.e., it is the only Pareto optimal solution of the set {x_a, x_b, x_c, x_d}. Pareto optimal solutions are also termed non-inferior, admissible, or efficient solutions. Their corresponding vectors are termed non-dominated (Horn 1997); selecting a vector (or vectors) from this non-dominated vector set implicitly indicates acceptable Pareto optimal solutions. These solutions may have no clearly apparent relationship besides their membership in the Pareto optimal set. We stress here that Pareto optimal solutions are classified as such based on their phenotypical expression. Their expression (the non-dominated vectors), when plotted in criterion space, is known as the Pareto front. Researchers have sometimes used these terms inconsistently in the literature.

As another example of Pareto optimality we present the one-variable, two-objective problem F1, shown in Figure 1. This is the same problem used by Vincent and Grantham, Schaffer, and Srinivas and Deb for the same purpose (Srinivas and Deb 1994). The problem's two objectives are defined as:

\[
\begin{aligned}
\text{Minimize } & f_{11} = x^2, \\
\text{Minimize } & f_{12} = (x - 2)^2.
\end{aligned} \tag{3}
\]

[Figure 1: Function f11 and f12 values plotted against x; plot omitted.]

By looking at Figure 1 it is apparent the Pareto optimal set is {x | 0 ≤ x ≤ 2}. The solution x = 0 is optimal with respect to f11 but not f12; the solution x = 2 is optimal with respect to f12 but not f11. Any solution x ∉ [0, 2] is not a member of the Pareto optimal set because it is not better than a solution in the set with respect to either objective. For the general case, Rudolph (Rudolph 1998b) recently showed that given

\[
\begin{aligned}
\text{Minimize } & f_{11} = \|x\|^2, \\
\text{Minimize } & f_{12} = \|x - z\|^2, \quad \text{with } 0 \ne z \in \mathbb{R}^n,
\end{aligned} \tag{4}
\]

the Pareto optimal set for this general MOP is

\[
X^* = \{x \in \mathbb{R}^n \mid x = rz,\ r \in [0, 1]\}. \tag{5}
\]

[Figure 2: Function F1's Pareto front (f11 plotted against f12); plot omitted.]

We point out a significant difference between Figures 1 and 2. Figure 1 plots the values of functions f11 and f12 for different values of the independent variable. Figure 2, however, plots the values of function f11 against those of function f12 for the same value of the independent variable. In other words, Figure 2 is a graph in phenotype space. This plot graphically displays the non-dominated vectors for this problem as points in criterion space; this representation forms the Pareto front. Identifying a set of Pareto optimal solutions is thus key for a decision maker to choose a compromise solution satisfying the objectives as best possible. Choosing an MOP solution optimized for only one objective may well ignore solutions which, from an overall standpoint, are better. The Pareto optimal solution set defines that overall view. Of course, the accuracy of the decision maker's view depends on both the true Pareto optimal set and the set presented as Pareto optimal. EA-derived solutions of real-world MOPs offer only a finite number of points which may or may not be truly Pareto optimal. Also, complex MOPs don't generally lend themselves to analytical determination of the actual Pareto front. This raises the issue of EC convergence to the true Pareto front, which we address in Section 4.
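The dominance test of Definition 1 and the non-dominated filtering implied by Definition 2 are straightforward to implement. The following short Python sketch (our own illustration; the function names are ours) checks Pareto dominance for minimization and extracts the non-dominated vectors of Equation 2's example, recovering d as the only non-dominated point.

def dominates(u, v):
    """True if u Pareto-dominates v (minimization): u <= v in every
    component and u < v in at least one component."""
    return all(ui <= vi for ui, vi in zip(u, v)) and any(ui < vi for ui, vi in zip(u, v))

def nondominated(vectors):
    """Return the vectors not dominated by any other vector in the set."""
    return [u for i, u in enumerate(vectors)
            if not any(dominates(v, u) for j, v in enumerate(vectors) if j != i)]

# Equation 2's example vectors:
a, b = (3.25, 1.76, 4.67), (3.25, 1.76, 4.67)
c, d = (3.15, 1.76, 4.67), (3.15, 1.76, 4.22)
print(dominates(c, a))             # True: c dominates a
print(nondominated([a, b, c, d]))  # [(3.15, 1.76, 4.22)] -- only d survives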

2.2 Pareto Methodology for EAs

During EA execution, a local set of Pareto optimal solutions (with respect to the current EA population) is determined at each EA generation and termed Pcurrent. As in other practical applications (Horn 1997), this solution set is added to a secondary population termed Pknown (i.e., Pcurrent ∪ Pknown), and the process continues until EA termination. Because a solution's classification as Pareto optimal depends on the context within which it is evaluated (i.e., the given set of which it is a member), all corresponding vectors of Pknown are tested each generation, and solutions whose associated vectors are dominated are removed. The result is a final set of Pareto optimal solutions found by the EA. Of course, the actual Pareto optimal solution set (termed Ptrue) is not known for problems of any difficulty. This set is defined by the functions composing the MOP; Ptrue is fixed and does not change. (Horn (Horn 1997) uses Ponline, Poffline, and Pactual instead of Pcurrent, Pknown, and Ptrue.)

Pcurrent and Pknown are sets of EA genotypes. EA fitness is judged via phenotypes, which in the MOP case form a Pareto front. We term the Pareto fronts associated with each of the above sets PFcurrent, PFknown, and PFtrue. Thus, when using an EA to solve MOPs, the implicit assumption is that one of the following holds: Pknown = Ptrue, Pknown ⊂ Ptrue, or PFknown ∈ [PFtrue − ε, PFtrue] over some norm. Because of the manner in which Pareto optimality is defined, Pcurrent is always a non-empty set. As this may be non-intuitive, and because we assume it in our EA implementation, we present the following theorem showing this for the general case.

Theorem 1: Given any non-empty solution set, at least one Pareto optimal solution exists within that set. □

Proof: Assume n elements in an arbitrary solution set. When evaluated in a minimization problem, n vectors result. Without loss of generality, assume a one-dimensional function which, when evaluated, results in n one-dimensional vectors. Order these vectors from smallest to largest, which we can do because, by definition, a total ordering exists. There are then two cases to consider.

Case I. The smallest vector is strictly less than all other vectors. Since the vectors are ordered, this vector is Pareto dominant (by definition) and its associated solution is Pareto optimal.

Case II. Beginning with the smallest vector, two or more vectors are equal in value. Since the vectors are ordered, these particular vectors are guaranteed to be non-dominated: no other vector dominates them, and none of them (being of equal value) dominates the others. Thus, since they are non-dominated, their associated solutions are also Pareto optimal.

Without loss of generality, the same reasoning holds when extended to vectors of two or more dimensions, except that the ordering is now partial, and to maximization problems, where the vector ordering is now largest to smallest. Thus, any non-empty solution set has at least one Pareto optimal solution. □
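The Pcurrent/Pknown bookkeeping described at the start of this section can be sketched in a few lines of Python (our own illustration, not the GEATbx code used later): merge each generation's Pcurrent into Pknown, then prune any member whose associated vector has become dominated.

# Same dominance test as in the previous sketch (minimization).
dominates = lambda u, v: all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def update_known(P_known, P_current, evaluate):
    """Merge P_current into P_known, then drop every solution whose
    objective vector is dominated by another member's vector.
    `evaluate` maps a solution (genotype) to its objective vector."""
    merged = list(P_known) + [s for s in P_current if s not in P_known]
    vectors = [evaluate(s) for s in merged]
    return [s for s, u in zip(merged, vectors)
            if not any(dominates(v, u) for v in vectors if v != u)]

# Example with the bi-objective problem F1 (Equation 3); solutions are scalars x:
evaluate_f1 = lambda x: (x ** 2, (x - 2.0) ** 2)
P_known = update_known([], [0.5, 1.0, 3.0], evaluate_f1)
print(P_known)  # [0.5, 1.0] -- x = 3.0 gives (9.0, 1.0), dominated by (1.0, 1.0) at x = 1.0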

2.3 Convergence Conjecture

The global optimum for an MOP is a set of vectors. Pareto optimal solutions are those in which performance in one objective cannot be improved without adversely affecting another objective. Thus, the Pareto front (PFtrue) determined by evaluating the Pareto optimal solution set (Ptrue) is the global optimum of an MOP. This view forms the basis for the following conjecture (based on a theorem presented by Bäck (Bäck 1996, pg. 129)).

Conjecture 1: The global optimum of an MOP is a Pareto front F̃, composed of at most an uncountably infinite number of vectors v_1, v_2, ..., v_{ℵ₁}. An EA using Pareto-based ranking and a monotonic selection algorithm converges to the global optimum, i.e.,

\[
\lim_{t \to \infty} P\{\tilde{F} \in P(t)\} = 1,
\]

where P(t) = Pcurrent. □

We show this conjecture is true in Section 4.

3 Experimentation

EC embodies the techniques of Genetic Algorithms (GAs), Genetic Programming (GP), Evolution Strategies (ESs), and Evolutionary Programming (EP), collectively known as EAs. Several research efforts have successfully used Pareto optimality as the basis for EA fitness assignment in solving MOPs (Fonseca and Fleming 1995, Horn 1997). We investigate here whether using Pareto-based fitness ranking results in EA convergence to PFtrue.

3.1 Software Environment


We chose the Genetic and Evolutionary Algorithm Toolbox (GEATbx) v1.91 (Pohlheim 1998) as the environment within which to perform our experiments. This toolbox operates under MATLAB (MATLAB is a trademark of The MathWorks, Inc.) and provides a versatile tool set for implementing a wide range of EA methods. It eases development of user-specific routines for integration into GEATbx, and allows access to MATLAB data analysis and visualization functions for use with GEATbx-derived data. We easily modified GEATbx to solve MOPs.

3.2 Example Problem I

To experimentally determine if a specific EA exhibits convergence to PFtrue we use the MOP example given in Equation 3. Figure 2 plots function f11 against f12 using the computed objective values for a particular input value. If the EA exhibits the desired convergence, PFknown should move closer and closer to PFtrue. In this case we are able to determine PFtrue's analytical expression because both objective functions are functions of one variable. In order to determine the polynomial equation describing PFtrue we write f12 as a function of f11 (single-valued objective functions must be invertible in order to obtain an analytical formula for a Pareto front). We first define:

\[
y_1 \triangleq f_{11} = x^2, \qquad y_2 \triangleq f_{12} = (x - 2)^2. \tag{6}
\]

Then, multiplying out y_2 and substituting in y_1 gives PFtrue's equation as

\[
y_2 = (x - 2)^2 = x^2 - 4x + 4 = y_1 - 4\sqrt{y_1} + 4. \tag{7}
\]

Because we have shown y_2 is a function of y_1, we can assume a standard coordinate system for the following computations. The distance between any given point (i.e., an EA MOP phenotype) and a point on the given curve (Equation 7) is then given by

\[
d_i = \sqrt{(y_1 - y_{10})^2 + (y_2 - y_{20})^2} = \sqrt{(y_1 - y_{10})^2 + [f(y_1) - y_{20}]^2}, \tag{8}
\]

where y_10 and y_20 are EA-returned fitness values, and y_1 and y_2 are coordinates of a point on PFtrue. We wish to determine the distance between an arbitrary point and the nearest point on a given curve, where the point corresponds to a member of PFcurrent and the given curve is PFtrue. For easier mathematical manipulation, Equation 8 is modified to reflect the distance squared:

\[
F(y_1) = (\text{distance})^2 = (y_1 - y_{10})^2 + [f(y_1) - y_{20}]^2. \tag{9}
\]

Using the Pareto front's equation (Equation 7) and setting the derivative of F equal to zero (since minimizing the square of the distance simultaneously minimizes the distance) results in:

\[
\begin{aligned}
0 = F'(y_1) &= 2(y_1 - y_{10}) + 2\,[f(y_1) - y_{20}]\,f'(y_1) \\
            &= 2y_1 - 2y_{10} + 2\,[(y_1 - 4\sqrt{y_1} + 4) - y_{20}]\Big[1 - \tfrac{2}{\sqrt{y_1}}\Big] \\
            &= 4y_1 - 12\sqrt{y_1} + \tfrac{4}{\sqrt{y_1}}(y_{20} - 4) + 24 - 2y_{20} - 2y_{10}. 
\end{aligned} \tag{10}
\]

Finally, for an arbitrary tuple (y_1i, y_2i) in Pcurrent, solving Equation 10 for y_1 allows Equation 8 to be solved and a distance d_i between (y_1i, y_2i) and the nearest point on PFtrue determined. Several solutions in each generation may be elements of Pcurrent. Thus, as a measure of distance between PFcurrent and PFtrue we define a generational distance:

\[
G_j \triangleq \frac{\sqrt{\sum_{i=1}^{n} d_i^2}}{n}, \tag{11}
\]

where n is the number of points in Pcurrent at generation j; i.e., Equation 8 is solved for each member of Pcurrent and the results are used to derive a value for Equation 11. The generational distance value then measures the error between PFcurrent and PFtrue.

To perform our experiment we chose the default GA implementation provided by GEATbx, which requires few user-specified parameters. We set the following necessary options: a real-valued representation, 50 individuals in the GA population, termination after 100 generations, and search space bounds of [-100, 100]. Also, a secondary solution population (Pknown) was constructed and integrated into the GA. At each generation Pcurrent was merged into Pknown (Pcurrent ∪ Pknown) and Pknown updated. To solve the MOP under study, the GA was augmented by the Pareto-based ranking and linear fitness assignment scheme described by Fonseca (Fonseca 1995). Appropriate fitness values were assigned to each member of the GA's population by developing two additional MATLAB routines. The first determines which associated vectors of a given solution set are non-dominated and assigns them a rank of 0; the remaining solutions (all of which are dominated) are given a rank equal to the number of associated vectors dominating theirs. The second routine transforms the rankings into fitness; all solutions of equal rank receive identical fitness values.

Using the above parameters and methodology, we see the implemented GA does in fact converge using Pareto optimality as a fitness assignment criterion. Figure 3 shows PFcurrent's generational distances. At some generations the currently known Pareto front appears to be close to, or part of, the true front, but at other generations it is farther away. This clearly indicates the contextual nature of Pareto-based fitness assignment. Figure 4 shows the generational distance of the secondary population, indicating that as better Pareto optimal solutions are found, worse ones are discarded and PFknown moves ever closer to PFtrue.
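The distance computation of Equations 8-11 can be sketched in a few lines of Python (our own illustration, not the MATLAB routines used in the experiments). Rather than solving Equation 10 analytically for the nearest-point root, this sketch approximates the nearest point on PFtrue by densely sampling the front of Equation 7, which is adequate for illustration; numpy is assumed to be available.

import numpy as np

def generational_distance(front_points, n_samples=100_000):
    """Generational distance (Equation 11) between a set of phenotype points
    (y10, y20) and the true front y2 = y1 - 4*sqrt(y1) + 4 of Example
    Problem I, with y1 in [0, 4] (i.e., x in [0, 2])."""
    y1 = np.linspace(0.0, 4.0, n_samples)        # sample PFtrue densely
    y2 = y1 - 4.0 * np.sqrt(y1) + 4.0            # Equation 7
    dists = []
    for y10, y20 in front_points:
        d = np.sqrt((y1 - y10) ** 2 + (y2 - y20) ** 2)   # Equation 8 for all samples
        dists.append(d.min())                            # nearest point on PFtrue
    dists = np.asarray(dists)
    return np.sqrt(np.sum(dists ** 2)) / len(dists)      # Equation 11

# Phenotypes of three candidate solutions x = 0.5, 1.0, 3.0 under Equation 3:
pts = [(x * x, (x - 2.0) ** 2) for x in (0.5, 1.0, 3.0)]
print(generational_distance(pts))  # the two points on the front contribute ~0; x = 3.0 does not
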
[Figure 3: PFcurrent's generational distance (Example I); plot omitted.]

[Figure 4: PFknown's generational distance (Example I); plot omitted.]

3.3 Example Problem II

Using the same methodology, we also applied the above GA to another bi-objective problem, discussed by Fonseca and Fleming (Fonseca and Fleming 1995):

\[
\begin{aligned}
\text{minimize } f(x, y) &= (f_1(x, y),\ f_2(x, y)), \text{ where} \\
f_1(x, y) &= 1 - \exp(-(x - 1)^2 - (y + 1)^2), \\
f_2(x, y) &= 1 - \exp(-(x + 1)^2 - (y - 1)^2).
\end{aligned} \tag{12}
\]

Two changes were made to the GA parameters mentioned above. First, the search space bounds were set to [-10, 10]. Second, because this MOP involves two independent variables, we used MATLAB to derive an approximation to the equation describing PFtrue; the RMS error between the actual and computed solutions was 0.0224. We used this equation in determining the distance between any member of PFcurrent or PFknown and the nearest point on the curve representing PFtrue. Figures 5 and 6 show the resulting data. Again, the implemented GA converges to PFtrue using Pareto optimality as a fitness assignment criterion.

[Figure 5: PFcurrent's generational distance (Example II); plot omitted.]

[Figure 6: PFknown's generational distance (Example II); plot omitted.]

3.4 Real World Issues

We must consider the real world before discussing experimental results. The major issue concerns our definition of the Pareto front in the above problems. It is only because of the problems' unconstrained and bi-objective nature that we are able to determine expressions for PFtrue. As Horn also notes (Horn 1997), in real problems we have no way of knowing when to terminate the search; unlike our examples, there is no general method to determine PFknown's distance from PFtrue. In the first example (Equation 3) we were able to derive an exact analytical expression for PFtrue because both objective functions are functions of one variable. In the second case we deterministically computed the objective functions' values at equidistant values of x and y over a given range, thus determining which values give rise to PFtrue. Using those values, a MATLAB polynomial curve-fitting function then gave us an approximation of f2 as a function of f1 for our distance computations.

We realize EAs are overkill for finding solutions to these particular problems; these MOPs were selected for their simplicity and ease of implementation. We do plan to extend this experiment to MOPs of three or more functions (which are functions of both single and multiple variables), and to apply the results to real-world problems. Note also that EA implementations discretize continuous models because of computational limitations. Thus, if one is searching a discrete space, the PFtrue of this space is most likely only an approximation of the continuous PFtrue, to a particular level of resolution.
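The grid-sampling procedure just described for Example Problem II can be sketched as follows in Python (our own illustration; the paper's actual computation used MATLAB and its curve-fitting functions). The sketch evaluates Equation 12 on an equidistant grid over the search bounds, keeps the non-dominated objective vectors as an approximation of PFtrue, and then fits a polynomial for f2 as a function of f1; the grid resolution and polynomial degree are arbitrary illustrative choices.

import numpy as np

def f12(x, y):
    """Objective vector of Equation 12."""
    f1 = 1.0 - np.exp(-(x - 1.0) ** 2 - (y + 1.0) ** 2)
    f2 = 1.0 - np.exp(-(x + 1.0) ** 2 - (y - 1.0) ** 2)
    return f1, f2

# Evaluate on an equidistant grid over the search bounds used in the experiment.
xs = np.linspace(-10.0, 10.0, 101)
pts = np.array([f12(x, y) for x in xs for y in xs])

# Keep only non-dominated vectors (minimization) as the PFtrue approximation.
nondom = np.array([u for u in pts
                   if not np.any(np.all(pts <= u, axis=1) & np.any(pts < u, axis=1))])

# A polynomial fit of f2 as a function of f1 then approximates the front.
coeffs = np.polyfit(nondom[:, 0], nondom[:, 1], deg=6)
print(len(nondom), coeffs)
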

3.5 Experimental Results


Our experimental task was to determine if a specific EA exhibits convergence to PFtrue. To do so we require metrics independent of initially measured values. Bäck defines a parameter used in assessing EA convergence velocity called a progress measure (Bäck 1996), which quantifies relative rather than absolute convergence improvement by:

\[
P \triangleq \ln \sqrt{\frac{f_{\max}(0)}{f_{\max}(T)}}, \tag{13}
\]

where f_max(i) is the best objective function value in the parent population at generation i. We modify this definition to the following:

\[
RP \triangleq \ln \sqrt{\frac{G_1}{G_T}}, \tag{14}
\]

where G_1 is the generational distance at generation 1 and G_T the distance at generation T. We also select G_T as a metric showing the EA's effectiveness in converging to the Pareto front. Initial experimental results indicate convergence occurs as conjectured, as shown in Table 1. Each reported result was derived from 10 GA executions. Niching and sharing were not explicitly considered in these runs; we wished to see only whether PFknown converged to PFtrue, not how much of the front it covered. Each result is labeled in the form DEB(X) or FNF(X), where DEB refers to Equation 3 and FNF to Equation 12. The X designator is interpreted as follows: A had a recombination probability of 0.7 and a mutation probability of 0.7; B's recombination probability was 0.7 and its mutation probability was 0.35. The mean and variance are reported for each metric.

Table 1: Experimental Results.

Function  | RP (Mean) | RP (Variance) | GT (Mean)  | GT (Variance)
DEB(A)    | 11.0071   | 6.8985        | 1.0346e-04 | 1.8771e-04
DEB(B)    | 14.0125   | 11.0616       | 4.9123e-05 | 1.5534e-04
DEB(ALL)  | 12.5098   | 9.1038        | 7.6292e-05 | 1.6999e-04
FNF(A)    | 1.4247    | 0.1999        | 0.0056     | 0.0013
FNF(B)    | 1.3617    | 0.2565        | 0.0065     | 0.0022
FNF(ALL)  | 1.3932    | 0.2262        | 0.0061     | 0.0018

3.6 Experimental Analysis

RP (Equation 14) measures the relative convergence of the implemented GAs. These results cannot be generalized at this time because supporting data and analysis are lacking; addressing this shortfall is part of our future research. However, for both examples, after 100 generations GT indicates the GA was quite effective in converging PFknown to PFtrue. That is, looking at Figures 4 and 6, we see the trend of GT values moving towards zero error (i.e., they are within some ε of PFtrue).

4 EA Convergence

Theorem 2: An EA using Pareto-based ranking and a monotonic selection algorithm converges to the global optimum of an MOP (a Pareto front F̃, composed of at most an uncountably infinite number of vectors v_1, v_2, ..., v_{ℵ₁}) with probability one, i.e.,

\[
\lim_{t \to \infty} P\{\tilde{F} \in P(t)\} = 1,
\]

where P(t) = Pcurrent. □

Proof: Bäck proves (Bäck 1996, pg. 129) that an EA converges with probability one if it fulfills the following conditions:

\[
\forall \tilde{F}, \tilde{F}' \in I,\ \tilde{F}' \text{ is reachable from } \tilde{F} \text{ by means of mutation and recombination; and} \tag{15}
\]

the population sequence P(0), P(1), ... is monotone, i.e., ∀t:

\[
\min\{f(\tilde{F}(t+1)) \mid \tilde{F}(t+1) \in P(t+1)\} \le \min\{f(\tilde{F}(t)) \mid \tilde{F}(t) \in P(t)\}. \tag{16}
\]

We assume an EA with infinite precision (this assumption is based on our use of a real-valued EA; if the optimal solutions were rationals we could use a finite representation), a minimization MOP, an uncountably infinite population size, and appropriate mutation and recombination operators allowing every point in the search space to be visited. Thus, ∀F̃ and F̃' ∈ I, F̃' is reachable from F̃. This satisfies (15), and it remains to prove that the EA's population sequence is monotone. This situation occurs when the EA used is admissible (both the fitness function and the selection algorithm are monotonic). We note here that the fitness function is not a combination of the objective functions (see Section 3.2). Grefenstette defines a monotonic fitness function as one which does not reverse the sense of any pairwise solution ranking (Grefenstette 1997). When using Pareto-based fitness assignment, any given pair of Pareto optimal solutions receive identical fitness values; they also receive better fitness than dominated solutions. Any fitness function assigning fitness in this way is monotonic. A monotonic selection algorithm also respects the fitness of pairwise solutions as regards the expected number of offspring. Thus, we see that Equation 16 is also satisfied, since any selection algorithm selecting individuals based on fitness is monotonic. All Pareto optimal solutions present in generation P(t) have the best fitness and are selected for inclusion in generation P(t+1). Pareto optimal solutions whose associated vectors are non-dominated in generation P(t+1) receive higher fitness than non-Pareto solutions. However, using Pareto-based ranking ensures either case satisfies Equation 16. Thus, an EA using Pareto optimality-based ranking and a monotonic selection algorithm converges to the global optimum of an MOP (i.e., PFtrue) with probability one. □
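The monotonic, Pareto-rank-based fitness assignment discussed above (rank 0 for non-dominated vectors, otherwise the count of dominating vectors, with equal ranks mapped to equal fitness) can be sketched as follows in Python; this is our own illustration of the scheme described in Section 3.2, not the MATLAB routines actually used.

def pareto_ranks(vectors):
    """Rank 0 for non-dominated vectors; otherwise the number of vectors
    that dominate the given vector (minimization)."""
    dom = lambda u, v: all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))
    return [sum(dom(v, u) for j, v in enumerate(vectors) if j != i)
            for i, u in enumerate(vectors)]

def rank_to_fitness(ranks):
    """Map equal ranks to equal fitness, with lower (better) ranks receiving
    higher fitness; a simple linear scheme in the spirit of Fonseca (1995)."""
    worst = max(ranks)
    return [float(worst - r + 1) for r in ranks]

# Phenotypes of four candidates for Example Problem I (x = 0.0, 1.0, 2.0, 3.0):
vecs = [(0.0, 4.0), (1.0, 1.0), (4.0, 0.0), (9.0, 1.0)]
print(pareto_ranks(vecs))                    # [0, 0, 0, 2] -- only x = 3.0 is dominated
print(rank_to_fitness(pareto_ranks(vecs)))   # [3.0, 3.0, 3.0, 1.0] -- equal ranks, equal fitness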

4.1 Other Convergence Proofs


Other research also addresses the desired EA convergence. Rudolph's (Rudolph 1998a) Corollary 2 guarantees that, given a countably infinite EA population and an MOP, at least one decision variable sequence (x_k) exists such that f(x_k) converges in the mean to PFtrue, although it appears his nomenclature is inconsistent with accepted definitions. We note his variation kernel (i.e., transition probability function) is equivalent to our reachability condition (appropriate mutation and recombination operators allowing every point in the search space to be visited). Also, his assumed elitist preservation strategy is more restrictive than our monotonic selection algorithm. Finally, he refers to at least one sequence leading to an associated point on PFtrue, whereas this work indicates that through Pareto ranking all decision variable sequences lead towards Ptrue; likewise, these variables' phenotypical expressions lead towards PFtrue.

Rudolph (Rudolph 1998b) also independently proved that a specific (1+1) EA converges with probability one to a member of the Pareto optimal set Ptrue of a specific MOP. His distance function is in the genotype domain, as compared to ours and his previous work, which are phenotypically based. Rudolph shows the desired convergence for a multiobjective (1+1)-ES with objective functions specified by Equation 4. The evolutionary operators in his model are not able to search the entire space (in a probabilistic sense), since a step size restriction is placed upon the probabilistic mutation operator. Thus, convergence only occurs when the ES's step size is proportional to the distance to the Pareto set, as shown in his elaborate proof. However, this distance is obviously unknown in problems of high complexity, which is typical of most real-world problems. Rudolph's theorems are for a specific EA and MOP instantiation with constrained evolutionary operators; ours requires a less specific EA. Both theorems show that what we seek is possible; the given EAs do converge to an optimal set, although Rudolph defines a genotypic optimal set and we define a phenotypic set. Using phenotypical information is more appropriate, as a decision maker's costs and profits are more accurately reflected in attribute space. What is more important, though, is the rate at which the EAs converge to PFtrue, and whether these non-dominated vectors (PFknown) are uniformly distributed across PFtrue as t → ∞. These are subjects of our current research.

5 Conclusions

We have shown an EA statistically converges to a given Pareto front and presented a theorem supporting the experimental evidence. We also introduced a consistent notation not reflected in the current literature, ensuring various Pareto concepts are clearly understood. We now plan to investigate how to make the EA more efficient, i.e., how to make it converge more quickly to, and uniformly across, PFtrue. Insight into examples of the types presented here may help in appropriate EA parameter selection. We plan to evaluate more complex multiobjective problems from this convergence perspective and apply the insights gained to real-world MOP solving.

Acknowledgments

We thank Maj (Dr) Thomas Reid and the anonymous reviewers of an earlier draft.

References

Bäck, Thomas (1996). Evolutionary Algorithms in Theory and Practice. Oxford University Press. New York.

Bäck, Thomas, Fogel, David and Michalewicz, Zbigniew (Eds.) (1997). Handbook of Evolutionary Computation. Vol. 1. IOP Publishing Ltd. and Oxford University Press.

Ben-Tal, Aharon (1980). Characterization of Pareto and lexicographic optimal solutions. In: Multiple Criteria Decision Making Theory and Application (G. Fandel and T. Gal, Eds.). Vol. 177 of Lecture Notes in Economics and Mathematical Systems. pp. 1-11. Springer-Verlag. Berlin.

Fonseca, Carlos M. and Peter J. Fleming (1995). An overview of evolutionary algorithms in multiobjective optimization. Evolutionary Computation 3(1), 1-16.

Fonseca, Carlos Manuel Mira (1995). Multiobjective Genetic Algorithms with Application to Control Engineering Problems. PhD thesis. The University of Sheffield. Sheffield, UK.

Grefenstette, John (1997). Rank-based selection. In: Handbook of Evolutionary Computation (Bäck et al. 1997), Vol. 1, Chap. C2.4, pp. C2.4:1-C2.4:6.

Horn, Jeffrey (1997). Multicriterion decision making. In: Handbook of Evolutionary Computation (Bäck et al. 1997), Vol. 1, Chap. F1.9, pp. F1.9:1-F1.9:15.

Hwang, Ching-Lai and Abu Syed Md. Masud (1979). Multiple Objective Decision Making - Methods and Applications. Springer-Verlag.

Pohlheim, Hartmut (1998). Genetic and evolutionary algorithm toolbox for use with MATLAB. Technical report. Technical University Ilmenau.

Rudolph, Günter (1998a). Evolutionary search for minimal elements in partially ordered finite sets. In: Proceedings of the Seventh Annual Conference on Evolutionary Programming. Springer. Berlin.

Rudolph, Günter (1998b). On a multi-objective evolutionary algorithm and its convergence to the Pareto set. In: Proceedings of the 1998 IEEE Conference on Evolutionary Computation. IEEE.

Srinivas, N. and Kalyanmoy Deb (1994). Multiobjective optimization using nondominated sorting in genetic algorithms. Evolutionary Computation 2(3), 221-248.