m_s = m_max (1 − P_s / P_max)    (53)

where m_max is the maximum mutation rate defined by the user, m_s is the mutation probability, and

P_max = max{ P_n , n = 1, 2, …, N }.

By using the mutation rate, the equilibrium of the population is obtained. The probability P_s that a habitat contains s species changes from time t to time (t + Δt) as
P_s(t + Δt) = P_s(t)(1 − λ_s Δt − μ_s Δt) + P_{s−1}(t) λ_{s−1} Δt + P_{s+1}(t) μ_{s+1} Δt    (54)
                     (A)                         (B)                      (C)
Eq. (54) is divided into the three parts marked A, B, and C: the habitat holds S species at time t and no migration occurs (A); it holds S−1 species at time t and one species immigrates (B); or it holds S+1 species at time t and one species emigrates (C). Here P_s is the probability that the habitat contains S species. Taking the limit Δt → 0, the equation can be converted into a differential equation:
Ṗ_s = −(λ_s + μ_s) P_s + μ_{s+1} P_{s+1},                        s = 0
Ṗ_s = −(λ_s + μ_s) P_s + λ_{s−1} P_{s−1} + μ_{s+1} P_{s+1},      1 ≤ s ≤ n − 1    (55)
Ṗ_s = −(λ_s + μ_s) P_s + λ_{s−1} P_{s−1},                        s = n
Then the differential equations can be combined into matrix form. Writing

P = [ P_0  P_1  …  P_n ]^T    (56)
they can be written compactly as

Ṗ = A P    (57)

with

        ⎡ −(λ_0+μ_0)     μ_1            0            …           0          ⎤
        ⎢    λ_0      −(λ_1+μ_1)       μ_2           …           ⋮          ⎥
A =     ⎢     ⋮           ⋱             ⋱            ⋱           ⋮          ⎥
        ⎢     ⋮           ⋱         λ_{n−2}   −(λ_{n−1}+μ_{n−1})  μ_n       ⎥
        ⎣     0           …             0          λ_{n−1}   −(λ_n+μ_n)     ⎦

where λ_s and μ_s denote the immigration and emigration rates of a habitat containing s species.
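As a sketch, the matrix A of Eq. (57) can be assembled and checked numerically. The linear migration model used below (λ_s = I(1 − s/n), μ_s = E·s/n) is a common choice in the BBO literature but is not specified in this excerpt; a useful property is that every column of A sums to zero, so the total probability Σ_s P_s is conserved under Ṗ = AP:

```python
import numpy as np

def bbo_transition_matrix(n, I=1.0, E=1.0):
    """Build the (n+1)x(n+1) tridiagonal matrix A of Eq. (57).

    Assumes the common linear migration model (not given in this
    excerpt): lambda_s = I*(1 - s/n), mu_s = E*s/n.
    """
    lam = np.array([I * (1 - s / n) for s in range(n + 1)])
    mu = np.array([E * s / n for s in range(n + 1)])
    A = np.zeros((n + 1, n + 1))
    for s in range(n + 1):
        A[s, s] = -(lam[s] + mu[s])
        if s > 0:
            A[s, s - 1] = lam[s - 1]   # immigration from a habitat with s-1 species
        if s < n:
            A[s, s + 1] = mu[s + 1]    # emigration from a habitat with s+1 species
    return A

A = bbo_transition_matrix(4)
assert np.allclose(A.sum(axis=0), 0.0)  # probability is conserved
```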
The procedure of the BBO algorithm can be written as follows:
(a) Initialize the BBO parameters that frame the problem solution: the habitat modification probability, the maximum species count S_max, the minimum species count, the maximum migration rate, the maximum mutation rate m_max, and the mutation and elitism parameters.
(b) Specify a random set of habitats H_1, H_2, …, H_n and compute the corresponding HSI values.
(c) Compute the immigration rate λ and the emigration rate μ for each habitat based on its HSI.
(d) Modify each non-elite habitat, computing the probability of species living in the habitat from the immigration and emigration rates.
(e) Update the probability P_s that a habitat contains s species, mutate each non-elite habitat, and recompute each HSI value. The HSI can be represented as a vector encoding: the weighting parameters of a neural network or fuzzy system can be arranged into the habitat vector.
(f) Verify the termination criterion of the problem solution: if it is satisfied, terminate the algorithm; otherwise, go to step (c) for the next iteration.
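Steps (a)-(f) can be sketched in Python. The cost function plays the role of the HSI (lower cost = better habitat); the population size, modification and mutation rates, and the linear rank-based migration rates below are illustrative choices, not values from the text:

```python
import random

def bbo_minimize(f, dim, bounds, pop=20, iters=100, n_elite=2,
                 p_modify=0.9, m_max=0.05, seed=1):
    """Minimal BBO sketch following steps (a)-(f)."""
    rng = random.Random(seed)
    lo, hi = bounds
    habitats = [[rng.uniform(lo, hi) for _ in range(dim)]   # step (b)
                for _ in range(pop)]
    for _ in range(iters):
        habitats.sort(key=f)                    # best habitat (HSI) first
        # step (c): linear immigration/emigration rates from the rank
        lam = [k / (pop - 1) for k in range(pop)]
        mu = [1.0 - l for l in lam]
        new = [h[:] for h in habitats]
        for i in range(n_elite, pop):           # steps (d)-(e), elites kept
            for d in range(dim):
                if rng.random() < p_modify and rng.random() < lam[i]:
                    # roulette-wheel choice of an emigrating habitat
                    j = rng.choices(range(pop), weights=mu)[0]
                    new[i][d] = habitats[j][d]
                if rng.random() < m_max:        # mutation
                    new[i][d] = rng.uniform(lo, hi)
        habitats = new                          # step (f): iterate
    return min(habitats, key=f)

best = bbo_minimize(lambda x: sum(v * v for v in x), dim=3, bounds=(-5, 5))
```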
3.2 Grey Wolf Optimizer (GWO) Algorithm
The GWO [13] is a recently proposed swarm-based method. This algorithm mimics the social leadership and hunting behavior of grey wolves in nature. The population is divided into four groups: alpha (α), beta (β), delta (δ), and omega (ω). The three fittest wolves are considered as α, β, and δ, and they guide the other wolves (ω) toward promising areas of the search space. During optimization, the wolves update their positions around α, β, or δ as follows:
D = |C · X_p(t) − X(t)|    (58)

X(t+1) = X_p(t) − A · D    (59)

where t indicates the current iteration, A = 2a·r_1 − a, C = 2·r_2, X_p is the position vector of the prey, X indicates the position vector of a grey wolf, a is linearly decreased from 2 to 0, and r_1 and r_2 are random vectors in [0, 1]. In this algorithm, a wolf at position (X, Y) is able to relocate itself around the prey using these equations. The random parameters A and C allow the wolves to relocate to any position in the continuous space around the prey.
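A single application of Eqs. (58)-(59) can be sketched numerically; the prey position and the value of a below are arbitrary illustrative choices. With a = 0.5 every component of A lies in [−0.5, 0.5], so with the prey at the origin the wolf lands within half of its previous distance in every dimension:

```python
import numpy as np

rng = np.random.default_rng(0)

def encircle(x, x_prey, a):
    """One GWO position update around the prey, Eqs. (58)-(59)."""
    r1 = rng.random(x.size)
    r2 = rng.random(x.size)
    A = 2 * a * r1 - a              # A = 2a*r1 - a
    C = 2 * r2                      # C = 2*r2
    D = np.abs(C * x_prey - x)      # Eq. (58)
    return x_prey - A * D           # Eq. (59)

x_new = encircle(np.array([4.0, -3.0]), x_prey=np.zeros(2), a=0.5)
# prey at the origin: |x_new| <= 0.5 * |x| componentwise
assert np.all(np.abs(x_new) <= np.array([2.0, 1.5]) + 1e-9)
```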
In the GWO algorithm, α, β, and δ are always assumed to be the most likely positions of the prey (the optimum). During optimization, the three best solutions obtained so far are taken as α, β, and δ, respectively. The other wolves are considered ω and re-position themselves with respect to α, β, and δ. The mathematical model used to re-adjust the positions of the ω wolves is as follows:
D_α = |C_1 · X_α − X|    (60)

D_β = |C_2 · X_β − X|    (61)

D_δ = |C_3 · X_δ − X|    (62)
where X_α shows the position of the alpha, X_β shows the position of the beta, X_δ is the position of the delta, C_1, C_2, and C_3 are random vectors, and X indicates the position of the current solution. Eqs. (60)-(62) calculate the approximate distances between the current solution and alpha, beta, and delta, respectively. After defining these distances, the final position of the current solution is calculated as follows:
X_1 = X_α − A_1 · D_α    (63)

X_2 = X_β − A_2 · D_β    (64)

X_3 = X_δ − A_3 · D_δ    (65)

X(t+1) = (X_1 + X_2 + X_3) / 3    (66)
where X_α shows the position of the alpha, X_β shows the position of the beta, X_δ is the position of the delta, A_1, A_2, and A_3 are random vectors, and t indicates the number of iterations. Eqs. (60)-(62) define the step size of the ω wolf toward α, β, and δ, respectively, and Eqs. (63)-(66) then define its final position.
It may also be observed that there are two random vectors, A and C, which drive exploration and exploitation in the GWO algorithm. Exploration occurs when A is greater than 1 or less than −1, and the vector C also promotes exploration when it is greater than 1. In contrast, exploitation is emphasized when |A| < 1 and |C| < 1. It should be noted here that A is decreased linearly during optimization in order to emphasize exploitation as the iteration counter increases, whereas C is generated randomly throughout optimization to emphasize exploration/exploitation at any stage, a very helpful mechanism for escaping local optima. The general steps of the GWO algorithm are as follows:
1. Initialize a population of wolves randomly based on the upper and lower bounds of the
variables.
2. Calculate the corresponding objective value for each wolf.
3. Choose the three best wolves and save them as α, β, and δ.
4. Update the positions of the rest of the population (the ω wolves) using Eqs. (60) to (66).
5. Update the parameters a, A, and C.
6. Go to step 2 if the end criterion is not satisfied.
7. Return the position of α as the best approximation of the optimum.
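Steps 1-7 can be sketched in Python as follows; the population size, iteration count, and test function are illustrative choices, not values from the text, and all vector operations are applied element-wise:

```python
import numpy as np

def gwo_minimize(f, dim, lo, hi, n_wolves=20, iters=200, seed=0):
    """Minimal GWO sketch following steps 1-7."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, (n_wolves, dim))             # step 1
    for t in range(iters):
        fitness = np.array([f(x) for x in X])            # step 2
        alpha, beta, delta = X[np.argsort(fitness)[:3]]  # step 3
        a = 2.0 * (1 - t / iters)                        # a: 2 -> 0
        new_X = np.empty_like(X)
        for i in range(n_wolves):                        # step 4
            candidates = []
            for leader in (alpha, beta, delta):
                r1 = rng.random(dim)
                r2 = rng.random(dim)
                A = 2 * a * r1 - a                       # step 5
                C = 2 * r2
                D = np.abs(C * leader - X[i])            # Eqs. (60)-(62)
                candidates.append(leader - A * D)        # Eqs. (63)-(65)
            new_X[i] = np.mean(candidates, axis=0)       # Eq. (66)
        X = np.clip(new_X, lo, hi)                       # step 6: iterate
    fitness = np.array([f(x) for x in X])
    return X[np.argmin(fitness)]                         # step 7: alpha

best = gwo_minimize(lambda x: float(np.sum(x * x)), dim=5, lo=-10.0, hi=10.0)
```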
It is shown that the GWO algorithm [13] is able to provide very competitive results compared
to other well-known meta-heuristics. The problem of training MLPs is considered as a
challenging problem with an unknown search space, which is due to varying datasets and
inputs that may be provided for the MLP. The high exploration ability of the GWO algorithm therefore makes it, in theory, an efficient MLP learner.
3.3 Particle swarm optimization (PSO) algorithm
PSO was originally designed and introduced by Eberhart and Kennedy [7] in 1995. PSO is a population-based search algorithm built on a simulation of the social behavior of birds, bees, or schooling fish. The algorithm was originally intended to simulate the graceful and unpredictable choreography of a bird flock. Each individual within the swarm is represented by a vector in the multidimensional search space. This vector also has one assigned vector, called the velocity vector, which determines the next movement of the particle. The PSO algorithm also determines how to update the velocity of a particle: each particle updates its velocity based on its current velocity, the best position it has explored so far, and the global best position explored by the swarm [7]. The PSO process is then iterated a fixed number of times or until a minimum error based on the desired performance index is achieved. It has been shown that this simple model can deal with difficult optimization problems efficiently. Here we give a short description of the PSO algorithm proposed by Kennedy and Eberhart. Assume that the search space is d-dimensional. The i-th particle of the swarm is represented by the d-dimensional position vector X_i = (x_{i1}, x_{i2}, …, x_{id}) and its velocity by V_i = (v_{i1}, v_{i2}, …, v_{id}); the best position visited so far by particle i is P_i = (p_{i1}, p_{i2}, …, p_{id}), and the best position explored so far by the whole swarm is P_gbest = (p_{g1}, p_{g2}, …, p_{gd}). The position of the particle and its velocity are updated using the following equations:
v_i(t+1) = w v_i(t) + c_1 φ_1 (p_i(t) − x_i(t)) + c_2 φ_2 (p_g(t) − x_i(t))    (67)

x_i(t+1) = x_i(t) + v_i(t+1)    (68)
where c_1 and c_2 are positive constants, and φ_1 and φ_2 are two uniformly distributed random numbers between 0 and 1. In this equation, w is the inertia weight, which determines the effect of the previous velocity vector on the new one. The inertia weight plays the role of balancing the global and local searches, and its value may vary during the optimization process. A large inertia weight encourages a global search while a small value encourages a local search. The adaptive weighted PSO (AWPSO) algorithm modifies the velocity formula of PSO as follows:
v_i(t+1) = w v_i(t) + α [ r_1 (p_i − x_i(t)) + r_2 (p_g − x_i(t)) ]    (69)

The second term in Eq. (69) can be viewed as an acceleration term, which depends on the distances between the current position x_i, the personal best p_i, and the global best p_g. The acceleration factor α is defined as follows:

α = α_0 + t / N_t    (70)
where N_t denotes the total number of iterations, t represents the current generation, and the suggested range for α_0 is [0.5, 1]. As can be seen from Eq. (70), the acceleration term increases as the number of iterations increases, which enhances the global search ability at the end of the run and helps the algorithm to jump out of local optima, especially in the case of multi-modal problems. Furthermore, instead of using a linearly decreasing inertia weight, a random number can be used to improve the performance of the PSO on some benchmark functions, as follows:
w = w_0 + r (1 − w_0)    (71)
where w_0 ∈ [0, 1] is a positive constant and r is a random number uniformly distributed in [0, 1]. The suggested range for w_0 is [0, 0.5], which makes the weight w vary randomly between 0 and 1. An upper bound is placed on the velocity in all dimensions. This limitation prevents the particle from moving too rapidly from one region of the search space to another. The bound is usually set as a function of the range of the problem; for example, if the range of all x_{ij} is [−1, 1], then V_max is proportional to 1. P_ibest for each particle is updated in
each iteration when a better position for the particle or for the whole swarm is obtained. The feature that drives PSO is social interaction: individuals (particles) within the swarm learn from each other and, based on the knowledge obtained, move to become more similar to their better previously obtained positions and to their better neighbors. Individuals within a neighborhood communicate with one another, and based on this communication different neighborhood topologies are defined. One of these topologies, considered here, is the star topology. In this topology each particle can communicate with every other individual, forming a fully connected social network. In this case each particle is attracted toward the best position (best problem solution) found by any member of the entire swarm, and therefore imitates the overall best particle. P_gbest is accordingly updated whenever a new best position within the whole swarm is found.
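The updates of Eqs. (67)-(68), combined with the random inertia weight of Eq. (71), the velocity bound, and the star (gbest) topology described above, can be sketched as follows. The population size, c_1, c_2, w_0, velocity-bound factor, and test function are illustrative choices, not values from the text:

```python
import numpy as np

def pso_minimize(f, dim, lo, hi, n_particles=30, iters=200,
                 c1=1.5, c2=1.5, w0=0.4, seed=0):
    """Sketch of gbest PSO with the random inertia weight of Eq. (71)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros((n_particles, dim))
    v_max = 0.2 * (hi - lo)                       # velocity upper bound
    p_best = x.copy()                             # personal bests
    p_cost = np.array([f(xi) for xi in x])
    g_best = p_best[np.argmin(p_cost)].copy()     # star topology: global best
    for _ in range(iters):
        w = w0 + rng.random() * (1 - w0)          # Eq. (71)
        phi1 = rng.random((n_particles, dim))
        phi2 = rng.random((n_particles, dim))
        v = (w * v + c1 * phi1 * (p_best - x)     # Eq. (67)
             + c2 * phi2 * (g_best - x))
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, lo, hi)                # Eq. (68)
        cost = np.array([f(xi) for xi in x])
        better = cost < p_cost                    # update P_ibest
        p_best[better] = x[better]
        p_cost[better] = cost[better]
        g_best = p_best[np.argmin(p_cost)].copy() # update P_gbest
    return g_best

best = pso_minimize(lambda z: float(np.sum(z * z)), dim=5, lo=-10.0, hi=10.0)
```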