An Introduction To Particle Swarm Optimization
Matthew Settles
Department of Computer Science, University of Idaho,
Moscow, Idaho, U.S.A. 83844
November 7, 2005
1 Introduction
When the search space is too large to search exhaustively, population based searches may be a good alternative; however, population based search techniques cannot guarantee the optimal (best) solution. I will discuss one such population based search technique, Particle Swarm Optimization (PSO). The PSO algorithm shares many characteristics with the Genetic Algorithm, but the manner in which the two algorithms traverse the search space is fundamentally different.
Both Genetic Algorithms and Particle Swarm Optimizers share common elements:
1. Both initialize a population in a similar manner.
2. Both use an evaluation function to determine how fit (good) a potential solution is.
3. Both are generational, that is, both repeat the same set of processes for a predetermined amount of time.
2 Particle Swarm Optimization
Particle Swarm Optimization was first introduced by Dr. Russell C. Eberhart¹ and Dr. James Kennedy² in 1995.
As described by Eberhart and Kennedy, the PSO algorithm is an adaptive algorithm based on a social-psychological metaphor: a population of individuals (referred to as particles) adapts by returning stochastically toward previously successful regions [1].
Particle Swarm has two primary operators: velocity update and position update. During each generation each particle is accelerated toward the particle's previous best position and the global best position. At each iteration a new velocity value for each particle is calculated based on its current velocity, the distance from its previous best position, and the distance from the global best position. The new velocity value is then used to calculate the next position of the particle in the search space. This process is then iterated a set number of times, or until a minimum error is achieved.
The following fragment of the algorithm selects g, the index of the best performer in particle i's neighborhood (G is the evaluation function and pj is particle j's previous best position):

g = i                               (arbitrary initial choice)
for j = indexes of neighbors do
    if G(pj) > G(pg) then
        g = j                       (g is the index of the best performer in the neighborhood)
    end if
end for
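The same selection can be sketched in Python. This is only an illustration, assuming a fitness function G to be maximized and a list p of previous best positions; the names are mine, not the paper's:

def neighborhood_best(i, neighbors, p, G):
    # Return the index of the best performer among particle i's neighbors.
    g = i  # arbitrary initial choice
    for j in neighbors:
        if G(p[j]) > G(p[g]):
            g = j  # g now indexes the best performer seen so far
    return g

# Example: with G = sum and three one-dimensional previous bests,
# particle 0's best neighbor is chosen from indexes 1 and 2.
print(neighborhood_best(0, [1, 2], [[0.5], [0.9], [0.2]], sum))   # -> 1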
¹ Dr. Russell C. Eberhart is the Chair of the Department of Electrical and Computer Engineering, Professor of Electrical and Computer Engineering, and Adjunct Professor of Biomedical Engineering at the Purdue School of Engineering and Technology, Indiana University Purdue University Indianapolis (IUPUI).
² Dr. James Kennedy is a research psychologist at the Bureau of Labor Statistics in Washington, DC.
3 Definitions and Variables Used
PSO Particle Swarm Optimizer.
t the current time step; t − 1 is the previous time step.
Tmax the maximum number of time steps the swarm is allowed to search.
xid(t) the position of individual i in dimension d at time step t.
vid(t) the velocity of individual i in dimension d at time step t.
pid the best position found so far by individual i in dimension d.
pgd the best position found so far in dimension d by any member of individual i's neighborhood (or, in the global best topology, the whole population).
c1, c2 positive acceleration constants.
P(xid(t) = 1) the probability that individual i will choose 1 for the bit at the dth site on the bitstring.
ϕ1 a positive random number drawn from a uniform distribution between 0.0 and 1.0.
ϕ2 a positive random number drawn from a uniform distribution between 0.0 and 1.0.
ρid a positive random number drawn from a uniform distribution between 0.0 and 1.0 (Binary Particle Swarm).
wstart the starting inertia weight (w(0) = wstart). (Inertia Particle Swarm)
wend the ending inertia weight (w(Tmax) = wend). (Inertia Particle Swarm)
4 Binary Particle Swarm Optimizer

In the binary version of the particle swarm [2], each position component xid(t) is a bit, and the velocity vid(t) determines the probability that the bit takes the value 1.

Figure 1: The sigmoid function S(vid(t)) = 1/(1 + exp(−vid(t))), plotted over vid(t) ∈ [−10, 10].
The parameter vid(t), an individual's predisposition to make one or the other choice, will determine a probability threshold. If vid(t) is higher, the individual is more likely to choose 1, and lower values favor the 0 choice. Such a threshold needs to stay in the range [0.0, 1.0]. The sigmoidal function is a logical choice to do this, as it squashes the range of vid to [0.0, 1.0]:
s(vid) = 1 / (1 + exp(−vid))    (4.2)
Furthermore, we can limit vid so that s(vid) does not approach 0.0 or 1.0 too closely. This ensures that there is always some chance of a bit flipping. A constant parameter Vmax, set at the start of a trial to limit the range of vid, is often set at ±4.0, so that there is always at least a chance of s(−Vmax) ≈ 0.018 that a bit will change state. In this binary model, Vmax functions similarly to the mutation rate in genetic algorithms.
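Putting these pieces together, here is a minimal sketch of the binary update for a single bit, assuming the usual binary rule that the bit becomes 1 when ρid < s(vid) (this is how ρid from Section 3 is used), with Vmax = 4.0 as suggested above; the function names are illustrative:

import math
import random

V_MAX = 4.0  # suggested clamp from the text

def s(v):
    # Sigmoid squashing function, eq. (4.2).
    return 1.0 / (1.0 + math.exp(-v))

def update_bit(v_id):
    # Binary PSO position update for one bit of one particle.
    v_id = max(-V_MAX, min(V_MAX, v_id))  # keep vid within [-Vmax, Vmax]
    rho = random.random()                 # rho_id ~ U(0.0, 1.0)
    return 1 if rho < s(v_id) else 0

With vid pinned at −Vmax, s(−4.0) ≈ 0.018, so the bit still becomes 1 about 1.8% of the time; symmetrically, at +Vmax it still becomes 0 about 1.8% of the time.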
5 Standard Particle Swarm Optimizer
In real number space, the parameters of a function can be conceptualized as a point in space. Furthermore, the space in which the particles move is heterogeneous with respect to fitness; that is, some regions are better than others. A number of particles can be evaluated, and there is presumed to be some kind of preference or attraction for better regions of the search space.
The standard version of the PSO has a tendency to explode as oscillations become wider and wider, unless some method is applied for damping the velocity. The usual method for preventing explosion is simply to define a parameter Vmax and prevent the velocity from exceeding it on each dimension d for individual i. Typically Vmax is set to Xmax, the maximum initialization range of xid.
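As a sketch, this clamp is a single operation with numpy; the array names and sizes here are illustrative, not from the paper:

import numpy as np

V_MAX = 2.0                                      # assumed value; the text suggests Vmax = Xmax
v = np.random.uniform(-5.0, 5.0, size=(10, 3))   # example velocity array (10 particles, 3 dimensions)
v = np.clip(v, -V_MAX, V_MAX)                    # every vid now lies within [-Vmax, Vmax]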
Other methods have also been introduced to control the explosion of vid; the two most notable are Eberhart and Shi's PSO with inertia and Clerc's PSO with constriction.
6 Particle Swarm Optimizer with Inertia

xid(t) = f(w(t), xid(t − 1), vid(t − 1), pid, pgd)    (6.1)

vid(t) = w(t) ∗ vid(t − 1) + c1 ϕ1 (pid − xid(t − 1)) + c2 ϕ2 (pgd − xid(t − 1))
xid(t) = xid(t − 1) + vid(t)    (6.2)
Typically w(t) is reduced linearly from wstart to wend over the course of the run; a good starting point is to set wstart to 0.9 and wend to 0.4.
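The following is a self-contained sketch of this inertia version, under stated assumptions: the sphere function as a stand-in objective (minimized here, unlike the maximizing pseudocode in Section 2), the global best topology, Vmax = Xmax, and the parameter values suggested above. None of these names come from the paper:

import random

def fitness(x):
    # Stand-in objective: the sphere function, minimized at the origin.
    return sum(xi * xi for xi in x)

DIM, N, T_MAX = 3, 20, 1000     # dimensions, swarm size, time steps
X_MAX = 5.0                     # initialization range, also used as Vmax
W_START, W_END = 0.9, 0.4       # inertia schedule suggested in the text
C1 = C2 = 2.0                   # acceleration constants

x = [[random.uniform(-X_MAX, X_MAX) for _ in range(DIM)] for _ in range(N)]
v = [[0.0] * DIM for _ in range(N)]
p = [xi[:] for xi in x]                     # previous best positions
p_fit = [fitness(xi) for xi in x]           # previous best fitnesses
g = min(range(N), key=lambda i: p_fit[i])   # global best index

for t in range(T_MAX):
    w = W_START - (W_START - W_END) * t / T_MAX   # linearly decreasing w(t)
    for i in range(N):
        for d in range(DIM):
            phi1, phi2 = random.random(), random.random()
            v[i][d] = (w * v[i][d]
                       + C1 * phi1 * (p[i][d] - x[i][d])    # pull toward own best
                       + C2 * phi2 * (p[g][d] - x[i][d]))   # pull toward global best
            v[i][d] = max(-X_MAX, min(X_MAX, v[i][d]))      # clamp to Vmax
            x[i][d] += v[i][d]                              # position update, eq. (6.2)
        f = fitness(x[i])
        if f < p_fit[i]:                    # minimizing, so lower is better
            p[i], p_fit[i] = x[i][:], f
            if f < p_fit[g]:
                g = i

print("best fitness:", p_fit[g])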
7 Particle Swarm Optimizer with Constriction

xid(t) = f(χ, xid(t − 1), vid(t − 1), pid, pgd)    (7.1)

χ = 2k / |2 − ϕ − √(ϕ² − 4ϕ)|,  where ϕ = c1 + c2, ϕ > 4    (7.2)
Clerc [3] found that by modifying ϕ, the convergence characteristics of the system can be controlled. Typically k = 1 and c1 = c2 = 2.05, so that ϕ = 4.1 and thus χ ≈ 0.73.
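Equation (7.2) is easy to check numerically; a minimal sketch, with illustrative names:

import math

def chi(c1=2.05, c2=2.05, k=1.0):
    # Clerc's constriction coefficient, eq. (7.2).
    phi = c1 + c2   # must exceed 4 for the square root to be real
    return 2.0 * k / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

print(round(chi(), 2))   # -> 0.73, matching the value quoted above

In Clerc's constricted form, χ multiplies the entire velocity expression, vid(t) = χ (vid(t − 1) + c1 ϕ1 (pid − xid(t − 1)) + c2 ϕ2 (pgd − xid(t − 1))), which damps the oscillations without requiring an explicit Vmax.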
8 Neighborhood Topologies
There are 3 main neighborhood topologies used in PSO: circle, wheel, and star. The choice of neighborhood topology determines which individual to use for pgd. In the circle topology (see Figure 2), each individual is socially connected to its k nearest topological neighbors (pgd = best individual result among its k nearest neighbors; k typically equals 2).

Figure 2: Circle Topology

The wheel topology (see Figure 3) effectively isolates individuals from one another, as information has to be communicated through a focal individual, pfd (pgd = best{pfd, pid}). The star topology (see Figure 4) is better known as the global best topology; here every individual is connected to every other individual (pgd = best individual result in the population).
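As an illustration of the circle topology, the k = 2 neighborhood (one neighbor on each side of a ring) can be computed as follows and passed to the neighborhood_best sketch from Section 2; the helper name is mine:

def circle_neighbors(i, n, k=2):
    # Indexes of particle i's k nearest neighbors on a ring of n particles
    # (k = 2 means one neighbor on each side).
    half = k // 2
    return [(i + off) % n for off in range(-half, half + 1) if off != 0]

# Example: in a swarm of 5 particles, particle 0's neighbors are 4 and 1.
print(circle_neighbors(0, 5))   # -> [4, 1]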
References
[1] J. Kennedy and R. C. Eberhart. Swarm Intelligence. Morgan Kaufmann Publishers, Inc., San Francisco, CA, 2001.
[2] J. Kennedy and R. C. Eberhart. A discrete binary version of the particle swarm algorithm, 1997.
[3] M. Clerc. The swarm and the queen: Towards a deterministic and adaptive particle swarm optimization. In Congress on Evolutionary Computation (CEC99), pages 1951–1957, 1999.
Figure 3: Wheel Topology