Swarm Intelligence: Past, Present and Future
Xin-She Yang · Suash Deb · Yu-Xin
Zhao · Simon Fong · Xingshi He
Abstract Many optimization problems in science and engineering are challenging to solve, and the current trend is to use swarm intelligence (SI) and
SI-based algorithms to tackle such challenging problems. Some significant developments have been made in recent years, though there are still many open
problems in this area. This paper provides a short but timely analysis of
SI-based algorithms and their links with self-organization. Different characteristics and properties are analyzed here from both mathematical and qualitative
perspectives. Future research directions are outlined and open questions are
also highlighted.
Citation detail: Soft Computing, Published online 5 Sept 2017
https://link.springer.com/article/10.1007/s00500-017-2810-5
Yang, XS., Deb, S., Zhao, YX, Fong, S., He, XS.,
Swarm intelligence: past, present and future, Soft Comput (2017).
https://doi.org/10.1007/s00500-017-2810-5
Xin-She Yang
School of Science and Technology, Middlesex University, London NW4 4BT, UK.
(x.yang@mdx.ac.uk, corresponding author)
Suash Deb
1) IT & Educational Consultant, Ranchi, India, and 2) Distinguished Professorial Associate,
Decision Sciences and Modelling Program, Victoria University, Melbourne, Australia.
Yu-Xin Zhao
College of Automation, Harbin Engineering University, Harbin, China.
Simon Fong
Department of Computer and Information Sciences, University of Macau, Macau, China.
Xingshi He
College of Science, Xi’an Polytechnic University, No. 19 Jinhua South Road, Xi’an, China.
1 Introduction
Many optimization problems in science and engineering applications are highly
complex and challenging, and thus require novel problem-solving approaches.
Traditional approaches tend to use problem-specific information such as the
gradients of the objective to guide the search for optimal solutions, and such
approaches tend to be highly sophisticated and specialized. They also have the
disadvantage of getting trapped in local optima, except for linear programming
and convex optimization. One of the current trends is to solve difficult optimization problems in a quasi-heuristic way in combination with the successful
characteristics of multi-agent systems. This trend also seems to hold for
solving problems in industry and business settings. This new way of problem solving has resulted in the significant development of novel swarm
intelligence based algorithms.
In nature, many living organisms live in a community where there is no
centralized decision-making. In fact, the decision making among many biological systems, especially social insects such as ants and bees, seems to occur
in a distributed, local manner. Individuals make decisions based on local information and interactions with other agents and their environment. Such
local interactions seem to be responsible for the rise of social intelligence, and
it can be hypothesized that such complex interactions may directly or indirectly contribute to the emergence of intelligence in general. After
all, changes tend to be some sort of response and adaptation to changes
in an organism’s community and environment. Groups of different organisms
of the same species in nature have been found to be successful in carrying out
specific tasks, by means of a collective behaviour, namely collective intelligence
or swarm intelligence (SI) [22, 33, 46].
It has also been observed in nature that different species can also co-evolve
and cooperate under the right conditions, especially when the resources are
sparse. Such swarm intelligence has inspired researchers to develop various ingenious ways for solving challenging problems in optimization, machine learning and data mining [6, 27, 55, 58]. Nature-inspired algorithms tend to be flexible, easy to implement and sufficiently versatile to deal with different types of
optimization problems in practice. Such characteristics enable them to solve problems that may be too challenging to solve using traditional algorithms.
Accompanying the emergence and success of nature-inspired algorithms,
especially the SI-based algorithms, there is a strong need to understand the
mechanisms of algorithms from a rigorous mathematical perspective. However,
progress in theory lags behind. Thus, it is often the case that we know how
to use such algorithms and know they will generally work well, but we rarely
know why they work or under exactly what conditions. Consequently, the use and
applications of such metaheuristic algorithms are partially heuristic as well.
However, some promising progress has started to emerge in recent years concerning the analysis of algorithms using Markov chain theory, dynamic systems,
random walks and stability analysis. These studies have started to provide some insight into
the inner workings of algorithms. This paper will briefly review the state-of-the-art
developments concerning swarm intelligence with a focus on both the
present and future. We will also highlight some key challenges and trends
for future developments. The paper is organized as follows. Section
2 briefly touches on the concept of swarm intelligence, Section 3
mainly focuses on the present, and Section 4 looks at these algorithms from a
theoretical perspective. Section 5 tries to inspire future research. Finally,
the paper concludes briefly in Section 6.
2 Swarm Intelligence: A Critical Analysis
Swarm intelligence can arise in multi-agent systems and it is not clear yet
what mechanisms are responsible for the emergence of collective behavior in a
swarm. Even so, swarm-intelligence-based algorithms have been developed and
applied in a vast number of applications in optimization, engineering, machine
learning, image processing and data mining. Here, we review critically the
essence of swarm intelligence and its link with self-organization.
2.1 Swarm Intelligence
The emergence of swarm intelligence (SI) is a complex process, and it is not
quite clear what mechanisms are required to ensure the emergence of collective intelligence. Inside a swarm, individual agents such as ants and bees in
the complex system follow simple rules, act on local information, and there
is no centralized control [6, 33]. Such rule-based interactions can lead to the
emergence of self-organization, resulting in structures and characteristics at a
higher system level. Loosely speaking, individuals in the system are not intelligent, but the overall system can behave intelligently, or at least in a way that can be considered as some sort of collective intelligence. Such emerging self-organization
can explain some key swarming behaviour from ants to people [22, 46].
For such self-organization behaviour to emerge, it seems that certain conditions are necessary, and conditions such as feedback, stigmergy,
multiple interactions, memory and environmental setting are very important.
However, the exact role of such conditions is still not clear, nor is it clear how seemingly self-organized structures can arise under such conditions. Though there
have been different attempts to understand the system behaviour,
different studies in various subjects typically focus on one or a subset of these
factors [11, 6, 22, 28, 40, 50].
Even though we may not fully understand the true mechanisms that lead
to the self-organization and intelligent characteristics of a complex system, researchers have successfully developed optimization algorithms based on swarm
intelligence. Examples of such algorithms include particle swarm optimization
(PSO) [27], ant colony optimization (ACO) [6], bat algorithm (BA) [55], cuckoo
search (CS) [58], flower pollination algorithm (FPA) [57], wolf search algorithm
(WSA) [25] and many others.
Before we discuss the links between swarm intelligence, self-organization
and algorithms, let us analyze first the main characteristics of optimization
algorithms.
2.2 Algorithmic Characteristics
Algorithms have always been an important part of computation [5, 9], but contemporary algorithms tend to be a combination of deterministic and stochastic
components. Almost all nature-inspired algorithms use some aspects of swarm
intelligence with stochastic components [60]. Since swarm intelligence-based
algorithms are very diverse in terms of the sources of inspiration in nature
and their formulations, there are different ways of analyzing and decomposing the essential components of these algorithms. For example, we can look
at algorithms by focusing on the key characteristics and their properties from
a perspective of self-organizing systems. This is a higher-level analysis that
does not depend on the details of the mathematical formulations or algorithmic steps, which allows us to focus on the functionalities and the main search
behaviour of algorithms.
First, all SI-based algorithms use a population of multiple agents and each
agent is represented by a solution vector xi , and each vector can be considered
as a state of the system. An algorithmic system is typically initialized by
setting the population as the random sampling of the search space. Then,
the update of this population is realized by moving the agents in a quasi-deterministic manner, referred to here as the ‘algorithmic dynamics’. These
algorithmic dynamics determine how the system evolves, according to a set of
equations (such as those used in particle swarm optimization) or a predefined
procedure (such as those used in genetic algorithms). Randomness is often
used in SI-based algorithms to act as a perturbation force to drive the system
from equilibrium and potentially jump out of local valleys in the objective
landscape.
In addition, a selection mechanism is needed to select the best solution (or
the fittest solutions) in the population so as to allow the fittest to pass onto the
next generation. This means that some states/solutions are preferentially selected.
Such selection, together with the evolution of population, often enables the
population in the search process to converge to a set of solutions (often the
optimal set), and consequently some convergent states or solutions may emerge
as iterations continue.
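The components described above (a population, algorithmic dynamics, randomization and selection) can be put together in a minimal sketch. It is not any specific algorithm from the literature: the function names, the sphere objective, the drift-plus-noise move and all parameter values are illustrative assumptions.

```python
import random


def sphere(x):
    """Illustrative objective (an assumption): minimum 0 at the origin."""
    return sum(xk * xk for xk in x)


def generic_swarm_search(f, dim=2, n_agents=20, n_iter=200,
                         lower=-5.0, upper=5.0, step=0.1, seed=1):
    rng = random.Random(seed)
    # Population: random sampling of the search space
    pop = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_agents)]
    best = min(pop, key=f)
    history = [f(best)]
    for _ in range(n_iter):
        for i, x in enumerate(pop):
            # Dynamics: drift towards the current best solution,
            # plus a Gaussian perturbation (randomization)
            r = rng.random()
            cand = [xk + r * (bk - xk) + step * rng.gauss(0, 1)
                    for xk, bk in zip(x, best)]
            # Selection: keep the candidate only if it is fitter
            if f(cand) < f(x):
                pop[i] = cand
        best = min(pop + [best], key=f)
        history.append(f(best))
    return best, history


best, history = generic_swarm_search(sphere)
```

Because the best solution is always retained, `history` is non-increasing, which is the convergence-by-selection behaviour described in the text.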
For example, let us consider both particle swarm optimization (PSO) and
firefly algorithm (FA) to be introduced later. Both algorithms have randomization by using random numbers. Though their use of two random numbers
is different, both can provide some form of stochastic properties in generating
new solutions so that new solutions can be different and sometimes sufficiently
distant from existing solutions. This means that they essentially provide the
ability for the algorithms to escape local optima without being trapped.
Table 1 Main characteristics of an algorithm based on swarm intelligence.
Algorithmic Components   Characteristics       Role/Properties
Multi-agents             Population            Diversity and sampling
Randomization            Perturbations         Escape local optima
Selection                Driving force         Organization and convergence
Algorithmic equations    Iterative evolution   Evolution of solutions
However, there are significant differences between PSO and FA. Firstly,
PSO uses the best solution found so far g∗ , while FA does not use g∗ . Secondly,
PSO is a linear system, while FA is nonlinear in terms of updating equations.
Thirdly, the attraction mechanism in FA allows the swarm to subdivide into
multiple small subswarms, which enables FA to solve multimodal problems
more effectively. On the other hand, PSO cannot subdivide the swarm. In addition, PSO has the drawbacks of using velocity, while FA does not use velocity.
Thus, these differences in algorithmic dynamics will lead to significantly different characteristics, performance and efficiency of algorithms. In fact, studies
show that FA can have a higher convergence rate in most applications [23, 49].
In all algorithms, iterations are used to provide the evolution of solutions
towards some selected solutions in terms of a pseudo-time iteration counter. At
the initial stages of such iterative evolution, solutions tend to have much higher
diversity as solutions are usually different and often uniformly distributed
randomly in the search space. As the evolution continues, solutions become
more similar to each other by some selection mechanism based on the fitness
landscape. Selection acts as a driving force for evolution. Good solutions are
selected according to their fitness, often the objective values, which exerts a
selection pressure for the multi-agent populations to adapt and react to the
changes in the objective landscape and can thus drive the system to converge
towards some specific, selected states or solutions.
These key characteristics and properties as well as their role can be summarized in Table 1. It is worth pointing out that this is only one way of looking
at the algorithms and the emphasis here is purely for the convenience of comparing with the mechanisms for self-organization to be discussed in the next
subsection.
Obviously, there are other ways to look at algorithms [7, 11, 59]. For example, the use of exploration and exploitation is another good way to analyze
the behaviour of algorithms [7]. In addition, mathematical analysis can provide insight from a theoretical point of view [59]. In general, different ways of
looking at algorithms can lead to different insights and thus help us to understand the
algorithms from different perspectives.
2.3 Algorithms as Self-Organization
A complex system may be able to self-organize under the right conditions.
Loosely speaking, when the size of the system is sufficiently large, it will lead
to a sufficiently high number of degrees of freedom or possible states S. At the
same time, there should be a sufficiently long time for the system to evolve
from noise and far from equilibrium states [2].

Table 2 Similarities between self-organization and an algorithm.
Self-organization      Features          Algorithm       Properties
Multiple states        High complexity   Population      Diversity and sampling
Noise, perturbations   Diversity         Randomization   Escape local optima
Selection mechanism    Structure         Selection       Convergence
Re-organization        State changes     Evolution       Evolution of solutions

Another important factor is that a proper selection mechanism must be
in place to ensure that self-organization is possible. In other words, the primary conditions for self-organization to evolve in a complex system can be
summarized as follows [2, 28]:
• The size of the complex system is sufficiently large with a higher number
of degrees of freedom or states.
• Enough diversity exists in the system in terms of perturbations, noise, edge
of chaos, or far from the equilibrium.
• The system is given enough time to evolve.
• There is a selection mechanism (or an unchanging law) in the system to
select certain states.
If we loosely represent the above conditions mathematically, we can say
that a system with multiple states Si can evolve towards the self-organized
states S∗ , driven by a driving mechanism M (t, p) with a set of parameters p
that may vary with time t, which can be written schematically as
\[ S_i \overset{M(t,p)}{\Longrightarrow} S^*. \tag{1} \]
If we look at an algorithm from the perspective of self-organization, we can
indeed consider an algorithm as a self-organization system that starts from many
possible states xi (solutions) and tries to converge to the optimal solution/state
x∗ , driven by the selection mechanism in an algorithm A(p, t) with a set of
parameters p, evolving with pseudo-time t. This can be represented in the
following schematic format:
\[ f(x_i) \overset{A(p,t)}{\Longrightarrow} f_{\min}(x^*) \ \text{or} \ f_{\max}(x^*). \tag{2} \]
Now if we look at both algorithms and self-organization systems more closely,
we can identify the main role and properties of an algorithm and compare them
with the conditions for self-organization, as summarized in Table 2.
Despite these striking similarities, however, there are some significant differences between a self-organizing system and an algorithm. First, for self-organization, the exact avenues to the self-organized states may not be clear.
But for an algorithm, the way that makes an algorithm converge is crucial.
Second, time is not an important factor for self-organization, while the rate
of convergence is paramount for an algorithm because the minimum computational cost is needed in practice so as to quickly reach either truly global
optimality or suboptimal solutions. Finally, the structure can be important
for a self-organized system, while the converged solution vectors (rather than
their structure) are more important for solving an optimization problem.
It is worth pointing out that these similarities between self-organization and algorithms are applicable to almost all stochastic algorithms, including the classic
stochastic algorithms such as genetic algorithms [19]. In addition, even if we can
consider an algorithm as a self-organized system, this does not mean that we
can always make an algorithm efficient. This is partly because the exact behaviour is influenced by both the interactions of algorithmic components and
algorithm-dependent parameters, but these details are not clearly understood
yet for most algorithms.
In the next section, we will outline the state-of-the-art developments of SI-based algorithms before we proceed to some in-depth mathematical analysis
afterwards.
3 The Present Developments
In the current literature, there are many algorithms that use the concept of
swarm intelligence (SI), and the number of SI-based algorithms is increasing
almost monthly. It is therefore not possible to introduce and analyze all these
algorithms in a short paper such as this. Instead, our emphasis will be on the
brief analysis of a few selected algorithms as representatives so as to highlight
the main points. We first briefly introduce a few algorithms and then
categorize them in terms of their characteristics, algorithmic dynamics and
links to self-organization.
3.1 Algorithms Based on Swarm Intelligence
Both ant colony optimization (ACO) [6] and particle swarm optimization (PSO) [27] are primary examples of SI-based algorithms. However, ACO
can be considered as a mixture of descriptive procedure and equations, while
PSO is mainly based on dynamic equations. For this reason, we focus first on
PSO here.
For the ease of discussing particle swarm optimization (PSO), developed
by Kennedy and Eberhart [27], we use xi and vi to denote the position (solution) and velocity, respectively, of a particle or agent i. The main iterative
update equations for PSO are
\[ v_i^{t+1} = v_i^t + \alpha \epsilon_1 [g^* - x_i^t] + \beta \epsilon_2 [x_i^* - x_i^t], \tag{3} \]
\[ x_i^{t+1} = x_i^t + v_i^{t+1}, \tag{4} \]
where ε1 and ε2 are two uniformly distributed random vectors in [0,1]. Both
α and β are so-called learning parameters. This algorithm uses the current
global best solution g∗ found so far as well as the individual best x∗i . PSO
has been applied in many areas in science and engineering [3, 44], and it has
also been extended to solve multiobjective optimization problems [37]. For
comprehensive reviews, please refer to [3, 29].
It is clearly seen that the above algorithmic equations are linear in the sense
that both equations only depend on xi and vi linearly. Selection is carried out
by the attractor or converged state g∗ , which is also evolving. Randomization
is done by two uniformly distributed random numbers.
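A minimal sketch of Eqs. (3) and (4) may help to make the roles of g∗ and x∗i concrete. The sphere objective, the parameter values (α = β = 1.5), the zero initial velocities, the search bounds and the function names are all assumptions made for illustration; the update lines themselves follow the two equations term by term.

```python
import random


def sphere(x):
    """Illustrative objective (an assumption): minimum 0 at the origin."""
    return sum(xk * xk for xk in x)


def pso(f, dim=2, n=20, n_iter=100, alpha=1.5, beta=1.5,
        lower=-5.0, upper=5.0, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    v = [[0.0] * dim for _ in range(n)]
    pbest = [xi[:] for xi in x]        # individual best x_i^*
    gbest = min(pbest, key=f)[:]       # global best g^* found so far
    for _ in range(n_iter):
        for i in range(n):
            for k in range(dim):
                e1, e2 = rng.random(), rng.random()
                # Eq. (3): velocity update, linear in x and v
                v[i][k] += (alpha * e1 * (gbest[k] - x[i][k])
                            + beta * e2 * (pbest[i][k] - x[i][k]))
                # Eq. (4): position update
                x[i][k] += v[i][k]
            # Selection: update the individual and global bests
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest, f(gbest)


g, fg = pso(sphere)
```

Since g∗ is only replaced by better solutions, its objective value never increases, even though individual particles may overshoot.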
The bat algorithm (BA) is another example of SI-based algorithms. BA was
developed by Yang and mainly uses frequency tuning and some characteristics of the echolocation of microbats [55]. The main algorithmic equations for
BA are
\[ f_i = f_{\min} + (f_{\max} - f_{\min}) \beta, \tag{5} \]
\[ v_i^t = v_i^{t-1} + (x_i^{t-1} - x^*) f_i, \tag{6} \]
\[ x_i^t = x_i^{t-1} + v_i^t, \tag{7} \]
where β ∈ [0, 1] is a random vector drawn from a uniform distribution, and fmin
and fmax define the frequency-tuning range. These equations are also associated
with the pulse emission rate r and loudness A that can be switched on or off
by comparing with a uniformly distributed random number ε. For each bat i,
we can use
\[ r_i^{t+1} = r_i^{(0)} (1 - e^{-\gamma t}), \qquad A_i^{t+1} = \alpha A_i^t, \tag{8} \]
where 0 < α < 1 and γ > 0 are two parameters to control the variations of r
and A.
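The following sketch shows one way Eqs. (5) to (8) could be combined. The equations are coded as written above; the way the pulse rate r and loudness A gate the moves (a small random walk around the current best when a random number exceeds r, and acceptance only when a random number is below A) is an assumption made for this illustration, as are the objective, the parameter values and the function names.

```python
import math
import random


def sphere(x):
    """Illustrative objective (an assumption): minimum 0 at the origin."""
    return sum(xk * xk for xk in x)


def bat_algorithm(f, dim=2, n=15, n_iter=100, f_min=0.0, f_max=2.0,
                  alpha=0.9, gamma=0.9, r0=0.5,
                  lower=-5.0, upper=5.0, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    v = [[0.0] * dim for _ in range(n)]
    loud = [1.0] * n     # loudness A_i, decreasing over time
    rate = [0.0] * n     # pulse emission rate r_i, increasing over time
    best = min(x, key=f)[:]
    for t in range(1, n_iter + 1):
        for i in range(n):
            freq = f_min + (f_max - f_min) * rng.random()       # Eq. (5)
            v[i] = [vk + (xk - bk) * freq
                    for vk, xk, bk in zip(v[i], x[i], best)]     # Eq. (6)
            cand = [xk + vk for xk, vk in zip(x[i], v[i])]       # Eq. (7)
            if rng.random() > rate[i]:
                # Assumed local move: small random walk around the best
                cand = [bk + 0.01 * rng.gauss(0, 1) for bk in best]
            if f(cand) < f(x[i]) and rng.random() < loud[i]:
                x[i] = cand
                rate[i] = r0 * (1.0 - math.exp(-gamma * t))      # Eq. (8)
                loud[i] = alpha * loud[i]                        # Eq. (8)
            if f(x[i]) < f(best):
                best = x[i][:]
    return best, f(best)


b, fb = bat_algorithm(sphere)
```

With 0 < α < 1 and γ > 0, each accepted move lowers the loudness and raises the pulse rate of that bat, so the search gradually shifts from exploration to exploitation.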
The bat algorithm has been extended to multiobjective optimization and
hybrid versions as well as chaotic bat algorithm with many applications [39,
35, 26, 56, 16].
The algorithmic equations in BA are also linear in the sense that the equations depend on xi and vi linearly. However, the control of exploration and
exploitation is carried out by the variations of loudness A(t) from a high value
to a lower value, while the pulse emission rate is increased nonlinearly from
a lower value to a higher value. Selection is done by the current best solution
x∗ , which plays a role similar to that of g∗ in PSO. Randomization is done by
a uniformly distributed number β for frequency tuning. As a result, BA can
have a faster convergence rate.
The firefly algorithm (FA), developed by Yang, is inspired by the
swarming behaviour of tropical fireflies. FA uses a nonlinear system by combining the exponential decay of light absorption and the inverse-square law of light
variation with distance. The main equation in FA is a single nonlinear equation
in the following form:
\[ x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{ij}^2} (x_j^t - x_i^t) + \alpha \epsilon_i^t, \tag{9} \]
where α is a scaling factor controlling the step sizes, while γ is a scale-dependent parameter controlling the visibility of the fireflies (and thus search
modes). Here rij is the distance between fireflies i and j. In addition, β0 is the attractiveness constant. Firefly algorithm has
been applied to many applications [8, 13, 23, 24, 15, 32, 34, 43, 49, 62] and there
are many different variants such as neighborhood firefly algorithm [52] and the
quantum-based hybrid [65].
Since FA is a nonlinear system, it has the ability to automatically subdivide
the whole swarm into multiple subswarms due to the fact that short-distance
attraction is stronger than long-distance attraction. Each subswarm can potentially swarm around a local mode, and among all the modes, there is always
a globally optimal solution. Therefore, it is suitable for multimodal optimization problems. There is no explicit use of the best solution, thus selection is
through the comparison of relative brightness according to the rule of ‘beauty
is in the eye of the beholder’. Randomization is explicitly done by a perturbation term in the equation. As pointed out earlier, FA has some significant
differences from PSO. FA is nonlinear, while PSO is linear. FA has
an ability of multi-swarming, while PSO does not. In addition, PSO
uses velocities (and thus has some associated drawbacks), while FA does not use velocities.
Furthermore, FA has a scaling control by using γ, while PSO has no
scaling control. All these differences enable FA to search more effectively in
multimodal objective landscapes.
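A single sweep of Eq. (9) can be sketched as follows. The sketch assumes minimization (a lower objective value means a brighter firefly), Gaussian perturbations and illustrative parameter values; the function name and the two-firefly example are hypothetical.

```python
import math
import random


def sphere(x):
    """Illustrative objective (an assumption): minimum 0 at the origin."""
    return sum(xk * xk for xk in x)


def firefly_step(pop, f, beta0=1.0, gamma=1.0, alpha=0.1, rng=random):
    """One sweep of Eq. (9): each firefly moves towards every brighter one."""
    cost = [f(x) for x in pop]
    new_pop = [x[:] for x in pop]
    for i in range(len(pop)):
        for j in range(len(pop)):
            if cost[j] < cost[i]:          # firefly j is brighter than i
                # Squared distance r_ij^2 between the two fireflies
                r2 = sum((a - b) ** 2 for a, b in zip(new_pop[i], pop[j]))
                # Attraction decays with distance: beta0 * exp(-gamma r^2)
                beta = beta0 * math.exp(-gamma * r2)
                new_pop[i] = [a + beta * (b - a) + alpha * rng.gauss(0, 1)
                              for a, b in zip(new_pop[i], pop[j])]
    return new_pop


# With alpha = 0 the move is purely the attraction term of Eq. (9)
moved = firefly_step([[0.0, 0.0], [1.0, 0.0]], sphere, alpha=0.0)
```

No global best appears in the update: the brightest firefly simply has no brighter neighbour and stays where it is, while the factor exp(−γ r²) makes nearby attraction much stronger than distant attraction, which is what allows subswarms to form.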
Cuckoo search (CS), developed by Yang and Deb, is another nonlinear
system. CS is inspired by the intriguing brood parasitism of some
cuckoo species, and it uses a balanced combination of both local and global
search capabilities, controlled by a switching probability pa . One equation is
mainly local and can be written as
\[ x_i^{t+1} = x_i^t + \alpha s \otimes H(p_a - \epsilon) \otimes (x_j^t - x_k^t), \tag{10} \]
where xtj and xtk are two different solutions selected randomly by random
permutation, H(u) is the Heaviside function, ε is a random number drawn from
a uniform distribution, and s is the step size.
The other equation is mainly global and can be expressed as
\[ x_i^{t+1} = x_i^t + \alpha L(s, \lambda), \tag{11} \]
where the Lévy flights are simulated by
\[ L(s, \lambda) \sim \frac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \frac{1}{s^{1+\lambda}}, \quad (s \gg 0). \tag{12} \]
Here α > 0 is the step size scaling factor. Cuckoo search has proved effective
in solving many problems such as software testing [41], scheduling [31], cyber-physical systems [12] and others [58, 64].
As we can see from the above equations, CS is a nonlinear system due to
the Heaviside function and switch probability. Selection is not via the explicit
use of global best g∗ , but selection is done by ranking and elitism where the
current best is passed onto the next generation. Randomization is carried out
more effectively using Lévy flights where a fraction of steps are larger than
those drawn from a Gaussian distribution. Thus, the search is heavy-tailed [42]. In addition, as
the Lévy flights can be approximated by a power-law type of distribution, the
search steps are also scale-free. In fact, it is observed in the simulations that CS
is indeed scale-free and has a fractal-like search structure. Consequently, CS
can be very effective for nonlinear optimization problems and multiobjective
optimization [17, 36, 30, 58, 64].
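The two search moves of Eqs. (10) and (11) can be sketched as below. The text does not say how the Lévy steps of Eq. (12) are drawn, so the sketch swaps in Mantegna's algorithm, a commonly used sampler for heavy-tailed steps; this choice, the uniform random step size s in the local move, and all function names are assumptions.

```python
import math
import random


def levy_step(lam, rng):
    """Heavy-tailed step via Mantegna's algorithm (an assumed sampler)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))
             ) ** (1 / lam)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    while v == 0.0:                 # guard against division by zero
        v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / lam)


def heaviside(u):
    return 1.0 if u > 0 else 0.0


def cs_local_move(pop, i, pa, alpha, rng):
    """Eq. (10): local move switched on or off by H(pa - eps)."""
    j, k = rng.sample(range(len(pop)), 2)    # two different solutions
    return [xi + alpha * rng.random() * heaviside(pa - rng.random())
            * (xj - xk)
            for xi, xj, xk in zip(pop[i], pop[j], pop[k])]


def cs_global_move(x, alpha, lam, rng):
    """Eq. (11): global move using Levy-type steps."""
    return [xi + alpha * levy_step(lam, rng) for xi in x]


rng = random.Random(0)
pop = [[0.0, 0.0], [1.0, 2.0], [3.0, -1.0]]
local = cs_local_move(pop, 0, pa=0.25, alpha=1.0, rng=rng)
far = cs_global_move(pop[0], alpha=0.01, lam=1.5, rng=rng)
```

Comparing the fraction of large steps from `levy_step` with that of a standard Gaussian gives a quick empirical check of the heavy-tailed behaviour described in the text.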
Flower pollination algorithm (FPA) is inspired by the pollination characteristics of flowering plants [57], and FPA mimics the biotic and abiotic
pollination characteristics as well as the flower constancy as a co-evolution between certain flower species and pollinators such as insects and animals. The
main algorithmic equations are
\[ x_i^{t+1} = x_i^t + \gamma L(\lambda) (g^* - x_i^t), \tag{13} \]
and
\[ x_i^{t+1} = x_i^t + U (x_j^t - x_k^t), \tag{14} \]
where γ is a scaling parameter, L(λ) is the random number vector drawn
from a Lévy distribution governed by the exponent λ. Here g∗ is the best
solution found so far, which acts as a selection mechanism. In addition, U is a
uniformly distributed random number. Furthermore, xtj and xtk are solutions
representing pollen from different flower patches.
FPA is a quasi-linear system because the equations are linear in terms of
xi , but the random switching between two branches of search moves introduces some weak nonlinearity. Selection uses the current best explicitly, while
randomization is carried out via three components: Lévy flights, a uniform distribution and a switch probability. Thus, FPA can have a higher explorative
ability while retaining a strong exploitation ability. In fact, it has recently
been proved that FPA can have guaranteed global convergence under the right
conditions [20]. FPA has been applied to many applications with an expanding
literature [1, 4, 38, 63].
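The random switching between Eqs. (13) and (14) can be sketched as follows. Several choices here are assumptions rather than part of the description above: the switch probability value p = 0.8, the greedy acceptance of improved solutions, and the use of a signed Pareto variate as a crude power-law surrogate for L(λ) (its density decays as 1/s^(1+λ), but it is not a true Lévy-stable sampler).

```python
import random


def sphere(x):
    """Illustrative objective (an assumption): minimum 0 at the origin."""
    return sum(xk * xk for xk in x)


def power_law_step(lam, rng):
    """Signed step with a 1/s^(1+lam) tail; a surrogate for L(lam)."""
    return rng.choice((-1.0, 1.0)) * rng.paretovariate(lam)


def fpa(f, dim=2, n=15, n_iter=200, p=0.8, gamma=0.1, lam=1.5,
        lower=-5.0, upper=5.0, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    best = min(pop, key=f)[:]                  # g^*, best found so far
    for _ in range(n_iter):
        for i in range(n):
            if rng.random() < p:
                # Eq. (13): global pollination towards g^*
                cand = [x + gamma * power_law_step(lam, rng) * (g - x)
                        for x, g in zip(pop[i], best)]
            else:
                # Eq. (14): local pollination between two random flowers
                j, k = rng.sample(range(n), 2)
                u = rng.random()
                cand = [x + u * (xj - xk)
                        for x, xj, xk in zip(pop[i], pop[j], pop[k])]
            if f(cand) < f(pop[i]):            # assumed greedy acceptance
                pop[i] = cand
                if f(cand) < f(best):
                    best = cand[:]
    return best, f(best)


g, fg = fpa(sphere)
```

Each branch is linear in xi, and the only nonlinearity comes from the random choice between the two branches, which matches the quasi-linear character described above.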
Obviously, there are other SI algorithms, including the wolf search algorithm (WSA) [25], ant colony optimization (ACO), artificial bee colony and
others, but we do not have space to discuss them in this paper.
In addition, some algorithms such as differential evolution (DE) are also
very efficient [45], but they may not be classified as SI-based algorithms, and
thus we will not discuss them either. Instead, our focus will be on the discussion
and analyses of the above algorithms.
It is worth pointing out that though there are many algorithms in the
literature, there is no single algorithm that can be most efficient to solve all
types of problems as dictated by the no-free-lunch theorems [53]. However,
under the right conditions such as co-evolution, certain algorithms can be
more effective [54]. As our purpose here is not to search for the best algorithms,
our main focus is to gain more insights into different types of algorithms. In the
rest of this section, we will discuss the main characteristics of SI-based algorithms.
Table 3 SI Algorithms and their relevant characteristics.
Algorithm   Randomization    Selection    Evolution Mechanism
GA          Uniform          Elitism      Survival of the fittest
PSO         Uniform          g∗ , x∗i     Swarming towards g∗
BA          Uniform          x∗           Swarming towards x∗
FA          Gaussian         Brightest    Attraction
CS          Lévy flights     Best         Similarity/Elitism
ACO         Probabilistic    Pheromone    Pheromone variations
FPA         Uniform & Lévy   g∗           Constancy and similarity
3.2 Characteristics of SI Algorithms
Based on the above brief descriptions of some SI-based algorithms, we can now
analyze them in terms of their main characteristics such as randomization techniques, selection mechanisms and the potential mechanisms driving the evolution,
as summarized in Table 3. Here, we include genetic algorithms (GA) for the
purpose of comparison [19].
From this table, we can see that almost all algorithms use some sort of best
solutions such as the centre of the swarm. Some algorithms such as PSO, BA
and FPA use the current global best solution explicitly in their formulations,
while others such as FA, CS and ACO use it in an implicit way. One of the
advantages of explicit use of g∗ is that it provides a direct driving force in the
governing equations, and thus it may be able to speed up the convergence.
However, if this driving force is too strong, it may lead to premature convergence, as can often be observed in PSO. On the other hand, the
implicit use (either by ranking or post-processing) can lead to a higher probability of finding the true global best solution, thus potentially avoiding some form
of premature convergence. However, it may slow down the search process due
to a weaker driving force for evolution.
Another advantage without the direct use of g∗ is that multiswarms can
occur as in the case of FA due to its nonlinear attraction among different
fireflies. In the standard firefly algorithm, as the short-distance attraction is
stronger than long-distance attraction, the whole population can automatically subdivide into many sub-swarms, and each sub-swarm can potentially
swarm around a local mode, and among all the modes, there is certainly the
global best solution. All this makes it natural for FA to deal with multimodal
optimization problems effectively.
It is worth pointing out that the above analysis is just one way to analyze
SI-based algorithms from a higher-level but qualitative perspective. Another
way is to use rigorous mathematical theories, which will be the
focus of the next section.
4 Mathematical Framework for Algorithms
Mathematical frameworks for analyzing algorithms can be based on dynamic systems,
fixed-point theory, Markov chain theory, self-organization, filtering and others.
Though our long-term intention is to build a solid mathematical framework
to analyze algorithms, it is not possible to achieve such a huge task
in a single paper. Instead, we would like to highlight a few key points so as to
inspire the research community to carry out more research in this area.
4.1 Fixed-Point Theory
Traditional numerical analysis tends to focus on the iterative nature of an
algorithm A(xt ) and see how the solution xt evolves with a pseudo-time iteration
counter t. From the well-known Newton’s method for finding the roots of a
nonlinear function f (x)
\[ x^{t+1} = x^t - \frac{f(x^t)}{\nabla f(x^t)}, \tag{15} \]
we know that when the solution sequence converges, we have
\[ \lim_{t \to \infty} x^t = x^*, \tag{16} \]
where x∗ is the final converged solution which is essentially the fixed point.
The general theory is the fixed-point theory which dictates how an iterative
formula may evolve and lead to a fixed point in the search space [47].
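As a small worked illustration of Eqs. (15) and (16), the sketch below applies Newton's iteration to a one-dimensional function, where the gradient reduces to the ordinary derivative. The test function f(x) = x² − 2, the starting point, the tolerance and the function names are assumptions chosen for illustration.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate Eq. (15) until the update becomes negligible."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x = x - step          # Eq. (15) in one dimension
        if abs(step) < tol:
            break
    return x


def f(x):
    return x * x - 2.0


def df(x):
    return 2.0 * x


root = newton(f, df, x0=1.0)
# At convergence, one more application of the iteration map leaves the
# solution (almost) unchanged, i.e. it is a fixed point as in Eq. (16).
residual = abs((root - f(root) / df(root)) - root)
```

Here the converged value is a root of f and, at the same time, a fixed point of the map x → x − f(x)/f′(x).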
For a population of solutions in any SI-based algorithm, the solutions
can interact with each other, which may lead to multiple fixed points,
depending on the algorithmic dynamics of each algorithm. It can be expected
that g∗ acts as a fixed point in PSO, while there are multiple fixed points in the
firefly algorithm. In fact, we can hypothesize that there is a single fixed point
in BA, PSO, simulated annealing, FPA and bee algorithm, while multiple fixed
points can exist in FA, CS, ACO and genetic algorithms if the conditions are
right. However, it is not clear yet what these conditions should be, and such
conditions may also be problem dependent. More research is much needed in
this area.
4.2 Dynamic System
The first analysis of PSO using dynamic system theory was carried out by
Clerc and Kennedy [10], and they linked the governing equations of PSO with
the dynamic behaviour of particles under different parameter settings. In fact,
their analysis suggested that the PSO system is governed by the eigenvalues
of a system matrix
$$\lambda_{1,2} = 1 - \frac{\gamma}{2} \pm \frac{\sqrt{\gamma^2 - 4\gamma}}{2}, \qquad (17)$$
which leads to a bifurcation at γ = α + β = 4. Such analysis can indeed provide
some insight into the working mechanism and main characteristics of PSO; however,
it does not provide a full picture of the system due to the assumptions and
simplifications used in the analysis.
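
The change of behaviour at γ = 4 can be checked numerically by evaluating the eigenvalue formula in Eq. (17) directly. The sketch below only evaluates that formula for a few sample values of γ (chosen here for illustration); it does not reproduce the full analysis in [10].

```python
import cmath


def pso_eigenvalues(gamma):
    """Return (lambda_1, lambda_2) from Eq. (17) for gamma = alpha + beta."""
    root = cmath.sqrt(gamma * gamma - 4.0 * gamma)
    return 1.0 - gamma / 2.0 + root / 2.0, 1.0 - gamma / 2.0 - root / 2.0


# Sample values on both sides of the bifurcation point gamma = 4.
for gamma in (1.0, 3.0, 4.0, 5.0):
    lam1, lam2 = pso_eigenvalues(gamma)
    print(gamma, lam1, lam2, abs(lam1), abs(lam2))
```

For 0 < γ < 4 the discriminant is negative, so the two eigenvalues form a complex conjugate pair with unit modulus. For γ > 4 they are real with product one, so one of them lies outside the unit circle.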
Though in principle we can use a similar method to analyze other algorithms,
it becomes difficult to extend it to a generalized system. For
example, in FA, CS and ACO, the nonlinearity makes it difficult to figure
out the eigenvalues because the matrix will depend on the current solution,
randomization and other factors. In addition, nonlinearity in FA also means
that its characteristics can be much richer than those of simple linear dynamics
such as PSO. Thus, this method may become intractable in general and may not
be very useful for gaining insight into these algorithms.
4.3 Markov Chain Theory
From the probability perspective, the generation of solutions by an algorithm is
a statistical sampling process, similar to Monte Carlo methods [21]. In a more
general sense, the solution set generated by an algorithm essentially forms a
system of Markov chains. A Markov chain is a chain whose next state depends only
on the current state and the transition probability. In this sense, Markov chain
theory can provide a generalized framework for analyzing SI-based algorithms.
In fact, a simple analysis of genetic algorithms using Markov chain theory was
carried out by Suzuki [48], and a discrete-time Markov chain approach has
been used to prove that the flower pollination algorithm can have guaranteed
global convergence [20].
On the other hand, a generalized approach has been designed using a
Markov chain for global optimization [18]; however, this approach may converge
more slowly than SI-based algorithms. This methodology can provide a quite
general framework for optimization.
It is worth pointing out that Markov chain theory can be rigorous and can
provide some significant insight into the algorithms. In theory, the
largest eigenvalue of a proper Markov chain is one, while the second largest
eigenvalue λ2 of the transition probability matrix essentially controls the rate
of convergence of the Markov chain. But in practice it is very challenging to
find this eigenvalue, and even estimates can be difficult to obtain. Therefore,
the information and insight we can obtain are limited, which may also limit
the practical use of this approach.
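
The role of λ2 can be illustrated on a toy example where it is known in closed form. The two-state chain below, with hypothetical switching probabilities a and b, is an assumption made purely for illustration; its transition matrix has eigenvalues 1 and 1 - a - b, and the distance to the stationary distribution shrinks by the factor λ2 at every step.

```python
def step(dist, P):
    """One step of a row-stochastic chain: new_j = sum_i dist_i * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Hypothetical two-state chain with switching probabilities a and b.
a, b = 0.3, 0.1
P = [[1.0 - a, a],
     [b, 1.0 - b]]

lambda2 = 1.0 - a - b                      # second eigenvalue of this 2x2 matrix
stationary = [b / (a + b), a / (a + b)]    # stationary distribution of the chain

dist = [1.0, 0.0]                          # start entirely in the first state
errors = []
for _ in range(10):
    errors.append(abs(dist[0] - stationary[0]))
    dist = step(dist, P)

# Ratio of successive errors: the per-step contraction factor.
ratios = [errors[t + 1] / errors[t] for t in range(len(errors) - 1)]
print(lambda2, ratios[:3])
```

For realistic algorithms the state space is far too large to write the transition matrix down, which is why finding or even estimating λ2 is difficult in practice.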
4.4 Self-Organization
As we have seen earlier, an algorithm can be considered as a self-organizing
system where multiple agents sample the search space, driven by a selection
mechanism, evolving according to a predefined procedure or a set of algorithmic
equations. The iterative evolution will usually lead to a converged set of
solutions that may correspond to the optimal solutions to the problem under
consideration. There are similarities and differences between algorithmic
evolution and self-organization as summarized in Table 2. However, such a
comparison and perspective only capture the qualitative nature of the algorithm.
Though the insight can be at a higher level, it lacks crucial details about how
the self-organized states emerge, under what conditions and how quickly such
converged states can be reached. Unless new theory about self-organization
emerges, the information we can gain is mainly qualitative. Key information
and properties may need to be obtained by other means.
4.5 Other Approaches
Sometimes, it may not be easy to put some studies into a fixed category,
but their results can be equally useful [51]. For example, Zaharie carried out
a variance analysis of the population and the effect of crossover in differential
evolution [61]. The variance provides some information about the diversity of
the population during the iterations.
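
As a rough illustration of variance as a diversity measure, the sketch below averages the per-dimension variance of a population. The function name, the random population and the contraction factor are assumptions for illustration only; this is a generic measure, not the analysis carried out in [61].

```python
import random
import statistics


def population_variance(population):
    """Mean over dimensions of the per-dimension population variance."""
    dims = list(zip(*population))  # transpose: one tuple per dimension
    return sum(statistics.pvariance(d) for d in dims) / len(dims)


rng = random.Random(0)
# Hypothetical initial population: 20 individuals in 3 dimensions.
spread = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
# Hypothetical later population, contracted towards the origin by a factor of 10.
contracted = [[0.1 * x for x in ind] for ind in spread]

print(population_variance(spread), population_variance(contracted))
```

Scaling every individual by 0.1 reduces the variance by a factor of 0.01, so a shrinking value of this measure over the iterations indicates a loss of diversity.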
4.6 Multidisciplinary Approach
From the above discussion, it seems that any single approach can give only a part
of the full information and insight. Different approaches and perspectives can
provide different insights, potentially complementary to each other. Therefore,
to truly understand a complex algorithm, it may be useful to combine all these
approaches so as to build a fuller picture of the algorithm. It can
be expected that a multidisciplinary framework can be formulated to analyze
algorithms comprehensively.
5 Trends and Future Challenges
The above analyses and discussions about SI-based algorithms have laid the
foundation for us to turn our attention to possible future developments.
Obviously, it is not possible to predict what future research will be, but we
hope to inspire more studies in the important directions concerning swarm
intelligence and its applications in optimization, machine learning and data
mining. Therefore, we would like to highlight some of the key challenges in this
area.
5.1 Some Key Challenges
Many challenging issues exist concerning swarm intelligence, and it is not
our intention to address every aspect of these challenges. As an example, the
emergence of intelligent behaviour among a complex swarm is still poorly
understood, and this issue will not be addressed here. Therefore, we can only
focus on a small but key set of challenging issues as outlined below.
– Parameter tuning and control: All algorithms have algorithm-related parameters,
and some algorithms have more parameters than others. In general, the setting of
these parameters can affect the algorithm significantly,
though some parameters may have a weak influence, while others may have
a strong influence. In theory, these parameters should be tuned so as to
maximize the performance of the algorithm; however, such parameter tuning is not
a trivial task [14]. Even with well-tuned parameters, there is no
strong reason why they should remain fixed. It may be advantageous to
vary the parameters during iterations, and the proper variation of parameters
is called parameter control. Both parameter tuning and control
can be considered as a higher level of optimization; that is, the optimization
of optimization algorithms.
Currently, most tuning is done using parametric studies, while
parameter control uses stochastic adaptivity where certain parameters are
allowed to vary randomly within a predefined range. Ideally, parameter tuning
and control can be done automatically, such as in the self-tuning framework by
Yang et al. [59]. However, the computational costs may still be
high. Therefore, there is a strong need to find an effective way to tune
parameters both automatically and adaptively.
– Optimal exploration and exploitation: An efficient algorithm should be able
to balance the exploration of the search space and the exploitation of the
landscape information. Exploration can increase the diversity and thus
increase the probability of finding the global optimal solutions, while
exploitation uses local information to enhance the search process. However,
too much exploration and too little exploitation will slow down the search
process, while too much exploitation and too little exploration will lead to
premature convergence. The optimal balance can be difficult to find, and
empirical observations suggest that such a balance may also be problem-specific.
How to achieve such a balance is still an open problem, though it
is possible to produce a better balance under certain conditions.
– Large-scale problems and algorithm scalability: The current literature seems
to suggest that SI-based algorithms can be effective in solving various design
problems, and there is some indication that they can even solve highly
complex NP-hard problems. However, the case studies in the current literature
have mostly concerned optimization problems with the number of variables ranging
from a few to a few hundred. Compared to real-world applications, the
dimensionality tested is relatively low, and it is not clear if these algorithms
can be directly applied to large-scale problems. The true scalability
is yet to be tested. There is a strong need to test problems with more than
a thousand variables, or even many more.
– Mathematical framework: As we discussed earlier, there are different ways
of looking at SI-based algorithms and analyzing them from different perspectives
such as stability, dynamic systems, Markov chain theory and self-organization.
However, there is no unified framework yet that can provide
a fuller picture of an algorithm, concerning convergence, rate of convergence,
stability, ergodicity, repeatability and scalability. It is highly likely
that any unified framework for theoretical analysis will be a multidisciplinary
approach, looking at algorithms from all angles and perspectives.
– Rate of convergence and control: From both theoretical and practical
perspectives, the rate of convergence is extremely important. After all, we
want the best solution to a problem quickly with the minimum computational
costs. Even though we can largely understand the characteristics of
many algorithms, this does not mean that we can control their behaviour,
especially the rate of convergence in practice. Loosely speaking, the rate of
convergence can depend on many factors such as the intrinsic components,
structure, parameter values and initial configuration of an algorithm, and
such dependence can be complex, indirect and nonlinear. Even when
it is possible to figure out the rate of convergence, it may be difficult
to control it so as to maximize the search efficiency. It can be expected
that such control can be interlinked with parameter tuning and control.
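
The idea of parameter control by stochastic adaptivity mentioned above can be sketched with a very simple random search in which the step size is redrawn within a predefined range at every iteration. The objective function, the step range, the greedy acceptance rule and all names below are illustrative assumptions; this is not one of the SI-based algorithms discussed in this paper.

```python
import random


def sphere(x):
    """Illustrative objective: sum of squares, with minimum 0 at the origin."""
    return sum(v * v for v in x)


def search_with_parameter_control(objective, x0, step_range=(0.01, 1.0),
                                  iterations=500, seed=1):
    """Greedy random search whose step size is redrawn at every iteration."""
    rng = random.Random(seed)
    best, best_val = list(x0), objective(x0)
    history = [best_val]
    for _ in range(iterations):
        # Parameter control: the step size varies randomly within a fixed range.
        step = rng.uniform(*step_range)
        candidate = [v + step * rng.gauss(0.0, 1.0) for v in best]
        val = objective(candidate)
        if val < best_val:  # keep only improving moves
            best, best_val = candidate, val
        history.append(best_val)
    return best, best_val, history


best, best_val, history = search_with_parameter_control(sphere, [3.0, -4.0])
print(best_val)
```

Large steps favour exploration and small steps favour exploitation, so drawing the step size at random mixes the two. Parameter tuning, by contrast, would fix one step size chosen from a parametric study.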
5.2 Recommendations for Future Research
With the key challenges we just outlined, it is highly recommended to carry
out further research to address such challenges. Therefore, research priority
should be given to the following areas:
– Theoretical framework: Due to its importance in understanding how algorithms
work, theoretical analysis should be one of the top priorities
in the near future. Theoretical analysis can provide more insight into
algorithms, allowing us to identify the best types of problems to solve and to
tune or control the parameters more effectively. This may also allow us to
design better and more effective algorithms and tools.
– Hybridization: Though some algorithms are very effective in solving certain
types or even a wider range of problems, studies suggest that hybridization
can be powerful by combining the advantages of different algorithms [50].
In fact, hybrid algorithms have been attempted for many years, but the
hybridization process is still largely trial and error. It is not yet clear how
to combine different algorithms so as to produce a better hybrid.
– Self-tuning and self-adaptive algorithms: As we mentioned earlier, the tuning
of algorithm-dependent parameters is a challenging task. The control of
parameters is also a difficult task. Ideally, a truly useful algorithm should
be able to self-tune and self-adapt to suit different types of problems
[59]. The main unanswered questions are: what is the best way for algorithms
to be self-tuning and self-adaptive? For a given set of algorithms,
how can we adapt them to new problems without any prior knowledge?
– Diverse applications: The usefulness of algorithms lies in their ability to
solve a wide range of problems, especially large-scale, real-world applications.
After all, there are many optimization problems that need to be solved in
all areas of science, engineering, industry and business applications.
– Intelligent tools: Several decades of intensive research in algorithms and
optimization have enabled researchers to design better and more effective
tools. However, no one can claim to have produced truly intelligent
tools that can solve problems automatically, quickly and intelligently. In
fact, there are many related questions concerning this issue. For example,
what do we mean by an ‘intelligent algorithm’? Can algorithms really be
intelligent? Questions like these can be endless, but we may at least wish
to know what the minimum components are that make an algorithm
sufficiently intelligent.
Obviously, there are other important directions and active research topics
concerning swarm intelligence, optimization and machine learning. One
important topic is to use a good combination of new algorithms with traditional
techniques, because traditional techniques have been well established
and tested, and they are among the most useful ones for specific classes of
problems. New methods will be most needed when traditional methods do not work
well. In addition, even if algorithms are efficient, proper implementation and
parallelization can make them even more useful in practice.
6 Conclusions
Swarm intelligence is an interesting and important area, and swarm intelligence
based algorithms have permeated into almost all areas of sciences and
engineering. Accompanying their success and popularity, there are some key
issues to be addressed. In this paper, we have first reviewed the essence of
swarm intelligence, and then linked algorithms and swarm intelligence to
self-organization of complex systems. Then, we highlighted some SI-based
algorithms and subsequently analyzed their main components, characteristics and
properties, followed by a more theoretical approach using fixed-point theory,
dynamic systems and Markov chain theory. Finally, we have also outlined some
key challenges and provided some recommendations for addressing such issues.
It is the authors' hope that more research can be inspired concerning swarm
intelligence and nature-inspired computation so as to solve a diverse range of
optimization problems in real-world applications.
References
1. D.F. Alam, D.A. Yousri, M.B. Eteiba, Flower pollination algorithm based solar PV
parameter estimation, Energy Conversion and Management, 101(2), 410–422 (2015).
2. W. R. Ashby, Principles of the self-organizing system, in: Principles of
Self-Organization: Transactions of the University of Illinois Symposium (Eds H. Von Foerster
and G. W. Zopf, Jr.), Pergamon Press, London, UK, pp. 255–278 (1962).
3. A. Banks, J. Vincent, C. Anyakoha, A review of particle swarm optimization. Part II:
hybridisation, combinatorial, multicriteria and constrained optimization, and indicative
applications, Natural Computing, 7(1), 109–124 (2008).
4. G. Bekdas, S.M. Nigdeli, X.S. Yang, Sizing optimization of truss structures using flower
pollination algorithm, Applied Soft Computing, 37, 322-331 (2015).
5. D. Berlinski, The Advent of the Algorithm: The 300-Year Journey from an Idea to the
Computer, Harvest Book, New York, (2001).
6. E. Bonabeau, M. Dorigo, G. Theraulaz, Swarm Intelligence: From Natural to Artificial
Systems, Oxford University Press, Oxford, (1999).
7. C. Blum and A. Roli, Metaheuristics in combinatorial optimization: overview and
conceptual comparison, ACM Comput. Survey, 35(2), 268–308 (2003).
8. S. Carbas, Design optimization of steel frames using an enhanced firefly algorithm,
Engineering Optimization, 48(12), 2007–2025 (2016).
9. J.L. Chabert, A History of Algorithms: From the Pebble to the Microchip,
Springer-Verlag, Heidelberg, (1999).
10. M. Clerc, J. Kennedy, The particle swarm: Explosion, stability and convergence in a
multidimensional complex space, IEEE Trans. Evol. Comput., 6(1), 58-73 (2002).
11. D. W. Corne, A. Reynolds, E. Bonabeau, Swarm Intelligence, in: Handbook of Natural
Computing (Eds. G. Rozenberg, T. Bäck, J. N. Kok), Springer, pp. 1599-1622 (2012).
12. Z.H. Cui, B. Sun, G. Wang, Y. Xue, J.J. Chen, A novel oriented cuckoo search
algorithm to improve DV-Hop performance for cyber-physical systems, J. Parallel Distrib.
Comput., 103(1), 42–52 (2017).
13. S.M. Darwish, Combining firefly algorithm and Bayesian classifier: new direction for
automatic multilabel image annotation, IET Image Processing, 10(10), 763–772 (2016).
14. A. E. Eiben and S. K. Smit, Parameter tuning for configuring and analyzing evolutionary
algorithms, Swarm and Evolutionary Computation, 1(1), 19-31 (2011).
15. A. Gálvez and A. Iglesias, New memetic self-adaptive firefly algorithm for continuous
optimisation, Int. J. Bio-Inspired Computation, 8(5), 300–317 (2016).
16. A.H. Gandomi, X.S. Yang, Chaotic bat algorithm, Journal of Computational Science,
5(2), 224–232 (2014).
17. A.H. Gandomi, X.S. Yang, A.H. Alavi, Cuckoo search algorithm: a metaheuristic
approach to solve structural optimization problems, Engineering with Computers, 29(1),
17–35 (2013).
18. A. Ghate and R. Smith, Adaptive search with stochastic acceptance probabilities for
global optimization, Oper. Res. Lett., 36 (3), 285-290 (2008).
19. D. E. Goldberg, Genetic Algorithms in Search, Optimisation and Machine Learning,
Reading, Mass.: Addison Wesley, (1989).
20. X. S. He, X. S. Yang, M. Karamanoglu, Y. X. Zhao, Global convergence analysis of the
flower pollination algorithm: a discrete-time Markov chain approach, Procedia Computer
Science, 108, 1354-1363 (2017).
21. G. S. Fishman, Monte Carlo: Concepts, Algorithms and Applications, Springer, New
York, (1995).
22. L. Fisher, The Perfect Swarm: The Science of Complexity in Everyday Life, Basic Books,
(2009).
23. I. Fister, I. Fister, X.S. Yang, J. Brest, A comprehensive review of firefly algorithms,
Swarm and Evolutionary Computation, 13(1), 34–46 (2013).
24. I. Fister, X.S. Yang, J. Brest, I. Fister Jr., Modified firefly algorithm using quaternion
representation, Expert Systems with Applications, 40(18), 7220–7230 (2013).
25. S. Fong, S. Deb, X. S. Yang, A heuristic optimization method inspired by wolf preying
behavior, Neural Computing and Applications, 26(7), 1725-1738 (2015).
26. S. Kashi, A. Minuchehr, N. Poursalehi, A. Zolfaghari, Bat algorithm for the fuel arrangement optimization of reactor core, Annals of Nuclear Energy, 64, 144-151 (2014).
27. J. Kennedy and R. C. Eberhart, Particle swarm optimization. Proc. of IEEE International Conference on Neural Networks, Piscataway, NJ: IEEE Press, pp. 1942-1948
(1995).
28. E. F. Keller, Organisms, machines, and thunderstorms: a history of self-organization,
part two. Complexity, emergence, and stable attractors, Historical Studies in the
Natural Sciences, 39(1), 1–31 (2009).
29. A. Khare, S. Rangnekar, A review of particle swarm optimization and its applications
in solar photovoltaic system, Applied Soft Computing, 13(5), 2997–3006 (2013).
30. J.M. Ma, T.O. Ting, K.L. Man, N. Zhang, S.U. Guan, P.W.H. Wong, Parameter estimation of photovoltaic models via cuckoo search, Applied Mathematics, Volume 2013,
Article ID 362619 (8 pages), (2013). http://dx.doi.org/10.1155/2013/362619
31. M. Marichelvam, T. Prabaharan, X. S. Yang, Improved cuckoo search algorithm for
hybrid flow shop scheduling problems to minimize makespan, Applied Soft Computing,
19(1), 93-101 (2014).
32. M.K. Marichelvam, P. Thirumoorthy, X.S. Yang, A discrete firefly algorithm for the
multi-objective hybrid flowshop scheduling problems, IEEE Trans. Evolutionary Computation, 18(2), 301–305 (2014).
33. P. Miller, Swarm theory, National Geographic, July 2007.
34. E. Osaba, X. S. Yang, F. Diaz, E. Onieva, A.D. Masegosa, A. Perallos, A discrete
firefly algorithm to solve a rich vehicle routing problem modelling a newspaper
distribution system with recycling policy, Soft Computing, Online First, (2016).
https://doi.org/10.1007/s00500-016-2114-1
35. E. Osaba, X.S. Yang, F. Diaz, P. Lopez-Garcia, R. Carballedo, An improved discrete
bat algorithm for symmetric and asymmetric traveling salesman problems, Engineering
Applications of Artificial Intelligence, 48(1), 59–71 (2016).
36. A. Ouaarab, B. Ahiod, X.S. Yang, Random-key cuckoo search for the travelling salesman
problem, Soft Computing, 19(4), 1099–1106 (2015).
37. M. Reyes-Sierra and C. A. Coello Coello, Multi-objective particle swarm optimizers:
A survey of the state-of-the-art, Int. J. of Computational Intelligence Research, 2(3),
287–308 (2006).
38. D. Rodrigues, G.F. Silva, J.P. Papa, A.N. Marana, X.S. Yang, EEG-based person identification through binary flower pollination algorithm, Expert Systems with Applications,
62(1), 81–90 (2016).
39. D. Rodrigues, L.A.M. Pereira, R.Y.M. Nakamura, K.A.P. Costa, X.S. Yang, A.N.
Souza, J.P. Papa, A wrapper approach for feature selection based on bat algorithm
and optimum-path forest, Expert Systems with Applications, 41(5), 2250–2258 (2014).
40. K. E. Parsopoulos and M. N. Vrahatis, Particle Swarm Optimization and Intelligence:
Advances and Applications, Information Science Publishing (IGI Global), (2010).
41. P. Srivastava, M. Chis, S. Deb and X. S. Yang, An efficient optimization algorithm for
structural software testing, Intl. Journal of Artificial Intelligence, 8(12), 68-77 (2012).
42. A. M. Reynolds, C. J. Rhodes, The Lévy flight paradigm: random search patterns and
mechanisms, Ecology, 90(4), 877-887 (2009).
43. J. Senthilnath, S.N. Omkar, V. Mani, Clustering using firefly algorithm: performance
study, Swarm and Evolutionary Computation, 1(3), 164–171 (2011).
44. A. Soleimani, Combined particle swarm optimization and canonical sign digit to design
finite impulse response filter, Soft Computing, 19(2), 407–419 (2015).
45. R. Storn and K. Price, Differential evolution: a simple and efficient heuristic for global
optimization over continuous spaces, J. Global Optimization, 11(4), 341-59 (1997).
46. J. Surowiecki, The Wisdom of Crowds, Anchor Books, (2004).
47. E. Süli and D. Mayer, An Introduction to Numerical Analysis, Cambridge University
Press, Cambridge, (2003).
48. J. A. Suzuki, A Markov chain analysis on simple genetic algorithms, IEEE Trans. Sys.
Man Cybern., 25(4), 655-9 (1995).
49. S.L. Tilahun, J.M.T. Ngnotechouye, Firefly algorithm for discrete optimization problems: A survey, KSCE Journal of Civil Engineering, 21(2), 535–545 (2017).
50. O. Ting, X.S. Yang, S. Cheng, K.Z. Huang, Hybrid metaheuristic algorithms: Past,
present, and future, in: Recent Advances in Swarm Intelligence and Evolutionary
Computation (Ed. X.S. Yang), Studies in Computational Intelligence 585, pp. 71–83 (2015).
51. M. Villalobos-Arias, C. A. Coello Coello, O. Hernández-Lerma, Asymptotic convergence of
metaheuristics for multiobjective optimization problems, Soft Computing, 10(11),
1001-5 (2005).
52. H. Wang, W.J. Wang, X.Y. Zhou, H. Sun, J. Zhao, X. Yu, Z.H. Cui, Firefly algorithm
with neighborhood attraction, Information Sciences, 382-383(1), 374-387 (2017).
53. D. H. Wolpert and W. G. Macready, No free lunch theorem for optimization, IEEE
Trans. Evol. Comput., 1(1), 67-82 (1997).
54. D. H. Wolpert and W. G. Macready, Coevolutionary free lunches, IEEE Trans. Evol.
Comput. , 9(6), 721-735 (2005).
55. X. S. Yang, Bat algorithm for multi-objective optimisation, Int. J. Bio-Inspired Computation, 3(5), 267-274 (2011).
56. X. S. Yang and S. He, Bat algorithm: literature review and applications,
Int. J. Bio-Inspired Computation, 5(3), 141-149 (2013).
57. X. S. Yang, M. Karamanoglu and X. S. He, Flower pollination algorithm: A novel
approach for multiobjective optimization, Engineering Optimization, 46(9), 1222-1237
(2014).
58. X. S. Yang and S. Deb, Multi-objective cuckoo search for design optimization,
Computers and Operations Research, 40(6), 1616–1624 (2013).
59. X.S. Yang, S. Deb, M. Loomes, M. Karamanoglu, A framework for self-tuning optimization algorithm, Neural Computing and Applications, 23(7-8), 2051-2057 (2013).
60. X.S. Yang, S. Deb, S. Fong, X.S. He, Y.X. Zhao, From swarm intelligence to metaheuristics: nature-inspired optimization algorithms, Computer, 49(9), 52–59 (2016).
61. D. Zaharie, Influence of crossover on the behaviour of the differential evolution algorithm, Applied Soft Computing, 9(3), 1126-38 (2009).
62. C.X. Zhao, C.Z. Wu, J. Chai, X.Y. Wang, X.M. Yang, M. Lee, M.J. Kim,
Decomposition-based multi-objective firefly algorithm for RFID network planning with
uncertainty, Applied Soft Computing, 55, 549–564 (2017).
63. Y. Zhou, R. Wang, Q. Luo, Elite opposition-based flower pollination algorithm,
Neurocomputing, 188, 294–310 (2016).
64. M. Zineddine, Vulnerabilities and mitigation techniques toning in the cloud: a cost and
vulnerabilities coverage optimization approach using cuckoo search algorithm with Lévy
flights, Computers & Security, 48(1), 1–18 (2015).
65. D. Zouache, F. Nouioua, A. Moussaoui, Quantum-inspired firefly algorithm with particle
swarm optimization for discrete optimization problems, Soft Computing, 20(7), 2781–
2799 (2016).