Abstract— This paper presents some guidelines for conducting a systematic review (SR), also referred to as a systematic literature review, in the field of systems and automatic engineering. Inspired by the available literature on SRs in other fields, this paper presents an adaptation of this kind of review. In particular, we provide the advantages and limitations of the most used databases in this area and some advice on defining the boolean function for the search and the inclusion and exclusion criteria of the selected publications. The paper proposes some recommendations to extract and synthesize the data collected and, finally, some guidelines to create the final report. This methodology is applied to a practical case: distributed estimation techniques applied to cyber-physical systems (CPS).

Index Terms— Systematic review, Automatic and systems engineering, Distributed estimation, Cyber-physical systems.

1 Authors are with Departamento de Ingeniería, Universidad Loyola Andalucía, Sevilla, Spain {cierardi,dorihuela,ijurado}

I. INTRODUCTION

Nowadays, with the development of new information and communication technologies, the amount of information available and its ease of acquisition have increased enormously. Perhaps surprisingly, this makes it increasingly difficult to carry out an accurate and selective search on a specific topic to prepare a bibliographic review. The systematic review (SR) is presented in this context as a solution to this problem. In [10] an SR is defined as a means to evaluate and interpret all available research relevant to a particular research question, thematic area or phenomenon of interest. The SR's aim is to present a fair evaluation of a research topic using a reliable, rigorous and verifiable methodology. An SR is a form of secondary study, whereas individual studies contributing to the review are termed primary studies [10]. Briefly, an SR is a very useful study when there is a concrete research question, generally related to different subjects, with several primary studies, perhaps with divergent objectives and/or results, that can generate uncertainty about the process.

This methodology is commonly used in many areas, among which medicine, psychology, biology, economics and software engineering stand out. In the area of systems and automatic engineering, literature reviews are usually made without following a defined methodology, so they cannot be replicated by other authors or may be biased by the reviewer's criteria. Normally, these reviews are made by long-experienced researchers who provide a global view of the subject in question, such as the reviews [3], [19].

The methodology to elaborate the SR that will be used in this work is inspired by the guide proposed in [10], modified accordingly to suit the particular requirements of our field, and follows the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analysis) method [16]. The objective of this research is to clearly explain the different stages to follow to achieve an SR in systems and automatic engineering. As a contribution, this research proposes some modifications to the aforementioned guidelines to adapt them to the considered field: available engineering databases, with the kind of search that can be done and their coverage, adapted inclusion and exclusion criteria, boolean operators for the search, definition of the boolean function, etc.

In addition, a practical case on the execution of an SR on the techniques of distributed estimation of cyber-physical systems is presented in parallel. Cyber-physical systems are complex systems composed of entities of different nature that interact with a given physical medium, and which can simultaneously have communication, computation and control capabilities, through which they can involve humans, animals and biological processes [11]. The aim of this SR is to get to know and compare all the techniques related to distributed estimation that have been successfully applied to any kind of cyber-physical system. To be considered for inclusion in the review, the primary studies must deal with some sort of dynamical CPS and the distributed estimators or observers must share some sort of information (excluding pure decentralized estimators). Among other features, we are interested in the amount of information that those methods require to transmit, the type of communication protocol that needs to be implemented, and whether the design of the estimator is made in a centralized or a distributed way. This paper presents a table including all this information so that the reader may get a quick perspective on the available results.

This paper is organized as follows. Section II summarizes the steps of the SR. The planning of the review is discussed in Section III. Finally, the stages for the conducting and reporting of the review are presented respectively in Sections IV and V. The conclusions and pending work are discussed in Section VI.
[Diagram blocks: Databases; Inclusion and exclusion criteria; Boolean function; Selection of primary studies; Extraction and synthesis of data; Reporting; Documentation of the extracted data]
Fig. 1. Phases and sub-phases of the systematic review

III. PLANNING THE REVIEW

Once the research topic has been defined and the need to initiate an efficient SR has been established, it is highly recommended to follow the different stages that constitute it. The most demanding and most significant phase, due to its influence on subsequent steps, is the planning of the review, which consists of: 1) Definition of a clear and precise research question; 2) Choice of the databases; 3) Establishment of inclusion [...]

[...] conferences, etc.), and the kind of search (electronic or manual searches, or a combination of both) must be specified.

When choosing the databases to conduct the search, it is important to be sure that they cover, at least, the content of the most important publishers. However, the coverage of each database is sometimes not easy to discover, since the databases do not provide it in an exact way.

In the authors' opinion, for the field of automatic and systems engineering it is advisable to choose among the databases listed in Table I. Regarding the search fields that these databases allow, the table uses the following acronyms: A = abstract, T = article title, K = keywords, F = full text. Depending on the database, the set of abstract, title and keywords (A + T + K), which is a very common choice, is named differently: for example, in Web of Science this field is named “Topic”; in IEEE Xplore, “Metadata”.

The search strategy may differ depending on the chosen database, as some are better prepared than others for an advanced search. Some of the databases, as shown in Table I, let us search directly in certain fields of text (IEEE Xplore, Web of Science, ScienceDirect, Scopus); others do not (SpringerLink, Wiley, Google Scholar). Some databases allow only a finite number of search terms (such as IEEE Xplore, which has a limit of 15 terms), while others have no limits. Almost all databases support operators, boolean and of other types. Table I includes the operators that can be used in each database. For a detailed explanation of the supported operators, researchers should consult the respective database, since not all databases use the same operators in the same way.

The download limits are substantially different, ranging from just 1 citation at a time (Google Scholar) to up to 2000 documents (Scopus, among others). Table I also reflects the format in which the results can be downloaded, which is important to consider if a reference manager (Mendeley, BibDesk, etc.) will be used, since each of them works with a different file format. Finally, the table indicates what information can be extracted from the database after the search has been conducted. In some databases, in addition to the title, the user can download the abstract as well. As we will explain later, this will facilitate the subsequent tasks.

Case study: For the case study, we have decided to exclude those databases that do not allow searching in the abstract, title and keywords. In addition, we have checked that the selected databases have enough coverage to conduct the planned review. Table II presents this information in a graphical way, using the information provided by each database. The period of interest of the search is from 01/01/90 to 02/10/2017.

TABLE II
Legend: IT=IET, PE=Pergamon-Elsevier, ES=Elsevier Science, WB=Wiley Blackwell, TF=Taylor & Francis, SP=Springer, SI=SIAM Publications, OX=Oxford University Press, KO=Korean Inst. Electrical Eng., SA=Sage Publications, AS=ASME, MP=Microtome Publications
[Graphical coverage grid. Columns: IE, IT, PE, ES, WB, TF, SP, SI, OX, KO, SA, AS, MP. Rows: IEEEX, Scopus, WoS, SD]

C. Inclusion and exclusion criteria

Once the potentially relevant studies for each chosen database have been obtained, their actual suitability for the research topic has to be evaluated. The study selection criteria aim to identify the primary studies, that is, those with direct evidence on the research question. In order to reduce the likelihood of bias, the selection criteria should be decided at the outset, although they may be refined during the search process. There are two types of criteria, exclusion and inclusion. The former indicate that an article presenting any of the listed points will be excluded. The latter comprise all the characteristics that each chosen article needs to have. As will be explained later, the number of people involved in an SR shall be at least 2. For this reason, the criteria should be clear, to ensure that they can be interpreted reliably and the studies can be classified correctly. Many of these criteria are common to all SRs, such as the language of the articles, the minimum number of pages, etc. [10], [16].

Case study: Tables III and IV list the chosen criteria for the selection of an article. The so-called secondary studies and the gray literature include books or book chapters previously or subsequently published in journal papers, poster presentations, abstracts, reviews and surveys, conference papers that have given rise to journal articles, or doctoral theses. It has to be highlighted that doctoral theses have been included in the gray literature because they usually give rise to journal articles. Mainly, we include novel results that have passed a peer-review process. In systems and automatic engineering, the realization of experiments and/or simulations could be required. The proposition or just usage
of a certain technique could be distinguished as well.

TABLE III
INCLUSION CRITERIA
• Full paper available (through search engines or by contacting the authors)
• Use or propose a distributed estimation technique on cyber-physical systems, heterogeneous systems or system of systems or make specific reference to humans, animals or biological systems
• Use a distributed estimator with some sort of communication between local estimators
• The system to be estimated must have dynamics

TABLE IV
EXCLUSION CRITERIA
• Secondary studies and gray literature
• Papers not written in English
• Duplicated studies
• Studies clearly irrelevant to the research
• Focused only on control
• It presents neither experiments nor simulations

D. Boolean function

The search begins with an adequate formulation of the keywords of the boolean function and with the search in the electronic databases of each sector. In order to refine the search and develop a better strategy, it is advisable to identify synonyms, acronyms, truncated words and/or alternative terms that have been used, to ensure that the references found are independent of synonyms, variants, etc. At this point, we recommend that potential reviewers read several surveys or key papers on the topic to find all the terminology involved. The boolean function is a string consisting of several combinations of terms derived from the research question. Once the keywords describing the research topic are found, boolean operators are used to combine the different terms. Finally, the search strategy is launched and the result obtained is reviewed. If necessary, [...] us confine the topic. This increases the confidence in the effectiveness of the search.

Case study: The boolean function for this study is:
{Estimator OR Estimation OR Filter OR Filtering OR Observer OR Observability OR Sensing}
AND
{“Cyber Physical System” OR “Human in the loop” OR “Human Robot” OR “System of systems” OR “Heterogeneous System” OR “Human Machine” OR “Canine Machine” OR “Heterogeneous Multiagent System” OR “Humanoid Robot” OR “Animal Robot”}
AND
{Distributed OR Decentralized OR Decentralised OR “Sensor Fusion”}

The results of the search obtained using this boolean function in the title, abstract and keywords are shown in Table V. We have obtained 1788 potential studies.

TABLE V
221 727 40 223 577 1788

IV. CONDUCTING THE REVIEW

Once the results are obtained for each database, they have to be analyzed and evaluated to verify whether they are really good candidates for the review. The first filter is the application of the inclusion and exclusion criteria mentioned in Subsection III-C. In a first inspection, the criteria are applied to the title and abstract. Only when there is doubt about whether to include a study is a full-text reading made.

A. Selection of primary studies

As explained before, the search should not include secondary studies or gray literature. Most databases allow this first filtering to be performed directly, excluding some or all secondary studies. In order to guide this process, an international group of experts has developed the PRISMA method, which is a framework for conducting SRs [16]. It consists of 27 points and a flow diagram in four stages, presented in Figure 2, that summarizes the selection process. The number of studies included in each step is given in brackets. We will come back to this later in the case study.

[Flow diagram boxes:
Records identified through database searching (n = 1788)
Additional records identified through other sources (n = 8)
Records after duplicates removed (n = 1113)
Full-text articles assessed for eligibility (n = 15)
Full-text articles excluded, with reasons (n = 2*)
Studies included in qualitative synthesis (n = 13)
*Does not meet all the inclusion criteria]
Fig. 2. PRISMA flow diagram

In order to reduce the bias, the results will be evaluated by two different people, considering the title and the abstract. Only when both people consider that an article is adequate does it go to the next stage. If they disagree, a third person will have the final decision. (There exists another methodology that assigns to each paper a grade between 0 and 1 at the choice of the reviewers: if the final grade, which is the average of the grades of each reviewer, is higher than a predetermined threshold, then the article is accepted; otherwise it is rejected.) This step considerably reduces the number of items considered
to be suitable. The next step consists in reading the full document and making a final selection. It is worth mentioning that, after finishing all the stages, it is common to review the bibliography of the accepted studies and even to contact the corresponding authors. If any additional papers have to be considered, they will need to go through the whole PRISMA procedure, as shown in Figure 2.

Case study: As shown in Figure 2, 8 additional records were identified through other sources. These studies correspond to those advised by the corresponding authors of the accepted papers and those included in the bibliography of the accepted documents. We have detected more than 600 duplicates since, as detailed in Table II, the databases share part of their content. In order to facilitate this task, an automatic duplicate detection tool has been used (Mendeley). This results in 1113 papers to be screened. For the first screening, only some of the exclusion criteria have been used (secondary studies, part of the gray literature and duplicates have been ruled out). The reason is that some databases allow part of the gray literature to be excluded directly. The application of the rest of the exclusion criteria and of the inclusion criteria is the task that the 2 people actually have to tackle. At the end of this phase of the review, the number of studies to read in full text is very small with respect to the initial number. In particular, 15 papers have to be evaluated. Of these 15 papers, 2 were excluded for the reasons shown in Figure 2. The percentage of agreement between the authors has been almost 85%.

B. Extraction and synthesis of data

The validity of an SR is closely related to the quality of the original studies and to the methods used by the reviewers to organize and systematize the information useful for the review. There are different methods that help us organize the collected information. Conceptual maps and data extraction tables, in which the most important features of each study are summarized, are the most used. The basic information these tables should contain is different for each sector.

Case study: The information gathered from the different papers is detailed in Table VI. In addition to the information relative to the publication, the table provides details of different aspects we think are crucial: the estimator used, the concrete application within CPS, and whether simulations and/or real experiments have been carried out. Particular attention has been paid to the limitations and advantages presented in each paper and to whether the design of the estimators/observers (the implementation must be distributed) is made in a distributed or centralized way. Furthermore, we have made a deep study of the communication requirements, focusing on the required amount of data to be transmitted at each sampling time (as a reference level for this data, we denote the dimension of the state vector to be estimated by n) and the communication protocol implemented. The following cases are identified: All-to-all, in which every agent communicates with every other agent at each sampling time; and Neighborhood, in which every agent communicates with the agents in its neighborhood at each sampling time.

One quality criterion that deserves comment is the fact that the papers included in the SR do not cite each other. This means that answering the questions posed at the beginning would not be an accurate task if the researcher just read some of those papers and their citations. This fact strengthens the need for this review, since it gathers the whole picture of the topic.

V. REPORTING THE REVIEW

The final phase of the SR includes the drafting and reporting of all results. As already mentioned, the SR does not have to be a simple description of what other authors have published, but a critical, objective and reasoned discussion of the literature examined, showing a deep understanding and awareness of the different arguments and approaches.

A. Documentation of the extracted data

Once the review is organized and the maps and/or tables are completed, a complete overview of the gathered material and an orientation to the logical articulation of the relationships between the different results is obtained. Among other things, this process facilitates the organization of content and, in particular, the selection of key information, explaining what concepts should be developed in the review. The report has to include all the aspects mentioned in Section IV-B, using all the basic information available in each article that is important for the review, highlighting the characteristics of the study and the results obtained. The SR is concluded by discussing the results, that is, explaining and proposing an interpretation of the most significant data related to the topic. The final report has to answer in a clear and precise way the research questions proposed at the beginning. Some authors approach this last part by developing each question with its respective answer, while others argue the conclusions without necessarily using this subdivision [22], [17].

VI. CONCLUSIONS AND FUTURE WORK

In this paper we have presented the main stages of an SR for the field of automatic and systems engineering. The paper presents, from the authors' point of view, an important set of guidelines, advice and organized information that will help those researchers who feel the need to conduct an SR within this field. A case study has been presented on the techniques of distributed estimation of cyber-physical systems. The reporting of the review remains to be completed and is left as future work. In conclusion, future research is intended to provide deep insights on the evolution of these techniques from the 1990s to 2017 in order to improve understanding and clarify the evolutionary trajectory by identifying the theoretical trends and gaps that need to be addressed in future studies.

ACKNOWLEDGMENT

We would like to thank A. Rigabert for her advice. This research is supported by grant TEC2016-80242-P (AEI/FEDER).
TABLE VI

Ref. | Year | Estimator used | Application | Real experiments or simulations | Limitations and advantages | Design | Exchanged information | Communication protocol
-----|------|----------------|-------------|---------------------------------|----------------------------|--------|-----------------------|-----------------------
[5] | 2008 | Decentralized Bayesian filtering | To coordinate a network of humans and robots | Real experiment | Lost packets in communication process | Distributed | Vector (n) | Neighborhood
[20] | 2010 | A distributed detection algorithm based on consensus | Fault detection and isolation (FDI), applicable to systems of systems | Simulations | Provides high efficiency, scalability and robustness | Distributed | Matrix (n×n) | All-to-all
[4] | 2011 | Goertzel algorithm with transmissibility functions | Structural health monitoring | Both | Reduces latency with respect to FFT and requires a centralized initialization algorithm | Distributed | Magnitude spectrum (n) | All-to-all
[12] | 2011 | Consensus-based Luenberger observer | Synchronization of linear heterogeneous multi-agent systems | Simulations | LMI centralized design, structural constraints in the observer matrix | Centralized | State vector (n); estimated state vector (n); state of ecosystem (r < n) | All-to-all
[18] | 2012 | Annealing particle | Body tracking | Real experiment | For 3D model | Centralized | Matrix (n×n) | Neighborhood
[9] | 2013 | Consensus-based estimator | CPS with adversarial attack on the sensed and communicated information | Simulations | With attack on the sensed and communication; full rank output matrix | Centralized | Estimated state vector (n); output vector (r < n) | Neighborhood
[14] | 2013 | Extended Kalman filter (EKF) | Multi-robot tracking | Both | The robots have a low-capacity processing unit | Distributed | Vector (n) | All-to-all
[15] | 2016 | Adaptive distributed observer | Tracking a leader with heterogeneous followers | Simulations | | Distributed | Matrix (n×n) | Neighborhood
[6] | 2016 | Distributed fusion estimator (DFE) (Kalman filtering) | Cyber-physical systems with communication constraints | Simulation | Communication delays and packet dropouts | Distributed | Matrix (n×n) | All-to-all
[7] | 2016 | Bayesian estimation | Arrival rates in asynchronous monitoring networks | Simulation | The number of samples at each node is allowed to be highly inhomogeneous | Distributed | Augmented state vector (n) | Neighborhood
[8] | 2016 | A self-tuning adaptive distributed observer | Multi-agent system | Simulations | The leader's signal and the external disturbances are handled separately | Distributed | Matrix (n×n) | Neighborhood
[2] | 2016 | Robust distributed nonlinear descriptor observers with neural network approximation of uncertainties | Estimation of state and attack simultaneously in the CPS | Real experiment | Robust with residual signals; requires global information for design | Distributed | Matrix (n×n) | All-to-all
[1] | 2017 | Different Kalman filters: the general consensus-based distributed filter and the distributed adaptive filter | Biological system | Simulations | Reduces congestion and increases the robustness against node failures and attacks | Distributed | Estimated state vector (n) | Neighborhood
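
As a rough illustration of how the last two columns of Table VI translate into communication load, the following Python sketch counts the scalars transmitted per sampling time under the two protocols. It is only a minimal sketch under stated assumptions, none of which come from the reviewed papers: every agent sends one separate message to each recipient (no broadcast), the payload is either a vector of n scalars or an n×n matrix, and the function names and the four-agent ring used in the example are hypothetical.

```python
from typing import Dict, Set


def payload_size(kind: str, n: int) -> int:
    """Scalars in one message: 'vector' carries n, 'matrix' carries n * n."""
    if kind == "vector":
        return n
    if kind == "matrix":
        return n * n
    raise ValueError(f"unknown payload kind: {kind!r}")


def messages_all_to_all(num_agents: int) -> int:
    """All-to-all: every agent sends to every other agent (assumed unicast)."""
    return num_agents * (num_agents - 1)


def messages_neighborhood(neighbors: Dict[int, Set[int]]) -> int:
    """Neighborhood: every agent sends only to the agents in its neighbor set."""
    return sum(len(nbrs) for nbrs in neighbors.values())


def scalars_per_sample(num_messages: int, kind: str, n: int) -> int:
    """Total scalars transmitted in the network during one sampling time."""
    return num_messages * payload_size(kind, n)


if __name__ == "__main__":
    # Hypothetical network: 4 agents on a ring, state dimension n = 3.
    ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    n = 3
    for kind in ("vector", "matrix"):
        full = scalars_per_sample(messages_all_to_all(len(ring)), kind, n)
        local = scalars_per_sample(messages_neighborhood(ring), kind, n)
        print(f"{kind}: all-to-all={full}, neighborhood={local}")
```

For this ring, all-to-all needs 12 messages per sampling time and neighborhood needs 8, so with n = 3 a vector payload costs 36 versus 24 scalars and a matrix payload costs 108 versus 72. Under these assumptions the gap between the two protocols widens as the number of agents grows while the neighborhoods stay small.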